ACM Multi-label Dataset (2008 version)

This is a collection of 86116 text documents classified according to the ACM categories [1998 Version].

Each document corresponds to one paper. Every document has a title and may also have an abstract, zero or more keywords, zero or more general terms, zero or more assigned ACM categories, and other fields (see the STATS section below). An example document with its main fields is shown below:

title: Is a bot at the controls?: Detecting input data attacks
abstract: The use of programmatically generated input data in place of human-generated input data poses problems for many computer applications in use today. Mouse clicks and keyboard strokes can automatically be generated to cheat in online games, or to perpetrate click fraud. The ability to discern whether input data was computationally generated instead of created by a human input device is therefore of paramount importance to these types of applications. This paper describes a method for detecting input data that was computationally modified or fabricated. This includes detecting data that was not directly generated by a physical human input device such as a keyboard or mouse. A prototype of this system was built on existing hardware and was shown to be effective at detecting attacks on a real application. This detection method is capable of addressing the majority of input-based attacks currently in use. When used in conjunction with a trusted peripheral, it offers a robust mechanism for ensuring a computer is not at the controls.
keywords: security, online games, cheat detection, cheating
general terms: security
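
The record structure described above can be sketched in Python as follows. This is only an illustration: the names AcmDocument and split_terms are hypothetical, the on-disk format of the dataset is not specified here, and the assumption that multi-valued fields are comma-separated is taken from the example document alone.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AcmDocument:
    """Hypothetical in-memory record; only the title is guaranteed."""
    title: str
    abstract: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    general_terms: List[str] = field(default_factory=list)
    # ACM categories are the multi-label target; a document may have none.
    categories: List[str] = field(default_factory=list)


def split_terms(raw: str) -> List[str]:
    """Split a comma-separated field value into a list of trimmed terms."""
    return [term.strip() for term in raw.split(",") if term.strip()]


# The example document (abstract left out here for brevity).
doc = AcmDocument(
    title="Is a bot at the controls?: Detecting input data attacks",
    keywords=split_terms("security, online games, cheat detection, cheating"),
    general_terms=split_terms("security"),
)
```

Defaulting the optional fields to None or an empty list mirrors the "zero or more" wording, so code that consumes the records does not need a special case for missing fields.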



The dataset is available as:


This collection of documents was extracted from the ACM digital library in 2007-2008 as part of a scientific project named COPSRO (Computational Approach to Ontology Profiling of Scientific Research Organisations). The project was funded by FCT (Portuguese Science and Technology Foundation) under the reference PTDC/EIA/69988/2006.


If you use this dataset for research, please cite one of the following works: