Systems and tools that finds similarities among essays and reports are widely used by today’s universities and schools to detect plagiarism. Such tools are however insufficient when used for source code comparisons since they are fragile to the most simplest forms of disguises.
Other methods that analyses intermediate forms such as token strings, syntax trees and graph representations have shown to be more effective than using simple textual matching methods. In this master thesis report we discuss how program dependence graphs, an abstract representation of a programs semantics, can be used to find similar procedures.
We also present an implementation of a system that constructs approximated program dependence graphs from the abstract syntax tree representation of a program. Matching procedures are found by testing graph pairs for either sub-graph isomorphism or graph monomorphism depending on whether structured transfer of control has been used.
Under a scenario based evaluation our system is compared to Moss, a popular plagiarism detection tool. The result shows that our system is more or least as effective than Moss in finding plagiarized procedured independently on the type of modifications used.
Source: Linköping University
Author: Holma, Niklas