Smart Differencer™ tools

Developers frequently need to determine the differences between versions of the text files comprising an application system's source code. Among other uses, such differences facilitate reviewing, debugging, and testing newly changed source code. Ideally, developers would like to be told about differences in terms that make sense with respect to the type of source code and its constructs, e.g., "delete statement", "insert expression", "move block", "rename identifier".

Conventional differencing tools (e.g., diff) compute differences based on source lines of text, using line-based models of editing like "insert line", "delete line", or "replace line". These tools are very useful for arbitrary text, but are not cognizant of the structure of the programming language in which the source code is written. When used on source code, this often causes the reported differences not to obey the boundaries of the underlying language constructs, e.g., a fragment of a statement, or the suffix of one statement and the prefix of another, may be reported as a change based on the accidental organization of these into lines. Worse, simple reformatting or changes in comments will result in many apparent changes without any actual semantic impact on the source code. This is conceptually jarring to the developer, who thinks of changes in terms of program structures and abstract editing operations manipulating such structures.
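The problem can be reproduced with any line-based differencer. The following minimal sketch uses Python's standard difflib module (chosen here only for illustration; it is not part of the SmartDifferencer) on two versions of a statement that differ only in line breaks and an added comment. The sample text and the helper name line_changes are assumptions made for this example.

```python
import difflib

# Two versions of the same statement: the second only joins the
# continuation line and adds a comment. Nothing semantic has changed.
old = """total = compute(a,
                b)
"""
new = """# sum the inputs
total = compute(a, b)
"""


def line_changes(old_text, new_text):
    """Return the inserted ('+ ') and deleted ('- ') lines of a line-based diff."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


changes = line_changes(old, new)
for line in changes:
    print(line)
```

Every line of both versions is reported as deleted or inserted, even though the program is unchanged.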

The SD SmartDifferencer shows the differences between two versions of source code in terms of abstract editing operations applied to programming language constructs. The language constructs are discovered by parsing the code using a production language parser (and, depending on the language, determining scopes and symbol tables). Editing operations include insert, delete, copy, merge, and rename (globally, across a scope, or pointwise). Language constructs include primitives like identifiers, numbers, string literals, etc., as well as compound phrases like declarations, statements, expressions, etc. The editing operations are not bound to source lines and may affect only part of a line. They ignore comments, irrelevant whitespace, and the actual formatting of numbers (radix, leading zeros) and string literals (equivalent escape sequences, etc.). Note that whitespace within string literals is considered relevant and is taken into account when determining differences.
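The normalization described above can be approximated at the token level. The sketch below is not the SmartDifferencer's implementation, which works on parsed language constructs. It is a simplified illustration for Python source that uses the standard tokenize and ast modules, and the function name normalized_tokens and the sample inputs are assumptions. It drops comments and layout, and compares numbers and string literals by value rather than by spelling.

```python
import ast
import io
import tokenize

# Token kinds that carry no meaning: comments and non-logical line breaks.
IGNORED = {tokenize.COMMENT, tokenize.NL}


def normalized_tokens(source):
    """Tokenize Python source, dropping comments and normalizing literals."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in IGNORED:
            continue
        text = tok.string
        if tok.type in (tokenize.NUMBER, tokenize.STRING):
            try:
                # Compare literals by value, so 0x10 == 16 and '\t' == '\x09'.
                text = repr(ast.literal_eval(text))
            except (ValueError, SyntaxError):
                pass  # e.g. f-strings on older Pythons: keep the raw text
        elif tok.type in (tokenize.INDENT, tokenize.NEWLINE):
            text = ""  # the exact whitespace used is irrelevant
        result.append((tok.type, text))
    return result


old = "x = 0x10  # sixteen\ns = 'a\\tb'\n"
new = 'x   =   16\ns = "a\\x09b"\n'
same = normalized_tokens(old) == normalized_tokens(new)

# Whitespace inside a string literal is still significant.
spaces_differ = normalized_tokens("s = 'a b'\n") != normalized_tokens("s = 'a  b'\n")
print(same, spaces_differ)
```

A token comparison like this only suppresses irrelevant differences; reporting edits such as "move block" or "rename identifier" requires the parsed structure and scope information described above.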

By default, the SmartDifferencer produces output intended for consumption by a developer. This output is kept compact by summarizing multiple adjacent edits of the same kind into a single edit. Each of these edits is followed by the actual program fragments involved in the respective edit.
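The idea of summarizing adjacent edits of the same kind can be sketched as follows. The edit representation used here, a (kind, fragment) pair, and the function name summarize are hypothetical and chosen only for illustration; the SmartDifferencer's actual output format is not shown in this text.

```python
from itertools import groupby


def summarize(edits):
    """Merge runs of adjacent edits of the same kind into one summary edit.

    `edits` is a list of (kind, fragment) pairs in source order. Each summary
    entry keeps the kind together with all fragments involved in that run.
    """
    summary = []
    for kind, run in groupby(edits, key=lambda edit: edit[0]):
        summary.append((kind, [fragment for _, fragment in run]))
    return summary


edits = [
    ("delete", "x = 1"),
    ("delete", "y = 2"),
    ("insert", "z = 3"),
    ("delete", "w = 4"),
]
summary = summarize(edits)
for kind, fragments in summary:
    print(kind, fragments)
```

Only adjacent edits are merged, so the final delete stays separate from the first two and the order of the report still follows the source.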

Alternatively, or in addition, the SmartDifferencer can produce output intended for consumption by another tool, e.g., a display tool that shows the compared source files to a developer with the changes highlighted in different colors depending on the kind of edit. This output includes further details of the edits that are usually not of direct interest to a developer.

Benefits include improved developer productivity, both individually and during code reviews. The SmartDifferencer suppresses semantically irrelevant changes (formatting, comments, whitespace, the representation of numbers and strings, etc.), focuses the developer's attention on changes that are coherent with respect to the language and thus meaningful, and describes differences in terms of edits over the underlying language constructs. Integrating such a differencer into a source code control system will also aid developers.

Typical Features

Available for the Following Languages

Unusual Requirements?

Is your language not listed? Does it run in an unusual environment, or do you have some custom need? SD can configure a Smart Differencer tool for you! These tools are based on DMS, and inherit DMS's language agility and scalability.

Other Tools

Semantic Designs offers a variety of other software tools.




