Smart Differencer™ tools
Developers frequently need to determine differences between versions of the text files that make up an application system's source code. Among other uses, such differences facilitate reviewing, debugging, and testing newly changed source code. Ideally, developers would like to be told about differences in terms that make sense with respect to the type of source code and its constructs, e.g., "delete statement", "insert expression", "move block", "rename identifier".
Conventional differencing tools (e.g., diff) compute differences based on source lines of text, using line-based models of editing like "insert line", "delete line", or "replace line". These tools are very useful for arbitrary text, but are not cognizant of the structure of the programming language in which source code is written. When used on source code, this often causes the reported differences not to obey the boundaries of the underlying language constructs, e.g., a fragment of a statement, or the suffix of one statement and a prefix of another statement, may be reported as a change based on the accidental organization of these into lines. Worse, simple reformatting or changes in comments will result in many apparent changes without any actual semantic impact on the source code. This is conceptually jarring to the developer, who thinks of changes in terms of program structures and abstract editing operations manipulating such structures.
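The line-based behavior described above can be reproduced with any conventional line differencer. The following is a minimal sketch using Python's standard difflib module as a stand-in for diff; the two code snippets and the helper name changed_lines are made up for illustration. The two versions differ only in formatting and comments, yet every reformatted line is reported as changed.

```python
import difflib

# Two versions of the same code: only layout and comments differ.
old = [
    "total = price * (1 + tax)  # add tax",
    "print(total)",
]
new = [
    "# add the tax",
    "total = price * (",
    "    1 + tax",
    ")",
    "print(total)",
]

def changed_lines(a, b):
    """Return (removed, added) lines as a line-based differencer sees them."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(a[i1:i2])
            added.extend(b[j1:j2])
    return removed, added

removed, added = changed_lines(old, new)
print(len(removed), "line(s) removed,", len(added), "line(s) added")
```

Here the line-based comparison reports one removed line and four added lines, although the program computes exactly the same thing in both versions.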
The SD SmartDifferencer shows the differences between two versions of source code in terms of abstract editing operations applied to programming language constructs. The language constructs are discovered by parsing the code using a production language parser (and, depending on the language, determining scopes and symbol tables). Editing operations include insert, delete, copy, move, and rename (globally, across a scope, or pointwise). Language constructs include primitives like identifiers, numbers, string literals, etc., as well as compound phrases like declarations, statements, expressions, etc. The editing operations are not bound to source lines and may affect only part of a line. They ignore comments, irrelevant whitespace, and the actual formatting of numbers (radix, leading zeros) and string literals (equivalent escape sequences, etc.). Note that whitespace within string literals is considered relevant and is taken into account when determining differences.
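The idea of comparing parsed structure rather than text can be illustrated with a small sketch. It uses Python's standard ast module as a stand-in parser and is not how the SmartDifferencer itself is implemented (which relies on DMS language front ends); the function name same_structure and the sample snippets are assumptions made for this example. Parsing discards comments and layout, and literals are compared by value, so a hex spelling and a decimal spelling of the same number count as equal, while whitespace inside a string literal still matters.

```python
import ast

def same_structure(src_a, src_b):
    """True if both sources parse to the same syntax tree.

    ast.dump omits line/column attributes by default, so layout and
    comments play no role; literals are compared by their values.
    """
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))

v1 = r'''
mask = 0x10   # hex spelling, with a comment
label = "A\x42"
'''

v2 = r'''
mask   =   16
label = 'AB'
'''

v3 = r'''
mask = 16
label = 'A B'
'''

print(same_structure(v1, v2))  # formatting, comment, radix, escape differ
print(same_structure(v2, v3))  # whitespace inside the string literal differs
```

The first comparison reports the versions as equal and the second reports a difference. A real differencer would go further and report which construct changed and how, rather than a yes/no answer.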
By default, the SmartDifferencer produces output intended for consumption by a developer. This output is kept compact by summarizing multiple adjacent edits of the same kind into a single edit. Each of these edits is followed by the actual program fragments involved in the respective edit.
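Summarizing adjacent edits of the same kind can be sketched as a simple run-grouping step. The Edit record, its fields, and the summarize function below are hypothetical and chosen only for illustration; they do not reflect the tool's actual data model or output format, and "adjacent" is assumed here to mean consecutive in the edit list.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Edit:
    kind: str       # hypothetical edit kind, e.g. "insert" or "delete"
    construct: str  # hypothetical construct name, e.g. "statement"
    line: int       # line where the edit applies
    text: str       # program fragment involved in the edit

def summarize(edits):
    """Merge runs of consecutive edits sharing kind and construct."""
    summary = []
    for (kind, construct), run in groupby(edits, key=lambda e: (e.kind, e.construct)):
        run = list(run)
        summary.append({
            "kind": kind,
            "construct": construct,
            "count": len(run),
            "first_line": run[0].line,
            "last_line": run[-1].line,
            "fragments": [e.text for e in run],
        })
    return summary

edits = [
    Edit("insert", "statement", 10, "x = 1;"),
    Edit("insert", "statement", 11, "y = 2;"),
    Edit("delete", "statement", 20, "log(x);"),
    Edit("insert", "statement", 30, "return y;"),
]

for entry in summarize(edits):
    print(entry["kind"], entry["construct"], entry["count"], entry["fragments"])
```

The two neighboring statement insertions collapse into one summarized edit that still carries both fragments, while the later insertion stays separate because a delete lies between them.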
Alternatively or in addition, the SmartDifferencer can produce output intended for consumption by another tool, e.g., a display tool that shows the compared source files to a developer with the changes highlighted in different colors depending on the kind of edit. This output includes further details of the edits that are usually not of direct interest to a developer.
Benefits include improved developer productivity, both individually and during code reviews. The tool suppresses semantically irrelevant changes (formatting, comments, whitespace, the representation of numbers and strings, etc.), focuses the developer's attention on changes that are semantically coherent with respect to the language and thus meaningful, and describes differences in terms of edits over the underlying language constructs. Integration of such a differencer into a source code control system will also aid developers.
Typical Features
- Compares two files for a specific language
- Understands target language syntax precisely:
  - Whitespace and comments (ignored)
  - Keywords and identifiers
  - Integer and floating values and their equivalent but variant possible spellings
  - Strings and their equivalents according to escaping conventions
  - Full syntax structure of the language (using DMS language Front Ends)
- Output in terms of language syntax elements: statements, expressions, blocks, identifiers
- More succinct output than a conventional string diff tool, with coherent explanations and code display rather than simple string dumps
- Detects consistent renaming within a block of code
- Generates deltas in two forms:
  - Human readable format, showing delta types (language nonterminals), locations (line, column), and before and after text
  - Summary form, showing just a succinct summary of delta types and locations
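Detection of consistent renaming within a block, as listed among the features above, can be sketched at the token level. The following is a simplified illustration built on Python's standard tokenize module, not the tool's actual algorithm, which works on parsed constructs and scopes; the function names tokens and consistent_rename are assumptions. Two blocks count as a consistent renaming when their token streams match except for identifiers, and the identifiers map one-to-one.

```python
import io
import keyword
import tokenize

SKIP = {tokenize.COMMENT, tokenize.NL}  # comments and blank-line tokens are ignored

def tokens(src):
    """Return (type, string) pairs for src, without comments and blank lines."""
    stream = tokenize.generate_tokens(io.StringIO(src).readline)
    return [(t.type, t.string) for t in stream if t.type not in SKIP]

def consistent_rename(old_src, new_src):
    """Return {old_name: new_name} if the blocks differ only by a
    one-to-one renaming of identifiers, otherwise None."""
    old, new = tokens(old_src), tokens(new_src)
    if len(old) != len(new):
        return None
    forward, backward = {}, {}
    for (t1, s1), (t2, s2) in zip(old, new):
        if t1 != t2:
            return None
        if t1 == tokenize.INDENT:
            continue  # indentation width is formatting only
        is_ident = (t1 == tokenize.NAME
                    and not keyword.iskeyword(s1)
                    and not keyword.iskeyword(s2))
        if not is_ident:
            if s1 != s2:
                return None
            continue
        # Each old name must map to exactly one new name, and vice versa.
        if forward.setdefault(s1, s2) != s2 or backward.setdefault(s2, s1) != s1:
            return None
    return {a: b for a, b in forward.items() if a != b}

old_block = "def area(w, h):\n    tmp = w * h\n    return tmp\n"
new_block = ("def area(width, h):\n"
             "    # compute the product\n"
             "    result = width * h\n"
             "    return result\n")

print(consistent_rename(old_block, new_block))
```

For these two blocks the sketch reports that w was renamed to width and tmp to result, and the added comment is ignored. A block in which only some occurrences of w were changed would yield None, since the mapping would no longer be consistent.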
Available for the Following Languages
- C (ANSI89, MS VisualC6, GCC3/4, MS Visual Studio 2005)
- C++ (ANSI89, MS VisualC6, GCC3/4, MS Visual Studio 2005)
- COBOL (IBM Enterprise)
- C# 2.0, 3.0, 4.0
- Java 1.5/1.6
- PHP 5.0
Unusual Requirements?
Is your language not listed? Does it run in an unusual environment, or do you have a custom need? SD can configure a Smart Differencer tool for you! These tools are based on DMS, and inherit DMS's language agility and scalability.
Other Tools
Semantic Designs offers a variety of other software tools.
For more information: Info@semanticdesigns.com
Copyright 1995-2010 Semantic Designs, Incorporated
DMS and "Design Maintenance System" are registered trademarks of Semantic Designs, Inc.
The SD logo and "Semantic Designs" are registered service marks of Semantic Designs, Inc.
CloneDR, PARLANSE, JOVIAL2C, Thicket, Smart Differencer are trademarks of Semantic Designs, Inc.
The OMG logo is a registered trademark of the Object Management Group, Inc. in the United States and other countries.
