The Documentary Structure of Source Code (2002)

Author: Michael L. Van De Vanter

Abstract:
Many tools designed to help programmers view and manipulate source code exploit the formal structure of the programming language. Language-based tools use information derived via linguistic analysis to offer services that are impractical for purely text-based tools. In order to be effective, however, language-based tools must be designed to account properly for the documentary structure of source code: a structure that is largely orthogonal to the linguistic but no less important. Documentary structure includes, in addition to the language text, all extra-lingual information added by programmers for the sole purpose of aiding the human reader: comments, white space, and choice of names. Largely ignored in the research literature, documentary structure occupies a central role in the practice of programming. An examination of the documentary structure of programs leads to a better understanding of requirements for tool architectures.
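
As a small illustration of the distinction drawn in the abstract, the sketch below passes a snippet through a purely linguistic round trip and shows what is lost. It is not taken from the paper: the choice of Python, the use of the standard-library `ast` module (with `ast.unparse`, which assumes Python 3.9 or later), and the names `SOURCE`, `to_celsius`, and `linguistic_round_trip` are assumptions made for the example.

```python
import ast

# Hypothetical snippet: the comments, the blank line, and the aligned "="
# signs exist only for human readers.
SOURCE = '''\
# Convert a temperature reading for display.
def to_celsius(fahrenheit):

    offset   = 32       # freezing point of water in Fahrenheit
    scale    = 5 / 9
    return (fahrenheit - offset) * scale
'''


def linguistic_round_trip(source: str) -> str:
    """Parse source into a syntax tree, then regenerate text from the tree alone."""
    return ast.unparse(ast.parse(source))


regenerated = linguistic_round_trip(SOURCE)
print(regenerated)

# The two texts have the same linguistic structure...
same_tree = ast.dump(ast.parse(SOURCE)) == ast.dump(ast.parse(regenerated))
# ...but the comments and hand-made spacing did not survive.
comments_survive = "#" in regenerated
print("same syntax tree:", same_tree)
print("comments survive:", comments_survive)
```

In this sketch the regenerated text parses to the same tree as the original, yet the comments and manual alignment are gone, because the syntax tree never recorded them. Chosen names such as `to_celsius` survive only because they also happen to be identifiers in the language. A tool that stores nothing but the tree would therefore discard part of what the programmer wrote for the reader.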


Information and Software Technology, Volume 44, Issue 13, 1 October 2002, pp. 767-782.

17 pages (PDF)

This is a significantly extended version of an earlier workshop paper.