Abstract: 
Many tools designed to help programmers view and manipulate source code 
exploit the formal structure of the programming language. Language-based 
tools use information derived via linguistic analysis to offer services 
that are impractical for purely text-based tools. In order to be 
effective, however, language-based tools must be designed to account 
properly for the documentary structure of source code: a structure that 
is largely orthogonal to the linguistic but no less important. 
Documentary structure includes, in addition to the language text, all 
extra-lingual information added by programmers for the sole purpose of 
aiding the human reader: comments, white space, and choice of names. 
Largely ignored in the research literature, documentary structure 
occupies a central role in the practice of programming. An examination 
of the documentary structure of programs leads to a better understanding 
of requirements for tool architectures.
Information and Software Technology, Volume 44, Issue 13, 1 October 2002, pp. 767-782.
17 pages (PDF)
This is a significantly extended version of an earlier workshop paper.
