Program Understanding

Have you ever wondered how programmers work through the process of solving a problem? Is it a systematic process? Is it blind luck? Is it an educated guess tempered by experience? Recent research, as scanty as it is in this area, has spawned several cognitive theories on how programmers do what they do. Rather than delve into each theory, let's look at the common structure that threads through all of them. Generally, theories on program understanding (PU) have four fundamental elements:
Syntactic knowledge is language-specific knowledge the programmer needs to understand the physical structure of the code. It is not an absolute necessity, but an experienced programmer who lacks it must constantly shift between surface analysis (syntactic understanding) and purpose analysis (trying to understand meaning and intent).

Domain knowledge is problem-specific knowledge and is unrelated to program-specific knowledge. The programmer's goal is to merge these disparate bodies of knowledge successfully. For example, understanding a program about gas systems maintenance requires some knowledge of the natural gas domain. Similarly, a program designed to maintain a horse registry system requires some knowledge of the domain of horses.

External knowledge is stored knowledge that is readily available for review and assimilation into the internal knowledge base. Examples of external knowledge are documentation, organizational policies, user manuals, code printouts, etc.

Assimilation is a mapping process between the internal and external stores of knowledge and the application being studied. Existing code that is being analyzed is processed as follows. Using either a systematic (line-by-line examination) or opportunistic (haphazard review) approach, the programmer first looks for clues or signposts that trigger a match to a familiar structure in internal memory. If a match is found, the code segment is categorized into a developing mental model of the application. The categorization is simply an abstraction process that labels the code segment with a symbol, e.g., search routine or sort routine. If no clue is found, the code segment is systematically examined with the goal of discovering its underlying functionality. Once that functionality is found and matched, the segment is categorized into the mental model. This process continues through the entire program, with the overall goal of generating a fully developed mental model of the application.
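For readers who think in code, the assimilation loop described above can be caricatured in a few lines of Python. This is only a toy sketch, not a cognitive model from the research: every name in it (`KNOWN_SIGNPOSTS`, `MentalModel`, `match_signpost`, `examine_systematically`, `assimilate`) is invented for illustration, and "recognizing a clue" is reduced to a simple substring check.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for internal knowledge: clue -> familiar structure label.
KNOWN_SIGNPOSTS = {
    "swap": "sort routine",
    "found": "search routine",
}


@dataclass
class MentalModel:
    # Each chunk pairs a code segment with the abstract label assigned to it.
    chunks: list = field(default_factory=list)

    def categorize(self, segment, label):
        self.chunks.append((segment, label))


def match_signpost(segment, knowledge):
    """Return the label triggered by a clue in the segment, or None if no clue matches."""
    for clue, label in knowledge.items():
        if clue in segment:
            return label
    return None


def examine_systematically(segment):
    """Placeholder for line-by-line study: derive a label from the called name."""
    return "unfamiliar routine: " + segment.split("(")[0].strip()


def assimilate(segments, knowledge):
    """Walk every segment, labelling it by signpost match or by systematic study."""
    model = MentalModel()
    for segment in segments:
        label = match_signpost(segment, knowledge)
        if label is None:
            # No familiar clue: fall back to systematic examination.
            label = examine_systematically(segment)
        model.categorize(segment, label)
    return model


segments = ["if a[i] > a[j]: swap(a, i, j)", "compute_tariff(meter)"]
model = assimilate(segments, KNOWN_SIGNPOSTS)
for segment, label in model.chunks:
    print(label, "<-", segment)
```

The first segment is labelled through a signpost match, while the second has no familiar clue and falls through to the systematic path. Real comprehension is of course far richer than a dictionary lookup; the point is only the shape of the loop.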
As the mental model grows in form and function, it is itself gradually integrated into the internal knowledge base for future use. Agreement with existing knowledge serves to reinforce and add depth, while innovative code that contradicts existing knowledge is absorbed as new code and adds breadth to existing knowledge. The process is complete when the newly formulated mental model is fully developed.

To summarize, the PU process is an iterative process of mapping between the unknown (the application under study) and the known (the internal and external knowledge base). The mapping process is far from simple, involving constant switching between systematic and opportunistic review, coupled with an abstraction process that chunks detail under a common identifier. The ultimate goal of the mapping and abstraction process is a mental model of the new application that is a composite, at an abstract level, of domain knowledge and stored knowledge. As the mental model develops, each abstracted segment is functionally integrated into the model. The key for the programmer is "functionally integrated": order out of chaos, coherent flow out of randomness, purpose out of disorder.