Original source: www.issi.uned.es/doctorado/generative/Bibliografia/TesisCzarnecki.pdf

Generative Programming and Related Paradigms

There are three other programming paradigms which have similar goals to Generative Programming:

Generative Programming is broader in scope than these approaches, but uses important ideas from each [CEG+98]:

Generic Programming may be summarized as “reuse through parameterization.” Generic programming allows the writing of components that are extensively customizable, yet retain the efficiency of statically configured code. This technique can eliminate conceptually unnecessary dependencies between types and algorithms. For example, iterators allow generic algorithms which work efficiently on both dense and sparse matrices [SL98a]. However, generic programming limits code generation to substituting concrete types for generic type parameters and welding together pre-existing fragments of code in a fixed pattern. Generative programming is more general, since it provides automatic configuration of generic components from abstract specifications and allows for more powerful parameterization.

Domain-Specific Languages (DSLs) provide specialized language features that increase the abstraction level for a particular problem domain; they allow users to work closely with domain concepts (i.e. they are highly intentional), but at the cost of language generality. Domain-specific languages range from widely-used languages for numerical and symbolic computation (e.g., Mathematica) to less well-known languages for telephone switches and financial calculations (to name just a few). DSLs are able to perform domain-specific optimizations and error checking. On the other hand, DSLs typically lack support for generic programming.

What Is this Thesis About?

Aspect-Oriented Programming Most current programming methods and notations concentrate on finding and composing functional units, which are usually expressed as objects, modules, and procedures. However, several properties such as error handling and synchronization cannot be expressed using current (e.g. OO) notations and languages in a cleanly localized way. Instead, they are expressed by small code fragments scattered throughout several functional components. Aspect-Oriented Programming (AOP) [KLM+97] decomposes problems into functional units and aspects (such as error handling and synchronization). In an AOP system, components and aspects are woven together to obtain a system implementation that contains an intertwined mixture of aspects and components (i.e. tangled code). Weaving can be performed at compile time (e.g. using a compiler or a preprocessor) or at runtime (e.g. using dynamic reflection). In any case, weaving requires some form of metaprogramming (see Section 7.6.2). Generative programming has a larger scope, since it includes automatic configuration and generic techniques, and provides new ways of interacting with the compiler and development environment.

Putting It All Together: Generative Programming The concept of generative programming encompasses the techniques of the previous three approaches, as well as some additional techniques to achieve the goals listed in Section 1.4:

1.6 Generative Analysis and Design Generative Programming focuses on designing and implementing reusable software for generating specific systems, rather than developing each of the specific systems from scratch. Therefore, the scope of generative analysis and design is families of systems, not single systems. This requirement is satisfied by Domain Engineering. Part of Domain Engineering is Domain Analysis, which represents a systematic approach to identifying the scope, the features, and the variation points of the reusable software based on the analysis of existing applications, stakeholders, and other sources. Domain Analysis allows us to identify not only the immediately relevant features, but also the potentially relevant ones as early as possible. The knowledge of the planned and potential features is a prerequisite for arriving at a robust design capable of scaling up.

Metaprogramming involves writing programs whose parts are related by the “about” relationship, i.e. some parts are about some other parts. An example of a metaprogram is a program which manipulates other programs as data, e.g. a template metaprogram, a compiler, or a preprocessor (see Section 8.1). Other examples are programs implementing the abstractions of a programming language in a reflective way, e.g. metaclasses in Smalltalk (the latter implement the behavior of classes). An example of metaprogramming in Smalltalk is given in Section 7.4.7.

By a language extension we mean capabilities extending the expressive power of a programming language which traditionally are not packaged in conventional libraries, e.g. domain-specific optimizations, domain-specific error checking, syntax extensions, etc. (see Section 9.4.1).

Generative Programming, K. Czarnecki

Furthermore, Domain Analysis helps us to identify the dependencies between variation points. For example, selecting a multi-threaded execution mode will require activating synchronization code in various components of a system, or selecting some storage format may require selecting specialized processing algorithms. This kind of explicit configuration knowledge allows us to implement automatic configuration and to design easy-to-use and scalable configuration interfaces, e.g. interfaces based on specialized languages (so-called domain-specific languages) or application builders (e.g. GUI builders).

None of the current OOA/D methods addresses the above-mentioned issues of multi-system-scope development. On the other hand, they provide effective system modeling techniques. Thus, the integration of Domain Engineering and OOA/D methods is a logical next step. From the viewpoint of OOA/D methods, the most important contribution of Domain Engineering is feature modeling, a technique for analyzing and capturing the common and the variable features of systems in a system family and their interdependencies. The results of feature modeling are captured in a feature model, which is an important extension of the usual set of models used in OOA/D.

We propose methods which

1.7 Generative Implementation Technologies

As noted in the previous section, Generative Programming requires metaprogramming for weaving and automatic configuration. Supporting domain-specific notations may require syntactic extensions. Libraries of domain abstractions based on Generative Programming ideas thus need both implementation code and metacode which can implement syntax extensions, perform code generation, and apply domain-specific optimizations. We refer to such libraries as active libraries [CEG+98]:

Active libraries are not passive collections of routines or objects, as are traditional libraries, but take an active role in generating code. Active libraries provide abstractions and can optimize those abstractions themselves. They may generate components, specialize algorithms, optimize code, automatically configure and tune themselves for a target machine, and check source code for correctness. They may also describe themselves to and extend tools such as compilers, profilers, debuggers, domain-specific editors, etc.

This perspective forces us to redefine the conventional interaction between compilers, libraries, and applications. Active Libraries may be viewed as knowledgeable agents, which interact with each other to produce concrete components. Such agents need infrastructure supporting communication between them, code generation and transformation, and interaction with the programmer.

Active Libraries require languages and techniques which open up the programming environment. Implementation technologies for active libraries include the following [CEG+98]:

Extensible compilation and metalevel processing systems In metalevel processing systems, library writers are given the ability to directly manipulate language constructs. They can analyze and transform syntax trees, and generate new source code at compile time. The MPC++ metalevel architecture system [IHS+96] provides this capability for the C++ language. MPC++ even allows library developers to extend the syntax of the language in certain ways (for example, adding new keywords). Other examples of metalevel processing systems are Open C++ [Chi95], Magik [Eng97], and Xroma [CEG+98]. An important differentiating factor is whether the metalevel processing system is implemented as a preprocessor, an open compiler, or an extensible programming environment (e.g. Intentional Programming; see Section 6.4.3).

Program Specialization Researchers in Partial Evaluation have developed an extensive theory and literature of code generation. An important discovery was that the concept of generating extensions [Ers78] unifies a very wide class of apparently different program generators. This has the major advantage that program generators can be implemented with uniform techniques, across applications as diverse as parsing, translation, theorem proving, and pattern matching. Through partial evaluation, components which handle variability at run time can be automatically transformed into component generators (or generating extensions in the terminology of the field, e.g. [DGT96]) which handle variability at compile time. In some cases this can spare library developers the need to work with complex metalevel processing systems. Automatic tools for turning a general component into a component generator (i.e. a generating extension) now exist for several programming languages, including Prolog, Scheme, and C (see [JGS93]).

Multi-level languages Another important concept from partial evaluation is that of two-level (or more generally, multi-level) languages. Two-level languages contain static code (which is evaluated at compile-time) and dynamic code (which is compiled, and later executed at run-time). Multi-level languages [GJ97] can provide a simpler approach to writing program generators (e.g., the Catacomb system [SG97]).

The C++ language includes some compile-time processing abilities quite by accident, as a byproduct of template instantiation. Nested templates allow compile-time data structures to be created and manipulated, encoded as types; this is the basis of the expression templates technique [Vel95a]. The template metaprogram technique [Vel95b] exploits the template mechanism to perform arbitrary computations at compile time; these “metaprograms” can perform code generation by selectively inlining code as the “metaprogram” is executed. This technique has proven to be a powerful way to write code generators for C++. In this context, we can view C++ as a two-level language.

Runtime code generation (RTCG) RTCG systems allow libraries to generate customized code at run-time. This makes it possible to perform optimizations which depend on information not available until run-time, for example, the structure of a sparse matrix or the number of processors in a parallel application. Examples of such systems which generate native code are `C (Tick-C) [EHK96, PEK97] and Fabius [LL95]. Runtime code modification can also be achieved using dynamic reflection facilities available in languages such as Smalltalk and CLOS. These languages also expose their own definitions to programmers in the form of extensible libraries.