Translation of source code to object module : The Preprocessor Compilation Process
Written by Administrator
The preprocessor (we'll be talking about the C preprocessor) is traditionally a separate program invoked by the compiler as the first part of the translation of the source code written by a programmer. The source code is translated from a source file (encoded in a source character set) to an object module (remember the .obj files that you see while compiling C programs?). The product of preprocessing, that is, the source file together with everything pulled in by #include and without the sections skipped by conditional directives, is known as a translation unit. It should be noted that the properties of scoping, visibility and linkage apply to this translation unit and not to the raw source code.

For the sake of clarity, I restate that the translation unit described above is the product of preprocessing, and the properties of scoping, visibility and linkage cannot be applied at any stage before this unit is obtained.
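A small sketch may make this concrete. The function name `scaled` and the macro `FACTOR` below are made up for illustration. The point it tries to show is that a `#define` written inside a block does not obey block scope, because macro replacement happens before scoping rules exist at all.

```c
/* Hypothetical example: names are illustrative only. */
int scaled(int x)
{
    {
        /* This directive sits inside a block, but the preprocessor
           knows nothing about blocks or scope. */
#define FACTOR 3
        int inner = x;   /* 'inner' has block scope */
        (void)inner;
    }
    /* 'inner' is out of scope here, yet FACTOR is still replaced by 3:
       macro replacement was finished before scoping was applied. */
    return x * FACTOR;
}
```

Calling `scaled(2)` should give 6, even though `FACTOR` was "declared" in a block that has already ended.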
The compilation process is a non-deterministic, abstract finite state machine.
The above statement is quite short and simple, but its complete meaning is rarely understood by novices. Note the three things said about the compilation process. I'll break down and analyze each part of this important sentence.
Non-deterministic : The first term is 'non-deterministic'. Compilation is said to be non-deterministic because some behavior is deliberately left open to the implementation. Four kinds of situations may occur :
- Implementation defined : As the name suggests, these are left to the implementing agent, which must document its choice. An example would be the right shift (and right shift assign) operator applied to a negative signed value.
- Undefined : No requirements are imposed at all, so a program can give different results on different systems or architectures, depending on the support that the hosted environment provides. An example would be signed int overflow.
- Unspecified : These are left to be implemented in the most efficient manner, and the chosen behavior need not be documented. A good example would be the order of evaluation of arguments passed to a function. There is no fixed order in which arguments are evaluated, resulting in possibly different behavior on different calls to a particular function.
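Fn(++a, ++a) - Here the ',' is not an operator but a separator, and the order in which the two arguments are evaluated is unspecified. The sequence point comes after all the arguments have been evaluated, just before the function is entered. (Strictly speaking, this call also modifies a twice before that sequence point, so its behavior is undefined; Fn(g(), h()) shows unspecified order on its own.)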
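To see unspecified argument order without straying into undefined behavior, a sketch along the following lines can be used. All names (`g`, `h`, `add`, `order_log`) are invented for this example. Each argument is a function call that records when it ran.

```c
#include <string.h>

static char order_log[3];   /* records which argument was evaluated first */
static int  log_pos;

static int g(void) { order_log[log_pos++] = 'g'; return 1; }
static int h(void) { order_log[log_pos++] = 'h'; return 2; }

static int add(int x, int y) { return x + y; }

/* Returns the sum of both arguments; afterwards order_log holds
   either "gh" or "hg", depending on the compiler's choice. */
int call_with_two_arguments(void)
{
    log_pos = 0;
    memset(order_log, 0, sizeof order_log);
    return add(g(), h());   /* ',' is a separator here, not an operator */
}

const char *evaluation_order(void) { return order_log; }
```

The sum is the same either way; only the recorded order may differ between compilers, or even between optimization levels of the same compiler.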
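In order to explain the examples more clearly, three terms need to be understood first :
Side Effects : Anything that changes state, such as modifying an object or performing I/O.
Sequence Point : A point in the execution of the abstract machine at which all side effects of previous evaluations are complete and no side effects of later evaluations have started.
Agreement Points : Sequence points at which the state of the real implementation must agree with the state of the abstract machine, subject to constraints.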
It is required that the same object is not modified more than once between two consecutive sequence points. This point is illustrated by the following example :
ARY[b++] = ++b + c; - Here the only sequence point is at the end (the ; terminator), and b is modified twice before that sequence point is reached, hence the behavior is undefined.
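One way to get well-defined behavior is to split the statement so that each modification of b is separated from the next by a sequence point. The sketch below assumes one particular intent (index with the old value of b, then increment twice); the original statement does not tell us which intent the programmer had, and the function name `store_defined` is made up.

```c
/* One possible reading of  ARY[b++] = ++b + c;  rewritten so that
   b is modified at most once between consecutive sequence points. */
void store_defined(int *ary, int *b, int c)
{
    int index = *b;        /* old value of b, used as the subscript */
    *b += 1;               /* side effect of b++ ; the ';' is a sequence point */
    *b += 1;               /* side effect of ++b */
    ary[index] = *b + c;   /* b is only read in this expression */
}
```

Starting with b equal to 1 and c equal to 10, this stores 13 in element 1 and leaves b at 3 on every conforming compiler.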
Early (eager) evaluation oriented languages - Arguments are evaluated before the body is entered. Eg - C, C++.
Lazy evaluation oriented languages - Arguments are not evaluated until they actually get used. Eg - Haskell.
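Early evaluation in C can be observed with a short sketch. The names `noisy`, `ignore_argument` and `eager_demo` are hypothetical. The argument has a visible side effect, and the called function never uses its parameter.

```c
static int calls;   /* counts how often noisy() runs */

static int noisy(void) { calls++; return 42; }

/* The parameter is never used inside the body. */
static int ignore_argument(int unused) { (void)unused; return 0; }

/* Returns how many times the argument expression was evaluated. */
int eager_demo(void)
{
    calls = 0;
    ignore_argument(noisy());   /* noisy() runs before the body is entered */
    return calls;
}
```

In C the count comes out as 1 even though the value is thrown away; in a lazily evaluated language the corresponding argument would not need to be evaluated at all.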
- Locale Specific : This behavior depends on local conventions and on the implementation of the C library, and is not defined by the compiler (GCC, for example) itself. Eg - stripping of spaces between characters.
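Two of the categories listed above, implementation-defined and undefined behavior, can be sketched in C as follows. The function names are made up for this example. The first function simply exposes a right shift, whose result for negative values is up to the implementation; the second shows one common way to avoid signed overflow by testing before adding.

```c
#include <limits.h>
#include <stdbool.h>

/* Implementation-defined for negative 'value': the compiler must
   document what it does. 'count' is assumed to be non-negative and
   smaller than the width of int, otherwise the shift is undefined. */
int shift_right(int value, int count)
{
    return value >> count;
}

/* Avoids the undefined behavior of signed overflow by checking first.
   Returns false (and leaves *sum untouched) if a + b would not fit. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}
```

For non-negative values `shift_right` behaves the same everywhere; it is only a call such as `shift_right(-8, 1)` whose result has to be looked up in the compiler's documentation.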
Abstract : The second term is 'abstract'. The compilation process is said to be abstract because there is no physical implementation of the represented steps. Everything works strictly according to an 'AS-IF' philosophy, where the steps seem to occur AS-IF a model were being followed, although that model is not actually being followed (in the processor and control unit).
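As a rough illustration of the AS-IF idea, consider the two functions below (both names are invented here). The first spells out the steps the abstract model describes; the second is the kind of code an optimizer is allowed to produce in its place, since a caller cannot tell the difference. Whether any particular compiler really performs this transformation is an assumption, not something this sketch demonstrates.

```c
/* What the source says: add the numbers 1..n one at a time. */
unsigned sum_loop(unsigned n)
{
    unsigned total = 0;
    for (unsigned i = 1; i <= n; i++)
        total += i;
    return total;
}

/* What could run instead: same result, no loop.
   Only equivalent while n * (n + 1) does not wrap around. */
unsigned sum_closed_form(unsigned n)
{
    return n * (n + 1u) / 2u;
}
```

For small n both return the same value, so a program that only observes the result behaves AS-IF the loop had been executed step by step.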
Finite state machine : The third and last term is 'finite state machine'. A set of states is maintained, and changes in these states reflect the actual state of the program.
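For readers who have not met finite state machines before, here is a minimal sketch of one, loosely inspired by how a translator has to recognise comments. It is not how any real preprocessor is written; the state names and the function `count_outside_comments` are hypothetical, and string literals and // comments are deliberately ignored.

```c
#include <stddef.h>

/* The four states the machine can be in. */
enum state { CODE, SLASH, COMMENT, STAR };

/* Counts the characters of 'src' that lie outside block comments.
   Each input character moves the machine from one state to the next. */
size_t count_outside_comments(const char *src)
{
    enum state st = CODE;
    size_t count = 0;

    for (; *src != '\0'; src++) {
        char ch = *src;
        switch (st) {
        case CODE:                      /* ordinary source text */
            if (ch == '/') st = SLASH;
            else count++;
            break;
        case SLASH:                     /* seen '/', may start a comment */
            if (ch == '*') st = COMMENT;
            else if (ch == '/') count++;          /* earlier '/' was code */
            else { count += 2; st = CODE; }       /* '/' and ch were code */
            break;
        case COMMENT:                   /* inside a block comment */
            if (ch == '*') st = STAR;
            break;
        case STAR:                      /* seen '*', may end the comment */
            if (ch == '/') st = CODE;
            else if (ch != '*') st = COMMENT;
            break;
        }
    }
    if (st == SLASH) count++;           /* a trailing '/' was code */
    return count;
}
```

The whole "memory" of the machine is the single variable `st`; that is what makes it a finite state machine.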
With that, we have finally understood the meaning of the sentence 'The compilation process is a non-deterministic, abstract finite state machine.' :)
In the next section, I will proceed to the eight conceptual phases of translation of the source file by the preprocessor.
