Question:
- Explain structure of compiler OR
- Explain phases of compiler OR
- Analysis synthesis model of compilation OR
- Write the output of each phase of a compiler for a = a + b * c * 2; where a, b and c are of type float.
Two parts of compilation process:
- Analysis phase: The main objective of the analysis phase is to break the source code into parts and arrange these pieces into a meaningful structure.
- Synthesis phase: The synthesis phase is concerned with generating a target language statement that has the same meaning as the source statement.
Analysis phase: the analysis part is divided into three sub-parts,
- Lexical analysis
- Syntax analysis
- Semantic analysis
Lexical analysis:
- Lexical analysis is also called linear analysis or scanning.
- The lexical analyzer reads the source program and breaks it into a stream of units. Such units are called tokens.
- It then classifies the units into different lexical classes, e.g. identifiers, constants and keywords, and enters them into different tables.
- For example, in lexical analysis the assignment statement a = a + b * c * 2 would be grouped into the following tokens:
Lexeme | Token |
---|---|
a | Identifier 1 |
= | Assignment sign |
a | Identifier 1 |
+ | Plus sign |
b | Identifier 2 |
* | Multiplication sign |
c | Identifier 3 |
* | Multiplication sign |
2 | Number 2 |
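
The token table above can be reproduced with a small scanner. This is only a minimal sketch: the token class names (`NUMBER`, `ID`, `ASSIGN`, `PLUS`, `MUL`) and the `tokenize` function are illustrative assumptions, not part of any particular compiler.

```python
import re

# Hypothetical token classes; the names are illustrative only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS",   r"\+"),
    ("MUL",    r"\*"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (token, lexeme) pairs; identifiers are numbered by first appearance."""
    symbol_table = {}  # identifier name -> index (id1, id2, ...)
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # white space separates tokens but is not a token itself
        if kind == "ID":
            index = symbol_table.setdefault(lexeme, len(symbol_table) + 1)
            tokens.append((f"id{index}", lexeme))
        else:
            tokens.append((kind, lexeme))
    return tokens

tokens = tokenize("a = a + b * c * 2")
for token, lexeme in tokens:
    print(lexeme, "|", token)
```

Both occurrences of a map to the same entry (id1), which is why the table lists "Identifier 1" twice.
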
Syntax Analysis:
- Syntax analysis is also called hierarchical analysis or parsing.
- The syntax analyzer checks whether the stream of tokens conforms to the grammar of the language and reports any syntax error it finds.
- If the code is error free, the syntax analyzer generates the parse tree.
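
To illustrate how a tree is generated, here is a minimal recursive-descent sketch for the example statement. The grammar (assignment, `+`, `*`, identifiers and integer constants), the nested-tuple tree format and the name `parse_assignment` are assumptions made for this illustration only.

```python
import re

def parse_assignment(source):
    """Parse `id = expr` (expr uses + and *) and return a nested-tuple tree."""
    # Simplified scanner: characters outside this pattern are silently skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+*]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return token

    def factor():
        # factor -> identifier | number
        token = advance()
        if token in {"=", "+", "*"}:
            raise SyntaxError(f"operand expected, found {token!r}")
        return token

    def term():
        # term -> factor ('*' factor)*   (left associative)
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr -> term ('+' term)*       ('*' binds tighter than '+')
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()
    if not (target[0].isalpha() or target[0] == "_"):
        raise SyntaxError("assignment target must be an identifier")
    if advance() != "=":
        raise SyntaxError("expected '='")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_assignment("a = a + b * c * 2")
print(tree)
```

For the example, the tree places the two multiplications below the addition, so b * c * 2 is evaluated before it is added to a. A statement such as `a = + b` is rejected with a syntax error.
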
Semantic analysis:
- The semantic analyzer determines the meaning of a source string.
- For example, it checks that arithmetic operations are performed on type-compatible operands and that each name is used within its scope (purely structural checks, such as matching of parentheses or of if..else, are normally handled by the syntax analyzer).
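
A type-compatibility check for the example can be sketched as follows. The tree format (nested tuples), the `DECLARED` table and the `inttoreal` node name are assumptions for illustration; a, b and c are float as stated in the question, so the integer constant 2 has to be converted.

```python
# Declared types from the question: a, b and c are float (hypothetical table).
DECLARED = {"a": "float", "b": "float", "c": "float"}

def annotate(node):
    """Return (typed_node, type); int operands are wrapped in 'inttoreal' where a float is needed."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"
        if node not in DECLARED:
            raise NameError(f"undeclared identifier {node!r}")
        return node, DECLARED[node]

    op, left, right = node
    left, left_type = annotate(left)
    right, right_type = annotate(right)

    if op == "=":
        # The type of the target decides; only int -> float widening is allowed.
        if left_type == "float" and right_type == "int":
            right = ("inttoreal", right)
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (op, left, right), left_type

    if left_type != right_type:
        # Mixed int/float arithmetic: widen the int operand.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "float"
    return (op, left, right), left_type

tree = ("=", "a", ("+", "a", ("*", ("*", "b", "c"), "2")))
typed_tree, result_type = annotate(tree)
print(typed_tree)
```

The only change to the tree is that the constant 2 is wrapped in a conversion node, which is where the int-to-real conversion in the intermediate code comes from.
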
Synthesis phase: the synthesis part is divided into three sub-parts,
- Intermediate code generation
- Code optimization
- Code generation
Intermediate code generation:
- The intermediate representation should have two important properties: it should be easy to produce and easy to translate into the target program.
- We consider an intermediate form called “three address code”.
- Three address code consists of a sequence of instructions, each of which has at most three operands.
- The source program might appear in three address code as,
Three address code |
---|
t1 = inttoreal(2) |
t2 = id3 * t1 |
t3 = t2 * id2 |
t4 = t3 + id1 |
id1 = t4 |
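
The listing above can be produced by a post-order walk over the tree, creating one temporary per operator. This is a sketch under assumptions: the nested-tuple tree, the `inttoreal` node and the function name `gen_tac` are hypothetical, and the tree is written in the operand order that the listing uses.

```python
def gen_tac(tree):
    """Translate an assignment tree into a list of three address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def visit(node):
        # Leaves (identifiers, constants) are their own "place".
        if isinstance(node, str):
            return node
        if node[0] == "inttoreal":
            operand = visit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttoreal({operand})")
            return temp
        # Binary operator: evaluate both operands first, then emit the operation.
        op, left, right = node
        left_place = visit(left)
        right_place = visit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    op, target, expr = tree
    if op != "=":
        raise ValueError("expected an assignment at the root")
    place = visit(expr)
    code.append(f"{target} = {place}")
    return code

# id1 = a, id2 = b, id3 = c
tree = ("=", "id1", ("+", ("*", ("*", "id3", ("inttoreal", "2")), "id2"), "id1"))
tac = gen_tac(tree)
print("\n".join(tac))
```

Each instruction has one operator and at most three operands (two sources and one result), which is the defining property of three address code.
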
Code optimization:
- The code optimization phase attempts to improve the intermediate code.
- This is necessary to obtain code that executes faster or consumes less memory.
- Thus, by optimizing the code, the overall running time of the target program can be improved.
- After optimization, the three address code for the example becomes,
Optimized code |
---|
t1 = id3 * 2.0 |
t2 = id2 * t1 |
id1 = id1 + t2 |
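
Two simple improvements are enough to get from five instructions to three: converting the constant 2 at compile time, and writing the last result directly into id1 instead of copying it. The sketch below assumes a hypothetical quadruple format `(dest, op, arg1, arg2)` and assumes each temporary is used only once, so the copy can be removed safely. It does not renumber temporaries or reorder operands, so the result matches the optimized listing above only up to naming and operand order.

```python
def optimize(code):
    """code: list of (dest, op, arg1, arg2) quadruples; returns an improved list."""
    constants = {}  # temporary -> float literal known at compile time
    out = []
    for dest, op, arg1, arg2 in code:
        # 1. Compile-time conversion: inttoreal(2) becomes the literal 2.0.
        if op == "inttoreal" and arg1.isdigit():
            constants[dest] = str(float(arg1))
            continue
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        # 2. Copy elimination: "t = ..." followed by "x = t" becomes "x = ...".
        #    Assumes t is not used anywhere else.
        if op == "copy" and out and out[-1][0] == arg1:
            previous = out.pop()
            out.append((dest,) + previous[1:])
            continue
        out.append((dest, op, arg1, arg2))
    return out

code = [
    ("t1", "inttoreal", "2", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "*", "t2", "id2"),
    ("t4", "+", "t3", "id1"),
    ("id1", "copy", "t4", None),
]
optimized = optimize(code)
for quad in optimized:
    print(quad)
```
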
Code generation:
- In the code generation phase the target code is generated. The intermediate code instructions are translated into a sequence of machine instructions.
Target code |
---|
MOV id3, R1 |
MUL #2.0, R1 |
MOV id2, R2 |
MUL R2, R1 |
MOV id1, R2 |
ADD R2, R1 |
MOV R1, id1 |
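
The translation into the target code above can be sketched with a very small generator that keeps the running result in R1 and uses R2 for the other operand. The quadruple format and the name `generate` are assumptions for illustration. The sketch also assumes that only the commutative operators + and * occur, that the first operand of the first instruction is a variable, and that every temporary is consumed by the very next instruction.

```python
import re

OPCODES = {"+": "ADD", "*": "MUL"}

def generate(code):
    """code: list of (dest, op, left, right) quadruples; returns assembly lines."""
    asm = []
    in_r1 = None  # name of the value currently held in R1
    for dest, op, left, right in code:
        if in_r1 == right:
            # + and * are commutative, so swap to keep the R1 value on the left.
            left, right = right, left
        if in_r1 != left:
            asm.append(f"MOV {left}, R1")
        if right[0].isdigit():
            # Constants are used as immediate operands.
            asm.append(f"{OPCODES[op]} #{right}, R1")
        else:
            asm.append(f"MOV {right}, R2")
            asm.append(f"{OPCODES[op]} R2, R1")
        if not re.fullmatch(r"t\d+", dest):
            # The result belongs to a program variable: store it back to memory.
            asm.append(f"MOV R1, {dest}")
        in_r1 = dest
    return asm

optimized = [
    ("t1", "*", "id3", "2.0"),
    ("t2", "*", "id2", "t1"),
    ("id1", "+", "id1", "t2"),
]
asm = generate(optimized)
print("\n".join(asm))
```

A real code generator also has to handle register allocation, non-commutative operators and temporaries that live longer than one instruction, which this sketch leaves out.
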
Symbol Table:
- A symbol table is a data structure used by a language translator such as a compiler or interpreter.
- It is used to store names encountered in the source program, along with the relevant attributes for those names.
- Information about the following entities is stored in the symbol table:
- Variable/Identifier
- Procedure/function
- Keyword
- Constant
- Class name
- Label name
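
A symbol table can be sketched as a dictionary from names to attribute records. The class below is a minimal illustration only; the attribute names (`kind`, `type`) and the methods `insert` and `lookup` are assumptions, and real compilers usually add scope handling on top of this.

```python
class SymbolTable:
    """Maps each name to a record of its attributes."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, kind, **attributes):
        # A second declaration of the same name is reported as an error.
        if name in self._entries:
            raise KeyError(f"{name!r} is already declared")
        self._entries[name] = {"kind": kind, **attributes}

    def lookup(self, name):
        # Returns the attribute record, or None if the name is unknown.
        return self._entries.get(name)

table = SymbolTable()
for name in ("a", "b", "c"):
    table.insert(name, kind="variable", type="float")

print(table.lookup("a"))
print(table.lookup("x"))
```

The lexical analyzer would typically insert names as it meets them, and later phases such as semantic analysis and code generation would look up their attributes.
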