# Translation

Usually done by special programs such as:

• compilers, translating HLL instructions into machine code
• assemblers translating mneonmic form of machine instructions

it is possible for a compiler to generate assembley code which is then passed onto an assembler.

C is an ideal languaegh for writing a compiler.

An interpreter is differently from a compiler because it analyses the program and what it won’t do is generate code. It will not generate code. Instead it trys to work out what your program is saying and then the interpreter carries out actions on behalf of your program.

# Compilers V Interpreters

Compilers

• take a source program and translates uit into whole machine code

Interpreters

• take an instruction at a time to determine its function then carry out those functions
• no binary is generated

Execution of compiled code is much faster than interpretation

Interpreters are more suitable for rapid prototyping and for other situations when a program is frequenelty modified.

Interpretation can make it easier to move programs between platforms.

# Interpreters as virtual machines

An interpreter is somewhat similar to computer hardware.

Because of that they are sometimes reffered to as a virtual machine.

Some systems combine compilation and interpretation.

HLL (highl evel language) is converted to machine-independent virtual machine code. Virtual code is then interpreted on each platform.

# The lexical component

The lexical component is a list of all legal words (elementary expressions) in the language, together with information about each word.

For example Java has “import”, “Public”, “else” are all legal words. THey go together with informkation about their meanings and roles.

What is the name used to refer to basic items in the lexical component of a language?

Lexans Lexoids Lexemes Lexels Lecterns

Lexeme

# The syntatical component

Syntax defines the form andd structure of logical expressions of the language.

For example in Java:

double private x, y;


is not a correct fragment of code although lexally it is correct.

# The semenatic component.

The sementic component deals with the meaning of expressions.

In programming languages the sementic component defines a course of action to be performed. When executing a partitcular fragment of a program.

For example in Java the fragment:

(if m < n){
System.out.println("the minimum is " + m)
}


means

check if m < n if it is true then print the message otherwise skip this instruction.

# Programming language descriptions

The precise description of a programming language is needed for the translator:

• To analyse the (source) program

# Informal narrative description

One way to define a language is in an informal way.

Example:

Pascal is written free forms iwth no required layout, tokens are seperated by spaces, tabs, carriage returns or comments.

A block consists of the following component:

• label declartions
• constant definitions

a long document that describes how the language works.

It is easily understandable by a human and it is useful as a reference source but it is not precise enough to implement a translator.

# Grammars

A grammar is a set of syntactic rules.

Syntatic rules define syntatic constructions in terms of other syntatic stuff.

## Syntatic Diagrams

One way of representing syntax rules of a grammar is by syntax diagrams (slide 16, put into Quizlet).

Here we have non-terminal expressions.

## Backus-Naur form

Dealing with diagrams is fine for us but not fine for a compiler. Instead we use something called BNF which has been around for a long time. There’s some debate over what BNF stands for.

Question: What else does BNF stand for?

The thing on the left is referred to a non terminal, it needs to be expanded.

The things on the right can consist of non terminals and terminal symbols.

-> defined as
means "or". AB = A or B
<name> non terminal symbol. Anything in angle brackets are a non terminal symbol.
symbols = terminal symbol like the word if or the word then
[symbols] means optional symbols. it does not have to occur.
{symbols} means the symbols are repeated 0 or more times

example:

<identifier> -> <letter> {<letter>|<digit>}


A{b}C doesn’t match

ABBBBBBBCCCC ABC AC ACC

ACC because Correct: Must start with a and end with c. Can have zero or any number of b’s in the middle.

# parsing

Examing the program and trying to work out what gramatical constructs are present.

If you run a program through the compiler what a compiler would try to do is apply all the grammatical rules it has in order to work out the grammatical structure of your program.

It’s loooking to see whether applying all the grammatical rules it knows about it’s looking to find if it can repalce the symbols in the program to what the rules actually look like.

It tries to place rules in your program to match the source code but to apply the rules its knows about.

It applies the rules it knows about in order to match your program.

If the compiler cannot do it then your program has a mistake. It is not the compiler that is at fault, it is you the programmer at fault.

# Compilation Process

The compilation process is normally broken down into the following major steps:

• Lexical analysis
• Syntax analysis
• Semantic analysys
• Code generation
• Code optimisation

# Compiler Implementation

A compiler is often written in a high level language.

Problem: You always need both compilers available. A compiler for the HLL and the compiler itself.

# Bootstrapping

Quite often a compiler is written in the language it compiles.

C compiler written in C language

We can then run the compielr through itself. We can use the C compiler to compile the C compiler. This is called bootstrapping.

# Lexical analysis

The test of the program is converted into a sequence of tokens, representing distinct elements of the program EG number, denotations, variable identifiers, key words

As they are encountered, identifiers are entered into a symbol table.

# Syntax Analysis

The source program is analysed to uncover the grammatical form of the constructions used.