CS代考 10/6/21 – cscodehelp代写

10/6/21
The Parsing Process
COMP712 Programming Languages
AUT
1
A typical view of the compilation process:
parsing
Your Program
Lexical Analyser
Syntax Analyser
What is generated?
Symbol Table Parse Tree Tokens = terminal words
2
12
A typical view of the compilation process:
parsing
What happens next?
Code generation
Interpretation i.e walk the tree
3
Your Program
Lexical Analyser
Syntax Analyser
A typical view of the compilation process:
parsing
But the above does not say whether one performs a complete lexical analysis prior to performing syntax analysis or doing both simultaneously.
Your Program
Lexical Analyser
Syntax Analyser
4
34
Thus, we could have a two-pass parser:
1 pass
Or, a single pass parser: 1 pass
2 pass
In the previous lecture, we look at parsing from the LA standpoint. Now we look at it from the SA standpoint
5
Your Program
Lexical Analyser
Syntax Analyser
Syntax Analyser
Lexical Analyser
Your Program
Consider (single) parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
First, what does parsing means?
It means syntax analyser calls lexical analyser to get tokens and use them to build a parse tree immediately
6
56
1

10/6/21
Consider parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “begin”?
LA finds the entry “begin” in the hash table
It returns s.begin to the SA
What happens next?
7
Consider parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
SA knows what s.begin means. So, what action is taken?
SA ignores s.begin and starts the tree building process.
SA initialises tree:
Tree
8
78
Consider parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “integer”?
LA finds the entry “integer” in the hash table. Why? What does it return to SA?
s.integer
9
Consider parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
SA knows what s.integer means. So, what actions are taken?
SA sets declaration = true
SA sets decl_type = s.integer
Global variables
10
9 10
Consider parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “x”?
LA looks for the entry “x” in the hash table
But it is not there!
11
LA creates an entry “x” in the symbol table:
Begin integer x, y; x:=5;
y := x;
:
:
Begin real y;
: : end;
End;
/ “x” Isthatall? No
P1P2 flags /
Flags:
It starts gathering its runtime properties
Declared: yes
Type: s.integer Reserved or user-defined Block level: 0
Offset: 0
12
11 12
2

10/6/21
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “,”?
It returns s.comma to the SA
SA ignores s.comma and knows more declaration coming.
13
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “y”?
LA looks for the entry “y” in the hash table
Again, it is not there!
14
13 14
Begin integer x, y; x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
/
………
“x”
LA creates another entry in the symbol table.
/ “y” P1P2 flags /
Flags:
Declared: yes
Type: integer Reserved or user-defined Block level: 0
Offset: 1
15
Block level tells you that these variables are at the same level i.e using the same data frame.
Enter a new block level, you reset offset value to zero and increase block level by 1.
Exit a new block level, you reset offset value to the previous value and decrease block level by 1.
Offset value allows you to index its location on the stack.
16
15 16
Recall what is a data frame?
The memory required by your program
In this program, we need 2 locations at the outer block
To get x, you do:
Base address + offset of x
Begin integer x, y; x := 5;
y := x;
:
:
Begin real y;
:
y := 0.5*x; end;
: End;
y:
x:
17
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “;”?
It returns s.semicolon to the SA
SA ignores it and sets decl_type = nil.
18
17 18
3

10/6/21
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “x”?
LA finds the entry “x” in the hash table
It returns s.name and a pointer to its runtime properties in the symbol table to SA
19
SA gets s.name and a pointer to the symbol table:
Flags:
Declared: yes Type: s.integer user-defined
Block level: 0 Offset: 0
/ “x”
P1 P2
flags /
Since SA did not encounter s.integer or s.real or s.procedure, it knows declaration has finished. So, declaration := false. It now expects an operator next.
20
19 20
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
After reading x, what is read?
: not := and reading ”:”, what happens next?
It has to check if the next symbol is “=“ and if so, returns?
s.becomes to SA
21
SA gets s.becomes and starts to build the first sentence which is an assignment expression.
/ “x”
:= ………
/ “y”
s.name
Note where
this arrow is pointing to
P1P2 flags /
22
21 22
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “5”?
LA returns the value 5 and its type s.integer to SA
SA knows the right hand side is an arithmetic expression. Call a function to build such a tree.
23
Continue parsing the simple Algol-like program:
Begin integer x, y; x := 5;
y := x;
:
Begin real y;
: end;
: End;
What happens when you read “;”?
LA returns s.semicolon to SA
SA knows end of statement reached.
24
23 24
4

10/6/21
SA now completes the assignment expression tree. It checks that variable x is of the right type.
/ “x”
………
/ “y” P1P2 flags /
s.name
:=
s.value 5
25
SA builds the first statement tree in the program
Tree
/ “x”
:= ………
/ “y”
x5
P1 P2 flags /
26
25 26
So, after parsing the whole program, the
tree looks like
……
/ “x”
………
/ “y”
P1 P2 flags /
Tree
First statement
:=
x
5
27
Begin integer x, y; x := 5;
y : = x;
:
:
Begin real y;
: end;
: End;
What happens at this point after reading “;”?
Again, SA can now create a tree for the legal sentence, y := x. Draw the tree.
28
27 28
References to x and y are as follows:
/ “x”
:= ………
/ “y”
yx
P1P2 flags /
29
Begin integer x, y; x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
Current situation
/ “x”
………
/ “y” P1P2 flags /
What happens when y is read?
30
29 30
5

10/6/21
Begin integer x, y; x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
You create a new descriptor for this “y”
/ “y”
flags /
flags /
Which entry is for real y and which is for integer y?
31
From this point, every time you access y, you get the runtime properties of real y.
Begin integer x, y; :
x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
/ “y” flags
flags
Flags:
Declared: yes Type : real User-defined Block level: 1 Offset : 0
/ /
What happens at this point, after reading “;”?
Flags:
Declared: yes Type : integer User-defined Block level: 0 Offset : 1
32
31 32
Begin integer x, y; :
x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
You de-reference all variables declared in that block
/ “y”
flags /
flags /
You are here
After de-referencing, how do you know that this descriptor is variable “y”
33
Begin integer x, y; :
x := 5;
y := x;
:
:
Begin real y;
: end;
: End;
/ “y”
flags /
flags /
You are here
Because you still have P2, pointing to its name.
34
33 34
Begin integer x, y; :
x := 5;
y := x;
:
:
Begin real y;
: end;
print (y);
: End;
/ “y”
flags /
flags /
Where is the y in print (y) pointing to?
35
What was the design issue we encountered so far?
It is a question of where variables exist and when they become accessible.
Technically, we are referring to the scope and extend of a variable.
36
35 36
6

10/6/21
What is the scope and extend of x?
Begin integer x, y; :
x := 5; y := x;
:
Begin real y; :
end;
print (y);
: End;
Scope of X
Extend of X
37
y?
Begin integer x, y; :
:
print (y);
: End;
x := 5; y := x;
What is
the
scope
and
extend
of end; y y integer
Begin real y; :
Scope Extend of of integer integer
38
37 38
Begin integer x, y; :
What is x := 5; the y := x;
scope :
and Begin real y;
extend : of real end;
Scope of real y
Extend of real y
y?
print (y);
: End;
39
In Java, you can do:
[lbl] start: alert(“Lather”); alert(“rinse”); goto start;
What is the implication of allowing goto’s (from the symbol table standpoint)?
Can’t see it?
40
39 40
How about this classic use of GOTO in C:
for ……
for ………..
if (breakout-condition)
goto final; :
final:
Can you now see the difficulty handing
goto’s? Youneedforwardreferencing! How do you handle forward referencing?
41
Forward Referencing
Need to provide provisionally declared descriptors
When encountering them, have to change these descriptors to “declared”
Have to look out for those that are genuinely not declared and hence generate appropriate error messages.
42
41 42
7

10/6/21
Hence, in language design, having increasingly sophisticated/liberal linguistic structures require an increasingly sophisticated compiler.
From now on, when you learn to use a new language, you do not only learn where you can declare variables but also what storage management is needed to support such methods for declaring variables in the language.
43
Parsing Theory
COMP712 Programming Languages
AUT
44
43 44
Earlier I said:
The SA gets a s.integer and it
knows what to do.
What exactly do I mean by that? In other words, how does the SA know what to do?
45
Before we begin, a bit of history:
The Fortran project undertaken by IBM led the change in attitude that a programming language is not about writing codes for a particular computer but rather for the communication of algorithms to computers.
However, it was Algol-60 that became the major force in this change of attitude and much of what we know about compilers came from the challenge posed by the task of constructing compilers for Algol-60.
46
45 46
Parsing
• The goal is to decide if a string is a legal instance of S
In other words, we need to know:
X := X + 5; is an arithmetic expression of L
But how? Follow the grammar of the language
47
Parsing
• Alternatively,incompilerterms,toshowthatthere is a parse tree whose root is S and whose leaves form
the string STR.
:=
• Parsingmaybeseenasa x + constructive proof process –
why? x5
X := X + 5;
A legal assignment statement
Because it attempts to prove the legality or otherwise of STR by constructing THE parse tree.
48
47 48
8

10/6/21
Parsing
• Whatif,wecouldconstructatree,outofseveral,to show that the string STR is legal?
• Thatmeansthegrammarisambigousandtheparser is a non-deterministic parser.
• Analternativeviewofparsingisthatitisnotaproof process that would say yes or no but a process that attempts to construct the most reasonable partial parse.
49
Parsing
• Discoveringthetreeamountstodiscovering appropriate application instances of the rules which comprise the grammar G. Recall the tutorial 4:
Provide one derivation for the sentence aaabbbccc from the grammar G:
1. SàaBC
2. SàaSBC
3. aBàab
4. bBàbb
5. CBàBC
6. bC àbc
7. cCàcc
50
49 50
Parsing
Provide one derivation for the sentence aaabbbccc from the grammar G:
1. SàaBC
2. SàaSBC
3. aBàab
4. bBàbb
5. CBàBC
6. bC àbc
7. cCàcc
Sà2 aSBCà2 aaSBCBCà1 aaaBCBCBCà3 aaabCBCBCà5 aaabBCCBCà4 aaabbCCBCà5 aaabbCBCC*à5 aaabbBCCC à4 aaabbbCCCà6 aaabbbcCC à7 aaabbbccCà7 aaabbbccc
At step 7(marked *) you use rule 5 to shift the B to the front and not rule 6 to convert bC to bc.
51
Recall this example too

The
The man
The man The man has The man has

The man has a
The man has a dog
(1)
(2) àThe
(8) àman
(3) (5) àhas
(4) (6) àa
(9) àdog
You apply rules 1- 2-8-3-5- 4-6-9
52
51 52
An alternative production
(1)
(3)

(4)

dog (9) a dog (6)
has a dog (5)
The has a dog (2)
The man has a dog (8)
But, you could also apply rules 1-3-4-9- 6-5-2-8
53
That means you can search the rules in many different ways
Parsing
• Discoveringthetreeamountstodiscovering appropriate application instances of the rules which comprise the grammar G.
• Thepointtorealizeisthatparsingessentially involves a search process.
• Crucially,programminglanguagesaredefinedto make the parsing process, and hence the search process, as easy as possible. Why?
54
53 54
9

10/6/21
Parsing
• Consequentlythesearchprocessesusedarevery dumb, by AI-standard – typically pattern matching or depth first search.
• Inallcases,thestringSTRisprocessedinaL-to-R manner
• Twoobviousstrategiesfordiscoveringtheparse tree T. Describe one.
55
Parsing Strategies
• StartwithSandtrytoexpandittoSTRbyrepeatedly expanding LHS of certain productions appropriately.
• Hencecalledtop-downandisclearlyagoaloriented process.
• Or,startfromtheSTRandattempttoreduceittoS by successfully reducing the RHS of productions to their LHS.
• Analogoustoabove,thisiscalledbottom-upandis essentially pattern matching.
56
55 56
An example of both strategies
Suppose we have a grammar for “hello world”:
SàHW
H à hello W à world
57
Parse tree 1S
S
Unprocessed input
W 3 Match H W
match
hello
S
hello
A ‘top down’ parse
2 Predict S H
Hello world
Hello world
world
world
time
SàHW Hàhello W à world
First token accepted
Second token accepted
58
4 Predict H
W
S
5 Match H W
hello world
57 58
Parse tree
hello
reduce H hello
H world hello
reduce H W hello world
S
reduce H
hello world
Unprocessed input
A ‘bottom up’ parse
1 2 3
4 5
6
W
Hello world
world
time
First token shifted
Second token shifted
SàHW
H à hello Wàworld
59
Parsing Strategies
• Atop-downparserisalsoknownasanLLparserand
a bottom-up parser, a LR parser
• LLstandsforleft-to-right,leftmostderivationandLR
stands for left-to-right, rightmost derivation.
• LLparserinvolvespredictingwhatisoutthereand
matching if the guessed terminal is as expected
• LRparserinvolvesshiftinginthenextterminaland reducing them (terminals and nonterminals) to a nonterminal.
60
59 60
10

10/6/21
LL and LR – which is which
S
S HW S
H
hello
hello
H
hello
H world
hello
H W
hello world
S HW hello world
S HW hello world
This is said to be a ‘Left-to-right, Left-most derivation’ parse – LL
Right derivation
W
This is said to be a ‘Left-to-right, Right-most derivation – LR
61
Parsing Strategies
• LLparsersarealsoknownaspredictiveparsers.
• LRparsersarealsoknownasshift-reduceparsers
• BothcouldbespecifiedasLL(n)orLR(n)wherenis0 or more. The number indicates how many terminals lookahead.
• ManyareLL(1)orLR(1).Onlysimplegrammarsuch as hello-world is LL(0) or LR(0).
62
61 62
Parsing Strategies
• Thereareseveralimportantsub-classesofLRparsers such as SLR, LALR, etc.
• Whichisbetter,LLparserorLRparser?
• OK,hereiswherewestopintermsofcompiler theory. We will look only at how LL parsing work as an example of parsing and to have some ideas of what is involved.
63
LL parsing Consider the grammar, G1:
SàE
EàE+T | T {this means: EàE+T and EàT }
T à T*i | I {i.e. “|” means alternative production }
Where i is a terminal of the language.
How do you start writing a skeleton program to implement an LL parser for G1?
64
63 64
LL parsing
Consider the grammar, G1: SàE
EàE+T | T {this means: EàE+T and EàT }
T à T*i | I. {i.e. “|” means alternative production }
(defun parseS () (parseE Str)) (defun parseE (Str) (……)
But note that there
are 2 productions in (defun parseT (Str) (……) each. Why?
65
LL parsing Consider the grammar, G1:
EàE+T | T {this means: EàE+T and EàT } So, you have to write:
(defun parseE (Str) (if (parseE+T) (parseE+T) (parseT)))
What the above means is that we try using one rule first and if not successful, we try the other rule.
(defun parseE+T (S) (parseE) (parse+) (parseT))
Look for E Look for + Look for T
66
65 66
11

10/6/21
Consider the grammar, G1:
EàE+T | T {this means: EàE+T and EàT }
(defun parseE (S) (if (parseE+T) (parseE+T) (parseT))) (defun parseE+T (S) (parseE) (parse+) (parseT))
Look for E Look for + Look for T
Do you see any problem using the above functions?
Try parsing 5 + 6.
Look for E Look for E Look for E You get into a loop!
Solution? Try parseT first before parseE+T. In other words, re-write the grammar.
67
LL parsing
Consider the grammar, G2: G1: SàE
EàT | T+E Tài |i*T
SàE EàE+T | T TàT*i |i
G2 uses a right recursive definition (not always possible, right?). Now we can write something like:
(defun parseE (S) (if (parseT)
(parseT) (parseT+E)))
Anymoreproblem? YES
68
67 68
LL parsing
Consider the grammar, G2: G1: SàE
EàT | T+E Tài |i*T
Try parsing 5 + 6 again:
SàE EàE+T | T TàT*i |i
(defun parseE (S) (if (parseT)
(parseT) (parseT+E)))
Answer is: 5 But the correct + Why? answer is: 5 6
69
LL parsing
Consider the grammar, G2: G1: SàE
EàT | T+E Tài |i*T
SàE EàE+T | T TàT*i |i
Fail to get EàT+E and Tài*T
So, order of productions is important. Use the
longer production first, i.e.:
(defun parseE (S) (if (parseT+E) (parseT+E) (parseT)))
70
69 70
LL parsing
Consider the grammar, G3: G2: SàE.
EàT+E | T Tài*T |i
SàE. EàT | T+E Tài |i*T
(defun parseE (S) (if (parseT+E) (parseT+E) (parseT)))
Parsed ok but very inefficient. Keep discovering the same bit of tree, especially T’s (see it?)
Need factoring i.e. get T first then check for + and E
71
LL parsing
Consider the grammar, G4: G3: SàE.
EàT[+E|null] T à i[*T|null]
SàE. EàT+E | T Tài*T |i
G4 involves a little bit of bottom-ups in that it addresses the data to decide (in this case on the basis of a one symbol lookahead) whether to try for longer form of E or T.
What kind of a parser is implemented using G4? LL(1)
72
71 72
12

10/6/21
Implementing an LL parser
An important lesson – we cannot simply take a grammar and implement a parser for it! What steps do we need to take?
We need to, at least, order productions, remove left recursion and factoring fully in order to produce an efficient parser.
Making sure that one’s grammar satisfies the above requirements is straightforward if we consider only each production. This is obviously not the case.
73
Factoring
Factoring is the exception as it is always done within a single production.
However, one could factor for several levels:
AàBCD | BCE | XY | XZ | X AàBC [D | E] | X [Y | Z | null]
74
73 74
Ordering Productions NàSTR1 | STR2. Which one if both same length?
But what about?
STR1à* STR STR3 STR2à* STR wherebyà* means zero or more productions
Should try STR1 alternative of N first. Ex:
EàI | BE BEàI= I
BE ought to be tried first
BE means Boolean Expression
75
Removal of Left Recursion
EàE + T| T Directly left recursive and
AàBx | U BàAy | T
General left recursion
As we know, we can always replace a recursive definition with an iterative one. Hence “E+T” is just like saying that we have zero or more “+ T’s”. Use this notation:
EàT [ + T]* An iterative definition
76
75 76
What about indirect recursion?
First step: delete all direct left recursion.
Second step: A simple check for indirect left recursion is to produce all nonterminals that can be leftmost node of tree from any N and see if it contains N
NàN1 STR1 N1à* N STR2
77
To produce all nonterminals that can be leftmost node of tree from any production, use two functions:
First(N) – the list of nonterminals which begin a right side of N
First+(N) – compute First for each nonterminal produced above and repeat with every new nonterminal added to the list
If any nonterminal is indirectly left-recursive, it will appear in its own First+ list.
78
77 78
13

10/6/21
Sounded complicated? Here is an example: EàE1 | T E1àE + T Tài[*T|null]
First(E) = (E1 T) First+(E) = (E1 T E)
First(E1) = (E) First+(E1)= (EE1T)
Now take First(E1) and First(T) to produce First+(E)
Now you have indirect recursion!
If you take first(E1) you will find indirect recursion as well.
79
Resolving indirect recursion
EàE1 | T E1àE + T
First+(E) = (E1 T E) First+(E1) = (E E1 T)
To resolve, choose an indirectly left recursive production, say E1, and replace its left side (E1) by its right side (E + T) throughout. EàE+T | T
Then remove any direct left recursions. Repeat the process. The process is bound to terminate since removing direct left recursions does not introduce nonterminals and amalgamation removes them.
80
79 80
Implementing an LL parser
We need to order productions, remove left recursion and factoring fully in order to produce an efficient parser.
Such a parser assumes that any recognition of a goal is the correct one and if it fails, the sentence is illegal. Such a parser is referred to as fast-back as opposed to slow-back
Interestingly, this is NOT enough to prevent backtracking!
81
So far, I have introduced something quietly! The null symbol!
In grammar, G4: SàE
E à T[+E|null] T à i[*T|null]
Replacing left recursion using an iterative definition:
EàT [+ T ]* Tài [ * i]*
Having a null option could create a “context clash”!
82
81 82
An Example
Original grammar: Bàbegin D ; S end Dàd | D; d
Sàs | S ; s
Converted grammar: Bàbegin D ; S end Dàd { ; d }*
Sàs { ; s }*
Consider parsing: begin d; d; s; s end
Get dàlook for {; d} and found it then look for {; d} but couldn’t and yet there is no error. So, need to backtrack!
; is used to continue the iteration or to stop, thus creating a context clash
83
An Example
Converted grammar: Bàbegin D ; S end Dàd { ; d }*
Sàs { ; s }*
The problem:
; is used to continue the iteration or to stop, thus creating a context clash
Converted to 1-track grammar: Bàbegin D S end
Dàd ; { d ; }*
Sàs { ; s }*
84
83 84
14

10/6/21
Acknowledgement
Understanding and writing compilers: A do-it-yourself guide by
Available free on the internet
See chapter 16 (and esp. 16.4) for context clash.
85
85
15

Leave a Reply

Your email address will not be published. Required fields are marked *

cscodehelp™ 博士 课程作业面试辅导 CS 计算机科学 | EE 电气工程 | STATICS 统计 | FINANCE 金融 | 程序代做 | 工作代做 | 面试代面 | CS代做
Amphibious Theme by TemplatePocket Powered by WordPress