# CS代考计算机代写 algorithm 6. Context-free grammars

6. Context-free grammars

A canonical first example of a context-free grammar (CFG) is shown in (1). This grammar generates the stringset {anbn | n ≥ 1}, i.e. the stringset {ab, aabb, aaabbb, aaaabbbb, . . . }.

(1) S → aSb S → ab

The fact that the grammar in (1) generates the string ‘aaaabbbb’ is due to the fact that we can start from ‘S’ and gradually rewrite nonterminals in accord with the rules, one step at a time, as follows:

(2) S aSb

aaSbb aaaSbbb aaaabbbb

A sequence of strings related to each other like (2) is known as a derivation.

1

(3)

Formal definition of context-free grammars

A context-free grammar (CFG) is a four-tuple (N, Σ, I, R) where:

• N is a finite set of nonterminal symbols;

• Σ, the alphabet, is a finite set of terminal symbols (disjoint from N);

• I ⊆ N is the set of initial nonterminal symbols; and • R⊆(N×(N∪Σ)∗)isafinitesetofrules;and

So strictly speaking, (1) is an informal representation of the following mathematical object: (4) G = {S}, {a,b}, {S}, {(S,aSb),(S,ab)}

We often write u ⇒G v to mean that from the string u we can derive the string v by rewriting one symbol in accord with one of the rules of the grammar G; and write u ⇒G ∗ v to mean that from u we can derive v in one or more such steps.

So what we did above in (2) can be described in symbols like this:

(5) S ⇒G aSb ⇒G aaSbb ⇒G aaaSbbb ⇒G aaaabbbb

(6) S ⇒G ∗ aaaabbbb

A CFG G = (N,Σ,I,R) generates a string w ∈ Σ∗ iff there’s some way to derive w from some initial

nonterminal symbol:

(7) w∈L(G)⇐⇒ I(n)∧n⇒G∗w n∈N

Ling185A, Winter 2021 — Tim Hunter, UCLA

Two handy conventions:

2

• •

Wegenerallyuseuppercaselettersfornonterminalsymbolsandlowercaselettersforterminalsymbols, in which case we can recover the sets N and Σ if we’re only given a collection of rules.

Unless otherwise specified, we’ll assume that there is exactly one initial nonterminal symbol, and that it is the symbol on the left hand side of first rule listed. This way the grammar can be entirely identified by listing its rules.

Equivalent derivations, leftmost derivations, and rightmost derivations

Now let’s consider a more interesting grammar:

(8) NP→NPandNP NP → NP or NP

NP → apples NP → bananas NP → oranges

Notice that the string ‘apples and oranges’ has two distinct derivations in this grammar:

(9) a. NP

NP and NP

apples and NP apples and oranges

b. NP

NP and NP

NP and oranges apples and oranges

And the string ‘apples and oranges or bananas’ has many distinct derivations, but here are two of them:

(10) a.

NP

NP and NP

b. NP

NP or NP

NP and NP or NP

apples and NP or NP

apples and oranges or NP apples and oranges or bananas

apples and apples and apples and apples and

NP

NP or NP

oranges or NP oranges or bananas

But in an important

thing”, whereas the two in (10) seem “actually different”. The underlying distinction is that the two derivations in (9) are both plugging together the following three facts:

(11) NP ⇒∗ apples and oranges NP ⇒∗ apples

NP ⇒∗ oranges

but the two derivations in (10) plug together different collections of facts of this sort:

way, the two derivations in (9) seem to be just “different ways of doing the same

(12) a. NP ⇒∗ apples and oranges or bananas NP ⇒∗ apples

NP ⇒∗ oranges or bananas NP ⇒∗ oranges

NP ⇒∗ bananas

b. NP ⇒∗ apples and oranges or bananas NP ⇒∗ apples and oranges

NP ⇒∗ apples

NP ⇒∗ oranges

NP ⇒∗ bananas

Ling185A, Winter 2021 — Tim Hunter, UCLA

In the early generative linguistics literature, these individual constituency-facts like ‘NP ⇒∗ oranges or bananas’ were known as “is-a relationships” (e.g. “ ‘oranges or bananas’ is a NP”).1

Let’s call such a collection of is-a relationships an analysis of a string. So:

• (9a) and (9b) both correspond to the same analysis of the string ‘apples and oranges’, shown in (11),

but

• (10a) and (10b) correspond to two distinct analyses of the string ‘apples and oranges or bananas’, shown in (12a) and (12b), respectively.

An analysis can be represented graphically by a tree.

NP

NP and NP

NP

NP and NP

NP

NP or NP

(13) a.

A leftmost derivation is a derivation where every step rewrites the leftmost nonterminal symbol in the preceding string.

There is exactly one leftmost derivation per analysis.

(9) / (11)

apples

NP or oranges

NP bananas

NP apples

and NP oranges

bananas

apples

oranges

(10a) / (12a)

If two derivations correspond to the same analysis, i.e. they both express the same set of is-a relations,

we’ll call them equivalent derivations.

So the relationship between derivations and analyses is many-to-one; usually we only care about distinct analyses of a string, and the differences between equivalent derivations (such as (9a) and (9b)) concern only the order in which the rewrites take place. We can get rid of these “spurious choices” by adopting some fixed strategy that dictates, given any intermediate point in a derivation (such as ‘NP and NP’), which nonterminal symbol must be rewritten next.

b. A rightmost derivation is a derivation where every step rewrites the rightmost nonterminal symbol in the preceding string.

There is exactly one rightmost derivation per analysis.

The derivation in (9a) is a leftmost derivation; the derivation in (9b) is a rightmost derivation. The two derivations in (10) are both leftmost derivations.

So by considering only leftmost derivations or considering only rightmost derivations, we can leave aside the “irrelevant” differences; if we find two leftmost derivations or two rightmost derivations for a string, then it has two distinct tree structures, i.e. can be analyzed by two distinct sets of is-a relationships.

3 Finding analyses

We’ll assume from here on that all CFGs are in Chomsky Normal Form (CNF), which means that each rule’s right-hand side consists of either a single terminal symbol or exactly two nonterminal symbols. So in effect we’ll take R to be of the form R ⊆ N ×((N ×N)∪Σ). Assuming this form is computationally convenient and doesn’t really restrict what we can do.2

1See e.g. chapter 4 of Chomsky’s Syntactic Structures (1957), or chapter 1 of Lasnik’s Syntactic Structures Revisited (2000).

2Any CFG can be converted into a CFG in CNF that generates the same stringset — modulo the empty string, i.e. for any CFG G we can construct another CFG G′ such that G′ is in CNF and L(G′) = L(G) − {ε}.

(10b) / (12b)

Ling185A, Winter 2021 — Tim Hunter, UCLA

Here’s an example grammar in CNF:

(14) VP → V NP V VP →VPPP VP NP →NPPP VP PP → P NP NP

→ watches → watches → spies

→ watches

N = {VP,NP,PP,V,P}

Σ = {watches, spies, telescopes, with} I={VP}

NP → spies

NP → telescopes P →with

The important idea about how a CFG generates a string can be stated in the same form as what we saw for FSAs:

(15) w ∈ L(G) ⇐⇒ string w can generated via tree t all possible trees t

For an FSA, the relevant notion of a path is a linear path through the states; here, think of a tree as a branching path through the nonterminals.

One way in which CFGs are more complicated than FSAs is that, even given a particular string we’re interested in, we don’t know what the shape(s) of the relevant path(s) might be. So spelling out what it means to consider “all possible trees t” is a bit trickier than spelling out “all possible paths p” for an FSA.

But if we leave aside the “middle” of the tree for the moment, and just focus on the top and the bottom, we can at least write the following:

(16) x1x2…xn ∈L(G) ⇐⇒ ··· ··· I(r)∧···∧R(c1,x1)∧R(c2,x2)∧···∧R(cn,xn) r∈N c1∈N cn∈N

For example, what we would need to evaluate to see whether the grammar in (14) will generate ‘watches with telescopes’3 would look something like this (still leaving aside the “middle” of the tree):

(17) ··· I(r)∧···∧R(c1,watches)∧R(c2,with)∧R(c3,telescopes) r∈N c1∈N c2∈N c3∈N

Interleaving the growing of structure with the checking of this structure against the grammar, analogously to what we did with forward and backward values for FSAs, will let us sidestep the issue of working out exactly what would need to go into this “global disjunction”.

4 Inside values

For any CFG G there’s a two-place predicate insideG, relating strings to nonterminals:

(18) insideG(w)(n) is true iff the string w is derivable from the nonterminal n in grammar G

Given a way to work out insideG(w)(n) for any string and any state, we can easily use this to check for membership in L(G):

(19) w ∈ L(G) ⇐⇒ I(n) ∧ insideG(w)(n) n∈N

We can arrange the values of the predicate insideG in a table or chart. The chart in (20) shows the values of insideG for all non-empty substrings, or infixes, of ‘watches spies with telescopes’, where G is the grammar in (14). Each cell in the chart corresponds to one of these infixes.4

3Notice that this is a length-three string belonging to Σ∗, where Σ is the set of terminal symbols in (14).

4So a cell in this chart, since it corresponds to an infix (and specifies a value for each nonterminal), is analogous to a column in the tables we saw earlier for forward and backward values.

Ling185A, Winter 2021 — Tim Hunter, UCLA

(20)

VP: 1 NP: 1 PP: 0

V: 1 P: 0

VP: 1 NP: 0 PP: 0

V: 0 P: 0

VP: 0 NP: 0 PP: 0

V: 0 P: 0

VP: 1 NP: 0 PP: 0

V: 0 P: 0

VP: 1 NP: 1 PP: 0

V: 0 P: 0

VP: 0 NP: 0 PP: 0

V: 0 P: 0

VP: 1 NP: 1 PP: 0

V: 0 P: 0

VP: 0 NP: 0 PP: 0

V: 0 P: 1

VP: 0 NP: 0 PP: 1

V: 0 P: 0

VP: 0 NP: 1 PP: 0

V: 0 P: 0

watches . . .

spies . . .

with …

telescopes . . .

We can fill this table in gradually by “working from small to big”, similarly to what we saw for forward and backward values for FSAs, but with some important differences.

• Here it’s the cells on the diagonal, corresponding to infixes of length one, that can be filled in just by lookups into the grammar.5

• We then work upwards and to the right to fill in values for longer infixes, which can be calculated by looking “just one step back” at values for shorter infixes.

• But exactly what it means to look “one step back” is a bit more complicated because of the fact that we’re dealing with all infixes, not just prefixes and suffixes.

Specifically, every cell in the table can be filled in according to the following recursive definition (which is not as complicated as it looks):

(21) insideG(x)(n) = R(x, n)

insideG(x1 . . . xm)(n) = R(n, l, r) ∧ insideG(x1 . . . xi)(l) ∧ insideG(xi+1 . . . xm)(r)

1≤i