CmSc 365 Theory of Computation


Parse Trees

Exam-like questions and problems
Learning goals


  1. Derivations and similarity

  2. Let G be a CFG. A string w є L(G) may have many derivations, corresponding to how we choose the rules to apply and how we choose which symbol to expand.

    • At a given step we may be able to expand two or more non-terminal symbols. The order in which we expand the non-terminal symbols will determine the derivation, i.e. different orders will result in different derivations, even if we choose same rules.
    • At a given step for a given non-terminal symbol there may be several rules to choose. Choosing different rules will result in different derivations.

    Some derivations are inherently different, others are not. To distinguish between these two cases, we will define a precedence relation on the set of the derivations of a given string, and similarity between derivations.

    Definition: Let D and D’ be two derivations of some string in L(G)

    D = x1 => x2 => … => xn
    D’ = x’1 => x’2 => … => x’n

    where xi, x’i є V* for i = 1, 2, …, n
    x1, x’1 є V - ∑, xn, x’n є ∑*

    We say that D precedes D’, written as D < D’, if N > 2 and there is an integer 1 < k < n, such that:

    a) xi = x’i for i ¹ k
    b) xk-1 = x’k-1 = uAvBw, where u, v, w є V*, A, B є V - ∑
    c) xk = uyvBw, where A → y є R
    d) x’k = uAvzw, where B → z є R
    e) xk+1 = x’k+1 = uyvzw

    The difference is in the order of expanding non-terminal symbols.

    Definition: two derivations are similar if one of them precedes the other.

    The similarity relation is reflexive, symmetric and transitive, therefore it defines classes of equivalence.

    Each derivation can be depicted using a derivation tree , also called a parse tree.

    Example:

    Consider the string (( )( )) and the grammar with rules given below:
    Rule1: S → SS
    Rule2: S → (S)
    Rule3: S → ( )

    One possible derivation is:

       S => (S)       by Rule1
         => (SS)      by Rule2
         => (( )S)    by Rule3
         => (( )( ))  by Rule3

    The process of derivation is depicted in the following parse tree

    Each non-terminal symbol is expanded by applying a grammar rule that contains the symbol in its left- hand side. Its children are the symbols in the right-hand side of the rule.

    Note: The order of applying the rules depends on the symbols to be expanded. At each tree level we may apply several different rules corresponding to the nodes to be expanded.

    Derivations within the same class of similarity yield the same parse tree.

    Within a given class, we have a derivation that is not preceded by any other derivation. This is the left-most derivation.
    There is a derivation that precedes all other derivations in a given class. This is the right-most derivation.
    Thus we can use either left-most or right-most derivation when building the parse tree.

  3. Parse Trees

  4. A parse tree for a string in L(G) is a tree where

    • the root is the start symbol for G
    • the interior nodes are the nonterminals of G
    • the leaf nodes are the terminal symbols of G.
    • the children of a node T (from left to right) correspond to the symbols on the right hand side of some production for T in G.

    Every terminal string generated by a grammar has a corresponding parse tree; every valid parse tree represents a string generated by the grammar (called the yield of the parse tree).

    Example: Given the grammar G = (V, ∑, R, E), where

    V = { E, D, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, +, -, *, /, (, ) },
    ∑ = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, +, -, *, /, (, ) },
    and R contains the following rules:

    1.	E → D 
    2.	E → ( E ) 
    3.	E → E + E 
    4.	E → E - E 
    5.	E → E * E 
    6.	E → E / E 
    7.	D → 0 | 1 | 2 | ... 9 
    find a parse tree for the string 1 + 2 * 3:

    The parse tree is:

    
                  E
                / | \
               /  |  \
              /   |   \
             /    |    \
            /     |     \
           E      |      \
          /|\     |       \
         / | \    |        \
        E  |  E   |        E
        |  |  |   |	       |	
        D  |  D   |        D
        |  |  |   |        |
        1  +  2   *        3

    (from http://www.cs.rochester.edu/u/nelson/courses/csc_173/grammars/parsetrees.html)

  5. Formal definition of a parse tree:

    1. The parse tree for a terminal symbol a is just one node - it is both the root and the only leaf.
    2. If there is a rule A → a, its parse tree has a root A and a leaf a.
    3. If T1, … Tn are parse trees with roots A1,…, An, and there is a rule A → A1,..An,
      then the tree with a root A and subtrees T1, … Tn rooted at the children of A:
      A1, A2,….,An, is a parse tree.
    4. Nothing else is a parse tree

  6. Ambiguity

  7. Definition: A grammar G is ambiguous if there is a string є L(G) with two different left-most (right-most) derivations.

    Thus, if the grammar yields more than one parse tree for a given string, the grammar is called ambiguous, and the corresponding language - ambiguous language.

    Example:

    Let S denote a statement, Exp - an expression.
    Consider the following rules for if statements (the words if, then, else are terminal symbols):

    Rule1: If_statement ® if Exp then S else S

    Rule2: If_statement ® if Exp then S

    Consider now the statement

    if a < b then if c < y then write(yes) else write(no);

    Following the grammar rules above, there are two possible interpretations:

    If a < b then

    if c < y then write(yes)
     else write(no);

    If a < b then if c < y then write(yes)

     else write(no);

    Here is the parse tree for the first interpretation:

     

    Here is the parse tree for the second interpretation:

    The problem can be solved semantically by having a rule that states the ELSE belongs to the last preceding IF for which there is not already an associated ELSE. We introduce a new non-terminal symbol S_withElse to stand for the “then” part of an “if” statement that has an “else” part:

    Rule1: IF_statement ® if Exp then S_withElse else S
    Rule2: IF_statement ® if Exp then S
    Rule3: S_withElse ® if Exp then S_withElse else S_withElse
    Rule4: S_withElse ® S

    The resulting grammar is unambiguous.


Learning goals / Exam-like questions and problems

  1. Know the formal definition of a parse tree.
  2. Given a CFG, and a string in the language defined by the grammar,
    be able to build the parse tree of the string. See
    problems

Back to Contents page

Created by Lydia Sinapova