Macquarie University
Department of Computing

COMP332 Programming Languages 2012
Assignment Two (Sample Solution)

This assignment concerned the design, development, testing, and documentation of a semantic analyser for a simple MiniJava version of the Java programming language. We had to implement checks for a variety of semantic conditions, mostly involving naming and typing. Also, we had to design test cases to demonstrate that our implementation is working correctly.

This report presents the design and implementation of my semantic analyser and describes the design of the tests I have used to check the correctness of the implementation. Parts One and Two are discussed separately. The full code for this sample solution is on the COMP332 iLearn site.

Part One: The basic MiniJava semantic rules

This part of the assignment required us to implement rules 1, 2, 4-15 and 17-19 as specified in the handout. In the following I consider each rule in turn, discussing how each of them was implemented. Testing is considered in the final sub-section.

Rule 1: There must be at most one defining occurrence of any name in a scope, not counting definitions in enclosing scopes.

Rule 2: Each applied identifier occurrence must refer to exactly one defining occurrence. That defining occurrence is located by first looking in the innermost scope in which the applied occurrence occurs. If the defining occurrence is not located in the innermost scope, the search proceeds recursively to the smallest enclosing scope.

Rules 1 and 2 were already implemented in the skeleton since that code built an environment for each node of the tree, containing just the named entities that are visible at that node. The check method already contained checks for undefined and multiply-defined entities.
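
The lookup order described in rule 2 can be illustrated independently of the skeleton. The following is only a minimal sketch, assuming that an environment is modelled as a list of scopes with the innermost scope first. The names Scope, lookup and isMultiplyDefined are hypothetical and are not the skeleton's environment interface.

```scala
object ScopeSketch {

    // A scope maps each declared name to a description of its entity.
    type Scope = Map[String, String]

    // Rule 2: search the innermost scope first, then each enclosing scope.
    // env is assumed to list the scopes from innermost to outermost.
    def lookup (env : List[Scope], name : String) : Option[String] =
        env match {
            case scope :: outer =>
                scope.get (name).orElse (lookup (outer, name))
            case Nil =>
                None
        }

    // Rule 1: a name is multiply defined if a single scope declares it
    // more than once. declared is the list of names declared in one scope.
    def isMultiplyDefined (declared : List[String], name : String) : Boolean =
        declared.count (_ == name) > 1

}
```

In this model a declaration in an inner scope hides one of the same name in an enclosing scope, which is why rule 1 only counts definitions within a single scope.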

Rule 4: The type of an integer expression is integer.

This rule required examination of the tipe attribute, which returns the type of an expression node. The skeleton implementation of the attribute already contained a case for this rule.

For this rule and others concerning the type of expressions and their expected type I also needed to augment the check method with a case to check that an expression's type is compatible with its expected type.

case e : Expression =>
    if (!iscompatible (e->tipe, e->exptipe))
        message (e, "type error: expected " + (e->exptipe) + " got " + (e->tipe))

I followed the same scheme as in the practical exercises by defining an iscompatible method which compares the type and expected type for equality, but allows an unknown type in either place so that errors can be suppressed when we don't care about the type or when there is another error that prevents a type from being calculated.

def iscompatible (t1 : Type, t2 : Type) : Boolean =
    (t1 == UnknownType ()) || (t2 == UnknownType ()) || (t1 == t2)
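
To see how the unknown type suppresses errors, the method can be tried out on its own. This is a sketch only: the three case classes below are simplified stand-ins for the MiniJava type nodes, defined here just so the example is self-contained.

```scala
object CompatSketch {

    // Simplified stand-ins for the type nodes of the MiniJava tree.
    sealed trait Type
    case class IntType () extends Type
    case class BooleanType () extends Type
    case class UnknownType () extends Type

    // Same definition as above: unknown on either side is always accepted.
    def iscompatible (t1 : Type, t2 : Type) : Boolean =
        (t1 == UnknownType ()) || (t2 == UnknownType ()) || (t1 == t2)

}
```

With these definitions, an integer is compatible with an integer and with the unknown type in either position, but not with a Boolean. Only the last combination would lead to a type error message in check.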

Rule 5: The type of a true or false expression is Boolean.

To implement this rule I added cases for the true and false expressions to the tipe attribute as follows.

case _ : TrueExp | _ : FalseExp =>
    BooleanType ()

Rule 6: The type of an identifier expression referring to a variable, argument or class is the declared type of the variable or argument (in the first two cases) or a reference type that refers to instances of that class (in the last case). Method names cannot be used by themselves in expressions; they can only be used in call expressions.

This rule required a case for tipe at IdnExp nodes. The case looks at the entity associated with the identifier occurrence to determine what kind of entity it is, returning the appropriate type as specified in the rule. An important aspect was having a default case to specify an unknown type in the case when there is no sensible entity associated with the identifier occurrence. Otherwise, we will get spurious type errors when identifiers are not declared.

case IdnExp (i) =>
    (i->entity) match {
        case ClassEntity (decl) =>
            ReferenceType (decl)
        case ArgumentEntity (decl) =>
            actualTypeOf (decl.tipe)
        case VariableEntity (decl) =>
            actualTypeOf (decl.tipe)
        case _ =>
            UnknownType ()
    }

Note that in the argument and variable entity cases we don't just use the type as found in the declaration. This is because if the type is a class type then that type will just refer to the name of the class. To be useful for type-checking, we really need to use a reference type that refers to the declaration of the named class. Thus, we use a helper method actualTypeOf that converts such types by looking up the entity of a named class.

def actualTypeOf (t : Type) : Type =
    t match {
        case ClassType (idn) =>
            (idn->entity) match {
                case ClassEntity (decl) =>
                    ReferenceType (decl)
                case _ =>
                    UnknownType ()
            }
        case _ =>
            t
    }

In the non-class type cases, the type in the declaration is the same as the internal type for type checking, so we just leave it alone.

actualTypeOf will be used by later cases for the same reason.

Rule 7: The condition in an if-statement or while-statement must be of type Boolean.

This rule required updating the exptipe attribute to have a case for when the parent of an expression is an if or while statement node.

case _ : If | _ : While =>
    BooleanType ()

Rule 8: The expression in a println-statement can be of any type.

This rule was handled by a catch-all case for exptipe that returned the unknown type for expressions that are not covered by other cases.

case _ =>
    UnknownType ()

Rule 9: In a variable assignment statement, the name on the left-hand side must refer to a variable or argument. The type of the expression on the right-hand side must be the same as that of the variable or argument.

This rule required exptipe to have a case for variable assignment nodes. There are two sub-cases: one for normal variables and one for method arguments. In each case the expected type can be extracted from the respective entity. As for rule 6, we have a default case that returns an unknown type when we don't know the entity on the left-hand side of an assignment.

case VarAssign (lhs, _) =>
    (lhs->entity) match {
        case VariableEntity (Var (t, _)) =>
            actualTypeOf (t)
        case ArgumentEntity (Argument (t, _)) =>
            actualTypeOf (t)
        case _ =>
            UnknownType ()
    }

Rule 10: In an array assignment statement or index expression, the base sub-expression must be of integer array type and the index sub-expression must be of integer type. In the assignment case, the type of the right-hand side sub-expression must be integer. In the index expression case the type of the sub-expression is integer.

This rule is handled similarly to the other cases for the expected type. The difference is that an array assignment contains three expression children. We need to have one case for each of these expressions so the cases require guards to discriminate between them.

case ArrAssign (base, _, _) if base eq e =>
    IntArrayType ()

case ArrAssign (_, index, _) if index eq e =>
    IntType ()

case ArrAssign (_, _, elem) if elem eq e =>
    IntType ()
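
The guards use eq (reference identity) rather than == (structural equality). The following sketch suggests why that matters, using simplified, hypothetical node classes rather than the real tree classes: two children that are structurally equal can only be told apart by identity.

```scala
object GuardSketch {

    // Simplified, hypothetical tree nodes.
    sealed trait Expression
    case class IntExp (value : Int) extends Expression
    case class IdnExp (name : String) extends Expression
    case class ArrAssign (base : Expression, index : Expression,
                          elem : Expression)

    // Which child position does e occupy? Decided by reference identity.
    def roleByIdentity (parent : ArrAssign, e : Expression) : String =
        parent match {
            case ArrAssign (base, _, _) if base eq e => "base"
            case ArrAssign (_, index, _) if index eq e => "index"
            case ArrAssign (_, _, elem) if elem eq e => "elem"
            case _ => "not a child"
        }

    // The same question decided by structural equality, for comparison.
    def roleByEquality (parent : ArrAssign, e : Expression) : String =
        parent match {
            case ArrAssign (base, _, _) if base == e => "base"
            case ArrAssign (_, index, _) if index == e => "index"
            case ArrAssign (_, _, elem) if elem == e => "elem"
            case _ => "not a child"
        }

}
```

For an assignment such as a[0] = 0, the index and the right-hand side are distinct nodes that are equal as values. roleByIdentity reports the right-hand side as "elem", whereas roleByEquality reports it as "index" because the earlier case matches first.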

Rule 11: In plus, minus and star expressions, the types of both sub-expressions must be integer. The type of the expression is integer.

Rule 12: In a logical AND expression, the types of both sub-expressions must be Boolean. The type of the expression is Boolean.

Rule 13: In a logical NOT expression, the type of the sub-expression must be Boolean. The type of the expression is Boolean.

Rule 14: In a less-than expression, the types of both sub-expressions must be integer. The type of the expression is Boolean.

Rule 15: In a length expression, the type of the base sub-expression must be integer array. The type of the expression is integer.

Rule 11 was already implemented in the skeleton. The other rules are handled similarly, by these cases in the tipe attribute definition:

// Rule 11
case _ : PlusExp | _ : MinusExp | _ : StarExp =>
    IntType ()

// Rule 12
case _ : AndExp =>
    BooleanType ()

// Rule 13
case _ : NotExp =>
    BooleanType ()

// Rule 14
case _ : LessExp =>
    BooleanType ()

// Rule 15
case _ : LengthExp =>
    IntType ()

and by these cases in the exptipe attribute definition:

// Rule 11
case _ : PlusExp | _ : MinusExp | _ : StarExp =>
    IntType ()

// Rule 12
case _ : AndExp =>
    BooleanType ()

// Rule 13
case _ : NotExp =>
    BooleanType ()

// Rule 14
case _ : LessExp =>
    IntType ()

// Rule 15
case _ : LengthExp =>
    IntArrayType ()

Rule 17: The type of a this expression is a reference to an instance of the class in which the this expression occurs.

This rule required a bit more work. The reason is that the type that we need is not constant, nor is it directly obtainable from the ThisExp node. What we need to do is to look upwards in the tree to find the nearest enclosing Class node. The type of this will be a reference to that class type. I implemented the search in a new thistype attribute and used it in the ThisExp case of the tipe attribute.

case e : ThisExp =>
    e->thistype

The thistype attribute has three cases. If we are at the root then we have not found a class, so we return the unknown type. If we are at a class node then we return a reference type that refers to that class. Otherwise, we just search up the tree by asking our parent what its thistype is.

lazy val thistype : SourceNode => Type =
    attr {
        case n if n.isRoot =>
            UnknownType ()

        case decl : Class =>
            ReferenceType (decl)

        case n =>
            (n.parent[SourceNode])->thistype

    }
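
The upward search can also be written without Kiama. The following is a rough sketch under stated assumptions: nodes carry an explicit optional parent link, and the names Node, Program, ClassNode and Other are hypothetical. Unlike the attribute above, it is a plain recursive function with no caching, and it tests for a class node before testing for the root.

```scala
object ThisTypeSketch {

    sealed trait Type
    case class ReferenceType (className : String) extends Type
    case class UnknownType () extends Type

    // Hypothetical tree nodes with an explicit link to their parent.
    sealed trait Node { def parent : Option[Node] }
    case class Program () extends Node { val parent : Option[Node] = None }
    case class ClassNode (name : String, parent : Option[Node]) extends Node
    case class Other (label : String, parent : Option[Node]) extends Node

    // Walk up the parent links until a class node is found.
    def thistype (n : Node) : Type =
        n match {
            case ClassNode (name, _) =>
                ReferenceType (name)
            case _ =>
                n.parent match {
                    case Some (p) => thistype (p)
                    case None => UnknownType ()   // reached the root
                }
        }

}
```

A node nested anywhere inside a class yields a reference to that class, while a node with no enclosing class yields the unknown type, so no spurious type error is reported for it.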

Rule 18: In a new array expression the type of the sub-expression must be integer. The type of the new array expression is integer array.

This rule is again straightforward. tipe just returns an integer array type in the new array expression case.

case _ : NewArrExp =>
    IntArrayType ()

The exptipe attribute needs a case to restrict the expression in a new array creation expression to be of integer type.

case _ : NewArrExp =>
    IntType ()

Rule 19: In a new instance expression the name used must refer to a normal class. The type of the expression is a reference to an instance of that class.

As for some of the earlier cases, we just get the entity referred to by a name in a new expression. In check we have a case to complain if the entity that is used in such an expression is not referring to a class entity. Note that this test automatically rules out the main class since a different kind of entity is used for that one. As usual, we don't raise an error if the entity is unknown.

case e : Expression =>
    ...
    e match {
        ...
        case NewExp (u) =>
            (u->entity) match {
                // Rule 19
                case _ : ClassEntity =>
                    // Do nothing
                case ent =>
                    if (ent != UnknownEntity)
                        message (u, "illegal instance creation of non-class type")
            }
        ...
    }

The NewExp case of tipe just looks up the entity and makes a reference type if a class is being referred to. Otherwise, we return the unknown type.

case NewExp (i) =>
    (i->entity) match {
        case ClassEntity (decl) =>
            ReferenceType (decl)
        case _ =>
            UnknownType ()
    }

Part One: Testing

As in Assignment One, there are many, many tests that we could write for our semantic analysis. It is difficult to know when we have been complete enough. What you will find in the sample solution is a pretty complete attempt, but there are still more tests that we could imagine having. For marking, we wanted to see that you had put some effort into considering completeness, not necessarily going as far as could be done.

Testing semantic analysis checking is really a matter of making sure the tests cover positive cases (i.e., code that passes the checks) as well as negative cases (i.e., code that violates the checks). For some semantic rules, there is only a positive case since there is no way for an input program to violate the rule (e.g., rule 4).

With these basic principles in place, in each case it is best to construct the tests so that each one tests a single semantic rule in either positive mode, or where applicable, negative mode. This approach was followed in the skeleton which provided tests for rules 1, 2, 4 and 11.

I reused the skeleton test framework to embed expressions into a dummy class so that they could be parsed and analysed. I extended this idea to allow variable declarations and statements to be optionally embedded into the class as well. This capability was needed to test some of the more complex rules that involve statements and classes. This extension resulted in an embedExpressionAndCheck method which has the following signature.

def embedExpressionAndCheck (exp : Expression,
                             retType : Type = IntType (),
                             vars : List[Var] = Nil,
                             stmts : List[Statement] = Nil)

For example, the first test below declares an integer variable v and uses it in an assignment statement to make sure that an integer expression can be assigned to it. This is a positive test since we expect no error. The second test below is similar, but tests the negative condition since the variable is of integer type but the expression is Boolean. Most of the tests follow this pattern.

test ("an integer expression is assignment compatible with an integer var") {
    val exp = IntExp (0) // dummy
    val exp1 = IntExp (42)
    val vars = List (Var (IntType (), IdnDef ("v")))
    val stmts = List (VarAssign (IdnUse ("v"), exp1))
    embedExpressionAndCheck (exp, IntType (), vars, stmts)
    assert (messagecount === 0)
}

test ("a Boolean expression is not assignment compatible with an integer var") {
    val exp = IntExp (0) // dummy
    val exp1 = TrueExp ()
    val vars = List (Var (IntType (), IdnDef ("v")))
    val stmts = List (VarAssign (IdnUse ("v"), exp1))
    embedExpressionAndCheck (exp, IntType (), vars, stmts)
    assert (messagecount === 1)
    assertMessage (0, 0, 0, "type error: expected int got boolean")
}

In a few places tests are hard-coded as a program where that is possible. For example, to check that the type of this is the current class type, we just have a simple program with a method that returns that type. This test uses the skeleton's parseTest method.

test ("the type of this is the current class") {
    parseTest ("""
        |class Dummy { public static void main () { System.out.println (0); } }
        |class Test {
        |    public Test m () {
        |        return this;
        |    }
        |}
        """.stripMargin)
    assert (messagecount === 0)
}

Part Two: The more complex semantic rules

Part Two involved implementing the remaining semantic checks, namely those for rules 3, 16 and 20. As before, I discuss how each of these was achieved, before considering testing.

Rule 20: The type of the return expression in a method must be the same as the declared return type of the method.

Of the rules in Part Two, rule 20 is the easiest since it follows the same pattern as those in Part One. A return expression occurs directly as a child of the method body node. The body node contains the return type of the method as a field. Thus, the relevant case of exptipe just has to extract it and return it (after making sure it is an actual type as we have done before).

case MethodBody (t, _, _, _, _) =>
    actualTypeOf (t)

Rule 16: In a method call expression the named entity that is being called must be a method of the class whose instance is referenced by the base expression (or in a superclass if that class has no method with the given name). The number of arguments supplied in the call and their types must be the same as the number of arguments and argument types required by the called method. The type of the expression is the return type of the method.

Rule 3: If a defining occurrence of an applied identifier is not found in the class in which that applied occurrence appears, and the class has a superclass, then the search moves to the superclass. If that class has no definition for the identifier and has a superclass, the search moves to that superclass. And so on until the definition is found or the superclass chain is exhausted.

The main problem with processing method calls is that the skeleton code for the entity attribute only knows how to look up names in the current environment. For a call like o.m we need to look m up in the definition of the class of o, not in the current environment. Thus, we need to add a case for identifiers that occur as the child of a call expression. We make use of the HasParent pattern to do this check, but an explicit check of the parent of the identifier use would work just as well.

lazy val entity : IdnNode => Entity =
    attr {

        case HasParent (n @ IdnUse (i), CallExp (base, _, _)) =>
            (base->tipe) match {
                case ReferenceType (decl) =>
                    lookup (decl->env, i, UnknownEntity)  /* *** */
                case t =>
                    UnknownEntity
            }
        case n =>
            lookup (n->env, n.idn, UnknownEntity)

    }

For rule 3, we must deal with superclass lookups. We extend the code we just saw for looking up methods so that, if a method name is not found, it looks in the superclass if there is one. The checking is actually done by a helper method findMethod, so we replace the line marked by /* *** */ with the following call.

findMethod (decl, i)

We first look up the name in the environment of decl. If it's there, we are done. Otherwise, we look to see if decl has a superclass clause. If so, we check to make sure that the name used in the superclass clause actually refers to a class, since it could be undefined or some other named entity, for example. If it's a class, then we recursively call findMethod to look in the superclass. If there is no superclass clause, or the thing named there is not a class, we bail out by returning the unknown entity.

def findMethod (decl : Class, i : String) : Entity =
    lookup (decl->env, i, UnknownEntity) match {

        case UnknownEntity =>
            decl.superclass match {
                case Some (superidn) =>
                    (superidn->entity) match {
                        case ClassEntity (superdecl) =>
                            findMethod (superdecl, i)
                        case _ =>
                            UnknownEntity
                    }
                case None =>
                    UnknownEntity
            }

        case entity =>
            // Found it in decl's env, so return it
            entity

    }
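
The shape of this search can be seen in isolation in the following sketch. It is not the assignment code: classes are modelled by a hypothetical ClassDecl that just records a set of method names and an optional superclass name, and a Map from class names to declarations stands in for looking up the entity of the superclass identifier.

```scala
object SuperLookupSketch {

    // Hypothetical, simplified class declaration.
    case class ClassDecl (name : String, superclass : Option[String],
                          methods : Set[String])

    // Return the class that defines method i, searching decl first and
    // then its chain of superclasses. None plays the role of the unknown
    // entity.
    def findMethod (classes : Map[String, ClassDecl], decl : ClassDecl,
                    i : String) : Option[ClassDecl] =
        if (decl.methods.contains (i))
            Some (decl)
        else
            // A missing superclass clause and a superclass name that is
            // not a known class are treated the same way.
            decl.superclass.flatMap (classes.get) match {
                case Some (superdecl) => findMethod (classes, superdecl, i)
                case None => None
            }

}
```

Like the method above, the sketch assumes that the superclass chain is acyclic. A class that (incorrectly) names itself or one of its subclasses as its superclass would make the recursion loop when the method is not found.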

In our first cut at checking for rule 16, we have a check whose purpose is to ensure that a call expression can only make a call to a method entity.

case e : Expression =>
    ...
    e match {
        ...
        case CallExp (_, u, args) =>
            (u->entity) match {
                // Rule 16
                case _ : MethodEntity =>
                    // Do nothing
                case ent =>
                    if (ent != UnknownEntity)
                        message (u, "illegal call to non-method")
            }
        ...
    }

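To support the checking of method arguments, the "Do nothing" code here needs to make an additional check. For this the pattern has to bind the method declaration, as in case MethodEntity (decl), so that decl can be used to get at the declared arguments.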

    val expargnum = decl.body.args.length
    if (expargnum != args.length)
        message (u, "wrong number of arguments, got " +
                     args.length + " but expected " +
                     expargnum)

The tipe attribute has a case that specifies the type of a call expression to be the return type of the called method (or unknown if the called identifier is not referring to a method).

case CallExp (_, i, _) =>
    (i->entity) match {
        case MethodEntity (decl) =>
            actualTypeOf (decl.body.tipe)
        case _ =>
            UnknownType ()
    }

We also need to specify the expected type of each argument in a call expression. In the exptipe attribute case for an expression whose parent is a call expression, if we are calling a method we use a helper method expTypeOfArg to work out the expected type of the argument. We somehow need to know which argument in the argument list the expression e is, so we use a Kiama node property e.index for this purpose. (An alternative method would be to write a method that searches the argument list for e and returns the position where it was found.)

case CallExp (_, u, _) =>
    (u->entity) match {
        case MethodEntity (decl) =>
            expTypeOfArg (decl, e.index)

        case _ =>
            // No idea what is being called, so no type constraint
            UnknownType ()
    }

expTypeOfArg looks up the formal argument if the actual argument is in range and returns the relevant actual type. Note that the base expression and method name are counted in the index count, so we need to subtract two from the index to get the zero-indexed argument number.

def expTypeOfArg (method : Method, index : Int) : Type = {
    val argnum = index - 2
    val numargs = method.body.args.length
    if (argnum >= 0 && argnum < numargs) {
        val arg = method.body.args (argnum)
        actualTypeOf (arg.tipe)
    } else
        UnknownType ()
}
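
The alternative mentioned earlier, searching the argument list for e, might look something like the following sketch. The expression classes are simplified stand-ins and argPosition is a hypothetical name. The search uses reference identity so that two structurally equal arguments are not confused.

```scala
object ArgPositionSketch {

    // Simplified stand-ins for expression nodes.
    sealed trait Expression
    case class IntExp (value : Int) extends Expression
    case class TrueExp () extends Expression

    // Zero-indexed position of the node e in args, or None if e is not
    // one of the argument nodes.
    def argPosition (args : List[Expression], e : Expression) : Option[Int] = {
        val pos = args.indexWhere (_ eq e)
        if (pos >= 0) Some (pos) else None
    }

}
```

With this approach there is no need to subtract two, since the base expression and method name are not part of the list being searched. The cost is a linear search of the argument list for each argument.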

Part Two: Testing

Testing was undertaken using the same principles as for the first part of the assignment. The only real difference is that in these tests it is natural to test multiple things at once. For example, to test that the type of a call expression is correct, we are also testing that the method name is looked up properly. Here is the positive test.

test ("the type of a method call expression is the method return type (1)") {
    parseTest ("""
        |class Dummy { public static void main () { System.out.println (0); } }
        |class Test {
        |    int v;
        |    public int m () {
        |        return 42;
        |    }
        |    public int n () {
        |        v = this.m ();
        |        return 0;
        |    }
        |}
        """.stripMargin)
    assert (messagecount === 0)
}

Tony Sloane

Copyright (c) 2012-2014 by Anthony Sloane, Macquarie University. All rights reserved.