Macquarie University
Department of Computing

COMP332 Programming Languages 2012

Assignment Two

MiniJava Semantic Analysis

Due: 11am Monday 8 October (Week 9)
Worth: 10%

This assignment asks you to develop part of a semantic analyser for the MiniJava programming language. We will build on the work done in Assignment One, but will not support the interfaces that were added in Part Two of that assignment.

Building this implementation will give you insight into the way that programming language implementations work in general, as well as specific experience with how Java is compiled and executed. All compilers and other language processors must enforce the rules of the language. The semantic analyser enforces rules to do with naming and typing in particular. We only proceed to code generation if these rules are satisfied.

We will build on the first two assignments in Assignment Three, where we will add a simple code generator so that we can run our programs.

Updates from Assignment One

To support the semantic analysis tasks that we wish to carry out, it is useful to have a slightly different tree structure from that of Assignment One. This section outlines the changes. See the new version of MiniJavaTree.scala in the skeleton for details. The syntax analyser has been updated to use the new structure.

What you have to do

You have to write, document and test a Scala semantic analyser for the MiniJava language, as described below. There is one part for each of two passing assessment standards: a) P and Cr, and b) D and HD, with part (b) requiring more independent work than part (a).

You are strongly advised to complete each part of the assignment before moving onto the next one. In fact, within each part it is advisable to solve small portions at a time rather than trying to code the whole solution in one go.

MiniJava semantic rules

This section explains the rules of name and type analysis for MiniJava which you will need in the assignment. Some languages allow name and type analysis to be handled separately, but in Java-like languages they are intertwined. In an expression such as o.m(), we need to know the meaning of the name o in order to work out that it is of a reference type; only then can we look up the name m in the referenced class to find out what it means.

Basically, MiniJava names work as in Java, with some simplifications. Four different kinds of named entity exist: classes, methods, method arguments and variables. The whole program is a scope containing all of the normal classes. Each class has a scope nested within the program scope, containing all of the class's variables and methods. Each method has a scope nested within its class scope, containing all of the method's arguments and variables. Declared names are usable anywhere in the scope in which they are defined, not just from their declaration to the end of the scope.
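As a rough illustration of the nesting described above, the following self-contained Scala sketch models an environment as a list of scopes, innermost first. It does not use the Kiama environment library required for the assignment; the names ScopeSketch, Entity, lookup and isDefinedInScope are assumptions made up for this example only.

```scala
object ScopeSketch {

  // Hypothetical entity kinds, one per kind of named entity in MiniJava.
  sealed trait Entity
  case class ClassEntity(name: String) extends Entity
  case class MethodEntity(name: String) extends Entity
  case class ArgumentEntity(name: String) extends Entity
  case class VariableEntity(name: String) extends Entity

  // A scope maps names to entities. An environment is a list of scopes,
  // with the innermost scope at the head.
  type Scope = Map[String, Entity]
  type Environment = List[Scope]

  // Search the innermost scope first, then each enclosing scope in turn.
  def lookup(env: Environment, name: String): Option[Entity] =
    env.find(_.contains(name)).map(scope => scope(name))

  // True if the name is defined in the innermost scope itself,
  // ignoring definitions in enclosing scopes.
  def isDefinedInScope(env: Environment, name: String): Boolean =
    env.headOption.exists(_.contains(name))

  // Example: a method scope nested in a class scope nested in the program scope.
  val programScope: Scope = Map("Counter" -> ClassEntity("Counter"))
  val classScope: Scope =
    Map("count" -> VariableEntity("count"), "inc" -> MethodEntity("inc"))
  val methodScope: Scope = Map("step" -> ArgumentEntity("step"))
  val env: Environment = List(methodScope, classScope, programScope)
}
```

Because each scope in this sketch is built completely before any lookup is made, a name is visible throughout its scope and not just after its declaration. For instance, ScopeSketch.lookup(ScopeSketch.env, "count") finds the class variable from inside the method scope.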

The main class is not represented by an entity like the normal classes, since the main class name cannot be used as a normal class name. The main class only exists as an entry point for the program. The expression this makes no sense in the main class, since it has no variables or methods, but we will not check this condition in this assignment.

MiniJava variables and arguments can refer to values of type integer, Boolean, integer array or reference to an instance of a particular class. The first three types are represented by the syntax tree node types used in Assignment One. Reference types are represented by instances of the ReferenceType case class defined in SymbolTable.scala.

The full MiniJava semantic analyser phase must implement the following rules and check any associated conditions (but see below for what you need to do in the two parts of the assignment):

  1. There must be at most one defining occurrence of any name in a scope, not counting definitions in enclosing scopes.

  2. Each applied identifier occurrence must refer to exactly one defining occurrence. That defining occurrence is located by first looking in the innermost scope in which the applied occurrence occurs. If the defining occurrence is not located in the innermost scope, the search proceeds recursively to the smallest enclosing scope.

  3. If a defining occurrence of an applied identifier is not found in the class in which that applied occurrence appears, and the class has a superclass, then the search moves to the superclass. If that class has no definition for the identifier and has a superclass, the search moves to that superclass. And so on until the definition is found or the superclass chain is exhausted.

  4. The type of an integer expression is integer.

  5. The type of a true or false expression is Boolean.

  6. The type of an identifier expression referring to a variable, argument or class is the declared type of the variable or argument (in the first two cases) or a reference type that refers to instances of that class (in the last case). Method names cannot be used by themselves in expressions; they can only be used in call expressions.

  7. The condition in an if-statement or while-statement must be of type Boolean.

  8. The expression in a println-statement can be of any type.

  9. In a variable assignment statement, the name on the left-hand side must refer to a variable or argument. The type of the expression on the right-hand side must be the same as that of the variable or argument.

  10. In an array assignment statement or index expression, the base sub-expression must be of integer array type and the index sub-expression must be of integer type. In the assignment case, the type of the right-hand side sub-expression must be integer. In the index expression case, the type of the expression is integer.

  11. In plus, minus and star expressions, the types of both sub-expressions must be integer. The type of the expression is integer.

  12. In a logical AND expression, the types of both sub-expressions must be Boolean. The type of the expression is Boolean.

  13. In a logical NOT expression, the type of the sub-expression must be Boolean. The type of the expression is Boolean.

  14. In a less-than expression, the types of both sub-expressions must be integer. The type of the expression is Boolean.

  15. In a length expression, the type of the base sub-expression must be integer array. The type of the expression is integer.

  16. In a method call expression the named entity that is being called must be a method of the class whose instance is referenced by the base expression (or in a superclass if that class has no method with the given name). The number of arguments supplied in the call and their types must be the same as the number of arguments and argument types required by the called method. The type of the expression is the return type of the method.

  17. The type of a this expression is a reference to an instance of the class in which the this expression occurs.

  18. In a new array expression the type of the sub-expression must be integer. The type of the new array expression is integer array.

  19. In a new instance expression the name used must refer to a normal class. The type of the expression is a reference to an instance of that class.

  20. The type of the return expression in a method must be the same as the declared return type of the method.
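To make the shape of the typing rules more concrete, here is a minimal sketch covering rules 4, 5, 11-15 and 18 for a cut-down expression tree. It is plain Scala and is not written with Kiama attributes; the tree classes, the object name TypeSketch, the functions tipe, check and expect, and the error message wording are all hypothetical and do not match the skeleton.

```scala
object TypeSketch {

  // Hypothetical type representation for this sketch only.
  sealed trait Type
  case object IntType extends Type
  case object BooleanType extends Type
  case object IntArrayType extends Type

  // Hypothetical, cut-down expression tree (not MiniJavaTree.scala).
  sealed trait Expression
  case class IntExp(value: Int) extends Expression
  case object TrueExp extends Expression
  case object FalseExp extends Expression
  case class PlusExp(left: Expression, right: Expression) extends Expression
  case class AndExp(left: Expression, right: Expression) extends Expression
  case class NotExp(exp: Expression) extends Expression
  case class LessExp(left: Expression, right: Expression) extends Expression
  case class LengthExp(base: Expression) extends Expression
  case class NewArrayExp(size: Expression) extends Expression

  // The type of each of these expressions is fixed by its kind alone.
  def tipe(e: Expression): Type = e match {
    case IntExp(_)                          => IntType       // rule 4
    case TrueExp | FalseExp                 => BooleanType   // rule 5
    case _: PlusExp | _: LengthExp          => IntType       // rules 11, 15
    case _: AndExp | _: NotExp | _: LessExp => BooleanType   // rules 12-14
    case _: NewArrayExp                     => IntArrayType  // rule 18
  }

  // Collect error messages for an expression and all of its sub-expressions.
  def check(e: Expression): List[String] = e match {
    case IntExp(_) | TrueExp | FalseExp => Nil
    case PlusExp(l, r) => expect(l, IntType) ++ expect(r, IntType)
    case AndExp(l, r)  => expect(l, BooleanType) ++ expect(r, BooleanType)
    case NotExp(s)     => expect(s, BooleanType)
    case LessExp(l, r) => expect(l, IntType) ++ expect(r, IntType)
    case LengthExp(b)  => expect(b, IntArrayType)
    case NewArrayExp(s) => expect(s, IntType)
  }

  // Check a sub-expression, then compare its type with the expected one.
  def expect(sub: Expression, expected: Type): List[String] = {
    val found = tipe(sub)
    val here =
      if (found == expected) Nil
      else List(s"expected $expected but got $found")
    check(sub) ++ here
  }
}
```

In this sketch, TypeSketch.check(TypeSketch.PlusExp(TypeSketch.IntExp(1), TypeSketch.TrueExp)) yields a single message saying that IntType was expected but BooleanType was found. Identifier, call, this and new instance expressions are omitted because their types depend on name analysis.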

Part One (Pass and Credit, 74 marks): The basic MiniJava semantic rules

The first part of the assignment involves implementing and testing the basic MiniJava semantic rules. Specifically, you must implement rules 1, 2, 4-15 and 17-19 listed in the previous section.

Your code must use the Kiama attribute grammar and environment libraries as discussed in lectures and practicals. You should use the expression language semantic analyser as a guide for your implementation, although note that MiniJava is a much more complex language.

A skeleton sbt project for the assignment has been provided on BitBucket as the inkytonik/comp332-ass2 repository. The modules are very similar to those used in the practical exercises for Weeks 6 and 7. For this assignment you should not have to modify any parts of the implementation except the semantic analyser (SemanticAnalysis) and the related tests (SemanticTests.scala).

Some of the semantic analysis and useful associated testing code is given to get you started, including most of the basic environment handling; you must provide the rest, particularly the implementations of the checks that enforce rules (look for FIXME in the code for some places where new code has to go).

Part Two (Distinction and High Distinction, 26 marks): The more complex semantic rules

The second part of the assignment entails implementing and testing the semantic rules that were not covered by Part One, i.e., rules 3, 16 and 20.
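One possible way to think about rules 3 and 16 is sketched below: a method is searched for in a class and then up its superclass chain, and a call is accepted only if the argument count and types match. This is a standalone illustration, not the skeleton's design. MethodInfo, ClassInfo, findMethod and checkCall are invented names, the ReferenceType here is a local stand-in rather than the one in SymbolTable.scala, and the sketch assumes the superclass chain contains no cycles.

```scala
import scala.annotation.tailrec

object InheritanceSketch {

  // Hypothetical type representation for this sketch only.
  sealed trait Type
  case object IntType extends Type
  case object BooleanType extends Type
  case class ReferenceType(className: String) extends Type

  // Hypothetical summaries of declared methods and classes.
  case class MethodInfo(argTypes: List[Type], returnType: Type)
  case class ClassInfo(superclass: Option[String],
                       methods: Map[String, MethodInfo])

  // Rule 3, restricted to methods: look in the class itself, then move up
  // the superclass chain until the method is found or the chain runs out.
  // Assumes the chain is acyclic; a cycle would make this loop forever.
  @tailrec
  def findMethod(classes: Map[String, ClassInfo], className: String,
                 method: String): Option[MethodInfo] =
    classes.get(className) match {
      case None => None
      case Some(info) =>
        info.methods.get(method) match {
          case found @ Some(_) => found
          case None =>
            info.superclass match {
              case Some(parent) => findMethod(classes, parent, method)
              case None         => None
            }
        }
    }

  // Rule 16: the result is either an error message or the call's type.
  def checkCall(classes: Map[String, ClassInfo], baseType: Type,
                method: String, argTypes: List[Type]): Either[String, Type] =
    baseType match {
      case ReferenceType(className) =>
        findMethod(classes, className, method) match {
          case None =>
            Left(s"$method is not a method of $className")
          case Some(m) if m.argTypes.length != argTypes.length =>
            Left("wrong number of arguments")
          case Some(m) if m.argTypes != argTypes =>
            Left("argument type mismatch")
          case Some(m) =>
            Right(m.returnType)
        }
      case _ =>
        Left("base of call is not an object reference")
    }

  // Example: Square extends Shape and inherits its area method.
  val classes: Map[String, ClassInfo] = Map(
    "Shape" -> ClassInfo(None, Map("area" -> MethodInfo(Nil, IntType))),
    "Square" -> ClassInfo(Some("Shape"),
      Map("scale" -> MethodInfo(List(IntType), BooleanType)))
  )
}
```

With these example classes, a call to area on a Square reference is resolved through the superclass Shape, while a call to scale on a Shape reference is rejected because Shape has no such method and no superclass to search.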

Running the semantic analyser and testing it

The skeleton for this assignment is designed to be run from within sbt. For example, to compile your project and run it on the file test/factorial.java you use the command
  run test/factorial.java

Assuming your code compiles and runs, this will print any semantic errors that have been found. If no errors are detected, you will see no output.

The project is also set up to do automatic testing. See the file SemanticTests.scala which provides the necessary definitions to test the semantic analyser on some sample inputs. Note that the tests we provide are not sufficient to test all of your code. You must augment them with other tests.

You can run the tests using the test command in sbt. This command will build the project and then run each test in turn, comparing the output produced by your program with the expected output. Any deviations will be reported as test failures.

Running all of the tests can be time-consuming since it also runs the parsing tests. To just run the semantic analysis tests you can use the command test-only *SemanticTests.

What you must hand in and how

  1. All of the code for your semantic analyser. To make this clear, submit every file that is needed to build your program from source, including files in the skeleton that you have not changed. Do not add any new files or include multiple versions of your files. Do not include any libraries. We will compile all of the files that you submit using sbt, so you should avoid any other build mechanisms.

  2. Your submission should include all of the tests that you have used to make sure that your program is working correctly. Note that just testing one or two simple cases is not enough for many marks. You should test as comprehensively as you can.

  3. A type-written report that describes how you have achieved the goals of the assignment. Your report must contain the following components or sections:

Submit your code and report electronically as a single zip file called ass2.zip using the appropriate submission link on the COMP332 iLearn website by the due date and time. Your report should be in PDF format. DO NOT SUBMIT YOUR ASSIGNMENT OR DOCUMENTATION IN ANY FORMAT OTHER THAN ZIP AND PDF, RESPECTIVELY.

Marking

The assignment will be assessed according to the assessment standards for Learning Outcomes 2, 3 and 4 as specified in the Unit Guide.

Marks will be allocated equally to the code and to the report. Your code will be assessed for correctness and quality with respect to the assignment description. Marking of the report will assess the clarity and accuracy of your description and the adequacy of your testing. Marks allocated to testing will be 30% of the marks for the assignment.


Tony Sloane
Last modified: 7 September 2012
Copyright (c) 2010-2012 by Anthony Sloane, Macquarie University. All rights reserved.