Recursive Descent Parser - Java
Published:
A Java program that analyzes the validity of an input files’ lexical syntax for a hypothetical imperative programming language.
Introduction
What is a Recursive Descent Parser? A recursive descent parser has a subprogram for each nonterminal in its grammar. Many of these subprograms are recursive - hence the name - and produces a parse tree in top-down nature.
In this application, I have created a program to analyze the syntax of input programs for a hypothetical imperative programming language. The Extended Backus-Naur Form (EBNF) grammar for this imperative programming language is below:
<program> -> program begin <statement_list> end
<statement_list> -> <statement> {;<statement>}
<statement> -> <assignment_statement> | <if_statement> | <loop_statement>
<assignment_statement> -> <variable> = <expression>
<variable> -> identifier
<expression> -> <term> { (+|-) <term>}
<term> -> <factor> {(* | /) <factor> }
<factor> -> identifier | int_constant | (<expr>)
<if_statement> -> if (<logic_expression>) then <statement>
<logic_expression> -> <variable> (< | >) <variable>
<loop_statement> -> loop (<logic_expression>) <statement>
Where:
- Separators:
|–> OR{}–> optional statement with zero or more repetitions()–> encapsulates an OR selection;–> statement terminator
 - Keywords:
programbeginendifthenloop
 - Operators:
+–> Addition-–> Subtraction*–> Multiplication/–> Division>–> Greater than<–> Less than- In this grammar, there are only two logic expressions 
>and<. 
- In this grammar, there are only two logic expressions 
 
 - Identifiers:
- A string value that begins with a letter followed by 0 or more letters and/or digits
 
 
The nonterminal symbols include:
<program><statement_list><statement><assignment_statement><variable><if_statement><loop_statement><expression><logic_expression><term><factor><expr>
The program reads in an input test .txt file and determines whether or not the file contains syntax errors.
Tools & Techniques
- Java
 - Eclipse IDE
 
Implementation
The implementation of the program follows the two steps below:
- Implement a lexical analyzer as a subprogram of the main program. Each time the lexical analyzer is called, it should return the next lexeme and its token code.
 - Implement a parser based on the above Extended Backus-Naur Form (EBNF) rules. Create a subprogram for each nonterminal symbol which should parse only sentences that can be generated by the nonterminal.
 
Test Cases
The following test case files have been included:
BeginMissing.txt–> The syntax error in this test file is on line 2 where the keywordbeginshould be.

ExtraSemicolon.txt–> The syntax error in this test file is on line 7 where there should not be a semicolon.

IfMisspelled.txt–> The syntax error in this test file is on the beginning of line 6, where the keywordifis misspelled.

MissingThen.txt–> The syntax error in this test file is at the end of line 4, there should be athenkeyword.

NoErrors.txt–> There are no syntax errors in this test file.

SemicolonMissing.txt–> The syntax error in this test file is a missing semicolon at the end of line
 g
The only file that runs correctly is the NoErrors.txt file.
Run
To execute the program, follow the instructions below:
- Clone the git repo: 
https://github.com/erincameron11/recursive-descent-parser - Open up the files as a Java project in the IDE of your choice.
 - Run the 
RecursiveDescentParser.javafile and follow the commands in the console.- _Note: When entering a file name in the console, do not put the file extensions (in this case, 
.txt). For example:NoErrorsis a valid file name entry in the console. 
 - _Note: When entering a file name in the console, do not put the file extensions (in this case, 
 
