- Digit
(1) [0-9]
- Letter
(2) ‘_’ ([a-zA-Z] | [0-9] | ‘_’ | ‘$’)+ | ([a-zA-Z] | ‘$’)* ([a-zA-Z] | [0-9] | ‘_’ | ‘$’)+
- Identifier
(3) identifier = letter (letter | digit | ‘.’)*
- Reserved Word and Identifier
(1) reserved = while | if | for | do | switch | case | then | else | break | out.println | main | class
(2) identifier = int
- Comment
(1) comment = ‘//’ (~newline)*
- A comment begins with the characters ‘//’ and continues to the end of the line.
- White Space
(1) whitespace = (newline | blank | tab | comment)
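The definitions above can be tried out with java.util.regex. The following is only a sketch: the class name LexicalPatterns, the classify method, and the constant names are hypothetical and do not come from scnr.java. It assumes the identifier rule "letter (letter | digit | ‘.’)*", treats a run of one or more digits as a number (the definitions above only name a single digit), and treats "int" as an ordinary identifier, as the list above does.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper class; none of these names are taken from scnr.java.
class LexicalPatterns {
    // One or more digits. The source only defines digit = [0-9]; the "+" is an assumption.
    static final Pattern DIGITS = Pattern.compile("[0-9]+");
    // identifier = letter (letter | digit | '.')*
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9.]*");
    // A comment starts with "//" and runs up to, but not including, the newline.
    static final Pattern COMMENT = Pattern.compile("//[^\\n]*");
    // Reserved words copied from the list above.
    static final Set<String> RESERVED = Set.of(
            "while", "if", "for", "do", "switch", "case",
            "then", "else", "break", "out.println", "main", "class");

    // Returns a type name for a lexeme that has already been cut out of the input.
    static String classify(String lexeme) {
        if (RESERVED.contains(lexeme)) return "reserved";
        if (COMMENT.matcher(lexeme).matches()) return "comment";
        if (IDENTIFIER.matcher(lexeme).matches()) return "identifier";
        if (DIGITS.matcher(lexeme).matches()) return "digits";
        return "unknown";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"while", "int", "out.println", "42", "// note"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

The reserved-word check runs before the identifier pattern because every reserved word in the list would otherwise also match as an identifier.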
Figure 1: NFA notation for scanner
Figure 2: DFA notation for scanner
- UML
Figure 3: UML of scnr java code
- Figure 3 shows the UML diagram of scnr.java.
- public static int keyflag: Stores the keyword property of a token.
- public static void main function: Acts as the main driver and determines the type of each token.
- public static String readFile function: Reads the file and converts its contents to a static String.
- public static void print_token_type function: Prints each token together with its type.
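As an illustration of the readFile member listed above, here is a minimal sketch of how a file can be read into a String on Java 11. The class name ReadFileSketch and the parameter name fileName are hypothetical, and the actual implementation in scnr.java may differ (for example, it may read line by line).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the readFile function of scnr.java.
class ReadFileSketch {
    // Reads the whole file into one String (Files.readString requires Java 11 or later).
    static String readFile(String fileName) throws IOException {
        return Files.readString(Path.of(fileName));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadFileSketch [file name]");
            return;
        }
        // Echo the file contents; a scanner would walk this String instead.
        System.out.print(readFile(args[0]));
    }
}
```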
- Flow
Figure 4: Flow of scnr java program
- When scnr is executed, control enters the main function first.
- The main function takes the name of the file to be scanned as an argument.
- The file name is passed to the readFile function, which converts the contents of the file to a String and returns it to main for scanning. The scanning algorithm follows the pseudocode in 3, together with the transition table in Figure 5.
- Once a token has been separated and its type determined, it is printed by the print_token_type function and control returns to main. This repeats until no characters remain in the buffer.
Figure 5: Transition Table
- The table lists the state transitions of the DFA in 3.
- The table does not show the accepting states.
- Pseudocode
state := 0
terminator = '(', ')', ' ', ';', '{', '}', '"', ','
special_symbol = '+', '-', '=', '<', '>', '/', '*'
ch := next input character;
while ch is not empty do
    if ch is digit
        state := trs_tbl[state+1][digit];
    else if ch is letter
        state := trs_tbl[state+1][letter];
    else if ch is terminator
        print_token_type();
        state := trs_tbl[state+1][terminator];
    else if ch is special_symbol
        print_token_type();
        state := trs_tbl[state+1][special_symbol];
    else
        error occurs;
    ch := next input character;
end while;
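The loop above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code of scnr.java: the class TableScanSketch, its three-state table, and the token labels are made up for the example, because the real table exists only as Figure 5. The sketch indexes the table by state directly (instead of state+1), treats any whitespace as a terminator, and counts ‘_’, ‘$’ and ‘.’ as letters.

```java
import java.util.ArrayList;
import java.util.List;

// Table-driven scanner sketch. The table is a made-up three-state example,
// not the transition table from Figure 5.
class TableScanSketch {
    static final int DIGIT = 0, LETTER = 1, TERMINATOR = 2, SPECIAL = 3;
    static final int START = 0, IN_NUMBER = 1, IN_IDENT = 2, ERROR = -1;

    // TRS_TBL[state][character class] gives the next state.
    static final int[][] TRS_TBL = {
        /* START     */ {IN_NUMBER, IN_IDENT, START, START},
        /* IN_NUMBER */ {IN_NUMBER, ERROR,    START, START},
        /* IN_IDENT  */ {IN_IDENT,  IN_IDENT, START, START},
    };

    // Maps a character to its column in the table, or -1 if it is not allowed.
    static int classOf(char ch) {
        if (ch >= '0' && ch <= '9') return DIGIT;
        if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')
                || ch == '_' || ch == '$' || ch == '.') return LETTER;
        if ("(){};\",".indexOf(ch) >= 0 || Character.isWhitespace(ch)) return TERMINATOR;
        if ("+-=<>/*".indexOf(ch) >= 0) return SPECIAL;
        return -1;
    }

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder lexeme = new StringBuilder();
        int state = START;
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            int cls = classOf(ch);
            if (cls < 0) throw new IllegalArgumentException("unexpected character: " + ch);
            if (cls == TERMINATOR || cls == SPECIAL) {
                // A terminator or special symbol ends the pending token.
                flush(tokens, lexeme, state);
                if (!Character.isWhitespace(ch)) tokens.add("symbol:" + ch);
            } else {
                lexeme.append(ch);
            }
            state = TRS_TBL[state][cls];
            if (state == ERROR) throw new IllegalArgumentException("bad token near index " + i);
        }
        flush(tokens, lexeme, state); // emit a token that reaches the end of input
        return tokens;
    }

    // Plays the role of print_token_type: records the pending lexeme with its type.
    static void flush(List<String> tokens, StringBuilder lexeme, int state) {
        if (lexeme.length() == 0) return;
        tokens.add((state == IN_NUMBER ? "number:" : "identifier:") + lexeme);
        lexeme.setLength(0);
    }

    public static void main(String[] args) {
        System.out.println(scan("x = 42;"));
    }
}
```

Running main on the input "x = 42;" prints [identifier:x, symbol:=, number:42, symbol:;]. Keeping the transitions in an array means the loop body stays the same when states are added; only the table and the character classes change.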
- Using ant
- Using javac
- Use ant (directory name)
(1) ant version
- Apache Ant(TM) version 1.10.5 compiled on March 28 2019
(2) build.xml
<project name="scnr" default="build" basedir=".">
    <property name="src" value="src"/>
    <property name="build" value="build"/>
    <property name="doc" value="doc"/>
    <path id="lib.path">
        <pathelement location="${build}" />
    </path>
    <target name="init">
        <mkdir dir="${build}"/>
    </target>
    <target name="build" depends="init">
        <javac srcdir="${src}" destdir="${build}" debug="true" includeantruntime="false">
        </javac>
    </target>
    <target name="run" depends="build">
        <java classname="scnr" fork="true" dir="." maxmemory="4096m">
            <classpath location="."/>
            <classpath refid="lib.path"/>
            <arg file="data/test.txt"/>
        </java>
    </target>
    <target name="clean">
        <delete dir="${build}"/>
    </target>
</project>
(3) command
- ant build
- ant run
- The build.xml already sets the input file (data/test.txt), so ant run needs no file name argument.
- Use Javac (directory name)
(1) java version
- openjdk version "11.0.6" 2020-01-14
- OpenJDK Runtime Environment (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1)
- OpenJDK 64-Bit Server VM (build 11.0.6+10-post-Ubuntu-1ubuntu118.04.1, mixed mode)
(2) command
- javac scnr.java
- java scnr [file name]