
Tokenize a Java source file

With this example we are going to demonstrate how to tokenize a Java source file using a StreamTokenizer.
In short, to tokenize a Java source file you should:

  • Create a new FileReader.
  • Create a new StreamTokenizer that parses the given FileReader.
  • Use the parseNumbers() API method of StreamTokenizer to specify that numbers should be parsed by this tokenizer.
  • Use the wordChars(int low, int hi) API method to specify that all characters c in the range low <= c <= hi are word constituents.
  • Use the eolIsSignificant(boolean flag) API method to determine whether or not ends of line are treated as tokens.
  • Use the ordinaryChars(int low, int hi) API method to specify that all characters c in the range low <= c <= hi are “ordinary” in this tokenizer.
  • Use the slashSlashComments(boolean flag) API method to determine whether or not the tokenizer recognizes C++-style comments.
  • Use the slashStarComments(boolean flag) API method to determine whether or not the tokenizer recognizes C-style comments.
  • Iterate over the tokens of the tokenizer and, for every token, check whether it is a quoted string, the end of a line, a number, a word or something else.
  • Close the FileReader using its close() API method.
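
To see what the wordChars step changes, here is a minimal sketch (not part of the original example) that tokenizes an in-memory string instead of a file. The class name WordCharsSketch and the helper words(...) are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original example.
class WordCharsSketch {

    // Collects the word tokens of the given text.
    // When underscoreIsWordChar is true, '_' is registered as a word constituent.
    static List<String> words(String source, boolean underscoreIsWordChar) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(source));
        if (underscoreIsWordChar) {
            tokenizer.wordChars('_', '_');
        }
        List<String> words = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    words.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(words("int my_var", false)); // prints [int, my, var]
        System.out.println(words("int my_var", true));  // prints [int, my_var]
    }
}
```

Without the call, the underscore is an ordinary character, so an identifier such as my_var is split into two words around it.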

Let’s take a look at the code snippet that follows:

package com.javacodegeeks.snippets.core;

import java.io.FileReader;
import java.io.StreamTokenizer;

public class Main {

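    public static void main(String[] argv) throws Exception {

        FileReader fileReader = new FileReader("C:/Users/nikos7/Desktop/Main.java");

        StreamTokenizer tokenizer = new StreamTokenizer(fileReader);

        // Parse numbers: they are reported as TT_NUMBER with the value in nval
        tokenizer.parseNumbers();
        // Treat the underscore as a word constituent
        tokenizer.wordChars('_', '_');
        // Ask for ends of line to be reported as tokens
        tokenizer.eolIsSignificant(true);
        // Make control characters and the space ordinary, so each one
        // is returned as a single-character token
        tokenizer.ordinaryChars(0, ' ');
        // Recognize and skip // and /* */ comments
        tokenizer.slashSlashComments(true);
        tokenizer.slashStarComments(true);

        // Read the first token here and advance at the end of each iteration,
        // so that the first token is not skipped
        int tok = tokenizer.nextToken();

        while (tok != StreamTokenizer.TT_EOF) {
            switch (tok) {
                case StreamTokenizer.TT_NUMBER:
                    double n = tokenizer.nval;
                    System.out.println(n);
                    break;
                case StreamTokenizer.TT_WORD:
                    String word = tokenizer.sval;
                    System.out.println(word);
                    break;
                case '"':
                    // Body of a double-quoted string
                    String doublequote = tokenizer.sval;
                    System.out.println(doublequote);
                    break;
                case '\'':
                    // Body of a single-quoted literal
                    String singlequote = tokenizer.sval;
                    System.out.println(singlequote);
                    break;
                case StreamTokenizer.TT_EOL:
                    // Ends of line are not printed
                    break;
                default:
                    // Any other token is a single character
                    char character = (char) tokenizer.ttype;
                    System.out.println(character);
                    break;
            }
            tok = tokenizer.nextToken();
        }
        fileReader.close();
    }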
}
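
A variant of the listing above, sketched under a few assumptions: the tokenizing logic is moved into a helper that accepts any Reader, the reader is closed with try-with-resources instead of an explicit close() call, and whitespace handling is left at the tokenizer defaults so that only visible tokens are collected. The names TokenizeSketch and tokenize(...) are hypothetical and not part of the original example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the example; class and method names are illustrative.
class TokenizeSketch {

    // Tokenizes everything from the reader and returns one label per token.
    // try-with-resources closes the reader even if reading fails.
    static List<String> tokenize(Reader source) {
        List<String> tokens = new ArrayList<>();
        try (Reader reader = source) {
            StreamTokenizer tokenizer = new StreamTokenizer(reader);
            tokenizer.wordChars('_', '_');
            tokenizer.slashSlashComments(true);
            tokenizer.slashStarComments(true);

            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tokenizer.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        tokens.add("NUMBER " + tokenizer.nval);
                        break;
                    case StreamTokenizer.TT_WORD:
                        tokens.add("WORD " + tokenizer.sval);
                        break;
                    case '"':
                    case '\'':
                        // sval holds the text between the quotes
                        tokens.add("QUOTED " + tokenizer.sval);
                        break;
                    default:
                        tokens.add("CHAR " + (char) tokenizer.ttype);
                        break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String code = "String s = \"hi\"; // greeting\nint n = 42;";
        tokenize(new StringReader(code)).forEach(System.out::println);
    }
}
```

To tokenize a file with this helper, pass it a FileReader, for example tokenize(new FileReader(path)), where path is whatever source file you want to read.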

Output:

ch
 
=
 
(
char
)
 
tokenizer.ttype
;

 
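
The output lines that appear to hold only a space are a consequence of the ordinaryChars(0, ' ') call: it removes the whitespace status from the space, so each space comes back as its own single-character token instead of being skipped. The following minimal sketch compares the two behaviours on an in-memory string; WhitespaceSketch and tokens(...) are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative.
class WhitespaceSketch {

    // Returns the text of every token: words as-is, numbers as their
    // double value, anything else as a one-character string.
    static List<String> tokens(String source, boolean whitespaceIsOrdinary) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(source));
        if (whitespaceIsOrdinary) {
            tokenizer.ordinaryChars(0, ' ');
        }
        List<String> result = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                } else {
                    result.add(String.valueOf((char) tokenizer.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a = b", false).size()); // prints 3
        System.out.println(tokens("a = b", true).size());  // prints 5
    }
}
```

If the whitespace tokens are not wanted, one option is to drop the ordinaryChars(0, ' ') call and keep the default whitespace settings.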
This was an example of how to tokenize a Java source file using a StreamTokenizer.

Byron Kiourtzoglou

Byron is a master software engineer working in the IT and Telecom domains. He is an applications developer in a wide variety of applications/services. He is currently acting as the team leader and technical architect for a proprietary service creation and integration platform for both the IT and Telecom industries, in addition to an in-house big data real-time analytics solution. He is always fascinated by SOA, middleware services and mobile development. Byron is co-founder and Executive Editor at Java Code Geeks.