Parse an Apache logfile with regular expressions
This example shows how to parse an Apache logfile entry in Java. The listing below tokenizes the entry with StringTokenizer rather than a regular expression, and its output shows where that approach falls short. The steps are as follows:
- We created an interface, LogExample, with a static final int NUM_FIELDS (the number of fields that must be found) and a static final String logEntryLine (the sample log entry to be parsed).
- We created an implementation of the interface, Apache, that builds a StringTokenizer over logEntryLine and uses the countTokens() API method of StringTokenizer to calculate the number of times that this tokenizer's nextToken() method can be called before it throws an exception. The count is computed with the current delimiter set, so it is not reliable once the delimiters change.
- It then uses the nextToken() API method of StringTokenizer to return the next token, and the nextToken(String delim) API method to switch to the specified delimiters and return the next token, following the delimiters of the log entry, as shown in the complete listing below.
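The delimiter behaviour these steps rely on can be seen in isolation in the small sketch below. It is not part of the original example: the class name DelimiterDemo, the helper tokens, and the input string are hypothetical and chosen only for illustration. It shows that nextToken(String delim) keeps the new delimiter set for later calls and that countTokens() counts with whatever set is current.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original listing.
class DelimiterDemo {

    // Reads four tokens, switching the delimiter set to "]" on the third call.
    static List<String> tokens(String text) {
        StringTokenizer st = new StringTokenizer(text); // default: whitespace delimiters
        List<String> out = new ArrayList<>();
        out.add(st.nextToken());      // split on whitespace
        out.add(st.nextToken());      // split on whitespace
        out.add(st.nextToken("]"));   // delimiter set is now "]" only
        out.add(st.nextToken());      // still "]": spaces are part of the token
        return out;
    }

    public static void main(String[] args) {
        String text = "a b [c d] e f";
        // countTokens() uses the whitespace delimiters in effect at this point.
        System.out.println("tokens = " + new StringTokenizer(text).countTokens());
        for (String t : tokens(text)) {
            System.out.println("'" + t + "'");
        }
    }
}
```

With the input above, the third token is " [c d" (leading space and bracket included) and the fourth is " e f", because the space is no longer a delimiter after the switch.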
package com.javacodegeeks.snippets.core;

import java.util.StringTokenizer;

/**
 * Parse an Apache log file with StringTokenizer
 */
public class Apache implements LogExample {

    public static void main(String[] argv) {

        StringTokenizer matcher = new StringTokenizer(logEntryLine);

        System.out.println("tokens = " + matcher.countTokens());

        // StringTokenizer CAN NOT count if you are changing the delimiter!
        // if (matcher.countTokens() != NUM_FIELDS) {
        //     System.err.println("Bad log entry (or bug in StringTokenizer?):");
        //     System.err.println(logEntryLine);
        // }

        System.out.println("Hostname: " + matcher.nextToken());

        // StringTokenizer makes you ask for tokens in order to skip them:
        matcher.nextToken(); // eat the "-"
        matcher.nextToken(); // again

        // From here on each call with an argument replaces the delimiter set.
        System.out.println("Date/Time: " + matcher.nextToken("]"));
        //matcher.nextToken(" "); // again

        System.out.println("Request: " + matcher.nextToken("\""));
        matcher.nextToken(" "); // again

        System.out.println("Response: " + matcher.nextToken());

        System.out.println("ByteCount: " + matcher.nextToken());

        System.out.println("Referer: " + matcher.nextToken("\""));
        matcher.nextToken(" "); // again

        System.out.println("User-Agent: " + matcher.nextToken("\""));
    }
}

/**
 * Common fields for Apache Log demo.
 */
interface LogExample {

    /**
     * The number of fields that must be found.
     */
    public static final int NUM_FIELDS = 9;

    /**
     * The sample log entry to be parsed.
     */
    public static final String logEntryLine = "123.45.67.89 - - [27/Oct/2000:09:27:09 -0400] \"GET /java/javaResources.html HTTP/1.0\" 200 10450 \"-\" \"Mozilla/4.6 [en] (X11; U; OpenBSD 2.8 i386; Nav)\"";
}
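Output. Only the hostname is extracted cleanly: every nextToken(String delim) call changes the delimiter set for all later calls, so characters that were delimiters under the previous set (such as the space before a quote) end up inside the next token and the remaining fields are shifted: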
tokens = 19
Hostname: 123.45.67.89
Date/Time: [27/Oct/2000:09:27:09 -0400
Request: ]
Response: /java/javaResources.html
ByteCount: HTTP/1.0"
Referer: 200 10450
User-Agent:
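Because the tokenizer output above puts most values under the wrong label, a regular expression is a natural fit for this format. The following is a minimal sketch, not the original author's code: the class name ApacheRegexSketch, the parse helper, and the pattern itself are assumptions written for the layout of the sample entry (host, two identity fields, bracketed timestamp, quoted request, status, byte count, quoted referer, quoted user agent). It also assumes that quoted fields contain no embedded double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based counterpart to the StringTokenizer listing.
class ApacheRegexSketch {

    // Same sample entry as in the listing above.
    static final String SAMPLE =
        "123.45.67.89 - - [27/Oct/2000:09:27:09 -0400] "
        + "\"GET /java/javaResources.html HTTP/1.0\" 200 10450 \"-\" "
        + "\"Mozilla/4.6 [en] (X11; U; OpenBSD 2.8 i386; Nav)\"";

    // Nine capture groups, one per field (assumed layout, see lead-in).
    static final Pattern LOG_PATTERN = Pattern.compile(
        "(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" "
        + "(\\d{3}) (\\S+) \"([^\"]*)\" \"([^\"]*)\"");

    // Returns the nine fields, or null if the whole line does not match.
    static String[] parse(String line) {
        Matcher m = LOG_PATTERN.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // group 0 is the whole match
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] f = parse(SAMPLE);
        if (f == null) {
            System.err.println("Bad log entry: " + SAMPLE);
            return;
        }
        System.out.println("Hostname: " + f[0]);
        System.out.println("Date/Time: " + f[3]);
        System.out.println("Request: " + f[4]);
        System.out.println("Response: " + f[5]);
        System.out.println("ByteCount: " + f[6]);
        System.out.println("Referer: " + f[7]);
        System.out.println("User-Agent: " + f[8]);
    }
}
```

For the sample entry this sketch yields nine fields, matching NUM_FIELDS, with the request reported as GET /java/javaResources.html HTTP/1.0, the response as 200, and the byte count as 10450. Using matches() rather than find() means a line with missing or extra fields is rejected instead of being partially parsed.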
This was an example of how to parse an Apache logfile entry in Java, using StringTokenizer as the starting point for a regular-expression approach.