Home » Core Java » util » regex » Parse an Apache logfile with regular expressions

About Byron Kiourtzoglou

Avatar photo
Byron is a master software engineer working in the IT and Telecom domains. He is an applications developer in a wide variety of applications/services. He is currently acting as the team leader and technical architect for a proprietary service creation and integration platform for both the IT and Telecom industries in addition to a in-house big data real-time analytics solution. He is always fascinated by SOA, middleware services and mobile development. Byron is co-founder and Executive Editor at Java Code Geeks.

Parse an Apache logfile with regular expressions

In this example we shall show you how to parse an Apache logfile with regular expressions. To parse an Apache logfile with regular expressions we have followed the steps below:

  • We have created an interface with a static final int that is the number of fields to be found and a static final String that is the log entry to be parsed.
  • We have also created an implementation of the interface, that creates a StringTokenizer with the String logEntryLine and uses countTokens() API method of StringTokenizer to calculate the number of times that this tokenizer’s nextToken() method can be called before it generates an exception.
  • Then it uses nextToken() API method of StringTokenizer to return the next token, and nextToken(String delim) API method of StringTokenizer to get the next token using specified delimiters, according to the log entry delimiters,

as described in the code snippet below.

package com.javacodegeeks.snippets.core;

import java.util.StringTokenizer;

/**
 * Parse an Apache log file with StringTokenizer
 */
public class Apache implements LogExample {

    public static void main(String argv[]) {


  StringTokenizer matcher = new StringTokenizer(logEntryLine);


  System.out.println("tokens = " + matcher.countTokens());

  // StringTokenizer CAN NOT count if you are changing the delimiter!

  // if (matcher.countTokens() != NUM_FIELDS) {

  //   System.err.println("Bad log entry (or bug in StringTokenizer?):");

  //   System.err.println(logEntryLine);

  // }


  System.out.println("Hostname: " + matcher.nextToken());

  // StringTokenizer makes you ask for tokens in order to skip them:

  matcher.nextToken(); // eat the "-"

  matcher.nextToken(); // again

  System.out.println("Date/Time: " + matcher.nextToken("]"));

  //matcher.nextToken(" "); // again

  System.out.println("Request: " + matcher.nextToken("""));

  matcher.nextToken(" "); // again

  System.out.println("Response: " + matcher.nextToken());

  System.out.println("ByteCount: " + matcher.nextToken());

  System.out.println("Referer: " + matcher.nextToken("""));

  matcher.nextToken(" "); // again

  System.out.println("User-Agent: " + matcher.nextToken("""));
    }
}
/**
 * Common fields for Apache Log demo.
 */
interface LogExample {

    /**
     * The number of fields that must be found.
     */
    public static final int NUM_FIELDS = 9;
    /**
     * The sample log entry to be parsed.
     */
    public static final String logEntryLine = "123.45.67.89 - - [27/Oct/2000:09:27:09 -0400] "GET /java/javaResources.html HTTP/1.0" 200 10450 "-" "Mozilla/4.6 [en] (X11; U; OpenBSD 2.8 i386; Nav)"";
}

Output:

tokens = 19
Hostname: 123.45.67.89
Date/Time:  [27/Oct/2000:09:27:09 -0400
Request: ] 
Response: /java/javaResources.html
ByteCount: HTTP/1.0"
Referer:  200 10450 
User-Agent:  

 
This was an example of how to parse an Apache logfile with regular expressions in Java.

Do you want to know how to develop your skillset to become a Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you our best selling eBooks for FREE!

 

1. JPA Mini Book

2. JVM Troubleshooting Guide

3. JUnit Tutorial for Unit Testing

4. Java Annotations Tutorial

5. Java Interview Questions

6. Spring Interview Questions

7. Android UI Design

 

and many more ....

 

Receive Java & Developer job alerts in your Area

I have read and agree to the terms & conditions

 

Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments