About Ima Miri

Ima is a Senior Software Developer in enterprise application design and development. She is experienced in high-traffic websites for e-commerce, media and financial services. She is interested in new technologies and innovation, along with technical writing. Her main focus is on web architecture, web technologies, Java/J2EE, open source and mobile development for Android.

Java Regular Expressions Tutorial

1. What is a regular expression?

A regular expression is a sequence of characters that describes a pattern for searching, editing or manipulating text and data. Regular expressions have their own syntax, which you need to learn in order to write them. They are commonly used to define constraints on strings, for example in password and email validation.

Java provides the Regex API in the java.util.regex package. Its main types are the classes Pattern and Matcher and the exception PatternSyntaxException.

1.1. What is Pattern?

Pattern is a compiled representation of a regular expression. A regular expression, which is specified as a string, must first be compiled into an instance of the Pattern class. The resulting pattern can then be used to create a Matcher object.

Pattern p = Pattern.compile("\\d");

Instances of the Pattern class are immutable and thread safe.
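
Because a Pattern is immutable, one common way to use it is to compile it once and share it. The following is a minimal sketch of that idea; the class name and the containsDigit helper are made up for illustration.

```java
import java.util.regex.Pattern;

class CompiledPatternDemo {
    // Compiled once; Pattern instances are immutable, so sharing this constant is safe.
    private static final Pattern DIGIT = Pattern.compile("\\d");

    // Hypothetical helper: true if the text contains at least one digit.
    static boolean containsDigit(String text) {
        return DIGIT.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDigit("Regular expression tutorial with 9 examples!")); // true
        System.out.println(containsDigit("no digits here")); // false
    }
}
```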

1.2. What is a Matcher?

A matcher is created from a pattern by invoking the pattern’s matcher method.

Matcher matcher = p.matcher("Regular expression tutorial with 9 examples!");

Instances of the Matcher class are not thread safe.
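
Since a Matcher carries state and is not thread safe, a simple approach is to create a new one for each input and keep it in a local variable. The sketch below assumes that approach; findNumbers is a hypothetical helper that collects every match found by find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: collects every run of digits in the input.
    static List<String> findNumbers(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input); // local matcher, never shared
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("20 potato, 10 tomato, 5 bread")); // [20, 10, 5]
    }
}
```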

1.3. PatternSyntaxException

PatternSyntaxException is an unchecked exception that is thrown when the syntax of a regular expression is incorrect.
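
As a small illustration, the exception can be caught around Pattern.compile to check a regular expression that comes from user input. The isValidRegex helper below is a hypothetical name, not part of the Java API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckDemo {
    // Hypothetical helper: reports whether the string compiles as a regular expression.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() describe what went wrong and where
            System.out.println("Invalid regex: " + e.getDescription() + " near index " + e.getIndex());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("\\d+"));  // true
        System.out.println(isValidRegex("(\\d+")); // false, the group is never closed
    }
}
```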

1.4. Regular Expression Predefined Character Classes

Predefined character classes work as shorthand for commonly used character sets and make a regular expression easier to read.

Regular Expression    Description
\d                    Any digit, short for [0-9]
\D                    Any non-digit, short for [^0-9]
\s                    Any whitespace character, short for [ \t\n\x0B\f\r]
\S                    Any non-whitespace character, short for [^\s]
\w                    Any word character, short for [a-zA-Z_0-9]
\W                    Any non-word character, short for [^\w]
\b                    A word boundary
\B                    A non-word boundary
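
The short program below is a sketch that tries a few of these classes. The helper names are made up for illustration; note that every backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

class PredefinedClassesDemo {
    // Hypothetical helper: true if the whole input consists of word characters.
    static boolean isWord(String input) {
        return Pattern.matches("\\w+", input);
    }

    // Hypothetical helper: replaces "cat" only where it stands as a whole word.
    static String catToDog(String input) {
        return input.replaceAll("\\bcat\\b", "dog");
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d\\d", "42"));                // true
        System.out.println(Pattern.matches("\\D+", "abc"));                 // true
        System.out.println(Pattern.matches("\\S+\\s\\S+", "hello world"));  // true
        System.out.println(isWord("user_01"));                              // true
        System.out.println(isWord("user-01"));                              // false, '-' is not a word character
        System.out.println(catToDog("cat concat"));                         // dog concat
    }
}
```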

1.5. Regular Expression Quantifiers

Quantifiers specify how many times the preceding character, class or group may occur in the input string.

Regular Expression    Description
a?                    a occurs once or not at all
a*                    a occurs zero or more times
a+                    a occurs one or more times
a{n}                  a occurs exactly n times
a{n,}                 a occurs n or more times
a{n,m}                a occurs at least n times but not more than m times
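
To see the quantifiers in action, here is a minimal sketch that checks a few inputs against whole-string matches. The fullMatch helper is a hypothetical wrapper around Pattern.matches.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("colou?r", "color"));  // true, u is optional
        System.out.println(fullMatch("colou?r", "colour")); // true
        System.out.println(fullMatch("ab*", "a"));          // true, zero b's allowed
        System.out.println(fullMatch("ab+", "a"));          // false, at least one b required
        System.out.println(fullMatch("\\d{4}", "2024"));    // true
        System.out.println(fullMatch("\\d{4}", "202"));     // false
        System.out.println(fullMatch("a{2,}", "aaaa"));     // true
        System.out.println(fullMatch("a{2,3}", "aaaa"));    // false, more than 3
    }
}
```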

1.6. Regular Expression Common Symbols

Regular Expression    Description
.                     Any character
^                     The beginning of a line
$                     The end of a line
[abc]                 a, b, or c (simple class)
[^abc]                Any character except a, b, or c
(a)                   a, as a capturing group
\\                    The backslash character
a|b                   Either a or b
\t                    The tab character
\n                    The newline character
\r                    The carriage-return character
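
The following sketch exercises several of the symbols above. As before, fullMatch is a hypothetical helper and the sample strings are invented for the example.

```java
import java.util.regex.Pattern;

class CommonSymbolsDemo {
    // Hypothetical helper: true only if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Hypothetical helper: true if the input consists of digits only.
    static boolean onlyDigits(String input) {
        // ^ and $ anchor find() to the beginning and end of the input
        return Pattern.compile("^\\d+$").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a.c", "abc"));       // true
        System.out.println(fullMatch("[abc]at", "cat"));   // true
        System.out.println(fullMatch("[^abc]at", "cat"));  // false
        System.out.println(fullMatch("gr(a|e)y", "grey")); // true
        System.out.println(onlyDigits("2024"));            // true
        System.out.println(onlyDigits("year 2024"));       // false
        // "\\\\" in Java source is the regex \\, which matches one backslash
        System.out.println("C:\\temp".split("\\\\").length); // 2
    }
}
```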

2. How to use regular expressions

Let's start with some examples of the Pattern class and how it works.

2.1. split

Pattern pattern = Pattern.compile("\\d+");
String[] st = pattern.split("20 potato, 10 tomato, 5 bread");
// st[0] is the empty string in front of the leading "20", so start at index 1
for (int i = 1; i < st.length; i++) {
   System.out.println("recipe ingredient" + i + " : " + st[i]);
}

Output:

recipe ingredient1 : potato,
recipe ingredient2 : tomato,
recipe ingredient3 : bread

split() splits the given input string around matches of the pattern. In the example above, the pattern \d+ matches each run of one or more digits, and split() uses those matches as delimiters.
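
One detail worth seeing in code is that a match at the very start of the input leaves an empty first element. The sketch below shows this, together with a hypothetical splitOnCommas helper that uses a different delimiter pattern (a comma with optional surrounding whitespace) as an assumption for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SplitDemo {
    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: number of pieces produced when splitting on runs of digits.
    static int pieceCount(String input) {
        return NUMBER.split(input).length;
    }

    // Hypothetical helper: splits on commas and drops the whitespace around them.
    static List<String> splitOnCommas(String input) {
        return Arrays.asList(COMMA.split(input.trim()));
    }

    public static void main(String[] args) {
        String[] st = NUMBER.split("20 potato, 10 tomato, 5 bread");
        System.out.println(st.length);       // 4
        System.out.println(st[0].isEmpty()); // true, nothing precedes the leading "20"
        System.out.println(splitOnCommas("potato, tomato , bread")); // [potato, tomato, bread]
    }
}
```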

2.2. flags

A Pattern can be created with flags that change how it is matched against the input string. For example, Pattern.CASE_INSENSITIVE enables case-insensitive matching.

Pattern pattern = Pattern.compile("abc$", Pattern.CASE_INSENSITIVE);
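
Flags are int constants and can be combined with the bitwise OR operator. The sketch below compares the same regular expression with and without flags; the found helper is a hypothetical name.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: compiles the regex with the given flags and searches the input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("abc$", 0, "xyzABC"));                        // false
        System.out.println(found("abc$", Pattern.CASE_INSENSITIVE, "xyzABC")); // true
        // Without MULTILINE, $ only matches at the end of the whole input
        System.out.println(found("abc$", Pattern.CASE_INSENSITIVE, "ABC\nxyz")); // false
        // With MULTILINE, $ also matches just before each line terminator
        System.out.println(found("abc$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE, "ABC\nxyz")); // true
    }
}
```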

2.3. matches

The Pattern class has a static matches method that takes a regular expression and an input string as arguments and returns a boolean that tells whether the entire input matches.

System.out.println("Matches: " + Pattern.matches(".*", "abcd654xyz00")); // true

If you only need to check whether a whole string matches a regular expression, you can call the matches method of String instead of going through Pattern.

String str = "abcd654xyz00";
str.matches(".*"); //true
Tip
A pattern is applied to a string from left to right, and a part of the string that has been used in one match cannot be reused in the next. For example, the regex "234" matches "34234656723446" only twice, as "__234____234__".
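
The tip can be checked with a small counting loop. This is a minimal sketch; countMatches is a hypothetical helper that calls find() until no further match is found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonOverlappingDemo {
    // Hypothetical helper: counts successive, non-overlapping matches.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) { // each find() resumes after the previous match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("234", "34234656723446")); // 2
        System.out.println(countMatches("aa", "aaaa"));            // 2, not 3
    }
}
```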

2.4. Groups and capturing

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups: ((A)(B(C))), (A), (B(C)), (C).

To find out how many groups are present in a regular expression, call groupCount() on a matcher object. It returns an int with the number of capturing groups in the matcher's pattern. For example, ((ab)(c)) contains 3 capturing groups: ((ab)(c)), (ab) and (c).

There is also a special group, group zero, which always represents the entire expression. This group is not included in the total reported by groupCount().

Pattern p = Pattern.compile("(cd)(\\d+\\w)(.*)", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("abCD45ee EE54dcBA");
if(m.find()) {
    System.out.println("Group0: " + m.group(0));
    System.out.println("Group1: " + m.group(1));
    System.out.println("Group2: " + m.group(2));
    System.out.println("Group3: " + m.group(3));
}

System.out.println("Group count: " + m.groupCount());

And here is the output:

Group0: CD45ee EE54dcBA
Group1: CD
Group2: 45e
Group3: e EE54dcBA
Group count: 3
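
The numbering rule for nested groups can be verified in the same way. The sketch below lists every group of the expression ((A)(B(C))) mentioned earlier; the groups helper is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns group 0 through groupCount() when the whole input matches.
    static List<String> groups(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.matches()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Group 0 is the whole match; groups 1 to 4 follow the opening parentheses left to right
        System.out.println(groups("((A)(B(C)))", "ABC")); // [ABC, ABC, A, BC, C]
    }
}
```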

The part of the input string that matches a capturing group is saved in memory and can be recalled with a backreference. A backreference is written in the regular expression as a backslash (\) followed by the number of the group to be recalled.

System.out.println(Pattern.matches("(\\d\\w)\\1", "2x2x")); //true
System.out.println(Pattern.matches("(\\d\\w)\\1", "2x2z")); //false
System.out.println(Pattern.matches("(A\\d)(bcd)\\2\\1", "A4bcdbcdA4")); //true
System.out.println(Pattern.matches("(A\\d)(bcd)\\2\\1", "A4bcdbcdA5")); // false

In the first example, the capturing group is (\d\w). When it is matched against the input string "2x2x", it captures "2x", which is saved in memory. The backreference \1 then refers to "2x", the rest of the input is "2x" as well, and the result is true. By the same analysis, the second example results in false, because the rest of the input is "2z" rather than "2x". Now it is your turn to analyse the capturing groups in examples 3 and 4.
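
A typical practical use of a backreference is spotting an accidentally repeated word. The following is only a sketch under that assumption; the regular expression and the firstRepeatedWord helper are illustrative and not taken from the examples above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedWordDemo {
    // \1 must repeat exactly what group 1 captured, separated by whitespace
    private static final Pattern REPEATED = Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Hypothetical helper: returns the first doubled word, or null if there is none.
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test")); // is
        System.out.println(firstRepeatedWord("this is a test"));    // null
    }
}
```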

2.5. Other Matcher methods

Matcher has some other methods to work with regular expressions.

2.5.1 lookingAt and matches

The matches and lookingAt methods both match an input string against a pattern, starting at the beginning of the input. The difference between them is that matches requires the entire input string to match, while lookingAt only requires a match at the beginning.

Pattern pattern = Pattern.compile("dd");
Matcher matcher = pattern.matcher("dd3435dd");
System.out.println("lookingAt(): " + matcher.lookingAt()); // true
System.out.println("matches(): " + matcher.matches()); // false
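
For comparison, find() searches anywhere in the input. The sketch below puts the three methods side by side; describe is a hypothetical helper, and it uses a fresh matcher for each call so that one result does not affect where the next search starts.

```java
import java.util.regex.Pattern;

class MatchModesDemo {
    // Hypothetical helper: summarizes how the three methods treat the same input.
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(input).matches();     // entire input must match
        boolean prefix = p.matcher(input).lookingAt();  // match must start at index 0
        boolean anywhere = p.matcher(input).find();     // match may start anywhere
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(describe("dd", "dd3435dd")); // matches=false lookingAt=true find=true
        System.out.println(describe("dd", "3435dd"));   // matches=false lookingAt=false find=true
        System.out.println(describe("dd", "dd"));       // matches=true lookingAt=true find=true
    }
}
```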

2.5.2. start and end

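The start() and end() methods report where the match was found in the input string: start() returns the index of the first matched character, and end() returns the index just after the last matched character.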

Pattern p = Pattern.compile("(cd)(\\d+\\w)(.*)", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("abCD45ee EE54dcBA");
if(m.find()) {
    System.out.println("start(): " + m.start()); //2 
    System.out.println("end(): " + m.end()); //17
}
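
When a pattern matches several times, start() and end() can be read after each successful find(). The sketch below assumes a hypothetical locate helper that records each match with its half-open index range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchPositionsDemo {
    // Hypothetical helper: lists each match together with its [start, end) range.
    static List<String> locate(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group() + " [" + m.start() + "," + m.end() + ")");
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String hit : locate("\\d+", "20 potato, 10 tomato, 5 bread")) {
            System.out.println(hit);
        }
        // 20 [0,2)
        // 10 [11,13)
        // 5 [22,23)
    }
}
```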

2.5.3. replaceAll and replaceFirst

replaceAll and replaceFirst return a copy of the input string in which matches are replaced by the replacement string. replaceFirst replaces only the first occurrence, and replaceAll replaces all occurrences.

public static void main(String[] args) {
   Pattern pt = Pattern.compile("Lion");
   Matcher mt = pt.matcher("Lion is the strongest animal in jungle. Lion is smart.");
   // Both methods reset the matcher first and leave the input string unchanged
   String s1 = mt.replaceFirst("Bear");
   System.out.println("replaceFirst(): " + s1);
   String s2 = mt.replaceAll("Tiger");
   System.out.println("replaceAll(): " + s2);
}

Output:

replaceFirst(): Bear is the strongest animal in jungle. Lion is smart.
replaceAll(): Tiger is the strongest animal in jungle. Tiger is smart.
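
The replacement string can also refer to captured groups as $1, $2 and so on. The sketch below illustrates this with a date format that is assumed purely for the example; toDayFirst is a hypothetical helper.

```java
import java.util.regex.Pattern;

class GroupReplacementDemo {
    // Three capturing groups: year, month, day
    private static final Pattern ISO_DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: rewrites yyyy-MM-dd dates as dd/MM/yyyy.
    static String toDayFirst(String text) {
        // $3, $2 and $1 insert the text captured by groups 3, 2 and 1
        return ISO_DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Released 2024-01-15, patched 2024-02-03"));
        // Released 15/01/2024, patched 03/02/2024
    }
}
```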

Java regular expressions come up often in interview questions, and they take practice to master.

3. Download the code

This was a tutorial on Java regular expressions.

Download
You can download the full source code of this example here: Java Regular Expression tutorial
