java.util.regex.Pattern Example
The Pattern class is a compiled representation of a regular expression. It is defined in the java.util.regex package as a final class.
A regular expression is specified as a string. This string must first be compiled into an instance of this class using the Pattern.compile() static method. The resulting instance is then used by the Matcher class to perform match operations.
This article’s examples show the usage of the Pattern class. The examples are tested on Windows OS and require Java SE 7.
Matcher
Matcher is an engine that performs match operations on a character sequence by interpreting a pattern. A matcher is created from a pattern by invoking the pattern’s matcher() method.
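The compile-then-match flow described above can be sketched as follows. This is a minimal illustration, not code from the article's download: the class name CompileOnceSketch, the constant WORD_WITH_OO, and the helper countMatches() are hypothetical names chosen for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; the names here are illustrative assumptions.
class CompileOnceSketch {

    // Compile the regex once; the same Pattern instance can serve many inputs.
    private static final Pattern WORD_WITH_OO = Pattern.compile(".oo.");

    // Counts how many inputs match the pattern in their entirety.
    static int countMatches(String... inputs) {
        int count = 0;
        for (String input : inputs) {
            // A new Matcher is created from the pattern for each input.
            Matcher matcher = WORD_WITH_OO.matcher(input);
            if (matcher.matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "look" and "book" match; "lock" and "looks" do not.
        System.out.println(countMatches("look", "book", "lock", "looks")); // 2
    }
}
```

Compiling once and creating a matcher per input avoids recompiling the same regex string on every call.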
Regex
A regular expression (abbreviated regex) is a sequence of characters that forms a search pattern, for use in pattern matching with strings used in search or find-replace operations. Each character in a regular expression is either understood to be a metacharacter with its special meaning, or a regular character with its literal meaning.
A reference article on regular expressions can be found at Wikipedia: Regular_expression
PatternSyntaxException
PatternSyntaxException is an unchecked exception thrown to indicate a syntax error in a regular expression pattern. The Pattern class’s compile() method can throw this runtime exception.
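Because the exception is unchecked, the compiler does not force a caller to handle it. When the regex string comes from outside the program, catching it explicitly may be worthwhile. A minimal sketch, assuming a hypothetical helper describeSyntaxError() in a hypothetical class SyntaxCheckSketch:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch class; the names here are illustrative assumptions.
class SyntaxCheckSketch {

    // Returns null if the regex compiles, otherwise a short error description.
    static String describeSyntaxError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            // getDescription() is the error text; getIndex() is the
            // approximate position of the error, or -1 if unknown.
            return e.getDescription() + " (index " + e.getIndex() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeSyntaxError("[A-Z]o")); // valid regex, prints null
        System.out.println(describeSyntaxError("[A-Z"));   // unclosed bracket, prints a description
    }
}
```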
Examples
This article shows the usage of some of the features of the Pattern class. They are the following:
- matches() static method
- compile(), matcher() and split() methods
- Usage of the Matcher class with Pattern
- Pattern’s match flags (defined as constant fields)
1. Example 1
In this example the Pattern class’s matches() method is used to perform a simple match. This method compiles the given regular expression and attempts to match the entire input against it.
1.1. The code
PatternExample1.java
import java.util.regex.Pattern;

public class PatternExample1 {

    public static void main(String[] args) {

        String regexStr = ".oo.";
        String sourceStr = "look";

        boolean result = Pattern.matches(regexStr, sourceStr);
        System.out.println("[" + regexStr + "] found in [" + sourceStr + "] : " + result);
    }
}
1.2. The output
[.oo.] found in [look] : true
From the output note that the “.oo.” regex string matches a four-character word with “oo” in the middle, hence the true result. The dot (.) metacharacter stands for any single character in a regex pattern.
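Although the program prints “found in”, Pattern.matches() succeeds only when the whole input matches, whereas Matcher’s find() looks for the pattern anywhere in the input. The difference can be sketched as below; the class WholeInputMatchSketch and its two helper methods are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical sketch class; the names here are illustrative assumptions.
class WholeInputMatchSketch {

    // True only if the entire input matches the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if any part of the input matches the regex.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch(".oo.", "look"));    // true
        System.out.println(wholeMatch(".oo.", "looks"));   // false: five characters
        System.out.println(partialMatch(".oo.", "looks")); // true: "look" is found inside
    }
}
```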
2. Example 2
This example uses the Pattern and Matcher classes to search text.
The program tries to find all occurrences of a two-character string of the format: any upper-case letter followed by “o”. The regex pattern string used for this is: “[A-Z]o”. Here the square brackets “[ ]” define a character class, and “A-Z” inside them specifies the range of upper-case letters from “A” to “Z”.
The expected search result: “Do” is a match, and “do” is not a match. The input text to be searched is a set of phrases in an array.
2.1. The code
PatternExample2.java
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class PatternExample2 {

    public static void main(String[] args) {

        String[] phrases = {"Chew the fat", "Cat got your tongue",
            "Do a Devon Loch", "Hairy at the heel", "Have a one track mind!",
            "More holes than a Swiss cheese", "When pigs fly"};

        String regexStr = "[A-Z]o";
        Pattern pattern = Pattern.compile(regexStr);

        System.out.println("Regex pattern: " + regexStr);

        for (String phrase : phrases) {

            Matcher matcher = pattern.matcher(phrase);

            // find() advances to the next match within the phrase
            while (matcher.find()) {
                System.out.println("[" + matcher.group() + "] found in [" + phrase + "]");
            }
        }
    }
}
2.2. The output
Regex pattern: [A-Z]o
[Do] found in [Do a Devon Loch]
[Lo] found in [Do a Devon Loch]
[Mo] found in [More holes than a Swiss cheese]
From the output note that all occurrences of the match are found within a phrase. In the phrase “Do a Devon Loch”, “Do” and “Lo” are found. Since the first character must be an upper-case letter, “vo” (in “Devon”) is not found in that phrase.
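Besides the matched text returned by group(), the Matcher also reports where each match begins through its start() method. A minimal sketch, assuming a hypothetical helper findWithPositions() in a hypothetical class MatchPositionsSketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch class; the names here are illustrative assumptions.
class MatchPositionsSketch {

    // Collects each match together with its start index, for example "Do@0".
    static List<String> findWithPositions(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            found.add(matcher.group() + "@" + matcher.start());
        }
        return found;
    }

    public static void main(String[] args) {
        // Prints [Do@0, Lo@11]
        System.out.println(findWithPositions("[A-Z]o", "Do a Devon Loch"));
    }
}
```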
3. Example 3
This example shows the usage of the Pattern class’s split() method. The split() method splits the given input character sequence around matches of this pattern and returns an array of strings.
3.1. The code
PatternExample3.java
import java.util.regex.Pattern;

public class PatternExample3 {

    public static void main(String[] args) {

        String regexStr = "\\s";
        String sourceStr = "foo bar baz";

        Pattern pattern = Pattern.compile(regexStr);
        String[] ss = pattern.split(sourceStr);

        System.out.println("Split [" + sourceStr + "] with [" + regexStr + "]");

        for (String s : ss) {
            System.out.println(s);
        }
    }
}
3.2. The output
Split [foo bar baz] with [\s]
foo
bar
baz
From the output note that the split() method with the regex pattern “\s” looks for a white-space character and splits the input string into three strings. In the code the regex is written as “\\s” because a backslash must itself be escaped in a Java string literal; the string that reaches the regex engine is “\s”.
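Two variations on this example may be useful: a single “\s” produces an empty string between two consecutive spaces, while “\s+” treats a run of white-space as one separator, and the overloaded split(CharSequence, int) caps the number of resulting pieces. A minimal sketch; the class SplitVariantsSketch and its helper methods are hypothetical names used for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical sketch class; the names here are illustrative assumptions.
class SplitVariantsSketch {

    // Splits on runs of one or more white-space characters.
    static String[] splitOnWhitespaceRuns(String input) {
        return Pattern.compile("\\s+").split(input);
    }

    // Splits on single white-space characters, producing at most 'limit' pieces.
    static String[] splitWithLimit(String input, int limit) {
        return Pattern.compile("\\s").split(input, limit);
    }

    public static void main(String[] args) {
        // Two spaces between "foo" and "bar" in the first two inputs.
        System.out.println(Arrays.toString(Pattern.compile("\\s").split("foo  bar"))); // [foo, , bar]
        System.out.println(Arrays.toString(splitOnWhitespaceRuns("foo  bar")));        // [foo, bar]
        System.out.println(Arrays.toString(splitWithLimit("foo bar baz", 2)));         // [foo, bar baz]
    }
}
```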
4. Example 4
This example shows the usage of the Pattern class’s match flags. These are defined as constant fields of type int. The overloaded version of the compile() static method accepts a bit mask of one or more flags as an additional parameter and returns a Pattern instance.
The CASE_INSENSITIVE flag is used in this example. This flag enables case-insensitive matching.
4.1. The code
PatternExample4.java
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class PatternExample4 {

    public static void main(String[] args) {

        String[] phrases = {"Chew the fat", "Cat got your tongue",
            "Do a Devon Loch", "Hairy at the heel", "Have a one track mind!",
            "More holes than a Swiss cheese", "When pigs fly"};

        String regexStr = "[A-Z]o";
        int matchFlag = Pattern.CASE_INSENSITIVE;
        Pattern pattern = Pattern.compile(regexStr, matchFlag);

        System.out.println("Regex pattern (CASE_INSENSITIVE): " + regexStr);

        for (String phrase : phrases) {

            Matcher matcher = pattern.matcher(phrase);

            // find() advances to the next match within the phrase
            while (matcher.find()) {
                System.out.println("[" + matcher.group() + "] found in [" + phrase + "]");
            }
        }
    }
}
4.2. The output
Regex pattern (CASE_INSENSITIVE): [A-Z]o
[go] found in [Cat got your tongue]
[yo] found in [Cat got your tongue]
[to] found in [Cat got your tongue]
[Do] found in [Do a Devon Loch]
[vo] found in [Do a Devon Loch]
[Lo] found in [Do a Devon Loch]
[Mo] found in [More holes than a Swiss cheese]
[ho] found in [More holes than a Swiss cheese]
From the output note that all occurrences of the match are found within a phrase. In the phrase “More holes than a Swiss cheese”, “Mo” and “ho” are found; their first characters are upper case and lower case respectively.
4.3. NOTES
- In the above program the same result can be achieved without using the match flag; use the regex pattern string “[a-zA-Z]o”.
- Multiple match flags can be specified at a time. For example, to define a regex pattern with the CASE_INSENSITIVE and LITERAL flags, use the following syntax: int matchFlags = Pattern.CASE_INSENSITIVE | Pattern.LITERAL;
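The combination of the two flags mentioned in the notes can be sketched as follows. With LITERAL the pattern string is treated as plain characters, so the dot loses its special meaning, while CASE_INSENSITIVE still applies. The class CombinedFlagsSketch and the helper matchesLiterally() are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical sketch class; the names here are illustrative assumptions.
class CombinedFlagsSketch {

    // Matches the input against 'literal' taken character by character,
    // ignoring case. Flags are combined with the bitwise OR operator.
    static boolean matchesLiterally(String literal, String input) {
        int matchFlags = Pattern.CASE_INSENSITIVE | Pattern.LITERAL;
        Pattern pattern = Pattern.compile(literal, matchFlags);
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesLiterally("a.b", "A.B")); // true: case is ignored
        System.out.println(matchesLiterally("a.b", "axb")); // false: the dot is literal
    }
}
```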
5. Download Java Source Code
This was an example of the java.util.regex.Pattern class.
You can download the full source code of this example here: RegexPatternExamples.zip