Apache JMeter

JMeter Regular Expression Extractor Example

In this example, we will demonstrate the use of Regular Expression Extractor post processor in Apache JMeter. We will go about parsing and extracting the portion of response data using regular expression and apply it on a different sampler. Before we look at the usage of Regular Expression Extractor, let’s look at the concept.

1. Introduction

Apache JMeter is an open source Java based tool that enables you to perform functional, load, performance and regression tests on an application. The application may be running on a Web server or it could be a standalone in nature. It supports testing on both client-server and web model containing static and dynamic resources. It supports wide variety of protocols for conducting tests that includes, HTTP, HTTPS, JDBC, FTP, JMS, LDAP, SOAP etc.

A quick look at some of the features:

  • It provides a comprehensive GUI based workbench to play around with tests. It also allows you to work in a non-GUI mode. JMeter can also be ported on the server allowing to perform tests in a distributed environment.
  • It provides a concept of template which are pre-defined test plans for various schemes or protocols that can be directly used to create your required test plan.
  • It enables you to build test plan structurally using powerful features like Thread Group, Controllers, Samplers, Listeners etc.
  • It provides debugging and error monitoring through effective logging.
  • It supports parameterized testing through the concept of variables.
  • It supports creation of different flavors of test plan that includes Web, Database, FTP, LDAP, Web service, JMS, Monitors etc.
  • It allows for remote testing by having different JMeter instances running as servers across nodes and accessed from a single client application.
  • It gives you real time test results that covers metrics like latency, throughput, response times, active threads etc.
  • It enables you to perform testing based on regular expressions and many more other features.

1.1. Regular Expression

Regular expression is a pattern matching language that performs a match on a given value, content or expression. The regular expression is written with series of characters that denote a search pattern. The pattern is applied on strings to find and extract the match. The regular expression is often termed as regex in short. Pattern based searching has become very popular and is provided by all the known languages like Perl, Java, Ruby, Javascript, Python etc. The regex is commonly used with UNIX operating system with commands like grep, ls, awk and editors like ed and sed. The language of regex uses meta characters like . (matches any single character), [] (matches any one character), ^ (matches the start position), $ (matches the end position) and many more to devise a search pattern. Using these meta characters, one can write a powerful regex search pattern with combination of if/else conditions and replace feature. The discussion about regex is beyond the scope of this article. You can find plenty of articles and tutorials on regular expression available on the net.

1.2. Regular Expression Extractor

Regular Expression (regex) feature in JMeter is provided by the Jakarta ORO framework. It is modelled on Perl5 regex engine. With JMeter, you could use regex to extract values from the response during test execution and store it in a variable (also called as reference name) for further use. Regular Expression Extractor is a post processor that can be used to apply regex on response data. The matched expression derived on applying the regex can then be used in a different sampler dynamically in the test plan execution. The Regular Expression Extractor control panel allows you to configure the following fields:

Apply to: Regex extractor are applied to test results which is a response data from the server. A response from the primary request is considered main sample while that of sub request is a sub sample. A typical HTML page (primary resource) may have links to various other resources like image, javascript files, css etc. These are embedded resources. A request to these embedded resources will produce sub samples. An HTML page response itself becomes primary or a main sample. A user has the option to apply regex to main sample or sub samples or both.

Field to check: Regex is applied to the response data. Here you choose what type of response it should match. There are various response indicators or fields available to choose. You can apply regex to plain response body or a document that is returned as a response data. You can also apply regex to request and response headers. You can also parse URL using regex or you can opt to apply regex on response code.

Reference Name: This is the name of the variable that can be further referenced in the test plan using ${}. After applying regex, the final extracted value is stored in this variable. Behind the scenes, JMeter will generate more than 1 variable depending on the match occurred. If you have defined groups in your regex by providing parenthesis (), then it will generate as many variables as number of groups. These variables names are suffixed with the letters _g(n) where n is the group no. When you do not define any grouping on your regex, the returned value is termed as the zeroth group or group 0. Variable values can be checked by using Debug Sampler. This will enable you to verify whether you regular expression worked or not.

Regular Expression: This is the regex itself that is applied on the response data. A regex may or may not have a group. A group is a subset of string that is extracted from the match. For example, if the response data is ‘Hello World’ and my regex is Hello (.+)$, then it matches ‘Hello World’ but extracts the string ‘World’. The parenthesis () applied is the group that is captured or extracted. You may have more than one group in your regex, so which one or how many to extract, is configured through the use of template. See the below point.

Template: Templates are references or pointers to the groups. A regex may have more than one groups. It allows you to specify which group value to extract by specifying the group number as $1$ or $2$ or $1$$2$ (extract both groups). From the ‘Hello World’ example in the above point, $0$ points to the complete matched expression that is ‘Hello World’ and $1$ group points to the string ‘World’. A regex without parenthesis () is matched as $0$ (default group). Based on the template specified, that group value is stored in the variable (reference name).

Match no.: A regex applied to the response data may have more than one matches. You can specify which match should be returned. For example, a value of 2 will indicate that it should return the second match. A value of 0 will indicate any random match to be returned. A negative value will return all the matches.

Default value: The regex match is set to a variable. But what happens when the regex does not match. In such a scenario, the variable is not created or generated. But if you specify a default value then if the regex does not match then the variable is set to the specified default value. It is recommended to provide a default value so that you know whether your regex worked or not. It is a useful feature for debugging your test.

2. Regular Expression Extractor By Example

We will now demonstrate the use of Regular Expression Extractor by configuring a regex that will extract the URL of the first article from the JCG (Java Code Geeks) home page. After extracting the URL, we will use it in a HTTP Request sampler to test the same. The extracted URL will be set in a variable.

Before installing JMeter, make sure you have JDK 1.6 or higher installed. Download the latest release of JMeter using the link here. At the time of writing this article, the current release of JMeter is 2.13. To install, simply unzip the archive into your home directory where you want JMeter to be installed. Set the JAVA_HOME environment variable to point to JDK root folder. After unzipping the archive, navigate to /bin folder and run the command jmeter. For Windows, you can run using the command window. This will open JMeter GUI window that will allow you to build the test plan.

2.1. Configuring Regular Expression Extractor

Before we configure regex extractor, we will create a test plan with a ThreadGroup named ‘Single User’ and a HTTP Request Sampler named ‘JCG Home’. It will point to the server www.javacodegeeks.com. For more details on creating ThreadGroup and related elements, you can view the article JMeter Thread Group Example. The below image shows the configured ThreadGroup (Single User) and HTTP Request Sampler (JCG Home).

JCG Home Sampler
JCG Home Sampler

Next, we will apply the regex on the response body (main sample). When the test is executed, it will ping the web site named www.javacodegeeks.com and return the response data which is a HTML page. This HTML web page contains JCG articles, the title of which is wrapped in a <h2> tag. We will write a regular expression that will match the first <h2> tag and extract the URL of the article. The URL will be part of an anchor <a> tag. Right click on JCG Home sampler and select Add -> Post Processors -> Regular Expression Extractor.

JMeter Regex Extractor
JMeter Regex Extractor

The name of our extractor is ‘JCG Article URL Extractor’. We will apply the regex to the main sample and directly on the response body (HTML page). The Reference Name or variable name provided is ‘article_url’. The regex used is <h2 .+?><a href="http://(.+?)".+?</h2>. We will not go into the details of the regex as this is a different discussion thread altogether. In a nutshell, this regex will find or match the first <h2> tag and extract the URL from the anchor tag. It will strip the word http:// and extract only the server part of the URL. The extractor itself is placed in a parenthesis () forming our first group. The Template field is set with the value of $1$ that points to our first group (the URL) and the Match No. field indicates the first match. The Default Value set is the ‘error’. So if our regex fails to match then the variable article_url will hold the value ‘error’. If the regex makes a successful match, then the article URL will be stored in the article_url variable.

We will use this article_url variable in another HTTP Request sampler named JCG Article. Right click on Single User ThreadGroup and select Add -> Sampler -> HTTP Request.

JCG Article
JCG Article

As you can see from the above, the server name is ${article_url} which is nothing but the URL that was extracted from the previous sampler using regex. You can verify the results by running the test.

2.2. View Test Results

To view the test results, we will configure the View Results Tree listener. But before we do that, we will add a Debug Sampler to see the variable and its value being generated upon executing the test. This will help you understand whether your regex successfully matched an expression or failed. Right click on Single User ThreadGroup and select Add -> Sampler -> Debug Sampler.

Debug Sampler
Debug Sampler

As we want to debug the generated variables, set the JMeter variables field to True. Next, we will view and verify test results using View Results Tree listener. Right click on Single User ThreadGroup and select Add -> Listener -> View Results Tree.

View Debug Result
View Debug Result

First let’s look at the output of Debug Sampler response data. It shows our variable article_url and observe the value which is the URL that we extracted. The test has also generated group variables viz. article_url_g0 and article__url_g1. The group 0 is a regular general match and group 1 is the string that is extracted from the general match. This string is also stored in our article_url variable. The variable named article_url_g tells you the no. of groups in the regex. Our regex contained only 1 group (note the sole parenthesis () in our regex). Now lets look at the result of our JCG Article sampler:

View JCG Article Result
View JCG Article Result

The JCG Article sampler successfully made the request to the server URL that was extracted using regex. The server URL was referenced using ${article_url} expression.

3. Conclusion

The regular expression extractor in JMeter is one of the significant feature that can help parse different types of values on different types of response indicators. These values are stored in variables that can be used as references in other threads of the test plan. The ability to devise groups in the regex, capturing portions of matches makes it even more a powerful feature. Regular expression is best used when you need to parse the text and apply it dynamically to subsequent threads in your test plan. The objective of the article was to highlight the significance of Regular Expression Extractor and its application in the test execution.

Rajeev Hathi

Rajeev is a senior Java architect and developer. He has been designing and developing business applications for various companies (both product and services). He is co-author of the book titled 'Apache CXF Web Service Development' and shares his technical knowledge through his blog platform techorgan.com
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Back to top button