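In this tutorial we are going to see how to validate the format of an HTML tag. In general, validating HTML with regular expressions is not the optimal method; you should use an HTML parser for that. But when you want to quickly validate the basic format of a tag in your application, regular expressions will do. The basic policy for the HTML tag format, as encoded by the expression below, is: the tag must begin with "<", end with ">", and contain in between only double-quoted strings, single-quoted strings, or characters other than quotes and ">".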
This is the regular expression we are going to use for HTML tag format validation:
<(\"[^\"]*\"|'[^']*'|[^'\">])*>
You can take a look at the Pattern class documentation to learn how to construct your own regular expressions according to your policy.
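To make the parts of the expression easier to follow, here is a minimal sketch that annotates each alternative and tries it on a few inputs. The class name HtmlTagPatternDemo and the helper isTag are hypothetical names chosen for this illustration; only the expression itself comes from the tutorial.

```java
import java.util.regex.Pattern;

class HtmlTagPatternDemo {

    // Same expression as above, written as a Java string literal:
    //   <            opening angle bracket
    //   (  "[^"]*"   a double-quoted value (may contain '>' and single quotes)
    //    | '[^']*'   a single-quoted value (may contain '>' and double quotes)
    //    | [^'">]    any other character except a quote or '>'
    //   )*           repeated zero or more times
    //   >            closing angle bracket
    static final Pattern TAG = Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // Hypothetical helper: true only if the whole string is one tag.
    static boolean isTag(String s) {
        return TAG.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTag("<a href=\"x>y\">")); // true: '>' inside quotes is allowed
        System.out.println(isTag("<a href=\"x>"));     // false: the quote is never closed
        System.out.println(isTag("<>"));               // true: the group may repeat zero times
    }
}
```

The last case shows that the expression only checks the bracket and quote structure; it does not check that a tag name is present, which is one reason a parser is the better tool for anything beyond a quick format check.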
1. Validator class
This is the class that we are going to use for HTML tag format validation.
HtmlTagValidator.java:
package com.javacodegeeks.java.core;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlTagValidator {

    private Pattern pattern;
    private Matcher matcher;

    // "<", then any mix of quoted strings and non-quote, non-">" characters, then ">"
    private static final String HTML_TAG_FORMAT_PATTERN = "<(\"[^\"]*\"|'[^']*'|[^'\">])*>";

    public HtmlTagValidator() {
        // Compile the expression once and reuse it for every validation
        pattern = Pattern.compile(HTML_TAG_FORMAT_PATTERN);
    }

    // Returns true only if the whole input string is a single well-formed tag
    public boolean validate(final String tag) {
        matcher = pattern.matcher(tag);
        return matcher.matches();
    }
}
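The validate method above calls matches(), which requires the entire input to fit the expression. As a rough illustration of why that matters, the sketch below compares matches() with find() on the same pattern. The class MatchesVsFindDemo and its two helper methods are hypothetical names introduced only for this example.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {

    // Same expression as HTML_TAG_FORMAT_PATTERN in the validator
    static final Pattern TAG = Pattern.compile("<(\"[^\"]*\"|'[^']*'|[^'\">])*>");

    // matches(): the whole string must be exactly one tag
    static boolean wholeStringIsTag(String s) {
        return TAG.matcher(s).matches();
    }

    // find(): it is enough that some substring looks like a tag
    static boolean containsTag(String s) {
        return TAG.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "<input => />";
        System.out.println(wholeStringIsTag(input)); // false: text remains after the first '>'
        System.out.println(containsTag(input));      // true: the substring "<input =>" matches
    }
}
```

With find(), an input such as "<input => />" would be accepted because a valid-looking tag occurs inside it, so matches() is the appropriate call when the goal is to validate a single tag.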
2. Unit Testing our HtmlTagValidator class
For unit testing we are going to use JUnit. Unit testing is very important in these situations because it provides good feedback about the correctness of our regular expressions. You can test your program and make sure that your regular expression meets the rules of your policy for the HTML tag format.
This is a basic test class:
HtmlTagValidatorTest.java:
package com.javacodegeeks.java.core;

import static org.junit.Assert.*;

import java.util.Arrays;
import java.util.Collection;

import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class HtmlTagValidatorTest {

    private String arg;
    private static HtmlTagValidator htmlTagValidator;
    private Boolean expectedValidation;

    public HtmlTagValidatorTest(String str, Boolean expectedValidation) {
        this.arg = str;
        this.expectedValidation = expectedValidation;
    }

    @BeforeClass
    public static void initialize() {
        htmlTagValidator = new HtmlTagValidator();
    }

    @Parameters
    public static Collection<Object[]> data() {
        Object[][] data = new Object[][] {
            { "<'br />", false },              // wrong format
            { "img src=\"ar.jpg\">", false },  // wrong format
            { "<input => />", false },         // wrong format
            { "<br />", true },
            { "<img src=\"a.png\" />", true },
            { "</a>", true }
        };
        return Arrays.asList(data);
    }

    @Test
    public void test() {
        Boolean res = htmlTagValidator.validate(this.arg);
        String validv = (res) ? "valid" : "invalid";
        System.out.println("HTML tag Format " + arg + " is " + validv);
        assertEquals("Result", this.expectedValidation, res);
    }
}
Output:
HTML tag Format <'br /> is invalid
HTML tag Format img src="ar.jpg"> is invalid
HTML tag Format <input => /> is invalid
HTML tag Format <br /> is valid
HTML tag Format <img src="a.png" /> is valid
HTML tag Format </a> is valid
This was an example of how to validate the HTML tag format with a Java regular expression.