Thursday, May 5, 2011

Regular Expressions: Using the Matcher Class

Except for splitting a string (see previous paragraph), you need to create a Matcher object from the Pattern object. The Matcher will do the actual work. The advantage of having two separate classes is that you can create many Matcher objects from a single Pattern object, and thus apply the regular expression to many subject strings simultaneously.

To create a Matcher object, call matcher() on your Pattern object like this:

Matcher myMatcher = myPattern.matcher("subject");


If you already created a Matcher object from the same pattern, call myMatcher.reset("newsubject") instead of creating a new matcher object, for reduced garbage and increased performance. Either way, myMatcher is now ready for duty.
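As a rough illustration of reusing one matcher, the sketch below compiles a pattern once and calls reset() for each new subject. The class name MatcherReuse, the method firstNumbers() and the digit pattern are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherReuse {
    // Hypothetical pattern for illustration: one or more digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in each subject, or "" if there is none.
    static String[] firstNumbers(String... subjects) {
        String[] results = new String[subjects.length];
        Matcher matcher = DIGITS.matcher("");   // create the matcher once
        for (int i = 0; i < subjects.length; i++) {
            matcher.reset(subjects[i]);         // point it at the next subject
            results[i] = matcher.find() ? matcher.group() : "";
        }
        return results;
    }

    public static void main(String[] args) {
        for (String s : firstNumbers("order 66", "no digits", "7 of 9")) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Keep in mind that a Matcher is not safe for use by several threads at once, so each thread should create its own matcher from the shared Pattern.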

To find the first match of the regex in the subject string, call myMatcher.find(). To find the next match, call myMatcher.find() again. Each call continues the search where the previous match ended. When myMatcher.find() returns false, there are no further matches. If you want to search the same string from the beginning again, call myMatcher.reset() explicitly rather than relying on find() to start over by itself after it fails.
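A typical way to walk through all matches is a while loop around find(). The following is a minimal sketch; the class FindAllMatches and its findAll() helper are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllMatches {
    // Collects the text of every match of regex in subject, in order.
    static List<String> findAll(String regex, String subject) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(subject);
        while (matcher.find()) {          // false means no further matches
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("[a-z]+", "one 2 three 4 five"));
    }
}
```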

The Matcher object holds the results of the last match. Call its methods start(), end() and group() to get details about the entire regex match and the matches between capturing parentheses. Each of these methods accepts a single int parameter indicating the number of the capturing group, counted from 1 by opening parenthesis. Omit the parameter, or pass 0, to get information about the entire regex match. start() is the index of the first character in the match. end() is the index of the first character after the match. Both are relative to the start of the subject string, so the length of the match is end() - start(). group() returns the string matched by the regular expression or by the pair of capturing parentheses.
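To make the index arithmetic concrete, here is a small sketch that reports start(), end() and group() for the whole match and for two capturing groups. The key=value pattern and the class name MatchDetails are assumptions chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchDetails {
    // Hypothetical pattern: a word, an equals sign, and a number.
    private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\d+)");

    // Describes the first match as "text [start,end)" for groups 0, 1 and 2.
    static String describe(String subject) {
        Matcher m = KEY_VALUE.matcher(subject);
        if (!m.find()) {
            return "no match";
        }
        return m.group() + " [" + m.start() + "," + m.end() + ")"
                + " key=" + m.group(1) + " [" + m.start(1) + "," + m.end(1) + ")"
                + " value=" + m.group(2) + " [" + m.start(2) + "," + m.end(2) + ")";
    }

    public static void main(String[] args) {
        // prints: width=42 [4,12) key=width [4,9) value=42 [10,12)
        System.out.println(describe("set width=42 now"));
    }
}
```

Calling start(), end() or group() before a successful match throws an IllegalStateException, which is why the sketch checks the result of find() first.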

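myMatcher.replaceAll("replacement") has exactly the same results as myString.replaceAll("regex", "replacement"). Again, the difference is speed: the Matcher works with a Pattern that is already compiled, while myString.replaceAll() compiles the regex on every call.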
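The two routes can be compared side by side. This sketch assumes a whitespace-collapsing regex and a made-up class name, ReplaceAllDemo; both methods should return the same string.

```java
import java.util.regex.Pattern;

public class ReplaceAllDemo {
    // Compiled once and reused for every call to viaMatcher().
    private static final Pattern SPACES = Pattern.compile("\\s+");

    // Replace through a Matcher created from the precompiled Pattern.
    static String viaMatcher(String subject) {
        return SPACES.matcher(subject).replaceAll(" ");
    }

    // Replace through String.replaceAll(), which compiles the regex each time.
    static String viaString(String subject) {
        return subject.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String subject = "a  b \t c";
        System.out.println(viaMatcher(subject).equals(viaString(subject)));
    }
}
```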

The Matcher class allows you to do a search-and-replace and compute the replacement text for each regex match in your own code. You can do this with the appendReplacement() and appendTail() methods. Here is how:

 

StringBuffer myStringBuffer = new StringBuffer();
myMatcher = myPattern.matcher("subject");
while (myMatcher.find()) {
    if (checkIfThisMatchShouldBeReplaced()) {
        // Copies the text since the last appended position, then the replacement.
        myMatcher.appendReplacement(myStringBuffer, computeReplacementString());
    }
}
// Copies whatever follows the last replaced match.
myMatcher.appendTail(myStringBuffer);

 

Obviously, checkIfThisMatchShouldBeReplaced() and computeReplacementString() are placeholders for methods that you supply. The first returns true or false indicating if a replacement should be made at all. Note that skipping replacements is way faster than replacing a match with exactly the same text as was matched. computeReplacementString() returns the actual replacement string.
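For a version you can run as is, the sketch below fills in the two placeholders with simple stand-ins: a match is replaced only if it is an even number, and the replacement is that number doubled. The class SelectiveReplace and the method doubleEvenNumbers() are invented for this example, and it assumes the numbers fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SelectiveReplace {
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Doubles every even number in the subject and leaves odd numbers alone.
    static String doubleEvenNumbers(String subject) {
        Matcher matcher = NUMBER.matcher(subject);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            int value = Integer.parseInt(matcher.group());
            if (value % 2 == 0) {                          // should this match be replaced?
                String replacement = String.valueOf(value * 2);
                // quoteReplacement() keeps '$' and '\' in the replacement literal.
                matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
            }
            // Skipped matches are copied unchanged by the next append call.
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // prints: 1 apple, 4 pears, 20 plums
        System.out.println(doubleEvenNumbers("1 apple, 2 pears, 10 plums"));
    }
}
```

appendReplacement() interprets $1, $2 and backslashes in the replacement string, so wrapping computed text in Matcher.quoteReplacement() is a reasonable precaution whenever the replacement is not meant to contain group references.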

Example


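import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Checks that a date looks like MM/DD/YYYY or, when isEnglish is true, DD/MM/YYYY.
// The separator may be '-', '/', a space, or omitted. Only the format is checked,
// so impossible dates such as 02/31/2011 still pass.
public static boolean checkDate(String date, boolean isEnglish) {
    String monthExpression = "(0[1-9]|1[0-2])";          // 01 to 12
    String dayExpression = "(0[1-9]|[12][0-9]|3[01])";   // 01 to 31
    String yearExpression = "(18|19|20|21)\\d{2}";       // 1800 to 2199
    String separator = "[-/ ]?";
    // Regex for the month-first (US) format.
    String expression = monthExpression + separator + dayExpression + separator + yearExpression;
    if (isEnglish) {
        // Regex for the day-first format.
        expression = dayExpression + separator + monthExpression + separator + yearExpression;
    }
    Pattern pattern = Pattern.compile(expression);
    Matcher matcher = pattern.matcher(date);
    // matches() succeeds only if the entire string matches the regex.
    return matcher.matches();
}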
