Wednesday, May 4, 2011

Common symbols for regex patterns in Java

Basic regular expression symbols

Let X and Z be two regular expressions to search for.

Symbol Description
. Matches any single character (by default, any character except a line terminator)
^X X must match at the beginning of the line (in Java, the beginning of the whole input unless Pattern.MULTILINE is set)
X$ X must match at the end of the line (in Java, the end of the whole input unless Pattern.MULTILINE is set)
[abc] Set definition: matches the letter a, b, or c. Note that it matches only one character.
[^abc] When "^" appears as the first character inside [], it negates the set: matches any single character except a, b, or c.
[abc[vz]] Union of two sets: matches any one of a, b, c, v, or z (same as [abcvz]). To match a, b, or c followed by either v or z, write [abc][vz].
[a-d] Range: matches one character between a and d. The range is inclusive, so it covers a, b, c, and d.
[a-d1-3] Two ranges in one set: matches one character that is either a letter from a to d or a digit from 1 to 3 (1, 2, or 3).
X|Z Finds X or Z
XZ Finds X directly followed by Z
$ Checks that a line end follows
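
To make the table above concrete, here is a small sketch that runs a few of these symbols through java.util.regex. The class name CharClassDemo, the helper found, and the sample strings are made up for illustration; only Pattern, Matcher.find, and String.matches come from the JDK.

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // Hypothetical helper: true if regex is found anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^cat", "catalog"));   // true: "cat" is at the start
        System.out.println(found("cat$", "tomcat"));    // true: "cat" is at the end
        System.out.println(found("[^abc]", "abc"));     // false: every character is a, b, or c
        System.out.println("v".matches("[abc[vz]]"));   // true: nested set is a union
        System.out.println("av".matches("[abc][vz]"));  // true: one of abc, then one of vz
        System.out.println("4".matches("[a-d1-3]"));    // false: 4 is outside 1-3
        System.out.println(found("cat|dog", "hotdog")); // true: the "dog" alternative matches
    }
}
```

Note that String.matches requires the whole string to match, while Matcher.find looks for a match anywhere in the input.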

Common predefined patterns

\d any digit
\D any non digit
\w any word character
\W any non-word character
\s any whitespace character
\S any non-whitespace character
\S+ one or more non-whitespace characters
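
As a quick illustration of the predefined patterns, the sketch below collects all matches of a pattern from a sample string. The class PredefinedDemo, the helper findAll, and the input text are assumptions for this example, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PredefinedDemo {
    // Hypothetical helper: collects every substring of input matched by regex, in order.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "room 42, floor 7";
        System.out.println(findAll("\\d+", input)); // [42, 7]
        System.out.println(findAll("\\w+", input)); // [room, 42, floor, 7]
        System.out.println(findAll("\\S+", input)); // [room, 42,, floor, 7]
    }
}
```

The last line shows the difference between \w and \S: the comma is not a word character, but it is a non-whitespace character, so it stays attached to "42".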

 

Quantifiers

Symbol Description Example
* Occurs zero or more times; short for {0,}. X* - finds zero or more occurrences of X, .* - any character sequence
+ Occurs one or more times; short for {1,}. X+ - finds one or more occurrences of X
? Occurs zero or one time; short for {0,1}. X? - finds zero or exactly one occurrence of X
{n} Occurs exactly n times; the number in braces is the repetition count for the preceding element. \d{3} - three digits, .{10} - any character sequence of length 10
{n,m} Occurs between n and m times. \d{1,4} - \d must occur at least once and at most four times
*? A ? after a quantifier makes it a "reluctant quantifier": it tries to find the smallest match. .*? - the shortest character sequence that still lets the rest of the pattern match
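
The difference between a greedy and a reluctant quantifier is easiest to see side by side. The following is a minimal sketch; QuantifierDemo, firstMatch, and the sample markup are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Hypothetical helper: returns the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "<b>one</b><b>two</b>";
        // Greedy: .* grabs as much as possible, so the match runs to the last </b>.
        System.out.println(firstMatch("<b>.*</b>", text));  // <b>one</b><b>two</b>
        // Reluctant: .*? stops at the first </b> that completes the match.
        System.out.println(firstMatch("<b>.*?</b>", text)); // <b>one</b>

        System.out.println("555".matches("\\d{3}"));     // true: exactly three digits
        System.out.println("2011".matches("\\d{1,4}"));  // true: between one and four digits
        System.out.println("20110".matches("\\d{1,4}")); // false: five digits
    }
}
```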

 

Note:

The backslash is an escape character in Java string literals, i.e. it already has a predefined meaning in Java. You have to write "\\" in source code to get a single backslash in the string. So if you want to use \w in your regex, you must write "\\w". If you want to match a backslash as a literal character, you have to type "\\\\": the backslash is also an escape character in regular expressions, so the regex itself needs \\, and each of those two backslashes must be escaped again in the Java string literal.
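
The two levels of escaping can be checked directly. This is only a sketch: the class EscapeDemo, its constants, and the sample path are assumptions for illustration.

```java
final class EscapeDemo {
    // Java source "\\w+" is the 3-character regex \w+
    static final String WORD = "\\w+";
    // Java source "\\\\" is the 2-character regex \\, which matches one literal backslash
    static final String BACKSLASH = "\\\\";

    public static void main(String[] args) {
        String path = "C:\\temp"; // the 7 characters C:\temp
        System.out.println(path.length());                   // 7
        System.out.println(BACKSLASH.length());              // 2
        System.out.println("hello_1".matches(WORD));         // true
        System.out.println(path.replaceAll(BACKSLASH, "/")); // C:/temp
    }
}
```

When the text to match comes from user input or a variable, Pattern.quote(...) can be used to escape it as a literal instead of adding the backslashes by hand.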
