Understanding regular expressions
Regular expressions let users who grade certain question types evaluate responses against a set of acceptable values. A regular expression combines alphanumeric characters and metacharacters into a pattern that describes one or more strings that must be matched exactly within a body of text.
Note: You can use regular expressions in short answer, multi-short answer, arithmetic, significant figures, and fill in the blanks questions.
Regular expressions examples
Question 1 A _____ wags his tail. He eats dog _______ twice a day.
Answer 1 Blank 1 = [Dd]og. Blank 2 = [Ff]ood
Question 2 The classic movie Jurassic Park was directed by Steven ________, who also directed Indiana Jones and the Raiders of the Lost Ark.
Answer 2 [Ss]pielberg
Question 3 What word describes red, blue, green, yellow, pink, etc.?
Answer 3 colou?r*
Question 4 What kind of animal meows?
Answer 4 [Cc]at.
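
As a rough illustration of how answer patterns like these behave, the sketch below checks sample responses with Python's re module. The helper name is_accepted is hypothetical, and requiring the whole response to match is an assumption; the grading tool's own matching rules and regex engine may differ.

```python
import re

def is_accepted(pattern: str, response: str) -> bool:
    """Return True when the entire (trimmed) response matches the pattern.

    Whole-response matching is an assumption made for this sketch.
    """
    return re.fullmatch(pattern, response.strip()) is not None

# Answer patterns taken from the example questions above.
print(is_accepted(r"[Dd]og", "Dog"))              # True
print(is_accepted(r"[Dd]og", "dog"))              # True
print(is_accepted(r"[Ss]pielberg", "spielberg"))  # True
print(is_accepted(r"[Ff]ood", "FOOD"))            # False: only the first letter may vary in case
```

A character set such as [Dd] accepts exactly one of the listed characters, so it tolerates a capitalized first letter without making the whole answer case insensitive.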
Metacharacter descriptions and functions
Character  Description  Example 

\ 
Marks the next character as a special character, a literal, a back reference, or an octal escape. 
The sequence '\\' matches "\" and "\(" matches "(". n matches the character n. \n matches a newline character. 
^ 
Matches the position at the beginning of the input string. If the RegExp object’s Multiline property is set, ^ also matches the position following '\n' or '\r'. 
^cat matches strings that begin with cat 
$ 
Matches the position at the end of the input string. If the RegExp object’s Multiline property is set, $ also matches the position preceding '\n' or '\r'. 
cat$ matches any string that ends with cat 
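
The anchor examples above can be tried out with a short sketch in Python's re module. Python is an assumption here (the table describes a VBScript/JScript-style engine), and Python's multiline flag treats only '\n' as a line break.

```python
import re

begins_with_cat = re.compile(r"^cat")
ends_with_cat = re.compile(r"cat$")

print(begins_with_cat.search("catalog") is not None)  # True
print(begins_with_cat.search("bobcat") is not None)   # False
print(ends_with_cat.search("bobcat") is not None)     # True
print(ends_with_cat.search("catalog") is not None)    # False

# With the multiline flag set, ^ also matches just after a newline.
print(re.findall(r"^cat", "dog\ncat", flags=re.MULTILINE))  # ['cat']
print(re.findall(r"^cat", "dog\ncat"))                      # []
```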
* 
Matches the preceding character or subexpression zero or more times. * equals {0,} 
be* matches b, be, or beeeeeeeeee. zo* matches z and zoo.
+ 
Matches the preceding character or subexpression one or more times. + equals {1,}. 
be+ matches be or bee but not b 
? 
Matches the preceding character or subexpression zero or one time. ? equals {0,1} 
abc? matches ab or abc. colou?r matches color or colour but not colouur. do(es)? matches the do in do or does.
? 
When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is nongreedy. A nongreedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. 
In the string oooo, o+? matches a single o, while o+ matches all os. 
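
The quantifier entries above can be checked with a minimal sketch, again assuming Python's re module behaves like the engine described here for these constructs.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches_fully(r"be*", "b"))            # True: zero e's are allowed
print(matches_fully(r"be+", "b"))            # False: at least one e is required
print(matches_fully(r"colou?r", "colour"))   # True
print(matches_fully(r"colou?r", "colouur"))  # False: ? allows at most one u

# Greedy versus nongreedy matching on the string oooo.
print(re.search(r"o+", "oooo").group())   # oooo
print(re.search(r"o+?", "oooo").group())  # o
```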
() 
Parentheses create a substring or item that you can apply metacharacters to. 
a(bee)?t matches at or abeet but not abet 
{n} 
n is a nonnegative integer. Matches exactly n times. 
[0-9]{3} matches any three digits. o{2} does not match the o in Bob, but matches the two os in food. b{4} matches bbbb. 
{n,} 
n is a nonnegative integer. Matches at least n times. 
[0-9]{3,} matches any three or more digits. o{2,} does not match the "o" in "Bob" and matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'. 
{n,m} 
m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. Note: You cannot put a space between the comma and the numbers. 
[0-9]{3,5} matches any three, four, or five digits. "o{1,3}" matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. c{2,4} matches cc, ccc, or cccc. 
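
A short sketch of the three counted-repetition forms, assuming Python's re module: {n} means exactly n repetitions, {n,} means n or more, and {n,m} means between n and m.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches_fully(r"[0-9]{3}", "123"))       # True: exactly three digits
print(matches_fully(r"[0-9]{3}", "1234"))      # False: four digits is too many
print(matches_fully(r"[0-9]{3,}", "12345"))    # True: three or more digits
print(matches_fully(r"[0-9]{3,5}", "123456"))  # False: more than five digits

# {1,3} is greedy, so it takes the first three o's.
print(re.search(r"o{1,3}", "fooooood").group())  # ooo
```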
. 
Matches any single character except "\n". To match any character including the '\n', use a pattern such as '[\s\S]'. 
cat. matches catT and cat2 but not catty 
(?i) 
Makes the remainder of the regular expression case insensitive. 
ca(?i)se matches caSE but not CASE 
(pattern) 
Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0 through $9 properties in JScript. To match parentheses characters ( ), use '\(' or '\)'. 
(jam){2} matches jamjam. First group matches jam. 
(?:pattern) 
Matches pattern but does not capture the match; that is, it is a noncapturing match that is not stored for possible later use. This is useful for combining parts of a pattern with the "or" character (|). 
'industr(?:y|ies)' is a more economical expression than 'industry|industries'. 
(?=pattern) 
Positive lookahead matches the search string at any point where a string matching pattern begins. This is a noncapturing match, that is, the match is not captured for possible later use. Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately following the last match, not after the characters that comprised the lookahead. 
'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". 
(?!pattern) 
Negative lookahead matches the search string at any point where a string not matching pattern begins. This is a noncapturing match, that is, the match is not captured for possible later use. Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately following the last match, not after the characters that comprised the lookahead. 
'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1" but does not match "Windows" in "Windows 2000". 
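
The two lookahead entries can be sketched as follows, assuming Python's re module. Note that the matched text stops before the version string, because a lookahead checks what follows without consuming it.

```python
import re

positive = re.compile(r"Windows (?=95|98|NT|2000)")
negative = re.compile(r"Windows (?!95|98|NT|2000)")

# The match covers "Windows " only; "2000" is checked but not consumed.
print(repr(positive.search("Windows 2000").group()))  # 'Windows '
print(positive.search("Windows 3.1"))                 # None

print(repr(negative.search("Windows 3.1").group()))   # 'Windows '
print(negative.search("Windows 2000"))                # None
```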
x|y 
Matches x or y. 
July (first|1st|1) matches July 1st but not July 2. 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food". 
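
Alternation and grouping can be illustrated with a minimal sketch, assuming Python's re module and whole-string matching. It also shows that a noncapturing group (?:...) groups alternatives without storing a captured match.

```python
import re

def matches_fully(pattern: str, text: str) -> bool:
    """True when the pattern matches the whole text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

print(matches_fully(r"July (first|1st|1)", "July 1st"))  # True
print(matches_fully(r"July (first|1st|1)", "July 2"))    # False

# Without parentheses, | splits the whole pattern: "z" or "food".
print(matches_fully(r"z|food", "zood"))    # False
print(matches_fully(r"(z|f)ood", "zood"))  # True

# A capturing group stores its match; a noncapturing group does not.
print(re.fullmatch(r"industr(y|ies)", "industries").groups())    # ('ies',)
print(re.fullmatch(r"industr(?:y|ies)", "industries").groups())  # ()
```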
[xyz] 
A character set. Matches any one of the enclosed characters. 
gr[ae]y matches gray or grey. '[abc]' matches the 'a' in "plain". 
[^xyz] 
A negative character set. Matches any character not enclosed. 
1[^02] matches 13 or 11 but not 10 or 12. '[^abc]' matches the 'p' in "plain". 
[a-z] 
A range of characters. Matches any character in the specified range. 
[1-9] matches any single digit except 0. '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'. 
[^a-z] 
A negative range of characters. Matches any character not in the specified range. 
'[^a-z]' matches any character not in the range 'a' through 'z'. 
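
The character set and range entries can be exercised with a short sketch, assuming Python's re module. The sample string "A1b2c" is made up for illustration.

```python
import re

# Character sets: one of the listed characters.
print(re.fullmatch(r"gr[ae]y", "grey") is not None)  # True
print(re.search(r"[abc]", "plain").group())          # a

# Negative sets: any one character that is not listed.
print(re.search(r"[^abc]", "plain").group())         # p
print(re.fullmatch(r"1[^02]", "13") is not None)     # True
print(re.fullmatch(r"1[^02]", "10") is not None)     # False

# Ranges and negative ranges.
print(re.findall(r"[a-z]", "A1b2c"))   # ['b', 'c']
print(re.findall(r"[^a-z]", "A1b2c"))  # ['A', '1', '2']
```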
\b 
Matches a word boundary: the position between a word and a space. 
'er\b' matches the 'er' in "never" but not the 'er' in "verb". 
\B 
Matches a nonword boundary. 
'er\B' matches the 'er' in "verb" but not the 'er' in "never". 
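
The word-boundary examples above can be verified with a minimal sketch, assuming Python's re module. The end of the string counts as a boundary when the last character is a word character.

```python
import re

# \b: "er" must be followed by a word boundary.
print(re.search(r"er\b", "never") is not None)  # True: "er" ends the word
print(re.search(r"er\b", "verb") is not None)   # False: "er" is followed by "b"

# \B: "er" must not be followed by a word boundary.
print(re.search(r"er\B", "verb") is not None)   # True
print(re.search(r"er\B", "never") is not None)  # False
```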
\cx 
Matches the control character indicated by x. The value of x must be in the range of A-Z or a-z. If not, c is assumed to be a literal 'c' character. 
\cM matches a Control-M or carriage return character. 
\d 
Matches a digit character. Equivalent to [0-9]. 

\D 
Matches a nondigit character. Equivalent to [^0-9]. 

\f 
Matches a formfeed character. Equivalent to \x0c and \cL 

\n 
Matches a newline character. Equivalent to \x0a and \cJ 

\r 
Matches a carriage return character. Equivalent to \x0d and \cM 

\s 
Matches any white space character including space, tab, formfeed, etc. Equivalent to [ \f\n\r\t\v] 
Can be combined in the same way as [\d\s], which matches a character that is a digit or whitespace. 
\S 
Matches any nonwhite space character. Equivalent to [^ \f\n\r\t\v] 

\t 
Matches a tab character. Equivalent to \x09 and \cI 

\v 
Matches a vertical tab character. Equivalent to \x0b and \cK 

\w 
Matches any word character including underscore. Equivalent to '[A-Za-z0-9_]'. 

\W 
Matches any nonword character. Equivalent to '[^A-Za-z0-9_]'. You should only use \D, \W and \S outside character classes. 

\Z 
Matches the end of the string the regular expression is applied to. Matches a position, but never matches before line breaks. 
.\Z matches k in jol\nhok 
\xn 
Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. Allows ASCII codes to be used in regular expressions. 
'\x41' matches "A". '\x041' is equivalent to '\x04' & "1" 
\num 
Matches num, where num is a positive integer. A reference back to captured matches. 
'(.)\1' matches two consecutive identical characters 
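
A backreference can be sketched as follows, assuming Python's re module, where \1 refers back to whatever the first capturing group matched. The sample words are made up for illustration.

```python
import re

doubled = re.compile(r"(.)\1")  # any character followed by the same character

print(doubled.search("letter").group())  # tt
print(doubled.search("abc"))             # None: no character repeats

# A repeated capturing group keeps the text of its last repetition.
print(re.fullmatch(r"(jam){2}", "jamjam").group(1))  # jam
```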
\n 
Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, n is an octal escape value if n is an octal digit (0-7). 
“\11” and “\011” both match a tab character. “\0011” is the equivalent of “\001” & “1”. 
\nm 
Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by literal m. If neither of the preceding conditions exists, \nm matches octal escape value nm when n and m are octal digits (0-7). 

\nml 
Matches octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7). 

\un 
Matches n, where n is a Unicode character expressed as four hexadecimal digits. 
For example, \u00A9 matches the copyright symbol (©). 