#Regular Expressions Engine

In this kata you have to reimplement JavaScript's native RegExp class.

##Regular expressions syntax (https://msdn.microsoft.com/en-us/library/ae5bf541(v=vs.90).aspx)

A regular expression is a text pattern that consists of simple and special characters.

###Simple Characters

Any character that is not a special one. It matches itself exactly.

###Special Characters

  • \<character> Turns a special character into a simple one, and some simple characters into special ones. Any other simple character preceded by \ still matches itself exactly.
    • \f \n \r \t \v \\ Match a form feed, newline, carriage return, tab, vertical tab and the backslash itself, respectively.
    • \b Matches a word boundary, that is, the position between a word character and a nonword character (such as a space) or an end of the string.
    • \B Matches a nonword boundary.
    • \d Matches a digit character.
    • \D Matches a nondigit character.
    • \s Matches any white space character including space, tab, form-feed, and so on.
    • \S Matches any non-white space character.
    • \w Matches any word character including underscore.
    • \W Matches any nonword character.
    • \xNN Matches the character whose code is NN, where NN is a two-digit hexadecimal value.
    • \uNNNN Matches the character whose code is NNNN, where NNNN is a four-digit hexadecimal Unicode value.
    • \N Identifies either an octal escape value or a backreference. If \N is preceded by at least N capturing subexpressions (see below), it is a backreference to the Nth one. Otherwise, it is an octal escape value if N is an octal digit (0-7).
  • ^ Matches the position at the beginning of the input string. If the RegExp object's multiline property is set, ^ also matches the position following \n or \r.
  • $ Matches the position at the end of the input string. If the RegExp object's multiline property is set, $ also matches the position preceding \n or \r.
  • Quantifiers. Define the number of repetitions of the preceding character or subexpression.
    • <character>* Matches zero or more times.
    • <character>+ Matches one or more times.
    • <character>? Matches zero or one time.
    • <character>{N} Matches exactly N times.
    • <character>{N,} Matches at least N times.
    • <character>{N,M} Matches at least N and at most M times.
  • <quantifier>? When this character immediately follows any quantifier, the quantified pattern becomes non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible.
  • . Matches any single character except \n.
  • Subexpressions.
    • (<pattern>) A subexpression that matches pattern and captures the match.
    • (?:<pattern>) A subexpression that matches pattern but does not capture the match, that is, it is a non-capturing match that is not stored for possible later use.
    • (?=<pattern>) A subexpression that performs a positive lookahead search, which matches the string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately following the last match, not after the characters that comprised the lookahead.
    • (?!<pattern>) A subexpression that performs a negative lookahead search, which matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately following the last match, not after the characters that comprised the lookahead.
  • <pattern1>|<pattern2> Matches either pattern1 or pattern2.
  • [<characters>] A character set. Matches any one of the enclosed characters. [<start character>-<end character>] makes a range which matches any character whose code is between the start character's code and the end character's code, both inclusive. Ranges and sets can be combined. [^<characters>] matches any character not in the set. \b inside a set matches the backspace character (\u0008), not a word boundary.
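
Several of the rules above are easier to check against concrete inputs. The snippet below is a small sketch that runs the native RegExp, which this kata asks you to reimplement, so its results can serve as reference values. The variable names are illustrative only and are not part of the kata.

```javascript
// Reference behaviour taken from the native RegExp; a reimplementation
// is expected to agree with these results.

// Greedy vs. non-greedy quantifiers on the same input.
const greedy = /a+/.exec("aaa")[0];  // matches as much as possible
const lazy = /a+?/.exec("aaa")[0];   // matches as little as possible

// Lookaheads check what follows without consuming it.
const positive = /foo(?=bar)/.exec("foobar")[0];
const negative = /foo(?!bar)/.test("foobar");

// Capturing subexpression with a backreference:
// \1 must repeat exactly what group 1 captured.
const doubled = /(\w)\1/.exec("hello");

// Ranges inside a character set include both end points.
const rangeEnds = /^[a-c]+$/.test("abc");

// Inside a set \b is a backspace (U+0008); outside it is a word boundary.
const backspace = /[\b]/.test("\u0008");
const boundary = /\bcat\b/.test("a cat sat");

console.log(greedy, lazy, positive, negative);
console.log(doubled[0], doubled[1], rangeEnds, backspace, boundary);
```

Running the same patterns through your own class and comparing the results is one simple way to spot divergences early.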

##RegExp methods and properties

Constructor

new RegExp([pattern[,flags]]);

Pattern

A string containing a regular expression in the syntax described above. Remember to escape every \ character when the pattern is written as a string literal.
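
The escaping rule is easy to get wrong, so here is a short illustration using the native RegExp. In a JavaScript string literal a single backslash before `d` is dropped, which silently changes the pattern. The variable names are only for illustration.

```javascript
// "\d+" in a string literal collapses to "d+", so the backslash is lost.
const unescaped = new RegExp("\d+");
// "\\d+" keeps one backslash in the resulting string, giving the pattern \d+.
const escaped = new RegExp("\\d+");

console.log(unescaped.source);          // d+
console.log(escaped.source);            // \d+
console.log(unescaped.test("abc123"));  // false: there is no letter "d"
console.log(escaped.test("abc123"));    // true: "123" is a run of digits
```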

Flags

A string containing a set of flags, in any order, that set the corresponding RegExp properties.

  • g - global
  • m - multiline
  • i - ignoreCase
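
One possible way to turn the flags string into the three boolean properties is sketched below. `parseFlags` is a hypothetical helper, not something the kata prescribes. Rejecting unknown or repeated flags with a SyntaxError mirrors the native constructor, but whether the kata's tests require it is an assumption.

```javascript
// Hypothetical helper: map a flags string such as "gi" to property values.
function parseFlags(flags = "") {
  // Flag letter -> property name, as listed in the kata description.
  const names = new Map([
    ["g", "global"],
    ["m", "multiline"],
    ["i", "ignoreCase"],
  ]);
  const result = { global: false, multiline: false, ignoreCase: false };

  for (const flag of flags) {
    const name = names.get(flag);
    // Assumption: follow the native constructor and reject bad input.
    if (name === undefined) {
      throw new SyntaxError("Unknown flag: " + flag);
    }
    if (result[name]) {
      throw new SyntaxError("Duplicate flag: " + flag);
    }
    result[name] = true;
  }
  return result;
}

console.log(parseFlags("ig"));
```

Because the loop looks at one letter at a time, the order of the flags in the string does not matter.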

Properties

  • source A string containing the pattern as it was provided to the constructor.
  • lastIndex Specifies the index from which the next search starts.
  • global See the exec method.
  • multiline See ^ and $ in the Regular expressions syntax section.
  • ignoreCase Specifies whether matching should ignore case.

Methods

  • exec(str) Searches the specified string for a match. If a matching substring is found, returns an array containing that substring followed by every captured subexpression. If nothing is found, returns null. If the global flag is set and a match is found, stores the index just past the end of the matching substring in lastIndex; otherwise sets lastIndex to 0.
  • test(str) Checks whether the specified string contains any match. Returns true or false.
  • toString() Returns the expression as a JavaScript RegExp literal.
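
The interaction between exec, the global flag and lastIndex is the subtle part of these methods. The sketch below walks through it with the native RegExp as the reference. The sample text and variable names are illustrative.

```javascript
const re = new RegExp("\\d+", "g");
const text = "a1 b22";

// With the global flag each call resumes from lastIndex.
const first = re.exec(text);      // finds "1" at index 1
const afterFirst = re.lastIndex;  // index just past "1"

const second = re.exec(text);     // finds "22" at index 4
const afterSecond = re.lastIndex; // index just past "22"

const third = re.exec(text);      // nothing left to find
const afterThird = re.lastIndex;  // reset after the failed search

// The result array holds the whole match first, then each captured group.
const groups = new RegExp("(\\d)(\\w)").exec("x1y");

// toString() renders the pattern and flags as a literal.
const literal = re.toString();

console.log(first[0], afterFirst, second[0], afterSecond, third, afterThird);
console.log(groups[0], groups[1], groups[2], literal);
```
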

Stats:

Created: Oct 5, 2015
Published: Oct 5, 2015
Contributors
  • Freywar