A regular expression parsing and matching engine, written in C.
This repo goes with a blog post on the history and implementation of Regular Expressions:
The engine supports the following regular expression syntax:
- Literal characters
- Quantifiers: `*`, `+`, `?`
- Groups: `(...)`
- Alternation: `|`
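Notably, it does not support backreferences, lookaheads, or lookbehinds. In the theory of computation, regular expressions describe exactly the regular languages, which can be recognized by finite automata. A matcher built on finite automata runs in time linear in the length of the input for a fixed pattern. Backreferences (as are present in many modern regex engines) increase the expressive power beyond the regular languages, and lookarounds, while they do not by themselves take a pattern outside the regular languages, complicate automaton-based matching. Engines that support these features typically rely on backtracking, which can take exponential time in the worst case.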
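To illustrate why automaton-based matching stays linear in the input, here is a minimal sketch of simulating a nondeterministic finite automaton by tracking the set of active states. It is not taken from this repo's source: the automaton is hand-built for the single pattern `(a|b)*abb`, and the names `step_state` and `nfa_full_match` are hypothetical, chosen for illustration only.

```c
#include <stdbool.h>

/* Hand-built NFA for the pattern (a|b)*abb (an illustrative assumption,
 * not this repo's data structures):
 *   state 0: 'a' -> {0, 1}, 'b' -> {0}
 *   state 1: 'b' -> {2}
 *   state 2: 'b' -> {3}
 *   state 3: accepting, no outgoing transitions
 */
enum { NUM_STATES = 4, ACCEPT_STATE = 3 };

/* Returns a bitmask of the states reachable from `state` on character `c`. */
static unsigned step_state(int state, char c) {
    switch (state) {
    case 0:
        if (c == 'a') return (1u << 0) | (1u << 1);
        if (c == 'b') return 1u << 0;
        return 0;
    case 1:
        return c == 'b' ? 1u << 2 : 0;
    case 2:
        return c == 'b' ? 1u << 3 : 0;
    default:
        return 0;
    }
}

/* Returns true if the whole input matches (a|b)*abb.
 * Each input character is examined once, and the work per character is
 * bounded by the number of states, so there is no backtracking. */
bool nfa_full_match(const char *input) {
    unsigned current = 1u << 0; /* start with only state 0 active */
    for (const char *p = input; *p != '\0'; p++) {
        unsigned next = 0;
        for (int s = 0; s < NUM_STATES; s++) {
            if (current & (1u << s)) {
                next |= step_state(s, *p);
            }
        }
        current = next;
        if (current == 0) {
            return false; /* no active states left, so no match is possible */
        }
    }
    return (current & (1u << ACCEPT_STATE)) != 0;
}
```

With this sketch, `nfa_full_match("aabb")` returns true and `nfa_full_match("abba")` returns false. Because the set of active states fits in a fixed-size bitmask, the running time grows linearly with the input length, whereas a backtracking matcher may revisit the same input position many times.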
This project is licensed under the MIT License. See the LICENSE file for details.