Skip to content

A didactic implementation of regex validation using FSM in python

Notifications You must be signed in to change notification settings

davidemesso/RegexPy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RegexPy

A didactic implementation of regex validation using FSM in python with backend parsing in C++

Content

  • regexPy Main package
    • regex.py Module containing the interface for the final user.
    • fsm.py The core module, handling FSMs creation and simulation, actually executing the regex matching.
    • utils Package
      • stack.py A simple stack data structure implementation used to simplify stack manipulation operations.
      • fsmutils.py The module containing usefull functions for manipulating FSMs in order to achieve a Regex.
    • tests Package containing unit tests.
  • Parser The parser implementation which is used to obtain a polish notation stack based representation of the regex used for FSM generation, this folder only contains files used for execution.
  • ParserSource Tools and sources used to easly generate ad hoc parser, can be used for recompile purpose. Special thanks: Professor Mauro Leoncini
  • main.py, timeTest.py Testing files.
  • tests.py Script for launching all tests.

HOW TO USE

From CLI

shell:

python3 ./main.py "abcd" "a*"
False
python3 ./main.py "aaaa" "a*"
True

python command line:

>>> from regexpy.regex import Regex
>>> Regex.match("abcd", "a*")
False
>>> Regex.match("aaaa", "a*")
True

As module

import Regex from regex class and use either the instance method validate() or the static method match() with the string and the pattern as done in main.py file

Full toolchain

NOTE: you could instantiate directly a FSMDeterministic object in simple cases, but it recomended to

New features

regex class now implements:

  • matchAny() "One of" matching
  • matchAll() "All of" matching Allowing for multiple regex to be evaluated at once.

Conclusion

The program achieved the instantiation of a regex object just parsing a string describing it, without having to project the FSM itself.

This is computable via an algorithm that turns the regex into a not deterministic fsm, then using the subset construction to obtain a deterministic fsm, which is also simulable for string validation.

FSMUtils contains every function necessary for this process.

About

A didactic implementation of regex validation using FSM in python

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published