hachoir-regex 1.0.3
Manipulation of regular expressions (regex)
Hachoir regex
hachoir-regex is a Python library for regular expression (regex or regexp) manipulation. You can combine expressions with the a|b (or) and a+b (and) operators. Expressions are optimized during construction: ranges are merged, repetitions are simplified, etc. It also contains a pattern matching class that searches for multiple strings and regexes at the same time.
Website: http://bitbucket.org/haypo/hachoir/wiki/hachoir-regex
Changelog
Version 1.0.3
- Raise SyntaxError on unsupported escape character
- Two dot atoms are always equal
Version 1.0.2 (2007-07-12)
- Refix PatternMatching without any pattern
Version 1.0.1 (2007-06-28)
- Fix PatternMatching without any pattern
Version 1.0 (2007-06-28)
- First public version
Regex examples
Regexes are optimized as they are created:
>>> from hachoir_regex import parse, createRange, createString
>>> createString("bike") + createString("motor")
<RegexString 'bikemotor'>
>>> parse('(foo|fooo|foot|football)')
<RegexAnd 'foo(|[ot]|tball)'>
Create a character range:
>>> regex = createString("1") | createString("3")
>>> regex
<RegexRange '[13]'>
>>> regex |= createRange("2", "4")
>>> regex
<RegexRange '[1-4]'>
As you can see, you can use classic "a|b" (or) and "a+b" (and) Python operators. Example of regular expressions using repetition:
>>> parse("(a{2,}){3,4}")
<RegexRepeat 'a{6,}'>
>>> parse("(a*|b)*")
<RegexRepeat '[ab]*'>
>>> parse("(a*|b|){4,5}")
<RegexRepeat '(a+|b){0,5}'>
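The first example above, (a{2,}){3,4} becoming a{6,}, can be reasoned about with plain arithmetic on the repetition bounds. The following is a minimal sketch of that reasoning, not the library's code: the function name combine_repeats and the (minimum, maximum) tuple representation are assumptions made for illustration, and the source does not describe how hachoir-regex decides when to merge. The sketch returns None whenever the nested repetition cannot be written as a single range.

```python
def combine_repeats(inner, outer):
    """Collapse (x{a,b}){c,d} into one range (minimum, maximum), or return None.

    Each argument is a (minimum, maximum) pair; maximum None means unbounded.
    None is returned when no single range describes the nested repetition.
    Hypothetical helper for illustration; not part of hachoir-regex.
    """
    a, b = inner
    c, d = outer
    if d is None or d > c:
        # The outer count can vary, so the totals reachable with k and k+1
        # copies must touch. Checking the smallest k (k == c) is sufficient.
        if b is None:
            gap_free = c >= 1 or a <= 1
        else:
            gap_free = a - 1 <= c * (b - a)
        if not gap_free:
            return None
    minimum = a * c
    if d == 0 or b == 0:
        maximum = 0
    elif b is None or d is None:
        maximum = None
    else:
        maximum = b * d
    return minimum, maximum


merged = combine_repeats((2, None), (3, 4))      # (a{2,}){3,4}
not_mergeable = combine_repeats((2, 2), (3, 4))  # (a{2}){3,4} is a{6} or a{8}
```

Here merged is (6, None), which corresponds to a{6,}, while not_mergeable is None because six or eight repetitions leave a gap at seven.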
Compute the minimum and maximum length of a matched pattern:
>>> r=parse('(cat|horse)')
>>> r.minLength(), r.maxLength()
(3, 5)
>>> r=parse('(a{2,}|b+)')
>>> r.minLength(), r.maxLength()
(1, None)
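To show where these numbers come from, here is a small sketch that computes the same bounds on a hand-built expression tree. The classes Literal, Alternation and Repeat are hypothetical stand-ins written for this example (Python 3, standard library only); they are not the library's classes. As in the output above, None stands for an unbounded maximum.

```python
class Literal:
    """A fixed string: it always matches exactly len(text) characters."""

    def __init__(self, text):
        self.text = text

    def min_length(self):
        return len(self.text)

    def max_length(self):
        return len(self.text)


class Alternation:
    """a|b: the shortest choice gives the minimum, the longest the maximum."""

    def __init__(self, *choices):
        self.choices = choices

    def min_length(self):
        return min(choice.min_length() for choice in self.choices)

    def max_length(self):
        lengths = [choice.max_length() for choice in self.choices]
        # One unbounded choice makes the whole alternation unbounded.
        if None in lengths:
            return None
        return max(lengths)


class Repeat:
    """item{minimum,maximum}; maximum None means no upper limit."""

    def __init__(self, item, minimum, maximum=None):
        self.item = item
        self.minimum = minimum
        self.maximum = maximum

    def min_length(self):
        return self.item.min_length() * self.minimum

    def max_length(self):
        item_max = self.item.max_length()
        # Repeating zero times, or repeating an empty item, matches nothing.
        if self.maximum == 0 or item_max == 0:
            return 0
        if self.maximum is None or item_max is None:
            return None
        return item_max * self.maximum


animals = Alternation(Literal("cat"), Literal("horse"))        # (cat|horse)
letters = Alternation(Repeat(Literal("a"), 2), Repeat(Literal("b"), 1))  # (a{2,}|b+)
```

With these definitions, animals has a minimum of 3 and a maximum of 5, and letters has a minimum of 1 and no maximum.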
Pattern matching
Use PatternMatching if you would like to find many strings or regexes in a string. Use addString() and addRegex() to add your patterns.
>>> from hachoir_regex import PatternMatching
>>> p = PatternMatching()
>>> p.addString("a")
>>> p.addString("b")
>>> p.addRegex("[cd]")
And then use search() to find all patterns:
>>> for start, end, item in p.search("a b c d"):
...     print "%s..%s: %s" % (start, end, item)
...
0..1: a
2..3: b
4..5: [cd]
6..7: [cd]
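For comparison, a similar single-pass search can be sketched with the standard re module alone by joining every pattern into one alternation of named groups. This only illustrates the idea; it is not how PatternMatching is implemented, and the function name search_many is an assumption. The sketch assumes the patterns contain no named groups of their own, and on a tie the pattern listed first wins rather than the longest one.

```python
import re


def search_many(patterns, text):
    """Yield (start, end, pattern) for each non-overlapping match, left to right.

    Hypothetical helper for illustration; not part of hachoir-regex.
    """
    if not patterns:
        # Without this guard the empty alternation would match everywhere.
        return
    # Wrap each pattern in its own named group so a match tells us which one hit.
    combined = re.compile("|".join(
        "(?P<p%d>%s)" % (index, pattern)
        for index, pattern in enumerate(patterns)))
    for match in combined.finditer(text):
        index = int(match.lastgroup[1:])
        yield match.start(), match.end(), patterns[index]


# Plain strings are escaped so they are matched literally.
patterns = [re.escape("a"), re.escape("b"), "[cd]"]
results = list(search_many(patterns, "a b c d"))
```

The results list holds the same positions as the output above: (0, 1, 'a'), (2, 3, 'b'), (4, 5, '[cd]') and (6, 7, '[cd]').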
You can also attach an object to a pattern with the 'user' (user data) argument:
>>> p = PatternMatching()
>>> p.addString("un", 1)
>>> p.addString("deux", 2)
>>> for start, end, item in p.search("un deux"):
...     print "%r at %s: user=%r" % (item, start, item.user)
...
<StringPattern 'un'> at 0: user=1
<StringPattern 'deux'> at 3: user=2
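The user data idea amounts to keeping an arbitrary object next to each pattern and handing it back with every match. A minimal sketch with the standard re module follows; the class MultiSearch and its method names are hypothetical, chosen for this example, and do not reflect the library's internals.

```python
import re


class MultiSearch:
    """Hypothetical stand-in for illustration; not part of hachoir-regex."""

    def __init__(self):
        self._entries = []  # list of (regex source, user object)

    def add_string(self, text, user=None):
        # Escape the text so it is matched literally.
        self._entries.append((re.escape(text), user))

    def search(self, text):
        """Yield (start, matched text, user object) for each match."""
        if not self._entries:
            return
        combined = re.compile("|".join(
            "(?P<p%d>%s)" % (index, source)
            for index, (source, _user) in enumerate(self._entries)))
        for match in combined.finditer(text):
            # The group name encodes the index of the pattern that matched.
            _source, user = self._entries[int(match.lastgroup[1:])]
            yield match.start(), match.group(), user


finder = MultiSearch()
finder.add_string("un", 1)
finder.add_string("deux", 2)
hits = list(finder.search("un deux"))
```

Here hits contains (0, 'un', 1) and (3, 'deux', 2), so the caller can map each match back to its own object without a second lookup.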
Installation
With distutils:
sudo ./setup.py install
Or using setuptools:
sudo ./setup.py --setuptools install
| File | Type | Py Version | Uploaded on | Size | # downloads |
|---|---|---|---|---|---|
| hachoir_regex-1.0.3-py2.5.egg (md5) | Python Egg | 2.5 | 2009-09-10 18:22:12.781237 | 30KB | 46 |
| hachoir-regex-1.0.3.tar.gz (md5) | Source | | 2008-04-01 17:32:04 | 21KB | 1198 |
| hachoir_regex-1.0.3-py2.6.egg (md5) | Python Egg | 2.6 | 2009-09-10 18:22:15.603203 | 30KB | 54 |
| hachoir_regex-1.0.3-py2.4.egg (md5) | Python Egg | 2.4 | 2009-09-10 18:22:10.883024 | 30KB | 45 |
- Author: Victor Stinner
- Home Page: http://bitbucket.org/haypo/hachoir/wiki/hachoir-regex
- Download URL: http://bitbucket.org/haypo/hachoir/wiki/hachoir-regex
- License: GNU GPL v2
Categories
- Development Status :: 5 - Production/Stable
- Intended Audience :: Developers
- Intended Audience :: Education
- License :: OSI Approved :: GNU General Public License (GPL)
- Natural Language :: English
- Operating System :: OS Independent
- Programming Language :: Python
- Topic :: Scientific/Engineering :: Information Analysis
- Topic :: Software Development :: Interpreters
- Topic :: Software Development :: Libraries :: Python Modules
- Topic :: Text Processing
- Topic :: Utilities
- Package Index Owner: haypo
- DOAP record: hachoir-regex-1.0.3.xml
