find-job-titles

Fast extraction of job titles from strings

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English

Project description

find_job_titles

Find Job Titles in Strings

Free software: MIT license
Python versions: 2.7, 3.4+

Features

Find any of 77k job titles in a given string
Text processing is extremely fast thanks to the “acora” library
Dictionary generation takes about 20 seconds upfront
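
Because dictionary generation is an upfront cost, it usually pays to build the finder once and reuse it. The following is a minimal sketch of that pattern, not code from this package: `StubFinder` and `get_finder` are hypothetical names, and the stub only stands in for an object that is expensive to construct.

```python
from functools import lru_cache


class StubFinder:
    """Hypothetical stand-in for an object whose construction is expensive."""

    instances_built = 0

    def __init__(self):
        # In a real setup, the slow dictionary build would happen here.
        StubFinder.instances_built += 1


@lru_cache(maxsize=None)
def get_finder():
    """Build the finder on first use and return the same instance afterwards."""
    return StubFinder()


first = get_finder()
second = get_finder()  # served from the cache, no second build
```

With this pattern, the construction cost is paid once per process, no matter how many call sites ask for a finder.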

Quickstart

Instantiate “Finder” and start extracting job titles:

>>> from find_job_titles import Finder
>>> finder = Finder()
>>> finder.findall('I am the Senior Vice President')
[('Senior Vice President', 9),
 ('Vice President', 16),
 ('President', 21)]

All possible, overlapping matches are returned. Matches contain positional information of where the match was found.
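
If only the longest title at each position is wanted, the overlapping matches can be filtered after the fact. The sketch below assumes matches are `(title, start)` tuples as in the example above. `keep_longest` is a hypothetical helper written for illustration, not part of the package, and it does not reflect how the package's own longest-match option is implemented.

```python
def keep_longest(matches):
    """Drop every match whose span lies strictly inside another match's span.

    `matches` is a list of (title, start) tuples.
    """
    spans = [(start, start + len(title), title) for title, start in matches]
    kept = []
    for start, end, title in spans:
        contained = any(
            o_start <= start and end <= o_end and (o_start, o_end) != (start, end)
            for o_start, o_end, _ in spans
        )
        if not contained:
            kept.append((title, start))
    return kept


matches = [
    ("Senior Vice President", 9),
    ("Vice President", 16),
    ("President", 21),
]
longest = keep_longest(matches)
print(longest)  # [('Senior Vice President', 9)]
```

The comparison is quadratic in the number of matches, which is fine for the handful of matches a single sentence tends to produce.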

Alternatively use “finditer” for lazy consumption of matches:

>>> finder.finditer('I am the Senior Vice President')
<generator object ...>
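
To show what lazy consumption buys, here is a minimal self-contained sketch of a generator that yields matches one at a time. It is an assumption-laden illustration: `iter_titles` is a hypothetical function using a naive scan over a three-entry title list, not the package's Aho-Corasick based implementation.

```python
def iter_titles(text, titles):
    """Lazily yield (title, start) for every occurrence of each title in text."""
    for start in range(len(text)):
        for title in titles:
            if text.startswith(title, start):
                yield title, start


titles = ["Senior Vice President", "Vice President", "President"]
matches = iter_titles("I am the Senior Vice President", titles)

first = next(matches)      # scans only as far as the first hit
remaining = list(matches)  # drains the rest of the generator
```

A caller that needs only the first match can stop after `next(...)` and never pay for scanning the rest of the text.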

Credits

This package was created with Cookiecutter and the fluquid/cookiecutter-pypackage project template.

History

0.7.0 (2017-08-22)

fixed tox tests for py27 re: different unicode treatment by acora and pyahocorasick
only testing default Finder using pyahocorasick now.

0.6.0 (2017-08-22)

rewrote and fixed longest match code
added pyahocorasick implementation and made default
added params to enable/disable longest matches

0.5.0 (2017-08-22)

0.4.0 (2017-08-21)

updated title list with marketing execs
set non-dev version

0.3.0-dev (2017-08-18)

updated title list (- surnames, - blacklist, + added_roles)

0.2.0-dev (2017-08-18)

proper tracking of code with releases

0.1.0 (unreleased)

First release on PyPI.

Algorithm	Hash digest
SHA256	`88763ef7e1f47ced03bda7e61c4cf778ef4f39cd71d4b59b226d7b49bf7e7aad`
MD5	`84cfb2f037de12a858a00cb6004fd717`
BLAKE2b-256	`dd79961b1af12d2d57cdc2d2d4bb0206dcdb1fbce9032e18d7a0b530afa72efb`

Algorithm	Hash digest
SHA256	`4ad27d617834cc0c1630d3e5cf09b0df62c5913f69ddc5a682797d7a331e7c40`
MD5	`2bb9fea9a1415f0f616fc0096e5f3156`
BLAKE2b-256	`e3439f8294dabf906f3cc5277a0914a4dcc7fb6d506c3e8c317e469c11dbeea7`
