used to interface with Senna and stanford-parser.jar for dependency parsing

Project description



<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.14: http://docutils.sourceforge.net/" />
<title>practNLPTools-lite</title>
</head>
<body>
<div class="document" id="practnlptools-lite">
<h1 class="title">practNLPTools-lite</h1>

<p>practNLPTools in lite mode. (The old code is on the <a class="reference external" href="https://github.com/jawahar273/practNLPTools-lite/tree/dev">dev branch</a>; a pre-stable version is on the <a class="reference external" href="https://github.com/jawahar273/practNLPTools-lite/tree/pyup-update-pytest-3.2.2-to-3.2.3">proper branch</a>.)</p>
<p><object data="https://img.shields.io/badge/Author-jawahar-blue.svg" type="image/svg+xml">Author</object></p>
<p><a class="reference external" href="https://travis-ci.org/jawahar273/practNLPTools"><img alt="Build Status" src="https://travis-ci.org/jawahar273/practNLPTools.svg?branch=master" /></a> - clicking this build badge may take you to the build of
<a class="reference external" href="https://github.com/jawahar273/practNLPTools-lite">practNLPTools</a>, which is the testing ground for this repository, so don’t
worry.</p>
<p>Practical Natural Language Processing Tools for Humans.
practNLPTools is a Pythonic library over <a class="reference external" href="http://ronan.collobert.com/senna/">SENNA</a> and the Stanford
Dependency Extractor.</p>
<table border="1" class="docutils">
<colgroup>
<col width="50%" />
<col width="50%" />
</colgroup>
<thead valign="bottom">
<tr><th class="head">name</th>
<th class="head">status</th>
</tr>
</thead>
<tbody valign="top">
<tr><td>Wercker status</td>
<td><a class="reference external" href="https://app.wercker.com/project/byKey/758bf4fa0e3bb9066d118385ee4aac1f"><img alt="Wercker status" src="https://app.wercker.com/status/758bf4fa0e3bb9066d118385ee4aac1f/s/master" /></a></td>
</tr>
<tr><td>PyPi</td>
<td><a class="reference external" href="https://pypi.python.org/pypi/pntl"><object data="https://img.shields.io/pypi/v/practNLPTools-lite.svg" type="image/svg+xml">pypi status</object></a></td>
</tr>
<tr><td>travis</td>
<td><a class="reference external" href="https://travis-ci.org/jawahar273/practNLPTools-lite"><object data="https://img.shields.io/travis/jawahar273/practNLPTools-lite.svg" type="image/svg+xml">travis status</object></a></td>
</tr>
<tr><td>Documentation</td>
<td><a class="reference external" href="https://pntl.readthedocs.io/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/pntl/badge/?version=latest" /></a></td>
</tr>
<tr><td>dependency</td>
<td><a class="reference external" href="https://pyup.io/repos/github/jawahar273/practNLPTools-lite/"><object data="https://pyup.io/repos/github/jawahar273/practNLPTools-lite/shield.svg" type="image/svg+xml">Updates</object></a></td>
</tr>
<tr><td>blocker Pyupbot</td>
<td><a class="reference external" href="https://pyup.io/repos/github/jawahar273/practNLPTools-lite/"><object data="https://pyup.io/repos/github/jawahar273/practNLPTools-lite/python-3-shield.svg" type="image/svg+xml">Python 3</object></a></td>
</tr>
<tr><td>FOSSA</td>
<td><a class="reference external" href="https://app.fossa.io/projects/git%2Bhttps%3A%2F%2Fgithub.com%2Fjawahar273%2FpractNLPTools-lite?ref=badge_small"><img alt="FOSSA Status" src="https://app.fossa.io/api/projects/git%2Bhttps%3A%2F%2Fgithub.com%2Fjawahar273%2FpractNLPTools-lite.svg?type=small" /></a></td>
</tr>
</tbody>
</table>
<ul class="simple">
<li>Documentation: <a class="reference external" href="https://pntl.readthedocs.io">https://pntl.readthedocs.io</a></li>
</ul>
<div class="section" id="functionality">
<h1>Functionality</h1>
<ul class="simple">
<li>Semantic Role Labeling.</li>
<li>Syntactic Parsing.</li>
<li>Part of Speech Tagging (POS Tagging).</li>
<li>Named Entity Recognition (NER).</li>
<li>Dependency Parsing.</li>
<li>Shallow Chunking.</li>
<li>Skip-gram (in case).</li>
<li>Finds the SENNA path if SENNA is installed on the system.</li>
<li>Stanford parser and depParser files are placed in the install directory.</li>
</ul>
</div>
<div class="section" id="future-work">
<h1>Future work</h1>
<ul class="simple">
<li>Creating depParser for the corresponding OS environment.</li>
<li>Custom input format for the Stanford parser instead of the tree format.</li>
</ul>
</div>
<div class="section" id="features">
<h1>Features</h1>
<ol class="arabic simple">
<li>Fast: <a class="reference external" href="http://ronan.collobert.com/senna/">SENNA</a> is written in C, so it is fast.</li>
<li>We use only the dependency extractor component of the Stanford Parser, which
takes the syntactic parse from SENNA and applies dependency
extraction. So there is no need to load parsing models for the Stanford
Parser, which takes time.</li>
<li>Easy to use.</li>
<li>Platforms supported: Windows, Linux and Mac.</li>
<li>Automatically finds the Stanford parser jar if it is present in the install path [pntl].</li>
</ol>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">The SENNA pipeline has a fixed maximum size for the sentences that it
can read. By default it is 1024 tokens per sentence. If you have longer
sentences, consider changing the MAX_SENTENCE_SIZE value in SENNA_main.c and rebuilding the binary for your system. Otherwise this could introduce misalignment errors.</p>
</div>
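<p>As a rough illustration of the limit described in the note above, the sketch below splits over-long input before it reaches SENNA. It is not part of <tt class="docutils literal">pntl</tt>: the helper name <tt class="docutils literal">split_long_sentence</tt> is hypothetical, and whitespace splitting only approximates SENNA's own tokenisation, so treat the token count as an estimate.</p>
<pre class="literal-block">
```python
# Assumption: 1024 is the default per-sentence token limit quoted in the note above.
MAX_SENTENCE_SIZE = 1024


def split_long_sentence(sentence, limit=MAX_SENTENCE_SIZE):
    """Split a sentence into chunks of at most `limit` whitespace tokens.

    Hypothetical helper, not a pntl or SENNA function. Whitespace tokens
    only approximate the tokens SENNA would produce.
    """
    tokens = sentence.split()
    # Step through the token list in strides of `limit` and re-join each slice.
    return [" ".join(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]


if __name__ == "__main__":
    chunks = split_long_sentence("one two three four five", limit=2)
    print(chunks)
```
</pre>
<p>A result with more than one chunk means the input would likely exceed the limit. Splitting mid-sentence changes the parse, so rebuilding SENNA with a larger MAX_SENTENCE_SIZE remains the option the note recommends when full sentences matter.</p>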
</div>
<div class="section" id="installation">
<h1>Installation</h1>
<div class="section" id="requires">
<h2>Requires:</h2>
<blockquote>
A computer with 500 MB of memory, a Java Runtime Environment (preferably 1.7;
1.6 should also work but is untested) and Python installed.</blockquote>
<div class="section" id="linux">
<h3>Linux:</h3>
<p>Run:</p>
<pre class="literal-block">
sudo python setup.py install
</pre>
</div>
<div class="section" id="windows">
<h3>Windows:</h3>
<p>Run this command as administrator:</p>
<pre class="literal-block">
python setup.py install
</pre>
</div>
</div>
</div>
<div class="section" id="bench-mark-comparsion">
<h1>Benchmark comparison</h1>
<p>Timings were taken with the <tt class="docutils literal">time</tt> command on Ubuntu, running <tt class="docutils literal">testsrl.py</tt> from
this <a class="reference external" href="https://github.com/jawahar273/SRLTagger">link</a> and <tt class="docutils literal">tools.py</tt> from <tt class="docutils literal">pntl</tt>.</p>
<table border="1" class="docutils">
<colgroup>
<col width="33%" />
<col width="33%" />
<col width="33%" />
</colgroup>
<thead valign="bottom">
<tr><th class="head">&nbsp;</th>
<th class="head">pntl</th>
<th class="head">NLTK-senna</th>
</tr>
</thead>
<tbody valign="top">
<tr><td>at first run</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr><td>&nbsp;</td>
<td>real 0m1.674s</td>
<td>real 0m2.484s</td>
</tr>
<tr><td>&nbsp;</td>
<td>user 0m1.564s</td>
<td>user 0m1.868s</td>
</tr>
<tr><td>&nbsp;</td>
<td>sys 0m0.228s</td>
<td>sys 0m0.524s</td>
</tr>
<tr><td>at second run</td>
<td>&nbsp;</td>
<td>&nbsp;</td>
</tr>
<tr><td>&nbsp;</td>
<td>real 0m1.245s</td>
<td>real 0m3.359s</td>
</tr>
<tr><td>&nbsp;</td>
<td>user 0m1.560s</td>
<td>user 0m2.016s</td>
</tr>
<tr><td>&nbsp;</td>
<td>sys 0m0.152s</td>
<td>sys 0m1.168s</td>
</tr>
</tbody>
</table>
<div class="admonition note">
<p class="first admonition-title">Note</p>
<p class="last">This benchmark may differ depending on system load; the results shown here are exactly those obtained on my system (Ubuntu, 4 GB RAM,
i3 processor). If I find a better benchmarking technique, I will
switch to it.</p>
</div>
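<p>For readers without the <tt class="docutils literal">time</tt> command, a comparable wall-clock measurement can be sketched in Python. The helper <tt class="docutils literal">time_command</tt> is a hypothetical name, and the command it runs here is a placeholder rather than the actual benchmark script. It reports only elapsed (real) time, not the user and sys figures shown in the table.</p>
<pre class="literal-block">
```python
import subprocess
import sys
import time


def time_command(args, runs=2):
    """Return the wall-clock seconds taken by each run of the command `args`.

    Hypothetical helper for illustration; it is not part of pntl.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits with an error.
        subprocess.run(args, check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command; swap in something like [sys.executable, "testsrl.py"].
    for index, seconds in enumerate(time_command([sys.executable, "-c", "pass"]), start=1):
        print("run {}: {:.3f}s".format(index, seconds))
```
</pre>
<p>Running each script twice, as in the table, helps separate first-run costs such as disk caching from steady-state behaviour.</p>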
</div>
<div class="section" id="credits">
<h1>Credits</h1>
<p>This package was created with <a class="reference external" href="https://github.com/audreyr/cookiecutter">Cookiecutter</a> and the <a class="reference external" href="https://github.com/audreyr/cookiecutter-pypackage">audreyr/cookiecutter-pypackage</a> project template.</p>
</div>
</div>
</body>
</html>


=======
History
=======

0.2.3-beta
----------
* Proper release for PyPI version.

0.2.0 4-alpha
------------------
* Marking standard tools for `pntl`.

0.1.1 (2017-09-17)
------------------

* Planning to release on PyPI.
