Text preprocessing

Overview

Documentation

This documentation covers the PParser package. PParser is designed to help developers preprocess their text automatically. It also has many useful features that make preprocessing more fun. This is not an exhaustive description, but it should show you how to use the package with little effort.

Introduction

PParser is an integrated package that builds on several well-known packages. It also supports multiple languages. The table below lists all valid operations that PParser performs and the corresponding packages.

Operations       Keyword       Packages
-------------    ----------    --------------
normalize        NORMALIZE     HAZM, PARSIVAR
sent tokenize    S_TOKENIZE    HAZM, PARSIVAR
word tokenize    W_TOKENIZE    HAZM, PARSIVAR
lemmatize        LEMMATIZE     HAZM
stem             STEM          HAZM, PARSIVAR
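
The operations table above can be read as a lookup from keyword to supported packages. The following is a minimal sketch of such a lookup in plain Python. The names SUPPORTED_PACKAGES, is_supported and invalid_operations are hypothetical and chosen for illustration only; they are not part of PParser's API.

```python
# Hypothetical lookup built from the operations table above.
# This is an illustration, not PParser's actual API.
SUPPORTED_PACKAGES = {
    "NORMALIZE": {"HAZM", "PARSIVAR"},
    "S_TOKENIZE": {"HAZM", "PARSIVAR"},
    "W_TOKENIZE": {"HAZM", "PARSIVAR"},
    "LEMMATIZE": {"HAZM"},
    "STEM": {"HAZM", "PARSIVAR"},
}


def is_supported(operation: str, package: str) -> bool:
    """Return True if the table lists `package` for the `operation` keyword."""
    return package.upper() in SUPPORTED_PACKAGES.get(operation.upper(), set())


def invalid_operations(operations, package):
    """Return the requested operations that `package` does not cover."""
    return [op for op in operations if not is_supported(op, package)]


# Example: check a requested pipeline against one package.
requested = ["NORMALIZE", "LEMMATIZE", "STEM"]
unsupported = invalid_operations(requested, "PARSIVAR")
```

With the table as given, LEMMATIZE is the only keyword listed for HAZM alone, so a check like this would flag it when PARSIVAR is selected.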

Features

This section lists the features supported by PParser. It is able to:

  • Use GPU

  • Use multiple threads

  • Add custom stopwords

  • Use multiple processors

  • Separate files for using GPU

  • Remove a specified range of characters

  • Remove digits and non-Persian letters

  • Convert Finglish letters to Persian letters
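
To make two of the features above concrete (removing digits and non-Persian letters, and adding custom stopwords), here is a minimal sketch using only the Python standard library. It is not PParser's implementation. The function names are hypothetical, and the set of "Persian letters" is an assumption: the basic Arabic-script letter ranges plus the Persian-specific letters, with the zero-width non-joiner kept.

```python
import re

# Assumption: Persian letters are approximated by U+0621..U+063A and
# U+0641..U+064A plus the Persian-specific letters U+067E, U+0686, U+0698,
# U+06A9, U+06AF and U+06CC. The zero-width non-joiner (U+200C) is kept too.
PERSIAN_LETTERS = "\u0621-\u063A\u0641-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC"
NOT_PERSIAN = re.compile(f"[^{PERSIAN_LETTERS}\u200c\\s]+")


def keep_persian_letters(text: str) -> str:
    """Drop digits, Latin letters and punctuation, then collapse whitespace."""
    cleaned = NOT_PERSIAN.sub(" ", text)
    return " ".join(cleaned.split())


def remove_stopwords(tokens, default_stopwords=(), custom_stopwords=()):
    """Filter out tokens found in the default or the user-supplied stopwords."""
    stopwords = set(default_stopwords) | set(custom_stopwords)
    return [token for token in tokens if token not in stopwords]


# Example usage on a mixed Persian/Latin/digit string.
cleaned = keep_persian_letters("سلام hello 123 کتاب ۴۵۶!")
tokens = remove_stopwords("من به مدرسه رفتم".split(), custom_stopwords={"من", "به"})
```

Because Persian and Arabic-Indic digits fall outside the letter ranges, they are removed together with ASCII digits by the same pattern.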

Installation

To install the package, simply use pip:

$ pip install -i https://test.pypi.org/simple/ mPPars

Usage

This section shows the basic usage of the PParser package.

https://gitlab.com/mostafarahgouy/pparser/-/raw/mostafa-dev/images/guideline.gif

Examples

https://gitlab.com/mostafarahgouy/pparser/-/raw/mostafa-dev/images/guideline.gif

https://gitlab.com/mostafarahgouy/pparser/-/blob/mostafa-dev/images/example_of_validation.png

