
An easy-to-use package for restoring punctuation in Portuguese texts.


🤗 bert-restore-punctuation-ptbr

🇧🇷 An easy-to-use package for restoring punctuation in Portuguese texts.

This is a bert-base-portuguese-cased model fine-tuned for punctuation restoration on the WikiLingua dataset.

This model is intended for direct use as a punctuation restoration model for the general Portuguese language. Alternatively, you can use this for further fine-tuning on domain-specific texts for punctuation restoration tasks.

The model restores the following punctuation marks: [! ? . , - : ; ' ]

The model also restores the upper-casing of words.
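
To make the two behaviours above concrete, here is a minimal sketch of how per-word tags could be turned back into text. It assumes the two-character tag scheme that appears in the accuracy table (first character is the punctuation mark that follows the word, or "O" for none; second character is "U" for a capitalized word, or "O" for unchanged). The function name `apply_tags` is hypothetical and is not part of the respunct API; the package's internal decoding may differ.

```python
from typing import List, Tuple


def apply_tags(tagged: List[Tuple[str, str]]) -> str:
    """Rebuild text from (word, tag) pairs.

    Assumed tag layout (inferred from the accuracy table, not from the
    package source): tag[0] is a trailing punctuation mark or "O",
    tag[1] is "U" (capitalize the word) or "O" (leave as is).
    """
    out = []
    for word, tag in tagged:
        punct, case = tag[0], tag[1]
        if case == "U":
            # Capitalize only the first letter, keep the rest untouched.
            word = word[:1].upper() + word[1:]
        if punct != "O":
            word += punct
        out.append(word)
    return " ".join(out)


example = [
    ("com", "OO"), ("o", "OO"), ("pedro", ".U"),
    ("mais", "OU"), ("tarde", ",O"), ("foram", "OO"),
]
print(apply_tags(example))  # com o Pedro. Mais tarde, foram
```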


🤷 Usage

Below is a quick way to use the model.

  1. First, install the package.
pip install respunct
  2. Sample Python code.
from respunct import RestorePuncts

model = RestorePuncts()

# Unpunctuated, lowercase input text
text = "henrique foi no lago pescar com o pedro mais tarde foram para a casa do pedro fritar os peixes"
print(model.restore_puncts(text))
# output:
# Henrique foi no lago pescar com o Pedro. Mais tarde, foram para a casa do Pedro fritar os peixes.
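
When several texts need to be processed, a small wrapper can tidy the input and call the model once per text. The sketch below is only an illustration: `normalize` and `restore_many` are hypothetical helpers, not part of respunct, and the sketch assumes only what the sample above shows, namely that `restore_puncts` takes a string and returns a string. The restoring function is passed in as a parameter so the helper can be tried without downloading the model.

```python
from typing import Callable, Iterable, List


def normalize(text: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def restore_many(texts: Iterable[str], restore: Callable[[str], str]) -> List[str]:
    """Apply `restore` to every non-empty text after whitespace cleanup."""
    return [restore(normalize(t)) for t in texts if t.strip()]


# Stand-in for the real model so the sketch runs on its own.
def fake_restore(text: str) -> str:
    return text[:1].upper() + text[1:] + "."


print(restore_many(["henrique foi\n no lago", "   "], fake_restore))
```

With the package installed, the stand-in would be replaced by the bound method from the sample above, for example `restore_many(lines, model.restore_puncts)`.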

🎯 Accuracy

| Label | Tag | Precision | Recall | F1-score | Support |
|---|---|---|---|---|---|
| Upper | OU | 0.89 | 0.91 | 0.90 | 69376 |
| None | OO | 0.99 | 0.98 | 0.98 | 857659 |
| Full stop/period | .O | 0.86 | 0.93 | 0.89 | 60410 |
| Comma | ,O | 0.85 | 0.83 | 0.84 | 48608 |
| Upper + Comma | ,U | 0.73 | 0.76 | 0.75 | 3521 |
| Question mark | ?O | 0.68 | 0.78 | 0.73 | 1168 |
| Upper + Period | .U | 0.66 | 0.72 | 0.69 | 1884 |
| Upper + Colon | :U | 0.59 | 0.63 | 0.61 | 352 |
| Colon | :O | 0.70 | 0.53 | 0.60 | 2420 |
| Upper + Question mark | ?U | 0.50 | 0.56 | 0.53 | 36 |
| Upper + Exclamation mark | !U | 0.38 | 0.32 | 0.34 | 38 |
| Exclamation mark | !O | 0.30 | 0.05 | 0.08 | 783 |
| Semicolon | ;O | 0.35 | 0.04 | 0.08 | 1557 |
| Apostrophe | 'O | 0.00 | 0.00 | 0.00 | 3 |
| Hyphen | -O | 0.00 | 0.00 | 0.00 | 3 |
| Accuracy | | | | 0.96 | 1047818 |
| Macro avg | | 0.57 | 0.54 | 0.54 | 1047818 |
| Weighted avg | | 0.96 | 0.96 | 0.96 | 1047818 |
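
The gap between the macro and weighted averages comes from class imbalance: the macro average gives every label equal weight, so rare labels with low scores (semicolon, exclamation mark, apostrophe, hyphen) pull it down, while the weighted average is dominated by the very frequent "OO" label. The following sketch recomputes both F1 averages from the per-label rows above. It works from the rounded values in the table, so the results are approximate.

```python
# (tag, f1-score, support) copied from the per-label rows of the table.
rows = [
    ("OU", 0.90, 69376),
    ("OO", 0.98, 857659),
    (".O", 0.89, 60410),
    (",O", 0.84, 48608),
    (",U", 0.75, 3521),
    ("?O", 0.73, 1168),
    (".U", 0.69, 1884),
    (":U", 0.61, 352),
    (":O", 0.60, 2420),
    ("?U", 0.53, 36),
    ("!U", 0.34, 38),
    ("!O", 0.08, 783),
    (";O", 0.08, 1557),
    ("'O", 0.00, 3),
    ("-O", 0.00, 3),
]

total_support = sum(support for _, _, support in rows)

# Macro average: plain mean over labels, regardless of frequency.
macro_f1 = sum(f1 for _, f1, _ in rows) / len(rows)

# Weighted average: mean over labels, weighted by each label's support.
weighted_f1 = sum(f1 * support for _, f1, support in rows) / total_support

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.2f}")
print(f"weighted F1:   {weighted_f1:.2f}")
```

The total support matches the 1047818 reported in the table, and the weighted F1 rounds to 0.96. The macro F1 comes out near 0.53 rather than the reported 0.54, which is consistent with the inputs having been rounded to two decimals.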

🤙 Contact

Contact Maicon Domingues with questions, feedback, or requests for similar models.


