Skip to main content

Handle SPS packages like a breeze.

Project description

scielo.packtools
================

Canivete suiço para a inspeção de pacotes SPS.


Instalação
----------

Python Package Index (recomendado):

```bash
pip install packtools
```

Pip + git (versão de desenvolvimento):

```bash
pip install -e git+git://github.com/scieloorg/packtools.git#egg=packtools
```

Repositório de códigos (versão de desenvolvimento):

```bash
git clone https://github.com/scieloorg/packtools.git
cd packtools
python setup.py install
```


Configuração do Catálogo XML
----------------------------

Um Catálogo XML é um mecanismo de *lookup* que pode ser utilizado para evitar que requisições de
rede sejam realizadas para carregar DTDs externas.

Por questões de desempenho e segurança, as instâncias de `stylechecker.XML` não realizam
conexões de rede, portanto é extremamente recomendável que seja configurado um Catálogo XML,
que traduz ids públicos para uris de arquivos locais.

O `packtools` já apresenta um catálogo padrão, que para ser utilizado basta definir a
variável de ambiente `XML_CATALOG_FILES` com o caminho absoluto para
`packtools/catalogs/scielo-publishing-schema.xml`.

Mais informações em http://xmlsoft.org/catalog.html#Simple


Utilitário `stylechecker`
-------------------------

Após a instalação, o programa `stylechecker` deverá estar disponível no seu emulador de terminal.
Esse programa realiza a validação de um determinado XML no formato SPS contra a DTD, e
apresenta uma lista dos erros encontrados. Também é possível *anotar* os erros encontrados em uma
cópia do XML em validação, por meio do argumento opcional `--annotated`.

O utilitário `stylechecker` tenta carregar a DTD externa, especificada na declaração DOCTYPE do
XML. Para evitar esse comportamento, utilize a opção `--nonetwork`.

A função *ajuda* pode ser utilizada com a opção `-h`, conforme o exemplo:

```bash
$ stylechecker -h
usage: stylechecker [-h] [--annotated | --raw] [--nonetwork]
[--assetsdir ASSETSDIR] [--version] [--loglevel LOGLEVEL]
[--nocolors] [--extrasch EXTRASCH]
[XML [XML ...]]

SciELO PS stylechecker command line utility.

positional arguments:
XML filesystem path or URL to the XML

optional arguments:
-h, --help show this help message and exit
--annotated reproduces the XML with notes at elements that have
errors
--raw each result is encoded as json, without any
formatting, and written to stdout in a single line.
--nonetwork prevents the retrieval of the DTD through the network
--assetsdir ASSETSDIR
lookup, at the given directory, for each asset
referenced by the XML. current working directory will
be used by default.
--version show program's version number and exit
--loglevel LOGLEVEL
--nocolors prevents the output from being colorized by ANSI
escape sequences
--extrasch EXTRASCH runs an extra validation using an external schematron
schema.

Copyright 2013 SciELO <scielo-dev@googlegroups.com>. Licensed under the terms
of the BSD license. Please see LICENSE in the source code for more
information.
```



History
=======

0.7.5
-----

* Added feature to run the validation against an external schematron schema
[#55].
* stylechecker's `--loglevel` option accepts upper, lower or mixed case strings.
* stylechecker utility can read from stdin, so it can be a filter in unix
pipelines.
* Added `--raw` option to stylechecker.
* Fixed bug that would raise UnicodeDecodeError in the presence
of any non-ascii character in the path to the file (Python 2 on Windows only).


0.7.4 (2015-06-19)
------------------

* Fixed bug that would cause page counts to be reported as error when
pagination is identified with elocation-id [#51].
* Added support for creative commons IGO licenses (sps-1.2 only).
* Fixed bug that would cause funding-group validation to raise false positives.


0.7.3 (2015-05-18)
------------------

* Validating the minimum set of elements required for references of type
journal [http://git.io/vUSp6].
* Added validation of //aff/country/@country attributes for XMLs under
sps-1.2 spec.


0.7.2 (2015-04-30)
------------------

* Fixes a bug in which the occurrence of empty award-id,
fn[@fn-type="financial-disclosure"] or ack could lead stylechecker to crash.


0.7.1 (2015-04-29)
------------------

* Fixes a bug that report *page-count* as invalid when fpage or lpage values
are non-digit.
* Fixes a bug that mark as invalid XMLs containing use-licenses with
https scheme or missing trailing slashes.
* Changes the funding-group validation algorithm.
* Checking for funding-statement when fn[fn-type="financial-disclosure"] is
present.


0.7 (2015-03-13)
----------------

* Added SciELO PS 1.2 support.
* Added the apparent sourceline of the element raising validation errors
(stylechecker).
* Added the option *--nocolors* to prevent stylechecker output from being
colorized by ANSI escape sequences.
* stylechecker now prints log messages to stdout. The option *--loglevel*
should be used to define the log level. Options are: DEBUG, INFO, WARNING,
ERROR or CRITICAL.
* SciELO PS 1.2 schematron uses EXSLT querybinding.
* Better error handling while analyzing multiple XML files with stylechecker.


0.6.4 (2015-02-03)
------------------

* Fixes a bug that causes malfunctioning on stylechecker
while expanding wildcards on windows.
* Major semantic changes at *--assetsdir* options. Now it is always turned ON,
and the option is used to set the lookup basedir. By default,
the XML basedir is used.


0.6.3 (2015-02-02)
------------------

* stylechecker CLI utility overhaul:
* The basic output is now presented as JSON structure.
* The option *--assetsdir* lookups, in the given dir, for each asset referenced in
XML. The *--annotated* option now writes the output to a file. The
utility now takes more than one XML a time.
* *pygments*, if installed, will be used to display pretty JSON outputs.


0.6.2 (2015-01-23)
------------------

* Added method `XMLValidator.lookup_assets`.
* Added property `XMLValidator.assets`.
* Fixed minor issue that would cause //element-citation[@publication-type="report"]
to be reported as invalid.
* Fixed minor issue that would erroneously identify an element-citation element
as not being child of element ref.


0.6.1 (2014-11-28)
------------------

* Minor fix to implement changes from SciELO PS 1.1.1.


0.6 (2014-10-28)
----------------

* Python 3 support.
* Project-wide code refactoring.
* `packtools.__version__` attribute to get the package version.
* Distinction between classes of error with the attribute `StyleError.level`.


0.5 (2014-09-29)
----------------

* Basic implementation of XML style rules according to SciELO PS version 1.1.
* `stylechecker` and `packbuilder` console utilities.
* Major performance improvements on `XMLValidator` instantiation, when used
with long-running processes (9.5x).

Project details


Release history Release notifications | RSS feed

This version

0.7.5

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

packtools-0.7.5.tar.gz (422.2 kB view hashes)

Uploaded Source

Built Distribution

packtools-0.7.5-py2.py3-none-any.whl (607.3 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page