id2xml

Convert text format RFCs and Internet-Drafts to .xml format

Project description

Internet-Draft text to XML Conversion Tool

This tool, ‘id2xml’, is intended for use by the RFC-Editor staff, in order to produce a first xml2rfc-compatible XML version from text-only Internet-Draft submissions.

id2xml may also be useful for Internet-Draft authors who wish to start working on a new version of an older draft or RFC, for which no xml2rfc-compatible XML source is available.

Version 1.0.x can convert the drafts specified in the development Statement of Work into XML files acceptable to xml2rfc, and can also convert a number of other test files into acceptable XML. Internal <xref/> links to figures and tables are not yet generated.

The XML produced follows RFC 7749 [1] in versions 0.9.x and 1.x of the tool, and will follow RFC 7991 [2] in version 2.x, which will be released once support is available to process XML sources that follow the RFC 7991 vocabulary.

Changelog

Version 1.0.1 (14 Jun 2017)

This is a bugfix release which addresses a number of issues raised by the RFC Editor staff, and a few issues found during testing.

Added generation of a sortrefs PI which reflects whether the original’s RFC references are sorted or not.
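As a rough illustration of the sortrefs decision, the sketch below checks whether a list of reference anchors is already in sorted order and emits the matching processing instruction. The function name `sortrefs_pi` and the case-insensitive comparison are assumptions made for this example; they are not taken from id2xml's code.

```python
def sortrefs_pi(anchors):
    """Return an xml2rfc sortrefs PI reflecting the order of `anchors`."""
    anchors = list(anchors)
    # Assumption: "sorted" means case-insensitive alphabetical order.
    is_sorted = anchors == sorted(anchors, key=str.lower)
    return '<?rfc sortrefs="%s"?>' % ("yes" if is_sorted else "no")


print(sortrefs_pi(["RFC2119", "RFC7749", "RFC7991"]))
print(sortrefs_pi(["RFC7991", "RFC2119"]))
```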

Tweaked the slugifier, and applied it to section-* anchors, to ensure they are valid. Fixed an issue causing trailing commas in entity names.
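A slugifier of this kind might look roughly like the following. This is a minimal sketch, not the tool's implementation: the names `slugify` and `section_anchor` and the exact set of permitted characters are assumptions chosen for illustration.

```python
import re


def slugify(text):
    """Hypothetical slugifier: lowercase, then replace runs of other characters by '-'."""
    slug = re.sub(r"[^a-z0-9._-]+", "-", text.lower())
    # Trim separators left at either end, e.g. from a trailing comma or period.
    return slug.strip("-.")


def section_anchor(number):
    """Build a section-* anchor; the fixed prefix guarantees a leading letter."""
    return "section-" + slugify(number)
```

With these definitions, `section_anchor("3.1.")` gives `section-3.1`, and a trailing comma as in `slugify("Security Considerations,")` is dropped rather than carried into the resulting name.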

Rewrote the handling of back matter to permit the various back sections to occur in any order. Added yet another way to say ‘work in progress’ in references; added new reference patterns and removed the expectation that references will have a terminating period; did some minor code clean-up.

Refined the header and footer stripping to consider end-of-line commas, and to require short lines triggering paragraph breaks to contain text.

Fixed a bug in line reading, which could cause the first line of a document to be skipped. Added recognition of additional Standards Track status indications, such as ‘Proposed Standard’, etc. Fixed a grammar issue. Fixed an issue with mismatched authors on the first page and Author’s Addresses section. Refined the header/footer stripping to deal with additional variations of header/footer lines.

Fixed a number of places where warn() was called with a line object instead of the line number.

The supplied figure or texttable label is now also used to set the title attribute when a figure is rendered as a texttable, or a texttable as a figure.

Refined the tokenizer for the text parser in order to correctly handle things like (Section N.N).
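To show why strings like (Section N.N) need care, here is a small regular-expression tokenizer that keeps a dotted section number together while splitting off the surrounding parentheses and a sentence-ending period. The pattern and the name `tokenize` are assumptions for this sketch; id2xml's own tokenizer may work differently.

```python
import re

# Alternatives are tried left to right: dotted section numbers first,
# then words, then any single punctuation character.
TOKEN = re.compile(r"\d+(?:\.\d+)*|\w+|[^\w\s]")


def tokenize(text):
    """Split text into number, word and punctuation tokens."""
    return TOKEN.findall(text)


print(tokenize("(Section 3.2)."))
```

Because a period is only absorbed into a number when a digit follows it, the final period of a sentence ending in "Section 10." stays a separate token.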

Eliminated trailing blank cells in texttables.

Added RFC-Editor staff to the release notification list.

Version 1.0.0 (30 May 2017)

Across the corpus of test documents, just over 2% of lines now differ between the original input file and the text file generated from id2xml’s xml file, and in some cases the generated text is an improvement over the original text. The tool should now be functionally complete for vocabulary v2 output, so this seems like a good time for a 1.0.0 release.

Changes since 1.0.0rc3:

Split the functionality up into separate run.py, parser.py and utils.py files, and adjusted Makefile and MANIFEST accordingly.

Entries for drafts and RFCs in the <references/> sections are now entity references, instead of the reference xml generated from the input document.

There’s a slight refactoring of how the reference_anchors and section_anchors lists are generated.

Added xref elements for Section N.nn strings which reference document sections.
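The idea can be sketched as a string-level substitution: a "Section N.nn" mention is replaced by an xref only when a matching section anchor is known. The anchor scheme `section-N.nn`, the function name `add_section_xrefs`, and working on plain strings rather than an XML tree are all assumptions made for this example.

```python
import re

SECTION_REF = re.compile(r"\bSection (\d+(?:\.\d+)*)")


def add_section_xrefs(text, section_anchors):
    """Replace 'Section N.nn' by an xref when the target anchor exists."""
    def replace(match):
        anchor = "section-" + match.group(1)
        if anchor in section_anchors:
            return '<xref target="%s"/>' % anchor
        # Unknown section (possibly in another document): leave the text alone.
        return match.group(0)

    return SECTION_REF.sub(replace, text)


print(add_section_xrefs("See Section 3.1 and Section 9.", {"section-3.1"}))
```

Leaving unmatched mentions untouched avoids creating xrefs to sections of other documents, which a phrase such as "Section 9 of RFC 7749" would otherwise trigger.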

There have been multiple rounds of refactoring, to clean up and organise the code better.

The generated xml has also been cleaned up, to avoid long lines and tags bunched up on the same line. It’s still not super pretty, but should be readable.

Added a check on coupled debug trace switches, where setting a trace start option also requires that a trace stop option be set.
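A check of this kind can be expressed with argparse as shown below. The option names `--trace-start` and `--trace-stop` and the program name are hypothetical, chosen only for this sketch; they are not id2xml's documented switches.

```python
import argparse


def parse_args(argv):
    """Parse hypothetical trace options and enforce that they are coupled."""
    parser = argparse.ArgumentParser(prog="trace-demo")
    parser.add_argument("--trace-start", metavar="REGEX",
                        help="start tracing at the first matching line")
    parser.add_argument("--trace-stop", metavar="REGEX",
                        help="stop tracing at the first matching line")
    args = parser.parse_args(argv)
    if args.trace_start and not args.trace_stop:
        # parser.error() prints a usage message and exits with status 2.
        parser.error("--trace-start requires --trace-stop")
    return args
```

Doing the check after parsing keeps the two options independent in the parser definition while still rejecting a start without a stop.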

The regular expression which identifies code has been further refined.

Refined the header stripping to not join paragraphs where the first part has a short line.

Added more cases where list hangIndent is derived and set.

Added modification of the text-list-symbols PI in order to better match the source. Since this is a global setting, it can’t handle inconsistent bullet styles in a document (for instance created with hangText="*" …).

Improved the error message for missing stream information when attempting to process older RFCs.

Fixed a bug in the handling of the xml tree for xrefs found in text interspersed with vspace elements.

Code optimisations.

Added the last two changelog sections to the release information shown on PyPI.

Release history

Version	Released
1.5.2	Jul 23, 2023
1.5.1	Jul 12, 2023
1.5.0	Sep 3, 2019
1.4.4	Oct 17, 2018
1.4.3	Oct 12, 2018
1.4.1	Dec 19, 2017
1.4.0	Nov 2, 2017
1.3.1	Oct 30, 2017
1.3.0	Sep 23, 2017
1.2.3	Sep 23, 2017
1.2.2	Aug 10, 2017
1.2.1	Aug 10, 2017
1.2.0	Aug 9, 2017
1.1.0	Jul 27, 2017
1.0.3	Jul 1, 2017
1.0.2	Jun 18, 2017
1.0.1 (this version)	Jun 14, 2017
1.0.0	May 30, 2017
1.0.0rc3 (pre-release)	May 25, 2017
1.0.0rc2 (pre-release)	May 22, 2017
1.0.0-rc1 (pre-release)	May 19, 2017
0.9.3	May 15, 2017
0.9.2	May 15, 2017
0.9.1	May 10, 2017
0.9.0	May 9, 2017

Download files

Source Distribution

id2xml-1.0.1.tar.gz (54.3 kB)

Uploaded Jun 14, 2017 Source

Built Distribution

id2xml-1.0.1-py2.7.egg (95.0 kB)

Uploaded Jun 14, 2017 Source

Hashes for id2xml-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`37af2ae750a79e1dcfad4faecbf99553cafc1f8f8205ff2f269ca2cb8de07851`
MD5	`ee19f880f7d5be52919290920af18ae3`
BLAKE2b-256	`f762350c829cf5c731d3a2f8075a515baeec77faf51f1dd49ac7e85ee896b42d`

Hashes for id2xml-1.0.1-py2.7.egg
Algorithm	Hash digest
SHA256	`7210bf5107111fc098c5be6637cd1d325cad9b67a7a0fa318d68df563d378020`
MD5	`9b51b6f8f4d24ce67193911359abffb9`
BLAKE2b-256	`3e0e7398e0bdfc459897f771ad815fef54678d3dd6c6d2c8f15b2f055341cc2e`
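A downloaded file can be checked against the SHA256 digests listed above with Python's standard hashlib module. The helper names `sha256_of` and `verify` are introduced here for illustration only, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected.strip().lower()
```

For the source distribution, this would be called as `verify("id2xml-1.0.1.tar.gz", "37af2ae750a79e1dcfad4faecbf99553cafc1f8f8205ff2f269ca2cb8de07851")`, using the SHA256 value from the table.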
