xmltodict

Makes working with XML feel like you are working with JSON

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

# xmltodict

[![Build Status](https://secure.travis-ci.org/martinblech/xmltodict.png)](http://travis-ci.org/martinblech/xmltodict)

`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):

```python
>>> import json, xmltodict
>>> print(json.dumps(xmltodict.parse("""
...  <mydocument has="an attribute">
...    <and>
...      <many>elements</many>
...      <many>more elements</many>
...    </and>
...    <plus a="complex">
...      element as well
...    </plus>
...  </mydocument>
...  """), indent=4))
{
    "mydocument": {
        "@has": "an attribute",
        "and": {
            "many": [
                "elements",
                "more elements"
            ]
        },
        "plus": {
            "@a": "complex",
            "#text": "element as well"
        }
    }
}
```
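
Because the result is a plain nested mapping, values can be read with ordinary key lookups. The following is a minimal sketch that reuses the document above; the variable names (`root`, `attribute`, `items`, `text`) are illustrative only:

```python
import xmltodict

doc = xmltodict.parse("""
<mydocument has="an attribute">
  <and>
    <many>elements</many>
    <many>more elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
</mydocument>
""")

root = doc['mydocument']
attribute = root['@has']      # attributes are stored under an "@"-prefixed key
items = root['and']['many']   # repeated sibling elements are collected into a list
text = root['plus']['#text']  # text of an element that also has attributes

print(attribute)
print(items)
print(text)
```

Note that an element which occurs only once is stored as a single value rather than a one-item list, so code that iterates over `many` should be prepared for both shapes.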

## Namespace support

By default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:

```python
>>> xml = """
... <root xmlns="http://defaultns.com/"
...       xmlns:a="http://a.com/"
...       xmlns:b="http://b.com/">
...   <x>1</x>
...   <a:y>2</a:y>
...   <b:z>3</b:z>
... </root>
... """
>>> xmltodict.parse(xml, process_namespaces=True) == {
...     'http://defaultns.com/:root': {
...         'http://defaultns.com/:x': '1',
...         'http://a.com/:y': '2',
...         'http://b.com/:z': '3',
...     }
... }
True
```

It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:

```python
>>> namespaces = {
...     'http://defaultns.com/': None, # skip this namespace
...     'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
...     'root': {
...         'x': '1',
...         'ns_a:y': '2',
...         'http://b.com/:z': '3',
...     },
... }
True
```
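
For contrast, here is a small sketch of the default behavior described at the start of this section, where no namespace processing takes place. The XML snippet is a shortened version of the one above, and the variable names are illustrative:

```python
import xmltodict

xml = """
<root xmlns="http://defaultns.com/"
      xmlns:a="http://a.com/">
  <x>1</x>
  <a:y>2</a:y>
</root>
"""

# Without process_namespaces=True, declarations are kept as ordinary
# attributes and prefixed element names are left exactly as written.
root = xmltodict.parse(xml)['root']

default_ns = root['@xmlns']   # the default namespace declaration
a_ns = root['@xmlns:a']       # the "a" prefix declaration
y_value = root['a:y']         # the prefix stays part of the key

print(default_ns, a_ns, y_value)
```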

## Streaming mode

`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):

```python
>>> from gzip import GzipFile
>>> def handle_artist(_, artist):
...     print(artist['name'])
...     return True
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```
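
The same pattern can be tried without downloading a dump. The sketch below feeds a small in-memory document through the streaming interface; the XML content and the `names` list are made up for illustration, and `io.BytesIO` stands in for the gzip file:

```python
import io
import xmltodict

xml = b"""<artists>
  <artist><name>A Perfect Circle</name></artist>
  <artist><name>King Crimson</name></artist>
</artists>"""

names = []

def handle_artist(path, artist):
    # Called once for every element that closes at depth 2 (each <artist>).
    # `path` describes the ancestors of the item; `artist` is its parsed content.
    names.append(artist['name'])
    return True  # a true value tells the parser to keep going

# item_depth=2: the root <artists> is depth 1, each <artist> is depth 2.
xmltodict.parse(io.BytesIO(xml), item_depth=2, item_callback=handle_artist)

print(names)
```

Only one item is held in memory at a time, which is what keeps the footprint small on large inputs.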

It can also be used from the command line to pipe objects to a script like this:

```python
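import sys, marshal
stdin = getattr(sys.stdin, 'buffer', sys.stdin)  # binary stream on Python 3
while True:
    try:
        _, article = marshal.load(stdin)
    except EOFError:  # end of the piped stream
        break
    print(article['title'])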
```

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

```sh
$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```

And you reuse the dicts with every script that needs them:

```sh
$ cat enwiki.dicts.gz | gunzip | script1.py
$ cat enwiki.dicts.gz | gunzip | script2.py
...
```
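
To see what such a script consumes, the sketch below writes and reads a stream of marshalled pairs entirely in memory. It assumes each record is a `(path, item)` tuple written with `marshal.dump`, which is the shape the script above unpacks; the record contents and element names are invented for illustration:

```python
import io
import marshal

# Hypothetical records, shaped like the (path, item) pairs the script unpacks.
records = [
    ([('mediawiki', None), ('page', None)], {'title': 'Anarchism'}),
    ([('mediawiki', None), ('page', None)], {'title': 'Autism'}),
]

# Write the records back to back, as they would arrive on a pipe.
stream = io.BytesIO()
for path, item in records:
    marshal.dump((path, item), stream)
stream.seek(0)

# Read them one at a time until the stream is exhausted.
titles = []
while True:
    try:
        _, article = marshal.load(stream)
    except EOFError:
        break
    titles.append(article['title'])

print(titles)
```

Keep in mind that the `marshal` format is specific to the Python version that wrote it, so a cache produced this way is best read back with the same interpreter.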

## Roundtripping

You can also convert in the other direction, using the `unparse()` function:

```python
>>> mydict = {
...     'response': {
...         'status': 'good',
...         'last_updated': '2014-02-16T23:10:12Z',
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<response>
	<status>good</status>
	<last_updated>2014-02-16T23:10:12Z</last_updated>
</response>
```

Text values for nodes are specified under the `cdata_key` key in the Python dict, and node attributes are specified by prefixing the key name with `attr_prefix`. The default value for `attr_prefix` is `@` and the default value for `cdata_key` is `#text`.

```python
>>> import xmltodict
>>>
>>> mydict = {
...     'text': {
...         '@color':'red',
...         '@stroke':'2',
...         '#text':'This is a test'
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<text color="red" stroke="2">This is a test</text>
```
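
Both defaults can be overridden. The sketch below passes a custom `attr_prefix` and `cdata_key` to `unparse()` and then the same values to `parse()` to get the original dict back; the choice of `_` and `value` is arbitrary, and `full_document=False` is used here only to leave out the XML declaration:

```python
import xmltodict

# "_" marks attributes and "value" holds the text, instead of "@" and "#text".
doc = {'text': {'_color': 'red', 'value': 'This is a test'}}

xml = xmltodict.unparse(doc, attr_prefix='_', cdata_key='value',
                        full_document=False)
print(xml)

# Parsing with the same settings should reproduce the original mapping.
restored = xmltodict.parse(xml, attr_prefix='_', cdata_key='value')
print(restored == doc)
```

Using the same pair of settings in both directions is what makes the roundtrip lossless for this input.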

## Ok, how do I get it?

### Using PyPI

Install it with `pip`:

```sh
$ pip install xmltodict
```

### RPM-based distro (Fedora, RHEL, …)

There is an [official Fedora package for xmltodict](https://admin.fedoraproject.org/pkgdb/acls/name/python-xmltodict).

```sh
$ sudo yum install python-xmltodict
```

### Arch Linux

There is an [official Arch Linux package for xmltodict](https://www.archlinux.org/packages/community/any/python-xmltodict/).

```sh
$ sudo pacman -S python-xmltodict
```

### Debian-based distro (Debian, Ubuntu, …)

There is an [official Debian package for xmltodict](https://tracker.debian.org/pkg/python-xmltodict).

```sh
$ sudo apt install python-xmltodict
```


Release history

| Version | Released |
| --- | --- |
| 0.13.0 | May 8, 2022 |
| 0.12.0 | Feb 11, 2019 |
| 0.11.0 (this version) | Apr 27, 2017 |
| 0.10.2 | Jun 2, 2016 |
| 0.10.1 | Feb 23, 2016 |
| 0.10.0 | Feb 23, 2016 |
| 0.9.2 | Feb 4, 2015 |
| 0.9.1 | Jan 18, 2015 |
| 0.9.0 | Apr 17, 2014 |
| 0.8.7 | Mar 27, 2014 |
| 0.8.6 | Feb 16, 2014 |
| 0.8.5 | Feb 3, 2014 |
| 0.8.4 | Feb 3, 2014 |
| 0.8.3 | Oct 21, 2013 |
| 0.8.2 | Oct 21, 2013 |
| 0.8.1 | Oct 12, 2013 |
| 0.7.0 | Aug 25, 2013 |
| 0.6.0 | Aug 19, 2013 |
| 0.5.1 | Jul 15, 2013 |
| 0.5.0 | May 25, 2013 |
| 0.4.6 | Mar 2, 2013 |
| 0.4.4 | Jan 24, 2013 |
| 0.4.3 | Jan 11, 2013 |
| 0.4.2 | Jan 4, 2013 |
| 0.4.1 | Dec 20, 2012 |
| 0.4 | Dec 13, 2012 |
| 0.3 | Nov 14, 2012 |
| 0.2 | Aug 28, 2012 |

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xmltodict-0.11.0.tar.gz (26.6 kB)

Uploaded Apr 27, 2017 Source

Built Distribution

xmltodict-0.11.0-py2.py3-none-any.whl (7.2 kB)

Uploaded Apr 27, 2017 Python 2 Python 3

Hashes for xmltodict-0.11.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `8f8d7d40aa28d83f4109a7e8aa86e67a4df202d9538be40c0cb1d70da527b0df` |
| MD5 | `9f955947db085485873ac68154e88069` |
| BLAKE2b-256 | `5717a6acddc5f5993ea6eaf792b2e6c3be55e3e11f3b85206c818572585f61e1` |

Hashes for xmltodict-0.11.0-py2.py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `add07d92089ff611badec526912747cf87afd4f9447af6661aca074eeaf32615` |
| MD5 | `feb9f31561d6f0f777da1f96552feadc` |
| BLAKE2b-256 | `42a97e99652c6bc619d19d58cdd8c47560730eb5825d43a7e25db2e1d776ceb7` |