my_xml

Easy to use parser for simple XML

Intended Audience
- Developers
License
- Public Domain
Operating System
- OS Independent
Programming Language
- Python
Topic
- Text Processing :: Markup :: XML
- Utilities

Project description

Helper module that parses a simple XML buffer and stores it as a (mostly)
read-only dictionary-type object (MyXml). This dictionary can hold other
dictionaries, node lists, or leaf nodes. Nodes are accessed as attributes.

>>> xml = parse("<Foo><Bar>Val</Bar></Foo>")
>>> xml.Foo.Bar == "Val"
True
>>> xml.Foo.Bar
<Bar>Val</Bar>
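
The attribute-style access shown above can be imitated with a small dictionary subclass. This is only a sketch of the general idea, not the module's own code; the class name AttrDict is hypothetical and does not come from my_xml.

```python
class AttrDict(dict):
    """Hypothetical stand-in: a dict whose keys can be read as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods
        # such as .get() and .keys() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name)
        # Wrap nested plain dicts so that chained access keeps working.
        if isinstance(value, dict) and not isinstance(value, AttrDict):
            value = AttrDict(value)
        return value


tree = AttrDict({"Foo": {"Bar": "Val"}})
print(tree.Foo.Bar)
```

Because the object is still a dictionary, tree["Foo"]["Bar"] reads the same value.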

I don't like to use the built-in Python DOM parsers for simple XML data, but
this module is good only for simple XML! Namespaces, CDATA and other fancy
features are not supported.

There are three factory functions: "parse", "parse_file" and "parse_object".

- parse takes an XML string and builds a MyXml object from it.

- parse_file takes a file name, reads the file and does the same.

Both functions take an optional list of tag names, found at the beginning of
the XML data, to ignore.

- parse_object takes a complex Python object (of dictionaries, sequences and
scalars) and creates a MyXml object from it.

It is possible, but not convenient, to construct XML trees using this module.

Usage Examples:

>>> xml = parse('''
... <?xml bla bla bla>
... 
... <Main>
... <Text>One Two & Three</Text>
... <List>
... 
... <Item aaa="bbb" ></Item>
... <Item ccc = "ab+c" />
... <Item>Bla Bla Bla</Item>
... </List>
... <BoolNum num="3.5" bool="Yes">No</BoolNum>
... <Double><Double>Value</Double></Double>
... </Main>
... ''')

- An XML node is an attribute of the MyXml object

>>> xml.Main.Text
<Text>One Two & Three</Text>

- A node also compares equal to its text, directly or through "value"

>>> xml.Main.Text == "One Two & Three"
True

>>> xml.Main.Text.value == "One Two & Three"
True

A node can also be accessed with an "nd_" prefix, which makes names that are
Python reserved words reachable. This form returns EMPY_NODE if the node
doesn't exist.

>>> xml.nd_Main.nd_Text
<Text>One Two & Three</Text>
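
A prefixed lookup of this kind could be sketched as follows. The classes Node and EmptyNode and the name EMPTY are hypothetical, and raising AttributeError for a missing unprefixed name is an assumption of this sketch; the module's real behaviour may differ.

```python
class EmptyNode(object):
    """Hypothetical placeholder returned for missing nodes."""

    value = None

    def __getattr__(self, name):
        # Any further lookup on an empty node yields the same empty node.
        return self

    def __bool__(self):
        return False

    __nonzero__ = __bool__  # Python 2 spelling


EMPTY = EmptyNode()


class Node(dict):
    def __getattr__(self, name):
        if name.startswith("nd_"):
            # Prefixed form: strip the prefix and never raise.
            return self.get(name[3:], EMPTY)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)  # assumption of this sketch


doc = Node({"Main": Node({"class": "reserved word", "Text": "hello"})})
print(doc.nd_Main.nd_class)             # "class" cannot be written as doc.Main.class
print(doc.nd_Main.nd_Missing is EMPTY)
```

Returning a placeholder instead of raising lets a chain such as doc.nd_A.nd_B.nd_C be written without checking every step.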

- A node can be looked at as a list with one item

>>> xml.Main.Double.Double[0] is xml.Main.Double.Double
True
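
Treating a single node as a one-item list means calling code does not need to know whether a tag appeared once or several times. A minimal sketch of that idea, using a hypothetical Leaf class that is not part of my_xml:

```python
class Leaf(object):
    """Hypothetical leaf node that also behaves like a one-item list."""

    def __init__(self, value):
        self.value = value

    def __len__(self):
        return 1

    def __getitem__(self, index):
        # Only the first (and last) position exists, and it is the node itself.
        if index in (0, -1):
            return self
        raise IndexError(index)


def values(node_or_list):
    # Works the same whether the tag appeared once or several times.
    return [node.value for node in node_or_list]


single = Leaf("Value")
print(single[0] is single)
print(values(single))
print(values([Leaf("a"), Leaf("b")]))
```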

- Node lists are regular lists

>>> len(xml.Main.List.Item)
3
>>> unicode(xml.Main.List.Item[2])
u'Bla Bla Bla'

- A MyXml object is a dictionary

>>> xml["Main"]["Text"] == xml.Main["Text"]
True
>>> xml.Main.get("Text") == xml["Main"].Text
True

- There is also a very simple XPath-like method

>>> xml.xpath("Main/List/Item")[2]
<Item>Bla Bla Bla</Item>
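
A slash-separated lookup like this can be approximated over plain nested dictionaries. The function simple_path below is a hypothetical illustration under that assumption, not the module's xpath method.

```python
def simple_path(tree, path, default=None):
    """Follow a slash-separated path through nested dictionaries."""
    current = tree
    for step in path.split("/"):
        # Stop as soon as a step cannot be followed.
        if not isinstance(current, dict) or step not in current:
            return default
        current = current[step]
    return current


data = {"Main": {"List": {"Item": ["first", "second", "Bla Bla Bla"]}}}
print(simple_path(data, "Main/List/Item")[2])
print(simple_path(data, "Main/foo"))
```

In this sketch a missing step yields the default value rather than an exception.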

- Attributes can be accessed with an "at_" prefix

>>> xml.Main.List.Item[1].at_ccc
u'ab+c'

- Access the attributes dictionary with "at_dict"

>>> xml.Main.List.Item[0].at_dict["aaa"]
u'bbb'

- Every value can be looked at as a number and a boolean

>>> xml.Main.BoolNum.boolean
False

- Attributes can also be looked at as booleans or numbers

>>> xml.Main.BoolNum.at_num.number * 2
7.0
>>> xml.xpath("Main/BoolNum").at_bool.boolean
True

- But if the value is not a number or a boolean (yes, no, true, false, 1, 0),
the return value is None

>>> xml.Main.List.Item[0].at_aaa.number
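
The conversion rules described above can be sketched with two small functions. The names to_boolean and to_number are hypothetical, and the case-insensitive matching is an assumption drawn from the "Yes" and "No" values in the examples.

```python
_TRUE = {"yes", "true", "1"}
_FALSE = {"no", "false", "0"}


def to_boolean(text):
    """Return True or False for a recognised word, otherwise None."""
    word = str(text).strip().lower()
    if word in _TRUE:
        return True
    if word in _FALSE:
        return False
    return None


def to_number(text):
    """Return an int if possible, else a float, else None."""
    for convert in (int, float):
        try:
            return convert(str(text).strip())
        except ValueError:
            continue
    return None


print(to_number("3.5") * 2)
print(to_boolean("Yes"))
print(to_number("bbb"))
```

Returning None instead of raising keeps lookups such as node.number safe to use inside conditions.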

- "get" and "xpath" return an empty node by default, so the number/boolean
attributes can still be used.

>>> bool(xml.get("foo").boolean)
False

>>> xml.xpath("Main/foo").number is None
True

- Printing MyXml objects keeps the original order and adds indentation.
The indentation is not thread safe, though.

>>> print xml.Main.List
<List>
<Item aaa="bbb" />
<Item ccc="ab+c" />
<Item>Bla Bla Bla</Item>
</List>

- Constructing a MyXml object from a complex Python object:

>>> xml = parse_object({
... "foo1": "bar",
... "foo2": ["bar1", "bar2", "bar3"],
... "foo3": {"bar": "foo"},
... "foo4": 5
... }, "Main") # "Main" is the name of the top most node

>>> xml.xpath("Main/foo4").number
5

- A node that holds the items of a sequence is named after the type of the
sequence (list, tuple, set, generator).

>>> xml.xpath("Main/foo2/list")[1] == "bar2"
True
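
The same naming rule can be reproduced with the standard library's xml.etree.ElementTree. The function build is a hypothetical sketch of the conversion, not the implementation of parse_object, and it assumes string dictionary keys and only handles list, tuple and set sequences.

```python
import xml.etree.ElementTree as ET


def build(name, obj):
    """Turn nested dicts, sequences and scalars into an Element tree."""
    element = ET.Element(name)
    if isinstance(obj, dict):
        for key, value in obj.items():
            element.append(build(key, value))
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            # Each item is wrapped in a tag named after the sequence type.
            element.append(build(type(obj).__name__, item))
    else:
        # Scalars become the text of a leaf element.
        element.text = str(obj)
    return element


root = build("Main", {"foo2": ["bar1", "bar2", "bar3"], "foo4": 5})
print(root.findall("foo2/list")[1].text)
print(root.find("foo4").text)
```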

- Finally, although it is not very useful, a MyXml object can be modified

>>> add_returns_self = xml.add(MyNode("bar5", "foo5")) # MyNode(value, name)
>>> xml.foo5.at_dict["attr"] = "attr value"
>>> xml.xpath("Main/foo5").at_attr == "attr value"
True

The other built-in dictionary and list methods can also be used, but this is
not recommended.

>>> xml # Here the order is not preserved because of the python dictionary
<Main>
<foo4>5</foo4>
<foo1>bar</foo1>
<foo2>
<list>bar1</list>
<list>bar2</list>
<list>bar3</list>
</foo2>
<foo3>
<bar>foo</bar>
</foo3>
<foo5 attr="attr value">bar5</foo5>
</Main>

Please note that this module is not efficient in parsing large XML buffers. It
uses string slicing heavily.

Erez Bibi

Please send comments and questions to
erezbibi AT users DOT sourceforge DOT net

Release history

- 0.1.2: Jan 17, 2011
- 0.1.1: Oct 4, 2010
- 0.1.0: Oct 2, 2010

Download files

Source Distribution

my_xml-0.1.2.zip (12.7 kB)

Uploaded Jan 17, 2011 Source

Built Distribution

my_xml-0.1.2-py2.6.egg (18.6 kB)

Uploaded Jan 17, 2011 Source

Hashes for my_xml-0.1.2.zip
Algorithm	Hash digest
SHA256	`929faa7798335a72daf0338678573ab2aaacc7fdf00d922e32b3562eb2bb68b9`
MD5	`6457ee5170b3b01d084f05aaa904ac6e`
BLAKE2b-256	`ed0559e46b3729162f0c67ec5cb25c535e434cfa119f4c11ebcd9437e93d6b17`

Hashes for my_xml-0.1.2-py2.6.egg
Algorithm	Hash digest
SHA256	`b4a39222f978d7edfd56f8afcd271c45fb4868c4c99e4fc90ed24beef0f10e0a`
MD5	`31be80e597b936509ea8ec818e5978a2`
BLAKE2b-256	`da4aaf8f314dce12643ebf2951fbf50245cc2590bc23be09d1035b6c417f9fad`