pyaml

PyYAML-based module to produce pretty and readable YAML-serialized data

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

pretty-yaml: Pretty YAML serialization

YAML is generally a nice and easy format to read, if it was written by humans.

IMPORTANT NOTE: I just discovered the allow_unicode=True option for Emitter (shameful, I know). It fixes a lot of issues with non-ASCII stuff, so maybe consider trying just that one first.
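
For reference, here is a minimal sketch of what that option changes, using plain PyYAML only. It assumes Python 3 and a recent PyYAML; the src value is a made-up sample that reuses strings from the examples further down.

```python
import yaml

# Sample data (an assumption for illustration), reusing strings from this README.
src = {'filter': ['Длинный стринг на русском', 'Еще одна длинная строка']}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = yaml.dump(src, default_flow_style=False)

# allow_unicode=True keeps the characters as-is in the output.
readable = yaml.dump(src, default_flow_style=False, allow_unicode=True)

print(escaped)
print(readable)
```

Both dumps should load back to the same data; only readability differs.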

Unfortunately, available serializers don’t seem to care about that aspect by default, and produce dumps that are correct but poorly formatted and less readable (the format allows for json-style crap, after all), hence this simple module.

Observe…

Let’s try default PyYAML methods first.

yaml.dump(src, sys.stdout):

destination:
  encoding:
    xz: {enabled: true, min_size: 5120, options: null, path_filter: null}
  result: {append_to_file: null, append_to_lafs_dir: null, print_to_stdout: true}
  url: http://localhost:3456/uri
filter: ["\u0414\u043B\u0438\u043D\u043D\u044B\u0439 \u0441\u0442\u0440\u0438\u043D\
    \u0433 \u043D\u0430 \u0440\u0443\u0441\u0441\u043A\u043E\u043C", "\u0415\u0449\
    \u0435 \u043E\u0434\u043D\u0430 \u0434\u043B\u0438\u043D\u043D\u0430\u044F \u0441\
    \u0442\u0440\u043E\u043A\u0430"]

Quite similar to JSON (a subset of YAML, actually).

yaml.dump(src, sys.stdout, default_flow_style=False):

destination:
  encoding:
    xz:
      enabled: true
      min_size: 5120
      options: null
      path_filter: null
  result:
    append_to_file: null
    append_to_lafs_dir: null
    print_to_stdout: true
  url: http://localhost:3456/uri
filter:
- "\u0414\u043B\u0438\u043D\u043D\u044B\u0439 \u0441\u0442\u0440\u0438\u043D\u0433\
  \ \u043D\u0430 \u0440\u0443\u0441\u0441\u043A\u043E\u043C"
- "\u0415\u0449\u0435 \u043E\u0434\u043D\u0430 \u0434\u043B\u0438\u043D\u043D\u0430\
  \u044F \u0441\u0442\u0440\u043E\u043A\u0430"

Better, but why all the “null” stuff if YAML allows values to simply be left empty? Why make all non-ASCII strings completely unreadable like that, when pretty much every parser reads UTF-8 or whatever unicode-string object by default?

pyaml.dump(src, sys.stdout):

destination:
  encoding:
    xz:
      enabled: true
      min_size: 5120
      options:
      path_filter:
  result:
    append_to_file:
    append_to_lafs_dir:
    print_to_stdout: true
  url: http://localhost:3456/uri
filter:
  - 'Длинный стринг на русском'
  - 'Еще одна длинная строка'

Note that yaml.load will parse this into the same data as the dumps above, but now you can read it as well.
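
That equivalence can be checked with a quick sketch like the one below. It uses plain PyYAML on Python 3, and the two documents are shortened, hand-written variants of the dumps above (an assumption for brevity, not actual pyaml output).

```python
import yaml

# Pretty style: empty values and unescaped non-ASCII text.
pretty = """
xz:
  enabled: true
  min_size: 5120
  options:
  path_filter:
filter:
  - 'тест1'
"""

# Default PyYAML style: explicit nulls and \u escapes (raw string keeps the backslashes).
default = r"""
xz: {enabled: true, min_size: 5120, options: null, path_filter: null}
filter: ["\u0442\u0435\u0441\u04421"]
"""

loaded_pretty = yaml.safe_load(pretty)
loaded_default = yaml.safe_load(default)
print(loaded_pretty == loaded_default)
```

An empty value is parsed as None, exactly like an explicit null.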

pyaml.pprint(data) (or just pyaml.p) should work just as well as pyaml.dump(src, sys.stdout) (and there’s pyaml.dumps() to get bytes).

Multi-line X.509 data?

yaml.dump(cert, sys.stdout):

{cert: !!python/unicode '-----BEGIN CERTIFICATE-----

    MIIDUjCCAjoCCQD0/aLLkLY/QDANBgkqhkiG9w0BAQUFADBqMRAwDgYDVQQKFAdm

    Z19jb3JlMRYwFAYDVQQHEw1ZZWthdGVyaW5idXJnMR0wGwYDVQQIExRTdmVyZGxv

    ...

Beautiful, is it not? (it is not)

pyaml.p(cert):

cert: |-
  -----BEGIN CERTIFICATE-----
  MIIDUjCCAjoCCQD0/aLLkLY/QDANBgkqhkiG9w0BAQUFADBqMRAwDgYDVQQKFAdm
  Z19jb3JlMRYwFAYDVQQHEw1ZZWthdGVyaW5idXJnMR0wGwYDVQQIExRTdmVyZGxv
  dnNrYXlhIG9ibGFzdDELMAkGA1UEBhMCUlUxEjAQBgNVBAMTCWxvY2FsaG9zdDAg
  Fw0xMzA0MjQwODUxMTRaGA8yMDUzMDQxNDA4NTExNFowajEQMA4GA1UEChQHZmdf
  Y29yZTEWMBQGA1UEBxMNWWVrYXRlcmluYnVyZzEdMBsGA1UECBMUU3ZlcmRsb3Zz
  a2F5YSBvYmxhc3QxCzAJBgNVBAYTAlJVMRIwEAYDVQQDEwlsb2NhbGhvc3QwggEi
  MA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCnZr3jbhfb5bUhORhmXOXOml8N
  fAli/ak6Yv+LRBtmOjke2gFybPZFuXYr0lYGQ4KgarN904vEg7WUbSlwwJuszJxQ
  Lz3xSDqQDqF74m1XeBYywZQIywKIbA/rfop3qiMeDWo3WavYp2kaxW28Xd/ZcsTd
  bN/eRo+Ft1bor1VPiQbkQKaOOi6K8M9a/2TK1ei2MceNbw6YrlCZe09l61RajCiz
  y5eZc96/1j436wynmqJn46hzc1gC3APjrkuYrvUNKORp8y//ye+6TX1mVbYW+M5n
  CZsIjjm9URUXf4wsacNlCHln1nwBxUe6D4e2Hxh2Oc0cocrAipxuNAa8Afn5AgMB
  AAEwDQYJKoZIhvcNAQEFBQADggEBADUHf1UXsiKCOYam9u3c0GRjg4V0TKkIeZWc
  uN59JWnpa/6RBJbykiZh8AMwdTonu02g95+13g44kjlUnK3WG5vGeUTrGv+6cnAf
  4B4XwnWTHADQxbdRLja/YXqTkZrXkd7W3Ipxdi0bDCOSi/BXSmiblyWdbNU4cHF/
  Ex4dTWeGFiTWY2upX8sa+1PuZjk/Ry+RPMLzuamvzP20mVXmKtEIfQTzz4b8+Pom
  T1gqPkNEbe2j1DciRNUOH1iuY+cL/b7JqZvvdQK34w3t9Cz7GtMWKo+g+ZRdh3+q
  2sn5m3EkrUb1hSKQbMWTbnaG4C/F3i4KVkH+8AZmR9OvOmZ+7Lo=
  -----END CERTIFICATE-----

Seems to be somewhat nicer.

Use e.g. pyaml.dump(stuff, string_val_style='|') to force all string values (but not keys) to some particular style (see tricks section below for examples).
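
To get a feel for how literal block style can be requested from plain PyYAML, here is a rough sketch. It is not how pyaml does it internally: BlockStrDumper and represent_str are hypothetical names made up for this example, the certificate body is a truncated placeholder, and Python 3 is assumed.

```python
import yaml

class BlockStrDumper(yaml.SafeDumper):
    """Hypothetical dumper name; not part of PyYAML or pyaml."""

def represent_str(dumper, data):
    # Ask for literal block style ("|") only when the string spans several lines.
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)

BlockStrDumper.add_representer(str, represent_str)

# Placeholder value: a shortened stand-in for real multi-line X.509 data.
cert = {'cert': '-----BEGIN CERTIFICATE-----\n'
                'MIIDUjCCAjoCCQD0/aLLkLY/QDANBgkqhkiG9w0BAQUFADBqMRAwDgYDVQQKFAdm\n'
                '-----END CERTIFICATE-----'}

out = yaml.dump(cert, Dumper=BlockStrDumper, default_flow_style=False)
print(out)
```

Keys are left alone here because PyYAML does not use block styles for simple mapping keys.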

Another example.

Let’s say you have a parsed URL like this:

# -*- coding: utf-8 -*-
from collections import OrderedDict

url = dict(
  path='/some/path',
  query_dump=OrderedDict([
    ('key1', 'тест1'),
    ('key2', 'тест2'),
    ('key3', 'тест3'),
    ('последний', None) ]) )

The order of keys in the query matters, because there might be a hundred of them and you’d like the output to be diff-friendly.

yaml.dump(url, sys.stdout):

path: !!python/unicode '/some/path'
query_dump: !!python/object/apply:collections.OrderedDict
- - [!!python/unicode 'key1', "\u0442\u0435\u0441\u04421"]
  - [!!python/unicode 'key2', "\u0442\u0435\u0441\u04422"]
  - [!!python/unicode 'key3', "\u0442\u0435\u0441\u04423"]
  - ["\u043F\u043E\u0441\u043B\u0435\u0434\u043D\u0438\u0439", null]

Ugh…

yaml.safe_dump(url, sys.stdout):

yaml.representer.RepresenterError: cannot represent an object: OrderedDict(...

Right… let’s try something designed to be pretty here:

>>> from pprint import pprint
>>> pprint(url)
{'path': u'/some/path',
 'query_dump': OrderedDict([(u'key1', u'\u0442\u0435\u0441\u04421'), (u'key2', u'\u0442\u0435\u0441\u04422'), (u'key3', u'\u0442\u0435\u0441\u04423'), (u'\u043f\u043e\u0441\u043b\u0435\u0434\u043d\u0438\u0439', None)])}

YUCK!

pyaml.pprint(url):

path: /some/path
query_dump:
  key1: тест1
  key2: тест2
  key3: тест3
  последний:

Much easier to read than… anything else! Diff-friendly too.
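
For comparison, here is a sketch of what it takes to get ordered, readable output for an OrderedDict from plain PyYAML. OrderedDumper and represent_ordered_dict are hypothetical names for this example, the keys are changed so that insertion order differs from sorted order, and Python 3 is assumed. This is not pyaml’s own implementation.

```python
from collections import OrderedDict
import yaml

class OrderedDumper(yaml.SafeDumper):
    """Hypothetical helper; not part of PyYAML or pyaml."""

def represent_ordered_dict(dumper, data):
    # Passing the items view (not the dict itself) keeps insertion order,
    # because PyYAML only sorts objects that have an .items() method.
    return dumper.represent_mapping('tag:yaml.org,2002:map', data.items())

OrderedDumper.add_representer(OrderedDict, represent_ordered_dict)

# Made-up keys: 'zeta' is inserted before 'alpha' on purpose.
url = dict(
    path='/some/path',
    query_dump=OrderedDict([
        ('zeta', 'тест1'),
        ('alpha', 'тест2'),
        ('последний', None)]))

out = yaml.dump(url, Dumper=OrderedDumper,
                default_flow_style=False, allow_unicode=True)
print(out)
```

The None value still comes out as an explicit null here; leaving it empty is one of the things pyaml adds on top.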

Have a long config which will produce a wall-of-text even with indentation? No problem!

pyaml.dump(src, sys.stdout, vspacing=[2, 1]):

destination:

  encoding:
    xz:
      enabled: true
      min_size: 5120
      options:
      path_filter:
        - \.(gz|bz2|t[gb]z2?|xz|lzma|7z|zip|rar)$
        - \.(rpm|deb|iso)$
        - \.(jpe?g|gif|png|mov|avi|ogg|mkv|webm|mp[34g]|flv|flac|ape|pdf|djvu)$
        - \.(sqlite3?|fossil|fsl)$
        - \.git/objects/[0-9a-f]+/[0-9a-f]+$

  result:
    append_to_file:
    append_to_lafs_dir:
    print_to_stdout: true

  url: http://localhost:3456/uri


filter:
  - /(CVS|RCS|SCCS|_darcs|\{arch\})/$
  - /\.(git|hg|bzr|svn|cvs)(/|ignore|attributes|tags)?$
  - /=(RELEASE-ID|meta-update|update)$


http:

  ca_certs_files: /etc/ssl/certs/ca-certificates.crt

  debug_requests: false

  request_pool_options:
    cachedConnectionTimeout: 600
    maxPersistentPerHost: 10
    retryAutomatically: true


logging:

  formatters:
    basic:
      datefmt: '%Y-%m-%d %H:%M:%S'
      format: '%(asctime)s :: %(name)s :: %(levelname)s: %(message)s'

  handlers:
    console:
      class: logging.StreamHandler
      formatter: basic
      level: custom
      stream: ext://sys.stderr

  loggers:
    twisted:
      handlers:
        - console
      level: 0

  root:
    handlers:
      - console
    level: custom

Hopefully, the why should be obvious now.
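
The vertical-spacing idea can be roughly approximated with plain PyYAML by dumping each top-level key on its own and joining the chunks with blank lines. This is only a sketch under that assumption (Python 3, made-up config values) and not how the vspacing option works internally; it also only handles the top level.

```python
import yaml

# Hypothetical config, loosely modelled on the example above.
config = {
    'filter': ['/=(RELEASE-ID|meta-update|update)$'],
    'http': {'debug_requests': False},
    'logging': {'root': {'level': 'custom'}},
}

# Dump every top-level section separately (each chunk ends with a newline)...
chunks = [
    yaml.safe_dump({key: value}, default_flow_style=False)
    for key, value in sorted(config.items())
]

# ...and separate the sections with two blank lines.
spaced = '\n\n'.join(chunks)
print(spaced)
```

Blank lines between mapping entries do not change the parsed result.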

Other features include properly readable (and working) object deduplication links, by the grace of unidecode module transliteration, and an option (the “force_embed” keyword) to disable deduplication entirely (imagine reading through a dump riddled with such “jump there” links: none of that!).
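
To show what such deduplication links look like, and one way to switch them off in plain PyYAML, here is a small sketch. NoAliasDumper is a hypothetical name for this example, Python 3 is assumed, and this is the stock PyYAML hook rather than pyaml’s force_embed implementation.

```python
import yaml

class NoAliasDumper(yaml.SafeDumper):
    """Hypothetical dumper that never emits anchors or aliases."""
    def ignore_aliases(self, data):
        # Returning True for everything embeds repeated objects in full.
        return True

# The same dict object is referenced twice.
shared = {'enabled': True}
data = {'first': shared, 'second': shared}

# Default: the second occurrence becomes a "jump there" alias.
linked = yaml.safe_dump(data, default_flow_style=False)

# With aliases ignored, the object is simply written out twice.
embedded = yaml.dump(data, Dumper=NoAliasDumper, default_flow_style=False)

print(linked)
print(embedded)
```

Note that a dumper like this cannot handle recursive structures, since there is no way to embed an object inside itself.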

Obligatory warning

Note that the prime concern of this module is to chew simple stuff gracefully. Internally, some nasty hacks (that I’m not proud of) are used to do that, and they may not work with more complex serialization cases, possibly even producing non-deserializable (but fixable) output.

Again, the prime goal is not to serialize, say, gigabytes of complex document-storage db contents, but rather individual, simple, human-parseable documents. Please keep that in mind (and of course, patches for hacks are welcome!).

Other Tricks

Pretty-print any yaml or json (yaml subset) file from the shell:

python -m pyaml /path/to/some/file.yaml
curl -s https://status.github.com/api.json | python -m pyaml

Easier “debug printf” for more complex data (all funcs below are aliases to same thing):

pyaml.p(stuff)
pyaml.pprint(my_data)
pyaml.pprint('----- HOW DOES THAT BREAKS!?!?', input_data, some_var, more_stuff)
pyaml.print(data, file=sys.stderr) # needs "from __future__ import print_function"

Force all string values to a certain style (see info on these in PyYAML docs):

pyaml.dump(many_weird_strings, string_val_style='|')
pyaml.dump(multiline_words, string_val_style='>')
pyaml.dump(no_want_quotes, string_val_style='plain')
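
For a quick look at the underlying YAML scalar styles, the sketch below dumps the same string with plain PyYAML’s default_style keyword. It assumes Python 3 and a made-up sample value. Unlike string_val_style, default_style applies to every scalar, so the sample is a list without any keys.

```python
import yaml

# Made-up sample: a single multi-line string inside a list.
text = ['first line\nsecond line']

# "|" is literal style, ">" is folded style, '"' is double-quoted style.
literal = yaml.safe_dump(text, default_style='|', default_flow_style=False)
folded = yaml.safe_dump(text, default_style='>', default_flow_style=False)
quoted = yaml.safe_dump(text, default_style='"', default_flow_style=False)

for dumped in (literal, folded, quoted):
    print(dumped)
```

All three variants should load back to the same string; only the presentation differs.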

Using pyaml.add_representer() (note pyaml) as suggested in this SO thread (or #7) should also work.

Installation

It’s a regular package for Python 2.7 (not 3.X).

Using pip is the best way:

% pip install pyaml

If you don’t have it, use:

% easy_install pip
% pip install pyaml

Alternatively (see also):

% curl https://raw.github.com/pypa/pip/master/contrib/get-pip.py | python
% pip install pyaml

Or, if you absolutely must:

% easy_install pyaml

But, you really shouldn’t do that.

Current-git version can be installed like this:

% pip install 'git+https://github.com/mk-fg/pretty-yaml.git#egg=pyaml'

The module uses PyYAML to process the actual YAML files and should pull it in as a dependency.

The dependency on the unidecode module is optional and should only be needed if same-id objects or recursion are used within the serialized data.

Project details

Release history

24.4.0

Apr 17, 2024

23.12.0

Dec 25, 2023

23.9.7

Sep 29, 2023

23.9.6

Sep 13, 2023

23.9.5

Sep 9, 2023

23.9.4

Sep 9, 2023

23.9.3

Sep 6, 2023

23.9.2

Sep 5, 2023

23.9.1

Sep 3, 2023

23.9.0

Sep 3, 2023

23.7.0

Jul 6, 2023

23.5.9

May 11, 2023

23.5.8

May 6, 2023

23.5.7

May 5, 2023

23.5.6

May 5, 2023

23.5.5

May 5, 2023

21.10.1

Oct 9, 2021

21.8.3

Aug 8, 2021

21.8.2

Aug 8, 2021

20.4.0

Apr 2, 2020

20.3.1

Mar 9, 2020

20.3.0

Mar 9, 2020

19.12.0

Dec 7, 2019

19.4.1

Apr 17, 2019

19.4.0

Apr 17, 2019

18.11.0

Nov 19, 2018

17.12.1

Dec 23, 2017

17.12.0

Dec 23, 2017

17.10.0

Oct 8, 2017

17.8.0

Aug 17, 2017

17.7.2

Jul 28, 2017

16.12.2

Dec 11, 2016

16.12.1

Dec 8, 2016

16.12.0

Dec 8, 2016

16.11.4

Nov 12, 2016

16.11.3

Nov 12, 2016

16.11.0

Nov 2, 2016

16.9.0

Sep 10, 2016

15.8.2

Aug 30, 2015

15.8.0

Aug 30, 2015

15.6.3

Jun 29, 2015

15.6.2

Jun 29, 2015

15.5.7

May 19, 2015

15.5.6

May 19, 2015

15.5.5

May 19, 2015

15.5.4

May 19, 2015

15.5.3

May 19, 2015

15.5.2

May 4, 2015

15.5.1

May 4, 2015

15.5.0

May 2, 2015

15.4.0

Apr 27, 2015

15.03.1

Mar 20, 2015

15.03.0

Mar 20, 2015

This version

15.02.1

Feb 15, 2015

15.02.0

Feb 15, 2015

14.12.10

Dec 3, 2014

14.11.3

Nov 10, 2014

14.11.2

Nov 10, 2014

14.05.7

May 28, 2014

14.05.6

May 20, 2014

14.05.5

May 20, 2014

14.05.3

May 20, 2014

14.05.2

May 6, 2014

14.04.3

Apr 8, 2014

14.04.2

Apr 8, 2014

13.12.0

Dec 20, 2013

13.07.1

Jul 29, 2013

13.07.0

Jul 3, 2013

13.05.2

May 22, 2013

13.01.0

Jan 17, 2013

12.12.5

Dec 14, 2012

12.12.4

Dec 14, 2012

12.12.3

Dec 14, 2012

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyaml-15.02.1.tar.gz (14.3 kB)

Uploaded Feb 15, 2015 Source

Hashes for pyaml-15.02.1.tar.gz

Algorithm	Hash digest
SHA256	`8dfe1b295116115695752acc84d15ecf5c1c469975fbed7672bf41a6bc6d6d51`
MD5	`e98cf27f50b9ca291ca4937c135db1c9`
BLAKE2b-256	`231ed0680beac4329c757b7223431498a0c7d5e09a4cefdd74eabeae3ce3858a`