
Toolbox of general-purpose Python code.


Like many programmers, I have developed a toolbox of utilities that I like to have close at hand.

For more information: http://pypi.python.org/pypi/gsn_util/

For the source code: http://launchpad.net/gsn-util

Installation

You can use any of the following standard incantations:

  • pip install gsn_util

  • easy_install gsn_util

  • python setup.py install

If you want to install in your home directory, you can add the --user flag to any of the above.

Usage

There are many tidbits in this file. Most are self-explanatory, some I think are rather clever, and others are lifted from other sources (always with credit in the docstring). If a bit of code doesn’t say “this is from …” in the docstring, then I wrote it myself.

A few of the highlights:

  • def memoize(f)

    For any function f, return a caching version of f. If the function is called more than once with the same arguments, every call after the first returns the cached result instantly.

    >>> long_running_function(1.1) # takes a long time
    >>> f = memoize(long_running_function)
    >>> f(2.2)  # takes a long time, too
    >>> f(3.3)  # Also takes a long time
    >>> f(2.2)  # Instantaneous (using the previously cached result)
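
    A minimal sketch of the idea (the version shipped in gsn_util may differ in details, e.g. how it handles keyword arguments or unhashable arguments):

    def memoize(f):
        """Return a version of f that caches results keyed by its arguments."""
        cache = {}
        def wrapper(*args):
            if args not in cache:
                cache[args] = f(*args)   # first call with these args: compute
            return cache[args]           # later calls: instant lookup
        return wrapper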
    
  • def forkify(f)

    Return a function that forks and calls f in a separate process.

    I found this useful for long-running Python processes that handle a lot of data and eventually run out of memory. In spite of my best efforts at making sure no dangling references were hanging around, the most robust solution was to just fork and let the operating system handle de-allocation.

    So if memory_intensive_function() uses a lot of memory but produces small results, then this will prevent out-of-memory problems:

    >>> f = forkify(memory_intensive_function)
    >>> [f(ii) for ii in huge_list]
    

    Exceptions raised in the child process are caught and re-raised in the parent process.
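
    A rough sketch of how such a wrapper can work on a Unix-like system, using os.fork and a pipe; the actual gsn_util implementation may handle details (large results, non-picklable exceptions) differently:

    import os
    import pickle

    def forkify(f):
        """Return a version of f that runs in a forked child process (Unix only)."""
        def wrapper(*args, **kwargs):
            read_fd, write_fd = os.pipe()
            pid = os.fork()
            if pid == 0:
                # Child: run f, send back (success_flag, payload), then exit.
                os.close(read_fd)
                try:
                    payload = (True, f(*args, **kwargs))
                except Exception as exc:
                    payload = (False, exc)
                with os.fdopen(write_fd, 'wb') as pipe:
                    pickle.dump(payload, pipe)
                os._exit(0)
            # Parent: read the pickled result, reap the child, then
            # return the value or re-raise the exception.
            os.close(write_fd)
            with os.fdopen(read_fd, 'rb') as pipe:
                ok, payload = pickle.load(pipe)
            os.waitpid(pid, 0)
            if ok:
                return payload
            raise payload
        return wrapper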

  • class SnooperMixin(object):

    Snoop on how an object is being used.

    Suppose you pass an object into some function and want to know what properties of your object the function is using/depending on. Normally you do this:

    >>> obj = SomeObject()
    >>> opaque_function(obj)
    

    Instead you do the following. Note that there’s no body to the definition of SnoopedObject.

    >>> class SnoopedObject(SnooperMixin, SomeObject): pass
    >>> obj = SnoopedObject()
    >>> opaque_function(obj)
    >>> obj.snoop
    set(['readlines', 'next'])
    

    So you know that opaque_function accessed/used the methods/data called readlines and next.

    This knowledge, of course, exposes the implementation details of opaque_function() and you probably shouldn’t write code that depends on those details… On the other hand, such knowledge can be very illuminating.

    The name Mixin comes from the old CLOS (Common Lisp Object System) notion of an object that’s not itself a fully specified, useful object, but is something that’s added to other objects to give them specific functionality.
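
    A minimal sketch of the mixin, assuming that recording the accessed attribute names in a set is all that is wanted (the gsn_util version may record more):

    class SnooperMixin(object):
        """Record the name of every attribute accessed on an instance."""
        def __getattribute__(self, name):
            # Do the real lookup first, then note that the name was asked for.
            value = object.__getattribute__(self, name)
            if name != 'snoop':
                try:
                    object.__getattribute__(self, 'snoop').add(name)
                except AttributeError:
                    object.__setattr__(self, 'snoop', set([name]))
            return value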

  • class DotDict(dict)

    Behaves like a dictionary, but also allows its entries to be read and written with attribute (dot) syntax.

    I use this as a container when I want the container to behave exactly like a dictionary, but get tired of typing foo['bar'] and want to just type foo.bar instead.

    Specifically, I use it to hold data from simulation snapshots. If my simulation has a field called “density”, I’m sure not going to type sim['density'] every time I want to do anything. This object allows me to refer to it as sim.density instead.

    >>> foo = DotDict()
    >>> foo.density = read_from_file()
    >>> plot(foo.density)
    

    Accessing fields like a dict also works:

    >>> for kk in foo.keys(): ensure_no_nans(foo[kk])
    

    “But that’s not very object-oriented; you should define a SimulationData object that has density as an attribute,” you may say. Well… that’s what I’ve done. I want the SimulationData object to have the same things that dict objects have, the keys() method for example. As long as you don’t have a simulation data field whose name conflicts with one of the dict methods, this causes no problems.
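
    The whole class can be as small as this sketch (gsn_util’s version may add a bit more, such as attribute deletion):

    class DotDict(dict):
        """A dict whose entries can also be read and written as attributes."""
        def __getattr__(self, name):
            # Only called when normal attribute lookup fails, so the
            # ordinary dict methods (keys, items, ...) are unaffected.
            try:
                return self[name]
            except KeyError:
                raise AttributeError(name)
        def __setattr__(self, name, value):
            self[name] = value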

  • List manipulation, including:

    • def cross_set(*sets):

      Given lists, generate all possibilities with the first element chosen from the first list, the second element chosen from the second, etc. Note that this handles an arbitrary number of sets from which to draw.

      >>> cross_set([1], [2,3])
      [[1,2], [1,3]]
      
    • def combinations(lst, n):

      Generate all combinations of n items from lst.

      >>> combinations([1,2,3], 2)
      [[1,2], [1,3], [2,3]]
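
    Both of these have close analogues in the standard library’s itertools module; a sketch of both helpers in those terms, assuming list-of-lists output is what is wanted (gsn_util may implement them differently):

    import itertools

    def cross_set(*sets):
        """All ways to pick one element from each of the given lists."""
        return [list(choice) for choice in itertools.product(*sets)]

    def combinations(lst, n):
        """All ways to choose n elements from lst, preserving order."""
        return [list(combo) for combo in itertools.combinations(lst, n)]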
      
  • Dict manipulation, including:

    • def map_dict_tree(f, d):

      Map f over an arbitrarily nested dict of dicts of dicts… The recursion stops when a non-dict value is encountered.

      >>> obj = dict(a=1, b=dict(c=2, d=dict(e=3, f=4)))
      >>> obj
      {'a': 1, 'b': {'c': 2, 'd': {'e': 3, 'f': 4}}}
      >>> map_dict_tree(lambda x: x+2, obj)
      {'a': 3, 'b': {'c': 4, 'd': {'e': 5, 'f': 6}}}
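
      A recursive sketch that matches the example above (the shipped function may differ, e.g. in whether it returns a new dict or modifies in place):

      def map_dict_tree(f, d):
          """Apply f to every non-dict leaf value, returning a new nested dict."""
          return dict((k, map_dict_tree(f, v) if isinstance(v, dict) else f(v))
                      for k, v in d.items())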
      
  • Convenient keyword argument list manipulation:

    • def given(*args):

      Return True if all of the arguments are not None.

      Intended for use in argument lists where you can reasonably specify different combinations of parameters. Then you can write:

      def foo(a=None, b=None, c=None):
          if given(a, b):
              ...  # do something with a and b
          elif given(a, c):
              ...  # do something else with a and c
    • def pop_keys(d, *names):

      Pull some keywords from dict d if they exist.

      I use this to help with argument processing when I have lots of keyword arguments floating around. The typical use is something like:

      def foo(**kw):
          kw1 = pop_keys(kw, 'args', 'for', 'bar')
          bar(**kw1)
          other_function(**kw)  # kw doesn't contain the popped keywords anymore

      Thus neither bar() nor other_function() get keyword arguments that they don’t expect. In addition, if the caller doesn’t specify an argument, it doesn’t show up in the arg list for the calls to bar or other_function, so that the default values are used.

    • def dict_union(*ds, **kw):

      Combine several dicts and keywords into one dict. I use this for argument processing where I want to set defaults in several places, sometimes overriding values. The common case is something like:

      values = dict_union(global_defaults, local_defaults, key1=val1,
                          key2=val2)

      where global_defaults and local_defaults are dicts, values in local_defaults override those in global_defaults, and key1 and key2 override anything in either dict.
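
    Minimal sketches of all three helpers, consistent with the usage shown above (the shipped versions may be slightly more general):

    def given(*args):
        """True if every argument is not None."""
        return all(arg is not None for arg in args)

    def pop_keys(d, *names):
        """Remove the named keys from d (if present) and return them as a new dict."""
        return dict((name, d.pop(name)) for name in names if name in d)

    def dict_union(*ds, **kw):
        """Merge several dicts; later dicts and keyword arguments override earlier ones."""
        result = {}
        for d in ds:
            result.update(d)
        result.update(kw)
        return result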

  • Composition of function predicates:

    • def f_or(*fs)

    • def f_and(*fs)

    • def f_not(f)

    The idea is to compose functions using logical operators to make compound predicates. That is, if you have functions blue(obj) and green(obj) that return True or False depending on whether the object is blue or green, you can write:

    blue_or_green = f_or(blue, green)
    if blue_or_green(obj):
        ...  # do something
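
    A sketch of the obvious implementation (the gsn_util versions may differ, e.g. in whether they forward keyword arguments):

    def f_or(*fs):
        """Predicate that is True if any of the given predicates is True."""
        return lambda *a, **kw: any(f(*a, **kw) for f in fs)

    def f_and(*fs):
        """Predicate that is True only if all of the given predicates are True."""
        return lambda *a, **kw: all(f(*a, **kw) for f in fs)

    def f_not(f):
        """Predicate that inverts f."""
        return lambda *a, **kw: not f(*a, **kw)
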
  • Concise syntax for pickling objects:

    Pickling is great, but I do a lot of interactive data analysis, so I want syntax for object persistence that’s one line and as few characters as possible.

    >>> can([1,2,3], 'file.dat')
    >>> obj = uncan('file.dat')
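
    In spirit these are thin wrappers around pickle; a sketch (the shipped functions may choose a different protocol or add options):

    import pickle

    def can(obj, filename):
        """Pickle obj to the named file."""
        with open(filename, 'wb') as fh:
            pickle.dump(obj, fh, pickle.HIGHEST_PROTOCOL)

    def uncan(filename):
        """Load and return the object pickled in the named file."""
        with open(filename, 'rb') as fh:
            return pickle.load(fh)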
    
  • def timer(f, *a, **kw):

    Provide reasonably reliable time estimates for a function.

    Run the function once. If the run time is less than timer_tmin, run the function timer_factor more times. Repeat until timer_tmin is surpassed. If timer_verbose is set, print what’s going on to stdout.

    >>> square = lambda x: x**2
    >>> timer(square, 5, timer_tmin=2.0, timer_factor=3, timer_verbose=True)
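
    A sketch of the repeat-until-long-enough logic described above; the default values below are assumptions, and the sketch multiplies the number of calls by timer_factor each round, which is one reading of “timer_factor more times”:

    import time

    def timer(f, *args, **kw):
        """Estimate the per-call run time of f(*args); remaining kw goes to f."""
        tmin = kw.pop('timer_tmin', 1.0)        # assumed default
        factor = kw.pop('timer_factor', 10)     # assumed default
        verbose = kw.pop('timer_verbose', False)
        ncalls = 1
        while True:
            start = time.time()
            for _ in range(ncalls):
                f(*args, **kw)
            elapsed = time.time() - start
            if verbose:
                print("%d call(s) took %.4f s" % (ncalls, elapsed))
            if elapsed > tmin:
                return elapsed / ncalls
            ncalls *= factor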
    
  • def import_graph(with_system=True, out_file=sys.stdout,
                     excludes=None, exclude_regexps=None)

    Construct a graph of which python modules import which others, suitable for consumption by graphviz (http://www.graphviz.org).

    This only works on Python files in the current directory. It’s intended to be helpful if you want to reduce dependencies among the Python files in the current directory.

    >>> import_graph(out_file='imports.dot')
    # At the Unix shell prompt:
    [novak@thalia ~]$ dot -Tpng imports.dot > imports.png
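
    A rough sketch of the idea using only the standard library; module discovery is simplified and the excludes/exclude_regexps options are omitted, so treat this as an illustration rather than the shipped implementation:

    import ast
    import glob
    import os
    import sys

    def import_graph(out_file=sys.stdout, with_system=True):
        """Write a graphviz 'dot' digraph of imports among the .py files here."""
        paths = glob.glob('*.py')
        local = set(os.path.splitext(p)[0] for p in paths)
        close_when_done = isinstance(out_file, str)
        if close_when_done:
            out_file = open(out_file, 'w')
        out_file.write('digraph imports {\n')
        for path in paths:
            this = os.path.splitext(path)[0]
            with open(path) as fh:
                tree = ast.parse(fh.read(), filename=path)
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    targets = [alias.name.split('.')[0] for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    targets = [node.module.split('.')[0]]
                else:
                    continue
                for target in targets:
                    if with_system or target in local:
                        out_file.write('  "%s" -> "%s";\n' % (this, target))
        out_file.write('}\n')
        if close_when_done:
            out_file.close()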
    

Version Information

gsn_util passes all tests with Python 2.5 through 2.7.

When translated to Python 3 via the 2to3 script, gsn_util passes all tests on Python 3.1, 3.2, and 3.3.

License

The code is released under the MIT license, so you should be able to do whatever you want with it.

If you incorporate this code into a larger project, I would appreciate it if you sent me a note at greg.novak@gmail.com.
