urllib3 1.7

HTTP library with thread-safe connection pooling, file post, and more.

Latest Version: 1.8.2

Highlights

  • Re-use the same socket connection for multiple requests (HTTPConnectionPool and HTTPSConnectionPool), with optional client-side certificate verification.
  • File posting (encode_multipart_formdata).
  • Built-in redirection and retries (optional).
  • Supports gzip and deflate decoding.
  • Thread-safe and sanity-safe.
  • Works with AppEngine, gevent, and eventlib.
  • Tested on Python 2.6+ and Python 3.3+, with 100% unit test coverage.
  • Small, easy-to-understand codebase, perfect for extending and building upon. For a more comprehensive solution, have a look at Requests, which is also powered by urllib3.
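
To give a feel for what the file posting highlight involves, here is a minimal standard-library sketch of multipart/form-data encoding. It is not urllib3's encode_multipart_formdata; the function name encode_multipart and its argument shapes are assumptions made for illustration only.

```python
import uuid


def encode_multipart(fields, boundary=None):
    """Encode a dict of fields as multipart/form-data (illustrative sketch).

    Each value is either a str/bytes payload or a (filename, data) tuple.
    Returns (body_bytes, content_type_header_value).
    """
    if boundary is None:
        # A random boundary that is very unlikely to appear in the payload.
        boundary = uuid.uuid4().hex
    marker = b"--" + boundary.encode("ascii")

    lines = []
    for name, value in fields.items():
        if isinstance(value, tuple):
            filename, data = value
            disposition = 'form-data; name="%s"; filename="%s"' % (name, filename)
        else:
            data = value
            disposition = 'form-data; name="%s"' % name
        lines.append(marker)
        lines.append(("Content-Disposition: " + disposition).encode("utf-8"))
        lines.append(b"")  # blank line separates part headers from part body
        lines.append(data if isinstance(data, bytes) else data.encode("utf-8"))

    lines.append(marker + b"--")  # closing boundary
    lines.append(b"")
    return b"\r\n".join(lines), "multipart/form-data; boundary=" + boundary
```

The returned content type has to be sent as the Content-Type request header so the server knows which boundary separates the parts.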

What's wrong with urllib and urllib2?

There are two critical features missing from the Python standard library: Connection re-using/pooling and file posting. It's not terribly hard to implement these yourself, but it's much easier to use a module that already did the work for you.

The Python standard libraries urllib and urllib2 have little to do with each other. They were designed to be independent and standalone, each solving a different scope of problems, and urllib3 follows in a similar vein.

Why do I want to reuse connections?

Performance. Normally, each urllib call opens a separate socket connection. By reusing existing sockets (supported since HTTP 1.1), requests consume fewer resources on the server's end and get a faster response at the client's end. In some simple benchmarks (see test/benchmark.py), downloading 15 URLs from google.com is about twice as fast with HTTPConnectionPool (which uses 1 connection) as with plain urllib (which uses 15 connections).
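
The difference in connection count can be observed without urllib3 at all. The sketch below is an assumption-laden illustration using only the standard library: it starts a throwaway local HTTP/1.1 server, then counts how many TCP connections the server accepts when one http.client connection is reused versus when a new one is opened per request. The names Handler and count_connections are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

connections_seen = []  # one entry per TCP connection the server accepts


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive between requests

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        connections_seen.append(self.client_address)

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def count_connections(reuse, n=5):
    """Make n GET requests and return how many connections the server saw."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    del connections_seen[:]
    try:
        if reuse:
            conn = http.client.HTTPConnection(host, port)
            for _ in range(n):
                conn.request("GET", "/")
                conn.getresponse().read()
            conn.close()
        else:
            for _ in range(n):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/")
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(connections_seen)
```

Calling count_connections(reuse=True) and count_connections(reuse=False) shows the same number of requests being served over one socket versus several; each extra socket costs a TCP handshake, which is where the speedup comes from.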

This library is perfect for:

  • Talking to an API
  • Crawling a website
  • Any situation where being able to post files, handle redirection, and retry requests is useful. It's relatively lightweight, so it can be used for anything!

Examples

Go to urllib3.readthedocs.org for more nice syntax-highlighted examples.

But, long story short:

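from __future__ import print_function  # print() works the same on Python 2 and 3

import urllib3

# A single PoolManager can be shared by every request in the application.
http = urllib3.PoolManager()

r = http.request('GET', 'http://google.com/')

print(r.status, r.data)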

The PoolManager will take care of reusing connections for you whenever you request the same host. For more fine-grained control of your connection pools, you should look at ConnectionPool.
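
The idea of reusing connections per host can be sketched with the standard library alone. The class below is a hypothetical stand-in, not urllib3's PoolManager: TinyPoolManager and connection_for are invented names, and the sketch keeps just one lazily-opened http.client connection per (scheme, host, port) key.

```python
import http.client
from urllib.parse import urlsplit


class TinyPoolManager:
    """Hypothetical illustration: one connection object per scheme/host/port."""

    DEFAULT_PORTS = {"http": 80, "https": 443}

    def __init__(self):
        self.connections = {}

    def connection_for(self, url):
        parts = urlsplit(url)
        port = parts.port or self.DEFAULT_PORTS[parts.scheme]
        key = (parts.scheme, parts.hostname, port)
        if key not in self.connections:
            if parts.scheme == "https":
                cls = http.client.HTTPSConnection
            else:
                cls = http.client.HTTPConnection
            # The socket is only opened on the first request, not here.
            self.connections[key] = cls(parts.hostname, port)
        return self.connections[key]
```

Two URLs on the same host and port map to the same connection object, while a different scheme gets its own entry. A real pool manager also has to handle thread safety, eviction, and broken connections, which this sketch leaves out.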

Run the tests

We use some external dependencies to run the urllib3 test suite. The easiest way to run the tests is from the urllib3 source root:

$ pip install -r test-requirements.txt
$ nosetests
.....................................................

Success! You could also pip install coverage to get code coverage reporting.

Contributing

  1. Check for open issues or open a fresh issue to start a discussion around a feature idea or a bug. There is a Contributor Friendly tag for issues that should be ideal for people who are not very familiar with the codebase yet.
  2. Fork the urllib3 repository on GitHub to start making your changes.
  3. Write a test which shows that the bug was fixed or that the feature works as expected.
  4. Send a pull request and bug the maintainer until it gets merged and published. :) Make sure to add yourself to CONTRIBUTORS.txt.

Changes

1.7 (2013-08-14)

  • More exceptions are now pickle-able, with tests. (Issue #174)
  • Fixed redirecting with relative URLs in Location header. (Issue #178)
  • Support for relative urls in Location: ... header. (Issue #179)
  • urllib3.response.HTTPResponse now inherits from io.IOBase for bonus file-like functionality. (Issue #187)
  • Passing assert_hostname=False when creating an HTTPSConnectionPool will skip hostname verification for SSL connections. (Issue #194)
  • New method urllib3.response.HTTPResponse.stream(...) which acts as a generator wrapped around .read(...). (Issue #198)
  • IPv6 url parsing enforces brackets around the hostname. (Issue #199)
  • Fixed thread race condition in urllib3.poolmanager.PoolManager.connection_from_host(...) (Issue #204)
  • ProxyManager requests now include non-default port in Host: ... header. (Issue #217)
  • Added HTTPS proxy support in ProxyManager. (Issue #170 #139)
  • New RequestField object can be passed to the fields=... param which can specify headers. (Issue #220)
  • Raise urllib3.exceptions.ProxyError when connecting to proxy fails. (Issue #221)
  • Use international headers when posting file names. (Issue #119)
  • Improved IPv6 support. (Issue #203)
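
The changelog entry about stream(...) describes a generator wrapped around .read(...). As a rough sketch of that pattern (not the library's actual code), the hypothetical helper stream_chunks below does the same for any file-like object:

```python
import io


def stream_chunks(fp, amt=1024):
    """Yield successive chunks of at most `amt` bytes until fp is exhausted."""
    while True:
        chunk = fp.read(amt)
        if not chunk:
            break
        yield chunk


# Usage with an in-memory file standing in for a response body.
for piece in stream_chunks(io.BytesIO(b"abcdefg"), amt=3):
    print(piece)
```

Iterating this way keeps memory use bounded by the chunk size rather than by the size of the whole body.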

1.6 (2013-04-25)

  • Contrib: Optional SNI support for Py2 using PyOpenSSL. (Issue #156)
  • ProxyManager automatically adds Host: ... header if not given.
  • Improved SSL-related code. cert_reqs now optionally takes a string like "REQUIRED" or "NONE". Likewise, ssl_version takes strings like "SSLv23". The string values reflect the suffix of the respective constant variable. (Issue #130)
  • Vendored socksipy now based on Anorov's fork which handles unexpectedly closed proxy connections and larger read buffers. (Issue #135)
  • Ensure the connection is closed if no data is received, fixes connection leak on some platforms. (Issue #133)
  • Added SNI support for SSL/TLS connections on Py32+. (Issue #89)
  • Tests fixed to be compatible with Py26 again. (Issue #125)
  • Added ability to choose SSL version by passing an ssl.PROTOCOL_* constant to the ssl_version parameter of HTTPSConnectionPool. (Issue #109)
  • Allow an explicit content type to be specified when encoding file fields. (Issue #126)
  • Exceptions are now pickleable, with tests. (Issue #101)
  • Fixed default headers not getting passed in some cases. (Issue #99)
  • Treat "content-encoding" header value as case-insensitive, per RFC 2616 Section 3.5. (Issue #110)
  • "Connection Refused" SocketErrors will get retried rather than raised. (Issue #92)
  • Updated vendored six, no longer overrides the global six module namespace. (Issue #113)
  • urllib3.exceptions.MaxRetryError contains a reason property holding the exception that prompted the final retry. If reason is None then it was due to a redirect. (Issue #92, #114)
  • Fixed PoolManager.urlopen() not redirecting more than once. (Issue #149)
  • Don't assume Content-Type: text/plain for multi-part encoding parameters that are not files. (Issue #111)
  • Pass strict param down to httplib.HTTPConnection. (Issue #122)
  • Added mechanism to verify SSL certificates by fingerprint (md5, sha1) or against an arbitrary hostname (when connecting by IP or for misconfigured servers). (Issue #140)
  • Streaming decompression support. (Issue #159)
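
Streaming decompression means decoding a compressed body chunk by chunk instead of buffering it whole. The following is a minimal sketch of that technique using zlib from the standard library, assuming a gzip-encoded body; iter_decompressed is a hypothetical name and this is not urllib3's implementation.

```python
import zlib


def iter_decompressed(chunks):
    """Decompress an iterable of gzip-encoded byte chunks incrementally."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    tail = decoder.flush()  # emit anything still buffered in the decoder
    if tail:
        yield tail
```

Because the decoder object carries its state between calls, the chunk boundaries on the wire do not need to line up with anything in the compressed stream.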

1.5 (2012-08-02)

  • Added urllib3.add_stderr_logger() for quickly enabling STDERR debug logging in urllib3.
  • Native full URL parsing (including auth, path, query, fragment) available in urllib3.util.parse_url(url).
  • Built-in redirect will switch method to 'GET' if status code is 303. (Issue #11)
  • urllib3.PoolManager strips the scheme and host before sending the request uri. (Issue #8)
  • New urllib3.exceptions.DecodeError exception for when automatic decoding, based on the Content-Encoding header, fails.
  • Fixed bug with pool depletion and leaking connections (Issue #76). Added explicit connection closing on pool eviction. Added urllib3.PoolManager.clear().
  • 99% -> 100% unit test coverage.
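
To illustrate what "full URL parsing (including auth, path, query, fragment)" covers, here is a standard-library sketch built on urllib.parse.urlsplit. It is not urllib3.util.parse_url; the helper split_url and the dictionary keys it returns are assumptions chosen for this example.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into its components (illustrative, stdlib-based)."""
    parts = urlsplit(url)
    # Everything before the last "@" in the network location is the auth part.
    auth = parts.netloc.rpartition("@")[0] or None
    return {
        "scheme": parts.scheme,
        "auth": auth,
        "host": parts.hostname,  # brackets around IPv6 literals are removed
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }
```

For a URL such as http://user:pw@example.com:8080/path?q=1#frag, each of the seven components comes back as its own entry, which is handy when the request line should carry only the path and query.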

1.4 (2012-06-16)

  • Minor AppEngine-related fixes.
  • Switched from mimetools.choose_boundary to uuid.uuid4().
  • Improved url parsing. (Issue #73)
  • IPv6 url support. (Issue #72)

1.3 (2012-03-25)

  • Removed pre-1.0 deprecated API.
  • Refactored helpers into a urllib3.util submodule.
  • Fixed multipart encoding to support list-of-tuples for keys with multiple values. (Issue #48)
  • Fixed multiple Set-Cookie headers in response not getting merged properly in Python 3. (Issue #53)
  • AppEngine support with Py27. (Issue #61)
  • Minor encode_multipart_formdata fixes related to Python 3 strings vs bytes.

1.2.2 (2012-02-06)

  • Fixed packaging bug of not shipping test-requirements.txt. (Issue #47)

1.2.1 (2012-02-05)

  • Fixed another bug related to when ssl module is not available. (Issue #41)
  • Location parsing errors now raise urllib3.exceptions.LocationParseError which inherits from ValueError.

1.2 (2012-01-29)

  • Added Python 3 support (tested on 3.2.2)
  • Dropped Python 2.5 support (tested on 2.6.7, 2.7.2)
  • Use select.poll instead of select.select for platforms that support it.
  • Use Queue.LifoQueue instead of Queue.Queue for more aggressive connection reusing. Configurable by overriding ConnectionPool.QueueCls.
  • Fixed ImportError during install when ssl module is not available. (Issue #41)
  • Fixed PoolManager redirects between schemes (such as HTTP -> HTTPS) not completing properly. (Issue #28, uncovered by Issue #10 in v1.1)
  • Ported dummyserver to use tornado instead of webob + eventlet. Removed extraneous unsupported dummyserver testing backends. Added socket-level tests.
  • More tests. Achievement Unlocked: 99% Coverage.
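
Why a LIFO queue leads to more aggressive connection reuse can be shown with the standard library's queue module. The sketch below is only a simulation with string placeholders instead of sockets; reuse_order is a hypothetical helper, not part of urllib3.

```python
import queue


def reuse_order(queue_cls, n=3):
    """Check out and return a connection n times; report which one was used."""
    pool = queue_cls()
    for i in range(n):
        pool.put("conn-%d" % i)

    used = []
    for _ in range(n):
        conn = pool.get()   # take a connection for one request
        used.append(conn)
        pool.put(conn)      # hand it back when the request is done
    return used


print(reuse_order(queue.LifoQueue))  # the same connection every time
print(reuse_order(queue.Queue))      # cycles through all connections
```

With LIFO ordering the most recently returned connection is handed out again, so under light load a single warm socket serves every request and the idle ones can time out. FIFO ordering spreads requests across all connections, keeping each of them barely used.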

1.1 (2012-01-07)

  • Refactored dummyserver to its own root namespace module (used for testing).
  • Added hostname verification for VerifiedHTTPSConnection by vendoring in Py32's ssl_match_hostname. (Issue #25)
  • Fixed cross-host HTTP redirects when using PoolManager. (Issue #10)
  • Fixed decode_content being ignored when set through urlopen. (Issue #27)
  • Fixed timeout-related bugs. (Issues #17, #23)

1.0.2 (2011-11-04)

  • Fixed typo in VerifiedHTTPSConnection which would only present as a bug if you're using the object manually. (Thanks pyos)
  • Made RecentlyUsedContainer (and consequently PoolManager) more thread-safe by wrapping the access log in a mutex. (Thanks @christer)
  • Made RecentlyUsedContainer more dict-like (corrected __delitem__ and __getitem__ behaviour), with tests. Shouldn't affect core urllib3 code.
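
The combination described above (a dict-like container with least-recently-used eviction, guarded by a mutex) can be sketched as follows. This is an illustration under assumptions, not the library's RecentlyUsedContainer: the class name RecentlyUsed, the maxsize argument, and the dispose callback are invented for the example.

```python
import threading
from collections import OrderedDict


class RecentlyUsed:
    """Dict-like LRU container; all access goes through one lock."""

    def __init__(self, maxsize, dispose=None):
        self.maxsize = maxsize
        self.dispose = dispose  # optional callback for evicted values
        self._items = OrderedDict()
        self._lock = threading.Lock()

    def __getitem__(self, key):
        with self._lock:
            value = self._items.pop(key)  # raises KeyError if missing
            self._items[key] = value      # re-insert as most recently used
            return value

    def __setitem__(self, key, value):
        evicted = []
        with self._lock:
            self._items.pop(key, None)
            self._items[key] = value
            while len(self._items) > self.maxsize:
                evicted.append(self._items.popitem(last=False)[1])
        # Run the callback outside the lock so it cannot deadlock the container.
        if self.dispose is not None:
            for old in evicted:
                self.dispose(old)

    def __delitem__(self, key):
        with self._lock:
            del self._items[key]

    def __len__(self):
        with self._lock:
            return len(self._items)

    def keys(self):
        with self._lock:
            return list(self._items)
```

In a pool manager the values would be connection pools and the dispose callback would close the evicted pool's sockets.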

1.0.1 (2011-10-10)

  • Fixed a bug where the same connection would get returned into the pool twice, causing extraneous "HttpConnectionPool is full" log warnings.

1.0 (2011-10-08)

  • Added PoolManager with LRU expiration of connections (tested and documented).
  • Added ProxyManager (needs tests, docs, and confirmation that it works with HTTPS proxies).
  • Added optional partial-read support for responses when preload_content=False. You can now make requests and just read the headers without loading the content.
  • Made response decoding optional (default on, same as before).
  • Added optional explicit boundary string for encode_multipart_formdata.
  • Convenience request methods are now inherited from RequestMethods. Old helpers like get_url and post_url should be abandoned in favour of the new request(method, url, ...).
  • Refactored code to be even more decoupled, reusable, and extendable.
  • License header added to .py files.
  • Embiggened the documentation: Lots of Sphinx-friendly docstrings in the code and docs in docs/ and on urllib3.readthedocs.org.
  • Embettered all the things!
  • Started writing this file.

0.4.1 (2011-07-17)

  • Minor bug fixes, code cleanup.

0.4 (2011-03-01)

  • Better unicode support.
  • Added VerifiedHTTPSConnection.
  • Added NTLMConnectionPool in contrib.
  • Minor improvements.

0.3.1 (2010-07-13)

  • Added assert_host_name optional parameter. Now compatible with proxies.

0.3 (2009-12-10)

  • Added HTTPS support.
  • Minor bug fixes.
  • Refactored; broke backwards compatibility with 0.2.
  • API to be treated as stable from this version forward.

0.2 (2008-11-17)

  • Added unit tests.
  • Bug fixes.

0.1 (2008-11-16)

  • First release.
 
File                        Type    Py Version  Uploaded on  Size
urllib3-1.7.tar.gz (md5)    Source              2013-08-14   59KB