Python Non-cryptographic Hash Library
Introduction
pyhash is a Python non-cryptographic hash library.
It provides several common hash algorithms, implemented in C/C++ for performance and compatibility.
>>> import pyhash
>>> hasher = pyhash.fnv1_32()
>>> hasher('hello world')
2805756500L
>>> hasher('hello', ' ', 'world')
2805756500L
>>> hasher('world', seed=hasher('hello '))
2805756500L
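The three calls above agree because FNV consumes its input one byte at a time and its running state is the hash value itself, so an earlier result can be passed back in as the seed. The pure-Python sketch below illustrates that property. It is an illustrative re-implementation of the standard FNV-1 32-bit algorithm, not pyhash's C/C++ code; the name `fnv1_32_sketch` and the use of the FNV offset basis as the default seed are assumptions, so its outputs are not expected to match the values shown above.

```python
FNV32_PRIME = 0x01000193         # standard 32-bit FNV prime
FNV32_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis


def fnv1_32_sketch(data, seed=FNV32_OFFSET_BASIS):
    """FNV-1: multiply the state by the prime, then XOR in the next byte."""
    state = seed & 0xFFFFFFFF
    for byte in bytearray(data):
        state = (state * FNV32_PRIME) & 0xFFFFFFFF
        state ^= byte
    return state


one_shot = fnv1_32_sketch(b"hello world")
# Resume from the state left by the first part by passing it as the seed.
chained = fnv1_32_sketch(b"world", seed=fnv1_32_sketch(b"hello "))
print(one_shot == chained)  # True
```

Chaining works here only because the seed is as wide as the hash state; the Notes section covers algorithms where that is not the case.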
It can also be used to generate fingerprints, which do not take a seed.
>>> import pyhash
>>> fp = pyhash.farm_fingerprint_64()
>>> fp('hello')
13009744463427800296L
>>> fp('hello', 'world')
[13009744463427800296L, 16436542438370751598L]
Notes
hasher('hello', ' ', 'world') is syntactic sugar for hasher('world', seed=hasher(' ', seed=hasher('hello'))), and may not equal hasher('hello world'), because some hash algorithms use different hash and seed sizes.
For example, metro hash always uses a 32-bit seed for its 64/128-bit hash values.
>>> import pyhash
>>> hasher = pyhash.metro_64()
>>> hasher('hello world')
5622782129197849471L
>>> hasher('hello', ' ', 'world')
16402988188088019159L
>>> hasher('world', seed=hasher(' ', seed=hasher('hello')))
16402988188088019159L
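The effect of a seed that is narrower than the hash can be reproduced with a toy hash. The sketch below is not MetroHash and not pyhash code: `toy_hash64` is a hypothetical FNV-1a-style 64-bit function whose `seed_bits` parameter controls how many bits of the seed are kept, and `hash_parts` is a hypothetical helper that mimics the multi-argument call by chaining seeds.

```python
from functools import reduce

FNV64_PRIME = 0x100000001B3  # standard 64-bit FNV prime
MASK64 = (1 << 64) - 1


def toy_hash64(data, seed=0, seed_bits=64):
    """Toy 64-bit hash that keeps only the low `seed_bits` bits of the seed."""
    state = seed & ((1 << seed_bits) - 1)
    for byte in bytearray(data):
        state = ((state ^ byte) * FNV64_PRIME) & MASK64
    return state


def hash_parts(parts, seed_bits):
    """Chain the parts: each result becomes the seed for the next part."""
    return reduce(lambda seed, part: toy_hash64(part, seed, seed_bits), parts, 0)


parts = [b"hello", b" ", b"world"]
joined = b"".join(parts)

# Seed as wide as the hash: chaining matches hashing the joined input.
print(hash_parts(parts, 64) == toy_hash64(joined, seed_bits=64))

# Seed narrower than the hash: the upper 32 bits of each intermediate
# result are dropped, so the chained value generally no longer matches.
print(hash_parts(parts, 32) == toy_hash64(joined, seed_bits=32))
```

When a value must match a single-pass hash of the whole input, concatenating the parts before hashing avoids depending on the seed width of the chosen algorithm.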
Installation
$ pip install pyhash
Note: pyhash only supports PyPy v6.0 or newer; please download and install the latest PyPy.
Algorithms
pyhash supports the following hash algorithms:
- FNV (Fowler-Noll-Vo) hash
  - fnv1_32
  - fnv1a_32
  - fnv1_64
  - fnv1a_64
- MurmurHash
  - murmur1_32
  - murmur1_aligned_32
  - murmur2_32
  - murmur2a_32
  - murmur2_aligned_32
  - murmur2_neutral_32
  - murmur2_x64_64a
  - murmur2_x86_64b
  - murmur3_32
  - murmur3_x86_128
  - murmur3_x64_128
- lookup3
  - lookup3
  - lookup3_little
  - lookup3_big
- SuperFastHash
  - super_fast_hash
- City Hash
  - city_32
  - city_64
  - city_128
  - city_crc_128
  - city_fingerprint_256
- Spooky Hash
  - spooky_32
  - spooky_64
  - spooky_128
- FarmHash
  - farm_32
  - farm_64
  - farm_128
  - farm_fingerprint_32
  - farm_fingerprint_64
  - farm_fingerprint_128
- MetroHash
  - metro_64
  - metro_128
  - metro_crc_64
  - metro_crc_128
- MumHash
  - mum_64
- T1Hash
  - t1ha2 (64-bit little-endian)
  - t1ha2_128 (128-bit little-endian)
  - t1ha1 (64-bit native-endian)
  - t1ha1_le (64-bit little-endian)
  - t1ha1_be (64-bit big-endian)
  - t1ha0 (64-bit, chooses the fastest function at runtime)
  - t1ha0_ia32aes_noavx (64-bit, x86 with AES-NI without AVX extensions)
  - t1ha0_ia32aes_avx (64-bit, x86 with AES-NI and AVX extensions)
  - t1ha0_ia32aes_avx2 (64-bit, x86 with AES-NI and AVX2 extensions)
  - t1ha0_32 (32-bit native-endian)
  - t1ha0_32le (32-bit little-endian)
  - t1ha0_32be (32-bit big-endian)
  - t1_32, t1_32_be, t1_64, t1_64_be
- XXHash
  - xx_32
  - xx_64
String and Bytes literals
Python has two types that can be used to represent string literals, and the same text hashes to a different value depending on which type is used.
- For Python 2.x string literals, str is used by default; unicode can be used with the u prefix.
- For Python 3.x string and bytes literals, unicode is used by default; bytes can be used with the b prefix.
For example,
$ python2
Python 2.7.15 (default, Jun 17 2018, 12:46:58)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
4138058784L
>>> hasher(u'foo')
2085578581L
>>> hasher(b'foo')
4138058784L
$ python3
Python 3.7.0 (default, Jun 29 2018, 20:13:13)
[Clang 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyhash
>>> hasher = pyhash.murmur3_32()
>>> hasher('foo')
2085578581
>>> hasher(u'foo')
2085578581
>>> hasher(b'foo')
4138058784
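One way to get the same value for the same text on both Python versions is to convert text to bytes with an explicit encoding before hashing. The sketch below shows the idea. How pyhash reads the contents of a unicode object is not described here, so the example uses the standard library's `zlib.crc32` as a stand-in hash, and the helper names `to_bytes` and `stable_crc32` are hypothetical.

```python
import zlib


def to_bytes(value, encoding="utf-8"):
    """Return bytes unchanged; encode text with an explicit encoding."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def stable_crc32(value):
    """Normalize to bytes first so text and bytes hash alike (CRC-32 stand-in)."""
    return zlib.crc32(to_bytes(value)) & 0xFFFFFFFF


# The same text has different byte representations under different encodings,
# which is why a byte-oriented hash can return different values for it.
print(to_bytes(u"foo") == b"foo")                                # True
print(to_bytes(u"foo", "utf-16-le") == b"f\x00o\x00o\x00")       # True

# After normalization, the text and bytes forms hash to the same value.
print(stable_crc32(u"foo") == stable_crc32(b"foo"))              # True
```

With pyhash, the analogous approach would be to pass `text.encode('utf-8')` to the hasher, which corresponds to the b'foo' results shown above.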
You can also import unicode_literals to use unicode literals in Python 2.x:
from __future__ import unicode_literals
In general, it is more compelling to use unicode_literals when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3. In the latter case, explicitly marking up all unicode string literals with u'' prefixes would help to avoid unintentionally changing the existing Python 2 API. However, if changing the existing Python 2 API is not a concern, using unicode_literals may speed up the porting process.
Hashes for pyhash-0.9.0-pp360-pypy3_60-macosx_10_13_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 6a04133a27755a33be90018dfc5cd5b7a3379f9f692d6f4075331b5729c4294f
MD5 | c3fb902a64f960cc9b31789393d62560
BLAKE2b-256 | a0b1f196bcc2ec96b3173b9867d8ae9556dc2d15fb3bd17afa7dd05de68e7352
Hashes for pyhash-0.9.0-pp260-pypy_41-macosx_10_13_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 512f5509d422ee66e82bb5a35f69dc2028d6c24b76e6d1a2dbdcd9aafe9343e8
MD5 | a4c81bd150bf0ed4e805360438ed9f61
BLAKE2b-256 | 154b8a999214c4a85bd2a44d5d664238d8631297a6066ff3e91461ef325ee5e1
Hashes for pyhash-0.9.0-cp37-cp37m-macosx_10_13_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 348d03577cb41b8afd0396cb1daeab0a5da9c8647b92b95d99a2626615d6b5e8
MD5 | 765f0946f0424c9e6d4e221e62bf9593
BLAKE2b-256 | bb481bd0d826a7d30ece6d08a5e9c72021d2aedf00d962dd827e49ab2e72563a
Hashes for pyhash-0.9.0-cp27-cp27m-macosx_10_13_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b722f063b4c46b43d18ced5e6d0bb5f379dffb1d7742c2b06579bf0c9729bddd
MD5 | b5275e2413ba0c75e2336a4b0aa3cbe6
BLAKE2b-256 | 9ed07794c03d6f77e1d6b5329f29ba44bca62a1a9c9dc826bec26be45b6d81e8