Validation and secure evaluation of untrusted python expressions

Evalidate

Evalidate is a simple Python module for safely eval()-uating user-supplied (possibly malicious) logical expressions in Python syntax.

Purpose

It was originally developed for filtering complex data structures with boolean expressions, e.g. raise salary if:

person.age>30 and person.salary>5000 or "Jack" in person.children

or, like a simple firewall, allow inbound traffic if:

(packet.tcp.dstport==22 or packet.tcp.dstport==80) and packet.tcp.srcip in WhiteListIP

It can also be used for other expressions, e.g. arithmetic ones like:

a+b-100

Install

pip3 install evalidate

Security

Built-in Python features such as compile() or eval() are powerful enough to run any kind of user-supplied code, which makes them insecure when that code is malicious, like os.system("rm -rf /"). Evalidate works on a whitelist principle: code is allowed only if it consists solely of safe operations. What counts as safe is based on the author's views, so your mileage may vary, but you can supply your own list of safe operations.
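The whitelist principle can be illustrated with the standard ast module alone. The following is a minimal sketch, not evalidate's actual implementation: the SAFE_NODES set and the check_whitelist() helper are hypothetical names invented for this example, and node names assume Python 3.8 or newer.

```python
import ast

# Hypothetical whitelist for this sketch; evalidate ships its own default list.
SAFE_NODES = {
    "Expression", "BinOp", "Add", "Sub", "Name", "Load", "Constant",
    "Compare", "Gt", "Lt", "Eq", "BoolOp", "And", "Or",
}

def check_whitelist(src, safe_nodes=SAFE_NODES):
    """Parse src as an expression and reject any node type not in safe_nodes."""
    tree = ast.parse(src, mode="eval")
    for node in ast.walk(tree):
        name = type(node).__name__
        if name not in safe_nodes:
            raise ValueError(f"Operation type {name} is not allowed")
    return tree

# A harmless expression passes the check and can then be compiled and evaluated.
tree = check_whitelist("a+b")
code = compile(tree, "<usercode>", "eval")
print(eval(code, {"__builtins__": {}}, {"a": 1, "b": 2}))

# A malicious expression is rejected before anything is executed.
try:
    check_whitelist("__import__('os').system('clear')")
except ValueError as e:
    print("ERR:", e)
```

The key design point is that validation happens on the parsed tree, so rejected code is never compiled or run.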

TL;DR. Just give me safe eval!

from evalidate import safeeval, EvalException

src="a+b" # source code
# src="__import__('os').system('clear')"
c={'a': 1, 'b': 2} # context, variables which will be available for code

try:
    result = safeeval(src,c)
    print(result)
except EvalException as e:
    print("ERR:", e)

Gives output:

3

In case of dangerous code:

src="__import__('os').system('clear')"

output will be: ERR: Operation type Call is not allowed

Exceptions

Evalidate raises the exceptions CompilationException, ValidationException and ExecutionException. All of them inherit from the base exception class EvalException.

Extending evalidate: safenodes and addnodes

Evalidate has a built-in set of Python operations which are considered 'safe' (from the author's point of view). Code is considered valid only if all of its operations are in this list. You can override this list with the safenodes argument:

result = evalidate.safeeval(src,c, safenodes=['Expression','BinOp','Num','Add'])

This is enough for the expression '1+1' (in the src argument), but not for '1-1'. If you try '1-1', it reports an error:

ERROR: Validation error: Operaton type Sub is not allowed

This way you can start from scratch and allow only the operations you need. Alternatively, you can keep the built-in list of allowed operations and extend it as needed with the addnodes argument.

For example, "1*1" gives an error with the built-in list:

ERROR: Validation error: Operaton type Mult is not allowed

But it will work with addnodes:

result = evalidate.safeeval(src,c, addnodes=['Mult'])

Please note that allowing the 'Mult' operation isn't very secure, because with strings it can lead to out-of-memory conditions:

src='"a"*1000000*1000000*1000000*1000000'

ERROR: Runtime error (OverflowError): repeated string is too long
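The difference between overriding the list (safenodes) and extending it (addnodes) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: DEFAULT_NODES, allowed_nodes() and is_valid() are hypothetical, the way the two arguments are combined is an assumption, and on Python 3.8 or newer the standard ast module reports numeric literals as 'Constant' rather than 'Num'.

```python
import ast

# Hypothetical default list for this sketch only.
DEFAULT_NODES = ["Expression", "BinOp", "Constant", "Add", "Sub"]

def allowed_nodes(safenodes=None, addnodes=None):
    """safenodes replaces the default list; addnodes extends the list in effect."""
    nodes = set(DEFAULT_NODES if safenodes is None else safenodes)
    nodes.update(addnodes or [])
    return nodes

def is_valid(src, safenodes=None, addnodes=None):
    """Return True if every AST node type in src is in the allowed set."""
    allowed = allowed_nodes(safenodes, addnodes)
    tree = ast.parse(src, mode="eval")
    return all(type(node).__name__ in allowed for node in ast.walk(tree))

minimal = ["Expression", "BinOp", "Constant", "Add"]
print(is_valid("1+1", safenodes=minimal))   # addition is in the minimal list
print(is_valid("1-1", safenodes=minimal))   # Sub is not in the minimal list
print(is_valid("1*1"))                      # Mult is not in the default list
print(is_valid("1*1", addnodes=["Mult"]))   # Mult added on top of the defaults
```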

Allowing function calls

Evalidate does not allow any function calls by default:

>>> from evalidate import safeeval, EvalException
>>> try:
...   safeeval('int(1)')
... except EvalException as e:
...   print(e)
... 
Operation type Call is not allowed

To enable the int() function, you need to allow the 'Call' node and add the function to the list of allowed functions:

>>> safeeval('int(1)', addnodes=['Call'], funcs=['int'])
1

An attempt to call any other function will fail (because it is not in the funcs list):

evalidate.safeeval('1+round(2)', addnodes=['Call'], funcs=['int'])

This will throw ValidationException.

Any indirect function calls (like: __builtins__['eval']("print(1)")) are not allowed.
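One way such a function whitelist could work is to accept a call only when its target is a bare name found in the allowed list. The sketch below uses only the standard ast module; check_calls() and its error messages are hypothetical and are not taken from evalidate.

```python
import ast

def check_calls(src, funcs=()):
    """Reject any call whose target is not a bare name listed in funcs."""
    tree = ast.parse(src, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Only direct calls like int(1) qualify. Targets reached through a
            # subscript or attribute, such as __builtins__['eval'](...), are
            # rejected outright.
            if not isinstance(node.func, ast.Name):
                raise ValueError("Indirect function call is not allowed")
            if node.func.id not in funcs:
                raise ValueError(f"Call to function {node.func.id}() is not allowed")
    return tree

for src in ["int(1)", "1+round(2)", "__builtins__['eval'](\"print(1)\")"]:
    try:
        check_calls(src, funcs=["int"])
        print("OK: ", src)
    except ValueError as e:
        print("ERR:", src, "->", e)
```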

Functions

There are two functions, safeeval() and evalidate().

safeeval() is the simplest possible replacement for eval(). It is good for evaluating something once or a few times, where speed is not an issue. If you evaluate the same code a second time, it takes the same 'long' time to parse and validate it again.

evalidate() is slightly more complex, but it returns a validated, safe Python AST node which can be compiled to Python bytecode and executed at full speed. (The code is safe once evalidate() has accepted it.)
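The compile-once, evaluate-many pattern described above can be sketched with the standard library. In this sketch ast.parse() stands in for evalidate.evalidate() and performs no validation at all, so it only illustrates the reuse of the compiled code object; the records list is sample data made up for the example.

```python
import ast

records = [
    {"book": "Sirens of Titan", "price": 12, "stock": 4},
    {"book": "Gone Girl", "price": 9.8, "stock": 0},
    {"book": "Choke", "price": 14, "stock": 2},
]

src = "stock>0 and price>8"

# Stand-in for evalidate.evalidate(src): parse only, no validation in this sketch.
node = ast.parse(src, mode="eval")
code = compile(node, "<usercode>", "eval")  # parsing and compiling happen once

# The same code object is reused for every record; each record is the context.
matches = [r["book"] for r in records if eval(code, {"__builtins__": {}}, r)]
print(matches)
```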

safeeval()

result = safeeval(src, context={}, safenodes=None, addnodes=None, funcs=None, attrs=None)

safeeval is a higher-level wrapper around evalidate(): it validates the code and, if validation succeeds, runs it. It raises an exception if compilation (parsing), validation or execution fails.

src - source expression like "person['Age']>30 and salary==10000"

context - dictionary of variables, available for evaluated code.

safenodes, addnodes, funcs and attrs are the same as in evalidate()

returns result of evaluation of expression.

evalidate()

node = evalidate(expression, safenodes=None, addnodes=None, funcs=None, attrs=None)

evalidate() is the main (and recommended) function. It parses a Python expression, validates it, and returns a Python AST (Abstract Syntax Tree) structure which can later be compiled and executed:

>>> import evalidate
>>> node = evalidate.evalidate('1+2')
>>> code = compile(node,'<usercode>','eval')
>>> eval(code)
3

expression - string with a Python expression like '1+2' or 'a+b' or 'a if a>0 else b' or 'p.salary * 1.2'
safenodes - list of allowed nodes. This overrides the built-in list of allowed nodes, e.g. safenodes=['Expression','BinOp','Num','Add']
addnodes - list of additional allowed nodes. This extends the built-in list of allowed nodes, e.g. addnodes=['Mult']
funcs - list of allowed function calls. You also need to add 'Call' to the safe nodes, e.g. funcs=['int']
attrs - list of allowed attributes. You also need to add 'Attribute' to the safe nodes, e.g. attrs=['salary']

evalidate() raises CompilationException if it cannot parse the source code, and ValidationException if the code contains unsafe operations.

Even if evalidate() succeeds, this doesn't guarantee that the code will run well. For example, it can still raise NameError (if it accesses an undefined variable) or ZeroDivisionError.

evalidate uses ast.parse() and returns AST node.
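The point about runtime errors can be shown without evalidate itself: an expression made only of harmless operations can still fail when evaluated. In this sketch the run() helper is hypothetical, and ast.parse() stands in for a successful evalidate() call.

```python
import ast

def run(src, context):
    """Compile an (assumed already validated) expression and report runtime failures."""
    code = compile(ast.parse(src, mode="eval"), "<usercode>", "eval")
    try:
        return eval(code, {"__builtins__": {}}, context)
    except Exception as e:
        # Almost any kind of exception can surface here.
        return f"Runtime error ({type(e).__name__}): {e}"

print(run("a+b", {"a": 1, "b": 2}))     # evaluates normally
print(run("a+missing", {"a": 1}))       # undefined variable
print(run("a/0", {"a": 1}))             # division by zero
```

Callers therefore still need a try/except around the evaluation step, separate from the validation step.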

Warning

It is possible to crash the Python interpreter with a sufficiently large/complex string due to stack depth limitations in Python’s AST compiler.

In my test, it works with 200 nested int() calls, int(int(.... int(1)...)), but not with 201; such source code is 1000+ characters long. Even if evalidate receives such code, it just raises CompilationException.
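The nesting limit can be probed with the standard parser directly. This is a rough sketch: nested_int() and parses() are hypothetical helpers, and the exact depth at which parsing fails, as well as the exception type, depends on the Python version, so several exception types are caught.

```python
import ast

def nested_int(depth):
    """Build int(int(...int(1)...)) with the given nesting depth."""
    return "int(" * depth + "1" + ")" * depth

def parses(src):
    """Return True if the source parses, False if the parser gives up on it."""
    try:
        ast.parse(src, mode="eval")
        return True
    except (SyntaxError, MemoryError, RecursionError):
        return False

print(parses(nested_int(10)))    # shallow nesting parses fine
print(parses(nested_int(1000)))  # far beyond the limit reported above
```

On recent CPython versions the second call is expected to report a parse failure rather than crash, which is the situation a wrapper can turn into a compilation error.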

Examples

Filtering by user-supplied condition

from evalidate import safeeval, EvalException

depot = [
    {
        'book': 'Sirens of Titan',
        'price': 12,
        'stock': 4
    },
    {
        'book': 'Gone Girl',
        'price': 9.8,
        'stock': 0
    },
    {
        'book': 'Choke',
        'price': 14,
        'stock': 2
    },
    {
        'book': 'Pulp',
        'price': 7.45,
        'stock': 4
    }
]

#src='stock==0' # books out of stock
src='stock>0 and price>8' # expensive book available for sale

for book in depot:
    try:
        result = safeeval(src,book)
        if result:
            print(book)
    except EvalException as e:
        print("ERR:", e)

With first src line ('stock==0') it gives:

{'price': 9.8, 'book': 'Gone Girl', 'stock': 0}

With second src line ('stock>0 and price>8') it gives:

{'price': 12, 'book': 'Sirens of Titan', 'stock': 4}
{'price': 14, 'book': 'Choke', 'stock': 2}

Also, see examples/products.py in repo. It uses dataset "products" from https://dummyjson.com/.

# print all 100 products
./products.py

# Only cheap products, 8 matches
./products.py 'price<20'

# smartphones (5)
./products.py 'category=="smartphones"'

# good smartphones
./products.py 'category=="smartphones" and rating>4.5'

# cheap smartphones
./products.py 'category=="smartphones" and price<300'

Data as objects

Here the data is an object with attributes rather than a dictionary, so 'Attribute' has to be added to the safe nodes. The expression raises a person's salary by 200, plus an additional 25 for each year (s)he has worked in the company.

from evalidate import safeeval, EvalException

class Person:
    pass

p = Person()
p.salary=1000
p.age=5

data = {'p':p}
src = 'p.salary+200+p.age*25'
try:
    result = safeeval(src,data,addnodes=['Attribute','Mult'], attrs=['salary', 'age'])
    print("result", result)
except EvalException as e:
    print("ERR:",e)
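An attribute whitelist like attrs could be checked on the AST along these lines. The check_attrs() helper and its error message are hypothetical and only illustrate the idea; they are not evalidate's implementation.

```python
import ast

def check_attrs(src, attrs=()):
    """Reject attribute access unless the attribute name is listed in attrs."""
    tree = ast.parse(src, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Attribute) and node.attr not in attrs:
            raise ValueError(f"Attribute {node.attr} is not allowed")
    return tree

class Person:
    pass

p = Person()
p.salary = 1000
p.age = 5

# Whitelisted attributes pass the check and the expression evaluates.
tree = check_attrs("p.salary+200+p.age*25", attrs=["salary", "age"])
result = eval(compile(tree, "<usercode>", "eval"), {"__builtins__": {}}, {"p": p})
print("result", result)  # 1000 + 200 + 5*25 = 1325

# Anything else, including dunder attributes, is rejected before evaluation.
try:
    check_attrs("p.__class__", attrs=["salary", "age"])
except ValueError as e:
    print("ERR:", e)
```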

Validate, compile and evaluate code

import evalidate

def test(src):   
    data={'one':1,'two':2}

    try:
        node = evalidate.evalidate(src)
    except evalidate.CompilationException:
        print("Bad source code:", repr(src))
        return
    except evalidate.ValidationException:
        print("Dangerous code:", repr(src))
        return

    code = compile(node,'<usercode>','eval')
    try:
        result = eval(code,{},data)
        print("result:", result)
    except Exception as e:
        # almost any kind of exception can happen here
        print("Runtime exception:",e)

srclist=['one+two+3', 'one+two+3+os.system("clear")', '', '1/0']

for src in srclist:
    test(src)

Similar projects and benchmark

asteval

While asteval can evaluate much more complex code (defining functions, using Python math libraries), it has drawbacks:

asteval is much slower (code validated by evalidate can run at the speed of eval() on Python bytecode)
a user can supply source code which runs for a very long time and consumes a lot of resources

evalidate is a good fit for running the same short expression against different data.

Benchmarking

We use evalidate-vs-asteval.py, which is in the benchmark/ directory of the repository:

$ ./evalidate-vs-asteval.py 
Src: a+b
Context: {'a': 1, 'b': 2}
Runs: 100000
asteval: 3.538s
asteval (reuse interpreter): 1.232s
safeeval: 2.384s
evalidate/compile/eval (reuse compiled code): 0.017s

0.017s vs 1.232s: reusing the compiled code is roughly 70 times faster than asteval with a reused interpreter.
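The gap between re-parsing on every call and reusing compiled code can be reproduced in miniature with the standard timeit module. This is not the repository's benchmark script: it involves neither evalidate nor asteval, the function names are made up for the sketch, and absolute timings will differ by machine.

```python
import ast
import timeit

src = "a+b"
context = {"a": 1, "b": 2}
runs = 10000

def parse_every_time():
    # Re-parse and re-compile on each call, as a one-shot helper would.
    code = compile(ast.parse(src, mode="eval"), "<usercode>", "eval")
    return eval(code, {}, context)

# Parse and compile once, outside the timed function.
compiled = compile(ast.parse(src, mode="eval"), "<usercode>", "eval")

def reuse_compiled():
    # Only the evaluation step is repeated.
    return eval(compiled, {}, context)

print("parse every time:", timeit.timeit(parse_every_time, number=runs))
print("reuse compiled:  ", timeit.timeit(reuse_compiled, number=runs))
```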

Read about eval() risks

More info

Want more info? Check the source code of the module: it's very short, simple and easy to modify.

Contact

Write me: yaroslaff at gmail.com
