pytest-spark 0.4.0

pytest plugin to run tests with support for pyspark (Apache Spark).

This plugin allows you to specify the SPARK_HOME directory in pytest.ini and thus makes "pyspark" importable in the tests executed by pytest.

It also defines the session-scoped fixtures spark_context and spark_session, which can be used in your tests.

Install

$ pip install pytest-spark

Usage

Set Spark location

To run tests with the required Spark location, just add a "spark_home" value to pytest.ini in your project directory:

[pytest]
spark_home = /opt/spark

Or set the "SPARK_HOME" environment variable.
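
For example, in a POSIX shell (the path shown is illustrative):

$ export SPARK_HOME=/opt/spark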

pytest-spark will try to import pyspark from the specified location.

Order of reading spark_home (a sketch of this lookup order follows the list):

  1. pytest.ini
  2. "SPARK_HOME" environment variable
  3. Try to find it in common locations (e.g. OS X Homebrew)
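
A minimal sketch of that lookup order, assuming a hypothetical find_spark_home helper and an illustrative Homebrew path; this is not the plugin's actual code:

import os

def find_spark_home(ini_value=None):
    # 1. value from pytest.ini, if configured
    if ini_value:
        return ini_value
    # 2. SPARK_HOME environment variable
    if os.environ.get("SPARK_HOME"):
        return os.environ["SPARK_HOME"]
    # 3. common install locations (path shown is illustrative)
    for candidate in ("/usr/local/opt/apache-spark/libexec",):
        if os.path.isdir(candidate):
            return candidate
    return None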

Using the spark_context fixture

Use the spark_context fixture in your tests as a regular pytest fixture. A SparkContext instance will be created once and reused for the whole test session.

Example:

def test_my_case(spark_context):
    # spark_context is the shared SparkContext provided by the plugin
    test_rdd = spark_context.parallelize([1, 2, 3, 4])
    assert test_rdd.count() == 4
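
Because the fixture is session-scoped, every test in the run shares the same SparkContext. For instance (the test name is hypothetical):

def test_rdd_sum(spark_context):
    # reuses the SparkContext created for earlier tests in the session
    assert spark_context.parallelize(range(5)).sum() == 10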

Using the spark_session fixture (Spark 2.0 and above)

Use the spark_session fixture in your tests as a regular pytest fixture. A SparkSession instance with Hive support enabled will be created once and reused for the whole test session.

Example:

def test_spark_session_dataframe(spark_session):
    test_df = spark_session.createDataFrame([[1, 3], [2, 4]], "a: int, b: int")
    assert test_df.count() == 2
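
The session fixture also exposes Spark SQL. A minimal sketch (the test and view names are hypothetical):

def test_spark_session_sql(spark_session):
    # register a temporary view and query it through the shared SparkSession
    df = spark_session.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    df.createOrReplaceTempView("people")
    assert spark_session.sql("SELECT COUNT(*) FROM people").collect()[0][0] == 2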
 