
H2O MLOps Scoring Client

A Python client library to simplify robust mini-batch scoring against an H2O MLOps scoring endpoint. It can run on your local PC, a standalone server, Databricks, or a Spark 3 cluster.

Scoring Pandas data frames is as easy as:

pip install h2o-mlops-scoring-client

import h2o_mlops_scoring_client
import pandas as pd

# df can be any pandas DataFrame that contains the model's input columns
df = pd.read_csv("...")

scores_df = h2o_mlops_scoring_client.score_data_frame(
    mlops_endpoint_url="https://.../model/score",
    id_column="ID",
    data_frame=df,
)

Scoring from a source to a sink is also possible with pyspark:

pip install h2o-mlops-scoring-client[PYSPARK]
import h2o_mlops_scoring_client


h2o_mlops_scoring_client.score_source_sink(
    mlops_endpoint_url="https://.../model/score",
    id_column="ID",
    source_data="s3a://...",
    source_format=h2o_mlops_scoring_client.Format.CSV,
    sink_location="s3a://...",
    sink_format=h2o_mlops_scoring_client.Format.PARQUET,
    sink_write_mode=h2o_mlops_scoring_client.WriteMode.OVERWRITE
)

Installation

Requirements

  • Linux or macOS (Windows is not supported)
  • Java (only required for pyspark installs)
  • Python 3.8 or greater

Install from PyPI

pip install h2o-mlops-scoring-client

pyspark is no longer included in a default install. To include pyspark:

pip install h2o-mlops-scoring-client[PYSPARK]

FAQ

When should I use the MLOps Scoring Client?

Use the client when the batch scoring work (authenticating and connecting to the source or sink, file and data processing or conversions, etc.) can happen outside H2O AI Cloud but you want to stay within the H2O MLOps workflow (projects, scoring, registry, monitoring, etc.).

Where does scoring take place?

During batch scoring, the data is sent in mini-batches to an H2O MLOps deployment, which runs the model and returns the scores to the client so the batch job can complete. In other words, the model itself runs in the MLOps deployment, not on the machine running the client.
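
Conceptually, the flow looks like the sketch below: the input is split into mini-batches, each batch is sent to the deployment for scoring, and the returned scores are stitched back together. The client does this batching for you; the loop here only illustrates the round trips using the documented score_data_frame call, with a hypothetical chunk size and a placeholder input path.

import pandas as pd
import h2o_mlops_scoring_client

df = pd.read_csv("...")  # placeholder input

# Illustration only: the client already mini-batches internally, so you would
# normally pass the whole data frame in a single score_data_frame call.
chunk_size = 1000  # hypothetical batch size
scored_chunks = []
for start in range(0, len(df), chunk_size):
    chunk = df.iloc[start:start + chunk_size]
    scored_chunks.append(
        h2o_mlops_scoring_client.score_data_frame(
            mlops_endpoint_url="https://.../model/score",
            id_column="ID",
            data_frame=chunk,
        )
    )
scores_df = pd.concat(scored_chunks, ignore_index=True)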

What sources/sinks are supported?

The MLOps scoring client supports many sources and sinks, including the following (a minimal local file system example is sketched after the list):

  • ADLS Gen 2
  • Databases with a JDBC driver
  • Local file system
  • Google BigQuery (GBQ)
  • S3
  • Snowflake
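
For example, scoring against the local file system uses the same score_source_sink call as the S3 example above; only the paths change. A minimal sketch with placeholder paths (whether the file:// prefix is needed can depend on your Spark setup; plain absolute paths usually work too):

import h2o_mlops_scoring_client

# Same call as the S3 example above, but reading from and writing to the
# local file system. The paths are placeholders - point them at your own data.
h2o_mlops_scoring_client.score_source_sink(
    mlops_endpoint_url="https://.../model/score",
    id_column="ID",
    source_data="file:///path/to/input.csv",
    source_format=h2o_mlops_scoring_client.Format.CSV,
    sink_location="file:///path/to/output",
    sink_format=h2o_mlops_scoring_client.Format.PARQUET,
    sink_write_mode=h2o_mlops_scoring_client.WriteMode.OVERWRITE,
)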

What file types are supported?

The MLOps scoring client can read and write the following (a quick way to check which Format values your installed version exposes is shown after the list):

  • CSV
  • Parquet
  • ORC
  • BigQuery tables
  • JDBC queries
  • JDBC tables
  • Snowflake queries
  • Snowflake tables
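
Formats are chosen with the Format values shown in the examples above (Format.CSV, Format.PARQUET). Assuming Format is a standard Python Enum - an assumption, since only those two values appear in this document - you can list what your installed version exposes:

import h2o_mlops_scoring_client

# Assumption: Format behaves like a standard Python Enum (only CSV and PARQUET
# appear in the examples above). If so, this prints every available value.
for fmt in h2o_mlops_scoring_client.Format:
    print(fmt.name)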

If there's a file type you would like to see supported, please let us know.

I want model monitoring for batch scoring. Can I do that?

Yes. The MLOps Scoring Client uses MLOps scoring endpoints which are automatically monitored.

Is a Spark installation required?

No. If you're scoring Pandas data frames, then no extra Spark install or configuration is needed. If you want to connect to an external source or sink, you'll need to install pyspark and do a small amount of configuration.
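
A quick way to check whether the optional pyspark dependency is present (score_data_frame works without it; score_source_sink and the external source/sink workflow need it):

# pyspark is only needed for score_source_sink / external sources and sinks;
# pandas scoring with score_data_frame works without it.
try:
    import pyspark
    print("pyspark", pyspark.__version__, "is available - score_source_sink can be used")
except ImportError:
    print("pyspark not installed - pandas scoring with score_data_frame still works")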
