Constrain Operator for Inferential Models (COIM) is a simple tool for pre- and post-processing data. It eliminates redundancy in datasets caused by dependency rules between variables (columns).

COIM

Usage

To start using COIM, import the operator class, which orchestrates the constraints, and create an instance.

from COIM import ConstrainOperator
CO=ConstrainOperator()

To add a new constraint, use the add_rule method of the ConstrainOperator class.

from COIM import SomeConstrain
SC=SomeConstrain(**parameters)
CO.add_rule(SC)

Each constraint requires its own specific parameters; see the list of available constraints below for details. In addition, every constraint accepts the parameter "labels", a list of the new names to be used for the encoded columns.

You can then encode your dataframe and feed the resulting variables, with the redundancy removed, to your model.

new_df=CO.encode_dataframe(df)

After running your model, you can restore the data to its original format by decoding the predicted values and their errors.

decoded_df, decoded_errors=CO.decode_dataframe(predicted_df, errors)

This yields predictions for the original variables, as if they had been fed to the model directly, but with more consistent results.
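To make the round trip concrete, here is a minimal, library-independent sketch of the idea for a single rule of the form $a+K=b$. It does not use COIM's API: the function names `encode` and `decode`, the dict-of-lists stand-in for a dataframe, the label `a_enc`, and the choice to keep `a` as the single encoded column are all assumptions made for illustration. The redundant column `b` is dropped before modelling and rebuilt afterwards, so the decoded predictions satisfy the rule by construction.

```python
# Illustration only: a plain-Python stand-in for a dataframe (dict of column lists).
# This is NOT COIM's implementation; all names and the layout are hypothetical.

K = 10.0  # constant of the rule a + K = b


def encode(table, base="a", target="b", label="a_enc"):
    """Drop the redundant target column and keep the base column under a new label."""
    encoded = {
        name: list(col) for name, col in table.items() if name not in (base, target)
    }
    encoded[label] = list(table[base])
    return encoded


def decode(table, errors, base="a", target="b", label="a_enc"):
    """Rebuild both original columns from the encoded one and propagate the errors."""
    values = {name: list(col) for name, col in table.items() if name != label}
    errs = {name: list(col) for name, col in errors.items() if name != label}
    values[base] = list(table[label])
    values[target] = [a + K for a in table[label]]
    # Adding a constant does not change the uncertainty, so both columns share it.
    errs[base] = list(errors[label])
    errs[target] = list(errors[label])
    return values, errs


df = {"a": [1.0, 2.0, 3.0], "b": [11.0, 12.0, 13.0], "x": [0.5, 0.6, 0.7]}
encoded = encode(df)

# Pretend a model returned slightly perturbed predictions with error estimates.
predicted = {"a_enc": [1.1, 1.9, 3.2], "x": [0.5, 0.6, 0.7]}
pred_errors = {"a_enc": [0.1, 0.1, 0.2], "x": [0.0, 0.0, 0.0]}
decoded, decoded_errors = decode(predicted, pred_errors)
```

Had the model predicted `a` and `b` independently, nothing would force the two predictions to differ by exactly `K`; predicting one encoded column and deriving the other removes that inconsistency.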

Available constrains

  1. "add_scalar":
    • $a+K=b$
    • base_variable = a
    • target_variable = b
    • constant = K
  2. "mul_scalar":
    • $a\cdot K=b$
    • base_variable = a
    • target_variable = b
    • constant = K
  3. "const_sum":
    • $\sum W_i\cdot a_i=K$
    • variables = $[a_1, a_2, \cdots, a_n]$
    • reference_variable = $a_j$
    • constant_sum = K
    • weights = $[W_1, W_2, \cdots, W_n]$, or a single value $W$ if $W_1=W_2= \cdots= W_n$
  4. "custom_func":
    • use this when none of the above applies and you need to supply your own functions to operate on the dataframe
    • variables: list of the variables to be used
    • validate_function: function that checks whether the received dataframe satisfies the constraint. Signature: (df[DataFrame], variables[list], labels[list]) -> bool
    • format_function: function that returns a string describing the constraint equation. Signature: (variables[list], labels[list]) -> str
    • encode_dataframe: function that creates the new custom columns in the dataframe. Signature: (df[DataFrame], variables[list], labels[list]) -> DataFrame
    • decode_dataframe: function that restores the original columns in the dataframe and calculates the propagated errors. Signature: (df[DataFrame], variables[list], labels[list], errors[DataFrame]) -> (DataFrame, DataFrame)
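
As a rough illustration of the "const_sum" rule, the sketch below removes the reference variable $a_j$ before modelling and recovers it afterwards by solving $\sum W_i\cdot a_i=K$ for $a_j$. It is plain Python and does not call COIM: the helper names, the row-as-dict layout, and the error formula (first-order propagation, assuming the errors of the remaining variables are independent) are assumptions for illustration, and the library's own encoding may differ.

```python
import math

# Illustration only: hypothetical helpers, not part of COIM.


def satisfies_const_sum(row, variables, weights, constant_sum, tol=1e-9):
    """Check that sum(W_i * a_i) equals K for one row (a dict of values)."""
    total = sum(w * row[v] for v, w in zip(variables, weights))
    return math.isclose(total, constant_sum, abs_tol=tol)


def drop_reference(row, reference_variable):
    """Encode: remove the reference variable, which the rule makes redundant."""
    return {k: v for k, v in row.items() if k != reference_variable}


def restore_reference(row, errors, variables, weights, reference_variable, constant_sum):
    """Decode: solve the rule for a_j and propagate independent errors to it."""
    w = dict(zip(variables, weights))
    others = [v for v in variables if v != reference_variable]
    w_ref = w[reference_variable]
    value = (constant_sum - sum(w[v] * row[v] for v in others)) / w_ref
    error = math.sqrt(sum((w[v] / w_ref * errors[v]) ** 2 for v in others))
    return {**row, reference_variable: value}, {**errors, reference_variable: error}


variables = ["a1", "a2", "a3"]
weights = [1.0, 1.0, 1.0]  # three fractions that must add up to K = 1
row = {"a1": 0.2, "a2": 0.3, "a3": 0.5}

assert satisfies_const_sum(row, variables, weights, 1.0)
encoded = drop_reference(row, "a3")

# Pretend a model predicted the two remaining fractions with these errors.
predicted = {"a1": 0.25, "a2": 0.35}
pred_errors = {"a1": 0.03, "a2": 0.04}
decoded, decoded_errors = restore_reference(
    predicted, pred_errors, variables, weights, "a3", 1.0
)
```

Because `a3` is computed from the rule rather than predicted, the decoded row always sums to `K`, whatever the model returns for `a1` and `a2`.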

Future additions

The following constraints are planned for future releases:

  1. Variable sum
  2. Constant and variable products
  3. Conditionals

Theoretical foundation

The full mathematical derivations for the implemented constraints can be found in the calculations PDF.
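
For orientation, the relations a decoder needs for the three built-in rules can be summarised as follows. This is a sketch based on standard first-order error propagation, assuming independent errors $\sigma$ on the encoded variables; it is not taken from the calculations PDF, whose derivations may use different conventions.

```latex
\begin{aligned}
\text{add\_scalar:}\quad & b = a + K, & \sigma_b &= \sigma_a \\
\text{mul\_scalar:}\quad & b = K \cdot a, & \sigma_b &= |K|\,\sigma_a \\
\text{const\_sum:}\quad & a_j = \frac{1}{W_j}\Big(K - \sum_{i \neq j} W_i \cdot a_i\Big), &
\sigma_{a_j}^2 &= \sum_{i \neq j} \Big(\frac{W_i}{W_j}\Big)^2 \sigma_{a_i}^2
\end{aligned}
```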
