Constrain Operator for Inferential Models is a simple tool for pre- and post-processing data to eliminate redundancy in datasets caused by dependency rules between variables/columns.
COIM
Usage
To start using COIM, import the operator class, which orchestrates the constraints, and create an instance.
from COIM import ConstrainOperator
CO=ConstrainOperator()
To add a new constraint, use the add_rule method of the ConstrainOperator class.
from COIM import SomeConstrain
SC=SomeConstrain(**parameters)
CO.add_rule(SC)
Each constraint requires its own specific parameters; see the list of available constraints below for details. However, all constraints accept the parameter "labels", a list of the new names to be used for the encoded columns.
Then you can encode your dataframe so that your model is fed the new, corrected variables.
new_df=CO.encode_dataframe(df)
After running your model, you can regenerate the data in the original format by decoding the predicted values and their errors.
decoded_df, decoded_errors=CO.decode_dataframe(predicted_df, errors)
This yields the predictions for the original variables, as if they had been fed to the model directly, but with more consistent results.
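To make the encode/decode round trip concrete, the sketch below reproduces the idea by hand with pandas for a constant-sum rule ($a_1+a_2+a_3=K$). It does not call COIM: the helper names `encode_const_sum` and `decode_const_sum`, the column names, the value of `K`, and the assumption of independent errors are all illustrative, and COIM's actual encoding may differ.

```python
import numpy as np
import pandas as pd

K = 100.0                      # assumed constant sum (illustrative)
VARIABLES = ["a1", "a2", "a3"]  # hypothetical column names
REFERENCE = "a3"               # column made redundant by the rule


def encode_const_sum(df):
    """Drop the reference column, which the rule makes redundant."""
    return df.drop(columns=[REFERENCE])


def decode_const_sum(pred, errors):
    """Rebuild the reference column and propagate errors.

    Errors are assumed independent, so they add in quadrature.
    """
    free = [v for v in VARIABLES if v != REFERENCE]
    decoded = pred.copy()
    decoded[REFERENCE] = K - pred[free].sum(axis=1)
    decoded_errors = errors.copy()
    decoded_errors[REFERENCE] = np.sqrt((errors[free] ** 2).sum(axis=1))
    return decoded[VARIABLES], decoded_errors[VARIABLES]


df = pd.DataFrame({"a1": [20.0, 50.0], "a2": [30.0, 25.0], "a3": [50.0, 25.0]})
encoded = encode_const_sum(df)

# Stand-in for a model: shifted inputs play the role of predictions.
predicted = encoded + 1.0
errors = pd.DataFrame({"a1": [3.0, 3.0], "a2": [4.0, 4.0]})

decoded, decoded_errors = decode_const_sum(predicted, errors)
print(decoded)
print(decoded_errors)
```

Because the reference column is rebuilt from the rule rather than predicted separately, every decoded row sums to `K` by construction, which is the kind of consistency the text above refers to.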
Available constraints
- "add_scalar":
  - $a+K=b$
  - base_variable = a
  - target_variable = b
  - constant = K
- "mul_scalar":
  - $a*K=b$
  - base_variable = a
  - target_variable = b
  - constant = K
- "const_sum":
  - $\sum_{i=1}^{n} W_i\cdot a_i=K$
  - variables = $[a_1, a_2, \cdots, a_n]$
  - reference_variable = $a_j$
  - constant_sum = K
  - weights = $[W_1, W_2, \cdots, W_n]$ or $W$ if $W_1=W_2= \cdots= W_n$
- "custom_func":
  - to be used when none of the above applies and you have to write your own functions to operate on the dataframe
  - variables: list of the variables to be used
  - validate_function: checks whether the received dataframe satisfies the given constraint. (df[DataFrame], variables[list], labels[list])->bool
  - format_function: returns a string that describes the constraint equation. (variables[list], labels[list])->str
  - encode_dataframe: creates the new custom columns in the dataframe. (df[DataFrame], variables[list], labels[list])->DataFrame
  - decode_dataframe: restores the original columns in the dataframe and calculates the propagated errors. (df[DataFrame], variables[list], labels[list], errors[DataFrame])->DataFrame, DataFrame
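As an illustration of the four "custom_func" callbacks, here is a minimal sketch for a hypothetical product rule, `price * quantity = total`. The function names, column names, and labels are invented for the example, and the error propagation assumes independent errors. Only the callback signatures are taken from the list above; how the functions are handed to COIM is not shown, since that depends on the constraint class.

```python
import numpy as np
import pandas as pd


def validate_product(df, variables, labels):
    """Return True if the third variable equals the product of the first two."""
    x, y, prod = variables
    return bool(np.allclose(df[x] * df[y], df[prod]))


def format_product(variables, labels):
    """Describe the constraint equation as a string."""
    x, y, prod = variables
    return f"{x} * {y} = {prod}"


def encode_product(df, variables, labels):
    """Drop the redundant product column and rename the factors to `labels`."""
    x, y, prod = variables
    return df.drop(columns=[prod]).rename(columns={x: labels[0], y: labels[1]})


def decode_product(df, variables, labels, errors):
    """Restore the original columns and propagate errors to the product."""
    x, y, prod = variables
    back = {labels[0]: x, labels[1]: y}
    decoded = df.rename(columns=back)
    decoded_errors = errors.rename(columns=back)
    decoded[prod] = decoded[x] * decoded[y]
    # First-order propagation for a product, assuming independent errors.
    decoded_errors[prod] = np.sqrt(
        (decoded[y] * decoded_errors[x]) ** 2
        + (decoded[x] * decoded_errors[y]) ** 2
    )
    return decoded, decoded_errors


variables = ["price", "quantity", "total"]
labels = ["enc_price", "enc_quantity"]
df = pd.DataFrame(
    {"price": [2.0, 3.0], "quantity": [5.0, 4.0], "total": [10.0, 12.0]}
)

encoded = encode_product(df, variables, labels)
errors = pd.DataFrame({"enc_price": [0.1, 0.1], "enc_quantity": [0.2, 0.2]})
decoded, decoded_errors = decode_product(encoded, variables, labels, errors)
```

Here the encoded dataframe keeps only the two independent factors, and the product is recomputed on decoding, so the decoded columns always satisfy the rule.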
Future additions
The following constraints are planned for implementation in the foreseeable future:
- Variable sum
- Constant and variable products
- Conditionals
Theoretical foundation
The worked-out mathematics for all of the implemented constraints can be found in the calculations PDF.