scenarioselector

The scenario selection algorithm selects a maximal subset of scenarios from a scenario set, so that the selected scenarios have specified means (or sums).

Intended Audience
- Science/Research
License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3
Topic
- Scientific/Engineering

Project description

The scenario selection algorithm selects a maximal subset of scenarios from a scenario set¹, so that the selected scenarios have specified means (or sums).

Package Installation

The Python scenarioselector package is published on the Python Package Index and hosted on GitHub. It can be installed with pip, the Package Installer for Python.

pip install scenarioselector

Demonstrations

The easiest way to understand the scenario selection algorithm is to read through, download², and run the two accompanying Jupyter Notebook demos, which are presented with Jupyter NBViewer.

Basic Usage

The following three steps outline basic usage of the ScenarioSelector class, which constructs instances of scenario selection problems and applies the selection algorithm.

1. Instantiate ScenarioSelector

Construct an object which defines the scenario selection problem you want to solve.

from scenarioselector import ScenarioSelector

selector = ScenarioSelector(data, weights=1, means=0, sums=0)

Variable	Allowable Types	Shape	Default Value	Description
`data`	List of lists, NumPy array, or pandas dataframe.	(N, D)	Required parameter.	Scenario set with N scenarios and D variables.
`weights`	Scalar, list or NumPy array.	(N,)	Unit weight for each scenario.	Strictly positive weights for each of the N scenarios.
`means`	Scalar, list or NumPy array.	(D,)	Zero mean for each variable.	Target means for the D variables.
`sums`	Scalar, list or NumPy array.	(D,)	Zero sum for each variable.	Target sums for the D variables.

Note: Non-zero target values may be specified for either means or sums, but not both.
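As a rough illustration of the shapes listed in the table above, the following sketch builds inputs for a made-up scenario set with N = 4 and D = 2. It uses only NumPy and is not the package's own validation code; the use of np.broadcast_to to expand a scalar is an assumption made here purely to show what "one value per scenario" or "one value per variable" means.

```python
import numpy as np

# Hypothetical scenario set: N = 4 scenarios, D = 2 variables.
data = np.array([[1.0, 2.0],
                 [0.5, -1.0],
                 [-2.0, 0.0],
                 [3.0, 1.5]])
N, D = data.shape

# A scalar stands for the same value on every scenario (or variable).
weights = np.broadcast_to(np.asarray(1.0), (N,))  # unit weight per scenario
means = np.broadcast_to(np.asarray(0.0), (D,))    # zero target mean per variable

# Weights must be strictly positive, one per scenario.
assert weights.shape == (N,) and np.all(weights > 0)
assert means.shape == (D,)

print("N =", N, "D =", D)
print("weights:", weights)
print("means:  ", means)
```

Inputs shaped like this correspond to the call ScenarioSelector(data, weights=1, means=0) shown above.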

2. Run the Scenario Selection Algorithm

Call the ScenarioSelector's optimize method to run the scenario selection algorithm.

selector.optimize(callback=None, pivot_rule=None)

Note: Calling selector.optimize() without parameters runs the algorithm with default parameters.

3. View Results

Results of the optimization can be inspected as follows³.

selector.selected is a NumPy array of Booleans which indicates which scenarios have been selected. If the input variable data is a NumPy array, then you can use NumPy's Boolean indexing functionality to obtain the selected scenario set as selected_data = data[selector.selected], and the associated weights as selected_weights = selector.weights[selector.selected].
- If you have specified target means, the weighted means of the reduced scenario set will be close to your specified target. You can verify this by calculating numpy.average(selected_data, weights=selected_weights, axis=0). If the original scenario set is equally weighted then you do not need to specify the selected weights.
- If you have specified target sums, the weighted sums of the reduced scenario set will be close to your specified target. You can verify this by calculating numpy.dot(selected_weights, selected_data). If each scenario has unit weight then you can get the same result by calculating numpy.sum(selected_data, axis=0).
selector.reduced_weights is a NumPy array of reduced weights associated with each scenario. You can verify the algorithm has hit the sums target precisely by calculating numpy.dot(selector.reduced_weights, data).
selector.probabilities is a NumPy array of probabilities associated with each scenario. You can verify the algorithm has hit the means target precisely by calculating numpy.dot(selector.probabilities, data).
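The Boolean indexing and the two verification formulas described above can be tried out without running the algorithm. In the sketch below the selection mask is invented by hand and merely stands in for selector.selected; the scenario data and weights are illustrative.

```python
import numpy as np

data = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = np.array([0.15, 0.25, 0.2, 0.25, 0.15])

# Hypothetical selection mask, standing in for selector.selected.
selected = np.array([False, True, True, False, False])

# Boolean indexing keeps only the rows (and weights) of selected scenarios.
selected_data = data[selected]
selected_weights = weights[selected]

# Weighted sums and weighted means of the reduced scenario set.
weighted_sums = np.dot(selected_weights, selected_data)
weighted_means = np.average(selected_data, weights=selected_weights, axis=0)

print("Weighted sums: ", weighted_sums)
print("Weighted means:", weighted_means)
```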

Example of Basic Usage

The following is an example of basic usage with N = 5 and D = 2.

Consider a finite discrete probability space, (Ω, P), where Ω := {ω₁, ω₂, ω₃, ω₄, ω₅} and the probabilities of each outcome are p₁ = P(ω₁) = 0.15, p₂ = P(ω₂) = 0.25, p₃ = P(ω₃) = 0.2, p₄ = P(ω₄) = 0.25 and p₅ = P(ω₅) = 0.15.

Consider an R²-valued random variable X with five realizations X(ω₁) = (0.8, -3.2), X(ω₂) = (3.0, 2.9), X(ω₃) = (3.0, 2.5), X(ω₄) = (-0.8, -1.0) and X(ω₅) = (0.8, -2.0).

Suppose we want to select a maximal subset of the five scenarios, so that the weighted sum of the outcomes X(ω_n) of the selected scenarios is equal to (1.1, 1.0). More precisely, we want to find reduced weights 0 ≤ q_n ≤ p_n which maximize Σ_n q_n, subject to the constraint Σ_n q_n X(ω_n) = (1.1, 1.0).

We define an array of shape (5, 2) which holds the scenario set data.

from scenarioselector import ScenarioSelector
import numpy as np

data    = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = [0.15, 0.25, 0.2, 0.25, 0.15]
sums    = [1.1, 1.0]

selector = ScenarioSelector(data, weights=weights, sums=sums)

print()
print("Before optimization")
print("-------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))

selector.optimize()

print()
print("After optimization")
print("------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))

Note: Python uses zero-based array indices so, for example, data[1] evaluates to [3.0, 2.9].
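The optimization problem in this example is a small linear program, so the reduced weights can be cross-checked independently of the package. The sketch below is not the package's algorithm: it is a brute-force search, written only for this N = 5, D = 2 example, which relies on the fact that an optimum of a bounded linear program is attained at a vertex, where at most D weights lie strictly between their bounds. It enumerates every such candidate and keeps the feasible one with the largest total weight.

```python
from itertools import combinations, product

# Scenario data, weights and target sums from the example above.
data = [[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]]
weights = [0.15, 0.25, 0.2, 0.25, 0.15]
sums = [1.1, 1.0]

N = len(data)
TOL = 1e-9
best_q, best_total = None, float("-inf")

# Pick the two weights that may lie strictly between their bounds.
for i, j in combinations(range(N), 2):
    rest = [n for n in range(N) if n not in (i, j)]
    # Determinant of the 2x2 system in the two free weights q_i and q_j.
    det = data[i][0] * data[j][1] - data[j][0] * data[i][1]
    if abs(det) < TOL:
        continue
    # Every other weight sits at a bound: 0 (flag 0) or p_n (flag 1).
    for flags in product([0, 1], repeat=len(rest)):
        q = [0.0] * N
        for n, flag in zip(rest, flags):
            q[n] = weights[n] * flag
        # Right-hand side left over once the fixed weights are accounted for.
        r0 = sums[0] - sum(q[n] * data[n][0] for n in rest)
        r1 = sums[1] - sum(q[n] * data[n][1] for n in rest)
        # Cramer's rule for q_i * X_i + q_j * X_j = (r0, r1).
        q[i] = (r0 * data[j][1] - data[j][0] * r1) / det
        q[j] = (data[i][0] * r1 - r0 * data[i][1]) / det
        feasible = all(-TOL <= q[n] <= weights[n] + TOL for n in (i, j))
        if feasible and sum(q) > best_total:
            best_q, best_total = q, sum(q)

print("Reduced weights:", [round(v, 6) for v in best_q])
print("Total weight:   ", round(best_total, 6))
```

The printed weights can be compared with selector.reduced_weights after optimization. This enumeration grows exponentially with N and is only practical for toy problems.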

Advanced Usage

ScenarioSelector Properties

A ScenarioSelector object has the following properties, which can be queried at any stage of the optimization.

Property	Type	Shape	Description
`selected`	NumPy array	(N,)	Booleans, indicating which scenarios are selected.
`reduced_weights`	NumPy array	(N,)	Reduced weights associated with each scenario.
`probabilities`	NumPy array	(N,)	Probabilities associated with each scenario.
`lagrange_multiplier`	NumPy array	(D,)	Lagrange multiplier for the dual problem.
`tableau`	pandas dataframe		Condensed tableau for the simplex algorithm.
`pivot_count`	int		Number of pivot operations used to get to the current state.

Callback Function

The scenario selector's optimize method can be parameterized with a bespoke callback function. For example,

tableaus = []

def callback(selector, i, element):
	print("Iteration {} pivots on element {}.".format(i, element))
	tableaus.append(selector.tableau)

To keep track of the optimization progress, call the ScenarioSelector's optimize method with the callback function as a parameter.

selector.optimize(callback=callback)
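A callback can also record structured progress rather than print it. The sketch below assumes only the callback signature shown above and the documented selected and pivot_count properties. The helper name make_recording_callback is hypothetical, and so is the stand-in object, which exists solely so the snippet runs without the package installed. The value passed as element is a placeholder, since its actual type is not described here.

```python
from types import SimpleNamespace


def make_recording_callback(history):
    """Return a callback that appends one record per call to `history`."""
    def callback(selector, i, element):
        history.append({
            "iteration": i,
            "element": element,
            "pivot_count": selector.pivot_count,
            "n_selected": int(sum(selector.selected)),
        })
    return callback


history = []
callback = make_recording_callback(history)

# Stand-in with the two properties the callback reads (not a real selector).
fake_selector = SimpleNamespace(pivot_count=1, selected=[True, False, True])
callback(fake_selector, 0, (2, 1))  # (2, 1) is a placeholder element

print(history)
```

With the real package, the same callback would be passed as selector.optimize(callback=callback).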

Pivot Rule

A pivot rule determines which variable and scenario(s) to use for pivot and flip operations in the modified simplex algorithm.

from scenarioselector.pivot_rule import PivotRule, PivotRuleSlowed
from scenarioselector.pivot_variable import (Dantzig, DantzigTwoPhase,
                                             MaxObjectiveImprovement, MaxObjectiveImprovementTwoPhase)
from scenarioselector.pivot_scenarios import pivot_scenarios, barrodale_roberts

pivot_rule = PivotRule(pivot_variable=DantzigTwoPhase, pivot_scenarios=barrodale_roberts)
selector.optimize(pivot_rule=pivot_rule)

The choices of pivot variable and pivot scenario(s) are discussed in the next two subsections.

Note: The derived pivot rule PivotRuleSlowed is designed specifically for use with the Barrodale Roberts improvement. This rule slows down the effect of passing through each vertex in succession, and is included only for demonstration purposes.

Pivot Variable

A pivot_variable rule determines which variable to use for the next pivot operation. Pre-defined pivot_variable rules can be summarised as follows.

Rule	Description
`Dantzig`	Choose the variable whose corresponding entry in the basement row of the condensed tableau has the largest magnitude.
`DantzigTwoPhase`	Similar to `Dantzig`, except that the first D operations move all the Lagrange multiplier variables into the basis.
`MaxObjectiveImprovement`	Choose the variable such that a classical pivot operation will lead to the largest improvement in the objective value.
`MaxObjectiveImprovementTwoPhase`	Similar to `MaxObjectiveImprovement`, except that the first D operations move all the Lagrange multiplier variables into the basis.
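To make the `Dantzig` description concrete, the following sketch picks the entry of largest magnitude from a row of numbers. It is not the package's implementation: the function name dantzig_choice and the sample row are hypothetical, and the real rule operates on the condensed tableau, whose layout is not described here.

```python
def dantzig_choice(basement_row):
    """Index of the entry with the largest magnitude (first one on ties)."""
    return max(range(len(basement_row)), key=lambda k: abs(basement_row[k]))


# Hypothetical basement row of a condensed tableau.
basement_row = [0.4, -1.7, 0.0, 1.2]
print(dantzig_choice(basement_row))
```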

Pivot Scenarios

A pivot_scenarios rule determines which scenario(s) to use for the next pivot and associated flip operations. The Barrodale Roberts improvement lets the modified simplex algorithm pass through multiple vertices at once, so that an array of selection states can be flipped in a single operation.

Footnotes

¹ A scenario set is a set of (possibly weighted) observations of multi-variate data.
² The example notebooks are located in a separate project which is also hosted on GitHub.
³ This section assumes you have imported NumPy with the statement import numpy.

Release history

This version

0.1.4

May 4, 2021

0.1.3

Apr 28, 2021

0.1.2

Apr 23, 2021

0.1.1

Apr 23, 2021

0.1.0

Apr 22, 2021

Download files

Source Distribution

scenarioselector-0.1.4.tar.gz (18.9 kB)

Uploaded May 4, 2021 Source

Built Distribution

scenarioselector-0.1.4-py3-none-any.whl (19.2 kB)

Uploaded May 4, 2021 Python 3

Hashes for scenarioselector-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`cae45daf169b1f748c17c85b03fd7af8f8e05374a2823cb2ef5ded42a2bcbcdf`
MD5	`2da208719aee384d6cc94f0f5c1bd8b2`
BLAKE2b-256	`1ced4e3a5c6f465fc85a1f00bc662ef939d9be9e798e9121efb8b7b7844b0b5c`

Hashes for scenarioselector-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`550405464eb001e82e64d6c95a862adb30d3123ba4eb31c48efd93f1cbb15cbc`
MD5	`f2088f34e06c24b1184caad2fede6947`
BLAKE2b-256	`cacf804e58843fbd0e79652557cf44f299c90fc36aed90ed0fcb415ed507557f`