Project description

The scenario selection algorithm selects a maximal subset of scenarios from a scenario set[1], so that the selected scenarios have specified means (or sums).

Package Installation

The Python scenarioselector package is published on the Python Package Index (PyPI) and hosted on GitHub. It can be installed with pip, the Package Installer for Python.

pip install scenarioselector

Demonstrations

The easiest way to understand the scenario selection algorithm is to read through, download[2] and run the two accompanying Jupyter Notebook demos, which are presented with the Jupyter NBViewer.

Basic Usage

The following three steps outline basic usage of the ScenarioSelector class, which constructs instances of scenario selection problems and applies the selection algorithm.

1. Instantiate ScenarioSelector

Construct an object which defines the scenario selection problem you want to solve.

from scenarioselector import ScenarioSelector

selector = ScenarioSelector(data, weights=1, means=0, sums=0)

| Variable | Allowable Types | Shape | Default Value | Description |
|----------|-----------------|-------|---------------|-------------|
| data | List of lists, NumPy array, or pandas DataFrame. | (N, D) | Required parameter. | Scenario set with N scenarios and D variables. |
| weights | Scalar, list or NumPy array. | (N,) | Unit weight for each scenario. | Strictly positive weights for each of the N scenarios. |
| means | Scalar, list or NumPy array. | (D,) | Zero mean for each variable. | Target means for the D variables. |
| sums | Scalar, list or NumPy array. | (D,) | Zero sum for each variable. | Target sums for the D variables. |

Note: Non-zero target values may be specified for either means or sums, but not both.
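
The shapes in the table can be illustrated with NumPy alone. The sketch below is not part of scenarioselector: the helper name expand_inputs is hypothetical, and reading a scalar as "one value per scenario" or "one value per variable" is an assumption about what the scalar forms mean.

```python
import numpy as np

def expand_inputs(data, weights=1, means=0):
    """Hypothetical helper: broadcast scalar inputs to the shapes in the table."""
    data = np.asarray(data, dtype=float)  # shape (N, D)
    n_scenarios, n_variables = data.shape
    # A scalar becomes one value per scenario (weights) or per variable (means).
    weights = np.broadcast_to(np.asarray(weights, dtype=float), (n_scenarios,))
    means = np.broadcast_to(np.asarray(means, dtype=float), (n_variables,))
    if np.any(weights <= 0):
        raise ValueError("weights must be strictly positive")
    return data, weights, means

data, weights, means = expand_inputs([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5]])
print(data.shape, weights.shape, means.shape)
```

Here N = 3 and D = 2, so the defaults expand to three unit weights and two zero target means.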

2. Run the Scenario Selection Algorithm

Call the ScenarioSelector's optimize method to run the scenario selection algorithm.

selector.optimize(callback=None, pivot_rule=None)

Note: Calling selector.optimize() without parameters runs the algorithm with default parameters.

3. View Results

Results of the optimization can be inspected as follows[3].

  • selector.selected is a NumPy array of Booleans which indicates which scenarios have been selected. If the input variable data is a NumPy array, then you can use NumPy's Boolean indexing functionality to obtain the selected scenario set as selected_data = data[selector.selected], and the associated weights as selected_weights = selector.weights[selector.selected].

    • If you have specified target means, the weighted means of the reduced scenario set will be close to your specified target. You can verify this by calculating numpy.average(selected_data, weights=selected_weights, axis=0). If the original scenario set is equally weighted then you do not need to specify the selected weights.
    • If you have specified target sums, the weighted sums of the reduced scenario set will be close to your specified target. You can verify this by calculating numpy.dot(selected_weights, selected_data). If each scenario has unit weight then you can get the same result by calculating numpy.sum(selected_data, axis=0).
  • selector.reduced_weights is a NumPy array of reduced weights associated with each scenario. You can verify the algorithm has hit the sums target precisely by calculating numpy.dot(selector.reduced_weights, data).

  • selector.probabilities is a NumPy array of probabilities associated with each scenario. You can verify the algorithm has hit the means target precisely by calculating numpy.dot(selector.probabilities, data).
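
The checks described in the bullets above can be tried out without running the optimizer. In the sketch below the Boolean array selected is a hand-written stand-in for selector.selected, and the data and weights are made-up illustrative values, so only the NumPy calls themselves come from the text.

```python
import numpy as np

# Illustrative scenario set (N = 3, D = 2); these values are invented for the example.
data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
weights = np.array([0.2, 0.3, 0.5])

# Stand-in for selector.selected: pretend scenarios 0 and 2 were selected.
selected = np.array([True, False, True])

# Boolean indexing keeps only the selected rows and their weights.
selected_data = data[selected]
selected_weights = weights[selected]

# Weighted means and weighted sums of the reduced scenario set.
weighted_means = np.average(selected_data, weights=selected_weights, axis=0)
weighted_sums = np.dot(selected_weights, selected_data)

print("Weighted means:", weighted_means)
print("Weighted sums: ", weighted_sums)
```

With a real selector, the same two expressions would be compared against the target means or target sums that were passed to the constructor.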

Example of Basic Usage

The following is an example of basic usage with N = 5 and D = 2.

Consider a finite discrete probability space, (Ω, P), where Ω := {ω₁, ω₂, ω₃, ω₄, ω₅} and the probabilities of the outcomes are p₁ = P(ω₁) = 0.15, p₂ = P(ω₂) = 0.25, p₃ = P(ω₃) = 0.2, p₄ = P(ω₄) = 0.25 and p₅ = P(ω₅) = 0.15.

Consider an R²-valued random variable X with five realizations X(ω₁) = (0.8, -3.2), X(ω₂) = (3.0, 2.9), X(ω₃) = (3.0, 2.5), X(ω₄) = (-0.8, 1.0) and X(ω₅) = (0.8, -2.0).

Suppose we want to select a maximal subset of the five scenarios, so that the weighted sum of the outcomes X(ωₙ) of the selected scenarios is equal to (1.1, 1.0). More precisely, we want to find reduced weights 0 ≤ qₙ ≤ pₙ which maximize Σₙ qₙ, subject to the constraint Σₙ qₙ X(ωₙ) = (1.1, 1.0).

We define an array of shape (5, 2) which holds the scenario set data.

from scenarioselector import ScenarioSelector
import numpy as np

data    = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = [0.15, 0.25, 0.2, 0.25, 0.15]
sums    = [1.1, 1.0]

selector = ScenarioSelector(data, weights=weights, sums=sums)

print()
print("Before optimization")
print("-------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))

selector.optimize()

print()
print("After optimization")
print("------------------")
print(sum(selector.selected), "scenarios selected: ", selector.selected)
print("Exact sums:", np.dot(selector.reduced_weights, data))
print("Approx sums:", np.dot(selector.weights[selector.selected], data[selector.selected]))

Note: Python uses zero-based array indices so, for example, data[1] evaluates to [3.0, 2.9].
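
The constraints of this problem can be checked for any candidate vector of reduced weights using NumPy only. The sketch below reuses the data and weights arrays from the listing above; the function name is_feasible and the hand-picked vector q are assumptions made for illustration. The vector q is feasible but is not claimed to be the optimum the algorithm returns.

```python
import numpy as np

data = np.array([[0.8, -3.2], [3.0, 2.9], [3.0, 2.5], [-0.8, -1.0], [0.8, -2.0]])
weights = np.array([0.15, 0.25, 0.2, 0.25, 0.15])
target = np.array([1.1, 1.0])

def is_feasible(q, weights, data, target, tol=1e-9):
    """Hypothetical check: 0 <= q_n <= p_n and sum_n q_n X(w_n) equals the target."""
    q = np.asarray(q, dtype=float)
    within_bounds = np.all(q >= -tol) and np.all(q <= weights + tol)
    hits_target = np.allclose(q @ data, target, atol=tol)
    return bool(within_bounds and hits_target)

# Hand-picked candidate that uses only the second and third scenarios.
q = np.array([0.0, 5 / 24, 19 / 120, 0.0, 0.0])

print("Candidate feasible:   ", is_feasible(q, weights, data, target))
print("Full weights feasible:", is_feasible(weights, weights, data, target))
print("Objective value (sum of q):", q.sum())
```

Keeping every scenario at its full weight misses the target sums, which is why the weights have to be reduced at all.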

Advanced Usage

ScenarioSelector Properties

A ScenarioSelector object has the following properties, which can be queried at any stage of the optimization.

| Property | Type | Shape | Description |
|----------|------|-------|-------------|
| selected | NumPy array | (N,) | Booleans, indicating which scenarios are selected. |
| reduced_weights | NumPy array | (N,) | Reduced weights associated with each scenario. |
| probabilities | NumPy array | (N,) | Probabilities associated with each scenario. |
| lagrange_multiplier | NumPy array | (D,) | Lagrange multiplier for the dual problem. |
| tableau | pandas DataFrame | | Condensed tableau for the simplex algorithm. |
| pivot_count | int | | Number of pivot operations used to get to the current state. |

Callback Function

The scenario selector's optimize method can be parameterized with a bespoke callback function. For example,

tableaus = []

def callback(selector, i, element):
    # Report each pivot and keep the tableau seen at that iteration.
    print("Iteration {} pivots on element {}.".format(i, element))
    tableaus.append(selector.tableau)

To keep track of the optimization progress, call the ScenarioSelector's optimize method with the callback function as a parameter.

selector.optimize(callback=callback)
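
A callback can also record the documented properties at each iteration. The sketch below keeps the (selector, i, element) signature shown above and reads pivot_count and selected. The SimpleNamespace object is only a stand-in so the snippet runs without the package, and the tuple passed as element is a placeholder, since the type of that argument is not specified here.

```python
from types import SimpleNamespace

history = []

def record_progress(selector, i, element):
    """Store the iteration, pivot element, pivot count and number of selected scenarios."""
    n_selected = int(sum(selector.selected))
    history.append((i, element, selector.pivot_count, n_selected))

# Stand-in for a ScenarioSelector instance (assumption: only these two
# properties are read by the callback).
fake_selector = SimpleNamespace(pivot_count=1, selected=[True, False, True])

record_progress(fake_selector, 0, (2, 1))
print(history)
```

With the real class, the function would be passed as selector.optimize(callback=record_progress).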

Pivot Rule

A pivot rule determines which variable and scenario(s) to use for pivot and flip operations in the modified simplex algorithm.

from scenarioselector.pivot_rule import PivotRule, PivotRuleSlowed
from scenarioselector.pivot_variable import (Dantzig, DantzigTwoPhase,
                                             MaxObjectiveImprovement, MaxObjectiveImprovementTwoPhase)
from scenarioselector.pivot_scenarios import pivot_scenarios, barrodale_roberts

pivot_rule = PivotRule(pivot_variable=DantzigTwoPhase, pivot_scenarios=barrodale_roberts)
selector.optimize(pivot_rule=pivot_rule)

The choices of pivot variable and pivot scenario(s) are discussed in the next two subsections.

Note: The derived pivot rule PivotRuleSlowed is designed specifically for use with the Barrodale Roberts improvement. This rule slows down the effect of passing through each vertex in succession, and is included only for demonstration purposes.

Pivot Variable

A pivot_variable rule determines which variable to use for the next pivot operation. Pre-defined pivot_variable rules can be summarised as follows.

| Rule | Description |
|------|-------------|
| Dantzig | Choose the variable whose corresponding entry in the basement row of the condensed tableau has the largest magnitude. |
| DantzigTwoPhase | Similar to Dantzig, however the first D operations move all the Lagrange multiplier variables into the basis. |
| MaxObjectiveImprovement | Choose the variable such that a classical pivot operation will lead to the largest improvement in the objective value. |
| MaxObjectiveImprovementTwoPhase | Similar to MaxObjectiveImprovement, however the first D operations move all the Lagrange multiplier variables into the basis. |
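
The Dantzig description amounts to an argmax over absolute values. The following is a minimal sketch of that idea on a plain list. It is not the package's implementation: the function name dantzig_choice is hypothetical, and breaking ties in favour of the first entry is an assumption.

```python
def dantzig_choice(basement_row):
    """Return the index of the entry with the largest magnitude (first wins on ties)."""
    return max(range(len(basement_row)), key=lambda j: abs(basement_row[j]))

# The entry -2.0 has the largest magnitude, so its index is chosen.
print(dantzig_choice([0.5, -2.0, 1.5]))
```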

Pivot Scenarios

A pivot_scenarios rule determines which scenario(s) to use for the next pivot and associated flip operations. The Barrodale Roberts improvement lets the modified simplex algorithm pass through multiple vertices at once, so that an array of selection states can be flipped in a single operation.

Footnotes

  1. A scenario set is a set of (possibly weighted) observations of multi-variate data.
  2. The example notebooks are located in a separate project which is also hosted on GitHub.
  3. This section assumes you have imported NumPy with the statement import numpy.
