alea package

Subpackages

Submodules

alea.model module

class alea.model.MinuitWrap(f: Callable, parameters: Parameters)[source]

Bases: object

Wrapper for functions to be called by Minuit. Initialized with a function f and a Parameters instance.

func

function wrapped

s_args

parameter names of the model

Type

list

_parameters

parameters and limits of the model

Type

dict

Parameters
  • f (Callable) – function to be wrapped

  • parameters (Parameters) – parameters of the model

__init__(f: Callable, parameters: Parameters)[source]

Initialize the wrapper.

class alea.model.StatisticalModel(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]

Bases: object

Class that defines a statistical model.

  • The statistical model contains two parts that you must define yourself:
    • a likelihood function

      ll(self, parameter_1, parameter_2… parameter_n): A function of a set of named parameters that returns a float expressing the log-likelihood of the observed data given these parameters.

    • a data generation function

      generate_data(self, parameter_1, parameter_2… parameter_n): A function of the same set of named parameters that returns a full data set.

  • Methods that you must implement:
    • _ll

    • _generate_data

  • Methods that you may implement:
    • get_expectation_values

  • Methods that already exist here:
    • ll

    • store_data

    • fit

    • get_parameter_list

    • confidence_interval

The public methods generate_data and ll, as their names suggest, depend on the private methods _generate_data and _ll, respectively.
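The split between public and private methods can be sketched as follows. This is a minimal stand-in written for illustration, not alea's implementation: the classes SketchModel and GaussianSketch and the parameters mu and sigma are hypothetical, and only the method names ll, generate_data, _ll and _generate_data come from the description above.

```python
import math
import random


class SketchModel:
    """Hypothetical stand-in for a base class; not alea.model.StatisticalModel."""

    def __init__(self, defaults, data=None):
        self.defaults = dict(defaults)  # default value for each named parameter
        self.data = data

    def _ll(self, **kwargs):
        raise NotImplementedError("implement _ll in a subclass")

    def _generate_data(self, **kwargs):
        raise NotImplementedError("implement _generate_data in a subclass")

    def _fill_defaults(self, kwargs):
        unknown = set(kwargs) - set(self.defaults)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return {**self.defaults, **kwargs}

    def ll(self, **kwargs):
        # Public entry point: keyword-only, missing parameters take defaults.
        return self._ll(**self._fill_defaults(kwargs))

    def generate_data(self, **kwargs):
        return self._generate_data(**self._fill_defaults(kwargs))


class GaussianSketch(SketchModel):
    """Toy model: independent Gaussian observations with mean mu, width sigma."""

    def _ll(self, mu, sigma):
        return sum(
            -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
            for x in self.data
        )

    def _generate_data(self, mu, sigma):
        return [random.gauss(mu, sigma) for _ in range(10)]


model = GaussianSketch(defaults={"mu": 0.0, "sigma": 1.0})
model.data = model.generate_data(mu=1.0)  # sigma falls back to its default
log_likelihood = model.ll(mu=1.0)
```

Only the two private methods are written per model; the public wrappers handle defaults and reject unknown parameter names in one place.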

data

data of the model

_data

data of the model

_confidence_level

confidence level for confidence intervals

_confidence_interval_kind

kind of confidence interval to compute

parameters

parameters of the model

confidence_interval_threshold

threshold for confidence interval

is_data_set

True if data is set

Type

bool

Parameters
  • parameter_definition (dict or list, optional (default=None)) – definition of the parameters of the model

  • confidence_level (float, optional (default=0.9)) – confidence level for confidence intervals

  • confidence_interval_kind (str, optional (default="central")) – kind of confidence interval to compute

  • confidence_interval_threshold (Callable[[float], float], optional (default=None)) – threshold for confidence interval

  • confidence_interval_root_find (str, optional (default="brentq")) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”

  • data (dict or list, optional (default=None)) – pre-set data of the model

  • fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings

Raises
  • RuntimeError – if you try to instantiate the StatisticalModel class directly

  • NotImplementedError – if you do not implement the likelihood function or the data generation

__init__(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]

Initialize a statistical model.

_check_ll_and_generate_data_signature()[source]

Check that the likelihood and generate_data functions have the same signature.

_confidence_interval_checks(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, asymptotic_dof: Optional[int] = None, **kwargs) Tuple[str, Callable[[float], float], str, Tuple[float, float]][source]

Helper function for confidence_interval that performs the input checks and returns the bounds.

Parameters
  • poi_name (str) – name of the parameter of interest

  • parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges

  • confidence_level (float, optional (default=None)) – confidence level for confidence intervals

  • confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute

  • confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”

Returns

confidence interval kind, confidence interval threshold, parameter interval bounds

Return type

Tuple[str, Callable[[float], float], str, Tuple[float, float]]

_define_parameters(parameter_definition, nominal_values=None)[source]

Initialize the parameters of the model.

_generate_data(**kwargs)[source]

Generate data for the given parameters.

_ll(**kwargs) float[source]

Likelihood function, returns the log-likelihood for the given parameters.

confidence_interval(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, confidence_interval_args: Optional[dict] = None, best_fit_args: Optional[dict] = None, asymptotic_dof: Optional[int] = None, fit_strategy: Optional[dict] = None) Tuple[float, float][source]

Uses self.fit to compute confidence intervals for a certain named parameter. If the parameter is a rate parameter, and the model has expectation values implemented, the bounds will be interpreted as bounds on the expectation value, so that the range in the fit is parameter_interval_bounds/mus. Otherwise the bound is taken as-is.

Parameters
  • poi_name (str) – name of the parameter of interest

  • parameter_interval_bounds (Tuple[float, float], optional (default=None)) –

    range in which to search for the confidence interval edges. May be specified as:

    • setting the property “parameter_interval_bounds” for the parameter

    • passing a list here

    • passing None here, the property of the parameter is used

  • confidence_level (float, optional (default=None)) – confidence level for confidence intervals. If None, the default confidence level of the model is used.

  • confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute. If None, the default kind of the model is used.

  • confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”

  • confidence_interval_args (dict, optional (default=None)) – Parameters that will be fixed in the profile likelihood computation. If None, all fittable parameters will be profiled except the poi.

  • best_fit_args (dict, optional (default=None)) – Parameters that will be fixed in the “global” best fit used to normalise the profile likelihood ratio. Set this if the global best fit should fix fewer parameters than the profile likelihood. This is mainly used for 1-D slices of higher-dimensional confidence volumes, where the global best fit may not lie along the profile. If None, will be set to confidence_interval_args.

  • asymptotic_dof (int, optional (default=None)) – degrees of freedom for the asymptotic confidence interval threshold

  • fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings.
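The interval search can be illustrated with a self-contained sketch. It is not alea's implementation: it assumes a Gaussian mean with known width (so the likelihood ratio is analytic and there are no nuisance parameters to profile), a one-degree-of-freedom asymptotic threshold, and plain bisection in place of the documented “brentq” root finder. The names bisect_root and central_interval are hypothetical.

```python
from statistics import NormalDist


def bisect_root(func, lo, hi, tol=1e-10):
    """Plain bisection; func(lo) and func(hi) must have opposite signs."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed by the given bounds")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid  # sign change lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def central_interval(data, sigma, bounds, confidence_level=0.9):
    """Central interval on a Gaussian mean from the likelihood ratio."""
    n = len(data)
    best_fit = sum(data) / n  # analytic maximum-likelihood estimate

    def test_statistic(mu):
        # -2 log(L(mu) / L(best_fit)) for a Gaussian with known sigma
        return n * (best_fit - mu) ** 2 / sigma**2

    # Asymptotic threshold for one degree of freedom: the chi-square
    # quantile equals the squared two-sided normal quantile.
    threshold = NormalDist().inv_cdf(0.5 + confidence_level / 2) ** 2

    def excess(mu):
        return test_statistic(mu) - threshold

    # Search each edge between one search bound and the best fit.
    lower = bisect_root(excess, bounds[0], best_fit)
    upper = bisect_root(excess, best_fit, bounds[1])
    return lower, upper


interval = central_interval([0.8, 1.1, 1.3, 0.9, 0.9], sigma=0.5, bounds=(-5.0, 5.0))
```

The bounds argument plays the role of parameter_interval_bounds: each edge is searched between one bound and the best fit, so the bounds must be wide enough to bracket both crossings.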

property data

Simple getter for a data set, mainly here so that it can be overridden for special needs.

Data-sets are expected to be in the form of a list of one or more structured arrays, representing the data-sets of one or more likelihood terms.

fit(verbose: Optional[bool] = False, fit_strategy: Optional[dict] = None, **kwargs) Tuple[dict, float][source]

Fit the model to the data by maximizing the likelihood. Return a dict containing best-fit values of each parameter, and the value of the likelihood evaluated there. While the optimization is a minimization, the likelihood returned is the maximum of the likelihood.

Parameters
  • verbose (bool) – if True, print the Minuit object

  • fit_strategy (dict) –

    override the default fit strategy defined in the model (model.fit_strategy). Possible settings are:

    • minimizer_routine (str): the minimizer routine to use, either “migrad”, “simplex”, or “simplex_migrad” (first run simplex, then migrad).

    • minuit_strategy (int): strategy for Minuit, can be 0, 1, or 2. The higher the number, the more precise the fit but also the slower.

    • refit_invalid (bool): if True, refit with the simplex_migrad routine and strategy 2 if the optimization does not converge the first time.

    • disable_index_fitting (bool): if True, disable the index fitting even if the model has index parameters.

    • max_index_fitting_iter (int): maximum number of iterations for index fitting

Returns

best-fit values of each parameter, and the value of the likelihood evaluated there

Return type

dict, float
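The sign convention described for fit can be sketched as below. This is an illustration under stated assumptions, not alea's code: the cost handed to the minimizer is assumed to be the negated log-likelihood, a small golden-section search stands in for Minuit, and the helper names make_cost_sketch and golden_section_minimize are hypothetical.

```python
import math


def make_cost_sketch(ll, names):
    """Turn a keyword-based log-likelihood into a positional cost function."""

    def cost(values):
        # Minimizers minimize, so the cost is the negated log-likelihood.
        return -ll(**dict(zip(names, values)))

    return cost


def golden_section_minimize(cost_1d, lo, hi, tol=1e-9):
    """Minimal 1-D minimizer used here in place of Minuit."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if cost_1d(c) < cost_1d(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


data = [0.8, 1.1, 1.3, 0.9, 0.9]


def ll(mu):
    # Gaussian log-likelihood with unit width, constant terms dropped
    return -0.5 * sum((x - mu) ** 2 for x in data)


cost = make_cost_sketch(ll, ["mu"])
mu_hat = golden_section_minimize(lambda mu: cost([mu]), -5.0, 5.0)
best_fit, max_ll = {"mu": mu_hat}, -cost([mu_hat])  # flip the sign back
```

The returned pair mirrors the documented return type: a dict of best-fit values and the maximized log-likelihood, not the minimized cost.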

generate_data(**kwargs) Union[dict, list][source]

Generate data for the given parameters. The parameters are passed as keyword arguments, positional arguments are not possible. If a parameter is not given, the default value is used.

Raises

ValueError – If the parameters are not within the fit limits

Returns

generated data

Return type

dict or list

Caution

This implementation won’t allow you to call generate_data by positional arguments.

get_expectation_values(**parameter_values)[source]

Get the expectation values of the measurement.

Parameters

parameter_values – values of the parameters

get_likelihood_term_from_name(likelihood_name: str) int[source]

Return the index of a likelihood term if the likelihood has several names.

Parameters

likelihood_name (str) – name of the likelihood term

Returns

index of the likelihood term

Return type

int

static get_model_from_name(statistical_model: str)[source]

Get the statistical model class from a string.

get_parameter_list()[source]

Return a set of all parameters that generate_data and the likelihood accept.

ll(**kwargs) float[source]

Likelihood function, returns the log-likelihood for the given parameters. The parameters are passed as keyword arguments, positional arguments are not possible. If a parameter is not given, the default value is used.

Keyword Arguments

kwargs – keyword arguments for the parameters

Returns

likelihood value

Return type

float

make_objective()[source]

Make a function that can be passed to Minuit.

Returns

function that can be passed to Minuit

Return type

Callable

property nominal_expectation_values

Nominal expectation values for the sources of the likelihood.

For this to work, you must implement get_expectation_values.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments

fit_guesses (dict) – A dict of parameter names and values.

store_data(file_name, data_list, data_name_list: Optional[List[str]] = None, metadata: Optional[dict] = None)[source]

Store a list of datasets, each in the form of a list of one or more structured arrays or dicts. Storage uses inference_interface; the method is included here so that it can be overridden. The expected structure is [[datasets1], [datasets2], ..., [datasetsn]], where each datasets entry is a list of structured arrays. If data_name_list is specified, it is used; if not, the names are read from self.get_likelihood_term_names. If that is not defined, the names are ["0", "1", ..., "n-1"]. The metadata is optional.

Parameters
  • file_name (str) – name of the file to store the data in

  • data_list (list) – list of datasets

  • data_name_list (list, optional (default=None)) – list of names of the datasets. If None, it will be read from self.get_likelihood_term_names

  • metadata (dict, optional (default=None)) – metadata to store with the data. If None, no metadata is stored.
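The fallback order for the dataset names can be sketched as follows. The function resolve_data_names and its term_names argument are hypothetical, and the sketch assumes that the numeric fallback yields one name per array within an entry of data_list; the actual storage call through inference_interface is not shown.

```python
def resolve_data_names(data_list, data_name_list=None, term_names=None):
    """Pick names for the arrays stored in each entry of data_list."""
    if data_name_list is not None:
        return list(data_name_list)  # explicit names win
    if term_names is not None:
        return list(term_names)  # names of the likelihood terms, if defined
    # Fallback: "0", "1", ..., one per array in the first entry (assumption).
    return [str(i) for i in range(len(data_list[0]))]


toys = [[[1.0, 2.0], [3.0]], [[0.5, 1.5], [2.5]]]  # two toys, two arrays each
names = resolve_data_names(toys)
```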

alea.parameters module

class alea.parameters.ConditionalParameter(name: str, conditioning_parameter_name: str, **kwargs)[source]

Bases: object

This class is used to define a parameter that depends on another parameter. It has the same attributes as the Parameter class but each of them can be a dictionary with keys being the values of the conditioning parameter and values being the corresponding values of the conditional parameter. Calling the object with the conditioning parameter value as an argument will return a corresponding Parameter object with the correct values.
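The dictionary-valued attributes and the call behaviour can be illustrated with a small stand-in. SketchParameter and SketchConditionalParameter are hypothetical classes written for this example, and the parameter names efficiency and mass are invented; only the idea of resolving per-condition values on call is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParameter:
    """Hypothetical stand-in for a plain parameter."""

    name: str
    nominal_value: Optional[float] = None
    fittable: bool = True


class SketchConditionalParameter:
    """Attributes may be dicts keyed by the conditioning parameter's value."""

    def __init__(self, name, conditioning_parameter_name, **attributes):
        self.name = name
        self.conditioning_parameter_name = conditioning_parameter_name
        self.attributes = attributes

    def __call__(self, conditioning_value):
        # Dict-valued attributes are looked up; scalars are shared.
        resolved = {
            key: value[conditioning_value] if isinstance(value, dict) else value
            for key, value in self.attributes.items()
        }
        return SketchParameter(name=self.name, **resolved)


efficiency = SketchConditionalParameter(
    "efficiency",
    conditioning_parameter_name="mass",
    nominal_value={10: 0.8, 50: 0.9},  # one nominal value per mass
    fittable=False,  # shared by all masses
)
parameter_at_50 = efficiency(50)
```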

name

The name of the parameter.

Type

str

conditioning_parameter_name

The name of the conditioning parameter.

Type

str

__eq__(other: object) bool[source]

Return True if all attributes are equal.

property blueice_anchors: Any

Return the blueice_anchors of the parameter (nominal condition).

property fit_guess: Optional[float]

Return the initial guess for fitting the parameter (nominal condition).

property fit_limits: Optional[Tuple[float, float]]

Return the fit_limits of the parameter (nominal condition).

property fittable: bool

Return the fittable attribute of the parameter (nominal condition).

property needs_reinit: bool

Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]

Return the nominal value of the parameter (nominal condition).

property parameter_interval_bounds: Optional[Tuple[float, float]]

Return the parameter_interval_bounds of the parameter (nominal condition).

property ptype: Optional[str]

Return the ptype of the parameter (nominal condition).

property relative_uncertainty: Optional[bool]

Return the relative_uncertainty of the parameter (nominal condition).

property uncertainty: Any

Return the uncertainty of the parameter (nominal condition).

value_in_fit_limits(value: float) bool[source]

Return True if the value is within fit_limits under the nominal condition.

class alea.parameters.Parameter(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]

Bases: object

Represents a single parameter with its properties.

name

The name of the parameter.

Type

str

nominal_value

The nominal value of the parameter.

Type

float, optional (default=None)

fittable

Indicates if the parameter is fittable or always fixed.

Type

bool, optional (default=True)

ptype

The ptype of the parameter.

Type

str, optional (default=None)

uncertainty

The uncertainty of the parameter. If a string, it can be evaluated as a numpy or scipy function to define non-gaussian constraints.

Type

float or str, optional (default=None)

relative_uncertainty

Indicates if the uncertainty is relative to the nominal_value.

Type

bool, optional (default=None)

blueice_anchors

Anchors for blueice template morphing. Blueice will load the template for the provided values and then interpolate for any value in between.

Type

list, optional (default=None)

fit_limits

The limits for fitting the parameter.

Type

Tuple[float, float], optional (default=None)

parameter_interval_bounds

Limits for computing confidence intervals.

Type

Tuple[float, float], optional (default=None)

fit_guess

The initial guess for fitting the parameter.

Type

float, optional (default=None)

description

A description of the parameter.

Type

str, optional (default=None)

__eq__(other: object) bool[source]

Return True if all attributes are equal.

__init__(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]

Initialise a parameter.

_check_parameter_consistency()[source]

Check if parameter is consistent.

_check_parameter_interval_bounds(value)[source]

Check if parameter_interval_bounds is within fit_limits and is not None.

property blueice_anchors: Any

Return the blueice_anchors of the parameter.

If the blueice_anchors is a string, it will be evaluated as a numpy or scipy function.

property fit_guess: Optional[float]

Return the initial guess for fitting the parameter.

property needs_reinit: bool

Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]

Return the nominal value of the parameter.

property parameter_interval_bounds: Optional[Tuple[float, float]]

property uncertainty: Any

Return the uncertainty of the parameter.

If the uncertainty is a string, it will be evaluated as a numpy or scipy function.

value_in_fit_limits(value: float) bool[source]

Returns True if value is within fit_limits.

class alea.parameters.Parameters[source]

Bases: object

Represents a collection of parameters.

names

A list of parameter names.

Type

List[str]

fit_guesses

A dictionary of fit guesses.

Type

Dict[str, float]

fit_limits

A dictionary of fit limits.

Type

Dict[str, float]

fittable

A list of parameter names which are fittable.

Type

List[str]

not_fittable

A list of parameter names which are not fittable.

Type

List[str]

uncertainties

A dictionary of parameter uncertainties.

Type

Dict[str, float or Any]

with_uncertainty

A Parameters object with parameters with a not-NaN uncertainty.

Type

Parameters

nominal_values

A dictionary of parameter nominal values.

Type

Dict[str, float]

parameters

A dictionary to store the parameters, with parameter name as key.

Type

Dict[str, Parameter]

__call__(return_fittable: Optional[bool] = False, **kwargs: Optional[Dict]) Dict[str, float][source]

Return a dictionary of parameter values, optionally filtered to return only fittable parameters.

Parameters

return_fittable (bool, optional (default=False)) – Indicates if only fittable parameters should be returned.

Keyword Arguments

kwargs (dict) – Additional keyword arguments to override parameter values.

Raises

ValueError – If a parameter name is not found.

Returns

A dictionary of parameter values.

Return type

dict
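The call behaviour described above (defaults, keyword overrides, optional filtering, and the error on unknown names) can be sketched with a stand-in collection. SketchParameters and its add method are hypothetical and much simpler than alea's Parameters; the values returned here are assumed to default to the nominal values.

```python
class SketchParameters:
    """Hypothetical stand-in for a parameter collection."""

    def __init__(self):
        self.parameters = {}  # name -> (nominal_value, fittable)

    def add(self, name, nominal_value, fittable=True):
        if name in self.parameters:
            raise ValueError(f"parameter {name} already exists")
        self.parameters[name] = (nominal_value, fittable)

    def __call__(self, return_fittable=False, **kwargs):
        unknown = [name for name in kwargs if name not in self.parameters]
        if unknown:
            raise ValueError(f"parameters not found: {unknown}")
        # Overrides take precedence over nominal values.
        return {
            name: kwargs.get(name, nominal)
            for name, (nominal, fittable) in self.parameters.items()
            if fittable or not return_fittable
        }


params = SketchParameters()
params.add("mu", 1.0)
params.add("sigma", 2.0, fittable=False)
all_values = params(mu=1.5)
fittable_values = params(return_fittable=True)
```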

__eq__(other: object) bool[source]

Return True if all parameters are equal.

__getattr__(name: str) Parameter[source]

Retrieves a Parameter object by attribute access.

Parameters

name (str) – The name of the parameter.

Raises

AttributeError – If the attribute is not found.

Returns

The retrieved Parameter object.

Return type

Parameter

__getitem__(name: str) Parameter[source]

Retrieves a Parameter object by dictionary access.

Parameters

name (str) – The name of the parameter.

Raises

KeyError – If the key is not found.

Returns

The retrieved Parameter object.

Return type

Parameter

__init__()[source]

Initialise a collection of parameters.

__iter__() Iterator[Parameter][source]

Return an iterator over the parameters.

Each iteration returns a Parameter object.

__str__() str[source]

Return an overview table of all parameters.

add_parameter(parameter: Union[Parameter, ConditionalParameter]) None[source]

Adds a Parameter object to the Parameters collection.

Parameters

parameter (Parameter) – The Parameter object to add.

Raises

ValueError – If the parameter name already exists.

property fit_guesses: Dict[str, float]

A dictionary of fit guesses.

property fit_limits: Dict[str, float]

A dictionary of fit limits.

property fittable: List[str]

A list of parameter names which are fittable.

classmethod from_config(config: Dict[str, dict])[source]

Creates a Parameters object from a configuration dictionary.

Parameters

config (dict) – A dictionary of parameter configurations.

Returns

The created Parameters object.

Return type

Parameters

classmethod from_list(names: List[str])[source]

Creates a Parameters object from a list of parameter names. Everything else is set to default values.

Parameters

names (List[str]) – List of parameter names.

Returns

The created Parameters object.

Return type

Parameters

property names: List[str]

A list of parameter names.

property nominal_values: dict

A dict of nominal values for all parameters with a nominal value.

property not_fittable: List[str]

A list of parameter names which are not fittable.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments

fit_guesses (dict) – A dict of parameter names and values.

set_nominal_values(**nominal_values)[source]

Set the nominal values for parameters.

Keyword Arguments

nominal_values (dict) – A dict of parameter names and values.

property uncertainties: dict

A dict of uncertainties for all parameters with a not-NaN uncertainty.

Caution: this is not the same as the parameter.uncertainty property.

values_in_fit_limits(**kwargs: Dict) bool[source]

Return True if all values are within the fit limits.

Keyword Arguments

kwargs (dict) – The parameter values to check.

Returns

True if all values are within the fit limits.

Return type

bool

property with_uncertainty: Parameters

Return parameters with a not-NaN uncertainty.

The parameters are the same objects as in the original Parameters object, not a copy. For conditional parameters, the parameters under the nominal condition are returned.

alea.runner module

class alea.runner.Runner(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]

Bases: object

Runner manipulates statistical model and toydata.

  • initialize the statistical model

  • generate or read toy data

  • save toy data if needed

  • fit fittable parameters

  • write the output file

One toyfile can contain multiple toydata, but all of them are from the same generate_values.

model

statistical model instance

Type

StatisticalModel

poi

parameter of interest

Type

str

hypotheses

list of hypotheses

Type

list

common_hypothesis

common hypothesis, the values are copied to each hypothesis

Type

dict

generate_values

generate values for toydata

Type

dict

nominal_values

nominal values of parameters

Type

dict

_compute_confidence_interval

whether to compute the confidence interval

Type

bool

_n_mc

number of Monte Carlo simulations

Type

int

_toydata_filename

toydata filename

Type

str

_toydata_mode

toydata mode, ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’

Type

str

_metadata

metadata, if None, it is set to {}

Type

dict

_output_filename

output filename

Type

str

_result_names

list of result names

Type

list

_result_dtype

list of result dtypes

Type

list

_hypotheses_values

list of values for hypotheses

Type

list

Parameters
  • statistical_model (str) – statistical model class name

  • poi (str) – parameter of interest

  • hypotheses (list) – list of hypotheses

  • n_mc (int) – number of Monte Carlo simulations

  • common_hypothesis (dict, optional (default=None)) – common hypothesis, the values are copied to each hypothesis

  • generate_values (Dict[str, float], optional (default=None)) – generate values of toydata. If None, toydata depend on statistical model.

  • nominal_values (dict, optional (default=None)) – nominal values of parameters. If None, nothing will be assigned to model.

  • statistical_model_config (str, optional (default=None)) – statistical model configuration filename

  • parameter_definition (dict or list, optional (default=None)) – parameter definition

  • statistical_model_args (dict, optional (default={})) – arguments for statistical model

  • likelihood_config (dict, optional (default=None)) – likelihood configuration

  • compute_confidence_interval (bool, optional (default=False)) – whether to compute the confidence interval

  • confidence_level (float, optional (default=0.9)) – confidence level

  • confidence_interval_kind (str, optional (default='central')) – kind of confidence interval, choice from ‘central’, ‘upper’ or ‘lower’

  • confidence_interval_root_find (str, optional (default='brentq')) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”

  • fit_strategy (dict, optional (default=None)) – fit strategy dictionary. If None, the default fit strategy of the model will be used.

  • toydata_mode (str, optional (default='generate_and_store')) – toydata mode, choice from ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’

  • toydata_filename (str, optional (default='test_toydata_filename.ii.h5')) – toydata filename

  • only_toydata (bool, optional (default=False)) – whether to only generate toydata

  • output_filename (str, optional (default='test_output_filename.ii.h5')) – output filename

  • seed (int, optional (default=None)) – random seed for runners before generating toydata

  • metadata (dict, optional (default=None)) – metadata to be saved in output file

__init__(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]

Initialize statistical model, parameters list, and generate values list.

_get_hypotheses()[source]

Get generate values list from hypotheses.

Caution

When the free hypothesis is provided, it should be the first hypothesis. The free hypothesis means that all parameters are free in the fit; it does not use common_hypothesis!
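The handling of the free hypothesis can be sketched as below. The function expand_hypotheses is hypothetical, and the sketch assumes that every other hypothesis is a dict of fixed parameter values and that hypothesis-specific values take precedence over common_hypothesis when both set the same name; the source does not state the precedence.

```python
def expand_hypotheses(hypotheses, common_hypothesis=None):
    """Return, for each hypothesis, the parameter values fixed in the fit."""
    common = dict(common_hypothesis or {})
    if "free" in hypotheses and hypotheses[0] != "free":
        raise ValueError("the free hypothesis must come first")
    fixed_values = []
    for hypothesis in hypotheses:
        if hypothesis == "free":
            fixed_values.append({})  # nothing fixed, common values ignored
        else:
            # Precedence on conflicting names is a choice of this sketch.
            fixed_values.append({**common, **hypothesis})
    return fixed_values


fixed = expand_hypotheses(["free", {"mu": 0.0}], common_hypothesis={"sigma": 1.0})
```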

_get_parameter_list()[source]

Get parameter list and result list from statistical model.

property common_hypothesis: Dict[str, float]

data_generator()[source]

Generate, save or read toydata.

property generate_values: Dict[str, float]

property hypotheses: list

pre_process_poi(value, attribute_name)[source]

Pre-process poi_expectation for some attributes of the runner.

read_toydata()[source]

Read toydata from file.

run()[source]

Run toy simulation.

If only_toydata is True, only generate toydata.

static runner_arguments()[source]

Get runner arguments and annotations.

simulate()[source]

Only generate toydata.

simulate_and_fit()[source]

For each Monte Carlo simulation:
  • run the toy simulation with the specified toydata mode and generate values.

  • loop over the hypotheses.

Todo

Implement per-hypothesis switching on whether to compute confidence intervals

store_toydata(toydata, toydata_names)[source]

Write toydata to file.

If toydata is a list of dict, convert it to a list of list.

static update_poi(model, poi: str, generate_values: Dict[str, float], nominal_values: Dict[str, float] = {})[source]

Update the poi according to poi_expectation. First, check whether poi_expectation is provided; if not, do nothing. Second, check whether poi is also provided; if so, raise an error. Third, check whether poi ends with _rate_multiplier; if not, raise an error. Finally, update poi to the value that corresponds to poi_expectation, using the get_expectation_values method of the model under the specified nominal_values.

Parameters
  • poi (str) – parameter of interest

  • generate_values (dict) – generate values of toydata, it can contain “poi_expectation”

  • nominal_values (dict) – nominal values of parameters

Caution

The expectation is evaluated under nominal_values in each batch.
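The four checks can be sketched as follows. This is not alea's code: update_poi_sketch and toy_expectation_values are hypothetical, the sketch returns a new dict rather than modifying its input, and it assumes that the expected counts scale linearly with the rate multiplier, so that the multiplier is poi_expectation divided by the expectation at a multiplier of 1.

```python
def update_poi_sketch(expectation_values, poi, generate_values, nominal_values=None):
    """expectation_values: callable(**parameters) -> dict of source -> counts."""
    if "poi_expectation" not in generate_values:
        return dict(generate_values)  # step 1: nothing to do
    if poi in generate_values:
        raise ValueError("give either poi or poi_expectation, not both")
    suffix = "_rate_multiplier"
    if not poi.endswith(suffix):
        raise ValueError("poi must end with _rate_multiplier")
    source = poi[: -len(suffix)]
    # Expected counts of the source at a rate multiplier of 1 (assumed linear).
    unit = expectation_values(**{**(nominal_values or {}), poi: 1.0})[source]
    updated = {k: v for k, v in generate_values.items() if k != "poi_expectation"}
    updated[poi] = generate_values["poi_expectation"] / unit
    return updated


def toy_expectation_values(signal_rate_multiplier=1.0, background_rate_multiplier=1.0):
    # Hypothetical model: 40 signal and 100 background events at multiplier 1.
    return {
        "signal": 40.0 * signal_rate_multiplier,
        "background": 100.0 * background_rate_multiplier,
    }


updated = update_poi_sketch(
    toy_expectation_values, "signal_rate_multiplier", {"poi_expectation": 20.0}
)
```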

write_output(results)[source]

Write output file with metadata.

alea.simulators module

class alea.simulators.BlueiceDataGenerator(ll_term)[source]

Bases: object

A class for generating data from a blueice likelihood term.

ll

The blueice likelihood term.

binned

True if the likelihood term is binned.

Type

bool

bincs

The bin centers of the likelihood term.

Type

list

direction_names

The names of the directions of the likelihood term.

Type

list

source_histograms

The histograms of the sources of the likelihood term.

Type

list

data_lengths

The number of bins of each component of the likelihood term.

Type

list

dtype

The data type of the likelihood term.

Type

list

last_kwargs

The last kwargs used to generate data.

Type

dict

mus

The expected number of events of each source of the likelihood term.

parameters

The parameters of the likelihood term.

Type

list

ll_term

A blueice likelihood term.

Type

BinnedLogLikelihood or UnbinnedLogLikelihood

__init__(ll_term)[source]

Initialize the BlueiceDataGenerator.

compute_pdfs_and_mus(filter_kwargs=True, **kwargs) None[source]

Compute PDFs of the sources for the given parameters.

Parameters
  • filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.

  • kwargs – The parameters passed to the likelihood function.

simulate(filter_kwargs=True, n_toys=None, sample_n_toys=False, **kwargs)[source]

Simulate toys for each source.

Parameters
  • filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.

  • n_toys (int, optional (default=None)) – If not None, a fixed number n_toys of toys is generated for each source component. Defaults to None.

  • sample_n_toys (bool, optional (default=False)) – If True, the number of toys is sampled from a Poisson distribution with mean n_toys. Defaults to False. Only works if n_toys is not None.

Keyword Arguments

kwargs – The parameters passed to the likelihood function.

Returns

Array of simulated data for all sources in the given analysis space. The index “source” indicates the corresponding source of an entry. The dtype follows self.dtype.

Return type

numpy.array
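The interplay of n_toys and sample_n_toys can be sketched for a single source. The helpers poisson_sample and number_of_toys are hypothetical, and the sketch assumes that, when n_toys is None, the number of events fluctuates as a Poisson variable around the source's expected number of events; the source only describes the behaviour when n_toys is given.

```python
import math
import random


def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def number_of_toys(expected_events, rng, n_toys=None, sample_n_toys=False):
    """Number of events to draw for one source."""
    if n_toys is None:
        # Assumption: fluctuate around the expected number of events.
        return poisson_sample(expected_events, rng)
    if sample_n_toys:
        return poisson_sample(n_toys, rng)  # Poisson with mean n_toys
    return n_toys  # fixed number of events


rng = random.Random(1234)
fixed = number_of_toys(12.5, rng, n_toys=7)
fluctuating = number_of_toys(12.5, rng, n_toys=7, sample_n_toys=True)
```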

alea.submitter module

class alea.submitter.Submitter(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]

Bases: object

Submitter base class that generates the submission script from the configuration. It is initialized from the configuration file, which should contain the arguments of the Submitter's __init__ method.

statistical_model

the name of the statistical model

Type

str

statistical_model_config

the configuration file of the statistical model

Type

str

poi

the parameter of interest

Type

str

computation_dict

the dictionary of the computation, with keys to_zip, to_vary and in_common

Type

dict

debug

whether to run in debug mode. If True, only one job will be submitted or one runner will be returned, and its script will be printed.

Type

bool

resubmit

whether to resubmit the jobs that have not finished. If True, all jobs are submitted, even if the output file exists.

Type

bool

Parameters
  • statistical_model (str) – the name of the statistical model

  • statistical_model_config (str) – the configuration file of the statistical model

  • poi (str) – the parameter of interest

  • computation_options (dict) – the configuration of the computation

  • computation (str, optional (default='discovery_power')) – the name of the computation, it should be a key of computation_options

  • outputfolder (str, optional (default=None)) – the output folder

  • debug (bool, optional (default=False)) – whether to run in debug mode

  • loglevel (str, optional (default='INFO')) – the log level

Keyword Arguments

kwargs – the arguments of __init__ method of the Submitter, containing configurations of clusters

Caution

All template sources should come from the same folder. All output, including toydata and fitting results, should go into the same folder.

__init__(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]

Initializes the submitter.

all_runner_kwargs()[source]

Parse all the runner arguments from the submission script.

allowed_special_args: List[str] = []
already_done(i_args: dict) bool[source]

Check if the job is already done, considering the modes of toydata and output.

static arg_to_str(value, annotation) str[source]

Convert the argument to string for the submission script.

Parameters
  • value – the value of the argument, which can be of various types

  • annotation – the annotation of the argument

Returns

the string of the argument

Return type

str

Caution

Currently we only support str, int, float, bool, dict and list. The float will be rounded to 4 digits after the decimal point.

static check_redunant_arguments(runner_args, allowed_special_args: List[str] = [])[source]
combine_n_jobs: int = 1
combined_tickets_generator()[source]

Get the combined submission script for the current configuration. self.combine_n_jobs jobs will be combined into one submission script.

Yields

(str, str) – the combined submission script and name output_filename

Note

Users can add combine_n_jobs: 10 in local_configurations, slurm_configurations or htcondor_configurations to combine 10 jobs into one submission script. This feature is useful when the number of jobs pending for submission is too large.

computation_tickets_generator()[source]

Get the submission script for the current configuration. It generates the submission script for each combination of the computation options.

For Runner from to_zip, to_vary and in_common:
  • First, generate the combined computational options directly.

  • Second, update the input and output folder of the options.

  • Third, collect the non-fittable (settable) parameters into nominal_values.

  • Then, collect the fittable parameters into generate_values.

  • Finally, it generates the submission script for each combination.

Yields

(str, str) – the submission script and name output_filename

config_file_path: str
filename_kwargs(runner_args: dict) dict[source]

Get the filename_kwargs from runner_args.

Parameters

runner_args (dict) – the arguments of Runner

Returns

the keyword arguments for the filename

Return type

dict

first_i_batch: int = 0
classmethod from_config(config_file_path: str, **kwargs) Submitter[source]

Initializes the submitter from a yaml config file.

Parameters

config_file_path (str) – Path to the yaml config file.

Returns

The submitter.

Return type

Submitter

logging = <Logger submitter_logger (INFO)>
merged_arguments_generator()[source]

Generate the merged arguments for Runner from to_zip, to_vary and in_common.

property outputfolder: Optional[str]
static runner_kwargs_from_script(sys_argv: Optional[List[str]] = None)[source]

Parse kwargs of a Runner from a list of argument strings (script).

Parameters

sys_argv (list, optional (default=None)) – list of argument strings, with the format of ['--arg1', 'value1', '--arg2', 'value2', …]. The arguments must be the same as the arguments of Runner.__init__.
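
The flag/value layout described above can be illustrated with a minimal sketch. The helper name parse_flag_pairs is hypothetical and is not part of alea; the real method additionally converts each value according to the annotations of Runner.__init__, which this sketch does not attempt.

```python
from typing import Dict, List


def parse_flag_pairs(argv: List[str]) -> Dict[str, str]:
    """Pair up ['--name', 'value', ...] into {'name': 'value', ...}.

    Hypothetical helper for illustration only; values stay as strings.
    """
    if len(argv) % 2 != 0:
        raise ValueError("expected an even number of tokens")
    kwargs = {}
    for flag, value in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"expected a flag starting with '--', got {flag!r}")
        kwargs[flag[2:]] = value  # strip the leading '--'
    return kwargs


parsed = parse_flag_pairs(["--poi", "wimp_rate_multiplier", "--n_mc", "10"])
print(parsed)
```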

static script_from_runner_kwargs(annotations, kwargs) str[source]

Generate the submission script from the runner arguments.

static str_to_arg(value: str, annotation)[source]

Convert the string to argument for the submission script.

Parameters
  • value – the string of the argument

  • annotation – the annotation of the argument

Returns

the value of the argument, which can be of various types

submit(*arg, **kwargs)[source]

Submit the jobs to the destinations.

template_path: str
static update_limit_threshold(runner_args, outputfolder: str)[source]
static update_n_batch(runner_args)[source]

Update n_mc if n_batch is provided.

Distribute n_mc into n_batch, so that each batch will run n_mc/n_batch times.

static update_output_toydata(runner_args, outputfolder: str)[source]
static update_runner_args(runner_args: Dict[str, Dict[str, Any]], parameters_fittable: List[str], parameters_not_fittable: List[str])[source]

Update the runner arguments’ generate_values and nominal_values. If the argument is fittable, it will be added to generate_values, otherwise it will be added to nominal_values.

Parameters

runner_args (dict) – the arguments of Runner

static update_statistical_model_args(runner_args: Dict[str, Dict[str, Any]], template_path: Optional[str] = None)[source]

Update template_path in the statistical model arguments.

Parameters

runner_args (dict) – the arguments of Runner

alea.template_source module

class alea.template_source.CombinedSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Source that is a weighted sum of histograms. Useful e.g. for safeguard. The first histogram is the base histogram, and the rest are added to it with weights. The weights can be set as shape parameters in the config.

Parameters
  • weights – Weights of the 2nd to the last histograms.

  • histnames – List of filenames containing the histograms.

  • templatenames – List of names of histograms within the hdf5 files.
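
As a rough sketch of the weighted sum described above, using plain lists as stand-ins for histograms: the function name and the example numbers are hypothetical, and any normalisation that the real class may apply afterwards is not shown.

```python
from typing import List, Sequence


def weighted_sum_of_histograms(
    histograms: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Add histograms[1:] to histograms[0], scaled by the matching weights."""
    if len(weights) != len(histograms) - 1:
        raise ValueError("need one weight per histogram after the first")
    combined = list(histograms[0])  # base histogram enters unweighted
    for weight, histogram in zip(weights, histograms[1:]):
        if len(histogram) != len(combined):
            raise ValueError("all histograms must share the same binning")
        for i, content in enumerate(histogram):
            combined[i] += weight * content
    return combined


base = [4.0, 2.0, 1.0]        # hypothetical bin contents
safeguard = [1.0, 1.0, 0.0]   # hypothetical bin contents
combined = weighted_sum_of_histograms([base, safeguard], weights=[0.5])
print(combined)
```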

build_histogram()[source]

Build the histogram of the source.

class alea.template_source.SpectrumTemplateSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Template source reweighted by a 1D spectrum. The first axis of the template is assumed to be the one that is reweighted.

Parameters

spectrum_name – Name of bbf json-like spectrum file

static _get_json_spectrum(filename: str)[source]

Translates bbf-style JSON files to spectra.

Parameters

filename (str) – Name of the JSON file.

Todo

Define the format of the JSON file clearly.

build_histogram()[source]

Build the histogram of the source.

class alea.template_source.TemplateSource(config: Dict, *args, **kwargs)[source]

Bases: HistogramPdfSource

A source defined with a template histogram. The parameters are set in self.config. “templatename”, “histname”, “analysis_space” must be in self.config.

config

The configuration of the source.

Type

dict

dtype

The data type of the source.

Type

list

_bin_volumes

The bin volumes of the source.

Type

numpy.ndarray

_n_events_histogram

The histogram of the number of events of the source.

Type

multihist.MultiHistBase

events_per_day

The number of events per day of the source.

Type

float

_pdf_histogram

The histogram of the probability density function of the source.

Type

multihist.MultiHistBase

Parameters
  • config (dict) – The configuration of the source.

  • templatename – Hdf5 file to open.

  • histname – Histogram name.

  • named_parameters (list) – List of config setting names to pass to .format on histname and filename.

  • normalise_template (bool) – Normalise the template histogram.

  • in_events_per_bin (bool) – If True, histogram is in events per day / bin. If False or absent, histogram is already pdf.

  • histogram_scale_factor (float) – Multiply histogram by this number

  • convert_to_uniform (bool) – Convert the histogram to a uniform per bin distribution.

  • log10_bins (list) – List of axis numbers. Bin edges on the listed axes in the hdf5 file are log10() of the actual bin edges.

__init__(config: Dict, *args, **kwargs)[source]

Initialize the TemplateSource.

_check_binning(h, histogram_info: str)[source]

Check if the histogram's bin edges are the same as those of analysis_space.

Parameters
  • h (multihist.MultiHistBase) – The histogram to check.

  • histogram_info (str) – Information of the histogram.

_compute_multiple_file_hashes(templatenames: List[str], format_named_parameters: Dict) str[source]

Compute a deterministic hash for multiple template files.

_compute_single_file_hash(templatename: str, format_named_parameters: Dict) str[source]

Compute the hash for a single template file.

apply_slice_args(h, slice_args: Optional[Union[List[Dict], Dict]] = None)[source]

Apply slice arguments to the histogram.

Parameters
  • h (multihist.MultiHistBase) – The histogram to apply the slice arguments to.

  • slice_args (dict) – The slice arguments to apply. The sum_axis, slice_axis, and slice_axis_limits are supported.

build_histogram()[source]

Build the histogram of the source.

property format_named_parameters

Get the named parameters in the config to dictionary format.

set_dtype()[source]

Set the data type of the source.

set_pdf_histogram(h)[source]

Set the histogram of the probability density function of the source.

simulate(n_events: int)[source]

Simulate events from the source.

Parameters

n_events (int) – The number of events to simulate.

Returns

The simulated events.

Return type

numpy.ndarray

alea.utils module

exception alea.utils.CannotUpdate[source]

Bases: Exception

class alea.utils.IndexMorpher(config, shape_parameters)[source]

Bases: Morpher

IndexMorpher is a morpher which applies no interpolation.

get_anchor_points(bounds, n_models=None)[source]

Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points.

Parameters
  • f – Function which takes a Model as argument, and produces an extra_dims shaped array.

  • extra_dims – tuple of integers, shape of return value of f.

  • anchor_models – dictionary {z-score: Model} of anchor models at which to evaluate f.

class alea.utils.LockableSet(*args)[source]

Bases: set

A set whose update method can be locked.

basenames()[source]

The basenames of the filenames in the set.

lock()[source]

Lock the set to prevent modifications.

uniqueness()[source]

Check if the basenames contain unique elements.

unlock()[source]

Unlock the set to allow modifications.

update(*args)[source]

Update the set with elements if it is not locked.
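
The locking behaviour can be sketched as follows. This is an illustrative stand-in, not the alea class: it assumes that update is silently skipped while the set is locked, and it omits basenames and uniqueness.

```python
class LockableSetSketch(set):
    """A set whose update method does nothing while locked (assumed behaviour)."""

    def __init__(self, *args):
        super().__init__(*args)
        self._locked = False

    def lock(self):
        self._locked = True

    def unlock(self):
        self._locked = False

    def update(self, *args):
        # Only forward to set.update when the set is not locked.
        if not self._locked:
            super().update(*args)


files = LockableSetSketch(["a.h5"])
files.lock()
files.update(["b.h5"])           # ignored: the set is locked
locked_contents = sorted(files)
files.unlock()
files.update(["b.h5"])           # applied: the set is unlocked again
unlocked_contents = sorted(files)
print(locked_contents, unlocked_contents)
```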

class alea.utils.ReadOnlyDict(data)[source]

Bases: object

A read-only dict.

get(key, default=None)[source]
items()[source]
keys()[source]
values()[source]
alea.utils._get_internal(file_name)[source]

Get the abspath of the file.

Raises FileNotFoundError if the file is not found in any subfolder.

alea.utils._package_path(sub_directory)[source]

Get the abs path of the requested sub folder.

alea.utils._prefix_file_path(config: dict, template_folder_list: list, ignore_keys: List[str] = ['name', 'histname'])[source]

Prefix file path with template_folder_list whenever possible.

Parameters
  • config (dict) – dictionary containing file paths

  • template_folder_list (list) – list of possible base folders. Ordered by priority.

  • ignore_keys (list, optional (default=["name", "histname"])) – keys to be ignored when prefixing

alea.utils.adapt_likelihood_config_for_blueice(likelihood_config: dict, template_folder_list: list) dict[source]

Adapt likelihood config to be compatible with blueice.

Parameters
  • likelihood_config (dict) – likelihood config dict

  • template_folder_list (list) – list of possible base folders. Ordered by priority.

Returns

adapted likelihood config

Return type

dict

alea.utils.add_i_batch(filename: str) str[source]

Add i_batch to filename.

alea.utils.asymptotic_critical_value(confidence_interval_kind: str, confidence_level: float, degree_of_freedom: Optional[int] = None)[source]

Return the critical value for the confidence interval.

Parameters
  • confidence_interval_kind (str) – confidence interval kind, either ‘lower’, ‘upper’ or ‘central’

  • confidence_level (float) – confidence level

  • degree_of_freedom (int, optional (default=None)) – degree of freedom

Returns

critical value

Return type

float

Raises
  • ValueError – if confidence_interval_kind is not ‘lower’, ‘upper’ or ‘central’

  • ValueError – if degree_of_freedom is not None and not 1, when confidence_interval_kind is ‘lower’ or ‘upper’
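
For orientation, the usual asymptotic (Wilks) thresholds for one degree of freedom can be computed with the standard library alone. This sketch assumes the critical value is a threshold on a chi-square distributed test statistic; whether alea follows exactly this convention is an assumption, and the function name is hypothetical.

```python
from statistics import NormalDist


def critical_value_one_dof(kind: str, confidence_level: float) -> float:
    """Chi-square (1 dof) threshold for a central or one-sided interval.

    Uses the identity chi2_1 = Z**2 for a standard normal Z.
    """
    standard_normal = NormalDist()
    if kind == "central":
        # Two-sided: split the remaining probability over both tails.
        z = standard_normal.inv_cdf(0.5 * (1.0 + confidence_level))
    elif kind in ("lower", "upper"):
        # One-sided: all remaining probability sits in a single tail.
        z = standard_normal.inv_cdf(confidence_level)
    else:
        raise ValueError("kind must be 'lower', 'upper' or 'central'")
    return z * z


central = critical_value_one_dof("central", 0.9)
upper = critical_value_one_dof("upper", 0.9)
print(round(central, 4), round(upper, 4))
```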

alea.utils.can_assign_to_typing(value_type, target_type) bool[source]

Check if value_type can be assigned to target_type. This is useful when converting Runner’s argument into strings.

Parameters
  • value_type – type of the value, might be float, int, etc.

  • target_type – type of the target, might be Optional, Union, etc.

alea.utils.can_expand_grid(variations: dict) bool[source]

Check if variations can be expanded into a grid.

Example

>>> can_expand_grid({'a': [1, 2], 'b': [3, 4]})
True
alea.utils.clip_limits(value) Tuple[float, float][source]

Clip limits to be within [-MAX_FLOAT, MAX_FLOAT] by converting None to -MAX_FLOAT and MAX_FLOAT.
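
A minimal sketch of this conversion is given below. The value of MAX_FLOAT used here (sys.float_info.max) and the treatment of value=None as fully unbounded are assumptions made for illustration; alea defines its own constant.

```python
import sys
from typing import Optional, Sequence, Tuple

# Assumption: stand-in for alea's own MAX_FLOAT constant.
MAX_FLOAT = sys.float_info.max


def clip_limits_sketch(
    value: Optional[Sequence[Optional[float]]],
) -> Tuple[float, float]:
    """Replace missing bounds with -MAX_FLOAT / MAX_FLOAT."""
    if value is None:
        return (-MAX_FLOAT, MAX_FLOAT)
    lower, upper = value
    return (
        -MAX_FLOAT if lower is None else lower,
        MAX_FLOAT if upper is None else upper,
    )


half_open = clip_limits_sketch((0.0, None))
unbounded = clip_limits_sketch(None)
print(half_open, unbounded)
```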

alea.utils.compute_file_hash(file_path: str) str[source]

Compute the SHA-256 hash of a file.
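
A SHA-256 file hash is commonly computed in chunks so that large template files need not be loaded into memory at once. The sketch below shows that pattern with hashlib; the function name and chunk size are illustrative choices, not taken from alea.

```python
import hashlib
import os
import tempfile


def sha256_of_file(file_path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"template contents")
    tmp_path = tmp.name
file_digest = sha256_of_file(tmp_path)
os.remove(tmp_path)
print(file_digest)
```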

alea.utils.compute_variations(to_zip, to_vary, in_common) list[source]

Compute variations of Runner from to_zip, to_vary and in_common. By priority, the order is to_zip, to_vary, in_common. The values in to_zip will overwrite the keys in to_vary and in_common. The values in to_vary will overwrite the keys in in_common.

Parameters
  • to_zip (dict) – variations to be zipped

  • to_vary (dict) – variations to be varied

  • in_common (dict) – variations in common

Returns

a list of dict

Return type

list
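
The priority rule can be sketched as follows, assuming that every zipped combination is paired with every varied combination. All names here are hypothetical, and details of the real function (for example the expansion of hypotheses in in_common) are left out.

```python
import itertools
from typing import Any, Dict, List


def zip_variations(to_zip: Dict[str, List]) -> List[Dict[str, Any]]:
    """Element-wise pairing of the value lists."""
    keys = list(to_zip)
    return [dict(zip(keys, values)) for values in zip(*to_zip.values())]


def product_variations(to_vary: Dict[str, List]) -> List[Dict[str, Any]]:
    """Cartesian product of the value lists."""
    keys = list(to_vary)
    return [dict(zip(keys, values)) for values in itertools.product(*to_vary.values())]


def combine_variations(to_zip, to_vary, in_common) -> List[Dict[str, Any]]:
    merged = []
    for zipped in zip_variations(to_zip) or [{}]:
        for varied in product_variations(to_vary) or [{}]:
            # Later entries win: in_common < to_vary < to_zip.
            merged.append({**in_common, **varied, **zipped})
    return merged


variations = combine_variations(
    to_zip={"a": [1, 2], "b": [3, 4]},
    to_vary={"c": [5, 6]},
    in_common={"c": 0, "d": 9},  # "c" is overwritten by to_vary
)
for variation in variations:
    print(variation)
```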

alea.utils.convert_to_in_common(in_common: Dict[str, Any]) Dict[str, Any][source]

Expand the values in in_common, according to the itertools.product method, if necessary. This usually happens to the hypotheses.

Example

>>> convert_to_in_common({'hypotheses': ['free', {'a': [1, 2], 'b': [3, 4]}]})
{
    "hypotheses": [
        "free",
        {"a": 1, "b": 3},
        {"a": 1, "b": 4},
        {"a": 2, "b": 3},
        {"a": 2, "b": 4},
    ]
}
alea.utils.convert_to_vary(to_vary: Dict[str, List]) List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the itertools.product method.

Example

>>> convert_to_vary({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]
alea.utils.convert_to_zip(to_zip: Dict[str, List]) List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the zip method.

Example

>>> convert_to_zip({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]
alea.utils.convert_variations(variations: dict, iteration) list[source]

Convert variations to a list of dict, according to the iteration method.

Parameters
  • variations (dict) – variations to be converted

  • iteration – iteration method, either zip or itertools.product

Returns

a list of dict

Return type

list

alea.utils.deterministic_hash(thing, length=10)[source]

Return a base32 lowercase string of length determined from hashing a container hierarchy.

Edited from strax: strax/utils.py
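
One way to obtain such a hash is to serialise the container deterministically, hash the bytes, and base32-encode the digest. The sketch below uses JSON with sorted keys and SHA-1; these are assumptions for illustration, and the actual function may serialise the hierarchy differently.

```python
import hashlib
import json
from base64 import b32encode


def deterministic_hash_sketch(thing, length: int = 10) -> str:
    """Lowercase base32 string derived from a JSON-serialisable container."""
    # sort_keys makes the serialisation independent of dict insertion order.
    serialised = json.dumps(thing, sort_keys=True)
    digest = hashlib.sha1(serialised.encode("ascii")).digest()
    return b32encode(digest)[:length].decode("ascii").lower()


h1 = deterministic_hash_sketch({"a": 1, "b": [1, 2]})
h2 = deterministic_hash_sketch({"b": [1, 2], "a": 1})
print(h1, h1 == h2)
```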

alea.utils.dump_json(file_name: str, data: dict)[source]

Dump data to a json file.

alea.utils.dump_yaml(file_name: str, data: dict)[source]

Dump data to a yaml file.

alea.utils.evaluate_numpy_scipy_expression(value: str)[source]

Evaluate numpy(np) and scipy.stats expression.

alea.utils.evaluate_numpy_scipy_expression_in_dict(d: dict)[source]

Evaluate numpy(np) and scipy.stats expression in a dict.

Example

>>> evaluate_numpy_scipy_expression_in_dict({'a': 'np.arange(0, 2, 1)', 'b': [0, 1]})
{'a': [0, 1], 'b': [0, 1]}
alea.utils.expand_grid_dict(variations: List[Union[dict, str]]) List[Union[dict, str]][source]

Expand dict into a list of dict, according to the itertools.product method, if necessary.

Parameters

variations (list) – variations to be expanded

Example

>>> expand_grid_dict(["free", {"a": 1, "b": 3}, {"a": 'np.arange(1, 3)', "b": [3, 4]}])
[
    "free",
    {"a": 1, "b": 3},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 3},
    {"a": 2, "b": 4},
]
alea.utils.extremal_root(f, xL, xR, which='left', step=0.01, step_growth=1.0, max_step=None, xtol=1e-12, rtol=np.float64(8.881784197001252e-16))[source]

Return the left-most or right-most root of f in [xL, xR].

The interval is scanned adaptively to detect a sign change, and the root is refined using scipy.optimize.brentq.

Parameters
  • f (Callable[[float], float]) – Scalar function.

  • xL (float) – Left boundary (must satisfy xR > xL).

  • xR (float) – Right boundary.

  • which (str, optional) – “left” or “right”. Default is “left”.

  • step (float, optional) – Initial scan step (>0).

  • step_growth (float, optional) – Step multiplier (>=1).

  • max_step (float | None, optional) – Maximum scan step.

  • xtol (float, optional) – Absolute tolerance for brentq.

  • rtol (float, optional) – Relative tolerance for brentq.

Returns

Extremal root in the interval.

Return type

float
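
The scan-then-refine idea can be sketched with the standard library only. Note the substitutions: this sketch uses a fixed scan step and plain bisection in place of the adaptive scan and scipy.optimize.brentq, and it only handles the left-most root. The function name is hypothetical.

```python
from typing import Callable


def leftmost_root(
    f: Callable[[float], float],
    x_left: float,
    x_right: float,
    step: float = 0.01,
    tol: float = 1e-12,
) -> float:
    """Scan from x_left for the first sign change, then refine by bisection."""
    a, fa = x_left, f(x_left)
    if fa == 0.0:
        return a
    while a < x_right:
        b = min(a + step, x_right)
        fb = f(b)
        if fb == 0.0:
            return b
        if fa * fb < 0.0:
            # Sign change in [a, b]: bisect until the bracket is narrow enough.
            lo, hi, flo = a, b, fa
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                fmid = f(mid)
                if fmid == 0.0:
                    return mid
                if flo * fmid < 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            return 0.5 * (lo + hi)
        a, fa = b, fb
    raise ValueError("no sign change found in the interval")


# (x - 1)(x - 3) has roots at 1 and 3; the scan should stop at the first one.
root = leftmost_root(lambda x: (x - 1.0) * (x - 3.0), 0.0, 4.0)
print(root)
```

A fixed step can miss a pair of roots that lie closer together than the step, which is presumably why the documented function exposes step, step_growth and max_step.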

alea.utils.formatted_to_asterisked(formatted, wildcards: Optional[Union[str, List[str]]] = None)[source]

Convert a formatted string to an asterisked one. Sometimes a parameter (usually a shape parameter) is not specified in the formatted string; this function replaces the parameter with an asterisk.

Parameters
  • formatted (str) – formatted string

  • wildcards (str or list, optional (default=None)) – wildcards to be replaced with asterisk.

Returns

asterisked string

Return type

str

Examples

>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}")
"a_*_b_*"
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}", wildcards="a")
"a_*_b_{b:d}"
alea.utils.get_analysis_space(analysis_space: list) list[source]

Convert analysis_space to a list of tuples with evaluated values.

alea.utils.get_file_path(fname, folder_list: Optional[List[str]] = None)[source]

Find the full path to the resource file. The following 5 methods are tried in order:

  1. fname begins with ‘/’: return the absolute path

  2. folder begins with ‘/’: return folder + name

  3. the file can be found by _get_internal: return the alea internal file path

  4. the file can be found in locally installed ntauxfiles: return the ntauxfiles absolute path

  5. the file can be downloaded from MongoDB: download it and return the cached path

Parameters
  • fname (str) – file name

  • folder_list (list, optional (default=None)) – list of possible base folders, ordered by priority. The function searches for the file starting from the first folder in the list and returns the first match immediately, without searching the remaining folders.

Returns

full path to the resource file

Return type

str
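
The folder_list priority rule alone can be sketched as below. The helper name is hypothetical, and the other lookup methods (internal files, ntauxfiles, MongoDB) are deliberately left out.

```python
import os
import tempfile
from typing import List, Optional


def first_existing_path(fname: str, folder_list: List[str]) -> Optional[str]:
    """Return the first folder/fname that exists, honouring list order."""
    for folder in folder_list:
        candidate = os.path.join(folder, fname)
        if os.path.exists(candidate):
            return candidate  # stop at the highest-priority match
    return None


# Demonstration: the same file name exists in both folders.
with tempfile.TemporaryDirectory() as high, tempfile.TemporaryDirectory() as low:
    for folder in (high, low):
        with open(os.path.join(folder, "template.h5"), "w") as f:
            f.write("placeholder")
    found = first_existing_path("template.h5", [high, low])
    found_in_high_priority = found == os.path.join(high, "template.h5")
    missing = first_existing_path("absent.h5", [high, low])
print(found_in_high_priority, missing)
```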

alea.utils.get_metadata(output_filename_pattern: str) list[source]

Get metadata from output files.

alea.utils.get_template_folder_list(likelihood_config, extra_template_path: Optional[str] = None)[source]

Get a list of template_folder from likelihood_config.

alea.utils.load_json(file_name: str)[source]

Load data from json file.

alea.utils.load_yaml(file_name: str)[source]

Load data from yaml file.

alea.utils.make_hashable(obj)[source]

Convert a container hierarchy into one that can be hashed.

See http://stackoverflow.com/questions/985294

alea.utils.search_filename_pattern(filename: str) str[source]

Return the pattern for a given existing filename. This is needed because “_{i_batch:d}” is sometimes not appended to the filename. We need to distinguish between the two cases and return the correct pattern.

Returns

existing pattern for filename, either filename or filename w/ inserted “_*”

Return type

str

alea.utils.signal_multiplier_estimator(signal: ndarray, background: ndarray, data: ndarray, iteration=100, diagnostic=False) float[source]

Estimate the best-fit signal multiplier using perturbation theory. The method solves for the critical point of the likelihood function by perturbation theory, where the likelihood function is the binned Poisson likelihood given the signal and background models and the data.

Parameters
  • signal (np.ndarray) – signal model

  • background (np.ndarray) – background model

  • data (np.ndarray) – data array

  • iteration (int, optional (default=100)) – number of iterations

Returns

best-fit signal multiplier

Return type

float
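
To make the critical-point condition concrete: for the binned Poisson log-likelihood sum_i [n_i ln(mu s_i + b_i) - (mu s_i + b_i)], the best-fit mu satisfies sum_i s_i (n_i / (mu s_i + b_i) - 1) = 0. The sketch below solves this equation with a plain Newton iteration, which is a different technique from the perturbative expansion used by the documented function. It assumes strictly positive background in every bin, and the function name is hypothetical.

```python
from typing import Sequence


def signal_multiplier_newton(
    signal: Sequence[float],
    background: Sequence[float],
    data: Sequence[float],
    iterations: int = 50,
) -> float:
    """Solve d(log L)/d(mu) = 0 for the binned Poisson likelihood."""
    mu = 0.0
    for _ in range(iterations):
        score = 0.0      # first derivative of the log-likelihood
        curvature = 0.0  # second derivative of the log-likelihood
        for s, b, n in zip(signal, background, data):
            expected = mu * s + b
            score += s * (n / expected - 1.0)
            curvature -= s * s * n / (expected * expected)
        if curvature == 0.0:
            break  # no data in any signal-sensitive bin; cannot update
        mu -= score / curvature
    return mu


# Two identical bins: mu * 1 + 1 = 3 is solved by mu = 2.
mu_hat = signal_multiplier_newton([1.0, 1.0], [1.0, 1.0], [3.0, 3.0])
print(mu_hat)
```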

alea.utils.within_limits(value, limits)[source]

Returns True if value is within limits.

Module contents