alea package

Subpackages

Submodules

alea.model module

class alea.model.MinuitWrap(f: Callable, parameters: Parameters)[source]

Bases: object

Wrapper for functions to be called by Minuit.

Initialized with a function f and a Parameters instance.

func: function wrapped

s_args

parameter names of the model

Type: list

_parameters

parameters and limits of the model

Type: dict

Parameters

f (Callable) – function to be wrapped
parameters (Parameters) – parameters of the model

__init__(f: Callable, parameters: Parameters)[source]: Initialize the wrapper.

class alea.model.StatisticalModel(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]

Bases: object

Base class for defining a statistical model with a likelihood and data generation method.

The statistical model contains two parts that you must define yourself:
- a likelihood function
  ll(self, parameter_1, parameter_2… parameter_n): A function of a set of named parameters which returns a float expressing the log-likelihood for observed data given these parameters.
- a data generation function
  generate_data(self, parameter_1, parameter_2… parameter_n): A function of the same set of named parameters which returns a full data set.
Methods that you must implement:
- _ll
- _generate_data
Methods that you may implement:
- get_expectation_values
Methods that already exist here:
- ll
- store_data
- fit
- get_parameter_list
- confidence_interval

The public methods generate_data and ll, as their names suggest, depend on the private methods _generate_data and _ll respectively.
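A minimal sketch of the pattern described above, not alea's actual implementation: a hypothetical SketchModel base class exposes public ll and generate_data, which delegate to the private _ll and _generate_data supplied by a subclass. The Gaussian subclass, the parameter names mu and sigma, and the defaults dict are assumptions made for illustration.

```python
import math
import random


class SketchModel:
    """Hypothetical stand-in for a base class; not alea.model.StatisticalModel."""

    defaults = {}

    def __init__(self, data=None):
        self.data = data

    def _ll(self, **kwargs):
        raise NotImplementedError("subclasses must implement _ll")

    def _generate_data(self, **kwargs):
        raise NotImplementedError("subclasses must implement _generate_data")

    def ll(self, **kwargs):
        # Public entry point: fill in defaults, then delegate to the private method.
        return self._ll(**{**self.defaults, **kwargs})

    def generate_data(self, **kwargs):
        return self._generate_data(**{**self.defaults, **kwargs})


class SketchGaussian(SketchModel):
    defaults = {"mu": 0.0, "sigma": 1.0}

    def _ll(self, mu, sigma):
        # Gaussian log-likelihood summed over all observations
        return sum(
            -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
            for x in self.data
        )

    def _generate_data(self, mu, sigma, n=10):
        return [random.gauss(mu, sigma) for _ in range(n)]


random.seed(0)
model = SketchGaussian()
model.data = model.generate_data(mu=1.0)
print(model.ll(mu=1.0) > model.ll(mu=50.0))
```

Keeping the keyword handling in the public methods means a subclass only has to write the physics of _ll and _generate_data.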

data: data of the model

_data: data of the model

_confidence_level: confidence level for confidence intervals

_confidence_interval_kind: kind of confidence interval to compute

parameters: parameters of the model

confidence_interval_threshold: threshold for confidence interval

is_data_set

True if data is set

Type: bool

Parameters

parameter_definition (dict or list, optional (default=None)) – definition of the parameters of the model
confidence_level (float, optional (default=0.9)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default="central")) – kind of confidence interval to compute
confidence_interval_threshold (Callable[[float], float], optional (default=None)) – threshold for confidence interval
confidence_interval_root_find (str, optional (default="brentq")) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
data (dict or list, optional (default=None)) – pre-set data of the model
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings

Raises

RuntimeError – if you try to instantiate the StatisticalModel class directly
NotImplementedError – if you do not implement the likelihood function or the data generation

__init__(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]: Initialize a statistical model.

_check_ll_and_generate_data_signature()[source]: Check that the likelihood and generate_data functions have the same signature.

_confidence_interval_checks(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, asymptotic_dof: Optional[int] = None, **kwargs) → Tuple[str, Callable[[float], float], str, Tuple[float, float]][source]

Helper function for confidence_interval that does the input checks and returns the bounds.

Parameters

poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges
confidence_level (float, optional (default=None)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”

Returns

confidence interval kind, confidence interval threshold, confidence interval root-finding algorithm, parameter interval bounds

Return type

Tuple[str, Callable[[float], float], str, Tuple[float, float]]

_define_parameters(parameter_definition, nominal_values=None)[source]: Initialize the parameters of the model.

_generate_data(**kwargs)[source]: Generate data for the given parameters.

_ll(**kwargs) → float[source]: Likelihood function, return the log-likelihood for the given parameters.

confidence_interval(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, confidence_interval_args: Optional[dict] = None, best_fit_args: Optional[dict] = None, asymptotic_dof: Optional[int] = None, fit_strategy: Optional[dict] = None) → Tuple[float, float][source]

Compute confidence intervals for the parameter of interest (POI).

Finds the intersection between the profile log-likelihood curve and the critical value curve to determine the confidence interval edges. If the parameter is a rate parameter and the model has expectation values implemented, the bounds will be interpreted as bounds on the expectation value, so that the range in the fit is parameter_interval_bounds/mus. Otherwise the bound is taken as-is.

Parameters

poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) –
range in which to search for the confidence interval edges. May be specified as:
- setting the property “parameter_interval_bounds” for the parameter
- passing a list here
- passing None here, the property of the parameter is used
confidence_level (float, optional (default=None)) – confidence level for confidence intervals. If None, the default confidence level of the model is used.
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute. If None, the default kind of the model is used.
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
confidence_interval_args (dict, optional (default=None)) – Parameters that will be fixed in the profile likelihood computation. If None, all fittable parameters will be profiled except the poi.
best_fit_args (dict, optional (default=None)) – Set this if the “global” best fit used to normalise the profile likelihood ratio should fix fewer parameters than the profile likelihood. Mainly used for 1-D slices of higher-dimensional confidence volumes, where the global best fit may not be along the profile. If None, will be set to confidence_interval_args.
asymptotic_dof (int, optional (default=None)) – Degrees of freedom for asymptotic
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings.
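As an illustration of the intersection described above, the following sketch computes a central interval for the mean of a toy Gaussian model with known width. It does not call alea: the data, the analytic best fit, the asymptotic threshold for one degree of freedom, and the plain bisection (used here in place of a root finder such as brentq) are all assumptions chosen to keep the example self-contained.

```python
from statistics import NormalDist, fmean

data = [0.8, 1.1, 1.3, 0.9, 1.4]   # hypothetical observations
sigma = 1.0                         # assumed known width


def ll(mu):
    # Log-likelihood up to a constant that cancels in the ratio
    return sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)


mu_hat = fmean(data)                # analytic best fit for this toy model
ll_max = ll(mu_hat)


def critical_value(confidence_level):
    # Asymptotic threshold for one degree of freedom (central interval)
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return z ** 2


def crossing(lo, hi, confidence_level, tol=1e-10):
    """Bisect for the point where 2 * (ll_max - ll(mu)) meets the threshold."""

    def g(mu):
        return 2 * (ll_max - ll(mu)) - critical_value(confidence_level)

    assert g(lo) * g(hi) < 0, "bounds must bracket the crossing"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


bounds = (-5.0, 5.0)                # plays the role of parameter_interval_bounds
interval = (crossing(bounds[0], mu_hat, 0.9), crossing(mu_hat, bounds[1], 0.9))
print(interval)
```

The search bounds matter: if they do not bracket a crossing on one side of the best fit, no interval edge can be found on that side.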

property data

Return the dataset, overridable for special needs.

Datasets are expected to be in the form of a list of one or more structured arrays, representing the datasets of one or more likelihood terms.

fit(verbose: Optional[bool] = False, fit_strategy: Optional[dict] = None, **kwargs) → Tuple[dict, float][source]

Fit the model to the data by maximizing the likelihood.

Returns a dict of best-fit parameter values and the maximum log-likelihood value. While the optimization is a minimization internally, the likelihood returned is the maximum.

Parameters

verbose (bool) – if True, print the Minuit object
fit_strategy (dict) – override the default fit strategy defined in the model (model.fit_strategy). Possible settings are:
- minimizer_routine (str): the minimizer routine to use, either “migrad”, “simplex”, or “simplex_migrad” (first run simplex, then migrad).
- minuit_strategy (int): strategy for Minuit, can be 0, 1, or 2. The higher the number, the more precise the fit but also the slower.
- refit_invalid (bool): if True, refit with the simplex_migrad routine and strategy 2 if the optimization does not converge the first time.
- disable_index_fitting (bool): if True, disable the index fitting even if the model has index parameters.
- max_index_fitting_iter (int): maximum number of iterations for index fitting

Returns

best-fit values of each parameter, and the value of the likelihood evaluated there

Return type

dict, float
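The sign convention can be sketched as follows. This is not the Minuit-based implementation: a hypothetical sketch_fit helper minimizes the negative log-likelihood of a single parameter with a golden-section search, then flips the sign so the caller receives the maximum log-likelihood together with the best-fit dict.

```python
import math


def sketch_fit(ll, name, lower, upper, tol=1e-9):
    """Maximize ll over one parameter by minimizing -ll (golden-section search)."""
    inv_phi = (math.sqrt(5) - 1) / 2

    def cost(x):
        return -ll(**{name: x})

    a, b = lower, upper
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if cost(c) < cost(d):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    best = 0.5 * (a + b)
    # Flip the sign back: the caller receives the maximum log-likelihood.
    return {name: best}, -cost(best)


def toy_ll(mu):
    # Hypothetical log-likelihood with its maximum (value 0) at mu = 2
    return -0.5 * (mu - 2.0) ** 2


best_fit, max_ll = sketch_fit(toy_ll, "mu", -10.0, 10.0)
print(best_fit, max_ll)
```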

generate_data(**kwargs) → Union[dict, list][source]

Generate data for the given parameters.

Parameters are passed as keyword arguments; positional arguments are not supported. If a parameter is not given, the default value is used.

Raises: ValueError – If the parameters are not within the fit limits
Returns: generated data
Return type: dict or list

Caution

This implementation won’t allow you to call generate_data by positional arguments.

get_expectation_values(**parameter_values)[source]

Get the expectation values of the measurement.

Parameters: parameter_values – values of the parameters

get_likelihood_term_from_name(likelihood_name: str) → int[source]

Return the index of a likelihood term if the likelihood has several names.

Parameters: likelihood_name (str) – name of the likelihood term
Returns: index of the likelihood term
Return type: int

static get_model_from_name(statistical_model: str)[source]: Get the statistical model class from a string.

get_parameter_list()[source]: Return a set of all parameters that the generate_data and likelihood functions accept.

ll(**kwargs) → float[source]

Return the log-likelihood for the given parameters.

Parameters are passed as keyword arguments; positional arguments are not supported. If a parameter is not given, the default value is used.

Keyword Arguments: kwargs – keyword arguments for the parameters
Returns: likelihood value
Return type: float

make_objective()[source]

Make a function that can be passed to Minuit.

Returns: function that can be passed to Minuit
Return type: Callable

property nominal_expectation_values

Nominal expectation values for the sources of the likelihood.

For this to work, you must implement get_expectation_values.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments: fit_guesses (dict) – A dict of parameter names and values.

store_data(file_name, data_list, data_name_list: Optional[List[str]] = None, metadata: Optional[dict] = None)[source]

Store a list of datasets to a file using inference_interface.

Each dataset is in the form of a list of one or more structured arrays or dicts. The structure would be: [[datasets1], [datasets2], ..., [datasetsn]], where each of datasets is a list of structured arrays. If you specify data_name_list, it is used; if not, the names are read from self.get_likelihood_term_names. If that is not defined, the names will be ["0", "1", ..., "n-1"]. The metadata is optional.

Parameters

file_name (str) – name of the file to store the data in
data_list (list) – list of datasets
data_name_list (list, optional (default=None)) – list of names of the datasets. If None, it will be read from self.get_likelihood_term_names
metadata (dict, optional (default=None)) – metadata to store with the data. If None, no metadata is stored.

alea.parameters module

class alea.parameters.ConditionalParameter(name: str, conditioning_parameter_name: str, **kwargs)[source]

Bases: object

A parameter whose properties depend on the value of another (conditioning) parameter.

Each attribute can be a dictionary mapping conditioning parameter values to the corresponding values of the conditional parameter. Calling the object with the conditioning parameter value as an argument returns a Parameter object with the correct values.

name

The name of the parameter.

Type: str

conditioning_parameter_name

The name of the conditioning parameter.

Type: str
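The lookup described above can be sketched with two hypothetical classes, SketchParameter and SketchConditionalParameter. They are illustrations only and do not reproduce the alea.parameters classes; the attribute names nominal_value and fittable are taken from the Parameter signature, while the example parameter and its conditioning values are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParameter:
    name: str
    nominal_value: Optional[float] = None
    fittable: bool = True


class SketchConditionalParameter:
    """Hypothetical illustration; not alea.parameters.ConditionalParameter."""

    def __init__(self, name, conditioning_parameter_name, **attributes):
        self.name = name
        self.conditioning_parameter_name = conditioning_parameter_name
        self.attributes = attributes

    def __call__(self, conditioning_value):
        resolved = {}
        for key, value in self.attributes.items():
            # A dict maps conditioning values to attribute values;
            # anything else applies under every condition.
            resolved[key] = value[conditioning_value] if isinstance(value, dict) else value
        return SketchParameter(name=self.name, **resolved)


efficiency = SketchConditionalParameter(
    "efficiency",
    "voltage",
    nominal_value={100: 0.8, 200: 0.9},
    fittable=False,
)
print(efficiency(200))
```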

__eq__(other: object) → bool[source]: Return True if all attributes are equal.

property blueice_anchors: Any: Return the blueice_anchors of the parameter (nominal condition)

property fit_guess: Optional[float]: Return the initial guess for fitting the parameter (nominal condition)

property fit_limits: Optional[Tuple[float, float]]: Return the fit_limits of the parameter (nominal condition)

property fittable: bool: Return the fittable attribute of the parameter (nominal condition)

property needs_reinit: bool: Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]: Return the nominal value of the parameter (nominal condition)

property parameter_interval_bounds: Optional[Tuple[float, float]]: Return the parameter_interval_bounds of the parameter (nominal condition)

property ptype: Optional[str]: Return the ptype of the parameter (nominal condition)

property relative_uncertainty: Optional[bool]: Return the relative_uncertainty of the parameter (nominal condition)

property uncertainty: Any: Return the uncertainty of the parameter (nominal condition)

value_in_fit_limits(value: float) → bool[source]: Returns True if value under nominal condition is within fit_limits.

class alea.parameters.Parameter(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]

Bases: object

Represents a single parameter with its properties.

name

The name of the parameter.

Type: str

nominal_value

The nominal value of the parameter.

Type: float, optional (default=None)

fittable

Indicates if the parameter is fittable or always fixed.

Type: bool, optional (default=True)

ptype

The ptype of the parameter.

Type: str, optional (default=None)

uncertainty

The uncertainty of the parameter. If a string, it can be evaluated as a numpy or scipy function to define non-gaussian constraints.

Type: float or str, optional (default=None)

relative_uncertainty

Indicates if the uncertainty is relative to the nominal_value.

Type: bool, optional (default=None)

blueice_anchors

Anchors for blueice template morphing. Blueice will load the template for the provided values and then interpolate for any value in between.

Type: list, optional (default=None)

fit_limits

The limits for fitting the parameter.

Type: Tuple[float, float], optional (default=None)

parameter_interval_bounds

Limits for computing confidence intervals.

Type: Tuple[float, float], optional (default=None)

fit_guess

The initial guess for fitting the parameter.

Type: float, optional (default=None)

description

A description of the parameter.

Type: str, optional (default=None)

__eq__(other: object) → bool[source]: Return True if all attributes are equal.

__init__(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]: Initialise a parameter.

_check_parameter_consistency()[source]: Check if parameter is consistent.

_check_parameter_interval_bounds(value)[source]: Check if parameter_interval_bounds is within fit_limits and is not None.

property blueice_anchors: Any

Return the blueice_anchors of the parameter.

If the blueice_anchors is a string, it will be evaluated as a numpy or scipy function.

property fit_guess: Optional[float]: Return the initial guess for fitting the parameter.

property needs_reinit: bool: Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]: Return the nominal value of the parameter.

property parameter_interval_bounds: Optional[Tuple[float, float]]

property uncertainty: Any

Return the uncertainty of the parameter.

If the uncertainty is a string, it will be evaluated as a numpy or scipy function.

value_in_fit_limits(value: float) → bool[source]: Returns True if value is within fit_limits.

class alea.parameters.Parameters[source]

Bases: object

Represents a collection of parameters.

names

A list of parameter names.

Type: List[str]

fit_guesses

A dictionary of fit guesses.

Type: Dict[str, float]

fit_limits

A dictionary of fit limits.

Type: Dict[str, float]

fittable

A list of parameter names which are fittable.

Type: List[str]

not_fittable

A list of parameter names which are not fittable.

Type: List[str]

uncertainties

A dictionary of parameter uncertainties.

Type: Dict[str, float or Any]

with_uncertainty

A Parameters object with parameters with a not-NaN uncertainty.

Type: Parameters

nominal_values

A dictionary of parameter nominal values.

Type: Dict[str, float]

parameters

A dictionary to store the parameters, with parameter name as key.

Type: Dict[str, Parameter]

__call__(return_fittable: Optional[bool] = False, **kwargs: Optional[Dict]) → Dict[str, Any][source]

Return a dictionary of parameter values, optionally filtered to fittable parameters only.

Parameters: return_fittable (bool, optional (default=False)) – Indicates if only fittable parameters should be returned.
Keyword Arguments: kwargs (dict) – Additional keyword arguments to override parameter values.
Raises: ValueError – If a parameter name is not found.
Returns: A dictionary of parameter values.
Return type: dict
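A sketch of the documented behaviour, written as a standalone function rather than a method. The helper name parameter_values and its nominal and fittable arguments are hypothetical; only the override, filtering, and ValueError behaviour are taken from the description above.

```python
def parameter_values(nominal, fittable, return_fittable=False, **overrides):
    """Hypothetical helper mirroring the described behaviour of Parameters.__call__."""
    unknown = set(overrides) - set(nominal)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    # Keyword arguments override the stored values
    values = {**nominal, **overrides}
    if return_fittable:
        values = {name: v for name, v in values.items() if name in fittable}
    return values


nominal = {"mu": 1.0, "sigma": 2.0}
print(parameter_values(nominal, fittable={"mu"}, sigma=3.0))
print(parameter_values(nominal, fittable={"mu"}, return_fittable=True))
```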

__eq__(other: object) → bool[source]: Return True if all parameters are equal.

__getattr__(name: str) → Parameter[source]

Retrieves a Parameter object by attribute access.

Parameters: name (str) – The name of the parameter.
Raises: AttributeError – If the attribute is not found.
Returns: The retrieved Parameter object.
Return type: Parameter

__getitem__(name: str) → Parameter[source]

Retrieves a Parameter object by dictionary access.

Parameters: name (str) – The name of the parameter.
Raises: KeyError – If the key is not found.
Returns: The retrieved Parameter object.
Return type: Parameter

__init__()[source]: Initialise a collection of parameters.

__iter__() → Iterator[Parameter][source]

Return an iterator over the parameters.

Each iteration returns a Parameter object.

__str__() → str[source]: Return an overview table of all parameters.

add_parameter(parameter: Union[Parameter, ConditionalParameter]) → None[source]

Adds a Parameter object to the Parameters collection.

Parameters: parameter (Parameter) – The Parameter object to add.
Raises: ValueError – If the parameter name already exists.

property fit_guesses: Dict[str, float]: A dictionary of fit guesses.

property fit_limits: Dict[str, float]: A dictionary of fit limits.

property fittable: List[str]: A list of parameter names which are fittable.

classmethod from_config(config: Dict[str, dict])[source]

Creates a Parameters object from a configuration dictionary.

Parameters: config (dict) – A dictionary of parameter configurations.
Returns: The created Parameters object.
Return type: Parameters

classmethod from_list(names: List[str])[source]

Creates a Parameters object from a list of parameter names.

Everything else is set to default values.

Parameters: names (List[str]) – List of parameter names.
Returns: The created Parameters object.
Return type: Parameters

property names: List[str]: A list of parameter names.

property nominal_values: dict: A dict of nominal values for all parameters with a nominal value.

property not_fittable: List[str]: A list of parameter names which are not fittable.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments: fit_guesses (dict) – A dict of parameter names and values.

set_nominal_values(**nominal_values)[source]

Set the nominal values for parameters.

Keyword Arguments: nominal_values (dict) – A dict of parameter names and values.

property uncertainties: dict

A dict of uncertainties for all parameters with a not-NaN uncertainty.

Caution: this is not the same as the parameter.uncertainty property.

values_in_fit_limits(**kwargs: Dict) → bool[source]

Return True if all values are within the fit limits.

Keyword Arguments: kwargs (dict) – The parameter values to check.
Returns: True if all values are within the fit limits.
Return type: bool
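A possible shape for such a check, as a standalone sketch. The function name is hypothetical, and two details are assumptions rather than documented behaviour: limits are treated as inclusive, and a None bound (or a missing entry) is treated as unbounded.

```python
def sketch_values_in_fit_limits(fit_limits, **values):
    """Return True if every given value lies inside its (lower, upper) limits."""
    for name, value in values.items():
        # Assumption: a missing entry or a None bound means "no limit"
        lower, upper = fit_limits.get(name) or (None, None)
        if lower is not None and value < lower:
            return False
        if upper is not None and value > upper:
            return False
    return True


limits = {"mu": (0.0, 10.0), "sigma": (0.1, None)}
print(sketch_values_in_fit_limits(limits, mu=5.0, sigma=2.0))   # True
print(sketch_values_in_fit_limits(limits, mu=-1.0))             # False
```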

property with_uncertainty: Parameters

Return parameters with a not-NaN uncertainty.

The parameters are the same objects as in the original Parameters object, not a copy. For conditional parameters, the parameters under the nominal condition are returned.

alea.runner module

class alea.runner.Runner(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]

Bases: object

Manages toy Monte Carlo simulation and fitting for a statistical model.

Responsibilities:

initialize the statistical model
generate or read toy data
save toy data if needed
fit fittable parameters
write the output file

One toyfile can contain multiple toydata, but all of them share the same generate_values.

model

statistical model instance

Type: StatisticalModel

poi

parameter of interest

Type: str

hypotheses

list of hypotheses

Type: list

common_hypothesis

common hypothesis, the values are copied to each hypothesis

Type: dict

generate_values

generate values for toydata

Type: dict

nominal_values

nominal values of parameters

Type: dict

_compute_confidence_interval

whether to compute the confidence interval

Type: bool

_n_mc

number of Monte Carlo simulations

Type: int

_toydata_filename

toydata filename

Type: str

_toydata_mode

toydata mode, ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’

Type: str

_metadata

metadata, if None, it is set to {}

Type: dict

_output_filename

output filename

Type: str

_result_names

list of result names

Type: list

_result_dtype

list of result dtypes

Type: list

_hypotheses_values

list of values for hypotheses

Type: list

Parameters

statistical_model (str) – statistical model class name
poi (str) – parameter of interest
hypotheses (list) – list of hypotheses
n_mc (int) – number of Monte Carlo simulations
common_hypothesis (dict, optional (default=None)) – common hypothesis, the values are copied to each hypothesis
generate_values (Dict[str, float], optional (default=None)) – generate values of toydata. If None, toydata depend on statistical model.
nominal_values (dict, optional (default=None)) – nominal values of parameters. If None, nothing will be assigned to model.
statistical_model_config (str, optional (default=None)) – statistical model configuration filename
parameter_definition (dict or list, optional (default=None)) – parameter definition
statistical_model_args (dict, optional (default=None)) – arguments for statistical model
likelihood_config (dict, optional (default=None)) – likelihood configuration
compute_confidence_interval (bool, optional (default=False)) – whether to compute the confidence interval
confidence_level (float, optional (default=0.9)) – confidence level
confidence_interval_kind (str, optional (default='central')) – kind of confidence interval, choice from ‘central’, ‘upper’ or ‘lower’
confidence_interval_root_find (str, optional (default='brentq')) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
fit_strategy (dict, optional (default=None)) – fit strategy dictionary. If None, the default fit strategy of the model will be used.
toydata_mode (str, optional (default='generate_and_store')) – toydata mode, choice from ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’
toydata_filename (str, optional (default='test_toydata_filename.ii.h5')) – toydata filename
only_toydata (bool, optional (default=False)) – whether to only generate toydata
output_filename (str, optional (default='test_output_filename.ii.h5')) – output filename
seed (int, optional (default=None)) – random seed for runners before generating toydata
metadata (dict, optional (default=None)) – metadata to be saved in output file

__init__(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]: Initialize statistical model, parameters list, and generate values list.

_get_hypotheses()[source]: Get generate values list from hypotheses.

Caution

When the free hypothesis is provided, it should be the first hypothesis. The free hypothesis means that all parameters are free to fit; it will not use common_hypothesis!
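The caution above can be sketched as a small standalone function. This is an illustration under assumptions, not the Runner's code: non-free hypotheses are assumed to be dicts of fixed parameter values, values from the hypothesis are assumed to take precedence over common_hypothesis, and the name expand_hypotheses is hypothetical.

```python
def expand_hypotheses(hypotheses, common_hypothesis=None):
    """Return one dict of fixed parameter values per hypothesis."""
    common = common_hypothesis or {}
    if "free" in hypotheses and hypotheses[0] != "free":
        raise ValueError("the free hypothesis should be the first hypothesis")
    expanded = []
    for hypothesis in hypotheses:
        if hypothesis == "free":
            # Nothing is fixed and common_hypothesis is ignored
            expanded.append({})
        else:
            # common_hypothesis values are copied into the hypothesis
            expanded.append({**common, **hypothesis})
    return expanded


print(expand_hypotheses(["free", {"mu": 0.0}], common_hypothesis={"sigma": 1.0}))
```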

_get_parameter_list()[source]: Get parameter list and result list from statistical model.

property common_hypothesis: Dict[str, float]

data_generator()[source]: Generate, save or read toydata.

property generate_values: Dict[str, float]

property hypotheses: list

pre_process_poi(value, attribute_name)[source]: Pre-process poi_expectation for some attributes of the runner.

read_toydata()[source]: Read toydata from file.

run()[source]

Run toy simulation.

If only_toydata is True, only generate toydata.

static runner_arguments()[source]: Get runner arguments and annotations.

simulate()[source]: Only generate toydata.

simulate_and_fit()[source]

Run toy simulations, perform fits for different hypotheses, and collect results.

For each Monte Carlo iteration, runs the toy simulation under the specified toydata mode and generate values, then fits the model to the generated toydata for each hypothesis, and collects the fit results and confidence intervals if needed.

Todo

Implement per-hypothesis switching on whether to compute confidence intervals

store_toydata(toydata, toydata_names)[source]

Write toydata to file.

If toydata is a list of dicts, convert it to a list of lists.

static update_poi(model, poi: str, generate_values: Dict[str, float], nominal_values: Dict[str, float] = {})[source]

Update the poi in generate_values according to poi_expectation.

Checks that poi_expectation is provided, that poi is not already set, and that poi ends with _rate_multiplier. Then updates the poi to the correct value using the get_expectation_values method of the model under the specified nominal_values.

Parameters

poi (str) – parameter of interest
generate_values (dict) – generate values of toydata, it can contain “poi_expectation”
nominal_values (dict) – nominal values of parameters

Caution

The expectation is evaluated under nominal_values in each batch.
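A sketch of the checks and the update described above, written against a plain callable instead of a model. Several details are assumptions: the source name is taken to be the poi without its _rate_multiplier suffix, get_expectation_values is assumed to return a dict keyed by source name, the new value is assumed to be poi_expectation divided by the nominal expectation, and the function name sketch_update_poi is hypothetical.

```python
def sketch_update_poi(get_expectation_values, poi, generate_values, nominal_values=None):
    suffix = "_rate_multiplier"
    if "poi_expectation" not in generate_values:
        raise ValueError("poi_expectation must be provided")
    if poi in generate_values:
        raise ValueError(f"{poi} is already set in generate_values")
    if not poi.endswith(suffix):
        raise ValueError(f"poi must end with {suffix}")

    source = poi[: -len(suffix)]   # assumed naming convention
    expectation = get_expectation_values(**(nominal_values or {}))[source]

    # Replace poi_expectation by the equivalent value of the poi
    updated = {k: v for k, v in generate_values.items() if k != "poi_expectation"}
    updated[poi] = generate_values["poi_expectation"] / expectation
    return updated


def toy_expectations(**kwargs):
    # Hypothetical nominal expectation values per source
    return {"signal": 50.0, "background": 200.0}


result = sketch_update_poi(
    toy_expectations,
    "signal_rate_multiplier",
    {"poi_expectation": 100.0, "background_rate_multiplier": 1.0},
)
print(result)
```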

write_output(results)[source]: Write output file with metadata.

alea.simulators module

class alea.simulators.BlueiceDataGenerator(ll_term)[source]

Bases: object

A class for generating data from a blueice likelihood term.

ll: The blueice likelihood term.

binned

True if the likelihood term is binned.

Type: bool

bincs

The bin centers of the likelihood term.

Type: list

direction_names

The names of the directions of the likelihood term.

Type: list

source_histograms

The histograms of the sources of the likelihood term.

Type: list

data_lengths

The number of bins of each component of the likelihood term.

Type: list

dtype

The data type of the likelihood term.

Type: list

last_kwargs

The last kwargs used to generate data.

Type: dict

mus: The expected number of events of each source of the likelihood term.

parameters

The parameters of the likelihood term.

Type: list

ll_term

A blueice likelihood term.

Type: BinnedLogLikelihood or UnbinnedLogLikelihood

__init__(ll_term)[source]: Initialize the BlueiceDataGenerator.

compute_pdfs_and_mus(filter_kwargs=True, **kwargs) → None[source]

Compute PDFs and expected event counts for all sources given the parameters.

Results are cached; recomputation is skipped if kwargs are unchanged from the previous call.

Parameters

filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs.
kwargs – The parameters passed to the likelihood function.
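The caching behaviour can be sketched with a hypothetical CachedComputation class that remembers the last kwargs and skips the work when they are unchanged. The stand-in computation (scaling each value by 10) and the n_computations counter are invented for illustration; only the last_kwargs comparison reflects the description above.

```python
class CachedComputation:
    """Hypothetical illustration of skipping recomputation for unchanged kwargs."""

    def __init__(self):
        self.last_kwargs = None
        self.n_computations = 0
        self.mus = None

    def compute(self, **kwargs):
        if kwargs == self.last_kwargs:
            return  # unchanged parameters: reuse the cached results
        self.n_computations += 1
        # Stand-in for the expensive computation of PDFs and expected counts
        self.mus = {name: 10.0 * value for name, value in kwargs.items()}
        self.last_kwargs = dict(kwargs)


generator = CachedComputation()
generator.compute(signal=1.0)
generator.compute(signal=1.0)   # served from the cache
generator.compute(signal=2.0)   # parameters changed, recomputed
print(generator.n_computations, generator.mus)
```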

simulate(filter_kwargs=True, n_toys=None, sample_n_toys=False, **kwargs)[source]

Simulate toys for each source.

Parameters

filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs.
n_toys (int, optional (default=None)) – If not None, a fixed number n_toys of toys is generated for each source component.
sample_n_toys (bool, optional (default=False)) – If True, the number of toys is sampled from a Poisson distribution with mean n_toys. Only works if n_toys is not None.

Keyword Arguments

kwargs – The parameters passed to the likelihood function.

Returns

Array of simulated data for all sources in the given analysis space. The index “source” indicates the corresponding source of an entry. The dtype follows self.dtype.

Return type

numpy.array
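How n_toys and sample_n_toys interact can be sketched as follows. The helper names are hypothetical and the Poisson sampler is a simple textbook method rather than the generator's own; drawing the count from a Poisson distribution around each source's expectation when n_toys is None is an assumption, not something stated above.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def events_per_source(mus, rng, n_toys=None, sample_n_toys=False):
    counts = {}
    for source, mu in mus.items():
        if n_toys is None:
            counts[source] = poisson(mu, rng)       # assumed default behaviour
        elif sample_n_toys:
            counts[source] = poisson(n_toys, rng)   # Poisson with mean n_toys
        else:
            counts[source] = n_toys                 # fixed number of toys
    return counts


rng = random.Random(0)
print(events_per_source({"signal": 3.0, "background": 8.0}, rng, n_toys=7))
print(events_per_source({"signal": 3.0, "background": 8.0}, rng, n_toys=7, sample_n_toys=True))
```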

alea.submitter module

class alea.submitter.Submitter(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]

Bases: object

Submitter base class that generates the submission script from the configuration.

Initialized from a configuration file whose contents map to the arguments of the __init__ method of the Submitter.

statistical_model

the name of the statistical model

Type: str

statistical_model_config

the configuration file of the statistical model

Type: str

poi

the parameter of interest

Type: str

computation_dict

the dictionary of the computation, with keys to_zip, to_vary and in_common

Type: dict

debug

whether to run in debug mode. If True, only one job will be submitted or one runner will be returned, and its script will be printed.

Type: bool

resubmit

whether to resubmit the jobs that have not finished. If True, all jobs are submitted, even if the output file exists.

Type: bool

Parameters

statistical_model (str) – the name of the statistical model
statistical_model_config (str) – the configuration file of the statistical model
poi (str) – the parameter of interest
computation_options (dict) – the configuration of the computation
computation (str, optional (default='discovery_power')) – the name of the computation, it should be a key of computation_options
outputfolder (str, optional (default=None)) – the output folder
debug (bool, optional (default=False)) – whether to run in debug mode
loglevel (str, optional (default='INFO')) – the log level

Keyword Arguments

kwargs – the arguments of __init__ method of the Submitter, containing configurations of clusters

Caution

All template sources should be in the same folder. All output, including toydata and fitting results, should be in the same folder.

__init__(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]: Initializes the submitter.

all_runner_kwargs()[source]: Parse all the runner arguments from the submission script.

allowed_special_args: List[str] = []

already_done(i_args: dict) → bool[source]: Check if the job is already done, considering the modes of toydata and output.

static arg_to_str(value, annotation) → str[source]

Convert the argument to string for the submission script.

Parameters

value – the value of the argument, can be various type
annotation – the annotation of the argument

Returns

the string of the argument

Return type

str

Caution

Currently we only support str, int, float, bool, dict and list. The float will be rounded to 4 digits after the decimal point.
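As a rough illustration of the conversion described in the caution above, the sketch below turns each supported type into a string. The name arg_to_str_sketch, the omission of the annotation argument, and the use of JSON for dict and list are assumptions made for illustration; this is not the actual implementation of Submitter.arg_to_str.

```python
import json


def arg_to_str_sketch(value):
    """Hypothetical helper: turn a supported value into a command-line string."""
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, float):
        # floats are rounded to 4 digits after the decimal point
        return str(round(value, 4))
    if isinstance(value, (str, int)):
        return str(value)
    if isinstance(value, (dict, list)):
        # assumption: containers are serialized as JSON
        return json.dumps(value)
    raise TypeError(f"Unsupported type: {type(value).__name__}")
```

The rounding means that two float values differing only beyond the fourth decimal place map to the same string.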

static check_redunant_arguments(runner_args, allowed_special_args: List[str] = [])[source]

combine_n_jobs: int = 1

combined_tickets_generator()[source]

Get the combined submission script for the current configuration.

self.combine_n_jobs jobs will be combined into one submission script.

Yields: (str, str) – the combined submission script and the output filename (output_filename)

Note

Users can add combine_n_jobs: 10 to local_configurations, slurm_configurations or htcondor_configurations to combine 10 jobs into one submission script. This is useful when the number of jobs pending submission is too large.
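The grouping of jobs can be pictured with the minimal sketch below. The function name combine_jobs and the choice to join scripts with " && " are assumptions for illustration only; how the Submitter actually concatenates scripts is not stated here.

```python
from itertools import islice


def combine_jobs(scripts, combine_n_jobs=1):
    """Hypothetical helper: group job scripts into chunks of combine_n_jobs."""
    iterator = iter(scripts)
    while True:
        chunk = list(islice(iterator, combine_n_jobs))
        if not chunk:
            break
        # assumption: combined jobs run one after another in a single script
        yield " && ".join(chunk)
```

With three scripts and combine_n_jobs=2, the last combined script holds the single remaining job.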

computation_tickets_generator()[source]

Generate submission scripts for each combination of the computation options.

For each Runner argument set derived from to_zip, to_vary and in_common:

First, generate the combined computational options directly.
Second, update the input and output folder of the options.
Third, collect the non-fittable (settable) parameters into nominal_values.
Then, collect the fittable parameters into generate_values.
Finally, generate the submission script for each combination.

Yields: (str, str) – the submission script and the output filename (output_filename)

config_file_path: str

filename_kwargs(runner_args: dict) → dict[source]

Get the filename_kwargs from runner_args.

Parameters: runner_args (dict) – the arguments of Runner
Returns: the keyword arguments for the filename
Return type: dict

first_i_batch: int = 0

classmethod from_config(config_file_path: str, **kwargs) → Submitter[source]

Initialize the submitter from a yaml config file.

Parameters: config_file_path (str) – Path to the yaml config file.
Returns: The initialized Submitter instance.
Return type: Submitter

logging = <Logger submitter_logger (INFO)>

merged_arguments_generator()[source]: Generate the merged arguments for Runner from to_zip, to_vary and in_common.

property outputfolder: Optional[str]

static runner_kwargs_from_script(sys_argv: Optional[List[str]] = None)[source]

Parse kwargs of a Runner from a list of argument strings (script).

Parameters: sys_argv (list, optional (default=None)) – list of argument strings, in the format ['--arg1', 'value1', '--arg2', 'value2', ...]. The arguments must be the same as the arguments of Runner.__init__.
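The flag/value layout of sys_argv can be illustrated with the sketch below. The helper parse_flag_pairs is hypothetical and leaves every value as a string; the real method presumably converts values according to the annotations of Runner.__init__ (compare str_to_arg), which is not reproduced here.

```python
def parse_flag_pairs(sys_argv):
    """Hypothetical helper: map ['--key', 'value', ...] to {'key': 'value', ...}."""
    if len(sys_argv) % 2 != 0:
        raise ValueError("Expected an even number of items: flag/value pairs.")
    kwargs = {}
    for flag, value in zip(sys_argv[0::2], sys_argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"Expected a flag starting with '--', got {flag!r}.")
        # strip the leading dashes to obtain the keyword name
        kwargs[flag[2:]] = value
    return kwargs
```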

static script_from_runner_kwargs(annotations, kwargs) → str[source]: Generate the submission script from the runner arguments.

static str_to_arg(value: str, annotation)[source]

Convert the string to argument for the submission script.

Parameters

value – the string of the argument
annotation – the annotation of the argument

Returns

the value of the argument, can be various type

submit(*arg, **kwargs)[source]: Submit the jobs to the destinations.

template_path: str

static update_limit_threshold(runner_args, outputfolder: str)[source]

static update_n_batch(runner_args)[source]

Update n_mc if n_batch is provided.

Distribute n_mc into n_batch, so that each batch will run n_mc/n_batch times.
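The arithmetic is sketched below under the assumption that integer division is used; how a remainder is handled is not stated here, and the name split_n_mc is hypothetical.

```python
def split_n_mc(n_mc, n_batch):
    """Hypothetical helper: number of toys each batch runs."""
    if n_batch < 1:
        raise ValueError("n_batch must be a positive integer.")
    # assumption: integer division, any remainder is dropped
    return n_mc // n_batch
```

For example, n_mc = 1000 and n_batch = 10 would give 100 toys per batch.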

static update_output_toydata(runner_args, outputfolder: str)[source]

static update_runner_args(runner_args: Dict[str, Dict[str, Any]], parameters_fittable: List[str], parameters_not_fittable: List[str])[source]

Update the runner arguments’ generate_values and nominal_values.

Fittable parameters are added to generate_values; non-fittable parameters are added to nominal_values.

Parameters: runner_args (dict) – the arguments of Runner

static update_statistical_model_args(runner_args: Dict[str, Dict[str, Any]], template_path: Optional[str] = None)[source]

Update template_path in the statistical model arguments.

Parameters: runner_args (dict) – the arguments of Runner

alea.template_source module

class alea.template_source.CombinedSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Source that is a weighted sum of histograms.

Useful for example for safeguard. The first histogram is the base histogram and the rest are added to it with weights, which can be set as shape parameters in the config.

Parameters

weights – Weights of the 2nd to the last histograms.
histnames – List of filenames containing the histograms.
templatenames – List of names of histograms within the hdf5 files.
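The weighted sum can be written as h = h_0 + w_1 h_1 + ... + w_n h_n, where h_0 is the base histogram. The sketch below applies this to plain lists of bin contents. The function combine_histograms is hypothetical and ignores file loading and any normalisation that CombinedSource may perform.

```python
def combine_histograms(histograms, weights):
    """Hypothetical helper: base histogram plus weighted additional histograms.

    histograms: list of equal-length lists of bin contents; the first is the base.
    weights: one weight per histogram after the base.
    """
    if len(weights) != len(histograms) - 1:
        raise ValueError("Need exactly one weight per non-base histogram.")
    combined = list(histograms[0])
    for weight, hist in zip(weights, histograms[1:]):
        if len(hist) != len(combined):
            raise ValueError("All histograms must share the same binning.")
        # add the weighted histogram bin by bin
        combined = [c + weight * h for c, h in zip(combined, hist)]
    return combined
```

A negative weight subtracts the corresponding histogram from the base.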

build_histogram()[source]: Build the histogram of the source.

class alea.template_source.SpectrumTemplateSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Template source reweighted by a 1D spectrum.

The first axis of the template is assumed to be the one being reweighted.

Parameters: spectrum_name – Name of bbf json-like spectrum file

static _get_json_spectrum(filename: str)[source]

Translates bbf-style JSON files to spectra.

Parameters: filename (str) – Name of the JSON file.

Todo

Define the format of the JSON file clearly.

build_histogram()[source]: Build the histogram of the source.

class alea.template_source.TemplateSource(config: Dict, *args, **kwargs)[source]

Bases: HistogramPdfSource

A source defined with a template histogram.

The parameters are set in self.config; “templatename”, “histname”, and “analysis_space” must be present in self.config.

config

The configuration of the source.

Type: dict

dtype

The data type of the source.

Type: list

_bin_volumes

The bin volumes of the source.

Type: numpy.ndarray

_n_events_histogram

The histogram of the number of events of the source.

Type: multihist.MultiHistBase

events_per_day

The number of events per day of the source.

Type: float

_pdf_histogram

The histogram of the probability density function of the source.

Type: multihist.MultiHistBase

Parameters

config (dict) – The configuration of the source.
templatename – Hdf5 file to open.
histname – Histogram name.
named_parameters (list) – List of config setting names to pass to .format on histname and filename.
normalise_template (bool) – Normalise the template histogram.
in_events_per_bin (bool) – If True, histogram is in events per day / bin. If False or absent, histogram is already a pdf.
histogram_scale_factor (float) – Multiply histogram by this number
convert_to_uniform (bool) – Convert the histogram to a uniform per bin distribution.
log10_bins (list) – List of axis numbers. Bin edges on the listed axes in the hdf5 file are log10() of the actual bin edges.

__init__(config: Dict, *args, **kwargs)[source]: Initialize the TemplateSource.

_check_binning(h, histogram_info: str)[source]

Check if the histogram's bin edges are the same as those of analysis_space.

Parameters

h (multihist.MultiHistBase) – The histogram to check.
histogram_info (str) – Information of the histogram.

_compute_multiple_file_hashes(templatenames: List[str], format_named_parameters: Dict) → str[source]: Compute a deterministic hash for multiple template files.

_compute_single_file_hash(templatename: str, format_named_parameters: Dict) → str[source]: Compute the hash for a single template file.

apply_slice_args(h, slice_args: Optional[Union[List[Dict], Dict]] = None)[source]

Apply slice arguments to the histogram.

Parameters

h (multihist.MultiHistBase) – The histogram to apply the slice arguments to.
slice_args (dict or list of dict) – The slice arguments to apply. The keys sum_axis, slice_axis, and slice_axis_limits are supported.

build_histogram()[source]: Build the histogram of the source.

property format_named_parameters: Get the named parameters in the config to dictionary format.

set_dtype()[source]: Set the data type of the source.

set_pdf_histogram(h)[source]: Set the histogram of the probability density function of the source.

simulate(n_events: int)[source]

Simulate events from the source.

Parameters: n_events (int) – The number of events to simulate.
Returns: The simulated events.
Return type: numpy.ndarray

alea.utils module

exception alea.utils.CannotUpdate[source]: Bases: Exception

class alea.utils.IndexMorpher(config, shape_parameters)[source]

Bases: Morpher

IndexMorpher is a morpher which applies no interpolation.

get_anchor_points(bounds, n_models=None)[source]: Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points.

Parameters

f – Function which takes a Model as argument, and produces an extra_dims shaped array.
extra_dims – tuple of integers, shape of return value of f.
anchor_models – dictionary {z-score: Model} of anchor models at which to evaluate f.

class alea.utils.LockableSet(*args)[source]

Bases: set

A set whose update method can be locked.

basenames()[source]: The basenames of the filenames in the set.

lock()[source]: Lock the set to prevent modifications.

uniqueness()[source]: Check if the basenames contain only unique elements.

unlock()[source]: Unlock the set to allow modifications.

update(*args)[source]: Update the set with elements if it is not locked.
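The locking behaviour can be pictured with the stand-alone sketch below. Both class names are hypothetical, and raising an exception on a locked update (rather than silently ignoring it) is an assumption suggested by the presence of the CannotUpdate exception in this module.

```python
class CannotUpdateSketch(Exception):
    """Hypothetical stand-in for alea.utils.CannotUpdate."""


class LockableSetSketch(set):
    """Hypothetical re-implementation of a set with a lockable update method."""

    def __init__(self, *args):
        super().__init__(*args)
        self._locked = False

    def lock(self):
        self._locked = True

    def unlock(self):
        self._locked = False

    def update(self, *args):
        # assumption: updating a locked set raises instead of being ignored
        if self._locked:
            raise CannotUpdateSketch("The set is locked.")
        super().update(*args)
```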

class alea.utils.ReadOnlyDict(data)[source]

Bases: object

A read-only dict.

get(key, default=None)[source]

items()[source]

keys()[source]

values()[source]

alea.utils._get_internal(file_name)[source]

Get the abspath of the file.

Raises FileNotFoundError if the file is not found in any subfolder.

alea.utils._package_path(sub_directory)[source]: Get the abs path of the requested sub folder.

alea.utils._prefix_file_path(config: dict, template_folder_list: list, ignore_keys: List[str] = ['name', 'histname'])[source]

Prefix file path with template_folder_list whenever possible.

Parameters

config (dict) – dictionary containing file paths
template_folder_list (list) – list of possible base folders. Ordered by priority.
ignore_keys (list, optional (default=["name", "histname"])) – keys to be ignored when prefixing

alea.utils.adapt_likelihood_config_for_blueice(likelihood_config: dict, template_folder_list: list) → dict[source]

Adapt likelihood config to be compatible with blueice.

Parameters

likelihood_config (dict) – likelihood config dict
template_folder_list (list) – list of possible base folders. Ordered by priority.

Returns

adapted likelihood config

Return type

dict

alea.utils.add_i_batch(filename: str) → str[source]: Add i_batch to filename.

alea.utils.asymptotic_critical_value(confidence_interval_kind: str, confidence_level: float, degree_of_freedom: Optional[int] = None)[source]

Return the critical value for the confidence interval.

Parameters

confidence_interval_kind (str) – confidence interval kind, either ‘lower’, ‘upper’ or ‘central’
confidence_level (float) – confidence level
degree_of_freedom (int, optional (default=None)) – degree of freedom

Returns

critical value

Return type

float

Raises

ValueError – if confidence_interval_kind is not ‘lower’, ‘upper’ or ‘central’
ValueError – if degree_of_freedom is not None and not 1, when confidence_interval_kind is ‘lower’ or ‘upper’
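For one degree of freedom, the usual asymptotic (Wilks) thresholds can be computed with the standard library alone, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below is a hypothetical illustration restricted to that case; the one-sided convention shown is an assumption, and the actual function may differ in detail.

```python
from statistics import NormalDist


def asymptotic_critical_value_1dof(confidence_interval_kind, confidence_level):
    """Hypothetical sketch for one degree of freedom only."""
    if confidence_interval_kind == "central":
        # two-sided: chi2(1) quantile at confidence_level
        z = NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)
    elif confidence_interval_kind in ("lower", "upper"):
        # one-sided: all of the excluded probability sits in one tail
        z = NormalDist().inv_cdf(confidence_level)
    else:
        raise ValueError(
            "confidence_interval_kind must be 'lower', 'upper' or 'central'"
        )
    # a chi2 variable with one degree of freedom is the square of a standard normal
    return z**2
```

At a confidence level of 0.9 this gives roughly 2.71 for a central interval and roughly 1.64 for a one-sided limit.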

alea.utils.can_assign_to_typing(value_type, target_type) → bool[source]

Check if value_type can be assigned to target_type.

This is useful when converting Runner’s argument into strings.

Parameters

value_type – type of the value, might be float, int, etc.
target_type – type of the target, might be Optional, Union, etc.

alea.utils.can_expand_grid(variations: dict) → bool[source]

Check if variations can be expanded into a grid.

Example

>>> can_expand_grid({'a': [1, 2], 'b': [3, 4]})
True

alea.utils.clip_limits(value) → Tuple[float, float][source]: Clip limits to [-MAX_FLOAT, MAX_FLOAT] by replacing None with the respective bound.

alea.utils.compute_file_hash(file_path: str) → str[source]: Compute the SHA-256 hash of a file.
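A SHA-256 file hash is typically computed by feeding the file to hashlib in chunks, so that large files need not fit in memory. The sketch below shows this pattern; the name sha256_of_file and the chunk size are assumptions, and whether compute_file_hash reads in chunks is not stated here.

```python
import hashlib


def sha256_of_file(file_path, chunk_size=65536):
    """Hypothetical helper: hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # read until f.read returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```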

alea.utils.compute_variations(to_zip, to_vary, in_common) → list[source]

Compute all Runner argument combinations from to_zip, to_vary and in_common.

By priority the order is to_zip > to_vary > in_common: values in to_zip overwrite those in to_vary and in_common, and values in to_vary overwrite those in in_common.

Parameters

to_zip (dict) – variations to be zipped
to_vary (dict) – variations to be varied
in_common (dict) – variations in common

Returns

a list of dict

Return type

list
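The priority rule can be illustrated with plain dict merging, where later entries overwrite earlier ones. The sketch below is hypothetical: in particular, combining the zipped and the varied sets through a Cartesian product is an assumption, not a statement about the actual implementation.

```python
from itertools import product


def compute_variations_sketch(to_zip, to_vary, in_common):
    """Hypothetical sketch of merging with priority to_zip > to_vary > in_common."""
    # one dict per zipped position; fall back to a single empty dict if to_zip is empty
    zipped = [dict(zip(to_zip, values)) for values in zip(*to_zip.values())] or [{}]
    # one dict per grid point; product() of nothing yields one empty combination
    varied = [dict(zip(to_vary, values)) for values in product(*to_vary.values())]
    # later dicts win in a merge, so the highest-priority source goes last
    return [{**in_common, **v, **z} for z, v in product(zipped, varied)]
```

In this sketch a key present in both in_common and to_zip takes its value from to_zip.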

alea.utils.convert_to_in_common(in_common: Dict[str, Any]) → Dict[str, Any][source]

Expand the values in in_common, according to the itertools.product method, if necessary.

This usually happens to the hypotheses.

Example

>>> convert_to_in_common({'hypotheses': ['free', {'a': [1, 2], 'b': [3, 4]}]})
{
    "hypotheses": [
        "free",
        {"a": 1, "b": 3},
        {"a": 1, "b": 4},
        {"a": 2, "b": 3},
        {"a": 2, "b": 4},
    ]
}

alea.utils.convert_to_vary(to_vary: Dict[str, List]) → List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the itertools.product method.

Example

>>> convert_to_vary({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]

alea.utils.convert_to_zip(to_zip: Dict[str, List]) → List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the zip method.

Example

>>> convert_to_zip({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]

alea.utils.convert_variations(variations: dict, iteration) → list[source]

Convert variations to a list of dict, according to the iteration method.

Parameters

variations (dict) – variations to be converted
iteration – iteration method, either zip or itertools.product

Returns

a list of dict

Return type

list
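Because zip and itertools.product share the same calling convention, a single comprehension covers both iteration methods. The sketch below is a hypothetical minimal version that reproduces the documented outputs of convert_to_zip and convert_to_vary; the real function may perform additional validation.

```python
from itertools import product


def convert_variations_sketch(variations, iteration):
    """Hypothetical sketch: build one dict per combination produced by iteration."""
    keys = list(variations)
    # iteration is either zip or itertools.product, applied to the value lists
    return [dict(zip(keys, values)) for values in iteration(*variations.values())]
```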

alea.utils.deterministic_hash(thing, length=10)[source]

Return a lowercase base32 string of the given length, computed by hashing a container hierarchy.

Edited from strax: strax/utils.py
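One common way to obtain such a hash is to serialize the container deterministically, hash the bytes and base32-encode the digest. The sketch below follows that recipe; the choice of JSON with sorted keys and of SHA-1 is an assumption, and the sketch only accepts JSON-serializable input, unlike a version built on make_hashable.

```python
import hashlib
import json
from base64 import b32encode


def deterministic_hash_sketch(thing, length=10):
    """Hypothetical sketch: hash a JSON-serializable container hierarchy."""
    # sort_keys makes the serialization independent of dict insertion order
    serialized = json.dumps(thing, sort_keys=True)
    digest = hashlib.sha1(serialized.encode("utf-8")).digest()
    # a SHA-1 digest encodes to 32 base32 characters, so length can be at most 32
    return b32encode(digest)[:length].decode("ascii").lower()
```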

alea.utils.dump_json(file_name: str, data: dict)[source]: Dump data to a json file.

alea.utils.dump_yaml(file_name: str, data: dict)[source]: Dump data to a yaml file.

alea.utils.evaluate_numpy_scipy_expression(value: str)[source]: Evaluate a numpy (np) or scipy.stats expression.

alea.utils.evaluate_numpy_scipy_expression_in_dict(d: dict)[source]

Evaluate numpy (np) and scipy.stats expressions in a dict.

Example

>>> evaluate_numpy_scipy_expression_in_dict({'a': 'np.arange(0, 2, 1)', 'b': [0, 1]})
{'a': [0, 1], 'b': [0, 1]}

alea.utils.expand_grid_dict(variations: List[Union[dict, str]]) → List[Union[dict, str]][source]

Expand dict into a list of dict, according to the itertools.product method, if necessary.

Parameters: variations (list) – variations to be expanded

Example

>>> expand_grid_dict(["free", {"a": 1, "b": 3}, {"a": 'np.arange(1, 3)', "b": [3, 4]}])
[
    "free",
    {"a": 1, "b": 3},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 3},
    {"a": 2, "b": 4},
]

alea.utils.extremal_root(f, xL, xR, which='left', step=0.01, step_growth=1.0, max_step=None, xtol=1e-12, rtol=np.float64(8.881784197001252e-16))[source]

Return the left-most or right-most root of f in [xL, xR].

The interval is scanned adaptively to detect a sign change, and the root is refined using scipy.optimize.brentq.

Parameters

f (Callable[[float], float]) – Scalar function.
xL (float) – Left boundary (must satisfy xR > xL).
xR (float) – Right boundary.
which (str, optional) – “left” or “right”. Default is “left”.
step (float, optional) – Initial scan step (>0).
step_growth (float, optional) – Step multiplier (>=1).
max_step (float | None, optional) – Maximum scan step.
xtol (float, optional) – Absolute tolerance for brentq.
rtol (float, optional) – Relative tolerance for brentq.

Returns

Extremal root in the interval.

Return type

float
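The scan-then-refine idea can be sketched without SciPy as follows. This is a simplified, hypothetical version: it handles only the left-most root, uses a fixed scan step instead of an adaptive one, and refines with plain bisection in place of scipy.optimize.brentq.

```python
def leftmost_root_sketch(f, xL, xR, step=0.01, xtol=1e-12, max_bisections=200):
    """Hypothetical sketch: left-most root of f in [xL, xR] by scan plus bisection."""
    if not xR > xL:
        raise ValueError("xR must be larger than xL.")
    if step <= 0:
        raise ValueError("step must be positive.")
    a, fa = xL, f(xL)
    if fa == 0.0:
        return a
    while a < xR:
        b = min(a + step, xR)
        fb = f(b)
        if fb == 0.0:
            return b
        if fa * fb < 0.0:
            # sign change found in [a, b]: refine by bisection
            for _ in range(max_bisections):
                if b - a <= xtol:
                    break
                mid = 0.5 * (a + b)
                fmid = f(mid)
                if fmid == 0.0:
                    return mid
                if fa * fmid < 0.0:
                    b = mid
                else:
                    a, fa = mid, fmid
            return 0.5 * (a + b)
        a, fa = b, fb
    raise ValueError("No sign change found in [xL, xR].")
```

A scan step larger than the spacing between two neighbouring roots can step over both of them, which is why the step size matters for finding the extremal root.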

alea.utils.formatted_to_asterisked(formatted, wildcards: Optional[Union[str, List[str]]] = None)[source]

Convert a formatted string to an asterisked string.

When a parameter (usually a shape parameter) is not specified in the formatted string, this function replaces the parameter with an asterisk.

Parameters

formatted (str) – formatted string
wildcards (str or list, optional (default=None)) – wildcards to be replaced with asterisk.

Returns

asterisked string

Return type

str
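A regular expression is one way to perform this replacement. The sketch below is a hypothetical version, not the actual implementation; it treats {name} and {name:format_spec} as fields and is consistent with the documented examples of this function.

```python
import re


def formatted_to_asterisked_sketch(formatted, wildcards=None):
    """Hypothetical sketch: replace format fields with '*'."""
    if isinstance(wildcards, str):
        wildcards = [wildcards]

    def replace(match):
        name = match.group(1)
        # with no wildcards given, every field is replaced
        if wildcards is None or name in wildcards:
            return "*"
        return match.group(0)

    # a field looks like {name} or {name:format_spec}
    return re.sub(r"\{([^{}:!]*)(?:[:!][^{}]*)?\}", replace, formatted)
```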

Examples

>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}")
"a_*_b_*"
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}", wildcards="a")
"a_*_b_{b:d}"

alea.utils.get_analysis_space(analysis_space: list) → list[source]: Convert analysis_space to a list of tuples with evaluated values.

alea.utils.get_file_path(fname, folder_list: Optional[List[str]] = None)[source]

Find the full path to the resource file.

The following methods are tried in order:

If fname begins with ‘/’, return the absolute path.
If folder begins with ‘/’, return folder + name.
If the file can be obtained from _get_internal, return the alea internal file path.
If the file is found in a locally installed ntauxfiles, return the ntauxfiles absolute path.
If the file can be downloaded from MongoDB, download it and return the cached path.

Parameters

fname (str) – file name
folder_list (list, optional (default=None)) – list of possible base folders. Ordered by priority. The function searches for the file starting from the first folder in the list, and returns the first file found without searching the remaining folders.

Returns

full path to the resource file

Return type

str

alea.utils.get_metadata(output_filename_pattern: str) → list[source]: Get metadata from output files.

alea.utils.get_template_folder_list(likelihood_config, extra_template_path: Optional[str] = None)[source]: Get a list of template_folder from likelihood_config.

alea.utils.load_json(file_name: str)[source]: Load data from json file.

alea.utils.load_yaml(file_name: str)[source]: Load data from yaml file.

alea.utils.make_hashable(obj)[source]

Convert a container hierarchy into one that can be hashed.

See http://stackoverflow.com/questions/985294

alea.utils.search_filename_pattern(filename: str) → str[source]

Return the glob pattern for a given existing filename.

This is needed because the filename sometimes does not have “_{i_batch:d}” appended. The function distinguishes between the two cases and returns the correct pattern.

Returns: existing pattern for filename, either filename or filename w/ inserted “_*”
Return type: str

alea.utils.signal_multiplier_estimator(signal: ndarray, background: ndarray, data: ndarray, iteration=100, diagnostic=False) → float[source]

Estimate the best-fit signal multiplier using perturbation theory.

Solves the critical point of the binned Poisson likelihood function iteratively via perturbation theory, given signal and background models and observed data.

Parameters

signal (np.ndarray) – signal model
background (np.ndarray) – background model
data (np.ndarray) – data array
iteration (int, optional (default=100)) – number of iterations

Returns

best-fit signal multiplier

Return type

float
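For an expectation of mu * signal[i] + background[i] in bin i, the critical point of the binned Poisson log-likelihood satisfies sum_i data[i] * signal[i] / (mu * signal[i] + background[i]) = sum_i signal[i]. The sketch below solves this condition with Newton's method, which is swapped in here for the perturbative iteration used by the function; the name signal_multiplier_newton is hypothetical, and strictly positive background is assumed.

```python
def signal_multiplier_newton(signal, background, data, iteration=100):
    """Hypothetical sketch: maximize the binned Poisson likelihood in mu.

    The expectation in bin i is mu * signal[i] + background[i].
    """
    total_signal = sum(signal)
    bins = list(zip(signal, background, data))
    mu = 0.0
    for _ in range(iteration):
        # first derivative of the log-likelihood with respect to mu
        grad = sum(d * s / (mu * s + b) for s, b, d in bins) - total_signal
        # second derivative, non-positive for positive expectations
        hess = -sum(d * s**2 / (mu * s + b) ** 2 for s, b, d in bins)
        if hess == 0.0:
            break
        step = grad / hess
        mu -= step
        if abs(step) < 1e-12:
            break
    return mu
```

If the data equal twice the signal plus the background in every bin, the estimate converges to 2.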

alea.utils.within_limits(value, limits)[source]: Returns True if value is within limits.
