alea package

Subpackages

Submodules

alea.model module

class alea.model.MinuitWrap(f: Callable, parameters: Parameters)[source]

Bases: object

Wrapper for functions to be called by Minuit. Initialized with a function f and a Parameters instance.

func: function wrapped

s_args

parameter names of the model

Type: list

_parameters

parameters and limits of the model

Type: dict

Parameters

f (Callable) – function to be wrapped
parameters (Parameters) – parameters of the model

__init__(f: Callable, parameters: Parameters)[source]: Initialize the wrapper.

class alea.model.StatisticalModel(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]

Bases: object

Class that defines a statistical model.

The statistical model contains two parts that you must define yourself:
- a likelihood function
  ll(self, parameter_1, parameter_2… parameter_n): A function of a set of named parameters which returns a float expressing the loglikelihood for observed data given these parameters.
- a data generation function
  generate_data(self, parameter_1, parameter_2… parameter_n): A function of the same set of named parameters which returns a full data set.
Methods that you must implement:
- _ll
- _generate_data
Methods that you may implement:
- get_expectation_values
Methods that already exist here:
- ll
- store_data
- fit
- get_parameter_list
- confidence_interval

The public methods generate_data and ll, as their names suggest, depend on the private methods _generate_data and _ll, respectively.
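The split between public and private methods can be illustrated with a minimal, self-contained sketch. It does not import alea; `SketchModel` and `GaussianSketch` are hypothetical stand-ins, and the handling of defaults is an assumption made only to mirror the keyword-only behaviour described for ll and generate_data.

```python
import math
import random


class SketchModel:
    """Hypothetical stand-in for the base class; not the alea implementation."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)  # default value for every parameter
        self.data = None

    def _ll(self, **kwargs):
        raise NotImplementedError("implement _ll in a subclass")

    def _generate_data(self, **kwargs):
        raise NotImplementedError("implement _generate_data in a subclass")

    def _with_defaults(self, kwargs):
        # Reject unknown names, then fill in defaults for missing parameters.
        unknown = sorted(set(kwargs) - set(self.defaults))
        if unknown:
            raise ValueError(f"unknown parameters: {unknown}")
        return {**self.defaults, **kwargs}

    def ll(self, **kwargs):
        # Public, keyword-only entry point that delegates to the private method.
        return self._ll(**self._with_defaults(kwargs))

    def generate_data(self, **kwargs):
        return self._generate_data(**self._with_defaults(kwargs))


class GaussianSketch(SketchModel):
    def _ll(self, mu, sigma):
        norm = math.log(sigma * math.sqrt(2 * math.pi))
        return sum(-0.5 * ((x - mu) / sigma) ** 2 - norm for x in self.data)

    def _generate_data(self, mu, sigma):
        return [random.gauss(mu, sigma) for _ in range(100)]


model = GaussianSketch({"mu": 0.0, "sigma": 1.0})
model.data = model.generate_data(mu=1.0)  # sigma falls back to its default
print(model.ll(mu=1.0), model.ll(mu=5.0))
```

Only the two private methods differ between models in this sketch; the public wrappers stay in the base class.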

data: data of the model

_data: data of the model

_confidence_level: confidence level for confidence intervals

_confidence_interval_kind: kind of confidence interval to compute

parameters: parameters of the model

confidence_interval_threshold: threshold for confidence interval

is_data_set

True if data is set

Type: bool

Parameters

parameter_definition (dict or list, optional (default=None)) – definition of the parameters of the model
confidence_level (float, optional (default=0.9)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default="central")) – kind of confidence interval to compute
confidence_interval_threshold (Callable[[float], float], optional (default=None)) – threshold for confidence interval
confidence_interval_root_find (str, optional (default="brentq")) – root finding algorithm of confidence interval supported options: “brentq” and “extremal”
data (dict or list, optional (default=None)) – pre-set data of the model
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings

Raises

RuntimeError – if you try to instantiate the StatisticalModel class directly
NotImplementedError – if you do not implement the likelihood function or the data generation

__init__(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]: Initialize a statistical model.

_check_ll_and_generate_data_signature()[source]: Check that the likelihood and generate_data functions have the same signature.

_confidence_interval_checks(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, asymptotic_dof: Optional[int] = None, **kwargs) → Tuple[str, Callable[[float], float], str, Tuple[float, float]][source]

Helper function for confidence_interval that does the input checks and returns bounds.

Parameters

poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges
confidence_level (float, optional (default=None)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute
confidence_interval_root_find (str, optional (default=None)) – root finding algorithm of confidence interval supported options: “brentq” and “extremal”

Returns

confidence interval kind, confidence interval threshold, confidence interval root finding algorithm, parameter interval bounds

Return type

Tuple[str, Callable[[float], float], str, Tuple[float, float]]

_define_parameters(parameter_definition, nominal_values=None)[source]: Initialize the parameters of the model.

_generate_data(**kwargs)[source]: Generate data for the given parameters.

_ll(**kwargs) → float[source]: Likelihood function, return the loglikelihood for the given parameters.

confidence_interval(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, confidence_interval_args: Optional[dict] = None, best_fit_args: Optional[dict] = None, asymptotic_dof: Optional[int] = None, fit_strategy: Optional[dict] = None) → Tuple[float, float][source]

Uses self.fit to compute confidence intervals for a certain named parameter. If the parameter is a rate parameter, and the model has expectation values implemented, the bounds will be interpreted as bounds on the expectation value, so that the range in the fit is parameter_interval_bounds/mus. Otherwise the bound is taken as-is.

Parameters

poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) –
range in which to search for the confidence interval edges. May be specified as:
- setting the property “parameter_interval_bounds” for the parameter
- passing a list here
- passing None here, the property of the parameter is used
confidence_level (float, optional (default=None)) – confidence level for confidence intervals. If None, the default confidence level of the model is used.
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute. If None, the default kind of the model is used.
confidence_interval_root_find (str, optional (default=None)) – root finding algorithm of confidence interval supported options: “brentq” and “extremal”
confidence_interval_args (dict, optional (default=None)) – Parameters that will be fixed in the profile likelihood computation. If None, all fittable parameters will be profiled except the poi.
best_fit_args (dict, optional (default=None)) – If you require the “global” best-fit used to normalise the profile likelihood ratio to fix fewer parameters than the profile likelihood– mainly used for 1-D slices of higher-dimensional confidence volumes, where the global best-fit may not be along the profile. If None, will be set to confidence_interval_args.
asymptotic_dof (int, optional (default=None)) – Degrees of freedom for asymptotic
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings.
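As a rough illustration of the idea behind a profile-likelihood interval, the sketch below computes a central interval for the mean of a Gaussian with known width. It is not the alea implementation: `gaussian_ll`, `bisect_root` and `central_interval` are hypothetical names, the threshold assumes the asymptotic chi-square form with one degree of freedom, and plain bisection stands in for a bracketing root finder such as brentq.

```python
import math
from statistics import NormalDist, fmean


def gaussian_ll(mu, data, sigma=1.0):
    """Log-likelihood for a Gaussian with known sigma (constant terms dropped)."""
    return sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)


def bisect_root(func, low, high, tol=1e-10):
    """Plain bisection on a bracketing interval [low, high]."""
    f_low = func(low)
    if f_low * func(high) > 0:
        raise ValueError("root is not bracketed")
    while high - low > tol:
        mid = 0.5 * (low + high)
        f_mid = func(mid)
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return 0.5 * (low + high)


def central_interval(data, bounds, confidence_level=0.9, sigma=1.0):
    mu_hat = fmean(data)  # analytic best fit for this toy model
    ll_max = gaussian_ll(mu_hat, data, sigma)
    # Assumption: asymptotic threshold for one degree of freedom,
    # i.e. the chi-square quantile, which equals z**2.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    threshold = z ** 2

    def crossing(mu):
        # Likelihood-ratio test statistic minus the threshold.
        return 2 * (ll_max - gaussian_ll(mu, data, sigma)) - threshold

    # Search for one crossing on each side of the best fit.
    lower = bisect_root(crossing, bounds[0], mu_hat)
    upper = bisect_root(crossing, mu_hat, bounds[1])
    return lower, upper


data = [0.1, -0.3, 0.5, 1.2, 0.7, -0.4, 0.9, 0.2]
print(central_interval(data, bounds=(-5.0, 5.0)))
```

The search range plays the role of parameter_interval_bounds here: it must bracket both crossings, otherwise the root finder has nothing to converge to.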

property data

Simple getter for a data-set– mainly here so it can be over-ridden for special needs.

Data-sets are expected to be in the form of a list of one or more structured arrays, representing the data-sets of one or more likelihood terms.

fit(verbose: Optional[bool] = False, fit_strategy: Optional[dict] = None, **kwargs) → Tuple[dict, float][source]

Fit the model to the data by maximizing the likelihood. Return a dict containing best-fit values of each parameter, and the value of the likelihood evaluated there. While the optimization is a minimization, the likelihood returned is the maximum of the likelihood.

Parameters

verbose (bool) – if True, print the Minuit object
fit_strategy (dict) – override the default fit strategy defined in the model (model.fit_strategy). Possible settings are:
- minimizer_routine (str): the minimizer routine to use, either “migrad”, “simplex”, or “simplex_migrad” (first run simplex, then migrad).
- minuit_strategy (int): strategy for Minuit, can be 0, 1, or 2. The higher the number, the more precise the fit but also the slower.
- refit_invalid (bool): if True, refit with the simplex_migrad routine and strategy 2 if the optimization does not converge the first time.
- disable_index_fitting (bool): if True, disable the index fitting even if the model has index parameters.
- max_index_fitting_iter (int): maximum number of iterations for index fitting
Returns

best-fit values of each parameter, and the value of the likelihood evaluated there

Return type

dict, float
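A fit_strategy override might be merged with the defaults along the following lines. This is a sketch only: the keys are those listed for fit_strategy, but the default values shown are hypothetical (the real ones live in _DEFAULT_FIT_STRATEGY), and `resolve_fit_strategy` is an illustrative helper, not part of alea.

```python
ALLOWED_ROUTINES = ("migrad", "simplex", "simplex_migrad")

# Hypothetical default values for illustration; the real ones are defined in
# _DEFAULT_FIT_STRATEGY and may differ.
SKETCH_DEFAULT_FIT_STRATEGY = {
    "minimizer_routine": "migrad",
    "minuit_strategy": 1,
    "refit_invalid": True,
    "disable_index_fitting": False,
    "max_index_fitting_iter": 10,
}


def resolve_fit_strategy(override=None, defaults=SKETCH_DEFAULT_FIT_STRATEGY):
    """Return the defaults updated by override, rejecting invalid settings."""
    override = override or {}
    unknown = sorted(set(override) - set(defaults))
    if unknown:
        raise ValueError(f"unknown fit_strategy keys: {unknown}")
    strategy = {**defaults, **override}
    if strategy["minimizer_routine"] not in ALLOWED_ROUTINES:
        raise ValueError(f"unsupported routine: {strategy['minimizer_routine']}")
    if strategy["minuit_strategy"] not in (0, 1, 2):
        raise ValueError("minuit_strategy must be 0, 1, or 2")
    return strategy


# Override a single setting; everything else keeps its default.
print(resolve_fit_strategy({"minimizer_routine": "simplex_migrad"}))
```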

generate_data(**kwargs) → Union[dict, list][source]

Generate data for the given parameters. The parameters are passed as keyword arguments, positional arguments are not possible. If a parameter is not given, the default value is used.

Raises: ValueError – If the parameters are not within the fit limits
Returns: generated data
Return type: dict or list

Caution

This implementation won’t allow you to call generate_data by positional arguments.

get_expectation_values(**parameter_values)[source]

Get the expectation values of the measurement.

Parameters: parameter_values – values of the parameters

get_likelihood_term_from_name(likelihood_name: str) → int[source]

Return the index of a likelihood term if the likelihood has several names.

Parameters: likelihood_name (str) – name of the likelihood term
Returns: index of the likelihood term
Return type: int

static get_model_from_name(statistical_model: str)[source]: Get the statistical model class from a string.

get_parameter_list()[source]: Return a set of all parameters that generate_data and the likelihood accept.

ll(**kwargs) → float[source]

Likelihood function, returns the loglikelihood for the given parameters. The parameters are passed as keyword arguments, positional arguments are not possible. If a parameter is not given, the default value is used.

Keyword Arguments: kwargs – keyword arguments for the parameters
Returns: likelihood value
Return type: float

make_objective()[source]

Make a function that can be passed to Minuit.

Returns: function that can be passed to Minuit
Return type: Callable

property nominal_expectation_values

Nominal expectation values for the sources of the likelihood.

For this to work, you must implement get_expectation_values.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments: fit_guesses (dict) – A dict of parameter names and values.

store_data(file_name, data_list, data_name_list: Optional[List[str]] = None, metadata: Optional[dict] = None)[source]

Store a list of datasets (each in the form of a list of one or more structured arrays or dicts). Uses inference_interface, but is included here to allow overriding. The structure would be: [[datasets1], [datasets2], ..., [datasetsn]], where each of datasets is a list of structured arrays. If you specify data_name_list, it is used; if not, the names are read from self.get_likelihood_term_names. If that is not defined, the names will be ["0", "1", ..., "n-1"]. The metadata is optional.

Parameters

file_name (str) – name of the file to store the data in
data_list (list) – list of datasets
data_name_list (list, optional (default=None)) – list of names of the datasets. If None, it will be read from self.get_likelihood_term_names
metadata (dict, optional (default=None)) – metadata to store with the data. If None, no metadata is stored.

alea.parameters module

class alea.parameters.ConditionalParameter(name: str, conditioning_parameter_name: str, **kwargs)[source]

Bases: object

This class is used to define a parameter that depends on another parameter. It has the same attributes as the Parameter class but each of them can be a dictionary with keys being the values of the conditioning parameter and values being the corresponding values of the conditional parameter. Calling the object with the conditioning parameter value as an argument will return a corresponding Parameter object with the correct values.
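The lookup described above can be sketched in a few lines. `ConditionalSketch` is a hypothetical miniature that returns a plain dict instead of a Parameter object, and the parameter and condition names used in the example are made up for illustration.

```python
class ConditionalSketch:
    """Hypothetical miniature of a conditional parameter; not the alea class."""

    def __init__(self, name, conditioning_parameter_name, **attributes):
        self.name = name
        self.conditioning_parameter_name = conditioning_parameter_name
        self.attributes = attributes

    def __call__(self, condition_value):
        # Attributes given as dicts are looked up by the conditioning value;
        # all other attributes are shared between conditions.
        resolved = {"name": self.name}
        for key, value in self.attributes.items():
            resolved[key] = value[condition_value] if isinstance(value, dict) else value
        return resolved


efficiency = ConditionalSketch(
    "efficiency",
    "science_run",
    nominal_value={"run_a": 0.8, "run_b": 0.9},
    fittable=False,
)
print(efficiency("run_a"))
```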

name

The name of the parameter.

Type: str

conditioning_parameter_name

The name of the conditioning parameter.

Type: str

__eq__(other: object) → bool[source]: Return True if all attributes are equal.

property blueice_anchors: Any: Return the blueice_anchors of the parameter (cominal condition)

property fit_guess: Optional[float]: Return the initial guess for fitting the parameter (cominal condition)

property fit_limits: Optional[Tuple[float, float]]: Return the fit_limits of the parameter (cominal condition)

property fittable: bool: Return the fittable attribute of the parameter (cominal condition)

property needs_reinit: bool: Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]: Return the nominal value of the parameter (cominal condition)

property parameter_interval_bounds: Optional[Tuple[float, float]]: Return the parameter_interval_bounds of the parameter (cominal condition)

property ptype: Optional[str]: Return the ptype of the parameter (cominal condition)

property relative_uncertainty: Optional[bool]: Return the relative_uncertainty of the parameter (cominal condition)

property uncertainty: Any: Return the uncertainty of the parameter (cominal condition)

value_in_fit_limits(value: float) → bool[source]: Returns True if value under cominal condition is within fit_limits.

class alea.parameters.Parameter(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]

Bases: object

Represents a single parameter with its properties.

name

The name of the parameter.

Type: str

nominal_value

The nominal value of the parameter.

Type: float, optional (default=None)

fittable

Indicates if the parameter is fittable or always fixed.

Type: bool, optional (default=True)

ptype

The ptype of the parameter.

Type: str, optional (default=None)

uncertainty

The uncertainty of the parameter. If a string, it can be evaluated as a numpy or scipy function to define non-gaussian constraints.

Type: float or str, optional (default=None)

relative_uncertainty

Indicates if the uncertainty is relative to the nominal_value.

Type: bool, optional (default=None)

blueice_anchors

Anchors for blueice template morphing. Blueice will load the template for the provided values and then interpolate for any value in between.

Type: list, optional (default=None)

fit_limits

The limits for fitting the parameter.

Type: Tuple[float, float], optional (default=None)

parameter_interval_bounds

Limits for computing confidence intervals.

Type: Tuple[float, float], optional (default=None)

fit_guess

The initial guess for fitting the parameter.

Type: float, optional (default=None)

description

A description of the parameter.

Type: str, optional (default=None)

__eq__(other: object) → bool[source]: Return True if all attributes are equal.

__init__(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]: Initialise a parameter.

_check_parameter_consistency()[source]: Check if parameter is consistent.

_check_parameter_interval_bounds(value)[source]: Check if parameter_interval_bounds is within fit_limits and is not None.

property blueice_anchors: Any

Return the blueice_anchors of the parameter.

If the blueice_anchors is a string, it will be evaluated as a numpy or scipy function.

property fit_guess: Optional[float]: Return the initial guess for fitting the parameter.

property needs_reinit: bool: Return True if the parameter needs re-initialization (for ptype needs_reinit).

property nominal_value: Optional[float]: Return the nominal value of the parameter.

property parameter_interval_bounds: Optional[Tuple[float, float]]

property uncertainty: Any

Return the uncertainty of the parameter.

If the uncertainty is a string, it will be evaluated as a numpy or scipy function.

value_in_fit_limits(value: float) → bool[source]: Returns True if value is within fit_limits.

class alea.parameters.Parameters[source]

Bases: object

Represents a collection of parameters.

names

A list of parameter names.

Type: List[str]

fit_guesses

A dictionary of fit guesses.

Type: Dict[str, float]

fit_limits

A dictionary of fit limits.

Type: Dict[str, float]

fittable

A list of parameter names which are fittable.

Type: List[str]

not_fittable

A list of parameter names which are not fittable.

Type: List[str]

uncertainties

A dictionary of parameter uncertainties.

Type: Dict[str, float or Any]

with_uncertainty

A Parameters object with parameters with a not-NaN uncertainty.

Type: Parameters

nominal_values

A dictionary of parameter nominal values.

Type: Dict[str, float]

parameters

A dictionary to store the parameters, with parameter name as key.

Type: Dict[str, Parameter]

__call__(return_fittable: Optional[bool] = False, **kwargs: Optional[Dict]) → Dict[str, float][source]

Return a dictionary of parameter values, optionally filtered to return only fittable parameters.

Parameters: return_fittable (bool, optional (default=False)) – Indicates if only fittable parameters should be returned.
Keyword Arguments: kwargs (dict) – Additional keyword arguments to override parameter values.
Raises: ValueError – If a parameter name is not found.
Returns: A dictionary of parameter values.
Return type: dict
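The behaviour described for this call (start from stored values, apply keyword overrides, optionally keep only fittable parameters) could look roughly like the sketch below. `parameter_values` is a hypothetical free function operating on plain dicts, and using the nominal values as the starting point is an assumption.

```python
def parameter_values(nominal, fittable, return_fittable=False, **overrides):
    """Hypothetical sketch: nominal values updated by overrides, optionally filtered."""
    unknown = sorted(name for name in overrides if name not in nominal)
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    values = {**nominal, **overrides}
    if return_fittable:
        # Keep only the parameters that are marked as fittable.
        values = {name: value for name, value in values.items() if name in fittable}
    return values


nominal = {"mu": 1.0, "sigma": 2.0}
print(parameter_values(nominal, fittable=["mu"], return_fittable=True, mu=3.0))
```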

__eq__(other: object) → bool[source]: Return True if all parameters are equal.

__getattr__(name: str) → Parameter[source]

Retrieves a Parameter object by attribute access.

Parameters: name (str) – The name of the parameter.
Raises: AttributeError – If the attribute is not found.
Returns: The retrieved Parameter object.
Return type: Parameter

__getitem__(name: str) → Parameter[source]

Retrieves a Parameter object by dictionary access.

Parameters: name (str) – The name of the parameter.
Raises: KeyError – If the key is not found.
Returns: The retrieved Parameter object.
Return type: Parameter
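Offering both attribute and dictionary access on one container is a common Python pattern. The sketch below shows one way to do it with the error types listed above; `ParameterBag` is a hypothetical class that stores plain values rather than Parameter objects.

```python
class ParameterBag:
    """Hypothetical container with both item and attribute access."""

    def __init__(self):
        self.parameters = {}

    def add(self, name, value):
        if name in self.parameters:
            raise ValueError(f"parameter {name} already exists")
        self.parameters[name] = value

    def __getitem__(self, name):
        # A missing key raises KeyError, as for a plain dict.
        return self.parameters[name]

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["parameters"][name]
        except KeyError:
            raise AttributeError(f"no parameter named {name}") from None


bag = ParameterBag()
bag.add("mu", 1.0)
print(bag.mu, bag["mu"])
```

Reading from `self.__dict__` inside `__getattr__` avoids infinite recursion when the instance has not been fully initialised, for example while it is being copied.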

__init__()[source]: Initialise a collection of parameters.

__iter__() → Iterator[Parameter][source]

Return an iterator over the parameters.

Each iteration returns a Parameter object.

__str__() → str[source]: Return an overview table of all parameters.

add_parameter(parameter: Union[Parameter, ConditionalParameter]) → None[source]

Adds a Parameter object to the Parameters collection.

Parameters: parameter (Parameter) – The Parameter object to add.
Raises: ValueError – If the parameter name already exists.

property fit_guesses: Dict[str, float]: A dictionary of fit guesses.

property fit_limits: Dict[str, float]: A dictionary of fit limits.

property fittable: List[str]: A list of parameter names which are fittable.

classmethod from_config(config: Dict[str, dict])[source]

Creates a Parameters object from a configuration dictionary.

Parameters: config (dict) – A dictionary of parameter configurations.
Returns: The created Parameters object.
Return type: Parameters
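A configuration dictionary of this kind maps each parameter name to its settings. The sketch below builds simple objects from such a dictionary; `ParameterSketch` and `from_config_sketch` are hypothetical, only a subset of the documented Parameter attributes is modelled, and the values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ParameterSketch:
    """Hypothetical subset of the attributes documented for Parameter."""

    name: str
    nominal_value: Optional[float] = None
    fittable: bool = True
    fit_limits: Optional[Tuple[float, float]] = None
    fit_guess: Optional[float] = None


def from_config_sketch(config):
    # Each key is a parameter name; each value holds that parameter's settings.
    return {name: ParameterSketch(name=name, **settings) for name, settings in config.items()}


config = {
    "mu": {"nominal_value": 0.0, "fit_limits": (-5.0, 5.0), "fit_guess": 0.5},
    "sigma": {"nominal_value": 1.0, "fittable": False},
}
parameters = from_config_sketch(config)
print(parameters["sigma"])
```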

classmethod from_list(names: List[str])[source]

Creates a Parameters object from a list of parameter names. Everything else is set to default values.

Parameters: names (List[str]) – List of parameter names.
Returns: The created Parameters object.
Return type: Parameters

property names: List[str]: A list of parameter names.

property nominal_values: dict: A dict of nominal values for all parameters with a nominal value.

property not_fittable: List[str]: A list of parameter names which are not fittable.

set_fit_guesses(**fit_guesses)[source]

Set the fit guesses for parameters.

Keyword Arguments: fit_guesses (dict) – A dict of parameter names and values.

set_nominal_values(**nominal_values)[source]

Set the nominal values for parameters.

Keyword Arguments: nominal_values (dict) – A dict of parameter names and values.

property uncertainties: dict

A dict of uncertainties for all parameters with a not-NaN uncertainty.

Caution: this is not the same as the parameter.uncertainty property.

values_in_fit_limits(**kwargs: Dict) → bool[source]

Return True if all values are within the fit limits.

Keyword Arguments: kwargs (dict) – The parameter values to check.
Returns: True if all values are within the fit limits.
Return type: bool

property with_uncertainty: Parameters

Return parameters with a not-NaN uncertainty.

The parameters are the same objects as in the original Parameters object, not a copy. For conditional parameters, the parameters under the nominal condition are returned.

alea.runner module

class alea.runner.Runner(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]

Bases: object

Runner manipulates statistical model and toydata. It will:
- initialize the statistical model
- generate or read toy data
- save toy data if needed
- fit fittable parameters
- write the output file

One toyfile can contain multiple toydata, but all of them are from the same generate_values.

model

statistical model instance

Type: StatisticalModel

poi

parameter of interest

Type: str

hypotheses

list of hypotheses

Type: list

common_hypothesis

common hypothesis, the values are copied to each hypothesis

Type: dict

generate_values

generate values for toydata

Type: dict

nominal_values

nominal values of parameters

Type: dict

_compute_confidence_interval

whether to compute confidence interval

Type: bool

_n_mc

number of Monte Carlo simulations

Type: int

_toydata_filename

toydata filename

Type: str

_toydata_mode

toydata mode, ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’

Type: str

_metadata

metadata, if None, it is set to {}

Type: dict

_output_filename

output filename

Type: str

_result_names

list of result names

Type: list

_result_dtype

list of result dtypes

Type: list

_hypotheses_values

list of values for hypotheses

Type: list

Parameters

statistical_model (str) – statistical model class name
poi (str) – parameter of interest
hypotheses (list) – list of hypotheses
n_mc (int) – number of Monte Carlo simulations
common_hypothesis (dict, optional (default=None)) – common hypothesis, the values are copied to each hypothesis
generate_values (Dict[str, float], optional (default=None)) – generate values of toydata. If None, toydata depend on statistical model.
nominal_values (dict, optional (default=None)) – nominal values of parameters. If None, nothing will be assigned to model.
statistical_model_config (str, optional (default=None)) – statistical model configuration filename
parameter_definition (dict or list, optional (default=None)) – parameter definition
statistical_model_args (dict, optional (default={})) – arguments for statistical model
likelihood_config (dict, optional (default=None)) – likelihood configuration
compute_confidence_interval (bool, optional (default=False)) – whether to compute confidence interval
confidence_level (float, optional (default=0.9)) – confidence level
confidence_interval_kind (str, optional (default='central')) – kind of confidence interval, choice from ‘central’, ‘upper’ or ‘lower’
confidence_interval_root_find (str, optional (default='brentq')) – root finding algorithm of confidence interval supported options: “brentq” and “extremal”
fit_strategy (dict, optional (default=None)) – fit strategy dictionary. If None, the default fit strategy of the model will be used.
toydata_mode (str, optional (default='generate_and_store')) – toydata mode, choice from ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’
toydata_filename (str, optional (default='test_toydata_filename.ii.h5')) – toydata filename
only_toydata (bool, optional (default=False)) – whether to only generate toydata
output_filename (str, optional (default='test_output_filename.ii.h5')) – output filename
seed (int, optional (default=None)) – random seed for runners before generating toydata
metadata (dict, optional (default=None)) – metadata to be saved in output file

__init__(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]: Initialize statistical model, parameters list, and generate values list.

_get_hypotheses()[source]: Get generate values list from hypotheses.

Caution

When the free hypothesis is provided, it should be the first hypothesis. The free hypothesis means that all parameters are free to fit; it will not use common_hypothesis!
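The rule in this caution can be sketched as follows. The representation is an assumption: hypotheses are modelled as either the string "free" or a dict of fixed values, and on a name clash the hypothesis's own value is taken over the common one. `expand_hypotheses` is a hypothetical helper, not the private method itself.

```python
def expand_hypotheses(hypotheses, common_hypothesis=None):
    """Hypothetical sketch: return the fixed values for each hypothesis."""
    common = dict(common_hypothesis or {})
    expanded = []
    for index, hypothesis in enumerate(hypotheses):
        if hypothesis == "free":
            if index != 0:
                raise ValueError("the free hypothesis must come first")
            # Nothing is fixed and common_hypothesis is deliberately ignored.
            expanded.append({})
        else:
            # Assumption: the hypothesis's own values win over the common ones.
            expanded.append({**common, **hypothesis})
    return expanded


print(expand_hypotheses(["free", {"mu": 0.0}], common_hypothesis={"sigma": 1.0}))
```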

_get_parameter_list()[source]: Get parameter list and result list from statistical model.

property common_hypothesis: Dict[str, float]

data_generator()[source]: Generate, save or read toydata.

property generate_values: Dict[str, float]

property hypotheses: list

pre_process_poi(value, attribute_name)[source]: Pre-process of poi_expectation for some attributes of runner.

read_toydata()[source]: Read toydata from file.

run()[source]

Run toy simulation.

If only_toydata is True, only generate toydata.

static runner_arguments()[source]: Get runner arguments and annotations.

simulate()[source]: Only generate toydata.

simulate_and_fit()[source]

For each Monte Carlo:

run toy simulation with a specified toydata mode and generate values.
loop over hypotheses.
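The two nested loops above might be sketched like this. Everything here is hypothetical: `simulate_and_fit_sketch`, `toy_generate` and `toy_fit` stand in for the runner and model, and each hypothesis is represented as a dict of fixed parameter values (empty for a free fit).

```python
import random
from statistics import fmean

rng = random.Random(0)


def toy_generate():
    """Hypothetical toy generator: 50 standard-normal samples."""
    return [rng.gauss(0.0, 1.0) for _ in range(50)]


def toy_fit(data, mu=None):
    """Fit mu unless the hypothesis fixes it; report the log-likelihood."""
    best_mu = fmean(data) if mu is None else mu
    ll = sum(-0.5 * (x - best_mu) ** 2 for x in data)
    return {"mu": best_mu, "ll": ll}


def simulate_and_fit_sketch(generate, fit, hypotheses_values, n_mc):
    # One list of results per hypothesis, one entry per Monte Carlo iteration.
    results = [[] for _ in hypotheses_values]
    for _ in range(n_mc):
        data = generate()  # the same toy dataset is fitted under every hypothesis
        for index, fixed in enumerate(hypotheses_values):
            results[index].append(fit(data, **fixed))
    return results


results = simulate_and_fit_sketch(toy_generate, toy_fit, [{}, {"mu": 0.0}], n_mc=3)
print(len(results), len(results[0]))
```

Fitting every hypothesis to the same toy dataset is what makes the per-toy likelihood values comparable between hypotheses.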

Todo

Implement per-hypothesis switching on whether to compute confidence intervals

store_toydata(toydata, toydata_names)[source]

Write toydata to file.

If toydata is a list of dict, convert it to a list of list.
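The conversion mentioned here can be sketched in a few lines. `toydata_to_lists` is a hypothetical helper, and ordering the entries by toydata_names is an assumption about how the dict keys map to list positions.

```python
def toydata_to_lists(toydata, toydata_names):
    """Hypothetical sketch: turn a list of dicts into a list of lists."""
    converted = []
    for entry in toydata:
        if isinstance(entry, dict):
            # Assumption: entries are ordered by the given names.
            converted.append([entry[name] for name in toydata_names])
        else:
            converted.append(list(entry))
    return converted


toydata = [{"science": [1, 2], "ancillary": [3]}, {"science": [4], "ancillary": [5, 6]}]
print(toydata_to_lists(toydata, ["science", "ancillary"]))
```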

static update_poi(model, poi: str, generate_values: Dict[str, float], nominal_values: Dict[str, float] = {})[source]

Update the poi according to poi_expectation. First, it checks whether poi_expectation is provided; if not, it does nothing. Second, it checks whether poi is also provided; if so, it raises an error. Third, it checks whether poi ends with _rate_multiplier; if not, it raises an error. Finally, it updates poi to the correct value according to poi_expectation, using the get_expectation_values method of model under the specified nominal_values.

Parameters

poi (str) – parameter of interest
generate_values (dict) – generate values of toydata, it can contain “poi_expectation”
nominal_values (dict) – nominal values of parameters

Caution

The expectation is evaluated under nominal_values in each batch.
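The sequence of checks can be sketched as below. Several things are assumptions rather than documented behaviour: the source name is taken to be the poi without its _rate_multiplier suffix, the expectation is assumed to scale linearly with the multiplier, and `expectation_values` is a plain dict standing in for the result of get_expectation_values under the nominal values. `update_poi_sketch` is a hypothetical function that returns a new dict.

```python
SUFFIX = "_rate_multiplier"


def update_poi_sketch(expectation_values, poi, generate_values):
    """Hypothetical sketch of converting poi_expectation into a rate multiplier."""
    if "poi_expectation" not in generate_values:
        return dict(generate_values)  # nothing to do
    if poi in generate_values:
        raise ValueError("give either the poi or poi_expectation, not both")
    if not poi.endswith(SUFFIX):
        raise ValueError(f"poi must end with {SUFFIX}")
    # Assumption: the source is named like the poi without the suffix.
    source = poi[: -len(SUFFIX)]
    updated = {k: v for k, v in generate_values.items() if k != "poi_expectation"}
    # Assumption: the expectation scales linearly with the multiplier.
    updated[poi] = generate_values["poi_expectation"] / expectation_values[source]
    return updated


nominal_expectations = {"signal": 20.0}
print(update_poi_sketch(nominal_expectations, "signal_rate_multiplier", {"poi_expectation": 10.0}))
```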

write_output(results)[source]: Write output file with metadata.

alea.simulators module

class alea.simulators.BlueiceDataGenerator(ll_term)[source]

Bases: object

A class for generating data from a blueice likelihood term.

ll: The blueice likelihood term.

binned

True if the likelihood term is binned.

Type: bool

bincs

The bin centers of the likelihood term.

Type: list

direction_names

The names of the directions of the likelihood term.

Type: list

source_histograms

The histograms of the sources of the likelihood term.

Type: list

data_lengths

The number of bins of each component of the likelihood term.

Type: list

dtype

The data type of the likelihood term.

Type: list

last_kwargs

The last kwargs used to generate data.

Type: dict

mus: The expected number of events of each source of the likelihood term.

parameters

The parameters of the likelihood term.

Type: list

ll_term

A blueice likelihood term.

Type: BinnedLogLikelihood or UnbinnedLogLikelihood

__init__(ll_term)[source]: Initialize the BlueiceDataGenerator.

compute_pdfs_and_mus(filter_kwargs=True, **kwargs) → None[source]

Compute PDFs of the sources for the given parameters.

Parameters

filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.
kwargs – The parameters passed to the likelihood function.

simulate(filter_kwargs=True, n_toys=None, sample_n_toys=False, **kwargs)[source]

Simulate toys for each source.

Parameters

filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.
n_toys (int, optional (default=None)) – If not None, a fixed number n_toys of toys is generated for each source component. Defaults to None.
sample_n_toys (bool, optional (default=False)) – If True, the number of toys is sampled from a Poisson distribution with mean n_toys. Defaults to False. Only works if n_toys is not None.

Keyword Arguments

kwargs – The parameters passed to the likelihood function.

Returns

Array of simulated data for all sources in the given analysis space. The index “source” indicates the corresponding source of an entry. The dtype follows self.dtype.

Return type

numpy.array
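The interplay of n_toys and sample_n_toys can be illustrated for a single binned source. This sketch uses only the standard library and does not touch blueice: `sample_poisson` and `simulate_source` are hypothetical, and drawing the event count from a Poisson distribution with mean mu when n_toys is None is an assumption.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_source(bin_centers, bin_weights, mu, n_toys=None, sample_n_toys=False, rng=None):
    """Hypothetical sketch: draw events for one source from a histogram."""
    rng = rng or random.Random()
    if n_toys is None:
        # Assumption: without n_toys the count fluctuates around the expectation mu.
        n_events = sample_poisson(mu, rng)
    elif sample_n_toys:
        n_events = sample_poisson(n_toys, rng)
    else:
        n_events = n_toys
    # Sample bin centers with probability proportional to the bin content.
    return rng.choices(bin_centers, weights=bin_weights, k=n_events)


events = simulate_source([0.5, 1.5, 2.5], [1.0, 3.0, 1.0], mu=4.0, n_toys=5, rng=random.Random(0))
print(events)
```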

alea.submitter module

class alea.submitter.Submitter(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]

Bases: object

Submitter base class that generates the submission script from the configuration. It is initialized from the configuration file, which should contain the arguments of the __init__ method of the Submitter.

statistical_model

the name of the statistical model

Type: str

statistical_model_config

the configuration file of the statistical model

Type: str

poi

the parameter of interest

Type: str

computation_dict

the dictionary of the computation, with keys to_zip, to_vary and in_common

Type: dict

debug

whether to run in debug mode. If True, only one job will be submitted or one runner will be returned, and its script will be printed.

Type: bool

resubmit

whether to resubmit the jobs that have not finished. If True, will submit all the jobs, even if the output file exists.

Type: bool

Parameters

statistical_model (str) – the name of the statistical model
statistical_model_config (str) – the configuration file of the statistical model
poi (str) – the parameter of interest
computation_options (dict) – the configuration of the computation
computation (str, optional (default='discovery_power')) – the name of the computation, it should be a key of computation_options
outputfolder (str, optional (default=None)) – the output folder
debug (bool, optional (default=False)) – whether to run in debug mode
loglevel (str, optional (default='INFO')) – the log level

Keyword Arguments

kwargs – the arguments of __init__ method of the Submitter, containing configurations of clusters

Caution

All template sources should be in the same folder. All the output, including toydata and fitting results, should be in the same folder.

__init__(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]: Initializes the submitter.

all_runner_kwargs()[source]: Parse all the runner arguments from the submission script.

allowed_special_args: List[str] = []

already_done(i_args: dict) → bool[source]: Check if the job is already done, considering the modes of toydata and output.

static arg_to_str(value, annotation) → str[source]

Convert the argument to string for the submission script.

Parameters

value – the value of the argument, can be various type
annotation – the annotation of the argument

Returns

the string of the argument

Return type

str

Caution

Currently we only support str, int, float, bool, dict and list. The float will be rounded to 4 digits after the decimal point.
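As a rough illustration of the conversion described above, the following sketch turns values of the supported types into strings and rounds floats to 4 digits after the decimal point. The name arg_to_str_sketch and the use of json.dumps for dict and list are assumptions made for this example; the annotation argument is left out, and the real Submitter.arg_to_str may behave differently.

```python
import json


def arg_to_str_sketch(value) -> str:
    """Hypothetical stand-in for Submitter.arg_to_str (annotation handling omitted)."""
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, float):
        # floats are rounded to 4 digits after the decimal point
        return str(round(value, 4))
    if isinstance(value, (str, int)):
        return str(value)
    if isinstance(value, (dict, list)):
        # assumption: containers are serialized as JSON text
        return json.dumps(value)
    raise ValueError(f"Unsupported type: {type(value)}")
```

Any type outside the supported set raises ValueError in this sketch, which mirrors the restriction stated in the caution.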

static check_redunant_arguments(runner_args, allowed_special_args: List[str] = [])[source]

combine_n_jobs: int = 1

combined_tickets_generator()[source]

Get the combined submission script for the current configuration. self.combine_n_jobs jobs will be combined into one submission script.

Yields: (str, str) – the combined submission script and name output_filename

Note

User can add combine_n_jobs: 10 in local_configurations, slurm_configurations or htcondor_configurations to combine 10 jobs into one submission script. User will need this feature when the number of jobs pending for submission is too large.

computation_tickets_generator()[source]

Get the submission script for the current configuration. It generates the submission script for each combination of the computation options.

For Runner from to_zip, to_vary and in_common:

First, generate the combined computational options directly.
Second, update the input and output folder of the options.
Third, collect the non-fittable (settable) parameters into nominal_values.
Then, collect the fittable parameters into generate_values.
Finally, it generates the submission script for each combination.

Yields: (str, str) – the submission script and name output_filename

config_file_path: str

filename_kwargs(runner_args: dict) → dict[source]

Get the filename_kwargs from runner_args.

Parameters: runner_args (dict) – the arguments of Runner
Returns: the keyword arguments for the filename
Return type: dict

first_i_batch: int = 0

classmethod from_config(config_file_path: str, **kwargs) → Submitter[source]

Initializes the submitter from a yaml config file.

Parameters: config_file_path (str) – Path to the yaml config file.
Returns: Submitter instance.
Return type: Submitter

logging = <Logger submitter_logger (INFO)>

merged_arguments_generator()[source]: Generate the merged arguments for Runner from to_zip, to_vary and in_common.

property outputfolder: Optional[str]

static runner_kwargs_from_script(sys_argv: Optional[List[str]] = None)[source]

Parse kwargs of a Runner from a list of argument strings (script).

Parameters: sys_argv (list, optional (default=None)) – list of argument strings, in the format ['--arg1', 'value1', '--arg2', 'value2', …]. The arguments must be the same as the arguments of Runner.__init__.
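The flag/value layout of sys_argv can be pictured with the following minimal sketch, which pairs each flag with the value that follows it. The helper pair_cli_arguments is hypothetical; the real method additionally converts each string back to the type expected by Runner.__init__, which is not shown here.

```python
from typing import Dict, List


def pair_cli_arguments(argv: List[str]) -> Dict[str, str]:
    """Hypothetical helper: map ['--name', 'value', ...] to {'name': 'value', ...}."""
    if len(argv) % 2 != 0:
        raise ValueError("Expected alternating flags and values")
    kwargs = {}
    # even positions hold flags, odd positions hold the matching values
    for flag, value in zip(argv[::2], argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"Expected a flag starting with '--', got {flag!r}")
        kwargs[flag[2:]] = value
    return kwargs
```

All values stay strings in this sketch; str_to_arg is the method documented for turning them back into typed arguments.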

static script_from_runner_kwargs(annotations, kwargs) → str[source]: Generate the submission script from the runner arguments.

static str_to_arg(value: str, annotation)[source]

Convert the string to argument for the submission script.

Parameters

value – the string of the argument
annotation – the annotation of the argument

Returns

the value of the argument, can be various type

submit(*arg, **kwargs)[source]: Submit the jobs to the destinations.

template_path: str

static update_limit_threshold(runner_args, outputfolder: str)[source]

static update_n_batch(runner_args)[source]

Update n_mc if n_batch is provided.

Distribute n_mc into n_batch, so that each batch will run n_mc/n_batch times.
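The distribution of n_mc over batches can be sketched as below. The function split_n_mc is hypothetical, and the requirement that n_mc be divisible by n_batch is an assumption of this example rather than documented behaviour.

```python
def split_n_mc(runner_args: dict) -> dict:
    """Hypothetical sketch: return a copy in which n_mc is the per-batch count."""
    updated = dict(runner_args)
    n_batch = updated.get("n_batch")
    if n_batch is None:
        # nothing to do when no batching is requested
        return updated
    n_mc = updated["n_mc"]
    # assumption: n_mc is split evenly, so it must be divisible by n_batch
    if n_mc % n_batch != 0:
        raise ValueError("n_mc must be divisible by n_batch")
    updated["n_mc"] = n_mc // n_batch
    return updated
```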

static update_output_toydata(runner_args, outputfolder: str)[source]

static update_runner_args(runner_args: Dict[str, Dict[str, Any]], parameters_fittable: List[str], parameters_not_fittable: List[str])[source]

Update the runner arguments’ generate_values and nominal_values. If the argument is fittable, it will be added to generate_values, otherwise it will be added to nominal_values.

Parameters: runner_args (dict) – the arguments of Runner
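A minimal sketch of the split described above, assuming that parameter values appear as top-level keys of runner_args. The function split_parameter_values and that layout are assumptions for illustration; the real method may organize runner_args differently.

```python
from typing import Any, Dict, List


def split_parameter_values(
    runner_args: Dict[str, Any],
    parameters_fittable: List[str],
    parameters_not_fittable: List[str],
) -> Dict[str, Any]:
    """Hypothetical sketch: sort parameter values into generate_values and nominal_values."""
    known = set(parameters_fittable) | set(parameters_not_fittable)
    # keep every argument that is not a model parameter untouched
    updated = {k: v for k, v in runner_args.items() if k not in known}
    generate_values = dict(runner_args.get("generate_values", {}))
    nominal_values = dict(runner_args.get("nominal_values", {}))
    for name, value in runner_args.items():
        if name in parameters_fittable:
            generate_values[name] = value  # fittable parameters
        elif name in parameters_not_fittable:
            nominal_values[name] = value  # non-fittable (settable) parameters
    updated["generate_values"] = generate_values
    updated["nominal_values"] = nominal_values
    return updated
```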

static update_statistical_model_args(runner_args: Dict[str, Dict[str, Any]], template_path: Optional[str] = None)[source]

Update template_path in the statistical model arguments.

Parameters: runner_args (dict) – the arguments of Runner

alea.template_source module

class alea.template_source.CombinedSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Source that is a weighted sum of histograms. Useful e.g. for safeguard. The first histogram is the base histogram, and the rest are added to it with weights. The weights can be set as shape parameters in the config.

Parameters

weights – Weights of the 2nd to the last histograms.
histnames – List of filenames containing the histograms.
templatenames – List of names of histograms within the hdf5 files.
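The weighted sum can be pictured with flat lists of bin contents, as in the sketch below. The function combine_histograms is hypothetical; the real class works on histograms read from hdf5 files, and any normalisation it applies is omitted here.

```python
from typing import List, Sequence


def combine_histograms(
    histograms: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Hypothetical sketch: base histogram plus weighted additional histograms."""
    # one weight for each histogram from the 2nd to the last
    if len(weights) != len(histograms) - 1:
        raise ValueError("Need exactly one weight per additional histogram")
    base = histograms[0]
    combined = list(base)
    for weight, hist in zip(weights, histograms[1:]):
        if len(hist) != len(base):
            raise ValueError("All histograms must share the same binning")
        for i, content in enumerate(hist):
            combined[i] += weight * content
    return combined
```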

build_histogram()[source]: Build the histogram of the source.

class alea.template_source.SpectrumTemplateSource(config: Dict, *args, **kwargs)[source]

Bases: TemplateSource

Reweighted template source by 1D spectrum. The first axis of the template is assumed to be reweighted.

Parameters: spectrum_name – Name of bbf json-like spectrum file

static _get_json_spectrum(filename: str)[source]

Translates bbf-style JSON files to spectra.

Parameters: filename (str) – Name of the JSON file.

Todo

Define the format of the JSON file clearly.

build_histogram()[source]: Build the histogram of the source.

class alea.template_source.TemplateSource(config: Dict, *args, **kwargs)[source]

Bases: HistogramPdfSource

A source defined with a template histogram. The parameters are set in self.config. “templatename”, “histname”, “analysis_space” must be in self.config.

config

The configuration of the source.

Type: dict

dtype

The data type of the source.

Type: list

_bin_volumes

The bin volumes of the source.

Type: numpy.ndarray

_n_events_histogram

The histogram of the number of events of the source.

Type: multihist.MultiHistBase

events_per_day

The number of events per day of the source.

Type: float

_pdf_histogram

The histogram of the probability density function of the source.

Type: multihist.MultiHistBase

Parameters

config (dict) – The configuration of the source.
templatename – Hdf5 file to open.
histname – Histogram name.
named_parameters (list) – List of config setting names to pass to .format on histname and filename.
normalise_template (bool) – Normalise the template histogram.
in_events_per_bin (bool) – If True, histogram is in events per day / bin. If False or absent, histogram is already pdf.
histogram_scale_factor (float) – Multiply histogram by this number
convert_to_uniform (bool) – Convert the histogram to a uniform per bin distribution.
log10_bins (list) – List of axis numbers. If True, bin edges on this axis in the hdf5 file are log10() of the actual bin edges.

__init__(config: Dict, *args, **kwargs)[source]: Initialize the TemplateSource.

_check_binning(h, histogram_info: str)[source]

Check if the histogram's bin edges are the same as those of analysis_space.

Parameters

h (multihist.MultiHistBase) – The histogram to check.
histogram_info (str) – Information of the histogram.

_compute_multiple_file_hashes(templatenames: List[str], format_named_parameters: Dict) → str[source]: Compute a deterministic hash for multiple template files.

_compute_single_file_hash(templatename: str, format_named_parameters: Dict) → str[source]: Compute the hash for a single template file.

apply_slice_args(h, slice_args: Optional[Union[List[Dict], Dict]] = None)[source]

Apply slice arguments to the histogram.

Parameters

h (multihist.MultiHistBase) – The histogram to apply the slice arguments to.
slice_args (dict) – The slice arguments to apply. The sum_axis, slice_axis, and slice_axis_limits are supported.

build_histogram()[source]: Build the histogram of the source.

property format_named_parameters: Get the named parameters in the config to dictionary format.

set_dtype()[source]: Set the data type of the source.

set_pdf_histogram(h)[source]: Set the histogram of the probability density function of the source.

simulate(n_events: int)[source]

Simulate events from the source.

Parameters: n_events (int) – The number of events to simulate.
Returns: The simulated events.
Return type: numpy.ndarray

alea.utils module

exception alea.utils.CannotUpdate[source]: Bases: Exception

class alea.utils.IndexMorpher(config, shape_parameters)[source]

Bases: Morpher

IndexMorpher is a morpher which applies no interpolation.

get_anchor_points(bounds, n_models=None)[source]: Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points.

Parameters

f – Function which takes a Model as argument, and produces an extra_dims shaped array.
extra_dims – tuple of integers, shape of return value of f.
anchor_models – dictionary {z-score: Model} of anchor models at which to evaluate f.

class alea.utils.LockableSet(*args)[source]

Bases: set

A set whose update method can be locked.

basenames()[source]: The basenames of the filenames in the set.

lock()[source]: Lock the set to prevent modifications.

uniqueness()[source]: Check if the basenames contain only unique elements.

unlock()[source]: Unlock the set to allow modifications.

update(*args)[source]: Update the set with elements if it is not locked.
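The locking behaviour can be sketched with a small set subclass. LockableSetSketch is a hypothetical re-creation; in particular, silently ignoring update while locked (rather than raising) is an assumption based on the wording of the update description.

```python
class LockableSetSketch(set):
    """Hypothetical sketch of a set whose update method can be locked."""

    def __init__(self, *args):
        super().__init__(*args)
        self._locked = False

    def lock(self):
        self._locked = True

    def unlock(self):
        self._locked = False

    def update(self, *args):
        # assumption: updates are skipped silently while the set is locked
        if not self._locked:
            super().update(*args)
```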

class alea.utils.ReadOnlyDict(data)[source]

Bases: object

A read-only dict.

get(key, default=None)[source]

items()[source]

keys()[source]

values()[source]

alea.utils._get_internal(file_name)[source]

Get the abspath of the file.

Raise FileNotFoundError when not found in any subfolder

alea.utils._package_path(sub_directory)[source]: Get the abs path of the requested sub folder.

alea.utils._prefix_file_path(config: dict, template_folder_list: list, ignore_keys: List[str] = ['name', 'histname'])[source]

Prefix file path with template_folder_list whenever possible.

Parameters

config (dict) – dictionary contains file path
template_folder_list (list) – list of possible base folders. Ordered by priority.
ignore_keys (list, optional (default=["name", "histname"])) – keys to be ignored when prefixing

alea.utils.adapt_likelihood_config_for_blueice(likelihood_config: dict, template_folder_list: list) → dict[source]

Adapt likelihood config to be compatible with blueice.

Parameters

likelihood_config (dict) – likelihood config dict
template_folder_list (list) – list of possible base folders. Ordered by priority.

Returns

adapted likelihood config

Return type

dict

alea.utils.add_i_batch(filename: str) → str[source]: Add i_batch to filename.

alea.utils.asymptotic_critical_value(confidence_interval_kind: str, confidence_level: float, degree_of_freedom: Optional[int] = None)[source]

Return the critical value for the confidence interval.

Parameters

confidence_interval_kind (str) – confidence interval kind, either ‘lower’, ‘upper’ or ‘central’
confidence_level (float) – confidence level
degree_of_freedom (int, optional (default=None)) – degree of freedom

Returns

critical value

Return type

float

Raises

ValueError – if confidence_interval_kind is not ‘lower’, ‘upper’ or ‘central’
ValueError – if degree_of_freedom is not None and not 1, when confidence_interval_kind is ‘lower’ or ‘upper’
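For one degree of freedom the asymptotic critical value can be worked out with the standard library alone, since a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below assumes that the central interval uses the chi-square quantile at confidence_level and that one-sided intervals use the quantile at 2 * confidence_level - 1; both conventions, and the name critical_value_one_dof, are assumptions for illustration.

```python
from statistics import NormalDist


def critical_value_one_dof(confidence_interval_kind: str, confidence_level: float) -> float:
    """Hypothetical sketch: chi-square (1 dof) threshold via the normal quantile."""
    if confidence_interval_kind == "central":
        # P(Z**2 <= x) = confidence_level  <=>  Phi(sqrt(x)) = (1 + confidence_level) / 2
        z = NormalDist().inv_cdf(0.5 * (1.0 + confidence_level))
    elif confidence_interval_kind in ("lower", "upper"):
        # P(Z**2 <= x) = 2 * confidence_level - 1  <=>  Phi(sqrt(x)) = confidence_level
        # (meaningful for confidence_level > 0.5)
        z = NormalDist().inv_cdf(confidence_level)
    else:
        raise ValueError("confidence_interval_kind must be 'lower', 'upper' or 'central'")
    return z * z
```

With confidence_level = 0.9 this gives roughly 2.71 for the central case and roughly 1.64 for the one-sided cases.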

alea.utils.can_assign_to_typing(value_type, target_type) → bool[source]

Check if value_type can be assigned to target_type. This is useful when converting Runner’s argument into strings.

Parameters

value_type – type of the value, might be float, int, etc.
target_type – type of the target, might be Optional, Union, etc.

alea.utils.can_expand_grid(variations: dict) → bool[source]

Check if variations can be expanded into a grid.

Example

>>> can_expand_grid({'a': [1, 2], 'b': [3, 4]})
True

alea.utils.clip_limits(value) → Tuple[float, float][source]: Clip limits to be within [-MAX_FLOAT, MAX_FLOAT] by converting None to -MAX_FLOAT and MAX_FLOAT.

alea.utils.compute_file_hash(file_path: str) → str[source]: Compute the SHA-256 hash of a file.
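A SHA-256 file hash is typically computed by reading the file in chunks, as in this sketch. The name sha256_of_file, the chunk size and the hexadecimal return format are assumptions; the actual function may differ in these details.

```python
import hashlib


def sha256_of_file(file_path: str, chunk_size: int = 65536) -> str:
    """Hypothetical sketch: hex SHA-256 digest of a file, read chunk by chunk."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for large template files.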

alea.utils.compute_variations(to_zip, to_vary, in_common) → list[source]

Compute variations of Runner from to_zip, to_vary and in_common. By priority, the order is to_zip, to_vary, in_common. The values in to_zip will overwrite the keys in to_vary and in_common. The values in to_vary will overwrite the keys in in_common.

Parameters

to_zip (dict) – variations to be zipped
to_vary (dict) – variations to be varied
in_common (dict) – variations in common

Returns

a list of dict

Return type

list
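The priority rule (to_zip over to_vary over in_common) can be sketched with zip, itertools.product and dict unpacking. The function compute_variations_sketch and its treatment of empty inputs are assumptions for illustration, not the library implementation.

```python
import itertools
from typing import Any, Callable, Dict, List


def _expand(variations: Dict[str, List], iteration: Callable) -> List[Dict[str, Any]]:
    """Turn {'a': [1, 2], ...} into a list of dicts using zip or itertools.product."""
    keys = list(variations)
    return [dict(zip(keys, combo)) for combo in iteration(*variations.values())]


def compute_variations_sketch(to_zip: dict, to_vary: dict, in_common: dict) -> list:
    # assumption: an empty input contributes a single empty variation
    zipped = _expand(to_zip, zip) if to_zip else [{}]
    varied = _expand(to_vary, itertools.product) if to_vary else [{}]
    merged = []
    for z, v in itertools.product(zipped, varied):
        # later unpacking wins: to_zip overrides to_vary, which overrides in_common
        merged.append({**in_common, **v, **z})
    return merged
```

For instance, with to_zip = {'a': [1, 2], 'b': [3, 4]}, to_vary = {'c': [5, 6]} and in_common = {'a': 0, 'd': 7}, the sketch returns four dictionaries, and the value of 'a' from in_common is overridden by the zipped values.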

alea.utils.convert_to_in_common(in_common: Dict[str, Any]) → Dict[str, Any][source]

Expand the values in in_common, according to the itertools.product method, if necessary. This usually happens to the hypotheses.

Example

>>> convert_to_in_common({'hypotheses': ['free', {'a': [1, 2], 'b': [3, 4]}]})
{
    "hypotheses": [
        "free",
        {"a": 1, "b": 3},
        {"a": 1, "b": 4},
        {"a": 2, "b": 3},
        {"a": 2, "b": 4},
    ]
}

alea.utils.convert_to_vary(to_vary: Dict[str, List]) → List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the itertools.product method.

Example

>>> convert_to_vary({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]

alea.utils.convert_to_zip(to_zip: Dict[str, List]) → List[Dict[str, Any]][source]

Convert dict into a list of dict, according to the zip method.

Example

>>> convert_to_zip({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]

alea.utils.convert_variations(variations: dict, iteration) → list[source]

Convert variations to a list of dict, according to the iteration method.

Parameters

variations (dict) – variations to be converted
iteration – iteration method, either zip or itertools.product

Returns

a list of dict

Return type

list

alea.utils.deterministic_hash(thing, length=10)[source]

Return a lowercase base32 string of the given length, obtained by hashing a container hierarchy.

Edited from strax: strax/utils.py
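The idea can be sketched with the standard library as follows. The choice of SHA-1, JSON serialization with sorted keys, and the name deterministic_hash_sketch are assumptions modelled loosely on the strax helper; the real function also handles non-JSON containers through make_hashable.

```python
import hashlib
import json
from base64 import b32encode


def deterministic_hash_sketch(thing, length: int = 10) -> str:
    """Hypothetical sketch: short lowercase base32 hash of a JSON-serializable object."""
    # sort_keys makes the JSON text independent of dict insertion order
    jsonned = json.dumps(thing, sort_keys=True)
    digest = hashlib.sha1(jsonned.encode("ascii")).digest()
    # a SHA-1 digest encodes to 32 base32 characters, so length should be <= 32
    return b32encode(digest)[:length].decode("ascii").lower()
```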

alea.utils.dump_json(file_name: str, data: dict)[source]: Dump data to a json file.

alea.utils.dump_yaml(file_name: str, data: dict)[source]: Dump data to a yaml file.

alea.utils.evaluate_numpy_scipy_expression(value: str)[source]: Evaluate numpy(np) and scipy.stats expression.

alea.utils.evaluate_numpy_scipy_expression_in_dict(d: dict)[source]

Evaluate numpy(np) and scipy.stats expression in a dict.

Example

>>> evaluate_numpy_scipy_expression_in_dict({'a': 'np.arange(0, 2, 1)', 'b': [0, 1]})
{'a': [0, 1], 'b': [0, 1]}

alea.utils.expand_grid_dict(variations: List[Union[dict, str]]) → List[Union[dict, str]][source]

Expand dict into a list of dict, according to the itertools.product method, if necessary.

Parameters: variations (list) – variations to be expanded

Example

>>> expand_grid_dict(["free", {"a": 1, "b": 3}, {"a": 'np.arange(1, 3)', "b": [3, 4]}])
[
    "free",
    {"a": 1, "b": 3},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 3},
    {"a": 2, "b": 4},
]

alea.utils.extremal_root(f, xL, xR, which='left', step=0.01, step_growth=1.0, max_step=None, xtol=1e-12, rtol=np.float64(8.881784197001252e-16))[source]

Return the left-most or right-most root of f in [xL, xR].

The interval is scanned adaptively to detect a sign change, and the root is refined using scipy.optimize.brentq.

Parameters

f (Callable[[float], float]) – Scalar function.
xL (float) – Left boundary (must satisfy xR > xL).
xR (float) – Right boundary.
which (str, optional) – “left” or “right”. Default is “left”.
step (float, optional) – Initial scan step (>0).
step_growth (float, optional) – Step multiplier (>=1).
max_step (float | None, optional) – Maximum scan step.
xtol (float, optional) – Absolute tolerance for brentq.
rtol (float, optional) – Relative tolerance for brentq.

Returns

Extremal root in the interval.

Return type

float
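The scan-then-refine strategy can be illustrated with the simplified sketch below. It uses a fixed scan step and plain bisection in place of scipy.optimize.brentq, and it omits step_growth and max_step; the name extremal_root_bisect is hypothetical.

```python
def extremal_root_bisect(f, x_left, x_right, which="left", step=0.01, tol=1e-12):
    """Hypothetical sketch: left-most or right-most root via scanning plus bisection."""
    if not x_right > x_left:
        raise ValueError("x_right must be larger than x_left")
    if which not in ("left", "right"):
        raise ValueError("which must be 'left' or 'right'")
    # scan from the side on which the wanted root lies
    direction = 1.0 if which == "left" else -1.0
    start, stop = (x_left, x_right) if which == "left" else (x_right, x_left)
    a, fa = start, f(start)
    if fa == 0.0:
        return a
    while a != stop:
        b = a + direction * step
        if (b - stop) * direction > 0:
            b = stop  # do not step outside the interval
        fb = f(b)
        if fb == 0.0:
            return b
        if fa * fb < 0:
            # sign change found: refine by bisection (brentq in the real function)
            lo, hi, flo = a, b, fa
            for _ in range(200):
                if abs(hi - lo) <= tol:
                    break
                mid = 0.5 * (lo + hi)
                fmid = f(mid)
                if fmid == 0.0:
                    return mid
                if flo * fmid < 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            return 0.5 * (lo + hi)
        a, fa = b, fb
    raise ValueError("No sign change found in the interval")
```

As with any scan-based search, two roots closer together than one scan step can be missed because they produce no sign change between neighbouring scan points.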

alea.utils.formatted_to_asterisked(formatted, wildcards: Optional[Union[str, List[str]]] = None)[source]

Convert a formatted string to an asterisked string. Sometimes a parameter (usually a shape parameter) is not specified in the formatted string; this function replaces the parameter with an asterisk.

Parameters

formatted (str) – formatted string
wildcards (str or list, optional (default=None)) – wildcards to be replaced with asterisk.

Returns

asterisked string

Return type

str

Examples

>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}")
"a_*_b_*"
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}", wildcards="a")
"a_*_b_{b:d}"
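One way to obtain this behaviour is a regular-expression substitution on the format fields, sketched below. The function formatted_to_asterisked_sketch is a hypothetical re-creation that reproduces the two examples above; the library may implement it differently.

```python
import re
from typing import List, Optional, Union


def formatted_to_asterisked_sketch(
    formatted: str, wildcards: Optional[Union[str, List[str]]] = None
) -> str:
    """Hypothetical sketch: replace format fields such as {a:.2f} with '*'."""
    if wildcards is None:
        # no selection given: replace every {...} field
        return re.sub(r"\{.*?\}", "*", formatted)
    if isinstance(wildcards, str):
        wildcards = [wildcards]
    result = formatted
    for name in wildcards:
        # match {name} or {name:spec} only for the selected field names
        pattern = r"\{" + re.escape(name) + r"(:[^}]*)?\}"
        result = re.sub(pattern, "*", result)
    return result
```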

alea.utils.get_analysis_space(analysis_space: list) → list[source]: Convert analysis_space to a list of tuples with evaluated values.

alea.utils.get_file_path(fname, folder_list: Optional[List[str]] = None)[source]

Find the full path to the resource file. Try 5 methods in the following order:

fname begins with ‘/’: return the absolute path
folder begins with ‘/’: return folder + name
the file can be obtained from _get_internal: return the alea internal file path
the file can be found in locally installed ntauxfiles: return the ntauxfiles absolute path
the file can be downloaded from MongoDB: download it and return the cached path

Parameters

fname (str) – file name
folder_list (list, optional (default=None)) – list of possible base folders, ordered by priority. The function searches for the file starting from the first folder in the list and returns the first match immediately, without searching the remaining folders.

Returns

full path to the resource file

Return type

str

alea.utils.get_metadata(output_filename_pattern: str) → list[source]: Get metadata from output files.

alea.utils.get_template_folder_list(likelihood_config, extra_template_path: Optional[str] = None)[source]: Get a list of template_folder from likelihood_config.

alea.utils.load_json(file_name: str)[source]: Load data from json file.

alea.utils.load_yaml(file_name: str)[source]: Load data from yaml file.

alea.utils.make_hashable(obj)[source]

Convert a container hierarchy into one that can be hashed.

See http://stackoverflow.com/questions/985294

alea.utils.search_filename_pattern(filename: str) → str[source]

Return pattern for a given existing filename. This is needed because sometimes the filename is not appended by “_{i_batch:d}”. We need to distinguish between the two cases and return the correct pattern.

Returns: existing pattern for filename, either filename or filename w/ inserted “_*”
Return type: str

alea.utils.signal_multiplier_estimator(signal: ndarray, background: ndarray, data: ndarray, iteration=100, diagnostic=False) → float[source]

Estimate the best-fit signal multiplier using perturbation theory. The method tries to solve for the critical point of the likelihood function by perturbation theory, where the likelihood function is defined as the binned Poisson likelihood function, given signal, background models and data.

Parameters

signal (np.ndarray) – signal model
background (np.ndarray) – background model
data (np.ndarray) – data array
iteration (int, optional (default=100)) – number of iterations

Returns

best-fit signal multiplier

Return type

float
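For a binned Poisson likelihood with expectation mu * signal + background in each bin, the critical point satisfies sum_i signal_i * (data_i / (mu * signal_i + background_i) - 1) = 0. The sketch below solves this condition with plain Newton iteration, which is not the perturbative scheme used by the function above; the name signal_multiplier_newton is hypothetical, and strictly positive expectations are assumed in every bin.

```python
from typing import Sequence


def signal_multiplier_newton(
    signal: Sequence[float],
    background: Sequence[float],
    data: Sequence[float],
    iteration: int = 100,
) -> float:
    """Hypothetical sketch: Newton iteration on the Poisson log-likelihood gradient."""
    mu = 0.0  # start from the background-only hypothesis
    for _ in range(iteration):
        gradient = 0.0
        curvature = 0.0
        for s, b, d in zip(signal, background, data):
            expectation = mu * s + b  # assumed to be strictly positive
            gradient += s * (d / expectation - 1.0)
            curvature -= s * s * d / expectation**2
        if curvature == 0.0:
            break  # no information on mu (e.g. all data bins empty)
        mu -= gradient / curvature
    return mu
```

With signal = [1, 1], background = [1, 1] and data = [3, 3], the stationarity condition is met at a multiplier of 2.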

alea.utils.within_limits(value, limits)[source]: Returns True if value is within limits.
