alea package
Subpackages
- alea.examples package
- alea.models package
- Submodules
- alea.models.blueice_extended_model module
BlueiceExtendedModel
BlueiceExtendedModel.parameters
BlueiceExtendedModel.data
BlueiceExtendedModel.is_data_set
BlueiceExtendedModel._likelihood
BlueiceExtendedModel.likelihood_names
BlueiceExtendedModel.livetime_parameter_names
BlueiceExtendedModel.data_generators
BlueiceExtendedModel.__init__()
BlueiceExtendedModel._build_data_generators()
BlueiceExtendedModel._build_ll_from_config()
BlueiceExtendedModel._generate_ancillary()
BlueiceExtendedModel._generate_data()
BlueiceExtendedModel._generate_science_data()
BlueiceExtendedModel._process_blueice_config()
BlueiceExtendedModel._set_default_ptype()
BlueiceExtendedModel._set_efficiency()
BlueiceExtendedModel.all_source_names
BlueiceExtendedModel.data
BlueiceExtendedModel.from_config()
BlueiceExtendedModel.get_expectation_values()
BlueiceExtendedModel.get_source_histograms()
BlueiceExtendedModel.get_source_name_list()
BlueiceExtendedModel.likelihood_list
BlueiceExtendedModel.likelihood_parameters
BlueiceExtendedModel.store_data()
BlueiceExtendedModel.store_real_data()
CustomAncillaryLikelihood
- Module contents
- alea.submitters package
Submodules
alea.model module
- class alea.model.MinuitWrap(f: Callable, parameters: Parameters)[source]
Bases: object
Wrapper for functions to be called by Minuit.
Initialized with a function f and a Parameters instance.
- func
function wrapped
- Parameters
f (Callable) – function to be wrapped
parameters (Parameters) – parameters of the model
- __init__(f: Callable, parameters: Parameters)[source]
Initialize the wrapper.
- class alea.model.StatisticalModel(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]
Bases: object
Base class for defining a statistical model with a likelihood and data generation method.
- The statistical model contains two parts that you must define yourself:
- a likelihood function
ll(self, parameter_1, parameter_2… parameter_n): A function of a set of named parameters that returns a float expressing the log-likelihood of the observed data given these parameters.
- a data generation function
generate_data(self, parameter_1, parameter_2… parameter_n): A function of the same set of named parameters that returns a full data set.
- Methods that you must implement:
_ll
_generate_data
- Methods that you may implement:
get_expectation_values
- Methods that already exist here:
ll
store_data
fit
get_parameter_list
confidence_interval
The public methods generate_data and ll, as their names suggest, depend on the private methods _generate_data and _ll, respectively.
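The division of labour described above can be sketched without alea itself. The class below is a hypothetical stand-in (SketchGaussianModel, its defaults and n_events are illustrative names, not part of alea): the private _ll and _generate_data carry the model, and the public wrappers accept keyword arguments only and fill in defaults.

```python
import math
import random


class SketchGaussianModel:
    """Hypothetical stand-in for the pattern; not alea's StatisticalModel."""

    # Default parameter values, used when a keyword argument is omitted
    defaults = {"mu": 0.0, "sigma": 1.0}
    n_events = 200  # assumed fixed sample size for the toy data

    def __init__(self, data=None, seed=0):
        self.data = data
        self._rng = random.Random(seed)

    # The two private methods a concrete model has to supply
    def _ll(self, mu, sigma):
        norm = -math.log(sigma) - 0.5 * math.log(2 * math.pi)
        return sum(norm - 0.5 * ((x - mu) / sigma) ** 2 for x in self.data)

    def _generate_data(self, mu, sigma):
        return [self._rng.gauss(mu, sigma) for _ in range(self.n_events)]

    # Public wrappers: keyword arguments only, defaults filled in
    def ll(self, **kwargs):
        return self._ll(**{**self.defaults, **kwargs})

    def generate_data(self, **kwargs):
        return self._generate_data(**{**self.defaults, **kwargs})


model = SketchGaussianModel()
model.data = model.generate_data(mu=1.0)
```

Because both private methods take the same named parameters, the same keyword dictionary can drive data generation and likelihood evaluation.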
- data
data of the model
- _data
data of the model
- _confidence_level
confidence level for confidence intervals
- _confidence_interval_kind
kind of confidence interval to compute
- parameters
parameters of the model
- confidence_interval_threshold
threshold for confidence interval
- Parameters
parameter_definition (dict or list, optional (default=None)) – definition of the parameters of the model
confidence_level (float, optional (default=0.9)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default="central")) – kind of confidence interval to compute
confidence_interval_threshold (Callable[[float], float], optional (default=None)) – threshold for confidence interval
confidence_interval_root_find (str, optional (default="brentq")) – root-finding algorithm for the confidence interval; supported options are “brentq” and “extremal”
data (dict or list, optional (default=None)) – pre-set data of the model
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings
- Raises
RuntimeError – if you try to instantiate the StatisticalModel class directly
NotImplementedError – if you do not implement the likelihood function or the data generation
- __init__(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]
Initialize a statistical model.
- _check_ll_and_generate_data_signature()[source]
Check that the likelihood and generate_data functions have the same signature.
- _confidence_interval_checks(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, asymptotic_dof: Optional[int] = None, **kwargs) Tuple[str, Callable[[float], float], str, Tuple[float, float]][source]
Helper function for confidence_interval that performs the input checks and returns the bounds.
- Parameters
poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges
confidence_level (float, optional (default=None)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options are “brentq” and “extremal”
- Returns
confidence interval kind, confidence interval threshold, parameter interval bounds
- Return type
Tuple[str, Callable[[float], float], str, Tuple[float, float]]
- _define_parameters(parameter_definition, nominal_values=None)[source]
Initialize the parameters of the model.
- _ll(**kwargs) float[source]
Likelihood function, return the log-likelihood for the given parameters.
- confidence_interval(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, confidence_interval_args: Optional[dict] = None, best_fit_args: Optional[dict] = None, asymptotic_dof: Optional[int] = None, fit_strategy: Optional[dict] = None) Tuple[float, float][source]
Compute confidence intervals for the parameter of interest (POI).
Finds the intersection between the profile log-likelihood curve and the critical value curve to determine the confidence interval edges. If the parameter is a rate parameter and the model has expectation values implemented, the bounds will be interpreted as bounds on the expectation value, so that the range in the fit is parameter_interval_bounds/mus. Otherwise the bound is taken as-is.
- Parameters
poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges. It may be specified by setting the property “parameter_interval_bounds” of the parameter, or by passing a list here. If None is passed here, the property of the parameter is used.
confidence_level (float, optional (default=None)) – confidence level for confidence intervals. If None, the default confidence level of the model is used.
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute. If None, the default kind of the model is used.
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options are “brentq” and “extremal”
confidence_interval_args (dict, optional (default=None)) – Parameters that will be fixed in the profile likelihood computation. If None, all fittable parameters will be profiled except the poi.
best_fit_args (dict, optional (default=None)) – Parameters to fix when computing the “global” best fit used to normalise the profile likelihood ratio, for cases where this fit should fix fewer parameters than the profile likelihood. Mainly used for 1-D slices of higher-dimensional confidence volumes, where the global best fit may not lie along the profile. If None, it is set to confidence_interval_args.
asymptotic_dof (int, optional (default=None)) – Degrees of freedom for asymptotic
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings.
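As a rough illustration of finding where the likelihood-ratio curve crosses the critical value, the sketch below treats a Gaussian mean with known sigma, where no nuisance parameters need profiling. It uses plain bisection rather than the “brentq” option named above, and the asymptotic one-degree-of-freedom threshold; bisect_root and central_interval are hypothetical helper names, not alea functions.

```python
from statistics import NormalDist


def bisect_root(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi] by bisection (the bracket must straddle it)."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def central_interval(xbar, sigma, n, bounds, confidence_level=0.9):
    # Asymptotic critical value: chi-square quantile for one degree of freedom
    critical = NormalDist().inv_cdf(0.5 * (1 + confidence_level)) ** 2

    def excess(mu):
        # -2 log likelihood ratio for a Gaussian mean, minus the critical value
        return n * (xbar - mu) ** 2 / sigma**2 - critical

    # One edge on each side of the best fit, searched within the given bounds
    lower = bisect_root(excess, bounds[0], xbar)
    upper = bisect_root(excess, xbar, bounds[1])
    return lower, upper


lower, upper = central_interval(xbar=2.0, sigma=1.0, n=25, bounds=(-10.0, 10.0))
```

The search range plays the role of parameter_interval_bounds: each edge must lie between the best fit and the corresponding bound, otherwise the root is not bracketed.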
- property data
Return the dataset, overridable for special needs.
Datasets are expected to be in the form of a list of one or more structured arrays, representing the datasets of one or more likelihood terms.
- fit(verbose: Optional[bool] = False, fit_strategy: Optional[dict] = None, **kwargs) Tuple[dict, float][source]
Fit the model to the data by maximizing the likelihood.
Returns a dict of best-fit parameter values and the maximum log-likelihood value. While the optimization is a minimization internally, the likelihood returned is the maximum.
- Parameters
verbose (bool) – if True, print the Minuit object
fit_strategy (dict) – override the default fit strategy defined in the model (model.fit_strategy). Possible settings are:
minimizer_routine (str): the minimizer routine to use, either “migrad”, “simplex”, or “simplex_migrad” (first run simplex, then migrad).
minuit_strategy (int): strategy for Minuit, can be 0, 1, or 2. The higher the number, the more precise the fit but also the slower.
refit_invalid (bool): if True, refit with the simplex_migrad routine and strategy 2 if the optimization does not converge the first time.
disable_index_fitting (bool): if True, disable the index fitting even if the model has index parameters.
max_index_fitting_iter (int): maximum number of iterations for index fitting.
- Returns
best-fit values of each parameter, and the value of the likelihood evaluated there
- Return type
Tuple[dict, float]
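The remark that the optimizer minimizes internally while the returned likelihood is the maximum can be illustrated with a one-parameter toy. The sketch below uses a golden-section search purely as a dependency-free stand-in for Minuit; fit_1d and its arguments are hypothetical and do not mirror alea's implementation.

```python
import math


def fit_1d(ll, lo, hi, name="mu", tol=1e-9):
    """Maximize ll over one parameter by minimizing -ll with golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2

    def objective(x):
        # The minimizer sees the negative log-likelihood
        return -ll(**{name: x})

    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if objective(c) < objective(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    best = 0.5 * (a + b)
    # Return the best-fit dict and the log-likelihood itself, not the objective
    return {name: best}, ll(**{name: best})


def toy_ll(mu):
    # Toy log-likelihood with its maximum of 3.0 at mu = 1.5
    return 3.0 - 0.5 * (mu - 1.5) ** 2


best_fit, max_ll = fit_1d(toy_ll, -5.0, 5.0)
```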
- generate_data(**kwargs) Union[dict, list][source]
Generate data for the given parameters.
Parameters are passed as keyword arguments; positional arguments are not supported. If a parameter is not given, the default value is used.
- Raises
ValueError – If the parameters are not within the fit limits
- Returns
generated data
- Return type
Union[dict, list]
Caution
This implementation won’t allow you to call generate_data by positional arguments.
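A minimal sketch of the keyword handling described here, assuming defaults and fit limits are held in plain dictionaries (resolve_generate_values, defaults and fit_limits are illustrative names, not alea's API):

```python
def resolve_generate_values(defaults, fit_limits, **kwargs):
    """Fill in defaults for omitted parameters and reject out-of-limit values."""
    values = {**defaults, **kwargs}
    for name, value in values.items():
        low, high = fit_limits.get(name, (None, None))
        if (low is not None and value < low) or (high is not None and value > high):
            raise ValueError(f"{name}={value} is outside the fit limits ({low}, {high})")
    return values


defaults = {"mu": 0.0, "sigma": 1.0}
fit_limits = {"sigma": (0.0, None)}  # None means unbounded on that side

# mu is overridden by keyword, sigma falls back to its default
values = resolve_generate_values(defaults, fit_limits, mu=2.0)

# A value below the lower fit limit is rejected
try:
    resolve_generate_values(defaults, fit_limits, sigma=-1.0)
    rejected = False
except ValueError:
    rejected = True
```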
- get_expectation_values(**parameter_values)[source]
Get the expectation values of the measurement.
- Parameters
parameter_values – values of the parameters
- get_likelihood_term_from_name(likelihood_name: str) int[source]
Return the index of a likelihood term if the likelihood has several names.
- static get_model_from_name(statistical_model: str)[source]
Get the statistical model class from a string.
- get_parameter_list()[source]
Return a set of all parameters that generate_data and the likelihood accept.
- ll(**kwargs) float[source]
Return the log-likelihood for the given parameters.
Parameters are passed as keyword arguments; positional arguments are not supported. If a parameter is not given, the default value is used.
- Keyword Arguments
kwargs – keyword arguments for the parameters
- Returns
likelihood value
- Return type
float
- make_objective()[source]
Make a function that can be passed to Minuit.
- Returns
function that can be passed to Minuit
- Return type
Callable
- property nominal_expectation_values
Nominal expectation values for the sources of the likelihood.
For this to work, you must implement get_expectation_values.
- set_fit_guesses(**fit_guesses)[source]
Set the fit guesses for parameters.
- Keyword Arguments
fit_guesses (dict) – A dict of parameter names and values.
- store_data(file_name, data_list, data_name_list: Optional[List[str]] = None, metadata: Optional[dict] = None)[source]
Store a list of datasets to a file using inference_interface.
Each dataset is in the form of a list of one or more structured arrays or dicts. The structure is:
[[datasets1], [datasets2], ..., [datasetsn]], where each datasets entry is a list of structured arrays. If data_name_list is specified, it is used; if not, the names are read from self.get_likelihood_term_names. If that is not defined, the names are ["0", "1", ..., "n-1"]. The metadata is optional.
- Parameters
file_name (str) – name of the file to store the data in
data_list (list) – list of datasets
data_name_list (list, optional (default=None)) – list of names of the datasets. If None, it will be read from self.get_likelihood_term_names
metadata (dict, optional (default=None)) – metadata to store with the data. If None, no metadata is stored.
alea.parameters module
- class alea.parameters.ConditionalParameter(name: str, conditioning_parameter_name: str, **kwargs)[source]
Bases: object
A parameter whose properties depend on the value of another (conditioning) parameter.
Each attribute can be a dictionary mapping conditioning parameter values to the corresponding values of the conditional parameter. Calling the object with the conditioning parameter value as an argument returns a Parameter object with the correct values.
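The lookup can be pictured with the sketch below. Both classes are hypothetical stand-ins, and the call signature (a single positional conditioning value) as well as the names er_rate and wimp_mass are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParameter:
    """Stand-in for a resolved parameter; not alea's Parameter."""

    name: str
    nominal_value: Optional[float] = None
    fittable: bool = True


class SketchConditionalParameter:
    def __init__(self, name, conditioning_parameter_name, **attributes):
        self.name = name
        self.conditioning_parameter_name = conditioning_parameter_name
        self.attributes = attributes

    def __call__(self, condition_value):
        # Dict-valued attributes are looked up by the conditioning value;
        # plain values apply under every condition.
        resolved = {
            key: value[condition_value] if isinstance(value, dict) else value
            for key, value in self.attributes.items()
        }
        return SketchParameter(name=self.name, **resolved)


er_rate = SketchConditionalParameter(
    "er_rate", "wimp_mass", nominal_value={10: 1.0, 50: 2.5}, fittable=True
)
at_50 = er_rate(50)
```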
- property fit_guess: Optional[float]
Return the initial guess for fitting the parameter (nominal condition)
- property fit_limits: Optional[Tuple[float, float]]
Return the fit_limits of the parameter (nominal condition)
- property needs_reinit: bool
Return True if the parameter needs re-initialization (for ptype needs_reinit).
- property nominal_value: Optional[float]
Return the nominal value of the parameter (nominal condition)
- property parameter_interval_bounds: Optional[Tuple[float, float]]
Return the parameter_interval_bounds of the parameter (nominal condition)
- class alea.parameters.Parameter(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]
Bases: object
Represents a single parameter with its properties.
- fittable
Indicates if the parameter is fittable or always fixed.
- Type
bool, optional (default=True)
- uncertainty
The uncertainty of the parameter. If a string, it can be evaluated as a numpy or scipy function to define non-gaussian constraints.
- relative_uncertainty
Indicates if the uncertainty is relative to the nominal_value.
- Type
bool, optional (default=None)
- blueice_anchors
Anchors for blueice template morphing. Blueice will load the template for the provided values and then interpolate for any value in between.
- Type
list, optional (default=None)
- parameter_interval_bounds
Limits for computing confidence intervals.
- __init__(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]
Initialise a parameter.
- _check_parameter_interval_bounds(value)[source]
Check if parameter_interval_bounds is within fit_limits and is not None.
- property blueice_anchors: Any
Return the blueice_anchors of the parameter.
If the blueice_anchors is a string, it will be evaluated as a numpy or scipy function.
- property needs_reinit: bool
Return True if the parameter needs re-initialization (for ptype needs_reinit).
- class alea.parameters.Parameters[source]
Bases: object
Represents a collection of parameters.
- with_uncertainty
A Parameters object with parameters with a not-NaN uncertainty.
- Type
Parameters
- parameters
A dictionary to store the parameters, with parameter name as key.
- __call__(return_fittable: Optional[bool] = False, **kwargs: Optional[Dict]) Dict[str, Any][source]
Return a dictionary of parameter values, optionally filtered to fittable parameters only.
- Parameters
return_fittable (bool, optional (default=False)) – Indicates if only fittable parameters should be returned.
- Keyword Arguments
kwargs (dict) – Additional keyword arguments to override parameter values.
- Raises
ValueError – If a parameter name is not found.
- Returns
A dictionary of parameter values.
- Return type
Dict[str, Any]
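The behaviour described for this call (override values by keyword, optionally keep only fittable parameters, reject unknown names) might look like the following sketch. It works on plain dictionaries; parameter_values, nominal and fittable are hypothetical names rather than alea's internals.

```python
def parameter_values(nominal, fittable, return_fittable=False, **overrides):
    """Return parameter values with overrides applied, optionally fittable only."""
    unknown = [name for name in overrides if name not in nominal]
    if unknown:
        raise ValueError(f"parameter(s) not found: {unknown}")
    values = {**nominal, **overrides}
    if return_fittable:
        values = {name: v for name, v in values.items() if fittable[name]}
    return values


nominal = {"mu": 1.0, "sigma": 2.0}
fittable = {"mu": True, "sigma": False}

all_values = parameter_values(nominal, fittable, mu=3.0)
fit_values = parameter_values(nominal, fittable, return_fittable=True)

# An unknown parameter name is rejected
try:
    parameter_values(nominal, fittable, tau=1.0)
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```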
- __getattr__(name: str) Parameter[source]
Retrieves a Parameter object by attribute access.
- Parameters
name (str) – The name of the parameter.
- Raises
AttributeError – If the attribute is not found.
- Returns
The retrieved Parameter object.
- Return type
Parameter
- __iter__() Iterator[Parameter][source]
Return an iterator over the parameters.
Each iteration returns a Parameter object.
- add_parameter(parameter: Union[Parameter, ConditionalParameter]) None[source]
Adds a Parameter object to the Parameters collection.
- Parameters
parameter (Parameter) – The Parameter object to add.
- Raises
ValueError – If the parameter name already exists.
- classmethod from_config(config: Dict[str, dict])[source]
Creates a Parameters object from a configuration dictionary.
- Parameters
config (dict) – A dictionary of parameter configurations.
- Returns
The created Parameters object.
- Return type
- classmethod from_list(names: List[str])[source]
Creates a Parameters object from a list of parameter names.
Everything else is set to default values.
- Parameters
names (List[str]) – List of parameter names.
- Returns
The created Parameters object.
- Return type
- set_fit_guesses(**fit_guesses)[source]
Set the fit guesses for parameters.
- Keyword Arguments
fit_guesses (dict) – A dict of parameter names and values.
- set_nominal_values(**nominal_values)[source]
Set the nominal values for parameters.
- Keyword Arguments
nominal_values (dict) – A dict of parameter names and values.
- property uncertainties: dict
A dict of uncertainties for all parameters with a not-NaN uncertainty.
Caution: this is not the same as the parameter.uncertainty property.
- values_in_fit_limits(**kwargs: Dict) bool[source]
Return True if all values are within the fit limits.
- property with_uncertainty: Parameters
Return parameters with a not-NaN uncertainty.
The parameters are the same objects as in the original Parameters object, not a copy. For conditional parameters, the parameters under the nominal condition are returned.
alea.runner module
- class alea.runner.Runner(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]
Bases: object
Manages toy Monte Carlo simulation and fitting for a statistical model.
- Responsibilities:
initialize the statistical model
generate or read toy data
save toy data if needed
fit fittable parameters
write the output file
One toyfile can contain multiple toydata, but all of them share the same generate_values.
- model
statistical model instance
- Type
- Parameters
statistical_model (str) – statistical model class name
poi (str) – parameter of interest
hypotheses (list) – list of hypotheses
n_mc (int) – number of Monte Carlo iterations
common_hypothesis (dict, optional (default=None)) – common hypothesis, the values are copied to each hypothesis
generate_values (Dict[str, float], optional (default=None)) – generate values of toydata. If None, toydata depend on statistical model.
nominal_values (dict, optional (default=None)) – nominal values of parameters. If None, nothing will be assigned to model.
statistical_model_config (str, optional (default=None)) – statistical model configuration filename
parameter_definition (dict or list, optional (default=None)) – parameter definition
statistical_model_args (dict, optional (default=None)) – arguments for statistical model
likelihood_config (dict, optional (default=None)) – likelihood configuration
compute_confidence_interval (bool, optional (default=False)) – whether to compute the confidence interval
confidence_level (float, optional (default=0.9)) – confidence level
confidence_interval_kind (str, optional (default='central')) – kind of confidence interval, choice from ‘central’, ‘upper’ or ‘lower’
confidence_interval_root_find (str, optional (default='brentq')) – root-finding algorithm for the confidence interval; supported options are “brentq” and “extremal”
fit_strategy (dict, optional (default=None)) – fit strategy dictionary. If None, the default fit strategy of the model will be used.
toydata_mode (str, optional (default='generate_and_store')) – toydata mode, choice from ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’
toydata_filename (str, optional (default='test_toydata_filename.ii.h5')) – toydata filename
only_toydata (bool, optional (default=False)) – whether to only generate toydata
output_filename (str, optional (default='test_output_filename.ii.h5')) – output filename
seed (int, optional (default=None)) – random seed for runners before generating toydata
metadata (dict, optional (default=None)) – metadata to be saved in output file
- __init__(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]
Initialize statistical model, parameters list, and generate values list.
- _get_hypotheses()[source]
Get generate values list from hypotheses.
Caution
When the free hypothesis is provided, it should be the first hypothesis. The free hypothesis means that all parameters are free in the fit; it does not use common_hypothesis!
- pre_process_poi(value, attribute_name)[source]
Pre-process of poi_expectation for some attributes of runner.
- simulate_and_fit()[source]
Run toy simulations, perform fits for different hypotheses, and collect results.
For each Monte Carlo iteration, runs the toy simulation under the specified toydata mode and generate values, then fits the model to the generated toydata for each hypothesis, and collects the fit results and confidence intervals if needed.
Todo
Implement per-hypothesis switching on whether to compute confidence intervals
- store_toydata(toydata, toydata_names)[source]
Write toydata to file.
If toydata is a list of dicts, convert it to a list of lists.
- static update_poi(model, poi: str, generate_values: Dict[str, float], nominal_values: Dict[str, float] = {})[source]
Update the poi in generate_values according to poi_expectation.
Checks that poi_expectation is provided, that poi is not already set, and that poi ends with _rate_multiplier. Then updates the poi to the correct value using the get_expectation_values method of the model under the specified nominal_values.
- Parameters
Caution
The expectation is evaluated under nominal_values in each batch.
alea.simulators module
- class alea.simulators.BlueiceDataGenerator(ll_term)[source]
Bases: object
A class for generating data from a blueice likelihood term.
- ll
The blueice likelihood term.
- mus
The expected number of events of each source of the likelihood term.
- ll_term
A blueice likelihood term.
- Type
BinnedLogLikelihood or UnbinnedLogLikelihood
- compute_pdfs_and_mus(filter_kwargs=True, **kwargs) None[source]
Compute PDFs and expected event counts for all sources given the parameters.
Results are cached; recomputation is skipped if kwargs are unchanged from the previous call.
- Parameters
filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs.
kwargs – The parameters passed to the likelihood function.
- simulate(filter_kwargs=True, n_toys=None, sample_n_toys=False, **kwargs)[source]
Simulate toys for each source.
- Parameters
filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs.
n_toys (int, optional (default=None)) – If not None, a fixed number n_toys of toys is generated for each source component.
sample_n_toys (bool, optional (default=False)) – If True, the number of toys is sampled from a Poisson distribution with mean n_toys. Only works if n_toys is not None.
- Keyword Arguments
kwargs – The parameters passed to the likelihood function.
- Returns
Array of simulated data for all sources in the given analysis space. The index “source” indicates the corresponding source of an entry. The dtype follows self.dtype.
- Return type
numpy.array
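How n_toys and sample_n_toys interact can be sketched as below. This only decides how many toys each source gets; it does not draw events. The behaviour when n_toys is None (a Poisson draw around each source's expectation in mus) is an assumption, and poisson_sample and toys_per_source are hypothetical helpers. Knuth's multiplication method is used only to keep the example dependency-free.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def toys_per_source(mus, rng, n_toys=None, sample_n_toys=False):
    """Decide the number of toys per source for one simulation call."""
    counts = {}
    for source, mu in mus.items():
        if n_toys is None:
            # Assumption: fluctuate around the expected number of events
            counts[source] = poisson_sample(mu, rng)
        elif sample_n_toys:
            # Poisson-distributed around the requested n_toys
            counts[source] = poisson_sample(n_toys, rng)
        else:
            # Exactly n_toys for every source
            counts[source] = n_toys
    return counts


rng = random.Random(1)
mus = {"er": 5.0, "wimp": 2.0}
fixed = toys_per_source(mus, rng, n_toys=10)
sampled = toys_per_source(mus, rng, n_toys=10, sample_n_toys=True)
```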
alea.submitter module
- class alea.submitter.Submitter(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]
Bases: object
Submitter base class that generates the submission script from the configuration.
Initialized from a configuration file whose contents map to the arguments of the __init__ method of the Submitter.
- computation_dict
the dictionary of the computation, with keys to_zip, to_vary and in_common
- Type
- debug
whether to run in debug mode. If True, only one job will be submitted or one runner will be returned, and its script will be printed.
- Type
bool
- resubmit
whether to resubmit the jobs that have not finished. If True, will submit all the jobs, even if the output file exists.
- Type
bool
- Parameters
statistical_model (str) – the name of the statistical model
statistical_model_config (str) – the configuration file of the statistical model
poi (str) – the parameter of interest
computation_options (dict) – the configuration of the computation
computation (str, optional (default='discovery_power')) – the name of the computation, it should be a key of computation_options
outputfolder (str, optional (default=None)) – the output folder
debug (bool, optional (default=False)) – whether to run in debug mode
loglevel (str, optional (default='INFO')) – the log level
- Keyword Arguments
kwargs – the arguments of __init__ method of the Submitter, containing configurations of clusters
Caution
All template sources should be in the same folder. All output, including toydata and fitting results, should be in the same folder.
- __init__(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]
Initializes the submitter.
- already_done(i_args: dict) bool[source]
Check if the job is already done, considering the modes of toydata and output.
- static arg_to_str(value, annotation) str[source]
Convert the argument to string for the submission script.
- Parameters
value – the value of the argument, can be of various types
annotation – the annotation of the argument
- Returns
the string of the argument
- Return type
str
Caution
Currently we only support str, int, float, bool, dict and list. The float will be rounded to 4 digits after the decimal point.
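A rough sketch of such a conversion, covering the supported types and the four-digit rounding of floats. The JSON encoding of dict and list is an assumption for illustration, and arg_to_str_sketch is a hypothetical name; alea's actual string format may differ.

```python
import json


def arg_to_str_sketch(value):
    """Convert a supported argument value to a string for a command line."""
    # bool must be checked before int, since bool is a subclass of int
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, float):
        # Floats are rounded to 4 digits after the decimal point
        return str(round(value, 4))
    if isinstance(value, (str, int)):
        return str(value)
    if isinstance(value, (dict, list)):
        # Assumption: containers are serialized as JSON
        return json.dumps(value)
    raise ValueError(f"unsupported type: {type(value).__name__}")


examples = [arg_to_str_sketch(v) for v in (0.123456, 10, True, "mu", {"a": 1})]
```

The rounding means that two float values differing only beyond the fourth decimal place end up as the same string in the script.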
- combined_tickets_generator()[source]
Get the combined submission script for the current configuration.
self.combine_n_jobs jobs will be combined into one submission script.
- Yields
(str, str) – the combined submission script and name output_filename
Note
User can add combine_n_jobs: 10 in local_configurations, slurm_configurations or htcondor_configurations to combine 10 jobs into one submission script. This feature is useful when the number of jobs pending submission is too large.
- computation_tickets_generator()[source]
Generate submission scripts for each combination of the computation options.
- For each Runner argument set derived from to_zip, to_vary and in_common:
First, generate the combined computational options directly.
Second, update the input and output folder of the options.
Third, collect the non-fittable (settable) parameters into nominal_values.
Then, collect the fittable parameters into generate_values.
Finally, generate the submission script for each combination.
- Yields
(str, str) – the submission script and name output_filename
- classmethod from_config(config_file_path: str, **kwargs) Submitter[source]
Initialize the submitter from a yaml config file.
- logging = <Logger submitter_logger (INFO)>
- merged_arguments_generator()[source]
Generate the merged arguments for Runner from to_zip, to_vary and in_common.
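One plausible reading of the three keys, stated here as an assumption rather than as alea's confirmed behaviour: entries of to_zip are stepped through together, entries of to_vary are expanded as a Cartesian product, and in_common is copied into every argument set. The function name merged_arguments is hypothetical.

```python
from itertools import product


def merged_arguments(to_zip, to_vary, in_common):
    """Yield one argument dict per combination of zipped and varied options."""
    # Zipped options advance together; fall back to one empty set if none given
    zipped = [dict(zip(to_zip, values)) for values in zip(*to_zip.values())] or [{}]
    # Varied options form a Cartesian product (one empty set if none given)
    varied = [dict(zip(to_vary, values)) for values in product(*to_vary.values())]
    for zipped_set in zipped:
        for varied_set in varied:
            yield {**in_common, **zipped_set, **varied_set}


argument_sets = list(
    merged_arguments(
        to_zip={"a": [1, 2], "b": [10, 20]},
        to_vary={"c": ["x", "y"]},
        in_common={"n_mc": 5},
    )
)
```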
- static runner_kwargs_from_script(sys_argv: Optional[List[str]] = None)[source]
Parse the kwargs of a Runner from a list of script arguments.
- Parameters
sys_argv (list, optional (default=None)) – list of argument strings, with the format ['--arg1', 'value1', '--arg2', 'value2', …]. The arguments must be the same as the arguments of Runner.__init__.
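Parsing an argument list of this shape can be sketched with the standard argparse module. The helper kwargs_from_argv and the explicit argument_names list are assumptions for the example; values are left as strings here, with conversion to typed values being the job of something like str_to_arg.

```python
import argparse


def kwargs_from_argv(sys_argv, argument_names):
    """Turn ['--name', 'value', ...] into a dict of the arguments that were given."""
    parser = argparse.ArgumentParser()
    for name in argument_names:
        parser.add_argument(f"--{name}", type=str, default=None)
    namespace = parser.parse_args(sys_argv)
    # Drop arguments that were not present on the command line
    return {key: value for key, value in vars(namespace).items() if value is not None}


runner_kwargs = kwargs_from_argv(
    ["--poi", "mu", "--n_mc", "10"],
    argument_names=["poi", "n_mc", "seed"],
)
```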
- static script_from_runner_kwargs(annotations, kwargs) str[source]
Generate the submission script from the runner arguments.
- static str_to_arg(value: str, annotation)[source]
Convert the string to argument for the submission script.
- Parameters
value – the string of the argument
annotation – the annotation of the argument
- Returns
the value of the argument, can be of various types
- static update_n_batch(runner_args)[source]
Update n_mc if n_batch is provided.
Distribute n_mc into n_batch, so that each batch will run n_mc/n_batch times.
- static update_runner_args(runner_args: Dict[str, Dict[str, Any]], parameters_fittable: List[str], parameters_not_fittable: List[str])[source]
Update the runner arguments’ generate_values and nominal_values.
Fittable parameters are added to generate_values; non-fittable parameters are added to nominal_values.
- Parameters
runner_args (dict) – the arguments of Runner
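The split described above amounts to routing each value by whether its parameter is fittable. A minimal sketch on a flat dictionary (split_values and the parameter names are illustrative, not alea's code):

```python
def split_values(values, parameters_fittable, parameters_not_fittable):
    """Route fittable parameters to generate_values and the rest to nominal_values."""
    generate_values = {k: v for k, v in values.items() if k in parameters_fittable}
    nominal_values = {k: v for k, v in values.items() if k in parameters_not_fittable}
    return generate_values, nominal_values


generate_values, nominal_values = split_values(
    {"signal_rate": 2.0, "livetime": 1.5},
    parameters_fittable=["signal_rate"],
    parameters_not_fittable=["livetime"],
)
```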
alea.template_source module
- class alea.template_source.CombinedSource(config: Dict, *args, **kwargs)[source]
Bases: TemplateSource
Source that is a weighted sum of histograms.
Useful for example for safeguard. The first histogram is the base histogram and the rest are added to it with weights, which can be set as shape parameters in the config.
- Parameters
weights – Weights of the 2nd to the last histograms.
histnames – List of filenames containing the histograms.
templatenames – List of names of histograms within the hdf5 files.
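The weighted sum can be illustrated on plain lists of bin contents. This is only a sketch of the arithmetic, assuming all histograms share the same binning; combine_histograms is a hypothetical helper, and the real class works on histogram objects loaded from files.

```python
def combine_histograms(histograms, weights):
    """Add weighted histograms to the first (base) histogram, bin by bin."""
    base, *others = histograms
    if len(others) != len(weights):
        raise ValueError("need one weight for each histogram after the first")
    combined = list(base)
    for histogram, weight in zip(others, weights):
        if len(histogram) != len(base):
            raise ValueError("all histograms must share the same binning")
        combined = [c + weight * h for c, h in zip(combined, histogram)]
    return combined


# Base histogram plus one correction histogram with weight 0.5
combined = combine_histograms([[1.0, 2.0, 3.0], [1.0, 0.0, -1.0]], weights=[0.5])
```

With a weight of zero the result reduces to the base histogram, which is what makes the weights usable as shape parameters.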
- class alea.template_source.SpectrumTemplateSource(config: Dict, *args, **kwargs)[source]
Bases: TemplateSource
Template source reweighted by a 1D spectrum.
The first axis of the template is assumed to be the one being reweighted.
- Parameters
spectrum_name – Name of bbf json-like spectrum file
- class alea.template_source.TemplateSource(config: Dict, *args, **kwargs)[source]
Bases: HistogramPdfSource
A source defined with a template histogram.
The parameters are set in self.config; “templatename”, “histname”, and “analysis_space” must be present in self.config.
- _bin_volumes
The bin volumes of the source.
- Type
numpy.ndarray
- _n_events_histogram
The histogram of the number of events of the source.
- Type
multihist.MultiHistBase
- _pdf_histogram
The histogram of the probability density function of the source.
- Type
multihist.MultiHistBase
- Parameters
config (dict) – The configuration of the source.
templatename – Hdf5 file to open.
histname – Histogram name.
named_parameters (list) – List of config setting names to pass to .format on histname and filename.
normalise_template (bool) – Normalise the template histogram.
in_events_per_bin (bool) – If True, histogram is in events per day / bin. If False or absent, histogram is already pdf.
histogram_scale_factor (float) – Multiply histogram by this number
convert_to_uniform (bool) – Convert the histogram to a uniform per bin distribution.
log10_bins (list) – List of axis numbers. Bin edges on the listed axes in the hdf5 file are log10() of the actual bin edges.
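To illustrate how an events-per-bin histogram relates to a pdf and to the bin volumes, here is a one-dimensional sketch. It assumes the conversion pdf = events / (total events × bin volume); the function name events_per_bin_to_pdf is hypothetical, and the actual handling inside TemplateSource (which works on multihist objects) may differ.

```python
from typing import List, Sequence, Tuple


def events_per_bin_to_pdf(
    events_per_bin: Sequence[float], bin_edges: Sequence[float]
) -> Tuple[List[float], List[float], float]:
    """Return (pdf, bin_volumes, total_events) for a 1D histogram."""
    if len(bin_edges) != len(events_per_bin) + 1:
        raise ValueError("need exactly one more bin edge than bins")
    # In 1D the bin volume is simply the bin width.
    bin_volumes = [hi - lo for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    total_events = sum(events_per_bin)
    # Dividing by the volume turns a per-bin count into a density;
    # dividing by the total normalises the integral to one.
    pdf = [n / (total_events * v) for n, v in zip(events_per_bin, bin_volumes)]
    return pdf, bin_volumes, total_events


pdf, volumes, total = events_per_bin_to_pdf([2.0, 6.0], [0.0, 1.0, 3.0])
```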
- _check_binning(h, histogram_info: str)[source]
Check if the histogram's bin edges are the same as those of analysis_space.
- Parameters
h (multihist.MultiHistBase) – The histogram to check.
histogram_info (str) – Information of the histogram.
- _compute_multiple_file_hashes(templatenames: List[str], format_named_parameters: Dict) str[source]
Compute a deterministic hash for multiple template files.
- _compute_single_file_hash(templatename: str, format_named_parameters: Dict) str[source]
Compute the hash for a single template file.
- apply_slice_args(h, slice_args: Optional[Union[List[Dict], Dict]] = None)[source]
Apply slice arguments to the histogram.
- Parameters
h (multihist.MultiHistBase) – The histogram to apply the slice arguments to.
slice_args (dict) – The slice arguments to apply. The sum_axis, slice_axis, and slice_axis_limits are supported.
- property format_named_parameters
Get the named parameters in the config to dictionary format.
alea.utils module
- class alea.utils.IndexMorpher(config, shape_parameters)[source]
Bases: Morpher
IndexMorpher is a morpher which applies no interpolation.
- get_anchor_points(bounds, n_models=None)[source]
Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.
- make_interpolator(f, extra_dims, anchor_models)[source]
Return a function which interpolates the extra_dims-valued function f(model) between the anchor points.
- Parameters
f – Function which takes a Model as argument, and produces an extra_dims shaped array.
extra_dims – tuple of integers, shape of return value of f.
anchor_models – dictionary {z-score: Model} of anchor models at which to evaluate f.
- alea.utils._get_internal(file_name)[source]
Get the abspath of the file.
Raises FileNotFoundError if the file is not found in any subfolder.
- alea.utils._prefix_file_path(config: dict, template_folder_list: list, ignore_keys: List[str] = ['name', 'histname'])[source]
Prefix file path with template_folder_list whenever possible.
- alea.utils.adapt_likelihood_config_for_blueice(likelihood_config: dict, template_folder_list: list) dict[source]
Adapt likelihood config to be compatible with blueice.
- alea.utils.asymptotic_critical_value(confidence_interval_kind: str, confidence_level: float, degree_of_freedom: Optional[int] = None)[source]
Return the critical value for the confidence interval.
- Parameters
- Returns
critical value
- Return type
- Raises
ValueError – if confidence_interval_kind is not ‘lower’, ‘upper’ or ‘central’
ValueError – if degree_of_freedom is not None and not 1, when confidence_interval_kind is ‘lower’ or ‘upper’
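For orientation, the sketch below computes such a threshold under Wilks' theorem for one degree of freedom, using only the standard library. It is an assumption that this matches alea's convention: the function name critical_value_one_dof is hypothetical, and the one-sided case is taken to double the chi-square tail probability.

```python
from statistics import NormalDist


def critical_value_one_dof(confidence_interval_kind: str, confidence_level: float) -> float:
    """Chi-square (1 dof) threshold for a log-likelihood-ratio test statistic."""
    if confidence_interval_kind == "central":
        tail = 1.0 - confidence_level
    elif confidence_interval_kind in ("lower", "upper"):
        # Assumed convention: a one-sided limit doubles the chi-square tail.
        tail = 2.0 * (1.0 - confidence_level)
    else:
        raise ValueError("confidence_interval_kind must be 'lower', 'upper' or 'central'")
    # For one degree of freedom, a chi-square quantile is a squared
    # standard-normal quantile.
    z = NormalDist().inv_cdf(1.0 - tail / 2.0)
    return z * z


central_90 = critical_value_one_dof("central", 0.9)
upper_90 = critical_value_one_dof("upper", 0.9)
```

Under these assumptions a 90% central interval gives a threshold near 2.71 and a 90% one-sided limit gives a threshold near 1.64.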
- alea.utils.can_assign_to_typing(value_type, target_type) bool[source]
Check if value_type can be assigned to target_type.
This is useful when converting Runner’s argument into strings.
- Parameters
value_type – type of the value, might be float, int, etc.
target_type – type of the target, might be Optional, Union, etc.
- alea.utils.can_expand_grid(variations: dict) bool[source]
Check if variations can be expanded into a grid.
Example
>>> can_expand_grid({'a': [1, 2], 'b': [3, 4]})
True
- alea.utils.clip_limits(value) Tuple[float, float][source]
Clip limits to [-MAX_FLOAT, MAX_FLOAT] by replacing None with the respective bound.
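A minimal sketch of the replacement described above. It assumes that MAX_FLOAT equals sys.float_info.max and that the limits arrive as a (lower, upper) pair; the name clip_limits_sketch is hypothetical.

```python
import sys
from typing import Optional, Tuple

MAX_FLOAT = sys.float_info.max  # assumed value of the bound


def clip_limits_sketch(
    value: Optional[Tuple[Optional[float], Optional[float]]],
) -> Tuple[float, float]:
    """Replace missing limits with -MAX_FLOAT (lower) or MAX_FLOAT (upper)."""
    lower, upper = (None, None) if value is None else value
    if lower is None:
        lower = -MAX_FLOAT
    if upper is None:
        upper = MAX_FLOAT
    return lower, upper


open_below = clip_limits_sketch((None, 1.0))
```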
- alea.utils.compute_variations(to_zip, to_vary, in_common) list[source]
Compute all Runner argument combinations from to_zip, to_vary and in_common.
By priority the order is to_zip > to_vary > in_common: values in to_zip overwrite those in to_vary and in_common, and values in to_vary overwrite those in in_common.
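The priority rule can be sketched with plain dictionaries: to_zip is expanded with zip, to_vary with itertools.product, and the results are merged so that later sources overwrite earlier ones. The helper names below are hypothetical, and the ordering of the returned combinations is an assumption rather than alea's documented behaviour.

```python
import itertools
from typing import Any, Dict, List


def zip_variations(to_zip: Dict[str, List]) -> List[Dict[str, Any]]:
    """Pair the i-th entries of every list; an empty input yields one empty dict."""
    keys = list(to_zip)
    return [dict(zip(keys, values)) for values in zip(*to_zip.values())] or [{}]


def product_variations(to_vary: Dict[str, List]) -> List[Dict[str, Any]]:
    """Take every combination of the listed values."""
    keys = list(to_vary)
    return [dict(zip(keys, values)) for values in itertools.product(*to_vary.values())]


def merge_variations(to_zip, to_vary, in_common) -> List[Dict[str, Any]]:
    """Merge with priority to_zip > to_vary > in_common."""
    merged = []
    for zipped in zip_variations(to_zip):
        for varied in product_variations(to_vary):
            # Later entries in the literal overwrite earlier ones.
            merged.append({**in_common, **varied, **zipped})
    return merged


combinations = merge_variations(
    {"a": [1, 2], "b": [3, 4]}, {"c": [5, 6]}, {"a": 0, "d": 7}
)
```

Here the value of "a" from in_common is overwritten by the zipped values, while "d" is carried into every combination unchanged.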
- alea.utils.convert_to_in_common(in_common: Dict[str, Any]) Dict[str, Any][source]
Expand the values in in_common, according to the itertools.product method, if necessary.
This usually happens to the hypotheses.
Example
>>> convert_to_in_common({'hypotheses': ['free', {'a': [1, 2], 'b': [3, 4]}]})
{
    "hypotheses": [
        "free",
        {"a": 1, "b": 3},
        {"a": 1, "b": 4},
        {"a": 2, "b": 3},
        {"a": 2, "b": 4},
    ]
}
- alea.utils.convert_to_vary(to_vary: Dict[str, List]) List[Dict[str, Any]][source]
Convert dict into a list of dict, according to the itertools.product method.
Example
>>> convert_to_vary({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]
- alea.utils.convert_to_zip(to_zip: Dict[str, List]) List[Dict[str, Any]][source]
Convert dict into a list of dict, according to the zip method.
Example
>>> convert_to_zip({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]
- alea.utils.convert_variations(variations: dict, iteration) list[source]
Convert variations to a list of dict, according to the iteration method.
- alea.utils.deterministic_hash(thing, length=10)[source]
Return a base32 lowercase string of length determined from hashing a container hierarchy.
Edited from strax: strax/utils.py
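A sketch of the general idea, assuming the input is JSON-serialisable: serialise with sorted keys, hash with SHA-1, and keep the first characters of the base32 digest. The name deterministic_hash_sketch is hypothetical; the library version may handle additional container and array types that this sketch does not.

```python
import hashlib
import json
from base64 import b32encode


def deterministic_hash_sketch(thing, length: int = 10) -> str:
    """Return a lowercase base32 string derived from a nested container."""
    # sort_keys makes the serialisation independent of dict insertion order.
    serialised = json.dumps(thing, sort_keys=True)
    digest = hashlib.sha1(serialised.encode("ascii")).digest()
    return b32encode(digest)[:length].decode("ascii").lower()


hash_a = deterministic_hash_sketch({"a": 1, "b": [2, 3]})
hash_b = deterministic_hash_sketch({"b": [2, 3], "a": 1})
```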
- alea.utils.evaluate_numpy_scipy_expression(value: str)[source]
Evaluate a numpy (np) or scipy.stats expression.
- alea.utils.evaluate_numpy_scipy_expression_in_dict(d: dict)[source]
Evaluate numpy (np) and scipy.stats expressions in a dict.
Example
>>> evaluate_numpy_scipy_expression_in_dict({'a': 'np.arange(0, 2, 1)', 'b': [0, 1]})
{'a': [0, 1], 'b': [0, 1]}
- alea.utils.expand_grid_dict(variations: List[Union[dict, str]]) List[Union[dict, str]][source]
Expand dict into a list of dict, according to the itertools.product method, if necessary.
- Parameters
variations (list) – variations to be expanded
Example
>>> expand_grid_dict(["free", {"a": 1, "b": 3}, {"a": 'np.arange(1, 3)', "b": [3, 4]}])
[
    "free",
    {"a": 1, "b": 3},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 3},
    {"a": 2, "b": 4},
]
- alea.utils.extremal_root(f, xL, xR, which='left', step=0.01, step_growth=1.0, max_step=None, xtol=1e-12, rtol=np.float64(8.881784197001252e-16))[source]
Return the left-most or right-most root of f in [xL, xR].
The interval is scanned adaptively to detect a sign change, and the root is refined using scipy.optimize.brentq.
- Parameters
xL (float) – Left boundary (must satisfy xR > xL).
xR (float) – Right boundary.
which (str, optional) – “left” or “right”. Default is “left”.
step (float, optional) – Initial scan step (>0).
step_growth (float, optional) – Step multiplier (>=1).
max_step (float | None, optional) – Maximum scan step.
xtol (float, optional) – Absolute tolerance for brentq.
rtol (float, optional) – Relative tolerance for brentq.
- Returns
Extremal root in the interval.
- Return type
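Illustrative sketch of the scan-then-refine idea for the left-most root only. It is not alea's implementation: the refinement uses plain bisection in place of scipy.optimize.brentq so that the example needs only the standard library, and the name leftmost_root_sketch is hypothetical.

```python
from typing import Callable, Optional


def leftmost_root_sketch(
    f: Callable[[float], float],
    xL: float,
    xR: float,
    step: float = 0.01,
    step_growth: float = 1.0,
    max_step: Optional[float] = None,
    xtol: float = 1e-12,
) -> float:
    """Scan [xL, xR] from the left for a sign change, then bisect."""
    if not xR > xL:
        raise ValueError("xR must be larger than xL")
    if step <= 0 or step_growth < 1:
        raise ValueError("step must be > 0 and step_growth >= 1")
    a, fa = xL, f(xL)
    if fa == 0.0:
        return a
    while a < xR:
        b = min(a + step, xR)
        fb = f(b)
        if fb == 0.0:
            return b
        if fa * fb < 0.0:
            # First bracket found: refine it by bisection (capped iterations).
            for _ in range(200):
                if b - a <= xtol:
                    break
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fm == 0.0:
                    return mid
                if fa * fm < 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            return 0.5 * (a + b)
        a, fa = b, fb
        step *= step_growth
        if max_step is not None:
            step = min(step, max_step)
    raise ValueError("no sign change found in the interval")


root = leftmost_root_sketch(lambda x: (x - 1.0) * (x - 2.0), 0.0, 3.0, step=0.1)
```

A scan of this kind can miss a pair of roots that both fall inside a single step, which is why a small initial step matters more than the refinement tolerance.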
- alea.utils.formatted_to_asterisked(formatted, wildcards: Optional[Union[str, List[str]]] = None)[source]
Convert a formatted string to an asterisked string.
When a parameter (usually a shape parameter) is not specified in the formatted string, this function replaces the parameter with an asterisk.
- Parameters
- Returns
asterisked string
- Return type
Examples
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}")
"a_*_b_*"
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}", wildcards="a")
"a_*_b_{b:d}"
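One way to obtain this behaviour is a regular-expression substitution over the replacement fields, sketched below. The name formatted_to_asterisked_sketch is hypothetical, and the sketch only covers simple named fields such as {a:.2f}; it is not necessarily how alea implements the function.

```python
import re
from typing import List, Optional, Union


def formatted_to_asterisked_sketch(
    formatted: str, wildcards: Optional[Union[str, List[str]]] = None
) -> str:
    """Replace '{name:spec}' fields by '*', optionally only for given names."""
    if isinstance(wildcards, str):
        wildcards = [wildcards]

    def replace(match: "re.Match") -> str:
        name = match.group(1)
        if wildcards is None or name in wildcards:
            return "*"
        return match.group(0)  # leave other fields untouched

    # A field is '{', a name, an optional ':format_spec', then '}'.
    return re.sub(r"\{(\w+)(?::[^{}]*)?\}", replace, formatted)


all_fields = formatted_to_asterisked_sketch("a_{a:.2f}_b_{b:d}")
only_a = formatted_to_asterisked_sketch("a_{a:.2f}_b_{b:d}", wildcards="a")
```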
- alea.utils.get_analysis_space(analysis_space: list) list[source]
Convert analysis_space to a list of tuples with evaluated values.
- alea.utils.get_file_path(fname, folder_list: Optional[List[str]] = None)[source]
Find the full path to the resource file.
The following methods are tried in order:
fname begins with ‘/’: return the absolute path
folder begins with ‘/’: return folder + name
the file can be obtained from _get_internal: return the alea internal file path
the file can be found in the locally installed ntauxfiles: return the ntauxfiles absolute path
the file can be downloaded from MongoDB: download it and return the cached path
- Parameters
- Returns
full path to the resource file
- Return type
- alea.utils.get_template_folder_list(likelihood_config, extra_template_path: Optional[str] = None)[source]
Get a list of template_folder from likelihood_config.
- alea.utils.search_filename_pattern(filename: str) str[source]
Return the glob pattern for a given existing filename.
This is needed because sometimes the filename is not appended by “_{i_batch:d}”. The function distinguishes between the two cases and returns the correct pattern.
- Returns
existing pattern for filename, either filename or filename with “_*” inserted
- Return type
- alea.utils.signal_multiplier_estimator(signal: ndarray, background: ndarray, data: ndarray, iteration=100, diagnostic=False) float[source]
Estimate the best-fit signal multiplier using perturbation theory.
Solves the critical point of the binned Poisson likelihood function iteratively via perturbation theory, given signal and background models and observed data.
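For the binned Poisson likelihood with expectation mu * signal + background in each bin, the critical point in mu satisfies sum_i data_i * signal_i / (mu * signal_i + background_i) = sum_i signal_i. The sketch below solves this condition with Newton's method, which is swapped in here for the perturbative scheme used by the library; the name signal_multiplier_newton is hypothetical, and strictly positive background in every bin is assumed.

```python
from typing import Sequence


def signal_multiplier_newton(
    signal: Sequence[float],
    background: Sequence[float],
    data: Sequence[float],
    iteration: int = 100,
) -> float:
    """Newton iteration for the best-fit signal multiplier mu."""
    mu = 0.0
    total_signal = sum(signal)
    for _ in range(iteration):
        expected = [mu * s + b for s, b in zip(signal, background)]
        # Derivative of the log-likelihood with respect to mu.
        gradient = sum(d * s / e for d, s, e in zip(data, signal, expected)) - total_signal
        # Second derivative; negative whenever any bin has data and signal.
        curvature = -sum(d * s * s / (e * e) for d, s, e in zip(data, signal, expected))
        if curvature == 0.0:
            break
        step = gradient / curvature
        mu -= step
        if abs(step) < 1e-12:
            break
    return mu


# Data built as exactly 2 * signal + background, so the best fit is mu = 2.
mu_hat = signal_multiplier_newton([1.0, 2.0, 3.0], [5.0, 5.0, 5.0], [7.0, 9.0, 11.0])
```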