alea package
Subpackages
- alea.examples package
- alea.models package
- Submodules
- alea.models.blueice_extended_model module
BlueiceExtendedModel
CustomAncillaryLikelihood
- Module contents
- alea.submitters package
Submodules
alea.model module
- class alea.model.MinuitWrap(f: Callable, parameters: Parameters)[source]
Bases: object
Wrapper for functions to be called by Minuit. Initialized with a function f and a Parameters instance.
- func
function wrapped
- Parameters
f (Callable) – function to be wrapped
parameters (Parameters) – parameters of the model
- __init__(f: Callable, parameters: Parameters)[source]
Initialize the wrapper.
- class alea.model.StatisticalModel(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]
Bases: object
Class that defines a statistical model.
- The statistical model contains two parts that you must define yourself:
- a likelihood function
ll(self, parameter_1, parameter_2… parameter_n): A function of a set of named parameters that returns a float expressing the loglikelihood for observed data given these parameters.
- a data generation function
generate_data(self, parameter_1, parameter_2… parameter_n): A function of the same set of named parameters that returns a full data set.
- Methods that you must implement:
_ll
_generate_data
- Methods that you may implement:
get_expectation_values
- Methods that already exist here:
ll
store_data
fit
get_parameter_list
confidence_interval
The public methods generate_data and ll, as their names suggest, depend on the private methods _generate_data and _ll respectively.
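The delegation from public to private methods can be sketched without alea itself. The following is a minimal stand-in, not the library's implementation: SketchModel and GaussianSketch are hypothetical names, and the real StatisticalModel also handles parameter definitions, fitting and confidence intervals.

```python
import math
import random


class SketchModel:
    """Stand-in base class (hypothetical, not alea's StatisticalModel)."""

    def __init__(self, defaults):
        self.defaults = dict(defaults)
        self.data = None

    def _ll(self, **kwargs):
        raise NotImplementedError("implement _ll in a subclass")

    def _generate_data(self, **kwargs):
        raise NotImplementedError("implement _generate_data in a subclass")

    def _fill(self, kwargs):
        # Parameters that are not given fall back to their default value.
        unknown = set(kwargs) - set(self.defaults)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        return {**self.defaults, **kwargs}

    def ll(self, **kwargs):
        # Keyword-only public entry point that delegates to _ll.
        return self._ll(**self._fill(kwargs))

    def generate_data(self, **kwargs):
        # Keyword-only public entry point that delegates to _generate_data.
        return self._generate_data(**self._fill(kwargs))


class GaussianSketch(SketchModel):
    """Toy model: 100 independent draws from a normal distribution."""

    def _ll(self, mu, sigma):
        norm = math.log(sigma * math.sqrt(2 * math.pi))
        return sum(-0.5 * ((x - mu) / sigma) ** 2 - norm for x in self.data)

    def _generate_data(self, mu, sigma):
        return [random.gauss(mu, sigma) for _ in range(100)]


random.seed(0)
model = GaussianSketch({"mu": 0.0, "sigma": 1.0})
model.data = model.generate_data(mu=1.0)  # sigma falls back to its default
```

Both _ll and _generate_data take the same named parameters, which is the property that _check_ll_and_generate_data_signature is described as checking.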
- data
data of the model
- _data
data of the model
- _confidence_level
confidence level for confidence intervals
- _confidence_interval_kind
kind of confidence interval to compute
- parameters
parameters of the model
- confidence_interval_threshold
threshold for confidence interval
- Parameters
parameter_definition (dict or list, optional (default=None)) – definition of the parameters of the model
confidence_level (float, optional (default=0.9)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default="central")) – kind of confidence interval to compute
confidence_interval_threshold (Callable[[float], float], optional (default=None)) – threshold for confidence interval
confidence_interval_root_find (str, optional (default="brentq")) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
data (dict or list, optional (default=None)) – pre-set data of the model
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings
- Raises
RuntimeError – if you try to instantiate the StatisticalModel class directly
NotImplementedError – if you do not implement the likelihood function or the data generation
- __init__(parameter_definition: Optional[Union[dict, list]] = None, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: str = 'brentq', asymptotic_dof: Optional[int] = 1, data: Optional[Union[dict, list]] = None, fit_strategy: Optional[dict] = None, **kwargs)[source]
Initialize a statistical model.
- _check_ll_and_generate_data_signature()[source]
Check that the likelihood and generate_data functions have the same signature.
- _confidence_interval_checks(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, asymptotic_dof: Optional[int] = None, **kwargs) Tuple[str, Callable[[float], float], str, Tuple[float, float]][source]
Helper function for confidence_interval that performs the input checks and returns the bounds.
- Parameters
poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) – range in which to search for the confidence interval edges
confidence_level (float, optional (default=None)) – confidence level for confidence intervals
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
- Returns
confidence interval kind, confidence interval threshold, parameter interval bounds
- Return type
Tuple[str, Callable[[float], float], str, Tuple[float, float]]
- _define_parameters(parameter_definition, nominal_values=None)[source]
Initialize the parameters of the model.
- _ll(**kwargs) float[source]
Likelihood function, return the loglikelihood for the given parameters.
- confidence_interval(poi_name: str, parameter_interval_bounds: Optional[Tuple[float, float]] = None, confidence_level: Optional[float] = None, confidence_interval_kind: Optional[str] = None, confidence_interval_threshold: Optional[Callable[[float], float]] = None, confidence_interval_root_find: Optional[str] = None, confidence_interval_args: Optional[dict] = None, best_fit_args: Optional[dict] = None, asymptotic_dof: Optional[int] = None, fit_strategy: Optional[dict] = None) Tuple[float, float][source]
Uses self.fit to compute confidence intervals for a certain named parameter. If the parameter is a rate parameter, and the model has expectation values implemented, the bounds will be interpreted as bounds on the expectation value, so that the range in the fit is parameter_interval_bounds/mus. Otherwise the bound is taken as-is.
- Parameters
poi_name (str) – name of the parameter of interest
parameter_interval_bounds (Tuple[float, float], optional (default=None)) –
range in which to search for the confidence interval edges. May be specified as:
setting the property “parameter_interval_bounds” for the parameter
passing a list here
passing None here, the property of the parameter is used
confidence_level (float, optional (default=None)) – confidence level for confidence intervals. If None, the default confidence level of the model is used.
confidence_interval_kind (str, optional (default=None)) – kind of confidence interval to compute. If None, the default kind of the model is used.
confidence_interval_root_find (str, optional (default=None)) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
confidence_interval_args (dict, optional (default=None)) – Parameters that will be fixed in the profile likelihood computation. If None, all fittable parameters will be profiled except the poi.
best_fit_args (dict, optional (default=None)) – Use this if the “global” best fit used to normalise the profile likelihood ratio should fix fewer parameters than the profile likelihood. This is mainly used for 1-D slices of higher-dimensional confidence volumes, where the global best fit may not lie along the profile. If None, it is set to confidence_interval_args.
asymptotic_dof (int, optional (default=None)) – Degrees of freedom for asymptotic
fit_strategy (dict, optional (default=None)) – strategy for the fit, see _DEFAULT_FIT_STRATEGY for possible settings.
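The conversion of bounds for a rate parameter, described above as parameter_interval_bounds/mus, amounts to a division by the expectation value. A minimal sketch of that arithmetic, where the function and argument names are hypothetical and not part of alea:

```python
def bounds_on_rate_multiplier(expectation_bounds, nominal_expectation):
    """Convert bounds on an expectation value into bounds on a rate multiplier.

    Hypothetical helper: it only mirrors the division described in the text.
    """
    if nominal_expectation <= 0:
        raise ValueError("nominal expectation must be positive")
    low, high = expectation_bounds
    return (low / nominal_expectation, high / nominal_expectation)


# Bounds of 0 to 50 expected events, with 20 events expected at multiplier 1.
bounds = bounds_on_rate_multiplier((0.0, 50.0), nominal_expectation=20.0)
```

For parameters that are not rate parameters, the text states that the bounds are used as given, so no conversion would apply.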
- property data
Simple getter for a data set, mainly here so that it can be overridden for special needs.
Data-sets are expected to be in the form of a list of one or more structured arrays, representing the data-sets of one or more likelihood terms.
- fit(verbose: Optional[bool] = False, fit_strategy: Optional[dict] = None, **kwargs) Tuple[dict, float][source]
Fit the model to the data by maximizing the likelihood. Return a dict containing best-fit values of each parameter, and the value of the likelihood evaluated there. While the optimization is a minimization, the likelihood returned is the maximum of the likelihood.
- Parameters
verbose (bool) – if True, print the Minuit object
fit_strategy (dict) –
override the default fit strategy defined in the model (model.fit_strategy). Possible settings are:
minimizer_routine (str): the minimizer routine to use, either “migrad”, “simplex”, or “simplex_migrad” (first run simplex, then migrad).
minuit_strategy (int): strategy for Minuit, can be 0, 1, or 2. The higher the number, the more precise the fit but also the slower.
refit_invalid (bool): if True, refit with the simplex_migrad routine and strategy 2 if the optimization does not converge the first time.
disable_index_fitting (bool): if True, disable the index fitting even if the model has index parameters.
max_index_fitting_iter (int): maximum number of iterations for index fitting
- Returns
best-fit values of each parameter, and the value of the likelihood evaluated there
- Return type
Tuple[dict, float]
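Overriding the fit strategy for a single call can be pictured as merging a partial dict into the defaults. The sketch below uses the setting names listed above, but the default values and the resolve_fit_strategy helper are assumptions made for illustration; the real defaults live in _DEFAULT_FIT_STRATEGY.

```python
# Hypothetical defaults, chosen only for this sketch.
default_fit_strategy = {
    "minimizer_routine": "migrad",
    "minuit_strategy": 1,
    "refit_invalid": True,
    "disable_index_fitting": False,
    "max_index_fitting_iter": 10,
}


def resolve_fit_strategy(override=None):
    """Merge per-call overrides into the defaults, rejecting unknown keys."""
    override = override or {}
    unknown = set(override) - set(default_fit_strategy)
    if unknown:
        raise ValueError(f"unknown fit strategy settings: {sorted(unknown)}")
    strategy = {**default_fit_strategy, **override}
    if strategy["minimizer_routine"] not in ("migrad", "simplex", "simplex_migrad"):
        raise ValueError("unsupported minimizer_routine")
    if strategy["minuit_strategy"] not in (0, 1, 2):
        raise ValueError("minuit_strategy must be 0, 1, or 2")
    return strategy


# Only the overridden settings change; the rest keep their defaults.
strategy = resolve_fit_strategy(
    {"minimizer_routine": "simplex_migrad", "minuit_strategy": 2}
)
```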
- generate_data(**kwargs) Union[dict, list][source]
Generate data for the given parameters. The parameters are passed as keyword arguments; positional arguments are not possible. If a parameter is not given, the default value is used.
- Raises
ValueError – If the parameters are not within the fit limits
- Returns
generated data
- Return type
Union[dict, list]
Caution
This implementation does not allow you to call generate_data with positional arguments.
- get_expectation_values(**parameter_values)[source]
Get the expectation values of the measurement.
- Parameters
parameter_values – values of the parameters
- get_likelihood_term_from_name(likelihood_name: str) int[source]
Return the index of a likelihood term if the likelihood has several names.
- static get_model_from_name(statistical_model: str)[source]
Get the statistical model class from a string.
- get_parameter_list()[source]
Return a set of all parameters that generate_data and the likelihood accept.
- ll(**kwargs) float[source]
Likelihood function, returns the loglikelihood for the given parameters. The parameters are passed as keyword arguments; positional arguments are not possible. If a parameter is not given, the default value is used.
- Keyword Arguments
kwargs – keyword arguments for the parameters
- Returns
likelihood value
- Return type
float
- make_objective()[source]
Make a function that can be passed to Minuit.
- Returns
function that can be passed to Minuit
- Return type
Callable
- property nominal_expectation_values
Nominal expectation values for the sources of the likelihood.
For this to work, you must implement get_expectation_values.
- set_fit_guesses(**fit_guesses)[source]
Set the fit guesses for parameters.
- Keyword Arguments
fit_guesses (dict) – A dict of parameter names and values.
- store_data(file_name, data_list, data_name_list: Optional[List[str]] = None, metadata: Optional[dict] = None)[source]
Store a list of datasets (each in the form of a list of one or more structured arrays or dicts). Uses inference_interface, but is included here so that it can be overridden. The structure would be:
[[datasets1], [datasets2], ..., [datasetsn]], where each of datasets is a list of structured arrays. If you specify data_name_list, it is used; if not, it is read from self.get_likelihood_term_names. If that is not defined, it will be ["0", "1", ..., "n-1"]. The metadata is optional.
- Parameters
file_name (str) – name of the file to store the data in
data_list (list) – list of datasets
data_name_list (list, optional (default=None)) – list of names of the datasets. If None, it will be read from self.get_likelihood_term_names
metadata (dict, optional (default=None)) – metadata to store with the data. If None, no metadata is stored.
alea.parameters module
- class alea.parameters.ConditionalParameter(name: str, conditioning_parameter_name: str, **kwargs)[source]
Bases: object
This class is used to define a parameter that depends on another parameter. It has the same attributes as the Parameter class but each of them can be a dictionary with keys being the values of the conditioning parameter and values being the corresponding values of the conditional parameter. Calling the object with the conditioning parameter value as an argument will return a corresponding Parameter object with the correct values.
- property fit_guess: Optional[float]
Return the initial guess for fitting the parameter (nominal condition).
- property fit_limits: Optional[Tuple[float, float]]
Return the fit_limits of the parameter (nominal condition).
- property needs_reinit: bool
Return True if the parameter needs re-initialization (for ptype needs_reinit).
- property nominal_value: Optional[float]
Return the nominal value of the parameter (nominal condition).
- property parameter_interval_bounds: Optional[Tuple[float, float]]
Return the parameter_interval_bounds of the parameter (nominal condition).
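The dictionary-valued attributes of a conditional parameter can be illustrated with a small stand-in. SketchParameter and SketchConditionalParameter are hypothetical classes written for this example; they mimic only the lookup behaviour described above, not alea's validation or nominal-condition handling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchParameter:
    """Hypothetical stand-in for a plain parameter."""

    name: str
    nominal_value: Optional[float] = None
    fit_guess: Optional[float] = None


class SketchConditionalParameter:
    """Hypothetical stand-in: attributes may be dicts keyed by the condition value."""

    def __init__(self, name, conditioning_parameter_name, **kwargs):
        self.name = name
        self.conditioning_parameter_name = conditioning_parameter_name
        self._attributes = kwargs

    def __call__(self, condition_value):
        resolved = {}
        for key, value in self._attributes.items():
            # Dict-valued attributes are looked up; scalars apply to every condition.
            resolved[key] = value[condition_value] if isinstance(value, dict) else value
        return SketchParameter(name=self.name, **resolved)


efficiency = SketchConditionalParameter(
    "efficiency",
    conditioning_parameter_name="run",
    nominal_value={"run_a": 0.8, "run_b": 0.9},
    fit_guess=0.85,
)
parameter = efficiency("run_b")  # plain parameter with nominal_value 0.9
```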
- class alea.parameters.Parameter(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]
Bases: object
Represents a single parameter with its properties.
- fittable
Indicates if the parameter is fittable or always fixed.
- Type
bool, optional (default=True)
- uncertainty
The uncertainty of the parameter. If a string, it can be evaluated as a numpy or scipy function to define non-gaussian constraints.
- relative_uncertainty
Indicates if the uncertainty is relative to the nominal_value.
- Type
bool, optional (default=None)
- blueice_anchors
Anchors for blueice template morphing. Blueice will load the template for the provided values and then interpolate for any value in between.
- Type
list, optional (default=None)
- parameter_interval_bounds
Limits for computing confidence intervals.
- __init__(name: str, nominal_value: Optional[float] = None, fittable: bool = True, ptype: Optional[str] = None, uncertainty: Optional[Union[float, str]] = None, relative_uncertainty: Optional[bool] = None, blueice_anchors: Optional[Union[list, str]] = None, fit_limits: Optional[Tuple] = None, parameter_interval_bounds: Optional[Tuple[float, float]] = None, fit_guess: Optional[float] = None, description: Optional[str] = None)[source]
Initialise a parameter.
- _check_parameter_interval_bounds(value)[source]
Check if parameter_interval_bounds is within fit_limits and is not None.
- property blueice_anchors: Any
Return the blueice_anchors of the parameter.
If the blueice_anchors is a string, it will be evaluated as a numpy or scipy function.
- property needs_reinit: bool
Return True if the parameter needs re-initialization (for ptype needs_reinit).
- class alea.parameters.Parameters[source]
Bases: object
Represents a collection of parameters.
- with_uncertainty
A Parameters object containing the parameters with a non-NaN uncertainty.
- Type
Parameters
- parameters
A dictionary to store the parameters, with parameter name as key.
- __call__(return_fittable: Optional[bool] = False, **kwargs: Optional[Dict]) Dict[str, float][source]
Return a dictionary of parameter values, optionally filtered to return only fittable parameters.
- Parameters
return_fittable (bool, optional (default=False)) – Indicates if only fittable parameters should be returned.
- Keyword Arguments
kwargs (dict) – Additional keyword arguments to override parameter values.
- Raises
ValueError – If a parameter name is not found.
- Returns
A dictionary of parameter values.
- Return type
Dict[str, float]
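The behaviour described for __call__ (nominal values, optional overrides, optional filtering to fittable parameters, and a ValueError for unknown names) can be sketched with plain dictionaries. The parameter_values function and the dictionary layout are assumptions for illustration only.

```python
def parameter_values(parameters, return_fittable=False, **overrides):
    """Return {name: value} from a hypothetical parameter table.

    parameters maps each name to {"nominal_value": float, "fittable": bool}.
    """
    unknown = set(overrides) - set(parameters)
    if unknown:
        raise ValueError(f"parameter not found: {sorted(unknown)}")
    values = {}
    for name, spec in parameters.items():
        if return_fittable and not spec["fittable"]:
            continue  # skip fixed parameters when only fittable ones are requested
        values[name] = overrides.get(name, spec["nominal_value"])
    return values


table = {
    "mu": {"nominal_value": 0.0, "fittable": True},
    "sigma": {"nominal_value": 1.0, "fittable": False},
}
all_values = parameter_values(table, mu=2.0)
fittable_only = parameter_values(table, return_fittable=True)
```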
- __getattr__(name: str) Parameter[source]
Retrieves a Parameter object by attribute access.
- Parameters
name (str) – The name of the parameter.
- Raises
AttributeError – If the attribute is not found.
- Returns
The retrieved Parameter object.
- Return type
Parameter
- __iter__() Iterator[Parameter][source]
Return an iterator over the parameters.
Each iteration returns a Parameter object.
- add_parameter(parameter: Union[Parameter, ConditionalParameter]) None[source]
Adds a Parameter object to the Parameters collection.
- Parameters
parameter (Parameter) – The Parameter object to add.
- Raises
ValueError – If the parameter name already exists.
- classmethod from_config(config: Dict[str, dict])[source]
Creates a Parameters object from a configuration dictionary.
- Parameters
config (dict) – A dictionary of parameter configurations.
- Returns
The created Parameters object.
- Return type
Parameters
- classmethod from_list(names: List[str])[source]
Creates a Parameters object from a list of parameter names. Everything else is set to default values.
- Parameters
names (List[str]) – List of parameter names.
- Returns
The created Parameters object.
- Return type
Parameters
- set_fit_guesses(**fit_guesses)[source]
Set the fit guesses for parameters.
- Keyword Arguments
fit_guesses (dict) – A dict of parameter names and values.
- set_nominal_values(**nominal_values)[source]
Set the nominal values for parameters.
- Keyword Arguments
nominal_values (dict) – A dict of parameter names and values.
- property uncertainties: dict
A dict of uncertainties for all parameters with a not-NaN uncertainty.
Caution: this is not the same as the parameter.uncertainty property.
- values_in_fit_limits(**kwargs: Dict) bool[source]
Return True if all values are within the fit limits.
- property with_uncertainty: Parameters
Return parameters with a not-NaN uncertainty.
The parameters are the same objects as in the original Parameters object, not a copy. For conditional parameters, the parameters under the nominal condition are returned.
alea.runner module
- class alea.runner.Runner(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]
Bases: object
Runner manipulates the statistical model and toydata:
initialize the statistical model
generate or read toy data
save toy data if needed
fit fittable parameters
write the output file
One toyfile can contain multiple toydata, but all of them are from the same generate_values.
- model
statistical model instance
- Type
- Parameters
statistical_model (str) – statistical model class name
poi (str) – parameter of interest
hypotheses (list) – list of hypotheses
n_mc (int) – number of Monte Carlo simulations
common_hypothesis (dict, optional (default=None)) – common hypothesis, the values are copied to each hypothesis
generate_values (Dict[str, float], optional (default=None)) – generate values of toydata. If None, toydata depend on statistical model.
nominal_values (dict, optional (default=None)) – nominal values of parameters. If None, nothing will be assigned to model.
statistical_model_config (str, optional (default=None)) – statistical model configuration filename
parameter_definition (dict or list, optional (default=None)) – parameter definition
statistical_model_args (dict, optional (default={})) – arguments for statistical model
likelihood_config (dict, optional (default=None)) – likelihood configuration
compute_confidence_interval (bool, optional (default=False)) – whether to compute the confidence interval
confidence_level (float, optional (default=0.9)) – confidence level
confidence_interval_kind (str, optional (default='central')) – kind of confidence interval, one of ‘central’, ‘upper’ or ‘lower’
confidence_interval_root_find (str, optional (default='brentq')) – root-finding algorithm for the confidence interval; supported options: “brentq” and “extremal”
fit_strategy (dict, optional (default=None)) – fit strategy dictionary. If None, the default fit strategy of the model will be used.
toydata_mode (str, optional (default='generate_and_store')) – toydata mode, one of ‘read’, ‘generate’, ‘generate_and_store’, ‘no_toydata’
toydata_filename (str, optional (default='test_toydata_filename.ii.h5')) – toydata filename
only_toydata (bool, optional (default=False)) – whether to only generate toydata
output_filename (str, optional (default='test_output_filename.ii.h5')) – output filename
seed (int, optional (default=None)) – random seed for runners before generating toydata
metadata (dict, optional (default=None)) – metadata to be saved in output file
- __init__(statistical_model: str = 'alea.examples.gaussian_model.GaussianModel', poi: str = 'mu', hypotheses: list = ['free'], n_mc: int = 1, common_hypothesis: Optional[dict] = None, generate_values: Optional[Dict[str, float]] = None, nominal_values: Optional[dict] = None, statistical_model_config: Optional[str] = None, parameter_definition: Optional[Union[dict, list]] = None, statistical_model_args: Optional[dict] = None, likelihood_config: Optional[dict] = None, compute_confidence_interval: bool = False, confidence_level: float = 0.9, confidence_interval_kind: str = 'central', confidence_interval_root_find: str = 'brentq', fit_strategy: Optional[dict] = None, toydata_mode: str = 'generate_and_store', toydata_filename: str = 'test_toydata_filename.ii.h5', only_toydata: bool = False, output_filename: str = 'test_output_filename.ii.h5', seed: Optional[int] = None, metadata: Optional[dict] = None)[source]
Initialize statistical model, parameters list, and generate values list.
- _get_hypotheses()[source]
Get generate values list from hypotheses.
Caution
When free hypothesis is provided, it should be the first hypothesis. Free hypothesis means that all parameters are free to fit, it will not use common_hypothesis!
- pre_process_poi(value, attribute_name)[source]
Pre-process poi_expectation for some attributes of the runner.
- simulate_and_fit()[source]
- For each Monte Carlo:
run the toy simulation with the specified toydata mode and generate values.
loop over hypotheses.
Todo
Implement per-hypothesis switching on whether to compute confidence intervals
- store_toydata(toydata, toydata_names)[source]
Write toydata to file.
If toydata is a list of dict, convert it to a list of list.
- static update_poi(model, poi: str, generate_values: Dict[str, float], nominal_values: Dict[str, float] = {})[source]
Update the poi according to poi_expectation. First, check whether poi_expectation is provided; if not, do nothing. Second, check whether poi is provided; if so, raise an error. Third, check whether poi ends with _rate_multiplier; if not, raise an error. Finally, update poi to the correct value according to poi_expectation, using the get_expectation_values method of model under the specified nominal_values.
- Parameters
Caution
The expectation is evaluated under nominal_values in each batch.
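The four checks listed for update_poi can be sketched as a standalone function. Everything here is an assumption for illustration: update_poi_sketch is a hypothetical name, the nominal expectation values are passed in as a plain dict keyed by source name instead of being obtained from the model, and the final division is one plausible reading of "update poi according to poi_expectation". Whether the real method removes poi_expectation afterwards is not stated in the text; this sketch keeps it.

```python
def update_poi_sketch(poi, generate_values, nominal_expectations):
    """Return a copy of generate_values with poi set from poi_expectation."""
    generate_values = dict(generate_values)
    # 1. Nothing to do if no expectation was requested.
    if "poi_expectation" not in generate_values:
        return generate_values
    # 2. Giving both the poi and its expectation is ambiguous.
    if poi in generate_values:
        raise ValueError("provide either poi or poi_expectation, not both")
    # 3. Only rate multipliers can be derived from an expectation value.
    suffix = "_rate_multiplier"
    if not poi.endswith(suffix):
        raise ValueError(f"poi must end with {suffix}")
    # 4. Assumed conversion: multiplier = requested / nominal expectation.
    source = poi[: -len(suffix)]
    generate_values[poi] = (
        generate_values["poi_expectation"] / nominal_expectations[source]
    )
    return generate_values


updated = update_poi_sketch(
    "wimp_rate_multiplier",
    {"poi_expectation": 10.0},
    nominal_expectations={"wimp": 4.0},
)
```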
alea.simulators module
- class alea.simulators.BlueiceDataGenerator(ll_term)[source]
Bases: object
A class for generating data from a blueice likelihood term.
- ll
The blueice likelihood term.
- mus
The expected number of events of each source of the likelihood term.
- ll_term
A blueice likelihood term.
- Type
BinnedLogLikelihood or UnbinnedLogLikelihood
- compute_pdfs_and_mus(filter_kwargs=True, **kwargs) None[source]
Compute PDFs of the sources for the given parameters.
- Parameters
filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.
kwargs – The parameters passed to the likelihood function.
- simulate(filter_kwargs=True, n_toys=None, sample_n_toys=False, **kwargs)[source]
Simulate toys for each source.
- Parameters
filter_kwargs (bool, optional (default=True)) – If True, only parameters of the ll object are accepted as kwargs. Defaults to True.
n_toys (int, optional (default=None)) – If not None, a fixed number n_toys of toys is generated for each source component. Defaults to None.
sample_n_toys (bool, optional (default=False)) – If True, the number of toys is sampled from a Poisson distribution with mean n_toys. Defaults to False. Only works if n_toys is not None.
- Keyword Arguments
kwargs – The parameters passed to the likelihood function.
- Returns
Array of simulated data for all sources in the given analysis space. The index “source” indicates the corresponding source of an entry. The dtype follows self.dtype.
- Return type
numpy.array
alea.submitter module
- class alea.submitter.Submitter(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]
Bases: object
Submitter base class that generates the submission script from the configuration. It is initialized by the configuration file, and the configuration file should contain the arguments of the __init__ method of the Submitter.
- computation_dict
the dictionary of the computation, with keys to_zip, to_vary and in_common
- Type
dict
- debug
whether to run in debug mode. If True, only one job will be submitted or one runner will be returned. And its script will be printed.
- Type
bool
- resubmit
whether to resubmit the jobs that have not finished. If True, will submit all the jobs, even if the output file exists.
- Type
bool
- Parameters
statistical_model (str) – the name of the statistical model
statistical_model_config (str) – the configuration file of the statistical model
poi (str) – the parameter of interest
computation_options (dict) – the configuration of the computation
computation (str, optional (default='discovery_power')) – the name of the computation, it should be a key of computation_options
outputfolder (str, optional (default=None)) – the output folder
debug (bool, optional (default=False)) – whether to run in debug mode
loglevel (str, optional (default='INFO')) – the log level
- Keyword Arguments
kwargs – the arguments of __init__ method of the Submitter, containing configurations of clusters
Caution
All template sources should be in the same folder. All the output, including toydata and fitting results, should be in the same folder.
- __init__(statistical_model: str, statistical_model_config: str, poi: str, computation_options: dict, computation: str = 'discovery_power', outputfolder: Optional[str] = None, fit_strategy: Optional[dict] = None, debug: bool = False, resubmit: bool = False, loglevel: str = 'INFO', **kwargs)[source]
Initializes the submitter.
- already_done(i_args: dict) bool[source]
Check if the job is already done, considering the modes of toydata and output.
- static arg_to_str(value, annotation) str[source]
Convert the argument to string for the submission script.
- Parameters
value – the value of the argument, can be various type
annotation – the annotation of the argument
- Returns
the string of the argument
- Return type
str
Caution
Currently we only support str, int, float, bool, dict and list. The float will be rounded to 4 digits after the decimal point.
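A conversion restricted to the supported types, with floats rounded to four decimal places, might look like the sketch below. It is not the library's implementation: arg_to_str_sketch is a hypothetical name, it ignores the annotation argument, and the use of JSON for dict and list values is an assumption.

```python
import json


def arg_to_str_sketch(value):
    """Convert a supported value to a string for a command line (hypothetical)."""
    if isinstance(value, bool):
        # bool is checked before int because bool is a subclass of int.
        return str(value)
    if isinstance(value, float):
        return str(round(value, 4))
    if isinstance(value, (str, int)):
        return str(value)
    if isinstance(value, (dict, list)):
        return json.dumps(value)  # assumed serialisation for containers
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


converted = [arg_to_str_sketch(v) for v in ("mu", 3, 3.14159265, True, {"a": 1})]
```

Rounding means that two float arguments differing only beyond the fourth decimal place produce identical strings, which is worth keeping in mind when scanning finely spaced parameter values.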
- combined_tickets_generator()[source]
Get the combined submission script for the current configuration.
self.combine_n_jobs jobs will be combined into one submission script.
- Yields
(str, str) – the combined submission script and name output_filename
Note
Users can add combine_n_jobs: 10 in local_configurations, slurm_configurations or htcondor_configurations to combine 10 jobs into one submission script. This feature is useful when the number of jobs pending submission is too large.
- computation_tickets_generator()[source]
Get the submission script for the current configuration. It generates the submission script for each combination of the computation options.
- For Runner from to_zip, to_vary and in_common:
First, generate the combined computational options directly.
Second, update the input and output folder of the options.
Third, collect the non-fittable (settable) parameters into nominal_values.
Then, collect the fittable parameters into generate_values.
Finally, it generates the submission script for each combination.
- Yields
(str, str) – the submission script and name output_filename
- classmethod from_config(config_file_path: str, **kwargs) Submitter[source]
Initializes the submitter from a yaml config file.
- Parameters
config_file_path (str) – Path to the yaml config file.
- Returns
The initialized submitter.
- Return type
Submitter
- logging = <Logger submitter_logger (INFO)>
- merged_arguments_generator()[source]
Generate the merged arguments for Runner from to_zip, to_vary and in_common.
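One plausible reading of the three keys is that to_zip entries are stepped through together, to_vary entries are combined as a Cartesian product, and in_common entries are shared by every job. The sketch below implements that reading; the semantics and the merged_arguments function are assumptions, not taken from the alea source.

```python
import itertools


def merged_arguments(to_zip, to_vary, in_common):
    """Yield one argument dict per job (hypothetical semantics, see lead-in)."""
    zip_keys = list(to_zip)
    # Zipped options advance in lockstep; with no options there is one empty step.
    zipped = list(zip(*to_zip.values())) if to_zip else [()]
    vary_keys = list(to_vary)
    # product() of no iterables yields a single empty tuple.
    varied = list(itertools.product(*to_vary.values()))
    for zip_values in zipped:
        for vary_values in varied:
            merged = dict(in_common)
            merged.update(zip(zip_keys, zip_values))
            merged.update(zip(vary_keys, vary_values))
            yield merged


jobs = list(
    merged_arguments(
        to_zip={"a": [1, 2], "b": [10, 20]},
        to_vary={"c": [0, 1, 2]},
        in_common={"n_mc": 100},
    )
)
```

With these inputs the two zipped steps and three varied values give six jobs in total.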
- static runner_kwargs_from_script(sys_argv: Optional[List[str]] = None)[source]
Parse kwargs of a Runner from a string of arguments (script).
- Parameters
sys_argv (list, optional (default=None)) – list of arguments, with the format ['--arg1', 'value1', '--arg2', 'value2', ...]. The arguments must be the same as the arguments of Runner.__init__.
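Pairing flags with their values in a list of this format can be sketched as follows. kwargs_from_argv is a hypothetical helper: it leaves every value as a string, whereas the real method presumably converts values according to the annotations of Runner.__init__.

```python
def kwargs_from_argv(argv):
    """Turn ['--name', 'value', ...] into {'name': 'value', ...} (hypothetical)."""
    if len(argv) % 2 != 0:
        raise ValueError("expected alternating flags and values")
    kwargs = {}
    for flag, value in zip(argv[0::2], argv[1::2]):
        if not flag.startswith("--"):
            raise ValueError(f"not a flag: {flag}")
        kwargs[flag[2:]] = value  # values stay strings in this sketch
    return kwargs


parsed = kwargs_from_argv(["--poi", "mu", "--n_mc", "10"])
```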
- static script_from_runner_kwargs(annotations, kwargs) str[source]
Generate the submission script from the runner arguments.
- static str_to_arg(value: str, annotation)[source]
Convert the string to argument for the submission script.
- Parameters
value – the string of the argument
annotation – the annotation of the argument
- Returns
the value of the argument, can be various type
- static update_n_batch(runner_args)[source]
Update n_mc if n_batch is provided.
Distribute n_mc into n_batch, so that each batch will run n_mc/n_batch times.
- static update_runner_args(runner_args: Dict[str, Dict[str, Any]], parameters_fittable: List[str], parameters_not_fittable: List[str])[source]
Update the runner arguments’ generate_values and nominal_values. If the argument is fittable, it will be added to generate_values, otherwise it will be added to nominal_values.
- Parameters
runner_args (dict) – the arguments of Runner
alea.template_source module
- class alea.template_source.CombinedSource(config: Dict, *args, **kwargs)[source]
Bases: TemplateSource
Source that is a weighted sum of histograms. Useful e.g. for safeguard. The first histogram is the base histogram, and the rest are added to it with weights. The weights can be set as shape parameters in the config.
- Parameters
weights – Weights of the 2nd to the last histograms.
histnames – List of filenames containing the histograms.
templatenames – List of names of histograms within the hdf5 files.
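The weighted sum itself can be written in a few lines. This sketch uses flat lists of bin contents and a hypothetical combine_histograms function; the real class works on template histograms read from files, and whether it renormalises the result is not stated here.

```python
def combine_histograms(histograms, weights):
    """Add weighted histograms to a base histogram, bin by bin.

    histograms: list of equal-length lists; the first one is the base.
    weights: one weight for each histogram after the first.
    """
    base, *others = histograms
    if len(weights) != len(others):
        raise ValueError("need one weight per histogram after the base")
    combined = list(base)
    for weight, histogram in zip(weights, others):
        if len(histogram) != len(base):
            raise ValueError("all histograms must share the same binning")
        for i, content in enumerate(histogram):
            combined[i] += weight * content
    return combined


combined = combine_histograms([[1.0, 2.0], [10.0, 20.0]], weights=[0.5])
```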
- class alea.template_source.SpectrumTemplateSource(config: Dict, *args, **kwargs)[source]
Bases: TemplateSource
Reweighted template source by 1D spectrum. The first axis of the template is assumed to be reweighted.
- Parameters
spectrum_name – Name of bbf json-like spectrum file
- class alea.template_source.TemplateSource(config: Dict, *args, **kwargs)[source]
Bases: HistogramPdfSource
A source defined with a template histogram. The parameters are set in self.config. “templatename”, “histname”, “analysis_space” must be in self.config.
- _bin_volumes
The bin volumes of the source.
- Type
numpy.ndarray
- _n_events_histogram
The histogram of the number of events of the source.
- Type
multihist.MultiHistBase
- _pdf_histogram
The histogram of the probability density function of the source.
- Type
multihist.MultiHistBase
- Parameters
config (dict) – The configuration of the source.
templatename – Hdf5 file to open.
histname – Histogram name.
named_parameters (list) – List of config setting names to pass to .format on histname and filename.
normalise_template (bool) – Normalise the template histogram.
in_events_per_bin (bool) – If True, histogram is in events per day / bin. If False or absent, histogram is already pdf.
histogram_scale_factor (float) – Multiply histogram by this number
convert_to_uniform (bool) – Convert the histogram to a uniform per bin distribution.
log10_bins (list) – List of axis numbers. Bin edges on these axes in the hdf5 file are log10() of the actual bin edges.
- _check_binning(h, histogram_info: str)[source]
Check if the histogram's bin edges are the same as those of analysis_space.
- Parameters
h (multihist.MultiHistBase) – The histogram to check.
histogram_info (str) – Information of the histogram.
- _compute_multiple_file_hashes(templatenames: List[str], format_named_parameters: Dict) str[source]
Compute a deterministic hash for multiple template files.
- _compute_single_file_hash(templatename: str, format_named_parameters: Dict) str[source]
Compute the hash for a single template file.
- apply_slice_args(h, slice_args: Optional[Union[List[Dict], Dict]] = None)[source]
Apply slice arguments to the histogram.
- Parameters
h (multihist.MultiHistBase) – The histogram to apply the slice arguments to.
slice_args (dict) – The slice arguments to apply. The sum_axis, slice_axis, and slice_axis_limits are supported.
- property format_named_parameters
Get the named parameters in the config as a dictionary.
alea.utils module
- class alea.utils.IndexMorpher(config, shape_parameters)[source]
Bases: Morpher
IndexMorpher is a morpher which applies no interpolation.
- get_anchor_points(bounds, n_models=None)[source]
Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.
- make_interpolator(f, extra_dims, anchor_models)[source]
Return a function which interpolates the extra_dims-valued function f(model) between the anchor points.
- Parameters
f – Function which takes a Model as argument, and produces an extra_dims shaped array.
extra_dims – tuple of integers, shape of return value of f.
anchor_models – dictionary {z-score: Model} of anchor models at which to evaluate f.
- alea.utils._get_internal(file_name)[source]
Get the abspath of the file.
Raises FileNotFoundError when the file is not found in any subfolder.
- alea.utils._prefix_file_path(config: dict, template_folder_list: list, ignore_keys: List[str] = ['name', 'histname'])[source]
Prefix file path with template_folder_list whenever possible.
- alea.utils.adapt_likelihood_config_for_blueice(likelihood_config: dict, template_folder_list: list) dict[source]
Adapt likelihood config to be compatible with blueice.
- alea.utils.asymptotic_critical_value(confidence_interval_kind: str, confidence_level: float, degree_of_freedom: Optional[int] = None)[source]
Return the critical value for the confidence interval.
- Parameters
- Returns
critical value
- Return type
- Raises
ValueError – if confidence_interval_kind is not ‘lower’, ‘upper’ or ‘central’
ValueError – if degree_of_freedom is not None and not 1, when confidence_interval_kind is ‘lower’ or ‘upper’
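For orientation, the sketch below computes such a critical value under Wilks' theorem with one degree of freedom, using only the standard library. It is an assumption-laden illustration, not alea's code: the name critical_value_sketch is hypothetical, and the one-sided convention used here (a chi-square quantile at 2 * confidence_level - 1, valid for confidence_level > 0.5) is an assumption about the intended behaviour.

```python
from statistics import NormalDist


def critical_value_sketch(kind: str, confidence_level: float) -> float:
    """Chi-square (1 degree of freedom) threshold via the normal quantile."""
    if kind == "central":
        # P(chi2_1 <= c) = CL  <=>  c = z((1 + CL) / 2) ** 2
        z = NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)
    elif kind in ("lower", "upper"):
        # One-sided (assumed): P(chi2_1 <= c) = 2 * CL - 1  <=>  c = z(CL) ** 2
        z = NormalDist().inv_cdf(confidence_level)
    else:
        raise ValueError("kind must be 'lower', 'upper' or 'central'")
    return z * z


central_90 = critical_value_sketch("central", 0.9)
upper_90 = critical_value_sketch("upper", 0.9)
```

With a confidence level of 0.9, this gives about 2.71 for a central interval and about 1.64 for a one-sided limit.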
- alea.utils.can_assign_to_typing(value_type, target_type) bool[source]
Check if value_type can be assigned to target_type. This is useful when converting Runner’s argument into strings.
- Parameters
value_type – type of the value, might be float, int, etc.
target_type – type of the target, might be Optional, Union, etc.
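One way such a check can be written with the standard typing helpers is sketched below. This is a hypothetical re-implementation for illustration (the name can_assign_sketch is not part of alea), and its rules, for example not treating int as assignable to float, may differ from the library's.

```python
from typing import Any, List, Optional, Union, get_args, get_origin


def can_assign_sketch(value_type: type, target_type: Any) -> bool:
    """Return True if a value of value_type fits the annotation target_type."""
    if target_type is Any:
        return True
    origin = get_origin(target_type)
    if origin is Union:
        # Optional[X] is Union[X, None], so it is covered here as well.
        return any(can_assign_sketch(value_type, arg) for arg in get_args(target_type))
    if origin is not None:
        # Parametrised generics such as List[str]: compare the container only.
        return issubclass(value_type, origin)
    return issubclass(value_type, target_type)


int_fits_optional_int = can_assign_sketch(int, Optional[int])
str_fits_number = can_assign_sketch(str, Union[int, float])
```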
- alea.utils.can_expand_grid(variations: dict) bool[source]
Check if variations can be expanded into a grid.
Example
>>> can_expand_grid({'a': [1, 2], 'b': [3, 4]})
True
- alea.utils.clip_limits(value) Tuple[float, float][source]
Clip limits to be within [-MAX_FLOAT, MAX_FLOAT] by converting None to -MAX_FLOAT (lower limit) or MAX_FLOAT (upper limit).
- alea.utils.compute_variations(to_zip, to_vary, in_common) list[source]
Compute variations of Runner from to_zip, to_vary and in_common. By priority, the order is to_zip, to_vary, in_common. The values in to_zip will overwrite the keys in to_vary and in_common. The values in to_vary will overwrite the keys in in_common.
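The priority rule can be sketched with dictionary unpacking, where later entries win on key clashes. The helper names below are hypothetical, and the ordering of the result and the handling of empty inputs are assumptions of this sketch rather than documented behaviour.

```python
from itertools import product
from typing import Any, Dict, List


def zip_dict(to_zip: Dict[str, List]) -> List[Dict[str, Any]]:
    """Pair values position by position, like zip."""
    return [dict(zip(to_zip, values)) for values in zip(*to_zip.values())]


def product_dict(to_vary: Dict[str, List]) -> List[Dict[str, Any]]:
    """Form every combination of values, like itertools.product."""
    return [dict(zip(to_vary, values)) for values in product(*to_vary.values())]


def merge_variations_sketch(to_zip, to_vary, in_common) -> List[Dict[str, Any]]:
    """Combine the three inputs; to_zip beats to_vary, which beats in_common."""
    zipped = zip_dict(to_zip) or [{}]
    varied = product_dict(to_vary) or [{}]
    # Later unpacking overrides earlier keys: in_common < to_vary < to_zip.
    return [{**in_common, **v, **z} for z, v in product(zipped, varied)]


variations = merge_variations_sketch(
    to_zip={"a": [1, 2], "b": [3, 4]},
    to_vary={"c": [5, 6]},
    in_common={"a": 0, "d": 7},
)
```

In this example the value of a from in_common is overwritten by the zipped values, while d is carried into every variation unchanged.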
- alea.utils.convert_to_in_common(in_common: Dict[str, Any]) Dict[str, Any][source]
Expand the values in in_common, according to the itertools.product method, if necessary. This usually happens to the hypotheses.
Example
>>> convert_to_in_common({'hypotheses': ['free', {'a': [1, 2], 'b': [3, 4]}]})
{
    "hypotheses": [
        "free",
        {"a": 1, "b": 3},
        {"a": 1, "b": 4},
        {"a": 2, "b": 3},
        {"a": 2, "b": 4},
    ]
}
- alea.utils.convert_to_vary(to_vary: Dict[str, List]) List[Dict[str, Any]][source]
Convert dict into a list of dict, according to the itertools.product method.
Example
>>> convert_to_vary({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]
- alea.utils.convert_to_zip(to_zip: Dict[str, List]) List[Dict[str, Any]][source]
Convert dict into a list of dict, according to the zip method.
Example
>>> convert_to_zip({'a': [1, 2], 'b': [3, 4]})
[{'a': 1, 'b': 3}, {'a': 2, 'b': 4}]
- alea.utils.convert_variations(variations: dict, iteration) list[source]
Convert variations to a list of dict, according to the iteration method.
- alea.utils.deterministic_hash(thing, length=10)[source]
Return a lowercase base32 string of the given length, determined from hashing a container hierarchy.
Edited from strax: strax/utils.py
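The general idea can be sketched with the standard library: serialise the container in a canonical way, hash it, and keep the first characters of the base32 digest. This is an illustration under assumptions, not the alea or strax code: the name deterministic_hash_sketch is hypothetical, the choice of JSON and SHA-1 is an assumption, and only JSON-serialisable inputs are handled.

```python
import json
from base64 import b32encode
from hashlib import sha1


def deterministic_hash_sketch(thing, length: int = 10) -> str:
    """Hash a JSON-serialisable container hierarchy to a short lowercase string."""
    # sort_keys makes the result independent of dict insertion order.
    serialised = json.dumps(thing, sort_keys=True)
    digest = sha1(serialised.encode("utf-8")).digest()
    # A SHA-1 digest gives 32 base32 characters, so length should be <= 32.
    return b32encode(digest)[:length].decode("ascii").lower()


h1 = deterministic_hash_sketch({"a": 1, "b": [1, 2]})
h2 = deterministic_hash_sketch({"b": [1, 2], "a": 1})
```

Because the keys are sorted before hashing, h1 and h2 are identical even though the dictionaries were built in a different order.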
- alea.utils.evaluate_numpy_scipy_expression(value: str)[source]
Evaluate a numpy (np) or scipy.stats expression given as a string.
- alea.utils.evaluate_numpy_scipy_expression_in_dict(d: dict)[source]
Evaluate numpy (np) and scipy.stats expressions in a dict.
Example
>>> evaluate_numpy_scipy_expression_in_dict({'a': 'np.arange(0, 2, 1)', 'b': [0, 1]})
{'a': [0, 1], 'b': [0, 1]}
- alea.utils.expand_grid_dict(variations: List[Union[dict, str]]) List[Union[dict, str]][source]
Expand dict into a list of dict, according to the itertools.product method, if necessary.
- Parameters
variations (list) – variations to be expanded
Example
>>> expand_grid_dict(["free", {"a": 1, "b": 3}, {"a": 'np.arange(1, 3)', "b": [3, 4]}])
[
    "free",
    {"a": 1, "b": 3},
    {"a": 1, "b": 3},
    {"a": 1, "b": 4},
    {"a": 2, "b": 3},
    {"a": 2, "b": 4},
]
- alea.utils.extremal_root(f, xL, xR, which='left', step=0.01, step_growth=1.0, max_step=None, xtol=1e-12, rtol=np.float64(8.881784197001252e-16))[source]
Return the left-most or right-most root of f in [xL, xR].
The interval is scanned adaptively to detect a sign change, and the root is refined using scipy.optimize.brentq.
- Parameters
xL (float) – Left boundary (must satisfy xR > xL).
xR (float) – Right boundary.
which (str, optional) – “left” or “right”. Default is “left”.
step (float, optional) – Initial scan step (>0).
step_growth (float, optional) – Step multiplier (>=1).
max_step (float | None, optional) – Maximum scan step.
xtol (float, optional) – Absolute tolerance for brentq.
rtol (float, optional) – Relative tolerance for brentq.
- Returns
Extremal root in the interval.
- Return type
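The scan-then-refine idea can be sketched for the left-most root as follows. To stay within the standard library, this sketch refines the bracket by plain bisection instead of scipy.optimize.brentq, uses a fixed scan step (no step_growth or max_step), and covers only which="left"; the name leftmost_root_sketch is hypothetical.

```python
from typing import Callable


def leftmost_root_sketch(
    f: Callable[[float], float],
    x_left: float,
    x_right: float,
    step: float = 0.01,
    xtol: float = 1e-12,
) -> float:
    """Scan from x_left in fixed steps, then bisect the first sign change."""
    if not x_right > x_left:
        raise ValueError("x_right must be larger than x_left")
    if step <= 0:
        raise ValueError("step must be positive")
    a, fa = x_left, f(x_left)
    if fa == 0.0:
        return a
    while a < x_right:
        b = min(a + step, x_right)
        fb = f(b)
        if fb == 0.0:
            return b
        if fa * fb < 0.0:
            # Bracket found: refine by bisection (brentq would go here).
            for _ in range(200):
                if b - a <= xtol:
                    break
                mid = 0.5 * (a + b)
                fmid = f(mid)
                if fmid == 0.0:
                    return mid
                if fa * fmid < 0.0:
                    b = mid
                else:
                    a, fa = mid, fmid
            return 0.5 * (a + b)
        a, fa = b, fb
    raise ValueError("no sign change found in the interval")


# A cubic with roots at 1, 2 and 3; the left-most one is requested.
root = leftmost_root_sketch(lambda x: (x - 1) * (x - 2) * (x - 3), 0.0, 4.0)
```

A scan like this can miss roots that are closer together than the step, or roots where the function touches zero without changing sign. The right-most root could be found the same way by scanning from x_right towards x_left.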
- alea.utils.formatted_to_asterisked(formatted, wildcards: Optional[Union[str, List[str]]] = None)[source]
Convert a formatted string to an asterisked one. Sometimes a parameter (usually a shape parameter) is not specified in the formatted string; this function replaces that parameter with an asterisk.
- Parameters
- Returns
asterisked string
- Return type
Examples
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}")
"a_*_b_*"
>>> formatted_to_asterisked("a_{a:.2f}_b_{b:d}", wildcards="a")
"a_*_b_{b:d}"
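A behaviour like this can be reproduced with a regular expression over the replacement fields. The sketch below is a hypothetical re-implementation for illustration (the name to_asterisked_sketch is not part of alea) and assumes simple, non-nested fields of the form {name:spec}.

```python
import re
from typing import List, Optional, Union

# Matches a replacement field such as "{a:.2f}" and captures its name ("a").
_FIELD = re.compile(r"\{([^{}:!]*)[^{}]*\}")


def to_asterisked_sketch(
    formatted: str, wildcards: Optional[Union[str, List[str]]] = None
) -> str:
    """Replace selected (or all) format fields in `formatted` by '*'."""
    if isinstance(wildcards, str):
        wildcards = [wildcards]

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if wildcards is None or name in wildcards:
            return "*"
        return match.group(0)  # leave other fields untouched

    return _FIELD.sub(replace, formatted)


all_fields = to_asterisked_sketch("a_{a:.2f}_b_{b:d}")
only_a = to_asterisked_sketch("a_{a:.2f}_b_{b:d}", wildcards="a")
```

The resulting string can then be used as a glob pattern, for example with glob.glob, to find files for any value of the replaced parameters.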
- alea.utils.get_analysis_space(analysis_space: list) list[source]
Convert analysis_space to a list of tuples with evaluated values.
- alea.utils.get_file_path(fname, folder_list: Optional[List[str]] = None)[source]
Find the full path to the resource file. Five methods are tried in the following order:
1. fname begins with '/': return it as the absolute path
2. folder begins with '/': return folder + name
3. the file can be found by _get_internal: return the alea internal file path
4. the file can be found in locally installed ntauxfiles: return the ntauxfiles absolute path
5. the file can be downloaded from MongoDB: download it and return the cached path
- Parameters
- Returns
full path to the resource file
- Return type
- alea.utils.get_template_folder_list(likelihood_config, extra_template_path: Optional[str] = None)[source]
Get a list of template_folder from likelihood_config.
- alea.utils.search_filename_pattern(filename: str) str[source]
Return the pattern for a given existing filename. This is needed because the filename sometimes does not have “_{i_batch:d}” appended. We need to distinguish between the two cases and return the correct pattern.
- Returns
existing pattern for filename, either filename or filename with inserted “_*”
- Return type
- alea.utils.signal_multiplier_estimator(signal: ndarray, background: ndarray, data: ndarray, iteration=100, diagnostic=False) float[source]
Estimate the best-fit signal multiplier using perturbation theory. The method tries to solve for the critical point of the likelihood function by perturbation theory, where the likelihood function is defined as the binned Poisson likelihood function, given signal, background models and data.
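To make the critical-point condition concrete, the sketch below solves it numerically on plain lists. It uses Newton's method rather than the perturbative approach of signal_multiplier_estimator, so it is a cross-check under assumptions, not the library's algorithm. The name signal_multiplier_newton is hypothetical, the expected count in bin i is taken to be mu * signal[i] + background[i], and the background is assumed to be positive in every bin.

```python
from typing import Sequence


def signal_multiplier_newton(
    signal: Sequence[float],
    background: Sequence[float],
    data: Sequence[float],
    iterations: int = 50,
) -> float:
    """Solve d(log L)/d(mu) = 0 for the binned Poisson likelihood."""
    mu = 0.0
    total_signal = sum(signal)
    for _ in range(iterations):
        expected = [mu * s + b for s, b in zip(signal, background)]
        # Score: sum_i s_i * (d_i / expected_i - 1)
        score = sum(s * d / e for s, d, e in zip(signal, data, expected)) - total_signal
        # Derivative of the score with respect to mu (never positive)
        slope = -sum(s * s * d / (e * e) for s, d, e in zip(signal, data, expected))
        if slope == 0.0:
            break
        mu -= score / slope
    return mu


# The data are exactly 2 * signal + background, so the best fit is mu = 2.
mu_hat = signal_multiplier_newton(
    signal=[1.0, 2.0, 3.0], background=[5.0, 5.0, 5.0], data=[7.0, 9.0, 11.0]
)
```

Starting from mu = 0 works well when the data exceed the background; for strong deficits the Newton step can overshoot towards negative expected counts, which this sketch does not guard against.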