saga.aggregate

This module implements methods for aggregating output from jobs run for all possible option combinations supplied in YAML files. It also offers methods for manipulating, plotting, and saving the outputs.

saga.aggregate.add_cut_array(args_array, cut_array, arr_vars)

Parameters

args_array : list, required

Array of argument dictionaries

cut_array : list, required

Array of dictionaries mapping array variables to bin cut titles

proj_ids : list, required

Array of unique integer bin ids

arr_vars : list, required

Variables in which to construct a grid of cuts

Returns

list

An array of argument dictionaries, updated with the bin cut title and ylabel arguments, with the same grid structure as args_array.

Raises

TypeError

Raise an error if the shapes of the arrays do not match or the length of the shapes is not in \((1,2)\).

Description

Add bin cut title and ylabel arguments to each argument dictionary in a grid array from a cut array produced from get_cut_array(). Note that the returned array will have the same shape as the given args_array.

saga.aggregate.apply_bin_mig(df, inv_bin_mig_mat, results_keys=())

Parameters

df : pandas.DataFrame, required

Pandas dataframe of asymmetry results

inv_bin_mig_mat : np.array, required

Inverse of bin migration matrix mapping generated bins to reconstructed bins

results_keys : list, optional

List of keys to results entries to which to apply bin migration corrections

Description

Apply a bin migration correction to a set of results contained within a pandas dataframe.
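
In matrix terms, the correction amounts to multiplying the vector of measured results by the inverse migration matrix. The sketch below uses plain Python lists in place of a dataframe column and a numpy array so that it has no dependencies. The helper name, the example numbers, and the absence of any error propagation are assumptions for illustration, not saga's implementation.

```python
def apply_inverse_migration(values, inv_mat):
    """Return the matrix-vector product inv_mat @ values (hypothetical helper)."""
    return [sum(m * v for m, v in zip(row, values)) for row in inv_mat]

# Inverse of the symmetric migration matrix [[5/6, 1/6], [1/6, 5/6]]
inv_bin_mig_mat = [[1.25, -0.25], [-0.25, 1.25]]

# Hypothetical measured asymmetries in two bins
measured = [0.10, 0.20]

corrected = apply_inverse_migration(measured, inv_bin_mig_mat)
print(corrected)  # approximately [0.075, 0.225]
```

A symmetric matrix is used here so that the result does not depend on which index labels the generated bin.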

saga.aggregate.compute_systematics(results, bin_migration_mat=None, systematic_scales_mat=None, systematics_additive_mat=None)

Parameters

results : np.array, required

Array of asymmetry results

bin_migration_mat : np.array, optional

Bin migration matrix mapping generated bins to reconstructed bins

systematic_scales_mat : np.array, optional

Array of systematic errors to be scaled by y values before being added in quadrature to other systematic errors

systematics_additive_mat : np.array, optional

Array of absolute systematic errors to add to other systematic errors

Returns

np.array

Array of systematics values added in quadrature

Description

Compute the systematic errors for a 1D binning scheme given any combination of the following: a bin migration matrix, which is inverted internally with np.linalg.inv(); a systematic scaling array, which is multiplied by the actual results and added in quadrature; and a systematic additive array, which is added to the other systematic errors, not in quadrature.
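
The two combination rules for the scaling and additive arrays can be sketched as follows. This is a minimal illustration under stated assumptions, not saga's implementation: the helper name is hypothetical, each array is assumed to hold one row per bin and one column per systematic source, and the bin migration term is left out.

```python
import math

def combine_systematics(results, scales, additive):
    """Per-bin systematic error from relative and absolute sources (sketch)."""
    combined = []
    for y, bin_scales, bin_additive in zip(results, scales, additive):
        # Relative sources: scale by the measured value, then sum in quadrature
        quad = math.sqrt(sum((s * y) ** 2 for s in bin_scales))
        # Absolute sources: added linearly on top of the quadrature sum
        combined.append(quad + sum(bin_additive))
    return combined

# Hypothetical inputs: two bins, two relative sources, one absolute source
results = [0.10, -0.20]
scales = [[0.03, 0.04], [0.03, 0.04]]
additive = [[0.001], [0.002]]

systematics = combine_systematics(results, scales, additive)
print(systematics)  # approximately [0.006, 0.012]
```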

saga.aggregate.get_aggregate_graph(graph_list, xvar_keys=None, sgasym=0.0, get_min_max=False)

Parameters

graph_list : list, required

List of graphs with dimension (N_GRAPHS, 2*(1+N_XVAR_KEYS), N_BIN_IDS)

xvar_keys : list, optional

List of binning variables for which to return mean values

sgasym : float, optional

Injected signal asymmetry for computing difference of measured and injected values

get_min_max : bool, optional

Whether to include the additional graph keys y_min, y_max, ydiff_min, and ydiff_max.

Returns

dict

Dictionary mapping the names of the asymmetry means, errors, and other statistics, as well as the bin variable means and errors, to arrays of their values in each kinematic bin

Description

Compute the mean bin variables, the asymmetry means and errors, and other statistical information across a list of graphs’ data from get_graph_data(). Note that if the graph dimension is greater than 1, the bin variable statistics will be returned as a list of arrays in the same order as xvar_keys.

saga.aggregate.get_bin_mig_mat(df, id_gen_key='binid_gen', id_rec_key='binid_rec', mig_key='mig')

Parameters

df : pandas.DataFrame, required

Pandas dataframe of bin migration matrix

id_gen_key : str, optional

Key for generated bin indices

id_rec_key : str, optional

Key for reconstructed bin indices

mig_key : str, optional

Key for bin migration fractions

Returns

np.array

Bin migration matrix whose indices map generated to reconstructed bins.

Raises

TypeError

Raise an error if the generated and reconstructed unique bin ids do not have the same members.

Description

Create a 2D bin migration matrix from a dataframe of generated and reconstructed bin indices and the bin migration fractions.
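
The construction can be sketched without pandas by treating each dataframe row as a dictionary. The helper name, the example fractions, and the index orientation (first index for the generated bin, second for the reconstructed bin) are assumptions made for illustration; the actual function returns an np.array.

```python
def build_migration_matrix(rows, id_gen_key="binid_gen",
                           id_rec_key="binid_rec", mig_key="mig"):
    """Fill a square nested-list matrix from (gen id, rec id, fraction) rows."""
    gen_ids = sorted({row[id_gen_key] for row in rows})
    rec_ids = sorted({row[id_rec_key] for row in rows})
    if gen_ids != rec_ids:
        raise TypeError("generated and reconstructed bin ids must have the same members")
    # Map each bin id to a row/column position in the matrix
    index = {bin_id: i for i, bin_id in enumerate(gen_ids)}
    n = len(gen_ids)
    mat = [[0.0] * n for _ in range(n)]
    for row in rows:
        # Assumed orientation: mat[generated][reconstructed]
        mat[index[row[id_gen_key]]][index[row[id_rec_key]]] = row[mig_key]
    return mat

# Hypothetical migration fractions for a two-bin scheme
rows = [
    {"binid_gen": 0, "binid_rec": 0, "mig": 0.9},
    {"binid_gen": 0, "binid_rec": 1, "mig": 0.1},
    {"binid_gen": 1, "binid_rec": 0, "mig": 0.2},
    {"binid_gen": 1, "binid_rec": 1, "mig": 0.8},
]
mig_mat = build_migration_matrix(rows)
print(mig_mat)  # [[0.9, 0.1], [0.2, 0.8]]
```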

saga.aggregate.get_binscheme_cuts_and_ids(binscheme, start_idx=0, id_key='bin_id', binvar_titles=None)

Parameters

binscheme : dict, required

Dictionary mapping binning variables to bin limits arrays

start_idx : int, optional

Starting integer for enumerating bin unique integer ids

id_key : str, optional

Column name for bin unique integer ids

binvar_titles : list, optional

List of LaTeX-format titles for bin variables for forming cut titles

Returns

(dict, dict, pandas.DataFrame, list or None)

Dictionary of bin ids to cuts, dictionary of bin ids to bin cut title dictionaries, a pandas dataframe with bin ids under id_key and the projection bin ids under the respective variable names, and the shape of the nested bin scheme at depth 2. Note that the nested grid shape will be None if no 2D nested bin scheme is defined.

Description

Create maps of bin ids to bin cuts and titles, and a dataframe mapping bin ids to projection bin ids. Also check and return the nested bin scheme shapes at depth 2 in the case of a 2D nested bin scheme.

saga.aggregate.get_config_list(configs, aggregate_keys=None)

Parameters

configs : dict, required

Map of configuration option names to option values

aggregate_keys : list, optional

List of keys over which to group configurations

Description

Get a list of all possible option value combinations from a map of option names to possible values.
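
The combinatorial expansion described here can be sketched with itertools.product. This is a minimal illustration rather than saga's implementation: the helper name and the example option names are hypothetical, and the handling of aggregate_keys is not shown.

```python
from itertools import product

def expand_configs(configs):
    """Return one dict per combination of option values (hypothetical helper)."""
    keys = list(configs)
    # product(*values) yields one tuple per combination, in key order
    return [dict(zip(keys, combo))
            for combo in product(*(configs[k] for k in keys))]

# Hypothetical option names and values, for illustration only
configs = {"method": ["a", "b"], "nbins": [5, 10, 20]}
config_list = expand_configs(configs)
print(len(config_list))  # 6
print(config_list[0])    # {'method': 'a', 'nbins': 5}
```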

saga.aggregate.get_config_out_path(base_dir, aggregate_keys, result_name, config, sep='_', aliases=None, ext='.pdf')

Parameters

base_dir : str, required

Directory name to prepend to output file names

aggregate_keys : list, required

List of keys over which configurations are grouped

result_name : str, required

Unique string identifier for result

config : dict, required

Job configuration map

sep : str, optional

String separator

aliases : dict, optional

Map of configuration option names to maps of option values to string aliases

ext : str, optional

File extension

Returns

str

Unique output path for a file produced from the given configuration

Description

Get a unique output file name for a given configuration.

saga.aggregate.get_config_str(config, sep='_', aliases=None)

Parameters

config : dict, required

Configuration mapping option names to option values

sep : str, optional

String separator

aliases : dict, optional

Map of configuration option names to maps of option values to string aliases

Returns

str

String representation of configuration

Description

Create a string representation of a configuration. Note that aliases of nonhashable types, e.g., list or dict, will be accessed by the string representation of the aliased object, str(<object>).
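
The alias lookup for nonhashable values can be illustrated with a short sketch. The helper names and the output format (option name and value joined by sep, with the alias replacing the value) are assumptions made for this example; the string format that saga actually produces may differ.

```python
def lookup_alias(option_aliases, value):
    """Look up an alias, falling back to str(value) for unhashable values."""
    try:
        return option_aliases.get(value)
    except TypeError:
        # Lists and dicts cannot be dictionary keys, so use their string form
        return option_aliases.get(str(value))

def config_to_str(config, sep="_", aliases=None):
    """Join option names and values into one string (hypothetical format)."""
    aliases = aliases or {}
    parts = []
    for key, value in config.items():
        alias = lookup_alias(aliases.get(key, {}), value)
        parts.append(key)
        parts.append(alias if alias is not None else str(value))
    return sep.join(parts)

# Hypothetical configuration with a list-valued option
config = {"method": "fit", "bins": [0.0, 0.5, 1.0]}
aliases = {"bins": {"[0.0, 0.5, 1.0]": "coarse"}}
print(config_to_str(config, aliases=aliases))  # method_fit_bins_coarse
```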

saga.aggregate.get_cut_array(cut_titles, proj_ids)

Parameters

cut_titles : dict, required

Dictionary of bin ids to list of bin cut titles by variable

proj_ids : list, required

Array of unique integer bin ids

Returns

list

An array of bin cut dictionaries with the same grid structure as proj_ids.

Raises

TypeError

Raise an error if the length of the shape of the array of bin indices is not in \((1,2)\).

Description

Create a grid of dictionaries of array variables to bin cut titles given a dictionary of bin ids to cut title lists and a grid array of projection bin indices. Note that the returned array will have the same shape as the given proj_ids.

saga.aggregate.get_graph_array(dfs, proj_ids, id_key='bin_id', count_key='count', xvar_keys=None, asym_key='a0', err_ext='err', sgasym=0.0)

Parameters

dfs : list, required

Array of pandas dataframes containing bin ids and data

proj_ids : list, required

Array of unique integer bin ids. Note that if entries are set to None, the corresponding graph array entry will also be set to None to allow for masked grids

id_key : str, optional

String identifier for bin id column

count_key : str, optional

String identifier for bin count column

xvar_keys : list, optional

List of binning variables for which to return mean values

asym_key : str, optional

Asymmetry variable for which to return the mean value

err_ext : str, optional

Extension for forming error column names

sgasym : float or list, optional

Injected signal asymmetry for computing difference of measured and injected values

Returns

list

An array of aggregated graph data dictionaries obtained with the same grid structure as proj_ids.

Raises

TypeError

Raise an error if the length of the shape of the array of bin indices is not in \((1,2)\).

Description

Aggregate a set of graphs given the array of dataframes and a grid array of projection bin indices. Note that the returned array will have the same shape as the given proj_ids.

saga.aggregate.get_graph_data(df, bin_ids, id_key='bin_id', count_key='count', xvar_keys=None, asym_key='a0', err_ext='err')

Parameters

df : pandas.DataFrame, required

Pandas dataframe containing bin ids and data

bin_ids : list, required

List of unique integer bin ids

id_key : str, optional

String identifier for bin id column

count_key : str, optional

String identifier for bin count column

xvar_keys : list, optional

List of binning variables for which to return mean values

asym_key : str, optional

Asymmetry variable for which to return the mean value

err_ext : str, optional

Extension for forming error column names

Returns

np.array

Numpy array containing graph data with dimensions (1+2*(1+len(xvar_keys)),*shape(bin_ids))

Description

Read graph data (count, y, yerr, x0, x0err, x1, x1err, ...) for a projection plot from a pandas dataframe.
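
The row ordering and the leading dimension quoted in the Returns entry can be checked with a short sketch. The helper name is hypothetical and only reproduces the ordering given in this description; it does not call saga.

```python
def graph_row_names(xvar_keys):
    """List the graph rows in the order (count, y, yerr, x0, x0err, ...)."""
    rows = ["count", "y", "yerr"]
    for i in range(len(xvar_keys)):
        # Each binning variable contributes a mean row and an error row
        rows += [f"x{i}", f"x{i}err"]
    return rows

xvar_keys = ["x", "Q2"]  # hypothetical binning variables
rows = graph_row_names(xvar_keys)
print(rows)

# The number of rows matches 1 + 2 * (1 + len(xvar_keys))
print(len(rows) == 1 + 2 * (1 + len(xvar_keys)))
```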

saga.aggregate.get_nested_scheme_shape(node, lims_key='lims', nested_key='nested')

Parameters

node : dict, required

Map of bin scheme variables to bin limits arrays with a nested structure

lims_key : str, optional

Key for bin limits arrays

nested_key : str, optional

Key for nested bin scheme structure

Raises

ValueError

Raise an error if node does not define a 2D nested bin structure.

Returns

list

List of lengths of nested bin variable bins for each bin in the outer variable of a 2D nested bin scheme

Description

Find the bin scheme shapes for a 2D bin scheme with nested structure.

saga.aggregate.get_out_dirs_list(configs, base_dir, aggregate_keys=None, aliases=None)

Parameters

configs : dict, required

Map of configuration option names to option values

base_dir : str, required

Path to directory in which to create job directories

aggregate_keys : list, optional

List of keys over which to group configurations

aliases : dict, optional

Map of configuration option names to maps of option values to string aliases

Returns

list

List of job output directory names grouped across the values for aggregate_keys

Description

Get a list of output directory names grouped across the keys listed in aggregate_keys for jobs generated from all possible option value combinations from a map of configuration option names to possible values.

saga.aggregate.get_out_file_name(base_dir=None, base_name='', binscheme_name='x', ext='.csv')

Parameters

base_dir : str, optional

Base directory for file path

base_name : str, optional

Base name used to construct file name

binscheme_name : str, optional

Name of binning scheme

ext : str, optional

File extension

Returns

str

Output file name

Description

Get output file name for a generic kinematic binning scheme passed to an executable constructed as baseoutpath+binscheme_name+ext.

saga.aggregate.get_projection_ids(df, proj_vars, arr_vars=None, id_key='bin_id', arr_var_bins=None, nested_grid_shape=None)

Parameters

df : pandas.DataFrame, required

Pandas dataframe with bin ids under id_key and the projection bin ids under the respective variable names

proj_vars : list, required

Projection variables

arr_vars : list, optional

Variables in which to construct a grid of results

id_key : str, optional

Column name for bin unique integer ids

arr_var_bins : dict, optional

Dictionary of array binning scheme variable names to the specific projection bin ids to look for in those variables

nested_grid_shape : list, optional

One dimensional list of nested grid dimensions. Note that this will be padded with None to the largest nested dimension

Returns

(list, list, list)

An array of each projection’s unique bin ids, a list of the array binning variables encountered, and an array of the array variables’ bin ids for each projection. Note that the array shapes should be (*N_BINS_ARR_VARS,*N_BINS_PROJ_VARS): if N_BINS_PROJ_VARS=[5] and N_BINS_ARR_VARS=[3,8], then the shape of all_proj_ids and all_proj_arr_var_ids will be (3,8,5).

Raises

TypeError

Raise an error if array binning variables found in the keys of arr_var_bins are also found in proj_vars, since one should not simultaneously project over a variable and select a single bin in it.

Description

Create an array of unique bin ids projected over a subset of binning variables from a binning scheme and organized in array-like structure over another subset of binning variables.

saga.aggregate.get_scheme_vars(node, nested_key='nested')

Parameters

node : dict, required

Map of bin scheme variables to bin limits arrays with either a nested or grid structure

nested_key : str, optional

Key for nested bin scheme structure

Description

Find the bin scheme variables in a bin scheme with either a nested or grid structure.

saga.aggregate.get_subset(df, bin_ids, id_key='bin_id')

Parameters

df : pandas.DataFrame, required

Pandas dataframe containing bin ids and data

bin_ids : list, required

List of unique integer bin ids

id_key : str, optional

String identifier for bin id column

Returns

pandas.DataFrame

Pandas dataframe subset containing only elements whose id_key entries are in bin_ids

Description

Load a subset of a dataset.

saga.aggregate.offset_graph_x(g, offset, axis=0)

Parameters

g : array, required

Array describing a 1D graph with structure graph[{x_mean,x_err,y_mean,y_err,...},{nbins}]

offset : float, required

Value by which to offset graph values

axis : int, optional

Axis along which to offset the graph

Description

Offset the x values of a graph.

saga.aggregate.reshape_nested_grid(proj_ids, nested_grid_shape, fill_value=None)

Parameters

proj_ids : list, required

List of projection ids to reshape

nested_grid_shape : list, required

List of (irregular) nested grid array dimensions

fill_value : int, optional

Fill value for padding

Returns

list

A reshaped grid array padded to dimension (len(nested_grid_shape),max(nested_grid_shape))

Raises

ValueError

If nested_grid_shape is empty or not a list of integers or if sum(nested_grid_shape) is not the same length as the number of projections in proj_ids

Description

Reshape projection ids for an irregular nested bin scheme padding to the largest nested dimension with fill_value.
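
The padding behavior can be sketched in plain Python. The helper name and the example ids are hypothetical, and the sketch only follows the shape and error conditions stated above; it is not saga's implementation.

```python
def pad_nested_grid(proj_ids, nested_grid_shape, fill_value=None):
    """Split proj_ids into rows of the given lengths, padded to equal width."""
    if not nested_grid_shape or not all(isinstance(n, int) for n in nested_grid_shape):
        raise ValueError("nested_grid_shape must be a non-empty list of integers")
    if sum(nested_grid_shape) != len(proj_ids):
        raise ValueError("sum(nested_grid_shape) must equal len(proj_ids)")
    width = max(nested_grid_shape)
    grid, start = [], 0
    for n in nested_grid_shape:
        # Take the next n projections, then pad the row to the widest row
        row = list(proj_ids[start:start + n])
        grid.append(row + [fill_value] * (width - n))
        start += n
    return grid

# Six hypothetical projections in an irregular 2 + 3 + 1 nested scheme
proj_ids = [10, 11, 12, 13, 14, 15]
grid = pad_nested_grid(proj_ids, [2, 3, 1])
print(grid)  # [[10, 11, None], [12, 13, 14], [15, None, None]]
```

The result has dimension (len(nested_grid_shape), max(nested_grid_shape)), here (3, 3), and the padded entries can be skipped when plotting a masked grid.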

saga.aggregate.set_nested_bin_cuts(cuts, cut_titles, ids, node, old_cuts=None, old_cut_titles=None, old_ids=None, var='', lims_key='lims', nested_key='nested', binvar_titles=None)

Parameters

cuts : list, required

List of cuts to recursively update

cut_titles : list, required

List of cut titles to recursively update

ids : list, required

List of bin ids to recursively update

old_cuts : list, optional

List of cuts from previous recursion

old_cut_titles : list, optional

List of cut titles from previous recursion

old_ids : list, optional

List of bin ids from previous recursion

var : str, optional

Bin scheme variable at current recursion depth

lims_key : str, optional

Key for bin limits

nested_key : str, optional

Key for nested bin scheme

binvar_titles : dict, optional

Map of bin scheme variable names to LaTeX titles

Description

Recursively set lists of bin cuts, titles, and ids for each bin scheme variable in a nested binning scheme.