C++ API
Note
This project is under active development.
-
namespace saga
-
namespace analysis
Typedefs
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
Functions
-
RooArgSet *getSubRooArgSet(RooArgSet *argset, string fitformula, vector<string> varformulas, vector<string> varnames)
Create a subset of a RooArgSet for a given pdf formula.
- Parameters:
argset – RooArgSet containing RooRealVar* variables
fitformula – PDF formula passed to RooGenericPdf
varformulas – List of variable formulas passed to RooGenericPdf for all arguments in the RooArgSet
varnames – List of variable names in RooArgSet directly corresponding to the list of variable formulas
- Returns:
RooArgSet*
-
string getSubFormula(string fitformula, vector<string> varformulas, int max_idx = 0)
Adjust the parameter indexing of a pdf formula for a given subset of parameters.
Note that the output formula will have the variable notation x[<idx>], where idx indicates the integer index of the variable in the appropriate RooArgSet.
- Parameters:
fitformula – PDF formula passed to RooGenericPdf
varformulas – List of variable formulas passed to RooGenericPdf
max_idx – If this parameter is \(>0\), then the formulas will be substituted in descending order starting at idx==max_idx.
- Returns:
string
-
vector<string> getGenAsymPdf(RooWorkspace *w, vector<string> categories_as_float, RooCategory *h, RooCategory *t, RooCategory *ht, RooCategory *ss, RooArgSet *argset, vector<string> argnames, string fit_method_name, string binid, string fitformula_uu, string fitformula_pu, string fitformula_up, string fitformula_pp, double bpol, double tpol, int count, bool use_extended_nll)
Create a PDF for fitting a generic asymmetry with a maximum likelihood fit.
Create a PDF given the formulas for the asymmetries coupling to each combination of beam helicity and target spin states. The PDF will be constructed internally using RooGenericPdf in the form:
\[\begin{split} \begin{aligned} PDF(\lambda_{\ell}, S, &x_0, x_1, ..., a_0, a_1, ..., d_0, d_1, ...) = \\ & 1 + A_{UU}(\vec{x}, \vec{a}, \vec{d}) \\ & + \lambda_{\ell} \, \overline{\lambda_{\ell}^2} \, A_{PU}(\vec{x}, \vec{a}, \vec{d}) \\ & + S \, \overline{S^2} \, A_{UP}(\vec{x}, \vec{a}, \vec{d}) \\ & + \lambda_{\ell} \, S \, \overline{\lambda_{\ell}^2} \, \overline{S^2} \, A_{PP}(\vec{x}, \vec{a}, \vec{d}), \\ \end{aligned} \end{split}\]
where \(A_{UU}\) denotes the unpolarized modulations as well as any transverse target spin asymmetries which may be even under a target spin flip. \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) denote the asymmetries dependent on beam helicity, target spin, or both. The appropriate terms will be dropped if there is no dependence on beam helicity or target spin. The x_<int> denote the independent fit variables, the a_<int> denote the asymmetry amplitudes, and the d_<int> denote the depolarization factors. \(\vec{x}\), \(\vec{a}\), and \(\vec{d}\) are shorthand for the sets of these variables.
If categories_as_float contains the beam helicity or target spin variable names, the PDF will use these as independent variables. Otherwise, a simultaneous PDF will be formed over the various helicity and spin states.
Note that in the case of a structure function \(F_{UT}\) or \(F_{LT}\) modulation that is not odd under a sign flip of \(\phi_{S}\), i.e., it does not produce an asymmetry, the relevant formula should be included in the argument for the \(A_{UU}\) or \(A_{PU}\) formula respectively since in this scenario \(F_{UT}\) and \(F_{LT}\) should only have kinematic dependence on \(\phi_{S}\) rather than categorical dependence on \(S\).
The variable names in the fit formulas should follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], d_0 \(\rightarrow\) x[N_x+N_a], etc.
Note that in the case that all three fit formula terms are used, the formulas and corresponding argument sets for the PDFs that depend only on either beam helicity or target spin will be reduced to the appropriate subset of variables used in the corresponding fit formula. This ensures that the PDFs will still compile correctly even when uploaded to the RooWorkspace.
When using a single fit formula the indexing of parameters in the fit formula is straightforward. However, when using all three fit formulas, the indexing is global across all three formulas.
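The indexing convention described above can be sketched as follows. This is only an illustration: the `VarKind` enum and the helpers `tformulaIndex` and `tformulaName` are hypothetical names that are not part of saga, and the sketch assumes the argument ordering stated above (fit variables first, then asymmetry amplitudes, then depolarization factors).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type, not part of saga: the block of the argument
// list that a variable belongs to.
enum class VarKind { FitVar, AsymPar, Depol };

// Global TFormula index of the i-th variable of a given kind, assuming the
// ordering x_0..x_{N_x-1}, a_0..a_{N_a-1}, d_0.. described in the text.
int tformulaIndex(VarKind kind, int i, int n_x, int n_a) {
    assert(i >= 0 && n_x >= 0 && n_a >= 0);
    switch (kind) {
        case VarKind::FitVar:  return i;             // x_i -> x[i]
        case VarKind::AsymPar: return n_x + i;       // a_i -> x[N_x + i]
        case VarKind::Depol:   return n_x + n_a + i; // d_i -> x[N_x + N_a + i]
    }
    return -1; // not reached for valid enum values
}

// Name of the variable as it would appear in a TFormula-style fit formula.
std::string tformulaName(VarKind kind, int i, int n_x, int n_a) {
    return "x[" + std::to_string(tformulaIndex(kind, i, n_x, n_a)) + "]";
}
```

For example, with one fit variable, one amplitude, and one depolarization factor, an asymmetry \(d_0 \, a_0 \sin(x_0)\) would be written as x[2]*x[1]*sin(x[0]).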
The returned list contains:
The name of the full model in the workspace
The name of each yield variable in the workspace in the case of an extended NLL fit
- Parameters:
w – RooWorkspace in which to work
categories_as_float – List of category variables to use as asymmetry fit variables and automatically add to the PDF formula
h – Beam helicity \(\lambda_{\ell}\in(-1,0,1)\)
t – Target spin \(S\in(-1,0,1)\)
ht – Beam helicity times target spin \((\lambda_{\ell} \, S)\in(-1,0,1)\)
ss – Combined beam helicity and target spin state \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\)
argset – Argument set for PDF
argnames – Argument names for PDF
fit_method_name – Fit method name, used to name PDF
binid – Unique bin id, used to name PDF
fitformula_uu – Fit formula for the asymmetry terms \(A_{UU}\)
fitformula_pu – Fit formula for the beam helicity dependent asymmetry terms \(A_{PU}\)
fitformula_up – Fit formula for the target spin dependent asymmetry terms \(A_{UP}\)
fitformula_pp – Fit formula for the beam helicity and target spin dependent asymmetry terms \(A_{PP}\)
bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)
tpol – Luminosity averaged target polarization \(\overline{S^2}\)
count – Bin count
use_extended_nll – Option to use an extended likelihood term
- Returns:
List of model name and all yield variable names
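As a rough numerical illustration of the PDF form given in the description above, the sketch below evaluates the four terms for a single helicity and target spin state and encodes the combined spin state. The function names `genAsymPdfValue` and `combinedSpinState` are hypothetical. The asymmetry values are passed in as already evaluated numbers, whereas saga builds the PDF from formulas with RooGenericPdf.

```cpp
#include <cassert>

// Hypothetical helper, not part of saga: evaluate
//   1 + A_UU + h*bpol*A_PU + t*tpol*A_UP + h*t*bpol*tpol*A_PP
// for a beam helicity h and target spin t in {-1, 0, 1}.
double genAsymPdfValue(int h, int t, double bpol, double tpol,
                       double a_uu, double a_pu, double a_up, double a_pp) {
    assert(h >= -1 && h <= 1 && t >= -1 && t <= 1);
    return 1.0 + a_uu
         + h * bpol * a_pu
         + t * tpol * a_up
         + h * t * bpol * tpol * a_pp;
}

// Combined spin state as given for the ss parameter: ss = (h+1)*10 + (t+1).
int combinedSpinState(int h, int t) {
    assert(h >= -1 && h <= 1 && t >= -1 && t <= 1);
    return (h + 1) * 10 + (t + 1);
}
```

Setting h or t to zero removes the corresponding terms, which mirrors the statement that terms are dropped when there is no dependence on beam helicity or target spin.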
-
vector<double> fitAsym(RooWorkspace *w, string dataset_name, double bpol, double tpol, vector<string> categories_as_float, string helicity, string tspin, string htspin, string combined_spin_state, string binid, string bincut, vector<string> binvars, vector<string> depolvars, vector<string> fitvars, string fitformula_uu, string fitformula_pu, string fitformula_up, string fitformula_pp, vector<double> initparams, vector<vector<double>> initparamlims, bool use_sumw2error = true, bool use_average_depol = false, bool use_extended_nll = false, bool use_binned_fit = false, string sb_dataset_name = "", string bgfracvar = "", ostream &out = cout)
Fit an asymmetry.
Compute the bin count, bin variable mean values and variances, depolarization variable values and errors, and fit the asymmetry with a binned or unbinned dataset using a maximum likelihood fit method with an optional extended likelihood term. Note that for the maximum likelihood fit, the given asymmetry formulas \( A_{(UU,PU,UP,PP)}(x_0, x_1, ..., a_0, a_1, ..., d_0, d_1, ...) \) will be used internally by getGenAsymPdf() to construct a simultaneous PDF of the form:
\[\begin{split} \begin{aligned} PDF(\lambda_{\ell}, S, &x_0, x_1,..., a_0, a_1,..., d_0, d_1,...) = \\ & 1 + A_{UU}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + \lambda_{\ell} \, \overline{\lambda_{\ell}^2} \, A_{PU}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + S \, \overline{S^2} \, A_{UP}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + \lambda_{\ell} \, S \, \overline{\lambda_{\ell}^2} \, \overline{S^2} \, A_{PP}(\vec{x}, \vec{a}, \vec{d}), \\ \end{aligned} \end{split}\]
where \(A_{UU}\) denotes the unpolarized modulations as well as any transverse target spin asymmetries which may be even under a target spin flip. \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) denote the asymmetries dependent on beam helicity, target spin, or both. The appropriate terms will be dropped if there is no dependence on beam helicity or target spin. A simultaneous fit will be applied over the data subsets distinguished by the beam helicity, target spin, and product of beam helicity and target spin. The x_<int> denote the independent fit variables, the a_<int> denote the asymmetry amplitudes, and the d_<int> denote the depolarization factors. \(\vec{x}\), \(\vec{a}\), and \(\vec{d}\) are shorthand for the sets of these variables.
The variable names in the fit formulas should follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], d_0 \(\rightarrow\) x[N_x+N_a], etc.
In the case that a sideband dataset is supplied via sb_dataset_name, the initial dataset (dataset_name) is taken to be the signal region dataset. Then, a background PDF \(A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d})\) is created identically to the signal PDF \(A_{SG}(\vec{x}, \vec{a}_{SG}, \vec{d})\). The datasets should be created from saga::signal::setBinnedBGFractions() so that the background fraction variable \(\varepsilon\) is already loaded in the workspace and present in the datasets. A simultaneous PDF will be constructed over the combined signal region (\(SG\)) and sideband region (\(SB\)) datasets with the form:
\[\begin{split} PDF(\vec{x}, \vec{a}, \vec{d}) = 1 + \left\{ \begin{array}{ll} \varepsilon(\vec{x}) \, A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d}) + (1 - \varepsilon(\vec{x})) \, A_{SG}(\vec{x}, \vec{a}_{SG}, \vec{d}), & \vec{x} \in SG \\ A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d}), & \vec{x} \in SB \\ \end{array} \right. \end{split}\]
The returned vector will have the following entries:
Bin count
For each bin variable:
Bin variable mean value
Bin variable standard deviation
For each depolarization variable:
Depolarization variable mean value
Depolarization variable standard deviation
The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of
Beam helicity \(\lambda_{\ell}\)
Target spin \(S\)
Beam helicity times target spin \(\lambda_{\ell} \, S\)
For each asymmetry fit parameter:
Asymmetry fit parameter mean value
Asymmetry fit parameter error
The following entries will be appended if using sideband subtraction with binned background fractions:
For each background asymmetry fit parameter:
Asymmetry fit parameter mean value
Asymmetry fit parameter error
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)
tpol – Luminosity averaged target polarization \(\overline{S^2}\)
categories_as_float – List of category variables to treat as asymmetry fit variables and automatically add to PDF formulas
helicity – Name of the helicity variable
tspin – Name of the target spin variable
htspin – Name of the beam helicity times target spin variable
combined_spin_state – Name of the combined spin state variable
binid – Bin unique id
bincut – Kinematic variable cut for bin
binvars – List of kinematic binning variables
depolvars – List of depolarization variables
fitvars – List of asymmetry fit variables
fitformula_uu – The asymmetry formula in ROOT TFormula format for unpolarized terms
fitformula_pu – The asymmetry formula in ROOT TFormula format for beam helicity dependent terms
fitformula_up – The asymmetry formula in ROOT TFormula format for target spin dependent terms
fitformula_pp – The asymmetry formula in ROOT TFormula format for beam helicity and target spin dependent terms
initparams – List of initial values for asymmetry parameters
initparamlims – List of initial asymmetry parameter minimum and maximum bounds
use_sumw2error – Option to use the RooFit::SumW2Error(true) option when fitting to the dataset, which is necessary if using a weighted dataset
use_average_depol – Option to divide out average depolarization in bin instead of including depolarization as an independent variable in the fit
use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization
use_binned_fit – Option to use a binned fit to the data
sb_dataset_name – Name of the sideband dataset to use for the simultaneous fit of the signal + background and background PDFs
bgfracvar – Name of binned background fraction variable passed to saga::signal::setBinnedBGFractions()
out – Output stream
- Returns:
List of bin count, bin variable means and errors, depolarization variable means and errors, fit parameters and errors
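The signal and sideband PDF described for the sb_dataset_name case can be illustrated with a small numerical sketch. The `Region` enum and `sidebandPdfValue` are hypothetical names. The asymmetries \(A_{SG}\) and \(A_{BG}\) and the background fraction \(\varepsilon\) are passed in as plain numbers here, whereas saga evaluates them from the fit formulas and the dataset.

```cpp
#include <cassert>

// Hypothetical label for the dataset region, not part of saga.
enum class Region { Signal, Sideband };

// Evaluate the piecewise form
//   1 + eps*A_BG + (1 - eps)*A_SG   in the signal region,
//   1 + A_BG                        in the sideband region.
double sidebandPdfValue(Region region, double eps, double a_sg, double a_bg) {
    assert(eps >= 0.0 && eps <= 1.0);
    if (region == Region::Signal) {
        return 1.0 + eps * a_bg + (1.0 - eps) * a_sg;
    }
    return 1.0 + a_bg;
}
```

With \(\varepsilon = 0\) the signal region expression reduces to \(1 + A_{SG}\), so the background parameters are then constrained only by the sideband dataset.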
-
void getKinBinnedAsym(string baseoutpath, string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> asymfitvars, vector<string> asymfitvar_titles, vector<vector<double>> asymfitvar_lims, vector<int> asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, double bpol, double tpol, string asymfit_formula_uu, string asymfit_formula_pu, string asymfit_formula_up, string asymfit_formula_pp, vector<double> asymfitpar_inits, vector<vector<double>> asymfitpar_initlims, bool use_sumw2error, bool use_average_depol, bool use_extended_nll, bool use_binned_fit, map<string, string> massfit_yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, bool use_splot, string massfit_sgcut, string massfit_bgcut, bool use_sb_subtraction, bool 
use_binned_sb_bgfracs, map<int, string> asymfitvar_bincuts, string bgfracvar, vector<double> bgfracvar_lims, int bgfrac_idx = 0, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)
Loop over kinematic bins and fit an asymmetry, correcting for background with sideband subtraction or sPlots.
Loop over bin cuts and fit an asymmetry with the saga::analysis::fitAsym() method. Optionally, apply an invariant mass fit and background correction using the sideband subtraction method or the sPlot method from arXiv:physics/0402083. The mass fit will be applied with saga::signal::fitMass() and the sPlot method will use saga::signal::applySPlot().
Results will be saved in a csv file with the following columns:
bin_id: The unique bin id
count: The total number of counts in the bin
For each bin variable binvar
<binvar>: Mean value
<binvar>_err: Standard deviation
For each depolarization variable depolvar
<depolvar>: Mean value
<depolvar>_err: Standard deviation
The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of
Beam helicity \(\lambda_{\ell}\)
Target spin \(S\)
Beam helicity times target spin \(\lambda_{\ell} \, S\)
For each asymmetry fit parameter asymfitpar
<asymfitpar>: Final parameter value
<asymfitpar>_err: Final parameter error
The following columns will be added in the case of a single mass fit applied to the entire bin:
int_sg_pdf_val: Signal PDF integral \(N_{SG}^{PDF}\) in the signal region
int_sg_pdf_err: Signal PDF integral error \(\delta N_{SG}^{PDF}\) in the signal region
int_bg_pdf_val: Background PDF integral \(N_{BG}^{PDF}\) in the signal region
int_bg_pdf_err: Background PDF integral error \(\delta N_{BG}^{PDF}\) in the signal region
int_model_pdf_val: Full PDF integral \(N^{PDF}\) in the signal region
int_model_pdf_err: Full PDF integral error \(\delta N^{PDF}\) in the signal region
int_ds_val: Full dataset sum \(N^{DS}\) in the signal region
int_ds_err: Poissonian error \(\sqrt{N^{DS}}\) of the full dataset sum in the signal region
eps_bg_pdf: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\)
eps_bg_pdf_err: Background fraction error \(\delta\varepsilon_{1}\)
eps_sg_pdf: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\)
eps_sg_pdf_err: Background fraction error \(\delta\varepsilon_{2}\)
eps_pdf: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\)
eps_pdf_err: Background fraction error \(\delta\varepsilon_{3}\)
For each mass fit variable:
<chi2>: \(\chi^2\) value of the 1D projection of the full PDF in that variable
For each mass fit signal PDF parameter massfitpar_sg
<massfitpar_sg>: Final parameter value
<massfitpar_sg>_err: Final parameter error
For each mass fit background PDF parameter massfitpar_bg
<massfitpar_bg>: Final parameter value
<massfitpar_bg>_err: Final parameter error
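The three background fraction definitions listed among the mass fit columns can be written out as a short sketch. The struct `BGFractions` and the function `computeBGFractions` are hypothetical names, not saga functions. The error columns are not computed here because their propagation formulas are not given in this section.

```cpp
#include <cassert>

// Hypothetical container for the three background fraction estimates.
struct BGFractions {
    double eps_bg_pdf; // eps_1 = N_BG^PDF / N^DS
    double eps_sg_pdf; // eps_2 = 1 - N_SG^PDF / N^DS
    double eps_pdf;    // eps_3 = 1 - N_SG^PDF / N^PDF
};

// Compute the three estimates from the signal-region integrals of the
// signal PDF, background PDF, and full PDF, and from the dataset sum.
BGFractions computeBGFractions(double n_sg_pdf, double n_bg_pdf,
                               double n_pdf, double n_ds) {
    assert(n_ds > 0.0 && n_pdf > 0.0);
    BGFractions eps;
    eps.eps_bg_pdf = n_bg_pdf / n_ds;
    eps.eps_sg_pdf = 1.0 - n_sg_pdf / n_ds;
    eps.eps_pdf    = 1.0 - n_sg_pdf / n_pdf;
    return eps;
}
```

The three estimates coincide when the PDF integrals sum to the dataset count, for example 75 signal and 25 background events out of 100 give 0.25 for each. They differ when the fitted model does not reproduce the dataset sum exactly.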
- Parameters:
baseoutpath – Base name prefix for output files
scheme_name – Name of bin scheme and basename of output csv file
frame – ROOT RDataFrame from which to create RooFit datasets
workspace_name – Name of workspace in which to work
workspace_title – Title of workspace in which to work
dataset_name – Dataset name
dataset_title – Dataset title
weight_name – Name of weight variable, ignored if empty
categories_as_float – List of category variables to include as asymmetry fit variables in dataset
helicity – Name of helicity variable
helicity_states – Map of state names to helicity values
tspin – Name of target spin variable
tspin_states – Map of state names to target spin values
htspin – Name of helicity times target spin variable
htspin_states – Map of state names to helicity times target spin values
combined_spin_state – Name of combined spin state variable
bincuts – Map of unique bin id ints to bin variable cuts for bin
binvars – List of kinematic binning variable names
binvar_titles – List of kinematic binning variable titles
binvar_lims – List of kinematic binning variable minimum and maximum bounds
binvar_bins – List of the number of bins for each kinematic binning variable
depolvars – List of depolarization variable names
depolvar_titles – List of depolarization variable titles
depolvar_lims – List of depolarization variable minimum and maximum bounds
depolvar_bins – List of the number of bins for each depolarization variable
asymfitvars – List of asymmetry fit variable names
asymfitvar_titles – List of asymmetry fit variable titles
asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds
asymfitvar_bins – List of the number of bins for each asymmetry fit variable
massfitvars – List of invariant mass fit variable names
massfitvar_titles – List of invariant mass fit variable titles
massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds
massfitvar_bins – List of the number of bins for each invariant mass fit variable
bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)
tpol – Luminosity averaged target polarization \(\overline{S^2}\)
asymfit_formula_uu – The asymmetry formula in ROOT TFormula format for the unpolarized modulations
asymfit_formula_pu – The asymmetry formula in ROOT TFormula format for the beam helicity dependent asymmetries
asymfit_formula_up – The asymmetry formula in ROOT TFormula format for the target spin dependent asymmetries
asymfit_formula_pp – The asymmetry formula in ROOT TFormula format for the beam helicity and target spin dependent asymmetries
asymfitpar_inits – List of initial values for asymmetry fit parameters
asymfitpar_initlims – List of initial asymmetry fit parameter minimum and maximum bounds
use_sumw2error – Option to use the RooFit::SumW2Error(true) option when fitting to the dataset, which is necessary if using a weighted dataset
use_average_depol – Option to divide out average depolarization in bin instead of including depolarization as an independent variable in the fit
use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization
use_binned_fit – Option to use a binned fit to the data
massfit_yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining mass fit arguments. Note that the values specified here will function as the defaults.
massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <pdf_name>_<binid>.
massfit_formula_sg – The signal PDF formula in ROOT TFormula format
massfit_formula_bg – The background PDF formula in ROOT TFormula format
massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset
massfit_parinits_sg – List of signal PDF parameter initial values
massfit_parnames_sg – List of signal PDF parameter names
massfit_partitles_sg – List of signal PDF parameter titles
massfit_parunits_sg – List of signal PDF parameter unit titles
massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds
massfit_parinits_bg – List of background PDF parameter initial values
massfit_parnames_bg – List of background PDF parameter names
massfit_partitles_bg – List of background PDF parameter titles
massfit_parunits_bg – List of background PDF parameter unit titles
massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds
massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable
massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable
use_splot – Option to use sPlot method and perform fit with sWeighted dataset
massfit_sgcut – Signal region cut for sideband subtraction background correction. Note, this will automatically be formed from massfit_sgregion_lims if not specified.
massfit_bgcut – Background region cut for sideband subtraction background correction
use_sb_subtraction – Option to use sideband subtraction for background correction
use_binned_sb_bgfracs – Option to use background fractions from invariant mass fits binned in the asymmetry fit variable for background correction
asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for asymmetry fit variable bins
bgfracvar – Name of binned background fraction variable
bgfracvar_lims – List of binned background fraction variable minimum and maximum bounds
bgfrac_idx – Index to select which formulation to use for the background fraction in saga::signal::setBinnedBGFractions()
massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend for the signal and background mass fit
massfit_lg_text_size – Size of TLegend text for the signal and background mass fit
massfit_lg_margin – Margin of TLegend for the signal and background mass fit
massfit_lg_ncols – Number of columns in TLegend for the signal and background mass fit
massfit_use_sumw2error – Option to use the RooFit::SumW2Error(true) option for the signal and background mass fit, which is necessary if using a weighted dataset
massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the signal and background mass fit
massfit_use_binned_fit – Option to use a binned fit to the data for the signal and background mass fit
out – Output stream
- Throws:
runtime_error – if invalid arguments are provided
-
namespace bins
Functions
-
vector<double> findBinLims(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string varname, const int nbins)
Find bin limits for equal bin statistics.
Given the desired number of bins in a distribution, find the bin limits that will ensure all bins have roughly equal statistics.
- Parameters:
frame – ROOT RDataFrame with which to find bin limits
varname – Bin variable name
nbins – Number of bins
- Returns:
Bin limits
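One common way to obtain bin limits with roughly equal statistics is to sort the values and read off the quantile boundaries. The sketch below shows that idea on a plain vector of values. The function name `equalStatBinLims` is hypothetical, and the use of a sorted copy rather than an RDataFrame is an assumption made only to keep the example self-contained. It is not a description of how findBinLims is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of saga: return nbins+1 limits such that each
// bin holds roughly the same number of entries. Takes the values by copy so
// that the caller's data is not reordered.
std::vector<double> equalStatBinLims(std::vector<double> values, int nbins) {
    assert(nbins > 0 && !values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    std::vector<double> lims;
    lims.reserve(static_cast<std::size_t>(nbins) + 1);
    lims.push_back(values.front()); // lower edge of the first bin
    for (int i = 1; i < nbins; ++i) {
        // Index of the first entry that belongs to bin i.
        const std::size_t idx =
            (n * static_cast<std::size_t>(i)) / static_cast<std::size_t>(nbins);
        lims.push_back(values[idx]);
    }
    lims.push_back(values.back()); // upper edge of the last bin
    return lims;
}
```

For the values {5, 1, 4, 2, 3, 6} and three bins this returns the limits {1, 3, 5, 6}, which places two entries in each bin.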
-
void findNestedBinLims(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, YAML::Node node, string node_name = "", string nbins_key = "nbins", string lims_key = "lims", string nested_key = "nested", vector<string> bin_cuts = {})
Recursively set a map of bin scheme coordinates to bin variable limits for a nested bin scheme.
Given a dataframe and a yaml node defining a nested binning scheme with the desired number of bins specified at each level, recursively set a map of bin scheme coordinates to bin variable limits and a list of bin variables encountered.
- Parameters:
frame – ROOT RDataFrame with which to find bin limits
node – YAML node containing nested bin scheme definition
node_name – Name of YAML node
nbins_key – YAML key for number of bins at current depth
lims_key – YAML key for bin limits at current depth
nested_key – YAML key for nested binning
bin_cuts – List of bin cuts to apply to the dataframe
-
vector<double> getBinLims(const int nbins, double xmin, double xmax)
Compute bin limits on a regular interval between a minimum and maximum.
- Parameters:
nbins – Number of bins
xmin – Minimum bound of bin variable
xmax – Maximum bound of bin variable
- Returns:
Bin limits
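Regularly spaced limits follow directly from the bin width \((x_{max} - x_{min})/N_{bins}\). A minimal sketch is shown below. The name `regularBinLims` is hypothetical and only mirrors the argument order of getBinLims.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of saga: nbins+1 equally spaced limits
// between xmin and xmax, inclusive of both ends.
std::vector<double> regularBinLims(int nbins, double xmin, double xmax) {
    assert(nbins > 0 && xmax > xmin);
    const double step = (xmax - xmin) / nbins;
    std::vector<double> lims;
    lims.reserve(static_cast<std::size_t>(nbins) + 1);
    for (int i = 0; i <= nbins; ++i) {
        lims.push_back(xmin + i * step);
    }
    lims.back() = xmax; // guard the last edge against rounding error
    return lims;
}
```

Calling this with four bins between 0 and 1 gives {0, 0.25, 0.5, 0.75, 1}.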
-
void setNestedBinCuts(vector<string> &cuts, YAML::Node node, vector<string> &old_cuts, string node_name = "", string lims_key = "lims", string nested_key = "nested")
Set binning scheme cuts for a nested binning scheme.
Recursively set a list of bin cuts given a YAML node defining a nested bin scheme. Note that this will set cuts for all bins within the nested bin scheme.
- Parameters:
cuts – List of nested binning cuts to set
node – YAML node defining a nested bin scheme
node_name – Name of nested YAML node
old_cuts – Old list of cuts from previous recursion level
lims_key – YAML key for bin limits
nested_key – YAML key for nested binning
-
map<int, string> getBinCuts(map<string, vector<double>> binscheme, int start_bin_id)
Produce binning scheme cuts for a grid binning scheme.
Produce a map of unique integer bin identifiers to bin cuts given a map of bin variables to their respective bin limits. Note that this will produce cuts for all bins within the grid scheme and bin identifiers by default start at zero but can be made to start at any integer.
- Parameters:
binscheme – Map of bin variable names to their respective bin limits
start_bin_id – Starting unique integer bin identifier
- Returns:
Map of unique integer bin ids to bin cuts
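The grid construction can be sketched as a Cartesian product over the bin limits of each variable. The function name `gridBinCuts` is hypothetical. The cut string format (binvar>=binmin && binvar<=binmax) follows the form quoted for getBinMigration, and both the exact string format and the order in which saga assigns bin ids are assumptions here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of saga: build one cut string per grid cell.
std::map<int, std::string> gridBinCuts(
        const std::map<std::string, std::vector<double>>& binscheme,
        int start_bin_id = 0) {
    assert(!binscheme.empty());
    // Start from a single empty cut and extend it one variable at a time.
    std::vector<std::string> cuts = {""};
    for (const auto& entry : binscheme) {
        const std::string& var = entry.first;
        const std::vector<double>& lims = entry.second;
        assert(lims.size() >= 2);
        std::vector<std::string> extended;
        for (const auto& cut : cuts) {
            for (std::size_t i = 0; i + 1 < lims.size(); ++i) {
                std::ostringstream ss;
                if (!cut.empty()) ss << cut << " && ";
                ss << "(" << var << ">=" << lims[i]
                   << " && " << var << "<=" << lims[i + 1] << ")";
                extended.push_back(ss.str());
            }
        }
        cuts = std::move(extended);
    }
    // Assign consecutive integer ids starting at start_bin_id.
    std::map<int, std::string> bincuts;
    int id = start_bin_id;
    for (const auto& cut : cuts) bincuts[id++] = cut;
    return bincuts;
}
```

Because std::map iterates its keys in sorted order, the variables appear in the cuts alphabetically. A scheme with limits {0, 0.5, 1} for x and {1, 2} for Q2 yields two bins, the first being (Q2>=1 && Q2<=2) && (x>=0 && x<=0.5).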
-
map<string, map<int, string>> getBinCutsMap(YAML::Node node_binschemes, int start_bin_id = 0)
Read a YAML node and create a map of bin scheme names to maps of bin id to cuts.
Produce a map of unique bin scheme names to maps of unique integer bin identifiers to bin cuts given a YAML node containing a map of bin variables to their respective bin limits. Note that this will produce cuts for all bins within the grid scheme and bin identifiers by default start at zero but can be made to start at any integer.
- Parameters:
node_binschemes – YAML node containing bin scheme definitions
start_bin_id – Starting unique integer bin identifier
- Throws:
runtime_error
- Returns:
Map of bin scheme names to maps of unique integer bin ids to bin cuts
-
map<string, vector<string>> getBinSchemesVars(YAML::Node node_binschemes)
Produce a map of bin scheme names to the lists of bin variables used in each bin scheme defined in a given YAML node.
- Parameters:
node_binschemes – YAML node containing bin scheme definitions
- Returns:
Map of bin scheme names to lists of bin variable names used in each scheme
-
map<string, map<int, string>> getBinCutsMapBatch(map<string, map<int, string>> bincuts_map, int nbatches, int ibatch)
Reduce a bin cuts map to a smaller batched version.
Reduce a bin cuts map to a smaller batched version given the total number of batches and the index of the batch. This is useful for parallelizing results computed on a large bin cuts map.
- Parameters:
bincuts_map – Map of bin scheme names to maps of unique integer bin ids to bin cuts
nbatches – Total number of batches
ibatch – Index of the batch \(i\in[0,N_{batches}-1]\)
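The way bins are distributed over batches is not specified here, so the following is only one possible partitioning, shown for a single scheme's map of bin ids to cuts. The name `batchBinCuts` and the choice of contiguous, nearly equal chunks are assumptions, and getBinCutsMapBatch may split the bins differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper, not part of saga: keep only the entries that fall in
// batch ibatch when the map is split into nbatches contiguous chunks.
std::map<int, std::string> batchBinCuts(
        const std::map<int, std::string>& bincuts, int nbatches, int ibatch) {
    assert(nbatches > 0 && ibatch >= 0 && ibatch < nbatches);
    const std::size_t n = bincuts.size();
    const std::size_t nb = static_cast<std::size_t>(nbatches);
    const std::size_t ib = static_cast<std::size_t>(ibatch);
    // Half-open range [begin, end) of positions belonging to this batch.
    const std::size_t begin = (n * ib) / nb;
    const std::size_t end = (n * (ib + 1)) / nb;
    std::map<int, std::string> batch;
    std::size_t pos = 0;
    for (const auto& entry : bincuts) {
        if (pos >= begin && pos < end) batch.insert(entry);
        ++pos;
    }
    return batch;
}
```

With this choice every bin lands in exactly one batch, so running all batch indices \(i\in[0,N_{batches}-1]\) covers the full bin cuts map once.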
-
void getBinMigration(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> binvars, string mc_suffix = "_mc", string weight_name = "")
Compute bin migration fractions and save to a CSV file.
Compute bin migration fraction and save to a CSV file. Note that the truth bin cuts will be inferred from the provided cuts assuming they follow the form (binvar>=binmin && binvar<=binmax).
- Parameters:
frame – ROOT RDataFrame from which to compute bin migration fraction
scheme_name – Bin scheme name
bincuts – Map of unique integer bin identifiers to bin cuts
binvars – List of bin variable names
mc_suffix – Suffix for forming the truth variable names
weight_name – Name of the weight variable, ignored if empty
-
void getBinKinematics(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> kinvars, string weight_name)
Compute bin statistics and kinematics and save to a CSV file.
- Parameters:
frame – ROOT RDataFrame from which to compute bin statistics and kinematics
scheme_name – Bin scheme name, csv file will be named <scheme_name>_kinematics.csv
bincuts – Map of unique integer bin identifiers to bin cuts
kinvars – List of kinematic variable names
weight_name – Name of the weight variable, ignored if empty
-
void saveTH1ToCSV(const TH1 &h1, string csv_name)
Save a TH1 or TH2 ROOT histogram to a CSV file.
Save a TH1 or TH2 ROOT histogram to a CSV file. The CSV file will have columns bin, llimx, count if it is 1D, or binx, biny, llimx, llimy, count if it is 2D. Note that the lower bin limits are written for each bin so there are \(N_{bins}+1\) rows in the CSV file for 1D histograms and \((N_{bins,x}+1)\times(N_{bins,y}+1)\) rows in the CSV file for 2D histograms.
- Parameters:
h1 – ROOT histogram to save to CSV
csv_name – Path to CSV file
- Throws:
runtime_error
-
void getBinKinematicsTH1Ds(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> kinvars, vector<vector<double>> kinvar_lims, vector<int> kinvar_bins, bool save_pdfs = false, bool save_csvs = false)
Create 1D kinematics histograms for each bin and save to a ROOT file.
- Parameters:
frame – ROOT RDataFrame from which to create the kinematics histograms
scheme_name – Bin scheme name, ROOT file will be named <scheme_name>_kinematics.root
bincuts – Map of unique integer bin identifiers to bin cuts
kinvars – List of kinematic variable names
kinvar_lims – List of outer bin limits for each kinematic variable
kinvar_bins – List of number of bins in each kinematic variable
save_pdfs – Option to save 1D histograms as PDFs, files will be named c1_<scheme_name>_bin<bin_id>_<kinvar>.pdf
save_csvs – Option to save 1D histograms as CSVs, files will be named <scheme_name>_bin<bin_id>_<kinvar>.csv
-
void getBinKinematicsTH2Ds(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<vector<string>> kinvars, vector<vector<vector<double>>> kinvar_lims, vector<vector<int>> kinvar_bins, bool save_pdfs = false, bool save_csvs = false)
Create 2D kinematics histograms for each bin and save to a ROOT file.
- Parameters:
frame – ROOT RDataFrame from which to create the kinematics histograms
scheme_name – Bin scheme name, ROOT file will be named <scheme_name>_kinematics.root
bincuts – Map of unique integer bin identifiers to bin cuts
kinvars – List of kinematic variable pairs (x-axis,y-axis) names
kinvar_lims – List of outer bin limits for each kinematic variable pair (x-axis,y-axis)
kinvar_bins – List of number of bins in each kinematic variable pair (x-axis,y-axis)
save_pdfs – Option to save 2D histograms as PDFs, files will be named c2_<scheme_name>_bin<bin_id>_<kinvar_x>_<kinvar_y>.pdf
save_csvs – Option to save 2D histograms as CSVs, files will be named <scheme_name>_bin<bin_id>_<kinvar_x>_<kinvar_y>.csv
-
vector<double> findBinLims(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string varname, const int nbins)
-
namespace data
Typedefs
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
Functions
-
void createDataset(RNode frame, RooWorkspace *w, string name, string title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> &asymfitvars, vector<string> &asymfitvar_titles, vector<vector<double>> &asymfitvar_lims, vector<int> &asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins)
Create a dataset for an asymmetry fit.
Create a RooFit dataset for an asymmetry fit from a ROOT RDataFrame, adding helicity, target spin, binning, depolarization, asymmetry fit, and invariant mass fit variables. Store all variables and RooDataSet in a RooWorkspace.
- Parameters:
frame – ROOT RDataFrame from which to create a RooDataSet
w – RooWorkspace in which to work
name – Dataset name
title – Dataset title
weight_name – Name of weight variable, ignored if empty
categories_as_float – List of category variables to include as floats named <category>_as_float in dataset
helicity – Name of helicity variable
helicity_states – Map of state names to helicity values
tspin – Name of target spin variable
tspin_states – Map of state names to target spin values
htspin – Name of helicity times target spin variable
htspin_states – Map of state names to helicity times target spin values
combined_spin_state – Name of combined spin state variable
binvars – List of kinematic binning variable names
binvar_titles – List of kinematic binning variable titles
binvar_lims – List of kinematic binning variable minimum and maximum bounds
binvar_bins – List of the number of bins in each kinematic binning variable
depolvars – List of depolarization variable names
depolvar_titles – List of depolarization variable titles
depolvar_lims – List of depolarization variable minimum and maximum bounds
depolvar_bins – List of the number of bins in each depolarization variable
asymfitvars – List of asymmetry fit variable names
asymfitvar_titles – List of asymmetry fit variable titles
asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds
asymfitvar_bins – List of the number of bins in each asymmetry fit variable
massfitvars – List of invariant mass fit variable names
massfitvar_titles – List of invariant mass fit variable titles
massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds
massfitvar_bins – List of the number of bins in each invariant mass fit variable
-
template<typename CsvKeyType, typename CsvValueType>
RNode mapDataFromCSV(RNode filtered_df, string rdf_key_col, string csv_path, string csv_key_col, vector<string> col_names, map<string, string> col_aliases, bool readHeaders = true, char delimiter = ',')
Map values from a CSV file into an existing RDataFrame.
Load a CSV file containing, e.g., run-dependent values, with ROOT::RDataFrame::FromCSV. Then, add the data from the requested column names to an existing RDataFrame by matching entries for csv_key_col in the CSV to entries for rdf_key_col in the RDataFrame. Note that column values will automatically be cast to float in the RDataFrame.
- Parameters:
filtered_df – RDataFrame in which to load data from CSV
rdf_key_col – Name of the key column in the RDataFrame
csv_path – Path to the CSV file
csv_key_col – Name of the key column in the CSV
col_names – List of column names for values to map from the CSV file
col_aliases – Map of column names to aliases for defining branches in the RDataFrame
readHeaders – Option to read the headers from the CSV file
delimiter – Delimiter used in the CSV file
- Returns:
RDataFrame with run dependent values loaded from the CSV file
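The key-matching step described above can be pictured with plain standard-library containers. This is only a sketch and not the library's implementation: `mapColumnByKey` is a hypothetical helper, and returning 0 for keys missing from the CSV is an assumption made for illustration.

```cpp
#include <map>
#include <vector>

// Hypothetical helper (not part of saga): for each key found in the
// dataframe-side key column, look up the matching CSV value and cast it to
// float, mirroring the documented cast. Keys absent from the CSV map to 0.0f
// here, which is an assumption and not documented library behavior.
template <typename CsvKeyType, typename CsvValueType>
std::vector<float> mapColumnByKey(const std::vector<CsvKeyType> &rdf_keys,
                                  const std::map<CsvKeyType, CsvValueType> &csv_values) {
    std::vector<float> out;
    out.reserve(rdf_keys.size());
    for (const auto &key : rdf_keys) {
        auto it = csv_values.find(key);
        out.push_back(it != csv_values.end() ? static_cast<float>(it->second) : 0.0f);
    }
    return out;
}
```

With a CSV keyed by run number, every event in the same run receives the same mapped value, which is the typical use for run-dependent quantities such as polarizations.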
-
TRandom *initializeTRandom(UInt_t seed, string trandom_type)
Initialize a TRandom generator.
Initialize a TRandom generator from the available algorithms provided by ROOT.
- Parameters:
seed – Seed for random number generator
trandom_type – Type name of ROOT TRandom number generator
- Throws:
Runtime – Error
- Returns:
TRandom generator of given type initialized with the given seed
-
RNode injectAsym(RNode df, int seed, double bpol, double tpol, string mc_sg_match_name, string asyms_sg_uu_pos_name, string asyms_sg_uu_neg_name, string asyms_sg_pu_pos_name, string asyms_sg_pu_neg_name, string asyms_sg_up_name, string asyms_sg_pp_name, string asyms_bg_uu_pos_name, string asyms_bg_uu_neg_name, string asyms_bg_pu_pos_name, string asyms_bg_pu_neg_name, string asyms_bg_up_name, string asyms_bg_pp_name, string combined_spin_state_name, string helicity_name, string tspin_name, string phi_s_up_name, string phi_s_dn_name, string phi_s_name_injected, string trandom_type)
Inject an asymmetry into an existing RDataFrame.
Inject an asymmetry into an existing ROOT::RDataFrame given a random seed, beam and target polarizations, and the relevant signal and background asymmetry formulas separated into unpolarized modulations and modulations even under transverse target spin flips, i.e., modulations even under a flip of \(\phi_{S}\), as well as asymmetry terms dependent on beam helicity, target spin, or both.
In almost all scenarios, the unpolarized and even \(\phi_{S}\) dependent modulations will not be needed. However, in the case of a term with an even dependence on \(\phi_{S}\), the \(\phi_{S}\) dependence can be injected into the dataset if a variable name is supplied for \(\phi_{S}\) in both spin states via the arguments phi_s_up_name and phi_s_dn_name.
The injection algorithm proceeds as follows. For each event, a random number \(r\in[0,1)\), beam helicity \(\lambda_{\ell}\in(-1,0,1)\), and target spin \(S\in(-1,0,1)\) are all randomly generated. A non-zero \(\lambda_{\ell}\) and \(S\) are generated with probabilities taken from the beam and target polarizations respectively: \(P(\lambda_{\ell}\neq0) = \overline{\lambda_{\ell}^2}\) and \(P(S\neq0) = \overline{S^2}\). Otherwise, positive and negative helicity and spin values are generated with equal probability. The probability \(w\) of accepting the proposed \((\lambda_{\ell},S)\) pair is:
\[\begin{split} w &= \frac{1}{N} \bigg{\{} 1 + A_{UU} + \, S_{||} \, A_{UL} + A_{UT}(\phi^{True}_{S}) \\ & \quad + \,\lambda_{\ell} \, \big{[}A_{LU} + S_{||} \, A_{LL} + A_{LT}(\phi^{True}_{S})\big{]} \bigg{\}}\,, \end{split}\]
where \(N\) is the number of possible combinations of \((\lambda_{\ell},S)\), given whether either has already been set to \(0\). For example, if \((\lambda_{\ell},S)=(0,\pm1)\) or \((\lambda_{\ell},S)=(\pm1,0)\) then \(N=2\), but if \((\lambda_{\ell},S)=(\pm1,\pm1)\) then \(N=4\). Note that since we rely on the fact that the \(A_{UT}\) terms are odd under a transverse target spin flip, this formulation is equivalent to the following
\[\begin{split} w &= \frac{1}{N} \bigg{\{} 1 + A_{UU} + \, S \, A_{UP} \\ & \quad + \,\lambda_{\ell} \, \big{[}A_{PU} + S \, A_{PP} \big{]} \bigg{\}}\,, \end{split}\]
and \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) are the asymmetry terms dependent on beam helicity, target spin, or both. The asymmetry terms will be taken from either the signal or background asymmetries according to the boolean variable mc_sg_match_name indicating signal events. If \(r<w\) the beam helicity and target spin values for that event are accepted, otherwise all random values are regenerated and the process repeats until \(r<w\).
- Parameters:
df – ROOT::RDataFrame in which to inject asymmetry
seed – Seed for random number generator
bpol – Average beam polarization
tpol – Average target polarization
mc_sg_match_name – Name of boolean column indicating signal events
asyms_sg_uu_pos_name – Name of column containing the true signal unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=+1\)
asyms_sg_uu_neg_name – Name of column containing the true signal unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=-1\)
asyms_sg_pu_pos_name – Name of column containing the true signal asymmetries dependent on beam helicity, for \(S_{\perp}=+1\) in the case of a modulation even under transverse target spin flips
asyms_sg_pu_neg_name – Name of column containing the true signal asymmetries dependent on beam helicity, for \(S_{\perp}=-1\) in the case of a modulation even under transverse target spin flips
asyms_sg_up_name – Name of column containing the true signal asymmetries dependent on target spin
asyms_sg_pp_name – Name of column containing the true signal asymmetries dependent on beam helicity and target spin
asyms_bg_uu_pos_name – Name of column containing the true background unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=+1\)
asyms_bg_uu_neg_name – Name of column containing the true background unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=-1\)
asyms_bg_pu_pos_name – Name of column containing the true background asymmetries dependent on beam helicity, for \(S_{\perp}=+1\) in the case of a modulation even under transverse target spin flips
asyms_bg_pu_neg_name – Name of column containing the true background asymmetries dependent on beam helicity, for \(S_{\perp}=-1\) in the case of a modulation even under transverse target spin flips
asyms_bg_up_name – Name of column containing the true background asymmetries dependent on target spin
asyms_bg_pp_name – Name of column containing the true background asymmetries dependent on beam helicity and target spin
combined_spin_state_name – Name of column containing combined beam helicity and target spin state encoded as \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\)
helicity_name – Name of column containing the beam helicity
tspin_name – Name of column containing the target spin
phi_s_up_name – Name of column containing the injected \(\phi_{S}\) variable for \(S_{\perp}=+1\) events
phi_s_dn_name – Name of column containing the injected \(\phi_{S}\) variable for \(S_{\perp}=-1\) events
phi_s_name_injected – Name of column to contain the injected \(\phi_{S}\) variable
trandom_type – Type name of ROOT TRandom number generator
- Throws:
Runtime – Error
- Returns:
ROOT::RDataFrame with helicity and target spin values injected
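The accept/reject loop described for injectAsym can be sketched for a single event as follows. This is an illustration under stated assumptions, not the library's code: it uses `std::mt19937` instead of a ROOT `TRandom` generator, the names `Asyms`, `SpinState`, `drawSpin`, `acceptanceProbability`, and `injectOne` are hypothetical, the asymmetry terms are passed as already-evaluated numbers, and \(N=1\) is assumed when both \(\lambda_{\ell}\) and \(S\) are zero.

```cpp
#include <random>

// Already-evaluated asymmetry terms for one event (hypothetical container).
struct Asyms { double uu, pu, up, pp; };

// Accepted state, including the documented combined encoding
// ss = (helicity + 1) * 10 + (tspin + 1).
struct SpinState { int helicity; int tspin; int combined; };

// Draw a spin value in {-1, 0, +1}: non-zero with probability `pol`,
// and then +1 or -1 with equal probability.
inline int drawSpin(std::mt19937 &rng, double pol) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    if (u(rng) >= pol) return 0;
    return (u(rng) < 0.5) ? +1 : -1;
}

// w = (1/N) { 1 + A_UU + S A_UP + lambda [ A_PU + S A_PP ] }.
// N counts the sign combinations still open: 2 per non-zero quantity
// (N = 1 when both are zero is an assumption of this sketch).
inline double acceptanceProbability(int helicity, int tspin, const Asyms &a) {
    const int n = (helicity != 0 ? 2 : 1) * (tspin != 0 ? 2 : 1);
    return (1.0 + a.uu + tspin * a.up + helicity * (a.pu + tspin * a.pp)) / n;
}

// Regenerate (lambda, S, r) until r < w, then return the accepted state.
inline SpinState injectOne(std::mt19937 &rng, double bpol, double tpol, const Asyms &a) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    while (true) {
        const int h = drawSpin(rng, bpol);
        const int s = drawSpin(rng, tpol);
        const double r = u(rng);
        if (r < acceptanceProbability(h, s, a)) {
            return {h, s, (h + 1) * 10 + (s + 1)};
        }
    }
}
```

For example, with only a beam-helicity term of 0.1 and \(S=0\), the acceptance probabilities for \(\lambda_{\ell}=\pm1\) are \((1\pm0.1)/2\), so accepted events carry the injected helicity asymmetry.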
-
RNode bootstrapPoisson(RNode df, int seed, string weight_name, string trandom_type)
Weight a dataframe by resampling with Poissonian statistics.
Weight an existing ROOT::RDataFrame following the Poissonian bootstrapping method of resampling each event randomly from a Poissonian distribution with mean \(\lambda=1\).
- Parameters:
df – ROOT::RDataFrame to weight
seed – Seed for random number generator
weight_name – Name of column containing the event weights
trandom_type – Type name of ROOT TRandom number generator
- Returns:
ROOT::RDataFrame filtered for non-zero resampling weights
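As a rough illustration of Poissonian bootstrapping, the sketch below draws one resampling weight per event from a Poisson distribution with mean 1. It uses the C++ standard library rather than a ROOT `TRandom` generator, and `poissonBootstrapWeights` is a hypothetical name, so treat it as a picture of the method and not as the library's implementation.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw one Poisson(lambda = 1) resampling weight per event. Events that draw
// a weight of 0 would be dropped by the final filter; the others would carry
// the drawn value as their event weight.
inline std::vector<int> poissonBootstrapWeights(std::size_t n_events, unsigned seed) {
    std::mt19937 rng(seed);
    std::poisson_distribution<int> pois(1.0);
    std::vector<int> weights(n_events);
    for (auto &w : weights) w = pois(rng);
    return weights;
}
```

Because each event is weighted independently, the total resampled size fluctuates around the original number of events instead of being fixed.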
-
RNode bootstrapClassical(RNode df, int n, int seed, string weight_name, string trandom_type)
Weight a dataframe by resampling with replacement.
Weight an existing ROOT::RDataFrame following the classical bootstrapping method of resampling with replacement.
- Parameters:
df – ROOT::RDataFrame to weight
n – Sample size
seed – Seed for random number generator
weight_name – Name of column containing the event weights
trandom_type – Type name of ROOT TRandom number generator
- Throws:
Runtime – Error
- Returns:
ROOT::RDataFrame filtered for non-zero resampling weights
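Classical resampling with replacement can be expressed as per-event multiplicities: draw `n` event indices uniformly and count how often each event is picked. The sketch below shows that idea with the C++ standard library; `classicalBootstrapWeights` is a hypothetical name and the code is not taken from the library.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw n indices uniformly with replacement from [0, n_events) and return how
// many times each event was drawn. The multiplicities sum to n, and events
// with multiplicity 0 would be removed by the final filter.
inline std::vector<int> classicalBootstrapWeights(std::size_t n_events, std::size_t n,
                                                  unsigned seed) {
    std::vector<int> weights(n_events, 0);
    if (n_events == 0) return weights;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, n_events - 1);
    for (std::size_t i = 0; i < n; ++i) ++weights[pick(rng)];
    return weights;
}
```

Unlike the Poissonian variant, the total resampled size here is exactly `n`.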
-
template<typename RetType>
RetType get_weighted_count(RNode df, string weight_name)
Compute a weighted count.
Compute the weighted count of a ROOT::RDataFrame.
- Parameters:
df – RDataFrame to use
weight_name – Name of the weight column in the RDataFrame, ignored if empty
- Returns:
The weighted count
-
template<typename RetType>
RetType get_weighted_mean(RNode df, string var_name, string weight_name)
Compute a weighted mean.
Compute the weighted mean of a variable in a ROOT::RDataFrame.
- Parameters:
df – RDataFrame to use
var_name – Name of the variable for which to compute the mean
weight_name – Name of the weight column in the RDataFrame, ignored if empty
- Returns:
The weighted mean of the variable
-
template<typename RetType>
RetType get_weighted_stddev(RNode df, string var_name, string weight_name, RetType mean = 0.0)
Compute a weighted standard deviation.
Compute the weighted standard deviation of a variable in a ROOT::RDataFrame.
- Parameters:
df – RDataFrame to use
var_name – Name of the variable for which to compute the standard deviation
weight_name – Name of the weight column in the RDataFrame, ignored if empty
mean – Value of the weighted mean if already available
- Returns:
The weighted standard deviation of the variable
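The three weighted statistics above can be written out on plain vectors as a sketch. The function names are hypothetical, and the normalization of the standard deviation (dividing by the sum of weights, with no small-sample correction) is an assumption; the library's definition may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted count: the sum of the event weights.
inline double weightedCount(const std::vector<double> &w) {
    double sum = 0.0;
    for (double wi : w) sum += wi;
    return sum;
}

// Weighted mean: sum(w_i * x_i) / sum(w_i). Assumes x and w have equal length.
inline double weightedMean(const std::vector<double> &x, const std::vector<double> &w) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += w[i] * x[i];
    return sum / weightedCount(w);
}

// Weighted standard deviation: sqrt( sum(w_i * (x_i - mean)^2) / sum(w_i) ).
// The normalization is an assumption of this sketch.
inline double weightedStddev(const std::vector<double> &x, const std::vector<double> &w) {
    const double mean = weightedMean(x, w);
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += w[i] * (x[i] - mean) * (x[i] - mean);
    return std::sqrt(sum / weightedCount(w));
}
```

With unit weights these reduce to the ordinary count, mean, and population standard deviation.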
-
RNode defineAngularDiffVars(RNode frame, vector<string> particle_suffixes, string theta_name = "theta", string phi_name = "phi", string mc_suffix = "_mc")
Define Monte Carlo (MC) simulation angular difference variables.
Define the angular difference variables by taking the difference of reconstructed and MC values:
\(\Delta\theta = |\theta_{Rec} - \theta_{MC}|\)
\(\Delta\phi = |\phi_{Rec} - \phi_{MC}|\).
Note that \(\phi\) is a cyclic variable on \(2\pi\), so if \(|\phi_{Rec} - \phi_{MC}|>\pi\) then:
\(\Delta\phi = 2\pi - |\phi_{Rec} - \phi_{MC}|\).
- Parameters:
frame – ROOT::RDataFrame in which to define angular difference variables
particle_suffixes – Suffixes of particle variables to define angular difference variables for
theta_name – Name of theta variable
phi_name – Name of phi variable
mc_suffix – Suffix of MC variables
- Returns:
ROOT::RDataFrame with angular difference variables defined
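The two definitions above, including the cyclic treatment of \(\phi\), amount to the following per-event arithmetic. The function names `deltaTheta` and `deltaPhi` are hypothetical, and the sketch assumes both \(\phi\) values already lie within a single \(2\pi\) range so that one wrap is sufficient.

```cpp
#include <cmath>

// Delta theta = |theta_Rec - theta_MC|.
inline double deltaTheta(double theta_rec, double theta_mc) {
    return std::fabs(theta_rec - theta_mc);
}

// Delta phi = |phi_Rec - phi_MC|, replaced by 2*pi - |phi_Rec - phi_MC|
// whenever the raw difference exceeds pi.
inline double deltaPhi(double phi_rec, double phi_mc) {
    const double pi = std::acos(-1.0);
    const double d = std::fabs(phi_rec - phi_mc);
    return (d > pi) ? 2.0 * pi - d : d;
}
```

For instance, \(\phi_{Rec}=3.0\) and \(\phi_{MC}=-3.0\) are close on the circle, and the wrapped difference is \(2\pi-6\approx0.28\) instead of 6.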
-
namespace hbanalysis
Typedefs
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
Functions
-
vector<double> fitHB(RooWorkspace *w, string dataset_name, double bpol, double tpol, double alpha, string helicity, string tspin, string htspin, string combined_spin_state, string binid, string bincut, vector<string> binvars, vector<string> depolvars, vector<string> fitvars, ostream &out = cout)
Fit an asymmetry with the Helicity Balance (HB) method.
Compute the bin count, bin variable mean values and variances, depolarization variable values and errors, and fit the asymmetry with the Helicity Balance (HB) method. The asymmetry parameter will be computed with:
\[ D^{\Lambda}_{LL'} = \frac{1}{\alpha_{\Lambda} \overline{\lambda_{\ell}^2}}\frac{\sum^{N_{\Lambda}}_{i=1}\lambda_{\ell,i} \cos{\theta_{LL'}^i}}{\sum^{N_{\Lambda}}_{i=1}D(y_i) \cos^2{\theta_{LL'}^i}} \,. \]
Here, \(\lambda_{\ell,i}\) indicates the beam helicity for a given event \(i\), and \(\overline{\lambda^{2}_{\ell}}\) is the luminosity averaged beam polarization. The method relies on the assumption that the luminosity averaged helicity \(\overline{\lambda_{\ell}}=0\) to allow the acceptance method to cancel out. See Gunar Schnell’s thesis from New Mexico State University, 1999 for a full derivation. \(N_{\Lambda}\) is the number of \(\Lambda\) events in the bin, and \(D(y_i)\) and \(\cos{\theta_{LL'}^i}\) are the depolarization factor and the decay angle in the \(\Lambda\) CM frame respectively for the given event.
Similarly, the error will be computed as follows. Letting
\[\begin{split} \begin{aligned} A& = \lambda_{\ell,i} \cos{\theta_{LL'}^i}, \\ B& =D(y_i) \cos^2{\theta_{LL'}^i} \,, \end{aligned} \end{split}\]
the statistical scale uncertainty may be expressed as
\[\begin{split} \begin{aligned} \bigg{(}\frac{\delta D^{\Lambda}_{LL'}}{D^{\Lambda}_{LL'}}\bigg{)}^2& = \bigg{[}\delta\bigg{(}\frac{\text{Sum}[A]}{\text{Sum}[B]}\bigg{)}\bigg{/}\bigg{(}\frac{\text{Sum}[A]}{\text{Sum}[B]}\bigg{)}\bigg{]}^2 \\ & = \bigg{(} \frac{\text{Var}[A]^2}{\text{Sum}[A]^2} + \frac{\text{Var}[B]^2}{\text{Sum}[B]^2} - 2 \frac{\text{Var}[AB]}{\text{Sum}[A]\text{Sum}[B]}\bigg{)} \,. \end{aligned} \end{split}\]
Here, \(\text{Var}[X_i] = \sum_i (X_i - \text{Mean}[X_i])^2\) denotes the variance of a quantity \(X_i\), and Sum and Mean are exactly the operations named. Since both \(A\) and \(B\) are polynomial functions of \(\cos{\theta_{LL'}}\), \(A\) and \(B\) are correlated. Thus, we include the covariance term \(\text{Var}[AB]\) in the uncertainty calculation.
The returned vector will have the following entries:
Bin count
For each bin variable:
Bin variable mean value
Bin variable standard deviation
For each depolarization variable:
Depolarization variable mean value
Depolarization variable standard deviation
The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of
Beam helicity \(\lambda_{\ell}\)
Target spin \(S\)
Beam helicity times target spin \(\lambda_{\ell} \, S\)
For the (only) Helicity Balance parameter:
Helicity Balance parameter mean value
Helicity Balance parameter error
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)
tpol – Luminosity averaged target polarization \(\overline{S^2}\)
alpha – Lambda decay asymmetry parameter \(\alpha_{\Lambda}\)
helicity – Name of the helicity variable
tspin – Name of the target spin variable
htspin – Name of the beam helicity times target spin variable
combined_spin_state – Name of the combined spin state variable
binid – Bin unique id
bincut – Kinematic variable cut for bin
binvars – List of kinematic binning variables
depolvars – List of depolarization variables
fitvars – List of asymmetry fit variables
out – Output stream
- Throws:
Runtime – error
- Returns:
List of bin count, bin variable means and errors, depolarization variable means and errors, fit parameters and errors
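The central value of the Helicity Balance estimator defined above is a ratio of two event sums, which the following sketch evaluates on plain vectors. It covers only the central value (not the uncertainty), `helicityBalance` is a hypothetical name, and the input checks are an assumption of this sketch and not documented behavior of fitHB.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// D_LL' = (1 / (alpha * bpol)) * sum_i(lambda_i * cos_i) / sum_i(D(y_i) * cos_i^2),
// where cos_i is cos(theta_LL') for event i and bpol is the luminosity
// averaged beam polarization.
inline double helicityBalance(const std::vector<int> &helicity,
                              const std::vector<double> &cos_theta,
                              const std::vector<double> &depol,
                              double alpha, double bpol) {
    if (helicity.size() != cos_theta.size() || helicity.size() != depol.size()) {
        throw std::runtime_error("helicityBalance: input vectors differ in length");
    }
    double num = 0.0; // Sum[A], with A = lambda_i * cos_i
    double den = 0.0; // Sum[B], with B = D(y_i) * cos_i^2
    for (std::size_t i = 0; i < helicity.size(); ++i) {
        num += helicity[i] * cos_theta[i];
        den += depol[i] * cos_theta[i] * cos_theta[i];
    }
    if (den == 0.0 || alpha == 0.0 || bpol == 0.0) {
        throw std::runtime_error("helicityBalance: zero denominator");
    }
    return num / (alpha * bpol * den);
}
```

No fit is involved in the central value: the estimator is computed directly from per-event quantities, which is why only one asymmetry parameter is returned for this method.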
-
void getKinBinnedHB(string baseoutpath, string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> asymfitvars, vector<string> asymfitvar_titles, vector<vector<double>> asymfitvar_lims, vector<int> asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, double bpol, double tpol, double alpha, map<string, string> massfit_yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, bool use_splot, string massfit_sgcut, string massfit_bgcut, bool use_sb_subtraction, bool use_binned_sb_bgfracs, map<int, string> asymfitvar_bincuts, string bgfracvar, vector<double> bgfracvar_lims, int bgfrac_idx = 0, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool 
massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)
Loop kinematic bins and extract an asymmetry with the Helicity Balance method, correcting for background with sideband subtraction or sPlots.
Loop over bin cuts and fit an asymmetry with the saga::hbanalysis::fitHB() method. Optionally, apply an invariant mass fit and background correction using the sideband subtraction method or the sPlot method from arXiv:physics/0402083. The mass fit will be applied with saga::signal::fitMass() and the sPlot method will use saga::signal::applySPlot().
Results will be saved in a csv file with the following columns:
bin_id: The unique bin id
count: The total number of counts in the bin
For each bin variable binvar
<binvar>: Mean value
<binvar>_err: Standard deviation
For each depolarization variable depolvar
<depolvar>: Mean value
<depolvar>_err: Standard deviation
The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of
Beam helicity \(\lambda_{\ell}\)
Target spin \(S\)
Beam helicity times target spin \(\lambda_{\ell} \, S\)
For each asymmetry fit parameter asymfitpar (only one allowed for the HB method)
<asymfitpar>: Final parameter value
<asymfitpar>_err: Final parameter error
The following columns will be added in the case of a single mass fit applied to the entire bin:
int_sg_pdf_val: Signal PDF integral \(N_{SG}^{PDF}\) value in the signal region
int_sg_pdf_err: Signal PDF integral error \(\delta N_{SG}^{PDF}\) in the signal region
int_bg_pdf_val: Background PDF integral \(N_{BG}^{PDF}\) in the signal region
int_bg_pdf_err: Background PDF integral error \(\delta N_{BG}^{PDF}\) in the signal region
int_model_pdf_val: Full PDF integral \(N^{PDF}\) in the signal region
int_model_pdf_err: Full PDF integral error \(\delta N^{PDF}\) in the signal region
int_ds_val: Full dataset sum \(N^{DS}\) in the signal region
int_ds_err: Poissonian error \(\sqrt{N^{DS}}\) of the full dataset sum in the signal region
eps_bg_pdf: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\)
eps_bg_pdf_err: Background fraction error \(\delta\varepsilon_{1}\)
eps_sg_pdf: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\)
eps_sg_pdf_err: Background fraction error \(\delta\varepsilon_{2}\)
eps_pdf: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\)
eps_pdf_err: Background fraction error \(\delta\varepsilon_{3}\)
For each mass fit variable:
<chi2>: \(\chi^2\) value of the 1D projection of the full PDF in that variable
For each mass fit signal PDF parameter massfitpar_sg
<massfitpar_sg>: Final parameter value
<massfitpar_sg>_err: Final parameter error
For each mass fit background PDF parameter massfitpar_bg
<massfitpar_bg>: Final parameter value
<massfitpar_bg>_err: Final parameter error
- Parameters:
baseoutpath – Base name prefix for output files
scheme_name – Name of the bin scheme and basename of the output csv file
frame – ROOT RDataFrame from which to create RooFit datasets
workspace_name – Name of workspace in which to work
workspace_title – Title of workspace in which to work
dataset_name – Dataset name
dataset_title – Dataset title
weight_name – Name of weight variable, ignored if empty
categories_as_float – List of category variables to include as asymmetry fit variables in dataset
helicity – Name of helicity variable
helicity_states – Map of state names to helicity values
tspin – Name of target spin variable
tspin_states – Map of state names to target spin values
htspin – Name of helicity times target spin variable
htspin_states – Map of state names to helicity times target spin values
combined_spin_state – Name of combined spin state variable
bincuts – Map of unique bin id ints to bin variable cuts for bin
binvars – List of kinematic binning variable names
binvar_titles – List of kinematic binning variable titles
binvar_lims – List of kinematic binning variable minimum and maximum bounds
binvar_bins – List of the number of bins in each kinematic binning variable
depolvars – List of depolarization variable names
depolvar_titles – List of depolarization variable titles
depolvar_lims – List of depolarization variable minimum and maximum bounds
depolvar_bins – List of the number of bins in each depolarization variable
asymfitvars – List of asymmetry fit variable names
asymfitvar_titles – List of asymmetry fit variable titles
asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds
asymfitvar_bins – List of the number of bins in each asymmetry fit variable
massfitvars – List of invariant mass fit variable names
massfitvar_titles – List of invariant mass fit variable titles
massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds
massfitvar_bins – List of the number of bins in each invariant mass fit variable
bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)
tpol – Luminosity averaged target polarization \(\overline{S^2}\)
alpha – Lambda decay asymmetry parameter \(\alpha_{\Lambda}\)
massfit_yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining mass fit arguments. Note that the values specified here will function as the defaults.
massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <pdf_name>_<binid>.
massfit_formula_sg – The signal PDF formula in ROOT TFormula format
massfit_formula_bg – The background PDF formula in ROOT TFormula format
massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset
massfit_parinits_sg – List of signal PDF parameter initial values
massfit_parnames_sg – List of signal PDF parameter names
massfit_partitles_sg – List of signal PDF parameter titles
massfit_parunits_sg – List of signal PDF parameter unit titles
massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds
massfit_parinits_bg – List of background PDF parameter initial values
massfit_parnames_bg – List of background PDF parameter names
massfit_partitles_bg – List of background PDF parameter titles
massfit_parunits_bg – List of background PDF parameter unit titles
massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds
massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable
massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable
use_splot – Option to use sPlot method and perform fit with sWeighted dataset
massfit_sgcut – Signal region cut for sideband subtraction background correction. Note, this will automatically be formed from massfit_sgregion_lims if not specified.
massfit_bgcut – Background region cut for sideband subtraction background correction
use_sb_subtraction – Option to use sideband subtraction for background correction
use_binned_sb_bgfracs – Option to use background fractions from invariant mass fits binned in the asymmetry fit variable for background correction
asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for asymmetry fit variable bins
bgfracvar – Name of binned background fraction variable
bgfracvar_lims – List of binned background fraction variable minimum and maximum bounds
bgfrac_idx – Index to select which formulation to use for the background fraction in saga::signal::setBinnedBGFractions()
massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend for the signal and background mass fit
massfit_lg_text_size – Size of TLegend text for the signal and background mass fit
massfit_lg_margin – Margin of TLegend for the signal and background mass fit
massfit_lg_ncols – Number of columns in TLegend for the signal and background mass fit
massfit_use_sumw2error – Option to use RooFit::SumW2Error(true) option for the signal and background mass fit which is necessary if using a weighted dataset
massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the signal and background mass fit
massfit_use_binned_fit – Option to use a binned fit to the data for the signal and background mass fit
out – Output stream
- Throws:
runtime_error – if invalid arguments are provided
-
namespace resolution
Typedefs
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
Functions
-
vector<double> fitResolution(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> resfitvars, string yamlfile, string pdf_name = "gauss", string fitformula = "gaus(x[0],x[1],x[2],x[3])", vector<string> parnames = {"constant", "mu", "sigma"}, vector<string> partitles = {"C", "#mu", "#sigma"}, vector<string> parunits = {"", "", ""}, vector<double> parinits = {1.0, 0.0, 0.1}, vector<vector<double>> parlims = {{1.0, 1.0}, {-1.0, 1.0}, {0.0, 1.0}}, string plot_title = "Fit Resolution", double lg_text_size = 0.04, double lg_margin = 0.1, int lg_ncols = 1, bool use_sumw2error = false, bool use_extended_nll = true, bool use_binned_fit = false, ostream &out = cout)
Fit a resolution distribution.
Fit a resolution distribution, that is, the difference in reconstructed and true values \(\Delta X = X_{Rec} - X_{True}\) with a generic PDF, although this will default to Gaussian. Starting parameter values and limits may be loaded from a yaml file for each bin.
The returned vector will contain, in order:
The total bin count
The average bin variable means and corresponding standard deviations
The \(\chi^2/NDF\) of the fit from a 1D histogram in each fit variable
The parameter value and error for each fit parameter
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
binid – Bin unique id
bincut – Kinematic variable cut for bin
binvars – List of kinematic binning variables
resfitvars – List of resolution fit variables
yamlfile – Path to YAML file specifying the remaining fit arguments
pdf_name – Base name of PDF. Note, the actual PDF name will be: <pdf_name>_<binid>.
fitformula – The PDF formula in ROOT TFormula format
parnames – List of PDF parameter names
partitles – List of PDF parameter titles
parunits – List of PDF parameter unit titles
parinits – List of PDF parameter initial values
parlims – List of PDF parameter minimum and maximum bounds
plot_title – Title of fit plot
lg_text_size – Size of TLegend text
lg_margin – Margin of TLegend
lg_ncols – Number of columns in TLegend
use_sumw2error – Option to use RooFit::SumW2Error(true) option when fitting to dataset which is necessary if using a weighted dataset
use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization
use_binned_fit – Option to use a binned fit to the data
out – Output stream
- Returns:
List containing fit results
-
void getKinBinnedResolutions(string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> resfitvars, vector<string> resfitvar_titles, vector<vector<double>> resfitvar_lims, vector<int> resfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, map<string, string> yamlfile_map, string pdf_name, string fitformula, vector<string> parnames, vector<string> partitles, vector<string> parunits, vector<double> parinits, vector<vector<double>> parlims, double lg_text_size = 0.04, double lg_margin = 0.1, int lg_ncols = 1, bool use_sumw2error = false, bool use_extended_nll = true, bool use_binned_fit = false, ostream &out = cout)
Loop over kinematic bins and fit a resolution distribution.
Loop over bin cuts and fit a resolution distribution with the saga::resolution::fitResolution() method.
Results will be saved in a csv file with the following columns:
bin_id: The unique bin id
count: The total number of counts in the bin
For each bin variable binvar:
<binvar>: Mean value
<binvar>_err: Standard deviation
For each independent fit variable:
<chi2ndf>: \(\chi^2/NDF\) value of the 1D projection of the full PDF in that variable
For each resolution fit PDF parameter fitpar:
<fitpar>: Final parameter value
<fitpar>_err: Final parameter error
- Parameters:
scheme_name – Name of the bin scheme and basename of the output csv file
frame – ROOT RDataframe from which to create RooFit datasets
workspace_name – Name of workspace in which to work
workspace_title – Title of workspace in which to work
dataset_name – Dataset name
dataset_title – Dataset title
weight_name – Name of weight variable, ignored if empty
categories_as_float – List of category variables to include as asymmetry fit variables in dataset
helicity – Name of helicity variable
helicity_states – Map of state names to helicity values
tspin – Name of target spin variable
tspin_states – Map of state names to target spin values
htspin – Name of helicity times target spin variable
htspin_states – Map of state names to helicity times target spin values
combined_spin_state – Name of combined spin state variable
bincuts – Map of unique bin id ints to bin variable cuts for bin
binvars – List of kinematic binning variable names
binvar_titles – List of kinematic binning variable titles
binvar_lims – List of kinematic binning variable minimum and maximum bounds
binvar_bins – List of the number of bins for each kinematic binning variable
depolvars – List of depolarization variable names
depolvar_titles – List of depolarization variable titles
depolvar_lims – List of depolarization variable minimum and maximum bounds
depolvar_bins – List of the number of bins for each depolarization variable
resfitvars – List of resolution fit variable names
resfitvar_titles – List of resolution fit variable titles
resfitvar_lims – List of resolution fit variable minimum and maximum bounds
resfitvar_bins – List of the number of bins for each resolution fit variable
massfitvars – List of invariant mass fit variable names
massfitvar_titles – List of invariant mass fit variable titles
massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds
massfitvar_bins – List of the number of bins for each invariant mass fit variable
yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining resolution fit arguments. Note that the values specified here will function as the defaults.
pdf_name – Base name of the resolution PDF. Note, the actual PDF name will be: <pdf_name>_<binid>.
fitformula – The resolution PDF formula in ROOT TFormula format
parnames – List of resolution PDF parameter names
partitles – List of resolution PDF parameter titles
parunits – List of resolution PDF parameter unit titles
parinits – List of resolution PDF parameter initial values
parlims – List of resolution PDF parameter minimum and maximum bounds
lg_text_size – Size of TLegend text for the signal and background mass fit
lg_margin – Margin of TLegend for the signal and background mass fit
lg_ncols – Number of columns in TLegend for the signal and background mass fit
use_sumw2error – Option to use the RooFit::SumW2Error(true) option for the signal and background mass fit, which is necessary if using a weighted dataset
use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the signal and background mass fit
use_binned_fit – Option to use a binned fit to the data for the signal and background mass fit
out – Output stream
- Throws:
Runtime error
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
-
namespace signal
Typedefs
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
Functions
-
vector<string> getGenMassPdf(RooWorkspace *w, RooArgSet *argset_sg, RooArgSet *argset_bg, string pdf_name, string sgYield_name, string bgYield_name, string binid, string fitformula_sg, string fitformula_bg, double initsgfrac, int count, bool use_extended_nll)
Create a PDF for fitting a combined signal and background invariant mass distribution.
Create a PDF given the formulas for the signal and background distributions. The PDF will be constructed internally using RooGenericPdf in the form:
\[\begin{split} \begin{aligned} PDF(x_0, x_1, ..., &a_{SG,0}, a_{SG,1}, ..., a_{BG,0}, a_{BG,1}, ...) = \\ & u \, f_{SG}(\vec{x}, \vec{a}_{SG}) \\ & + (1-u) \, f_{BG}(\vec{x}, \vec{a}_{BG}), \\ \end{aligned} \end{split}\]
where \(u\) is the ratio of signal events to total events in the dataset and \(f_{C}(x_0, x_1, ..., a_{C,0}, a_{C,1}, ...)\) for \(C\in(SG,BG)\) are the given signal and background PDFs. The x_<int> denote the independent fit variables, i.e., the mass variables, and the a_<int> denote the dependent fit variables, i.e., the signal and background PDF parameters.
The variable names in each fit formula should (separately) follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], a_1 \(\rightarrow\) x[N_x+1], etc.
The indexing of parameters in the fit formula is separate for the signal and background PDFs.
- Parameters:
w – RooWorkspace in which to work
argset_sg – Argument set for the signal PDF
argset_bg – Argument set for the background PDF
pdf_name – Base name of full PDF
sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
binid – Unique bin id, used to name the PDF
fitformula_sg – The signal PDF formula in ROOT TFormula format
fitformula_bg – The background PDF formula in ROOT TFormula format
initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset
count – Bin count
use_extended_nll – Option to use an extended likelihood term
- Returns:
List of the names of the combined, signal, and background PDFs and the signal and background yield variables, in that order
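As an illustration of the combined form \(u \, f_{SG} + (1-u) \, f_{BG}\) described above, the following minimal sketch evaluates such a mixture in plain C++ without RooFit. The Gaussian signal shape, the flat background shape, and all names (gaussSketch, flatSketch, mixturePdfSketch) are assumptions chosen for this example and are not part of the saga API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical signal shape f_SG(x; mu, sigma): a normalized Gaussian.
double gaussSketch(double x, double mu, double sigma) {
    const double pi = std::acos(-1.0);
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

// Hypothetical background shape f_BG(x): flat and normalized on [xmin, xmax].
double flatSketch(double x, double xmin, double xmax) {
    return (x >= xmin && x <= xmax) ? 1.0 / (xmax - xmin) : 0.0;
}

// Combined PDF u * f_SG + (1 - u) * f_BG, where u is the signal fraction.
double mixturePdfSketch(double x, double u, double mu, double sigma,
                        double xmin, double xmax) {
    assert(u >= 0.0 && u <= 1.0);  // u is a fraction of events
    return u * gaussSketch(x, mu, sigma)
         + (1.0 - u) * flatSketch(x, xmin, xmax);
}
```

With u = 1 the mixture reduces to the signal shape alone, and with u = 0 it reduces to the background shape, which is a quick sanity check on any pair of formulas passed as fitformula_sg and fitformula_bg.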
-
vector<double> fitMass(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> fitvars, string yamlfile, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)
Fit a combined signal and background mass distribution.
Compute the bin count, bin variable mean values and variances, and fit the combined signal and background mass distribution with a binned or unbinned dataset using a maximum likelihood fit method with an optional extended likelihood term. Note that for the maximum likelihood fit, the given PDF formulas \(f_{(SG,BG)}(x_0, x_1, ..., a_{SG,0}, a_{SG,1}, ..., a_{BG,0}, a_{BG,1}, ...)\) will be used internally by getGenMassPdf() to construct a PDF of the form:
\[\begin{split} \begin{aligned} PDF(x_0, x_1, ..., &a_{SG,0}, a_{SG,1}, ..., a_{BG,0}, a_{BG,1}, ...) = \\ & u \, f_{SG}(\vec{x}, \vec{a}_{SG}) \\ & + (1-u) \, f_{BG}(\vec{x}, \vec{a}_{BG}), \\ \end{aligned} \end{split}\]
where \(u\) is the ratio of signal events to total events in the dataset. The x_<int> denote the independent fit variables, i.e., the mass variables, and the a_<int> denote the dependent fit variables, i.e., the signal and background PDF parameters.
The variable names in each fit formula should (separately) follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], a_1 \(\rightarrow\) x[N_x+1], etc.
The mass fit parameters (massfit_*) may optionally be loaded from a yaml file containing all these parameters by name to allow one to specify bin-dependent starting parameters, limits, and even PDF formulas.
The returned vector will contain, in order:
Total bin count
Signal PDF integral \(N_{SG}^{PDF}\) and error in the signal region
Background PDF integral \(N_{BG}^{PDF}\) and error in the signal region
Full PDF integral \(N^{PDF}\) and error in the signal region
Full dataset sum \(N^{DS}\) and Poissonian error \(\sqrt{N^{DS}}\) in the signal region
Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\) and error
Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\) and error
Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\) and error
\(\chi^2/NDF\) computed from the 1D histogram in each fit variable
Bin variable mean values and standard deviations
Signal PDF parameters and errors
Background PDF parameters and errors
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
binid – Bin unique id
bincut – Kinematic variable cut for bin
binvars – List of kinematic binning variables
fitvars – List of mass fit variables
yamlfile – Path to YAML file specifying the remaining mass fit arguments
massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <massfit_pdf_name>_<binid>.
massfit_formula_sg – The signal PDF formula in ROOT TFormula format
massfit_formula_bg – The background PDF formula in ROOT TFormula format
massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset
massfit_parinits_sg – List of signal PDF parameter initial values
massfit_parnames_sg – List of signal PDF parameter names
massfit_partitles_sg – List of signal PDF parameter titles
massfit_parunits_sg – List of signal PDF parameter unit titles
massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds
massfit_parinits_bg – List of background PDF parameter initial values
massfit_parnames_bg – List of background PDF parameter names
massfit_partitles_bg – List of background PDF parameter titles
massfit_parunits_bg – List of background PDF parameter unit titles
massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds
massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable
massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable
massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend
massfit_lg_text_size – Size of TLegend text
massfit_lg_margin – Margin of TLegend
massfit_lg_ncols – Number of columns in TLegend
massfit_use_sumw2error – Option to use the RooFit::SumW2Error(true) option when fitting to the dataset, which is necessary if using a weighted dataset
massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization
massfit_use_binned_fit – Option to use a binned fit to the data
out – Output stream
- Returns:
List of bin count, bin variable means and errors, fit parameters and errors
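The three background fraction definitions \(\varepsilon_{1}\), \(\varepsilon_{2}\), and \(\varepsilon_{3}\) listed above differ only in which integrals enter the ratio. The sketch below computes the central values from the four signal-region quantities; the function name bgFractionsSketch and its argument names are hypothetical, and error propagation is deliberately left out because the documentation does not specify how the errors are computed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns {eps1, eps2, eps3} from signal-region integrals and sums:
//   n_sg_pdf : signal PDF integral      N_SG^PDF
//   n_bg_pdf : background PDF integral  N_BG^PDF
//   n_pdf    : full PDF integral        N^PDF
//   n_ds     : full dataset sum         N^DS
std::vector<double> bgFractionsSketch(double n_sg_pdf, double n_bg_pdf,
                                      double n_pdf, double n_ds) {
    assert(n_pdf > 0.0 && n_ds > 0.0);  // avoid division by zero
    const double eps1 = n_bg_pdf / n_ds;        // N_BG^PDF / N^DS
    const double eps2 = 1.0 - n_sg_pdf / n_ds;  // 1 - N_SG^PDF / N^DS
    const double eps3 = 1.0 - n_sg_pdf / n_pdf; // 1 - N_SG^PDF / N^PDF
    return {eps1, eps2, eps3};
}
```

When the PDF integral matches the dataset sum and the signal and background integrals add up to it, all three definitions agree; differences between them indicate how well the fitted PDF reproduces the data in the signal region.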
-
void applySPlot(RooWorkspace *w, string dataset_name, string weight_name, string sgYield_name, string bgYield_name, string model_name, string dataset_sg_name, string dataset_bg_name)
Apply the sPlot method from arXiv:physics/0402083.
Apply sPlot method from arXiv:physics/0402083 given a dataset, yield variables, and a PDF model and add the sWeighted datasets to the workspace.
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
weight_name – Name of existing weight variable, ignored if empty
sgYield_name – Signal yield variable name
bgYield_name – Background yield variable name
model_name – Full PDF name
dataset_sg_name – Name of dataset with signal sweights
dataset_bg_name – Name of dataset with background sweights
-
void getBinnedBGFractionsDataset(RooWorkspace *w, RooAbsData *rds, RNode frame, string rds_out_name, map<int, string> bincuts, string bgfracvar, map<int, vector<double>> bgfracs_map, int bgfrac_idx = 0, double bgfracs_default = 0.0)
Set the background fraction of a dataset bin by bin from a map of bin ids to background fraction values.
Set the background fractions of a dataset bin by bin from a map of unique integer bin identifiers to vectors of background fraction values. The background fractions are created first from the RDataFrame since defining a conditional variable for a RooDataSet is nigh impossible. Then, they are merged into the existing dataset rds and a new dataset containing the background fraction columns is created and uploaded to the workspace.
- Parameters:
w – RooWorkspace in which to work
rds – RooDataSet to which to add the background fraction columns
frame – ROOT RDataframe from which to define background fraction columns
rds_out_name – Name of the new RooDataSet containing the background fraction columns to import into the RooWorkspace
bincuts – Map of unique bin id ints to bin variable cuts for bin
bgfracvar – Background fraction variable name
bgfracs_map – Map of unique integer bin identifiers to background fraction values
bgfrac_idx – Index of the background fraction of interest in the vector of background fraction values
bgfracs_default – Default background fraction value for events outside the provided cuts
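The roles of bgfracs_map, bgfrac_idx, and bgfracs_default can be sketched as a simple lookup. This example assumes the bin id of an event is already known, whereas the real function evaluates the bin cuts on the RDataFrame. The name lookupBGFractionSketch is hypothetical, and falling back to the default for an out-of-range index is an assumption of this sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Select one background fraction for a bin, or a default if unavailable.
double lookupBGFractionSketch(
        const std::map<int, std::vector<double>>& bgfracs_map,
        int bin_id, int bgfrac_idx = 0, double bgfracs_default = 0.0) {
    // Events whose bin id is not in the map get the default value.
    const auto it = bgfracs_map.find(bin_id);
    if (it == bgfracs_map.end()) return bgfracs_default;

    // Assumption: an invalid index also yields the default value.
    const std::vector<double>& fracs = it->second;
    if (bgfrac_idx < 0 || static_cast<std::size_t>(bgfrac_idx) >= fracs.size()) {
        return bgfracs_default;
    }
    return fracs[static_cast<std::size_t>(bgfrac_idx)];
}
```

Storing a vector per bin lets the same map carry several background fraction definitions at once, with bgfrac_idx choosing which one is written to the dataset column.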
-
void setBinnedBGFractions(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> fitvars, map<string, string> yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, RNode frame, string bgcut, vector<string> asymfitvars, map<int, string> asymfitvar_bincuts, string rds_out_name, string sb_rds_out_name, string bgfracvar, vector<double> bgfracvar_lims = {0., 1.0}, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, int bgfrac_idx = 0, double bgfracs_default = 0.0, ostream &out = cout)
Apply a generic mass fit and set the background fraction for a dataset bin by bin.
Apply a generic mass fit in each asymmetry fit variable bin using fitMass() and set the background fraction column for the given dataset, which should only contain events from either the signal or sideband region. Background fractions \(\varepsilon\) will be taken from one of three choices specified by index:
0: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\) and error
1: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\) and error
2: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\) and error
The naming scheme for the bins will be <binid>__<asymfitvar_binid>.
- Parameters:
w – RooWorkspace in which to work
dataset_name – Dataset name
binid – Bin unique id
bincut – Kinematic variable cut for bin
binvars – List of kinematic binning variables
fitvars – List of mass fit variables
yamlfile_map – Map of bin ids to paths of YAML files specifying the remaining mass fit arguments
massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <massfit_pdf_name>_<binid>.
massfit_formula_sg – The signal PDF formula in ROOT TFormula format
massfit_formula_bg – The background PDF formula in ROOT TFormula format
massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace
massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset
massfit_parinits_sg – List of signal PDF parameter initial values
massfit_parnames_sg – List of signal PDF parameter names
massfit_partitles_sg – List of signal PDF parameter titles
massfit_parunits_sg – List of signal PDF parameter unit titles
massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds
massfit_parinits_bg – List of background PDF parameter initial values
massfit_parnames_bg – List of background PDF parameter names
massfit_partitles_bg – List of background PDF parameter titles
massfit_parunits_bg – List of background PDF parameter unit titles
massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds
massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable
massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable
frame – ROOT RDataframe in which to define the background fraction variable
bgcut – Background invariant mass region cut
asymfitvars – List of asymmetry fit variables names
asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for bin
rds_out_name – Name of signal region RooDataSet under which to import it into the RooWorkspace
sb_rds_out_name – Name of sideband region RooDataSet under which to import it into the RooWorkspace
bgfracvar – Background fraction variable name
bgfracvar_lims – Background fraction variable limits
massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend
massfit_lg_text_size – Size of TLegend text
massfit_lg_margin – Margin of TLegend
massfit_lg_ncols – Number of columns in TLegend
massfit_use_sumw2error – Option to use the RooFit::SumW2Error(true) option when fitting to the dataset, which is necessary if using a weighted dataset
massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization
massfit_use_binned_fit – Option to use a binned fit to the data
bgfrac_idx – Index of the background fraction of interest from the available choices
bgfracs_default – Default background fraction value for events outside the provided cuts
out – Output stream
-
using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>
-
namespace util
Functions
-
template<typename T>
T getYamlArg(const YAML::Node &node, string argname, T defaultval, string message_prefix, bool verbose)
Load an argument from a YAML file.
- Template Parameters:
T – Type of argument to get
- Parameters:
node – Yaml node from which to load argument
argname – Name of argument to load
defaultval – Default value to use if argument is not found
message_prefix – Prefix to output message
verbose – Option to print out argument name and value
- Returns:
Argument of type T
-
void replaceAll(string &s, string const &to_replace, string const &replace_with)
Find and replace function for string.
Find all occurrences of a substring and replace them with another. Note that this is an in-place operation. Regular expressions are NOT supported.
- Parameters:
s – String to search
to_replace – Substring to find and replace
replace_with – Substring to insert
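A literal, in-place find-and-replace of this kind can be sketched with std::string::find and std::string::replace. The name replaceAllSketch is hypothetical and this is not necessarily how saga::util::replaceAll is implemented; in particular, returning early on an empty search string is an assumption made here to avoid an infinite loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace every literal occurrence of to_replace in s, modifying s in place.
void replaceAllSketch(std::string& s, const std::string& to_replace,
                      const std::string& replace_with) {
    if (to_replace.empty()) return;  // assumption: nothing to do
    std::size_t pos = 0;
    while ((pos = s.find(to_replace, pos)) != std::string::npos) {
        s.replace(pos, to_replace.size(), replace_with);
        // Skip past the inserted text so it is never matched again.
        pos += replace_with.size();
    }
}

// Example usage: rename the variable prefix in a TFormula-style string.
void demoReplaceAllSketch() {
    std::string formula = "x[0]*x[1]";
    replaceAllSketch(formula, "x[", "y[");
    assert(formula == "y[0]*y[1]");
}
```

Advancing the search position by the length of the inserted text keeps the loop finite even when replace_with contains to_replace.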
-
string addLimitCuts(string cuts, vector<string> vars, vector<vector<double>> varlims)
Add additional variable limit cuts to an existing cut string.
- Parameters:
cuts – Base cut string
vars – Variables for which to add limit cuts
varlims – Limits of provided variables
- Returns:
Updated cut string
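One plausible way to build such a cut string is sketched below. The exact output format of saga::util::addLimitCuts is not specified here, so the inclusive bounds, the `&&` joining, the parentheses, and the name addLimitCutsSketch are all assumptions of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Append "(var>=min && var<=max)" for each variable to a base cut string.
std::string addLimitCutsSketch(const std::string& cuts,
                               const std::vector<std::string>& vars,
                               const std::vector<std::vector<double>>& varlims) {
    assert(vars.size() == varlims.size());  // one {min, max} pair per variable
    std::ostringstream out;
    bool has_cut = !cuts.empty();
    // Parenthesize the base cut so any "||" inside it keeps its meaning.
    if (has_cut) out << "(" << cuts << ")";
    for (std::size_t i = 0; i < vars.size(); ++i) {
        assert(varlims[i].size() == 2);
        if (has_cut) out << " && ";
        out << "(" << vars[i] << ">=" << varlims[i][0]
            << " && " << vars[i] << "<=" << varlims[i][1] << ")";
        has_cut = true;
    }
    return out.str();
}
```

Under these assumptions, a base cut of Q2>1 with limits {0.1, 0.5} on x yields (Q2>1) && (x>=0.1 && x<=0.5), a string that can be passed to an RDataFrame filter.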
-
template<typename T>
-
namespace analysis