C++ API

Note

This project is under active development.

namespace saga
namespace analysis

Typedefs

using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>

Functions

RooArgSet *getSubRooArgSet(RooArgSet *argset, string fitformula, vector<string> varformulas, vector<string> varnames)

Create a subset of a RooArgSet for a given pdf formula.

Parameters:
  • argset – RooArgSet containing RooRealVar* variables

  • fitformula – PDF formula passed to RooGenericPdf

  • varformulas – List of variable formulas passed to RooGenericPdf for all arguments in the RooArgSet

  • varnames – List of variable names in RooArgSet directly corresponding to the list of variable formulas

Returns:

RooArgSet*

string getSubFormula(string fitformula, vector<string> varformulas, int max_idx = 0)

Adjust the parameter indexing of a pdf formula for a given subset of parameters.

Note that the output formula will have the variable notation x[<idx>] where idx indicates the integer index of the variable in the appropriate RooArgSet.

Parameters:
  • fitformula – PDF formula passed to RooGenericPdf

  • varformulas – List of variable formulas passed to RooGenericPdf

  • max_idx – If this parameter is \(>0\), then the formulas will be substituted in descending order starting at idx==max_idx.

Returns:

string

vector<string> getGenAsymPdf(RooWorkspace *w, vector<string> categories_as_float, RooCategory *h, RooCategory *t, RooCategory *ht, RooCategory *ss, RooArgSet *argset, vector<string> argnames, string fit_method_name, string binid, string fitformula_uu, string fitformula_pu, string fitformula_up, string fitformula_pp, double bpol, double tpol, int count, bool use_extended_nll)

Create a PDF for fitting a generic asymmetry with a maximum likelihood fit.

Create a PDF given the formulas for the asymmetries coupling to each combination of beam helicity and target spin states. The PDF will be constructed internally using RooGenericPdf in the form:

\[\begin{split} \begin{aligned} PDF(\lambda_{\ell}, S, &x_0, x_1, ..., a_0, a_1, ..., d_0, d_1, ...) = \\ & 1 + A_{UU}(\vec{x}, \vec{a}, \vec{d}) \\ & + \lambda_{\ell} \, \overline{\lambda_{\ell}^2} \, A_{PU}(\vec{x}, \vec{a}, \vec{d}) \\ & + S \, \overline{S^2} \, A_{UP}(\vec{x}, \vec{a}, \vec{d}) \\ & + \lambda_{\ell} \, S \, \overline{\lambda_{\ell}^2} \, \overline{S^2} \, A_{PP}(\vec{x}, \vec{a}, \vec{d}), \\ \end{aligned} \end{split}\]
where \(A_{UU}\) denotes the unpolarized modulations as well as any transverse target spin asymmetries which may be even under a target spin flip. \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) denote the asymmetries dependent on beam helicity, target spin, or both. The appropriate terms will be dropped if there is no dependence on beam helicity or target spin. The x_<int> denote the independent fit variables, the a_<int> denote the asymmetry amplitudes, and the d_<int> denote the depolarization factors. \(\vec{x}\), \(\vec{a}\), and \(\vec{d}\) are shorthand for the sets of these variables.
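As a rough illustration of how the four terms combine, the following sketch evaluates the unnormalized expression above for asymmetry terms that have already been evaluated. The struct `AsymTerms` and the function `evalAsymPdf` are hypothetical names used only for this example and are not part of the saga API; the real PDF is built with RooGenericPdf and normalized by RooFit.

```cpp
#include <cassert>

// Hypothetical container for the already-evaluated asymmetry terms.
struct AsymTerms {
    double a_uu; // A_UU(x, a, d)
    double a_pu; // A_PU(x, a, d)
    double a_up; // A_UP(x, a, d)
    double a_pp; // A_PP(x, a, d)
};

// Evaluate 1 + A_UU + lambda*bpol*A_PU + S*tpol*A_UP + lambda*S*bpol*tpol*A_PP.
// helicity and tspin take values in {-1, 0, 1}; a value of 0 drops the term.
double evalAsymPdf(int helicity, int tspin, double bpol, double tpol, const AsymTerms &t) {
    assert(helicity >= -1 && helicity <= 1);
    assert(tspin >= -1 && tspin <= 1);
    return 1.0
        + t.a_uu
        + helicity * bpol * t.a_pu
        + tspin * tpol * t.a_up
        + helicity * tspin * bpol * tpol * t.a_pp;
}
```

Setting either spin state to zero removes the corresponding terms, which mirrors the statement that terms are dropped when there is no dependence on beam helicity or target spin.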

If categories_as_float contains the beam helicity or target spin variable names, the PDF will use these as independent variables. Otherwise, a simultaneous PDF will be formed over the various helicity and spin states.

Note that in the case of a structure function \(F_{UT}\) or \(F_{LT}\) modulation that is not odd under a sign flip of \(\phi_{S}\), i.e., it does not produce an asymmetry, the relevant formula should be included in the argument for the \(A_{UU}\) or \(A_{PU}\) formula respectively since in this scenario \(F_{UT}\) and \(F_{LT}\) should only have kinematic dependence on \(\phi_{S}\) rather than categorical dependence on \(S\).

The variable names in the fit formulas should follow the TFormula notation, e.g., x_0 \(\rightarrow\)x[0], x_1 \(\rightarrow\)x[1], a_0 \(\rightarrow\)x[N_x], d_0 \(\rightarrow\)x[N_x+N_a], etc.
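The index layout described above can be sketched with a small helper that maps the i-th fit variable, amplitude, or depolarization factor to its TFormula-style index. `FormulaLayout` and `examplePuFormula` are hypothetical names introduced for illustration, and the example formula is an arbitrary choice, not one taken from the library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: fit variables come first, then amplitudes,
// then depolarization factors, all sharing one x[...] index space.
struct FormulaLayout {
    int n_x; // number of independent fit variables (N_x)
    int n_a; // number of asymmetry amplitudes (N_a)

    std::string x(int i) const {
        assert(i >= 0 && i < n_x);
        return "x[" + std::to_string(i) + "]";
    }
    std::string a(int i) const {
        assert(i >= 0 && i < n_a);
        return "x[" + std::to_string(n_x + i) + "]";
    }
    std::string d(int i) const {
        assert(i >= 0);
        return "x[" + std::to_string(n_x + n_a + i) + "]";
    }
};

// Illustrative formula d_0 * a_0 * sin(x_0) with N_x = 1 and N_a = 1.
std::string examplePuFormula() {
    FormulaLayout layout{1, 1};
    return layout.d(0) + "*" + layout.a(0) + "*sin(" + layout.x(0) + ")";
}
```

With one fit variable and one amplitude, the sketch produces a formula in which x_0 is x[0], a_0 is x[1], and d_0 is x[2].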

Note that in the case that all three fit formula terms are used, the formulas and corresponding argument sets for the PDFs that depend only on either beam helicity or target spin will be reduced to the appropriate subset of variables used in the corresponding fit formula. This ensures that the PDFs will still compile correctly even when uploaded to the RooWorkspace.

When using a single fit formula the indexing of parameters in the fit formula is straightforward. However, when using all three fit formulas, the indexing is global across all three formulas.

The returned list contains:

  • The name of the full model in the workspace

  • The name of each yield variable in the workspace in the case of an extended NLL fit

Parameters:
  • w – RooWorkspace in which to work

  • categories_as_float – List of category variables to use as asymmetry fit variables and automatically add to the PDF formula

  • h – Beam helicity \(\lambda_{\ell}\in(-1,0,1)\)

  • t – Target spin \(S\in(-1,0,1)\)

  • ht – Beam helicity times target spin \((\lambda_{\ell} \, S)\in(-1,0,1)\)

  • ss – Combined beam helicity and target spin state \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\)

  • argset – Argument set for PDF

  • argnames – Argument names for PDF

  • fit_method_name – Fit method name, used to name PDF

  • binid – Unique bin id, used to name PDF

  • fitformula_uu – Fit formula for the asymmetry terms \(A_{UU}\)

  • fitformula_pu – Fit formula for the beam helicity dependent asymmetry terms \(A_{PU}\)

  • fitformula_up – Fit formula for the target spin dependent asymmetry terms \(A_{UP}\)

  • fitformula_pp – Fit formula for the beam helicity and target spin dependent asymmetry terms \(A_{PP}\)

  • bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)

  • tpol – Luminosity averaged target polarization \(\overline{S^2}\)

  • count – Bin count

  • use_extended_nll – Option to use an extended likelihood term

Returns:

List of model name and all yield variable names
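The combined spin state encoding given for ss, \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\), can be sketched as a pair of helper functions. The names `encodeSpinState` and `decodeSpinState` are hypothetical and are not part of the documented API; the decoding step is simply the arithmetic inverse of the stated formula.

```cpp
#include <cassert>
#include <utility>

// Encode beam helicity and target spin, each in {-1, 0, 1}, as
// ss = (helicity + 1) * 10 + (tspin + 1).
int encodeSpinState(int helicity, int tspin) {
    assert(helicity >= -1 && helicity <= 1);
    assert(tspin >= -1 && tspin <= 1);
    return (helicity + 1) * 10 + (tspin + 1);
}

// Invert the encoding: the tens digit holds helicity + 1,
// the ones digit holds tspin + 1.
std::pair<int, int> decodeSpinState(int ss) {
    assert(ss >= 0 && ss <= 22 && ss % 10 <= 2);
    return std::make_pair(ss / 10 - 1, ss % 10 - 1);
}
```

Under this encoding the nine possible states map to the values 0, 1, 2, 10, 11, 12, 20, 21, and 22.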

vector<double> fitAsym(RooWorkspace *w, string dataset_name, double bpol, double tpol, vector<string> categories_as_float, string helicity, string tspin, string htspin, string combined_spin_state, string binid, string bincut, vector<string> binvars, vector<string> depolvars, vector<string> fitvars, string fitformula_uu, string fitformula_pu, string fitformula_up, string fitformula_pp, vector<double> initparams, vector<vector<double>> initparamlims, bool use_sumw2error = true, bool use_average_depol = false, bool use_extended_nll = false, bool use_binned_fit = false, string sb_dataset_name = "", string bgfracvar = "", ostream &out = cout)

Fit an asymmetry.

Compute the bin count, bin variable mean values and variances, depolarization variable values and errors, and fit the asymmetry with a binned or unbinned dataset using a maximum likelihood fit method with an optional extended likelihood term. Note that for the maximum likelihood fit, the given asymmetry formulas \( A_{(UU,PU,UP,PP)}(x_0, x_1, ..., a_0, a_1, ..., d_0, d_1, ...) \) will be used internally by getGenAsymPdf() to construct a simultaneous PDF of the form:

\[\begin{split} \begin{aligned} PDF(\lambda_{\ell}, S, &x_0, x_1,..., a_0, a_1,..., d_0, d_1,...) = \\ & 1 + A_{UU}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + \lambda_{\ell} \, \overline{\lambda_{\ell}^2} \, A_{PU}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + S \, \overline{S^2} \, A_{UP}(\vec{x}, \vec{a}, \vec{d}) \\ & \quad + \lambda_{\ell} \, S \, \overline{\lambda_{\ell}^2} \, \overline{S^2} \, A_{PP}(\vec{x}, \vec{a}, \vec{d}), \\ \end{aligned} \end{split}\]
where \(A_{UU}\) denotes the unpolarized modulations as well as any transverse target spin asymmetries which may be even under a target spin flip. \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) denote the asymmetries dependent on beam helicity, target spin, or both. The appropriate terms will be dropped if there is no dependence on beam helicity or target spin. A simultaneous fit will be applied over the data subsets distinguished by the beam helicity, target spin, and product of beam helicity and target spin. The x_<int> denote the independent fit variables, the a_<int> denote the asymmetry amplitudes, and the d_<int> denote the depolarization factors. \(\vec{x}\), \(\vec{a}\), and \(\vec{d}\) are shorthand for the sets of these variables.

The variable names in the fit formulas should follow the TFormula notation, e.g., x_0 \(\rightarrow\)x[0], x_1 \(\rightarrow\)x[1], a_0 \(\rightarrow\)x[N_x], d_0 \(\rightarrow\)x[N_x+N_a], etc.

In the case that a sideband dataset is supplied via sb_dataset_name, the initial dataset (dataset_name) is taken to be the signal region dataset. Then, a background PDF \(A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d})\) is created identically to the signal PDF \(A_{SG}(\vec{x}, \vec{a}_{SG}, \vec{d})\). The datasets should be created from saga::signal::setBinnedBGFractions() so that the background fraction variable \(\varepsilon\) is already loaded in the workspace and present in the datasets. A simultaneous PDF will be constructed over the combined signal region (\(SG\)) and sideband region (\(SB\)) datasets with the form:

\[\begin{split} PDF(\vec{x}, \vec{a}, \vec{d}) = 1 + \begin{cases} \varepsilon(\vec{x}) \, A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d}) + (1 - \varepsilon(\vec{x})) \, A_{SG}(\vec{x}, \vec{a}_{SG}, \vec{d}), & \vec{x} \in SG \\ A_{BG}(\vec{x}, \vec{a}_{BG}, \vec{d}), & \vec{x} \in SB \end{cases} \end{split}\]
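The piecewise form above can be illustrated with a short sketch that takes already-evaluated signal and background asymmetry values. The function name `evalSidebandPdf` and its arguments are hypothetical; the actual simultaneous PDF is constructed inside fitAsym() with RooFit and is not expected to look like this.

```cpp
#include <cassert>

// Unnormalized PDF value for one event.
// eps  : background fraction epsilon(x) at the event, in [0, 1]
// a_sg : A_SG(x, a_SG, d) evaluated at the event
// a_bg : A_BG(x, a_BG, d) evaluated at the event
double evalSidebandPdf(bool in_signal_region, double eps, double a_sg, double a_bg) {
    assert(eps >= 0.0 && eps <= 1.0);
    if (in_signal_region) {
        // Signal region: mixture of background and signal asymmetries.
        return 1.0 + eps * a_bg + (1.0 - eps) * a_sg;
    }
    // Sideband region: background asymmetry only.
    return 1.0 + a_bg;
}
```

Because the background amplitudes appear in both branches, the sideband events constrain \(\vec{a}_{BG}\) while the signal region events constrain the mixture.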

The returned vector will have the following entries:

  • Bin count

  • For each bin variable:

    • Bin variable mean value

    • Bin variable standard deviation

  • For each depolarization variable:

    • Depolarization variable mean value

    • Depolarization variable standard deviation

  • The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of

    • Beam helicity \(\lambda_{\ell}\)

    • Target spin \(S\)

    • Beam helicity times target spin \(\lambda_{\ell} \, S\)

  • For each asymmetry fit parameter:

    • Asymmetry fit parameter mean value

    • Asymmetry fit parameter error

The following entries will be appended if using sideband subtraction with binned background fractions:

  • For each background asymmetry fit parameter:

    • Asymmetry fit parameter mean value

    • Asymmetry fit parameter error

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)

  • tpol – Luminosity averaged target polarization \(\overline{S^2}\)

  • categories_as_float – List of category variables to treat as asymmetry fit variables and automatically add to PDF formulas

  • helicity – Name of the helicity variable

  • tspin – Name of the target spin variable

  • htspin – Name of the beam helicity times target spin variable

  • combined_spin_state – Name of the combined spin state variable

  • binid – Bin unique id

  • bincut – Kinematic variable cut for bin

  • binvars – List of kinematic binning variables

  • depolvars – List of depolarization variables

  • fitvars – List of asymmetry fit variables

  • fitformula_uu – The asymmetry formula in ROOT TFormula format for unpolarized terms

  • fitformula_pu – The asymmetry formula in ROOT TFormula format for beam helicity dependent terms

  • fitformula_up – The asymmetry formula in ROOT TFormula format for target spin dependent terms

  • fitformula_pp – The asymmetry formula in ROOT TFormula format for beam helicity and target spin dependent terms

  • initparams – List of initial values for asymmetry parameters

  • initparamlims – List of initial asymmetry parameter minimum and maximum bounds

  • use_sumw2error – Option to use RooFit::SumW2Error(true) when fitting to the dataset, which is necessary if using a weighted dataset

  • use_average_depol – Option to divide out average depolarization in bin instead of including depolarization as an independent variable in the fit

  • use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization

  • use_binned_fit – Option to use a binned fit to the data

  • sb_dataset_name – Name of the sideband dataset to use for the simultaneous fit of the signal + background and background PDFs

  • bgfracvar – Name of binned background fraction variable passed to saga::signal::setBinnedBGFractions()

  • out – Output stream

Returns:

List of bin count, bin variable means and errors, depolarization variable means and errors, raw asymmetries and errors, and fit parameters and errors

void getKinBinnedAsym(string baseoutpath, string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> asymfitvars, vector<string> asymfitvar_titles, vector<vector<double>> asymfitvar_lims, vector<int> asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, double bpol, double tpol, string asymfit_formula_uu, string asymfit_formula_pu, string asymfit_formula_up, string asymfit_formula_pp, vector<double> asymfitpar_inits, vector<vector<double>> asymfitpar_initlims, bool use_sumw2error, bool use_average_depol, bool use_extended_nll, bool use_binned_fit, map<string, string> massfit_yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, bool use_splot, string massfit_sgcut, string massfit_bgcut, bool use_sb_subtraction, bool use_binned_sb_bgfracs, map<int, string> asymfitvar_bincuts, string bgfracvar, vector<double> bgfracvar_lims, int bgfrac_idx = 0, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)

Loop over kinematic bins and fit an asymmetry, correcting for background with sideband subtraction or sPlots.

Loop over bin cuts and fit an asymmetry with the saga::analysis::fitAsym() method. Optionally, apply an invariant mass fit and background correction using the sideband subtraction method or the sPlot method from arXiv:physics/0402083. The mass fit will be applied with saga::signal::fitMass() and the sPlot method will use saga::signal::applySPlot().

Results will be saved in a csv file with the following columns:

  • bin_id: The unique bin id

  • count: The total number of counts in the bin

  • For each bin variable binvar

    • <binvar>: Mean value

    • <binvar>_err: Standard deviation

  • For each depolarization variable depolvar

    • <depolvar>: Mean value

    • <depolvar>_err: Standard deviation

  • The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of

    • Beam helicity \(\lambda_{\ell}\)

    • Target spin \(S\)

    • Beam helicity times target spin \(\lambda_{\ell} \, S\)

  • For each asymmetry fit parameter asymfitpar

    • <asymfitpar>: Final parameter value

    • <asymfitpar>_err: Final parameter error

The following columns will be added in the case of a single mass fit applied to the entire bin:

  • int_sg_pdf_val: Signal PDF integral \(N_{SG}^{PDF}\) value in the signal region

  • int_sg_pdf_err: Signal PDF integral error \(\delta N_{SG}^{PDF}\) in the signal region

  • int_bg_pdf_val: Background PDF integral \(N_{BG}^{PDF}\) in the signal region

  • int_bg_pdf_err: Background PDF integral error \(\delta N_{BG}^{PDF}\) in the signal region

  • int_model_pdf_val: Full PDF integral \(N^{PDF}\) in the signal region

  • int_model_pdf_err: Full PDF integral error \(\delta N^{PDF}\) in the signal region

  • int_ds_val: Full dataset sum \(N^{DS}\) in the signal region

  • int_ds_err: Poissonian error \(\sqrt{N^{DS}}\) of the full dataset sum in the signal region

  • eps_bg_pdf: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\)

  • eps_bg_pdf_err: Background fraction error \(\delta\varepsilon_{1}\)

  • eps_sg_pdf: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\)

  • eps_sg_pdf_err: Background fraction error \(\delta\varepsilon_{2}\)

  • eps_pdf: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\)

  • eps_pdf_err: Background fraction error \(\delta\varepsilon_{3}\)

  • For each mass fit variable:

    • <chi2>: \(\chi^2\) value of the 1D projection of the full PDF in that variable

  • For each mass fit signal PDF parameter massfitpar_sg

    • <massfitpar_sg>: Final parameter value

    • <massfitpar_sg>_err: Final parameter error

  • For each mass fit background PDF parameter massfitpar_bg

    • <massfitpar_bg>: Final parameter value

    • <massfitpar_bg>_err: Final parameter error
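The three background fraction definitions listed above can be compared side by side with a small sketch. The struct `BgFractions` and the function `computeBgFractions` are hypothetical names for illustration; the error columns are omitted because the error propagation used by the library is not specified here.

```cpp
#include <cassert>

// Hypothetical container mirroring the eps_bg_pdf, eps_sg_pdf, and eps_pdf columns.
struct BgFractions {
    double eps_bg_pdf; // epsilon_1 = N_BG^PDF / N^DS
    double eps_sg_pdf; // epsilon_2 = 1 - N_SG^PDF / N^DS
    double eps_pdf;    // epsilon_3 = 1 - N_SG^PDF / N^PDF
};

// Compute the three background fraction estimates from signal region integrals.
BgFractions computeBgFractions(double n_sg_pdf, double n_bg_pdf, double n_model_pdf, double n_ds) {
    assert(n_ds > 0.0 && n_model_pdf > 0.0);
    BgFractions eps;
    eps.eps_bg_pdf = n_bg_pdf / n_ds;
    eps.eps_sg_pdf = 1.0 - n_sg_pdf / n_ds;
    eps.eps_pdf = 1.0 - n_sg_pdf / n_model_pdf;
    return eps;
}
```

The three estimates coincide when the full PDF integral matches the dataset sum and the signal and background integrals add up to it; they differ when the fit does not reproduce the observed count in the signal region.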

Parameters:
  • baseoutpath – Base name prefix for output files

  • scheme_name – Name of bin scheme and basename of output csv file

  • frame – ROOT RDataFrame from which to create RooFit datasets

  • workspace_name – Name of workspace in which to work

  • workspace_title – Title of workspace in which to work

  • dataset_name – Dataset name

  • dataset_title – Dataset title

  • weight_name – Name of weight variable, ignored if empty

  • categories_as_float – List of category variables to include as asymmetry fit variables in dataset

  • helicity – Name of helicity variable

  • helicity_states – Map of state names to helicity values

  • tspin – Name of target spin variable

  • tspin_states – Map of state names to target spin values

  • htspin – Name of helicity times target spin variable

  • htspin_states – Map of state names to helicity times target spin values

  • combined_spin_state – Name of combined spin state variable

  • bincuts – Map of unique bin id ints to bin variable cuts for bin

  • binvars – List of kinematic binning variable names

  • binvar_titles – List of kinematic binning variable titles

  • binvar_lims – List of kinematic binning variable minimum and maximum bounds

  • binvar_bins – List of kinematic binning variable bins

  • depolvars – List of depolarization variable names

  • depolvar_titles – List of depolarization variable titles

  • depolvar_lims – List of depolarization variable minimum and maximum bounds

  • depolvar_bins – List of depolarization variable bins

  • asymfitvars – List of asymmetry fit variable names

  • asymfitvar_titles – List of asymmetry fit variable titles

  • asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds

  • asymfitvar_bins – List of asymmetry fit variable bins

  • massfitvars – List of invariant mass fit variable names

  • massfitvar_titles – List of invariant mass fit variable titles

  • massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds

  • massfitvar_bins – List of invariant mass fit variable bins

  • bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)

  • tpol – Luminosity averaged target polarization \(\overline{S^2}\)

  • asymfit_formula_uu – The asymmetry formula in ROOT TFormula format for the unpolarized modulations

  • asymfit_formula_pu – The asymmetry formula in ROOT TFormula format for the beam helicity dependent asymmetries

  • asymfit_formula_up – The asymmetry formula in ROOT TFormula format for the target spin dependent asymmetries

  • asymfit_formula_pp – The asymmetry formula in ROOT TFormula format for the beam helicity and target spin dependent asymmetries

  • asymfitpar_inits – List of initial values for asymmetry fit parameters

  • asymfitpar_initlims – List of initial asymmetry fit parameter minimum and maximum bounds

  • use_sumw2error – Option to use RooFit::SumW2Error(true) when fitting to the dataset, which is necessary if using a weighted dataset

  • use_average_depol – Option to divide out average depolarization in bin instead of including depolarization as an independent variable in the fit

  • use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization

  • use_binned_fit – Option to use a binned fit to the data

  • massfit_yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining mass fit arguments. Note that the values specified here will function as the defaults.

  • massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <pdf_name>_<binid>.

  • massfit_formula_sg – The signal PDF formula in ROOT TFormula format

  • massfit_formula_bg – The background PDF formula in ROOT TFormula format

  • massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset

  • massfit_parinits_sg – List of signal PDF parameter initial values

  • massfit_parnames_sg – List of signal PDF parameter names

  • massfit_partitles_sg – List of signal PDF parameter titles

  • massfit_parunits_sg – List of signal PDF parameter unit titles

  • massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds

  • massfit_parinits_bg – List of background PDF parameter initial values

  • massfit_parnames_bg – List of background PDF parameter names

  • massfit_partitles_bg – List of background PDF parameter titles

  • massfit_parunits_bg – List of background PDF parameter unit titles

  • massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds

  • massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable

  • massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable

  • use_splot – Option to use sPlot method and perform fit with sWeighted dataset

  • massfit_sgcut – Signal region cut for sideband subtraction background correction. Note, this will automatically be formed from massfit_sgregion_lims if not specified.

  • massfit_bgcut – Background region cut for sideband subtraction background correction

  • use_sb_subtraction – Option to use sideband subtraction for background correction

  • use_binned_sb_bgfracs – Option to use background fractions from invariant mass fits binned in the asymmetry fit variable for background correction

  • asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for asymmetry fit variable bins

  • bgfracvar – Name of binned background fraction variable

  • bgfracvar_lims – List of binned background fraction variable minimum and maximum bounds

  • bgfrac_idx – Index to select which formulation to use for the background fraction in saga::signal::setBinnedBGFractions()

  • massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend for the signal and background mass fit

  • massfit_lg_text_size – Size of TLegend text for the signal and background mass fit

  • massfit_lg_margin – Margin of TLegend for the signal and background mass fit

  • massfit_lg_ncols – Number of columns in TLegend for the signal and background mass fit

  • massfit_use_sumw2error – Option to use RooFit::SumW2Error(true) for the signal and background mass fit, which is necessary if using a weighted dataset

  • massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the signal and background mass fit

  • massfit_use_binned_fit – Option to use a binned fit to the data for the signal and background mass fit

  • out – Output stream

Throws:

runtime_error – if invalid arguments are provided

namespace bins

Functions

vector<double> findBinLims(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string varname, const int nbins)

Find bin limits for equal bin statistics.

Given the desired number of bins in a distribution, find the bin limits that will ensure all bins have roughly equal statistics.

Parameters:
  • frame – ROOT RDataFrame with which to find bin limits

  • varname – Bin variable name

  • nbins – Number of bins

Returns:

Bin limits
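One common way to obtain roughly equal statistics per bin is to place the limits at evenly spaced quantiles of the sorted values. The sketch below shows that idea on a plain `std::vector<double>` rather than an RDataFrame; the function name `equalStatLims` is hypothetical, and whether findBinLims() uses this exact quantile rule is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Return nbins + 1 limits such that each bin holds about values.size() / nbins entries.
std::vector<double> equalStatLims(std::vector<double> values, int nbins) {
    assert(nbins > 0 && !values.empty());
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    const std::size_t nb = static_cast<std::size_t>(nbins);
    std::vector<double> lims;
    lims.reserve(nb + 1);
    for (std::size_t i = 0; i < nb; ++i) {
        // Lower edge of bin i sits at the i/nbins quantile of the sorted sample.
        lims.push_back(values[(n * i) / nb]);
    }
    // Upper edge of the last bin is the sample maximum.
    lims.push_back(values.back());
    return lims;
}
```

For a sample with many repeated values, neighbouring limits may coincide, so a production implementation would likely need extra handling for that case.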

void findNestedBinLims(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, YAML::Node node, string node_name = "", string nbins_key = "nbins", string lims_key = "lims", string nested_key = "nested", vector<string> bin_cuts = {})

Recursively set a map of bin scheme coordinates to bin variable limits for a nested bin scheme.

Given a dataframe and a yaml node defining a nested binning scheme with the desired number of bins specified at each level, recursively set a map of bin scheme coordinates to bin variable limits and a list of bin variables encountered.

Parameters:
  • frame – ROOT RDataFrame with which to find bin limits

  • node – YAML node containing nested bin scheme definition

  • node_name – Name of YAML node

  • nbins_key – YAML key for number of bins at current depth

  • lims_key – YAML key for bin limits at current depth

  • nested_key – YAML key for nested binning

  • bin_cuts – List of bin cuts to apply to the dataframe

vector<double> getBinLims(const int nbins, double xmin, double xmax)

Compute bin limits on a regular interval between a minimum and maximum.

Parameters:
  • nbins – Number of bins

  • xmin – Minimum bound of bin variable

  • xmax – Maximum bound of bin variable

Returns:

Bin limits
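A minimal sketch of regularly spaced limits is shown below. It assumes the result contains nbins + 1 edges including both xmin and xmax, which is not stated explicitly above; the function name `regularBinLims` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return nbins + 1 evenly spaced edges from xmin to xmax inclusive.
std::vector<double> regularBinLims(int nbins, double xmin, double xmax) {
    assert(nbins > 0 && xmax > xmin);
    const std::size_t nb = static_cast<std::size_t>(nbins);
    std::vector<double> lims(nb + 1);
    const double step = (xmax - xmin) / nbins;
    for (std::size_t i = 0; i <= nb; ++i) {
        lims[i] = xmin + static_cast<double>(i) * step;
    }
    // Pin the last edge to xmax so round-off cannot shift it.
    lims[nb] = xmax;
    return lims;
}
```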

void setNestedBinCuts(vector<string> &cuts, YAML::Node node, vector<string> &old_cuts, string node_name = "", string lims_key = "lims", string nested_key = "nested")

Set binning scheme cuts for a nested binning scheme.

Recursively set a list of bin cuts given a YAML node defining a nested bin scheme. Note that this will set cuts for all bins within the nested bin scheme.

Parameters:
  • cuts – List of nested binning cuts to set

  • node – YAML node defining a nested bin scheme

  • node_name – Name of nested YAML node

  • old_cuts – Old list of cuts from previous recursion level

  • lims_key – YAML key for bin limits

  • nested_key – YAML key for nested binning

map<int, string> getBinCuts(map<string, vector<double>> binscheme, int start_bin_id)

Produce binning scheme cuts for a grid binning scheme.

Produce a map of unique integer bin identifiers to bin cuts given a map of bin variables to their respective bin limits. Note that this will produce cuts for all bins within the grid scheme. Bin identifiers start at zero by default but can be made to start at any integer.

Parameters:
  • binscheme – Map of bin variable names to their respective bin limits

  • start_bin_id – Starting unique integer bin identifier

Returns:

Map of unique integer bin ids to bin cuts
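The grid scheme amounts to taking the Cartesian product of the bins in each variable and numbering the resulting cuts. The sketch below illustrates this with a hypothetical function `gridBinCuts`; the cut string format, the half-open interval convention, and the ordering of bin ids are all assumptions made for the example and may differ from what getBinCuts() produces.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Build one cut per grid cell and assign consecutive ids starting at start_bin_id.
std::map<int, std::string> gridBinCuts(
    const std::map<std::string, std::vector<double>> &binscheme, int start_bin_id = 0) {
    assert(!binscheme.empty());
    std::vector<std::string> cuts(1, "");
    for (const auto &entry : binscheme) {
        const std::string &var = entry.first;
        const std::vector<double> &lims = entry.second;
        assert(lims.size() >= 2);
        std::vector<std::string> next;
        for (const std::string &prefix : cuts) {
            for (std::size_t i = 0; i + 1 < lims.size(); ++i) {
                // Extend every existing cut with each bin of the current variable.
                std::ostringstream cut;
                cut << prefix << (prefix.empty() ? "" : " && ")
                    << "(" << var << ">=" << lims[i]
                    << " && " << var << "<" << lims[i + 1] << ")";
                next.push_back(cut.str());
            }
        }
        cuts.swap(next);
    }
    std::map<int, std::string> result;
    int id = start_bin_id;
    for (const std::string &cut : cuts) {
        result[id++] = cut;
    }
    return result;
}
```

A scheme with two bins in one variable and one bin in another yields two cuts, each combining one interval from each variable.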

map<string, map<int, string>> getBinCutsMap(YAML::Node node_binschemes, int start_bin_id = 0)

Read a YAML node and create a map of bin scheme names to maps of bin id to cuts.

Produce a map of unique bin scheme names to maps of unique integer bin identifiers to bin cuts given a YAML node containing a map of bin variables to their respective bin limits. Note that this will produce cuts for all bins within the grid scheme. Bin identifiers start at zero by default but can be made to start at any integer.

Parameters:
  • node_binschemes – YAML node containing bin scheme definitions

  • start_bin_id – Starting unique integer bin identifier

Throws:

Runtime error

Returns:

Map of bin scheme names to maps of unique integer bin ids to bin cuts

map<string, vector<string>> getBinSchemesVars(YAML::Node node_binschemes)

Produce a map of bin scheme names to lists of the bin variables used in each bin scheme defined in a given YAML node.

Parameters:

node_binschemes – YAML node containing bin scheme definitions

Returns:

Map of bin scheme names to lists of bin variable names used in each scheme

map<string, map<int, string>> getBinCutsMapBatch(map<string, map<int, string>> bincuts_map, int nbatches, int ibatch)

Reduce a bin cuts map to a smaller batched version.

Reduce a bin cuts map to a smaller batched version given the total number of batches and the index of the batch. This is useful for parallelizing results computed on a large bin cuts map.

Parameters:
  • bincuts_map – Map of bin scheme names to maps of unique integer bin ids to bin cuts

  • nbatches – Total number of batches

  • ibatch – Index of the batch \(i\in[0,N_{batches}-1]\)
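The batching idea can be sketched on a single bin cuts map by splitting its entries into contiguous chunks. The function name `batchBinCuts` is hypothetical, and the contiguous chunking rule is an assumption; the way getBinCutsMapBatch() distributes bins across batches, and across multiple schemes, may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Keep only the entries that fall in batch ibatch out of nbatches contiguous chunks.
std::map<int, std::string> batchBinCuts(
    const std::map<int, std::string> &bincuts, int nbatches, int ibatch) {
    assert(nbatches > 0 && ibatch >= 0 && ibatch < nbatches);
    const std::size_t n = bincuts.size();
    const std::size_t nb = static_cast<std::size_t>(nbatches);
    const std::size_t ib = static_cast<std::size_t>(ibatch);
    const std::size_t begin = (n * ib) / nb;     // first position in this batch
    const std::size_t end = (n * (ib + 1)) / nb; // one past the last position
    std::map<int, std::string> batch;
    std::size_t idx = 0;
    for (const auto &entry : bincuts) {
        if (idx >= begin && idx < end) {
            batch.insert(entry);
        }
        ++idx;
    }
    return batch;
}
```

With this rule every bin lands in exactly one batch, so running all batch indices from 0 to nbatches - 1 covers the full map once.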

void getBinMigration(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> binvars, string mc_suffix = "_mc", string weight_name = "")

Compute bin migration fractions and save to a CSV file.

Compute bin migration fraction and save to a CSV file. Note that the truth bin cuts will be inferred from the provided cuts assuming they follow the form (binvar>=binmin && binvar<=binmax).

Parameters:
  • frame – ROOT RDataFrame from which to compute bin migration fraction

  • scheme_name – Bin scheme name

  • bincuts – Map of unique integer bin identifiers to bin cuts

  • binvars – List of bin variable names

  • mc_suffix – Suffix for forming the truth variable names

  • weight_name – Name of the weight variable, ignored if empty
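One plausible way to infer a truth bin cut from a reconstructed bin cut is to append the suffix to every occurrence of a bin variable name while leaving other identifiers untouched. The sketch below does this with a simple token scan; the function name `truthCut` is hypothetical, and this is an assumption about the approach rather than a description of how getBinMigration() is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Append mc_suffix to each whole identifier in cut that matches a bin variable name.
std::string truthCut(const std::string &cut, const std::vector<std::string> &binvars,
                     const std::string &mc_suffix = "_mc") {
    std::string out;
    std::size_t i = 0;
    while (i < cut.size()) {
        const unsigned char c = static_cast<unsigned char>(cut[i]);
        std::size_t j = i;
        if (std::isdigit(c)) {
            // Numeric literal: copy digits, letters, and '.' unchanged.
            while (j < cut.size() &&
                   (std::isalnum(static_cast<unsigned char>(cut[j])) || cut[j] == '.')) ++j;
            out += cut.substr(i, j - i);
        } else if (std::isalpha(c) || c == '_') {
            // Identifier: copy it and add the suffix if it is a bin variable.
            while (j < cut.size() &&
                   (std::isalnum(static_cast<unsigned char>(cut[j])) || cut[j] == '_')) ++j;
            const std::string token = cut.substr(i, j - i);
            out += token;
            if (std::find(binvars.begin(), binvars.end(), token) != binvars.end()) {
                out += mc_suffix;
            }
        } else {
            // Operators, parentheses, and whitespace are copied as-is.
            out += cut[i];
            j = i + 1;
        }
        i = j;
    }
    return out;
}
```

Matching whole identifiers rather than substrings keeps a variable such as x from being rewritten inside a longer name such as xF.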

void getBinKinematics(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> kinvars, string weight_name)

Compute bin statistics and kinematics and save to a CSV file.

Parameters:
  • frame – ROOT RDataFrame from which to compute bin kinematics

  • scheme_name – Bin scheme name, csv file will be named <scheme_name>_kinematics.csv

  • bincuts – Map of unique integer bin identifiers to bin cuts

  • kinvars – List of kinematic variable names

  • weight_name – Name of the weight variable, ignored if empty

void saveTH1ToCSV(const TH1 &h1, string csv_name)

Save a TH1 or TH2 ROOT histogram to a CSV file.

Save a TH1 or TH2 ROOT histogram to a CSV file. The CSV file will have columns bin, llimx, count if it is 1D, or binx, biny, llimx, llimy, count if it is 2D. Note that the lower bin limits are written for each bin so there are \(N_{bins}+1\) rows in the CSV file for 1D histograms and \((N_{bins,x}+1)\times(N_{bins,y}+1)\) rows in the CSV file for 2D histograms.

Parameters:
  • h1 – ROOT histogram to save to CSV

  • csv_name – Path to CSV file

Throws:

Runtime error

void getBinKinematicsTH1Ds(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<string> kinvars, vector<vector<double>> kinvar_lims, vector<int> kinvar_bins, bool save_pdfs = false, bool save_csvs = false)

Create 1D kinematics histograms for each bin and save to a ROOT file.

Parameters:
  • frame – ROOT RDataFrame from which to create the kinematics histograms

  • scheme_name – Bin scheme name, ROOT file will be named <scheme_name>_kinematics.root

  • bincuts – Map of unique integer bin identifiers to bin cuts

  • kinvars – List of kinematic variable names

  • kinvar_lims – List of outer bin limits for each kinematic variable

  • kinvar_bins – List of number of bins in each kinematic variable

  • save_pdfs – Option to save 1D histograms as PDFs, files will be named c1_<scheme_name>_bin<bin_id>_<kinvar>.pdf

  • save_csvs – Option to save 1D histograms as CSVs, files will be named <scheme_name>_bin<bin_id>_<kinvar>.csv

void getBinKinematicsTH2Ds(ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void> frame, string scheme_name, map<int, string> bincuts, vector<vector<string>> kinvars, vector<vector<vector<double>>> kinvar_lims, vector<vector<int>> kinvar_bins, bool save_pdfs = false, bool save_csvs = false)

Create 2D kinematics histograms for each bin and save to a ROOT file.

Parameters:
  • frame – ROOT RDataFrame from which to create the 2D kinematics histograms

  • scheme_name – Bin scheme name, ROOT file will be named <scheme_name>_kinematics.root

  • bincuts – Map of unique integer bin identifiers to bin cuts

  • kinvars – List of kinematic variable pairs (x-axis,y-axis) names

  • kinvar_lims – List of outer bin limits for each kinematic variable pair (x-axis,y-axis)

  • kinvar_bins – List of number of bins in each kinematic variable pair (x-axis,y-axis)

  • save_pdfs – Option to save 2D histograms as PDFs, files will be named c2_<scheme_name>_bin<bin_id>_<kinvar_x>_<kinvar_y>.pdf

  • save_csvs – Option to save 2D histograms as CSVs, files will be named <scheme_name>_bin<bin_id>_<kinvar_x>_<kinvar_y>.csv

namespace data

Typedefs

using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>

Functions

void createDataset(RNode frame, RooWorkspace *w, string name, string title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> &asymfitvars, vector<string> &asymfitvar_titles, vector<vector<double>> &asymfitvar_lims, vector<int> &asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins)

Create a dataset for an asymmetry fit.

Create a RooFit dataset for an asymmetry fit from a ROOT RDataFrame, adding helicity, target spin, binning, depolarization, asymmetry fit, and invariant mass fit variables. Store all variables and RooDataSet in a RooWorkspace.

Parameters:
  • frame – ROOT RDataframe from which to create a RooDataSet

  • w – RooWorkspace in which to work

  • name – Dataset name

  • title – Dataset title

  • weight_name – Name of weight variable, ignored if empty

  • categories_as_float – List of category variables to include as floats named <category>_as_float in dataset

  • helicity – Name of helicity variable

  • helicity_states – Map of state names to helicity values

  • tspin – Name of target spin variable

  • tspin_states – Map of state names to target spin values

  • htspin – Name of helicity times target spin variable

  • htspin_states – Map of state names to helicity times target spin values

  • combined_spin_state – Name of combined spin state variable

  • binvars – List of kinematic binning variable names

  • binvar_titles – List of kinematic binning variable titles

  • binvar_lims – List of kinematic binning variable minimum and maximum bounds

  • binvar_bins – List of the number of bins for each kinematic binning variable

  • depolvars – List of depolarization variable names

  • depolvar_titles – List of depolarization variable titles

  • depolvar_lims – List of depolarization variable minimum and maximum bounds

  • depolvar_bins – List of the number of bins for each depolarization variable

  • asymfitvars – List of asymmetry fit variable names

  • asymfitvar_titles – List of asymmetry fit variable titles

  • asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds

  • asymfitvar_bins – List of the number of bins for each asymmetry fit variable

  • massfitvars – List of invariant mass fit variable names

  • massfitvar_titles – List of invariant mass fit variable titles

  • massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds

  • massfitvar_bins – List of the number of bins for each invariant mass fit variable

template<typename CsvKeyType, typename CsvValueType>
RNode mapDataFromCSV(RNode filtered_df, string rdf_key_col, string csv_path, string csv_key_col, vector<string> col_names, map<string, string> col_aliases, bool readHeaders = true, char delimiter = ',')

Map values from a CSV file into an existing RDataFrame.

Load a CSV file containing, e.g., run-dependent values, with ROOT::RDataFrame::FromCSV. Then, add the data from the requested column names to an existing RDataFrame by matching entries for csv_key_col in the CSV to entries for rdf_key_col in the RDataFrame. Note that column values will automatically be cast to float in the RDataFrame.

Parameters:
  • filtered_df – RDataFrame in which to load data from CSV

  • rdf_key_col – Name of the key column in the RDataFrame

  • csv_path – Path to the CSV file

  • csv_key_col – Name of the key column in the CSV

  • col_names – List of column names for values to map from the CSV file

  • col_aliases – Map of column names to aliases for defining branches in the RDataFrame

  • readHeaders – Option to read the headers from the CSV file

  • delimiter – Delimiter used in the CSV file

Returns:

RDataFrame with run dependent values loaded from the CSV file
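
The key matching step can be illustrated with a small, ROOT-free sketch. The names CsvTable and lookupCsvValue are hypothetical, and the handling of unmatched keys is an assumption that may differ from what the library does.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the parsed CSV: key -> (column -> value)
using CsvTable = std::map<int, std::map<std::string, double>>;

// Look up column col in the row whose key equals key and cast the value to float.
// ASSUMPTION: unmatched keys or columns yield fallback; the library may behave differently.
float lookupCsvValue(const CsvTable& table, int key, const std::string& col,
                     float fallback = 0.0f) {
    const auto row = table.find(key);
    if (row == table.end()) return fallback;
    const auto cell = row->second.find(col);
    if (cell == row->second.end()) return fallback;
    return static_cast<float>(cell->second);
}
```

In this picture, the value of rdf_key_col for each event plays the role of key, and each requested entry of col_names plays the role of col.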

TRandom *initializeTRandom(UInt_t seed, string trandom_type)

Initialize a TRandom generator.

Initialize a TRandom generator from the available algorithms provided by ROOT.

Parameters:
  • seed – Seed for random number generator

  • trandom_type – Type name of ROOT TRandom number generator

Throws:

Runtime – Error

Returns:

TRandom generator of given type initialized with the given seed

RNode injectAsym(RNode df, int seed, double bpol, double tpol, string mc_sg_match_name, string asyms_sg_uu_pos_name, string asyms_sg_uu_neg_name, string asyms_sg_pu_pos_name, string asyms_sg_pu_neg_name, string asyms_sg_up_name, string asyms_sg_pp_name, string asyms_bg_uu_pos_name, string asyms_bg_uu_neg_name, string asyms_bg_pu_pos_name, string asyms_bg_pu_neg_name, string asyms_bg_up_name, string asyms_bg_pp_name, string combined_spin_state_name, string helicity_name, string tspin_name, string phi_s_up_name, string phi_s_dn_name, string phi_s_name_injected, string trandom_type)

Inject an asymmetry into an existing RDataFrame.

Inject an asymmetry into an existing ROOT::RDataFrame given a random seed, beam and target polarizations, and the relevant signal and background asymmetry formulas separated into unpolarized modulations and modulations even under transverse target spin flips, i.e., modulations even under a flip of \(\phi_{S}\), as well as asymmetry terms dependent on beam helicity, target spin, or both.

In almost all scenarios, the unpolarized and even \(\phi_{S}\) dependent modulations will not be needed. However, in the case of a term with an even dependence on \(\phi_{S}\), the \(\phi_{S}\) dependence can be injected into the dataset if a variable name is supplied for \(\phi_{S}\) in both spin states via the arguments phi_s_up_name and phi_s_dn_name.

The injection algorithm proceeds as follows. For each event, a random number \(r\in[0,1)\), a beam helicity \(\lambda_{\ell}\in\{-1,0,1\}\), and a target spin \(S\in\{-1,0,1\}\) are randomly generated. Non-zero values of \(\lambda_{\ell}\) and \(S\) are generated with probabilities taken from the beam and target polarizations respectively: \(P(\lambda_{\ell}\neq0) = \overline{\lambda_{\ell}^2}\) and \(P(S\neq0) = \overline{S^2}\). When a value is non-zero, its positive and negative signs are generated with equal probability. The probability \(w\) of accepting the proposed \((\lambda_{\ell},S)\) pair is:

\[\begin{split} w &= \frac{1}{N} \bigg{\{} 1 + A_{UU} + \, S_{||} \, A_{UL} + A_{UT}(\phi^{True}_{S}) \\ & \quad + \,\lambda_{\ell} \, \big{[}A_{LU} + S_{||} \, A_{LL} + A_{LT}(\phi^{True}_{S})\big{]} \bigg{\}}\,, \end{split}\]
where \(N\) is the number of possible combinations of \((\lambda_{\ell},S)\), given whether either has already been set to \(0\). For example, if \((\lambda_{\ell},S)=(0,\pm1)\) or \((\lambda_{\ell},S)=(\pm1,0)\) then \(N=2\), but if \((\lambda_{\ell},S)=(\pm1,\pm1)\) then \(N=4\). Note that since we rely on the fact that the \(A_{UT}\) terms are odd under a transverse target spin flip, this formulation is equivalent to the following
\[\begin{split} w &= \frac{1}{N} \bigg{\{} 1 + A_{UU} + \, S \, A_{UP} \\ & \quad + \,\lambda_{\ell} \, \big{[}A_{PU} + S \, A_{PP} \big{]} \bigg{\}}\,, \end{split}\]
and \(A_{PU}\), \(A_{UP}\), and \(A_{PP}\) are the asymmetry terms dependent on beam helicity, target spin, or both. The asymmetry terms will be taken from either the signal or background asymmetries according to the boolean variable mc_sg_match_name indicating signal events. If \(r<w\) the beam helicity and target spin values for that event are accepted, otherwise all random values are regenerated and the process repeats until \(r<w\).
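
The accept-reject loop described above can be sketched as follows. This is a minimal illustration and not the library implementation: it uses std::mt19937 in place of a ROOT TRandom generator, the Asyms struct and drawSpinState function are hypothetical names, the asymmetry values are taken as already evaluated for the event, and \(N=1\) for \((\lambda_{\ell},S)=(0,0)\) is an assumption.

```cpp
#include <cassert>
#include <random>
#include <utility>

// Hypothetical container for the asymmetry terms evaluated for one event
struct Asyms { double uu, pu, up, pp; };

// Draw a (helicity, target spin) pair by accept-reject.
std::pair<int, int> drawSpinState(std::mt19937& rng, double bpol, double tpol,
                                  const Asyms& a) {
    assert(bpol >= 0.0 && bpol <= 1.0 && tpol >= 0.0 && tpol <= 1.0);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    while (true) {
        // Non-zero states occur with probability bpol or tpol; signs are equally likely
        const int lambda = (uniform(rng) < bpol) ? ((uniform(rng) < 0.5) ? 1 : -1) : 0;
        const int spin   = (uniform(rng) < tpol) ? ((uniform(rng) < 0.5) ? 1 : -1) : 0;
        // N counts the sign combinations still open (ASSUMPTION: N = 1 for (0,0))
        const int n = (lambda != 0 ? 2 : 1) * (spin != 0 ? 2 : 1);
        const double w = (1.0 + a.uu + spin * a.up + lambda * (a.pu + spin * a.pp)) / n;
        // Accept if r < w, otherwise regenerate everything
        if (uniform(rng) < w) return {lambda, spin};
    }
}
```

With \(A_{PP}=1\) and full polarizations, for example, pairs with \(\lambda_{\ell} S=-1\) have \(w=0\) and are never accepted.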

Parameters:
  • df – ROOT::RDataFrame in which to inject asymmetry

  • seed – Seed for random number generator

  • bpol – Average beam polarization

  • tpol – Average target polarization

  • mc_sg_match_name – Name of boolean column indicating signal events

  • asyms_sg_uu_pos_name – Name of column containing the true signal unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=+1\)

  • asyms_sg_uu_neg_name – Name of column containing the true signal unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=-1\)

  • asyms_sg_pu_pos_name – Name of column containing the true signal asymmetries dependent on beam helicity, for \(S_{\perp}=+1\) in the case of a modulation even under transverse target spin flips

  • asyms_sg_pu_neg_name – Name of column containing the true signal asymmetries dependent on beam helicity, for \(S_{\perp}=-1\) in the case of a modulation even under transverse target spin flips

  • asyms_sg_up_name – Name of column containing the true signal asymmetries dependent on target spin

  • asyms_sg_pp_name – Name of column containing the true signal asymmetries dependent on beam helicity and target spin

  • asyms_bg_uu_pos_name – Name of column containing the true background unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=+1\)

  • asyms_bg_uu_neg_name – Name of column containing the true background unpolarized modulations and modulations with an even dependence on transverse target spin, i.e., \(\phi_{S}\), for \(S_{\perp}=-1\)

  • asyms_bg_pu_pos_name – Name of column containing the true background asymmetries dependent on beam helicity, for \(S_{\perp}=+1\) in the case of a modulation even under transverse target spin flips

  • asyms_bg_pu_neg_name – Name of column containing the true background asymmetries dependent on beam helicity, for \(S_{\perp}=-1\) in the case of a modulation even under transverse target spin flips

  • asyms_bg_up_name – Name of column containing the true background asymmetries dependent on target spin

  • asyms_bg_pp_name – Name of column containing the true background asymmetries dependent on beam helicity and target spin

  • combined_spin_state_name – Name of column containing combined beam helicity and target spin state encoded as \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\)

  • helicity_name – Name of column containing the beam helicity

  • tspin_name – Name of column containing the target spin

  • phi_s_up_name – Name of column containing the injected \(\phi_{S}\) variable for \(S_{\perp}=+1\) events

  • phi_s_dn_name – Name of column containing the injected \(\phi_{S}\) variable for \(S_{\perp}=-1\) events

  • phi_s_name_injected – Name of column to contain the injected \(\phi_{S}\) variable

  • trandom_type – Type name of ROOT TRandom number generator

Throws:

Runtime – Error

Returns:

ROOT::RDataFrame with helicity and target spin values injected
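
The combined spin state encoding \(ss = (\lambda_{\ell}+1)\cdot10 + (S+1)\) given for combined_spin_state_name can be checked with a short sketch. Both function names are hypothetical, and the decoder is added here for illustration only.

```cpp
#include <cassert>
#include <utility>

// Encode helicity and target spin, each in {-1, 0, 1}, as ss = (helicity+1)*10 + (tspin+1)
int encodeSpinState(int helicity, int tspin) {
    assert(helicity >= -1 && helicity <= 1 && tspin >= -1 && tspin <= 1);
    return (helicity + 1) * 10 + (tspin + 1);
}

// Invert the encoding: the tens digit gives the helicity, the ones digit the target spin
std::pair<int, int> decodeSpinState(int ss) {
    return {ss / 10 - 1, ss % 10 - 1};
}
```

The nine possible states therefore map to 0, 1, 2, 10, 11, 12, 20, 21, and 22.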

RNode bootstrapPoisson(RNode df, int seed, string weight_name, string trandom_type)

Weight a dataframe by resampling with Poissonian statistics.

Weight an existing ROOT::RDataFrame following the Poissonian bootstrapping method of resampling each event randomly from a Poissonian distribution with mean \(\lambda=1\).

Parameters:
  • df – ROOT::RDataFrame to weight

  • seed – Seed for random number generator

  • weight_name – Name of column containing the event weights

  • trandom_type – Type name of ROOT TRandom number generator

Returns:

ROOT::RDataFrame filtered for non-zero resampling weights
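
As a rough illustration of Poissonian bootstrapping, the sketch below draws one resampling weight per event from a Poisson distribution with mean 1. It uses the standard library generator rather than a ROOT TRandom, and poissonBootstrapWeights is a hypothetical name.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw one Poisson(1) resampling weight per event.
// Events that receive weight 0 would be filtered out of the resampled dataset.
std::vector<int> poissonBootstrapWeights(std::size_t n_events, unsigned seed) {
    std::mt19937 rng(seed);
    std::poisson_distribution<int> poisson(1.0);
    std::vector<int> weights(n_events);
    for (int& w : weights) w = poisson(rng);
    return weights;
}
```

Because each weight is drawn independently, the total resampled size fluctuates around the original number of events instead of being fixed.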

RNode bootstrapClassical(RNode df, int n, int seed, string weight_name, string trandom_type)

Weight a dataframe by resampling with replacement.

Weight an existing ROOT::RDataFrame following the classical bootstrapping method of resampling with replacement.

Parameters:
  • df – ROOT::RDataFrame to weight

  • n – Sample size

  • seed – Seed for random number generator

  • weight_name – Name of column containing the event weights

  • trandom_type – Type name of ROOT TRandom number generator

Throws:

Runtime – Error

Returns:

ROOT::RDataFrame filtered for non-zero resampling weights
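
Resampling with replacement can be expressed as per-event weights by counting how often each event index is drawn. The sketch below assumes this weight-based formulation and uses the standard library generator; classicalBootstrapWeights is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw n event indices uniformly with replacement and count the multiplicity of each event.
// Events with multiplicity 0 would be filtered out of the resampled dataset.
std::vector<int> classicalBootstrapWeights(std::size_t n_events, std::size_t n,
                                           unsigned seed) {
    assert(n_events > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, n_events - 1);
    std::vector<int> weights(n_events, 0);
    for (std::size_t i = 0; i < n; ++i) ++weights[pick(rng)];
    return weights;
}
```

Unlike the Poissonian variant, the weights here always sum to exactly the sample size n.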

template<typename RetType>
RetType get_weighted_count(RNode df, string weight_name)

Compute a weighted count.

Compute the weighted count of a ROOT::RDataFrame.

Parameters:
  • df – RDataFrame to use

  • weight_name – Name of the weight column in the RDataFrame, ignored if empty

Returns:

The weighted count

template<typename RetType>
RetType get_weighted_mean(RNode df, string var_name, string weight_name)

Compute a weighted mean.

Compute the weighted mean of a variable in a ROOT::RDataFrame.

Parameters:
  • df – RDataFrame to use

  • var_name – Name of the variable for which to compute the mean

  • weight_name – Name of the weight column in the RDataFrame, ignored if empty

Returns:

The weighted mean of the variable

template<typename RetType>
RetType get_weighted_stddev(RNode df, string var_name, string weight_name, RetType mean = 0.0)

Compute a weighted standard deviation.

Compute the weighted standard deviation of a variable in a ROOT::RDataFrame.

Parameters:
  • df – RDataFrame to use

  • var_name – Name of the variable for which to compute the standard deviation

  • weight_name – Name of the weight column in the RDataFrame, ignored if empty

  • mean – Value of the weighted mean if already available

Returns:

The weighted standard deviation of the variable
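
For reference, the three weighted statistics can be sketched on plain vectors as below. The function names are hypothetical, and the normalization of the standard deviation by the sum of weights is an assumption; the library may use a different convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Weighted count: the sum of the weights
double weightedCount(const std::vector<double>& w) {
    double sum = 0.0;
    for (double wi : w) sum += wi;
    return sum;
}

// Weighted mean: sum(w*x) / sum(w)
double weightedMean(const std::vector<double>& x, const std::vector<double>& w) {
    assert(x.size() == w.size());
    double num = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) num += w[i] * x[i];
    return num / weightedCount(w);
}

// Weighted standard deviation: sqrt(sum(w*(x-mean)^2) / sum(w))
// ASSUMPTION: no bias correction is applied
double weightedStddev(const std::vector<double>& x, const std::vector<double>& w) {
    const double mean = weightedMean(x, w);
    double num = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) num += w[i] * (x[i] - mean) * (x[i] - mean);
    return std::sqrt(num / weightedCount(w));
}
```

With all weights equal to 1, these reduce to the ordinary count, mean, and population standard deviation.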

RNode defineAngularDiffVars(RNode frame, vector<string> particle_suffixes, string theta_name = "theta", string phi_name = "phi", string mc_suffix = "_mc")

Define Monte Carlo (MC) simulation angular difference variables.

Define the angular difference variables by taking the difference of reconstructed and MC values:

  • \(\Delta\theta = |\theta_{Rec} - \theta_{MC}|\)

  • \(\Delta\phi = |\phi_{Rec} - \phi_{MC}|\).

Note that \(\phi\) is a cyclic variable on \(2\pi\), so if \(|\phi_{Rec} - \phi_{MC}|>\pi\) then:

  • \(\Delta\phi = 2\pi - |\phi_{Rec} - \phi_{MC}|\).

Parameters:
  • frame – ROOT::RDataFrame in which to define angular difference variables

  • particle_suffixes – Suffixes of particle variables to define angular difference variables for

  • theta_name – Name of theta variable

  • phi_name – Name of phi variable

  • mc_suffix – Suffix of MC variables

Returns:

ROOT::RDataFrame with angular difference variables defined
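
The wrap-around rule for \(\Delta\phi\) can be written as a short, self-contained sketch. The function names are hypothetical, and both angles are assumed to lie within the same \(2\pi\) range.

```cpp
#include <cmath>

// Absolute polar angle difference
double deltaTheta(double theta_rec, double theta_mc) {
    return std::abs(theta_rec - theta_mc);
}

// Absolute azimuthal angle difference, folded into [0, pi] because phi is cyclic.
// ASSUMPTION: both inputs lie within the same 2*pi range.
double deltaPhi(double phi_rec, double phi_mc) {
    const double pi = std::acos(-1.0);
    const double diff = std::abs(phi_rec - phi_mc);
    return (diff > pi) ? 2.0 * pi - diff : diff;
}
```

For example, \(\phi_{Rec}=0.1\) and \(\phi_{MC}=6.2\) differ by \(2\pi-6.1\) rather than by \(6.1\).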

namespace hbanalysis

Typedefs

using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>

Functions

vector<double> fitHB(RooWorkspace *w, string dataset_name, double bpol, double tpol, double alpha, string helicity, string tspin, string htspin, string combined_spin_state, string binid, string bincut, vector<string> binvars, vector<string> depolvars, vector<string> fitvars, ostream &out = cout)

Fit an asymmetry with the Helicity Balance (HB) method.

Compute the bin count, bin variable mean values and variances, depolarization variable values and errors, and fit the asymmetry with the Helicity Balance (HB) method. The asymmetry parameter will be computed with:

\[ D^{\Lambda}_{LL'} = \frac{1}{\alpha_{\Lambda} \overline{\lambda_{\ell}^2}}\frac{\sum^{N_{\Lambda}}_{i=1}\lambda_{\ell,i} \cos{\theta_{LL'}^i}}{\sum^{N_{\Lambda}}_{i=1}D(y_i) \cos^2{\theta_{LL'}^i}} \,. \]
Here, \(\lambda_{\ell,i}\) indicates the beam helicity for a given event \(i\), and \(\overline{\lambda^{2}_{\ell}}\) is the luminosity averaged beam polarization. The method relies on the assumption that the luminosity averaged helicity \(\overline{\lambda_{\ell}}=0\) to allow the acceptance method to cancel out. See Gunar Schnell’s thesis from New Mexico State University, 1999 for a full derivation. \(N_{\Lambda}\) is the number of \(\Lambda\) events in the bin, and \(D(y_i)\) and \(\cos{\theta_{LL'}^i}\) are the depolarization factor and the decay angle in the \(\Lambda\) CM frame respectively for the given event. Similarly, the error will be computed as follows. Letting
\[\begin{split} \begin{aligned} A& = \lambda_{\ell,i} \cos{\theta_{LL'}^i}, \\ B& =D(y_i) \cos^2{\theta_{LL'}^i} \,, \end{aligned} \end{split}\]
the statistical scale uncertainty may be expressed as
\[\begin{split} \begin{aligned} \bigg{(}\frac{\delta D^{\Lambda}_{LL'}}{D^{\Lambda}_{LL'}}\bigg{)}^2& = \bigg{[}\delta\bigg{(}\frac{\text{Sum}[A]}{\text{Sum}[B]}\bigg{)}\bigg{/}\bigg{(}\frac{\text{Sum}[A]}{\text{Sum}[B]}\bigg{)}\bigg{]}^2 \\ & = \bigg{(} \frac{\text{Var}[A]^2}{\text{Sum}[A]^2} + \frac{\text{Var}[B]^2}{\text{Sum}[B]^2} - 2 \frac{\text{Var}[AB]}{\text{Sum}[A]\text{Sum}[B]}\bigg{)} \,. \end{aligned} \end{split}\]
Here, \(\text{Var}[X_i] = \sum_i (X_i - \text{Mean}[X_i])^2\) denotes the variance of a quantity \(X_i\), and Sum and Mean are exactly the operations named. Since both \(A\) and \(B\) are polynomial functions of \(\cos{\theta_{LL'}}\), \(A\) and \(B\) are correlated. Thus, we include the covariance term \(\text{Var[AB]}\) in the uncertainty calculation.
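
The central value of the Helicity Balance expression can be sketched on a plain list of events as follows. The HBEvent struct and helicityBalance function are hypothetical names, and the uncertainty calculation is omitted.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-event record: beam helicity, depolarization factor D(y), and cos(theta_LL')
struct HBEvent { int helicity; double depol; double cos_theta; };

// D_LL' = sum(lambda_i * cos_i) / (alpha * bpol * sum(D(y_i) * cos_i^2))
double helicityBalance(const std::vector<HBEvent>& events, double alpha, double bpol) {
    double num = 0.0;  // Sum[A]
    double den = 0.0;  // Sum[B]
    for (const HBEvent& e : events) {
        num += e.helicity * e.cos_theta;
        den += e.depol * e.cos_theta * e.cos_theta;
    }
    assert(alpha != 0.0 && bpol != 0.0 && den != 0.0);
    return num / (alpha * bpol * den);
}
```

The two running sums correspond to \(\text{Sum}[A]\) and \(\text{Sum}[B]\) in the uncertainty expression above.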

The returned vector will have the following entries:

  • Bin count

  • For each bin variable:

    • Bin variable mean value

    • Bin variable standard deviation

  • For each depolarization variable:

    • Depolarization variable mean value

    • Depolarization variable standard deviation

  • The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of

    • Beam helicity \(\lambda_{\ell}\)

    • Target spin \(S\)

    • Beam helicity times target spin \(\lambda_{\ell} \, S\)

  • For the (only) Helicity Balance parameter:

    • Helicity Balance parameter mean value

    • Helicity Balance parameter error

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)

  • tpol – Luminosity averaged target polarization \(\overline{S^2}\)

  • alpha – Lambda decay asymmetry parameter \(\alpha_{\Lambda}\)

  • helicity – Name of the helicity variable

  • tspin – Name of the target spin variable

  • htspin – Name of the beam helicity times target spin variable

  • combined_spin_state – Name of the combined spin state variable

  • binid – Bin unique id

  • bincut – Kinematic variable cut for bin

  • binvars – List of kinematic binning variables

  • depolvars – List of depolarization variables

  • fitvars – List of asymmetry fit variables

  • out – Output stream

Throws:

Runtime – error

Returns:

List of bin count, bin variable means and errors, depolarization variable means and errors, fit parameters and errors

void getKinBinnedHB(string baseoutpath, string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> asymfitvars, vector<string> asymfitvar_titles, vector<vector<double>> asymfitvar_lims, vector<int> asymfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, double bpol, double tpol, double alpha, map<string, string> massfit_yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, bool use_splot, string massfit_sgcut, string massfit_bgcut, bool use_sb_subtraction, bool use_binned_sb_bgfracs, map<int, string> asymfitvar_bincuts, string bgfracvar, vector<double> bgfracvar_lims, int bgfrac_idx = 0, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)

Loop over kinematic bins and extract an asymmetry with the Helicity Balance method, correcting for background with sideband subtraction or sPlots.

Loop over bin cuts and fit an asymmetry with the saga::hbanalysis::fitHB() method. Optionally, apply an invariant mass fit and background correction using the sideband subtraction method or the sPlot method from arXiv:physics/0402083. The mass fit will be applied with saga::signal::fitMass() and the sPlot method will use saga::signal::applySPlot().

Results will be saved in a csv file with the following columns:

  • bin_id: The unique bin id

  • count: The total number of counts in the bin

  • For each bin variable binvar

    • <binvar>: Mean value

    • <binvar>_err: Standard deviation

  • For each depolarization variable depolvar

    • <depolvar>: Mean value

    • <depolvar>_err: Standard deviation

  • The raw asymmetries and errors using the actual counts or, in the case of an extended fit, using the fitted counts, for each of

    • Beam helicity \(\lambda_{\ell}\)

    • Target spin \(S\)

    • Beam helicity times target spin \(\lambda_{\ell} \, S\)

  • For each asymmetry fit parameter asymfitpar (only one allowed for the HB method)

    • <asymfitpar>: Final parameter value

    • <asymfitpar>_err: Final parameter error

The following columns will be added in the case of a single mass fit applied to the entire bin:

  • int_sg_pdf_val: Signal PDF integral \(N_{SG}^{PDF}\) value in the signal region

  • int_sg_pdf_err: Signal PDF integral error \(\delta N_{SG}^{PDF}\) in the signal region

  • int_bg_pdf_val: Background PDF integral \(N_{BG}^{PDF}\) in the signal region

  • int_bg_pdf_err: Background PDF integral error \(\delta N_{BG}^{PDF}\) in the signal region

  • int_model_pdf_val: Full PDF integral \(N^{PDF}\) in the signal region

  • int_model_pdf_err: Full PDF integral error \(\delta N^{PDF}\) in the signal region

  • int_ds_val: Full dataset sum \(N^{DS}\) in the signal region

  • int_ds_err: Poissonian error \(\sqrt{N^{DS}}\) of the full dataset sum in the signal region

  • eps_bg_pdf: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\)

  • eps_bg_pdf_err: Background fraction error \(\delta\varepsilon_{1}\)

  • eps_sg_pdf: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\)

  • eps_sg_pdf_err: Background fraction error \(\delta\varepsilon_{2}\)

  • eps_pdf: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\)

  • eps_pdf_err: Background fraction error \(\delta\varepsilon_{3}\)

  • For each mass fit variable:

    • <chi2>: \(\chi^2\) value of the 1D projection of the full PDF in that variable

  • For each mass fit signal PDF parameter massfitpar_sg

    • <massfitpar_sg>: Final parameter value

    • <massfitpar_sg>_err: Final parameter error

  • For each mass fit background PDF parameter massfitpar_bg

    • <massfitpar_bg>: Final parameter value

    • <massfitpar_bg>_err: Final parameter error
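
The three background fraction definitions listed above differ only in which integrals they combine. The sketch below evaluates them side by side; the BgFractions struct and backgroundFractions function are hypothetical names, and the error propagation is omitted.

```cpp
#include <cassert>

// Hypothetical container mirroring the eps_bg_pdf, eps_sg_pdf, and eps_pdf columns
struct BgFractions { double eps_bg_pdf, eps_sg_pdf, eps_pdf; };

// Evaluate the three background fraction formulations in the signal region
BgFractions backgroundFractions(double n_sg_pdf, double n_bg_pdf,
                                double n_model_pdf, double n_ds) {
    assert(n_ds > 0.0 && n_model_pdf > 0.0);
    BgFractions eps;
    eps.eps_bg_pdf = n_bg_pdf / n_ds;               // epsilon_1 = N_BG^PDF / N^DS
    eps.eps_sg_pdf = 1.0 - n_sg_pdf / n_ds;         // epsilon_2 = 1 - N_SG^PDF / N^DS
    eps.eps_pdf    = 1.0 - n_sg_pdf / n_model_pdf;  // epsilon_3 = 1 - N_SG^PDF / N^PDF
    return eps;
}
```

The three values coincide when the fitted model integral matches the dataset sum and the signal and background integrals add up to it.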

Parameters:
  • baseoutpath – Base name prefix for output files

  • scheme_name – Name of the bin scheme and basename of the output CSV file

  • frame – ROOT RDataframe from which to create RooFit datasets

  • workspace_name – Name of workspace in which to work

  • workspace_title – Title of workspace in which to work

  • dataset_name – Dataset name

  • dataset_title – Dataset title

  • weight_name – Name of weight variable, ignored if empty

  • categories_as_float – List of category variables to include as asymmetry fit variables in dataset

  • helicity – Name of helicity variable

  • helicity_states – Map of state names to helicity values

  • tspin – Name of target spin variable

  • tspin_states – Map of state names to target spin values

  • htspin – Name of helicity times target spin variable

  • htspin_states – Map of state names to helicity times target spin values

  • combined_spin_state – Name of combined spin state variable

  • bincuts – Map of unique bin id ints to bin variable cuts for bin

  • binvars – List of kinematic binning variable names

  • binvar_titles – List of kinematic binning variable titles

  • binvar_lims – List of kinematic binning variable minimum and maximum bounds

  • binvar_bins – List of the number of bins for each kinematic binning variable

  • depolvars – List of depolarization variable names

  • depolvar_titles – List of depolarization variable titles

  • depolvar_lims – List of depolarization variable minimum and maximum bounds

  • depolvar_bins – List of the number of bins for each depolarization variable

  • asymfitvars – List of asymmetry fit variable names

  • asymfitvar_titles – List of asymmetry fit variable titles

  • asymfitvar_lims – List of asymmetry fit variable minimum and maximum bounds

  • asymfitvar_bins – List of the number of bins for each asymmetry fit variable

  • massfitvars – List of invariant mass fit variable names

  • massfitvar_titles – List of invariant mass fit variable titles

  • massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds

  • massfitvar_bins – List of the number of bins for each invariant mass fit variable

  • bpol – Luminosity averaged beam polarization \(\overline{\lambda_{\ell}^2}\)

  • tpol – Luminosity averaged target polarization \(\overline{S^2}\)

  • alpha – Lambda decay asymmetry parameter \(\alpha_{\Lambda}\)

  • massfit_yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining mass fit arguments. Note that the values specified here will function as the defaults.

  • massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <pdf_name>_<binid>.

  • massfit_formula_sg – The signal PDF formula in ROOT TFormula format

  • massfit_formula_bg – The background PDF formula in ROOT TFormula format

  • massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset

  • massfit_parinits_sg – List of signal PDF parameter initial values

  • massfit_parnames_sg – List of signal PDF parameter names

  • massfit_partitles_sg – List of signal PDF parameter titles

  • massfit_parunits_sg – List of signal PDF parameter unit titles

  • massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds

  • massfit_parinits_bg – List of background PDF parameter initial values

  • massfit_parnames_bg – List of background PDF parameter names

  • massfit_partitles_bg – List of background PDF parameter titles

  • massfit_parunits_bg – List of background PDF parameter unit titles

  • massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds

  • massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable

  • massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable

  • use_splot – Option to use sPlot method and perform fit with sWeighted dataset

  • massfit_sgcut – Signal region cut for sideband subtraction background correction. Note, this will automatically be formed from massfit_sgregion_lims if not specified.

  • massfit_bgcut – Background region cut for sideband subtraction background correction

  • use_sb_subtraction – Option to use sideband subtraction for background correction

  • use_binned_sb_bgfracs – Option to use background fractions from invariant mass fits binned in the asymmetry fit variable for background correction

  • asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for asymmetry fit variable bins

  • bgfracvar – Name of binned background fraction variable

  • bgfracvar_lims – List of binned background fraction variable minimum and maximum bounds

  • bgfrac_idx – Index to select which formulation to use for the background fraction in saga::signal::setBinnedBGFractions()

  • massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend for the signal and background mass fit

  • massfit_lg_text_size – Size of TLegend text for the signal and background mass fit

  • massfit_lg_margin – Margin of TLegend for the signal and background mass fit

  • massfit_lg_ncols – Number of columns in TLegend for the signal and background mass fit

  • massfit_use_sumw2error – Option to use RooFit::SumW2Error(true) option for the signal and background mass fit which is necessary if using a weighted dataset

  • massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the signal and background mass fit

  • massfit_use_binned_fit – Option to use a binned fit to the data for the signal and background mass fit

  • out – Output stream

Throws:

runtime_error – if invalid arguments are provided

namespace log
class Logger
#include <log.h>

Singleton logger class for SAGA.

namespace resolution

Typedefs

using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>

Functions

vector<double> fitResolution(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> resfitvars, string yamlfile, string pdf_name = "gauss", string fitformula = "gaus(x[0],x[1],x[2],x[3])", vector<string> parnames = {"constant", "mu", "sigma"}, vector<string> partitles = {"C", "#mu", "#sigma"}, vector<string> parunits = {"", "", ""}, vector<double> parinits = {1.0, 0.0, 0.1}, vector<vector<double>> parlims = {{1.0, 1.0}, {-1.0, 1.0}, {0.0, 1.0}}, string plot_title = "Fit Resolution", double lg_text_size = 0.04, double lg_margin = 0.1, int lg_ncols = 1, bool use_sumw2error = false, bool use_extended_nll = true, bool use_binned_fit = false, ostream &out = cout)

Fit a resolution distribution.

Fit a resolution distribution, that is, the difference between reconstructed and true values \(\Delta X = X_{Rec} - X_{True}\), with a generic PDF, which defaults to a Gaussian. Starting parameter values and limits may be loaded from a YAML file for each bin.
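As a rough illustration of the quantity being fitted, the sketch below computes the sample mean and standard deviation of \(\Delta X = X_{Rec} - X_{True}\) in plain C++. These moments only approximate what a Gaussian fit would return for mu and sigma. The helper resolutionMoments and the struct ResolutionMoments are hypothetical names introduced for this example and are not part of SAGA; fitResolution() itself performs a RooFit fit, which is not reproduced here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container (not part of SAGA) for the moments of Delta X.
struct ResolutionMoments {
    double mean;    // rough stand-in for the Gaussian parameter mu
    double stddev;  // rough stand-in for the Gaussian parameter sigma
};

// Hypothetical helper (not part of SAGA): moments of Delta X = X_rec - X_true.
ResolutionMoments resolutionMoments(const std::vector<double>& rec,
                                    const std::vector<double>& truth) {
    assert(rec.size() == truth.size() && !rec.empty());
    const double n = static_cast<double>(rec.size());

    // First pass: mean of the differences.
    double sum = 0.0;
    for (std::size_t i = 0; i < rec.size(); ++i) {
        sum += rec[i] - truth[i];
    }
    const double mean = sum / n;

    // Second pass: population variance about the mean.
    double sq = 0.0;
    for (std::size_t i = 0; i < rec.size(); ++i) {
        const double d = (rec[i] - truth[i]) - mean;
        sq += d * d;
    }
    return {mean, std::sqrt(sq / n)};
}
```

Values like these could serve as starting points for parinits, assuming the distribution is roughly Gaussian.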

The returned vector will contain, in order:

  • The total bin count

  • The average bin variable means and corresponding standard deviations

  • The \(\chi^2/NDF\) of the fit from a 1D histogram in each fit variable

  • The parameter value and error for each fit parameter

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • binid – Bin unique id

  • bincut – Kinematic variable cut for bin

  • binvars – List of kinematic binning variables

  • resfitvars – List of resolution fit variables

  • yamlfile – Path to YAML file specifying the remaining fit arguments

  • pdf_name – Base name of PDF. Note, the actual PDF name will be: <pdf_name>_<binid>.

  • fitformula – The PDF formula in ROOT TFormula format

  • parnames – List of PDF parameter names

  • partitles – List of PDF parameter titles

  • parunits – List of PDF parameter unit titles

  • parinits – List of PDF parameter initial values

  • parlims – List of PDF parameter minimum and maximum bounds

  • plot_title – Title of fit plot

  • lg_text_size – Size of TLegend text

  • lg_margin – Margin of TLegend

  • lg_ncols – Number of columns in TLegend

  • use_sumw2error – Option to use RooFit::SumW2Error(true) when fitting, which is necessary when fitting a weighted dataset

  • use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization

  • use_binned_fit – Option to use a binned fit to the data

  • out – Output stream

Returns:

List containing fit results

void getKinBinnedResolutions(string scheme_name, RNode frame, string workspace_name, string workspace_title, string dataset_name, string dataset_title, string weight_name, vector<string> categories_as_float, string helicity, map<string, int> helicity_states, string tspin, map<string, int> tspin_states, string htspin, map<string, int> htspin_states, string combined_spin_state, map<int, string> bincuts, vector<string> binvars, vector<string> binvar_titles, vector<vector<double>> binvar_lims, vector<int> binvar_bins, vector<string> depolvars, vector<string> depolvar_titles, vector<vector<double>> depolvar_lims, vector<int> depolvar_bins, vector<string> resfitvars, vector<string> resfitvar_titles, vector<vector<double>> resfitvar_lims, vector<int> resfitvar_bins, vector<string> massfitvars, vector<string> massfitvar_titles, vector<vector<double>> massfitvar_lims, vector<int> massfitvar_bins, map<string, string> yamlfile_map, string pdf_name, string fitformula, vector<string> parnames, vector<string> partitles, vector<string> parunits, vector<double> parinits, vector<vector<double>> parlims, double lg_text_size = 0.04, double lg_margin = 0.1, int lg_ncols = 1, bool use_sumw2error = false, bool use_extended_nll = true, bool use_binned_fit = false, ostream &out = cout)

Loop over kinematic bins and fit a resolution distribution.

Loop over bin cuts and fit a resolution distribution in each bin with the saga::resolution::fitResolution() method.

Results will be saved in a csv file with the following columns:

  • bin_id: The unique bin id

  • count: The total number of counts in the bin

  • For each bin variable binvar

    • <binvar>: Mean value

    • <binvar>_err: Standard deviation

  • For each independent fit variable:

    • <chi2ndf>: \(\chi^2/NDF\) value of the 1D projection of the full PDF in that variable

  • For each resolution fit PDF parameter fitpar

    • <fitpar>: Final parameter value

    • <fitpar>_err: Final parameter error

Parameters:
  • scheme_name – Name of the bin scheme and basename of the output csv file

  • frame – ROOT RDataFrame from which to create RooFit datasets

  • workspace_name – Name of workspace in which to work

  • workspace_title – Title of workspace in which to work

  • dataset_name – Dataset name

  • dataset_title – Dataset title

  • weight_name – Name of weight variable, ignored if empty

  • categories_as_float – List of category variables to include as asymmetry fit variables in dataset

  • helicity – Name of helicity variable

  • helicity_states – Map of state names to helicity values

  • tspin – Name of target spin variable

  • tspin_states – Map of state names to target spin values

  • htspin – Name of helicity times target spin variable

  • htspin_states – Map of state names to helicity times target spin values

  • combined_spin_state – Name of combined spin state variable

  • bincuts – Map of unique bin id ints to bin variable cuts for bin

  • binvars – List of kinematic binning variable names

  • binvar_titles – List of kinematic binning variable titles

  • binvar_lims – List of kinematic binning variable minimum and maximum bounds

  • binvar_bins – List of kinematic binning variable bins

  • depolvars – List of depolarization variable names

  • depolvar_titles – List of depolarization variable titles

  • depolvar_lims – List of depolarization variable minimum and maximum bounds

  • depolvar_bins – List of depolarization variable bins

  • resfitvars – List of resolution fit variable names

  • resfitvar_titles – List of resolution fit variable titles

  • resfitvar_lims – List of resolution fit variable minimum and maximum bounds

  • resfitvar_bins – List of resolution fit variable bins

  • massfitvars – List of invariant mass fit variable names

  • massfitvar_titles – List of invariant mass fit variable titles

  • massfitvar_lims – List of invariant mass fit variable minimum and maximum bounds

  • massfitvar_bins – List of invariant mass fit variable bins

  • yamlfile_map – Map of bin ids to the paths of yaml files specifying the remaining resolution fit arguments. Note that the values specified here will function as the defaults.

  • pdf_name – Base name of the resolution PDF. Note, the actual PDF name will be: <pdf_name>_<binid>.

  • fitformula – The resolution PDF formula in ROOT TFormula format

  • parnames – List of resolution PDF parameter names

  • partitles – List of resolution PDF parameter titles

  • parunits – List of resolution PDF parameter unit titles

  • parinits – List of resolution PDF parameter initial values

  • parlims – List of resolution PDF parameter minimum and maximum bounds

  • lg_text_size – Size of TLegend text for the resolution fit

  • lg_margin – Margin of TLegend for the resolution fit

  • lg_ncols – Number of columns in TLegend for the resolution fit

  • use_sumw2error – Option to use RooFit::SumW2Error(true) for the resolution fit, which is necessary when fitting a weighted dataset

  • use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization for the resolution fit

  • use_binned_fit – Option to use a binned fit to the data for the resolution fit

  • out – Output stream

Throws:

runtime_error

namespace signal

Typedefs

using RNode = ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter, void>

Functions

vector<string> getGenMassPdf(RooWorkspace *w, RooArgSet *argset_sg, RooArgSet *argset_bg, string pdf_name, string sgYield_name, string bgYield_name, string binid, string fitformula_sg, string fitformula_bg, double initsgfrac, int count, bool use_extended_nll)

Create a PDF for fitting a combined signal and background invariant mass distribution.

Create a PDF given the formulas for the signal and background distributions. The PDF will be constructed internally using RooGenericPdf in the form:

\[\begin{split} \begin{aligned} PDF(x_0, x_1, ..., &a_{SG,0}, a_{SG,1}, ..., a_{BG,0}, a_{BG,1}, ...) = \\ & u \, f_{SG}(\vec{x}, \vec{a}_{SG}) \\ & + (1-u) \, f_{BG}(\vec{x}, \vec{a}_{BG}), \\ \end{aligned} \end{split}\]
where \(u\) is the ratio of signal events to total events in the dataset and \(f_{C}(x_0, x_1, ..., a_{C,0}, a_{C,1}, ...)\) for \(C\in\{SG,BG\}\) are the given signal and background PDFs. The x_<int> denote the independent fit variables, i.e., the mass variables, and the a_<int> denote the dependent fit variables, i.e., the signal and background PDF parameters.

The variable names in each fit formula should (separately) follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], a_1 \(\rightarrow\) x[N_x+1], etc.

The indexing of parameters in the fit formula is separate for the signal and background PDFs.
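To make the mixture form and the separate indexing concrete, here is a minimal plain C++ sketch that evaluates \(u \, f_{SG} + (1-u) \, f_{BG}\) for one mass variable. The Gaussian signal shape, the linear background shape, and all function names are assumptions chosen for illustration; they are not taken from SAGA. Normalization is omitted, and the real function builds the PDF from formula strings rather than C++ functions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumed signal shape: unnormalized Gaussian.
// Indexing mirrors the signal formula on its own: x[0] = mass, x[1] = mu, x[2] = sigma.
double signalShape(const std::vector<double>& x) {
    assert(x.size() == 3);
    const double z = (x[0] - x[1]) / x[2];
    return std::exp(-0.5 * z * z);
}

// Assumed background shape: linear in the mass.
// Indexing restarts for the background formula: x[0] = mass, x[1] = slope.
double backgroundShape(const std::vector<double>& x) {
    assert(x.size() == 2);
    return 1.0 + x[1] * x[0];
}

// Mixture u * f_SG + (1 - u) * f_BG, with u the signal fraction.
double mixture(double u, double mass, double mu, double sigma, double slope) {
    assert(u >= 0.0 && u <= 1.0);
    return u * signalShape({mass, mu, sigma})
         + (1.0 - u) * backgroundShape({mass, slope});
}
```

Note that x[1] means mu in the signal shape and the slope in the background shape, which is the point of indexing each formula separately.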

Parameters:
  • w – RooWorkspace in which to work

  • argset_sg – Argument set for the signal PDF

  • argset_bg – Argument set for the background PDF

  • pdf_name – Base name of full PDF

  • sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • binid – Unique bin id, used to name the PDF

  • fitformula_sg – The signal PDF formula in ROOT TFormula format

  • fitformula_bg – The background PDF formula in ROOT TFormula format

  • initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset

  • count – Bin count

  • use_extended_nll – Option to use an extended likelihood term

Returns:

List of the names of the combined, signal, and background pdfs and the signal and background yield variables, in that order

vector<double> fitMass(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> fitvars, string yamlfile, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, ostream &out = cout)

Fit a combined signal and background mass distribution.

Compute the bin count and the bin variable mean values and standard deviations, then fit the combined signal and background mass distribution to a binned or unbinned dataset using a maximum likelihood fit with an optional extended likelihood term. For the maximum likelihood fit, the given PDF formulas \(f_{SG}(x_0, x_1, ..., a_{SG,0}, a_{SG,1}, ...)\) and \(f_{BG}(x_0, x_1, ..., a_{BG,0}, a_{BG,1}, ...)\) will be used internally by getGenMassPdf() to construct a PDF of the form:

\[\begin{split} \begin{aligned} PDF(x_0, x_1, ..., &a_{SG,0}, a_{SG,1}, ..., a_{BG,0}, a_{BG,1}, ...) = \\ & u \, f_{SG}(\vec{x}, \vec{a}_{SG}) \\ & + (1-u) \, f_{BG}(\vec{x}, \vec{a}_{BG}), \\ \end{aligned} \end{split}\]
where \(u\) is the ratio of signal events to total events in the dataset. The x_<int> denote the independent fit variables, i.e., the mass variables, and the a_<int> denote the dependent fit variables, i.e., the signal and background PDF parameters.

The variable names in each fit formula should (separately) follow the TFormula notation, e.g., x_0 \(\rightarrow\) x[0], x_1 \(\rightarrow\) x[1], a_0 \(\rightarrow\) x[N_x], a_1 \(\rightarrow\) x[N_x+1], etc.

The mass fit parameters (massfit_*) may optionally be loaded from a yaml file containing all these parameters by name to allow one to specify bin-dependent starting parameters, limits, and even PDF formulas.

The returned vector will contain, in order:

  • Total bin count

  • Signal PDF integral \(N_{SG}^{PDF}\) and error in the signal region

  • Background PDF integral \(N_{BG}^{PDF}\) and error in the signal region

  • Full PDF integral \(N^{PDF}\) and error in the signal region

  • Full dataset sum \(N^{DS}\) and Poissonian error \(\sqrt{N^{DS}}\) in the signal region

  • Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\) and error

  • Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\) and error

  • Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\) and error

  • \(\chi^2/NDF\) computed from the 1D histogram in each fit variable

  • Bin variable mean values and standard deviations

  • Signal PDF parameters and errors

  • Background PDF parameters and errors

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • binid – Bin unique id

  • bincut – Kinematic variable cut for bin

  • binvars – List of kinematic binning variables

  • fitvars – List of mass fit variables

  • yamlfile – Path to YAML file specifying the remaining mass fit arguments

  • massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <massfit_pdf_name>_<binid>.

  • massfit_formula_sg – The signal PDF formula in ROOT TFormula format

  • massfit_formula_bg – The background PDF formula in ROOT TFormula format

  • massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset

  • massfit_parinits_sg – List of signal PDF parameter initial values

  • massfit_parnames_sg – List of signal PDF parameter names

  • massfit_partitles_sg – List of signal PDF parameter titles

  • massfit_parunits_sg – List of signal PDF parameter unit titles

  • massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds

  • massfit_parinits_bg – List of background PDF parameter initial values

  • massfit_parnames_bg – List of background PDF parameter names

  • massfit_partitles_bg – List of background PDF parameter titles

  • massfit_parunits_bg – List of background PDF parameter unit titles

  • massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds

  • massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable

  • massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable

  • massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend

  • massfit_lg_text_size – Size of TLegend text

  • massfit_lg_margin – Margin of TLegend

  • massfit_lg_ncols – Number of columns in TLegend

  • massfit_use_sumw2error – Option to use RooFit::SumW2Error(true) when fitting, which is necessary when fitting a weighted dataset

  • massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization

  • massfit_use_binned_fit – Option to use a binned fit to the data

  • out – Output stream

Returns:

List of bin count, bin variable means and errors, fit parameters and errors

void applySPlot(RooWorkspace *w, string dataset_name, string weight_name, string sgYield_name, string bgYield_name, string model_name, string dataset_sg_name, string dataset_bg_name)

Apply the sPlot method from arXiv:physics/0402083.

Apply sPlot method from arXiv:physics/0402083 given a dataset, yield variables, and a PDF model and add the sWeighted datasets to the workspace.

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • weight_name – Name of existing weight variable, ignored if empty

  • sgYield_name – Signal yield variable name

  • bgYield_name – Background yield variable name

  • model_name – Full PDF name

  • dataset_sg_name – Name of dataset with signal sweights

  • dataset_bg_name – Name of dataset with background sweights

void getBinnedBGFractionsDataset(RooWorkspace *w, RooAbsData *rds, RNode frame, string rds_out_name, map<int, string> bincuts, string bgfracvar, map<int, vector<double>> bgfracs_map, int bgfrac_idx = 0, double bgfracs_default = 0.0)

Set the background fraction of a dataset bin by bin from a map of bin ids to background fraction values.

Set the background fractions of a dataset bin by bin from a map of unique integer bin identifiers to vectors of background fraction values. The background fractions are first defined in the RDataFrame, since defining a conditional variable directly in a RooDataSet is impractical. They are then merged into the existing dataset rds, and a new dataset containing the background fraction columns is created and imported into the workspace.

Parameters:
  • w – RooWorkspace in which to work

  • rds – RooDataSet to which to add the background fraction columns

  • frame – ROOT RDataFrame from which to define background fraction columns

  • rds_out_name – Name of the new RooDataSet containing the background fraction columns to import into the RooWorkspace

  • bincuts – Map of unique bin id ints to bin variable cuts for bin

  • bgfracvar – Background fraction variable name

  • bgfracs_map – Map of unique integer bin identifiers to vectors of background fraction values

  • bgfrac_idx – Index of the background fraction of interest in the vector of background fraction values

  • bgfracs_default – Default background fraction value for events outside the provided cuts
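The selection logic implied by bgfracs_map, bgfrac_idx, and bgfracs_default can be sketched in plain C++ as below. The helper lookupBGFraction is a hypothetical name, not a SAGA function, and it takes the bin id directly, whereas the real function assigns bins by applying the cuts in bincuts to the RDataFrame. Falling back to the default when the index is out of range is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper (not part of SAGA): pick one background fraction for a bin.
double lookupBGFraction(const std::map<int, std::vector<double>>& bgfracs_map,
                        int bin_id,
                        int bgfrac_idx = 0,
                        double bgfracs_default = 0.0) {
    assert(bgfrac_idx >= 0);

    // Events whose bin is not in the map get the default value.
    const auto it = bgfracs_map.find(bin_id);
    if (it == bgfracs_map.end()) {
        return bgfracs_default;
    }

    // Assumed fallback: an index beyond the stored vector also yields the default.
    const auto idx = static_cast<std::size_t>(bgfrac_idx);
    if (idx >= it->second.size()) {
        return bgfracs_default;
    }
    return it->second[idx];
}
```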

void setBinnedBGFractions(RooWorkspace *w, string dataset_name, string binid, string bincut, vector<string> binvars, vector<string> fitvars, map<string, string> yamlfile_map, string massfit_pdf_name, string massfit_formula_sg, string massfit_formula_bg, string massfit_sgYield_name, string massfit_bgYield_name, double massfit_initsgfrac, vector<double> massfit_parinits_sg, vector<string> massfit_parnames_sg, vector<string> massfit_partitles_sg, vector<string> massfit_parunits_sg, vector<vector<double>> massfit_parlims_sg, vector<double> massfit_parinits_bg, vector<string> massfit_parnames_bg, vector<string> massfit_partitles_bg, vector<string> massfit_parunits_bg, vector<vector<double>> massfit_parlims_bg, vector<vector<double>> massfit_fitwindow_lims, vector<vector<double>> massfit_sgregion_lims, RNode frame, string bgcut, vector<string> asymfitvars, map<int, string> asymfitvar_bincuts, string rds_out_name, string sb_rds_out_name, string bgfracvar, vector<double> bgfracvar_lims = {0., 1.0}, double massfit_lg_text_size = 0.04, double massfit_lg_margin = 0.1, int massfit_lg_ncols = 1, bool massfit_plot_bg_pars = false, bool massfit_use_sumw2error = false, bool massfit_use_extended_nll = true, bool massfit_use_binned_fit = false, int bgfrac_idx = 0, double bgfracs_default = 0.0, ostream &out = cout)

Apply a generic mass fit and set the background fraction for a dataset bin by bin.

Apply a generic mass fit in each asymmetry fit variable bin using fitMass() and set the background fraction column for the given dataset, which should only contain events from either the signal or sideband region. Background fractions \(\varepsilon\) will be taken from one of three choices specified by index:

  • 0: Background fraction \(\varepsilon_{1} = \frac{N_{BG}^{PDF}}{N^{DS}}\) and error

  • 1: Background fraction \(\varepsilon_{2} = 1 - \frac{N_{SG}^{PDF}}{N^{DS}}\) and error

  • 2: Background fraction \(\varepsilon_{3} = 1 - \frac{N_{SG}^{PDF}}{N^{PDF}}\) and error

The naming scheme for the bins will be <binid>__<asymfitvar_binid>.

Parameters:
  • w – RooWorkspace in which to work

  • dataset_name – Dataset name

  • binid – Bin unique id

  • bincut – Kinematic variable cut for bin

  • binvars – List of kinematic binning variables

  • fitvars – List of mass fit variables

  • yamlfile_map – Map of bin ids to paths of YAML files specifying the remaining mass fit arguments

  • massfit_pdf_name – Base name of the combined signal and background pdf. Note, the actual PDF name will be: <massfit_pdf_name>_<binid>.

  • massfit_formula_sg – The signal PDF formula in ROOT TFormula format

  • massfit_formula_bg – The background PDF formula in ROOT TFormula format

  • massfit_sgYield_name – Base name of the signal yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_bgYield_name – Base name of the background yield variable for an extended fit, the bin id will be appended for uniqueness in the workspace

  • massfit_initsgfrac – Initial value of the ratio of signal events to total events \(u\) in the dataset

  • massfit_parinits_sg – List of signal PDF parameter initial values

  • massfit_parnames_sg – List of signal PDF parameter names

  • massfit_partitles_sg – List of signal PDF parameter titles

  • massfit_parunits_sg – List of signal PDF parameter unit titles

  • massfit_parlims_sg – List of signal PDF parameter minimum and maximum bounds

  • massfit_parinits_bg – List of background PDF parameter initial values

  • massfit_parnames_bg – List of background PDF parameter names

  • massfit_partitles_bg – List of background PDF parameter titles

  • massfit_parunits_bg – List of background PDF parameter unit titles

  • massfit_parlims_bg – List of background PDF parameter minimum and maximum bounds

  • massfit_fitwindow_lims – List of fit window minimum and maximum bounds for each fit variable

  • massfit_sgregion_lims – List of signal region minimum and maximum bounds for each fit variable

  • frame – ROOT RDataFrame in which to define the background fraction variable

  • bgcut – Background invariant mass region cut

  • asymfitvars – List of asymmetry fit variable names

  • asymfitvar_bincuts – Map of unique bin id ints to bin variable cuts for bin

  • rds_out_name – Name of signal region RooDataSet under which to import it into the RooWorkspace

  • sb_rds_out_name – Name of sideband region RooDataSet under which to import it into the RooWorkspace

  • bgfracvar – Background fraction variable name

  • bgfracvar_lims – Background fraction variable limits

  • massfit_plot_bg_pars – Option to plot background pdf parameters on TLegend

  • massfit_lg_text_size – Size of TLegend text

  • massfit_lg_margin – Margin of TLegend

  • massfit_lg_ncols – Number of columns in TLegend

  • massfit_use_sumw2error – Option to use RooFit::SumW2Error(true) when fitting, which is necessary when fitting a weighted dataset

  • massfit_use_extended_nll – Option to use an extended Negative Log Likelihood function for minimization

  • massfit_use_binned_fit – Option to use a binned fit to the data

  • bgfrac_idx – Index of the background fraction of interest from the available choices

  • bgfracs_default – Default background fraction value for events outside the provided cuts

  • out – Output stream

namespace util

Functions

template<typename T>
T getYamlArg(const YAML::Node &node, string argname, T defaultval, string message_prefix, bool verbose)

Load an argument from a YAML node.

Template Parameters:

T – Type of argument to get

Parameters:
  • node – YAML node from which to load argument

  • argname – Name of argument to load

  • defaultval – Default value to use if argument is not found

  • message_prefix – Prefix to output message

  • verbose – Option to print out argument name and value

Returns:

Argument of type T

void replaceAll(string &s, string const &to_replace, string const &replace_with)

Find and replace function for string.

Find all occurrences of a substring and replace them with another. Note that this is an in-place operation. Regular expressions are NOT supported.

Parameters:
  • s – String to search

  • to_replace – Substring to find and replace

  • replace_with – Substring to insert
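A literal, in-place find-and-replace of this kind can be written with std::string::find and std::string::replace. The sketch below is one possible implementation under that assumption and is named replaceAllSketch to make clear it is not the SAGA source; the handling of an empty search string is a choice made for this example.

```cpp
#include <string>

// Sketch of an in-place, literal (non-regex) find-and-replace.
void replaceAllSketch(std::string& s,
                      const std::string& to_replace,
                      const std::string& replace_with) {
    // An empty search string would match everywhere and never terminate.
    if (to_replace.empty()) {
        return;
    }
    std::string::size_type pos = 0;
    while ((pos = s.find(to_replace, pos)) != std::string::npos) {
        s.replace(pos, to_replace.size(), replace_with);
        // Skip past the inserted text so it is never rescanned.
        pos += replace_with.size();
    }
}
```

Advancing past the inserted text keeps the loop finite even when replace_with contains to_replace.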

string addLimitCuts(string cuts, vector<string> vars, vector<vector<double>> varlims)

Add additional variable limit cuts to an existing cut string.

Parameters:
  • cuts – Base cut string

  • vars – Variables for which to add limit cuts

  • varlims – Limits of provided variables

Returns:

Updated cut string
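For orientation, the sketch below shows one way limit cuts could be appended to a cut string. The output format (terms of the form var>=min and var<=max joined by " && ") is an assumption for illustration only, since the source does not specify the exact syntax that addLimitCuts() produces. The name addLimitCutsSketch is likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: append "var>=min && var<=max" for each variable.
// The cut syntax is assumed, not taken from the SAGA implementation.
std::string addLimitCutsSketch(std::string cuts,
                               const std::vector<std::string>& vars,
                               const std::vector<std::vector<double>>& varlims) {
    assert(vars.size() == varlims.size());
    for (std::size_t i = 0; i < vars.size(); ++i) {
        assert(varlims[i].size() == 2);  // each entry is {min, max}

        std::ostringstream cut;
        cut << vars[i] << ">=" << varlims[i][0]
            << " && " << vars[i] << "<=" << varlims[i][1];

        // Only add a separator when there is already a cut to join to.
        if (!cuts.empty()) {
            cuts += " && ";
        }
        cuts += cut.str();
    }
    return cuts;
}
```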