Background Algorithms for Binned Data (`binned`)¶

Introduction¶

The gdt.core.background.binned module provides algorithms for background fitting of binned data. Typically the background is fit in the time domain for each channel of multi-channel data, however, nothing precludes the design of an algorithm that fits in the energy domain. Currently an algorithm for fitting a polynomial to binned data is provided.

The Polynomial Fitting Algorithm¶

The algorithm for fitting a polynomial to binned data is based on least-squares minimization using chi-squared as a statistic. The fit is performed in two passes to mitigate issues with small count statistics. The first pass uses variances calculated from the observed data rates in the minimization. The second pass uses variances calculated from the first-pass model rates. The fit is performed on multi-channel data, where each channel is fit independently with the same order polynomial.

Examples¶

Although the algorithm is designed to fit multiple energy channels simultaneously, for the sake of simplicity, we will look at an example using a single channel of data.

>>> import numpy as np
>>> # generate events at an average rate of 1 per second for 1000 s.
>>> times = np.random.exponential(1.0, size=1000).cumsum()

>>> from gdt.core.binning.unbinned import bin_by_time
>>> # get the bin edges such that we have equally spaced 10-sec wide bins
>>> edges = bin_by_time(times, 10.0)

>>> counts, _ = np.histogram(times, bins=edges)

So far we have generated some random event data consistent with a constant rate of 1 count/second and binned it into 10 s bins. Because this is a single channel of data, and Polynomial expects multiple channels, we just need to reshape the counts array. Furthermore, let’s assume our exposure is the same as the bin width.

>>> counts = counts.reshape(-1, 1)
>>> exposure = edges[1:] - edges[:-1]

Now we can initialize the Polynomial object:

>>> from gdt.core.background.binned import Polynomial
>>> poly = Polynomial(counts, edges[:-1], edges[1:], exposure)

Once the object is initialized, we can do the fit and pass any algorithms parameters needed to fully specify the fit. In this example, we will pass the polynomial order.

>>> model, model_uncert = poly.fit(order=0)
>>> print(model)
[[1.02040816]
 [1.02040816]
 [1.02040816]
...
>>> print(model_uncert)
[[0.03019387]
 [0.03019387]
 [0.03019387]
...

The fit returns the model fit at each bin as well as the model uncertainty. Since this is based off of randomly generated data, your results might be slightly different. You can also retrieve the fit statistic and degrees-of-freedom for each channel of the fit:

>>> poly.statistic
array([97.992])
>>> poly.dof
array([97.])

Of course, this is a trivial case, so maybe we want to fit a higher-order polynomial and see if it provides a good fit. You can simply redo the fit:

>>> model, model_uncert = poly.fit(order=1)
>>> poly.statistic
array([97.86047063])
>>> poly.dof
array([96.])

Finally, we can interpolate the rate and rate uncertainty at any point:

>>> interp_edges = np.linspace(10.0, 20.0, 6)
>>> model_interp, uncert_interp = poly.interpolate(interp_edges[:-1], interp_edges[1:])
>>> model_interp
array([[1.00572787],
       [1.00578859],
       [1.00584931],
       [1.00591003],
       [1.00597075]])
>>> uncert_interp
array([[0.05940458],
       [0.05922084],
       [0.05903729],
       [0.05885395],
       [0.05867081]])

The RoboLowess Algorithm¶

The RoboLowess algorithm provides a non-parametric background fit using LOWESS smoothing with iterative sigma-clipping. It operates in two passes: first, a summed lightcurve across channels is used to identify background-dominated bins; second, each channel is refit only on those background bins. The resulting background is then interpolated to the full time grid using a cubic spline. An optional refinement step models the residual distribution with a Gaussian+Exponential mixture to further exclude signal-like bins.

Parameters include:

win_size: Absolute smoothing window in seconds. Internally converted to
a LOWESS fraction via frac = win_size / data_range. If None (default), the fraction is auto-computed from the median bin width and the data range. A warning is issued if the resulting fraction falls outside [min_frac, max_frac], but the supplied value is still used.
min_frac: Minimum allowed LOWESS fraction (default 0.4). When
win_size is None the auto-computed fraction is clamped to this floor. When win_size is provided, this value is only used as the lower bound of the warning check.
max_frac: Maximum allowed LOWESS fraction (default 0.95). Same role as
min_frac but as a ceiling.
lowess_iter: Robustness iterations for LOWESS.
first_pass_chan_range: channel range used in Pass 1 to
build the background mask.
refit_after_clipping: If True (default), the algorithm performs a
final LOWESS refit with residual clipping. For high-time-resolution TTE data, such as 64/256 ms binning, set refit_after_clipping=False to skip this final refit.

Note¶

The closed-form expression used when win_size is not provided was derived empirically. First, a numerical optimization was run over a grid of data ranges and temporal resolutions by minimizing a background‑fit error metric, and the resulting optimal values were then fit with a simple power‑law using scipy.optimize.curve_fit, yielding the closed‑form expression used when win_size is not specified.

Examples¶

Using the same binned data from the polynomial example above:

>>> from gdt.core.background.binned import RoboLowess
>>> lowess_bg = RoboLowess(counts, edges[:-1], edges[1:], exposure)
>>> removed_times, removed_bkg, diagnostics = lowess_bg.fit(lowess_iter=5)

The full background model for all bins and channels is stored on the _backgrounds attribute (shape (num_times, num_chans)):

>>> backgrounds = lowess_bg._backgrounds
>>> print("backgrounds (first 5 rows):\n", backgrounds[:5])
backgrounds (first 5 rows):
[[1.02127623]
[1.02033087]
[1.01935709]
[1.01836799]
[1.0173858 ]]

We can retrieve the fit statistic and degrees-of-freedom:

>>> lowess_bg.statistic
[118.28368989]
>>> lowess_bg.dof
[100.]

Finally, we can interpolate the background rate and uncertainty at any binning:

>>> interp_edges = np.linspace(2.0, 10.0, 6)
>>> model_interp, uncert_interp = lowess_bg.interpolate(
...     interp_edges[:-1], interp_edges[1:])
>>> model_interp
array([[1.02111197],
    [1.02124269],
    [1.02127556],
    [1.0212273],
    [1.02111464]])
>>> uncert_interp
array([[0.],
    [0.],
    [0.],
    [0.],
    [0.]])

Note

The RoboLowess fit and interpolation do not provide model uncertainty. The uncertainty output is always zero.

You can also restrict Pass 1 to a subset of channels if we want to avoid some channels. For a single energy channel example, the default is sufficient and no first_pass_chan_range is needed. In multi-channel data, we can specify a range such as (0, 3) to limit the summed lightcurve.

Recommended Settings for 64 ms TTE Data

For high time binning of 64 ms TTE data, a recommended starting configuration is lowess_iter=0 and refit_after_clipping=False. Choose win_size to be roughly 0.25–0.5 of the fitted time range (e.g., 30–60 s for a 120 s window), then inspect the fit.

For Developers:¶

Designing Plug-ins¶

You can design your own custom algorithm to be used to fit binned data. In order for the algorithm to work with the fitter interface, the custom algorithm must be a class and satisfy the following:

__init__() must take the following as arguments:

A 2D array of counts with shape (num_times, num_chans);

A 1D array of time bin start times, shape (num_times,);

A 1D array of time bin end times, shape (num_times,);

A 1D array of exposures for each time bin, shape (num_times,).

A fit() method that takes no arguments but may have keywords to specify algorithm parameters.
An interpolate() method that must take the following arguments:
1. A 1D array of time bin start times, shape (num_times,);
2. A 1D array of time bin end times, shape (num_times,).
The interpolate() method must return the following:
1. A 2D rates array, shape (num_times, num_chans);
2. A 2D rate uncertainty array, shape (num_times, num_chans).

Additionally, the class can provide the following public attributes to be exposed by the BackgroundFitter interface:

dof: The degrees of freedom of the fits, array shape (num_chans,)
statistic: The fit statistic for each fit, array shape (num_chans,)
statistic_name: A string of the fit statistic used

To illustrate the above requirements, below is an example template:

>>> class MyBinnedBackgroundAlgorithm():
>>>
>>>     def __init__(self, counts, tstart, tstop, exposure):
>>>         # Do some stuff here, like store the inputs for use by the algorithm
>>>
>>>     @property
>>>     def dof(self):
>>>         # Return the degrees of freedom here. This is optional.
>>>
>>>     @property
>>>     def statistic(self):
>>>         # Return the fit statistic for each channel.  This is optional.
>>>
>>>     @property
>>>     def statistic_name(self):
>>>         # Return the name of the fit statistic.  This is optional.
>>>
>>>     def fit(self, fit_param1=0, fit_param2=42.0):
>>>         # Parameters needed for the algorithm are passed by keywords.
>>>         # The algorithm for doing the fit goes here.
>>>
>>>     def interpolate(self, tstart, tstop, interp_param=10.0):
>>>         # tstart, tstop required and extra parameters for interpolation
>>>         # can be passed as keywords.
>>>
>>>         # The interpolation algorithm goes here.
>>>         return (rates, rate_uncertainty)

Reference/API¶

gdt.core.background.binned Module¶

Classes¶

`Polynomial`(counts, tstart, tstop, exposure)	Class for performing a polynomial fit on Time-Energy data.
`RoboLowess`(counts, tstart, tstop, exposure, *)	Background fitting using LOWESS with iterative sigma-clipping.

Background Algorithms for Binned Data (`binned`)¶

Introduction¶

The Polynomial Fitting Algorithm¶

Examples¶

The RoboLowess Algorithm¶

Note¶

Examples¶

For Developers:¶

Designing Plug-ins¶

Reference/API¶

gdt.core.background.binned Module¶

Classes¶

Page Contents

Previous topic

Next topic

This Page

Navigation

Background Algorithms for Binned Data (binned)¶

Introduction¶

The Polynomial Fitting Algorithm¶

Examples¶

The RoboLowess Algorithm¶

Note¶

Examples¶

For Developers:¶

Designing Plug-ins¶

Reference/API¶

gdt.core.background.binned Module¶

Classes¶

Background Algorithms for Binned Data (`binned`)¶