.. _background_unbinned: .. |BackgroundFitter| replace:: :class:`~gdt.core.background.fitter.BackgroundFitter` .. |NaivePoisson| replace:: :class:`~gdt.core.background.unbinned.NaivePoisson` ****************************************************************************** Background Algorithms for Unbinned Data (:mod:`~gdt.core.background.unbinned`) ****************************************************************************** Introduction ============ The :mod:`gdt.core.background.unbinned` module provides algorithms for background fitting of unbinned (event) data. Typically the background is fit in the time domain for each channel of multi-channel data. Currently an algorithm for estimating the background via a naive Poisson Maximum Likelihood method is provided. The Naive Poisson Maximum Likelihood Algorithm ============================================== This algorithm assumes the presence of time-varying Poisson process, and aims to provide a MLE of the Poisson rate. Since the data could contain bright sources, this is a naive approach to estimating the background and will fail if bright sources are present. The premise of the algorithm lies on the fact that the data can be binned at some minimum resolution such that each "bin" contains either zero or one count. The MLE Poisson solution can then be easily formulated by considering the product of zero-count and one-count Poisson likelihoods. This formulation works for a constant Poisson process. For a time-varying process, a sliding time-window concept is introduced such that within the time-window, the Poisson rate is assumed constant, and the Poisson MLE is made at the center of the window. In a sense, this algorithm is similar to "smoothing" the data as the window is moved through the data. Because the rate is estimated at the center of the window, when the window is at either edges of the considered data, the window width is truncated, which effectively results in an increased uncertainty in the Poisson rate. The |NaivePoisson| class contains an option between two versions of the algorithm: a fast approximation, and a slower exact implementation. The exact version uses precisely a fixed window width as specified by the user (except when truncated at the edges of the data), and counts the number of zero- and one-count "bins". The fast approximation allows the user-specified window width to change by calculating the number of events contained within the first window and keeping the number of events in the window constant instead of the window width. For slowly varying data, this is a good approximation and runs much faster than the exact implementation; however if the data rate changes considerably, the window width in this approximation will also change considerably. Examples -------- Although the algorithm is designed to fit multiple energy channels simultaneously, for the sake of simplicity, we will look at an example using a single channel of data. >>> import numpy as np >>> # generate events at an average rate of 1 per second for 1000 s. >>> times = np.random.exponential(1.0, size=1000).cumsum() we have generated some random event data consistent with a constant rate of 1 count/second. Now we can initialize the algorithm: >>> from gdt.core.background.unbinned import NaivePoisson >>> naivep = NaivePoisson([times]) Because we have a single channel of data, our array of times must be contained in a single-element list. Once initialized, we can do the fit. There are two algorithm-specific parameters we can set: the window width, and if we're using the fast approximation or exact implementation: >>> naivep.fit(window_width=10.0, fast=False) You can then interpolate the fit at any arbitrary point (within the data that was fit): >>> interp_edges = np.linspace(10.0, 20.0, 6) >>> rate_interp, uncert_interp = naivep.interpolate(interp_edges[:-1], interp_edges[1:]) >>> rate_interp array([[1.02945792], [0.82706984], [0.94514904], [1.04939443], [1.08009983]]) >>> uncert_interp array([[0.32085167], [0.28758822], [0.30743276], [0.32394358], [0.32864872]]) You can refit the data with different parameters. For example, here we use the fast approximation: >>> naivep.fit(window_width=10.0, fast=True) >>> rate_interp, uncert_interp = naivep.interpolate(interp_edges[:-1], interp_edges[1:]) >>> rate_interp array([[0.98921635], [0.97645178], [0.96820927], [0.99056896], [1.20206283]]) >>> uncert_interp array([[0.2743579 ], [0.27067192], [0.26852942], [0.27461663], [0.33336566]]) .. _background_unbinned_plugins: For Developers: =============== Designing Plug-ins ------------------ You can design your own custom algorithm to be used to fit unbinned data. In order for the algorithm to work with the fitter interface, the custom algorithm must be a class and satisfy the following: * ``__init__()`` must take the following as an argument: #. A list, of length ``num_chans``, where each item is a numpy.ndarray of event times. * A ``fit()`` method that takes no arguments but may have keywords to specify algorithm parameters. * An ``interpolate()`` method that must take the following arguments: #. A 1D array of time bin start times, shape (``num_times``,); #. A 1D array of time bin end times, shape (``num_times``,). * The ``interpolate()`` method must return the following: #. A 2D rates array, shape (``num_times``, ``num_chans``); #. A 2D rate uncertainty array, shape (``num_times``, ``num_chans``). Additionally, the class can provide the following public attributes to be exposed by the |BackgroundFitter| interface: * ``dof``: The degrees of freedom of the fits, array shape (``num_chans``,) * ``statistic``: The fit statistic for each fit, array shape (``num_chans``,) * ``statistic_name``: A string of the fit statistic used To illustrate the above requirements, below is an example template: >>> class MyUnbinnedBackgroundAlgorithm(): >>> >>> def __init__(self, times): >>> # Do some stuff here, like store the inputs for use by the algorithm >>> >>> @property >>> def dof(self): >>> # Return the degrees of freedom here. This is optional. >>> >>> @property >>> def statistic(self): >>> # Return the fit statistic for each channel. This is optional. >>> >>> @property >>> def statistic_name(self): >>> # Return the name of the fit statistic. This is optional. >>> >>> def fit(self, fit_param1=0, fit_param2=42.0): >>> # Parameters needed for the algorithm are passed by keywords. >>> # The algorithm for doing the fit goes here. >>> >>> def interpolate(self, tstart, tstop, interp_param=10.0): >>> # tstart, tstop required and extra parameters for interpolation >>> # can be passed as keywords. >>> >>> # The interpolation algorithm goes here. >>> return (rates, rate_uncertainty) Reference/API ============= .. automodapi:: gdt.core.background.unbinned :inherited-members: :no-inheritance-diagram: