Some essential notes on C-statistics

Basic concepts

C-statistics is based on the Poisson likelihood:

L (M) = \prod_{i} \frac{M_{i}^{D_{i}}}{D_{i}!} \exp (- M_{i})

where $L (M)$ is the likelihood of the model $M$ , $D_{i}$ is the observed count in the $i$-th bin, and $M_{i}$ is the model prediction for the $i$-th bin. Taking the logarithm and multiplying by $- 2$ gives

- 2 \ln L (M) = 2 \sum_{i} (M_{i} - D_{i} \ln M_{i} + \ln (D_{i}!)) .

Omitting the factorial term, which does not depend on the model, gives the Cash statistic (Cash 1979):

\tilde{C} = 2 \sum_{i} (M_{i} - D_{i} \ln M_{i}) .

If the factorial term is instead approximated by Stirling's formula,

\ln (D_{i}!) \approx D_{i} \ln D_{i} - D_{i},

we obtain a modified version of the Cash statistic, the C-statistic:

C = 2 \sum_{i} (M_{i} - D_{i} \ln M_{i} + D_{i} \ln D_{i} - D_{i}),

which is implemented in popular fitting packages such as XSPEC (Arnaud 1996), SHERPA (Freeman et al. 2001), and SPEX (Kaastra et al. 1996). $C$ and $\tilde{C}$ differ only by the model-independent constant $2 \sum_{i} (D_{i} \ln D_{i} - D_{i})$ , so both are minimized by the same parameters. $C$ is non-negative, and $C = 0$ if and only if $M_{i} = D_{i}$ in every bin. X-ray spectra often contain only a few counts per bin, where the Gaussian approximation behind $χ^{2}$ fitting is poor, so it is better to obtain the best-fit parameters by minimizing $C$ instead of $χ^{2}$ .
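
As a small illustration of the two statistics defined above, the sketch below evaluates $\tilde{C}$ and $C$ for a handful of bins. The function names `cash_stat` and `c_stat` and all the numbers are hypothetical and chosen only for this example; bins with $D_{i} = 0$ are assumed to contribute $M_{i}$ to $C$ (the limit $D \ln D \to 0$).

```python
import math

def cash_stat(data, model):
    """Original Cash statistic: 2 * sum(M_i - D_i * ln M_i)."""
    return 2.0 * sum(m - d * math.log(m) for d, m in zip(data, model))

def c_stat(data, model):
    """Modified statistic C: 2 * sum(M_i - D_i + D_i * ln(D_i / M_i)).

    Bins with D_i = 0 contribute M_i (assumed limit D ln D -> 0).
    """
    total = 0.0
    for d, m in zip(data, model):
        total += m - d
        if d > 0:
            total += d * math.log(d / m)
    return 2.0 * total

# Hypothetical counts and two hypothetical model predictions.
data = [3, 0, 5, 1, 2]
model_a = [2.5, 0.4, 4.0, 1.5, 2.2]
model_b = [3.5, 0.2, 5.5, 0.8, 1.9]

# C - C~ should be the same number for any model: 2 * sum(D ln D - D).
offset_a = c_stat(data, model_a) - cash_stat(data, model_a)
offset_b = c_stat(data, model_b) - cash_stat(data, model_b)
print(offset_a, offset_b)

# A model that reproduces the data exactly gives C = 0.
print(c_stat(data, data))
```

Because the offset does not depend on the model, minimizing either statistic selects the same parameters; only $C$ has the convenient property of vanishing for a perfect match.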

$1 σ$ confidence intervals for parameters

Suppose the model has $k$ parameters, $p_{1}$ , $p_{2}$ , ..., $p_{k}$ . The set $\{p_{i}\}$ ( $i$ ranges from $1$ to $k$ ) that minimizes $C$ is the best-fit parameter set, i.e., the best-fit model. However, the best-fit values alone are not enough; we also need to evaluate the $1 σ$ confidence intervals for the parameters.

A simple schematic

For simplicity, take a model with just two parameters, $p_{1}$ and $p_{2}$ . The best-fit parameter set $\{p_{1, \mathrm{best}}, p_{2, \mathrm{best}}\}$ gives the minimum value of $C$ , denoted $C_{\mathrm{min}}$ . In the 2D parameter space $(p_{1}, p_{2})$ , $C$ is larger than $C_{\mathrm{min}}$ in the neighborhood of $(p_{1, \mathrm{best}}, p_{2, \mathrm{best}})$ . The condition $C \leq C_{\mathrm{min}} + 1$ defines a closed region in this space, which is called the $1 σ$ confidence region; its boundary is the contour $C = C_{\mathrm{min}} + 1$ . The figure below is a schematic of the $1 σ$ confidence region in the 2D parameter space $(p_{1}, p_{2})$ . It assumes that $p_{1, \mathrm{best}} = 0$ and $p_{2, \mathrm{best}} = 0$ , or equivalently, that the origin of the parameter space is placed at the best-fit parameter set.

The line segment AD, projected onto the $p_{1}$ axis, gives the $1 σ$ confidence interval for $p_{1}$ ; the line segment BC does not. If we freeze $p_{2}$ at its best-fit value ( $p_{2} = p_{2, \mathrm{best}} = 0$ ) and calculate the error of $p_{1}$ in SPEX, the interval we get is the line segment BC. This is not the correct interval, because the parameters are generally correlated and a fit with $p_{2}$ frozen ignores that correlation. The interval for $p_{1}$ obtained with the remaining parameters frozen at their best-fit values is never wider than the interval obtained with all the parameters free (BC ≤ AD), and it is narrower whenever the parameters are correlated.

Fig. 1: The $1 σ$ confidence region in the 2D parameter space $(p_{1}, p_{2})$ is represented by the oblique ellipse. Points A and D mark the minimum and maximum values of $p_{1}$ on the ellipse, respectively. Points B and C are the two intersections of the ellipse with the horizontal line $p_{2} = p_{2, \mathrm{best}} = 0$ .
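
The difference between BC and AD can be made concrete with a worked example. The sketch below assumes that $Δ C = C - C_{\mathrm{min}}$ is exactly quadratic around the best fit (a bivariate Gaussian approximation, which is an assumption of this example and not something guaranteed for real data). The widths `SIGMA1`, `SIGMA2` and the correlation `RHO` are hypothetical values.

```python
import math

# Hypothetical 1-sigma widths and correlation coefficient of p1 and p2.
SIGMA1, SIGMA2, RHO = 1.0, 2.0, 0.8

def delta_c(p1, p2):
    """Assumed quadratic Delta C surface with the best fit at the origin."""
    x, y = p1 / SIGMA1, p2 / SIGMA2
    return (x * x - 2.0 * RHO * x * y + y * y) / (1.0 - RHO * RHO)

# Half-length of BC: p2 frozen at 0, solve delta_c(p1, 0) = 1.
half_bc = SIGMA1 * math.sqrt(1.0 - RHO * RHO)

# Half-length of AD along p1: p2 re-optimised for every p1.
half_ad = SIGMA1

point_c = (half_bc, 0.0)           # intersection with the line p2 = 0
point_d = (half_ad, RHO * SIGMA2)  # point of the ellipse with maximum p1

print(half_bc, half_ad)            # about 0.6 versus 1.0
print(delta_c(*point_c), delta_c(*point_d))  # both lie on the Delta C = 1 contour
```

Under this assumption the frozen interval is smaller than the full one by the factor $\sqrt{1 - ρ^{2}}$ , where $ρ$ is the correlation coefficient, so the two agree only when the parameters are uncorrelated.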

The algorithm for finding the $1 σ$ confidence interval

Looking at the figure, you may think that we first need to compute the boundary, namely the oblique ellipse, and then get the $1 σ$ confidence interval for each parameter by projecting the ellipse onto each parameter axis (i.e., by taking the minimum and maximum value of each parameter on the boundary). However, computing the whole boundary is not efficient. To calculate the $1 σ$ confidence interval for the parameter $p_{1}$ , the algorithm implemented in SPEX goes as follows:

1. Get the best-fit parameter set and the corresponding $C_{\mathrm{min}}$ .
2. Fix the parameter $p_{1}$ at a value close to $p_{1, \mathrm{best}}$ and leave all the remaining parameters free.
3. Refit to obtain the minimum $C$ for this fixed $p_{1}$ .
4. If this minimum $C$ is larger than $C_{\mathrm{min}} + 1$ , choose a new $p_{1}$ that is closer to $p_{1, \mathrm{best}}$ ; if it is smaller than $C_{\mathrm{min}} + 1$ , choose a new $p_{1}$ that is farther from $p_{1, \mathrm{best}}$ . Go back to step 2.
5. Return the two values of $p_{1}$ , one on each side of $p_{1, \mathrm{best}}$ , for which the minimum $C$ with the remaining parameters free equals $C_{\mathrm{min}} + 1$ . In practice the match is not exact: there is some tolerance, and the algorithm in SPEX stops at a value close to $C_{\mathrm{min}} + 1$ , like $C_{\mathrm{min}} + 1.01$ .
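
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SPEX implementation: it assumes a toy model $M_{i} = p_{1} s_{i} + p_{2}$ (a fixed profile `SHAPE` scaled by $p_{1}$ plus a flat background $p_{2}$ ), uses a golden-section search for the refits, and replaces whatever stepping rule SPEX uses in step 4 with plain bisection. All names and numbers are hypothetical.

```python
import math

SHAPE = [0.1, 0.3, 0.8, 1.0, 0.8, 0.3, 0.1]  # hypothetical line profile s_i
DATA = [2, 4, 9, 12, 8, 5, 1]                # hypothetical counts D_i

def c_stat(data, model):
    """C = 2 * sum(M_i - D_i + D_i * ln(D_i / M_i)); D_i = 0 bins give M_i."""
    total = 0.0
    for d, m in zip(data, model):
        total += m - d
        if d > 0:
            total += d * math.log(d / m)
    return 2.0 * total

def golden_min(f, lo, hi, tol=1e-8):
    """Minimise a unimodal 1D function on [lo, hi]; return (x, f(x))."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                      # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
        else:                            # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
    x = 0.5 * (a + b)
    return x, f(x)

def profile_c(p1):
    """Steps 2-3: freeze p1 and refit the remaining free parameter p2."""
    def c_of_p2(p2):
        return c_stat(DATA, [p1 * s + p2 for s in SHAPE])
    return golden_min(c_of_p2, 1e-3, 20.0)[1]

def find_bound(p1_best, c_min, p1_far, delta=1.0, tol=1e-4):
    """Steps 4-5 by bisection; requires profile_c(p1_far) - c_min > delta."""
    inside, outside = p1_best, p1_far
    while abs(outside - inside) > tol:
        mid = 0.5 * (inside + outside)
        if profile_c(mid) - c_min > delta:
            outside = mid                # too far: move closer to the best fit
        else:
            inside = mid                 # still inside: move farther away
    return 0.5 * (inside + outside)

# Step 1: best fit (the minimum of the profile is the global minimum).
p1_best, c_min = golden_min(profile_c, 0.0, 50.0, tol=1e-6)
lower = find_bound(p1_best, c_min, 0.0)
upper = find_bound(p1_best, c_min, 50.0)
print(lower, p1_best, upper)
```

The bisection tolerance `tol` plays the role of the stopping tolerance mentioned in the last step: the returned bounds give a $Δ C$ close to, but not exactly, $1$ .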

Why is the $1 σ$ region defined by $C_{\mathrm{min}} + 1$ ?

Why is the threshold $C_{\mathrm{min}} + 1$ and not some other value like $C_{\mathrm{min}} + 1.2$ or $C_{\mathrm{min}} + 1.3$ ? The answer goes back to the $χ^{2}$ test, where the quantity

Δ χ^{2} = χ^{2} (N) - χ_{\mathrm{min}}^{2} (N - 1)

is distributed like a $χ^{2}$ variable with one degree of freedom. Here $N$ is the number of data points, $χ^{2} (N)$ is the statistic evaluated at the true parameter value, and $χ_{\mathrm{min}}^{2} (N - 1)$ is the usual minimum fit statistic (Bonamente 2019). Namely, $Δ χ^{2}$ is distributed as $χ^{2} (1)$ . For a $χ^{2} (1)$ distribution, $Δ χ^{2} \leq 1$ holds with a probability of about 68%, which is the so-called $1 σ$ confidence level. For C-statistics,

Δ C = C - C_{\mathrm{min}}

is approximately distributed as $χ^{2} (1)$ as well. Therefore, we also use $C_{\mathrm{min}} + 1$ to define the $1 σ$ confidence region.
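
The 68% figure can be checked directly. For one degree of freedom the cumulative $χ^{2}$ distribution reduces to an error function, $P (Δ χ^{2} \leq x) = \mathrm{erf} (\sqrt{x / 2})$ , so the short sketch below needs only the standard library. The helper name `chi2_1_cdf` is hypothetical.

```python
import math

def chi2_1_cdf(x):
    """CDF of a chi-square variable with one degree of freedom."""
    return math.erf(math.sqrt(x / 2.0))

p_delta_1 = chi2_1_cdf(1.0)       # about 0.6827, the 1-sigma level
p_delta_2706 = chi2_1_cdf(2.706)  # about 0.90

print(round(p_delta_1, 4), round(p_delta_2706, 4))
```

A threshold such as $C_{\mathrm{min}} + 1.2$ would therefore correspond to a confidence level somewhat above 68%, and $C_{\mathrm{min}} + 2.706$ to about 90%, for a single parameter of interest.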
