# Jack-knifing a Multitaper SDF estimator¶

Assume there is a parameter $$\theta$$ that parameterizes a distribution, and that the set of random variables $$\lbrace Y_1, Y_2, ..., Y_n \rbrace$$ are i.i.d. according to that distribution.

The basic jackknifed estimator $$\tilde{\theta}$$ of some parameter $$\theta$$ is found through forming pseudovalues $$\hat{\theta}_i$$ based on the original set of samples. With n samples, there are n pseudovalues based on n “leave-one-out” sample sets.

## General JN definitions¶

simple sample estimate
$$\hat{\theta} = \dfrac{1}{n}\sum_i Y_i$$
leave-one-out measurement
$$\hat{\theta}_{-i} = \dfrac{1}{n-1}\sum_{k \neq i}Y_k$$
pseudovalues
$$\hat{\theta}_i = n\hat{\theta} - (n-1)\hat{\theta}_{-i}$$

Now the jackknifed esimator is computed as

$$\tilde{\theta} = \dfrac{1}{n}\sum_i \hat{\theta}_i = n\hat{\theta} - \dfrac{n-1}{n}\sum_i \hat{\theta}_{-i}$$

This estimator is known (?) to be distributed about the true parameter theta approximately as a Student’s t distribution, with standard error defined as

$$s^{2} = \dfrac{n-1}{n}\sum_i \left(\hat{\theta}_i - \tilde{\theta}\right)^{2}$$

## General Multitaper definition¶

The general multitaper spectral density function (sdf) estimator, using n orthonormal tapers, combines the n $$\lbrace \hat{S}_i^{mt} \rbrace$$ sdf estimators, and takes the form

$$\hat{S}^{mt}(f) = \dfrac{\sum_{k} w_k(f)^2S^{mt}_k(f)}{\sum_{k} |w_k(f)|^2} = \dfrac{\sum_{k} w_k(f)^2S^{mt}_k(f)}{\lVert \vec{w}(f) \rVert^2}$$

For instance, using discrete prolate spheroidal sequences (DPSS) windows, the $$\rbrace w_i \lbrace$$ set, in their simplest form, are the eigenvalues of the spectral concentration system.

A natural choice for a leave-one-out measurement is (leaving out the dependence on argument f)

$$\ln\hat{S}_{-i}^{mt} = \ln\dfrac{\sum_{k \neq i} w_k^2S^{mt}_k}{\lVert \vec{w}_{-i} \rVert^2} = \ln\sum_{k \neq i} w_k^2S^{mt}_k - \ln\lVert \vec{w}_{-i} \rVert^2$$

where $$\vec{w}_{-i}$$ is the vector of weights with the ith element set to zero. The natural log has been taken so that the estimate is distributed below and above $$S(f)$$ more evenly.

## Multitaper Pseudovalues¶

I’m not quite clear on the form of the pseudovalues for multitaper combinations.

### One Option¶

The simple option is to weight the different leave-one-out measurements equally, which leads to

$$\ln\hat{S}_{i}^{mt} = n\ln\hat{S}^{mt} - (n-1)\ln\hat{S}_{-i}^{mt}$$

And of course the estimate of $$S(f)$$ is given by

$$\ln\tilde{S}^{mt} (f) = \dfrac{1}{n}\sum_i \ln\hat{S}_i^{mt}(f)$$

### Another Option¶

Another approach seems obvious which weights the leave-one-out measurements according to the length of $$\vec{w}_{-i}$$. It would look something like this

let
$$g = {\lVert \vec{w} \rVert^2}$$
$$g_i = {\lVert \vec{w}_{-i} \rVert^2}$$

Then the pseudovalues are

$$\ln\hat{S}_i^{mt} = \left(\ln\hat{S}^{mt} + \ln g\right) - \left(\ln\hat{S}_{-i}^{mt} + \ln g_i\right)$$

and the jackknifed estimator is

$$\ln\tilde{S}^{mt} = \sum_i \ln\hat{S}_i^{mt} - \ln g$$

and I would wager, the standard error is estimated as

$$s^2 = \dfrac{1}{n}\sum_i \left(\ln\hat{S}_i^{mt} - \ln\tilde{S}^{mt}\right)^2$$

### Consensus in Literature (??)¶

From what I can tell from a couple of sources from Thompson [#f1]_, [#f2]_, this is the approach for estimating the variance.