
Consider the following model:

$\hat y = f(x_1, ..., x_n) + \delta$

$\hat x_i = x_i + \varepsilon _i, \quad i = 1, \ldots, n$

where $\delta, \varepsilon _1, \ldots, \varepsilon _n$ are independent random variables with known means and variances: $\mathbb E \delta = \mathbb E \varepsilon _i = 0$, $\mathbb V \delta = \sigma ^2 _\delta$, and $\mathbb V \varepsilon _i = \sigma ^2$. The goal is to estimate $(x_1, \ldots, x_n)$ from the observations $(\hat x_1, \ldots, \hat x_n, \hat y)$.

In the simple case $f(x_1, \ldots, x_n) = \sum _i x_i$ with $\sigma ^2 _\delta < n \sigma ^2$, it seems that one can use $\hat y$ to improve the estimation of $(x_1, \ldots, x_n)$. How can I construct an estimator in this case? And for a general $f$?
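For concreteness, here is a minimal simulation of this setup (Gaussian noise assumed; all numbers are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
n, sigma, sigma_delta = 4, 1.0, 0.5    # arbitrary; note sigma_delta**2 < n * sigma**2
x = rng.normal(size=n)                 # the true values to be estimated
f = np.sum                             # the simple additive case

x_hat = x + rng.normal(scale=sigma, size=n)     # noisy observations of each x_i
y_hat = f(x) + rng.normal(scale=sigma_delta)    # noisy observation of f(x_1, ..., x_n)
```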


1 Answer


Here's one possible way I can think of to leverage this extra information.

This answer implicitly assumes that the distributions of $\varepsilon_i$ and $\delta$ are unimodal with a maximum at zero (such as a mean-zero Gaussian), so that small values of $|\varepsilon_i|$ and $|\delta|$ are more likely.

We have $\hat{x}_i - x_i = \varepsilon_i$ and $\hat{y} - f(x_1,...,x_n) = \delta$. Define the objective \begin{equation} \mathcal{L} \;=\; \frac{1}{2\sigma^2}\sum_{i=1}^n{\left(x_i - \hat{x}_i\right)}^2 \;+\;\frac{1}{2\sigma_{\delta}^2}{\bigl[\hat{y} - f(x_1,...,x_n)\bigr]}^2 \end{equation} which, for Gaussian noise, is the negative log-likelihood up to an additive constant. We want to minimize this function with respect to the estimates $(x_1,...,x_n)$. By doing so, we're essentially saying we want each $x_i$ to be close to $\hat{x}_i$, while approximately enforcing the constraint that $\hat{y}$ should be close to $f(x_1,...,x_n)$.

Taking the derivative of $\mathcal{L}$ with respect to the $j^\text{th}$ component of the estimate and setting it equal to zero gives \begin{equation} \frac{\partial\mathcal{L}}{\partial x_j} \;=\; \frac{1}{\sigma^2}(x_j - \hat{x}_j) \;+\; \frac{1}{\sigma_{\delta}^2}\left(f - \hat{y}\right)\frac{\partial f}{\partial x_j} \;=\;0\qquad\qquad (1) \end{equation} With $1\leq j\leq n$, Eq. (1) constitutes $n$ equations in $n$ unknowns. It can be solved analytically or numerically (depending on the form of $f$) to obtain an estimate of $(x_1,...,x_n)$.
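For a general $f$, rather than solving Eq. (1) directly, one can simply minimize $\mathcal{L}$ numerically. A minimal sketch in Python, assuming Gaussian noise and using `scipy.optimize.minimize`; the quadratic `f` at the end is only a placeholder for illustration, not something from the question:

```python
import numpy as np
from scipy.optimize import minimize

def estimate(x_hat, y_hat, f, sigma2, sigma2_delta):
    """Minimize L(x) = ||x - x_hat||^2 / (2 sigma^2)
                     + (y_hat - f(x))^2 / (2 sigma_delta^2)."""
    def loss(x):
        return (np.sum((x - x_hat) ** 2) / (2.0 * sigma2)
                + (y_hat - f(x)) ** 2 / (2.0 * sigma2_delta))
    # The noisy observations are a natural starting point for the search.
    result = minimize(loss, x0=np.asarray(x_hat, dtype=float))
    return result.x

# Placeholder nonlinear f (an assumption, not from the question):
f = lambda x: float(np.sum(x ** 2))
x_hat = np.array([1.1, 1.9, 3.2])
y_hat = 14.5                       # noisy observation of f(true x)
x_est = estimate(x_hat, y_hat, f, sigma2=0.04, sigma2_delta=0.01)
```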

For the special case where $f(x_1,...,x_n) = \sum_i x_i$, Eq. (1) becomes: \begin{equation} \frac{\partial\mathcal{L}}{\partial x_j} \;=\; \frac{1}{\sigma^2}(x_j - \hat{x}_j) \;+\; \frac{1}{\sigma_{\delta}^2}\left(\sum_{i=1}^n x_i - \hat{y}\right) \;=\;0\qquad\qquad (2) \end{equation} This is a linear system in the unknowns and the observations, which (with a little help from Mathematica) can be solved analytically to yield the estimate: \begin{equation} x_j \;=\; \hat{x}_j \;+\; \frac{\sigma^2}{n\sigma^2 + \sigma_{\delta}^2} \left( \hat{y} - \sum_{i = 1}^n \hat{x}_i \right) \end{equation} The correction term shifts each estimate up or down according to whether $\hat{y}$ exceeds or falls short of $\sum_i \hat{x}_i$. Note that the correction vanishes as $\sigma_{\delta}^2 \to \infty$: when the observation of $f$ is very noisy, we fall back on the raw observations $\hat{x}_j$, as expected.
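As a quick sanity check on the closed-form estimate, a small Monte Carlo sketch (Gaussian noise assumed; the variance values are arbitrary, chosen so that $\sigma_{\delta}^2 < n\sigma^2$):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, sigma2_delta = 5, 1.0, 0.5    # arbitrary, with sigma2_delta < n * sigma2
x_true = rng.normal(size=n)
trials = 10_000

mse_raw = mse_corrected = 0.0
for _ in range(trials):
    x_hat = x_true + rng.normal(scale=np.sqrt(sigma2), size=n)
    y_hat = x_true.sum() + rng.normal(scale=np.sqrt(sigma2_delta))
    # Closed-form estimate derived from Eq. (2):
    x_est = x_hat + sigma2 / (n * sigma2 + sigma2_delta) * (y_hat - x_hat.sum())
    mse_raw += np.mean((x_hat - x_true) ** 2) / trials
    mse_corrected += np.mean((x_est - x_true) ** 2) / trials

print(mse_raw, mse_corrected)   # the corrected estimate should have lower MSE
```

A short calculation shows the per-component error variance drops from $\sigma^2$ for the raw observations to $\sigma^2 - \sigma^4/(n\sigma^2 + \sigma_{\delta}^2)$ for the corrected estimate, which the simulated MSEs should reflect.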
