[MADS] plug-in estimator
I asked a couple of astronomers whether they had heard the term plug-in estimator, and none of them said yes.
When one computes the sample mean (xbar) and sample variance (s^2) to obtain the interval (xbar-s, xbar+s) and claims that this interval covers 68%, the sample mean, the sample variance, and the interval are all plug-in estimators. Once the form of the sampling distribution has been clarified, or once it has been verified that the formulas (estimators) for the sample mean and sample variance truly match the true mean and true variance, I can drop the "plug-in" part, because I know that asymptotically such an interval (estimator) will cover 68%.
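To make the 68% claim concrete, here is a minimal sketch in Python (standard library only). It reads "covers 68%" as the fraction of the data falling inside (xbar-s, xbar+s), which is one possible reading of the claim. The seed, the sample size, the two test distributions, and the helper names plug_in_interval and fraction_inside are all assumptions made for illustration.

```python
import random
import statistics


def plug_in_interval(sample):
    """Return (xbar - s, xbar + s), computed purely from the data."""
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return xbar - s, xbar + s


def fraction_inside(sample, lo, hi):
    """Empirical fraction of the points that fall inside [lo, hi]."""
    return sum(lo <= x <= hi for x in sample) / len(sample)


rng = random.Random(0)  # arbitrary fixed seed, assumed for reproducibility
n = 100_000             # assumed sample size, large enough to see the limit

# Two hypothetical data sets: one Gaussian, one strongly skewed.
samples = {
    "gaussian": [rng.gauss(0.0, 1.0) for _ in range(n)],
    "exponential": [rng.expovariate(1.0) for _ in range(n)],
}

coverage = {}
for name, sample in samples.items():
    lo, hi = plug_in_interval(sample)
    coverage[name] = fraction_inside(sample, lo, hi)
    print(f"{name}: fraction inside (xbar-s, xbar+s) = {coverage[name]:.3f}")
```

For the Gaussian sample the fraction should come out close to 0.68. For the exponential sample the same recipe gives a clearly larger fraction (around 0.86, since the interval is roughly (0, 2) for a unit-rate exponential). The recipe is the same function of data in both cases; only the underlying distribution decides what it covers.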
When the sample size is too small or the (theoretical) assumptions are not satisfied, one would want to say s, or plug-in error estimator, instead of 1-σ. Without knowing the true distribution (asymptotically, the empirical distribution), saying 1-σ misleadingly suggests that the best fit and its error bar assure 68% coverage, which is not necessarily true. What is actually computed/estimated is s, or a plug-in estimator defined via Δ chi-square=1. Generally, the Greek letter σ in statistics indicates a parameter, not a function of data (an estimator) such as the sample standard deviation (s), the root mean square error (rmse), or the solution of Δ chi-square=1.
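As an illustration of an error bar "defined via Δ chi-square=1", here is a minimal Python sketch for the simplest case I can think of: fitting a constant to measurements that all share one assumed measurement error. The data values, the assumed error sigma_assumed, the bisection bracket, and the function names are hypothetical choices for this sketch, not anything taken from a particular fitting package.

```python
import math


def chi_square(mu, data, sigma):
    """Chi-square of a constant model mu, with one assumed error sigma per point."""
    return sum(((x - mu) / sigma) ** 2 for x in data)


def delta_chi_square_halfwidth(data, sigma, tol=1e-10):
    """Return (best fit, d) where chi2(best + d) - chi2(best) = 1, by bisection."""
    best = sum(data) / len(data)  # minimizer of chi-square for a constant model
    chi2_min = chi_square(best, data, sigma)
    lo, hi = 0.0, 10.0 * sigma    # assumed bracket; delta chi-square exceeds 1 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi_square(best + mid, data, sigma) - chi2_min < 1.0:
            lo = mid
        else:
            hi = mid
    return best, 0.5 * (lo + hi)


data = [4.9, 5.3, 5.1, 4.7]  # hypothetical measurements
sigma_assumed = 0.5          # hypothetical assumed measurement error

best, halfwidth = delta_chi_square_halfwidth(data, sigma_assumed)
print(f"best fit = {best:.3f}, delta chi-square = 1 half-width = {halfwidth:.3f}")
print(f"sigma_assumed / sqrt(n) = {sigma_assumed / math.sqrt(len(data)):.3f}")
```

In this special case the half-width equals sigma_assumed/sqrt(n): it is fixed by the assumed error and the sample size, and the scatter of the data does not enter. Calling the result 1-σ only makes sense when the assumed error and the Gaussian model behind chi-square are actually right.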
Oftentimes I see extended uses of statistics and related terms in the astronomical literature, which lead to unnecessary comments and creative interpretations to account for unpleasant numbers. Because of the plug-in nature, the interval may not cover the value expected from physics. This is due to chi-square minimization (the best fit can be biased) and data quality (the data may contain outliers or may have been modified by instruments and processing). Unless robust statistics are employed (outliers can shift best fits, and robust statistics are less sensitive to outliers) and calibration uncertainty or other correction tools are suitably implemented, strange intervals are to be expected; they need neither be followed by creative comments nor be discarded. Those intervals are by-products of employing plug-in estimators whose statistical properties are unknown at the astronomers' data analysis stage. Instead of offering imaginative interpretations, one should investigate those plug-in estimators and try to devise better ones or rectify them, to make sure they lead close to the truth.
For example, instead of the simple average (xbar=f(x_1,…,x_n); the average is a function of data, just as the chi-square minimization method is another function of data), whose breakdown point is asymptotically zero and which can be far from the truth, the median (another function of data) could serve better (its breakdown point is 1/2). We know that chi-square methods are based on the L2 norm (e.g. variations of least squares methods). Instead, one can develop methods based on the L1 norm, as in quantile regression or least absolute deviation (LAD, for short; link from wiki). Many statistics are available to work around the shortcomings of popular plug-in estimators when the sampling distribution is not (perfectly) Gaussian or an analytic solution does not exist.
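The breakdown-point remark can be seen with a handful of numbers. Below is a minimal Python sketch (standard library only); the data values and the single wild outlier are made up for illustration.

```python
import statistics

# Hypothetical measurements scattered around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1]

# The same data plus one wild outlier (e.g. a glitch in processing).
contaminated = clean + [1000.0]

mean_clean = statistics.fmean(clean)
median_clean = statistics.median(clean)
mean_bad = statistics.fmean(contaminated)
median_bad = statistics.median(contaminated)

print(f"clean:        mean = {mean_clean:.2f}, median = {median_clean:.2f}")
print(f"contaminated: mean = {mean_bad:.2f}, median = {median_bad:.2f}")
```

A single bad point drags the average from about 10 to above 100, while the median moves only from 10.0 to 10.05. The average is the value that minimizes the sum of squared deviations (L2), and the median is the value that minimizes the sum of absolute deviations (L1), which is why L1-based methods such as LAD inherit this insensitivity to outliers.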
yaser:
what is plug-in estimator?
06-09-2009, 4:07 pm
hlee:
I cannot give you a short answer, but I can say that it is jargon from nonparametric statistics. Before parameterizing or setting up models (likelihoods and priors from a distribution family), one wishes to characterize a sample distribution by looking at its central location and scale (and also skewness, kurtosis, empirical cdf, contours, principal components, covariance matrix, etc.), which are explored by computing the average and the sample standard deviation. The equation for computing the average, for example, is a plug-in estimator. Since we haven't posed a Gaussian or Poisson distribution as the ground truth, we cannot say that the average plug-in estimator is the estimator of the mean (mu in the Gaussian or lambda in the Poisson). We can locate the center of the data distribution with a plug-in estimator, and thus, without parameterization but based on probability theory and asymptotics, we can quantify the uncertainty of that location plug-in estimator.
06-09-2009, 5:35 pm