Document retrieved from a search engine cache. Address of the original document: http://zebu.uoregon.edu/1998/es202/dispersion.html
Date modified: Sun Feb 1 23:13:34 1998

Every sample has some intrinsic dispersion which determines the overall possible range of values for some attribute. The bigger this range, the more difficult it is to accurately characterize the sample with just a few data points.

For instance, suppose I go to my son's second-grade class and want to determine the mean age of the class. A priori, I know that kids aged 6 to 8 are in the second grade, so to determine the mean age I really only need to ask 2 or 3 kids, essentially to verify that this is a normal second-grade class and not one filled with small 40-year-olds.

However, determining the mean age of the students in this course would require considerably more samples, because the dispersion, or range, of ages is larger.
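The link between dispersion and the number of samples needed can be sketched with the standard error of the mean, sigma / sqrt(n). This is a standard statistical result that the text above does not state explicitly. The dispersions and the target precision below are made-up values chosen only for illustration, and `samples_needed` is a hypothetical helper name.

```python
import math

def samples_needed(sigma, target_error):
    """Smallest n for which sigma / sqrt(n) <= target_error."""
    return math.ceil((sigma / target_error) ** 2)

# Hypothetical numbers, chosen only for illustration (all in years).
second_grade_sigma = 0.5   # narrow range of ages
wider_class_sigma = 5.0    # much broader range of ages
target_error = 0.3         # how well we want to know the mean

print(samples_needed(second_grade_sigma, target_error))
print(samples_needed(wider_class_sigma, target_error))
```

Under these assumed numbers, the narrow sample needs only a handful of measurements while the broad one needs a few hundred, because the required count grows with the square of the dispersion.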

Measuring dispersion is a somewhat mathematically complex procedure whose details you don't need to know; the provided tools will do it for you automatically. What you do need is to understand how to interpret the result.
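For the curious, here is a minimal sketch of one common way to measure dispersion, the sample standard deviation. It is not necessarily what the course tools do (they may fit a curve instead), and the five data values are invented for illustration.

```python
import math
import statistics

# Hypothetical annual totals in inches, invented for illustration only.
sample = [44.0, 52.0, 60.0, 48.0, 56.0]

# Step 1: the mean of the sample.
mean = statistics.mean(sample)

# Step 2: squared distance of each point from the mean.
deviations_squared = [(x - mean) ** 2 for x in sample]

# Step 3: average those (dividing by n - 1) and take the square root.
sigma = math.sqrt(sum(deviations_squared) / (len(sample) - 1))

# The standard library computes the same quantity directly.
assert math.isclose(sigma, statistics.stdev(sample))

print(f"mean = {mean:.1f} inches, dispersion = {sigma:.1f} inches")
```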

The measure of dispersion assumes that the sample can be adequately represented by a normal, or Gaussian, distribution. We will discuss this in detail later, but for now we can assume that most samples are adequately represented in this way. Let's return to the rainfall example and show the results of fitting the data to a mean value plus a dispersion. This is shown here:

The mean value for this data is 51.5 inches and the dispersion is 8 inches. So how do you interpret this?

A measure of dispersion is also a measure of probability. It is DEFINED in such a way that +/- 1 dispersion unit (usually called 1 sigma) contains 68% (about 2/3) of the sample. In the rainfall example given above, the fit to the data means that in 68% of years the annual rainfall in Eugene is between:

43.5 and 59.5 inches.
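The arithmetic behind this range, and the 68% figure itself, can be checked with a short sketch. The mean and dispersion are the values quoted above; the function names are hypothetical, and the 68% check assumes the data really follow a normal distribution.

```python
import math

MEAN = 51.5   # inches, from the fit quoted in the text
SIGMA = 8.0   # inches, from the fit quoted in the text

def sigma_interval(mean, sigma, k=1.0):
    """Range covering +/- k dispersion units around the mean."""
    return mean - k * sigma, mean + k * sigma

def fraction_within(k):
    """Fraction of a normal distribution within +/- k sigma of the mean."""
    return math.erf(k / math.sqrt(2.0))

low, high = sigma_interval(MEAN, SIGMA)
print(low, high)                       # 43.5 59.5
print(round(fraction_within(1.0), 3))  # 0.683
```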

Furthermore, there is no significant difference between quantities that are separated by less than one dispersion unit. That is, it would be inappropriate to say that a 55-inch annual rainfall is significantly above normal: it is only 3.5 inches above the 51.5-inch mean, well within the 8-inch dispersion.
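One way to make this comparison routine is to express any value as a number of dispersion units away from the mean. The sketch below uses the mean and dispersion from the rainfall fit; `sigmas_from_mean` is a hypothetical helper, not part of the course tools.

```python
def sigmas_from_mean(value, mean=51.5, sigma=8.0):
    """Distance of value from the mean, in dispersion units (sigma)."""
    return (value - mean) / sigma

offset = sigmas_from_mean(55.0)
print(offset)             # 0.4375
print(abs(offset) < 1.0)  # True: less than one dispersion unit from the mean
```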

A dispersion is then a measure of the statistical fluctuation around some mean quantity. You must have knowledge of this if you are to identify an event as being significant. Here is a simple guide:

Let's return again to the rainfall example. We have determined that the 25-year average was 51.5 inches. Last year, 1995, Eugene set a new annual rainfall record of 65.56 inches. Is this significant? Well