Merits and Demerits of S.D.
Merits:
i. The standard deviation is preferred to other measures of deviation because squaring is a neat way of removing the negative signs.
ii. It is rigidly defined.
iii. It is based on each and every item.
iv. It is capable of further mathematical treatment.
v. It is not much affected by fluctuations of sampling.
Demerits:
i. It is difficult to compute (though calculators make it easy).
ii. It is not simple to understand.
iii. It gives more weight to extreme values.
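The last demerit listed above can be seen numerically. The sketch below uses made-up values (they are an assumption, not taken from the text) in which 48 is the extreme value, and compares how much of the total that one value contributes when deviations are taken as absolute values and when they are squared.

```python
# Made-up illustrative data; 48 is the extreme value.
data = [10, 12, 14, 16, 48]
mean = sum(data) / len(data)

# Absolute deviations and squared deviations from the mean.
abs_devs = [abs(x - mean) for x in data]
sq_devs = [(x - mean) ** 2 for x in data]

# Fraction of each total contributed by the largest deviation.
share_abs = max(abs_devs) / sum(abs_devs)
share_sq = max(sq_devs) / sum(sq_devs)

print(f"share of absolute deviations: {share_abs:.3f}")
print(f"share of squared deviations:  {share_sq:.3f}")
```

In this example the extreme value accounts for half of the total absolute deviation but for more than three quarters of the total squared deviation, which is why it pulls the S.D. more strongly.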
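Find the mean, variance and standard deviation of the following data:
350, 361, 370, 373, 376, 379, 385, 387, 394, 395.
Sol: Mean of the given data,
x̄ = (350 + 361 + 370 + 373 + 376 + 379 + 385 + 387 + 394 + 395) / 10 = 3770 / 10 = 377
The deviations (xi – x̄) are –27, –16, –7, –4, –1, 2, 8, 10, 17, 18, and their squares add up to 1832.
∴ Variance (σ²) = Σ(xi – x̄)² / n = (1/10)(1832) = 183.2
Standard deviation (σ) = √(183.2) ≈ 13.54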
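The result can be cross-checked with Python's standard `statistics` module. This is only a sketch for verification; it assumes the population forms `pvariance` and `pstdev`, which divide by n as the solution above does (the sample forms `variance` and `stdev` divide by n – 1 and would give different numbers).

```python
import statistics

data = [350, 361, 370, 373, 376, 379, 385, 387, 394, 395]

mean = statistics.mean(data)           # arithmetic mean
variance = statistics.pvariance(data)  # population variance (divides by n)
std_dev = statistics.pstdev(data)      # population standard deviation

print(f"mean = {mean}, variance = {variance:.1f}, S.D. = {std_dev:.2f}")
```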
The mean of the squares of the deviations of the values from their arithmetic mean is called the variance. It is denoted by σ². The positive square root of the variance is called the standard deviation, and it is denoted by σ.
Variance and standard deviation for ungrouped data
To obtain the mean deviation (M.D.), we considered the absolute values of the deviations, ignoring the minus sign.
In calculating the variance and standard deviation (S.D.), to make the deviations non-negative, we just square them (instead of taking the modulus of the deviations).
The process is explained in steps below :
1. Find the mean (x̄) of the values.
2. Find the deviations of all the values from the mean, i.e., (xi – x̄).
These can be positive or negative.
3. Square the deviations, i.e., (xi – x̄)².
Now all the values are non-negative.
4. Sum them.
5. Divide by the number of observations (n).
The result is called the variance and is represented by σ².
6. Take its square root.
The positive value of the square root in step 6 is defined as the standard deviation (S.D.). It is denoted by the Greek letter σ (sigma).
∴ σ = +√( Σ(xi – x̄)² / n )
Note that Σ and σ are both called sigma (upper case and lower case respectively) but mean entirely different things: Σ denotes the summation of a series, whereas σ denotes the standard deviation.
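The six steps above can be written out one line per step. The function name `standard_deviation` and the sample values are illustrative assumptions chosen so that the arithmetic is easy to follow by hand; they do not come from the text.

```python
import math

def standard_deviation(values):
    """Population standard deviation, following the six steps one by one."""
    n = len(values)
    mean = sum(values) / n                    # step 1: mean of the values
    deviations = [x - mean for x in values]   # step 2: deviations from the mean
    squared = [d ** 2 for d in deviations]    # step 3: square the deviations
    total = sum(squared)                      # step 4: sum them
    variance = total / n                      # step 5: divide by n (variance)
    return math.sqrt(variance)                # step 6: positive square root

# Made-up data: mean 5, squared deviations sum to 32, variance 4.
print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))
```

`math.sqrt` always returns the non-negative root, which matches the "+" sign in the definition of σ.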
The value of 'σ' depends to a large extent on the term Σ(xi – x̄)².
Let us denote it as 'Y' and analyse.
i) When will Y = 0?
Y = 0 only if all the observations (xi) are equal to the mean (x̄).
In other words, all the observations should be the same!
Then there is no dispersion or spread or deviation. But it is a highly ideal situation and practically impossible. This is because any "process" invariably has some "variation".
ii) Y is a large value.
This indicates that some or many of the observations (xi) lie considerably far from the mean (x̄).
In such a case the degree of dispersion is high.
iii) Y is a small value.
This indicates that most of the observations (xi) are very close to the mean (x̄).
Hence the degree of dispersion is low, which is desirable.
Conclusion: case (i) is impractical, case (ii) is undesirable and case (iii) is what is required.
So if there are two or more sets of data for the same process, each with the same number of observations, the one with the smallest value of Y, and hence of σ, is the best.
To obtain a more appropriate measure of dispersion, 'Y' is scaled down by dividing it by 'n' and taking the square root, which gives σ.
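As a rough illustration of this comparison, the sketch below computes Y and σ for two made-up data sets with the same mean and the same number of observations. The values and the helper name `spread` are assumptions for illustration only.

```python
import math

def spread(values):
    """Return (Y, sigma): Y is the sum of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    y = sum((x - mean) ** 2 for x in values)
    return y, math.sqrt(y / n)

# Two hypothetical sets of readings from the same process, both with mean 50.
set_a = [48, 49, 50, 51, 52]
set_b = [40, 45, 50, 55, 60]

y_a, sigma_a = spread(set_a)
y_b, sigma_b = spread(set_b)

print(f"set A: Y = {y_a}, sigma = {sigma_a:.2f}")
print(f"set B: Y = {y_b}, sigma = {sigma_b:.2f}")
```

Set A has the smaller Y and the smaller σ, so by the reasoning above it is the less dispersed of the two.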
Note: There is an alternative formula for the variance: σ² = (Σxi²)/n – x̄².
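One commonly used alternative form is σ² = (Σxi²)/n – x̄², the mean of the squares minus the square of the mean; the sketch below assumes this is the formula intended. It checks, on the data from the worked example, that this form and the definition give the same variance.

```python
import math

data = [350, 361, 370, 373, 376, 379, 385, 387, 394, 395]
n = len(data)
mean = sum(data) / n

# Definition: mean of the squared deviations from the mean.
var_definition = sum((x - mean) ** 2 for x in data) / n

# Alternative form (assumed): mean of the squares minus square of the mean.
var_alternative = sum(x ** 2 for x in data) / n - mean ** 2

# Compare with a tolerance, since floating-point rounding can differ slightly.
print(math.isclose(var_definition, var_alternative))
```

The alternative form avoids computing each deviation separately, but with large values it subtracts two large, nearly equal numbers, so the comparison uses a tolerance rather than exact equality.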