Statistics Definition and Formulas
Statistics
Statistics is concerned with data collected for a specific purpose. We make decisions about the data by analysing and interpreting them. Representing data in graphical and tabular forms brings out their important properties and characteristics. In this chapter we also study the measures of central tendency: the mean (arithmetic mean), the median and the mode. A measure of central tendency gives us an idea of where the data are concentrated, but for a proper analysis we must also know how much the data are scattered or spread around that measure.
Mean
The sum of the observations divided by the number of observations is called the mean. It is denoted by \(\overline{x}\).
\(Mean (\overline{x}) = \frac{The \space sum \space of \space observations}{The \space number \space of \space observations}\)
Mean of ungrouped frequency distribution-
Let's understand by example,
Ex- Find the mean for the following frequency distribution?
x_{i} | 20 | 40 | 60 | 80 | 100 |
f_{i} | 2 | 12 | 14 | 8 | 4 |
x_{i} | f_{i} | f_{i}x_{i} |
20 | 2 | 40 |
40 | 12 | 480 |
60 | 14 | 840 |
80 | 8 | 640 |
100 | 4 | 400 |
Total | \(\sum f_i = 40\) | \(\sum f_ix_i = 2400\) |
\(Mean (\overline{x}) = \frac{\sum f_ix_i}{\sum f_i}\)
\(Mean (\overline{x}) = \frac{2400}{40}\)
= 60
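The same calculation can be sketched in Python. This is only a minimal illustration; the function name `frequency_mean` is made up for this example and is not part of any library.

```python
def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    total_freq = sum(freqs)
    if total_freq == 0:
        raise ValueError("total frequency must be positive")
    # Multiply each value by its frequency, add up, divide by N.
    return sum(f * x for x, f in zip(values, freqs)) / total_freq


x = [20, 40, 60, 80, 100]
f = [2, 12, 14, 8, 4]
print(frequency_mean(x, f))  # 60.0
```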
Mean of Grouped frequency distribution-
Ex- Find the mean for the following frequency distribution?
Intervals | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 |
f_{i} | 3 | 28 | 42 | 20 | 7 |
Intervals | f_{i} | x_{i} | f_{i}x_{i} |
0 - 10 | 3 | \(\frac {0+10}{2}\) = 5 | 15 |
10 - 20 | 28 | \(\frac {10+20}{2}\) = 15 | 420 |
20 - 30 | 42 | \(\frac {20+30}{2}\) = 25 | 1050 |
30 - 40 | 20 | \(\frac {30+40}{2}\) = 35 | 700 |
40 - 50 | 7 | \(\frac {40+50}{2}\) = 45 | 315 |
\(Mean (\overline{x}) = \frac{\sum f_ix_i}{\sum f_i}\)
\(Mean (\overline{x}) = \frac{2500}{100}\)
= 25
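For grouped data the only extra step is replacing each class by its mid-point. A short Python sketch, assuming each class is given as a (lower, upper) pair; the name `grouped_mean` is illustrative only.

```python
def grouped_mean(intervals, freqs):
    """Mean of a grouped distribution using class mid-points as x_i."""
    # x_i = (lower limit + upper limit) / 2 for every class
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(f * m for m, f in zip(midpoints, freqs)) / sum(freqs)


intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [3, 28, 42, 20, 7]
print(grouped_mean(intervals, freqs))  # 25.0
```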
Median
To find the median, first arrange the data in ascending or descending order. Then apply the following rules-
(I). If 'n' is odd-
\(Median (M) = (\frac {n+1}{2})^{th} \space term \)
(II). If 'n' is even-
\(Median (M) = \frac {(\frac {n}{2})^{th} \space term + (\frac {n}{2} + 1)^{th} \space term}{2} \)
Where 'n' is the number of observations.
ex- Median of 5, 2, 10, 15, 20, 25, 3?
solution- In ascending order- 2, 3, 5, 10, 15, 20, 25
n = 7 (odd)
\(Median (M) = (\frac {n+1}{2})^{th} \space term \)
\(Median (M) = (\frac {7+1}{2})^{th} \space term \)
\(Median (M) = (\frac {8}{2})^{th} \space term \)
\(Median (M) = 4^{th} \space term \)
M = 10
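The two rules can be sketched in Python as follows. The function name `median_of` and the second (even-length) data set are assumptions added for illustration; for an even number of observations the sketch averages the two middle terms, that is, the (n/2)th and (n/2 + 1)th terms.

```python
def median_of(data):
    """Median of raw (ungrouped) observations."""
    ordered = sorted(data)          # arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("no observations")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th term, which is index mid (0-based)
        return ordered[mid]
    # n even: average of the (n/2)th and (n/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([5, 2, 10, 15, 20, 25, 3]))  # 10
print(median_of([2, 3, 5, 10]))              # 4.0 (illustrative even case)
```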
Median of ungrouped frequency distribution-
Let's understand by an example,
Ex- Find the median for the following frequency distribution?
x_{i} | 5 | 7 | 9 | 10 | 12 | 15 |
f_{i} | 8 | 6 | 2 | 2 | 2 | 6 |
x_{i} | f_{i} | Cumulative frequency (c.f.) |
5 | 8 | 8 |
7 | 6 | 8+6 = 14 |
9 | 2 | 14+2 = 16 |
10 | 2 | 16+2 = 18 |
12 | 2 | 18+2 = 20 |
15 | 6 | 20+6 = 26 |
N = \(\sum f_i \) = 26
\(\frac {N}{2} = \frac {26}{2} = 13 \) (the cumulative frequency just greater than 13 is 14, which corresponds to x_{i} = 7)
Median(M) = 7
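A minimal Python sketch of this rule is given below. It assumes the values are already in ascending order, and the name `frequency_median` is hypothetical. Note that it follows the cumulative-frequency rule used in this example; when N is even and N/2 coincides exactly with a cumulative frequency, this rule can differ from averaging the two middle terms.

```python
from itertools import accumulate


def frequency_median(values, freqs):
    """Value whose cumulative frequency is the first to exceed N/2."""
    cumulative = list(accumulate(freqs))   # running totals of f_i
    half = cumulative[-1] / 2              # N / 2
    for x, cf in zip(values, cumulative):
        if cf > half:
            return x
    raise ValueError("frequencies must be positive")


x = [5, 7, 9, 10, 12, 15]
f = [8, 6, 2, 2, 2, 6]
print(frequency_median(x, f))  # 7
```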
Median of Grouped frequency distribution-
\(M = l + ( \frac {\frac {N}{2} - C}{f})\times h \)
l = Lower limit of the median class
N = \(\sum f_i \)
C = The cumulative frequency of the class preceding the median class
f = The frequency of the median class
h = width (size) of the median class
Ex- Find the median for the following frequency distribution?
Interval | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 |
f_{i} | 8 | 30 | 40 | 12 | 10 |
Interval | f_{i} | Cumulative frequency (c.f.) |
0 - 10 | 8 | 8 |
10 - 20 | 30 | 38 |
20 - 30 | 40 | 78 |
30 - 40 | 12 | 90 |
40 - 50 | 10 | 100 |
N = \(\sum f_i \) = 100
\(\frac {N}{2} = \frac {100}{2} = 50 \) (the cumulative frequency just greater than 50 is 78)
Median class = (20 - 30)
h = 30-20 = 10
\(M = l + ( \frac {\frac {N}{2} - C}{f})\times h \)
\(M = 20 + ( \frac {50 - 38}{40})\times 10 \)
\(M = 20 + ( \frac {12}{4}) \)
M = 23
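The steps above (find N/2, locate the median class, apply the formula) can be sketched in Python. This is an illustration under the assumption that classes are given as (lower, upper) pairs in ascending order; `grouped_median` is a made-up name.

```python
from itertools import accumulate


def grouped_median(intervals, freqs):
    """Median of a grouped distribution: l + ((N/2 - C) / f) * h."""
    cumulative = list(accumulate(freqs))
    half = cumulative[-1] / 2                    # N / 2
    for k, cf in enumerate(cumulative):
        if cf > half:                            # first class with c.f. > N/2
            lower, upper = intervals[k]          # l and l + h
            c_prev = cumulative[k - 1] if k > 0 else 0   # C
            return lower + (half - c_prev) * (upper - lower) / freqs[k]
    raise ValueError("frequencies must be positive")


intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [8, 30, 40, 12, 10]
print(grouped_median(intervals, freqs))  # 23.0
```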
Mode
The observation that occurs with the greatest frequency in the data is called the mode. It is represented by 'z'.
ex- Mode of 5, 2, 2, 5, 20, 5, 3?
solution- Frequency of 5 is three which is higher so,
z = 5
Mode of ungrouped frequency distribution-
x_{i} | 20 | 40 | 60 | 80 | 100 |
f_{i} | 2 | 12 | 14 | 8 | 4 |
z = 60 (highest frequency 14)
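Both cases can be sketched in Python with the standard library's `collections.Counter`. The helper name `modes` is only illustrative; it returns a list because a data set may have more than one value with the highest frequency.

```python
from collections import Counter


def modes(data):
    """All values that share the highest frequency in raw data."""
    counts = Counter(data)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


print(modes([5, 2, 2, 5, 20, 5, 3]))  # [5]

# For a frequency table, pick the x_i that has the largest f_i.
x = [20, 40, 60, 80, 100]
f = [2, 12, 14, 8, 4]
print(x[f.index(max(f))])  # 60
```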
Mode of grouped frequency distribution-
\(Z = l + ( \frac {f_1 - f_0}{2f_1 - f_0 - f_2})\times h \)
where, l = Lower limit of mode class
f_{0} = The frequency of the class preceding the mode class
f_{1} = Frequency of mode class
f_{2} = The frequency of the class following the mode class
h = width (size) of the mode class
Ex- Find the mode for the following frequency distribution?
Interval | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 |
f_{i} | 4 | 7 | 13 | 9 | 3 |
Mode class = 20 - 30 (highest frequency)
\(Z = l + ( \frac {f_1 - f_0}{2f_1 - f_0 - f_2})\times h \)
\(Z = 20 + ( \frac {13 - 7}{2 \times 13 - 7 - 9})\times 10 \)
Z = 20 + 6
Z = 26
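A small Python sketch of the grouped-mode formula follows. The name `grouped_mode` is hypothetical, and treating a missing neighbouring class as frequency 0 (when the mode class is the first or last class) is an assumption of this sketch, not something stated above.

```python
def grouped_mode(intervals, freqs):
    """Mode of a grouped distribution: l + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    k = freqs.index(max(freqs))                      # mode class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                # assumption: 0 if no class before
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0   # assumption: 0 if no class after
    lower, upper = intervals[k]
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 = f0 + f2")
    return lower + (f1 - f0) * (upper - lower) / denominator


intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [4, 7, 13, 9, 3]
print(grouped_mode(intervals, freqs))  # 26.0
```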
Relation between mean, median and mode
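For a moderately skewed distribution, the mean, median and mode approximately satisfy the following empirical relation:
Mode = 3 Median - 2 Mean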
Z = 3M - 2\(\overline{x}\)
Some other results from the above relation
Z - \(\overline{x}\) = 3(M - \(\overline{x}\))
Z - M = 2(M - \(\overline{x}\))
Quartiles
\(Q_i = l + (\frac {i \frac {N}{4} - C}{f}) \times h \)
Where, l = Lower limit of quartile class
f = frequency of quartile class
C = The cumulative frequency of the class preceding the quartile class
N = \(\sum f_i \)
h = width (size) of the quartile class
i = 1, 2, 3 (possible value of 'i')
Deciles
\(D_i = l + (\frac {i \frac {N}{10} - C}{f}) \times h \)
Where, l = Lower limit of decile class
f = frequency of decile class
C = The cumulative frequency of the class preceding the decile class
i = 1, 2, 3, ......... 9 (possible value of 'i')
Percentiles
\(P_i = l + (\frac {i \frac {N}{100} - C}{f}) \times h \)
Where, l = Lower limit of percentile class
f = frequency of percentile class
C = The cumulative frequency of the class preceding the percentile class
i = 1, 2, 3, ............. 99 (possible value of 'i')
Ex- Find Q_{1}, D_{1} and P_{1} for the given distribution?
Interval | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 |
f_{i} | 8 | 30 | 40 | 12 | 10 |
Interval | f_{i} | Cumulative frequency (c.f.) |
0 - 10 | 8 | 8 |
10 - 20 | 30 | 38 |
20 - 30 | 40 | 78 |
30 - 40 | 12 | 90 |
40 - 50 | 10 | 100 |
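\( N = \sum f_i = 100 \)
For Q_{1}: \( \frac {N}{4} = 25 \). The cumulative frequency just greater than 25 is 38, so the quartile class is 10 - 20, with l = 10, C = 8, f = 30 and h = 10.
\(Q_i = l + (\frac {i \frac {N}{4} - C}{f}) \times h \)
\(Q_1 = 10 + (\frac {1 \times \frac {100}{4} - 8}{30}) \times 10 \)
\(Q_1 = 10 + \frac {17}{3} \)
\(Q_1 = \frac {47}{3} \approx 15.67 \)
For D_{1}: \( \frac {N}{10} = 10 \). The cumulative frequency just greater than 10 is 38, so the decile class is also 10 - 20, with l = 10, C = 8, f = 30 and h = 10.
\(D_i = l + (\frac {i \frac {N}{10} - C}{f}) \times h \)
\(D_1 = 10 + (\frac {1 \times \frac {100}{10} - 8}{30}) \times 10 \)
\(D_1 = 10 + \frac {2}{3} \)
\(D_1 = \frac {32}{3} \approx 10.67 \)
For P_{1}: \( \frac {N}{100} = 1 \). The cumulative frequency just greater than 1 is 8, so the percentile class is 0 - 10, with l = 0, C = 0 (there is no preceding class), f = 8 and h = 10.
\(P_i = l + (\frac {i \frac {N}{100} - C}{f}) \times h \)
\(P_1 = 0 + (\frac {1 \times \frac {100}{100} - 0}{8}) \times 10 \)
\(P_1 = \frac {10}{8} = \frac {5}{4} = 1.25 \)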
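Quartiles, deciles and percentiles share one formula, so a single helper can cover all three. The Python sketch below is an illustration only: `partition_value` and its `parts` parameter are made-up names, and the class for each value is taken to be the first class whose cumulative frequency exceeds i × N / parts.

```python
from itertools import accumulate


def partition_value(intervals, freqs, i, parts):
    """i-th partition value: parts=4 quartiles, 10 deciles, 100 percentiles."""
    if not 0 < i < parts:
        raise ValueError("i must lie between 1 and parts - 1")
    cumulative = list(accumulate(freqs))
    target = i * cumulative[-1] / parts              # i * N / parts
    for k, cf in enumerate(cumulative):
        if cf > target:                              # class containing the value
            lower, upper = intervals[k]              # l and l + h
            c_prev = cumulative[k - 1] if k > 0 else 0   # C (0 for the first class)
            return lower + (target - c_prev) * (upper - lower) / freqs[k]
    raise ValueError("frequencies must be positive")


intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [8, 30, 40, 12, 10]
print(partition_value(intervals, freqs, 1, 4))    # Q1
print(partition_value(intervals, freqs, 1, 10))   # D1
print(partition_value(intervals, freqs, 1, 100))  # P1
```

Calling the helper with `i=2, parts=4` gives the second quartile, which is the same as the median of the distribution.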
Measures of Dispersion
The spread of the terms of a series from the mean is called dispersion.
i. Range
The difference between the maximum and minimum value is called Range.
ex- Range of 6, 4, 2, 3, 8, 4, 7?
Range = 8 - 2 = 6
Range coefficient
\(Range \space coefficient = \frac {maximum \space value - minimum \space value}{maximum \space value + minimum \space value } \)
\(Range \space coefficient = \frac {8 - 2}{8 + 2} \)
= 0.6
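In Python, the range and its coefficient follow directly from the built-in `max` and `min`. The function names here are illustrative only.

```python
def value_range(data):
    """Range = maximum value - minimum value."""
    return max(data) - min(data)


def range_coefficient(data):
    """(max - min) / (max + min)."""
    highest, lowest = max(data), min(data)
    return (highest - lowest) / (highest + lowest)


data = [6, 4, 2, 3, 8, 4, 7]
print(value_range(data))        # 6
print(range_coefficient(data))  # 0.6
```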
ii. Quartile Deviation
\(Quartile \space Deviation = \frac {Q_3 - Q_1}{2} \)
Coefficient of Quartile Deviation
\(Quartile \space Deviation \space Coefficient = \frac {Q_3 - Q_1}{Q_3 + Q_1} \)
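As a quick illustration, both formulas can be written as small Python functions. The quartile values used below (Q1 = 20, Q3 = 30) are hypothetical numbers chosen only to show the arithmetic; they do not come from any example in this chapter.

```python
def quartile_deviation(q1, q3):
    """(Q3 - Q1) / 2, also called the semi-interquartile range."""
    return (q3 - q1) / 2


def quartile_deviation_coefficient(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)


# Hypothetical quartiles, for illustration only.
q1, q3 = 20, 30
print(quartile_deviation(q1, q3))              # 5.0
print(quartile_deviation_coefficient(q1, q3))  # 0.2
```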
iii. Mean Deviation
I. Mean deviation with respect to the Mean
Ex- Find the mean deviation with respect to the mean for each of the following.
a. Individual series
qus- 4, 7, 8, 9, 10, 12, 13, 17
Solution-
x_{i} | \(|x_i - \overline {x}|\) |
4 | 6 |
7 | 3 |
8 | 2 |
9 | 1 |
10 | 0 |
12 | 2 |
13 | 3 |
17 | 7 |
\( \overline{x} = \frac{80}{8} = 10 \)
\( Mean \space Deviation = \frac {\sum |x_i - \overline{x}|}{n} \)
\( Mean \space Deviation = \frac {24}{8} \)
= 3
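The same steps (find the mean, take absolute deviations, average them) can be sketched in Python. The function name `mean_deviation` is illustrative only.

```python
def mean_deviation(data):
    """Mean deviation about the mean: sum(|x_i - mean|) / n."""
    n = len(data)
    mean = sum(data) / n
    return sum(abs(x - mean) for x in data) / n


data = [4, 7, 8, 9, 10, 12, 13, 17]
print(mean_deviation(data))  # 3.0
```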
b. Ungrouped frequency distribution-
x_{i} | 20 | 40 | 60 | 80 | 100 |
f_{i} | 2 | 12 | 14 | 8 | 4 |
we know that,
\(Mean( \overline {x}) = 60 \)
x_{i} | f_{i} | \(|x_i - \overline {x}|\) | \(f_i|x_i - \overline {x}|\) |
20 | 2 | 40 | 80 |
40 | 12 | 20 | 240 |
60 | 14 | 0 | 0 |
80 | 8 | 20 | 160 |
100 | 4 | 40 | 160 |
\( Mean \space Deviation = \frac {\sum f_i |x_i - \overline{x}|}{N} \)
\( Mean \space Deviation = \frac {640}{40} \)
= 16
c. Grouped frequency distribution-
Interval | 0 - 10 | 10 - 20 | 20 - 30 | 30 - 40 | 40 - 50 |
f_{i} | 3 | 28 | 42 | 20 | 7 |
we know that,
\(Mean (\overline {x}) = 25 \)
Interval | x_{i} | f_{i} | \(|x_i - \overline {x}|\) | \(f_i|x_i - \overline {x}|\) |
0 - 10 | \(\frac {0+10}{2}\) = 5 | 3 | 20 | 60 |
10 - 20 | \(\frac {10+20}{2}\) = 15 | 28 | 10 | 280 |
20 - 30 | \(\frac {20+30}{2}\) = 25 | 42 | 0 | 0 |
30 - 40 | \(\frac {30+40}{2}\) = 35 | 20 | 10 | 200 |
40 - 50 | \(\frac {40+50}{2}\) = 45 | 7 | 20 | 140 |
\( Mean \space Deviation = \frac {\sum f_i |x_i - \overline{x}|}{N} \)
\( Mean \space Deviation = \frac {680}{100} \)
= 6.8
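For frequency distributions, each absolute deviation is weighted by its frequency. The Python sketch below handles both cases b and c; for grouped data the class mid-points are passed in as the values. The name `frequency_mean_deviation` is an assumption made for this example.

```python
def frequency_mean_deviation(values, freqs):
    """Mean deviation about the mean: sum(f_i * |x_i - mean|) / N."""
    n_total = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n_total
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n_total


# b. Ungrouped frequency distribution
print(frequency_mean_deviation([20, 40, 60, 80, 100], [2, 12, 14, 8, 4]))  # 16.0

# c. Grouped frequency distribution (x_i are the class mid-points)
midpoints = [5, 15, 25, 35, 45]
print(frequency_mean_deviation(midpoints, [3, 28, 42, 20, 7]))  # 6.8
```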
II. Mean Deviation formula with respect to median
a. Individual series
\( Mean \space Deviation = \frac {\sum |x_i - M|}{n} \)
b. frequency distribution-
\( Mean \space Deviation = \frac {\sum f_i |x_i - M|}{N} \)
III. Mean Deviation formula with respect to Mode
a. Individual series
\( Mean \space Deviation = \frac {\sum |x_i - Z|}{n} \)
b. frequency distribution-
\( Mean \space Deviation = \frac {\sum f_i |x_i - Z|}{N} \)
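The formulas about the median and the mode differ from the one about the mean only in the central value that is subtracted. A hedged Python sketch, using `statistics.median` and `statistics.mode` from the standard library; the helper name `mean_deviation_about` is hypothetical, and the data sets are reused from the earlier examples.

```python
import statistics


def mean_deviation_about(data, center):
    """sum(|x_i - center|) / n for any chosen central value."""
    return sum(abs(x - center) for x in data) / len(data)


data = [4, 7, 8, 9, 10, 12, 13, 17]
m = statistics.median(data)            # (9 + 10) / 2 = 9.5
print(mean_deviation_about(data, m))   # 3.0

sample = [5, 2, 2, 5, 20, 5, 3]
z = statistics.mode(sample)            # 5
print(mean_deviation_about(sample, z))
```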
Standard Deviation
The standard deviation is the square root of the arithmetic mean of the squared deviations of the values from their arithmetic mean. It is denoted by \(\sigma\).
\(\sigma = \sqrt {\frac {\sum (x_i - \overline {x})^2}{n}}\)
Variance
Variance = (Standard Deviation)^{2}
\( V = \sigma^2 \)
Coefficient of Standard Deviation
\( Coefficient \space of \space standard \space deviation = \frac {\sigma}{\overline{x}} \)
Coefficient of variation
\( Coefficient \space of \space variation = \frac {\sigma}{\overline{x}} \times 100 \)
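The formulas in this section can be sketched in Python as below. The function names are illustrative, the data set is reused from the mean deviation example, and the sketch divides by n as in the formula for \(\sigma\) above (the population form, not the n - 1 sample form).

```python
import math


def population_variance(data):
    """Variance = sum((x_i - mean)^2) / n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def population_std(data):
    """Standard deviation = square root of the variance."""
    return math.sqrt(population_variance(data))


def coefficient_of_variation(data):
    """(sigma / mean) * 100."""
    mean = sum(data) / len(data)
    return population_std(data) / mean * 100


data = [4, 7, 8, 9, 10, 12, 13, 17]
print(population_variance(data))                 # 14.0
print(round(population_std(data), 4))            # 3.7417
print(round(coefficient_of_variation(data), 2))  # 37.42
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same population quantities and can be used as a cross-check.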