Determination of the arithmetic mean by the method of moments. Properties of the arithmetic mean
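The mean by the method of moments is calculated by the formula
$$\bar{x} = A + h \cdot \frac{\sum x^*_i f_i}{\sum f_i}, \qquad x^*_i = \frac{x_i - A}{h},$$
where A is a conditional zero equal to the variant with the maximum frequency (the midpoint of the interval with the maximum frequency) and h is the interval step.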
Algorithm for finding the average by the method of moments
Example. The costs of working time for a homogeneous technological operation were distributed among the workers as follows:
It is required to determine the average cost of working time and the standard deviation by the method of moments, the coefficient of variation, and the mode and median.
Table for calculating indicators.
Groups | Interval midpoint, x_i | Quantity, f_i | x_i f_i | Cumulative frequency, S | (x_i - x̄)² f_i |
5 - 10 | 7.5 | 20 | 150 | 20 | 4600.56 |
15 - 20 | 17.5 | 25 | 437.5 | 45 | 667.36 |
20 - 25 | 22.5 | 50 | 1125 | 95 | 1.39 |
25 - 30 | 27.5 | 30 | 825 | 125 | 700.83 |
30 - 35 | 32.5 | 15 | 487.5 | 140 | 1450.42 |
35 - 40 | 37.5 | 10 | 375 | 150 | 2200.28 |
Total | | 150 | 3400 | | 9620.83 |
Mode
$$Mo = x_0 + h \cdot \frac{f_2 - f_1}{(f_2 - f_1) + (f_2 - f_3)},$$
where x_0 is the beginning of the modal interval; h is the width of the interval; f_2 is the frequency of the modal interval; f_1 is the premodal frequency; f_3 is the postmodal frequency.
We choose 20 as the beginning of the interval, since the interval 20 - 25 has the largest frequency (50).
$$Mo = 20 + 5 \cdot \frac{50 - 25}{(50 - 25) + (50 - 30)} = 22.78$$
The most common value of the series is 22.78 min.
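The interpolation inside the modal interval can be checked with a short Python sketch. The function name `grouped_mode` and the tuple layout of `groups` are illustrative assumptions; the numbers are the groups and frequencies from the table above.

```python
def grouped_mode(x0, h, f1, f2, f3):
    """Mode of an equal-interval series, interpolated inside the modal interval."""
    return x0 + h * (f2 - f1) / ((f2 - f1) + (f2 - f3))

# (lower bound, upper bound, frequency) for each group of the example
groups = [(5, 10, 20), (15, 20, 25), (20, 25, 50),
          (25, 30, 30), (30, 35, 15), (35, 40, 10)]

freqs = [f for _, _, f in groups]
k = freqs.index(max(freqs))                      # position of the modal interval
lower, upper, f2 = groups[k]
f1 = freqs[k - 1] if k > 0 else 0                # premodal frequency
f3 = freqs[k + 1] if k + 1 < len(freqs) else 0   # postmodal frequency

mode = grouped_mode(lower, upper - lower, f1, f2, f3)
print(round(mode, 2))
```

Treating a missing neighbour as frequency 0 is a simplifying choice for the case where the modal interval is the first or last group.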
Median
The median interval is 20 - 25, because it is the first interval whose cumulative frequency S (95) exceeds half of the total sum of frequencies (150 / 2 = 75).
$$Me = x_0 + h \cdot \frac{\sum f_i / 2 - S_{Me-1}}{f_{Me}} = 20 + 5 \cdot \frac{75 - 45}{50} = 23$$
Thus, 50% of the population units have a value of less than 23 min.
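The conditional variants are $x^*_i = \frac{x_i - A}{h}$.
We find A = 22.5, interval step h = 5.
Mean and mean squared deviation by the method of moments.
x_c | x*_i | x*_i f_i | (x*_i)² f_i |
7.5 | -3 | -60 | 180 |
17.5 | -1 | -25 | 25 |
22.5 | 0 | 0 | 0 |
27.5 | 1 | 30 | 30 |
32.5 | 2 | 30 | 60 |
37.5 | 3 | 30 | 90 |
Total | | 5 | 385 |
$$\bar{x} = A + h \cdot \frac{\sum x^*_i f_i}{\sum f_i} = 22.5 + 5 \cdot \frac{5}{150} = 22.67 \text{ min.}$$
Standard deviation.
$$\sigma = h \sqrt{\frac{\sum (x^*_i)^2 f_i}{\sum f_i} - \left(\frac{\sum x^*_i f_i}{\sum f_i}\right)^2} = 5 \sqrt{\frac{385}{150} - \left(\frac{5}{150}\right)^2} = 8.01 \text{ min.}$$
The coefficient of variation is a measure of the relative spread of population values: it shows what proportion of the average value of this quantity is its average spread.
$$v = \frac{\sigma}{\bar{x}} \cdot 100\% = \frac{8.01}{22.67} \cdot 100\% = 35.3\%$$
Since v > 30% but v < 70%, the variation is moderate.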
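The calculation for this example can be reproduced with a minimal Python sketch. The function name `moments_summary` is an illustrative assumption; the midpoints, frequencies, A and h are taken from the example.

```python
from math import sqrt

midpoints = [7.5, 17.5, 22.5, 27.5, 32.5, 37.5]
freqs = [20, 25, 50, 30, 15, 10]
A, h = 22.5, 5          # conditional zero and interval step from the example

def moments_summary(xs, fs, A, h):
    """Return (mean, standard deviation, coefficient of variation in %)."""
    n = sum(fs)
    coded = [(x - A) / h for x in xs]                    # conditional variants x*
    m1 = sum(c * f for c, f in zip(coded, fs)) / n       # first-order moment
    m2 = sum(c * c * f for c, f in zip(coded, fs)) / n   # second-order moment
    mean = A + h * m1
    sigma = h * sqrt(m2 - m1 ** 2)
    return mean, sigma, sigma / mean * 100

mean, sigma, cv = moments_summary(midpoints, freqs, A, h)
print(round(mean, 2), round(sigma, 2), round(cv, 1))
```

The variance obtained this way agrees with the direct calculation 9620.83 / 150 from the first table.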
Example
To evaluate the distribution series, we find the following indicators: the weighted average.
The average value of the studied trait is found by the method of moments, where A is a conditional zero equal to the variant with the maximum frequency (the midpoint of the interval with the maximum frequency) and h is the interval step.
The arithmetic mean has a number of properties that more fully reveal its essence and simplify the calculation:
1. The product of the mean and the sum of the frequencies is always equal to the sum of the products of the variants and the frequencies, i.e. $\bar{x} \sum f_i = \sum x_i f_i$.
2. The arithmetic mean of the sum of varying values is equal to the sum of the arithmetic means of these values: $\overline{x + y} = \bar{x} + \bar{y}$.
3. The algebraic sum of the deviations of the individual values of the attribute from the mean is zero: $\sum (x_i - \bar{x}) f_i = 0$.
4. The sum of the squared deviations of the variants from the mean is less than the sum of the squared deviations from any other value C, i.e. $\sum (x_i - \bar{x})^2 f_i < \sum (x_i - C)^2 f_i$ for $C \ne \bar{x}$.
5. If all variants of the series are reduced or increased by the same number A, then the mean decreases or increases by the same number: $\frac{\sum (x_i \pm A) f_i}{\sum f_i} = \bar{x} \pm A$.
6. If all variants of the series are reduced or increased by a factor of A, then the mean also decreases or increases by a factor of A: $\frac{\sum (x_i / A) f_i}{\sum f_i} = \frac{\bar{x}}{A}$ and $\frac{\sum (x_i A) f_i}{\sum f_i} = \bar{x} A$.
7. If all frequencies (weights) are increased or decreased by a factor of A, then the arithmetic mean does not change: $\frac{\sum x_i (f_i / A)}{\sum (f_i / A)} = \bar{x}$.
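Several of these properties can be verified numerically. The sketch below is only an illustration: it reuses the midpoints and frequencies of the working-time example, and the helper names `wmean` and `sq_dev` as well as the shift 10, the divisor 5 and the weight factor 10 are arbitrary assumptions.

```python
xs = [7.5, 17.5, 22.5, 27.5, 32.5, 37.5]   # interval midpoints
fs = [20, 25, 50, 30, 15, 10]              # frequencies

def wmean(values, weights):
    """Weighted arithmetic mean."""
    return sum(x * f for x, f in zip(values, weights)) / sum(weights)

mean = wmean(xs, fs)

# Property 3: weighted deviations from the mean sum to zero.
dev_sum = sum((x - mean) * f for x, f in zip(xs, fs))

# Property 4: squared deviations are smallest when taken from the mean.
def sq_dev(c):
    return sum((x - c) ** 2 * f for x, f in zip(xs, fs))

# Properties 5 and 6: shifting or scaling the variants shifts or scales the mean.
shifted = wmean([x - 10 for x in xs], fs)
scaled = wmean([x / 5 for x in xs], fs)

# Property 7: dividing all weights by the same factor leaves the mean unchanged.
reweighted = wmean(xs, [f / 10 for f in fs])

print(mean, dev_sum, shifted, scaled, reweighted)
```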
This method is based on the mathematical properties of the arithmetic mean. In this case, the average value is calculated by the formula $\bar{x} = i \cdot m_1 + A$, where i is the width of an equal interval or any constant number not equal to 0; $m_1$ is the moment of the first order, calculated by the formula $m_1 = \frac{\sum \frac{x - A}{i} f}{\sum f}$; A is any constant number.
SIMPLE AND WEIGHTED HARMONIC MEAN.
The harmonic mean is used in cases where the frequencies (f_i) are unknown, but the volume of the studied trait (x_i f_i = M_i) is known.
Using example 2, we determine the average wage in 2001.
The original information for 2001 contains no data on the number of employees, but it is not difficult to calculate it as the ratio of the wage fund to the average wage.
The result is 2769.4 rubles, i.e. the average salary in 2001 is 2769.4 rubles.
In this case, the weighted harmonic mean is used:
$$\bar{x} = \frac{\sum M_i}{\sum \frac{M_i}{x_i}},$$
where M_i is the wage fund in a separate workshop; x_i is the salary in a separate workshop.
Therefore, the harmonic mean is used when one of the factors is unknown, but the product M is known.
The harmonic mean is used to calculate the average labor productivity, the average percentage of compliance with the norms, the average salary, etc.
If the products M are equal to each other, then the simple harmonic mean is used: $\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$, where n is the number of variants.
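A minimal Python sketch of both harmonic means follows. The function names and the three-workshop data are hypothetical, chosen only so that the implied headcounts (100, 200 and 100 workers) are easy to check by hand; they are not the 2001 figures of example 2.

```python
def weighted_harmonic_mean(values, totals):
    """values: x_i (wage per workshop); totals: M_i = x_i * f_i (wage fund)."""
    return sum(totals) / sum(m / x for m, x in zip(totals, values))

def simple_harmonic_mean(values):
    """Harmonic mean for the case where all products M_i are equal."""
    return len(values) / sum(1 / x for x in values)

# Hypothetical data: wages and wage funds of three workshops
wages = [2000.0, 2500.0, 4000.0]
funds = [200000.0, 500000.0, 400000.0]

avg_wage = weighted_harmonic_mean(wages, funds)
print(avg_wage)                         # total fund divided by total headcount
print(simple_harmonic_mean([2.0, 4.0, 4.0]))
```

Dividing each fund by its wage recovers the unknown frequency, so the weighted harmonic mean is the ordinary weighted arithmetic mean computed without knowing f_i directly.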
GEOMETRIC AVERAGE AND CHRONOLOGICAL AVERAGE.
The geometric mean is used to analyze the dynamics of phenomena and makes it possible to determine the average growth factor. When calculating the geometric mean, the individual values of a trait usually represent relative indicators of dynamics, built as chain values, i.e. as the ratio of each level of the series to the previous level.
$$\bar{K} = \sqrt[n]{K_1 \cdot K_2 \cdot \ldots \cdot K_n},$$
where $K_1, K_2, \ldots, K_n$ are the chain growth coefficients;
n is the number of chain growth coefficients.
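The average growth factor can be sketched in Python as follows. The series `levels` is hypothetical (each level is 10% above the previous one), and `geometric_mean` is an illustrative helper name.

```python
import math

def geometric_mean(factors):
    """n-th root of the product of n chain growth coefficients."""
    return math.prod(factors) ** (1 / len(factors))

# Hypothetical levels of a time series
levels = [100.0, 110.0, 121.0, 133.1]

# Chain coefficients: ratio of each level to the previous one
chain = [cur / prev for prev, cur in zip(levels, levels[1:])]

avg_growth = geometric_mean(chain)
print(round(avg_growth, 4))
```

Because the chain coefficients multiply to the ratio of the last level to the first, the same result can be obtained as `(levels[-1] / levels[0]) ** (1 / len(chain))`.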
If the initial data are given as of certain dates, then the average level of the attribute is determined by the chronological mean formula. If the intervals between dates (moments) are equal, then the average level is determined by the simple chronological mean formula
$$\bar{x} = \frac{\frac{x_1}{2} + x_2 + \ldots + x_{n-1} + \frac{x_n}{2}}{n - 1},$$
where n is the number of levels.
Let's consider its calculation on specific examples.
Example. The following data are available on the balances of household deposits in Russian banks in the first half of 1997 (at the beginning of the month):
The average balance of household deposits for the first half of 1997 is calculated by the simple chronological mean formula.
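A minimal Python sketch of the simple chronological mean is given below. The seven balances are hypothetical placeholders for values recorded at the beginning of seven consecutive months; they are not the 1997 deposit figures.

```python
def chronological_mean(levels):
    """Simple chronological mean for levels recorded at equally spaced moments."""
    if len(levels) < 2:
        raise ValueError("at least two levels are required")
    # First and last levels enter with half weight, inner levels with full weight.
    total = levels[0] / 2 + sum(levels[1:-1]) + levels[-1] / 2
    return total / (len(levels) - 1)

# Hypothetical balances at the beginning of seven consecutive months
balances = [10.0, 12.0, 14.0, 13.0, 15.0, 16.0, 18.0]

avg_balance = chronological_mean(balances)
print(avg_balance)
```

Seven moments bound six monthly periods, which is why the divisor is the number of levels minus one.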
$$M = A + i \cdot \frac{\sum a p}{n},$$
where A is the conditional average (the variant that repeats most often in the variation series);
a is the conditional deviation from the conditional average (rank);
i is the interval;
p is the frequency and n is the total number of observations.
1st stage: determination of the midpoints of the groups.
2nd stage: ranking of the groups. Rank 0 is assigned to the group with the highest frequency, i.e. in this case 7-11 (frequency 32). Moving up from this group, the rank decreases by 1 at each step (-1, -2, ...); moving down, it increases by 1 (+1, +2, ...).
3rd stage: determination of the conditional mode (conditional average). A is the midpoint of the modal interval. In our case, the modal interval is 7-11, so A = 9.
4th stage: determination of the interval. The interval is the same in all groups of the series and equals 5: i = 5.
5th stage: determination of the total number of observations: n = ∑p = 103.
We substitute the obtained data into the formula:
Tasks for independent work
Using the data of the grouped variation series, calculate the arithmetic mean by the method of moments.
Option number 1
Option number 2
Option number 3
Option number 4
Option number 5
Option number 6
Option number 7
Option number 8
Option number 9
Option number 10
Option number 11
Option number 12
Task №4 Determining the mode and median in an ungrouped variational series with an odd number of options
Terms of inpatient treatment of sick children in days: 15, 14, 18, 17, 16, 20, 19, 16, 14, 16, 17, 12, 18, 19, 20.
To determine the mode in the variation series, the ranking of the series is optional. However, before determining the median, it is necessary to build a variation series in ascending or descending order.
12, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 19, 20, 20.
Mode = 16: the variant 16 occurs most often (3 times).
If several variants share the highest frequency, then two or more modes can be indicated in the variation series.
The ordinal number of the median in a series with an odd number of variants is determined by the formula:
$$\frac{n + 1}{2} = \frac{15 + 1}{2} = 8,$$
where 8 is the ordinal number of the median in the ranked variation series,
so Me = 17.
Task №5 Determining the mode and median in an ungrouped variation series with an even number of options.
Based on the data given in the task, find the mode and the median.
Terms of inpatient treatment of sick children in days: 15, 14, 18, 17, 16, 20, 19, 16, 14, 16, 17, 12, 18, 19, 20, 11
We build a ranked variational series:
11, 12, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 19, 20, 20
We have two median numbers 16 and 17. In this case, the median is found as the arithmetic mean between them. Me = 16.5.
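Both tasks can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names `odd_series` and `even_series` are illustrative.

```python
from statistics import median, multimode

# Task 4: 15 observations (odd number of variants)
odd_series = [15, 14, 18, 17, 16, 20, 19, 16, 14, 16, 17, 12, 18, 19, 20]
# Task 5: the same data plus one more observation (even number of variants)
even_series = odd_series + [11]

# multimode returns every value that shares the highest frequency,
# so a series with several modes is reported in full.
print(multimode(odd_series), median(odd_series))
print(multimode(even_series), median(even_series))
```

`median` sorts the data internally, so the ranked series does not have to be built by hand; for an even number of variants it averages the two middle values.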
The method of moments equates the moments of the theoretical distribution with the moments of the empirical distribution (the distribution constructed from observations). Estimates of the distribution parameters are found from the resulting equations. For example, for a distribution with two parameters, the first two moments (the mean and variance of the distribution, m and s respectively) are set equal to the first two empirical (sample) moments (the mean and variance of the sample, respectively), and the resulting equations are solved for the parameters.
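As an illustration of this estimation idea, the sketch below fits a two-parameter gamma distribution, whose mean is shape × scale and whose variance is shape × scale². The choice of the gamma distribution, the function name and the four-point sample are assumptions made for the example only.

```python
from statistics import fmean, pvariance

def gamma_method_of_moments(sample):
    """Estimate gamma (shape, scale) by matching the sample mean and variance."""
    m = fmean(sample)               # first empirical moment
    v = pvariance(sample, mu=m)     # second central empirical moment
    # Solve  m = shape * scale  and  v = shape * scale**2  for the parameters.
    shape = m * m / v
    scale = v / m
    return shape, scale

sample = [2.0, 4.0, 4.0, 6.0]       # hypothetical observations
shape, scale = gamma_method_of_moments(sample)
print(shape, scale)
```

By construction the fitted distribution reproduces the sample mean and variance exactly.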
Calculations of the arithmetic mean can be cumbersome if the variants (feature values) and weights are very large or very small. Then, for ease of calculation, a number of properties of the arithmetic mean are used:
1) if all variants are reduced (increased) by an arbitrary number A, then the new mean decreases (increases) by the same number A, i.e. it changes by ± A;
2) if all variants (feature values) are reduced by a factor of K, then the mean decreases by the same factor, and if they are increased by a factor of K, the mean increases by a factor of K;
3) if the weights (frequencies) of all variants are reduced or increased by the same constant factor A, then the arithmetic mean does not change;
4) the sum of the deviations of all variants from the overall mean is zero.
The listed properties of the arithmetic mean make it possible, if necessary, to simplify the calculations: to replace the absolute frequencies with relative ones, to reduce the variants (feature values) by any number A, to divide them by K, to calculate the arithmetic mean of the reduced variants, and then to move on to the mean of the original series.
The method of calculating the arithmetic mean using its properties is known in statistics as the "conditional zero method", the "conditional average method", or the "method of moments".
Briefly, this method can be written as the formula
$$\bar{x} = \frac{\sum \frac{x - A}{K} f}{\sum f} \cdot K + A.$$
If the reduced variants (feature values) are denoted by $x' = \frac{x - A}{K}$, then the above formula can be rewritten as $\bar{x} = \bar{x}' \cdot K + A$.
When this formula is used to simplify the calculation of the weighted arithmetic mean of an interval series, the number A is chosen in one of the following ways.
The value A is set equal to:
1) the first interval midpoint (we continue with the example problem, where A = 1.5 million dollars and K = 1).
Calculation of the average of the reduced variants
Intervals | Interval midpoint, x | Number of factories, f | x' = x - A | Product, x'f |
Up to 2 | 1.5 | | 0 (1.5 - 1.5) | |
2–3 | 2.5 | | 1 (2.5 - 1.5) | |
3–4 | 3.5 | | 2 (3.5 - 1.5) | |
4–5 | 4.5 | | 3 (4.5 - 1.5) | |
5–6 | 5.5 | | 4 (5.5 - 1.5) | |
Over 6 | 6.5 | | 5 (6.5 - 1.5) | |
Total: | 3.7 | – |
2) the midpoint of the interval with the highest frequency (in this case A = 3.5, where f = 30), or the middle variant, or the largest variant (in this case the largest feature value is x = 6.5); the deviations are then divided by the interval width (1 in this example).
Calculation of the average at A = 3.5, f = 30, K = 1 in the same example.
Calculation of the average by the method of moments
Intervals | Interval midpoint, x | Number of factories, f | x' = (x - A) : K | Product, x'f |
Up to 2 | 1.5 | | (1.5 - 3.5) : 1 = -2 | -20 |
2–3 | 2.5 | | (2.5 - 3.5) : 1 = -1 | -20 |
3–4 | 3.5 | | (3.5 - 3.5) : 1 = 0 | |
4–5 | 4.5 | | (4.5 - 3.5) : 1 = 1 | |
5–6 | 5.5 | | (5.5 - 3.5) : 1 = 2 | |
Over 6 | 6.5 | | (6.5 - 3.5) : 1 = 3 | |
Total: | 3.7 | – |
The method of moments (conditional zero, conditional average) consists in choosing, for the reduced calculation of the arithmetic mean, values such that one of the new feature values equals zero: $x' = 0$, i.e. we set $\frac{x - A}{K} = 0$ and from this choose the values A and K.
It must be kept in mind that if $x' = (x - A) : K$, where K is the common width of the intervals, then in an equal-interval series the new variants form a series of integers (1, 2, 3, etc.), positive downwards and negative upwards from zero. The arithmetic mean of these new variants is called the moment of the first order and is expressed by the formula
$$m_1 = \frac{\sum x' f}{\sum f}.$$
To determine the arithmetic mean, multiply the moment of the first order by the interval width (K) by which all variants were divided, and add to the product the value (A) that was subtracted:
$$\bar{x} = m_1 \cdot K + A.$$
Thus, using the method of moments or conditional zero, it is much easier to calculate the arithmetic mean from the variational series, if the series is equal-interval.
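The choice of A and K affects only the convenience of the arithmetic, not the result. The Python sketch below illustrates this under stated assumptions: it uses the midpoints and frequencies of the working-time example (the factory frequencies are not available here), and the function name and the list of (A, K) pairs are arbitrary.

```python
def mean_by_moments(xs, fs, A, K):
    """Arithmetic mean via the first-order moment of the reduced variants."""
    m1 = sum((x - A) / K * f for x, f in zip(xs, fs)) / sum(fs)
    return m1 * K + A

xs = [7.5, 17.5, 22.5, 27.5, 32.5, 37.5]   # interval midpoints
fs = [20, 25, 50, 30, 15, 10]              # frequencies

# Different conditional zeros and divisors, including the trivial A = 0, K = 1
choices = [(22.5, 5), (7.5, 5), (37.5, 2.5), (0, 1)]
results = [mean_by_moments(xs, fs, A, K) for A, K in choices]
print([round(r, 4) for r in results])
```

Taking A as the midpoint of the most frequent interval and K as the interval width simply makes the reduced variants small integers, which is what simplifies manual calculation.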
Mode
The mode is the value of a feature (variant) that is most frequently repeated in the studied population.
For discrete distribution series, the mode is the variant with the highest frequency.
Example. When determining the plan for the production of men's shoes, the factory studied consumer demand based on the results of the sale. The distribution of shoes sold was characterized by the following indicators:
Shoes of size 41 were in the greatest demand and accounted for 30% of the quantity sold. In this distribution series M_0 = 41.
For interval distribution series with equal intervals, the mode is determined by the formula
$$M_0 = x_0 + i \cdot \frac{f_{M_0} - f_{M_0 - 1}}{(f_{M_0} - f_{M_0 - 1}) + (f_{M_0} - f_{M_0 + 1})}.$$
First of all, it is necessary to find the interval in which the mode is located, i.e. the modal interval.
In a variation series with equal intervals, the modal interval is determined by the highest frequency; in series with unequal intervals, by the highest distribution density. In the formula, $x_0$ is the lower boundary of the interval containing the mode; $i$ is the width of the modal interval; $f_{M_0}$ is the frequency of the modal interval; $f_{M_0 - 1}$ is the frequency of the interval preceding the modal one, i.e. the premodal frequency; $f_{M_0 + 1}$ is the frequency of the interval following the modal one, i.e. the postmodal frequency.
An example of calculating the mode in an interval series
The grouping of enterprises by the number of industrial and production personnel is given. Find the mode. In our problem, the largest number of enterprises (30) falls in the group with 400 to 500 employees. Therefore, this interval is the modal interval of the equal-interval distribution series. Let us introduce the following notation:
Substitute these values into the mode calculation formula and calculate:
Thus, we have determined the modal value of the attribute contained in this interval (400–500), i.e. M_0 = 467 people.
In many cases, when a population is characterized by a generalizing indicator, preference is given to the mode rather than the arithmetic mean. For example, when studying market prices, it is the modal price of a product, not the average price, that is recorded and tracked over time. When studying the demand of the population for a certain size of shoes or clothes, it is the modal size that is of interest, not the average size, which has no practical meaning. If the arithmetic mean is close in value to the mode, then it is typical.
TASKS FOR SOLUTION
Task 1
At the variety seed station, when determining the quality of wheat seeds, the following distribution of seeds by germination percentage was obtained:
Determine the mode.
Task 2
When registering prices during the busiest trading hours, individual sellers registered the following actual selling prices (USD per kg):
Potato: 0.2; 0.12; 0.12; 0.15; 0.2; 0.2; 0.2; 0.15; 0.15; 0.15; 0.15; 0.12; 0.12; 0.12; 0.15.
Beef: 2; 2.5; 2; 2; 1.8; 1.8; 2; 2.2; 2.5; 2; 2; 2; 2; 3; 3; 2.2; 2; 2; 2; 2.
What prices for potatoes and beef are modal?
Task 3
There is data on the wages of 16 workshop mechanics. Find the modal value of wages.
In dollars: 118; 120; 124; 126; 130; 130; 130; 130; 132; 135; 138; 140; 140; 140; 142; 142.
Median Calculation
In statistics, the median is the variant located in the middle of the variation series. If the discrete distribution series has an odd number of series members, then the median will be the variant located in the middle of the ranked series, i.e. add 1 to the sum of frequencies and divide everything by 2 - the result will give the ordinal number of the median.
If there is an even number of options in the variational series, then the median will be half the sum of the two middle options.
To find the median in the interval variation series, we first determine the median interval from the cumulative frequencies. It is the first interval whose cumulative (accumulated) frequency is equal to or exceeds half the sum of the frequencies. Cumulative frequencies are formed by successive summation of frequencies, starting from the interval with the lowest value of the attribute.
Calculation of the median in the interval variation series
Intervals | Frequencies (f) | Cumulative (accumulated) frequencies |
60–70 | 10 | 10 (10) |
70–80 | 30 | 40 (10+30) |
80–90 | 50 | 90 (40+50) |
90–100 | 60 | 150 (90+60) |
100–110 | 145 | 295 (150+145) |
110–120 | 110 | 405 (295+110) |
120–130 | 80 | 485 (405+80) |
130–140 | 15 | 500 (485+15) |
Sum: | ∑f = 500 | |
Half the sum of the frequencies in the example is 250 (500 : 2). Therefore, the median interval is the interval with feature values 100–110.
Before this interval, the cumulative frequency was 150. Therefore, to reach the median, another 100 units (250 - 150) are needed. When determining the median, it is assumed that the feature values are distributed evenly within the interval. Therefore, if the 145 units of this interval are distributed evenly over its width of 10, then 100 units correspond to the value:
10 : 145 × 100 = 6.9.
Adding the obtained value to the lower boundary of the median interval, we obtain the desired value of the median: 100 + 6.9 = 106.9.
Alternatively, the median in the interval variation series can be calculated by the formula:
$$Me = x_0 + i \cdot \frac{\frac{\sum f}{2} - S_{Me-1}}{f_{Me}},$$
where $x_0$ is the lower boundary of the median interval ($x_0$ = 100); $i$ is the width of the median interval ($i$ = 10); $\sum f$ is the sum of the frequencies of the series (500); $S_{Me-1}$ is the cumulative frequency of the interval preceding the median one ($S_{Me-1}$ = 150); $f_{Me}$ is the frequency of the median interval ($f_{Me}$ = 145).
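The median calculation for the interval series can be sketched in Python. The function name `grouped_median` is an illustrative assumption; the frequencies are the increments shown in the cumulative column of the table (10, 30, 50, 60, 145, 110, 80, 15).

```python
from itertools import accumulate

def grouped_median(lower_bounds, width, freqs):
    """Median of an equal-interval series, interpolated inside the median interval."""
    half = sum(freqs) / 2
    cumulative = list(accumulate(freqs))
    # Median interval: the first one whose cumulative frequency reaches half the total.
    k = next(i for i, s in enumerate(cumulative) if s >= half)
    before = cumulative[k - 1] if k > 0 else 0
    return lower_bounds[k] + width * (half - before) / freqs[k]

lower_bounds = [60, 70, 80, 90, 100, 110, 120, 130]
freqs = [10, 30, 50, 60, 145, 110, 80, 15]

me = grouped_median(lower_bounds, 10, freqs)
print(round(me, 1))
```

The interpolation step relies on the same assumption as the text: values are spread evenly within the median interval.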