How to Calculate Standard Deviation and Variance by Hand

Daniel Storage
18 Jun 2019, 09:29
Educational, Learning

TLDR: This instructional video teaches viewers how to calculate four key measures of variability: sample variance, sample standard deviation, population variance, and population standard deviation. The video emphasizes that the initial steps of these calculations are the same, since each one starts from the sum of squared deviations from the mean. It demonstrates the process using a dataset of coffee consumption and guides viewers through the calculations step by step, using a table to organize the work. The video concludes with the formulas and results for each measure, highlighting the slight differences in the final steps for sample versus population calculations.

Takeaways
  • The video teaches how to calculate four main measures of variability: sample variance, sample standard deviation, population variance, and population standard deviation.
  • The formulas for these measures share similarities, particularly in the numerators, which involve taking each value, subtracting the mean, squaring the result, and summing them up.
  • The distinction between sample and population data depends on the scope of interest; if generalizing beyond a specific group, it's a sample, otherwise it's a population.
  • The process involves creating a table and following a series of steps to calculate the desired measures of variability.
  • The first step in calculating variability is to find the deviations of each value from the mean, which is essential for understanding standard deviations.
  • The mean of the data set used in the video is given as 2.6, and deviations are calculated by subtracting this mean from each data point.
  • To deal with the issue of deviations summing to zero, the video suggests squaring the deviations, which removes the negative sign and allows for further calculations.
  • The sum of the squared deviations from the mean is called the sum of squares (SS), which is a crucial component in calculating all four measures of variability.
  • For sample variance, the sum of squares is divided by (n - 1) (where n is the sample size), and for population variance, it is divided by n.
  • The sample standard deviation is found by taking the square root of the sample variance, and similarly, the population standard deviation is the square root of the population variance.
  • The video concludes with the calculation of both sample and population variances and standard deviations, highlighting the process and the final results.
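
The steps in the takeaways above can be sketched in a few lines of Python. This is only an illustration: the video's actual data values are not listed in this summary, so the `cups` list below is hypothetical, chosen so that its mean matches the 2.6 mentioned above. The variable names are also assumptions made for this example.

```python
import math

# Hypothetical data: cups of coffee per day for five students.
# These values are NOT from the video; they were picked so the mean is 2.6.
cups = [1, 2, 3, 3, 4]

n = len(cups)
mean = sum(cups) / n  # 13 / 5 = 2.6

# Step 1: deviation of each value from the mean.
deviations = [x - mean for x in cups]

# Step 2: square each deviation so negatives no longer cancel positives.
squared_deviations = [d ** 2 for d in deviations]

# Step 3: add the squared deviations to get the sum of squares (SS).
ss = sum(squared_deviations)

# Step 4: divide SS by n - 1 (sample) or by n (population),
# then take square roots to get the standard deviations.
sample_variance = ss / (n - 1)
sample_sd = math.sqrt(sample_variance)
population_variance = ss / n
population_sd = math.sqrt(population_variance)

# Print the working table, then the four results.
print(f"{'x':>3} {'x - mean':>9} {'(x - mean)^2':>13}")
for x, d, sq in zip(cups, deviations, squared_deviations):
    print(f"{x:>3} {d:>9.2f} {sq:>13.2f}")
print(f"SS = {ss:.2f}")
print(f"sample variance = {sample_variance:.2f}, sample SD = {sample_sd:.2f}")
print(f"population variance = {population_variance:.2f}, population SD = {population_sd:.2f}")
```

For this made-up data the sum of squares is 5.2, so the sample variance is 5.2 / 4 = 1.3 and the population variance is 5.2 / 5 = 1.04. The video's own results may differ, since its data may differ.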
Q & A
  • What are the four main measures of variability discussed in the video?

    -The four main measures of variability discussed in the video are sample variance, sample standard deviation, population variance, and population standard deviation.

  • What is the significance of the numerators in the formulas for these measures of variability?

    -The numerators across all four formulas are essentially identical, involving taking each x-value, subtracting the mean of all the values, squaring that value, and adding all those up. This means that the initial steps for calculating any of these measures of variability are the same.

  • What does the video suggest using as the first step in calculating measures of variability?

    -The video suggests calculating the deviations of each value from the mean as the first step in calculating measures of variability.

  • Why can't we simply take the average of the deviations to find the standard deviation?

    -We can't take the average of the deviations to find the standard deviation because the sum of the deviations will always equal zero, which would make the average zero, regardless of the original data.

  • What is the sum of squares and why is it important in calculating measures of variability?

    -The sum of squares is the sum of each squared deviation from the mean across all values. It is important because it is used in the numerator for calculating all measures of variability, and it represents the total variability in the data set.

  • How does the process of calculating sample variance differ from calculating population variance?

    -The process of calculating sample variance involves dividing the sum of squares by n-1 (the sample size minus one), whereas calculating population variance involves dividing the sum of squares by n (the population size).

  • What is the relationship between variance and standard deviation?

    -Standard deviation is the square root of variance. It is a measure of the typical amount that scores deviate from the mean, and it is derived from the variance by taking its square root.

  • Why do we use n-1 instead of n when calculating sample variance?

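    -We use n-1 instead of n when calculating sample variance so that the sample statistic approximates the corresponding population parameter more accurately. Dividing by n tends to underestimate the population variance, because the deviations are measured from the sample mean rather than the true population mean; dividing by n-1 compensates for this. The adjustment is known as Bessel's correction.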

  • How does the video demonstrate the process of calculating these measures of variability?

    -The video demonstrates the process by using a table and a series of steps that include calculating deviations from the mean, squaring these deviations, summing them up to get the sum of squares, and then using this sum to calculate the different measures of variability.

  • What is the practical example used in the video to illustrate the calculation of these measures of variability?

    -The practical example used in the video is the number of cups of coffee that each student in the presenter's class drinks per day. This data is used to illustrate both sample and population calculations.
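
The answer about n-1 above can be checked on a tiny example. The sketch below is not from the video: it treats a hypothetical list of five values as the whole population, enumerates every possible sample of size 2 drawn with replacement (an assumption of this illustration), and averages the variance estimates obtained with each divisor. The helper name `sum_of_squares` is also made up for this example.

```python
from itertools import product
import math

# Hypothetical values, treated here as the ENTIRE population.
population = [1, 2, 3, 3, 4]
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N  # population variance


def sum_of_squares(values):
    """Sum of squared deviations of the values from their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


n = 2
# Every ordered sample of size n, drawn with replacement (5 * 5 = 25 samples).
samples = list(product(population, repeat=n))

# Average the two candidate estimates over all possible samples.
avg_divide_by_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print(f"population variance:       {sigma_sq:.2f}")
print(f"average estimate with n:   {avg_divide_by_n:.2f}")
print(f"average estimate with n-1: {avg_divide_by_n_minus_1:.2f}")

# Dividing by n - 1 recovers the population variance on average;
# dividing by n comes out too small by a factor of (n - 1) / n.
assert math.isclose(avg_divide_by_n_minus_1, sigma_sq)
assert math.isclose(avg_divide_by_n, sigma_sq * (n - 1) / n)
```

Averaged over all samples, dividing by n-1 matches the population variance, while dividing by n falls short of it. That is the sense in which n-1 makes the sample statistic a better approximation.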

Outlines
00:00
Introduction to Variability Measures

This paragraph introduces the concept of variability and its four main measures: sample variance, sample standard deviation, population variance, and population standard deviation. The speaker emphasizes that although the formulas for these measures will be discussed later, it's important to note the similarities in their numerators. The process involves taking each data point, subtracting the mean, squaring the result, and summing these values. The speaker uses the example of the number of cups of coffee students drink per day to illustrate the concept. The data can be treated as either a sample or a population, depending on the context. The paragraph concludes with the setup of a table to guide the calculations, starting with calculating deviations from the mean.

05:01
Calculating Variability Measures: Steps and Formulas

The second paragraph delves into the process of calculating the measures of variability. The speaker explains that the first step is to find the sum of squares, which is the sum of the squared deviations from the mean. This is done by subtracting the mean from each data point, squaring the result, and adding these values together. The sum of squares is represented as SS and is a crucial component in the formulas for all four measures of variability. The speaker then demonstrates how to calculate the sample variance by dividing the sum of squares by n-1 (where n is the number of data points), and the sample standard deviation by taking the square root of the sample variance. For population parameters, the sum of squares is divided by n, not n-1. The paragraph concludes with the calculation of both the population variance and standard deviation, providing the final values for these measures.
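
The formulas described in this paragraph can be written out as follows. The symbols \(s^2\), \(s\), \(\sigma^2\), and \(\sigma\) are the conventional ones and are an assumption here, since this summary does not show the video's exact notation; \(n\) is used for the number of values in both cases, as in the summary.

\[
SS = \sum_{i=1}^{n} (x_i - \bar{x})^2
\]

\[
s^2 = \frac{SS}{n - 1}, \qquad s = \sqrt{s^2}
\]

\[
\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}, \qquad \sigma = \sqrt{\sigma^2}
\]

The numerators are the same sum of squares; only the name given to the mean (\(\bar{x}\) or \(\mu\)) and the divisor (\(n - 1\) or \(n\)) change.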

Keywords
Variability
Variability refers to the degree to which data points differ from each other within a set. In the context of the video, it is a central theme as the tutorial focuses on calculating measures that quantify this dispersion. The script mentions four main measures of variability: sample variance, sample standard deviation, population variance, and population standard deviation, all of which are essential for understanding data spread and consistency.
Sample Variance
Sample variance is a measure that estimates the variability of a subset of data, which is assumed to be representative of a larger population. The video script explains that it is calculated by taking the sum of the squared deviations from the sample mean and dividing by the number of observations minus one. It is written with different notation than population variance to reflect its sample-specific nature.
Sample Standard Deviation
Sample standard deviation is the square root of the sample variance and provides a measure of the average distance data points fall from the mean within a sample. It is highlighted in the script as an easier calculation once the sample variance is known, as it simply involves taking the square root of the variance value.
Population Variance
Population variance is the measure of the dispersion of an entire population's data points around the population mean. In the script, it is distinguished from sample variance by the fact that it uses the entire population data and is calculated by dividing the sum of squared deviations from the mean by the total number of observations, without subtracting one.
Population Standard Deviation
Population standard deviation is derived from the population variance and represents the average distance of data points from the population mean. The script explains that it is found by taking the square root of the population variance, similar to how sample standard deviation is derived.
Deviations from the Mean
Deviations from the mean are the differences between each data point and the mean (average) of the dataset. The script emphasizes that calculating these deviations is the first step in determining measures of variability, as they indicate how spread out the data is around the central value.
Sum of Squares (SS)
The sum of squares, often abbreviated as SS, is the cumulative total of the squared deviations of each data point from the mean. The script describes it as a crucial intermediate step in calculating all four measures of variability, as it forms the basis for both sample and population variance calculations.
Sample Notation
Sample notation, such as using x-bar (\(\bar{x}\)) to represent the sample mean, distinguishes calculations made on a subset of data from those made on an entire population. The script uses sample notation in its initial calculations to demonstrate the process of finding variability measures for a sample.
Population Notation
Population notation, such as using the Greek letter mu (\(\mu\)) for the population mean, is used when referring to calculations made on an entire dataset. The script transitions from sample to population notation to illustrate the process of calculating measures of variability for a complete set of data.
Data Points
Data points are individual values within a dataset. The script uses the example of the number of cups of coffee students drink per day as data points, which are then used to demonstrate the calculation of various measures of variability.
Sample Size (n)
Sample size refers to the number of observations or data points in a sample. In the script, the sum of squares is divided by 'n minus one' for sample variance, a statistical practice that better approximates the population parameter when only a sample is known, and by 'n' for population variance.
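
The four measures defined in these keywords also exist as functions in Python's standard `statistics` module, which gives a quick way to check a hand calculation. The sketch below is not part of the video; the `cups` values are hypothetical, chosen only so that the mean is 2.6.

```python
import math
import statistics

# Hypothetical data (not from the video), with mean 2.6.
cups = [1, 2, 3, 3, 4]

# Hand calculation: the sum of squares, then the two divisors.
n = len(cups)
mean = sum(cups) / n
ss = sum((x - mean) ** 2 for x in cups)
hand_sample_variance = ss / (n - 1)
hand_population_variance = ss / n

# Library calculation: variance/stdev divide by n - 1,
# pvariance/pstdev divide by n.
lib_sample_variance = statistics.variance(cups)
lib_sample_sd = statistics.stdev(cups)
lib_population_variance = statistics.pvariance(cups)
lib_population_sd = statistics.pstdev(cups)

# The hand results and the library results should agree.
assert math.isclose(hand_sample_variance, lib_sample_variance)
assert math.isclose(hand_population_variance, lib_population_variance)
assert math.isclose(math.sqrt(hand_sample_variance), lib_sample_sd)
assert math.isclose(math.sqrt(hand_population_variance), lib_population_sd)

print(f"sample variance = {lib_sample_variance:.2f}, sample SD = {lib_sample_sd:.2f}")
print(f"population variance = {lib_population_variance:.2f}, population SD = {lib_population_sd:.2f}")
```

The `p` prefix in `pvariance` and `pstdev` stands for population, which mirrors the sample versus population distinction drawn above.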
Highlights

Introduction to calculating four main measures of variability: sample variance, sample standard deviation, population variance, and population standard deviation.

Highlighting the similarities in the numerators of the formulas for all four measures of variability.

Explanation of the notation changes based on whether data is a sample or a population.

Process of taking each x-value, subtracting the mean, and squaring the result.

Use of a table to systematically calculate the required measures.

Example data provided: number of cups of coffee students drink per day.

Treating data as either sample data or population data based on the context.

Calculation of deviations of each value from the mean as the first step.

Mean of all scores (x-bar) is given as 2.6 for the sample data.

The problem with averaging deviations due to them summing up to zero.

Two options to address the issue of negative signs: taking absolute values or squaring the values.

Historical debate in statistics leading to the preference for squaring values.

Calculation of the sum of squares (SS) as the sum of squared deviations from the mean.

Sum of squares represented as SS and its role in calculating measures of variability.

Process to calculate sample variance using the sum of squares and dividing by n-1.

Derivation of sample standard deviation by taking the square root of sample variance.

Calculation of population variance and standard deviation using the sum of squares and dividing by n.

Explanation of the conservative approach in statistics by using n-1 for sample calculations.

Final results and conclusion of the video with the four calculated measures of variability.

Transcripts