Calculate the Standard Deviation in C


Introduction

In this lab, you will learn how to calculate the standard deviation of a dataset in C programming. The lab covers three main steps: computing the mean of the dataset, summing the squared deviations from the mean to calculate the variance, and then taking the square root to obtain the standard deviation. By the end of this lab, you will have a solid understanding of these fundamental statistical concepts and how to implement them in C.

Each step includes sample code, build commands, and example output. You build a single program incrementally: it first computes the mean, then the variance, and finally the standard deviation.



Compute Mean of the Dataset

In this step, you will learn how to compute the mean of a dataset in C programming. The mean is a fundamental statistical measure that represents the average value of a set of numbers.

First, let's create a C program to calculate the mean of a dataset. Open a new file using nano:

cd ~/project
nano mean_calculation.c

Now, enter the following code:

#include <stdio.h>

#define MAX_SIZE 100

float calculateMean(int arr[], int size) {
    float sum = 0;
    for (int i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum / size;
}

int main() {
    int dataset[MAX_SIZE];
    int size;

    printf("Enter the number of elements (max %d): ", MAX_SIZE);
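    /* Reject non-numeric input and sizes outside 1..MAX_SIZE, which would
       overflow the array or cause a division by zero in calculateMean. */
    if (scanf("%d", &size) != 1 || size < 1 || size > MAX_SIZE) {
        printf("Invalid size. Enter a number from 1 to %d.\n", MAX_SIZE);
        return 1;
    }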

    printf("Enter %d integers:\n", size);
    for (int i = 0; i < size; i++) {
        scanf("%d", &dataset[i]);
    }

    float mean = calculateMean(dataset, size);
    printf("Mean of the dataset: %.2f\n", mean);

    return 0;
}

Compile the program:

gcc mean_calculation.c -o mean_calculation

Run the program and input some sample data:

./mean_calculation

Example output:

Enter the number of elements (max 100): 5
Enter 5 integers:
10
20
30
40
50
Mean of the dataset: 30.00

Let's break down the code:

  1. We define a calculateMean function that takes an array and its size as parameters.
  2. The function calculates the sum of all elements in the array.
  3. The mean is computed by dividing the sum by the total number of elements.
  4. In the main function, we prompt the user to enter the dataset.
  5. We call calculateMean and print the result with two decimal places.
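
Step 3 of the breakdown relies on sum being declared as float. As a minimal sketch, the two helper functions below (hypothetical names, not part of the lab program) contrast an int accumulator, where C performs integer division and drops the fractional part, with the float accumulator used in calculateMean:

```c
/* Hypothetical helper: accumulates in an int, so sum / size is integer division. */
float meanWithIntSum(const int arr[], int size) {
    int sum = 0;
    for (int i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum / size;  /* fractional part is discarded before conversion to float */
}

/* Same approach as the lab's calculateMean: a float accumulator
   makes sum / size a floating-point division. */
float meanWithFloatSum(const int arr[], int size) {
    float sum = 0;
    for (int i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum / size;
}
```

For the two-element dataset 1, 2, meanWithIntSum returns 1.00 while meanWithFloatSum returns 1.50.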

Sum the Squared Deviations and Compute Variance

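In this step, you will extend the previous program to calculate the variance by summing the squared deviations from the mean and dividing by the number of elements. Variance measures how spread out the numbers in a dataset are. Dividing by the number of elements gives the population variance; the sample variance divides by one less than the number of elements instead.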
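
To make the arithmetic concrete, here is a small sketch that prints each value's deviation and squared deviation. The helper name printSquaredDeviations is hypothetical and is not part of the lab program:

```c
#include <stdio.h>

/* Hypothetical helper: prints value, deviation from the mean, and squared
   deviation for each element, then returns the sum of the squared deviations. */
float printSquaredDeviations(const int arr[], int size, float mean) {
    float sumSquared = 0;
    for (int i = 0; i < size; i++) {
        float deviation = arr[i] - mean;
        float squared = deviation * deviation;
        printf("%d\t%.2f\t%.2f\n", arr[i], deviation, squared);
        sumSquared += squared;
    }
    return sumSquared;
}
```

For the dataset 10, 20, 30, 40, 50 with a mean of 30, the deviations are -20, -10, 0, 10, 20 and their squares are 400, 100, 0, 100, 400. The squares sum to 1000, and dividing by 5 elements gives a variance of 200.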

Open the previous file to modify:

cd ~/project
nano mean_calculation.c

Update the program with variance calculation:

#include <stdio.h>
#include <math.h>

#define MAX_SIZE 100

float calculateMean(int arr[], int size) {
    float sum = 0;
    for (int i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum / size;
}

float calculateVariance(int arr[], int size, float mean) {
    float sumSquaredDeviations = 0;
    for (int i = 0; i < size; i++) {
        float deviation = arr[i] - mean;
        sumSquaredDeviations += deviation * deviation;
    }
    return sumSquaredDeviations / size;
}

int main() {
    int dataset[MAX_SIZE];
    int size;

    printf("Enter the number of elements (max %d): ", MAX_SIZE);
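    /* Reject non-numeric input and sizes outside 1..MAX_SIZE, which would
       overflow the array or cause a division by zero. */
    if (scanf("%d", &size) != 1 || size < 1 || size > MAX_SIZE) {
        printf("Invalid size. Enter a number from 1 to %d.\n", MAX_SIZE);
        return 1;
    }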

    printf("Enter %d integers:\n", size);
    for (int i = 0; i < size; i++) {
        scanf("%d", &dataset[i]);
    }

    float mean = calculateMean(dataset, size);
    float variance = calculateVariance(dataset, size, mean);

    printf("Mean of the dataset: %.2f\n", mean);
    printf("Variance of the dataset: %.2f\n", variance);

    return 0;
}

Compile the updated program:

gcc mean_calculation.c -o mean_calculation -lm

Run the program and input sample data:

./mean_calculation

Example output:

Enter the number of elements (max 100): 5
Enter 5 integers:
10
20
30
40
50
Mean of the dataset: 30.00
Variance of the dataset: 200.00

Key points in the code:

  1. We added a new calculateVariance function that takes the array, size, and mean.
  2. The function calculates the deviation of each element from the mean.
  3. It squares these deviations and sums them up.
  4. Variance is computed by dividing the sum of squared deviations by the number of elements.
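  5. We include <math.h> and compile with the -lm flag, which links the math library. Nothing in this step calls a math function yet; both are in place for sqrt() in the next step.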
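
The calculateVariance function divides by size, which treats the dataset as the whole population. If the values are instead a sample drawn from a larger population, the usual convention is to divide by size - 1 (Bessel's correction). The following is a sketch of that variant only; the name calculateSampleVariance and the choice to return 0 for fewer than two values are assumptions, not part of the lab:

```c
/* Hypothetical variant of calculateVariance: divides by (size - 1)
   instead of size, as is conventional for sample data. */
float calculateSampleVariance(const int arr[], int size, float mean) {
    if (size < 2) {
        return 0.0f;  /* assumption: fewer than two values means no measurable spread */
    }
    float sumSquaredDeviations = 0;
    for (int i = 0; i < size; i++) {
        float deviation = arr[i] - mean;
        sumSquaredDeviations += deviation * deviation;
    }
    return sumSquaredDeviations / (size - 1);
}
```

For the dataset 10, 20, 30, 40, 50 this returns 1000 / 4 = 250.00 rather than 200.00.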

Take Square Root for Standard Deviation and Print

In this final step, you will complete the standard deviation calculation by taking the square root of the variance. Standard deviation is a key measure of data dispersion in statistical analysis.

Open the previous file to modify:

cd ~/project
nano mean_calculation.c

Update the program with standard deviation calculation:

#include <stdio.h>
#include <math.h>

#define MAX_SIZE 100

float calculateMean(int arr[], int size) {
    float sum = 0;
    for (int i = 0; i < size; i++) {
        sum += arr[i];
    }
    return sum / size;
}

float calculateVariance(int arr[], int size, float mean) {
    float sumSquaredDeviations = 0;
    for (int i = 0; i < size; i++) {
        float deviation = arr[i] - mean;
        sumSquaredDeviations += deviation * deviation;
    }
    return sumSquaredDeviations / size;
}

float calculateStandardDeviation(float variance) {
    return sqrt(variance);
}

int main() {
    int dataset[MAX_SIZE];
    int size;

    printf("Enter the number of elements (max %d): ", MAX_SIZE);
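    /* Reject non-numeric input and sizes outside 1..MAX_SIZE, which would
       overflow the array or cause a division by zero. */
    if (scanf("%d", &size) != 1 || size < 1 || size > MAX_SIZE) {
        printf("Invalid size. Enter a number from 1 to %d.\n", MAX_SIZE);
        return 1;
    }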

    printf("Enter %d integers:\n", size);
    for (int i = 0; i < size; i++) {
        scanf("%d", &dataset[i]);
    }

    float mean = calculateMean(dataset, size);
    float variance = calculateVariance(dataset, size, mean);
    float standardDeviation = calculateStandardDeviation(variance);

    printf("Dataset Statistics:\n");
    printf("Mean: %.2f\n", mean);
    printf("Variance: %.2f\n", variance);
    printf("Standard Deviation: %.2f\n", standardDeviation);

    return 0;
}

Compile the updated program:

gcc mean_calculation.c -o mean_calculation -lm

Run the program and input sample data:

./mean_calculation

Example output:

Enter the number of elements (max 100): 5
Enter 5 integers:
10
20
30
40
50
Dataset Statistics:
Mean: 30.00
Variance: 200.00
Standard Deviation: 14.14

Key points in the code:

  1. We added a new calculateStandardDeviation function.
  2. This function uses sqrt() from the math library to compute standard deviation.
  3. Standard deviation is the square root of variance.
  4. The main function now prints all three statistical measures.
  5. We continue to compile with the -lm flag, which links the math library that provides sqrt().

Summary

In this lab, you computed the mean of a dataset in C, extended the program to calculate the variance by averaging the squared deviations from the mean, and took the square root of the variance with sqrt() to obtain the standard deviation. For the sample dataset 10, 20, 30, 40, 50, the program reports a mean of 30.00, a variance of 200.00, and a standard deviation of 14.14.
