In this step, we extend the previous program to compute the sums needed for the Pearson correlation coefficient and then apply the correlation formula to them. All changes go into the correlation_input.c file.
Open the previous file:
cd ~/project
nano correlation_input.c
Update the code with the following implementation:
#include <stdio.h>
#include <math.h>

#define MAX_POINTS 100

/* Returns the Pearson correlation coefficient of x and y,
   or NAN when it is undefined (x or y has zero variance). */
double calculatePearsonCorrelation(double x[], double y[], int n) {
    double sum_x = 0, sum_y = 0, sum_xy = 0;
    double sum_x_squared = 0, sum_y_squared = 0;

    // Compute necessary sums
    for (int i = 0; i < n; i++) {
        sum_x += x[i];
        sum_y += y[i];
        sum_xy += x[i] * y[i];
        sum_x_squared += x[i] * x[i];
        sum_y_squared += y[i] * y[i];
    }

    // Pearson correlation coefficient formula
    double numerator = n * sum_xy - sum_x * sum_y;
    double denominator = sqrt((n * sum_x_squared - sum_x * sum_x) *
                              (n * sum_y_squared - sum_y * sum_y));

    // A denominator that is zero (or NaN) means x or y is constant
    if (!(denominator > 0)) {
        return NAN;
    }
    return numerator / denominator;
}

int main() {
    double x[MAX_POINTS], y[MAX_POINTS];
    int n, i;

    printf("Enter the number of data points (max %d): ", MAX_POINTS);
    // Reject non-numeric input and counts outside 2..MAX_POINTS
    if (scanf("%d", &n) != 1 || n < 2 || n > MAX_POINTS) {
        printf("Invalid number of data points.\n");
        return 1;
    }

    printf("Enter x and y coordinates:\n");
    for (i = 0; i < n; i++) {
        printf("Point %d (x y): ", i + 1);
        if (scanf("%lf %lf", &x[i], &y[i]) != 2) {
            printf("Invalid coordinates.\n");
            return 1;
        }
    }

    double correlation = calculatePearsonCorrelation(x, y, n);

    printf("\nData Points Entered:\n");
    for (i = 0; i < n; i++) {
        printf("Point %d: (%.2f, %.2f)\n", i + 1, x[i], y[i]);
    }

    if (isnan(correlation)) {
        printf("\nPearson Correlation Coefficient: undefined (x or y is constant)\n");
    } else {
        printf("\nPearson Correlation Coefficient: %.4f\n", correlation);
    }
    return 0;
}
Compile the program, linking the math library with -lm:
gcc -o correlation_input correlation_input.c -lm
Run the program with sample data:
./correlation_input
Example output:
Enter the number of data points (max 100): 5
Enter x and y coordinates:
Point 1 (x y): 1 2
Point 2 (x y): 2 4
Point 3 (x y): 3 5
Point 4 (x y): 4 4
Point 5 (x y): 5 5
Data Points Entered:
Point 1: (1.00, 2.00)
Point 2: (2.00, 4.00)
Point 3: (3.00, 5.00)
Point 4: (4.00, 4.00)
Point 5: (5.00, 5.00)
Pearson Correlation Coefficient: 0.7746
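As a hand check of this sample run, the five points give n = 5, Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86. Substituting these sums into the formula used by calculatePearsonCorrelation works out as follows (the symbol r for the coefficient is our notation here, not a name from the program):

```latex
\begin{aligned}
r &= \frac{n\sum xy - \sum x \sum y}
          {\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)
                 \left(n\sum y^2 - \left(\sum y\right)^2\right)}} \\
  &= \frac{5 \cdot 66 - 15 \cdot 20}
          {\sqrt{\left(5 \cdot 55 - 15^2\right)\left(5 \cdot 86 - 20^2\right)}} \\
  &= \frac{30}{\sqrt{50 \cdot 30}} = \frac{30}{\sqrt{1500}} \approx 0.7746
\end{aligned}
```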
Key points about the Pearson correlation calculation:
- We compute the necessary sums: x, y, xy, x², y²
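- We apply the Pearson correlation coefficient formula to those sums
- We use sqrt() from math.h to compute the denominator, which is why the program is linked with -lm
- The function returns a coefficient between -1 and 1: 1 is a perfect positive linear relationship, -1 a perfect negative one, and 0 no linear relationship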