Input sanitization is a critical security technique to prevent malicious input from compromising system integrity and functionality.
Core Sanitization Techniques
1. Character Filtering
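#include <ctype.h>
#include <stddef.h>

/* Replace every character that is not alphanumeric or a space with '_'. */
void sanitize_input(char *input) {
    for (size_t i = 0; input[i] != '\0'; i++) {
        /* Cast to unsigned char: passing a negative char to isalnum() is undefined behavior */
        if (!isalnum((unsigned char)input[i]) && input[i] != ' ') {
            input[i] = '_'; // Replace invalid characters
        }
    }
}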
2. Whitelist Validation
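#include <string.h>

/* Return 1 if input contains only whitelisted characters, 0 otherwise. */
int is_valid_input(const char *input) {
    const char *allowed = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ";
    /* strspn() returns the length of the leading run of allowed characters */
    return strspn(input, allowed) == strlen(input);
}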
Sanitization Strategies
| Strategy | Description | Use Case |
|---|---|---|
| Character Filtering | Remove or replace invalid characters | User input validation |
| Length Limitation | Truncate input to a maximum length | Prevent buffer overflow |
| Type Conversion | Convert input to the expected type | Numeric input validation |
| Escape Special Characters | Neutralize potential injection risks | SQL, shell commands |
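The "Escape Special Characters" row can be illustrated with a minimal sketch for the shell case. The helper name `shell_quote` and its return convention are assumptions made for this example and are not part of any standard library. It relies on the POSIX shell rule that everything inside single quotes is literal except the single quote itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: wraps src in single quotes for a POSIX shell.
 * Each embedded ' is emitted as '\'' (close quote, escaped quote, reopen).
 * Returns the number of characters written, or -1 if dst is too small.
 */
int shell_quote(const char *src, char *dst, size_t dst_size) {
    size_t out = 0;

    if (dst_size < 3) {              /* need at least '' plus the terminator */
        return -1;
    }
    dst[out++] = '\'';
    for (size_t i = 0; src[i] != '\0'; i++) {
        size_t need = (src[i] == '\'') ? 4 : 1;
        /* Reserve room for this piece, the closing quote, and '\0' */
        if (out + need + 2 > dst_size) {
            return -1;
        }
        if (src[i] == '\'') {
            memcpy(dst + out, "'\\''", 4);
            out += 4;
        } else {
            dst[out++] = src[i];
        }
    }
    dst[out++] = '\'';
    dst[out] = '\0';
    return (int)out;
}
```

Quoting like this is a fallback. Where possible, it is usually safer to avoid the shell entirely (for example by passing an argument array to `execve`) and to use parameterized queries rather than escaping for SQL.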
flowchart TD
A[Raw User Input] --> B{Validate Length}
B -->|Too Long| C[Truncate]
C --> D
B -->|Valid Length| D{Character Filter}
D --> E{Whitelist Check}
E -->|Pass| F[Safe Processing]
E -->|Fail| G[Reject Input]
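The flow above could be sketched in C roughly as follows. The function name `process_input`, the `MAX_INPUT_LEN` limit, and the choice to make the character filter drop control characters are all assumptions for illustration, since the flowchart does not specify them.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_INPUT_LEN 32   /* assumed limit, chosen only for illustration */

/*
 * Hypothetical pipeline: truncate, filter, then whitelist check.
 * Returns 1 and fills out on success, 0 if the input is rejected.
 */
int process_input(const char *raw, char *out, size_t out_size) {
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ";
    size_t len = 0;

    if (out_size == 0) {
        return 0;
    }

    /* Length step: look at no more than MAX_INPUT_LEN raw characters.
     * Filter step (assumed): drop control characters such as tabs. */
    for (size_t i = 0; i < MAX_INPUT_LEN && raw[i] != '\0' && len + 1 < out_size; i++) {
        if (!iscntrl((unsigned char)raw[i])) {
            out[len++] = raw[i];
        }
    }
    out[len] = '\0';

    /* Whitelist step: reject empty input or anything outside the allowed set */
    if (len == 0 || strspn(out, allowed) != len) {
        out[0] = '\0';
        return 0;
    }
    return 1;
}
```

In this sketch a call such as `process_input("hello\tworld 42", buf, sizeof buf)` succeeds and leaves `helloworld 42` in `buf`, while `process_input("name; rm -rf /", buf, sizeof buf)` is rejected because `;`, `-` and `/` are not whitelisted.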
Advanced Sanitization Techniques
Regular Expression Validation
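#include <regex.h>
#include <stddef.h>

/* Return 1 if email matches a simple address pattern, 0 otherwise. */
int validate_email(const char *email) {
    regex_t regex;
    int reti = regcomp(&regex, "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$", REG_EXTENDED | REG_NOSUB);
    if (reti != 0) {
        return 0; // Pattern failed to compile; treat the input as invalid
    }
    reti = regexec(&regex, email, 0, NULL, 0);
    regfree(&regex);
    return reti == 0;
}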
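#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a base-10 integer; return 1 and store it in *result on success, 0 otherwise. */
int sanitize_numeric_input(const char *input, int *result) {
    char *endptr;
    errno = 0;
    long value = strtol(input, &endptr, 10);
    if (endptr == input || *endptr != '\0') {
        return 0; // Invalid input: no digits, or trailing characters
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return 0; // Value does not fit in an int
    }
    *result = (int)value;
    return 1;
}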
Security Considerations
- Never trust user input
- Always validate and sanitize
- Use multiple layers of validation
- Implement context-specific sanitization
- Minimize processing overhead
- Use efficient validation algorithms
- Implement early rejection of invalid inputs
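As one illustration of context-specific sanitization, the sketch below escapes text for an HTML context, where the dangerous characters differ from those in a shell or SQL context. The function name `html_escape` and its return convention are assumptions for this example, not an existing library API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: escapes the five characters that are significant
 * in HTML text and attribute values. Returns the number of characters
 * written, or -1 if dst is too small.
 */
int html_escape(const char *src, char *dst, size_t dst_size) {
    size_t out = 0;

    if (dst_size == 0) {
        return -1;
    }
    for (size_t i = 0; src[i] != '\0'; i++) {
        const char *rep = NULL;
        switch (src[i]) {
            case '&':  rep = "&amp;";  break;
            case '<':  rep = "&lt;";   break;
            case '>':  rep = "&gt;";   break;
            case '"':  rep = "&quot;"; break;
            case '\'': rep = "&#39;";  break;
            default:   break;
        }
        size_t n = rep ? strlen(rep) : 1;
        if (out + n + 1 > dst_size) {    /* keep room for the terminator */
            dst[0] = '\0';
            return -1;                   /* reject early instead of truncating */
        }
        if (rep) {
            memcpy(dst + out, rep, n);
        } else {
            dst[out] = src[i];
        }
        out += n;
    }
    dst[out] = '\0';
    return (int)out;
}
```

The function fails as soon as the output would not fit, rather than emitting a truncated and possibly half-escaped string, which is one way to apply the early-rejection point from the list above.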
LabEx emphasizes the critical role of comprehensive input sanitization in developing secure and robust C applications.