Practical Code Examples
Real-World Normalization Scenarios
graph TD
A[Data Preprocessing] --> B[Feature Scaling]
B --> C[Machine Learning]
B --> D[Statistical Analysis]
B --> E[Deep Learning]
1. Machine Learning Dataset Normalization
Preprocessing Iris Dataset
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
## Load dataset
iris = load_iris()
X, y = iris.data, iris.target
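## Split dataset (fixed seed so the split is reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
## Standardize features: fit on the training split only, then reuse those statistics on the test split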
scaler = StandardScaler()
X_train_normalized = scaler.fit_transform(X_train)
X_test_normalized = scaler.transform(X_test)
## Train SVM classifier
classifier = SVC()
classifier.fit(X_train_normalized, y_train)
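The classifier above is trained but never scored. As a minimal sketch, assuming a fixed `random_state` and a stratified split (neither is part of the original example), the same steps can be wrapped in a scikit-learn pipeline and evaluated on the held-out data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# random_state and stratify are assumptions added for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# The pipeline fits the scaler on the training split only and reuses
# those statistics when it transforms the test split inside score().
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Keeping the scaler inside the pipeline makes it harder to leak test-set statistics into training by accident.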
2. Financial Data Normalization
Stock Price Scaling
import numpy as np
from sklearn.preprocessing import MinMaxScaler
## Sample stock price data
stock_prices = np.array([
[100, 105, 98],
[200, 210, 190],
[50, 55, 48]
])
## Create MinMax scaler (each column is treated as one feature and scaled to [0, 1] independently)
scaler = MinMaxScaler()
normalized_prices = scaler.fit_transform(stock_prices)
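To see what the scaler learned, the following sketch inspects the per-column minimum and maximum and maps the scaled values back to price units. The variable name `restored_prices` is introduced here for illustration only:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

stock_prices = np.array([
    [100, 105, 98],
    [200, 210, 190],
    [50, 55, 48],
])

scaler = MinMaxScaler()
normalized_prices = scaler.fit_transform(stock_prices)

# Each column is scaled with its own minimum and maximum
print("Column minimums:", scaler.data_min_)
print("Column maximums:", scaler.data_max_)

# inverse_transform maps scaled values back to the original units
restored_prices = scaler.inverse_transform(normalized_prices)
print(restored_prices)
```

Because scaling is per column, the smallest value in every column becomes 0 and the largest becomes 1, regardless of the other columns.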
3. Image Processing Normalization
import numpy as np
from sklearn.preprocessing import RobustScaler
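## Simulated 8-bit image pixel data: 100 images of 28x28 pixels
## (the upper bound of randint is exclusive, so 256 is needed to include 255)
image_data = np.random.randint(0, 256, size=(100, 28, 28))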
## Flatten and normalize image data
flattened_images = image_data.reshape(100, -1)
robust_scaler = RobustScaler()
normalized_images = robust_scaler.fit_transform(flattened_images)
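For pixel data with a known 0 to 255 range, a fixed rescaling is a common alternative. Note that this is a different technique from the `RobustScaler` approach above, shown only for comparison; the seeded generator and the name `scaled_images` are assumptions for this sketch:

```python
import numpy as np

# Seeded generator so the simulated images are reproducible (an assumption)
rng = np.random.default_rng(0)
image_data = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

# Divide by the maximum possible pixel value to map every pixel into [0, 1].
# No statistics are learned from the data, so there is nothing to fit.
scaled_images = image_data.astype(np.float32) / 255.0

print(scaled_images.shape, scaled_images.min(), scaled_images.max())
```

Unlike a fitted scaler, this keeps the image shape intact and applies the same mapping to every image.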
Normalization Technique Comparison
| Scenario | Best Scaling Method | Key Considerations |
| --- | --- | --- |
| Neural Networks | Min-Max | Bounded input range |
| SVM Classification | Z-Score | Zero-centered data |
| Regression | Robust Scaling | Outlier resistance |
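The considerations in the table can be illustrated by applying all three scalers to the same feature. This is a small sketch on hypothetical data containing a single outlier (100.0), not a benchmark:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

# Hypothetical single feature with one extreme value
values = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scalers = {
    "minmax": MinMaxScaler(),
    "zscore": StandardScaler(),
    "robust": RobustScaler(),
}

results = {}
for name, scaler in scalers.items():
    # ravel() flattens the single column for easier printing
    results[name] = scaler.fit_transform(values).ravel()
    print(f"{name:>6}: {np.round(results[name], 2)}")
```

With min-max scaling the four ordinary values are squeezed close to 0 because the outlier sets the maximum. Robust scaling centers on the median and divides by the interquartile range, so those values stay spread out.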
Advanced Normalization Strategies
Custom Scaling Function
import numpy as np

def custom_normalization(data, method='zscore'):
    ## Z-score: subtract the mean and divide by the standard deviation
    if method == 'zscore':
        return (data - np.mean(data)) / np.std(data)
    ## Min-max: rescale to the [0, 1] range
    elif method == 'minmax':
        return (data - np.min(data)) / (np.max(data) - np.min(data))
    else:
        raise ValueError("Invalid normalization method")
## Example usage
data = np.array([1, 2, 3, 4, 5])
normalized_data = custom_normalization(data, method='minmax')
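One caveat: `custom_normalization` divides by zero when every value in `data` is identical, since both the standard deviation and the range are 0. A possible guarded variant is sketched below. The name `safe_normalization` and the choice to return all zeros for constant input are assumptions, not part of the original function:

```python
import numpy as np

def safe_normalization(data, method="zscore"):
    """Hypothetical variant of custom_normalization that handles constant input."""
    data = np.asarray(data, dtype=float)
    if method == "zscore":
        center, spread = data.mean(), data.std()
    elif method == "minmax":
        center, spread = data.min(), data.max() - data.min()
    else:
        raise ValueError("Invalid normalization method")
    # Assumption: constant input maps to all zeros instead of NaN
    if spread == 0:
        return np.zeros_like(data)
    return (data - center) / spread

print(safe_normalization([1, 2, 3, 4, 5], method="minmax"))
print(safe_normalization([7, 7, 7], method="zscore"))
```

Other conventions are reasonable too, such as raising an error for constant input; the right choice depends on how downstream code treats the feature.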
Best Practices at LabEx
- Always explore the data distribution before choosing a scaler
- Experiment with multiple scaling techniques
- Consider domain-specific requirements
- Validate model performance after normalization
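The practices above can be combined into a quick experiment. The following sketch, assuming the Iris data and an SVM classifier as in the first example, compares several scalers with 5-fold cross-validation; the dictionary names are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# None means "no scaling", used as a baseline
candidate_scalers = {
    "none": None,
    "minmax": MinMaxScaler(),
    "zscore": StandardScaler(),
    "robust": RobustScaler(),
}

cv_scores = {}
for name, scaler in candidate_scalers.items():
    steps = [SVC()] if scaler is None else [scaler, SVC()]
    pipeline = make_pipeline(*steps)
    # The scaler is refit inside each fold, so no fold sees held-out statistics
    cv_scores[name] = cross_val_score(pipeline, X, y, cv=5).mean()

for name, score in cv_scores.items():
    print(f"{name:>6}: {score:.3f}")
```

On a small, clean dataset like Iris the differences may be minor; the same loop is more informative on data with outliers or features on very different scales.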