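Kernel Ridge Regression Compared with SVR

Introduction

In this tutorial, we compare kernel ridge regression (KRR) and support vector regression (SVR) using Scikit-Learn, a popular machine learning library for Python. Both models use the kernel trick to learn a nonlinear function. They differ in their loss functions (KRR minimizes a squared error loss, SVR an epsilon-insensitive loss) and in how they are fitted. We will use an artificial dataset consisting of a sinusoidal target function with strong noise added to every fifth data point.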
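
To make the difference in loss functions concrete, here is a minimal NumPy sketch of the two per-sample losses. The function names are chosen for illustration, and epsilon=0.1 is assumed because it is the default of Scikit-Learn's SVR; neither model is actually fitted here.

```python
import numpy as np


def squared_loss(residual):
    """Per-sample loss minimized by kernel ridge regression."""
    return residual ** 2


def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Per-sample loss minimized by SVR: zero inside the epsilon tube."""
    return np.maximum(0.0, np.abs(residual) - epsilon)


# Illustrative residuals (prediction minus target), not taken from the dataset
residuals = np.array([0.0, 0.05, 0.5, 2.0])
ridge_losses = squared_loss(residuals)
svr_losses = epsilon_insensitive_loss(residuals)
print(ridge_losses)
print(svr_losses)
```

Residuals smaller than epsilon cost nothing under the SVR loss, and large residuals grow only linearly, whereas the squared loss penalizes them quadratically.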

Generate Sample Data

We generate a dataset consisting of a sinusoidal target function, with strong noise added to every fifth data point.

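import numpy as np

## Generate sample data: 10,000 inputs drawn uniformly from [0, 5)
rng = np.random.RandomState(42)
X = 5 * rng.rand(10000, 1)
y = np.sin(X).ravel()

## Add noise to every fifth target
y[::5] += 3 * (0.5 - rng.rand(X.shape[0] // 5))

## Dense grid of inputs used for prediction and plotting
X_plot = np.linspace(0, 5, 100000)[:, None]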
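
As a quick sanity check on the noise pattern, the following sketch regenerates the same data and compares the targets with the clean sine values. The variable names `clean`, `deviation`, and `noisy_mask` are introduced here for illustration only.

```python
import numpy as np

# Same data recipe as in the tutorial
rng = np.random.RandomState(42)
X = 5 * rng.rand(10000, 1)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(X.shape[0] // 5))

# Compare the targets with the noise-free sine values
clean = np.sin(X).ravel()
deviation = np.abs(y - clean)
noisy_mask = deviation > 0

# Only indices 0, 5, 10, ... should deviate from the sine curve
off_grid_noisy = int(noisy_mask[np.arange(len(y)) % 5 != 0].sum())
print("fraction of noisy targets:", noisy_mask.mean())
print("noisy targets outside every fifth index:", off_grid_noisy)
print("largest deviation:", deviation.max())
```

The noise term `3 * (0.5 - u)` with `u` drawn uniformly from [0, 1) is bounded by 1.5 in absolute value, which is large relative to the sine amplitude of 1.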

Build the Kernel-Based Regression Models

We use Scikit-Learn's GridSearchCV to set up the KRR and SVR models and search for the best hyperparameters.

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.kernel_ridge import KernelRidge

train_size = 100

## SVR model: grid search over C and the RBF bandwidth gamma
svr = GridSearchCV(
    SVR(kernel="rbf", gamma=0.1),
    param_grid={"C": [1e0, 1e1, 1e2, 1e3], "gamma": np.logspace(-2, 2, 5)},
)

## KRR model: grid search over alpha and the RBF bandwidth gamma
kr = GridSearchCV(
    KernelRidge(kernel="rbf", gamma=0.1),
    param_grid={"alpha": [1e0, 0.1, 1e-2, 1e-3], "gamma": np.logspace(-2, 2, 5)},
)
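
To get a feel for how much work each grid search does, the sketch below enumerates the SVR grid by hand with `itertools.product`. It assumes GridSearchCV's defaults of 5-fold cross-validation and a final refit on the full training subset; the names `candidates`, `n_folds`, and `n_fits` are illustrative.

```python
from itertools import product

import numpy as np

# Same grid as the SVR search in the tutorial
param_grid = {"C": [1e0, 1e1, 1e2, 1e3], "gamma": np.logspace(-2, 2, 5)}

# Every combination of the listed values is one candidate setting
candidates = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

n_folds = 5  # assumed: GridSearchCV's default cv
n_fits = len(candidates) * n_folds + 1  # +1 for the final refit (refit=True)

print("candidates:", len(candidates))
print("model fits per grid search:", n_fits)
```

The KRR grid has the same shape (four alpha values times five gamma values), so both searches perform the same number of fits and the timing comparison is like for like.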

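Compare SVR and Kernel Ridge Regression Times

Fitting each grid search selects the best hyperparameters and refits the model with them. We compare the fit and prediction times of the resulting SVR and KRR models.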

import time

## Fit SVR
t0 = time.time()
svr.fit(X[:train_size], y[:train_size])
svr_fit = time.time() - t0

## Print the best parameters and score of the SVR model
print(f"Best SVR with params: {svr.best_params_} and R2 score: {svr.best_score_:.3f}")
print("SVR complexity and bandwidth selected and model fitted in %.3f s" % svr_fit)

## Fit KRR
t0 = time.time()
kr.fit(X[:train_size], y[:train_size])
kr_fit = time.time() - t0

## Print the best parameters and score of the KRR model
print(f"Best KRR with params: {kr.best_params_} and R2 score: {kr.best_score_:.3f}")
print("KRR complexity and bandwidth selected and model fitted in %.3f s" % kr_fit)

## Compute the support vector ratio of the SVR
sv_ratio = svr.best_estimator_.support_.shape[0] / train_size
print("Support vector ratio: %.3f" % sv_ratio)

## Predict with SVR
t0 = time.time()
y_svr = svr.predict(X_plot)
svr_predict = time.time() - t0
print("SVR prediction for %d inputs in %.3f s" % (X_plot.shape[0], svr_predict))

## Predict with KRR
t0 = time.time()
y_kr = kr.predict(X_plot)
kr_predict = time.time() - t0
print("KRR prediction for %d inputs in %.3f s" % (X_plot.shape[0], kr_predict))
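
The repeated `t0 = time.time()` pattern above can be wrapped in a small helper. The sketch below is one possible way to do it; `timed` is a hypothetical name, and it uses `time.perf_counter()`, the standard-library clock intended for measuring short durations, instead of `time.time()`.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; in the tutorial this could be svr.predict or kr.predict
total, elapsed = timed(sum, range(1_000_000))
print(f"sum computed in {elapsed:.4f} s")
```

With such a helper, a call like `timed(svr.predict, X_plot)` would return the predictions and the prediction time together.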

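Examine the Results

We visualize the learned KRR and SVR models when both the complexity/regularization and the bandwidth of the RBF kernel have been optimized by grid search.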

import matplotlib.pyplot as plt

sv_ind = svr.best_estimator_.support_
plt.scatter(
    X[sv_ind],
    y[sv_ind],
    c="r",
    s=50,
    label="SVR support vectors",
    zorder=2,
    edgecolors=(0, 0, 0),
)
plt.scatter(X[:100], y[:100], c="k", label="data", zorder=1, edgecolors=(0, 0, 0))
plt.plot(
    X_plot,
    y_svr,
    c="r",
    label="SVR (fit: %.3fs, predict: %.3fs)" % (svr_fit, svr_predict),
)
plt.plot(
    X_plot, y_kr, c="g", label="KRR (fit: %.3fs, predict: %.3fs)" % (kr_fit, kr_predict)
)
plt.xlabel("data")
plt.ylabel("target")
plt.title("SVR versus Kernel Ridge")
_ = plt.legend()

Visualize Fit and Prediction Times

We visualize the fit and prediction times of KRR and SVR for different training set sizes.

_, ax = plt.subplots()

sizes = np.logspace(1, 3.8, 7).astype(int)
for name, estimator in {
    "KRR": KernelRidge(kernel="rbf", alpha=0.01, gamma=10),
    "SVR": SVR(kernel="rbf", C=1e2, gamma=10),
}.items():
    train_time = []
    test_time = []
    for train_test_size in sizes:
        t0 = time.time()
        estimator.fit(X[:train_test_size], y[:train_test_size])
        train_time.append(time.time() - t0)

        t0 = time.time()
        estimator.predict(X_plot[:1000])
        test_time.append(time.time() - t0)

    plt.plot(
        sizes,
        train_time,
        "o-",
        color="r" if name == "SVR" else "g",
        label="%s (train)" % name,
    )
    plt.plot(
        sizes,
        test_time,
        "o--",
        color="r" if name == "SVR" else "g",
        label="%s (test)" % name,
    )

plt.xscale("log")
plt.yscale("log")
plt.xlabel("Train size")
plt.ylabel("Time (seconds)")
plt.title("Execution Time")
_ = plt.legend(loc="best")
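
On a log-log plot, a power-law relationship between training size and time shows up as a straight line whose slope is the exponent. The sketch below estimates that slope with a least-squares line fit. The timings are synthetic, generated from an exact cubic law purely to show that the exponent is recovered; they are not measurements of KRR or SVR, and `scaling_exponent` is a hypothetical helper.

```python
import numpy as np


def scaling_exponent(sizes, times):
    """Slope of log(time) versus log(size), i.e. the empirical exponent."""
    slope, _intercept = np.polyfit(np.log(sizes), np.log(times), deg=1)
    return slope


# Same training sizes as in the timing loop above
sizes = np.logspace(1, 3.8, 7).astype(int)

# Synthetic timings following time = c * n**3 (illustration only)
synthetic_times = 1e-9 * sizes.astype(float) ** 3

exponent = scaling_exponent(sizes, synthetic_times)
print(f"estimated exponent: {exponent:.2f}")
```

Passing the measured `train_time` or `test_time` lists from the loop above in place of `synthetic_times` would give a rough empirical exponent for each model on your machine.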

Visualize the Learning Curves

We visualize the learning curves of KRR and SVR.

from sklearn.model_selection import LearningCurveDisplay

_, ax = plt.subplots()

svr = SVR(kernel="rbf", C=1e1, gamma=0.1)
kr = KernelRidge(kernel="rbf", alpha=0.1, gamma=0.1)

common_params = {
    "X": X[:100],
    "y": y[:100],
    "train_sizes": np.linspace(0.1, 1, 10),
    "scoring": "neg_mean_squared_error",
    "negate_score": True,
    "score_name": "Mean Squared Error",
    "score_type": "test",  # one curve per model, so the two legend labels match
    "std_display_style": None,
    "ax": ax,
}

LearningCurveDisplay.from_estimator(svr, **common_params)
LearningCurveDisplay.from_estimator(kr, **common_params)
ax.set_title("Learning curves")
ax.legend(handles=ax.get_legend_handles_labels()[0], labels=["SVR", "KRR"])

plt.show()
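
To show what a learning curve measures, the following NumPy-only sketch fits kernel ridge regression in closed form on growing subsets of the training data and records the error on a held-out set. It is a simplified illustration: it uses a single hold-out split rather than the cross-validation that LearningCurveDisplay performs, a smaller dataset of 1,100 points built with the same recipe, and hypothetical helper names (`rbf_kernel`, `krr_predict`).

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix: exp(-gamma * squared Euclidean distance)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def krr_predict(X_train, y_train, X_new, alpha=0.1, gamma=0.1):
    """Closed-form kernel ridge regression: solve (K + alpha * I) c = y."""
    K = rbf_kernel(X_train, X_train, gamma)
    dual_coef = np.linalg.solve(K + alpha * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_new, X_train, gamma) @ dual_coef


# Same data recipe as the tutorial, with fewer points (assumption)
rng = np.random.RandomState(42)
X = 5 * rng.rand(1100, 1)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(X.shape[0] // 5))

X_train, y_train = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

# Validation error as a function of the number of training points
curve = {}
for n in (10, 25, 50, 100):
    pred = krr_predict(X_train[:n], y_train[:n], X_val)
    curve[n] = float(np.mean((pred - y_val) ** 2))

for n, mse in curve.items():
    print(f"n_train={n:3d}  validation MSE={mse:.3f}")
```

Because every fifth validation target is noisy, the validation error cannot reach zero even with a perfect fit to the sine curve.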

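Summary

In this tutorial, we compared kernel ridge regression (KRR) and support vector regression (SVR) using Scikit-Learn. We generated a dataset consisting of a sinusoidal target function with strong noise added to every fifth data point. We used GridSearchCV to set up the KRR and SVR models and find the best hyperparameters, then compared the fit and prediction times of the tuned models. We visualized the learned models when both the complexity/regularization and the bandwidth of the RBF kernel were optimized by grid search, as well as the fit and prediction times for different training set sizes. Finally, we visualized the learning curves of KRR and SVR.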

Kernel Ridge Regression Visualization

Introduction