ガウス過程回帰：カーネル関数の探索

はじめに

この実験では、Python の Scikit-learn ライブラリを使ってガウス過程回帰（GPR）に異なるカーネルをどのように使用するかを示します。GPR は、ノイズのあるデータに複雑なモデルを適合させる非パラメトリック回帰手法です。カーネル関数は、任意の 2 つの入力点間の類似性を決定するために使用されます。カーネル関数の選択は重要であり、データに適合するモデルの形状を決定します。この実験では、GPR で最も一般的に使用されるカーネルを扱います。

VM のヒント

VM の起動が完了したら、左上隅をクリックしてノートブックタブに切り替えて、Jupyter Notebook を使った練習にアクセスします。

時々、Jupyter Notebook が読み込み終了するまで数秒待つ必要があります。Jupyter Notebook の制限により、操作の検証を自動化することはできません。

学習中に問題に遭遇した場合は、Labby にお問い合わせください。セッション後にフィードバックを提供してください。すぐに問題を解決いたします。

ライブラリのインポート

必要なライブラリをインポートして始めます。

import matplotlib.pyplot as plt
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, RationalQuadratic, ExpSineSquared, ConstantKernel, DotProduct, Matern

学習用データの作成

次に、異なるセクションで使用する学習用データセットを作成します。

rng = np.random.RandomState(4)
X_train = rng.uniform(0, 5, 10).reshape(-1, 1)
y_train = np.sin((X_train[:, 0] - 2.5) ** 2)
n_samples = 5

ヘルパー関数

ガウス過程に利用可能な個々のカーネルをそれぞれ提示する前に、ガウス過程から抽出されたサンプルをプロットするためのヘルパー関数を定義します。

def plot_gpr_samples(gpr_model, n_samples, ax):
    """Plot samples drawn from the Gaussian process model.

    If the Gaussian process model is not trained then the drawn samples are
    drawn from the prior distribution. Otherwise, the samples are drawn from
    the posterior distribution. Be aware that a sample here corresponds to a
    function.

    Parameters
    ----------
    gpr_model : `GaussianProcessRegressor`
        A :class:`~sklearn.gaussian_process.GaussianProcessRegressor` model.
    n_samples : int
        The number of samples to draw from the Gaussian process distribution.
    ax : matplotlib axis
        The matplotlib axis where to plot the samples.
    """
    x = np.linspace(0, 5, 100)
    X = x.reshape(-1, 1)

    y_mean, y_std = gpr_model.predict(X, return_std=True)
    y_samples = gpr_model.sample_y(X, n_samples)

    for idx, single_prior in enumerate(y_samples.T):
        ax.plot(
            x,
            single_prior,
            linestyle="--",
            alpha=0.7,
            label=f"Sampled function #{idx + 1}",
        )
    ax.plot(x, y_mean, color="black", label="Mean")
    ax.fill_between(
        x,
        y_mean - y_std,
        y_mean + y_std,
        alpha=0.1,
        color="black",
        label=r"$\pm$ 1 std. dev.",
    )
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_ylim([-3, 3])

ラジアルベーシス関数カーネル

ラジアルベーシス関数（RBF）カーネルは次のように定義されます。

$$ k(x_i, x_j) = \exp \left( -\frac{|x_i - x_j|^2}{2\ell^2} \right) $$

ここで、$\ell$ は長さ尺度パラメータです。

kernel = 1.0 * RBF(length_scale=1.0, length_scale_bounds=(1e-1, 10.0))
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))

## plot prior
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
axs[0].set_title("Samples from prior distribution")

## plot posterior
gpr.fit(X_train, y_train)
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
axs[1].scatter(X_train[:, 0], y_train, color="red", zorder=10, label="Observations")
axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
axs[1].set_title("Samples from posterior distribution")

fig.suptitle("Radial Basis Function kernel", fontsize=18)
plt.tight_layout()

有理二乗カーネル

有理二乗カーネルは次のように定義されます。

$$ k(x_i, x_j) = \left( 1 + \frac{|x_i - x_j|^2}{2\alpha\ell^2} \right)^{-\alpha} $$

ここで、$\ell$ は長さ尺度パラメータであり、$\alpha$ は小スケールと大スケールの特徴の相対的な重み付けを制御します。

kernel = 1.0 * RationalQuadratic(length_scale=1.0, alpha=0.1, alpha_bounds=(1e-5, 1e15))
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))

## plot prior
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
axs[0].set_title("Samples from prior distribution")

## plot posterior
gpr.fit(X_train, y_train)
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
axs[1].scatter(X_train[:, 0], y_train, color="red", zorder=10, label="Observations")
axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
axs[1].set_title("Samples from posterior distribution")

fig.suptitle("Rational Quadratic kernel", fontsize=18)
plt.tight_layout()

指数サイン二乗カーネル

指数サイン二乗カーネルは次のように定義されます。

$$ k(x_i, x_j) = \exp \left( -\frac{2\sin^2(\pi|x_i - x_j|/p)}{\ell^2} \right) $$

ここで、$\ell$ は長さ尺度パラメータであり、$p$ は周期を制御します。

kernel = 1.0 * ExpSineSquared(
    length_scale=1.0,
    periodicity=3.0,
    length_scale_bounds=(0.1, 10.0),
    periodicity_bounds=(1.0, 10.0),
)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))

## plot prior
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
axs[0].set_title("Samples from prior distribution")

## plot posterior
gpr.fit(X_train, y_train)
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
axs[1].scatter(X_train[:, 0], y_train, color="red", zorder=10, label="Observations")
axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
axs[1].set_title("Samples from posterior distribution")

fig.suptitle("Exp-Sine-Squared kernel", fontsize=18)
plt.tight_layout()

内積カーネル

内積カーネルは次のように定義されます。

$$ k(x_i, x_j) = (\sigma_0 + x_i^T x_j)^2 $$

ここで、$\sigma_0$ は定数です。

kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
    DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))

## plot prior
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
axs[0].set_title("Samples from prior distribution")

## plot posterior
gpr.fit(X_train, y_train)
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
axs[1].scatter(X_train[:, 0], y_train, color="red", zorder=10, label="Observations")
axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
axs[1].set_title("Samples from posterior distribution")

fig.suptitle("Dot-Product kernel", fontsize=18)
plt.tight_layout()

マテルンカーネル

マテルンカーネルは次のように定義されます。

$$ k(x_i, x_j) = \frac{1}{\Gamma(\nu)2^{\nu-1}}\left(\frac{\sqrt{2\nu}}{\ell}|x_i - x_j|\right)^\nu K_\nu\left(\frac{\sqrt{2\nu}}{\ell}|x_i - x_j|\right) $$

ここで、$\ell$ は長さ尺度パラメータであり、$\nu$ は関数の滑らかさを制御します。

kernel = 1.0 * Matern(length_scale=1.0, length_scale_bounds=(1e-1, 10.0), nu=1.5)
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0)

fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))

## plot prior
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
axs[0].set_title("Samples from prior distribution")

## plot posterior
gpr.fit(X_train, y_train)
plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
axs[1].scatter(X_train[:, 0], y_train, color="red", zorder=10, label="Observations")
axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
axs[1].set_title("Samples from posterior distribution")

fig.suptitle("Matérn kernel", fontsize=18)
plt.tight_layout()

まとめ

この実験では、Python の Scikit - learn ライブラリを使ってガウス過程回帰に異なるカーネルをどのように使用するか学びました。GPR で最も一般的に使用されるカーネルを扱い、それには、放射基底関数カーネル、有理二乗カーネル、指数サイン二乗カーネル、内積カーネル、およびマテルンカーネルが含まれます。また、ヘルパー関数を使ってガウス過程から抽出されたサンプルをプロットする方法も学びました。

まとめ

おめでとうございます！あなたはガウス過程回帰：カーネルの実験を完了しました。あなたのスキルを向上させるために、LabEx でさらに多くの実験を行って練習することができます。