Lennard-Jonesクラスターの最適化手法：モンテカルロ法からBlock Basin Hoppingへの展開

3228 words

16 minutes

Lennard-Jonesクラスターの最適化手法：モンテカルロ法からBlock Basin Hoppingへの展開

2026-01-21

Computational Physics

Global Optimization

Block Basin Hopping

Energy Landscape

Python

last_modified: 2026-01-21

※生成AIにより自動生成した記事です。自己責任でご覧ください。

1. モンテカルロ法：確率的探索の基礎と課題#

高次元空間における最適化問題へのアプローチとして、Metropolisらが提唱したモンテカルロ法（MC）は基礎的な役割を果たす。

アルゴリズムの構造#

MC法の基本原理は、現在の状態 $\mathbf{X}$ に対してランダムな変位（摂動）を与え、そのエネルギー変化 $\Delta E$ に基づいて遷移を判定することにある。エネルギーが低下する場合（ $\Delta E < 0$ ）は遷移を受理し、上昇する場合（ $\Delta E > 0$ ）であっても、温度 $T$ に依存する確率 $P = \exp(-\Delta E / k_B T)$ で受理する。

高次元空間における限界#

MC法は局所解からの脱出機構を備えているが、粒子数 $N$ の増加に伴い、探索効率が低下する傾向がある。高次元空間においては、ランダムな摂動によってエネルギー的に有利な領域へ到達する確率は、次元数に対して指数関数的に減少することが示唆される。そのため、単純なMC法では、広大な状態空間の探索において収束までに多大な計算コストを要する場合がある。

2. Basin Hopping法：エネルギー地形の変換#

WalesとDoyeによって提案されたBasin Hopping (BH) 法は、MC法の課題に対し、エネルギー地形（Potential Energy Surface, PES）の「変換」という観点からアプローチした手法である。

緩和によるマッピング#

BH法の特徴は、摂動後の座標に対して必ず**構造最適化（緩和）**を行う点にある。数理的には、本来のエネルギー関数 $E(\mathbf{X})$ を、局所最適化演算子 $\mathcal{L}$ を用いて以下のような階段状の関数 $\tilde{E}(\mathbf{X})$ に変換していると解釈できる。

\tilde{E}(\mathbf{X}) = E(\mathcal{L}(\mathbf{X}))

ここで $\mathcal{L}(\mathbf{X})$ は、座標 $\mathbf{X}$ から勾配法等で到達可能な局所最適解（Local Minimum）を指す。

手法の利点と課題#

この操作により、探索空間は連続的な超曲面から、局所安定点の集合（Basin）へと離散化される。これにより、探索における振動的なノイズが除去され、盆地間の遷移（Hopping）効率が向上すると考えられる。一方で、系全体を一律に摂動させるアプローチは、大規模な系において非効率となる可能性がある。多くの原子を同時に動かすと、構造の破壊度が大きくなり、高いエネルギー障壁によって遷移が拒否される確率が高まることが予想されるためである。

3. Block Basin Hopping法：部分空間での探索#

大規模系におけるBH法の課題に対し、探索空間を分割統治的に扱うアプローチが有効であると考えられる。本節では、Basin Hopping法を部分系（ブロック）単位で適用する変法について述べる。この手法は学術的に固定された名称ではないが、その動作特性を表すため、本稿では便宜的に Block Basin Hopping (BBH) と呼称する。

分割統治的なアプローチ#

このアプローチでは、全自由度 $3N$ を一度に更新するのではなく、選択された一部の原子群（ブロック）の自由度のみを更新対象とする。そのプロセスは以下の通りである。

ブロック選択: エネルギー的に不安定な領域、あるいはランダムに選択された $k$ 個の原子をブロックとする。
局所的なBasin Hopping: 残りの $N-k$ 個の原子を固定したポテンシャル場の中で、ブロック内の原子配置に対して摂動と緩和を繰り返す。
大域的更新: 局所的に得られた最適な配置を全体構造に反映し、採択判定を行う。

論理的妥当性#

このブロック単位の変法（block-wise variant）が有効であると考えられる理由は以下の通りである。

探索空間の縮小: ブロックサイズ $k$ が $N$ に比べて十分小さい場合、部分空間内での最適配置探索は、全空間探索に比べて計算コストが大幅に低いと推定される。
構造維持と修復: 系全体を破壊することなく、局所的な欠陥（Defect）や歪みを選択的に解消できる可能性が高い。これにより、エネルギー障壁を迂回しつつ、より安定な状態へ遷移できることが期待される。

これは、高次元の最適化問題を、より低次元の可解な部分問題の連続として扱うエンジニアリング的なアプローチであり、特にフラストレーションの多いエネルギー地形において有効性が示唆される。

4. Pythonによる実装#

以下に、Block Basin Hopping法を実装したクラス BlockBasinOptimizer を示す。この実装では、数値的安定性のためのソフトクリッピングや、正確な解析的勾配計算が含まれている。

import numpy as np
from scipy.optimize import minimize
"""
This implementation is intended for educational purposes to illustrate block-wise local optimization and basin-inspired stochastic search on multimodal energy landscapes. It is not claimed to be a novel global optimization algorithm.
"""
class BlockBasinOptimizer:
    def __init__(self, n_atoms):
        self.n = n_atoms
        self.dim = 3
        self.best_ever_e = np.inf
        self.best_ever_x = None
        # small eps for numerical stability (not a physical cutoff)
        self._eps = 1e-12

    # -------------------------
    # Helper: pairwise energy+grad (for two sets)
    # -------------------------
    def _pair_energy_and_grad(self, coords_a, coords_b=None):
        """
        If coords_b is None: compute all pair interactions within coords_a
        Returns:
            energy (scalar),
            grad_a (array shape (len(coords_a), 3))
        """
        A = coords_a
        nA = A.shape[0]

        if coords_b is None:
            # all pairs within A
            r_vec = A[:, np.newaxis, :] - A[np.newaxis, :, :]  # (nA, nA, 3)
            r2 = np.sum(r_vec**2, axis=2)                      # (nA, nA)
            # exclude self interactions by setting diag to inf
            np.fill_diagonal(r2, np.inf)
            inv_r = 1.0 / np.sqrt(r2)
            # avoid tiny numerical blow-ups but do not impose physical cutoff
            inv_r = np.where(np.isfinite(inv_r), inv_r, 0.0)

            inv_r6 = inv_r**6
            inv_r12 = inv_r6**2

            iu = np.triu_indices(nA, k=1)
            energy = 4.0 * np.sum(inv_r12[iu] - inv_r6[iu])

            # dV/dr for each pair (i,j)
            dV_dr = 4.0 * (-12.0 * inv_r12 * inv_r + 6.0 * inv_r6 * inv_r)  # shape (nA,nA)
            # unit vectors: r_vec * inv_r[...,None]
            unit = r_vec * inv_r[..., np.newaxis]  # diag components are zero because inv_r diag=0
            # gradient on each atom i is sum_j dV/dr_ij * unit_ij
            grad = np.sum(dV_dr[..., np.newaxis] * unit, axis=1)  # (nA,3)
            return energy, grad

        else:
            # interactions between A (active) and B (fixed)
            B = coords_b
            nB = B.shape[0]
            r_vec = A[:, np.newaxis, :] - B[np.newaxis, :, :]   # (nA, nB, 3)
            r2 = np.sum(r_vec**2, axis=2)                        # (nA, nB)
            # numerical protection
            r2 = np.maximum(r2, self._eps)
            inv_r = 1.0 / np.sqrt(r2)
            inv_r6 = inv_r**6
            inv_r12 = inv_r6**2

            energy = 4.0 * np.sum(inv_r12 - inv_r6)

            dV_dr = 4.0 * (-12.0 * inv_r12 * inv_r + 6.0 * inv_r6 * inv_r)  # (nA, nB)
            unit = r_vec * inv_r[..., np.newaxis]                           # (nA, nB, 3)

            # gradient contribution on active atoms
            grad_a = np.sum(dV_dr[..., np.newaxis] * unit, axis=1)  # (nA,3)
            # NOTE: we don't return gradient on fixed atoms (they're treated fixed)
            return energy, grad_a

    # -------------------------
    # Full potential (energy only)
    # -------------------------
    def potential_full(self, x):
        coords = x.reshape((self.n, 3))
        e, _ = self._pair_energy_and_grad(coords, None)
        return e

    # -------------------------
    # Full potential with gradient (for minimizers)
    # -------------------------
    def potential_full_with_grad(self, x):
        coords = x.reshape((self.n, 3))
        e, grad = self._pair_energy_and_grad(coords, None)
        # flatten grad to shape (3n,)
        return e, grad.flatten()

    # -------------------------
    # Local block potential with grad (active variables only)
    # -------------------------
    def potential_local_block_with_grad(self, active_coords_flat, fixed_coords, n_active):
        active_coords = active_coords_flat.reshape((n_active, 3))

        # active-active
        e_aa, grad_aa = self._pair_energy_and_grad(active_coords, None)

        # active-fixed
        if fixed_coords is not None and fixed_coords.size > 0:
            e_af, grad_af = self._pair_energy_and_grad(active_coords, fixed_coords)
        else:
            e_af = 0.0
            grad_af = np.zeros_like(grad_aa)

        e_total = e_aa + e_af
        grad_total = grad_aa + grad_af
        return e_total, grad_total.flatten()

    # -------------------------
    # Block selection (unchanged)
    # -------------------------
    def get_knn_block(self, coords, k_size, center_idx=None):
        if center_idx is None:
            center_idx = np.random.randint(0, self.n)
        dists = np.linalg.norm(coords - coords[center_idx], axis=1)
        sorted_indices = np.argsort(dists)
        block_indices = sorted_indices[:k_size]
        return block_indices, center_idx

    def get_high_energy_block(self, coords, k_size):
        # compute pairwise energy per atom (consistent with pair formula)
        r_vec = coords[:, np.newaxis, :] - coords[np.newaxis, :, :]
        r2 = np.sum(r_vec**2, axis=2)
        np.fill_diagonal(r2, np.inf)
        inv_r = 1.0 / np.sqrt(r2)
        inv_r = np.where(np.isfinite(inv_r), inv_r, 0.0)
        inv_r6 = inv_r**6
        inv_r12 = inv_r6**2
        pair_energy = 4.0 * (inv_r12 - inv_r6)
        np.fill_diagonal(pair_energy, 0.0)
        atom_energy = np.sum(pair_energy, axis=1) / 2.0
        center_idx = np.argmax(atom_energy)
        return self.get_knn_block(coords, k_size, center_idx)

    # -------------------------
    # BlockBasin local enumeration (mostly unchanged logic)
    # -------------------------
    def blockbasin_local_enumeration(self, current_coords, block_size=6, center_idx=None, use_high_energy=False):
        if use_high_energy:
            active_idx, _ = self.get_high_energy_block(current_coords, block_size)
        else:
            active_idx, _ = self.get_knn_block(current_coords, block_size, center_idx)

        fixed_idx = np.setdiff1d(np.arange(self.n), active_idx)

        active_coords = current_coords[active_idx]
        fixed_coords = current_coords[fixed_idx]
        n_active = len(active_idx)

        def local_obj(x):
            return self.potential_local_block_with_grad(x, fixed_coords, n_active)

        x0_local = active_coords.flatten()
        res = minimize(local_obj, x0_local, method='L-BFGS-B', jac=True,
                       options={'maxiter': 100})
        best_local_x = res.x.copy()
        best_local_e = res.fun
        curr_local_x = res.x.copy()
        curr_local_e = res.fun

        n_local_steps = 30
        local_temp = 0.5

        for step in range(n_local_steps):
            if step % 8 == 0:
                pert_scale = 0.8
            elif step % 4 == 0:
                pert_scale = 0.5
            else:
                pert_scale = 0.3

            pert = np.random.uniform(-pert_scale, pert_scale, size=curr_local_x.shape)
            trial_x = curr_local_x + pert

            res = minimize(local_obj, trial_x, method='L-BFGS-B', jac=True,
                           options={'maxiter': 100, 'ftol': 1e-7})
            e_trial = res.fun
            x_trial = res.x

            delta = e_trial - curr_local_e
            if delta < 0 or np.random.rand() < np.exp(-delta / local_temp):
                curr_local_x = x_trial
                curr_local_e = e_trial
                if curr_local_e < best_local_e:
                    best_local_e = curr_local_e
                    best_local_x = curr_local_x.copy()

        new_coords = current_coords.copy()
        new_coords[active_idx] = best_local_x.reshape((n_active, 3))

        return new_coords

    # chaotic initial generators (unchanged)
    def generate_chaotic_initial(self, mode='random'):
        if mode == 'linear':
            coords = np.zeros((self.n, 3))
            coords[:, 0] = np.linspace(0, self.n * 1.0, self.n)
            noise = np.random.uniform(-0.1, 0.1, size=(self.n, 3))
            coords += noise
        elif mode == 'planar':
            coords = np.zeros((self.n, 3))
            side = int(np.ceil(np.sqrt(self.n)))
            idx = 0
            for i in range(side):
                for j in range(side):
                    if idx < self.n:
                        coords[idx] = [i * 1.1, j * 1.1, 0]
                        idx += 1
            noise = np.random.uniform(-0.1, 0.1, size=(self.n, 3))
            coords += noise
        elif mode == 'clustered':
            coords = np.zeros((self.n, 3))
            half = self.n // 2
            coords[: half] = np.random.uniform(-1, 1, size=(half, 3)) + np.array([3, 0, 0])
            coords[half: ] = np.random.uniform(-1, 1, size=(self.n - half, 3)) + np.array([-3, 0, 0])
        elif mode == 'shell':
            coords = np.zeros((self.n, 3))
            for i in range(self.n):
                theta = np.random.uniform(0, 2 * np.pi)
                phi = np.random.uniform(0, np.pi)
                r = 2.5
                coords[i] = [
                    r * np.sin(phi) * np.cos(theta),
                    r * np.sin(phi) * np.sin(theta),
                    r * np.cos(phi)
                ]
        elif mode == 'spiral':
            coords = np.zeros((self.n, 3))
            for i in range(self.n):
                t = i * 0.8
                coords[i] = [
                    np.cos(t) * 1.5,
                    np.sin(t) * 1.5,
                    t * 0.3
                ]
        elif mode == 'random_sparse':
            coords = np.random.uniform(-5, 5, size=(self.n, 3))
        elif mode == 'random_dense':
            coords = np.random.uniform(-1.0, 1.0, size=(self.n, 3))
        elif mode == 'anti_icosahedral':
            coords = np.zeros((self.n, 3))
            coords[0] = [0.5, 0.5, 0.5]
            positions = [
                [1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0],
                [0, 0, 1], [0, 0, -1], [1, 1, 0], [-1, -1, 0],
                [1, 0, 1], [-1, 0, -1], [0, 1, 1], [0, -1, -1]
            ]
            for i in range(1, min(self.n, 13)):
                coords[i] = np.array(positions[i-1]) * 1.3
            noise = np.random.uniform(-0.3, 0.3, size=(self.n, 3))
            coords += noise
        else:
            coords = np.random.uniform(-5, 5, size=(self.n, 3))

        return coords.flatten()

    def global_perturbation(self, coords, intensity=0.3):
        pert = np.random.uniform(-intensity, intensity, size=coords.shape)
        return coords + pert

    def run(self, n_steps=100, block_size=5, init_mode='random'):
        x_curr = self.generate_chaotic_initial(mode=init_mode)

        e_raw = self.potential_full(x_curr)
        print(f"{'='*60}")
        print(f"BlockBasin-Phys Optimizer (N={self.n}, block_size={block_size})")
        print(f"Init mode: {init_mode}")
        print(f"{'='*60}")
        print(f"RAW Initial Energy (before any opt): {e_raw:.6f}")

        res = minimize(self.potential_full_with_grad, x_curr, method='L-BFGS-B',
                       jac=True, options={'maxiter': 50})
        x_curr = res.x
        e_curr = res.fun

        self.best_ever_e = e_curr
        self.best_ever_x = x_curr.copy()

        print(f"After minimal relaxation: {e_curr:.6f}")
        print(f"Target(LJ13): -44.3268")
        print(f"{'-'*60}")

        no_improve_count = 0

        for i in range(n_steps):
            if i % 3 == 0:
                use_high_energy = True
                center_idx = None
            else:
                use_high_energy = False
                center_idx = i % self.n

            x_new = self.blockbasin_local_enumeration(
                x_curr.reshape(self.n, 3),
                block_size,
                center_idx=center_idx,
                use_high_energy=use_high_energy
            )
            x_new = x_new.flatten()

            res = minimize(self.potential_full_with_grad, x_new, method='L-BFGS-B',
                           jac=True, options={'maxiter': 200, 'ftol': 1e-9})
            e_new = res.fun
            x_new = res.x

            if e_new < e_curr - 1e-7:
                improvement = e_curr - e_new
                print(f"Step {i:3d}:  E = {e_new:.6f} (improved by {improvement:.6f})")
                e_curr = e_new
                x_curr = x_new
                
                coords = x_curr.reshape((self.n,3))
                coords -= coords.mean(axis=0)
                x_curr = coords.flatten()
                
                no_improve_count = 0

                if e_new < self.best_ever_e:
                    self.best_ever_e = e_new
                    self.best_ever_x = x_new.copy()

                if e_curr < -44.32 and self.n == 13:
                    print(f"{'='*60}")
                    print(">>> GLOBAL MINIMUM REACHED!  <<<")
                    print(f"{'='*60}")
                    break
            else:
                no_improve_count += 1

                if no_improve_count >= 8:
                    intensity = 0.2 + 0.05 * (no_improve_count // 8)
                    intensity = min(intensity, 0.6)

                    x_perturbed = self.global_perturbation(x_curr, intensity=intensity)
                    res = minimize(self.potential_full_with_grad, x_perturbed,
                                   method='L-BFGS-B', jac=True)

                    if res.fun < e_curr:
                        print(f"Step {i:3d}:  Perturbation helped! E = {res.fun:.6f}")
                        x_curr = res.x
                        coords = x_curr.reshape((self.n,3))
                        coords -= coords.mean(axis=0)
                        x_curr = coords.flatten()
                        
                        e_curr = res.fun
                        if e_curr < self.best_ever_e:
                            self.best_ever_e = e_curr
                            self.best_ever_x = x_curr.copy()
                    else:
                        if no_improve_count >= 15:
                            print(f"Step {i:3d}:  Restoring best and reshaking...")
                            x_curr = self.best_ever_x.copy()
                            e_curr = self.best_ever_e

                    no_improve_count = 0

        print(f"{'-'*60}")
        print(f"Final Energy  :   {self.best_ever_e:.6f}")
        print(f"Target (LJ13) :   -44.3268")
        print(f"Gap (LJ13)    :   {self.best_ever_e - (-44.3268):.6f}")

        return self.best_ever_e, self.best_ever_x


# -------------------------
# Simple numeric gradient check utility
# -------------------------
def gradient_check():
    np.random.seed(0)
    opt = BlockBasinOptimizer(n_atoms=7)  # smaller N for speed
    x0 = opt.generate_chaotic_initial(mode='random')
    e, g = opt.potential_full_with_grad(x0)
    # numerical grad
    eps = 1e-6
    num_g = np.zeros_like(x0)
    for i in range(len(x0)):
        xp = x0.copy(); xm = x0.copy()
        xp[i] += eps; xm[i] -= eps
        fplus = opt.potential_full(xp)
        fminus = opt.potential_full(xm)
        num_g[i] = (fplus - fminus) / (2 * eps)
    max_err = np.max(np.abs(num_g - g))
    print("Energy:", e)
    print("Max grad error (num-analytic):", max_err)
    
    e, g = opt.potential_full_with_grad(x0)
    total_force = np.sum(g.reshape(-1,3), axis=0)
    assert np.allclose(total_force, 0.0, atol=1e-8)
    return max_err

if __name__ == "__main__":
    # 1) 勾配チェック（既に試したなら問題なし）
    print("Running gradient check...")
    err = gradient_check()
    if err < 1e-5:
        print("Gradient OK (within tolerance). Proceeding to LJ13 runs...\n")
    else:
        print("ERROR: gradient error large; aborting LJ13 runs. Inspect implementation.")
        raise SystemExit(1)

    # 2) LJ13 単発ランダム実行（ユーザの元の動作に合わせる）
    print("=== Single random LJ13 run (random seed) ===")
    np.random.seed(None)  # 真のランダム性
    opt = BlockBasinOptimizer(n_atoms=13)
    
    final_e, final_coords = opt.run(n_steps=80, block_size=6, init_mode='random')
    print(f"Single run final energy: {final_e:.6f}\n")

    # 3) 再現可能な小規模トライアル（検証用） — シード 0-30
   
    trial_results = []
    for seed in range(30):
        print(f"\n--- Trial seed={seed} ---")
        np.random.seed(seed)
        opt = BlockBasinOptimizer(n_atoms=13)
        e, _ = opt.run(n_steps=80, block_size=6, init_mode='random')
        trial_results.append(e)

    print("\n=== Trial summary ===")
    for i, e in enumerate(trial_results):
        print(f"Seed {i}: final E = {e:.6f}")
    print(f"Best: {min(trial_results):.6f}, Worst: {max(trial_results):.6f}, Mean: {np.mean(trial_results):.6f}")

5. 結論#

Lennard-Jonesクラスターのような多自由度系の最適化において、探索手法は「ランダム探索（MC）」から「地形の粗視化（BH）」へと進化してきた。さらに、そのBH法を空間的に分割して適用する「Block Basin Hopping」のような変法を用いることで、全体の構造を維持しつつ局所的な最適化を積み重ね、深いエネルギーの谷（Global Minimum）へ効率的に到達できる可能性が高まる。

参考文献#

Cambridge Cluster Database. (n.d.). The Cambridge Cluster Database. Retrieved from http://www-wales.ch.cam.ac.uk/CCD.html
Wales, D. J., & Doye, J. P. K. (1997). Global Optimization by Basin-Hopping and the Lowest Energy Structures of Lennard-Jones Clusters containing up to 110 Atoms. The Journal of Physical Chemistry A, 101(28), 5111-5116.
Nayeem, A., Vila, J., & Scheraga, H. A. (1991). A comparative study of the simulated-annealing and Monte Carlo-with-minimization approaches to the multiple-minima problem for small peptides. Journal of Computational Chemistry, 12(5), 594-605.
Goedecker, S. (2004). Minima hopping: An efficient search method for the global minimum of the potential energy surface of complex molecular systems. The Journal of Chemical Physics, 120(21), 9911-9917.
Li, Z., & Scheraga, H. A. (1987). Monte Carlo-minimization approach to the multiple-minima problem in protein folding. Proceedings of the National Academy of Sciences, 84(19), 6611-6615.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of State Calculations by Fast Computing Machines. The Journal of Chemical Physics, 21(6), 1087-1092.