
I am trying to replicate the results in the paper Metric Learning for Kernel Regression (Weinberger et al.), using the metric-learn Python package to perform the MLKR computation.

I am completely failing to replicate one result in the paper.

I generated a synthetic dataset following their instructions step by step:

We tested the abilities of MLKR for dimensionality reduction on a synthetic data set. We generated n = 8000 input vectors of dimensionality D = 10. The input data was constructed as follows: First, we sampled data points from a three dimensional “twin-peak” manifold with $\tilde{x}_i = (r_i, s_i, \sin(r_i)\tanh(s_i))^\top$ for some randomly sampled points $(r_i, s_i)$. The target value was computed as $y_i = \|\tilde{x}_i\|^2$. The dataset was then embedded into a ten dimensional space by adding seven dimensions of high uniform noise (the noise level was set to 1000% of the original input magnitude.) Finally, we multiplied the input vectors by a randomly generated rotation matrix which resulted in all ten input dimensions containing a comparable mixture of noise and signal.

See the code below if you wish. Now, following their explanation, it seems to be simply a matter of performing MLKR and getting the Mahalanobis matrix M,

```
mlkr = MLKR()
mlkr.fit(X, y)
M = mlkr.get_mahalanobis_matrix()
```

then ranking the eigenvectors and projecting the points onto the leading ones. Yet I am completely unable to recover their nice noiseless surface.

I can get there only if:

  • I do not rotate the point cloud (the last step of the synthetic data processing)
  • I use a StandardScaler on the data (which MLKR really should not need at all)

These two points are rather puzzling; why would that be? I also verified that the first 3 eigenvalues of M are the largest, so it is indeed recovering the three non-noisy dimensions in the data. But why can't it handle rotations?
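Unless I am misreading the paper, the MLKR objective itself should not care about rotations of the input. As I read it, the loss is

$$\mathcal{L}=\sum_i(\hat y_i-y_i)^2,\qquad \hat y_i=\frac{\sum_{j\neq i} y_j k_{ij}}{\sum_{j\neq i} k_{ij}},\qquad k_{ij}=\exp\!\big(-(x_i-x_j)^\top M\,(x_i-x_j)\big),$$

so if the inputs are rotated, $X' = XQ$ with $Q$ orthogonal (i.e. $x_i' = Q^\top x_i$), then $M' = Q^\top M Q$ reproduces every pairwise distance and hence the same loss, and the learned subspace should simply co-rotate with $Q$. That is why I would expect any trouble to be numerical rather than a limitation of the model.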

Any hints please? Thanks.

This is the code:

```
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3-D projection
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from metric_learn import MLKR

# ============================================================
# MLKR vs PCA on the Twin-Peak Manifold (Weinberger et al. 2006)
# ============================================================

# ---------- 0) Generate synthetic data exactly as in the paper ----------
rng = np.random.default_rng(7)
n, D = 1000, 10  # the paper uses 8000; set to 1000 for speed

r = rng.uniform(-3, 3, size=n)
s = rng.uniform(-3, 3, size=n)

# 3-D twin-peak manifold: x~_i = (r_i, s_i, sin(r_i) tanh(s_i))
x_tilde = np.column_stack([r, s, np.sin(r) * np.tanh(s)])  # (n, 3)

# Target value: y_i = ||x~_i||^2
y = np.sum(x_tilde**2, axis=1)

# Add seven dims of high uniform noise at 1000% magnitude
# (1000% of the magnitude means 10 * magnitude)
noise_level_factor = 10
magnitudes = np.linalg.norm(x_tilde, axis=1)
# Scale for the noise: 10 * ||x~_i||
scale = noise_level_factor * magnitudes[:, None]
# Uniform noise in [-1, 1], scaled by the signal magnitude
noise = rng.uniform(-1.0, 1.0, size=(n, 7)) * scale

X = np.concatenate([x_tilde, noise], axis=1).astype(float)

# Apply a random rotation so each dimension mixes signal + noise.
# This is crucial for making PCA fail.
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))
# Switching the rotation off and the StandardScaler on makes it work
X = X @ Q

# ---------- 1) Standardize the data (crucial step for MLKR?) ----------
# Seems this is necessary?
scaler = StandardScaler()
X = scaler.fit_transform(X)

# ---------- 2) Fit PCA (baseline comparison) ----------
pca = PCA(n_components=3, random_state=0)
Z_pca = pca.fit_transform(X)

fig = plt.figure(figsize=(14, 6))
ax1 = fig.add_subplot(131, projection='3d')
sc1 = ax1.scatter(Z_pca[:, 0], Z_pca[:, 1], Z_pca[:, 2],
                  c=y, s=4, alpha=0.6, cmap="viridis")
ax1.set_title("PCA (Unsupervised) - Failed Recovery")
ax1.set_xlabel("PC1"); ax1.set_ylabel("PC2"); ax1.set_zlabel("PC3")
fig.colorbar(sc1, ax=ax1, fraction=0.046, pad=0.04, label="target y")

# ---------- 3) Fit MLKR and project ----------
# The learned matrix M = A.T @ A, where A is the projection matrix.
# The metric_learn implementation of MLKR returns M directly.
mlkr = MLKR(init='pca')
mlkr.fit(X, y)
M = mlkr.get_mahalanobis_matrix()

# Eigen-decomposition of M (M = U @ diag(w) @ U.T):
# rank the eigenvectors and pick the largest ones
w, U = np.linalg.eigh(M)
idx = np.argsort(w)[::-1]
U = U[:, idx]
w = w[idx]

# Select the top k=3 eigenvectors (the transformation matrix A)
k = 3
A = U[:, :k]

# The MLKR projection is then X @ A
Z_mlkr = X @ A

ax2 = fig.add_subplot(132, projection='3d')
sc2 = ax2.scatter(Z_mlkr[:, 0], Z_mlkr[:, 1], Z_mlkr[:, 2],
                  c=y, s=4, alpha=0.6, cmap="viridis")
ax2.set_title("MLKR Projection (Supervised) - Clean Recovery")
ax2.set_xlabel("MLKR Dim 1"); ax2.set_ylabel("MLKR Dim 2"); ax2.set_zlabel("MLKR Dim 3")

plt.show()
```
  • Did you try to contact the authors (of either the paper or the package)? For specific questions regarding a particular paper, this might get you some helpful answers. – Commented Nov 10 at 8:31

1 Answer

I think I got to the bottom of it: it is essential to whiten the original input dataset. I attach the working reproduction script below; I learnt a lot through the process.
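Isolated from the full script below, the essential change relative to the question's code is to PCA-whiten the inputs (instead of the StandardScaler step) before fitting MLKR. A minimal sketch, reusing the X and y built in the question's script:

```
from sklearn.decomposition import PCA
from metric_learn import MLKR

# Minimal sketch (X, y as generated in the question's script):
# whiten first, then learn the metric on the whitened data.
whitener = PCA(whiten=True)
Xw = whitener.fit_transform(X)     # covariance of Xw is ~ identity

mlkr = MLKR(n_components=3, random_state=0)
Z = mlkr.fit_transform(Xw, y)      # 3-D embedding learned on whitened inputs
```

The full script below does the same thing but also keeps the whitening matrix, so the learned directions can be mapped back to the original (unwhitened) coordinates.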

```
# ============================================================
# MLKR vs PCA on the Twin-Peak Manifold (Weinberger et al.)
# "Whiten + Rotate" (forcing a pure rotation in
# the original metric reintroduces noise).
# ============================================================
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from metric_learn import MLKR

# -----------------------------
# 0) Config
# -----------------------------
SEED = 7
N = 8000
D = 10
NOISE_FACTOR = 10.0   # 1000% = 10x magnitude
MAX_ITER = 600
ROTATE_DATA = True    # as in the paper; mixes signal & noise

rng = np.random.default_rng(SEED)

# -----------------------------
# 1) Synthetic data (like paper)
# -----------------------------
r = rng.uniform(-3, 3, size=N)
s = rng.uniform(-3, 3, size=N)
X_sig = np.column_stack([r, s, np.sin(r) * np.tanh(s)])  # (n, 3)
y = np.sum(X_sig**2, axis=1)

# 7 dims of high uniform noise, scaled per-sample
mag = np.linalg.norm(X_sig, axis=1, keepdims=True)        # (n, 1)
noise = rng.uniform(-1.0, 1.0, size=(N, 7)) * (NOISE_FACTOR * mag)
X = np.hstack([X_sig, noise]).astype(float)               # (n, 10)

# Random rotation so all dims mix signal & noise
if ROTATE_DATA:
    Qrand, _ = np.linalg.qr(rng.normal(size=(D, D)))
    X = X @ Qrand

# -----------------------------
# 2) Whitening (critical)
#    Xw = (X - mean_) @ W, with W = components_.T / sqrt(explained_variance_)
# -----------------------------
pca_whiten = PCA(n_components=D, whiten=True, random_state=0)
Xw = pca_whiten.fit_transform(X)                  # Cov ≈ I
mean_ = pca_whiten.mean_
P = pca_whiten.components_                        # (D, D), rows orthonormal
ev = pca_whiten.explained_variance_               # (D,)
W = P.T / np.sqrt(ev)                             # (D, D) whitening matrix

# -----------------------------
# 3) Baseline PCA (on raw X) for reference
# -----------------------------
Z_pca = PCA(n_components=3, random_state=0).fit_transform(X)

# -----------------------------
# 4) MLKR on whitened data
# -----------------------------
mlkr = MLKR(n_components=3, random_state=0, max_iter=MAX_ITER)
mlkr.fit(Xw, y)
M = mlkr.get_mahalanobis_matrix()                 # (D, D), PSD

# Top-3 eigenvectors of M (columns) in whitened space
evals, U = np.linalg.eigh(M)
order = np.argsort(evals)[::-1]
evals = evals[order]
U = U[:, order]
U_k = U[:, :3]                                    # (D, 3)

# -----------------------------
# 5) Two usable linear maps from ORIGINAL data to 3-D signal coords
#    (a) Balanced subspace: T_balanced = W @ U_k
#    (b) MLKR-scaled:       T_mlkr = W @ (U_k @ sqrt(Λ_k))
# -----------------------------
Lambda_k_sqrt = np.sqrt(np.diag(evals[:3]))       # (3, 3)
T_balanced = W @ U_k                              # (D, 3)
T_mlkr = W @ (U_k @ Lambda_k_sqrt)                # (D, 3)

Xc = X - mean_                                    # center in original space
Z_balanced = Xc @ T_balanced                      # n x 3, good for visualization
Z_mlkr_like = Xc @ T_mlkr                         # n x 3, strict MLKR geometry
# The direct MLKR 3-D embedding in whitened coords:
Z_mlkr_white = Xw @ U_k                           # n x 3 (often looks "flatter")

# -----------------------------
# 6) Plots
# -----------------------------
fig = plt.figure(figsize=(18, 5))

ax0 = fig.add_subplot(131, projection='3d')
ax0.scatter(Z_pca[:, 0], Z_pca[:, 1], Z_pca[:, 2], c=y, s=2, alpha=0.7, cmap='bwr')
ax0.set_title("PCA on raw X (unsupervised)")
ax0.set_xlabel("PC1"); ax0.set_ylabel("PC2"); ax0.set_zlabel("PC3")

ax1 = fig.add_subplot(132, projection='3d')
ax1.scatter(Z_balanced[:, 0], Z_balanced[:, 1], Z_balanced[:, 2], c=y, s=2, alpha=0.7, cmap='bwr')
ax1.set_title("MLKR signal coords (Whiten + Rotate, balanced)")
ax1.set_xlabel("S1"); ax1.set_ylabel("S2"); ax1.set_zlabel("S3")

ax2 = fig.add_subplot(133, projection='3d')
ax2.scatter(Z_mlkr_like[:, 0], Z_mlkr_like[:, 1], Z_mlkr_like[:, 2], c=y, s=2, alpha=0.7, cmap='bwr')
ax2.set_title("MLKR signal coords (Whiten + Rotate, √λ-scaled)")
ax2.set_xlabel("S1"); ax2.set_ylabel("S2"); ax2.set_zlabel("S3")

plt.tight_layout()
plt.show()

print("Top 6 eigenvalues of M:", np.round(evals[:6], 6))
```
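If you want to check the recovery more quantitatively than by eye, one simple sanity check (not from the paper, just a suggestion) is to verify that the true manifold coordinates are essentially a linear function of the recovered ones, using X_sig and Z_balanced from the script above:

```
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# If Z_balanced spans the signal subspace, the true 3-D coordinates X_sig
# should be recoverable from it by a plain linear map (up to noise).
lin = LinearRegression().fit(Z_balanced, X_sig)
print("R^2 of X_sig predicted from Z_balanced:",
      r2_score(X_sig, lin.predict(Z_balanced)))
```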
