Singular Value Decomposition: Applications¶
This notebook demonstrates two concrete applications of SVD:
- Low-rank structure in data matrices
- Image compression
The philosophy throughout is:
Compute first, interpret later.
Mathematical Reminder: SVD¶
For any matrix $A \in \mathbb{R}^{m \times n}$, $$ A = \sum_{i=1}^r \sigma_i u_i v_i^\top, $$ where:
- $r = \mathrm{rank}(A)$
- $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are the singular values
- each term $\sigma_i u_i v_i^\top$ is rank one
- truncating to the first $k$ terms gives the best rank-$k$ approximation, in both the spectral and Frobenius norms (Eckart–Young theorem)
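A quick numerical check of this identity (a minimal sketch on an arbitrary small random matrix; the shape and seed are purely illustrative):

```python
import numpy as np

# Rebuild a small matrix from its rank-one SVD terms sigma_i * u_i v_i^T.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, S, VT = np.linalg.svd(A, full_matrices=False)
A_rebuilt = sum(S[i] * np.outer(U[:, i], VT[i, :]) for i in range(len(S)))
print(np.allclose(A, A_rebuilt))  # True
```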
In [4]:
import numpy as np
Example 1: Movie Ratings Matrix¶
We start with a concrete data matrix. No structure is assumed a priori.
In [5]:
A = np.array([
    [5, 4, 1],
    [5, 5, 0],
    [0, 0, 5],
    [1, 0, 4]
], dtype=float)
A
Out[5]:
array([[5., 4., 1.],
       [5., 5., 0.],
       [0., 0., 5.],
       [1., 0., 4.]])
Compute the SVD¶
In [6]:
U, S, VT = np.linalg.svd(A, full_matrices=True)
U, S, VT
Out[6]:
(array([[-0.67096878, -0.02362898, -0.46466034, -0.57735027],
        [-0.7197425 , -0.20541998,  0.47585716,  0.46188022],
        [-0.09389374,  0.77048084,  0.52682373, -0.34641016],
        [-0.15151103,  0.6029955 , -0.52925183,  0.57735027]]),
 array([9.6438109 , 6.36389089, 0.70555232]),
 array([[-0.73674894, -0.65146317, -0.1810987 ],
        [-0.0852072 , -0.17624687,  0.9806512 ],
        [-0.67077622,  0.73792464,  0.07434034]]))
The singular values are roughly $9.64$, $6.36$, and $0.71$: two components dominate the third, which suggests a rank-2 approximation will capture most of the matrix.
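One common way to quantify "dominant" is the fraction of squared Frobenius energy each singular value carries (a small sketch using the values printed above):

```python
import numpy as np

# Singular values from the decomposition above.
S = np.array([9.6438109, 6.36389089, 0.70555232])
energy = S**2 / np.sum(S**2)   # fraction of squared Frobenius norm per component
print(np.round(np.cumsum(energy), 4))
```

The first two components carry roughly 99.6% of the squared energy, which is why a rank-2 approximation works well here.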
Rank-$k$ Approximation¶
The rank-$k$ approximation is given by $$ A_k = \sum_{i=1}^k \sigma_i u_i v_i^\top. $$
In [8]:
def rank_k_approx(U, S, VT, k):
    return (U[:, :k] * S[:k]) @ VT[:k, :]

A2 = rank_k_approx(U, S, VT, 2)
A2
Out[8]:
array([[ 4.78009126,  4.24192282,  1.0243719 ],
       [ 5.22520783,  4.75224761, -0.02495918],
       [ 0.24932866, -0.27428785,  4.97236757],
       [ 0.7495222 ,  0.27555202,  4.02775979]])
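As a sanity check (a self-contained sketch repeating the matrix above), the spectral-norm error of the best rank-2 approximation equals the discarded singular value $\sigma_3$:

```python
import numpy as np

A = np.array([[5, 4, 1],
              [5, 5, 0],
              [0, 0, 5],
              [1, 0, 4]], dtype=float)
U, S, VT = np.linalg.svd(A, full_matrices=False)
A2 = (U[:, :2] * S[:2]) @ VT[:2, :]

# Eckart-Young: the best rank-2 approximation errs by exactly sigma_3
# in the spectral norm (ord=2 gives the largest singular value of A - A2).
print(np.linalg.norm(A - A2, ord=2), S[2])
```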
Example 2: Image Compression¶
A grayscale image is simply a matrix of pixel intensities. SVD allows us to compress the image by keeping only the dominant singular values.
In [9]:
import matplotlib.pyplot as plt

# Create a synthetic grayscale image
x = np.linspace(0, 1, 200)
y = np.linspace(0, 1, 200)
X, Y = np.meshgrid(x, y)
img = np.sin(3*np.pi*X) * np.cos(3*np.pi*Y)

plt.imshow(img, cmap='gray')
plt.title("Original Image")
plt.colorbar()
SVD of the Image¶
In [10]:
Ui, Si, VTi = np.linalg.svd(img, full_matrices=False)
Compressed Reconstructions¶
In [11]:
ks = [5, 20, 50]
fig, axes = plt.subplots(1, len(ks), figsize=(12, 4))
for ax, k in zip(axes, ks):
    Ak = (Ui[:, :k] * Si[:k]) @ VTi[:k, :]
    ax.imshow(Ak, cmap='gray')
    ax.set_title(f"k = {k}")
    ax.axis('off')
plt.show()
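Worth noting: this synthetic image is separable, since `img[i, j] = cos(3*pi*y[i]) * sin(3*pi*x[j])` is an outer product of two vectors. The matrix therefore has rank one, and even `k = 1` reconstructs it exactly; a quick check:

```python
import numpy as np

x = np.linspace(0, 1, 200)
y = np.linspace(0, 1, 200)
X, Y = np.meshgrid(x, y)
img = np.sin(3*np.pi*X) * np.cos(3*np.pi*Y)

# An outer product has one nonzero singular value; the rest are
# numerically zero (machine precision).
s = np.linalg.svd(img, compute_uv=False)
print(s[0], s[1])
```

The real photograph in the next example has a much richer spectrum, which is what makes the choice of $k$ interesting.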
Another example¶
In [12]:
import requests           # only needed if downloading the original file
from io import BytesIO    # only needed if downloading the original file
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
Load a Real Image¶
Original file: https://commons.wikimedia.org/wiki/File:Albert_Einstein_1947.jpg
In [13]:
img = Image.open("Albert_Einstein_Head_cleaned.jpg")

plt.figure(figsize=(4, 4))  # control displayed size
plt.imshow(img, cmap="gray")
plt.axis("off")
Convert to grayscale¶
In [14]:
img_gray = img.convert("L")
Convert to array and normalize¶
In [15]:
A = np.asarray(img_gray, dtype=float) / 255.0

plt.imshow(A, cmap="gray")
plt.title("Original Grayscale Image")
plt.axis("off")
Full SVD of the grayscale image¶
In [16]:
U, S, VT = np.linalg.svd(A, full_matrices=True)  # full SVD; full_matrices=False would avoid the large square U
S[:10]  # inspect leading singular values
Out[16]:
array([1758.32195011,  546.33373601,  316.15297917,  232.73036675,
        190.87663869,  176.65220364,  145.54249535,  128.44660124,
        109.65112341,  101.42766933])
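The spectrum decays quickly ($\sigma_1$ is more than three times $\sigma_2$). Once the full spectrum is at hand, the relative Frobenius error of a rank-$k$ truncation follows directly from the discarded tail, with no need to rebuild the matrix. A sketch (the stand-in values are the leading singular values printed above, truncated for illustration; in the notebook, the full `S` would be used):

```python
import numpy as np

def relative_frobenius_error(S, k):
    """Relative Frobenius error of the best rank-k truncation,
    computed from the discarded singular values S[k:]."""
    return np.sqrt(np.sum(S[k:]**2) / np.sum(S**2))

# Illustrative stand-in spectrum (use the image's full S in practice).
S_demo = np.array([1758.32, 546.33, 316.15, 232.73, 190.88])
print(relative_frobenius_error(S_demo, 2))
```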
Rank-$k$ approximation¶
In [17]:
def rank_k_approximation(U, S, VT, k):
    return (U[:, :k] * S[:k]) @ VT[:k, :]
In [18]:
ks = [5, 20, 100]

fig, axes = plt.subplots(
    1, 3,
    figsize=(8, 8),
    constrained_layout=True
)

for ax, k in zip(axes.flat, ks):
    Ak = rank_k_approximation(U, S, VT, k)
    ax.imshow(Ak, cmap="gray")
    ax.set_title(f"Rank {k}", fontsize=11)
    ax.axis("off")

plt.show()
In [19]:
m, n = A.shape
print("Original storage:", m * n)
for k in ks:
    print(f"k = {k}, compressed storage = {k * (m + n + 1)}")
Original storage: 25399710
k = 5, compressed storage = 50690
k = 20, compressed storage = 202760
k = 100, compressed storage = 1013800
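A sanity check on when compression pays off: storing the truncated factors beats the dense matrix whenever $k(m+n+1) < mn$. Using the dimensions implied by the storage counts above ($m = 5607$, $n = 4530$):

```python
# Break-even rank: largest k for which truncated storage k*(m+n+1)
# is still strictly smaller than dense storage m*n.
m, n = 5607, 4530
k_max = (m * n - 1) // (m + n + 1)
print(k_max)  # 2505
```

So any truncation below rank 2505 saves space here; at k = 100 the factors take about 4% of the dense storage.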
Conclusion¶
- SVD decomposes any matrix into orthogonal rank-one components.
- Truncating the expansion yields optimal low-rank approximations.
- Image compression is a direct, interpretation-free application.