Singular Value Decomposition: Applications¶
This notebook demonstrates two concrete applications of SVD:
- Low-rank structure in data matrices
- Image compression
The philosophy throughout is:
Compute first, interpret later.
Mathematical Reminder: SVD¶
For any matrix $A \in \mathbb{R}^{m \times n}$, $$ A = \sum_{i=1}^r \sigma_i u_i v_i^\top, $$ where:
- $r = \mathrm{rank}(A)$
- $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are the singular values
- each term $\sigma_i u_i v_i^\top$ is rank one
- truncating to the first $k$ terms gives the best rank-$k$ approximation, in both the spectral and Frobenius norms (Eckart–Young theorem)
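A quick numerical check of this identity (a minimal sketch on an arbitrary small random matrix; the shape and seed are purely illustrative):

```python
import numpy as np

# Rebuild a small matrix from its rank-one SVD terms sigma_i * u_i v_i^T.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, S, VT = np.linalg.svd(A, full_matrices=False)
A_rebuilt = sum(S[i] * np.outer(U[:, i], VT[i, :]) for i in range(len(S)))
print(np.allclose(A, A_rebuilt))  # True
```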
In [4]:
import numpy as np
Example 1: Movie Ratings Matrix¶
We start with a concrete data matrix. No structure is assumed a priori.
In [5]:
A = np.array([
    [5, 4, 1],
    [5, 5, 0],
    [0, 0, 5],
    [1, 0, 4]
], dtype=float)
A
Out[5]:
array([[5., 4., 1.],
       [5., 5., 0.],
       [0., 0., 5.],
       [1., 0., 4.]])
Compute the SVD¶
In [6]:
U, S, VT = np.linalg.svd(A, full_matrices=True)
U, S, VT
Out[6]:
(array([[-0.67096878, -0.02362898, -0.46466034, -0.57735027],
        [-0.7197425 , -0.20541998,  0.47585716,  0.46188022],
        [-0.09389374,  0.77048084,  0.52682373, -0.34641016],
        [-0.15151103,  0.6029955 , -0.52925183,  0.57735027]]),
 array([9.6438109 , 6.36389089, 0.70555232]),
 array([[-0.73674894, -0.65146317, -0.1810987 ],
        [-0.0852072 , -0.17624687,  0.9806512 ],
        [-0.67077622,  0.73792464,  0.07434034]]))
The singular values are roughly $9.64$, $6.36$, and $0.71$: two components dominate the third, which suggests a rank-2 approximation will capture most of the matrix.
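One common way to quantify "dominant" is the fraction of squared Frobenius energy each singular value carries (a small sketch using the values printed above):

```python
import numpy as np

# Singular values from the decomposition above.
S = np.array([9.6438109, 6.36389089, 0.70555232])
energy = S**2 / np.sum(S**2)   # fraction of squared Frobenius norm per component
print(np.round(np.cumsum(energy), 4))
```

The first two components carry roughly 99.6% of the squared energy, which is why a rank-2 approximation works well here.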
Rank-$k$ Approximation¶
The rank-$k$ approximation is given by $$ A_k = \sum_{i=1}^k \sigma_i u_i v_i^\top. $$
In [8]:
def rank_k_approx(U, S, VT, k):
    return (U[:, :k] * S[:k]) @ VT[:k, :]

A2 = rank_k_approx(U, S, VT, 2)
A2
Out[8]:
array([[ 4.78009126,  4.24192282,  1.0243719 ],
       [ 5.22520783,  4.75224761, -0.02495918],
       [ 0.24932866, -0.27428785,  4.97236757],
       [ 0.7495222 ,  0.27555202,  4.02775979]])
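As a sanity check (a self-contained sketch repeating the matrix above), the spectral-norm error of the best rank-2 approximation equals the discarded singular value $\sigma_3$:

```python
import numpy as np

A = np.array([[5, 4, 1],
              [5, 5, 0],
              [0, 0, 5],
              [1, 0, 4]], dtype=float)
U, S, VT = np.linalg.svd(A, full_matrices=False)
A2 = (U[:, :2] * S[:2]) @ VT[:2, :]

# Eckart-Young: the best rank-2 approximation errs by exactly sigma_3
# in the spectral norm (ord=2 gives the largest singular value of A - A2).
print(np.linalg.norm(A - A2, ord=2), S[2])
```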
Example 2: Image Compression¶
A grayscale image is simply a matrix of pixel intensities. SVD allows us to compress the image by keeping only the dominant singular values.
In [9]:
import matplotlib.pyplot as plt

# Create a synthetic grayscale image
x = np.linspace(0, 1, 200)
y = np.linspace(0, 1, 200)
X, Y = np.meshgrid(x, y)
img = np.sin(3*np.pi*X) * np.cos(3*np.pi*Y)

plt.imshow(img, cmap='gray')
plt.title("Original Image")
plt.colorbar()
SVD of the Image¶
In [10]:
Ui, Si, VTi = np.linalg.svd(img, full_matrices=False)
Compressed Reconstructions¶
In [11]:
ks = [5, 20, 50]
fig, axes = plt.subplots(1, len(ks), figsize=(12, 4))
for ax, k in zip(axes, ks):
    Ak = (Ui[:, :k] * Si[:k]) @ VTi[:k, :]
    ax.imshow(Ak, cmap='gray')
    ax.set_title(f"k = {k}")
    ax.axis('off')
plt.show()
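Worth noting: this synthetic image is separable, since `img[i, j] = cos(3*pi*y[i]) * sin(3*pi*x[j])` is an outer product of two vectors. The matrix therefore has rank one, and even `k = 1` reconstructs it exactly; a quick check:

```python
import numpy as np

x = np.linspace(0, 1, 200)
y = np.linspace(0, 1, 200)
X, Y = np.meshgrid(x, y)
img = np.sin(3*np.pi*X) * np.cos(3*np.pi*Y)

# An outer product has one nonzero singular value; the rest are
# numerically zero (machine precision).
s = np.linalg.svd(img, compute_uv=False)
print(s[0], s[1])
```

The real photograph in the next example has a much richer spectrum, which is what makes the choice of $k$ interesting.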
Another example¶
In [12]:
import requests           # only needed if downloading the original file
from io import BytesIO    # only needed if downloading the original file
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
Load a Real Image¶
Original file: https://commons.wikimedia.org/wiki/File:Albert_Einstein_1947.jpg
In [13]:
img = Image.open("Albert_Einstein_Head_cleaned.jpg")

plt.figure(figsize=(4, 4))  # control displayed size
plt.imshow(img, cmap="gray")
plt.axis("off")
Convert to grayscale¶
In [14]:
img_gray = img.convert("L")
Convert to array and normalize¶
In [15]:
A = np.asarray(img_gray, dtype=float) / 255.0

plt.imshow(A, cmap="gray")
plt.title("Original Grayscale Image")
plt.axis("off")
Full SVD of the grayscale image¶
In [16]:
U, S, VT = np.linalg.svd(A, full_matrices=True)  # full SVD; full_matrices=False would avoid the large square U
S[:10]  # inspect leading singular values
Out[16]:
array([1758.32195011,  546.33373601,  316.15297917,  232.73036675,
        190.87663869,  176.65220364,  145.54249535,  128.44660124,
        109.65112341,  101.42766933])
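The spectrum decays quickly ($\sigma_1$ is more than three times $\sigma_2$). Once the full spectrum is at hand, the relative Frobenius error of a rank-$k$ truncation follows directly from the discarded tail, with no need to rebuild the matrix. A sketch (the stand-in values are the leading singular values printed above, truncated for illustration; in the notebook, the full `S` would be used):

```python
import numpy as np

def relative_frobenius_error(S, k):
    """Relative Frobenius error of the best rank-k truncation,
    computed from the discarded singular values S[k:]."""
    return np.sqrt(np.sum(S[k:]**2) / np.sum(S**2))

# Illustrative stand-in spectrum (use the image's full S in practice).
S_demo = np.array([1758.32, 546.33, 316.15, 232.73, 190.88])
print(relative_frobenius_error(S_demo, 2))
```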
Rank-$k$ approximation¶
In [17]:
def rank_k_approximation(U, S, VT, k):
    return (U[:, :k] * S[:k]) @ VT[:k, :]
In [18]:
ks = [5, 20, 100]

fig, axes = plt.subplots(
    1, 3,
    figsize=(8, 8),
    constrained_layout=True
)

for ax, k in zip(axes.flat, ks):
    Ak = rank_k_approximation(U, S, VT, k)
    ax.imshow(Ak, cmap="gray")
    ax.set_title(f"Rank {k}", fontsize=11)
    ax.axis("off")

plt.show()
In [19]:
m, n = A.shape
print("Original storage:", m * n)
for k in ks:
    print(f"k = {k}, compressed storage = {k * (m + n + 1)}")
Original storage: 25399710
k = 5, compressed storage = 50690
k = 20, compressed storage = 202760
k = 100, compressed storage = 1013800
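A sanity check on when compression pays off: storing the truncated factors beats the dense matrix whenever $k(m+n+1) < mn$. Using the dimensions implied by the storage counts above ($m = 5607$, $n = 4530$):

```python
# Break-even rank: largest k for which truncated storage k*(m+n+1)
# is still strictly smaller than dense storage m*n.
m, n = 5607, 4530
k_max = (m * n - 1) // (m + n + 1)
print(k_max)  # 2505
```

So any truncation below rank 2505 saves space here; at k = 100 the factors take about 4% of the dense storage.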
Conclusion¶
- SVD decomposes any matrix into orthogonal rank-one components.
- Truncating the expansion yields optimal low-rank approximations.
- Image compression is a direct, interpretation-free application.