
Matrix Factorization

What is the fundamental equation defining an eigenvalue $\lambda$ and its corresponding eigenvector $x$ for a square matrix $A$? $Ax = \lambda x$ (where $x \ne 0$).
In terms of the determinant, what condition must $\lambda$ satisfy to be an eigenvalue of matrix $A$? $\det(A - \lambda I) = 0$.
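The two defining conditions above can be verified numerically. A minimal NumPy sketch (the $2 \times 2$ matrix here is an illustrative choice):

```python
import numpy as np

# A small symmetric matrix whose eigenpairs we can check numerically.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns eigenvalues and the corresponding column eigenvectors.
eigvals, eigvecs = np.linalg.eig(A)
lam = eigvals[0]
x = eigvecs[:, 0]

# Fundamental equation: A x = lambda x (with x != 0).
assert np.allclose(A @ x, lam * x)

# Determinant condition: det(A - lambda I) = 0 at an eigenvalue.
assert abs(np.linalg.det(A - lam * np.eye(2))) < 1e-10
```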
How does the geometric multiplicity $gm(\lambda)$ of an eigenvalue compare to its algebraic multiplicity $am(\lambda)$? $gm(\lambda) \le am(\lambda)$.
According to the Spectral Theorem, what special basis exists for the vector space of a symmetric matrix $A \in \mathbb{R}^{n \times n}$? An orthonormal basis consisting of eigenvectors of $A$.
What specific properties must a matrix $A$ possess to be factorized using Cholesky decomposition? It must be symmetric and positive definite.
In the Cholesky decomposition $A = LL^\top$, what is the structural form of the matrix $L$? A lower-triangular matrix with positive diagonal elements.
Given the Cholesky factor $L$, how is the determinant of the original matrix $A$ calculated efficiently? $\det(A) = (\prod_{i} l_{ii})^2$.
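The determinant shortcut follows from $\det(A) = \det(L)\det(L^\top)$ and the fact that a triangular determinant is the product of its diagonal. A quick NumPy check (the SPD matrix is an illustrative choice):

```python
import numpy as np

# An illustrative symmetric positive definite matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# np.linalg.cholesky returns the lower-triangular factor L with positive diagonal.
L = np.linalg.cholesky(A)

# A = L L^T
assert np.allclose(L @ L.T, A)

# det(A) = (product of the diagonal entries of L)^2
assert np.isclose(np.linalg.det(A), np.prod(np.diag(L))**2)
```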
Why is Singular Value Decomposition (SVD) referred to as the 'fundamental theorem of linear algebra'? It can be applied to all matrices (not just square) and it always exists.
In the SVD $A = U\Sigma V^\top$, what mathematical property is shared by the matrices $U$ and $V$? They are both orthogonal matrices.
By convention, how are the singular values $\sigma_i$ on the diagonal of $\Sigma$ typically ordered? In descending order ($\sigma_1 \ge \sigma_2 \ge \dots \ge 0$).
Using the spectral norm, what is the approximation error when representing matrix $A$ by its rank-$k$ approximation $\hat{A}(k)$? $\|A - \hat{A}(k)\|_2 = \sigma_{k+1}$.
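This error formula can be confirmed directly by truncating the SVD. A sketch, assuming NumPy and a randomly generated matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))

# Full SVD: A = U diag(s) V^T, singular values in descending order.
U, s, Vt = np.linalg.svd(A)

k = 2
# Rank-k approximation: keep only the first k rank-1 terms sigma_i u_i v_i^T.
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the spectral-norm error equals sigma_{k+1}.
err = np.linalg.norm(A - A_k, 2)   # ord=2 gives the largest singular value
assert np.isclose(err, s[k])       # s[k] is sigma_{k+1} (0-based indexing)
```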
The eigenspace $E_{\lambda}$ is defined as the set of all eigenvectors associated with $\lambda$, together with the zero vector, and is equivalent to the _____ of $(A - \lambda I)$. kernel (or null space)
For a symmetric positive definite matrix $S = S^\top = PDP^\top$, how do the SVD matrices $U$ and $V$ relate to the eigendecomposition matrix $P$? $U = P = V$.
What is the geometric meaning of an eigenvalue $\lambda$ regarding its associated eigenvector during a linear transformation? It is the factor by which the eigenvector is stretched.
If an eigenvalue $\lambda$ is negative, what happens to the direction of its corresponding eigenvector under the linear mapping? The eigenvector's direction is reversed (it is stretched by $|\lambda|$ and flipped).
In SVD construction, the right-singular vectors $v_j$ are the orthonormal eigenvectors of which symmetric matrix? $A^\top A$.
How is the $i$-th left-singular vector $u_i$ derived from the right-singular vector $v_i$ and singular value $\sigma_i$ in SVD? $u_i = \frac{1}{\sigma_i} Av_i$.
What is the relationship between the singular values $\sigma_i$ of $A$ and the eigenvalues $\lambda_i$ of $A^\top A$? $\sigma_i = \sqrt{\lambda_i}$.
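The three construction facts above (right-singular vectors from $A^\top A$, $u_i = \frac{1}{\sigma_i} A v_i$, and $\sigma_i = \sqrt{\lambda_i}$) can be checked in a few lines. A NumPy sketch with an illustrative random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))

# Reduced SVD: U is 4x3, s has length 3, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigenvalues of the symmetric matrix A^T A, sorted descending.
lam = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# sigma_i = sqrt(lambda_i), where lambda_i are eigenvalues of A^T A.
assert np.allclose(s, np.sqrt(lam))

# u_i = (1/sigma_i) A v_i, with v_i the i-th right-singular vector.
for i in range(len(s)):
    assert np.allclose(U[:, i], A @ Vt[i, :] / s[i])
```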
In the rank-$k$ approximation $\hat{A}(k) = \sum_{i=1}^k \sigma_i u_i v_i^\top$, what is the rank of each individual term $u_i v_i^\top$? 1
Which theorem states that the rank-$k$ approximation $\hat{A}(k)$ is the optimal rank-$k$ representation of $A$ in the spectral norm sense? The Eckart-Young Theorem.
Why is the Cholesky decomposition $A = LL^\top$ particularly useful for computing the inverse of an SPD matrix? It reduces the task to inverting two triangular matrices, which is computationally easier.
A matrix $D$ is called a _____ matrix if all its off-diagonal elements are zero. diagonal
What is the result of raising a diagonal matrix $D$ to the power $k$? A diagonal matrix where each diagonal entry $d_{ii}$ is raised to the power $k$.
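These elementwise properties of diagonal matrices (powers, and likewise the inverse and determinant, which appear in later cards) are easy to confirm. A minimal check, with an illustrative diagonal matrix:

```python
import numpy as np

D = np.diag([2.0, 3.0, 5.0])

# D^k is diagonal with each entry raised to the power k.
assert np.allclose(np.linalg.matrix_power(D, 3), np.diag([8.0, 27.0, 125.0]))

# D^{-1} is diagonal with entries 1/d_ii (all d_ii != 0).
assert np.allclose(np.linalg.inv(D), np.diag([0.5, 1/3, 0.2]))

# det(D) is the product of the diagonal entries: 2 * 3 * 5 = 30.
assert np.isclose(np.linalg.det(D), 30.0)
```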
Under what condition is the Cholesky factor $L$ of a symmetric positive definite matrix unique? The diagonal elements of $L$ must be positive.
In the context of eigendecomposition, the set of all eigenvalues of a square matrix $A$ is known as the _____. eigenspectrum (or spectrum)
How does SVD assist in dimensionality reduction techniques like Principal Component Analysis (PCA)? It identifies the directions (singular vectors) along which the data has the most variance (largest singular values).
Which matrix decomposition is preferred for generating samples from a multivariate Gaussian distribution? Cholesky decomposition.
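The sampling trick works because if $z \sim \mathcal{N}(0, I)$, then $\mu + Lz \sim \mathcal{N}(\mu, LL^\top) = \mathcal{N}(\mu, \Sigma)$. A NumPy sketch (the mean, covariance, seed, and sample count here are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# Target mean and SPD covariance.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

L = np.linalg.cholesky(Sigma)

# z ~ N(0, I)  =>  mu + L z ~ N(mu, Sigma), since Cov(Lz) = L L^T = Sigma.
z = rng.standard_normal((2, 100_000))
samples = mu[:, None] + L @ z

# The empirical covariance should be close to Sigma.
assert np.allclose(np.cov(samples), Sigma, atol=0.05)
```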
The dimension of the eigenspace $E_{\lambda}$ is called the _____ multiplicity of the eigenvalue $\lambda$. geometric
In the characteristic polynomial $p_A(\lambda) = \det(A - \lambda I)$, what does the algebraic multiplicity $am(\lambda_i)$ represent? The number of times the root $\lambda_i$ appears in the polynomial.
True or False: A matrix and its transpose always possess the same eigenvectors. False (they share the same eigenvalues, but not necessarily the same eigenvectors).
What is the relationship between the singular values of $A$ and the singular values of $A^\top$? They are identical.
In SVD, the singular values $\sigma_i$ are the lengths of the _____ of the hyper-ellipse that the unit sphere is mapped to. semi-axes
For a diagonal matrix $D$, the determinant is simply the _____ of its diagonal entries. product
In a $3 \times 3$ Cholesky factorization, if $a_{11}$ is known, what is the formula for $l_{11}$? $l_{11} = \sqrt{a_{11}}$.
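The first-column entries of a $3 \times 3$ Cholesky factor ($l_{11} = \sqrt{a_{11}}$, then $l_{21} = a_{21}/l_{11}$, $l_{31} = a_{31}/l_{11}$) can be computed by hand and compared against the library result. A sketch with an illustrative SPD matrix:

```python
import numpy as np

# An illustrative 3x3 symmetric positive definite matrix.
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])

# First column of the Cholesky factor, computed by hand:
l11 = np.sqrt(A[0, 0])      # l11 = sqrt(a11)
l21 = A[1, 0] / l11         # l21 = a21 / l11
l31 = A[2, 0] / l11         # l31 = a31 / l11

# It matches the first column of the library's factor.
L = np.linalg.cholesky(A)
assert np.allclose(L[:, 0], [l11, l21, l31])
```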
Why is it impractical to implement the gradient of a deep neural network explicitly via manual derivation? The functions are too complicated, making manual implementation expensive and error-prone.
Backpropagation is a computational application of which rule from vector calculus? The chain rule.
What is the rank of a matrix $A$ if it has exactly $r$ non-zero singular values? $r$
In the SVD $A = U\Sigma V^\top$, the matrix $U$ represents a rotation in the _____ space. codomain (or target space $\mathbb{R}^m$)
The rank-$k$ approximation $\hat{A}(k)$ can be interpreted as a form of _____ compression for matrices. lossy
In $A = U\Sigma V^\top$, what are the columns of $U$ called? left-singular vectors
In $A = U\Sigma V^\top$, what are the columns of $V$ called? right-singular vectors
True or False: Every symmetric matrix is diagonalizable. True (by the Spectral Theorem).
What matrix property is required to ensure that all eigenvalues are real and an orthonormal eigenbasis exists? The matrix must be symmetric.
In Cholesky decomposition, the matrix $L$ must have _____ diagonal elements to satisfy the square-root equivalent property. positive
How is the spectral norm $\|A\|_2$ defined in terms of SVD components? It is the largest singular value, $\sigma_1$.
Which matrix decomposition can be used to compute the determinant of an SPD matrix $A$ as $\prod l_{ii}^2$? Cholesky decomposition.
The rank-$k$ approximation of a matrix effectively discards the _____ singular values to reduce noise or save space. smallest (or least significant)
In the Stonehenge image example, what fraction of the original image data does the rank-5 SVD approximation require? Approximately $0.6\%$ of the original image size.
What is the primary motivation for using Cholesky decomposition in deep stochastic models like Variational Auto-encoders (VAEs)? To perform a linear transformation of random variables for computing gradients.
If $A \in \mathbb{R}^{m \times n}$, what are the dimensions of the singular value matrix $\Sigma$ in the full SVD? $m \times n$.
The homogeneous system $(A - \lambda I)x = 0$ having a non-trivial solution is equivalent to saying $\text{rank}(A - \lambda I) < \dots$? $n$ (where $A$ is $n \times n$).
True or False: If $x$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $x$ is also an eigenvector of $A^2$ with eigenvalue $\lambda^2$. True.
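This follows from $A^2 x = A(\lambda x) = \lambda A x = \lambda^2 x$, and it is easy to confirm numerically. A minimal check with an illustrative triangular matrix:

```python
import numpy as np

# Upper-triangular, so its eigenvalues (2 and 3) sit on the diagonal.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)
lam, x = eigvals[1], eigvecs[:, 1]

# x is also an eigenvector of A^2, with eigenvalue lambda^2.
assert np.allclose(A @ A @ x, lam**2 * x)
```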
In the Cholesky factorization $A = LL^\top$, $L$ is a _____ triangular matrix. lower
How does SVD quantify the change in geometry between vector spaces $V$ and $W$ under a linear mapping $\Phi$? By decomposing the mapping into rotations (orthogonal matrices) and scaling (singular values).
What happens to the singular values $\sigma_i$ of a matrix $A$ if the matrix is multiplied by a scalar $c > 0$? They are all multiplied by $c$.
The sum $\sum_{i=1}^r \sigma_i u_i v_i^\top$ is called the _____ expansion of matrix $A$. outer-product (or SVD)
If a square matrix $A$ is invertible, how can its eigenvalues be related to the eigenvalues of $A^{-1}$? The eigenvalues of $A^{-1}$ are the reciprocals ($1/\lambda_i$) of the eigenvalues of $A$.
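The reciprocal relationship ($Ax = \lambda x \Rightarrow A^{-1}x = \frac{1}{\lambda}x$) can be verified directly. A sketch with an illustrative symmetric invertible matrix:

```python
import numpy as np

# Symmetric and invertible (both eigenvalues are positive).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

lam = np.linalg.eigvalsh(A)
lam_inv = np.linalg.eigvalsh(np.linalg.inv(A))

# Eigenvalues of A^{-1} are the reciprocals of the eigenvalues of A.
assert np.allclose(np.sort(lam_inv), np.sort(1 / lam))
```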
In the Cholesky factor $L$, the entry $l_{21}$ is calculated as $a_{21} / \dots$. $l_{11}$
The process of approximating a high-rank matrix with a lower-rank one to filter out noise is called _____ filtering. noise
In SVD, the orthogonal matrix $V$ corresponds to the orthonormal eigenbasis of which matrix? $A^\top A$.
True or False: The Singular Value Decomposition is unique for every matrix. False (singular values are unique, but singular vectors are not necessarily unique).
A symmetric matrix is positive definite if all its _____ are strictly positive. eigenvalues
What is the term for the computational technique that calculates gradients of complex functions automatically? Automatic Differentiation.
In a diagonal matrix $D$, what is the inverse $D^{-1}$ if all $d_{ii} \ne 0$? A diagonal matrix with entries $1/d_{ii}$.
The error $\|A - \hat{A}(k)\|_2$ in the spectral norm is equal to the _____ singular value. $(k+1)$-th
For any matrix $A$, the product $A^\top A$ is always symmetric and positive _____. semidefinite
Which decomposition is specifically designed to handle the square-root equivalent for symmetric, positive definite matrices? Cholesky decomposition.
The trace of a square matrix is equal to the _____ of its eigenvalues. sum
The determinant of a square matrix is equal to the _____ of its eigenvalues. product
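Both identities (trace as the sum and determinant as the product of eigenvalues) can be checked in one shot. A NumPy sketch with an illustrative matrix whose eigenvalues are $5$ and $2$:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lam = np.linalg.eigvals(A)

# trace(A) = sum of eigenvalues (here 7 = 5 + 2).
assert np.isclose(np.trace(A), lam.sum())

# det(A) = product of eigenvalues (here 10 = 5 * 2).
assert np.isclose(np.linalg.det(A), lam.prod())
```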
In SVD, the singular values $\sigma_i$ are always _____. non-negative ($\ge 0$).
What matrix product yields a symmetric matrix whose eigenvalues are the squares of the singular values of $A$? $A^\top A$ (or $AA^\top$).
If a matrix $A$ is $m \times n$ with $m < n$, at most how many non-zero singular values can it have? $m$
For an SPD matrix, the Cholesky factor $L$ is _____ , provided its diagonal elements are required to be positive. unique
In SVD, $U$ and $V$ are orthogonal, meaning $U^\top = U^{-1}$ and $V^\top = \dots$? $V^{-1}$
The Spectral Theorem implies that symmetric matrices can be diagonalized by _____ matrices. orthogonal