
Matrices as Linear Transformations

Linear Transformation

Let $T: \mathbb{R}^n \to \mathbb{R}^m$. A linear transformation satisfies these two properties: $$T(cx) = cT(x)$$ $$T(x + y) = T(x) + T(y)$$ There's a third property that follows from the first two: $$T(\vec{0}) = \vec{0}$$ Proof: $$\vec{0} + \vec{0} = \vec{0}$$ $$T(\vec{0} + \vec{0}) = T(\vec{0})$$ $$T(\vec{0}) + T(\vec{0}) = T(\vec{0})$$ Subtracting $T(\vec{0})$ from both sides gives $$T(\vec{0}) = \vec{0}$$
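As a quick numerical sanity check, here is a minimal sketch (assuming Python with NumPy and a made-up map $T(x, y) = (2x + y, 3y)$, not anything from the text above) that verifies the two properties, plus the zero-vector consequence, on sample vectors.

```python
import numpy as np

# A made-up linear map T(x, y) = (2x + y, 3y), chosen only for illustration
def T(v):
    return np.array([2 * v[0] + v[1], 3 * v[1]])

x = np.array([1.0, 2.0])
y = np.array([-3.0, 0.5])
c = 4.0

print(np.allclose(T(c * x), c * T(x)))           # T(cx) = cT(x)
print(np.allclose(T(x + y), T(x) + T(y)))        # T(x + y) = T(x) + T(y)
print(np.allclose(T(np.zeros(2)), np.zeros(2)))  # T(0) = 0 follows
```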

Matrices tie really well into this, but before we show how, we have to define a few things.

Multiplying Matrices

We define the product of two matrices like this: $$AB = A\begin{bmatrix}b_1 & b_2 & ... & b_n\end{bmatrix} = \begin{bmatrix}Ab_1 & Ab_2 & ... & Ab_n\end{bmatrix}$$ Another way you can define this is entry by entry: $$(AB)_{ij} = A_i \cdot b_j \text{ (dot product)}$$ where $A_i$ is the $i$th row of $A$ and $b_j$ is the $j$th column of $B$. Now, there are two ways of defining $Ax$. The first is the column way: $$x_1\vec{a_1} + ... + x_n\vec{a_n}$$ In my opinion, this is more intuitive, as I explain below. The second is the row way: $$\begin{bmatrix}A_1 \cdot x \\ ... \\ A_m \cdot x\end{bmatrix}$$
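To make these definitions concrete, here is a minimal sketch (assuming Python with NumPy; the matrices $A$, $B$ and vector $x$ are made up) that checks the column-by-column definition of $AB$ and both the column way and the row way of computing $Ax$ against NumPy's built-in product.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])
x = np.array([7.0, 8.0, 9.0])

# AB column by column: the j-th column of AB is A times the j-th column of B
AB_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])
print(np.allclose(AB_cols, A @ B))  # True

# Column way: Ax is a linear combination of the columns of A
col_way = sum(x[j] * A[:, j] for j in range(A.shape[1]))
# Row way: each entry of Ax is the dot product of a row of A with x
row_way = np.array([A[i, :] @ x for i in range(A.shape[0])])
print(np.allclose(col_way, A @ x), np.allclose(row_way, A @ x))  # True True
```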

Linear Transformation to Matrix Transformation

Let $T: \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation. The theorem states that there exists an $m \times n$ matrix $A$ such that $$T(x) = Ax \text{ for all }x \text{ in } \mathbb{R}^n$$ Not only that, but $A$ equals $$\begin{bmatrix}T(e_1) & T(e_2) & T(e_3) & ... & T(e_n)\end{bmatrix}$$
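Here is a minimal sketch of the theorem in action (assuming Python with NumPy and a hypothetical linear map $T: \mathbb{R}^3 \to \mathbb{R}^2$, not taken from the text): the matrix $A$ is built column by column as $T(e_j)$, and multiplying by it reproduces $T$.

```python
import numpy as np

# A hypothetical linear map T: R^3 -> R^2, chosen only for illustration
def T(v):
    return np.array([v[0] + 2 * v[1], 3 * v[2] - v[0]])

n = 3
# Build A column by column: the j-th column is T(e_j)
A = np.column_stack([T(e) for e in np.eye(n)])

x = np.array([1.0, -2.0, 5.0])
print(A)                         # the 2x3 matrix [T(e1) T(e2) T(e3)]
print(np.allclose(T(x), A @ x))  # True: T(x) = Ax
```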

Why?

$$T\left(\begin{bmatrix}x_1 \\ x_2 \\ ... \\ x_n\end{bmatrix}\right) = T\left(I_n\begin{bmatrix}x_1 \\ x_2 \\ ... \\ x_n\end{bmatrix}\right) = T\left(\begin{bmatrix}\vec{e_1} & \vec{e_2} & ... & \vec{e_n}\end{bmatrix}\begin{bmatrix}x_1 \\ x_2 \\ ... \\ x_n\end{bmatrix}\right)$$ $$= T\left(x_1\vec{e_1} + x_2\vec{e_2} + ... + x_n\vec{e_n}\right) = x_1T\left(\vec{e_1}\right) + x_2T\left(\vec{e_2}\right) + ... + x_nT\left(\vec{e_n}\right)$$ The last expression is exactly the product $Ax$: $$= \begin{bmatrix} T\left(\vec{e_1}\right) & T\left(\vec{e_2}\right) & ... & T\left(\vec{e_n}\right)\end{bmatrix}\begin{bmatrix}x_1 \\ x_2 \\ ... \\ x_n\end{bmatrix}$$ We just proved that if $T$ is a linear transformation, it can be represented by that matrix. What about the other way around? Let's try it: $$A(cx + dy) = A(cx) + A(dy) \text{ (matrix mult. is distributive; the proof is short)}$$ $$=cAx + dAy \text{ (the proof of this is also easy)}$$ $$=cA(x) + dA(y)$$ This satisfies both conditions of a linear transformation, so every matrix defines one.
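And a minimal sketch of the converse direction (assuming Python with NumPy; the random matrix and vectors are arbitrary): for any matrix $A$, the map $x \mapsto Ax$ satisfies the linearity condition numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))              # an arbitrary 4x3 matrix
x, y = rng.standard_normal(3), rng.standard_normal(3)
c, d = 2.5, -1.5

# x -> Ax satisfies both conditions of a linear transformation
print(np.allclose(A @ (c * x + d * y), c * (A @ x) + d * (A @ y)))  # True
```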

Instead of thinking of a linear transformation as changing each vector in a different way, you can think of it as a change of basis. Instead of shifting the point, you're shifting the entire coordinate system. For example, the vector <1,2,3> moves 1 unit along the first basis vector, 2 units along the second, and 3 units along the third. However, instead of the basis vectors being e1, e2, and e3, they're now T(e1), T(e2), and T(e3). The point is still <1,2,3>, but with respect to new axes that have been rotated/stretched. So, Tx tells you what happens to the point x when the axes are changed. Additionally, multiplying by the inverse of T can be thought of as shifting the coordinates back.
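The change-of-basis reading can also be checked numerically. Here is a minimal sketch (assuming Python with NumPy and a made-up invertible matrix $A$ whose columns stand in for T(e1), T(e2), T(e3)):

```python
import numpy as np

# A hypothetical invertible 3x3 matrix; its columns play the role of T(e1), T(e2), T(e3)
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
x = np.array([1.0, 2.0, 3.0])

# Ax = 1*T(e1) + 2*T(e2) + 3*T(e3): the same coordinates, measured along the new axes
same_point = 1 * A[:, 0] + 2 * A[:, 1] + 3 * A[:, 2]
print(np.allclose(A @ x, same_point))             # True

# Multiplying by the inverse shifts the coordinates back
print(np.allclose(np.linalg.solve(A, A @ x), x))  # True
```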

Function Composition

Let $T: \mathbb{R}^n \to \mathbb{R}^m$ and $S: \mathbb{R}^m \to \mathbb{R}^p$ be linear transformations. Then $$(S \circ T) (x) = \begin{bmatrix} S\left(\vec{e_1}\right) & S\left(\vec{e_2}\right) & ... & S\left(\vec{e_m}\right)\end{bmatrix}\begin{bmatrix} T\left(\vec{e_1}\right) & T\left(\vec{e_2}\right) & ... & T\left(\vec{e_n}\right)\end{bmatrix}\begin{bmatrix}x_1 \\ ... \\ x_n\end{bmatrix}$$ and the matrix of $S \circ T$ is $$\begin{bmatrix}S(T(e_1)) & ... & S(T(e_n))\end{bmatrix}$$ Why? $$(S \circ T)(x) = S(T(x)) \text{ by definition}$$ $$=S\left(\begin{bmatrix} T\left(\vec{e_1}\right) & T\left(\vec{e_2}\right) & ... & T\left(\vec{e_n}\right)\end{bmatrix}\begin{bmatrix}x_1 \\ ... \\ x_n\end{bmatrix}\right)$$ $$= \begin{bmatrix} S\left(\vec{e_1}\right) & S\left(\vec{e_2}\right) & ... & S\left(\vec{e_m}\right)\end{bmatrix}\begin{bmatrix} T\left(\vec{e_1}\right) & T\left(\vec{e_2}\right) & ... & T\left(\vec{e_n}\right)\end{bmatrix}\begin{bmatrix}x_1 \\ ... \\ x_n\end{bmatrix}$$ Alternatively, you could've gone from the first line to this: $$=S(x_1T(e_1) + ... + x_nT(e_n)) \text{ by the matrix representation of T}$$ $$=x_1S(T(e_1)) + ... + x_nS(T(e_n)) \text{ property of linear transformations}$$ $$=\begin{bmatrix}S(T(e_1)) & ... & S(T(e_n))\end{bmatrix}\begin{bmatrix}x_1 \\ ... \\ x_n\end{bmatrix}$$ I hope this provides some insight: the composition of linear transformations is itself a linear transformation, and its matrix is the product of the two matrices. So, when you multiply a matrix by its inverse, you could imagine that this inverse is made up of many different matrices. Therefore, you could imagine a way to algorithmically calculate an inverse.
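As a last sketch (assuming Python with NumPy; the matrices $A$ and $B$ standing in for $T$ and $S$ are made up), the composition can be checked numerically: applying T and then S is the same as multiplying by the product of their matrices, and that product's $j$th column is S(T(e_j)).

```python
import numpy as np

# Hypothetical matrices: A represents T: R^3 -> R^2, B represents S: R^2 -> R^4
A = np.array([[1.0, 0.0, 2.0],
              [3.0, 1.0, 0.0]])
B = np.array([[1.0, 1.0],
              [0.0, 2.0],
              [4.0, 0.0],
              [1.0, 3.0]])
x = np.array([1.0, -1.0, 2.0])

# S(T(x)) equals (BA)x: composing the maps multiplies the matrices
print(np.allclose(B @ (A @ x), (B @ A) @ x))  # True

# The matrix of S∘T column by column: its j-th column is S(T(e_j))
C = np.column_stack([B @ (A @ e) for e in np.eye(3)])
print(np.allclose(C, B @ A))                  # True
```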


Matrix Equation Ax = b

Determining if a System of Equations is Consistent