A Matrix Tutorial

This is a general introduction to matrices, which are part of the branch of mathematics called linear algebra. One of my other pages uses matrices, and I couldn't find a tutorial to link to, so I've written one myself. I think it came out quite well. Not perfect, but better than nothing.

From algebra to matrices
Simultaneous equations
Specifying elements
Cartesian coordinates
Coordinate transformations
Inversion
Determinants
Vector spaces
Matrix calculations in Microsoft Excel

From Algebra to Matrices

Okay, consider the following formula:

5x + 3y + z

Interesting, isn't it? If you don't understand simple algebra like that, I won't explain it here. If you do, carry on reading. It can be rewritten in matrix form (within the limitations of ASCII) as:

(5 3 1)(x)
       (y)
       (z)
  

They both mean the same thing. So, this is a simple case of matrix multiplication. In fact, it's still algebra, so this section heading is misleading. If x, y and z represent numbers, the result of the formula is also a number. The result of the matrix multiplication is a matrix consisting of one cell. But, if you think of it as a number, it's the same one as you had before.
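
For anyone who'd rather see that as code, here's a minimal Python sketch of a row matrix times a column matrix. The function name row_times_column and the sample values for x, y and z are made up for illustration.

```python
def row_times_column(row, column):
    """Multiply a row matrix by a column matrix, giving a single number."""
    if len(row) != len(column):
        raise ValueError("row and column must have the same length")
    return sum(r * c for r, c in zip(row, column))

# Sample values, chosen arbitrarily for illustration.
x, y, z = 2, 4, 6
result = row_times_column([5, 3, 1], [x, y, z])

# The matrix product and the plain formula give the same number.
print(result == 5 * x + 3 * y + z)
```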

Simultaneous Equations

It's a standard maths exercise to solve a system of equations like this:

5x + 3y +  z = 1
2x + 3y + 5z = 2
 x + 9y + 6z = 3
  

For all these equations to be true, x, y and z are uniquely defined. Later on, I'll show how matrices can be used to find those values. For the moment, be content with rewriting the three equations as a single matrix equation:

(5 3 1)(x) = (1)
(2 3 5)(y) = (2)
(1 9 6)(z) = (3)
  

So now you see where square matrices can come from.

Specifying Elements

Each number in a matrix is called an element. The elements can be referred to by their row and column number, where row 1, column 1 is the top left-hand corner. By convention, a row number is called i when you want to be vague about it, and a column number is called j. The i and j should be subscripts, just in case your browser doesn't support them. Anyway, setting

    (5 3 1)
A = (2 3 5)
    (1 9 6)
  

we have A(1,1) = 5, A(1,2) = 3, A(1,3) = 1, A(2,1) = 2, A(2,2) = 3, A(2,3) = 5, A(3,1) = 1, A(3,2) = 9, A(3,3) = 6.

In plain ASCII, you may write Aij as A(i,j) or A_i_j.

So, the element in the ith row and jth column of a matrix A is referred to as Aij.
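
If you store a matrix in a program, watch out for where the counting starts. Here's a small Python sketch, assuming the matrix is kept as a list of rows. Python counts from 0, so the helper (called element here, a name of my own) subtracts 1 to match the row and column numbers used above.

```python
A = [
    [5, 3, 1],
    [2, 3, 5],
    [1, 9, 6],
]

def element(matrix, i, j):
    """Return the element in row i, column j, counting from 1 as in the text."""
    return matrix[i - 1][j - 1]

print(element(A, 1, 1))  # row 1, column 1: the top left-hand corner
print(element(A, 3, 2))  # row 3, column 2
```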

Cartesian Coordinates

Numbers are boring, so let's say x, y and z are the axes of Cartesian coordinates. That means

(5 3 1)(x)
(2 3 5)(y)
(1 9 6)(z)
  

represents the points (5,3,1), (2,3,5) and (1,9,6). Taking them together could describe a triangle, although I expect it would be an odd looking one. Anyway, say you want to stretch the shape to make it twice as large in the x direction. One result is

(10 3 1)(x)
( 4 3 5)(y)
( 2 9 6)(z)
  

This means the shape also moves along the x axis, but don't worry about it. The transformation can be written

(5 3 1)(2 0 0)(x)
(2 3 5)(0 1 0)(y)
(1 9 6)(0 0 1)(z)
  

It all depends on the matrix equation

(5 3 1)(2 0 0)   (10 3 1)
(2 3 5)(0 1 0) = ( 4 3 5)
(1 9 6)(0 0 1)   ( 2 9 6)
  

You may not immediately see why this should be true. However, you have seen

(5 3 1)(x)   (5x + 3y +  z)
(2 3 5)(y) = (2x + 3y + 5z)
(1 9 6)(z)   ( x + 9y + 6z)
  

Even if I didn't write it that way at the time. So, you know how to multiply a square matrix by a column matrix. It's like multiplying three different row matrices by the same column matrix. Multiplying square matrices is like multiplying the same matrix by three different column matrices. Hence

(5 3 1)(2)   (10)
(2 3 5)(0) = ( 4)
(1 9 6)(0)   ( 2)

(5 3 1)(0)   (3)
(2 3 5)(1) = (3)
(1 9 6)(0)   (9)

(5 3 1)(0)   (1)
(2 3 5)(0) = (5)
(1 9 6)(1)   (6)
  

leads to

(5 3 1)(2 0 0)   (10 3 1)
(2 3 5)(0 1 0) = ( 4 3 5)
(1 9 6)(0 0 1)   ( 2 9 6)
  

All matrix multiplication can be thought of as rows on the left multiplying by columns on the right.
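
The rows-times-columns rule can be sketched in a few lines of Python. This assumes matrices are stored as lists of rows, and matmul is just a name I've picked. It's meant to show the idea rather than to be fast.

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    if len(left[0]) != len(right):
        raise ValueError("columns on the left must match rows on the right")
    columns = list(zip(*right))  # the columns of the right-hand matrix
    # Each result element is one row on the left times one column on the right.
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]

A = [[5, 3, 1], [2, 3, 5], [1, 9, 6]]
stretch_x = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(A, stretch_x))
```

Run on the stretching example, it should reproduce the matrix with 10, 4 and 2 in its first column.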

Any transformation of the shape can be done using matrices. Simple ones are enlargement:

(5 3 1)(a 0 0)(x)   (5a 3a  a)(x)
(2 3 5)(0 a 0)(y) = (2a 3a 5a)(y)
(1 9 6)(0 0 a)(z)   ( a 9a 6a)(z)
  

Reflection in the y-z plane, which flips the sign of every x coordinate:

(5 3 1)(-1 0 0)(x)   (-5 3 1)(x)
(2 3 5)( 0 1 0)(y) = (-2 3 5)(y)
(1 9 6)( 0 0 1)(z)   (-1 9 6)(z)
  

And the clever transformation of doing nothing at all:

(5 3 1)(1 0 0)(x)   (5 3 1)(x)
(2 3 5)(0 1 0)(y) = (2 3 5)(y)
(1 9 6)(0 0 1)(z)   (1 9 6)(z)
  

The matrix

(1 0 0)
(0 1 0)
(0 0 1)
  

(and similar ones of different sizes) is called the identity matrix. It's written as the letter I, and is the matrix equivalent of the number 1.

Another trick is to rotate about the z axis:

(5 3 1)( cos(t) sin(t) 0)(x)
(2 3 5)(-sin(t) cos(t) 0)(y)
(1 9 6)(      0      0 1)(z)
  

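With the points stored as columns instead of rows, the same rotation is done by multiplying on the left, using the rotation matrix with its rows and columns swapped:

(cos(t) -sin(t) 0)(5 2 1)
(sin(t)  cos(t) 0)(3 3 9)
(     0       0 1)(1 5 6)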
  

Note that it does matter which way round you do the multiplication. Matrix multiplication does not commute. It won't in general be possible to do the multiplication both ways round. The number of columns on the left has to match the number of rows on the right.
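
Here's a quick Python check that the order matters. It assumes a rotation of 90 degrees, so that cos(t) = 0 and sin(t) = 1 and everything stays in whole numbers. The matmul helper is my own, not a library function.

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    if len(left[0]) != len(right):
        raise ValueError("columns on the left must match rows on the right")
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]

A = [[5, 3, 1], [2, 3, 5], [1, 9, 6]]
# Rotation about the z axis with t = 90 degrees: cos(t) = 0, sin(t) = 1.
R = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]

print(matmul(A, R))  # every row of A, read as a point, gets rotated
print(matmul(R, A))  # a different matrix: the rows of A get shuffled instead

row = [[5, 3, 1]]    # 1 row, 3 columns
print(matmul(row, A))  # fine: 3 columns on the left, 3 rows on the right
try:
    matmul(A, row)     # 3 columns on the left, but only 1 row on the right
except ValueError as error:
    print("cannot multiply:", error)
```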

Coordinate Transformations

So far, I've talked about transforming shapes. However, you can also keep the shape the same, and change the set of coordinates you describe it in.

For example,

(5 3 1)(2 0 0)(x)
(2 3 5)(0 1 0)(y)
(1 9 6)(0 0 1)(z)
  

Could multiply out as

(10 3 1)(x)
( 4 3 5)(y)
( 2 9 6)(z)
  

As before. Or it could be

(5 3 1)(2x)
(2 3 5)(y)
(1 9 6)(z)
  

So the effective x coordinate becomes 2x. This is a coordinate transformation. Shrinking the axis has the same effect as enlarging the shape.
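
The reason both readings agree is that it doesn't matter which pair of matrices you multiply first, as long as you keep them in the same order. A small Python sketch of that, with arbitrary sample values standing in for x, y and z (matmul is my own helper):

```python
def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in left]

A = [[5, 3, 1], [2, 3, 5], [1, 9, 6]]
S = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
v = [[1], [2], [3]]  # sample values for x, y and z, written as a column

shape_first = matmul(matmul(A, S), v)   # stretch the shape, then apply x, y, z
coords_first = matmul(A, matmul(S, v))  # turn x into 2x, then apply the shape
print(shape_first == coords_first)
```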

This sort of stuff is really important in 3D computer graphics. If you add a spaceship to the scene, it will be defined as a set of polygons relative to some simple coordinate system. To decide where it appears, you define a transformation from the spaceship's coordinates to the "world coordinates". To make it move, you transform that transformation matrix.

This page has more about matrices with relevance to computer graphics.

Inversion

Finding the inverse of a matrix is the hardest thing you'll learn here. Once you're past this section, it's all plain sailing!

Earlier on, I gave this equation:

(5 3 1)(x) = (1)
(2 3 5)(y) = (2)
(1 9 6)(z) = (3)
  

But I didn't solve it. To do this, I need to invert that square matrix on the left. Multiplying a matrix by its inverse gives the identity matrix, whichever way round you do it. Only square matrices can be inverted, and not all of them. The inverse is written as raising to the power -1. Let's multiply both sides of that equation by the inverse of the square matrix.

(5 3 1)^-1(5 3 1)(x) = (5 3 1)^-1(1)
(2 3 5)   (2 3 5)(y) = (2 3 5)   (2)
(1 9 6)   (1 9 6)(z) = (1 9 6)   (3)
  

and simplify it

(1 0 0)(x) = (5 3 1)^-1(1)
(0 1 0)(y) = (2 3 5)   (2)
(0 0 1)(z) = (1 9 6)   (3)
  

and again

(x) = (5 3 1)^-1(1)
(y) = (2 3 5)   (2)
(z) = (1 9 6)   (3)
  

To solve the equation, all we need to do is invert the matrix, and do a matrix multiplication. 3*3 matrices are tricky, so here's how you invert a general 2*2 matrix:

(a b)^-1 = ( d -b)/(ad-bc)
(c d)      (-c  a)
  

(The (ad-bc) isn't a matrix. Multiplying a matrix by a number means you multiply every element by that number.) Work it out:

(a b)( d -b)/(ad-bc) = (ad-bc  -ab+ba)/(ad-bc)
(c d)(-c  a)           (cd-dc  -cb+da)

                     = (ad-bc     0)/(ad-bc)
                       (    0 ad-bc)

                     = (1 0)
                       (0 1)
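
The 2*2 formula is short enough to try out in Python. This is a sketch using the standard fractions module so the answer stays exact. The name inverse_2x2 and the example matrix M are my own choices.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Invert a 2*2 matrix [[a, b], [c, d]] of integers using the formula above."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("no inverse exists because ad-bc is zero")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[4, 7], [2, 6]]  # arbitrary example, with ad-bc = 10
M_inv = inverse_2x2(M)

# Multiply M by its inverse; the result should be the identity matrix.
product = [[sum(M[i][k] * M_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product == [[1, 0], [0, 1]])
```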
  

But, like I said, 3*3 matrices are tricky. Smart people can still work them out by hand. Really smart people use a computer. See below for how to do this in Microsoft Excel. My computer tells me that the solution to the equation above is:

(x)   (0.0638)
(y) = (0.1277)
(z)   (0.2979)

so

(5 3 1)(0.0638) = (1)
(2 3 5)(0.1277) = (2)
(1 9 6)(0.2979) = (3)
  

You can check it with a calculator.
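
If you'd rather let a program do the checking, here's a Python sketch that solves the three equations exactly. It uses Gauss-Jordan elimination, which is not a method described on this page, just a convenient way to get exact fractions out. The function name solve is my own.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * unknowns = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Build the rows of [matrix | rhs] out of exact fractions.
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix cannot be inverted")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so its diagonal entry becomes 1.
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        # Clear this column out of every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

A = [[5, 3, 1], [2, 3, 5], [1, 9, 6]]
x, y, z = solve(A, [1, 2, 3])
print(x, y, z)                        # exact fractions
print(float(x), float(y), float(z))   # decimals, to compare with the above
```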

Determinants

All square matrices have a unique determinant. The determinant is a number. The determinant of A is written as |A|, and the same bars are used when A is written out in full as a square matrix. In general, |A||B| = |AB|. So, |A||inv(A)| = |I|. For what should be obvious reasons, the determinant of the identity matrix is 1. Then, |inv(A)| = 1/|A|. The determinant then tells you something about the inverse. In fact, inv(A) exists if and only if |A| is not zero. Also, if A contains only integers, its inverse will contain only integers if and only if |A| = +/- 1.
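
Those rules are easy to spot-check on small cases. Here's a Python sketch using the (ad-bc) expression from the 2*2 inverse formula as the determinant. The matrices A and B are arbitrary examples of mine, with A picked so that its determinant is 1.

```python
def det2(m):
    """Determinant (ad-bc) of a 2*2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(p, q):
    """Multiply two 2*2 matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [5, 3]]        # |A| = 1
B = [[1, 4], [2, 3]]        # |B| = -5
A_inv = [[3, -1], [-5, 2]]  # all integers, as expected when |A| = 1

print(det2(A) * det2(B) == det2(matmul2(A, B)))  # |A||B| = |AB|
print(matmul2(A, A_inv))                         # the identity matrix
print(det2(A) * det2(A_inv) == 1)                # |A||inv(A)| = |I| = 1
```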

So, this determinant thing looks useful. How do you calculate it? Here it is for a general 2*2 matrix:

|a b| = (ad-bc)
|c d|          
  

That should look familiar from the inverse formula. For a 3*3 matrix:

|a b c|   a|e f|   d|b c|   i|b c|
|d e f| =  |j k| -  |j k| +  |e f|
|i j k|                           

          a|e f|   b|d f|   c|d e|
        =  |j k| -  |i k| +  |i j|
  

There's a general pattern for determinants:

  1. Go down one column, or along one row
  2. Multiply each element in that row or column by the determinant of the submatrix excluding the row and column the element is in. (This is harder to describe than do. See the example.)
  3. Alternately add and subtract the results.
  4. Voila!

Here's the 3*3 matrix I gave as an example above:

|5 3 1|   5|3 5|   2|3 1|   |3 1|
|2 3 5| =  |9 6| -  |9 6| + |3 5|
|1 9 6|                          

        = 5*(-27) - 2*9 + 12
        = -135 - 18 + 12 = -141
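
The general pattern turns into a short recursive function. This Python sketch goes along the first row rather than down the first column as in the worked example, which should give the same answer. The name determinant is my own.

```python
def determinant(m):
    """Determinant of a square matrix, expanding along the first row."""
    if not m:
        return 1  # the empty matrix has determinant 1, which ends the recursion
    total = 0
    for j, value in enumerate(m[0]):
        # Submatrix excluding row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        # Alternately add and subtract the results.
        sign = -1 if j % 2 else 1
        total += sign * value * determinant(minor)
    return total

print(determinant([[5, 3, 1], [2, 3, 5], [1, 9, 6]]))
```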
  

When you start with an integer matrix, the value of its determinant is a common denominator of the elements in its inverse (though not always the lowest one). So, multiply each element in the inverse by the original determinant, and you get another integer matrix out. (If the determinant is negative, take its absolute value.) This is a useful trick for turning the numbers your computer spits out into fractions. To use the running example:

(x)   (0.0638)
(y) = (0.1277)
(z)   (0.2979)

Now that we know the determinant for the matrix above is -141, that solution can be re-written:

(x)    1 (0.0638*141)    1 ( 9)
(y) = ---(0.1277*141) = ---(18)
(z)   141(0.2979*141)   141(42)

Which is not only neater, but exact. One method for calculating the inverse of a matrix involves lots of determinants of sub-matrices. But I can't remember it off-hand.
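
For what it's worth, the determinant-heavy method is usually known as the adjugate (or classical adjoint) method: each element of the inverse is a signed determinant of a sub-matrix, taken from the transposed position and divided by the determinant of the whole matrix. Here's a Python sketch of it for integer matrices. The function names are my own, and it gets very slow for large matrices.

```python
from fractions import Fraction

def determinant(m):
    """Determinant of a square matrix, expanding along the first row."""
    if not m:
        return 1  # the empty matrix has determinant 1, which ends the recursion
    total = 0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * determinant(minor)
    return total

def inverse(m):
    """Invert a square integer matrix: signed sub-determinants over |m|."""
    n = len(m)
    det = determinant(m)
    if det == 0:
        raise ValueError("matrix has no inverse")

    def cofactor(i, j):
        # Signed determinant of m without row i and column j.
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * determinant(minor)

    # Element (i, j) of the inverse uses cofactor (j, i): note the swap.
    return [[Fraction(cofactor(j, i), det) for j in range(n)] for i in range(n)]

A = [[5, 3, 1], [2, 3, 5], [1, 9, 6]]
A_inv = inverse(A)
# Multiply the inverse by the right-hand side (1, 2, 3) to get x, y and z.
solution = [sum(A_inv[i][k] * b for k, b in enumerate([1, 2, 3]))
            for i in range(3)]
print(solution)
```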

Vector Spaces

First off, then, what's a vector? Well, I'll tell you. A vector is a list of numbers. It could be written as a matrix with a single row or column. However, these are usually called row or column matrices, and I'm sure they are different from vectors.

Let's go back to this formula, which can be interpreted as a set of points defined with Cartesian coordinates:

(5 3 1)(x)
(2 3 5)(y)
(1 9 6)(z)
  

Converting from coordinate space to vector space means thinking of x, y and z as independent vectors. The simplest such are (1 0 0), (0 1 0) and (0 0 1). This leads to the remarkably trivial result:

(x)  (1 0 0)   (5 3 1)(x)   (5 3 1)
(y)= (0 1 0)   (2 3 5)(y) = (2 3 5)
(z)  (0 0 1)   (1 9 6)(z)   (1 9 6)
  

So, what's the point of this? Why not call it a matrix instead of a vector space? Well, the answer is that all the normal coordinate transforms, as above, still work. By defining the space differently, you can view structures in a different way.

The vectors x, y and z are called a basis. A basis is a set of linearly independent vectors that the space can be described using. "Linearly independent" means no basis vector can be defined as a combination of other basis vectors. The vectors are linearly independent if they form a matrix with a non-zero determinant. If they don't form a square matrix, they aren't efficiently expressed. Maybe not even legal. Anyway, our current basis is:

    (x)  (1 0 0)
X = (y)= (0 1 0)
    (z)  (0 0 1)
  

Another could be:

     (2 0 0)
X' = (0 1 0)
     (0 0 1)
  

Define it as:

     (2 0 0)(1 0 0)   (2 0 0)
X' = (0 1 0)(0 1 0) = (0 1 0)X
     (0 0 1)(0 0 1)   (0 0 1)
  

And if, half-way through a calculation, you change your mind as to what basis to use as X, everything is still valid.
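
The determinant test mentioned earlier (a set of vectors works as a basis when the matrix they form has a non-zero determinant) can be tried on X and X'. This is a Python sketch. The names is_basis and not_a_basis are mine, and the third set of vectors is a made-up example that fails the test.

```python
def determinant(m):
    """Determinant of a square matrix, expanding along the first row."""
    if not m:
        return 1  # the empty matrix has determinant 1, which ends the recursion
    total = 0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * determinant(minor)
    return total

def is_basis(vectors):
    """True if the vectors form a square matrix with a non-zero determinant."""
    square = all(len(v) == len(vectors) for v in vectors)
    return square and determinant(vectors) != 0

X = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
X_prime = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]
not_a_basis = [[1, 2, 3], [2, 4, 6], [0, 0, 1]]  # second vector is twice the first

print(is_basis(X), is_basis(X_prime), is_basis(not_a_basis))
```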

I don't have an exact definition to hand, but I think a vector space is the infinite set of vectors that can be defined as combinations of the basis vectors. In this case, that's all matrices with one row and three columns.

Okay, you still may not see the point of vector spaces, but at least you know what they are. That's all for this tutorial, unless you happen to use our featured spreadsheet:

Matrix Calculations in Microsoft Excel

I've added this section because the online help is a real struggle. I think most good spreadsheets can handle matrices, but it's Excel I happen to have here.

First, then, here's an example of calculating the inverse of a matrix. Enter the following into a new sheet:

     A    B    C
1    1    3   -5
2  -12   -2    7
3    4    7   -2

Now, select cells in the square A5 to C7. Type in the formula =MINVERSE(A1:C3) and press CTRL-SHIFT-RETURN. The cells should be filled with these decimals:

-0.129682997  -0.083573487   0.031700288
 0.011527378   0.051873199   0.152737752
-0.219020173   0.014409222   0.097982709

To show this is the inverse of the original matrix, select the square of cells E5 to G7, and CTRL-SHIFT-ENTER the formula =MMULT(A1:C3,A5:C7). This should give near enough the identity matrix. Some of the numbers will look like -2.77556E-17 or something. Don't worry about it: floating point arithmetic is never perfect.

Now, the inverse would make more sense if you could write it as a fraction. To work this out, multiply each element by the determinant of the matrix you're inverting. And to do that, edit one of the cell formulae to read =MINVERSE(A1:C3)*MDETERM(A1:C3) and press CTRL-SHIFT-RETURN again. If all goes well, the matrix will now be shown as integers. The diagonals of what used to be the identity matrix will read 347. That tells you the determinant of that original matrix is 347, so each element in the inverse is a fraction over 347.
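
If you don't have Excel to hand, the same numbers can be cross-checked with a Python sketch. It reads the sheet as the rows (1, 3, -5), (-12, -2, 7) and (4, 7, -2), which is the reading that agrees with the inverse and the determinant of 347 quoted here. The functions are my own stand-ins and have nothing to do with how Excel works internally.

```python
def determinant(m):
    """Determinant of a square matrix, expanding along the first row."""
    if not m:
        return 1  # the empty matrix has determinant 1, which ends the recursion
    total = 0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * determinant(minor)
    return total

def adjugate(m):
    """Transposed matrix of signed sub-determinants: inverse times determinant."""
    n = len(m)

    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * determinant(minor)

    return [[cofactor(j, i) for j in range(n)] for i in range(n)]

sheet = [[1, 3, -5], [-12, -2, 7], [4, 7, -2]]
det = determinant(sheet)   # stands in for MDETERM
scaled = adjugate(sheet)   # stands in for MINVERSE * MDETERM
decimals = [[round(v / det, 9) for v in row] for row in scaled]  # like MINVERSE

print(det)
print(scaled)
print(decimals)
```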

So now you know how to invert, multiply and find the determinant of matrices. Adding and subtracting follow the same pattern, or you could even do normal arithmetic with copy and paste.

