Gaussian Elimination with good record keeping gives A = LU

Recall an elimination matrix is like the identity matrix except it has one nonzero entry off the main diagonal. The word elim defined below creates the elimination matrix E[i,j](mul)

> with(linalg):

Warning, the protected names norm and trace have been redefined and unprotected

> elim := proc(n,i,j,mul)
local A,k;
A := linalg[diag](seq(1,k = 1 .. n));
A[i,j] := mul;
evalm(A)
end:

> elim(3,3,2,9);

matrix([[1, 0, 0], [0, 1, 0], [0, 9, 1]])

As we have seen in class, we can often multiply A on the left by a succession of elimination matrices in order to reduce A to an upper triangular matrix U.

Look at an example.

> A := matrix([[-85, -55, -37], [-35, 97, 50], [79, 56, 49]]);

A := matrix([[-85, -55, -37], [-35, 97, 50], [79, 5...

In order to eliminate the entry A[2,1] , which is -35, of A (turn it into 0), 1st column, we multiply the top row of A by -A[2,1]/A[1,1] and add it to the 2nd row. This is accomplished by multipling A on the left by the elimination matrix E[2,1](-A[2,1]/A[1,1])

> e21 := elim(3,2,1,-A[2,1]/A[1,1]);

e21 := matrix([[1, 0, 0], [-7/17, 1, 0], [0, 0, 1]]...

Multiply e21 by A to start with (store the answer in U, which will eventually become upper triangular)

> U := evalm(e21 &* A);

U := matrix([[-85, -55, -37], [0, 2034/17, 1109/17]...

Now eliminate the entry U[3,1]

> e31 := elim(3,3,1,-U[3,1]/U[1,1]);

e31 := matrix([[1, 0, 0], [0, 1, 0], [79/85, 0, 1]]...

> U := evalm(e31 &* U);

U := matrix([[-85, -55, -37], [0, 2034/17, 1109/17]...

Now eliminate the entry U[3,2]

> e32 := elim(3,3,2,-U[3,2]/U[2,2]);

e32 := matrix([[1, 0, 0], [0, 1, 0], [0, -83/2034, ...

> U := evalm(e32 &* U);

U := matrix([[-85, -55, -37], [0, 2034/17, 1109/17]...

So at this point, we have that e32 e31 e21 A = U. To undo the eliminations, we only need to multiply U on the left successively by the inverse of e32, then e31, then e21 to get A back again. As we have seen in class, the inverse of E[i,j](mul) is just the elimination matrix E[i,j](-mul) .

> L:=evalm(inverse(e21)&*inverse(e31)&*inverse(e32));

L := matrix([[1, 0, 0], [7/17, 1, 0], [-79/85, 83/2...

> map(evalm,inverse(e21) *inverse(e31) *inverse(e32))= evalm(L);

matrix([[1, 0, 0], [7/17, 1, 0], [0, 0, 1]])*matrix...

We should check that LU = A

> evalm(L &* U);

matrix([[-85, -55, -37], [-35, 97, 50], [79, 56, 49...

>

Also, it is important that we reverse the order in which the inverse eliminations are performed (the inverse of a product is the product of the inverses in reverse order). For example, switching the first and last eliminations gives a different lower left hand corner.

> evalm(inverse(e32)&*inverse(e31)&*inverse(e21));

matrix([[1, 0, 0], [7/17, 1, 0], [-157781/172890, 8...

On the other hand, the first and second eliminations commute, their product either way begining

> evalm(inverse(e31)&*inverse(e21));

matrix([[1, 0, 0], [7/17, 1, 0], [-79/85, 0, 1]])

>

Observe that the product of the elimination matrices in reverse order is just the lower triangular matrix L whose entries below the diagonal are the negatives of the multiplier entries of the elimination matrices below the diagonal.

A square matrix U is upper triangular if U[i,j] = 0 for all i > j; that is, if the entries below the diagonal are zero. A square matrix L is a lower unit triangular matrix if L[i,j] = 0 for all i < j and L[i,i] = 1 for all i; that is, if all entries above the diagonal are 0 and all entries on the diagonal are 1.

Here is an interesting theorem.

The LU factorization theorem: If A is an n by n matrix which can be reduced to upper triangular form U without row interchanges, then A can be factored into the product of a lower unit triangular matrix L with the upper triangular matrix U.

Proof: The proof is by induction on n. The case n = 1 is trivially true. Suppose the theorem holds for all n-1 by n-1 matrices which can be factored without row swaps. Let A be a n by n matrix with A[1,1] <> 0 which can be factored without row swaps.

Then let L[1] = matrix([[1, 0, `..`, 0, 0], [a[2,1]/a[1,1], 1, `..`... and

let A[1] = matrix([[a[1,1], a[1,2], `..`, a[1,n-1], a[1,n]], [... .

Use block multiplication to see that A = L[1]*A[1] . Now the lower right hand n-1 by n-1 submatrix A[2] of A[1] must be reducible to an upper triangular matrix without row changes, because all the further elimination matrices reducing A[1] to upper triangular form do not alter the entries outside A[2] . Hence, by the inductive assumption, A[2] factors into L[2]*U[2] . Now use block multiplication to see that A[1] factors into matrix([[1, `..0`], [`.`, ` `], [`0`, L[2]]]) matrix([[a[1,1], `..a1n`], [`.`, ` `], [0, U[2]]]) . Hence A = ( L[1] matrix([[1, `..0`], [`.`, ` `], [`0`, L[2]]]) ) matrix([[a[1,1], `..a1n`], [`.`, ` `], [0, U[2]]]) = L U, as required. QED.

lufact1 A recursive procedure to factor A into LU.

The word lufact1 defined below implements the proof of the theorem. Notice the procedure calls itself. Procedures which call themselves are referred to as recursively defined procedures .

>

> lufact1 := proc(A)
local k,L,U,L1,A1,i,j,LU;
k := rowdim(A);
if k = 1 then RETURN([matrix(1,1,[1]),evalm(A)]) else
if A[1,1] = 0 then ERROR("0 pivot") fi;
L1 := 1/A[1,1]*submatrix(A,1..k,1..1);
A1 := evalm(submatrix(A,2..k,2..k)-
1/A[1,1]*(submatrix(A,2..k,1..1) &* submatrix(A,1..1,2..k)));
LU := lufact1(A1); # recursive call here
L := stackmatrix([seq(0,i=2..k)],LU[1]);
L := augment(L1,L);
U := augment([seq(0,i=2..k)],LU[2]);
U:= stackmatrix(row(A,1),U);
[evalm(L),evalm(U)]
fi
end:

> A := matrix([[-2, -2, 0, -1],
[-3, 2, 1, 1], [-1, 0, 1, -1], [-2, -2, -2, 3]]);

A := matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0...

> LU :=lufact1(A);

LU := [matrix([[1, 0, 0, 0], [3/2, 1, 0, 0], [1/2, ...

> evalm(LU[1] &* LU[2]);

matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0, 1, ...

It woiks!

Exercise: Test lufact1 on several random 4 by 4 matrices. (this amounts to executing the two red lines below several times and observing the output). Estimate the probability that lufact1 will fail on a random 4 by 4 matrix with entries chosen from the digits 0 to +/-9

> A := randmatrix(4,4,entries=rand(-9..9));

A := matrix([[9, -3, 6, 9], [5, 0, -5, -1], [0, -9,...

> lufact1(A);

[matrix([[1, 0, 0, 0], [5/9, 1, 0, 0], [0, -27/5, 1...

>

lufact An iterative procedure to factor A into LU.

Here is a word which defines an iterative procedure to factor a matrix into the product of a lower triangular and upper triangular matrix.

> with(linalg):

> lufact := proc(A)
local k,i,j,s,p,el,elinv,L,U;
k := rowdim(A);
U := evalm(A);
L := diag(seq(1,i = 1 .. k));
for i to k-1 do
if U[i,i] = 0 then
ERROR(`0 pivot`,i)
else
for j from i+1 to k do
el := elim( k,j,i,-U[j,i]/U[i,i] );
elinv := elim(k,j,i,U[j,i]/U[i,i]);
U := evalm(el &* U);
L := evalm(L&* elinv);
od
fi
od;
map(evalm,[L,U]);
end:

A := matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0, 1, -1], [-2, -2, -2, 3]]);

A := matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0...

> lufact(A);

[matrix([[1, 0, 0, 0], [3/2, 1, 0, 0], [1/2, 1/5, 1...

> evalm(%[1] &* %[2]);

matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0, 1, ...

This works too!

>

checking lufact1 and lufact against LUdecomp

We can check these against the linalg word LUdecomp

> U:=LUdecomp(A,L='l');

U := matrix([[-2, -2, 0, -1], [0, 5, 1, 5/2], [0, 0...

> evalm(l);

matrix([[1, 0, 0, 0], [3/2, 1, 0, 0], [1/2, 1/5, 1,...

> evalm(l&* U);

matrix([[-2, -2, 0, -1], [-3, 2, 1, 1], [-1, 0, 1, ...

This gives the same answer.

Exercise: Do you believe that the factorization A = LU is unique? That is, if A = LU = L[1]*U[1] , then L = L[1] and U = U[1] .

What evidence would you cite for your belief?

Use A = LU to solve Ax = b efficiently

An important use of the LU factorization is to solve Ax = b efficiently. If A = LU, then we can solve LUx = A x = b by a two step process.

1. Solve Ly = b for y by forward substitution

2. Solve Ux = y for x by back substitution.

Exercise. Given that A = LU, where L := matrix([[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, ... and U := matrix([[2, 0, 1, 2], [0, -1, 2, 0], [0, 0, 1,... , solve A x = b, where

b = matrix([[3], [2], [1], [4]])

>

The symmetric LU factorizaion theorem

If you carry out an LU factorization of a few random symmetric matrices (one which is equal to its transpose, see section 2.7) and looks at the L and U factors, a pattern appears. You notice you can pull a diagonal matrix D off to the left of the upper triangular factor U so that what is left has 1's down the diagonal. Take an example,

> A:=randmatrix(4,4,symmetric,entries=rand(-4..4));

A := matrix([[-4, 1, 3, 1], [1, 1, -4, 0], [3, -4, ...

> lu:= lufact1(A);

lu := [matrix([[1, 0, 0, 0], [-1/4, 1, 0, 0], [-3/4...

The upper triangular factor can itself be factored into a diagonal times a unit upper triangular matrix whose transpose is the lower unit triangular factor.

> lu[2];

matrix([[-4, 1, 3, 1], [0, 5/4, -13/4, 1/4], [0, 0,...

> d := diag(-4,5/4,-51/5,248/51);

d := matrix([[-4, 0, 0, 0], [0, 5/4, 0, 0], [0, 0, ...

> u1 := evalm(diag(-1/4,4/5,-5/51,51/248)&*lu[2]);

u1 := matrix([[1, -1/4, -3/4, -1/4], [0, 1, -13/5, ...

> map(evalm,[transpose(u1),lu[1]]);

[matrix([[1, 0, 0, 0], [-1/4, 1, 0, 0], [-3/4, -13/...

The symmetric LU factorization theorem. If A is a nonsingular symmetric matrix which factors into LU, then U is the product of a diagaonal matrix D with an upper unit triangular matrix U1 whose transpose is L.

One big use of this theorem is to save storage. You don't need to save the L factor because it can easily be recovered from the U factor.

>