Linear Algebra (Pt. 3): Gaussian Elimination
Last updated
Do consider watching this video first for a general conception. It is a good 48-minute investment of your time (at 1.5x speed it should take under 32 minutes) 🐱‍👤.
Next, watch this video from a teaching assistant, as she explains the material more directly.
Note that both Strang and the TA carried elimination only as far as the upper-triangular form (u), otherwise known as Row Echelon Form (REF). There also exists a Reduced Row Echelon Form (RREF). Unlike REF, the RREF of a system is unique, and thus does not depend on the algorithm used to compute it.
"Gauss thought of it (elimination) first only because he was born earlier... it's a natural idea... and yea he died earlier too."
Salty Gil Strang, 18.06, Lecture 2
Brutal as it is, this method was used (though not proven) by Chinese mathematicians circa 179 AD 😲. Thus, there may be truth in Strang's statement that it may not be very complicated.
As mentioned, the goal of Gaussian Elimination is to reach REF. The "E" (Echelon) in REF comes from the French word for the rung or step of a ladder, and we should eventually obtain a "diagonal staircase" separating the upper and lower triangles. We will discuss the reason for this at the end.
To do so, there are two steps:
Forward-Elimination
Back-Substitution
The bulk of the work in Gaussian Elimination occurs in this step. Given, for example, a 3x3 problem, we can easily express it in row shorthand as discussed before.
We'll first apply the mechanics to Matrix A only, to go through its logic, before performing it along with Matrix b.
The idea is to pivot on one row (equation) at a time, and reduce all other elements in its column (variable) to 0.
To do so, we
First pivot on (1, 1), so all coefficients in the pivoted row remain the same, while
We multiply row 1 by 3 and subtract the result from row 2, in order to make (2, 1) equal to 0.
At this juncture, we would normally proceed to eliminate (3, 1), but fortunately that entry is already 0.
Hence, after this rigorous step, we have successfully eliminated the unknown "x" (column) from all equations (rows) except the first 🎉🎊.
There is actually no step 2, it is just a repeat of step 1 (LOL) 🤩.
Since row 1 is cleared, we go down the "staircase" and pivot on the element (2, 2). Let us repeat the logic once again. We
Retain the row 2 coefficients, for the pivot is (2, 2), and
Multiply row 2 by 2 and subtract the result from row 3, in order to make (3, 2) equal to 0.
Err yea... but we are truly done with the hard work here. The reason: there is no point pivoting on (3, 3), for there are no rows below it to eliminate.
You will observe that the matrix now has coefficients only in upper-triangular form (u). The upper-triangular form is essentially REF, in which each successive equation has fewer unknowns (which makes the system super easy to solve). This is also the technique computers use to solve linear systems. Hence, it is of interest to mathematicians and data scientists to get from "A" to "u" as quickly as possible (prolly this is the next topic on algorithms).
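The forward-elimination pass above can be sketched in code. Note the matrix below is an assumption: the text's figures aren't reproduced here, so I've used a system consistent with the multipliers mentioned (3 for row 2, then 2 for row 3, with (3, 1) already 0).

```python
import numpy as np

def forward_eliminate(A):
    """Reduce A to upper-triangular (REF) form.

    Assumes no pivot is ever 0, i.e. no row exchanges are needed.
    """
    U = A.astype(float).copy()
    n = U.shape[0]
    for i in range(n):                 # pivot on (i, i)
        for j in range(i + 1, n):
            m = U[j, i] / U[i, i]      # multiplier for row j
            U[j] -= m * U[i]           # subtract m * (pivot row) from row j
    return U

# Hypothetical 3x3 system consistent with the steps described in the text
A = np.array([[1, 2, 1],
              [3, 8, 1],
              [0, 4, 1]])
U = forward_eliminate(A)
# U is upper-triangular: every entry below the diagonal is 0
```

The nested loop simply mechanizes the two steps walked through above: pivot, compute the multiplier, subtract.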
Matrix b was excluded from the earlier exercise because it was not helpful for understanding the mechanics. The RHS (Matrix b) is simply dragged along the operations performed on each row.
As shown in the red font, the RHS (Matrix b) coefficients simply follow the same calculation as the other columns, deriving the final form of Matrix c.
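A minimal sketch of "dragging b along": augment b as an extra column and apply the same row operations. The numbers are again an assumption consistent with the multipliers described in the text.

```python
import numpy as np

# [A | b] as one augmented array (hypothetical numbers)
Ab = np.array([[1., 2., 1.,  2.],
               [3., 8., 1., 12.],
               [0., 4., 1.,  2.]])

Ab[1] -= 3 * Ab[0]   # clear (2, 1); the b entry follows along: 12 - 3*2 = 6
Ab[2] -= 2 * Ab[1]   # clear (3, 2); the b entry follows along: 2 - 2*6 = -10

# Split back into the upper-triangular u and the result Matrix c
U, c = Ab[:, :3], Ab[:, 3]
```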
This part is truly the joy of doing math in Secondary School (for it is so easy), at least for approximately one lesson before we move on to more difficult problems.
First, we re-create the system of linear equations with the coefficient and result matrices in REF.
Since the 3rd equation (row) has only 1 unknown (column), it is easy to solve (not by chance but by design 🧠).
With this value, we can perform backward substitution by substituting it into equation (row) 2 to find y.
We repeat the same process to find x.
Hence, we derive the following results.
I hope you can now see the entire point of converting into REF. It made equation 3, and hence the unknown z, extremely easy to solve, and the remaining unknowns were similarly solvable via the same algorithm.
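Back-substitution can be sketched as solving from the bottom row upwards. The u and c below are the assumed REF outputs from the hypothetical example, not the text's actual figures.

```python
import numpy as np

def back_substitute(U, c):
    """Solve U x = c for upper-triangular U, last equation first."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # plug in the unknowns already found, then divide by the pivot
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[1., 2.,  1.],
              [0., 2., -2.],
              [0., 0.,  5.]])
c = np.array([2., 6., -10.])
x = back_substitute(U, c)   # the unknowns [x, y, z]
```

Note how the loop runs backwards: row 3 gives z immediately, then each earlier row has exactly one new unknown, which is the whole point of REF.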
Now that you are slightly exposed to the algorithm of Gaussian Elimination, doing a comparison via a table may be more helpful to crystallize that understanding.
I don't know about you, but I feel like we were sorta cheating while performing forward elimination. I mean, can we truly subtract a multiple of one row while keeping the other rows intact? Gauss would not forbid it, but it still does not feel right.
However, what if we were to legally multiply by a matrix to eliminate the element (2, 1) instead? What would be the matrix required to eliminate the element (2, 1)?
As we are only interested in changing (2, 1) to 0, the identity matrix is extremely helpful.
Next, we know that the output (2, 1) is the result of multiplying the 2nd row of the left matrix with the 1st column of the right matrix.
Hence, to put 2 and 2 together, we would need to change some values in the 2nd row of the identity matrix.
You might think that if the whole point of the Elimination Matrix is only to prove Gaussian Elimination is legit, it is actually not very important 😥. It is not so! By establishing the above, we have essentially established:
Do also be gently reminded that the above equation still follows MM properties: while it is associative (you can place parentheses between any factors), it is non-commutative (the positions of factors cannot be swapped without changing the result).
To do this, it may be helpful to recap what we have learnt previously about types of matrices and the Matrix-Matrix Multiplication (MM) operation.
We call this the Elimination Matrix that rids us of element (2, 1). You may also ask how we would know to slot the value -3 into the identity matrix to eliminate (2, 1). We would have done so just as before: "multiply row 1 by 3, and subtract the result from row 2".
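As a sketch (again with the assumed numbers), the elimination matrix is the identity with -3 slotted into position (2, 1); multiplying it on the left performs exactly "row 2 minus 3 times row 1".

```python
import numpy as np

E21 = np.array([[ 1., 0., 0.],
                [-3., 1., 0.],   # the -3 sits in position (2, 1)
                [ 0., 0., 1.]])

A = np.array([[1., 2., 1.],      # hypothetical A, consistent with the text
              [3., 8., 1.],
              [0., 4., 1.]])

B = E21 @ A   # rows 1 and 3 unchanged; row 2 becomes row2 - 3*row1
```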
Now, try to test yourself what would be before revealing the answer 😀.
This is crucial, as it implies that out there exists one such matrix, such that this "one elimination matrix" is able to transform Matrix A to u in a single step. This matrix is essentially (you can try computing it out to practice MM, but it is not very meaningful as it is not reproducible lol).
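Sketching that "one elimination matrix" with the assumed numbers: multiplying the two elimination matrices (in the right order, since MM is non-commutative) gives a single matrix E = E32 E21 with E A = u.

```python
import numpy as np

E21 = np.array([[ 1., 0., 0.],
                [-3., 1., 0.],
                [ 0., 0., 1.]])
E32 = np.array([[1.,  0., 0.],
                [0.,  1., 0.],
                [0., -2., 1.]])

A = np.array([[1., 2., 1.],      # hypothetical A, as before
              [3., 8., 1.],
              [0., 4., 1.]])

E = E32 @ E21    # one matrix that performs both elimination steps at once
U = E @ A        # upper-triangular in a single multiplication

# Order matters: E21 @ E32 is a different matrix (non-commutativity)
```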