Skip to contents

This vignette illustrates the ideas behind solving systems of linear equations of the form ๐€๐ฑ=๐›\mathbf{A x = b} where

  • ๐€\mathbf{A} is an mร—nm \times n matrix of coefficients for mm equations in nn unknowns
  • ๐ฑ\mathbf{x} is an nร—1n \times 1 vector unknowns, x1,x2,โ€ฆxnx_1, x_2, \dots x_n
  • ๐›\mathbf{b} is an mร—1m \times 1 vector of constants, the โ€œright-hand sidesโ€ of the equations

or, spelled out,

[a11a12โ‹ฏa1na21a22โ‹ฏa2nโ‹ฎโ‹ฎโ‹ฎam1am2โ‹ฏamn](x1x2โ‹ฎxn)=(b1b2โ‹ฎbm) \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix} \begin{pmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ \end{pmatrix} \quad=\quad \begin{pmatrix} b_{1} \\ b_{2} \\ \vdots \\ b_{m} \\ \end{pmatrix} For three equations in three unknowns, the equations look like this:

 A <- matrix(paste0("a_{", outer(1:3, 1:3, FUN = paste0), "}"),   nrow=3)  b <- paste0("b_", 1:3) x <- paste0("x", 1:3) showEqn(A, b, vars = x, latex=TRUE)
a11โ‹…x1+a12โ‹…x2+a13โ‹…x3=b1a21โ‹…x1+a22โ‹…x2+a23โ‹…x3=b2a31โ‹…x1+a32โ‹…x2+a33โ‹…x3=b3\begin{array}{lllllll} a_{11} \cdot x_1 &+& a_{12} \cdot x_2 &+& a_{13} \cdot x_3 &=& b_1 \\ a_{21} \cdot x_1 &+& a_{22} \cdot x_2 &+& a_{23} \cdot x_3 &=& b_2 \\ a_{31} \cdot x_1 &+& a_{32} \cdot x_2 &+& a_{33} \cdot x_3 &=& b_3 \\ \end{array}

Conditions for a solution

The general conditions for solutions are:

  • the equations are consistent (solutions exist) if r(๐€|๐›)=r(๐€)r( \mathbf{A} | \mathbf{b}) = r( \mathbf{A})
    • the solution is unique if r(๐€|๐›)=r(๐€)=nr( \mathbf{A} | \mathbf{b}) = r( \mathbf{A}) = n
    • the solution is underdetermined if r(๐€|๐›)=r(๐€)<nr( \mathbf{A} | \mathbf{b}) = r( \mathbf{A}) < n
  • the equations are inconsistent (no solutions) if r(๐€|๐›)>r(๐€)r( \mathbf{A} | \mathbf{b}) > r( \mathbf{A})

We use c( R(A), R(cbind(A,b)) ) to show the ranks, and all.equal( R(A), R(cbind(A,b)) ) to test for consistency.

 library(matlib) # use the package

Equations in two unknowns

Each equation in two unknowns corresponds to a line in 2D space. The equations have a unique solution if all lines intersect in a point.

Two consistent equations

 A <- matrix(c(1, 2, -1, 2), 2, 2) b <- c(2,1) showEqn(A, b)
## 1*x1 - 1*x2 = 2  ## 2*x1 + 2*x2 = 1

Check whether they are consistent:

 c( R(A), R(cbind(A,b)) ) # show ranks
## [1] 2 2
 all.equal( R(A), R(cbind(A,b)) ) # consistent?
## [1] TRUE

Plot the equations:

 plotEqn(A,b)
## x[1] - 1*x[2] = 2  ## 2*x[1] + 2*x[2] = 1

Plot of two consistent equations which plot as lines intersecting in a point

Solve() is a convenience function that shows the solution in a more comprehensible form:

 Solve(A, b, fractions = TRUE)
## x1 = 5/4  ## x2 = -3/4

Three consistent equations

For three (or more) equations in two unknowns, r(๐€)โ‰ค2r(\mathbf{A}) \le 2, because r(๐€)โ‰คmin(m,n)r(\mathbf{A}) \le \min(m,n). The equations will be consistent if r(๐€)=r(๐€|๐›)r(\mathbf{A}) = r(\mathbf{A | b}). This means that whatever linear relations exist among the rows of ๐€\mathbf{A} are the same as those among the elements of ๐›\mathbf{b}.

Geometrically, this means that all three lines intersect in a point.

 A <- matrix(c(1,2,3, -1, 2, 1), 3, 2) b <- c(2,1,3) showEqn(A, b)
## 1*x1 - 1*x2 = 2  ## 2*x1 + 2*x2 = 1  ## 3*x1 + 1*x2 = 3
 c( R(A), R(cbind(A,b)) ) # show ranks
## [1] 2 2
 all.equal( R(A), R(cbind(A,b)) ) # consistent?
## [1] TRUE
 Solve(A, b, fractions=TRUE) # show solution 
## x1 = 5/4  ## x2 = -3/4  ## 0 = 0

Plot the equations:

 plotEqn(A,b)
## x[1] - 1*x[2] = 2  ## 2*x[1] + 2*x[2] = 1  ## 3*x[1] + x[2] = 3

Plot of three consistent equations which plot as three lines intersecting in a point

Three inconsistent equations

Three equations in two unknowns are inconsistent when r(๐€)<r(๐€|๐›)r(\mathbf{A}) < r(\mathbf{A | b}).

 A <- matrix(c(1,2,3, -1, 2, 1), 3, 2) b <- c(2,1,6) showEqn(A, b)
## 1*x1 - 1*x2 = 2  ## 2*x1 + 2*x2 = 1  ## 3*x1 + 1*x2 = 6
 c( R(A), R(cbind(A,b)) ) # show ranks
## [1] 2 3
 all.equal( R(A), R(cbind(A,b)) ) # consistent?
## [1] "Mean relative difference: 0.5"

You can see this in the result of reducing ๐€|๐›\mathbf{A} | \mathbf{b} to echelon form, where the last row indicates the inconsistency because it represents the equation 0x1+0x2=โˆ’30 x_1 + 0 x_2 = -3.

 echelon(A, b)
## [,1] [,2] [,3] ## [1,] 1 0 2.75 ## [2,] 0 1 -2.25 ## [3,] 0 0 -3.00

Solve() shows this more explicitly, using fractions where possible:

 Solve(A, b, fractions=TRUE)
## x1 = 11/4  ## x2 = -9/4  ## 0 = -3

An approximate solution is sometimes available using a generalized inverse. This gives ๐ฑ=(2,โˆ’1)\mathbf{x} = (2, -1) as a best close solution.

 x <- MASS::ginv(A) %*% b x
## [,1] ## [1,] 2 ## [2,] -1

Plot the equations. You can see that each pair of equations has a solution, but all three do not have a common, consistent solution.

 par(mar=c(4,4,0,0)+.1) plotEqn(A,b, xlim=c(-2, 4))
## x[1] - 1*x[2] = 2  ## 2*x[1] + 2*x[2] = 1  ## 3*x[1] + x[2] = 6
 # add the ginv() solution points(x[1], x[2], pch=15)

Plot of the lines corresponding to three inconsistent equations. They do not all intersect in a point, indicating that there is no common solution.

Equations in three unknowns

Each equation in three unknowns corresponds to a plane in 3D space. The equations have a unique solution if all planes intersect in a point.

Three consistent equations

An example:

 A <- matrix(c(2, 1, -1,  -3, -1, 2,  -2, 1, 2), 3, 3, byrow=TRUE) colnames(A) <- paste0('x', 1:3) b <- c(8, -11, -3) showEqn(A, b)
## 2*x1 + 1*x2 - 1*x3 = 8  ## -3*x1 - 1*x2 + 2*x3 = -11  ## -2*x1 + 1*x2 + 2*x3 = -3

Are the equations consistent?

 c( R(A), R(cbind(A,b)) ) # show ranks
## [1] 3 3
 all.equal( R(A), R(cbind(A,b)) ) # consistent?
## [1] TRUE

Solve for ๐ฑ\mathbf{x}.

 solve(A, b)
## x1 x2 x3  ## 2 3 -1

Other ways of solving:

 solve(A) %*% b
## [,1] ## x1 2 ## x2 3 ## x3 -1
 inv(A) %*% b
## [,1] ## [1,] 2 ## [2,] 3 ## [3,] -1

Yet another way to see the solution is to reduce ๐€|๐›\mathbf{A | b} to echelon form. The result of this is the matrix [๐ˆ|๐€โˆ’๐Ÿ๐›][\mathbf{I \quad | \quad A^{-1}b}], with the solution in the last column.

 echelon(A, b)
## x1 x2 x3  ## [1,] 1 0 0 2 ## [2,] 0 1 0 3 ## [3,] 0 0 1 -1

`echelon() can be asked to show the steps, as the row operations necessary to reduce ๐—\mathbf{X} to the identity matrix ๐ˆ\mathbf{I}.

 echelon(A, b, verbose=TRUE, fractions=TRUE)
##  ## Initial matrix:
## x1 x2 x3  ## [1,] 2 1 -1 8 ## [2,] -3 -1 2 -11 ## [3,] -2 1 2 -3 ##  ## row: 1  ##  ## exchange rows 1 and 2
## x1 x2 x3  ## [1,] -3 -1 2 -11 ## [2,] 2 1 -1 8 ## [3,] -2 1 2 -3 ##  ## multiply row 1 by -1/3
## x1 x2 x3  ## [1,] 1 1/3 -2/3 11/3 ## [2,] 2 1 -1 8 ## [3,] -2 1 2 -3 ##  ## multiply row 1 by 2 and subtract from row 2
## x1 x2 x3  ## [1,] 1 1/3 -2/3 11/3 ## [2,] 0 1/3 1/3 2/3 ## [3,] -2 1 2 -3 ##  ## multiply row 1 by 2 and add to row 3
## x1 x2 x3  ## [1,] 1 1/3 -2/3 11/3 ## [2,] 0 1/3 1/3 2/3 ## [3,] 0 5/3 2/3 13/3 ##  ## row: 2  ##  ## exchange rows 2 and 3
## x1 x2 x3  ## [1,] 1 1/3 -2/3 11/3 ## [2,] 0 5/3 2/3 13/3 ## [3,] 0 1/3 1/3 2/3 ##  ## multiply row 2 by 3/5
## x1 x2 x3  ## [1,] 1 1/3 -2/3 11/3 ## [2,] 0 1 2/5 13/5 ## [3,] 0 1/3 1/3 2/3 ##  ## multiply row 2 by 1/3 and subtract from row 1
## x1 x2 x3  ## [1,] 1 0 -4/5 14/5 ## [2,] 0 1 2/5 13/5 ## [3,] 0 1/3 1/3 2/3 ##  ## multiply row 2 by 1/3 and subtract from row 3
## x1 x2 x3  ## [1,] 1 0 -4/5 14/5 ## [2,] 0 1 2/5 13/5 ## [3,] 0 0 1/5 -1/5 ##  ## row: 3  ##  ## multiply row 3 by 5
## x1 x2 x3  ## [1,] 1 0 -4/5 14/5 ## [2,] 0 1 2/5 13/5 ## [3,] 0 0 1 -1 ##  ## multiply row 3 by 4/5 and add to row 1
## x1 x2 x3  ## [1,] 1 0 0 2 ## [2,] 0 1 2/5 13/5 ## [3,] 0 0 1 -1 ##  ## multiply row 3 by 2/5 and subtract from row 2
## x1 x2 x3  ## [1,] 1 0 0 2 ## [2,] 0 1 0 3 ## [3,] 0 0 1 -1

Now, letโ€™s plot them.

plotEqn3d() uses rgl for 3D graphics. If you rotate the figure, youโ€™ll see an orientation where all three planes intersect at the solution point, ๐ฑ=(2,3,โˆ’1)\mathbf{x} = (2, 3, -1)

 plotEqn3d(A,b, xlim=c(0,4), ylim=c(0,4))

Three inconsistent equations

 A <- matrix(c(1, 3, 1,  1, -2, -2,  2, 1, -1), 3, 3, byrow=TRUE) colnames(A) <- paste0('x', 1:3) b <- c(2, 3, 6) showEqn(A, b)
## 1*x1 + 3*x2 + 1*x3 = 2  ## 1*x1 - 2*x2 - 2*x3 = 3  ## 2*x1 + 1*x2 - 1*x3 = 6

Are the equations consistent? No.

 c( R(A), R(cbind(A,b)) ) # show ranks
## [1] 2 3
 all.equal( R(A), R(cbind(A,b)) ) # consistent?
## [1] "Mean relative difference: 0.5"