6  Introduction

\[ \newcommand {\mat}[1] { \boldsymbol{#1} } \newcommand {\yields} { \; \Rightarrow \; } \newcommand {\rank}[1] { \text{rank}\left(#1\right) } \newcommand {\nul}[1] { \mathcal{N}\left(#1\right) } \newcommand {\csp}[1] { \mathcal{C}\left(#1\right) } \newcommand {\rsp}[1] { \mathcal{R}\left(#1\right) } \newcommand {\lsp}[1] { \mathcal{L}\left(#1\right) } \renewcommand {\vec}[1] {\mathbf{#1}} \] \[ \newcommand {\hair} { \hspace{0.5pt} } \newcommand {\dd}[2] { \frac{\mathrm{d} \hair #1}{ \mathrm{d} \hair #2} } \newcommand {\tdd}[2] { \tfrac{\mathrm{d} #1}{\mathrm{d} #2} } \newcommand {\pp}[2] { \frac{\partial #1}{\partial #2} } \newcommand {\indint}[2] { \int \! {#1} \, \mathrm{d}{#2} } \newcommand {\defint}[4] { \int_{#1}^{#2} \! {#3} \, \mathrm{d}{#4} } \]

\[ \newcommand{\unit}[1]{\,\mathrm{#1}} \newcommand{\degC}{^\circ{\mathrm C}} \newcommand{\tothe}[1]{\times 10^{#1}} \]

Online material

The following online lectures provide additional insight into some of the topics that we treat here. We only cover part of what is treated in these lectures, and we use a different starting point.

  • The 3Blue1Brown video series The essence of linear algebra, a light introduction.
  • Gilbert Strang’s famous MIT lecture series Linear algebra. Hundreds of thousands of students have learned linear algebra from this thorough online course.

Clearly, you don’t learn math from just watching online videos. Solving problems is the only way to learn it properly. But the videos can help.

If you want to advance your knowledge of linear algebra and its applications during or after the master, the following online book is a very good source:

In addition to the basic topics, it treats more advanced topics such as eigenvectors and eigenvalues, determinants, singular value decomposition, and generalized inverses. The book is particularly well suited for self-study because it does not just show that these concepts are handy, but also provides deeper insight by showing how they work, why they work, when they work, and how they are connected. It demonstrates their application with many exercises.

Linear algebra is concerned with the solutions of so-called systems of linear equations. A system of linear equations is a set of linear equations that are valid simultaneously. The term linear refers to the fact that the equations contain only linear terms of the variables. It is the simplest form of algebra that you can think of. So, if \(x\) is a variable whose value we would like to calculate from an equation, then a linear equation in \(x\) is one in which only \(x\) or multiples \(ax\) of \(x\) occur, where \(a\) is a so-called scalar. A scalar is a simple number (integer, real number, etc.) that scales the variable \(x\). Terms like \(x^{2}\) or \(\sqrt{x}\) do not occur in a linear equation. Only first-order powers \(x^{1}=x\) (or \(y^1=y\) or \(z^1=z\), etc.) occur in linear equations. To summarize: in linear equations, only scalars and linear terms of variables (\(x\), \(y\), \(z\), etc.) occur. However, multiple variables, all occurring in a linear fashion, are allowed. Therefore, an equation like \(ax+by=c\), where \(x\) and \(y\) are variables whose values we would like to know and \(a\), \(b\), and \(c\) are scalars, is also a linear equation. The most general way of writing a single linear equation in \(n\) variables is

\[ \sum_{i=1}^{n} a_i x_i = b \]

where \(x_1, x_2, \ldots, x_n\) are the variables, \(a_1, a_2, \ldots, a_n\) are their scalar coefficients, and \(b\) is a constant. The Greek capital S, \(\Sigma\), stands for Sum and indicates that the terms of the equation, formed by iterating \(i\) from 1 to \(n\), are to be summed. The expression is an abbreviation of

\[ a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b \]

Both the \(x_i\) and the \(a_i\) as well as \(b\) can take values from the set of real numbers \(\mathbb{R}\). The most general notation for a system of \(m\) linear equations is:

\[\begin{align*} \sum_{i=1}^{n} a_{1i} x_i & = b_1 \\ \sum_{i=1}^{n} a_{2i} x_i & = b_2 \\ & \vdots \\ \sum_{i=1}^{n} a_{mi} x_i & = b_m \end{align*}\]

where \(a_{ji}\) is the coefficient of the \(i\)-th variable in the \(j\)-th equation. These expressions are an abbreviation of \[ \begin{alignedat}{8} a_{11} x_1 & {}+{} & a_{12} x_2 & {}+{} & \cdots & {}+{} & a_{1n} x_n &= b_1 \\ a_{21} x_1 & {}+{} & a_{22} x_2 & {}+{} & \cdots & {}+{} & a_{2n} x_n &= b_2 \\ \vdots & & & & \vdots & & & \vdots \\ a_{m1} x_1 & {}+{} & a_{m2} x_2 & {}+{} & \cdots & {}+{} & a_{mn} x_n &= b_m \end{alignedat} \]
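To make the summation notation concrete, the following minimal Python sketch evaluates the left-hand side of each equation and checks whether a candidate tuple satisfies a system. The function names `lhs` and `satisfies` and the numbers in the example are illustrative assumptions, not part of the text above.

```python
def lhs(coefficients, values):
    """Return a_1*x_1 + a_2*x_2 + ... + a_n*x_n for a single equation."""
    return sum(a * x for a, x in zip(coefficients, values))


def satisfies(rows, rhs, values):
    """Return True if `values` makes every equation of the system hold.

    rows[j] holds the coefficients of equation j, rhs[j] its constant b_j.
    """
    return all(lhs(row, values) == b for row, b in zip(rows, rhs))


# Illustrative system with m = 2 equations and n = 3 variables:
#   1*x1 + 2*x2 + 0*x3 = 5
#   0*x1 + 1*x2 + 1*x3 = 3
rows = [[1, 2, 0], [0, 1, 1]]
rhs = [5, 3]

print(satisfies(rows, rhs, [1, 2, 1]))  # candidate tuple (1, 2, 1)
print(satisfies(rows, rhs, [1, 1, 1]))  # candidate tuple (1, 1, 1)
```

The example uses integers so that the equality test is exact; with floating-point numbers a tolerance would be needed.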

6.1 Solution sets

In linear algebra we are interested in solutions to linear equations. We often think of solutions as single values, or a countable number of values, that the variables can assume to make both sides of an equation equal. For example, \(x+2=3\) yields a single value for \(x\), namely 1, that makes both sides of the equation equal. Or for the equation \(x^2=4\) the two solutions \(x=2\) and \(x=-2\) make both sides of the equation equal. However, in algebra in general, and also in linear algebra, we often encounter solutions that do not consist of a single or even a countable number of values for the variables, but of uncountable\(^{1}\) (infinite) sets of values. In such cases, some or all variables can assume values in ranges of the set of real numbers \(\mathbb{R}\), or over the whole set of real numbers. In linear algebra, this situation is particularly interesting when there are multiple variables. We will demonstrate this principle using a few very simple examples.

The simplest linear equation is

\[ ax = 0 \]

and another, just a little more complicated equation is

\[ ax = b \]

Let’s study the first equation, \(ax=0\). If you are not careful, you could think that \(x=0\) is the solution to that equation. However, what if \(a=0\)? In that case, any \(x\) from the set \(\mathbb{R}\) of real numbers will solve the equation! Therefore, whether the solution is a single number (\(x=0\)) or the uncountable set of numbers \(\mathbb{R}\)\(^{2}\) depends on the value of the scalar \(a\). For the second equation a single solution could be \(x=\tfrac{b}{a}\). However, this is only a valid solution if \(a \neq 0\). In the case that \(a=0\) and \(b \neq 0\) the equation has no solution, or in the language of set theory: the solution is the empty set \(\varnothing\). However, when \(a=0\) and \(b=0\), we get the same situation as with the first equation, and any member \(x\) of the real numbers \(\mathbb{R}\) will solve this equation again. Concluding:

Note

The solution to the linear equation \[ ax = b \] is

\[ x= \begin{cases} \varnothing & \text{when} \quad a=0 \quad \text{and} \quad b \neq 0, \\ \mathbb{R} & \text{when} \quad a=0 \quad \text{and} \quad b = 0, \\ \tfrac{b}{a} & \text{when} \quad a \neq 0 \end{cases} \]

Take-home message: a solution may be an empty set, a single number, multiple (countably many) numbers, or a set of infinitely many numbers. In other words, we generalize the concept of a solution to a set of objects that is not necessarily countable.
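The three cases for \(ax=b\) can be written down as a short case distinction. The Python sketch below is one possible way to do so; the function name `solve_ax_eq_b` and the labels `"empty"`, `"all reals"` and `"single"` are illustrative choices, and `Fraction` is used only to keep \(\tfrac{b}{a}\) exact.

```python
from fractions import Fraction


def solve_ax_eq_b(a, b):
    """Describe the solution set of a*x = b over the real numbers.

    Returns ("empty", None), ("all reals", None) or ("single", b/a).
    """
    if a == 0:
        # 0*x = b holds for every x when b = 0 and for no x otherwise.
        return ("all reals", None) if b == 0 else ("empty", None)
    return ("single", Fraction(b) / Fraction(a))


print(solve_ax_eq_b(0, 3))  # a = 0, b != 0
print(solve_ax_eq_b(0, 0))  # a = 0, b = 0
print(solve_ax_eq_b(2, 3))  # a != 0
```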

These are rather trivial solution sets: they are a bit boring or they don’t have much intricacy or structure. Then again, we were only looking at the simplest possible linear equation. Things become a little more interesting with this linear equation:

\[ a x + b y = c \tag{6.1}\]

where \(x\) and \(y\) are variables and \(a\), \(b\) and \(c\) are scalars. What can we say about the solution of this equation? First of all, since the equation has two variables, the solution will always look like a combination of two numbers, one for \(x\) and a second for \(y\). Such combinations of two or more numbers are written between parentheses as \((x,y,z,\ldots)\), and are called tuples. Tuples are like sets, only the elements have a specific place, i.e. they are ordered lists of elements. You have to write the numbers in a specific order to know which value belongs to which variable. This is unlike ordinary sets, written with curly brackets like \(\{x,y,z,\ldots\}\), which behave like bags of objects\(^{3}\). The order in which the objects are written is irrelevant there.

To solve Equation 6.1 we could do what you are used to doing in algebra, namely try to isolate one variable, let’s say \(x\), on the left side of the equation:

\[ \begin{align*} a x & = c - b y \quad \text{therefore} \\ x & = \tfrac{c}{a} - \tfrac{b}{a} y \end{align*} \]

Here it stops: we cannot get any further. Also, we should have been more careful, because in the second step we divided both sides by the scalar \(a\). This operation is only allowed if \(a \neq 0\). If \(a=0\), then \(x\) can be any real number. But then we can determine \(y\) exactly, namely \(y=\tfrac{c}{b}\), unless \(b=0\) as well. We summarize the solutions in the table below:

\[
\begin{array}{ccccc|l}
a & b & c & x & y & \text{set of tuples } (x,y) \\ \hline
\neq 0 & \neq 0 & \in \mathbb{R} & \tfrac{c}{a} - \tfrac{b}{a} y & \mathbb{R} & \{(x,y)|y \in \mathbb{R} \textrm{ and } x=\tfrac{c}{a} - \tfrac{b}{a} y \} \\
0 & \neq 0 & \in \mathbb{R} & \mathbb{R} & \tfrac{c}{b} & \{(x,y)|y=\tfrac{c}{b} \textrm{ and } x \in \mathbb{R} \} \\
\neq 0 & 0 & \in \mathbb{R} & \tfrac{c}{a} & \mathbb{R} & \{(x,y)|y \in \mathbb{R} \textrm{ and } x=\tfrac{c}{a} \} \\
0 & 0 & \neq 0 & \varnothing & \varnothing & \{(x,y)|y \in \varnothing \textrm{ and } x \in \varnothing \} \\
0 & 0 & 0 & \mathbb{R} & \mathbb{R} & \{(x,y)|y \in \mathbb{R} \textrm{ and } x \in \mathbb{R} \}
\end{array}
\]

The first solution, arguably the one that occurs most often when the scalars \(a\), \(b\) and \(c\) are chosen at random from the real numbers, expresses one variable in terms of the other. It means that we can freely choose the value of one variable, \(y\) in this case, but choosing a value for \(y\) also fixes the value of \(x\). A variable whose value can be freely chosen in such a solution is called a free variable. Note, however, that you only obtain the complete solution set by letting the free variable take all allowed values. For example, in the case above, the complete solution set is generated by letting \(y\) assume all values in \(\mathbb{R}\). Listing the different types of solutions in a table, as was done above, becomes inconvenient when the number of variables and scalars increases. Instead, we will later adopt notations that summarize these solutions implicitly.
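The role of the free variable can be illustrated with a small Python sketch that generates tuples \((x,y)\) from chosen values of \(y\). It covers only the case \(a \neq 0\); the function name `solutions_for` and the example values \(a=1\), \(b=2\), \(c=5\) are illustrative assumptions.

```python
from fractions import Fraction


def solutions_for(a, b, c, y_values):
    """Tuples (x, y) that solve a*x + b*y = c for the given values of y.

    Assumes a != 0, so that x = c/a - (b/a)*y is defined.
    """
    if a == 0:
        raise ValueError("this sketch only covers the case a != 0")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return [(c / a - (b / a) * y, y) for y in y_values]


# Illustrative equation x + 2y = 5 with three choices for the free variable y.
for x, y in solutions_for(1, 2, 5, [0, 1, 2]):
    print((x, y), "satisfies the equation:", x + 2 * y == 5)
```

A finite list of values for \(y\) only samples the solution set; the complete set is obtained by letting \(y\) range over all of \(\mathbb{R}\), which no finite list can do.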

6.2 The connection with geometry

The term linear algebra states that this branch of mathematics is concerned with algebra. However, there is an intimate connection between algebra and geometry. It was discovered and explored in great depth by the philosopher René Descartes. The notation that we use nowadays for variables (\(x\), \(y\), \(z\), etc.) and scalars (\(a\), \(b\), and \(c\)) is also due to Descartes. Descartes showed that it was possible to map equations (linear as well as nonlinear) onto the plane. For this he used a coordinate system that is now named after him: Cartesian coordinates. The idea was to give every variable \(x\), \(y\) its own axis. In this way, every combination of two numbers is mapped to a unique position in the plane. If we have only one variable \(x\), there will be only one axis, which lies on a line. Every value of \(x\) has a unique position on the line, and conversely, every point on the line has a unique value of \(x\) associated with it. If we have two variables \(x\) and \(y\), we have two axes, and all pairs of numbers \((x,y)\) map uniquely to a point on the plane. The converse statement, namely that every point on the plane maps to a unique pair of numbers \((x,y)\), is also true. We say that there is a one-to-one mapping of the plane to the set of pairs of numbers \((x,y)\). The usual way to lay out these axes is to make them perpendicular. The same trick can be used in three dimensions if we have three variables \(x\), \(y\), and \(z\). With four or more dimensions it becomes impossible to form a mental image of the mapping, although nothing prevents us from applying the same one-to-one mapping trick.

We can use the mapping trick to visualize solutions to equations. The simplest mapping is that of an equation of one variable on a line, like \(x=3\). How does that map onto one-dimensional space? It is just one point on the line (Figure 6.1). How is an equation like \(x=3\) projected on the plane? That depends on whether we know anything specific about the other variable, \(y\). If \(y\) can be any real number, then the solution set will be \(\{(x,y)|x=3 \textrm{ and } y \in \mathbb{R}\}\), which maps onto a vertical line (Figure 6.2).

Figure 6.1: Geometric mapping of the solution set of the equation \(x=3\), when \(x\) is the only variable. The solution set is the single 1-tuple \((3)\) whose geometric representation is a point.
Figure 6.2: Geometric representation of the solution set of the equation \(x=3\), when there is an additional variable \(y\). Here, the solution set is a set of 2-tuples \((3,y)\) where \(y\) can be any number in \(\mathbb{R}\). This set can be represented geometrically by a line.

And a bit more complicated: how could we display an equation like \(x+2y=5\) on the plane? Is it equivalent to a point? No, as we saw in the table above, with these non-zero scalar coefficients, the solution to such an equation is the infinite set of 2-tuples \(\{(x,y)|y \in \mathbb{R} \textrm{ and } x = 5 - 2 y\}\), which, therefore, cannot be a single point in the plane. The geometric mapping of this solution set is shown in Figure 6.3.

Figure 6.3: Geometric mapping of the solution set of the equation \(x + 2y =5\).

How do you obtain such a mapping? First of all, linear equations always map onto straight objects in geometry: they have no bends or curves. The equation above maps to a straight line. To draw a straight line, you only need to determine two different points on that line, and you can draw the rest with a ruler. The easiest way to determine two points of the solution set is to set \(y=0\) and determine \(x\), and to set \(x=0\) and determine \(y\):

\[ \begin{align*} \text{if }y & = 0\text{ then }x=5\text{, yields the point corresponding to the tuple }(5,0) \\ \text{if }x & = 0\text{ then }y=2\tfrac{1}{2}\text{, yields the point corresponding to the tuple }(0,2\tfrac{1}{2}) \end{align*} \]
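The two-point recipe can be sketched in Python as follows. The function name `axis_intercepts` is a hypothetical helper; the sketch assumes \(a\), \(b\) and \(c\) are all non-zero, because otherwise the line is parallel to an axis or the two intercepts coincide at the origin.

```python
from fractions import Fraction


def axis_intercepts(a, b, c):
    """Two points on the line a*x + b*y = c, found by setting y = 0 and x = 0.

    Assumes a, b and c are all non-zero, so that the two points exist
    and are different from each other.
    """
    if a == 0 or b == 0 or c == 0:
        raise ValueError("this sketch needs non-zero a, b and c")
    on_x_axis = (Fraction(c) / Fraction(a), Fraction(0))  # set y = 0
    on_y_axis = (Fraction(0), Fraction(c) / Fraction(b))  # set x = 0
    return on_x_axis, on_y_axis


# The line x + 2y = 5 from the text.
p, q = axis_intercepts(1, 2, 5)
print(p, q)
```

For the excluded cases any two other points of the solution set will do, for example those obtained by choosing two different values for the free variable.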

Until now we have only looked at single equations. The more interesting parts of linear algebra are concerned with systems of linear equations. The solution set of a system of linear equations consists of the tuples that satisfy all individual equations in the system simultaneously. In the language of set theory we say that the solution set of a system of linear equations equals the intersection of the solution sets of the individual equations in the system. The intersection of two sets \(A\) and \(B\) is written as \(A \cap B\). Let’s take the examples used above and ask what the solution set of the following system is:

Example 6.1 (Solving a system of two equations in two variables) \[ \begin{align} \begin{aligned} x & = 3 \\ x + 2 y & = 5 \end{aligned} \end{align} \tag{6.2}\]

The solution set of this system of equations is the intersection of the two individual solution sets:

\[ \{(x,y)|x=3 \textrm{ and } y \in \mathbb{R}\} \; \cap \; \{(x,y)|y \in \mathbb{R} \textrm{ and } x = 5 - 2 y\} \]

Figure 6.4: The solution set of the system Equation 6.2 geometrically corresponds to the intersection of the solution sets of the individual equations, i.e. the intersection of the two lines.

We can calculate this intersection by substituting the equation \(x=3\) in \(x = 5 - 2 y\), thus obtaining \(y=1\). So, whereas \(y\) was a free variable (i.e. \(y \in \mathbb{R}\)) in the individual solution sets, there is only a single valid value for \(y\) in the intersection of the two sets. The solution set is a set with a single tuple: \(\{(3,1)\}\). Geometrically, this corresponds to one point in the plane. The intersection of solution sets is represented geometrically by the intersection of the lines that mapped the solution sets of the individual equations, as shown in Figure 6.4. Therefore, we see a consistent correspondence between the algebraic, set-theoretic, and geometric objects.
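Written out step by step, the substitution of \(x=3\) into \(x = 5 - 2y\) reads:

\[ \begin{align*} 3 & = 5 - 2 y \\ 2 y & = 5 - 3 = 2 \\ y & = 1 \end{align*} \]

As a check, the tuple \((3,1)\) satisfies both equations of the system: \(x = 3\) holds directly, and \(3 + 2 \times 1 = 5\).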

Graphical solution of a system of linear equations

In the previous section we saw that we could make a graphical or geometric image of a solution set of a linear equation. We also saw that the intersections of these graphical images represent the intersections of the solution sets, hence, they represent the solution to a system of equations. Let’s use this idea to solve the following system of two linear equations:

Example 6.2 \[ \begin{align*} 2x-3y & = 2\tfrac{1}{2} \\ x+4y & = 4 \end{align*} \]

We draw the solution sets of these two equations (Figure 6.5), and read the solution from the graph: The intersection of the two solution sets is the tuple \((x,y)=(2,\tfrac{1}{2})\).

Figure 6.5: Graphical solution of a system of linear equations.

We can check the correctness of this solution by substituting the values for \(x\) and \(y\) in the system of equations. Then we obtain

\[ \begin{align*} 2 \times 2 - 3 \times \tfrac{1}{2} & = 4 - 1 \tfrac{1}{2} = 2\tfrac{1}{2} \\ 2 + 4 \times \tfrac{1}{2} & = 2 + 2 = 4 \end{align*} \] so this solution is correct!

We could, of course, also have obtained this solution by a few algebraic manipulations of the set of equations. For example, from the second equation we could express \(x\) in terms of \(y\):

\[ x = 4 - 4y \]

substituting this for \(x\) in the first equation we get

\[ \begin{align*} 2(4 - 4y) - 3y & = 8 - 8 y - 3 y = 8 - 11 y = \tfrac{5}{2} \\ -11 y & = - \tfrac{11}{2} \\ y & = \tfrac{1}{2} \end{align*} \]

substituting this value for \(y\) back into \(x = 4 - 4y\), we get \(x=4-2=2\).
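The substitution steps can be generalized to any system of two equations in two variables. The Python sketch below is one way to do this, under the assumption that the coefficient of \(x\) in the second equation is non-zero and that the system has exactly one solution; the function name `solve_by_substitution` and its argument names are illustrative.

```python
from fractions import Fraction


def solve_by_substitution(a1, b1, c1, a2, b2, c2):
    """Solve  a1*x + b1*y = c1  and  a2*x + b2*y = c2  by substitution.

    Assumes a2 != 0 (so x can be isolated from the second equation)
    and that the system has exactly one solution.
    """
    a1, b1, c1, a2, b2, c2 = (Fraction(v) for v in (a1, b1, c1, a2, b2, c2))
    if a2 == 0:
        raise ValueError("cannot isolate x from the second equation")
    # Second equation:  x = c2/a2 - (b2/a2)*y.
    # Substituted into the first:  (b1 - a1*b2/a2)*y = c1 - a1*c2/a2.
    y_coefficient = b1 - a1 * b2 / a2
    if y_coefficient == 0:
        raise ValueError("the system has no unique solution")
    y = (c1 - a1 * c2 / a2) / y_coefficient
    x = c2 / a2 - (b2 / a2) * y
    return x, y


# Example 6.2:  2x - 3y = 5/2  and  x + 4y = 4.
x, y = solve_by_substitution(2, -3, Fraction(5, 2), 1, 4, 4)
print(x, y)
```

Exact fractions are used so that the result can be compared directly with the hand calculation; the error raised for a zero coefficient of \(y\) corresponds to systems whose solution set is empty or infinite.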

6.3 Exercises

Graphical solution

Exercise 1.

Solve this system both graphically and algebraically \[ \begin{align*} x-y & = 0 \\ x+2y & = 6 \end{align*} \]

Exercise 2.

Solve this system both graphically and algebraically \[ \begin{align*} x+y & = 2 \\ x-2y & = 0 \end{align*} \]

Exercise 3.

Solve this system graphically. What is the solution set? \[ \begin{align*} x-2y & = 0 \\ 2x-4y & = 1 \end{align*} \]

Exercise 4.

Explain why the system \[ \begin{align*} u + v + w & = 2 \\ u + 2v + 3w & = 1 \\ v + 2w & = 0 \end{align*} \] is singular and has an empty solution set by finding a linear combination of the three equations that results in the equation \(0 = 1\).

Exercise 5.

Solve this system graphically. What is the solution set? \[ \begin{align*} x+y & = 2 \\ -2x-2y & = -4 \end{align*} \]


  1. Uncountable means that there exists no one-to-one mapping with the (countable) set \(\mathbb{N}\) of natural numbers. Even though there are infinitely many natural numbers, you can count them. However, there is no way to count the set \(\mathbb{R}\), i.e. to map every number in \(\mathbb{R}\) one-to-one with a number in \(\mathbb{N}\) and vice versa.

  2. \(\{x|x \in \mathbb{R}\}\) is the technical way of writing this solution in set builder notation. In words: the set of all numbers \(x\) where \(x\) is an element of the real numbers.

  3. Furthermore, duplicate occurrences of elements in ordinary sets have no relevance, i.e. \(\{2,3,3\}=\{2,3\}\), which is certainly not true for tuples: \((2,3,3) \neq (2,3)\).