10  Solving the matrix equation \(A \vec{x}=\vec{b}\)

\[ \newcommand {\mat}[1] { \boldsymbol{#1} } \newcommand {\yields} { \; \Rightarrow \; } \newcommand {\rank}[1] { \text{rank}\left(#1\right) } \newcommand {\nul}[1] { \mathcal{N}\left(#1\right) } \newcommand {\csp}[1] { \mathcal{C}\left(#1\right) } \newcommand {\rsp}[1] { \mathcal{R}\left(#1\right) } \newcommand {\lsp}[1] { \mathcal{L}\left(#1\right) } \renewcommand {\vec}[1] {\mathbf{#1}} \] \[ \newcommand {\hair} { \hspace{0.5pt} } \newcommand {\dd}[2] { \frac{\mathrm{d} \hair #1}{ \mathrm{d} \hair #2} } \newcommand {\tdd}[2] { \tfrac{\mathrm{d} #1}{\mathrm{d} #2} } \newcommand {\pp}[2] { \frac{\partial #1}{\partial #2} } \newcommand {\indint}[2] { \int \! {#1} \, \mathrm{d}{#2} } \newcommand {\defint}[4] { \int_{#1}^{#2} \! {#3} \, \mathrm{d}{#4} } \]

\[ \newcommand{\unit}[1]{\,\mathrm{#1}} \newcommand{\degC}{^\circ{\mathrm C}} \newcommand{\tothe}[1]{\times 10^{#1}} \]

10.1 The column space

We saw that the null space of a matrix \(\mat{A}\) is the set of vectors \(\vec{x}\) that solve the homogeneous matrix equation \(\mat{A} \vec{x} = \vec{0}\). Another important vector space characterizing a matrix \(\mat{A}\) is its column space. Suppose we try to solve the non-homogeneous matrix equation \(\mat{A} \vec{x} = \vec{b}\), where \(\vec{b} \neq \vec{0}\). We saw that such an equation (or the equivalent system of linear equations) sometimes has no solution, i.e. it has the empty set \(\varnothing\) as its solution set. What can we say about \(\vec{b}\) when the solution set is not empty? Or, to put it differently, which vectors \(\vec{b}\) are admissible if \(\mat{A} \vec{x} = \vec{b}\) is to have a non-empty solution set? Well, we know that an equation like

\[ \mat{A} \vec{x} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \]

can also be written as the sum of column vectors (see Equation 8.1):

\[ \mat{A} \vec{x} = \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} x_1 + \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} x_2 + \cdots + \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} x_n = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \]

This way of writing the matrix equation shows that the left-hand side is a linear combination of the column vectors of the matrix \(\mat{A}\), since \(x_1, x_2, \ldots, x_n\) are scalars. The set of all such linear combinations of the columns of \(\mat{A}\) forms a vector space, called the column space. Clearly, admissible vectors \(\vec{b}\) must lie in that column space (i.e. be members of the column space) for the solution set for \(\vec{x}\) to be non-empty!
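To make this column picture concrete, here is a minimal numerical sketch (the matrix and the vector \(\vec{x}\) are made up purely for illustration) showing that \(\mat{A}\vec{x}\) is exactly the corresponding linear combination of the columns of \(\mat{A}\):

```python
import numpy as np

# A made-up 3x2 matrix and a vector x, purely for illustration
A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 1.0]])
x = np.array([2.0, -1.0])

# The matrix-vector product A @ x ...
b_product = A @ x

# ... equals the linear combination x1*col1 + x2*col2 of the columns of A
b_columns = x[0] * A[:, 0] + x[1] * A[:, 1]

print(b_product)   # [ 0. -1.  5.]
print(b_columns)   # identical: any b of this form lies in the column space
```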

Example 10.1 (Trivial example) If we have a matrix

\[ \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix} \] then \[ \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix} \vec{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

can never give a solution for \(\vec{x}\), because there are no values of \(x_1\) and \(x_2\) that make both sides of the equation

\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 2 \\ 0 \end{bmatrix} x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

equal. In general, only vectors of the form \(\bigl[\begin{smallmatrix} a \\ 0\end{smallmatrix}\bigr]\), where \(a\) can be any real value, are in the column space.

The following example is less trivial.

Example 10.2 (A less trivial case) Now consider the matrix \[ \begin{bmatrix} 1 & 2 \\ 4 & 8 \end{bmatrix} \]

For this matrix, the equation

\[ \begin{bmatrix} 1 & 2 \\ 4 & 8 \end{bmatrix} \vec{x} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

will not yield a solution for \(\vec{x}\), because \(\bigl[\begin{smallmatrix} 1 \\ 1\end{smallmatrix}\bigr]\) is not in the column space of this matrix. The column space consists precisely of the vectors \(\bigl[\begin{smallmatrix} 1 \\ 4\end{smallmatrix}\bigr] a\), where \(a\) can be any real value.

For a \(2 \times 2\) matrix this conclusion may still seem trivial, but for larger matrices it becomes harder to find out which vectors are in the column space.
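For such larger matrices, a practical membership test is to compare ranks: a vector \(\vec{b}\) lies in the column space of \(\mat{A}\) exactly when appending \(\vec{b}\) as an extra column does not increase the rank. A small sketch using SymPy, with the matrix from Example 10.2:

```python
from sympy import Matrix

A = Matrix([[1, 2],
            [4, 8]])
b = Matrix([1, 1])

# b is in the column space of A iff augmenting A with b does not raise the rank
in_column_space = A.row_join(b).rank() == A.rank()
print(in_column_space)   # False: [1, 1] is not in the column space

# A basis for the column space itself
print(A.columnspace())   # [Matrix([[1], [4]])]
```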

We have now arrived at a conclusion for the matrix equation \(\mat{A} \vec{x} = \vec{b}\) that differs from our conclusion for the homogeneous equation \(\mat{A} \vec{x} = \vec{0}\), namely that, whereas solution sets for the homogeneous equation always have at least one member (namely \(\vec{0}\)), solution sets for the non-homogeneous equation can be empty sets! That is because \(\vec{0}\) is always in the column space (or in any vector space), but an arbitrary vector \(\vec{b}\) is not necessarily in that column space.

10.2 Constructing the solution set for \(A \vec{x}=\vec{b}\)

Suppose that we know that a particular vector, let’s call it \(\vec{x}_p\), is a solution to the equation \(\mat{A} \vec{x}=\vec{b}\), or that \(\mat{A} \vec{x}_p = \vec{b}\). Then the following theorem holds:

Theorem 10.1 If \(\vec{x}_p\) is a particular vector that satisfies \(\mat{A} \vec{x}_p = \vec{b}\) and \(\vec{x}_n\) is a vector from the null space of \(\mat{A}\), then \(\vec{x} = \vec{x}_p + \vec{x}_n\) is also a solution to \(\mat{A} \vec{x} = \vec{b}\). Furthermore, the complete solution set for \(\mat{A} \vec{x} = \vec{b}\) can be written as \(\vec{x}_p + \vec{x}_n\), where \(\vec{x}_n\) is any vector from the null space of \(\mat{A}\).

Proof. To show that \(\vec{x} = \vec{x}_p + \vec{x}_n\) also satisfies \(\mat{A} \vec{x}=\vec{b}\) we first need to remind ourselves of the fact that by definition \(\mat{A}\vec{x}_n=\vec{0}\). It now becomes quite trivial:

\[ \mat{A} \vec{x} = \mat{A} (\vec{x}_p + \vec{x}_n) = \mat{A} \vec{x}_p + \mat{A} \vec{x}_n = \vec{b} + \vec{0} = \vec{b} \]

To show that it also describes the complete solution set for \(\mat{A} \vec{x}=\vec{b}\): suppose that we have any other vector \(\vec{y}\) that is also a solution, so \(\mat{A} \vec{y}=\vec{b}\). Then the difference \(\vec{y}' = \vec{y} - \vec{x}_p\) must be in the null space of \(\mat{A}\), since \(\mat{A}(\vec{y} - \vec{x}_p)=\mat{A}\vec{y} - \mat{A}\vec{x}_p = \vec{b} - \vec{b} = \vec{0}\).

Therefore, \(\vec{y}\) itself must be a member of the set described by \(\vec{x}_p + \vec{x}_n\) where \(\vec{x}_n \in \nul{\mat{A}}\), since it can be written as \(\vec{y} = \vec{x}_p + \vec{y}'\), where \(\vec{y}' \in \nul{\mat{A}}\).
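As a quick numerical sanity check of Theorem 10.1, here is a small sketch (the matrix, right-hand side, particular solution and null-space vector are all made up for illustration):

```python
import numpy as np

# Made-up rank-deficient matrix, right-hand side, and particular solution
A  = np.array([[1.0, 2.0],
               [2.0, 4.0]])
b  = np.array([3.0, 6.0])
xp = np.array([3.0, 0.0])        # satisfies A @ xp == b
xn = np.array([-2.0, 1.0])       # satisfies A @ xn == 0 (null-space vector)

for t in (0.0, 1.0, -2.5, 7.0):  # any multiple of xn may be added
    x = xp + t * xn
    assert np.allclose(A @ x, b) # x_p + x_n is still a solution
print("all checks passed")
```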

To practically solve the problem of obtaining a description of the solution set for \(\mat{A} \vec{x}=\vec{b}\), we need a method to obtain one solution vector \(\vec{x}_p\) and we need a description of the null space. For the latter we already know a method: Gaussian elimination. For the former we will use Gauss-Jordan elimination, and we will see that we can actually answer both questions in one elimination. Let’s take our previous homogeneous example, Example 9.2, but now solve a non-homogeneous case:

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 6 \end{bmatrix} \]

We perform Gauss-Jordan elimination, but now, of course, with the right-hand vector augmented (because row operations will change it!).

\[ \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 1 & 2 & 3 & 8\\ 0 & 1 & 2 & 6 \end{array} \;\yields\; \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 4\\ 0 & 1 & 2 & 6 \end{array} \;\yields\; \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 4\\ 0 & 0 & 0 & 2 \end{array} \;\yields\; \begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 0 & 2 \end{array} \]

which yields the matrix equation

\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 0 \\ 4 \\ 2 \end{bmatrix} \]

which is equivalent to the system of equations:

\[ \begin{alignedat}{6} u & & & {}-{} & w &= 0 \\ & & v & {}+{} & 2w &= 4 \\ 0u & {}+{} & 0v & {}+{} & 0w &= 2 \end{alignedat} \]

Clearly, it is impossible to solve this system with any choice of \(u\), \(v\) and \(w\), because the last equation reduces to \(0=2\). This shows that the solution set of this system is the empty set \(\varnothing\): there is no solution. The right-hand vector is incompatible with the left-hand side, or, in other words, it is not in the column space of the matrix.
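This kind of consistency check is easy to automate; here is a sketch using SymPy’s rref() on the augmented matrix above (the reduced row-echelon form carries the elimination one step further than shown, but it exposes the same impossible equation):

```python
from sympy import Matrix

# Augmented matrix [A | b] for the incompatible right-hand side [4, 8, 6]
aug = Matrix([[1, 1, 1, 4],
              [1, 2, 3, 8],
              [0, 1, 2, 6]])

R, pivots = aug.rref()
print(R)        # last row is [0, 0, 0, 1], i.e. 0*u + 0*v + 0*w = 1: impossible
print(pivots)   # (0, 1, 3): the augmented column itself contains a pivot,
                # which is exactly the signature of an inconsistent system
```

Now let’s try another right-hand side: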

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ 1 \end{bmatrix} \]

Again, we perform Gauss-Jordan elimination:

\[ \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 1 & 2 & 3 & 5\\ 0 & 1 & 2 & 1 \end{array} \;\yields\; \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 1\\ 0 & 1 & 2 & 1 \end{array} \;\yields\; \begin{array}{ccc|c} 1 & 1 & 1 & 4 \\ 0 & 1 & 2 & 1\\ 0 & 0 & 0 & 0 \end{array} \;\yields\; \begin{array}{ccc|c} \textcolor{red}{1} & 0 & -1 & 3 \\ 0 & \textcolor{red}{1} & 2 & 1 \\ 0 & 0 & 0 & 0 \end{array} \]

which yields the matrix equation

\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} \]

This is equivalent to a system with one free variable (\(w\)):

\[ \begin{alignedat}{6} \textcolor{red}{u} & & & {}-{} & w &= 3 \\ & & \textcolor{red}{v} & {}+{} & 2w &= 1 \\ 0u & {}+{} & 0v & {}+{} & 0w &= 0 \end{alignedat} \]

The last equation, which reduces to \(0=0\), shows that the right-hand vector is compatible. We can make the first two equations valid by choosing the right expressions for the pivot variables \(u\) and \(v\) in terms of the free variable \(w\):

\[ \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 3 + w \\ 1 - 2 w \\ w \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} w \quad \textrm{where } w \in \mathbb{R} \]

We see that this equation is of the form \(\vec{x} = \vec{x}_p + \vec{x}_n\) where \(\vec{x}_n \in \nul{\mat{A}}\), with \(\vec{x}_p = \Bigl[\begin{smallmatrix} 3 \\ 1 \\ 0 \end{smallmatrix} \Bigr]\) and \(\vec{x}_n = \Bigl[\begin{smallmatrix*}[r] 1 \\ -2 \\ 1 \end{smallmatrix*} \Bigr] w\). The latter is equal to the description of the null space (Equation 9.1)!

How can we see that this right-hand side vector, \(\Bigl[\begin{smallmatrix} 4 \\ 5 \\ 1 \end{smallmatrix}\Bigr]\), is in the column space of the matrix? Our solution shows that it is. For example, set the free variable \(w=1\). Then \(u=4\), \(v=-1\), and we obtain:

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \\ 1 \end{bmatrix} = 4 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ 1 \end{bmatrix} \]

In fact, our solution showed that any choice for \(w\) will do the job: \[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} 3 + w \\ 1 - 2 w \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} (3 + w) + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} (1 - 2w) + \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} w = \begin{bmatrix} 3 + w + 1 - 2w + w \\ 3 + w + 2 - 4w + 3w \\ 0 + 1 - 2w + 2w \end{bmatrix} = \begin{bmatrix} 4 \\ 5 \\ 1 \end{bmatrix} \]
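The same general solution can be obtained programmatically; here is a sketch using SymPy’s gauss_jordan_solve, which returns a particular solution together with the null-space contribution expressed in free parameters:

```python
from sympy import Matrix

A = Matrix([[1, 1, 1],
            [1, 2, 3],
            [0, 1, 2]])
b = Matrix([4, 5, 1])

# General solution: a particular solution plus the null-space contribution,
# expressed with one free parameter (SymPy calls it tau0)
sol, params = A.gauss_jordan_solve(b)
print(sol)             # [3 + tau0, 1 - 2*tau0, tau0]: matches x_p + w * x_n
print(params)          # the single free parameter

# The null space on its own, for comparison with the x_n part
print(A.nullspace())   # one basis vector: (1, -2, 1)
```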

10.3 Solvability condition for \(\vec{b}\)

We saw in the example above that \(\Bigl[\begin{smallmatrix}4 \\ 8 \\ 6 \end{smallmatrix}\Bigr]\) was incompatible with the matrix, whereas \(\Bigl[\begin{smallmatrix}4 \\ 5 \\ 1 \end{smallmatrix}\Bigr]\) was compatible. How can we construct general right hand side vectors \(\vec{b}\) for the equation \(\mat{A} \vec{x} = \vec{b}\) that are compatible with the matrix \(\mat{A}\)? Or, in other words, how can we formulate admissible vectors \(\vec{b}\)? Again, we use Gaussian elimination, but now on the general equation

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \]

where we leave the entries for \(\vec{b}\) undetermined. Performing Gaussian elimination yields \[ \begin{array}{ccc|l} 1 & 1 & 1 & b_1 \\ 1 & 2 & 3 & b_2\\ 0 & 1 & 2 & b_3 \end{array} \;\yields\; \begin{array}{ccc|l} 1 & 1 & 1 & b_1 \\ 0 & 1 & 2 & b_2 - b_1\\ 0 & 1 & 2 & b_3 \end{array} \;\yields\; \begin{array}{ccc|l} 1 & 1 & 1 & b_1 \\ 0 & 1 & 2 & b_2 - b_1 \\ 0 & 0 & 0 & b_3 - b_2 + b_1 \end{array} \]

We don’t have to continue until obtaining the reduced row-echelon form because we can already see that the last equation \(0 = b_3 - b_2 + b_1\) yields a solvability condition on \(b_1\), \(b_2\), and \(b_3\) independent of \(u\), \(v\) and \(w\), whereas the first two equations can be satisfied for any values of \(b_1\), \(b_2\) and \(b_3\) by choosing the right \(u\), \(v\) and \(w\). Using the last equation, we can express one of the variables \(b\) in terms of the other two, for example \(b_3 = b_2 - b_1\). Therefore, the most general formulation of the column space \(\csp{\mat{A}}\) of \(\mat{A}\) is the set of vectors:

\[ \begin{split} \csp{\mat{A}} = \begin{bmatrix} b_1 \\ b_2 \\ b_2 - b_1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} b_1 + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} b_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \\ \\ \textrm{where } b_1, b_2 \in \mathbb{R} \end{split} \tag{10.1}\]

Clearly, \(\Bigl[\begin{smallmatrix}4 \\ 5 \\ 1 \end{smallmatrix}\Bigr]\) satisfies this condition, whereas \(\Bigl[\begin{smallmatrix}4 \\ 8 \\ 6 \end{smallmatrix}\Bigr]\) does not. We saw here that we formulated \(\vec{b}\) using the two free variables \(b_1\) and \(b_2\): they can independently take any real values to obtain a compatible right hand vector.

We can also make a general statement about the number of free variables needed to describe the column space, i.e. about the dimension of the column space \(\dim(\csp{\mat{A}})\): it is equal to the rank of \(\mat{A}\), i.e. \(\dim(\csp{\mat{A}})=\rank{\mat{A}}\).
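Both observations, the solvability condition and \(\dim(\csp{\mat{A}})=\rank{\mat{A}}\), can be verified directly; a small sketch with SymPy:

```python
from sympy import Matrix

A = Matrix([[1, 1, 1],
            [1, 2, 3],
            [0, 1, 2]])

print(A.rank())              # 2
print(len(A.columnspace()))  # 2 = dim C(A) = rank(A)

# Solvability condition b1 - b2 + b3 = 0 for the two right-hand sides used above
for b in ([4, 5, 1], [4, 8, 6]):
    b1, b2, b3 = b
    print(b, "admissible:", b1 - b2 + b3 == 0)   # True for [4,5,1], False for [4,8,6]
```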

Wait a minute, didn’t we say before that the column space equals all linear combinations of the columns of a matrix? So, in this case

\[ \csp{\mat{A}} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} x_2 + \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} x_3 \qquad \textrm{where } x_1 \in \mathbb{R} \textrm{ and } x_2 \in \mathbb{R} \textrm{ and } x_3 \in \mathbb{R} \]

That is true: in fact, both this formulation and the previous one are valid and do not contradict each other. The difference is that the three vectors in the last formulation are not a basis for the column space, whereas the two vectors in the first formulation are: the first formulation uses two free variables and this one three. The reason is that Gaussian elimination ironed out the dependencies between columns and rows and gave us a basis, whereas a dependency is still present among the columns of the original matrix. Namely, the third column equals 2 times the second column minus the first column:

\[ \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 2 - \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \]

This means that there are effectively only two free variables among the \(x_1\), \(x_2\) and \(x_3\), since

\[ \begin{split} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} x_2 + \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} x_3 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} x_1 + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} x_2 + \left( \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 2 - \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right) x_3 = \\ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} (x_1 - x_3) + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} (x_2 + 2 x_3) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} x_1' + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} x_2' \end{split} \]

It is not so difficult to show that this representation of \(\csp{\mat{A}}\) is equivalent to the one we obtained after Gaussian elimination in Equation 10.1: substitute \(x_1' = 2b_1 - b_2\) and \(x_2' = b_2 - b_1\) in the equation above.
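Carrying out that substitution explicitly:

\[ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} (2b_1 - b_2) + \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} (b_2 - b_1) = \begin{bmatrix} (2b_1 - b_2) + (b_2 - b_1) \\ (2b_1 - b_2) + 2(b_2 - b_1) \\ b_2 - b_1 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_2 - b_1 \end{bmatrix} \]

which is exactly the set of admissible vectors \(\vec{b}\) from Equation 10.1.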

10.4 Exercises

Solving \(\mat{A}\vec{x}=\vec{b}\)

Exercise 32.

Find the solution set for the equation \[ \begin{bmatrix} 1 & 3 & 3 & 2 \\ 2 & 6 & 9 & 7 \\ -1 & -3 & 3 & 4 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \\ x \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 5 \end{bmatrix} \]

Exercise 33.

Find the value of \(c\) that makes it possible to solve \(\mat{A} \vec{x} = \vec{b}\), and solve it. \[ \begin{align*} u + v + 2w &= 2 \\ 2u + 3v - w &= 5 \\ 3u + 4v + w &= c \end{align*} \]

Exercise 34.

Under which conditions on \(b_{1}\) and \(b_{2}\) (if any) does \(\mat{A}\vec{x}=\vec{b}\) have a solution? \[ \mat{A} = \begin{bmatrix} 1 & 2 & 0 & 3 \\ 2 & 4 & 0 & 7 \end{bmatrix} \qquad \vec{b} = \begin{bmatrix} b_{1} \\ b_{2} \end{bmatrix} \]

Exercise 35.

We have the following matrix: \[ \mat{A}= \begin{bmatrix} 1 & 1 & 2 & 2\\ 1 & 1 & 1 & 0\\ 0 & 0 & 1 & 2 \end{bmatrix} \]

a. This matrix equation corresponds to a system of __ equations with __ variables.

b. If you just knew the numbers in your previous answer, what could you say about the number of free variables in a solution of this matrix equation?

c. What is the rank of \(\mat{A}\)?

d. How many free variables (or what dimension or what number of degrees of freedom) does the nullspace of this matrix have?

e. Give an equation that describes the complete nullspace of \(\mat{A}\). To which geometrical object does this solution correspond?

f. What is the general condition on \(b_1\), \(b_2\), and \(b_3\) if a solution \(\vec{x}\) exists for the equation \[ \mat{A} \vec{x} = \begin{bmatrix} b_1\\ b_2\\ b_3 \end{bmatrix} \]

Exercise 36.

We have the following system of equations, written in matrix form: \[ \begin{bmatrix} 1 & 2 & 3 & 5\\ 2 & 4 & 8 & 12\\ 3 & 6 & 7 & 13 \end{bmatrix} \begin{bmatrix} x_{1}\\ x_{2}\\ x_{3}\\ x_{4} \end{bmatrix} = \begin{bmatrix} b_{1} \\ b_{2} \\ b_{3} \end{bmatrix} \]

a. Reduce the system to echelon form using Gaussian elimination. Include the \(\vec{b}\) vector in the elimination process.

b. What condition on \(b_{1}\), \(b_{2}\) and \(b_{3}\) makes the system solvable?

c. Reduce the echelon form further to the reduced row-echelon form.

d. Give a full description of the nullspace of the matrix.

e. Check whether \(b_1=0\), \(b_2=6\) and \(b_{3}=-6\) satisfy the solvability condition and obtain the general solution for the system in this case.

Applications

Exercise 37. Deconvolution

A researcher wants to measure the concentration of a chemical compound X spectrophotometrically in samples. Compound \(\ce{X}\) absorbs light at wavelength \(w1\). Unfortunately, the samples also contain a compound \(\ce{Y}\) at unknown concentrations, with an absorption spectrum overlapping that of \(\ce{X}\) (so, absorbing at the same wavelengths). After considering the problem carefully, she decides on a solution called deconvolution by measuring the same sample at a second wavelength \(w2\), at which both compounds also absorb light but with different extinction coefficients. The extinction coefficients of both components are shown in Table 10.1.

Table 10.1: Extinction coefficients of compounds X and Y at wavelengths w1 and w2, given in \(\text{mM}^{-1}\).

| compound | w1 | w2  |
|----------|----|-----|
| X        | 2  | 1.0 |
| Y        | 1  | 1.5 |

You should know that the Beer-Lambert law says that for a pure component the absorbance \(A\) at a certain wavelength is related to the concentration \(c\) as \(A = \epsilon c\), where \(\epsilon\) is the extinction coefficient at that wavelength. Absorbances by different components at a wavelength just add up to the total absorbance at that wavelength.

In a certain sample the researcher measures a total absorbance of 8.0 at \(w1\) and of 4.5 at \(w2\).

a. Call the concentrations of \(\ce{X}\) and \(\ce{Y}\) \(x\) and \(y\), respectively. Write down the equations for the total absorption by the sample at both wavelengths, using \(x\), \(y\) and the extinction coefficients from Table 10.1.

b. Calculate the concentration of \(\ce{X}\) in the sample (don’t forget the units!).

c. Under which conditions concerning the extinction coefficients would this deconvolution method have failed (meaning if the extinction coefficients had been different, when would the method have failed)?

Exercise 38.

A (hypothetical) factory wants to blend different batches of a product with different fat and protein contents, to produce a constant quality. They currently have \(4\) batches of the raw product available:

| batch | fat | protein |
|-------|-----|---------|
| M1    | 41  | 37      |
| M2    | 37  | 34      |
| M3    | 39  | 36      |
| M4    | 40  | 34      |

(the numbers are in promille). The final blend of \(M1\), \(M2\), \(M3\) and \(M4\) should contain 39 promille of fat and 35 promille of protein. Note that you do not have to calculate an actual solution set for this problem anywhere!

a. Represent the fractions of \(M1\), \(M2\), \(M3\) and \(M4\) in the blend by the variables \(p_1 \ldots p_4\), respectively. Write the linear equations for this problem in matrix form. It should include any additional constraints on \(p_1 \ldots p_4\) that can be expressed as linear equalities.

Tip: The sum of all fractions should add up to \(\ldots\)?

b. Since they represent fractions, there are additional constraints on \(p_1 \ldots p_4\) that cannot be expressed by linear equalities. What are they?

c. It may not always be possible to find a solution to this problem that conforms with these additional constraints. Discuss a compromise for those cases.

Exercise 39.

Suppose we have a metabolic network \[ \begin{align*} \ce{A + X} & \ce{-> 2B} \\ \ce{B} & \ce{-> C} \\ \ce{2C} & \ce{-> A + Y} \end{align*} \] (perhaps you should draw the network). We denote the rates of each of these reactions consecutively as \(v_1\), \(v_2\), and \(v_3\) and the concentrations of the metabolites with lowercase letters. Concentrations \(x\) and \(y\) are kept constant.

a. Write down the system of differential equations that describes the dynamics (not necessarily steady state) of the metabolites \(\ce{A}\), \(\ce{B}\) and \(\ce{C}\) both as a system of equations and as a matrix equation.

b. Deduce conservation relations between metabolites \(\ce{A}\), \(\ce{B}\) and \(\ce{C}\) (by studying the solvability conditions to the general system \(\mat{N} \vec{v} = \vec{b}\) where \(\vec{b}\) represents the vector of time derivatives of metabolite concentrations). Also study the steady state flux distribution in this network.

c. There is a slightly different way of obtaining moiety conservation relations in a metabolic network while performing Gaussian elimination, namely by performing Gaussian elimination on the augmented matrix \(\left[ \mat{N} \mid \mat{I} \right]\), where \(\mat{I}\) is the identity matrix. Find the conservation relations by performing this Gaussian elimination.

Exercise 40.

Look at the chemical equation for the combustion of n-heptane:

\[ a \, \text{C}_7\text{H}_{16} + b \, \text{O}_2 \rightarrow c \, \text{CO}_2 + d \, \text{H}_2\text{O} \]

a. Write down the chemical balance as a matrix equation that gives the changes in C, H and O as a function of \(a\), \(b\), \(c\) and \(d\), and set that equal to \(0\). Remember that the products are formed from the reactants, so their entries enter with a negative sign.

b. Solve the matrix equation to describe the values for \(a\), \(b\), \(c\) and \(d\) that balance this equation, and give the specific solution for \(a\).

c. Set \(a=1\), write the matrix equation in the form \(\mat{A} \vec{x} = \vec{b}\), and solve it.

d. Rewrite the equation from c) using the inverse, calculate the inverse and solve that equation.

Exercise 41.

Suppose we have a metabolic network \[ \begin{align*} \ce{X + S} & \ce{-> 3Y} \\ \ce{3Y} & \ce{-> 2Z + P1} \\ \ce{2Z} & \ce{-> X + P2} \end{align*} \] We denote the rates of each of these reactions consecutively as \(v_1\), \(v_2\), and \(v_3\) and concentrations of the metabolites with lowercase letters. Concentrations \(s\), \(p_1\) and \(p_2\) are kept constant.

a. Draw the network.

b. Write down the system of differential equations that describes the dynamics (not necessarily steady state) of the metabolites \(\ce{X}\), \(\ce{Y}\) and \(\ce{Z}\) both as a system of equations and as a matrix equation.

c. Deduce conservation relations between metabolites \(\ce{X}\), \(\ce{Y}\) and \(\ce{Z}\) by studying the solvability conditions to the general system \(\mat{N} \vec{v} = \vec{b}\) where \(\vec{b}\) represents the vector of time derivatives of metabolite concentrations.

d. Write down the system of steady state equations for this system in matrix notation.

e. What is the steady state flux distribution in this network?

Exercise 42. The perceptron binary classifier

The perceptron, also called the McCulloch–Pitts neuron, is one of the earliest neural classifiers (McCulloch and Pitts, 1943). It consists of only one neuron that receives many signals and, based on these, decides to which of two possible classes an observed item belongs (Figure 10.1). It was implemented as the Perceptron machine, built by Rosenblatt in 1957.

Figure 10.1: The perceptron classifier. To the left is the layer of raw inputs \(x_i\), which are numerical variables measured on an object. The perceptron calculates the weighted sum of these inputs, sends it through an activation function (here the Heaviside or step function) and emits 0 or 1, the predicted class.

The inputs are measurements of \(n\) different variables \(x_i\) (\(i = 1 \ldots n\)), which are each weighted by a weight \(w_i\). Their weighted sum plus an offset \(b\) is taken as input to the Heaviside activation function (see Chapter 2). The class of the input is decided by whether the Heaviside function yields a 0 or a 1.

Mathematically, its description is

\[ \begin{align*} f(\vec{x}) &= b + \sum_{i=1}^n w_i x_i \\ o(\vec{x}) &= \mathcal{H}(f(\vec{x})) = \begin{cases} 0 & \text{if } f(\vec{x}) < 0 \\ 1 & \text{if } f(\vec{x}) \geq 0 \end{cases} \end{align*} \]

The weights \(w_i\) and the offset \(b\) are learned from a training data set of objects of which the classes are known. Then they are fixed and the perceptron can be applied to objects of unknown classes.
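As a concrete illustration of this decision rule, here is a minimal sketch in Python (the weights, offset and inputs below are made up, not learned from data):

```python
import numpy as np

def perceptron_predict(x, w, b):
    """Return the predicted class (0 or 1) for input vector x.

    f(x) = b + sum_i w_i * x_i, passed through the Heaviside step function.
    """
    f = b + np.dot(w, x)
    return 1 if f >= 0 else 0

# Made-up weights and offset for a 3-input perceptron (for illustration only;
# in practice these would be learned from a labelled training set)
w = np.array([0.5, -1.0, 2.0])
b = -0.25

print(perceptron_predict(np.array([1.0, 0.0, 0.0]), w, b))  # f = 0.25  -> class 1
print(perceptron_predict(np.array([0.0, 1.0, 0.0]), w, b))  # f = -1.25 -> class 0
```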

a. Clearly, the decision boundary for the two classes lies at \(f(\vec{x}) = 0\). To what object in n-dimensional space does the decision boundary \(f(\vec{x}) = 0\) correspond?

b. To which points in n-dimensional space do the conditions \(f(\vec{x}) < 0\) and \(f(\vec{x}) \geq 0\) correspond? Where do the points corresponding to the two classes of the training set ideally lie relative to the object defined by \(f(\vec{x}) = 0\)?

Tip: Try to imagine what it corresponds to in 2-dimensional space first. Then generalize your reasoning to higher dimensions.