This is the first installment in what will become an ongoing series on the basics of Linear Algebra, the foundational math behind machine learning. This article is best read alongside Linear Algebra and Its Applications by David C. Lay, Steven R. Lay, and Judi J. McDonald. Consider this series an external companion resource.
Through these essays, I hope to consolidate my understanding of these foundational concepts while, if possible, offering additional clarity to others through what I hope is an intuition-based approach to learning math. If there are any mistakes or opportunities for me to elaborate further, please share and I can make the necessary amendments.
Linear equations and systems of linear equations have a variety of real-world applications in Finance, Engineering, Chemistry, Computer Science, Statistics, Physics, and beyond. In Chemistry, linear equations are used to balance chemical reactions and calculate the quantities of reactants and products. This cornerstone of Linear Algebra also appears in Physics: in Kinematics, linear equations describe the motion of objects and help calculate distances, speeds, and accelerations, while in Thermodynamics they model heat transfer and energy flow in physical systems. The financial field relies on linear equations and systems for budgeting and portfolio analysis, while engineers might use the same tools in structural analysis to model forces and stresses in buildings. Linear Algebra is ubiquitous; everyone can appreciate it to some degree.
A linear equation is an equation with one or more variables in which every variable is raised to the power of exactly one. It can be written in the form: a₁x₁ + a₂x₂ + … + aᵣxᵣ = b. The values [a₁, a₂, …, aᵣ] and b are referred to as coefficients of a linear equation.
Examples of linear equations include: 2x + 5y = 10, 6x = 18, 7v + 8w + 0x + 2y + 3z = 15, and 3x₁ + 4x₂ + 5x₃ + 9x₄ + 10x₇ = 3.
A non-example of a linear equation would be 2x² + 6x + 5 = 2; this is an instance of a quadratic* equation. Another non-example is 7x₁ + 3x₂ = x₁x₂, in which two variables are multiplied together. The reason becomes apparent when you graph it: writing x for x₁ and y for x₂, the equation can be rearranged into the rational function y = 7x / (x − 3), which is curved as opposed to linear.
Consider the linear equation 2x + 5y = 10. The diagram below illustrates its graphical representation; you’ll notice that it is a line. This becomes more obvious when recalling the equation of a line: y = mx + b, where m is the slope and b is the y-intercept. The linear equation can be rearranged as demonstrated below to assume this form.
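The rearrangement can be worked out step by step. The following is a short sketch of the algebra, isolating y on the left-hand side:

```latex
\begin{aligned}
2x + 5y &= 10 \\
5y &= -2x + 10 \\
y &= -\tfrac{2}{5}x + 2
\end{aligned}
```

In this form the slope is m = −2/5 and the y-intercept is b = 2.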
The following conclusion can be drawn: all (x, y) points that fall on the line are then solutions to the equation 2x + 5y = 10. For example, suppose we select the point of the x-intercept (5, 0) and substitute the x and y values into their respective positions in the equation. 2(5) + 5(0) = 10. Any (x, y) point on the line may be substituted into the equation and the equality will hold true. We can generalize this finding into a rule:
The set of solutions in ℝ²* for a linear equation with two variables, ax + by = c, can be represented as a line.
Notice that this single equation has an infinite number of solutions, all of which lie in ℝ²; we will take a closer look at the number of solutions later.
This same underlying concept transfers to higher dimensional coordinate spaces, denoted ℝⁿ. In ℝ³, for example, the addition of a third variable means that the solutions of a linear equation form a plane rather than a line.
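To make the rule concrete, here is a minimal Python sketch that generates several points on the line 2x + 5y = 10 and checks that each one satisfies the equation. The helper names `y_on_line` and `satisfies` are hypothetical, chosen for this illustration, and exact fractions are used so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction

def y_on_line(x):
    """Return the y value that places (x, y) on the line 2x + 5y = 10."""
    return (10 - 2 * Fraction(x)) / 5

def satisfies(x, y):
    """Check whether (x, y) is a solution of 2x + 5y = 10."""
    return 2 * x + 5 * y == 10

# Sample a handful of x values; any real x would work equally well.
points = [(x, y_on_line(x)) for x in range(-5, 6)]
all_on_line = all(satisfies(x, y) for x, y in points)
print(all_on_line)
```

Every sampled point passes the check, while a point off the line, such as (1, 1), does not.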
Systems of Linear Equations
A system of linear equations is a collection of one or more linear equations that share the same variables. An example:
6x + 2y = 4
2x + 4y = 8
A solution to a system of linear equations is defined as the values (s₁, s₂, …, sᵣ) that make each equation true when substituted for their respective variables. In the case of the above system, the solution would be (0, 2) because when (0, 2) is substituted into the system, both equations evaluate to be true.
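The substitution check can be sketched in a few lines of Python. The function `is_solution` and the way the system is stored (a list of coefficient tuples paired with their right-hand sides) are assumptions made for this example, not notation from the textbook.

```python
def is_solution(system, values):
    """Return True if `values` satisfies every equation in `system`.

    `system` is a list of (coefficients, b) pairs, one per equation.
    """
    return all(
        sum(a * s for a, s in zip(coefficients, values)) == b
        for coefficients, b in system
    )

# 6x + 2y = 4 and 2x + 4y = 8
system = [((6, 2), 4), ((2, 4), 8)]

print(is_solution(system, (0, 2)))   # True
print(is_solution(system, (1, -1)))  # False: satisfies only the first equation
```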
Solutions to a Linear System
What are the graphical implications of a solution to a linear system? How many solutions can a linear system have? This section will examine each of the three possibilities in greater detail. They are as follows:
- Unique Solution
- No Solution
- Infinite Solutions
Unique Solution: In the case of a linear system with two variables such as the one above, the solution is a point of intersection. Why? The solution is the ordered pair that satisfies both equations, so it must lie on both lines at once, and the only point that lies on both lines is the one where they intersect. When the lines cross at exactly one point, we have a unique solution: only one solution exists which satisfies all equations in the linear system.
No Solution: Consider the case of no solution. What might that imply in the context of a linear system with two variables? In what scenarios would a collection of lines never meet? One case would be if they were parallel. In a linear system where the lines are all parallel and distinct, the linear system will have no solutions. Another case would be if, while some lines may intersect with others, there is no one common point of intersection that all lines share.
Infinite Solutions: The final case for a linear system is the existence of infinitely many solutions. When might this be possible for a two-variable linear system? If the lines are the same, then they overlap at every point, so there are infinitely many points of intersection and thus infinitely many solutions. Consider the following linear system:
6x + 3y = 18
2x + y = 6
While the coefficients may be different, these lines are actually identical! If you divide each of the coefficients of the first equation by 3, the resulting equation will be 2x + y = 6.
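The three cases can be told apart without drawing anything. The sketch below is one possible way to classify a system of two equations in two variables; the function `classify` is hypothetical, and it assumes each equation describes a genuine line (at least one of its x or y coefficients is non-zero).

```python
def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes each equation has a non-zero x or y coefficient.
    """
    # A non-zero value here means the lines have different slopes,
    # so they cross at exactly one point.
    if a1 * b2 - a2 * b1 != 0:
        return "unique"
    # Same slope: the lines coincide only if the right-hand sides
    # are in the same proportion as the coefficients.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite"
    return "none"

print(classify(6, 2, 4, 2, 4, 8))   # unique
print(classify(2, 1, 6, 2, 1, 3))   # none (parallel, distinct lines)
print(classify(6, 3, 18, 2, 1, 6))  # infinite (the same line twice)
```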
The visualization of the number of solutions for a linear system changes as the number of variables increases. Pictured below are possible diagrams of all three solution cases for a linear system with three variables. Anything after three dimensions becomes difficult for the human brain to visualize but the same rules apply! Regardless of how many variables there are, all linear systems have either no solutions, one solution, or infinite solutions.
As linear equations become more complex, the notation may become unwieldy. It’s important for the information of a linear system to be condensed so that it is easy to manipulate and work with, and so matrix notation is often used instead of a set of equations. A coefficient matrix is a type of matrix that excludes the b coefficient from each equation. An augmented matrix is inclusive of the b coefficient, hence it has one more column than a coefficient matrix.
The size, also referred to as the order, of a matrix tells us how many rows and columns it has. An m × n matrix is a matrix with m rows and n columns. For a coefficient matrix, the number of rows corresponds to how many linear equations the system has, while the number of columns tells us how many variables there are; an augmented matrix has one additional column. Take care to state the number of rows before the number of columns, as the order is not interchangeable.
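As a small illustration, the system 6x + 2y = 4, 2x + 4y = 8 from earlier can be written as a coefficient matrix and an augmented matrix. Representing a matrix as a list of rows, and the helper `size`, are choices made for this sketch.

```python
# Each inner list is one row, i.e. one equation of the system.
coefficient_matrix = [
    [6, 2],
    [2, 4],
]
constants = [4, 8]

# Append each equation's b value as an extra column.
augmented_matrix = [row + [b] for row, b in zip(coefficient_matrix, constants)]

def size(matrix):
    """Return (rows, columns), with rows stated first."""
    return len(matrix), len(matrix[0])

print(size(coefficient_matrix))  # (2, 2)
print(size(augmented_matrix))    # (2, 3)
print(augmented_matrix)          # [[6, 2, 4], [2, 4, 8]]
```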
Solving a Linear System
There is a systematic way to determine whether a linear system has a solution and, if so, whether that solution is unique or one of infinitely many, and from there to obtain the solutions. Solving a linear system can be performed using linear equations in their original form or with a matrix, though it is recommended to use a matrix as the notation is cleaner and more compact. It is good, however, to be well acquainted with both methods because each provides additional insight into the mechanics of the other.
Below is a step-by-step process for solving a system of equations sans-matrix. The basic idea is to create new equations by multiplying existing ones by a constant so that one variable has the same coefficient in two equations; subtracting one equation from the other (or adding them, if the coefficients have opposite signs) then eliminates that variable. This process is repeated until we’ve eliminated enough unknowns from the system to be able to solve for one variable, and then we work our way back up to solve for the rest through back substitution. At the end, a check is needed to ensure that the solution actually satisfies the system of equations.
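As one illustration of this procedure, here is a sketch of the steps applied to the small system 6x + 2y = 4, 2x + 4y = 8 from earlier in this article. The system and the scale factor of 3 are chosen here for illustration.

```latex
\begin{aligned}
&\text{(1)}\quad 6x + 2y = 4 \\
&\text{(2)}\quad 2x + 4y = 8 \\[4pt]
&\text{Multiply (2) by 3:}\quad 6x + 12y = 24 \\
&\text{Subtract (1):}\quad 10y = 20 \;\Rightarrow\; y = 2 \\
&\text{Back-substitute into (1):}\quad 6x + 2(2) = 4 \;\Rightarrow\; x = 0 \\
&\text{Check in (2):}\quad 2(0) + 4(2) = 8
\end{aligned}
```

The result, (0, 2), matches the solution verified by substitution earlier.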
The steps outlined previously are transferable to the matrix-centered procedure of solving a linear system. Take note of how variables that are eliminated are designated within the matrix after each transformation. Before we get into that however, let’s define some row operations. Two are actually parallel to the operations we applied previously.
- Replacement: “replace a row by the sum of itself and another row.”*
- Interchange: “swap two rows.”*
- Scaling: “multiply all entries in a row by a non-zero constant.”*
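The three row operations can be sketched as small Python functions acting on a matrix stored as a list of rows. The function names and the optional `factor` argument are assumptions for this example; `factor` covers the common variant in which a multiple of another row is added, and its default of 1 matches the definition quoted above.

```python
def replacement(matrix, target, source, factor=1):
    """Return a copy in which row `target` is itself plus factor * row `source`."""
    result = [row[:] for row in matrix]
    result[target] = [t + factor * s
                      for t, s in zip(matrix[target], matrix[source])]
    return result

def interchange(matrix, i, j):
    """Return a copy with rows i and j swapped."""
    result = [row[:] for row in matrix]
    result[i], result[j] = result[j], result[i]
    return result

def scaling(matrix, i, constant):
    """Return a copy with every entry of row i multiplied by a non-zero constant."""
    if constant == 0:
        raise ValueError("the scaling constant must be non-zero")
    result = [row[:] for row in matrix]
    result[i] = [constant * entry for entry in matrix[i]]
    return result

m = [[6, 2, 4], [2, 4, 8]]
print(interchange(m, 0, 1))      # [[2, 4, 8], [6, 2, 4]]
print(scaling(m, 1, 3))          # [[6, 2, 4], [6, 12, 24]]
print(replacement(m, 1, 0, -1))  # [[6, 2, 4], [-4, 2, 4]]
```

Each function returns a new matrix rather than modifying its input, which makes it easier to compare the matrix before and after an operation.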
Let us re-approach the same linear system once more but this time using matrices and applying row operations.
Notice how I’ve used the exact same operations and scale factors as in the linear equations method. Unsurprisingly, we wind up with the same equations from before. Something else to make note of is the triangular formation in the bottom left corner of the final matrix. It makes sense for this pattern to emerge because the 0s are markers of an eliminated variable and each variable eliminated brings us closer to identifying an equation we can solve easily for; this in turn makes progress in solving the system as a whole. We’ll revisit this occurrence and I will provide a more formal definition for it in the next chapter.
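The triangular pattern of zeros can also be produced programmatically. The following is a minimal sketch, not the exact sequence of operations used above: it assumes a square system with a unique solution, uses the augmented matrix of 6x + 2y = 4, 2x + 4y = 8 as its example, and the function names `to_triangular` and `back_substitute` are hypothetical.

```python
from fractions import Fraction

def to_triangular(augmented):
    """Use row operations to zero out every entry below the diagonal.

    Assumes a square system with a unique solution.
    """
    m = [[Fraction(v) for v in row] for row in augmented]
    n = len(m)
    for col in range(n):
        # Interchange: move a row with a non-zero entry into the pivot position.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            # Replacement: subtract a multiple of the pivot row
            # to eliminate this column's variable from row r.
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return m

def back_substitute(triangular):
    """Solve from the last row upward, reusing values already found."""
    n = len(triangular)
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(triangular[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (triangular[i][n] - known) / triangular[i][i]
    return solution

triangular = to_triangular([[6, 2, 4], [2, 4, 8]])
solution = back_substitute(triangular)
print(triangular[1][0])              # 0
print([int(s) for s in solution])    # [0, 2]
```

The zero in the bottom-left corner marks the eliminated variable, and back substitution recovers the solution (0, 2).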
In this chapter, we learned:
- Linear equations: an equation with one or more variables where the degree of the equation must be equal to 1.
- Systems of linear equations: a collection of linear equations.
- Solutions to a system of one or more linear equations: a linear system either has no solutions, a unique solution or infinite solutions.
- Matrix Notation: rectangular array which is used as a condensed way to represent a linear system.
- Row Operations: replacement, interchange, and scaling operations allow us to transform a matrix into one in which enough unknown variables have been eliminated to solve the system.
- Solving a linear system: a systematic way to find a) whether solutions exist for a given linear system and b) if solution(s) exist, what their exact values are.
*Unless otherwise noted, all images by author of the article.
*As a small aside: the word quadratic comes from quadratus, the past participle of the Latin word quadrare, meaning “to make square,” which pays homage to its degree! [src]
*ℝ² is the set of all possible ordered pairs (x, y) of real numbers; it is represented by a two-dimensional plane. Because the set of real numbers is uncountably infinite, ℝ² is infinite as well.
*Citation for row operations [src]