Basic Concepts in Linear Algebra

Linear Algebra

I introduce vector spaces, linear combinations, and basis vectors, with a focus on the intuition behind these concepts.

Published

September 19, 2025

Vector Spaces, Vectors, and Matrices

In mathematics, an algebraic structure is an abstraction consisting of (i) a set of elements, (ii) operations that manipulate those elements, and (iii) axioms that the operations must satisfy. The power of this abstraction is that once the core properties of the structure are formalized in general, they can be applied to any specific system — mathematical or real-world — that shares the same structure. For example, the field \(F\) is an algebraic structure consisting of elements called scalars with operations of addition and multiplication that satisfy six axioms. An ubiquitous field is the set of real numbers \(\mathbb{R}\).

Linear algebra is the study of vector spaces \(V\), which is an algebraic structure defined in the context of a field. The elements in a vector space are called vectors. For any two vectors \(\boldsymbol u,\boldsymbol v \in V\), the operation of vector addition creates a third vector \(\boldsymbol u + \boldsymbol v \in V\); this is known as closure under vector addition. For any scalar \(c \in F\) and vector \(\boldsymbol u \in V\), the operation of scalar multiplication creates another vector \(c \boldsymbol u \in V\); this is known as closure under scalar multiplication. The 8 axioms that govern these two operations are listed here.

Any sets of elements equipped with vector addition and scalar multiplication that satisfy the closure property and the 8 axioms is considered a vector space. Of particular interest are \(n\)-tuples of the form

\[ \boldsymbol u = (u_1, u_2, \ldots, u_n), \]

where the components \(u_1, \ldots, u_n\) are scalars from a field \(F\). The set of all such \(n\)-tuples is denoted by \(F^n\).¹ For example, \(\mathbb{R}^3\) is the set of all 3-tuples of real numbers. Here, vector addition is defined as the component-wise operation:

\[ \begin{aligned} \boldsymbol u = (u_1, u_2, &\ldots, u_n) \in F^n \quad\text{and}\quad \boldsymbol v = (v_1, v_2, \ldots, v_n) \in F^n \\ \\ \boldsymbol u + \boldsymbol v &= (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n) \in F^n. \end{aligned} \]

Similarly, scalar multiplication is defined as the component-wise operation:

\[ \begin{aligned} c \in F \quad&\text{and}\quad \boldsymbol u = (u_1,u_2, \ldots, u_n) \in F^n \\ \\ c \boldsymbol u &= (c u_1, c u_2, \ldots, c u_n) \in F^n. \end{aligned} \]

A natural generalization of \(n\)-tuples is the \(m \times n\) array called the matrix:

\[ A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \end{pmatrix}. \]

The set of all \(m \times n\) matrices with components in a field \(F\) is denoted by \(F^{m \times n}\). Vector addition and scalar multiplication are defined analogously to the component-wise operations for \(n\)-tuples. Specifically, for any two matrices \(A, B \in F^{m \times n}\), vector addition creates a third matrix \(A + B \in F^{m \times n}\) whose components are given by

\[ (A + B)_{ij} = A_{ij} + B_{ij}. \]

For any scalar \(c \in F\), scalar multiplication creates another matrix \(cA \in F^{m \times n}\) where

\[ c A_{ij} = c (A_{ij}). \]

Basis and Dimension

A vector space \(V\) contains infinitely many vectors. We are interested in constructing a smaller set of vectors that captures the entire structure in a much more tractable manner. For example, consider the vector space \(\mathbb{R}^3\) in Figure 1. It seems like any point on the blue lattice structure can be described by how far it extends along the \(x\), \(y\), and \(z\) axes. This section formalizes this idea.

Let \(\mathcal{A} = \{\boldsymbol{u}_1, \ldots, \boldsymbol{u}_n\}\) be some finite subset of vectors in \(V\). One way to formalize the idea of \(\mathcal{A}\) capturing the entire structure of \(V\) is if any vector \(\boldsymbol{v} \in V\) can be expressed as a linear combination of the vectors in \(\mathcal{A}\):

\[ \boldsymbol v = c_1 \boldsymbol u_1 + c_2 \boldsymbol u_2 + \ldots + c_n \boldsymbol u_n = \sum_{i=1}^n c_i \boldsymbol u_i \in V, \]

where \(c_1, \ldots, c_n \in F\). If this is the case, then we say that the set \(\mathcal{A}\) spans the vector space \(V\). An equivalent way to say this is that the set of all possible linear combinations of the vectors in \(\mathcal{A}\) — the span of \(\mathcal{A}\) — is equal to \(V\).

We are also interested in efficiency. That is to say, we want \(\mathcal{A}\) to be as small as possible while still spanning \(V\). We call a set of vectors linearly dependent if one of the vectors can be expressed as a linear combination of the others. More formally, the vectors \(\boldsymbol{u}_1, \ldots, \boldsymbol{u}_n\) in \(\mathcal{A}\) would be linearly dependent if

\[ c_1 \boldsymbol u_1 + c_2 \boldsymbol u_2 + \ldots + c_n \boldsymbol u_n = 0, \]

and not all of the \(c_i\) are zero. Thus, we would like to remove any linearly dependent vectors from \(\mathcal{A}\): such vectors are going to be redundant since they will be a part of the span (i.e. the set of all possible linear combinations) of the other vectors in \(\mathcal{A}\). Formally, we say that the vectors \(\boldsymbol{u}_1, \ldots, \boldsymbol{u}_n\) in \(\mathcal{A}\) are linearly independent if

\[ c_1 \boldsymbol u_1 + c_2 \boldsymbol u_2 + \ldots + c_n \boldsymbol u_n = 0, \]

only if \(c_1 = c_2 = \ldots = c_n = 0\).

Taken together, these two ideas — spanning and linear independence — gives us exactly what we were looking for: a minimal yet complete description of the vector space \(V\). More formally, we call a set of vectors a basis for the vector space \(V\) if it is a linearly independent subset that spans \(V\). The vectors in a basis are called basis vectors.

Subspaces

It is often useful to consider a lower-dimension vector space that still preserves the properties of the original vector space. Formally, for a vector space \(V\), we define the subspace \(W\) to be any vector space that consists of a nonempty subset of the vectors in \(V\) endowed with the same operations of vector addition and scalar multiplication defined on \(V\).

This definition can seem abstract at first, and so it’s valuable to consider a concrete example. Recall that the vector space \(\mathbb{R}^3\) is the set of all 3-tuples of real numbers. This is visualized by the faint blue lattice in Figure 1. Now, consider following subset of \(\mathbb{R}^3\)

\[ W = \{(x,y,0): x, y \in \mathbb{R}\} \subseteq {\mathbb{R^3}}, \]

which is visualized as the turquoise \(x-y\) plane in the figure below.

Code

L <- 6       # half-size of the cube window
step <- 1    # spacing of lattice points

# Lattice points: all 3D triples on a grid
g <- seq(-L, L, by = step)
pts <- expand.grid(x = g, y = g, z = g)

# Plane through the origin (z=0)
Z <- matrix(0, length(g), length(g))

# Define cube edges
edges <- list(
  rbind(c(-L,-L,-L), c( L,-L,-L)),  rbind(c(-L, L,-L), c( L, L,-L)),
  rbind(c(-L,-L, L), c( L,-L, L)),  rbind(c(-L, L, L), c( L, L, L)),
  rbind(c(-L,-L,-L), c(-L, L,-L)),  rbind(c( L,-L,-L), c( L, L,-L)),
  rbind(c(-L,-L, L), c(-L, L, L)),  rbind(c( L,-L, L), c( L, L, L)),
  rbind(c(-L,-L,-L), c(-L,-L, L)),  rbind(c( L,-L,-L), c( L,-L, L)),
  rbind(c(-L, L,-L), c(-L, L, L)),  rbind(c( L, L,-L), c( L, L, L))
)

p <- plot_ly()

# (1) Lattice points (all possible 3-tuples)
p <- add_markers(
  p, data = pts, x = ~x, y = ~y, z = ~z,
  opacity = 0.12, marker = list(size = 2),
  showlegend = FALSE, hoverinfo = "skip"
)

# (2) Plane through the origin
p <- add_surface(
  p, x = g, y = g, z = Z,
  opacity = 0.35, showscale = FALSE
)

# (3) Axes
p <- add_trace(p, type = "scatter3d", mode = "lines",
               x = c(-L, L), y = c(0, 0), z = c(0, 0),
               line = list(width = 6), hoverinfo = "skip")
p <- add_trace(p, type = "scatter3d", mode = "lines",
               x = c(0, 0), y = c(-L, L), z = c(0, 0),
               line = list(width = 6), hoverinfo = "skip")
p <- add_trace(p, type = "scatter3d", mode = "lines",
               x = c(0, 0), y = c(0, 0), z = c(-L, L),
               line = list(width = 6), hoverinfo = "skip")

# (4) Wireframe cube (wideframe)
for (e in edges) {
  p <- add_trace(
    p, type = "scatter3d", mode = "lines",
    x = e[,1], y = e[,2], z = e[,3],
    line = list(width = 4, color = "orange"),
    showlegend = FALSE, hoverinfo = "skip"
  )
}

# Layout
p <- layout(
  p,
  scene = list(
    aspectmode = "cube",
    xaxis = list(title = "x", range = c(-L, L), showgrid = FALSE, zeroline = FALSE),
    yaxis = list(title = "y", range = c(-L, L), showgrid = FALSE, zeroline = FALSE),
    zaxis = list(title = "z", range = c(-L, L), showgrid = FALSE, zeroline = FALSE),
    bgcolor = "white"
  ),
  showlegend = FALSE
)

p

Figure 1: A 2D plane through the origin is a subspace of R³.

The subset \(W\) is a subspace of \(\mathbb{R}^3\). To see this, note that for any \(u = (u_1, u_2, 0) \in W\), \(v = (v_1, v_2, 0) \in W\), and \(c \in \mathbb{R}\), the component-wise definitions of vector addition and scalar multiplication for \(\mathbb{R}^3\) are closed in \(W\)

\[ \begin{aligned} u + v = &(u_1 + v_1, u_2 + v_2, 0) \in W \\ \\ c u &= (c u_1, c u_2, 0) \in W, \end{aligned} \] and thus \(W\) is a valid vector space.

Notice that the origin \((0,0,0)\) — the additive identity for \(\mathbb{R}^3\) — is contained in \(W\). This is not a coincidence: every subspace of \(\mathbb{R}^3\) must contain the origin. To see this, consider the \(x-y\) plane shifted up by one unit:

\[ W' = \{(x,y,1): x, y \in \mathbb{R}\} \subseteq {\mathbb{R^3}}. \]

This set does not include the origin, and it fails to be a subspace because the operations of vector addition and scalar multiplication are not closed in \(W'\). Specifically, for any \(u = (u_1, u_2, 1) \in W'\), \(v = (v_1, v_2, 1) \in W'\), and \(c \in \{\mathbb{R} / 1\}\), we have that

\[ \begin{aligned} u + v = &(u_1 + v_1, u_2 + v_2, 2) \notin W' \\ \\ cu &= (c u_1, c u_2, c) \notin W', \end{aligned} \] and so \(W'\) is not a valid vector space. Importantly, this is a general property not limited to \(\mathbb{R}^3\): any subspace must contain the additive identity (also called the zero vector) of the parent vector space.

Footnotes

These tuples are usually called vectors in most applied settings, but to avoid confusion with the more general definition of vectors in a vector space, I refer to them as \(n\)-tuples.↩︎