Despite our fluency in the language of linear algebra, we have yet to truly grasp the essence of what a vector is
Thus far, we have treated vectors as either arrows or lists of numbers or coordinates, but these representations are not inherent to the nature of vectors themselves
The question of what exactly a vector is turns out to be more abstract and less significant than we initially anticipated
Rather than fixating on the question of "what is a vector", it is more fruitful to ask "what can vectors do"
In its most general sense, a vector can be conceptualized as anything that exhibits a meaningful notion of vector addition and scalar multiplication. The ability to combine vectors through addition and scale them by a number forms the fundamental framework for working with vectors in linear algebra
When working with Euclidean vectors, we commonly employ two representations: arrows and columns of numbers. It is crucial to recognize that these representations are merely tools used to aid our understanding
Arrows serve as visual depictions of vectors, offering an intuitive way to grasp their properties. However, as we progress to complex numbers and higher dimensions, the visual representation of arrows becomes less practical and provides limited assistance
Representing vectors as columns of numbers provides a numerical representation that facilitates mathematical manipulations. These numerical representations enable us to easily perform operations and calculations involving vectors, making them a valuable asset in our studies
At the core of linear algebra lie two fundamental operations that allow us to construct new vectors from existing ones
These two operations are vector addition and scalar multiplication
In fact, we can consider all other concepts and ideas in linear algebra as different ways of formalizing and organizing these two fundamental operations
Vector addition and scalar multiplication are so fundamental that we can argue that the entirety of linear algebra originates from them
Consequently, we can extend the definitions and principles of linear algebra to other fields of mathematics as long as they adhere to the rules of vector addition and scalar multiplication
In simpler terms, anything that can undergo these two operations can be considered a vector
To generalize the concepts in linear algebra, the concept of a vector space is introduced
A vector space is defined as a set whose elements, called vectors, can undergo the operations of vector addition and scalar multiplication
In order to determine whether something can be considered a vector, a set of rules, collectively known as the axioms of vector space, is established
These axioms govern the operations of vector addition and scalar multiplication
The goal is to abstract from these operations a set of basic axioms and state that any set of objects that adhere to these axioms form a linear vector space
A linear vector space, denoted as V, is a collection of vectors for which there exists a definite rule for forming vector sums and scalar multiplications
An abstract vector can be denoted as such
∣vector⟩
Hence, vector addition and scalar multiplication are denoted as ∣U⟩+∣W⟩ and λ∣V⟩, respectively
Axioms Regarding Vector Addition
The rules regarding vector addition are as follows
Vector addition is commutative
∣U⟩+∣W⟩=∣W⟩+∣U⟩
Vector addition is associative
∣U⟩+(∣W⟩+∣V⟩)=(∣U⟩+∣W⟩)+∣V⟩
There exists a null vector, ∣0⟩, such that adding it to any arbitrary vector yields the same vector
∣V⟩+∣0⟩=∣V⟩
Given any arbitrary vector, there must also exist an inverse vector in the same vector space such that their addition will give the null vector
∣V⟩+∣−V⟩=∣0⟩
Axioms Regarding Scalar Multiplication
The rules regarding scalar multiplication are as follows
Scalar multiplication is distributive over vector addition
λ(∣U⟩+∣W⟩)=λ∣U⟩+λ∣W⟩
Scalar multiplication is distributive over scalar addition
(γ+δ)∣V⟩=γ∣V⟩+δ∣V⟩
Scalar multiplication is associative
γ(δ∣V⟩)=(γ×δ)∣V⟩
The scalar multiplication of any vector by 1 will always yield the same vector
1∣V⟩=∣V⟩
Axioms Regarding Closure
In addition to the rules governing the two fundamental operations, there are two additional rules that state that the result of any vector addition or scalar multiplication operation within the space will always yield a vector that also belongs to the same vector space
The closure property guarantees that the set of vectors within a space remains consistent and forms a self-contained structure
∣vector⟩ ∈ V means that ∣vector⟩ lies within the vector space V
Adding vectors from the same vector space will yield a new vector in the same vector space
∣U⟩ ∈ V and ∣W⟩ ∈ V ⟹ ∣U⟩ + ∣W⟩ ∈ V
Scaling vectors will yield a new vector in the same vector space
∣V⟩ ∈ V ⟹ λ∣V⟩ ∈ V
With all ten axioms laid out, the refined definition of a vector space is as follows
A vector space, V, is a nonempty set with two operations, vector addition and scalar multiplication, that obeys the ten axioms described above
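To make the abstraction concrete, here is a minimal sketch (using NumPy purely for illustration, not part of the formal development) in which polynomials, stored as coefficient arrays, play the role of vectors; the only structure we rely on is vector addition and scalar multiplication, together with the null vector and additive inverse required by the axioms.

```python
import numpy as np

# Polynomials stored as coefficient arrays behave as vectors: they support
# vector addition and scalar multiplication, which is all the axioms demand.
p = np.array([1.0, -2.0, 3.0])   # represents 1 - 2x + 3x^2
q = np.array([0.0,  4.0, 1.0])   # represents      4x +  x^2

vector_sum = p + q               # 1 + 2x + 4x^2
scaled     = 2.5 * p             # 2.5 - 5x + 7.5x^2

# The null vector and the additive inverse required by the axioms also exist:
null_vector = np.zeros_like(p)
assert np.allclose(p + null_vector, p)       # |V> + |0> = |V>
assert np.allclose(p + (-p), null_vector)    # |V> + |-V> = |0>
```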
Since we have defined abstract vectors using vector addition and scalar multiplication, it is natural to extend our understanding by introducing the concept of linear combination
$$\sum_{j=1}^{n} \lambda_j\,|V_j\rangle$$
The existence of linear combinations implies that sets of abstract vectors should also possess the notions of linear dependence and independence
$$\sum_j \lambda_j |V_j\rangle = |0\rangle \;\;
\begin{cases}
\text{only when all } \lambda_j = 0 & \implies \{|V_1\rangle, |V_2\rangle, |V_3\rangle, \cdots\} \text{ is a set of linearly independent vectors} \\
\text{for some } \lambda_j \neq 0 & \implies \{|V_1\rangle, |V_2\rangle, |V_3\rangle, \cdots\} \text{ is a set of linearly dependent vectors}
\end{cases}$$
Moreover, we define the dimension as the maximum number of linearly independent vectors that the vector space can contain
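As a rough numerical check (an illustrative NumPy sketch, not part of the formal development), linear independence can be tested by stacking candidate vectors as columns and comparing the rank of the resulting matrix with the number of vectors.

```python
import numpy as np

# Stack candidate vectors as columns; the set is linearly independent exactly
# when the rank equals the number of vectors.
vectors = np.column_stack([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [2.0, 1.0, 0.0],   # = first + second, so the set is linearly dependent
])

rank = np.linalg.matrix_rank(vectors)
print(rank == vectors.shape[1])   # False -> linearly dependent
print(rank)                       # 2 -> only two independent directions
```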
The introduction of basis vectors for geometric vectors was motivated by the desire to replace drawings with computational methods. However, the significance of basis vectors becomes even more pronounced when working with abstract vectors
In an n-dimensional space, we have the freedom to choose any set of n linearly independent vectors as the basis set
{∣basis 1⟩,∣basis 2⟩,∣basis 3⟩,⋯,∣basis n⟩}
Any vector in said n-dimensional vector space can be expressed as a linear combination of the basis vectors
$$|V\rangle = \sum_{i=1}^{n} v_i\,|\text{basis } i\rangle$$
An operator, Ω, is a mathematical entity that prescribes how an input vector is transformed into an output vector
$$\Omega|V\rangle = |V_{\text{transformed}}\rangle$$
While the transformation carried out by an operator can become complex, we will focus our attention on linear operators
A linear operator induces a linear transformation, which satisfies two fundamental properties known as additivity and scaling
Additivity: Applying the operator to the sum of vectors yields the same result as adding the transformed vectors individually
Ω(∣U⟩+∣W⟩)=Ω∣U⟩+Ω∣W⟩
Scaling: Applying the operator to a scaled vector is equivalent to scaling the resulting transformed vector
Ω(λ∣V⟩)=λ(Ω∣V⟩)
One highly advantageous property of linear operators is that once we know their action on the basis vectors, we can easily deduce their action on any other vector within the same vector space
Suppose we want to determine the action of Ω on an arbitrary vector, Ω∣V⟩
To understand this, let's express the input vector as a linear combination of the basis vectors
$$\Omega|V\rangle = \Omega\Bigl(\sum_{i=1}^{n} v_i\,|\text{basis } i\rangle\Bigr)$$
By applying the additivity property of linear transformations, we can insert the operator into the summation
$$\Omega|V\rangle = \sum_{i=1}^{n} \Omega\bigl(v_i\,|\text{basis } i\rangle\bigr)$$
Furthermore, using the scaling property of linear transformations, we have
$$\Omega|V\rangle = \sum_{i=1}^{n} v_i\,\Omega|\text{basis } i\rangle$$
Hence, if we know how the basis vectors are transformed, we can determine the transformed representation of any vector in that vector space
$$\Omega|V\rangle = \sum_{i=1}^{n} v_i\,|\text{transformed basis } i\rangle$$
In fact, this expression is what allows us to define matrix-vector multiplication in the first place
Given that we can represent a vector as a column of numbers, it is natural to represent operators as matrices
All linear operators can be represented as matrices, just like how all abstract vectors can be represented as a column of numbers
The matrix representation of a linear operator is basis-dependent, just like how the components of a column vector depend on the basis chosen
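The following sketch (NumPy, purely illustrative, with an arbitrarily chosen map) shows how the matrix of a linear operator is assembled from its action on the basis vectors, and verifies that this matrix reproduces the operator's action on any other vector.

```python
import numpy as np

# A linear operator is fixed by its action on the basis vectors: the columns of
# its matrix representation are the transformed basis vectors.
def omega(v):
    # an arbitrary linear map on R^3, used only as an example
    x, y, z = v
    return np.array([2*x + y, y - z, 3*z])

basis = np.eye(3)                                   # standard basis |basis i>
Omega = np.column_stack([omega(b) for b in basis])  # columns = transformed basis vectors

v = np.array([1.0, -2.0, 0.5])                      # components v_i in this basis
assert np.allclose(Omega @ v, omega(v))             # Omega|V> = sum_i v_i Omega|basis i>
```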
Anything that satisfies all the axioms defined above can be considered a vector space. As a result, we can have vector spaces within vector spaces
We refer to these contained vector spaces as subspaces, S, of the vector space V
A subset of vectors, S, taken from a vector space, V, forms a subspace if it satisfies the following conditions
The subset of vectors must satisfy closure with respect to vector addition and scalar multiplication
The null vector is part of the subset
We only need to verify these two points because the other axioms of a vector space are automatically fulfilled for the subspace S, since they already hold in the larger vector space V
The dimension of a vector space is defined as the maximum number of linearly independent vectors that the vector space can accommodate
$$\text{dimension of } V \equiv \dim V$$
In other words, it represents the size or capacity of the vector space in terms of independent directions or degrees of freedom
When considering subspaces within a larger vector space, it's important to note that subspaces may only contain a subset of vectors from the parent space
This means that a subspace may not include all the basis vectors that span the larger vector space
Consequently, the subspace will have a smaller number of linearly independent vectors compared to the original vector space. As a result, the dimension of the subspace will be smaller than the dimension of the vector space it resides in
This reduction in dimension indicates that the subspace is confined to a more restricted set of vectors within the larger vector space
Note that a vector space is considered its own subspace, which makes it the only case where the subspace shares the same dimension as the vector space
Subspaces are intimately related to dimensions, since we can form smaller subspaces from larger vector spaces simply by removing dimensions
Removing a dimension can be understood as removing a basis vector from the set of linearly independent basis vectors that span the vector space
This removal of a dimension corresponds to plugging in a 0 for the component corresponding to the removed basis vector in all the vectors within the vector space
By removing dimensions or replacing rows with zeros, we can form smaller subspaces within the original vector space
Subspace can be thought of as different combinations of different dimensions within the larger vector space
Each dimension corresponds to a unique direction or degree of freedom that the vectors within that space can span
By combining different combinations of dimensions, we can construct subspaces of varying sizes and configurations within the original vector space
Overlapping Subspaces and Intersections
Vectors can belong to multiple distinct subspaces, and this occurs when these subspaces have overlapping dimensions
When the dimensions of the subspaces overlap, it means that there are shared basis vectors between them
These common basis vectors serve as a bridge between the subspaces, allowing vectors to be expressed as linear combinations that satisfy the requirements of both subspaces
The vectors that belong to multiple subspaces reside in the intersection of those subspaces
The intersection of set U and set W is defined as the set composed of all elements that belong to both sets and is denoted as U ∩ W
Geometrically, this intersection corresponds to the region where the different dimensions of the subspaces overlap
The intersection between subspaces is itself a subspace
The dimensionality of the intersection region corresponds to the number of overlapping dimensions between the subspaces
Interestingly, all subspaces must intersect with one another
This is because the null vector exists in all subspaces, and the origin acts as the common point of intersection for all subspaces
In other words, all subspaces have an intersection of at least dimensionality 0 at the origin
We define the sum of subspaces to be a subset of vectors that contain all combinations of vector sums that can be formed by taking one vector from each subspace
Let's consider two subspaces, U and W, both belonging to the vector space V
$$U + W = \{\, \text{all vectors in } U,\;\; \text{all vectors in } W,\;\; \text{all combinations of } |U_i\rangle + |W_j\rangle \,\}$$
As we can see, the set U+W contains not only the vectors from U and W but also every additional vector sum formed by combining one vector from each subspace
The sum of subspaces can also be understood in terms of combining basis sets
Let's consider two subspaces, U and W, both belonging to the vector space V. Their basis sets are given as follows
basis set of V: {∣basis 1⟩, ∣basis 2⟩, ∣basis 3⟩, ⋯, ∣basis i⟩, ∣basis i+1⟩, ∣basis i+2⟩, ⋯, ∣basis j⟩, ∣basis j+1⟩, ∣basis j+2⟩, ⋯, ∣basis n⟩}
basis set of U: {∣basis i⟩, ∣basis i+1⟩, ∣basis i+2⟩, ⋯}
basis set of W: {∣basis j⟩, ∣basis j+1⟩, ∣basis j+2⟩, ⋯}
Any vector within each subspace can be written as a linear combination of their respective basis set
In other words, the sum of subspaces is obtained by combining the basis vectors from each subspace into a single basis set
basis set of U+W: {∣basis i⟩, ∣basis i+1⟩, ∣basis i+2⟩, ⋯, ∣basis j⟩, ∣basis j+1⟩, ∣basis j+2⟩, ⋯}
Since the number of basis vectors in a subspace is directly related to its dimensionality, summing subspaces is equivalent to combining their respective dimensions
By summing subspaces, we are expanding the possibilities for vector combinations and extending the degree of freedom of vectors within the resulting sum subspace
Direct Sum
The dimension of the sum of subspaces is not necessarily the sum of the dimensions of the individual subspaces
This is because the subspaces may have overlapping dimensions, and those overlapping dimensions should not be counted twice in the sum subspace
As a result, the dimension of the sum subspace will be smaller than the sum of the dimensions of each subspace when the dimensionality of intersection is greater than 0
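A rough numerical illustration (not a proof, using NumPy only for convenience): the dimension of U + W is the rank of the combined spanning set, and the overlap shows up as the difference between dim U + dim W and that rank.

```python
import numpy as np

# dim(U + W) is the rank of the combined spanning set; overlapping directions
# are only counted once, so it can fall short of dim U + dim W.
U = np.column_stack([[1, 0, 0, 0], [0, 1, 0, 0]])   # dim U = 2
W = np.column_stack([[0, 1, 0, 0], [0, 0, 1, 0]])   # dim W = 2, shares one direction with U

dim_U   = np.linalg.matrix_rank(U)
dim_W   = np.linalg.matrix_rank(W)
dim_sum = np.linalg.matrix_rank(np.hstack([U, W]))  # dim(U + W)

print(dim_sum)                      # 3, not 2 + 2 = 4
print(dim_U + dim_W - dim_sum)      # 1 -> dimensionality of the intersection
```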
In the special case where the dimension of the sum of subspaces is indeed equal to the sum of the dimensions of each subspace, we refer to it as a direct sum
Instead of using the "+" symbol, we use the direct sum symbol "⊕" to denote this relationship
U⊕W
The direct sum indicates that the subspaces have non-overlapping dimensions, and the sum subspace encompasses all the dimensions of the individual subspaces without any duplication
There are two equivalent definitions of the direct sum
The sum of subspaces is a direct sum when the only vector common to all subspaces is the null vector
In other words, the intersection of the subspaces is reduced to just the zero vector
This implies that there are no non-zero vectors shared by all the subspaces
The sum of subspaces will be a direct sum when every combination of vectors, one taken from each subspace, produces a new and unique vector
This means that for any vector in the sum subspace, there is a unique way to express it as a linear combination of vectors from each subspace
There are no redundant or overlapping representations of vectors in the sum subspace
We can understand the concept of a direct sum in terms of combining basis sets
Since the subspaces involved in a direct sum do not have overlapping dimensions, they do not share any common basis vectors
The number of basis vectors in a direct sum subspace is simply the sum of the number of basis vectors in each subspace
Every vector space can be written as a direct sum of subspaces
The basis set of a subspace can always be taken to be a subset of a basis of the larger vector space
basis set of V: {∣basis 1⟩, ∣basis 2⟩, ∣basis 3⟩, ⋯, ∣basis s⟩, ∣basis s+1⟩, ∣basis s+2⟩, ⋯, ∣basis s+σ⟩, ⋯, ∣basis n⟩}
basis set of S: {∣basis s⟩, ∣basis s+1⟩, ∣basis s+2⟩, ⋯, ∣basis s+σ⟩}
The remaining subset of basis vectors can form the basis set of another subspace
basis set of Q: {∣basis 1⟩, ∣basis 2⟩, ∣basis 3⟩, ⋯, ∣basis s−1⟩, ∣basis s+σ+1⟩, ⋯, ∣basis n⟩}
Therefore, the larger vector space can be expressed as the direct sum of these two subspaces
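Here is a small sketch of this decomposition (NumPy, with the standard basis of a four-dimensional space chosen purely for illustration): splitting the basis into two disjoint groups yields two subspaces whose direct sum recovers the whole space, and every vector splits uniquely into a piece from each.

```python
import numpy as np

# Split the standard basis of R^4 into two disjoint groups; the two subspaces
# they span form a direct sum that reconstructs the whole space.
basis_S = np.eye(4)[:, :2]        # spans the subspace S (first two basis vectors)
basis_Q = np.eye(4)[:, 2:]        # spans the subspace Q (remaining basis vectors)

v   = np.array([3.0, -1.0, 2.0, 5.0])
v_S = basis_S @ (basis_S.T @ v)   # the piece of v living in S
v_Q = basis_Q @ (basis_Q.T @ v)   # the piece of v living in Q

assert np.allclose(v_S + v_Q, v)                                   # unique decomposition
assert np.linalg.matrix_rank(np.hstack([basis_S, basis_Q])) == 4   # no overlapping dimensions
```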
The concept of an inner product is a generalization of the dot product from arrow vectors to more abstract vector spaces
In the case of dot products between arrow vectors, we use a dot symbol to represent the operation
U•W
However, when it comes to inner products, we use a different notation
⟨U∣W⟩
Within the context of inner product, the vector on the right, ∣W⟩ is referred to as a ket, while the vector on the left, ⟨U∣ , is referred to as a bra. The significance of this notation will become clearer in subsequent discussions
It is important to recognize that not all vector spaces are equipped with an inner product
Vector spaces that do have an inner product defined on them are referred to as inner product spaces
To generalize the dot product into the inner product, we need to develop a deeper understanding of the dot product
Now that we have explored the concept of linear transformation, we can aim to establish a connection between these seemingly different concepts
Although it may initially appear as an unnecessary detour, bridging this gap serves as a significant stepping stone for further progress
The disparity between the two operations becomes apparent when considering their outputs
Dot products yield scalars, while linear transformations yield vectors
To reconcile these ideas, we shall treat one-dimensional vectors as scalars. This leap is not too drastic because one-dimensional vectors inhabit a one-dimensional line, just like scalars exist on the number line
The dot product can be interpreted as a projection transformation between vectors
Specifically, it projects one vector onto the line defined by the other vector
$$\underset{\substack{\text{the vector that defines} \\ \text{the line of projection}}}{U} \;\boldsymbol{\cdot}\; \underset{\substack{\text{the vector that is} \\ \text{being transformed}}}{W}$$
This projection transformation can be expressed using a matrix, where the columns represent the transformed basis vectors
Since the projection is onto a one-dimensional line, the transformed basis vectors are one-dimensional, resulting in a 1×n matrix. Each entry of the matrix is the dot product between the corresponding basis vector and the vector that defines the line of projection, U
$$\begin{bmatrix} (\text{basis } 1 \cdot U) & (\text{basis } 2 \cdot U) & (\text{basis } 3 \cdot U) & \cdots \end{bmatrix}$$
By expressing the vector U as a linear combination of the same set of basis vectors, we can rewrite the dot products
This expansion may seem to have complicated matters, but recall that if the basis set is orthonormal, we can easily eliminate most of the terms, leaving only the corresponding vector component in each dot product
After performing the matrix-vector multiplication, the resulting definition is equivalent to the previous definition of the dot product as the sum of the products of corresponding components
It is important to note that while we have focused on the case of an orthonormal basis set, it is possible to derive a projection matrix for other basis sets as well. However, determining the entries of the projection matrix becomes more challenging in such cases
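The sketch below (NumPy, illustrative) spells out this identification: treating U as a 1×n matrix and multiplying it with the column form of W reproduces the ordinary dot product.

```python
import numpy as np

# The dot product as a 1 x n matrix (the row form of U) acting on the column
# form of W; the result is the same scalar as the ordinary dot product.
U = np.array([1.0, 2.0, 3.0])
W = np.array([4.0, -1.0, 0.5])

as_matrix = U.reshape(1, -1) @ W.reshape(-1, 1)   # 1 x n matrix times n x 1 column
as_dot    = np.dot(U, W)

assert np.allclose(as_matrix.item(), as_dot)      # same number either way
```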
At this stage, we still don't know how to compute the inner product, but we have a sense of its expected behavior and the defining features it should possess
To make progress, we need to reconsider the fundamental characteristics that define the dot product and find a way to generalize these features to apply them to abstract vectors
By abstracting these defining features, we can naturally uncover a method for computing the inner product while adhering to these principles
Defining Features
Skew-Symmetry
Although we have expanded our discussion beyond arrow vectors, considering the magnitude of the vector as its length remains valuable
We have established that the square root of the dot product of a vector with itself gives the vector's length. We expect a similar concept for the inner product
$$\|V\| = \sqrt{V \cdot V} \quad \xrightarrow{\;\text{generalization}\;} \quad \bigl\|\,|V\rangle\,\bigr\| = \sqrt{\langle V|V\rangle}$$
As length is a real number, the inner product of a vector with itself must also be a real number. Consequently, it must be equal to its own complex conjugate
$$\text{length is a real number} \implies \langle V|V\rangle = \langle V|V\rangle^*$$
Moreover, in a more general context, interchanging the order of the inner product results in the complex conjugate
$$\langle U|W\rangle = \langle W|U\rangle^* \qquad \text{(skew-symmetry)}$$
Linearity and Antilinearity
The interpretation of the dot product as a linear transformation extends to the inner product, where the vector on the right undergoes the transformation
We have established that linear transformations satisfy linearity, which means the vector on the right must obey the properties of additivity and scaling
$$\langle V|\bigl(\gamma|U\rangle + \delta|W\rangle\bigr) = \gamma\langle V|U\rangle + \delta\langle V|W\rangle \qquad \text{(linearity in the ket)}$$
Using the skew-symmetry of inner product and the linearity in ket, we can see that the vector on the left expresses antilinearity
$$\bigl(\gamma\langle U| + \delta\langle W|\bigr)|V\rangle = \gamma^*\langle U|V\rangle + \delta^*\langle W|V\rangle \qquad \text{(antilinearity in the bra)}$$
Computing Inner Product in an Orthonormal Basis
Suppose we want to compute the inner product of two arbitrary vectors
⟨U∣W⟩
To compute this inner product, let us first expand both vectors in the same basis
$$|U\rangle = \sum_i u_i\,|\text{basis } i\rangle, \qquad |W\rangle = \sum_j w_j\,|\text{basis } j\rangle$$
Using the linearity in the ket and the antilinearity in the bra, we can pull the components out of the inner product, leaving a double sum
$$\langle U|W\rangle = \sum_i \sum_j u_i^*\, w_j\,\langle \text{basis } i\,|\,\text{basis } j\rangle$$
To proceed, we need to determine the values of each ⟨basis i∣basis j⟩
These values depend on the chosen basis set, and the only thing we know for sure is that the basis vectors are linearly independent
However, if we use an orthonormal basis, ⟨basis i∣basis j⟩ equals 1 when i = j and 0 otherwise, so only the inner products between identical basis vectors survive
Thus, in an orthonormal basis, the inner product reduces to a single sum over the products of the conjugated components of the first vector and the corresponding components of the second
$$\langle U|W\rangle = \sum_i u_i^*\, w_i$$
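As a small numerical check (NumPy; np.vdot conjugates its first argument, which matches this convention):

```python
import numpy as np

# In an orthonormal basis the inner product is the sum of conjugated components;
# np.vdot conjugates its first argument, matching <U|W>.
U = np.array([1 + 2j, 3 - 1j])
W = np.array([2 - 1j, 0 + 1j])

manual = np.sum(np.conj(U) * W)
assert np.allclose(manual, np.vdot(U, W))                    # <U|W> = sum_i u_i* w_i
assert np.allclose(np.vdot(U, W), np.conj(np.vdot(W, U)))    # skew-symmetry
```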
The process of finding the component of a vector by taking the inner product with the corresponding basis vector, vᵢ = ⟨basis i∣V⟩, can be understood in terms of projection
Just like the dot product, the inner product measures the overlap between vectors
The component along each basis vector is found by determining how much the vector ∣V⟩ overlaps with that basis vector
Transition between Orthonormal Bases
There are infinitely many distinct sets of orthonormal basis vectors to choose from, and we often encounter situations where we need to convert the components of a vector from one orthonormal basis to another
$$|V\rangle = \sum_j v_j^{\text{old}}\,|\text{old orthobasis } j\rangle \quad \xrightarrow{\;\text{changing from one orthonormal basis to another}\;} \quad |V\rangle = \sum_j v_j^{\text{new}}\,|\text{new orthobasis } j\rangle$$
Our goal is to change the basis and express ∣V⟩ in terms of a different orthonormal basis
The process of converting from one orthonormal basis to another boils down to finding the new components of the vector
$$v_i^{\text{new}} = \;?$$
We know that the component of the vector in each orthonormal basis can be obtained by taking the inner product of the vector with each basis vector
$$v_i^{\text{new}} = \langle \text{new orthobasis } i\,|\,V\rangle$$
Expanding the vector ∣V⟩ in terms of its original basis, we get
$$v_i^{\text{new}} = \sum_j \langle \text{new orthobasis } i\,|\,\text{old orthobasis } j\rangle\; v_j^{\text{old}}$$
This can be interpreted as the projection of one orthonormal basis onto another, allowing us to determine how much they overlap or align with each other
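The sketch below (NumPy, a real two-dimensional case so the conjugation is invisible, with a rotated basis chosen purely for illustration) carries out this change of orthonormal basis by taking inner products with the new basis vectors.

```python
import numpy as np

# Change between two orthonormal bases of R^2: the new components are the
# inner products of the vector with the new basis vectors.
old_basis = np.eye(2)                                    # columns: old orthonormal basis
theta = np.pi / 6
new_basis = np.array([[np.cos(theta), -np.sin(theta)],   # columns: rotated orthonormal basis
                      [np.sin(theta),  np.cos(theta)]])

v_old = np.array([2.0, 1.0])              # components in the old basis
v = old_basis @ v_old                     # the vector itself

v_new = new_basis.conj().T @ v            # v_i^new = <new orthobasis i | V>
assert np.allclose(new_basis @ v_new, v)  # same vector, different components
```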
Gram-Schmidt Orthogonalization
Gram-Schmidt Orthogonalization is a procedure used to convert an arbitrary linearly independent basis into an orthonormal basis
It allows us to construct a new set of vectors that are orthogonal to each other and have unit length
The procedure can be summarized as follows
Choose a reference vector from the given basis and rescale it to have unit length. This normalized vector becomes the first "refined" vector in the new basis
For each subsequent vector in the original basis, subtract its projection onto the already "refined" vectors from itself. This ensures that the new vector is orthogonal to the previous "refined" vectors. Then, rescale the resulting vector to have unit length, and it becomes part of the new orthonormal basis
In Gram-Schmidt Orthogonalization, two basic operations are used
To normalize a vector, we divide it by its magnitude
$$|\text{unit vector}\rangle = \frac{|\text{vector}\rangle}{\bigl\|\,|\text{vector}\rangle\,\bigr\|}$$
To orthogonalize a vector, subtract its projections onto the other refined vectors. This eliminates the vector's component in the direction of the refined vectors, ensuring orthogonality
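Putting the two operations together, here is a compact sketch of the procedure (classical Gram-Schmidt in NumPy, assuming the input columns are linearly independent):

```python
import numpy as np

# Classical Gram-Schmidt: orthogonalize each vector against the already refined
# ones, then normalize it to unit length.
def gram_schmidt(columns):
    refined = []
    for v in columns.T:
        v = v.astype(complex)
        for e in refined:
            v = v - np.vdot(e, v) * e          # subtract the projection onto e
        refined.append(v / np.linalg.norm(v))  # rescale to unit length
    return np.column_stack(refined)

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
Q = gram_schmidt(A)
assert np.allclose(Q.conj().T @ Q, np.eye(3))   # the refined columns are orthonormal
```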
Previously, we established a connection between the dot product and linear transformations. This connection not only remains when we generalize the dot product but also provides a deeper understanding of the bra vector
In an orthonormal basis, the dot product can be expressed as a matrix-vector multiplication, where the vector on the left is transposed
Similarly, the inner product can also be written as a matrix-vector multiplication in an orthonormal basis, where the vector on the left is transposed and each entry is complex conjugated
In an orthonormal basis, what the inner product truly does is transpose the first vector, take the complex conjugate of each entry, and then multiply it with the second vector
This combined operation is known as the Hermitian transpose, denoted by H
We have established that vectors exist independently of the basis we choose for them
This means that the same vector can have multiple representations depending on the basis we assign to it. Therefore, we are interested in knowing how we can translate from one representation to another
Previously, we explored a similar topic when discussing the inner product, but we limited ourselves to transitioning from one orthonormal basis to another. Now, we will learn the more general case where no assumption is made about the nature of the bases
Our goal here is to find the new component of the vector in the new basis
$$|V\rangle = \sum_j v_j^{\text{old}}\,|\text{old basis } j\rangle \quad \xrightarrow{\;\text{changing from one basis to another}\;} \quad |V\rangle = \sum_j v_j^{\text{new}}\,|\text{new basis } j\rangle$$
This can be achieved if we know how to express each of the original basis vectors as a linear combination of the new basis set
Now, to translate from one basis to another, we simply have to substitute each of the old basis vectors in terms of the new ones and perform some rearrangements
Suppose the transformation matrix is known in the new basis while the input vector is given in the old basis. The goal is to determine how the linear transformation affects the input vector and to express the output vector in the old basis
To perform the computation, we need to carry out a series of change of basis operations
First, we start with an input vector expressed in the old basis
Input Vector in the {∣old basis 1⟩, ∣old basis 2⟩, ∣old basis 3⟩, ⋯} Basis
$$\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{bmatrix}$$
We then multiply the input vector by the change of basis matrix to bring it into the new basis
Input Vector in the {∣new basis 1⟩, ∣new basis 2⟩, ∣new basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{bmatrix}$$
Now that the input vector is expressed in the same basis as the transformation matrix, we can multiply them to obtain the output vector
Output Vector in the {∣new basis 1⟩, ∣new basis 2⟩, ∣new basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \Omega_{11} & \Omega_{12} & \Omega_{13} & \cdots \\ \Omega_{21} & \Omega_{22} & \Omega_{23} & \cdots \\ \Omega_{31} & \Omega_{32} & \Omega_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{bmatrix}$$
To express the output vector in the old basis, we multiply it by the inverse of the change of basis matrix
Output Vector in the {∣old basis 1⟩, ∣old basis 2⟩, ∣old basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}^{-1} \begin{bmatrix} \Omega_{11} & \Omega_{12} & \Omega_{13} & \cdots \\ \Omega_{21} & \Omega_{22} & \Omega_{23} & \cdots \\ \Omega_{31} & \Omega_{32} & \Omega_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \end{bmatrix}$$
In a deeper interpretation, we can view this computation as obtaining the transformation matrix in the old basis by composing three matrices
Linear Transformation in the {∣old basis 1⟩, ∣old basis 2⟩, ∣old basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}^{-1} \begin{bmatrix} \Omega_{11} & \Omega_{12} & \Omega_{13} & \cdots \\ \Omega_{21} & \Omega_{22} & \Omega_{23} & \cdots \\ \Omega_{31} & \Omega_{32} & \Omega_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix} \begin{bmatrix} \varepsilon_{11} & \varepsilon_{12} & \varepsilon_{13} & \cdots \\ \varepsilon_{21} & \varepsilon_{22} & \varepsilon_{23} & \cdots \\ \varepsilon_{31} & \varepsilon_{32} & \varepsilon_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$
By performing these computations, we are able to shift the perspective of the linear transformation and observe its effects in different bases
The composition of linear operators in the fashion of X⁻¹AX is called a similarity transformation
Suppose B = X⁻¹AX; then we will call B and A conjugate operators
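As a quick numerical illustration (NumPy, with an arbitrarily chosen invertible X), a similarity transformation re-expresses the same operator in another basis, so basis-independent quantities such as the eigenvalues are untouched.

```python
import numpy as np

# Similarity transformation: B = X^{-1} A X expresses the same operator in a
# different basis, where the columns of X are the new basis vectors written in
# the old basis.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # the operator in the old basis
X = np.array([[1.0, 1.0],
              [1.0, 2.0]])          # an invertible change-of-basis matrix

B = np.linalg.inv(X) @ A @ X        # the conjugate operator in the new basis

# Basis-independent quantities, such as the eigenvalues, are unchanged.
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))
```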
To delve deeper into comprehending linear transformations, an effective strategy involves deconstructing the transformation process itself
Rather than envisioning the transformation as a whole for the entire vector space, a productive perspective is to explore how the transformation operates within the subspaces that collectively form the complete vector space
The rationale behind breaking down the transformation lies in simplification, analogous to how we decompose integers into prime factors to facilitate more insightful analysis
As we embark on this journey, a critical query emerges: which of these subspaces possess distinct significance in relation to the transformation process?
The answer is not straightforward, as akin to the countless ways numbers can be expressed as multiples of others, transformations can be dissected in multiple manners
The importance of this deconstruction becomes apparent through its capacity to streamline our analysis, but this only happens when the resulting smaller components behave independently under the transformation
This leads us to a pivotal question: what precisely does it mean for subspaces to remain independent from each other during the transformation?
The solution lies in the fact that, during the transformation process, the basis vectors of each distinct subspace are not combined
In simpler terms, each subspace experiences a transformation limited to scaling alone
The phenomenon of a single eigenvalue being shared by more than one linearly independent eigenvector is called degeneracy
It corresponds to repeated roots of the characteristic polynomial
The immediate consequence of degeneracy is the emergence of an eigenspace
Suppose (eigenvalue i) is the eigenvalue shared by the following eigenvectors, such that
$$\begin{aligned}
\Omega|\text{eigen } i;1\rangle &= (\text{eigenvalue } i)\,|\text{eigen } i;1\rangle \\
\Omega|\text{eigen } i;2\rangle &= (\text{eigenvalue } i)\,|\text{eigen } i;2\rangle \\
\Omega|\text{eigen } i;3\rangle &= (\text{eigenvalue } i)\,|\text{eigen } i;3\rangle \\
&\;\;\vdots \\
\Omega|\text{eigen } i;k\rangle &= (\text{eigenvalue } i)\,|\text{eigen } i;k\rangle
\end{aligned}$$
It follows that any linear combination of these degenerate eigenvectors will also be an eigenvector with the same eigenvalue due to the linearity of the operator
$$\begin{aligned}
\Omega\Bigl(\sum_{j=1}^{k} \lambda_j\,|\text{eigen } i;j\rangle\Bigr) &= \sum_{j=1}^{k} \Omega\bigl(\lambda_j\,|\text{eigen } i;j\rangle\bigr) \\
&= \sum_{j=1}^{k} \lambda_j\,\bigl(\Omega|\text{eigen } i;j\rangle\bigr) \\
&= \sum_{j=1}^{k} \lambda_j\,(\text{eigenvalue } i)\,|\text{eigen } i;j\rangle \\
&= (\text{eigenvalue } i)\Bigl(\sum_{j=1}^{k} \lambda_j\,|\text{eigen } i;j\rangle\Bigr)
\end{aligned}$$
Since these eigenvectors are linearly independent, their linear combinations span a whole k-dimensional subspace, the eigenspace, in which every vector is an eigenvector of Ω with (eigenvalue i) as the eigenvalue
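A small concrete example (NumPy, with a diagonal operator chosen so the degeneracy is visible by inspection): the eigenvalue 2 is shared by a two-dimensional eigenspace, and any linear combination of the degenerate eigenvectors is again an eigenvector with that eigenvalue.

```python
import numpy as np

# This operator scales the x-y plane by 2 and the z axis by 5, so the
# eigenvalue 2 is degenerate with a two-dimensional eigenspace.
Omega = np.diag([2.0, 2.0, 5.0])

eigen_1 = np.array([1.0, 0.0, 0.0])
eigen_2 = np.array([0.0, 1.0, 0.0])

combo = 3.0 * eigen_1 - 1.5 * eigen_2             # any linear combination of the two
assert np.allclose(Omega @ combo, 2.0 * combo)    # still an eigenvector with eigenvalue 2
```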
Diagonalization is a powerful mathematical technique that simplifies the representation of matrices by utilizing eigenvectors
By choosing a basis composed of eigenvectors, we can transform a linear transformation into a diagonal matrix, making computations more straightforward and revealing important properties of the transformation
When a linear transformation possesses linearly independent eigenvectors that span the entire vector space, it is convenient to use these eigenvectors as the basis for representing the transformation
This basis, known as the eigenbasis, allows us to express each eigenvector as a column vector with a single 1 and 0s everywhere else
Notice that in the eigenbasis, all the eigenvalues associated with this linear transformation sit on the diagonal of the matrix and every other entry is zero. We call such matrices diagonal matrices
Diagonalization Process
Although diagonal matrices are desirable, it is unlikely that the chosen basis vectors will coincide with the eigenvectors of the desired linear transformation. We therefore have to diagonalize it by utilizing a change of basis matrix
Suppose we have a square matrix and we know what it and its eigenvectors look like in the same basis
Matrix in the {∣old basis 1⟩, ∣old basis 2⟩, ∣old basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \Omega_{11} & \Omega_{12} & \Omega_{13} & \cdots \\ \Omega_{21} & \Omega_{22} & \Omega_{23} & \cdots \\ \Omega_{31} & \Omega_{32} & \Omega_{33} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{bmatrix}$$
Eigenvectors in the {∣old basis 1⟩, ∣old basis 2⟩, ∣old basis 3⟩, ⋯} Basis
$$\begin{bmatrix} \clubsuit_{11} \\ \clubsuit_{21} \\ \clubsuit_{31} \\ \vdots \end{bmatrix} \quad \begin{bmatrix} \clubsuit_{12} \\ \clubsuit_{22} \\ \clubsuit_{32} \\ \vdots \end{bmatrix} \quad \begin{bmatrix} \clubsuit_{13} \\ \clubsuit_{23} \\ \clubsuit_{33} \\ \vdots \end{bmatrix}$$
If we want to change the representation from the old basis to the eigenbasis, the change of basis matrix's columns will just be the eigenvectors expressed in the old basis
Working with diagonalized matrices simplifies computations significantly
When one of the matrices is diagonal, matrix multiplication simplifies: multiplying by a diagonal matrix on the left scales the rows of the other matrix by the diagonal entries, while multiplying by it on the right scales the columns
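To close, here is a hedged NumPy sketch of the full diagonalization workflow: the columns of P are the eigenvectors, P⁻¹AP is diagonal with the eigenvalues on its diagonal, and repeated application of A becomes cheap because only the diagonal entries need to be raised to a power.

```python
import numpy as np

# Diagonalization: P's columns are the eigenvectors, and P^{-1} A P is the
# matrix of the same transformation expressed in the eigenbasis.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)   # columns of P are the eigenvectors
D = np.linalg.inv(P) @ A @ P

assert np.allclose(D, np.diag(eigenvalues))     # diagonal, with the eigenvalues on the diagonal

# Diagonal matrices make repeated application cheap: A^5 = P D^5 P^{-1}, and
# D^5 only requires raising the diagonal entries to the fifth power.
A_to_5 = P @ np.diag(eigenvalues**5) @ np.linalg.inv(P)
assert np.allclose(A_to_5, np.linalg.matrix_power(A, 5))
```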