In this section we begin the material from Chapters 3 and 4 of Artin.
We have defined the notion of a binary operation on a set
$X$; this is a function from $X\times X$ to $X$. There is
another kind of operation that occurs frequently, where we
combine elements from sets $X$ and $Y$ and obtain an
element of $Y$. The classic example, which is the model
for our definition below, is the multiplication of a
vector by a scalar. In this case, we take a number
$a\in\mathbb{R}$ and a vector $v\in\mathbb{R}^n$, and we obtain a new
vector $av\in\mathbb{R}^n$. An operation of this sort
is a function $X\times Y\to Y$. Artin calls this an
external law of composition. We will call it an
operation of $X$ on $Y$ or an action of
$X$ on $Y$, and we will adopt the convention of writing
the result of applying the function to $(x,y)$ as either
$x.y$ or simply $xy$.
Let $R$ be a ring and let $(M,+)$ be an Abelian
group. We say $M$ is a (left) module over $R$
(or an $R$-module) if there is an operation of $R$
on $M$ satisfying the following conditions for all
$a,b\in R$, $m,n\in M$.
(1) $(ab)m=a(bm)$;
(2) $1m=m$;
(3) $(a+b)m=am+bm$;
(4) $a(m+n)=am+an$.
The first condition is an associative law,
while the final two conditions are distributive laws.
We will give some general examples of modules for rings which are not necessarily fields, but after that we will only consider vector spaces. When R is not a field, the theory of modules is quite a bit more complicated.
Example 7.1
Let's record some expected, and easily proven, properties of modules.
Lemma 7.2
Let $M$ be an $R$-module. Then for all
$a\in R$ and $m\in M$, we have the following.
Lemma 7.3
Let F be a field and let V be a vector space over F.
(2) and (3) are corollaries of (1) -- Exercise.
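We will be interested in a much more general kind of
cancellation. We say $v_1,\dots,v_n\in V$ are
linearly independent if whenever
$a_1v_1+\dots+a_nv_n=0$ for $a_1,\dots,a_n\in F$, we have $a_i=0$ for
each $i$. We say $v_1,\dots,v_n$ are
linearly dependent if they are not linearly
independent, that is, if there exist
$a_1,\dots,a_n\in F$, not all of which are $0$, such that
$a_1v_1+\dots+a_nv_n=0$.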
Lemma 7.3 says that any single
nonzero $v\in V$ is linearly independent. If we consider
the ordinary plane $\mathbb{R}^2$, we see that two vectors $v,w$
are linearly independent iff they do not lie on the same
line through $0$. However, any three vectors
in $\mathbb{R}^2$ are linearly dependent. The reason for this
last claim is that if two vectors $v,w$ in $\mathbb{R}^2$ are
independent, then any vector $u\in\mathbb{R}^2$ can be
written as $u=av+bw$ for some $a,b\in\mathbb{R}$, whence
$av+bw+(-1)u=0$. (If no two of the three vectors are independent,
the three are certainly dependent.) Thus the condition of linear
independence is related to another condition, that of
expressing other vectors as a linear combination of the
given vectors. Our goal in this section is to explore
this link. This will lead us to the notion of a basis and
the dimension of a vector space.
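As an illustration of the definition, here is a minimal sketch that tests linear independence of finitely many vectors in $\mathbb{Q}^n$. It uses row reduction, a technique not developed in this section: the vectors are independent exactly when elimination produces no zero row. The function names `rank` and `is_linearly_independent` are hypothetical, chosen for this example only.

```python
from fractions import Fraction


def rank(vectors):
    """Number of nonzero rows left after Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for c in range(ncols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True iff the only combination of the vectors giving 0 is the trivial one."""
    return rank(vectors) == len(vectors)


v, w, u = (1, 2), (3, 1), (5, 7)
print(is_linearly_independent([v, w]))     # two vectors not on a common line through 0
print(is_linearly_independent([v, w, u]))  # three vectors in the plane
```

Exact rational arithmetic (`Fraction`) is used so that the test for a zero entry is reliable; floating point would need a tolerance.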
Before we give the necessary formal definitions, let us
consider the ideas discussed in the last paragraph. When
we deal with ordinary n-space $\mathbb{R}^n$, it is usually
crucial for us to know we have a set of co-ordinate axes.
For example, in three dimensions, we express points,
functions, and so on, in terms of the co-ordinates
$(x,y,z)$, which in turn are defined in terms of the
x,y,z-axes. Every point has a unique set of co-ordinates
relative to these axes. If $\mathbf{i},\mathbf{j},\mathbf{k}$ are the normal
unit vectors, then the point $(x,y,z)$ corresponds to the
vector $x\mathbf{i}+y\mathbf{j}+z\mathbf{k}$.
A basis of a vector space V can be thought of as the same thing -- a set of vectors that define co-ordinate axes, such that every vector can be written as a unique combination of the basis vectors.
If $v_1,\dots,v_n$ are elements of the vector space $V$
over the field $F$, a linear combination of
$v_1,\dots,v_n$ is a sum of the form
$a_1v_1+\dots+a_nv_n$ for some
$a_1,\dots,a_n\in F$. We will be ambiguous here:
the term linear combination can refer either to the sum or
to the vector that is the result of that sum. Thus we
will refer to $a_1,\dots,a_n$ as the coefficients
in the linear combination, even though different
coefficients could yield the same vector. We leave it to
the reader to resolve the ambiguity in any given
situation.
A trivial linear combination is one in which
every coefficient is $0$. We say $v_1,\dots,v_n$ are
linearly independent if the only linear
combination yielding $0$ is the trivial one. (This is the
same as the definition given earlier in this section.) We
say $v_1,\dots,v_n$ span $V$ if every element of
$V$ can be written as a linear combination of
$v_1,\dots,v_n$. We say $v_1,\dots,v_n$ form a
basis of $V$ if every element of $V$ can be
written uniquely as a linear combination of
$v_1,\dots,v_n$. (This means every $v\in V$ satisfies
$v=a_1v_1+\dots+a_nv_n$ for some
$a_1,\dots,a_n\in F$, and the n-tuple
$(a_1,\dots,a_n)$ is unique.)
Thus a basis is a set of elements of V that can serve as a set of co-ordinate axes.
Example 7.4
Lemma 7.5
Let $v_1,\dots,v_n\in V$. Then $v_1,\dots,v_n$ form a
basis for $V$ if and only if they are linearly independent
and span $V$.
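Conversely, if $v_1,\dots,v_n$ span $V$, any $v\in V$ can
be written $v=a_1v_1+\dots+a_nv_n$ for some scalars
$a_1,\dots,a_n$. If also $v=b_1v_1+\dots+b_nv_n$, then
$(a_1-b_1)v_1+\dots+(a_n-b_n)v_n=0$. Hence if $v_1,\dots,v_n$
are linearly independent, we conclude that $a_i=b_i$ for
each $i$. This proves $v_1,\dots,v_n$ form a basis.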
Remark 7.6 (2)
Above we have applied the terms linearly
independent, span, basis to a group of objects
$v_1,\dots,v_n$ and so we have used plural language
(``are'', ``span'', ``form''). We frequently think of
$v_1,\dots,v_n$ as the set $X=\{v_1,\dots,v_n\}$, in which
case we use the singular: $X$ is linearly independent, $X$
spans $V$, $X$ is a basis. In what follows we will mix
these modes of usage.
There is still another way in which we regard a basis. If
our goal is to put a co-ordinate system on $V$, then we
will presumably associate the n-tuple $(a_1,\dots,a_n)$
to the vector $a_1v_1+\dots+a_nv_n$. This implies an
ordering. Thus when we wish to use explicit co-ordinates,
we need to speak of an ordered basis, which is an
n-tuple $(v_1,\dots,v_n)$. As usual, we generally leave
it to the reader to figure out what we are talking about
at any given moment!
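To make the co-ordinate idea concrete, the following sketch computes the n-tuple $(a_1,\dots,a_n)$ attached to a vector relative to an ordered basis of $\mathbb{Q}^n$. It solves the linear system by Gauss-Jordan elimination, which is an assumption of this example rather than a method from the text, and the name `coordinates` is hypothetical.

```python
from fractions import Fraction


def coordinates(basis, v):
    """Return (a_1, ..., a_n) with v = a_1*basis[0] + ... + a_n*basis[n-1].

    Assumes `basis` is an ordered basis of Q^n (n vectors of length n);
    raises ValueError if the vectors turn out to be linearly dependent.
    """
    n = len(basis)
    if len(v) != n or any(len(b) != n for b in basis):
        raise ValueError("expected n vectors of length n")
    # Augmented matrix [B | v], where column j of B is basis[j].
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if m[i][c] != 0), None)
        if pivot is None:
            raise ValueError("vectors are linearly dependent, so not a basis")
        m[c], m[pivot] = m[pivot], m[c]
        lead = m[c][c]
        m[c] = [x / lead for x in m[c]]
        # Clear column c in every other row.
        for i in range(n):
            if i != c and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return tuple(row[n] for row in m)


# Relative to the ordered basis ((1,1), (1,-1)), the vector (3,1) is 2*(1,1) + 1*(1,-1).
print(coordinates([(1, 1), (1, -1)], (3, 1)))
```

Reordering the basis permutes the co-ordinates in the same way, which is why an ordering has to be fixed before co-ordinates make sense.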
Once we decide to apply the terms ``linearly independent'',
``span'', and ``basis'' to sets, it becomes natural to
allow infinite sets, and hence to modify the definitions
slightly. Thus if $X$ is a set, a linear
combination of elements of $X$ is a sum (or its result)
$a_1x_1+\dots+a_nx_n$ for some finite collection
$x_1,\dots,x_n$ of distinct elements of $X$ and some
$a_1,\dots,a_n\in F$. Linear independence, spanning, and
basis are defined solely in terms of such finite linear
combinations.
Our goal for the rest of this section is straightforward. We wish to show that every vector space has a basis, and that any two bases have the same number of elements. This common number will be called the dimension of the vector space.
In pursuit of this goal, it is convenient to introduce other notions, which fortuitously are natural and useful in their own right.
If $W\subseteq V$, we say $W$ is a subspace of $V$ if
it is a subgroup under + (i.e., is closed under + and
- and contains $0$) and it is closed under scalar
multiplication, i.e., $a\in F$, $w\in W$ implies
$aw\in W$.
Lemma 7.7
Let $W\subseteq V$. Then $W$ is a subspace of $V$ if and only
if (a) $0\in W$; (b) If $v,w\in W$, then
$v+w\in W$; and (c) If $a\in F$ and $w\in W$, then
$aw\in W$.
In general, a subspace of the n-dimensional vector space
$\mathbb{R}^n$ is a flat or linear space, containing the origin,
of dimension at most $n$. In fact, let us define $X\subseteq\mathbb{R}^n$
to be a linear subset if $X$ contains the entire
line through any two points of $X$. The subspaces of
$\mathbb{R}^n$ are precisely the linear subsets containing the
origin. We will say more about this later.
Let $X\subseteq V$. The span of $X$ is defined to be
the set of all linear combinations of finitely many
elements from $X$. Thus
$\operatorname{Span}(X)=\{a_1x_1+\dots+a_nx_n : n\ge 0,\ x_i\in X,\ a_i\in F\}$.
Lemma 7.8
Let $V$ be a vector space over $F$ and let
$X\subseteq V$. Then $\operatorname{Span}(X)$
is the smallest subspace of $V$ containing $X$.
We leave both as exercises.
If $W$ is a subspace of $V$ and $W=\operatorname{Span}(X)$, we say $X$ spans $W$, or that $W$ is the subspace spanned by $X$. In terms of our vector analogy, the subspace spanned by a set of vectors is the smallest linear space through the origin containing all the vectors.
The following lemma links linear independence and spanning.
Lemma 7.9
Let $V$ be a vector space over $F$, let $X\subseteq V$ be
linearly independent, and let $y\in V$ with $y\notin X$. Then the
following statements are true.
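First, suppose $y\in\operatorname{Span}(X)$. Then we have
$y=a_1x_1+\dots+a_nx_n$ for some
$x_1,\dots,x_n\in X$, $a_1,\dots,a_n\in F$. Thus
$a_1x_1+\dots+a_nx_n+(-1)y=0$ is a nontrivial
linear combination from $X\cup\{y\}$ that yields $0$, so
$X\cup\{y\}$ is linearly dependent.
Next suppose $X\cup\{y\}$ is linearly dependent, i.e.,
suppose that there is a non-trivial linear combination of
elements from $X\cup\{y\}$ that equals $0$. Since $X$ is
linearly independent, this combination must involve $y$ in
a non-trivial way. That is, there must exist
$x_1,\dots,x_n\in X$ and $a_1,\dots,a_n,b\in F$ with $b\ne 0$ such that
$a_1x_1+\dots+a_nx_n+by=0$. We can solve this equation
for $y$ and we find
$y=(-b^{-1}a_1)x_1+\dots+(-b^{-1}a_n)x_n\in\operatorname{Span}(X)$.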
We are now ready to pursue our goal of showing bases exist and the cardinality of a basis of V is uniquely determined by V. We begin with one of the key results.
Theorem 7.10
Let $V$ be a vector space over a field $F$ and let
$X,Y\subseteq V$. If $X$ is linearly independent and $Y$ spans
$V$, then there is a subset $Y'$ of $Y$ such that
$X\cup Y'$ is a basis of $V$.
Choose a subset $Y'$ of $Y$ with the following two
properties: (1) $X'=X\cup Y'$ is linearly independent, and
(2) $Y'$ is the largest subset of $Y$ satisfying condition
(1). Note that $Y'=\emptyset$ satisfies condition (1), so
there are subsets of $Y$ satisfying (1). Since $Y$ is
finite, there is a largest such subset. (It is here that
we have to use more advanced techniques if $Y$ is
infinite.)
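We claim $X'$ is a basis of $V$. It is linearly
independent by definition, so we must show $X'$ spans $V$.
We will first show that $Y\subseteq\operatorname{Span}(X')$.
Let $y\in Y$ and suppose $y\notin\operatorname{Span}(X')$. Then
$y\notin X'$ and by Lemma 7.9, $X'\cup\{y\}$
is linearly independent. But if we set
$Y''=Y'\cup\{y\}$, we have $Y'\subsetneq Y''\subseteq Y$ and
$X\cup Y''$ linearly independent. This contradicts our choice of
$Y'$. It follows that $Y\subseteq\operatorname{Span}(X')$.
Now $\operatorname{Span}(X')$ is a subspace containing $Y$, so by
Lemma 7.8, $\operatorname{Span}(X')$ contains the
subspace $\operatorname{Span}(Y)=V$. Thus $X'$ spans $V$, and we have
proven that $X'$ is a basis of $V$.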
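For finite sets of vectors in $\mathbb{Q}^n$, the choice of $Y'$ in the proof can be imitated by a greedy pass through $Y$. This is only a sketch under stated assumptions: it produces a subset that is maximal (no further element of $Y$ can be added), which is the property the proof actually uses, and it tests independence by row reduction rather than by anything proved above. The names `rank` and `extend_to_basis` are hypothetical.

```python
from fractions import Fraction


def rank(vectors):
    """Number of nonzero rows left after Gaussian elimination over Q."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def extend_to_basis(X, Y):
    """Return a sublist Y' of Y such that X + Y' is linearly independent and
    no further element of Y can be added. Assumes X is linearly independent."""
    chosen = []          # plays the role of Y'
    current = list(X)    # plays the role of X' = X union Y'
    for y in Y:
        # Keep y only if adjoining it preserves linear independence.
        if rank(current + [y]) == len(current) + 1:
            chosen.append(y)
            current.append(y)
    return chosen


X = [(1, 1, 0)]
Y = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # Y spans Q^3
print(extend_to_basis(X, Y))
```

If `Y` spans the space, the argument in the proof shows that `X + extend_to_basis(X, Y)` is a basis. Different orderings of `Y` can give different subsets, all of the same size.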
Corollary 7.11
Let V be a vector space. Then any linearly independent subset of V can be expanded to a basis, and any subset that spans V can be contracted to a basis.
Proof. If $X$ is linearly independent, we can take $Y=V$ and apply Theorem 7.10.
If $Y$ spans $V$, we can take $X=\emptyset$ and apply
Theorem 7.10.
Remark 7.12 (2)
There is one problem with the proof of the preceding corollary, and hence with the proof of the next corollary. In the proof, we used the set V as a spanning set, but V is likely to be infinite. Thus we are forced to confront the ``infinite case'' we tried to avoid. One way to avoid this problem is to prove all results only for finitely spanned vector spaces, that is, vector spaces which have a finite spanning set. (Such vector spaces are precisely the finite dimensional vector spaces.) The reader may either make this restriction or read the appendix to this section, where the ``infinite'' problem is discussed.
Corollary 7.13
Every vector space has a basis.
Proof. Apply Corollary 7.11 either to the linearly independent set $\emptyset$ or to the spanning set $V$.
Problem
Let X be a subset of a vector space V. Show that the following statements are equivalent.
Our next goal is to compare the sizes of bases. This requires a lemma, which is related to Theorem 7.10, but is not quite the same.
Lemma 7.14 (Exchange Lemma)
Let $V$ be a vector space, let $X\subseteq V$ be a linearly
independent set, and let $Y\subseteq V$ span $V$.
The linear independence of X' is all that we will actually need below. However, we claimed we could choose y so that Y' spans V; to do this, we must be a little more careful.
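We can still take any $x\in X\setminus Y$. Since $Y$
spans $V$, we can write $x=a_1y_1+\dots+a_ny_n$
for some $y_1,\dots,y_n\in Y$ and some nonzero
$a_1,\dots,a_n\in F$. Since $X$ is linearly independent, there must be at
least one $y_j$ such that $y_j\notin\operatorname{Span}(X\setminus\{x\})$
(otherwise $x$ itself would lie in $\operatorname{Span}(X\setminus\{x\})$). Set
$y=y_j$ for this $j$, and put $X'=(X\setminus\{x\})\cup\{y\}$ and $Y'=(Y\setminus\{y\})\cup\{x\}$.
Then $X'$ is linearly independent as above. We can write
$y=a_j^{-1}\bigl(x-\sum_{i\ne j}a_iy_i\bigr)$, so
$y\in\operatorname{Span}(Y')$. Obviously for any other
$y'\in Y$, we have $y'\in Y'\subseteq\operatorname{Span}(Y')$.
It follows (as in the proof of
Theorem 7.10) that $Y'$ spans
$V$.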
(2) Exercise.
Theorem 7.15
Let $X,Y$ be subsets of a vector space $V$ and suppose
that $X$ is linearly independent and $Y$ spans $V$. Then
$|X|\le|Y|$.
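Suppose the theorem is false and that $V,X,Y$ give us a
counterexample. Keeping $Y,V$ fixed and changing
$X$ if necessary, we can assume that $|X\cap Y|$ is as
large as possible, that is, if $X'\subseteq V$ is such that
$V,X',Y$ give us a counterexample, then
$|X'\cap Y|\le|X\cap Y|$. (This is possible because all the numbers
involved are no greater than $n=|Y|$.)
If $X\not\subseteq Y$, then by the Exchange Lemma, there are
$x\in X\setminus Y$, $y\in Y\setminus X$
such that $X'=(X\setminus\{x\})\cup\{y\}$ is linearly independent.
Since $|X'|=|X|$, the triple $V,X',Y$ is again a counterexample.
Moreover, $|X'\cap Y|$ is strictly larger
than $|X\cap Y|$. This contradicts our choice of $X$.
Thus we must have $X\subseteq Y$ and so we can conclude that
$|X|\le|Y|$.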
Corollary 7.16
Any two bases of a vector space have the same number of elements. That is, if V is a vector space over a field F and X,Y are bases of V, then |X|=|Y|.
Proof. By Theorem 7.15, we have $|X|\le|Y|$ and $|Y|\le|X|$.
This result can also be proved directly using part (2) of the Exchange Lemma.
We define the dimension of a vector space $V$ to
be the size of any (and hence every) basis of $V$, and we
denote it $\dim V$.
Thus for example,
and
. We have
. (If there are
different fields in use, we sometimes write $\dim_F V$ to
make clear that $V$ is a vector space over $F$.)
Corollary 7.17
Let $V$ be a vector space over a field $F$ and let
$n=\dim V$ be finite.
(2) This proof is similar to the proof of (1).
Note that this proof would fail if $\dim V$ were infinite,
and in that case, the corollary is not true.
The following result is another very useful application.
Corollary 7.18
Let $F$ be a field and let $A$ be an $n\times n$ matrix
with entries in $F$. Then the
following conditions are equivalent.
Recall from Section 1 that (1) holds if and only if the
equation $AX=B$ can be solved for any $B\in F^n$. If
$X=(x_1,\dots,x_n)^t$ and if
$A_i$ is column $i$ of $A$, then
$AX=x_1A_1+\dots+x_nA_n$. Thus
$AX$ is a linear combination of the
columns of $A$, and so the statement that $AX=B$ can
always be solved is equivalent to the statement that the
columns of $A$ span $F^n$. This proves (1) is equivalent
to (4).
A similar proof shows (1) is equivalent to (7), and this completes the proof.
Here is another nice application of bases. In Artin, this result is used instead of the Exchange Lemma to prove Theorem 7.15. We get it as a corollary.
Corollary 7.19
A homogeneous system of m linear equations in n unknowns always has a nonzero solution if n>m.
Put in matrix terms, if $A$ is an $m\times n$ matrix with
$n>m$, then there is an $X\in F^n$ with $AX=0$ but
$X\ne 0$.
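The corollary is an existence statement, but over $\mathbb{Q}$ a nonzero solution can be found explicitly. The sketch below assumes Gauss-Jordan elimination (not the basis argument used to prove the corollary): with more unknowns than equations some column has no pivot, and setting that free unknown to 1 forces the values of the pivot unknowns. The name `nonzero_solution` is hypothetical.

```python
from fractions import Fraction


def nonzero_solution(A):
    """Return a nonzero X with A X = 0, where A is an m x n matrix
    (a non-empty list of rows) with n > m."""
    m, n = len(A), len(A[0])
    if n <= m:
        raise ValueError("need more unknowns than equations")
    rows = [[Fraction(x) for x in row] for row in A]
    pivots = []   # pivots[k] is the pivot column of row k
    r = 0
    for c in range(n):
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    # At most m pivot columns exist and n > m, so some column is free.
    free = next(c for c in range(n) if c not in pivots)
    X = [Fraction(0)] * n
    X[free] = Fraction(1)
    for k, c in enumerate(pivots):
        X[c] = -rows[k][free]
    return X


A = [[1, 2, 3], [4, 5, 6]]   # 2 equations, 3 unknowns
X = nonzero_solution(A)
print(X, [sum(a * x for a, x in zip(row, X)) for row in A])
```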
APPENDIX: Infinite-dimensional Vector Spaces
At two points in this section we made the assumption that spanning sets or bases were finite. In this appendix we will briefly discuss the general case.
The first place where the finiteness assumption was used
was in the proof of Theorem 7.10. We had a linearly independent set
$X$ and a spanning set $Y$, and we needed the existence of a
largest subset $Y'$ of $Y$ such that $X\cup Y'$ remained
linearly independent. In the proof what we needed for
``largest'' was that if $y\in Y\setminus Y'$, then
$X\cup Y'\cup\{y\}$ is linearly dependent. We usually express this by
saying that $Y'$ is maximal with respect to
the property that $X\cup Y'$ is linearly independent. When
$Y$ is finite, we know such maximal sets exist because we
can take a subset $Y'$ satisfying this property that has
as many elements as possible. When $Y$ is infinite,
however, there may be larger and larger subsets in a
never-ending chain.
Instead, we have to appeal to a fundamental principle of ``infinite'' mathematics, Zorn's Lemma. This lemma asserts the existence of objects without giving any means of constructing them, and so it is viewed with disfavor by some. If one is willing to use it, however, it is extremely powerful. (Indeed, many results cannot be proven without Zorn's Lemma.) We will state it below but not prove it. The proof involves the Axiom of Choice and some form of transfinite induction -- in fact, Zorn's Lemma is equivalent to the Axiom of Choice.
We first state a special version for sets, and then discuss the more general form.
Lemma (1)
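Let $\mathcal{S}$ be a non-empty collection of sets. A non-empty
subset $\mathcal{C}$ of $\mathcal{S}$ is said to be a chain if
for any $A,B\in\mathcal{C}$, either $A\subseteq B$ or
$B\subseteq A$. Suppose that whenever $\mathcal{C}$ is a chain in
$\mathcal{S}$, the union $\bigcup_{C\in\mathcal{C}}C$
is an element of $\mathcal{S}$. Then $\mathcal{S}$
contains a maximal element.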
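To apply this to Theorem 7.10, take $\mathcal{S}$ to be the collection of
all linearly independent sets $Z$ with $X\subseteq Z\subseteq X\cup Y$;
it is non-empty because it contains $X$.
Let $\mathcal{C}$ be a chain in $\mathcal{S}$, and put
$Z=\bigcup_{C\in\mathcal{C}}C$. Clearly
$X\subseteq Z\subseteq X\cup Y$; we must show $Z$ is linearly
independent. Suppose not: then there are
$x_1,\dots,x_n\in Z$, such that some
non-trivial linear combination of all these elements is
$0$. Each $x_i\in C_i$ for some $C_i\in\mathcal{C}$. Since
$\mathcal{C}$ is a chain, there is an index $j$,
$1\le j\le n$, such that $C_i\subseteq C_j$ for all
$i$. The elements $x_1,\dots,x_n$ all lie in $C_j$, and since
$C_j\in\mathcal{S}$, these elements must be
linearly independent. This contradicts the choice of these
elements and shows that $Z$ must be linearly
independent after all. Thus $Z\in\mathcal{S}$, as required.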
The only properties of sets that occur in Zorn's Lemma
above are their properties relative to the partial order
$\subseteq$. Thus it is reasonable to try to formulate Zorn's
Lemma in a more general setting, and it turns out that it
is both true and useful.
Let $\le$ be a partial order on a set $S$. We say
$C\subseteq S$ is a chain if $C$ is linearly ordered under
$\le$, that is, if for any $x,y\in C$, either
$x\le y$ or $y\le x$. We say $s\in S$ is an upper bound for
$C\subseteq S$ if $x\le s$ for every $x\in C$. Finally, we say
$x\in S$ is a maximal element (or simply
maximal) if $x\le s$ for $s\in S$ implies $s=x$.
(Note that we do not require $s\le x$ for all $s\in S$. We
are not assuming $S$ is totally ordered. In particular,
$S$ may have many maximal elements -- or it may have none
at all.)
If we take for $S$ our collection of sets $\mathcal{S}$ and we
take for $\le$ the inclusion relation $\subseteq$, then a chain
in $\mathcal{S}$ relative to $\subseteq$ is precisely a chain in
$\mathcal{S}$ in the sense we defined above. Moreover,
$\bigcup_{C\in\mathcal{C}}C$ is an upper bound for any
chain $\mathcal{C}$. Our hypothesis
in the special Zorn's Lemma for sets was that this union is
in $\mathcal{S}$ for any chain $\mathcal{C}$, and hence every chain in
$\mathcal{S}$ has an upper bound in $\mathcal{S}$. This is a special case
of the general form of Zorn's Lemma.
Lemma (General Zorn's Lemma)
Let $\le$ be a partial order on a non-empty set $S$ and
suppose that every chain in $S$ has an upper bound. Then
$S$ has a maximal element.
By the special version of Zorn's Lemma for sets, it follows
that the collection of all chains in $S$ contains a maximal element $C$, that is, a
chain $C$ that cannot be added onto. By our hypothesis for
the general Zorn's Lemma, this chain $C$ has an upper bound
$x$. We claim $x$ is a maximal element in $S$.
If this is false, there is an element $y\in S$ with
$x<y$. But then $s<y$ for every $s\in C$, so
$C\cup\{y\}$ is a chain that properly contains $C$. This is
impossible, and so $x$ must be maximal.
The other place we appealed to finiteness was in our proof
of Theorem 7.15. We
were given $X,Y\subseteq V$ where $X$ is linearly independent
and $Y$ spans $V$ and we wished to show $|X|\le|Y|$. We
showed this under the hypothesis that $Y$ is finite, using
the Exchange Lemma, by fixing $V,Y$ and taking a
counterexample $V,X,Y$ with $|X\cap Y|$ maximal.
By Zorn's Lemma (verify!) we can find a maximal $X$ such
that $V,X,Y$ is a counterexample, but can we find one with
$|X\cap Y|$ maximal? It is not clear how to order our
counterexamples $X$ so that $|X\cap Y|$ is maximized. For
example, it seems quite possible to have counterexamples
X,X' with
but
(if
there are any counterexamples at all!).
We can instead solve this particular problem by a counting
argument. We need two facts about infinite sets. If $S$
is a set, let $\mathcal{F}(S)$ denote the set of all finite
subsets of $S$. The first fact we need is that if $S$ is
infinite, then $|\mathcal{F}(S)|=|S|$.
Suppose $f:S\to T$. The second fact we need is that if
$S$ is infinite and $|S|>|T|$, then there exists a
$t\in T$ such that $f^{-1}(t)$ is
infinite.
Both of these facts are consequences of the fact that for
non-empty sets $A,B$, if at least one of them is infinite,
then $|A\times B|=\max(|A|,|B|)$. (If $\alpha,\beta$ are
infinite cardinals, then $\alpha\beta=\max(\alpha,\beta)$.) This
implies, for example, that if $\alpha$ is an infinite
cardinal, a countable union of sets of cardinality $\alpha$
has cardinality $\alpha$.
Assume these two facts and assume $Y$ is infinite. Since
$Y$ spans $V$, for every $x\in X$, there is a finite subset
$Z=\{y_1,\dots,y_n\}\subseteq Y$ such that
$x=a_1y_1+\dots+a_ny_n$ for some scalars
$a_1,\dots,a_n$. For each $x$, pick a particular set $Z$ -- or
shrink $Y$ to a basis in which case $Z$ is unique if we
assume each $a_i\ne 0$ -- and define a function
$f:X\to\mathcal{F}(Y)$ by letting $f(x)$ be the chosen set $Z$.
Since $Y$ is infinite, our first fact above tells us
$|\mathcal{F}(Y)|=|Y|$. If $|X|>|Y|$, our second fact tells us
there is a finite $Z\subseteq Y$ and an infinite $X'\subseteq X$
with $f(x)=Z$ for all $x\in X'$. Thus each element of
$X'$ lies in the finite dimensional subspace of $V$
spanned by $Z$. Since $X'\subseteq X$, we know $X'$ is
linearly independent. Thus we have an infinite linearly
independent set contained in a vector space of finite
dimension. We proved this is impossible in the finite
case of Theorem 7.15.