\chapter{Several Variables and Partial Derivatives} \label{pd:chapter}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Vector spaces, linear mappings, and convexity}
\label{sec:vectorspaces}
%mbxINTROSUBSECTION
\sectionnotes{3 lectures}
\subsection{Vector spaces}
The euclidean space $\R^n$ has already made an appearance in the metric
space chapter. In this chapter, we extend the differential calculus
we created for one variable to several variables. The key idea in
differential calculus is to approximate differentiable functions by linear
functions (approximating the graph by a straight line).
In several variables, we must introduce a little bit of linear
algebra before we can move on.
We start with vector spaces and linear mappings of vector spaces.
While it is common to use $\vec{v}$ or the bold
$\mathbf{v}$ for elements of $\R^n$,
especially in the applied sciences,
we use just plain old $v$, which is common in mathematics.
That is, $v \in \R^n$ is a \emph{\myindex{vector}}, which means
$v = (v_1,v_2,\ldots,v_n)$\glsadd{not:vector} is an $n$-tuple of
real numbers.\footnote{Subscripts are used for many purposes,
so sometimes we may have several vectors that may also
be identified by subscript, such as a finite or infinite sequence of
vectors $y_1,y_2,\ldots$.}
It is common to write and treat vectors as
\emph{\myindex{column vectors}}, that is, $n$-by-$1$ matrices:
\glsadd{not:vectorcol}
\begin{equation*}
v =
(v_1,v_2,\ldots,v_n) =
{ \left[
\begin{smallmatrix}
v_1 \\ v_2 \\ \vdots \\ v_n
\end{smallmatrix}
\right]
}
\end{equation*}
We do so when convenient.
We call real numbers
\emph{\myindex{scalars}} to distinguish them from vectors.
We often think of vectors as a direction and a magnitude and draw the
vector as an arrow. The vector $(v_1,v_2,\ldots,v_n)$ is represented
by an arrow from the origin to the point $(v_1,v_2,\ldots,v_n)$.
When we think of vectors as arrows, they are
not necessarily based at the origin; a vector is simply the direction and
the magnitude, and it does not know where it starts.
There is a natural algebraic structure when thinking of vectors as arrows.
We can add vectors as arrows by following one vector and then the other.
And we can take scalar multiples of vectors as arrows by rescaling
the magnitude. See \figureref{fig:vecasarrow}.
\begin{myfigureht}
\subimport*{figures/}{vecasarrow.pdf_t}
\caption{Vector as an arrow in $\R^2$, and the meaning of addition and
scalar multiplication.\label{fig:vecasarrow}}
\end{myfigureht}
Each vector also represents a point in $\R^n$. Usually, we think of $v \in \R^n$
as a point if we are thinking of $\R^n$ as a metric space, and we think of
it as an arrow if we think of the so-called \emph{vector space structure} on
$\R^n$ (addition and scalar multiplication).
Let us define the abstract notion of a vector space, as there
are many vector spaces other than just $\R^n$.
\begin{defn}
Let $X$ be a set together with
the operations of addition, $+ \colon X \times X \to X$,
and multiplication, $\cdot \colon \R \times X \to X$ (we usually write $ax$ instead of $a
\cdot x$). The set $X$ is called a \emph{\myindex{vector space}} (or a
\emph{\myindex{real vector space}})
if the following conditions are satisfied:
\begin{enumerate}[(i)]
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Addition is associative)}
%mbxENDIGNORE
%mbxlatex (Addition is associative) \quad
If $u, v, w \in X$, then $u+(v+w) = (u+v)+w$.
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Addition is commutative)}
%mbxENDIGNORE
%mbxlatex (Addition is commutative) \quad
If $u, v \in X$, then $u+v = v+u$.
\item \label{vecspacedefn:addidentity}
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Additive identity)}
%mbxENDIGNORE
%mbxlatex (Additive identity) \quad
There is a $0 \in X$ such that $v+0=v$ for all $v \in X$.
\item \label{vecspacedefn:addinverse}
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Additive inverse)}
%mbxENDIGNORE
%mbxlatex (Additive inverse) \quad
For each $v \in X$, there is a $-v \in X$,
such that $v+(-v)=0$.
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Distributive law)}
%mbxENDIGNORE
%mbxlatex (Distributive law) \quad
If $a \in \R$, $u,v \in X$, then
$a(u+v) = au+av$.
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Distributive law)}
%mbxENDIGNORE
%mbxlatex (Distributive law) \quad
If $a,b \in \R$, $v \in X$, then
$(a+b)v = av+bv$.
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Multiplication is associative)}
%mbxENDIGNORE
%mbxlatex (Multiplication is associative) \quad
If $a,b \in \R$, $v \in X$, then
$(ab)v = a(bv)$.
\item
%mbxSTARTIGNORE
\makebox[2.15in][l]{(Multiplicative identity)}
%mbxENDIGNORE
%mbxlatex (Multiplicative identity) \quad
$1v = v$ for all $v \in X$.
\end{enumerate}
Elements of a vector space are usually called \emph{vectors},
even if they
are not elements of $\R^n$ (vectors in the \myquote{traditional} sense).
%
If $Y \subset X$ is a subset that is a vector space itself using the same
operations, then $Y$ is called a \emph{\myindex{subspace}} or
a \emph{\myindex{vector subspace}} of $X$.
\end{defn}
Multiplication by scalars works as one would expect.
For example, $2v = (1+1)v = 1v + 1v = v+v$, similarly $3v = v+v+v$, and
so on.
One particular fact we often use is that $0 v = 0$, where
the zero on the left is $0 \in \R$ and the zero on the right is $0 \in X$.
To see this, start with
$0v = (0+0)v = 0v + 0v$, and add $-(0v)$ to both sides
to obtain $0 = 0v$. Similarly, $-v = (-1)v$, which follows by
$(-1)v+v = (-1)v + 1v = (-1+1)v = 0v = 0$.
Such algebraic facts which follow quickly from the definition will be taken
for granted from now on.
\begin{example}
The set $\R^n$ is a vector space, where addition
and multiplication by a scalar are done componentwise:
If $a \in \R$, $v = (v_1,v_2,\ldots,v_n) \in \R^n$, and $w =
(w_1,w_2,\ldots,w_n) \in \R^n$, then
\begin{align*}
& v+w \coloneqq
(v_1,v_2,\ldots,v_n) +
(w_1,w_2,\ldots,w_n)
=
(v_1+w_1,v_2+w_2,\ldots,v_n+w_n) , \\
& a v \coloneqq
a (v_1,v_2,\ldots,v_n) =
(a v_1, a v_2,\ldots, a v_n) .
\end{align*}
\end{example}
We will mostly deal with \myquote{finite-dimensional} vector spaces that can
be regarded as subsets of $\R^n$, but other vector spaces are useful in
analysis.
It is better to think of even such simpler
vector spaces via the abstract notion rather than as $\R^n$.
\begin{example}
A trivial example of a vector space is
$X \coloneqq \{ 0 \}$. The operations are defined in the obvious way:
$0 + 0 \coloneqq 0$ and $a0 \coloneqq 0$. A zero vector must always exist,
so all vector spaces are nonempty sets, and this $X$ is the smallest
possible vector space.
\end{example}
\begin{example}
The space $C([0,1],\R)$ of continuous functions on the interval $[0,1]$
is a vector space. For two functions $f$ and $g$ in $C([0,1],\R)$ and
$a \in \R$, we make the obvious definitions of $f+g$ and $af$:
\begin{equation*}
(f+g)(x) \coloneqq f(x) + g(x), \qquad (af) (x) \coloneqq a\bigl(f(x)\bigr) .
\end{equation*}
The 0 is the function that is identically zero. We leave it as an exercise
to check that all the vector space conditions are satisfied.
The space $C^1([0,1],\R)$ of continuously differentiable functions is
a subspace of $C([0,1],\R)$.
\end{example}
\begin{example}
The space of polynomials $c_0 + c_1 t + c_2 t^2 + \cdots + c_m t^m$
(of arbitrary degree~$m$) is a vector space,
denoted by $\R[t]$\glsadd{not:realpoly} (coefficients are real and
the variable is $t$). The operations are defined in the same way as for
functions above.
Suppose there are
two polynomials, one of degree $m$ and one of degree $n$. Assume $n
\geq m$ for simplicity. Then
\begin{multline*}
(c_0 + c_1 t + c_2 t^2 + \cdots + c_m t^m)
+
(d_0 + d_1 t + d_2 t^2 + \cdots + d_n t^n)
= \\
(c_0+d_0) + (c_1+d_1) t + (c_2 + d_2) t^2 + \cdots + (c_m+d_m) t^m
+ d_{m+1} t^{m+1} + \cdots + d_n t^n
\end{multline*}
and
\begin{equation*}
a(c_0 + c_1 t + c_2 t^2 + \cdots + c_m t^m)
=
(ac_0) + (ac_1) t + (ac_2) t^2 + \cdots + (ac_m) t^m .
\end{equation*}
Despite what it looks like, $\R[t]$ is not equivalent to $\R^n$ for any $n$. In
particular, it is not \myquote{finite-dimensional.} We will make this notion
precise in just a little bit. One can make a
finite-dimensional vector subspace by restricting the degree. For example,
if $\sP_n$ is the set of polynomials of degree $n$ or less,
then $\sP_n$ is a finite-dimensional vector space, and we could
identify it with $\R^{n+1}$.
Above, the variable $t$ is really just a formal placeholder.
By setting $t$ equal to a real number, we obtain a function.
So the space $\R[t]$ can be thought of as a subspace of $C(\R,\R)$.
If we restrict the range of $t$ to $[0,1]$, $\R[t]$ can be identified with
a subspace of $C([0,1],\R)$.
\end{example}
\begin{prop}
For $S \subset X$ to be a vector subspace
of a vector space $X$, we only need to check:
%If $X$ is a vector space, for $S \subset X$
%to be a vector subspace, we only need to check
\begin{enumerate}[1)]
\item
$0 \in S$.
\item
$S$ is closed under addition: If $x,y \in S$, then $x+y \in S$.
\item
$S$ is closed under scalar multiplication:
If $x \in S$ and $a \in \R$, then $ax \in S$.
\end{enumerate}
\end{prop}
Items 2) and 3)
ensure that addition and scalar multiplication are indeed defined on~$S$.
Item 1) is required
to fulfill item \ref{vecspacedefn:addidentity} from the definition of vector
space. Existence
of additive inverse $-v$, item \ref{vecspacedefn:addinverse},
follows because $-v = (-1)v$ and item 3) says that
$-v \in S$ if $v \in S$. All other properties are certain equalities
that are already satisfied in $X$ and thus must be satisfied in a subset.
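As a quick illustration of how the proposition is used, consider the line
of slope $2$ through the origin,
\begin{equation*}
S \coloneqq \bigl\{ (x,2x) \in \R^2 : x \in \R \bigr\} .
\end{equation*}
First, $0 = (0,0) \in S$. Second,
$(x,2x) + (y,2y) = \bigl(x+y,\, 2(x+y)\bigr) \in S$. Third,
$a(x,2x) = \bigl(ax,\, 2(ax)\bigr) \in S$ for every $a \in \R$.
Hence $S$ is a vector subspace of $\R^2$.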
\medskip
It is possible to use other fields than $\R$ in the definition
(for example, it is common to use the complex numbers $\C$),
but let us stick with
the real numbers\footnote{If you want a very funky vector space over a different field,
$\R$ itself is a vector space over the field $\Q$.}.
\subsection{Linear combinations and dimension}
\begin{defn}
Suppose $X$ is a vector space,
$x_1, x_2, \ldots, x_m \in X$ are vectors, and
$a_1, a_2, \ldots, a_m \in \R$ are scalars. Then
\begin{equation*}
a_1 x_1 +
a_2 x_2 + \cdots
+ a_m x_m
\end{equation*}
is called a \emph{\myindex{linear combination}} of the vectors $x_1, x_2,
\ldots, x_m$.
For a subset $Y \subset X$, let $\spn(Y)$,\glsadd{not:span}
or the \emph{\myindex{span}} of $Y$,
be the set of all linear combinations of all finite subsets of $Y$.
We say $Y$ \emph{spans} $\spn(Y)$.
By convention, define $\spn(\emptyset) \coloneqq \{ 0 \}$.
\end{defn}
\begin{example}
Let $Y \coloneqq \bigl\{ (1,1) \bigr\} \subset \R^2$. Then
\begin{equation*}
\spn(Y)
=
\bigl\{ (x,x) \in \R^2 : x \in \R \bigr\} .
\end{equation*}
That is, $\spn(Y)$ is the line through the origin and the point $(1,1)$.
\end{example}
\begin{example} \label{example:vecspr2span}
Let $Y \coloneqq \bigl\{ (1,1), (0,1) \bigr\} \subset \R^2$. Then
\begin{equation*}
\spn(Y)
=
\R^2 ,
\end{equation*}
as every point $(x,y) \in \R^2$ can be written as
a linear combination
\begin{equation*}
(x,y) = x (1,1) + (y-x) (0,1) .
\end{equation*}
\end{example}
\begin{example}
Let $Y \coloneqq \{ 1, t, t^2, t^3, \ldots \} \subset \R[t]$, and
$E \coloneqq \{ 1, t^2, t^4, t^6, \ldots \} \subset \R[t]$.
The span of $Y$ is all polynomials,
\begin{equation*}
\spn(Y) = \R[t] .
\end{equation*}
The span of $E$ is the set of polynomials with even powers of $t$ only.
\end{example}
Suppose we have two linear combinations of vectors from $Y$. One linear
combination uses the vectors $\{ x_1,x_2,\ldots,x_m \}$, and the other uses
$\{ \widetilde{x}_1,\widetilde{x}_2,\ldots,\widetilde{x}_n \}$. We can write
their sum using vectors from the union
$\{ x_1,x_2,\ldots,x_m \} \cup
\{ \widetilde{x}_1,\widetilde{x}_2,\ldots,\widetilde{x}_n \}$:
\begin{multline*}
(a_1 x_1 +
a_2 x_2 + \cdots
+ a_m x_m)
+
(b_1 \widetilde{x}_1 +
b_2 \widetilde{x}_2 + \cdots
+ b_n \widetilde{x}_n)
\\
=
a_1 x_1 +
a_2 x_2 + \cdots
+ a_m x_m
+
b_1 \widetilde{x}_1 +
b_2 \widetilde{x}_2 + \cdots
+ b_n \widetilde{x}_n .
\end{multline*}
So the sum is also a linear combination of vectors from~$Y$.
Similarly,
a scalar multiple of a linear combination of vectors from~$Y$
is a linear combination of vectors from~$Y$:
\begin{equation*}
b (a_1 x_1 +
a_2 x_2 + \cdots
+ a_m x_m)
=
b a_1 x_1 +
b a_2 x_2 + \cdots
+ b a_m x_m .
\end{equation*}
Finally, $0 \in \spn(Y)$; if $Y$ is nonempty, $0 = 0 v$ for some $v \in Y$.
We have proved the following proposition.
\begin{prop}
Let $X$ be a vector space and let $Y \subset X$ be a subset.
Then the set $\spn(Y)$ is a vector subspace of $X$.
\end{prop}
Every linear combination of elements in a subspace is an element of that
subspace. So $\spn(Y)$ is the smallest subspace that contains $Y$.
In particular,
if $Y$ is already a vector subspace, then $\spn(Y) = Y$.
\begin{defn}
A set of vectors $\{ x_1, x_2, \ldots, x_m \} \subset X$ is
\emph{\myindex{linearly independent}}\footnote{%
For an infinite set $Y \subset X$, we say $Y$ is linearly
independent if every finite subset of $Y$ is linearly independent
in the sense given.
However, this situation only comes up in infinitely many dimensions.}
if the equation
\begin{equation} \label{eq:lincomb}
a_1 x_1 + a_2 x_2 + \cdots + a_m x_m = 0
\end{equation}
has only the trivial solution $a_1 = a_2 = \cdots = a_m = 0$.
By convention, $\emptyset$ is linearly independent.
A set that is not linearly independent is
\emph{\myindex{linearly dependent}}.
%
A linearly independent set of vectors $B \subset X$ such that
$\spn(B) = X$
is called a \emph{\myindex{basis}} of $X$.
We generally consider the basis as not just a set, but as
an ordered $m$-tuple:
$x_1,x_2,\ldots,x_m$.
Suppose $d$ is the largest integer for which $X$ contains a set of
$d$ linearly independent vectors. We then say $d$ is
the \emph{\myindex{dimension}} of $X$,
and we write $\dim \, X \coloneqq d$.
If $X$ contains a set of $d$ linearly independent vectors
for arbitrarily large $d$, we say
$X$ is infinite-dimensional and write $\dim \, X \coloneqq \infty$.
%
For the trivial vector space $\{ 0 \}$, we define $\dim \, \{ 0 \} \coloneqq 0$.
\end{defn}
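For example, for every $d$, the set
$\{ 1, t, t^2, \ldots, t^{d-1} \} \subset \R[t]$ is linearly independent,
since a polynomial
\begin{equation*}
a_1 + a_2 t + \cdots + a_d t^{d-1}
\end{equation*}
is the zero polynomial only if all its coefficients are zero.
Hence $\dim \, \R[t] = \infty$, making precise the earlier claim that
$\R[t]$ is not finite-dimensional.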
A subset of a linearly independent set is clearly linearly independent.
So if a set contains $d$ linearly independent vectors, it also contains a
set of $m$ linearly independent vectors for all $m \leq d$.
Moreover, if no set of $d+1$ vectors in $X$ is linearly independent,
then no set of more than $d+1$ vectors is linearly independent either.
So $X$ is of dimension $d$ if there
is a set of $d$ linearly independent vectors, but no
set of $d+1$ vectors is linearly independent.
No element of a linearly independent set can be zero,
and a set with one nonzero element is always linearly independent.
In particular, $\{ 0 \}$ is the only vector space of dimension $0$.
Every other vector space has a positive dimension
or is infinite-dimensional.
As the empty set is linearly independent, it is a basis
of $\{ 0 \}$.
As an example, the
set $Y$ of the two vectors in
\exampleref{example:vecspr2span} is a basis of $\R^2$, and so
$\dim \, \R^2 \geq 2$.
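Indeed, $Y = \bigl\{ (1,1), (0,1) \bigr\}$ spans $\R^2$ by that example,
and it is linearly independent: if
\begin{equation*}
a(1,1) + b(0,1) = (a, a+b) = (0,0) ,
\end{equation*}
then $a = 0$, and consequently $b = 0$.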
We will see in a moment that every vector subspace of $\R^n$
has a finite dimension, and that dimension is less than or equal to $n$.
So every set of 3 vectors in $\R^2$ is linearly dependent, and
$\dim \, \R^2 = 2$.
If a set is linearly dependent, then one of the
vectors is a linear combination of the others.
In \eqref{eq:lincomb},
if $a_k \not= 0$, then we solve for $x_k$:
\begin{equation*}
x_k = \frac{-a_1}{a_k} x_1 + \cdots +
\frac{-a_{k-1}}{a_k} x_{k-1} +
\frac{-a_{k+1}}{a_k} x_{k+1} +
\cdots +
\frac{-a_m}{a_k} x_m .
\end{equation*}
The vector $x_k$ then has at least two different representations
as a linear combination of the vectors $\{ x_1,x_2,\ldots,x_m \}$: the one
above, and $x_k$ itself.
For instance, the set $\bigl\{ (0,1), (2,3), (1,0) \bigr\}$ in $\R^2$ is linearly dependent:
\begin{equation*}
3(0,1) - (2,3) + 2(1,0) = 0 ,
\qquad \text{so} \qquad
(2,3) = 3(0,1) + 2(1,0) .
\end{equation*}
\begin{prop}
Suppose a vector space $X$ has basis
$B = \{ x_1, x_2, \ldots, x_n \}$. Then
every $y \in X$ has a unique representation of the form
\begin{equation*}
y = \sum_{k=1}^n a_k \, x_k
\end{equation*}
for some scalars $a_1, a_2, \ldots, a_n$.
\end{prop}
\begin{proof}
As $X$ is the span of $B$,
every $y \in X$ is a linear combination of elements of $B$.
Suppose
\begin{equation*}
y = \sum_{k=1}^n a_k \, x_k = \sum_{k=1}^n b_k \, x_k .
\end{equation*}
Then
\begin{equation*}
\sum_{k=1}^n (a_k-b_k) x_k = 0 .
\end{equation*}
By linear independence of the basis, $a_k = b_k$ for all $k$, and so
the representation is unique.
\end{proof}
For $\R^n$,
we define
the \emph{\myindex{standard basis}} of $\R^n$:\glsadd{not:standardbasis}
\begin{equation*}
e_1 \coloneqq (1,0,0,\ldots,0) , \quad
e_2 \coloneqq (0,1,0,\ldots,0) , \quad \ldots, \quad
e_n \coloneqq (0,0,0,\ldots,1) .
\end{equation*}
We use the same letters $e_k$ for any $\R^n$, and
which space $\R^n$ we are working in is understood from context.
A direct computation shows that $\{ e_1, e_2, \ldots, e_n \}$ really is
a basis of $\R^n$; it spans $\R^n$ and is
linearly independent. In fact,
\begin{equation*}
a = (a_1,a_2,\ldots,a_n) = \sum_{k=1}^n a_k \, e_k .
\end{equation*}
\begin{prop} \label{mv:dimprop}
\pagebreak[2]
Let $X$ be a vector space and $d$ a nonnegative integer.
\begin{enumerate}[(i)]
\item \label{mv:dimprop:i}
If $X$ is spanned by $d$ vectors, then $\dim \, X \leq d$.
\item \label{mv:dimprop:ii}
If $T$ is a linearly independent set and
$v \in X \setminus \spn (T)$, then
$T \cup \{ v \}$ is linearly independent.
\item \label{mv:dimprop:iii}
$\dim \, X = d$ if and only if $X$ has a basis of $d$
vectors.
In particular, $\dim \, \R^n = n$.
\item \label{mv:dimprop:iv}
If $Y \subset X$ is a vector subspace and $\dim \, X = d$,
then $\dim \, Y \leq d$.
\item \label{mv:dimprop:v}
If $\dim \, X = d$ and a set $T$ of $d$ vectors spans $X$,
then $T$ is linearly independent.
\item \label{mv:dimprop:vi}
If $\dim \, X = d$ and a set $T$ of $m$ vectors is
linearly independent, then there is a set $S$ of $d-m$
vectors such that $T \cup S$ is a basis of $X$.
\end{enumerate}
\end{prop}
In particular, the last item says that if $\dim X = d$ and $T$ is
a set of $d$ linearly independent vectors, then $T$ spans $X$.
Another thing to note is that item \ref{mv:dimprop:iii} implies that every
basis of a finite-dimensional vector space has the same number of elements.
\begin{proof}
All statements hold trivially when $d=0$, so assume $d \geq 1$.
We start with \ref{mv:dimprop:i}.
Suppose $S \coloneqq \{ x_1 , x_2, \ldots, x_d \}$ spans $X$, and
$T \coloneqq \{ y_1, y_2, \ldots, y_m \}$ is a linearly independent
subset of $X$. We wish to show that $m \leq d$.
As $S$ spans $X$,
write
\begin{equation*}
y_1 = \sum_{k=1}^d a_{k,1} x_k ,
\end{equation*}
for some numbers $a_{1,1},a_{2,1},\ldots,a_{d,1}$.
One of the
$a_{k,1}$ is nonzero, otherwise $y_1$ would be zero.
Without loss of generality, suppose $a_{1,1} \not= 0$. Solve
\begin{equation*}
x_1 = \frac{1}{a_{1,1}} y_1 - \sum_{k=2}^d \frac{a_{k,1}}{a_{1,1}} x_k .
\end{equation*}
In particular, $\{ y_1 , x_2, \ldots, x_d \}$ spans $X$, since $x_1$ can be
obtained from $\{ y_1 , x_2, \ldots, x_d \}$. Therefore, there are
numbers $a_{1,2},a_{2,2},\ldots,a_{d,2}$ such that
\begin{equation*}
y_2 = a_{1,2} y_1 + \sum_{k=2}^d a_{k,2} x_k .
\end{equation*}
As $T$ is linearly independent---and so $\{ y_1, y_2 \}$ is linearly
independent---one of the $a_{k,2}$
for $k \geq 2$ must be nonzero. Without loss of generality suppose
$a_{2,2} \not= 0$. Solve
\begin{equation*}
x_2 = \frac{1}{a_{2,2}} y_2 - \frac{a_{1,2}}{a_{2,2}} y_1 - \sum_{k=3}^d
\frac{a_{k,2}}{a_{2,2}} x_k .
\avoidbreak
\end{equation*}
In particular,
$\{ y_1 , y_2, x_3, \ldots, x_d \}$ spans $X$.
We continue this procedure. If $m < d$, we are done. Suppose
$m \geq d$.
After $d$ steps, we obtain that
$\{ y_1 , y_2, \ldots, y_d \}$ spans $X$. Any
other vector $v$ in $X$ is a linear combination of
$\{ y_1 , y_2, \ldots, y_d \}$ and hence cannot be in $T$ as $T$ is
linearly independent. So $m = d$.
We continue with \ref{mv:dimprop:ii}.
Suppose
$T = \{x_1,x_2,\ldots,x_m\}$ is linearly independent,
does not span $X$, and $v \in X \setminus \spn (T)$.
Suppose
$a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + a_{m+1} v = 0$
for some scalars $a_1,a_2,\ldots,a_{m+1}$.
If $a_{m+1} \not=0$, then $v$ would be a linear combination of $T$,
so $a_{m+1} = 0$. Then, as $T$ is linearly independent, $a_1=a_2=\cdots=a_m = 0$.
So $T \cup \{ v \}$ is linearly independent.
We move to \ref{mv:dimprop:iii}.
If $\dim \, X = d$,
then there must exist some linearly independent set $T$ of $d$ vectors,
and $T$ must span $X$, otherwise we could choose a larger set of linearly
independent vectors via \ref{mv:dimprop:ii}.
So we have a basis of $d$ vectors.
On the other hand, if we have a basis of $d$ vectors,
the dimension is at least $d$ as a basis is linearly independent.
A basis also spans $X$, and so
by \ref{mv:dimprop:i} we know that dimension is at most $d$.
Hence the dimension of $X$ must equal $d$.
The \myquote{in particular} follows by noting that
$\{ e_1, e_2, \ldots, e_n \}$ is a basis of $\R^n$.
To see \ref{mv:dimprop:iv},
suppose $Y \subset X$ is a vector subspace,
where $\dim \, X = d$. As $X$ cannot contain $d+1$ linearly independent
vectors, neither can $Y$.
For \ref{mv:dimprop:v}, suppose $T$ is a set of $m$ vectors
that is linearly dependent
and spans $X$.
We will show that $m > d$.
One of the
vectors is a linear combination of the others. If we remove it
from $T$, we obtain a set of $m-1$ vectors that still span $X$.
Hence
$d = \dim \, X \leq m-1$ by~\ref{mv:dimprop:i}.
For \ref{mv:dimprop:vi} suppose $T = \{ x_1, x_2, \ldots, x_m \}$ is
a linearly independent set. First, $m \leq d$ by definition of dimension.
If $m=d$, the set $T$ must span $X$ as in the proof of \ref{mv:dimprop:iii},
otherwise we could add another vector to $T$.
If $m < d$, $T$ cannot span $X$ by \ref{mv:dimprop:iii}.
So find $v$ not in the span of~$T$.
Via \ref{mv:dimprop:ii}, the set $T \cup \{ v \}$ is
a linearly independent set of $m+1$ elements.
Repeating this procedure $d-m$ times, we
find a set of $d$ linearly independent vectors.
Again, they must span $X$, otherwise we could add yet another vector.
\end{proof}
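As an illustration of \ref{mv:dimprop:vi}, consider the linearly independent
set $T \coloneqq \bigl\{ (1,1,0) \bigr\}$ in $\R^3$. The vector $e_2$ is not
in $\spn(T)$, and $e_3$ is not in $\spn\bigl(T \cup \{ e_2 \}\bigr)$, so by
applying \ref{mv:dimprop:ii} twice, the set
$\bigl\{ (1,1,0), \, e_2, \, e_3 \bigr\}$ is linearly independent. Being a
set of $3 = \dim \, \R^3$ linearly independent vectors, it is a basis of
$\R^3$.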
\subsection{Linear mappings}
When $Y \not= \R$,
a function $f \colon X \to Y$ is often called a
\emph{\myindex{mapping}} or a \emph{\myindex{map}}
rather than a \emph{function}.
\begin{defn}
A map $A \colon X \to Y$ of vector spaces $X$ and $Y$
is \emph{\myindex{linear}} (we also say $A$ is a
\emph{\myindex{linear transformation}}\index{transformation, linear}
or a \emph{\myindex{linear operator}}\index{operator, linear})
if for all $a \in \R$ and all $x,y \in X$,
\begin{equation*}
A(a x) = a A(x) \qquad \text{and} \qquad A(x+y) = A(x)+A(y) .
\end{equation*}
We usually write $Ax$ instead of $A(x)$ if $A$ is linear.
If $A$ is one-to-one and onto, then we say $A$ is
\emph{invertible}\index{invertible linear transformation},
and we denote the inverse by $A^{-1}$.
If $A \colon X \to X$ is linear, then we say $A$ is a
\emph{linear operator on $X$}.
We write $L(X,Y)$\glsadd{not:LXY} for the set of linear maps from $X$ to
$Y$, and $L(X)$\glsadd{not:LX} for the set of linear operators on $X$.
If $a \in \R$ and $A,B \in L(X,Y)$, define
the maps $aA$ and $A+B$ by
\begin{equation*}
(aA)(x) \coloneqq aAx ,
\qquad
(A+B)(x) \coloneqq Ax + Bx .
\end{equation*}
If $A \in L(Y,Z)$ and $B \in L(X,Y)$, define the
map $AB \colon X \to Z$ as the composition $A \circ B$,
\begin{equation*}
ABx \coloneqq A(Bx) .
\end{equation*}
Finally, denote by $I \in L(X)$ the \emph{\myindex{identity}}:
the linear operator such that $Ix = x$ for all $x$.
\end{defn}
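For example, the map $A \colon \R^2 \to \R^2$ defined by
\begin{equation*}
A(x,y) \coloneqq (2x+y, \, x-y)
\end{equation*}
is linear:
$A\bigl(a(x,y)\bigr) = (2ax+ay, \, ax-ay) = a\,A(x,y)$, and
$A\bigl((x,y)+(x',y')\bigr)
= \bigl(2(x+x')+(y+y'), \, (x+x')-(y+y')\bigr)
= A(x,y) + A(x',y')$.
On the other hand, the map $(x,y) \mapsto (x+1,y)$ is not linear; it does
not even take $0$ to $0$.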
\begin{prop} \label{prop:LXYvs}
Let $X$, $Y$, and $Z$ be vector spaces.
\begin{enumerate}[(i)]
\item If $A \in L(X,Y)$, then $A0 = 0$.
\item If $A,B \in L(X,Y)$, then $A+B \in L(X,Y)$.
\item If $A \in L(X,Y)$ and $a \in \R$, then $aA \in L(X,Y)$.
\item If $A \in L(Y,Z)$ and $B \in L(X,Y)$, then $AB \in L(X,Z)$.
\item If $A \in L(X,Y)$ is invertible, then $A^{-1} \in L(Y,X)$.
\end{enumerate}
\end{prop}
In particular, $L(X,Y)$ is a vector space, where $0 \in L(X,Y)$ is the
linear map that takes everything to $0$.
As $L(X)$ is not only a vector space, but also
admits a product (composition),
it is called an \emph{\myindex{algebra}}.
\begin{proof}
We leave the first four items as a quick exercise,
\exerciseref{exercise:LXYvs}.
Let us prove the last item.
Let $a \in \R$ and $y \in Y$. As $A$ is onto, there is an
$x \in X$ such that $y = Ax$.
As it is also one-to-one, $A^{-1}(Az) = z$ for all $z \in X$.
So
\begin{equation*}
A^{-1}(ay)
=
A^{-1}(aAx)
=
A^{-1}\bigl(A(ax)\bigr)
= ax
= aA^{-1}(y).
\end{equation*}
Similarly, let $y_1,y_2 \in Y$ and $x_1, x_2 \in X$ be such that
$Ax_1 = y_1$ and
$Ax_2 = y_2$, then
\begin{equation*}
A^{-1}(y_1+y_2)
=
A^{-1}(Ax_1+Ax_2)
=
A^{-1}\bigl(A(x_1+x_2)\bigr)
= x_1+x_2
= A^{-1}(y_1) + A^{-1}(y_2). \qedhere
\end{equation*}
\end{proof}
\begin{prop} \label{mv:lindefonbasis}
If $A \in L(X,Y)$ is linear, then it is completely determined
by its values on a basis of $X$.
Furthermore, if $B$ is a basis of $X$,
then every function $\widetilde{A} \colon B \to Y$ extends to a linear
function $A$ on $X$.
\end{prop}
We only prove this proposition for finite-dimensional spaces, as we do
not need infinite-dimensional spaces.%
\footnote{For infinite-dimensional spaces, the proof is essentially the same, but a
little trickier to write.
Moreover, we haven't even defined what a basis is for
infinite-dimensional spaces.}
\begin{proof}
Let $\{ x_1, x_2, \ldots, x_n \}$ be a basis of $X$, and let
$y_k \coloneqq A x_k$. Every $x \in X$ has a unique representation
\begin{equation*}
x = \sum_{k=1}^n b_k \, x_k
\end{equation*}
for some numbers $b_1,b_2,\ldots,b_n$. By linearity,
\begin{equation*}
Ax =
A\sum_{k=1}^n b_k x_k
=
\sum_{k=1}^n b_k \, Ax_k
=
\sum_{k=1}^n b_k \, y_k .
\end{equation*}
The \myquote{furthermore} follows by setting $y_k \coloneqq \widetilde{A}(x_k)$,
and then for
$x = \sum_{k=1}^n b_k \, x_k$,
defining the extension as
$A(x) \coloneqq \sum_{k=1}^n b_k \, y_k$. The function is well-defined by
uniqueness of the representation of $x$.
We leave it to the reader to check that $A$ is linear.
\end{proof}
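For example, using the basis $\bigl\{ (1,1), (0,1) \bigr\}$ of $\R^2$ from
\exampleref{example:vecspr2span}, we may define a linear
$A \colon \R^2 \to \R^2$ simply by declaring $A(1,1) \coloneqq (1,0)$ and
$A(0,1) \coloneqq (0,2)$. As $(x,y) = x(1,1) + (y-x)(0,1)$, linearity forces
\begin{equation*}
A(x,y) = x(1,0) + (y-x)(0,2) = \bigl(x, \, 2(y-x)\bigr) .
\end{equation*}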
For a linear map, it is sufficient to check injectivity at the origin.
That is, if the only $x$ such that $Ax =0$ is $x=0$, then $A$ is one-to-one,
because if $Ay=Az$, then $A(y-z) = 0$. For this reason,
one often studies the \emph{\myindex{nullspace}} of $A$, that is,
$\{ x \in X : Ax = 0 \}$.
For finite-dimensional vector spaces (and only in finitely many dimensions)
we have the following special case of the
so-called rank-nullity theorem from linear algebra.
\begin{prop} \label{mv:prop:lin11onto}
If $X$ is a finite-dimensional vector space and $A \in L(X)$, then $A$ is one-to-one if and only if it is onto.
\end{prop}
\begin{proof}
Let $\{ x_1,x_2,\ldots,x_n \}$ be a basis for $X$.
First suppose $A$ is one-to-one. Let $c_1,c_2,\ldots,c_n$ be such that
\begin{equation*}
0 =
\sum_{k=1}^n c_k \, Ax_k =
A\sum_{k=1}^n c_k \, x_k
.
\end{equation*}
As $A$ is one-to-one,
the only vector that is taken to 0 is 0 itself.
Hence,
\begin{equation*}
0 =
\sum_{k=1}^n c_k \, x_k,
\end{equation*}
and so $c_k = 0$ for all $k$ as $\{ x_1,x_2,\ldots,x_n \}$ is a basis.
So $\{ Ax_1, Ax_2, \ldots, Ax_n \}$ is linearly independent.
By \propref{mv:dimprop}
and the fact that the dimension is $n$, we conclude
$\{ Ax_1, Ax_2, \ldots, Ax_n \}$ spans $X$. Consequently, $A$ is onto,
as any $y \in X$ can be written as
\begin{equation*}
y = \sum_{k=1}^n a_k \, Ax_k =
A\sum_{k=1}^n a_k \, x_k .
\avoidbreak
\end{equation*}
For the other direction, suppose $A$ is onto.
Suppose that for some
$c_1,c_2,\ldots,c_n$,
\begin{equation*}
0 = A\sum_{k=1}^n c_k \, x_k =
\sum_{k=1}^n c_k \, Ax_k .
\end{equation*}
As $A$ is onto and every element of $X$ is a linear combination of the
basis, $\{ Ax_1, Ax_2, \ldots, Ax_n \}$ spans $X$.
So by \propref{mv:dimprop}, the set is linearly independent,
and $c_k = 0$ for all $k$. In other words, if $Ax = 0$, then $x=0$.
Thus, $A$ is one-to-one.
\end{proof}
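The finite dimensionality is necessary. For example, on the
infinite-dimensional space $\R[t]$, consider the operator
$A \in L\bigl(\R[t]\bigr)$ given by multiplication by $t$:
\begin{equation*}
A\bigl( c_0 + c_1 t + \cdots + c_m t^m \bigr) \coloneqq
c_0 t + c_1 t^2 + \cdots + c_m t^{m+1} .
\end{equation*}
The operator $A$ is one-to-one, as $t\,p(t)$ is the zero polynomial only
when $p$ is, but it is not onto: the constant polynomial $1$ is not in its
range.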
We leave the proof of the next proposition as an exercise.
\begin{prop} \label{prop:LXYfinitedim}
If $X$ and $Y$ are finite-dimensional vector spaces, then $L(X,Y)$
is also finite-dimensional.
\end{prop}
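For example, every $A \in L(\R,\R)$ satisfies
$Ax = A(x \cdot 1) = x \, A1$, so $A$ is determined by the single number
$A1$, and the map $A \mapsto A1$ identifies $L(\R,\R)$ with $\R$ itself.
In particular, $\dim \, L(\R,\R) = 1$.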
We can identify a finite-dimensional vector
space $X$ of dimension $n$ with $\R^n$, provided we fix a basis
$\{ x_1, x_2, \ldots, x_n \}$ in $X$. That is, we define a bijective
linear map $A \in L(X,\R^n)$ by
$Ax_k \coloneqq e_k$, where $\{ e_1, e_2, \ldots, e_n \}$ is the standard
basis in $\R^n$. We have
the correspondence\glsadd{not:mapsto}
\begin{equation*}
\sum_{k=1}^n c_k \, x_k \, \in X
\quad
\overset{A}{\mapsto}
\quad
(c_1,c_2,\ldots,c_n) \, \in \R^n .
\end{equation*}
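For example, fixing the basis $\{ 1, t, t^2 \}$ of the space $\sP_2$ of
polynomials of degree $2$ or less identifies $\sP_2$ with $\R^3$:
\begin{equation*}
c_0 + c_1 t + c_2 t^2 \, \in \sP_2
\quad
\mapsto
\quad
(c_0, c_1, c_2) \, \in \R^3 .
\end{equation*}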
\subsection{Convexity}
A subset $U$ of a vector space is \emph{\myindex{convex}}
if whenever $x,y \in U$, the line segment from
$x$ to $y$ lies in $U$. That is, if the \emph{\myindex{convex combination}}
$(1-t)x+ty$ is in $U$ for all $t \in [0,1]$.
We write $[x,y]$ for this line segment.\glsadd{not:segmentinRn}
See \figureref{mv:convexcomb}.
\begin{myfigureht}
\subimport*{figures/}{convexset.pdf_t}
\caption{Convexity.\label{mv:convexcomb}}
\end{myfigureht}
In $\R$, convex sets are precisely the intervals, which are
also precisely the connected sets.
In two or more dimensions
there are lots of nonconvex connected sets. For example,
the set $\R^2 \setminus \{0\}$ is connected,
but not convex---given any $x \in \R^2 \setminus \{0\}$ and $y \coloneqq -x$,
we find
$(\nicefrac{1}{2})x + (\nicefrac{1}{2})y = 0$, which is not in the set.
Balls (in the standard metric) in $\R^n$ are convex. It is
a useful enough result to state as a proposition,
but we leave its proof as an exercise.
\begin{prop}
Let $x \in \R^n$ and $r > 0$. The ball $B(x,r) \subset \R^n$
%(using the standard metric on $\R^n$)
is convex.
\end{prop}
\begin{example}
A convex combination is, in particular, a linear combination.
So every vector subspace $V$ of a vector space $X$ is convex.
\end{example}
\begin{example}
%A somewhat more complicated example is given by the following.
Let $C([0,1],\R)$ be the vector space of continuous real-valued functions on $[0,1]$.
Let $V \subset C([0,1],\R)$ be the set of those $f$ such that
\begin{equation*}
\int_0^1 f(x)\,dx \leq 1 \qquad \text{and} \qquad
f(x) \geq 0 \enspace \text{for all } x \in [0,1] .
\end{equation*}
Then $V$ is convex. Take $t \in [0,1]$, and note that if $f,g \in V$,
then $(1-t) f(x) + t g(x) \geq 0$ for all $x$. Furthermore,
\begin{equation*}
\int_0^1 \bigl((1-t)f(x) + t g(x)\bigr) \,dx
=
(1-t) \int_0^1 f(x) \,dx
+ t \int_0^1 g(x) \,dx \leq 1 .
\end{equation*}
Note that $V$ is not a vector subspace of $C([0,1],\R)$. The function
$f(x) \coloneqq 1$ is in $V$, but $2f$ and $-f$ are not.
\end{example}
\begin{prop} \label{prop:intersectionconvex}
The intersection of two convex sets is convex. In fact,
if $\{ C_\lambda \}_{\lambda \in I}$ is
an arbitrary collection of convex sets in a vector space, then
\begin{equation*}
C \coloneqq \bigcap_{\lambda \in I} C_\lambda
\qquad \text{is convex.}
\end{equation*}
\end{prop}
\begin{proof}
If $x, y \in C$, then $x,y \in C_\lambda$ for all
$\lambda \in I$, and hence if $t \in [0,1]$, then $(1-t)x + ty \in
C_\lambda$ for all $\lambda \in I$. Therefore, $(1-t)x + ty \in C$ and $C$
is convex.
\end{proof}
A useful construction using intersections of convex sets is the
\emph{\myindex{convex hull}}. Given a subset $S$ of a vector
space $X$, define the convex hull of $S$ as the intersection of all convex sets
containing $S$:
\begin{equation*}
\operatorname{co}(S) \coloneqq
\bigcap \{ C \subset X : S \subset C, \text{ and } C \text{ is convex} \} .
\end{equation*}
That is, the convex hull is the smallest convex set containing $S$.
By \propref{prop:intersectionconvex}, the intersection of convex sets is
convex.
Hence the convex hull is convex.
\begin{example}
The convex hull of $\{ 0, 1 \}$ in $\R$ is $[0,1]$. Proof:
A convex set containing 0 and 1 must contain $[0,1]$, so
$[0,1] \subset \operatorname{co}(\{0,1\})$.
The set $[0,1]$ is convex and contains $\{0,1\}$, so
$\operatorname{co}(\{0,1\}) \subset [0,1]$.
\end{example}
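Similarly, the convex hull of $\bigl\{ (0,0), (1,0), (0,1) \bigr\}$ in
$\R^2$ is the triangle
\begin{equation*}
\bigl\{ (x,y) \in \R^2 : x \geq 0, \ y \geq 0, \ x+y \leq 1 \bigr\} .
\end{equation*}
The triangle is convex (it is an intersection of three convex half-planes)
and contains the three points. Conversely, a convex set containing the
three points contains the segment $\bigl[(1,0),(0,1)\bigr]$, and then also
every segment from $(0,0)$ to a point of that segment; these segments fill
out the triangle.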
Linear mappings preserve convex sets. So in some sense, convex sets are the
right sort of sets when considering linear mappings or changes of
coordinates.
\begin{prop}
Let $X,Y$ be vector spaces, $A \in L(X,Y)$, and let $C \subset X$ be convex.
Then $A(C)$ is convex.
\end{prop}
\begin{proof}
Take two points $p,q \in A(C)$. Pick $u,v \in C$ such that
$Au = p$ and $Av=q$. As $C$ is convex, then
$(1-t)u+t v \in C$
for all $t \in [0,1]$, so
\begin{equation*}
(1-t)p+t q
=
(1-t)Au+tAv
=
A\bigl((1-t)u+tv\bigr)
\in A(C) . \qedhere
\end{equation*}
\end{proof}
\subsection{Exercises}
\begin{exercise}
Show that in $\R^n$
(with the standard euclidean metric),
for every $x \in \R^n$ and every $r > 0$,
the ball $B(x,r)$ is convex.
\end{exercise}
\begin{exercise}
Verify that $\R^n$ is a vector space.
\end{exercise}
\begin{exercise}
Let $X$ be a vector space.
Prove that a finite set of vectors $\{ x_1,x_2,\ldots,x_n \} \subset X$
is linearly independent if and only if for every $k=1,2,\ldots,n$
\begin{equation*}
\spn\bigl( \{ x_1,\ldots,x_{k-1},x_{k+1},\ldots,x_n \}\bigr) \subsetneq
\spn\bigl( \{ x_1,x_2,\ldots,x_n \}\bigr) .
\end{equation*}
That is, the span of the set with one vector removed is strictly smaller.
\end{exercise}
\begin{exercise} \label{exercise:intzerosubspace}
Show that the set $X \subset C([0,1],\R)$ of those functions such
that $\int_0^1 f = 0$ is a vector subspace.
Compare \exerciseref{exercise:intoneconvex}.
\end{exercise}
\begin{exercise}[Challenging]
Prove $C([0,1],\R)$ is an infinite-dimensional vector space
where the operations are defined in the obvious way:
$s=f+g$ and $m=af$ are defined as
$s(x) \coloneqq f(x)+g(x)$ and
$m(x) \coloneqq a f(x)$.
Hint: For the dimension, think of functions that are only nonzero
on the interval $(\nicefrac{1}{n+1},\nicefrac{1}{n})$.
\end{exercise}
\begin{exercise} \label{exercise:continuouskernel}
Let $k \colon [0,1]^2 \to \R$ be continuous. Show that
$L \colon C([0,1],\R) \to C([0,1],\R)$ defined by
\begin{equation*}
Lf(y) \coloneqq \int_0^1 k(x,y)f(x)\,dx