Chapter 4: Continuity and the Fundamental Theorem of Algebra
The difference between the real numbers and arbitrary ordered fields is that the field of real numbers was required to contain all the infinite decimals. What this meant was that, for every infinite decimal $a_0.a_1a_2a_3\ldots$, there had to be a real number a such that all the numbers of the sequence $s_1, s_2, s_3, \ldots$, where $s_j = a_0.a_1a_2\ldots a_j$, are as close to the number a as we like provided that j is chosen sufficiently large. The mathematical parlance describes this by saying that the series converges to a or that the sequence has limit a. Since this limit requirement was the only additional assumption we added to distinguish the field of real numbers from other ordered fields like the rational numbers, it makes sense to study this idea if we are to understand the deeper properties of the real numbers like the fundamental theorem of algebra.
We will also need to understand polynomials better. Recall that a polynomial with coefficients in a field F was defined to be a formal expression of the form $a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n$ where the $a_i$ are in F. We usually wrote a polynomial in the form p(x) and substituted values b in F for the variable to get elements p(b) of F. So, the polynomial corresponds to a set of pairs (b, p(b)) for all b in F. When we do this, we are thinking of the polynomial as a function from the domain F to the range space F. For example, the polynomial $p(x) = x^2$ is a function from the real numbers to the real numbers which associates with each real number b its square $b^2$.
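The passage above treats a polynomial as the set of pairs (b, p(b)). A minimal sketch of that idea follows, assuming the coefficients are listed from the highest degree term down to the constant term; the names `evaluate`, `square`, and `pairs` are illustrative and do not come from the text. Rational numbers are used so that the arithmetic stays exact in an ordered field.

```python
from fractions import Fraction


def evaluate(coeffs, b):
    """Evaluate a polynomial at b by Horner's rule.

    coeffs lists the coefficients from the highest degree term down to the
    constant term (an ordering chosen for this sketch).
    """
    value = 0
    for a in coeffs:
        value = value * b + a
    return value


# p(x) = x^2, viewed as a collection of pairs (b, p(b)) at a few rational points
square = [1, 0, 0]
pairs = [(b, evaluate(square, b)) for b in (Fraction(-3, 2), Fraction(0), Fraction(2))]
```

Only finitely many pairs can be listed this way, of course; the function itself consists of the pairs for every b in the field.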
We will be dealing with polynomials with complex coefficients. As we have just seen, such polynomials can be thought of as defining a function from the complex numbers to the complex numbers. But every complex number is of the form a + bi where a and b are real numbers. So we have another function consisting of all pairs ((a, b), p(a + bi)) for all pairs of real numbers (a, b). Thus, the polynomial can be thought to define a function from the set of pairs of real numbers (the cartesian product of the real numbers with themselves) to the complex numbers. Because we are taking values of the function at pairs of numbers, one usually describes this by saying that the polynomial p with complex coefficients can be thought of as a function of two variables, and one writes p(x, y) for p(x + iy).
In the case of a function f(x) from the real numbers to the real numbers, we can represent the function by its graph (i.e. the set of all pairs (x, f(x)) for x real). For example, $f(x) = x^2$ has as its graph a parabola. This is useful because it allows one to picture a function as a curve in the plane. In the case of a function f(x, y) of two real variables with values in the real numbers, one can think of the set of all ((x, y), f(x, y)) for all real pairs (x, y) as being a surface in 3-dimensional space. Above each point (x, y) in the plane lies exactly one point of the graph. For example, $f(x, y) = x^2 + y^2$ gives a surface called a paraboloid; its shape is what one gets by rotating the parabola $z = x^2$ around the z-axis.
In the case of a polynomial with complex coefficients, we have seen that it defines a function of two real variables with complex values. So, to each pair of real numbers (a, b), we have a complex number p(a + bi). Complex numbers are represented as pairs of reals and so we see that our complex polynomial corresponds to a four dimensional picture: for each pair of reals, we have a pair of real numbers. Because of the difficulty of visualizing four dimensional objects, this is much less useful. Since the complex number is really a pair of numbers, we can consider the polynomial to represent a pair of functions. If p(x, y) = p(x + iy) = r + is, then we have the function consisting of the pairs ((x, y), r) and another function consisting of the pairs ((x, y), s). So geometrically, the polynomial corresponds to two surfaces. For example, if the polynomial were $p(z) = z^2$, then the complex function would consist of the pairs $((x, y), (x + iy)^2)$, and since $(x + iy)^2 = x^2 + 2xyi + i^2y^2$ and $i^2 = -1$, we see that the polynomial corresponds to the two functions with pairs ((x, y), x^2 - y^2) and with ((x, y), 2xy).
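As a quick numerical check of how a complex polynomial splits into two real valued functions, here is a small sketch for the polynomial z^2. It relies on the identity (x + iy)^2 = (x^2 - y^2) + (2xy)i; the helper name `square_parts` is made up for this illustration.

```python
def square_parts(x, y):
    """Return (r, s) with (x + iy)^2 = r + is, using Python's complex arithmetic."""
    w = complex(x, y) ** 2
    return w.real, w.imag


# Compare with the two real surfaces r = x^2 - y^2 and s = 2xy at sample points.
for x, y in [(1.0, 2.0), (-0.5, 3.0), (2.0, 0.0)]:
    r, s = square_parts(x, y)
    assert abs(r - (x * x - y * y)) < 1e-12
    assert abs(s - 2 * x * y) < 1e-12
```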
Complex numbers z are also represented as the pair consisting of their length |z| and their argument arg(z). So, we can also think of the polynomial p with complex coefficients as two surfaces, one recording the lengths, $|p(x + iy)|$, and the other recording the argument of the functional value, $\arg(p(x + iy))$. This is the approach we will take in proving the fundamental theorem of algebra. That result simply says that the length surface of a non-constant polynomial has to touch the xy-plane in at least one point. Now, the surface has no points which lie below the xy-plane because lengths are never negative. So, if we were to look for points where the surface touched the xy-plane, we should look for lowest points on the surface. We will show that the length surface has a lowest point and that this lowest point must lie on the xy-plane.
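The length and argument surfaces can be sampled directly. The sketch below uses the sample polynomial p(z) = z^2 + 1, which is a choice made here for illustration and not an example from the text; its length surface touches the xy-plane at the point (0, 1) because p(i) = 0.

```python
import cmath


def p(z):
    """Sample polynomial for this sketch: p(z) = z^2 + 1."""
    return z * z + 1


def length_and_argument(x, y):
    """Return (|p(x + iy)|, arg p(x + iy)) using cmath.polar."""
    return cmath.polar(p(complex(x, y)))


height_at_i, _ = length_and_argument(0.0, 1.0)       # the length surface touches the plane here
height_at_origin, _ = length_and_argument(0.0, 0.0)  # the surface sits at height 1 above the origin
```

The argument is not meaningful where the length is zero, so only the length is read off at the point (0, 1).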
Intuitively, a function f(x) is said to be continuous at b if the functional values f(x) are as close as we would like to f(b) as soon as x is sufficiently close to b and at a place where f(x) is defined. For example, the top surface of a table defines a flat surface which is continuous. It is even continuous at the edge of the table because f(x) is not defined for points x beyond the edge of the table. On the other hand, if we have a room containing only a table and consider the function consisting of the table top and the part of the floor which is not under the table, then this function is still continuous at all points except at the points corresponding to the edge of the table. Given such a point, there are points arbitrarily close to it where the functional value is defined by the table top and other points arbitrarily close to it where the functional value is defined by the floor level.
Example 1: i. Another example is the function consisting of all the pairs for non-zero x together with the single point (0, 1). This is usually written as
This function is continuous at all x except for x = 0. All the functional values for x near 0 are close to 0 whereas f(0) = 1.
ii. A less extreme case is
Again, this function is continuous at all x except x = 0.
iii. This function is not continuous anywhere:
iv. This monstrosity is continuous everywhere except at (0, 0):
Let's give a more careful definition: Let f(x) be a real valued function defined for certain real values x. If f is defined at b, we want to say that f is continuous at b if f(x) is as close as we wish to f(b) for all x sufficiently close to b where f is defined. The problem is to make sense of expressions such as "as close as we wish". One might wish for different degrees of closeness at various times. To cover the most stringent case, we will interpret this as meaning the distance between the two is less than any specified positive number. So, "f(x) is as close as we wish to f(b)" means that, if you are given a maximal distance $\epsilon > 0$, then one must have $|f(x) - f(b)| < \epsilon$. The expression "for all x sufficiently close to b" means that there is a positive number $\delta$ such that the assertion holds for all x satisfying $|x - b| < \delta$. So, our careful definition is:
Definition 1: Let f be a function with domain a set of real numbers and with range space the set of all real numbers. We say that f is continuous at b in its domain if for every $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x) - f(b)| < \epsilon$ for all x in the domain of f with $|x - b| < \delta$.
Showing that a function is continuous can be a lot of work:
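Example 2: i. The function $f(x) = x^2$ is continuous at x = 0. In fact, if we are given $\epsilon > 0$, then we need to show that there is a $\delta > 0$ such that $|x^2 - 0| < \epsilon$ for all x with $|x - 0| < \delta$. If we choose $\delta$ to be the smaller of $\epsilon$ and 1, then we can see that this works. In fact, if $|x| < \delta$, then $|x| < 1$ because $\delta \le 1$. So $|x^2| = |x|\,|x| \le |x|$. Since $\delta \le \epsilon$, we know that $|x| < \epsilon$. So, $|x^2 - 0| < \epsilon$ as required.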
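ii. Now let's try to show that the same function is continuous at x = 1. For a given $\epsilon > 0$, we need to have $|x^2 - 1| < \epsilon$ for all x with $|x - 1| < \delta$, where $\delta$ is yet to be chosen. Now, $|x^2 - 1| = |x - 1|\,|x + 1|$. So, if we want this to be small by making |x - 1| small, we can do it provided that |x + 1| is not made large in the process. But if $|x - 1| < 1$, then $0 < x < 2$ and so $|x + 1| < 3$. So, if $|x - 1| < 1$, we would have
$$|x^2 - 1| = |x - 1|\,|x + 1| \le 3|x - 1| < \epsilon$$
where the last inequality holds provided we choose $|x - 1| < \epsilon/3$. This is precisely what we want provided that we also have $|x - 1| < 1$.
So, given $\epsilon > 0$, let $\delta$ be any positive number smaller than both 1 and $\epsilon/3$. Then, if $|x - 1| < \delta$, we have $|x^2 - 1| < \epsilon$ because
$$|x^2 - 1| = |x - 1|\,|x + 1| \le 3|x - 1| < 3\delta < \epsilon.$$
This proves that $f(x) = x^2$ is continuous at x = 1.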
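iii. Now, let's try to show that $f(x) = x^2$ is continuous at every real number b. Repeating the same kind of reasoning, we would want
$$|x^2 - b^2| = |x - b|\,|x + b| < \epsilon.$$
If $|x - b| < 1$, then $|x + b| \le |x - b| + 2|b| < 2|b| + 1$. Given an $\epsilon > 0$, we could choose $\delta$ so that it is positive, less than 1, and less than $\epsilon/(2|b| + 1)$. Then, we would have $|x^2 - b^2| \le |x - b|\,(2|b| + 1) < \epsilon$ for all x with $|x - b| < \delta$.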
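The choice of delta for the squaring function can be tried out numerically. The sketch below assumes one admissible choice, half of the smaller of 1 and epsilon/(2|b| + 1), which follows from the bound |x + b| <= |x - b| + 2|b|; the function names are hypothetical. Sampling finitely many points does not prove continuity, it only illustrates the inequality.

```python
def delta_for_square(b, eps):
    """One admissible delta for f(x) = x^2 at the point b (a choice, not the only one)."""
    return 0.5 * min(1.0, eps / (2 * abs(b) + 1))


def check(b, eps, samples=1000):
    """Test |x^2 - b^2| < eps at evenly spaced x with |x - b| < delta."""
    delta = delta_for_square(b, eps)
    for k in range(-samples + 1, samples):
        x = b + delta * k / samples  # |k| < samples, so |x - b| < delta
        if abs(x * x - b * b) >= eps:
            return False
    return True
```

Note how the admissible delta shrinks as |b| grows: the same epsilon demands a tighter neighborhood further from the origin.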
We will need to study the continuity of real valued functions of two variables. The careful definition is almost the same as in the one variable case:
Definition 2: Let f be a function with domain a set of pairs of real numbers and with range space the set of all real numbers. We say that f is continuous at (a, b) in its domain if for every $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x, y) - f(a, b)| < \epsilon$ for all (x, y) in the domain of f with $\sqrt{(x - a)^2 + (y - b)^2} < \delta$.
All we really did was replace the absolute value with the distance function: the condition that $|x - b| < \delta$ is replaced with the condition that the distance from (x, y) to (a, b) be less than $\delta$. We could have written Definition 1 in the same way, since absolute value gives the distance between two real numbers.
Example 3: i. Any constant function is continuous at every point in its domain. Suppose f(x) = c, where c is a real number, for all x in the domain of f. Let b be in the domain of f. If $\epsilon > 0$, then we need
$$|f(x) - f(b)| = |c - c| = 0 < \epsilon$$
for all x in the domain of f with $|x - b| < \delta$. But this holds for any choice of $\delta$ as long as it is a positive number.
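ii. Any linear function f(x, y) = ax + by + c is continuous at all points (x, y) in its domain. In fact, let $\epsilon > 0$ and (r, s) be in the domain of f. We need
$$|f(x, y) - f(r, s)| < \epsilon$$
for all (x, y) in the domain with $\sqrt{(x - r)^2 + (y - s)^2} < \delta$.
For (x, y) with $\sqrt{(x - r)^2 + (y - s)^2} < \delta$, one has:
$$|x - r| = \sqrt{(x - r)^2} \le \sqrt{(x - r)^2 + (y - s)^2} < \delta$$
because the square root function is increasing. Similarly $|y - s| < \delta$. But then, we have:
$$|f(x, y) - f(r, s)| = |a(x - r) + b(y - s)| \le |a|\,|x - r| + |b|\,|y - s| \le (|a| + |b|)\,\delta.$$
If we choose our $\delta$ so that $(|a| + |b|)\,\delta < \epsilon$, then the right hand side will be smaller than $\epsilon$ as desired.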
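The estimate for linear functions can also be spot-checked. The following sketch assumes the particular choice delta = epsilon/(|a| + |b| + 1), which guarantees (|a| + |b|) delta < epsilon even when a = b = 0; the helper names are invented for this illustration, and the points tested lie on a circle just inside the delta disc around (r, s).

```python
import math


def delta_for_linear(a, b, eps):
    """A delta with (|a| + |b|) * delta < eps (one convenient choice)."""
    return eps / (abs(a) + abs(b) + 1)


def stays_within(a, b, c, r, s, eps, directions=360):
    """Check |f(x, y) - f(r, s)| < eps on a circle of radius 0.999 * delta about (r, s)."""
    def f(x, y):
        return a * x + b * y + c

    radius = 0.999 * delta_for_linear(a, b, eps)
    for k in range(directions):
        t = 2 * math.pi * k / directions
        x, y = r + radius * math.cos(t), s + radius * math.sin(t)
        if abs(f(x, y) - f(r, s)) >= eps:
            return False
    return True
```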
Proving that a function is continuous using only the definition can be quite tedious. So, we will need to develop some results which make it easy to check that certain functions are continuous. Throughout this section we will be dealing with real valued functions of one or two variables. We will take their values at points z where, by point, we mean either a real number or a pair of real numbers depending on whether the function is of one or two variables. We will also use absolute value signs to indicate the distance to the origin, either 0 or (0, 0), depending on whether the function is of one or two variables.
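Given two functions f and g, we can define sum, difference, product, and quotient functions by:
$$(f + g)(z) = f(z) + g(z), \quad (f - g)(z) = f(z) - g(z), \quad (fg)(z) = f(z)\,g(z), \quad (f/g)(z) = f(z)/g(z),$$
where the quotient is defined only at points z with $g(z) \ne 0$.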
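Proposition 1: Let f and g be real valued functions of one or two variables. Let z be a point in the domain of f and the domain of g where both f and g are continuous. Then f + g, f - g, and fg are continuous at z, and f/g is continuous at z provided that $g(z) \ne 0$.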
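Proof: Let w be a point in the intersection of the domains of f and of g. Then
$$|(f + g)(w) - (f + g)(z)| = |(f(w) - f(z)) + (g(w) - g(z))| \le |f(w) - f(z)| + |g(w) - g(z)|.$$
If $\epsilon$ is any positive number, then we can use the continuity of f and g at z to know that there are positive numbers $\delta_1$ and $\delta_2$ such that $|f(w) - f(z)| < \epsilon/2$ and $|g(w) - g(z)| < \epsilon/2$ for all points w in the domains of f and g such that $|w - z| < \delta_1$ (for f) and $|w - z| < \delta_2$ (for g). Choosing $\delta$ to be any positive number smaller than both $\delta_1$ and $\delta_2$ gives, for all such w with $|w - z| < \delta$,
$$|(f + g)(w) - (f + g)(z)| < \epsilon/2 + \epsilon/2 = \epsilon$$
as needed. You should check through the same proof using subtraction instead of addition of functions.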
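For products, one has
$$|f(w)g(w) - f(z)g(z)| = |f(w)(g(w) - g(z)) + (f(w) - f(z))g(z)| \le |f(w)|\,|g(w) - g(z)| + |f(w) - f(z)|\,|g(z)|.$$
Let $\epsilon > 0$. Choose a positive $\epsilon_1$ smaller than 1 and subject to another condition which we will specify below. By the continuity of f at z, we know that there is a $\delta_1 > 0$ such that $|f(w) - f(z)| < \epsilon_1$ for all w in the domain of f with $|w - z| < \delta_1$. Similarly, by the continuity of g at z, we know that there is a $\delta_2 > 0$ such that $|g(w) - g(z)| < \epsilon_1$ for all w in the domain of g with $|w - z| < \delta_2$. Choose any positive $\delta$ smaller than both $\delta_1$ and $\delta_2$. For w with $|w - z| < \delta$ we have $|f(w)| \le |f(z)| + |f(w) - f(z)| < |f(z)| + 1$. Continuing our series of inequalities, we get for all w in the domains of both f and of g with $|w - z| < \delta$ that
$$|f(w)g(w) - f(z)g(z)| < (|f(z)| + 1)\,\epsilon_1 + \epsilon_1\,|g(z)| = \epsilon_1\,(|f(z)| + |g(z)| + 1) < \epsilon$$
where the last inequality follows because we add the condition $\epsilon_1 < \epsilon/(|f(z)| + |g(z)| + 1)$ to the list of conditions which $\epsilon_1$ is required to satisfy.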
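Finally, let's consider the case of quotients. We need to assume that z is in the domains of f and of g as well as $g(z) \ne 0$. The inequality looks like
$$\left|\frac{f(w)}{g(w)} - \frac{f(z)}{g(z)}\right| = \frac{|f(w)g(z) - f(z)g(w)|}{|g(w)|\,|g(z)|}.$$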
Now the numerator is like the one for products and the same sort of argument will be able to handle it. The new ingredient is the denominator. In order to get an upper bound on a quotient |a|/|b|, you need to either make the numerator larger or the denominator smaller. So, we need a lower bound on |g(w)|. But $|g(z)| > 0$ and so we can choose a $\delta_1 > 0$ such that for all w in the domain of g with $|w - z| < \delta_1$, one has $|g(w) - g(z)| < |g(z)|/2$. Since g(w) is close to g(z), it cannot be too close to zero; more specifically,
$$|g(w)| = |g(z) - (g(z) - g(w))| \ge |g(z)| - |g(z) - g(w)| > |g(z)|/2$$
where we have used:
Lemma 1: If a and b are real numbers, then $|a - b| \ge |a| - |b|$.
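Proof: This is an exercise.
Returning to the quotient, let $\epsilon_1 > 0$ be chosen according to criteria which we will figure out below. Now, choose $\delta > 0$ so small that the lower bound $|g(w)| > |g(z)|/2$ holds, and so that $|f(w) - f(z)| < \epsilon_1$ and $|g(w) - g(z)| < \epsilon_1$, for all w in the domains of f and g such that $|w - z| < \delta$. Then we can continue our inequalities:
$$\frac{|f(w)g(z) - f(z)g(w)|}{|g(w)|\,|g(z)|} \le \frac{2\left(|f(w) - f(z)|\,|g(z)| + |f(z)|\,|g(w) - g(z)|\right)}{|g(z)|^2} < \frac{2\epsilon_1\,(|f(z)| + |g(z)|)}{|g(z)|^2} < \epsilon$$
where the last inequality would be true if we chose $\epsilon_1$ so that $\epsilon_1 < \epsilon\,|g(z)|^2/(2(|f(z)| + |g(z)|))$.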
Corollary 1: i. Every polynomial with real coefficients is continuous at all real numbers. ii. If p(z) is a polynomial with complex coefficients and p(x, y) = r(x,y) + i s(x,y), then r(x,y) and s(x,y) are continuous at all pairs (x, y). iii. If p(z) is a polynomial with complex coefficients, then |p(x,y)| is continuous at all (x, y).
Proof: i. and ii. All of these are functions made up of a finite number of additions and multiplications of continuous functions. So the result follows by applying the Proposition a certain number of times. (More correctly, one can proceed by descent assuming that one has the function made up with the least number of multiplications and additions.)
For assertion iii, one needs
Lemma 2: i. If f(x, y) is a real valued function continuous at (a, b) and g(x) is a real valued function continuous at f(a, b), then g(f(x, y)) is continuous at (a, b).
ii. The square root function is continuous at all non-negative reals.
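Proof: i. Let $\epsilon > 0$. Since g is continuous at f(a, b), there is a $\delta_1 > 0$ so that $|g(w) - g(f(a, b))| < \epsilon$ for all w in the domain of g such that $|w - f(a, b)| < \delta_1$. Further, since f is continuous at (a, b), one knows that there is a $\delta > 0$ such that $|f(x, y) - f(a, b)| < \delta_1$ for all (x, y) in the domain of f where $\sqrt{(x - a)^2 + (y - b)^2} < \delta$. But then, letting w = f(x, y), we have $|g(f(x, y)) - g(f(a, b))| < \epsilon$.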
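ii. Let a > 0 and $\epsilon > 0$. One has for $x \ge 0$ with $|x - a| < \delta$, where $\delta$ is a quantity to be determined:
$$|\sqrt{x} - \sqrt{a}| = \frac{|x - a|}{\sqrt{x} + \sqrt{a}} \le \frac{|x - a|}{\sqrt{a}} < \frac{\delta}{\sqrt{a}}.$$
So, if we choose $\delta$ so that $0 < \delta \le \epsilon\sqrt{a}$, then one has $|\sqrt{x} - \sqrt{a}| < \epsilon$.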
Now consider the case where a = 0. If $\epsilon > 0$, choose a positive $\delta$ smaller than 1 and $\epsilon^2$. If x is non-negative and $|x - 0| < \delta$, then $\sqrt{x} < \sqrt{\delta} < \epsilon$ since the square root function is increasing, and so the square root function is even continuous at zero.
The typical discontinuity is a point where the function makes a jump. Continuous functions cannot do this:
Proposition 2: ( Bolzano's Theorem) Let f(x) be a continuous function defined on a closed interval [a,b]. If f(a) and f(b) have different signs (i.e. one is positive and the other is negative), then there is a number c in (a, b) such that f(c) = 0.
Proof: This is an example of a binary search approach. The idea is that at each step we know that the desired element is in a certain interval and we can discard half that interval. So, we are essentially developing the solution by finding its base 2 expansion.
Start with $[a_1, b_1] = [a, b]$. We know that f(a) and f(b) have different signs and we are looking for a root of f. Consider $d = (a + b)/2$. If f(d) = 0, then we have found a root. Otherwise, f(d) is non-zero and so its sign is different from either that of f(a) or that of f(b). Now replace the interval [a, b] with the interval whose endpoints are d and whichever of a or b has a functional value with sign different from that of f(d). The new interval is called $[a_2, b_2]$ and its length is half that of its predecessor $[a_1, b_1]$.
Repeating the same process, we obtain either a root of f or an inductive definition of intervals $[a_k, b_k]$ where $[a_{k+1}, b_{k+1}] \subset [a_k, b_k]$ with $b_k - a_k = (b - a)/2^{k-1}$ and such that $f(a_k)$ and $f(b_k)$ are both non-zero with different signs. If we found a root, we are done; so assume that we have the sequence of intervals.
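According to Corollary 8 of Section 2.7.3, there is a unique real number c contained in all the intervals $[a_k, b_k]$. It must be the case that f(c) = 0. Otherwise, the continuity of f implies that there is a $\delta > 0$ such that $|f(x) - f(c)| < |f(c)|$ for all x in [a, b] with $|x - c| < \delta$; in particular, every such f(x) has the same sign as f(c). But, if we choose k so large that $(b - a)/2^{k-1} < \delta$, then every element x in $[a_k, b_k]$ satisfies $|x - c| < \delta$. This contradicts the fact that there are x in the interval, namely one of its endpoints, such that f(x) has a sign different from that of f(c). This completes the proof.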
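The halving procedure in this proof translates almost line for line into a program. What follows is a minimal sketch in floating point, assuming a < b and a continuous f; the name `bisect_root` and the tolerance parameter `tol` are additions made for this illustration, since the proof itself runs through infinitely many steps.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a root of f in [a, b], given that f(a) and f(b) have different signs."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa > 0) == (fb > 0):
        raise ValueError("f(a) and f(b) must have different signs")
    while b - a > tol:
        d = (a + b) / 2
        if d == a or d == b:  # floating point cannot split the interval further
            break
        fd = f(d)
        if fd == 0:
            return d
        if (fd > 0) == (fa > 0):
            a, fa = d, fd     # keep [d, b]: the sign change is in the right half
        else:
            b, fb = d, fd     # keep [a, d]: the sign change is in the left half
    return (a + b) / 2


root = bisect_root(lambda x: x * x - 2, 1.0, 2.0)  # approximates the square root of 2
```

Every interval kept by the loop still has endpoints whose functional values differ in sign, which is exactly the invariant used in the proof.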
Corollary 2: (Intermediate Value Theorem) If f(x) is continuous on the closed interval [a, b], then for every real number d between f(a) and f(b), there is a c in (a, b) with f(c) = d.
Proof: Apply Bolzano's Theorem to the function f(x) - d.
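The reduction to Bolzano's Theorem can be sketched the same way: to solve f(c) = d, search for a root of f(x) - d. The code below is an illustration only, assuming a < b, a continuous f, and a target d strictly between f(a) and f(b); the name `solve_by_bisection` and the fixed step count are choices made here.

```python
def solve_by_bisection(f, d, a, b, steps=60):
    """Approximate a c in [a, b] with f(c) = d by halving, applied to g(x) = f(x) - d."""
    def g(x):
        return f(x) - d

    if g(a) == 0 or g(b) == 0 or (g(a) > 0) == (g(b) > 0):
        raise ValueError("d must lie strictly between f(a) and f(b)")
    for _ in range(steps):
        m = (a + b) / 2
        gm = g(m)
        if gm == 0:
            return m
        if (gm > 0) == (g(a) > 0):
            a = m  # g changes sign on [m, b]
        else:
            b = m  # g changes sign on [a, m]
    return (a + b) / 2


# x^3 takes the values 1 and 8 at the ends of [1, 2], so it must take the value 5 in between.
c = solve_by_bisection(lambda x: x ** 3, 5.0, 1.0, 2.0)
```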
Definition 3: A point z is an absolute minimum of a function f defined on a set S if z is in the domain of f and $f(z) \le f(w)$ for all w in the set S. A point z is an absolute maximum of a function f defined on a set S if z is in the domain of f and $f(z) \ge f(w)$ for all w in S.
Proposition 3: ( Weierstrass's Extreme Value Theorem) If f is a real valued function defined and continuous on a closed interval [a, b] (or a closed rectangle in the case of a function of 2 variables), then f has at least one absolute maximum and at least one absolute minimum on this interval (or rectangle).
Proof: We can use binary search here as well. Let's discuss the one variable case first, and then indicate what changes need to be made to handle the two variable one. Start with the interval $[a_1, b_1] = [a, b]$ and search for an absolute maximum. Let $d = (a + b)/2$. If there is a point e in [a, d] with $f(e) \ge f(x)$ for all x in [d, b], then let $[a_2, b_2] = [a, d]$; otherwise, let $[a_2, b_2] = [d, b]$. We have halved the length of the interval. The half that was discarded does not contain an e with f(e) larger than every f(x) in the half that was kept.
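By induction, assume that we have defined $[a_k, b_k]$ of length $(b - a)/2^{k-1}$. Let $d = (a_k + b_k)/2$. If $[a_k, d]$ contains an e with $f(e) \ge f(x)$ for all x in $[d, b_k]$, then set $[a_{k+1}, b_{k+1}] = [a_k, d]$; otherwise, let $[a_{k+1}, b_{k+1}] = [d, b_k]$. Clearly, $[a_{k+1}, b_{k+1}]$ has length $(b - a)/2^k$. The half that was discarded does not contain an e with f(e) larger than every f(x) in the half that was kept.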
By Corollary 8 of Section 2.7.3, we know that there is a unique real number c contained in all the intervals $[a_k, b_k]$. This c must be the absolute maximum. If not, there is an e in [a, b] with f(e) > f(c). Since e is not in the intersection of all the $[a_k, b_k]$, there must be a step j in which e was in the half of the interval which was discarded. But by the choice of the discarded half, there is an $e_1$ in $[a_{j+1}, b_{j+1}]$ with $f(e_1) \ge f(e)$.
How large can j be? Consider the set S of non-negative integers j for which there is an $e_j$ in $[a_j, b_j]$ with $f(e_j) \ge f(e)$. If j is in S, then $f(e_j) \ge f(e) > f(c)$, so $e_j$ is different from c. So, we can argue as in the last paragraph, using $e_j$ in place of e, that there is a larger value of j in the set S. So, we see that S contains arbitrarily large integers.
But, f is continuous. So, there is a $\delta > 0$ such that $|f(x) - f(c)| < f(e) - f(c)$ for all x in [a, b] with $|x - c| < \delta$. By choosing a J with $(b - a)/2^{J-1} < \delta$, we see that every element x of $[a_j, b_j]$ with $j \ge J$ satisfies $|x - c| < \delta$, hence $f(x) < f(e)$, and so there can be no such x with $f(x) \ge f(e)$. This contradicts our assertion that the set S contains arbitrarily large integers j, and so proves the result in the case of a function of a single variable.
The same sort of proof can be used to prove Weierstrass's Theorem in the case of functions of two variables. In this case, we have a rectangle which is just the cartesian product of two intervals. Instead of dealing with a nested set of intervals, we deal with rectangles. We can let and be the two midpoints. The crucial step is:
Remark: Although both Bolzano's and Weierstrass's Theorems were proved with binary search, examination shows that the proof of Weierstrass's Theorem really does not give us any effective means for finding the absolute maximum. We know it exists, but can't really find it. On the other hand, the proof of Bolzano's Theorem does allow one to actually approximate the desired root to any desired degree of accuracy. Since the size of the interval shrinks by a factor of 2 at each step, we get another decimal digit of accuracy every three or four steps.
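The claim about accuracy can be checked by counting halvings. The sketch below, with the illustrative name `halvings_needed`, counts how many times an interval of length 1 must be halved before it is shorter than a given power of ten.

```python
def halvings_needed(length, target):
    """Number of halvings until an interval of the given length is shorter than target."""
    steps = 0
    while length >= target:
        length /= 2
        steps += 1
    return steps


# Steps needed for 1, 2, 3 and 6 decimal digits, starting from an interval of length 1.
counts = [halvings_needed(1.0, 10.0 ** -k) for k in (1, 2, 3, 6)]
```

The counts grow by three or four per extra digit because each digit costs a factor of 10 and ten lies between 2^3 and 2^4.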
The Fundamental Theorem of Algebra states:
Theorem 1: ( Gauss) Every non-constant polynomial with complex coefficients has at least one complex root.
Lemma 3: (D'Alembert's Lemma) Let f(z) be a non-constant polynomial with $f(0) \ne 0$. Then there is a non-zero complex number c such that |f(cx)| < |f(0)| for all sufficiently small positive real values x.
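Proof: Let $f(z) = a_0 z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n$ be a non-constant polynomial with complex coefficients. We assume that $a_n = f(0) \ne 0$ and that $a_{n-k} z^k$ is the non-zero term of smallest positive degree, i.e. the smallest positive degree term is of degree k. Then $f(cx) = a_0 c^n x^n + \cdots + a_{n-k} c^k x^k + a_n$.
We want to choose c so that $\arg(a_{n-k} c^k x^k) = \arg(a_n) + \pi$. Since x is positive, the factor $x^k$ does not affect the argument. Since arg converts products into sums, we can solve to get
$$\arg(c) = \frac{\arg(a_n) + \pi - \arg(a_{n-k})}{k}.$$
One can choose any non-zero c with this argument. Then the last two terms of f(cx) can be written in the form $a_n(1 - d x^k)$ where $d = |a_{n-k}|\,|c|^k/|a_n|$ is a positive real number. The remaining terms are in absolute value at most $e x^{k+1}$ for some positive e, as long as $0 < x \le 1$. So,
$$|f(cx)| \le |a_n|(1 - d x^k) + e x^{k+1} = |a_n| - x^k(|a_n| d - e x) < |a_n| = |f(0)|$$
for all positive x small enough so that $x \le 1$, $d x^k \le 1$, and $x < |a_n| d/(2e)$. This completes the proof.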
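The direction c produced in this proof can be computed explicitly. The sketch below is one possible rendering, not the author's code: it stores coefficients so that `coeffs[j]` multiplies z**j (an indexing convention adopted only for this sketch), takes |c| = 1, and chooses the argument of c so that the lowest positive degree term points opposite to the constant term. The function names are hypothetical.

```python
import cmath


def evaluate(coeffs, z):
    """Horner evaluation; coeffs[j] is the coefficient of z**j."""
    value = 0
    for a in reversed(coeffs):
        value = value * z + a
    return value


def descent_direction(coeffs):
    """Return a c with |c| = 1 along which |f| initially decreases from |f(0)|.

    Requires f(0) != 0 and f non-constant.
    """
    const = coeffs[0]
    if const == 0:
        raise ValueError("need f(0) != 0")
    # k is the smallest positive degree with a non-zero coefficient
    k = next(j for j in range(1, len(coeffs)) if coeffs[j] != 0)
    theta = (cmath.phase(const) + cmath.pi - cmath.phase(coeffs[k])) / k
    return cmath.rect(1.0, theta)


# f(z) = 2 + (1 + i) z + z^3: moving a little way along c lowers |f| below |f(0)| = 2.
coeffs = [2, 1 + 1j, 0, 1]
c = descent_direction(coeffs)
values = [abs(evaluate(coeffs, c * x)) for x in (0.1, 0.01, 0.001)]
```

For the simpler polynomial z^2 + 1 the same recipe yields c = i, along which f(cx) = 1 - x^2.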
Corollary 3: Let g(z) be any non-constant polynomial with complex coefficients and b be any complex number with $g(b) \ne 0$. Then there are points c arbitrarily close to b where $|g(c)| < |g(b)|$. In particular, b is not an absolute minimum of the function |g(z)|.
Proof: Let f(z) = g(b + z). Then $f(0) = g(b) \ne 0$ and f is a non-constant polynomial with complex coefficients. The result now follows from Lemma 3.
Proof of the Fundamental Theorem of Algebra: Suppose f(z) is a non-constant polynomial with complex coefficients and no complex roots. If d is the degree of f(z), then the triangle inequality shows that for z sufficiently large in absolute value, the highest degree term dominates and one has $|f(z)| > c|z|^d$ for some positive c and all z with |z| > M. In particular, one knows that there is a square centered at the origin which is guaranteed to contain in its interior all the absolute minima of the function |f(z)|. Since |f(z)| is continuous by Corollary 1, the Weierstrass Extreme Value Theorem applied to a slightly larger square shows that there is at least one such absolute minimum, say $z_0$. By Corollary 3, which rests on D'Alembert's Lemma, we must have $f(z_0) = 0$, which is a contradiction.
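The proof is an existence argument, but its picture (a lowest point of the length surface inside a large square) suggests a crude numerical experiment. The sketch below is a heuristic, not the method of the proof: it samples |p| on a grid over a square, moves to the best grid point, and shrinks the square around it. The starting half-width uses Cauchy's bound on the size of the roots, which is an ingredient added here and not taken from the text; `coeffs[j]` is the coefficient of z**j, and all names are illustrative.

```python
def evaluate(coeffs, z):
    """Horner evaluation; coeffs[j] is the coefficient of z**j."""
    value = 0
    for a in reversed(coeffs):
        value = value * z + a
    return value


def approximate_minimum(coeffs, rounds=12, n=20):
    """Heuristic grid search for a lowest point of |p| on a square about the origin."""
    lead = coeffs[-1]
    # Cauchy's bound: every root has absolute value at most 1 + max |a_j / a_lead|
    half = 1 + max(abs(a) / abs(lead) for a in coeffs[:-1])
    center = 0j
    for _ in range(rounds):
        step = 2 * half / n
        grid = [center + complex(-half + i * step, -half + j * step)
                for i in range(n + 1) for j in range(n + 1)]
        center = min(grid, key=lambda z: abs(evaluate(coeffs, z)))
        half = 2 * step  # shrink the square around the best grid point
    return center


z0 = approximate_minimum([5, -2, 1])  # p(z) = z^2 - 2z + 5, with roots 1 + 2i and 1 - 2i
```

A grid search like this can be misled when the surface has several nearly equal low points, so it should be read as an illustration of the lowest point idea and not as a reliable root finder.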
All contents © copyright 2001 K. K. Kubota. All rights reserved