Fruits of procrastination

Month: April, 2014

A generalization of “all members of a simple field extension of F are algebraic over F”

The following is a powerful theorem: Let F(\alpha) be a simple field extension of a field F, where \alpha is algebraic over F. Then every x\in F(\alpha) is also algebraic over F. Also, \deg(x,F)\leq \deg(\alpha,F).

The proof I think is most brilliant. I am not going to provide you with a proof here. I am just going to try and slightly generalize this theorem.

Let \beta\in F(\alpha). Consider the vector space over F generated by \{1,\beta,\beta^2,\dots\}. Also, let \{c_1,c_2,\dots,c_n\} be a set of vectors that spans this vector space. Then \deg(\beta,F)\leq n.

How is this a generalization? Firstly, the set of vectors \{c_1,c_2,\dots,c_n\} need not span the whole of F(\alpha). It only needs to span the subspace generated by \{1,\beta,\beta^2,\dots\}. Hence, we can find a better upper bound for the degree of the irreducible polynomial satisfied by \beta. Secondly, we need only consider a set of vectors that spans the space, and not necessarily a basis of the space. Although this is likely to give us a worse estimate for the upper bound on the degree of the irreducible polynomial, it is still a generalization (sometimes, it may be impractical to extract a basis from the set spanning the vector space).
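As a small worked illustration (the specific choice of F, \alpha and \beta is mine, not part of the statement above), take F=\Bbb{Q}, \alpha=\sqrt[4]{2} and \beta=\alpha^2=\sqrt{2}:

```latex
\begin{aligned}
&\deg(\alpha,\Bbb{Q}) = 4 \quad (\text{minimal polynomial } x^4-2),\\
&\beta^{2k} = 2^k,\quad \beta^{2k+1} = 2^k\sqrt{2}
 \;\Longrightarrow\;
 \operatorname{span}_{\Bbb{Q}}\{1,\beta,\beta^2,\dots\} = \operatorname{span}_{\Bbb{Q}}\{1,\sqrt{2}\},\\
&\deg(\beta,\Bbb{Q}) = 2 \quad (\text{minimal polynomial } x^2-2).
\end{aligned}
```

So the spanning set \{1,\sqrt{2}\} gives the bound 2, which is sharper than \deg(\alpha,\Bbb{Q})=4. A redundant spanning set such as \{1,\sqrt{2},1+\sqrt{2}\} still gives a valid, if weaker, bound of 3.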

Trigonometric substitution in integration

Why trigonometric substitution in integration: this is something that puzzled me and made me hate differentiation/integration during my IITJEE preparation. A lot of the techniques that we learned were based on algorithmic memorization rather than a feel for what really was happening. Thankfully, I have come to a college that requires 0% attendance, so that I can fulfill all my deepest desires for understanding.

Say we have \int_{0}^{1}{\frac{1}{\sqrt{1-x^2}}dx}. A very common method of integrating this function is to make the substitution x=\sin\theta, so that dx=\cos\theta d\theta and \sqrt{1-x^2}=|\cos\theta|. Hence, we get \int_0^{\frac{\pi}{2}}{\frac{\cos\theta }{|\cos\theta|}d\theta}.

Today we have to understand why that works.

When we say x=\sin\theta, what we’re saying is that for some variable \theta, every number in the interval [0,1] is the image of some value of \theta. In other words, there is a bijective mapping \sin:[0,\frac{\pi}{2}]\to[0,1]. Hence if we replace some value of x with the sine of the corresponding \theta, we should not find a difference.

However, there is the matter of dx. The integral \int{f(x)dx} is nothing but the limit of the sums \sum{f(x)\Delta x} as the intervals \Delta x on the x-axis are made smaller and smaller (of course this limit is valid only if the same limit is reached regardless of the x we take in each interval). Now as x=\sin\theta, we have \Delta x=|x_i-x_j|=\cos\theta\Delta\theta for some \theta\in[\arcsin{x_i},\arcsin{x_j}], by the Mean Value Theorem. Moreover, the limits of integration will also change: x=0 and x=1 correspond to \theta=0 and \theta=\frac{\pi}{2}.

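We ultimately get \int_0^{\frac{\pi}{2}}{\frac{\cos\theta }{|\cos\theta|}d\theta}, which is simply \int_0^{\frac{\pi}{2}}{d\theta}=\frac{\pi}{2}, as \cos\theta\geq 0 on [0,\frac{\pi}{2}].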

What we have in essence is a sequence of summations, approaching a limit (the integral). Say one of the terms of the sequence is \frac{1}{2}*\frac{1}{2}+\frac{1}{4}*\frac{1}{4}+\dots. We can write this very summation in terms of another function (substituting \sin\theta for x). That is all that we’re doing here.
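A quick numerical check of this picture, as a minimal sketch: the function names and the choice of midpoint sums are my own, and the code only approximates the two integrals rather than proving anything. Both sums should approach \frac{\pi}{2}.

```python
import math

def riemann_x(n):
    """Midpoint Riemann sum of 1/sqrt(1 - x^2) over [0, 1] with n equal cells."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx          # sample point strictly inside (0, 1)
        total += dx / math.sqrt(1.0 - x * x)
    return total

def riemann_theta(n):
    """Midpoint Riemann sum of cos(t)/|cos(t)| over [0, pi/2] with n equal cells."""
    dt = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt          # sample point strictly inside (0, pi/2)
        total += math.cos(t) / abs(math.cos(t)) * dt
    return total

x_sum = riemann_x(100_000)
theta_sum = riemann_theta(1_000)
print(x_sum, theta_sum, math.pi / 2)
```

The \theta-sum matches \frac{\pi}{2} up to rounding even with few cells, because the integrand is identically 1 there. The x-sum converges slowly, because \frac{1}{\sqrt{1-x^2}} blows up near x=1; that is one practical reason the substitution is worth making.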

I hope this article helps those who ask “why”, and suffer for it.

Racism and path homotopy

I recently had to face some racism for asking a seemingly obvious question on path homotopy. In retrospect, I could have thought through the concept myself. However, I don’t think I would have done so as quickly had I not been angry about the racism.

This is what I was trying to prove: let \alpha be a path from x_1 to x_2. Then \alpha*\alpha^{-1}*p*\alpha*\alpha^{-1} is homotopic to p. Here p is a loop on x_1.

Obviously one might expect \alpha and \alpha^{-1} to cancel out on both sides, giving only p. However, I was interested in deriving it from first principles, and not just learning a seemingly obvious tool for future calculations. Hence the pedant fought for almost two days to finally understand this (it turned out to be pretty elementary in the end).

You have a path homotopy between paths f,g in X when mapping [0,1]\times\{0\} to X gives you f, mapping [0,1]\times \{1\} gives you g, and mapping [0,1]\times \{t\} to X gives you the paths “in between”. Let us call this continuous function k:I^2\to X.

Now let us do something weird: let us construct two paths m,n with the same terminal points in I^2. As I^2 is convex, the two paths are path homotopic. Hence, there exists a continuous mapping a:I^2\to I^2 such that a([0,1]\times\{0\})=m and a([0,1]\times\{1\})=n.
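The convexity step can be made explicit. A sketch, assuming m and n are parametrized as maps I\to I^2 with m(0)=n(0) and m(1)=n(1) (this is the standard straight-line homotopy, written in my own notation):

```latex
\begin{aligned}
a(s,t) &= (1-t)\,m(s) + t\,n(s), \qquad (s,t)\in I^2,\\
a(s,0) &= m(s), \qquad a(s,1) = n(s),\\
a(0,t) &= m(0) = n(0), \qquad a(1,t) = m(1) = n(1).
\end{aligned}
```

Convexity of I^2 guarantees that every point a(s,t) lies in I^2, and a is continuous because m and n are.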

Looking at the whole picture, this is what we have: a mapping a:I^2\to I^2 which is continuous, and another mapping k:a(I^2)\to X, which is continuous on a(I^2) (as k is continuous on I^2 and a(I^2) is but a subspace of I^2). Hence, (k\circ a):I^2\to X is continuous, which shows k(m) is homotopic to k(n).

Now we come to the most important part. Let m be the path in I^2 such that k(m)=p. Also, let n be the path in I^2 such that k(n)=\alpha*\alpha^{-1}*p*\alpha*\alpha^{-1}. They have the same terminal points. Why their terminal points can be made to be the same has terrific ideas behind it, and would require another post (it is not explained in Munkres, so watch out for that post). Anyway, as (k\circ a):I^2\to X is continuous, it is easily seen that p and \alpha*\alpha^{-1}*p*\alpha*\alpha^{-1} are homotopic.

Why fundamental groups are defined only for loops- better explained than in Munkres’ Topology.

I want to point out why fundamental groups are defined for loops, and not path homotopy classes. This is something that Munkres’ Topology does not do a good job of explaining.

Munkres says that we cannot define groups on path homotopy classes because for some pair [f] and [g], [f*g] may not be defined. This is because it is possible that f(1)\neq g(0).

This seems like a bit of an arbitrary requirement to me. Although one can clearly see the benefits of defining [f]*[g] only when f(1)=g(0), this can be worked around. For starters, if the path homotopy class between f(1) and g(0) is [h], one could define [f]*[g] as [f*h*g]. Obviously this would require the axiom of choice if there are multiple path homotopy classes between f(1) and g(0), but I still think this can be worked around.

The reason why fundamental groups are defined for loops is that there is a unique identity and inverse for every element (path homotopy class). For example, if f is a path from x_1 to x_2, then [e_{x_1}]*[f]=[f]*[e_{x_2}]=[f]. Clearly [e_{x_1}]\neq[e_{x_2}]. However for a loop, the identity is unique- namely [e_{x_0}], where x_0 is the point where the loop starts and ends.

The same argument works for inverses.

Reaching infinity- Lobbying for Axiom A

This is a post on one aspect of the infinitely muddled and confusing (confused?) concept of cardinality.

Take the product and box topologies of \Bbb{R}^\omega. The product topology is first countable, whilst the box topology is not. A good discussion is given in Munkres. I will discuss this with an analogy.

Let us take a man who lives forever. Also, let us take an arbitrary day in his life, and call it Day 1. Every day after Day 1, he throws one ball into an infinitely large basket. Will the basket ever have \Bbb{N} balls? Remember that the man will live forever, and will keep on throwing balls into the basket long long long after we’re dead.

Intuition says “yes”. For any given number n\in\Bbb{N}, we can show that the basket will eventually contain more than n balls. However, showing that the basket will eventually contain more balls than any finite number does not show that it will eventually contain an infinite number of balls. This is very very important. One might think \text{not finite}=\text{infinite}. Surely this is common sense. However, the point to note is that even if the number of balls will be greater than n, it will still be finite. Say we take n=1 \text{ billion} and check the basket after a trillion days: although the number of balls will be greater than n, it is still finite (a trillion).

It is a valid mathematical argument to show that something is infinite by proving that it is greater than any arbitrarily chosen finite number. However, here we are choosing n first, and then determining the set which contains more than n elements. This is an invalid procedure. Looking at it from another perspective, whatever set we choose, we can show it to contain only a finite number of elements. Hence, the basket will never contain \Bbb{N} balls.
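The distinction is one of quantifier order. Here b(t) is my own notation for the number of balls in the basket on day t:

```latex
\begin{aligned}
&\forall n\in\Bbb{N}\ \ \exists t\in\Bbb{N}:\ b(t) > n
 &&\text{(true: wait long enough)}\\
&\exists t\in\Bbb{N}\ \ \forall n\in\Bbb{N}:\ b(t) > n
 &&\text{(false: } b(t) \text{ is itself a natural number)}
\end{aligned}
```

The first statement is what intuition verifies; the second is what “the basket contains \Bbb{N} balls on some day” would require.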

My contention with this is: what if we can go to t=\Bbb{N}? We will surely find \Bbb{N} balls in the basket then!

We can’t. Not because we can’t do this in the practical world. But because we just can’t do this according to the current axioms in Mathematics. If there were such an axiom (let us call it Axiom A) which would allow us to do so, then we would surely find \Bbb{N} balls in the basket. But then we’d have to say “we can find \Bbb{N} balls assuming Axiom A.”

PS: I really hope Axiom A catches on. The results obtained from Axiom A are so much more intuitive!!


Zorn’s lemma

This is another rant on Zorn’s lemma. And hopefully the final one. It is something that has puzzled me for more than a year now!

In a partially ordered set P, let every chain have an upper bound. Why is it not obvious that P has a maximal element?

The whole concept depends on the following construction: can we construct an ascending chain such that for every element a in the chain, if there is a greater element(s) in P, then we add one of those greater elements to the chain after a? For example, take P to be \Bbb{Z}, and construct the chain 1<2<3. We know that 4,5,6,7,\dots are all greater than 3. Hence, by choosing any of those greater elements and adding it to the chain, we can form a bigger chain.

If we continue adding such greater elements to the chain until we cannot add any more, then the greatest element m of the chain is clearly a maximal element of the whole of P, and not just of the chain. For if m were not maximal in P, then we could add another element from P after m in the chain, contradicting the fact that we cannot add any other element to the chain.
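For a finite poset the construction above can be carried out directly. The following is only a toy sketch under my own assumptions: P is \{1,\dots,12\} ordered by divisibility, and the names `greedy_chain` and `strictly_divides` are hypothetical helpers. Note that the explicit rule `choose` does the choosing, which is exactly why no choice principle is needed in the finite case.

```python
def greater_elements(poset, a, less_than):
    """All elements of the poset strictly greater than a."""
    return [b for b in poset if less_than(a, b)]

def greedy_chain(poset, start, less_than, choose=min):
    """Extend a chain from `start` by repeatedly picking one greater element,
    stopping when the last element has nothing above it."""
    chain = [start]
    while True:
        candidates = greater_elements(poset, chain[-1], less_than)
        if not candidates:
            return chain            # last element is maximal in the poset
        chain.append(choose(candidates))

def strictly_divides(a, b):
    """Strict divisibility order: a < b iff a divides b and a != b."""
    return a != b and b % a == 0

P = range(1, 13)
chain_min = greedy_chain(P, 1, strictly_divides, choose=min)
chain_max = greedy_chain(P, 1, strictly_divides, choose=max)
print(chain_min, chain_max)
```

Different choice rules give different chains, but each one ends in an element with nothing above it in P. The difficulty in the infinite case is that the loop never terminates and an explicit rule like `min` may not be available.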

Rephrasing the earlier question: if in a partially ordered set P every chain has an upper bound, can we prove the existence of a chain in which for every element in the chain, if a greater element exists, one of those elements has been added to the chain? Other similar questions would be: can we prove the existence of a chain which is isomorphic to \Bbb{Z}, or can we prove the existence of a chain which contains all numbers between 5 and 73? The point to be taken away from this is that proving the existence of such chains is not trivial or “obvious”.

Where does Zorn’s lemma come in? Zorn’s lemma (which is equivalent to the Axiom of Choice) allows us to make an infinite number of arbitrary decisions. For example, if we have infinite pairs of shoes, Zorn’s lemma would allow us to arbitrarily choose one shoe from each pair in that infinite set, without detailing which shoe we picked from each pair, or how we went about choosing the shoe. Making this more clear, picking the left (or right) shoe from each pair does not require Zorn’s lemma, as the choice made for each pair is explicitly clear. However, selecting any *random* shoe from each of the infinite pairs requires the use of Zorn’s lemma, as we’re making an infinite number of decisions without really giving details.

How is this relevant to proving the existence of the aforementioned chain? As we can make an infinite (possibly uncountable) number of arbitrary choices without detailing how, we can, for each element in the chain with a greater element in P, add one of those elements (this is the decision making part) to the chain, and continue doing so infinitely.

Note that for each element a in the chain, if the set of greater elements in P had a lowest greater element or something similar, we wouldn’t need Zorn’s lemma. We could just state “we choose the lowest greater element”. However, such a distinct element in the set of elements greater than a need not exist in every partially ordered set P. Hence we need Zorn’s lemma.

Note that Zorn’s lemma does not imply that this chain has an upper bound in P! However, if it does, as assumed in the statement of the lemma, then that will undoubtedly be a maximal element of P.


Why isomorphisms

I often wondered why isomorphisms were important. And if you haven’t done the same, maybe you have studied Abstract Algebra rather passively. We shall explore this question today.

We know (or assume for the moment) that \Bbb{Z}_6 is isomorphic to \Bbb{Z}_2\times\Bbb{Z}_3. The mapping is n\mapsto(n\bmod 2,n\bmod 3). You can verify all this for yourself.

Now let us suppose we don’t know what 4+5 is, where 4,5\in\Bbb{Z}_6. We can determine 4+5 by studying the manipulations of (4\bmod 2,4\bmod 3)=(0,1) and (5\bmod 2,5\bmod 3)=(1,2).

We have (0,1)+(1,2)=(1,0). We also know that the element which had mapped to (1,0) had been 3. Hence, 4+5=3.
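The same computation can be sketched in a few lines of Python. The names `phi`, `inverse`, `add_pairs` and `add_via_product` are my own; the code just transports addition through the mapping n\mapsto(n\bmod 2,n\bmod 3) and back.

```python
def phi(n):
    """The map from Z_6 to Z_2 x Z_3."""
    return (n % 2, n % 3)

# Lookup table for the inverse map, built by listing all six images.
inverse = {phi(n): n for n in range(6)}

def add_pairs(p, q):
    """Componentwise addition in Z_2 x Z_3."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 3)

def add_via_product(a, b):
    """Add a and b in Z_6 by working entirely in Z_2 x Z_3."""
    return inverse[add_pairs(phi(a), phi(b))]

print(phi(4), phi(5), add_pairs(phi(4), phi(5)), add_via_product(4, 5))
```

That `inverse` ends up with six distinct keys is the bijectivity of the mapping, and `add_via_product` agreeing with ordinary addition mod 6 for every pair is the homomorphism property.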

Isomorphisms help us study the properties of one algebraic structure, provided we know the properties of the algebraic structure isomorphic to it. It is a tremendously useful concept in the fuzzy world of mathematics, where studying even a fraction of the concepts we ourselves have invented often proves to be an arduous task.

The minimal polynomial of a transformation

For a linear transformation T, the minimal polynomial m(x) is a very interesting polynomial indeed. I will discuss its most interesting property below:

Let m(x)=p_1(x)^{e_1}p_2(x)^{e_2}\cdots p_s(x)^{e_s} be the minimal polynomial of transformation T, where the p_i are distinct irreducible polynomials and s\geq 2. Then m(T)=0, and yet p_i(T)^{y}\neq 0 for 1\leq y\leq e_i and y\in\Bbb{N}, because p_i(x)^y is a polynomial of lower degree than m(x). You may be shocked (I hope Mathematics has that kind of effect on you :P). The reason this is possible is that the ring of n\times n matrices can have zero-divisors. For example, \begin{pmatrix} 1&1\\2&2\end{pmatrix}\begin{pmatrix}1&1\\-1&-1\end{pmatrix}=\begin{pmatrix}0&0\\0&0\end{pmatrix}.
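Both the matrix product above and the property itself can be checked by hand or with a short script. This is a minimal sketch in plain Python; the example T=\begin{pmatrix}1&0\\0&2\end{pmatrix}, whose minimal polynomial is (x-1)(x-2), is my own choice, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def minus_scalar(T, c):
    """Return T - c*I for a square matrix T."""
    return [[T[i][j] - (c if i == j else 0) for j in range(len(T))]
            for i in range(len(T))]

def is_zero(M):
    """True when every entry of M is 0."""
    return all(entry == 0 for row in M for entry in row)

# The zero-divisor example from the text.
A = [[1, 1], [2, 2]]
B = [[1, 1], [-1, -1]]
zero_divisor_product = matmul(A, B)

# T has minimal polynomial (x - 1)(x - 2): each factor alone is nonzero at T.
T = [[1, 0], [0, 2]]
p1_T = minus_scalar(T, 1)   # T - I
p2_T = minus_scalar(T, 2)   # T - 2I
m_T = matmul(p1_T, p2_T)    # (T - I)(T - 2I)

print(zero_divisor_product, p1_T, p2_T, m_T)
```

Neither T-I nor T-2I is the zero matrix, yet their product is: the two factors are zero-divisors in the ring of 2\times 2 matrices.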