Fruits of procrastination

Month: May, 2014

That unnameable confusing thing about Field extensions

A first incursion into the concept of field extensions proves to be rather confusing. And I’ll tell you why.

Things which seem difficult, and sometimes impossible, to prove are proved very easily using concepts from vector spaces. For example: suppose a_1,a_2,\dots,a_n are algebraic over \Bbb{Q}, so that \Bbb{Q}(a_1,a_2,\dots,a_n) is a finite extension of \Bbb{Q}. Prove that a_1^2+a_3^2a_5^7 is also algebraic over \Bbb{Q}.

What would generally be expected is that we look up the irreducible polynomials whose roots are a_1, a_3 and a_5, and from them somehow construct a polynomial whose root is a_1^2+a_3^2a_5^7. This could prove tricky and time-consuming. What would we do if the expression were longer and more complicated?

In comes the concept of vector spaces. Observing that a field extension is a vector space over the base field, and that the dimension of a finite-dimensional vector space is an invariant, we prove that any polynomial combination of the elements a_1,a_2,\dots,a_n is algebraic over \Bbb{Q}.
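As a small numerical illustration (my own example, not the vector-space argument itself): \sqrt{2}+\sqrt{3} is a polynomial combination of the algebraic numbers \sqrt{2} and \sqrt{3}, and squaring t=\sqrt{2}+\sqrt{3} twice by hand gives t^4-10t^2+1=0, so it is indeed algebraic. A quick check:

```python
# Sanity check (not the vector-space proof): sqrt(2) + sqrt(3) is algebraic.
# Squaring t = sqrt(2) + sqrt(3) by hand gives t^2 = 5 + 2*sqrt(6), so
# (t^2 - 5)^2 = 24, i.e. t^4 - 10 t^2 + 1 = 0.
t = 2 ** 0.5 + 3 ** 0.5

def p(x):
    """The candidate polynomial x^4 - 10 x^2 + 1 with rational coefficients."""
    return x ** 4 - 10 * x ** 2 + 1

residual = abs(p(t))
print(residual)  # numerically ~0, so t is (up to floating-point error) a root
```

Finding such a polynomial by hand already takes work for two square roots; the vector-space argument guarantees its existence for any polynomial combination without ever writing it down.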

This seems a little unfair. We’re almost cheating here. Shouldn’t we be proving things through a clever manipulation of polynomials? Yes, we are cheating. Because almost every argument boils down to this: the number of basis elements of a finite-dimensional vector space is constant. This is clearly not a theorem about polynomials per se. We’re bringing in foreign concepts, and solving our problems with little insight into what is really happening.

This IS unfair. With little insight into what is really happening, we somehow arrive at a contradiction, and prove terribly powerful things about polynomials. Until we find a more insightful method, we have to make do, I suppose.

Generalizing the product of sets to a universal property

The concept of product and co-product can be generalized to a universal property. However, “how” to do so is explained in a rather unclear manner in most books and internet resources.

I have somehow patched together a coherent explanation. I hope this helps anyone starting out in Category Theory.

Let C be a category, and let A and B be two objects in it. How should we define A\times B? We define a new category. Here, we select only those objects which have morphisms to both A and B. More technically speaking, we create the category C_{A,B}. I repeat that not all objects from C are selected. Only those which have morphisms to both A and B are selected in this new category.

Now what are the objects in this new category C_{A,B}? These are the newly selected objects along with their morphisms to A and B. A morphism between two such objects is a morphism in C between them that commutes with their maps to A and B. I wouldn’t like to go deeper into describing this category, as the details are pretty well-known.

If this category has a final object, then that final object is A\times B. Hence, A\times B is universal in the category C_{A,B}.

Now an example: let us take the category (Z,\leq). How is 3\times 4 defined? The integers are the objects of this category, and a morphism between objects a and b exists if and only if a\leq b; this morphism is the pair (a,b). In the category C_{3,4}, the final object is 3, as 3=\min\{3,4\}. Why \min\{a,b\}=a\times b in the category (Z,\leq) is a good exercise to work out. Just follow the procedure outlined above.
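The procedure can be brute-forced in a small sketch. Truncating the poset to a finite window of integers is an artificial assumption of mine, needed only to make the search finite:

```python
# Brute-force the universal property in the poset category (Z, <=), restricted
# to a finite window. An object c is in C_{a,b} if morphisms c -> a and c -> b
# exist (i.e. c <= a and c <= b); c is the product if it is final in C_{a,b},
# i.e. every other object of C_{a,b} has a morphism into it (x <= c).
def poset_product(a, b, window=range(-50, 51)):
    candidates = [c for c in window if c <= a and c <= b]
    finals = [c for c in candidates if all(x <= c for x in candidates)]
    return finals[0] if finals else None

print(poset_product(3, 4))  # 3, i.e. min(3, 4)
```

The search recovers exactly \min\{a,b\}: the greatest lower bound, which is what a categorical product in a poset always is.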

Note that although I have only generalized the product of two sets, the product of any (finite) number of sets can be generalized to a universal property by taking the category C_{A_1,A_2,\dots,A_n}.

I must admit that my writing is getting more and more incoherent and muddled, and perhaps not as helpful as initially planned. I hope to rectify this from the next post on. I also need to learn how to draw commutative diagrams.

Finally, a valid generalisation of the Third Isomorphism Theorem

Let G be a commutative group. Let M and N be subgroups of G. If M\leq N, then \frac{(G/M)}{(N/M)}\cong \frac{G}{N}.

However, what if M\not\leq N? We will consider the general scenario, where M and N are arbitrary subgroups of G. Since N need not contain M, the symbol N/M should be read as the image of N in G/M, which is (N+M)/M.

Then \frac{(G/M)}{(N+M)/M}\cong \frac{G}{N+M}

In the case that M\leq N, we have M+N=N, and we recover the usual statement.
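A finite sanity check of the isomorphism, comparing orders only, in the toy example G=\Bbb{Z}/12, M=\langle 4\rangle, N=\langle 6\rangle (my own choice; here neither subgroup contains the other, and N/M is read as (N+M)/M):

```python
# Compare |(G/M) / ((N+M)/M)| with |G/(N+M)| in G = Z/12, M = {0,4,8}, N = {0,6}.
G = set(range(12))
M = {0, 4, 8}
N = {0, 6}

def add(a, b):
    return (a + b) % 12

NM = {add(n, m) for n in N for m in M}                     # N + M = {0,2,4,6,8,10}
cosets_G_mod_M = {frozenset(add(g, m) for m in M) for g in G}
cosets_G_mod_NM = {frozenset(add(g, s) for s in NM) for g in G}
image_of_N = {frozenset(add(n, m) for m in M) for n in NM}  # (N+M)/M inside G/M

# |(G/M) / ((N+M)/M)| = |G/M| / |(N+M)/M| should equal |G/(N+M)|
lhs = len(cosets_G_mod_M) // len(image_of_N)
rhs = len(cosets_G_mod_NM)
print(lhs, rhs)  # both 2
```

This only checks orders, not the isomorphism itself, but for abelian groups of this size it is a quick way to catch a wrong statement.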

I have arrived upon this result myself. I don’t know if it is already known.

A generalization of the quotient group: an exhaustive analysis of all possible cases

We know that if G is a group and N is a normal subgroup of G, then G/N is a quotient group.

Today, we shall try and explore some fundamental questions that have plagued my understanding of Algebra for a long long time.

What if N is just a subset of G, and not necessarily a subgroup? Does G/N mean anything then? Take G=\Bbb{Z} and N=\{1,2,3\}. Then \{g+N \mid g\in G\} consists of the cosets \overline{a}=a+N for a\in G. For two different a,b\in G, we have \overline{a}\neq\overline{b}. Hence, there exists a bijection f:G\to G/N.

Can G/N display properties of a group? Sure. Define a new operation on G/N: \overline{a}+\overline{b}=\overline{a+b}. This is well-defined precisely because of the bijection above: each coset carries a unique label a. We have a two-sided identity, two-sided inverses, associativity, and algebraic closure. We have a group!!
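A quick sketch of this example, with the window of integers truncated so the check is finite:

```python
# G = Z and the non-subgroup subset N = {1, 2, 3}: the cosets g + N are
# pairwise distinct, so g -> g + N is a bijection, and the transported
# operation (a+N) "+" (b+N) := (a+b) + N makes G/N a group.
N = (1, 2, 3)

def coset(g):
    return frozenset(g + n for n in N)

window = range(-20, 21)
cosets = [coset(g) for g in window]
assert len(set(cosets)) == len(list(window))   # all distinct: a bijection

def coset_add(g, h):
    """The operation defined via representatives: (g+N) + (h+N) = (g+h) + N."""
    return coset(g + h)

print(coset_add(2, 3) == coset(5))  # True: closure under the defined operation
```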

What happens when N becomes a subgroup? For any n\in N, we see that \overline{a+n}=\overline{a}. Imagine that for each equivalence class we fix a representative; i.e. for the equivalence class \{a+n_1,a+n_2,\dots\}, we choose the element a+n_1 for all arithmetic operations, without needing to call on any other element. Let \overline{a+n_1} and \overline{b+n_2} be the representative elements of two equivalence classes in G/N. What about \overline{a+n_1}+\overline{b+n_2}? Can we be sure that we chose \overline{a+n_1+b+n_2} as the representative of some equivalence class? Probably not.

Now let N be a normal subgroup. Normality lets us slide n_1 past b: n_1+b=b+n_3 for some n_3\in N, so \overline{a+n_1+b+n_2}=\overline{a+b+n_3+n_2}=\overline{a+b+n_4}, where n_4=n_3+n_2\in N. However, we can’t be sure of having chosen this element as the representative of some equivalence class either.

Instead of choosing representative elements, let us try a different approach. Let us consider whole cosets a+N at once for a\in G. Choosing representative elements becomes irrelevant here. For group operations, we’d have to calculate (a+N)+(b+N); i.e. add each element of a+N to each and every element of b+N.

If N is normal, then we have (a+N)+(b+N)=(a+b+N), which surely is an element of G/N. Checking for other group properties, we soon see that G/N is a group.

What if N is not normal? Algebraic closure may be violated: the element-wise sum of two cosets need not be a coset at all.
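A concrete instance of this failure can be found by brute force. In the sketch below the choice of S_3 and the non-normal subgroup N=\{e,(0\,1)\} is mine; multiplying cosets element by element produces sets that are not cosets:

```python
# In S3 (permutations of {0,1,2} under composition), take the non-normal
# subgroup N = {e, swap(0,1)}. Multiplying two left cosets element by element
# can yield a 4-element set, which cannot be a coset (all cosets have size 2).
from itertools import permutations

S3 = list(permutations(range(3)))

def compose(p, q):
    """(p o q)(i) = p(q(i)), with permutations stored as tuples of images."""
    return tuple(p[q[i]] for i in range(3))

N = [(0, 1, 2), (1, 0, 2)]                     # identity and the swap 0 <-> 1
cosets = {frozenset(compose(g, n) for n in N) for g in S3}

failures = []
for a in S3:
    for b in S3:
        aN = [compose(a, n) for n in N]
        bN = [compose(b, n) for n in N]
        product_set = frozenset(compose(p, q) for p in aN for q in bN)
        if product_set not in cosets:
            failures.append((a, b, product_set))

print(len(failures))  # > 0: the element-wise coset product is not always a coset
```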

If N is a normal subgroup, then (a+N)+(b+N)=(a+b+N). Note that this isn’t a necessary condition for G/N to be a group. As long as (a+N)+(b+N)=(c+N) for some c\in G, we would still have algebraic closure. The stronger identity is just an additional luxury that provides notational convenience.

Conclusion: It is possible for G/N to be a group if N is not a subgroup at all. The group operation would be (a+N)+(b+N)=(a+b+N). Remember that this would be a result of using the operation *_G on the elements of (a+N) and (b+N). The operation on G/N would just have to be defined this way. Choosing representatives doesn’t even come up here as there is a different a+N for every a\in G.

When N is a subgroup, one coset corresponds to multiple elements of G, so the question of choosing representatives arises. But this proves problematic whilst trying to satisfy algebraic closure. Hence, we deal with adding whole cosets to each other, making the choice of representatives irrelevant. For the purpose of satisfying algebraic closure, we find that if N is a normal subgroup, then G/N is a group in every instance.

Aristotle and I

I so remember trying to re-interpret nature based on my own empirical observations and intuition, philosophizing and hatching explanations. I can’t even remember when I started this. Maybe in class 2? When I used to take a sheet of paper and pencil and write and write and write about how clouds form, how insects move, why cockroaches have antenna-like structures, etc. And all this had nothing to do with the widely accepted scientific explanations! These explanations were all my own. 

Aristotle and I be soul brothers, bitch.

The only reason why I do not do Physics instead of Mathematics is that one can never know for sure whether one is right or not in the Sciences. In Mathematics, you have consistency and rigor, however artificial and arbitrary the axioms. You can be sure of being right. 

My original proof of “continuous mappings on compact metric spaces are uniformly continuous”

The proof of “continuous mappings on compact metric spaces are uniformly continuous” is rather convoluted and opaque, as given in most real analysis textbooks (Rudin included). There is a lot of unnecessary bookkeeping, and the treatment is not really motivated. Given below is my own proof of the theorem, which is inspired by the existing proof to an extent I am unaware of. I hope it helps the readers. The motivation for each step is explicitly stated.

A mapping is uniformly continuous if for every \epsilon>0, there exists \delta>0 such that f(B(x,\delta))\subset B(f(x),\epsilon) for all x\in X (the same \delta works for every x).

Let f:X\to Y be the continuous mapping under consideration. By continuity, for every x\in X there is a \lambda_x>0 such that f(B(x,\lambda_x))\subset B(f(x),\epsilon/2).

\bigcup_{x\in X} B(x,\lambda_x) forms a cover of X. Call this cover U.

We will do two things with this cover U.

Firstly, as X is compact, let us choose a finite subcover \bigcup_{i=1}^n B(x_i,\lambda_{x_i}). We shall call this cover U'.
Secondly, we form another cover \bigcup_{x\in X} B(x,\lambda_x/2). Call it W. As X is compact, W too will have a finite subcover. We shall call it W'.

Both U' and W' consist of balls centred at certain points of X. These sets of centres may differ. Let the set of centres of W' be S and the corresponding set for U' be R. Add the points of S\setminus R to R, forming R'=R\cup S, so that S\subset R'. Around each point x\in R'\setminus R, draw the \lambda_x-ball and add it to U'. Note that the result is still a finite cover of X; call this new finite cover U''.

Final bit of notation: let \delta=\min\limits_{x_i\in R'}\{\lambda_{x_i}/2\}.

Take any two points a and b such that d(a,b)\leq\delta. Clearly a is contained in some ball of W'; let that ball have centre p, i.e. a\in B(p,\lambda_p/2). Note that p\in S\subset R', and hence has a \lambda_p-ball around it. Since d(p,a)\leq\lambda_p/2<\lambda_p, we have d(f(p),f(a))\leq \epsilon/2.

In addition to this, we have d(p,b)\leq d(p,a)+d(a,b)\leq \lambda_p/2+\delta\leq \lambda_p/2+\lambda_p/2=\lambda_p, since \delta\leq\lambda_p/2 by its definition. Hence, d(f(p),f(b))\leq\epsilon/2.

Now d(f(a),f(b))\leq d(f(a),f(p))+d(f(p),f(b))\leq\epsilon/2+\epsilon/2=\epsilon.

Hence proved.

An explanation for some steps:

The essential motive in this proof is to determine a distance \delta such that for any two points a and b within \delta of each other, their images are within \epsilon of each other. For this we need both points to lie in a single ball whose image fits inside an \epsilon/2-ball.
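As a numerical illustration (not part of the proof; the choice f(x)=x^2 is my own example), compactness of [0,1] lets a single \delta serve every point, while on all of \Bbb{R} the same \delta fails far from the origin:

```python
# f(x) = x^2 on the compact set [0, 1] satisfies
# |x^2 - y^2| = |x + y| * |x - y| <= 2 |x - y|,
# so delta = eps/2 works for every x at once: uniform continuity.
# On all of R no single delta works, since |(M + d)^2 - M^2| grows with M.
def f(x):
    return x * x

eps = 0.1
delta = eps / 2

# check the single delta on a grid of pairs in [0, 1]
pairs = [(i / 200, j / 200) for i in range(201) for j in range(201)]
ok = all(abs(f(a) - f(b)) < eps for a, b in pairs if abs(a - b) < delta)
print(ok)  # True

# the same delta fails far out on the real line
M = 1000.0
print(abs(f(M + delta / 2) - f(M)) < eps)  # False
```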

I will continue this explanation when I get the time.

Explaining the beginner’s problems with Topology

When one suddenly starts studying compactness and connectedness and other topological concepts in college, one is likely to get confused.

Where did all these concepts come from? Then, seemingly intuitive properties of \Bbb{R}^n start being proven using these alien notions. Forming a big picture that includes these concepts seems difficult. One still thinks about the real number line in intuitive terms; and although one eventually begins to understand the new notions, accepting them as tools for studying \Bbb{R}^n is harder.

This is an attempt to mend that divide.

Connectedness, compactness, boundedness, and other seemingly arbitrary topological notions are properties of \Bbb{R}^n (along with other topological spaces) that we do not come across before they’re thrust upon us. Hence the difficulty in accepting them in situations that are entirely familiar and plain to us. If the situations too were alien, accepting such tools for analyzing them would perhaps be easier.

These properties are important, as many of them are preserved by continuous mappings. Hence, whether one space can be continuously mapped to another can often be determined using these tools. I repeat, nothing about these tools is difficult except the fact that they’re new to us, and the fact that they’re new to us is rarely stated explicitly.

And there is nothing holy about these tools. I hypothesize that many more such properties which are conserved in continuous mappings are still to be discovered. And when they are discovered, we will be able to answer questions that stand unanswered today.

Topology has much in common with Physics. We’re looking around, searching for the laws of nature, but somehow making do with the laws we currently know.

Why ax+by+cz=d is a plane

Why is ax+by+cz=d a plane in three dimensions? Because it just is? Not good enough.

There are proofs for this. They involve normals and dot products and other fancy things that you’re not entirely sure are legit (if you’re in high school). I’m going to try a very different approach to convince you that this is a plane, today.

Take ax+by+cz=d and set z=0. You have ax+by=d. This is a line, as is well known (constant slope). Now increase z by a small amount, say \Delta z. You have ax+by=d-c\Delta z, which is again a line. However, what kind of a line is it? It is certainly parallel to the earlier line ax+by=d. It just has a different y-intercept.

You can continue trying this for different values of z. You get an infinite number of parallel lines, all shifted along the y-axis.

Now start raising each straight line ax+by=d-cz by z units along the z-axis. Clearly each point on the raised line satisfies the equation ax+by+cz=d. This bunch of raised and shifted parallel lines forms a plane.
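The construction can be sanity-checked numerically. The coefficients below are arbitrary sample values of mine (with b\neq 0 so each slice can be solved for y):

```python
# For each height z, take points on the line a*x + b*y = d - c*z in that
# horizontal slice; every such point satisfies the original plane equation
# a*x + b*y + c*z = d.
a, b, c, d = 2.0, 3.0, -1.0, 6.0

def point_on_slice(t, z):
    """Parametrize the slice at height z by x = t, solving the line for y."""
    y = (d - c * z - a * t) / b
    return (t, y, z)

samples = [point_on_slice(t, z) for t in (-2.0, 0.0, 5.0) for z in (-1.0, 0.0, 3.0)]
flat = all(abs(a * x + b * y + c * z - d) < 1e-9 for (x, y, z) in samples)
print(flat)  # True: the stacked, shifted lines all lie on the plane
```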

This might not be a rigorous proof, but it is probably more visually satisfying than the proof with normals and dot products and what not.

Vector addition: my mortal enemy

I have always, A.L.W.A.Y.S. found vectors confusing. WHY do they add up in that funny manner? Why do we learn about them at all? They just seem to be a piece of complicated machinery that makes life difficult for high school and college students. It was only on learning abstract mathematics that I somehow became more comfortable with the need for their existence. I hope high schoolers do not need to go down this long, arduous path before finally starting to make sense of things.

Please note that I did not really face a problem with understanding vector properties, or even dealing with vectors in unknown situations. My conundrum was more existential: why did they need to exist at all? Why do they add in that funny way, when numbers don’t? You get the drift. I suppose most people find themselves asking similar questions when they come across vectors.

The key concept that confuses most people here is that of “addition”. We know 2+2=4. 3+4=7….and so on. The symbol + has come to be associated with a very specific situation, in which two real numbers add up to give a third real number (we haven’t crossed over to complex numbers yet in our high school minds).

But then we start finding pointed arrows (vectors) being added to each other. Where did that come from? Clearly vectors are not real numbers! Even though I have finally learned how to add them (make a parallelogram and other arbitrary figures), HOW is this addition?

“Addition” here is just the name of an operator. For real numbers it means one thing, and for vectors something else entirely. This is the result of some very unfortunate nomenclature, fuelling confusion from one generation to the next. Hence, when you come across the phrase “vector addition”, do NOT think of addition. Think of the operation your textbook talks about, repeat to yourself “this is not really addition”, and apply the algorithm. You’re set.

The following section is to be read only when you’re comfortable with the process outlined above:

In reality, the nomenclature is not unfortunate. Addition in the real number sense is only a special case of vector addition (which by way of this expression is clearly more general). However, these things become apparent only in retrospect.
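Concretely, the operation the textbook's parallelogram rule computes is componentwise addition, and ordinary addition of reals is the one-dimensional special case. A minimal sketch:

```python
# "Vector addition" concretely: add componentwise. The parallelogram picture
# and the componentwise rule agree, and ordinary addition of real numbers is
# just the one-dimensional special case.
def vadd(v, w):
    return tuple(x + y for x, y in zip(v, w))

v, w = (1.0, 2.0), (3.0, -1.0)
print(vadd(v, w))            # (4.0, 1.0): the diagonal of the parallelogram on v, w

# one-dimensional vectors behave exactly like real numbers
print(vadd((2.0,), (3.0,)))  # (5.0,), i.e. 2 + 3 = 5
```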

Choosing a new coordinate system

I often used to wonder why one coordinate system is more appropriate than another in particular situations.

For example, when dealing with circular motion, we are often advised to use polar coordinates. I used to wonder why we can’t use only x-y coordinates. My teacher used to say using polar coordinates makes things easier. But I never quite bought it. Although after working on such situations with x-y coordinates I did realise that formulae were much simpler with polar coordinates, I didn’t think this justified completely changing the coordinate system.

This I feel is something many people can relate to.

The reason why changing the coordinate system helps is that if one can find axes that perfectly align with the motion of the object, then equations become much simpler! Let me generalize this, for a better insight.

Let us suppose an object can move only in a wave-like fashion, and only in one direction. Normally, one might take the x-y axes and represent the location of the object through a complicated formula. However, what if we design an axis similar in shape to the wave?! We can then represent the location through just one coordinate!

This is true for any kind of motion. However, at this point, one may ask: if *any* motion can be so simply represented, why don’t we always take such an axis? This is because choosing a new axis, and thereby simplifying the representation, does not solve all problems. You also have to eventually convert back to the coordinate system you’re working with. Hence, if the cost of converting back to the *home* coordinate system is outweighed by the benefits of choosing a new axis, go for it! Else, don’t.
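A small sketch of the point for circular motion: in Cartesian coordinates both x(t) and y(t) vary with time, while in polar coordinates aligned with the motion the radius is constant and a single angle does all the work. (The radius and sample times below are arbitrary choices of mine.)

```python
# Circular motion in two coordinate systems: Cartesian needs two changing
# coordinates; polar coordinates aligned with the motion need only the angle,
# since the radius never changes.
import math

r = 2.0
times = [0.0, 0.5, 1.0, 1.5]

cartesian = [(r * math.cos(t), r * math.sin(t)) for t in times]
polar = [(math.hypot(x, y), math.atan2(y, x)) for (x, y) in cartesian]

radii = [rho for (rho, theta) in polar]
print(all(abs(rho - r) < 1e-12 for rho in radii))  # True: r is constant throughout
```

Converting back (x = r cos θ, y = r sin θ) is exactly the "cost of returning to the home coordinate system" mentioned above.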