Of ellipses, hyperbolae and mugging

For as long as I can remember, I have had unnatural inertia in studying coordinate geometry. It seemed to be a pursuit of rote learning and regurgitating requisite formulae, which is something I detested. My refusal to “mug up” formulae cost me heavily in my engineering entrance exams, and I was rather proud of myself for having stuck to my ideals in spite of not getting into the college of my dreams.

However, now I realise what useful entities ellipses and hyperbolae are in reality. Hence, as a symbolic gesture, I will derive the formulae of both the ellipse and the hyperbola in the most simple settings- that of the centre being at the origin (0,0).

1. Ellipse- The sum of distances from two _foci_ is constant. Let the sum be “L“. As the centre is at the origin, and we are free to take the foci along the x-axis, the coordinates of the foci are (-c,0) and (c,0). We thus have the equation \sqrt{(x-c)^2 +y^2}+\sqrt{(x+c)^2+ y^2}=L. On simpifying this, we get \frac{x^2}{a^2}+\frac{y^2}{b^2}=1, where a^2=\frac{L^2}{4} and b^2=\frac{(L^2-4c^2)}{4}.

2. In the case of a hyperbola, under similar conditions, we have the equation \sqrt{(x-c)^2 +y^2}-\sqrt{(x+c)^2+ y^2}=L. This under simplification gives \frac{x^2}{a^2}-\frac{y^2}{b^2}=1, where a^2=\frac{L^2}{4} and b^2=\frac{(4c^2-L^2)}{4}.  

The utility of trigonometrical substitutions

Today we will discuss the power of trigonometrical substitutions.

Let us take the expression \frac{\sum_{k=1}^{2499} \sqrt{10+\sqrt{50+\sqrt{k}}}}{\sum_{k=1}^{2499} \sqrt{10-\sqrt{50+\sqrt{k}}}}

This is a math competition problem. One solution proceeds this way: let p_k=\sqrt{50+\sqrt{k}}, q_k=\sqrt{50-\sqrt{k}}. Then as p_k^2+q_k^2=10^2, we can write p_k=10\cos x_k and q_k=10\sin x_k.

This is an elementary fact. But what is the reason for doing so?

Now we have a_k=\sqrt{10+\sqrt{50+\sqrt{k}}}=\sqrt{10+10\cos x_k}=\sqrt{20}\cos \frac{x_k}{2}. Similarly, b_k=\sqrt{10-\sqrt{50+\sqrt{k}}}=\sqrt{10-10\cos x_k}=\sqrt{20}\sin \frac{x_k}{2}. The rest of the solution can be seen here. It mainly uses identities of the form 2\sin A\cos B=(\sin A+\cos B)^2 to remove the root sign.

What if we did not use trigonometric substitutions? What is the utility of this method?

We will refer to this solution, and try to determine whether we’d have been able to solve the problem, using the same steps, but not using trigonometrical substitutions.

a_{2500-k}=\sqrt{10+\sqrt{50+\sqrt{2500-k}}}=\sqrt{10+\sqrt{50+\sqrt{(50+\sqrt{k})(50-\sqrt{k})}}}

=\sqrt{10+\sqrt{50+100\frac{\sqrt{50+\sqrt{k}}}{10}\frac{\sqrt{50-\sqrt{k}}}{10}}}=\sqrt{10+\sqrt{50}\sqrt{1+2\frac{\sqrt{50+\sqrt{k}}}{10}\frac{\sqrt{50-\sqrt{k}}}{10}}}
=\sqrt{10+10(\frac{\sqrt{50+\sqrt{k}}}{10\sqrt{2}}+\frac{\sqrt{50-\sqrt{k}}}{10\sqrt{2}})}=\sqrt{10+10\times 2\times\frac{\frac{1}{2}(\frac{\sqrt{50+\sqrt{k}}}{10\sqrt{2}}+\frac{\sqrt{50-\sqrt{k}}}{10\sqrt{2}})}{\sqrt{\frac{50+\sqrt{k}}{20\sqrt{2}}-\frac{50-\sqrt{k}}{20\sqrt{2}}+\frac{1}{2}}}\times \sqrt{\frac{50+\sqrt{k}}{20\sqrt{2}}-\frac{50-\sqrt{k}}{20\sqrt{2}}+\frac{1}{2}}}

As one might see here, our main aim is to remove the square root radicals, and forming squares becomes much easier when you have trigonometrical expressions. Every trigonometrical expression has a counterpart in a complex algebraic expression. It is only out of sheer habit that we’re more comfortable with trigonometrical expressions and their properties.

Let f:X\to Y be a mapping. We will prove that f^{-1}(Y-f(X-W))\subseteq W, with equality when f is injective. Note that f does not have to be closed, open, or even continuous for this to be true. It can be any mapping.

Let W\subseteq X. The mapping of W in Y is f(W). As for f(X-W), it may overlap with f(W), we the mapping be not be injective. Hence, Y-f(X-W)\subseteq f(W).

>Taking f^{-1} on both sides, we get f^{-1}(Y-f(X-W))\subseteq W.

How can we take the inverse on both sides and determine this fact? Is the reasoning valid? Yes. All the points in X that map to Y-f(X-W) also map to W. However, there may be some points in f^{-1}(W) that do not map to Y-f(X-W).

Are there other analogous points about mappings in general? In Y, select two sets A and B such that A\subseteq B. Then f^{-1}(A)\subseteq f^{-1}(B)

Axiom of Choice- a layman’s explanation.

Say you’re given the set \{1,2,3,\dots,n\}, and asked to choose a number. Any number. You may choose 1,2, or anything that you feel like from the set. Now suppose you’re given a set S, and you have absolutely no idea about what points S contains. In this case, you can’t visualize the points in S and pick any you feel like. You might say “pick the lowest upper bound of S“. However, what if S is not ordered? What if it does not even contain numbers? How do you select a point from a set when you can’t even describe any of the points in that set?

Here, the Axiom of Choice comes in. It states that if \mathfrak{B} is a set of disjoint subsets of S, then a function c: \mathfrak{B}\to \bigcup\limits_{B\in\mathfrak{B}}B such that one point from every disjoint set is selected. You can divide S into disjoint sets in any manner whatsoever, and get one point from each set. In fact, the disjoint sets don’t necessarily have to cover S. The condition is \bigcup\limits_{B\in\mathfrak{B}}\subseteq S.

Going by the above explanation, you may take each point in S to be a disjoint interval, and hence select the whole of S.

What is so special about this seemingly obvious remark? Selecting points from a set has always been defined by a knowledge of the set and its points. For example, if I say f:\Bbb{R}\to\Bbb{R} such that f(x)=f(x^2), I have selected the points with the information that points in \Bbb{R} can be squared, and lie inside \Bbb{R}, etc. If we had f:A\to \Bbb{R} where A is a set of teddy bears, then f(x)=x^2 would not be defined. With the Axiom of Choice, regardless of whether A contains real numbers or teddy bears, we can select a bunch of points from it.

What if B\in\mathfrak{B} are not all pair-wise disjoint? We can still select points from each B. Proof: Take the cross product \mathfrak{B}\times S. You will get points of the form (B,x), where B\in\mathfrak{B} and x\in S. Now create disjoint sets in \mathfrak{B}\times S by taking sets of the form B_i=\{(B_i,x)|x\in B\}; in essence you’re isolating individual sets of \mathfrak{B}. These are disjoint, as if x is the same, then B is different. The main purpose of taking this cross product was to create disjointed sets as compared to overlapping sets in S, so that we could apply the Axiom of Choice. Now we use the choice function to collect one point from each disjoint interval. Each of thee points will be of the form (B,x). Now we define a function g:\mathfrak{B}\times S\to S as g((B,x))=x. Hence, we have collected one point from each B\in\mathfrak{B}. Note that these points may well be overlapping. We found the cross-product just to be able to apply the Axiom of Choice. If the Axiom of Choice was defined for overlapping sets, then we wouldn’t have to find the cross product at all.

Now we come to the reason why this article was written: defining an injective function f:\Bbb{Z}_+\to B, where A is an infinite set and B is a subset of it. We don’t know if A is countable or not.

OK first I’d like to give you the following proof, and you should tell me the flaw in it. We use Proof by Induction. Say f(1)=a_1\in S. We find A-\{a_1\}, and determine a_2\in A-\{a_1\}. We assume f(a_n) equals a point in A-\{a_1,a_2,\dots,a_{n-1}\}, and then find f(a_{n+1}) in A-\{a_1,a_2,\dots,a_n\}. Why can we not do this? Because we know nothing about the points f(a_1),f(a_2),\dots! How can we possibly map 1 to any points without knowing what point we’re mapping it to?

The bad news is we might never know the properties of S. The good news is we can still work around it. Select \mathfrak{B} to be the set of all subsets of S. As one might know, there are \mathfrak{B}=2^{|S|}. The elements of \mathfrak{B} are not pairwise disjoint. However, we can still select a point from every set, as has been proven above (by taking the cross product to create disjoint sets, etc). The most brilliant part of the proof is this: take c(S). We know S\in \mathfrak{B}. This will give you one element from the whole of s. You need to know nothing about S to select an element from S. Let f(1)=c(S). Now take S-f(1). We know S-f(1)\in \mathfrak{B}. Let f(2)=c(S-f(1)). Continuing this pattern, we have f(n)=c(S-\bigcup\limits_{i}f(i)), where i\in \{1,2,3,\dots,n-1\}. By induction, we have that the whole of \Bbb{Z}_+ is mapped to a subset of S.

My first attempt at solving the Lonely Runner Conjecture

Let us suppose there are k runners running at speeds 0<a_1<a_2<\dots<a_k around a field of circumference 1. Take any runner from the k runners- say r_i, who runs with speed a_i. Say we pair him up with another runner r_j who runs with speed a_j. Then for time 0\leq t\leq\frac{1}{2|a_i-a_j|}, the distance between them is |a_i-a_j|t, and for \frac{1}{2|a_i-a_j|}\leq t\leq \frac{1}{|a_i-a_j|}, the distance between them is 1-|a_i-a_j|t.

The Lonely Runner Conjecture can be stated in the following way: there exists a time t such that \min\left[\min\{|a_i-a_1|t_1,1-|a_i-a_1|t_1\},\min\{|a_i-a_2|t_2,1-|a_i-a_2|t_2\},\dots,\min\{|a_i-a_k|t_k,1-|a_i-a_k|t_k\}\right]\geq\frac{1}{k}.

Here t_1,t_2,t_3,\dots,t_k are the smallest positive values found after successively determining t_x-n\frac{1}{|a_i-a_x|}, where x\in\{1,2,3,\dots k\}\setminus\{i\}, and n\in\Bbb{Z}^+

Proofs of Sylow’s Theorems in Group Theory- Part 1

I will try to give a breakdown of the proof of Sylow’s theorems in group theory. These theorems can be tricky to understand, and especially retain even if you’ve understood the basic line of argument.

1. Sylow’s First Theorem- If for a prime number p, o(G)=p^km, and p\nmid m, then there is a subgroup H\leq G such that o(H)=p^k.

First we will try to understand what an orbit of a group is. Take a set S, and a group G. S does not necessarily have to be a subset of G. Orbit(S)=\{gs|s\in S, g\in G\}. Note that this is right multiplication.

Stabilizer(S)=\{g\in G|gs=s\}.

Now the Orbit-Stabilizer Theorem- $|Orbit||Stabilizer|=|Group|$. Something to remember from this is that there is nothing special about this theorem. It is just another way of saying $|Image||Kernel|=|Group|$, as derived from the First Isomorphism Theorem.

The stabilizer of a set S is a subgroup. Proof: Let as=s and bs=s. Then abs=as=s. Also, if as=s, then taking left inverse on both sides, a^{-1}s=s.

Now we move to a combinatorial argument: what power of p divides {p^km\choose p^k}? On expanding, we get \displaystyle{\frac{p^km(p^km-1)(p^km-2)\dots (p^km-p^k+1)}{1,2,3\dots p^k}}=m\displaystyle{\Pi}_{j=1}^{p^k-1}\frac{p^km-j}{p^k-j}. The highest power of p that can divide j is \leq k-1. Hence, the power of p that divides p^k-j is the same that which divides p^km-j. In fact, p does not have to be prime for this property to be true. More generally speaking, just to gain a feel for this kind of argument, provided j\leq p^k-1, the fraction of the form \frac{a-j}{b-j} not be divided by any power of p except for 1, as long as both a and b are divisible by powers of p greater than or equal to k. j is clearly the constraint here. Remember that p\nmid m is a necessary condition for this.

Now we prove {p^km\choose p^k}\equiv m  (mod p).

Now we will define the set S to contain all subsets of G with p^k elements. Clearly, |S|={p^km\choose p^k}, which as shown above is not divisible by p. Take every g\in G, and multiply it with every element of S. Note that very element of S is itself a set containing p^k elements of G. If S_i\in S, then gS_i\in S. Moreover, equivalence classes are formed when all elements of G are multiplied with all elements of S, and these equivalence classes partition S. The proof of this: let GS_i be the orbit of S_i. Then if S_k\in Orb(S_i), gS_i=S_k. Moreover, if S_l\in S=g'S_i, then g'g^{-1}S_k=S_l. This gives a feel for why these objects of S are all in the same class.

Let there be r such partitions of S. Then |S|=|S_1|+|S_2|+\dots+|S_r|, where S_1,S_2,etc are partitions of S. We know |S| is not divisible by p. Hence, there has to be at least one |S_k| which is not divisible by p. Let S_k\in [S_k], where [S_k] is an equivalence class partitioning S. Consider a mapping f:G\to S_k such that f(g)=gS_k. The image is [S_k], and the kernel is the stabilizer. By the orbit-stabilizer theorem, |G|=|Stabilizer||Orbit|. We know that the orbit is not divisible by p^k. Hence, the stabilizer has to be divisible by p^k. Also, the stabilizer is a subgroup.

The rest of the proof will be continued next time.

Today, I will discuss this research paper by Javed Ali, Professor of Topology and Analysis, BITS Pilani.

What exactly is a proximinal set? It is the set of elements X in which for any x\in X, you can find the nearest point(s) to it in X. More formally, for each x\in X, \exists y\in X such that d(x,y)=\inf \{d(x,K)\}.

This article says a Banach space is a complete vector space with a norm. One might find the difference between a complete metric space and a complete vector space to be minimal. However, the difference that is crucial here is that not every metric space is a vector space. For example, let (X,d) be a metric space, satisfying the relevant axioms. However, for x,y\in X, x+y not being defined is possible. However, if X is a vector space, then $x+y\in X$. Hence, every normed vector space is a metric space if one were to define \|x-y\|=d(x,y), but the converse is not necessarily true.

What is a convex set? This article says a convex set is one in which a line joining any two points in the set lies entirely inside the set. But how can a set containing points contain a line? Essentially, the the convex property implies that every point the line passes through is contained within the convex space. Convexity introduces a geometrical flavor to Banach spaces. It is difficult to imagine what such a line segment would be in the Banach space of matrices (with the norm \|\mathbf{A}-\mathbf{B}\|=\mathbf{A}-\mathbf{B}.

What is a uniformly convex set? This paper says that a uniform convex space is defined thus: \forall<\epsilon\leq 2\in\Bbb{R}, \exists \delta(\epsilon)>0 such that for \|x\|=\|y\|=1 and \|x-y\|\geq \epsilon, \frac{\|x+y\|}{2}\leq 1-\delta(\epsilon). Multiplying by -1 on both sides, we get 1- \frac{\|x+y\|}{2}\geq \delta(\epsilon). What does this actually mean? The first condition implies that x and y cannot lie in the same direction. Hence, \|x+y\|<2. As a result, we get \left\|\frac{x+y}{2}\right\|<1, or 1-\left\|\frac{x+y}{2}\right\|>0. As \delta(\epsilon)>0, and as 1-\left\|\frac{x+y}{2}\right\| is bounded, \delta(\epsilon) can be the lower bound of 1-\left\|\frac{x+y}{2}\right\|.

But what is uniform about this condition? It is the fact that \delta(\epsilon) does not change with the unit vector being considered, and depends only on \epsilon.

Now we go on to prove that every closed convex set of every uniformly convex Banach space is proximinal.

Why I have nothing better to do than blog :(

After trying to follow blogs by various mathematicians, I have concluded that having a digital presence is pretty darned cool and useful in academia! Hence, in my pursuit of the forever-elusive elixir of “cool”, I have decided to record my progress in Algebraic Geometry through this blog. I’m no expert. But I feel trying to explain concepts to others promotes a higher degree of clarity than self-study, athough this is debatable and most definitely a grey area.

I really do hope Math enthusiasts find this blog to be a useful spot for better explanations of convoluted geometrical/algebraic concepts.

And reaffirm that beneath all the daunting machinery, Math is so simple and beautiful.