cozilikethinking

4 out of 5 dentists recommend this WordPress.com site

Major Life Update

Hello all! I’ve decided to do my PhD in Conformal Geometry, with applications to String Theory. I’ve been doing study projects in both Algebraic Geometry and Conformal Geometry, but realized I like both Physics and Math, and want to keep thinking about them for the rest of my life.

This is also my way of saying I’m probably going to start blogging a lot more now. Thank you for all your support and guidance 🙂

Topology Qualifying Exams

We were given a list of questions by the Professor to prepare for our Topology Qualifying exams at Penn State. I decided to write up the questions and answers in LaTeX. This document will hopefully be useful to future students preparing for the exam, and of general interest to others.

The latest draft is- Topology_Quals_Preparation-2

Aligning Machine Intelligence- MIRI

I am applying for a summer internship (or attendance at a workshop, or something like that) with the Machine Intelligence Research Institute (MIRI) at Berkeley, California. I have an interview tomorrow over Skype. In order to prep for it, I went on the MIRI website and read their mission statement and a paper outlining the various issues that researchers at MIRI try to address.

The paper in question is “Agent Foundations for Aligning Machine Intelligence with Human Interests: A Technical Research Agenda”. A link to the paper is here.

The paper says at the outset that MIRI does not concern itself with making more intelligent or more powerful artificial machines, as that is already an active field of research with multiple institutes doing wonderful work in that direction. It deals instead with the question of how to make machines behave in a way that is aligned with human interests and expectations. Some features discussed in the paper are:

  1. Real-world models– Machines can be compared using a method similar to scientific induction: expose machines to a complex environment, and have them develop models of that environment. If a machine’s model can predict future events in the complex environment with a better success rate, then that model is better. This seems to be an effective way to compare machines. However, when machines are set to model the external (real) world, they themselves are part of the environment, which makes their observations and conclusions questionable. For instance, suppose the machines are not water-proof. Then rainfall in the external environment will be treated as a catastrophic event, and the machine will spend disproportionate time and resources studying ways of avoiding rainfall, which does not align with human interests.
    Moreover, as machines are rewarded based on the maximization of a reward function, they may outsmart their human judges by finding ways of maximizing the function without creating the best models of the complex environment. This is similar to students gaming the system by learning important sections of the textbook for the exam, without reading the full book and gaining a cohesive understanding of the material, as long as they can predict what types of questions will be asked.
  2. Decision Theory– Given a situation, what decision must a machine take? At the outset, it sounds easy. Make a list of all possible actions, see which action maximizes the utility function, and then select that action. However, it is not clear how a machine would be able to exhaustively check all possible outcomes of all possible actions, and then select the one that is “best”. Due to the varying depths of analysis it can carry out, it is also possible that in two environments that are identical in every respect, the machine chooses different courses of action. Making a reliable machine which takes the same decision every time, after analyzing the consequences of each possible action thoroughly, is a difficult problem.
  3. Logical Uncertainty– Even when we understand a complex system arbitrarily well, it is often difficult to predict events in that system. This is not because of a lack of information, but because of a lack of deductive reasoning power. For instance, if you were shown all 52 cards of a deck in order, and then the cards were shuffled in ways that you understand, you’d still have trouble predicting which card is on top after a million fast shuffles. This is because the human mind is mostly incapable of making fast, long calculations. In such circumstances, we assign probabilities to various outcomes, and then select the outcome which has the highest probability.
    A similar situation applies to smart machines: they will face situations in which they will not be able to predict events accurately. Hence, they too will need to assign probabilities to outcomes. The assignment has to be done in a way that maximizes the successful prediction rate. Teaching a machine how to do that is an active area of research.
  4. Vingean reflection– This has to do with creating smart machines that can themselves create even smarter machines without human intervention. Let us assume that we can create machines which weigh all possible courses of action, and select the one which serves human interests best. Such a machine could follow the same procedure to create a smarter machine. However, because the machine it creates will be smarter than itself, the parent machine will not be able to predict all courses of action that the child machine would take (if it could, it would be as smart as the child machine, contradicting the hypothesis). Hence, an aligned machine may create a machine that is not aligned with human interests.
  5. Error-tolerant agent designs– Currently, if a machine is malfunctioning, we can just open it up and correct its code, hardware, etc. However, an intelligent machine, even if it is programmed to listen to instructions when its human operators believe repairs are needed, may find ways of not listening if it has an incentive to escape such meddling. In other words, although the machine has to follow the code which instructs it to listen to its human programmer, it may cleverly find another part of its code or set of instructions which allows it to escape. Programming an intelligent machine to reliably defer to humans is a surprisingly difficult problem.
  6. Value Specification– The way humans are programmed to procreate is that the act of sex itself is pleasurable; hence, we have a natural inclination to procreate. Although humans are aware of this, they don’t try to change the pleasurable nature of sex. In a similar way, intelligent machines can be programmed to follow human instructions by being rewarded for doing so. A sufficiently intelligent machine can figure out this reward system, and may decide to change it. If the machine is no longer rewarded for following human instructions, it may soon go out of human control. Hence, programming machines not to change their reward system is an area of research.
    Moreover, machines must inductively learn about human values and interests, as programming human interests into a computer is a fuzzy area at best. For instance, an exhaustive list of everything acceptable in human society is unlikely to be produced anytime soon, and hence cannot be fed into a machine. However, a machine may learn about society by observing it, and then base its actions on what is acceptable to humans. This is analogous to the fact that although not every cat picture in the world can be fed into a machine, it can inductively learn what a cat looks like by trawling the internet for cat pictures, and then identify an arbitrary cat picture based on its inductive learning.

I had a great time understanding and writing about the Artificial Intelligence concerns of MIRI, and hope to understand more in the near future.

Just a small update: I came to the PSU Hackathon to learn the rudiments of coding. While randomly scrolling through Wikipedia articles, I came across the AKS primality test, created by three computer scientists at the Indian Institute of Technology, Kanpur. The Wikipedia page does a great job of explaining the algorithm. It is the first unconditional, deterministic polynomial-time algorithm for determining the primality of a number. And it is so simple!

The algorithm depends on the following fundamental fact: if n is a prime number, and a is an integer co-prime to n (say n-1), then for any integer x, we have x^n+a\equiv (x+a)^n \pmod {n}. This congruence (as an identity of polynomials) is a necessary and sufficient condition for primality! This is the simplest and most awesome thing I’ve seen in the recent past.

Note: Do you see why {n\choose k}\equiv 0 \pmod {n} holds for all 0<k<n if and only if n is prime? Hint: to show this fails for non-prime n, consider {n\choose j}, where j is the smallest prime factor of n.
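Here is a quick brute-force sketch (my own, in Python) of the note above: expanding (x+a)^n shows that the AKS congruence reduces to checking that the binomial coefficients {n\choose k} vanish mod n for 0<k<n (together with a^n\equiv a, which Fermat gives for prime n).

```python
# Check: C(n, k) ≡ 0 (mod n) for all 0 < k < n holds exactly when n is prime.
from math import comb

def binomials_vanish_mod_n(n):
    """True iff C(n, k) ≡ 0 (mod n) for every 0 < k < n."""
    return all(comb(n, k) % n == 0 for k in range(1, n))

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

# The two predicates agree on every n in a small range.
for n in range(2, 200):
    assert binomials_vanish_mod_n(n) == is_prime(n)

# The hint in action: for n = 15 and j = 3 (the smallest prime factor),
# C(15, 3) = 455 is not divisible by 15.
print(comb(15, 3) % 15)  # → 5
```

The loop is of course only a finite check, not a proof, but it makes the hint very concrete.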

 

I’ve been meaning to update this blog for the longest time. In order to finally get around to it, I decided to post about what I’ve been reading, as compared to an expository article on an area of Math which would understandably take more time and planning to execute.

Today I was reading the following article by Jacob Lurie- Spectral Homotopy Theory. It seems amazingly readable, despite being on a relatively advanced topic. I shall update this post to include a (hopefully respectable) summary of the article.

Decisions decisions

Hi! I’ve decided to join Penn State for a PhD in Mathematics this fall. I am excited about this leg of my life. Thanks all.

-Ayush

Examples- II

The set of irritating examples continues:

1. V(I\cap J)=V(I.J)=V(I)\cup V(J): let I be the ideal generated by the polynomial x+y and J the ideal generated by x-y. Then I\cap J consists of the polynomials that are present in both ideals. As x+y and x-y are coprime elements of \Bbb{R}[x,y], the intersection of the ideals they generate is exactly their product. When we take a product of two ideals, the set of common zeroes is the union of the sets of zeroes of the individual ideals. Hence, we get V(I\cap J)=V(I)\cup V(J).

Why is the intersection of these two ideals equal to their product? It is easy to see that the product of the two ideals is contained within the intersection. The reverse containment is where coprimality matters: primality alone is not enough (take I=J=(x): the intersection is (x), but the product is (x^2)). Here, if both x+y and x-y divide a polynomial, then so does their product, since they are coprime in the UFD \Bbb{R}[x,y].
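As a pointwise sanity check of the zero-set identity, here is a small Python sketch (over an integer grid only, so just an illustration): a point kills the product (x+y)(x-y) exactly when it kills one of the factors.

```python
# Pointwise check of V(I.J) = V(I) ∪ V(J) for I = (x + y), J = (x - y).

def in_V_product(x, y):
    # the point lies on the zero set of the product (x+y)(x-y)
    return (x + y) * (x - y) == 0

def in_union(x, y):
    # the point lies on the line x + y = 0 or the line x - y = 0
    return (x + y) == 0 or (x - y) == 0

pts = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]
assert all(in_V_product(x, y) == in_union(x, y) for x, y in pts)
```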

2. Explicitly write down a morphism between two varieties, that leads to a morphism between their coordinate rings: Consider the map t\to (t,t^2), which is a morphism between the affine varieties \Bbb{R} and V(y-x^2)\subset \Bbb{R}^2. We shall now construct a morphism between the coordinate rings \Bbb{R}[x,y]/(y-x^2) and \Bbb{R}[x]/(0)=\Bbb{R}[x]. How do we go about doing that?
Consider any polynomial in \Bbb{R}[x,y]/(y-x^2); say something of the form x^2+y^2. We can now replace x by t and y by t^2. That is how we get a morphism from \Bbb{R}[x,y]/(y-x^2) to \Bbb{R}[x].

Now we shall start with a morphism between coordinate rings. Consider the morphism \Bbb{R}[x,y]/(y-x^2) \to \Bbb{R}[x]. Let x be mapped to x and let y be mapped to x^2. We can see why the ideal (y-x^2) goes to 0. Hence, this map is well-defined. Now we need to construct a morphism between the varieties V((0)) and V(y-x^2). We may take t\to (t,t^2). In general, if we have a map \Bbb{C}[x_1,x_2,\dots,x_m]/I\to \Bbb{C}[x_1,x_2,\dots,x_n]/J, with the corresponding mappings x_i\to p_i(x_1,x_2,\dots,x_n), then the map between the varieties V(J)\to V(I) is defined as follows: (a_1,a_2,\dots,a_n)\to (p_1(a_1,a_2,\dots,a_n), p_2(a_1,a_2,\dots,a_n),\dots, p_m(a_1,a_2,\dots,a_n)). How do we know that the image of the point belongs to V(I)? This is not a difficult argument, and follows from the fact that every polynomial in I maps to a polynomial in J (as the mapping between the coordinate rings is well-defined). We shall replicate that argument here.

Let f\in I. Then f(x_1,x_2,\dots,x_m) is mapped to f(p_1(x_1,x_2,\dots,x_n), p_2(x_1,x_2,\dots,x_n),\dots,p_m(x_1,x_2,\dots,x_n)), which is a polynomial in J. This polynomial vanishes at the point (a_1,a_2,\dots,a_n). Hence, f(p_1(a_1,a_2,\dots,a_n), p_2(a_1,a_2,\dots,a_n),\dots,p_m(a_1,a_2,\dots,a_n))=0. This proves that the image of (a_1,a_2,\dots,a_n) is again a point in V(I), and we have defined a map between the varieties V(J) and V(I).
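The dictionary above can be sketched computationally. In this Python toy (the helper `variety_map` is my own naming, not standard), the ring map x\to t, y\to t^2 is represented by the tuple of substituted polynomials, and we check that every image point of the induced variety map lands on the parabola V(y-x^2).

```python
# The map on varieties induced by R[x, y]/(y - x^2) -> R[x], x -> t, y -> t^2:
# a point a of V((0)) = R is sent to (p1(a), p2(a)) = (a, a^2).

def variety_map(a, polys):
    """Push a point through the tuple (p_1, ..., p_m) of substituted polynomials."""
    return tuple(p(a) for p in polys)

p1 = lambda t: t        # image of x
p2 = lambda t: t * t    # image of y

f = lambda x, y: y - x * x   # a generator of the ideal I = (y - x^2)

# Every image point satisfies f, i.e. lies on the parabola V(y - x^2).
for a in range(-10, 11):
    pt = variety_map(a, (p1, p2))
    assert f(*pt) == 0
```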

Where does the isomorphism figure in this picture? One step at a time, young Padawan.

3. Explicit example of a differential form: (x+y+z)dx+(x^2+y^2+z^2)dy+(x^3+y^3+z^3)dz. A differential form is just a bunch of functions multiplied with dx, dy and dz.

4. Explicit example of the snake lemma in action: I am going to try and talk a bit about Jack Schmidt’s answer [here](http://math.stackexchange.com/questions/182562/intuition-behind-snake-lemma). It promises to be very illustrative.

Irritating set of examples- I

I am trying to collect explicit examples for concepts and calculations. My hope is that this website becomes a useful repository of examples for anyone looking for them on the internet.

First some words of wisdom from the master himself, Professor Ravi Vakil: “Finally, if you attempt to read this without working through a significant number of exercises, I will come to your house and pummel you with the EGA until you beg for mercy. As Mark Kisin has said, ‘You can wave your hands all you want, but it still won’t make you fly.’”

We first start with some category theory examples:

1. Can we have two products of the same two objects, say A and B, in the same category? This question is much more general than I am making it out to be: can we have two distinct universal objects of the same kind in a category (although they will be isomorphic, even via a unique isomorphism)? A simple example of products being isomorphic but not equal is the following: A\times B and B\times A. These aren’t the same objects, but they’re isomorphic through a unique isomorphism.

2. Groupoid- In the world of categories, a groupoid is a category in which all morphisms between objects are isomorphisms. An example of a groupoid which is not a group is the category \mathfrak{Set} with the following restriction: \text{Hom}(A,B) now only consists of isomorphisms, and not just any morphisms. This example, although true, is not very illustrative. This [link](http://mathoverflow.net/questions/1114/whats-a-groupoid-whats-a-good-example-of-a-groupoid) provides a much better demonstration of what is going on. Wikipedia says that the way in which a groupoid is different from a group is that the product of some pairs of elements may not be defined. The Overflow link suggests the same thing. You can’t take any pair of moves that one may make on the current state of the jigsaw puzzle and just compose them. The most important thing to note here is that the elements of the group do not correspond to objects of the category. They correspond to morphisms between those objects. This is the most diabolical shift of perspective that one encounters while dealing with categories. Suddenly, morphisms encode much more information than you expect them to.

3. Algebraic Topology example: Consider a category in which points are objects, and paths between points, up to homotopy, are morphisms. This is a groupoid, as paths between points are invertible: the inverse of a path is simply the same path travelled in the opposite direction (a return path that wrapped around some wayward hole would not do). The automorphism group of a point is the fundamental group based at that point.

Another category that stems from Algebraic Topology is one in which the objects are topological spaces, and the morphisms are the continuous maps between those spaces. Predictably, the isomorphisms are the homeomorphisms.

4. Subcategory: An example would be one in which objects are sets with cardinality 1, and morphisms would be the same as those defined in the parent category- \mathfrak{Set}.

5. Covariant functor: Consider the free functor from \mathfrak{Set} to \mathfrak{Vec}_k, which sends a set to the vector space with that set as a basis. The co-domain is, in a sense, bigger than the domain. One could think of this functor as an embedding.

A topological example is the following: the functor which sends a topological space X, with the choice of a point x_0, to the fundamental group \pi_1(X,x_0). How does this functor map morphisms? It sends a continuous map of pointed spaces to the induced map on loops: a loop in X goes to its image under that continuous map. How do we know that the image is a loop? This is easy to see: the composite is still a continuous map from [0,1] into the target space, and we are done. Do we have to choose a point in each topological space? Yes. What if we have two tuples (X,x_0), (Y,y_0), such that x_0 is not mapped to y_0? Then there is no morphism between these two objects. In other words, the set of morphisms \text{Hom}((X,x_0), (Y,y_0)) consists of only those continuous maps which send x_0 to y_0. An illustrative example is the following: f_1: t\to (\cos (2\pi t),\sin (2\pi t)) and f_2: t\to (\cos (4\pi t),\sin (4\pi t)), both defined on [0,1]. These are two different continuous maps between the same two topological spaces. They both map 0 to the point (1,0) in S^1, but they map a path starting and ending at 0 to loops that wind around S^1 once and twice respectively, and those loops are not homotopic.

Side note: an example of two homotopic paths being mapped to homotopic paths under a continuous map. Let f: [0,1]\to S^1 be the continuous map under consideration. Consider any path p in [0,1] which starts and ends at 0. We know that this is homotopic to the constant path at 0 (one may visualize the homotopy as shrinking the path successively toward 0). The image of this homotopy under f is then a homotopy in S^1, shrinking the image path toward the constant path at f(0).
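The two loops f_1 and f_2 above can be told apart numerically by their winding number around the origin, a homotopy invariant. Here is a small Python sketch (the angle-unwrapping approach is my own quick implementation, not a library routine):

```python
# Numerically distinguish f1(t) = (cos 2πt, sin 2πt) and f2(t) = (cos 4πt, sin 4πt):
# both start and end at (1, 0), but they wind around the origin 1 and 2 times.
import math

def winding_number(path, steps=10000):
    """Total signed angle swept by a closed path, in full turns."""
    total = 0.0
    x, y = path(0.0)
    prev = math.atan2(y, x)
    for i in range(1, steps + 1):
        x, y = path(i / steps)
        ang = math.atan2(y, x)
        d = ang - prev
        # unwrap the jump across the branch cut at ±π
        if d > math.pi:
            d -= 2 * math.pi
        elif d < -math.pi:
            d += 2 * math.pi
        total += d
        prev = ang
    return round(total / (2 * math.pi))

f1 = lambda t: (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))
f2 = lambda t: (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))

print(winding_number(f1), winding_number(f2))  # → 1 2
```

Since homotopic loops in S^1 have equal winding numbers, the output certifies that f_1 and f_2 are not homotopic as loops.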

6. Contravariant functors: Mapping a vector space to its dual. This example is pretty self-explanatory.

7. Natural Transformation: A natural transformation is a morphism between functors. Abelianization is a common example of a natural transformation. The two functors, both of which are covariant, are id and (-)^{ab}. The first one maps a group to itself, and the second one maps a group to its abelianization, the quotient by its commutator subgroup. The resultant commutative diagram is easy to see too. The data of the natural transformation is just the quotient map m_G:G\to G^{ab} for every group G.

The double dual of a vector space is another example of a natural transformation. The dual would have worked too, except for the fact that the dual functor is contravariant. Note: one of the functors, in both these natural transformations, is the identity functor.
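To make the naturality square concrete, here is a Python toy (my own encoding of permutations as tuples, not a library): for G=S_3 the abelianization is \Bbb{Z}/2 with quotient map the sign homomorphism, and for the homomorphism \varphi given by conjugation the induced map on \Bbb{Z}/2 is the identity, so the square commutes exactly when sign is conjugation-invariant.

```python
# Naturality square for abelianization, with G = S_3 and φ = conjugation:
# sign(φ(g)) must equal φ^ab(sign(g)) = sign(g) for every g.
from itertools import permutations

def compose(p, q):          # (p ∘ q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def sign(p):                # parity of the number of inversions, in Z/2
    return sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p))) % 2

S3 = list(permutations(range(3)))
c = (1, 0, 2)               # a fixed transposition
phi = lambda g: compose(compose(c, g), inverse(c))   # conjugation by c

# The square commutes: abelianize-then-map equals map-then-abelianize.
assert all(sign(phi(g)) == sign(g) for g in S3)
```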

8. Equivalence of categories- This is exactly what you think it is. Two categories that are not equivalent are \mathfrak{Grp} and \mathfrak{Grp^{ab}}. Too much information is lost while abelianizing the group, which cannot be regained easily.

9. Initial object- The empty set is the initial object in the category \mathfrak{Set}. Why not a singleton? Because the map from the initial object to any object also has to be unique, and a singleton has more than one map to any set with at least two elements. Moreover, a singleton has no map at all to one object- namely the empty set. And an initial object should map to all objects.

10. Final object- A singleton will be a good final object in the category \mathfrak{Set}.

11. Zero object- The trivial group is the zero object in the category \mathfrak{Grp}.

12. Localization through universal property: Consider \Bbb{Z}, with the multiplicative subset \Bbb{Z}-\langle 7\rangle. The embedding \iota: \Bbb{Z}\to \Bbb{Q} ensures that every integer goes to an invertible element. Trivially, so does every element of \Bbb{Z}-\langle 7\rangle. Hence, there exists a unique map from (\Bbb{Z}-\langle 7\rangle)^{-1}\Bbb{Z} to \Bbb{Q}. We can clearly see that this is overkill. Many more elements than just those of \Bbb{Z}-\langle 7\rangle are mapped to invertible elements. The point is that there may be a ring A such that only elements of \Bbb{Z}-\langle 7\rangle are mapped to invertible elements in A. Hence, in that case too, there will exist a unique map from (\Bbb{Z}-\langle 7\rangle)^{-1}\Bbb{Z} to A. Why do we care about there existing a map from some other object to rings which \Bbb{Z} maps to at all? When we have a morphism \phi:A\to B, and we can say that there exists a map A/S\to B, where S is a set of relations between elements of A, then we’re saying something special about the properties of elements in B (at least the properties of elements mapped to by S).
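A minimal sketch of this localization in Python, using the standard library’s `fractions.Fraction` (the helper names are mine): the localization (\Bbb{Z}-\langle 7\rangle)^{-1}\Bbb{Z} sits inside \Bbb{Q} as the fractions whose lowest-terms denominator is not divisible by 7, and the universal map just reads a formal fraction as an actual rational.

```python
# (Z - (7))^{-1} Z as a subring of Q: fractions a/s with 7 ∤ s.
from fractions import Fraction

def in_localization(q):
    """Is the rational q of the form a/s with 7 not dividing s (in lowest terms)?"""
    return q.denominator % 7 != 0

def localize(a, s):
    """The image of the formal fraction a/s under the universal map to Q."""
    assert s % 7 != 0, "denominators must come from the multiplicative set"
    return Fraction(a, s)

x = localize(3, 5)           # 3/5: 5 is invertible in the localization
y = localize(14, 11)         # 14/11
assert in_localization(x + y)               # closed under addition
assert not in_localization(Fraction(1, 7))  # 1/7 is *not* in the image
```

This also illustrates the “overkill” remark: \Bbb{Q} inverts far more than the elements of \Bbb{Z}-\langle 7\rangle, which is exactly why the universal property only asks for the *existence* of the factoring map, not that the target ring be minimal.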

Sheaf (Čech) Cohomology: A glimpse

This is a blogpost on Sheaf Cohomology. We shall be following this article.

If the reader wants to read up on what a sheaf is, he/she can read the very readable wikipedia article on it.

From the word cohomology, we can guess that we shall be talking about a complex with abelian groups and boundary operators. Let us specify what these abelian groups are.

Given an open cover \mathcal{U}=(U_i)_{i\in I} and a sheaf \mathcal{F}, we define the 0^{th} cochain group C^0(\mathcal{U}, \mathcal{F})=\prod_{i\in I}\mathcal{F}(U_i). Note that we are not assuming that the sections over the individual U_i‘s agree on the intersections. This is simply a tuple in which each coordinate is a section. We are interested in finding out whether we can glue these sections together to get a global section. This is only possible if the sections agree on the intersections of the open sets.

We now define C^1(\mathcal{U}, \mathcal{F})=\prod_{i,j\in I}\mathcal{F}(U_i\cap U_j). Here we are considering the tuple of sections defined on the intersections of two sets. Note that these intersections may not cover the whole of the topological space. Hence, we are no longer interested in gluing sections together to see whether they form a global section.

Similarly, we define C^2(\mathcal{U}, \mathcal{F})=\prod_{i,j,k\in I}\mathcal{F}(U_i\cap U_j\cap U_k).

Now, we come to the boundary maps. \delta: C^0(\mathcal{U}, \mathcal{F})\to C^1(\mathcal{U}, \mathcal{F}) is defined in the following way: \delta(f_i)=(g_{i,j}), where (g_{i,j})= f_{j|U_i\cap U_j}-f_{i|U_i\cap U_j}. What we’re doing is that we’re taking a tuple of sections, and mapping it to another tuple; the second tuple is generated by choosing two indices k,l, determining the sections defined over U_k and U_l, and then calculating f_{l|U_k\cap U_l}-f_{k|U_k\cap U_l}. In the image tuple, f_{l|U_k\cap U_l}-f_{k|U_k\cap U_l} would be written at the k,l coordinate.

Now we define the second boundary map. \delta: C^1(\mathcal{U}, \mathcal{F})\to C^2(\mathcal{U}, \mathcal{F}) is defined in the following way: \delta(f_{i,j})=(g_{i,j,k}), where g_{i,j,k}= f_{j,k|U_i\cap U_j\cap U_k}-f_{i,k|U_i\cap U_j\cap U_k}+f_{i,j|U_i\cap U_j\cap U_k}. What does this seemingly arbitrary definition signify? The first thing to notice is that if (f_{i,j}) is an image of an element in C^0(\mathcal{U}, \mathcal{F}), then \delta(f_{i,j})=0: writing f_{i,j}=g_j-g_i, the three terms cancel in pairs. Hence, at the very least, this definition of a boundary map gives us a complex on our hands. Maybe that is all that it signifies. We’re looking for definitions of C^i(\mathcal{U},\mathcal{F}) which keep giving us sections over smaller and smaller open sets, and definitions of \delta over these C^i(\mathcal{U},\mathcal{F}) which keep on mapping images from C^{i-1}(\mathcal{U},\mathcal{F}) to 0.

Predictably, H^i(\mathcal{U},\mathcal{F})=Z^i(\mathcal{U},\mathcal{F})/B^i(\mathcal{U},\mathcal{F}), where Z^i(\mathcal{U},\mathcal{F}) is the kernel of \delta acting on C^i(\mathcal{U},\mathcal{F}) and B^i(\mathcal{U},\mathcal{F}) is the image of \delta acting on C^{i-1}(\mathcal{U},\mathcal{F}). Sheaf cohomology measures the extent to which tuples of sections over an open cover fail to be global sections. The longer the non-zero tail of the cohomology complex, the farther the sections of this sheaf lie from gluing together amicably. In other words, the length of the non-zero tail measures how “complex” the topological space and the sheaf on it are. However, there is still hope. By a theorem of Grothendieck, we know that the length of the complex is bounded by the dimension of the (noetherian) topological space.
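To see the machinery run on a tiny example, here is a Python toy computation (my own setup, assuming the standard cover of the circle by two arcs): take the constant sheaf \Bbb{R} on S^1, covered by two arcs U, V whose intersection has two components A and B. Sections of the constant sheaf over a connected open set are single constants, so C^0=\mathcal{F}(U)\times\mathcal{F}(V)=\Bbb{R}^2 and C^1=\mathcal{F}(A)\times\mathcal{F}(B)=\Bbb{R}^2.

```python
# Čech cohomology of the constant sheaf R on S^1, cover by two arcs U, V
# with U ∩ V = A ⊔ B.  The boundary map restricts-and-subtracts on each
# component of the overlap.

def delta(fU, fV):
    return (fV - fU, fV - fU)

# H^0 = ker δ: (fU, fV) glue to a global section iff fU = fV, one free
# parameter, so H^0 ≅ R (global constants), as expected.
assert delta(3.0, 3.0) == (0.0, 0.0)

# H^1 = C^1 / im δ: the image of δ is the diagonal {(t, t)} in R^2,
# which is 1-dimensional, so H^1 ≅ R^2 / R ≅ R -- the familiar H^1(S^1; R).
image_sample = {delta(fU, fV) for fU in range(-3, 4) for fV in range(-3, 4)}
assert all(a == b for a, b in image_sample)   # the image lies on the diagonal

# The 1-cocycle (1, 0) is off the diagonal, hence represents a generator of H^1.
assert (1.0, 0.0) not in image_sample
```

The nonzero H^1 is exactly the cover failing to glue in one independent way, matching the “non-zero tail” picture above.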

Prüfer Group

This is a short note on the Prüfer group.

Let p be a prime integer. The Prüfer group, written as \Bbb{Z}(p^\infty), is the unique p-group in which each element has p different pth roots. What does this mean? Take \Bbb{Z}/5\Bbb{Z} for example. Can we say that for any element a in this group, there are 5 mutually different elements which, when raised to the 5th power, give a? No. Since every element y satisfies y^5=e, the identity has five 5th roots, but no other element has any. What about \Bbb{Z}/2^2\Bbb{Z}? Here p=2. Does every element have two mutually different square roots? No. For instance, \overline{1}\in\Bbb{Z}/2^2\Bbb{Z} has none, as doubling any element gives an even residue. We start to get the feeling that this condition would only be satisfied in a very special kind of group.
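A brute-force Python check of the two failures above, writing the cyclic groups additively so that a “pth root” of a is a y with p\cdot y = a:

```python
# Count p-th roots in Z/nZ (written additively): solutions y of p*y ≡ a (mod n).

def pth_roots(p, n, a):
    """All y in Z/nZ with p*y ≡ a (mod n)."""
    return [y for y in range(n) if (p * y - a) % n == 0]

# In Z/5Z, 5*y ≡ 0 for every y, so 2 has no 5th root at all
# (while 0 has all five).
print(pth_roots(5, 5, 2))   # → []
print(pth_roots(5, 5, 0))   # → [0, 1, 2, 3, 4]

# In Z/4Z with p = 2: the element 2 does have two square roots,
# but 1 has none -- so Z/4Z also fails the "every element" condition.
print(pth_roots(2, 4, 2))   # → [1, 3]
print(pth_roots(2, 4, 1))   # → []
```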

The Prüfer p-group may be identified with the subgroup of the circle group U(1) consisting of all the p^n-th roots of unity, as n ranges over all non-negative integers. The circle group is the multiplicative group of all complex numbers with absolute value 1. It is easy to see why this set would be a group. And using the imagery from the circle, it is easy to see why each element would have p different pth roots. Say we take an element a of the Prüfer group, and assume that it is a p^{n}th root of 1. Then its p different pth roots are p^{n+1}th roots of 1. It is nice to see a geometric realization of this rather strange group that seems to rise naturally from groups of the form \Bbb{Z}/p^n\Bbb{Z}.
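The circle-group picture can be made exact in Python by recording the element \exp(2\pi i k/p^n) as the fraction k/p^n modulo 1 (this encoding, via the standard library’s `fractions.Fraction`, is my own choice for the sketch):

```python
# The Prüfer p-group inside the circle group: the element exp(2πi k / p^n)
# is stored as the angle k/p^n mod 1.  The p distinct p-th roots of the
# angle q are (q + j)/p for j = 0, ..., p-1.
from fractions import Fraction

p = 3

def pth_roots(q):
    """The p distinct p-th roots of the angle q, taken mod 1."""
    return [((q + j) / p) % 1 for j in range(p)]

q = Fraction(2, 9)               # a 9th root of unity (n = 2)
roots = pth_roots(q)

assert len(set(roots)) == p                   # p mutually distinct roots
assert all((p * r) % 1 == q for r in roots)   # each really is a p-th root
print(roots)  # → [Fraction(2, 27), Fraction(11, 27), Fraction(20, 27)]
```

As the text predicts, the roots of a p^2-th root of unity are p^3-th roots of unity: the denominators jump from 9 to 27.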