4 out of 5 dentists recommend this WordPress.com site

Month: October, 2016

Closed form for recurrence relations- A long battle

I am almost morally obligated to write this post. While preparing for Math Olympiads, I came across the closed form formula for linear recurrence relations pretty often. However, I never quite came across *any* source that explained *why* this worked. And in the 6 years since, I’ve looked far and wide. However today, in my first College Combinatorics class, I think I finally understood why this works.

Let $a_n=c_{n-1}a_{n-1}+c_{n-2}a_{n-2}+\dots+c_{n-k}a_{n-k}$. Along with this, we also have the initial conditions $a_0=d_0, a_1=d_1,\dots,a_{k-1}=d_{k-1}$.

Let us assume that there exists a geometric sequence which satisfies the same relation! We have no information about initial conditions. Just a geometric sequence. To make things more explicit, I’d like to point out that the initial conditions are probably completely different from those mentioned before. Let this geometric sequence be of the form $f_n=r^n$. Plugging these powers of $r$ in the recurrence relation above, we solve for the roots of the $(n-k)$ degree polynomial. We get $n-k$ complex roots, possibly repeated.

Say the distinct roots that we get are $p_1,p_2,\dots,p_j$, where $j\leq n-k$. Each root satisfies the recurrence relation. Hence, so does $q_1p_1^i+q_2p_2^i+\dots+q_jp_j^i$, where $q_1,q_2,\dots,q_j\in\Bbb{C}$. This can be checked directly by plugging his linear combination in the recurrence relation above.

Now we determine the values of the $q_i$‘s by assuming that $q_1p_1^i+q_2p_2^i+\dots+q_jp_j^i$ satisfies the initial conditions given above. Remember that these initial conditions give rise to a *unique* sequence that satisfies the recurrence relation. Hence, if we can get values of $q_i$‘s that make $q_1p_1^i+q_2p_2^i+\dots+q_jp_j^i$ satisfy each initial condition, then we’re done. Does this always work out? Do we always get unique values for the $q_i$‘s? Only when the system of equations we generate is consistent. As soon as we get values for $q_i$‘s that satisfy this system of equations, it is easy to see that we’ve gotten the closed form function we wanted.

This method is analogous to a gamble. We assume that the desired closed form function is a linear combination of geometric sequences. If it indeed is, we’ll get values for the $q_i$‘s. If it is not, we shall not get consistent values for the $q_i$‘s. What is important to note that if we do get consistent values for the $q_i$‘s, then this is our desired closed form. We don’t need to check for any more conditions. Because a recurrence relation with $k$ dependant terms (in that $a_n=c_{n-1}a_{n-1}+\dots+c_{n-k}a_{n-k}$) and $k$ initial conditions is unique.

Nakayama’s lemma

The Nakayama lemma as a concept is present throughout Commutative Algebra. And truth be told, learning it is not easy. The proof contains a small trick that is deceptively simple, but throws off many people. Also, it is easy to dismiss this lemma as unimportant. But as one would surely find out later, this would be an error in judgement. I am going to discuss this theorem and its proof in detail.

The statement of the theorem, as stated in Matsumura, is:

Let $I$ be an ideal in $R$, and $M$ be a finitely generated module over $R$. If $IM=M$, then there exists $r\in R$ such that $r\equiv 1\mod I$, and $rM=0$.

What does this statement even mean? Why is it so important? Why are the conditions given this way? Are these conditions necessary conditions? These are some questions that we can ask. We will try and discuss as many of them as we can.

$M$ is probably finitely generated so that we can generate a matrix, which by definiton has to be finite dimensional. Where the matrix comes in will become clear when we discuss the proof. What does $IM=M$ imply? This is a highly unusual situation. For instance, if $M=\Bbb{Z}$ and $I=(2)$, then $(2)\Bbb{Z}\neq\Bbb{Z}$. I can’t think of examples in which $I\neq (1)$, and $IM=M$. However, that does not mean that there do not exist any. What does it mean for $r\equiv 1\mod I$? It just means that $r=1+i$ for some $i\in I$. That was fairly simple! Now let’s get on with the proof.

Let $M$ be generated by the elements $\{a_1,a_2,\dots,a_n\}$. If $IM=M$, then for each generator $a_i$, we have $a_i=b_{i1}a_1+b_{i2}a_2+\dots+b_{in}a_n$, where all the $b_{ij}\in I$. We then have $b_{i1}a_1+b_{i2}a_2+\dots+(b_{ii}-1)a_i+\dots+b_{in}a_n=0$. Let us now create a matrix of these $n$ equations in the natural way, in which the rows are indexed by the $i$‘s. The determinant of this matrix will be $0$, as for any column vector that we multiply this matrix with, we will get $0$. On expanding this determinant, we will get an expression of the form $(-1)^n+ i$, where $i\in I$. If $n$ is odd, then just multiply the expression by $-1$. In either case, you get $1+i'$, where $i\in I$ ($i'=i$ or $i'=-i$).

Now as $1+i'$ is $0$, we have $(1+i')M=0$. Hence, $r=1+i'$ such that $r\equiv 1\mod I$ and $rM=0$

The reason why the proof is generally slightly confusing is that it is done more generally. It is first assume that there exists a morphism $\phi:M\to M$ such that $\phi(M)\subset IM$. Cayley-Hamilton is then used to give a determinant in terms of $\phi$, and then it is assumed that $\phi=1$. Here I have directly assumed that $\phi=1$, which made matters much simpler.

Spectral Theorem

This post is on the Spectral Theorem. This is something that I should have been clear on a long time ago, but for reasons unknown to me, I was not. I hope to be able to rectify that now. The proof was discussed today in class. I am only recording my thoughts on it.

The spectral theorem states that a self-adjoint operator in an $n$ dimensional vector space has $n$ orthogonal eigenvectors, and all its $n$ eigenvalues are real.

Let $V$ be the $n$-dimensional vector space under consideration, and let $a,b\in V$. A self adjoint operator is one that satisfies the following condition: $\langle Ta,b\rangle=\langle a,Tb\rangle$. If the inner product is defined in the conventional way in this setting, which is a sesquilinear product, then $T$ has to be a hermitian matrix. For motivation, we’re going to assume this inner product for the rest of the proof.

As we’re working in $\Bbb{C}$, $T$ has at least one eigenvalue, and consequently and eigenvector. Let $Tv=\lambda v$, where $\lambda$ is the eigenvalue and $v$ is the eigenvector. $\langle Tv,v\rangle=\langle \lambda v,v\rangle=\lambda\langle v,v\rangle=\langle v,Tv\rangle=\langle v,\lambda v\rangle=\overline{\lambda}\langle v,v\rangle$. We know that $\langle v,v\rangle\in\Bbb{R}$ (it is in fact greater than $0$). Hence, $\lambda=\overline{\lambda}$, which shows that $\lambda$ is real valued.

How do we contruct the basis of orthogonal eigenvectors though? We start with one eigenvector $v$. Now consider the orthogonal complement of $v$. Let this be $A$. We claim that $T(A)\subset A$. This is because for $a\in A$, $\langle Ta,v\rangle=\langle a,Tv\rangle=\langle a,\lambda v\rangle=\overline{\lambda}\langle a,v\rangle=\lambda\langle a,v\rangle=0$ (remember that $\lambda=\overline{\lambda}$). Hence, if we write $T$ in terms of the new basis which has $v$ and elements from its orthogonal complement, then the first row and column will be all $0$‘s except for the the top left position, which will have $\lambda$.

Now the action of $T$ on the orthogonal complement, or $A$, is the same as the action of $T\setminus \{\text{first row and first column}\}$. The determinant of this again will have at least one solution, which ensures that we have an eigenvalue to work with. In this way, through an iterative process which ends after $n$ iterations, we can generate $n$ eigenvectors.

Abelian Categories

This post is about Abelian categories. This concept, like many others, has eaten away at my existence for quite some time, as there were some chinks that I never quite cleared.

Vakil says that abelian categories are “the right setting” in which one can do homological algebra. This is because one can use kernels, cokernels, etc in such a category. Before we elaborate on this, we shall write about additive categories.

An additive category $\mathcal{A}$ is one in which for $A,B\in Obj(\mathcal{A})$, the set $Mor(A,B)$ is an abelian group, there exists a zero object (an object that is both an initial and final object), and $A\times B\in Obj(\mathcal{A})$ too.

An abelian group structure on the objects would automatically make the morphisms an abelian group. This is seen in the category of abelian groups and the category of modules. A zero object need not always exist. For example, such an object does not exist in $\textbf{Fld}$. However categories like $\textbf{Ab}$ and $\textbf{Mod}$ do contain such objects, which is namely $\{0\}$. As for products, the category $\textbf{Fld}$ again does not contain products. For example, $\Bbb{Z_3}\times\Bbb{Z}_5$ is not a field. However, the categories $\textbf{Ab}$ and $\textbf{Mod}$ do contain the zero object.

Now we shall discuss the concept of kernel and cokernel. For a map $f:B\to C$, a kernel is the two tuple $(A,i)$, where $i:A\to B$ is a morphism, such that $f\circ i=0$. Also, $(A,i)$ is universal with respect to this property. What does this mean? It means that any other tuple satisfying this property factors through $(A,i)$. This is just a fancy way of saying that $A$ contains the whole kernel of $f:B\to C$.

A cokernel has an anaologous construction- that of the dual of the kernel. A cokernel of $f:B\to C$ is a tuple of the form $(D,g)$ such that $g\circ f=0$. Also, it is the universal tuple with such properties. In less fancy terms, every map from $C$ which maps $f(B)$ to $0$ factors through $D$.

An abelian category is one in which every map has a kernel and a cokernel, every monomorphism is the kernel of its cokernel, and every epimorphism is the cokernel of its kernel. We shall discuss these concepts in separate paragraphs.

That every map has a kernel and a cokernel is seen in almost every category that we can think of. $\textbf{Grp}, \textbf{Ring}$, $\textbf{Ab}$, etc. Hence, this is not that much of a constraint.

What does “every monomorphism is the kernel of its cokernel” even mean? Before we can parse this statement, we need to understand what a monomorphism is. It is not just an injective map anymore. Let $f,f':A\to B$ and $g,g':B\to C$. Then if $f\circ g=f'\circ g\implies f=f'$, then $g$ is a monomorphism. When we say every monomorphism is the kernel of its cokernel, what we mean is the following: take a monomorphism $f:A\to B$. As it is a map in an abelian category, it has a cokernel. Which means there exists an object $Cok$ to which it maps via $g:B\to Cok$ such that $g\circ f=0$. This map $g:B\to Cok$, by virtue of being a map in an abelian category, has a kernel $(D,k)$ such that $k:D\to B$ such that $g\circ k=0$. What this condition says is that $(D,k)=(A,f)$. Why is it important that $f$ is a monomorphism? Let us try to understand this with a concrete example in mind- that of the category $\textbf{Ab}$. In this setting, it all appears intuitive. Every tuple $(M,l)$ that maps to $B$ such that $l\circ g=0$ has to factor through $(A,f)$ because $(A,f)$ is exactly the kernel! There is no many-one mapping here. There are no added conditions. $(A,f)$ is exactly the kernel for the map $g:B\to C$.

We can parse “every epimorphism is the cokernel of its kernel” in a similar way- in the category $\textbf{Ab}$. Take a surjective map $f:A\to B$. Its kernel is a tuple $(C,g)$ such that $f\circ g=0$. The cokernel of $g:C\to A$ is exactly $(f,B)$, as $B=A/ker f$. This obviously is a result of the fact that $f$ is surjective (a specific form of epimorphism). Otherwise $C$ would contain, but not equal $A/ker f$, which is the cokernel of the kernel in general.

Over in all, the definition of the abelian category seems to contain some common sense requirements that most categories we encounter do fulfil. Some categories like $\textbf{Fld}$ do not, however, as they’re not even additive categories to begin with.

Ringed Spaces

Today I’m going to be talking about the structure sheaf. Suppose $\mathcal{O}_X$ is a sheaf of rings on a topological space $X$ (i.e. a sheaf on $X$ with values in the category of Rings), then $(X,\mathcal{O}_X)$ is a ringed space. What does a sheaf of rings mean? Wikipedia says all it means is that for any open set $U\subset X$, $F(U)$ is a ring. An example of a structure sheaf is the sheaf of continuous functions on an open set.

Now let us talk about $\mathcal{O}_X$-modules. What are these? You have another sheaf- a sheaf of abelian groups (we will call this sheaf $G$), over the same topological space, and then ensure that it is closed under scalar multiplication with elements of the structure sheaf. How does this make sense? We know that the union of all elements of $F(U)$ over all open sets in $X$ is not a ring. Like the union of all continuous functions on all open sets in $\Bbb{R}$ is not a ring. However, each $F(U)$ is. Hence, as long as we take elements from $G(U)$ and multiply them only with elements of $F(U)$, we are still dealing with elements of an abelian group being multiplied with elements from a ring, and then we’re fine. We have a module.

You also have a commutative diagram condition on ringed spaces. The diagram is attached below.

What does such a diagram imply? It just means that whether you multiply and then restrict, or whether you restrict (elements of both sheaves) and then multiply, the result is the same. For example, if $F$ is the sheaf of differentiable functions, and $G$ is the sheaf of continuous functions, then $G$ is an $F$ module. Moreover, the order in which we restrict and multiply is irrelevant. Hence, the commutative diagram above is satisfied.

Why is Algebraic Geometry littered with such commutative diagrams whenever new concepts are defined? What structure are we trying to preserve? We just want to say that the “same thing is going on in both places”. For instance, when we have a mapping between sheaves, we want to say that whether we map and then restrict, or whether we restrict and then map, the result will be the same. So the operations of restrictions are “the same”, modulo mapping, in both sheaves. If we identify the elements of a sheaf with their image, then they’re being restricted in exactly the same way in both sheaves. This condition is a sanity check in some way. If any element could map to any element, regardless of the way they restrict, then we wouldn’t be able to map stalks to stalks and germs to germs, and the map between pre-sheaves would never become a map between stalks. In some sense, having such a commutative diagram makes mapping between direct limits possible.

On the local description of varieties

The degree of a curve is the degree of the defining polynomial. For example, a line is a curve of degree $1$.

Let $F$ be a curve in $\Bbb{A}[X,Y]$, and $P=(a,b)\in F$. Then $P$ will be called a simple point if $F_X(P)\neq 0$ or $F_Y(P)\neq 0$. What does this condition really imply? It means that the function doesn’t locally stop changing as we move away from $x$ and $y$. It is not a local maximum, or minimum (it’s not an extremal point). In this case the line $F_X(X-a)+F_Y(Y-b)=0$ is called the tangent line. Why do we get tangent lines whose equations look like this? This is analogous to the dot product rule for writing down the equation to a plane (or any hyperplane in general). The line is the hyperplane in $\Bbb{A}^2$.

Take an equation $F$, and write it down as $F=F_m+F_{m+1}+\dots+F_n$, where $F_m\neq 0$. Here each $F_i$ is a homogeneous component. Then $m$ is the multiplicity of $F$ at $(0,0)$. What does this mean? As we approach $(0,0)$, the higher degree terms approach $0$ much faster than $F_m$. Hence, $F$ “looks like” $F_m$ near $0$. An analogy would be $0=xy+(xy)^2+(xy)^3$. Near $(0,0)$, the function “looks like” $xy=0$. Hence, it makes sense to call $m$ the multiplicity of the curve at $(0,0)$.

But how do we define the multiplicity of the curve at an arbitrary point on it? Why only $(0,0)$? We can create new variables $x'=(x-a)$ and $y'=(y-b)$, and then re-write the equation. The new equation contains the point $(0,0)$. Now we can follow the same procedure.

As $F_m$ is a function of two variables in this case, it can be factorized into a product of linear polynomials, raised to various exponents. Is it weird that the function looks like a bunch of lines near $(0,0)$? No. Locally, every polynomial, and in fact every function, looks like a linear near a point. Here, because we’re talking about varieties, and hence possibly a product of polynomials, the local description of a polynomial is a bunch of lines. Nothing weird in it at all.

Why are the factors of $F_m$ tangents at $(0,0)$? Are we taking the “local” argument too far? No. It’s a revelation though. I have never seen it before. It makes complete mathematical sense. Say we take $y-x^2=0$. Then the tangent at $(0,0)$ is $y=0$, which is in fact true.

Projective morphisms

This post is about the morphisms between projective varieties. There are some aspects of such morphisms that I’m troubled about. The development will closely follow that in Karen Smith’s “Invitation to Algebraic Geometry”.

First, say we have a morphism $\phi:\Bbb{P}^1\to\Bbb{P}^2$ such that $[s,t]\to[s^2,st,t^2]$. We will try and analyze this map.

This map has to be homogeneous: in that each coordinate has to be homogeneous, and the homogeneity has to be of the same degree. This is the only way that such a map between projective varieties can be well-defined.

Now let us talk about the mappings from affine charts. Essentially, the affine charts cover the projective space, and hence every projective variety that lives in that space. When we talk about a particular affine chart, we can reduce the number of variables by $1$. Because the value of one variable is always $1$: hence it can be neglected. However, is the image also an affine chart? That depends. In this case of $[s,t]\to [s^2,st,t^2]$, the image of an affine chart will be an affine chart. This is because $s=1\implies s^2=1$. Similarly, $t=1\implies t^2=1$.

We’ve covered all the possible points in the domain by picking out the affine charts. Hence, we have fully described the map.

A map $f$ between projective varieties is a projective morphism if for each $p\in V$, where $V$ is the domain, there exists an neighbourhood $N(p)$ such that $f|_{N(p)}$ is a homogeneous polynomial map. Is an affine chart an open set? Yes. If it is the $z$th affine chart it is the complement of the algebraic set $z=0$ in $\Bbb{P}^n$.

Let us now consider a different map: consider $V(xz-y^2)\in\Bbb{P}^2$. Let us call this curve $C$. Now consider the map $C\to \Bbb{P}^1$, defined as $[x,y,z]\to [x,y]$ if $x\neq 0$ and $[x,y,z]\to [y,z]$ if $z\neq 0$. What does this mean?

First of all, why is the option $y\neq 0$ not included? If $y\neq 0$, both $x,z\neq 0$ is implied. Hence, this case is a subcase of the two cases considered earlier. Secondly, what does it mean to map to a projective space of a lower dimension? The curve is one-dimensional. Is that the reason why we can embed it in $\Bbb{P}^1$? Probably. Note that we haven’t yet proven that this mapping is an embedding. However, this will indeed turn out to be the case.

Is this map consistent? In other words, are the two maps the same in the intersection of the open sets $x\neq 0$ and $y\neq 0$? Let us see. $[x,y]\to [xz,yz]\to[y^2,yz]\to [y,z]$. Hence, when $x,z\neq 0$, this map is consistent.

Why do we have to have such a broken up map? Why not one consistent map? First of all, mapping from affine charts seems like a systematic way to map. You can always ensure that at least one coordinate is non-zero; both in the domain and range. That is really all there is to it. Sometimes on restricting to affine charts, you write affine maps, like in the precious example. In other cases, including this one, you write a projective map. Defining the various projective maps, whether they change with affine charts or not, is of paramount importance. The affine map part is just an observation which may or may not be made.

How to map a curve $f(x_0,x_1,x_2,\dots,x_n)$ to $\Bbb{P}^1$ in general? This seems to be a very difficult question. [This](http://web.stanford.edu/~tonyfeng/Brill_Noether1.pdf) suggests that every smooth projective curve can be embedded in $\Bbb{P}^3$. That seems to be the best we can do.

For completeness, I would like to mention that the two maps given above are inverse to each other, although this is unrelated to the motivation for this article.

Projective Closures of Affine Varieties

This blog post is going to be on the projective closures of affine varieties. This too is something that has confused me for some time. I did manage to make sense of most of it eventually, but I want to still write it down for my peace of mind.

Say we have a polynomial $p(x_1,x_2,\dots,x_n)$ in affine space $\Bbb{A}^n$. We want to find the projective closure of that. How do we do it? We first homogenize it by writing it as a homogeneous polynomial $p'(x_1,x_2,\dots,x_n,z)$. Note that $z$ is the extra variable that has been added here. Then, after having sketched the zeroes of the polynomial in the $z$-th affine chart in projective space, we find the projective closure in the whole space by assuming $z=0$, and then determining what are the zeroes of the remainder of the polynomial.

Why does this work? The complement of the $z$-th affine chart is that space in projective space that the affine curve could never have reached in affine space. Remember that the whole affine curve, point by point, lies inside the affine chart. Hence, the points in the complement of the chart will contain all the “new” points of the curve. Heuristically speaking, the “new” points are all those points that the curve tends to, if it could go infinitely far, whatever that might mean.

Now how do you find the projective closure of a general algebraic set (and not one that is generated by a single polynomial)? In other words, say you have an affine algebraic set $X$. What changes do you need to make to $I(X)$ in order for $V(I(X))$ to be the projective closure of $X$? Turns out, homogenizing the generators of $I(X)$ is not the answer, but homogenizing *all* the elements of $I(X)$ is. Is is the explanation of this part that is the aim for writing this post.

Why does homogenizing the generators not work? This is because we may count extra points of projective closure. How? Take the polynomials $x+y+1$ and $x+y+2$ in $\Bbb{R}^2$. Their corresponding homogeneous polynomials are $x+y+z$ and $x+y+2z$ in $\Bbb{P}^3$. The variety corresponding to the ideal $I(x+y+1, x+y+2)$ is obviously $\emptyset$. Hence, the affine chart of $\Bbb{P}^2$ will also contain the empty set, the projective closure of which is also the empty set. However, the projective closures of both $x+y+z$ and $x+y+2z$ contain the point $x+y=0$. From this, we can see that a set of algebraic sets in affine space may all contain a set of points of projective closure that their intersection may not.

This problem is solved by homogenizing all the polynomials in $I(X)$, and then generating an ideal from it. The variety corresponding to this ideal is exactly the projective closure of $X$. Let this ideal be referred to as $\tilde{I}(X)$. Why does this work?
First of all, it is clear that the projective closure of $X$ lies inside $V(\tilde{I}(X))$. Now we need to prove the other inclusion. If we can produce a homogeneous polynomial such that the corresponding variety on the affine chart is the same as $X$, then we’ll be done. This we can do by just adding the homogenized versions of all the generators in $I(X)$, but also ensuring that the homogenized versions are of different degrees!!! The key insight behind this is that as we have a sum of homogeneous polynomials of different degrees, the corresponding variety will contain only those points that lie in the intersection of all zero sets of the individual polynomials. The affine chart will consequently only contain those affine points which lie in the zero sets of all those affine polynomials, which is precisely $X$. To complete the proof, we need to assume that each affine variety has a unique projective closure.

Such a polynomial (sum of homogenized versions, to different degrees, of all the generators of $I(X)$) does in fact exist in $\tilde{I}(X)$. Hence proved.