The loss of nuance

Dictatorship is the death of nuance

Someone’s WhatsApp status

I’m currently reading the book “Feeling Great” by Dr. David Burns, and in many ways it is one of the most important books that I’ve read. Dr. Burns is one of the pioneers and popularizers of Cognitive Behavioral Therapy, which is the pre-eminent technique used to process PTSD, past trauma, anxiety, etc. without medication. The most important technique in the book is the introduction of nuance into your thoughts. How does that go?

For instance, I had a not-so-great college experience. There are certain memories associated with it that make me cringe and go spiraling despite having been in two other institutions since. I fear encountering certain people in my future again, and go into morbid details of how those interactions will be terrible and how I will feel terrible at the end of it. The book’s approach is this: the thoughts that come to me are all justified. Those indeed were bad memories, and it might indeed be terrible to meet those people again. In fact, it is good that I am scared of such things happening, because I want to protect myself from negative experiences. Hence, my anxiety is good. However, I’ve set my “anxiety dial” at 98/100, when it should be much lower. For instance, I only had a handful of bad memories with people I spent years and years with. Hence, most memories were neutral or positive. Secondly, it is unlikely that I will meet those people again, and it is again unlikely that they will be terrible even when we meet after so many years. Hence, I should instead turn the dial down to 5/100 or something. This helps me feel better, and more in control about my situation.

Let me dissect this a little bit. My brain wanted to dump facts into two buckets. Something is either possible, or it is not. Is it possible that I meet bullies from my past again and that they give me a terrible time? Yes. It is factually possible. Then, I would start spiraling. My brain refuses to deal with the more nuanced notion that although this is possible, it is unlikely. That is what the “anxiety dial” thing does. It gives me a measure of how likely something is, which my brain is not intuitively capable of processing. It is the introduction of nuance that is the first step in dealing with trauma. However, it is not just processing our past memories that needs nuance. Nuance is needed in how we process the world around us.


I was prompted to write this article after reading Postpolitics by Leighton Woodhouse. He talks about how the news media has essentially discarded nuance to create a shock culture of news consumption. Someone thinks that hate speech should not be prosecuted? They are either a champion for free speech, or a bigot who wants the world to burn. They can only be either of those two things. This creates sharply divided opinions between the consumers of CNN on the left and Fox News on the right, and leads to further mudslinging and political polarization.

What would the introduction of nuance in this situation be like? We can introduce a bunch of dials like before. There can be a “bigot dial”, which goes from “people of all races can come into America and I will share all that I have with them” at 0/100, to “America should be reserved for White, Christian Europeans, as they clearly are the superior race” at 100/100. The person above might be at 60/100, say. We can also have a “free speech” dial, which goes from “No one should be allowed to say anything that goes against my beliefs” at 0/100, to “Everyone can say whatever they want, even if it is of a personal nature against me and my family and can cause harm to me” at 100/100. Let’s say that the person above is at 55/100, and wants freedom of political speech without fear of persecution from “the Commies”. The dial scores of 60/100 and 55/100 provide a much more nuanced view of the person, and in fact are scores that both the Right and the Left might agree on! Hence, perhaps political polarization may be alleviated greatly by the introduction of nuance.

Thinking about the “greats”

Reading Gandhi’s “My Experiments with Truth” was one of the most formative experiences of my life. I suddenly became vegan (and struggled to maintain it), tried to donate money, and resolved to become a less terrible person in general. However, I constantly had to deal with other people saying “Gandhi sucked because he probably had sexual relations with the young women in his coterie”. The truth was more complicated. He probably did ask them to lie naked with him to see if he was tempted to have sexual relations with them (he said that he did not, and felt protective towards them instead), but it is unclear if they actually had intercourse. Regardless of what actually transpired, it was morally wrong of Gandhi to do so because he was in a position of power in regard to these women, and asking them to do so, even if they consented to it, is wrong. Hence, let’s call Gandhi a lustful bastard and end the discussion. Right?

This feels wrong on the inside. Although Gandhi did do the above, and it was bad, he also did so many amazing things. He introduced nonviolent methods in India which were copied the world over to amazing effect. He fought against the religious persecution of Muslims in India, the denigration of lower castes, the cruelty of the meat industry, etc. For several years, he used to go and clean the toilets in the houses of the lower castes, if only to deny the superiority that his own higher caste gave him over others. This is pretty frickin amazing. He truly was one of the greats. Hence, instead of clubbing him into the “god” or “evil person” categories, we can instead use a “goodness” dial, and give him a 92/100 or something, acknowledging that although he did make mistakes, he also did an amazing amount of good. One may also argue that he also tried to subvert the legacies of Ambedkar and Bhagat Singh, and give him an even lower score of 70/100 or something. However, it is undeniable that the discussion around Gandhi needs more nuance.

The same goes for people like Einstein, Steve Jobs, Elon Musk, etc. Einstein was great, but look, he was wrong about Quantum Mechanics. Moreover he also had multiple affairs. Fine. But he did invent General Relativity!! That is really frickin awesome. It is a scientific discovery so unrivaled that only Newton can begin to have the same position in Science as him. And Newton spent most of his time studying alchemy. Hence, Einstein would probably get a 90/100 on the “human” dial. The same goes for Jobs being hard on his employees, Elon Musk shitposting on Twitter and making fantastical promises that he can’t deliver on, etc. Elon is single handedly changing the world with his cars, rockets and underground tunnels. He’s not just your friendly neighborhood dogecoin scammer. He probably gets a 95/100 on the “impact” dial.

The human brain is capable of much. However, it is not capable of nuance. That has to be a skill that we train ourselves in, so that we may navigate the world more easily, and have a more objective picture of reality.

Neuroscience and Indian law enforcement

Let’s face it. Raj Kundra probably does own a mobile app that uploads pornographic videos. The app is called Hotshots, and some satisfied consumers of the app, perhaps out of a sense of duty to their fellow Indians, have uploaded content from the app to multiple porn sites on the internet. Hence, Kundra’s masterpieces are freely available to be viewed for all connoisseurs of Indian creativity.

But why is any of this wrong? There are multiple issues at hand here. First of all, is consuming or producing pornography immoral? Reading the wikipedia article on Pornography in India was about as much fun as you’d expect. Let me quote a gem from the article:

In October 2018 the government directed Internet service providers to block 827 websites that host pornographic content following an order by the Uttarakhand High Court. The court cited the rape of a 10th standard girl from Dehradun by four of her seniors. The four accused told police that they raped the girl after watching pornography on the Internet

The implication is that the accused would not have raped the girl if they’d not watched pornography. Of all rapes that have ever occurred, how many are a result of the rapists having watched porn immediately before? I don’t seem to have the statistics, but probably very few. Rape has been a problem for thousands of years. Invading armies have raped whole towns and villages. In the Nanjing Massacre, which began in 1937, the invading Japanese army raped 30,000-80,000 women in and around one city. Hence, rape is probably more about a differential power structure than the availability of pornography.

But we can make an even stronger point! Pornography actually reduces rape and sexual assault.

UCLA researchers surveyed recollections of porn use among law-abiding men and a large group of convicted rapists and child sex abusers. Throughout their lives, the sex criminals recalled consuming less porn. More evidence that porn is a safety valve. Instead of committing rape and pedophilia, potential perpetrators find a less harmful outlet, masturbating to porn.

Around the millennium, partly in response to the availability of Internet porn, Japan, China, and Hong Kong relaxed laws that restricted its availability. In all three places, as porn became more easily available, sex crimes decreased.

Using Czech police records, American and Czech researchers compared rape rates in the Czech Republic for the 17 years before porn was legalized with rates during the 18 years after. Rapes decreased from 800 a year to 500. More porn, less rape.

This also ties in with our general experience. The wide availability of pornography has in fact hindered men from interacting with the opposite sex, especially in countries like Japan. The largest consumers of porn are probably incels.

Alright, so we have established that pornography cannot be blamed for an increase in sexual abuse. Is it still bad? This is a more nuanced question. I do believe that the actors working in pornographic films are often abused and traumatized. Although they agree to act in such films of their own volition, they are often left traumatized as videos of their sexual acts remain in circulation for the duration of their lifetimes, affecting their relationships with their families, etc. Hence, these videos should be up for deletion whenever desired by the actors (after say a year of being on the internet, so that the companies may recoup their costs). This affords the actors some modicum of control over their future. However, I do believe that there is nothing inherently immoral about people engaging in sexual acts or watching them, as long as everything is consensual. Any person who has read this far probably agrees with this.

Alright fine. Porn and let porn. But where do neuroscience and Indian law enforcement come into this?


I was recently listening to the audiobook of “The Hidden Spring” by Mark Solms, a reputed neuroscientist from South Africa. The book claims the brain learns through instruction and experience to preferentially activate various neural circuits, while inhibiting others. For instance, if you’ve recently been swindled of a lot of money by a close friend, you will start becoming less trusting of people. Your “suspicion” neural circuit will be much more active than your “trust” circuit, and you will probably say no if someone asks you for a loan without an adequate guarantee. It is this preferential activation of neural circuits that gives humans their individuality. For instance, your brain might preferentially activate your “trust” and “sympathy” circuits, and my brain might activate its “suspicion” and “hatred” circuits. As a consequence of this, you will probably be a more empathetic person than me and have many more friends, whilst I remain a neurotic loner.

So how is any of this related to Indian law enforcement? Well, India is the most rule-abiding country in the world. At least going by the number of rules it has. It has, by far, the longest constitution in the world, and is decked with all kinds of laws and rules and directive principles and such that are supposed to govern all aspects of our lives. It contains commonsensical laws that implore you not to kill your fellow human, and also goes into the minutiae of how to send in your registration papers for your new house to the registrar within a couple of weeks with two passport-size photos, or your registration will not be accepted unless you pay a 300-rupee fine. But this doesn’t begin to explain the genius of it. It is much more magnificent than the human brain, as you might appreciate below.

Writing laws is not too difficult. However, knowing when to apply them requires much more intelligence and creativity. The Indian law enforcement is not stupid like those in other countries, who supposedly try to apply all laws to everyone equally. They are composed solely of the best and the brightest. The artists. The Picassos of knowing whom to apply the laws to, whom to really apply all possible laws to, and whom to consider above the law and grovel in front of. The differential activation of neural circuits pales in comparison to their differential application of power.

How does it all work? Well, like I said before, India has laws for everything. It’s not really red tape. It’s a red curtain, hiding you from progress and all other material gains that may corrupt you. If you want to do anything at all in India, it is not possible to follow all the laws. If you want to open a company, some officer looking for a bribe may tell you that you didn’t follow Section %(*&^ Clause B, which asks you to take a no objection certificate from the Department of Fisheries down the road. And he’s right. Because there is indeed such a law! Because there are multiple laws for everything. And if you follow all the laws, there will be little money or motivation left for anything else.

Hence, the unwritten law of the land is that you can get past all the red tape if you know which hands to grease, and don’t piss off people more powerful than you. If you are reasonably powerful and wealthy in India, no laws apply to you. The constitution is completely irrelevant to you. You can open whatever business you want, do any kinds of transactions, and you’ll be fine. However, if you piss off anyone more powerful than you, they can soon come up with a thick booklet of rules that you’ve violated since getting on the bus that morning. And they’ll be right. You probably have violated those rules, because let’s face it, you have to violate rules to be able to breathe in the country. And now you’re screwed. You’re done for. You will be assassinated in the media, people will hate you for being anti-national, and moving to Pakistan will be the only door that is open to you.

So what happened in the Raj Kundra case? He was accused of selling pornographic videos by Poonam Pandey, who has her own pornographic app and an OnlyFans account. He was also accused of violating her contract. Soon, Sherlyn Chopra, who also has her own OnlyFans account and a pornographic app, reported him to the police for selling pornography. Recently, Zoya Rathore, who is one of the top actresses in the pornographic videos on the Hotshots app, reported to the police that she was asked to audition for the videos on the app, but she refused. She apparently did not act in them. Despite already having multiple pornographic videos of hers online on the Hotshots app. The magical realism needed to warp reality to this extent will make Salman Rushdie weep with joy.

Kundra, probably having anticipated trouble for selling pornographic content in India, ensured that the company wasn’t Indian. He would shoot content in Singapore and other countries, upload them on the app in the UK, and only then allow consumption of the videos in India. Imagine the app to be like the PornHub app. Out of Indian law enforcement’s jurisdiction. Hence, seeing as they could not prosecute him for owning the app, the Indian law enforcement charged Kundra with forcing girls to shoot pornographic videos.

How do you exactly force someone to shoot pornographic videos? There are lengthy contracts that actors have to sign before shooting begins, and they are monetarily compensated for those. Heck, those videos may sometimes have a higher quality of acting than many college drama productions. That is exactly what happened in this case as well. Hence, it is likely that none of these charges will hold up in court. However, Kundra and his family have been vilified in Indian media, and will probably have to leave the country for any semblance of sanity.

Well, law-shlaw. We need to protect Indian morality. Kundra can’t be allowed to do it. But what about the fact that every major Indian city has brothels. The Prostitution in India wiki page says the following

Prostitution is legal in India. A number of related activities including soliciting, kerb crawling, owning or managing a brothel, prostitution in a hotel, child prostitution, pimping and pandering are illegal. There are, however, many brothels illegally operating in Indian cities including Mumbai, Delhi, Kolkata and Chennai. UNAIDS estimate there were 657,829 prostitutes in the country as of 2016

So if you’re a person asking a prostitute for sex, or owning a brothel, you should be put into jail. Let’s face it. Hundreds of thousands of Indians buy sex from prostitutes every day. How many are put in jail? OK, let’s assume that catching them red-handed is difficult. But everyone knows where the brothels are! Why don’t the police just go and raze them to the ground?!

We have a unique situation at hand. Indian law enforcement is, at this very moment, aware of hundreds of thousands of people who engage in prostitution, create and produce pornographic content in India, create, distribute and screen soft-porn “B grade” movies in India, etc. However, it chooses to arrest and malign a random dude who made sure to not violate Indian pornography law by creating and uploading all of his content outside of India. Moreover, his chief accusers are other pornstars, who currently have their own active pornographic apps!

This is the genius of the Indian law enforcement. The fact that they have a bazillion laws is not the crux of it. The crux of it is the fact that if they don’t like you, or if they have orders from their political overlords, they will open their thick rule book, and find some laws that you are in violation of. You uploaded a facebook post criticizing the local MLA? Arrested under “Intent to cause disturbance”. You’re the sole witness against rioters in Gujarat? Well you didn’t file a report before taking a leave for a day 21 years back, so you’re hereby arrested. And they’ll be right each time. And you’ll hold your head in your arms and scream as loudly as you can, as they take you away in chains. And they’ll display your face to the whole world before they ruin your life. You fucking pornographer.

Music and Class in India

I was in middle school when I started listening to American music (we called it English music) and watching American TV series (on Star World, for those in the know). Soon, I was listening to a lot of Backstreet Boys, and watching a lot of Friends, Dexter, etc. It was a strange feeling. Although I enjoyed consuming all of that content, I obviously couldn’t relate to much of it. Society in India was very different. People were a lot more discreet about dating than Joey, for instance. In fact, the concept of dating as such did not really exist. People mostly just decided to get into a “relationship” right away. And the word “relationship” too was a new thing! In the generation before ours, such liaisons were called “affairs”, and were essentially looked down upon as dishonorable and irresponsible. If you were having an “affair”, you were probably bad at studies and hence didn’t have much professional hope anyway, reneging on your duties towards your family, and blowing through their hard-earned money. It was with this cultural mindset that I watched my favorite American TV characters trying to get dates with everyone in sight, laughing loudly with the soundtrack each time. It was surprisingly easy to co-exist in two contradictory worlds.

The same can be said about American music in India. I started with listening to pop, and soon progressed to classic rock like Michael Learns to Rock or the Eagles. Metal was still too weird for me, and I never quite took to it. But it was not just me. Almost everybody in my class listened to English songs, and we took pains to memorize the lyrics so that we would be able to sing along with those songs in class or wherever. A part of it was obviously an intention to signal status, or “class”, which is not exactly the same as wealth or caste. Class could, in some sense, be built by an exposure or affinity to “the West”. If I was a low-caste person in India with not a lot of wealth, I could still signal class if I knew the lyrics to a lot of American songs and knew what was up in the West. If I knew who Kirk Hammett was, for instance, I was in, my family circumstances notwithstanding. However, this was rare. Most times, caste, class and wealth would be in alliance. If you were born into a high-caste family with wealth, you were likely to be exposed to Western influences, and hence earn “class” as well.

Fine. So we all consumed a lot of American content to signal “class”. So what? Well, a natural outcome of this is that some people wanted to do this professionally. We have a very large number of rock bands in India that are still chasing the kind of fame that American rock bands see all over the world. Our film industry is full of filmmakers who were exposed mainly to western influences (and often educated in the West), and who often base their whole storylines in the “West” if their budget allows. Although the TV industry has mainly withstood the onslaught of western influences, its shows end up merely being the Indian versions of loud telenovelas that are often derided by our generation on social media. Essentially, we are invested in producing a lot of “westernized” content in India partly because it seems nice, and partly because we want to signal “class”. But this never quite catches on. There are no Indian rock bands that regularly rule the music charts. Overly westernized Bollywood movies regularly fail to recoup their investments. Hence, this strategy has repeatedly failed to produce an authentically Indian voice that can resonate with the people.

But what is the authentically Indian voice? Is it the Hindustani or Carnatic music that we sometimes hear when our calls are kept on hold? I don’t think so. I would go so far as to say that most Indians (especially those living in rural India) have never even heard much classical Indian music, if any. Carnatic music was historically cultivated in Tamil Brahmin households, and has strong caste roots. Moreover, Hindustani music has also mostly been developed in esteemed Muslim or Hindu households, and was not commodified for the plebs until very recently. If you are a lower-caste person working the fields in Madhya Pradesh, chances are you haven’t heard much of either kind of music. It’s like the French saying that caviar is representative of regular French food.

In their search for the authentic music of India, some music outfits have tried fusing Indian classical music with rock, jazz, etc. And by my estimation, they’re musically brilliant! However, they have still eluded making it big. So what is the real music of India?

There are two ways of looking at it. If you talk about reach, then the real music of India is mainstream film music. Wherever you travel in India, you are likely to be blasted with Bollywood or regional film music. In my city of Kolkata (erstwhile Calcutta), you will often experience loud Bollywood music blaring all around you. The rich and poor equally enjoy dancing to the latest chartbusters that play on TV and radio stations all day every day. Hence, the real music of India is a bastardized offspring of Eastern and Western influences, packaged together with lavish sets and dancing film stars.

The other way of looking at it is, of course, status signaling. The reason why most of us started listening to English music was to signal our “class”. However, soon listening to English music became too mainstream, and people needed an alternate way to signal class. Thus, a lot of Indian college going hipsters began celebrating Anurag Kashyap and his brand of rustic, “authentically Indian” movies like “Gangs of Wasseypur”. I’m not saying that the movie wasn’t good. It was. However, the wave of appreciation that the movie saw was clearly culturally counter-revolutionary. It was a way for the urban elites to tell the masses that they, the rarefied and gentrified, still had their feet planted firmly on the ground, and perhaps understood real India better than the frauds who were trying to appropriate their superior class by listening to English music.

As music evolves in India, people will soon find another way to signal their class. They may start listening to old European music, or perhaps even Bhojpuri music ironically. However, the real music of India will always remain Himesh Reshammiya’s chartbusters, or perhaps Badshah’s “rap”. At least for some time to come.

Some notes on Elementary Differential Geometry

I will discuss these notes on Differential Geometry.

Why study vector fields, differential forms, etc. on a manifold, when you can directly study the whole manifold itself? Well, what does studying the whole manifold mean? One might say “Just look at it! You’ll see the shape and other geometric features”. That is true. However, what the manifold “looks like” depends on how it has been embedded in the space in which you are an observer. The embedding may change the position of the manifold with respect to you, its size, etc. Also, “looking at” a manifold might be difficult in practice. If you are a human living on Earth (which I am assuming everyone reading this blog post is), how do you look at the whole Earth without taking a spaceship out of the planet? Can you still study some geometric or topological properties of the planet without shooting into space and looking down on it? Another related question is how do we study the shape and geometric properties of the universe that we are part of? We clearly can’t get out of it just yet. Is there still a way to say something meaningful about it? These questions can be answered by studying things like geodesics (paths taken by light), etc. It is this study that we wish to undertake when studying Differential Geometry. Although Einstein mainly focused on local properties of geodesics (how two parallel geodesics diverged in a small neighborhood of a point), we can also study global properties like the number of zeroes of a vector field, etc.

Why study tensors? Aren’t vectors and co-vectors enough to study a manifold? Clearly they’re not, as we don’t yet have a notion of “size”, which is imparted to the manifold by a metric. Then comes the notion of curvature, which is yet another tensor, one that quantifies how far the manifold is from being “flat”. Hence, just considering a manifold with local coordinate systems with vectors and co-vectors is not really enough. Very often we need to impose further information like a metric, etc. in order to study the manifolds that we encounter in the real world.

Why is it important to parallel transport anything? Well, imagine the path of a ball. If its velocity vector is being parallel transported along its path, then the ball is under zero net force, telling us that we are in an inertial frame of reference. However, if its velocity vector is not being parallel transported, then we know that the ball is under a non-zero net force. Hence, parallel transport tells us something fundamental about the systems under study.
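
In local coordinates, the statement above can be written out as follows. This is a sketch in standard notation that is not taken from the notes themselves: x^i(t) is the path of the ball, V^k are the components of the transported vector, and \Gamma^k_{ij} are the Christoffel symbols of the chosen connection.

```latex
\begin{align*}
% Parallel transport of a vector V along the path x(t)
\frac{dV^k}{dt} + \Gamma^k_{ij}\,\frac{dx^i}{dt}\,V^j &= 0 \\
% Special case V = dx/dt (the velocity itself): the geodesic equation
\frac{d^2 x^k}{dt^2} + \Gamma^k_{ij}\,\frac{dx^i}{dt}\,\frac{dx^j}{dt} &= 0
\end{align*}
```

The second line says that the covariant acceleration of the ball vanishes, which is the zero-net-force case. A non-zero right-hand side would play the role of a force per unit mass.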

Why do we want to work with coordinates at all, if we already have a metric? Because vector fields and other tensor fields are often expressed locally in terms of coordinates. Hence, it is often useful to also be able to work in coordinates. But why do we care about the coordinates induced by the exponential map in particular? One reason is that all Christoffel symbols are 0 at the base point (although their derivatives might not be). This makes calculations easier. But couldn’t we have created another coordinate chart where the Christoffel symbols would also be 0? Doing so by solving a system of differential equations gives us an overdetermined system. Hence, it is not clear that other such charts exist. But why did we not see the exponential map before? Why only in the case of Riemannian manifolds? There is nothing special about the exponential map if you remove the properties of the metric. It is just a map to an open subset of Euclidean space. What makes the exponential map important in Riemannian geometry is that the straight lines through the origin in Euclidean space are mapped to geodesics on the manifold. Hence, this is not an ordinary coordinate chart. This preserves important properties of the metric. However, it is not an isometry because the manifold might have non-zero curvature while Euclidean space does not.

What does it mean for two manifolds to be isometric? Does it mean that they have the same metrics in any coordinate chart? No. It just means that one manifold can be mapped onto the other diffeomorphically such that the pullback metric is the same as the original metric. On an intuitive level, I should be able to map one manifold onto the other without changing lengths, derivatives of the metric, etc. When we take a flat manifold and map it diffeomorphically to a manifold with non-zero curvature, we are changing the metric. Hence, the map is not an isometry.
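
The pullback condition can be spelled out as a short formula. This is a sketch in standard notation, and the symbols are assumptions rather than the notes' own: (M, g) and (N, h) are the two manifolds, \varphi : M \to N is the map, and x^i are local coordinates on M.

```latex
\begin{align*}
% Coordinate-free form: the pullback of h equals g
(\varphi^{*}h)_p(v, w) := h_{\varphi(p)}\big(d\varphi_p\, v,\; d\varphi_p\, w\big) &= g_p(v, w) \\
% The same condition in local coordinates
\frac{\partial \varphi^a}{\partial x^i}\,\frac{\partial \varphi^b}{\partial x^j}\,h_{ab}\big(\varphi(x)\big) &= g_{ij}(x)
\end{align*}
```

The coordinate form makes it clear why the components of the two metrics need not match chart by chart: they only have to agree after being transported through the Jacobian of \varphi.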

Why do we use these fancy exponential functions to construct bump functions? We can construct an infinite number of differentiable functions that are 0 in some domain and 1 in another. However, these bump functions are also smooth, and it is difficult to construct smooth functions with these properties. That is why we use these fancy e^{-1/x} functions. Why do we want partitions of unity anyway? This is so we can use local coordinates to do our calculations. An analogy is that if we want to measure the weight of a heavy object with a small weighing machine, we can break it up into smaller parts, weigh each of those parts individually, and then add those weights up.
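
To make the e^{-1/x} construction concrete, here is a minimal sketch in Python. The function names and the choice of the intervals [-1, 1] and (-2, 2) are my own assumptions for illustration. This is the standard textbook recipe, not necessarily the exact one in the notes.

```python
import math

def f(x):
    """Smooth on all of R: exp(-1/x) for x > 0, and identically 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def smooth_step(x):
    """Smooth function that is 0 for x <= 0 and 1 for x >= 1.

    The denominator never vanishes, because x > 0 or 1 - x > 0 always holds.
    """
    return f(x) / (f(x) + f(1.0 - x))

def bump(x, inner=1.0, outer=2.0):
    """Smooth bump: equal to 1 on [-inner, inner], equal to 0 outside (-outer, outer)."""
    return smooth_step((outer - abs(x)) / (outer - inner))

for x in (0.0, 1.0, 1.5, 2.0, 3.0):
    print(x, bump(x))
```

The bump is exactly 1 near the origin and exactly 0 far away, with a smooth transition in between. Multiplying a locally defined function by such a bump is what lets us extend it by zero to the whole manifold.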

What does a differentiable structure even mean? We’d only heard of differentiable functions before this. Well, a differentiable structure is a bunch of coordinate charts with differentiable transition functions. But who cares about transition functions? What about the interiors of those charts? The interiors look like open subsets of Euclidean space. Hence, they’re “smooth” anyway. Why do we care for transition functions anyway? When we’re switching coordinate charts, the representation of a function also changes. But if a function is smooth in one coordinate chart, we want it to remain smooth in another. Hence, the transition functions need to be smooth as well.

Why is it important to make the space of vector fields closed under Lie bracket multiplication? One answer is that we can now calculate things like the number of generators, which might help us in classifying manifolds. Another way of seeing it is that [X,Y] is the Lie derivative of one vector field with respect to another. Hence, we want the set of vector fields to remain closed under differentiation, much in the same way that the set of functions (all differential forms, in fact) remains closed under differentiation.

What does the Jacobi identity mean? It just means that bracketing with a fixed vector field respects the Lie bracket: the map Y \mapsto [X,Y] is a derivation of the bracket, and X \mapsto [X,\cdot] is a homomorphism of Lie algebras. Why is it important for us to have homomorphisms? It ensures that the image also has the same algebraic structure as the domain. But why is that important? Isn’t it too strict a condition? This answer says that homomorphisms preserve algebraic structure, and that is good. But why is it good? One of the main purposes of homomorphisms is classification. We want to tell two algebraic structures apart. One way to do that is to ask in how many ways I can map one algebraic structure onto the other such that the algebraic structure of the former is preserved. If no such map is invertible, then the two structures are different. Hence, a homomorphism helps in comparing algebraic structures. The Jacobi identity helps in making \theta(X) a structure-preserving automorphism. This is not true for group actions in general. Why was it important to do it in this instance? Maybe mathematicians saw that this was true for vector fields, and wanted to impose it on all Lie algebras. I’m not sure.
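
One way to get a feel for the Jacobi identity is to check it by hand on a small example. The sketch below uses the commutator of 2x2 integer matrices as a stand-in for the Lie bracket of vector fields. That substitution is my assumption for illustration (matrices under the commutator are a standard example of a Lie algebra), and the three sample matrices are arbitrary.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba, standing in for the Lie bracket."""
    return sub(matmul(a, b), matmul(b, a))

# Three arbitrary integer matrices (hypothetical sample data).
X = [[1, 2], [3, 4]]
Y = [[0, 1], [-1, 5]]
Z = [[2, 0], [7, -3]]

# Cyclic form: [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0
cyclic = add(add(bracket(X, bracket(Y, Z)), bracket(Y, bracket(Z, X))),
             bracket(Z, bracket(X, Y)))

# Derivation form: [X,[Y,Z]] = [[X,Y],Z] + [Y,[X,Z]]
lhs = bracket(X, bracket(Y, Z))
rhs = add(bracket(bracket(X, Y), Z), bracket(Y, bracket(X, Z)))

print(cyclic)      # every entry is zero
print(lhs == rhs)  # True
```

The derivation form is the one that looks like a product rule: bracketing with X distributes over the bracket of Y and Z, just as a derivative distributes over a product.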

What does “covariant” mean here? When coordinates change, some mathematical objects have an easier law of transformation, whilst others have a much more difficult law. For instance, vectors need just be multiplied with the Jacobian matrix, and they’re good to go. However, second derivatives of functions have a much more complicated law of transformation, for instance. The notion of derivatives given above has a “covariant” law of transformation, and that is why it is called a covariant derivative.
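To see the difference between the "easy" and the "difficult" transformation laws, here is a sketch for a change of coordinates x \mapsto y(x). The index names are my own choice, and repeated indices are summed.

```latex
% Vector components: a single multiplication by the Jacobian matrix.
\[
  \tilde{v}^{a} \;=\; \frac{\partial y^{a}}{\partial x^{i}}\, v^{i}
\]
% Second derivatives of a function f: differentiating
% \partial f/\partial y^{a} = (\partial x^{i}/\partial y^{a})\,\partial f/\partial x^{i}
% once more produces an extra, non-tensorial term.
\[
  \frac{\partial^{2} f}{\partial y^{a}\,\partial y^{b}}
  \;=\;
  \frac{\partial x^{i}}{\partial y^{a}}
  \frac{\partial x^{j}}{\partial y^{b}}
  \frac{\partial^{2} f}{\partial x^{i}\,\partial x^{j}}
  \;+\;
  \frac{\partial^{2} x^{i}}{\partial y^{a}\,\partial y^{b}}
  \frac{\partial f}{\partial x^{i}}
\]
```

The second term is what spoils the simple law, and it is the kind of term that a covariant derivative is designed to absorb.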

Let us actually take a small detour to study the Wikipedia article on covariant derivatives.

So Christoffel symbols were first used to denote curvature, and not differentiation? That would make sense. People were looking for ways to measure curvature. They already knew how to differentiate. It is changing our conceptions of the more “elementary” notions that takes more time.

“Covariant” means independent of the coordinate system. How is that different from invariant? Well, invariant means absolutely unchanging in any coordinate system. For example, the number 2 remains the same number regardless of whether I am looking at it or you are. However, the velocity of a ball does indeed change depending upon the observer. It might be moving to your left, for instance, and to my right. “Covariant” implies that the law of transformation is “simple”, and involves just a multiplication with a matrix. Although it isn’t “invariant”, it is the next best thing. Things don’t change too wildly. How can we have the same \nabla_X Y expression for two different coordinate systems? How do we know that the two different coordinate expressions are “really” the same? The fact that one can be converted into another isn’t enough of a reason. Simplicity has nothing to do with anything. Well, they give us the same observables. If we were to measure the length, acceleration, or any other quantity based on those two different coordinate expressions, they would be exactly the same. An analogy is that although there is no clear way of saying that the moon that I’m watching in the sky is the same as the moon that you’re watching, all the properties of the moon that we can observe like the color, size, etc will be the same. Hence, that is one way of proving that we’re indeed looking at the same moon.

How do regular coordinate derivatives change? When we use the right transformation matrices, everything transforms smoothly (although perhaps not linearly). However, there is no invariant way to write a derivative. For instance, \partial_i f \partial_i does not denote the same object in different coordinate systems. Is that all that this has been about? Being able to write an object without referring to coordinates? Why can’t we just write V, without referring to any coordinate system? Well, we want to be able to write down the formal expression in terms of coordinates, without writing down which coordinate system we are referring to. But why is that important? Writing down a quantity in coordinates is akin to writing down a quantity for an observer. Having something in formal coordinates but without reference to a particular coordinate system implies that we can now write down physical laws (which have to be written in coordinates because physical laws correspond to observers and reference frames) without singling out any coordinate system. This is what it has all been about. Writing physical laws.

Why are we concerned with transformation laws at all? Who cares how objects transform when changing coordinate systems? Well a vector transforms a certain way, and a co-vector transforms another way. Then come (k,l) tensors, which transform in completely different ways. How do we classify them? Well, k indices transform like vectors, and l indices transform like co-vectors. Hence, we may think of them as the tensor product of k vectors and l co-vectors. Hence, we are able to classify mathematical objects based on how they transform. This becomes even more pronounced in Physics, when certain tensor products of spinors transform as vectors, and that is why we treat them as vectors. But, still, who cares how objects transform? Don’t we care what objects actually “are”? But what really “is” an object, like say a vector? Should it feel hot to the touch, or perhaps denote velocity or acceleration or something of that kind? No. It is just a mathematical object that satisfies certain rules like the transformation laws. Hence, we can classify all objects that satisfy all of those rules and laws as vectors.
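Written out, the classification above looks as follows. This is a sketch with my own index conventions: x are the old coordinates, y the new ones, and repeated indices are summed.

```latex
% A (k,l) tensor: each upper index picks up a factor \partial y / \partial x
% (like a vector), each lower index a factor \partial x / \partial y
% (like a co-vector).
\[
  \tilde{T}^{a_1 \dots a_k}{}_{b_1 \dots b_l}
  \;=\;
  \frac{\partial y^{a_1}}{\partial x^{i_1}} \cdots
  \frac{\partial y^{a_k}}{\partial x^{i_k}}\;
  \frac{\partial x^{j_1}}{\partial y^{b_1}} \cdots
  \frac{\partial x^{j_l}}{\partial y^{b_l}}\;
  T^{i_1 \dots i_k}{}_{j_1 \dots j_l}
\]
% Special cases: (1,0) is a vector, (0,1) is a co-vector.
```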

How is “covariant” different from “invariant coordinate representation”? It’s very different. “Covariant” just refers to the fact that we have a vector here, and not a co-vector.

The “coordinate grid expands and twists” portion explains why we have Christoffel symbols. But expands and twists with respect to what? With respect to Euclidean space. We have taken Euclidean space to be the epitome of “flatness”, and any coordinate system that twists and turns with respect to it can be said to have non-zero Christoffel symbols. This is not just arbitrary. Twisting and turning with respect to the Euclidean space can be measured in the form of a non-inertial force in Newtonian mechanics. Of course things are slightly more complicated in General Relativity.
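A standard example of a grid that expands and twists relative to the Euclidean one is polar coordinates on the flat plane. The computation below is a sketch using the usual Levi-Civita formula. The space is still flat, but the Christoffel symbols are non-zero.

```latex
% Flat metric in polar coordinates (r, \theta):
\[
  g = dr^{2} + r^{2}\, d\theta^{2}
\]
% Levi-Civita formula:
\[
  \Gamma^{k}{}_{ij}
  = \tfrac{1}{2}\, g^{kl}
    \bigl( \partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij} \bigr)
\]
% Non-zero symbols (all others vanish):
\[
  \Gamma^{r}{}_{\theta\theta} = -r ,
  \qquad
  \Gamma^{\theta}{}_{r\theta} = \Gamma^{\theta}{}_{\theta r} = \frac{1}{r}
\]
% In Cartesian coordinates (x, y) on the same plane every \Gamma^{k}{}_{ij} is zero.
```

So non-zero Christoffel symbols can come purely from the choice of coordinates, which is consistent with them not being tensors.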

What does the covariant derivative have to do with parallel translation? Parallel translation just means that things are moving in a “straight line”, as they should. Moreover, they’re moving in a straight line in all possible coordinate systems. Hence, this is a true fact about the universe, and is not observer dependent.
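In coordinates, the condition of being parallel translated along a curve can be sketched as below. The curve \gamma(t) and the field V along it are my own labels.

```latex
% V is parallel along \gamma if its covariant derivative in the
% direction of the velocity \dot\gamma vanishes:
\[
  \nabla_{\dot\gamma} V = 0
  \quad\Longleftrightarrow\quad
  \frac{dV^{k}}{dt} + \Gamma^{k}{}_{ij}\, \dot\gamma^{i}\, V^{j} = 0
\]
% Taking V = \dot\gamma gives the "straight line" (geodesic) equation:
\[
  \ddot\gamma^{k} + \Gamma^{k}{}_{ij}\, \dot\gamma^{i}\, \dot\gamma^{j} = 0
\]
```

When all the \Gamma's vanish (Cartesian coordinates on flat space) this reduces to \ddot\gamma = 0, the usual straight line.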

Let us now get back to the notes.

Why do we need to induce a connection on the submanifold at all? Well, the Earth is a submanifold of the universe. We are only concerned with the velocities and accelerations of balls with respect to the connection induced on the Earth, and not the overall velocity or acceleration vectors of the ball. Hence, there is value to be found in inducing a connection on submanifolds.

Why are transformation laws written like \frac{\partial x_i}{\partial y_\alpha}, etc? That is a succinct way to represent a column in the Jacobian matrix. I am surprised that Christoffel symbols don’t transform like tensors though. I only now realize that Christoffel symbols are not tensors.
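For reference, here is the standard transformation law of the Christoffel symbols, written as a sketch with x the old coordinates and y the new ones (index names are my own choice).

```latex
\[
  \tilde{\Gamma}^{c}{}_{ab}
  \;=\;
  \frac{\partial y^{c}}{\partial x^{k}}
  \frac{\partial x^{i}}{\partial y^{a}}
  \frac{\partial x^{j}}{\partial y^{b}}\,
  \Gamma^{k}{}_{ij}
  \;+\;
  \frac{\partial y^{c}}{\partial x^{k}}
  \frac{\partial^{2} x^{k}}{\partial y^{a}\, \partial y^{b}}
\]
% The first term is what a (1,2) tensor would do.
% The second term, with second derivatives of the coordinate change,
% is the reason the \Gamma's are not tensors.
```

The inhomogeneous term vanishes only for linear changes of coordinates.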

This is all for today. I hope to upload notes from the next few chapters later in the week.

Status signaling in academia

This is how math grad students talk:

I don’t really understand this very simple concept. What is the essence of this object, and why was it needed at all? I perhaps need to construct ten different examples or think of alternate definitions before I can successfully narrow down what this means.

No. Not really. This is how math grad students (including me) talk:

Oh you’ve heard of cohomology, but what about quantum cohomology? This *insert name* French mathematician has done some amazing work in this regard, and here’s a fairly advanced book that discusses it.

Of course this changes with time. As grad students become more competent in the latter half of their PhDs, conversations like these become more rare. However, there are still a lot of big words thrown around without caution.

Almost all of this can be explained by status signaling. Graduate students work in fairly isolated and non-overlapping research areas, and are hence free from the academic competition that their undergrad experiences entailed. However, there is still an intrinsic human need for them to signal to each other their relative intelligence and their imagined positions in the status hierarchy. And what better way to do this than to lob some polysyllabic words from their research fields, waiting and hoping for their audience to become suitably impressed, who in turn are getting ready to lob some long words of their own.

However, status signaling explains much, much more than just grad school conversations in lowly student bars. Robin Hanson claims that it explains almost all of modern human society. The more I think about this article, the more true it rings. The post below is also majorly influenced by the genius of this article by Freddie deBoer.

Status signaling in academia

If you’re a researcher in the United States or any other “First World” country, how would you signal your status as an intelligent and capable scientist? You would try to discover something new, create a new technology, or perhaps prove famous scientists wrong. Note that these do run the gamut of almost all avenues still open to researchers in these countries to signal their superior positions in the status hierarchy. They do really have to create something new.

However, if you are a researcher in the “Developing World”, perhaps in a country like India, things are slightly different. You can of course signal status by creating a brand new technology, or perhaps inventing a paradigm-shifting theory. However, you can also earn status by being proficient at the newest fields and technologies that were only just created in the “First World”, and that your other colleagues are too slow or stupid to understand. Given below are some conversations that the author has created out of thin air:

Are you aware of Machine Learning? Oh it is so interesting. There was recently a paper in the journal Nature on how it has come to beat humans at Chess and Go. We are using this esoteric kind of unsupervised learning in our lab to harvest data on genes.

You must have heard of Bitcoin. But have you heard of Blockchain? It is the technology that all cryptocurrencies are based on. I have included a module on it to teach my students in the Financial literacy course, and also regularly lecture corporations on their importance. Cryptocurrencies are the future, and it is a shame that our country doesn’t understand it yet.

Oh you’re interested in learning String Theory? Well the first thing that you have to do is read the latest paper by Edward Witten. Oh you can’t understand the Math in it? Well, keep trying, and one day you will. I believe that the math used in that paper should be taught in elementary school itself.

The last conversation is real. The former Physics grad student (who then quit Physics to completely change fields) was perhaps trying to signal his own intelligence by saying that the latest paper by Witten was easy to read. It is in fact highly advanced, and would perhaps take most Physics or Mathematics researchers many months if not years to understand. Definitely not elementary school material.

Do we see a pattern? If I am a researcher working in India, I don’t really need to create whole new technologies or paradigms. A much easier way is to just import those paradigms from the “West”, become proficient in them before others (or at least in throwing around the relevant buzzwords), and consequently signal that I’m smarter and a more capable researcher than my colleagues. Of course other ways of signaling this are writing more papers than my colleagues, having my papers published in better journals, having a higher h-index, etc. Although the best way to signal status is still creating something brand new, the other ways are just so much easier that the law of “least work” precludes the former from ever happening in the developing world.

I recently read the following comment on a substack article (that I cannot recall):

Chinese and Indian research is basically a paper mill. Let’s get real, nothing of value ever gets produced there.

As an Indian researcher, I felt bad upon reading this. However, this did ring true in significant ways. Although scientific advances do come out of India from time to time, nothing is usually big enough to “hit the headlines”. Of course there is one major exception: the claim by IISc scientists to have achieved superconductivity at room temperature for the first time in history. No one was surprised when this was proven to be a fraudulent claim. Another such claim doing the rounds these days is a Nature article published by an NCBS lab, which also turned out to have fraudulent data. One of the best research labs in the country willfully manipulated data to have their paper published, wasting taxpayers’ money and further reducing trust in Indian research.

This contrasts with my experiences as a student in India. My classmates and colleagues were some of the smartest people I’ve ever met, and I continue to correspond with and learn from them. Why is it that a country with so many intelligent and hard working people is not capable of creating a good research culture that can contribute something meaningful to the world? I think that it is a case of misplaced incentives.

As researchers, our main incentive is discovering new truths about the world around us. However, an equally important (if not more important) incentive is signaling to others that we are intelligent. And by the law of “least resistance”, we want to find the shortest and easiest path to do so. Being well versed in the latest “western” theories and technologies is a much easier path than actually creating something from scratch. Hence, we inevitably choose that path: setting up whole labs devoted to reinforcement learning, creating a research group on String Theory in all major universities, etc.

Status signaling in other domains

One form of status signaling that is often on display is between researchers and entrepreneurs. When Elon Musk invented Neuralink for instance, lots of neuroscience researchers gave interviews in which they said that Musk had not created anything new, and was merely mooching off of their research that had been in the public domain for many years. Musk, on his part, emphasized that writing papers that no one reads is the easy part, and actually engineering products and bringing them to the world is the much harder part, that only he purportedly had done. Hence, researchers and entrepreneurs often engage in status battles.

Another form of status battle takes place between different economic strata. For instance, slightly lower earning professions (like researchers, bureaucrats, etc) are often engaged in status battles with high earning professions like bankers, tech workers, etc. The “we don’t get paid very much, but we are smarter and do what we love” refrain is often heard from researchers who actively hate their lives under the aegis of university bureaucracies, but want to signal a higher status. Of course the “how come you are smarter if I am the one earning much more money” retort is then in turn heard from bankers and consultants, who lead an overworked and sometimes miserable existence in order to be able to signal a higher status.


Robin Hanson claims that most of what we do is status signaling. I want to strengthen this claim by saying that almost all of what we do is status signaling. We don’t really want to understand the world. We want to be perceived as understanding the world, or at least as curious about the world. We want to signal our good looks by recalling stories of people expressing interest in us, our intelligence by talking about reading books and studying in reputed colleges (some take it too far and discuss IQ test scores), our virtues by talking about the disadvantaged and how we have stepped in to help them, etc. Very often, this leads one away from actually trying to understand the world, helping the disadvantaged, etc.

Of course, writing this post itself is an attempt to signal my status. I’m trying to prove that I’ve caught on to other people who indulge in status signaling, and that I myself am above all this. However, it would also be of immense value for me if I’m able to figure out a way to escape taking part in status battles with the people in my life. And if I remain in academia, it would enrich my life to no end if I’m able to pursue my curiosity without indulging in status games with the rest of the researchers in my field. Here’s to hoping.

CR-Invariant powers of the sub-Laplacian-I

Today I will be reviewing the paper “CR-invariant powers of the sub-Laplacian” by Rod Gover and Robin Graham in order to hopefully understand it better. I will post images from the original paper, and then write an explanatory commentary below.

Why do we care about powers of laplacians? Because they naturally come up when we talk about obstructions to harmonic extensions, as can be seen in the GJMS paper. For each k, the weight of the density has to be \frac{n}{2}+k. This gives us the obstruction to harmonic extension.

How is the Fefferman metric relevant here though? Well if we recall the GJMS paper, it constructs these powers of the laplacian based on the Fefferman metric itself. That is how it becomes relevant. This is of course the Fefferman-Graham metric, that the co-author of this paper modestly refuses to put his own name to.

What does CR invariant mean? This will be explained at greater length below.

Why are invariant differential operators important? This is the wiki article on invariant differential operators. I would like to discuss some examples from this article.

  1. Why does \nabla have to be rotationally invariant?
  2. The exterior derivative d changes its representation with coordinates. What does it mean to be invariant? The commutative property with respect to group transformation is being referred to here.

Why is k\leq N/2 (for N even) an important requirement? Because the kth coefficient of \tilde{g} will be needed to define \Delta^k, and \tilde{g} is not uniquely defined for k>N/2.

Why is the conformal structure on a circle bundle, and not a cone? I’m not sure. I suppose we will have to refer to the original paper to get an idea of why we have a circle bundle and not a cone.

What is the root of a bundle? An nth root of a bundle X is a bundle Y such that Y^{\otimes n}=X. For instance, \Bbb{R}_{\Bbb{R}} and i\Bbb{R}_{\Bbb{R}} are square roots of \Bbb{R}.

Why do we need densities anyway? Why do we need homogeneous (in t) extensions of functions? I think that we are pre-empting how mathematical objects will change with a change in metric. But functions don’t change their value when there’s a change in metric! I think that we might be preserving a scalar property of these functions….like the integral. Hence, when we are dealing with “conformal quantities” that may change with a conformal change in metric, we want to preserve scalar quantities that might become observables (in Physics). Hence, we scale other quantities appropriately with the metric.
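For what it is worth, here is how the “homogeneous in t” condition is usually written in the conformal setting, which I am assuming is the right analogue to keep in mind here. The symbols \tilde{f}, s, w and \Omega are my own labels for this sketch.

```latex
% Points of the metric bundle are pairs (x, t^{2} g(x)) with t > 0.
% A density of weight w is a function \tilde{f} on that bundle which is
% homogeneous of degree w under rescaling of the metric:
\[
  \tilde{f}\bigl(x,\, s^{2} g\bigr) \;=\; s^{w}\, \tilde{f}(x, g),
  \qquad s > 0 .
\]
% Choosing a representative metric g picks out an ordinary function
% f(x) = \tilde{f}(x, g(x)). Under a conformal change
% \hat{g} = \Omega^{2} g the representative rescales as
\[
  \hat{f} \;=\; \Omega^{w} f .
\]
```

So a density is a function together with a rule for how it rescales when the metric does.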

Why do we care about the operators P_k? They are just supposed to be powers of the Laplacian in the appropriate sense, which can be recovered from the obstruction tensor, right? Well it’s technically tf_{g}(\tilde{\Delta}^k_{TM}), but yes, it is close enough. Note that \Delta^k has nothing to do with P_{k}. We are concerned only with powers of the ambient laplacian. \tilde{\Delta}^k|_{G} is independent of the extension of \tilde{f} that is chosen, and \tilde{\Delta} has an obstruction at order Q^{k-1} such that Q^{1-k}\tilde{\Delta}|_{G} is independent of the extension of \tilde{f} mod Q^{k}. What about dependence on \tilde{g}? We’ve already taken care of that by choosing k appropriately. There is no dependence on the chosen extension of \tilde{g}. Why is it important not to have dependence on the chosen extension of f? We are saying, in some sense, that we are defining data on the manifold in the form of g and f, and we don’t want to make any other choices. And only with this data on the manifold, we should be able to define well-defined conformal invariants. Hence, it is important to construct quantities that don’t depend on the extensions of \tilde{g} and \tilde{f}.

How can \tilde{\Delta}^k|_{G} and Q^{1-k}\tilde{\Delta}|_{G} ever be related? In some sense, the former is a multiple of the (k-1)th order term of the latter’s Taylor expansion in terms of Q. However, it is important to note that this is just a heuristic explanation, and that \tilde{\Delta}^k and \tilde{\Delta} agree only on G.

Although the introduction continues after this, we will now jump to the next section of the paper.

Why do we study CR structures? Even dimensional manifolds can easily be complexified. However, this is difficult to do with odd dimensional manifolds. CR structures are the next best thing: odd dimensional manifolds can now be studied as hypersurfaces in larger complex manifolds. This allows us to import the heavy duty machinery of complex analysis in order to study these manifolds, perhaps study some invariant properties of theirs, and consequently classify them.

Why do we deal with co-vectors like \Lambda, and not just the tangent sub bundles? This is because differential forms lend themselves easily to cohomology, while vector fields don’t. Hence, we do so in order to use the machinery of cohomology.

What does \Lambda^{1,0} look like? One may imagine it to be n+1 complex dimensional, as it is “orthogonal” only to \overline{L}. What does a complex weight mean? I suppose that it is a formal object that lends itself to manipulation in this paper (this is probably the most vacuous sentence I’ve written in the recent past).

Let us take a small detour into CR manifolds for a bit. We will be discussing the Wikipedia article on these manifolds.

Why do we complexify tangent spaces? This is because complexification is the only way that allows us to define holomorphic forms on the manifold. We also need to complexify the tangent bundles of complex spaces in order to define holomorphic forms on them. The complexification need not mean anything geometrically. It is just a formal tool.

Why do we need preferred distributions? Preferred distributions span the holomorphic vector fields that can be defined on the manifold. How do we know which vector fields are holomorphic? In the simple example of the manifold being embedded inside a larger complex manifold, the answer to this question is clear. The restrictions of holomorphic vector fields to the CR manifold form the preferred distribution L. Note that L also needs to be integrable because we want the manifold M to be a leaf, in case the larger complex manifold were foliated with respect to the sub bundle L (assuming of course that L is defined over the whole complex manifold, and not just on M).

What is important to note here is that the L sub bundle of S^3\subset \Bbb{C}^2 does not consist of all possible holomorphic vector fields. It only consists of those vector fields that are “tangent” to it. But how can complex vector fields be tangent to S^3? Aren’t all vector fields over real numbers? We define tangency not by a visual picture of asymptotically touching a manifold, but algebraically. By that definition, the complex vector field mentioned above is tangent to S^3, i.e. it lies in the complexified tangent space of S^3. It’s not that only the vectors in the real tangent space of S^3 are “really” tangent to it. Tangency is now fair game for all vectors, real or complexified.
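The algebraic notion of tangency can be checked by hand for S^3. The following is a sketch with my own choice of coordinates (z,w) on \Bbb{C}^2 and defining function \rho. The vector field I write down is a standard choice and may differ from the one in the article by a scalar factor.

```latex
% S^3 = \{ (z,w) \in \mathbb{C}^2 : \rho = 0 \}, \qquad
% \rho = z\bar z + w\bar w - 1 .
% A (1,0) vector field Z = a\,\partial_z + b\,\partial_w is "tangent"
% to S^3 when it annihilates the defining function:
\[
  Z\rho \;=\; a\,\bar z + b\,\bar w \;=\; 0 .
\]
% One solution, unique up to multiplication by a function:
\[
  L \;=\; \bar w\,\frac{\partial}{\partial z} \;-\; \bar z\,\frac{\partial}{\partial w},
  \qquad
  L\rho \;=\; \bar w\,\bar z - \bar z\,\bar w \;=\; 0 .
\]
```

No picture of “touching” is needed: L\rho=0 is the whole definition, and it makes sense for complex coefficients.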

But now that we’re considering both real and complexified tangent vectors of S^3, won’t that double the dimension of the space of tangent vectors? Yes and no. Yes, over the reals, and no over \Bbb{C}. Also, note that L here is just one (complex) dimensional, and consequently so is \overline{L}. Hence, L\oplus \overline{L} does not span the complexified tangent bundle of the manifold. There is always an extra complex dimension left.

How can it be a metric on L when it is a 2-form? Well we choose two vectors in L, and then make one of them anti-holomorphic. How can we form a metric on a sub bundle of the (complexified) tangent bundle though? Shouldn’t it have been defined over the whole tangent bundle? That needn’t be the case. Metrics are mostly used to measure volumes. If all our sub manifolds will only have these holomorphic vector fields spanning their tangent spaces, we might as well define a metric only on this sub bundle. However, it is true that this metric will fail to give us the volume form of the whole space.

By focusing only on L, are we missing out on the geometry of the whole of M? Well we are yet to meet the Levi form. That will pick up the slack.

Why do we want 2-forms to take values in vector bundles? Why not just \Bbb{R} or \Bbb{C}? Because we want to generalize the notion of a function to a section of a vector bundle. Hence, it is only natural that we have forms that take values in that vector bundle.

The Levi form maps pairs of vectors in L to the complement of L\oplus \overline{L}. In a sense, we can construct the whole complexified tangent bundle using just L and h, where h is the Levi form.
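Here is a sketch of how the Levi form picks up the missing direction on S^3\subset \Bbb{C}^2. The coordinates (z,w) and the particular generator L of the preferred distribution are my own choices, and the normalizing constant in front of the bracket is a convention that I am leaving out.

```latex
% Generator of the preferred distribution on S^3 (assumed choice):
\[
  L = \bar w\,\partial_z - \bar z\,\partial_w ,
  \qquad
  \overline{L} = w\,\partial_{\bar z} - z\,\partial_{\bar w} .
\]
% Up to a constant, the Levi form is h(L,L) = [L,\overline{L}] \bmod L \oplus \overline{L}.
% Using L(z) = \bar w,\; L(w) = -\bar z,\;
% \overline{L}(\bar z) = w,\; \overline{L}(\bar w) = -z:
\[
  [L,\overline{L}]
  \;=\;
  \bigl( z\,\partial_z + w\,\partial_w \bigr)
  -
  \bigl( \bar z\,\partial_{\bar z} + \bar w\,\partial_{\bar w} \bigr) .
\]
% The (1,0) part z\,\partial_z + w\,\partial_w is not a multiple of L
% at any point of S^3, so the bracket is non-zero modulo L \oplus \overline{L}.
```

So the bracket of L with \overline{L} leaves L\oplus \overline{L} and lands in the one extra complex dimension.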

Let us now get back to the paper.

What does the real part of a vector space mean? Real coefficients? Yes. Take all the coefficients, and consider only their real part. For instance, the real part of \frac{\partial}{\partial z}_{\Bbb{C}} is \{ \frac{\partial}{\partial x},\frac{\partial}{\partial y}\}_{\Bbb{R}}. In practice, it is found by z\to \frac{z+\overline{z}}{2}.

What is happening with the dimensions? We start with a 2n+1 dimensional manifold, perhaps embedded inside a 2n+2 real dimensional space. After we complexify the tangent bundle of the space, we get a 2n+2 complex dimensional space, in which L is n (complex) dimensional. The real part of L has real dimension 2n (the imaginary part gives the same 2n real dimensional space, so L itself has real dimension 2n, not 4n). Hence, having complex dimension 1 over \Bbb{C} is equivalent to having real dimension 2 over \Bbb{R}. The complexification of the 2n+2 real dimensional space consequently gives us a 4n+4 real dimensional space over \Bbb{R}.

Is H closed under the action of J? Yes. Although L might not be closed under J, H certainly is. How does that impose an orientation on H? Well it imposes an orientation on pairs of vectors, and that is all that is needed to impose an orientation on the whole even dimensional H.

What is the contact form exactly? It is a form (not a vector) which is orthogonal to H. It is part of T^*M (without need of complexification), and there are multiple such choices possible (scalar multiples, etc). Where does d\theta lie? Both \theta and d\theta can be written in terms of real coefficients and \{dx^i,dy^j\}. However, clearly 2i d\theta\in \Bbb{C}T^*M\wedge T^*M. Its action is restricted to T^{1,0}. Moreover, T is a vector field in the complexified tangent bundle such that \theta(T)=1 but d\theta(T)=0. Seeing as \theta is orthogonal to H (and hence T^{1,0}), T can have a maximum of n+1 unknown coefficients, and that’s exactly the number of equations we get from above. Hence, we can determine T uniquely. Again note that T is a complex vector field, and may not lie inside TM.

\theta^{\alpha} are again in \Bbb{C}T^*M. Why couldn’t we just have defined them as generating the dual space of T^{1,0}? Because then they would not have been in the dual space of T^{0,1}, which they are.

How do we get that formula for d\theta? -2id\theta gives us 2h_{\alpha\overline{\beta}}\theta^{\alpha}\wedge \theta^{\overline{\beta}}. This does give us a restriction to T^{1,0}, although the factor of 2 looks arbitrary.

What does the last formula mean? T is outside of T^{1,0}. \zeta is an n+1 form. Hence, the two contractions given on the right hand side give us n forms in T^{1,0} and T^{0,1} respectively. Remember that we had a free choice of \theta. Hence, we can impose this restriction on the pseudohermitian form in order to get this succinct volume form (based on a prior choice of \zeta, of course).

What does “depends only on CR structure” mean? It means that the pseudohermitian form does not depend on \lambda. That is clearly the case with this definition. What does \mathcal{E}(1,1) mean? It is a vector bundle of the form \mathcal{E}(1)\otimes \overline{\mathcal{E}(1)}. Isn’t the bold \theta still just a section of T^*M? No, because the action of \Bbb{R} on |\zeta|^{-2/(n+2)} is different from the action of \Bbb{R} on \Bbb{R}.

Why do we need \zeta at all to determine a unique pseudohermitian form? Why can’t we just choose a form of unit length or something? We can’t because we don’t have any metric to measure lengths in TM\setminus H. Hence, we have the choice of a CR scale.

What is so “flat” about the Heisenberg group? Well, the Heisenberg group can be embedded as \Bbb{R}^3\subset \Bbb{C}^2. Hence, it is flat, as opposed to S^3\subset \Bbb{C}^2.

Do we have enough equations to determine \omega_\alpha^\beta? I suppose the torsion tensor will be given to us. We have n^2 variables and n equations. I suppose we only need to determine \omega_\alpha^\beta\otimes \theta^\alpha, and not the individual \omega’s? I’m not sure. I’m now confident that the torsion data has to be supplied beforehand, however.

Why are connections defined on the basis of \omega’s? Well we can’t have Christoffel symbols because we don’t have a Riemannian metric. Hence, the hermitian metric d\theta is the best we can do. But why do we not have a regular coordinate derivative, like in the definition of the Levi-Civita connection? Well we are only defining the connections with respect to the “coordinate co-vectors”. Maybe the partial derivatives will come in when we have more general co-vectors? The \omega’s are pretty similar to the Christoffel symbols we have, with one index missing because we’re not really “parallel-transporting” in any direction. Also note that \nabla h=0.

What is the point of defining a connection, if we can already differentiate? The main point here is invariance. We don’t want derivatives to change when coordinate systems (observers, frames of reference, etc) are changed. Coordinate derivatives are not invariant. However, connection derivatives are. A similar notion is now being proposed for CR manifolds. Also, manifolds don’t always have a global coordinate system. Derivatives have to be defined “intrinsically” without reference to coordinates. Connections don’t need coordinates to be defined. They depend only on the intrinsic information of the manifold, which in the case of Riemannian manifolds is the metric.

This is all for today. I will try to upload the second part of this expository series tomorrow.

Disentangling objective functions

I am currently reading the book Feeling Great by Dr. David Burns, and am finding it to be very insightful and helpful. In fact, I would highly recommend it to any person that has chanced upon this fetid corner of the internet. I apologize in advance for the self-help nature of the rest of the post.

In Chapter 3, the author talks about a Harvard student who is depressed because she is unable to get good grades and be the academic superstar that she had always been before this. She has been undergoing a lot of mental trauma for months now, and has finally come to her counselor for help. Now imagine that the counselor gives her two options:

  1. There is a “happiness button” that the student has to press, and then all her sadness will go away instantly, although her grades remain unchanged. Let us suspend disbelief for a moment and imagine that such a button actually exists.
  2. The student does not press the happiness button, and continues living her life in pursuit of better grades and circumstances.

Which option do you think the student will choose?

On close reflection, you may soon realize that the student will inevitably choose the second option, and not the first one. Although she does want to be happy, she wants good grades even more than mere happiness. She has made her happiness conditional upon academic success.

In life, we often entangle our happiness with our goals or ambitions. We say “if I become very rich or very successful in my field, I will be happy”. What inevitably happens is that we either don’t reach our desired goal, or when we do reach it, we realize that our goals have now shifted. We now want to be better than the other people who have achieved the same goals. Only then will we be happy.

What is perhaps more tricky to realize is that we need not do that. Happiness has nothing to do with achieving goals. Happiness is perhaps being at peace with ourselves and celebrating the present. This can be achieved by reflecting on the miracle of life and the universe, or perhaps injecting morphine into one’s eyeballs for the slightly more adventurous. However it is achieved, it actively has nothing to do with our goals. Hence, we will do well to disentangle our two aims of being happy and being successful. Both of these aims are valuable and worth pursuing. However, they are not related. Our being happy has nothing to do with being successful.

Humans have many objective functions like wealth, fame, happiness, meaning, quality of relationships, etc. that they want to maximize in their lives. Maximizing any (or all) of these functions will add great value to one’s life. However, these objective functions needn’t have anything to do with one another. I can be happy without wealth, fame, meaning, etc., much like Sisyphus. I can also be wealthy without fame, happiness, meaning, etc. Entangling these functions can potentially take away value from our lives. For instance, if I entangle my happiness with fame and wealth, which means that I decide that I will be happy only when I’m rich and famous, then I lose out on the possibility of being happy if I’m not able to attain my goals of being rich and famous. Hence, keeping these functions separate and disentangled can only be to our benefit.
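For the programmers reading this, the argument can be put as a toy sketch. The function names and the 0-to-1 happiness scale are my own assumptions for illustration, not something from the book. Entangled happiness is gated by whether the goals are met, so it can never come out higher than the disentangled version.

```python
def disentangled_happiness(baseline: float) -> float:
    """Happiness pursued on its own terms: it never looks at the goals."""
    return baseline


def entangled_happiness(baseline: float, goals_met: bool) -> float:
    """Happiness made conditional on goals: only available if they are met."""
    return baseline if goals_met else 0.0


# Hypothetical baseline of 0.8 on an arbitrary 0-to-1 scale.
for goals_met in (True, False):
    separate = disentangled_happiness(0.8)
    tied = entangled_happiness(0.8, goals_met)
    print(f"goals met: {goals_met}, separate: {separate}, entangled: {tied}")
```

In the best case the two versions are equal, and in every other case the entangled version is worse, which is all that “can only be to our benefit” is saying.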

Of course, one may think that entangling one’s happiness with wealth and fame may make one more motivated to attain wealth and fame. Although this sounds convincing, this is not how things work in practice. We can’t “decide” what will make us happy. It is possible (and entirely common) that even when we attain our goals of wealth and fame, we are unhappy. An analogy is you deciding that you will turn 30 only when England wins the Football World Cup. You can’t really decide how and when you turn 30. Similarly, being happy cannot be arbitrarily entangled with any other objective function of your choosing. It has to be pursued and attained on its own terms, independent of other objective functions.

Thus ends my spiel for the day. If you think that I am slowly drifting away from reviewing scientific papers to writing crappy self-help posts, you’re right on the money.

HIV rebound

The paper that I’m writing about today is “The size of the expressed HIV reservoir predicts timing of viral rebound after treatment interruption” by Li et al. I will quote passages from the paper, and then try to explain what all of those fantastically long words mean.


Therapies to achieve sustained antiretroviral therapy-free HIV remission will require validation in analytic treatment interruption (ATI) trials. Identifying biomarkers that predict time to viral rebound could accelerate the development of such therapeutics.

This is one of a whole host of papers that deal with identifying biomarkers that can aid in the permanent treatment of HIV-positive patients. What does permanent treatment mean? When HIV-positive patients are put on an active treatment regimen, the treatment is often spectacularly successful... until the treatment stops. Then, patients see a violent relapse. However, there are some patients (we’ll call them super-patients) that don’t see a relapse at all. Researchers are now trying to figure out what it is about these patients that helps them not relapse when treatment is stopped, and whether these conditions can be re-created in all patients. Simple.


Cell-associated DNA (CA-DNA) and CA-RNA were quantified in pre-ATI peripheral blood mononuclear cell samples, and residual plasma viremia was measured using the single-copy assay.

What is the single-copy assay? Here is a direct quote from this paper:

This assay uses larger plasma sample volumes (7 ml), improved nucleic acid isolation and purification techniques, and RT-PCR to accurately quantify HIV-1 in plasma samples over a broad dynamic range (1–10^6 copies/ml). The limit of detection down to 1 copy of HIV-1 RNA makes SCA 20–50 times more sensitive than currently approved commercial assays.

Essentially it is a new-and-improved method of measuring the amount of HIV RNA in your blood plasma.

What are the results of this experiment?


Participants who initiated antiretroviral therapy (ART) during acute/early HIV infection and those on a non-nucleoside reverse transcriptase inhibitor-containing regimen had significantly delayed viral rebound. Participants who initiated ART during acute/early infection had lower levels of pre-ATI CA-RNA (acute/early vs. chronic-treated: median <92 vs. 156 HIV-1 RNA copies/10^6 CD4+ cells, P < 0.01). Higher pre-ATI CA-RNA levels were significantly associated with shorter time to viral rebound (4 vs. 5–8 vs. >8 weeks: median 182 vs. 107 vs. <92 HIV-1 RNA copies/10^6 CD4+ cells, Kruskal–Wallis P < 0.01). The proportion of participants with detectable plasma residual viremia prior to ATI was significantly higher among those with shorter time to viral rebound.

So people who start HIV treatment early have a more successful treatment overall, and it takes a longer time for the disease to rebound even when the treatment is stopped. This largely aligns with common sense and with rebounds seen in other diseases like cancer. What is more surprising is that patients on the non-nucleoside reverse transcriptase inhibitor-containing regimen also see the same kind of success. Let us explore some of the words in this phrase. A nucleoside is a nucleotide, which is the basic building block of DNA and RNA, minus the phosphate group. Reverse transcriptase is the enzyme that constructs complementary DNA sequences from RNA sequences (reverse transcription, because regular transcription constructs RNA from DNA). HIV relies on this enzyme to copy its RNA genome into DNA. A reverse transcriptase inhibitor is a drug that blocks this enzyme, and “non-nucleoside” means that the drug is not itself a nucleoside look-alike: it binds to the enzyme directly instead of posing as a building block of the DNA chain. Why a regimen containing this class of drug should delay rebound more than other regimens, I’m not sure.

Moreover, higher levels of cell-associated HIV RNA are associated with a shorter rebound time after treatment is stopped (ATI). This also makes sense. Treatment should only be stopped when RNA levels have decreased considerably. This is something I also came across in the book “The Emperor of All Maladies” by Siddhartha Mukherjee. Cancer treatment, whether it be chemotherapy or a strict drug regimen, is often stopped when the patient supposedly feels cured for a duration of time. However, the cancer often rebounds very quickly. This tells us that treatments, whether they be for cancer or HIV, should be carried on for much longer than they are today, and the patient feeling “fine” is not a good marker for when the treatment should be stopped.


Higher levels of HIV expression while on Antiretroviral Therapy (ART) are associated with shorter time to rebound after treatment interruption. Quantification of the active HIV reservoir may provide a biomarker of efficacy for therapies that aim to achieve ART-free remission

This is a repetition of the above. Stop treatment only when HIV RNA levels are low. This will increase the time it takes for the disease to rebound. Essentially, disease treatment aligns with common sense. Who knew.

It sure doesn’t feel like predictive processing

Reddit user @Daniel_HMBD kindly re-wrote some parts of my previous essay to make it clearer. I am now posting this corrected version here.

Broad claim: The brain (conscious or unconscious) “explains away” a large part of our surroundings: the exact motion of a tree or a blade of grass as it sways gently in the wind, the exact motion of a human as they walk, etc. If we could force our brain to make predictions about these things as well, we’d develop our scientific acumen and our understanding of the world.

How can I understand the motion of a blade of grass? The most common answer is “observe its motion really closely”. I’ve spent considerable amounts of time staring at blades of grass, trying to process their motion. Here’s the best that I could come up with: the blades are demonstrating a simple pendulum-like motion, in which the wind pulls the blade in one direction and its roots and frame pull it in the opposite direction. Observe that I didn’t end up observing the tiny details of the motion. I was only trying to fit what I saw with what I had learned in my Physics course. This is exactly what our brain does: it doesn’t really try to understand the world around us. It only tries to explain the world around us based on what we know or have learned. It does the least amount of work possible in order to form a coherent picture of the world. Let me try and explain this point further in a series of examples.

When ancient humans saw thunder and lightning in the sky, they “explained away” the phenomena by saying that the Gods were probably angry with us, and that is why they were expressing their anger in the heavens. If there was a good harvest one year, they would think that the Gods were pleased with the animal sacrifices they’d made. If there was drought despite their generous sacrifices, they would think that the Gods were displeased with something that the people were doing (probably the witches, or the jealous enemies of our beloved king). Essentially, they would observe phenomena, and then somehow try to tie it to divine will. All of these deductions were after the fact, and were only attempts at “explaining away” natural phenomena.

When pre-Renaissance humans observed their seemingly flat lands and a circular sun rising and setting every day, they explained these observations away by saying that the earth was (obviously) flat, and that the sun was revolving around the earth. They then observed other stars and planets moving across the skies, and explained this by saying that the planets and stars were also orbiting us in perfectly circular orbits. When the orbits were found to be erratic, they built even more complicated models of celestial motion on top of existing models in order to accommodate all that they could see in the night skies. They had one assumption that couldn’t be questioned: that the earth was still and not moving. Everything else had to be “explained away”.

When we deal with people who have a great reputation for being helpful and kind, we are unusually accommodating of them. If they’re often late, or sometimes dismissive of us, we take it all in our stride and try to maintain good ties with them. We explain away their imperfect behavior with “they were probably doing something important” and “they probably mean well”. However, when we deal with people who we don’t think very much of, we are quick to judge them. Even when they’re being very nice and courteous to us, we mostly only end up thinking “why are they trying so hard to be nice” and resent them even more. We explain away their behavior with “they probably have an ulterior motive”.

Essentially, our brain sticks to what it knows or understands, and tries to interpret everything else in a way that is consistent with these assumptions. Moreover, it is not too concerned with precise and detailed explanations. When it sees thunder in the skies, it thinks “electricity, clouds, lightning rods”, etc. It doesn’t seek to understand why this bolt of lightning took exactly that shape. It is mostly happy with “lightning bolts roughly look and sound like this, all of this roughly fits in with what I learned in school about electricity and lightning, and all is going as expected”. The brain does not seek precision. It is mostly happy with rough fits to prior knowledge.

Note that the brain doesn’t really form predictions that often. It didn’t predict the lightning bolt when it happened. It started explaining away the lightning bolt after it was observed. What our brain essentially does is that it first observes things around us, and then interprets them in a way that is consistent with prior knowledge. When you observe a tree, your eyes and retina observe each fine detail of it. However, when this image is re-presented in the brain, your “the tree probably looks like this” and “the leaves roughly look like this” neurons fire, and you perceive a slightly distorted, incomplete picture of the tree as compared to what your eyes first perceived.

In other words, your brain is constantly deceiving you, giving you a dumbed-down version of reality. What can you do if you want to perceive reality more clearly?

Now we enter the historical speculation part of this essay. Leonardo da Vinci was famously curious about the world around him. He made detailed drawings of birds and dragonflies in flight, of the play between light and shadows in real life, futuristic planes and helicopters, etc. Although his curiosity was laudable, what was even more impressive was the accuracy of his drawings. Isaac Newton, another curious scientist who made famously accurate observations of the world around him, was unmarried throughout his life and probably schizophrenic. John Nash and Michelangelo are other famous examples.

I want to argue that most neurotypicals observe external phenomena, and only after such observations try to explain these phenomena away. However, great minds generate predictions for everything around them, including swaying blades of grass. When their observations contradict these predictions, they are forced to modify their predictions and hence understanding of the world. Essentially, they are scientists in the true sense of the word. What evidence do I have for these claims? Very weak: n=1. Most of what I do is observe events, concur that this is roughly how they should be, and then move on. Because I can explain away almost anything, I don’t feel a need to modify my beliefs or assumptions. However, when I consciously try to generate predictions about the world around me, I am forced to modify my assumptions and beliefs in short order. I am forced to learn.

Why is it important to first generate predictions, and then compare them with observations? Let us take an example. When I sit on my verandah, I often observe people walking past me. I see them in motion, and after observing them think that that is roughly how I’d expect arms and legs to swing in order to make walking possible. I don’t learn anything new or perceive any finer details of human motion. I just reaffirm my prior belief of “arms and legs must roughly swing like pendulums to make walking possible” with my observations. However, I recently decided to make predictions about how the body would move while walking. When I compared these predictions with what I could observe, I realized that my predictions were way off. Legs are much straighter when we walk, the hips hardly see any vertical motion, and both of these observations were common to everyone that I could see. Hence, it is only when we make prior predictions that we can learn the finer minutiae of the world around us, that we often ignore when we try to “explain away” observations.
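The difference between the two habits can be sketched in a few lines of code. This is only a toy illustration: the `PredictionLog` class and the strings in it are made up for this post. If a guess is written down before looking, an observation can contradict it. If no guess was made, there is nothing to contradict, and the observation just gets explained away.

```python
from dataclasses import dataclass, field


@dataclass
class PredictionLog:
    """Write a guess down before looking, then compare it with what is seen."""
    predictions: dict = field(default_factory=dict)

    def predict(self, question: str, guess: str) -> None:
        self.predictions[question] = guess

    def observe(self, question: str, seen: str) -> bool:
        """Return True if the observation contradicts an earlier guess."""
        guess = self.predictions.get(question)
        if guess is None:
            # No prior guess: nothing can be contradicted, so the
            # observation is simply "explained away".
            return False
        return guess != seen


log = PredictionLog()
log.predict("hip motion while walking", "hips bob up and down")
surprised = log.observe("hip motion while walking", "hips barely move vertically")
unchecked = log.observe("leg shape while walking", "legs stay fairly straight")
print(surprised, unchecked)
```

Only the first observation forces an update, because it is the only one that had a prediction to be wrong about.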

I was on vacation recently, and had a lot of time to myself. I tried to generate predictions about the world around me, and then see how they correlated with reality. Some things that I learned: on hitting a rock, water waves coalesce at the back of the rock. Leaves are generally v-shaped, and not flat (this probably has something to do with maximizing sunlight collection under varying weather conditions). People barely move their hips in the vertical direction while walking. It is much more common to see variations in color amongst trees than height (height has to do with availability of food and sunlight, while color may be a result of random mutations). A surprisingly large number of road signs are about truck lanes (something that car drivers are less likely to notice, of course). Also, blades of grass have a much smaller time period than I assumed. Although I don’t remember the other things I learned, I think that I did notice a lot of things that I had never cared to notice before.

Can I use this in Mathematics (for context, I am a graduate student in Mathematics)? In other words, can I try to make predictions about mathematical facts and proofs, and hopefully align my predictions with mathematical reality? I do want to give this a serious shot, and will hopefully write a blog post on this in the future. But what does “giving it a serious shot” entail? I could read a theorem, think of a proof outline, and then see whether this is the route that the argument goes. I could also generate predictions about properties of mathematical objects, and see if these properties actually hold for those objects. We’ll see if this leads anywhere.

So forming predictions, which really is a lot like the scientific method, is naturally a feature of people of certain neural descriptions, who went on to become our foremost scientists. It is yet to be seen whether people without these neural descriptions can use these skills anyway to enhance their own understanding of the world, and hopefully make a couple of interesting scientific observations as well.