Fruits of procrastination

Month: July, 2020

Why is chlorophyll green?

The paper that I want to discuss today is Quieting a Noisy Antenna Reproduces Photosynthetic Light Harvesting Spectra, by Arp, Aji, Gabor et al. I first read about this in a Quanta article. The scientific question that was answered in this paper was amazing- why are plants green?

So, why indeed are plants green? One might say it is because of chlorophyll, which is green in color. But why is chlorophyll green? Of course, one answer to this question is “because it is green”. But a more satisfying answer is that chlorophyll being green ensures that plants receive a steady, not-wildly-fluctuating supply of energy.

Introduction

A plant, or at least the light harvesting parts of a plant, may be thought of as an antenna. The extremal points of a plant absorb energy from sunlight, and transfer this energy via various pathways to the parts where this energy is needed in order to make food and sustain other life-giving activities. Sunlight contains a spectrum of wavelengths, and plants probably want to absorb all wavelengths in order to maximize their intake. However, absorbing the green frequency would lead to a lot of variance in the amount of energy absorbed. Hence, to reduce this variance, plants just reflect this green part of the solar light, and absorb the red and blue parts.

And that is why plants appear green.

Antenna Network and Noise

Ensuring that the energy input into a network or grid is equal to the energy output is a fundamental requirement of networks. If excess energy is absorbed, it may destroy the system, and if not enough energy is absorbed, an underpowered system will soon shut down. However, the environment of a plant can vary rapidly with time. The sun can be covered by clouds, the plants above a light absorbing leaf may sway with the wind and hence block access to sunlight at intervals, etc. How can a plant ensure that it receives a steady supply of energy? Clearly, much like we need a constant amount of food every day, a plant’s energy output to its food making parts needs to be constant in order for it to survive.

If the energy absorbed by a plant at a fixed point in time can be plotted on a graph, with different probabilities given to different amounts of energy absorbed, the greater the variance in this graph, the more the variance in energy absorbed by a plant. This variance, which is called noise, should be reduced. Reducing noise is going to be our main motive as we do the analysis below. Methods of reducing noise like adaptive noise filtering require external intervention, and hence are not available to plants.

One node network

Imagine a network with a single input node A, that absorbs light at wavelength \lambda_A with (maximum) power P_A. Note that the average power absorbed does not have to equal P_A due to changing external conditions like swaying plants blocking sunlight, etc. Hence, the average power absorbed by a plant is p_A P_A, where p_A<1, and can be thought of as the probability of the plant absorbing P_A.

Let the energy output be \Omega. If P_A=\Omega, then the average energy input, which is p_A P_A<P_A, would always be less than output. Hence, in this model of only one input node A, P_A>\Omega so that the average energy input is equal to output. In other words, p_A P_A=\Omega.

Let us now calculate the variance of the energy received. If p_A is the probability that the plant is able to receive P_A energy, and (1-p_A) is the probability of the plant receiving 0 energy, then the variance is \sigma^2=p_A(P_A-\Omega)^2+(1-p_A)(\Omega)^2. This can be simplified as

\frac{\sigma^2}{\Omega^2}=\frac{P_A}{\Omega}-1
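The simplification uses nothing more than the constraint p_A P_A=\Omega. Spelled out, the steps are roughly as follows (this is my own expansion, not copied from the paper):

```latex
\begin{aligned}
\sigma^2 &= p_A(P_A-\Omega)^2+(1-p_A)\Omega^2 \\
         &= p_A P_A^2 - 2p_A P_A\Omega + p_A\Omega^2 + \Omega^2 - p_A\Omega^2 \\
         &= p_A P_A^2 - 2p_A P_A\Omega + \Omega^2 \\
         &= \Omega P_A - 2\Omega^2 + \Omega^2 \qquad (\text{using } p_A P_A=\Omega) \\
         &= \Omega\,(P_A-\Omega)
\end{aligned}
\qquad\Longrightarrow\qquad
\frac{\sigma^2}{\Omega^2}=\frac{P_A}{\Omega}-1
```

So the relative noise depends only on how far the peak power P_A overshoots the required output \Omega.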

We should look for ways to reduce this variance. We can do this by having two nodes instead of one.

Two node network

Let us now have a network with two input nodes, and see if we can reduce variance. Let the input nodes A and B absorb light at wavelengths \lambda_A,\lambda_B. Let the power absorbed be P_A,P_B with probabilities p_A,p_B. We will assume that P_B<\Omega<P_A. Also, we want p_A P_A+p_BP_B=\Omega, as the average power input should be equal to the output. One constraint of the system is that the plant shouldn’t absorb P_A+P_B power, because P_A+P_B>>\Omega. Hence the possibilities of power absorption are that the plant absorbs P_A power, P_B power, or 0 power. The variance of the model is now \sigma^2=p_A(P_A-\Omega)^2+p_B(P_B-\Omega)^2+(1-p_A-p_B)(\Omega)^2. This can be simplified as

\frac{\sigma^2}{\Omega^2}=(\frac{P_A}{\Omega}-1)-p_B\frac{P_B}{\Omega}(\frac{P_A-P_B}{\Omega})

As long as P_A>P_B, the term being subtracted is positive, so this is smaller than the variance we get from the network with just one node. The good news is that plants also have two input nodes- chlorophyll a and chlorophyll b. The presence of two input nodes in a given wavelength region probably served as an evolutionary advantage in order to minimize noise in energy absorption.
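To convince myself of the algebra, here is a minimal numerical sketch that computes the two-node variance directly from the three outcomes and compares it with the closed form. The function names and the numbers are my own illustrative choices, and I am taking P_A to be the larger of the two powers.

```python
def one_node_noise(P_A, omega):
    """Relative variance sigma^2 / Omega^2 for one node with p_A * P_A = Omega."""
    return P_A / omega - 1.0


def two_node_noise(P_A, P_B, p_B, omega):
    """Relative variance for two nodes, summed directly over the three outcomes."""
    p_A = (omega - p_B * P_B) / P_A   # enforce p_A*P_A + p_B*P_B = Omega
    p_0 = 1.0 - p_A - p_B             # probability of absorbing nothing
    if min(p_A, p_B, p_0) < 0:
        raise ValueError("probabilities must be non-negative")
    var = (p_A * (P_A - omega) ** 2
           + p_B * (P_B - omega) ** 2
           + p_0 * omega ** 2)
    return var / omega ** 2


def two_node_noise_closed_form(P_A, P_B, p_B, omega):
    """The simplified expression for the two-node relative variance."""
    return (P_A / omega - 1.0) - p_B * (P_B / omega) * ((P_A - P_B) / omega)


# Illustrative numbers only (not taken from the paper).
omega, P_A, P_B, p_B = 1.0, 1.5, 0.5, 0.4
print("one node :", one_node_noise(P_A, omega))
print("two nodes:", two_node_noise(P_A, P_B, p_B, omega))
print("closed   :", two_node_noise_closed_form(P_A, P_B, p_B, omega))
```

With any valid choice of probabilities and P_A>P_B, the two-node figure should come out below the one-node figure.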

Optimization

We want plants to absorb energy at a steady rate, to ensure that energy input=energy output. We want to maximize P_A-P_B, where P_B<\Omega<P_A, so that the noise, or variance in energy absorption, is minimized. Our constraint is that p_A P_A + p_B P_B=\Omega. And we want to do this for all the wavelengths of light that we can.

Now the maximum power available P_A depends upon the sunlight available, and is given below as the black curve in the graph. Ignore the two peaks for now.

Hence, we can ideally select two nodes each for the blue, green and red regions of the wavelength spectrum, and absorb energy from each of them. In order to reduce noise however, we need to maximize P_A-P_B. This can be done if we place two nodes in each of the three regions, and let the two nodes have very similar wavelengths, but different P_i’s. This can be done easily where the slope of the irradiance is high. We can see in the graph above that the slope of the irradiance graph is high in the blue and red regions. However, in the green region, the slope is close to 0. Hence, if we place two nodes there with similar wavelengths, P_A-P_B will be almost 0, and hence there will be a lot of noise in the energy input.
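The slope argument can be sketched numerically. I don't have the paper's irradiance data here, so the snippet below uses a 5800 K blackbody (Planck's law) as a rough, hypothetical stand-in for the black curve. Its peak sits near 500 nm, which is not exactly where the real ground-level spectrum flattens out, so treat the wavelengths as illustrative. The function names and the 10 nm separation are my own assumptions.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K


def blackbody(wavelength_nm, temperature_k=5800.0):
    """Planck spectral radiance per unit wavelength (stand-in for irradiance)."""
    lam = wavelength_nm * 1e-9
    x = H * C / (lam * K_B * temperature_k)
    return 2.0 * H * C ** 2 / lam ** 5 / math.expm1(x)


def relative_power_gap(wavelength_nm, separation_nm=10.0):
    """|P_A - P_B| / max(P_A, P_B) for two nodes straddling wavelength_nm."""
    p1 = blackbody(wavelength_nm - separation_nm / 2)
    p2 = blackbody(wavelength_nm + separation_nm / 2)
    return abs(p1 - p2) / max(p1, p2)


for name, lam in [("blue", 430.0), ("near the peak", 500.0), ("red", 680.0)]:
    print(f"{name:>13} ({lam:.0f} nm): gap = {relative_power_gap(lam):.5f}")
```

On the steep flanks of the curve, two nodes 10 nm apart see noticeably different powers; near the flat top the gap all but vanishes, which is the situation the post describes for green light.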

This is the reason why plants have two nodes each only in the red and blue regions of the light spectrum, and not the green region. The green light is reflected, and this is why plants are green.

Purple bacteria and green sulphur bacteria can be modeled using the same constraint of reducing noise in energy absorption. Hence, the model developed by the authors appears to be robust, and can explain the color of a range of photosynthetic organisms, not just green plants.

References

  1. Quieting a Noisy Antenna Reproduces Photosynthetic Light Harvesting Spectra

Jagdish Chandra Bose and Plant Neurobiology

The paper that I’ll be discussing today is Jagdish Chandra Bose and Plant Neurobiology by Prakash Narain Tandon, a celebrated Indian scientist.

When I was a school student in India, I often came across JC Bose’s claims of plants being sentient beings, having nervous systems, etc. However, these things were never part of the official curriculum (i.e. we never had to learn these things for tests). Bose’s statements in this matter have always been considered to be something of a not-completely-scientific, “the whole universe is connected by prāna”-type of claim by the larger scientific community. This paper asserts that despite initial rejection, most of Bose’s claims have been proven to be correct by modern science in recent times.

Introduction

By the time Bose retired from Presidency College, he was a world renowned physicist who was known to have studied radio waves even before Marconi (although the primacy debate is a complex one, there is evidence to suggest that there were scientists in Europe who had studied radio waves even before Bose). After retiring, Bose started working at Bose Institute (which he founded), and guided by his “Unity of Life” philosophy, started studying the effect of radio waves on inorganic matter. Finding their response to be “similar to animal muscle”, he now started studying plant physiology (the nervous system of plants). He would expose plants to various stimuli, and record their response through the use of ingenious instruments that he himself designed. His conclusion was that the nervous impulses of plants were similar to those of animals.

Action Potentials

Bose studied both large plant parts and individual plant cells. He would connect microelectrodes to these cells, and record their response to stimuli. He concluded that plants contain receptors of stimuli, nerve cells that code these stimuli electrically and propagate these messages, and also motor organs that purportedly helped in carrying out a response to the stimuli. In this, he concluded that plants and animals have similar nervous systems. Bose said that the nervous system in plants was responsible for things like photosynthesis, ascent of sap, response to light, etc.

Bose said that the action potential of plant neurons follows the unipolarity of animal neurons. But what is Action Potential? This is an amazing video explanation of what it is. Action Potential is the electric potential via which neurons transmit messages. In the resting state, the electric potential difference between the inside and the outside of a neuron is about -70 mV (the inside is negatively charged). When neurotransmitters activate the neuron (because a message is to be passed), this negative potential difference is destroyed by a stream of positive sodium ions that comes into the neuron from the outside. This causes many changes in the neuron, including inducing it to release neurotransmitters to activate the next neuron in line. The electric potential difference becomes positive, and then becomes negative again because the neuron loses a lot of potassium ions to the outside. The sodium-potassium pump on the cell membrane expends energy to exchange sodium and potassium ions to ensure that the neuron returns to the state it was in before it was excited. Thus, the neuron enters the resting state again. This is the chemical mechanism by which a neuron conducts a message.

Where can one find the “nerves” of plants? Bose localized the nervous tissue in the phloem, which conducted both efferent and afferent nervous impulses. He also measured the speed of the nervous impulse, which he found to be 400 mm/sec. Although Burdon-Sanderson and Darwin had previously reported on nerve impulses in insectivorous plants, Bose’s studies over the next three decades were far wider and deeper. Although ignored after the 1930s, his studies have been found to be correct by modern experiments. The author claims that Baluska et al have not only confirmed Bose’s major findings, but have also advanced these further utilizing molecular biology, genomics, etc. Baluska seems to have published this paper in a journal that he himself is the editor of. Hence, these claims perhaps need to be investigated further.

Electrical Studies

Along with Action Potentials (APs) (common to plants and animals), Slow Wave Potentials (SWPs) or Variation Potentials (VPs) (found only in plants) are also used by plants to transmit nerve impulses. These SWPs do not propagate electrically, but by hydraulic pressure exerted by tissues found in the xylem. Some plants like Dionaea flytraps were found to possess unidirectional APs similar to those found in cardiac myocytes (cardiac muscle cells). This prompted Bose to poetically state that plants possess hearts that beat as long as they live.

Molecular Studies

At the molecular level, plants possess voltage-gated channels (membrane proteins that are activated by a change in electric potential and allow the exchange of ions), a vesicular trafficking apparatus (for the transport of proteins and other molecules within the cell plasma) etc, all of which are also found in animal cells. Trewavas also observed that water soluble Ca^{2+} ions were responsible for intra-cell communication, and also for inducing changes in plants as a response to environmental conditions. We now know that there exist many such water-soluble (these are called cytosolic) messengers in plants, as they do in animals.

Plant Roots

Darwin had pointed out that the tip of the radicle (found in the roots) is endowed with sensitivity, and also directs the movements of adjoining parts. In this, it is like the brain.

Bose elaborated on this by saying that the radicle is stimulated by friction and the chemical constitution of the surrounding soil. The cells undergo contraction at appropriate times, causing their liquid contents to go up. This causes the ascent of sap. Baluska et al carried these claims even further, and stated that within the root apex of the maize plant, there is a “command centre” which facilitates the long distance travel of nervous impulses, and instructs the plant to move towards positive stimuli (and away from negative stimuli). Tandon rejects the notion that such a command centre is anywhere near as complex as an animal brain.

Synapses- Neurotransmitters

Bose found the nerve cells in plants to be elongated tubes, and the dividing membrane between them to be the synapse (the gap between animal neurons where messages are transmitted between neurons). This claim has been substantiated by Barlow, who said that plant synapses share many characteristics with animal synapses. Plants also use many of the same neurotransmitters as animals like acetylcholine, glutamate and \gamma-aminobutyric acid.

Plant Memory, Learning and Intelligence

Bose claimed that plants are intelligent, have memory, and are capable of learning. Tandon makes the claim that Trewavas describes a large number of protein kinases in plant neural pathways, and hence finds their nervous system to be similar to that of animals. On skimming Trewavas’ paper however, I mostly found it to say that although there do exist protein kinases in plants, the neural systems found in plants differ from those of animals in important ways.

In another paper, Trewavas claims that one important difference between plant and animal nervous systems is the timescale of response- plants respond much more slowly to external stimuli. Hence, we need time-lapse photography to properly study the plant neural response. Also, if intelligence can be thought of as “a capacity for problem solving”, then plants show signs of intelligence as they change their architecture, physiology and phenotype in order to compete for resources, forage for food, and protect themselves against harsh elements of the environment.

Barlow substantiates these arguments, claiming that plants rapidly convert external stimuli to electrochemical signals, which cause them to change their physiology. He also claims that plants do have memory, as their decision making would involve recollection of previously stored memories. Barlow also says that roots experience four stimuli at once (touch, gravity, humidity and light), and have to decide how to obtain the optimal mix of all. Hence, plants do possess the decision making aspect of intelligence.

With regard to plant intelligence, Baluska et al make the following claim:

“Recent advances in chemical ecology reveal the astonishing communicative complexity of higher plants as exemplified by the battery of volatile substances which they produce and sense in order to share with other organisms information about their physiological state”

Gruntman and Novoplansky from Israel also claim that B. dactyloides plants are able to differentiate between themselves and other plants, and if a plant has multiple roots, each set of roots identifies the other as belonging to the same plant (and hence these roots don’t compete with each other for resources). But how do plants recognize themselves and others? The authors claim that this is from the internal oscillations of hormones like auxin and cytokinins. The frequency of this oscillation is unique to each plant, and can be measured externally by roots.

Conclusions

Bose claimed that

“these trees have a life like ours……they eat and grow…….face poverty, sorrows and sufferings. This poverty may……induce them to steal and rob…….they help each other, develop friendships, sacrifice their lives for their children”

The author finds that this sentiment is not yet fully supported by the scientific data collected by Bose. However, these claims may be further ratified when more experiments are done in this realm.

Bose single-handedly created the field of Plant Neurobiology. Although the establishment of this field has its opponents, even the most vocal of these opponents cannot find fault with any of Bose’s scientific claims. The author hopes that plant and animal neuroscientists communicate better with each other in the future, and find the time and resources to study this field more. Hopefully, such studies will ratify even more of Bose’s revolutionary ideas and claims.

References

  1. Jagdish Chandra Bose and Plant Neurobiology
  2. How Plants Learn

The Imitation Game

The paper that I’m reviewing today is Computing Machinery and Intelligence by Alan Turing. It is one of the most important papers in the history of computation, and has hence shaped the world as we know it today. It could perhaps be called more an essay than a paper; this claim will hopefully become apparent below. I spent 4th of July reading it, and it was a pretty fun way to spend my day.

The Imitation Game

“Can machines think?” We soon realize that we first need to define the terms “machine” and “think”. Let us set the scene: there are three people- A (man), B (woman) and C (interrogator). None of them can see each other or speak, and they can only communicate through typewritten messages. The interrogator (C) has the job of determining the genders of A and B (he doesn’t know that A is a man and B a woman). C can ask them any number of questions. A’s job is to mislead C into thinking that he is a woman. B’s job is to convince C that she is the woman, and that A is a man. Both A and B can lie in their answers to C.

Clearly, lying and misleading are traits of human beings, and require thought. Suppose we replace A (man) by a machine. Will the machine be as convincing a liar as A? Will it be able to dodge C’s questions as skillfully as A? If the probability of C calling out the machine’s lies is less than or equal to the probability of him/her calling out A’s lies, then Turing says that this implies that the machine can “think”. Testing whether someone or something can “think” would involve having them accomplish a task that would require considerable thought and planning. Lying and misleading clearly fall into that category. Without going into the metaphysics of what exactly “thought” is and why inanimate objects can’t think because they don’t own big squishy brains like us, I think this is a good definition of what it means for a machine to think.

What is a “machine” though? In order to do away with objections like “humans are also machines” etc, Turing says all machines that are considered in this paper are “digital computers”. This may sound like a very restrictive definition. However, he says that digital computers, given enough memory, can imitate all other machines (and are hence universal machines). This claim will be partly proved below. Hence, if we wish to prove that there exists some machine capable of a certain task, it is equivalent to proving that there exists a digital computer capable of that task.

One of the more fascinating parts of this section is where Turing describes how the storage of a computer would work, and how a loop program works. He then gives an example of a loop in human life, which is what a loop program is meant to imitate:

To take a domestic analogy. Suppose Mother wants Tommy to call at the cobbler’s every morning on his way to school to see if her shoes are done, and she can ask him afresh every morning. Alternatively, she can stick up a notice once and for all in the hall which he will see when he leaves for school and which tells him to call for the shoes, and also to destroy the notice when he comes back if he has the shoes with him.
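As a minimal sketch of the notice-in-the-hall version of this analogy: the instruction is given once, and it keeps being obeyed until a condition removes it. The function name and the "ready on day N" parameter are my own hypothetical framing, not anything from Turing's paper.

```python
def run_errand(days_until_shoes_ready):
    """Count Tommy's visits to the cobbler under the 'notice in the hall' scheme."""
    notice_up = True            # Mother sticks up the notice once and for all
    day = 0
    visits = 0
    while notice_up:            # the instruction repeats as long as the notice is up
        day += 1
        visits += 1             # Tommy calls at the cobbler's on his way to school
        if day >= days_until_shoes_ready:
            notice_up = False   # he has the shoes, so he destroys the notice
    return visits


print(run_errand(3))
```

The loop body is written once, but how many times it runs is decided by what happens at the cobbler's, which is the point of the analogy.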

Although fascinating, the details of such functions are quite well known, and hence I won’t elaborate more on them here.

Universal machines

A “machine” is a pretty broad term in general. It has certain states, and rules for how the machine will behave when it is in each state. For instance, the universe may be thought of as a machine. Its initial state was that it was very very hot at first. God’s Rule Book said that in such a state, the universe had to expand and cool down, and that’s exactly what it did. The universe could be described as a continuous state machine- if its initial state was, say, even one millionth of a degree cooler, it would have evolved very differently. Hence, we can say that the universe is a machine that is very sensitive to initial conditions, and hence is a continuous state machine.

The opposite of a continuous state machine is a discrete state machine. This is a machine that is not very sensitive to initial conditions. An example would be your stereo system. Imagine that you have a stereo system with one of those old fashioned volume knobs that you turn to increase or decrease the volume. It won’t make that much of a difference to your audio experience if you turn the knob slightly more than you intended to. 70 dB will sound almost the same as 71 dB. Hence, what a lot of companies do is that they break up the volume level on your stereo into discrete states. Because 70 dB is almost the same as 71 dB, they can both be clubbed into the volume level “15”. When you turn the volume knob, and see the volume bar in the visual display go up by 1, you’ve moved up from one state to the next. In this way, your stereo system is clearly a discrete state system.

Digital computers are discrete state machines. Discrete state machines are stable to small changes in initial conditions, and hence it becomes easy to predict their behavior in the future, based on knowing their initial state.
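The stereo example can be sketched in a couple of lines. The 5 dB step is purely my own assumption, chosen so that 70 dB and 71 dB both land on level 15. A real stereo will have its own mapping.

```python
def volume_level(decibels, step_db=5.0):
    """Map a continuous loudness onto a discrete volume level (hypothetical 5 dB steps)."""
    return int(decibels // step_db) + 1


for db in (70.0, 71.0, 74.9, 75.0):
    print(db, "dB -> level", volume_level(db))
```

Small wobbles in the input (70 dB versus 71 dB) leave the state unchanged, and the state only moves when the input crosses a boundary.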

Turing then makes a fantastic prediction in this paper:

..I believe that at the end of this century…one will be able to speak of machines thinking without expecting to be contradicted.

Turing claims that this is only a conjecture, but that making conjectures is good for science. This is one conjecture that is perhaps still hotly debated across the world, and “verify you are a human” tests on many websites regularly play The Imitation Game with us.

Can machines have our immortal soul?

Turing anticipates many objections to his claim that machines can “think” if they can mislead other humans. And he deals with these anticipated objections one by one.

The Theological Objection might be that God provides immortal souls capable of thinking only to humans, and not animals or machines. Turing contends that “we should not irreverently usurp his power of creating souls”, and that He might indeed have provided all objects, animate and inanimate, with souls capable of thought. Turing also clarifies that religious arguments don’t impress him anyway. The Head in the Sand Objection contends that contemplating machines having the capacity for thought is too scary, and hence we should assume that they will never gain this faculty. Intelligent people, basing their superiority on their capacity for thought, often make this argument. Turing thinks that this argument is too weak to even argue against, and that such “intelligent people” just need consolation.

The Mathematical Argument against machines being able to imitate humans well is that every machine is based on a certain logical system. Gödel proved that for every such logical system, there is a Yes/No question that it can only answer incorrectly, or never be able to answer. As humans (namely C, in this context) don’t have such a handicap, we can easily find out whether A is a machine or not. However, knowing what this question might be is going to be near impossible, as we wouldn’t know what type of a machine A might be (even if we are fairly sure it is a machine). Hence, we might not know what question to ask for which the machine will falter. Moreover, it is also entirely possible that humans would also give the incorrect answer to this question (as it is likely to be a tough logical question). In addition to this, what if this one machine is actually a collection of digital machines, with different such “trap” questions for each? It is going to be tough finding a question for which all such machines will fail. Therefore, this mathematical objection, although valid, is not something we need to worry about too much.

The Argument from Consciousness says that machines cannot think unless they can write a sonnet, feel pleasure at their success, feel warm because of flattery, etc. Turing contends that there is no way of knowing whether machines can feel any of this unless we can become the machine itself, which is impossible. One way to ascertain the emotional content of a machine is to conduct a viva voce, in which answering the questions posed would require some amount of emotional processing. The example that Turing provides, which he expects a human and a machine to be able to conduct, is given below:

Lady Lovelace, who described in detail the Analytical Engine that Charles Babbage designed, claimed that machines cannot do anything original, and can only do what they’re programmed to do. Turing contends that although the kinds of machines that Lady Lovelace could see perhaps led her to that conclusion, she was incorrect, and that even the Analytical Engine could be suitably programmed such that it could do “original” things. One variation of her statement could be that “machines cannot do anything new”. However, even the “new” things that humans do are inspired, at the very least, by their own experiences. Hence, an even better variant would be “machines cannot surprise us”. This is also incorrect, as humans often make calculations that are slipshod, that lead them to certain conclusions. When they ask machines for answers, they’re often surprised with the (correct) answer that the machines provide. An analogy would be when we incorrectly calculate that preparing for a certain exam would take us one all nighter, and are surprised by how bad that plan was. We did not correctly calculate our productivity over the course of one night.

The Argument from Continuity says that the nervous system is a continuous state system, and a digital computer is a discrete state system. Hence, a digital computer can never successfully imitate a human being. Turing counters this by saying that a digital computer is capable of imitating continuous state systems. Take a differential analyzer for instance, which is a continuous state system. If asked for the value of \pi, the differential analyzer might give any value between 3.12 and 3.16, based on its current state. This is a feature of continuous state systems- even a small deviation from the “ideal” state brings about a noticeable change in the output. This could be imitated by a digital computer by having it output 3.12, 3.13, 3.14, 3.15, 3.16 with probabilities of 0.05, 0.15, 0.55, 0.19, 0.06.
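Here is a minimal sketch of that imitation: a discrete program that picks one of five values at random, weighted the way Turing suggests in the paper (3.12 to 3.16 with probabilities 0.05, 0.15, 0.55, 0.19, 0.06). The function name and the fixed seed are my own choices.

```python
import random

# Turing's illustrative figures from the paper.
VALUES = [3.12, 3.13, 3.14, 3.15, 3.16]
PROBABILITIES = [0.05, 0.15, 0.55, 0.19, 0.06]


def imitate_analyzer(rng):
    """Return one 'measurement' of pi, as a slightly noisy analogue device might."""
    return rng.choices(VALUES, weights=PROBABILITIES, k=1)[0]


rng = random.Random(0)   # seeded only so that runs are repeatable
samples = [imitate_analyzer(rng) for _ in range(10_000)]
print("fraction of answers equal to 3.14:", samples.count(3.14) / len(samples))
```

An interrogator who only sees the answers would have a hard time telling this apart from a genuinely continuous machine with a bit of jitter.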

The Argument from Informality of Behavior says that machines follow a rule book, which tells them how to behave under certain circumstances. Moreover, the behavior of a machine can be completely studied in a reasonable amount of time, such that we’ll be able to predict the behavior of a machine perfectly in any given situation. Humans are not predictable in such a fashion. Hence, humans and machines are different, and can be easily distinguished. Turing argues against this by saying that it is possible that such a “rule book” for humans may also exist, and the unpredictability of humans is just a result of the fact that we haven’t found all the rules in the book yet. Moreover, he says he has written a program on a relatively simple Manchester computer, which when supplied with one 16 digit number returns another such number within 2 seconds. He claims that humans will not be able to predict what number this program returns even if they get a thousand years to study the machine.
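We don't know what Turing's Manchester program actually computed, so the following is only a modern stand-in for the same idea: a short, fully deterministic rule whose replies are very hard to predict from examples. It uses SHA-256 from Python's standard library, and the function name is hypothetical.

```python
import hashlib


def reply(number):
    """Map one number of up to 16 digits to another, deterministically."""
    if not 0 <= number < 10 ** 16:
        raise ValueError("expected a number with at most 16 digits")
    # Hash the zero-padded decimal string, then fold the digest back to 16 digits.
    digest = hashlib.sha256(f"{number:016d}".encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % 10 ** 16


for n in (1234567890123456, 1234567890123457):
    print(n, "->", f"{reply(n):016d}")
```

The machine is following a rule book only a few lines long, yet watching its replies tells you essentially nothing about what it will say next.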

One of the more interesting sections of the paper is where Turing says that if player B is telepathic, then this game would break down, as she (player B is a woman) would easily be able to pass tests like ‘C asks “What number am I thinking?”‘. Clearly, a machine would be unable to think of this number with any degree of certainty. While Turing contends that Telepathy is indeed real, he overcomes this problem by suggesting that all the participants sit in “telepathy proof” rooms.

Learning Machines

Turing says that in some sense, a machine can be made to become like a “super-critical” brain (something which is capable of devising plans of action after being given an idea), and that in another sense a brain is also a machine. However, these ideas are likely to be contested. Proof that a learning machine can exist can be given only when one such machine is constructed, which for all practical purposes lies far ahead in the future.

But what are the constraints when one tries to construct such a machine? Turing cites estimates of 10^{10} to 10^{15} binary digits for the storage capacity of the human brain, and guesses that about 10^9 would be enough to play the imitation game well. These binary digits can be thought of as analogues to neurons, which are 1 when they fire, and 0 when they don’t. A particular combination of neuron firing leads to resultant thoughts and actions. Turing thinks that storage of this order can be added to a computer without great difficulty (he notes that 10^7 binary digits was already practicable in his day). Hence, there is no physical constraint on constructing this device. The only question is, how do we program a computer to behave like a human being?

One could theoretically observe human beings closely, see how they behave under all possible circumstances, and then program a computer to behave in exactly that way. However, such an endeavor is likely to fail. One can instead program a computer to behave like a human child, and then make it experience the same things that a human child experiences (education, interacting with others, etc). Because a child is mostly a blank slate, and forms its picture of the world and behavioral paradigms based on its experiences, a computer may learn how to be a human adult (and hence imitate a human adult) in exactly the same way. Turing concedes that constructing a “child machine” is a difficult task, and that some trial and error is required.

A child learns through a “reward and punishment” system. What rewards and punishments can a teacher possibly give to a machine? A machine has to be programmed to respond to rewards and punishments much like a child does. It should repeat behaviors for which it is praised by the teacher, and not repeat behaviors for which it is scolded or punished. Also, it can be fed with programs that ask it to do exactly as the teacher says. Although inferring how the machine should behave based on fuzzy input from the external world might lead to mistakes, Turing contends that this is not any more likely than “falling over unfenced cliffs”. Moreover, suitable imperatives can be fed into the machine to further reduce such errors.
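A toy version of the reward-and-punishment idea might look like this. It is my own illustration, not Turing's design: the machine keeps a weight for each behaviour, picks behaviours in proportion to their weights, and the teacher's verdict scales the weight up or down. The multipliers 1.5 and 0.5 and the example behaviours are arbitrary assumptions.

```python
import random


def train(actions, teacher, rounds=200, seed=0):
    """Reinforce praised behaviours and suppress punished ones."""
    rng = random.Random(seed)
    weights = {action: 1.0 for action in actions}   # start with no preference
    for _ in range(rounds):
        names = list(weights)
        action = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        if teacher(action):
            weights[action] *= 1.5    # reward: more likely to be repeated
        else:
            weights[action] *= 0.5    # punishment: less likely to be repeated
    return weights


def teacher(action):
    """A hypothetical teacher who praises politeness."""
    return action == "say please"


print(train(["say please", "shout"], teacher))
```

After a handful of rounds the praised behaviour dominates, without anyone having written an explicit rule saying "always say please".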

A critical feature of “learning machines” is that they change their rules of behavior with time. Suppose they have been programmed to go to school every day at 8 am. One day, the teacher announces that school will begin at 9 am the next day. Learning machines are able to change the 8 am rule to 9 am. Adapting rules of behavior based on external input is also a feature of human beings.

How does a machine choose its mode of behavior though? Suppose the parents say “I want you to be on your best behavior in front of the guests”. What does that mean? There are a lot of near solutions to this problem- the machine could sit in a corner silently throughout the party, it could perform dangerous tricks to enthrall and entertain the guests, etc. How does it know which of these solutions is best? Turing suggests that we assign the machine a random variable. Let us suppose that all the “solutions” mentioned above are assigned numbers between 1 and 100. The machine could pick any number randomly, try out the solution corresponding to that number, evaluate the efficacy of that solution, and then pick another number. After a certain number of trials (maybe 15), it could pick the solution that is the most effective. Why not try all numbers between 1 and 100? Because that would take too much time and computation.
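The trial-and-pick procedure described above can be sketched as a small random search. The scoring function here is entirely made up (I pretend that behaviour number 42 is what the parents had in mind), and the function names are hypothetical.

```python
import random


def pick_behaviour(evaluate, trials=15, low=1, high=100, seed=None):
    """Try a few randomly chosen candidate behaviours and keep the best one."""
    rng = random.Random(seed)
    candidates = rng.sample(range(low, high + 1), k=trials)   # distinct numbers
    return max(candidates, key=evaluate)


def party_score(behaviour):
    """Hypothetical score: how close a behaviour is to the (unknown) ideal, number 42."""
    return -abs(behaviour - 42)


print(pick_behaviour(party_score, trials=15, seed=1))
```

With 15 trials the machine usually lands on something reasonable but not perfect. Setting trials=100 would always find number 42, at the cost of trying every single behaviour at the party.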

By the time the machine is an adult, it will be difficult to accurately predict how the machine will behave in a certain situation, because we cannot know all the experiences that the machine has been through, and how its behavioral patterns have evolved. Hence, if we program something into it, it might behave completely unexpectedly- it might “surprise us”. Also, because its learned behavior and traits are unlikely to be perfect, it is expected to behave less than optimally in many situations, and hence mimic “human fallibility”. For all purposes, our machine will have become a human.

What is incredible to me is that this is exactly how neural networks behave. It is insane that Turing could envision such a futuristic technology more than half a century ago. Thanks for reading!

References

  1. Computing Machinery and Intelligence

The science of going in circles on doughnuts

The paper that I’m going to be reviewing today is A Topological Look at the Quantum Hall Effect. It explains an amazing coming together of topology and quantum physics, of all things, and provides a review of very important, Nobel prize winning work in Physics.

The Hall Effect

We all learned in high school that if we pass current through a conductor in a magnetic field, and the magnetic field is not parallel to the current, then the conductor experiences a force.

But does the conductor as a whole experience this force, or only the electrons inside it? Maxwell, in his Treatise on Electricity and Magnetism, said that “the mechanical force which urges the conductor…acts not on the electric current, but on the conductor which carries it.” This seems wrong, right? It is the charges that should experience the force, and not non-charged portions of the conductor.

Edwin Hall, a student at Johns Hopkins University, investigated further. He passed a current through a gold leaf in a magnetic field. He could see that the magnetic field was acting on the electrons themselves, as it altered the charge distribution inside the gold leaf, which he could measure with a galvanometer.

As you can see, there is a separation of charges across the breadth of the plate. One may imagine that instead of flowing through the whole plate uniformly, as soon as the current enters the plate, it gets deflected to one side of the plate, although it keeps moving forward. The potential difference created by this separation of charges is known as the Hall voltage. Hall conductance is the current already flowing through the plate, divided by the Hall voltage. Remember Ohm’s law, which says V=IR? This implies that \frac{1}{R}=\frac{I}{V}. This \frac{1}{R} is the conductance. Of course, the difference is that this current is not caused by the Hall voltage. Hence, this formula cannot directly be obtained from Ohm’s law, but let’s shut our eyes to these details.

The direction of this voltage is not fixed, and depends on the charge of the charge carriers in the conductor. Hence, measuring the direction of this voltage is an easy way to determine the nature of charge carriers in a conductor. Some semiconductors owe their efficacy to having positively charged charge carriers, instead of the usual electrons.
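To see how the sign of the Hall voltage reveals the sign of the charge carriers, here is a minimal sketch using the standard textbook expression V_H = IB/(nqt) for a thin plate. This formula is not derived in the article, and the sample values below are made up for illustration.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)

def hall_voltage(current, b_field, carrier_density, carrier_charge, thickness):
    """Classical Hall voltage V_H = I B / (n q t), in volts (SI units)."""
    return current * b_field / (carrier_density * carrier_charge * thickness)

# Illustrative (made-up) sample: 1 A through a 0.1 mm thick plate in a 1 T field,
# with 1e25 carriers per cubic metre.
params = dict(current=1.0, b_field=1.0, carrier_density=1e25, thickness=1e-4)

v_electrons = hall_voltage(carrier_charge=-E_CHARGE, **params)  # negative carriers
v_holes = hall_voltage(carrier_charge=+E_CHARGE, **params)      # positive carriers

print("Hall voltage with electrons:", v_electrons)
print("Hall voltage with positive carriers:", v_holes)
```

The two voltages have the same magnitude and opposite signs, which is why a voltmeter across the plate is enough to tell the two kinds of carriers apart.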

The Quantum Hall Effect

In 1980, Klaus von Klitzing was studying the conductance of a two-dimensional electron gas at very low temperatures. Now remember that resistance=voltage/current (because conductance=current/voltage as discussed previously). As voltage increases, this formula says that resistance should increase. Now as the Hall voltage increases with an increase in magnetic field strength, as the magnetic field strength increases, so should the resistance. But what is the pattern of this increase? It is pretty surprising, surprising enough to get von Klitzing a Nobel prize for studying it.

Note: it seems that the resistance is increasing in small steps for lower magnetic field strength, but bigger steps for higher values of the magnetic field. However, the conductance, which is its reciprocal, decreases by the same amount for each step. In other words, conductance is quantized. A quantity is referred to as quantized if it can be written as some integer multiple of some fundamental quantity, and increases or decreases in steps of the same size.

Why do we get this staircase graph, and not, say, a linear graph? If we consider the conductance=1/resistance values in the above graph, we see that all the successive values are integer multiples of a constant we find in nature- e^2/h, irrespective of the geometric properties or imperfections of the material. Here e is the electric charge of an electron, and h is Planck’s constant. Why does this happen?
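The staircase can be reproduced with a little arithmetic. The sketch below works in SI units, uses the exact SI values of e and h, and labels each plateau by an integer n. The labelling is an assumption made for illustration; in the integer effect, lower n corresponds to higher magnetic field.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs (exact SI value)
PLANCK_H = 6.62607015e-34   # Planck's constant in joule-seconds (exact SI value)

# The fundamental unit of conductance, e^2/h, in siemens.
G0 = E_CHARGE ** 2 / PLANCK_H

def plateau_conductance(n):
    """Hall conductance on the n-th plateau: n times e^2/h."""
    return n * G0

def plateau_resistance(n):
    """Hall resistance on the n-th plateau: the reciprocal, h/(n e^2)."""
    return 1.0 / plateau_conductance(n)

for n in range(1, 5):
    print(n, plateau_conductance(n), plateau_resistance(n))

# Step sizes between neighbouring plateaus.
conductance_steps = [plateau_conductance(n + 1) - plateau_conductance(n) for n in range(1, 4)]
resistance_steps = [plateau_resistance(n) - plateau_resistance(n + 1) for n in range(1, 4)]
print(conductance_steps)
print(resistance_steps)
```

The conductance steps all equal e^2/h, while the resistance steps grow as n gets smaller. This is the pattern described in the note above: equal steps in conductance, unequal steps in resistance.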

Robert Laughlin offered an explanation. Consider an electron gas that is cold enough, such that quantum coherence holds. This basically means that there’s not much energy in the system for all the particles to behave independently, and that all particles behave “similarly”. Hence, the behavior of the system can be described easily by a Hamiltonian dependent on a small number of variables (2 in this case). Laughlin imagined the electron gas to be a looped ribbon, with its opposite edges connected to different electron reservoirs.

Now imagine that there’s a current flowing in the ribbon along its surface in a circular motion. The presence of the magnetic field causes a separation of charges. Because the two edges are already connected to electron reservoirs (which can both accept and donate electrons easily), the separation of charges causes electrons to flow from one reservoir to another. If, say, the side connected to A becomes positive due to separation of charges and the side connected to B becomes negative, then electrons from A go into the ribbon, and electrons from the ribbon go into B. One may imagine a current flowing between the electron reservoirs, which is maintained because the magnetic field maintains the Hall voltage between the two ends of the ribbon. However, in order to increase this Hall current, the magnetic flux \Phi will have to be increased (which will increase the Hall voltage).

The Aharonov-Bohm principle says that the Hamiltonian describing the system is gauge invariant under (magnetic) flux changes of \Phi_0=hc/e, which is the elementary quantum of magnetic flux. The Hamiltonian of a system may be thought of as its “energy”. Hence, although the Hall voltage increases every time we increase the magnetic flux by one quantum, the “energy” of the system remains unchanged. This quantized increase in Hall voltage causes a quantized increase in resistance, which can be seen from the step graph above. If the increase in resistance was not quantized, then the graph would have been a smooth curve, and not “stair-like”.

A small note about what it means for the Hamiltonian to be gauge invariant under flux changes of \Phi_0: it is not like the Hamiltonian doesn’t change with time. By the time the flux increase happens, all the eigenstates of the Hamiltonian will have changed. What gauge invariance here means is that if the measured eigenstate of the Hamiltonian at time t_0 is \psi(t_0), and we finish increasing the magnetic flux by \Phi_0 at time t_1 and also measure the Hamiltonian at that time, then the eigenstate we get is going to be \psi(t_1), and not some other eigenstate \psi^*(t_1). For simplicity, we will just assume that \psi(t_0)=\psi(t_1) in this article, and that the Hamiltonian doesn’t change with time.

Another important note is the following: the charge transfer between the two reservoirs is quantized. In other words, when I increase the magnetic flux by one quantum of flux, the Hall current increases by a certain amount. When I increase the magnetic flux by the same amount again, the same increase in Hall current is seen. There is no good reason for this in quantum mechanics. This is because although the increase of the magnetic flux by \Phi_0 brings back the system to its original Hamiltonian eigenstate, in quantum mechanics there is no guarantee that the same initial conditions will lead to the same outcome. Quantum states randomly collapse into any one of their eigenstates, and don’t have to collapse to the same eigenstate each time. This fact, of the same increase in Hall current, needs some topology to be explained. But before that, we explore some concepts from geometry- namely, curvature.

Adiabatic curvature

“Adiabatic” evolution occurs when the external conditions of the system change gradually, allowing the system to adapt its configuration.

What is curvature? Let us explain this with an example. Take the earth. Starting at the North Pole, build a small road that comes back to the North Pole. By small, we mean that the road should not go all the way around the earth. If you drive your car along this road, the initial orientation of the car will be different from the final orientation of the car (when you arrive). Curvature is the limit of this difference between initial and final orientations of the car, divided by the area enclosed by the road, as the loop tends to a point. Note that this would never happen on flat land which was “not curved”: in other words, the initial and final orientations of the car would be the same.

But how does curvature fit into quantum mechanics? Consider a curved surface parametrized by two angles- \theta and \Phi. Each point of this curved surface corresponds to an eigenstate of the Hamiltonian, which can be written as H(\theta,\Phi). Essentially, this is a configuration space of the Hamiltonian of the system. Let us now build a quantum (ground) state, say e^{i\alpha}|\psi(\theta,\Phi)\rangle. If we build a small loop from a point to itself, this ground state, as it is “driven” around the loop, changes its phase from \alpha to something else. The curvature of this configuration space, measured using the limiting process as described above, turns out to be 2 Im \langle \partial_{\theta}\psi|\partial_{\Phi}\psi\rangle.

Why do we care about bringing things back in a loop, as far as Hamiltonians are concerned? This is because at the end of the process of increasing the magnetic flux by one quantum, the initial and final Hamiltonians have to be the “same” in our case. We always have to come back in a loop to our initial Hamiltonian.

Hall Conductance as Curvature

What are \theta and \Phi as described above? \Phi is associated with the Hall voltage, and \theta can be thought of as a function of \Phi. More precisely, \theta is defined such that I=c\partial_{\theta}H(\Phi,\theta). The Hamiltonian is periodic with respect to \Phi or Hall voltage. If the period of H with respect to \Phi is P, then the process of increasing the magnetic flux by one quantum can be thought of as changing the value of \Phi to \Phi+P, which leaves the Hamiltonian unchanged. Geometrically, if the configuration space can be thought of as a curved surface, this process can be thought of as moving along a circle and returning to the point we started out from.

For an adiabatic process where \Phi is changed very slowly, Schrödinger’s equation gives us \langle \psi|I|\psi\rangle=\hbar cK\dot{\Phi}. \langle \psi|I|\psi\rangle can be thought of as the expected value of the current (the weighted average of all possible values of the current), and \dot{\Phi}, the rate of change of the flux, is (up to a constant factor) the voltage by Faraday’s law. Hence, K, going by the formula discussed above, is (some constant multiplied with) conductance. In this way, Hall conductance can be interpreted as the curvature of the configuration space of the Hamiltonian.

Chern numbers

What is this “curved surface” that forms the configuration space of the Hamiltonian? As it is periodic in both its parameters \Phi,\theta, we can think of it as a torus. Each point of this configuration space corresponds to an eigenstate of the Hamiltonian.

Let the torus given above be the configuration space of the eigenstates of the Hamiltonian. Consider the integral \int K dA. The difference between this integral evaluated on the orange patch and the external blue patch is an integral multiple of 2\pi. This integer is known as the Chern number. You may have seen the famous Gauss-Bonnet Theorem, which states that \frac{1}{2\pi} \int K dA=2(1-g). The right hand side, which is 2(1-g), is the Chern number in the special case that the loop around the orange area is contractible (can be continuously shrunk to a point).

The Chern number associated with the diagram given above does not change when we deform the orange patch slightly. However, it will change if the boundary of the orange patch changes its topological properties (i.e. if its homotopy class changes). One way it could do that is if the loop circled the “doughnut-hole” of the torus completely.
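The claim that curvature integrated over a torus comes out as an integer multiple of 2\pi can be checked numerically. The sketch below is not the quantum Hall Hamiltonian from the paper. It uses a toy two-level Hamiltonian H = d \cdot \sigma whose two angles kx and ky stand in for \theta and \Phi; the model and the parameter m are assumptions chosen only because the lower eigenstate can be written down by hand. The curvature is summed plaquette by plaquette on a grid, using the phase picked up around each small loop.

```python
import cmath
import math

def lower_state(kx, ky, m):
    """Unnormalized lower-energy eigenvector of the toy model H = d . sigma,
    with d = (sin kx, sin ky, m + cos kx + cos ky). Assumed model, for illustration."""
    dx = math.sin(kx)
    dy = math.sin(ky)
    dz = m + math.cos(kx) + math.cos(ky)
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Two expressions for the same eigenvector (they differ by a complex factor).
    # Each one vanishes at one pole of d, so use whichever is safely nonzero here.
    if dz < 0:
        return (complex(dz - r, 0.0), complex(dx, dy))
    return (complex(-dx, dy), complex(dz + r, 0.0))

def overlap(u, v):
    """Inner product <u|v> of two two-component states."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]

def chern_number(m, n=24):
    """Sum the phase picked up around every grid plaquette of the torus,
    then divide by 2*pi."""
    step = 2 * math.pi / n
    states = [[lower_state(i * step, j * step, m) for j in range(n)] for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Corners of one small loop; indices wrap around because both angles are periodic.
            a = states[i][j]
            b = states[(i + 1) % n][j]
            c = states[(i + 1) % n][(j + 1) % n]
            d = states[i][(j + 1) % n]
            # Phase of the state after being carried around the loop a -> b -> c -> d -> a.
            loop = overlap(a, b) * overlap(b, c) * overlap(c, d) * overlap(d, a)
            total += cmath.phase(loop)
    return total / (2 * math.pi)

print(chern_number(1.0))  # a whole number of magnitude 1, up to rounding error
print(chern_number(3.0))  # zero, up to rounding error
```

The overall sign of the result depends on orientation conventions, so only its magnitude is meaningful here. Nudging m slightly (say from 1.0 to 1.2) leaves the integer unchanged, which is the numerical counterpart of "small deformations do not change the Chern number".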

Let’s bring it all together

Now that we have all that seemingly useless background, let’s bring it all together. Let’s start with the torus, representing all eigenstates of the Hamiltonian. Every time we increase the magnetic flux by one quantum, it’s like we’re going one full circle around the doughnut, and returning to our original Hamiltonian eigenstate. Both the pink and red curves given below are examples of this.

This causes the Chern number to change by an amount, say a. Why does the Chern number change? Because we’re adding the integral \frac{1}{2\pi}\int K dA to the previous integral. When we again increase the magnetic flux by one quantum, we go in another full circle around the torus. This causes the Chern number to change by the same amount a. Both the charge transferred between the reservoirs and the conductance are exactly this Chern number in fundamental units (where constants like h and c are taken to be 1). Hence, each time we increase the magnetic flux by one quantum, we see the same increase in Hall current, and the same decrease in conductance.

Why does topology have anything to do with quantum mechanics? We might think of this as an analogy that we constructed, that just happened to work absurdly well. We interpreted conductance in terms of curvature. Hence, the change in conductance can be thought of as an integral of curvature on copies of the torus. These integrals satisfy strict mathematical relations on the torus, via the Chern-Gauss-Bonnet Theorem. Hence, we can use this theorem to approximate how much the conductance changes every time we move along a certain loop on the torus (every time we increase the magnetic flux by one magnetic flux quantum).

Thanks for reading!

References

  1. A Topological Look at the Quantum Hall Effect
  2. Adiabatic Theorem