IMO 2019, Problem 1

The International Math Olympiad 2019 had the following question:

Find all functions f:\Bbb{Z}\to \Bbb{Z} such that f(2a)+2f(b)=f(f(a+b)).

The reason that I decided to record this is because I thought I’d made an interesting observation that allowed me to solve the problem in only a couple of steps. However, I later realized that at least one other person has solved the problem the same way.

The right hand side is symmetric in a,b: clearly, f(f(a+b))=f(f(b+a)). Hence, symmetrizing the left side as well, we get f(2a)+2f(b)=f(2b)+2f(a). This implies that f(2a)-f(2b)=2(f(a)-f(b)). Setting b=0, we get f(2a)=2f(a)-f(0).

Now substitute a=x+y and b=0 to show that f(x)-f(0) is additive, and hence linear. Plugging a linear function back into the original equation shows that f(x)=2x+f(0) (with f(0) an arbitrary integer) and f(x)=0 are the only solutions to this question.
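
Filling in the intermediate step, here is one way (a sketch along the lines above) to get additivity from the relations already derived:

```latex
% With b = 0, the relation f(2a) = 2f(a) - f(0) turns the original
% equation into f(f(a)) = 2f(a) + f(0) for all a. Substituting this
% back into f(2a) + 2f(b) = f(f(a+b)) gives
2f(a) - f(0) + 2f(b) = 2f(a+b) + f(0)
\implies f(a+b) - f(0) = \bigl(f(a) - f(0)\bigr) + \bigl(f(b) - f(0)\bigr),
% so f(x) - f(0) is additive on \Bbb{Z}, i.e. f(x) = Mx + f(0).
% Plugging f(x) = Mx + f(0) back in forces M = 0 (and then f(0) = 0,
% giving f = 0) or M = 2, with f(0) arbitrary.
```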

Poverty, mental health and rhesus monkeys

Today I came across a very interesting paper titled “Values encoded in orbitofrontal cortex are causally related to economic choices” by Ballesta, Shi, Conen and Padoa-Schioppa. I haven’t read and analyzed the paper fully, partly because of the many statistical tools that I would have to learn to assess it carefully. However, I did manage to read some important bits, and it set me thinking about how it directly applies to so many of us in our daily lives.

In this paper, the researchers claim that our subjective values of things are hard-coded in our orbitofrontal cortices. That is just a fancy way of saying that if we like burgers more than fries, this information is stored in a part of the brain that lies directly above our eyes. Hence, every time we’re offered a choice between burgers and fries, that part of our brain implores us to choose burgers.

Experiment

The following experiment was done on rhesus monkeys. They were given a choice between 2 drops of grape juice and 6 drops of peppermint tea. Depending upon their subjective preferences (as coded in their orbitofrontal cortices), they would prefer one or the other. For example, let us assume that a monkey named Tim would mostly choose the 2 drops of grape juice over the peppermint tea. How exactly is Tim offered these choices?

Tim is first shown a picture of 2 drops of grape juice. Then, after a 1 second delay, he is shown a picture of 6 drops of peppermint tea. He is then asked to choose between the two images. He consistently chooses the grape juice.

However, suppose a current of 100 \mu A is passed through his orbitofrontal cortex every time he is shown the grape juice image. The passage of this current causes Tim to start choosing peppermint tea slightly more often than before. Note that Tim does not develop a clear preference for peppermint tea as such; it is just that his choices become more randomized. Why does this happen? The authors of the paper think that the current interferes with the working of the orbitofrontal cortex, which was initially telling him to consistently choose the grape juice.

(Figure from the paper: AB means that A is offered first and B second; StimON means that current is passed during the first offer.)

Depression and poverty

If you’ve been depressed before, you can surely empathize with this. Let us suppose that in a happy and stable state of mind, you’d always prefer the color red over blue. Given the choice between two t-shirts of those colors, you’d consistently choose the red one. However, once depression hits, you don’t really care anymore. You’d start choosing the blue one more often than before, because they’re all the same and it doesn’t matter. Your brain just refuses to do the computation that leads you to conclude that you prefer red over blue (yes, even choices are a result of mental computation). And the absence of computation leads to random choice.

What about poverty? Imagine that poverty causes a small current to run through your orbitofrontal cortex. This causes your brain’s computational capacities to plummet, leading you to make arbitrary choices, or perhaps choices that are dictated by short term thinking (short term thinking obviously requires less computation than long term thinking). Say at the end of a hard day’s labor, you have $100 in your pocket. If you save $50 every day for a year, you will have saved enough money to accumulate interest, and perhaps to tide you over bad times. But c’mon. You’re incapable of that computation. Your orbitofrontal cortex is screaming at you to save at least some money. You’ve been through terrible times, and if you’d saved in the past, you’d have been so much better off. You know that you should save money. You’ve definitely been burned enough times to know that. But you cannot hear your brain screaming over the current. You’d rather go to the bar right now and drink it all up. Tomorrow is another day.

What about self-destructive behaviour? What if the lack of will power is basically a lack of computational capabilities? This is perhaps getting into speculative territory. But these are burning questions that could be answered with experimental evidence similar to that described above.

The analysis above is different from the paper in one significant aspect: in the case of depression and poverty, there’s a current that is always running through the orbitofrontal cortex. Hence, our brain, now incapable of computing and hence of making a good choice, makes a random choice. However, in the experiment, the current runs through Tim’s brain only when the first choice is being shown. In effect, the current interferes with Tim’s ability to register or analyze the first choice properly. Hence, he starts choosing the second option, which he can at least perceive properly (even though he may not like it per se). If current through the orbitofrontal cortex does indeed decrease computational capabilities, then this current prevents Tim from carrying out the computation needed to register the first choice, leading him to go for the second choice, which he has registered better.

It would be interesting to see an experiment in which Tim has a clear preference for choice A, and a current is running through his brain when both choices are presented. Will this randomize his choices between A and B? That would then provide supporting evidence for my brain current theory of depression and poverty.

References

  1. “Values encoded in orbitofrontal cortex are causally related to economic choices”, by Ballesta, Shi, Conen and Padoa-Schioppa.

Economics with bad drawings

All things considered, I am not very good at reading.

Let me elaborate on that. Most of my learning, throughout my life, has been based on reading texts. Recently, while trying to wrap my head around complicated geometric notions described in texts without pictures, I realized that I was pretty horrible at understanding concepts this way. Soon, I began making pictures for everything, and it seemed to help in a lot of areas that I thought I could never understand, including in my own mathematical research.

One field that I could never really understand that well before is Economics. Today, I spent some time reading a glossary of economic concepts and coming up with pictorial representations for all of them.

Evolution, Wars and Game theory

This post is on Evolutionary Game Theory, derived from the book Networks, Crowds and Markets: Reasoning about a highly connected world by David Easley and Jon Kleinberg. This seemingly esoteric field shines a light on an incredible number of things, including evolution, wars, and society itself. Although the subject is inherently mathematical, the mathematics only serves to distract from the ultimate message in a lot of examples. Hence, this article mostly avoids formulae and lengthy calculations. Almost all the examples constructed in this article are my own.

Introduction

What does Game theory actually mean? Imagine that India and Pakistan are about to go to war. Both sides have declared the other to be responsible for recent provocations, and have justified to their citizens the use of overwhelming force to crush the enemy. Who should attack first? Should they attack at all?

If both of them are able to stop themselves from going to war, both of them will benefit. No soldiers or citizens killed, no blow to the economy, you name it. However, if one of them attacks first by surprise, it will catch the other sleeping and quickly be able to gain territory. This will boost the economy and improve public morale, ensuring that the current government wins the next election. Hence, the benefits of attacking first outweigh the benefits of refraining from attack in this situation.

If India and Pakistan are not in communication with each other, they will each expect that the other will want to attack first. Hence, if only to preemptively deny this advantage to the other, they will both attack, leading to entrenched warfare, massive destruction, with very little gain.

If one could quantify the benefits of going to war or abstaining from war in a table, it would look something like this:

India \ Pakistan    Not attack    Attack
Not attack          0, 0          -10, 10
Attack              10, -10       -2, -2
Benefits and losses have been represented as numbers between -10 and 10.

Hence, if India and Pakistan are not in communication with each other, out of sheer insecurity, they will attack each other, leading to a bad result in which both of them suffer. However, if they are in communication with each other, and show cooperation by agreeing not to attack one another, they will both avoid a whole lot of damage and suffering.

This is an example of the classic Prisoner’s Dilemma, in which insecurity amongst people leads to bad conclusions. If I take pains to help out a classmate who will never help me in return, then without communication I will just assume that I’m being a fool, and that I might as well be unhelpful to that person. This inevitably leads to an uncomfortable atmosphere in class for both. However, if both of us have a verbal or non-verbal understanding that we will always help the other out, then both of us can benefit. Life is generally not a zero-sum game, and humans generally do benefit from cooperation, as long as they play by the rules.

Game theory is the study of decision making in such circumstances.
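
To make this concrete, here is a minimal Python sketch (my own, not from any of the books discussed here) that enumerates the pure-strategy Nash equilibria of the payoff table above; only the payoff numbers come from the table:

```python
# Enumerate pure-strategy Nash equilibria of the 2x2 war game above.
# Payoffs are (India, Pakistan), taken from the table.
STRATEGIES = ["Not attack", "Attack"]
PAYOFFS = {
    ("Not attack", "Not attack"): (0, 0),
    ("Not attack", "Attack"): (-10, 10),
    ("Attack", "Not attack"): (10, -10),
    ("Attack", "Attack"): (-2, -2),
}

def is_nash(row, col):
    """Neither player can gain by unilaterally switching strategies."""
    r_pay, c_pay = PAYOFFS[(row, col)]
    row_is_best = all(PAYOFFS[(r, col)][0] <= r_pay for r in STRATEGIES)
    col_is_best = all(PAYOFFS[(row, c)][1] <= c_pay for c in STRATEGIES)
    return row_is_best and col_is_best

for row in STRATEGIES:
    for col in STRATEGIES:
        if is_nash(row, col):
            print(f"Pure Nash equilibrium: ({row}, {col})")
# Prints only (Attack, Attack): without communication, mutual attack is
# the unique equilibrium, even though mutual peace is better for both.
```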

Evolutionary Biology

Evolution, as we all know, is the process through which some genetic traits persist over time, while others are wiped out. If there are two types of owls on an island, identical in every way except that one type is blind and the other can see, then soon enough the blind owls will be wiped out from the island. This is because they will be in direct competition with the seeing owls for food and other resources. How did one type of owl come to be blind while the other did not? Most genetic differences arise because of random mutation. It is possible that owls were always blind, and that due to a random mutation one of them could see. This mutated owl could hunt better, and soon gave birth to a lot of kids who could also see. Gradually, these owls with the power of vision out-competed the “original” blind owls, established their dominance over the island, and within a couple of hundred years those blind owls were nowhere to be found.

Note that evolution is the composite of random mutation and natural selection. It is not the process through which one species naturally dominates another. For instance, when humans first reached Australia more than 40,000 years ago, they hunted many species into extinction. This is not explained through evolution. Evolution only occurs gradually between members of the same species that are distinct only in small ways, their differences caused by random mutation.

Evolutionary Game Theory

How does game theory come into the picture though? Let us suppose that in a large community of, say, a thousand pigs, ten pigs have three legs instead of four due to random mutation. What will happen? The other pigs will out-compete these mutated pigs for food, shelter and mates. These mutated pigs will probably starve a lot of the time, not have many mates, and hence have far fewer children than the rest. The fraction of three-legged pigs will slowly decrease generation upon generation, until it is negligible (or even zero). A genetic trait is called evolutionarily stable if a population with that trait can successfully drive out any mutation, provided that the fraction of such mutants is small enough. Having four legs seems to be an evolutionarily stable trait. Of course, we haven’t proved it yet. What if even one pig with 16 legs can out-compete this pig population for food and mates? This is unlikely, as this 16-legged pig will only be made fun of by the others, and will find it hard to attract mates in order to procreate. Hence, in all likelihood, having 4 legs is evolutionarily stable for this population of pigs.

Let us take another example. Imagine that in a society with a thousand humans, one socially awkward child with a super high IQ is born. If there is only one such child, he/she will struggle to fit in, probably fail to attract mates in competition with others, and hence this mutation will die out. Hence it is an average IQ, and not a very high one, that is evolutionarily stable. However, if a sufficiently large number of children are high IQ mutants, they can fend for each other, procreate with each other, and ultimately beat others in the competition for limited resources. Hence, high IQ can spread only if the number of high IQ mutants is large enough to begin with. This would perhaps explain the current state of the world to some extent.

But what happens if the world is mostly comprised of high IQ individuals, and a small number of low IQ mutants is introduced? Intense competition between the high IQ individuals may lead to their mutual destruction. For example, two technologically advanced nations may blow each other up with nuclear weapons, while less advanced nations survive unscathed. In the aftermath of this destruction, the low IQ mutants may procreate comfortably, soon populating society with mostly low IQ individuals. Hence, high IQ is not evolutionarily stable.

In the book Sapiens, Yuval Noah Harari notes that Neanderthals actually had bigger brains than Homo sapiens, and a bigger brain generally translates to a higher IQ. Hence, it has always been a mystery how Sapiens managed to exterminate the Neanderthals and spread over the whole world. Perhaps the fact that high IQ is not evolutionarily stable explains this counter-intuitive phenomenon.

Nash equilibrium

What is Nash equilibrium? It is an arrangement between two players in which neither side will gain from unilaterally changing their strategy. For example, let us suppose that India and Pakistan agree not to go to war with each other. If Pakistan now changes its stance and attacks India, India will retaliate by attacking Pakistan. In this way, things will only become worse if either country changes its stance from peace to war. This shows that both countries being at peace is a Nash equilibrium (note that this relies on retaliation, i.e. on the game being repeated; in the one-shot game tabulated above, mutual attack is the only Nash equilibrium).

What does Nash equilibrium have to do with evolution? Things that are evolutionarily stable are also Nash equilibria. What this means is the following: suppose we have a population with an evolutionarily stable trait (say, average IQ). If a small fraction of individuals suddenly becomes high IQ, that small fraction will soon be wiped out. Hence, when two average IQ persons are competing for a resource, it will not benefit either of them (over the long run) to become a high IQ mutant. Hence, the only stable arrangement, in which everyone is well-off over the long run, is if everyone has an average IQ.
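
For concreteness, here is the textbook ESS condition applied to the same war game, reusing STRATEGIES and PAYOFFS from the earlier sketch (since the game is symmetric, we read all payoffs from the row player's perspective):

```python
# Strategy s is evolutionarily stable if a rare mutant t always fares
# worse against the incumbent population: either E(s,s) > E(t,s), or
# E(s,s) == E(t,s) and E(s,t) > E(t,t).
def is_ess(s):
    for t in STRATEGIES:
        if t == s:
            continue
        e_ss, e_ts = PAYOFFS[(s, s)][0], PAYOFFS[(t, s)][0]
        e_st, e_tt = PAYOFFS[(s, t)][0], PAYOFFS[(t, t)][0]
        if not (e_ss > e_ts or (e_ss == e_ts and e_st > e_tt)):
            return False
    return True

print([s for s in STRATEGIES if is_ess(s)])  # ['Attack']
```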

Mixed strategy

In some situations, we don’t have an evolutionarily stable trait.

Consider the scenario in which most countries of the world are peaceful and minding their own business. Suddenly, the US starts attacking everybody, soon accumulating a load of wealth and territory. Other countries will soon follow suit, and the world will descend into chaos. But in the aftermath of this bloody destruction, people will realize that Switzerland has remained unscathed because it refused to descend into this horrible mess. They will realize their folly, and decide to again become peaceful. Soon, most countries in the world are again peaceful. But then they see Russia attacking countries near its borders, slowly gaining control over large parts of the world. To prevent the scales from being tilted too heavily in Russia’s favor, other countries soon join the fray.

In this way, much of world history has been a story of wars, interspersed with periods of genuine peace. Neither war, nor peace are evolutionarily stable. It took only one belligerent country to make all other countries go to war, and it took only one peaceful, relatively unscathed country to make everyone return to peace. I don’t mean to overplay my hand, but this could be a possible reason why humanity has often oscillated between wars and peace, and neither has stuck.

What should one do then? The authors of the book propose that we should have a mixed strategy in situations without pure evolutionary equilibria. What would that look like in this situation? Suppose all countries mutually decide to initiate an attack on another only one time out of ten (of course, all countries can retaliate whenever they’re attacked first). If a country suddenly starts attacking everyone, it will find itself always at war, its people and economy destroyed. Hence, it will adapt by attacking less frequently. If a country that is involved in world politics never initiates an attack, it will soon enough find itself attacked first by another country, and will hence suffer considerable damage. Soon, it will start initiating some attacks of its own, to never let the past repeat itself. Hence, attacking sometimes, perhaps one time out of ten, is the evolutionarily stable strategy, as eventually everyone will start doing this.
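
Here is a rough sketch of where a number like “one time out of ten” could come from. The payoffs below are my own made-up hawk-dove numbers (not from the book): V is the value of what an attacker stands to gain, C the cost of all-out war, and the classic indifference calculation gives an equilibrium attack probability of V/C:

```python
# Standard hawk-dove calculation with hypothetical numbers.
# "Hawk" = initiate attacks, "Dove" = stay peaceful.
V, C = 2.0, 20.0  # assumed: spoils are small relative to the cost of war

# Against a population attacking with probability p:
#   E[Hawk] = p * (V - C) / 2 + (1 - p) * V
#   E[Dove] = p * 0           + (1 - p) * V / 2
# Setting E[Hawk] = E[Dove] (indifference) and solving yields p = V / C.
p = V / C
print(f"Evolutionarily stable attack probability: {p:.0%}")  # 10%
```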

Conclusion

As one may imagine, game theory can probably be applied almost everywhere in human life. When people lack cooperation, insecurity ultimately leads them to destroy their own peace of mind and that of others. For example, in a society in which everyone throws garbage on the roads, one person deciding not to do so, and taking pains to clean the streets, is at a disadvantage. Other people will keep throwing garbage on the streets. Hence, that person will soon decide to stop his/her futile efforts, and litter the streets like the others. This shows that the situation in which everyone throws garbage on the streets is evolutionarily stable. However, if people cooperate and mutually decide to never throw garbage on the streets, and also to punish/fine individuals who pollute the streets, our streets will remain clean forever. Hence, clean streets can be evolutionarily stable only when people communicate and cooperate with each other. I stole this example from the book Algorithms to Live By, by Brian Christian and Tom Griffiths. This book also re-kindled my desire to finally understand Game theory, after multiple failed attempts in the past.

References

  1. Evolutionary Game Theory, by Easley and Kleinberg

Effective Altruism- October

I am attaching my EA donation slip for October below. I took the Effective Altruism pledge last year, in which I pledged to donate 10% of my lifetime earnings to the organization.

Today is a special day for many Indians, including myself, as today is Gandhi Jayanti, Mahatma Gandhi’s birthday. Reading his autobiography was one of the most influential decisions of my life, and a lot of my life’s decisions were based on trying (and mostly failing) to follow in his footsteps- like going vegan (very few people know that Gandhi was vegan *before it was cool*), trying to donate a part of my salary, trying to be helpful to everyone, etc. Although I’ve failed in a lot of the above, I’ve also had some modest successes.

Gandhi’s influence on my life has changed over the past few years. I’ve helped myself to the odd dairy dessert way too many times. I’ve also been treated extremely badly by people whom I’ve tried to be nice to. But most importantly, I’ve slowly realized that Gandhi’s ideas were mostly bad for society in the long run. His vision tries to negate basic human tendencies: selfishness, by trying to establish primarily agricultural and socialistic communities; sexuality, by trying to promote abstinence; etc. He may have had some success with these, but I’ve mostly failed.

After doing some random unfocused reading, I have realized that before having grand visions for society, we should have some evidence on whether our vision will work or not. I’ve found Abhijit Banerjee’s writings on experimentation in economics to be quite relevant in these matters. Despite Gandhi’s lack of insight into behavioral economics, he was an amazing man, and I am humbled by the opportunity to be able to donate some money towards social upliftment on his birthday.

Forecasting the American presidential elections

The article that I will be talking about today is “How the Economist presidential forecast works” by G. Elliott Morris. I have always wanted to know more about how American presidential forecasts work, and how reliable they are. This is my attempt to try and understand it. Note that the authors have developed code for their statistical model, which they have posted here.

Poll position

How does one predict who will win the Presidential election? Simple. Randomly select a group of people from amongst the population, and note their voting preferences. If this selection process is unbiased and selects a large enough group of people, you should have a good indicator of who will win. Right? Unfortunately, this is not the full picture.

The time at which this poll is conducted is of great importance. At the beginning of the election year, a lot of people are undecided. “I don’t want to choose between two bad options, I’d rather go for a third party.” However, as the elections loom closer, people often return to their inherent preferences for either the Democrats or the Republicans. Hence, polls conducted at the beginning of the year are much less accurate than polls conducted right before the elections. For example, even as late as June 1988, George HW Bush was trailing his contender by 12 percentage points in polling averages. He went on to win by 8 points just five months later. Of course, national polls taken close to the election can also be unreliable. Hillary Clinton was leading Donald Trump by 8 percentage points as late as October 2016. She won the popular vote by only 2 points (and of course lost the election). For a fascinating explanation of the electoral college, and how a candidate can lose the election despite winning the popular vote, watch this.

So if national polls are not completely reliable (at least the ones conducted in the early stages), how can one predict the election? A lot of things, like market stability, global business, and even the stability of foreign governments, ride on being able to predict the American presidential election successfully. Hence, political theorists have put a lot of thought into it. It turns out that there are some “fundamentals” that predict the election outcome better than polls. The “fundamentals” that we are concerned with here are the state of the economy, the state of the country, etc. One such model that uses “fundamentals” is “Time for Change”, developed by the political scientist Alan Abramowitz. It predicts the election outcome using GDP growth, the incumbent’s net approval rating, and whether the incumbent is running for re-election. The error margins for this model have historically been comparable to those of polls taken late in the election season, and in 1992 it did a better job of predicting the election than national polls.

Something simple, please

To develop a prediction model using “fundamentals”, we have to choose the factors that are important in determining the election outcome. In selecting these factors from the given data, we might select factors that “seem” important, given the limited data, but do not really matter in predicting elections. This pitfall is known as overfitting, and it can introduce substantial error into our predictions. To mitigate this problem, we borrow two techniques from machine learning- “elastic-net regularization” and “leave-one-out cross-validation”. It is heartening to see that although statistics heralded the machine learning revolution, new insights into how machines learn have also started changing the world of statistics.

Elastic-net regularization is the process of “shrinking” the impact of the factors we consider in our model. The mantra one must follow is that simpler equations do a better job of predicting the future than more convoluted ones. Hence, we may reduce the weights of the various factors we are considering, or remove the weak ones entirely. But how does one know by how much we should reduce the weights of these factors, or which factors to remove completely? For this, we use leave-one-out cross-validation. We leave out one part of our data set, and train the model on the remainder of the data set, using a pre-determined “shrinkage” algorithm for reducing the weights of certain factors (we may also completely remove certain factors). We then test whether our model is able to predict the correct conclusion for the left-out data. For instance, if we are training an election prediction model on data from 1952 to 2016, we leave out the data from 1952, and train our model on all the other election years to identify relevant factors and prescribe weights to them. Then we feed the data for 1952 into the model and see if it is able to predict that election result correctly. In the next iteration, we leave out 1956, and run the same algorithm. After we have done this for all election years, we change the “shrinkage” algorithm and run the whole process all over again, a total of 100 times. We select the shrinkage algorithm that is the most successful on average.
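
As a purely illustrative sketch of how these two techniques fit together (the data and feature names below are made up; the Economist's actual code is linked above), one could write in scikit-learn:

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import LeaveOneOut

# One row per election year: [GDP growth, net approval, incumbent running].
# These numbers are invented for illustration only.
X = np.array([
    [3.1, 10, 1], [0.5, -12, 1], [2.2, 5, 0], [4.0, 20, 1],
    [1.1, -5, 0], [2.8, 8, 1], [0.9, -15, 0], [3.5, 12, 1],
])
y = np.array([53.4, 46.2, 50.1, 57.8, 48.9, 52.0, 45.5, 54.3])  # vote share

# Elastic-net shrinks coefficients (and can zero out weak factors); the
# shrinkage strength is chosen by leave-one-out cross-validation, i.e.
# each "election year" is held out once and predicted from the rest.
model = ElasticNetCV(cv=LeaveOneOut(), l1_ratio=0.5)
model.fit(X, y)
print("chosen shrinkage (alpha):", model.alpha_)
print("factor weights:", model.coef_)
```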

The “shrinkage” algorithm that the authors found after running this algorithm was pretty close to Alan Abramowitz’s model. Some small differences were that the authors prescribed a penalty to parties that had been in power for two previous terms, and used a cocktail of economic indicators like real disposable income, non-farm payrolls, stock market, etc rather than just second-quarter GDP growth. They interestingly found that these economic factors have become less important in predicting elections, as the voter base gets more polarized. Hence, ideology has slowly come to trump economic concerns, which is a worrying indicator of major ideological upheaval in the coming years.

Of course, economic indicators are important in the 2020 elections. The pandemic has caused a major economic depression, which is likely to reverse quickly once the health scare is mitigated. The authors see fit to assign a weight to economic factors that is 40% higher than that assigned to economic factors during the 2008-2009 Great Recession.

The authors find that their “fundamentals” model does exceedingly well in back-testing, better in fact than both early polls and late polls.

When the authors try to include polls in the model to possibly make it even more accurate, the machine learning algorithms they use decline to use early polls, and only incorporate polls conducted very close to the actual election.

There’s no margin like an error margin

Suppose the model predicts that Biden will win 51.252% of the vote. The probability of the actual election result being exactly this number is effectively zero. The most important information that a model produces is the uncertainty estimate around its prediction. If the model predicts that Biden will get between 50% and 53% of the vote with 95% certainty, we can be pretty sure that Biden will win the popular vote.

To calculate these ranges of outcomes, we use a beta distribution, which is essentially like the normal distribution, but for values between 0 and 1. The width of the beta distribution can be tuned, increasing or decreasing the uncertainty of the model’s prediction: if the beta distribution is wide, the margin of error is large. If the margin of error (the 95% confidence interval) is, say, 10%, then a candidate predicted to win 52% of the vote has a 2.5% chance of getting less than 42% of the vote, and a 2.5% chance of getting more than 62%. Hence, in closely contested elections, beta distributions with large uncertainty can be quite unreliable.
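
A quick sketch of this arithmetic with scipy; the beta parameters below are hypothetical, chosen to put the mean near 52% with roughly the 10% margin from the example above:

```python
from scipy.stats import beta

# Beta(a, b) has mean a / (a + b); a larger a + b means a narrower
# distribution, i.e. a more confident forecast.
a, b = 52, 48  # hypothetical: mean 0.52
lo, hi = beta.ppf([0.025, 0.975], a, b)  # central 95% interval
print(f"95% of outcomes fall between {lo:.1%} and {hi:.1%}")
# Roughly 42% to 62%, matching the worked example above.
```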

Modeling uncertainty

How does one model uncertainty though, now that we’ve calculated the correct amount of “shrinkage”? We again use elastic-net regularization and leave-one-out cross-validation. Uncertainty also depends on certain parameters, and these parameters can be determined by these two algorithms. Uncertainties, in the authors’ model, are smaller closer to the election, in polarized elections, when there’s an incumbent running for re-election, and when economic conditions are similar to the long-term average. For instance, 11 months before the election in 1960, when the economy was unusually buoyant and the incumbent was set to retire, the 95% confidence interval of the Republican vote share was quite large: from 42.7% to 62.4%. However, in 2004, with George W Bush seeking re-election, the economy in a stable state, and the electorate polarized, the 95% confidence interval of Bush’s vote share was from 49.6% to 52.6%. He ended up getting 51.1%, almost identical to the authors’ central prediction.

Moral victories are for losers

Winning the popular vote does not guarantee winning the election. Hillary Clinton famously won the popular vote by 2%, but still lost the election to Donald Trump. The election outcome depends upon the “electoral college”, through which states, rather than people, do the voting. The authors, in trying to predict national election outcomes, choose to forecast a state’s “partisan lean” rather than the actual state election outcome. “Partisan lean” can be defined as how much a state favors Democrats or Republicans as compared to the whole nation, and hence how it would be expected to vote in the event of a national tie.

Why would we forecast partisan lean instead of actual outcome though? This is because partisan lean is a more stable predictor of the actual voting outcome in the state. For instance, in the early stages, our model might predict that Minnesota is likely to vote Democrat by 52%. However, as the election approaches, events might cause the voting pattern across the whole nation to shift to the Republican side, including in Minnesota. Hence, although our model would predict that Minnesota would vote Democrat, events might transpire such that Minnesota eventually votes Republican. However, if one forecasts partisan bias, this bias towards a particular party will remain unchanged even if national events cause voters to swing, as long as this swing is spread uniformly across all states. Hence, partisan bias is a better predictor of eventual election outcome.

To produce central estimates of partisan lean, the authors use a variety of parameters, like the state’s partisan lean during the previous two elections, the home states of the presidential candidates, the actual national popular vote, etc. But how do we use this analysis for 2020? The national popular vote has not even taken place yet. In this case, we can take the various possible outcomes, calculate the partisan lean based on each, and then attach a weight to each result based on the probability of that outcome. For instance, if there’s a 10% chance of Trump getting 52% of the vote and Biden getting 45% of the vote, we plug those numbers into the algorithm to calculate each state’s partisan lean, and then attach a weight of 0.10 to the result.
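
In code, this last step is just a probability-weighted average. A toy sketch with made-up numbers:

```python
# Each national-outcome scenario implies a partisan lean for the state
# (here, hypothetical values); weight each by the scenario's probability.
scenarios = [
    (0.10, 1.5),   # (probability of national outcome, implied state lean)
    (0.60, 2.0),
    (0.30, 3.0),
]
expected_lean = sum(prob * lean for prob, lean in scenarios)
print(f"Expected partisan lean: {expected_lean:+.2f} points")  # +2.25
```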

Bayesian analysis

The principle of Bayesian statistics is pretty powerful. First assume that a certain hypothesis, or “prior”, is true. Now study the actual real world data, and calculate the probability of that data being observed, assuming that your prior was true. If the probability is low, discard your prior and choose one under which the real world data is likely.
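
A toy numerical version of this updating rule, with made-up probabilities:

```python
# Two hypotheses about poll bias, updated on one observed poll result.
priors = {"polls fair": 0.5, "polls overstate Dems": 0.5}
# Assumed probability of seeing the observed result under each hypothesis:
likelihood = {"polls fair": 0.02, "polls overstate Dems": 0.08}

evidence = sum(priors[h] * likelihood[h] for h in priors)
posterior = {h: priors[h] * likelihood[h] / evidence for h in priors}
print(posterior)  # {'polls fair': 0.2, 'polls overstate Dems': 0.8}
```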

How does Bayesian analysis fit in here though? Don’t we already have a model? Why don’t we just plug in all the data and get a prediction already? This is because poll numbers are unreliable, and often have a bias. We can remove this bias by using Bayesian analysis.

What are some sources of errors while taking polls? One source of error is sampling error, in which the sample chosen is not representative of the whole population. For instance, in a population where half the people vote Democrat and the other half vote Republican, choosing a group of people who are mostly Republican will give us a biased election forecast.

However, there are other, non-sampling errors too. Even if the sample we select is representative of the population, not all of the people who are polled are eligible to vote, or will actually go out and vote even if they are eligible. Polls that try to correct for these and other non-sampling errors also don’t do a very good job of it, and inevitably introduce a bias. The model developed by the authors corrects for these biases by comparing the predictions of polls that have a bias towards the Democrats, and others that are biased towards the Republicans. Simplistically speaking, both of these kinds of biases cancel out.

There is another source of error that is more subtle: the partisan non-response. Let me illustrate that with an example. Given the generally negative media coverage for Donald Trump amongst multiple news outlets, many Republican voters will not agree to be surveyed at all. They might be scared of social ridicule should they voice their support for Trump, and probably don’t want to lie that they support Biden. Hence, any sample that polling organizations can construct will have a pro-Biden bias. However, this might change if the overall news coverage of Trump becomes more favorable. This introduces a lot of uncertainty into any forecasting model. The authors correct for partisan non-response by separating all polls into two groups- those that correct for partisan non-response, and those that don’t. Then they observe how the predictions given by these polls change every day. The difference in prediction between the two types of polls can be attributed to partisan non-response, and the authors can then incorporate this difference into their model.

However, what about the far flung states that are not polled as often as other, always-in-the-news states? How do we reliably predict how these states will vote? The authors observe that neighboring states, with similar geographies and demographics, vote similarly. Hence, if we have a forecast for Michigan but not for Wisconsin, we can reasonably assume that Wisconsin is likely to have a polling result similar to Michigan’s. (The article includes a correlation matrix for the various states.)

Bayes-in the model

Let us now put all the pieces together. The authors use an extension of the Markov Chain Monte Carlo Method, first expounded by Drew Linzer. What does it do? It performs a random walk.

Let me illustrate this with an example. Let us choose the prior that polls are biased towards the Democrats by about 5%. Also, we know the partisan lean for Michigan in the month of June. On each of the days remaining until the election, Michigan can swing Republican, swing Democrat, or stay the same. Because of our prior, however, we have to assign different probabilities to Michigan swinging Republican or Democrat (or staying the same). We perform this random walk every day for Michigan until the election, to get a prediction for how Michigan will vote, assuming our prior is true.

Now we may assume a different prior: that polls over-estimate Republican chances by 2%. We again perform a random walk for each state, including Michigan. The authors take 20,000 such priors, and perform random walks for various states. They now calculate each candidate’s chances of winning as the total number of priors which led to them winning, divided by the total number of priors.
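
Here is a heavily simplified toy version of this procedure (my own sketch, not the authors' model): draw many priors about poll bias, run one random walk per prior, and count the fraction of walks the Democrat wins.

```python
import random

def democrat_wins(prior_bias, start_margin=2.0, days=150, daily_sd=0.3):
    """Walk a state's Democratic margin from June to election day.

    prior_bias > 0 means we believe polls overstate the Democrat, so the
    walk drifts against them by that many points over the campaign.
    """
    margin = start_margin
    for _ in range(days):
        margin += random.gauss(-prior_bias / days, daily_sd)
    return margin > 0

random.seed(0)
priors = [random.gauss(0, 2) for _ in range(20000)]  # 20,000 priors
wins = sum(democrat_wins(p) for p in priors)
print(f"P(Democrat carries the state) = {wins / len(priors):.1%}")
```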

Using this model, the authors predict a comfortable win for Biden.

References

  1. How the Economist presidential forecast works, by G. Elliott Morris.

IMO 1988/Problem 6

I spent some time thinking about the infamous IMO 1988/Problem 6 today:

For positive integers a,b, if \frac{a^2+b^2}{ab+1} is an integer, prove that it is a square number.

After some initial false starts, I came up with this:

a^2+b^2=(ab+1)(\frac{a}{b}-\frac{1}{b^2})+(b^2+\frac{1}{b^2}). We need to eliminate the \frac{1}{b^2}, in order to have some hope of a quotient that is an integer. One way to do that is to have (ab+1)\frac{1}{b^2}=(b^2+\frac{1}{b^2}). This implies that \frac{a}{b}=b^2. We have thus proven that the quotient, which is \frac{a}{b}, is a square number.

Of course, this is not the complete solution. We might instead have (ab+1)(\frac{1}{b^2}+l)=b^2+\frac{1}{b^2} for some nonzero integer l, in which case the quotient could also be an integer. However, it turns out that the case I have written above is the only one that actually occurs. It remains to justify that this is the only solution.
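
As a quick sanity check on the family found above, substituting a=b^3 into the expression confirms that the quotient is a perfect square:

```latex
\frac{a^2+b^2}{ab+1}\bigg|_{a=b^3}
  = \frac{b^6+b^2}{b^4+1}
  = \frac{b^2\,(b^4+1)}{b^4+1}
  = b^2.
```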

Social Capital: Networks and Connections

The paper that I want to review today is “Social Capital: Its Origins and Applications in Modern Sociology” by Alejandro Portes. I found this paper in the list of the most highly cited Sociology papers of the last century. This is the first paper that I’ve ever read in Sociology, and I found it to be very engaging. This has definitely made me want to explore this field more.

Introduction

The concept of “social capital” has become one of the rare terms to transcend the rarefied world of sociology into everyday language. The media makes it sound like the solution to most of the world’s problems, both between individuals and countries. Due to the diverse applications to which this term has been put, there is no one definition of the term anymore.

Although this term has its historical roots in the writings of Marx and Durkheim, its modern presentation leaves much to be desired. Sociologists often present only its positive aspects whilst leaving aside the negative. Also, “social capital” is often interpreted as similar to monetary capital in its capacity to provide an individual with power, status or opportunities. Some authors have gone to the extent of saying that cities and countries, as opposed to just individuals, can also possess social capital, and the presence of this ill-defined “social capital” is then retrospectively held responsible for certain cities being more prosperous and stable.

Clearly, the modern presentation of social capital can benefit from a more balanced view. The author intends to do just that in this article.

Definitions

Pierre Bourdieu, the first person to systematically analyze the concept, defined social capital as “the aggregate of the actual or potential resources which are linked to possession of a durable network of more or less institutionalized relationships of mutual acquaintance or recognition”. For historical reasons, this analysis did not become well-known amongst researchers, what with the original paper being in French. Bourdieu makes the point that it is the benefits that members accrue from being part of a social network that give rise to the strength and stability of such networks. Social networks do not just come into being on their own. People have to invest time and effort into building social ties. However, once this network is in place, the members of this network can appeal to the institutionalized norms of group relations to gain social capital. In some sense, this is like a monetary investment that can pay dividends later.

Because spending this social capital may lead to the acquisition of economic capital in the form of loans, investment tips, etc, Bourdieu thinks of social capital as completely interchangeable with economic capital. However, the acquisition of social capital is much less transparent, and much more uncertain than the process of acquisition of economic capital. It requires the investment of both economic and cultural resources, and involves uncertain time horizons, unspecified obligations, and the possible violation of reciprocity. If you help your friend today, it is not completely certain that you will ever need their help in the future, and that they will help you if you do.

Another contemporary researcher who has probed this realm is Glenn Loury (1977), who stated that economists studying racial inequality and welfare focused too much on human capital (which might perhaps be interpreted as individual education or ability), and on the creation of a level playing field such that only the most skilled persons succeed. They want to create a level playing field by making employers’ racial preferences illegal. However, this cannot succeed, because the acquisition of human capital by some communities is stunted by a lack of economic resources, along with the absence of strong social networks.

The merit notion that, in a free society, each individual will rise to the level justified by his or her competence conflicts with the observation that no one travels that road entirely alone. The social context within which individual maturation occurs strongly conditions what otherwise equally competent individuals can achieve. This implies that absolute equality of opportunity…is an ideal that cannot be achieved. (Loury 1977)

Although Loury’s analysis of social capital and social networks stopped here, it led Coleman to delve into the issue in more detail and describe how social capital leads to the acquisition of human capital. Coleman defined social capital as “a variety of entities with two things in common: They all consist of some aspect of social structures, and they facilitate certain action of actors- whether persons or corporate actors- within the structure”. Coleman also described some things that lead to the generation of social capital, like reciprocity expectations and group-enforced norms, along with the consequences of having social capital, like privileged access to information. Resources obtained through social capital are often looked upon as “gifts”, and hence one must distinguish between the possession of social capital and the ability to acquire these gifts through it. Not everyone who possesses social capital, by virtue of being a member of a social group, can necessarily acquire these gifts without some requisite social savvy.

Another distinction that should be made is between the motivations of recipients and donors. Although the motivations of recipients are fairly clear, donors can either be motivated by reciprocity from the individual that they’re helping, or greater status in the community; or perhaps both.

Coleman also talks about the concept of “closure” in communities, which is the presence of sufficient ties in a community that guarantee the observance of norms. For instance, the possibility of malfeasance in the tightly knit community of Jewish diamond traders in New York is pretty low because of “closure”. This leads to easy transactions between members without going into much legalese.

Another interesting perspective on social capital is offered by Burt, who says that it is the relative absence of ties, called “structural holes”, that facilitates individual mobility. This is because in dense networks, after a certain amount of time, new information is scarce, and only redundant information gets transmitted. It is the weak ties in a sparse network, which can suddenly become active and transmit useful new information, that lead to new contacts, jobs, etc. Hence, it is mostly the weaker networks that lead to advancement, as opposed to the stronger or denser ones. This is in stark contrast to the stance taken by others like Coleman, who emphasize that the benefits that can be accrued through a social network are directly dependent on how dense that network is.

Sources of social capital

What motivates a donor to help out a person asking for help in a social network? This motivation can be consummatory or instrumental. A consummatory motivation is one that stems from a sense of duty or responsibility. For instance, the economically well off members of a tightly knit community might feel an obligation to help out those who are less privileged. An instrumental motivation is one that stems from an expectation of reciprocity. Donors help others only to accumulate obligations, and expect to be repaid in full at some time in the future. This is different from an economic exchange, however, because the method of repayment can be different from the original method of payment, and also because the time frame of repayment is more uncertain.

There is another set of examples that explains this dichotomy of motivations. Bounded solidarity refers to the mechanism through which an underprivileged or sidelined community develops a sense of solidarity, and all members feel a duty to help each other out. This is an example of consummatory motivation. On the other hand, sometimes donors help out others only to raise their status in society. Hence, although there is an expectation of reciprocity, it is from the whole community and not from an individual. This again is an example of instrumental motivation. Of course the two motivations can be mixed: a donor may extend a loan to another member of a community to both gain status, and also expect individual reciprocity in that they may expect that the money be returned in time. The strong ties in the social network would ensure that both happen.

Effects of social capital

The three basic functions of social capital are

  1. As a source of social control- Parents, the police, etc use their social capital to influence the behavior of others in the community. For instance, parents may expect their children to behave well by using the social capital they possess by virtue of being guardians of their children. This social capital, if one may imagine it to be some sort of money, is never really spent or exhausted. Parents will always have an infinite amount of social capital with which to control the behaviour of their children. The same goes for policemen, etc.
  2. As a source of family support- Children may use their social capital, which they possess by virtue of being dependent on their parents, to expect help from their parents in all spheres of life. This form of social capital is also inexhaustible. It has been noted that children who are brought up in a stable household with two parents often experience better success in their education and careers. On the other hand, children brought up in one-parent households face a harder time with their education and careers. This is mainly because children in one-parent households have less social capital, in that they have one less parent to ask for help.
  3. As a source of benefits through extrafamilial networks- This one is slightly more intuitive. Connections made outside one’s family can have a huge impact on individual mobility and careers. For instance, Jewish immigrants in New York at the turn of the 20th century often received help from other immigrants in the form of small loans or employment in companies. They had social capital just by virtue of belonging to the same community. Other examples of this phenomenon are New York’s Chinatown, Miami’s Little Havana, etc.

    On the flip side, a lack of connections can spell doom for certain communities, which rarely have ties to the better parts of town that might provide them with employment or relief. For instance, impoverished inner-city black communities often lack connections with potential sources of employment, and remain mired in poverty. This problem is further exacerbated by the dense social network existing between members of these communities, which leads them to influence each other into crime and drug abuse.

    Stanton-Salazar and Dornbusch have found a positive correlation between the existence of such extrafamilial social networks and academic achievement amongst Hispanic students in San Francisco. On a side note, they found an even higher correlation between bilingualism and academic achievement, highlighting the importance of being able to communicate with a wider community.

    This form of social capital is exhaustible, in that you cannot keep asking the wider community for help and not expect people’s patience to run out.

Negative social capital

It is important to identify the negative alongside the positive. Recent studies have noted four negative consequences of the existence of social networks:

  1. Strong social ties within a community can bar access to others. For instance, business owners from some ethnic communities often employ only members of the same community. Control of the produce business on the East Coast by the Korean community, control of the diamond business in New York by the Jewish community, etc are examples.
  2. Successful members of a community are often assaulted by job-seeking kinsmen. The strong social ties often force these otherwise successful professionals/businessmen to help out or hire their kinsmen, affecting the overall quality and performance of their organizations.
  3. Social control can lead to demands of excessive conformity. In tightly knit traditional societies today, divorces are still looked down upon for instance, and errant members are ostracized. Privacy and individualism are reduced in this way.
  4. Marginalized communities often develop a strong sense of solidarity, and are apprehensive about mixing with the rest of the population and going up the social and career ladders. Consider, for instance, the following quote from a Puerto Rican laborer:

“When you see someone go downtown and get a good job, if they be Puerto Rican, you see them fix up their hair and put some contact lenses in their eyes. Then they fit in and they do it! I have seen it! … .Look at all the people in that building, they all “turn-overs.” They people who want to be white. Man, if you call them in Spanish it wind up a problem. I mean like take the name Pedro- I’m just telling you this as an example- Pedro be saying (imitating a whitened accent) “My name is Peter.” Where do you get Peter from Pedro?”

(Bourgois 1991, p. 32)

Decades and centuries of discrimination or persecution often lead to certain communities becoming closed to the outside world, which removes them from the larger social network that could perhaps have helped them succeed. This self-imposed exclusion makes their situation even worse than before. These are known as downward leveling norms. Moreover, members that step outside of these communities are often ostracized, which leads to low overall member mobility.

Social capital as a feature of communities and nations

Some political scientists have also extended the notion of social capital to cities and communities, renaming it “civicness”. This “civicness” or social capital of a community encompasses “the features of a social organization, like networks, norms and trust, that facilitate action and cooperation for mutual benefit”. In this usage, there is no information about the number of people involved in the social network, the density of the network, etc.

Robert Putnam, a prominent advocate of the community view of social capital, said that the decline of cities and of community life in general is a result of the loss of social capital, through falling membership in organizations like the PTA, the Elks Club, the League of Women Voters, etc. Critics have called this view elitist for stating that social capital can only be regained through membership in these high society organizations. Moreover, they have also admonished Putnam’s opinion that the responsibility for increasing this social capital lies in the hands of the masses, by joining these organizations, and not in the hands of the government or corporate leaders.

Dr Portes notes that Putnam’s argument is also circular. The social capital of a community cannot be measured directly; it can only be inferred from a community’s success. If a community is successful, one may infer that there is probably a strong sense of cohesion between its members. But anything that cannot be measured directly, and can only be inferred, cannot scientifically be considered a cause. For instance, “emotional balance” cannot scientifically be considered a cause of a person’s success. There may be lots of causes of their success, like hard work or networking. Emotional balance can only be inferred: because this person is successful, they probably do have emotional balance. In this way, identifying one single cause for the success of a person or community, especially if that cause can only be inferred and not measured directly, is a dangerous game. The only way Putnam could have proven his thesis is by taking two communities that are similar in all regards except that one has more social capital than the other (assuming that this social capital can be directly measured), and showing that the one with more social capital is more successful. This is obviously a difficult experiment to conduct in real life.

Conclusion

“Social capital” is essentially a mix of old ideas in a new garb. Moreover, it is unlikely that just increasing social capital will lead to a solution of community-wide problems. As has been explained above, social capital is also responsible for holding back certain communities from development. Hence, appreciating both the positives and negatives of social capital is important for having a balanced and realistic view of the concept.

References

  1. Social Capital: Its Origins and Applications in Modern Sociology, by Alejandro Portes.

EA- September

Attaching my Effective Altruism receipt for the month. I’ve been neglecting this for a couple of months, in part because of recent high expenses (car, etc). This is me trying to jump back on the bandwagon.

I’ve often thought about whether “blacking” out the amount donated would be a more altruistic thing to do. However, this is not a one-off donation. This is supposed to be 10% of your income, and the amount donated should reflect that. I might still black things out in the future to make these posts less awkward.

How have I been able to afford it? I don’t drink, or eat out that much anymore. My other expenses have generally only reduced over time. Hence, I am perhaps just substituting one form of expenditure for another. I have tried donating to friends’ charities and things like that before. However, on average, I’m happiest donating just to EA and Arxiv.