Quantum Computing

Today, I will be talking about quantum computing. I will be following Quantum Computing– Lecture Notes by Mark Oskin, who is a professor at the University of Washington, Seattle. These lecture notes are roughly based on the book Quantum Computation and Quantum Information by Michael A. Nielsen and Isaac L. Chuang.

So what exactly is quantum computing? Imagine that you’re trying to find the shortest route from your house to a restaurant. The way that a classical computer would solve this problem is that it would consider all possible routes from your house to the restaurant, and calculate the length of each route individually. It would then compare the lengths of all the routes, and give us the shortest one. A quantum computer, on the other hand, can determine all routes and calculate all route lengths at once.

What does that mean? Suppose there are a million routes. Can the quantum computer calculate the lengths of all of them at once? What about a trillion? This seems impossible, as it would imply that a quantum computer is infinitely faster than a classical computer. What a quantum computer actually does is perform all of these calculations and store the results in a superposition. This means that users cannot extract the lengths of all these trillions of routes at once. Performing a measurement collapses the wave function generated by the quantum computer into one such route. Hence, each measurement can only reveal the details of one route.

What use is a quantum computer then? We don’t want to find the length of each route by performing one measurement at a time. We just need the shortest route. If we can find a way to increase the probability of the wave function collapsing into the shortest route, we will have solved our problem. Although this process of increasing the probability of the wave function collapsing into the “right answer” may take some time, for certain problems it takes much less time than the best known classical method. For some of these problems, such as factoring large numbers, the speedup is exponential; for others, such as unstructured search, it is quadratic.

In short, quantum computers are useful because, for the right kind of problem, they are fast. Much faster than any classical computer is expected to be.

A quantum bit

A classical bit is a “storage space” on a computer, which stores either the number 0 or 1. It cannot store both. A quantum bit or qubit, on the other hand, can store a superposition of both numbers in the form \alpha |0 \rangle + \beta | 1\rangle, where \alpha,\beta\in\Bbb{C} such that \alpha^* \alpha+\beta^* \beta=1. In other words, |\alpha|^2=\alpha^*\alpha and |\beta|^2=\beta^*\beta are the probabilities of the qubit wave function collapsing into the |0 \rangle and |1 \rangle states respectively.
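As a minimal sketch in plain Python, a qubit can be stored as a pair of complex amplitudes. The helper names `normalize` and `probabilities` are my own choices for illustration and do not come from the lecture notes.

```python
import math

def normalize(alpha, beta):
    """Rescale (alpha, beta) so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    return alpha / norm, beta / norm

def probabilities(alpha, beta):
    """Probabilities of observing 0 and 1 when the qubit is measured."""
    return abs(alpha) ** 2, abs(beta) ** 2

# Start from the unnormalized superposition 1|0> + i|1>.
alpha, beta = normalize(1 + 0j, 1j)
p0, p1 = probabilities(alpha, beta)
print(p0, p1)  # the two probabilities add up to 1
```

Note that the phase of \beta (here i) does not affect the probabilities; only the moduli do.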

But what if we just want to store the ordinary number 1 in a qubit? We can just find a way to get |\beta|=1 and \alpha=0. Hence, all storage operations that can be performed on a classical computer can also be performed on a quantum computer.

Bloch sphere-I

Each qubit wave function can be represented as a point on the two dimensional sphere S^2.

But how can two complex numbers (\alpha,\beta)\in\Bbb{C}^2 be represented on only a two dimensional manifold? Mathematically speaking, we have the condition that \alpha^*\alpha+\beta^*\beta=1, which gives us the fact that all such (\alpha,\beta) are contained within S^3\subset \Bbb{C}^2. We then mod out S^3/S^1, as the individual phases of \alpha and \beta don’t matter- only the difference of their phases does. This gives us S^2. For the mathematically minded, what I have performed above is a Hopf fibration.

On the Bloch sphere, we have the |0\rangle state as the North Pole and the |1\rangle state as the South Pole. Why is that? This fact doesn’t matter at all, I suppose. I could have chosen any two points as |0\rangle and |1\rangle, and declared all other points to be superpositions of these two states. Choosing the north and south poles for these two points just allows for a fairly intuitive parametrization of any superposition of those states. The parametrization is (\phi,\psi)\to \cos(\frac{\phi}{2})|0\rangle+\sin(\frac{\phi}{2})e^{i\psi}|1\rangle. Note that this naturally builds up the spinor framework: increasing the angle \phi by 2\pi brings us back to the same point on the Bloch sphere but multiplies the state vector by -1, and only a 4\pi rotation returns the state vector itself. Is this just an artificial construction; perhaps an undesired outcome of the parametrization? Perhaps. However, it is still a useful mathematical construction. Each point on the Bloch sphere corresponds to a spinor, and not much is lost if we assume that a 4\pi rotation now corresponds to a full rotation of the quantum state, and not a 2\pi rotation. Moreover, spinors come up naturally in lots of other areas of Physics.
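The half-angle in the parametrization is what produces the spinor behaviour. Here is a small sketch, with `bloch_state` being a hypothetical helper name of mine, that evaluates the amplitudes for a given (\phi,\psi) and checks what happens when \phi is increased by 2\pi and by 4\pi.

```python
import cmath
import math

def bloch_state(phi, psi):
    """Amplitudes (alpha, beta) of cos(phi/2)|0> + sin(phi/2) e^{i psi}|1>."""
    return math.cos(phi / 2), math.sin(phi / 2) * cmath.exp(1j * psi)

north = bloch_state(0.0, 0.0)        # the |0> state
south = bloch_state(math.pi, 0.0)    # the |1> state, up to rounding error

start = bloch_state(1.0, 0.5)
after_2pi = bloch_state(1.0 + 2 * math.pi, 0.5)  # both amplitudes change sign
after_4pi = bloch_state(1.0 + 4 * math.pi, 0.5)  # back to the starting amplitudes
```

The states `start` and `after_2pi` differ only by the overall factor -1, so they give identical measurement probabilities.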

Evolution of quantum systems

The evolution of a quantum system can be thought of as the motion of a dot on the Bloch sphere. It happens through unitary transformations. What does this mean? A unitary transformation preserves the length of the state vector, so the probabilities of all possible measurement outcomes continue to add up to 1. Moreover, because unitary transformations have eigenvalues of modulus 1, such a transformation merely multiplies each of its own eigenstates by a phase, keeping the probabilities of those eigenstates the same. Hence, if |0\rangle is an eigenstate of the transformation and you take a wave function with a high probability of collapsing into |0\rangle, this probability will remain high as the wave function evolves. For a general unitary transformation |0\rangle is not an eigenstate, and the probabilities of |0\rangle and |1\rangle do change. Of course the shape of the wave function will change with time.

Measurement

One may calculate the probability of making a particular measurement by projecting the state vector onto the desired subspace. For example, if we have a wavefunction \alpha|0\rangle +\beta |1\rangle, we may calculate the probability of this wavefunction collapsing into the |0\rangle state by projecting it onto the |0\rangle subspace and then taking the squared modulus of the coefficient, which gives |\alpha|^2. This whole process can be formalized by saying that given a wave function \psi, the probability of making a measurement m is \langle\psi|M_m^* M_m|\psi\rangle, where M_m is the projection operator projecting the state vector onto the |m\rangle subspace. The state of the system after this measurement is actually observed is \frac{M_m|\psi\rangle}{\sqrt{\langle\psi|M_m^* M_m|\psi\rangle}}.
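To make the measurement rule concrete, here is a minimal sketch for a single qubit. The function name `measure_projector` is hypothetical, and the code assumes the outcome has nonzero probability (otherwise the normalization would divide by zero).

```python
import math

def measure_projector(state, m):
    """Project a single-qubit state [alpha, beta] onto |m>, with m = 0 or 1.

    Returns the probability <psi|M_m^* M_m|psi> and the normalized
    post-measurement state M_m|psi> / sqrt(probability).
    """
    projected = [amp if index == m else 0j for index, amp in enumerate(state)]
    probability = sum(abs(amp) ** 2 for amp in projected)
    post = [amp / math.sqrt(probability) for amp in projected]
    return probability, post

state = [0.6 + 0j, 0.8j]                 # alpha = 0.6, beta = 0.8i
p0, post0 = measure_projector(state, 0)  # probability |alpha|^2
p1, post1 = measure_projector(state, 1)  # probability |beta|^2
```

After the measurement the state is a unit vector again: `post0` is |0\rangle and `post1` is i|1\rangle, which is |1\rangle up to a phase.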

Multi-qubit systems

Suppose we have two qubits q_1=a|0\rangle +b|1\rangle and q_2=c|0\rangle+d|1\rangle. Can we also form a wavefunction of this system of two qubits? It turns out that we can. We just take the tensor product of the two qubits to get ac|00\rangle +ad|01\rangle +bc|10\rangle +bd|11\rangle. Why the tensor product? The tensor product is a poor man’s multiplication sign when multiplication is not well-defined. Why do we need multiplication at all? A simple probabilistic argument suffices. If the probability of observing the first qubit in state |0\rangle is |a|^2 and the probability of observing the second qubit in state |1\rangle is |d|^2, then the probability of observing the state |01\rangle should be |a|^2|d|^2=|ad|^2, which is exactly what we see in the above system. Hence, this is as good a representation as any to represent the quantum state of the two qubit system.
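The tensor product of two qubits is easy to compute by hand or in code. In this sketch `tensor` is a helper name I made up, and the amplitudes are arbitrary illustrative values.

```python
def tensor(q1, q2):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [x * y for x in q1 for y in q2]

a, b = 0.6, 0.8       # q_1 = a|0> + b|1>
c, d = 0.8, 0.6j      # q_2 = c|0> + d|1>

# Amplitudes of |00>, |01>, |10>, |11>, in that order: ac, ad, bc, bd.
joint = tensor([a, b], [c, d])
p01 = abs(joint[1]) ** 2                       # probability of observing |01>
total = sum(abs(amp) ** 2 for amp in joint)    # should be 1
```

The value `p01` equals |a|^2|d|^2, which is the product of the two single-qubit probabilities.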

Entanglement

Before we understand entanglement, we have to grapple with the Hadamard gate. A quantum gate is a linear transformation performed on a state vector. A Hadamard gate, represented by the matrix \frac{1}{\sqrt{2}}\begin{pmatrix} 1&1\\1&-1 \end{pmatrix}, can be thought of as the process of “mixing it up”. When you input the |0\rangle eigenstate into this gate, for instance, you get back \frac{1}{\sqrt{2}}|0\rangle+\frac{1}{\sqrt{2}}|1\rangle. Hadamard gates are an easy way of creating a superposition of states from a single basis state, where all the basis states in the superposition have amplitudes of equal modulus. It is only by using these superpositions of states that we truly harness the power of quantum computation.
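Here is a short sketch of the Hadamard gate acting on the two basis states. The helper `apply` is a hypothetical name for a plain matrix-vector product.

```python
import math

r = 1 / math.sqrt(2)
H = [[r, r],
     [r, -r]]

def apply(gate, state):
    """Multiply a gate (list of rows) by a state vector (list of amplitudes)."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

plus = apply(H, [1, 0])    # H|0> = (|0> + |1>)/sqrt(2)
minus = apply(H, [0, 1])   # H|1> = (|0> - |1>)/sqrt(2)
back = apply(H, plus)      # applying H twice returns |0>
```

Applying the gate twice undoes the mixing, which reflects the fact that the Hadamard matrix is its own inverse.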

There is another quantum gate (transformation) that we should learn about- the CNot gate. This transformation acts on systems of two qubits, and is represented by

\begin{pmatrix} 1&0&0&0\\0&1&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix}

This is a unitary transformation: it merely permutes the basis states, swapping |10\rangle and |11\rangle. In other words, it flips the second qubit exactly when the first qubit is in the state |1\rangle. The CNot gate is instrumental in entangling two qubits. What does that mean?

When two wires get entangled, we find it hard to tell them apart. It seems like they have morphed into one, quite unwieldy, entity. Similarly, when two qubits get entangled, it is hard to separate them into distinct qubits. They seem to have morphed into one blob of information. How does the CNot gate morph two distinct qubits into one such blob, though? The CNot gate maps the state vector \frac{1}{\sqrt{2}}|00\rangle + \frac{1}{\sqrt{2}}|10\rangle to the vector \frac{1}{\sqrt{2}}|00\rangle + \frac{1}{\sqrt{2}}|11\rangle. This is an entangled state because there do not exist any states |\psi\rangle and |\phi\rangle such that |\psi\rangle\otimes |\phi\rangle=\frac{1}{\sqrt{2}}|00\rangle + \frac{1}{\sqrt{2}}|11\rangle. This is easy to see using the method of undetermined coefficients: writing |\psi\rangle=a|0\rangle+b|1\rangle and |\phi\rangle=c|0\rangle+d|1\rangle, we would need ad=bc=0 while ac and bd are both non-zero, which is impossible. Hence, we cannot recover the wavefunctions of the individual qubits anymore.
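The entangling step can be checked numerically. In this sketch the helper names `apply` and `is_product_state` are mine; the separability test uses the fact that a two-qubit state with amplitudes c_{00}, c_{01}, c_{10}, c_{11} factors as a tensor product exactly when c_{00}c_{11}-c_{01}c_{10}=0.

```python
import math

CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def apply(gate, state):
    """Multiply a gate (list of rows) by a state vector (list of amplitudes)."""
    return [sum(g * s for g, s in zip(row, state)) for row in gate]

def is_product_state(state, tol=1e-12):
    """True when [c00, c01, c10, c11] can be written as a tensor product."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

r = 1 / math.sqrt(2)
before = [r, 0, r, 0]          # (|00> + |10>)/sqrt(2), i.e. H|0> tensored with |0>
after = apply(CNOT, before)    # (|00> + |11>)/sqrt(2)

print(is_product_state(before), is_product_state(after))
```

The state before the gate is a product state, while the state after it is not.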

Although the CNot gate is unitary, it changes which basis states appear in the superposition, and this is what allows it to entangle two qubits. Also note that in the above system represented by the wavefunction \frac{1}{\sqrt{2}}|00\rangle + \frac{1}{\sqrt{2}}|11\rangle, if we observe the first qubit to be in state |0\rangle, we automatically know that the second qubit must also be in state |0\rangle. Hence, knowing the state of one implies knowing the state of the other instantaneously. This further drives home the fact that these two qubits are no longer different entities with possibly different properties.

Teleportation

I learned about teleportation through this amazing video. What exactly is being teleported here? A person? An object? Not quite. Only the quantum wavefunction of a qubit is being teleported- which is still something!

How does all of this happen exactly? Imagine that we have two scientists- Alice and Bob. I have unashamedly borrowed these names from the video. Two qubits are entangled (using the CNot gate), and then both of them are given one qubit each. Now imagine that Alice has another qubit Q, whose state is a|0\rangle +b|1\rangle. She wants to communicate this wave function to Bob. However, she can only communicate classical, and not quantum information. How can she communicate the quantum wave function to Bob? I am going to follow the description in the video, and not that in the text, although the youtube comments suggest that both descriptions are equivalent.

If Alice tensors her entangled qubit with the qubit Q, Bob’s qubit automatically gets tensored with Q as well... well, in a sense. He still needs more information to get a complete description of Q, but it’s a start. What this tensoring does to the two entangled qubits is similar to what the Hadamard gate does to the eigenstate |0\rangle: it mixes things up. In other words, if the original entangled state was \frac{1}{\sqrt{2}}(|00\rangle +|11\rangle), then after tensoring it with Q, the state of Alice’s two qubits (Q and her entangled qubit) can be written as a superposition of |00\rangle +|11\rangle, |01\rangle +|10\rangle, |10\rangle -|01\rangle and |00\rangle -|11\rangle, each tensored with a different wave function for Bob’s qubit. These four states are called the Bell states, and they form a basis of maximally entangled states for a system of two qubits. Now if Alice makes a Bell state measurement on her two qubits, it collapses the whole quantum system into one Bell state, tensored with one wave function for Bob’s qubit. Bob does not yet know which Bell state the system has collapsed into. Now if Alice communicates this to Bob, which she can through classical channels, Bob will know how to retrieve the original wave function a|0\rangle +b|1\rangle of Q from the wave function his qubit has right now. This retrieval is through multiplication with Pauli matrices, and which Pauli matrix is required for retrieval depends on which Bell state the system has collapsed into. For example, if the Bell state that is measured is |00\rangle-|11\rangle, Bob’s qubit is in the state a|0\rangle -b|1\rangle, and he knows that he needs the Z Pauli matrix to retrieve the original Q wavefunction.

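One may ask why Bob cannot perform such calculations himself. The Bell state measurement acts on Q and on Alice’s entangled qubit, and both of these are in Alice’s possession. Hence, it is implicit in this description that only Alice can make this measurement.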
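The whole protocol can be simulated for a single qubit Q. The sketch below is my own illustration rather than the construction from the video or the lecture notes: it assumes the shared pair is \frac{1}{\sqrt{2}}(|00\rangle +|11\rangle), orders the qubits as (Q, Alice, Bob), and uses hypothetical names `BELL`, `PAULI` and `teleport`. The state |01\rangle -|10\rangle in the code is the same Bell state as |10\rangle -|01\rangle up to an overall sign.

```python
import math

r = 1 / math.sqrt(2)

# Bell states of (Q, Alice's qubit) as amplitudes of |00>, |01>, |10>, |11>,
# each paired with the correction Bob applies once he learns the outcome.
BELL = {
    "|00>+|11>": ([r, 0, 0, r], "I"),
    "|00>-|11>": ([r, 0, 0, -r], "Z"),
    "|01>+|10>": ([0, r, r, 0], "X"),
    "|01>-|10>": ([0, r, -r, 0], "ZX"),
}
PAULI = {
    "I": lambda s: [s[0], s[1]],
    "X": lambda s: [s[1], s[0]],
    "Z": lambda s: [s[0], -s[1]],
    "ZX": lambda s: [s[1], -s[0]],   # apply X first, then Z
}

def teleport(a, b):
    """For each Bell outcome, return (probability, Bob's corrected state)."""
    q = [a, b]
    pair = [r, 0, 0, r]
    # Joint amplitudes indexed by 4*i + 2*j + k for the basis state |i j k>.
    psi = [q[i] * pair[2 * j + k] for i in (0, 1) for j in (0, 1) for k in (0, 1)]
    results = {}
    for name, (bell, fix) in BELL.items():
        # Project (Q, Alice) onto this Bell state; what remains is Bob's qubit.
        # The Bell amplitudes are real, so no complex conjugate is needed.
        bob = [sum(bell[2 * i + j] * psi[4 * i + 2 * j + k]
                   for i in (0, 1) for j in (0, 1)) for k in (0, 1)]
        prob = abs(bob[0]) ** 2 + abs(bob[1]) ** 2
        bob = [amp / math.sqrt(prob) for amp in bob]
        results[name] = (prob, PAULI[fix](bob))
    return results

outcomes = teleport(0.6, 0.8j)
for name, (prob, state) in outcomes.items():
    print(name, round(prob, 3), state)
```

Each of the four outcomes occurs with probability 1/4, and after the correction Bob always holds a|0\rangle +b|1\rangle.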

Super dense coding

Super dense coding is teleportation in reverse: it is the process through which two classical bits may be communicated by Alice to Bob, if she can only send across the quantum information of one qubit.

How does this happen? Assume that Alice and Bob share two entangled qubits that are in the Bell state \frac{1}{\sqrt{2}}(|00\rangle +|11\rangle). Now Alice has the following options:

  1. If Alice wants to communicate the classical bits |00\rangle, she can let the entangled state remain unchanged. When Bob receives Alice’s qubit, he will know that the Bell state of the entangled system is still \frac{1}{\sqrt{2}}(|00\rangle +|11\rangle), and will hence infer that Alice just wanted to communicate |00\rangle.
  2. If Alice wants to communicate |01\rangle, she will perform the X rotation on her own entangled qubit, causing the Bell state of the entangled system to change to \frac{1}{\sqrt{2}}(|01\rangle +|10\rangle). This change in the Bell state is not communicated to Bob until he also receives Alice’s entangled qubit. However, when he does, he soon notices the change in Bell state, and infers that Alice wanted to communicate the |01\rangle state.
  3. Similar conclusions can be reached for both the |10\rangle and |11\rangle states, in which Alice can perform the Z and XZ rotations on her entangled qubit respectively.

How does Bob know the Bell state of the entangled system, after he has received Alice’s qubit? Can one just look at both the qubits and know the state? No. Bob passes the entangled qubits through the CNot gate, with Alice’s qubit as the control, and then passes Alice’s qubit through the Hadamard gate. After performing these operations and measuring both qubits, Bob gets the two bit classical state that Alice wanted to communicate. The moral of the story is that a quantum wavefunction cannot really be inferred with just one measurement.
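The encoding and decoding steps above can be sketched as follows. This is a toy simulation under my own conventions: Alice’s qubit is the first qubit, the state is stored as the amplitudes of |00\rangle, |01\rangle, |10\rangle, |11\rangle, and all function names are hypothetical.

```python
import math

r = 1 / math.sqrt(2)

def x_on_first(s):
    """Pauli X on the first qubit: swaps |0j> and |1j>."""
    return [s[2], s[3], s[0], s[1]]

def z_on_first(s):
    """Pauli Z on the first qubit: negates the amplitudes of |1j>."""
    return [s[0], s[1], -s[2], -s[3]]

def cnot(s):
    """CNot with the first qubit as control: swaps |10> and |11>."""
    return [s[0], s[1], s[3], s[2]]

def h_on_first(s):
    """Hadamard gate on the first qubit."""
    return [r * (s[0] + s[2]), r * (s[1] + s[3]),
            r * (s[0] - s[2]), r * (s[1] - s[3])]

# Gates Alice applies to her qubit, in order, for each pair of classical bits.
ENCODE = {"00": [], "01": [x_on_first], "10": [z_on_first],
          "11": [z_on_first, x_on_first]}

def send(bits):
    """Encode two classical bits, then decode them the way Bob does."""
    state = [r, 0, 0, r]                 # shared Bell state (|00> + |11>)/sqrt(2)
    for gate in ENCODE[bits]:
        state = gate(state)
    state = h_on_first(cnot(state))      # Bob's decoding circuit
    probs = [abs(amp) ** 2 for amp in state]
    return format(probs.index(max(probs)), "02b")

decoded = {bits: send(bits) for bits in ENCODE}
print(decoded)
```

After the decoding circuit the state is a single basis state (up to sign), so Bob’s measurement returns Alice’s two bits with certainty.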

Deutsch-Jozsa algorithm

I will be following this video to explain the Deutsch-Jozsa algorithm.

What problem does this algorithm solve? Imagine that you have a function f: \{0,1\}^n\to \{0,1\}, and you know that it is either a constant function, or a balanced function, which means it maps half of its domain to 0, and the other half to 1. How do we find out which it is? Using a classical computer, we will need 2^{n-1}+1 queries in the worst case to be absolutely certain of the nature of f. However, a quantum computer can solve this problem with 1 query. How does this happen?

U_f is a quantum gate that is called an oracle function. It is a kind of black box that is quite useful in many applications. The operation that it performs is |x\rangle |y\rangle\to |x\rangle |y + f(x) \mod 2\rangle, where x\in \{0,1\}^n,y\in\{0,1\}. We are performing manipulations on n+1 qubits here. So in what order does all this happen?

We first input a classical state |0\rangle ^{\otimes n} of n bits, and an additional classical state |1\rangle of 1 bit into the system. We then perform H^{\otimes n} on the n classical bits to produce a wave function over n qubits, and also perform H on the additional classical bit to produce a wave function on 1 qubit. The U_f gate then performs the operation described above for the wave function determined by tensoring the n qubits with the one additional qubit. Now we again perform the H^{\otimes n} operation on the first n qubits, and get another quantum wave function over n+1 qubits. This final wave function is denoted as |\psi_3\rangle.

If the wave function |\psi_3\rangle consists of only the |0\rangle^{\otimes n} eigenstate for the first n qubits, f is a constant function. However, if it does not contain the |0\rangle^{\otimes n} eigenstate for the first n qubits at all, f is a balanced function. Fortunately, these two possibilities can be told apart by just one measurement of the first n qubits of |\psi_3\rangle.
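One way to see why a single measurement suffices: because the extra qubit is in the state H|1\rangle, the gate U_f effectively multiplies each |x\rangle by (-1)^{f(x)}, and the final H^{\otimes n} makes the amplitude of |0\rangle^{\otimes n} equal to \frac{1}{2^n}\sum_x (-1)^{f(x)}. The sketch below just evaluates this formula classically, so it uses 2^n evaluations of f and is only a check of the claim, not an implementation of the quantum circuit. The names `prob_all_zeros` and `classify` are hypothetical.

```python
def prob_all_zeros(f, n):
    """Probability that the first n qubits of |psi_3> are all measured as 0."""
    total = sum((-1) ** f(x) for x in range(2 ** n))
    return (total / 2 ** n) ** 2

def classify(f, n):
    """Decide 'constant' or 'balanced', assuming f is promised to be one of the two."""
    return "constant" if prob_all_zeros(f, n) > 0.5 else "balanced"

n = 3
constant_one = lambda x: 1                     # constant function
last_bit = lambda x: x & 1                     # balanced: depends on one input bit
parity = lambda x: bin(x).count("1") % 2       # balanced: parity of all input bits

print(classify(constant_one, n), classify(last_bit, n), classify(parity, n))
```

For a constant function the probability is exactly 1, and for a balanced function the positive and negative terms cancel and the probability is exactly 0.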

Bloch sphere-II

A lot of details about the Bloch sphere have already been mentioned before in this article. Perhaps one important point that we should keep in mind is that every unitary transformation of the wave function of a single qubit can be represented as a rotation of the corresponding point on the Bloch sphere around some axis. Moreover, each such rotation can be written as a composition of rotations around the x-axis, y-axis and z-axis. In other words, up to an overall phase, U=R_x(\theta_1)R_y(\theta_2)R_z(\theta_3) for some angles \theta_1,\theta_2,\theta_3. In fact, any unitary transformation U can be thought of as a bunch of rotations around just the x and z axes, as any rotation around the y axis can be obtained by sandwiching a rotation around the x axis between two rotations around the z axis: R_y(\theta)=R_z(\frac{\pi}{2})R_x(\theta)R_z(-\frac{\pi}{2}).
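The claim that rotations around the x and z axes are enough can be checked numerically through the identity R_y(\theta)=R_z(\frac{\pi}{2})R_x(\theta)R_z(-\frac{\pi}{2}). This sketch assumes the common convention in which a rotation by \theta around an axis is e^{-i\theta\sigma/2} for the corresponding Pauli matrix \sigma; the function names are my own.

```python
import cmath
import math

def rx(t):
    """Rotation by angle t around the x axis."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(t):
    """Rotation by angle t around the y axis."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -s], [s, c]]

def rz(t):
    """Rotation by angle t around the z axis."""
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta = 0.7
lhs = ry(theta)
rhs = matmul(rz(math.pi / 2), matmul(rx(theta), rz(-math.pi / 2)))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)  # zero up to rounding error
```

Geometrically, the outer z rotations turn the x axis into the y axis and back.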

In fact, even stronger claims than that can be made. What if we couldn’t change the angle? What if we had to fix \theta_1,\theta_2? Could we still represent any unitary transformation U to arbitrary precision by such rotations? Yes we can. There do exist two non-parallel axes a and b, and rotations R_a(\alpha) and R_b(\beta) with \alpha,\beta fixed such that any unitary transformation can be approximated to arbitrary precision by a bunch of these transformations. The only constraint on the angles is that \alpha/2\pi and \beta/2\pi have to be irrational.

A set of gates with this property is known as a universal set of quantum gates (here, for a single qubit).

Shor’s algorithm

I am going to be following this video to explain what Shor’s algorithm is, and how it works.

Popular science literature has often emphasized the fact that a lot of encryption is based on the simple fact that it is exceedingly difficult and time consuming for classical computers to factor large numbers. The numbers used in encryption are typically hundreds of digits long, and factoring them with the best known classical algorithms would take far longer than any attacker could afford.

But what is encryption, and where does factoring large numbers come into all of it? Encryption is the process through which information is converted into a kind of code, and sent to the intended receiver. The hope is that if some malicious party gets their hands on this code, they will not be able to crack it in order to obtain that information. But what kind of code is it?

Imagine that Alice sends Bob a code, along with a large number N. Breaking the code to retrieve the information is only possible if Bob knows the factors of the number already, which he does. If a malicious party gets their hands on this code and large number, they will not be able to decrypt the code without spending a prohibitively large amount of time and effort to find the factors of this number. Shor’s algorithm, when run on a sufficiently large quantum computer, can factor these numbers efficiently. How does it do it?

As mentioned before, the large number we have is N. Now pick a random number g with 1<g<N, and check if it has a common factor with N. This can be done very quickly using Euclid’s algorithm. If it does, congrats! We’re done. Two factors of N are \gcd(g,N) and N/\gcd(g,N). We can use the algorithm below to further factorize these factors if we want.

Chances are that an arbitrarily chosen number g will not have a common factor with N. What should we do now? We should look for a number p>1 such that g^p\equiv 1\mod N. This is because if we can find such a p, then g^{p}-1\equiv 0\mod N, or (g^{\frac{p}{2}}-1)(g^{\frac{p}{2}}+1)\equiv 0\mod N. Hence, two factors of N can be determined by simply using the Euclidean algorithm to find common factors of N and g^{\frac{p}{2}}\pm 1. However, there is a problem.

What if p is odd? Or what if (g^{\frac{p}{2}}-1) or (g^{\frac{p}{2}}+1) is a multiple of N, so that its common factor with N is N itself? We will have to start over and find another g, and hope that we don’t face this problem again. The good news is that the probability of us finding a “good” random number g for which we face neither of the above problems is at least \frac{3}{8} for each trial. Hence, we are almost certain to choose a “good” random number g after a sufficient number of trials.
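Everything described so far, apart from finding p quickly, is classical, and can be sketched in a few lines. In this illustration the period is found by brute force, which is exactly the slow step that a quantum computer is supposed to replace. The function names `find_period` and `try_factor` are hypothetical, and the code assumes 1<g<N.

```python
from math import gcd

def find_period(g, N):
    """Smallest p > 0 with g**p % N == 1, by brute force (needs gcd(g, N) == 1)."""
    p, value = 1, g % N
    while value != 1:
        value = (value * g) % N
        p += 1
    return p

def try_factor(N, g):
    """Return two nontrivial factors of N found using g, or None for a 'bad' g."""
    common = gcd(g, N)
    if common > 1:
        return common, N // common    # lucky guess: g already shares a factor
    p = find_period(g, N)
    if p % 2 == 1:
        return None                   # p is odd, so g^(p/2) is not an integer power
    half = pow(g, p // 2, N)          # g^(p/2) mod N
    if half == N - 1:
        return None                   # g^(p/2) + 1 is a multiple of N
    factor = gcd(half - 1, N)
    return factor, N // factor

print(try_factor(15, 7))    # period 4, and gcd(7**2 - 1, 15) = 3
print(try_factor(15, 14))   # period 2, but 14 + 1 is a multiple of 15
```

The case g^{\frac{p}{2}}-1\equiv 0\mod N cannot occur, because p is the smallest exponent with g^p\equiv 1\mod N.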

But how does one find such a number p such that g^p\equiv 1\mod N? This is where Shor’s algorithm comes in. It constructs the wave function |1,g^1\mod N\rangle +|2,g^2\mod N\rangle+\dots, suitably normalized of course. Now if we measure only the remainder, the wave function collapses to some remainder, say a. If x is the smallest exponent such that g^x\equiv a\mod N, the resultant wave function is |x,a\rangle+|x+p,a\rangle +|x+2p,a\rangle+\dots, because g^{x+p}\equiv g^x\mod N. We now need to determine the number p, which is also the period of this wave function.

We do this by determining the quantum Fourier transform of this wave function. Naively speaking, a Fourier transform gives us all the frequencies of the wave function. The fundamental frequency of this wave function is 1/period. However, a quantum Fourier transform gives us all multiples of the fundamental frequency, which can be thought of as resonant frequencies. Hence, the Fourier transform of the above wave function will be |1/p\rangle+|2/p\rangle+\dots

Now when we perform a measurement after the Fourier transform, the above wave function may collapse to any |i/p\rangle, where i is a positive integer. However, with enough measurements (of course preceded by painstakingly recreating the same wavefunction by following the same procedure), we can figure out the fundamental frequency |1/p\rangle, and consequently its reciprocal p. Each measured i/p, written in lowest terms, has a denominator that divides p, so a few trials usually suffice: the least common multiple of the denominators seen so far is a candidate for p, which we can check by computing g^p\mod N.
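Here is a toy sketch of how p can be pieced together from a few measured fractions. It is idealized: it assumes each measurement hands us the exact fraction i/p, whereas a real run returns an approximation that has to be cleaned up first (typically with continued fractions). The names `candidate_period` and `is_period` are my own.

```python
from fractions import Fraction
from math import gcd

def candidate_period(measured):
    """Least common multiple of the denominators of the measured fractions."""
    candidate = 1
    for fraction in measured:
        d = fraction.denominator
        candidate = candidate * d // gcd(candidate, d)
    return candidate

def is_period(g, N, p):
    """Check whether g**p leaves remainder 1 modulo N."""
    return pow(g, p, N) == 1

# Toy example: N = 35 and g = 2, for which the true period is p = 12.
g, N = 2, 35
first = [Fraction(4, 12)]                    # reduces to 1/3
both = [Fraction(4, 12), Fraction(3, 12)]    # reduce to 1/3 and 1/4

print(candidate_period(first), is_period(g, N, candidate_period(first)))
print(candidate_period(both), is_period(g, N, candidate_period(both)))
```

A single measurement with an unlucky i only reveals a divisor of p, which the final check catches; combining two measurements recovers p here.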

The discovery of this algorithm was largely responsible for creating and sustaining interest in quantum computation.

Grover’s algorithm

I will follow this video to explain Grover’s algorithm.

Imagine that we have N switches, some attached correctly and others attached upside down. We have to figure out the one configuration of the switches which will light the bulb. A classical computer may need up to 2^N trials to determine the correct configuration. However, a quantum computer using Grover’s algorithm can provide a quadratic speedup. In this case, we may get away with using roughly 2^{N/2} trials.

How does all of this happen? Imagine a system of N+1 qubits, in which the first N qubits represent all configurations of the N switches, and the last qubit represents the state of the bulb, which is initially assumed to be off. One may imagine that |\frac{1}{2}\rangle corresponds to the bulb being off and |-\frac{1}{2}\rangle corresponds to the bulb being on. For the eigenstate corresponding to the correct configuration of the switches, the function f switches the configuration of the bulb to “on”.

Each time we perform the “Grover iteration”, denoted by U_+U_f, the amplitude of the correct configuration is increased. After performing it the right number of times, which is roughly 2^{N/2}, we can be almost certain that the wave function will collapse into the correct configuration when the measurement is made.

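What is U_+U_f? In the standard formulation of Grover’s algorithm, U_f flips the sign of the amplitude of the correct configuration and leaves every other amplitude unchanged, while U_+ reflects each amplitude about the mean of all the amplitudes. Let us try to get a feel for the resulting growth with a loose numerical analogy. For a number x, let U_f(x)=1-2x and U_+(x)=2x-1, so that U_+U_f(x)=1-4x. The number \frac{1}{5} is mapped back exactly to \frac{1}{5}. However, for a number that is equal to -\frac{1}{2}, U_+U_f maps it to 3, (U_+U_f)^2 maps it to -11, etc. It is easy to see that |(U_+U_f)^n(-\frac{1}{2})|\to \infty as n\to\infty. In the same spirit, repeated Grover iterations make the amplitude of the correct configuration grow relative to all the others. The analogy is only loose, though: in the actual algorithm the wave function stays normalized, and the amplitude of the correct configuration peaks after roughly \frac{\pi}{4}2^{N/2} iterations instead of growing forever.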
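To see the amplification in numbers, here is a small simulation of the standard Grover iteration on the amplitudes of the first N qubits. It assumes the usual textbook description, in which U_f flips the sign of the amplitude of the correct configuration and U_+ reflects every amplitude about the mean; the function name `grover` and the choice of N=4 with the correct configuration at index 5 are made up for illustration.

```python
import math

def grover(n_switches, correct, iterations):
    """Probability of measuring the correct configuration after some iterations."""
    size = 2 ** n_switches
    amps = [1 / math.sqrt(size)] * size          # uniform superposition
    for _ in range(iterations):
        amps[correct] = -amps[correct]           # U_f: flip the marked amplitude
        mean = sum(amps) / size
        amps = [2 * mean - a for a in amps]      # U_+: reflect about the mean
    return amps[correct] ** 2

n_switches, correct = 4, 5
best = math.floor(math.pi / 4 * math.sqrt(2 ** n_switches))   # 3 iterations here
for k in range(best + 1):
    print(k, round(grover(n_switches, correct, k), 4))
```

With 16 configurations, the probability of the correct one rises from 1/16 to above 0.95 after 3 iterations; iterating further would make it fall again.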

I hope to study quantum error correction soon and blog about that as well. Thanks for reading!

Published by ayushkhaitan3437

Hello! My name is Ayush Khaitan, and I'm a graduate student in Mathematics. I am always excited about talking to people about their research. Please please set up a meeting with me if you feel that I might have an interesting perspective to offer- https://calendly.com/ayushkhaitan/meeting-with-ayush
