### Why substitution works in indefinite integration

Let’s integrate . We know the trick: substitute for . We get . Substituting into the original equation, we get . Let us assume remains positive throughout the interval under consideration. Then we get the integral as or .

I have performed similar operations for close to five years of my life now. But I was never, ever, quite convinced by them. How can you, just like that, substitute for ? My teacher once told me this: . Multiplying by on both sides, we get . What?!! It doesn’t work like that!!

It was only a year ago that I finally worked out why this ‘ruse’ works.

Take the function . If you differentiate this with respect to , you get . If you integrate , you get . Simple.

Now take the function . Differentiate it with respect to . You get . If you integrate , you get .

The thing to notice is that when you integrate the two functions, and , you want a function of the form . However and whatever I integrate, I ultimately want a function of the form , so that I can substitute for to get .

In the original situation, let us imagine there’s a function . We’ll discuss the properties of . If we were to make the substitution in and differentiate it with respect to , we’d get a function of the form , where is . There are two things to note here:

1. The form of the derivative of wrt is the same as that of , which is , multiplied by , or the derivative of wrt .

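2. When any function is differentiated with respect to any variable, integrating the result wrt the same variable gives us back the same function, up to an additive constant. Hence, for any differentiable function $h$ of a variable $t$, $\int \frac{dh}{dt}\,dt = h(t) + C$.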

Coming back to , let us assume its integral is . Its derivative on substituting and differentiating wrt is of the same form as multiplied by . This is a result of the chain rule of differentiation. Now following point 2, we know .
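To make the chain-rule argument concrete, here is a worked sketch. The integrand is chosen purely for illustration, and the symbols $f$, $F$, $g$ and $u$ are assumptions introduced for this example only.

$$
\begin{aligned}
&\text{Goal: } \int \frac{\cos x}{\sin x}\,dx, \qquad u = g(x) = \sin x > 0, \qquad f(u) = \frac{1}{u}, \qquad F(u) = \ln u \ \ (\text{so } F'(u) = f(u)). \\
&\frac{d}{dx}\,F(g(x)) = F'(g(x))\,g'(x) = \frac{1}{\sin x}\cdot \cos x \qquad \text{(chain rule)} \\
&\int \frac{\cos x}{\sin x}\,dx = \int \frac{d}{dx}\,F(g(x))\,dx = F(g(x)) + C = \ln(\sin x) + C.
\end{aligned}
$$

The second line recognises the integrand as the derivative of $F(g(x))$ with respect to $x$; the third line then only uses the fact that integrating a derivative returns the function, up to a constant.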

How is making the substitution justified? Could we have made any other continuous substitution, like ? Let us assume we substitute for . We want to take all the values can take. This is the condition that must be satisfied by any substitution. For values that takes but doesn’t, we restrict the range of to that of . Note that the shapes of as plotted against and as plotted against will be different. But that is irrelevant as long as we can write the same Cartesian pairs for any variable, where is the x-coordinate and is the y-coordinate.
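As an illustration of this range condition, consider the sketch below. The integrand and both substitutions are examples picked here, and the symbols $x$, $t$ and $g$ are assumptions for this sketch.

$$
\begin{aligned}
&\text{Goal: } \int \frac{dx}{\sqrt{1-x^2}}, \qquad -1 < x < 1. \\
&x = g(t) = \sin t, \quad -\tfrac{\pi}{2} < t < \tfrac{\pi}{2}: \quad g \text{ takes every value in } (-1, 1), \quad dx = \cos t\,dt, \quad \sqrt{1-\sin^2 t} = \cos t > 0. \\
&\int \frac{\cos t\,dt}{\cos t} = t + C = \arcsin x + C. \\
&x = t^2 \text{ only reaches } 0 \le x < 1, \text{ so it says nothing about } -1 < x < 0.
\end{aligned}
$$

Going the other way, a substitution such as $x = 2\sin t$ takes values outside $(-1, 1)$, so $t$ would have to be restricted (here to $|t| < \tfrac{\pi}{6}$) before it could be used.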

To sum up the argument, we predict the form the derivative of will take when the substitution is made, and then integrate this new form wrt to get the original function. This is why the ‘trick’ works.
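The summary can also be checked numerically. The snippet below is a minimal sketch, not a proof: it assumes the illustrative choices $g(x) = \sin x$, $f(u) = 1/u$ and $F(u) = \ln u$ (none of these names come from the argument above), and compares a finite-difference derivative of $F(g(x))$ with $f(g(x))\,g'(x)$ at points where $\sin x > 0$.

```python
import math


def g(x):
    # Inner function u = g(x); an illustrative choice, assumed for this sketch.
    return math.sin(x)


def g_prime(x):
    # Derivative of g with respect to x.
    return math.cos(x)


def f(u):
    # Integrand written in terms of u.
    return 1.0 / u


def F(u):
    # An antiderivative of f, valid for u > 0.
    return math.log(u)


def central_difference(func, x, h=1e-6):
    # Symmetric finite-difference estimate of func'(x).
    return (func(x + h) - func(x - h)) / (2.0 * h)


def max_chain_rule_error(points):
    # Largest gap between d/dx F(g(x)) and f(g(x)) * g'(x) over the points.
    worst = 0.0
    for x in points:
        lhs = central_difference(lambda t: F(g(t)), x)
        rhs = f(g(x)) * g_prime(x)
        worst = max(worst, abs(lhs - rhs))
    return worst


# Sample points from 0.2 to 2.8, where sin(x) stays positive.
sample_points = [0.2 + 0.1 * k for k in range(27)]

if __name__ == "__main__":
    print("largest discrepancy:", max_chain_rule_error(sample_points))
```

If the discrepancy is tiny, then $F(g(x))$ really is an antiderivative of $f(g(x))\,g'(x)$ on that interval, which is exactly what the substitution claims.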