Physical Interpretation of the Derivative

The primary concept of calculus deals with the rate of change of one variable with respect to another.

Instantaneous Speed

Let’s imagine a person who travels 90 km in 3 hours. Their average speed (rate of change of distance with respect to time) is 30 km/h. Of course, they don’t need to travel at that fixed speed; they may slow down or speed up at different times during their travel. For many purposes, it suffices to know the average speed.

However, in many daily happenings, the average speed is not a significant quantity. If a person traveling in an automobile strikes a tree, the quantity that matters is the speed at the instant of collision (this quantity might determine if they survive or not).

|Concept |Description                         |
|Interval|Happens over a period of time       |
|Instant |Happens so fast that no time elapses|

Calculating the average speed is simple. By definition, it’s the rate of change of distance with respect to time.

$$\text{average speed} = \frac{\text{distance traveled}}{\text{interval of time}}$$

The same computation can’t be applied to get the instantaneous speed at some point in time. Since instantaneous means that the event happens at a single instant, during which no time elapses, the distance traveled and the time elapsed are both zero. Hence, the definition of average speed won’t help, because 0/0 is meaningless. Instantaneous speed is a physical reality, but unless we can calculate it, we can’t work with it mathematically.

We can’t compute it with the knowledge we have right now, but we can surely approximate it. Let’s say that a ball is dropped near the surface of the Earth, and we want to know its instantaneous speed after 4 seconds. To calculate the instantaneous speed at any point in time, we need to know the distance it travels after some period of time. This relation can be expressed as a formula that relates distance and time traveled. The formula that relates the distance (in feet) to the time elapsed is:

$$f(t) = s = 16t^2$$

We can calculate the distance the ball traveled after 4 seconds by replacing t with 4:

$$s_4 = 16 \cdot 4^2 = 256 \text{ feet}$$

Let’s also compute the distance the ball traveled after 5 seconds:

$$s_5 = 16 \cdot 5^2 = 400 \text{ feet}$$

The average speed for this interval of time is then:

$$\text{average speed for the interval of time } [4, 5] = \frac{s_5 - s_4}{1} = \frac{400 - 256}{1} = 144 \text{ feet/s}$$

So the average speed during the fifth second is 144 feet/s. This quantity is no more than an approximation of the instantaneous speed, but we may improve the approximation by calculating the average speed in the interval of time from 4 to 4.1 seconds, which is:

$$\text{average speed for the interval of time } [4, 4.1] = \frac{268.96 - 256}{0.1} = 129.6 \text{ feet/s}$$

Let’s register more computations of the above process with smaller and smaller intervals of time in a table:

|time elapsed after 4 seconds|  1|  0.1|  0.01|  0.001|  0.0001|
|average speed (in feet/s)   |144|129.6|128.16|128.016|128.0016|

Of course, no matter how small the interval is, the result is not the instantaneous speed at the instant t=4. However, we now see that the average speed for the intervals seems to be approaching the fixed number 128 feet/s.

Method of Increments

Let’s redo the process described above over an arbitrary interval of time. To do so, let’s introduce a quantity h, which represents an interval of time beginning at t=4 and extending before or after t=4. (h is called an increment in t because it’s some interval of time.)

The formula for the example above is:

$$s = 16t^2 \tag{1}$$

Evaluated at the end of the fourth second, it gives:

$$s_4 = 16 \cdot 4^2 = 256 \tag{2}$$

Evaluated at the end of the interval [4, 4+h], it gives:

$$s_4 + k = 16(4 + h)^2 = 256 + 128h + 16h^2 \tag{3}$$

Here k is the additional distance the object falls during the h seconds after the initial 4 seconds. To obtain k, we subtract (2) from (3). The result is:

$$k = 128h + 16h^2$$

The average speed in this interval of time is then k/h. Dividing both sides by h:

$$\frac{k}{h} = 128 + 16h$$

To compute the instantaneous speed, we let the interval h become smaller and smaller without ever being exactly 0 (at h=0 the quotient k/h would be the meaningless 0/0). As h approaches 0, 16h also approaches 0, so the average speed approaches 128 feet/s. We take this number to be the instantaneous speed when t=4.

Generalization

Let’s generalize the process above, starting from (1), to any value of t. To do so, let’s apply the method of increments, replacing t with t+h:

$$s + k = 16(t + h)^2 = 16t^2 + 32th + 16h^2$$

Subtracting (1) from the equation above:

$$k = 32th + 16h^2$$

Dividing both sides by h:

$$\frac{k}{h} = 32t + 16h \tag{4}$$

Just as stated above, to compute the instantaneous speed, the interval h must become smaller and smaller, approaching 0. As h approaches 0, the average speed approaches 32t, which is a function that tells us the instantaneous speed of the falling object at any time t!

It has been customary since the days of Euler to use Δt (delta t) for the increment of t. Δt means a “change in the value of t”. Thus, Δt has the same meaning as h; likewise, Δs has the same meaning as k. We can rewrite (4) as:

$$\frac{\Delta s}{\Delta t} = 32t + 16\Delta t \tag{5}$$

It’s desirable to have a short notation for the statement that we have evaluated the limit of Δs/Δt as the values of Δt approach 0, which can be expressed as:

$$\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t}$$

Here lim is an abbreviation for limit. Applying this new notation to (5):

$$\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = 32t \tag{6}$$

This notation is somewhat lengthy; hence, mathematicians introduced shorter variations:

$$\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{ds}{dt} = \dot{s} = f'(t)$$

The rate of change is not always related to time or distances. A generalization of the formulas above is needed. Instead of the symbols s and t, let’s use x and y without specifying what x and y mean physically.

Let’s calculate the instantaneous rate of change of y with respect to x (the word instantaneous does not really apply because x doesn’t represent time), using the method of increments on a function which depends on x:

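$$y = f(x) \tag{7}$$

$$y + \Delta y = f(x + \Delta x) \tag{8}$$

Subtracting (7) from (8):

$$\Delta y = f(x + \Delta x) - f(x)$$

Dividing both sides by Δx:

$$\frac{\Delta y}{\Delta x} = \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

The instantaneous rate of change of y with respect to x is obtained as Δx approaches 0:

$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{9}$$

We can also use the variations for the notation of the rate of change:

$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} = y' = f'(x)$$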

What we did with the process above was to find the instantaneous rate of change of y with respect to x. We call this rate the derivative of y with respect to x. The process of applying the method of increments to obtain the derivative is called differentiation.

Geometric Interpretation of the Derivative

Let’s graph the following formula:

$$y = x^2 \tag{10}$$

[Figure: graph of y = x^2]

A point belonging to this geometrical representation of y has the form $(x_1, f(x_1))$; e.g., when x=1, y=1, and when x=2, y=4.

Let’s say that $(x_1, f(x_1))$ is a fixed point on the curve (for the sake of this example, the point will be $x_1 = 1$, $y_1 = 1$). Any other point that belongs to the curve determines a line through the fixed point.

The slope is a quantity that describes the direction and steepness of a line and is calculated by finding the ratio of the vertical change to the horizontal change between any two distinct points on the line. The previous statement expressed as a formula is:

$$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\Delta y}{\Delta x}$$

What if the movable point gets closer and closer to the fixed point, so that Δx approaches 0? That’s exactly the definition of the derivative, which means that the derivative of a function tells us the slope of the tangent line to the function (represented geometrically as a curve) at any point where the function is differentiable!

Let’s find the instantaneous rate of change of this function evaluated at x=1, using (9):

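$$m_1 = f'(1) = \lim_{\Delta x \to 0} \frac{f(1 + \Delta x) - f(1)}{\Delta x} = \lim_{\Delta x \to 0} \frac{(1 + \Delta x)^2 - 1^2}{\Delta x} = \lim_{\Delta x \to 0} \frac{1^2 + 2\Delta x + \Delta x^2 - 1^2}{\Delta x} = \lim_{\Delta x \to 0} (2 + \Delta x) = 2$$

This fixed number is the slope of the line tangent to the curve at x=1. Let’s find the equation of this tangent line, starting from the point–slope form of a line whose slope is m:

$$y - y_1 = m(x - x_1) \tag{11}$$

Substituting $y_1 = 1$, $m = 2$, and $x_1 = 1$ computed above:

$$y = 2(x - 1) + 1 = 2x - 2 + 1 = 2x - 1$$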

If we graph this line next to the geometric representation of $y = x^2$, we see that it’s actually touching the curve at the point (1,1).

[Figure: the curve y = x^2 together with its tangent line at the point (1, 1)]

Before finding the equation of the slope for any value of x, let’s imagine the graph produced by the slope function. If we take a look at the graph produced by (10), we can see that for any point that belongs to the curve whose x coordinate is negative, the slope will be negative, and for any point that belongs to the curve whose x coordinate is positive, the slope will be positive, expressed mathematically:

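$$\operatorname{sign}(m) = \begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}$$

Now that we have an idea of the values of the slope, let’s find the value of m for any value of x, that is, the derivative of y with respect to x, using (9):

$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} \frac{x^2 + 2x\Delta x + \Delta x^2 - x^2}{\Delta x} = \lim_{\Delta x \to 0} (2x + \Delta x) = 2x$$

[Figure: graph of the slope function f'(x) = 2x, a straight line through the origin]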

By looking at the line, we confirm our expectation of the values. Any point on the line whose x coordinate is negative has a negative y coordinate (the value of the slope), and any point on the line whose x coordinate is positive has a positive y coordinate.

There are infinitely many tangent lines to the curve that represents (10), one for each point of the curve. The equation of each one is obtained by substituting the coordinates of the point of tangency and the slope at that point into (11).
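The substitution into the point–slope form can be sketched in code. This is only an illustration for the curve y = x², whose slope at any x is 2x; the names `Line` and `tangent_to_parabola` are hypothetical and not part of the original.

```cpp
// A line written as y = slope * x + intercept.
struct Line {
  double slope;
  double intercept;
};

// Tangent line to y = x^2 at x = px, obtained from the point-slope form
// y - py = m (x - px) with py = px^2 and m = 2 * px.
Line tangent_to_parabola(double px) {
  double py = px * px;   // point of tangency (px, py)
  double m = 2.0 * px;   // slope of the curve at px
  return Line{m, py - m * px};
}
```

For the point of tangency (1, 1), `tangent_to_parabola(1.0)` gives a slope of 2 and an intercept of -1, i.e., the line y = 2x - 1.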

Second Derivative

Going back to the falling object formula (s is the distance the object moved after t seconds have elapsed):

$$s = 16t^2$$

The instantaneous rate of change of the distance with respect to time is:

$$\dot{s} = 32t \tag{12}$$

$\dot{s}$ represents speed, and it is customary to use v (the first letter of velocity) instead of $\dot{s}$:

$$v = 32t \tag{13}$$

Now v is a function of t, and we can ask for the rate of change of v with respect to t. This is called instantaneous acceleration. Acceleration is a change of speed that takes place during an interval of time. If there weren’t acceleration in a moving object, the moving object would be moving for the rest of its life with a constant speed. If the speed is given as a function of time, then we can calculate the instantaneous rate of change of the velocity with respect to time:

$$\dot{v} = 32 \tag{14}$$
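The constant 32 follows from applying the method of increments to v = 32t, exactly as was done for the distance formula. The steps below are a sketch; Δv is a symbol introduced here, by analogy with Δs, for the change in speed during Δt.

```latex
\begin{aligned}
v + \Delta v &= 32(t + \Delta t) \\
\Delta v &= 32\,\Delta t \\
\frac{\Delta v}{\Delta t} &= 32 \\
\lim_{\Delta t \to 0} \frac{\Delta v}{\Delta t} &= 32
\end{aligned}
```

The quotient doesn’t depend on Δt at all, so the acceleration of the falling object is the same at every instant: 32 feet per second, per second.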

The instantaneous acceleration obtained above is the derived function of the instantaneous speed, which is the derived function of the distance function. Then we can relate the instantaneous acceleration and the distance function with the following notation:

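$$\ddot{s} \quad \text{or} \quad \frac{d^2 s}{dt^2}$$

The function above is called the second derived function (the second derivative) of (1). This notation applied to the generalized version using the variables x and y is:

$$\frac{d^2 y}{dx^2} \quad \text{or} \quad y'' \quad \text{or} \quad f''(x)$$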

The Chain Rule

Physical problems lead to more complicated algebraic functions, for example, $y = \sqrt{x^2 + 1}$, which arises when one wants to work with the upper half of the hyperbola $y^2 = x^2 + 1$. We can express this function as a combination of two functions:

$$y = \sqrt{u}, \qquad u = x^2 + 1$$

If y is a function of u and u is a function of x, then:

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

Expressed in the function notation:

$$y = f(u) \quad \text{and} \quad u = g(x)$$

Then:

$$\frac{dy}{dx} = f'(u)\, g'(x) \tag{15}$$

Returning to the original problem, let’s find the derivative of $y = \sqrt{x^2 + 1}$ with respect to x using the chain rule:

Let $f(u) = u^{1/2}$ and $g(x) = x^2 + 1$.

$$\frac{dy}{dx} = f'(u)\, g'(x) = \frac{u^{-1/2}}{2} \cdot 2x = \frac{x}{\sqrt{x^2 + 1}}$$
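A quick numerical cross-check of this result is to compare the chain-rule formula against a difference quotient with a small increment. The sketch below assumes nothing beyond the formulas above; the function names `curve`, `slope_by_chain_rule`, and `difference_quotient` are illustrative.

```cpp
#include <cmath>

// y = sqrt(x^2 + 1), i.e. f(g(x)) with f(u) = sqrt(u) and g(x) = x^2 + 1.
double curve(double x) { return std::sqrt(x * x + 1.0); }

// dy/dx from the chain rule: f'(u) * g'(x) = (1 / (2 sqrt(u))) * 2x.
double slope_by_chain_rule(double x) {
  double u = x * x + 1.0;
  double df_du = 0.5 / std::sqrt(u);  // derivative of the outer function
  double dg_dx = 2.0 * x;             // derivative of the inner function
  return df_du * dg_dx;
}

// Difference quotient of the curve, used only as a numerical cross-check.
double difference_quotient(double x, double dx) {
  return (curve(x + dx) - curve(x)) / dx;
}
```

At x = 2, `slope_by_chain_rule(2.0)` equals 2/√5, and `difference_quotient(2.0, 1e-6)` should agree with it to several decimal places.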

Differentiation of Implicit Functions

Going back to the definition of a function, it’s a relation between two variables such that given a value of one in some domain, there’s a unique value determined for the second variable. However, functions often occur in forms where giving the independent variable some value will not result in a unique value. For example, the equation of a circle of radius equal to 5 is:

$$x^2 + y^2 = 25 \tag{16}$$

Here, y is not expressed in terms of x. Solving for y, we have two equations:

$$y = \sqrt{25 - x^2}, \qquad y = -\sqrt{25 - x^2} \tag{17}$$

(16) represents the circle implicitly, and (17) represents it explicitly.

We know that y in (16) represents some function of x. If we recognize that every term on the left side of (16) is then a function of x, we can differentiate it term by term. The problem is to find the derivative of $y^2$, which should remind us of the chain rule (y plays the role of u in the chain rule):

$$\frac{d(y^2)}{dx} = 2y \frac{dy}{dx}$$

Differentiating both sides of (16) with respect to x:

$$2x + 2y \frac{dy}{dx} = 0$$

Solving for dy/dx:

$$\frac{dy}{dx} = -\frac{x}{y}$$
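As a sanity check, the implicit result can be compared with the derivative of the explicit upper half of the circle. The point (3, 4) is chosen here only for illustration; it lies on the circle because 9 + 16 = 25.

```latex
\begin{aligned}
\text{Implicit form:}\quad & \frac{dy}{dx} = -\frac{x}{y} = -\frac{3}{4} \\
\text{Explicit form:}\quad & y = \sqrt{25 - x^2}
  \;\Rightarrow\;
  \frac{dy}{dx} = \frac{(25 - x^2)^{-1/2}}{2} \cdot (-2x)
  = -\frac{x}{\sqrt{25 - x^2}}
  = -\frac{3}{\sqrt{16}} = -\frac{3}{4}
\end{aligned}
```

Both routes give the same slope, and the explicit one uses the chain rule with u = 25 − x².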

Theorems on Differentiation

Read “Calculus: An Intuitive and Physical Approach”.

Applications of the Derivative

  • Determination of the velocity and acceleration of a particle given its distance as a function of time.
  • Concentrating light, sound, and radio waves in a particular direction (see the reflective property of the parabola).
  • Finding the maximum/minimum value of a function, i.e., finding the largest/smallest value of f(x) when a ≤ x ≤ b.
  • Approximation of the roots of a polynomial with Newton’s method.

Maxima/Minima

Let’s say that we throw an object into the air and we want to know the maximum height it acquires. As it rises, its velocity decreases, and when it reaches the highest point, its velocity is zero. We also know that the velocity is the instantaneous rate of change of height with respect to time; hence, the derivative is involved in this process, and therefore we expect it to be involved in other maxima/minima problems.

More generally, if y is a function of x, it seems that to find the maximum value of y, we must find $y'$ and set it to 0.

Let’s see an example. The following function has a maximum value of about 3.333 at x=1 and a minimum value of 2 at x=3. If we analyze the slope of the function near those points, we will see that on the left of x=1, the slope is positive, and on the right of x=1, the slope is negative. Since we know that the derivative represents the slope of a function, we can also expect that the derivative of this function near x=1 will go from a positive value to a negative value, crossing the x-axis. If we analyze the slope near x=3, we will see the same kind of behavior, but with the slope going from a negative value to a positive one.

$$y = \frac{x^3}{3} - 2x^2 + 3x + 2$$

[Figure: graph of y, with the maximum and the minimum marked]

$$y' = x^2 - 4x + 3$$

[Figure: graph of y', with its two intercepts with the x-axis marked]

Now the problem reduces to finding the points where $y' = 0$, i.e., where the derivative function crosses the x-axis. Those values of x locate the maximum/minimum values of y. Finding the values of x when $y' = 0$:

$$0 = x^2 - 4x + 3$$

$$0 = (x - 1)(x - 3)$$

And we see that:

$$y' = 0 \quad \text{when} \quad x = 1 \quad \text{and} \quad x = 3$$

The process didn’t actually find the absolute maximum/minimum values, since for x>3 the function increases indefinitely, and for x<1 it decreases indefinitely. The values found are called relative maxima/minima because they are the largest/smallest values of the function only among the points near x=1 or x=3.

Applications of Maxima/Minima

  • Refraction of light: we can build a function of time which relates the velocity/distance the light travels in different mediums. Finding the derivative and making it equal to 0 will find the relative minimum time needed to go from one point in medium a to a point in medium b.
  • Finding the sides of the rectangle with the maximum area for a given perimeter.

Newton-Raphson Method

The slope of the tangent line of a function f(x) at any point where it is differentiable is given by $m = f'(x)$. Let $x_1$ be such a point; then the slope of the tangent line at $x_1$ is $m_1 = f'(x_1)$. The point–slope form of the tangent line whose slope is $f'(x_1)$ is:

$$y - y_1 = m_1(x - x_1)$$

$$y - f(x_1) = f'(x_1)(x - x_1)$$

Newton found out that if we take the intercept of this tangent line with the x-axis, starting from some initial guess $x_1$, the value found is usually closer to one of the roots of f(x), i.e., a value of x for which f(x)=0 (given, of course, that f has roots).

Setting y=0 in the equation of the tangent line:

$$0 - f(x_1) = f'(x_1)(x - x_1)$$

Solving for x (which requires $f'(x_1) \neq 0$):

$$x = x_1 - \frac{f(x_1)}{f'(x_1)} \tag{18}$$

x in the last equation is the abscissa of the next approximation of one of the roots of f(x). If we run the algorithm above a few times with an acceptable initial guess, then we’ll obtain a better approximation of one of the roots of f(x).
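Repeating the update can be sketched as a short loop. This is one possible arrangement, not a definitive implementation: the stopping rule, the iteration cap `max_iter`, and the tolerance `tol` are assumptions added here, and the name `newton_raphson` is illustrative.

```cpp
#include <cmath>
#include <functional>

// Repeats the update x = x1 - f(x1) / f'(x1), starting from the guess x1.
// Stops when two successive approximations agree to within tol (relative to
// their size), when the tangent is horizontal, or after max_iter updates.
double newton_raphson(const std::function<double(double)>& f,
                      const std::function<double(double)>& df, double x1,
                      int max_iter = 50, double tol = 1e-12) {
  for (int i = 0; i < max_iter; ++i) {
    double slope = df(x1);
    if (slope == 0.0) break;  // horizontal tangent: no x-intercept
    double x = x1 - f(x1) / slope;
    if (std::fabs(x - x1) <= tol * std::fmax(1.0, std::fabs(x))) return x;
    x1 = x;
  }
  return x1;
}
```

For f(x) = x² − 4x + 3 with f'(x) = 2x − 4, an initial guess of 0 leads to the root 1 and an initial guess of 5 leads to the root 3, which illustrates how the result depends on the starting point.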


Finding the Square Root of a Number

Let’s say that we want to find the square root of a number n. This is equivalent to finding the solution to:

$$x^2 = n$$

The function to use is then:

$$f(x) = x^2 - n$$

whose derivative is:

$$f'(x) = 2x$$

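Substituting in (18):

$$x = x_1 - \frac{x_1^2 - n}{2x_1} = x_1 + \frac{-x_1^2 + n}{2x_1} = \frac{x_1^2 + n}{2x_1} = \frac{1}{2}\left(x_1 + \frac{n}{x_1}\right)$$

#include <cmath>

// Approximates sqrt(n) by iterating x = (x1 + n / x1) / 2.
double square_root(double n) {
  if (n < 0) return NAN;   // no real square root
  if (n == 0) return 0.0;  // the iteration would only creep toward 0
  const double EPS = 1e-15;  // relative tolerance
  double x0 = 1;             // initial guess
  // The cap guards against an endless loop if rounding prevents convergence.
  for (int i = 0; i < 2000; ++i) {
    double xi = (x0 + n / x0) / 2.0;
    if (std::fabs(x0 - xi) <= EPS * xi) {
      return xi;
    }
    x0 = xi;
  }
  return x0;
}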