Derivative

Physical Interpretation of the Derivative

The primary concept of calculus deals with the rate of change of one variable with respect to another.

Instantaneous Speed

Let’s imagine a person who travels 90 km in 3 hours. Their average speed (rate of change of distance with respect to time) is 30 km/h. Of course, they don’t need to travel at that fixed speed; they may slow down or speed up at different times during their travel. For many purposes, it suffices to know the average speed.

However, in many everyday situations, the average speed is not the significant quantity. If a person traveling in an automobile strikes a tree, the quantity that matters is the speed at the instant of collision (this quantity might determine whether they survive).

|Concept |Description                         |
|Interval|Happens over a period of time       |
|Instant |Happens so fast that no time elapses|

Calculating the average speed is simple. By definition, it’s the rate of change of distance with respect to time.

\text{average speed} = \frac{\text{distance traveled}}{\text{interval of time}}

The same computation can’t be applied to get the instantaneous speed at some point in time. At a single instant no time elapses and no distance is covered, so the distance and the time are both zero. Hence, the average speed definition doesn’t help because $\frac{0}{0}$ is meaningless. Instantaneous speed is a physical reality, but unless we can calculate it, we can’t work with it mathematically.

We can’t compute it with the knowledge we have right now, but we can surely approximate it. Let’s say that a ball is dropped near the surface of the Earth, and we want to know its instantaneous speed after 4 seconds. To calculate the instantaneous speed at any point in time, we need to know the distance it travels after some period of time. This relation can be expressed as a formula that relates distance and time traveled. The formula that relates the distance (in feet) to the time elapsed is:

f (t) = s = 16 t^{2}

We can calculate the distance the ball traveled after 4 seconds by replacing $t$ with 4:

\begin{aligned} s_{4} & = 16 \cdot 4^{2} \\ & = 256 \text{ feet} \end{aligned}

Let’s also compute the distance the ball traveled after 5 seconds:

\begin{aligned} s_{5} & = 16 \cdot 5^{2} \\ & = 400 \text{ feet} \end{aligned}

The average speed for this interval of time is then:

\text{average speed for the interval of time } [4, 5] = \frac{s_{5} - s_{4}}{1} = \frac{400 - 256}{1} = 144 \text{ feet/s}

So the average speed during the fifth second is $144$ feet/s. This quantity is no more than an approximation of the instantaneous speed at $t = 4$, but we can improve the approximation by calculating the average speed over the interval of time from 4 to 4.1 seconds, where $s_{4.1} = 16 \cdot 4.1^{2} = 268.96$ feet:

\text{average speed for the interval of time } [4, 4.1] = \frac{268.96 - 256}{0.1} = 129.6 \text{ feet/s}

Let’s record more computations of the above process, with smaller and smaller intervals of time, in a table:

|time elapsed after 4 seconds|  1|  0.1|  0.01|  0.001|  0.0001|
|average speed (in feet/s)   |144|129.6|128.16|128.016|128.0016|

Of course, no matter how small the interval is, the result is not the instantaneous speed at the instant $t = 4$ . However, we now see that the average speed for the intervals seems to be approaching the fixed number 128 feet/s.
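The table above can be reproduced with a few lines of code. The following is a minimal sketch, not a definitive implementation; the function names `distance_fallen` and `average_speed` are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

// Distance (in feet) fallen after t seconds, from the formula s = 16 t^2.
double distance_fallen(double t) {
  return 16.0 * t * t;
}

// Average speed (in feet/s) over the interval [t, t + h].
// h must be nonzero: with h = 0 the quotient would be the meaningless 0/0.
double average_speed(double t, double h) {
  assert(h != 0.0);
  return (distance_fallen(t + h) - distance_fallen(t)) / h;
}
```

Calling `average_speed(4.0, h)` with `h` equal to 1, 0.1, 0.01, and so on should give values close to the entries of the table, up to floating-point rounding.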

Method of Increments

Let’s redo the process described above over an arbitrary interval of time. To do so, let’s introduce a quantity $h$ , which represents an interval of time beginning at $t = 4$ and extending before or after $t = 4$ . ( $h$ is called an increment in $t$ because it’s some interval of time.)

The formula for the example above is:

\begin{matrix} (1) & s = 16 t^{2} \end{matrix}

Evaluated at the end of the fourth second, it gives:

\begin{matrix} (2) & s_{4} = 16 \cdot 4^{2} = 256 \end{matrix}

Evaluated at the end of the interval $[4, 4 + h]$, that is, at $t = 4 + h$, it gives:

\begin{matrix} (3) & s_{4} + k = 16 (4 + h)^{2} = 256 + 128 h + 16 h^{2} \end{matrix}

Where $k$ is the additional distance the object falls $h$ seconds after the initial $4$ seconds. To obtain $k$ , we have to subtract $(2)$ from $(3)$ . The result is:

k = 128 h + 16 h^{2}

The average speed in this interval of time is then $\frac{k}{h}$ . Dividing both sides by $h$ :

\frac{k}{h} = 128 + 16 h

To compute the instantaneous speed, we let the interval $h$ become smaller and smaller, approaching 0 without ever being equal to it (the division by $h$ above requires $h \neq 0$). As $h$ approaches 0, $16 h$ also approaches 0, so the average speed $128 + 16 h$ approaches 128 feet/s. We take this limiting value as the instantaneous speed at $t = 4$.

Generalization

Let’s generalize the process above for $(1)$ for any value of $t$. To do so, let’s apply the method of increments over the interval $[t, t + h]$, replacing $t$ with $t + h$:

\begin{aligned} s + k & = 16 (t + h)^{2} \\ & = 16 t^{2} + 32 t h + 16 h^{2} \end{aligned}

Subtracting $(1)$ from the equation above:

k = 32 t h + 16 h^{2}

Dividing both sides by $h$:

\begin{matrix} (4) & \frac{k}{h} = 32 t + 16 h \end{matrix}

Just as stated above, to compute the instantaneous speed, we let the interval $h$ become smaller and smaller. As $h$ approaches 0, the average speed approaches $32 t$, which is a function that tells us the instantaneous speed of the falling object at any time $t$!

It has been customary since the days of Euler to use $Δ t$ (delta t) for the increment of $t$ . $Δ t$ means a “change in the value of $t$ ”. Thus, $Δ t$ has the same meaning as $h$ ; likewise, $Δ s$ has the same meaning as $k$ . We can rewrite $(4)$ as:

\begin{matrix} (5) & \frac{Δ s}{Δ t} = 32 t + 16 Δ t \end{matrix}

It’s desirable to have a short notation for the statement that we have evaluated the limit of $\frac{Δ s}{Δ t}$ as the values of $Δ t$ approach 0, which can be expressed as:

\lim_{Δ t \to 0} \frac{Δ s}{Δ t}

Here $\lim$ is an abbreviation for limit. Taking the limit of $(5)$ with this new notation:

\begin{matrix} (6) & \lim_{Δ t \to 0} \frac{Δ s}{Δ t} = 32 t \end{matrix}

This notation is somewhat lengthy, so mathematicians introduced shorter, equivalent notations:

\lim_{Δ t \to 0} \frac{Δ s}{Δ t} = \frac{d s}{d t} = s^{'} = f^{'} (t)

The rate of change is not always related to time or distances. A generalization of the formulas above is needed. Instead of the symbols $s$ and $t$ , let’s use $x$ and $y$ without specifying what $x$ and $y$ mean physically.

Let’s calculate the instantaneous rate of change of $y$ with respect to $x$ (the word instantaneous does not really apply because $x$ doesn’t represent time), using the method of increments on a function which depends on $x$ :

\begin{aligned} (7) & y & = f (x) \\ (8) & y + Δ y & = f (x + Δ x) \end{aligned}

Subtracting $(7)$ from $(8)$ :

Δ y = f (x + Δ x) - f (x)

Dividing both sides by $Δ x$ :

\frac{Δ y}{Δ x} = \frac{f (x + Δ x) - f (x)}{Δ x}

The instantaneous rate of change of $y$ with respect to $x$ is reached when $Δ x$ approaches 0:

\begin{matrix} (9) & \lim_{Δ x \to 0} \frac{f (x + Δ x) - f (x)}{Δ x} \end{matrix}

We can also use the variations for the notation of the rate of change:

\lim_{Δ x \to 0} \frac{Δ y}{Δ x} = \frac{d y}{d x} = y^{'} = f^{'} (x)

What we did with the process above was to find the instantaneous rate of change of $y$ with respect to $x$ . We call this rate the derivative of $y$ with respect to $x$ . The process of applying the method of increments to obtain the derivative is called differentiation.
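The quotient that appears inside the limit can be evaluated numerically for any function. The sketch below is only an illustration under stated assumptions: the name `difference_quotient` is hypothetical, and the code evaluates the quotient for one nonzero increment rather than taking the limit itself.

```cpp
#include <cassert>
#include <functional>

// Evaluates (f(x + dx) - f(x)) / dx, the quotient inside the limit,
// for a single nonzero increment dx. The increment may be negative,
// which corresponds to an interval that extends before x.
double difference_quotient(const std::function<double(double)>& f,
                           double x, double dx) {
  assert(dx != 0.0);
  return (f(x + dx) - f(x)) / dx;
}
```

For $f (x) = x^{2}$ at $x = 1$, the quotient simplifies to $2 + Δ x$, so an increment of 0.5 gives 2.5 and an increment of -0.5 gives 1.5; both move toward 2 as the increment shrinks.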

Geometric Interpretation of the Derivative

Let’s graph the following formula:

\begin{matrix} (10) & y = x^{2} \end{matrix}

A point belonging to this geometrical representation of $y$ has the form $(x_{1}, f (x_{1}))$ ; e.g., when $x = 1, y = 1$ and when $x = 2, y = 4$ .

Let’s say that $(x_{1}, f (x_{1}))$ is a fixed point on the curve (for the sake of this example, the point will be $x_{1} = 1, y_{1} = 1$ ). Any other point that belongs to the curve can make a line with the fixed point.

The slope is a quantity that describes the direction and steepness of a line and is calculated by finding the ratio of the vertical change to the horizontal change between any two distinct points on the line. The previous statement expressed as a formula is:

m = \frac{y_{2} - y_{1}}{x_{2} - x_{1}} = \frac{Δ y}{Δ x}

What if the movable point gets closer and closer to the fixed point, so that $Δ x$ approaches 0? The limit of the slope $\frac{Δ y}{Δ x}$ is exactly the definition of the derivative, which means that the derivative of a function tells us the slope of the tangent line to the function (represented geometrically as a curve) at any point where it is differentiable!

Let’s find the instantaneous rate of change of this function evaluated at $x = 1$ , using $(9)$ :

\begin{aligned} m_{1} = f^{'} (1) & = \lim_{Δ x \to 0} \frac{f (1 + Δ x) - f (1)}{Δ x} \\ & = \lim_{Δ x \to 0} \frac{(1 + Δ x)^{2} - 1^{2}}{Δ x} \\ & = \lim_{Δ x \to 0} \frac{1^{2} + 2 Δ x + (Δ x)^{2} - 1^{2}}{Δ x} \\ & = \lim_{Δ x \to 0} (2 + Δ x) \\ & = 2 \end{aligned}

This fixed number is the slope of the line tangent to the curve $y = x^{2}$ at $x = 1$. Let’s find the equation of this tangent line, starting from the point-slope form of a line whose slope is $m$:

\begin{matrix} (11) & y - y_{1} = m (x - x_{1}) \end{matrix}

Substituting $y_{1} = 1$ , $m = 2$ , and $x_{1} = 1$ computed above:

\begin{aligned} y & = 2 (x - 1) + 1 \\ & = 2 x - 2 + 1 \\ & = 2 x - 1 \end{aligned}

If we graph this line next to the geometric representation of $y = x^{2}$ , we see that it’s actually touching the curve at the point $(1, 1)$ .

Before finding the equation of the slope for any value of $x$ , let’s imagine the graph produced by the slope function. If we take a look at the graph produced by $(10)$ , we can see that for any point that belongs to the curve whose $x$ coordinate is negative, the slope will be negative, and for any point that belongs to the curve whose $x$ coordinate is positive, the slope will be positive, expressed mathematically:

\operatorname{sign} (m) = \begin{cases} - 1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}

Now that we have an idea of the values of the slope, let’s find the value of $m$ for any value of $x$, that is, the derivative of $y$ with respect to $x$, using $(9)$:

\begin{aligned} f^{'} (x) & = \lim_{Δ x \to 0} \frac{f (x + Δ x) - f (x)}{Δ x} \\ & = \lim_{Δ x \to 0} \frac{(x + Δ x)^{2} - x^{2}}{Δ x} \\ & = \lim_{Δ x \to 0} \frac{x^{2} + 2 x Δ x + (Δ x)^{2} - x^{2}}{Δ x} \\ & = \lim_{Δ x \to 0} (2 x + Δ x) \\ & = 2 x \end{aligned}

Graphing the slope function $f^{'} (x) = 2 x$ gives a line that confirms our expectation. Any point on this line whose $x$ coordinate is negative has a negative $y$ coordinate (the value of the slope), and any point whose $x$ coordinate is positive has a positive $y$ coordinate.

There are infinitely many tangent lines to the curve that represents $(10)$, one for each point on it. In the interactive graph that accompanies this section, the equation of the tangent line is computed dynamically from the position of the mouse pointer (by doing substitutions on $(11)$).
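The substitution into $(11)$ can be sketched in code for the curve $y = x^{2}$. This is an illustrative sketch only; the `Line` struct and the function name `tangent_to_parabola` are hypothetical and not part of the original graph.

```cpp
// A line in slope-intercept form, y = slope * x + intercept.
struct Line {
  double slope;
  double intercept;
};

// Tangent line to y = x^2 at x = x1, from the point-slope form (11)
// with y1 = x1^2 and m = f'(x1) = 2 * x1.
Line tangent_to_parabola(double x1) {
  double y1 = x1 * x1;
  double m = 2.0 * x1;
  // y - y1 = m (x - x1)  =>  y = m x + (y1 - m x1)
  return Line{m, y1 - m * x1};
}
```

For $x_{1} = 1$ this yields slope 2 and intercept -1, matching the line $y = 2 x - 1$ found earlier.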

Second Derivative

Going back to the falling object formula ( $s$ is the distance the object moved after $t$ seconds have elapsed):

s = 16 t^{2}

The instantaneous rate of change of the distance with respect to time is:

\begin{matrix} (12) & s^{'} = 32 t \end{matrix}

$s^{'}$ represents speed, and it is customary to use $v$ (the first letter of velocity) instead of $s^{'}$ :

\begin{matrix} (13) & v = 32 t \end{matrix}

Now $v$ is a function of $t$, and we can ask for the rate of change of $v$ with respect to $t$. This is called instantaneous acceleration. Acceleration is a change of speed that takes place during an interval of time. Without acceleration, a moving object would keep moving at a constant speed indefinitely. If the speed is given as a function of time, then we can calculate the instantaneous rate of change of the velocity with respect to time:

\begin{matrix} (14) & v^{'} = 32 \end{matrix}

The instantaneous acceleration obtained above is the derived function of the instantaneous speed, which is the derived function of the distance function. Then we can relate the instantaneous acceleration and the distance function with the following notation:

s^{''} \text{ or } \frac{d^{2} s}{d t^{2}}

The function above is called the second derived function of $(1)$ . This notation applied to the generalized version using the variables $x$ and $y$ is:

\frac{d^{2} y}{d x^{2}} \text{ or } y^{''} \text{ or } f^{''} (x)
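The idea of a derived function of a derived function can be mimicked numerically by applying the same increment quotient twice. The following is a rough sketch under stated assumptions: the names `Fn`, `fallen_distance`, and `derived_function_estimate` are hypothetical, and a fixed small increment `h` stands in for the limit, so the results are approximations.

```cpp
#include <cassert>
#include <functional>

using Fn = std::function<double(double)>;

// s = 16 t^2, the distance formula (1).
double fallen_distance(double t) {
  return 16.0 * t * t;
}

// Returns the function t -> (f(t + h) - f(t)) / h for a fixed nonzero h.
// This approximates the derived function of f; it is not the exact limit.
Fn derived_function_estimate(Fn f, double h) {
  assert(h != 0.0);
  return [f, h](double t) { return (f(t + h) - f(t)) / h; };
}
```

Applying `derived_function_estimate` once to `fallen_distance` with `h = 0.01` approximates the speed $32 t$ (about 128.16 at $t = 4$); applying it a second time to that result approximates the acceleration, which should come out close to 32.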

The Chain Rule

Physical problems lead to more complicated algebraic functions, for example, $y = \sqrt{x^{2} + 1}$, which arises when one wants to work with the upper half of the hyperbola $y^{2} = x^{2} + 1$. We can express this function as a combination of two functions:

y = \sqrt{u}, \qquad u = x^{2} + 1

If $y$ is a function of $u$ and $u$ is a function of $x$ , then:

\frac{d y}{d x} = \frac{d y}{d u} \cdot \frac{d u}{d x}

Expressed in the function notation:

y = f (u) \text{ and } u = g (x)

Then:

\begin{matrix} (15) & \frac{d y}{d x} = f^{'} (u) \cdot g^{'} (x) \end{matrix}

Returning to the original problem, let’s find the derivative of $y = \sqrt{x^{2} + 1}$ with respect to $x$ using the chain rule:

Let $f (u) = u^{1 / 2}$ and $g (x) = x^{2} + 1$ .

\frac{d y}{d x} = f^{'} (u) \cdot g^{'} (x) = \frac{u^{- 1 / 2}}{2} \cdot 2 x = \frac{x}{\sqrt{x^{2} + 1}}
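The chain rule result can be sanity-checked against an increment quotient. This is a minimal sketch with hypothetical function names (`y_of`, `dy_dx_chain_rule`, `dy_dx_numeric`); the numeric version uses a small fixed increment, so it only approximates the derivative.

```cpp
#include <cassert>
#include <cmath>

// y = sqrt(x^2 + 1), the composite function from the text.
double y_of(double x) {
  return std::sqrt(x * x + 1.0);
}

// dy/dx from the chain rule: f'(u) * g'(x) with u = g(x) = x^2 + 1.
double dy_dx_chain_rule(double x) {
  double u = x * x + 1.0;             // u = g(x)
  double df_du = 0.5 / std::sqrt(u);  // f'(u) = u^(-1/2) / 2
  double dg_dx = 2.0 * x;             // g'(x) = 2x
  return df_du * dg_dx;
}

// Increment quotient of y for a small nonzero dx, used for comparison.
double dy_dx_numeric(double x, double dx) {
  assert(dx != 0.0);
  return (y_of(x + dx) - y_of(x)) / dx;
}
```

At $x = 2$, for example, `dy_dx_chain_rule(2.0)` equals $2 / \sqrt{5}$ up to rounding, and `dy_dx_numeric(2.0, 1e-6)` should agree with it to several decimal places.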

Differentiation of Implicit Functions

Going back to the definition of a function, it’s a relation between two variables such that given a value of one in some domain, there’s a unique value determined for the second variable. However, relations between variables often occur in forms where giving the independent variable a value does not determine a unique value of the other. For example, the equation of a circle of radius equal to 5 is:

\begin{matrix} (16) & x^{2} + y^{2} = 25 \end{matrix}

Here, $y$ is not expressed in terms of $x$ . Solving for $y$ , we have two equations:

\begin{matrix} (17) & y = \sqrt{25 - x^{2}}, \qquad y = - \sqrt{25 - x^{2}} \end{matrix}

$(16)$ represents the circle implicitly, and $(17)$ represents it explicitly.

We know that $y$ in $(16)$ represents some function of $x$. Since $y$ is a function of $x$, every term on the left side of $(16)$ is a function of $x$, so we can differentiate it term by term with respect to $x$. The problem is to find the derivative of $y^{2}$, which should remind us of the chain rule ($y$ plays the role of $u$ in the chain rule):

\frac{d (y^{2})}{d x} = 2 y \frac{d y}{d x}

Differentiating both sides of $(16)$ with respect to $x$:

2 x + 2 y \frac{d y}{d x} = 0

Solving for $\frac{d y}{d x}$ :

\frac{d y}{d x} = - \frac{x}{y}
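As a quick check of this result, here is a worked example at the point $(3, 4)$ of the circle (a point chosen here only for illustration), comparing the implicit formula with direct differentiation of the upper explicit form in $(17)$ via the chain rule:

```latex
\begin{aligned}
\text{implicit: } \frac{d y}{d x} & = - \frac{x}{y} = - \frac{3}{4} \\
\text{explicit: } \frac{d y}{d x} & = \frac{d}{d x} \sqrt{25 - x^{2}} = \frac{- 2 x}{2 \sqrt{25 - x^{2}}} = - \frac{x}{\sqrt{25 - x^{2}}} = - \frac{3}{\sqrt{16}} = - \frac{3}{4}
\end{aligned}
```

Both agree. On the lower half, at $(3, - 4)$, the same implicit formula gives $- \frac{3}{- 4} = \frac{3}{4}$, so the single implicit result covers both explicit functions.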

Theorems on Differentiation

Read “Calculus: An Intuitive and Physical Approach”.

Applications of the Derivative

Determining the velocity and acceleration of a particle given its distance as a function of time.
Concentrating light, sound, and radio waves in a particular direction (see the reflective property of the parabola).
Finding the maximum/minimum value of a function, i.e., finding the largest/smallest value of $f (x)$ when $a \leq x \leq b$. A well-described solution to this problem can be found here.
Approximating the roots of a polynomial with Newton’s method, described here.

Maxima/Minima

Let’s say that we throw an object into the air and we want to know the maximum height it reaches. As it rises, its velocity decreases, and when it reaches the highest point, its velocity is zero. We also know that the velocity is the instantaneous rate of change of height with respect to time; hence, the derivative is involved in this process, and therefore we expect it to be involved in other maxima/minima problems.

More generally, if $y$ is a function of $x$ , it seems that to find the maximum value of $y$ , we must find $y^{'}$ and set it to 0.

Let’s see an example. The function below has a relative maximum value of about $3.333$ at $x = 1$ and a relative minimum value of $2$ at $x = 3$. If we analyze the slope of the function near those points, we see that to the left of $x = 1$ the slope is positive, and to the right of $x = 1$ the slope is negative. Since the derivative represents the slope of a function, we expect the derivative of this function to go from a positive value to a negative value near $x = 1$, crossing the x-axis there. Near $x = 3$ we see the same behavior in reverse: the slope goes from a negative value to a positive one.

y = x^{3} / 3 - 2 x^{2} + 3 x + 2

y^{'} = x^{2} - 4 x + 3

Now the problem reduces to finding the points where $y^{'} = 0$. These are the candidate values of $x$ at which $y$ can reach a maximum or a minimum. Solving $y^{'} = 0$ for $x$:

\begin{aligned} 0 & = x^{2} - 4 x + 3 \\ 0 & = (x - 1) (x - 3) \end{aligned}

And we see that:

y^{'} = 0 \text{ when } x = 1 \text{ and } x = 3

The process didn’t find the absolute maximum/minimum values of the function: for $x > 3$ the function increases indefinitely, and for $x < 1$ it decreases indefinitely. The values found are called relative maxima/minima because they are the largest/smallest values the function takes near $x = 1$ and $x = 3$, respectively.
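The sign change of the slope described in this example can be checked in code. The sketch below is only illustrative: the names `y_prime` and `classify`, and the probe distance `d`, are hypothetical choices made here, and the test assumes `d` is small enough that no other critical point lies within `d` of `c`.

```cpp
#include <cassert>
#include <string>

// y' = x^2 - 4x + 3, the derivative from the example.
double y_prime(double x) {
  return x * x - 4.0 * x + 3.0;
}

// Classifies a critical point c (where y'(c) = 0) by the sign of y'
// slightly to the left (c - d) and slightly to the right (c + d).
std::string classify(double c, double d) {
  assert(d > 0.0);
  double left = y_prime(c - d);
  double right = y_prime(c + d);
  if (left > 0.0 && right < 0.0) return "relative maximum";
  if (left < 0.0 && right > 0.0) return "relative minimum";
  return "neither";
}
```

With `d = 0.1`, `classify(1.0, 0.1)` reports a relative maximum and `classify(3.0, 0.1)` a relative minimum, in line with the discussion above.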

Applications of Maxima/Minima

Refraction of light: we can build a function of time which relates the velocity/distance the light travels in different mediums. Finding the derivative and making it equal to $0$ will find the relative minimum time needed to go from one point in medium $a$ to a point in medium $b$ .
Finding the sides of the rectangle with the maximum area for a given perimeter.

Newton-Raphson Method

The slope of the tangent line of a function $f (x)$ at any point where it is differentiable is given by $m = f^{'} (x)$. Let $x_{1}$ be such a point; then the slope of the tangent line at $x_{1}$ is $m_{1} = f^{'} (x_{1})$. The point-slope form of the tangent line whose slope is $f^{'} (x_{1})$ is:

\begin{aligned} y - y_{1} & = m_{1} (x - x_{1}) \\ y - f (x_{1}) & = f^{'} (x_{1}) \cdot (x - x_{1}) \end{aligned}

Newton found that the point where this tangent line crosses the $x$-axis is usually a better approximation to a root of $f (x)$ (a value of $x$ for which $f (x) = 0$) than the initial guess $x_{1}$, provided that $f (x)$ has roots and the guess is reasonably close to one of them.

Setting $y = 0$ in the equation of the tangent line:

0 - f (x_{1}) = f^{'} (x_{1}) \cdot (x - x_{1})

Solving for $x$ :

\begin{matrix} (18) & x = x_{1} - \frac{f (x_{1})}{f^{'} (x_{1})} \end{matrix}

$x$ in the last equation is the abscissa of the next approximation of one of the roots of $f (x)$ . If we run the algorithm above a few times with an acceptable initial guess, then we’ll obtain a better approximation of one of the roots of $f (x)$ .
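Repeating $(18)$ can be sketched as a small generic routine. This is an assumption-laden sketch rather than a definitive implementation: the name `newton_raphson` and the `tolerance` and `max_iterations` parameters are hypothetical, and the routine does not guarantee convergence for a poor initial guess.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Repeats x = x1 - f(x1) / f'(x1), equation (18), starting at `guess`.
// Stops once two successive approximations differ by less than
// `tolerance`, or after `max_iterations` steps, whichever comes first.
double newton_raphson(const std::function<double(double)>& f,
                      const std::function<double(double)>& f_prime,
                      double guess, double tolerance, int max_iterations) {
  assert(tolerance > 0.0);
  double x1 = guess;
  for (int i = 0; i < max_iterations; ++i) {
    double slope = f_prime(x1);
    if (slope == 0.0) break;  // horizontal tangent: no x-intercept
    double x = x1 - f(x1) / slope;
    if (std::fabs(x - x1) < tolerance) return x;
    x1 = x;
  }
  return x1;  // best approximation found within the iteration budget
}
```

For example, passing $f (x) = x^{3} - 2 x - 5$ and $f^{'} (x) = 3 x^{2} - 2$ with an initial guess of 2 should approach the root near 2.0945. The iteration cap is a design choice to avoid looping forever when the guesses do not settle.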

Finding the Square Root of a Number

Let’s say that we want to find the square root of a number $n$ . This is equivalent to finding the solution to:

x^{2} = n

The function to use is then:

f (x) = x^{2} - n

whose derivative is:

f^{'} (x) = 2 x

Substituting in $(18)$ :

\begin{aligned} x & = x_{1} - \frac{x_{1}^{2} - n}{2 x_{1}} \\ & = x_{1} - \frac{x_{1}}{2} + \frac{n}{2 x_{1}} \\ & = \frac{x_{1}}{2} + \frac{n}{2 x_{1}} \\ & = \frac{1}{2} \cdot \left( x_{1} + \frac{n}{x_{1}} \right) \end{aligned}

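#include <cmath>

// Approximates the square root of n with the Newton-Raphson iteration
// x = (x1 + n / x1) / 2 derived above.
double square_root(double n) {
  if (!(n >= 0)) return NAN;  // negative or NaN input: no real square root
  if (n == 0) return 0;       // avoids dividing by a guess that shrinks to 0
  const double EPS = 1e-15;   // relative tolerance between successive guesses
  double x0 = 1;              // initial guess
  while (true) {
    double xi = (x0 + n / x0) / 2.0;
    // std::fabs is the floating-point absolute value; plain abs may pick
    // the integer overload and truncate the difference.
    if (std::fabs(x0 - xi) <= EPS * xi) {
      return xi;
    }
    x0 = xi;
  }
}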