9.1 Partial Derivatives and Linearization

Partial Derivatives

Suppose that $f(x) = f(x_1,x_2,...,x_d)$ is a scalar-valued function of $d$ variables. How can we find the “slope” of $f$ at some input $x = [x_1,x_2,...,x_d]$? What does it mean to differentiate a function that accepts more than one input?

The simplest approach is to take a derivative with respect to one input variable at a time, holding the others fixed. This is equivalent to selecting a cross-section of a surface and then, since a cross-section is a function of a single input, differentiating it using the regular derivative.

First, isolate a cross-section. The animation below isolates a $y$ cross-section of the same surface visualized in Section 8.2. The selected cross-section holds $x = -0.75$ while allowing $y$ to vary.

Cross-section of a surface corresponding to $x = -0.75$.

Second, take a derivative of the cross-section. The animation below isolates the derivative at $y = 0.25$ using the limiting definition of a derivative:

$$\frac{d}{ds} g(s) = \lim_{\Delta s \rightarrow 0} \frac{g(s + \frac{1}{2} \Delta s) - g(s - \frac{1}{2} \Delta s)}{\Delta s}$$

In this case, the function $g(s)$ is a $y$ cross-section of $f$ with $x = -0.75$, so $g(s) = f(-0.75,s)$.

A derivative is the limit of the slope of secants.

The slope recovered above is the slope of the surface holding $x$ fixed at $-0.75$ and varying $y$ about $0.25$. We call the slope of a function with respect to only one input variable a partial derivative.
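The limiting definition above can also be explored numerically. The sketch below is only an illustration: the surface from Section 8.2 is not reproduced here, so `f` is a hypothetical stand-in, and the helper names `cross_section` and `secant_slope` are assumptions made for this example. It evaluates the centered secant slope of $g(s) = f(-0.75, s)$ at $s = 0.25$ for shrinking $\Delta s$.

```python
import math

def f(x, y):
    # Hypothetical stand-in surface (NOT the surface from Section 8.2).
    return math.sin(x) * math.cos(y)

def cross_section(s, x_fixed=-0.75):
    # g(s) = f(-0.75, s): hold x fixed and let y vary.
    return f(x_fixed, s)

def secant_slope(g, s, delta):
    # Slope of the secant through s - delta/2 and s + delta/2.
    return (g(s + 0.5 * delta) - g(s - 0.5 * delta)) / delta

# Secant slopes at y = 0.25 for a shrinking step.
secant_slopes = [
    (delta, secant_slope(cross_section, 0.25, delta))
    for delta in (1.0, 0.1, 0.01, 0.001)
]
for delta, slope in secant_slopes:
    print(f"delta = {delta:<6} secant slope = {slope:.6f}")

# For this stand-in surface the partial in y is -sin(x) * sin(y).
exact = -math.sin(-0.75) * math.sin(0.25)
print(f"exact partial in y = {exact:.6f}")
```

As the step shrinks, the secant slopes should settle toward the exact partial derivative of the stand-in surface.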

Run the code cell below to visualize the cross-sections of surfaces and their tangent lines.

from utils_lsg import show_surface_cross_section

show_surface_cross_section()

Examples

Let’s practice. Here are three examples:

  1. $f(x,y) = x + 7 y - 3$. Find $\partial_{x} f(x,y)$ and $\partial_y f(x,y)$.

  2. $f(x,y) = x^2 \times (1 + y^3)$. Find $\partial_{x} f(x,y)$ and $\partial_y f(x,y)$.

  3. $f(x,y) = \log(x - y)$. Find $\partial_{x} f(x,y)$ and $\partial_y f(x,y)$.

Notice that, in each case, the corresponding partial can be found by treating all the other inputs as if they were constants, and taking a regular derivative with respect to the variable of interest. For instance:

$$\partial_z \left( 3 x^2 y z + 5 x z^{-1} \right) = 3 x^2 y - 5 x z^{-2}$$

since:

$$\frac{d}{dz} \left( a z + b z^{-1} \right) = a - b z^{-2}$$

In this case, since we asked for a partial with respect to $z$, we pretended that $x$ and $y$ were constants.
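One way to check hand-derived partials is to compare them with centered finite differences. The sketch below does this for the three practice examples and for the $z$ partial worked above. The helper `partial`, the test points, and the hand-derived formulas in `cases` are assumptions made for this illustration; they are not taken from the course utilities.

```python
import math

def partial(f, point, index, h=1e-5):
    """Central-difference estimate of the partial of f in input `index` at `point`."""
    up, down = list(point), list(point)
    up[index] += h / 2
    down[index] -= h / 2
    return (f(*up) - f(*down)) / h

# Each case: (label, function, {input index: hand-derived partial}, test point).
cases = [
    ("x + 7y - 3",
     lambda x, y: x + 7 * y - 3,
     {0: lambda x, y: 1.0, 1: lambda x, y: 7.0},
     (2.0, 0.5)),
    ("x^2 (1 + y^3)",
     lambda x, y: x**2 * (1 + y**3),
     {0: lambda x, y: 2 * x * (1 + y**3), 1: lambda x, y: 3 * x**2 * y**2},
     (2.0, 0.5)),
    ("log(x - y)",  # test point needs x > y so the logarithm is defined
     lambda x, y: math.log(x - y),
     {0: lambda x, y: 1 / (x - y), 1: lambda x, y: -1 / (x - y)},
     (2.0, 0.5)),
    ("3x^2 y z + 5x/z",  # only the z partial (index 2) is checked
     lambda x, y, z: 3 * x**2 * y * z + 5 * x / z,
     {2: lambda x, y, z: 3 * x**2 * y - 5 * x / z**2},
     (1.0, 2.0, 0.5)),
]

worst_error = 0.0
for label, f, formulas, point in cases:
    for index, formula in formulas.items():
        numeric = partial(f, point, index)
        by_hand = formula(*point)
        worst_error = max(worst_error, abs(numeric - by_hand))
        print(f"{label:>16}  input {index}: numeric = {numeric:.6f}, by hand = {by_hand:.6f}")
```

If a hand-derived formula disagrees noticeably with its numerical estimate, the usual culprit is a variable that was not treated as a constant.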

Partials Depend on All Inputs

Examples 2 and 3 above show that, for a generic surface $f$, the partial derivative $\partial_{x_j} f(x)$ will be a function of all of the input variables, not just the input $x_j$.

For instance:

$$\partial_{x} y \sin(x) = y \cos(x).$$

This makes sense since, for $f(x,y) = y \sin(x)$, the $x$ cross-sections $f(x, 1)$ and $f(x,0)$ do not produce the same curve:

$$f(x,1) = \sin(x), \quad f(x,0) = 0.$$
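The dependence on the held-fixed input can be seen numerically. The following minimal sketch, with a hypothetical helper `partial_x`, estimates the slope of $y \sin(x)$ in $x$ at $x = 0$ along three different cross-sections.

```python
import math

def f(x, y):
    return y * math.sin(x)

def partial_x(f, x, y, h=1e-5):
    # Centered difference in x with y held fixed.
    return (f(x + h / 2, y) - f(x - h / 2, y)) / h

x = 0.0
slopes = {y: partial_x(f, x, y) for y in (0.0, 1.0, 2.0)}
for y, slope in slopes.items():
    print(f"y = {y}: numerical slope = {slope:.6f}, y * cos(x) = {y * math.cos(x):.6f}")
```

The same $x$ gives a different slope on each cross-section, which matches $\partial_x y \sin(x) = y \cos(x)$.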

The animation below continues the examples illustrated in the animations above. This time, we add a second $y$ cross-section by fixing a new $x$ value. The two cross-sections are different curves, so they have different slopes for different input values of $y$. As a result, the partial derivative $\partial_{y} f(x,y)$ depends on both $x$ and $y$.

Partial derivatives of $f$ with respect to $y$ along two different cross-sections.

Linearization

In Section 6.1 we saw that it is possible to approximate smooth functions with polynomial functions (e.g. linear, quadratic, or cubic functions) whose coefficients are derived by differentiating a function about a point where it is easy to evaluate. The same idea extends to functions of multiple variables.

Suppose that we know the value of $f(x,y)$ at some $x_*$ and some $y_*$, and that we also know its partials, $\partial_{x} f(x,y)$ and $\partial_{y} f(x,y)$, at $x_*$ and $y_*$. Then, for $x,y$ close to $x_*, y_*$:

$$f(x,y) \simeq f(x_*,y_*) + \partial_{x} f(x_*,y_*) \times (x - x_*) + \partial_{y} f(x_*,y_*) \times (y - y_*).$$

Writing $x = x_* + \Delta x$ and $y = y_* + \Delta y$:

$$f(x_* + \Delta x,y_* + \Delta y) \simeq f(x_*,y_*) + \partial_{x} f(x_*,y_*) \times \Delta x + \partial_{y} f(x_*,y_*) \times \Delta y.$$

Compare this result to the linear approximation introduced in Section 6.1. The formula above amounts to applying the linear correction to $f(x,y)$ separately in $x$, then in $y$.

For example, given $f(x,y) = x^2 + x e^{-y}$,

$$\partial_x f(x,y) = 2 x + e^{-y}$$

and

$$\partial_y f(x,y) = -x e^{-y}.$$

So:

$$\begin{aligned} f(0.1,0.2) = 0.092 & \approx f(0,0) + \partial_x f(0,0) \times 0.1 + \partial_y f(0,0) \times 0.2 \\ & = 0 + (2 \times 0 + e^{0}) \times 0.1 + (-0 \times e^{0}) \times 0.2 \\ & = 1 \times 0.1 = 0.1. \end{aligned}$$
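The arithmetic above can be reproduced in a few lines. This is a minimal sketch; the function names `df_dx`, `df_dy`, and `linearize` are chosen here for illustration only.

```python
import math

def f(x, y):
    return x**2 + x * math.exp(-y)

def df_dx(x, y):
    # Partial in x: treat y as a constant.
    return 2 * x + math.exp(-y)

def df_dy(x, y):
    # Partial in y: treat x as a constant.
    return -x * math.exp(-y)

def linearize(x, y, x0=0.0, y0=0.0):
    # f(x0, y0) plus one linear correction per input.
    return f(x0, y0) + df_dx(x0, y0) * (x - x0) + df_dy(x0, y0) * (y - y0)

exact = f(0.1, 0.2)
approx = linearize(0.1, 0.2)
print(f"exact  = {exact:.4f}")                # 0.0919
print(f"approx = {approx:.4f}")               # 0.1000
print(f"error  = {abs(exact - approx):.4f}")  # 0.0081
```

The $y$ correction vanishes here because $\partial_y f(0,0) = 0$, so the whole approximation comes from the step in $x$.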

The surface:

$$\tilde{f}_1(x,y) = f(x_*,y_*) + \partial_{x} f(x_*,y_*) \times (x - x_*) + \partial_{y} f(x_*,y_*) \times (y - y_*)$$

defines a plane since it takes the form:

$$\tilde{f}_1(x,y) = a + b x + c y$$

for $a = f(x_*,y_*) - \partial_{x} f(x_*,y_*) x_* - \partial_{y} f(x_*,y_*) y_*$, $b = \partial_{x} f(x_*,y_*)$, and $c = \partial_{y} f(x_*,y_*)$. This should not be surprising. A plane is the two-dimensional equivalent of a line, since it is a surface that is a linear function of both inputs.

The plane defined by a linear approximation to a surface about $x_*,y_*$ is tangent to the surface at $x_*,y_*$, just like the linear approximation to a function of a single variable is tangent to the function. It has the same slope as the surface where it intersects the surface. Accordingly, we call the plane formed by a linear approximation a tangent plane.
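To make the plane coefficients concrete, the sketch below computes $a$, $b$, and $c$ for the example surface $f(x,y) = x^2 + x e^{-y}$. The expansion point $(1.0, 0.5)$ and the helper `tangent_plane_coefficients` are arbitrary choices made for this illustration.

```python
import math

def f(x, y):
    return x**2 + x * math.exp(-y)

def df_dx(x, y):
    return 2 * x + math.exp(-y)

def df_dy(x, y):
    return -x * math.exp(-y)

def tangent_plane_coefficients(x0, y0):
    # Slopes of the plane are the partials at the expansion point.
    b = df_dx(x0, y0)
    c = df_dy(x0, y0)
    # The intercept is chosen so the plane passes through (x0, y0, f(x0, y0)).
    a = f(x0, y0) - b * x0 - c * y0
    return a, b, c

x0, y0 = 1.0, 0.5  # arbitrary expansion point
a, b, c = tangent_plane_coefficients(x0, y0)

def plane(x, y):
    return a + b * x + c * y

# The plane touches the surface at the expansion point and stays close nearby.
gap_at_point = abs(plane(x0, y0) - f(x0, y0))
gap_nearby = abs(plane(x0 + 0.01, y0 - 0.01) - f(x0 + 0.01, y0 - 0.01))
print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}")
print(f"gap at the point = {gap_at_point:.2e}")
print(f"gap nearby       = {gap_nearby:.2e}")
```

The gap at the expansion point should be zero up to rounding, while the gap nearby should be small but nonzero, since the plane ignores the curvature of the surface.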

We can write the formula for a tangent plane more concisely using an inner product:

$$f(x_* + \Delta x,y_* + \Delta y) \simeq f(x_*,y_*) + [\partial_{x} f(x_*,y_*), \partial_{y} f(x_*,y_*)] \cdot [\Delta x, \Delta y].$$

This form is convenient because it collects like terms into vectors. In particular, it groups the partial derivatives into a single vector. The next section is all about this vector of partial derivatives. It is important because it encodes all of the information about the slope of the surface at $x_*,y_*$ needed to build a tangent plane to the surface.
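The inner-product form translates directly into code. The sketch below stores the partials in a list and the step in another list, then combines them with a dot product. The names `partials`, `dot`, and `linear_approx` are hypothetical and chosen for this example.

```python
import math

def f(x, y):
    return x**2 + x * math.exp(-y)

def partials(x, y):
    # Vector of hand-derived partial derivatives [d_x f, d_y f].
    return [2 * x + math.exp(-y), -x * math.exp(-y)]

def dot(u, v):
    # Inner product of two equal-length lists.
    return sum(ui * vi for ui, vi in zip(u, v))

def linear_approx(point, step):
    # f(point) + [partials at point] . [step]
    return f(*point) + dot(partials(*point), step)

point = [0.0, 0.0]
step = [0.1, 0.2]
approx = linear_approx(point, step)
exact = f(point[0] + step[0], point[1] + step[1])
print(f"approx = {approx:.4f}, exact = {exact:.4f}")
```

Because the partials and the step are both plain lists, the same `dot` and `linear_approx` pattern would carry over to functions with more than two inputs without changing the formula.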