9.4 Chapter Summary

Interactive Tools:

  1. Gradient Visualizer - Use this tool to visualize the gradient vector fields associated with a surface.

  2. Gradient Ascent Visualizer - Use this tool to visualize gradient ascent on the same surfaces provided in the Gradient Visualizer.

  3. Least Squares Visualizer - Use this tool to visualize the mean square error objective function used in least squares regression.

Partial Derivatives

All definitions and results are available in Section 9.1.

  1. A partial derivative of a surface $f(x_1,x_2,...,x_d)$ is a derivative with respect to one of the variables, holding all the rest fixed.

    • For example: $\partial_x f(x,y) = \frac{d}{dt} f(x + t, y)|_{t = 0}$.

    • The partial derivative of $f(x_1,x_2,...,x_d)$ with respect to $x_j$ is the slope of the tangent line to its $x_j$ cross-section.

    • To compute partial derivatives, pretend every variable except the variable in the subscript is a constant, then take an ordinary derivative with respect to the variable in the subscript.

  2. To linearize a surface $f(x_1,x_2,...,x_d)$ about some input vector $x_*$, compute:

    $$f(x) \simeq \tilde{f}_1(x) = f(x_*) + \sum_{j=1}^d \partial_{x_j} f(x_*) (x_j - {x_*}_j)$$
    • The function $\tilde{f}_1(x)$ defines the tangent plane to the surface $f$ at the input $x_*$. It is the plane containing the tangent lines to every cross-section of the surface at $x_*$.
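
The two recipes above (partial derivatives and linearization) can be checked numerically. The following is a minimal sketch, not a method taken from Section 9.1: the surface `f`, the helper names `partial` and `linearize`, and the step size `h` are all assumptions chosen for illustration. It estimates each partial derivative with a central difference, varying one coordinate while holding the rest fixed, then assembles the tangent-plane approximation $\tilde{f}_1$.

```python
import numpy as np

def f(x):
    # Hypothetical example surface: f(x1, x2) = x1^2 * x2 + 3 * x2
    return x[0] ** 2 * x[1] + 3.0 * x[1]

def partial(f, x, j, h=1e-6):
    # Central-difference estimate of the partial derivative with respect to x_j:
    # perturb only coordinate j and hold every other coordinate fixed.
    e = np.zeros_like(x, dtype=float)
    e[j] = h
    return (f(x + e) - f(x - e)) / (2.0 * h)

def linearize(f, x_star):
    # Tangent-plane approximation about x_star:
    # f(x_star) + sum_j partial_j f(x_star) * (x_j - x_star_j)
    x_star = np.asarray(x_star, dtype=float)
    slopes = np.array([partial(f, x_star, j) for j in range(x_star.size)])
    f_star = f(x_star)
    return lambda x: f_star + slopes @ (np.asarray(x, dtype=float) - x_star)

x_star = np.array([1.0, 2.0])
dx1 = partial(f, x_star, 0)   # analytic value: 2 * x1 * x2 = 4
dx2 = partial(f, x_star, 1)   # analytic value: x1^2 + 3 = 4
f_tilde = linearize(f, x_star)

print(dx1, dx2)
print(f_tilde([1.1, 2.1]), f(np.array([1.1, 2.1])))
```

Near $x_*$ the tangent plane and the surface should agree closely, and the gap should grow as the input moves away from $x_*$.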

Gradients

All definitions and results are available in Section 9.2.

  1. The gradient of a surface $f$ at an input vector $x$ is the vector of all the partial derivatives of $f$ at $x$:

    $$\nabla f(x) = [\partial_{x_1} f(x), \partial_{x_2} f(x), ..., \partial_{x_d} f(x)].$$
    • A gradient is a vector-valued function of $x$.

    • To compute a gradient, just compute all of the partial derivatives.

    • The gradient:

      1. points in the direction of steepest ascent (the negative gradient points in the direction of steepest descent)

      2. has magnitude equal to the fastest possible rate of ascent on the surface

      3. is perpendicular to the level sets of $f$ where it is not zero

    • So, given a contour plot, we can draw the gradients by adding arrows, perpendicular to the level sets, pointing uphill

  2. The tangent plane at $x_*$ can be expressed concisely using the gradient:

    $$f(x) \simeq \tilde{f}_1(x) = f(x_*) + \nabla f(x_*) \cdot (x - x_*).$$
  3. The directional derivative of $f$, at $x$, in the direction $v$, is the slope of the surface along a path, passing through $x$, in the direction $v$:

    $$\partial_v f(x) = \frac{d}{dt} f(x + \hat{v} t)|_{t = 0}$$

    where $\hat{v} = v/\|v\|$ is the unit vector pointing in the direction $v$.

    • To compute a directional derivative, use the formula:

    $$\partial_v f(x) = \nabla f(x) \cdot \hat{v} = \|\nabla f(x) \| \cos(\theta)$$

    where $\theta$ is the angle between $v$ and $\nabla f(x)$.
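
The gradient properties and the directional-derivative formula above can be illustrated with a short sketch. The surface `f`, its hand-computed gradient `grad_f`, and the helper `directional_derivative` are hypothetical names introduced here, not taken from Section 9.2. The sketch checks that the slope along the gradient equals $\|\nabla f(x)\|$, that the slope along a direction perpendicular to the gradient (tangent to the level set) is zero, and that $\nabla f(x) \cdot \hat{v}$ matches a finite-difference estimate of $\frac{d}{dt} f(x + \hat{v} t)$ at $t = 0$.

```python
import numpy as np

def f(x):
    # Hypothetical example surface: f(x, y) = x^2 + 3 * y^2
    return x[0] ** 2 + 3.0 * x[1] ** 2

def grad_f(x):
    # Gradient computed by hand from the partial derivatives: [2x, 6y]
    return np.array([2.0 * x[0], 6.0 * x[1]])

def directional_derivative(grad, x, v):
    # grad f(x) . v_hat, where v_hat is the unit vector pointing along v
    v = np.asarray(v, dtype=float)
    v_hat = v / np.linalg.norm(v)
    return grad(x) @ v_hat

x = np.array([1.0, 1.0])
g = grad_f(x)  # [2, 6]

# Slope along the gradient direction (steepest ascent)
along_gradient = directional_derivative(grad_f, x, g)
# Slope along a direction perpendicular to the gradient (tangent to the level set)
along_level_set = directional_derivative(grad_f, x, [-g[1], g[0]])

# Finite-difference check of the definition d/dt f(x + v_hat t) at t = 0
v = np.array([3.0, 4.0])
v_hat = v / np.linalg.norm(v)
t = 1e-6
slope_numeric = (f(x + t * v_hat) - f(x - t * v_hat)) / (2.0 * t)
slope_formula = directional_derivative(grad_f, x, v)

print(along_gradient, np.linalg.norm(g))
print(along_level_set)
print(slope_numeric, slope_formula)
```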

Multivariate Optimization (Unconstrained)

All definitions and results are available in Section 9.3.

  1. An unconstrained optimization problem asks for the inputs that maximize (or minimize) a function.

    • Typically, we optimize iteratively, by moving uphill in the direction of the gradient. This approach is called gradient ascent.

  2. If $f$ is a smooth surface, and our problem is unconstrained, then:

    • $x_*$ is not an extremum (maximum or minimum) unless $\nabla f(x_*) = 0$.
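
Both points above can be seen in a small gradient ascent loop. This is a minimal sketch under stated assumptions, not the procedure from Section 9.3: the concave surface `f`, the fixed step size `step`, the stopping tolerance `tol`, and the function name `gradient_ascent` are all illustrative choices. The loop steps uphill along the gradient and stops once the gradient is nearly zero, which is the necessary condition for an extremum.

```python
import numpy as np

def f(x):
    # Hypothetical concave surface with a single maximum at (1, -2)
    return -(x[0] - 1.0) ** 2 - 2.0 * (x[1] + 2.0) ** 2

def grad_f(x):
    # Gradient computed by hand from the partial derivatives
    return np.array([-2.0 * (x[0] - 1.0), -4.0 * (x[1] + 2.0)])

def gradient_ascent(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    # Repeatedly move uphill in the direction of the gradient.
    # Stop when the gradient norm falls below tol (assumed stopping rule).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x + step * g
    return x

x_max = gradient_ascent(grad_f, x0=[0.0, 0.0])

print(x_max)
print(np.linalg.norm(grad_f(x_max)))
```

A fixed step size works for this surface, but it is not guaranteed to converge in general: too large a step can overshoot the maximum. To minimize instead, step along the negative gradient.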