Chapter Summary - Data 89 Course Notes

Interactive Tools:

Gradient Visualizer - Use this tool to visualize the gradient vector fields associated with a surface.
Gradient Ascent Visualizer - Use this tool to visualize gradient ascent on the same surfaces provided in the Gradient Visualizer.
Least Squares Visualizer - Use this tool to visualize the mean square error objective function used in least squares regression.

All definitions and results are available in Section 9.1.

A partial derivative of a surface $f(x_1,x_2,...,x_d)$ is a derivative with respect to one of the variables, holding all the rest fixed.
- For example: $\partial_x f(x,y) = \frac{d}{dt} f(x + t, y)|_{t = 0}$ .
- The partial derivative of $f(x_1,x_2,...,x_d)$ with respect to $x_j$ is the slope of the tangent line to its $x_j$ cross-section
- To compute partial derivatives, pretend every variable except the variable in the subscript is a constant, then take an ordinary derivative with respect to the variable in the subscript.
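
The "hold everything else fixed" recipe can be checked numerically. The sketch below is only an illustration: the surface `f`, the helper names `partial_x` and `partial_y`, and the step size `h` are assumptions made for this example, not part of the course materials. It approximates each partial derivative with a central difference and compares it to the value obtained by hand.

```python
def f(x, y):
    # Hypothetical example surface, chosen only for illustration.
    return x**2 * y + 3.0 * y


def partial_x(f, x, y, h=1e-6):
    # Central difference in x: nudge x, hold y fixed.
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def partial_y(f, x, y, h=1e-6):
    # Central difference in y: nudge y, hold x fixed.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


# By hand: treating y as a constant gives d/dx = 2*x*y,
# and treating x as a constant gives d/dy = x**2 + 3.
px = partial_x(f, 1.0, 2.0)
py = partial_y(f, 1.0, 2.0)
exact_px = 2.0 * 1.0 * 2.0
exact_py = 1.0**2 + 3.0
print(px, exact_px)
print(py, exact_py)
```

The numerical and hand-computed values should agree up to small rounding error.
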
To linearize a surface $f(x_1,x_2,...,x_d)$ about some input vector $x_*$ , compute:
$f(x) \simeq \tilde{f}_1(x) = f(x_*) + \sum_{j=1}^d \partial_{x_j} f(x_*) (x_j - {x_*}_j)$
(1)
- The function $\tilde{f}_1(x)$ defines the tangent plane to the surface $f$ at the input $x_*$ . It is the plane containing the tangent lines to every cross section of the surface at $x_*$ .
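
As a rough illustration of equation (1), the sketch below builds the linearization of a small example surface and compares it to the true surface near and far from $x_*$. The surface `f`, its hand-computed partial derivatives `partials_f`, and the helper `linearize` are hypothetical names introduced for this example.

```python
def f(x):
    # Hypothetical surface f(x1, x2) = x1^2 + x1 * x2.
    return x[0] ** 2 + x[0] * x[1]


def partials_f(x):
    # Partial derivatives of the surface above, computed by hand.
    return [2.0 * x[0] + x[1], x[0]]


def linearize(f, partials_f, x_star):
    # Build f(x_*) + sum_j d_j f(x_*) * (x_j - x_*j), as in equation (1).
    f0 = f(x_star)
    slopes = partials_f(x_star)

    def f_tilde(x):
        return f0 + sum(s * (xj - sj) for s, xj, sj in zip(slopes, x, x_star))

    return f_tilde


x_star = [1.0, 2.0]
f_tilde = linearize(f, partials_f, x_star)

near = [1.01, 2.02]
far = [2.0, 4.0]
err_near = abs(f(near) - f_tilde(near))
err_far = abs(f(far) - f_tilde(far))
print(err_near, err_far)
```

The tangent plane matches the surface exactly at $x_*$, stays close for nearby inputs, and drifts away for inputs far from $x_*$.
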

All definitions and results are available in Section 9.2.

The gradient of a surface $f$ at an input vector $x$ is the vector of all the partial derivatives of $f$ at $x$ :
$\nabla f(x) = [\partial_{x_1} f(x), \partial_{x_2} f(x), ..., \partial_{x_d} f(x)].$
(2)
- A gradient is a vector-valued function of $x$
- To compute a gradient, compute all of the partial derivatives
- The gradient:
  1. points in the direction of steepest ascent (the negative gradient points in the direction of steepest descent)
  2. has magnitude equal to the fastest possible rate of ascent on the surface
  3. is perpendicular to the level sets of $f$ where it is not zero
- So, given a contour plot, we can draw the gradients by adding arrows, perpendicular to the level sets, pointing uphill
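
The three gradient properties listed above can be probed numerically. The following is a minimal sketch under stated assumptions: the surface `f`, its hand-computed gradient `grad_f`, the helper `slope_along`, the test point, and the grid of 3600 directions are all choices made for this illustration. It scans unit directions, finds the one with the steepest slope, and measures the slope in the direction perpendicular to the gradient.

```python
import math


def f(x, y):
    # Hypothetical surface whose level sets are ellipses.
    return x**2 + 2.0 * y**2


def grad_f(x, y):
    # Gradient of the surface above, computed by hand.
    return (2.0 * x, 4.0 * y)


def slope_along(f, x, y, angle, h=1e-5):
    # Numerical slope of f at (x, y) along the unit vector at this angle.
    ux, uy = math.cos(angle), math.sin(angle)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2.0 * h)


x0, y0 = 1.0, 1.0
gx, gy = grad_f(x0, y0)
grad_norm = math.hypot(gx, gy)
grad_angle = math.atan2(gy, gx)

# Scan a grid of unit directions and keep the steepest one.
angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
best_angle = max(angles, key=lambda a: slope_along(f, x0, y0, a))
best_slope = slope_along(f, x0, y0, best_angle)

# Slope along the direction perpendicular to the gradient,
# which is tangent to the level set through (x0, y0).
level_slope = slope_along(f, x0, y0, grad_angle + math.pi / 2.0)

print(best_angle, grad_angle)
print(best_slope, grad_norm)
print(level_slope)
```

Up to the resolution of the direction grid, the steepest direction lines up with the gradient, the steepest slope matches the gradient's magnitude, and the slope along the level set is essentially zero.
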
The tangent plane at $x_*$ can be expressed concisely using the gradient:
$f(x) \simeq \tilde{f}_1(x) = f(x_*) + \nabla f(x_*) \cdot (x - x_*).$
(3)
The directional derivative of $f$ , at $x$ , in the direction $v$ , is the slope of the surface along a path, passing through $x$ , in the direction $v$ :
$\partial_v f(x) = \frac{d}{dt} f(x + \hat{v} t)|_{t = 0}$
(4)
where $\hat{v} = v/\|v\|$ is the unit vector pointing in the direction $v$ .
- To compute a directional derivative, use the formula:
$\partial_v f(x) = \nabla f(x) \cdot \hat{v} = \|\nabla f(x) \| \cos(\theta)$
(5)
where $\theta$ is the angle between $v$ and $\nabla f(x)$ .
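
To see that the two forms in equation (5) agree with the definition in equation (4), the sketch below computes a directional derivative three ways. The surface `f`, its gradient `grad_f`, the helper `unit`, the point, and the direction `v` are hypothetical choices for this example.

```python
import math


def f(x, y):
    # Hypothetical example surface.
    return x**2 * y


def grad_f(x, y):
    # Gradient of the surface above, computed by hand.
    return (2.0 * x * y, x**2)


def unit(v):
    # Rescale v to length one.
    norm_v = math.hypot(v[0], v[1])
    return (v[0] / norm_v, v[1] / norm_v)


x0, y0 = 1.0, 2.0
v = (3.0, 4.0)
v_hat = unit(v)
g = grad_f(x0, y0)

# Form 1: dot product of the gradient with the unit direction.
dd_dot = g[0] * v_hat[0] + g[1] * v_hat[1]

# Form 2: gradient magnitude times the cosine of the angle between v and the gradient.
theta = math.atan2(v[1], v[0]) - math.atan2(g[1], g[0])
dd_cos = math.hypot(g[0], g[1]) * math.cos(theta)

# Definition: slope of t -> f(x + v_hat * t) at t = 0, via a central difference.
h = 1e-6
dd_limit = (
    f(x0 + h * v_hat[0], y0 + h * v_hat[1])
    - f(x0 - h * v_hat[0], y0 - h * v_hat[1])
) / (2.0 * h)

print(dd_dot, dd_cos, dd_limit)
```

All three values should agree up to rounding error. Note that only the direction of `v` matters, since it is normalized before use.
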

All definitions and results are available in Section 9.3.

An unconstrained optimization problem asks for the inputs that maximize (or minimize) a function.
- Typically, we maximize iteratively, by repeatedly moving uphill in the direction of the gradient. This approach is called gradient ascent. To minimize, move downhill along the negative gradient instead.
If $f$ is a smooth surface, and our problem is unconstrained, then:
- $x_*$ is not an extremum (maximum or minimum) unless $\nabla f(x_*) = 0$ .
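
Gradient ascent and the zero-gradient condition can be illustrated together. The sketch below is one simple variant, not necessarily the one used by the Gradient Ascent Visualizer: the concave surface `f`, the fixed `step_size`, the number of steps, and the starting point are all assumptions made for this example.

```python
def f(x, y):
    # Hypothetical concave surface with its maximum at (1, -2).
    return -((x - 1.0) ** 2) - 2.0 * (y + 2.0) ** 2


def grad_f(x, y):
    # Gradient of the surface above, computed by hand.
    return (-2.0 * (x - 1.0), -4.0 * (y + 2.0))


def gradient_ascent(grad_f, start, step_size=0.1, n_steps=200):
    # Repeatedly take a small step in the direction of the gradient.
    # The fixed step size is an assumption; it must be small enough to converge.
    x, y = start
    for _ in range(n_steps):
        gx, gy = grad_f(x, y)
        x += step_size * gx
        y += step_size * gy
    return x, y


x_max, y_max = gradient_ascent(grad_f, (0.0, 0.0))
gx, gy = grad_f(x_max, y_max)
print(x_max, y_max)
print(gx, gy)
```

On this surface the iterates approach the maximizer, and the gradient there is approximately zero, as the condition above requires. A zero gradient is necessary but not sufficient for an extremum, so in general a point found this way still needs to be checked.
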