3.3 Local Properties

So far we’ve discussed two methods for visualizing a function: check its characteristics (Section 3.1), then check its composition (Section 3.2). The first narrows down the ways in which the function can behave. The second breaks it into simpler parts.

Both of those strategies are global. They provide information about $f(x)$ at all possible inputs $x$.

This section is about local strategies. The simplest local strategy is the oldest. Just pick a bunch of inputs, $\{x_1, x_2, x_3, \ldots\}$, plug them each in to find $\{f(x_1), f(x_2), f(x_3), \ldots\}$, plot each, then trace a curve through your plot. This is the strategy you used in HW 2 to guess how the mode of the binomial depended on $n$ and $p$.

That strategy always works, but it is slow and laborious. It’s also not very insightful. You usually won’t learn much about the function by doing it, so if we change the problem setup slightly, say by varying a free parameter, then you’ll have to do all of your work over again.
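
As a rough illustration of the brute-force strategy, here is a minimal Python sketch that finds the mode of a binomial by evaluating the PMF at every possible input. The function names and the particular values of $n$ and $p$ are assumptions chosen for illustration; they are not taken from HW 2.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Binomial PMF: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def brute_force_mode(n, p):
    """Evaluate the PMF at every x in {0, ..., n} and return the x with the largest value."""
    values = [binomial_pmf(x, n, p) for x in range(n + 1)]
    return max(range(n + 1), key=lambda x: values[x])

# One full sweep of the PMF per (n, p) pair. Change either parameter and the
# whole sweep has to be repeated, which is the weakness of this strategy.
modes = {(n, p): brute_force_mode(n, p) for n in (10, 20, 40) for p in (0.1, 0.25, 0.5)}
for (n, p), mode in modes.items():
    print(f"n={n:2d}, p={p:.2f}: mode at x={mode}")
```

The sweep gives an answer for each parameter pair, but no explanation of why the mode moves the way it does.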

That said, evaluating a function at some specific input points is an essential strategy. The key is to pick those points wisely. Try to pick as few points as are sufficient to describe the function. In essence, select a set of reference points, learn what you can about $f$ near those references, then constrain your plot using what you’ve learned.

Choosing Reference Points

Make a list of reference points. In rough order of work to evaluate, check:

  1. Any inputs where $f$ is easy to evaluate (often, $x = 0$ and $x = \pm 1$).

    • Be greedy. Always do the easiest thing first.

  2. The smallest and largest possible inputs. If necessary, take limits.

    • If $X$ is a random variable with support equal to some interval $[a,b]$, then you should always include $a$ and $b$ in your list of reference points.

    • Some of these limits are easy. For example:

      • If you want to plot the CDF of $X$, and $X \in [a,b]$, then $\text{CDF}(x) = 0$ for all $x < a$ and $\text{CDF}(x) = 1$ for all $x > b$.

      • If you want to plot a PMF or a PDF, and $X$ is unbounded in some direction (e.g. $X$ can be arbitrarily large), then your PMF/PDF must converge to zero as $x$ diverges. Otherwise, it couldn’t add/integrate to one, so it would not be normalized. This is why, whenever we’ve drawn a distribution for a random variable that is unbounded above, the distribution tailed off to zero for sufficiently large inputs.

  3. An axis of symmetry if it exists.

    • For instance, if you’re given a density $\text{PDF}(x) \propto g(x)$ where $g(x) = (1 + \tfrac{1}{4}(x + 3)^2)^{-5/2}$, notice that this is a composition of functions, $x^{-5/2}$ and $1 + \tfrac{1}{4}(x + 3)^2$. The inner function is a linear transformation (vertical translation, scale, horizontal translation) of our friend, $x^2$, so must have some symmetry. The function $x^2$ is even about zero. So, if we shift it left by 3, then the shifted function must be even about $x_* = -3$.

  4. Any roots (locations where $f(x) = 0$) or poles (locations where $f(x)$ diverges).

    • Sometimes roots are obvious. For instance, the function $g(x) = x^{3.1}(1 - x)^{1.2}$ will have roots at $x = 0$ and $x = 1$. Roots are obvious when our functions are factored for us.

    • Sometimes roots are very hard to recover. For instance, try to find the roots of $g(x) = 3x^4 + 2x^3 - 10x - 6$. Not so easy. Add terms in $x^5$ and $x^6$ and, in general, no formula built from arithmetic and radicals can find the roots.

    • So, be strategic. Only look for roots if they jump out at you from the page.

    • To find poles, look for roots in the denominator. For instance, the function $g(x) = (1 + x^2)/(1 - x^2)$ will have poles at $x = \pm 1$ since the denominator equals zero there.

    • So, apply the same logic about roots to poles. Only look for poles if they are obvious. It is worth putting in a bit more elbow-grease when looking for poles, since adding a vertical asymptote to a plot is more important for organizing the plot than adding a root.

    • Be careful with poles. Remember that, if the numerator and denominator are both zero at some $x$, then $g(x)$ may be zero, infinity, or some other finite number. In this case, you have to take a limit. We’ll review l’Hôpital’s rule in our chapter on asymptotics and tail behavior.

  5. Any points of discontinuity or nondifferentiability.

    • These are usually easy to spot. Look for a piecewise definition, or an absolute value in the function definition.

  6. All maxima and minima of the function.

    • These are the most important reference points for drawing distribution functions accurately. Always look for maxima and minima.

    • There are a couple of strategies for finding maxima and minima. If your function is smooth (continuous and differentiable), then you should check for places where the derivative equals zero.

    • Once you find the roots of the derivative, evaluate its sign between the roots. This creates a partition of the number line into intervals where your function is increasing or decreasing. Anywhere a function switches from increasing to decreasing is a maximum. Anywhere that it switches from decreasing to increasing is a minimum.

    • The same arguments work for discrete functions, like PMF’s. We have to do a bit more work since a PMF is a bar chart, and doesn’t have a well defined slope. To see an example, look ahead to “Optimization”.

  7. Any inflection points of the function.

    • Check for places where the second derivative changes sign. Follow the same procedure we used for slope. First set $\frac{d^2}{dx^2} f(x) = 0$, then check the sign of the second derivative in between its roots.

    • Inflection points of the CDF correspond to maxima and minima of the PDF, since the second derivative of the CDF is the first derivative of the PDF.

    • Inflection points are the least important features on this list unless you are drawing a CDF. They are also often the most work to find. So, save them for last. See if you have enough information to draw your function well before finding its inflection points.

That’s a long list, but you rarely need all of it. The more you practice, the faster you’ll get at recognizing functions, and the fewer reference points you’ll need. Most functions don’t have all the references on this list, so you can usually skip a bunch with little effort. For many distributions it is enough to check the largest and smallest possible inputs, check for symmetry, and identify maxima and minima.
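
To make the short version of the checklist concrete, here is a minimal Python sketch that applies it to the unnormalized density from the symmetry example, $g(x) = (1 + \tfrac{1}{4}(x + 3)^2)^{-5/2}$. The offsets and the stand-ins for "very large" inputs are arbitrary choices made for illustration.

```python
def g(x):
    """Unnormalized density from the symmetry example: (1 + (x + 3)^2 / 4)^(-5/2)."""
    return (1 + 0.25 * (x + 3) ** 2) ** (-2.5)

x_star = -3.0  # candidate axis of symmetry

# Even symmetry about x_star: g(x_star + t) should equal g(x_star - t) for every offset t.
offsets = [0.5, 1.0, 2.0, 5.0]
is_even = all(abs(g(x_star + t) - g(x_star - t)) < 1e-12 for t in offsets)

# Reference values: the axis itself, plus far-out inputs standing in for the limits.
reference_points = [-1e3, x_star, 1e3]
for x in reference_points:
    print(f"g({x:g}) = {g(x):.3g}")
print("even about x_* = -3:", is_even)
```

Three evaluations and one symmetry check already pin down the shape: a single bump centered at $x_* = -3$ with tails that fall to zero on both sides.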

Evaluating $f$ Locally

Then:

  1. Evaluate $f$ at each reference.

    • Add a dot on your plot at each $(x, f(x))$ pair.

  2. Evaluate the slope of $f$ at each reference.

    • Add a small tangent line to your plot at $(x, f(x))$ with slope $\frac{d}{dx} f(x)$. Make sure your drawing is tangent to the tangent line (has the correct slope).

  3. Evaluate the slope of $f$ at some arbitrary point between each reference.

    • Make sure your function is increasing and decreasing on the correct intervals.

    • I usually draw a quick number line reference, mark the roots of the derivative, and add a plus sign on the intervals where $f$ is increasing, and a minus where it is decreasing.

    • If you’ve already determined which of the roots of $\frac{d}{dx} f(x)$ are maxima and minima, then you can fill in the intervals without evaluating the slope again.

  4. Evaluate the sign of the second derivative of $f$ at each reference.

    • This can be slow. So, check whether you know the sign of the second derivative before computing a derivative.

Steps (1) and (2) are essential. If you’ve done (1) and (2) you usually get (3) for free. Step (4) is slow. It helps for accurate plots, especially of CDFs, but is the least important of the four. Do it last and only when necessary.
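
Steps (1) through (3) can be sketched numerically. The Python below uses the earlier example $x^{3.1}(1 - x)^{1.2}$ on $[0, 1]$ and estimates slopes with a central difference. The helper names and the step size are assumptions made for illustration, and the finite difference stands in for a derivative you would normally compute by hand.

```python
def f(x):
    """Example from the roots discussion: x^3.1 * (1 - x)^1.2 on [0, 1]."""
    return x**3.1 * (1 - x) ** 1.2

def slope(func, x, step=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Setting the derivative of log f to zero gives 3.1 / x = 1.2 / (1 - x), so x = 3.1 / 4.3.
x_peak = 3.1 / 4.3
references = [0.0, x_peak, 1.0]

# Steps 1 and 2: a dot and a tangent slope at the interior reference.
print(f"f({x_peak:.3f}) = {f(x_peak):.4f}, slope = {slope(f, x_peak):.1e}")

# Step 3: sign of the slope at one test point between each pair of references.
sign_chart = []
for left, right in zip(references, references[1:]):
    midpoint = 0.5 * (left + right)
    sign_chart.append("+" if slope(f, midpoint) > 0 else "-")
print(sign_chart)
```

The sign chart is the number line described in step (3): one symbol per interval between references.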

Optimization Refresher

Let’s remind ourselves how to find maxima and minima of a function, $f(x)$.

We’ll go through four techniques. The first two are the easiest. Always do them first. They mostly require looking at your function, and identifying its structure. The last two are the slowest, but they are failsafe. They are largely mechanical. Do them last as back-ups in case the first two fail.

Symmetry

As usual, start by looking for symmetries.

  1. If $f(x)$ is even about an axis of symmetry, $x_*$, then $x_*$ must be a maximum or a minimum of the function.

    • If this isn’t clear to you, try to draw a counterexample. You’ll see that, if you draw $f(x)$ increasing as $x$ leaves $x_*$ to the right, then, by symmetry, $f(x)$ must also increase as $x$ leaves $x_*$ to the left.

    • Picture $x^2$. You come into zero descending, then, by symmetry, leave zero ascending.

    • The same is true for concave functions. If, as $x$ approaches $x_*$ from below, $f(x)$ is increasing, then by symmetry $f(x)$ must be decreasing as $x$ moves away from $x_*$ to the right.

    • The only counterexample is the constant function.

    • It follows that if $f$ is differentiable, then $\frac{d}{dx} f(x_*) = 0$ at any even axis of symmetry. So, when you add an even axis of symmetry, you never need to evaluate the derivative there. Just add a horizontal tangent.

  2. If $f(x)$ is odd about an axis of symmetry, $x_*$, then $x_*$ cannot be a maximum or a minimum.

    • If this isn’t clear to you, try to draw a counterexample. You’ll find that, if $f(x)$ is increasing as $x$ approaches $x_*$ from below, then $f(x)$ is also increasing as $x$ leaves $x_*$ to the right.

    • Picture $x^3$.
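
The horizontal-tangent claim for an even axis can be checked in a couple of lines. This is a sketch, assuming $f$ is differentiable, writing $f'$ for $\frac{d}{dx} f$, and introducing $t$ (a symbol not used above) for the offset from the axis. Differentiating the symmetry condition with respect to $t$ and then setting $t = 0$ gives:

$$
f(x_* + t) = f(x_* - t)
\;\Longrightarrow\;
f'(x_* + t) = -f'(x_* - t)
\;\Longrightarrow\;
f'(x_*) = -f'(x_*)
\;\Longrightarrow\;
f'(x_*) = 0.
$$

Reading "odd about $x_*$" as $f(x_* + t) - f(x_*) = -\left(f(x_* - t) - f(x_*)\right)$, the same differentiation gives $f'(x_* + t) = f'(x_* - t)$. The slope is the same on both sides of the axis, so the function cannot turn around there.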

Monotone Compositions

Suppose that $f(x) = g(h(x))$ for some outer function $g$ and some inner function $h$. Then, suppose that $g$ is a monotonic function. For instance, we’ll see lots of examples where $g$ is either $e^{x}$, $e^{-x}$, or $x^{-\alpha}$ for some $\alpha > 0$.

Suppose that $g(x)$ is monotonically increasing or non-decreasing. Then, $h_* \geq h$ implies $g(h_*) \geq g(h)$. So, if $h(x_*) \geq h(x)$, then $g(h(x_*)) \geq g(h(x))$. So, to maximize $g(h(x))$ it is enough to maximize $h(x)$.

The same logic works in reverse for monotonically decreasing, or non-increasing, functions. If $g$ is monotonically decreasing or non-increasing, then making $h$ smaller makes $g(h)$ larger. So, to maximize $g(h(x))$, minimize $h(x)$.

This strategy works whenever $h$ is easier to optimize than $f$. It is especially useful for distributions that are defined using an exponential. For example, we will see lots of distributions that look like:

$$f(x) = e^{h(x)}$$

for some $h(x)$. In this case, if you’re asked to optimize $f$, just optimize $h$.
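
Both directions of the monotone-composition argument can be checked on a grid. In this minimal Python sketch, the two inner functions and the exponent $5/2$ are hypothetical choices made for illustration; only the outer function types ($e^{x}$ and $x^{-\alpha}$) come from the discussion above.

```python
import math

def h_up(x):
    """Hypothetical inner function (an assumption): a downward parabola peaking at x = 1."""
    return -((x - 1.0) ** 2)

def h_down(x):
    """Hypothetical inner function (an assumption): a positive upward parabola, smallest at x = 1."""
    return 1.0 + (x - 1.0) ** 2

def f_increasing_outer(x):
    """Increasing outer function e^u applied to h_up."""
    return math.exp(h_up(x))

def f_decreasing_outer(x):
    """Decreasing outer function u^(-5/2) applied to h_down."""
    return h_down(x) ** (-2.5)

grid = [i / 100 for i in range(-200, 401)]  # x from -2 to 4 in steps of 0.01

# Increasing outer function: the maximizer of f is the maximizer of h.
argmax_inner = max(grid, key=h_up)
argmax_f = max(grid, key=f_increasing_outer)

# Decreasing outer function: the maximizer of f is the minimizer of h.
argmin_inner = min(grid, key=h_down)
argmax_composite = max(grid, key=f_decreasing_outer)

print(argmax_inner, argmax_f, argmin_inner, argmax_composite)
```

In both cases the outer function never needs to be touched: optimizing the parabola is enough.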

Some functions don’t look like $e^{h(x)}$ at first glance, but are still easier to optimize if we rewrite them this way. In particular, if your function involves a product of positive terms, then, since exponentials exchange products and sums, we can write:

$$f(x) = a(x) \times b(x) = e^{\log(a(x))} \times e^{\log(b(x))} = e^{\log(a(x)) + \log(b(x))}$$

Now, optimize the argument of the exponential, which is the log of $f$:

$$\log(f(x)) = \log(a(x)) + \log(b(x))$$

This trick is important for random variables generated by processes that involve a bunch of “and” statements. Those “and” statements produce products, so the corresponding distributions are often expressed as a product of terms. For instance, the binomial PMF is a product of three terms:

$$\text{PMF}(x) = \binom{n}{x} p^x (1 - p)^{n-x}$$

and the geometric PMF is a product of two:

$$\text{PMF}(x) = (1 - p)^{x - 1} p.$$

In both cases, if we want to find the choice of $p$ that maximizes the PMF at a given $x$, we should start by taking a logarithm.
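
Here is a minimal Python sketch of the log trick for these two PMFs. Taking logs turns each product into a sum. Setting the derivative in $p$ to zero gives $x/p - (n - x)/(1 - p) = 0$, so $p = x/n$ for the binomial, and $1/p - (x - 1)/(1 - p) = 0$, so $p = 1/x$ for the geometric. The code compares those answers against a grid search. The grid resolution and the values of $n$ and $x$ are assumptions chosen for illustration.

```python
from math import comb, log

def binomial_log_pmf(p, x, n):
    """log C(n, x) + x log p + (n - x) log(1 - p): a sum instead of a product."""
    return log(comb(n, x)) + x * log(p) + (n - x) * log(1 - p)

def geometric_log_pmf(p, x):
    """(x - 1) log(1 - p) + log p, for x = 1, 2, 3, ..."""
    return (x - 1) * log(1 - p) + log(p)

# Candidate p values strictly inside (0, 1), where both logs are defined.
p_grid = [i / 1000 for i in range(1, 1000)]

n, x = 20, 7  # illustrative values, not from the text
best_p_binomial = max(p_grid, key=lambda p: binomial_log_pmf(p, x, n))

x_geo = 4  # illustrative value, not from the text
best_p_geometric = max(p_grid, key=lambda p: geometric_log_pmf(p, x_geo))

print(best_p_binomial, x / n)        # grid maximizer vs. x / n
print(best_p_geometric, 1 / x_geo)   # grid maximizer vs. 1 / x
```

Because the logarithm is monotonically increasing, the maximizer of the log PMF is also the maximizer of the PMF itself.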

By Direction

In general, $x_*$ is a maximizer if $f(x)$ is increasing to the left of $x_*$ and decreasing to the right. It is a minimizer if $f(x)$ is decreasing to the left and increasing to the right. So, $x_*$ is a critical point (maximizer or minimizer) if $f$ changes direction at $x_*$.

Differentiable Functions

If $f(x)$ is smooth, then we can use its derivative to check whether it is increasing or decreasing. If the derivative is positive then $f$ is increasing. If the derivative is negative, then it is decreasing. So, $f$ has critical points where the derivative changes sign.

If $f(x)$ is continuously differentiable (that is, its slope is continuous), then its slope cannot change sign without crossing zero. So, to find critical points, we rely on the old calculus trick:

  1. Take the derivative

  2. Set it to zero.

Then, evaluate the sign of the derivative on either side of its roots to classify critical points as maxima, minima, or neither.
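
The classification step can be sketched as a small Python helper. The function name, the offset, and the two cubic examples are assumptions made for illustration. The helper presumes you have already found the roots of the derivative by hand.

```python
def classify(derivative, root, offset=1e-3):
    """Classify a root of the derivative by the sign of the slope just left and right of it.

    Assumes no other root of the derivative lies within `offset` of this one.
    """
    left, right = derivative(root - offset), derivative(root + offset)
    if left > 0 and right < 0:
        return "maximum"   # increasing, then decreasing
    if left < 0 and right > 0:
        return "minimum"   # decreasing, then increasing
    return "neither"       # no change of direction

# Hypothetical example: f(x) = x^3 - 3x has derivative 3x^2 - 3, with roots at -1 and 1.
def cubic_slope(x):
    return 3 * x**2 - 3

# f(x) = x^3 has derivative 3x^2, whose only root (x = 0) is not a turning point.
def pure_cubic_slope(x):
    return 3 * x**2

results = {
    "x^3 - 3x at -1": classify(cubic_slope, -1.0),
    "x^3 - 3x at +1": classify(cubic_slope, 1.0),
    "x^3 at 0": classify(pure_cubic_slope, 0.0),
}
for label, kind in results.items():
    print(f"{label}: {kind}")
```

The third case is the reason for the "or neither" above: the derivative of $x^3$ touches zero without changing sign.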

Non-differentiable Functions

What if $f(x)$ is not a smooth function of $x$? Then it doesn’t have a well defined slope, so our friendly “set derivative to zero” method doesn’t apply.

We can still find maxima and minima by looking for locations where $f(x)$ switches from increasing to decreasing, or from decreasing to increasing. Here’s an example.