Linear Algebra

Local Linear Approximation (Multivariate)

The single-variable local linear approximation $f(a+h) \approx f(a) + f'(a)h$ generalizes directly to multiple variables. Near a point $\mathbf{a}$, a differentiable function $f: \mathbb{R}^n \to \mathbb{R}$ is approximated by an affine function of the displacement $\mathbf{h}$:

$$f(\mathbf{a} + \mathbf{h}) \approx f(\mathbf{a}) + \frac{\partial f}{\partial x_1}(\mathbf{a})\, h_1 + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{a})\, h_n$$

This can be written compactly as $f(\mathbf{a} + \mathbf{h}) \approx f(\mathbf{a}) + Df(\mathbf{a})\mathbf{h}$, where $Df(\mathbf{a}) = \begin{bmatrix} \partial f/\partial x_1(\mathbf{a}) & \cdots & \partial f/\partial x_n(\mathbf{a}) \end{bmatrix}$ is the $1 \times n$ Jacobian matrix (the row vector of partials). The map $\mathbf{h} \mapsto Df(\mathbf{a})\mathbf{h}$ is the best linear approximation to the change $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})$ near $\mathbf{a}$: the error shrinks faster than $\|\mathbf{h}\|$ as $\mathbf{h} \to \mathbf{0}$.
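
A small numerical check can make the "best linear approximation" claim concrete. The sketch below is only an illustration: the sample function `f`, the base point, and the direction are assumptions chosen for the demo, not part of the lesson, and the partials are worked out by hand. It shrinks the displacement along a fixed unit direction and records the approximation error divided by the length of the displacement.

```python
import math

def f(x, y):
    # Sample function chosen for illustration (not from the text).
    return math.sin(x) * y + x**2

def Df(x, y):
    # Row vector of partials [df/dx, df/dy], computed by hand for this f.
    return (math.cos(x) * y + 2 * x, math.sin(x))

def linear_approx(a, h):
    # f(a) + Df(a) h, with the product Df(a) h written out as a sum.
    partials = Df(*a)
    return f(*a) + sum(p * h_i for p, h_i in zip(partials, h))

a = (1.0, 2.0)
direction = (0.6, -0.8)  # unit vector, so the length of h equals t below

ratios = []
for t in (1e-1, 1e-2, 1e-3):
    h = tuple(t * d for d in direction)
    exact = f(a[0] + h[0], a[1] + h[1])
    error = abs(exact - linear_approx(a, h))
    ratios.append(error / t)
    print(f"||h|| = {t:g}   error = {error:.3e}   error / ||h|| = {error / t:.3e}")
```

If the approximation behaves as described, the printed ratio should keep shrinking as the displacement gets shorter, which is what separates the Jacobian from any other choice of row vector.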

Geometrically, for $n=2$: the graph $z = f(x,y)$ is a surface in $\mathbb{R}^3$, and the local linear approximation gives the tangent plane at the point $(a_1, a_2, f(\mathbf{a}))$. The equation of the tangent plane is $z = f(\mathbf{a}) + f_x(\mathbf{a})(x-a_1) + f_y(\mathbf{a})(y-a_2)$.

Formal View

Definition 11.7 — Local Linear Approximation (Multivariate)
The local linear approximation (LLA) of $f: D \subseteq \mathbb{R}^n \to \mathbb{R}$ at $\mathbf{a}$ is the affine function
$$L_{\mathbf{a}}(\mathbf{x}) = f(\mathbf{a}) + Df(\mathbf{a})(\mathbf{x}-\mathbf{a})$$
where $Df(\mathbf{a}) = \begin{bmatrix} \partial_1 f(\mathbf{a}) & \cdots & \partial_n f(\mathbf{a})\end{bmatrix}$ is the Jacobian row vector.
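
The definition translates almost line by line into code. The following is a minimal sketch rather than a reference implementation: the helper names `numerical_partials` and `make_lla`, the step size `eps`, and the test function `g` are all hypothetical, and the partials are estimated with central differences as a stand-in for exact derivatives.

```python
def numerical_partials(f, a, eps=1e-6):
    """Estimate the row vector Df(a) with central differences.

    Assumption: exact partials are not available, so each one is
    approximated by (f(a + eps*e_i) - f(a - eps*e_i)) / (2*eps).
    """
    partials = []
    for i in range(len(a)):
        forward = list(a)
        backward = list(a)
        forward[i] += eps
        backward[i] -= eps
        partials.append((f(forward) - f(backward)) / (2 * eps))
    return partials

def make_lla(f, a):
    """Return the affine function L_a(x) = f(a) + Df(a)(x - a)."""
    f_a = f(a)
    Df_a = numerical_partials(f, a)

    def L(x):
        # Constant term plus the row vector applied to the displacement x - a.
        return f_a + sum(p * (x_i - a_i) for p, x_i, a_i in zip(Df_a, x, a))

    return L

def g(x):
    # Hypothetical function of three variables, used only to exercise the helper.
    return x[0] * x[1] + x[2] ** 2

L = make_lla(g, [1.0, 2.0, 3.0])
print(L([1.0, 2.0, 3.0]), g([1.0, 2.0, 3.0]))  # the two agree at the base point
print(L([1.1, 2.0, 3.1]), g([1.1, 2.0, 3.1]))  # and stay close nearby
```

Returning a closure keeps the base point fixed, mirroring how the subscript on the LLA records the point of approximation.
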
Example 11.2 — Tangent Plane
For $f(x,y) = x^2 + y^2$ at $\mathbf{a} = (1,1)$: $f(\mathbf{a})=2$, $f_x(\mathbf{a})=2$, $f_y(\mathbf{a})=2$. The tangent plane is $z = 2 + 2(x-1) + 2(y-1) = 2x + 2y - 2$.
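
The example can be checked numerically. This short sketch hard-codes the paraboloid and the plane from the example; the sample points and the helper name `gap` are arbitrary choices made for illustration.

```python
def f(x, y):
    return x**2 + y**2

def tangent_plane(x, y):
    # z = 2 + 2(x - 1) + 2(y - 1), the plane found in the example above.
    return 2 + 2 * (x - 1) + 2 * (y - 1)

def gap(x, y):
    # Vertical distance between the surface and its tangent plane.
    return f(x, y) - tangent_plane(x, y)

for x, y in [(1.0, 1.0), (1.1, 0.9), (1.5, 1.5), (3.0, 3.0)]:
    print(f"({x}, {y}): f = {f(x, y):.4f}  plane = {tangent_plane(x, y):.4f}  gap = {gap(x, y):.4f}")
```

For this particular function the gap works out algebraically to (x - 1)^2 + (y - 1)^2, so it is zero at the point of tangency and grows with the squared distance from it.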

Why This Matters

The local linear approximation underlies first-order optimization methods and sensitivity analysis:

  • Gradient descent: moving in the direction that decreases the LLA most rapidly
  • Error propagation: estimating how small input errors translate to output errors
  • Newton's method in multiple variables: using the LLA at the current iterate to take a linearized step
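
As an illustration of the error-propagation item above, the sketch below uses the volume of a cylinder. The function, the measured values, and the error sizes are all assumptions made up for the demo. The first-order estimate of the output error is the Jacobian row vector applied to the vector of input errors.

```python
import math

def volume(r, h):
    # Volume of a cylinder with radius r and height h.
    return math.pi * r**2 * h

def volume_partials(r, h):
    # Partials of the volume with respect to r and h, computed by hand.
    return (2 * math.pi * r * h, math.pi * r**2)

r, h = 2.0, 5.0        # hypothetical measured values
dr, dh = 0.01, -0.02   # hypothetical small measurement errors

dV_dr, dV_dh = volume_partials(r, h)

# First-order estimate of the output error: Df(a) applied to (dr, dh).
estimated_change = dV_dr * dr + dV_dh * dh

# Exact change, for comparison.
actual_change = volume(r + dr, h + dh) - volume(r, h)

# Worst-case first-order bound when only the error magnitudes are known.
worst_case = abs(dV_dr) * abs(dr) + abs(dV_dh) * abs(dh)

print(f"estimated change:  {estimated_change:.5f}")
print(f"actual change:     {actual_change:.5f}")
print(f"worst-case bound:  {worst_case:.5f}")
```

The estimate and the exact change should agree closely here because the input errors are small relative to the measured values; with larger errors the neglected second-order terms start to matter.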

Quiz

Question 1

For $f(x,y) = e^{x+y}$ at the point $(0,0)$, the local linear approximation $L_{(0,0)}(x,y)$ is:

Question 2

The local linear approximation of $f$ at $\mathbf{a}$ equals $f(\mathbf{a})$ plus a dot product of the gradient and the displacement vector.

Common Mistakes

  • Forgetting the constant term $f(\mathbf{a})$: the LLA is an affine function, not a linear one.
  • Treating the LLA as a global approximation when it is only accurate near $\mathbf{a}$.
  • Confusing the LLA (a real-valued affine function) with the Jacobian (the row vector of partials that supplies its linear part).