Quasi-rigorous Calculus of Variations

Let $\mathscr{F}$ be some (potentially nonlinear) functional $\mathscr{F} : L^2([0, 1]) \to \mathbb{R}$ taking functions $\rho \in L^2([0, 1])$. Suppose we have some $\varepsilon \in \mathbb{R}$ and $\delta x \in L^2([0, 1])$. Then we can create the (linear?!) functional $\mathscr{D}_\rho : L^2([0, 1]) \to \mathbb{R}$ defined by

$$\mathscr{D}_{\rho}(\delta x) = \lim_{\varepsilon \to 0} \frac{\mathscr{F}(\rho + \varepsilon \delta x) - \mathscr{F}(\rho)}{\varepsilon}.$$

Why is this linear? Assume the necessary quantities exist to define the following (this appears to be equivalent to our functional $\mathscr{F}$ being Gâteaux differentiable at $\rho$). Homogeneity in $\delta x$ follows from rescaling $\varepsilon$, so the real question is additivity, which would require

$$0 = \mathscr{D}_\rho(\delta x_1 + \delta x_2) - \mathscr{D}_\rho(\delta x_1) - \mathscr{D}_\rho(\delta x_2) = \lim_{\varepsilon \to 0} \frac{\mathscr{F}(\rho + \varepsilon(\delta x_1 + \delta x_2)) - \mathscr{F}(\rho + \varepsilon \delta x_1) - \mathscr{F}(\rho + \varepsilon \delta x_2) + \mathscr{F}(\rho)}{\varepsilon}.$$

This gives us a feel for the regularity that we need $\mathscr{F}$ to have. One weird thing about the Gâteaux derivative is that it is so general that it can fail to even be linear. Thus, the better requirement is to assume Fréchet differentiability of $\mathscr{F}$ at $\rho$. The Fréchet derivative is usually defined on normed spaces; in the most general setting, related notions of differentiability take $\mathscr{F}$ to be a functional on a locally convex, metrizable topological vector space which is complete (e.g. w.r.t. the metric), that is, a Fréchet space. A few notes on this: the spaces $\mathrm{C}^k(M)$ for smooth manifolds $M$ work under a countable family of seminorms, e.g. for $M = \mathbb{R}$ those given by $\| f \|_{k, n} = \sup \{ |f^{(k)}(x)| \mid x \in [-n, n] \}$, and this holds even for $k = \infty$.
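To see concretely how a Gâteaux-style directional derivative can exist in every direction and still fail to be linear, here is a minimal sketch. It uses a finite-dimensional stand-in on $\mathbb{R}^2$ rather than $L^2([0, 1])$; the function `f` and the helper `directional_derivative` are illustrative choices, not anything from the setup above.

```python
def f(x, y):
    """Homogeneous of degree one, but not linear: x^3 / (x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 / (x**2 + y**2)


def directional_derivative(func, direction, eps=1e-6):
    """Forward difference quotient (func(eps * v) - func(0)) / eps at the origin."""
    vx, vy = direction
    return (func(eps * vx, eps * vy) - func(0.0, 0.0)) / eps


d1 = directional_derivative(f, (1.0, 0.0))   # direction e1
d2 = directional_derivative(f, (0.0, 1.0))   # direction e2
d12 = directional_derivative(f, (1.0, 1.0))  # direction e1 + e2

# For a linear derivative this defect would vanish.
additivity_defect = d12 - d1 - d2
print(d1, d2, d12, additivity_defect)
```

In direction $(a, b)$ the limit is $a^3 / (a^2 + b^2)$, which scales correctly under $(a, b) \mapsto (ca, cb)$ but is not additive, so the additivity condition above genuinely is an extra requirement.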

For the purposes of most of the systems that I want to work with, we can assume that $\mathscr{F}$ takes the form $\mathscr{F}(\rho) = \int F(\rho(x)) \,\mathrm{d}\mu(x)$ for some $F \in \mathrm{C}^1(\mathbb{R})$, where $\mu$ is the underlying measure on $[0, 1]$ (Lebesgue measure here, so $\mathrm{d}\mu(x) = \mathrm{d}x$). Let us check that this is Fréchet differentiable. That is, for $h \in L^2([0, 1])$,

$$0 = \lim_{\varepsilon \to 0} \frac{\left|\mathscr{F}(\rho + \varepsilon h) - \mathscr{F}(\rho) - \mathscr{F}'_{\rho}(\varepsilon h)\right|}{\|\varepsilon h\|_2}.$$

We can make a pretty good guess that $\mathscr{F}'_\rho(h) = \int F'(\rho(x)) h(x) \,\mathrm{d}x$ (interesting note: since $F'$ only ever appears under an integral, I think we merely need $F$ to be weakly differentiable, but this should be verified). We then find

$$\lim_{\varepsilon \to 0} \frac{\left|\mathscr{F}(\rho + \varepsilon h) - \mathscr{F}(\rho) - \mathscr{F}'_{\rho}(\varepsilon h)\right|}{\|\varepsilon h\|_2} = \lim_{\varepsilon \to 0} \frac{\left| \int F(\rho(x) + \varepsilon h(x)) - F(\rho(x)) - \varepsilon F'(\rho(x)) h(x) \,\mathrm{d}x \right|}{\|\varepsilon h\|_2}.$$

Assume that $F'$ is Lipschitz, so that $F''$ exists almost everywhere and $|F''|$ can be essentially bounded by some $C$. Taylor's theorem with integral remainder then states that, for any fixed $x$,

$$F(\rho(x) + \varepsilon h(x)) = F(\rho(x)) + \varepsilon h(x) F'(\rho(x)) + R(x), \qquad R(x) = \int_{0}^{\varepsilon h(x)} F''(\rho(x) + t)\,(\varepsilon h(x) - t) \,\mathrm{d}t.$$

With the bound on $|F''|$ we have

$$|R(x)| \leq \left| \int_{0}^{\varepsilon h(x)} \left|F''(\rho(x) + t)\right| \left|\varepsilon h(x) - t\right| \,\mathrm{d}t \right| \leq C \left| \int_{0}^{\varepsilon h(x)} \left|\varepsilon h(x) - t\right| \,\mathrm{d}t \right| = \frac{C}{2}(\varepsilon h(x))^2.$$

This means

$$
\begin{aligned}
&\lim_{\varepsilon \to 0} \frac{\left| \int F(\rho(x) + \varepsilon h(x)) - F(\rho(x)) - \varepsilon F'(\rho(x)) h(x) \,\mathrm{d}x \right|}{\|\varepsilon h\|_2} \\
&\quad= \lim_{\varepsilon \to 0} \frac{\left| \int F(\rho(x)) + \varepsilon F'(\rho(x)) h(x) + R(x) - F(\rho(x)) - \varepsilon F'(\rho(x)) h(x) \,\mathrm{d}x \right|}{\|\varepsilon h\|_2} \\
&\quad= \lim_{\varepsilon \to 0} \frac{\left| \int R(x) \,\mathrm{d}x \right|}{\|\varepsilon h\|_2} \\
&\quad\leq \lim_{\varepsilon \to 0} \frac{\int \left| R(x) \right| \,\mathrm{d}x}{\|\varepsilon h\|_2} \\
&\quad\leq \lim_{\varepsilon \to 0} \frac{\frac{1}{2} C \varepsilon^2 \int h(x)^2 \,\mathrm{d}x}{\|\varepsilon h\|_2} \\
&\quad= \lim_{\varepsilon \to 0} \frac{\frac{1}{2} C \|\varepsilon h\|_2^2}{\|\varepsilon h\|_2} \\
&\quad= \lim_{\varepsilon \to 0} \frac{1}{2} C \|\varepsilon h\|_2 \\
&\quad= 0.
\end{aligned}
$$

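The final bound depends on $\varepsilon h$ only through its norm $\|\varepsilon h\|_2$, so the convergence is uniform over directions $h$, which is what Fréchet differentiability asks for. This demonstrates that, indeed, $\mathscr{F}'_\rho(h) = \int F'(\rho(x)) h(x) \,\mathrm{d}x$.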
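As a sanity check on the estimate, the remainder ratio can be evaluated numerically. The sketch below is only an illustration under assumed choices that are not part of the argument above: $F = \sin$ (so $|F''| \leq 1$ and $C = 1$), the base point $\rho(x) = x^2$, the direction $h(x) = \cos(3x)$, and a midpoint rule on $[0, 1]$.

```python
import math

N = 2000                                  # number of midpoint cells on [0, 1]
dx = 1.0 / N
xs = [(i + 0.5) * dx for i in range(N)]

F = math.sin                              # assumed integrand; |F''| <= 1, so C = 1
dF = math.cos                             # its derivative F'

rho = [x * x for x in xs]                 # hypothetical base point rho(x) = x^2
h = [math.cos(3.0 * x) for x in xs]       # hypothetical direction h(x) = cos(3x)


def functional(u):
    """Midpoint-rule approximation of the integral of F(u(x)) over [0, 1]."""
    return sum(F(ui) for ui in u) * dx


def derivative(u, v):
    """Midpoint-rule approximation of the integral of F'(u(x)) v(x) over [0, 1]."""
    return sum(dF(ui) * vi for ui, vi in zip(u, v)) * dx


def l2_norm(v):
    """Discrete L^2([0, 1]) norm."""
    return math.sqrt(sum(vi * vi for vi in v) * dx)


def remainder_ratio(eps):
    """|F(rho + eps h) - F(rho) - F'_rho(eps h)| / ||eps h||_2 on the grid."""
    shifted = [r + eps * hi for r, hi in zip(rho, h)]
    num = abs(functional(shifted) - functional(rho) - eps * derivative(rho, h))
    return num / (eps * l2_norm(h))


ratios = {eps: remainder_ratio(eps) for eps in (1e-1, 1e-2, 1e-3)}
for eps, ratio in ratios.items():
    # Second column: observed ratio; third column: the bound (C / 2) * ||eps h||_2.
    print(eps, ratio, 0.5 * eps * l2_norm(h))
```

The observed ratio should sit below the bound and shrink roughly in proportion to $\varepsilon$, matching the last lines of the derivation.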

However!! In this process we've determined that, in fact, $\mathscr{D}_\rho(h) = \mathscr{F}'_\rho(h) = \int F'(\rho(x)) h(x) \,\mathrm{d}x$.

Now! Here comes the fun part. This linear functional $\mathscr{D}_\rho$ can be represented by integration against the (signed) measure $F'(\rho(x)) \,\mathrm{d}\mu$. This functional can be used to calculate the directional functional derivative in some "direction" $h(x)$. However, it is extremely common (especially in numerical applications) to wish to access a quantity $\frac{\delta \mathscr{F}}{\delta \rho}$ which can be evaluated pointwise. Here's the really clever thing, and the point of why I started writing this little article in the first place: for any Fréchet-differentiable functional $\mathscr{G}$, we can do this process to find some measure $\nu$ against which an $h$ can be integrated to calculate a directional functional derivative. If we want to find $\frac{\delta \mathscr{G}}{\delta \rho}$ then we merely need to calculate the Radon-Nikodym derivative $\frac{\mathrm{d}\nu}{\mathrm{d}\mu}$ (we omit for the moment the considerations that determine whether $\nu \ll \mu$).

Let us examine this concretely for our example $\mathscr{F}$. The Radon-Nikodym derivative is some integrable function $\frac{\mathrm{d}\nu}{\mathrm{d}\mu}$ which satisfies $\nu(A) = \int_A \frac{\mathrm{d}\nu}{\mathrm{d}\mu} \,\mathrm{d}\mu$ for all measurable sets $A$. Well, in the case of our $\mathscr{F}$ we are calculating the R-N derivative with respect to the underlying measure $\mu$ corresponding to $L^2([0, 1])$, and so we get

$$\frac{\delta \mathscr{F}}{\delta \rho}(\rho_0; x) = \frac{\mathrm{d}\nu_{\rho_0}}{\mathrm{d}\mu}(x) = F'(\rho_0(x)).$$
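The Radon-Nikodym step has a simple discrete analogue, which may be why the pointwise quantity is so convenient numerically. On a grid, perturbing a single nodal value of $\rho_0$ measures $\nu$ of that cell, and dividing by the cell's $\mu$-weight recovers the pointwise derivative. The sketch below assumes $F = \sin$, $\rho_0(x) = x^2$, a midpoint grid, and central differences; all of these (and the helper name `nu_of_cell`) are illustrative choices rather than part of the construction above.

```python
import math

N = 200                                    # number of midpoint cells on [0, 1]
dx = 1.0 / N                               # weight that mu assigns to each cell
xs = [(i + 0.5) * dx for i in range(N)]

F = math.sin                               # assumed integrand
dF = math.cos                              # its derivative F'

rho0 = [x * x for x in xs]                 # hypothetical base point rho_0(x) = x^2


def functional(u):
    """Midpoint-rule approximation of the integral of F(u(x)) over [0, 1]."""
    return sum(F(ui) for ui in u) * dx


def nu_of_cell(u, i, step=1e-4):
    """Central difference of the functional in the i-th nodal value.

    This approximates the directional derivative at u in the direction of the
    indicator function of cell i, i.e. the nu-measure of that cell.
    """
    up = list(u)
    up[i] += step
    down = list(u)
    down[i] -= step
    return (functional(up) - functional(down)) / (2.0 * step)


# Discrete Radon-Nikodym derivative: nu(cell i) / mu(cell i).
func_derivative = [nu_of_cell(rho0, i) / dx for i in range(N)]

# Compare against the closed form F'(rho_0(x)) at each grid point.
max_error = max(abs(g - dF(r)) for g, r in zip(func_derivative, rho0))
print(max_error)
```

This is also the usual recipe when a functional is only available as code: take the ordinary gradient with respect to the nodal values and divide by the quadrature weights.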

Next up: product and chain rules!
