Your time integrator is a filter

An interactive companion to my paper:

Time integration as filtering: a space-time discretization-aware LES formulation

Syver Døving Agdestein · arXiv preprint · 2026

Papers are optimized for archiving ideas, not for absorbing them. This post is the same idea as the paper, but with the knobs exposed: you can drag the filter width yourself instead of trusting a lemma. If you only remember three things:

A coarse finite difference is exactly the derivative of a filtered field — and forward Euler is exactly the derivative of a time-filtered field. Same trick, new axis.
At practical CFL numbers, the time-integration error is the largest single term in the coarse-simulation residual — bigger than the subgrid stress everyone builds closure models for.
One extra closure term, proportional to the time step τ, fixes it. It falls out of the analysis and it is Lax–Wendroff diffusion.

A finite difference is a filter

Large-eddy simulation (LES) starts from an uncomfortable fact: the grid you can afford cannot represent every eddy, so you simulate a filtered (locally averaged) flow and model what the filter removed. The classical theory filters the continuous equations first and discretizes afterwards, quietly assuming the discretization is exact.

The discretization-aware view (the "discretize first, filter next" line of work from my earlier papers) starts from an identity instead. Take the top-hat filter of width $h$, $\bar{u}^{h}(x) = \frac{1}{h} \int_{x - h/2}^{x + h/2} u(ξ) \, dξ$. Then

$$\partial_{x}^{h} u(x) = \frac{u(x + h/2) - u(x - h/2)}{h} = \partial_{x} \bar{u}^{h}(x).$$

Read that carefully: the coarse finite difference is not an approximation of $\partial_{x} u$. It is exactly the derivative of the filtered field $\bar{u}^{h}$. No truncation error, no big-O. The finite difference and the filter are the same object. I call this the filter-swap property. It means the coarse grid does not commit an error to be minimized; it defines, exactly, which smoothed field you are simulating.
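For readers who want the one-line reason behind the identity, here is a short derivation. It assumes only that $u$ is continuous, so that the fundamental theorem of calculus applies; the antiderivative $U$ is an auxiliary symbol introduced just for this sketch.

$$
\begin{aligned}
\bar{u}^{h}(x) &= \frac{1}{h} \int_{x - h/2}^{x + h/2} u(ξ) \, dξ = \frac{U(x + h/2) - U(x - h/2)}{h}, \qquad U' = u, \\
\partial_{x} \bar{u}^{h}(x) &= \frac{U'(x + h/2) - U'(x - h/2)}{h} = \frac{u(x + h/2) - u(x - h/2)}{h} = \partial_{x}^{h} u(x).
\end{aligned}
$$

Nothing in these two lines requires $h$ to be small, which is why the identity carries no truncation error.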

The twist: forward Euler is also a filter

Everything above is standard in the spatial direction. The paper's observation is that the time axis has been left out of this game. A forward-Euler step computes the slope of the chord from $t$ to $t + τ$, and that chord slope satisfies the same identity, with a one-sided top-hat filter $\bar{u}^{τ}(t) = \frac{1}{τ} \int_{t}^{t + τ} u(s) \, ds$:

$$\partial_{t}^{τ} u(t) = \frac{u(t + τ) - u(t)}{τ} = \partial_{t} \bar{u}^{τ}(t).$$

Don't take the equation's word for it — drag the sliders. The chord on the wiggly signal and the tangent of the smooth blue average always have the same slope, no matter how large you make τ:

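[Interactive figure: the signal u, its one-sided average ū^τ, the forward-Euler chord on u and the tangent of ū^τ, with sliders for the filter width τ and the evaluation time t. At τ = 0.60 and t = 1.60, the chord slope on u and the tangent slope of ū^τ are both 1.139; the two are equal for every τ and every t.]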

The window is one-sided (it looks forward from $t$) because forward Euler is explicit: causality shows up as filter asymmetry. So a coarse time step is not "a small error in time integration". It is a second filter, applied on top of the spatial one. Your simulation is doubly filtered whether you admit it or not.
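The same check can be run offline. The sketch below is not the widget's code: the test signal, the Simpson quadrature and the step `eps` are illustrative assumptions. It computes the one-sided average numerically, differentiates it with a small centered difference, and compares the result with the forward-Euler chord slope on the raw signal.

```python
import math

def u(t):
    """A wiggly test signal (an arbitrary choice for this sketch)."""
    return math.sin(t) + 0.3 * math.sin(7.0 * t)

def filtered(t, tau, n=2000):
    """One-sided top-hat average of u over [t, t + tau] (composite Simpson, n even)."""
    ds = tau / n
    total = u(t) + u(t + tau)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * u(t + i * ds)
    return total * ds / 3.0 / tau

def chord_slope(t, tau):
    """Forward-Euler difference quotient of the raw signal."""
    return (u(t + tau) - u(t)) / tau

def tangent_slope(t, tau, eps=1e-4):
    """Derivative of the filtered signal, by a small centered difference."""
    return (filtered(t + eps, tau) - filtered(t - eps, tau)) / (2.0 * eps)

t = 1.6
for tau in (0.1, 0.6, 2.0):
    print(f"tau={tau}: chord={chord_slope(t, tau):.6f}, "
          f"tangent={tangent_slope(t, tau):.6f}")
```

The two columns agree up to the error of the numerical derivative, including at τ = 2.0, where the chord is nowhere near the local tangent of the raw signal.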

An exact space-time coarse equation

Apply both filters to a conservation law $\partial_{t} u + \partial_{x} f(u) = 0$ and both filter-swaps go through, giving an equation that holds exactly on the coarse space-time grid:

$$\partial_{t}^{τ} \bar{u}^{h} + \partial_{x}^{h} \overline{f(u)}^{τ} = 0,$$

where $\partial_{t}^{τ}$ and $\partial_{x}^{h}$ are the forward-Euler difference and the coarse finite difference. This is not an asymptotic statement: the companion code verifies it on the Burgers DNS below, and the largest residual over the whole ensemble is $3 \times 10^{-15}$, which is machine precision. The catch, as always in LES: the flux $\overline{f(u)}^{τ}$ depends on the unfiltered $u$, which the coarse simulation doesn't have. Rewriting it as a computable flux plus a residual splits the modelling burden into four terms:

$$r^{h, τ} = \underbrace{r_{\mathrm{LES}}^{h} + r_{\mathrm{num}}^{h} + r_{\mathrm{div}}^{h}}_{\text{spatial}} + \underbrace{r_{\mathrm{time}}^{τ}}_{\text{temporal}}$$

r_LES: the commutator

Filtering and the nonlinearity don't commute. This is the classical subgrid stress, the term the entire closure-modelling literature is about.

r_num: the numerical flux

The coarse numerical flux is not the filtered exact flux. This term depends on your scheme (central, upwind, …).

r_div: the divergence correction

This term comes from pushing the coarse difference through the decomposition. It is sizeable in magnitude but nearly dissipation-free: its fitted eddy-viscosity coefficient is ≈ 0.

r_time: the time quadrature (new)

The exact coarse equation averages the flux over the step, but forward Euler evaluates it at the start of the step. This quadrature error is the temporal residual, and it grows with the CFL number.

The three spatial terms are exactly the decomposition from my JCP paper on exact unresolved stresses. The fourth is the new one. A Taylor expansion pins down its leading behaviour:

$$r_{\mathrm{time}}^{τ}(u) = \frac{τ}{2} \partial_{t} f(u) + O(τ^{2}) = -\frac{τ}{2} f'(u)^{2} \partial_{x} u + O(τ^{2}).$$

If that last expression looks familiar: it is the Lax–Wendroff diffusion, the term that has been stabilizing forward-Euler convection since 1960. The filtering formalism doesn't just permit it — it derives it, as the leading part of a well-defined residual.
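Both claims of this section can be checked by hand on a problem with a closed-form solution. The sketch below makes several assumptions that are not from the paper: it swaps Burgers for linear advection, $f(u) = Cu$, with the exact solution $u = \sin(x - Ct)$, so that both filters can be integrated analytically, and it takes $r_{\mathrm{time}}^{τ}$ to be the time-averaged flux minus the start-of-step flux, following the verbal description above. All function names are hypothetical.

```python
import math

C = 1.0  # advection speed: f(u) = C * u, so f'(u) = C (assumed test problem)

def u(x, t):
    """Exact solution of u_t + C u_x = 0 with a sine initial condition."""
    return math.sin(x - C * t)

def u_bar_h(x, t, h):
    """Top-hat space filter of width h, integrated in closed form."""
    return (math.cos(x - h / 2 - C * t) - math.cos(x + h / 2 - C * t)) / h

def f_bar_tau(x, t, tau):
    """One-sided time average of the flux C * u over [t, t + tau], closed form."""
    return (math.cos(x - C * (t + tau)) - math.cos(x - C * t)) / tau

def coarse_residual(x, t, h, tau):
    """Left-hand side of the exact coarse equation; should vanish to round-off."""
    dt_u = (u_bar_h(x, t + tau, h) - u_bar_h(x, t, h)) / tau
    dx_f = (f_bar_tau(x + h / 2, t, tau) - f_bar_tau(x - h / 2, t, tau)) / h
    return dt_u + dx_f

def r_time(x, t, tau):
    """Time-averaged flux minus the start-of-step flux (assumed definition)."""
    return f_bar_tau(x, t, tau) - C * u(x, t)

def r_time_leading(x, t, tau):
    """Leading-order prediction -(tau / 2) * f'(u)^2 * du/dx."""
    return -0.5 * tau * C**2 * math.cos(x - C * t)

x, t, h = 1.2, 0.2, 0.5
for tau in (0.4, 0.2, 0.1):
    res = coarse_residual(x, t, h, tau)
    err = abs(r_time(x, t, tau) - r_time_leading(x, t, tau))
    print(f"tau={tau}: coarse-equation residual={res:.1e}, Taylor remainder={err:.1e}")
```

The first column stays at round-off level even though $h$ and $τ$ are not small, while the second shrinks roughly fourfold each time τ is halved, as expected of an $O(τ^{2})$ remainder.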

Who actually dominates the residual?

Theory says $r_{\mathrm{time}}^{τ}$ scales with the CFL number $\mathrm{CFL} = |f'| τ / h$ while the spatial terms don't care about τ. To see how much it matters in practice, the paper measures all four terms on Burgers' equation: a 13 500-cell DNS filtered onto 300 coarse cells, with the coarse time step swept from τ = 25 Δt to τ = 400 Δt, over an ensemble of 100 random initial conditions. Slide through the sweep:

[Interactive figure: share of each term in the residual as the coarse time step is varied. At τ = 100 Δt (CFL number 0.22): LES commutator (classical subgrid stress) 17%, numerical flux (coarse-flux approximation error) 26%, divergence correction (discrete-divergence mismatch) 19%, time quadrature (forward-Euler flux-averaging error) 38%.]

At CFL 0.22, the time term is the largest single contribution to the residual: a space-only closure cannot see 38% of what it is asked to model.

Data table (all CFL numbers)

CFL	LES commutator	Numerical flux	Divergence corr.	Time quadrature
0.06	23%	35%	24%	19%
0.11	20%	31%	21%	28%
0.22	17%	26%	19%	38%
0.45	14%	22%	16%	49%
0.89	11%	17%	13%	59%

At the smallest step the temporal term is the smallest of the four, at 19%. At CFL 0.89, a perfectly ordinary operating point for an explicit code, it is 59% of the residual, larger than all three spatial terms combined. Every space-only closure model, however sophisticated, is structurally blind to it. (With a dissipative upwind coarse flux the picture is lopsided in a different way: the numerical-flux term alone is 72–88% of the residual. Your scheme choice is part of the closure problem whether you acknowledge it or not.)
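The crossover points can be read off the table mechanically. The short script below copies the rounded percentages from the table (so rows may sum to 100 or 101) and reports, for each CFL number, which term is largest and whether the time term exceeds the three spatial terms together; the variable and function names are illustrative.

```python
# Residual shares in percent, copied from the table above (central coarse flux).
# Order: LES commutator, numerical flux, divergence correction, time quadrature.
names = ("LES commutator", "numerical flux", "divergence correction", "time quadrature")
budget = {
    0.06: (23, 35, 24, 19),
    0.11: (20, 31, 21, 28),
    0.22: (17, 26, 19, 38),
    0.45: (14, 22, 16, 49),
    0.89: (11, 17, 13, 59),
}

def largest_term(shares):
    """Name of the term with the largest share."""
    return names[max(range(len(shares)), key=lambda i: shares[i])]

def time_beats_all_spatial(shares):
    """True if the time term alone exceeds the sum of the three spatial terms."""
    return shares[3] > sum(shares[:3])

for cfl, shares in budget.items():
    print(f"CFL {cfl}: largest = {largest_term(shares)}, "
          f"time > spatial sum: {time_beats_all_spatial(shares)}")
```

On these numbers the time term takes the lead from CFL 0.22 onward, and it outweighs the combined spatial terms only at CFL 0.89.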

A closure that knows the clock

The fix suggested by the analysis is almost embarrassingly cheap. Take a standard Smagorinsky eddy-viscosity model and append the term that the Taylor expansion handed us:

$$m^{h, τ}(v) = \underbrace{-θ_{h}^{2} h^{2} |\partial_{x}^{h} v| \, \partial_{x}^{h} v}_{\text{Smagorinsky (space)}} \; \underbrace{-\, θ_{τ} τ f'(v)^{2} \, \partial_{x}^{h} v}_{\text{Lax–Wendroff (time)}}.$$

The temporal coefficient even comes with a theoretical value, $θ_{τ} = \frac{1}{2}$, straight from the expansion. Fitting it by dissipation matching gives $θ_{τ} \approx 0.21$: same sign and order of magnitude, reduced because the coarse-graining reshuffles some of the burden between terms. The measured temporal dissipation scales linearly in τ across the sixteenfold range of step sizes, exactly the Lax–Wendroff prediction, and the fitted spatial coefficient comes out at $θ_{h}^{2} \approx 0.20$.
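To make the formula concrete, here is a minimal sketch of the closure evaluated on a periodic grid. It assumes Burgers' flux $f(u) = u^{2}/2$ (so $f'(v) = v$), a difference taken between neighbouring cells, and a two-point average for $f'(v)$ at the cell face. The function name `closure_flux`, the grid layout and the averaging are illustrative assumptions, not the paper's code; the default coefficients are the fitted values quoted above. How the closure enters the time step is not covered here.

```python
import math

def closure_flux(v, h, tau, theta_h_sq=0.20, theta_tau=0.21):
    """Space + time closure at the face between cell i and cell i + 1 (periodic)."""
    n = len(v)
    m = []
    for i in range(n):
        v_right = v[(i + 1) % n]
        dv = (v_right - v[i]) / h        # coarse difference of v at the face
        fprime = 0.5 * (v[i] + v_right)  # Burgers: f'(v) = v, averaged to the face (assumption)
        smagorinsky = -theta_h_sq * h**2 * abs(dv) * dv
        lax_wendroff = -theta_tau * tau * fprime**2 * dv
        m.append(smagorinsky + lax_wendroff)
    return m

# Usage: a smooth periodic field on 16 cells.
n = 16
h = 1.0 / n
v = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
m = closure_flux(v, h, tau=0.01)
space_only = closure_flux(v, h, tau=0.0)  # tau = 0 switches the temporal term off
print(max(abs(a - b) for a, b in zip(m, space_only)))
```

Both terms carry the opposite sign of the local gradient, so each acts as a diffusive flux; setting `tau=0.0` recovers the space-only Smagorinsky part.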

Does it work when you actually run it?

A-priori budgets can flatter a model, so the paper closes the loop: run the coarse simulation with each of four closures and compare against the filtered DNS. Besides no model and the full space + time closure, there are two space-only variants: classic Smagorinsky (fitted to the LES commutator only, as tradition dictates) and space only (fitted to all three spatial terms).

[Interactive figure: relative error of the coarse solution at the final time versus the CFL number of the coarse simulation (central coarse flux, log scale), with one line each for space + time, space only, classic and no model.]

Data table

CFL	space + time	space only	classic	no model
0.06	0.098	0.102	0.134	0.752
0.11	0.097	0.104	0.139	0.761
0.22	0.094	0.111	0.149	0.784
0.45	0.092	0.133	0.183	0.867
0.89	0.097	0.678	0.846	1.821

Four lines, one story. The no-model run is bad everywhere and ends at an error of 1.8, worse than predicting zero. The two space-only closures are respectable at small CFL, and then quietly fall apart as the time step grows, reaching errors of 0.68 and 0.85 at CFL 0.89, more than six times worse than where they started. The space + time closure doesn't notice: its error stays within 0.092–0.098 across the entire sweep, and even decreases slightly toward moderate CFL. Same grid, same cost per step. The only difference is one term that knows what τ is.
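The growth factors quoted here follow directly from the table. The snippet below copies the tabulated errors and computes, for each closure, the ratio of the error at the largest CFL number to the error at the smallest, plus the spread across the sweep; the names are illustrative.

```python
# Relative errors at the final time, copied from the table above.
# Columns correspond to CFL = 0.06, 0.11, 0.22, 0.45, 0.89.
errors = {
    "space + time": (0.098, 0.097, 0.094, 0.092, 0.097),
    "space only": (0.102, 0.104, 0.111, 0.133, 0.678),
    "classic": (0.134, 0.139, 0.149, 0.183, 0.846),
    "no model": (0.752, 0.761, 0.784, 0.867, 1.821),
}

def growth(series):
    """Error at the largest CFL number divided by the error at the smallest."""
    return series[-1] / series[0]

def spread(series):
    """Largest error divided by the smallest error over the sweep."""
    return max(series) / min(series)

for name, series in errors.items():
    print(f"{name}: growth = {growth(series):.2f}, spread = {spread(series):.2f}")
```

A growth factor near 1 means the closure is insensitive to the time step over this range of CFL numbers.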

Why I think this matters

Closure models are secretly CFL-dependent. If you learn a closure from data at one time step and deploy it at another, part of what you learned was the time integrator, not the physics. This is a concrete mechanism behind train/test mismatch in data-driven LES.
The biggest term was the one nobody was modelling. At practical CFL numbers the time-quadrature error outweighs the celebrated subgrid stress. That is a lot of modelling effort aimed at a term that is not the largest problem.
The formalism is constructive. "Discretize first, filter next" doesn't just diagnose the error — its Taylor expansion hands you the closure term, coefficient included.

The obvious next steps are higher-order integrators (whose quadrature error shrinks — the framework predicts by how much) and the full incompressible Navier–Stokes setting.

The interactive figures above use the exact ensemble data of the paper (central coarse flux), regenerated with the seeded companion code. The paper additionally reports the upwind-flux results and the dissipation budgets.
