Design · 5 min read

Frames as the Primitive Unit of Time

Every non-linear editor (NLE) that stores time as floating-point seconds eventually ships a subtle drift bug. Storing time as integer frames eliminates that entire class of problems, at the cost of one invariant every developer needs to know.

Here is a bug that has shipped in more video editors than anyone wants to admit: two clips that should be perfectly flush end up 0.0000003 seconds apart. At the join frame, the renderer has to pick which clip wins, and it picks arbitrarily. Sometimes you get a one-frame flash of the wrong clip. Sometimes a gap. It is intermittent, it is data-dependent, and it is miserable to reproduce.

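The root cause is always the same: time was stored as floating-point seconds. Most frame times (1/30 of a second, for instance) have no exact binary representation, every operation rounds its result, and floating-point addition is not associative, so the order of edits changes the answer. Split a clip, move it, trim it, and the rounding errors compound until flush is no longer flush.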

The decision: integer frames everywhere

In this engine, currentFrame is an integer, and so is everything else that describes time. Clips carry startFrame, durationFrames, sourceStartFrame, and sourceDurationFrames, all integers. Seconds exist in exactly one place: the rendering boundary, where we set videoEl.currentTime = sourceFrame / fps. Nowhere else.

```ts
// Bad: floating-point seconds compound rounding error on every edit
const secondsClip = { start: 1.5, duration: 3.2 };
const currentTime = 4.7;

// Good: integer frames are exact, forever (the same times at 30 fps)
const frameClip = { startFrame: 45, durationFrames: 96 };
const currentFrame = 141;
```

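Integer math is exact: JavaScript numbers represent every integer up to Number.MAX_SAFE_INTEGER (2^53 - 1) without rounding, which is far more frames than any timeline will ever hold. Adjacent clips are flush when clipA.startFrame + clipA.durationFrames === clipB.startFrame, and that equality never decays no matter how many times you split, slip, or move.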
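To make the contrast concrete, here is a minimal sketch assuming 30 fps. The TimelineSpan type and the isFlush helper are names made up for this illustration, not part of the engine's API.

```ts
// Hypothetical shape: just the two timeline fields named in the post.
interface TimelineSpan {
  startFrame: number;
  durationFrames: number;
}

// Flush means A's exclusive end is exactly B's start.
function isFlush(a: TimelineSpan, b: TimelineSpan): boolean {
  return a.startFrame + a.durationFrames === b.startFrame;
}

// Floating-point seconds: a clip starting at 0.1 s and lasting 0.2 s
// should end at 0.3 s, but the sum rounds to a neighbouring value.
const floatEnd = 0.1 + 0.2;          // 0.30000000000000004
const floatFlush = floatEnd === 0.3; // false

// The same edit in integer frames at 30 fps: 3 + 6 === 9, exactly.
const clipA: TimelineSpan = { startFrame: 3, durationFrames: 6 };
const clipB: TimelineSpan = { startFrame: 9, durationFrames: 12 };
const frameFlush = isFlush(clipA, clipB); // true
```

The float comparison fails after a single addition; the integer comparison holds no matter how many edits produced the operands.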

The one invariant: the half-open interval

Integers remove rounding, but you still need one rule to decide which clip is active at a seam. A clip is active if and only if startFrame <= frame < startFrame + durationFrames. The interval is half-open: the start frame is included, the end frame is not.
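As a sketch, the rule fits in one comparison. The helper names isActive and activeClipAt are hypothetical, and the lookup assumes a single track with non-overlapping clips.

```ts
// Hypothetical minimal clip shape for this example.
interface Clip {
  startFrame: number;
  durationFrames: number;
}

// Half-open test: start is inclusive, end is exclusive.
function isActive(clip: Clip, frame: number): boolean {
  return clip.startFrame <= frame && frame < clip.startFrame + clip.durationFrames;
}

// First clip active at the frame, or undefined when the frame falls in a gap.
function activeClipAt<T extends Clip>(clips: T[], frame: number): T | undefined {
  return clips.find((clip) => isActive(clip, frame));
}

const first: Clip = { startFrame: 0, durationFrames: 45 };   // frames 0..44
const second: Clip = { startFrame: 45, durationFrames: 96 }; // frames 45..140
const timeline = [first, second];

const beforeSeam = activeClipAt(timeline, 44); // first
const atSeam = activeClipAt(timeline, 45);     // second
const pastEnd = activeClipAt(timeline, 141);   // undefined: 141 is the exclusive end
```

With a closed interval on both ends, frame 45 would match both clips and the renderer would be back to picking a winner arbitrarily.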

The half-open rule is the single most important thing to internalize. It means adjacent clips never both fire on the same frame: clip A ends exactly where clip B begins, and the frame at clipB.startFrame belongs unambiguously to clip B. Source mapping then becomes pure arithmetic:

```ts
// The only place trim semantics live
sourceFrame = (frame - clip.startFrame) + clip.sourceStartFrame
```
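A worked example may help, assuming 30 fps and 1x playback (one timeline frame advances one source frame, as the formula implies). The function names toSourceFrame and toSeconds are illustrative, and the conversion to seconds stands in for the videoEl.currentTime assignment at the rendering boundary.

```ts
// Hypothetical clip shape with the fields the mapping needs.
interface Clip {
  startFrame: number;
  durationFrames: number;
  sourceStartFrame: number;
}

// Timeline frame -> source frame: offset into the clip plus the trim-in point.
function toSourceFrame(clip: Clip, frame: number): number {
  return (frame - clip.startFrame) + clip.sourceStartFrame;
}

// The one place seconds appear: the value handed to the media element.
function toSeconds(sourceFrame: number, fps: number): number {
  return sourceFrame / fps;
}

// A clip placed at timeline frame 45 whose first 10 source frames are trimmed off.
const trimmed: Clip = { startFrame: 45, durationFrames: 96, sourceStartFrame: 10 };

const mapped = toSourceFrame(trimmed, 50); // (50 - 45) + 10 = 15
const seekTo = toSeconds(mapped, 30);      // 15 / 30 = 0.5
```

The division can still produce an inexact float, but it is computed fresh from integers on every seek, so the error never accumulates.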

Why two frame coordinate systems

There are two kinds of frames in the model, and keeping them distinct is what makes non-destructive editing possible:

  • Timeline frame — where on the timeline the clip lives (startFrame).
  • Source frame — what part of the source media plays (sourceStartFrame).

A split is then trivially correct: create two clips with the same src, adjust their startFrame and sourceStartFrame, and no media is re-encoded. Trims and slips are likewise just integer adjustments to those two windows.
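Here is one way a split could look, as a sketch rather than the engine's actual implementation. It assumes 1x playback and leaves out sourceDurationFrames, since how that field should change on a split is not spelled out here. The splitClip name and the file name are made up.

```ts
// Hypothetical clip shape; sourceDurationFrames is intentionally omitted.
interface Clip {
  src: string;
  startFrame: number;
  durationFrames: number;
  sourceStartFrame: number;
}

// Split at a timeline frame strictly inside the clip. Returns [head, tail].
function splitClip(clip: Clip, atFrame: number): [Clip, Clip] {
  const offset = atFrame - clip.startFrame;
  if (!Number.isInteger(atFrame) || offset <= 0 || offset >= clip.durationFrames) {
    throw new RangeError("split point must be an integer frame strictly inside the clip");
  }
  // Head keeps its position and trim-in point; only its length shrinks.
  const head: Clip = { ...clip, durationFrames: offset };
  // Tail starts at the split frame, and its source window advances by the same offset.
  const tail: Clip = {
    ...clip,
    startFrame: atFrame,
    durationFrames: clip.durationFrames - offset,
    sourceStartFrame: clip.sourceStartFrame + offset,
  };
  return [head, tail];
}

const original: Clip = { src: "interview.mp4", startFrame: 45, durationFrames: 96, sourceStartFrame: 10 };
const [head, tail] = splitClip(original, 75);
// head: startFrame 45, durationFrames 30, sourceStartFrame 10
// tail: startFrame 75, durationFrames 66, sourceStartFrame 40
```

The two halves are flush by construction (45 + 30 === 75), their durations sum to the original 96, and the source frame at the seam is continuous, so playback across the cut is indistinguishable from the unsplit clip.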

The clock follows the same rule

The PlaybackEngine integrates a float frame position internally for sub-frame seeking, but the store and UI consume Math.floor(getFrameAt()). Float lives at the boundary; integers are the contract.
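A rough sketch of that boundary follows. Only the name getFrameAt comes from the engine; its signature, the FrameClock class, and the currentFrameAt method are assumptions made for illustration.

```ts
// Hypothetical clock: derives a float frame position from elapsed wall-clock time.
class FrameClock {
  private readonly fps: number;
  private readonly startMs: number;

  constructor(fps: number, startMs: number) {
    this.fps = fps;
    this.startMs = startMs;
  }

  // Float frame position at time nowMs, with sub-frame precision (assumed signature).
  getFrameAt(nowMs: number): number {
    return ((nowMs - this.startMs) * this.fps) / 1000;
  }

  // The integer frame that the store and UI are allowed to see.
  currentFrameAt(nowMs: number): number {
    return Math.floor(this.getFrameAt(nowMs));
  }
}

const clock = new FrameClock(30, 0);
const precise = clock.getFrameAt(1050);       // 31.5
const published = clock.currentFrameAt(1050); // 31
```

Flooring rather than rounding matches the half-open rule: frame n stays current for the whole interval from n up to, but not including, n + 1.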

The cost of all this is one invariant every contributor must know — the half-open interval. That is a cheap price for permanently deleting an entire category of drift bugs.