How CLOCKstep:MULTI Learned to Track Human Groove in Real Time

Why This Feature Exists

For years, live musicians have been caught between two worlds: the expressive timing of a drummer and the rigid precision of machines. MIDI Clock, Tap Tempo, and click tracks all assume the grid is fixed and the musician must adapt. Fall behind or speed up even slightly, and it is the player who has to course-correct.

But when humans play together without a grid, that's not how timing works.

We make small, nearly instantaneous corrections while listening to one another. Those micro-adjustments are constant, subtle, and mostly unconscious, but they keep everyone landing together on the downbeats and on important accents. It doesn't matter that the playing isn't metronomically perfect. It feels tight because the entire group is moving together. Every musician and band has a signature groove.

Follow Beat was created to bridge that gap when human timing must reconcile itself with machine precision, like when a band is using sequenced tracks or timed stage production. The idea seems simple enough:

let the machines follow the musician, not the other way around.

But doing that reliably in real time requires a system that understands both the immediate feel of each hit and the longer-term flow of each phrase.

Initial Goal

The original goal wasn’t to create a new category of sync device. It was a technical question:

Can a hardware clock interpret a drummer’s intent quickly enough to adapt to the tempo and feel natural?

This led to a series of experiments with:

  • microsecond-accurate hit timestamp arrays
  • predictability of hits along multiple possible timing vectors
  • short-term vs long-term tempo interpretation
  • clock stability vs tempo responsiveness

Many early attempts either lagged too much or reacted too aggressively. The real challenge was not detecting hits and calculating the deltas between them; it was understanding which hits mattered, how they fit within a rhythm, and how quickly the system should respond.

Understanding Real Drummer Timing

Live timing is not a smooth ramp from one BPM value to another. It’s a sequence of small, instantaneous corrections.

Humans don’t align with each other by gradually easing their tempos back or pushing forward in a curve; they jump. They land the next beat where they feel it needs to be.

This observation shaped the core principle:
Continuity matters more than apparent smoothness.

When the algorithm smoothed too aggressively, producing tempo curves that looked right on paper, the result felt sluggish in practice.

Instead, the solution became a hybrid approach:

  • micro-level responsiveness for beat-to-beat feel
  • macro-level smoothing for phrase-level tempo drift
  • elasticity controls to adjust how tightly or loosely the clock follows

This balance is why Follow Beat tracking feels musical rather than mechanical.

Building Reliable Hit Detection

Follow Beat supports both MIDI and audio-based detection. In both cases, the goal is the same: isolate intentional hits from everything else.

The challenge was separating meaningful events from noise artifacts, ghost notes, and accidental double triggers. Timestamp filtering and inter-onset analysis were used to reject anomalies without ignoring the natural quirks of real drumming.
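As a rough illustration of inter-onset filtering, the sketch below drops any hit that arrives too soon after the previously accepted one. The function name and the 30 ms window are assumptions chosen for the example, not values from Follow Beat itself.

```python
def reject_double_triggers(timestamps_us, min_gap_us=30_000):
    """Drop hits that follow the previously accepted hit too closely.

    timestamps_us: hit times in microseconds, in ascending order.
    min_gap_us: assumed refractory window; 30 ms is an illustrative value.
    """
    accepted = []
    for t in timestamps_us:
        # Measure the gap from the last accepted hit.
        if not accepted or t - accepted[-1] >= min_gap_us:
            accepted.append(t)
    return accepted


hits = [0, 5_000, 500_000, 504_000, 1_000_000]
print(reject_double_triggers(hits))  # [0, 500000, 1000000]
```

A fixed window like this is the bluntest possible filter. A real detector would likely scale the window with tempo so that fast rolls are not mistaken for retriggers.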

For Audio Inputs:
A built-in calibration routine measures peak transient strength and the local noise floor. This allows the system to set an appropriate transient threshold and distinguish drum hits from cymbal wash or shell resonance.
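One way to picture the calibration step is shown below. This is a minimal sketch that assumes the routine has already collected a few peak levels from deliberate hits and a few level readings from silence; the function name, the use of a median, and the 0.5 ratio are all hypothetical.

```python
import statistics


def calibrate_threshold(peak_levels, noise_levels, ratio=0.5):
    """Place a transient threshold between the noise floor and typical peaks.

    peak_levels: peak amplitudes measured from deliberate calibration hits.
    noise_levels: amplitudes measured while nothing is being played.
    ratio: assumed position of the threshold between floor (0) and peak (1).
    """
    if not peak_levels or not noise_levels:
        raise ValueError("need at least one peak and one noise measurement")
    typical_peak = statistics.median(peak_levels)
    noise_floor = max(noise_levels)  # worst-case noise observed
    if typical_peak <= noise_floor:
        raise ValueError("hits are not distinguishable from the noise floor")
    return noise_floor + ratio * (typical_peak - noise_floor)


threshold = calibrate_threshold([0.8, 0.9, 1.0], [0.02, 0.05, 0.1])
print(round(threshold, 3))  # 0.5
```

Using the worst-case noise reading keeps sustained wash or resonance below the threshold, while the median peak keeps a single unusually hard hit from pulling the threshold too high.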

For MIDI Inputs:
A user-defined velocity threshold establishes a clean digital cutoff for candidate hits, filtering out unintended notes or low-velocity ghost triggers.
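In code, a velocity cutoff of this kind can be very small. The sketch below assumes raw MIDI status and velocity bytes are available; the function name and the default threshold of 40 are illustrative only.

```python
def is_candidate_hit(status, velocity, threshold=40):
    """Return True for Note On messages at or above the velocity threshold.

    status: MIDI status byte (0x90-0x9F is Note On on channels 1-16).
    velocity: 0-127; by MIDI convention a Note On with velocity 0 is a Note Off.
    threshold: user-defined cutoff; 40 is an assumed example value.
    """
    is_note_on = (status & 0xF0) == 0x90
    return is_note_on and velocity > 0 and velocity >= threshold


print(is_candidate_hit(0x99, 100))  # True: strong hit on channel 10
print(is_candidate_hit(0x99, 20))   # False: ghost note below the cutoff
print(is_candidate_hit(0x89, 100))  # False: Note Off message
```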

Interpreting Musical Intent

Once hits were reliably detected, the next step was mapping hits to tempo.

This required:

  • microsecond-accurate timestamping
  • moving-window analysis
  • differentiating triplets from straight subdivisions
  • rejecting intervals that didn't correspond cleanly to an underlying pulse
  • maintaining continuity during fills and expressive phrasing

Rather than trying to lock the algorithm to a single expected interval, Follow Beat interprets any rhythmic subdivision.

That means the drummer can improvise freely, playing eighths, quarters, triplets, or any mixture, without being forced into a rigid pattern. The system adapts on the fly; there's no template.
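To make the idea of interpreting subdivisions concrete, here is a minimal sketch that tests one inter-onset interval against several candidate subdivisions of the current beat and keeps the best fit, or rejects the interval if nothing fits cleanly. The candidate list, the 8% tolerance, and all names are assumptions for illustration, not Follow Beat's actual logic.

```python
from fractions import Fraction

# Candidate interval lengths as fractions of one beat (assumed set):
# half note, quarter, quarter triplet, eighth, eighth triplet, sixteenth.
SUBDIVISIONS = [Fraction(2), Fraction(1), Fraction(2, 3),
                Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]


def interpret_interval(interval_us, beat_us, tolerance=0.08):
    """Return (ratio, implied_beat_us) for the best-fitting subdivision.

    Returns None when no candidate implies a beat length within
    `tolerance` (relative error) of the current beat length.
    """
    best = None
    for ratio in SUBDIVISIONS:
        implied_beat = interval_us / float(ratio)
        error = abs(implied_beat - beat_us) / beat_us
        if best is None or error < best[2]:
            best = (ratio, implied_beat, error)
    ratio, implied_beat, error = best
    if error > tolerance:
        return None
    return ratio, implied_beat


beat = 500_000  # 120 BPM expressed as microseconds per beat
print(interpret_interval(250_000, beat))  # straight eighth, beat unchanged
print(interpret_interval(170_000, beat))  # eighth triplet, slightly slower beat
print(interpret_interval(290_000, beat))  # None: no clean fit
```

Because every interval is reduced to an implied beat length, eighths and triplets can be mixed freely and still feed the same tempo estimate.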

Tempo Detection Behavior

One key breakthrough was recognizing that tempo detection isn't served well by a single interpretation model. Instead, Follow Beat relies on two interconnected layers.

Short-Term Model
Handles beat-to-beat micro-adjustments, tracking a drummer's feel.

Long-Term Model
Factors in our natural tendency to maintain a recognizable tempo across bars and phrases.

This two-layer implementation means that tempo decisions aren't dependent on one isolated interpretation. The short-term model promotes immediacy and responsiveness; the long-term model promotes stability and continuity. Together, they allow the system to follow expressive playing without becoming erratic or sluggish.
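The interaction between the two layers can be sketched with a pair of exponential moving averages, one fast and one slow, blended into a single output. Exponential averaging is a stand-in technique chosen for the example; the class name and every coefficient here are assumptions, not a description of the shipped algorithm.

```python
class TwoLayerTempo:
    """Blend a fast (short-term) and a slow (long-term) beat-length estimate."""

    def __init__(self, beat_us, fast=0.6, slow=0.05, blend=0.5):
        self.short_us = float(beat_us)  # reacts within a hit or two
        self.long_us = float(beat_us)   # drifts slowly across bars
        self.fast = fast
        self.slow = slow
        self.blend = blend              # share given to the short-term layer

    def update(self, implied_beat_us):
        """Feed one implied beat length and return the blended estimate."""
        self.short_us += self.fast * (implied_beat_us - self.short_us)
        self.long_us += self.slow * (implied_beat_us - self.long_us)
        return self.beat_us

    @property
    def beat_us(self):
        return self.blend * self.short_us + (1.0 - self.blend) * self.long_us

    @property
    def bpm(self):
        return 60_000_000 / self.beat_us


tempo = TwoLayerTempo(beat_us=500_000)  # start at 120 BPM
tempo.update(480_000)                   # drummer pushes ahead on one beat
print(round(tempo.short_us), round(tempo.long_us), round(tempo.beat_us))
# 488000 499000 493500
```

After a single pushed beat, the short-term layer has moved most of the way toward the drummer while the long-term layer has barely changed, so the blended output follows without lurching.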

Knowing When To Do Nothing

One important discovery was recognizing when the best decision was to do nothing at all.

Human Accuracy Tolerance
Even though the system can precisely measure when a hit falls outside of an expected interval, that doesn't mean every offset warrants a tempo adjustment. Musicians tolerate slight deviations without perceiving them as tempo shifts. Knowing when a hit is 'close enough' leaves processing headroom for when clear decisions are required.

Ambiguous Hits
Did a tempo shift actually occur, or was that a poorly placed triplet? When an event is ambiguous, the safest interpretation is no interpretation at all. There's always another hit coming, and additional data almost always resolves the uncertainty.
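Both "do nothing" cases can be expressed as a small decision function. The sketch below assumes each hit has already been scored against a few rhythmic interpretations, giving a signed relative timing error for each; the function name, the interpretation labels, and both margins are hypothetical.

```python
def decide(errors, close_enough=0.02, ambiguity_margin=0.03):
    """Choose 'hold', 'wait', or 'update' for one hit.

    errors: dict mapping an interpretation name to its signed relative
            timing error (for example 0.06 means 6% late).
    """
    if not errors:
        raise ValueError("at least one interpretation is required")
    ranked = sorted(errors.items(), key=lambda item: abs(item[1]))
    best_name, best_error = ranked[0]
    # Within human tolerance: treat the hit as on time and leave tempo alone.
    if abs(best_error) <= close_enough:
        return "hold", best_name
    # Two interpretations fit almost equally well: wait for the next hit.
    if len(ranked) > 1 and abs(ranked[1][1]) - abs(best_error) < ambiguity_margin:
        return "wait", None
    return "update", best_name


print(decide({"eighth": 0.01, "triplet": 0.30}))   # ('hold', 'eighth')
print(decide({"eighth": 0.06, "triplet": -0.07}))  # ('wait', None)
print(decide({"eighth": 0.06, "triplet": 0.40}))   # ('update', 'eighth')
```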

Response Weighting

Once the system has determined that a hit warrants a tempo update, the next question is how strongly to respond. Respond too strongly and the clock begins to feel like it's leading instead of following.

Follow Beat assigns a weighted influence to each candidate hit so that adjustments are proportional to the musical intent, rather than absolute.
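A proportional response might look something like the sketch below, where a hit's weight combines how hard it was played with how cleanly it fit a subdivision. Using velocity and fit error as the weighting inputs is an assumption made for the example; the source does not specify which factors Follow Beat uses.

```python
def weighted_update(beat_us, implied_beat_us, velocity, fit_error, tolerance=0.08):
    """Move the beat length toward a hit's implied value, scaled by a weight.

    velocity: 0-127 hit strength (assumed proxy for intent).
    fit_error: relative error of the hit against its best subdivision.
    tolerance: fit error at which the hit stops having any influence.
    Returns (new_beat_us, weight).
    """
    strength = min(max(velocity, 0), 127) / 127
    fit = max(0.0, 1.0 - fit_error / tolerance)
    weight = strength * fit
    new_beat = beat_us + weight * (implied_beat_us - beat_us)
    return new_beat, weight


print(weighted_update(500_000, 480_000, velocity=127, fit_error=0.0))
# (480000.0, 1.0): strong, clean hit gets full influence
print(weighted_update(500_000, 480_000, velocity=127, fit_error=0.04))
# (490000.0, 0.5): same hit with a looser fit moves the clock half as far
```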

Elasticity

Some drummers want the system to follow their every nuance, including large tempo jumps; others prefer it to hold steadier and respond only to subtler timing changes.

In tracking tempo changes, there is always a feedback loop. The clock is listening to the drummer. The drummer is listening to the clock, or to the music that is responding to the clock. And just like two musicians playing together, both sides influence one another.

Elasticity determines which side asserts the greater influence. Higher elasticity values favor the drummer, making the clock respond more immediately to their input. Lower values favor the clock, making it more resistant to changes detected in the playing, but not deaf to them. Most musicians will likely choose a moderate setting, giving the clock the freedom to react to expressive playing while preventing overreaction to a momentary inconsistency.
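One simple way to model this trade-off is as a gain on the correction the clock applies, with a small floor so that even the lowest setting still listens. The 0-to-1 range, the floor value, and the function name below are assumptions for illustration; the actual control may be scaled differently.

```python
def apply_elasticity(clock_beat_us, drummer_beat_us, elasticity, floor=0.05):
    """Pull the clock's beat length toward the drummer's by an elastic gain.

    elasticity: 0.0 (clock resists) to 1.0 (clock follows immediately).
    floor: assumed minimum gain so a low setting is resistant, not deaf.
    """
    if not 0.0 <= elasticity <= 1.0:
        raise ValueError("elasticity must be between 0.0 and 1.0")
    gain = floor + (1.0 - floor) * elasticity
    return clock_beat_us + gain * (drummer_beat_us - clock_beat_us)


for setting in (0.0, 0.5, 1.0):
    print(setting, round(apply_elasticity(500_000, 480_000, setting)))
# 0.0 499000
# 0.5 489500
# 1.0 480000
```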

It's vital that the drummer be able to choose the elasticity that best suits their expectations.

Conclusion

The development of Follow Beat began with a simple question: could a machine respond to a drummer’s timing in a way that felt natural rather than forced? Answering that question required more than detecting hits or measuring intervals. It meant building a system that could listen, interpret, and react with the kind of judgment musicians use when they play together.

The layered tempo analysis, weighted response, and elasticity controls all emerged from that goal. They aren’t about correcting the drummer or shaping their intent. They exist so the clock can adapt to the performance without disrupting it, and the drummer can play what they feel.

As a result, the clock becomes a part of the performance rather than a constraint on it.
