Developer · April 26, 2026 · 7 min read

Catching bots with behavioral sensing

What filling out a form reveals about who's really behind the keyboard.

By Pinar Patton

CAPTCHAs are a tax. Every legitimate user pays it, every time, so you can collect evidence that a small fraction of your traffic is automated. The evidence is weak — most modern bots solve image challenges reliably — and the cost is borne disproportionately by the people you actually want in your product.

There's a better signal, and it's available for free. The way a user fills out a form tells you a great deal about whether a human is doing the filling. Humans pause. They backspace. Their keystrokes arrive at uneven intervals. Simple bots do none of that. They pump characters into fields at metronomic intervals, complete every field in the same order, and never deviate.

This post walks through a working example: a mock bank account opening form where CogStream's behavioral sensing layer watches how the form is filled and flags automated submissions — without a challenge, a fingerprint, or a third-party service.

How CogStream sees behavior

CogStream's sensing layer runs entirely in the browser. It watches DOM events — mouse movements, focus changes, input events, scroll — and reduces them, in real time, into a compact structured description called a behavioral episode. That episode describes what the user did over a window of time: what kinds of actions occurred, how often, at what pace, with what variance.
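To make "reduces them into a compact structured description" concrete, here is a minimal sketch of that kind of reduction. It is not CogStream's implementation: RawEvent, WindowSummary, and summarizeWindow are hypothetical names invented for illustration. The sketch takes timestamped events and keeps only counts, pace, and variance.

```typescript
// Hypothetical shapes for illustration; not the @cogstream/sensing API.
type EventKind = 'key' | 'focus' | 'mouse' | 'scroll';

interface RawEvent {
  kind: EventKind;
  t: number; // timestamp in milliseconds
}

interface WindowSummary {
  counts: Record<EventKind, number>;
  durationMs: number;
  eventsPerSecond: number;
  meanIntervalMs: number;
  intervalVariance: number;
}

// Reduce one window of events to aggregate statistics only.
function summarizeWindow(events: RawEvent[]): WindowSummary {
  const counts: Record<EventKind, number> = { key: 0, focus: 0, mouse: 0, scroll: 0 };
  for (const e of events) counts[e.kind] += 1;

  // Gaps between consecutive events, in time order.
  const sorted = [...events].sort((a, b) => a.t - b.t);
  const intervals: number[] = [];
  for (let i = 1; i < sorted.length; i++) intervals.push(sorted[i].t - sorted[i - 1].t);

  const durationMs = sorted.length > 1 ? sorted[sorted.length - 1].t - sorted[0].t : 0;
  const mean = intervals.length
    ? intervals.reduce((sum, x) => sum + x, 0) / intervals.length
    : 0;
  const variance = intervals.length
    ? intervals.reduce((sum, x) => sum + (x - mean) ** 2, 0) / intervals.length
    : 0;

  return {
    counts,
    durationMs,
    eventsPerSecond: durationMs > 0 ? (events.length / durationMs) * 1000 : 0,
    meanIntervalMs: mean,
    intervalVariance: variance,
  };
}
```

The summary carries no field names and no typed characters, only what kinds of actions occurred and how they were spaced.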

Four properties matter for bot detection:

  • Typing uniformity. Bots produce characters with near-zero timing variance. Human keystroke timing has noise: hesitation, bursts, corrections.
  • Fill speed. Bots complete a full form in a fraction of the time any human could.
  • Interaction entropy. Humans correct themselves. They tab back, retype, change an answer. Bots march forward.
  • Pattern regularity. Bots repeat the same sequence with identical timing on every run. Humans don't.

The sensing layer captures all of this without ever logging what the user typed, which fields they completed, or anything that would identify them as an individual. It emits a description of how they interacted, not what they said.
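Two of these properties, interaction entropy and pattern regularity, can be measured from action kinds and timings alone. The sketch below shows one possible reading of each. The function names and definitions are assumptions made for illustration, not CogStream's own metrics: entropy is taken as the Shannon entropy of the action-kind distribution, and regularity as the largest timing difference between two runs.

```typescript
// Hypothetical helpers; names and definitions are illustrative assumptions.

// Shannon entropy (in bits) of the distribution of action kinds.
// A session that only ever types forward scores 0; corrections raise it.
function actionEntropy(kinds: string[]): number {
  if (kinds.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (const k of kinds) counts[k] = (counts[k] || 0) + 1;

  let bits = 0;
  for (const k of Object.keys(counts)) {
    const p = counts[k] / kinds.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Largest absolute difference (ms) between the interval sequences of two runs.
// Returns Infinity when the runs have different lengths and cannot be aligned.
function maxTimingDrift(runA: number[], runB: number[]): number {
  if (runA.length !== runB.length) return Infinity;
  let worst = 0;
  for (let i = 0; i < runA.length; i++) {
    worst = Math.max(worst, Math.abs(runA[i] - runB[i]));
  }
  return worst;
}
```

Under these definitions, two scripted runs with identical intervals have a drift of 0, while two human attempts at the same form will almost never line up that closely.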

The demo: ClearPath Financial

To make this concrete, we built a demo app: a mock bank account opening form for a fictional institution called ClearPath Financial. It has standard fields (name, date of birth, last four of SSN, address) and a dev toolbar with a Bot Simulator that auto-fills the form at a perfectly uniform 80ms per character.

The full source is in the CogStream Examples Repo. Here's what the integration looks like.

Wiring up sensing

Install the packages:

npm install @cogstream/sensing @cogstream/agent-graph @cogstream/types @cogstream/ui

The sensing layer is a module-level singleton. Attach it to the DOM and start polling:

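// lib/sensingBus.ts
import { createSignalCollector, createWindowingEngine } from '@cogstream/sensing';

const collector = createSignalCollector();
const windowing = createWindowingEngine();

// Assumptions for this sketch: one session ID per page load, reused as the thread ID.
const sessionId = `s-${Date.now().toString(36)}-${Math.random().toString(36).slice(2)}`;
let timer: ReturnType<typeof setInterval> | undefined;
let dispatch: (event: unknown) => void = () => {};

export function registerEventDispatcher(fn: (event: unknown) => void) {
  dispatch = fn;
}

export function attachDOM(target: EventTarget) {
  return collector.attach(target);
}

// Assumed helper: POST the episode to the API route and forward each NDJSON event.
async function postEpisode(episode: unknown) {
  const res = await fetch(`/api/agent/thread/${sessionId}/episode`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(episode),
  });
  for (const line of (await res.text()).split('\n')) {
    if (line) dispatch(JSON.parse(line));
  }
}

export function startPolling(intervalMs = 1500) {
  timer = setInterval(async () => {
    const signals = collector.drain();
    if (signals.length === 0) return;

    const episodes = windowing.feed(signals);
    const episode = windowing.buildEpisodeV2(episodes, sessionId, {
      route: window.location.pathname,
    });
    if (episode) await postEpisode(episode);
  }, intervalMs);
}

export function stopPolling() {
  clearInterval(timer); // keep the handle so the interval can be cleared on unmount
}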

In your React tree, a small AgentWatcher component attaches the collector on mount and tears it down on unmount:

// components/AgentWatcher.tsx
'use client';
 
import { useEffect } from 'react';
import * as sensingBus from '../lib/sensingBus';
 
export function AgentWatcher({ onEvent }: { onEvent: (event: unknown) => void }) {
  useEffect(() => {
    sensingBus.startPolling(1500);
    const cleanup = sensingBus.attachDOM(document);
    sensingBus.registerEventDispatcher(onEvent);
    return () => {
      sensingBus.stopPolling();
      cleanup();
    };
  }, [onEvent]);
 
  return null;
}

That's all the browser code. Every 1.5 seconds, the sensing layer drains what it's seen, builds a structured episode, and ships it to your API route.

Writing an InterpretationAdapter

The episode arrives at your server and is handed to a LangGraph-based agent provided by @cogstream/agent-graph. The agent calls your InterpretationAdapter, an interface with a single async method that receives the episode and returns a User State Model describing what it believes is happening.

For the fraud demo, the adapter checks one thing: was this episode marked as bot-simulated?

// lib/fraudAdapter.ts
import type { EpisodeV2, UserStateModel } from '@cogstream/types';
import type { InterpretationAdapter, InterpretationResult } from '@cogstream/agent-graph';
 
export class FraudInterpretationAdapter implements InterpretationAdapter {
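  // snapshot is typed structurally here as an assumption; only sessionId is used.
  async processEpisode(episode: EpisodeV2, snapshot: { sessionId: string }): Promise<InterpretationResult> {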
    const isBotSimulated = episode.summary['bot_simulated'] === true;
    const suspicion = isBotSimulated ? 0.92 : 0.0;
 
    const usm: UserStateModel = {
      session_id: snapshot.sessionId,
      active_intents: suspicion >= 0.55 ? [{
        type: 'automated_submission',
        confidence: 0.85,
        source_patterns: ['uniform_primitive_weight', 'speed_anomaly'],
        supporting_episodes: [],
      }] : [{
        type: 'genuine_user',
        confidence: 0.80,
        source_patterns: ['natural_variance'],
        supporting_episodes: [],
      }],
      // ... friction, progress, trajectory fields
    };
 
    return { usm, adapterState: {} };
  }
}

The bot_simulated flag is set by the Bot Simulator. In a production system, you'd replace this with a real heuristic: compute the coefficient of variation of the inter-keystroke timing derived from sensing primitives, measure fill speed, and score accordingly. The adapter interface is intentionally open so you can bring any model or rule set you want.
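As a rough illustration of what such a heuristic could look like, the sketch below scores a session from its inter-keystroke intervals and fill speed. Everything in it is an assumption: the function names are invented, the thresholds and weights are untuned placeholders, and it takes plain arrays of millisecond intervals rather than real sensing primitives, whose shape this post does not specify.

```typescript
// Illustrative heuristic. All thresholds and weights are assumptions, not tuned values.

// Coefficient of variation (standard deviation / mean) of inter-keystroke intervals.
// Returns null when there are too few intervals to judge.
function coefficientOfVariation(intervalsMs: number[]): number | null {
  if (intervalsMs.length < 2) return null;
  const mean = intervalsMs.reduce((sum, x) => sum + x, 0) / intervalsMs.length;
  if (mean <= 0) return null;
  const variance =
    intervalsMs.reduce((sum, x) => sum + (x - mean) ** 2, 0) / intervalsMs.length;
  return Math.sqrt(variance) / mean;
}

// Clamp a value into [0, 1].
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Combine typing uniformity and fill speed into a suspicion score in [0, 1].
function suspicionScore(intervalsMs: number[], charsTyped: number, fillMs: number): number {
  const cv = coefficientOfVariation(intervalsMs);
  if (cv === null || fillMs <= 0) return 0; // no evidence, no suspicion

  // CV at or below 0.05 counts as fully uniform; at or above 0.30 as natural.
  const uniformity = clamp01((0.3 - cv) / 0.25);

  // 8 chars/sec or slower is treated as plausible for a human; 12 or faster as automated.
  const charsPerSecond = charsTyped / (fillMs / 1000);
  const speed = clamp01((charsPerSecond - 8) / 4);

  return 0.75 * uniformity + 0.25 * speed;
}

// A perfectly uniform 80 ms stream versus an uneven, slower sample.
const botScore = suspicionScore([80, 80, 80, 80, 80], 20, 1600);
const humanScore = suspicionScore([120, 340, 95, 410, 180], 6, 1145);
```

With these made-up inputs, the 80 ms stream lands above the adapter's 0.55 cutoff and the uneven sample lands below it. A real deployment would need thresholds fitted to its own traffic.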

The API route

A thin Next.js route validates the episode with Zod, creates the adapter, and hands both to the graph:

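// app/api/agent/thread/[threadId]/episode/route.ts
import { EpisodeV2Schema } from '@cogstream/types';
import { getCogstreamGraph } from '@cogstream/agent-graph';
// lib/ sits at the project root, six levels up from this file's directory.
import { FraudInterpretationAdapter } from '../../../../../../lib/fraudAdapter';
 
export async function POST(
  req: Request,
  { params }: { params: { threadId: string } },
) {
  // safeParse lets us answer a malformed episode with a 400 instead of throwing.
  const parsed = EpisodeV2Schema.safeParse(await req.json());
  if (!parsed.success) {
    return new Response(JSON.stringify({ error: 'Invalid episode' }), {
      status: 400,
      headers: { 'Content-Type': 'application/json' },
    });
  }
 
  const adapter = new FraudInterpretationAdapter();
  const graph = getCogstreamGraph(adapter);
 
  const result = await graph.invoke({
    threadId: params.threadId,
    episode: parsed.data,
  });
 
  // One JSON-encoded event per line (NDJSON).
  return new Response(
    result.events.map(e => JSON.stringify(e)).join('\n'),
    { headers: { 'Content-Type': 'application/x-ndjson' } },
  );
}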

The graph runs the episode through interpretation, decides whether to intervene, and emits a stream of AG-UI events back to the client.
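On the client, an NDJSON body has to be split back into individual events. If you read the response incrementally, a chunk can end in the middle of a line, so the parser needs a small buffer. The sketch below is one way to do that. createNdjsonParser is a hypothetical helper, not part of any CogStream package, and it assumes each event is a single JSON value on its own line.

```typescript
// Hypothetical helper: incremental NDJSON parsing with a carry-over buffer.
function createNdjsonParser(onEvent: (event: unknown) => void) {
  let buffer = '';

  return {
    // Feed a chunk of text; emits one callback per complete line.
    push(chunk: string): void {
      buffer += chunk;
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? ''; // the last piece may be an incomplete line
      for (const line of lines) {
        if (line.trim() !== '') onEvent(JSON.parse(line));
      }
    },

    // Call once the body ends; emits a final line that had no trailing newline.
    flush(): void {
      if (buffer.trim() !== '') onEvent(JSON.parse(buffer));
      buffer = '';
    },
  };
}
```

The flush step matters for a body built with join('\n'), as in the route above, because the last event is not followed by a newline.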

Rendering the intervention

On the client, @cogstream/ui provides the intervention components. The FraudEventRenderer listens for ui_render events and maps them to the right component inside an InterventionOverlay:

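// components/FraudEventRenderer.tsx
'use client';
 
import { useCallback, useState } from 'react';
import { InterventionOverlay, HintCard, EscalationBadge } from '@cogstream/ui';
import { AgentWatcher } from './AgentWatcher';
 
// Assumed shape of a ui_render payload, inferred from how it is used below.
type RenderSpec = { component: string; props: { message: string } };
 
export function FraudEventRenderer() {
  const [active, setActive] = useState<RenderSpec | null>(null);
 
  const handleEvent = useCallback((event: unknown) => {
    const e = event as { type?: string; component?: RenderSpec };
    if (e.type === 'ui_render' && e.component) {
      setActive(e.component);
    }
  }, []);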
 
  return (
    <>
      <AgentWatcher onEvent={handleEvent} />
      {active && (
        <InterventionOverlay position="bottom-right">
          {active.component === 'EscalationBadge'
            ? <EscalationBadge message={active.props.message} />
            : <HintCard message={active.props.message} severity="warning" />
          }
        </InterventionOverlay>
      )}
    </>
  );
}

When the Bot Simulator runs, the overlay appears within a few seconds of the form completing. When a human fills the form at a natural pace, nothing fires.

What this is really demonstrating

The fraud detection itself isn't the point. What the demo is showing is the shape of the integration:

  1. Sensing runs in the browser. No server roundtrips for every keystroke. No logging of raw input. The privacy boundary is structural, not policy-level.
  2. The adapter is yours. The graph doesn't know anything about fraud. It knows about episodes and User State Models. Your adapter is the domain expert. Swap it for a checkout abandonment detector, a confusion signal, a compliance checker — the plumbing stays the same.
  3. The intervention surface is constrained. The graph can emit a handful of well-typed events. It can't rewrite your UI, navigate the user somewhere unexpected, or act in the background without a visible trace. The ceiling on what it can do is intentional.
  4. Human-in-the-loop (HITL) is first-class. If the agent wants to interrupt the user for confirmation, it pauses in the graph and waits for a /resume call. The user has the last word.

This is what the CogStream architecture is for: situations where the right response depends on observing behavior over time, where you want a domain expert (your adapter) to interpret what you see, and where you want the agent to act only when the evidence warrants it.

Try it yourself

The full demo is open source. Clone the repo, run the server, and click Run Bot Simulator to watch the detection fire:

git clone https://github.com/cogstreamer/cogstream-examples
cd cogstream-examples/clearpath-financial
cp .env.example .env.local
# add your ANTHROPIC_API_KEY to .env.local
npm install
npm run dev
# open localhost:3000/apply

The packages used here — @cogstream/sensing, @cogstream/agent-graph, @cogstream/types, @cogstream/ui — are available on npm. Early API access for the hosted interpretation service is by request. Send a note and tell us what you're building.

No CAPTCHA. No fingerprint. Just watching how the form gets filled.