Mumpix Agent Docs · Jetson Developer Reference

Mumpix Agents + Jetson

A self-learning browser and hardware automation agent that runs on Jetson Orin. It reads the DOM, controls input devices, accumulates causal knowledge across sessions, and re-plans when the world changes.

Single HTML file · Dark by default · Zero-config API · Mumpix integrated

MumpixAgent  ← zero-config developer API
   │
   ├── ActionPipeline
   │      pre-snap → execute → post-snap → record → drift
   ├── AchieveEngine
   │      plan → execute sequence → replan on failure → promote gene
   ├── DriftDetector
   │      failure streaks → stale suppression → semantic fallback
   │
   └── mumpix-core  ← shared singleton (all layers, one memory file)
          ├── WorldModel   causal triples
          ├── DNA          goal → proven action sequence
          ├── Turbo        observation window → LLM prompt
          └── MumpixDB     evidence ladder LOW → PROVEN

Three services, one memory file:
  :7780 browser   | CDP read + xdotool write
  :7781 hardware  | evdev read + uinput write + WASM handlers
  :7779 meta      | stats, planner, genes, outcomes

Getting Started

This stack is designed so the zero-config path works first, and the deep internals stay available when you need to drop down a layer. The examples below are the three common entry points: direct agent usage, full shared-stack usage, and booting all services together.

System requirements

Jetson Orin on JetPack 6+, or any Linux machine with X11, Chrome, and xdotool. Page state is read over CDP and input is written through xdotool, so Chrome must already be running with a DevTools port open (9222 by default).

Architecture

The agent uses CDP only for reading and xdotool only for writing. Mumpix sits above both, learning from structural state transitions and action outcomes without ever sending automation events through the browser.

Persistence

All layers share one file, mumpix-memory.json (configurable through `MUMPIX_SAVE`). WorldModel triples, genes, and DB evidence are persisted every 30 seconds and on clean process exit.
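The save cadence described above can be sketched as follows. This is a minimal illustration, not the shipped persistence code: `saveSnapshot`, `startPersistence`, the snapshot shape, and the write-then-rename step are all assumptions made for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write to a temp file, then rename, so a crash
// mid-write does not leave a truncated memory file behind.
function saveSnapshot(file, snapshot) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(snapshot));
  fs.renameSync(tmp, file);
}

// Flush every `intervalMs` and once more when the process exits cleanly.
function startPersistence(file, getSnapshot, intervalMs = 30000) {
  const flush = () => saveSnapshot(file, getSnapshot());
  const timer = setInterval(flush, intervalMs);
  timer.unref(); // the save timer alone should not keep the process alive
  process.once('exit', flush); // 'exit' handlers must be synchronous
  return {
    flush,
    stop() {
      clearInterval(timer);
      process.removeListener('exit', flush);
    },
  };
}

// Usage with a throwaway file in the OS temp directory.
const file = path.join(os.tmpdir(), 'mumpix-memory-demo.json');
const state = { triples: [], genes: [] };
const persistence = startPersistence(file, () => state);
state.triples.push(['auth', 'click::text:Sign in', 'authenticated']);
persistence.flush();
persistence.stop();
```

The `exit` event does not fire when the process is killed by a signal, which is consistent with the "clean process exit" wording above.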

Quick Start · zero-config path

const { createAgent } = require('./src/agent');

// Top-level await is not available in CommonJS, so the calls live in an async function.
async function main() {
  const agent = await createAgent({}, { verbose: true });
  await agent.connect();

  await agent.goto('https://github.com/login');
  await agent.fill('#login_field', 'operator@example.com');
  await agent.fill('#password', '••••••••');
  await agent.submit();

  const result = await agent.achieve({ logged_in: true, domain: 'github.com' });
  // → { ok: true, achieved: true, steps: 4, confidence: 0.87 }
}

main().catch(console.error);

Quick Start · full stack with createAgent

const { wm, dna, db, turbo } = require('./mumpix-core');
const { createAgent } = require('./src/agent');

async function main() {
  // Pass the shared singletons explicitly instead of relying on the zero-config defaults.
  const agent = await createAgent({ wm, dna, db, turbo }, { verbose: true });
  console.log(agent.core.wm.stats());
  console.log(agent.drift.stats());
}

main().catch(console.error);

Quick Start · boot all services
            
# one-time helper build for virtual hardware devices
npm run build-helper

# start Chrome with a debug port
google-chrome --remote-debugging-port=9222 &

# boot browser API + hardware API + meta API
node launcher.js

Environment variables

Variable	Default	Description
`MUMPIX_SAVE`	`mumpix-memory.json`	Shared persistence file for the Mumpix core.
`PORT_BROWSER`	`7780`	Browser API port for CDP read and xdotool write.
`PORT_HW`	`7781`	Hardware bridge API port.
`PORT_META`	`7779`	Meta API port for stats, planner, and DNA inspection.
`CDP_PORT`	`9222`	Chrome remote debugging port.
`ACTION_DELAY`	`200`	Default post-action delay inside the agent.
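A small sketch of how these variables could be read with their documented defaults. `loadConfig` is a hypothetical helper written for this example; the real launcher may parse its environment differently.

```javascript
// Defaults copied from the table above.
const DEFAULTS = {
  MUMPIX_SAVE: 'mumpix-memory.json',
  PORT_BROWSER: 7780,
  PORT_HW: 7781,
  PORT_META: 7779,
  CDP_PORT: 9222,
  ACTION_DELAY: 200,
};

// Hypothetical helper: merge an environment object over the defaults,
// converting numeric settings and rejecting values that are not integers.
function loadConfig(env = process.env) {
  const config = {};
  for (const [name, fallback] of Object.entries(DEFAULTS)) {
    const raw = env[name];
    if (raw === undefined || raw === '') {
      config[name] = fallback;
    } else if (typeof fallback === 'number') {
      const parsed = Number(raw);
      if (!Number.isInteger(parsed) || parsed < 0) {
        throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
      }
      config[name] = parsed;
    } else {
      config[name] = raw;
    }
  }
  return config;
}

const config = loadConfig({ PORT_HW: '9000' });
console.log(config.PORT_HW, config.PORT_META);
```

Failing fast on a malformed port is a design choice of the sketch; it surfaces a typo at boot instead of at the first request.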

Core API

The top-level developer interface is the Mumpix agent wrapper. It exposes the clean zero-config path, but it never hides the internals. You can reach the shared Mumpix layers, the raw CDP reader, the HID writer, or the direct pipeline whenever you need them.

Constructor

const { createAgent } = require('./src/agent');

const agent = await createAgent({}, { verbose: true });
await agent.connect();

Option	Type	Default	Description
`port`	`number`	`9222`	Chrome DevTools Protocol port.
`verbose`	`boolean`	`false`	Logs pipeline activity, drift events, and warnings.
`actionDelay`	`number`	`200`	Default delay after actions before post-snapshot.
`pipeline.settleMs`	`number`	`380`	Per-action settle time for DOM/network changes.
`achieve.maxAttempts`	`number`	`3`	Maximum replans before `achieve()` gives up.

Deep access always available · Zero-config by default

Instance properties

Property	Description
`agent.core.wm`	Shared `WorldModel` singleton with all learned causal triples.
`agent.core.dna`	DNA gene library for proven action sequences and block rules.
`agent.core.db`	MumpixDB structured truth and evidence ladder.
`agent.core.turbo`	Turbo working-memory window for prompt injection.
`agent.drift`	Drift detector tracking stale actions and semantic fallbacks.
`agent.cdp`	Raw CDP browser reader.
`agent.hid`	Raw xdotool HID writer.
`agent.pipeline`	Direct access to `ActionPipeline` for custom actions.

`async agent.goto(url)`

Async · Returns outcome

Navigates by focusing Chrome, hitting the address bar, typing the URL, and pressing Enter. No browser-side navigation APIs are used, so there is no automation event path inside the page.

Param	Type	Description
`url`	`string`	Full URL including protocol.

Route verification

Structural snapshots before and after.
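The keystroke sequence described for `goto` can be pictured as a list of xdotool invocations. The sketch below only builds the argument lists and runs nothing. The `gotoCommands` name, the window class `google-chrome`, the `ctrl+l` address-bar shortcut, and the 20 ms typing delay are assumptions for illustration; the agent's HID writer may issue different commands.

```javascript
// Build xdotool argument lists for an address-bar navigation.
// Nothing is executed here; each array is one `xdotool ...` call.
function gotoCommands(url, windowClass = 'google-chrome') {
  if (!/^[a-z][a-z0-9+.-]*:\/\//i.test(url)) {
    throw new Error('url must include the protocol, e.g. https://');
  }
  return [
    ['search', '--onlyvisible', '--class', windowClass, 'windowactivate'], // focus Chrome
    ['key', 'ctrl+l'],                 // focus the address bar
    ['type', '--delay', '20', url],    // type the URL as real keystrokes
    ['key', 'Return'],                 // navigate
  ];
}

for (const args of gotoCommands('https://github.com/login')) {
  console.log(['xdotool', ...args].join(' '));
}
```

To actually send the keystrokes, each argument list could be passed to `child_process.execFileSync('xdotool', args)` on a machine with X11 and xdotool installed.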

`async agent.click(target)`

Async · Auto-recovery

Uses three resolution strategies in order: selector lookup via CDP coordinates, text or aria-label matching from live page state, then drift alternatives based on prior non-stale triples.

Param	Type	Description
`target`	`string`	CSS selector or a `text:…` target.

Resolution order

Selector → text/aria-label → drift alternatives. The first successful resolution wins.
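The "first successful resolution wins" rule is an ordered fallback chain. A minimal sketch, assuming each strategy is a function that returns screen coordinates or `null`; the strategy names, the `page` stub, and `resolveTarget` itself are hypothetical, and the real strategies are asynchronous.

```javascript
// Try each strategy in order and return the first hit, tagged with its source.
function resolveTarget(target, strategies) {
  for (const { name, resolve } of strategies) {
    const hit = resolve(target);
    if (hit) return { strategy: name, ...hit };
  }
  return null; // nothing resolved; the caller decides how to fail
}

// Stand-in for live page state (normally read over CDP).
const page = {
  rects: { '#submit': { x: 120, y: 340 } },
  labels: { 'Sign in': { x: 200, y: 410 } },
};

const strategies = [
  { name: 'selector', resolve: t => page.rects[t] || null },
  {
    name: 'text',
    resolve: t => (t.startsWith('text:') ? page.labels[t.slice(5)] || null : null),
  },
  { name: 'drift', resolve: () => null }, // no learned alternatives in this stub
];

const hit = resolveTarget('text:Sign in', strategies);
console.log(hit); // resolved by the 'text' strategy
```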

`async agent.fill(selector, value)`

Async · Allow no-change

Clicks the field, selects existing text, and types or pastes the new value. Falls back from CSS selectors to label, placeholder, aria-label, id, or name matching when needed.

Param	Type	Description
`selector`	`string`	CSS selector or input descriptor.
`value`	`string`	Text to type.
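The descriptor fallback for `fill` could look roughly like this. The input record shape (`label`, `placeholder`, `ariaLabel`, `id`, `name`) and the exact-match rule are assumptions for the sketch; only the list of attributes comes from the description above.

```javascript
// Attributes tried in order when a CSS selector does not resolve.
const FALLBACK_FIELDS = ['label', 'placeholder', 'ariaLabel', 'id', 'name'];

// Return the first input whose attribute equals the descriptor (case-insensitive),
// along with the attribute that matched.
function findInput(descriptor, inputs) {
  const wanted = descriptor.trim().toLowerCase();
  for (const field of FALLBACK_FIELDS) {
    const match = inputs.find(input => (input[field] || '').toLowerCase() === wanted);
    if (match) return { input: match, matchedOn: field };
  }
  return null;
}

const inputs = [
  { id: 'login_field', name: 'login', label: 'Username or email address' },
  { id: 'password', name: 'password', label: 'Password' },
];

console.log(findInput('Password', inputs).matchedOn); // matched on the label
console.log(findInput('login', inputs).matchedOn);    // matched on the name attribute
```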

`async agent.submit(selector?)`

Async · Button or Return

Looks for a submit-capable element first, including button text such as Sign in, Log in, Submit, Continue, or Next. Falls back to a real Return keypress.
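As a rough sketch of that decision, assuming buttons arrive as plain records with `type` and `text` fields (a hypothetical shape), the choice between clicking and pressing Return might be expressed as:

```javascript
// Button copy listed in the description above, lower-cased for comparison.
const SUBMIT_LABELS = ['sign in', 'log in', 'submit', 'continue', 'next'];

// Prefer an explicit submit button, then a button with submit-like copy,
// and only then fall back to a Return keypress.
function pickSubmit(buttons) {
  const typed = buttons.find(b => b.type === 'submit');
  if (typed) return { action: 'click', button: typed };
  const labelled = buttons.find(b =>
    SUBMIT_LABELS.includes((b.text || '').trim().toLowerCase()));
  if (labelled) return { action: 'click', button: labelled };
  return { action: 'key', combo: 'Return' };
}

console.log(pickSubmit([{ type: 'button', text: 'Cancel' }, { type: 'button', text: 'Continue' }]));
console.log(pickSubmit([{ type: 'button', text: 'Cancel' }]));
```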

`async agent.observe()`

Async · Structural hashes

Takes a structural snapshot of the current page and returns state information stable across text rewrites. This is the browser-facing state read used to feed the WorldModel and DB.

Observe shape

{
  stateKey: "github.com::0ko33er::1fgr2ox",
  structHash: "1fgr2ox",
  routeHash: "0ko33er",
  meta: { domain: "github.com", url_path: "/login", title: "Sign in to GitHub" }
}
              

`async agent.key(combo)`

Async · Raw HID

Sends raw key combinations using xdotool syntax. This is useful for tab management, escape paths, or browser shortcuts.

Key examples
                
await agent.key('ctrl+t');
await agent.key('Escape');
await agent.key('ctrl+shift+j');

`achieve(goal)`

achieve() is the goal-directed entry point. It checks whether the goal is already satisfied, tries to replay any matching DNA plan, falls back to WorldModel planning, executes each step through the pipeline, re-plans on failure, and promotes successful sequences back into DNA as plan::... actions.

Goal field	Type	Description
`logged_in`	`boolean`	Satisfied when auth indicators appear in the structural state.
`domain`	`string`	Target hostname used for planning and gene lookup.
`url_contains`	`string`	Expected route fragment after success.
`page_type`	`string`	Target semantic page type such as `authenticated`.

Priority	Strategy	Source tag	Confidence
1	Replay a matching DNA goal plan.	`dna_plan`	High once promoted.
2	Plan from the current structural state via `WorldModel.plan()`.	`world_model`	Based on triple confidence.
3	Return current context for external reasoning if no plan exists.	`none`	0
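The priority table can be read as a three-step fallback. The sketch below assumes the DNA lookup and the WorldModel planner are injected as plain functions; `choosePlan`, `dnaLookup`, `wmPlan`, and `context` are hypothetical names, and only the source tags come from the table.

```javascript
// Pick a plan source in the documented priority order.
function choosePlan(goal, { dnaLookup, wmPlan, context }) {
  const replay = dnaLookup(goal);
  if (replay && replay.length) {
    return { source: 'dna_plan', steps: replay };
  }
  const planned = wmPlan(goal);
  if (planned && planned.steps.length) {
    return { source: 'world_model', steps: planned.steps, confidence: planned.confidence };
  }
  // No plan: hand the current context back for external reasoning.
  return { source: 'none', steps: [], confidence: 0, context: context() };
}

const deps = {
  dnaLookup: () => null, // no promoted gene yet
  wmPlan: () => ({ steps: ['click::text:Sign in', 'submit'], confidence: 0.64 }),
  context: () => ({ note: 'Turbo window and DB evidence would go here' }),
};

const chosen = choosePlan({ logged_in: true, domain: 'github.com' }, deps);
console.log(chosen.source, chosen.steps);
```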

Promotion path

When a sequence reaches the goal, the engine inserts a DNA gene keyed by the goal descriptor and stores the plan as a plan::label|||label… action for instant replay on the next episode.
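To make the stored format concrete, here is a small encoder and decoder for `plan::label|||label…` actions. The `plan::` prefix and `|||` separator are taken from the description above; `encodePlan`, `decodePlan`, and the `goalKey` canonicalisation are illustrative assumptions, not the engine's actual functions.

```javascript
const PLAN_PREFIX = 'plan::';
const STEP_SEPARATOR = '|||';

// Join step labels into a single replayable action string.
function encodePlan(labels) {
  if (labels.length === 0) throw new Error('a plan needs at least one step');
  for (const label of labels) {
    if (label.includes(STEP_SEPARATOR)) throw new Error(`label contains separator: ${label}`);
  }
  return PLAN_PREFIX + labels.join(STEP_SEPARATOR);
}

// Split a plan action back into step labels; non-plan actions yield null.
function decodePlan(action) {
  if (!action.startsWith(PLAN_PREFIX)) return null;
  return action.slice(PLAN_PREFIX.length).split(STEP_SEPARATOR);
}

// Hypothetical gene key: a goal descriptor with its fields in sorted order,
// so { a, b } and { b, a } map to the same gene.
function goalKey(goal) {
  return JSON.stringify(Object.keys(goal).sort().map(k => [k, goal[k]]));
}

const action = encodePlan(['fill::#login_field', 'fill::#password', 'click::text:Sign in']);
console.log(action);
console.log(decodePlan(action));
```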

Hardware

The hardware side mirrors the browser philosophy: read the real device stream, write through real kernel-facing virtual devices, and let Mumpix learn from the resulting state transitions. This keeps gamepads, MIDI controllers, and custom bridges inside the same memory substrate as browser actions.

`EvdevWatcher`

Discovers and opens /dev/input/event* readers, emits normalized input events, and lets you observe raw physical devices without modifying them.

Read from physical devices
                
const { EvdevWatcher } = require('./evdev');

const watcher = new EvdevWatcher().start();
const controller = watcher.open('/dev/input/event3');
controller.on('event', ev => console.log(ev));

`UinputDevice`

Creates virtual keyboard, mouse, or gamepad devices so the system can emit kernel-visible input instead of faking higher-level events.

Template	Use
`keyboardConfig(name)`	Text entry and desktop shortcuts.
`mouseConfig(name)`	Relative pointer and button events.
`gamepadConfig(name)`	Absolute axes and button controls.

`DeviceBridge` and WASM handlers

Bridges a real reader to a virtual device with an optional handler in between. The handler can be built-in or a custom WebAssembly module that parses and transforms device events.

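Bridge real → virtual

const { createRemapper } = require('./wasm-loader');
const DeviceBridge = require('./device-bridge');

// `controller` is the reader opened in the EvdevWatcher example above;
// `pad` is a virtual gamepad created through UinputDevice (setup not shown).
const remap = createRemapper({ 288: 304, 289: 305 }); // remap button codes 288 → 304 and 289 → 305
const bridge = new DeviceBridge({ source: controller, target: pad, handler: remap });
bridge.start();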
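For intuition, a remapping handler can be as small as a lookup on the event code. The sketch below is not the shipped `createRemapper`; `makeRemapper` and the `{ type, code, value }` event shape are assumptions. In the Linux input event codes, 288 and 289 are the joystick buttons `BTN_TRIGGER` and `BTN_THUMB`, while 304 and 305 are the gamepad buttons `BTN_SOUTH` and `BTN_EAST`.

```javascript
const EV_KEY = 0x01; // Linux input event type for keys and buttons

// Build a handler that rewrites button codes and passes everything else through.
function makeRemapper(map) {
  const table = new Map(Object.entries(map).map(([from, to]) => [Number(from), to]));
  return function remap(ev) {
    if (ev.type !== EV_KEY || !table.has(ev.code)) return ev;
    return { ...ev, code: table.get(ev.code) };
  };
}

const remapButtons = makeRemapper({ 288: 304, 289: 305 });

console.log(remapButtons({ type: EV_KEY, code: 288, value: 1 })); // code becomes 304
console.log(remapButtons({ type: 0x03, code: 0, value: 127 }));   // axis event, unchanged
```

Returning a new object instead of mutating the event keeps the source stream intact for any other listener on the same reader.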

Handler	Input	Output
`midi`	Raw MIDI byte buffer	Note/channel/velocity object
`ds4`	DualShock 4 HID report	Stick/trigger/button state object
`createRemapper(map)`	evdev event	Remapped evdev event
`./handler.wasm`	Any binary buffer	Custom JSON payload
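As an illustration of what a handler in this table does, the following sketch decodes a MIDI note message into a note/channel/velocity object. It follows the standard MIDI status-byte layout, but it is written in plain JavaScript for readability and is not the built-in `midi` handler; `parseMidi` and the output field names are assumptions.

```javascript
// Decode a 3-byte MIDI note-on / note-off message.
// Running status and other message types are not handled in this sketch.
function parseMidi(bytes) {
  if (bytes.length < 3) return null;
  const [status, data1, data2] = bytes;
  const kind = status & 0xf0;
  if (kind !== 0x90 && kind !== 0x80) return null;
  // A note-on with velocity 0 is conventionally treated as a note-off.
  const isOn = kind === 0x90 && data2 > 0;
  return {
    type: isOn ? 'noteOn' : 'noteOff',
    channel: (status & 0x0f) + 1, // channels are 1-16 in user-facing terms
    note: data1 & 0x7f,
    velocity: data2 & 0x7f,
  };
}

console.log(parseMidi(Uint8Array.of(0x90, 60, 100))); // middle C, note on, channel 1
console.log(parseMidi(Uint8Array.of(0x81, 60, 0)));   // note off, channel 2
```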

Memory Layers

These are the learning and planning layers that make the agent cumulative instead of stateless. The browser agent, hardware bridge, and any future domains all feed the same singletons through mumpix-core.js.

`WorldModel`

The causal memory store. It records state-action-effect triples and can plan forward from an observed structural state to a goal predicate.

WorldModel API examples

const { wm } = require('./mumpix-core');

wm.observe(
  { domain: 'github.com', page_type: 'auth', logged_in: false },
  'click::text:Sign in',
  { domain: 'github.com', page_type: 'authenticated', logged_in: true }
);

wm.effects({ domain: 'github.com' }, 'click::text:Sign in');
wm.candidates({ domain: 'github.com' }, { logged_in: true });
wm.plan({ domain: 'github.com', logged_in: false }, { logged_in: true });
wm.snapshot();
            

Planner

The planner is the forward-chain part of the WorldModel. It uses candidate transitions, triple confidence, and distance-to-goal to build a short action sequence.

Priority	Strategy	Source
1	DNA replay	Stored goal plans
2	Greedy best-first search	WorldModel triples
3	Context handoff	Turbo + DB for LLM reasoning
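A toy version of the second strategy helps show how triples turn into a plan. This is a sketch under stated assumptions, not the real `WorldModel`: `TinyWorldModel`, the count-based confidence formula, the subset matching of states, and the distance measure (number of unmet goal fields) are all invented for illustration.

```javascript
const keyOf = obj => JSON.stringify(Object.keys(obj).sort().map(k => [k, obj[k]]));
const matches = (state, pattern) => Object.entries(pattern).every(([k, v]) => state[k] === v);
const distance = (state, goal) => Object.entries(goal).filter(([k, v]) => state[k] !== v).length;

class TinyWorldModel {
  constructor() {
    this.triples = new Map();
  }

  // Record one state-action-effect observation; repeats raise the count.
  observe(from, action, to) {
    const key = `${keyOf(from)}|${action}|${keyOf(to)}`;
    const triple = this.triples.get(key) || { from, action, to, count: 0 };
    triple.count += 1;
    this.triples.set(key, triple);
  }

  // Triples whose precondition is satisfied by the given state.
  candidates(state) {
    return [...this.triples.values()]
      .filter(t => matches(state, t.from))
      .map(t => ({ ...t, confidence: t.count / (t.count + 1) }));
  }

  // Greedy best-first search: always expand the state closest to the goal,
  // breaking ties in favour of the more confident path.
  plan(start, goal, maxSteps = 8) {
    const frontier = [{ state: start, steps: [], confidence: 1 }];
    const seen = new Set([keyOf(start)]);
    while (frontier.length) {
      frontier.sort((a, b) =>
        distance(a.state, goal) - distance(b.state, goal) || b.confidence - a.confidence);
      const node = frontier.shift();
      if (distance(node.state, goal) === 0) {
        return { steps: node.steps, confidence: node.confidence };
      }
      if (node.steps.length >= maxSteps) continue;
      for (const t of this.candidates(node.state)) {
        const next = { ...node.state, ...t.to };
        const nextKey = keyOf(next);
        if (seen.has(nextKey)) continue;
        seen.add(nextKey);
        frontier.push({
          state: next,
          steps: [...node.steps, t.action],
          confidence: node.confidence * t.confidence,
        });
      }
    }
    return null; // no route found; fall through to context handoff
  }
}

const model = new TinyWorldModel();
model.observe({ page_type: 'home' }, 'click::text:Sign in', { page_type: 'auth' });
model.observe({ page_type: 'auth' }, 'submit', { page_type: 'authenticated', logged_in: true });

const route = model.plan(
  { domain: 'github.com', page_type: 'home', logged_in: false },
  { logged_in: true }
);
console.log(route.steps); // two steps: the click, then the submit
```

Greedy best-first search is fast on small graphs but does not guarantee the shortest or most reliable route, which is one reason a replan-on-failure loop sits above the planner.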

REST plan example

curl -X POST localhost:7779/wm/plan \
  -H 'Content-Type: application/json' \
  -d '{"state":{"domain":"github.com"},"goal":{"logged_in":true}}'

`DriftDetector`

Tracks failure streaks for action keys, suppresses stale paths, and surfaces semantically similar alternatives using Jaccard similarity over action tokens.

Constant	Value	Description
`STALE_THRESHOLD`	`3`	Consecutive failures before suppression.
`RECOVER_THRESHOLD`	`2`	Consecutive successes before reactivation.
`REVALIDATE_AGE`	`24h`	Age before the triple is queued for re-probing.
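The two thresholds and the similarity measure can be sketched in a few lines. `StreakTracker`, `jaccard`, `alternatives`, the tokenisation on non-alphanumeric characters, and the 0.5 cut-off are assumptions for this example; only the threshold values and the use of Jaccard similarity over action tokens come from the text above.

```javascript
const STALE_THRESHOLD = 3;   // consecutive failures before suppression
const RECOVER_THRESHOLD = 2; // consecutive successes before reactivation

class StreakTracker {
  constructor() {
    this.entries = new Map();
  }

  // Record one outcome and return whether the action is now considered stale.
  record(actionKey, ok) {
    const e = this.entries.get(actionKey) || { failures: 0, successes: 0, stale: false };
    if (ok) {
      e.successes += 1;
      e.failures = 0;
      if (e.stale && e.successes >= RECOVER_THRESHOLD) e.stale = false;
    } else {
      e.failures += 1;
      e.successes = 0;
      if (e.failures >= STALE_THRESHOLD) e.stale = true;
    }
    this.entries.set(actionKey, e);
    return e.stale;
  }
}

const tokens = key => new Set(key.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));

// Jaccard similarity: shared tokens divided by all distinct tokens.
function jaccard(a, b) {
  const left = tokens(a);
  const right = tokens(b);
  const shared = [...left].filter(t => right.has(t)).length;
  const union = left.size + right.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Rank known actions by similarity to a stale one.
function alternatives(staleKey, knownKeys, minScore = 0.5) {
  return knownKeys
    .filter(k => k !== staleKey)
    .map(k => ({ key: k, score: jaccard(staleKey, k) }))
    .filter(c => c.score >= minScore)
    .sort((a, b) => b.score - a.score);
}

console.log(jaccard('click::text:Sign in', 'click::text:Log in')); // 3 shared of 5 tokens
console.log(alternatives('click::text:Sign in', ['click::text:Log in', 'key::ctrl+t']));
```

In this sketch a suppressed action only accumulates successes if something re-probes it, which is the role the re-validation queue plays in the table above.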

`ActionPipeline`

This is the direct access point for custom actions that still need the full observation and learning harness. It is how the agent keeps drift, WorldModel, Turbo, DB, and Mumpix synchronized on every action.

Direct pipeline access

await agent.pipeline.run(
  async () => {
    await agent.hid.focusChrome();
    await agent.hid.key('ctrl+l');
  },
  'key',
  'ctrl+l'
);
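Conceptually, each `run()` call wraps the action in the five pipeline stages: pre-snap, execute, post-snap, record, drift. The sketch below shows that harness with in-memory stand-ins. It is synchronous for brevity (the real pipeline is async), and `runPipeline`, the `deps` object, and the `kind::label` action key are assumptions modelled on the examples in this document.

```javascript
// Run one action inside the observe-and-learn harness.
function runPipeline(action, kind, label, deps) {
  const actionKey = `${kind}::${label}`;
  const before = deps.snapshot();          // pre-snap
  let ok = true;
  let error = null;
  try {
    action();                              // execute
  } catch (err) {
    ok = false;
    error = err;
  }
  const after = deps.snapshot();           // post-snap
  deps.record({ before: before.stateKey, actionKey, after: after.stateKey, ok }); // record
  deps.drift(actionKey, ok);               // drift bookkeeping
  return { ok, actionKey, changed: before.stateKey !== after.stateKey, error };
}

// Usage with stand-ins for the snapshot reader, WorldModel, and drift detector.
let current = 'github.com::r1::s1';
const log = [];
const outcome = runPipeline(
  () => { current = 'github.com::r2::s2'; },
  'key',
  'ctrl+l',
  {
    snapshot: () => ({ stateKey: current }),
    record: triple => log.push(triple),
    drift: (key, ok) => log.push({ key, ok }),
  }
);
console.log(outcome, log);
```

Here `ok` only means the action did not throw; whether an unchanged state counts as success is left to the caller in this sketch.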
            

REST APIs

Three ports make up the running system: the browser service on 7780, the hardware service on 7781, and the meta service on 7779. All three share the same Mumpix core and memory file.

Browser API

Port 7780.

Method	Endpoint	Description
GET	`/browser/snapshot`	Full page state: URL, title, text, links, forms, buttons, alerts.
GET	`/browser/rect?sel=…`	Element screen coordinates for HID targeting.
POST	`/act/goto`	HID navigation via address bar keystrokes.
POST	`/act/click`	Selector → CDP coords → xdotool click.
POST	`/act/fill`	Selector → coords → click → type.
POST	`/act/submit`	Submit via focused field or Return key.

Hardware API

Port 7781.

Method	Endpoint	Description
GET	`/devices`	List physical evdev devices.
POST	`/devices/open`	Open a real device reader.
GET	`/devices/:id/stream`	SSE stream of input events.
POST	`/virtual`	Create a virtual keyboard, mouse, or gamepad.
POST	`/bridge`	Connect real → virtual with an optional handler.
DELETE	`/bridge/:id`	Stop a bridge.

Meta API

Port 7779.

Method	Endpoint	Description
GET	`/stats`	Combined Mumpix, WorldModel, DB, Turbo, and DNA stats.
GET	`/wm/triples`	Recent causal triples.
POST	`/wm/plan`	Plan a route from current state to a goal predicate.
GET	`/dna/genes`	Inspect promoted genes and their current state.
POST	`/dna/decay`	Run dormancy selection manually.
POST	`/save`	Flush persistence immediately.

Model Catalog API

Developers can inspect the shipped Jetson model catalog through the shared router surface, then use the control API to switch the active model when they administer that device.

Method	Endpoint	Description
GET	`/api/router/v1/models`	Visible models plus active model state.
GET	`/api/router/v1/models/catalog`	Curated shipped model list with speed, VRAM, and context metadata.
GET	`/control-api/models/status`	Current model, service state, and live router visibility.
POST	`/control-api/models/load`	Load a shipped model by slug, display name, or full HF id.
POST	`/control-api/models/eject`	Eject the active model and stop the runtime cleanly.

Shipped Jetson models

Tag	Model	Speed	VRAM	Context
RECOMMENDED	`llama32-3b`	28–38 tok/s	~2.0 GB	128K
FASTEST	`llama32-1b`	55–70 tok/s	~0.9 GB	128K
HIGH QUALITY	`llama31-8b`	12–18 tok/s	~4.5 GB	128K
CODING	`qwen25-coder-3b`	26–34 tok/s	~2.0 GB	32K
COMPACT	`phi35-mini`	25–32 tok/s	~2.2 GB	128K
COMPACT	`gemma2-2b`	32–42 tok/s	~1.5 GB	8K
NEAR LIMIT	`mistral-7b`	13–19 tok/s	~4.1 GB	32K

Current scope

The developer API supports listing, loading, ejecting, and checking model state. Downloading brand-new models stays separate for now so the device contract remains predictable.

Internals

These pieces are what make the stack robust across copy changes, retries, and long-running sessions. They are worth understanding because they explain why the agent can keep working when the surface text changes but the structure stays the same.

`state-hash`

Hash	Captures	Breaks on	Used for
`structHash`	Form shapes, element roles, aria-labels, input types	Actual structural redesign	Primary WorldModel state key
`routeHash`	Hostname + normalized path	Route rename	Route-level fallback
`contentHash`	Visible text and alerts	Any text rewrite	Exact change detection
`pageKey`	`domain::routeHash::structHash`	Route or structure change	Canonical state identity

Why it matters

If GitHub changes button copy from “Sign in” to “Log in,” the state key survives because the form structure and element roles are unchanged.
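The effect is easy to reproduce with a toy hash. The sketch below is illustrative only: the real `state-hash` module may extract different features and use a different hash function. `pageHashes`, the element record shape, the path normalisation, and the djb2-style hash are all assumptions.

```javascript
// Small non-cryptographic string hash (djb2 variant), rendered in base 36.
function hash(text) {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(36);
}

// Hypothetical normalisation: drop trailing slashes, keep "/" for the root.
const normalisePath = p => p.replace(/\/+$/, '') || '/';

function pageHashes(page) {
  // Structure: tags, roles, input types, aria-labels. Visible text is excluded.
  const structure = page.elements
    .map(e => [e.tag, e.role || '', e.inputType || '', e.ariaLabel || ''].join(':'))
    .join('|');
  // Content: visible text only.
  const content = page.elements.map(e => e.text || '').join('|');
  const structHash = hash(structure);
  const routeHash = hash(page.host + normalisePath(page.path));
  return {
    structHash,
    routeHash,
    contentHash: hash(content),
    pageKey: `${page.host}::${routeHash}::${structHash}`,
  };
}

const loginPage = buttonCopy => ({
  host: 'github.com',
  path: '/login',
  elements: [
    { tag: 'input', inputType: 'text' },
    { tag: 'input', inputType: 'password' },
    { tag: 'button', role: 'button', text: buttonCopy },
  ],
});

const before = pageHashes(loginPage('Sign in'));
const after = pageHashes(loginPage('Log in'));
console.log(before.pageKey === after.pageKey);         // true: same structure and route
console.log(before.contentHash === after.contentHash); // false: the copy changed
```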

`mumpix-core`

Within a single Node process, the require() cache makes this module a singleton: every service booted in that process sees the same wm, dna, db, and turbo instances. That is what lets browser, hardware, and bridge events all accumulate into one live memory substrate.

Singleton import
                
const { wm, dna, db, turbo } = require('./mumpix-core');
console.log(wm.stats());

Evidence tiers

Tier	Multiplier	Meaning
LOW	×1	Observed once or twice; still noisy.
MEDIUM	×2	Repeated evidence but not yet resilient under variation.
HIGH	×3	Reliable across multiple episodes and structural states.
PROVEN	×4	Promoted into DNA and ready for direct replay.

Decay and purge

Weak genes are first marked DISABLED rather than deleted, so they can come back when the environment changes. Hard deletion is a separate purge() path for truly exhausted genes.
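The distinction between the soft and hard paths might look like this in code. The gene fields (`fitness`, `failedProbes`), the thresholds, and the `decay`, `revive`, and `purge` helpers shown here are hypothetical; only the DISABLED state and the existence of a separate purge step come from the text above.

```javascript
const ACTIVE = 'ACTIVE';
const DISABLED = 'DISABLED';

// Soft pass: flag weak genes but keep them in the library.
function decay(genes, minFitness = 0.2) {
  for (const gene of genes) {
    if (gene.state === ACTIVE && gene.fitness < minFitness) gene.state = DISABLED;
  }
  return genes;
}

// A disabled gene can be switched back on if it starts working again.
function revive(gene) {
  if (gene.state === DISABLED) {
    gene.state = ACTIVE;
    gene.failedProbes = 0;
  }
  return gene;
}

// Hard pass: drop only genes that are disabled and have exhausted their re-probes.
function purge(genes, maxFailedProbes = 5) {
  return genes.filter(g => !(g.state === DISABLED && g.failedProbes >= maxFailedProbes));
}

const library = [
  { goal: 'login', state: ACTIVE, fitness: 0.9, failedProbes: 0 },
  { goal: 'checkout', state: ACTIVE, fitness: 0.1, failedProbes: 1 },
  { goal: 'legacy-export', state: DISABLED, fitness: 0.0, failedProbes: 7 },
];

decay(library);
const kept = purge(library);
console.log(kept.map(g => `${g.goal}:${g.state}`));
```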
