May 13, 2026/Maurice Kleine

What Dies When Your Sandbox Does

When an ephemeral sandbox tears down, you lose more than compute. Installed packages, browser sessions, working files, credentials in memory, and the small habits an agent built up to do its job all vanish. Persistent runtime semantics decide whether that loss is acceptable.

When the sandbox dies, the compute is the cheapest thing you lose.

The instance was already replaceable. What goes with it is the part that actually mattered.

Installed packages. The dependency tree the agent took ninety seconds to resolve. The browser session that just finished a login flow. The working directory where the agent had been carefully placing intermediate files.

The compute was substrate. The state was the work.

For short tasks, that loss doesn't register. Run code, return output, tear down. The agent never accumulated anything worth preserving.

The moment an agent does real work, the math inverts.

What actually dies

A run that gets cut off doesn't just lose its current step. It loses everything the agent built up to be useful in the first place:

  • Installed packages. Every npm install, every uv pip install, every apt-get install, every transitive dependency resolved against a current registry. Reinstalling looks free until you do it on every run and notice the floor is closer to ninety seconds than ten.
  • Browser sessions. Cookies, local storage, the active login on a site that uses MFA. A browser agent that logged in once is a browser agent that loses access on restart.
  • Working files. Drafts, intermediate outputs, debug artifacts, the half-written report the agent will never see again.
  • Credentials in memory. Tokens minted for this run, decrypted secrets, ephemeral keys that were never meant to be written to disk and now have to be reissued.
  • The agent's own context. The notes it took about what it tried, what failed, and where it left off.

Some of these can be reconstructed cheaply. Most can't be reconstructed at all without losing the determinism the agent was supposed to give you.

Why ephemeral was the default

There's a reason early sandbox products are ephemeral. It makes the security model simple: nothing survives, so nothing leaks. It makes scheduling simple: any worker can run any job. It makes pricing simple: charge by the second, retain nothing, owe nothing.

That model is correct for stateless functions. A request comes in, code runs, output goes out. The function doesn't know who called it and doesn't need to.

Agents aren't functions. They iterate. They install. They explore. They get interrupted and resume. They build up small habits that are part of how they do their job.

Treating an agent like a stateless function works until the first time the agent needs to remember anything.

Persistence is not "leave the VM running"

The wrong fix is to leave one VM running forever. That trades the ephemeral problem for a different bill: idle compute, patching drift, lifecycle ambiguity, and a slow accumulation of one-off changes nobody can re-create.

We tried this in the first version of Spinup. One VPS per agent, always on. Great for isolation. Bad for everything else: drift between machines, no central way to roll out fixes, patching one server at a time through a rescue-mode web CLI. The machine was alive. The runtime semantics weren't.

Persistence is a runtime semantic, not a hardware state. The contract is simple: when the agent comes back, the environment looks the way it did when it left. The machine that runs it can be stopped, restored, replaced, or migrated as long as the contract holds.

Snapshots are how this is usually expressed. A clean point-in-time view of the environment, restorable on demand, owned by the agent rather than by one machine.
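
To make the contract concrete, here is a minimal sketch in Python. It is not how Spinup implements snapshots. It assumes the "environment" is just a working directory and that a local folder stands in for durable snapshot storage; the helper names (snapshot, restore, listing, demo) are hypothetical and exist only for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(workdir: Path, store: Path, name: str) -> Path:
    """Copy workdir into the snapshot store under `name` (hypothetical helper)."""
    dest = store / name
    shutil.copytree(workdir, dest)  # raises if a snapshot with this name exists
    return dest


def restore(snap: Path, target: Path) -> Path:
    """Rebuild a working directory on a fresh 'box' from a snapshot."""
    shutil.copytree(snap, target)  # target must not exist yet
    return target


def listing(root: Path) -> dict[str, bytes]:
    """Map each file's relative path to its contents, for comparison."""
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def demo() -> bool:
    """Tear down one box, restore onto another, and check the contract."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        store = base / "store"  # assumption: stands in for durable storage
        store.mkdir()

        box_a = base / "box-a"
        (box_a / "notes").mkdir(parents=True)
        (box_a / "notes" / "progress.md").write_text("step 3 of 5 done\n")
        (box_a / "report.draft").write_text("half-written\n")

        before = listing(box_a)
        snap = snapshot(box_a, store, "snap-0001")
        shutil.rmtree(box_a)  # the box is gone

        box_b = restore(snap, base / "box-b")  # a different "machine"
        return listing(box_b) == before


if __name__ == "__main__":
    print("contract holds:", demo())
```

The check at the end is the contract in one line: what the agent sees after the restore equals what it saw before the teardown, even though the directory it runs in is a different one.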

The agent should outlive the box

If the machine is replaceable, the durable object has to be something else. In Spinup, that object is the agent: a workspace-owned record with its own canonical configuration, harness, skills, secrets, network policy, snapshot history, and run history.

The box underneath is provisioned, used, stopped, and replaced as needed. The agent stays the same. When the next run begins, the environment is reconstituted from the agent's record onto whatever box is available.
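
One way to picture that split is as a data model. The sketch below is illustrative only: the field names follow the list above, but the types, the AgentRecord and Box classes, and the start_run and end_run functions are assumptions made for this example, not Spinup's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentRecord:
    """The durable object. Every field name and type here is hypothetical."""
    agent_id: str
    workspace_id: str
    config: dict[str, str]
    harness: str
    skills: list[str] = field(default_factory=list)
    secret_refs: list[str] = field(default_factory=list)  # references, not values
    network_policy: dict[str, list[str]] = field(default_factory=dict)
    snapshots: list[str] = field(default_factory=list)  # oldest first
    runs: list[dict[str, str]] = field(default_factory=list)


@dataclass
class Box:
    """The replaceable part: whatever machine happens to be available."""
    box_id: str
    restored_from: Optional[str]
    harness: str
    allowed_hosts: list[str]


def start_run(agent: AgentRecord, box_id: str) -> Box:
    """Reconstitute the environment from the agent's record onto a box."""
    latest = agent.snapshots[-1] if agent.snapshots else None
    return Box(
        box_id=box_id,
        restored_from=latest,
        harness=agent.harness,
        allowed_hosts=list(agent.network_policy.get("allow", [])),
    )


def end_run(agent: AgentRecord, box: Box, snapshot_id: str) -> None:
    """Record the new snapshot on the agent; the box can now be discarded."""
    agent.snapshots.append(snapshot_id)
    agent.runs.append({
        "box_id": box.box_id,
        "started_from": box.restored_from or "clean",
        "saved_as": snapshot_id,
    })


if __name__ == "__main__":
    agent = AgentRecord(
        agent_id="agent-1",
        workspace_id="ws-1",
        config={"model": "example"},
        harness="harness-a",
        network_policy={"allow": ["example.com"]},
    )
    first = start_run(agent, "box-a")
    end_run(agent, first, "snap-0001")
    second = start_run(agent, "box-b")  # different box, same agent
    print(second.restored_from)
```

Nothing in the record points at a particular machine. The box id only shows up in the run history, as a fact about the past.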

This is why the persistent-sandbox question has a different answer for agents than for IDEs or stateless code execution. What you want to keep is the agent that knows how to use the machine.

What this changes in practice

When the runtime layer treats persistence as a first-class semantic:

  • Long-running browser flows survive a restart instead of starting over at the login screen.
  • Coding agents stop reinstalling the same toolchain on every run.
  • Multi-step workflows can checkpoint and resume without writing their own state machine.
  • Harness swaps stop being rebuilds. The environment is the agent's, not the harness's.
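
As a rough illustration of the checkpoint-and-resume point, the sketch below shows how little the workflow side needs once the working directory survives a restart: a list of finished step names in a file. Everything here is an assumption for the example, including the run_workflow function, the progress.json file name, and the interrupt_before parameter that simulates a teardown. A temporary directory stands in for the persistent working directory.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional

Step = tuple[str, Callable[[], None]]


def run_workflow(
    steps: list[Step],
    progress_file: Path,
    interrupt_before: Optional[str] = None,
) -> list[str]:
    """Run steps in order, skipping any already recorded as finished.

    `interrupt_before` simulates the box being torn down mid-run.
    Returns the names of the steps executed in this call.
    """
    finished = (
        json.loads(progress_file.read_text()) if progress_file.exists() else []
    )
    executed = []
    for name, action in steps:
        if name in finished:
            continue  # done in an earlier run; the record survived the restart
        if name == interrupt_before:
            raise RuntimeError(f"box torn down before step {name!r}")
        action()
        finished.append(name)
        progress_file.write_text(json.dumps(finished))  # checkpoint each step
        executed.append(name)
    return executed


def demo() -> tuple[list[str], list[str]]:
    """Interrupt a three-step workflow, then resume it from the progress file."""
    calls: list[str] = []
    steps: list[Step] = [
        (name, lambda name=name: calls.append(name))
        for name in ("login", "fetch", "report")
    ]
    with tempfile.TemporaryDirectory() as workdir:
        progress = Path(workdir) / "progress.json"
        try:
            run_workflow(steps, progress, interrupt_before="report")
        except RuntimeError:
            pass  # the first run was cut off
        resumed = run_workflow(steps, progress)
    return calls, resumed


if __name__ == "__main__":
    calls, resumed = demo()
    print(calls)    # ['login', 'fetch', 'report']
    print(resumed)  # ['report']
```

The login step runs once across both runs. In an ephemeral sandbox the progress file would be gone along with the box, and the second run would start from the login screen again.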

None of this is exotic. It's what serious agent work needs to be tractable.

The first step toward agent identity is admitting that an agent has things worth keeping.

If you're building agents that should outlive the box they happen to be running on, that's what we're building Spinup for: a cloud agent runtime and persistent sandbox for AI agents where the agent stays put.

Join early access or follow along here as we publish what we learn.