Hixie's Natural Log: llmdevsilo: Using LLMs to develop code securely

2026-06-12 23:55 UTC llmdevsilo: Using LLMs to develop code securely

Over the past few months I've been writing notes on how I would write an LLM-based development tool. Specifically, one that gives the LLM the power to run programs (especially compilers and tests) without my having to either check every command for unexpected side-effects or trust an LLM to do that checking for me, so as to avoid a disaster.

This week, I dumped the entire notes document into Anthropic's Fable with a 1M context window and "Ultracode" mode, and gave it the following prompt:

Implement this design in ~/dev/llmdevsilo/

Continue uninterrupted until it is complete.

Two hours and about 2 million tokens later (approx $10 cost based on the plan I'm using), it had implemented the entire thing.

There were cosmetic issues, to be sure. The native UI overflowed on small window sizes, the web client needed a tweak to work around browsers not liking the self-signed cert on the local websocket, that kind of thing. But overall it had implemented the entire thing. One shot, first attempt. (I'm not counting the two false starts caused by bugs in the Claude desktop app; those attempts were canceled within the first few minutes and didn't influence the prompt of the actual attempt.) Fable is... well, relentlessly proactive is pretty accurate.

So, I can now present llmdevsilo, or just "Silo" as Claude decided to call it. The details are in the design doc (and the additional documentation Claude generated, some as a result of additional prompting later), but at a high level:

You launch a desktop app, and tell it to start a session ("harness"), pointing it at a directory and a model. Currently it supports the Anthropic REST API, the OpenAI REST and WebSocket APIs, and local models. (I've no idea how the local model stuff works; the design notes just said to support it and something was implemented for it, but I haven't tested it.)
A harness process is instantiated. The directory you specified gets locked behind a disk image and mounted inside a sandbox. The only sandbox I've tested so far is macOS sandbox-exec, which seems to be the state of the art on Mac (despite being deprecated). On Linux it supports gVisor, but I haven't tested that yet.
You can now talk to the LLM, and the LLM can run programs in the sandbox (it is given a set of tools similar to what the Claude interfaces implement, like Read/Write, Bash, etc). It cannot read your local files, only what was in the directory you gave it and a select set of binaries (like those in /usr/bin). The sandbox is also provided with a scratch directory.
By default, any code running in the sandbox (and thus the LLM) can't access the network. If configured, the harness can provide the sandbox with network access via an HTTP proxy. The proxy supports TLS: there's a whole mechanism whereby a freshly minted temporary root CA is injected into the sandbox and the proxy generates certificates on the fly. Under gVisor, DNS is also proxied; only allow-listed names can be resolved.
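
To make the "only allow-listed names" idea concrete, here is a minimal sketch of the kind of decision a proxy or DNS filter has to make. It is not Silo's actual code: the function name, the `*.suffix` wildcard syntax, and the example host names are all assumptions made for illustration.

```python
def host_allowed(host: str, allow_list: list[str]) -> bool:
    """Return True if host matches an exact entry or a '*.suffix' entry.

    Hypothetical helper: the wildcard syntax is an assumption, not
    something the design doc is known to specify.
    """
    # Normalize case and any trailing dot from fully qualified names.
    host = host.rstrip(".").lower()
    for pattern in allow_list:
        pattern = pattern.rstrip(".").lower()
        if pattern.startswith("*."):
            suffix = pattern[1:]  # e.g. ".example.com"
            # Match subdomains only; the bare suffix itself does not match.
            if host.endswith(suffix) and len(host) > len(suffix):
                return True
        elif host == pattern:
            return True
    # Default deny: anything not listed is refused.
    return False


if __name__ == "__main__":
    allow = ["crates.io", "*.example.com"]
    for name in ["crates.io", "api.example.com", "evilexample.com"]:
        print(name, host_allowed(name, allow))
```

Note that the suffix comparison includes the leading dot, so a look-alike such as `evilexample.com` does not slip past a `*.example.com` entry, and an empty list allows nothing, which mirrors the no-network default.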

What all this means is you can use an LLM with untrusted third-party dependencies, give it access to the web, give it a real dev environment with compilers, and be safe in the knowledge that none of your secrets will be exfiltrated, your hard disk won't be wiped, and you never need to see a permissions prompt again, so no risk of prompt fatigue. This isn't the inherently imperfect "model watches the model" security; it's enforced by real sandbox boundaries.
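
For a sense of what "real sandbox boundaries" look like on macOS, the following sketch builds a deny-by-default profile of the sort `sandbox-exec` accepts. It is illustrative only and untested against Silo: the helper names and paths are hypothetical, and a profile that can actually run a compiler would need further allowances (system libraries, the dynamic linker, and so on) that are left out here.

```python
def sbpl_quote(path: str) -> str:
    """Quote a path as a string literal for the profile language."""
    return '"' + path.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_profile(workspace: str, scratch: str, binary_dirs: list[str]) -> str:
    """Build a minimal deny-by-default sandbox profile (hypothetical sketch)."""
    lines = [
        "(version 1)",
        "(deny default)",        # everything not listed below is refused
        "(allow process-fork)",  # let tools spawn child processes
    ]
    for d in binary_dirs:
        # Selected binaries may be read and executed, never written.
        lines.append(f"(allow file-read* (subpath {sbpl_quote(d)}))")
        lines.append(f"(allow process-exec (subpath {sbpl_quote(d)}))")
    for d in (workspace, scratch):
        # The mounted project directory and the scratch directory are writable.
        lines.append(f"(allow file-read* (subpath {sbpl_quote(d)}))")
        lines.append(f"(allow file-write* (subpath {sbpl_quote(d)}))")
    # No rule grants network access, so the default deny covers it.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_profile("/Volumes/silo-work", "/tmp/silo-scratch", ["/usr/bin"]))
```

The point of the shape is that access is granted by enumeration: the home directory, SSH keys, and the network are out of reach because nothing allows them, not because something remembered to forbid them.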

A core part of the design that I like is that the UI is separate from the harness. You can run the harness on one computer and connect to it from another, or from your phone. There is a Flutter-based app UI that works as a native app on desktop and mobile (so far only tested on macOS) and also works on web. There is a Rust-based terminal client. In principle there could be any number of other apps too. The UIs have a secure mechanism to connect (by design; as with everything else, I haven't studied the implementation). To add another client, you get a pairing code from an existing client and connect; they then mint asymmetric key pairs for reconnecting securely later.
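
The pairing-code half of that flow can be sketched as below. This is a guess at the general shape, not the implementation: the class name, the 8-digit format, the two-minute lifetime, and the single-use rule are all assumptions, and the key-pair minting that follows a successful pairing is omitted because it needs a cryptography library outside the standard library.

```python
import hmac
import secrets
import time


class PairingCodes:
    """Issue short-lived, single-use pairing codes (hypothetical sketch)."""

    def __init__(self, ttl_seconds: float = 120.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._pending: dict[str, float] = {}  # code -> expiry time

    def issue(self) -> str:
        # Use the CSPRNG; zero-pad so every code has eight digits.
        code = f"{secrets.randbelow(10**8):08d}"
        self._pending[code] = self._clock() + self._ttl
        return code

    def redeem(self, candidate: str) -> bool:
        now = self._clock()
        # Forget expired codes before comparing.
        self._pending = {c: t for c, t in self._pending.items() if t > now}
        for code in list(self._pending):
            # Constant-time comparison avoids leaking matching prefixes.
            if hmac.compare_digest(code.encode(), candidate.encode()):
                del self._pending[code]  # single use
                return True
        return False


if __name__ == "__main__":
    codes = PairingCodes()
    code = codes.issue()
    print(codes.redeem(code), codes.redeem(code))  # second attempt is refused
```

Once a code is redeemed, the new client would generate its own key pair and register the public half with the harness, so later reconnections do not depend on the short code at all.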

There are some limitations.

The biggest is that I haven't audited any of this code yet. It could be full of holes. The entire codebase is itself written by an LLM. The security-sensitive parts are mainly in Rust and I am not fluent in Rust, nor in the security technologies it attempts to leverage and implement.

Another big one is that if you're not able to use local models, you will need to use one of the APIs, which are a lot more expensive than the plans. For example, eyeballing the price tables, I think the original one-shot to create this would have cost about $100-$150 in about two hours. OpenAI and Anthropic only seem to allow you to use their plans with their own software; custom software is billed by the token.
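
The arithmetic behind that comparison is simple enough to write down. The sketch below uses only the rough figures quoted in this post (about 2 million tokens, about $10 on the plan, an eyeballed $100 to $150 via the API) and treats every token as costing the same, which real price tables do not; the function name is made up for illustration.

```python
def blended_rate_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Average cost in USD per million tokens, ignoring input/output splits."""
    return total_cost_usd * 1_000_000 / total_tokens


TOKENS = 2_000_000  # "about 2 million tokens", per the post

plan_rate = blended_rate_per_million(10, TOKENS)       # plan-based estimate
api_low_rate = blended_rate_per_million(100, TOKENS)   # low API estimate
api_high_rate = blended_rate_per_million(150, TOKENS)  # high API estimate

if __name__ == "__main__":
    print(f"plan: ${plan_rate:.2f} per million tokens")
    print(f"API:  ${api_low_rate:.2f} to ${api_high_rate:.2f} per million tokens")
    print(f"API is roughly {api_low_rate / plan_rate:.0f}x to "
          f"{api_high_rate / plan_rate:.0f}x the plan cost")
```

On those inputs the plan works out to about $5 per million tokens against $50 to $75 through the API, a factor of roughly 10 to 15.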

The final limitation worth being explicit about is that while everything outside the sandbox is supposed to be safe from exfiltration, if you allow any network egress then the contents of the sandbox are not. This matters if what you're developing isn't going to be open source, for example.

I am happy to accept patches, but I may be slow to adopt them. Feel free to fork and run with this if you are interested. I have not finished tweaking it; I decided to post this in its current state because I ran out of tokens, and it seemed interesting enough to publish as is. The first commit in the repo is what the original prompt generated; subsequent commits are from additional discussions with the agent.