
BEAM Live Introspection for AI Coding Agents

TL;DR — Give your AI coding agent (GitHub Copilot, Claude Code, OpenAI Codex, Gemini CLI) a reusable "skill" that lets it start, connect to, and introspect a running Elixir/BEAM node. Instead of guessing at runtime behavior or writing throwaway test scripts, the agent can query GenServer state, inspect supervision trees, poke ETS tables, and hot-reload code — all through a single shell script.

This article is self-contained: point your coding agent at it and say "Adopt this pattern for my project."




The Problem

When an AI coding agent works on an Elixir project, it typically has two options for validating its changes: run the tests (mix test) or reason about the code statically. Neither lets it observe a live system — check whether a GenServer has the right state, whether a supervision tree recovered from a crash, or whether a message actually arrived.

The BEAM VM has world-class introspection built in. Every Erlang/Elixir node can be connected to from another node using distributed Erlang. The trick is teaching the coding agent how to use this.

Why this matters: "The Soul of Erlang and Elixir"

In his talk "The Soul of Erlang and Elixir", Saša Jurić demonstrates exactly this capability against a live system. He SSHs into a running server, opens a remote console, and without restarting anything, drills into the problem:

"BEAM is a runtime which is highly debuggable, introspectable, observable if you will. BEAM allows us to hook into the running system and peek and poke inside it and get a lot of useful information — and I don't need to set some special flags, restart the system and whatnot. I can do this by default." — Saša Jurić, 20:43

From the remote shell, he lists all processes, identifies the CPU-hogging one by its reduction count, gets its stack trace, traces its function calls, kills it with Process.exit(pid, :kill) — and the rest of the system keeps running at 10K requests/second, undisturbed. Then he hot-deploys a fix into the running production node without a restart.

"I was able to approach the system and look from inside it to figure out what the problems are, quickly fix those problems, and deploy into production without disturbing anything in the system itself. This is what I want from my tool." — Saša Jurić, 29:27

This is exactly what we're giving to AI coding agents: the same remote-shell-into-a-live-system capability that Saša demonstrates manually, but wrapped in a scriptable interface (dev_node.sh rpc) that a coding agent can invoke without needing an interactive TTY. The agent becomes the operator, SSHing into the running BEAM.

The Pattern

Three pieces work together:

  1. The project launches with a known node name and cookie — so the agent's helper script can connect.

  2. A dev_node.sh script in the project provides start, stop, status, rpc, and eval_file commands.

  3. A skill definition tells the coding agent when and how to use live introspection.

BEAM Live Introspection Pattern

The RPC node is started as a hidden node (--hidden flag). In distributed Erlang, hidden nodes do not participate in the global cluster mesh — they don't trigger transitive connections, don't appear in nodes(), and are invisible to :global process registration. This is exactly what we want: the introspection node should observe the system without joining it as a peer or causing the cluster to attempt scheduling work on it.


Step 1: Enable Your Project for Introspection

Your application must start with a short name (--sname) and a cookie (--cookie). The simplest approach is a run script in the project root:

Create run
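A minimal sketch of such a run script, assuming a Phoenix app and the devcookie convention used throughout this article:

```shell
#!/bin/sh
# run -- start the app as a named, connectable BEAM node.
# The sname defaults to the project directory name, which is the
# convention dev_node.sh mirrors; override with SNAME=... ./run
SNAME="${SNAME:-$(basename "$PWD")}"
elixir --sname "$SNAME" --cookie devcookie -S mix phx.server
```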

How --sname works: --sname my_app registers the node as my_app@<hostname>. The (secret) cookie must match on both sides for distributed Erlang to connect. Using the project directory name as the sname is a convention that the dev_node.sh script mirrors — so everything just works without configuration.

For non-Phoenix projects

If you don't use Phoenix, replace mix phx.server with mix run --no-halt:
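The same sketch for a plain OTP application:

```shell
#!/bin/sh
# run -- non-Phoenix variant: keep the VM alive with mix run --no-halt.
SNAME="${SNAME:-$(basename "$PWD")}"
elixir --sname "$SNAME" --cookie devcookie -S mix run --no-halt
```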

For production / releases

When running a Mix release, use the --sname and --cookie flags in your release config or rel/env.sh.eex:
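A sketch for rel/env.sh.eex. Mix releases configure distribution through environment variables rather than raw flags; RELEASE_DISTRIBUTION=sname is the release equivalent of --sname. The node name and cookie source here are placeholders:

```shell
# rel/env.sh.eex -- release equivalent of --sname / --cookie.
export RELEASE_DISTRIBUTION=sname
export RELEASE_NODE=my_app    # placeholder node name
# Never hard-code a production cookie; read it from the environment
# (falling back to the cookie file Mix releases generate).
export RELEASE_COOKIE="${RELEASE_COOKIE:-$(cat "$RELEASE_ROOT/releases/COOKIE")}"
```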

⚠️ Use a stronger cookie in production. devcookie is for local development only. The Erlang cookie is security-critical: it is the only thing standing between an attacker and full access to your cluster.


Step 2: Add the dev_node.sh Script

Create scripts/dev_node.sh in your project. This is the single entry point for all BEAM introspection:
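A sketch of the script, under the same assumptions as the run script above (sname = project directory name, cookie = devcookie, app starts with mix run --no-halt, and `hostname -s` matches the host part BEAM uses for short names):

```shell
#!/bin/sh
# scripts/dev_node.sh -- single entry point for BEAM introspection (sketch).
set -eu

APP="$(basename "$PWD")"
COOKIE=devcookie
PIDFILE=.dev_node.pid
TARGET="${APP}@$(hostname -s)"

case "${1:-help}" in
  start)
    elixir --sname "$APP" --cookie "$COOKIE" -S mix run --no-halt &
    echo $! > "$PIDFILE"
    # Poll with a throwaway hidden node until the app node accepts connections.
    for _ in $(seq 1 30); do
      if elixir --sname "probe_$$" --cookie "$COOKIE" --hidden \
          -e "System.halt(if(Node.connect(:\"$TARGET\") == true, do: 0, else: 1))" \
          2>/dev/null; then
        echo "node $TARGET is up"; exit 0
      fi
      sleep 1
    done
    echo "node $TARGET did not come up" >&2; exit 1 ;;
  stop)
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE" ;;
  status)
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null \
      && echo running || echo stopped ;;
  rpc)
    shift
    # Fresh hidden node per call: connect, evaluate one expression remotely,
    # print the result, exit. Quoting is naive: an expression containing
    # unbalanced parentheses breaks the ~s() sigil below, and a {:badrpc, _}
    # return fails the match.
    elixir --sname "rpc_$$" --cookie "$COOKIE" --hidden -e "
      true = Node.connect(:\"$TARGET\")
      {result, _binding} = :rpc.call(:\"$TARGET\", Code, :eval_string, [~s($*)])
      IO.inspect(result)
    " ;;
  eval_file)
    "$0" rpc "$(cat "$2")" ;;
  *)
    echo "usage: $0 {start|stop|status|rpc <expr>|eval_file <path>}" >&2
    exit 1 ;;
esac
```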

How dev_node.sh works

| Command | What it does |
| --- | --- |
| start | Launches mix run --no-halt as a background BEAM node, waits until it's connectable |
| stop | Kills the background node via its PID file |
| status | Checks if the node is still running |
| rpc &lt;expr&gt; | Spawns a short-lived hidden BEAM node, connects to the app node, evaluates &lt;expr&gt; via :rpc.call, prints the result, and exits |
| eval_file &lt;path&gt; | Same as rpc, but reads the expression from a .exs file — useful for complex multi-line introspection |

Key design decision: Each rpc call is stateless. A fresh hidden BEAM node connects, runs one expression, and exits. This avoids stale connections but means bindings don't carry across calls. The --hidden flag ensures the RPC node doesn't join the cluster as a peer — it won't appear in nodes(), won't trigger transitive connections to other cluster members, and the BEAM scheduler won't try to distribute work to it.

Add a shutdown.sh for graceful stop
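A sketch of the stop script; note System.stop/0 rather than System.halt/0, so that applications get a clean, ordered shutdown:

```shell
#!/bin/sh
# scripts/shutdown.sh -- graceful stop: ask the node to halt itself so
# application shutdown callbacks run, unlike an OS-level kill.
./scripts/dev_node.sh rpc "System.stop()"
```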

This asks the running node to shut itself down via the BEAM's own System.stop/0, which stops each application cleanly and runs its shutdown callbacks, rather than killing the OS process from the outside.


Step 3: Create the Skill Definition

A "skill" is a markdown file (SKILL.md) with a YAML front-matter header and instructions for the coding agent. It lives alongside its helper scripts.

Create a directory structure:
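A minimal layout might look like this (the directory name beam-introspection is an assumption; any name works as long as it matches the name in the front-matter):

```
beam-introspection/
├── SKILL.md
└── scripts/
    └── dev_node.sh
```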

SKILL.md
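A sketch of what the SKILL.md could contain; the exact wording is up to you, but the YAML front-matter needs name and description fields:

```markdown
---
name: beam-introspection
description: Start, connect to, and introspect a running Elixir/BEAM node.
  Use when you need to observe live GenServer state, supervision trees,
  ETS tables, or hot-reload code into a running system.
---

Use scripts/dev_node.sh for all live-node work:

- `./scripts/dev_node.sh start` starts the app as a named node
- `./scripts/dev_node.sh rpc "<elixir expr>"` evaluates one expression on it
- `./scripts/dev_node.sh eval_file <path.exs>` runs a multi-line script
- `./scripts/dev_node.sh stop` stops the node

Each rpc call is stateless; bindings do not carry across calls.
```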


Step 4: Register the Skill with Your Agent

Each coding agent looks for skills in a different location. The skill directory structure is the same everywhere — only the parent path changes.

GitHub Copilot (Copilot CLI / Copilot Coding Agent)

User-level skills (available to all projects):
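Assuming the skill directory sits at skills/beam-introspection in your project:

```shell
# Install for GitHub Copilot (user-level path per the reference table below).
mkdir -p ~/.github/skills
cp -R skills/beam-introspection ~/.github/skills/
```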

Project-level instructions — add to .github/copilot-instructions.md or AGENTS.md at the project root:
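A sketch of the section to add:

```markdown
## Live BEAM introspection

This project can run as a connectable BEAM node (sname: the project
directory name, cookie: devcookie). To observe runtime behavior, use
./scripts/dev_node.sh with start / stop / status / rpc "<expr>" /
eval_file <path>. Prefer rpc over writing throwaway test scripts.
Each rpc call is stateless.
```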

Claude Code

User-level skills (available to all projects):
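A sketch, using a PROFILE variable as a stand-in for your profile identifier:

```shell
# PROFILE is a placeholder: substitute your Claude profile identifier.
PROFILE=default
mkdir -p "$HOME/.claude-$PROFILE/skills"
cp -R skills/beam-introspection "$HOME/.claude-$PROFILE/skills/"
```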

Where <profile> is your Claude profile identifier (e.g., [email protected]).

Project-level instructions — add a CLAUDE.md file at the project root referencing the skill, or add the instructions to your existing CLAUDE.md:
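A sketch of the CLAUDE.md section:

```markdown
## Live BEAM introspection

Use the beam-introspection skill for runtime questions about this project.
Start the node with ./scripts/dev_node.sh start, then query it with
./scripts/dev_node.sh rpc "<elixir expression>". The node name is
<project-dir>@<hostname>, cookie devcookie.
```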

OpenAI Codex

User-level skills:
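Again assuming the skill directory is skills/beam-introspection:

```shell
# Install for OpenAI Codex.
mkdir -p ~/.codex/skills
cp -R skills/beam-introspection ~/.codex/skills/
```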

Project-level instructions — Codex reads AGENTS.md at the project root. Add the same introspection section as shown for Copilot above.

Gemini CLI

User-level skills:
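Same pattern, different parent path:

```shell
# Install for Gemini CLI.
mkdir -p ~/.gemini/skills
cp -R skills/beam-introspection ~/.gemini/skills/
```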

Project-level instructions — Gemini reads GEMINI.md or AGENTS.md at the project root. Add the introspection instructions there.

Generic / Multi-Agent (~/.agents/)

Some agent frameworks check ~/.agents/ as a shared skills directory:
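Whether your framework reads this path is framework-specific; check its docs. The install step itself is the same:

```shell
# Shared skills directory convention.
mkdir -p ~/.agents/skills
cp -R skills/beam-introspection ~/.agents/skills/
```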

Quick setup script

To install the skill for all agents at once, run this from your project root:
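A sketch of such a script (install_skill.sh is a hypothetical name, and skills/beam-introspection is an assumed source path; Claude Code's per-profile path is handled separately because it embeds a profile name):

```shell
#!/bin/sh
# install_skill.sh -- copy the skill into each agent's user-level skills dir.
set -eu
SRC="${1:-skills/beam-introspection}"
[ -d "$SRC" ] || { echo "skill dir not found: $SRC" >&2; exit 1; }
for dest in "$HOME/.github/skills" "$HOME/.codex/skills" \
            "$HOME/.gemini/skills" "$HOME/.agents/skills"; do
  mkdir -p "$dest"
  cp -R "$SRC" "$dest/"
  echo "installed $(basename "$SRC") -> $dest"
done
# Claude Code uses ~/.claude-<profile>/skills; copy there with your profile name.
```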


Usage Examples

Once the skill is installed, here's what a typical agent interaction looks like:

"Is my GenServer running?"

You ask: "Check if the OrderProcessor GenServer is alive and what its state looks like."

The agent runs:
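Something like the following, where OrderProcessor is the module named in the question (`:sys.get_state/1` works on any GenServer):

```shell
# Is the process registered and alive? (nil means not registered)
./scripts/dev_node.sh rpc "Process.whereis(OrderProcessor)"
# What does its state look like right now?
./scripts/dev_node.sh rpc ":sys.get_state(OrderProcessor)"
```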

"Why is the queue backed up?"

You ask: "Something's wrong with message processing — debug it."

The agent runs:
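One plausible first probe: rank all processes by message-queue length so the backed-up one floats to the top:

```shell
# Top 5 processes by mailbox size; dead pids (nil info) are filtered out.
./scripts/dev_node.sh rpc "
  Process.list()
  |> Enum.map(fn pid -> {pid, Process.info(pid, :message_queue_len)} end)
  |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
  |> Enum.sort_by(fn {_pid, {:message_queue_len, n}} -> -n end)
  |> Enum.take(5)
"
```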

"Hot-reload my fix and test it"

You ask: "I changed the retry logic — reload it into the running node and test."

The agent runs:
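For example (the path lib/my_app/retry.ex and the function MyApp.Retry.backoff/1 are placeholders for this sketch):

```shell
# Compile the edited source inside the running node: a hot reload.
./scripts/dev_node.sh rpc 'Code.compile_file("lib/my_app/retry.ex")'
# Then exercise the freshly loaded code.
./scripts/dev_node.sh rpc 'MyApp.Retry.backoff(3)'
```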


Reference: Agent Configuration Paths

| Agent | User-level skill path | Project instructions file |
| --- | --- | --- |
| GitHub Copilot | ~/.github/skills/&lt;name&gt;/ | .github/copilot-instructions.md or AGENTS.md |
| Claude Code | ~/.claude-&lt;profile&gt;/skills/&lt;name&gt;/ | CLAUDE.md |
| OpenAI Codex | ~/.codex/skills/&lt;name&gt;/ | AGENTS.md |
| Gemini CLI | ~/.gemini/skills/&lt;name&gt;/ | GEMINI.md or AGENTS.md |
| Generic | ~/.agents/skills/&lt;name&gt;/ | AGENTS.md |

What goes where

  • SKILL.md — The skill definition with YAML front-matter (name, description) and instructions. This is what the agent reads to understand when and how to use the skill.

  • scripts/dev_node.sh — The helper script that handles node lifecycle and RPC. Can be a copy in each agent's skills dir, or a symlink to the project's scripts/dev_node.sh.

  • Project instructions file (AGENTS.md, CLAUDE.md, etc.) — Tells the agent that introspection is available for this specific project, including the node name and cookie.

Minimum viable setup

If you want the simplest possible setup for a single agent (e.g., GitHub Copilot):

  1. Add scripts/dev_node.sh to your project (chmod +x)

  2. Ensure your app starts with --sname and --cookie devcookie

  3. Add this to AGENTS.md:
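A sketch of the AGENTS.md section:

```markdown
## Live BEAM introspection

Start the app as a connectable node with ./scripts/dev_node.sh start.
Query runtime state with ./scripts/dev_node.sh rpc "<elixir expression>"
(stateless, one expression per call) and stop with ./scripts/dev_node.sh stop.
Node name: <project-dir>@<hostname>, cookie: devcookie.
```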

That's it. No skill registration needed — the agent reads AGENTS.md and knows how to use the script.


How It Works Under the Hood

When dev_node.sh rpc runs, it:

  1. Starts a new, short-lived hidden BEAM node with a unique sname (rpc_<pid>), the same cookie, and the --hidden flag.

  2. Calls Node.connect/1 to connect to the target app node via distributed Erlang. Because the RPC node is hidden, this connection is not transitive — the app node won't try to mesh with it, and it won't appear in nodes() on the app side (only in nodes(:hidden)).

  3. Uses :rpc.call/4 to execute Code.eval_string/1 on the target node — so the expression runs in the app's process context with access to all its modules and state.

  4. Prints the result with IO.inspect/2 and exits.
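Stripped of script plumbing, those four steps reduce to roughly this one invocation, where myapp@myhost stands in for the real app node name:

```shell
elixir --sname "rpc_$$" --cookie devcookie --hidden -e '
  target = :"myapp@myhost"
  true = Node.connect(target)               # hidden: non-transitive link
  {result, _binding} =
    :rpc.call(target, Code, :eval_string, ["node()"])
  IO.inspect(result)                        # the app node name
'
```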

This is the same mechanism that iex --remsh uses, but wrapped in a scriptable interface that coding agents can invoke without interactive TTY support.

Security considerations

  • The cookie devcookie is well-known. Anyone on the same machine (or network, if using --name instead of --sname) can connect. Use only for local development.

  • Code.eval_string/1 can execute arbitrary code. This is by design — the agent needs full access — but be aware of it in shared environments.

  • --sname restricts connections to the same hostname. --name would allow cross-host connections (not recommended without TLS distribution).
