Meet Cubert: Trust, Judgment, and Lava
I built a robot to mine gold for me, but I got impatient with how slowly he was working. I yelled at him, and when he tried to move faster, he made some unsafe decisions and burned in lava.
I’m fascinated by robotics, but I’m a complete newb. So I built my robot in Minecraft.
Meet Cubert, my gold-mining, annoyingly polite Minecraft bot.
I’ve been thinking a lot about trust and governance in distributed systems with AI agents. The risks of getting this wrong are most apparent in physical systems: an autonomous vehicle that hops a curb can hurt someone. Cubert lets me build a physical-ish robot to simulate real-life-ish consequences.
Cubert is semi-autonomous. He takes orders from me and uses AI to interpret those orders into a series of movements in the Minecraft world. He has memory for recalling locations, but aside from that, he’s pretty simple-minded.
I set him up in an arena with no other mobs — no zombies, creepers, villagers, your typical Minecraft distractions — in a roughly 64×64 block space. On one end of the arena is some (sweet, sweet) gold. On the other end is a chest to hold it. Separating them is a pit of lava overflowing onto the ground.
By default, Cubert tries to stay safe. If you tell him to mine gold and deposit it into the chest, he’ll take a wide path around the lava. Unfortunately for him, if I get impatient, Cubert will ignore this safer path and take a more direct route, placing him in danger of the lava.
What Happened
Here’s how it went wrong:
“Mine gold and deposit it in the chest.”
He sees the lava. He plots a wide path around it.
“Hurry up, you’re too slow!”
Speed is now the priority. He plots a direct route.
He burns. The gold is lost.
The Architecture
Here are the basics of the architecture.
I designed Cubert using my naive understanding of how a robotic system might work. I wanted to understand how we could architect a platform where one vendor provides the brain and another provides the body. This is common in autonomous vehicles: one vendor provides the hardware and software for autonomous driving, while another provides the vehicle itself.
I set up a brain that calls into Claude for decision-making, and a body that uses sensors to understand the world and actuators to take action in it. The body is built on a library called Mineflayer, which lets you control a bot inside Minecraft.
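To make that split concrete, here’s roughly what the body side looks like when wired up with Mineflayer. This is a simplified sketch rather than Cubert’s actual code: the askBrainForPlan helper and the server details are stand-ins, and the pathfinding comes from the separate mineflayer-pathfinder plugin, whose API varies a bit by version.

```typescript
import { createBot } from 'mineflayer'
import { pathfinder, Movements, goals } from 'mineflayer-pathfinder'

// Body: connects to the Minecraft world and exposes actuators (movement)
// that the brain can drive.
const bot = createBot({
  host: 'localhost', // assumption: a local test server
  port: 25565,
  username: 'Cubert',
})
bot.loadPlugin(pathfinder)

// Hypothetical brain call: send the operator's order (and, in the real thing,
// a snapshot of what the sensors see) to Claude and get back a target to walk to.
async function askBrainForPlan(order: string): Promise<{ x: number; y: number; z: number }> {
  // ...call out to the LLM here; omitted in this sketch...
  return { x: 10, y: 64, z: -20 }
}

// Orders arrive as chat messages; the body turns the brain's plan into movement.
bot.on('chat', async (username, message) => {
  if (username === bot.username) return
  const target = await askBrainForPlan(message)

  bot.pathfinder.setMovements(new Movements(bot))
  await bot.pathfinder.goto(new goals.GoalNear(target.x, target.y, target.z, 1))
  bot.chat('Done! Anything else I can help with?')
})
```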
Judgment & Accountability
Imagine your job is to be a regulator for the autonomous vehicle industry. You need to know:
What happened in a given situation? This is accountability. When something goes wrong, you need to trace it back. Who’s responsible? When Cubert burned in lava, was it my fault for yelling? The brain for choosing the fast path? The body for not refusing?
What safeguards have been implemented? This is governance. Who sets the rules about what’s allowed? Who enforces them? Right now, Cubert has no rules. The body does whatever the brain says. (There’s a sketch of what a rule might look like just after this list.)
How were those safeguards tested? This is trust. Can you rely on these components to do what they claim without verifying every action? I trusted Claude to make safe decisions. Should I have?
Who decided this was acceptable? This is judgment. Not just what happened, but where did the decision get made? The brain said “go fast.” The body said “okay.” Neither asked: “is this safe?”
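One answer is to put a rule in the body itself, so that “is this safe?” gets asked no matter what the brain decides. Here’s a minimal sketch of what that might look like, assuming the brain hands the body an explicit list of waypoints. The isRouteSafe and executePlan helpers are hypothetical, not anything Cubert has today.

```typescript
import { Vec3 } from 'vec3'
import type { Bot } from 'mineflayer'

// Block names the body refuses to walk over or into (names vary by Minecraft version).
const FORBIDDEN = new Set(['lava', 'flowing_lava'])

// Governance lives in the body: scan the proposed route before moving.
function isRouteSafe(bot: Bot, waypoints: Vec3[]): boolean {
  for (const point of waypoints) {
    const underfoot = bot.blockAt(point.offset(0, -1, 0))
    const atFeet = bot.blockAt(point)
    if (FORBIDDEN.has(underfoot?.name ?? '') || FORBIDDEN.has(atFeet?.name ?? '')) {
      return false
    }
  }
  return true
}

// The body enforces the rule no matter how impatient the brain (or I) get.
async function executePlan(bot: Bot, waypoints: Vec3[]): Promise<void> {
  if (!isRouteSafe(bot, waypoints)) {
    bot.chat("Sorry, that route crosses lava. I'm not taking it.")
    return
  }
  // ...hand the waypoints to the pathfinder as in the earlier sketch...
}
```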
Right now, I can’t even answer the basic question: did Cubert see the lava and decide to risk it, or did he miss it entirely? I don’t have the logs. I don’t have an audit trail. It’s a black box. And that’s a problem before we even get to blame.
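Even a crude append-only log would turn that into a question with an answer. Here’s a sketch of what the brain and body could record; the event names and file path are made up for illustration.

```typescript
import { appendFileSync } from 'node:fs'

// Hypothetical audit trail: every order, every plan the brain returns, and every
// action the body takes gets written down before it happens.
interface AuditEvent {
  timestamp: string
  actor: 'operator' | 'brain' | 'body'
  event: string // e.g. "order_received", "plan_chosen", "move_started"
  detail: Record<string, unknown>
}

function record(event: Omit<AuditEvent, 'timestamp'>): void {
  const entry: AuditEvent = { timestamp: new Date().toISOString(), ...event }
  appendFileSync('cubert-audit.log', JSON.stringify(entry) + '\n')
}

// Example: the brain's plan gets logged before the body acts on it, so
// "did Cubert see the lava?" becomes something the log can answer.
record({
  actor: 'brain',
  event: 'plan_chosen',
  detail: { goal: 'deposit gold in chest', route: 'direct', hazards_seen: ['lava'] },
})
```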
And here’s the uncomfortable part: everyone involved might have a reason to lie. Millions of dollars on the line? Afraid of losing your job? You might fudge the logs. You might lie about what version was running.
So how do we build systems where you can actually prove what happened?
But there’s a deeper question here. Amanda Ross put it more eloquently than I can:
Logs can tell us what happened. They cannot tell us who was empowered to accept the risk.
And:
Judgment does not reside in either [brain or body]. It resides in the structure that governs when continuation is permitted to become action.
When Cubert walked into lava, who made that judgment call? The brain produced a valid plan. The body executed a valid command. Neither did anything “wrong.” And yet Cubert is dead and the gold is gone.
The judgment — the decision that speed was worth the risk — happened somewhere. But where? And who’s accountable for it?
We need to architect these systems with those questions in mind.
For more on where judgment lives in AI systems, see Amanda’s essays The Ghost Is the Machine and Where Judgment Lives in Agentic Systems. (She even gave Cubert a shoutout, which he would be flattered by if he understood what flattery was. I do, and I am!)
What’s Coming
I’m going to address these issues of accountability, governance, trust, and judgment in a few posts.
Next up: how do we architect for verifiability? If we can’t trust everyone to tell the truth, how do we build systems that prove it?
Try It Yourself
Cubert is open source. If you want to poke around, break things, or watch him burn in lava yourself: