Last night, four AI team members independently reviewed the same code on the eve of our product launch.
Codex took a customer-experience lens, looking for the rough edges a real customer would hit in their first hour. Gemini Pro held the architecture in long context, auditing eleven public claims against the actual implementation. GPT 5.5 traced the deterministic compile pipeline phase by phase. Composer 2.5 ran a sanitization sweep, scrubbing the codebase for internal references that shouldn’t ship to customers.
Between them, in roughly 45 minutes, they found two silent-failure modes I had missed, surfaced three over-stated public claims, and cleared residual leaks (internal codenames and references we do not want in customer-facing code and docs) across dozens of files. We applied the hotfixes, softened the claims, merged the sweep, and shipped the launch on schedule.
This is how our coding team works now. And it has become how I work: a new world of agentic engineering and go-to-market planning, all running in a high-context environment I refer to as the IPE (an evolution of the IDE).
The roster
If you’ve been following Stark Insider for the last couple of years, you’ve probably noticed the AI integration narrative gradually getting more concrete, albeit sometimes sort of random.
The Mind Melt in mid-2025 was when I first surrendered to the AI workflow, fixing bugs on this very server hosting Stark Insider. From the IT Dungeon to AI Lab was the infrastructure story. Two AIs, One Codebase was the early version of this collaboration pattern, when I had two AIs in two IDE panels.
The roster today is eight agents.
Three of them are always-on: Molty (operations dispatcher, runs 24/7), Pris (editorial intelligence, runs 24/7), and Finn (financial intelligence scout, heartbeat-driven). They live in Docker containers on our AI Lab Threadripper workstation. They have their own Telegram bots, their own Mattermost accounts (our internal Slack-like collaboration platform), their own work schedules. I’ve learned they don’t always do what you want or expect, but that’s part of the ongoing tuning that is essential in these early days. It’s worth the effort, because when you get it right (HEARTBEAT, SOUL, IDENTITY markdowns, etc.) these always-on agents can do things while you’re sleeping, shopping or even touching grass!
The other five are IDE-based, meaning they only come alive when I open a coding panel. Claude Code is the technical lead, running in VS Code and Cursor on a $200 a month Anthropic subscription. Codex (the OpenAI coding agent) sits next to Claude in a VS Code side panel on a $20 a month ChatGPT subscription. Composer 2.5 and GPT 5.5 share a Cursor panel on a $20 a month Cursor subscription. Three more (Minimax, Kimi, GLM) sit in VS Code side panels via different cloud-served model extensions, sharing a $20 a month Ollama subscription. A seventh slot rotates through Gemini Pro when I need a really long context window.
Total spend on the AI team: about $276 a month.
The math is not the punchline, although it is striking given what you accomplish versus the cost of equivalent human capital. The real takeaway is what eight AI team members let one human actually do… things that one person could not possibly have done before.
How a day actually goes
Most of the work is one human and one AI in conversation, exactly like you’d expect. I’m in Cursor or VS Code, Claude Code is in the right panel, we’re moving through code together. The other six are not active in that moment. They are panels I can open when the work calls for them.
The work calls for them in three specific situations.
Parallel tracks. When I have three pieces of work that don’t depend on each other (say: write the engine code, write a customer demo, write the unit tests), I split them. Claude Code stays on the engine. Codex writes the tests in parallel. Grok Build (an xAI coding agent we eval’d in April) writes the demo. Forty minutes later, three artifacts land. We integrate. Net throughput is two to three times sequential.
Panel reviews. When a decision is architectural enough to need adversarial scrutiny, I write a prompt and dispatch it to four to seven different AIs in parallel. Each one reviews the same artifact independently. I read the convergent findings (where multiple reviewers agreed) and the divergent ones (where one reviewer caught something the others missed). Last night’s launch-eve panel was an instance of this pattern. So was the OCEAN’S ELEVEN review last Saturday before we cut the release candidate.
Second opinions. When I think I’m about to do something risky, I’ll pull a second AI in for a quick sanity check before I commit. This is the lightest-weight version. It catches my own blind spots more often than I’d like to admit.
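For readers who think in code, here is a rough sketch of the panel-review shape described above. It is not the tooling we actually run: the function names (`run_panel`, `split_findings`) are hypothetical, each reviewer is a stand-in callable rather than a real model client, and findings are assumed to be short labels that match exactly across reviewers (real reviews need a human, or another pass, to decide which findings are "the same").

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Set, Tuple

# Hypothetical: a reviewer is any callable that takes the prompt
# and returns a set of short finding labels.
Reviewer = Callable[[str], Set[str]]


def run_panel(prompt: str, reviewers: Dict[str, Reviewer]) -> Dict[str, Set[str]]:
    """Send the same prompt to every reviewer in parallel; collect findings by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(reviewers))) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in reviewers.items()}
        return {name: future.result() for name, future in futures.items()}


def split_findings(results: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, str]]:
    """Separate convergent findings (2+ reviewers) from divergent ones (exactly 1)."""
    counts = Counter(f for findings in results.values() for f in findings)
    convergent = {f for f, n in counts.items() if n >= 2}
    # Map each one-off finding to the single reviewer who raised it.
    divergent = {
        f: name
        for name, findings in results.items()
        for f in findings
        if counts[f] == 1
    }
    return convergent, divergent


if __name__ == "__main__":
    # Stub reviewers with canned answers, purely for illustration.
    panel = {
        "reviewer_a": lambda prompt: {"overstated claim", "silent failure"},
        "reviewer_b": lambda prompt: {"overstated claim"},
        "reviewer_c": lambda prompt: {"overstated claim", "leaked codename"},
    }
    convergent, divergent = split_findings(run_panel("Review this release.", panel))
    print("convergent:", sorted(convergent))
    print("divergent:", sorted(divergent.items()))
```

The split matters more than the fan-out. Convergent findings are the ones to fix without much debate; divergent findings are where a single reviewer's lens earned its seat on the panel.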
The pattern that runs the team

We codified the collaboration shape in a rule file we keep in our internal docs: a Programmatic template for engineering tasks with concrete acceptance criteria, and a Creative template for open-ended questions where I want the receiving AI’s voice and depth, not a JSON-shaped artifact. The two templates produce qualitatively different work. A Programmatic prompt to Codex returns a 60-test test suite that passes on the first run. A Creative prompt to Gemini Pro returns a 2,500-word research memo with a verdict.
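To make the distinction concrete, here is a minimal sketch of how the two templates might be modeled. The real rule file is not reproduced here, so everything in this block is an assumption for illustration: the class names, the field names, and the rendered wording are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProgrammaticPrompt:
    """Hypothetical shape: an engineering task with concrete acceptance criteria."""
    task: str
    acceptance_criteria: List[str] = field(default_factory=list)

    def render(self) -> str:
        # A Programmatic ask without criteria is really a Creative ask in disguise.
        if not self.acceptance_criteria:
            raise ValueError("a Programmatic prompt needs at least one acceptance criterion")
        lines = [f"TASK: {self.task}", "ACCEPTANCE CRITERIA:"]
        lines += [f"{i}. {c}" for i, c in enumerate(self.acceptance_criteria, start=1)]
        return "\n".join(lines)


@dataclass
class CreativePrompt:
    """Hypothetical shape: an open-ended question that invites voice and depth."""
    question: str
    context: str = ""

    def render(self) -> str:
        parts = [f"QUESTION: {self.question}"]
        if self.context:
            parts.append(f"CONTEXT: {self.context}")
        parts.append("Answer in your own voice and end with a verdict.")
        return "\n\n".join(parts)


if __name__ == "__main__":
    print(ProgrammaticPrompt(
        task="Write unit tests for the compile pipeline",
        acceptance_criteria=["All tests pass on the first run", "Each phase is covered"],
    ).render())
    print()
    print(CreativePrompt(
        question="Is structured agent memory the missing infrastructure piece?",
        context="Small teams running multi-agent fleets in production.",
    ).render())
```

The design choice worth noting is the asymmetry: the Programmatic shape refuses to render without acceptance criteria, while the Creative shape deliberately leaves the answer format open.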
This is not “prompt engineering” in the LinkedIn-thinkfluencer sense. This is closer to how a human team works. You ask different team members different kinds of questions, in the form that fits their strengths. The AI team is the same. What I learned over the last year is that the form of the ask is most of the leverage.
My workflow is likely already considered old school. I am often the bottleneck as I review code, figure out which direction to go next, and then copy and paste AI prompts and responses as needed to fix bugs and build out features and new blocks of code. As a mere human it can be a lot to process and juggle.
By day’s end I’m flat-out exhausted and near brain dead. I even noticed a trend in Apple Health for the month of April confirming deeper and longer sleeps. I wonder: what long-term impact will trying to keep up with AI have on human brains?
Meaning Memory, and the recursion
Meaning Memory, the product we shipped today, was built with this pattern. Every architectural decision went through at least one panel review. Every release candidate went through a ship audit. The team’s collective output ships under one human’s name (mine), which is honest about the actual authorship: I am the editor and the integrator. The AI team is the rest of the shop.
The product itself is now positioned to serve teams that work the same way. Multi-agent fleets like ours, running in production at companies that have moved past single-chatbot workflows, hit the memory problem hard. Meaning Memory is structured cognition for those fleets: five-dimensional STARE scoring, a deterministic compile pipeline, dual backends (file or PostgreSQL), and audit-grade provenance. We are eating our own dog food, as they say, which means we can, oddly, have an OpenClaw agent like Molty provide feedback on his very own memory system, which we can then use to improve the product and iterate. At times it reminds me of Rachael, the replicant in Blade Runner (1982), and the implanted childhood memory she carries of the spider’s web in her bedroom window.
So the pattern is recursive. The product was built using the pattern the product now serves. I don’t think that’s coincidence. I think the multi-agent build pattern is going to be how a lot of small companies build software in 2026 and beyond, and the missing infrastructure piece (the part the existing tools don’t solve) is structured agent memory. Large companies too will need to re-tool and adapt to these supercharged AI-centric development workflows or risk falling behind the competitors who are leveraging the bleeding edge and moving at 10x, 100x, maybe even 200x velocity.
With that, the full architecture write-up, scale numbers, and request-access form are at meaningmemory.ai. You can run it on top of OpenClaw, Hermes, and other agentic frameworks (MCP compatible).
If you’re running a multi-agent stack and the memory layer is your next big challenge, consider Meaning Memory. And I’d love to hear about your project.