Stark Insider

I Built an Agentic Memory Engine With 8 AI Collaborators. Here’s How.

8 AI coding agents, 1 human product manager, and the launch of Meaning Memory: a structured memory engine for multi-agent fleets.

BY Clinton Stark — 05.26.2026

Last night, four AI team members independently reviewed the same code on the eve of our product launch.

Codex took a customer-experience lens, looking for the rough edges a real customer would hit in their first hour. Gemini Pro held the architecture in long context, auditing eleven public claims against the actual implementation. GPT 5.5 traced the deterministic compile pipeline phase by phase. Composer 2.5 ran a sanitization sweep, scrubbing the codebase for internal references that shouldn’t ship to customers.

Between them, in roughly 45 minutes, they found two silent-failure modes I had missed, surfaced three over-stated public claims, and cleared residual leaks (internal codenames and references we do not want in customer-facing code and docs) across dozens of files. We applied the hotfixes, softened the claims, merged the sweep, and shipped the launch on schedule.

This is how our coding team works now. And it has become how I work: a new world of agentic engineering and go-to-market planning, all running in a high-context environment I call the IPE (Integrated Personal Environment, an evolution of the IDE).

The roster

If you’ve been following Stark Insider for the last couple of years, you’ve probably noticed the AI integration narrative gradually getting more concrete, albeit sometimes sort of random.

The Mind Melt in mid-2025 was when I first surrendered to the AI workflow, fixing bugs on this very server hosting Stark Insider. From the IT Dungeon to AI Lab was the infrastructure story. Two AIs, One Codebase was the early version of this collaboration pattern, when I had two AIs in two IDE panels.

The roster today is eight agents.

Three of them are always-on: Molty (operations dispatcher, runs 24/7), Pris (editorial intelligence, runs 24/7), and Finn (financial intelligence scout, heartbeat-driven). They live in Docker containers on our AI Lab Threadripper workstation. They have their own Telegram bots, their own accounts on Mattermost (our internal Slack-like collaboration platform), and their own work schedules. I’ve learned they don’t always do what you want or expect, but that’s part of the ongoing tuning that is essential in these early days. It’s worth the effort, because when you get it right (HEARTBEAT, SOUL, IDENTITY markdowns, etc.) these always-on agents can do things while you’re sleeping, shopping, or even touching grass!

The other five are IDE-based, meaning they only come alive when I open a coding panel. Claude Code is the technical lead, running in VS Code and Cursor on a $200 a month Anthropic subscription. Codex (the OpenAI coding agent) sits next to Claude in a VS Code side panel on a $20 a month ChatGPT subscription. Composer 2.5 and GPT 5.5 share a Cursor panel on a $20 a month Cursor subscription. Three more (Minimax, Kimi, GLM) sit in VS Code side panels via different cloud-served model extensions, sharing a $20 a month Ollama subscription. A seventh slot rotates through Gemini Pro when I need a really long context window.

Total spend on the AI team: about $276 a month.

The math is not the punchline, although it is striking given what you accomplish versus the cost of equivalent human capital. The real takeaway is what eight AI team members let one human actually do… things that human could not possibly have done before.

How a day actually goes

Most of the work is one human and one AI in conversation, exactly like you’d expect. I’m in Cursor or VS Code, Claude Code is in the right panel, we’re moving through code together. The other six are not active in that moment. They are panels I can open when the work calls for them.

The work calls for them in three specific situations.

Parallel tracks. When I have three pieces of work that don’t depend on each other (say: write the engine code, write a customer demo, write the unit tests), I split them. Claude Code stays on the engine. Codex writes the tests in parallel. Grok Build (an xAI coding agent we eval’d in April) writes the demo. Forty minutes later, three artifacts land. We integrate. Net throughput is two to three times sequential.

Panel reviews. When a decision is architectural enough to need adversarial scrutiny, I write a prompt and dispatch it to four to seven different AIs in parallel. Each one reviews the same artifact independently. I read the convergent findings (where multiple reviewers agreed) and the divergent ones (where one reviewer caught something the others missed). Last night’s launch-eve panel was an instance of this pattern. So was the OCEAN’S ELEVEN review last Saturday before we cut the release candidate.

Second opinions. When I think I’m about to do something risky, I’ll pull a second AI in for a quick sanity check before I commit. This is the lightest-weight version. It catches my own blind spots more often than I’d like to admit.
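The panel-review pattern above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our actual tooling: the three reviewer functions are hypothetical stubs standing in for real agent calls, the finding labels are invented, and the names run_panel and split_findings are made up for this example. It shows only the shape of the idea: dispatch the same artifact to several reviewers in parallel, then separate findings that multiple reviewers agree on from findings that only one reviewer caught.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical stubs: each stands in for a call to a real AI reviewer and
# returns a set of short finding labels (all labels here are invented).
def review_as_codex(artifact: str) -> set[str]:
    return {"empty-config silent failure", "confusing first-run error"}

def review_as_gemini(artifact: str) -> set[str]:
    return {"empty-config silent failure", "overstated claim in README"}

def review_as_gpt(artifact: str) -> set[str]:
    return {"overstated claim in README", "unchecked return in compile phase"}

REVIEWERS: dict[str, Callable[[str], set[str]]] = {
    "codex": review_as_codex,
    "gemini": review_as_gemini,
    "gpt": review_as_gpt,
}

def run_panel(artifact: str, reviewers: dict[str, Callable[[str], set[str]]]) -> dict[str, set[str]]:
    """Send the same artifact to every reviewer in parallel and collect results."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        futures = {name: pool.submit(fn, artifact) for name, fn in reviewers.items()}
        return {name: future.result() for name, future in futures.items()}

def split_findings(results: dict[str, set[str]]) -> tuple[dict[str, list[str]], dict[str, list[str]]]:
    """Group findings by who reported them, then split by reviewer count."""
    seen_by: dict[str, set[str]] = defaultdict(set)
    for reviewer, findings in results.items():
        for finding in findings:
            seen_by[finding].add(reviewer)
    # Convergent: reported by more than one reviewer. Divergent: exactly one.
    convergent = {f: sorted(r) for f, r in seen_by.items() if len(r) > 1}
    divergent = {f: sorted(r) for f, r in seen_by.items() if len(r) == 1}
    return convergent, divergent

if __name__ == "__main__":
    convergent, divergent = split_findings(run_panel("artifact text", REVIEWERS))
    print("Convergent:", convergent)
    print("Divergent:", divergent)
```

In practice, reviewers never phrase the same finding identically, so the matching step is the part a human (or another model) still has to do by reading. The sketch assumes the labels have already been normalized.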

The pattern that runs the team

[Screenshot: VS Code on a remote AI Lab workstation, with Claude Code on the left triaging linter logic, Codex on the right running a parallel customer UX review of Meaning Memory v3.15.1rc1, and five modified files in the source control panel.]
Caption: Launch-eve panel review in flight. Claude Code triages linter logic on the left, Codex audits customer UX on the right, and five files are in motion in the Meaning Memory repo (center-left panel). One human in the middle, two AI coding agents in parallel.

We codified the collaboration shape in a rule file we keep in our internal docs: a Programmatic template for engineering tasks with concrete acceptance criteria, and a Creative template for open-ended questions where I want the receiving AI’s voice and depth, not a JSON-shaped artifact. The two templates produce qualitatively different work. A Programmatic prompt to Codex returns a 60-test test suite that passes on the first run. A Creative prompt to Gemini Pro returns a 2,500-word research memo with a verdict.

This is not “prompt engineering” in the LinkedIn-thinkfluencer sense. This is closer to how a human team works. You ask different team members different kinds of questions, in the form that fits their strengths. The AI team is the same. What I learned over the last year is that the form of the ask is most of the leverage.
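To make the two ask shapes concrete, here is a hedged sketch of what a Programmatic template and a Creative template might look like as code. Our actual rule file is not reproduced here; the class names, field names, and rendered wording below are all assumptions made up for illustration. The point is only the contrast: one form pins down acceptance criteria, the other leaves room for voice and ends with a verdict.

```python
from dataclasses import dataclass

@dataclass
class ProgrammaticPrompt:
    """Hypothetical template for engineering tasks with concrete acceptance criteria."""
    task: str
    acceptance_criteria: list[str]

    def render(self) -> str:
        lines = [f"TASK: {self.task}", "ACCEPTANCE CRITERIA:"]
        # Number the criteria so the receiving agent can report against each one.
        lines += [f"  {i}. {c}" for i, c in enumerate(self.acceptance_criteria, 1)]
        lines.append("Return only the requested artifact.")
        return "\n".join(lines)

@dataclass
class CreativePrompt:
    """Hypothetical template for open-ended questions where voice and depth matter."""
    question: str
    context: str = ""

    def render(self) -> str:
        parts = [f"QUESTION: {self.question}"]
        if self.context:
            parts.append(f"CONTEXT: {self.context}")
        parts.append("Answer in your own voice, at whatever depth the question needs, "
                     "and end with a verdict.")
        return "\n".join(parts)

if __name__ == "__main__":
    print(ProgrammaticPrompt(
        task="Write unit tests for the compile pipeline",
        acceptance_criteria=["All tests pass on the first run", "No network access"],
    ).render())
    print()
    print(CreativePrompt(
        question="Is a dual-backend design worth the maintenance cost?",
    ).render())
```

Keeping the two forms as separate types, rather than one template with optional fields, makes it harder to send a half-specified engineering task or to over-constrain an open question.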

My workflow is likely already considered old school. I am often the bottleneck as I review code, figure out which direction to go next, and then copy and paste AI prompts and responses as needed to fix bugs and build out features and new blocks of code. As a mere human, it can be a lot to process and juggle.

By day’s end I’m flat-out exhausted and near brain-dead. I even noticed a trend in Apple Health for the month of April confirming deeper and longer sleeps. I wonder what long-term impact trying to keep up with AI will have on human brains.

Meaning Memory, and the recursion

Meaning Memory, the product we shipped today, was built with this pattern. Every architectural decision went through at least one panel review. Every release candidate went through a ship audit. The team’s collective output ships under one human’s name (mine), which is honest about the actual authorship: I am the editor and the integrator. The AI team is the rest of the shop.

The product itself is now positioned to serve teams that work the same way. Multi-agent fleets like ours, running in production at companies that have moved past single-chatbot workflows, hit the memory problem hard. Meaning Memory is structured cognition for those fleets: five-dimensional STARE scoring, a deterministic compile pipeline, dual backends (file or PostgreSQL), and audit-grade provenance. We are eating our own dog food, as they say. Oddly, that means an OpenClaw agent like Molty can give feedback on his very own memory system, and we can use that feedback to improve the product and iterate. At times it reminds me of Rachael, the replicant in Blade Runner (1982), and the implanted childhood memory she carries of the spider’s web in her bedroom window.

So the pattern is recursive. The product was built using the pattern the product now serves. I don’t think that’s a coincidence. I think the multi-agent build pattern is going to be how a lot of small companies build software in 2026 and beyond, and the missing infrastructure piece (the part the existing tools don’t solve) is structured agent memory. Large companies too will need to re-tool and adapt to these supercharged AI-centric development workflows, or risk falling behind competitors who are leveraging the bleeding edge and moving at 10x, 100x, maybe even 200x velocity.

The full architecture write-up, scale numbers, and request-access form are at meaningmemory.ai. You can run it on top of OpenClaw, Hermes, and other agentic frameworks (MCP compatible).

If you’re running a multi-agent stack and the memory layer is your next big challenge, consider Meaning Memory. And I’d love to hear about your project.



Clinton Stark

Filmmaker and editor at Stark Insider, covering arts, AI & tech, and indie film. Inspired by Bergman, slow cinema and Chipotle. Often found behind the camera or in the edit bay. Peloton: ClintTheMint.

© Copyright 2005-2026 BLG Media LLC. v2.19.0