Show HN: Marimo pair – Reactive Python notebooks as environments for agents (github.com)

by manzt 35 comments 140 points

[−] jploudre 35d ago
I do programming as a side project — Marimo has been a huge unlock for me. Part of it has been just watching the videos that are both updates about the software and also little examples of how to think about data science. Marimo also helps curate useful python stuff to try.

Starting to use AI in Marimo, I was able to both ‘learn polars’ for speed and create a custom AnyWidget to build a UI I had in mind that wouldn’t work with the standard UI features.

Giving an LLM more context will be fab for me. Now if I could just teach Claude that this really is the ‘graph’ and it can’t ever re-assign a variable. It’s a gotcha of Marimo vs Python. Worth the hassle for the interactivity. But it makes me feel a bit like I’m writing C and the compiler is telling me I need a semicolon at the end of the line. I’ve made that error so many times…
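
For anyone hitting the same gotcha, here is a rough sketch of the rule in plain Python: collect the names each cell binds at top level and flag any name bound in more than one cell. This is only an illustration using the standard `ast` module, not how Marimo itself implements the check; the helper names `top_level_definitions` and `find_redefinitions` are made up for the example, and it assumes (as I understand Marimo's convention) that underscore-prefixed names are local to a cell and therefore exempt.

```python
import ast
from collections import defaultdict


def top_level_definitions(source: str) -> set[str]:
    """Return the names a cell binds at top level (assignments, imports, defs, classes)."""
    names: set[str] = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as b" binds "b".
                names.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.Assign, ast.AugAssign, ast.AnnAssign)):
            targets = node.targets if isinstance(node, ast.Assign) else [node.target]
            for target in targets:
                for sub in ast.walk(target):
                    # Only names being stored count; "a.b = 1" does not rebind "a".
                    if isinstance(sub, ast.Name) and isinstance(sub.ctx, ast.Store):
                        names.add(sub.id)
    return names


def find_redefinitions(cells: list[str]) -> dict[str, list[int]]:
    """Map each name bound in more than one cell to the indices of those cells."""
    owners: dict[str, list[int]] = defaultdict(list)
    for index, source in enumerate(cells):
        for name in top_level_definitions(source):
            # Assumption: underscore-prefixed names are cell-local, so skip them.
            if not name.startswith("_"):
                owners[name].append(index)
    return {name: idx for name, idx in owners.items() if len(idx) > 1}


cells = [
    "import polars as pl\ndf = pl.DataFrame()",
    "df = df.head()",      # re-assigns df, which cell 0 already defines
    "_tmp = 1",
    "_tmp = 2\nsummary = df",
]
print(find_redefinitions(cells))  # {'df': [0, 1]}
```

The usual fix is to give the second value a new name (for example `df_head = df.head()`) so each name has exactly one defining cell.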

[−] rasmus1610 35d ago
This is such an exciting direction :)

Jeremy Howard from fast.ai/answer.ai also works on similar stuff with solveit (https://solve.it.com) and ipyai (https://github.com/AnswerDotAI/ipyai)

I think it will be very interesting to see what this enables

[−] midnightn 34d ago
The reactive execution model as agent memory is clever — I ran into similar tradeoffs building a multi-agent trading system where each agent needs isolated state across cycles. Ended up using a persistent store (BigQuery) rather than in-process memory, but the appeal of having the runtime itself be the memory is that you get reproducibility for free.
[−] TheTaytay 35d ago
Thank you for this!

I am a big fan of Marimo and was trying to use it as my agent’s “REPL” a while back, because it’s naturally so good at describing its own current state and structure. It made me think that it would make a better state-preserving environment for the agent to work. I’m very excited to play with this.

[−] manzt 38d ago
One of the authors here, happy to answer questions.

Building pair has been a different kind of engineering for me. Code mode is not a versioned API. Its consumer is a model, not a program. The contract is between a runtime and something that reads docs and reasons about what it finds.

We've changed the surface several times without migrating the skill. The model picks up new instructions and discovers its capabilities within a session, and figures out the rest.

[−] oegedijk 35d ago
Looks nice! I built a persistent IPython kernel that your agent can operate through CLI commands. It goes in a somewhat similar direction, but without all the Marimo niceties: https://github.com/oegedijk/agentnb
[−] t-kalinowski 35d ago
Very cool!

We’ve been exploring a similar direction too, but with a plain REPL and a much thinner tool surface. In our case, it’s basically one tool for sending input, with interrupts and restarts handled through that same path. Marimo seems to expose much richer notebook structure and notebook-manipulation semantics, which is a pretty different point in the design space.

It seems like the tradeoff is between keeping the interaction model simple and the context small, versus introducing notebook structure earlier so the model works toward an artifact at the same time it iterates and explores. Curious how you think about that balance.

Repo: https://github.com/posit-dev/mcp-repl
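
To make the "one tool for sending input" shape concrete, here is a minimal in-process sketch. It is not taken from mcp-repl and makes no claim about how that project works: the class name `ReplSession` and the `:restart` control input are invented for illustration, and interrupts are left out because they would need a separate process or thread.

```python
import contextlib
import io
import traceback


class ReplSession:
    """A single-entry-point REPL: every interaction goes through send()."""

    RESTART = ":restart"  # hypothetical control input, sent through the same path

    def __init__(self) -> None:
        # The namespace is the session's only state; it persists across calls.
        self._namespace: dict = {}

    def send(self, text: str) -> str:
        """Run the input in the persistent namespace and return captured output."""
        if text.strip() == self.RESTART:
            self._namespace = {}
            return "[session restarted]"
        out = io.StringIO()
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(out):
            try:
                exec(compile(text, "<agent>", "exec"), self._namespace)
            except Exception:
                # Return the traceback as text so the caller can read and react to it.
                out.write(traceback.format_exc())
        return out.getvalue()


session = ReplSession()
session.send("x = 41")
print(session.send("print(x + 1)"))  # 42
```

State lives only in the namespace, so a restart is just a fresh dictionary. A notebook-style surface would add cell identity and dependency information on top of this, which is the extra structure the tradeoff above is about.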

[−] llamavore 35d ago
Looks cool. I love notebooks.

I built something similar for Jupyter a while back, using just plain CLI agent harnesses.

It supports Codex subscriptions and pi (it used to support Claude subs, and might still be okay since I didn’t modify the system prompt).

It has some bugs and needs some work, but getting help and code changes inline in Jupyter is way better than copy-pasting hard-to-select text from cells and cell output all day.

https://github.com/madhavajay/cleon

[−] BloodAndCode 35d ago
Super loved the idea about maintaining consistency! Artifacts will make it possible to not lose the thread and reproduce results when working in a team. Love it. If a cell happens to take a long time to compute (large dataset) — how does the agent behave? Does it wait or keep going?
[−] bharat1010 35d ago
The idea of an agent having actual working memory inside a live notebook session rather than just firing off ephemeral scripts is genuinely clever — this feels like a much more natural way for humans and models to collaborate.
[−] kvikuz 30d ago
Interesting framing. Do reactive dependencies make it easier to replay agent runs deterministically?
[−] danieltanfh95 35d ago
I built https://github.com/danieltanfh95/replsh to pair with local Python sessions without additional dependencies, allowing LLMs to directly ground their investigation and coding against local repos and environments. Docker is now supported as well, and SSH support will come in the near future.
[−] bojangleslover 35d ago
This rules. Just closed on a bunch of data science I was doing on the Medicaid dataset thanks to this. Very timely, zero bugs.

Well done Trevor and team!