Congratulations! The distinction between pure agentic exploration and deterministic steps is spot on. Runbooks give ops more confidence in the data exploration and save time/context.
Curious how much savings you observe from using a runbook versus purely letting Claude do the planning at first. Also, how can the runbooks self-heal if results from some steps in the middle are not as expected?
>> how can the runbooks self-heal if results from some steps in the middle are not as expected?
Yeah, this is a very interesting angle. Our primary mechanism here today is agent-created auto-memories. The agent keeps track of the most useful steps and, more importantly, the dead-end steps as it executes runbooks. We think this offers a great bridge to suggesting runbook updates and keeping them current.
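To make the auto-memory idea concrete, here is a minimal sketch of how step outcomes could be tallied and turned into update suggestions. It is an illustration only, not Relvy's implementation: `StepMemory`, `record`, `suggest_removals`, the `min_runs` threshold, and the "check CPU on cache hosts" step are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class StepMemory:
    """Hypothetical per-runbook memory of how each step has turned out."""

    useful: Counter = field(default_factory=Counter)
    dead_end: Counter = field(default_factory=Counter)

    def record(self, step: str, was_useful: bool) -> None:
        # Count one execution of `step` as either useful or a dead end.
        (self.useful if was_useful else self.dead_end)[step] += 1

    def suggest_removals(self, min_runs: int = 3) -> list[str]:
        # Flag steps that were a dead end at least `min_runs` times
        # and were never recorded as useful.
        return sorted(
            step
            for step, misses in self.dead_end.items()
            if misses >= min_runs and self.useful[step] == 0
        )


memory = StepMemory()
for _ in range(3):
    memory.record("check logs for service frontend, faceted by host_name", True)
    memory.record("check CPU on cache hosts", False)  # hypothetical step

print(memory.suggest_removals())
```

A real system would presumably store richer context than counts (which alert, what the step returned), but even a tally like this is enough to flag candidate steps for a human to review.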
>> Curious how much savings you observe from using a runbook versus purely letting Claude do the planning at first.
It really depends on runbook quality, so I don't have a straightforward answer. Of course, it's faster and cheaper if you have well-defined steps in your runbooks. As an example, compare "check logs for service frontend, faceted by host_name" with just "check logs". The agent does more exploration in the latter case.
Re: savings - it depends on the use case. For example, one of our users set up a small runbook to run a group-by-IP query for high-throughput alerts, since that was their most common first response to those alerts. That alone cuts out a couple of minutes of exploration per incident and removes the variability of the agent deciding what data to investigate and how to slice it.
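As a rough illustration of what a deterministic "group by IP" first step amounts to, here is a minimal sketch over in-memory records. The record shape, the `client_ip` field name, and the `top_ips` helper are assumptions for the example; a real runbook step would run the equivalent query against the team's own log or metrics backend.

```python
from collections import Counter


def top_ips(records, n=3):
    """Count requests per client IP and return the n busiest, highest first."""
    counts = Counter(r["client_ip"] for r in records)
    return counts.most_common(n)


# Hypothetical sample of request log records.
sample = [
    {"client_ip": "10.0.0.7", "path": "/search"},
    {"client_ip": "10.0.0.1", "path": "/"},
    {"client_ip": "10.0.0.7", "path": "/search"},
    {"client_ip": "10.0.0.3", "path": "/login"},
    {"client_ip": "10.0.0.7", "path": "/search"},
    {"client_ip": "10.0.0.1", "path": "/cart"},
]

for ip, hits in top_ips(sample, n=2):
    print(ip, hits)
```

Because the slicing is fixed in advance, every high-throughput alert starts from the same view of the data instead of whatever the agent happens to pick.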
In our experience, runbooks provide a consistent, fast, and reliable way of investigating incidents (or ruling out common causes). In their absence, the AI does its usual open-ended exploration.
Congrats on the Relvy launch and YC! Automating on-call runbooks is a massive pain point. Have you considered how generative AI might further enhance the diagnostic or remediation steps, perhaps by suggesting solutions based on past incidents?
How does this differ from Cursor cloud agents, where I can hook up MCPs etc. and even launch the agent in my own cloud to connect directly to internal hosts like DBs?
This is a great tool for enterprises, specifically customer support teams, who can quickly triage customer escalations and take a first stab at issues without escalating to internal teams. All the best, guys! Rooting for you!
Interesting! tbh, we don't have any runbooks and have pretty minimal telemetry set up (we're a very small team :). Do you have any recommendations on which telemetry service to use to get started? Right now, our services run on a combination of GCP Cloud Run + Vercel.
Amazing product guys! I tried it on a few of my issues and it was pretty spot on in finding the root cause. Loved it!
Are you planning to support auto PRs for fixes? Would be a cool addition
Interesting!
In my experience, custom harnesses have worked better, e.g. Stripe etc. all did it custom, largely because of the sensitive integrations. How would you handle that?
We wrote about the LLM costs of investigating production alerts more generally here, in case helpful: https://relvy.ai/blog/llm-cost-of-ai-sre-investigating-produ...