Schedule tasks on the web (code.claude.com)

by iBelieve 243 comments 298 points

[−] gowthamgts12 50d ago
Interesting to see that feature launches come via the official website while usage restrictions come in via a team member's Twitter account - https://x.com/trq212/status/2037254607001559305.

Also, someone rightly predicted this rug pull when they announced 2x usage - https://x.com/Pranit/status/2033043924294439147

[−] stingraycharles 50d ago
To me it makes perfect sense for them to encourage people to do this, rather than, e.g., making things more expensive for everyone.

The same as charging a different toll price on the road depending on the time of day.

[−] girvo 50d ago
Funnily, Anthropic's pricing etc. is why I'm using GLM-5 a bunch more outside of work. Definitely not Opus level, but surprisingly decent. Though I got lucky and got the Alibaba Coding Model lite plan, which was so cheap they got rid of it
[−] saratogacx 50d ago
I've been doing something similar. I use Claude for analysis and non-coding work, GLM for most coding tasks (GLM's coding plan), and when I need to do a larger implementation project I use GLM and Claude to build out an in-depth plan and toss it to GitHub Copilot to have Opus do the implementation.

I was trying to get the Alibaba plan but missed the mark. I'm curious to try out the MiniMax coding plan ($10/mo) or Kimi ($20/mo) at some point to see how they stack up.

For pricing: GLM was $180 for a year of their pro tier during a Black Friday sale, and GHCP was $100/year, but they don't have the annual plan anymore, so it is now $120. Alibaba's only coding plan today is $50/mo, too rich for me.

[−] girvo 49d ago
MiniMax 2.7 through their lowest plan is quite impressive too! I’ve not tried the Kimi ones yet
[−] brianjking 50d ago
Does GLM-5 have multimodality or are they still wanting you to load an MCP for vision support?
[−] girvo 50d ago
Text only still, sadly, though qwen3.5-plus on the same provider (Model Studio) is
[−] aplomb1026 49d ago
[dead]
[−] trvz 50d ago
If you use the cloud providers you accept this and more.

If you want stability, own the means of inference and buy a Mac Studio or Strix Halo computer.

[−] tyre 50d ago
If you read the replies to the second, you’ll see an engineer on Claude Code at Anthropic saying that it is false.

Someone spread FUD on the internet, incorrectly, and now others are spreading it without verifying.

[−] hobofan 50d ago
And if you look closely at the usernames, you'll see that the same engineer from link 2 who said "nah it's just a bonus 2x, it's not that deep" (just two weeks ago) is now saying "we're going to throttle you during peak hours" (as predicted).

Yes, it was FUD, but it ended up being correct. With the track record Anthropic has (e.g. the months-long denial of dumbed-down models last year, only to later confirm it as a "bug"), this just continues to erode trust, and such predictions are the result of that.

[−] browningstreet 50d ago
Anthropic fixed that bug way faster than Apple fixed the iOS keyboard "bug". Anthropic even acknowledged it; Apple gave us the silent treatment for years.

I'm not sure it's a rug pull when their stats show 7% and 2% subscription-level impacts. We're back in the ISP days, and they never said unlimited.

[−] nickandbro 50d ago
I feel like we are just inching closer and closer to a world where rapid iteration of software will be the default. Like, for example: a trusted user gives feedback -> the feedback gets curated into a ticket by an AI agent, then turned into a PR by an agent, then reviewed by an agent, before being deployed by an agent. We are maybe one or two steps from the flywheel being complete. Or maybe we are already there.
[−] jwpapi 50d ago
I just don’t see it coming. I was fully in that camp 3 months ago, but I’ve realized every step makes more mistakes. It leads to a deadlock, where no human has the mental model anymore.

Don’t you guys have hard business problems that AI just can’t solve, or solves only very slowly, presenting you 17 ideas until it finds the right one? I’m using the most expensive models.

I think the nature of AI might block that progress, and I think some companies have woken up while others will wake up later.

The mistake rate is just too high. And every system you implement to reduce that rate has a mistake rate as well and increases complexity and the necessary exploration time.

I think the big bulk of people is now where the early adopters were in December. AI can implement working functionality on a well-maintained codebase.

But it can’t write maintainable code itself. It actually makes you slower compared to assisted-writing the code, because assisted, you are much more in the loop and you can stop a lot of small issues right away. And you iterate everything fast.

I’ve not opened my IDE for 1 month and it became hell at a point. I’ve now deleted 30k lines, and the amount of issues I’m seeing has been an eye-opening experience.

Unscalable performance issues, verbosity, straight-up bugs, escape hatches against my verification layers, quintupled types.

Now I could monitor the AI output more closely, but then again I’m faster writing it myself, because it’s one task. AI-assisted typing isn’t slower than my brain is.

Also, thinking more about it: FAANG pays ~$300 per line in production, so what are we really trying to achieve here? Speed was never the issue. A great coder writes 10 production lines per day.

Accuracy, architecture, etc. are the issue. You get those by building good, solid fundamental blocks that make feature additions easier over time, not slower.

[−] jwpapi 49d ago
Wow so many replies.

I think it comes down to two camps: people saying AI is improving on these issues, and people countering.

I don’t know for sure, but to me it seems the last 2 years weren’t necessarily 'intelligence' improvements but post-training improvements and tool connections, plus reduced censorship.

I’m now using less AI than ever, and I was burning $1000/month before Claude Code. I have a couple of really fundamental functions built that help me solve a big chunk of specific problems, and I can build a lot on top of them. Adding functionality became easier, not more complicated.

I would think that for the business problems I’m facing, AI is right less than 30% of the time. For example, deciding how to set up databases for maximum efficiency, or how to write efficient queries. Everything that in the end is real moat compared to your vibe-coded competitors.

From my personal experience, I’ve seen a lot of vibe-coded companies stuck and barely adding necessary functionality or features, and my guess is that they don’t trust changes anymore.

So even if AI were as good as a really good coder, one thing would still be missing: a person who knows exactly what is happening.

And I mean, okay, it might write a form real quick. But a modern form needs to do a lot of things, and if you have established patterns for all kinds of inputs, the implementation is mundane anyway.

It’s like when you learn coding: type it yourself to learn. So if you can’t scale the AI-only codebase, at some point you have to learn it, and I’d argue the most efficient way right now is to write in it yourself.

And I’m also arguing that it’s really tough to get software so good that it’s actually an asset on the market with vibe coding only. It seems like it’s more of a drug for wannapreneurs than actual asset-building.

Like, it builds you a Netflix clone, but what you get is barely the code you need to write a Netflix competitor.

[−] onionisafruit 50d ago
I know it’s not your main point, but I’m curious where $300/line comes from. I don’t think I’ve ever seen a dollar amount attached to a line of production code before.
[−] aspenmartin 50d ago
I think this sounds like a true yet short-sighted take. Keep in mind these features are immature, but they exist to obtain a flywheel and corner the market. I don’t know why, but people seem to consistently miss two points and their implications:

- Performance is continuing to increase incredibly quickly, even if you rightfully don’t trust any particular evaluation. Scaling laws (Chinchilla, plus RL scaling laws for both training and test time) point this way.

- coding is a verifiable domain

The second one is the most important. Agent quality is NOT limited by human code in the training set; that code is simply used for efficiency: it gets you to a good starting point for RL.

Claiming that things will not reach superhuman performance, INCLUDING on all end-to-end tasks (understanding a vague, poorly articulated business objective, architecting a system, building it out, testing it, maintaining it, fixing bugs, adding features, refactoring, etc.), is what carries the burden of proof, because we can literally predict performance (albeit through a complicated relationship between benchmarks and real-world performance).

Yes, error rates are definitely still too high for this to be totally trusted end to end, but they are improving consistently, and this is what the METR time-horizon benchmark shows.

[−] sobellian 50d ago
Scaling laws vs combinatorial explosion: who wins? In my personal experience, Claude does exceedingly well on mundane code (do a migration, add a field, wire up this UI) and quite poorly on code that has likely never been written before (even if it is logically simple for a human). The question is whether this is a quantitative or qualitative barrier.

Of course it's still valuable. A real app has plenty of mundane code despite our field's best efforts.

[−] nprateem 50d ago
But the issue isn't coding, it's doing the right thing. I don't see anywhere in your plan some way of staying aligned to core business strategy, forethought, etc.

The number of devs will shrink, but there will still be large activities that can't be farmed out without an overall strategy.

[−] jwpapi 48d ago
Today a great video came about it: https://www.youtube.com/watch?v=vFUjcHhOpgA
[−] EdgeNRoots 50d ago
[dead]
[−] chatmasta 50d ago
I love everything about this direction except for the insane inference costs. I don’t mind the training costs, since models are commoditized as soon as they’re released. Although I do worry that if inference costs drop, the companies training the models will have no incentive to publish their weights because inference revenue is where they recuperate the training cost.

Either way… we badly need more innovation in inference price per performance, on both the software and hardware side. It would be great if software innovation unlocked inference on commodity hardware. That’s unlikely to happen, but today’s bleeding edge hardware is tomorrow’s commodity hardware so maybe it will happen in some sense.

If Taalas can pull off burning models into hardware with a two month lead time, that will be huge progress, but still wasteful because then we’ve just shifted the problem to a hardware bottleneck. I expect we’ll see something akin to gameboy cartridges that are cheap to produce and can plug into base models to augment specialization.

But I also wonder if anyone is pursuing some more insanely radical ideas, like reverting back to analog computing and leveraging voltage differentials in clever ways. It’s too big brain for me, but intuitively it feels like wasting entropy to reduce a voltage spike to 0 or 1.

[−] Leptonmaniac 50d ago
I think that as a user I'm so far removed from the actual (human) creation of software that if I think about it, I don't really care either way. Take for example this article on Hacker News: I am reading it in a custom app someone programmed, which pulls articles hosted on Hacker News which themselves are on some server somewhere and everything gets transported across wires according to a specification. For me, this isn't some impressionist painting or heartbreaking poem - the entity that created those things is so far removed from me that it might be artificial already. And that's coming from a kid of the 90s with some knowledge in cyber security, so potentially I could look up the documentation and maybe even the source code for the things I mentioned; if I were interested.
[−] dominotw 50d ago
I don't mean this as shade, but people who are not coders now seem to think "coding is now solved" and push absurd ideas like shipping software via Slack messages. These people are often high up in the chain and have never done serious coding.

Stripe is apparently pushing a gazillion PRs from Slack now, but their feature velocity has not changed. So what gives?

How is the number of PRs now the primary metric of productivity, with no one caring about what is being shipped or whether we're shipping product faster? It's total madness right now. Everyone has lost their collective minds.

[−] theredbeard 50d ago
We haven’t been inching closer to users writing a half-decent ticket in decades though.
[−] slopinthebag 50d ago
What kind of software are people building where AI can just one shot tickets? Opus 4.6 and GPT 5.4 regularly fail when dealing with complicated issues for me.
[−] jvuygbbkuurx 50d ago
Trusted user, like Jia Tan.
[−] heavyset_go 50d ago
Feedback loops like that would be an exercise in raising garbage-in->garbage-out to exponential terms.

It's the "robots will just build/repair themselves" trope but the robots are agents

[−] simianwords 50d ago
I remember when I tried to set something up with the ChatGPT equivalent, like "notify me only if there are traffic disruptions on my route every morning at 8am", and it would notify me every morning even if there was no disruption.
[−] kelvinjps10 50d ago
I feel like a lot of people and companies wanted to automate the web, but most websites' operators wouldn't let you and would block you. Now you put the name AI on it and you're allowed to do it.
[−] javiercr 50d ago
I've recently switched from GitHub Copilot Pro to Claude Code Max (20x). While Claude is clearly superior in many aspects, one area where it falls short is remote/cloud agents.

Yesterday, I spent the entire day trying to set up "Claude on the web" for an Elixir project and eventually had to give up. Their network firewall kept killing Hex/rebar3 dependency resolution, even after I selected "full" network access.

The environment setup for "on the web" is just a bash script. And when something goes wrong, you only see the tail of the log. There is currently no way to view the full log for the setup script. It's really a pain to debug.
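One workaround I've been trying (a sketch, not an official feature; whether the log file survives in your workspace depends on the environment, but the capture itself is plain bash): mirror the setup script's output to a file so the full log exists somewhere even when the UI only shows the tail.

```shell
#!/usr/bin/env bash
# Sketch: wrap the real setup steps in a function and tee the output to a
# workspace file, so the full log survives even if the UI only shows the tail.
# Assumes the environment runs this as bash and lets you read workspace files.
set -euo pipefail

setup() {
  echo "installing dependencies"
  # the real script would run e.g. `mix local.hex --force && mix deps.get` here
  echo "setup complete"
}

setup 2>&1 | tee setup.log
```

With `pipefail` set, a failing step still fails the script, but `setup.log` keeps everything that was printed up to that point.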

The Copilot equivalent to "Claude on the web" is "GitHub Copilot Coding Agents," which leverages GitHub Actions infrastructure and conventions (YAML files with defined steps). Despite some of the known flaws of GitHub Actions, it felt significantly more robust.

"Schedule task on the web" is based on the same infrastructure and conventions as "Claude on the web", so I'm afraid I'm gonna have the same troubles if I want to use this.

[−] iBelieve 50d ago
Looks like I'm limited to only 3 cloud scheduled tasks. And I'm on the Max 20x plan, too :(

"Your plan gets 3 daily cloud scheduled sessions. Disable or delete an existing schedule to continue."

But otherwise, this looks really cool. I've tried using local scheduled tasks in both Claude Code Desktop and the Codex desktop app, and very quickly got annoyed with permissions prompts, so it'll be nice to be able to run scheduled tasks in the cloud sandbox.

Here are the three tasks I'll be trying:

Every Monday morning: Run pnpm audit and research any security issues to see if they might affect our project. Run pnpm outdated and research any packages with minor or major upgrades available. Also check whether packages have been abandoned or haven't been updated in a long time, and see if there are recommended alternatives. Put together a brief report highlighting your findings and recommendations.

Every weekday morning: Take a look at Sentry errors, logs, and metrics for the past few days. See if any new issues have popped up, and investigate them. Take a look at logs and metrics, see if anything seems out of the ordinary, and investigate as appropriate. Put together a report summarizing any findings.

Every weekday morning: Please look at the commits on the develop branch from the previous day, look carefully at each commit, and see if there are any newly introduced bugs, sloppy code, missed functionality, poor security, missing documentation, etc. If a commit references GitHub issues, look up the issue, and review the issue to see if the commit correctly implements the ticket (fully or partially). Also do a sweep through the codebase, looking for low-hanging fruit that might be good tasks to recommend delegating to an AI agent: obvious bugs, poor or incorrect documentation, TODO comments, messy code, small improvements, etc.

I ran all of these as one-off tasks just now, and they put together useful reports; it'll be nice getting these on a daily/weekly basis. Claude Code has a Sentry connector that works in their cloud/web environment. That's cool; it accurately identified an issue I've been working on this week.

I might eventually try having these tasks open issues or even automatically address issues and open PRs, but we'll start with just reports for now.
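For what it's worth, the data-gathering half of the Monday task is deterministic and could run as plain commands, with the agent only doing the analysis and write-up. A rough sketch (file names are my own; `pnpm audit --json` is a real flag, and the guard just keeps the sketch runnable on machines without pnpm):

```shell
# Collect the raw inputs for the Monday-morning report with plain commands;
# the agent's job then shrinks to analysis and summarization.
if command -v pnpm >/dev/null 2>&1; then
  pnpm audit --json > audit.json || true   # audit exits nonzero when issues exist
  pnpm outdated > outdated.txt || true     # likewise when packages are outdated
else
  echo "pnpm not installed; nothing to report" > outdated.txt
  echo '{}' > audit.json
fi
```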

[−] monkeydust 50d ago
I do feel people will end up using this for things where a deterministic rule would be more effective, faster, and cheaper. I'm seeing this start to happen at work: "We need AI to solve X..." "No, you don't."
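The traffic-alert example elsewhere in this thread is the canonical case: the condition is checkable with plain code, so the notifier (or a model, if one is needed at all) only runs when the guard fires. A sketch with a made-up local data file (the file and its shape are hypothetical):

```shell
# Deterministic guard: decide "is there anything to report?" with plain code,
# and only invoke a notifier/model when the condition actually holds.
# traffic.json and its field names are invented for this sketch.
cat > traffic.json <<'EOF'
{"disruptions": []}
EOF

count=$(grep -c '"id"' traffic.json || true)   # one "id" per disruption entry
if [ "$count" -gt 0 ]; then
  echo "ALERT: $count disruptions on route"    # the notifier would fire here
else
  echo "no disruptions; staying quiet"
fi
```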
[−] chopete3 50d ago
Claude is moving fast.

https://grok.com/tasks

Grok has had this feature for some time now; I was wondering why others hadn't done it yet.

This feature increases user stickiness. They give 10 concurrent tasks for free.

I've had it extract specific news first thing in the morning across multiple sources.

[−] mkagenius 50d ago
This is a bit restrictive: it doesn't take screenshots, so you can't say "take screenshots of my homepage and send them to me via email".

It doesn't allow egress via curl, apart from a few hardcoded domains.

I have created Cronbox in the cloud, which has better utility than the above. I did a "Show HN: Cronbox – Schedule AI Agents" a few days back.

https://cronbox.sh

and a pelican riding a bicycle job -

https://cronbox.sh/jobs/pelican-rides-a-bicycle?variant=term...

[−] zmmmmm 50d ago
I'm missing something basic here... what does it actually do? It executes a prompt against a git repository. Fine - but then what? Where does the output go? How does it actually persist whatever the outcome of the prompt is?

Is this assuming you give it git commit permission and it just does that? Or it acts through MCP tools you enable?

[−] jFriedensreich 50d ago
We need to fight model providers trying to own memory, workflows and tooling. Don't give them an inch more of your software than needed even if there is a slight inconvenience setting up.
[−] delphic-frog 49d ago
The pricing discussion is interesting, but I think people are missing the bigger picture. Being able to schedule agents to run tasks on a cron is genuinely useful for solo devs who can't justify hiring someone to handle repetitive maintenance work. I've been using AI agents for image processing stuff, and the autonomous loop is where it works.
[−] nickphx 49d ago
Who cares? Why does the hype machine need to hype the most inane 'features' as if they are novel, useful, or relevant?
[−] arjie 50d ago
What's the per-unit-time compute cost (independent of tokens)? Compute deadline etc.? They don't charge for the Cloud Environment https://code.claude.com/docs/en/claude-code-on-the-web#cloud... currently running?
[−] 0898 50d ago
One interesting restriction is that it won’t do anything with people’s faces.

I run conferences and I like to have photos of delegates on the page so you can see who else is attending.

I wanted to automate this by having Claude go to the person’s LinkedIn profile and save the image to the website.

But it seems it won’t do that because it’s been instructed not to.

[−] hirako2000 50d ago
Oh my, did Anthropic invent Cron jobs as a service?

It's a game changer.

Edit: my mistake. It's inferior to a cron job. If my repos happen to be self-hosted with Forgejo or Codeberg, then it won't even work. If I concede to use GitHub, though, I don't have to set up any env variables. Scheduled lock-in, all over the web.

[−] throwatdem12311 50d ago
So this is basically just Anthropic's version of Open Claw that they manage for you and that you pay them for.
[−] sarpdag 50d ago
I can't pick the effort level for tasks run on Claude Web. I have a feeling Claude uses low or medium effort on those tasks, and I observe clear quality differences compared with the same tasks run in my local Claude Code, which uses high effort.
[−] nlawalker 49d ago
Make sure to see channels too, just shared here last week -

Push events into a running session with channels: https://news.ycombinator.com/item?id=47448524

[−] lucgagan 50d ago
Here goes my project.
[−] pastel8739 50d ago
Is this free? I don’t see pricing info. I guess just a way to make you forget that you’re spending money on tokens?
[−] mememememememo 50d ago
The PHP script from a cron tab is back!
[−] PeterStuer 50d ago
Is only Github supported as a repository?
[−] dbvn 50d ago
It would be easier to use Claude to write a cron job that does the same thing for you, but accurately.
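The local equivalent is roughly one crontab line driving Claude Code's headless print mode (a sketch; it assumes the `claude` CLI is installed, that `-p` behaves this way on your version, and the repo/report paths are placeholders):

```shell
# Crontab sketch: weekdays at 7am, run a headless prompt in the repo and
# append the result to a dated report file. Note that `%` must be escaped
# as `\%` inside crontab entries.
0 7 * * 1-5 cd /path/to/repo && claude -p "Review yesterday's commits on develop and summarize any new bugs or risky changes" >> "$HOME/reports/daily-$(date +\%F).md" 2>&1
```

The trade-off is what the thread describes: you own the schedule and the output location, but you also own the machine it runs on.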
[−] Steinmark 49d ago
[dead]
[−] maxbeech 50d ago
[dead]
[−] MeetRickAI 50d ago
[flagged]
[−] commers148 50d ago
[flagged]
[−] georaa 49d ago
[flagged]
[−] jngiam1 50d ago
This is powerful. Combined with MCPs, you can pretty much automate a ton of work.