Can Claude Fly a Plane? (so.long.thanks.fish)

by casi 96 comments 107 points


[−] operatingthetan 31d ago
We already have advanced autopilots that can fly commercial airliners. We just don't trust them enough to not have human pilots. I would trust the autopilot more than freaking Claude. We already do, every day.
[−] dewey 31d ago
I don't think anyone is suggesting we should do that...but it's still a fun project to play around with?
[−] 16bytes 30d ago
In aviation there's a saying, "Aviate, Navigate, Communicate" which describes the hierarchy of things to pay attention to while piloting an aircraft.

Autopilot can be thought of better as "auto-aviate". That is to say, if there is already a navigation plan, the aircraft can follow that plan. Simple autopilots just keep the wings level, others can hold an altitude and change heading. More sophisticated ones can change altitude or even fully land the plane.

All of those things, however, require people to manage the "Navigate" part. "Aviate" is a deterministically solved problem, at least in normal flight operations. As you point out we trust autopilots today, including on (nearly) every single commercial flight.

LLMs are a poor alternative to "aviate", but they could be part of a better flight management automation package. The parent article tries to use the LLM to aviate, with predictable results.

If paired with a capable auto-pilot (not the relatively basic one on that C-172), the LLM could figure out how to operate the FMS and take you from post take-off to final approach and aid in situational awareness.

Currently, I don't think there is a commercial solution for GA aircraft that could say, "Ok, I'm 20NM from KVNY, but there are three people ahead of me in the pattern, so I have to do a right 360 before descending and joining downwind on 34L".

Having an LLM propose that course of action and tell the autopilot to execute on it definitely would be an improvement to GA safety.

[−] Ekaros 31d ago
I think we could trust them enough to not have human pilots. It's just that having a human in the loop is very useful in scenarios that aren't all that rare. Say the airfield has too much wind or fog, or another plane has crashed on all the runways... Someone needs to decide what to do next. Or when there's some system failure nobody thought about.

And well, if they're there, they might as well fly for practice.

And no, I would not allow an LLM into the loop for any decision involving the actual flying part.

[−] boring-human 31d ago

> We just don't trust them enough to not have human pilots.

Much of the value of a human crew is as an implicit dogfooding warranty for the passengers. If it wasn't safe to fly, the pilots wouldn't risk it day after day.

Come to think of it, it'd be nice if they posted anonymized third-party psych evaluations of the cockpit crew on the wall by the restrooms. The cabin crew would probably appreciate that too.

[−] teeray 31d ago
“Automation can lower the workload in some cases. But in other situations, using automation when it is not appropriate can increase one’s workload. A pilot has to know how to use a level of automation that is appropriate... Whether you’re flying by hand or using technology to help, you’re ultimately flying the airplane with your mind by developing and maintaining an accurate real-time mental model of your reality—the airplane, the environment, and the situation. The question is: How many different levels of technology do you want to place between your brain and the control surfaces?”[0]

—Sully Sullenberger

[0] Sully: My Search for What Really Matters. p. 188

[−] jmward01 31d ago
The question of 'can it fly' is clearly a 'yes, given a little bit of effort'. Flying isn't hard, autopilots have been around a long time. It is recognizing and dealing with things you didn't anticipate that is hard. I think it is more interesting to have 99% of flying done with automated systems but have an LLM focus on recognizing unanticipated situations and recovering or mitigating them.
[−] basfijneman 31d ago
If planes can fly on autopilot, I assume Claude can make a pretty good flight plan. Not sure if Claude can react in time if shit hits the fan.

"spawning 5 subagents"

[−] travisgriggs 31d ago
The bit in the middle where it decides to make its control loop be pure P(roportional), presumably dropping the I and D parts, is interesting to me. Seems like a poor choice.

I try to fly about once a week, and I’ve never really tried to self-analyze what my inputs are for what I do. My hunch is that there’s quite a bit of I(ntegral) damping I do to avoid over-correcting, but also quite a bit of D(erivative) adjustment, especially on approach, in order to “skate to the puck”. Definitely going to have to take it up with some flight buddies. Or maybe those with drone software control-loop experience can weigh in?
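
For anyone rusty on the terms, here's a minimal textbook PID step to make the P/I/D discussion concrete. The gains and the pitch-error framing are illustrative, not from the article; with `ki = kd = 0` it degrades to the pure-P loop the article's Claude chose.

```python
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        # I term: accumulates error over time, trimming steady-state offset.
        self.integral += error * dt
        # D term: rate of change of error, anticipating where it's heading
        # ("skate to the puck") to damp over-correction.
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

pid = PID(kp=0.5, ki=0.1, kd=0.2)
out = pid.step(error=2.0, dt=0.1)  # e.g. a 2-degree pitch error at a 10 Hz loop
```
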

[−] ramon156 31d ago

> CRASHED #2, different cause. Plane was stable in a slow descent but between fly.py invocations (~20 sec gap while I logged and computed the next maneuver) there was no active controller. Plane kept descending under its last commanded controls until it hit terrain at 26 ft MSL, 1.7 nm short of the runway. Lesson: never leave the controller idle in flight

Gold
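
The fix for that lesson is an old embedded-systems pattern: a command watchdog. A hypothetical sketch (names and timeout invented, not from the article): if no fresh command arrives within the timeout, fall back to a wings-level hold instead of letting the last commanded descent run into terrain.

```python
import time

TIMEOUT_S = 5.0
SAFE_DEFAULT = {"pitch": 0.0, "bank": 0.0}  # hold attitude, stop descending

class CommandWatchdog:
    def __init__(self):
        self.last_cmd = SAFE_DEFAULT
        self.last_time = time.monotonic()

    def update(self, cmd):
        # Called each time fly.py issues a fresh command.
        self.last_cmd = cmd
        self.last_time = time.monotonic()

    def current(self):
        # Called by the inner control loop every cycle.
        if time.monotonic() - self.last_time > TIMEOUT_S:
            return SAFE_DEFAULT  # the ~20 s gap between invocations now fails safe
        return self.last_cmd
```
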

[−] webprofusion 31d ago
"Can I Get Claude to Fly A Plane" isn't the same thing. Interesting, though; it would be a good test for different models, but it relies on the test harness being good enough that a human could also use the same info to achieve the required outcome. E.g. if the input/output latency is too high, then nobody could do it.
[−] bottlepalm 31d ago
AI being able to quickly react to real time video input is the next thing. Computer use right now is painfully slow working off a slow screenshot/command loop.
[−] alex_duf 31d ago
Claude uses the wrong modality to be a piloting model. Latency is critical, and outputting tokens in the hope they take the action at the right time is kinda bonkers.

You'd want all the data from the plane to be input neurons, and all the actions to be output neurons.

[−] progx 31d ago
Prepare for landing: "Rate limit exceeded" (Error 429) ;-)
[−] morpheuskafka 31d ago
Surely at least part of the issue here is that even a fast LLM operates at tens of tokens per second (not to mention the extra tokens for "thinking/reasoning" mode), while a real autopilot probably has response times in the tens of milliseconds. Plus the network latency versus a local LLM.
[−] ikari_pl 31d ago
"Approaching for landing"

"500 Our Servers Are Experiencing High Load"

"500 Our Servers Are Experiencing High Load"

"500 Our Servers Are Experiencing High Load"

[−] Markoff 31d ago
I wouldn't really worry about flying, but more about taking off/landing.

Related from December 2025: Garmin Emergency Autoland deployed for the first time

https://www.flightradar24.com/blog/aviation-news/aviation-sa...

[−] leptons 31d ago
Does Claude know the plane isn't at the car wash?
[−] hansmayer 31d ago
Mate, we don't trust it to write an email or the code it generates. Why should we trust it to fly a plane?
[−] est 31d ago

> main issue seemed to be delay from what it saw with screenshots and api data and changing course.

This is where I think Taalas-style hardware AI may dominate in the future, especially for vehicle/plane autopilots, even if it can't update weights. But determinism is actually a good thing.

[−] mihaaly 31d ago
A friend participating in some sort of simulated glider tournament trained a neural network to fly one somehow (don't ask for details). I recall the rules were changed to ban that sort of thing, though not because of him.

Using Claude sounds like overkill and a poor fit at the same time.

[−] thewhitetulip 31d ago
Humans can also fly. Once.
[−] rkagerer 31d ago
You could also use your forehead as a hammer, but it's likewise going to result in more pain than gain.

I wouldn't trust Claude to ride my bike, so I certainly wouldn't board its flight.

[−] edu 31d ago
Beyond what the article covers, I think a big issue for this would be the speed of the input-decision-act loop: it needs to be pretty fast, and Claude would introduce a lot of latency into it.
[−] vachina 31d ago
Give a stochastic text generator to physics. What can go wrong.
[−] Paracompact 31d ago
As most others have pointed out, the goal from here wouldn't be to craft a custom harness so that Claude could technically fly a plane 100x worse than specialist autopilots. Instead, what would be more interesting is if Claude's executive control, response latency, and visual processing capabilities were improved in a task-agnostic way so that as an emergent property Claude became able to fly a plane.

It would still be better just to let autopilots do the work, because the point of the exercise isn't improved avionics. But it would be an honestly posed challenge for LLMs.

[−] userbinator 31d ago
The real question is, can it keep the plane in one piece?
[−] nairboon 31d ago
Let's hope you don't reach Claude's session limit during approach, while trying to correct a slightly too steep descent angle.
[−] Nevin1901 31d ago
I wonder if using a model with a higher TOK/s would yield improvements, as the model will have faster feedback loops
[−] johntopia 31d ago
If there's a timeline where claude can actually fly a plane, then operating nuclear reactors can be possible as well.
[−] resiros 31d ago
I think you gave someone an idea for a new RL environment :) Probably it will be able to fly it in the next iteration.
[−] nelox 31d ago
So Claude crashed because it was busy figuring out how to fly the plane?
[−] razorbeamz 31d ago
I'd imagine Claude is too slow to fly a plane, above everything else.
[−] blitzar 31d ago
Sky King managed it, no reason Claude shouldn't be able to.
[−] dist-epoch 31d ago
Try using codex-5.3-spark; it has much faster inference and might be able to keep up. And maybe a specialized OpenRouter model just for visual parsing.
[−] xuxu298 31d ago
Haha, even if it can, would you dare to fly with it? :D
[−] otabdeveloper4 31d ago
Yes, but for a limited time only.
[−] monour 31d ago
They say they're already being used in some missiles, which hit a school by mistake in the current war.
[−] kqr 31d ago
Lots of people commenting seem to have not read the article. The author didn't hook Claude up directly with the controls, asking it to one-shot a successful flight.

The author tried getting Claude to develop an autopilot script while being able to observe the flight for nearly live feedback. It got three attempts, and did not manage autolanding. (There's a reason real autopilots do that assisted with ground-based aids.)

[−] linzhangrun 31d ago
[flagged]
[−] black_13 30d ago
[dead]