Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel

[−] rwmj 59d ago

Better to link to the site itself, or one of the reviews?

For an example of a review (picked pretty much at random) see: https://sashiko.dev/#/patchset/20260318151256.2590375-1-andr...

The original patch series corresponding to that is: https://lkml.org/lkml/2026/3/18/1600

Edit: Here's a simpler and better example of a review: https://sashiko.dev/#/patchset/20260318110848.2779003-1-liju...

I'm very glad they're not spamming the mailing list.

[−] jeffbee 59d ago

That is both really useful and a great example of why they should have stopped writing code in C decades ago. So many kernel bugs have arisen from people adding early returns without thinking about the cleanup functions, a problem that many other language platforms handle automatically on scope exit.

[−] KurSix 58d ago

You don't even need an LLM for this stuff. GCC has the __cleanup__ attribute, and kernel static analyzers like Smatch have been catching missing unlocks for a decade now. People just ignore linter warnings when submitting patches, so the language itself isn't really the issue. The LLM is basically just acting as a talking linter that can explain the error in plain English

[−] jeffbee 58d ago

Linux doesn't have any of: sufficient testing, sufficient static analysis, or sufficient pre-commit code review. Under those conditions, which I take as a given because it's their project and we can't just swap out the leaders with more tasteful leaders, adding this type of third-party review feedback strikes me as valuable. Perhaps, to your point, it would also be possible to simply run static analyzers on new proposed commits.

[−] pvtmert 55d ago

To be honest, during my ~10 years of experience, I haven't stumbled on any _project_ that has _sufficient_ amount of either testing, static analysis, and/or, where everyone who contributes gives a damn about linter/formatter warnings or errors.

Unless there is a CI that fails immediately when these checks fail, and it is not possible to bypass/override, then of course quality in the codebase improves significantly.

But that is very hard to give business results, in terms of ROI, 99.9% of the PMs will sweep these under the rug, because of unclear ROI.

Another remark I have to make is that, these things should be in a cloud/container (isolated environment) somewhere. Because I know several people who has their git commit aliased with no-precommit-hooks, because well, otherwise they complain that it is too slow. (Slowness is another issue that can be improved, but again, less visible ROI than just adding the command-line switch)

[−] overfeed 59d ago

Must we do this on every thread about the Linux kernel?

[−] RobRivera 59d ago

The beatings will continue until morale improves

[−] vpShane 59d ago

yeah but Linux is love, linux is life. if you really want to get the beatings going:

Rust > C and GNU/Linux should be Rust.

[−] ugh123 59d ago

also vim > emacs

[−] Ferret7446 59d ago

[flagged]

[−] richwater 59d ago

[flagged]

[−] tigen 59d ago

This ought to help with that. https://thephd.dev/c2y-the-defer-technical-specification-its...

[−] nurettin 59d ago

> stopped writing code in C decades ago.

And what were they supposed to use in 2006? Free Pascal? Ada?

[−] TacticalCoder 59d ago

Looks like a great new tool to help ship less bugs!

Nitpicking on this though:

> "In my measurement, Sashiko was able to find 53% of bugs based on a completely unfiltered set of 1000 recent upstream issues based on "Fixes:" tags (using Gemini 3.1 Pro). Some might say that 53% is not that impressive, but 100% of these issues were missed by human reviewers."

That'd assume 100% of the issues that were fixed and used for training were not fixed following a human review. I don't buy it: it's extremely common to have a dev notice a bug in the code, without a user having ever reported the bug.

I think the wording meant to say: "... but 100% of these issues were first missed by humans".

My point being: the original code review by a human ain't the only code review by a human. Or put it this way: it's not as if we were writing code, shipping it, then never ever looking at that line of code again unless a bug report were to come out. It's not how development works.

[−] withinrafael 59d ago

Looks cool, but this site is a bit difficult for me to grok.

I think the table might be slightly inside-out? The Status column appears to show internal pipeline states ("Pending", "In Review") that really only matter to the system, while Findings are buried in the column on the far right. For example, one reviewed patchset with a critical and a high finding is just causally hanging out below the fold. I couldn't immediately find a way to filter or search for severe findings.

It might help to separate unreviewed patches from reviewed ones, and somehow wire the findings into the visual hierarchy better. Or perhaps I'm just off base and this is targeting a very specific Linux kernel community workflow/mindset.

Just my 1c.

[−] kleiba 59d ago

> Sashiko was able to find around 53% of bugs

That's cool. Another interesting metric, however, would be the false positive ratio: like, I could just build a bogus system that simply marks everything as a bug and then claim "my system found 100% of all bugs!"

In practice, not just the recall of a bug finding system is important but also its precision: if human reviewers get spammed with piles of alleged bug reports by something like Sashiko, most of which turn out not to be bugs at all, that noise binds resources and could undermine trust in the usefulness of the system.

[−] ChrisArchitect 59d ago

https://github.com/sashiko-dev/sashiko (https://news.ycombinator.com/item?id=47427996)

[−] monksy 59d ago

I think this is a great and interesting project. However, I hope that they're not doing this to submit patches to the kernel. It would be much better to layer in additional tests to exploit bugs and defects for verification of existance/fixes.

(Also tests can be focused per defect.. which prevents overload)

From some of the changes I'm seeing: This looks like it's doing style and structure changes, which for a codebase this size is going to add drag to existing development. (I'm supportive of cleanups.. but done on an automated basis is a bad idea)

I.e. https://sashiko.dev/#/message/20260318170604.10254-1-erdemhu...

[−] throwa356262 59d ago

I find it interesting that this is written in Rust (not golang) and co-authored with Claude (not gemini)

[−] Havoc 59d ago

How do the kernel devs feel about this? Cause that seems to be the sticking point for external AI “help” - the open source devs hate it

Seems to be a well funded effort though so maybe it’s better?

[−] simianwords 59d ago

> Roman reports that Sashiko was able to find around 53% of bugs based on an unfiltered set of 1,000 recent upstream Linux kernel issues with "Fixes: " tag

What does this mean?

[−] bmd1905 55d ago

[dead]

[−] bmd1905 56d ago

[dead]

[−] takahitoyoneda 59d ago

[dead]

[−] bmd1905 59d ago

[dead]

[−] whiteclawonso36 59d ago

[dead]

[−] mika-el 59d ago

[flagged]

[−] balinha_8864 59d ago

[dead]

[−] Heer_J 59d ago

[dead]

[−] ratrace 59d ago

[dead]

[−] michaelchen58 59d ago

[flagged]

[−] goatyishere25 59d ago

[flagged]

[−] quantium1628 59d ago

[flagged]

[−] 4fterd4rk 59d ago

oh god can we not

[−] qainsights 59d ago

They would have completely redesigned Google Gerrit.

Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel (phoronix.com)

48 comments