The future of code search is not regex – 100x faster than ripgrep

[−] kristopolous 44d ago

I ran across this fascinating tool a few days ago while researching embedding models on Hugging Face.

It's advertised as "ColGREP Semantic code search for your terminal and your coding agents".

I haven't put it in any harness yet but I probably should.

https://github.com/lightonai/next-plaid/tree/main/colgrep

I've also tried ast-grep (also known as sg), but LLMs really mess up when using it. I think you'd need to fine-tune.

If anyone has cracked that case I'd love to hear about it

[−] wiseowise 44d ago

The future is lack of scrolling on mobile, and scanning getting stuck, apparently.

[−] narinciye 44d ago

You don't miss much, don't worry, it also looks horrible on desktop.

[−] neogoose 43d ago

it looks absolutely gorgeous btw, but the idea is that you can try the search speed, not actually use it lmao

[−] austinjp 44d ago

To save people the digging, here's the git repo:

https://github.com/dmtrKovalenko/fff.nvim

"FFF stands for freakin fast fuzzy file finder (pick 3) and it is an opinionated fuzzy file picker for your AI agent and Neovim. Just for file search, but we do the file search really fff well.

FFF is a tool for grepping, fuzzy file matching, globbing, and multigrepping with a strong focus on performance and useful search results. For humans - provides an unbelievable typo-resistant experience, for AI agents - implements the fastest file search with additional free memory suggesting the best search results based on various factors like frecency, git status, file size, definition matches, and more."
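
For anyone wondering what "frecency" in that blurb means in practice: it usually blends how often and how recently a file was opened. Below is a minimal Python sketch, assuming an exponential half-life decay plus a flat bonus for files that git reports as modified. The names `FileStats`, `frecency`, and `rank`, and all the weights, are hypothetical and not taken from fff.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    access_times: list[float]  # Unix timestamps of past opens
    git_modified: bool         # True if the file shows up as changed in git status


def frecency(access_times, now, half_life_days=7.0):
    """Sum of exponentially decayed visits: frequent AND recent files score highest."""
    half_life = half_life_days * 86400.0
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)


def rank(files, now, git_bonus=1.0):
    """Order candidates by frecency plus a flat (assumed) bonus for modified files."""
    def score(f):
        return frecency(f.access_times, now) + (git_bonus if f.git_modified else 0.0)
    return sorted(files, key=score, reverse=True)
```

With these assumed weights, a file opened once today outranks a file opened five times a month ago, because each old visit has decayed through more than four half-lives.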

[−] adrian17 43d ago

So the repo builds:

- C library

- neovim plugin

- MCP server

But not a plain binary, which is the main way ripgrep is directly used (...at least by humans), and compared with.

[−] neogoose 43d ago

because it is meant to be used as a long-running sdk, not for one-shot search (this is where all the optimizations are coming from)
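
To illustrate why a long-running process can beat a one-shot scan: it can pay the indexing cost once and then touch only a few files per query. The Python sketch below uses a trigram index, which is one common way to do this. That choice is an assumption for illustration, not a description of how fff works, and `TrigramIndex` is a hypothetical name.

```python
from collections import defaultdict


def trigrams(text):
    """All distinct 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Built once by a long-lived process; each query then checks few files."""

    def __init__(self, docs):
        self.docs = docs                  # file name -> content, kept in memory
        self.postings = defaultdict(set)  # trigram -> names of files containing it
        for name, content in docs.items():
            for gram in trigrams(content):
                self.postings[gram].add(name)

    def search(self, needle):
        grams = trigrams(needle)
        if not grams:
            # Needle shorter than 3 characters: no trigram to filter on, scan everything.
            candidates = set(self.docs)
        else:
            # Only files containing every trigram of the needle can possibly match.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        # Verify each candidate, since sharing trigrams does not guarantee a match.
        return sorted(n for n in candidates if needle in self.docs[n])
```

A one-shot CLI has to rebuild or reload such a structure on every invocation, which is why comparing it with a resident index is not quite like for like.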

[−] genewitch 44d ago

considering that ripgrep has marginal overhead over just reading the files to /dev/null, how exactly does this achieve 100x speedup?

I have a lot of use for something that can search ~1GB of text "instantly", but so far nothing beats rg/ag after the data has been moved into RAM.

[−] anilakar 44d ago

The trick to optimization is not "doing faster" but "doing less". I already feel rg is missing a ton of results I want to see because it has a very large ignore list by default.

[−] genewitch 35d ago

i see this - complaint? - often, but i use grep for finding text in files in the filesystem, like normal people. But for specific datasets i'll use ag/rg. As an example, i have transcribed all of the "shows" i have access to for a couple of radio programs; when i want to do exploratory searches, i hit the set once with ag/rg, which takes 7-14 seconds to warm up once, then it's <1ms to search all 1500 text files or whatever.

So while i'm sure ag/rg may be frustrating to use in certain circumstances, by default it works great for searching text files, even structured text files, on disk.

[−] hoherd 43d ago

alias rg="rg -iuu"

[−] Yokohiii 44d ago

The crate says it uses SIMD, but the crate also says that content search is 20-50 times faster. Maybe the guy is unsure how fast it is or how much speedup he should claim to get recognition.

[−] neogoose 43d ago

it very much depends on the platform and the operating system

for example ripgrep doesn't do any memory mapping on macos, which makes fff 2-3x faster there just because of that
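
For readers unfamiliar with the distinction, here is a small Python sketch of the two ways to get file bytes in front of a substring search: mapping the file into memory versus reading it into a buffer. The function names are hypothetical. Which one is faster depends on the platform and file sizes, so timing both on your own machine is the only reliable check.

```python
import mmap


def contains_mmap(path, needle: bytes) -> bool:
    """Substring search over a memory-mapped file instead of a read() buffer."""
    with open(path, "rb") as f:
        try:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return m.find(needle) != -1
        except ValueError:
            # mmap refuses to map an empty file; treat it as "no match".
            return False


def contains_read(path, needle: bytes) -> bool:
    """Baseline: read the whole file into memory, then search."""
    with open(path, "rb") as f:
        return needle in f.read()
```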

[−] neogoose 43d ago

you can try it yourself. ripgrep search for "MAX_FILE_SIZE" in the chromium repo takes 6-7 seconds, with fff it is 20 milliseconds

so essentially in this specific case it is over 1000x faster, but the repo size is huge (66G, 500k files)

[−] neogoose 44d ago

I have open sourced the fastest code search implementation. Comprehensive SDK for both file finder and grep file search that is over 100x faster than ripgrep

[−] MaxMonteil 44d ago

This looks cool!

You should add a link to the GitHub repo for the project itself, at first I wasn't even sure what it was called.

I found this link https://github.com/dmtrKovalenko/fff.nvim

[−] siva7 44d ago

I don't get this submission title. Your tool uses regex but the title claims the future is not about regex.

[−] molszanski 44d ago

I think it is about input. Before I had to type regex, now I just type text and fuzzy finds more, regex style. Awkward wording, but code seems cool.
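
As a rough idea of what "just type text and fuzzy finds more" can mean, here is a minimal Python sketch of subsequence matching: the query matches if its characters appear in the candidate in order. This is the simplest form of fuzzy matching and an assumption for illustration. It is not fff's algorithm, and real typo resistance would need something like edit distance on top. `fuzzy_match` and `fuzzy_filter` are hypothetical names.

```python
def fuzzy_match(query, candidate):
    """True if the characters of query appear in candidate in order (case-insensitive)."""
    it = iter(candidate.lower())
    # `ch in it` advances the iterator, so later characters must come after earlier ones.
    return all(ch in it for ch in query.lower())


def fuzzy_filter(query, candidates):
    """Keep matching candidates, shortest first as a crude relevance proxy."""
    return sorted((c for c in candidates if fuzzy_match(query, c)), key=len)
```

For example, `fuzzy_filter("mfs", ["src/max_file_size.rs", "mfs.c", "README.md"])` keeps the first two paths and drops the README.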

[−] neogoose 43d ago

my tool is not using regex, it can use regex but it is not required

[−] jcgrillo 44d ago

k, but what actually are you talking about?

[−] CodesInChaos 44d ago

Where can I find the benchmark for the "20-50 times faster than ripgrep" claim from the documentation, or the "100x faster" claim from the HN submission title?

Ripgrep already has optimizations for regexes that contain no special patterns, i.e. plain literals (or even regexes that merely contain literal substrings). So "not regex" shouldn't be what makes the difference.
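
The general idea of a literal fast path can be sketched in a few lines of Python: if the pattern has no metacharacters, skip the regex engine and do a plain substring search. This only illustrates the concept and is not how ripgrep implements it. `is_plain_literal` and `search_lines` are hypothetical names.

```python
import re

# Characters that carry special meaning in a regex.
_META = set(r".^$*+?{}[]\|()")


def is_plain_literal(pattern):
    """True if the pattern has no regex metacharacters, so it matches only itself."""
    return not any(ch in _META for ch in pattern)


def search_lines(pattern, lines):
    """Return matching lines, bypassing the regex engine for plain literals."""
    if is_plain_literal(pattern):
        return [ln for ln in lines if pattern in ln]  # substring search
    rx = re.compile(pattern)
    return [ln for ln in lines if rx.search(ln)]
```

A query like "MAX_FILE_SIZE" takes the substring branch, so a tool that accepts regexes need not pay regex cost for it.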

[−] self_awareness 44d ago

I've entered "bazel" and got shellPrefix.ts which doesn't relate to bazel in any way.

If that's the future then I'll stay in the past with ripgrep.

[−] _blk 44d ago

It's O(1) with a correctness of O(0)

[−] neogoose 43d ago

you absolutely missed the point

[−] neogoose 43d ago

if you would search in the chromium repo you would see the correct match https://fff.dmtrkovalenko.dev/?repo=2&q=bazel

[−] asdfadsfaf 44d ago

I don't get it. How can I search anything but the file name?

[−] swiftcoder 44d ago

Is there a write up of the underlying approach? The summary on the repo mentioned SIMD, but not a whole lot else.

[−] globular-toast 44d ago

Why is it "for neovim"? Surely such a thing would be useful in many applications?

[−] ramon156 44d ago

Because it's being dishonest from multiple angles.

- it has regex, so the title is weird

- it definitely wouldn't be 100x faster than rg

- it's an sdk, so it's apples to oranges anyway

[−] pjmlp 44d ago

It has never been ripgrep for decades for those of us on IDEs.

[−] e12e 44d ago

To be fair, ripgrep is approximately one decade old, would be tricky to have used it for decades.

http://blog.burntsushi.net/ripgrep/

https://news.ycombinator.com/item?id=12564442

[−] kzrdude 44d ago

However, it's coming up on a decade (8 years) of vscode using ripgrep behind the scenes.

[−] pjmlp 44d ago

A programmer's editor. However with the right plugins, you get the same IDE capabilities for code searching in Java, C#, C++,...

Which basically runs an IDE headless (Eclipse, Netbeans, VS services,...), the joy of running an IDE + Electron, get to put those cores to use.

[−] gzread 44d ago

Has there been a general gentle decline in IDEs over the past 15 years or is it just me?

[−] pjmlp 44d ago

Maybe for a generation that has learnt to program with languages poorly supported by IDEs.

[−] vovavili 44d ago

Zed has made me rethink this opinion.

[−] hugodan 44d ago

Why do all vibecoded sites look the same? Same black on neon vibes and button styles

[−] JoeDohn 43d ago

I saw this yesterday. It claims to be faster than ripgrep, and it uses regex and rg: https://github.com/erogol/ngi

[−] forrestthewoods 44d ago

Websites that don’t tell me what they’re doing are infuriating. I’m on mobile. This landing page experience is awful.

[−] stunpix 44d ago

For desktops it's not different.

[−] neogoose 43d ago

it is an absolutely amazing experience on mobile. if you guys do not understand how to use a search bar and a couple of segmented controls -- there is nothing much I can do about it

[−] dig1 44d ago

ctags, GNU Global and even "ugrep -Q" would like to have a few words with you ;)

[−] schrodinger 44d ago

How's it work? Embed tokens and use euclidean distance or something?

[−] jcgrillo 44d ago

what even is this

[−] hyperlambda 44d ago

[flagged]

The future of code search is not regex – 100x faster than ripgrep (fff.dmtrkovalenko.dev)

44 comments