I'm a Kagi search/assistant user and advocate, but the "small web" product is a frustrating misnomer.
To me the small web is any little website that was created to be interesting rather than to sell me something. That includes stuff like neocities, "shrine" type sites, single purpose sites, fandom portals, web experiments, etc.
Unfortunately, Kagi's definition of "small web" is: blog or webcomic. You must have an RSS feed, and it must have recent posts. That rules out so much interesting stuff that I don't understand the point.
Expert/auteur websites like Sheldon Brown's (or, one of my favorites, Ask Aaron https://runamok.tech/AskAaron/FAQ.html) are the pinnacle of what's possible with the small web. Today this kind of info ends up in an ad-ridden hosted wiki or locked away in an unsearchable discord.
Then there's exceptionally cool demos like https://thelongestyard.link/q3a-demo/. This sort of thing just doesn't fit in a "blog" format unless you're writing a blog about how you built it and linking out to it.
If anyone knows of a directory of sites like these (preferably with a shuffle option) I'd love to hear about it (and contribute)!
Sheldon Brown's content is great, but is it ironic that the first thing you see on his site is a Google banner ad?
Understandably, he'd like to earn money from his content, and I see no problem with that. But for me to visit his site and have Google add yet another tracking event to their "interest pile" about me (I guess I'm in the market for bikes now?) is a bit off-putting.
He can't be making more than a few bucks a month through that single ad, right?
I assume nobody removed it and the revenue is just added to some Google Adsense balance sheet, and reports go to some Gmail account that will expire one day.
This website is the small web: self-contained. It's a really good example of the Internet we had and that some apparently still want. I think of it like computer graphics, where your definition of space can grow as you add a bunch of resources, each with its own model space, into the relative context of world space. The small web should define how we do that and discover things, not what or how we build within each specific model space.
I am looking for something that would filter for sites that rarely post but have good content. The number one problem with most of these systems is that everything favours frequent posting. Even if I do it manually, I cannot keep tabs on many rarely posting sites - this is an obvious example of a problem we should delegate to computers. Favouring frequent posters creates incentives to post often even if quality worsens.
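As a sketch of the kind of delegation the comment describes (the feed URL is a placeholder, and fingerprinting the newest entry is just one possible way to detect a change), a small standard-library script can watch rarely-updated feeds and report only when something new appears:

```python
import hashlib
import json
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical list of rarely-updated feeds worth keeping tabs on.
FEEDS = [
    "https://example.org/rarely-updated/feed.xml",
]

STATE_FILE = "seen.json"


def latest_entry_id(feed_xml: bytes) -> str:
    """Return a stable fingerprint of the newest item in an RSS/Atom feed."""
    root = ET.fromstring(feed_xml)
    # RSS 2.0 uses <item>; Atom uses a namespaced <entry>.
    item = root.find(".//item")
    if item is None:
        item = root.find(".//{http://www.w3.org/2005/Atom}entry")
    if item is None:
        return ""
    return hashlib.sha256(ET.tostring(item)).hexdigest()


def check_feeds() -> list[str]:
    """Return the feeds whose newest item changed since the last run."""
    try:
        with open(STATE_FILE) as f:
            seen = json.load(f)
    except FileNotFoundError:
        seen = {}
    changed = []
    for url in FEEDS:
        with urllib.request.urlopen(url, timeout=10) as resp:
            fingerprint = latest_entry_id(resp.read())
        if fingerprint and seen.get(url) != fingerprint:
            changed.append(url)
            seen[url] = fingerprint
    with open(STATE_FILE, "w") as f:
        json.dump(seen, f)
    return changed
```

Run from cron once a day, this surfaces a once-a-year post just as reliably as a daily one, which is exactly the property frequency-weighted aggregators lack.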
Because we don't value them at all, literally. It's a tragedy of the commons: internet pollution is like air pollution; the polluters don't pay, and there's no cost associated with overusing other people's attention.
I'd be fascinated by the economics of this from Google's perspective: specifically, the unit economics of serving updated-once-a-year results to queried-once-in-a-million searches.
Tl;dr: I feel like the long-tail web (90s) was better, but economics pushed high-update-frequency more-centralized results.
This is a fairly recent phenomenon: I'm a longtime Small Web user and even I struggle with this massive influx of AI posts. I'm hopeful it will be addressed.
I could definitely see value in filters for "has RSS" and "has recent posts"—maybe even as the default view—but I absolutely agree that this is much less interesting to me without the wider world of interesting, small sites.
I would also love to go back to Geocities style web interaction, but the medium is the message, and the way the Internet has evolved as a medium means that people don't naturally interact with it in a way that supports regression to that era. Attempts to force it like neocities have a hyperreal quality to them.
On a similar note, I maintain and grow a manually curated collection of personal blogs with valid RSS feeds: https://minifeed.net/blogs
The criteria is simple: human-written (as much as I can validate myself), in English (for now), with valid RSS feed, and not a micro-blog (so, more than just feed of links or short tweet-like messages).
Similar to Kagi's Small Web viewer, or StumbleUpon-style viewer: you can get a random listing of blogs [1] or a random listing of posts from all blogs [2]. Feeds and posts are indexed, so full-text search works across all blogs. When possible and permitted by robots.txt, text is scraped for searching, so even if some text is omitted in the RSS feed by the author, search should work.
Though I do plan to implement a similar "view one random post at source" kind of view, soon.
UPD: Feel free to submit a blog, including your own! [3]
The implicit criterion (tech/business and adjacent) is an issue with all these lists for me. But it's also a personal list, which is great. I just wish literally anyone making these had a personal interest in anything else reflected in their lists, because I keep checking them and being disappointed.
This topic has come up on here before, with others doing the complaining - about a list I liked for this very reason, since it wasn't top-loaded with tech: https://news.ycombinator.com/item?id=47015676
Jokes aside, it's really nice and I can totally see it becoming addictive. Kudos to the Kagi team for another user-oriented product. (As a side note, I use Kagi daily and I didn't know about this tool.)
I've been using the Kagi search engine for months now and I'm not impressed. I bought into it because there were a lot of posts saying it was "just like old Google", but that has not been my experience. It's the same as new Google: you can type in exactly what you're looking for and you'll get random, sort-of-related websites.
I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for. That was back in like 2010. To me that's the old, and useful, search engine that I want.
The first random page it returned to me was this — https://gaultier.github.io/blog/how_to_make_your_own_static_... — an article about building one's own static site generator, which I really liked. I did not realise when I closed that page how hard it would be to find it again because, of course, every new visit to Kagi returns a different page :-)
I do love the concept, but a little part of me died each time I came across an article with a very strong AI voice. That just feels antithetical to the 'small web' ethos because it obscures the 'neighbor' behind it.
I like the idea, but I would like to be able to select a language and see the small web of that language. There are more languages than English, and this tool could make them thrive.
Also, if they are clever, they could somehow use this for those translation systems they are using - but please let us select our own language without feeding us automatic translation like YouTube does.
> This page is auto-generated from Github Actions workflow that runs every day at night and fetches the 5 latest articles from each of my favorite blogs.
So, basically, a random site from their index of ~30,000 sites.
You can choose similar sites by index.
But the criteria for having your site listed here, how it will be prevented from just becoming a massive gamified advertising index, and anything more about "why these?" are not obvious to me.
Can anyone explain what is special about these sites specifically, or where this project is going?
I run a Hugo blog, and at this point I get more interesting referral traffic from Kagi's small web index than from Google. 5,000 curated sites is small enough to be useful; most "indie web" directories are graveyards, unfortunately.
Does it work for you guys to go to "About" and then click on the "list" link?
For me it says I'm blocked due to hitting a "secondary" rate limit (I don't understand what that means). I don't think I've opened a page on GitHub yet today, so clearly it's a lie. Is it the Referer header that triggers this?
In general, freeloading the "small web" onto a Microsoft service is kind of ironic. Being blocked by algorithms that try to detect whether you're really human is precisely one of the things one would hope to get away from by using small, personal websites.
Bit bummed. The first random page I landed on was a really interesting article for me. The custom cursor (well, why not) had me struggling to follow a link, and instinctively I refreshed the page. I ended up somewhere else in the haystack with ostensibly no way back to that particular article.
Perhaps I'm yelling into the void here, but what would be great is if, when first landing at kagi.com/smallweb, the URL query parameter were somehow set, as it is when "Next Post" is clicked.
This thread inspired me to build a decentralised tool to explore the small web: https://susam.net/wander/
If you like it and if you have a website, please join the network. It only takes uploading two files to your web server to set it up. In fact, you can host it on GitHub Pages or Codeberg Pages too. And you can link to any small website (not just blogs).
Ah, this might explain the traffic from Kagi a week or so ago. I've been scratching my head over that one. I just checked, and my wee little blog is listed in smallweb.txt. Neat!
Curious what goes on behind the Next Post and Show Similar buttons.
I'm a heavy Kagi user and the idea behind Small Web was appealing, but how it's implemented doesn't click with me.
Their rules exclude an absolute gem like https://www.sheldonbrown.com/, which is, to me, the essence of what we could call the "small web".
Each time the topic pops up, I try a few random ones and never find anything interesting.
There are also novelties like https://www.howmanypeopleareinspacerightnow.com/; this probably hasn't been updated in a decade, but that makes it no less interesting.
He's been dead since 2008, so I assume the banner ad keeps the lights on in the absence of his income and input.
> are the pinnacle of what's possible with the small web
If this is the pinnacle, then I want nothing to do with it.
[1] https://minifeed.net/blogs/by/random
[2] https://minifeed.net/global/random
[3] https://minifeed.net/suggest
https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
There is also Small Comic:
https://kagi.com/smallweb/?comic
https://github.com/kagisearch/smallweb/blob/main/smallcomic....
And Small YouTube:
https://kagi.com/smallweb/?yt
https://github.com/kagisearch/smallweb/blob/main/smallyt.txt
It refreshes every 5 hours and shows you the most recent blogs published on Kagi. Check it out!
https://kagi.com/smallweb/?url=https://pliutau.com/reading-l...
Previous discussions: https://news.ycombinator.com/item?id=37420281 (7 Sep 2023, 185 comments) and https://news.ycombinator.com/item?id=39476015 (23 Feb 2024, 36 comments).
Details about how it works and how to set it up available here: https://codeberg.org/susam/wander#readme