Qwen-3.6-Plus is the first model to break 1T tokens processed in a day (twitter.com)

by Alifatisk 23 comments 60 points

[−] gertlabs 40d ago
Qwen 3.6 Plus is a decent but not ground-breaking model in our benchmarks at gertlabs.com, where it performed lower than its model card suggests.

The reason for the insane popularity is that it's pretty good AND free. It's a no-brainer to switch to this for anything usage-based that isn't frontier coding while the free limits are available. It's probably running a ~100B-parameter model under the hood, which won't be so heavily subsidized for long.

EDIT: our tool-usage benchmark is still running, but so far its performance with tools is dramatically better than its one-shot performance. I'm treating Qwen 3.6 Plus as a near-SOTA model now.

[−] guteubvkk 40d ago
Is it unlimited free, or the usual OpenRouter free tier (50 or 1000 requests/day)?
[−] gertlabs 40d ago
You will be rate limited, so it depends on your use case. We only ran into brief, intermittent rate limits when making thousands of calls for the benchmark, so I imagine it's fine for personal use.
[−] roxolotl 40d ago
I’m very curious if we’re ever going to get another “DeepSeek moment.” Qwen is starting to feel like it could be one, but for that to happen people would have to decide to care. It took about a month, I think mid-December to mid-January, from the DeepSeek paper to the “moment,” so it doesn’t necessarily have to happen right away.
[−] try-working 40d ago
What's gone unnoticed with the Gemma 4 release is that it crowned Qwen as the small-model SOTA. So for the first time a Chinese lab holds the frontier in a model category. It is a minor DeepSeek moment, because Western labs have to catch up with Alibaba now.
[−] guteubvkk 40d ago
on my 16 GB GPU Gemma 4 is better and faster than Qwen 3.5, both at 4-bit

so it's not so clear cut

[−] tmikaeld 39d ago
Depends on usage. Gemma 4 is better at visuals/HTML/CSS and language understanding (which probably plays a role in prompting), but it's worse at code in general compared to Qwen 3.5 27B.
[−] acchow 38d ago
Which model in the series specifically?
[−] lostmsu 40d ago
It's unnoticed because it didn't. In Google's own benchmarks they are on par, and I've seen third-party benchmarks where Qwen beats Gemma 4 by a wide margin.
[−] irishcoffee 40d ago
The day a western anything will need to catch up with alibaba will be a notable day indeed. Also, this will never happen.
[−] Alifatisk 40d ago
[−] neonstatic 40d ago
If it overthinks everything the way Qwen 3.5 running locally does, then I am not surprised! :)
[−] dcre 40d ago
Anybody want to give an anecdotal take on how good it is?
[−] Sabinus 39d ago
I wonder what kind of workloads people are putting through it. Presumably all submitted tokens are used for training.