Qwen3.6-Plus: Towards real world agents

[−] Aurornis 43d ago

This is their hosted-only model, not an open weight model like they’ve become known for. They got a lot of good publicity for their open weight model releases, which was the goal. The hard part is pivoting from an open weight provider to being considered as a competitor to Claude and ChatGPT. Initial reactions are mostly anger from everyone who didn’t realize that the play along was to give away the smaller models as advertising, not because they were feeling generous.

Comparing to Opus 4.5 instead of the current 4.6 and other last-gen models is clearly an attempt to deceive, which isn’t winning them any points either.

I think there is a moderately large market for models like this that aren’t quite SOTA level but can be served up much cheaper. I don’t know how successful they’ll be in the race to the bottom in this market niche, though. Most users of cheap API tokens are not loyal to any brand and will change providers overnight each time someone releases a slightly better model.

[−] zozbot234 43d ago

> not an open weight model like they’ve become known for.

Right, they state that they'll release "smaller" variants openly at some point, with few details as to what that means. Will there be a ~300B variant as with Qwen 3.5? The blog post doesn't say.

[−] dietr1ch 43d ago

I wish they had a revenue goal to release openly, that way spending money in them would contribute to better open models in the long run.

This is how I view that the public can fund and eventually get free stuff, just like properly organized private highways end up with the state/society owning a new highway after the private entity that built it got the profits they required to make the project possible.

[−] JimDabell 43d ago

There are a lot of options for doing things this way:

https://en.wikipedia.org/wiki/Threshold_pledge_system

[−] drob518 43d ago

As a publicity stunt, releasing a 300B open model is pretty smart. You can talk about its strong performance and it being “open” and “available,” but it’s so large that most people can’t use it themselves and might try out the cloud-based offering.

[−] kadoban 43d ago

I'm running qwen 3.5 397b on very standard hardware. Just use the unsloth quants, they're great. I get like 20t/s or something.

It's super not a publicity stunt, qwen 3.5 is the base of the best local models out there IMO.

[−] drob518 42d ago

Well, you didn’t post the specs on your rig. I think it’s probably more correct to say that you run it on very beefy but readily available hardware. My point was not that nobody could run a 300B model, but rather that a 300B model is not going to be runnable by a majority of people. Sure, anyone who wants to run that model and has the money to purchase the hardware can do it. But the hardware is going to be pricey and most people don’t already have it unless they were trying to run large models before this. My overarching point is that most people with average laptop specs purchased over the last 3 to 5 years are going to have to consume this from the cloud. Which is great for Qwen.

[−] kadoban 42d ago

I just have a 3090 and 64gb ram. Yes this is more than most people have, but calling it a "publicity stunt" is just so uncharitably weird of a characterization.

There's smaller models all the way down too.

Like this should be _exactly_ what we want companies to release.

[−] rurban 42d ago

I can run a 300b model, but I don't do it. We need the H100's for training

[−] zozbot234 43d ago

The large models are actually MoE these days so they're usable on ordinary hardware with weights streaming from SSD, just very slow. You're nonethess right that it makes the cloud-based offering more popular, since you can use that for convenience after testing a few inferences locally.

[−] mogili1 43d ago

There are plenty of model providers that can serve them though at cheaper prices and cannibalize Alibaba revenue.

[−] Bombthecat 43d ago

You can run it on runpod for example.

[−] echelon 43d ago

I'm not interested in adopting an inferior closed source weight from a geopolitical rival. The open source weights argument was the one thing China had going and that I was seriously cheering them on for. They could have been our saviors and disrupted the US tech giants - and if it was open, I'd have welcomed it.

Now they show their true colors. They want to train models on our engineering to replace us, while simultaneously giving nothing back? No thanks. I'd rather fund the shitty US hyperscalers. At least that leads to jobs here.

If there's a company willing develop and foster large scale weights in the open, I'll adopt their tooling 100%. It doesn't matter if they're a year behind. Just do it open and build an entire ecosystem on top of it.

The re-AOLization of the internet into thin clients is bullshit, and all it takes is one player to buck the rules to topple the whole house of cards.

[−] miki123211 43d ago

Ah, so that explains the recent wave of Qwen team-member departures.

[−] jona-f 43d ago

> Most users of cheap API tokens are not loyal to any brand

In the exploration phase, yes. But once your setup settles down you likely want to stay on the same model for stable operation.

[−] Alifatisk 43d ago

I understand peoples reactions of Qwen team comparing against Opus 4.5 instead of 4.6. And them comparing against Gemini Pro 3.0 instead of 3.1. But calling it misleading is a bit of stretch in my eyes, people here are acting like we immediately forgot how previous generations performed just because a new version is released.

This field is going in a incredible pace, the providers release a new model every quarter or so. The amount of criticism is a bit overblown in my opinion. The benchmarks still look very good to me. I’ve used GLM-5 (latest is GLM-5.1) and Kimi K2.5, they are decent and gets the job done, so seeing how this model of Qwen performs compared to it is kinda impressive.

Also, why are so many pointing out the fact that this model is not open-weight as if this is their first time doing so. Qwen-3.5-plus, Qwen-3-Max is also closed source. This is not something new.

I think Qwen trying to catch up to the SOTA models is still healthy for us, the consumers. Sure, its sad news that this version is closed-weight, but I won’t downplay their progress.

[−] simonw 43d ago

Pretty solid Pelican: https://gist.github.com/simonw/ca081b679734bc0e5997a43d29fad...

I used the https://modelstudio.alibabacloud.com/ API to generate that one, which required signing up for an account and attaching PayPal billing - but it looks like OpenRouter are offering it for free right now so I could have used that: https://openrouter.ai/qwen/qwen3.6-plus:free

[−] jgbuddy 43d ago

Worth noting that this model, unlike almost all qwen models, is not open-weight, nor is the parameter count exposed. Also odd that it is compared against opus 4.5 even though 4.6 was released like 2 months ago.

[−] furyofantares 43d ago

I'll diverge from some of these comments, I don't find it misleading to compare to Opus 4.5.

I can remember how good Opus 4.5 was. If I'm considering using this, it's most informative to me to compare to the model it's closest to that I have familiarity with.

I'm obviously not switching to this if I want the best model. I'm switching if I'm hopeful that the smaller versions are close to it, or if I want to have more options for providers, or for any other reasons unrelated to getting the highest quality responses possible.

[−] linolevan 43d ago

I’m surprised that people are surprised. Qwen has been hosting private plus and max variants for a while now.

[−] nl 43d ago

23/25 on my agentic benchmark for the free version on OpenRouter. That's a great score - only 4 models have ever scored higher.

But there are open models that also score 23/25 including Qwen 3.5 27B.

[−] karimf 43d ago

> In the coming days, we will also open-source smaller-scale variants, reaffirming our commitment to accessibility and community-driven innovation.

[−] try-working 43d ago

For anyone that believes Chinese labs will stop open sourcing their models, let me tell you why that won't happen.

First, try signing up for Z.ai's coding plan. I know how to but I bet you won't be able to.

The absolute disaster that is Z.ai's internet presence shows that these small labs have no ability to market themselves and drive direct sales.

For marketing, they lack capabilities, and releasing open models is the only way for them to remain in the conversation.

For sales, they rely on distribution via OpenRouter, OpenCode etc. Interest with their users is driven by open model performance.

Open sourcing for Chinese labs is not some large national scheme. It is their only way to commercialization.

[−] srmatto 43d ago

The benchmarks provided are for Opus-4.5, not for the latest Opus-4.6 and Qwen is still lagging in a lot of them.

[−] kristopolous 43d ago

I've gone through about 500M tokens on this model already. They've got some free inferencing options (such as on openrouter) ... $0 is hard to beat and it's creating not-crap.

[−] eis 43d ago

Quite strong results in the benchmarks but why Gemini 3 Pro instead of 3.1? Why only for a few of the benchmarks? Why is OpenAI not there in the coding benchmarks? Why Opus 4.5 and not 4.6? Just jumps out into my eye as a bit strange.

As always, we'll have to try and see how it performs in the real world but the open weight models of Qwen were pretty decent for some tasks so still excited to see what this brings.

[−] woeirua 43d ago

Just more evidence that the B tier models are six months behind. Ultimately that’s good. Opus 4.6 level intelligence will be cheap later this year!

[−] davesque 43d ago

I wish these AI vendors would quit publishing comparisons with the previous generation of their competitors's models. It's just such a glaringly bad look and no one is fooled by it, even if their achievements deserve praise in their own right. The Qwen models are great and don't deserve the reputational hit that comes from dodgy marketing tactics.

[−] wg0 43d ago

It hallucinates a lot more then Sonnet or even MiniMax M2.5. Especially in tool calls, it would end up duplicating the content in code files and then realising later and getting stuck in a loop.

[−] edg5000 43d ago

Do they have an API where you can control the chat template or at least just put everything in the system prompt? This way you can control everything including the tool calling syntax. Even if you use the trained tool syntax, it allows you to control the tool system prompt which you may want to tweak. With DeepSeek this is all possible. An undocumented feature, great for harness builders. Anybody got info on Qwen regarding this?

[−] mtrifonov 42d ago

Most agent work focuses on task completion. Browse the web, fill out the form, and/or write the code. The harder problem is social agency, where the AI has to decide whether to participate at all. We built a cheap model gate that reads the conversational dynamics of a group chat before the expensive model runs. Wonder how Qwen3.6 performs in these nuances cases.

[−] Art9681 43d ago

How convenient of them to compare themselves to the last generation Opus and GPT models to make their model look better than it really is.

[−] giancarlostoro 43d ago

I hope their open source variants are just as good, having a 1 million token window for a fully offline model would be VERY interesting.

[−] XCSme 43d ago

3.6 Plus seems to be simply a refined/more consistent 3.5 Plus: https://aibenchy.com/compare/qwen-qwen3-5-plus-02-15-medium/...

[−] edg5000 43d ago

Has anybody done serious agentic work (e.g. using a CLI harness or simmilar) with 3.5 Plus/3.0 Max and such? How does it compare against Opus with Claude Code? I've used the chat quite a bit and I can't say at this point.

[−] zwaps 43d ago

They claim SOTA but are beaten by last gen Opus om every metric?

This one seems weird

[−] gburgett 43d ago

Looking forward to when this gets on Bedrock. I built an app with a niche AI agent and to this point only Sonnet is really good enough for our use case, but its expensive!

[−] throwaw12 43d ago

I would love to hear from people using both (Claude Code OR Codex) AND (Qwen) and their experience with Qwen models, are they on par, or how far are they?

[−] zkmon 43d ago

It is no longer available on OpenRouter. They say "going away on 3-March", but it's already gone!

[−] esafak 43d ago

Does anyone have experience with Alibaba's coding plan? Not that I'm very tempted at $50/month...

[−] wolvoleo 43d ago

Nice, I hope there will also come a small open version of it.

[−] throwaway911282 43d ago

ignoring gpt 5.4! I feel bad for people who have not even tried it. for the same 20$ I pay to openai and anthropic, I get significantly more from openai

[−] adinhitlore 43d ago

i've been fan of qwen for quite some time, awsome!

[−] dzonga 43d ago

Qwen free plan is still good.

you get a generous token limit.

[−] MarsIronPI 43d ago

It's not open weights so I'm not interested.

[−] daft_pink 43d ago

Not really interested in using models hosted on alibaba cloud.

Like Qwen local for it’s privacy, but I trust the privacy of Google/OpenAI/Anthropic more than alibaba.

[−] Caum 43d ago

[flagged]

[−] bezlant 42d ago

[dead]

[−] techpulselab 43d ago

[flagged]

[−] pratyushsood 42d ago

[dead]

[−] Sim-In-Silico 43d ago

[dead]

[−] 0xqlive 42d ago

[dead]

[−] maxothex 43d ago

[dead]

[−] Vivolab 43d ago

[dead]

[−] volume_tech 43d ago

[dead]

[−] johnwhitman 43d ago

[flagged]

[−] geenkeuse 42d ago

[dead]

[−] kanehorikawa 43d ago

[dead]

[−] shubhamgarg86 43d ago

[flagged]

[−] EdoardoIaga 43d ago

[flagged]

Qwen3.6-Plus: Towards real world agents (qwen.ai)

213 comments