Comparing Python Type Checkers: Typing Spec Conformance (pyrefly.org)

by ocamoss 59 comments 119 points
Read article View on HN

59 comments

[−] extr 61d ago
Wow, quite surprising results. I have been working on a personal project with the astral stack (uv, ruff, ty) that's using extremely strict lint/type checking settings, you could call it an experiment in setting up a python codebase to work well with AI. I was not aware that ty's gaps were significant. I just tried with zuban + pyright. Both catch a half dozen issues that ty is ignoring. Zuban has one FP and one FN, pyright is 100% correct.

Looks like I will be converting to pyright. No disrespect to the astral team, I think they have been pretty careful to note that ty is still in early days. I'm sure I will return to it at some point - uv and ruff are excellent.

[−] roflcopter69 61d ago
This is the way. For now, pyright it's also 100% pyright for me. I can recommend turning on reportMatchNotExhaustive if you're into Python's match statements but would love the exhaustiveness check you get in Rust. Eric Traut has done a marvellous job working on pyright, what a legend!

But don't get me wrong, I made an entry in my calendar to remind me of checking out ty in half a year. I'm quite optimistic they will get there.

[−] p1necone 61d ago
Say what you will about Microsoft, but their programming language people consistently seem to make very solid decisions.
[−] zdimension 61d ago
Microsoft started as a programming language company (MS-BASIC) and they never stopped delivering serious quality software there. VB (classic), for all its flaws, was an amazing RAD dev product. .NET, especially since the move to open-source, is a great platform to work with. C# and TS are very well-designed languages.

Though they still haven't managed to produce a UI toolkit that is both reliable, fast, and easy to use.

[−] athoscouto 60d ago
For big codebases pyright can be pretty slow and memory hungry. Even though ty is still a WIP, I'm adopting it at work because of how fast it is and some other goodies (e.g. https://docs.astral.sh/ty/features/type-system/#intersection...)
[−] zeratax 60d ago
I assume this is pretty rare, but ty sometimes finds real issues that are actually allowed by the spec, like:

  def foo(a: float) -> str:
    return a.hex()

  foo(false)
is correct according to PEP 484 (when an argument is annotated as having type float, an argument of type int is acceptable) but this will lead to a runtime error. mypy sees no type error here, but ty does.
[−] persedes 61d ago
Article is a nice write up of https://htmlpreview.github.io/?https://github.com/python/typ...

(glad they include ty now)

[−] martinky24 61d ago
I've been using ty on some previously untyped codebases at work. It does a good job of being fast and easy to use while catching many issues without being overly draconian.

My teammates who were writing untyped Python previously don't seem to mind it. It's a good addition to the ecosystem!

[−] tfrancisl 61d ago
And it makes it infinitely easier for them to get with the times and start typing their code!
[−] rirze 61d ago
I am worried about the false negatives/positive rate however. Hope it improves.
[−] notatallshaw 61d ago
My understand is Astral's focus for ty has been on making a good experience for common issues, whereas they plan for very high compliance but difficult or rare edge cases aren't are prioritized.

Compliance suite numbers are biased towards edge cases and not the common path because that's where a lot of the tests need to be added.

My advise is to see how each type checker runs against your own codebase and if the output/performance is something you are happy with.

[−] dcreager 61d ago

> My understand is Astral's focus for ty has been on making a good experience for common issues, whereas they plan for very high compliance but difficult or rare edge cases aren't are prioritized.

I would say that's true in terms of prioritization (there's a lot to do!), but not in terms of the final user experience that we are aiming for. We're not planning on punting on anything in the conformance suite, for instance.

[−] refactor_master 61d ago
How does Zuban manage to be developed by what appears to be a single person without megacorp backing, yet be mere inches behind pyright at this stage?
[−] ddxv 61d ago
I've used mypy forever and never even tried these others. Looking at them though it looks like it's worth trying out Zuban or Pyright? Is there a noticeable benefit when switching between different checkers?
[−] Scene_Cast2 61d ago
Are there any good static (i.e. not runtime) type checkers for arrays and tensors? E.g. "16x64x256 fp16" in numpy, pytorch, jax, cupy, or whatever framework. Would be pretty useful for ML work.
[−] nas 60d ago
Just an FYI, for people looking at the low pass rates for mypy and ty and concluding they must not be very useful. These test suites are checking many odd corners of the typing spec.

For "normal" Python code, I find mypy does pretty good. Certainly I find it helpful, especially on a large code base and when working with other developers of various experience levels.

The reason I prefer pyrefly over mypy is mostly because of speed. Better accuracy is nice but speed it the killer feature. Given the quality of uv and ruff and the experience of the team working on ty, I'm quite confident it's going to be great in that respect as well.

[−] pgwalsh 61d ago
Using VSCodium I was having issues with python type checkers for quite a while. I did the basedpyright thing for a while but that was painful. It's a bit too based for me, and I'm not sure i'd call it based. Right now I have uv, ruff, and ty and I'm happy with it. It's super easy to update and super fast. I didn't realize the coverage wasn't as good as some others but I still like it. I may have to try pyrefly. Never heard of it until this post, so thank you.
[−] JackSlateur 60d ago
My favorite typing-related stuff is typeguard (https://pypi.org/project/typeguard/), with its pytest plugin
[−] Neywiny 61d ago
This is great and I'll try out pyright ASAP on my current codebase. The people who wrote it evidently didn't have any type checking running (despite I think 3+ linters??) so it's a nightmare of

> "well the checker accurately reports it will be type X in an error case not Y"

> "but we never get type X"

> "Then we don't have good enough coverage"

It's so easy in vscode, but it isn't on by default like the c/c++ one I guess because too much legacy code would cause infinite errors. And the age old problem of .pyi files lying about types.

[−] IshKebab 61d ago
Interesting. This is the first I've heard of Zuban.

The fact that Mypy fails so badly matches my experience. It would be interesting to see exactly where Pyright "fails". It's been so reliable to me I wouldn't be 100% surprised if these are deliberate deviations from the spec, where it is dumb.

[−] Pay08 61d ago
I still can't get over the utter idiocy in Python's type hints being decorative. In what world does x: int = "thing" not give someone in the standardisation process pause?