In my experience, a culture where teammates prioritise review turnaround (both by checking for updates on GH a few times a day and by aggressively splitting changes into smaller patches) translates into much faster overall progress. It's definitely a culture thing: there's nothing technically or organisationally difficult about implementing it, it just requires people to treat team velocity as more important than personal velocity.
Let's say a teammate is writing code to do geometric projection of streets and roads onto live video. Another teammate is writing code to do automated drone pursuit of cars. And let's say I'm over here writing auth code, making sure I'm modeling all the branches that might occur, in whatever order.
To what degree do we expect intellectual peerage from someone just glancing at this problem because of a PR? To be a proper intellectual peer of someone who has actually been studying the problem, you'd basically have to double your effort.
If the team is that small and working on things that are that disparate, then it is also very vulnerable to one of those people leaving, at which point there's a whole part of the project that nobody on the team has a good understanding of.
Having somebody else devote enough time to being up to speed enough to do code review on an area is also an investment in resilience so the team isn't suddenly in huge difficulty if the lone expert in that area leaves. It's still a problem, but at least you have one other person who's been looking at the code and talking about it with the now-departed expert, instead of nobody.
This is unusually low overlap per topic; it probably needs a different structure than traditional PRs to get the best chance of benefiting from more eyes: higher-scope planning, or something like longer but intermittent pair programming.
Generally, if the reviewer is not familiar with the content, asynchronous line-by-line reviews are of limited value.
I'm surprised that the `isinstance()` comparison is with `type() == type` and not `type() is type`, which I would expect to be faster, since the `==` implementation tends to have an `isinstance` call anyway.
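The gap is real: `==` on types dispatches through `__eq__`, which for a custom metaclass can run arbitrary Python code, while `is` is a single pointer comparison. A toy metaclass (all names below are made up for illustration) makes the extra dispatch visible:

```python
class CountingMeta(type):
    eq_calls = 0

    def __eq__(cls, other):
        # Counts every time `==` is used on a class with this metaclass.
        CountingMeta.eq_calls += 1
        return cls is other

    def __hash__(cls):
        # Defining __eq__ suppresses the inherited hash, so restore one.
        return id(cls)

class Widget(metaclass=CountingMeta):
    pass

w = Widget()
assert type(w) == Widget            # dispatches into CountingMeta.__eq__
assert CountingMeta.eq_calls == 1
assert type(w) is Widget            # plain identity check, no dispatch
assert CountingMeta.eq_calls == 1   # counter unchanged
```

Even without a custom metaclass, `==` still goes through the rich-comparison machinery, so `is` is the idiomatic (and faster) way to compare exact types.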
We've been relying on TypeForm (an experimental feature in Pyright) in xDSL. Since there are some Astral members commenting here: are there any plans to support TypeForm any time soon? It seems like you already have some features that go beyond the Python type spec, so I feel like there may be hope.
Yes, we love TypeForm! We plan to support it as soon as the PEP for it lands. Under the covers, we already support much of what's needed, and use it for some of our special-cased functions like `ty_extensions.is_equivalent_to` [1,2]. TypeForm proper has been lower on the priority list mostly because we have a large enough backlog as it is, and that lets us wait to make sure there aren't any last-minute changes to the syntax.
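For anyone who hasn't followed the proposal: TypeForm lets you annotate a parameter as accepting any type *expression*, including ones like `Union[int, str]` that a `type[...]` annotation rejects. A rough sketch of the kind of function it's for, with `TypeForm` spelled as a string annotation since it isn't in the stdlib `typing` module (a checker that supports the feature would validate calls like `describe(Union[int, str])`):

```python
from typing import Union, get_args, get_origin

def describe(tp: "TypeForm") -> str:
    # A TypeForm parameter accepts arbitrary type expressions,
    # not just runtime classes the way a `type[...]` parameter would.
    if get_origin(tp) is Union:
        return " | ".join(describe(arg) for arg in get_args(tp))
    return getattr(tp, "__name__", repr(tp))

assert describe(int) == "int"
assert describe(Union[int, str]) == "int | str"
```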
In MLIR, there are two representations of memory, `tensor` and `memref`; the tensor form enables you to do some high-level things[0] in SSA before "bufferizing" to memrefs, which are eventually lowered to LLVM pointers.
Have you worked with TypeScript? I work with both every day and I'm always frustrated by the limits of the "type" system in Python. Sure, it's better than nothing, but it's so basic compared to what you can do in TypeScript. It's very easy to use advanced generics in TypeScript but hell (sometimes outright impossible) in Python.
Yep, although never in a project of a similar size. One advantage of the Python setup is that the types are ignored at runtime, so there's no overhead at startup/compilation time. Although it's also a disadvantage in terms of what you can do in the system, of course.
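Concretely, Python annotations are just metadata attached to the function object; the interpreter never resolves or enforces them when the code runs. In this sketch, `Vec3` and `Vec2` are hypothetical names that are deliberately defined nowhere:

```python
def project(v: "Vec3") -> "Vec2":
    # "Vec3" and "Vec2" don't exist anywhere, and the interpreter doesn't
    # care: only a static checker would ever try to resolve these names.
    return (v[0] / v[2], v[1] / v[2])

# Runs fine despite the undefined annotation names.
assert project((4.0, 2.0, 2.0)) == (2.0, 1.0)

# The annotations are stored as plain strings on the function object.
assert project.__annotations__ == {"v": "Vec3", "return": "Vec2"}
```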
I agree it is pretty nice (with uv and as long as you REALLY don't care about performance). But even if you are one of the enlightened few to use that setup, you still have to deal with dependencies that don't have type annotations, or only basic ones like `dict`.
TypeScript (via Deno) is still a better option IMO.
Can someone familiar with performance of LLMs please tell me how important this is to the overall perf? I'm interested in looking into optimizing tokenizers, and have not yet run the measurements. I would have assumed that the cost is generally dominated by matmuls but am encouraged by the reception of this post in the comments.
Tokenizing text is a ridiculously small part of the overall computation that goes into serving a request. With that said, if you're doing this on petabytes of data, it never hurts to have something faster.
To echo the other replies, the tokenizer is definitely not the bottleneck. It just happens to be the first step in inference, so it's what I did first.
Tokenization performance is complicated, but your guidepost is this: the institutions with the resources and talent to do otherwise still choose to write extremely fast tokenizers. sentencepiece and tiktoken both pay dearly in complexity, particularly complexity of deployment, because now you've got another axis of architecture-specific build/bundle/dylib to manage on top of whatever your accelerator burden already was: it's now aarch64 cross x86_64 cross CUDA capability...
Sometimes it can overlap with an accelerator issue, but pros look at flame graphs: a CPU core can be running the AVX lanes hard and still not keep the bus fed; it's a million things. People pre-tokenize big runs all the time.
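Pre-tokenizing as a separate pass is easy to sketch: run the tokenizer once over the corpus on CPU, in parallel, and keep only the token ids so the accelerator job never touches raw text. The toy whitespace tokenizer below is a stand-in for a real one like tiktoken or sentencepiece:

```python
from concurrent.futures import ThreadPoolExecutor

VOCAB = {"the": 0, "cat": 1, "sat": 2}
UNK = len(VOCAB)  # single id for out-of-vocabulary words

def tokenize(line: str) -> list[int]:
    # Stand-in for a real tokenizer; in practice this is where CPU time goes.
    return [VOCAB.get(word, UNK) for word in line.split()]

def pretokenize(corpus: list[str]) -> list[list[int]]:
    # One parallel CPU pass; the id lists can then be written to disk so the
    # training/inference run consumes ints only, never text.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(tokenize, corpus))

ids = pretokenize(["the cat sat", "the dog sat"])
assert ids == [[0, 1, 2], [0, 3, 2]]  # "dog" is out of vocabulary
```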
I don't know why this thread is full of "nothing to see here": this obliterates the SOTA from the money-is-no-object status quo. I'd like to think better of the community than the obvious explanation, which is that C++ is threatening a modest mindshare comeback against a Rust narrative already under pressure from the explosion of interest in Zig. Maybe there's a better reason.
I really want to switch to Zed from Cursor but the battery usage for my Python project with Pyright is unjustifiable. There are issues for this on GitHub and I'm just sad that the team isn't prioritising this more.
It’s funny you mention this because I have an issue with Cursor where if I leave it open for more than a few hours my M3 Max starts to melt, the fans spin up, and it turns out to be using most of my CPU when it should be idling.
Zed on the other hand works perfectly fine for me. Goes to show how a slightly different stack can completely change one’s experiences with a tool.
I love that language and frequently show it to people. I'm sad to see that my local install doesn't work any more. I actually used it to solve a puzzle in Evoland 2 that I'm relatively sure was added as a joke, and is not solvable in a reasonable time without a solver. I'm doing a PhD in compilers right now, and would love to chat about Sentient if you have the time. My email is sasha@lopoukhine.com.
You might be interested in looking at MiniZinc (https://minizinc.org/), an open-source modelling language for combinatorial problems. The system comes from a constraint programming background, but the language is solver-agnostic and can be compiled to many different types of solvers.