I did benchmark extensively 4-5 years ago, but I don't have those numbers with me. Tries are quite expensive memory-wise by design, but I found that ART gave the best balance between speed (by exploiting cache locality) and memory. The state of the art might have improved since then.
As far as Typesense goes, though, I found that the actual posting lists, document listings, and other faceting/sorting-related index data structures are where the bigger overhead lies, especially for larger datasets.
Thanks for the feedback.
My issue is that I allocate only a few MB to my indexing thread, so I'm looking for a more memory-efficient implementation to avoid having to produce and then merge too many segments from disk.
I'm currently considering using compressed pointers on some parts of the tree to reduce the memory footprint as much as I can. Let's see how it goes...
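For illustration, here's a minimal sketch of one way to do it (the names and layout below are hypothetical): store 32-bit offsets into a bump-allocated arena instead of raw 64-bit child pointers, assuming each in-memory segment stays under 4 GiB.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical sketch: 4-byte offsets into a bump-allocated arena
     * instead of 8-byte child pointers. On a Node4 this halves the
     * child array (16 B instead of 32 B); the same trick applies to
     * the larger node types. Assumes one arena of < 4 GiB per segment. */

    typedef uint32_t node_ref;            /* arena offset, not a raw pointer */
    #define NULL_REF UINT32_MAX

    typedef struct {
        uint8_t  num_children;
        uint8_t  keys[4];                 /* partial keys, as in a standard ART Node4 */
        node_ref children[4];             /* 4 bytes each instead of 8 */
    } node4;

    typedef struct {
        uint8_t *base;                    /* start of the arena */
        size_t   used, cap;
    } arena;

    static inline void *ref_to_ptr(const arena *a, node_ref r) {
        return r == NULL_REF ? NULL : (void *)(a->base + r);
    }

    static inline node_ref arena_alloc(arena *a, size_t size) {
        if (a->used + size > a->cap)
            return NULL_REF;              /* arena full: time to flush a segment */
        node_ref r = (node_ref)a->used;
        a->used += size;                  /* bump allocation, no per-node malloc header */
        return r;
    }

Besides halving the child arrays, a bump-allocated arena also avoids the per-node malloc header, which is a non-trivial part of the footprint when nodes are small.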
Do you have any metrics regarding the memory usage of your ART implementation?
I tried to implement one for the database I'm currently working on; however, I feel that I'm using way too much memory.
Basically, with my current implementation, a dictionary containing about 2,857,086 distinct words requires 341 MB.
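For context, that works out to roughly 120 bytes per distinct word. As a back-of-envelope reference, here are approximate sizes of the four node types from the ART paper, assuming 8-byte child pointers and a minimal header (real layouts, including mine, will differ):

    #include <stdio.h>
    #include <stdint.h>

    /* Rough lower bounds for the standard ART node layouts, assuming
     * 8-byte child pointers and an 11-byte header (type, child count,
     * 8-byte path-compression prefix plus its length). */
    typedef struct {
        uint8_t type;
        uint8_t num_children;
        uint8_t prefix_len;
        uint8_t prefix[8];                 /* path-compression prefix */
    } header;                              /* 11 bytes before padding */

    typedef struct { header h; uint8_t keys[4];    void *children[4];  } node4;   /* ~48 B   */
    typedef struct { header h; uint8_t keys[16];   void *children[16]; } node16;  /* ~160 B  */
    typedef struct { header h; uint8_t index[256]; void *children[48]; } node48;  /* ~656 B  */
    typedef struct { header h; void *children[256];                    } node256; /* ~2064 B */

    int main(void) {
        printf("node4=%zu node16=%zu node48=%zu node256=%zu\n",
               sizeof(node4), sizeof(node16), sizeof(node48), sizeof(node256));
        return 0;
    }

Counting how many nodes of each type actually get allocated is probably the quickest way for me to see where those bytes are going.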