Show HN: Single header branchless ASCII folding filter (github.com/mkcg)
3 points by mkcg on July 7, 2023
Hello,

While reading through Lucene core, I was amazed to find that the ASCII folding filter is implemented as a huge switch/case statement, which is then compiled into a big lookup table and a lot of branches.

Since this single filter is critical for many companies using Solr, Elasticsearch or Tantivy, I wanted to explore other ways to implement it.

I have not yet benchmarked the branchless implementation. I expect it to be slower on English or other Latin-script input and faster on Eastern languages.
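
For readers unfamiliar with the term, here is a minimal sketch of what a branchless fold of a single code point could look like. It is not the code from the repository: the names `FOLD` and `fold_cp` are hypothetical, and the table is an illustrative subset covering only one-to-one folds in U+00C0..U+00FF (multi-character folds such as Æ to "AE" are left unmapped). Whether the comparisons really compile without branches depends on the compiler and target.

```c
#include <stdint.h>

/* Hypothetical table for U+00C0..U+00FF; 0 means "no single-char fold". */
static const uint8_t FOLD[64] = {
    /* U+00C0 */ 'A','A','A','A','A','A', 0 ,'C','E','E','E','E','I','I','I','I',
    /* U+00D0 */ 'D','N','O','O','O','O','O', 0 ,'O','U','U','U','U','Y', 0 , 0 ,
    /* U+00E0 */ 'a','a','a','a','a','a', 0 ,'c','e','e','e','e','i','i','i','i',
    /* U+00F0 */ 'd','n','o','o','o','o','o', 0 ,'o','u','u','u','u','y', 0 ,'y'
};

/* Fold one code point using a mask-based select instead of if/switch. */
static inline uint32_t fold_cp(uint32_t cp)
{
    uint32_t idx      = cp - 0xC0u;              /* wraps to a huge value if cp < 0xC0 */
    uint32_t in_range = (uint32_t)(idx < 64u);   /* 1 if cp is in U+00C0..U+00FF */
    uint32_t mapped   = FOLD[idx & 63u];         /* index is always valid after masking */
    uint32_t use      = in_range & (uint32_t)(mapped != 0u);
    uint32_t mask     = 0u - use;                /* all ones if we fold, else zero */
    return (mapped & mask) | (cp & ~mask);       /* select mapped or original */
}
```

With this sketch, `fold_cp(0xE9)` (é) yields `'e'`, while ASCII input and code points outside the table come back unchanged. Note that every code point pays for the table load, including plain ASCII, which is one plausible reason a branchless version could lose on mostly-ASCII text.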

Next time, I might try to implement it using SIMD instructions.

Also note that this is an experiment and that it has not yet been evaluated against the unit tests provided by Lucene.