Hello,
While looking at Lucene core, I was amazed to find that the ASCII Folding filter is implemented as a huge switch/case statement, which is then compiled into a big lookup table and a lot of branches.
Since this single filter is critical for many companies using Solr, Elasticsearch or Tantivy, I wanted to explore other ways to implement it.
I have not yet benchmarked the branchless implementation. I expect it to be slower when dealing with English or other Latin-script inputs and faster when dealing with Eastern languages.
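To make the idea concrete, here is a minimal sketch of table-driven folding, where a flat table indexed by code point replaces the switch/case. It is not the implementation from this experiment and not Lucene's code: the names `FOLDINGS`, `TABLE`, `TABLE_SIZE` and `fold` are hypothetical, the table only covers a handful of Latin-1 characters, and Python cannot show the branching behavior itself, only the data layout.

```python
# Hypothetical, tiny subset of folding rules for illustration only.
# A real ASCII folding filter handles far more characters.
FOLDINGS = {
    "À": "A", "Á": "A", "Â": "A", "Ä": "A",
    "à": "a", "á": "a", "â": "a", "ä": "a",
    "Ç": "C", "ç": "c",
    "È": "E", "É": "E", "è": "e", "é": "e", "ê": "e",
    "Ñ": "N", "ñ": "n",
    "Ö": "O", "ö": "o",
    "Ü": "U", "ü": "u",
    "Æ": "AE", "æ": "ae", "ß": "ss",
}

# Assumption: the sketch only covers code points below 0x100 (Latin-1).
TABLE_SIZE = 0x100

# One entry per code point. Characters without a rule map to themselves,
# so the lookup never needs a per-character "is there a rule?" decision.
TABLE = [chr(cp) for cp in range(TABLE_SIZE)]
for src, dst in FOLDINGS.items():
    TABLE[ord(src)] = dst


def fold(text: str) -> str:
    """Fold characters through the table; leave anything outside it unchanged."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Single range check, then a direct index into the table.
        out.append(TABLE[cp] if cp < TABLE_SIZE else ch)
    return "".join(out)


if __name__ == "__main__":
    print(fold("café"))
    print(fold("Straße"))
```

In this layout every character costs the same lookup whether or not it has a rule, which is one plausible reason a table-based version could lose to a switch with an ASCII fast path on mostly-ASCII text.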
Next time, I might try to implement it using SIMD instructions.
Also note that this is an experiment and that it has not yet been evaluated against the unit tests provided by Lucene.