
Well, that depends. If you want to implement a simple `truncate`[0][1] function, then you need to count graphemes.

[0] http://api.rubyonrails.org/classes/ActionView/Helpers/TextHe... [1] https://github.com/epeli/underscore.string



If you're truncating by character (or WHATEVER) counts, you are guaranteed to be doing it wrong - maybe not in your native language, but in somebody's.

Heck, even in one graphemically-straightforward language you can get silliness: http://www.images.generallyawesome2.com/photos/funny/photos/...


But truncation is very often needed when you have more text than space. What's the alternative?


If it's storage space, you truncate by bytes, rounding down to the nearest complete grapheme; there's no need to count graphemes. If it's display space, truncate by pixels, in which case you need "size in device units of the output text from a rendering engine". Again, no need to count graphemes.
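
To make the storage case concrete, here's a rough sketch of "truncate by bytes, rounding down to a complete grapheme". Big assumption: it treats a cluster as one base code point plus any trailing combining marks (Unicode category M*), which is only an approximation of the real segmentation rules in UAX #29 (it ignores ZWJ emoji sequences, Hangul jamo, regional indicators, and so on). The names `is_extender` and `truncate_bytes` are made up for illustration.

```python
import unicodedata


def is_extender(ch: str) -> bool:
    # ASSUMPTION: only combining marks (categories Mn, Mc, Me) extend a cluster.
    # Real grapheme segmentation (UAX #29) has many more rules than this.
    return unicodedata.category(ch).startswith("M")


def truncate_bytes(text: str, max_bytes: int) -> str:
    """Return the longest prefix of text that fits in max_bytes of UTF-8
    without splitting an (approximate) grapheme cluster."""
    used = 0  # bytes consumed by whole clusters so far
    cut = 0   # code point index of the last safe cluster boundary
    i = 0
    while i < len(text):
        # Find the end of the cluster that starts at index i.
        j = i + 1
        while j < len(text) and is_extender(text[j]):
            j += 1
        cluster_bytes = len(text[i:j].encode("utf-8"))
        if used + cluster_bytes > max_bytes:
            break  # this cluster would not fit whole, so stop before it
        used += cluster_bytes
        cut = j
        i = j
    return text[:cut]
```

Note that nothing here counts clusters; it only needs to know where each one ends. With `"cafe\u0301s"` (an "e" followed by a combining acute accent), a 5-byte limit keeps `"caf"` rather than leaving a bare "e" with its accent chopped off.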


Counting graphemes and detecting "the nearest complete grapheme" are basically the same problem.

The only reason counting graphemes is hard is that detecting grapheme boundaries is hard.


Of course, neither of those approaches solves the 'ass-' problem, which is possibly a red herring.


I'm trying to imagine a use case for grapheme-wise truncate that wouldn't be better served by counting bytes, code points, pixels, or some sort of localized letter-counting convention.


How about implementing backspace in a text editor?


That only requires being able to identify the boundaries of the last grapheme. It does not require counting them all.
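
A minimal sketch of what that looks like, under the same kind of simplifying assumption: a cluster is taken to be a base code point plus trailing combining marks (category M*), which is not the full UAX #29 algorithm. The function name `backspace` is hypothetical, and real editors may choose different deletion behavior for some scripts.

```python
import unicodedata


def backspace(text: str) -> str:
    """Remove the last (approximate) grapheme cluster from text."""
    i = len(text)
    # ASSUMPTION: only combining marks (Mn, Mc, Me) attach to a base character.
    # Step backwards over any trailing combining marks...
    while i > 0 and unicodedata.category(text[i - 1]).startswith("M"):
        i -= 1
    # ...then over the base code point they attach to, if there is one.
    if i > 0:
        i -= 1
    return text[:i]
```

It only ever scans backwards from the end of the buffer, so the cost depends on the length of the last cluster, not on how many clusters the text contains.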


So you acknowledge that determining grapheme ranges is important - you just don't see the need to count the ranges?

If so, sure, I agree. The counting example is contrived. Truncation, navigation, text selection, etc. are more interesting and practical applications.



