
Cool! (But yeah, English is really important!!)

Which models do you use for speech recognition and text generation?



Yup, English is at the top of the to-do list. We use APIs from a few different big players. One that I want to highlight, which is way better than the rest, is Microsoft speech to text, which not only recognizes what you say but can also do punctuation and capitalization. Instead of "do you like to eat pizza", which is what Google speech to text would give you, Microsoft recognizes it as "Do you like to eat pizza?". Small but important difference.


Have you tried the new Whisper model from OpenAI?



