"All your base are belong to us" intro to my 2004 MIT Spam Conference talk (jgc.org)
112 points by jgrahamc on Dec 13, 2024 | 28 comments


I made one big mistake with that introduction when it was recorded on to the video tape for playback at the conference: I didn't realize how much people were going to laugh at the end and went straight into my presentation. I got to hear the laughter because the conference was live streamed using RealPlayer.


I would like to see more modern text-to-speech that sounds good, but also inhuman. On one hand very easy to understand but on the other obviously constructed and not something a living person could generate. Like the scifi where robots always look like robots and it's illegal/immoral/taboo to have a robot that's indistinguishable from a person.

Maybe just an ordinary human sounding TTS that gets put through a mild vocoder of some kind.
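As a rough illustration of that idea, here is a minimal sketch that uses ring modulation, which is simpler than a true vocoder but gives a similar "robot" coloring. The function name `ring_modulate` and the `carrier_hz` and `mix` parameters are made up for this example, and it assumes the speech is already available as a list of float samples in the range -1.0 to 1.0.

```python
import math


def ring_modulate(samples, sample_rate, carrier_hz=50.0, mix=0.5):
    """Blend each sample with a copy multiplied by a sine-wave carrier.

    mix=0.0 returns the dry signal unchanged; mix=1.0 returns only the
    ring-modulated signal. Values in between keep the speech easy to
    understand while adding a metallic edge.
    """
    out = []
    for n, x in enumerate(samples):
        # Carrier value at sample index n.
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        out.append((1.0 - mix) * x + mix * x * carrier)
    return out
```

A low `mix` keeps the effect mild; raising `carrier_hz` makes the voice sound more obviously synthetic.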


That's a real problem with the vast majority of current TTS. It's terrible at things like consistent intonation, proper pronunciation, and believable pauses, while sounding human all the same, and the result is deep in the uncanny valley.

The gaming and movie industries understand this very well: they use human voice actors who can nail all of that and then make the result sound more metallic or compressed or whatnot. Otherwise it does not fit.


This is why it pisses me off when a techy person makes a YouTube video and just uses TTS instead of recording a voiceover. I know some people don’t have a good recording situation, but I get the sense that a lot of people just do it because they either think people can’t tell or they think it’s a clever hack or “the way of the future” or something.

It isn’t. Instead I find myself watching videos and getting a weird creepy feeling when I suddenly hear the voiceover mispronounce a word or put an emphasis in the wrong place. Part of it is the uncanny valley for sure, but the more pernicious thing is this: once I realize that the voice is AI-generated, I start to worry that the script might be too. Now I’m trying to figure out “is this guy just an amateur writer taking a while to get to his point, or is this an LLM-authored script that is never going to go beyond surface-level statements about the topic.”


I don’t think this is about a “good recording situation”. It’s likely people who think they suck at speaking/narrating or think they have a horrible accent or want to remain anonymous or just find the process annoying, and find it less embarrassing/more privacy-preserving/less of a hassle to use an AI voice.


Those are the most likely.

Another factor, less common, is when you want or have to speak a non-native language you're not used to pronouncing, in which case you're usually afraid of not being understood.

PS: I think all text-to-speech systems sound horrible, and the latest generations are even irritating, as the parent commenter said.


I think this undersells the difficulty of recording a good voice-over, both technically and performatively.


And on the high end, making everyone sound like a professional public speaker. The machine sees mistakes as errors when in fact every non-Hollywood speech contains multiple mistakes.


It sounds like a guy recorded his own voice and ran it through an LPC.


I was there and remember it well. The whole thing was completely unexpected and people lost it. Well done. :)


Elegant memes from a more civilized age


The main thing I realize is just how amazing we thought those machine voices were and how good and realistic, and how bad they sound now compared to what we have.


I don't think we ever thought they were particularly good or realistic, just the best we could do with the limitations of the time.


Macintalk Pro English Victoria was way better than that, even at the time.

https://en.m.wikipedia.org/wiki/PlainTalk


As a kid I thought the announcer voice for Blades of Steel was excellent at the time. Despite it being distorted it gave the feeling that you were watching an actual hockey game. Of course, most of the games I played then didn't have much in the way of human voice.


That wasn’t TTS though, but actual sampled recordings.



I remember it as just the opposite - it was funny in part because of how mechanical it sounded.


For great justice.


Incredible!

Although I couldn’t help but notice that the original says 3 “ha ha ha” but the spam conference said 4.

Yes, the original subtitles have 4, but they only say 3 :)


This is cool, how did you make the voice? It's perfect for the application :)


I believe I used the Mac's 'say' command with one of the default voices (perhaps with some parameter tweaks).
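For anyone who wants to try something similar, here is a small sketch of driving `say` from Python on macOS. The helper name `build_say_command` is hypothetical, and the voice and rate in the usage note are guesses for illustration, not the settings used for the talk. It relies only on the `-v` (voice), `-r` (rate in words per minute), and `-o` (output file) options of `say`.

```python
def build_say_command(text, voice=None, rate_wpm=None, out_path=None):
    """Build the argument list for the macOS `say` command.

    Returns a list suitable for subprocess.run; nothing is executed here.
    """
    cmd = ["say"]
    if voice is not None:
        cmd += ["-v", voice]          # one of the installed system voices
    if rate_wpm is not None:
        cmd += ["-r", str(rate_wpm)]  # speech rate in words per minute
    if out_path is not None:
        cmd += ["-o", out_path]       # write audio to a file instead of playing it
    cmd.append(text)
    return cmd
```

On a Mac, something like `subprocess.run(build_say_command("All your base are belong to us", voice="Zarvox", rate_wpm=140), check=True)` would speak the line, assuming a voice with that name is installed; `say -v '?'` lists the voices that are available.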


Leaving the obligatory All Your Base remix video link here: https://youtu.be/qItugh-fFgg?feature=shared


What a coincidence, I searched for that very video a few days ago. It's astonishing how much of a time capsule it was. A really small slice of online life. I remember people used to send emails around with Word docs containing GIFs and silly images, doctored up. Everything pixelated. Nostalgia.


It was so huge at the time. There weren't so many memes or silly videos around at the time (maybe some things from https://b3ta.com/ like Weebl's Badger Badger?), so everyone was aware of it.


Mushroom mushroom!

I miss the old internet.


Classic


Some 2004 highlights for the nostalgic:

- MySpace was surging, Facebook was launched.

- Iraq counterinsurgency. Eventually Iraq fared better than Syria, while Afghanistan fared worse. No one in 2004 would have guessed this.

- WoW was released.

- Paris Hilton, Russell Brand, The Da Vinci Code, The Wardrobe Malfunction. Maybe not a year for culture.



