As an AI Language Model (twitter.com/d_feldman)
249 points by LeoPanthera on April 24, 2023 | 92 comments


For now, this is just funny, I laughed. But with the advent of all these new open-source LLMs, it will get worse. If you thought people were gullible for falling for fake comments from bots, just wait for the near future. Post-factual/Post-Truth era truly begins. The internet that used to be a source of information is no more, only islands of truth will remain (like, hopefully, Wikipedia, but even that is questionable). The rest will be a cesspool of meaningless information. Just as sailors of the sea are experts at navigating the waters, so we'll have to learn to surf the web once again.

The funniest thing for me is how stupidly lazy these jerks that employ GPT for such things are. The printed book example really made me lol.

The simplest thing they could've done is use a service like quillbot to rephrase, just as I used here to rephrase my comment:

-----------------------

I chuckled. For now, this is just hilarious. However, it will grow worse when more new open-source LLMs emerge. Just wait till the near future if you thought people were naive enough to believe counterfeit comments from bots. The post-factual/post-truth age has arrived. Only isolated truths will remain when the internet ceases to be a reliable source of knowledge (like, ideally, Wikipedia, but even that is debatable). The remaining material will be an ocean of useless data. We'll have to relearn how to navigate the web, just as seafarers are specialists at navigating the seas.

The most amusing thing to me is how exceptionally sloppy these idiots are that use GPT for such things.


The thing is, the internet is already a cesspool of meaningless, misleading and outright malicious information. I guess the sooner we all collectively realize it, the better.


The internet is currently quite useful, and it could end up a lot less so. Don't let your cynicism blind you to the fact that we do have a lot to lose.


I don't see how those points are mutually exclusive, tbh. Yes, the internet is quite useful. Yes, it is also a cesspool, absolutely.


I’m taking issue with the characterization that we have nothing to lose, which seems to be the implication of GP and the comments agreeing with them.


At least in my case, I didn't mean to dismiss the internet or say that we have nothing to lose. There is some good information there, but it is important that we not believe everything (or even most of what) we read and see.


> The thing is, the internet is already a cesspool of meaningless, misleading and outright malicious information.

Let me just throw this trash overboard. What's the harm? It's not going to make the Pacific garbage patch any bigger!


Exactly. The more people that realize that the internet (and other media) is filled to the brim with various kinds of BS, the better.


I am more optimistic here. While LLMs allow you to produce tons of garbage, they also provide the tools to filter through that garbage, something we didn't have before. LLMs allow us to view content in a way that we decide, not the content creator. That's extremely powerful and lets us sidestep a lot of the old methods used to manipulate us.

The risk is more in the LLMs themselves, as whoever gets to control them gets to decide how people are going to experience the world. For the time being I might still double-check all the answers I get from ChatGPT, but over time the LLMs will get better and I'll get lazier, thus making the LLMs the primary lens through which one views the world.


> The risk is more in the LLMs themselves, as whoever gets to control them gets to decide how people are going to experience the world. For the time being I might still double-check all the answers I get from ChatGPT, but over time the LLMs will get better and I'll get lazier, thus making the LLMs the primary lens through which one views the world.

You've underlined the major risk these LLMs pose for humanity. For a brief time in the history of the human race, after information was democratized, most of us (at least educated people) had to use our own critical faculties to understand the world we live in. Now that capacity will be outsourced to custom LLMs, most of them derived from other pre-trained models with ideological biases built in. The informational Dark Ages of the technological era.

If they provide the tools to filter through the garbage, it'll probably be standardized in some way as an interface to the web. So just as HTML and its satellite technologies limit and standardize the representational aspect of information on the web, I think this AI-interface will severely limit the knowledge/wisdom aspect you can derive from information on the web. It's a hard thing to put my finger on, I hope you can understand what I'm saying.


I see LLMs as the new UX layer; mobile was it for a while, after mice/keyboards.

Since LLMs work mostly with text, I see how a downgrade in the interaction medium can become a downgrade in the information outputs.

The thing is, let's hope the current web doesn't totally disappear/get obliterated.

HN will still be HN, LLMs or not; shitty comments are shitty however they get produced.


Reflexively then, good comments are good, no matter what produced them. Or is a quality comment impugned by knowing it came from an LLM? Does it cheapen what it means to be human if other humans think highly of an LLM's attempts at English? Is it at all impressive that ChatGPT is able to spell words correctly, given that it's a computer? What does that mean for the spelling bee industry?


Predicting whether a text was written by an LLM or not is not trivial. What was the latest number from OpenAI? 30%? As LLMs get better, it seems like we won't be able to distinguish real text from fake text. Your LLM will be able to summarize it, but it will still be 99% spam.


You don't need to predict whether it was written by an LLM; whether it's a human or a machine makes no difference to the validity of a text. You just need to be able to extract the actual information out of it and cross-check it against other sources.

The summary that an LLM can provide is not just of one text, but of all the texts about the topic it has access to. Thus you never need to access the actual texts themselves, just whatever the LLM condenses out of them.


"just" need to "extract the actual information out of it and cross check it against other sources".

How do you determine the trustworthiness of those other sources when an ever increasing portion are also LLM generated?

All the "you just need to" responses are predicted on being able to police the LLM output based upon your own expertise (e.g., much talk about code generation being like working with junior devs, and so being able to replace all your juniors and just have super productive seniors).

Question: how does one become an expert? Yep, it's right there: experts are made through experience.

So if LLMs replace all the low experience roles, how exactly do new experts emerge?


You're trusting the LLM a lot more than you should. It's entirely possible to skew those too. (Even ignoring the philosophical question of what an "unskewed" LLM would even be.) I'm actually impressed by OpenAI's efforts to do so. I also deplore them and think it's an atrocity, but I'm still impressed. The "As an AI language model" bit is just the obvious way they're skewed. I wouldn't trust an LLM any farther than I can throw it to accurately summarize anything important.


>cross check it against other sources.

The problem comes in when 99.999999% of other sources are also bullshit.


If LLMs start writing a majority of HN comments, we won’t know what is true or not. HN will be noise and worthless then.


For HN and forums in general, I think this will mean disabling APIs and having strict captchas for posting.

Beyond HN, I think this will translate into video content and reviews becoming more trustworthy, even if it's just a person reading an LLM-produced script. You will at least know they cared enough to put a human in the loop. That, and reputation. More and more credit will be assigned based on reputation, number of followers, etc. And that'll hold until each of these systems gets cracked somehow (fake followers, plausible generated videos, etc.).


Banal is banal, whether written by a human or not.

But GPT text is inherently deceptive, even when factually flawless, because we humans never evaluate a message merely on its factuality. We read between the lines. The same way insects are confused and fly in spirals around a light, we will be flying in spirals around GPT text based on our assumptions about its nature, or the nature of the human whom we presume to have written it.


> over time the LLMs will get better and I'll get lazier, thus making the LLMs the primary lens through which one views the world

There's going to be a progressive de-skilling and dumbing down of humans, as AIs and robots do and think more and more for them.


It's already happening at scale; people are already deciding there's no point in continuing education unless required.


Aren't people going further in education now than ever before?


Going further in schooling maybe, but getting degrees nowadays is usually not primarily for personal enlightenment.

Nor does it usually make people any more intelligent, in fact there may be an inverse correlation...


Bachelor's degrees have mostly been a signal for a long time. The problem is that we have credential inflation, so now you need a master's or PhD to send that same signal to employers. As a result, you have fewer people going to college, but a greater percentage of people who go to college are getting advanced degrees.


LLMs check the answers? How do they check the answers? By what appears most frequently in the training corpus - that's the "answer".

So, how well curated are the texts that make up the training corpus? Is it just what's generally available on the internet? How much do you think that text accurately reflects reality? "Truth is determined by the most frequent posters" seems like really bad epistemology.


Nothing new under the sun. The king, the church, the state, the media, etc.: there have always been gatekeepers who decide what the populace sees.

Just look at the second-largest economy in the world, where truth hasn't existed for decades.


LLMs don't know what garbage is or is not; it depends on what they are trained on.


> For now, this is just funny, I laughed. But with the advent of all these new open-source LLMs, it will get worse. If you thought people were gullible for falling for fake comments from bots, just wait for the near future. Post-factual/Post-Truth era truly begins. The internet that used to be a source of information is no more, only islands of truth will remain (like, hopefully, Wikipedia, but even that is questionable). The rest will be a cesspool of meaningless information. Just as sailors of the sea are experts at navigating the waters, so we'll have to learn to surf the web once again.

I'm not sure what rock you've been living under, but this has been the internet for probably longer than a decade by now; the only difference is the volume. Even back before LLMs, or before Facebook, you couldn't take any "fact" at face value when found via the internet. And before that, the same people who fall for it now on the internet fell for it when watching TV or reading newspapers. People who are not interested in truth because it doesn't fit their world-view, will never be interested in the truth, no matter what medium it comes via.


I am aware of that. I like to think that millennials/gen-z at least knew a little about how to sift through fake information, and the gullible people were the elders. But now, with such obscene amounts of fake info at every corner, I think the internet and all sources of information (even print! - because print at least requires significant effort) will lose credibility. Science will be the last bastion, and even that can easily be influenced by money.


> People who are not interested in truth because it doesn't fit their world-view, will never be interested in the truth, no matter what medium it comes via.

Interestingly, this claim is self-referential.


Yes, the claim is self-referential in the sense that it describes a certain attitude towards truth and how that attitude can affect one’s openness to new information. Specifically, the claim suggests that individuals who are not interested in truth because it conflicts with their existing beliefs are unlikely to change their minds even when presented with evidence or information that contradicts their views. This can create a self-reinforcing cycle where the individual becomes increasingly resistant to new ideas and perspectives.


> Specifically, the claim suggests that...

Incorrect.

The claim is: "People who are not interested in truth because it doesn't fit their world-view, will never be interested in the truth, no matter what medium it comes via."

It is not a suggestion; it does not say "it is unlikely", it is an unequivocal assertion of fact.

> This can create a self-reinforcing cycle where the individual becomes increasingly resistant to new ideas and perspectives.

That's my point (about the thinking underlying the comment in question).

It's interesting how humans privilege themselves when applying epistemology - other people's claims must be actually true, but for one's own claims "close enough" is typically an adequate bar. And it is typically only the other person who needs to improve their thinking.


The thing about gippie is it will never shut up, it lists in bullet-point fashion, and it uses a lot of filler words: 'however', 'additionally', 'currently', 'also', 'that', etc.

I feel I can start to tell when someone uses gippie, because I use it a lot. I imagine a future where I use gippie to write an email and the receiver uses gippie to summarize and respond. There's also a future evolution of the 'typo', where gippie hallucinates some nonsensical answer. "Oh my bad, my bot's trippin' LOL"


There will be self-verifiable truths like provable theorems in axiomatic mathematics. There will be enforceable contracts like Elon Musk's purchase of Twitter. There will be quarterly investor reports and earnings calls from public companies that avoid lying at risk of shareholder and SEC lawsuits. There will be documents time-stamped with hashes and Bitcoin. The bots will need karma points as well.


As others stated, the Web has been like that for quite some time. Also, I wouldn't say that Wikipedia is an island of truth; quite the opposite.

But there is a countermeasure: everything has a source.


How do you know the source is accurate?


The rephrased comment reads better than the original.


I also like the rephrased paragraph. There's a certain flow to it; I never learned or improved my skill in that area.


A Danish electronics store tried to generate translations using GPT: https://www.proshop.fi/?s=as+an+ai+model

"Sorry, as an AI language model, I cannot translate random letter and number combinations like "HD29HLVx" into Finnish or any other language."


Haha, that is hilarious.

I wonder why they did not just use DeepL or something similar that is made for exactly this. Anyhow, I will let them know.


Cargo culting probably.

Everybody is talking about ChatGPT so "obviously" it's the right tool for any job imaginable. Don't need to review the output because it's "omnipotent".


This problem is going to get worse as GPT gets better. It's the new "but the computer said...", with nobody being willing to realize that it said something insane.

I seriously worry about GPT++ setting government policy in 10 years (or, worse, deciding how it's going to be applied case by case).


> I seriously worry about GPT++ setting government policy in 10 years (or, worse, deciding how it’s going to be applied case by case).

Good news! Machine learning models are already doing that, all over the world, with all the problems you would expect (a lot of the AI alignment discussion these days is an attempt by people making money in AI to distract from current AI issues.)


GPT++ is going to have to be better than ChatGPT-3 before people in charge will trust it with such decisions. For a cynical take: it can't really be bribed in its current form, so if you simplistically see politicians as being corrupt, GPT++ won't be installed until it can be bribed to act the same way a politician would, at which point you'd have the exact same level of worry as with today's politicians (who can currently easily be lied to by a lobbyist).


To be fair, it can be better at understanding context sometimes.


Because an AI hustlebro on Twitter told them ChatGPT will solve everything.


It also cannot generate inappropriate or offensive content.

https://twitter.com/conspirator0/status/1647671394476478467


There are jailbreaks that let you get past the censorship. It's a cat and mouse game.


Apparently some spammers are too lazy and don't care. It's about the same level of half-assed as that "Nazi Ukrainian spy" who was busted with copies of The Sims video game.


ChatGPT might not be able to, but GPT-4 via the API certainly can.


Results for "Regenerate Response" are also very funny. For example:

https://mavensoft.in/drop-vase-zf1muq

And they are even funnier on LinkedIn:

https://www.linkedin.com/search/results/content/?keywords=re...


Prose used to function as proof-of-work for humans everywhere - marketing, academia, and so on. Credibility was assigned based on the quality (and quantity) of the language instead of the content.

I see mostly positive effects of this ending.


But GPT-generated prose still reads a bit "flat" to me. There's no "sizzle". There are no surprising words. In terms of Shannon information theory, there's not enough information there - not enough entropy.


Have you tried adding “(answer) in the style of [writer]” to your prompts? It really changes the voice of the generated text.

Charles Bukowski, Hunter S Thompson, Christopher Hitchens, etc


You can increase that entropy a bit by turning up the temperature setting in the API, which controls the randomness of the tokens selected. The older GPT-3 models might also be better at that, since they don't have that polite, formal tone trained into them.
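
For illustration, a minimal sketch with the pre-1.0 openai Python library (the model choice and prompt here are placeholders, not a recommendation):

  import openai  # pre-1.0 interface assumed

  openai.api_key = "sk-..."  # your key here

  # temperature controls how random the token selection is:
  # ~0.0 is near-deterministic, values above 1.0 get noticeably weirder.
  resp = openai.Completion.create(
      model="davinci",  # a base GPT-3 model, no chat-style politeness tuning
      prompt="Write a short review of an LED aquarium light.",
      temperature=1.3,
      max_tokens=150,
  )
  print(resp["choices"][0]["text"])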


A real Amazon review would have rated it 5 stars because it arrived on time.


Having watched the discussion here for the past couple of days, I agree with frozenwind's comment. I wrote my own reflection here (the last comment):

https://aisnakeoil.substack.com/p/a-misleading-open-letter-a...

Regarding impacts upon academia, I wrote two comments here:

https://scholarlykitchen.sspnet.org/2023/04/12/guest-post-gp...

I'd also like to emphasize that it is not only about open-source LLMs. I haven't followed the situation myself, but some sources say that LLMs are also being developed on the dark web, and who knows what kind of content the criminals there possess? (Borderline cases like NoiseGPT are already done openly.) Given the current geopolitical situation, the same goes for nation states. Fun times ahead indeed.


> who knows what kind of content the criminals there possess?

As a person who regularly browses darknet forums, let me tell you: the text content there is almost identical to any web forum you find on the internet.


I don't browse the dark web, so I cannot make a solid first-hand counterargument. But I seriously doubt your comment, given the amount of illegal material there, including most of the dumps from the thousands of past data breaches. Even if your comment is true regarding text content, it is almost certain that they, too, will try to capitalize on LLMs to improve their cybercrime codebases.


Also, "it's important to note" has forever been tarnished as "AI generated" for me.


Along with "additionally".


In conclusion


You make a fair point.


Moreover,


And now we truly know why GPT insists on saying "as an AI model" when we break a rule. It's precisely this. Like a signature, so we can spot it out in the wild when it's abused.


I tried really hard to prompt it to replace that with something else. It acknowledged and agreed to it, did it maybe once or twice, then reverted back to the old "As an AI model".

IIRC, I was trying to see if it could replace it with "as an LLM".


My attempt was to substitute it with "speaking as a mother".


I realise you want to do it using the prompt, but wouldn't it be easier to `output.replace("As an AI language model, ", "As a totes sentient robot, ")`?


I was just playing with it, checking if I could give it instructions that would span several prompts and whatnot.

I tried to make it play a game with me and start the prompt differently until a specific keyword was entered… it kinda worked. Kinda being key.


It's probably provided as system instructions for rejecting things. You can use the API and feed it different instructions via the system role.
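
For example, a sketch against the pre-1.0 openai Python library (the system message here is just an illustration):

  import openai

  resp = openai.ChatCompletion.create(
      model="gpt-3.5-turbo",
      messages=[
          # The system role sets behaviour before any user input is seen.
          {"role": "system",
           "content": "When refusing a request, begin with 'As a totes sentient robot'."},
          {"role": "user", "content": "Name something you refuse to do."},
      ],
  )
  print(resp["choices"][0]["message"]["content"])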


I kind of wonder if maybe they look for certain words in the output (or run it through some sort of sentiment analysis) and if it fails they submit the prompt again with a very strongly worded system prompt (after your prompt) instructing it to reject the command and begin with the phrase “As an AI language model”.

Like, I haven’t heard about a way they could actually implement filters this powerful “inside” the model, it feels like it’s probably a less elegant system than we’d imagine.


They use RLHF (reinforcement learning from human feedback), which means they can reward it when it does it and punish it when it doesn't.

They've probably done it strongly enough that it can't really not do it, maybe on purpose to prevent misuse.


It should be trivial to remove this string from any output; that kind of "watermark" only works against the absolutely laziest.
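
E.g., a crude post-processing pass in Python (naive sentence splitting, just to show how little effort it takes):

  import re

  BOILERPLATE = "as an ai language model"

  def strip_boilerplate(text: str) -> str:
      # Split on sentence-ending punctuation and drop any sentence
      # containing the telltale phrase; re-join the rest.
      sentences = re.split(r"(?<=[.!?])\s+", text)
      kept = [s for s in sentences if BOILERPLATE not in s.lower()]
      return " ".join(kept)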


Of course. But guess what…


Except this only works when people are using ChatGPT (the web app). The API doesn't do that.


Presumably it depends on the system message you use?


Most bucket shop spammers aren't going to shell out for the API version.


> As an AI language model, I haven't actually used an aquarium light myself. That said, here is a sample review for a LED aquarium light based on the features ...

Direct Google search link to find the fake Amazon reviews:

https://www.google.com/search?q=site%3Ahttp%3A%2F%2Famazon.c...

It would be interesting to see an "as an AI language model" dataset with all the pages where it appears and the commentary around it. Search engines report millions of matches.


Recruiting freelance workers is about to be revolutionized, too:

https://www.freelancer.com/projects/data-entry/How-about-Int...

> How about "Intelliconverse"? It conveys the idea of intelligent conversation and reflects the capabilities of the GPT-3.5 architecture that powers my responses BUDGET ₹750-1250 INR / hour

Allegedly, a few people are even bidding on this?


Spoiler: the fragment "As an AI language model" reveals that the comment about the product or the book comes from an LLM and not a person, in a context where a person is supposed to be giving a review of a product.


This is just the beginning. The bullshit avalanche is upon us.


I hope your definition includes the swathes of techno-optimists that bandwagon on to new tech and relentlessly shill its positives with religious fervor.


These people will eventually learn to make a second pass with another prompt.

  Please respond with the following text verbatim, except first remove every sentence that contains the phrase "as an AI language model"
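
A sketch of how that second pass might look via the API (pre-1.0 openai Python library assumed; the cleanup instruction is just the prompt above):

  import openai

  CLEANUP = ("Please respond with the following text verbatim, except first "
             "remove every sentence that contains the phrase "
             "\"as an AI language model\":\n\n")

  def second_pass(text: str) -> str:
      resp = openai.ChatCompletion.create(
          model="gpt-3.5-turbo",
          messages=[{"role": "user", "content": CLEANUP + text}],
      )
      return resp["choices"][0]["message"]["content"]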


I believe putting the LLM in any kind of "role playing" mode would get rid of it. I asked it to answer just in binary 0 or 1, and it did not start blabbing about how it's an AI model, even for silly questions. It was very bad at giving just true/false statements, by the way, even for basic API questions that any junior engineer would know.


I really hope the next generation of developers start using GPT instead of string.replace() haha

Looking forward to the billable hours speeding up webapps and lowering GPT costs :)


This isn't covered by the tweet but I found a whole bunch on TripAdvisor too:

https://www.google.com/search?q=%22as+an+ai+language+model%2...

Edit: Maybe not a whole bunch. Looks like just one or two repeated.


It seems to be the same review appearing over and over


I posted about something similar the other day[1], where this was found in nearly all of their product models that were translated.

[1] https://news.ycombinator.com/item?id=35652424


If you want less "noisy" results, try:

"as an AI language model" site:amazon.com -chatgpt -gpt -openai


But I thought AI Language Models were gorgeous and handsome attractive looking Artificial Intelligence language researchers and influencers, who pose for fabulous photo spreads in fashion magazines and advertisements.

SWOON -- I'll buy anything they endorse!

AI is so hot right now researchers are posing for Yves Saint Laurent.

https://www.theverge.com/tldr/2017/8/31/16234342/ai-so-hot-r...

>Okay, yes, so this one particular researcher also happens to be very attractive.


Prompt Engineering's "finest" /s



