
Does it work like that though? How long does it take for AI bots to crawl sites and for that data to end up in the model currently in use? Am I wrong in thinking that it takes a lot longer for AI bot crawls to reach the public than it does for a typical search engine crawl?


The bots could be crawlers periodically gathering raw training data, or the requests could come from a web search agent of some kind, such as ChatGPT looking up the latest news stories on topic X. I don't know whether robots.txt can distinguish between the two types of bot request, or whether LLM providers even honor it in either case.
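
On the robots.txt question: rules are matched per user-agent token, so the file can only tell the two kinds of request apart if a provider sends a different token for its training crawler than for its search agent. Here is a minimal sketch using Python's standard urllib.robotparser. It assumes the tokens GPTBot (training crawler) and OAI-SearchBot (search), which OpenAI has documented. The robots.txt content, the helper name allowed_agents, and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the training crawler, allow the search agent.
# The user-agent tokens are assumptions based on OpenAI's published names.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Return {agent: bool} saying whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "OAI-SearchBot"],
    "https://example.com/news/story",
)
print(result)
```

This only shows what a well-behaved bot would conclude from the file. Compliance with robots.txt is voluntary, so it says nothing about whether a given provider actually adheres to it.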


Wow, just reading the headline, I had assumed they were giving it the new article as a document, then asking it to summarize that document.



