Andrew Roach is a user on retro.social.
Andrew Roach @ajroach42

Google no longer indexing decade old articles (that means 2008!)

boingboing.net/2018/01/16/try-

Time to duck duck go, if you haven't already.

@alex_ Right to be Forgotten folks have robots.txt and "delete", and, like, duck duck go and other services still archive their stuff. And just generally, "Right to be forgotten" is kind of a bad ideology when you're talking about a global publishing platform.

We don't have people throwing away library books in deference to the author's right to be forgotten.

This is a net loss.

@ajroach42 unfortunately this comes during a significant decline in the quality of duckduckgo's search results.

@deutrino I have not noticed the same. It's been pretty solid for me.

@ajroach42 Run the same search in DDG and Google every so often and compare the results. It's most obvious with acronyms and words that are phonetically similar to other words, but in general you may be unpleasantly surprised. I sure was, given that DDG had equal or better relevance to Google for years.

@deutrino I'm sorry that DDG is not delivering the quality that you need?

It meets my needs without issue, so I'm not going to sweat it.

@ajroach42 Yeah, it's more that you recommended it and I thought you might like to be aware. I am getting annoyed enough after several months of poor performance to look for an alternative but haven't done so yet, or I'd suggest something else as part of the conversation. Have a good day.

@deutrino @ajroach42 i've been using ddg for like 3 years now and i haven't noticed a significant change in either direction. i ddg by default and if i don't find what i need i add a !g bang to the search

i find what i'm looking for on ddg ~60-70% of the time and on google most of the rest

@chr Ditto!

But some people gotta spread FUD. Whatcha gonna do?

@nightpool Yeah, it's not an across the board thing, and they fix it if a more recent piece that gets indexed contains a link back to the older piece (so it seems), but older content, especially from smaller outlets, is clearly deprioritized or outright removed from results.

@ajroach42 the idea that google might expire sites from its index that don't get used in search results and don't get found via crawlers is literally the least surprising thing to me ever.

@nightpool I'm not sure why the older content wouldn't get found via crawlers, though?

But yeah, them expiring old results isn't particularly surprising, even if it does result in situations where they will not be able to provide accurate results for niche queries.

@nightpool @ajroach42 Remember when Google bragged about the pure number of pages in its index?

@nightpool I remember it did back in 2001 or 2002 (while I was still in elementary school)

@ajroach42 Important to remember though DDG gets some results from Yandex et al, so there's no guarantee those search engines won't follow suit

Right now, Archive.org is the last Library of Alexandria for the web. I just wish it had a better search engine

@jeanofthedead @ajroach42 I don't know much about SearX, but I think StartPage just inserts itself between you and Google results. Wouldn't this still have the same problem as searching Google directly?

@Brennan @ajroach42 Hmm, I think you’re right. I’ve noticed the lack of older results from this encrypted Google search engine, too. Frustrating. Here I come, DDG!

@ajroach42 wow, everything prior to Breaking Bad does not exist, per Google.

@jannamark

I should have been a little more clear in the original. It's not that they've removed everything over ten years old.

It's that things that they used to index, they have removed from the index, specifically based on the age of the thing.

@ajroach42 no it was my fault, I should have said, everything prior to Breaking Bad does not matter, for an improved punch line.

I'm still kind of in love with the idea that Breaking Bad defines the start of an epoch, but that's probably a sign that I should get some sleep

@ajroach42 Wow. I've been relying on it to look up old stuff I've written as well, for reference. Now I'm glad I built my own search into my sites - it's not great, but at least there's a way to go back and find out if I ever wrote about X.

@drwho @ajroach42 Only if you use the g! option, otherwise they have their own crawler. You may be thinking of Startpage.

@MatejLach @ajroach42 Or I just don't know enough about how they operate. Which is true. Thanks.

@ajroach42 equally plausible explanations: 1) google had a bug 2) google had a hardware failure

But it's cloud, so not trendy to consider such possibilities...