I was asked a couple of questions on Twitter today about Google and their result counts. They're not easy to answer in 140 characters, so I've embedded the tweets and will then attempt an answer.
The same search on Google brings back different numbers of results. Why? Any idea @Philbradley
— (@mumwastheword) March 21, 2014
Also using the minus sign to exclude words. Results go up. Google questions! @Philbradley
— (@mumwastheword) March 21, 2014
This happens on a regular basis, and there are a lot of reasons for it. To take the first question first: Google is constantly updating their results, finding more information and bringing it back, so if the count is off by a small number, it may simply be because Google has added content to, or deleted content from, the database. However, it could also be that, because Google uses different data centres, the first query went to one data centre that had one count, and the second search went to a different data centre that had another count. Because the data centres are not absolutely in sync, that's a possible answer.
That's the kind explanation. The other explanation is that Google doesn't actually care about the number of results - it's simply a rough estimate. Google is, in effect, bone idle, and doesn't carry out a proper search for you. I asked Amit Singhal, a leading Google engineer, this exact question once, and he told me that I shouldn't worry about the counts - they weren't that important. If you think about it, unless you do a very precise search Google will give you a very neat, round number; it's simply meant to be an indication of how many results exist. It's also a fairly moot point, since Google knows that you're not going to look at thousands of results anyway. So regard the number of results that you see merely as a guideline, not an accurate figure.
Now, when you do your search again, adding in more filters, excluding words or phrases and so on, Google actually has to do some work, and I always think of it a little bit like a stroppy teenager giving a deep sigh before getting up to do the chores. Google thinks 'oh right, this is a real request' and it actually works a bit harder in an attempt to give a slightly more accurate figure - and again this is the explanation that I got from Amit on his visit to Google in London a year or so ago. You can see this time and time again - when excluding a word from a search you often get more, not fewer, hits, because the hit count the first time is very rough - sometimes off by millions. An example that I often use is this one: "man on the moon" and then "man on the moon" -hoax.
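[Screenshot: Google result count for "man on the moon"]
and then
[Screenshot: Google result count for "man on the moon" -hoax]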
That's a difference of an impressive 8,060,000! It makes something of a mockery of Google search results, but they really don't care that much, and I've got this straight from the horse's mouth, as it were.
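To see why a rising count gives the game away, here's a minimal sketch in Python. It assumes a tiny made-up collection of documents, and the helper names `matches` and `exact_count` are my own inventions for illustration - this has nothing to do with how Google actually works internally. The point is simply that when every document really is checked, excluding a word can only keep the count the same or lower it.

```python
# Toy collection: hypothetical documents, purely for illustration.
DOCS = [
    "the first man on the moon was neil armstrong",
    "some claim the man on the moon landing was a hoax",
    "man on the moon is a 1999 film",
    "the moon landing hoax theory debunked",
]

def matches(doc, phrase, exclude=()):
    """True if doc contains the phrase and none of the excluded words."""
    words = doc.split()
    return phrase in doc and not any(w in words for w in exclude)

def exact_count(phrase, exclude=()):
    """Count matching documents exactly by checking every single one."""
    return sum(matches(doc, phrase, exclude) for doc in DOCS)

total = exact_count("man on the moon")
without_hoax = exact_count("man on the moon", exclude=["hoax"])

# With an exact count, the filtered figure can never exceed the unfiltered one.
print(total, without_hoax)  # prints: 3 2
```

So if the figure goes up when you exclude a word, at least one of the two numbers can't be an exact count, which fits with the idea that the first one is only a rough estimate.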
"since Google knows that you're not going to look at thousands of results anyway"
You could reword that: "Since Google ***won't allow you*** to look at more than a thousand results anyway." I assume that large Google result numbers are generated by a random number generator; that's as useful as any other assumption. The numbers aren't verifiable, therefore not meaningful.
Posted by: Walt Crawford | March 21, 2014 at 04:02 PM
While I love using Google Books, its ability to find stuff is s***. If you want the volume of a particular magazine, sometimes you'll find it, sometimes not.
Suppose, say, you wanted "Punch vol. 126." Typing "Punch 126" won't find it, even though both are in the headline. Google Books can't find it unless you use the workaround of finding any volume of Punch, clicking on the "More Editions" link, and scrolling down the list until you find the volume.
They're as bad as Amazon, which will give you thousands more items beyond what you're looking for.
Posted by: Bill Peschel | March 22, 2014 at 04:33 PM