Google’s Latest Earned Patent Algorithm to Trap Spammers and Maybe White Hat SEO Too

Google’s Webmaster Guidelines highlight a number of practices that the search engine warns against, practices that someone might engage in to boost their rankings in ways intended to mislead the search engine. The guidelines start with the following warning:

Even if you choose not to implement any of these suggestions, we strongly encourage you to pay very close attention to the “Quality Guidelines,” which outline some of the illicit practices that may lead to a site being removed entirely from the Google index or otherwise impacted by an algorithmic or manual spam action. If a site has been affected by a spam action, it may no longer show up in results on Google.com or on any of Google’s partner sites.

A Google patent granted this week describes a few ways in which the search engine might respond when it believes that such practices might be taking place on a page and might be improving that page’s rankings in search results. The following image from the patent shows how search results might be reordered in response to such rank-modifying spam:

Google Rank Modifying Spam Chart of Ranking Changes

Those practices, referred to in the patent as “rank-modifying spamming techniques,” may involve techniques such as:

•Keyword stuffing,
•Invisible text,
•Tiny text,
•Page redirects,
•Meta tag stuffing, and
•Link-based manipulation.

While the patent contains definitions of these practices, I’d recommend reading the definitions in the quality guidelines on the Google help pages, which go into much more detail. What’s really interesting about this patent isn’t that Google is taking steps to try to keep people from manipulating search results, but rather the possible steps they might take while doing so.

The patent is:

Ranking documents
Invented by Ross Koningstein
Assigned to Google
US Patent 8,244,722
Granted August 14, 2012
Filed: January 5, 2010

Abstract

A system determines a first rank associated with a document and determines a second rank associated with the document, where the second rank is different from the first rank. The system also changes, during a transition period that occurs during a transition from the first rank to the second rank, a transition rank associated with the document based on a rank transition function that varies the transition rank over time without any change in ranking factors associated with the document.

When Google believes that such techniques are being applied to a page, it might respond to them in ways that the person engaging in spamming might not expect. Rather than outright increasing the rankings of those pages, or removing them from search results, Google might respond with what the patent refers to as a time-based “rank transition function.”

The rank transition function provides confusing indications of the impact on rank in response to rank-modifying spamming activities. The systems and methods may also observe spammers’ reactions to rank changes caused by the rank transition function to identify documents that are actively being manipulated. This assists in the identification of rank-modifying spammers.

Let’s imagine that you have a page in Google’s index, and you work to improve the quality of the content on that page and acquire a number of links to it, and those activities cause the page to improve in rankings for certain query terms. The ranking of that page before the changes would be referred to as the “old rank,” and the ranking afterward is referred to as the “target rank.” Your changes might be the result of legitimate modifications to your page. A page where techniques like keyword stuffing or hidden text have been applied might also climb in rankings, with an old rank and a higher target rank.

The rank transition function I referred to above may create a “transition rank” involving the old rank and the target rank for a page.

During the transition from the old rank to the target rank, the transition rank might cause:

•a time-based delay response,
•a negative response,
•a random response, and/or
•an unexpected response

For example, rather than immediately raising the rank of a page when there have been some modifications to it, and/or to the links pointing to it, Google might wait for a while and even cause the rankings of the page to decline initially before they rise. Or the page might increase in rankings initially, but on a much smaller scale than the person making the changes might have expected.
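
To make the idea concrete, here is a minimal Python sketch of what a time-based rank transition function could look like. It is not taken from the patent: the function name `transition_rank`, the three modes, and the specific curves (holding the old rank for half the period, or moving the wrong way for half the period) are assumptions chosen purely for illustration.

```python
import random


def transition_rank(old_rank, target_rank, t, duration, mode="delay", rng=None):
    """Return an illustrative transition rank at time t.

    Ranks are positions in the results, so a smaller number is better.
    All curve shapes here are hypothetical, not the patent's formulas.
    """
    if t <= 0:
        return old_rank              # transition has not started yet
    if t >= duration:
        return target_rank           # transition period is over
    progress = t / duration
    gap = old_rank - target_rank     # positive when the page should improve

    if mode == "delay":
        # Hold the old rank for the first half, then move linearly.
        delayed = max(0.0, (progress - 0.5) * 2)
        return round(old_rank - gap * delayed)
    if mode == "negative":
        # Move in the wrong direction first, then recover to the target.
        if progress < 0.5:
            return round(old_rank + gap * progress)
        return round(old_rank + gap * 0.5 - gap * 1.5 * (progress - 0.5) * 2)
    if mode == "random":
        # Pick any position between the old rank and the target rank.
        rng = rng or random.Random()
        low, high = sorted((old_rank, target_rank))
        return rng.randint(low, high)
    raise ValueError(f"unknown mode: {mode}")


# Example: a page expected to move from position 20 to position 5.
for day in range(0, 11, 2):
    print(day, transition_rank(20, 5, day, 10, mode="negative"))
```

With an old rank of 20 and a target rank of 5, the “negative” mode in this sketch first pushes the page further down the results and only brings it to position 5 once the transition period ends, which is the kind of confusing signal the patent describes.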

The search engine may monitor the changes to that page and to links pointing to the page to see what type of response there is to that unusual activity. For instance, if someone stuffs a page full of keywords, instead of improving in rankings for certain queries, the page might drop. If the person responsible for the page then comes along and removes those extra keywords, it’s an indication that some kind of rank-modifying spamming was going on.
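
The monitoring step can be pictured with a small Python sketch. This is only an assumption about how such a check might be structured: the `reverted_after_drop` function and the shape of the `history` data (page content paired with an observed rank) are hypothetical and do not come from the patent.

```python
def reverted_after_drop(history):
    """Flag a page whose edit was followed by a worse rank and then undone.

    history: list of (content, rank) tuples in time order, where a larger
    rank number means a worse position. Hypothetical data shape.
    """
    for i in range(1, len(history)):
        before_content, before_rank = history[i - 1]
        content, rank = history[i]
        if content == before_content or rank <= before_rank:
            continue  # no edit, or the edit was not followed by a worse rank
        # The edit was followed by a worse rank: was the old content restored?
        if any(later == before_content for later, _ in history[i + 1:]):
            return True
    return False


# A page is stuffed with keywords, drops, and is then reverted.
history = [
    ("original page", 10),
    ("original page keyword keyword keyword", 18),
    ("original page", 17),
]
print(reverted_after_drop(history))
```

A real system would presumably weigh many more signals than a single revert, so a check like this would at most be one input into a spam determination.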

So why use these types of transition functions?

For example, the initial response to the spammer’s changes may cause the document’s rank to be negatively influenced rather than positively influenced. Unexpected results are bound to elicit a response from a spammer, particularly if their client is upset with the results. In response to negative results, the spammer may remove the changes and, thereby render the long-term impact on the document’s rank zero.

Alternatively or additionally, it may take an unknown (possibly variable) amount of time to see positive (or expected) results in response to the spammer’s changes. In response to delayed results, the spammer may perform additional changes in an attempt to positively (or more positively) influence the document’s rank. In either event, these further spammer-initiated changes may assist in identifying signs of rank-modifying spamming.

The rank transition function might impact one specific document, or it might have a broader impact over “the server on which the document is hosted, or a set of documents that share a similar trait (e.g., the same author (e.g., a signature in the document), design elements (e.g., layout, images, etc.), etc.)”

If someone sees a small gain based upon keyword stuffing or some other activity that goes against Google’s guidelines, they might engage in similar additional changes to a site, such as adding more keywords or hidden text. If they see a decrease, they might make other changes, including reverting a page to its original form.

If there’s a suspicion that spamming might be going on, but not enough to positively identify it, the page involved might be subjected to fluctuations and extreme changes in ranking to try to get a spammer to attempt some kind of corrective action. If that corrective action helps in a spam determination, then the page, “site, domain, and/or contributing links” might be designated as spam.

New Google Patent - Categories For Additional Exigent Keyword Rankings

Imagine that Google assigns categories to every webpage or website that it visits. You can see categories like those for sites in Google’s local search. Now imagine that Google has looked through how frequently certain keywords appear on the pages of those websites, how often those pages rank for certain query terms in search results, and user data associated with those pages.

One of my local supermarkets has a sushi bar, and they may even note that on their website, but the keyword phrase [sushi bar] is more often found on, and associated with, documents in a category of “Japanese Restaurants,” based upon how often that phrase tends to show up on Japanese restaurant sites, and how frequently Japanese restaurant sites tend to show up in search results for that phrase.

Since Google can make a strong statistical association between the query [sushi bar] and documents that would fall into a category of “Japanese restaurants,” it’s possible that the search engine might boost pages that have been categorized as “Japanese restaurants” in search results on a search for [sushi bar]. My supermarket [sushi bar] page might not get the same boost.

That’s something that a Google patent granted earlier this week tells us.

The patent presents this idea of creating categories for sites and associating keywords with those categories to boost sites in rankings when they are both relevant for those query terms and fall within those categories, within the context of local search. But the patent tells us that it can use this process in other searches as well.

Keywords associated with document categories
Invented by Tomoyuki Nanno, Michael Riley, and Gaku Ueda
Assigned to Google
US Patent 7,996,393
Granted August 9, 2011
Filed: September 28, 2007

Abstract

A system extracts a pair that includes a keyword candidate and information associated with a document from multiple documents, and calculates a frequency that the keyword candidate appears in search queries and a frequency that the pair appears in the multiple documents. The system also determines whether the keyword candidate is a keyword for a category based on the calculated frequencies, and associates the keyword with the document if the keyword candidate is the keyword for the category.
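
The frequency test in the abstract can be illustrated with a short Python sketch. The patent does not give a formula in the passage quoted here, so everything specific below is an assumption: the function name `keywords_for_category`, the `min_query_freq` and `min_share` thresholds, and the use of simple substring matching to find a phrase in a document.

```python
from collections import Counter


def keywords_for_category(docs, queries, category,
                          min_query_freq=2, min_share=0.5):
    """Return query phrases that look like keywords for `category`.

    docs: list of (category, text) pairs; queries: list of query strings.
    Thresholds are illustrative assumptions, not values from the patent.
    """
    query_freq = Counter(q.lower() for q in queries)
    pair_freq = Counter()    # (phrase, category) -> number of documents
    phrase_freq = Counter()  # phrase -> number of documents in any category

    for doc_category, text in docs:
        lowered = text.lower()
        for phrase in query_freq:
            if phrase in lowered:
                pair_freq[(phrase, doc_category)] += 1
                phrase_freq[phrase] += 1

    keywords = []
    for phrase, q_count in query_freq.items():
        if q_count < min_query_freq or phrase_freq[phrase] == 0:
            continue  # rarely searched, or never seen in a document
        share = pair_freq[(phrase, category)] / phrase_freq[phrase]
        if share >= min_share:
            keywords.append(phrase)
    return keywords


docs = [
    ("Japanese Restaurants", "Our sushi bar is open late"),
    ("Japanese Restaurants", "Sushi bar and sake tasting"),
    ("Supermarkets", "Fresh produce, a deli, and a sushi bar"),
    ("Supermarkets", "Weekly grocery deals"),
]
queries = ["sushi bar", "sushi bar", "grocery deals", "sushi bar"]
print(keywords_for_category(docs, queries, "Japanese Restaurants"))
```

In this toy data, [sushi bar] is searched often and appears mostly on pages categorized as “Japanese Restaurants,” so it is treated as a keyword for that category but not for “Supermarkets,” which mirrors the supermarket example above.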

If you have access to Google’s Webmaster Tools for a website, the section on “Keywords” shows you the “most common keywords Google found when crawling your site,” along with a warning that those should “reflect the subject matter of your site.” Another section of Webmaster Tools shows the queries that bring visitors to your site, how many impressions and clickthroughs your pages receive from search results, and an average ranking for your pages in those results. An additional section of the Google tools shows the anchor text most often used to link to your site.

If you were to take all of that information that Google provides for your site, and try to guess at a category or categories that Google might assign to your site, could you? It’s possible that Google is using that kind of information, and more, to determine how your site should be categorized. Of course, Google would also be looking at other sites for information such as the frequency of keywords used on their pages and the queries they are found for, both to create those categories and to see how well your site might fall into one or more of them.

Of course, if you verify your business in Google Maps, you can enter categories for your business, but Google may suggest and include other categories as well. For instance, Google insists on including “Website Designer” as a category for my site even though that’s not a category that I’ve ever submitted to them.

And while this patent discusses how it might be applied to local search, it could just as easily be applied to Web search, and the patent provides a long list of different types of categories that it might apply to websites, which expand well beyond business types.