An Insightful Breakdown of the Extensive Google Search Documentation Leak

The recent leak of internal Google Search ranking documentation has sent shockwaves through the SEO community. The leak, which exposed over 14,000 potential ranking features, provides an unprecedented look into Google's closely guarded search ranking system.

The leaked files originated from a Google API document commit titled “yoshi-code-bot/elixir-google-api,” indicating that this was not a hack or a whistleblower situation. The leaked information was shared by Erfan Azimi with Rand Fishkin of SparkToro, who then brought in Michael King of iPullRank to help distribute the story.

The leaked documentation has caused many SEO professionals to reevaluate their beliefs about Google’s ranking system. Previously, SEOs fell into three camps: those who believed everything Google told them, those who thought Google was lying, and those who believed that Google sometimes told the truth but needed to be tested. However, this leak has prompted many to reconsider their stance.

The leaked documentation contains over 14,000 potential ranking signals and features, a massive amount of information to sift through. The author of the article has read through all of it and distilled it into a 40-page PDF summary for Search Engine Land, and encourages readers to draw their own conclusions from the information provided.

One of the key points from the leak is the revelation of seven different types of PageRank used by Google, including the famous ToolBarPageRank. The documentation also mentions a specific method for identifying different business models, such as news, YMYL (Your Money or Your Life), personal blogs, ecommerce, and video sites. It remains unclear why Google specifically filters for personal blogs.

Several important components of Google’s algorithm were also mentioned in the leak, including NavBoost, NSR (Normalized Site Rank), and chardScores. The documentation reveals that Google uses page embeddings, site embeddings, site focus, and site radius in its scoring function. Google also measures various types of clicks and impressions to determine site and page rankings.

The leak raises several questions that the SEO community would love to have answered. For example, why is Google specifically filtering for personal blogs and small sites? Why did Google publicly deny having a domain or site authority measurement? And why did Google lie about its use of click data? These mysteries have left SEO professionals curious and eager for answers.

The leaked documentation also includes several interesting findings. For example, Google has something called pageQuality (PQ), which is used to estimate the “effort” put into creating article pages. Factors such as tools, images, videos, unique information, and depth of information are deemed important for scoring high on “effort” calculations.

The leak also supports the concept of topical authority, which appears in Google’s patent research. Metrics such as siteFocusScore, siteRadius, siteEmbeddings, and pageEmbeddings are used for ranking. siteFocusScore indicates how focused a site is on a specific topic, while siteRadius measures how far page embeddings deviate from the site embedding.
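
To make the idea of a site radius concrete, here is a minimal Python sketch. The leak names these attributes but does not reveal how Google computes them, so everything below is an assumption for illustration: the site embedding is taken to be the average of the page embeddings, the radius is taken to be the mean cosine distance of pages from that average, and the toy vectors, URLs, and the "2 × radius" cutoff are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors, component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def site_topicality(page_embeddings):
    """Assumed definitions, not Google's: site embedding = mean of pages,
    site radius = mean distance of pages from that site embedding."""
    site_embedding = mean_vector(page_embeddings)
    distances = [cosine_distance(p, site_embedding) for p in page_embeddings]
    site_radius = sum(distances) / len(distances)
    return site_embedding, distances, site_radius


# Hypothetical 3-dimensional page embeddings for a coffee site
# with one unrelated page.
pages = {
    "/espresso-guide": [0.9, 0.1, 0.0],
    "/latte-art": [0.8, 0.2, 0.0],
    "/grinder-reviews": [0.85, 0.15, 0.05],
    "/crypto-news": [0.0, 0.1, 0.9],
}

site_embedding, distances, site_radius = site_topicality(list(pages.values()))

# Arbitrary illustrative cutoff: flag pages more than twice the radius away.
off_topic = [url for url, d in zip(pages, distances) if d > 2 * site_radius]
print(off_topic)
```

Under these assumptions, a tightly focused site has a small radius, and a page far from the site embedding is a candidate for the "not topically relevant" review that SEOs have drawn from the leak. Real embeddings would come from a text embedding model and have hundreds of dimensions.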

Another interesting discovery is that short content can still rank well. The leak reveals that short content has a different scoring system applied to it, and it does not equate to thin content. This finding confirms the suspicions of many SEO professionals who have been trying to prove that short content can still be valuable.

The leak also sheds light on Google’s indexer, which is now named Alexandria. The documentation mentions two other prominent indexers as well: SegIndexer and TeraGoogle. These indexers play a role in placing documents into tiers within the index and in long-term storage.

Actionable advice based on the leaked information includes investing in a well-designed site with intuitive architecture to optimize for NavBoost, removing or blocking pages that are not topically relevant, optimizing headings and paragraphs to match user queries, and adding unique information, new images, and video content to update and improve existing content.

Overall, the extensive Google Search documentation leak has provided valuable insights into the inner workings of Google’s ranking system. While many questions remain unanswered, the leaked information has prompted SEO professionals to reevaluate their strategies and consider new approaches to optimize for search rankings.

Disclaimer: The opinions expressed in this article are those of the author and not necessarily Search Engine Land.
