Many SEO pros have learned from experience. They’ve been in the industry long enough to have picked up some important tactics. Many read trade blogs about the latest happenings and trends. Almost daily, I hear people refer to what Google does as “mysterious”. They talk about industry tools like Semrush and Moz as bright torches in the seedy darkness that is search science. They think of themselves as witches and wizards, reading the palms and tea leaves of each website’s potential.
They’re not. There’s nothing mystical about it. Google publishes their Search Quality Rater Guidelines (the SQR guidelines) for the world to read. Hop over to Google and search for “Google Search Quality Rater Guidelines” and you’ll find the full document. That’s no mistake. Google knows how to hide something from Google, if they want it hidden.
The truth is, Google wants you to understand the criteria by which they evaluate webpages. They want fresh, quality content. They also want websites with fresh content to make it easy for them to find and digest that content. And so they’re transparent about how they rate your work. Overwhelmingly so. At the time of publishing this, the SQR guidelines are 176 pages long.
This rings true to my experience. Guidelines I helped write for evaluators back in 2013 were upwards of 100 pages. Though that was a separate organization and very different tasks, the concept remains the same.
I’m going to share two excerpts from the SQR guidelines introduction. They paint a clear picture as to the depth and variety of content Google tackles, along with the seriousness with which they take their work:
“A search may be done when someone is bored and looking for entertainment, such as a search for [funny videos]. A search may be a single question asked during a critical moment of a person’s life, such as [what are the symptoms of a heart attack?].”
“Different types of searches need very different types of search results. Medical search results should be high quality, authoritative, and trustworthy. Search results for [cute baby animal pictures] should be adorable. Search results for a specific website or webpage should have that desired result at the top. Searches that have many possible meanings or involve many perspectives need a diverse set of results that reflect the natural diversity of meanings and points of view.”
Think about that for a moment. How easy would it be for a search engine to come down on one side of a political argument by letting bias creep into its results? How easy would it be for one bad piece of medical advice to surface and threaten the life of someone having an emergency?
Bing, Microsoft’s attempt to compete with Google, recently re-launched with a new AI-driven tool. Early reviews were not positive. One searcher took to social media to share that he’d asked if it was safe to boil a baby. Bing had confirmed it was indeed safe. It’s a great example of how context can be missed by an AI too young to grasp human nuance.
And so Google continues to experiment with AI, but focuses on human ratings to deliver quality results.
Page Quality (PQ) rating
Most content on the internet exists in two buckets: content which is helpful and content which is entertaining. Outside of those two buckets is an abyss littered with the third type: content which is malicious or misguided.
Think of the bucket full of helpful content as the place every website wants to be. This is the bucket Google is most likely to share with the world. If your content dives deep into your expertise, this is where your webpages will end up. Answer real questions, and your answers are likely to be shared.
Focusing only on helpful content ignores a large part of internet traffic, though. When a user searches for cute animal videos, those search results aren’t going to come out of the “helpful” bucket. This content can be good, but it’s not helping anything but your sanity. It’s important to differentiate between helpful and entertaining content, too. A news story on CNN’s website is very different from a story on The Onion’s website.
The Onion is a news parody site famous for its prank headlines. Two classics:
“CIA Realizes It’s Been Using Black Highlighters All These Years”
“World Death Rate Holding Steady At 100 Percent”
A joke CIA headline may be helpful if you’re looking for a laugh. But it’s not helpful if your search query was, “what’s it called when the CIA covers text in black?”
The answer is ‘redacted’.
If content can’t fit into those “helpful” or “entertaining” buckets, it receives a low quality score. Almost by default, and even if the content isn’t malicious.
To a search engine, what’s the difference between malicious content and misguided content? Both may have misleading titles, could be filled to the brim with ads, and likely sidestep best practices. That could be intended (malicious) or it could be accidental (misguided). It doesn’t matter. It’s still low quality.
What does this mean about a page on your website selling your widgets? That depends. Is the page answering real questions and helping real people? Or is it a pushy sales page? Which one of those would be interpreted as ‘malicious’ under Google’s Page Quality guidelines?
We can get into the nuances of how to create a product page which Google will consider quality later in the book. What’s important now is that you can understand why and how search engines look at page quality.
Google even shares a list of page purposes they deem helpful or beneficial:
- Pages sharing information about a topic
- Pages sharing a personal experience, perspective, or feelings on a topic
- Pages sharing pictures, videos, or other forms of media
- Pages demonstrating a personal talent or skill
- Pages expressing an opinion or point of view
- Pages which seek to entertain
- Pages offering products or services (better be informational, buster!)
- Pages which allow users to ask questions and other users to answer them
- Pages which allow users to share files or download software
Your Money or Your Life topics (YMYL)
A word to the wise on some of the topics search engines will scrutinize more than others. Google has extra protections around Your Money or Your Life (YMYL) topics. Put plainly, that’s any content which could cause physical or financial harm to users or society. If a family is looking for a storm evacuation route, Google wants to be extra cautious to deliver the best possible results. Same for side effects of a heart medication, what to do when you get bitten by a snake, and the location of the nearest emergency room. They also want to make sure they don’t push stereotypes, spread bad election info, or teach extremists how to make bombs at home.
Google provides two guiding questions for deciding whether content fits YMYL criteria:
- Would a careful person seek out experts or highly trusted sources to prevent harm? Could even minor inaccuracies cause harm? If yes, then the topic is likely YMYL.
- Is the specific topic one that most people would be content with only casually consulting their friends about? If yes, the topic is likely not YMYL.
Sometimes YMYL topics are unavoidable. If you’re a financial advisor, you’re likely to talk often about decisions people should be making with money. If you’re a doctor, you’re likely to talk about how to handle medical situations. Tackling YMYL content isn’t taboo. Doing it poorly is. Google is clear in their guidelines: “For pages about clear YMYL topics, we have very high Page Quality rating standards.”
They do this because minced words on tough topics could very easily have a bad impact on users or society.
Effort and originality
Google has four criteria on which they judge the quality of a page’s content. Unless it’s on a YMYL topic, a page is high quality if it demonstrates effort, originality, and talent or skill. YMYL topics have an added need for accuracy.
The ‘effort’ section of Google’s guidelines is worded very deliberately. They ask page raters to, “Consider the extent to which a human being actively worked to create satisfying content.”
Beyond the buttery use of the word “satisfying”, the important phrase here is “human being”. They emphasize this for two reasons:
First, in scenarios where content is being translated, it stresses the accuracy of human translation. Take two websites dedicated to translating Japanese comic books into English: the one using Google Translate (machine translation) is likely to be less accurate than the one translating by hand.
Here’s something to illustrate the point. I took the sentence, “I’m going to use Google to translate this sentence into Japanese, then back to English,” and ran it through Google Translate both ways. The result: “I’ll get to the point here. Use Google to translate this sentence into Japanese and then back into English.”
We lost some nuance there, right?
The second reason Google is careful to emphasize the “human” part of content creation piggybacks on that nuance. If machines aren’t reliable for translation of language, they’ll be even less reliable for creation of copy.
Remember our earlier chapter digging into how it took time for search engines to have enough data to become reliable? AI bots like ChatGPT and Google’s Bard need data, too. But their purpose is different from a search engine’s. In a search, you’re asking a tool to pull from a giant library of known information to suggest content. With an AI chatbot, you’re asking the bot to find information and digest it for you. There’s (1) no guarantee the information coming back to you is emphasized or prioritized correctly and (2) no guarantee the information is interpreted in a factual way.
Interpreted, emphasized, and prioritized. Calling back to “is it safe to boil a baby?” It’s very likely a person boiling a baby would be safe while they commit the atrocity. It’s very unlikely a baby being boiled is safe. To a chatbot, the “baby” in the sentence is a noun, interchangeable with any other noun. Am I going to get hurt if I boil a potato? No. A box of pasta? Not if you take it out of the box. A lobster? Not for the lobster.
Can you rely on the AI to know a human may be dumb enough to put the pasta box into the boiling water? Or that pasta should be removed from the box in general? Or that a vegan (and the lobster) don’t want the lobster to be boiled? Apparently not, because it’s safe to boil a baby.
Suddenly, when chatbots are concerned, much more content becomes a YMYL item.
Even without a chatbot-reactive update to their guidelines, Google has already covered this in their “effort” requirements. They write:
“…automatic creation of thousands of pages by running existing freely available content through existing translation software without any oversight, manual curation, etc., would not be considered to have human effort.”
Does that throw up any red flags for anyone?
The industry’s immediate pull toward an easy hack reminds me of some ghosts of SEO past. We’ll dive deeper into this in the pages of this book, but at a high level, many algorithm updates at Google have bankrupted businesses overnight and caused many more close calls with financial ruin. In 2011, Google’s Panda update cracked down on thin, low-quality content, and content farms saw their traffic cut to almost nothing. In 2012, the Penguin update took on webspam: huge link farm networks, keyword stuffing, and tricks like white text hidden on a white background. More sites and businesses built on those hacks collapsed. The 2013 Hummingbird update rewrote how Google interprets queries, rewarding natural language over exact keyword matching. Those who didn’t learn from Panda and Penguin sure learned this time.
The Mobile and RankBrain updates in 2015, Medic in 2018, BERT in 2019… Will we ever learn that quality content will always be safe from changes, but hacks never will be?
Do you find yourself asking how Google will be able to differentiate AI content from something you wrote? A college student wrote code that can catch it. You don’t think Google can?
Remember that ChatGPT only creates existing content in aggregate. It can write a story, but it can’t make up new nuances and subtleties. It’s just a subtle regurgitation of something already made somewhere, blended with something else. It can tell you about an electric car, but it can’t create a new type of electric motor. You can ask it to write a short story in the style of Edgar Allan Poe because it has Poe’s catalog and countless examples of what a “story” should contain. It could even be a great story if you’re not privy to the types of twists you’d expect from Poe. A Poe expert would be able to smell something fishy, though. And a literary laureate would be able to spot the absence of the kind of creativity that pushes the bounds of storytelling. Because the machine cannot innovate. It can only work within the boundaries of the content it’s been fed.
Your AI content won’t out-perform that of a real writer. And someday, Google is going to ding you for it.
After Google evaluates content for the level of effort put in, they turn to originality. Is there anywhere else on the internet a user could get similar content? Is your content partially or wholly borrowed?
To be clear, borrowing isn’t actually bad. What search engines check is whether you provide clarity around where you sourced the content. That happens in two ways. First, being clear with visitors that the content isn’t original. That’s why successful websites will syndicate content with a “First published on [linked website]” note at the start of a copied post. It’s also why a Works Cited or Sources section can be very powerful on content where you quote other websites. The second item search engines look for is a small code snippet called a “canonical”. We’ll dig deeper into these (and how easy they are to set up) in a future chapter. For now, think of it as a way to borrow an entire page’s content and tell search engines you yield ‘ownership’ back to the original site.
Also to clarify: canonicals will get you out of the doghouse with Google. They won’t settle a copyright dispute. If you’re borrowing content, make sure you have permission first.
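To make the canonical concrete before we get to that chapter: it’s a single line of HTML placed in the head section of the borrowing page. Here’s a minimal sketch for a syndicated post, with a hypothetical URL standing in for wherever the original article actually lives:

```html
<!-- Goes in the <head> of the page that borrowed the content. -->
<!-- The href below is a hypothetical example; it should point at the original post. -->
<link rel="canonical" href="https://www.original-site.com/first-published-post/" />
```

With that one line in place, search engines treat the original URL as the owner of the content, so the syndicated copy doesn’t compete with (or get penalized against) the source.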
Back to Google’s search quality guidelines. The actual guidance Google shares with their raters is:
“Consider the extent to which the content offers unique, original content that is not available on other websites. If other websites have similar content, consider whether the page is the original source.”
If your site is offering unique content, a new view on existing content, or tackling a topic nobody else has, you’re golden. This is why entrepreneur gurus always give the guidance to “niche down”. Don’t start a blog about running. Start a blog about trail running at altitude. If you sell widgets, don’t talk about how your widget solves the same problems as your competitor’s. Talk about how your widget is different from other products. Lean into it.
Competition: We’re the widget for enterprises.
You: We’re the widget for small businesses. Especially entrepreneurs and startups.
Competition: We make the fastest widgets.
You: We make the most reliable widgets. What else could you do with the money you save on replacements?
Competition: We make the biggest widgets.
You: Our widgets are portable. We built them with digital nomads in mind.
This is scalable to anything. Especially content creation. I’m an aerospace nerd. Anything aviation and space, I suck it up like a sponge. I have a couple of YouTube channels I subscribe to. On the surface, you could say they’re all aerospace channels. But they can all thrive in parallel because they’ve niched down. Here are two examples:
Tim Dodd, Everyday Astronaut – Helps average people understand how space travel works. Has the tagline, “Bringing space down to Earth for everyday people.”
Scott Manley – Nerds out on the physics behind the latest headlines in rockets, airplanes, and space travel. Has the tagline “Fly safe,” from back in his days live streaming while he played space games.
If they both only listed off the most recent rocket launches in each video, viewers would have no reason to follow both of them. Because Scott goes hardcore nerd and Tim works to make difficult subjects easy to understand, they coexist and continue to grow.