The Basics of Search Engine Friendly Design & Development
Search engines are limited in how they crawl the web and interpret content. A webpage doesn't always look the same to you and me as it looks to a search engine. In this section, we'll focus on specific technical aspects of building (or modifying) web pages so they are structured for both search engines and human visitors alike. Share this part of the guide with your programmers, information architects, and designers, so that all parties involved in a site's construction are on the same page.
To perform better in search engine listings, your most important content should be in HTML text format. Images, Flash files, Java applets, and other non-text content are often ignored or devalued by search engine crawlers, despite advances in crawling technology. The easiest way to ensure that the words and phrases you display to your visitors are visible to search engines is to place them in the HTML text on the page. However, more advanced methods are available for those who demand greater formatting or visual display styles:
1. Provide "alt" text for images. Assign images in gif, jpg, or png format "alt" attributes in HTML to give search engines a text description of the visual content.
2. Supplement search boxes with navigation and crawlable links.
3. Supplement Flash or Java plug-ins with text on the page.
4. Provide a transcript for video and audio content if the words and phrases used are meant to be indexed by the engines.
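As a rough way to spot images that lack a text description, the sketch below scans markup for img elements with a missing or empty alt attribute. It uses Python's standard html.parser module. The class name ImageAltAuditor and the sample markup are made up for illustration, and an empty alt can be a deliberate choice for purely decorative images, so treat the result as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class ImageAltAuditor(HTMLParser):
    """Collects the src of every <img> that lacks a non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # A valueless attribute is reported as None, so fall back to "".
        alt = attributes.get("alt") or ""
        if not alt.strip():
            self.missing_alt.append(attributes.get("src", "(no src)"))


# Hypothetical markup used only for illustration.
SAMPLE_HTML = """
<img src="panda-juggling.jpg" alt="A panda juggling three bamboo stalks">
<img src="logo.png">
<img src="spacer.gif" alt="">
"""

auditor = ImageAltAuditor()
auditor.feed(SAMPLE_HTML)
print(auditor.missing_alt)  # ['logo.png', 'spacer.gif']
```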
Seeing your site as the search engines do
Many websites have significant problems with indexable content, so double-checking is worthwhile. By using tools like Google's cache, SEO-browser.com, and the MozBar, you can see which elements of your content are visible and indexable to the engines. Take a look at Google's text cache of this page you are reading now. See how different it looks?
"I have a problem with getting found. I built a huge Flash site for juggling pandas and I'm not showing up anywhere on Google. What's up?"
Using the Google cache feature, we can see that to a search engine, JugglingPandas.com's homepage doesn't contain all the rich information that we see. This makes it difficult for search engines to interpret relevancy.
Uh oh ... via Google cache, we can see that the page is a barren wasteland. There's not even text telling us that the page contains the Axe Battling Monkeys. The site is built entirely in Flash, but sadly, this means that search engines cannot index any of the text content, or even the links to the individual games. Without any HTML text, this page would have a very hard time ranking in search results.
It's wise to not only check for text content but to also use SEO tools to double-check that the pages you're building are visible to the engines. This applies to your images, and as we see below, to your links as well.
Just as search engines need to see content in order to list pages in their massive keyword-based indexes, they also need to see links in order to find the content in the first place. A crawlable link structure—one that lets the crawlers browse the pathways of a website—is vital to them finding all of the pages on a website. Hundreds of thousands of sites make the critical mistake of structuring their navigation in ways that search engines cannot access, hindering their ability to get pages listed in the search engines' indexes.
Below, we've illustrated how this problem can happen:
In the example above, Google's crawler has reached page A and sees links to pages B and E. However, even though C and D might be important pages on the site, the crawler has no way to reach them (or even know they exist). This is because no direct, crawlable links point to pages C and D. As far as Google can see, they don't exist! Great content, good keyword targeting, and smart marketing won't make any difference if the crawlers can't reach your pages in the first place.
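The reachability problem described here can be sketched as a graph traversal. The link map below is partly hypothetical: the text only says that A links to B and E and that nothing crawlable points to C or D, so the other entries are assumptions. The breadth-first search is a simplification of link discovery, not a description of any engine's actual crawler.

```python
from collections import deque

# Each page maps to the pages it links to with crawlable links.
# Only A -> B and A -> E come from the text; the rest is assumed.
LINKS = {
    "A": ["B", "E"],
    "B": ["A"],
    "C": ["D"],
    "D": [],
    "E": ["B"],
}


def reachable_pages(links, start):
    """Breadth-first traversal that follows links outward from the start page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


crawled = reachable_pages(LINKS, "A")
orphaned = sorted(set(LINKS) - crawled)
print(sorted(crawled))  # ['A', 'B', 'E']
print(orphaned)         # ['C', 'D']
```

Pages C and D link to each other in this sketch, yet they stay undiscovered because no page the crawler can reach points to either of them.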
This is the most basic format of a link, and it is eminently understandable to the search engines. The crawlers know that they should add this link to the engines' link graph of the web, use it to calculate query-independent variables (like Google's PageRank), and follow it to index the contents of the referenced page.
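For reference, a basic HTML link is an a element with an href attribute wrapped around anchor text. The sketch below assumes a made-up URL and a hypothetical LinkExtractor class built on Python's standard html.parser. It pulls out the two pieces a crawler would care about: the target address and the anchor text.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Records (href, anchor text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical link used only for illustration.
SAMPLE = '<p>Visit <a href="https://www.example.com/pandas">Juggling Pandas</a> today.</p>'

extractor = LinkExtractor()
extractor.feed(SAMPLE)
print(extractor.links)  # [('https://www.example.com/pandas', 'Juggling Pandas')]
```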
If you require users to complete an online form before accessing certain content, chances are search engines will never see those protected pages. Forms can include a password-protected login or a full-blown survey. In either case, search crawlers generally will not attempt to submit forms, so any content or links that would be accessible via a form are invisible to the engines.
If you use JavaScript for links, you may find that search engines either do not crawl or give very little weight to the links embedded within. Standard HTML links should replace JavaScript (or accompany it) on any page you'd like crawlers to crawl.
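One way to audit a page for this is to separate links that carry a plain URL from links that only work when a script runs. The sketch below is a simple heuristic under stated assumptions: the CrawlabilityChecker class, the openGame function, and the sample markup are all hypothetical, and real engines decide what to follow in ways this does not model.

```python
from html.parser import HTMLParser


class CrawlabilityChecker(HTMLParser):
    """Sorts <a> elements into plain-URL links and script-dependent ones."""

    def __init__(self):
        super().__init__()
        self.crawlable = []
        self.script_only = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        # No href, a bare "#", or a javascript: URL gives a crawler nothing to follow.
        if not href or href == "#" or href.lower().startswith("javascript:"):
            self.script_only += 1
        else:
            self.crawlable.append(href)


# Hypothetical navigation markup used only for illustration.
SAMPLE = """
<a href="/games/axe-battling-monkeys">Axe Battling Monkeys</a>
<a href="javascript:openGame('pandas')">Juggling Pandas</a>
<a href="#" onclick="openGame('bears')">Dancing Bears</a>
<a onclick="openGame('otters')">Singing Otters</a>
"""

checker = CrawlabilityChecker()
checker.feed(SAMPLE)
print(checker.crawlable)    # ['/games/axe-battling-monkeys']
print(checker.script_only)  # 3
```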
The Meta Robots tag and the robots.txt file both allow a site owner to restrict crawler access to a page. Just be warned that many a webmaster has unintentionally used these directives as an attempt to block access by rogue bots, only to discover that search engines cease their crawl.
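To see how a robots.txt rule aimed at a rogue bot can end up blocking everyone, you can test a draft file before publishing it. The sketch below uses Python's standard urllib.robotparser module. Both robots.txt drafts, the bot name BadBot, and the example.com URL are hypothetical, and Python's parser is only an approximation of how a given search engine reads the file.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Draft 1: meant to stop one rogue bot, but the last rule blocks every crawler.
TOO_BROAD = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /
"""

# Draft 2: blocks only the rogue bot; an empty Disallow allows everything.
TARGETED = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow:
"""

URL = "https://www.example.com/products"
print(allowed(TOO_BROAD, "Googlebot", URL))  # False
print(allowed(TARGETED, "Googlebot", URL))   # True
print(allowed(TARGETED, "BadBot", URL))      # False
```

Keep in mind that robots.txt is advisory: well-behaved crawlers honor it, while a truly rogue bot may simply ignore it.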
Technically, links in both frames and iframes are crawlable, but both present structural issues for the engines in terms of organization and following. Unless you're an advanced user with a good technical understanding of how search engines index and follow links in frames, it's best to stay away from them.
Although this relates directly to the above warning on forms, it's such a common problem that it bears mentioning. Some webmasters believe if they place a search box on their site, then engines will be able to find everything that visitors search for. Unfortunately, crawlers don't perform searches to find content, leaving millions of pages inaccessible and doomed to anonymity until a crawled page links to them.
The links embedded inside the Juggling Panda site (from our above example) are perfect illustrations of this phenomenon. Although dozens of pandas are listed and linked to on the page, no crawler can reach them through the site's link structure, rendering them invisible to the engines and hidden from users' search queries.
Search engines will only crawl so many links on a given page. This restriction is necessary to cut down on spam and conserve rankings. Pages with hundreds of links on them are at risk of not getting all of those links crawled and indexed.
If you avoid these pitfalls, you'll have clean, spiderable HTML links that will allow the spiders easy access to your content pages.
Nofollow, taken literally, instructs search engines to not follow a link (although some do). The nofollow tag came about as a method to help stop automated blog comment, guest book, and link injection spam, but has morphed over time into a way of telling the engines to discount any link value that would ordinarily be passed. Links tagged with nofollow are interpreted slightly differently by each of the engines, but it is clear they do not pass as much weight as normal links.
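In markup, nofollow is a value of a link's rel attribute, for example rel="nofollow". The sketch below separates followed from nofollowed links on a page. The NofollowSplitter class and the sample URLs are hypothetical, and the code only reads the attribute; it says nothing about how much weight any engine actually assigns.

```python
from html.parser import HTMLParser


class NofollowSplitter(HTMLParser):
    """Splits linked URLs by whether rel contains the nofollow token."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel is a space-separated, case-insensitive list of tokens.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Hypothetical markup used only for illustration.
SAMPLE = """
<a href="https://www.example.com/guide">Editorial link</a>
<a href="https://www.example.org/spam" rel="nofollow">Comment link</a>
<a href="https://www.example.net/ad" rel="noopener NOFOLLOW">Ad link</a>
"""

splitter = NofollowSplitter()
splitter.feed(SAMPLE)
print(splitter.followed)    # ['https://www.example.com/guide']
print(splitter.nofollowed)  # ['https://www.example.org/spam', 'https://www.example.net/ad']
```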
Although they don't pass as much value as their followed cousins, nofollowed links are a natural part of a diverse link profile. A website with lots of inbound links will accumulate many nofollowed links, and this isn't a bad thing. In fact, Moz's Ranking Factors showed that high-ranking sites tended to have a higher percentage of inbound nofollow links than lower-ranking sites.
Google states that in most cases, they don't follow nofollow links, nor do these links transfer PageRank or anchor text values. Essentially, using nofollow causes Google to drop the target links from their overall graph of the web. Nofollow links carry no weight and are interpreted as HTML text (as though the link did not exist). That said, many webmasters believe that even a nofollow link from a high authority site, such as Wikipedia, could be interpreted as a sign of trust.
Bing and Yahoo
Bing, which powers Yahoo search results, has also stated that they do not include nofollow links in the link graph, though their crawlers may still use nofollow links as a way to discover new pages. So while they may follow the links, they don't use them in rankings calculations.
Keywords are fundamental to the search process. They are the building blocks of language and of search. In fact, the entire science of information retrieval (including web-based search engines like Google) is based on keywords. As the engines crawl and index the contents of pages around the web, they keep track of those pages in keyword-based indexes rather than storing 25 billion web pages all in one database. Millions and millions of smaller databases, each centered on a particular keyword term or phrase, allow the engines to retrieve the data they need in a mere fraction of a second.
Obviously, if you want your page to have a chance of ranking in the search results for "dog," it's wise to make sure the word "dog" is part of the crawlable content of your document.
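The idea of a keyword-based index can be illustrated with a toy inverted index, which maps each word to the pages that contain it. This is a minimal sketch with made-up pages and a hypothetical build_index function. Real engines add far more (ranking, phrase handling, word variants), none of which is modeled here.

```python
from collections import defaultdict

# Hypothetical pages and their crawlable text.
PAGES = {
    "/dogs": "dog breeds and dog training tips",
    "/pandas": "juggling pandas and panda games",
    "/pets": "caring for a dog or a cat",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


index = build_index(PAGES)
# Looking up a keyword is a single dictionary access, not a scan of every page.
print(sorted(index.get("dog", set())))  # ['/dogs', '/pets']
```

A page that never uses the word cannot appear in that word's entry, which is why the keyword needs to be in the crawlable text.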
Keywords dominate how we communicate our search intent and interact with the engines. When we enter words to search for, the engine matches pages to retrieve based on the words we entered. The order of the words ("pandas juggling" vs. "juggling pandas"), spelling, punctuation, and capitalization provide additional information that the engines use to help retrieve the right pages and rank them.
Search engines measure how keywords are used on pages to help determine the relevance of a particular document to a query. One of the best ways to optimize a page's rankings is to ensure that the keywords you want to rank for are prominently used in titles, text, and metadata.
Generally speaking, as you make your keywords more specific, you narrow the competition for search results, and improve your chances of achieving a higher ranking. The map graphic to the left compares the relevance of the broad term "books" to the specific title Tale of Two Cities. Notice that while there are a lot of results for the broad term, there are considerably fewer results (and thus, less competition) for the specific result.
Since the dawn of online search, folks have abused keywords in a misguided effort to manipulate the engines. This involves "stuffing" keywords into text, URLs, meta tags, and links. Unfortunately, this tactic almost always does more harm than good for your site.
In the early days, search engines relied on keyword usage as a prime relevancy signal, regardless of how the keywords were actually used. Today, although search engines still can't read and comprehend text as well as a human, the use of machine learning has allowed them to get closer to this ideal.
The best practice is to use your keywords naturally and strategically (more on this below). If your page targets the keyword phrase "Eiffel Tower" then you might naturally include content about the Eiffel Tower itself, the history of the tower, or even recommended Paris hotels. On the other hand, if you simply sprinkle the words "Eiffel Tower" onto a page with irrelevant content, such as a page about dog breeding, then your efforts to rank for "Eiffel Tower" will be a long, uphill battle. The point of using keywords is not to rank highly for all keywords, but to rank highly for the keywords that people are searching for when they want what your site provides.
Keyword usage and targeting are still a part of the search engines' ranking algorithms, and we can apply some effective techniques for keyword usage to help create pages that are well-optimized. Here at Moz, we engage in a lot of testing and get to see a huge number of search results and shifts based on keyword usage tactics. When working with one of your own sites, this is the process we recommend. Use the keyword phrase:
And you should generally not use keywords in link anchor text pointing to other pages on your site; this is known as Keyword Cannibalization.
Keyword Density Myth
Keyword density is not a part of modern ranking algorithms, as demonstrated by Dr. Edel Garcia in The Keyword Density of Non-Sense (http://www.e-marketing-news.co.uk/Mar05/garcia.html):
If two documents, D1 and D2, consist of 1000 terms (l = 1000) and repeat a term 20 times (tf = 20), then a keyword density analyzer will tell you that for both documents the keyword density (KD) is KD = 20/1000 = 0.020 (or 2%) for that term. Identical values are obtained when tf = 10 and l = 500. Evidently, a keyword density analyzer does not establish which document is more relevant. A density analysis or keyword density ratio tells us nothing about:
1. The relative distance between keywords in documents (proximity)
2. Where in a document the terms occur (distribution)
3. The co-citation frequency between terms (co-occurrence)
4. The main theme, topic, and sub-topics (on-topic issues) of the documents
The Conclusion:
Keyword density is divorced from content, quality, semantics, and relevance. What should optimal page density look like then? You can read more information about On-Page Optimization in this post.
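The arithmetic in Dr. Garcia's example is easy to check. The sketch below computes KD = tf / l for the two cases he gives, then applies the same ratio to two made-up sentences. The function names and sample sentences are assumptions for illustration only.

```python
def keyword_density(term_frequency, document_length):
    """KD = tf / l, the ratio a keyword density analyzer reports."""
    return term_frequency / document_length


def density_in_text(text, term):
    """Count whole-word occurrences of term and divide by the word count."""
    words = text.lower().split()
    return keyword_density(words.count(term.lower()), len(words))


# The two cases from the quoted passage give the same value.
print(keyword_density(20, 1000))  # 0.02
print(keyword_density(10, 500))   # 0.02

# Two hypothetical sentences: one on-topic, one stuffed, same density.
on_topic = "pandas juggle bamboo while pandas eat"
stuffed = "pandas pandas buy cheap shoes online"
print(density_in_text(on_topic, "pandas") == density_in_text(stuffed, "pandas"))  # True
```

The ratio is identical in each pair, even though a reader would judge the documents very differently, which is the point of the quoted argument.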
Using keywords in the title tag means that search engines will bold those terms in the search results when a user has performed a query with those terms. This helps garner a greater visibility and a higher click-through rate.
The final important reason to create descriptive, keyword-laden title tags is for ranking at the search engines. In Moz's biannual survey of SEO industry leaders, 94% of participants said that the title tag was the most important place to use keywords to achieve high rankings.
The title element of a page is meant to be an accurate, concise description of a page's content. It is critical to both user experience and search engine optimization.
As title tags are such an important part of search engine optimization, the following best practices for title tag creation make for terrific low-hanging SEO fruit. The recommendations below cover the critical steps to optimize title tags for search engines and for usability.
Search engines display only the first 65-75 characters of a title tag in the search results (after that, the engines show an ellipsis – "..." – to indicate when a title tag has been cut off). This is also the general limit allowed by most social media sites, so sticking to this limit is generally wise. However, if you're targeting multiple keywords (or an especially long keyword phrase), and having them in the title tag is essential to ranking, it may be advisable to go longer.
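A quick length check can flag titles that are likely to be cut off. The sketch below assumes a limit of 65 characters (the low end of the range mentioned above) and a hypothetical preview_title function. Actual truncation varies by engine and device, so this is an approximation rather than a reproduction of any results page.

```python
TITLE_DISPLAY_LIMIT = 65  # assumption: the low end of the 65-75 character range


def preview_title(title, limit=TITLE_DISPLAY_LIMIT):
    """Approximate how a results page might cut off a long title."""
    if len(title) <= limit:
        return title
    return title[:limit].rstrip() + "..."


# Hypothetical titles used only for illustration.
titles = [
    "Juggling Pandas: Free Online Games | Example Arcade",
    "Free Online Games With Juggling Pandas, Axe Battling Monkeys, and Many More | Example Arcade",
]
for title in titles:
    print(len(title), preview_title(title))
```

In the second title the brand name falls past the cutoff, which is one reason to keep the most important words near the start.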
The closer to the start of the title tag your keywords are, the more helpful they'll be for ranking, and the more likely a user will be to click them in the search results.
At Moz, we love to end every title tag with a brand name mention, as these help to increase brand awareness, and create a higher click-through rate for people who like and are familiar with a brand. Sometimes it makes sense to place your brand at the beginning of the title tag, such as on your homepage. Since words at the beginning of the title tag carry more weight, be mindful of what you are trying to rank for.
Title tags should be descriptive and readable. The title tag is a new visitor's first interaction with your brand and should convey the most positive impression possible. Creating a compelling title tag will help grab attention on the search results page, and attract more visitors to your site. This underscores that SEO is about not only optimization and strategic keyword usage, but the entire user experience.
Meta Tags
Meta tags were originally intended as a proxy for information about a website's content. Several of the basic meta tags are listed below, along with a description of their use.
The Meta Robots tag can be used to control search engine crawler activity (for all of the major engines) on a per-page level. There are several ways to use Meta Robots to control how search engines treat a page:
The X-Robots-Tag HTTP header directive also accomplishes these same objectives. This technique works especially well for content within non-HTML files, like images.
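As an illustration, the same directives can be read from either place: a meta tag in the page's head or an X-Robots-Tag response header. The sketch below uses the widely supported noindex and nofollow directives as sample values. The MetaRobotsReader class, the parse_directives helper, and the sample page and headers are hypothetical.

```python
from html.parser import HTMLParser


def parse_directives(value):
    """Split a comma-separated robots value into lowercase directives."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class MetaRobotsReader(HTMLParser):
    """Collects directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.directives |= parse_directives(attributes.get("content") or "")


# Hypothetical page that asks engines not to index it or follow its links.
PAGE = '<head><meta name="robots" content="noindex, nofollow"></head>'
reader = MetaRobotsReader()
reader.feed(PAGE)
print(sorted(reader.directives))  # ['nofollow', 'noindex']

# For non-HTML files such as images, the directive travels in a response header.
HEADERS = {"X-Robots-Tag": "noindex"}
print(sorted(parse_directives(HEADERS["X-Robots-Tag"])))  # ['noindex']
```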
The meta description tag exists as a short description of a page's content. Search engines do not use the keywords or phrases in this tag for rankings, but meta descriptions are the primary source for the snippet of text displayed beneath a listing in the results.
The meta description tag serves the function of advertising copy, drawing readers to your site from the results. It is an extremely important part of search marketing. Crafting a readable, compelling description using important keywords (notice how Google bolds the searched keywords in the description) can draw a much higher click-through rate of searchers to your page.
Meta descriptions can be any length, but search engines generally will cut snippets longer than 160 characters, so it's generally wise to stay within this limit.
In the absence of meta descriptions, search engines will create the search snippet from other elements of the page. For pages that target multiple keywords and topics, this is a perfectly valid tactic.
Meta Keywords: The meta keywords tag had value at one time, but is no longer valuable or important to search engine optimization. For more on the history and a full account of why meta keywords has fallen into disuse, read Meta Keywords Tag 101 from SearchEngineLand.
Meta Refresh, Meta Revisit-after, Meta Content-type, and others: Although these tags can have uses for search engine optimization, they are less critical to the process, and so we'll leave it to Google's Search Console Help to discuss in greater detail.
Chuck Reynolds
Contributor