URLs are at the very core of the world wide web, yet many webmasters and even some SEOs neglect them. While webpages can still perform well in search engines if the URLs aren't optimized, optimized URLs help users and search engines understand your site better and can give you a boost over competitors. Let's talk about best practices for URLs.
What Is A URL?
Before we talk about best practices, let's familiarize ourselves with what URLs actually are.
URL stands for Uniform Resource Locator, and it is most commonly known as a web address. The URL also includes the protocol for retrieving a web resource, such as HTTP, HTTPS, or FTP. An example of what a URL looks like is:

https://www.example.com/folder/file.html
Google doesn't impose a maximum URL length, but for a URL to display correctly in all browsers, it needs to be shorter than 2,083 characters.
URLs were designed to be addresses for web pages and files that were easy for humans to use and understand. The actual location of web pages and files is the IP address, which serves a purpose similar to a phone number. World Wide Web standards were developed to link web addresses with IP addresses and make the web easier for humans to use.
URLs consist of three parts: a protocol, a domain name, and a path. The syntax for URLs presents these parts like this:

protocol://domain-name/path
The protocol tells the browser how to open the resource, such as the http web standard, the SSL-protected https web standard, mailto to open the operating system's default mail client, or ftp to run file transfers between a computer and a server.
The domain name includes the name itself and the top-level domain. In example.com, "example" is the name and ".com" is the top-level domain. The domain name is the human-readable version of the location of a web resource, usually the website. The top-level domain is a sort of category for websites, such as .com, .edu, or .gov. The domain may also have a subdomain, which is usually treated as a sort of subsection of a website with altered branding. Subdomains are used like this:

subdomain.example.com
The path points to the location of a resource on the website. It may include a series of folders separated by slashes (/) and typically ends either with a folder (like /folder/) or a file with an extension (like /file.html).
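The three parts above map directly onto how software splits a URL apart. Here is a minimal sketch using Python's standard library (the example URL is illustrative):

```python
from urllib.parse import urlsplit

url = "https://www.example.com/folder/file.html"
parts = urlsplit(url)

print(parts.scheme)  # the protocol: "https"
print(parts.netloc)  # the domain, including the subdomain: "www.example.com"
print(parts.path)    # the path: "/folder/file.html"
```

The same split is what browsers and crawlers perform before deciding how to fetch and interpret a resource.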
Why URLs Impact SEO
Let's briefly touch on why URLs even matter for SEO before we get into the best practices. There are three primary ways URLs impact SEO.
1. User Experience
Google displays partial URLs in the search results, below the page title. A short, easy-to-understand URL helps the user see what the page is about even if the title tag is too long and gets truncated. On the other hand, if the URL itself is too long, it will be truncated as well.
2. Context For Search Engines
While pages without optimized URLs can still rank well in Google, the URL provides more context that search engines can use to determine what the page is about and how your site is structurally organized.
3. Links And Sharing
Google uses links to sites to evaluate what the page is about and how authoritative it is. Sometimes a link is simply a copied and pasted URL. If the URL contains relevant information about what's on the page, these kinds of links will carry that relevant information too. A user-friendly URL is also more likely to be shared than a long and clunky one. Similarly, easy-to-interpret URLs are more likely to get clicked on.
Best Practices From Our List Of Google Ranking Factors
These best practices are informed by what we know about how Google ranks web pages.
We maintain a list of 273 Google ranking factors (factual and mythical ones) that have been fact checked with patent filings, statements from Google, and studies that employ the scientific method. From that list, here are the factors that apply to URLs.
- Keyword In URL: Use the keyword in the URL. This helps Google understand how relevant a piece of content is for queries related to that keyword. This is backed up by a patent and a statement from Matt Cutts.
- Keyword Early In URL: Matt Cutts has confirmed that "after about five words" the weight of a keyword in the URL starts to fade, so be sure to use the keyword earlier in the URL if possible.
- Keyword In Domain: A patent confirms that using the keyword in the domain name provides a ranking boost. As a best practice, you should only include a keyword in your domain name if it makes sense for your branding.
- Separate Words With A Hyphen: Matt Cutts has confirmed that the best way to separate words in your URL is with a hyphen (-). Underscores can also work but might lead to words being interpreted as variables. If no separators are used, it might be difficult for Google to parse the words.
- Consistent WWW Use: There is no SEO value associated with using either WWW or non-WWW URLs. However, only one should be in use, and the one that isn't in use should redirect to the one that is in use. If both forms are in use (i.e. https://www.example.com and https://example.com), then this can lead to duplicate content issues. Google has confirmed that duplicate content is a problem.
- Avoid URL Tags: URL tags create duplicate content in the same way that inconsistent WWW usage does. When a tag doesn't change the content of a page, it creates two duplicate versions of a page at different URLs. While tags can be used to store information for the server, they shouldn't be used in any canonical pages or any pages that are linked to on your site.
- Avoid Keyword Repetition: There don't seem to be any official statements from Google saying to avoid repeating a keyword in the URL, but this is a best practice because it doesn't offer any additional value, and it can certainly be interpreted as an attempt to spam the search engine.
- Keep URLs Short: While plenty of very long URLs are indexed in Google and perform well, it is more difficult for Google to parse which words in a URL are important the longer the URL is. As we said previously, Matt Cutts has confirmed that value dwindles after around five words.
- No GSC URL Parameters: Google Search Console URL parameters obviously have their uses, but you should never use them in any URL that you want to be indexed. It's been confirmed that these are used to exclude pages from Google's index.
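Several of the factors above (keyword in the URL, hyphen separators, short URLs) come together when you generate a URL slug from a page title. Here is a minimal sketch in Python; the slugify helper is a common convention, not a Google-defined algorithm:

```python
import re

def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL slug.
    (Illustrative helper — adjust for your CMS and language needs.)"""
    slug = title.lower()
    # Collapse any run of non-alphanumeric characters into a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Drop leading/trailing hyphens left over from punctuation
    return slug.strip("-")

print(slugify("10 Best Practices for URLs!"))  # -> "10-best-practices-for-urls"
```

Note that the output uses hyphens rather than underscores, keeping each word visible to search engines as a separate token.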
URL Best Practices Stated By Google
Here are some recommendations Google directly stated for webmasters.
Keep A Simple URL Structure
This recommendation is listed in Google's general webmaster guidelines.
They elaborate by saying to:
- Construct URLs logically in a way that is intelligible to humans
- Use readable words rather than numbers or character strings if possible
- Use hyphens between words, as we mentioned above
They also caution against complex URL structures that create an unnecessarily high number of URLs, which can make it hard for Google to crawl the site. For this reason, they caution against leaving URLs findable and indexable when they do these things:
- Additive filtering of sets of items
- Dynamic document generation
- Tracking URL parameters like session IDs
- Sorting and referral parameters
- Infinite calendar pages (such as endlessly linked "next month" pages)
- Broken relative links with repeated path elements that can create infinite URL spaces
To avoid these issues, they recommend:
- Blocking problematic URLs with robots.txt
- Trimming all unnecessary URL parameters
- Adding nofollow to infinite calendar links
- Using a crawler to check for broken relative links
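As a sketch of the first recommendation, a robots.txt file along these lines tells crawlers to skip parameterized and calendar URLs (the parameter names and paths here are just examples):

```
User-agent: *
# Google supports * wildcards in robots.txt patterns
Disallow: /*?sessionid=
Disallow: /*?sort=
Disallow: /calendar/
```

Remember that robots.txt blocks crawling, not indexing, so it is a complement to clean URL design rather than a substitute for it.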
Additional Generally Accepted SEO Industry Best Practices For URLs
The advice in this section isn't, to our knowledge, directly tested or endorsed by statements from Google, but it reflects widely accepted standards based on extrapolation from things we do know about Google, users, and computer systems.
Treat Each Successive Path Element As More Specific
Let's look at this as an example:

https://domain.com/folder1/folder2/file.html

We would expect the most general keywords to be located in the domain; folder1 would be more specific, folder2 even more specific, and file as laser-focused on the topic of the page as feasible in a short space.
Keep in mind that we don't want too many folders, and we want folder names to be simple. Remember that Google has said that beyond five words or so, we don't expect keywords to have much impact. Generally, this should mean about one word for the domain, one word for a single folder, and the most important page keywords within the first three words after the folder.
Use Subdomains Appropriately
According to Matt Cutts, subdomains are roughly equivalent to subfolders when it comes to how they are evaluated for ranking. However, HubPages was able to improve performance after the Panda update by switching to subdomains, so they aren't entirely identical.
As a best practice, subfolders are for subsections of the same site, while subdomains are generally used to host a section of the site with slightly different branding, or that runs on a different platform.
Subdomains might be used for child brands of a parent brand, for member-exclusive areas of a site, for user-generated content, and in general for anything that you want a little bit more walled off from the main site, while still retaining the parent domain, for whatever reason.
Use URL Anchors For Long Pages
A URL anchor is a URL with a fragment, marked by a hash (#), that links to a specific part of a webpage. URL anchors are used to navigate large webpages. The "Contents" you see listed in a Wikipedia article are links made using URL anchors.
This is what URL anchors look like in URL form:

https://example.com/page#anchor

Here, https://example.com/page is the URL for the webpage, and #anchor is the URL anchor.
For long documents, it's best practice to use URL anchors to point to different sections of the page. Not only does including links to URL anchors make the page easier for users to navigate, URL anchors may also help search engines understand the semantic structure of the document better.
In order for a URL anchor to point to a specific part of a document, code like this needs to be placed at the target location within the HTML of the webpage:

<a id="anchor"></a>

Here, anchor is the URL anchor. (Any HTML element with a matching id attribute will work as the target.)
The hyperlink that points to the URL anchor looks like this:
<a href="#anchor">Click here for anchor</a>
Here, anchor is the URL anchor and Click here for anchor is the clickable text a user will see in the hyperlink.
Use Lowercase Letters
Keeping URLs to lowercase letters only is a best practice to avoid the possibility of creating duplicate content.
Many server setups serve the same page for capitalized and lowercase variations of a URL, but search engines may interpret them as separate URLs, because there are exceptions: on many servers, paths are case-sensitive.
Make sure that your server settings are such that when uppercase letters are used in a URL, the user is redirected to the lowercase version of the URL. This doesn't just mean the server renders the same page; it means the server must take the user to the same URL every time.
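A minimal sketch of that redirect rule, written as pseudologic in Python (real sites usually implement this in server configuration, such as a rewrite rule):

```python
def lowercase_redirect(path):
    """Return a (status, location) pair when the requested path contains
    uppercase letters; None means serve the page as-is.
    (Illustrative logic only — not a specific server's API.)"""
    if path != path.lower():
        # 301 = permanent redirect to the canonical lowercase URL
        return (301, path.lower())
    return None

print(lowercase_redirect("/Blog/URL-Best-Practices"))  # -> (301, '/blog/url-best-practices')
print(lowercase_redirect("/blog/url-best-practices"))  # -> None
```

The key point is the explicit redirect: the visitor (and the crawler) always ends up at one URL, so no duplicate variant gets indexed.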
Canonicalize Your URLs
Canonicalization is a way to tell search engines which URL should be used for a webpage. Since anchors and tags can create multiple URLs that point to the same document, it's best practice to include canonical URLs in every webpage.
Bear in mind that Google is not perfect and does not always listen to canonical tags, since it does not always trust the webmaster to know what they are doing. Canonicalization should not be thought of as a cure-all.
Even when Google obeys the canonical tag, if you link to duplicate URLs you are still forcing Google to crawl the page and see the canonical tag before it understands that it is a duplicate. This makes it more difficult for Google to crawl your site and as a result can hurt your SEO performance.
With all of those caveats in mind, this is what a canonical tag looks like in the HTML of a document:
<link rel="canonical" href="https://example.com/canonical-page" />
This tag should be located within the <head> of the document.
This tells the search engines that the correct URL for the document being crawled is https://example.com/canonical-page. If the document is being crawled at a different URL, such as the tagged URL https://example.com/canonical-page?variable=123, then ideally Google will understand that it should use the other URL and attribute all SEO authority to the canonical location.
The canonical tag should be present in the document wherever the document renders, whether it is at the canonical URL or not.
Do Not Link To Duplicate URLs
We've said this at several other points in this post, but it deserves its own section. In general you simply should not include duplicate URLs anywhere they are legible to search engines (with the exception of URL anchors). Even if you are redirecting the URLs and especially if you are only canonicalizing them, it's best if the search engines only ever see the canonical URL for a document.
Duplicates can be accidentally created through automation, relative links, improper capitalization, and tagging, so it's important to regularly crawl your site and check for duplicate URLs.
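As an illustration of what such a check might look like, here is a simplified sketch of how a crawler could normalize URLs and group accidental duplicates. The normalization rules shown (lowercasing, dropping "www.", stripping query tags and trailing slashes) are assumptions for the example, not a standard:

```python
from urllib.parse import urlsplit, urlunsplit
from collections import defaultdict

def normalize(url):
    """Collapse common accidental variants (case, www, tags) to one key."""
    s = urlsplit(url.lower())
    # Treat www and non-www hosts as the same site
    host = s.netloc[4:] if s.netloc.startswith("www.") else s.netloc
    # Drop the query string and any trailing slash
    return urlunsplit((s.scheme, host, s.path.rstrip("/"), "", ""))

crawled = [
    "https://example.com/page",
    "https://www.example.com/Page",
    "https://example.com/page?utm_source=x",
]
groups = defaultdict(list)
for u in crawled:
    groups[normalize(u)].append(u)

# Any key with more than one URL is a duplicate cluster to investigate
duplicates = {k: v for k, v in groups.items() if len(v) > 1}
print(duplicates)
```

Here all three crawled URLs collapse to a single normalized key, flagging them as one page reachable at three addresses — exactly the situation the best practices above are meant to prevent.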
URLs are a crucial part of the fabric of the web. They were invented to make the location of web pages easy for humans to understand. Follow the best practices above and keep your URLs simple and intuitive for the best user experience and SEO benefit.