When you build a web application, or virtually any website that is somewhat larger or meant to reach a wide audience, localization becomes a topic. “Localization” basically means offering the user different language versions of the website. There are several ways to do this.
The (Very) Easy Way
Many companies take the easy way out by simply storing the locale in the session and retrieving it when needed. This approach is definitely easy and can be implemented in very little time. In general, storing the locale in the session is the best option if you do not have to worry about search engines, for example in a restricted area that only users (and thus no search engines) can access. For all other (public) parts of a website, a different approach is recommended.
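To make the session idea concrete, here is a minimal sketch in Python. It assumes the session behaves like a plain dictionary, as it does in many web frameworks. The names SUPPORTED_LOCALES, DEFAULT_LOCALE, set_locale and get_locale are made up for this example and do not come from any particular library.

```python
# Hypothetical configuration: the languages the site offers.
SUPPORTED_LOCALES = ("en", "de")
DEFAULT_LOCALE = "en"


def set_locale(session: dict, locale: str) -> None:
    """Remember the user's language choice in the session."""
    if locale not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {locale!r}")
    session["locale"] = locale


def get_locale(session: dict) -> str:
    """Return the stored locale, or the default if none was chosen yet."""
    return session.get("locale", DEFAULT_LOCALE)
```

Note that the URL never changes here, whichever language is shown.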
Localization With Subdomains
A subdomain is the part in front of the domain name. For example, if your domain is http://testcompany.com, a subdomain could be http://login.testcompany.com. You can use subdomains to localize your content, e.g. with http://en.testcompany.com for the English version and http://de.testcompany.com for the German version.
The difference from the session approach is that the locale is visible in the URL, which makes it easy for search engines to index your site. This way, every version of your website gets indexed, as opposed to only one with the session method, where all languages share the same URL (I will explain later how to tell Google that there is an alternate version).
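On the server side, the locale then has to be read from the host name. The following is a rough sketch, assuming the locale is always the first label of the host and that anything else (such as login.testcompany.com) falls back to a default. The function name and the two constants are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical configuration: the languages the site offers.
SUPPORTED_LOCALES = ("en", "de")
DEFAULT_LOCALE = "en"


def locale_from_subdomain(url: str) -> str:
    """Return the locale encoded in the first host label, else the default."""
    host = urlsplit(url).hostname or ""
    first_label = host.split(".")[0]
    if first_label in SUPPORTED_LOCALES:
        return first_label
    return DEFAULT_LOCALE
```

For http://de.testcompany.com/url-path this gives "de", while http://login.testcompany.com and the bare domain both give the default.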
Localization With a “Path”
If you can put the locale in front of the domain, you can also put it after the domain. In this variant, the locale becomes the first part of the URL path, like so: http://testcompany.com/de
This is neither better nor worse than the subdomain approach, though I personally prefer subdomains because they keep the path clean and you won’t have to worry about linking to files the right way.
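For comparison, here is a sketch of how the locale could be split off the path. It assumes the locale is always the first path segment and that URLs without a known locale are served in a default language. All names are made up for illustration.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Hypothetical configuration: the languages the site offers.
SUPPORTED_LOCALES = ("en", "de")
DEFAULT_LOCALE = "en"


def locale_from_path(url: str) -> Tuple[str, str]:
    """Return (locale, remaining path) for a URL like http://testcompany.com/de/page."""
    path = urlsplit(url).path
    # Split "/de/page" into the first segment ("de") and the rest ("page").
    segments = path.lstrip("/").split("/", 1)
    if segments[0] in SUPPORTED_LOCALES:
        rest = segments[1] if len(segments) > 1 else ""
        return segments[0], "/" + rest
    # No locale prefix: keep the path as it is and use the default language.
    return DEFAULT_LOCALE, path or "/"
```

The second return value is the path without its locale prefix, which is what the rest of the application would normally route on. This extra step is the bookkeeping that the subdomain approach avoids.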
For more information, Google offers a blog post on this issue.
How Do I Tell Google What Languages My Website Has?
When Google reaches your site, it crawls it and follows its links, but it cannot know on its own which page is a translation of which other page. This information can matter for presenting more relevant search results to users.
There is an HTML tag that lets you tell Google which URL is a direct translation of the current page. The tag looks like this:
<link rel="alternate" hreflang="de" href="http://de.testcompany.com/url-path"/>
When Google crawls http://en.testcompany.com/url-path (the English version) it will see this tag and save the German version along with it. That’s how easy it is.
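With more than two languages, writing these tags by hand gets tedious, so they can be generated. The sketch below assumes the subdomain scheme from above and produces one tag per language for a given path, including one for the page's own language. BASE_DOMAIN, SUPPORTED_LOCALES and alternate_links are hypothetical names chosen for this example.

```python
from html import escape
from typing import List

# Hypothetical configuration, matching the example URLs in this post.
BASE_DOMAIN = "testcompany.com"
SUPPORTED_LOCALES = ("en", "de")


def alternate_links(path: str) -> List[str]:
    """Build one <link rel="alternate"> tag per supported locale for a path."""
    tags = []
    for locale in SUPPORTED_LOCALES:
        href = f"http://{locale}.{BASE_DOMAIN}{path}"
        # Escape the URL so it is safe inside an HTML attribute.
        tags.append(
            f'<link rel="alternate" hreflang="{locale}" '
            f'href="{escape(href, quote=True)}"/>'
        )
    return tags
```

Calling alternate_links("/url-path") returns the tags for the English and German versions, ready to be placed in the head of each of those pages.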
If you provide sitemaps to Google, the approach it requires there is somewhat counterintuitive and redundant. Still, it is important to include every version in the sitemap.xml too. To read how to do that, I think Google is the best source: