Switching to HTTPS

Take these factors into account!

"Beginning in July 2018 with the release of Chrome 68, Chrome will mark all HTTP sites as “not secure”."

The above is taken from a Google Security blog post, "A secure web is here to stay" (Feb 8, 2018).

No site owner wants web browsers to say that their site is "not secure", so it's high time http sites make the transition to https. It would be nice if this simply meant installing a valid SSL/TLS certificate on your web server and redirecting all (insecure) port 80 traffic to (secure) port 443. It would be nice, but reality is rarely that simple.

In the real world, sites are not self-contained - web pages are a mishmash of locally and externally stored files. Some images stored on a Content Delivery Network (CDN) here, some JavaScript to track user activity there. Web pages, after all, are simply blocks of text with extra requests to download additional resources: images, videos, JavaScript etc. These resources can be hosted on any number of external servers (CDNs, ad services, tracking services) and any one of them might be served insecurely over http.

But http Sites Have Been Around for Ages

Yes, but security is finally getting the attention it deserves. Even if you don't care about security, Google does. Having an insecure site is fairly embarrassing when browsers are shaming it with loud "not secure" warnings. In the end, failing to transition to https will diminish your Search Engine Optimization (SEO) and, ultimately, your site traffic.

That takes me to this blog post's inspiration: making the switch to https, preserving your site's functionality and its SEO. Below I list some things to look out for and some things to do when transitioning to https. Some things are specific to Drupal (our specialty), but any site should look to implement the suggestions below.

A Potent Mix

One major thing to look for when you switch to https is mixed content warnings. These appear when a secure site points to insecure resources. Scripts, stylesheets and other active resources still being served over http will be blocked, potentially breaking your site. And these may not even be resources provided by your server!

Say your http site has happily been serving jQuery from http://code.jquery.com, and it powers all sorts of cool effects. The moment you switch to https, that jQuery call will throw a mixed content warning and be blocked by the browser, and those cool effects will cease to be.

Not all mixed content warnings will break your site - you may not even notice anything untoward at all. Now, whether or not Google dings your search rankings for serving mixed content is a different matter. (Apparently it doesn't, but you certainly don't improve your rankings with mixed content warnings.)
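To get a feel for what a mixed content hunt involves, here is a minimal Python sketch that scans a page's HTML for resources still requested over http. It is a rough heuristic, not a browser-grade check: the tag and attribute lists are my own assumptions, every link tag is treated as a resource (so an http canonical link would be flagged too), and the jQuery file name in the sample is hypothetical.

```python
from html.parser import HTMLParser

# Tags that make the browser download something (an assumed, incomplete list).
RESOURCE_TAGS = {"script", "img", "link", "iframe", "video", "audio",
                 "source", "embed", "object"}
# Attributes on those tags that hold the resource URL.
RESOURCE_ATTRS = {"src", "href", "poster", "data"}


class InsecureResourceFinder(HTMLParser):
    """Collects (tag, url) pairs for resources requested over plain http."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag not in RESOURCE_TAGS:
            return  # ordinary <a> links are navigation, not mixed content
        for name, value in attrs:
            if name in RESOURCE_ATTRS and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


def find_insecure_resources(html):
    """Return every insecure resource reference found in the HTML string."""
    finder = InsecureResourceFinder()
    finder.feed(html)
    return finder.insecure


page = """
<script src="http://code.jquery.com/jquery.min.js"></script>
<img src="https://example.com/logo.png">
<a href="http://example.com/about">About</a>
"""
print(find_insecure_resources(page))
# prints: [('script', 'http://code.jquery.com/jquery.min.js')]
```

Only the script is reported: the image is already secure and the anchor is a plain link, which does not trigger a mixed content warning.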

Avoid Mixed Content Warnings

You'll be relieved to hear that a great proportion of third-party scripts that started out years ago running over http likely have an https version working by now - you just have to use it. (In my jQuery example from above, you'd just need to use https instead of http.) Finding and cleaning up all the legacy http resources called by your site may take some time, but until then there is a one-stop Content Security Policy directive, set here via a meta tag, that tells the browser to do this automatically: upgrade-insecure-requests.

<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests" />

With that tag set, any non-secure resource will automatically have its request upgraded to run securely. If the secure version is not available, the resource will not be loaded. This is one change that gives a lot of bang for your buck.

(This directive should also be set at the server level, as a response header.)
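Sent from the server, the directive travels as an HTTP response header: Content-Security-Policy: upgrade-insecure-requests. Most sites would set this in their web server or CDN configuration. Purely as an illustration, and assuming a Python WSGI application (nothing in this post requires one), a small wrapper could add the header to every response. The names upgrade_insecure_requests, hello_app and fake_start_response are hypothetical.

```python
CSP_HEADER = ("Content-Security-Policy", "upgrade-insecure-requests")


def upgrade_insecure_requests(app):
    """Wrap a WSGI app so every response carries the CSP upgrade directive."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Append our header to whatever the wrapped app already set.
            return start_response(status, list(headers) + [CSP_HEADER], exc_info)

        return app(environ, patched_start_response)

    return wrapped


# A tiny stand-in application and a fake server callback to show the effect.
def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


body = upgrade_insecure_requests(hello_app)({}, fake_start_response)
print(captured["headers"])
```

The wrapped app's own headers are left untouched; the directive is simply added alongside them.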

Absolute links provide the protocol (http or https) and domain name in the link structure whereas relative links provide a path relative to your domain's root folder. Here is an absolute link, followed by its relative equivalent:

<a href="https://chromatichq.com/about">About</a> (Note the https://chromatichq.com in the link.)

<a href="/about">About</a>

If your non-secure site currently uses relative links, they will continue working on your secure site without any extra effort on your part.

However, the rub here is that it is better to output absolute links! Why is that? Absolute links can be repurposed and continue to work no matter where you find them in the wilds of the internet: RSS feeds, "Email this article to a friend" messages and so on.

To output absolute links, check out Drupal's Pathologic module, which creates an input filter that you attach to your text formats. Instead of an administrator having to scour through your content to update links manually, Pathologic can output all your content links as absolute, with the desired https protocol.
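To make the idea concrete, here is a small Python sketch of the transformation itself: resolving each link against the site's https root at output time. This is not how Pathologic is implemented (it is a Drupal module written in PHP); the regular expression only handles double-quoted href attributes, and the function name absolutize_links is hypothetical.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://chromatichq.com/"  # the site root, with the desired protocol

# Matches double-quoted href attributes only; a real filter would parse the HTML.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def absolutize_links(html, base_url=BASE_URL):
    """Rewrite each href so it carries the protocol and domain name."""

    def replace(match):
        # urljoin leaves links that are already absolute unchanged.
        return 'href="{}"'.format(urljoin(base_url, match.group(1)))

    return HREF_PATTERN.sub(replace, html)


print(absolutize_links('<a href="/about">About</a>'))
# prints: <a href="https://chromatichq.com/about">About</a>
```

Because the stored content keeps its relative links, nobody has to edit it by hand; only the rendered output changes.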

Use Canonical Tags

Ever since SEO became a thing, people have tried to game the system to improve their search rankings. Amongst the tricks was the practice of having many pieces of similar or identical content available from multiple places. Search engines are wise to this nowadays, but they also want to know which page on your site represents the definitive version of your content.

The canonical tag is used to help them figure it out. Below is the canonical URL for a blog post I wrote.

<link rel="canonical" href="https://chromatichq.com/blog/teaching-algorithms-first-graders" />

Note that you can reach this blog post by using http instead of https, or you could also add www as a subdomain. Chromatic's site properly redirects all those variations to the canonical URL (as should your site!), but that is a conscious decision by us. We could have our web server configured to serve my blog post from any number of variations of the above URL and then the search engines would be wondering which one is the right, or canonical, version. Use the canonical tag to tell them.

A note of caution! If you add the canonical tag but don't actually configure it to give the proper canonical URL for each of your pages, then your best-laid plans might kill your SEO. An SEO expert experimented by making his home page the canonical URL for all of his site's pages and within weeks, over half of his site's pages had been de-indexed by Google!
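A cheap way to guard against that mistake is to spot-check pages after deployment. The Python sketch below, which assumes you already have each page's HTML and know the canonical URL you expect for it, reports a missing, duplicated or wrong canonical tag. The names CanonicalFinder and check_canonical are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.canonicals.append(attributes.get("href"))


def check_canonical(expected_url, html):
    """Return a description of the problem, or None if the tag looks right."""
    finder = CanonicalFinder()
    finder.feed(html)
    if len(finder.canonicals) != 1:
        return "expected exactly one canonical tag, found %d" % len(finder.canonicals)
    if finder.canonicals[0] != expected_url:
        return "canonical is %r, expected %r" % (finder.canonicals[0], expected_url)
    return None


post_url = "https://chromatichq.com/blog/teaching-algorithms-first-graders"
good_page = '<link rel="canonical" href="%s" />' % post_url
# The mistake described above: every page claims the home page as canonical.
bad_page = '<link rel="canonical" href="https://chromatichq.com/" />'

print(check_canonical(post_url, good_page))
print(check_canonical(post_url, bad_page))
```

The first call returns None (no problem found), while the second reports that the canonical URL points at the home page instead of the post.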

Minimize the Hops

In a recent project to move to https, one goal (with apologies to IPA fans and pun-haters everywhere) was to "minimize the hops" required to reach a page's canonical version. In plain speak, we wanted any links on our site (internal or external) to link directly to the correct https version of a page, rather than link to an out-of-date http version and then depend on a 301 redirect to hop the user to the canonical version.
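Counting hops is easy to reason about once the redirects are written down. The Python sketch below assumes you have a mapping of each URL to its 301 target (from a crawler report, say) and follows it to find where a link really ends up and how many hops it takes to get there. The example.com URLs and the count_hops function are made up for illustration.

```python
def count_hops(url, redirects, max_hops=10):
    """Follow the redirect mapping from url; return (final_url, hops)."""
    hops = 0
    while url in redirects:
        if hops >= max_hops:
            raise ValueError("redirect loop or chain longer than max_hops")
        url = redirects[url]
        hops += 1
    return url, hops


# Hypothetical redirect rules: http -> https, then bare domain -> www.
redirects = {
    "http://example.com/about": "https://example.com/about",
    "https://example.com/about": "https://www.example.com/about",
}

print(count_hops("http://example.com/about", redirects))
# prints: ('https://www.example.com/about', 2)
print(count_hops("https://www.example.com/about", redirects))
# prints: ('https://www.example.com/about', 0)
```

Any link that reports more than zero hops is a candidate for updating so that it points straight at the final URL.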

Heavy Lifting

Here is an example of the lengths we went to in minimizing hops.

Scattered in the site's content were old, insecure links to sites like Facebook, Twitter and Instagram. These sites moved to https long ago and we wanted to point our users directly to the secure links. This is where some heavy manual lifting was required. We used Screaming Frog to generate reports of all the insecure external links found on our site and I catalogued all the Drupal fields and their corresponding database tables that could hold these links. Screaming Frog told us what to look for and my catalogue told us where to look. Working from a long list of external links (many more than just FB, Twitter and Instagram) and of database fields and tables, we ran search and replace functions using Drupal update hooks.
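Our actual search and replace ran inside Drupal update hooks, which are PHP. The Python sketch below only illustrates the string transformation at the heart of it, under the assumption that you keep a list of hosts you have confirmed are available over https. The host list, the sample link and the function name upgrade_known_links are illustrative.

```python
import re

# Hosts we have verified serve the same content over https (an assumed list).
HTTPS_READY_HOSTS = {"facebook.com", "twitter.com", "instagram.com"}

# Captures the host name that follows an insecure scheme.
LINK_PATTERN = re.compile(r"http://(?P<host>[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def upgrade_known_links(text, hosts=HTTPS_READY_HOSTS):
    """Switch http:// to https:// for links whose host is on the verified list."""

    def replace(match):
        host = match.group("host")
        bare = host.lower()
        if bare.startswith("www."):
            bare = bare[4:]  # treat www.facebook.com like facebook.com
        if bare in hosts:
            return "https://" + host
        return match.group(0)  # unknown host: leave the link alone

    return LINK_PATTERN.sub(replace, text)


field_value = '<a href="http://www.facebook.com/example">FB</a> http://example.org/page'
print(upgrade_known_links(field_value))
# prints: <a href="https://www.facebook.com/example">FB</a> http://example.org/page
```

Restricting the rewrite to verified hosts matters: blindly upgrading every external link would break the ones whose sites never made the move.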

We didn't limit our link searches to content. To be thorough, I grepped through the entire code base and found some instances of http links and resources that had been hard-coded by developers of yore. Those had to be checked, updated and committed too.

Update Your Site Map

There's nothing particularly onerous here, but once you make the switch to https, you will need to rebuild your site map to make sure it is using https versions of all your links.
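After the rebuild, it is worth confirming that no http URLs slipped through. The Python sketch below assumes a standard XML site map using the sitemaps.org namespace and lists any loc entries that are still insecure. The sample URLs and the function name insecure_sitemap_urls are hypothetical.

```python
import xml.etree.ElementTree as ET

# The namespace is an identifier, not a link; it stays http:// by definition.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def insecure_sitemap_urls(xml_text):
    """Return every <loc> URL in the site map that still uses http."""
    root = ET.fromstring(xml_text)
    insecure = []
    for loc in root.iter("{%s}loc" % SITEMAP_NS):
        url = (loc.text or "").strip()
        if url.startswith("http://"):
            insecure.append(url)
    return insecure


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>http://example.com/about</loc></url>"
    "</urlset>"
)
print(insecure_sitemap_urls(sample))
# prints: ['http://example.com/about']
```

An empty list means every entry in the site map already points at the https version.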

Use a Content Delivery Network

If you are in the process of making the switch to https anyway, strongly consider using a CDN to improve performance and security. A CDN sits between your server and the rest of the internet, and can provide a secure, SSL connection with your site while optimizing the speed with which you can serve your web visitors. It's like having your own personal Superman (fast and strong), using blazing speed to securely deliver resources from your web server to your site visitors. And who doesn't love Superman?

Monitor Your Site

Once you have transitioned your site to https, keep an eye on your analytics to check for any dips in rankings. It's possible you missed something and will have to make some adjustments as you go.

Pro Tip: Use Google Lighthouse

The Chrome DevTools set of, ummm, tools has an Audits panel (a.k.a. Google Lighthouse) from which you can perform any or all of the following audits:

  • Performance
  • Progressive Web App
  • Best Practices
  • Accessibility
  • SEO

You run the audit on any of your site's pages and an easy-to-digest report with handy, actionable items is generated in less than a minute. For the purposes of this blog post, the "Best Practices" and "SEO" reports will hold the most interest. Chances are they will uncover some areas for improvement that you haven't yet considered.

So there you have it. Provided you pay attention to the steps outlined here, your switch to https shouldn't break your site and it shouldn't hurt your SEO. Three cheers for a secure web!