Seriocomic in 2020

This site was last published to in February 2018 – over 2 years ago. However, it hasn’t been left untouched. While I’ll cover that in a moment – it’s this past week that has seen the biggest changes. In the truest tradition of blogging for the audience of 1 (myself), I’m going to attempt to document some of these changes, because posterity.

The end of Virtual Private Servers

I’ve been hosting the websites I operate on VPS (Virtual Private Servers) for almost 15 years – from memory, I’ve done the rounds – Siteground, Knownhost, WiredTree, JonesSolutions, PowerVPS, etc (I won’t link to them all as only 1 is still operating as it was).

In 2016, I left the “managed VPS” game and tried self-managing a VPS using DigitalOcean – but quickly migrated to Vultr (because it offered an Australian edge location rather than a Singaporean one). Powered by a custom VPS and my own control centre built on the Virtualmin software, things ran along nicely.

However, after finally purchasing a decent Synology NAS, and with a reduced number of sites to manage, I decided to shut down the hosting side of things, migrate the remaining sites to the web-server on my NAS and put them behind Cloudflare for caching purposes – effectively reducing all costs.

Cue the hack

Sometime in 2018 my NAS MariaDB SQL database was compromised and a hacker cleaned-out all my WordPress-powered sites. These were quickly recovered from back-ups, but that sent me down the rabbit-hole of looking for greater security options and improved performance.

WordPress – as good and as popular as it is – is notorious for two things:

  1. Code bloat and expensive (in terms of CPU) page rendering – even with good caching.
  2. Security holes galore.

As a long-time, albeit intermittent/infrequent user of WordPress – y’know, like before the beginning – I’ve been acutely aware of these shortcomings and have spent hundreds of hours over the years patching/hardening, trialling every caching system and security plugin out there to both speed up my sites and also protect them from malfeasants.

One of the emerging technologies of recent years in the web-development space has been the return to “Static sites” – essentially sites that have their content hosted in a repository somewhere like Github, and a static site generator transforming that content into a site – bypassing the need for any database.

This has two distinct appealing factors:

  1. Significant speed improvements and lower cost of rendering.
  2. Mitigates almost all security vectors.

If you’re paying close attention, then you’ll see that these two appealing factors directly negate the biggest issues with WordPress-powered sites.
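To make the idea concrete, here is a minimal sketch of what a static site generator does at its core: take content, pour it into a template, and write plain HTML files that any host can serve. The `build_site` function, the template and the page data are all hypothetical, and real generators do far more than this.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; a real generator would load themes from files.
PAGE = Template(
    "<!doctype html><html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def build_site(pages: dict[str, tuple[str, str]], out_dir: Path) -> list[Path]:
    """Render each (title, body) pair to out_dir/<slug>/index.html."""
    written = []
    for slug, (title, body) in pages.items():
        target = out_dir / slug / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        # Escape the content so stray '<' or '&' cannot break the markup.
        html = PAGE.substitute(title=escape(title), body=escape(body))
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written
```

There is no database and no code running per request: once the files are written, serving the site is just serving files.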

The problem is that static-site generators typically rely on a fairly advanced level of developer knowledge, and often require working knowledge of Node.js, React, Vue, Go, etc – and while my GitHub-fu has progressed from complete ‘N00b’ to ‘Neophyte’ – even with some click-and-play services like Stackbit and Forestry – the migration hurdle was still too high to completely remove WordPress as part of the puzzle. The key was to separate the CMS from the Site-Generator.

The workable solution

I’ve already managed to move some of my simpler sites (e.g. the 1-pager mhudson.com) to GitHub, but rather than using GitHub Pages, I’ve opted for a more flexible service that is gaining popularity – especially in the Static/JAMstack space – Netlify.

Without going through all the details (this is already word #540), Netlify takes the content (Markdown/HTML etc) and other files (CSS/JS) and uses their own build-engine to generate, host and publish the static content – you simply need a domain name CNAME/A record to point to their deployed site URL.

The only remaining problem is the “keeping the WordPress” part – since WordPress likes to own that generate/publish part. Exporting a site like seriocomic with over 1300 pages (photos), including posts, date archives etc doesn’t fit most other solutions. The only one that I’ve found to work is a WordPress to Static plugin by Leon Stafford. Leon has given a recorded talk on exactly why it’s a good idea not to rely on WordPress (or other similar CMSs) and to serve a static site instead.

So – that’s that right? Run the plugin, export the site to Github, build using Netlify, publish behind Cloudflare – well, yes, but no.

What makes this site so hard

As mentioned earlier, seriocomic has over 1000 ‘posts’ – so a fair bit of content. But it’s also over 15 years old – the database was/is riddled with hundreds of old post-permalinks (from when I used ‘/image/354’ style URLs), full of ‘HTTP’ protocol addresses and so-much-plugin-cruft that it’s simply too big a job to clean fully. I can’t even do a “clean export” because there is SO much custom yet redundant stuff in there.

Here’s a from-memory laundry-list of things I had to do to get this site cleaned-up to be reused as a static-site-worthy version:

  1. Delete/replace as many antiquated plugins as possible (finding a replacement lightbox plugin that didn’t add hundreds of lines of per-page JSON was a challenge!)
  2. Test/evaluate a significant number of ‘hide WordPress‘ plugins – I wanted this site to be transportable as possible, and also reduce the WordPress vulnerability aspect – removing as many references to WordPress in the front-end code as possible does this.
  3. Migrate and refactor many of the custom and theme functions into separate switchable ‘snippet’ functions using the excellent Snippets plugin.
  4. Offload all the images onto AWS S3 – I could have them hosted in GitHub – but there’s ~1300 images, each with 3 thumbnail variants – this was costing me in time and space. It’s a pittance to have them in an S3 bucket. (Side note: AWS is a horrible system to use for novices; the permissions hurdles alone, while effective and worth it, are a nightmare to navigate.)
  5. Update a significant number of broken internal links, image paths, redirecting paths – having ScreamingFrog run crawls to identify the link-origins was invaluable, but the process was painfully slow.
  6. Refactor the theme (originally hacked together back in 2013) away from a “photoblog” where the same not-updated post was really starting to date the site – to a more gallery/Pinterest type collection
  7. Significant dead-end time trying to remove cruft from the exported HTML – the only plugin that would work fully was ‘Real-Time Search and Replace‘ – but boy the regex nearly killed me!
  8. Tweak themes to improve CSS efficiency, exposed Javascript, broken Javascript, minification and concatenation, inline CSS and JS – hours of debugging
  9. Hours of trying to not expose the wrong asset sub-domain, the wrong staging sub-domain through rewrites and regex – actually days taking 1 step forwards, two steps back
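As an illustration of the link clean-up in item 5, here is a rough sketch of the kind of audit a crawler performs on each page. It is not what ScreamingFrog does internally; the `audit_links` helper and the host constant are assumptions, and the only detail taken from this site is the old ‘/image/354’ permalink style.

```python
import re

# Assumed site host; the '/image/<number>' shape is the old permalink style.
SITE_HOST = "www.seriocomic.com"
OLD_PERMALINK = re.compile(r"/image/\d+/?$")
HREF = re.compile(r"""(?:href|src)=["']([^"']+)["']""")


def audit_links(html: str) -> dict[str, list[str]]:
    """Sort a page's links into insecure (http) and legacy-permalink buckets."""
    report = {"insecure": [], "legacy": []}
    for url in HREF.findall(html):
        if url.startswith(f"http://{SITE_HOST}"):
            report["insecure"].append(url)
        if OLD_PERMALINK.search(url):
            report["legacy"].append(url)
    return report
```

A link can land in both buckets, which matches reality: many of the oldest permalinks were also the ones still using plain HTTP.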

Luckily for me, Leon’s plugin offered different types of export – from direct publishing to Github or Netlify, to a Zip file and to a directory – the last option being the one I used most extensively for testing. Unfortunately, despite my best attempt to get the crawl/export to happen as fast as possible, it took on average an hour to export the static version – and for every successful export, I averaged about 10 that failed – more often than not because I was trying to crawl/export at a rate that was beyond my poor NAS’s capacity.

I also found the plugin was too aggressive in its “initial crawl” list – so had to hack that part to stop it trying to crawl what was never exposed in the first place. Hopefully, I get an opportunity to contribute to the plugin rather than rely on maintaining a hacked version.
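The idea behind that hack can be sketched in a few lines. This is not the plugin’s code (the plugin is PHP and I’m not reproducing it here); the `limit_crawl_list` function and the excluded prefixes are purely hypothetical examples of paths that exist in WordPress but were never public on the site.

```python
from urllib.parse import urlparse

# Hypothetical exclusions: paths that exist in WordPress but were never exposed.
EXCLUDED_PREFIXES = ("/wp-admin/", "/wp-json/", "/feed/")


def limit_crawl_list(urls: list[str], excluded=EXCLUDED_PREFIXES) -> list[str]:
    """Drop URLs whose path starts with an excluded prefix, preserving order."""
    kept = []
    for url in urls:
        path = urlparse(url).path
        # str.startswith accepts a tuple, so one call checks every prefix.
        if not path.startswith(excluded):
            kept.append(url)
    return kept
```

Every URL removed from the initial list is one less request for an underpowered NAS to answer during the export.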

Benchmarking for pedanticism

A lot of the reason this took me probably 5 times longer than it should have was the success I’d had in improving the performance of an old, previously WordPress-powered site of mine – SiteSpeed. If you take a look at the Google Lighthouse performance report – screen-shotted here for posterity – I managed to get a perfect performance score.

Achieving this on seriocomic was never going to happen – especially with such an image-heavy, JavaScript-dependent site (stripping back that interaction, maybe). However, trimming redundant inline code, cruft and other immaterial HTML, CSS and JS added a solid 40 hours of effort. So was it worth it? Probably not – the Lighthouse score is nowhere near where I’d like it, but the site is now stable and fast.

Picking up on the 2nd point from the list above (Hide WordPress), along with the speed issue, is best explained by using code examples. Here’s a view of the standard HTML output of this page, without any of the optimizations made:

<!doctype html>
<html lang="en-AU" dir="ltr" class="no-js">
<head>
<title>Seriocomic in 2020 &ndash; seriocomic</title>
<meta name="viewport" content="width=device-width,initial-scale=1">
<link rel="license" href="https://www.seriocomic.com/about/copyright/">
<link rel="author" title="Mike Hudson" href="https://www.seriocomic.com/about/">
<link rel="shortcut icon" type="image/x-icon" href="https://www.seriocomic.com/favicon.ico">
<meta name="description" content="This site was last published to in February 2018 &#8211; over 2 years ago. However, it hasn&#8217;t been left untouched. While I&#8217;ll cover that in a moment &#8211; it&#8217;s this past&#8230;" />
<meta property="og:image" content="https://www.seriocomic.com/wp-content/uploads/2020/07/seriocomic2020.png" />
<meta property="og:image:width" content="900" />
<meta property="og:image:height" content="600" />
<meta property="og:locale" content="en_AU" />
<meta property="og:type" content="article" />
<meta property="og:title" content="Seriocomic in 2020 &ndash; seriocomic" />
<meta property="og:description" content="This site was last published to in February 2018 &#8211; over 2 years ago. However, it hasn&#8217;t been left untouched. While I&#8217;ll cover that in a moment &#8211; it&#8217;s this past week that has seen the biggest changes." />
<meta property="og:url" content="https://www.seriocomic.com/rhetoric/seriocomic-in-2020/" />
<meta property="og:site_name" content="seriocomic" />
<meta property="article:published_time" content="2020-07-11T04:50+00:00" />
<link rel="canonical" href="https://www.seriocomic.com/rhetoric/seriocomic-in-2020/" />
<!-- Prefetch those things, good puppy -->
<link rel='dns-prefetch' href='//cdnjs.cloudflare.com' />
<link rel='dns-prefetch' href='//fonts.googleapis.com' />
<link rel='dns-prefetch' href='//s.w.org' />
<link rel="alternate" type="application/rss+xml" title="seriocomic &raquo; Seriocomic in 2020 Comments Feed" href="https://www.seriocomic.com/rhetoric/seriocomic-in-2020/feed/" />
<link rel='stylesheet' id='mf_Raleway:100-css'  href='https://fonts.googleapis.com/css?family=Raleway:100' type='text/css' media='all' />
<link rel='stylesheet' id='sod_drawer_css-css'  href='https://www.seriocomic.com/wp-content/plugins/wordpress-sliding-drawer-content-area/css/style.css' type='text/css' media='all' />
<link rel='stylesheet' id='simplelightbox-css-css'  href='https://www.seriocomic.com/wp-content/plugins/simplelightbox/dist/simple-lightbox.min.css?ver=5.4.2' type='text/css' media='all' />
<link rel='stylesheet' id='layout-css'  href='https://www.seriocomic.com/wp-content/themes/seriocomic2013/css/layout.css' type='text/css' media='all' />
<link rel='stylesheet' id='fonts-css'  href='https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css' type='text/css' media='all' />
<link rel='stylesheet' id='style-css'  href='https://www.seriocomic.com/wp-content/themes/seriocomic2013/style.css?ver=dev' type='text/css' media='all' />
<link rel='stylesheet' id='enlighterjs-css'  href='https://www.seriocomic.com/wp-content/plugins/enlighter/cache/enlighterjs.min.css?ver=/AZDiHwswrLHIuC' type='text/css' media='all' />
<link rel='https://api.w.org/' href='https://www.seriocomic.com/wp-json/' />
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="https://www.seriocomic.com/xmlrpc.php?rsd" />
<style>
.sl-overlay{background:#1a1a1a;opacity: 0.9;z-index: 1006;}
.sl-wrapper .sl-navigation button,.sl-wrapper .sl-close,.sl-wrapper .sl-counter{color:#ffffff;z-index: 1015;}
.sl-wrapper .sl-image{z-index:9000;}
.sl-spinner{border-color:#696969;z-index:1007;}
.sl-wrapper{z-index:1000;}
.sl-wrapper .sl-image .sl-caption{background:#000000;color:#ffffff;opacity:0.6;}
</style></head>

The obvious elements in this section that I’m trying to eliminate are:

  • the inline HTML comment – since comments aren’t for production (line #23)
  • the <link rel='dns-prefetch' href='//s.w.org' /> reference – I’m not making any requests on the front-end to WordPress HQ (line #26)
  • the RSS feeds (since this will be a static-site and feeds can’t be generated) (lines 27,28)
  • the redundant and leaky wp-json and xmlrpc links (lines 35, 36)
  • the inline CSS (lines 38-43)
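For anyone post-processing exported HTML rather than fixing it inside WordPress, the same removals can be sketched as a handful of regular expressions. This is an alternative illustration, not the method used on this site; the `strip_wordpress_markers` function is hypothetical and its patterns are modelled only on the `<head>` shown above.

```python
import re

# Patterns are assumptions modelled on the unoptimized <head> listing.
UNWANTED = [
    re.compile(r"<!--.*?-->", re.DOTALL),  # every HTML comment
    re.compile(r"<link rel='dns-prefetch' href='//s\.w\.org' />"),
    re.compile(r'<link rel="alternate" type="application/rss\+xml"[^>]*>'),
    re.compile(r"<link rel='https://api\.w\.org/'[^>]*>"),
    re.compile(r'<link rel="EditURI"[^>]*>'),
]


def strip_wordpress_markers(html: str) -> str:
    """Remove each unwanted pattern, then drop the blank lines left behind."""
    for pattern in UNWANTED:
        html = pattern.sub("", html)
    return "\n".join(line for line in html.splitlines() if line.strip())
```

Note that the comment pattern is blunt: it would also remove conditional comments, so it only suits a theme that doesn’t rely on them.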

Minification takes care of the inline HTML comment, a custom function added to a snippet removes the feeds, the WP Hide plugin does most of the rest, including removing the redundant tags and links – but also as shown in the “optimized” version, replaces well-known WordPress folder paths with custom ones:

<!doctype html><html lang="en-AU" dir="ltr" class="no-js"><head><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Raleway:100&amp;display=swap" /><link media="all" href="https://www.seriocomic.com/innards/cache/autoptimize/css/autoptimize_b9e716a77d54f54d44ca959e14a309c1.css" rel="stylesheet" /><title>Seriocomic in 2020 &ndash; seriocomic</title><meta name="viewport" content="width=device-width,initial-scale=1"><link rel="license" href="https://www.seriocomic.com/about/copyright/"><link rel="author" title="Mike Hudson" href="https://www.seriocomic.com/about/"><link rel="shortcut icon" type="image/x-icon" href="https://www.seriocomic.com/favicon.ico"><meta name="description" content="This site was last published to in February 2018 &#8211; over 2 years ago. However, it hasn&#8217;t been left untouched. While I&#8217;ll cover that in a moment &#8211; it&#8217;s this past&#8230;" /><meta property="og:image" content="https://assets.seriocomic.com/2020/07/seriocomic2020.png" /><meta property="og:image:width" content="900" /><meta property="og:image:height" content="600" /><meta property="og:locale" content="en_AU" /><meta property="og:type" content="article" /><meta property="og:title" content="Seriocomic in 2020 &ndash; seriocomic" /><meta property="og:description" content="This site was last published to in February 2018 &#8211; over 2 years ago. However, it hasn&#8217;t been left untouched. While I&#8217;ll cover that in a moment &#8211; it&#8217;s this past week that has seen the biggest changes." 
/><meta property="og:url" content="https://www.seriocomic.com/rhetoric/seriocomic-in-2020/" /><meta property="og:site_name" content="seriocomic" /><meta property="article:published_time" content="2020-07-11T04:50+00:00" /><link rel="canonical" href="https://www.seriocomic.com/rhetoric/seriocomic-in-2020/" /><link rel='dns-prefetch' href='//cdnjs.cloudflare.com' /><link href='https://cdnjs.cloudflare.com' rel='preconnect' /><link href='https://fonts.gstatic.com' crossorigin='anonymous' rel='preconnect' /><link rel='stylesheet' href='https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css' type='text/css' media='all' /></head>

What you can’t see from that code-snippet is that I’ve also added a search-and-replace regular expression to remove multiple spaces – /[^\S\r\n]+/ – and also went to the length of replacing the (hard to correct) en_GB with en_AU for the <meta property="og:locale" content="en_AU" /> line. Next-level pedantry indeed.
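To show what those search-and-replace rules amount to, here is a small Python sketch. The whitespace expression is the one quoted above; the asset-URL prefixes are inferred from the two listings, the og:locale rule is my guess at how such a replacement could be written, and `search_and_replace` is a hypothetical name. The real rules live in the WordPress plugin settings, not in Python.

```python
import re

# Ordered (pattern, replacement) rules. Only the whitespace pattern is quoted
# from the post; the other two are assumptions made for illustration.
RULES = [
    (re.compile(r"https://www\.seriocomic\.com/wp-content/uploads/"),
     "https://assets.seriocomic.com/"),
    (re.compile(r'(property="og:locale" content=")en_GB"'), r'\1en_AU"'),
    # Collapse runs of spaces/tabs to one space while keeping line breaks.
    (re.compile(r"[^\S\r\n]+"), " "),
]


def search_and_replace(html: str) -> str:
    """Apply every rule in order to the rendered HTML."""
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html
```

The character class `[^\S\r\n]` reads as “whitespace, but not a carriage return or newline”, which is why line breaks survive while runs of spaces and tabs collapse.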

Also what you probably can’t appreciate is the number of different (probably 20?) Syntax Highlighter plugins I tried that would a) work with my theme, b) provide line numbering, c) not add to the DOM excessively, d) provide the theme customization I wanted. Probably two days just on this alone. I’ve had to let the HTML4/HTML5 closing tag (/> vs >) issue live for another day.

So – has it been worth it? I guess this is like building a home – you know you have to factor in some additional budget for over-runs, but when a project is personal, sometimes there is no budget.

I can now shut down the web back-end (webserver and database) and sleep well knowing that the site is running extremely efficiently, quickly and cheaply.

Speaking of costs – it’s only through the generosity of the free tiers offered by Cloudflare, Netlify and GitHub that this is all really possible. Only the image hosting on S3 hits my back pocket, and then only a couple of cents each month.

So there you go – weeks of work summed up completely understating the pain and difficulty to make something relatively unchanged. Such is dev-life.

[UPDATE] - The AWS image hosting solution was replaced by the amazing free-service offered by Backblaze.
