Website Migration [2018]

Towards the end of last year I finished migrating my website content onto a new architecture. The main challenges were converting the content and preserving URL validity by redirecting old URLs to the new content URLs.

Content and CMS history of the previous website

The first public website I filled with content used Joomla. Joomla is an extensive and popular website content management system with themes and addons. It is FOSS, is implemented in PHP, and worked fine for the most part. Over the years I added content.

The content I added was mainly some basic information and links to elsewhere, as well as some resources and reference documentation and cheat sheets.

At some point WordPress had become the standard for (new) blog platforms, and I installed it as well, as a blog. My first blog post is from 2009.

Years later I attempted to convert and unify page and resource content into WordPress in an effort to replace and drop Joomla. Joomla regularly required manual updates, while WordPress at least applied them automatically. But I was not able to follow through at the time.

Yet again years later, still dissatisfied with the technology, separation, disorganization and aesthetics of the Joomla website content and the WordPress content and blog articles, I started migrating my website to Hugo, a static website generator. I had learned about it earlier and used it in other projects, so I knew it would be a good, efficient fit for my website.

The migration

I used content exporters to export the Joomla pages and the WordPress pages and blog posts into Markdown content files for Hugo.

Some additional fixups were necessary for some image resources, outdated links and other content links, some encoding issues, and some improvements to the content itself.

I added taxonomy pages for categories, tags and technologies.

I implemented a simple layout so I could at least finish the content migration and drop the then obsolete tech that is Joomla and WordPress, and with it the need to consider security, updates, and technical overhead in general. It is a very simple layout right now, but I have various ideas in the back of my head, and if I ever find the time, the website can evolve after being published.

I added some additional page layouts for image lists, like Doge lion.

Some projects needed their own individual layouts to embed custom HTML, so I could embed them within the website, like the video player controls/navigation for DGS - Deutsche Gebärdensprache.

A huge requirement I put onto myself was retaining link validity. If possible, content should retain its previous URL, or the previous URL should redirect to the new content URL. Doing so was a challenge, as was verifying which links actually existed previously and whether they were valid on the new website.

I looked for a tool to crawl my website and generate a list of URLs, and a tool that would check a list of URLs for validity. Unfortunately, I did not find an appropriate tool. Some online tools were of help, but they were either limited in usage or limited in functionality.

I created Sitemap Crawler in Julia as a tool for that: finding out which URLs my old website had, saving those URLs into a sitemap file, and using that sitemap file to check the URLs under a new base URL (domain name) to test the new website for old URL validity.
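To illustrate the idea of checking old URLs under a new base URL, here is a minimal sketch in Python. It is not the Sitemap Crawler code (which is written in Julia), and the function names `urls_from_sitemap`, `rebase` and `check` as well as the example hosts are made up for this illustration. It assumes the sitemap file uses the standard sitemaps.org XML format.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit

# Namespace used by sitemap files in the sitemaps.org format.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return all <loc> URLs found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def rebase(url, new_base):
    """Keep path, query and fragment of url, but swap in the scheme and host of new_base."""
    old = urlsplit(url)
    base = urlsplit(new_base)
    return urlunsplit((base.scheme, base.netloc, old.path, old.query, old.fragment))


def check(url, timeout=10):
    """Return the final HTTP status for url, or None if the host is unreachable.

    urlopen follows redirects, so a redirect to a working page reports 200.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except urllib.error.URLError:
        return None
```

A run against a locally served copy of the new website could then look roughly like `for url in urls_from_sitemap(text): print(check(rebase(url, "http://localhost:1313")), url)`, where the local address is again only an assumption. Any old URL that does not end in a 200 status needs either its old path restored or a redirect added.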

This was quite a bit of work, but definitely helped a ton in retaining URL validity.

Conclusion

A shit ton of work, but I am glad I finally made the switch and finished the content migration last year.

Retaining URL validity may have been more effort than necessary, but it helps keep the website's relevance through old links and search results, and is the technically correct way to do a [complete] migration.