AVC Comments Migration Complete

Back when we launched the new AVC (AVC 3.0) and moved away from the Disqus comment system, I heard loud and clear that the folks who left comments here at AVC, via Disqus, from 2007 to early 2020, would like to have their comments displayed at the bottom of all of those old blog posts.

That was not an easy thing to do because I wanted to migrate all of those comments out of Disqus into the AVC WordPress database so that we have full control over them and how to display them in the new AVC.

Disqus was super helpful in getting the comments out, but we ran into a number of issues given the massive number of comments. There were 459,000 comments left on AVC in the “Disqus era.” Think about that.

Here is an email the team at Storyware, who did the work, sent me explaining their process. They also migrated the comments on GothamGal.com and completed that last month.

At first we tried to use the official Disqus Plugin to migrate your comments, but their plugin resulted in errors each time we tried to process a batch of comments. We then looked at writing a custom migration script for the exported XML file that you obtained from Disqus. With nearly 500k comments, your migration file was 397.3 MB in size. This massive file wasn’t efficient for testing migration scripts so we tabled this, knowing that we would be migrating a small set of Disqus comments for GothamGal. 

The GothamGal export from Disqus turned out to be 27 MB, much smaller in size. We used her export file to then develop a CLI tool to process the XML file and migrate the comments into WordPress. This tool worked well, but it relies on holding a lot of items in memory: an array of the Disqus threads (your posts), an array of your Disqus comments, and an array of processed comments that we can use for associating parents with children. This same script just couldn’t handle an export file that’s the size of the one generated for avc.com.
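
(An aside for the technically curious, not part of Storyware's email: one general way to avoid holding a whole export in memory is to stream the XML one element at a time. The Python sketch below illustrates that idea only. It is not what Storyware built, and the element and attribute names in the sample are invented for illustration rather than taken from the real Disqus export format.)

```python
import io
import xml.etree.ElementTree as ET


def iter_elements(source, tag):
    """Yield each <tag> element from an XML stream, one at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # Once the caller has moved on, drop this element's
            # text, attributes, and children to keep memory use low.
            elem.clear()


# Tiny stand-in for an export file. The structure is hypothetical.
SAMPLE = b"""<export>
  <thread id="t1"><link>https://example.com/post-1/</link></thread>
  <post id="c1" thread="t1"><message>First!</message></post>
  <post id="c2" thread="t1" parent="c1"><message>Second.</message></post>
</export>"""


def count_posts(source):
    """Count <post> elements without building the full tree's contents."""
    return sum(1 for _ in iter_elements(source, "post"))
```

Calling `count_posts(io.BytesIO(SAMPLE))` walks the sample once. With a real file you would pass an open file handle instead of the in-memory bytes.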

To run the Disqus to WordPress migration for AVC, we developed a plugin that allowed us to perform the following steps:

1/Process all of the threads in the XML file, and store them in a new database table. These threads are needed for grabbing the URL associated with each comment, which can then be used to associate each comment with a post in WordPress. 

2/Process all of the Disqus comments in the XML file and also store these in a new database table, which we can use to gradually migrate the comments into WordPress. We did still have to break the huge AVC Disqus export file into 16 pieces in order to save the comments from the XML file into the database 🙂

3/Use a Laravel-esque Queue system to run batches of migrations in the background, processing 5,000 comments with each batch. We used the WP Queue package from Delicious Brains for the basis of this functionality, and then created a REST endpoint for triggering the Queue to process. 

Storyware plans to clean up the plugin and release it as a developer tool in the near future. 

This turned out to be a pretty big project that cost them time and me money. But I wanted to honor all of the work that the AVC community put into the comments, and that has now been done.

You can see what a long comment thread looks like at the bottom of the infamous Marketing post from 2011.

We have noticed in the migration logs that some comments didn’t make it through because of changes in the associated post’s URL after publication, but the overwhelming majority of all your comments were migrated without issue. I do not plan to fix that. I don’t believe in letting the perfect become the enemy of the good.

I am relieved that this is now complete. I hope you all are as well.

#Weblogs