Creating a Lifeboat site

I would like to create a static version of my website as a “lifeboat” for potential downtime.

To do this I would use an AWS Route53 health check: when my primary WordPress site fails the check, traffic would fail over to the static version on S3/CloudFront.
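For anyone following along, here is a rough sketch of what the two failover records could look like, built as the JSON change batch that `aws route53 change-resource-record-sets` accepts. The domain, IP address, health check ID and CloudFront hostname are all placeholders I made up, and the primary is assumed to be a plain A record; a site behind a load balancer would more likely use an alias for the primary as well.

```python
import json

# Hosted zone ID that Route 53 uses for alias records pointing at CloudFront.
CLOUDFRONT_ZONE_ID = "Z2FDTNDATAQYW2"


def failover_change_batch(domain, primary_ip, health_check_id, cloudfront_domain):
    """Build a change batch with a PRIMARY and a SECONDARY failover record."""
    primary = {
        "Name": domain,
        "Type": "A",
        "SetIdentifier": "primary-wordpress",  # hypothetical label
        "Failover": "PRIMARY",
        "TTL": 60,
        "ResourceRecords": [{"Value": primary_ip}],
        # Route 53 stops answering with this record when the check fails.
        "HealthCheckId": health_check_id,
    }
    secondary = {
        "Name": domain,
        "Type": "A",
        "SetIdentifier": "secondary-static",  # hypothetical label
        "Failover": "SECONDARY",
        "AliasTarget": {
            "HostedZoneId": CLOUDFRONT_ZONE_ID,
            "DNSName": cloudfront_domain,
            "EvaluateTargetHealth": False,
        },
    }
    return {
        "Changes": [
            {"Action": "UPSERT", "ResourceRecordSet": record}
            for record in (primary, secondary)
        ]
    }


# All four arguments below are placeholder values.
batch = failover_change_batch(
    "www.example.com.",
    "203.0.113.10",
    "00000000-0000-0000-0000-000000000000",
    "d111111abcdef8.cloudfront.net.",
)
print(json.dumps(batch, indent=2))
```

Saving the printed JSON to a file and passing it with `--change-batch file://...` is one way to apply it; the same structure can be entered through the console instead.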

There are some things that I would need to be able to do to achieve this:

  • I would like the static site not to be crawlable by search engines.
  • Remove features relying on PHP, such as search.

Is there any way to achieve these two requests?

I would like the static site not to be crawlable by search engines.

As long as your deployment URL is set to your normal website address, WP will set canonical URLs correctly, which addresses the need here. There’s no need to restrict access to the static site.

Remove features relying on PHP, such as search.

This has to be done by the theme. The crawler uses a User-Agent of WP2Static.com, so you could have the theme check for that and omit the search form. Another option would be to switch over search to Algolia (on the live site also), so it would always work.

@nathansmonk how did you get on with this? It’s a great use case. I don’t host my own WordPress sites anymore, static or otherwise, but can see this being a great approach for the failover site scenario you’re using, which would be great to be able to point interested users to, along with any other advanced routing via Route53, NS1, etc.

I’m so close, but I just cannot make it work.

The AWS infrastructure is all set up and ready to accept it, but I cannot get a successful crawl to happen. Admittedly, I am doing this on a massive site. The crawl runs for about 14 hours, and then fails.

I’m using 6.6.0, because under the alpha I get Error: Class 'WP2Static\Controller' not found in ~/Sites/wordpress/app/public/wp-content/plugins/wp2static-master/wp2static.php on line 24

I don’t know anything about v6, but regarding v7: Did you try running composer install from the wp2static-master directory?

Hi @nathansmonk,

14 hrs is pretty intense - how big a site and what size EC2 instance? Are you watching the generated directory to see that it is actually processing/increasing in size during those 14 hrs?

I’d aim for < 1 hr, even for massive sites. Rather than a 14 hr run, you could be better served by spinning up a powerful instance, cloning the site and running the export with plenty of resources, probably coming out at a similar or lower cost in instance hours. It just depends how much you value your time to do it manually or script it up.

The wp2static repo hasn’t had any official releases for a while, but you can definitely use it from the master branch on GitHub. Judging by the name of your plugin folder, it seems like you’ve downloaded a ZIP from GitHub, which is not a compiled, ready-to-run plugin, so as @john-shaffer mentioned, you’d need to run composer install. I think you’d also be best off ensuring the plugin dir is named wp2static, not wp2static-master, but you can maybe get away with that for now.

If you want to try the fork of v6 of the plugin, it’s now named Static HTML Output and you can get a ready-to-run plugin from https://github.com/WP2Static/static-html-output-plugin/files/4822705/static-html-output-plugin-6.6.21.zip

Big sites are really fun to do for me, so I’d love to see you get this working a lot quicker. The “more modern” WP2Static includes a lot of caching options which will be much more useful for large sites, especially when you need to re-run after making changes, as it should only crawl/process changed files.
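To illustrate the general idea of only processing changed files, here is a minimal sketch of hash-based change detection. This is not WP2Static’s actual implementation; the `changed_paths` function and the in-memory manifest are assumptions made purely for illustration.

```python
import hashlib


def changed_paths(pages, manifest):
    """Return paths whose content differs from the last recorded run.

    pages: dict mapping a URL path to the crawled body (bytes).
    manifest: dict mapping a URL path to the SHA-256 hex digest stored
    on the previous run; it is updated in place.
    """
    changed = []
    for path, body in pages.items():
        digest = hashlib.sha256(body).hexdigest()
        if manifest.get(path) != digest:
            changed.append(path)
            manifest[path] = digest
    return changed


manifest = {}
first = changed_paths({"/": b"<h1>Home</h1>", "/about/": b"<h1>About</h1>"}, manifest)
second = changed_paths({"/": b"<h1>Home!</h1>", "/about/": b"<h1>About</h1>"}, manifest)
print(first)   # every path is new on the first run
print(second)  # only the edited page on the second run
```

The point is that a re-run after small edits only needs to post-process and deploy the pages listed in the result, which is where the time savings on a large site would come from.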

Unfortunately, I’ve been out of action for a while with some mental health issues, so I’ve fallen behind in support and development; otherwise I’d be volunteering more to help you directly, log in to the remote site, etc. On that note, for big sites, you may also be better served by taking a clone and running the export from a local computer, assuming it has a lot more resources than an EC2 instance.

Please keep updating with any progress/issues. I can’t promise how quickly any of us can respond here - but there’s a chance someone else will have dealt with similar issues.

Sorry to hear about your health issues, I hope you are beginning to feel better.

On my last attempt, I got up to about 40,000 URLs before it conked out. I’m running a clustered infrastructure with 3 t3.large instances. I know they’re not the most powerful instances going, but the process basically has dedicated resources (it’s in an autoscaling group).

I didn’t realise that about composer install, so I’m definitely going to give that a go now! I know that Wordfence has been causing me some issues too.

I’ll report back!

Thanks!

WordFence and security plugins shouldn’t usually be required on a WP dev site using WP2Static, so another argument towards cloning/not running on a live site, though I understand that may not work for your scenario.

Is an ELB or similar in front of these to share the requests? That may be a cause of the slowdown, if each request has to go out to the load balancer and back into an instance. Ideally, we’d want the instance running WP2Static to resolve all requests locally.

Some steps taken to get a large WordPress site running smoothly in production may be at odds with an ideal WP2Static dev server, where we want to hit the one local server hard and use as many resources as available.

I understand where you’re coming from. I guess my anticipated use-case isn’t what most people would be looking for.

I’ve managed to get wp2static running after composer install. I had to remove vendor from the gitignore to make it activate on my production server. I have an issue with the S3 addon, whereby the logs say Failed to create 'name' index on wp_10_wp2static_addon_s3_options.

So right now I’m trying to make it happen just as a standard export, and then I’ll transfer the files manually.

Yep, there’s an ELB in front of the webservers. I don’t mind it taking a while, especially if it caches and only updates new stuff. But I just need the initial crawl to complete once; that is the milestone for me. If I can crack that, I think it’ll be smooth sailing afterwards.

I have an issue with the s3 addon, whereby in the logs it says Failed to create ‘name’ index on wp_10_wp2static_addon_s3_options.

That’s coming from this code:

        // dbDelta doesn't handle unique indexes well.
        $indexes = $wpdb->query( "SHOW INDEX FROM $table_name WHERE key_name = 'name'" );
        if ( 0 === $indexes ) {
            $result = $wpdb->query( "CREATE UNIQUE INDEX name ON $table_name (name)" );
            if ( false === $result ) {
                \WP2Static\WsLog::l( "Failed to create 'name' index on $table_name." );
            }
        }

It’s a unique index, so it can’t be added if there are any duplicate values in the name column. The simplest way to fix it is to just drop the table and re-enter the options. It should be recreated automatically. You can also connect to your SQL server and remove the duplicates. It could potentially be another issue (such as permissions), in which case creating the index manually will resolve the issue.
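To make the duplicate problem concrete, here is a small self-contained sketch using SQLite from Python as a stand-in for the MySQL table. The table name, the column layout and the option names are invented for the demo, and the index is called name_idx here rather than name. It shows the index creation failing while duplicates exist, a query that lists them, and the index succeeding once they are removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the addon's options table; the columns are an assumption.
conn.execute("CREATE TABLE options (id INTEGER PRIMARY KEY, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO options (name, value) VALUES (?, ?)",
    [("s3Bucket", "old"), ("s3Bucket", "new"), ("s3Region", "eu-west-1")],
)


def try_create_index(connection):
    """Attempt to add the unique index; report whether it succeeded."""
    try:
        connection.execute("CREATE UNIQUE INDEX name_idx ON options (name)")
        return True
    except sqlite3.IntegrityError:
        return False


first_attempt = try_create_index(conn)  # fails while duplicates exist

# List every name that appears more than once.
duplicates = conn.execute(
    "SELECT name, COUNT(*) FROM options GROUP BY name HAVING COUNT(*) > 1"
).fetchall()

# Keep only the most recently inserted row for each name.
conn.execute(
    "DELETE FROM options WHERE id NOT IN (SELECT MAX(id) FROM options GROUP BY name)"
)
second_attempt = try_create_index(conn)

print(first_attempt, duplicates, second_attempt)
```

The duplicate-listing query should carry over to MySQL largely unchanged, but the DELETE may need rewriting there, since MySQL restricts subqueries that read from the table being deleted from.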

Thanks @john-shaffer. To be honest, I’m struggling to get the base plugin to play nice. I keep getting 504 errors despite setting my max_execution_time to 300000! Don’t suppose you have any insight into that?

I can only guess, but 504 seems like it would be coming from the load balancer and wouldn’t be affected by max_execution_time. You can try adjusting the ELB timeout.

I don’t let WP Cron run at all on HTTP requests, and instead I have cron jobs running wp cron and wp wp2static process_queue. That might help with the 504s.

I don’t think that you’ll get great results unless someone can rework the crawling code to magically know what it can skip crawling. WP2Static currently crawls the entire site every time to detect changes.

Well I’ve got something … different.

I’ve recently switched to Amazon Linux 2, which is a whole other headache, but I’ve managed to get WP CLI working. I’m running a crawl command via that and it seems to be running for longer. I’ve entered the command and it’s just blank right now. I’m not sure if that’s because it’s stalled or because it’s processing. It doesn’t look like any additional URLs have been crawled just yet. I’m going to head off to sleep for a bit, and hopefully it’ll have done its thing when I wake up. Fingers crossed.

Hopefully you wake up to good news!

I’ll usually run something like watch du -sh /path/to/export/dir to monitor that the export is actually growing when kicking things off. Drop the h flag to get the size in raw blocks rather than rounded units, so small increases show up sooner.
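If watch isn’t available, the same check can be scripted. Below is a minimal sketch that sums file sizes under the export directory and reports the growth between samples. The function names are my own, and it counts apparent file sizes rather than disk blocks, so the numbers won’t match du exactly.

```python
import os
import time


def dir_size_bytes(path):
    """Sum the sizes of all files under path, skipping any that vanish mid-walk."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file removed or unreadable between listing and stat
    return total


def watch_growth(path, interval=2.0, samples=5, sleep=time.sleep):
    """Sample the directory size and return the byte increase between samples."""
    sizes = []
    for i in range(samples):
        sizes.append(dir_size_bytes(path))
        if i < samples - 1:
            sleep(interval)
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


# Example call (the path is a placeholder):
# print(watch_growth("/path/to/export/dir", interval=10, samples=6))
```

A run of zeros over a few minutes suggests the job has stalled rather than being slow.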

Re the 504s/ELB - you can add the crawler’s user agent to an ELB rule so that nothing gets blocked; it should be WP2Static.com

I have a successful crawl! 67546 URLs crawled!

I’m now going to run the post_process command. Fingers crossed this may be it.

I now get Couldn't make directory: /var/app/current/wp-content/uploads/sites/10/wp2static-processed-site//article-categories/mres-humanities/page at the process stage. I think the double / is preventing this from working. Any ideas?

It may rather be permissions. If running from GUI first, then CLI, the permissions of each process may differ. Can you try a quick permissions/ownership increase for that whole wp2static-processed-site dir?
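A quick way to check both theories from the CLI user’s side is sketched below. It shows that a doubled slash in the middle of a POSIX path normalises to the same location, and it lists directories that the current user can’t write into. The helper name is made up for this example, and it should be run as the same user that runs WP-CLI.

```python
import os


def unwritable_dirs(root):
    """List root and any directories under it the current user cannot write into."""
    blocked = []
    if not os.access(root, os.W_OK | os.X_OK):
        blocked.append(root)
    for current, dirs, _files in os.walk(root):
        for name in dirs:
            candidate = os.path.join(current, name)
            # Creating entries in a directory needs write and execute permission.
            if not os.access(candidate, os.W_OK | os.X_OK):
                blocked.append(candidate)
    return blocked


# A repeated slash inside a POSIX path refers to the same location.
doubled = "/var/app/current/wp-content/uploads/sites/10/wp2static-processed-site//article-categories"
print(os.path.normpath(doubled))

# Example call (the path is a placeholder):
# print(unwritable_dirs("/path/to/wp2static-processed-site"))
```

If the list comes back non-empty for the processed-site directory, ownership or mode is the likely culprit rather than the path itself.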

A little bit of info on the differing permissions: https://wp2static.com/developers/wp-cli/

Yep, spot on: my wpcli user was different to the webserver user. That’s all sorted. I did indeed get a copy, but there are lots of files missing. I know this because the wp-content folder isn’t there. I think the generated copy is perhaps too small now, because the crawl last night can’t have actually added the files due to permissions, so I’m going to run it again and see if that does the trick.

What I have noticed, however, is that my site is now very slow (60 sec + backend loading time). I assume that’s from the records being stored in the database. Would you agree with that assumption?

Re site speed, you could try rebooting the instances and/or checking for rogue PHP processes. Cancelling a job doesn’t always cancel it, especially for large jobs, where there may be a bunch of crawl/process tasks still queued up and running. I will aim to address this in future with proper async background tasks.

When testing massive sites locally on certain environments with php-fpm, I needed to restart both php-fpm and the web server (httpd in that case) to get anything going again after exhausting resource limits. That only happened with some configurations; others were fine.

You can try deleting all the WP2Static related cache entries, but, being WordPress, could be a lot of other possibilities.