First off, I love this plugin. I’m using it to create static sites for mass page builds. My process is to create a batch of 500 bulk pages on my WP site, then do an export. This works fine the first time. I then delete all of those mass pages from my WP site and repeat the process. However, when I export the next batch of pages, it still finds and exports the initial run of 500 pages too, despite them having been deleted from my WP site. I’ve cleared the cache, etc. Is there any way, on subsequent static exports, to have it not find those deleted pages? I only want it to export what’s actually showing in my Posts/Pages. I hope this makes some kind of sense.
Hi @MKMarketing, glad to hear you’ve been able to use it to good effect otherwise!
Could you please confirm which version of the plugin you’re using? We’ve got a few floating around at the moment, with different ways of handling the plugin’s own cache, which sounds like it could be what’s not getting cleared here. In WP2Static, we have a crawl cache and a deploy cache.
Thanks for the quick response. I’m using Static HTML Output 6.6.21. I’ve clicked the Delete Deploy Cache, but it’s still finding those old posts.
Hmm, you can inspect a few points during the process to see where it may be getting stuck.
If I recall, I modified things not long ago to ensure that it will force re-detection of the main URLs it uses to seed crawling, i.e. post and page URLs. By this, I mean that it should be fine if you keep the export screen open, make changes to posts in another tab, then run again. But I may be confusing that with the similar “Run” screen added to the new WP2Static version.
So, in Static HTML Output 6.6.21, you have a new Export Log in the last tab. Checking that after a deploy should give you an idea of where the URLs originated from.
You can also check the actual generated directories within wp-content/uploads.
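If it helps with that check, here is a minimal Python sketch that counts the files under each subdirectory of an uploads folder, so you can see which generated directories are still holding pages. It only assumes you can read the filesystem where wp-content/uploads lives; the function name `summarize_upload_dirs` is made up for illustration and is not part of the plugin.

```python
from pathlib import Path


def summarize_upload_dirs(uploads_dir):
    """Return (directory name, file count) for each subdirectory, sorted by name.

    uploads_dir is the path to wp-content/uploads on your server (you supply it).
    """
    root = Path(uploads_dir)
    summary = []
    for child in sorted(root.iterdir()):
        if child.is_dir():
            # Count regular files at any depth below this subdirectory.
            file_count = sum(1 for p in child.rglob("*") if p.is_file())
            summary.append((child.name, file_count))
    return summary
```

Calling `summarize_upload_dirs("/path/to/wp-content/uploads")` before and after an export should show which directory grows, and whether its file count matches the number of pages you expect.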
Sorry, I’m a little out of touch with the code after the last couple of months and hoping to dive back in soon, so I’m a bit unsure of which features are in the latest version of Static HTML Output.
You do have some WP-CLI commands to inspect/list some of the caches, too.
If doing bulk sites and tying in with any other processes, moving to WP-CLI may benefit your workflow, too.
You can post here or email me your export/deploy logs, too, but take a look first and see if you can figure out what’s happening with the URLs you expect to see vs what you’re getting.
So I checked the crawl log for the most recent export I ran. The strange thing is that the log only shows the URLs from the most recent bulk page run I did, about 350 in total. Yet the static export zip has all of the previous pages I’ve run and since deleted from the site. I wonder how it’s finding those pages when they don’t exist any more, unless they’re still in the WP database. I’m going to check phpmyadmin and see if those old pages are showing up there.
Also, I definitely leave the export page open while it’s crawling and processing for the export.
Ah, so this may simply be that they are in the cache of the exported site (wp-content/uploads/processed-something).
The plugin may do some diffing to skip re-processing files which haven’t changed between crawls or such, but it definitely doesn’t delete the processed site dir each time (otherwise, it wouldn’t be able to do a partial crawl/deploy without having to redo the whole site every time).
So, it’s likely you’re just seeing them because they still exist in the “processed_site” dir in wp-content/uploads.
If they’re not linked from anywhere in your site, they shouldn’t cause a problem, but it’s not doing a delete call in the deploy process.
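To confirm that this is what is happening, a rough Python sketch like the one below can compare the URLs from the crawl log against the files found in the export. It rests on assumptions: that each crawled page URL ends up as `<path>/index.html` in the export (this mapping is a guess, not confirmed plugin behaviour), and that you have already pulled the URL list and the file list out yourself. The names `expected_file` and `stale_pages` are hypothetical.

```python
from urllib.parse import urlparse


def expected_file(url):
    """Guess the exported file for a page URL (assumed layout: <path>/index.html)."""
    path = urlparse(url).path.strip("/")
    if not path:
        return "index.html"
    last_segment = path.rsplit("/", 1)[-1]
    # URLs that already name a file (e.g. feed.xml) are kept as-is.
    if "." in last_segment:
        return path
    return f"{path}/index.html"


def stale_pages(exported_files, crawled_urls):
    """Return exported .html files that no crawled URL accounts for."""
    expected = {expected_file(url) for url in crawled_urls}
    return sorted(
        name
        for name in exported_files
        if name.endswith(".html") and name not in expected
    )
```

For the file list, `zipfile.ZipFile(path).namelist()` on the export zip would work, though the entries may need a leading folder stripped first. Anything `stale_pages` returns is a page that exists in the export but was not crawled in the latest run, which is what leftover files in the processed dir would look like.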
Can you explain the use case for deploying a bunch of URLs and then not wanting them again soon after? That may give me some ideas of how to address it, but you could probably hook into one of the plugin’s events and clear the whole processed site dir on each run, if that makes sense…
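As a stopgap that avoids the plugin’s hooks entirely, an external script run before each export could empty that directory. The sketch below is an alternative to the hook approach, not the plugin’s own mechanism. The function name `clear_directory` is hypothetical, and you would pass in the actual processed site path from your own wp-content/uploads, since the exact directory name depends on your install. The `must_be_inside` check is there so that a mistyped path cannot wipe something outside uploads.

```python
import shutil
from pathlib import Path


def clear_directory(target, must_be_inside):
    """Delete everything inside target, keeping target itself.

    Refuses to run unless target sits below must_be_inside.
    Returns the number of top-level entries removed.
    """
    target = Path(target).resolve()
    parent = Path(must_be_inside).resolve()
    if parent not in target.parents:
        raise ValueError(f"{target} is not inside {parent}")
    if not target.is_dir():
        return 0
    removed = 0
    for child in target.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)  # real directory: remove recursively
        else:
            child.unlink()  # regular file or symlink
        removed += 1
    return removed
```

Run it with the processed site dir as `target` and your uploads folder as `must_be_inside`, then start the export. The trade-off is the one mentioned above: with the directory emptied, every run has to regenerate the whole site, so partial crawls/deploys no longer save any work.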