Static WordPress Community

404 error when crawling sitemap

Hi, I’m getting the following 404 error at the start of the crawl:

wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:10 +0000] "GET /robots.txt HTTP/1.1" 200 362 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:10 +0000] "GET /robots.txt HTTP/1.1" 200 362 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:10 +0000] "GET /wp-sitemap.xml HTTP/1.1" 200 690 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /wp-sitemap-posts-post-1.xml HTTP/1.1" 200 446 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /wp-sitemap-posts-page-1.xml HTTP/1.1" 200 468 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /wp-sitemap-taxonomies-category-1.xml HTTP/1.1" 200 443 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /wp-sitemap-users-1.xml HTTP/1.1" 200 433 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /wp-sitemap.xml HTTP/1.1" 200 690 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /http://lvh.me/wp-sitemap.xml HTTP/1.1" 301 359 "-" "WP2Static.com"
wordpress_1  | 127.0.0.1 - - [29/Jan/2021:10:50:11 +0000] "GET /http:/lvh.me/wp-sitemap.xml HTTP/1.1" 404 5583 "-" "WP2Static.com"
wordpress_1  | [Fri Jan 29 10:50:11.223970 2021] [php:error] [pid 85] [client 172.20.0.1:45040] PHP Fatal error:  Uncaught WP2StaticGuzzleHttp\\Exception\\ClientException: Client error: `GET http://lvh.me/http://lvh.me/wp-sitemap.xml` resulted in a `404 Not Found` response:\n<!doctype html>\n<html lang="en-US" >\n<head>\n\t<meta charset="UTF-8" />\n\t<meta name="viewport" content="width=device-width (truncated...)\n in /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticguzzle/src/Exception/RequestException.php:113\nStack trace:\n#0 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticguzzle/src/Middleware.php(69): WP2StaticGuzzleHttp\\Exception\\RequestException::create(Object(WP2StaticGuzzleHttp\\Psr7\\Request), Object(WP2StaticGuzzleHttp\\Psr7\\Response), NULL, Array, NULL)\n#1 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(204): WP2StaticGuzzleHttp\\Middleware::WP2StaticGuzzleHttp\\{closure}(Object(WP2StaticGuzzleHttp\\Psr7\\Response))\n#2 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(153): WP2StaticGuzzleHttp\\Promise\\Promise::callHandler(1, Object(WP2StaticGuzzleHttp\\Psr7\\Response), NULL)\n#3 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/TaskQueue.php(48): WP2StaticGuzzleHttp\\Promise\\Promise::WP2StaticGuzzleHttp\\Promise\\{closure}()\n#4 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(248): WP2StaticGuzzleHttp\\Promise\\TaskQueue->run(true)\n#5 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(224): WP2StaticGuzzleHttp\\Promise\\Promise->invokeWaitFn()\n#6 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(269): WP2StaticGuzzleHttp\\Promise\\Promise->waitIfPending()\n#7 
/var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(226): WP2StaticGuzzleHttp\\Promise\\Promise->invokeWaitList()\n#8 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(62): WP2StaticGuzzleHttp\\Promise\\Promise->waitIfPending()\n#9 /var/www/html/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticguzzle/src/Client.php(187): WP2StaticGuzzleHttp\\Promise\\Promise->wait()\n#10 /var/www/html/wp-content/plugins/wp2static/src/SitemapParser.php(236): WP2StaticGuzzleHttp\\Client->request('GET', 'http://lvh.me/h...', Array)\n#11 /var/www/html/wp-content/plugins/wp2static/src/SitemapParser.php(182): WP2Static\\SitemapParser->getContent()\n#12 /var/www/html/wp-content/plugins/wp2static/src/DetectSitemapsURLs.php(104): WP2Static\\SitemapParser->parse('http://lvh.me/h...')\n#13 /var/www/html/wp-content/plugins/wp2static/src/URLDetector.php(71): WP2Static\\DetectSitemapsURLs::detect('http://lvh.me/')\n#14 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(633): WP2Static\\URLDetector::detectURLs()\n#15 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(743): WP2Static\\Controller::wp2staticHeadless()\n#16 /var/www/html/wp-includes/class-wp-hook.php(285): WP2Static\\Controller::wp2staticRun()\n#17 /var/www/html/wp-includes/class-wp-hook.php(311): WP_Hook->apply_filters('', Array)\n#18 /var/www/html/wp-includes/plugin.php(484): WP_Hook->do_action(Array)\n#19 /var/www/html/wp-admin/admin-ajax.php(184): do_action('wp_ajax_wp2stat...')\n#20 {main}\n\nNext WP2Static\\WP2StaticException: Unable to fetch URL contents in /var/www/html/wp-content/plugins/wp2static/src/SitemapParser.php:239\nStack trace:\n#0 /var/www/html/wp-content/plugins/wp2static/src/SitemapParser.php(182): WP2Static\\SitemapParser->getContent()\n#1 /var/www/html/wp-content/plugins/wp2static/src/DetectSitemapsURLs.php(104): WP2Static\\SitemapParser->parse('http://lvh.me/h...')\n#2 
/var/www/html/wp-content/plugins/wp2static/src/URLDetector.php(71): WP2Static\\DetectSitemapsURLs::detect('http://lvh.me/')\n#3 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(633): WP2Static\\URLDetector::detectURLs()\n#4 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(743): WP2Static\\Controller::wp2staticHeadless()\n#5 /var/www/html/wp-includes/class-wp-hook.php(285): WP2Static\\Controller::wp2staticRun()\n#6 /var/www/html/wp-includes/class-wp-hook.php(311): WP_Hook->apply_filters('', Array)\n#7 /var/www/html/wp-includes/plugin.php(484): WP_Hook->do_action(Array)\n#8 /var/www/html/wp-admin/admin-ajax.php(184): do_action('wp_ajax_wp2stat...')\n#9 {main}\n\nNext WP2Static\\WP2StaticException: Unable to fetch URL contents in /var/www/html/wp-content/plugins/wp2static/src/DetectSitemapsURLs.php:125\nStack trace:\n#0 /var/www/html/wp-content/plugins/wp2static/src/URLDetector.php(71): WP2Static\\DetectSitemapsURLs::detect('http://lvh.me/')\n#1 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(633): WP2Static\\URLDetector::detectURLs()\n#2 /var/www/html/wp-content/plugins/wp2static/src/Controller.php(743): WP2Static\\Controller::wp2staticHeadless()\n#3 /var/www/html/wp-includes/class-wp-hook.php(285): WP2Static\\Controller::wp2staticRun()\n#4 /var/www/html/wp-includes/class-wp-hook.php(311): WP_Hook->apply_filters('', Array)\n#5 /var/www/html/wp-includes/plugin.php(484): WP_Hook->do_action(Array)\n#6 /var/www/html/wp-admin/admin-ajax.php(184): do_action('wp_ajax_wp2stat...')\n#7 {main}\n  thrown in /var/www/html/wp-content/plugins/wp2static/src/DetectSitemapsURLs.php on line 125, referer: http://lvh.me/wp-admin/admin.php?page=wp2static

I’m running WordPress locally in Docker, with only the WP2Static plugin installed and the default theme active.

The docker-compose.yml config I’m using:

version: '3.8'

services:
  wordpress:
    image: wordpress:5.6.0-php8.0-apache
    ports:
      - 80:80
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: wpuser
      WORDPRESS_DB_PASSWORD: wpuserpass
      WORDPRESS_DB_NAME: wpdatabase
    volumes:
      - wordpress:/var/www/html
    depends_on:
      - db

  db:
    image: mysql:8.0.23
    environment:
      MYSQL_DATABASE: wpdatabase
      MYSQL_USER: wpuser
      MYSQL_PASSWORD: wpuserpass
      MYSQL_RANDOM_ROOT_PASSWORD: wpdbrootpass
    volumes:
      - data:/var/lib/mysql

volumes:
  data:
  wordpress:

Any ideas what I’m doing wrong?

Thanks!

Hi @mrdaniel,

I think this was a sitemap-related issue that I recently fixed, but the fix hasn’t made its way into a published release yet. Let me give you a build to try; please report back on whether it fixes the problem:

https://wp2static.com/mrdanieltest.zip

btw, that docker-compose.yml should be fine for WP2Static. Thanks for including that kind of info, it’s always helpful for ruling other things out.

@leonstafford thanks for the updated zip. I tried it but I’m getting the same error.

I’ve been using the Static HTML Output plugin with better success, although I’m having some issues exporting the sitemap & feeds. I’ll open a thread in that forum.

I’m having the same problem. When I look in my apache logs I see:

PHP Fatal error: Uncaught WP2StaticGuzzleHttp\Exception\ClientException: Client error: GET https://deanhouseholder.com/https://deanhouseholder.com/sitemap.xml resulted in a 404 Not Found response:\n\n\n\n\t\n<meta http-equiv="X-UA-Compatible" content="IE=edge (truncated...)\n in /var/www/dean/deanhouseholder.com/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticguzzle/src/Exception/RequestException.php:113\nStack trace:\n#0 /var/www/dean/deanhouseholder.com/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticguzzle/src/Middleware.php(69): WP2StaticGuzzleHttp\Exception\RequestException::create()\n#1 /var/www/dean/deanhouseholder.com/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(204): WP2StaticGuzzleHttp\Middleware::WP2StaticGuzzleHttp\{closure}()\n#2 /var/www/dean/deanhouseholder.com/wp-content/plugins/wp2static/vendor/leonstafford/wp2staticpromises/src/Promise.php(153): WP2StaticGuzzleHttp\Promise\Promise::callHandler() in /var/www/dean/deanhouseholder.com/wp-content/plugins/wp2static/src/DetectSitemapsURLs.php on line 125, referer: https://deanhouseholder.com/wp-admin/admin.php?page=wp2static

The path it is trying to request is /https://deanhouseholder.com/sitemap.xml, so the full sitemap URL is being appended to the site URL as if it were a relative path. It shouldn’t have the initial slash.
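
For anyone trying to picture how a request like that comes about, here is a minimal Python sketch. It is not WP2Static’s code (the plugin is PHP, and I haven’t traced what its sitemap parser actually does); the function names naive_join and resolve are made up for illustration. It only shows that gluing an already-absolute sitemap URL onto the site URL reproduces the doubled URL from the log, while standard URL resolution leaves an absolute URL alone.

```python
from urllib.parse import urljoin

def naive_join(site_url, sitemap_ref):
    # Hypothetical buggy behaviour: treat every sitemap reference as a
    # path relative to the site root, even when it is already absolute.
    return site_url.rstrip("/") + "/" + sitemap_ref

def resolve(site_url, sitemap_ref):
    # Standard URL resolution: an absolute reference is returned as-is,
    # a relative one is resolved against the site URL.
    return urljoin(site_url, sitemap_ref)

site = "https://deanhouseholder.com/"
sitemap = "https://deanhouseholder.com/sitemap.xml"

print(naive_join(site, sitemap))       # the doubled URL seen in the log
print(resolve(site, sitemap))          # the sitemap URL, unchanged
print(resolve(site, "/sitemap.xml"))   # a root-relative reference also works
```

If the develop branch does something along the lines of naive_join, that would explain the 404, but that part is a guess on my side.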

I’ve tried this on the latest develop branch as well as the attached zip file above and both fail.

I switched to the master branch, generated a new .zip file, tested it, and (after increasing the memory limit) got it to work. So the bug is in the develop branch.