Web crawlers, also called web spiders or robots, can cause significant load on your website, especially if you are using a platform such as WordPress.
This blog contains some tricks to reduce the server load caused by crawlers through the robots.txt file, but we recommend that you consult a web designer/developer or SEO professional for more guidance on how to implement them effectively.
Here is a sample robots.txt which is WordPress friendly (similar rules can be applied to any website and platform).
Create this file on your computer and upload it via FTP or your control panel's file manager (in cPanel you can create it directly using the File Manager), then place it in the root folder of each of your domain names (the same folder where your main index.php / index.html file resides).
Simply copy/paste between the lines “—” below:
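A minimal sketch of such a file is shown below, based on the directives described in this post; the Crawl-delay value and the list of blocked bots are illustrative, so adjust them to your own needs (note that not every crawler honors Crawl-delay, and Google ignores it):

```
# Ask all bots to pause between requests (value in seconds)
User-agent: *
Crawl-delay: 10

# Block Baidu (Chinese search engine)
User-agent: Baiduspider
Disallow: /

# Block Yandex (Russian search engine)
User-agent: Yandex
Disallow: /

# Block Google's image crawler
User-agent: Googlebot-Image
Disallow: /
```
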
The above sample will slow down the search engines so that they don't aggressively scan your site all at once (this does not impact how often the search engines crawl your site). The code will also block some spiders such as Baiduspider (the Chinese search engine; disable it unless your site needs to be indexed in Chinese) and Yandex (the Russian search engine; leave it enabled if you have a Russian website or visitors from Russia), and it will prevent the Google Image bot from scanning your site. If you need any of these engines, simply remove its "User-agent" and "Disallow" lines and leave the remaining code.
On the Magento CMS platform, the same images may be automatically resized into many different folders. In this case, you should keep all of those folders, except the cache folder, away from search engine spiders. Use code like the following to reduce the server load caused by crawlers.
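As a sketch, assuming a default Magento layout where resized product images are generated under /media/catalog/product/cache/ (paths can differ between installations, so verify them on your own site; also note that the Allow directive is honored by major crawlers such as Google and Bing but is not part of the original robots.txt standard):

```
User-agent: *
# Keep spiders out of the auto-generated image folders...
Disallow: /media/catalog/
# ...but let them reach the resized-image cache
Allow: /media/catalog/product/cache/
```
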
Usually with a WordPress CMS you don't want spiders accessing the admin sections, so the first user-agent/disallow set of a typical robots.txt file applies to all bots and tells them not to crawl the administration folders, like this:
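A typical first block looks like the following; the Allow line for admin-ajax.php is a common addition so that plugins relying on AJAX keep working for crawlers, and you can omit it if you prefer:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```
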
In the case of ecommerce websites, you can block the following unwanted sections through robots.txt:
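For example, cart, checkout, and on-site search pages are common candidates; the exact paths below are assumptions that depend on your platform, so replace them with the URLs your store actually uses:

```
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /wishlist/
Disallow: /search/
```
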
Besides the above examples, you should also block the my-account section, if your site has one. I hope this blog helps you improve your server response time through the robots.txt file.