FIX: phpBB forum has high bot/spider server load and traffic


Post by Santeri » 2024-5-12 07:07

If bots are not all properly added under Spiders/Robots in the phpBB Administration Control Panel (ACP) GENERAL/QUICK ACCESS settings, they will cause excessive load and traffic on the web server. This happens because an undetected bot gets every single URL with a different session ID (SID) and therefore considers each one a different page. It keeps crawling those pages and finding new ones all the time. When a bot is properly detected, phpBB serves it pages without SIDs and also prevents access to pages like cron, where there is absolutely nothing useful to find.
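To illustrate why a crawler treats each session URL as a new page, here is a minimal Python sketch. The URLs and the helper name strip_sid are made up for illustration, and it assumes the session ID travels in a "sid" query parameter.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_sid(url):
    """Return the URL with any `sid` query parameter removed."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "sid"
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URLs an undetected bot might collect for one and the same topic.
crawled = [
    "https://forum.example.com/viewtopic.php?t=42&sid=aaa111",
    "https://forum.example.com/viewtopic.php?t=42&sid=bbb222",
    "https://forum.example.com/viewtopic.php?t=42",
]

# Compared as plain strings, every URL looks like a separate page.
distinct_as_crawled = len(set(crawled))

# With the SID stripped, they all collapse into a single page.
distinct_pages = len({strip_sid(url) for url in crawled})

print(distinct_as_crawled, distinct_pages)
```

In this sketch the bot has three URLs in its queue but only one real page behind them, and every new session adds more.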

As a fix, you can add a generic bot to your forum that will catch all current and new bots which are not already specified in Spiders/Robots. Simply add a new bot there using the following values:

Bot name: Unspecified bot
Agent match: bot
This will make sure that all bots that have the string "bot" present in their HTTP User-Agent header are detected as bots. I proposed adding this configuration to phpBB by default, but they seem to have deleted my account and the idea, so it is not likely to become part of phpBB any time soon: https://www.phpbb.com/community/viewtop ... #p16011421
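The effect of the "bot" agent match can be sketched roughly like this in Python. This is not phpBB's own code: the function name matches_agent is hypothetical, the matching is assumed to behave like a case-insensitive substring test, and the second User-Agent string is an invented example.

```python
def matches_agent(agent_match, user_agent):
    """Case-insensitive substring test (an assumed approximation of agent matching)."""
    return agent_match.lower() in user_agent.lower()


samples = [
    # A typical crawler User-Agent that contains "bot".
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    # Hypothetical crawler whose User-Agent does not contain "bot".
    "Mozilla/5.0 (compatible; ExampleCrawler/1.0)",
    # An ordinary browser.
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0) Gecko/20100101 Firefox/125.0",
]

for user_agent in samples:
    print(matches_agent("bot", user_agent), user_agent)
```

Under these assumptions only the first User-Agent is treated as a bot. A crawler that does not announce itself with "bot" in the header would still need its own entry.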

A quick experiment showed that catching all bots can reduce the load by 90% and traffic by 50%. This can be a viable alternative to blocking all bots from visiting the forum. The problem remains how to find and mark all current and future bots, because if any of them goes undetected, it will start indexing URLs with SIDs.

Happy hacking,

Santeri