If bots are not properly added to Spiders/Robots in the phpBB Administration Control Panel (ACP) under General > Quick Access, they will cause excessive load and traffic on the web server. This happens because an unrecognised bot receives every URL with a different session ID (SID) appended, so it treats the same page reached under different SIDs as different pages and keeps crawling them, endlessly discovering "new" pages. When a bot is properly detected, phpBB returns pages without SIDs and also blocks access to pages like cron where there is absolutely nothing useful to index.
As a fix, you can add a generic bot entry to your forum that catches all current and future bots not already listed in Spiders/Robots. Simply add a new bot there using the following values:
Code: Select all
Bot name: Unspecified bot
Agent match: bot
This treats every client whose HTTP User-Agent header contains the string "bot" as a bot, so "Googlebot", "bingbot", and the like are all caught; the sketch below illustrates the idea. I proposed adding this configuration to phpBB by default, but they seem to have deleted my account and the idea, so it is not likely to become part of phpBB any time soon:
https://www.phpbb.com/community/viewtop ... #p16011421
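For illustration only, here is a minimal Python sketch of the matching idea, assuming a plain case-insensitive substring test (phpBB's real comparison lives in its session code and also supports wildcards, so treat this as an approximation):

Code: Select all
def is_bot(user_agent: str, agent_match: str = "bot") -> bool:
    """Approximate the ACP Agent match: case-insensitive substring test."""
    return agent_match.lower() in user_agent.lower()

print(is_bot("Mozilla/5.0 (compatible; Googlebot/2.1)"))    # True
print(is_bot("Mozilla/5.0 (compatible; bingbot/2.0)"))      # True
print(is_bot("Mozilla/5.0 (Windows NT 10.0) Firefox/115"))  # False: human browser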
A quick experiment showed that catching all bots this way can reduce server load by about 90% and traffic by about 50%. This can be a viable alternative to blocking all bots from visiting the forum. The remaining problem is how to find and mark all current and future bots, because any bot that goes undetected will start indexing SIDs.
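One way to spot undetected bots is to scan the web server access log for requests that still carry a SID but whose User-Agent does not contain "bot". A rough Python sketch, assuming the common combined log format and a hypothetical log path:

Code: Select all
import re
from collections import Counter

LOG = "/var/log/apache2/access.log"  # hypothetical path; adjust for your server

# Combined log format: ... "GET /path HTTP/1.1" status size "referer" "user-agent"
LINE = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+) HTTP/[^"]*" \d+ \S+ "[^"]*" "(?P<ua>[^"]*)"')

suspects = Counter()
with open(LOG, encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = LINE.search(line)
        if not m:
            continue
        # A sid in the URL means phpBB did not recognise the client as a bot;
        # if its User-Agent also lacks "bot", it needs a Spiders/Robots entry.
        if "sid=" in m.group("path") and "bot" not in m.group("ua").lower():
            suspects[m.group("ua")] += 1

for ua, hits in suspects.most_common(20):
    print(f"{hits:6d}  {ua}")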
Happy hacking,
Santeri