According to the list at http://www.useragentstring.com/pages/Browserlist/ (http://www.useragentstring.com/pages/useragentstring.php?typ=Browser), which has over 9000 user agent strings from various browsers:
- 0 browser user agent strings contain the word "bot"
- 2 browser user agent strings contain the word "crawl"
- 0 browser user agent strings contain the word "spider"
(The two that contain "crawl" are "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; YComp 5.0.2.6; MSIECrawler)" and "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MSIECrawler)". I think it is safe to disregard those.)
According to the list at http://www.useragentstring.com/pages/Crawlerlist/ (http://www.useragentstring.com/pages/useragentstring.php?typ=Crawler), which has 442 user agent strings listed as bots:
- 208 bot user agent strings contain the word "bot"
- 63 bot user agent strings contain the word "crawl"
- 37 bot user agent strings contain the word "spider"
- 282 bot user agent strings contain at least one of "bot", "crawl" or "spider"
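My conclusion: it is safe to filter bots by checking the user agent string for the words "bot", "crawl" and "spider". Almost no real browser matches them (2 out of over 9000), and they catch 282 of the 442 listed bots (about 64%). It's not bulletproof, but it is definitely better than nothing.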
Note: All keyword searches were case-insensitive.
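For illustration, the keyword check described above could be sketched in Python roughly like this. The names `BOT_PATTERN` and `looks_like_bot` are made up for this example, and treating an empty or missing user agent as "not a bot" is my own assumption, not something the lists above tell you:

```python
import re

# The three keywords from the counts above, matched case-insensitively.
BOT_PATTERN = re.compile(r"bot|crawl|spider", re.IGNORECASE)


def looks_like_bot(user_agent):
    """Return True if the user agent contains "bot", "crawl" or "spider"."""
    if not user_agent:
        # Assumption: a missing or empty header is not flagged as a bot.
        return False
    return BOT_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; MSIECrawler)",
    ]
    for ua in samples:
        print(looks_like_bot(ua), ua)
```

Note that the last sample is one of the two "MSIECrawler" browser strings mentioned above, so it gets flagged too. If that matters for your use case, you would need to whitelist it separately.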