Bad Bots Steal Accounts, Content and Skew the Web Ecosystem
Bad bots are a continuing problem. Good bots, such as the search engine crawlers Googlebot and Bingbot, perform benign activities and are welcome. Bad bots, those that scrape and steal content, mine for competitive data, or carry out credential stuffing, ad fraud, transaction fraud and more, are not.
In its latest analysis of hundreds of billions of bad bot requests detected during 2018, Distil Networks examines the primary focus of current bad bot activity and its targets. The analysis demonstrates that the bad bot threat is little different from other threats: as defenses improve, the bot masters adapt their methods in a continuing game of cat and mouse.
Bot activity on websites declined in 2018 as a percentage of overall traffic. Good bots declined by 14.4%, while bad bots declined by 6.4%. This does not mean, however, that bad bot activity declined in absolute terms. A smaller share of a growing volume of traffic can still amount to more requests, and the steeper decline in good bots could indicate that bad bot activity actually increased.
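The distinction between share of traffic and absolute volume can be shown with a short calculation. This is a minimal sketch: the traffic totals, the starting share and the 20% growth rate are hypothetical figures chosen for illustration and do not come from the Distil report; only the 6.4% decline is taken from the text, and it is assumed here to be a relative decline in share.

```python
# Hypothetical figures for illustration only; none of the totals or the
# growth rate come from the Distil report.
def absolute_requests(total_requests: int, share: float) -> float:
    """Requests attributable to a traffic category, given its share of the total."""
    return total_requests * share

prev_total = 1_000_000               # assumed total requests in the earlier year
prev_share = 0.20                    # assumed bad bot share in the earlier year
curr_total = prev_total * 120 // 100     # assumed 20% growth in total traffic
curr_share = prev_share * (1 - 0.064)    # share falls by 6.4% (assumed relative)

prev_bad = absolute_requests(prev_total, prev_share)
curr_bad = absolute_requests(curr_total, curr_share)

# The share went down, yet the absolute number of bad bot requests went up.
print(f"share: {prev_share:.4f} -> {curr_share:.4f}")
print(f"bad bot requests: {prev_bad:.0f} -> {curr_bad:.0f}")
```

Under these assumptions the bad bot share falls while the number of bad bot requests rises, because the total grew faster than the share shrank.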
Distil Networks defines three levels of bad bot: simple, moderate and sophisticated. The latter two categories together make up ‘advanced persistent bots’ (APBs), which account for 73.6% of all bad bots. APBs cycle through random IP addresses, enter through anonymous proxies, change their identities and mimic human behavior, which makes them the most difficult to detect and block.
Bad bots are used for a wide variety of purposes. Some are targeted against all industries; others focus on specific industries. Credential stuffing bots, for example, affect all websites. “Bots are used by criminals to test the viability of stolen credentials,” warns Distil. “Every new data breach sees an increased availability of credentials and leads to higher volumes of bad bot traffic. With over 14 billion credentials stolen since 2013, the problem is already significant — and only getting worse.”
The ticketing sector is the most targeted sector for APBs — as it was in 2017. Several countries have enacted or plan to enact legislation to prevent ticketing bots, but with little enforcement. These bots are used to test the availability of tickets for major events, and to allow organizations and individuals to find and buy tickets at the official price to later resell at inflated prices.
The financial sector is the top target for all categories of bad bot. Typically, bots will attempt to access user accounts for financial fraud. However, investment firms are increasing their use of web scraping bots to improve their investment and trading performance. Distil explains, “the financial investment sector also deploys bots to scrape for information such as inventory levels and pricing data. Sometimes known as alternative data, this information is used by hedge funds to make investment decisions.”
In February 2019, Opimas reported that “web scraping for investments already accounts for an astounding 5% of all web traffic and that this activity will continue to grow rapidly in coming years.” Opimas also estimated that hedge funds will pay $2 billion in 2020 to collect and store data scraped from websites.
Gambling and gaming industries also suffer from extensive bad bot activity. Gambling sites face relentless scraping of their ever-changing betting lines. As online gaming is increasingly monetized, gaming sites attract credential stuffing bots seeking access to accounts that hold money or game items that can be stolen and sold.
Airlines are another popular target, with 25.9% of their traffic coming from bad bots. Prices are scraped by competitors while seat availability is scraped by the travel ecosystem. This is made worse by credential stuffing by criminals seeking access to user accounts in order to steal accumulated air mile balances.
E-commerce sites experience both ‘good’ bots and bad bots. “Some of the more virtuous price comparison sites,” Edward Roberts, senior director of product marketing at Distil, told SecurityWeek, “have agreements with subject websites that will allow, for example, one ‘scrape’ per day.” But that still leaves a range of bad bots that indulge in content scraping, account takeovers, credit card fraud and gift card abuse. Overall, e-commerce accounts for 18% of the bad bot traffic.
Bad bots come from a variety of sources. Credential stuffing bots are clearly criminal in intent: they are operated by or for cyber criminals trying to gain illegal access to user accounts. Content scraping tools are more of a grey area, but are unethical even where they are not strictly illegal. They can affect any content website, sometimes simply to steal content and republish it elsewhere in breach of copyright. Competitors use bad bots to keep an eye on prices so that they can undercut rivals and steal business.
Bad bot activity doesn’t merely distort legitimate competition and provide access to victim accounts; it skews the entire web ecosystem. Owners of websites that experience a sudden increase in traffic may take it as proof of increasing popularity when it is actually a bad bot attack.
The problem for all site owners is that it is difficult to detect and block such traffic. Because it targets the application layer and is (in the case of the APBs) designed to mimic genuine human activity, there is little that can be done in-house to prevent it. Distil recommends a few basic options, such as blocking or using CAPTCHA with older browser versions, including anything that has reached end of life or is more than three years old.
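The browser-age rule can be sketched as a simple lookup. This is only an illustration under stated assumptions: the browser name, version numbers, release dates and end-of-life set below are hypothetical placeholders, and a real deployment would need a release table maintained from vendor announcements as well as its own user-agent parsing.

```python
from datetime import date

# Hypothetical release data keyed by (browser, major version). These entries
# are placeholders, not real browser release dates.
RELEASE_DATES = {
    ("examplebrowser", 50): date(2015, 6, 1),
    ("examplebrowser", 70): date(2018, 9, 1),
}
# Hypothetical set of versions the vendor no longer supports.
END_OF_LIFE = {("examplebrowser", 50)}

MAX_AGE_DAYS = 3 * 365  # "more than three years old"

def action_for_browser(browser: str, major: int, today: date) -> str:
    """Return "captcha" for end-of-life or over-age versions, else "allow"."""
    key = (browser.lower(), major)
    if key in END_OF_LIFE:
        return "captcha"
    released = RELEASE_DATES.get(key)
    if released is None:
        # Unknown version: assumed policy is to defer to other checks.
        return "allow"
    age_days = (today - released).days
    return "captcha" if age_days > MAX_AGE_DAYS else "allow"
```

Whether an outdated browser is blocked outright or only challenged is a policy choice; the sketch challenges, since some genuine visitors still run old software.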
Certain data centers, such as Digital Ocean, GigaNET, OVH Hosting and Choopa LLC, can also be blocked since they are often the source of the less sophisticated bots. And, of course, geo-blocking of areas such as Russia and China can be used if the site doesn’t rely on or expect traffic from such regions. Failed log-ins should be monitored in case they indicate a bot attack. Beyond this, a bot mitigation service should be considered. Here, a team of bot experts is constantly engaged in monitoring the new techniques used by the bot operators, and developing new methods to detect and block them. What is clear, however, is that the bad bot threat should not be ignored.
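Monitoring failed log-ins can be as simple as counting failures per source over a sliding time window. The following is a minimal sketch rather than a recommended implementation: the class name, the threshold and the window length are assumptions made for illustration, and they are not taken from Distil's guidance.

```python
from collections import defaultdict, deque

class FailedLoginMonitor:
    """Flags a source when failed log-ins in a sliding window exceed a threshold."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        # Both defaults are illustrative assumptions, not recommended values.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # source -> failure timestamps

    def record_failure(self, source: str, timestamp: float) -> bool:
        """Record one failure; return True if the source is over the threshold."""
        events = self._failures[source]
        events.append(timestamp)
        # Drop failures that have aged out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) > self.threshold
```

Keying on the source address alone will miss bots that rotate addresses, as APBs do, so counting failures per account or across the whole site is a natural variation on the same idea.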