Rumors are swirling about billions of Clubhouse and Facebook user records being scraped and sold on the dark web. But is that really what's happening?
Multiple reports of 3.8 billion Clubhouse and Facebook user records up for sale on the dark web are causing a lot of concern and confusion. Right now, it’s too early to tell whether this data is legitimate. Regardless of the veracity of these reports, they are a reminder that there was a verified claim of 1.3 million Clubhouse users’ public information being scraped and publicly released in April 2021, and that records for half a billion Facebook users and half a billion LinkedIn users were scraped around the same time. That information remains out there, and attackers can already use it. It can also be combined to create fuller profiles that attackers can use for spam, phishing, and account takeover attacks.
These facts alone should be enough to prompt you to take action to better protect yourself from these confirmed data leaks, if you haven’t already. Here’s what you can do to help protect yourself from what’s known to be out there and, if this latest claim should turn out to be accurate, from it as well.
This latest claim appears to build on two other Clubhouse data leak claims, one in April 2021 and one in July 2021. The April 2021 claim has been verified; the July 2021 claim has seemingly been debunked.
In the spring of 2021, the information of 1.3 million Clubhouse users was posted online. Clubhouse didn’t deny the data was genuine but said at the time that it was the result not of a hack but of “scraping” of public information. We’ve previously written to explain how data scraping works and how it’s at the heart not only of the Clubhouse data leak, but also of data leaks from around the same time affecting half a billion Facebook users and half a billion LinkedIn users.
Where things get murkier is a claim made in July 2021 that 3.8 billion Clubhouse users’ phone numbers were for sale on the dark web.
This claim was dismissed at the time by security researchers who examined the data and declared the claims untrue, the result of dark web marketing hype by those trying to sell it.
This most recent development appears to be the result of someone taking the data from the July 2021 claim and saying they’ve combined it with data from the April 2021 Facebook data leak. This combined data is being offered for sale on the dark web, much like the discredited July 2021 data was.
Most importantly, this most recent claim has not been independently verified, and we cannot yet confirm whether it is legitimate. The fact that it’s based on data that experts have previously declared bogus casts doubt on the veracity of the claim. The fact that this data is for sale on the sorts of dark web forums where we’ve previously seen overinflated claims about the same Clubhouse data it’s supposedly based on is cause for even greater suspicion.
Taken all together, this means that we just can’t be sure right now whether the current claims are true. It also means that the legitimate April 2021 scraped public data from Clubhouse, Facebook, and LinkedIn is still out there and can be combined to form a single data set like this, which gives this aspect of the claim plausibility. However, at most, that data combined would total just over 1 billion records (1,300,000 Clubhouse + 500,000,000 Facebook + 500,000,000 LinkedIn), not the 3.8 billion that’s being claimed now.
Even if the current claim is bogus, that’s still a lot of user data that’s out there, data that can be combined to form fuller profiles for attackers. (Assuming that hasn’t already happened and we just haven’t heard about it yet.)
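To make the gap between the verified figures and the advertised one concrete, here is a minimal sketch of the arithmetic using only the record counts cited above. The variable names are illustrative, and the Facebook and LinkedIn figures are the rounded "half a billion" numbers from this article, not exact counts.

```python
# Verified April 2021 scraped record counts, as cited in this article.
# Facebook and LinkedIn are rounded "half a billion" figures (approximate).
verified_records = {
    "Clubhouse": 1_300_000,
    "Facebook": 500_000_000,
    "LinkedIn": 500_000_000,
}

# The unverified figure being advertised on the dark web.
claimed_total = 3_800_000_000

verified_total = sum(verified_records.values())
unaccounted = claimed_total - verified_total
ratio = claimed_total / verified_total

print(f"Verified total: {verified_total:,}")
print(f"Claimed total:  {claimed_total:,}")
print(f"Unaccounted:    {unaccounted:,}")
print(f"The claim is roughly {ratio:.1f}x the verified total")
```

Under these figures, the verified leaks add up to 1,001,300,000 records, which leaves roughly 2.8 billion of the claimed records unaccounted for by any confirmed leak.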
This means that even if the recent 3.8 billion Clubhouse and Facebook user data claim is bogus, there is known data already out there that continues to pose the same risk to users that it has since it leaked in April 2021: an increased risk of email and SMS spam, phishing attacks, and account takeover attacks. With this data, attackers can target any type of account, but the most concerning ones would be email accounts and financial accounts like banking. In particular, if attackers can pair a mobile phone number with a specific person, they can try to carry out “SIM swapping” attacks, which can be used to defeat the two-factor authentication that protects email and financial accounts.
First, as we note in our blog about the Clubhouse, Facebook, and LinkedIn data scraping leaks, remember that information that is public, even if it’s on a social media site, can be scraped, collated, and compiled into large databases that can then be sold on the dark web. So everyone should consider setting their profiles to “not public” or be very, very careful about what information they share publicly.
Second, as we note in our blog on SIM swapping, you can protect against SIM swapping by using authentication apps rather than SMS text messages for your second security factor wherever possible.
Third, you can better protect yourself from SIM swapping if your mobile carrier allows you to put blocks against it, so that SIM swaps only happen with your explicit authorization. Not all carriers have this option, but if your carrier does, you should look into it.
Finally, the key takeaway from the current murky claim is that there is a kernel of truth underlying it. Whether or not there are 3.8 billion actual Clubhouse and Facebook user records for sale, we know that data for more than 1 billion Clubhouse, Facebook, and LinkedIn users has been out there since April 2021.
If you haven’t taken steps to better protect yourself from those known, legitimate data leaks, now is the time to do so. And anything you do to protect against those legitimate data leaks will automatically help protect you if this latest claim turns out to be true.
Untangling the data scraping claim
What does it mean?
What you should do to protect yourself from data scraping