RSS preview of Troy Hunt

Rss preview of Blog of Troy Hunt

Weekly Update 484

2025-12-28 17:33:52

I think the start of this week's video really nailed it for the techies amongst us: shit doesn't work, you change something random and now shit works and yu have no idea why 🤷‍♂️ Such was my audio this week and apoligise to those of you watching the video below for the first few mins (although I managed to clean up the audio-only podcast version). Ironically, doing things non-standard at home was intended to iron out the creases before the impending travel so... a week from now when I do this with Scott Helme from Duabi it'll all be fine! Let's see 🤞

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing

Weekly Update 483

2025-12-20 14:31:41

Building out an IoT environment is a little like the old Maslow's Hierarchy of Needs. All the stuff on the top is only any good if all the stuff on the bottom is good, starting with power. This week, I couldn't even get that right, but thankfully, sparky to rescue and ensuite underfloor heating disconnected, and we now have reliable power again. On top of that is the layer that has increasingly been my nemesis - the network. Two days after recording, I've just spent the better part of the entire day making a much more concerted effort to adjust channel and power settings on APs, lock clients that don't move to the APs that make the most sense, and generally just screw around with it until stuff worked. And then I turned off a circuit, turned it back on again, and all hell broke loose 😭

References

Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.

Weekly Update 482

2025-12-17 06:52:14

Perhaps it's just the time of year where we all start to wind down a bit, or maybe I'm just tired after another massive 12 months, but this week's vid is way late. Ok, going away to the place that had just been breached (ironic!) didn't help, but I think in general the pace we've maintained this year just needs to come back a bit. That said, I'll try to get this week's and next week's out on time, then it's off on travels for the next four weeks after that. Stay tuned for more IoT problems in a few days from now 🤦‍♂️

References

Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
Spicers Retreats suffered a data breach they attributed back to an attack on the Mews reservation platform (timely, given we had a getaway booked there only a couple of days later)
We worked through 630 million more passwords provided by the FBI (that includes 46 million we've never seen before)
Hmmm... spam to a Qantas-only email address, wonder where that might have come from? (this should be impossible because there's an injunction in place 🤦‍♂️)

Processing 630 Million More Pwned Passwords, Courtesy of the FBI

2025-12-13 05:29:39

The sheer scope of cybercrime can be hard to fathom, even when you live and breathe it every day. It's not just the volume of data, but also the extent to which it replicates across criminal actors seeking to abuse it for their own gain, and to our detriment.

We were reminded of this recently when the FBI reached out and asked if they could send us 630 million more passwords. For the last four years, they've been sending over passwords found during the course of their investigations in the hope that we can help organisations block them from future use. Back then, we were supporting 1.26 billion searches of the service each month. Now, it's... more:

Just as it's hard to wrap your head around the scale of cybercrime, I find it hard to grasp that number fully. On average, that service is hit nearly 7 thousand times per second, and at peak, it's many times more than that. Every one of those requests is a chance to stop an account takeover. But the real scale goes well beyond the API itself. Because the data model is open source and freely available, many organisations use the Pwned Passwords Downloader to take the entire corpus offline and query it directly within their own applications. That tool alone calls the API around a million times during download, but the resulting data is then queried… well, who knows how many times after that. Pretty cool, right?

This latest corpus of data came to us as a result of the FBI seizing multiple devices belonging to a suspect. The data appeared to have originated from both the open web and Tor-based marketplaces, Telegram channels and infostealer malware families. We hadn't seen about 7.4% of them in HIBP before, which might sound small, but that's 46 million vulnerable passwords we weren't giving people using the service the opportunity to block. So, we've added those and bumped the prevalence count on the other 584 million we already had.

We're thrilled to be able to provide this service to the community for free and want to also quickly thank Cloudflare for their support in providing us with the infrastructure to make this possible. Thanks to their edge caching tech, all those passwords are queryable from a location just a handful of milliseconds away from wherever you are on the globe.

If you're hitting the API, then all the data is already searchable for you. If you're downloading it all offline, go and grab the latest data now. Either way, go forth and put it to good use and help make a cybercriminal's day just that much harder 😊

Weekly Update 481

2025-12-05 15:14:33

Twelve years (and one day) since launching Have I Been Pwned, it's now a service that Charlotte and I live and breathe every day. From the first thing every morning to the last thing each day, from holidays to birthdays, in sickness and in heal... wait a minute - did we marry each other or a data breach service?! We decided to do a 12th-birthday special together today to give everyone a bit more insight into what she does and what life is like running this service. It's a different weekly vid, and we really hope you enjoy watching it 😊

References

Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
Just because a "fake" email address is in HIBP, it doesn't mean HIBP isn't accurately indexing data breaches (if it looks like an email address, it's an email address)

Why Does Have I Been Pwned Contain "Fake" Email Addresses?

2025-12-04 07:37:06

Normally, when someone sends feedback like this, I ignore it, but it happens often enough that it deserves an explainer, because the answer is really, really simple. So simple, in fact, that it should be evident to the likes of Bruce, who decided his misunderstanding deserved a 1-star Trustpilot review yesterday:

Why Does Have I Been Pwned Contain "Fake" Email Addresses?

Now, frankly, Trustpilot is a pretty questionable source of real-world, quality reviews anyway, but the same feedback has come through other channels enough times that let's just sort this out once and for all. It all begins with one simple question:

What is an Email Address?

You think you know - and Bruce thinks he knows - but you might both be wrong. To explain the answer to the question, we need to start with how HIBP ingests data, and that really is pretty simple: someone sends us a breach (which is typically just text files of data), and we run the open source Email Address Extractor tool over it, which then dumps all the unique addresses into a file. That file is then uploaded into the system, where the addresses are then searchable.

The logic for how we extract addresses is all in that Github repository, but in simple terms, it boils down to this:

There must be an @ symbol
There can be up to 64 characters before it (the alias)
There can be up to 255 characters after it (the domain)
The domain must contain a period
The domain must also have a valid TLD
A few other little criteria that are all documented in the public repo

That is all! We can't then tell if there's an actual mailbox behind the address, as that would require massive per-address processing, for example, sending an email to each one and seeing if it bounces. Can you imagine doing that 7 billion times?! That's the number of unique addresses in HIBP, and clearly, it's impossible. So, that means all the following were parsed as being valid and loaded into HIBP (deep links to the search result):

I particularly like that last one, as it feels like a sentiment Bruce would express. It's also a great example as it's clearly not "real"; the alias is a bit of a giveaway, as is the domain ("foo" is commonly used as a placeholder, similar to how we might also use "bar", or combine them as "foo bar"). But if you follow the link and see the breach it was exposed in, you'll see a very familiar name:

Which brings us to the next question:

How Do "Fake" Email Addresses End up in Real Websites?

This is also going to seem profoundly simple when you see it. Here goes:

Any questions, Bruce? This is just as easily explainable as why we considered it a valid address and ingested it into HIBP: the email address has a valid structure. That is all. That's how it got into Adobe, and that's how it then flowed through into HIBP.

Ah, but shouldn't Adobe verify the address? I mean, shouldn't they send an email to the address along the lines of "Hey, are you sure you want to sign up for this service?" Yes, they should, but here's the kicker: that doesn't stop the email address from being added to their database in the first place! The way this normally works (and this is what we do with HIBP when you sign up for the free notification service) is you enter the email address, the system generates a random token, and then the two are saved together in the database. A link with the token is then emailed to the address and used to verify the user if they then follow that link. And if they don't follow that link? We delete the email address if it hasn't been verified within a few days, but evidently, Adobe doesn't. Most services don't, so here we are.

How Can I Be Really Sure Actual Fake Addresses Aren't in HIBP?

This is also going to seem profoundly obvious, but genuinely random email addresses (not "thisisfuckinguseless@") won't show up in HIBP. Want to test the theory? Try 1Password's generator (yes, Bruce, they also sponsor HIBP):

Now, whack that on the foo.com domain and do a search:

Huh, would you look at that? And you can keep doing that over and over again. You’ll get the same result because they are fabricated addresses that no one else has created or entered into a website that was subsequently breached, ipso facto proving they cannot appear in the dataset.

Conclusion

Today is HIBP's 12th birthday, and I've taken particular issue with Bruce's review because it calls into question the integrity with which I run this service. This is now the 218th blog post I've written about HIBP, and over the last dozen years, I've detailed everything from the architecture to the ethical considerations to how I verify breaches. It's hard to imagine being any more transparent about how this service runs, and per the above, it's very simple to disprove the Bruces of the world. If you've read this far and have an accurate, fact-based review you'd like to leave, that'd be awesome 😊

Troy HuntModify

Rss preview of Blog of Troy Hunt

References

References

References

References

What is an Email Address?

How Do "Fake" Email Addresses End up in Real Websites?

How Can I Be Really Sure Actual Fake Addresses Aren't in HIBP?

Conclusion

The author's social media

Troy Hunt Modify