

2025-08-31 18:28:10

Weekly Update 467

Using AI to analyse photos and send alerts if I've forgotten to take the bins out isn't going to revolutionise my life, no more so than using it to describe who's at the mailbox when a letter arrives and at the front door when they buzz. But that's really not the point; it's by playing with tech like this that firstly, you come to understand it better and secondly, you find genuinely impactful use cases. I keep scratching my head to try to work out where AI can do something really useful in HIBP and little exercises like throwing it into the home automation help get that part of the brain working. No epiphanies as yet, unfortunately. Got any good ideas?


References

  1. Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
  2. We're in Singapore! (I don't often wear a tux so it might be a while until you see a photo like that again 🤣)
  3. 107k email addresses from TheSqua.re breach went into HIBP (have a listen to how many goes I had at making contact with someone there...)
  4. Sending pictures from my Ubiquiti cams to AI APIs via Home Assistant redefines "smart home" (and will hopefully give me some good ideas for where we can make good use of it in HIBP)

家庭助理 + Ubiquiti + 人工智能 = 家庭自动化魔法

2025-08-27 15:37:51

Home Assistant + Ubiquiti + AI = Home Automation Magic

It seems like every manufacturer of anything electrical that goes in the house wants to be part of the IoT story these days. Further, they all want their own app, which means you have to go to gazillions of bespoke software products to control your things. And they're all - with very few exceptions - terrible:

[Screenshot: the curtain control app]

That's to control the curtains in my office and the master bedroom, but the hubs (you need two, because the range is rubbish) have stopped communicating.

[Screenshot: the spa app]

That one is for the spa, but it looks like the service it's meant to authenticate to has disappeared, so now, you can't.

[Screenshot: the Advantage Air app]

And my most recent favourite, Advantage Air, which controls the many tens of thousands of dollars' worth of air conditioning we've just put in. Yes, I'm on the same network, and yes, the touch screen has power and is connected to the network. I know that because it looks like this:

[Photo: the Advantage Air wall-mounted touch screen]

That might look like I took the photo in 2013, but no, that's the current generation app, complete with Android tablet now fixed to the wall. Fortunately, I can gleefully ignore it as all the entities are now exposed in Home Assistant (HA), then persisted into Apple Home via HomeKit Bridge, where they appear on our iThings. (Which also means I can replace that tablet with a nice iPad Mini running Apple Home and put the Android into the server rack, where it still needs to act as the controller for the system.)
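
As an aside, the HomeKit Bridge side of that is just a filter over which HA entities get exposed to Apple Home. It can be configured entirely from the UI, but a minimal configuration.yaml sketch looks something like this (the domains and entity names below are purely illustrative, not my actual setup):

homekit:
  - name: HA Bridge              # illustrative bridge name
    filter:
      include_domains:
        - climate                # e.g. the air conditioning zones
        - cover                  # e.g. the curtains
      include_entities:
        - switch.water_valve     # illustrative entity name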

Anyway, the point is that when you go all in on IoT, you're dealing with a lot of rubbish apps all doing pretty basic stuff: turn things on, turn things off, close things, etc. HA is great as it abstracts away the crappy apps, and now, it also does something much, much cooler than just all this basic functionality...

Start by thinking of the whole IoT ecosystem as simply being triggers and actions. Triggers can be based on explicit activities (such as pushing a button), observable conditions (such as the temperature in a room), schedules, events and a range of other things that can be used to kick off an action. The actions then include closing a garage door, playing an audible announcement on a speaker, pushing an alert to a mobile device and like triggers, many other things as well. That's the obvious stuff, but you can get really creative when you start considering devices like this:

[Photo: the Sonoff IoT water valve]

That's a Sonoff IoT water valve, and yes, it has its own app 🤦‍♂️ But because it's Zigbee-based, it's very easy to incorporate it into HA, which means now, the swag of "actions" at my disposal includes turning on a hose. Cool, but boring if you're just watering the garden. Let's do something more interesting instead:

[Photo: the valve inline with the hose, above the front wall]

The valve is inline with the hose which is pointing upwards, right above the wall that faces the road and has one of these mounted on it:

[Photo: the Ubiquiti G4 Pro doorbell]

That's a Ubiquiti G4 Pro doorbell (full disclosure: Ubiquiti has sent me all the gear I'm using in this post), and to extend the nomenclature used earlier, it has many different events that HA can use as triggers, including a press of the button. Tie it all together and you get this:

Not only does a press of the doorbell trigger the hose on Halloween, it also triggers Lenny Troll, who's a bit hard to hear, so you gotta lean in real close 🤣 C'mon, they offered "trick" as one of the options!
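
Conceptually, that Halloween automation is no more complicated than the sketch below: the doorbell press is the trigger, and the hose valve plus the audio clip are the actions. The valve, speaker and file names here are illustrative stand-ins rather than my actual entities:

- id: halloween_trick
  alias: Halloween doorbell trick
  mode: single
  trigger:
    - platform: state
      entity_id: binary_sensor.doorbell_ring      # the G4 Pro doorbell press
      to: "on"
  action:
    - service: switch.turn_on
      target:
        entity_id: switch.halloween_hose_valve    # illustrative name for the Sonoff valve
    - service: media_player.play_media
      target:
        entity_id: media_player.front_porch       # illustrative speaker entity
      data:
        media_content_id: media-source://media_source/local/lenny_troll.mp3   # illustrative file
        media_content_type: audio/mpeg
    - delay: "00:00:05"
    - service: switch.turn_off
      target:
        entity_id: switch.halloween_hose_valve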

Enough mucking around, let's get to the serious bits and, per the title, the AI components. I was reading through the new features of HA 2025.8 (they do a monthly release in this form), and thought the chicken counter example was pretty awesome. Counting the number of chickens in the coop is a hard problem to solve with traditional sensors, but if you've got a camera that takes a decent photo and an AI service to interpret it, suddenly you have some cool options. Which got me thinking about my rubbish bins:

[Photo: the red and yellow rubbish bins beside the house]

The red one has to go out on the road by about 07:00 every Tuesday (that's general rubbish), and the yellow one has to go out every other Tuesday (that's recycling). Sometimes, we only remember at the last moment and other times, we remember right as the garbage truck passes by, potentially meaning another fortnight of overstuffing the bin. But I already had a Ubiquiti G6 Bullet pointing at that side of the house (with a privacy blackout configured to avoid recording the neighbours), so now it just takes a simple automation:

- id: bin_presence_check
  alias: Bin presence check
  mode: single
  trigger:
    - platform: state
      entity_id: binary_sensor.laundry_side_motion
      to: "off"
      for:
        minutes: 1
  condition:
    - condition: time
      weekday:
        - mon
        - tue
  action:
    - service: ai_task.generate_data
      data:
        task_name: Bin presence check
        instructions: >-
          Look at the image and answer ONLY in JSON with EXACTLY these keys:
          - bin_yellow_present: true if a rubbish bin with a yellow lid is visible, else false
          - bin_red_present: true if a rubbish bin with a red lid is visible, else false
          Do not include any other keys or text.
        structure:
          bin_yellow_present:
            selector:
              boolean:
          bin_red_present:
            selector:
              boolean:
        attachments:
          media_content_id: media-source://camera/camera.laundry_side_medium
          media_content_type: image/jpeg
      response_variable: result
    - service: "input_boolean.turn_{{ 'on' if result.data.bin_yellow_present else 'off' }}"
      target:
        entity_id: input_boolean.yellow_bin_present
    - service: "input_boolean.turn_{{ 'on' if result.data.bin_red_present else 'off' }}"
      target:
        entity_id: input_boolean.red_bin_present

Ok, so it's a 40-line automation, but it's also pretty human-readable:

  1. When there's motion that's stopped for a minute...
  2. And it's a Monday or Tuesday...
  3. Create an AI task that requests a JSON response indicating the presence of the yellow and red bin...
  4. And attach a snapshot of the camera that's pointing at them...
  5. Then set the values of two input booleans

From that, I can then create an alert if the correct bin is still present when it should be out on the road. Amazing! I'd always wanted to do something to this effect but had assumed it would involve sensors on the bins themselves. Not with AI though 😊
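
The alert itself is just one more small automation; a sketch of one way to do it (not necessarily my exact implementation) reuses the input boolean set above:

- id: red_bin_reminder
  alias: Red bin hasn't gone out
  trigger:
    - platform: time
      at: "06:30:00"
  condition:
    - condition: time
      weekday:
        - tue
    - condition: state
      entity_id: input_boolean.red_bin_present
      state: "on"
  action:
    - service: notify.adult_iphones               # the same notify group used further down
      data:
        title: "Bin day!"
        message: "The red bin still hasn't gone out to the road"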

And then I started getting carried away. I already had a Ubiquiti AI LPR (that's a "license plate reader") camera on the driveway, and it just happened to be pointing towards the letter box. I've also had Zigbee-based Aqara door and window sensors (they're effectively reed switches) on the letter box for ages (one for where the letters go in, and one for the packages), and they announce the presence of mail via the in-ceiling Sonos speakers in the house. This is genuinely useful, and now, it's even better:

[Screenshot: the mail alert notification on the Apple Watch]

I screen-capped that on my Apple Watch whilst I was out shopping, and even though it was hard to make out the tiny picture on my wrist, I had no trouble reading the content of the alert. Here's how it works:

- id: letterbox_and_package_alert
  alias: Letterbox/Package alerts
  mode: single
  trigger:
    - id: letter
      platform: state
      entity_id: binary_sensor.letterbox
      to: "on"
    - id: package
      platform: state
      entity_id: binary_sensor.package_box
      to: "on"
  variables:
    event: "{{ trigger.id }}"  # "letter" or "package"
    title: >-
      {{ "You've got mail" if event == "letter" else "Package delivery" }}
    message: >-
      {{ "Someone just left you a letter" if event == "letter" else "Someone just dropped a package" }}
    tts_message: >-
      {{ "You've got mail" if event == "letter" else "You've got a package" }}
    file_prefix: "{{ 'letterbox' if event == 'letter' else 'package_box' }}"
    file_name: "{{ file_prefix }}_{{ now().strftime('%Y%m%d_%H%M%S') }}"
    snapshot_path: "/config/www/snapshots/{{ file_name }}.jpg"
    snapshot_url: "/local/snapshots/{{ file_name }}.jpg"
  action:
    - service: camera.snapshot
      target:
        entity_id: camera.driveway_medium
      data:
        filename: "{{ snapshot_path }}"
    - service: script.hunt_tts
      data:
        message: "{{ tts_message }}"
    - service: ai_task.generate_data
      data:
        task_name: "Mailbox person/vehicle description"
        instructions: >-
          Look at the image and briefly describe any person
          and/or vehicle standing near the mailbox. They must
          be immediately next to the mailbox, and describe
          what they look like and what they're wearing.
          Keep it under 20 words.
        attachments:
          media_content_id: media-source://camera/camera.driveway_medium
          media_content_type: image/jpeg
      response_variable: description
    - service: notify.adult_iphones
      data:
        title: "{{ title }}"
        message: "{{ (description | default({})).data | default('no description') }}"
        data:
          image: "{{ snapshot_url }}"
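
The script.hunt_tts referenced above isn't shown in this post; it's simply a script that speaks a message over the in-ceiling Sonos speakers. A minimal sketch of one (with illustrative TTS and speaker entity names) could look like this:

hunt_tts:
  alias: Announce a message on the in-ceiling speakers
  fields:
    message:
      description: The text to announce
  sequence:
    - service: tts.speak
      target:
        entity_id: tts.home_assistant_cloud            # illustrative TTS entity
      data:
        media_player_entity_id: media_player.kitchen   # illustrative Sonos speaker
        message: "{{ message }}"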

This is really helpful for figuring out which of the endless deliveries we seem to get are worth "downing tools" for to go out and retrieve the mail. Equally useful is the most recent use of an AI task, recorded just today (and shared with the subject's permission):

Like packages, we seem to receive endless visitors, and getting an idea of who's at the door before going anywhere near it is pretty handy. We do get video on the phone (and, as you can see, iPad), but that's not necessarily always at hand, and this way the kids have an idea of who it is too. Here's the code (the doorbell chime itself is played by a separate automation):

- id: doorbell_ring_play_ai
  alias: The doorbell is ringing, use AI to describe the person
  trigger:
    platform: state
    entity_id: binary_sensor.doorbell_ring
    to: 'on'
  action:
  - service: ai_task.generate_data
    data:
      task_name: "Doorbell visitor description"
      instructions: >-
        Look at the image and briefly describe how many people you see and what they're wearing, but don't refer to "the image" in your response.
        If they're carrying something, also explain that but don't mention it if they're not.
        If you can recognise what job they might be doing, please include this information too, but don't mention it if you don't know.
        If you can tell their gender or if they're a child, mention that too.
        Don't tell me anything you don't know, only what you do know.
        This will be broadcast inside a house so should be conversational, preferably summarised into a single sentence.
      attachments:
        media_content_id: media-source://camera/camera.doorbell
        media_content_type: image/jpeg
    response_variable: description
  - service: script.hunt_tts
    data:
      message: "{{ (description | default({})).data | default('I have no idea who is at the door') }}"

I've been gradually refining that prompt, and it's doing a pretty good job of it at the moment. Hear how the response noted his involvement in "detailing"? That's because the company logo on his shirt includes the word, and indeed, he was here to detail the cars.

This is all nerdy goodness that has blown hours of my time for what, on the surface, seems trivial. But it's by playing with technologies like this and finding unusual use cases for them that we end up building things of far greater significance. To bring it back to my opening point, IoT is starting to go well beyond the rubbish apps at the start of this post, and we'll soon be seeing genuinely useful, life-improving implementations. Bring on more AI-powered goodness for Halloween 2025!

Edit: I should have included this in the original article, but the ai_task service is using OpenAI, so all processing is done in the cloud, not locally on HA. That requires an API key and payment, although I reckon the pricing is pretty reasonable (and the vast majority of those requests are from testing):

[Screenshot: OpenAI API usage and cost]


2025-08-25 14:12:57

Weekly Update 466

I'm fascinated by the unwillingness of organisations to name the "third party" to which they've attributed a breach. The initial reporting on the Allianz Life incident from last month makes no mention whatsoever of Salesforce, nor does any other statement I can find from them. And that's very often the way with many other incidents too, which, IMHO, sucks. My view is that when our data is provided to a third party and that party exposes it, we have a very reasonable expectation to know who lost it. My own personal info was exposed in the Ticketek breach last year; can you find any mention whatsoever in that disclosure notice of Snowflake DB? Nope, but that's the "reputable, global third party supplier" they refer to. Another fun fact: the other third party they don't name is HIBP: "We are aware some customers have recently been contacted by a third party regarding the impact to their information". 🤷‍♂️


References

  1. Sponsored by: 1Password Extended Access Management: Secure every sign-in for every app on every device.
  2. Allianz Life was breached with 1.1 million unique email addresses affected (the unnamed third party is apparently Salesforce)
  3. The 16 million record PayPal "breach" always smelled bad (probably because it's not a PayPal breach!)


2025-08-17 06:06:35

Weekly Update 465

How much tech stuff do I have sitting there in progress, literally just within arm's reach? I kick off this week's video going through it, and it's kinda nuts. Doing renos and a house build doesn't help, but it means there's just a constant distraction of "things" commanding my attention. I couldn't even go through writing this very short blog post without feeling the need to see if I could pair that smoke alarm directly to ZHA on Home Assistant without needing the Clipsal hub; I couldn't, so now I have one more thing to troubleshoot 🤷‍♂️ Maybe it all says more about my attention span than anything...


References

  1. Sponsored by: Malwarebytes Browser Guard blocks phishing, ads, scams, and trackers for safer, faster browsing
  2. We're putting local gov advice in front of HIBP visitors (first NZ, and we added Aus just after I recorded this live stream)
  3. The headline trolling "16B password breach" is now in HIBP (at least the portal I was sent is, and it's in there under "Data Troll")


2025-08-14 03:46:03

That 16 Billion Password Story (AKA "Data Troll")

Spoiler: I have data from the story in the title of this post, it's mostly what I expected it to be, I've just added it to HIBP where I've called it "Data Troll", and I'm going to give everyone a lot more context below. Here goes:

Headlines one-upping each other on the number of passwords exposed in a data breach have become somewhat of a sport in recent years. Each new story wants to present a number that surpasses the previous story, and the clickbait cycle continues. You can see it coming a mile away, and you just know the reality is somewhat less than the headline, but how much less?

And so it was in June when a story with this title hit the headlines: 16 billion passwords exposed in record-breaking data breach. I thought this would be another standard run-of-the-mill sensational headline that would catch a few eyeballs for a couple of days then be forgotten, but no, apparently not. It started with a huge volume of interest in Have I Been Pwned:

[Chart: Google search interest in Have I Been Pwned]

That's Google searches for my "little" project, which I found odd, because we hadn't put any data in HIBP! But that initial story gained so much traction and entered the mainstream media to the extent that many publications directed people to HIBP, and inevitably, there was a bunch of searching done to figure out what the service actually was. And the news is still coming out - this story landed on AOL just last week:

[Screenshot: the AOL article]

You know it's serious because of all the red and exclamation marks... but per the article, "you don't need to panic" 🤷‍♂️

Enough speculating, let's get into what's actually in here, and for that, I went straight to the source:

Bob is a quality researcher who has been very successful over the years at sniffing out breached data, some of which had previously ended up in HIBP as a result of his good work. So we had a chat about this trove, and the first thing he made clear was that this isn't a single source of exposure, but rather different infostealer data sets that have been publicly exposed this year. The headlines implying this was a massive breach are misleading; stealer logs are produced from individually compromised machines and occasionally bundled up and redistributed. Bob also pointed out that many of the data sets were no longer exposed, and he didn't have a copy of all of them. But he did have a subset of the data he was happy to send over for HIBP, so let's analyse that.

All told, the data Bob sent contained 10 JSON files totalling 775GB across 2.7B rows. An initial cursory check against HIBP showed more than 90% of the email addresses were already in there, and of those that were in previous stealer logs, there was a high correlation of matching website domains. What I mean by this is that if the data Bob sent had someone's email address and password captured when logging into Netflix and Spotify, that person was probably already in HIBP's stealer logs against Netflix and Spotify. In other words, there's a lot of data we've seen before.

So, what do we make of all this, especially since the corpus Bob sent is about 17% of the reported 16B headline? Let me speak generally about how these data sets tend to attract hyperbolic headlines while the actual impact is much smaller:

  1. There's usually duplication across files, as the same data appears multiple times
  2. There's also often duplication within the same file, again, as the same data reappears
  3. A "row" is an instance of someone's email address and password listed next to a website they're logging onto, so 100 distinct rows may all be one person

The corpus of data I received contained 2.7B rows, of which I was able to extract 325M unique stealer log entries. That's the number of rows I could successfully parse out website, email address and password values from. In my earlier example with the one person's credentials captured for both Netflix and Spotify, that would mean two unique stealer log records. All of this then distilled down to 109M unique email addresses across all the files, and that's the number you'll now see in HIBP. In other words, 2.7B -> 109M is a 96% reduction from headline to people. Could we apply the same maths to the 16B headline? We'll never know for sure, but I betcha the decrease is even greater; I doubt additional corpuses to the tune of that many billion would continue to add new email addresses, and the duplication ratio would increase.

Because it always comes up after loading stealer logs, a quick caveat:

Not all email addresses loaded into this breach will contain corresponding stealer log entries. This is because we have one process to regex out all the addresses (the code is open source), and another process that pulls rows with email addresses against valid websites and passwords.

And because I'll end up copying and pasting this over and over again in responses to queries, another caveat:

Presence in a stealer log is often an indicator of an infected device, but we have no data to indicate when it was infected. There will be a lot of old data in here, just as there's a lot of repackaged data.

Of the passwords in valid stealer log entries, there were 231M unique ones, and we'd seen 96% of them before. Those are now all in Pwned Passwords with updated prevalence counts and are searchable via the website and, of course, via the API. Speaking of which, those passwords are presently being searched a lot:

Of the 109M email addresses we could parse out of the corpus, 96% of them were already in HIBP (that number coincidentally matches the percentage of existing passwords we track). They weren't all from previous stealer logs, of course, but anecdotally, during my testing, I found a lot of crossover between this one and the ALIEN TXTBASE logs from earlier this year. Regardless, we added 4.4M new addresses from Data Troll that we'd never seen before, so that alone is significant. Not significant enough to justify hyperbolic headlines to the effect of "biggest ever", but still sizeable.

To summarise:

  1. The 16B headline distils down to a much smaller number of unique values of actual impact
  2. The data is largely from stealer logs that have been circulating for some time now
  3. It's certainly not fresh and doesn't pose any new risks that weren't already present

And lastly, there's that "Data Troll" title. When I first saw this story getting so much traction, the image I had in my mind was of a troll sitting on stashes of data. The mass media then picked this up and turned it into deliberately provocative headlines, manipulating the narrative to seek attention. Hopefully, this post tempers all that a little bit and brings some sanity back into the discussion. We need to take data exposures like this seriously, but it certainly didn't deserve the attention it got.


2025-08-13 10:37:14

Get Pwned, Get Local Advice From a Trusted Gov Source

We were recently travelling to faraway lands, doing meet and greets with gov partners, when one of them posed an interesting idea:

What if people from our part of the world could see a link through to our local resource on data breaches provided by the gov?

Initially, I was sceptical, primarily because no matter where you are in the world, isn't the guidance the same? Strong and unique passwords, turn on MFA, and so on and so forth. But our host explained the suggestion, which in retrospect made a lot of sense:

Showing people a local resource from a trusted government body has a gravitas that we believe would better support data breach victims.

And he was right. Not just about the significance of a government resource, but as we gave it more thought, all the other things that are specific to the local environment. Additional support resources. Avenues to report scams. Language! Like literally, presenting content in a way that normal everyday folks can understand it based on where they are in the world. And we have the mechanics to do this now as we're already geo-targeting content on the breach pages courtesy of HIBP's partner program.

Whilst we're still working through the mechanics with the gov that initially came up with this suggestion, during a recent chat with our friends "across the ditch" at New Zealand's National Cyber Security Centre, I mentioned the idea. They thought it was great, so we just did it 🙂 As of now, if you're a Kiwi and you open up any one of the 899 breach pages (such as this one), you'll see this advice off to the right of the screen:

[Screenshot: the NZ NCSC advice shown on a breach page]

That links off to a resource on their Own Your Online initiative, which aims to help everyday folks there protect themselves in cyberspace. There's lots of good practical advice on the site along the lines I mentioned earlier, and even a suggestion to go and check out HIBP (which now links you back to the NZ NCSC...)

I'll be reaching out to our other gov partners around the world and seeing what resources they have that we could integrate; hopefully, it's just one more little step in the right direction to protect the masses from online nasties.

Edit: As we add more local resources, I'll update this blog post with screen grabs and links, starting with our local Australian Signals Directorate:

[Screenshot: the Australian Signals Directorate advice shown on a breach page]