The File That Wouldn't Die

January. Four in the morning. Somebody's phone went off and the email was from Instagram and the email was not a scam, which made it worse.

It came from security@mail.instagram.com, signed the way Instagram always signs things, blue checkmark in the right spot. Someone, it said, had requested a password reset. If that was not you, it said, go back to sleep. Most people could not. Most people sat up in the dark holding a phone that had just told them a stranger was trying to walk into their life, and most people did the thing the stranger was hoping they would do. They panicked.

On the Wednesday before, a file had landed on a hacker forum. Seventeen million rows of it. Usernames, real names, phone numbers that mothers still had memorized, email addresses in the millions, partial street locations in some cases. A map of who was who and how to reach them. The person who posted it labelled the file a 2024 leak, fresh, and offered it to anyone who wanted a copy for free.

The file was older than that. Much older.

Two teams of researchers, working separately and not trusting each other's work until each had done its own, traced the same rows back to May of 2022. Different forum. Different handle. Same gaps in the data where Instagram's servers had not felt like answering three and a half years earlier. A year after the original post, the file surfaced on BreachForums under another name. Then it went quiet. Then it came back in January with a shiny new label.

The label was a lie. But here is what the label did. It made cybersecurity writers panic. It made non-technical people panic. It made the file look like news, when it was in fact something very old, dusted off and stood back up.

Follow me a minute, because the mechanism takes some explaining.

The Window

When you post something on Instagram and your account is not set to private, that post is public. It has been since Instagram launched. The thing most people do not know, or do not bother thinking about, is that every one of those public posts and every bit of information attached to the account that made the post sits on a Meta server, and that server answers questions when asked. If you know how to ask.

The way you ask is called an API. Application Programming Interface. Fancy name for a teller window with a clerk behind it, handing things out to whoever shows up, according to rules the building's owner has posted on the wall. Most of the time the clerk is fine. Most of the time the rules work. Trouble starts when somebody figures out how to stand at that window all day, asking questions faster than the clerk can think, faster than the posted rules can keep up.

That is called scraping. Nobody picks a lock. Nobody kicks a door. Somebody stands at the window the whole building put there for legitimate traffic and asks and asks and asks, and writes down everything handed back, over weeks, over months, over one full summer in 2022.
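The posted rules on the wall are, in practice, often a rate limit. What follows is a minimal sketch of one common kind, a token bucket, and not a description of anything Meta actually runs. The names TokenBucket and Window, the per-caller key, and the numbers are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Hypothetical limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Window:
    """One bucket per caller. What counts as a caller (API key, IP) is an assumption."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def ask(self, caller):
        if caller not in self.buckets:
            self.buckets[caller] = TokenBucket(self.rate, self.capacity, self.clock)
        return self.buckets[caller].allow()


if __name__ == "__main__":
    window = Window(rate=1, capacity=3)
    # A burst of five questions from one caller: the first three get answers.
    print([window.ask("one-caller") for _ in range(5)])
```

Notice what the sketch cannot do. It counts questions per caller, so a scraper who shows up as ten thousand different callers gets ten thousand buckets. That is the sense in which the clerk is outnumbered.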

Meta says no such thing happened in 2022. Meta also says no such thing happened in 2024. What Meta says exactly, and the wording is important because the wording is identical across every reporter who got a response, is this. There was no breach of their systems. Accounts remain secure. A bug allowed somebody outside to request password-reset emails for some users. The bug is fixed. Ignore the emails. Sorry for any confusion.

Read that again. The words do not say that nothing happened. The words say that nothing happened to Meta's own internal systems. The data on the hacker forum was always public-facing, or near enough. Names people had voluntarily put on profiles. Numbers they had volunteered to set up two-factor authentication. Geolocation tags stitched together with data-broker files nobody can track.

Technically correct, in other words. Legally useful. Functionally, it meant that seventeen million people had their personal details sitting free on a website any eleven-year-old with a VPN could reach.

The Flood

That was the first act. The reset-email flood was the second, and it is the part you should think about even if you do not use Instagram.

Picture the person holding that list. Seventeen million email addresses. A theory. The theory goes like this. If I type somebody's email into Instagram's Forgot Your Password page and Instagram confirms that it has sent an actual reset message to that inbox, then I know for certain that email is attached to a live Instagram account. Useful. That is how you turn a pile of unsorted data into a targeted list. The test costs nothing. The test is automated. The test does not require the person running it to log in anywhere, or verify anything, or prove anything at all.

So somebody did that. Or more than one somebody. Nobody has said for sure, because Meta has not explained. What is known is that starting Thursday morning, January 8, reset emails began flooding inboxes all over the world. The specialist press reported roughly a million such emails by the weekend, probably more. The emails were legitimate because Instagram's actual servers sent them, triggered through a weakness Meta patched on Sunday.

The whole operation took four days.

Can It Still Be Done

Now the question. The only question that matters, from the point of view of a person who is not a security researcher and does not care about hacker-forum etiquette. Can this still be done today?

The honest answer is yes.

Scraping public data is not, on its own, a federal computer crime in the United States. A federal appeals court ruled in 2022 that copying information a website has already made public does not violate the country's main computer-crime law. The Supreme Court had earlier sent the case back for a second look, and the appeals court reached the same answer again. Europe has been arguing for years about whether aggregating public social-media data without consent counts as a data breach under its privacy rules, and the answer keeps coming back complicated. Meanwhile, every major platform depends on open or semi-open APIs to let businesses plug in, and every API is a window with a clerk behind it, and every clerk is outnumbered.

Password-reset enumeration, the test that turns email addresses into confirmed live accounts, has been a documented attacker technique for twenty years. Every security textbook mentions it. Every platform is supposed to defend against it. The fix is technically simple. When somebody asks for a password reset, the website should not tell the asker anything that confirms or denies whether that email is in the system. Accept the request silently. Send an email only if the account is real. Look the same either way. Most big platforms do this. Some do not. Some do it almost everywhere except three obscure corners of the product that nobody remembers.
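The fix described above fits in a few lines. This is a sketch under stated assumptions, not Instagram's code: request_password_reset, the accounts set, the outbox list and the wording of the reply are all hypothetical stand-ins for a real user store and mail queue.

```python
import secrets

# The one reply every asker sees, whether the address is known or not.
GENERIC_REPLY = "If that address has an account, a reset link is on its way."


def request_password_reset(email, accounts, outbox):
    """Return the same reply whether or not `email` is registered.

    `accounts` is a set of known addresses and `outbox` is a list standing in
    for a mail queue. Both are hypothetical, for illustration only.
    """
    normalized = email.strip().lower()
    if normalized in accounts:
        # Only real accounts get mail; the asker never learns which branch ran.
        token = secrets.token_urlsafe(32)
        outbox.append((normalized, token))
    return GENERIC_REPLY


if __name__ == "__main__":
    accounts = {"real@example.com"}
    outbox = []
    print(request_password_reset("real@example.com", accounts, outbox))
    print(request_password_reset("ghost@example.com", accounts, outbox))
    print(len(outbox))  # one message queued, two identical replies
```

A real system would also need the two branches to take about the same time, usually by handing the mail to a background queue, and would need a limit on how often anyone can ask. The sketch leaves both out.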

Relabelling an old file as a fresh leak is the oldest trick of the three. Criminals do it because news of a new breach moves faster than news of an old one. Cybersecurity firms amplify the alert because their alerting tools do not always catch that the same records were floating around years earlier. The press picks up the alert. Within a day, an old file starts to feel like a current emergency. This keeps working until breach-tracking services adopt better signatures for already-known data. Some are working on it. None have solved it.
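One plausible shape for such a signature is a fingerprint per row: hash each record after tidying it up, then measure how much of a supposedly new file was already seen. The sketch below is an assumption about how that could work, not a description of any tracking service's method. The field names and the functions fingerprint and overlap_ratio are invented for illustration.

```python
import hashlib


def fingerprint(record):
    """Hash a normalized row. Which fields to include is an assumption."""
    parts = [
        str(record.get(key, "")).strip().lower()
        for key in ("username", "email", "phone")
    ]
    # A separator that should not appear in the data keeps fields from blurring together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def overlap_ratio(new_dump, known_fingerprints):
    """Fraction of distinct rows in `new_dump` already present in `known_fingerprints`."""
    new_prints = {fingerprint(row) for row in new_dump}
    if not new_prints:
        return 0.0
    return len(new_prints & known_fingerprints) / len(new_prints)


if __name__ == "__main__":
    old_file = [{"username": "a", "email": "a@example.com", "phone": "1"}]
    known = {fingerprint(row) for row in old_file}
    relabelled = [{"username": "A ", "email": "A@example.com", "phone": "1"}]
    print(overlap_ratio(relabelled, known))
```

A ratio near one says the fresh leak is mostly an old one with a new label. The weak point is the tidying step: a seller who reorders columns or reformats phone numbers in a way the normalizer does not anticipate gets a new fingerprint for an old row.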

All three techniques are available to anyone with patience and a laptop.

What You Can Do

For regular people, which is to say everybody who uses Instagram or Facebook or a bank or an airline or a streaming app, the lesson is less technical than it looks.

You cannot stop scraping. You cannot stop the platforms from leaving those clerks outnumbered. You cannot always tell when your information ends up in a file on a forum you have never heard of.

What you can do is boring. Turn on two-factor authentication, preferably the kind that uses an authenticator app and not a text message. Text-message codes can be stolen by people who convince your phone company to move your number to a new SIM card, another old technique that still works. Use a password manager so that when one leak becomes a real break-in on some other website, the damage is confined. Do not click the link in an unexpected password-reset email. Do not even click the link in a real one. Open the app directly and see if the notification is waiting for you there. If it is not, the email was either someone's enumeration test or a phishing attempt dressed up to look like one.
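Why the app beats the text message comes down to where the code comes from. Most authenticator apps follow a public standard, RFC 6238, in which the six digits are computed from a secret stored on the device and the current time. The phone number never enters into it. Here is a minimal sketch of that calculation; the function name totp is an assumption, and whether any given service uses exactly these parameters is also an assumption.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Time-based one-time code in the style of RFC 6238, using HMAC-SHA1."""
    # Both sides count 30-second intervals since the Unix epoch.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low four bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Shared secret and timestamp taken from the RFC 6238 test vectors.
    print(totp(b"12345678901234567890", 59, digits=8))
```

Moving your number to a new SIM card gets a thief your text messages. It does not get them the secret sitting inside the app.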

The quiet part, the part nobody in the industry likes to say out loud. If your email address was in that seventeen-million-row file, and you have an Instagram account, you are on somebody's target list now, and you will be on that list as long as the file keeps circulating. Which is forever. Data does not die on the internet. It just moves to cheaper storage.

The Real Problem

Back to the phone buzzing at four in the morning.

The email was real. That was the uncanny part. You could check the headers, follow the authentication chain, trace the pedigree all the way back to Instagram's own mail servers. A stranger had used Instagram itself to mass-produce panic and inbox clutter. Instagram could not easily be blamed for that, except for the part where the company had left a bug in its password-reset flow for some unknown stretch of time, and the part years earlier where it had apparently let somebody empty a sizable chunk of the user database through a side window nobody was watching.

Both things could have been prevented. Neither one was.

The file is still out there. The techniques are still available. A reasonable person with a laptop and some patience, looking at the same building today with the same window and the same overworked clerk, could do the whole job again next Tuesday.

Probably somebody is.

Synexmedia.com