How AI reveals Iran stole a Dutch bakery's identity for hacking
Your guide to analyzing leaked data with AI in about an hour
Someone in Tehran needed to rent servers for hacking operations. They needed an address that looked legitimate enough to fool a hosting company. So they borrowed the identity of a bakery in Haarlem. A real bakery. Where people buy bread.
That address appeared on 31 invoices for attack servers between 2023 and 2025. The bakery had no idea.
I used AI to study a leak of Iranian intelligence files on GitHub. Not because I’m particularly clever, but because AI can do in minutes what would take humans weeks: cross-reference files, spot patterns in spreadsheets, and connect dots across sources while you’re still trying to remember where you saved that CSV file.
I’m writing this because cybersecurity experts who read my newsletter kept asking for a practical example of AI-assisted investigation. They wanted to see the actual process, not the theoretical version. So here’s what happened, step by step, so you can do it too when the next leak drops. And trust me, there’s always a next leak.
On September 30, 2025, Nariman Gharib, a Britain-based Iranian activist, posted files from an Iranian intelligence operation to GitHub. The repository is called “KittenBusters” and it exposes an IRGC-IO unit called Charming Kitten (APT35), basically Iran’s hacking team for the Counterintelligence Division.
What got posted:
- Employee photos with Iranian ID cards 
- Attack reports targeting governments and companies 
- Daily work logs from the hackers themselves 
- Complete source code for their BellaCiao malware 
- Infrastructure credentials and server lists 
- Internal communications 
- Target lists by country 
What they actually do: Break into email systems. Steal documents. Monitor journalists and dissidents. Gather intelligence. Run influence operations. The usual intelligence gathering, except they’re not particularly good at operational security—hence the leak.
Who they target: Iranian dissidents and academics, Middle Eastern governments, European organizations, anyone Iran considers strategically interesting. They use phishing emails, credential theft, and known vulnerabilities like ProxyShell—the Microsoft Exchange flaw that organizations somehow still haven’t patched.
The scale: This isn’t the Pentagon Papers. It’s not hundreds of thousands of files. But it’s enough: hundreds of documents, source code, credentials, personnel files. For an intelligence operation, having your employee photos and infrastructure passwords on GitHub is like having your diary posted to Reddit—uncomfortable and revealing.
Let me walk through how AI can be helpful, because the methodology matters more than the story.
Step 1: Making sense of the repository (5 minutes)
The situation: I’m looking at a GitHub repo with folders labeled Episode 1, Episode 2, Episode 3, Episode 4. There are ZIP files called Attack_Reports.zip, Employees.zip, Malware_and_Logs.zip. There are PDFs with Persian names. There’s a README that mentions Abbas Rahrovi and Unit 1500 and various operations.
Where do you even start?
What I did: Asked Claude Code: “Get the repository introduction, analyze it semantically and extract targets, personnel, tools, infrastructure.” This puts blinders on the AI: it can only look at the hackers’ data.
What it gave me:
- It downloaded the data: no need to do it myself 
- Primary targets: Middle East, Turkey, UAE, Qatar, Afghanistan, Israel 
- Key person: Abbas Rahrovi (Iranian ID: 4270844116) running the operation 
- Tools: BellaCiao malware, CYCLOPS framework, Python scripts 
Why this matters: When you’re staring at an unfamiliar dataset, AI can give you the map before you start walking. You need to know what you’re looking at before you can figure out what questions to ask.
Your move: Next time you see a leak on GitHub or a massive document dump, try this yourself. For bigger leaks, do this step first:
“Give me a prompt for analyzing this.” (Include the data or the link.)
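If you want your own map of the dump before handing it to the AI, a short script can inventory a local clone. This is a minimal sketch, not what Claude Code runs internally: the function names `inventory` and `print_inventory` are mine, and it assumes you have already cloned the repository to a local folder.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Count files by extension under each top-level entry of `root`."""
    root = Path(root)
    summary = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Skip Git's own metadata so only the published material is counted.
        if rel.parts[0] == ".git":
            continue
        # Files sitting directly in the root are grouped under ".".
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        ext = path.suffix.lower() or "(none)"
        summary.setdefault(top, Counter())[ext] += 1
    return summary


def print_inventory(root):
    """Print one line per top-level folder, most common file types first."""
    for folder, counts in sorted(inventory(root).items()):
        listing = ", ".join(f"{n} x {ext}" for ext, n in counts.most_common())
        print(f"{folder}: {listing}")
```

Calling `print_inventory("path/to/clone")` gives one line per folder, which is enough to see where the archives, PDFs, and spreadsheets live before you decide what to ask.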
Step 2: Following the infrastructure money
The situation: Nariman Gharib published leaked invoices from Edis Global—a hosting company. These invoices came from APT35’s own accounts. (Yes, they got their passwords leaked. The people who hack for a living got hacked. The irony is so thick you could spread it on toast.)
What I did: Uploaded the CSV to Claude Code. Asked: “Analyze this. Find unique customer identities—names, emails, addresses, phones. Look for patterns in payments, geographic distribution, anything odd.”
What it found:
A fake identity: “Maja Bosman”
- Email: bashiriansul@proton.me 
- Phone: +31 23 532 5426 
- Address: [Redacted], Haarlem, Netherlands 
- 31 invoices from 2023-2025 
- Server IP: 151.236.28.129 
Then it kept going and found more fake identities:
- Russia: Mekhaeel Kalashnikova (40+ invoices) 
- Israel: Sheldon Bayer, Malki Teichtel, Edgar Evseev (25+ invoices) 
- Hungary: Levis Cross (12+ invoices) 
All paid through Cryptomus—a cryptocurrency payment gateway.
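For readers who want to see what that kind of grouping looks like outside the AI, here is a rough sketch. The column names (`customer_name`, `email`, `country`, `payment_method`) are my assumptions, not the real layout of the leaked CSV, and the sample rows are placeholders rather than data from the leak.

```python
import csv
import io
from collections import defaultdict


def summarize_identities(csv_file):
    """Group invoice rows by customer name and collect what varies per identity."""
    identities = defaultdict(lambda: {
        "invoices": 0, "emails": set(), "countries": set(), "payment_methods": set(),
    })
    for row in csv.DictReader(csv_file):
        # Normalise whitespace and case so near-duplicate names collapse together.
        name = " ".join(row["customer_name"].split()).title()
        entry = identities[name]
        entry["invoices"] += 1
        entry["emails"].add(row["email"].strip().lower())
        entry["countries"].add(row["country"].strip())
        entry["payment_methods"].add(row["payment_method"].strip())
    return dict(identities)


# Placeholder rows for illustration only; these are not taken from the leak.
sample = io.StringIO(
    "customer_name,email,country,payment_method\n"
    "Jane Example,jane@example.com,NL,gateway-a\n"
    "jane  example,JANE@example.com,NL,gateway-a\n"
    "John Sample,john@example.org,HU,gateway-a\n"
)
summary = summarize_identities(sample)
for name, info in sorted(summary.items(), key=lambda kv: -kv[1]["invoices"]):
    print(name, info["invoices"], sorted(info["payment_methods"]))

# A payment method shared by every identity is worth a closer look.
shared = set.intersection(*(info["payment_methods"] for info in summary.values()))
print("Used by every identity:", sorted(shared))
```

The last two lines are the interesting part: unrelated-looking customers in different countries who all pay the same way are the kind of pattern that led to the Cryptomus question.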
I specifically asked for:
Find connections between Iran and Cryptomus
Claude Code found that Canada had fined Cryptomus $176 million in October 2025 for failing to report 7,557 suspicious Iranian transactions from July to December 2024. That window falls inside the 2023-2025 period in which these invoices were being paid. I also found other sources.
The AI cross-referenced these for me.
Why this matters: Humans can’t process hundreds of spreadsheet rows while simultaneously checking recent news about payment processors. AI can do both at the same time, which is either incredibly useful or mildly terrifying depending on your perspective.
Your move: When you get financial data (invoices, transactions, payment records), ask: “Analyze this for patterns. Identify all unique entities and payment methods.” Then have Claude Code search the web for context on what it finds.
Step 3: The identity theft discovery 
The situation: I had a Dutch address from invoices. Did it appear anywhere else in the leaked files? Was this just billing fraud or was there operational infrastructure?
What I did: Asked Claude Code to search the entire KittenBusters repository for:
- IP: 151.236.28.129 
- Email: bashiriansul@proton.me 
- Phone: +31 23 532 5426 
- Address: [Redacted], Haarlem 
What it found:
In Episode 2 attack reports, Belgian ProxyShell targets included Dutch domains:
- ef-service.nl (Eurofiber Nederland BV) 
- fme-nv.com (Fresh Mushroom Europe NV) 
Server confirmed in Schiphol-Rijk, North Holland. Network infrastructure documented as 151.236.28.0/24.
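One quick check you can run yourself is whether the server IP from the invoices sits inside the network range documented in the attack reports. A minimal sketch using Python's standard `ipaddress` module; both values are the ones quoted above, and the variable names are mine.

```python
import ipaddress

# Both values are quoted in the text above.
server_ip = ipaddress.ip_address("151.236.28.129")
documented_range = ipaddress.ip_network("151.236.28.0/24")

# Membership test: is the invoiced server part of the documented block?
in_range = server_ip in documented_range
print(f"{server_ip} inside {documented_range}: {in_range}")
print(f"Addresses in the range: {documented_range.num_addresses}")
```

Sharing a /24 block is suggestive rather than proof, since one hosting provider can rent neighbouring addresses to unrelated customers, so treat it as a lead to verify.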
What it didn’t find: The fake identity names themselves. They’re probably in those ZIP files (Attack_Reports.zip, Employees.zip) that need manual extraction. The AI could see the archives existed but couldn’t open them.
Then the reveal:
The Haarlem address is real. Claude Code flagged it as an existing business, and I confirmed it myself on Google Maps: it’s a functioning bakery.
Iranian intelligence didn’t invent an address. They stole one from a business that makes sandwiches.
Why this matters: Systematic search across hundreds of files without getting distracted or tired. Humans skip things. We get bored. We miss file names. AI searches everything and reports back like an overly diligent intern who actually reads every document.
Your move: Once you have specific identifiers from one source: “Search the entire repository for these terms. Show me every file that mentions them, the context around each mention, and tell me what you DON’T find.”
That last part—what’s missing—is often as informative as what’s there.
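The same sweep can be sketched in a few lines if you want to double-check the AI's search on a local clone. This is an illustration under my own assumptions, not the method Claude Code uses: `search_tree` and `report` are hypothetical helper names, and the search only sees plain text, so anything inside ZIP archives or image-only PDFs will show up as "not found".

```python
from pathlib import Path


def search_tree(root, terms, context=40):
    """Return {term: [(relative_path, line_number, snippet), ...]} for every term."""
    hits = {term: [] for term in terms}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the sweep
        for number, line in enumerate(text.splitlines(), start=1):
            lowered = line.lower()
            for term in terms:
                pos = lowered.find(term.lower())
                if pos != -1:
                    # Keep a little text on both sides of the match for context.
                    start = max(0, pos - context)
                    snippet = line[start:pos + len(term) + context].strip()
                    hits[term].append((str(path.relative_to(root)), number, snippet))
    return hits


def report(hits):
    """Print every mention, and say explicitly which terms never appeared."""
    for term, found in hits.items():
        if not found:
            print(f"NOT FOUND: {term}")
            continue
        print(f"{term}: {len(found)} mention(s)")
        for rel_path, number, snippet in found:
            print(f"  {rel_path}:{number}: {snippet}")
```

Usage would be `report(search_tree("path/to/clone", ["151.236.28.129", "bashiriansul@proton.me"]))`. The explicit "NOT FOUND" line is the point: it turns an absence into a finding you can reason about.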
Step 4: Understanding the full picture
The situation: I had findings from invoices and the GitHub repo. Nariman Gharib had published detailed analysis of Episodes 2 and 3. I needed to understand how everything connected.
What I did: Gave Claude Code his articles and asked:
“Analyze all data and articles. Do the operations connect? If yes, how? If no, why not?”
What emerged:
Iran’s targeting scope:
- Iran (domestic): 100+ Exchange servers—universities, government ministries, telecoms. Tracking “regime opponents” which is Iranian government speak for “people we don’t like.” 
- Greece: 500+ entries—Parliament, shipping companies, government agencies. Strategic interest in Mediterranean politics. 
- Turkey: Foreign Ministry, municipalities, defense contractors. Keeping tabs on a regional rival. 
- Saudi Arabia/Kuwait: Hospitals, construction, energy sector. Gulf states intelligence gathering. 
- Canada: 700+ IP addresses for automated scanning. Finding vulnerable systems for future exploitation. 
How the attacks work:
- Find unpatched Microsoft Exchange servers (ProxyShell vulnerability) 
- Install webshells (backdoors for persistent access) 
- Steal credentials (usernames and passwords) 
- Download emails and documents 
- Maintain access for months or years 
Nothing sophisticated. Nothing fancy. Just patience and exploiting the fact that organizations patch systems approximately never.
What else got leaked:
- BellaCiao source code: Complete malware code developed at Shuhada base in Tehran (Episode 3) 
- Infrastructure credentials: Excel sheets with server passwords, login details (Episode 4) 
- Personnel files: Photos, Iranian ID numbers, names 
- Influence operations: Sahyoun24 platform exposed as IRGC propaganda 
Someone with access to personnel files leaked this. Someone inside. Which means someone in Iranian intelligence is very unhappy with their employer.
Why this matters: Multiple sources (GitHub files, invoices, external journalism, news reports) synthesized into coherent intelligence. Reading all of this linearly would take days. Having AI connect the pieces took minutes.
Your move: When investigating: “I have findings from these sources [list them]. Synthesize everything. Show me timeline, geographic patterns, attack methods, scale, and how different data sources connect.”
Step 5: Making it useful
The situation: Intelligence that nobody reads might as well not exist. Different audiences need different formats.
What I did: Asked Claude Code to generate four versions:
- Technical report for cybersecurity people: Full details on IPs, domains, infrastructure, attack methods, indicators of compromise 
- News article for regular humans: Lead with the bakery story, explain what APT35 actually does, why anyone should care 
- Social media thread for attention spans measured in seconds: 10 tweets with key findings and sources 
- Action items for law enforcement: What Dutch authorities (AIVD, NCTV) should investigate, what affected organizations should check 
Why this matters: A technical report doesn’t work for journalists. A news article doesn’t help incident responders. Social media threads don’t give law enforcement actionable leads. You need all four.
Generating them manually takes time. AI does it in minutes.
Your move: After you have findings: “Create four versions optimized for: 1) Technical professionals (full technical details), 2) General audience (accessible narrative), 3) Social media (punchy takeaways with sources), 4) Decision-makers (actionable steps).”
What AI contributed:
Speed: Hundreds of files analyzed quickly. This matters because leaks get reported fast: if you take weeks to analyze, someone else has already published the story. Use Claude Code for this rather than the chatbot, because it remembers more and can hold more context.
Pattern recognition: AI spotted that all invoices used the same payment gateway (Cryptomus), then connected it to the recent Canadian fine.
Systematic search: Searching an entire repository for specific terms without missing files or getting tired. I get distracted. AI didn’t.
Multi-source synthesis: Connecting GitHub files + invoice data + external journalism + news reports into coherent intelligence. Doing this manually means reading everything first, then trying to remember what connected to what. AI does it simultaneously.
Multiple outputs: Technical report, news article, social thread, action items—all at once, each optimized for its audience.
What I had to do myself:
Here’s what AI couldn’t do:
Strategic questions: Deciding what mattered and where to look next
Verification: Checking that the bakery address is real (Google Maps), confirming the Cryptomus fine through multiple news sources, verifying the GitHub repo is legitimate
Context: Understanding why the Dutch angle matters, recognizing the geopolitical significance, knowing which audiences need which information
Judgment: Deciding what to publish, how to frame it, whether to redact the bakery’s details
Ethics: Considering who gets harmed by disclosure
Neither AI nor I could have done this alone. That’s the actual lesson.
The limitations:
What AI couldn’t handle:
ZIP archives: Those Attack_Reports.zip and Employees.zip files need manual extraction before AI can analyze their contents
Image PDFs: Documents stored as images need OCR processing first. I often convert them to TXT.
Language depth: Persian text is translated quite well, but it requires human expertise for nuanced understanding
Physical verification: AI can’t confirm a bakery exists at an address—I had to check independently
Continuous monitoring: This is a snapshot. The repository gets updated with new episodes, but AI isn’t watching it constantly
You still need humans.
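The ZIP limitation is the easiest one to work around yourself. Below is a cautious sketch using Python's standard `zipfile` module (Python 3.9 or later): it lists an archive's contents without extracting anything, then extracts only document-type files. The function names and the allowed-extension list are my assumptions. Since this leak includes malware, I would only do even this inside an isolated machine or VM.

```python
import zipfile
from pathlib import Path


def list_archive(archive_path):
    """Return (name, uncompressed_size) for each file in the archive, without extracting."""
    with zipfile.ZipFile(archive_path) as archive:
        return [(info.filename, info.file_size)
                for info in archive.infolist() if not info.is_dir()]


def extract_documents(archive_path, destination,
                      allowed=(".txt", ".csv", ".json", ".pdf")):
    """Extract only files with an allowed extension; return the extracted names."""
    destination = Path(destination).resolve()
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(allowed):
                continue
            target = (destination / info.filename).resolve()
            # Skip entries whose names point outside the destination folder.
            if not target.is_relative_to(destination):
                continue
            archive.extract(info, destination)
            extracted.append(info.filename)
    return extracted
```

Listing first and extracting second keeps executables and scripts packed away while still giving the AI the text, CSV, and PDF files it can actually read.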
Your playbook for the next leak
When the next dump drops on GitHub or gets posted by hacktivists or shows up in your inbox from a whistleblower:
Phase 1: Orientation (5-10 minutes) “What’s in this repository? Key entities, dates, structure, what each folder contains.”
Phase 2: Structured data analysis (10-15 minutes) Feed any CSV, JSON, or spreadsheet files to AI. “Find patterns, anomalies, connections. Search for news about entities you find.”
Phase 3: Deep search (10-15 minutes) Take identifiers from Phase 2. “Search everything for these terms. Show context and what’s missing.”
Phase 4: Synthesis (10-15 minutes) “Combine all findings. Timeline, geographic patterns, methods, scale, connections between sources.”
Phase 5: Output (10 minutes) “Create versions for: technical audience, general public, social media, decision-makers.”
Phase 6: Verification (15+ minutes) Verify key facts manually. Check entities exist. Confirm technical details. Validate sources. Never skip this.
Total: 60-90 minutes for preliminary analysis
Then you can spend your time on the hard parts—investigation, verification, deciding what matters—instead of manually reading hundreds of files.
The bottom line
This is what investigation looks like now—human strategic thinking combined with AI computational power, revealing connections at the speed modern leaks demand.
And for the bakery owners in Haarlem who got dragged into Iranian intelligence operations through no fault of their own: You deserved better. Identity theft is bad enough when it’s credit card fraud.
Tools and sources
Tool: Claude Code (included with Claude Pro, $20/month)
Sources (all public):
- KittenBusters GitHub: https://github.com/KittenBusters/CharmingKitten 
- Nariman Gharib investigation: https://blog.narimangharib.com/massive-leak-exposes-charming-kitten 
- Nariman Gharib Episodes 2-3: https://blog.narimangharib.com/charming-kitten-episodes-2-3-five-continents 
- CloudSEK analysis: https://www.cloudsek.com/blog/an-insider-look-at-the-irgc-linked-apt35-operations (done with the help of a chatbot)
Everything cited is publicly available. No unauthorized access.




