Recently, an alarming research report showed that artificial intelligence can crack pseudonymized data in ways previously thought impossible. Researchers at Northeastern University were able to re-identify 1,250 anonymized interviews by using a common LLM to cross-reference seemingly innocent details against publicly available information. This raises a critical question for anyone using AI in healthcare: How do we ensure that patient data is truly anonymized?

What really happened?

The researchers used an off-the-shelf LLM (the same type of AI as ChatGPT) to analyze 1,250 interviews published as an anonymized dataset by Anthropic. By connecting details such as career histories, unique expressions, special projects and other contextual fingerprints, the AI was able to identify the real people behind the data. The attack didn't require advanced hacking - it was simply the AI's ability to understand context that did the trick.

Source: i10x.ai - LLMs Re-identify 1,250 Anonymized Interviews

Why this is serious for healthcare professionals

If you're using AI tools for record keeping, transcription or other tasks involving patient data, you need to ask a simple question: How is my data anonymized?

Traditional pseudonymization has involved removing names, addresses and other obvious personal data. But in the age of AI, this is insufficient. An AI can recognize patterns that humans overlook - the way a person describes a symptom, a unique sentence structure, or a particular medical history can become a traceable fingerprint.
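To make the idea of a contextual fingerprint concrete, here is a minimal sketch of how a few harmless-looking details can single out one person once they are combined. Everything in it is an assumption made for illustration: the profiles are invented, the function name matching_profiles is hypothetical, and a real attack with an LLM works on free text rather than tidy fields.

```python
# Toy illustration only: every record below is invented for this example.
PUBLIC_PROFILES = [
    {"name": "Person A", "profession": "nurse", "city": "Bergen", "hobby": "sailing"},
    {"name": "Person B", "profession": "nurse", "city": "Oslo", "hobby": "sailing"},
    {"name": "Person C", "profession": "teacher", "city": "Bergen", "hobby": "sailing"},
    {"name": "Person D", "profession": "nurse", "city": "Bergen", "hobby": "chess"},
]


def matching_profiles(clues, profiles):
    """Return the profiles that are consistent with every contextual clue."""
    return [p for p in profiles if all(p.get(k) == v for k, v in clues.items())]


# Clues that remain in a note after the name itself has been removed.
remaining_clues = [("profession", "nurse"), ("city", "Bergen"), ("hobby", "sailing")]

clues = {}
for key, value in remaining_clues:
    clues[key] = value
    candidates = matching_profiles(clues, PUBLIC_PROFILES)
    print(f"{len(clues)} clue(s): {len(candidates)} candidate(s)")
```

Each clue on its own matches several people, but the candidate list shrinks with every clue added. That is why removing names alone does not guarantee anonymity.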

The difference that makes a difference: Medivox's solution

At Medivox, we never use first and last names together. The transcription (mostly) contains only first names, and during pseudonymization each one is replaced with a random first name from our list. We have written more about the principles behind this in "Pseudonymization: A Key to Secure and Efficient Data Processing".
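The replacement step described above can be sketched roughly as follows. This is not Medivox's actual implementation: the name list, the function name pseudonymize_first_names, and the choice to keep the same replacement for a name within one text are all assumptions made for illustration, and the sketch assumes the first names have already been detected by some other step.

```python
import random
import re

# Hypothetical replacement list; the real list is not published in this article.
REPLACEMENT_NAMES = ["Alex", "Robin", "Kim", "Sam", "Noa"]


def pseudonymize_first_names(text, detected_names, rng=None):
    """Replace each detected first name with a random name from the list.

    Assumption: the same original name gets the same replacement within
    one text so the note stays readable. The mapping is discarded when
    the function returns, so it cannot be used to reverse the process.
    """
    if rng is None:
        rng = random.Random()
    mapping = {}
    for name in detected_names:
        # Prefer names that differ from the original and are not yet used.
        choices = [
            n for n in REPLACEMENT_NAMES
            if n != name and n not in mapping.values()
        ] or REPLACEMENT_NAMES
        mapping[name] = rng.choice(choices)
    if not mapping:
        return text
    # Match whole words only, so a name inside a longer word is left alone.
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in mapping) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)


note = "Ola reports chest pain. Ola has taken no medication."
print(pseudonymize_first_names(note, ["Ola"]))
```

Because the replacement is drawn at random and the mapping is never stored, the output carries no link back to the original first name. The rest of the text is left unchanged by this step.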

The problem with traditional pseudonymization | Medivox's method
Real name is switched to pseudonym, but all other context is preserved | Only first name used, replaced with random first name from list
AI can connect contextual fingerprints to real people | First names have never been used as real identities
Requires data to be 100% reconstructable to connect | Impossible to connect because the first name was never real

Proof

Try it for yourself: search for "Ola" on Google. You will find many different people with the same first name - no unique identity. All first names on Medivox's list are synthetic and cannot be traced back to anyone.

Even an AI with access to all the information in the world can't connect a never-real first name to a real person.

Safety in practice

Medivox has always kept privacy in focus, and we are constantly working to develop and improve our systems. Read more about how we handle security in "Safe use of AI in healthcare: Privacy in focus" and "AI and healthcare: Is it safe to use Medivox?".

Requires access

It is important to emphasize: To re-identify data, an attacker needs access to the notes. Without access, no AI can analyze them. Medivox's security measures include:

  • Encryption of data in transit and at rest
  • Strict access control
  • Logging of all access
  • Automatic alerts for abnormal activity

What this means for you as a healthcare professional

When you use Medivox for record keeping, you can rest assured that:

  1. Pseudonymization is irreversible - even if someone gains access to the data, the pseudonymized identity is worthless
  2. No contextual fingerprints - the first names are random and do not correspond to real people
  3. AI-resistant - the method is designed to withstand exactly this type of attack

Conclusion

This research is a wake-up call for the entire industry. Old methods of pseudonymization no longer work in the age of AI. Medivox has embraced this and implemented a solution designed for the future - and you can read more about our approach to AI and health in "A little dive into the future of AI tools for health".

If in doubt, ask for a demonstration of how pseudonymization works in practice. We are transparent about our methods.