De-Identified or Not? The Truth About HIPAA, AI, and Client Data
In the era of AI-driven tools, it’s more important than ever for mental health professionals to understand what HIPAA actually means by “de-identified.” We’ve recently seen troubling claims from some EHR vendors asserting that they store or use “de-identified transcripts” of client sessions to support AI-generated notes — with no opt-out available. That claim not only raises red flags, but it also reveals a fundamental misunderstanding (or misrepresentation) of HIPAA.
Let’s unpack what de-identification really entails under HIPAA, how it differs from tech marketing buzzwords like “anonymized,” and why transcripts can never truly be de-identified.
What HIPAA Actually Says About De-Identification
Under HIPAA, data that has been properly de-identified is no longer considered Protected Health Information (PHI). That means it isn’t subject to HIPAA’s Privacy and Security Rules — but only if it meets specific, rigorous standards.
There are two pathways to de-identify PHI under HIPAA:
- Safe Harbor Method: removal of all 18 HIPAA identifiers
- Expert Determination Method: a qualified expert applies statistical methods and determines that the risk of re-identification is “very small”
The 18 Identifiers That Must Be Removed (Safe Harbor Method)
To meet the Safe Harbor method, a dataset must have all of the following identifiers removed:
- Names
- Geographic subdivisions smaller than a state (e.g., street address, city, ZIP code)
- All elements of dates (except year) directly related to an individual, including birth date, admission date, discharge date, and date of death
- Telephone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers (including license plate numbers)
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers (e.g., fingerprints, voiceprints)
- Full-face photographs and any comparable images
- Any other unique identifying number, characteristic, or code [emphasis added]
If even one of these is present — or partially present — the data is not considered de-identified.
Therapy Transcripts and Narrative Clues
Here’s where the misunderstanding often comes in: even if names are removed from a session transcript, the narrative itself contains identifiable clues. This includes unique situations, events, locations, or timelines that allow re-identification. As we say at PCT: If the client would recognize themselves in it, it’s not de-identified.
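To make the point concrete, here is a minimal sketch of the kind of pattern-based scrubbing a tool might run over a transcript. The patterns, the `scrub` function, and the sample text are all hypothetical and heavily simplified; this is not any vendor's actual method, and it covers only three of the 18 identifiers.

```python
import re

# Hypothetical, simplified patterns for a few structured identifiers.
# Assumption: real identifier detection is far more involved than this.
PATTERNS = {
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scrub(text: str) -> str:
    """Replace each pattern match with a bracketed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


# Invented sample text, not taken from any real session.
sample = (
    "You can reach me at 555-867-5309 or pat@example.com. "
    "Ever since the bakery I own on the town square flooded, "
    "I have not slept."
)

scrubbed = scrub(sample)
print(scrubbed)
```

The phone number and email address are replaced, but the sentence about the flooded bakery on the town square passes through untouched. No pattern matches it, yet it is exactly the kind of narrative detail a client, or a neighbor, would recognize.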
The Rarely Used Expert Determination Method
The Expert Determination method involves a trained statistical expert formally assessing the dataset and documenting that the risk of re-identification is very small. It’s a detailed, complex process that:
- Requires formal documentation
- Demands high-level expertise
- Is rarely used by mental health tech vendors because of cost and scrutiny
A vendor whose data does not meet the Safe Harbor standard cannot simply claim the data is de-identified without going through this process.
De-Identified vs. Anonymized: Not the Same Thing
A growing number of vendors use the term “anonymized,” often interchangeably with “de-identified.” But here’s the thing: anonymized is not a HIPAA term. It’s a loosely defined tech industry concept that does not automatically meet HIPAA’s Safe Harbor or Expert Determination standards.
Don’t let buzzwords blur the legal line: if data isn’t fully de-identified under HIPAA, it’s still PHI.
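The rule above can be written down as a small decision function. This is only an illustrative sketch: the `Dataset` class and its field names are hypothetical, and in practice whether either standard has been met is a legal and statistical determination, not a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # All fields are hypothetical names used for illustration only.
    vendor_label: str                      # e.g. "anonymized", "de-identified"
    safe_harbor_met: bool                  # all 18 identifiers removed
    expert_determination_documented: bool  # formal expert analysis on file


def is_still_phi(ds: Dataset) -> bool:
    """Data stays PHI unless one of the two HIPAA pathways is satisfied.

    Note that vendor_label is deliberately ignored: what a vendor calls
    the data has no bearing on the outcome.
    """
    return not (ds.safe_harbor_met or ds.expert_determination_documented)


labeled_only = Dataset("anonymized", False, False)
safe_harbor = Dataset("de-identified", True, False)

print(is_still_phi(labeled_only))
print(is_still_phi(safe_harbor))
```

The design choice worth noticing is that the label never enters the logic. A dataset marked “anonymized” that satisfies neither pathway is still PHI.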
The Myth of the “De-Identified Transcript”
Let’s talk about the troubling claim we’ve seen from certain EHR vendors: that they can generate and store “de-identified transcripts” of client sessions to power AI note-taking systems. In some cases, without even offering an opt-out.
Here’s why that doesn’t hold up:
- A session transcript is, by nature, tied to a specific individual.
- Even with names removed, narrative content reveals identity through context.
- In theory, a transcript could be de-identified under the Safe Harbor method, but it would then no longer be an actual transcript of a therapy session, since a transcript is by definition a “verbatim written record of spoken word.”
- As HIPAA and mental health attorney Eric Ström, JD PhD LMHC, puts it: “There can be no such thing as a ‘de-identified transcript’. It’s not just that a ‘de-identified transcript’ doesn’t exist – such a thing cannot exist.”
When a vendor treats session transcripts as outside the bounds of HIPAA, it indicates a fundamental misunderstanding (or misapplication) of the law — and raises serious ethical concerns.
Why It Matters — Even If You Have a BAA
Having a Business Associate Agreement (BAA) with a vendor does not automatically guarantee that your clients’ PHI is being handled appropriately.
If a vendor fundamentally misunderstands what qualifies as PHI or what constitutes proper de-identification, they may:
- Misclassify PHI and fail to apply appropriate and required safeguards
- Implement insecure or non-compliant systems and workflows
- Bypass consent and transparency obligations
A BAA only functions as a protective framework if the vendor is operating with a correct understanding of HIPAA. When that foundation is flawed, your practice (and your clients) may still be at risk — legally, ethically, and clinically.
Trust in a vendor isn’t about the paperwork alone. It’s about their practices, knowledge, and transparency. If they get the basics wrong, like claiming transcripts can be de-identified, it’s a clear sign they may not be up to the task of protecting your clients’ sensitive information.
What Clinicians and Practice Owners Can Do
You don’t need to be a HIPAA expert to protect your clients’ data. Here’s how to stay empowered:
- Read the fine print in vendor policies.
- Ask hard questions, like:
  - What identifiers are removed?
  - Has an expert determination been conducted? (Ask this whenever it is apparent, or even uncertain, that the Safe Harbor method has not been applied.)
  - Is this feature opt-in or opt-out?
- Watch for red flags, such as:
  - Vendors claiming initials or transcripts are de-identified
  - No ability to opt out of AI-powered features
Push back on service providers that treat your clients’ stories as AI training fodder.
Final Takeaways
- De-identification is a rigorous process, not a marketing label.
- Transcripts are never truly de-identified under HIPAA.
- Protecting client data is a legal and ethical necessity.
- Choose vendors who understand and respect HIPAA’s requirements.
Need help navigating tech choices for your practice? Check out our service evaluation tools and support resources in the show notes.
At Person Centered Tech, we help practices select and implement right-sized, HIPAA-consistent solutions that meet both clinical and operational needs. Reach out if you need support evaluating your options.
Learn more about AI use in clinical settings: