HIPAA & Translation: The 'BAA or Bust' Myth vs Real Data Safety
What HIPAA actually requires for translation tools, what to ask a vendor about BAAs, and how to run a privacy-first workflow even when a BAA isn't available.

Most conversations about HIPAA and translation start in the wrong place. They start with "is this tool HIPAA-compliant?" — as if compliance were a binary attribute of software. It is not. HIPAA compliance is a property of workflows, agreements, and organizational controls. A tool can support a compliant workflow or undermine one, but no SaaS product is compliant in isolation.
This matters because healthcare organizations translate documents constantly — patient intake forms, discharge instructions, consent documents, clinical notes, insurance correspondence. The volume is growing, and so is the pressure to use machine translation instead of slower, more expensive human workflows. The question is not whether to use machine translation. It is how to use it without creating a data breach.
What counts as PHI and why translation is risky
Protected Health Information (PHI) under HIPAA is any individually identifiable health information held or transmitted by a covered entity or its business associate. The definition is broader than most people assume. The Safe Harbor de-identification standard in 45 CFR §164.514(b)(2) lists 18 identifier categories that make health information identifiable:
- Names
- Geographic data smaller than a state
- Dates (except year) related to an individual
- Phone numbers
- Fax numbers
- Email addresses
- Social Security numbers
- Medical record numbers
- Health plan beneficiary numbers
- Account numbers
- Certificate/license numbers
- Vehicle identifiers and serial numbers
- Device identifiers and serial numbers
- Web URLs
- IP addresses
- Biometric identifiers
- Full-face photographs
- Any other unique identifying number or code
When you paste a clinical note into a translation tool, you are almost certainly transmitting PHI. A discharge summary contains the patient's name, dates of service, medical record number, diagnosis, and treatment plan. Even a consent form typically contains a name and a description of a medical procedure. These are not edge cases — they are the default.
The risk is straightforward: a service that creates, receives, maintains, or transmits PHI on behalf of a covered entity is a business associate under HIPAA. If that service stores, caches, or logs the text, even temporarily for debugging or model improvement, it is handling PHI. If it does so without a Business Associate Agreement (BAA) and appropriate safeguards, both the covered entity and the service provider face enforcement risk.
Translation is especially risky because the entire document content is the input. Unlike an analytics tool that might receive de-identified metrics, a translation service receives the full text. There is no inherent data minimization built into the workflow unless you build it yourself.
Vendor requirements: BAAs, access controls, and retention
When a covered entity evaluates a translation vendor, three things matter above everything else.
1. Business Associate Agreement
A BAA is a contract required by 45 CFR §164.502(e). It obligates the vendor to safeguard PHI, report breaches, and limit use of the data to permitted purposes. Without a BAA, a covered entity cannot lawfully share PHI with the vendor. Period.
A BAA is not optional. It is not a "nice to have." If a vendor processes PHI and refuses to sign a BAA, you cannot use that vendor for PHI workloads. This is not a matter of risk tolerance — it is a regulatory requirement.
To be transparent: noll does not currently offer a BAA. We are working toward it, but we are not there yet. You can review our compliance posture for the current state of our security controls, certifications, and data handling practices. We believe in being direct about this rather than hiding behind ambiguous marketing language.
2. Access controls and encryption
Beyond the BAA, evaluate whether the vendor implements:
- Encryption in transit (TLS 1.2+) and at rest (AES-256 or equivalent)
- Role-based access controls for internal staff
- Audit logging of all access to PHI
- Authentication controls for API access (not just a shared API key)
These are baseline expectations under the HIPAA Security Rule (45 CFR §164.312). Any vendor that cannot demonstrate them is not ready for PHI workloads regardless of whether they offer a BAA.
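The TLS 1.2+ expectation above can also be enforced from the client side, so a request fails rather than silently negotiating an older protocol. The following is a minimal sketch using Python's standard `ssl` module; the function name `strict_tls_context` is illustrative, and this covers only encryption in transit, not the other controls in the list.

```python
import ssl


def strict_tls_context():
    """Build a client TLS context that refuses anything below TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Reject TLS 1.0 / 1.1 handshakes outright.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=strict_tls_context())`.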
3. Data retention and deletion
This is where most translation vendors fail. Ask three specific questions:
- Is submitted text stored after the translation is returned? Many services retain input/output pairs for model training, quality improvement, or caching. Any retention of PHI beyond the minimum necessary creates risk.
- Are translations logged in a way that is accessible to vendor staff? Debug logs, error logs, and analytics pipelines often capture full request bodies.
- Can data be deleted on demand, and is deletion verifiable? HIPAA requires covered entities to be able to account for disclosures and ensure PHI is disposed of properly.
At noll, we operate on a principle of no retention and hard deletion. Text submitted for translation is not stored after the response is delivered. There is no training on user data, no logging of document content, and no caching of translations. This does not replace a BAA — but it materially reduces the blast radius of any potential incident.
Safe workflows: redaction, minimal data, and time-limited access
Even with a vendor that offers a BAA, best practice is to minimize the PHI you transmit. A BAA permits processing — it does not make processing risk-free. The Privacy Rule's "minimum necessary" standard (45 CFR §164.502(b)) applies here.
Pre-translation redaction
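The most effective risk reduction is to remove PHI before it reaches any translation tool. This follows the Safe Harbor de-identification method in 45 CFR §164.514(b)(2): once all 18 identifier categories are removed, and you have no actual knowledge that the remaining text could identify the patient, the information is no longer PHI.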
In practice:
- Replace patient names with placeholders (e.g., [PATIENT])
- Replace dates with relative references or placeholders
- Remove medical record numbers, account numbers, and SSNs entirely
- Strip contact information (phone, fax, email, address below state level)
- Remove any other unique identifiers
After translation, re-insert the identifiers into the translated text. This is manual work, but it means the translation service never sees PHI. No BAA is required because no PHI is transmitted.
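The redact-then-restore steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the regular expressions, the `redact` and `restore` function names, and the numbered placeholder format are all hypothetical, and pattern matching alone will not catch every identifier. Names in particular must be supplied by the caller, for example from the structured patient record, and a human should still review the redacted text before it is sent.

```python
import re

# Illustrative patterns only (assumptions): a real workflow needs broader
# coverage and human review. These cover a few US-style formats.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:#]?\s*\d+\b"),
}


def redact(text, known_names=()):
    """Replace identifiers with numbered placeholders.

    Returns (redacted_text, mapping) where mapping is placeholder -> original.
    """
    mapping = {}  # placeholder -> original value
    seen = {}     # (label, value) -> placeholder, so repeats share one placeholder

    def substitute(label, pattern, source):
        def repl(match):
            key = (label, match.group(0))
            if key not in seen:
                count = sum(1 for existing, _ in seen if existing == label) + 1
                seen[key] = f"[{label}_{count}]"
                mapping[seen[key]] = match.group(0)
            return seen[key]
        return pattern.sub(repl, source)

    # Names cannot be found reliably by regex, so the caller supplies them.
    for name in known_names:
        text = substitute("PATIENT", re.compile(re.escape(name)), text)
    for label, pattern in PATTERNS.items():
        text = substitute(label, pattern, text)
    return text, mapping


def restore(translated, mapping):
    """Re-insert original identifiers; fail if a placeholder did not survive."""
    missing = [p for p in mapping if p not in translated]
    if missing:
        raise ValueError(f"placeholders lost in translation: {missing}")
    for placeholder, original in mapping.items():
        translated = translated.replace(placeholder, original)
    return translated
```

The mapping stays on your side and is never sent to the translation service. The check in `restore` matters because a translation engine may alter or drop bracketed placeholders; failing loudly is safer than returning a document with a missing identifier.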
Minimal data transmission
When full redaction is impractical — for example, when the clinical meaning depends on specific dates or the document is too long for manual redaction — limit what you send:
- Translate in segments rather than full documents, so no single request contains enough context to identify a patient
- Separate demographic sections from clinical content and translate them independently
- Use a translation service that does not retain data, so even if PHI is transmitted, it is not stored
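Segment-by-segment translation can be sketched as follows. The helper names `split_segments` and `translate_in_segments`, the blank-line paragraph convention, and the 500-character default are assumptions for illustration; `translate` stands in for whatever client call your organization uses. Segmentation only reduces the context in each request, so a segment that still contains a name or record number is still PHI.

```python
def split_segments(document, max_chars=500):
    """Split on blank lines, then pack paragraphs into segments up to max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence.
    """
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    segments, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            segments.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


def translate_in_segments(document, translate, max_chars=500):
    """Send each segment as its own request via the caller-supplied callable."""
    segments = split_segments(document, max_chars)
    return "\n\n".join(translate(segment) for segment in segments)
```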
Time-limited access and audit trails
For organizations that do transmit PHI to a translation vendor under a BAA:
- Ensure that translated documents are retrieved promptly and not left in vendor systems
- Maintain an internal log of what was sent, when, and to which service
- Periodically audit the vendor's compliance with the BAA terms
- Require breach notification within the timeframe specified in your BAA (HIPAA requires notice without unreasonable delay and no later than 60 calendar days after discovery, but you can negotiate a shorter window)
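The internal log of what was sent, when, and to which service can be kept without copying the document text into the log itself. The sketch below is one possible shape, not a prescribed format: the field names, the JSON Lines file layout, and the use of a keyed HMAC digest instead of the content are all assumptions.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def disclosure_record(document_id, content, vendor, secret_key):
    """Describe one transmission without storing the document text.

    secret_key is bytes held by your organization; the keyed digest lets you
    later prove which content was sent without keeping a copy in the log.
    """
    payload = content.encode("utf-8")
    digest = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return {
        "document_id": document_id,
        "vendor": vendor,
        "sent_at": datetime.now(timezone.utc).isoformat(),
        "content_hmac_sha256": digest,
        "content_bytes": len(payload),
    }


def append_record(path, record):
    """Append the record as one JSON line to an append-only log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")
```

Keep `document_id` an internal reference rather than a medical record number, so the log does not become a second store of identifiers.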
How to evaluate "HIPAA-compliant" marketing claims
The phrase "HIPAA-compliant" in vendor marketing is, at best, shorthand. At worst, it is misleading. There is no HIPAA certification. HHS does not certify software. When a vendor says they are "HIPAA-compliant," they are making an assertion about their own practices — not citing an external validation.
Here is what to actually verify:
- Do they sign a BAA? If no, stop. They are not suitable for PHI workloads regardless of their other controls.
- Have they completed a third-party security assessment? Look for SOC 2 Type II, HITRUST, or an equivalent independent audit. Self-assessments are insufficient.
- Can they provide documentation of their Security Rule implementation? Specifically: risk assessment, access controls, audit controls, integrity controls, and transmission security.
- What is their breach history? Check the HHS Breach Portal (the "Wall of Shame") for any reported incidents.
- Do they use PHI for model training? If a translation vendor feeds your patients' medical records into model improvement pipelines, that is a use beyond treatment, payment, or operations — and it requires explicit authorization or a very carefully drafted BAA.
Be especially skeptical of claims like "HIPAA-ready" or "supports HIPAA compliance." These phrases are designed to sound reassuring without making a concrete commitment. The only concrete commitment is a signed BAA backed by demonstrable security controls.
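The verification questions above can be recorded as a simple checklist so that every vendor is assessed the same way. This is a sketch only: the `VendorAssessment` fields and the `evaluate` function are hypothetical names, and the answers still have to come from contracts and documentation, not from the vendor's marketing page.

```python
from dataclasses import dataclass


@dataclass
class VendorAssessment:
    signs_baa: bool
    third_party_audit: bool        # e.g. SOC 2 Type II or HITRUST
    security_rule_docs: bool       # risk assessment, access/audit controls, etc.
    breach_history_reviewed: bool  # checked against the HHS Breach Portal
    trains_on_phi: bool


def evaluate(vendor):
    """Return (suitable_for_phi, issues). A missing BAA ends the review."""
    if not vendor.signs_baa:
        return False, ["no BAA: not suitable for PHI workloads"]
    issues = []
    if not vendor.third_party_audit:
        issues.append("no third-party security assessment")
    if not vendor.security_rule_docs:
        issues.append("no Security Rule documentation")
    if not vendor.breach_history_reviewed:
        issues.append("breach history not yet checked")
    if vendor.trains_on_phi:
        issues.append("uses PHI for model training")
    return not issues, issues
```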
When to use human translation
Machine translation has improved dramatically, but there are scenarios where human translation remains the appropriate choice for healthcare content — not just for quality, but for risk management.
Regulatory submissions and legal documents
Documents submitted to regulatory bodies (FDA, IRBs, ethics committees) or used in legal proceedings require certified translation. Machine translation output, even when post-edited, may not meet the certification standards required by the receiving authority. The liability exposure from an inaccurate translation in a regulatory filing is significant.
High-acuity clinical content
Medication dosing instructions, surgical consent forms, and psychiatric evaluation notes carry direct patient safety implications. An error in translating a dosage ("take 1 tablet" vs. "take 10 tablets") can cause harm. For these documents, human translation with domain-specific medical expertise and a review step is the standard of care.
When redaction is impossible
Some documents are so densely packed with PHI that redaction would make them untranslatable. A full psychiatric evaluation, for example, weaves identifying information throughout the narrative. For these cases, a human translator operating under a BAA and bound by confidentiality obligations is a more appropriate path than any SaaS tool.
The hybrid approach
The most practical model for most healthcare organizations is hybrid: use machine translation for high-volume, lower-risk content (general patient education materials, internal communications, research abstracts) and reserve human translation for the categories above. This balances cost, speed, and risk — but it requires clear internal policies about which documents go through which workflow.
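An internal routing policy of this kind can be written down explicitly. The sketch below encodes the categories named in this article; the category strings, the `choose_workflow` function, and the choice to send unknown categories to human translation are assumptions, and a real policy would be set by your compliance team.

```python
# Categories this article reserves for human translation.
HUMAN_ONLY = {
    "regulatory_submission",
    "legal_document",
    "medication_dosing",
    "surgical_consent",
    "psychiatric_evaluation",
}

# High-volume, lower-risk categories suited to machine translation.
MACHINE_OK = {
    "patient_education",
    "internal_communication",
    "research_abstract",
}


def choose_workflow(category, redactable=True):
    """Return "human" or "machine" for a document category."""
    if category in HUMAN_ONLY or not redactable:
        return "human"
    if category in MACHINE_OK:
        return "machine"
    # Assumption: anything unclassified takes the more conservative path.
    return "human"
```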
HIPAA compliance in translation is not a product feature. It is a set of organizational decisions about data handling, vendor relationships, and workflow design. A BAA is necessary for PHI workloads — there is no workaround for that requirement. But a BAA alone is not sufficient. The organizations that handle this well combine contractual protections with technical controls and operational discipline: redaction where possible, minimal data transmission, no-retention vendors, and human oversight for high-risk content.
We are building noll to support the technical side of that equation — zero retention, no training on user data, hard deletion. The BAA is coming, but we will not claim it before it exists. In the meantime, the redaction-first workflow described above is a defensible path for organizations that need to translate sensitive content today.