There's Now a Free Tool That Strips PII Before Your AI Sees It. Here's When to Use It.
OpenAI released a free, on-device model that catches personally identifiable information before text reaches any server. It scores 96% F1 across eight categories, and the roughly 4% it misses is where your judgment still matters.
By Forge Team
The main reason teams hesitate before pasting internal documents into AI tools is not that they distrust the output. It is that the input contains names, email addresses, employee records, and contract details that were never meant to leave the building. There is now a free tool that screens for that information locally and strips identifying details before any text reaches a server or your AI tool.
What OpenAI released
On April 22, OpenAI published a free, open-weight model — 1.5 billion parameters — built specifically to detect and redact personally identifiable information from text before it leaves your device. The model runs entirely on-device, which means the content it screens never travels to an external server during the detection step. TLDR AI reported on the release the same day.
The model identifies eight categories of personally identifiable information: names, email addresses, phone numbers, dates of birth, financial account identifiers, and three others. Across all eight, it achieves a 96% F1 score, a standard measure that balances recall (catching real identifiers) against precision (avoiding false positives on words that merely resemble them).
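For readers who want to see how an F1 score is built, here is a minimal sketch. The precision and recall pairs are hypothetical, chosen only to show that different splits can land on the same headline number. They are not figures published for this model.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical splits (assumptions, not published figures).
balanced = f1_score(precision=0.96, recall=0.96)
lopsided = f1_score(precision=0.99, recall=0.932)

# Both round to 0.96, but the second one misses more real identifiers.
print(round(balanced, 2), round(lopsided, 2))
```

The takeaway is that a 96% F1 score is a good summary, but it does not say exactly how many identifiers are missed. That depends on recall, which can sit above or below the headline figure.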
What to do differently from Monday
The 96% figure is an F1 score rather than a direct miss rate, but as a working estimate it means roughly four out of every hundred identifiable pieces of information will pass through undetected. For a document with 50 names and addresses, that is about two items. For a large export with 1,000 identifiers, it is about forty.
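The arithmetic above can be sketched in a few lines. This assumes the miss rate is a flat 4% (recall of 0.96) and that each identifier is caught or missed independently of the others. Both are simplifying assumptions, and the function names are illustrative.

```python
def expected_misses(identifier_count: int, recall: float = 0.96) -> float:
    """Average number of identifiers left in the text after filtering."""
    return round(identifier_count * (1 - recall), 1)


def chance_of_any_miss(identifier_count: int, recall: float = 0.96) -> float:
    """Probability that at least one identifier slips through,
    assuming each one is caught or missed independently."""
    return 1 - recall ** identifier_count


for count in (50, 1000):
    print(count, expected_misses(count), f"{chance_of_any_miss(count):.0%}")
```

Under these assumptions, even the 50-identifier document is more likely than not to contain at least one miss, which is the practical case for a spot check.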
That rate is low enough to make the filter genuinely useful as a first pass on routine work such as employee survey analysis, customer feedback synthesis, and draft reports built from meeting notes. It catches most of what a manual skim misses on a busy afternoon. The 4% is not a reason to skip it. It is a reason to keep a spot-check step in place before the processed content goes anywhere external.
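One way to make the spot check routine is to pull a random sample of filtered records for a human to read. This is a minimal sketch. The function name and the default sample size of 20 are assumptions, not a recommendation from the release.

```python
import random


def spot_check_sample(records: list, sample_size: int = 20, seed=None) -> list:
    """Return a random subset of already-filtered records for manual review."""
    rng = random.Random(seed)
    k = min(sample_size, len(records))  # never ask for more than exists
    return rng.sample(records, k)


# Hypothetical filtered survey responses.
filtered = [f"response {i}" for i in range(200)]
to_review = spot_check_sample(filtered, sample_size=20, seed=7)
print(len(to_review))
```

A fixed seed makes the sample repeatable if someone later asks which records were checked.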
If your workflow involves legal filings, medical records, or documents with a compliance sign-off requirement, the filter still earns its place in the process — it speeds up the human review step rather than replacing it.
What this looks like in practice
A training coordinator at a 75-person logistics company runs monthly employee pulse surveys across three regional offices. The survey platform exports anonymous responses, but employees regularly mention colleagues and managers by name when describing specific situations. Before running any thematic analysis in an AI tool, she runs the export through the PII filter. It removes the names she expected, and it catches a direct email address embedded in one free-text comment, something she would likely have missed skimming 200 responses on deadline.
The filter does not change her workflow in any fundamental way. It adds four minutes. What changes is her confidence that the content she pastes into an AI tool no longer carries information that could identify individuals.
What the filter does not catch
A legal operations manager at a 300-person consulting firm handles document reviews that include client references. Her concern is not just PII in the technical sense — names, account numbers, dates of birth — it is commercially sensitive context: deal valuations, unreleased product roadmaps, acquisition targets, internal project codes.
The PII filter does not flag any of that. It screens for identifiable personal information, not for confidential business information. Those are different categories with different risk profiles, and treating the filter as a general "sensitive information" screen is a mistake that looks reasonable until something slips through.
The practical check before any document enters an AI tool: run the filter for compliance exposure, then apply your own judgment to commercial sensitivity as a separate pass. The tool handles one of the two categories. Knowing which one is the skill.
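The two-pass check might look something like the sketch below. Everything in it is a stand-in. `redact_pii_stand_in` is a simple email regex used as a placeholder, not the released model or its API, and the watchlist terms are hypothetical examples a team would replace with its own.

```python
import re

# Placeholder for pass one. The real step would call the on-device PII model;
# this stand-in only masks email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical watchlist for pass two: commercial terms, not personal data.
COMMERCIAL_TERMS = ("valuation", "acquisition", "roadmap", "project code")


def redact_pii_stand_in(text: str) -> str:
    """Pass one: mask personal identifiers (emails only in this sketch)."""
    return EMAIL_RE.sub("[EMAIL]", text)


def commercial_flags(text: str) -> list:
    """Pass two: list watchlist terms that need a human decision."""
    lowered = text.lower()
    return [term for term in COMMERCIAL_TERMS if term in lowered]


def pre_flight(text: str) -> tuple:
    """Run both passes and return the redacted text plus any flags."""
    redacted = redact_pii_stand_in(text)
    return redacted, commercial_flags(redacted)


doc = "Contact jane.doe@example.com about the acquisition valuation."
redacted, flags = pre_flight(doc)
print(redacted)
print(flags)
```

A keyword list will not catch every commercially sensitive passage. Its job here is to force a pause for human judgment, which is the part the PII filter does not cover.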
The actual change this week
Teams now have a free, local option for the PII screening step — the kind of step that gets skipped on a busy day or handled with a fast manual search. Using it does not guarantee compliance. Not using it is a consistent way to miss the identifiers a skim does not catch.