Own the redaction layer.

Wordcab Redact is private PII, PHI, and PCI detection that ships inside your product. Built with Knowledgator on the open GLiNER model family. Embeddable, fine-tunable, and deployable in the same boundary your team already operates.

Built in partnership with Knowledgator, whose open-source extraction models have crossed 5M+ downloads on Hugging Face. The GLiNER-PII checkpoints we co-authored are open and Apache 2.0; benchmark them on your data before you talk to us. Wordcab Redact is the production build: vertical fine-tunes, hardware-tuned variants, a named engineer on your account, an SLA, and indemnified commercial licensing.

From a strong baseline to your production data.

The open GLiNER-PII reference posts 98% character-level F1 on Knowledgator's published EHR benchmark, ahead of every commercial DLP measured against it. That's a strong starting baseline, and the right place to begin an evaluation.

Every production deployment then meets text that no published benchmark covers in full: domain-specific entity types, tokenization quirks in spoken-form numbers, lowercase blood types in clinical notes, MRNs that share token shape with billing codes, and the entity classes unique to your vertical. Wordcab Redact carries the baseline into your environment, fine-tuned on representative data, packaged for your hardware, and supported.
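The spoken-form failure mode is easy to reproduce. A toy sketch (the regex is illustrative, not any vendor's actual rule) of why digit-pattern matching never sees a transcribed card number:

```python
import re

# A classic pattern-based DLP rule for card numbers: runs of 13-16 digits,
# optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

written = "Card on file is 4242 4242 4242 4242."
spoken = ("Card on file is four two four two, four two four two, "
          "four two four two, four two four two.")

# The written form matches...
assert CARD_PATTERN.search(written) is not None
# ...but the spoken-form transcript sails straight past the rule.
assert CARD_PATTERN.search(spoken) is None
```

A zero-shot NER model is asked for the entity by label rather than by surface pattern, which is what puts spoken-form numbers in scope at all.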

98% F1
Knowledgator's published EHR benchmark, N=376

A strong starting baseline

Open Apache 2.0 GLiNER-PII posts 98% character-level F1 on EHR PII, ahead of Azure Language (50.2%), Presidio (22.3%), and the leading generic LLMs on the same benchmark. Wordcab Redact carries that baseline into vertical fine-tunes for healthcare, finance, legal, and contact-center deployments.

~120 ms
NER latency vs. ~3,200 ms for GPT-4o

~20× faster than LLM redaction

The pipeline finishes before an LLM finishes its first token. Inline in a voice agent's turn budget, or millions of documents in an overnight batch. Same model, same control plane.

Zero egress
no audio, text, or entities leave the boundary

Inside the perimeter

Runs on the same Helm chart as Voice and Think. VPC, on-prem, airgap. No call-home in the critical path, no telemetry to Wordcab on redacted content.

CPU-native
x86 / ARM / GPU all supported

Runs on the hardware you have

~20k tokens/sec on a single L4 with the open reference. Wordcab Redact adds quantized CPU and distilled GPU variants tuned for the boxes you actually operate; edge deployments are a configuration, not a port.

This is the redaction layer.

01. Detect

Entity detection

Find PII, PHI, PCI, credentials, and custom entities in any text.

The GLiNER detection model is zero-shot. Name an entity in the request and the model finds it. Common types ship with vetted defaults. Tenant-specific identifiers like INTERNAL_TICKET_ID or PATIENT_ROOM_NUMBER work without a retrain.
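Zero-shot means a custom type is literally a string in the request. A hypothetical payload sketch (the field names "text", "entities", and "mode" are illustrative assumptions, not the documented Wordcab API):

```python
import json

# The point is structural: a tenant-specific entity type is just another
# string in the list, named at request time with no retraining step.
payload = {
    "text": "Pt in room 414B, ticket INC-20931 opened for the MRI order.",
    "entities": [
        "PERSON", "MEDICAL_RECORD",   # vetted defaults
        "PATIENT_ROOM_NUMBER",        # tenant-specific, named on the fly
        "INTERNAL_TICKET_ID",
    ],
    "mode": "detect",  # spans only; the application controls substitution
}
body = json.dumps(payload)
```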

02. Replace

Replacement modes

Detect-only, placeholder replacement, pseudonymization, or character mask.

Pick the mode that fits the workflow. Pseudonymize keeps referential integrity across utterances so downstream analytics stay useful. Mask preserves length and casing for systems that depend on it. Detect returns spans only, so your application controls substitution.
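The three substitution modes can be sketched over detect-mode output (detect-only is just the spans themselves). Illustrative logic, not the shipped implementation; spans here are (start, end, label) character ranges:

```python
def mask(text, spans):
    """Character mask preserving length, casing class, and punctuation."""
    out = list(text)
    for start, end, _label in spans:
        for i in range(start, end):
            c = text[i]
            if c.isupper():
                out[i] = "X"
            elif c.islower():
                out[i] = "x"
            elif c.isdigit():
                out[i] = "0"
            # spaces and punctuation pass through, so downstream
            # parsers that depend on shape keep working
    return "".join(out)

def placeholder(text, spans):
    """Replace each span with its label token, right to left."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

def pseudonymize(text, spans, table):
    """Stable pseudonyms: same surface form -> same replacement, so
    cross-utterance references survive for downstream analytics."""
    for start, end, label in sorted(spans, reverse=True):
        key = (label, text[start:end])
        table.setdefault(key, f"{label}_{len(table) + 1}")
        text = text[:start] + table[key] + text[end:]
    return text

text = "Jane Doe called about card 4242."
spans = [(0, 8, "PERSON"), (27, 31, "CARD_NUMBER")]
```

Passing the same pseudonym table across utterances is what keeps referential integrity: a repeated name maps to the same replacement every time.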

03. Tune

Vertical fine-tunes

Healthcare, financial services, legal, contact center, or your own variant.

The open Apache 2.0 reference is a strong starting baseline. Wordcab Redact carries it forward into the verticals we ship for. A healthcare or finance fine-tune is trained on representative data from that domain, plus the tokenization edge cases every production deployment eventually meets. Lowercase blood types, spoken-form account numbers, MRNs that share token shape with billing codes, and the quiet failures auditors find first.

04. Operate

Same control plane

Helm chart, observability, RBAC, and audit log shared with the rest of the Wordcab stack.

Redact installs as part of the existing Wordcab Helm chart. Prometheus, OpenTelemetry, audit log, and RBAC are the same surfaces your team already operates Voice and Think against. One stack, one upgrade path, one set of dashboards.

Compared against the systems teams actually evaluate.

Character-level F1 on Electronic Health Records, N=376. Source: Knowledgator's open benchmark on the GLiNER-PII reference checkpoint. Vertical fine-tunes are trained on representative data from each domain and measured on yours during Pilot, so the deployment number is the one you ship on.

System · Type · EHR PII F1 · Where it fits
  • GLiNER-PII (Knowledgator + Wordcab) · Apache 2.0 reference, the baseline · Specialized NER · 98.0% · Open, Apache 2.0. Benchmark on your data, run it yourself.
  • Generic LLM (GPT-class) · proprietary, per-token billing · Generative · 84.5% · Expensive at scale, slow inline, leaves the perimeter.
  • Open LLM (Llama-class) · open weights, self-hosted · Generative · 77.8% · Better than nothing for PHI; not a redaction engine on its own.
  • Azure Language · managed DLP · Hosted classifier · 50.2% · Outside your boundary. Pricing scales with volume.
  • Microsoft Presidio · open source, rule-driven · Pattern + NER · 22.3% · Useful as a baseline; not production-grade on clinical text.

Numbers above are from Knowledgator's published character-level micro-average benchmark on EHR PII. They establish a strong baseline against generic comparators. Every Pilot includes a redaction eval against your representative text so the numbers you ship on are the ones measured on your data, not on a public test set.
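Character-level micro-averaged F1 can be sketched in a few lines (our reading of the metric named above; Knowledgator's exact scorer may differ in details such as label matching). Each character inside a gold or predicted span counts as one positive:

```python
def char_f1(gold_spans, pred_spans):
    """gold_spans / pred_spans: per-document lists of (start, end)
    half-open character ranges. Micro-average: pool all characters
    across documents before computing precision and recall."""
    gold, pred = set(), set()
    for doc_id, (g, p) in enumerate(zip(gold_spans, pred_spans)):
        for s, e in g:
            gold |= {(doc_id, i) for i in range(s, e)}
        for s, e in p:
            pred |= {(doc_id, i) for i in range(s, e)}
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Character-level scoring gives partial credit for partially found entities, which is why it is stricter than entity-level exact match for a model that clips span boundaries.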

Entity coverage you can point at in audit.

The detection model is zero-shot. Any entity type named in the request is valid. The list below is the vetted set Wordcab benchmarks and supports per vertical. Custom types are a request parameter, not a retraining cycle.

General PII

Default across every vertical
  • PERSON · names, aliases
  • EMAIL, PHONE, URL
  • ADDRESS, IP_ADDRESS
  • DATE_OF_BIRTH, AGE, GENDER
  • USERNAME, PASSWORD
  • ORGANIZATION, JOB_TITLE

Healthcare · PHI

HIPAA Safe Harbor
  • MEDICAL_RECORD, HEALTH_PLAN_ID
  • BLOOD_TYPE (incl. lowercase)
  • MEDICATION, DIAGNOSIS
  • PROCEDURE, PROVIDER_NAME
  • FACILITY_NAME, DEVICE_ID
  • BIOMETRIC_ID, LICENSE_NUMBER

Finance · PCI

PCI DSS + KYC
  • CARD_NUMBER, CARD_EXPIRY
  • CARD_CVV, ROUTING_NUMBER
  • IBAN, SWIFT, BIC
  • ACCOUNT_NUMBER, CUSTOMER_ID
  • SSN, TAX_ID
  • CRYPTO_WALLET

Legal, contact-center, and government variants add their own type sets (case identifiers, agent IDs, classification markings). Custom entities are a free-text label in the request body, useful for tenant-specific identifiers like internal ticket IDs or patient room numbers.

Built for the teams making the embed-vs-buy call.

The buyers we ship to are technical product leads at data-security, AI-security, and archive companies, plus the engineering teams in regulated verticals embedding redaction into their own platform.

DSPM & DLP Platforms
AI & LLM Security
Archive & eDiscovery
Healthcare Platforms
Banking & Financial Services
Contact Centers

Detection accuracy is your product.

License or embed the model your DSPM or DLP competes on. Same engine, your packaging.

Frequently asked questions

What's the difference between the open GLiNER-PII model and Wordcab Redact?
Open GLiNER-PII is the Apache 2.0 reference. Same architecture, general-purpose tuning, strong baseline. It's a great place to start and we recommend benchmarking it on your data first. Wordcab Redact is the production build: vertical fine-tunes (healthcare, finance, legal, contact center) that push accuracy higher on the data those verticals actually see; hardware-tuned variants (quantized CPU, distilled GPU, edge); a named engineer who owns your fine-tune and fixes regressions instead of pointing at a forum; an SLA; indemnified commercial licensing; and integration into the same control plane as Voice, Think, and Adapt.
Can it handle the edge cases generic DLP misses?
Every production deployment eventually meets entities and patterns that no published benchmark covers: lowercase blood types in clinical text, account numbers spoken digit-by-digit, MRNs that tokenize like billing codes. The vertical fine-tunes are trained on representative data from each domain so those cases stay covered. If you hit a new one in production, the named engineer who owns your fine-tune fixes it.
How does it integrate with the rest of the Wordcab stack?
Redact installs as part of the same Helm chart as Voice and Think. The API is OpenAI-style: a single /v1/redact endpoint, same auth, same Webhook Portal for async jobs. Inline mode runs inside the voice agent's turn budget; batch mode handles archive backfills overnight on the same cluster.
Can we run it fully on-prem or airgapped?
Yes. Same deployment model as the rest of the Wordcab stack: VPC, on-prem, hybrid, airgap. The model ships as a signed offline bundle with no outbound dependencies and no telemetry on redacted content.
What hardware do we need?
A single L4 or L40S handles ~20k tokens/sec on the open reference. The CPU build runs at production throughput on AVX-512 boxes you already operate. For embedded edge deployments we ship a distilled variant: sub-100 ms inference on commodity CPU. Pilot includes a hardware sizing pass against your real traffic.
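A back-of-envelope version of that sizing pass, using the throughput figure above. The batch window and average document length are placeholder assumptions to replace with your own measurements:

```python
tokens_per_sec = 20_000   # open reference on a single L4 (figure above)
batch_window_h = 8        # assumed overnight batch window
avg_doc_tokens = 800      # assumed average document length

# Documents one GPU can redact in one overnight window.
docs_per_night = tokens_per_sec * batch_window_h * 3600 // avg_doc_tokens
```

At these assumptions a single L4 clears 720,000 documents a night, which is why backfills rarely drive the hardware decision.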
How do we get started?
Benchmark the open model on your data first. It's Apache 2.0 and that's the right starting point. When you're ready to put redaction behind something customer-facing, we scope a paid pilot for the Wordcab Redact build: representative sample of your text, the entity types that matter, the hardware you want to run on. Fixed price, no platform commitment until rollout.

Ship redaction your audit team can sign off on.

If your platform sends voice, transcripts, or LLM outputs through healthcare, finance, or any regulated workflow, Wordcab Redact is the layer that takes a strong baseline and tunes it to the data your auditors will actually see.

Talk to an Engineer

We usually respond within one business day.
