Roadmap
Known limitations and open problems. If something interests you, open an issue or pull request on GitHub.
Open problems
- No semantic understanding of privacy. Regex catches structured PII like emails and phone numbers. It misses anything contextual: journal entries, health history, relationship details, anything that's private but has no fixed format.
- No user-defined sensitive patterns. You can't tell it "treat anything about my company X as private." It only recognizes the patterns built into it.
- Redaction breaks context. The LLM gets [NAME_1] but loses the relationship between that name and everything else in the prompt. For complex reasoning tasks this degrades answer quality.
- Streaming restoration is best-effort. Placeholder tokens such as [NAME_1] that get split across SSE chunks can fail to restore correctly.
- No multi-modal support. Images, PDFs, audio: anything that isn't text passes through unscanned.
- Single conversation scope. Mappings don't persist across sessions. Long-running agents or multi-session workflows re-expose the same PII under different tokens.
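
To make the streaming limitation concrete, here is a minimal sketch of one possible mitigation: hold back any trailing fragment that could still grow into a placeholder, and only emit text once it is unambiguous. This is not the project's implementation. The function name `restore_stream`, the `mapping` dictionary, and the assumed placeholder shape (`[NAME_1]`, `[EMAIL_12]`: uppercase label, underscore, digits) are all hypothetical and chosen for illustration.

```python
import re
from typing import Dict, Iterable, Iterator

# Assumed placeholder shape, e.g. [NAME_1] or [EMAIL_12] (hypothetical).
TOKEN = re.compile(r"\[[A-Z]+_\d+\]")
# A trailing fragment that could still become a placeholder: "[", "[NA", "[NAME_", "[NAME_1".
PARTIAL = re.compile(r"\[(?:[A-Z]+(?:_\d*)?)?\Z")


def _restore(text: str, mapping: Dict[str, str]) -> str:
    # Unknown placeholders are left untouched rather than dropped.
    return TOKEN.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


def restore_stream(chunks: Iterable[str], mapping: Dict[str, str]) -> Iterator[str]:
    """Restore placeholders in streamed text, buffering possible partial tokens."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        partial = PARTIAL.search(buffer)
        # Everything before a possible partial placeholder is safe to emit now.
        cut = partial.start() if partial else len(buffer)
        ready, buffer = buffer[:cut], buffer[cut:]
        if ready:
            yield _restore(ready, mapping)
    if buffer:
        # Stream ended: flush whatever was held back.
        yield _restore(buffer, mapping)


if __name__ == "__main__":
    mapping = {"[NAME_1]": "Alice", "[EMAIL_1]": "alice@example.com"}
    chunks = ["Hi [NA", "ME_1], mail [EMAIL", "_1] now"]
    print("".join(restore_stream(chunks, mapping)))
```

The trade-off is latency: text that ends in something placeholder-like is delayed until the next chunk arrives. A stray `[` that never closes is only held until the following chunk shows it cannot be a placeholder, or until the stream ends.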