Common questions and answers about SecMask.
- General Questions
- Technical Questions
- Usage Questions
- Deployment Questions
- Privacy & Security
- Performance
- Troubleshooting
## General Questions

### What is SecMask?

SecMask is a Mixture of Experts (MoE) system for detecting and masking secrets (API keys, tokens, credentials) in text. It uses two specialized NER models:
- Fast Expert: DistilBERT-based, handles 92.7% of cases in ~6ms
- Long Expert: Longformer-based, handles complex cases requiring up to 2048 tokens
### Why not just use regex?

Regex limitations:
- Brittle patterns that break with minor variations
- High false positive rates
- Can't handle context-dependent secrets
- Requires constant maintenance
SecMask advantages (see the sketch after this list):
- ML-based detection learns patterns from data
- Low false positive rate (82% precision, production-safe)
- Handles context (distinguishes real secrets from examples)
- Automatically adapts to new secret formats
- Multi-stage pipeline (NER + deterministic filters)
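
To make the contrast concrete, here is a minimal sketch (the pattern and strings are illustrative, not SecMask's actual filter set) showing how a single regex both flags documentation placeholders and misses format variants:

```python
import re

# One of the many patterns a regex-only scanner would need to maintain
OPENAI_KEY = re.compile(r"sk-[A-Za-z0-9]{20,}")

doc_example = 'README says: export OPENAI_API_KEY="sk-EXAMPLEEXAMPLEEXAMPLE0000"'
real_leak = "committed my key sk_live_abcdef0123456789abcdef"  # underscore variant

print(bool(OPENAI_KEY.search(doc_example)))  # True  -- false positive on a placeholder
print(bool(OPENAI_KEY.search(real_leak)))    # False -- misses the variant format
```

A context-aware model can learn that the first string is an example and the second is a live credential; a regex sees only the surface pattern.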
### What types of secrets can SecMask detect?

SecMask detects:
- API keys (OpenAI, Stripe, SendGrid, etc.)
- Cloud credentials (AWS, Azure, GCP)
- GitHub tokens (classic, fine-grained, PATs)
- JWT tokens
- SSH/PEM keys
- Database connection strings
- Kubernetes secrets
- And more...
See BENCHMARKS.md for detailed detection rates.
### Is SecMask free to use?

Yes! SecMask is released under dual licensing:
- SecMask codebase: MIT License (training scripts, inference code, documentation)
- Fine-tuned models: Apache 2.0 (inherited from DistilBERT and Longformer base models)
You can:
- ✅ Use freely in commercial projects
- ✅ Modify and redistribute
- ✅ Contribute improvements
- ✅ Fine-tune for your own use cases
Attribution required for:
- DistilBERT base model (© Hugging Face, Apache 2.0)
- Longformer base model (© Allen Institute for AI, Apache 2.0)
See LICENSE and NOTICE for full details.
### How accurate is SecMask?

On our test set (600 examples) at τ=0.80 threshold:
- F1 Score: 0.52 (NER model only)
- Precision: 82% (low false positives, production-safe)
- Recall: 38% (NER component)
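
These three numbers are mutually consistent, since F1 is the harmonic mean of precision and recall:

```python
# F1 = 2PR / (P + R), the harmonic mean of precision and recall
precision, recall = 0.82, 0.38
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.52
```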
Production Note: These metrics represent the NER models alone. Production deployments combine NER detection with deterministic filters (PEM blocks, K8s secrets, pattern-based matching) for comprehensive secret coverage while maintaining the high precision guarantee.
See BENCHMARKS.md for detailed metrics and evaluation methodology.
## Technical Questions

### What is a Mixture of Experts (MoE)?

MoE is an architecture where multiple specialized models ("experts") handle different types of inputs:
- Router decides which expert to use based on input characteristics
- Fast Expert (DistilBERT, 512 tokens) handles most cases quickly
- Long Expert (Longformer, 2048 tokens) handles complex cases
This gives us both speed (6ms average) and accuracy.
### Why use two experts instead of one?

Each expert trades speed against capacity:
| Aspect | Fast Expert | Long Expert |
|---|---|---|
| Latency | 6ms | 12ms |
| Max tokens | 512 | 2048 |
| Model size | 268MB | 592MB |
| Use case | Single secrets | Long configs |
Using both gives us the best of both worlds:
- 92.7% of texts processed in 6ms (fast expert)
- Complex cases escalated to long expert automatically
### How does the router decide which expert to use?

The router uses simple heuristics:
```python
def should_escalate(text):
    """Decide if we need the long expert."""
    # Short text -> fast expert
    if len(text.split()) < 100:
        return False
    # No complex multi-line patterns -> fast expert
    if not has_multi_line_structure(text):
        return False
    # Otherwise, use the long expert
    return True
```

See router.py for the full implementation.
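
`has_multi_line_structure` lives in router.py and isn't shown above; a minimal stand-in that captures the idea (illustrative only, not the shipped heuristic):

```python
def has_multi_line_structure(text, min_lines=5):
    """Rough proxy: configs, YAML manifests, and PEM blocks span many lines."""
    non_empty = [line for line in text.splitlines() if line.strip()]
    return len(non_empty) >= min_lines
```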
### Can I train my own model?

Yes! See train_ner_masker.py for the training script.
Requirements:
- Labeled dataset (BIO-tagged secrets)
- GPU (NVIDIA T4 or better)
- ~2 hours training time
Steps:
```bash
# Prepare data (see data/README.md)
python data/make_v2_data.py

# Train fast expert
python train_ner_masker.py \
  --model-name distilbert-base-uncased \
  --train-file data/v2_train.jsonl \
  --val-file data/v2_val.jsonl

# Train long expert
python train_longformer_expert.py \
  --model-name allenai/longformer-base-4096 \
  --train-file data/long_context_train.jsonl \
  --val-file data/long_context_val.jsonl
```

### What architecture do the models use?

Fast Expert:
- Base: `distilbert-base-uncased` (66M parameters)
- Task: Token classification (NER)
- Labels: `O` (non-secret), `B-SECRET`, `I-SECRET`
- Context window: 512 tokens
Long Expert:
- Base: `allenai/longformer-base-4096` (149M parameters)
- Task: Token classification (NER)
- Labels: Same as fast expert
- Context window: 2048 tokens (4096 max)
Both use standard HuggingFace transformers architecture.
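
A quick way to confirm the label set on a downloaded checkpoint (the exact index order below is an assumption; check the config of the model you actually pull):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("andrewandrewsen/distilbert-secret-masker")
print(config.id2label)  # e.g. {0: "O", 1: "B-SECRET", 2: "I-SECRET"}
print(config.max_position_embeddings)  # 512 for the fast expert
```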
### How do I add support for a new secret type?

Option 1: Retrain with new data
```bash
# Add labeled examples to data/v2_train.jsonl
echo '{"text": "New secret: xyz-123", "labels": ["O", "O", "B-SECRET"]}' >> data/v2_train.jsonl

# Retrain model
python train_ner_masker.py --train-file data/v2_train.jsonl
```

Option 2: Add a regex filter
Add to filters.json:

```json
{
  "name": "custom_secret",
  "pattern": "xyz-[0-9]{3}",
  "confidence": 0.95
}
```

Then apply the filter:

```python
from filters import apply_filters

masked = apply_filters(text, filters)
```

## Usage Questions

### How do I install SecMask?

Quick install:
```bash
# Install dependencies
pip install transformers torch

# Download code
git clone https://github.com/andrewandrewsen/secmask.git
cd secmask

# Run
python infer_moe.py --in file.txt \
  --fast-model andrewandrewsen/distilbert-secret-masker
```

See README.md for detailed instructions.

### How do I use SecMask from Python?
```python
from infer_moe import mask_text_moe

# Basic usage
masked = mask_text_moe(
    "My API key is sk-1234567890",
    fast_model_dir="andrewandrewsen/distilbert-secret-masker",
)
print(masked)  # "My API key is [SECRET]"
```

See EXAMPLES.md for more examples.
### How do I adjust detection sensitivity?

Use the --tau parameter (threshold):
```bash
# More sensitive (more detections, more false positives)
python infer_moe.py --in file.txt --tau 0.50

# Less sensitive (fewer false positives, may miss some secrets)
python infer_moe.py --in file.txt --tau 0.90

# Default (balanced)
python infer_moe.py --in file.txt --tau 0.80
```

Recommended thresholds:
- Production logs: `tau=0.85` (minimize false positives)
- Pre-commit hooks: `tau=0.75` (catch more secrets)
- Security audits: `tau=0.70` (be extra cautious)
### Can I process multiple files at once?

Yes! Use a simple loop:
```bash
# Bash
for file in *.py; do
  python infer_moe.py --in "$file" --out "${file}.masked"
done
```

```python
# Python
from pathlib import Path
from infer_moe import mask_text_moe

for file in Path('.').glob('*.py'):
    with open(file, 'r') as f:
        content = f.read()
    masked = mask_text_moe(
        content,
        fast_model_dir="andrewandrewsen/distilbert-secret-masker",
    )
    with open(f"{file}.masked", 'w') as f:
        f.write(masked)
```

### How do I use private models from Hugging Face?

Option 1: Login via the CLI
```bash
huggingface-cli login
# Enter your token when prompted
```

Option 2: Environment variable

```bash
export HF_TOKEN="hf_xxxxxxxxxxxxx"
python infer_moe.py --in file.txt --fast-model my-org/private-model
```

Option 3: Pass the token directly

```bash
python infer_moe.py --in file.txt \
  --fast-model my-org/private-model \
  --token hf_xxxxxxxxxxxxx
```

## Deployment Questions

### Can I use SecMask in production?

Yes! SecMask is production-ready. See DEPLOYMENT.md for:
- Docker deployment
- Kubernetes
- AWS Lambda
- Azure Functions
### What are the hardware requirements?

Minimum (CPU-only):
- 2 CPU cores
- 4GB RAM
- ~6-10ms latency per request
Recommended (GPU):
- NVIDIA T4 or better
- 8GB RAM
- ~3-5ms latency per request
See BENCHMARKS.md for details.
### How do I scale SecMask?

Horizontal scaling (multiple instances):
```bash
# Kubernetes
kubectl scale deployment secmask --replicas=10

# Docker Swarm
docker service scale secmask=10
```

Vertical scaling (more resources):

```yaml
resources:
  requests:
    memory: "8Gi"
    cpu: "4000m"
```

See DEPLOYMENT.md for auto-scaling setup.

### Can I deploy SecMask on AWS Lambda?
Yes! See DEPLOYMENT.md for setup.
Key considerations (a handler sketch follows the list):
- Use container image deployment (not zip)
- Set timeout to 30s
- Set memory to 2048MB
- Pre-download models at build time
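
A minimal handler sketch under those constraints. It assumes infer_moe is installed in the container image, that mask_text_moe reuses the loaded model across warm invocations, and that the MODEL_DIR path is hypothetical:

```python
# lambda_handler.py -- illustrative sketch, not shipped code
from infer_moe import mask_text_moe

# Hypothetical path where the image bakes in the model at build time
MODEL_DIR = "/opt/models/distilbert-secret-masker"

def handler(event, context):
    # First invocation pays the model-load cost; warm invocations are fast
    masked = mask_text_moe(event.get("text", ""), fast_model_dir=MODEL_DIR)
    return {"statusCode": 200, "body": masked}
```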
### Does SecMask work offline?

Yes, once the models are downloaded:
```bash
# Download models once while online
python -c "from transformers import AutoModel; \
  AutoModel.from_pretrained('andrewandrewsen/distilbert-secret-masker')"

# Now works offline
python infer_moe.py --in file.txt \
  --fast-model ~/.cache/huggingface/hub/models--andrewandrewsen--distilbert-secret-masker
```

You can also set `TRANSFORMERS_OFFLINE=1` so transformers never attempts a network call.

## Privacy & Security

### Is my data sent anywhere?

No. SecMask runs entirely locally. Your data never leaves your machine unless you:
- Use HuggingFace Inference API (not recommended)
- Deploy SecMask as a remote service
### Can I use SecMask with sensitive data?

Yes! SecMask:
- Processes data in-memory only
- Doesn't log secrets (only metadata)
- Doesn't send telemetry
Best practices:
- Run SecMask on-premises
- Use local model storage (not HF cache)
- Review masked output before sharing
### Does SecMask log the secrets it finds?

No. SecMask is designed to prevent secret leakage:
- Only logs masked text (not original)
- Doesn't log model predictions
- Safe to enable debug logging
Example log output:

```
INFO: Masking text (length: 245 chars)
INFO: Masked in 6.3ms, found 2 secrets
```
### How do I report a security vulnerability?

DO NOT open public issues for security vulnerabilities.
Instead:
- Email: [security@example.com]
- Include: Description, impact, steps to reproduce
- We'll respond within 48 hours
See SECURITY.md for details.
## Performance

### How fast is SecMask?

Latency:
- Fast expert: 6ms (median), 12ms (P99)
- Long expert: 12ms (median), 25ms (P99)
- MoE average: 6.8ms (92.7% use fast expert)
Throughput:
- CPU: ~50 requests/second (single core)
- GPU (T4): ~300 requests/second
- GPU (A100): ~1200 requests/second
See BENCHMARKS.md for detailed metrics.
### Why is the first request slow?

Model loading overhead:
- First request loads model into memory (~2-5s)
- Subsequent requests reuse loaded model (fast)
Solutions:
- Keep service running (don't restart per request)
- Use model caching
- Pre-load models at startup
```python
# Pre-load models at startup
from infer_moe import load_model

model, tokenizer = load_model("andrewandrewsen/distilbert-secret-masker")
# Now fast for all requests
```

### How can I make inference faster?

1. Use fast-only mode (no escalation):
```bash
python infer_moe.py --in file.txt --no-escalate
# 2x faster, slight accuracy loss
```

2. Use a GPU:
```bash
# Used automatically if one is available
python infer_moe.py --in file.txt
```

3. Batch processing:
```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="andrewandrewsen/distilbert-secret-masker",
    batch_size=16,  # process 16 texts at once
)
results = pipe(texts)  # much faster than one-by-one
```

4. ONNX conversion:
```bash
# Convert to ONNX (2-3x faster) with Hugging Face Optimum
optimum-cli export onnx \
  --model andrewandrewsen/distilbert-secret-masker onnx_out/
```

See DEPLOYMENT.md for more.
### How much memory does SecMask use?

Model sizes:
- Fast expert: 268MB
- Long expert: 592MB
- Runtime: +500MB (tokenizer, inference)
Total:
- Fast-only: ~800MB
- MoE (both): ~1.4GB
GPU adds VRAM overhead (~1GB).
## Troubleshooting

### "Model not found" error

Cause: Model not downloaded or incorrect path.
Solution:
```bash
# Download model
python -c "from transformers import AutoModel; \
  AutoModel.from_pretrained('andrewandrewsen/distilbert-secret-masker')"

# Use the full HuggingFace ID
python infer_moe.py --fast-model andrewandrewsen/distilbert-secret-masker
```

### "CUDA out of memory" error

Cause: GPU VRAM insufficient.
Solution:
```bash
# Use CPU
export CUDA_VISIBLE_DEVICES=""
python infer_moe.py --in file.txt
```

```python
# Or reduce batch size (if using batching)
pipe = pipeline(..., batch_size=1)
```

### Too many false positives

Common causes:
- Hex strings (e.g., `#1a2b3c`, git commit hashes)
- UUIDs (e.g., `123e4567-e89b-12d3-a456-426614174000`)
- Hashes (e.g., MD5, SHA256)
Solutions (a post-filter sketch follows the commands):

```bash
# Increase threshold (fewer false positives)
python infer_moe.py --tau 0.90

# Or whitelist patterns via custom filters (edit filters.json)
```
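
If threshold tuning isn't enough, a post-filter on candidate spans can drop known-benign patterns before masking. This is an illustrative sketch, not a SecMask API; how you obtain candidate spans depends on your integration:

```python
import re

# Hypothetical allowlist of strings that look secret-like but aren't
BENIGN_PATTERNS = [
    re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I),  # UUID
    re.compile(r"^#?[0-9a-f]{6}$", re.I),                # hex color / short hex
    re.compile(r"^[0-9a-f]{32}$|^[0-9a-f]{64}$", re.I),  # MD5 / SHA-256 digests
]

def is_benign(span: str) -> bool:
    return any(p.match(span) for p in BENIGN_PATTERNS)

candidates = ["123e4567-e89b-12d3-a456-426614174000", "sk-abc123def456ghi789jkl0"]
print([s for s in candidates if not is_benign(s)])  # only the API key remains
```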
### SecMask misses some secrets

Common causes:
- New secret format not in training data
- Obfuscated secrets (e.g., base64 encoded)
- Very long secrets (>512 tokens for fast expert)
Solutions:
```bash
# Decrease threshold (more detections)
python infer_moe.py --tau 0.70

# Enable the long expert
python infer_moe.py \
  --fast-model andrewandrewsen/distilbert-secret-masker \
  --long-model andrewandrewsen/longformer-secret-masker

# Retrain with new examples
python train_ner_masker.py --train-file data/custom_train.jsonl
```

### SecMask is too slow

Quick wins:
- Use fast-only mode: `--no-escalate`
- Use a GPU if available
- Increase the threshold: `--tau 0.85` (fewer detections = faster)
Advanced:
- ONNX conversion (2-3x speedup)
- Quantization (smaller model, faster; see the sketch below)
- Batch processing (higher throughput)
See Performance section above.
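
For the quantization route, a minimal dynamic-quantization sketch using PyTorch's built-in API; treat it as a starting point and re-measure precision/recall afterwards:

```python
import torch
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained(
    "andrewandrewsen/distilbert-secret-masker"
)
# Quantize linear layers to int8 -- smaller and usually faster on CPU
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
```

Dynamic quantization mainly helps CPU deployments; GPU inference is usually better served by batching or ONNX export.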
### How do I debug issues?

Enable debug logging:
```python
import logging
logging.basicConfig(level=logging.DEBUG)

from infer_moe import mask_text_moe
masked = mask_text_moe(...)
```

Check model loading:
```python
from transformers import AutoModel

try:
    model = AutoModel.from_pretrained("andrewandrewsen/distilbert-secret-masker")
    print("✅ Model loaded successfully")
except Exception as e:
    print(f"❌ Error: {e}")
```

Test inference:
```python
from infer_moe import mask_text_moe

text = "Test: sk-1234567890"
masked = mask_text_moe(text, fast_model_dir="andrewandrewsen/distilbert-secret-masker")
print(f"Input: {text}")
print(f"Output: {masked}")
assert '[SECRET]' in masked, "Secret not detected!"
```

### Where can I get more help?

- Documentation: README.md, EXAMPLES.md, DEPLOYMENT.md
- GitHub Issues: Open an issue
- GitHub Discussions: Ask the community
- Email: [your-email@example.com]
Last Updated: 2024-11