Secrets Cleanup: How to Find & Remove Leaked Credentials from Git History

Lessons from a real incident: We discovered multiple API keys, GitHub tokens, and database credentials scattered across git history. Here's what we learned and how to prevent it.

The Problem

Secrets leak into git repositories more often than you'd think:

  • .env files accidentally committed
  • Embedded credentials in Dockerfile RUN commands
  • API keys in configuration files
  • GitHub tokens in automated clone commands

Once committed, they're permanently in history — even if you delete the file later.
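
To see why deleting the file is not enough, here is a minimal sketch that builds a throwaway repository in a temporary directory, commits a secret, deletes it, and then searches history. The file contents, commit messages, and commit identity are made up for the demonstration; nothing here touches a real repository.

```shell
# Work in a throwaway directory so no real repository is affected.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Inline identity so the demo does not depend on global git config.
g() { git -c user.name="Demo" -c user.email="demo@example.com" "$@"; }

echo 'API_KEY=fake_secret_for_demo' > .env
g add -f .env            # -f in case a global ignore rule already covers .env
g commit --quiet -m "Add config"

g rm --quiet .env
g commit --quiet -m "Remove config"

# The file is gone from the working tree...
[ ! -e .env ] && echo "working tree: .env is gone"

# ...but the patches for the add and the delete both still carry the value.
leaked_hits="$(git log --all -p | grep -c 'fake_secret_for_demo')"
echo "lines in history containing the secret: $leaked_hits"
```

Anyone with a clone can run the same `git log --all -p` search, which is why the cleanup procedure later in this guide rewrites history instead of just deleting files.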

Why This Matters

  • ⚠️ Public repositories expose credentials to anyone
  • ⚠️ Private repositories expose credentials to every collaborator and integration with read access, whether or not they need them
  • ⚠️ Forked repositories inherit the full history
  • ⚠️ Backup systems might retain old copies
  • ⚖️ Compliance issues (GDPR, SOC2, etc.)

Common Hiding Spots for Secrets

# 1. Dockerfile RUN commands
RUN git clone https://ghp_TOKEN@github.com/private/repo.git

# 2. ENV files (if accidentally committed)
DATABASE_PASSWORD=super_secret_123
API_KEY=sk_live_abc123xyz

# 3. Configuration files
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# 4. Docker Compose or scripts
environment:
  - POSTGRES_PASSWORD=hardcoded_password

# 5. Git remotes with embedded credentials
git remote add origin https://user:password@github.com/repo.git

Recipe: Finding Secrets in Your Repository

Step 1: Scan git history for token patterns

# Search for common GitHub token patterns
git log --all -p | grep -iE 'ghp_|github_pat_|git\+https.*@'

# Search for AWS keys
git log --all -p | grep -iE 'AKIA[0-9A-Z]{16}|aws_secret_access_key'

# Search for generic secrets, skipping comment lines (diff lines start with + or -)
git log --all -p | grep -iE 'password|api.key|secret|token' | grep -vE '^[+-]?[[:space:]]*#'

# Search for database credentials
git log --all -p | grep -iE 'DATABASE_PASSWORD|POSTGRES_PASSWORD|MYSQL_ROOT_PASSWORD'
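
The grep pipelines above show the offending lines but not which commit introduced them. As a rough sketch, git's own `-G` option can list the commits whose diffs add or remove a matching line, along with the author and date you will need for the incident write-up. The function name `commits_with_pattern` is our own invention, and the patterns are kept simple on purpose.

```shell
# commits_with_pattern REGEX
# Lists every commit, on any ref, whose diff adds or removes a line matching REGEX.
commits_with_pattern() {
  git log --all --date=short --format='%h %ad %an: %s' -G"$1"
}

# Example calls (run inside the repository being audited):
#   commits_with_pattern 'ghp_'
#   commits_with_pattern 'POSTGRES_PASSWORD'
```

Each output line is a short hash, date, author, and subject, so the first (oldest) hit tells you how long a credential has been exposed.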

Step 2: Check current files

# Look in all tracked and untracked files
grep -rnE "password|api_key|secret" . --exclude-dir=.git --exclude-dir=node_modules

# Specifically check for .env and .secrets files anywhere in the tree
find . \( -name '.env*' -o -name '.secrets*' \) -not -path './.git/*'

Step 3: Inspect Dockerfile and scripts

# Check all Dockerfile versions across history
git log --all --pretty=format: -p -- Dockerfile | grep -iE 'password|token|key|@github'

# Check shell scripts
git log --all --pretty=format: -p -- '*.sh' | grep -iE 'password|token|key'

Step 4: Use dedicated scanning tools

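# Install git-secrets (blocks secrets at commit time)
brew install git-secrets  # macOS

# Install detect-secrets
pip install detect-secrets

# Scan the working tree, including untracked files (detect-secrets does not scan history)
detect-secrets scan --all-files

# Install truffleHog (scans the full commit history)
pip install truffleHog
# pip installs the v2 CLI, which takes the repository URL directly;
# the newer v3 binary has a different syntax (trufflehog git <url>)
trufflehog --regex --entropy=False --json "file://$(pwd)"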

Cleaning Secrets from History

Prerequisites

# Backup your repository first!
cd /home && tar czf myrepo_backup.tar.gz myrepo/.git myrepo/.env

# Install git-filter-repo
pip install git-filter-repo

# Clone a fresh copy if possible (safer)
git clone <your-repo> my-repo-clean
cd my-repo-clean

Remove Secrets

1. Create replacements file:

# /tmp/replacements.txt format:
# OLD_SECRET==>REPLACEMENT_TEXT
cat > /tmp/replacements.txt << 'EOF'
ghp_JlpXXXXXXXXXXXXXXXXXXXXXX==>***GITHUB_TOKEN_REMOVED***
sk_live_51234567890123456==>***STRIPE_KEY_REMOVED***
AKIAIOSFODNN7EXAMPLE==>***AWS_KEY_REMOVED***
EOF

2. Run git filter-repo:

# Important: work on a fresh clone or backup first
git filter-repo --replace-text /tmp/replacements.txt --force
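
Before pushing, it is worth confirming that the rewrite actually removed every value. The following is a sketch under a few assumptions: the function name `check_replacements` is hypothetical, and it only understands plain literal lines in the replacements file (git filter-repo also accepts `regex:` and `glob:` prefixed lines, which this does not handle).

```shell
# check_replacements FILE
# For each "secret==>replacement" line, report whether the literal secret
# still appears anywhere in history. Returns 1 if any secret remains.
check_replacements() {
  remaining=0
  while IFS= read -r line || [ -n "$line" ]; do
    [ -z "$line" ] && continue            # skip blank lines
    secret="${line%%==>*}"                # text to the left of the first ==>
    # -S lists commits that change how many times the string occurs
    if [ -n "$(git log --all --oneline -S"$secret")" ]; then
      # Print only a short prefix so the check itself does not leak the value
      printf 'still in history: %.8s...\n' "$secret"
      remaining=1
    fi
  done < "$1"
  return "$remaining"
}

# Usage, from inside the rewritten clone:
#   check_replacements /tmp/replacements.txt && echo "history is clean"
```

If the function reports anything, re-check the exact spelling of the secret in the replacements file and run the rewrite again before force pushing.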

3. Re-add remote and force push:

git remote add origin https://github.com/yourname/repo.git
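# Push every rewritten branch and tag; any ref left behind still carries the secret
git push origin --force --all
git push origin --force --tags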

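⚠️ WARNING: This rewrites history and changes commit hashes from the first affected commit onward. Coordinate with team members, who will need fresh clones. Revoke the exposed credentials regardless: forks, caches, and existing clones still hold the old history.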

Prevention: Best Practices

1. Use .gitignore properly

# .gitignore - Add common secret patterns
.env
.env.local
*.key
*.pem
*.p12
*.cert
.aws/
.ssh/
secrets/
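
One caveat: .gitignore only affects untracked files, so a .env that was committed earlier keeps being tracked after you add the pattern. A small sketch for finding and untracking such files follows; the function names are our own, and the approach relies on `git ls-files` with its `--cached --ignored --exclude-standard` options.

```shell
# tracked_but_ignored
# Lists files that are still tracked even though an ignore rule now matches them.
tracked_but_ignored() {
  git ls-files --cached --ignored --exclude-standard
}

# untrack_ignored
# Stops tracking those files; the working copies stay on disk.
untrack_ignored() {
  tracked_but_ignored | while IFS= read -r path; do
    git rm --cached --quiet -- "$path"
  done
}
```

After running `untrack_ignored`, commit the result. The earlier versions of those files remain in history, so the cleanup steps above still apply to them.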

2. Use environment variables, not hardcoded values

# ❌ BAD - in Dockerfile
RUN pip install git+https://ghp_TOKEN@github.com/private/repo.git

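# ✅ GOOD - use a BuildKit secret mount (build args end up in the image history)
# Build with (token file path is an example):
#   docker build --secret id=github_token,src=./github_token.txt .
RUN --mount=type=secret,id=github_token \
    pip install git+https://$(cat /run/secrets/github_token)@github.com/private/repo.git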

3. Pre-commit hooks

# Install pre-commit framework
pip install pre-commit

# Create .pre-commit-config.yaml
cat > .pre-commit-config.yaml << 'EOF'
repos:
  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
EOF

# Create the baseline the hook compares against
detect-secrets scan > .secrets.baseline

# Install hooks
pre-commit install

4. Git Secrets

# Install and initialize
brew install git-secrets
git secrets --install

# Add patterns to detect
git secrets --add 'password\s*=\s*'
git secrets --add 'ghp_[A-Za-z0-9_]'
git secrets --add 'AKIA[0-9A-Z]{16}'
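
It is easy to register a pattern that silently never matches, so we find it useful to try the patterns against fake values first. The sketch below does that with plain `grep -E`, independent of git-secrets itself; the variable and function names are hypothetical, and the sample values are invented.

```shell
# Patterns mirror the git-secrets rules above; [[:space:]] stands in for \s,
# which not every grep implementation understands.
secret_patterns='password[[:space:]]*=[[:space:]]*|ghp_[A-Za-z0-9_]|AKIA[0-9A-Z]{16}'

# find_secrets: reads text on stdin, prints matching lines with line numbers,
# and exits non-zero when nothing matches.
find_secrets() {
  grep -nE "$secret_patterns"
}

# Self-check with made-up values: only the second line should be reported.
printf '%s\n' 'color = blue' 'password = hunter2' | find_secrets
```

The same function can be pointed at staged changes with `git diff --cached | find_secrets` as a quick manual check before committing.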

Incident Response Checklist

If you discover leaked secrets:

  • [ ] 🚨 IMMEDIATELY revoke the exposed credential
      • GitHub tokens: https://github.com/settings/tokens
      • AWS keys: AWS IAM console
      • Database passwords: Reset them

  • [ ] 📝 Document what leaked (type, when, who had access)

  • [ ] ✅ Clean git history (use git filter-repo)

  • [ ] 🔄 Force push to all remotes

  • [ ] 👥 Notify team members to re-clone (pulling into an old clone merges the leaked history back in)

  • [ ] 📊 Audit logs for unauthorized access

  • [ ] 🧹 Update .gitignore to prevent recurrence

  • [ ] 🛡️ Add pre-commit hooks to catch future leaks

Our Real Experience

We discovered three types of exposed secrets across our repository:

  1. B2 (Backblaze) API keys in env.example (assumed to hold example values, but it contained real ones)
  2. GitHub tokens embedded in Dockerfile RUN commands for private packages
  3. Database credentials in docker-compose examples

Solution:

  • Rewrote git history (13 commits affected)
  • Replaced all secrets with placeholder strings
  • Added .gitignore protection
  • Implemented pre-commit hooks
  • Revoked all exposed credentials

  • Time spent: ~30 minutes once discovered
  • Time spent if we'd caught it at commit time: ~1 minute

Tools Reference

Tool            | Purpose                          | Installation
git-secrets     | Pre-commit prevention            | brew install git-secrets
detect-secrets  | Scan working tree, hook baseline | pip install detect-secrets
truffleHog      | Deep scanning of history         | pip install truffleHog
git-filter-repo | Clean history                    | pip install git-filter-repo
pre-commit      | Hook automation                  | pip install pre-commit

Key Takeaways

  1. Secrets leak more often than you think: even experienced developers slip up
  2. Prevention is far cheaper than cleanup (about 1 minute versus 30 in our case): use .gitignore and pre-commit hooks
  3. Test your scanning regularly: add a fake secret, see if tools catch it
  4. Assume everything in git history is public, even in private repos
  5. Automate the checks: humans forget, automation doesn't

Next Steps

  1. Run a security scan on your repository today
  2. Add .gitignore patterns to your projects
  3. Set up pre-commit hooks for your team
  4. Document your credential rotation schedule

Questions? Discuss this in our OpenIT.chat community or create a GitHub issue.

Credit: This guide was written during a real incident response in the OpenIT.chat project. Thanks to the security-conscious team members who asked the right questions!