Security & Privacy

Overview

PG Atlas handles public ecosystem data (dependency graphs, metrics, pony factor from git logs) with no personal user data in v0 (read-only API/dashboard, no accounts/logins). Security and privacy design emphasizes defense-in-depth, minimalism, and transparency to protect ecosystem integrity and community trust.

Core principles:

  • Privacy-first: No tracking of individual users; avoid cookies/session storage.
  • Personal data minimization: Collect zero PII; git logs processed only for aggregate contributor stats (no email exposure).
  • Auditability for maintainers: Multiple community maintainers access backend — all actions logged immutably.
  • CIA triad alignment:
    • Confidentiality (public data only, but protect ingest sources)
    • Integrity (verified inputs, reproducible metrics)
    • Availability (resilient hosting, DoS mitigation)

Threat model (v0):

  • Primary risks:
    • API abuse/DoS
    • ingestion tampering (fake SBOMs)
    • maintainer compromise
  • Low risk: Data exfiltration (all public anyway).

Confidentiality

  • Data classification: All stored/computed data public by design (open-source manifests, git logs, metrics).
  • Ingest protection: SBOM webhooks validate GitHub signatures (if Action-signed) or repo provenance; shadow crawls from trusted registries only.
  • No secrets exposure: API keys/secrets in environment variables or provider vaults; never in repo.
  • Future: Optional maintainer 2FA/multi-sig for admin actions.

Integrity

  • Input validation: Strict schema checks on SBOMs (CycloneDX/SPDX validation libraries); reject malformed.
  • Reproducibility: Raw artifacts retained; metric computation is deterministic.
  • Maintainer actions: Audit log all admin changes insofar this is possible.
  • Code integrity: Signed Git commits; dependabot alerts; reproducible builds where possible.

Open practice: Public issue tracker for reported discrepancies (e.g., wrong edges) — resolved via PR.

Availability

  • DoS mitigation: API rate limiting (token bucket); CDN caching for static/dashboard.
  • Resilience: Provider health checks/auto-restart; database backups (daily snapshots).
  • Incident response: Health endpoint and error monitoring; rollback via git/deploy history.
  • Scalability guardrails: Single-machine limits enforced; alert on resource spikes.

Privacy-Specific Measures

  • Web analytics: Plausible — cookie-less, privacy-respecting aggregate stats only (page views, referrers anonymized).
  • No trackers: Avoid Google Analytics, Meta pixels, etc.; no client-side fingerprinting.
  • Dashboard: Local storage optional for preferences (e.g., dark mode) — no server sync.
  • Git log handling: Obfuscate author emails by hashing or ellipsis redaction.

Open Questions

  • Maintainer access method (SSH keys vs. provider IAM roles)?
  • Formal incident response playbook needed for v0 launch?
  • SBOM signature enforcement level (optional vs. required)?

Back to top

This site uses Just the Docs, a documentation theme for Jekyll.