Extensibility
Overview
PG Atlas is designed for iterative evolution, supporting new signals, metrics, and data sources without major rewrites. The current architecture (FastAPI + PostgreSQL + NetworkX) provides a stable foundation for extensibility through modular components, versioned APIs, and well-defined extension points.
Current status: Operational, with three extension patterns established in v0: the registry crawler abstraction, the scheduled workflow framework, and metric computation and materialization.
Guiding principles:
- Modular architecture — Ingestion, storage, computation, API, and dashboard as loosely coupled components
- Stable public interface — The API versioning strategy already spells out what a stable v1 interface requires
- Data portability — PostgreSQL schemas with clear migration paths
- Community-driven growth — PR-friendly extensions through documented patterns
Extension Patterns
PG Atlas demonstrates extensibility through three operational patterns established during v0 development.
Adding New Metrics
The pattern for new metric computation:
- Database schema — Add materialized metric columns to the repos and projects tables
- Computation logic — Create task function in pg_atlas/procrastinate/tasks.py
- Database writer — Bulk update metric values to minimize row lock contention
- API exposure — Add field to response models in pgatlas/api/models.py
- Documentation — Regenerate OpenAPI spec, release the API, and publish the TypeScript SDK
Example: The pony factor implementation reads from contributed_to edges, computes the smallest set of contributors that accounts for 50% of commits, and materializes the result to both repos.pony_factor and projects.pony_factor (the project value is the maximum across the project's repos).
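The computation step described above can be sketched in a few lines. This is a minimal illustration, not the project's actual task code: the function names pony_factor and project_pony_factor, the input shape (a mapping from contributor to commit count), and the threshold parameter are all assumptions made for the example. The real implementation reads contributed_to edges from the database.

```python
from typing import Iterable, Mapping


def pony_factor(commits_by_contributor: Mapping[str, int], threshold: float = 0.5) -> int:
    """Smallest number of contributors whose commits reach `threshold` of the total.

    Hypothetical helper: the input mapping stands in for contributed_to edges.
    """
    total = sum(commits_by_contributor.values())
    if total == 0:
        return 0  # no commits recorded, so there is nothing to attribute

    covered = 0
    # Greedy selection: taking the most prolific contributors first yields
    # the smallest set that reaches the threshold.
    ranked = sorted(commits_by_contributor.values(), reverse=True)
    for rank, count in enumerate(ranked, start=1):
        covered += count
        if covered >= threshold * total:
            return rank
    return len(ranked)


def project_pony_factor(repo_factors: Iterable[int]) -> int:
    """Aggregate repo-level values to a project value as the maximum."""
    return max(repo_factors, default=0)


if __name__ == "__main__":
    print(pony_factor({"alice": 60, "bob": 30, "carol": 10}))
    print(project_pony_factor([1, 3, 2]))
```

In this sketch a repository where one contributor wrote 60 of 100 commits gets a pony factor of 1, and a project takes the largest value found among its repositories, matching the aggregation rule stated above.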
Adding Registry Crawlers
The registry crawler abstraction supports five operational ecosystems (npm, crates.io, PyPI, pub.dev, Packagist) through a shared pattern:
- Abstract interface — Inherit from base crawler class with standard methods
- HTTP client — Use httpx.AsyncClient with retry logic and rate limiting
- Queue integration — Enqueue pending packages via the registry-crawl workflow
- Error handling — Log failures and include them in the workflow’s job summary
Extension example: Adding a new language ecosystem (e.g., Ruby gems, Go modules) requires implementing the crawler interface and adding the ecosystem identifier to configuration.
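To make the shape of such an extension concrete, the sketch below shows one way a crawler interface and a new ecosystem implementation could fit together. Every name here (RegistryCrawler, PackageMetadata, fetch_package, crawl, RubyGemsCrawler, the "rubygems" identifier) is hypothetical and not taken from the PG Atlas codebase. To stay self-contained, the example reads from an in-memory fixture where a real crawler would make HTTP requests with retry logic and rate limiting.

```python
import asyncio
import logging
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

logger = logging.getLogger("registry_crawl_sketch")


@dataclass
class PackageMetadata:
    """Hypothetical record returned by every crawler."""
    name: str
    ecosystem: str
    latest_version: str
    dependencies: list[str] = field(default_factory=list)


class RegistryCrawler(ABC):
    """Hypothetical base class: subclasses only implement fetch_package."""

    ecosystem: str

    @abstractmethod
    async def fetch_package(self, name: str) -> PackageMetadata:
        """Fetch metadata for a single package from the registry."""

    async def crawl(self, names: list[str]) -> tuple[list[PackageMetadata], dict[str, str]]:
        """Crawl packages one by one, collecting failures instead of aborting."""
        results: list[PackageMetadata] = []
        failures: dict[str, str] = {}
        for name in names:
            try:
                results.append(await self.fetch_package(name))
            except Exception as exc:  # log and keep going; failures feed a summary
                logger.warning("crawl failed for %s/%s: %s", self.ecosystem, name, exc)
                failures[name] = str(exc)
        return results, failures


class RubyGemsCrawler(RegistryCrawler):
    """Illustrative new ecosystem backed by an in-memory fixture."""

    ecosystem = "rubygems"

    def __init__(self, fixtures: dict[str, dict]):
        self._fixtures = fixtures

    async def fetch_package(self, name: str) -> PackageMetadata:
        # A real implementation would issue an HTTP request here.
        if name not in self._fixtures:
            raise LookupError(f"package not found: {name}")
        data = self._fixtures[name]
        return PackageMetadata(
            name=name,
            ecosystem=self.ecosystem,
            latest_version=data["version"],
            dependencies=list(data.get("dependencies", [])),
        )


if __name__ == "__main__":
    crawler = RubyGemsCrawler({"example-gem": {"version": "1.0.0", "dependencies": ["other-gem"]}})
    packages, failures = asyncio.run(crawler.crawl(["example-gem", "missing-gem"]))
    print(packages, failures)
```

Keeping the per-package loop and failure collection in the base class means a new ecosystem only has to supply the registry-specific fetch logic, which is the property the shared pattern above relies on.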
Future Extensions
Near-term enhancements under consideration:
Additional Metrics
- On-chain usage signals — Soroban contract invocation counts, unique deployers, total value locked
- Activity scoring refinement — Replace 4-state enum with granular scoring based on commit recency and ecosystem engagement
- Security indicators — CVE feed integration, audit completion status, test coverage from CI
Ingestion Sources
- GitHub API data — Issue/PR statistics, reviewer diversity, community health metrics
- On-chain manifests — Stellar/Soroban deployment metadata for usage tracking
- Versioned dependency edges — Per-release blast radius for vulnerability impact analysis
API Enhancements
- Trend histories — Time-series endpoints for metric evolution
- Transitive queries — Blast radius calculations per package version
- Webhooks — Event notifications for metric changes (e.g., when an adoption score drops sharply)
Dashboard Features
- Risk heatmaps — Visual grids showing contributor diversity vs. criticality distribution
- Community contributions — Pluggable visualization modules for specialized views
- Export functionality — CSV downloads, PNG/SVG visualization exports
Scope evolution: Priorities are set through the working group roadmap; v1 is gated on 2026 outcomes.