Conduit Roadmap

Objective: A production-ready .NET data pipeline that fetches, transforms, and serves content from multiple sources with pluggable adapters, transformation stages, and resilient delivery.

Success Criteria

Multi-source ingestion beyond RSS (Atom, EDI 834, Zotero)
Data transformation and enrichment layer
Validation and rejected data tier
Deep domain coverage across all three source types
Deployable as a container to Azure or any Linux host
Comprehensive test coverage and documentation

Context

Conduit is a content aggregation and data pipeline platform built on .NET 10. It ingests content from diverse sources, transforms and enriches it, and serves it through multiple interfaces. The architecture follows a staged pipeline: fetch -> parse -> transform -> store, with multiple entry points (console runner, background worker, REST API, CLI).

The pipeline pattern is domain-agnostic -- applicable to news aggregation, research monitoring, competitive intelligence, healthcare data ingestion, or any scenario where structured content needs to be collected from heterogeneous sources.

Milestones

Foundation

Establishes the core project structure, DI, logging, testing, CI/CD, and documentation. Provides four entry points (console, worker, API, CLI) and a clean architecture that all subsequent milestones build on.

PRD: .orchestra/work/foundation/prd.md
Dependency: None
Status: Complete

Multi-Source Ingestion

Introduces a pluggable source adapter pattern so the pipeline can ingest data from any structured source -- RSS, Atom, EDI 834, and Zotero research libraries. Proves the architecture works across content feeds, transactional healthcare data, and hybrid local-file-plus-API sources.

PRD: .orchestra/work/multi-source-ingestion/prd.md
Dependency: Foundation
Status: Complete

Data Transformation

Adds a composable transformation layer between ingestion and storage. Handles deduplication, content enrichment, validation, and a three-tier output pattern (raw / curated / rejected). Turns Conduit from a data fetcher into a data pipeline.

PRD: .orchestra/work/data-transformation/prd.md
Dependency: Multi-Source Ingestion
Status: Complete

Source Depth: EDI 834

Deepens the 834 adapter from a working prototype to a more complete X12 implementation. Adds transaction/batch envelope tracking, functional acknowledgments (999/TA1), effective dating for overlapping coverage periods, and a more complete X12 loop parser for real-world 834 files.

PRD: .orchestra/work/source-depth-edi834/prd.md
Dependency: Data Transformation
Status: Not Started

Source Depth: Zotero

Deepens the Zotero adapter beyond CSV parsing and domain tagging. Adds richer metadata resolution (CrossRef API for citation counts and venue), preprint-to-published version linking, collection and tag hierarchy, and reading status tracking.

PRD: .orchestra/work/source-depth-zotero/prd.md
Dependency: Data Transformation
Status: Not Started

Source Depth: RSS

Deepens the RSS adapter beyond keyword extraction. Adds content-similarity deduplication across feeds, topic clustering, feed health tracking, and full-text extraction from linked articles.

PRD: .orchestra/work/source-depth-rss/prd.md
Dependency: Data Transformation
Status: Not Started

Storage Backends

Introduces persistent storage options beyond the local filesystem — database backends (SQLite, PostgreSQL), cloud storage (S3, Azure Blob), or hybrid strategies. Builds on the storage abstraction established in the Data Transformation milestone so that switching or adding backends is a configuration change, not a code change.

PRD: To be created
Dependency: Data Transformation
Status: Not Started

References

ADR-000: The Score
ADR-001: Domain-Agnostic Pipeline
ADR-002: Production Readiness is Continuous
ADR-003: Docusaurus Deploys .orchestra/ to GitHub Pages

Success Criteria​

Context​

Milestones​

Foundation​

Multi-Source Ingestion​

Data Transformation​

Source Depth: EDI 834​

Source Depth: Zotero​

Source Depth: RSS​

Storage Backends​

References​