DataHub

Generic async data pipeline with optional DB persistence and runtime subscriptions.

Design

The diagram below shows the full data pipeline, from raw network bytes to in-memory subscriber notifications. Every component implements the same data acceptor concept and can be omitted from the pipeline when it is not needed. For example, the order book has no DB representation, so its data adapter passes data directly to the data feed and the data sink is omitted.

DataHub architecture diagram: pipeline from Network through Transport, Dispatcher to Data Adapters, Data Sink with persistence using Data Model, Data Feed to Subscriptions.

Overview

DataHub is a header-only C++23 library providing a complete async data pipeline with zero runtime overhead via static polymorphism and compile-time reflection. It connects network sources (WebSocket, HTTP REST) to typed subscriber callbacks, with optional database persistence.

Core components

Data Dispatcher<Acceptor...>

Receives raw JSON strings from any transport. Tries to push each message into multiple adapters, each of which tries to decode it via C++23 static reflection.
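
The "try each adapter in turn" behaviour can be sketched with a fold expression over the acceptor pack. This is a minimal sketch under assumptions: the class name `Dispatcher`, the `dispatch` and `accept` member names, and the toy acceptors are all hypothetical, and real adapters would decode JSON rather than search for a key.

```cpp
#include <string_view>
#include <tuple>

// Hypothetical dispatcher: holds references to its acceptors and offers
// each raw message to them in order, stopping at the first one that
// returns true.
template <typename... Acceptor>
class Dispatcher {
public:
    explicit Dispatcher(Acceptor&... acceptors) : acceptors_(acceptors...) {}

    // Returns true if some acceptor consumed the message.
    bool dispatch(std::string_view json) {
        return std::apply(
            [json](auto&... a) { return (a.accept(json) || ...); },
            acceptors_);
    }

private:
    std::tuple<Acceptor&...> acceptors_;
};

// Toy acceptors standing in for real adapters: each one "decodes" only
// messages that mention its own key.
struct TradeAcceptor {
    int consumed = 0;
    bool accept(std::string_view json) {
        if (json.find("\"trade\"") == std::string_view::npos) return false;
        ++consumed;
        return true;
    }
};

struct QuoteAcceptor {
    int consumed = 0;
    bool accept(std::string_view json) {
        if (json.find("\"quote\"") == std::string_view::npos) return false;
        ++consumed;
        return true;
    }
};
```

The `||` fold short-circuits, so acceptors listed after the first successful one are never called for that message.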

Data Adapter

Deserialises a JSON string into a typed C++ struct using Glaze. Calls the downstream handler and propagates its bool result — false means "not consumed", allowing fallthrough to the next adapter in the dispatcher chain.
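
The decode-then-forward contract can be sketched as follows. To keep the example free of third-party headers, the decode step is passed in as a parameter instead of calling Glaze; `Adapter`, `make_adapter`, `Price` and `decode_price` are hypothetical names, and the stand-in decoder parses a bare integer rather than JSON.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical adapter: Decode turns text into std::optional<T>, Handler
// takes the decoded T and returns bool.
template <typename T, typename Decode, typename Handler>
class Adapter {
public:
    Adapter(Decode decode, Handler handler)
        : decode_(std::move(decode)), handler_(std::move(handler)) {}

    bool accept(std::string_view raw) {
        std::optional<T> value = decode_(raw);
        if (!value) return false;  // not our message: let the next adapter try
        return handler_(*value);   // propagate the downstream result unchanged
    }

private:
    Decode decode_;
    Handler handler_;
};

// Helper so that only T has to be spelled out at the call site.
template <typename T, typename Decode, typename Handler>
auto make_adapter(Decode decode, Handler handler) {
    return Adapter<T, Decode, Handler>(std::move(decode), std::move(handler));
}

struct Price {
    int value;
};

// Stand-in decoder: accepts only a bare integer such as "42".
inline std::optional<Price> decode_price(std::string_view s) {
    int v = 0;
    const char* end = s.data() + s.size();
    auto [ptr, ec] = std::from_chars(s.data(), end, v);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    return Price{v};
}
```

A failed decode and a handler that returns `false` look the same to the caller: in both cases the message counts as not consumed.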

Data Sink

Collects data from different data streams, persists it, and propagates it to the Data Feed, which serves runtime subscriptions.
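
A sink of this kind might look roughly like the sketch below. The member names `save` and `accept`, the `Candle` type, and the in-memory stand-ins are assumptions for illustration. Persisting before forwarding is also an assumption, chosen so that subscribers never see data that failed to store.

```cpp
#include <vector>

// Hypothetical sink: Model must offer `bool save(const T&)`, Feed must
// offer `bool accept(const T&)`.
template <typename T, typename Model, typename Feed>
class Sink {
public:
    Sink(Model& model, Feed& feed) : model_(model), feed_(feed) {}

    // Persist first, then forward to the feed.
    bool accept(const T& value) {
        if (!model_.save(value)) return false;
        return feed_.accept(value);
    }

private:
    Model& model_;
    Feed& feed_;
};

struct Candle {
    long ts;
    double close;
};

// In-memory stand-ins for a DB-backed model and a real feed.
struct FakeModel {
    std::vector<Candle> rows;
    bool save(const Candle& c) {
        rows.push_back(c);
        return true;
    }
};

struct FakeFeed {
    std::vector<Candle> seen;
    bool accept(const Candle& c) {
        seen.push_back(c);
        return true;
    }
};
```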

Data Model

DAO with RAII table creation on construction. Uses Glaze compile-time reflection to derive the table name, column types, and primary key — no hand-written SQL schema.
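
The sketch below illustrates the two ideas in this paragraph: SQL column types derived from C++ member types at compile time, and table creation in the constructor. It does not use Glaze, so column names are listed by hand here, whereas DataHub obtains them through reflection. `Column`, `col`, `create_table_sql`, `FakeDb` and `Model` are hypothetical names, and treating the first column as the primary key is an assumption of the example.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Map a C++ member type to an SQL column type at compile time.
template <typename T>
constexpr std::string_view sql_type() {
    if constexpr (std::is_integral_v<T>) return "INTEGER";
    else if constexpr (std::is_floating_point_v<T>) return "REAL";
    else return "TEXT";
}

// One column: its name plus a pointer to the struct member it mirrors.
template <typename S, typename M>
struct Column {
    std::string_view name;
    M S::* member;
};

template <typename S, typename M>
constexpr Column<S, M> col(std::string_view name, M S::* member) {
    return {name, member};
}

template <typename S, typename M>
constexpr std::string_view column_sql_type(const Column<S, M>&) {
    return sql_type<M>();
}

// Build a CREATE TABLE statement from the column list.
template <typename S, typename... M>
std::string create_table_sql(std::string_view table,
                             const std::tuple<Column<S, M>...>& cols) {
    std::string sql = "CREATE TABLE IF NOT EXISTS ";
    sql += table;
    sql += " (";
    std::size_t index = 0;
    auto append = [&](std::string_view name, std::string_view type) {
        if (index != 0) sql += ", ";
        sql += name;
        sql += ' ';
        sql += type;
        if (index == 0) sql += " PRIMARY KEY";
        ++index;
    };
    std::apply([&](const auto&... c) { (append(c.name, column_sql_type(c)), ...); }, cols);
    sql += ')';
    return sql;
}

// Stand-in for a database connection: records the statements it is given.
struct FakeDb {
    std::vector<std::string> executed;
    void execute(std::string sql) { executed.push_back(std::move(sql)); }
};

// The table is created in the constructor, so a Model object never
// exists without its table.
template <typename S>
class Model {
public:
    template <typename... M>
    Model(FakeDb& db, std::string_view table, const std::tuple<Column<S, M>...>& cols)
        : db_(db) {
        db_.execute(create_table_sql(table, cols));
    }

private:
    FakeDb& db_;
};

struct Trade {
    long id;
    double price;
    std::string symbol;
};
```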

Data Feed

In-memory structured cache providing fast access to the current data for subscribers.
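
One simple way to picture such a cache is a map that keeps the most recent value per key. The `Feed` class and its `accept`, `latest` and `size` members are hypothetical; the actual structure of DataHub's feed is not described here and may be richer.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical feed: keeps the most recent value for each key.
template <typename Key, typename T>
class Feed {
public:
    // Stores (or overwrites) the value for `key`; always consumes.
    bool accept(const Key& key, const T& value) {
        cache_.insert_or_assign(key, value);
        return true;
    }

    // Returns a copy of the latest value, or nullopt for an unknown key.
    std::optional<T> latest(const Key& key) const {
        auto it = cache_.find(key);
        if (it == cache_.end()) return std::nullopt;
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::map<Key, T> cache_;
};
```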

Data Subscription

Runtime notification interface.
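
A runtime notification interface could be sketched as a registry of type-erased callbacks. This is an assumption about the general shape only: `Subscriptions`, `subscribe`, `unsubscribe` and `notify` are hypothetical names, and the sketch is single-threaded.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical subscription registry: callbacks are added and removed at
// run time and invoked whenever a new value arrives.
template <typename T>
class Subscriptions {
public:
    using Callback = std::function<void(const T&)>;
    using Id = std::size_t;

    // Registers a callback and returns a handle for later removal.
    Id subscribe(Callback cb) {
        Id id = next_id_++;
        callbacks_.emplace(id, std::move(cb));
        return id;
    }

    // Returns true if a callback with this id was registered.
    bool unsubscribe(Id id) { return callbacks_.erase(id) > 0; }

    // Calls every registered callback with the new value.
    void notify(const T& value) const {
        for (const auto& entry : callbacks_) entry.second(value);
    }

private:
    Id next_id_ = 0;
    std::unordered_map<Id, Callback> callbacks_;
};
```

Type erasure through `std::function` is used here because subscribers come and go at run time, unlike the pipeline stages, which are fixed at compile time.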