Why we put logs, metrics, and traces in the same Parquet store

molesignal team · 9 min read

Three stores is the industry default

When we sketched the data plane, the easy answer was three stores: one for logs (column-oriented), one for metrics (a TSDB), and one for traces (graph-aware). That is the industry default: it's what the OSS stack does (Loki + Mimir + Tempo), and it's what most SaaS products do under the hood.

But every cross-signal jump in that world is a join across three engines. Following trace_id → log line → host metric takes three queries with three sets of semantics, and each engine's schema evolves separately. The thing your SRE actually needs at 3am, "show me everything for trace_id abc123 in this minute", turns into copy-paste between tools.

One columnar store, one query language

We tried the alternative: all three signals serialize to Parquet on object storage. Logs are wide, sparse rows. Metrics are narrow, dense rows. Traces are nested, but they flatten into Arrow list columns. The columnar layout means prune-and-scan stays cheap even when 90% of the columns are empty.
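To make the row shapes concrete, here is a minimal Python sketch of three signals sharing one column set. The column names (`signal`, `body`, `metric_name`, `span_events`, and so on) are assumptions for illustration, since the actual molesignal schema is not shown in this post, and plain lists stand in for Parquet column chunks.

```python
from typing import Any

# Hypothetical shared column set; not the real molesignal schema.
COLUMNS = ["signal", "ts", "trace_id", "body", "metric_name", "value", "span_id", "span_events"]


def make_row(**fields: Any) -> dict[str, Any]:
    """Build one event row; every column not supplied stays None (null)."""
    unknown = set(fields) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unknown columns: {sorted(unknown)}")
    return {col: fields.get(col) for col in COLUMNS}


def to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row dicts into one list per column, as a columnar file lays data out."""
    return {col: [row[col] for row in rows] for col in COLUMNS}


def null_fraction(values: list[Any]) -> float:
    """Share of entries in one column that are null."""
    return sum(v is None for v in values) / len(values)


rows = [
    make_row(signal="log", ts=1000, trace_id="abc123", body="payment failed"),
    make_row(signal="metric", ts=1001, metric_name="cpu.user", value=0.93),
    # Nested span events become a list-valued column instead of child rows.
    make_row(signal="trace", ts=1002, trace_id="abc123", span_id="s1",
             span_events=["retry", "timeout"]),
]

columns = to_columns(rows)
sparsity = {col: null_fraction(vals) for col, vals in columns.items()}

# A lookup by trace_id and ts only needs these two columns; the rest are never read.
touched = {col: columns[col] for col in ("trace_id", "ts")}
```

Most columns are null for any given row, but a reader that only asks for `trace_id` and `ts` never pays for the empty ones. That is the property the columnar layout is being relied on for here.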

The 3am question collapses into a single query instead of three round-trips across three systems:

sql
SELECT *
FROM events
WHERE trace_id = 'abc123'
  AND ts BETWEEN now() - INTERVAL '1 minute' AND now()
ORDER BY ts ASC;
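For readers who want to see what comes back, the sketch below runs the same shape of query against an in-memory SQLite table from Python. SQLite is only a stand-in so the example is self-contained: it is not DataFusion, the `events` columns are assumed, and integer timestamps with bound parameters replace `now()` and `INTERVAL`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout: one row per log line, metric sample, or span.
conn.execute(
    "CREATE TABLE events ("
    "signal TEXT, ts INTEGER, trace_id TEXT, body TEXT, "
    "metric_name TEXT, value REAL, span_id TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("trace", 1010, "abc123", None, None, None, "s1"),
        ("log", 1020, "abc123", "payment failed", None, None, "s1"),
        ("metric", 1030, "abc123", None, "http.latency_ms", 412.0, None),
        ("log", 1040, "zzz999", "unrelated request", None, None, None),
        ("log", 900, "abc123", "outside the window", None, None, None),
    ],
)

now = 1060  # pretend "now", in seconds
rows = conn.execute(
    "SELECT signal, ts FROM events "
    "WHERE trace_id = ? AND ts BETWEEN ? AND ? "
    "ORDER BY ts ASC",
    ("abc123", now - 60, now),
).fetchall()
conn.close()
```

The result interleaves the span, the log line, and the metric sample in timestamp order, which is the "everything for this trace in this minute" view in one result set.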

The cost of correlation becomes a planner question, not a serialization question. We implement one query language (SQL) once, hand the query to DataFusion, and let the planner figure out which columns and which time windows to touch.
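One way a planner decides which time windows to touch is statistics-based pruning: Parquet records min/max statistics per column chunk, so a reader can skip row groups whose `ts` range cannot match the predicate. The Python sketch below shows only that overlap test. The `RowGroupStats` type and the file names are hypothetical, and DataFusion's real pruning handles far more general predicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowGroupStats:
    """Min/max of the ts column for one row group (hypothetical stand-in type)."""
    file: str
    ts_min: int
    ts_max: int


def prune(groups: list[RowGroupStats], lo: int, hi: int) -> list[RowGroupStats]:
    """Keep only row groups whose [ts_min, ts_max] overlaps the query window [lo, hi]."""
    return [g for g in groups if g.ts_max >= lo and g.ts_min <= hi]


groups = [
    RowGroupStats("part-0.parquet", 0, 999),
    RowGroupStats("part-1.parquet", 1000, 1999),
    RowGroupStats("part-2.parquet", 2000, 2999),
]

# A one-minute window that straddles the boundary between two row groups.
survivors = prune(groups, lo=1940, hi=2000)
```

Here only `part-1.parquet` and `part-2.parquet` survive, so `part-0.parquet` is never opened. Combined with reading only the referenced columns, this is what keeps a narrow time-window query cheap on a wide table.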

It isn't free

The hard parts move; they don't vanish:

  • Parquet rotation under spiky ingest is harder than partition-tuning Loki.
  • Cardinality limits on metrics are harder when they share a writer with logs.
  • Backfilling schema changes touches one store instead of three — better, but the one store is now load-bearing for everything.

We're solving these in the open (see /roadmap).

But we'd rather solve a few problems deeply than five problems shallowly.