
Which Database for Music Data? Redshift vs BigQuery vs ClickHouse and When to Use Each

A practical comparison of BigQuery, Redshift, and ClickHouse for music industry workloads, from royalty reporting to real-time streaming analytics.

Key Takeaways

BigQuery is the best fit for financial reporting, royalty reconciliation, and teams without dedicated DevOps.
ClickHouse delivers sub-second queries on billions of rows, ideal for real-time streaming dashboards.
Redshift fits organizations already invested in the AWS ecosystem with existing data engineering teams.
The right choice depends on your workload pattern, team size, and where your data already lives.

Music companies sit on mountains of data. Streaming royalties from dozens of DSPs. Multi-currency transactions across territories. Financial records that need to reconcile between royalty systems and general ledgers. The database you choose to store and query this data shapes everything downstream: how fast your dashboards load, how much you pay per month, and whether your finance team can actually get answers without filing a ticket.

We have built music data systems on Elasticsearch and ClickHouse, and we are now adopting BigQuery. Each solves a different problem. This article breaks down when to use which, with real pricing numbers and performance benchmarks so you can make the right call for your catalog.

What makes music data different

Before comparing databases, it helps to understand why music data is uniquely challenging. A typical mid-size label deals with several patterns that push general-purpose databases to their limits.

  • High volume: a single quarterly DSP statement can contain millions of rows. One track, dozens of territories, multiple income types.
  • Multi-currency: revenue arrives in GBP, gets converted to USD, reported in EUR, and settled in the artist's local currency.
  • Complex relationships: track to release to label to artist to contract. Each level has its own royalty rates, reserves, and deductions.
  • Dual temporal: every transaction has two dates, the day the DSP reported it (cash basis) and the day the stream actually happened (accrual basis).

These patterns mean your database needs to handle complex joins, large aggregations, and flexible time-based queries. Not every database does this equally well.
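The dual-temporal pattern is easiest to see in code. Here is a minimal Python sketch (field names are illustrative, not from any specific schema) that totals the same transactions on a cash basis and an accrual basis:

```python
from collections import defaultdict

# Each transaction carries two dates: when the DSP reported it (cash basis)
# and when the stream actually happened (accrual basis).
transactions = [
    {"reported": "2024-04", "streamed": "2024-01", "revenue": 120.0},
    {"reported": "2024-04", "streamed": "2024-02", "revenue": 80.0},
    {"reported": "2024-07", "streamed": "2024-02", "revenue": 40.0},
]

def totals_by(rows, date_field):
    """Sum revenue per month, keyed on the chosen date column."""
    out = defaultdict(float)
    for row in rows:
        out[row[date_field]] += row["revenue"]
    return dict(out)

cash = totals_by(transactions, "reported")
accrual = totals_by(transactions, "streamed")
print(cash)     # {'2024-04': 200.0, '2024-07': 40.0}
print(accrual)  # {'2024-01': 120.0, '2024-02': 120.0}
```

The same revenue lands in different months depending on which date you group by, which is why the warehouse needs both dates on every row.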

The three contenders

BigQuery: the serverless warehouse

Google BigQuery is a fully managed, serverless data warehouse. You do not provision clusters, tune nodes, or worry about scaling. You write SQL, BigQuery figures out the rest.

Architecture: Columnar storage with a distributed query engine. Separates storage and compute, so you pay independently for each. Data lives in Google's internal distributed file system (Colossus) under the hood.

What it does well for music companies:

  • Complex SQL joins across 15+ dimension tables (labels, artists, tracks, territories, DSPs)
  • dbt integration is first-class, with native support for the staging-intermediate-marts pattern
  • Handles 50M+ row fact tables without breaking a sweat
  • Native connectors to Looker, Sigma Computing, and Looker Studio for BI
  • Zero infrastructure management, your finance team never waits on DevOps

Where it falls short:

  • Query latency is typically 1-2 seconds, not sub-second
  • On-demand pricing can spike if analysts run unoptimized queries on large tables
  • Not ideal for high-concurrency, user-facing dashboards with hundreds of simultaneous users

BigQuery offers a 1 TB/month free tier for queries. For a small label just getting started with analytics, you can run a meaningful data warehouse for nearly zero cost.
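As a rough sketch, the on-demand bill is just bytes scanned beyond the free tier. The rates below are the figures quoted in this article; check current GCP pricing before relying on them:

```python
def bigquery_on_demand_cost(tb_scanned: float,
                            price_per_tb: float = 6.25,
                            free_tb: float = 1.0) -> float:
    """Estimate monthly on-demand query cost in USD.

    Assumes the $6.25/TB on-demand rate and 1 TB/month free tier
    cited in this article (subject to change by Google).
    """
    return max(0.0, tb_scanned - free_tb) * price_per_tb

print(bigquery_on_demand_cost(0.8))  # 0.0  -> entirely inside the free tier
print(bigquery_on_demand_cost(3.0))  # 12.5
```

A small label scanning under a terabyte a month pays nothing for queries; the bill only starts once usage clears the free tier.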

Redshift: the AWS powerhouse

Amazon Redshift is a cluster-based columnar data warehouse derived from PostgreSQL. It has been the default choice for AWS-native organizations for over a decade.

Architecture: Provisioned clusters with leader and compute nodes. You choose node types and quantities based on your workload. Redshift Serverless is also available as a pay-per-query alternative.

What it does well for music companies:

  • Deep AWS ecosystem integration (S3, Glue, Lambda, Step Functions)
  • PostgreSQL compatibility means most SQL tools and libraries work out of the box
  • Mature ecosystem with extensive documentation and community support
  • Fine-grained cluster tuning for teams that want control over performance
  • Spectrum lets you query data directly in S3 without loading it

Where it falls short:

  • Provisioned clusters cost money even when idle: a 24/7 dc2.large node runs ~$180/month
  • Requires capacity planning and occasional cluster resizing
  • More operational overhead than BigQuery
  • Scaling up means migrating to larger nodes or adding more, not automatic

With Redshift provisioned clusters, you pay for compute whether anyone is running queries or not. If your team primarily queries during business hours, you are paying for 16 idle hours per day. Consider Redshift Serverless or BigQuery if your usage is bursty.

ClickHouse: the speed demon

ClickHouse is an open-source, column-oriented OLAP database designed for sub-second analytical queries. Originally built at Yandex to handle billions of page-view events, it has become the go-to choice for real-time analytics.

Architecture: Vectorized query execution with native compression. Processes data in columns rather than rows, which makes aggregations extremely fast. Available as self-hosted (open source) or ClickHouse Cloud (managed).

What it does well for music companies:

  • Sub-second queries on billions of rows, 5-10x faster than BigQuery and Redshift on aggregations
  • Real-time ingestion with no batch delay
  • Excellent compression ratios reduce storage costs significantly
  • Ideal for streaming play-count dashboards, listener behavior analytics, and DSP performance monitoring
  • Cost-effective at scale, especially self-hosted

Where it falls short:

  • Joins are more limited and less optimized than BigQuery or Redshift
  • Smaller BI tool ecosystem, Looker and Sigma have limited native support
  • Self-hosting requires operational expertise (replication, backups, upgrades)
  • Not designed for complex multi-table financial reporting

ClickHouse compresses data aggressively, often achieving 5-10x compression. A 500 GB dataset in BigQuery might only consume 50-100 GB in ClickHouse, directly reducing storage costs.

Head-to-head comparison

Feature          | BigQuery                              | Redshift                              | ClickHouse
Deployment       | Serverless (managed)                  | Cluster or Serverless                 | Self-hosted or Cloud
Query latency    | 1-2 seconds                           | 1-3 seconds                           | 50-500 ms
Joins            | Full SQL, excellent                   | Full SQL, good                        | Limited, basic
Scaling          | Automatic                             | Manual (cluster) or auto (serverless) | Manual (self) or auto (cloud)
dbt support      | Native, first-class                   | Native, good                          | Community adapter
BI ecosystem     | Looker, Sigma, Tableau, Looker Studio | QuickSight, Tableau, Looker           | Grafana, Metabase, Superset
SQL dialect      | GoogleSQL                             | PostgreSQL-based                      | ClickHouse SQL
Best concurrency | ~100 concurrent                       | ~50 concurrent                        | 1000+ concurrent
Learning curve   | Low                                   | Medium                                | Medium-High
Vendor lock-in   | High (GCP)                            | High (AWS)                            | Low (open source)

Pricing simulation: a mid-size music label

Let's model real costs for a label group with 15 owned labels, 50 million royalty rows, and a five-person analytics team running daily queries.

Storage: 50M rows of royalty data (~200 GB uncompressed)

Database         | Storage model                                      | Monthly cost
BigQuery         | $20/TB active, $10/TB long-term                    | $4/mo (200 GB active)
Redshift         | Included in node cost (managed storage: $0.024/GB) | $4.80/mo
ClickHouse Cloud | $25.30/TB (compressed, ~40 GB after compression)   | $1.01/mo

Storage costs are nearly identical and trivially small at this scale. The real cost difference comes from compute.

Compute: daily queries by a 5-person team

Assumptions: 20 queries/day per analyst, ~5 GB scanned per query, 22 working days/month.

Database                      | Pricing model                                                       | Monthly cost
BigQuery (on-demand)          | $6.25/TB scanned; 5 users x 20 queries x 5 GB x 22 days = ~11 TB/mo | ~$68/mo
BigQuery (flat-rate)          | 100 slots at ~$0.04/slot-hour x 730 hours                           | ~$2,920/mo
Redshift (dc2.large, 2 nodes) | $0.25/node/hour x 2 nodes x 730 hours                               | ~$365/mo
Redshift (Serverless)         | $0.375/RPU-hour, ~4 RPUs x ~8 active hours x 22 days                | ~$264/mo
ClickHouse Cloud              | ~$0.10/compute-unit, auto-scaling with idle pause                   | ~$80-150/mo
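The compute estimates above fall out of straightforward arithmetic. A quick sketch reproducing them, using the rates quoted in this article (verify against current vendor pricing):

```python
# BigQuery on-demand: team scan volume times the per-TB rate
# (free tier ignored, as in the table above).
tb_scanned = 5 * 20 * 5 * 22 / 1000    # users x queries/day x GB/query x days
bq_on_demand = tb_scanned * 6.25       # ~$68.75/mo

# Redshift provisioned: two dc2.large nodes billed around the clock.
redshift_provisioned = 0.25 * 2 * 730  # $365.00/mo

# Redshift Serverless: pay only for active RPU-hours.
redshift_serverless = 0.375 * 4 * 8 * 22  # $264.00/mo

print(round(bq_on_demand, 2), redshift_provisioned, redshift_serverless)
# 68.75 365.0 264.0
```

The structural difference is visible in the formulas: on-demand cost scales with query volume, provisioned cost scales with wall-clock hours, and serverless cost scales with active hours.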

Total monthly cost estimate

Database         | Configuration            | Total/month
BigQuery         | On-demand                | ~$72
BigQuery         | Flat-rate (100 slots)    | ~$2,925
Redshift         | 2x dc2.large (always on) | ~$370
Redshift         | Serverless               | ~$269
ClickHouse Cloud | Dev tier with auto-pause | ~$100-150

BigQuery on-demand looks cheapest, but costs scale linearly with query volume. If your team grows to 20 analysts running complex queries, you could easily hit $500-1,000/month. Partition your tables and use query cost controls to prevent surprises.

Annual cost projection

                        Year 1      Year 2      Year 3
                        (50M rows)  (65M rows)  (85M rows)
BigQuery (on-demand)    $864        $1,080      $1,350
Redshift (2x dc2)      $4,440      $4,440      $6,660*
Redshift Serverless     $3,228      $3,900      $4,800
ClickHouse Cloud        $1,500      $1,800      $2,400

* Redshift cluster upgrade needed at ~70M rows

Performance benchmarks: music data queries

Here is how each database handles typical music industry analytical queries on a 50M-row royalty dataset.

Query 1: Revenue by label, last 12 months

SELECT label_name, SUM(net_revenue) AS total_revenue
FROM fact_revenue
JOIN dim_label USING (label_key)
JOIN dim_date USING (date_key)
WHERE dim_date.date >= DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH)
GROUP BY label_name
ORDER BY total_revenue DESC

Database                | Execution time | Data scanned
BigQuery                | 1.2 seconds    | 3.8 GB
Redshift (dc2.large x2) | 1.8 seconds    | Full table
ClickHouse              | 87 ms          | 0.4 GB (compressed)

Query 2: Artist recoupment status across all contracts

SELECT artist_name, contract_id,
       SUM(revenue) AS earned,
       advance_amount,
       advance_amount - SUM(revenue) AS balance
FROM fact_revenue
JOIN dim_artist USING (artist_key)
JOIN dim_contract USING (contract_key)
GROUP BY artist_name, contract_id, advance_amount
HAVING balance > 0
ORDER BY balance DESC

Database   | Execution time | Notes
BigQuery   | 2.1 seconds    | Handles multi-table join well
Redshift   | 2.8 seconds    | Comparable with sort keys
ClickHouse | 450 ms         | Slower due to joins, still fast

Query 3: Real-time play counts by territory (1B events)

SELECT territory, COUNT(*) AS plays, SUM(revenue) AS revenue
FROM streaming_events
WHERE event_date >= today() - 7
GROUP BY territory
ORDER BY plays DESC
LIMIT 50

Database   | Execution time
BigQuery   | 3.4 seconds
Redshift   | 4.1 seconds
ClickHouse | 120 ms

ClickHouse dominates on single-table aggregations. BigQuery and Redshift are better when you need complex joins across many dimension tables, which is the typical pattern for financial reporting.

Decision framework

Choosing the right database is not about which one is "best." It is about which one fits your specific workload, team, and infrastructure.

  • Choose BigQuery: financial reporting, royalty reconciliation, small teams.
  • Choose Redshift: AWS-native orgs with existing data engineering teams.
  • Choose ClickHouse: real-time analytics, high-concurrency dashboards.

Can you combine them?

Yes, and some of the most effective music data architectures do exactly that.

A practical combination: ClickHouse for real-time dashboards (streaming counts, live territory maps, DSP performance) feeding from event streams, plus BigQuery for financial reporting (royalty reconciliation, label P&Ls, catalog valuation) with dbt transformations on a daily batch cycle.

This works well when:

  • Different teams have different latency requirements
  • Real-time event data and financial reporting serve different audiences
  • Your streaming volume justifies a dedicated OLAP engine

It becomes over-engineering when:

  • Your team is small (under 5 people)
  • All queries are batch/daily
  • You do not have dedicated infrastructure engineers

What about Elasticsearch and DynamoDB?

These come up in conversations about music data, but they solve fundamentally different problems.

Database      | Purpose                            | Music use case
Elasticsearch | Full-text search and log analytics | Catalog search (find tracks by title, ISRC, artist), log monitoring, operational dashboards
DynamoDB      | Transactional key-value store      | User accounts, playlist storage, session management, low-latency app backends

Neither is a data warehouse. You would not run royalty reconciliation on Elasticsearch or catalog valuation queries on DynamoDB. Know the difference, and use each tool in its lane.

Our experience at MusicTech Lab

We have built music data systems across this entire spectrum.

Elasticsearch powered our royalty search engine, processing data from 15+ distributors into a unified, searchable catalog. It excels at finding a specific transaction across 200 million records in milliseconds. But it cannot calculate a label P&L.

ClickHouse drives our AI-powered analytics dashboard, where non-technical users type plain English questions and get charts back in seconds. Sub-second aggregation on tens of millions of rows makes the conversational experience feel instant.

BigQuery is the direction we are actively exploring for financial reporting and reconciliation, connecting royalty data to general ledger systems through dbt transformations. We are deep in the learning phase right now, studying dimensional modeling, star schemas, and the dbt ecosystem. The serverless model and native BI integrations make it a compelling foundation for label finance teams.

The lesson: there is no single "best database for music." There is the right database for your specific workload. And sometimes the right answer is to start learning the tool before your next project demands it.

Getting started

If you are a music company evaluating databases for the first time, start here:

  1. Map your workloads. List every report, dashboard, and query your team runs. Categorize each as financial reporting, real-time analytics, or search.
  2. Count your data. How many rows today? What is your annual growth rate? A 10M-row label has very different needs than a 500M-row distributor.
  3. Assess your team. Do you have data engineers? DevOps? Or is your "data team" a finance analyst who knows Excel? This changes the answer dramatically.
  4. Start serverless. Unless you have a clear reason for provisioned infrastructure, BigQuery on-demand or ClickHouse Cloud with auto-pause will get you running faster and cheaper.
  5. Design for the dimensional model first. The database choice matters less than your data model. A well-designed star schema performs well on any of these three platforms. A poorly designed model performs badly on all of them.

The most expensive database decision is not picking the wrong vendor. It is building without a proper data model and having to rebuild everything six months later. Invest the time upfront in your dimensional design.
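To make point 5 concrete, here is a toy star-schema lookup in Python, mirroring the revenue-by-label query from the benchmarks. Table contents and key values are invented for illustration:

```python
# In a star schema, dimension tables map surrogate keys to descriptive
# attributes, while the fact table stores only keys and measures.
dim_label = {1: "Blue Note", 2: "Verve"}  # label_key -> label_name

fact_revenue = [
    {"label_key": 1, "date_key": 20240115, "net_revenue": 1200.0},
    {"label_key": 2, "date_key": 20240116, "net_revenue": 800.0},
    {"label_key": 1, "date_key": 20240220, "net_revenue": 300.0},
]

def revenue_by_label(facts, labels):
    """Equivalent of: SELECT label_name, SUM(net_revenue) ... GROUP BY label_name."""
    totals = {}
    for row in facts:
        name = labels[row["label_key"]]  # the join: resolve key -> attribute
        totals[name] = totals.get(name, 0.0) + row["net_revenue"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

print(revenue_by_label(fact_revenue, dim_label))
# [('Blue Note', 1500.0), ('Verve', 800.0)]
```

Because the fact table holds only narrow keys and numeric measures, this shape compresses well and aggregates fast on any of the three platforms discussed above.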

Let's Build Something Together

Have a similar project in mind? We'd love to hear about it.

Get in touch to discuss how we can help bring your vision to life.