
Music companies sit on mountains of data. Streaming royalties from dozens of DSPs. Multi-currency transactions across territories. Financial records that need to reconcile between royalty systems and general ledgers. The database you choose to store and query this data shapes everything downstream: how fast your dashboards load, how much you pay per month, and whether your finance team can actually get answers without filing a ticket.
We have built music data systems on Elasticsearch, ClickHouse, and BigQuery. Each one solved a different problem. This article breaks down when to use which, with real pricing numbers and performance benchmarks so you can make the right call for your catalog.
Before comparing databases, it helps to understand why music data is uniquely challenging. A typical mid-size label deals with several patterns that push general-purpose databases to their limits.
These patterns mean your database needs to handle complex joins, large aggregations, and flexible time-based queries. Not every database does this equally well.
Google BigQuery is a fully managed, serverless data warehouse. You do not provision clusters, tune nodes, or worry about scaling. You write SQL, BigQuery figures out the rest.
Architecture: Columnar storage with a distributed query engine. Separates storage and compute, so you pay independently for each. Data lives in Google Cloud Storage under the hood.
What it does well for music companies:
Where it falls short:
Amazon Redshift is a cluster-based columnar data warehouse built on PostgreSQL. It has been the default choice for AWS-native organizations for over a decade.
Architecture: Provisioned clusters with leader and compute nodes. You choose node types and quantities based on your workload. Redshift Serverless is also available as a pay-per-query alternative.
What it does well for music companies:
Where it falls short:
ClickHouse is an open-source, column-oriented OLAP database designed for sub-second analytical queries. Originally built at Yandex to handle billions of page-view events, it has become the go-to choice for real-time analytics.
Architecture: Vectorized query execution with native compression. Processes data in columns rather than rows, which makes aggregations extremely fast. Available as self-hosted (open source) or ClickHouse Cloud (managed).
What it does well for music companies:
Where it falls short:
| Feature | BigQuery | Redshift | ClickHouse |
|---|---|---|---|
| Deployment | Serverless (managed) | Cluster or Serverless | Self-hosted or Cloud |
| Query latency | 1-2 seconds | 1-3 seconds | 50-500 ms |
| Joins | Full SQL, excellent | Full SQL, good | Limited, basic |
| Scaling | Automatic | Manual (cluster) or auto (serverless) | Manual (self) or auto (cloud) |
| dbt support | Native, first-class | Native, good | Community adapter |
| BI ecosystem | Looker, Sigma, Tableau, Looker Studio | QuickSight, Tableau, Looker | Grafana, Metabase, Superset |
| SQL dialect | GoogleSQL | PostgreSQL-based | ClickHouse SQL |
| Best concurrency | ~100 concurrent | ~50 concurrent | 1000+ concurrent |
| Learning curve | Low | Medium | Medium-High |
| Vendor lock-in | High (GCP) | High (AWS) | Low (open source) |
Let's model real costs for a label with 15 owned labels, 50 million royalty rows, and a 5-person analytics team running daily queries.
| Database | Storage model | Monthly cost |
|---|---|---|
| BigQuery | $20/TB active, $10/TB long-term | $4/mo (200 GB active) |
| Redshift | Included in node cost (managed storage: $0.024/GB) | $4.80/mo |
| ClickHouse Cloud | $25.30/TB (compressed, ~40 GB after compression) | $1.01/mo |
Assumptions: 20 queries/day per analyst, ~5 GB scanned per query, 22 working days/month.
| Database | Pricing model | Monthly cost |
|---|---|---|
| BigQuery (on-demand) | $6.25/TB scanned. 5 users x 20 queries x 5 GB x 22 days = ~11 TB/month | $68/mo |
| BigQuery (flat-rate) | 100 slots at ~$0.04/slot-hour x 730 hours | $2,920/mo |
| Redshift (dc2.large, 2 nodes) | $0.25/node/hour x 2 nodes x 730 hours | $365/mo |
| Redshift (Serverless) | $0.375/RPU-hour, ~4 RPUs x ~8 active hours x 22 days | $264/mo |
| ClickHouse Cloud | ~$0.10/compute-unit, auto-scaling with idle pause | $80-150/mo |
| Database | Configuration | Total/month |
|---|---|---|
| BigQuery | On-demand | ~$72 |
| BigQuery | Flat-rate (100 slots) | ~$2,925 |
| Redshift | 2x dc2.large (always on) | ~$370 |
| Redshift | Serverless | ~$269 |
| ClickHouse Cloud | Dev tier with auto-pause | ~$100-150 |
Year 1 Year 2 Year 3
(50M rows) (65M rows) (85M rows)
BigQuery (on-demand) $864 $1,080 $1,350
Redshift (2x dc2) $4,440 $4,440 $6,660*
Redshift Serverless $3,228 $3,900 $4,800
ClickHouse Cloud $1,500 $1,800 $2,400
* Redshift cluster upgrade needed at ~70M rows
Here is how each database handles typical music industry analytical queries on a 50M-row royalty dataset.
SELECT label_name, SUM(net_revenue) AS total_revenue
FROM fact_revenue
JOIN dim_label USING (label_key)
JOIN dim_date USING (date_key)
WHERE dim_date.date >= DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH)
GROUP BY label_name
ORDER BY total_revenue DESC
| Database | Execution time | Data scanned |
|---|---|---|
| BigQuery | 1.2 seconds | 3.8 GB |
| Redshift (dc2.large x2) | 1.8 seconds | Full table |
| ClickHouse | 87 ms | 0.4 GB (compressed) |
SELECT artist_name, contract_id,
SUM(revenue) AS earned,
advance_amount,
advance_amount - SUM(revenue) AS balance
FROM fact_revenue
JOIN dim_artist USING (artist_key)
JOIN dim_contract USING (contract_key)
GROUP BY artist_name, contract_id, advance_amount
HAVING balance > 0
ORDER BY balance DESC
| Database | Execution time | Notes |
|---|---|---|
| BigQuery | 2.1 seconds | Handles multi-table join well |
| Redshift | 2.8 seconds | Comparable with sort keys |
| ClickHouse | 450 ms | Slower due to joins, still fast |
SELECT territory, COUNT(*) AS plays, SUM(revenue) AS revenue
FROM streaming_events
WHERE event_date >= today() - 7
GROUP BY territory
ORDER BY plays DESC
LIMIT 50
| Database | Execution time |
|---|---|
| BigQuery | 3.4 seconds |
| Redshift | 4.1 seconds |
| ClickHouse | 120 ms |
Choosing the right database is not about which one is "best." It is about which one fits your specific workload, team, and infrastructure.
Yes, and some of the most effective music data architectures do exactly that.
A practical combination: ClickHouse for real-time dashboards (streaming counts, live territory maps, DSP performance) feeding from event streams, plus BigQuery for financial reporting (royalty reconciliation, label P&Ls, catalog valuation) with dbt transformations on a daily batch cycle.
This works well when:
It becomes over-engineering when:
These come up in conversations about music data, but they solve fundamentally different problems.
| Database | Purpose | Music use case |
|---|---|---|
| Elasticsearch | Full-text search and log analytics | Catalog search (find tracks by title, ISRC, artist), log monitoring, operational dashboards |
| DynamoDB | Transactional key-value store | User accounts, playlist storage, session management, low-latency app backends |
Neither is a data warehouse. You would not run royalty reconciliation on Elasticsearch or catalog valuation queries on DynamoDB. Know the difference, and use each tool in its lane.
We have built music data systems across this entire spectrum.
Elasticsearch powered our royalty search engine, processing data from 15+ distributors into a unified, searchable catalog. It excels at finding a specific transaction across 200 million records in milliseconds. But it cannot calculate a label P&L.
ClickHouse drives our AI-powered analytics dashboard, where non-technical users type plain English questions and get charts back in seconds. Sub-second aggregation on tens of millions of rows makes the conversational experience feel instant.
BigQuery is the direction we are actively exploring for financial reporting and reconciliation, connecting royalty data to general ledger systems through dbt transformations. We are deep in the learning phase right now, studying dimensional modeling, star schemas, and the dbt ecosystem. The serverless model and native BI integrations make it a compelling foundation for label finance teams.
The lesson: there is no single "best database for music." There is the right database for your specific workload. And sometimes the right answer is to start learning the tool before your next project demands it.
If you are a music company evaluating databases for the first time, start here:
Have a similar project in mind? We'd love to hear about it.
Get in touch to discuss how we can help bring your vision to life.
The Voice Memo Graveyard Problem
Most musical ideas end up as unsearchable, scattered recordings that creators never revisit. Why does this happen and what can we do about it?
Why we decided to use wavesurfer.js
Creating a simple, custom audio player with main functionality is relatively easy. Are you sure? Let's see how we tackled this issue
Technical Partner
Technical partner at MusicTech Lab with 15+ years in software development. Builder, problem solver, blues guitarist, long-distance swimmer, and cyclist.
Get music tech insights, case studies, and industry news delivered to your inbox.