Take your crypto analytics to the next level with ClickHouse. This guide shows you how to build a production-grade data pipeline for massive datasets and lightning-fast queries.
When your analytics needs grow beyond a single machine and you require a database designed for production scale, it’s time to consider ClickHouse. ClickHouse is built for handling billions of rows with sub-second query times, making it perfect for production analytics, real-time dashboards, and enterprise-grade data analysis.
Looking for other analytics solutions? Check out our full list of API Tutorials for more step-by-step guides.
Why ClickHouse for crypto analytics?
The goal: By the end of this guide, you will have a production-grade ClickHouse setup. Along the way, you will:

- Install and configure ClickHouse for crypto analytics.
- Create a robust data pipeline for high-frequency data ingestion.
- Run complex analytics on 15-minute interval data.
- Enable AI-powered analysis through MCP server integration.
Install ClickHouse locally for development and testing:
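For example, on macOS with Homebrew, or via the official install script:

```bash
# macOS (Homebrew)
brew install clickhouse

# Linux (official install script; also works on macOS)
curl https://clickhouse.com/ | sh
```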
macOS specifics: Cask installation
The `brew install clickhouse` command now installs a Cask, not a standard formula. This provides a single `clickhouse` binary that acts as a multi-tool for both the server and the client. Commands that refer to `clickhouse-server` or `brew services` will not work.
Use the following commands instead:
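With the single multi-tool binary, the server and client run as subcommands:

```bash
# Start the ClickHouse server (runs in the foreground)
clickhouse server

# In a separate terminal, open an interactive client session
clickhouse client
```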
For production workloads, use ClickHouse Cloud:
Moving forward:
The rest of this tutorial assumes you are using a local ClickHouse installation (Option A). The Python scripts are configured for this by default. If you choose ClickHouse Cloud, remember to update the `CLICKHOUSE_HOST`, `CLICKHOUSE_PORT`, `CLICKHOUSE_USER`, and `CLICKHOUSE_PASSWORD` variables in the scripts accordingly.
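For reference, here is a sketch of how those variables might look at the top of a script. The values are illustrative defaults: a local server speaks HTTP on port 8123, while ClickHouse Cloud uses 8443 over TLS.

```python
# Illustrative connection settings; adjust for your own deployment.
CLICKHOUSE_HOST = "localhost"  # e.g. "<your-instance>.clickhouse.cloud" for Cloud
CLICKHOUSE_PORT = 8123         # ClickHouse Cloud's HTTPS interface uses 8443
CLICKHOUSE_USER = "default"
CLICKHOUSE_PASSWORD = ""       # Cloud instances require a password
```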
Create a new file named `build_clickhouse_db.py`. This script efficiently handles high-frequency data from the top 500 pools, incorporating robust error handling and API management strategies. It leverages two key endpoints: the Top Pools on a DEX endpoint to discover pools, and the Pool OHLCV Data endpoint to fetch historical price data.
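Before ingesting anything, the script needs a database and table to write into. The sketch below uses the `clickhouse-connect` driver; the `crypto_analytics` database name appears later in this guide, but the `ohlcv` table and its columns are illustrative assumptions, not the script’s exact schema:

```python
import clickhouse_connect

# Connect to the local ClickHouse server over HTTP.
client = clickhouse_connect.get_client(host="localhost", port=8123, username="default")

client.command("CREATE DATABASE IF NOT EXISTS crypto_analytics")

# Hypothetical OHLCV table: a MergeTree sorted by pool and time suits
# time-series range scans and aggregations.
client.command("""
    CREATE TABLE IF NOT EXISTS crypto_analytics.ohlcv (
        pool_address String,
        timestamp    DateTime,
        open         Float64,
        high         Float64,
        low          Float64,
        close        Float64,
        volume       Float64
    )
    ENGINE = MergeTree
    ORDER BY (pool_address, timestamp)
""")
```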
The script is built for performance and reliability, using several practices common in data pipelines (a minimal sketch of the retry-and-throttle pattern follows this list):

- Concurrency: Using `asyncio` and `aiohttp`, the script can make many API requests concurrently instead of one by one.
- Smart pagination: The `fetch_pool_ohlcv_paginated` function calculates how much data to request per API call based on the `OHLCV_API_LIMIT`.
- Rate limiting: An `asyncio.Semaphore`, combined with carefully tuned `BATCH_SIZE` and `asyncio.sleep()` calls, makes sure we don’t hit the API rate limit.
- Retries: The `fetch_with_retry` function automatically retries failed requests with an exponential backoff delay.
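As referenced above, here is a minimal sketch of the retry-and-throttle pattern, with hypothetical constants (`MAX_RETRIES`, the semaphore size) standing in for the script’s tuned values:

```python
import asyncio
import aiohttp

MAX_RETRIES = 3                   # hypothetical; tune to your rate limits
semaphore = asyncio.Semaphore(5)  # cap on concurrent in-flight requests

async def fetch_with_retry(session: aiohttp.ClientSession, url: str) -> dict:
    """GET a URL, retrying failures with exponential backoff."""
    for attempt in range(MAX_RETRIES):
        try:
            async with semaphore:  # throttle concurrency
                async with session.get(url) as resp:
                    resp.raise_for_status()
                    return await resp.json()
        except aiohttp.ClientError:
            if attempt == MAX_RETRIES - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
```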
Once your database is populated, you can query it directly using any ClickHouse-compatible SQL client or a Python script. While the next step (AI integration) is recommended for the most powerful analysis, running queries directly is a great way to verify your data.

You can create a file named `query_clickhouse.py` to see how fast ClickHouse can process complex analytical queries on the millions of rows you’ve ingested.
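A sketch of what `query_clickhouse.py` might contain, assuming the `clickhouse-connect` driver and the illustrative `ohlcv` table from earlier (adjust the names to your actual schema):

```python
import clickhouse_connect

client = clickhouse_connect.get_client(
    host="localhost", port=8123, username="default",
    password="", database="crypto_analytics",
)

# Roll 15-minute candles up to daily volume per pool: a typical
# aggregation ClickHouse answers quickly even at scale.
result = client.query("""
    SELECT
        pool_address,
        toDate(timestamp) AS day,
        sum(volume)       AS daily_volume
    FROM ohlcv
    GROUP BY pool_address, day
    ORDER BY daily_volume DESC
    LIMIT 10
""")

for row in result.result_rows:
    print(row)
```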
You can run the script by executing it from your terminal:
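```bash
python query_clickhouse.py
```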
Now, let’s move on to the recommended final step: connecting your database to an AI assistant.
Enable seamless analysis of your local ClickHouse database through the ClickHouse MCP Server. This allows AI assistants like Claude Desktop to connect to your database, list tables, and run `SELECT` queries securely.
The server is a Python package that can be installed via `pip`:
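Assuming the package is published under the same name as the executable referenced below:

```bash
# Package name assumed from the clickhouse-mcp-server executable this guide uses
pip install clickhouse-mcp-server
```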
Next, configure your AI client (e.g., Claude Desktop) to use the server. You’ll need to edit its configuration file.
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%/Claude/claude_desktop_config.json`
Add the following JSON block to the `mcpServers` section of the file. This tells the client how to run the server and provides the connection details for your local ClickHouse instance.
Finding the command path
The most common point of failure is an incorrect command path. The command should be the absolute path to the `clickhouse-mcp-server` executable that `pip` installed.
Find this path by running `which clickhouse-mcp-server` in your terminal and use the output in the `command` field below.
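For example, with placeholder values. The environment variable names below mirror the `CLICKHOUSE_*` settings used by the scripts above, but treat them as assumptions and check the server’s own documentation; replace the `command` path with your `which` output:

```json
{
  "mcpServers": {
    "clickhouse": {
      "command": "/absolute/path/to/clickhouse-mcp-server",
      "env": {
        "CLICKHOUSE_HOST": "localhost",
        "CLICKHOUSE_PORT": "8123",
        "CLICKHOUSE_USER": "default",
        "CLICKHOUSE_PASSWORD": "",
        "CLICKHOUSE_DATABASE": "crypto_analytics"
      }
    }
  }
}
```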
Save the configuration file and restart your AI client. Once restarted, you can start asking it to analyze the data in your `crypto_analytics` database.
“Server disconnected” error: This almost always means the `command` path in your configuration is incorrect. Double-check the absolute path using `which clickhouse-mcp-server`.
AI connects to the `default` database: We observed that the AI client might sometimes connect to the `default` database on its own, even if `crypto_analytics` is specified in the config. When that happens, it sees no tables.

Solution: Be explicit. To ensure the AI works correctly, always specify the database in your prompt. This overrides the AI’s tendency to use the default.
Good example prompts:

- “Connect to the `crypto_analytics` database and then list the tables.”
- “Using the `crypto_analytics` database, show me the top 10 pools by volume from the `pools` table.”
- Any other analysis request, ending with “…the `crypto_analytics` database.”

Congratulations! You’ve successfully built a scalable crypto analytics pipeline with ClickHouse. You’ve ingested a large dataset of OHLCV data, and you’ve enabled a powerful AI assistant to securely query and analyze that data.
Key achievements: