Executive Summary: The 2026 Analytical Standard
ClickHouse is an open-source columnar database management system that can process billions of rows in milliseconds. However, almost every tutorial on the internet still uses outdated Ubuntu 20.04 or 22.04 commands that fail outright on modern systems. Worse, most treat ClickHouse like a basic application, ignoring its true potential on dedicated hardware.
In this advanced 2026 guide, we will install ClickHouse on the latest Ubuntu 26.04 Resolute Raccoon. The same modern, security-conscious commands also work on Ubuntu 24.04 and 22.04. We will then dive deep into ServerMO Bare Metal optimizations, replacing theoretical cloud setups with raw hardware configurations such as tiered storage and ClickHouse Keeper.
Phase 1: Modern Repository Setup
Old tutorials instruct you to use the deprecated apt-key command and the legacy Yandex repositories. That approach is a serious security weakness and throws immediate errors on Ubuntu 26.04. Use the modern keyring method instead to fetch the official packages securely.
# Install core dependencies for secure repository management
sudo apt update
sudo apt install -y apt-transport-https ca-certificates curl gnupg
# Securely download the official GPG key into the correct keyring directory
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /etc/apt/keyrings/clickhouse.gpg
# Add the official repository, enforcing the signed-by security check
echo "deb [signed-by=/etc/apt/keyrings/clickhouse.gpg arch=$(dpkg --print-architecture)] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list > /dev/null
# Update the package index and install the server and client
sudo apt update
sudo apt install -y clickhouse-server clickhouse-client
During the installation you will be prompted to create a password for the default user; store it securely. Once complete, enable and start the service, then connect to verify the installation.
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server
clickhouse-client --password
Phase 2: Production Tiered Storage
This is where Bare Metal decisively beats public cloud pricing. In the cloud you pay a large flat premium for fast disks; on a ServerMO dedicated server you can architect a hybrid setup mixing ultra-fast NVMe drives with massive 18TB enterprise HDDs.
We will configure a production-grade storage policy. It keeps the default system files on the boot drive, routes active analytical queries to the NVMe disk, and automatically moves merged data parts larger than 10GB to the cold HDD archive. Save it as an override (for example, under /etc/clickhouse-server/config.d/) and restart the server.
<clickhouse>
    <storage_configuration>
        <disks>
            <default>
                <path>/var/lib/clickhouse/</path>
            </default>
            <nvme_disk>
                <path>/mnt/nvme/clickhouse/</path>
            </nvme_disk>
            <hdd_disk>
                <path>/mnt/hdd/clickhouse/</path>
            </hdd_disk>
        </disks>
        <policies>
            <tiered_policy>
                <volumes>
                    <hot_volume>
                        <disk>nvme_disk</disk>
                        <max_data_part_size_bytes>10737418240</max_data_part_size_bytes>
                    </hot_volume>
                    <cold_volume>
                        <disk>hdd_disk</disk>
                    </cold_volume>
                </volumes>
                <move_factor>0.2</move_factor>
            </tiered_policy>
        </policies>
    </storage_configuration>
</clickhouse>
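With the policy in place, a table opts in through its storage_policy setting. A minimal sketch, assuming a hypothetical events table (the columns and the 90-day window are illustrative); the TTL clause additionally ages rows onto the cold volume:

```sql
-- Hypothetical table; adjust columns and the TTL window to your workload.
CREATE TABLE events
(
    event_time DateTime,
    user_id UInt64,
    payload String
)
ENGINE = MergeTree
ORDER BY (user_id, event_time)
TTL event_time + INTERVAL 90 DAY TO VOLUME 'cold_volume'
SETTINGS storage_policy = 'tiered_policy';
```

New parts land on the NVMe hot volume; parts that grow past the 10GB cap, or rows older than 90 days, end up on the HDD archive.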
Phase 3: Network Security Binding
By default, ClickHouse listens only on localhost, shielding it from the outside world. However, many administrators modify config.xml to listen on ::, which exposes ports 8123 (HTTP) and 9000 (native protocol) to the entire public internet and invites relentless automated brute-force attacks.
If you are running a multi-node cluster or remote applications, bind the listener strictly to your internal VPC IP address and use the UFW firewall to whitelist specific nodes. Never leave the database ports open to the world.
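A minimal UFW sketch of that whitelist approach; 10.0.0.0/24 is a placeholder for your real internal subnet:

```shell
# Deny everything inbound by default, then whitelist SSH and the
# internal subnet for ClickHouse's HTTP (8123) and native (9000) ports.
sudo ufw default deny incoming
sudo ufw allow OpenSSH
sudo ufw allow from 10.0.0.0/24 to any port 8123 proto tcp
sudo ufw allow from 10.0.0.0/24 to any port 9000 proto tcp
sudo ufw enable
```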
Phase 4: Fixing the Too Many Parts Error
The most common mistake new data engineers make is sending millions of individual INSERT statements per second. ClickHouse creates a physical data part on disk for every insert, so this produces thousands of tiny files, overwhelms the background merge process, and ends in the dreaded Too many parts error.
The enterprise solution is to enable asynchronous inserts. This tells ClickHouse to hold small incoming inserts in a RAM buffer and flush them to disk together as one large, highly compressed part.
<!-- Enable inside your user profile configuration: /etc/clickhouse-server/users.xml -->
<profiles>
    <default>
        <async_insert>1</async_insert>
        <wait_for_async_insert>1</wait_for_async_insert>
    </default>
</profiles>
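If you cannot edit the server profile, the same two settings can be supplied per request. A sketch over the HTTP interface, assuming a hypothetical events table and the password in an environment variable:

```shell
# Route small inserts through the server-side buffer; ClickHouse batches
# them and writes one large part instead of thousands of tiny ones.
curl 'http://localhost:8123/?async_insert=1&wait_for_async_insert=1' \
     --user "default:$CLICKHOUSE_PASSWORD" \
     --data-binary "INSERT INTO events VALUES (now(), 42, 'click')"
```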
Phase 5: AI Vector Search Realities
As we move deeper into 2026, the line between traditional data analytics and artificial intelligence is vanishing. ClickHouse now supports vector search via HNSW indexes, allowing you to store AI embeddings alongside relational data.
The Hardware Truth: beware of marketing myths suggesting you need GPU servers for ClickHouse. ClickHouse is fundamentally optimized for CPU processing, and for vector search it relies heavily on SIMD and AVX-512 instructions. For maximum vector search performance, deploy your cluster on high-frequency Bare Metal CPUs such as Intel Xeon Scalable or AMD EPYC processors. GPUs should only be used externally, for generating the embeddings.
-- Example 2026 vector index table creation
CREATE TABLE ai_documents
(
    id UInt64,
    content String,
    embedding Array(Float32),
    -- 'hnsw' method, cosine distance, 384-dimensional embeddings
    -- (set the dimension to match your embedding model)
    INDEX vec_idx embedding TYPE vector_similarity('hnsw', 'cosineDistance', 384)
)
ENGINE = MergeTree
ORDER BY id;
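Querying the index is then a plain ORDER BY ... LIMIT over the distance function. A sketch; the query vector must match the embedding dimension and is abbreviated here:

```sql
-- Nearest-neighbor search: the 5 documents closest to query_vec.
WITH [0.02, 0.11, 0.56 /* ... one value per embedding dimension ... */]::Array(Float32) AS query_vec
SELECT id, content
FROM ai_documents
ORDER BY cosineDistance(embedding, query_vec)
LIMIT 5;
```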
Phase 6: Replacing ZooKeeper
For years, running a distributed cluster required installing Apache ZooKeeper, a heavy Java application that consumes enormous amounts of RAM and demands constant garbage-collection tuning.
The modern approach is ClickHouse Keeper, a drop-in replacement written purely in C++ that offers vastly superior performance and stability. When deploying a large-scale architecture across multiple bare metal nodes, install it seamlessly from the official package.
# Install the standalone native keeper on your dedicated management nodes
sudo apt install -y clickhouse-keeper
sudo systemctl enable clickhouse-keeper
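Keeper answers ZooKeeper-style four-letter-word commands on its client port (9181 by default), which gives you a quick health check once the service is running:

```shell
# "ruok" returns "imok" from a healthy node; "mntr" dumps raft and
# connection metrics for monitoring.
echo ruok | nc localhost 9181
echo mntr | nc localhost 9181
```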
Phase 7: Memory Limits and OOM Prevention
ClickHouse is brutally aggressive with memory. By default, a single heavy analytical query will attempt to consume all of your physical RAM. On a busy node this triggers the Linux Out-of-Memory killer, resulting in a complete database crash.
To ensure production stability you must enforce strict memory quotas. The per-query cap belongs in a profile in users.xml, while the server-wide ratio is a top-level server setting in config.xml (or a config.d override), not a profile setting.
<!-- users.xml: restrict a single query to a maximum of 16GB of RAM -->
<profiles>
    <default>
        <max_memory_usage>17179869184</max_memory_usage>
    </default>
</profiles>

<!-- config.xml: ensure ClickHouse leaves at least 10 percent of RAM for the OS -->
<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>
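The byte values in these settings are easy to get wrong. A quick shell sanity check for the 16GB figure used above:

```shell
# max_memory_usage is specified in bytes: 16 GiB = 16 * 1024^3.
gib=16
bytes=$((gib * 1024 * 1024 * 1024))
echo "$bytes"   # 17179869184
```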