While Postgres works fine (even if it's slower, it actually returns results)
But yes, I've seen similar issues: running out of memory during query processing is the price you pay for higher performance. You need to know what's happening under the hood and do more work to make sure your queries run well. Postgres can be a thousand or more times slower, though, and it doesn't scale horizontally, so if you need to do complex queries/aggregations over billions of records, then "returns a result" doesn't cut it. If Postgres addresses your needs, then great: you don't need to use ClickHouse...
But which knobs do you use, and with what values, in each specific case? The query usually just fails with a generic OOM message and little other information.
- max_concurrent_queries, since each query uses a certain amount of memory
- max_memory_usage, which is the max per-query memory usage
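Both of these can also be tried out per session or per query before committing values to the config files, which makes it easier to find a working limit. A sketch (the table name and the 512 MiB figure are illustrative, not recommendations):

```sql
-- Cap every query in this session at 512 MiB of memory
SET max_memory_usage = 536870912;

-- Or override for a single query via the SETTINGS clause
-- (events is a hypothetical table)
SELECT count()
FROM events
SETTINGS max_memory_usage = 536870912;
```

Once a value survives your heaviest queries, it can go into the `<default>` profile in users.xml as below.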
Here's my full config for running ClickHouse on a 2 GiB server without OOMs. Some of it is likely irrelevant, but it's a starting point.
diff --git a/clickhouse-config.xml b/clickhouse-config.xml
index f8213b65..7d7459cb 100644
--- a/clickhouse-config.xml
+++ b/clickhouse-config.xml
@@ -197,7 +197,7 @@
<!-- <listen_backlog>4096</listen_backlog> -->
- <max_connections>4096</max_connections>
+ <max_connections>2000</max_connections>
<!-- For 'Connection: keep-alive' in HTTP 1.1 -->
<keep_alive_timeout>3</keep_alive_timeout>
@@ -270,7 +270,7 @@
-->
<!-- Maximum number of concurrent queries. -->
- <max_concurrent_queries>100</max_concurrent_queries>
+ <max_concurrent_queries>4</max_concurrent_queries>
<!-- Maximum memory usage (resident set size) for server process.
Zero value or unset means default. Default is "max_server_memory_usage_to_ram_ratio" of available physical RAM.
@@ -335,7 +335,7 @@
In bytes. Cache is single for server. Memory is allocated only on demand.
You should not lower this value.
-->
- <mark_cache_size>5368709120</mark_cache_size>
+ <mark_cache_size>805306368</mark_cache_size>
<!-- If you enable the `min_bytes_to_use_mmap_io` setting,
@@ -981,11 +980,11 @@
</distributed_ddl>
<!-- Settings to fine tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h -->
- <!--
<merge_tree>
- <max_suspicious_broken_parts>5</max_suspicious_broken_parts>
+ <merge_max_block_size>2048</merge_max_block_size>
+ <max_bytes_to_merge_at_max_space_in_pool>1073741824</max_bytes_to_merge_at_max_space_in_pool>
+ <number_of_free_entries_in_pool_to_lower_max_size_of_merge>0</number_of_free_entries_in_pool_to_lower_max_size_of_merge>
</merge_tree>
- -->
<!-- Protection from accidental DROP.
If size of a MergeTree table is greater than max_table_size_to_drop (in bytes) than table could not be dropped with any DROP query.
diff --git a/clickhouse-users.xml b/clickhouse-users.xml
index f1856207..bbd4ced6 100644
--- a/clickhouse-users.xml
+++ b/clickhouse-users.xml
@@ -7,7 +7,12 @@
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
- <max_memory_usage>10000000000</max_memory_usage>
+ <max_memory_usage>536870912</max_memory_usage>
+
+ <queue_max_wait_ms>1000</queue_max_wait_ms>
+ <max_execution_time>30</max_execution_time>
+ <background_pool_size>4</background_pool_size>
+
<!-- How to choose between replicas during distributed query processing.
random - choose random replica from set of replicas with minimum number of errors
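After restarting with a config like this, it's worth sanity-checking which limits the server actually picked up rather than trusting the files. One way (assuming the standard `system.settings` table; the names match the settings changed in the diff above):

```sql
SELECT name, value
FROM system.settings
WHERE name IN ('max_memory_usage',
               'max_execution_time',
               'queue_max_wait_ms');
```

If a value doesn't match what you set, a typo'd tag name or an override in another profile is the usual culprit.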