Comments (24)
Hi @ayush-san!
This issue is similar to #54483, but that should be fixed in 24.1.5.6 by #58310. However, there might still be some bugs in the implementation, so could you please answer a few questions:
- Have you enabled `statistics_interval_ms`?
- Have you changed the default value of `kafka_consumers_pool_ttl_ms`?
- Could you please paste the result of `SELECT * FROM system.kafka_consumers`? Feel free to redact any sensitive data; I am not interested in concrete table/topic names, only in the metrics.
- Are all of your Kafka tables properly connected to MVs? This is important to know because if no MV is attached to a Kafka table, its consumers are not polled regularly, and that can cause issues.
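One way to check the last point (a sketch; `dependencies_table` in `system.tables` lists the views that consume from a table):

```sql
-- List Kafka engine tables and the materialized views attached to them;
-- an empty dependencies_table array means nothing polls that consumer.
SELECT database, name, dependencies_table
FROM system.tables
WHERE engine = 'Kafka'
```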
from clickhouse.
@antaljanosbenjamin I am on version 24.1.5.6
- No
- No
- clickhouse_consumer.txt
- How do I check this?
---
I had increased the node type to c7g.4xlarge to stop these errors for now, but one of my nodes has become unstable and the clickhouse process on that node keeps restarting with the following errors:
2024.06.24 12:13:46.866652 [ 37258 ] {} <Error> a1764866-d514-4578-9267-40f13eeaf350::1719208800_8_12_1 (MergeFromLogEntryTask): virtual bool DB::ReplicatedMergeMutateTaskBase::executeStep(): Code: 271. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/a17/a1764866-d514-4578-9267-40f13eeaf350/1719208800_11_11_0/ in table alb_logs.store_zomato_application_entry_alb_logs (a1764866-d514-4578-9267-40f13eeaf350) located on disk default of type local, from mark 0 with max_rows_to_read = 42): While executing MergeTreeSequentialSource. (CANNOT_DECOMPRESS), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<String>(int, FormatStringHelperImpl<std::type_identity<String>::type>, String&&) @ 0x0000000007528fd0 in /usr/bin/clickhouse
2. DB::CompressionCodecZSTD::doDecompressData(char const*, unsigned int, char*, unsigned int) const @ 0x000000000f4b07cc in /usr/bin/clickhouse
3. DB::ICompressionCodec::decompress(char const*, unsigned int, char*) const @ 0x000000000f4ecdbc in /usr/bin/clickhouse
4. DB::CompressedReadBufferFromFile::nextImpl() @ 0x000000000f4a4d64 in /usr/bin/clickhouse
5. void DB::deserializeBinarySSE2<1>(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 63ul, 64ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 63ul, 64ul>&, DB::ReadBuffer&, unsigned long) @ 0x000000000f634ea0 in /usr/bin/clickhouse
6. DB::ISerialization::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::unordered_map<String, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>>*) const @ 0x000000000f5ef174 in /usr/bin/clickhouse
7. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df6d54 in /usr/bin/clickhouse
8. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
9. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
10. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
11. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
12. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
13. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
14. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
15. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
16. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
17. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
18. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
19. DB::ReplicatedMergeMutateTaskBase::executeStep() @ 0x0000000010edda08 in /usr/bin/clickhouse
20. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
21. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
22. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
23. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
24. ? @ 0x0000ffffbee2d834
25. ? @ 0x0000ffffbedd1e5c
(version 24.1.5.6 (official build))
2024.06.24 12:13:46.863786 [ 37255 ] {} <Error> MergeTreeBackgroundExecutor: Exception while executing background task {a1764866-d514-4578-9267-40f13eeaf350::1719212400_6_10_1}: Code: 33. DB::Exception: Cannot read all data in MergeTreeReaderCompact. Rows read: 454. Rows expected: 470.: (while reading from part /data/store/a17/a1764866-d514-4578-9267-40f13eeaf350/1719212400_9_9_0/ in table alb_logs.store_zomato_application_entry_alb_logs (a1764866-d514-4578-9267-40f13eeaf350) located on disk default of type local, from mark 0 with max_rows_to_read = 470): While executing MergeTreeSequentialSource. (CANNOT_READ_ALL_DATA), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<unsigned long&, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type, std::type_identity<unsigned long&>::type>, unsigned long&, unsigned long&) @ 0x0000000007e1fcc4 in /usr/bin/clickhouse
2. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df7a68 in /usr/bin/clickhouse
3. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
4. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
5. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
6. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
7. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
8. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
9. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
10. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
11. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
12. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
13. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
14. DB::ReplicatedMergeMutateTaskBase::executeStep() @ 0x0000000010edda08 in /usr/bin/clickhouse
15. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
16. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
17. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
18. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
19. ? @ 0x0000ffffbee2d834
20. ? @ 0x0000ffffbedd1e5c
---
CHECK TABLE is not showing any errors even though the logs contain data corruption errors:
CHECK TABLE kafka.store_metrics PARTITION '20240624'
FORMAT PrettyCompactMonoBlock
SETTINGS check_query_single_value_result = 0
┌─part_path──────────────┬─is_passed─┬─message─┐
│ 20240624_3003_3017_3 │ 1 │ │
│ 20240624_3021_3021_0 │ 1 │ │
│ 20240624_3019_3019_0 │ 1 │ │
│ 20240624_3020_3020_0 │ 1 │ │
│ 20240624_3018_3018_0 │ 1 │ │
│ 20240624_3544_3544_0 │ 1 │ │
│ 20240624_3542_3542_0 │ 1 │ │
│ 20240624_3022_3022_0 │ 1 │ │
│ 20240624_3543_3543_0 │ 1 │ │
│ 20240624_3023_3539_115 │ 1 │ │
│ 20240624_3887_3887_0 │ 1 │ │
│ 20240624_3540_3540_0 │ 1 │ │
│ 20240624_3888_3888_0 │ 1 │ │
│ 20240624_3545_3886_68 │ 1 │ │
│ 20240624_0_3002_1468 │ 1 │ │
│ 20240624_3541_3541_0 │ 1 │ │
│ 20240624_3889_3889_0 │ 1 │ │
│ 20240624_3890_3890_0 │ 1 │ │
│ 20240624_3891_3891_0 │ 1 │ │
│ 20240624_4465_4465_0 │ 1 │ │
│ 20240624_4466_4466_0 │ 1 │ │
│ 20240624_4467_4467_0 │ 1 │ │
│ 20240624_4464_4464_0 │ 1 │ │
│ 20240624_3892_4463_148 │ 1 │ │
│ 20240624_4468_4468_0 │ 1 │ │
└────────────────────────┴───────────┴─────────┘
I am able to run min, max and count(*) queries:
SELECT
min(ts),
max(ts),
now(),
count(*)
FROM kafka.store_metrics
WHERE _part = '20240624_3545_3886_68'
┌─────────────min(ts)─┬─────────────max(ts)─┬───────────────now()─┬─count()─┐
│ 2024-06-24 10:13:00 │ 2024-06-24 11:10:00 │ 2024-06-24 12:48:51 │ 8010 │
└─────────────────────┴─────────────────────┴─────────────────────┴─────────┘
But SELECT * queries fail with the same error:
SELECT *
FROM kafka.store_metrics
WHERE _part = '20240624_3545_3886_68'
LIMIT 10
Code: 271. DB::Exception: Received from localhost:9000. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/d3b/d3b72119-6e40-4774-bd1f-57b7ada1da4a/20240624_3545_3886_68/ in table kafka.store_metrics (d3b72119-6e40-4774-bd1f-57b7ada1da4a) located on disk default of type local, from mark 0 with max_rows_to_read = 8010): While executing MergeTreeSelect(pool: ReadPoolInOrder, algorithm: InOrder). (CANNOT_DECOMPRESS)
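Since min/max(ts) and count() above succeed while SELECT * fails, reading columns one at a time can narrow down which column's data is damaged (a sketch; `col` stands for each column name of the table in turn):

```sql
-- Read a single column of the suspect part and discard the output;
-- the column whose read raises CANNOT_DECOMPRESS is the damaged one.
SELECT col
FROM kafka.store_metrics
WHERE _part = '20240624_3545_3886_68'
FORMAT Null
```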
---
Cannot decompress ZSTD-encoded data: Data corruption detected
It is very unlikely that this is connected to the Kafka consumers. Your Kafka consumers are probably fine; at least I don't see any reason why they would cause issues. Maybe errors in some parts prevent the cluster from merging them, thus increasing the number of parts? Could you please run `SELECT count(*) FROM system.parts`?
Could you please also run the following query:
SELECT
*,
allocations - deallocations AS active_allocations,
size * active_allocations AS allocated_bytes
FROM system.jemalloc_bins
WHERE allocated_bytes > 0
ORDER BY allocated_bytes DESC
If the output is too long, feel free to add a `LIMIT 100` to the query.
---
SELECT count(*) FROM system.parts
-> 9264
SELECT
*,
allocations - deallocations AS active_allocations,
size * active_allocations AS allocated_bytes
FROM system.jemalloc_bins
WHERE allocated_bytes > 0
ORDER BY allocated_bytes DESC
┌─index─┬─large─┬─────size─┬─allocations─┬─deallocations─┬─active_allocations─┬─allocated_bytes─┐
│ 29 │ 0 │ 5120 │ 212898101 │ 210719024 │ 2179077 │ 11156874240 │
│ 52 │ 1 │ 262144 │ 500571503 │ 500546038 │ 25465 │ 6675496960 │
│ 12 │ 0 │ 256 │ 2002273426 │ 1999863123 │ 2410303 │ 617037568 │
│ 11 │ 0 │ 224 │ 1323110659 │ 1320618374 │ 2492285 │ 558271840 │
│ 4 │ 0 │ 64 │ 2817451995 │ 2814543884 │ 2908111 │ 186119104 │
│ 28 │ 0 │ 4096 │ 1334294398 │ 1334258216 │ 36182 │ 148201472 │
│ 61 │ 1 │ 1310720 │ 98889376 │ 98889287 │ 89 │ 116654080 │
│ 44 │ 0 │ 65536 │ 69402881 │ 69401241 │ 1640 │ 107479040 │
│ 40 │ 0 │ 32768 │ 65200498 │ 65198164 │ 2334 │ 76480512 │
│ 24 │ 0 │ 2048 │ 347068428 │ 347031249 │ 37179 │ 76142592 │
│ 36 │ 0 │ 16384 │ 129498930 │ 129494596 │ 4334 │ 71008256 │
│ 5 │ 0 │ 80 │ 1235558043 │ 1234955559 │ 602484 │ 48198720 │
│ 3 │ 0 │ 48 │ 27874244744 │ 27873318952 │ 925792 │ 44438016 │
│ 48 │ 0 │ 131072 │ 39493817 │ 39493521 │ 296 │ 38797312 │
│ 32 │ 0 │ 8192 │ 128356450 │ 128351868 │ 4582 │ 37535744 │
│ 7 │ 0 │ 112 │ 1000443110 │ 1000117795 │ 325315 │ 36435280 │
│ 6 │ 0 │ 96 │ 4956827335 │ 4956531595 │ 295740 │ 28391040 │
│ 69 │ 1 │ 5242880 │ 12139 │ 12134 │ 5 │ 26214400 │
│ 78 │ 1 │ 25165824 │ 56 │ 55 │ 1 │ 25165824 │
│ 15 │ 0 │ 448 │ 167274792 │ 167224442 │ 50350 │ 22556800 │
│ 70 │ 1 │ 6291456 │ 1928 │ 1925 │ 3 │ 18874368 │
│ 22 │ 0 │ 1536 │ 228524438 │ 228512189 │ 12249 │ 18814464 │
│ 37 │ 0 │ 20480 │ 61252859 │ 61252099 │ 760 │ 15564800 │
│ 20 │ 0 │ 1024 │ 155893247 │ 155878237 │ 15010 │ 15370240 │
│ 75 │ 1 │ 14680064 │ 121 │ 120 │ 1 │ 14680064 │
│ 16 │ 0 │ 512 │ 139176063 │ 139148080 │ 27983 │ 14327296 │
│ 13 │ 0 │ 320 │ 639364317 │ 639320012 │ 44305 │ 14177600 │
│ 56 │ 1 │ 524288 │ 1225213 │ 1225186 │ 27 │ 14155776 │
│ 74 │ 1 │ 12582912 │ 16831 │ 16830 │ 1 │ 12582912 │
│ 27 │ 0 │ 3584 │ 54633691 │ 54630624 │ 3067 │ 10992128 │
│ 10 │ 0 │ 192 │ 585120097 │ 585065091 │ 55006 │ 10561152 │
│ 2 │ 0 │ 32 │ 6285427618 │ 6285102474 │ 325144 │ 10404608 │
│ 8 │ 0 │ 128 │ 868443284 │ 868370280 │ 73004 │ 9344512 │
│ 23 │ 0 │ 1792 │ 12238236 │ 12233955 │ 4281 │ 7671552 │
│ 14 │ 0 │ 384 │ 747237974 │ 747218502 │ 19472 │ 7477248 │
│ 63 │ 1 │ 1835008 │ 7595 │ 7591 │ 4 │ 7340032 │
│ 67 │ 1 │ 3670016 │ 3309 │ 3307 │ 2 │ 7340032 │
│ 25 │ 0 │ 2560 │ 49460950 │ 49458112 │ 2838 │ 7265280 │
│ 17 │ 0 │ 640 │ 315818927 │ 315808594 │ 10333 │ 6613120 │
│ 39 │ 0 │ 28672 │ 994701 │ 994490 │ 211 │ 6049792 │
│ 31 │ 0 │ 7168 │ 7392355 │ 7391560 │ 795 │ 5698560 │
│ 30 │ 0 │ 6144 │ 67738935 │ 67738062 │ 873 │ 5363712 │
│ 18 │ 0 │ 768 │ 170703631 │ 170696741 │ 6890 │ 5291520 │
│ 34 │ 0 │ 12288 │ 39448178 │ 39447753 │ 425 │ 5222400 │
│ 33 │ 0 │ 10240 │ 197672864 │ 197672397 │ 467 │ 4782080 │
│ 9 │ 0 │ 160 │ 804045213 │ 804015622 │ 29591 │ 4734560 │
│ 19 │ 0 │ 896 │ 70951187 │ 70946269 │ 4918 │ 4406528 │
│ 60 │ 1 │ 1048576 │ 447749 │ 447745 │ 4 │ 4194304 │
│ 26 │ 0 │ 3072 │ 78814848 │ 78813698 │ 1150 │ 3532800 │
│ 21 │ 0 │ 1280 │ 199075568 │ 199072812 │ 2756 │ 3527680 │
│ 38 │ 0 │ 24576 │ 21303462 │ 21303323 │ 139 │ 3416064 │
│ 64 │ 1 │ 2097152 │ 199648 │ 199647 │ 1 │ 2097152 │
│ 1 │ 0 │ 16 │ 276178919 │ 276066650 │ 112269 │ 1796304 │
│ 35 │ 0 │ 14336 │ 9997524 │ 9997413 │ 111 │ 1591296 │
│ 62 │ 1 │ 1572864 │ 30980 │ 30979 │ 1 │ 1572864 │
│ 55 │ 1 │ 458752 │ 893922 │ 893919 │ 3 │ 1376256 │
│ 59 │ 1 │ 917504 │ 81505 │ 81504 │ 1 │ 917504 │
│ 0 │ 0 │ 8 │ 375999327 │ 375940638 │ 58689 │ 469512 │
│ 47 │ 0 │ 114688 │ 746963 │ 746959 │ 4 │ 458752 │
│ 45 │ 0 │ 81920 │ 152406599 │ 152406594 │ 5 │ 409600 │
│ 50 │ 0 │ 196608 │ 697547 │ 697545 │ 2 │ 393216 │
│ 54 │ 1 │ 393216 │ 951562 │ 951561 │ 1 │ 393216 │
│ 41 │ 0 │ 40960 │ 202247594 │ 202247588 │ 6 │ 245760 │
│ 49 │ 0 │ 163840 │ 41552287 │ 41552286 │ 1 │ 163840 │
│ 43 │ 0 │ 57344 │ 17601714 │ 17601712 │ 2 │ 114688 │
│ 46 │ 0 │ 98304 │ 1180435352 │ 1180435351 │ 1 │ 98304 │
│ 42 │ 0 │ 49152 │ 1085610 │ 1085609 │ 1 │ 49152 │
└───────┴───────┴──────────┴─────────────┴───────────────┴────────────────────┴─────────────────┘
---
I really have no idea what is going on here. 10k parts shouldn't be too bad. But I think it makes sense to differentiate between the issues:
- Increasing memory: as 24.1 is not supported anymore, I really wouldn't like to spend time on this unless it can be reproduced in 24.3. Maybe you can check the contents of `system.merges`; something might have gotten stuck because of the compression error, and that could halt part merging, hence the 10k parts. Maybe 10k parts could cause 11GB of memory usage; I am not sure.
- The compression error: it is hard to say anything about that either. I would try moving the problematic parts out of the table and see if the node can start. If it happens with all the parts, then maybe something is misconfigured.
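Moving a problematic part out of the table can be sketched with DETACH PART, using one of the part names from the errors above:

```sql
-- Detach the suspect part so merges and reads skip it; the data moves
-- to the table's detached/ directory and can be inspected or deleted later.
ALTER TABLE kafka.store_metrics DETACH PART '20240624_3545_3886_68'
```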
---
10k parts? How come?
@ayush-san can you please share
SELECT
metric,
formatReadableSize(value)
FROM system.asynchronous_metrics
WHERE metric ILIKE '%cach%'
---
Earlier we were running on c7g.2xlarge instances, but due to the above errors I increased the instance type to c7g.4xlarge.
┌─metric───────────────────────┬─formatReadableSize(value)─┐
│ CompiledExpressionCacheCount │ 9.00 B │
│ CompiledExpressionCacheBytes │ 72.00 KiB │
│ FilesystemCacheFiles │ 0.00 B │
│ FilesystemCacheBytes │ 0.00 B │
│ QueryCacheEntries │ 0.00 B │
│ QueryCacheBytes │ 0.00 B │
│ IndexUncompressedCacheBytes │ 0.00 B │
│ IndexMarkCacheBytes │ 0.00 B │
│ UncompressedCacheCells │ 0.00 B │
│ UncompressedCacheBytes │ 0.00 B │
│ IndexUncompressedCacheCells │ 0.00 B │
│ HashTableStatsCacheMisses │ 3.77 KiB │
│ OSMemoryFreePlusCached │ 12.56 GiB │
│ OSMemoryCached │ 11.23 GiB │
│ MMapCacheCells │ 0.00 B │
│ HashTableStatsCacheHits │ 94.40 KiB │
│ MarkCacheBytes │ 6.64 MiB │
│ IndexMarkCacheFiles │ 0.00 B │
│ OSMemoryFreeWithoutCached │ 1.33 GiB │
│ MarkCacheFiles │ 20.76 KiB │
│ HashTableStatsCacheEntries │ 1.34 KiB │
└──────────────────────────────┴───────────────────────────┘
---
Maybe you can check the content of system.merges, something might got stuck because of the compression error and that could halt parts merging, thus the 10k parts. Maybe 10k part could cause 11GB of memory usage. I am not sure.
@antaljanosbenjamin I checked system.merges and nothing seems stuck. Even before upgrading the instance type, I didn't find any stuck merges.
For the data corruption issue, I am trying to drop the corrupted parts:
ALTER TABLE kafka.store_metrics ON CLUSTER monitoring
DROP PART '20240624_3545_3886_68'
but they are still visible in system.parts:
SELECT
name,
active,
refcount
FROM system.parts
WHERE (table = 'store_metrics') AND (partition = '2024-06-24')
┌─name────────────────────┬─active─┬─refcount─┐
│ 20240624_0_3002_1468 │ 1 │ 1 │
│ 20240624_3003_3017_3 │ 1 │ 3 │
│ 20240624_3018_3018_0 │ 1 │ 3 │
│ 20240624_3019_3019_0 │ 1 │ 3 │
│ 20240624_3020_3020_0 │ 1 │ 3 │
│ 20240624_3021_3021_0 │ 1 │ 3 │
│ 20240624_3022_3022_0 │ 1 │ 3 │
│ 20240624_3023_3539_115 │ 1 │ 5 │
│ 20240624_3540_3540_0 │ 1 │ 3 │
│ 20240624_3541_3541_0 │ 1 │ 3 │
│ 20240624_3542_3542_0 │ 1 │ 3 │
│ 20240624_3543_3543_0 │ 1 │ 5 │
│ 20240624_3544_3544_0 │ 1 │ 5 │
│ 20240624_3545_3886_68 │ 1 │ 3 │
│ 20240624_3887_3887_0 │ 1 │ 3 │
│ 20240624_3888_3888_0 │ 1 │ 3 │
│ 20240624_3889_3889_0 │ 1 │ 3 │
│ 20240624_3890_3890_0 │ 1 │ 3 │
│ 20240624_3891_3891_0 │ 1 │ 3 │
│ 20240624_3892_8138_1157 │ 1 │ 1 │
│ 20240624_8139_8503_72 │ 1 │ 1 │
└─────────────────────────┴────────┴──────────┘
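To see where the dropped parts went, `system.detached_parts` can be checked as well (a sketch):

```sql
-- Parts removed from the active set but still on disk show up here,
-- together with the reason they were detached.
SELECT name, reason
FROM system.detached_parts
WHERE table = 'store_metrics'
```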
---
You can also try https://clickhouse.com/docs/en/operations/allocation-profiling#sampling-allocations-and-flushing-heap-profiles (if it's possible for you, of course).
You can generate a heap profile after a few hours of running and send it here.
It's okay to run it on a single replica as long as you catch the memory increase.
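Following the linked docs, the flow is roughly: start the server with `MALLOC_CONF=background_thread:true,prof:true` in the environment, then flush a heap profile on demand:

```sql
-- Writes a jemalloc heap profile to the configured profile path;
-- the resulting file can then be rendered with jeprof (e.g. to PDF).
SYSTEM JEMALLOC FLUSH PROFILE
```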
---
Here's the memory usage graph and the PDF output of the heap profile:
result.pdf
[image: memory usage graph]
---
Can you generate collapsed output with `--collapsed > result.collapsed`?
Also, based on the PDF it seems like only 3.3GB were allocated, so I don't understand how that fits with the memory usage graph above.
---
@ayush-san I see you are using S3Queue, how many files do you have?
---
Also based on PDF it seems like only 3.3GB were allocated so I don't understand how it fits with the memory usage graph above.
Yes, but see the output of the free command.
I see you are using S3Queue, how many files do you have?
We are currently using 5 S3Queue jobs that ingest data from our application ALB logs, plus 8 Kafka consumers. We store 30 days of data, and each ClickHouse server holds 200GB of data.
---
And did you generate the profile when usage was that high?
If not, you can generate it again, just to confirm some suspicions.
We are currently using 5 S3Queue jobs that are ingesting data from our application ALB logs and 8 kafka consumers.
Can you run `SELECT count() FROM system.s3queue`?
---
And you generated the profile when usage was so high?
[image: memory usage graph at the time of profiling]
The --collapsed flag is not working; I am getting an Invalid option error.
Count from system.s3queue: 2204388
---
Okay, now it's clear what's going on, thanks for sending another heap profile!
We allocate too much memory inside S3Queue for each file, and we found the problematic part of the code; fixing it should remove this 11GiB overhead (~5KiB per file).
I will keep you posted when a PR is created.
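The numbers line up: with the `system.s3queue` count reported above, ~5KiB of tracked state per file comes out to roughly the observed overhead:

```sql
-- 2204388 files * ~5 KiB each ≈ 10.5 GiB of bookkeeping memory.
SELECT formatReadableSize(2204388 * 5 * 1024) AS approx_s3queue_overhead
```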
---
Also, please don't forget to disable the jemalloc profiler by removing the environment variables you set.
---
@antonio2368 ok, but what should I do now with the corrupted parts?
I have followed this: #31061 (comment), and ran the CHECK TABLE command, but nothing happened.
I have also detached the partition and dropped the corrupted parts; however, every time I attach the partition again, the data corruption error appears with new part names.
Would I need to drop the table completely and recreate it?
---
Can you send complete logs from startup to shutdown?
Based on the exceptions, some merges fail because one of the created parts cannot be read, but that shouldn't restart ClickHouse AFAIK.
---
@antonio2368 The clickhouse logs are filled with the following error only.
The error is now also appearing for the system.asynchronous_metric_log table:
2024.06.25 17:07:37.112260 [ 151071 ] {} <Error> MergeTreeBackgroundExecutor: Exception while executing background task {910a293f-8e3f-4476-9aab-3e61fc3cba64::202406_1515872_1517289_11}: Code: 271. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/910/910a293f-8e3f-4476-9aab-3e61fc3cba64/202406_1515872_1515926_10/ in table system.asynchronous_metric_log (910a293f-8e3f-4476-9aab-3e61fc3cba64) located on disk default of type local, from mark 1 with max_rows_to_read = 8192): While executing MergeTreeSequentialSource. (CANNOT_DECOMPRESS), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<String>(int, FormatStringHelperImpl<std::type_identity<String>::type>, String&&) @ 0x0000000007528fd0 in /usr/bin/clickhouse
2. DB::CompressionCodecZSTD::doDecompressData(char const*, unsigned int, char*, unsigned int) const @ 0x000000000f4b07cc in /usr/bin/clickhouse
3. DB::ICompressionCodec::decompress(char const*, unsigned int, char*) const @ 0x000000000f4ecdbc in /usr/bin/clickhouse
4. DB::CompressedReadBufferFromFile::nextImpl() @ 0x000000000f4a4d64 in /usr/bin/clickhouse
5. DB::ReadBuffer::readStrict(char*, unsigned long) @ 0x0000000007888a84 in /usr/bin/clickhouse
6. DB::SerializationLowCardinality::deserializeBinaryBulkStatePrefix(DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&) const @ 0x000000000f612f64 in /usr/bin/clickhouse
7. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df6d34 in /usr/bin/clickhouse
8. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
9. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
10. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
11. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
12. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
13. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
14. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
15. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
16. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
17. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
18. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
19. DB::MergePlainMergeTreeTask::executeStep() @ 0x0000000010fbafec in /usr/bin/clickhouse
20. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
21. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
22. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
23. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
24. ? @ 0x0000ffff9c748834
25. ? @ 0x0000ffff9c6ece5c
(version 24.1.5.6 (official build))
---
It's hard to say, but you can try detaching the problematic parts with DETACH PART.
It would be interesting to find out why the part got corrupted; for example, grep the logs for 202406_1515872_1515926_10.
---
I have done that: #65600 (comment)
But every time I reattach the partition after dropping the corrupted part, I get a corruption error for some new part.
---