Comments (24)

antaljanosbenjamin avatar antaljanosbenjamin commented on June 30, 2024

Hi @ayush-san!

This issue is similar to #54483, but that should be fixed in 24.1.5.6 by #58310. However, there might still be some bugs in the implementation, so could you please answer a few questions:

  1. Have you enabled statistics_interval_ms?
  2. Have you changed the default value of kafka_consumers_pool_ttl_ms?
  3. Could you please paste the result of SELECT * FROM system.kafka_consumers? Feel free to mask any sensitive data; I am not interested in the concrete table/topic names, only the metrics.
  4. Are all of your Kafka tables properly connected to MVs? This is important to know because if no MV is attached to a Kafka table, its consumers are not polled regularly, and that might cause issues. (One way to check is sketched below.)
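
A hedged sketch for the check in question 4; in recent versions, system.tables exposes a Kafka table's dependent views in its dependencies_database/dependencies_table columns:

SELECT
    database,
    name,
    dependencies_database,
    dependencies_table
FROM system.tables
WHERE engine = 'Kafka'

A Kafka table with an empty dependencies_table array has no MV attached.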

ayush-san avatar ayush-san commented on June 30, 2024

@antaljanosbenjamin I am on version 24.1.5.6

  1. No
  2. No
  3. clickhouse_consumer.txt
  4. How to check this?

ayush-san avatar ayush-san commented on June 30, 2024

I had increased the node type to c7g.4xlarge to stop these errors for now, but one of my nodes has become unstable and the ClickHouse process on that node keeps restarting with the following errors:

2024.06.24 12:13:46.866652 [ 37258 ] {} <Error> a1764866-d514-4578-9267-40f13eeaf350::1719208800_8_12_1 (MergeFromLogEntryTask): virtual bool DB::ReplicatedMergeMutateTaskBase::executeStep(): Code: 271. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/a17/a1764866-d514-4578-9267-40f13eeaf350/1719208800_11_11_0/ in table alb_logs.store_zomato_application_entry_alb_logs (a1764866-d514-4578-9267-40f13eeaf350) located on disk default of type local, from mark 0 with max_rows_to_read = 42): While executing MergeTreeSequentialSource. (CANNOT_DECOMPRESS), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<String>(int, FormatStringHelperImpl<std::type_identity<String>::type>, String&&) @ 0x0000000007528fd0 in /usr/bin/clickhouse
2. DB::CompressionCodecZSTD::doDecompressData(char const*, unsigned int, char*, unsigned int) const @ 0x000000000f4b07cc in /usr/bin/clickhouse
3. DB::ICompressionCodec::decompress(char const*, unsigned int, char*) const @ 0x000000000f4ecdbc in /usr/bin/clickhouse
4. DB::CompressedReadBufferFromFile::nextImpl() @ 0x000000000f4a4d64 in /usr/bin/clickhouse
5. void DB::deserializeBinarySSE2<1>(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 63ul, 64ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 63ul, 64ul>&, DB::ReadBuffer&, unsigned long) @ 0x000000000f634ea0 in /usr/bin/clickhouse
6. DB::ISerialization::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::unordered_map<String, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>>*) const @ 0x000000000f5ef174 in /usr/bin/clickhouse
7. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df6d54 in /usr/bin/clickhouse
8. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
9. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
10. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
11. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
12. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
13. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
14. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
15. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
16. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
17. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
18. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
19. DB::ReplicatedMergeMutateTaskBase::executeStep() @ 0x0000000010edda08 in /usr/bin/clickhouse
20. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
21. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
22. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
23. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
24. ? @ 0x0000ffffbee2d834
25. ? @ 0x0000ffffbedd1e5c
 (version 24.1.5.6 (official build))
2024.06.24 12:13:46.863786 [ 37255 ] {} <Error> MergeTreeBackgroundExecutor: Exception while executing background task {a1764866-d514-4578-9267-40f13eeaf350::1719212400_6_10_1}: Code: 33. DB::Exception: Cannot read all data in MergeTreeReaderCompact. Rows read: 454. Rows expected: 470.: (while reading from part /data/store/a17/a1764866-d514-4578-9267-40f13eeaf350/1719212400_9_9_0/ in table alb_logs.store_zomato_application_entry_alb_logs (a1764866-d514-4578-9267-40f13eeaf350) located on disk default of type local, from mark 0 with max_rows_to_read = 470): While executing MergeTreeSequentialSource. (CANNOT_READ_ALL_DATA), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<unsigned long&, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type, std::type_identity<unsigned long&>::type>, unsigned long&, unsigned long&) @ 0x0000000007e1fcc4 in /usr/bin/clickhouse
2. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df7a68 in /usr/bin/clickhouse
3. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
4. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
5. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
6. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
7. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
8. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
9. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
10. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
11. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
12. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
13. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
14. DB::ReplicatedMergeMutateTaskBase::executeStep() @ 0x0000000010edda08 in /usr/bin/clickhouse
15. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
16. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
17. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
18. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
19. ? @ 0x0000ffffbee2d834
20. ? @ 0x0000ffffbedd1e5c

ayush-san avatar ayush-san commented on June 30, 2024

CHECK TABLE is not reporting any errors even though the logs show data corruption errors:

CHECK TABLE kafka.store_metrics PARTITION '20240624'
FORMAT PrettyCompactMonoBlock
SETTINGS check_query_single_value_result = 0
┌─part_path──────────────┬─is_passed─┬─message─┐
│ 20240624_3003_3017_3   │         1 │         │
│ 20240624_3021_3021_0   │         1 │         │
│ 20240624_3019_3019_0   │         1 │         │
│ 20240624_3020_3020_0   │         1 │         │
│ 20240624_3018_3018_0   │         1 │         │
│ 20240624_3544_3544_0   │         1 │         │
│ 20240624_3542_3542_0   │         1 │         │
│ 20240624_3022_3022_0   │         1 │         │
│ 20240624_3543_3543_0   │         1 │         │
│ 20240624_3023_3539_115 │         1 │         │
│ 20240624_3887_3887_0   │         1 │         │
│ 20240624_3540_3540_0   │         1 │         │
│ 20240624_3888_3888_0   │         1 │         │
│ 20240624_3545_3886_68  │         1 │         │
│ 20240624_0_3002_1468   │         1 │         │
│ 20240624_3541_3541_0   │         1 │         │
│ 20240624_3889_3889_0   │         1 │         │
│ 20240624_3890_3890_0   │         1 │         │
│ 20240624_3891_3891_0   │         1 │         │
│ 20240624_4465_4465_0   │         1 │         │
│ 20240624_4466_4466_0   │         1 │         │
│ 20240624_4467_4467_0   │         1 │         │
│ 20240624_4464_4464_0   │         1 │         │
│ 20240624_3892_4463_148 │         1 │         │
│ 20240624_4468_4468_0   │         1 │         │
└────────────────────────┴───────────┴─────────┘

I am able to run min, max and count(*) queries:

SELECT
    min(ts),
    max(ts),
    now(),
    count(*)
FROM kafka.store_metrics
WHERE _part = '20240624_3545_3886_68'
┌─────────────min(ts)─┬─────────────max(ts)─┬───────────────now()─┬─count()─┐
│ 2024-06-24 10:13:00 │ 2024-06-24 11:10:00 │ 2024-06-24 12:48:51 │    8010 │
└─────────────────────┴─────────────────────┴─────────────────────┴─────────┘

But SELECT * queries fail with the same error:

SELECT *
FROM kafka.store_metrics
WHERE _part = '20240624_3545_3886_68'
LIMIT 10
Code: 271. DB::Exception: Received from localhost:9000. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/d3b/d3b72119-6e40-4774-bd1f-57b7ada1da4a/20240624_3545_3886_68/ in table kafka.store_metrics (d3b72119-6e40-4774-bd1f-57b7ada1da4a) located on disk default of type local, from mark 0 with max_rows_to_read = 8010): While executing MergeTreeSelect(pool: ReadPoolInOrder, algorithm: InOrder). (CANNOT_DECOMPRESS)

antaljanosbenjamin avatar antaljanosbenjamin commented on June 30, 2024

Cannot decompress ZSTD-encoded data: Data corruption detected

It is very unlikely that this is connected to the Kafka consumers. They are probably fine; at least I don't see any reason why they would cause issues. Maybe errors in some parts prevent the cluster from merging them, thus increasing the number of parts? Could you please run SELECT count(*) FROM system.parts? (A per-table breakdown is sketched below.)
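
A hedged variant that breaks the active part count down per table, to spot the offender:

SELECT
    database,
    table,
    count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY active_parts DESC
LIMIT 10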

Could you please also run the following query:

SELECT
    *,
    allocations - deallocations AS active_allocations,
    size * active_allocations AS allocated_bytes
FROM system.jemalloc_bins
WHERE allocated_bytes > 0
ORDER BY allocated_bytes DESC

If the output is too long, feel free to add a LIMIT 100 to the query.

ayush-san avatar ayush-san commented on June 30, 2024

SELECT count(*) FROM system.parts -> 9264

SELECT
    *,
    allocations - deallocations AS active_allocations,
    size * active_allocations AS allocated_bytes
FROM system.jemalloc_bins
WHERE allocated_bytes > 0
ORDER BY allocated_bytes DESC
┌─index─┬─large─┬─────size─┬─allocations─┬─deallocations─┬─active_allocations─┬─allocated_bytes─┐
│    29 │     0 │     5120 │   212898101 │     210719024 │            2179077 │     11156874240 │
│    52 │     1 │   262144 │   500571503 │     500546038 │              25465 │      6675496960 │
│    12 │     0 │      256 │  2002273426 │    1999863123 │            2410303 │       617037568 │
│    11 │     0 │      224 │  1323110659 │    1320618374 │            2492285 │       558271840 │
│     4 │     0 │       64 │  2817451995 │    2814543884 │            2908111 │       186119104 │
│    28 │     0 │     4096 │  1334294398 │    1334258216 │              36182 │       148201472 │
│    61 │     1 │  1310720 │    98889376 │      98889287 │                 89 │       116654080 │
│    44 │     0 │    65536 │    69402881 │      69401241 │               1640 │       107479040 │
│    40 │     0 │    32768 │    65200498 │      65198164 │               2334 │        76480512 │
│    24 │     0 │     2048 │   347068428 │     347031249 │              37179 │        76142592 │
│    36 │     0 │    16384 │   129498930 │     129494596 │               4334 │        71008256 │
│     5 │     0 │       80 │  1235558043 │    1234955559 │             602484 │        48198720 │
│     3 │     0 │       48 │ 27874244744 │   27873318952 │             925792 │        44438016 │
│    48 │     0 │   131072 │    39493817 │      39493521 │                296 │        38797312 │
│    32 │     0 │     8192 │   128356450 │     128351868 │               4582 │        37535744 │
│     7 │     0 │      112 │  1000443110 │    1000117795 │             325315 │        36435280 │
│     6 │     0 │       96 │  4956827335 │    4956531595 │             295740 │        28391040 │
│    69 │     1 │  5242880 │       12139 │         12134 │                  5 │        26214400 │
│    78 │     1 │ 25165824 │          56 │            55 │                  1 │        25165824 │
│    15 │     0 │      448 │   167274792 │     167224442 │              50350 │        22556800 │
│    70 │     1 │  6291456 │        1928 │          1925 │                  3 │        18874368 │
│    22 │     0 │     1536 │   228524438 │     228512189 │              12249 │        18814464 │
│    37 │     0 │    20480 │    61252859 │      61252099 │                760 │        15564800 │
│    20 │     0 │     1024 │   155893247 │     155878237 │              15010 │        15370240 │
│    75 │     1 │ 14680064 │         121 │           120 │                  1 │        14680064 │
│    16 │     0 │      512 │   139176063 │     139148080 │              27983 │        14327296 │
│    13 │     0 │      320 │   639364317 │     639320012 │              44305 │        14177600 │
│    56 │     1 │   524288 │     1225213 │       1225186 │                 27 │        14155776 │
│    74 │     1 │ 12582912 │       16831 │         16830 │                  1 │        12582912 │
│    27 │     0 │     3584 │    54633691 │      54630624 │               3067 │        10992128 │
│    10 │     0 │      192 │   585120097 │     585065091 │              55006 │        10561152 │
│     2 │     0 │       32 │  6285427618 │    6285102474 │             325144 │        10404608 │
│     8 │     0 │      128 │   868443284 │     868370280 │              73004 │         9344512 │
│    23 │     0 │     1792 │    12238236 │      12233955 │               4281 │         7671552 │
│    14 │     0 │      384 │   747237974 │     747218502 │              19472 │         7477248 │
│    63 │     1 │  1835008 │        7595 │          7591 │                  4 │         7340032 │
│    67 │     1 │  3670016 │        3309 │          3307 │                  2 │         7340032 │
│    25 │     0 │     2560 │    49460950 │      49458112 │               2838 │         7265280 │
│    17 │     0 │      640 │   315818927 │     315808594 │              10333 │         6613120 │
│    39 │     0 │    28672 │      994701 │        994490 │                211 │         6049792 │
│    31 │     0 │     7168 │     7392355 │       7391560 │                795 │         5698560 │
│    30 │     0 │     6144 │    67738935 │      67738062 │                873 │         5363712 │
│    18 │     0 │      768 │   170703631 │     170696741 │               6890 │         5291520 │
│    34 │     0 │    12288 │    39448178 │      39447753 │                425 │         5222400 │
│    33 │     0 │    10240 │   197672864 │     197672397 │                467 │         4782080 │
│     9 │     0 │      160 │   804045213 │     804015622 │              29591 │         4734560 │
│    19 │     0 │      896 │    70951187 │      70946269 │               4918 │         4406528 │
│    60 │     1 │  1048576 │      447749 │        447745 │                  4 │         4194304 │
│    26 │     0 │     3072 │    78814848 │      78813698 │               1150 │         3532800 │
│    21 │     0 │     1280 │   199075568 │     199072812 │               2756 │         3527680 │
│    38 │     0 │    24576 │    21303462 │      21303323 │                139 │         3416064 │
│    64 │     1 │  2097152 │      199648 │        199647 │                  1 │         2097152 │
│     1 │     0 │       16 │   276178919 │     276066650 │             112269 │         1796304 │
│    35 │     0 │    14336 │     9997524 │       9997413 │                111 │         1591296 │
│    62 │     1 │  1572864 │       30980 │         30979 │                  1 │         1572864 │
│    55 │     1 │   458752 │      893922 │        893919 │                  3 │         1376256 │
│    59 │     1 │   917504 │       81505 │         81504 │                  1 │          917504 │
│     0 │     0 │        8 │   375999327 │     375940638 │              58689 │          469512 │
│    47 │     0 │   114688 │      746963 │        746959 │                  4 │          458752 │
│    45 │     0 │    81920 │   152406599 │     152406594 │                  5 │          409600 │
│    50 │     0 │   196608 │      697547 │        697545 │                  2 │          393216 │
│    54 │     1 │   393216 │      951562 │        951561 │                  1 │          393216 │
│    41 │     0 │    40960 │   202247594 │     202247588 │                  6 │          245760 │
│    49 │     0 │   163840 │    41552287 │      41552286 │                  1 │          163840 │
│    43 │     0 │    57344 │    17601714 │      17601712 │                  2 │          114688 │
│    46 │     0 │    98304 │  1180435352 │    1180435351 │                  1 │           98304 │
│    42 │     0 │    49152 │     1085610 │       1085609 │                  1 │           49152 │
└───────┴───────┴──────────┴─────────────┴───────────────┴────────────────────┴─────────────────┘

antaljanosbenjamin avatar antaljanosbenjamin commented on June 30, 2024

I really have no idea what is going on here. 10k parts shouldn't be too bad. But I think it makes sense to separate the two issues:

  1. Increasing memory: as 24.1 is not supported anymore, I really wouldn't like to spend time on this unless it can be reproduced in 24.3. Maybe you can check the contents of system.merges (see the query sketch after this list); something might have got stuck because of the compression error, and that could halt part merges, hence the 10k parts. Maybe 10k parts could cause 11 GB of memory usage; I am not sure.
  2. The compression error: it is hard to say anything about that either. I would try moving the problematic parts out of the table and see whether the node can start. If it happens with all the parts, then maybe something is misconfigured.
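
A hedged sketch for inspecting in-flight merges; long elapsed times with little progress would suggest a stuck merge:

SELECT
    database,
    table,
    elapsed,
    progress,
    num_parts,
    result_part_name
FROM system.merges
ORDER BY elapsed DESC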

den-crane avatar den-crane commented on June 30, 2024

10k parts? How come?

@ayush-san can you please share

SELECT
    metric,
    formatReadableSize(value)
FROM system.asynchronous_metrics
WHERE metric ILIKE '%cach%'

ayush-san avatar ayush-san commented on June 30, 2024

Earlier we were running on c7g.2xlarge instances, but due to the above errors I increased the instance type to c7g.4xlarge.

┌─metric───────────────────────┬─formatReadableSize(value)─┐
│ CompiledExpressionCacheCount │ 9.00 B                    │
│ CompiledExpressionCacheBytes │ 72.00 KiB                 │
│ FilesystemCacheFiles         │ 0.00 B                    │
│ FilesystemCacheBytes         │ 0.00 B                    │
│ QueryCacheEntries            │ 0.00 B                    │
│ QueryCacheBytes              │ 0.00 B                    │
│ IndexUncompressedCacheBytes  │ 0.00 B                    │
│ IndexMarkCacheBytes          │ 0.00 B                    │
│ UncompressedCacheCells       │ 0.00 B                    │
│ UncompressedCacheBytes       │ 0.00 B                    │
│ IndexUncompressedCacheCells  │ 0.00 B                    │
│ HashTableStatsCacheMisses    │ 3.77 KiB                  │
│ OSMemoryFreePlusCached       │ 12.56 GiB                 │
│ OSMemoryCached               │ 11.23 GiB                 │
│ MMapCacheCells               │ 0.00 B                    │
│ HashTableStatsCacheHits      │ 94.40 KiB                 │
│ MarkCacheBytes               │ 6.64 MiB                  │
│ IndexMarkCacheFiles          │ 0.00 B                    │
│ OSMemoryFreeWithoutCached    │ 1.33 GiB                  │
│ MarkCacheFiles               │ 20.76 KiB                 │
│ HashTableStatsCacheEntries   │ 1.34 KiB                  │
└──────────────────────────────┴───────────────────────────┘

ayush-san avatar ayush-san commented on June 30, 2024

Maybe you can check the contents of system.merges; something might have got stuck because of the compression error, and that could halt part merges, hence the 10k parts. Maybe 10k parts could cause 11 GB of memory usage; I am not sure.

@antaljanosbenjamin I checked system.merges and nothing seems stuck. Even before updating the instance type, I didn't find any stuck merges.

For the data corruption issue, I am trying to drop the corrupted parts:

ALTER TABLE kafka.store_metrics ON CLUSTER monitoring
    DROP PART '20240624_3545_3886_68'

but they are still visible in system.parts:

SELECT
    name,
    active,
    refcount
FROM system.parts
WHERE (table = 'store_metrics') AND (partition = '2024-06-24')
┌─name────────────────────┬─active─┬─refcount─┐
│ 20240624_0_3002_1468    │      1 │        1 │
│ 20240624_3003_3017_3    │      1 │        3 │
│ 20240624_3018_3018_0    │      1 │        3 │
│ 20240624_3019_3019_0    │      1 │        3 │
│ 20240624_3020_3020_0    │      1 │        3 │
│ 20240624_3021_3021_0    │      1 │        3 │
│ 20240624_3022_3022_0    │      1 │        3 │
│ 20240624_3023_3539_115  │      1 │        5 │
│ 20240624_3540_3540_0    │      1 │        3 │
│ 20240624_3541_3541_0    │      1 │        3 │
│ 20240624_3542_3542_0    │      1 │        3 │
│ 20240624_3543_3543_0    │      1 │        5 │
│ 20240624_3544_3544_0    │      1 │        5 │
│ 20240624_3545_3886_68   │      1 │        3 │
│ 20240624_3887_3887_0    │      1 │        3 │
│ 20240624_3888_3888_0    │      1 │        3 │
│ 20240624_3889_3889_0    │      1 │        3 │
│ 20240624_3890_3890_0    │      1 │        3 │
│ 20240624_3891_3891_0    │      1 │        3 │
│ 20240624_3892_8138_1157 │      1 │        1 │
│ 20240624_8139_8503_72   │      1 │        1 │
└─────────────────────────┴────────┴──────────┘
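
If the parts were detached rather than fully dropped, they would show up in system.detached_parts; a hedged query to check:

SELECT
    table,
    partition_id,
    name,
    reason
FROM system.detached_parts
WHERE table = 'store_metrics'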

antonio2368 avatar antonio2368 commented on June 30, 2024

You can also try https://clickhouse.com/docs/en/operations/allocation-profiling#sampling-allocations-and-flushing-heap-profiles (if it's possible for you, of course). You can generate a heap profile after a few hours of running and send it here.
It's okay to run it on a single replica as long as you catch the memory increase. (A sketch of the flow is below.)
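
For reference, a hedged sketch of the flow from the linked docs (verify the exact names against your server version):

-- start clickhouse-server with MALLOC_CONF=background_thread:true,prof:true, then
-- dump a heap profile on demand:
SYSTEM JEMALLOC FLUSH PROFILE
-- and analyze the dumped .heap file offline, e.g.:
--   jeprof /usr/bin/clickhouse <dump>.heap --pdf > result.pdf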

ayush-san avatar ayush-san commented on June 30, 2024

Here's the memory usage and the PDF output of the heap profile:
result.pdf

[memory usage screenshot]

antonio2368 avatar antonio2368 commented on June 30, 2024

Can you generate collapsed output with --collapsed > result.collapsed?
Also, based on the PDF it seems like only 3.3 GB were allocated, so I don't understand how that fits with the memory usage graph above.

antonio2368 avatar antonio2368 commented on June 30, 2024

@ayush-san I see you are using S3Queue; how many files do you have?

ayush-san avatar ayush-san commented on June 30, 2024

Also, based on the PDF it seems like only 3.3 GB were allocated, so I don't understand how that fits with the memory usage graph above.

Yes, but see the output of the free command:

[screenshot: free output]

I see you are using S3Queue, how many files do you have?

We are currently using 5 S3Queue jobs that ingest data from our application ALB logs, plus 8 Kafka consumers. We store 30 days of data and each ClickHouse server holds 200 GB of data.

antonio2368 avatar antonio2368 commented on June 30, 2024

Did you generate the profile when usage was that high?
If not, you can generate it again just to confirm some suspicions.

We are currently using 5 S3Queue jobs that are ingesting data from our application ALB logs and 8 kafka consumers.

Can you run:

SELECT count() FROM system.s3queue

ayush-san avatar ayush-san commented on June 30, 2024

Did you generate the profile when usage was that high?

[memory usage screenshot]

result (1).pdf

The --collapsed flag is not working; I am getting an Invalid option error.

Count of system.s3queue: 2204388

antonio2368 avatar antonio2368 commented on June 30, 2024

Okay, now it's clear what's going on; thanks for sending another heap profile!
We allocate too much memory inside S3Queue for each file, and we found the problematic part; fixing it should remove this 11 GiB overhead (5 KiB per file).

I will keep you posted when a PR is created.

antonio2368 avatar antonio2368 commented on June 30, 2024

Also, please don't forget to disable the jemalloc profiler by removing the environment variables you set.

ayush-san avatar ayush-san commented on June 30, 2024

@antonio2368 OK, but what should I do now with the corrupted parts?

I have followed this - #31061 (comment) and ran the CHECK TABLE command, but nothing happened.

I have also detached the partition and dropped the corrupted parts; however, every time I attach the partition, the corruption error appears with new part names (the cycle is sketched below).

Would I need to drop the table completely and recreate it?
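
For clarity, a hedged SQL sketch of the cycle described above (partition/part names taken from the earlier comments; allow_drop_detached is required for DROP DETACHED PART):

ALTER TABLE kafka.store_metrics DETACH PARTITION '20240624'
ALTER TABLE kafka.store_metrics DROP DETACHED PART '20240624_3545_3886_68' SETTINGS allow_drop_detached = 1
ALTER TABLE kafka.store_metrics ATTACH PARTITION '20240624'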

antonio2368 avatar antonio2368 commented on June 30, 2024

Can you send complete logs from startup to shutdown?
Based on the exceptions, some merges fail because one of the created parts cannot be read, but that shouldn't restart CH AFAIK.

ayush-san avatar ayush-san commented on June 30, 2024

@antonio2368 The ClickHouse logs are filled with only the following error.

Now the error is also appearing for the system.asynchronous_metric_log table:

2024.06.25 17:07:37.112260 [ 151071 ] {} <Error> MergeTreeBackgroundExecutor: Exception while executing background task {910a293f-8e3f-4476-9aab-3e61fc3cba64::202406_1515872_1517289_11}: Code: 271. DB::Exception: Cannot decompress ZSTD-encoded data: Data corruption detected: (while reading from part /data/store/910/910a293f-8e3f-4476-9aab-3e61fc3cba64/202406_1515872_1515926_10/ in table system.asynchronous_metric_log (910a293f-8e3f-4476-9aab-3e61fc3cba64) located on disk default of type local, from mark 1 with max_rows_to_read = 8192): While executing MergeTreeSequentialSource. (CANNOT_DECOMPRESS), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000bc1a1e8 in /usr/bin/clickhouse
1. DB::Exception::Exception<String>(int, FormatStringHelperImpl<std::type_identity<String>::type>, String&&) @ 0x0000000007528fd0 in /usr/bin/clickhouse
2. DB::CompressionCodecZSTD::doDecompressData(char const*, unsigned int, char*, unsigned int) const @ 0x000000000f4b07cc in /usr/bin/clickhouse
3. DB::ICompressionCodec::decompress(char const*, unsigned int, char*) const @ 0x000000000f4ecdbc in /usr/bin/clickhouse
4. DB::CompressedReadBufferFromFile::nextImpl() @ 0x000000000f4a4d64 in /usr/bin/clickhouse
5. DB::ReadBuffer::readStrict(char*, unsigned long) @ 0x0000000007888a84 in /usr/bin/clickhouse
6. DB::SerializationLowCardinality::deserializeBinaryBulkStatePrefix(DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&) const @ 0x000000000f612f64 in /usr/bin/clickhouse
7. DB::MergeTreeReaderCompact::readRows(unsigned long, unsigned long, bool, unsigned long, std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>&) @ 0x0000000010df6d34 in /usr/bin/clickhouse
8. DB::MergeTreeSequentialSource::generate() @ 0x0000000010e065c0 in /usr/bin/clickhouse
9. DB::ISource::tryGenerate() @ 0x00000000111a6a8c in /usr/bin/clickhouse
10. DB::ISource::work() @ 0x00000000111a64b4 in /usr/bin/clickhouse
11. DB::ExecutionThreadContext::executeTask() @ 0x00000000111ba96c in /usr/bin/clickhouse
12. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x00000000111b2af4 in /usr/bin/clickhouse
13. DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x00000000111b2614 in /usr/bin/clickhouse
14. DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x00000000111befc8 in /usr/bin/clickhouse
15. DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x00000000111bf17c in /usr/bin/clickhouse
16. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::executeImpl() @ 0x0000000010c88728 in /usr/bin/clickhouse
17. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::execute() @ 0x0000000010c8867c in /usr/bin/clickhouse
18. DB::MergeTask::execute() @ 0x0000000010c8c950 in /usr/bin/clickhouse
19. DB::MergePlainMergeTreeTask::executeStep() @ 0x0000000010fbafec in /usr/bin/clickhouse
20. DB::MergeTreeBackgroundExecutor<DB::DynamicRuntimeQueue>::threadFunction() @ 0x0000000010c9d9ec in /usr/bin/clickhouse
21. ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>) @ 0x000000000bce7030 in /usr/bin/clickhouse
22. void std::__function::__policy_invoker<void ()>::__call_impl<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000000bcea220 in /usr/bin/clickhouse
23. void* std::__thread_proxy[abi:v15000]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void ThreadPoolImpl<std::thread>::scheduleImpl<void>(std::function<void ()>, Priority, std::optional<unsigned long>, bool)::'lambda0'()>>(void*) @ 0x000000000bce91e8 in /usr/bin/clickhouse
24. ? @ 0x0000ffff9c748834
25. ? @ 0x0000ffff9c6ece5c
 (version 24.1.5.6 (official build))

antonio2368 avatar antonio2368 commented on June 30, 2024

It's hard to say, but you can try detaching the problematic parts with DETACH PART (sketched below).
It would be interesting to find out why the part got corrupted; for example, grep the logs for 202406_1515872_1515926_10.
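
A hedged sketch of both steps (table and part name taken from the log above; the detached part lands in the table's detached/ directory on disk):

-- stop background merges from hitting the corrupted part
ALTER TABLE system.asynchronous_metric_log DETACH PART '202406_1515872_1515926_10'
-- then, on the host, grep the server logs for the part name, e.g.:
--   grep 202406_1515872_1515926_10 /var/log/clickhouse-server/clickhouse-server.log*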

ayush-san avatar ayush-san commented on June 30, 2024

I have done that - #65600 (comment)
But every time I reattach the partition after dropping the corrupted part, I get a corruption error for some new part.
