Comments (7)

ThreadDao commented on August 16, 2024

There are also other problems:

  1. No L0 compaction was triggered, and the L1 MixCompaction has not completed.

metrics of compact-master--key-op-28-3921

  2. queryNode was OOMKilled (possibly because the dataNode restarted and a large number of upserts sharply increased the memory used by growing segments).

from milvus.

xiaofan-luan commented on August 16, 2024

/assign @XuanYang-cn

ThreadDao commented on August 16, 2024

@czs007 @XuanYang-cn

  • image: master-20240612-9ab3058d-amd64
    panic: segment not found
[2024/06/13 06:50:43.939 +00:00] [INFO] [metacache/meta_cache.go:298] ["remove dropped segment"] [segmentID=450428466827972529]
[2024/06/13 06:50:43.939 +00:00] [INFO] [metacache/meta_cache.go:298] ["remove dropped segment"] [segmentID=450428466827972540]
[2024/06/13 06:50:43.941 +00:00] [WARN] [syncmgr/task.go:199] ["failed to save serialized data into storage"] [collectionID=450428466818449735] [partitionID=450428466818449736] [segmentID=450428466827972540] [channel=compact-master-sert-op-37-7267-rootcoord-dml_1_450428466818449735v1] [level=L1] [error="segment not found[segment=450428466827972540]"]
[2024/06/13 06:50:43.941 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="segment not found[segment=450428466827972540]"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/storage_serializer.go:156\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).HandleError\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:117\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:132\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:200\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*keyLockDispatcher[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/key_lock_dispatcher.go:37\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:67"]
panic: segment not found[segment=450428466827972540] [recovered]
    panic: segment not found[segment=450428466827972540] [recovered]
    panic: segment not found[segment=450428466827972540]

goroutine 320036 [running]:
panic({0x5684060?, 0xc008fdc840?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc0142db548 sp=0xc0142db498 pc=0x1e21b2c
github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1({0x5684060, 0xc008fdc840})
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:56 +0x146 fp=0xc0142db610 sp=0xc0142db548 pc=0x3a6a4c6
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
    /go/pkg/mod/github.com/panjf2000/ants/v2@v2.7.2/worker.go:54 +0x6d fp=0xc0142db688 sp=0xc0142db610 pc=0x3a67b2d
runtime.deferCallSave(0xc0142db740, 0xc0142dbfb8?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc0142db698 sp=0xc0142db688 pc=0x1e216e4
runtime.runOpenDeferFrame(0xc000d66960)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc0142db6d8 sp=0xc0142db698 pc=0x1e21518
panic({0x5684060?, 0xc008fdc840?})
    /usr/local/go/src/runtime/panic.go:914 +0x21f fp=0xc0142db788 sp=0xc0142db6d8 pc=0x1e2199f
github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1()
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74 +0x8d fp=0xc0142db7e8 sp=0xc0142db788 pc=0x4dd122d
runtime.deferCallSave(0xc0142db8a0, 0xc0142dbf50?)
    /usr/local/go/src/runtime/panic.go:798 +0x84 fp=0xc0142db7f8 sp=0xc0142db7e8 pc=0x1e216e4
runtime.runOpenDeferFrame(0xc003b203c0)
    /usr/local/go/src/runtime/panic.go:771 +0x1b8 fp=0xc0142db838 sp=0xc0142db7f8 pc=0x1e21518

dn_64lfp.log
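
For readers following the log: segment 450428466827972540 is removed from the meta cache ("remove dropped segment") about 2 ms before the sync task for the same segment fails with "segment not found", and that error is then raised as a panic inside the worker pool. The Go sketch below only illustrates that ordering and one way a task could report the condition as an error instead of panicking. It is not the Milvus code and not the fix that was eventually merged; `metaCache`, `removeDropped`, and `syncSegment` are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrSegmentNotFound mirrors the "segment not found" message seen in the log.
var ErrSegmentNotFound = errors.New("segment not found")

// metaCache is a hypothetical stand-in for a datanode's segment metadata cache.
type metaCache struct {
	mu       sync.RWMutex
	segments map[int64]struct{}
}

// newMetaCache builds a cache that already tracks the given segment IDs.
func newMetaCache(ids ...int64) *metaCache {
	c := &metaCache{segments: make(map[int64]struct{})}
	for _, id := range ids {
		c.segments[id] = struct{}{}
	}
	return c
}

// removeDropped deletes a segment from the cache, as happens when a
// compaction drops it.
func (c *metaCache) removeDropped(id int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.segments, id)
}

// has reports whether the segment is still tracked.
func (c *metaCache) has(id int64) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_, ok := c.segments[id]
	return ok
}

// syncSegment stands in for a sync task. If the segment has already been
// removed, it returns a wrapped error so the caller can decide what to do,
// rather than panicking inside the worker pool.
func syncSegment(c *metaCache, id int64) error {
	if !c.has(id) {
		return fmt.Errorf("%w[segment=%d]", ErrSegmentNotFound, id)
	}
	// Serialization and upload would happen here in a real task.
	return nil
}

func main() {
	const segID = 450428466827972540 // segment ID taken from the log above

	cache := newMetaCache(segID)
	cache.removeDropped(segID) // compaction drops the segment first

	// The sync task for the same segment runs afterwards.
	if err := syncSegment(cache, segID); errors.Is(err, ErrSegmentNotFound) {
		fmt.Println("skipping sync:", err)
	}
}
```

Whether skipping the sync is actually safe depends on Milvus internals that this thread does not describe, so treat the error-handling choice here as an assumption.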

compact-master-sert-op-37-7267-etcd-0                             1/1     Running     0                27h     10.104.33.17    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-etcd-1                             1/1     Running     0                27h     10.104.16.187   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-etcd-2                             1/1     Running     0                27h     10.104.20.95    4am-node22   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-64lfp   1/1     Running     2 (23h ago)      27h     10.104.26.156   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-67wpp   1/1     Running     2 (18h ago)      27h     10.104.16.210   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-indexnode-fc74f5df7-6hks6   1/1     Running     0                27h     10.104.34.196   4am-node37   <none>           <none>
compact-master-sert-op-37-7267-milvus-indexnode-fc74f5df7-tnjmc   1/1     Running     0                27h     10.104.1.59     4am-node10   <none>           <none>
compact-master-sert-op-37-7267-milvus-mixcoord-cbc7cd64c-xfz8r    1/1     Running     0                27h     10.104.26.155   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-proxy-79c4df6f79-wzmq5      1/1     Running     0                27h     10.104.16.209   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68f8hgx   1/1     Running     0                27h     10.104.18.204   4am-node25   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68n5tpv   1/1     Running     0                27h     10.104.26.157   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68rcddm   1/1     Running     0                27h     10.104.16.213   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-milvus-querynode-0-64854f68wjqh4   1/1     Running     0                27h     10.104.25.144   4am-node30   <none>           <none>
compact-master-sert-op-37-7267-minio-0                            1/1     Running     0                27h     10.104.33.16    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-minio-1                            1/1     Running     0                27h     10.104.16.193   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-minio-2                            1/1     Running     0                27h     10.104.17.225   4am-node23   <none>           <none>
compact-master-sert-op-37-7267-minio-3                            1/1     Running     0                27h     10.104.20.97    4am-node22   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-0                    1/1     Running     0                27h     10.104.16.197   4am-node21   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-1                    1/1     Running     0                27h     10.104.33.20    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-2                    1/1     Running     0                27h     10.104.17.227   4am-node23   <none>           <none>
compact-master-sert-op-37-7267-pulsar-bookie-init-njm9t           0/1     Completed   0                27h     10.104.4.8      4am-node11   <none>           <none>
compact-master-sert-op-37-7267-pulsar-broker-0                    1/1     Running     0                27h     10.104.4.9      4am-node11   <none>           <none>
compact-master-sert-op-37-7267-pulsar-proxy-0                     1/1     Running     0                27h     10.104.6.66     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-pulsar-init-6mcrk           0/1     Completed   0                27h     10.104.6.63     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-recovery-0                  1/1     Running     0                27h     10.104.6.64     4am-node13   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-0                 1/1     Running     0                27h     10.104.32.177   4am-node39   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-1                 1/1     Running     0                27h     10.104.20.101   4am-node22   <none>           <none>
compact-master-sert-op-37-7267-pulsar-zookeeper-2                 1/1     Running     0                27h     10.104.17.229   4am-node23   <none>           <none>
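
In the listing above, only the two datanode pods show a non-zero RESTARTS value (2 each), which matches the datanode panic. As a rough sketch, a listing like this could be filtered for restarted pods with a few lines of Go. The helper name `restartedPods` is hypothetical, and the code assumes the column order shown above (NAME, READY, STATUS, RESTARTS, ...), with no header row required.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// restartedPods returns the restart count for every pod that restarted at
// least once. It assumes whitespace-separated columns in the order
// NAME READY STATUS RESTARTS ..., so the count is the fourth field even when
// it is followed by a "(23h ago)" suffix.
func restartedPods(listing string) map[string]int {
	out := make(map[string]int)
	for _, line := range strings.Split(listing, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 4 {
			continue // blank or malformed line
		}
		n, err := strconv.Atoi(fields[3])
		if err != nil || n == 0 {
			continue // header row, or a pod that never restarted
		}
		out[fields[0]] = n
	}
	return out
}

func main() {
	// Three rows copied from the listing above.
	listing := `compact-master-sert-op-37-7267-etcd-0                             1/1     Running     0                27h     10.104.33.17    4am-node36   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-64lfp   1/1     Running     2 (23h ago)      27h     10.104.26.156   4am-node32   <none>           <none>
compact-master-sert-op-37-7267-milvus-datanode-5f44dbd694-67wpp   1/1     Running     2 (18h ago)      27h     10.104.16.210   4am-node21   <none>           <none>`

	for name, n := range restartedPods(listing) {
		fmt.Printf("%s restarted %d times\n", name, n)
	}
}
```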

xiaofan-luan commented on August 16, 2024

@czs007 please take care of it

czs007 commented on August 16, 2024

working on it

ThreadDao commented on August 16, 2024

@czs007
image: master-20240626-9c2eeff4-amd64
dn_mzf2p.log

[2024/06/28 08:59:56.773 +00:00] [INFO] [metacache/meta_cache.go:299] ["remove dropped segment"] [segmentID=450768351030069674]
[2024/06/28 08:59:56.789 +00:00] [WARN] [syncmgr/task.go:169] ["failed to save serialized data into storage"] [collectionID=450768351016256039] [partitionID=450768351016256040] [segmentID=450768351030476406] [channel=level-master-insert-op-76-5164-rootcoord-dml_0_450768351016256039v0] [level=L1] [error="segment not found[segment=450768351030476406]"]
[2024/06/28 08:59:56.789 +00:00] [ERROR] [conc/options.go:54] ["Conc pool panicked"] [panic="segment not found[segment=450768351030476406]"] [stack="github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:54\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1.1\n\t/go/pkg/mod/github.com/panjf2000/ants/[email protected]/worker.go:54\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1.1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:74\nruntime.gopanic\n\t/usr/local/go/src/runtime/panic.go:914\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*storageV1Serializer).setTaskMeta.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/storage_serializer.go:158\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).HandleError\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:107\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:122\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*SyncTask).Run\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/task.go:170\ngithub.com/milvus-io/milvus/internal/datanode/syncmgr.(*keyLockDispatcher[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/internal/datanode/syncmgr/key_lock_dispatcher.go:39\ngithub.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n\t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go:81\ngithub.com/panjf2000/ants/v2.(*goWorker).run.func1\n\t/go/pkg/mod/github.com/panjf2000/ants/[email protected]/worker.go:67"]
panic: segment not found[segment=450768351030476406] [recovered]
    panic: segment not found[segment=450768351030476406] [recovered]
    panic: segment not found[segment=450768351030476406]

goroutine 24223 [running]:
panic({0x56a1ea0?, 0xc00e598840?})
    /usr/local/go/src/runtime/panic.go:1017 +0x3ac fp=0xc0010795e8 sp=0xc001079538 pc=0x1e2cbac
github.com/milvus-io/milvus/pkg/util/conc.(*poolOption).antsOptions.func1({0x56a1ea0, 0xc00e598840})
    /go/src/github.com/milvus-io/milvus/pkg/util/conc/options.go:56 +0x146 fp=0xc0010796b0 sp=0xc0010795e8 pc=0x3ab64e6
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()

argo: https://argo-workflows.zilliz.cc/archived-workflows/qa/cd5ae9ad-6282-48f7-abc0-d62306a3f8ff?nodeId=level-zero-stable-master-tntnd-1014026323

ThreadDao commented on August 16, 2024

Fixed in master-20240703-a501fa11-amd64.
