Comments (4)

cockroach-teamcity avatar cockroach-teamcity commented on August 16, 2024

roachtest.gossip/chaos/nodes=9 failed with artifacts on master @ 802554ac59763c31aee034651efcd28f7bdcdd5d:

(gossip.go:81).2: gossip did not stabilize (dead node 4) in 22.5s
test artifacts and logs in: /artifacts/gossip/chaos/nodes=9/run_1

Parameters:

  • ROACHTEST_arch=amd64
  • ROACHTEST_cloud=azure
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

This test on roachdash | Improve this report!

from cockroach.

cockroach-teamcity avatar cockroach-teamcity commented on August 16, 2024

roachtest.gossip/chaos/nodes=9 failed with artifacts on master @ c88a1cedb064f554f38bb5b310e83ac3259ee02e:

(gossip.go:81).2: gossip did not stabilize (dead node 9) in 32.4s
test artifacts and logs in: /artifacts/gossip/chaos/nodes=9/cpu_arch=arm64/run_1

Parameters:

  • ROACHTEST_arch=arm64
  • ROACHTEST_cloud=azure
  • ROACHTEST_coverageBuild=false
  • ROACHTEST_cpu=4
  • ROACHTEST_encrypted=false
  • ROACHTEST_metamorphicBuild=false
  • ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

Grafana is not yet available for azure clusters

This test on roachdash | Improve this report!

nvanbenschoten avatar nvanbenschoten commented on August 16, 2024

In all three cases, we see a sudden long pause between gossip probes and then the test times out.

2024/06/12 09:30:21 gossip.go:88: 1: checking gossip
2024/06/12 09:30:21 gossip.go:92: 1: gossip not ok (dead node 9 present) (0s)
2024/06/12 09:30:22 gossip.go:88: 1: checking gossip
2024/06/12 09:30:23 gossip.go:92: 1: gossip not ok (dead node 9 present) (2s)
2024/06/12 09:30:53 test_impl.go:414: test failure #1: full stack retained in failure_1.log: (gossip.go:81).2: gossip did not stabilize (dead node 9) in 32.4s

and

09:20:36 gossip.go:122: test status: waiting for gossip to exclude dead node
09:20:36 gossip.go:88: 1: checking gossip
09:20:36 gossip.go:92: 1: gossip not ok (dead node 4 present) (0s)
09:20:37 gossip.go:88: 1: checking gossip
09:20:37 gossip.go:92: 1: gossip not ok (dead node 4 present) (1s)
09:20:58 test_impl.go:414: test failure #1: full stack retained in failure_1.log: (gossip.go:81).2: gossip did not stabilize (dead node 4) in 22.5s

and

21:38:19 gossip.go:122: test status: waiting for gossip to exclude dead node
21:38:19 gossip.go:88: 1: checking gossip
21:38:20 gossip.go:92: 1: gossip not ok (dead node 4 present) (0s)
21:38:21 gossip.go:88: 1: checking gossip
21:38:21 gossip.go:92: 1: gossip not ok (dead node 4 present) (1s)
21:38:22 gossip.go:88: 1: checking gossip
21:38:22 gossip.go:88: 2: checking gossip
21:38:22 gossip.go:88: 3: checking gossip
21:38:22 gossip.go:88: 5: checking gossip
21:39:13 test_impl.go:414: test failure #1: full stack retained in failure_1.log: (gossip.go:81).2: gossip did not stabilize (dead node 4) in 53.9s

I don't see much that could explain a pause between a "gossip not ok (dead node %d present)" log line and the following "checking gossip" log line. It almost seems like a time.Sleep(time.Second) is blocking for 20+ seconds.

nvanbenschoten avatar nvanbenschoten commented on August 16, 2024

I've spent another 15 minutes looking at this without any success, so I'll time box the investigation there.

This doesn't look like a release blocker. It looks like a test infra flake.

I'll add a bit more logging and then move this to the backlog.
