Giter VIP home page Giter VIP logo

Comments (11)

igor47 avatar igor47 commented on August 22, 2024

we've recently observed this problem in our environment as well; it's my top priority to investigate ATM

from synapse.

memory avatar memory commented on August 22, 2024

Well, I guess I'm glad to know it's not just me. If I can provide any additional info for debugging, please ping me on freenode or gchat.

from synapse.

igor47 avatar igor47 commented on August 22, 2024

a way to reproduce this would be AWESOME

On Mon, Jan 6, 2014 at 11:41 AM, memory [email protected] wrote:

Well, I guess I'm glad to know it's not just me. If I can provide any
additional info for debugging, please ping me on freenode or gchat.


Reply to this email directly or view it on GitHubhttps://github.com//issues/34#issuecomment-31679711
.

from synapse.

memory avatar memory commented on August 22, 2024

Ugh, I wish. I have yet to have it happen in <10 hours of process uptime. :(

from synapse.

igor47 avatar igor47 commented on August 22, 2024

a stack trace from a stuck synapse process, curtesy of @nelgau : https://gist.github.com/nelgau/3a910d5da1ac09abfd22

from synapse.

schleyfox avatar schleyfox commented on August 22, 2024

Worth noting, that this indicates that the process gets stuck when trying to exit. We will investigate this further and let you know what we find.

from synapse.

memory avatar memory commented on August 22, 2024

Found another one in the catatonic state; attached with gdb to get a stacktrace; if I'm reading this correctly it does indeed look like it's trying to shut down?

(gdb) bt
#0 0x00007f363f7e9f67 in sched_yield () from /lib/libc.so.6
#1 0x00007f3640724045 in gvl_yield (vm=0x1176010, th=) at thread_pthread.c:123
#2 0x00007f3640724b60 in rb_thread_schedule_limits () at thread.c:1025
#3 rb_thread_schedule () at thread.c:1035
#4 0x00007f3640724d77 in rb_thread_terminate_all () at thread.c:375
#5 0x00007f3640604516 in ruby_cleanup (ex=6) at eval.c:140
#6 0x00007f36406046d5 in ruby_run_node (n=0x1562808) at eval.c:244
#7 0x0000000000400a0b in main (argc=4, argv=0x7fff2342b518) at main.c:38

from synapse.

igor47 avatar igor47 commented on August 22, 2024

so, reading through the code, we don't really have any shutdown handling to speak of. what's more confusing is that it shuts down cleanly so much of the time. i'm going to try to add shutdown handling modeled after nerve's and see if this helps

from synapse.

memory avatar memory commented on August 22, 2024

FWIW, after deploying 0.8.0 across an environment with ~45 synapse instances, 10 hours after starting each process I was able to SIGTERM all of them and see them exit correctly. It's not conclusive, but it's at least positive enough that I'll turn off my cronned midnight SIGKILL and see if we're still in good shape after 48-72 hours.

from synapse.

memory avatar memory commented on August 22, 2024

Okay, now that we have proper cleanup, the underlying problem is a little clearer:

2014-01-17 17:15:48.525411500 I, [2014-01-17T17:15:48.525289 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: discovering backends for service klima-server
2014-01-17 17:16:18.529763500 E, [2014-01-17T17:16:18.529633 #6877] ERROR -- Synapse::Synapse: synapse: encountered unexpected exception #<Zookeeper::Exceptions::ContinuationTimeoutError: response for meth: :get_children, args: [0, "/nerve/services/staging/klima-server/default", nil, #<Zookeeper::Callbacks::WatcherCallback:0x00000002732038 @completed=false, @proc=#Proc:0x00000002731cc8@/export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/lib/zookeeper/callbacks.rb:24>], not received within 30 seconds> in main thread
2014-01-17 17:16:18.529873500 W, [2014-01-17T17:16:18.529827 #6877] WARN -- Synapse::Synapse: synapse: exiting; sending stop signal to all watchers
2014-01-17 17:16:18.529951500 W, [2014-01-17T17:16:18.529907 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533320500 I, [2014-01-17T17:16:18.531645 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533323500 W, [2014-01-17T17:16:18.531713 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533324500 I, [2014-01-17T17:16:18.531747 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533325500 W, [2014-01-17T17:16:18.531779 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533351500 I, [2014-01-17T17:16:18.531810 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533353500 W, [2014-01-17T17:16:18.531848 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533354500 I, [2014-01-17T17:16:18.531879 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533358500 W, [2014-01-17T17:16:18.531909 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533359500 I, [2014-01-17T17:16:18.531940 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533367500 W, [2014-01-17T17:16:18.531969 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533368500 I, [2014-01-17T17:16:18.531998 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533370500 W, [2014-01-17T17:16:18.532030 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533373500 I, [2014-01-17T17:16:18.532061 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533374500 W, [2014-01-17T17:16:18.532091 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533376500 I, [2014-01-17T17:16:18.532121 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533379500 W, [2014-01-17T17:16:18.532150 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533380500 I, [2014-01-17T17:16:18.532179 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533382500 W, [2014-01-17T17:16:18.532208 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533389500 I, [2014-01-17T17:16:18.532235 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533390500 W, [2014-01-17T17:16:18.532265 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533392500 I, [2014-01-17T17:16:18.532293 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533395500 W, [2014-01-17T17:16:18.532325 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533396500 I, [2014-01-17T17:16:18.532355 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533398500 W, [2014-01-17T17:16:18.532386 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533401500 I, [2014-01-17T17:16:18.532414 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533402500 W, [2014-01-17T17:16:18.532443 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533404500 I, [2014-01-17T17:16:18.532470 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533411500 W, [2014-01-17T17:16:18.532498 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533413500 I, [2014-01-17T17:16:18.532526 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533414500 W, [2014-01-17T17:16:18.532555 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533417500 I, [2014-01-17T17:16:18.532582 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533419500 W, [2014-01-17T17:16:18.532610 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533420500 I, [2014-01-17T17:16:18.532637 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533428500 W, [2014-01-17T17:16:18.532665 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533429500 I, [2014-01-17T17:16:18.532693 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533436500 W, [2014-01-17T17:16:18.532722 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533437500 I, [2014-01-17T17:16:18.532750 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533439500 W, [2014-01-17T17:16:18.532780 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533442500 I, [2014-01-17T17:16:18.532808 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533444500 W, [2014-01-17T17:16:18.532836 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533445500 I, [2014-01-17T17:16:18.532864 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533448500 W, [2014-01-17T17:16:18.532892 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533450500 I, [2014-01-17T17:16:18.532919 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533451500 W, [2014-01-17T17:16:18.532946 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533458500 I, [2014-01-17T17:16:18.532974 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533459500 W, [2014-01-17T17:16:18.533003 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533461500 I, [2014-01-17T17:16:18.533031 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533464500 W, [2014-01-17T17:16:18.533059 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533465500 I, [2014-01-17T17:16:18.533087 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533467500 W, [2014-01-17T17:16:18.533119 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533470500 I, [2014-01-17T17:16:18.533148 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533471500 W, [2014-01-17T17:16:18.533176 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533473500 I, [2014-01-17T17:16:18.533204 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533481500 W, [2014-01-17T17:16:18.533233 #6877] WARN -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher exiting
2014-01-17 17:16:18.533482500 I, [2014-01-17T17:16:18.533261 #6877] INFO -- Synapse::ZookeeperWatcher: synapse: zookeeper watcher cleaned up successfully
2014-01-17 17:16:18.533515500 /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/lib/zookeeper/continuation.rb:117:in block in value': response for meth: :get_children, args: [0, "/nerve/services/staging/klima-server/default", nil, #<Zookeeper::Callbacks::WatcherCallback:0x00000002732038 @completed=false, @proc=#<Proc:0x00000002731cc8@/export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/lib/zookeeper/callbacks.rb:24>>], not received within 30 seconds (Zookeeper::Exceptions::ContinuationTimeoutError) 2014-01-17 17:16:18.533608500 from /export/disk0/wb/bin/ruby-env/clemens/lib/ruby/1.9.1/monitor.rb:211:inmon_synchronize'
2014-01-17 17:16:18.533631500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/lib/zookeeper/continuation.rb:107:in value' 2014-01-17 17:16:18.533649500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/ext/c_zookeeper.rb:224:insubmit_and_block'
2014-01-17 17:16:18.533666500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/ext/c_zookeeper.rb:40:in get_children' 2014-01-17 17:16:18.533683500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zookeeper-1.4.7/lib/zookeeper/client_methods.rb:69:inget_children'
2014-01-17 17:16:18.533699500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/client/base.rb:1055:in call_and_check_rc' 2014-01-17 17:16:18.533716500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/client/threaded.rb:584:incall_and_check_rc'
2014-01-17 17:16:18.533733500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/client/base.rb:717:in block in children' 2014-01-17 17:16:18.533750500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/event_handler.rb:296:inblock in setup_watcher!'
2014-01-17 17:16:18.533766500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/event_handler.rb:314:in block in synchronize' 2014-01-17 17:16:18.533783500 from /export/disk0/wb/bin/ruby-env/clemens/lib/ruby/1.9.1/monitor.rb:211:inmon_synchronize'
2014-01-17 17:16:18.533799500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/event_handler.rb:314:in synchronize' 2014-01-17 17:16:18.533815500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/event_handler.rb:283:insetup_watcher!'
2014-01-17 17:16:18.533832500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/client/base.rb:1100:in setup_watcher!' 2014-01-17 17:16:18.533848500 from /export/disk0/wb/bin/synapse/vendor/bundle/ruby/1.9.1/gems/zk-1.9.2/lib/zk/client/base.rb:716:inchildren'
2014-01-17 17:16:18.533865500 from /export/disk0/wb/bin/synapse/lib/synapse/service_watcher/zookeeper.rb:58:in discover' 2014-01-17 17:16:18.533881500 from /export/disk0/wb/bin/synapse/lib/synapse/service_watcher/zookeeper.rb:105:inblock in watcher_callback'
2014-01-17 17:16:18.533898500 from /export/disk0/wb/bin/synapse/lib/synapse/service_watcher/zookeeper.rb:17:in call' 2014-01-17 17:16:18.533914500 from /export/disk0/wb/bin/synapse/lib/synapse/service_watcher/zookeeper.rb:17:instart'
2014-01-17 17:16:18.533930500 from /export/disk0/wb/bin/synapse/lib/synapse.rb:33:in block in run' 2014-01-17 17:16:18.533947500 from /export/disk0/wb/bin/synapse/lib/synapse.rb:33:inmap'
2014-01-17 17:16:18.533963500 from /export/disk0/wb/bin/synapse/lib/synapse.rb:33:in run' 2014-01-17 17:16:18.533979500 from bin/synapse:60:in

'
2014-01-17 17:16:18.880718500 fatal: Not a git repository (or any of the parent directories): .git
2014-01-17 17:16:19.203039500 fatal: Not a git repository (or any of the parent directories): .git

Obviously it's not great if a ZK server times out, but should that really be a fatal error for synapse? We're passing in multiple ZK servers, and a single timeout in one watcher should, it seems to me, be cause at worst for restarting that watcher and/or falling through to the next ZK server.

from synapse.

igor47 avatar igor47 commented on August 22, 2024

@memory agreed, but this is a separate issue; i opened a new one here

from synapse.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.