
corosync's Introduction

SYNCHRONIZATION ALGORITHM:
-------------------------
The synchronization algorithm is used by every service in corosync to
synchronize the state of the system.

There are four events in the synchronization algorithm.  These events are in
fact functions that are registered in the service handler data structure.
They are called by the synchronization system whenever the network partitions
or merges.

init:
Within the init event a service handler should record temporary state variables
used by the process event.

process:
The process event is responsible for executing synchronization.  This event
returns a state indicating whether it has completed or not.  This allows
synchronization to be interrupted and resumed when the message queue buffer
is full.  The process event will be called again by the synchronization
service if the value returned by process requests it.

abort:
The abort event occurs when a processor failure occurs during synchronization.

activate:
The activate event occurs when the process event has indicated that no more
processing is necessary for any node in the cluster and all messages
originated by process have completed.
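
As an illustration, a service engine might register these four events along
the following lines (a minimal C sketch; the structure and field names are
hypothetical and only mirror the description above, not the exact
corosync-internal definitions):

/* Hypothetical registration structure for the four synchronization events. */
struct sync_callbacks_sketch {
	/* init: record temporary state variables used by process */
	void (*sync_init) (void);

	/* process: execute a slice of synchronization work; return
	 * nonzero to be called again, zero when no work remains */
	int (*sync_process) (void);

	/* abort: a processor failed during synchronization */
	void (*sync_abort) (void);

	/* activate: process is finished on every node and all messages
	 * originated by process have completed */
	void (*sync_activate) (void);
};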

CHECKPOINT SYNCHRONIZATION ALGORITHM:
------------------------------------
The purpose of the checkpoint synchronization algorithm is to synchronize
checkpoints after a partition or merge of two or more partitions.  The
secondary purpose of the algorithm is to determine the cluster-wide reference
count for every checkpoint.

Every cluster contains a group of checkpoints.  Each checkpoint has a
checkpoint name and checkpoint number.  The number is used to uniquely reference
an unlinked but still open checkpoint in the cluster.

Every checkpoint contains a reference count which is used to determine when
that checkpoint may be released.  The algorithm rebuilds the reference count
information each time a partition or merge occurs.

local variables
my_sync_state may have the values SYNC_CHECKPOINT, SYNC_REFCOUNT
my_current_iteration_state contains any data used to iterate the checkpoints
	and sections.
checkpoint data
	refcount_set contains reference count for every node consisting of
	number of opened connections to checkpoint and node identifier
	refcount contains a summation of every reference count in the refcount_set
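
For illustration, the per-checkpoint reference count data described above
could be modelled roughly as follows (a sketch only; the type and field names
are assumptions, not the actual checkpoint service structures):

#include <stdint.h>

#define MAX_NODES 32			/* bound assumed for the sketch */

/* one entry per node: how many connections that node holds open */
struct refcount_entry {
	uint32_t nodeid;
	uint32_t opened;		/* opened connections on that node */
};

struct checkpoint_refcounts {
	struct refcount_entry refcount_set[MAX_NODES];
	uint32_t refcount;		/* summation over refcount_set */
};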

pseudocode executed by a processor when the synchronization service calls
the init event
	call sync_checkpoints_enter

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_CHECKPOINT state
	if this processor has the lowest processor identifier of the old ring within the new ring
		transmit checkpoints or sections starting from my_current_iteration_state
	if all checkpoints and sections could be queued
		call sync_refcounts_enter
	else
		record my_current_iteration_state

	require process to continue

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_REFCOUNT state
	if this processor has the lowest processor identifier of the old ring within the new ring
		transmit checkpoint reference counts
	if all checkpoint reference counts could be queued
		require process to not continue
	else
		record my_current_iteration_state for checkpoint reference counts

sync_checkpoints_enter:
	my_sync_state = SYNC_CHECKPOINT
	my_current_iteration_state set to start of checkpoint list

sync_refcounts_enter:
	my_sync_state = SYNC_REFCOUNT

on event receipt of foreign ring id message
	ignore message

pseudocode executed on event receipt of checkpoint update
	if checkpoint exists in temporary storage
		ignore message
	else
		create checkpoint
		reset checkpoint refcount array

pseudocode executed on event receipt of checkpoint section update
	if checkpoint section exists in temporary storage
		ignore message
	else
		create checkpoint section

pseudocode executed on event receipt of reference count update
	add the reference counts carried in the message to the reference
	count set held in the temporary checkpoint data storage
	update that checkpoint's reference count (the summation of the set)
	set the global checkpoint id to the current checkpoint id + 1 if that
	would increase the global checkpoint id
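
A sketch of that merge step, reusing the hypothetical structures from the
earlier sketch (illustrative only):

/* Add the reference counts carried in an update message into the temporary
 * checkpoint data, then recompute the checkpoint-wide summation.  Assumes
 * refcount_set was reset with one entry per member node. */
static void merge_refcount_update (struct checkpoint_refcounts *temp,
	const struct refcount_entry *msg_set, unsigned int msg_entries)
{
	unsigned int i, j;

	for (i = 0; i < msg_entries; i++) {
		for (j = 0; j < MAX_NODES; j++) {
			if (temp->refcount_set[j].nodeid == msg_set[i].nodeid) {
				temp->refcount_set[j].opened += msg_set[i].opened;
				break;
			}
		}
	}

	temp->refcount = 0;
	for (j = 0; j < MAX_NODES; j++) {
		temp->refcount += temp->refcount_set[j].opened;
	}
}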

pseudocode called when the synchronization service calls the activate event:
for all checkpoints
	free all previously committed checkpoints and sections
	convert temporary checkpoints and sections to regular checkpoints and sections
copy my_saved_ring_id to my_old_ring_id

pseudocode called when the synchronization service calls the abort event:
	free all temporary checkpoints and temporary sections

corosync's People

Contributors

asalkeld, beekhof, bubble75, chrissie-c, davidvossel, erihafe, fabbione, fabian-gruenbichler, fabientnetasq, hideoyamauchi, huzhijiang, jauler, jfriesse, jnpkrn, jonesmz, kirvedx, lhh, liangxin1300, masatake, meyering, miz-take, rohara, rubenk, russellb, thomaslamprecht, wferi, yuanren10, yunkai, yuusuke, zaneb

corosync's Issues

Improve cpg

Opaque data in callbacks (a client stores data about itself, and every node can access it), and permissions (read-only / read-write; some applications may want to disable listeners). Probably implemented as an extra user-space library on top of normal cpg.
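
A rough sketch of what such a user-space layer might look like on top of the
existing cpg API; the wrapper names (node_info_announce, node_info_cache and
friends) are made up for illustration:

#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <corosync/cpg.h>

/* Hypothetical cache of opaque per-client data, kept on every member. */
struct node_info {
	uint32_t nodeid;
	uint32_t pid;
	char data[128];			/* opaque, client-defined payload */
};

#define INFO_CACHE_SIZE 64
static struct node_info node_info_cache[INFO_CACHE_SIZE];
static unsigned int node_info_entries;

/* Multicast this client's opaque data to the whole group. */
static cs_error_t node_info_announce (cpg_handle_t handle,
	const void *data, size_t len)
{
	struct iovec iov = {
		.iov_base = (void *)data,
		.iov_len = len,
	};

	return (cpg_mcast_joined (handle, CPG_TYPE_AGREED, &iov, 1));
}

/* Deliver callback (registered as cpg_deliver_fn in cpg_callbacks_t):
 * remember or refresh the sender's record so any node can look it up
 * later by nodeid and pid. */
static void node_info_deliver_fn (cpg_handle_t handle,
	const struct cpg_name *group_name,
	uint32_t nodeid, uint32_t pid,
	void *msg, size_t msg_len)
{
	unsigned int i;

	for (i = 0; i < node_info_entries; i++) {
		if (node_info_cache[i].nodeid == nodeid &&
		    node_info_cache[i].pid == pid) {
			break;
		}
	}
	if (i == node_info_entries) {
		if (node_info_entries == INFO_CACHE_SIZE) {
			return;		/* cache full, drop the update */
		}
		node_info_entries++;
	}
	node_info_cache[i].nodeid = nodeid;
	node_info_cache[i].pid = pid;
	if (msg_len > sizeof (node_info_cache[i].data)) {
		msg_len = sizeof (node_info_cache[i].data);
	}
	memcpy (node_info_cache[i].data, msg, msg_len);
}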

invalid nodeid parsing on 32-bit systems

Just encountered an issue with specifying a 32-bit value (an integer derived from the binary form of an IP address) in corosync.conf. The following configuration:

nodelist {
    node {
        ring0_addr: 192.168.122.166
        nodeid: 3232266918
    }
}

produces the following error message:

error   [MAIN  ] Value of key "nodeid" must be integer, but "3232266918" was given
error   [MAIN  ] Corosync Cluster Engine exiting with status 8 at main.c:1158.

However, if nodeid is not specified, it is assigned automatically and is displayed correctly:

# corosync-cmapctl runtime.votequorum.this_node_id
runtime.votequorum.this_node_id (u32) = 3232266918

System:

# uname -a
Linux ub1404-i386-ovl 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:08:14 UTC 2014 i686 i686 i686 GNU/Linux
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04 LTS
Release:    14.04
Codename:   trusty

Corosync version:

# corosync -v
Corosync Cluster Engine, version '2.3.3'
Copyright (c) 2006-2009 Red Hat, Inc.

I believe the reason is ERANGE in strtol() here:
https://github.com/corosync/corosync/blob/master/exec/coroparse.c#L415-L418

but I'm not sure about the best way to handle this, thus I'm not providing any patches at the moment.
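
For reference, LONG_MAX is 2147483647 on a 32-bit system, so strtol() on
"3232266918" does set errno to ERANGE there. A sketch of parsing such a key
with a wider type and an explicit unsigned 32-bit range check (illustrative
only, not the actual coroparse.c code):

#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse an unsigned 32-bit key value such as nodeid.  strtoll avoids the
 * ERANGE failure that strtol hits on platforms where long is 32 bits. */
static int parse_u32 (const char *str, uint32_t *out)
{
	char *end;
	long long val;

	errno = 0;
	val = strtoll (str, &end, 10);
	if (errno == ERANGE || *end != '\0' ||
	    val < 0 || val > (long long)UINT32_MAX) {
		return (-1);
	}
	*out = (uint32_t)val;
	return (0);
}

int main (void)
{
	uint32_t nodeid;

	if (parse_u32 ("3232266918", &nodeid) == 0) {
		printf ("nodeid = %u\n", (unsigned int)nodeid);
	}
	return (0);
}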

Not able to see both cluster nodes in crm_mon result

I am not able to see both cluster nodes in crm_mon result.

root@server1:# corosync-objctl |grep member
totem.interface.member.memberaddr=192.168.133.76
totem.interface.member.memberaddr=192.168.132.50
runtime.totem.pg.mrp.srp.members.16777343.ip=r(0) ip(127.0.0.1)
runtime.totem.pg.mrp.srp.members.16777343.join_count=1
runtime.totem.pg.mrp.srp.members.16777343.status=joined
root@server1:#

crm_mon result from server1

Last updated: Fri Apr 18 14:35:37 2014
Last change: Fri Apr 18 13:31:55 2014 via crmd on server1
Stack: openais
Current DC: server1 - partition WITHOUT quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
1 Nodes configured, 2 expected votes
#0 Resources configured.

Online: [ server1 ]

crm_mon result from server2

Last updated: Fri Apr 18 14:36:19 2014
Last change: Fri Apr 18 13:30:30 2014 via crmd on server2
Stack: openais
Current DC: server2 - partition WITHOUT quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
1 Nodes configured, 2 expected votes
#0 Resources configured.

Online: [ server2 ]

corosync result from server1

cat /etc/corosync/corosync.conf

# Please read the openais.conf.5 manual page

totem {
version: 2
cluster_name: MyCluster
# How long before declaring a token lost (ms)
token: 3000

# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10

# How long to wait for join messages in the membership protocol (ms)
join: 60

# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600

# Turn off the virtual synchrony filter
vsftype: none

# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20

# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes

# Disable encryption
secauth: on

# How many threads to use for encryption/decryption
threads: 0

# Optionally assign a fixed node id (integer)
# nodeid: 1234

# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none

interface {

member {
memberaddr: 192.168.133.76
}
member {
memberaddr: 192.168.132.50
}
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 192.168.0.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
}

amf {
mode: disabled
}

service {
# Load the Pacemaker Cluster Resource Manager
ver: 1
name: pacemaker
}

aisexec {
user: root
group: root
}

logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}

corosync result from server2

cat /etc/corosync/corosync.conf

# Please read the openais.conf.5 manual page

totem {
version: 2
cluster_name: MyCluster
# How long before declaring a token lost (ms)
token: 3000

# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10

# How long to wait for join messages in the membership protocol (ms)
join: 60

# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600

# Turn off the virtual synchrony filter
vsftype: none

# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20

# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes

# Disable encryption
secauth: on

# How many threads to use for encryption/decryption
threads: 0

# Optionally assign a fixed node id (integer)
# nodeid: 1234

# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none

interface {

member {
memberaddr:192.168.133.76
}
member {
memberaddr: 192.168.132.50
}
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 192.168.0.0
mcastaddr: 226.94.1.1
mcastport: 5405
}
}

amf {
mode: disabled
}

service {
# Load the Pacemaker Cluster Resource Manager
ver: 1
name: pacemaker
}

aisexec {
user: root
group: root
}

logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
tags: enter|leave|trace1|trace2|trace3|trace4|trace6
}
}

doxygenize include and lib directories.

Currently all man pages are handwritten (with the help of a small sed script to include error codes in some pages). This is suboptimal. The idea is to explore doxygen's possibilities and doxygenize (at least) the user libraries. This would leave only one reference point to edit. The tarball should contain the generated man pages, so doxygen is not needed during compilation, but these man pages are not included in git.
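
For example, a doxygen comment block on one of the public library calls might
look roughly like this (the descriptive wording is illustrative, not taken
from an existing man page):

/**
 * @brief Create a new connection to the cpg service.
 *
 * @param[out] handle     handle used in subsequent cpg library calls
 * @param[in]  callbacks  deliver and configuration-change callbacks
 *
 * @return CS_OK on success, otherwise a cs_error_t error code.
 */
cs_error_t cpg_initialize (
	cpg_handle_t *handle,
	cpg_callbacks_t *callbacks);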

Lapse in documentation

The Wiki page offers incomplete information on how to join the development effort. A mailing list is mentioned, but there are no instructions on how to join it.

When I mailed the mailing list with the traditional one-word message "subscribe", I received the following reject message:

A subscription to this mailing list is required to post a message.
Your message has been rejected.  Please subscribe to the mailing list
if you wish to send a message.

Please fix the documentation, and oh by the way, subscribe me to the mailing list.

The cluster is partitioned into three individual parts after service pacemaker stop.

Hi,

I have a cluster with 3 nodes,

  • dev41: 192.168.63.81
  • dev42: 192.168.63.82
  • dev43: 192.168.63.83

The software stack is:

  • LibQB: v0.14.4
  • cluster-glue: glue-1.0.10
  • resource-agents: v3.9.3
  • corosync: v2.3.0
  • pacemaker: Pacemaker-1.1.8
  • crmsh: crmsh-1.2.4

/etc/corosync/corosync.conf is

totem {
    version: 2

    # crypto_cipher and crypto_hash: Used for mutual node authentication.
    # If you choose to enable this, then do remember to create a shared
    # secret with "corosync-keygen".
    # enabling crypto_cipher, requires also enabling of crypto_hash.
    crypto_cipher: none
    crypto_hash: none

    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
                # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option.
        bindnetaddr: 192.168.63.0
        # However, if you have multiple physical network
        # interfaces configured for the same subnet, then the
        # network address alone is not sufficient to identify
        # the interface Corosync should bind to. In that case,
        # configure the *host* address of the interface
        # instead:
        # bindnetaddr: 192.168.1.1
        # When selecting a multicast address, consider RFC
        #2365 (which, among other things, specifies that
        #239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        mcastaddr: 239.255.1.3
        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405
        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1
    }
}

logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to no. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: no
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: yes
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to on
    # (unless you are only logging to syslog, where double
    # timestamps can be annoying).
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    # Enable and configure quorum subsystem (default: off)
    # see also corosync.conf.5 and votequorum.5
    #provider: corosync_votequorum
    provider: corosync_votequorum
    expected_votes: 2
}

And the output of "crm configure show" is

node $id="1363126464" dev41
node $id="1379903680" dev42
node $id="1396680896" dev43
property $id="cib-bootstrap-options" \
    dc-version="1.1.8-1f8858c" \
    cluster-infrastructure="corosync" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"

Then I start the corosync and pacemaker services on dev41, dev42 and dev43.

The output of "corosync-cmapctl |grep members" is

runtime.totem.pg.mrp.srp.members.1363126464.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1363126464.ip (str) = r(0) ip(192.168.63.81) 
runtime.totem.pg.mrp.srp.members.1363126464.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1363126464.status (str) = joined
runtime.totem.pg.mrp.srp.members.1379903680.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1379903680.ip (str) = r(0) ip(192.168.63.82) 
runtime.totem.pg.mrp.srp.members.1379903680.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1379903680.status (str) = joined
runtime.totem.pg.mrp.srp.members.1396680896.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1396680896.ip (str) = r(0) ip(192.168.63.83) 
runtime.totem.pg.mrp.srp.members.1396680896.join_count (u32) = 3
runtime.totem.pg.mrp.srp.members.1396680896.status (str) = joined

I let the cluster run for several days, then I execute "service pacemaker stop" on dev43, and the output is:

Signaling Pacemaker Cluster Manager to terminate:       [  OK  ]
Waiting for cluster services to unload:..........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................                       [  OK  ]

However, the cluster is now partitioned into three individual parts.

the output of "corosync-cmapctl |grep members" on dev41 is:

runtime.totem.pg.mrp.srp.members.1363126464.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1363126464.ip (str) = r(0) ip(192.168.63.81) 
runtime.totem.pg.mrp.srp.members.1363126464.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1363126464.status (str) = joined
runtime.totem.pg.mrp.srp.members.1379903680.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1379903680.ip (str) = r(0) ip(192.168.63.82) 
runtime.totem.pg.mrp.srp.members.1379903680.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1379903680.status (str) = left
runtime.totem.pg.mrp.srp.members.1396680896.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1396680896.ip (str) = r(0) ip(192.168.63.83) 
runtime.totem.pg.mrp.srp.members.1396680896.join_count (u32) = 3
runtime.totem.pg.mrp.srp.members.1396680896.status (str) = left

the output of "corosync-cmapctl | grep members" on dev42 is:

runtime.totem.pg.mrp.srp.members.1363126464.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1363126464.ip (str) = r(0) ip(192.168.63.81) 
runtime.totem.pg.mrp.srp.members.1363126464.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1363126464.status (str) = left
runtime.totem.pg.mrp.srp.members.1379903680.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1379903680.ip (str) = r(0) ip(192.168.63.82) 
runtime.totem.pg.mrp.srp.members.1379903680.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1379903680.status (str) = joined
runtime.totem.pg.mrp.srp.members.1396680896.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1396680896.ip (str) = r(0) ip(192.168.63.83) 
runtime.totem.pg.mrp.srp.members.1396680896.join_count (u32) = 3
runtime.totem.pg.mrp.srp.members.1396680896.status (str) = left

the output of "corosync-cmapctl | grep members" on dev43 is:

runtime.totem.pg.mrp.srp.members.1363126464.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1363126464.ip (str) = r(0) ip(192.168.63.81) 
runtime.totem.pg.mrp.srp.members.1363126464.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1363126464.status (str) = left
runtime.totem.pg.mrp.srp.members.1379903680.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1379903680.ip (str) = r(0) ip(192.168.63.82) 
runtime.totem.pg.mrp.srp.members.1379903680.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1379903680.status (str) = left
runtime.totem.pg.mrp.srp.members.1396680896.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1396680896.ip (str) = r(0) ip(192.168.63.83) 
runtime.totem.pg.mrp.srp.members.1396680896.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1396680896.status (str) = joined

/var/log/messages on dev43 is:

Feb 24 16:13:26 dev43 pacemakerd[13564]:   notice: pcmk_shutdown_worker: Shuting down Pacemaker
Feb 24 16:13:26 dev43 crmd[13571]:   notice: crm_shutdown: Requesting shutdown, upper limit is 1200000ms
Feb 24 16:13:26 dev43 pacemakerd[13564]:   notice: stop_child: Stopping crmd: Sent -15 to process 13571
Feb 24 16:13:26 dev43 corosync[13545]:   [TOTEM ] Retransmit List: 67
Feb 24 16:13:26 dev43 corosync[13545]:   [TOTEM ] Retransmit List: 67
........................................................................................................
........................................................................................................
Feb 24 16:21:17 dev43 corosync[13545]:   [TOTEM ] Retransmit List: 67
Feb 24 16:21:17 dev43 corosync[13545]:   [TOTEM ] Retransmit List: 67
Feb 24 16:21:17 dev43 corosync[13545]:   [TOTEM ] Retransmit List: 67
Feb 24 16:21:18 dev43 corosync[13545]:   [TOTEM ] A processor failed, forming new configuration.
Feb 24 16:21:18 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
Feb 24 16:21:18 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1676) was formed.
Feb 24 16:21:18 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
Feb 24 16:21:18 dev43 crmd[13571]:   notice: corosync_node_name: Inferred node name '192.168.63.82' for nodeid 1379903680 from DNS
Feb 24 16:21:18 dev43 crmd[13571]:    error: corosync_node_name: Unable to get node name for nodeid 1379903680
Feb 24 16:21:18 dev43 crmd[13571]:   notice: corosync_mark_unseen_peer_dead: Node 1363126464/dev41 was not seen in the previous transition
Feb 24 16:21:18 dev43 crmd[13571]:   notice: crm_update_peer_state: corosync_mark_unseen_peer_dead: Node dev41[1363126464] - state is now lost
Feb 24 16:21:18 dev43 crmd[13571]:  warning: check_dead_member: Our DC node (dev41) left the cluster
Feb 24 16:21:18 dev43 crmd[13571]:   notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=check_dead_member ]
Feb 24 16:21:18 dev43 crmd[13571]:   notice: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Feb 24 16:21:18 dev43 crmd[13571]:   notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
Feb 24 16:21:18 dev43 attrd[13569]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
Feb 24 16:21:18 dev43 attrd[13569]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Feb 24 16:21:18 dev43 attrd[13569]:   notice: attrd_ais_dispatch: Update relayed from dev42
Feb 24 16:21:18 dev43 attrd[13569]:   notice: attrd_trigger_update: Sending flush op to all hosts for: shutdown (1361694046)
Feb 24 16:21:18 dev43 attrd[13569]:   notice: attrd_perform_update: Sent update 11: shutdown=1361694046
Feb 24 16:21:20 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
Feb 24 16:21:20 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1680) was formed.
Feb 24 16:21:20 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
Feb 24 16:21:21 dev43 crmd[13571]:   notice: do_state_transition: State transition S_NOT_DC -> S_STOPPING [ input=I_STOP cause=C_HA_MESSAGE origin=route_message ]
Feb 24 16:21:21 dev43 crmd[13571]:   notice: terminate_cs_connection: Disconnecting from Corosync
Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: stop_child: Stopping pengine: Sent -15 to process 13570
Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: stop_child: Stopping attrd: Sent -15 to process 13569
Feb 24 16:21:21 dev43 attrd[13569]:   notice: main: Exiting...
Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: stop_child: Stopping lrmd: Sent -15 to process 13568
Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: stop_child: Stopping stonith-ng: Sent -15 to process 13567
Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: stop_child: Stopping cib: Sent -15 to process 13566
Feb 24 16:21:21 dev43 cib[13566]:   notice: terminate_cs_connection: Disconnecting from Corosync
Feb 24 16:21:21 dev43 cib[13566]:   notice: terminate_cs_connection: Disconnecting from Corosync
2534 Feb 24 16:21:21 dev43 pacemakerd[13564]:   notice: pcmk_shutdown_worker: Shutdown complete
2535 Feb 24 16:21:21 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2536 Feb 24 16:21:21 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1684) was formed.
2537 Feb 24 16:21:21 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2538 Feb 24 16:21:23 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2539 Feb 24 16:21:23 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1688) was formed.
2540 Feb 24 16:21:23 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2541 Feb 24 16:21:24 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2542 Feb 24 16:21:24 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1692) was formed.
2543 Feb 24 16:21:24 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2544 Feb 24 16:21:26 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2545 Feb 24 16:21:26 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1696) was formed.
2546 Feb 24 16:21:26 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2547 Feb 24 16:21:27 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2548 Feb 24 16:21:27 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1700) was formed.
2549 Feb 24 16:21:27 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2550 Feb 24 16:21:29 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2551 Feb 24 16:21:29 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1704) was formed.
2552 Feb 24 16:21:29 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2553 Feb 24 16:21:30 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2554 Feb 24 16:21:30 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1708) was formed.
2555 Feb 24 16:21:30 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2556 Feb 24 16:21:31 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2557 Feb 24 16:21:31 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1712) was formed.
2558 Feb 24 16:21:31 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2559 Feb 24 16:21:33 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2560 Feb 24 16:21:33 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1716) was formed.
2561 Feb 24 16:21:33 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2562 Feb 24 16:21:34 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2563 Feb 24 16:21:34 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1720) was formed.
2564 Feb 24 16:21:34 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2565 Feb 24 16:21:36 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2566 Feb 24 16:21:36 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1724) was formed.
2567 Feb 24 16:21:36 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2568 Feb 24 16:21:37 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2569 Feb 24 16:21:37 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1728) was formed.
2570 Feb 24 16:21:37 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2571 Feb 24 16:21:39 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2572 Feb 24 16:21:39 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1732) was formed.
2573 Feb 24 16:21:39 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2574 Feb 24 16:21:41 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
2575 Feb 24 16:21:41 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1736) was formed.
2576 Feb 24 16:21:41 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
2577 Feb 24 16:21:42 dev43 corosync[13545]:   [QUORUM] Members[2]: 1379903680 1396680896
........................................................................................................
........................................................................................................
3606 Feb 24 16:30:11 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.83:3108) was formed.
3607 Feb 24 16:30:11 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
3608 Feb 24 16:30:13 dev43 corosync[13545]:   [QUORUM] Members[1]: 1396680896
3609 Feb 24 16:30:13 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.83:3112) was formed.
3610 Feb 24 16:30:13 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
3611 Feb 24 16:30:14 dev43 corosync[13545]:   [QUORUM] Members[1]: 1396680896
3612 Feb 24 16:30:14 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.83:3116) was formed.
3613 Feb 24 16:30:14 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.
3614 Feb 24 16:30:16 dev43 corosync[13545]:   [QUORUM] Members[1]: 1396680896
3615 Feb 24 16:30:16 dev43 corosync[13545]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.83:3120) was formed.
3616 Feb 24 16:30:16 dev43 corosync[13545]:   [MAIN  ] Completed service synchronization, ready to provide service.

/var/log/messages on dev41 is:

248 Feb 24 16:07:40 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
249 Feb 24 16:07:40 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
250 Feb 24 16:07:40 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 281: (null)
251 Feb 24 16:07:40 dev41 crmd[10689]:   notice: run_graph: Transition 281 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
252 Feb 24 16:07:40 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
253 Feb 24 16:21:00 dev41 corosync[10644]:   [TOTEM ] FAILED TO RECEIVE
254 Feb 24 16:21:02 dev41 corosync[10644]:   [QUORUM] This node is within the non-primary component and will NOT provide any services.
255 Feb 24 16:21:02 dev41 corosync[10644]:   [QUORUM] Members[1]: 1363126464
256 Feb 24 16:21:02 dev41 crmd[10689]:   notice: pcmk_quorum_notification: Membership 1676: quorum lost (1)
257 Feb 24 16:21:02 dev41 crmd[10689]:   notice: corosync_mark_unseen_peer_dead: Node 1396680896/dev43 was not seen in the previous transition
258 Feb 24 16:21:02 dev41 crmd[10689]:   notice: crm_update_peer_state: corosync_mark_unseen_peer_dead: Node dev43[1396680896] - state is now lost
259 Feb 24 16:21:02 dev41 crmd[10689]:   notice: corosync_mark_unseen_peer_dead: Node 1379903680/dev42 was not seen in the previous transition
260 Feb 24 16:21:02 dev41 crmd[10689]:   notice: crm_update_peer_state: corosync_mark_unseen_peer_dead: Node dev42[1379903680] - state is now lost
261 Feb 24 16:21:02 dev41 corosync[10644]:   [QUORUM] Members[1]: 1363126464
262 Feb 24 16:21:02 dev41 corosync[10644]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.81:1676) was formed.
263 Feb 24 16:21:02 dev41 corosync[10644]:   [MAIN  ] Completed service synchronization, ready to provide service.
264 Feb 24 16:21:03 dev41 crmd[10689]:  warning: match_down_event: No match for shutdown action on 1379903680
265 Feb 24 16:21:03 dev41 crmd[10689]:   notice: peer_update_callback: Stonith/shutdown of dev42 not matched
266 Feb 24 16:21:03 dev41 crmd[10689]:  warning: match_down_event: No match for shutdown action on 1396680896
267 Feb 24 16:21:03 dev41 crmd[10689]:   notice: peer_update_callback: Stonith/shutdown of dev43 not matched
268 Feb 24 16:21:03 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state ]
269 Feb 24 16:21:03 dev41 attrd[10687]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
270 Feb 24 16:21:03 dev41 attrd[10687]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
271 Feb 24 16:21:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
272 Feb 24 16:21:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 282: /var/lib/pacemaker/pengine/pe-input-17.bz2
273 Feb 24 16:21:04 dev41 crmd[10689]:   notice: run_graph: Transition 282 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-17.bz2): Complete
274 Feb 24 16:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
275 Feb 24 16:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
276 Feb 24 16:36:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
277 Feb 24 16:36:04 dev41 crmd[10689]:   notice: run_graph: Transition 283 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
278 Feb 24 16:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
279 Feb 24 16:36:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 283: (null)
280 Feb 24 16:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
281 Feb 24 16:51:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
282 Feb 24 16:51:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 284: (null)
283 Feb 24 16:51:04 dev41 crmd[10689]:   notice: run_graph: Transition 284 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
284 Feb 24 16:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
285 Feb 24 17:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
286 Feb 24 17:06:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
287 Feb 24 17:06:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 285: (null)
288 Feb 24 17:06:04 dev41 crmd[10689]:   notice: run_graph: Transition 285 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
289 Feb 24 17:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
290 Feb 24 17:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
291 Feb 24 17:21:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
292 Feb 24 17:21:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 286: (null)
293 Feb 24 17:21:04 dev41 crmd[10689]:   notice: run_graph: Transition 286 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
294 Feb 24 17:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
295 Feb 24 17:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
296 Feb 24 17:36:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
297 Feb 24 17:36:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 287: (null)
298 Feb 24 17:36:04 dev41 crmd[10689]:   notice: run_graph: Transition 287 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
299 Feb 24 17:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
300 Feb 24 17:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
301 Feb 24 17:51:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
302 Feb 24 17:51:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 288: (null)
303 Feb 24 17:51:04 dev41 crmd[10689]:   notice: run_graph: Transition 288 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
304 Feb 24 17:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
305 Feb 24 18:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
306 Feb 24 18:06:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
307 Feb 24 18:06:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 289: (null)
308 Feb 24 18:06:04 dev41 crmd[10689]:   notice: run_graph: Transition 289 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
309 Feb 24 18:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
310 Feb 24 18:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
311 Feb 24 18:21:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
312 Feb 24 18:21:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 290: (null)
313 Feb 24 18:21:04 dev41 crmd[10689]:   notice: run_graph: Transition 290 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
314 Feb 24 18:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
315 Feb 24 18:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
316 Feb 24 18:36:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
317 Feb 24 18:36:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 291: (null)
318 Feb 24 18:36:04 dev41 crmd[10689]:   notice: run_graph: Transition 291 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
319 Feb 24 18:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
320 Feb 24 18:44:27 dev41 cibadmin[21380]:   notice: crm_log_args: Invoked: cibadmin -Ql
321 Feb 24 18:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
322 Feb 24 18:51:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
323 Feb 24 18:51:04 dev41 crmd[10689]:   notice: run_graph: Transition 292 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
324 Feb 24 18:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
325 Feb 24 18:51:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 292: (null)
326 Feb 24 19:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
327 Feb 24 19:06:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
328 Feb 24 19:06:04 dev41 crmd[10689]:   notice: run_graph: Transition 293 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
329 Feb 24 19:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
330 Feb 24 19:06:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 293: (null)
331 Feb 24 19:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
332 Feb 24 19:21:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
333 Feb 24 19:21:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 294: (null)
334 Feb 24 19:21:04 dev41 crmd[10689]:   notice: run_graph: Transition 294 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
335 Feb 24 19:21:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
336 Feb 24 19:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
337 Feb 24 19:36:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
338 Feb 24 19:36:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 295: (null)
339 Feb 24 19:36:04 dev41 crmd[10689]:   notice: run_graph: Transition 295 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
340 Feb 24 19:36:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
341 Feb 24 19:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
342 Feb 24 19:51:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
343 Feb 24 19:51:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 296: (null)
344 Feb 24 19:51:04 dev41 crmd[10689]:   notice: run_graph: Transition 296 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
345 Feb 24 19:51:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
346 Feb 24 20:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
347 Feb 24 20:06:04 dev41 pengine[10688]:   notice: unpack_config: On loss of CCM Quorum: Ignore
348 Feb 24 20:06:04 dev41 pengine[10688]:   notice: process_pe_message: Calculated Transition 297: (null)
349 Feb 24 20:06:04 dev41 crmd[10689]:   notice: run_graph: Transition 297 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
350 Feb 24 20:06:04 dev41 crmd[10689]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]

/var/log/messages on dev42 is:

   3 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   4 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   5 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   6 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   7 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   8 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
   9 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  10 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  11 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  12 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  13 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  14 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  15 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  16 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  17 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  18 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  19 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  20 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  21 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  22 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  23 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  24 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  25 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  26 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  27 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  28 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  29 Feb 24 16:12:54 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  30 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  31 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  32 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  33 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  34 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  35 Feb 24 16:12:55 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  36 Feb 24 16:12:56 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  37 Feb 24 16:12:56 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  38 Feb 24 16:12:56 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  39 Feb 24 16:12:56 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  40 Feb 24 16:12:56 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  41 Feb 24 16:12:57 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  42 Feb 24 16:12:57 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  43 Feb 24 16:12:57 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  44 Feb 24 16:12:57 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  45 Feb 24 16:12:57 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
  46 Feb 24 16:12:58 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
...............................................................................................................
...............................................................................................................
2499 Feb 24 16:20:45 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
2500 Feb 24 16:20:45 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 67
2501 Feb 24 16:20:46 dev42 corosync[10654]:   [TOTEM ] A processor failed, forming new configuration.
2502 Feb 24 16:20:46 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2503 Feb 24 16:20:46 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1676) was formed.
2504 Feb 24 16:20:46 dev42 crmd[10694]:   notice: corosync_mark_unseen_peer_dead: Node 1363126464/dev41 was not seen in the previous transition
2505 Feb 24 16:20:46 dev42 crmd[10694]:   notice: crm_update_peer_state: corosync_mark_unseen_peer_dead: Node dev41[1363126464] - state is now lost
2506 Feb 24 16:20:46 dev42 crmd[10694]:  warning: check_dead_member: Our DC node (dev41) left the cluster
2507 Feb 24 16:20:46 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2508 Feb 24 16:20:46 dev42 crmd[10694]:   notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=check_dead_member ]
2509 Feb 24 16:20:46 dev42 crmd[10694]:   notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
2510 Feb 24 16:20:46 dev42 attrd[10692]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
2511 Feb 24 16:20:46 dev42 attrd[10692]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
2512 Feb 24 16:20:46 dev42 attrd[10692]:  warning: attrd_cib_callback: Update shutdown=(null) failed: No such device or address
2513 Feb 24 16:20:47 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
2514 Feb 24 16:20:47 dev42 pengine[10693]:   notice: stage6: Scheduling Node dev43 for shutdown
2515 Feb 24 16:20:47 dev42 crmd[10694]:   notice: run_graph: Transition 0 (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): Complete
2516 Feb 24 16:20:47 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
2517 Feb 24 16:20:47 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-10.bz2
2518 Feb 24 16:20:48 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2519 Feb 24 16:20:48 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1680) was formed.
2520 Feb 24 16:20:48 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2521 Feb 24 16:20:49 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2522 Feb 24 16:20:49 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1684) was formed.
2523 Feb 24 16:20:49 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2524 Feb 24 16:20:50 dev42 crmd[10694]:   notice: peer_update_callback: do_shutdown of dev43 (op 7) is complete
2525 Feb 24 16:20:50 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state ]
2526 Feb 24 16:20:51 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2527 Feb 24 16:20:51 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1688) was formed.
2528 Feb 24 16:20:51 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2529 Feb 24 16:20:52 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2530 Feb 24 16:20:52 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1692) was formed.
2531 Feb 24 16:20:52 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2532 Feb 24 16:20:54 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2533 Feb 24 16:20:54 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1696) was formed.
2534 Feb 24 16:20:54 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2535 Feb 24 16:20:55 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2536 Feb 24 16:20:55 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1700) was formed.
2537 Feb 24 16:20:55 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2538 Feb 24 16:20:57 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2539 Feb 24 16:20:57 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1704) was formed.
2540 Feb 24 16:20:57 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2541 Feb 24 16:20:58 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2542 Feb 24 16:20:58 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1708) was formed.
2543 Feb 24 16:20:58 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2544 Feb 24 16:20:59 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2545 Feb 24 16:20:59 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1712) was formed.
2546 Feb 24 16:20:59 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2547 Feb 24 16:21:01 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2548 Feb 24 16:21:01 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1716) was formed.
2549 Feb 24 16:21:01 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2550 Feb 24 16:21:02 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2551 Feb 24 16:21:02 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1720) was formed.
2552 Feb 24 16:21:02 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2553 Feb 24 16:21:04 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2554 Feb 24 16:21:04 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1724) was formed.
2555 Feb 24 16:21:04 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2556 Feb 24 16:21:05 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2557 Feb 24 16:21:05 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1728) was formed.
2558 Feb 24 16:21:05 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2559 Feb 24 16:21:07 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2560 Feb 24 16:21:07 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1732) was formed.
2561 Feb 24 16:21:07 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2562 Feb 24 16:21:09 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2563 Feb 24 16:21:09 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1736) was formed.
2564 Feb 24 16:21:09 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2565 Feb 24 16:21:10 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2566 Feb 24 16:21:10 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1740) was formed.
2567 Feb 24 16:21:10 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2568 Feb 24 16:21:11 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2569 Feb 24 16:21:11 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:1744) was formed.
.................................................................................................................................
.................................................................................................................................
2890 Feb 24 16:23:47 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2172) was formed.
2891 Feb 24 16:23:47 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2892 Feb 24 16:23:49 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2893 Feb 24 16:23:49 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2176) was formed.
2894 Feb 24 16:23:49 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2895 Feb 24 16:23:50 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2896 Feb 24 16:23:50 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2180) was formed.
2897 Feb 24 16:23:50 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2898 Feb 24 16:23:51 dev42 crmd[10694]:    error: crm_timer_popped: Integration Timer (I_INTEGRATED) just popped in state S_INTEGRATION! (180000ms)
2899 Feb 24 16:23:51 dev42 crmd[10694]:  warning: do_state_transition: Progressed to state S_FINALIZE_JOIN after C_TIMER_POPPED
2900 Feb 24 16:23:51 dev42 crmd[10694]:  warning: do_state_transition: 1 cluster nodes failed to respond to the join offer.
2901 Feb 24 16:23:51 dev42 crmd[10694]:  warning: do_log: FSA: Input I_ELECTION_DC from do_dc_join_finalize() received in state S_FINALIZE_JOIN
2902 Feb 24 16:23:51 dev42 crmd[10694]:  warning: join_query_callback: No DC for join-248
2903 Feb 24 16:23:51 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2904 Feb 24 16:23:51 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2184) was formed.
2905 Feb 24 16:23:51 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2906 Feb 24 16:23:53 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2907 Feb 24 16:23:53 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2188) was formed.
2908 Feb 24 16:23:53 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2909 Feb 24 16:23:54 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2910 Feb 24 16:23:54 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2192) was formed.
2911 Feb 24 16:23:54 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2912 Feb 24 16:23:55 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2913 Feb 24 16:23:55 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2196) was formed.
2914 Feb 24 16:23:55 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2915 Feb 24 16:23:57 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2916 Feb 24 16:23:57 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2200) was formed.
2917 Feb 24 16:23:57 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2918 Feb 24 16:23:59 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
2919 Feb 24 16:23:59 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2204) was formed.
2920 Feb 24 16:23:59 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
2921 Feb 24 16:24:00 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
..................................................................................................................................................................................................................................................................
3063 Feb 24 16:25:10 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2396) was formed.
3064 Feb 24 16:25:10 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
3065 Feb 24 16:25:11 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
3066 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2400) was formed.
3067 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2
3068 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3069 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3070 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3071 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3072 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3073 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3074 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3075 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3076 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3077 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3078 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3079 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3080 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3081 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3082 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3083 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3084 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3085 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3086 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3087 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3088 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3089 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3090 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3091 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3092 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3093 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3094 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3095 Feb 24 16:25:11 dev42 corosync[10654]:   [TOTEM ] Retransmit List: 2 3
3096 Feb 24 16:25:12 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
3097 Feb 24 16:25:12 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2404) was formed.
3098 Feb 24 16:25:12 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
3099 Feb 24 16:25:14 dev42 corosync[10654]:   [QUORUM] Members[2]: 1379903680 1396680896
3100 Feb 24 16:25:14 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2408) was formed.
3101 Feb 24 16:25:14 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
3102 Feb 24 16:25:14 dev42 attrd[10692]:   notice: attrd_local_callback: Sending full refresh (origin=crmd)
3103 Feb 24 16:25:14 dev42 attrd[10692]:   notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
3104 Feb 24 16:25:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3105 Feb 24 16:25:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 1: /var/lib/pacemaker/pengine/pe-input-11.bz2
3106 Feb 24 16:25:15 dev42 crmd[10694]:   notice: run_graph: Transition 1 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-11.bz2): Complete
3107 Feb 24 16:25:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3108 Feb 24 16:25:16 dev42 corosync[10654]:   [QUORUM] This node is within the non-primary component and will NOT provide any services.
3109 Feb 24 16:25:16 dev42 corosync[10654]:   [QUORUM] Members[1]: 1379903680
3110 Feb 24 16:25:16 dev42 crmd[10694]:   notice: pcmk_quorum_notification: Membership 2412: quorum lost (1)
3111 Feb 24 16:25:16 dev42 crmd[10694]:   notice: corosync_mark_unseen_peer_dead: Node 1396680896/dev43 was not seen in the previous transition
3112 Feb 24 16:25:16 dev42 crmd[10694]:   notice: crm_update_peer_state: corosync_mark_unseen_peer_dead: Node dev43[1396680896] - state is now lost
3113 Feb 24 16:25:16 dev42 corosync[10654]:   [QUORUM] Members[1]: 1379903680
3114 Feb 24 16:25:16 dev42 corosync[10654]:   [TOTEM ] A processor joined or left the membership and a new membership (192.168.63.82:2412) was formed.
3115 Feb 24 16:25:16 dev42 corosync[10654]:   [MAIN  ] Completed service synchronization, ready to provide service.
3116 Feb 24 16:40:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
3117 Feb 24 16:40:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3118 Feb 24 16:40:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 2: /var/lib/pacemaker/pengine/pe-input-12.bz2
3119 Feb 24 16:40:15 dev42 crmd[10694]:   notice: run_graph: Transition 2 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-12.bz2): Complete
3120 Feb 24 16:40:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3121 Feb 24 16:55:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
3122 Feb 24 16:55:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3123 Feb 24 16:55:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 3: (null)
3124 Feb 24 16:55:15 dev42 crmd[10694]:   notice: run_graph: Transition 3 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
3125 Feb 24 16:55:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3126 Feb 24 17:10:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
3127 Feb 24 17:10:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3128 Feb 24 17:10:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 4: (null)
3129 Feb 24 17:10:15 dev42 crmd[10694]:   notice: run_graph: Transition 4 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
3130 Feb 24 17:10:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3131 Feb 24 17:25:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
3132 Feb 24 17:25:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3133 Feb 24 17:25:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 5: (null)
3134 Feb 24 17:25:15 dev42 crmd[10694]:   notice: run_graph: Transition 5 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
3135 Feb 24 17:25:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3136 Feb 24 17:40:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
3137 Feb 24 17:40:15 dev42 pengine[10693]:   notice: unpack_config: On loss of CCM Quorum: Ignore
3138 Feb 24 17:40:15 dev42 pengine[10693]:   notice: process_pe_message: Calculated Transition 6: (null)
3139 Feb 24 17:40:15 dev42 crmd[10694]:   notice: run_graph: Transition 6 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=unknown): Complete
3140 Feb 24 17:40:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
3141 Feb 24 17:55:15 dev42 crmd[10694]:   notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]

Is this a bug?

Thanks.

LIBTOOL path problem at autogen.sh

Hi,

I'm trying to compile corosync from svn but it complains

common_lib/Makefile.am:34: Libtool library used but `LIBTOOL' is undefined
common_lib/Makefile.am:34: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
common_lib/Makefile.am:34: to `configure.ac' and run `aclocal' and `autoconf' again.
common_lib/Makefile.am:34: If `AC_PROG_LIBTOOL' is in `configure.ac', make sure
common_lib/Makefile.am:34: its definition is in aclocal's search path.
exec/Makefile.am:49: Libtool library used but `LIBTOOL' is undefined
exec/Makefile.am:49: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
exec/Makefile.am:49: to `configure.ac' and run `aclocal' and `autoconf' again.
exec/Makefile.am:49: If `AC_PROG_LIBTOOL' is in `configure.ac', make sure
exec/Makefile.am:49: its definition is in aclocal's search path.
lib/Makefile.am:57: Libtool library used but `LIBTOOL' is undefined
lib/Makefile.am:57: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
lib/Makefile.am:57: to `configure.ac' and run `aclocal' and `autoconf' again.
lib/Makefile.am:57: If `AC_PROG_LIBTOOL' is in `configure.ac', make sure
lib/Makefile.am:57: its definition is in aclocal's search path.

However, AC_PROG_LIBTOOL is in configure.ac...

Thanks

assert in totemsrp.c:25??: orf_token_rtr: Assertion `range < 16384' failed

Hi,

I use corosync 1.4.1.
I get the following assertion failure on the first node of a two-node testbed when the second node starts up:

corosync: totemsrp.c:2557: orf_token_rtr: Assertion `range < 16384' failed

The following code chunk looks strange:

#define RETRANS_MESSAGE_QUEUE_SIZE_MAX 16384 /* allow 500 messages to be queued */

Which is the right value, 16384 or 500?

Could it be the reason?

Built on OS X 10.8.2 (Mountain Lion)

I have at least successfully compiled it on OS X 10.8.2 with gcc 4.2.1 (freshly checked out from git about half an hour ago).

Prerequisites: install MacPorts, install autoconf and automake via MacPorts, install nss via MacPorts

  1. Modify configure.ac, line 615: (change --as-needed to -dead_strip_dylibs)

LDFLAGS="$ENV_LDFLAGS $lt_prog_compiler_pic $SEC_LDFLAGS -Wl,--as-needed $COVERAGE_LDFLAGS"

to

LDFLAGS="$ENV_LDFLAGS $lt_prog_compiler_pic $SEC_LDFLAGS -Wl,-dead_strip_dylibs $COVERAGE_LDFLAGS"

  2. Modify lib/Makefile.am, remove the use of -version-script by deleting line 52:

    -Wl,-version-script=$(srcdir)/lib$(call get_libname,$<).versions \

I am not familiar enough with autoconf/automake to make this change cross-platform friendly and submit a proper patch.

Install corosync error

The steps I followed are:

1. $ git clone git://github.com/corosync/corosync.git
2. $ cd corosync
3. $ git checkout -b flatiron origin/flatiron
4. $ ./autogen.sh
5. $ ./configure --enable-nss
6. $ sudo make install

After step 6, I got the following errors.

Built shared libs
make[2]: Leaving directory `/home/jiawei/test/corosync/lib'
Making all in exec
make[2]: Entering directory `/home/jiawei/test/corosync/exec'
gcc -fPIC -L/usr/lib -L/lib -L/usr/lib64 -O3 -ggdb3 -Wall -Wshadow -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wdeclaration-after-statement -Wpointer-arith -Wwrite-strings -Wcast-align -Wbad-function-cast -Wmissing-format-attribute -Wformat=2 -Wformat-security -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -rdynamic -L./ -o corosync main.o util.o sync.o apidef.o service.o timer.o totemconfig.o mainconfig.o quorum.o schedwrk.o lcr_ifact.o evil.o syncv2.o -ltotem_pg -llogsys -lcoroipcs -lrt -lpthread -ldl
main.o: In function `corosync_sync_completed':
/home/jiawei/test/corosync/exec/main.c:279: undefined reference to `totempg_trans_ack'
main.o: In function `main':
/home/jiawei/test/corosync/exec/main.c:1795: undefined reference to `poll_low_fds_event_set'
service.o: In function `service_unlink_schedwrk_handler':
/home/jiawei/test/corosync/exec/service.c:573: undefined reference to `coroipcs_ipc_service_exit'
totemconfig.o: In function `totem_config_validate':
/home/jiawei/test/corosync/exec/totemconfig.c:520: undefined reference to `totemip_is_mcast'
collect2: ld returned 1 exit status
make[2]: *** [corosync] Error 1
make[2]: Leaving directory `/home/jiawei/test/corosync/exec'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/jiawei/test/corosync'
make: *** [all] Error 2

Could anyone give me some suggestions? Thanks in advance ;-)

Modify totemsrp to allow dynamic definitions of the ring

to allow a larger number of redundant rings than 2.

There is ongoing work on integrating kronosnet. This will allow having 64 interfaces together with asymmetric configurations (one node having 3 links, a second having 2 links, ...).

autogen.sh fails with latest versions of automake & autoconf

Trying to build corosync on an Ubuntu 14.04.1 LTS system
(with autoconf version == 2.69 (latest release), and either
automake version == 1.14.1 (latest release) OR
automake version == 1.10.3 (ubuntu default)
) fails with the same error for both automake versions:
$ ./autogen.sh
Building configuration system...
main::scan_file() called too early to check prototype at /usr/bin/aclocal line 616.
configure.ac:18: warning: macro `AM_SILENT_RULES' not found in library
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
main::scan_file() called too early to check prototype at /usr/bin/aclocal line 616.
configure.ac:18: warning: macro `AM_SILENT_RULES' not found in library
configure.ac:18: error: possibly undefined macro: AM_SILENT_RULES
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
autoreconf: /usr/bin/autoconf failed with exit status: 1

Please either:
o) include a generated ./configure in the GIT repository
o) ship a working version of autoconf & automake in the GIT
o) specify exactly which versions of both autoconf and automake (when installed together) work for
you in the INSTALL file.

I tried installing autoconf 2.61 and it also fails with the same error with both automake-1.10.3 and automake-1.14.1:

$ ./autogen.sh
Building configuration system...
main::scan_file() called too early to check prototype at /usr/bin/aclocal line 616.
configure.ac:18: warning: macro `AM_SILENT_RULES' not found in library
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
main::scan_file() called too early to check prototype at /usr/bin/aclocal line 616.
configure.ac:18: warning: macro `AM_SILENT_RULES' not found in library
configure.ac:18: error: possibly undefined macro: AM_SILENT_RULES
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.
autoreconf: /usr/bin/autoconf failed with exit status: 1

Symbols missing in libsam4 after v1.99.9

Hello, I was generating symbols files for the latest Debian packages and I discovered that symbols added in v1.99.9 went missing in v2.3.0:

rik@devrikx:~/Documents/mmod/projects/debian/corosync/ubinaries$ dpkg-gensymbols -v2.3.0 -plibsam4 -Plibsam4_2.3.0-1_amd64/ -Olibsam4.symbols
dpkg-gensymbols: warning: some symbols or patterns disappeared in the symbols file: see diff output below
dpkg-gensymbols: warning: libsam4.symbols doesn't match completely libsam4.symbols
--- libsam4.symbols (libsam4_2.3.0_amd64)
+++ dpkg-gensymbolsh5H7Us   2015-04-24 02:11:23.045214612 -0400
@@ -1,43 +1,43 @@
 libsam.so.4 libsam4 #MINVER#
  COROSYNC_SAM_1.0@COROSYNC_SAM_1.0 1.2.0
- cmap_context_get@Base 1.99.9
- cmap_context_set@Base 1.99.9
- cmap_dec@Base 1.99.9
- cmap_delete@Base 1.99.9
- cmap_dispatch@Base 1.99.9
- cmap_fd_get@Base 1.99.9
- cmap_finalize@Base 1.99.9
- cmap_get@Base 1.99.9
- cmap_get_double@Base 1.99.9
- cmap_get_float@Base 1.99.9
- cmap_get_int16@Base 1.99.9
- cmap_get_int32@Base 1.99.9
- cmap_get_int64@Base 1.99.9
- cmap_get_int8@Base 1.99.9
- cmap_get_string@Base 1.99.9
- cmap_get_uint16@Base 1.99.9
- cmap_get_uint32@Base 1.99.9
- cmap_get_uint64@Base 1.99.9
- cmap_get_uint8@Base 1.99.9
- cmap_inc@Base 1.99.9
- cmap_initialize@Base 1.99.9
- cmap_iter_finalize@Base 1.99.9
- cmap_iter_init@Base 1.99.9
- cmap_iter_next@Base 1.99.9
- cmap_set@Base 1.99.9
- cmap_set_double@Base 1.99.9
- cmap_set_float@Base 1.99.9
- cmap_set_int16@Base 1.99.9
- cmap_set_int32@Base 1.99.9
- cmap_set_int64@Base 1.99.9
- cmap_set_int8@Base 1.99.9
- cmap_set_string@Base 1.99.9
- cmap_set_uint16@Base 1.99.9
- cmap_set_uint32@Base 1.99.9
- cmap_set_uint64@Base 1.99.9
- cmap_set_uint8@Base 1.99.9
- cmap_track_add@Base 1.99.9
- cmap_track_delete@Base 1.99.9
+#MISSING: 2.3.0# cmap_context_get@Base 1.99.9
+#MISSING: 2.3.0# cmap_context_set@Base 1.99.9
+#MISSING: 2.3.0# cmap_dec@Base 1.99.9
+#MISSING: 2.3.0# cmap_delete@Base 1.99.9
+#MISSING: 2.3.0# cmap_dispatch@Base 1.99.9
+#MISSING: 2.3.0# cmap_fd_get@Base 1.99.9
+#MISSING: 2.3.0# cmap_finalize@Base 1.99.9
+#MISSING: 2.3.0# cmap_get@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_double@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_float@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_int16@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_int32@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_int64@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_int8@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_string@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_uint16@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_uint32@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_uint64@Base 1.99.9
+#MISSING: 2.3.0# cmap_get_uint8@Base 1.99.9
+#MISSING: 2.3.0# cmap_inc@Base 1.99.9
+#MISSING: 2.3.0# cmap_initialize@Base 1.99.9
+#MISSING: 2.3.0# cmap_iter_finalize@Base 1.99.9
+#MISSING: 2.3.0# cmap_iter_init@Base 1.99.9
+#MISSING: 2.3.0# cmap_iter_next@Base 1.99.9
+#MISSING: 2.3.0# cmap_set@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_double@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_float@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_int16@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_int32@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_int64@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_int8@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_string@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_uint16@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_uint32@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_uint64@Base 1.99.9
+#MISSING: 2.3.0# cmap_set_uint8@Base 1.99.9
+#MISSING: 2.3.0# cmap_track_add@Base 1.99.9
+#MISSING: 2.3.0# cmap_track_delete@Base 1.99.9
  sam_data_getsize@Base 1.99.9
  sam_data_restore@Base 1.99.9
  sam_data_store@Base 1.99.9

Would this warrant a SONAME bump?

Proper support for DNS

Corosync may have problems if DNS is set up so that one entry maps to multiple IPs. The idea is to always return one IP.
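
For illustration only, here is a minimal sketch of what "always return one IP" could look like, using getaddrinfo() and keeping just the first result; this is not the corosync resolver code and the function name is invented for the example.

/* Illustrative sketch only: resolve a hostname and keep only the first
 * address returned by DNS.  Not the actual corosync totemip code. */
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>

int resolve_first_ip(const char *name, char *buf, size_t buflen)
{
	struct addrinfo hints, *res;
	int ret;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
	hints.ai_socktype = SOCK_DGRAM;

	ret = getaddrinfo(name, NULL, &hints, &res);
	if (ret != 0 || res == NULL) {
		return (-1);
	}

	/* ignore any additional entries the resolver may have returned */
	ret = getnameinfo(res->ai_addr, res->ai_addrlen,
	                  buf, buflen, NULL, 0, NI_NUMERICHOST);
	freeaddrinfo(res);
	return (ret == 0 ? 0 : -1);
}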

Statgrab 0.9 changed API

Since 0.9, the statgrab API has changed, making the mon.c service uncompilable.

See libstatgrab/libstatgrab@01fc0ca. Error produced during compilation:

mon.c: In function ‘percent_mem_used_get’:
mon.c:310:2: error: too few arguments to function ‘sg_get_mem_stats’
  mem_stats = sg_get_mem_stats();
  ^
In file included from mon.c:38:0:
/usr/include/statgrab.h:227:27: note: declared here
 __sg_public sg_mem_stats *sg_get_mem_stats(size_t *entries);
                           ^
mon.c:311:2: error: too few arguments to function ‘sg_get_swap_stats’
  swap_stats = sg_get_swap_stats();
  ^
In file included from mon.c:38:0:
/usr/include/statgrab.h:264:28: note: declared here
 __sg_public sg_swap_stats *sg_get_swap_stats(size_t *entries);
                            ^
mon.c: In function ‘min15_loadavg_get’:
mon.c:351:2: error: too few arguments to function ‘sg_get_load_stats’
  load_stats = sg_get_load_stats ();
  ^
In file included from mon.c:38:0:
/usr/include/statgrab.h:238:28: note: declared here
 __sg_public sg_load_stats *sg_get_load_stats(size_t *entries);
                            ^
mon.c: In function ‘mon_exec_init_fn’:
mon.c:466:2: error: too few arguments to function ‘sg_init’
  sg_init();
  ^
In file included from mon.c:38:0:
/usr/include/statgrab.h:130:22: note: declared here
 __sg_public sg_error sg_init(int ignore_init_errors);
                      ^
make[3]: *** [corosync-mon.o] Error 1
make[3]: Leaving directory `/root/cluster-2014-02-21/corosync/corosync-2.3.3/exec'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/root/cluster-2014-02-21/corosync/corosync-2.3.3'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/root/cluster-2014-02-21/corosync/corosync-2.3.3'
make: *** [build-stamp] Error 2
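
For reference, here is a minimal sketch of how the affected calls might be adapted to the new prototypes quoted in the compiler output above; it is illustrative only, not the actual mon.c patch, and the helper names are invented.

/* Illustrative sketch only, based on the statgrab prototypes shown in
 * the error output above; not the actual mon.c fix. */
#include <stddef.h>
#include <statgrab.h>

static double percent_mem_used(void)
{
	sg_mem_stats *mem_stats;
	size_t entries;

	/* new API: pass a size_t * that receives the number of entries */
	mem_stats = sg_get_mem_stats(&entries);
	if (mem_stats == NULL || entries == 0) {
		return (0.0);
	}
	return ((double)mem_stats->used / (double)mem_stats->total * 100.0);
}

static int statgrab_setup(void)
{
	/* new API: sg_init() takes a flag (non-zero = ignore init errors) */
	return (sg_init(1) == SG_ERROR_NONE ? 0 : -1);
}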

libtotem cleanup/rewrite

  • Rewrite the totem fragmentation layer
  • Rewrite the top-level totempg interface
  • Split the fragmentation layer out in totem (i.e. totempg talks to totemfrg, which talks to totemsrp)
  • Add getopt and setopt features to the top-level interface to allow runtime configuration of the interface

corosync receives SIGKILL and is killed (kill -9) with a lot of memory occupied

corosync receives SIGKILL and is killed after it occupies a lot of memory.
Test steps to reproduce the problem:

  1. Create a 2-node cluster, n1 and n2.
  2. Start corosync with the command '/usr/sbin/corosync -f' on n1 and run 'testcpg cpg_bm' on n1.
  3. On n2, run 'cpgbench' and wait for it.
  4. Check n1: wait a while and use the 'top' tool to see that corosync's memory usage keeps growing until corosync is killed by the OS.
    from /var/log/messages:
    2014-08-05T16:59:24.745528+08:00 zy-sles12rc1-1 kernel: [85766.794825] testcpg invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0

2014-08-05T16:59:24.754689+08:00 zy-sles12rc1-1 kernel: [85766.795104] Out of memory: Kill process 17031 (corosync) score 316 or sacrifice child

2014-08-05T16:59:24.754690+08:00 zy-sles12rc1-1 kernel: [85766.795106] Killed process 17031 (corosync) total-vm:898872kB, anon-rss:775220kB, file-rss:27296kB

Changelog

I would like to know what has changed (or what is new) between the releases. The wiki page
https://github.com/corosync/corosync/wiki/Releases does not say anything about this. Why should I want to use version 2.3? Are there new features? There is also no release note mail on the discuss mailing list.

Best Regards

Sven Anders

IPv6 compat: cannot join cluster with mcast and ipv6

Using corosync 1.4.2-2ubuntu0.2 on Ubuntu 12.04 (amd64), but the same issue showed up in Debian testing with 1.4.2-3 on armhf (running on an rpi with raspbian).
Config has two nodes on the same LAN with totem using multicast on an IPv6 network. Symptoms are that only the first node to start up can join the cluster; the second node is "stuck" with repeating totem "Retransmit List" messages in the corosync log when corosync is started.

On digging into packet captures, I determined that corosync is creating >1500 byte UDP packets, so the network layer has to fragment them and things go south from there. Explicitly setting netmtu to 1500 bytes in the totem section of corosync.conf does not resolve the issue, but adjusting netmtu down to e.g. 1480 bytes does fix it.

My gut says that the IPv6 packet overhead is being miscalculated, resulting in corosync generating >1500 byte UDP packets when IPv6 is used.
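
(One plausible accounting, not stated in the original report: the fixed IPv6 header is 40 bytes versus 20 bytes for IPv4, so a frame size budgeted against IPv4 overhead overshoots a 1500-byte link MTU by roughly 20 bytes once IPv6 is in use; 1500 - 20 = 1480, which matches the netmtu value that works in the config below.)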

Totem excerpt from working corosync config:

totem {
    version: 2
    secauth: off
    threads: 0
    nodeid: 200
    interface {
        ringnumber: 0
        bindnetaddr: fd1d:59a9:d198:dada::200
        mcastaddr: ff02::dada:1
        mcastport: 5405
        ttl: 1
    }
    netmtu: 1480
}

"Retransmit List" messages excerpted from corosync log:

Aug 05 20:29:10 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:10 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:10 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:10 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:10 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:11 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:12 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:12 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:12 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:12 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:12 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:13 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:13 corosync [TOTEM ] Retransmit List: 18
Aug 05 20:29:13 corosync [TOTEM ] Retransmit List: 18

permissions for ringid_127.0.0.1 are unsafe

exec/main.c ll. 785-788 create the ringid_$totemip_print(addr) file with 0777 permissions in the corosync_ring_id_store function.

The "corosync_ring_id_create_or_load" function uses the safer 0700 permissions.

Allowing any user/process to write to the /var/lib/corosync/ringid_127.0.0.1 file means that the system status could be compromised.
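
As a point of comparison only (not the actual exec/main.c code), creating the ring id file with a restrictive mode would look roughly like this; 0600 is used purely as an example of a safe value and the function name is invented.

/* Illustrative sketch only: write the ring id file with restrictive
 * permissions instead of 0777.  Not the actual corosync code. */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

static int write_ring_id_safely(const char *path, const void *buf, size_t len)
{
	int fd;
	ssize_t written;

	/* 0600: only the owner (the user corosync runs as) may read/write */
	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
	if (fd == -1) {
		return (-1);
	}

	written = write(fd, buf, len);
	close(fd);

	return (written == (ssize_t)len ? 0 : -1);
}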

Makefile installs into /usr/lib64

Hello,

corosync 2.3.4 cannot be packaged for Arch, because it installs the libraries into /usr/lib64, which is a symlink to /usr/lib. This prevents installation of the built package, as the directory /usr/lib64 exists in the package but is a symlink on the physical system. I advise changing the configure script to install the libraries into a configurable directory. Running the configure script with --libdir=/usr/lib does not help, as the Makefile actively refuses to install into /usr/lib:

make[3]: Entering directory '/home/thermi/corosync/src/corosync-2.3.4/lib'
test -z "/usr/lib" || /usr/bin/mkdir -p "/home/thermi/corosync/pkg/corosync/usr/lib"
 /bin/sh ../libtool   --mode=install /usr/bin/install -c   libcpg.la libquorum.la libcfg.la libvotequorum.la libcmap.la libsam.la '/home/thermi/corosync/pkg/corosync/usr/lib'
libtool: install: error: cannot install `libcpg.la' to a directory not ending in /usr/lib64
Makefile:394: recipe for target 'install-libLTLIBRARIES' failed
make[3]: *** [install-libLTLIBRARIES] Error 1
make[3]: Leaving directory '/home/thermi/corosync/src/corosync-2.3.4/lib'
Makefile:575: recipe for target 'install-am' failed
make[2]: *** [install-am] Error 2
make[2]: Leaving directory '/home/thermi/corosync/src/corosync-2.3.4/lib'
Makefile:459: recipe for target 'install-recursive' failed
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory '/home/thermi/corosync/src/corosync-2.3.4'
Makefile:869: recipe for target 'install' failed
make: *** [install] Error 2
==> ERROR: A failure occurred in package().
    Aborting...

Kind regards,
Thermi

Corosync-2.3.4 Failing to Run with Asterisk issues

I was trying to implement clustering of servers in order to allow SIP client state exchange between two different IP connections. I used the tar.gz source code and compiled and installed it directly, generated an authkey that was shared between the two computers, and stored it in the correct folders. Running sudo service corosync start seemed to work. But when I ran Asterisk (the trunk version specifically), a segmentation fault occurred because cpg_dispatch did not receive the required data due to errors. I am not entirely sure whether this was my fault (I am an intern at Digium trying to learn the ropes), but I believe I followed the correct instructions.

Here's the backtrace

[New LWP 24409]
[New LWP 24342]
[New LWP 24343]
[New LWP 24339]
[New LWP 24350]
[New LWP 24360]
[New LWP 24344]
[New LWP 24359]
[New LWP 24362]
[New LWP 24346]
[New LWP 24361]
[New LWP 24375]
[New LWP 24347]
[New LWP 24364]
[New LWP 24348]
[New LWP 24386]
[New LWP 24365]
[New LWP 24349]
[New LWP 24397]
[New LWP 24390]
[New LWP 24351]
[New LWP 24403]
[New LWP 24393]
[New LWP 24336]
[New LWP 24352]
[New LWP 24404]
[New LWP 24353]
[New LWP 24407]
[New LWP 24356]
[New LWP 24357]
[New LWP 24408]
[New LWP 24406]
[New LWP 24363]
[New LWP 24402]
[New LWP 24366]
[New LWP 24401]
[New LWP 24367]
[New LWP 24400]
[New LWP 24399]
[New LWP 24396]
[New LWP 24395]
[New LWP 24394]
[New LWP 24389]
[New LWP 24388]
[New LWP 24392]
[New LWP 24369]
[New LWP 24358]
[New LWP 24355]
[New LWP 24387]
[New LWP 24385]
[New LWP 24354]
[New LWP 24384]
[New LWP 24345]
[New LWP 24383]
[New LWP 24382]
[New LWP 24341]
[New LWP 24381]
[New LWP 24380]
[New LWP 24340]
[New LWP 24379]
[New LWP 24338]
[New LWP 24378]
[New LWP 24377]
[New LWP 24337]
[New LWP 24376]
[New LWP 24374]
[New LWP 24391]
[New LWP 24373]
[New LWP 24371]
[New LWP 24372]
[New LWP 24370]
[New LWP 24405]
[New LWP 24368]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `asterisk -cvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  hdb_handle_get (instance=0x7fcc49d587e0, handle_in=<optimized out>, handle_database=0x7fcc6f3c3060 <cpg_handle_t_db>) at ../include/corosync/hdb.h:110
110     return (qb_hdb_handle_get (handle_database, handle_in, instance));
#0  hdb_handle_get (instance=0x7fcc49d587e0, handle_in=<optimized out>, handle_database=0x7fcc6f3c3060 <cpg_handle_t_db>) at ../include/corosync/hdb.h:110
No locals.
#1  cpg_dispatch (handle=465886707598753792, dispatch_types=CS_DISPATCH_ALL) at cpg.c:357
        timeout = -1
        error = <optimized out>
        cpg_inst = <error reading variable cpg_inst (Cannot access memory at address 0x7fcc49d587e0)>
        res_cpg_deliver_callback = <optimized out>
        res_cpg_totem_confchg_callback = <optimized out>
        cpg_inst_copy = <error reading variable cpg_inst_copy (Cannot access memory at address 0x7fcc49d587f0)>
        dispatch_data = <optimized out>
        member_list = <error reading variable member_list (Cannot access memory at address 0x7fcc49d58a40)>
        left_list = <error reading variable left_list (Cannot access memory at address 0x7fcc49d59040)>
        joined_list = <error reading variable joined_list (Cannot access memory at address 0x7fcc49d59640)>
        group_name = <error reading variable group_name (Cannot access memory at address 0x7fcc49d59c40)>
        left_list_start = <optimized out>
        joined_list_start = <optimized out>
        i = <optimized out>
        totem_member_list = <error reading variable totem_member_list (Cannot access memory at address 0x7fcc49d58840)>
        errno_res = <optimized out>
        dispatch_buf = <error reading variable dispatch_buf (Cannot access memory at address 0x7fcc49d59cd0)>
#2  0x00007fcc6f3c73e7 in dispatch_thread_handler (data=0x0) at res_corosync.c:482
        res = 1
        cs_err = CS_OK
        pfd = {{fd = 23, events = 1, revents = 1}, {fd = 22, events = 1, revents = 0}, {fd = 24, events = 1, revents = 0}}
        __PRETTY_FUNCTION__ = "dispatch_thread_handler"
#3  0x00000000005f032b in dummy_start (data=0x33a5fd0) at utils.c:1232
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {140515389843200, -811737713186051750, 0, 0, 140515389843904, 140515389843200, -811737713177663142, 802964189102860634}, __mask_was_saved = 0}}, __pad = {0x7fcc49e59ef0, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x44f7c8 <ast_unregister_thread>
        __cancel_arg = 0x7fcc49e5a700
        __not_first_call = 0
        ret = 0x0
        a = {start_routine = 0x7fcc6f3c71ed <dispatch_thread_handler>, data = 0x0, name = 0x339d980 "dispatch_thread_handler started at [  911] res_corosync.c load_module()"}
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc49e5a700) at pthread_create.c:312
        __res = <optimized out>
        pd = 0x7fcc49e5a700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140515389843200, 802963527754180954, 0, 0, 140515389843904, 140515389843200, -811737713188148902, -811408581869180582}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
No locals.

Thread 73 (Thread 0x7fcca523e700 (LWP 24368)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x7fcca6a90f74 "res_calendar.c", lineno=1817, func=0x7fcca6a925cb <__PRETTY_FUNCTION__.15775> "do_refresh", cond_name=0x7fcca6a91692 "&refresh_condition", mutex_name=0x7fcca6a9155d "&refreshlock", cond=0x7fcca6c94c40 <refresh_condition>, t=0x7fcca6c94c00 <refreshlock>, abstime=0x7fcca523ddf0) at lock.c:666
#2  0x00007fcca6a90a26 in do_refresh (data=0x0) at res_calendar.c:1817
#3  0x00000000005f032b in dummy_start (data=0x30448f0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca523e700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 72 (Thread 0x7fcc4ae58700 (LWP 24405)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x3326300) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2dda428) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2f0ffa0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4ae58700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 71 (Thread 0x7fcca40ec700 (LWP 24370)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc58d94134 "res_timing_pthread.c", lineno=422, func=0x7fcc58d945ac <__PRETTY_FUNCTION__.9663> "do_timing", cond_name=0x7fcc58d94174 "&timing_thread.cond", mutex_name=0x7fcc58d94160 "&timing_thread.lock", cond=0x7fcc58f95300 <timing_thread+64>, t=0x7fcc58f952c8 <timing_thread+8>) at lock.c:598
#2  0x00007fcc58d93e3c in do_timing (arg=0x0) at res_timing_pthread.c:422
#3  0x00000000005f032b in dummy_start (data=0x3110670) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca40ec700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 70 (Thread 0x7fcc4c839700 (LWP 24372)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x3125668, t=0x3125c10) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x3125c10) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x3123d00) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c839700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 69 (Thread 0x7fcc4d03a700 (LWP 24371)):
#0  0x00007fccfeebd9bd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fccfeeeef24 in usleep (useconds=<optimized out>) at ../sysdeps/unix/sysv/linux/usleep.c:32
#2  0x00007fccfd4c3805 in pj_thread_sleep () from /usr/lib/libpj.so.2
#3  0x00007fccfd4c2f19 in pj_ioqueue_poll () from /usr/lib/libpj.so.2
#4  0x00007fcc6a59f1c5 in timer_worker_thread (data=<optimized out>) at res_rtp_asterisk.c:1714
#5  0x00007fccfd4c3f96 in thread_main () from /usr/lib/libpj.so.2
#6  0x00007fccff91b182 in start_thread (arg=0x7fcc4d03a700) at pthread_create.c:312
#7  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 68 (Thread 0x7fcc4c7bd700 (LWP 24373)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x3236e20) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x3236ea8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x312aea0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c7bd700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 67 (Thread 0x7fcc4bcf9700 (LWP 24391)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000052c3a6 in ast_io_wait (ioc=0x325ae20, howlong=1000) at io.c:279
#2  0x00007fcc6ded1495 in do_monitor (data=0x0) at chan_sip.c:28802
#3  0x00000000005f032b in dummy_start (data=0x326ed90) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4bcf9700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 66 (Thread 0x7fcc4c741700 (LWP 24374)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x3237ab8, t=0x32371b0) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x32371b0) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x31299e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c741700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 65 (Thread 0x7fcc4c649700 (LWP 24376)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x3243a30, t=0x32439f8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x32428c0) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x3240b90) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c649700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 64 (Thread 0x7fcd015d5700 (LWP 24337)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x22d4330) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x22d43b8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x22d4530) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcd015d5700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 63 (Thread 0x7fcc4c5cd700 (LWP 24377)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x3244cf0, t=0x3244cb8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x3243b80) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c5cd700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 62 (Thread 0x7fcc4c551700 (LWP 24378)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x3245fb0, t=0x3245f78) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x3244e40) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c551700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 61 (Thread 0x7fcd01559700 (LWP 24338)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2d46250) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2d555f8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2d42a80) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcd01559700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 60 (Thread 0x7fcc4c4d5700 (LWP 24379)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x3247270, t=0x3247238) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x3246100) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c4d5700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 59 (Thread 0x7fccfe797700 (LWP 24340)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf0000c80, t=0x7fccf0000cb0, abstime=0x7fccfe796dc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf0000c78) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf0000c78) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf0000d90) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccfe797700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 58 (Thread 0x7fcc4c459700 (LWP 24380)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x3248530, t=0x32484f8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x32473c0) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c459700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 57 (Thread 0x7fcc4c3dd700 (LWP 24381)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x32497f0, t=0x32497b8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x3248680) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c3dd700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 56 (Thread 0x7fccfe71b700 (LWP 24341)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x0000000000452468 in listener (unused=0x0) at asterisk.c:1582
#2  0x00000000005f032b in dummy_start (data=0x2d59fd0) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fccfe71b700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 55 (Thread 0x7fcc4c361700 (LWP 24382)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x324ac10, t=0x324abd8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x3249aa0) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c361700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 54 (Thread 0x7fcc4c2e5700 (LWP 24383)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x324c030, t=0x324bff8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x324aec0) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c2e5700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 53 (Thread 0x7fccfe52b700 (LWP 24345)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf00015d0, t=0x7fccf0001600, abstime=0x7fccfe52adc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf00015c8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf00015c8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccfe52b700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 52 (Thread 0x7fcc4c269700 (LWP 24384)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x324d450, t=0x324d418) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x324c2e0) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c269700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 51 (Thread 0x7fccfe0cf700 (LWP 24354)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2e29600) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2e6be28) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2e2aa10) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe0cf700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 50 (Thread 0x7fcc4c1ed700 (LWP 24385)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc68aad1f4 "chan_iax2.c", lineno=12093, func=0x7fcc68ab4f70 <__PRETTY_FUNCTION__.22457> "iax2_process_thread", cond_name=0x7fcc68aad551 "&thread->cond", mutex_name=0x7fcc68aad543 "&thread->lock", cond=0x324e8e0, t=0x324e8a8) at lock.c:598
#2  0x00007fcc68a90150 in iax2_process_thread (data=0x324d770) at chan_iax2.c:12093
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c1ed700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 49 (Thread 0x7fcc4bee9700 (LWP 24387)):
#0  0x00007fccff9226dd in accept () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcc67bdf762 in accept_thread (ignore=0x0) at chan_skinny.c:7636
#2  0x00000000005f032b in dummy_start (data=0x3250fc0) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fcc4bee9700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 48 (Thread 0x7fccfe053700 (LWP 24355)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x2e3c218, t=0x2e6bfe0) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x2e6bfe0) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x2e3b490) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe053700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 47 (Thread 0x7fccec18c700 (LWP 24358)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fcca0000950, t=0x7fcca0000980, abstime=0x7fccec18bdc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fcca0000948) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fcca0000948) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fcca0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccec18c700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 46 (Thread 0x7fcca4168700 (LWP 24369)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x30f91b0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x30f8f78) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x30f8830) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca4168700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 45 (Thread 0x7fcc4bc7d700 (LWP 24392)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x327fa28, t=0x3284a50) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x3284a50) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x3282120) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4bc7d700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 44 (Thread 0x7fcc4be6d700 (LWP 24388)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x32520c8, t=0x3250e30) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x3250e30) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x32516f0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4be6d700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 43 (Thread 0x7fcc4bdf1700 (LWP 24389)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcc54835cc6 in do_monitor (data=0x0) at chan_dahdi.c:11394
#2  0x00000000005f032b in dummy_start (data=0x3253c90) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fcc4bdf1700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 42 (Thread 0x7fcc4b77c700 (LWP 24394)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2eecc60) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2eb76f8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2eb3f40) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4b77c700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 41 (Thread 0x7fcc4b700700 (LWP 24395)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2eed240) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2eec2b8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2eb3d30) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4b700700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 40 (Thread 0x7fcc4b684700 (LWP 24396)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x7fcc62915dd4 "app_meetme.c", lineno=6804, func=0x7fcc6291a44f <__PRETTY_FUNCTION__.18229> "sla_thread", cond_name=0x7fcc62917525 "&sla.cond", mutex_name=0x7fcc6291751b "&sla.lock", cond=0x7fcc62b1ea88 <sla+8>, t=0x7fcc62b1eab8 <sla+56>) at lock.c:598
#2  0x00007fcc6290fea6 in sla_thread (data=0x0) at app_meetme.c:6804
#3  0x00000000005f032b in dummy_start (data=0x2ef2640) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4b684700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 39 (Thread 0x7fcc4b140700 (LWP 24399)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000052c3a6 in ast_io_wait (ioc=0x2f38920, howlong=1000) at io.c:279
#2  0x00007fcc4edd7ae8 in do_monitor (data=0x0) at chan_unistim.c:5880
#3  0x00000000005f032b in dummy_start (data=0x2f0fd90) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4b140700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 38 (Thread 0x7fcc4b0c4700 (LWP 24400)):
#0  0x00007fccff922b9d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcc5defab00 in lock_broker (unused=0x0) at func_lock.c:211
#2  0x00000000005f032b in dummy_start (data=0x2f10dd0) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fcc4b0c4700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 37 (Thread 0x7fcca52ba700 (LWP 24367)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x3031a40) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x30319c8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x3031950) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca52ba700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 36 (Thread 0x7fcc4b048700 (LWP 24401)):
#0  0x00007fccff922b9d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcc4f3f0f3a in cleanup (unused=0x0) at pbx_realtime.c:134
#2  0x00000000005f032b in dummy_start (data=0x2f11550) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fcc4b048700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 35 (Thread 0x7fcca5336700 (LWP 24366)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x2febe18, t=0x2feb3f0) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x2feb3f0) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x3002e20) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca5336700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 34 (Thread 0x7fcc4afcc700 (LWP 24402)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000052c3a6 in ast_io_wait (ioc=0x32dc5e0, howlong=1000) at io.c:279
#2  0x00007fcc51ac38cd in network_thread (ignore=0x0) at pbx_dundi.c:2191
#3  0x00000000005f032b in dummy_start (data=0x2f39700) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4afcc700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 33 (Thread 0x7fcca5c2f700 (LWP 24363)):
#0  0x00007fccfeeee823 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fccfd4c5e24 in pj_sock_select () from /usr/lib/libpj.so.2
#2  0x00007fccfd4c2cbd in pj_ioqueue_poll () from /usr/lib/libpj.so.2
#3  0x00007fccfd6f76c8 in pjsip_endpt_handle_events2 () from /usr/lib/libpjsip.so.2
#4  0x00007fcca665e408 in monitor_thread_exec (endpt=<optimized out>) at res_pjsip.c:2949
#5  0x00007fccfd4c3f96 in thread_main () from /usr/lib/libpj.so.2
#6  0x00007fccff91b182 in start_thread (arg=0x7fcca5c2f700) at pthread_create.c:312
#7  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 32 (Thread 0x7fcc4addc700 (LWP 24406)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x33817a0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x33809a8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x33551e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4addc700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 31 (Thread 0x7fcc4ace4700 (LWP 24408)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x33ad540) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x339aa48) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x33aaa10) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4ace4700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 30 (Thread 0x7fccfc0be700 (LWP 24357)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2f86f90) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2f8beb8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2f88190) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfc0be700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 29 (Thread 0x7fccfdfd7700 (LWP 24356)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2e7a4d0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2e7a558) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2e296f0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfdfd7700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 28 (Thread 0x7fcc4ad60700 (LWP 24407)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fcc56939bb4 in do_monitor (data=0x0) at chan_phone.c:1102
#2  0x00000000005f032b in dummy_start (data=0x33859f0) at utils.c:1232
#3  0x00007fccff91b182 in start_thread (arg=0x7fcc4ad60700) at pthread_create.c:312
#4  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 27 (Thread 0x7fccfe14b700 (LWP 24353)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2df9040) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2e05eb8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2dce550) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe14b700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 26 (Thread 0x7fcc4aed4700 (LWP 24404)):
#0  0x00007fccfeebd9bd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fccfeebd854 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007fcc51ac3acd in process_clearcache (ignore=0x0) at pbx_dundi.c:2232
#3  0x00000000005f032b in dummy_start (data=0x2f11a20) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4aed4700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 25 (Thread 0x7fccfe1c7700 (LWP 24352)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2dfc410) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2e01c48) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2dfa0d0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe1c7700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 24 (Thread 0x7fcd015d6780 (LWP 24336)):
#0  malloc_printerr (ptr=<optimized out>, str=<optimized out>, action=<optimized out>) at malloc.c:4992
#1  _int_malloc (av=av@entry=0x7fccff1bb760 <main_arena>, bytes=bytes@entry=109) at malloc.c:3447
#2  0x00007fccfee7fd4c in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3219
#3  0x00000000005ee75a in _ast_calloc (num=1, len=109, file=0x61d894 "config.c", lineno=1435, func=0x61e724 <__PRETTY_FUNCTION__.15636> "cfmtime_new") at /home/chris/asterisk_trunk/include/asterisk/utils.h:559
#4  0x00000000004da276 in cfmtime_new (filename=0x7fff2f2c8b60 "/etc/asterisk/res_corosync.conf", who_asked=0x7fcc6f3c8ab7 "res_corosync") at config.c:1434
#5  0x00000000004dbe4f in config_text_file_load (database=0x7fff2f2cace0 "\220\256,/[", table=0x7fff2f2cade0 "Ȯ,/\377\177", filename=0x7fcc6f3c97b0 <filename> "res_corosync.conf", cfg=0x33a69f0, flags=..., suggested_include_file=0x61d97f "", who_asked=0x7fcc6f3c8ab7 "res_corosync") at config.c:1982
#6  0x00000000004df4f4 in ast_config_internal_load (filename=0x7fcc6f3c97b0 <filename> "res_corosync.conf", cfg=0x33a69f0, flags=..., suggested_include_file=0x61d97f "", who_asked=0x7fcc6f3c8ab7 "res_corosync") at config.c:2935
#7  0x00000000004df5e1 in ast_config_load2 (filename=0x7fcc6f3c97b0 <filename> "res_corosync.conf", who_asked=0x7fcc6f3c8ab7 "res_corosync", flags=...) at config.c:2956
#8  0x00007fcc6f3c8216 in load_config (reload=0) at res_corosync.c:781
#9  0x00007fcc6f3c894c in load_module () at res_corosync.c:916
#10 0x0000000000532a9a in start_resource (mod=0x30735f0) at loader.c:980
#11 0x0000000000533449 in load_resource_list (load_order=0x7fff2f2cb2d0, global_symbols=0, mod_count=0x7fff2f2cb260) at loader.c:1178
#12 0x0000000000533a84 in load_modules (preload_only=0) at loader.c:1331
#13 0x000000000045b15e in main (argc=2, argv=0x7fff2f2cd5d8) at asterisk.c:4555

Thread 23 (Thread 0x7fcc4bc01700 (LWP 24393)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x2e700b8, t=0x2e6fbb0) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x2e6fbb0) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x2e4e550) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4bc01700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 22 (Thread 0x7fcc4af50700 (LWP 24403)):
#0  0x00007fccfeebd9bd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fccfeebd854 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00007fcc51ac3caf in process_precache (ign=0x0) at pbx_dundi.c:2269
#3  0x00000000005f032b in dummy_start (data=0x2f11970) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4af50700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 21 (Thread 0x7fccfe243700 (LWP 24351)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2dd63c0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2dde658) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2ddba90) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe243700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 20 (Thread 0x7fcc4bd75700 (LWP 24390)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x326d7e0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x326e808) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x326e940) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4bd75700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 19 (Thread 0x7fcc4b608700 (LWP 24397)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2ef0db0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2ef2ab8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2ef2bb0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4b608700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 18 (Thread 0x7fccfe33b700 (LWP 24349)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x6063b4 "bridge.c", lineno=4584, func=0x608950 <__PRETTY_FUNCTION__.18280> "bridge_manager_thread", cond_name=0x607849 "&manager->cond", mutex_name=0x607828 "ao2_object_get_lockaddr(manager)", cond=0x2d99958, t=0x2d99900) at lock.c:598
#2  0x0000000000472793 in bridge_manager_thread (data=0x2d99958) at bridge.c:4584
#3  0x00000000005f032b in dummy_start (data=0x2d999b0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe33b700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 17 (Thread 0x7fcca53b2700 (LWP 24365)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x643234 "sched.c", lineno=124, func=0x643510 <__PRETTY_FUNCTION__.12890> "sched_run", cond_name=0x643247 "&con->sched_thread->cond", mutex_name=0x64323c "&con->lock", cond=0x2fddfd8, t=0x2fde670) at lock.c:598
#2  0x00000000005b474f in sched_run (data=0x2fde670) at sched.c:124
#3  0x00000000005f032b in dummy_start (data=0x2fd1f80) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcca53b2700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 16 (Thread 0x7fcc4c171700 (LWP 24386)):
#0  0x00007fccfeee9bad in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0x000000000052c3a6 in ast_io_wait (ioc=0x3122a70, howlong=1000) at io.c:279
#2  0x00007fcc68a92bee in network_thread (ignore=0x0) at chan_iax2.c:12605
#3  0x00000000005f032b in dummy_start (data=0x32404e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c171700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 15 (Thread 0x7fccfe3b7700 (LWP 24348)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2d804b0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2d83a58) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2d84300) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe3b7700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 14 (Thread 0x7fcca542e700 (LWP 24364)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf8000950, t=0x7fccf8000980, abstime=0x7fcca542ddc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf8000948) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf8000948) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf8000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fcca542e700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 13 (Thread 0x7fccfe433700 (LWP 24347)):
#0  0x00007fccfeebd9bd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007fccfeebd854 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x00000000004f4ca5 in db_sync_thread (data=0x0) at db.c:979
#3  0x00000000005f032b in dummy_start (data=0x2d80110) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe433700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 12 (Thread 0x7fcc4c6c5700 (LWP 24375)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x323b5b0) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x323b3a8) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x323b3d0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc4c6c5700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 11 (Thread 0x7fcca7baa700 (LWP 24361)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fcca00012c0, t=0x7fcca00012f0, abstime=0x7fcca7ba9dc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fcca00012b8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fcca00012b8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fcca0000d90) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fcca7baa700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 10 (Thread 0x7fccfe4af700 (LWP 24346)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00000000005da5a7 in ast_sem_wait (sem=0x2d6c200) at /home/chris/asterisk_trunk/include/asterisk/sem.h:59
#2  0x00000000005da668 in default_tps_processing_function (data=0x2d6c288) at taskprocessor.c:177
#3  0x00000000005f032b in dummy_start (data=0x2d6c3e0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe4af700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 9 (Thread 0x7fcca7b2e700 (LWP 24362)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fcca00015d0, t=0x7fcca0001600, abstime=0x7fcca7b2ddc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fcca00015c8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fcca00015c8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fcca0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fcca7b2e700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 8 (Thread 0x7fccec110700 (LWP 24359)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fcca0000c80, t=0x7fcca0000cb0, abstime=0x7fccec10fdc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fcca0000c78) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fcca0000c78) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fcca0000d90) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccec110700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 7 (Thread 0x7fccfe5a7700 (LWP 24344)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x62c8d4 "logger.c", lineno=1331, func=0x62dae4 <__PRETTY_FUNCTION__.15336> "logger_thread", cond_name=0x62d695 "&logcond", mutex_name=0x62d687 "&logmsgs.lock", cond=0x8b7c00 <logcond>, t=0x89c790 <logmsgs+16>) at lock.c:598
#2  0x0000000000537f72 in logger_thread (data=0x0) at logger.c:1331
#3  0x00000000005f032b in dummy_start (data=0x2d59fd0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe5a7700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x7fccec094700 (LWP 24360)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fcca0000fb0, t=0x7fcca0000fe0, abstime=0x7fccec093dc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fcca0000fa8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fcca0000fa8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fcca0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccec094700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x7fccfe2bf700 (LWP 24350)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000005342d9 in __ast_cond_wait (filename=0x622474 "devicestate.c", lineno=546, func=0x6228e0 <__PRETTY_FUNCTION__.14802> "do_devstate_changes", cond_name=0x622727 "&change_pending", mutex_name=0x622737 "&state_changes.lock", cond=0x8b77e0 <change_pending>, t=0x89a550 <state_changes+16>) at lock.c:598
#2  0x00000000004f5c7e in do_devstate_changes (data=0x0) at devicestate.c:546
#3  0x00000000005f032b in dummy_start (data=0x2d9c660) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fccfe2bf700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 4 (Thread 0x7fcd014dd700 (LWP 24339)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf0000950, t=0x7fccf0000980, abstime=0x7fcd014dcdc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf0000948) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf0000948) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fcd014dd700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7fccfe623700 (LWP 24343)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf00012c0, t=0x7fccf00012f0, abstime=0x7fccfe622dc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf00012b8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf00012b8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf0000d90) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccfe623700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7fccfe69f700 (LWP 24342)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x0000000000534317 in __ast_cond_timedwait (filename=0x64a523 "threadpool.c", lineno=1101, func=0x64aa24 <__PRETTY_FUNCTION__.8575> "worker_idle", cond_name=0x64a767 "&worker->cond", mutex_name=0x64a759 "&worker->lock", cond=0x7fccf0000fb0, t=0x7fccf0000fe0, abstime=0x7fccfe69edc0) at lock.c:666
#2  0x00000000005e488b in worker_idle (worker=0x7fccf0000fa8) at threadpool.c:1101
#3  0x00000000005e45cb in worker_start (arg=0x7fccf0000fa8) at threadpool.c:993
#4  0x00000000005f032b in dummy_start (data=0x7fccf0000a60) at utils.c:1232
#5  0x00007fccff91b182 in start_thread (arg=0x7fccfe69f700) at pthread_create.c:312
#6  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7fcc49e5a700 (LWP 24409)):
#0  hdb_handle_get (instance=0x7fcc49d587e0, handle_in=<optimized out>, handle_database=0x7fcc6f3c3060 <cpg_handle_t_db>) at ../include/corosync/hdb.h:110
#1  cpg_dispatch (handle=465886707598753792, dispatch_types=CS_DISPATCH_ALL) at cpg.c:357
#2  0x00007fcc6f3c73e7 in dispatch_thread_handler (data=0x0) at res_corosync.c:482
#3  0x00000000005f032b in dummy_start (data=0x33a5fd0) at utils.c:1232
#4  0x00007fccff91b182 in start_thread (arg=0x7fcc49e5a700) at pthread_create.c:312
#5  0x00007fccfeef6efd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

corosync.conf

totem {
        version: 2
        token: 160
        token_retransmits_before_loss_const: 3
        join: 30
        consensus: 300
        vsftype: none
        max_messages: 20
        threads: 0
        nodeid: 1
        rrp_mode: none
        interface {
                ringnumber: 0
                bindnetaddr: 10.24.16.128
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

res_corosync.conf (under Asterisk)

[general]
publish_event = mwi
subscribe_event = mwi
publish_event = device_state
subscribe_event = device_state

/etc/corosync/uidgid.d/asterisk

uidgid {
    uid: chris
    gid: chris
}

Bug in votequorum.c

Corosync (master as of today) terminates with SIGABRT at exec/votequorum.c line 1981:

    if (nodeid == VOTEQUORUM_QDEVICE_NODEID) {
            struct cluster_node *sender_node = find_node_by_nodeid(sender_nodeid);

            assert(sender_node != NULL);    <--- HERE

Situation:

  1. Corosync is configured with two rings in udpu mode (config follows).
  2. Linux is booting. The network interface for ring0 is still unconfigured (an InfiniBand card with a long initialization time), but the interface for ring1 is ready and configured.
  3. Corosync starts and writes to its log:
    [VOTEQ ] No nodelist defined or our node is not in the nodelist
    [VOTEQ ] ev_tracking=0, ev_tracking_barrier = 0: expected_votes = 2
    [VOTEQ ] total_votes=1, expected_votes=2
    [VOTEQ ] node 2130706433 state=1, votes=1, expected=2
  4. The ring0 network interface appears.
  5. Corosync writes:
    [VOTEQ ] got nodeinfo message from cluster node 181011712
    [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
  6. Corosync dies.

Simply putting the interface into the down state while its IPs stay configured does not reproduce the problem; the interface must be unconfigured while corosync starts.
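
For illustration only, a more defensive variant of the quoted code could tolerate a nodeinfo message from a sender that is not yet in the local member list instead of aborting. This is a hypothetical sketch, not the upstream fix; whether ignoring the message (rather than creating a node entry for the sender) is the right reaction is an assumption, and the logging call only mirrors the style used elsewhere in the service code:

/* Hypothetical defensive variant of the quoted votequorum.c path: instead of
 * assert()ing, ignore nodeinfo messages whose sender is not (yet) known
 * locally, e.g. while an interface is still coming up. */
if (nodeid == VOTEQUORUM_QDEVICE_NODEID) {
        struct cluster_node *sender_node = find_node_by_nodeid(sender_nodeid);

        if (sender_node == NULL) {
                log_printf(LOGSYS_LEVEL_WARNING,
                           "nodeinfo message from unknown node %u, ignoring",
                           sender_nodeid);
                return;
        }
        /* ... continue processing using sender_node ... */
}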

Corosync config:
totem {
        config_version: 2015033113
        version: 2
        crypto_hash: none
        crypto_cipher: none
        rrp_mode: passive
        transport: udpu
        cluster_name: s0s1024
        interface {
                ringnumber: 0
                bindnetaddr: 10.202.5.0
                mcastport: 300
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.200.5.0
                mcastport: 302
        }
}
nodelist {
        node {
                nodeid: 2048
                name: s0
                ring0_addr: 10.202.1.0
                ring1_addr: 10.200.1.0
        }
        node {
                nodeid: 1024
                name: s1024
                ring0_addr: 10.202.5.0
                ring1_addr: 10.200.5.0
        }
}
quorum {
        # Quorum for the Pacemaker Cluster Resource Manager
        provider: corosync_votequorum
        expected_votes: 2
        two_node: 1
        wait_for_all: 0
}

Logsys glue layer removal

The idea is to remove the logsys layer completely and depend only on libqb's qblog. It is still unclear how far this can go, because logsys is currently quite a thin layer, and some layer will still be needed to map the logsys configuration onto the qblog configuration.
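
As a rough illustration of the mapping that would remain, the sketch below translates a hypothetical parsed logging section into direct libqb qblog calls; the struct and function names are made up, and only the qb_log_* calls are the real libqb API:

/* Sketch only: the kind of thin mapping a logsys replacement would still
 * need.  struct logging_cfg and apply_logging_cfg() are hypothetical. */
#include <stdint.h>
#include <syslog.h>
#include <qb/qblog.h>

struct logging_cfg {            /* hypothetical parsed "logging {}" options */
        int to_syslog;
        int to_stderr;
        int debug;
};

static void apply_logging_cfg(const struct logging_cfg *cfg)
{
        uint8_t prio = cfg->debug ? LOG_DEBUG : LOG_INFO;

        qb_log_init("corosync", LOG_DAEMON, prio);
        qb_log_ctl(QB_LOG_SYSLOG, QB_LOG_CONF_ENABLED,
                   cfg->to_syslog ? QB_TRUE : QB_FALSE);
        qb_log_ctl(QB_LOG_STDERR, QB_LOG_CONF_ENABLED,
                   cfg->to_stderr ? QB_TRUE : QB_FALSE);
        qb_log_format_set(QB_LOG_STDERR, "%t [%p] %b");
        qb_log_filter_ctl(QB_LOG_STDERR, QB_LOG_FILTER_ADD,
                          QB_LOG_FILTER_FILE, "*", prio);
}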

xml schema

An XML configuration format for corosync exists, but the input file is not checked
against an XML schema. This topic is about implementing a schema, preferably RelaxNG,
for the corosync configuration.

With good integration it would also be possible to generate the man page and part of
the configuration parser automatically.

totemip_totemip_to_sockaddr_convert hardcodes sin6_scope_id=2

sin->sin6_scope_id = 2;

According to ipv6(7), sin6_scope_id should contain the interface index. Hardcoding 2 here (i.e. the first non-loopback interface) might cause corosync to bind to the wrong interface when using IPv6 multicast or IPv6 link-local unicast addresses.

This makes corosync's IPv6 support unusable for me, since on my systems, interface ID 2 is half of an 802.3ad bond and binding to that won't do anything useful.
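
A minimal sketch of deriving the scope id from the interface actually in use, rather than hardcoding it, assuming the interface name is available at this point (which it may not be in exactly this form in totemip.c):

/* Sketch: set sin6_scope_id from the real interface index instead of the
 * hardcoded 2.  "ifname" is assumed to be known by the caller. */
#include <net/if.h>
#include <netinet/in.h>

static int set_scope_id(struct sockaddr_in6 *sin6, const char *ifname)
{
        unsigned int idx = if_nametoindex(ifname);

        if (idx == 0) {
                return -1;      /* unknown interface name */
        }
        /* Only link-local unicast and link-local multicast addresses need a
         * scope id; global addresses can keep 0. */
        if (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr) ||
            IN6_IS_ADDR_MC_LINKLOCAL(&sin6->sin6_addr)) {
                sin6->sin6_scope_id = idx;
        }
        return 0;
}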

Harden and finish ykd algorithm

YKD is a dynamic linear voting algorithm designed specifically to adapt to membership changes. The current implementation in corosync is suboptimal. This is now quite a low priority, because votequorum works and does what it should, so YKD would not bring much that is new.

Segmentation fault on "Type of received message is wrong"

Hi,

On a test 3-node cluster with corosync 2.3.0 and pacemaker 1.1.10, the corosync config on one node was accidentally replaced with a corosync 1.4.1 config, and corosync on that node was restarted to pick up the new config. After that, corosync on the two other nodes got the following (100% reproducible):

gdb corosync

GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://bugs.launchpad.net/gdb-linaro/...
Reading symbols from /usr/sbin/corosync...Reading symbols from /usr/lib/debug/usr/sbin/corosync...done.
done.
(gdb) r -f
Starting program: /usr/sbin/corosync -f
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4bc9700 (LWP 10532)]
Jun 16 12:12:41 notice [TOTEM ] Initializing transport (UDP/IP Multicast).
Jun 16 12:12:41 notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
Jun 16 12:12:41 notice [TOTEM ] The network interface [10.0.2.98] is now up.
Jun 16 12:12:41 notice [TOTEM ] A processor joined or left the membership and a new membership (10.0.2.98:160) was formed.
Jun 16 12:12:41 warning [TOTEM ] Type of received message is wrong... ignoring -87.
wrong message type
Jun 16 12:12:41 warning [TOTEM ] Type of received message is wrong... ignoring 9.
wrong message type
Jun 16 12:12:42 warning [TOTEM ] Type of received message is wrong... ignoring 104.
wrong message type
Jun 16 12:12:42 warning [TOTEM ] Type of received message is wrong... ignoring 28.
wrong message type
Jun 16 12:12:42 warning [TOTEM ] Type of received message is wrong... ignoring 76.
wrong message type
Jun 16 12:12:42 warning [TOTEM ] Type of received message is wrong... ignoring -104.
wrong message type
Jun 16 12:12:43 warning [TOTEM ] Type of received message is wrong... ignoring -20.
wrong message type
Jun 16 12:12:43 warning [TOTEM ] Type of received message is wrong... ignoring -116.
wrong message type
Jun 16 12:12:43 warning [TOTEM ] Type of received message is wrong... ignoring 22.
wrong message type
Jun 16 12:12:43 warning [TOTEM ] Type of received message is wrong... ignoring 66.
wrong message type
Jun 16 12:12:44 warning [TOTEM ] Type of received message is wrong... ignoring 91.
wrong message type
Jun 16 12:12:44 warning [TOTEM ] Type of received message is wrong... ignoring -117.
wrong message type
Jun 16 12:12:44 warning [TOTEM ] Type of received message is wrong... ignoring -63.
wrong message type
Jun 16 12:12:44 warning [TOTEM ] Type of received message is wrong... ignoring 67.
wrong message type
Jun 16 12:12:44 warning [TOTEM ] Type of received message is wrong... ignoring 31.
wrong message type
Jun 16 12:12:45 warning [TOTEM ] Type of received message is wrong... ignoring -112.
wrong message type
Jun 16 12:12:45 warning [TOTEM ] Type of received message is wrong... ignoring -74.
wrong message type
Jun 16 12:12:45 warning [TOTEM ] Type of received message is wrong... ignoring -88.
wrong message type
Jun 16 12:12:45 warning [TOTEM ] Type of received message is wrong... ignoring 56.
wrong message type
Jun 16 12:12:46 warning [TOTEM ] Type of received message is wrong... ignoring -90.
wrong message type
Jun 16 12:12:46 warning [TOTEM ] Type of received message is wrong... ignoring -128.
wrong message type
Jun 16 12:12:46 warning [TOTEM ] Type of received message is wrong... ignoring 101.
wrong message type
Jun 16 12:12:46 warning [TOTEM ] Type of received message is wrong... ignoring 112.
wrong message type
Jun 16 12:12:47 warning [TOTEM ] Type of received message is wrong... ignoring -57.
wrong message type
Jun 16 12:12:47 warning [TOTEM ] Type of received message is wrong... ignoring 116.
wrong message type
Jun 16 12:12:47 warning [TOTEM ] Type of received message is wrong... ignoring -35.
wrong message type
Jun 16 12:12:47 warning [TOTEM ] Type of received message is wrong... ignoring -85.
wrong message type
Jun 16 12:12:47 warning [TOTEM ] Type of received message is wrong... ignoring -36.
wrong message type
Jun 16 12:12:48 warning [TOTEM ] Type of received message is wrong... ignoring -45.
wrong message type
Jun 16 12:12:48 warning [TOTEM ] Type of received message is wrong... ignoring -103.
wrong message type
Jun 16 12:12:48 warning [TOTEM ] Type of received message is wrong... ignoring 117.
wrong message type

Program received signal SIGSEGV, Segmentation fault.
memb_commit_token_endian_convert (out=0x7fffffff6360, in=<optimized out>) at totemsrp.c:4261
4261    totemsrp.c: No such file or directory.
(gdb) bt
#0  memb_commit_token_endian_convert (out=0x7fffffff6360, in=<optimized out>) at totemsrp.c:4261
#1  message_handler_memb_commit_token (instance=0x7ffff7fa8010, msg=<optimized out>, msg_len=132, endian_conversion_needed=<optimized out>) at totemsrp.c:4410
#2  0x00007ffff7bbeecd in rrp_deliver_fn (context=0x55555588e6b0, msg=<optimized out>, msg_len=132) at totemrrp.c:1794
#3  0x00007ffff7bb9d12 in net_deliver_fn (fd=<optimized out>, revents=<optimized out>, data=0x55555588f060) at totemudp.c:521
#4  0x00007ffff7757fdf in ?? () from /usr/lib/libqb.so.0
#5  0x00007ffff7757b57 in qb_loop_run () from /usr/lib/libqb.so.0
#6  0x0000555555562365 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1259

Corosync crashes when trying to run in OSX 10.8.3

Steps to reproduce:

  1. pulled libqb from git today, compiled and installed
  2. pulled corosync from git today, compiled and installed
  3. became root
  4. corosync -f

The error from stderr is:

warning [MAIN ] Could not lock memory of service to avoid page faults: Function not implemented (78)
Abort trap: 6

The error in syslog is:

Apr 22 21:17:55 node2.mydomain.com corosync[29907]: [MAIN ] Corosync Cluster Engine ('2.3.0.19-03f0'): started and ready to provide service.
Apr 22 21:17:55 node2.mydomain.com corosync[29907]: [29907] detected buffer overflow
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: DebugSymbols was unable to start a spotlight query: spotlight is not responding or disabled.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: failed looking up LS service ( scCreateSystemService returned MACH_PORT_NULL, called from SetupCoreApplicationServicesCommunicationPort, so using client-side NULL calls.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: LaunchServices/5123589: Unable to lookup coreservices session port for session 0x186a0 uid=0 euid=0
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: failed looking up LS service ( scCreateSystemService returned MACH_PORT_NULL, called from SetupCoreApplicationServicesCommunicationPort, so using client-side NULL calls.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: LaunchServices/5123589: Unable to lookup coreservices session port for session 0x186a0 uid=0 euid=0
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: failed looking up LS service ( scCreateSystemService returned MACH_PORT_NULL, called from SetupCoreApplicationServicesCommunicationPort, so using client-side NULL calls.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: LaunchServices/5123589: Unable to lookup coreservices session port for session 0x186a0 uid=0 euid=0
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: failed looking up LS service ( scCreateSystemService returned MACH_PORT_NULL, called from SetupCoreApplicationServicesCommunicationPort, so using client-side NULL calls.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: LaunchServices/5123589: Unable to lookup coreservices session port for session 0x186a0 uid=0 euid=0
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: failed looking up LS service ( scCreateSystemService returned MACH_PORT_NULL, called from SetupCoreApplicationServicesCommunicationPort, so using client-side NULL calls.
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: LaunchServices/5123589: Unable to lookup coreservices session port for session 0x186a0 uid=0 euid=0
Apr 22 21:17:55 --- last message repeated 3 times ---
Apr 22 21:17:55 node2.mydomain.com ReportCrash[29908]: Saved crash report for corosync[29907] version 0 to /Library/Logs/DiagnosticReports/corosync_2013-04-22-211755_node2.crash

The crash dump is:

Process: corosync [29907]
Path: /opt/local/sbin/corosync
Identifier: corosync
Version: 0
Code Type: X86-64 (Native)
Parent Process: bash [29906]
User ID: 0

Date/Time: 2013-04-22 21:17:55.363 -0700
OS Version: Mac OS X 10.8.3 (12D78)
Report Version: 10

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000

Application Specific Information:
__abort() called

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff8fc4d212 __pthread_kill + 10
1 libsystem_c.dylib 0x00007fff8f193b54 pthread_kill + 90
2 libsystem_c.dylib 0x00007fff8f1d7e9e __abort + 159
3 libsystem_c.dylib 0x00007fff8f19977d __chk_fail + 35
4 libsystem_c.dylib 0x00007fff8f199886 __snprintf_chk + 150
5 libqb.0.dylib 0x000000010753dd30 qb_ipcs_us_publish + 576 (ipc_setup.c:386)
6 libqb.0.dylib 0x000000010753cdb8 qb_ipcs_run + 104 (ipcs.c:122)
7 corosync 0x0000000107076790 cs_ipcs_service_init + 800 (ipc_glue.c:863)
8 corosync 0x000000010707721c corosync_service_link_and_init + 508 (service.c:175)
9 corosync 0x00000001070777e5 corosync_service_defaults_link_and_init + 69 (service.c:347)
10 corosync 0x0000000107072efb main_service_ready + 27 (main.c:975)
11 libtotem_pg.5.dylib 0x000000010715c897 main_iface_change_fn + 695 (totemsrp.c:4568)
12 libtotem_pg.5.dylib 0x00000001071505d7 timer_function_netif_check_timeout + 4951 (totemudp.c:677)
13 libqb.0.dylib 0x000000010753a510 timer_dispatch + 32 (loop_timerlist.c:57)
14 libqb.0.dylib 0x0000000107538a13 qb_loop_run + 771 (loop.c:44)
15 corosync 0x0000000107072933 main + 2707 (main.c:1314)
16 libdyld.dylib 0x00007fff8d28a7e1 start + 1

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x0000000000000006 rcx: 0x00007fff58b9e6a8 rdx: 0x0000000000000000
rdi: 0x0000000000000707 rsi: 0x0000000000000006 rbp: 0x00007fff58b9e6d0 rsp: 0x00007fff58b9e6a8
r8: 0x00007fff79a77278 r9: 0x0000000031ba8bfa r10: 0x0000000020000000 r11: 0x0000000000000206
r12: 0x00007fff58b9eb41 r13: 0x0000000000000000 r14: 0x00007fff79a78180 r15: 0x00007fd1e4801b64
rip: 0x00007fff8fc4d212 rfl: 0x0000000000000206 cr2: 0x00007fff79a70ff0
Logical CPU: 0

Binary Images:
0x10705f000 - 0x107085ff7 +corosync (0) <2FFE69EE-9470-309D-8B22-D917076F7A0E> /opt/local/sbin/corosync
0x10714d000 - 0x107166ff7 +libtotem_pg.5.dylib (0) /opt/local/lib/libtotem_pg.5.dylib
0x10717e000 - 0x1071deff7 +libssl3.dylib (0) /opt/local/lib/*/libssl3.dylib
0x107206000 - 0x10723aff7 +libsmime3.dylib (0) /opt/local/lib/*/libsmime3.dylib
0x107261000 - 0x10728afff +libnssutil3.dylib (0) /opt/local/lib/*/libnssutil3.dylib
0x1072ac000 - 0x10743afff +libnss3.dylib (0) <0FB72A64-EB2A-35EC-B6B7-FBD2EF991CFE> /opt/local/lib/*/libnss3.dylib
0x1074e3000 - 0x1074e5fff +libplds4.dylib (0) <36D94FD9-6ED2-3D56-9F05-D08B810537B0> /opt/local/lib/*/libplds4.dylib
0x1074e9000 - 0x1074ecfff +libplc4.dylib (0) <74CA35EF-977E-3744-B64B-9EC6EE6B25C8> /opt/local/lib/*/libplc4.dylib
0x1074f0000 - 0x10751bff7 +libnspr4.dylib (0) <6A6689E8-707F-39F8-B150-AA198CD97D77> /opt/local/lib/*/libnspr4.dylib
0x107531000 - 0x107531ff7 +libcorosync_common.4.dylib (0) /opt/local/lib/libcorosync_common.4.dylib
0x107535000 - 0x10754efff +libqb.0.dylib (0) /opt/local/lib/libqb.0.dylib
0x1086a2000 - 0x1086fcff7 +libsoftokn3.dylib (0) <2FCDF91F-7D6D-3486-B64F-98616A0C8A39> /opt/local/lib/*/libsoftokn3.dylib
0x10871e000 - 0x1087dcfff +libsqlite3.0.dylib (0) /opt/local/lib/libsqlite3.0.dylib
0x10881b000 - 0x108892ff7 +libfreebl3.dylib (0) /opt/local/lib/*/libfreebl3.dylib
0x7fff66c5f000 - 0x7fff66c9393f dyld (210.2.3) /usr/lib/dyld
0x7fff88bf9000 - 0x7fff88c45ff7 libauto.dylib (185.1) <73CDC482-16E3-3FC7-9BB4-FBA2DA44DBC2> /usr/lib/libauto.dylib
0x7fff88ea8000 - 0x7fff88ea8fff libkeymgr.dylib (25) /usr/lib/system/libkeymgr.dylib
0x7fff8960b000 - 0x7fff8980bfff libicucore.A.dylib (491.11.2) /usr/lib/libicucore.A.dylib
0x7fff89958000 - 0x7fff89959fff libsystem_blocks.dylib (59) /usr/lib/system/libsystem_blocks.dylib
0x7fff8995a000 - 0x7fff89960fff com.apple.DiskArbitration (2.5.2 - 2.5.2) /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration
0x7fff8a59a000 - 0x7fff8a5ebff7 com.apple.SystemConfiguration (1.12.2 - 1.12.2) /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration
0x7fff8a6aa000 - 0x7fff8a7c292f libobjc.A.dylib (532.2) <90D31928-F48D-3E37-874F-220A51FD9E37> /usr/lib/libobjc.A.dylib
0x7fff8a7c3000 - 0x7fff8a7f4ff7 com.apple.DictionaryServices (1.2 - 184.4) <2EC80C71-263E-3D63-B461-6351C876C50D> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework/Versions/A/DictionaryServices
0x7fff8ab4c000 - 0x7fff8ab4dfff liblangid.dylib (116) <864C409D-D56B-383E-9B44-A435A47F2346> /usr/lib/liblangid.dylib
0x7fff8b052000 - 0x7fff8b05fff7 com.apple.NetAuth (4.0 - 4.0) /System/Library/PrivateFrameworks/NetAuth.framework/Versions/A/NetAuth
0x7fff8b068000 - 0x7fff8b09eff7 libsystem_info.dylib (406.17) /usr/lib/system/libsystem_info.dylib
0x7fff8b0f8000 - 0x7fff8b147ff7 libcorecrypto.dylib (106.2) /usr/lib/system/libcorecrypto.dylib
0x7fff8b262000 - 0x7fff8b263ff7 libdnsinfo.dylib (453.19) <14202FFB-C3CA-3FCC-94B0-14611BF8692D> /usr/lib/system/libdnsinfo.dylib
0x7fff8b36c000 - 0x7fff8b3d4ff7 libc++.1.dylib (65.1) /usr/lib/libc++.1.dylib
0x7fff8b43e000 - 0x7fff8b44dff7 libxar.1.dylib (105) /usr/lib/libxar.1.dylib
0x7fff8b4c3000 - 0x7fff8b4c9ff7 libunwind.dylib (35.1) <21703D36-2DAB-3D8B-8442-EAAB23C060D3> /usr/lib/system/libunwind.dylib
0x7fff8b4ce000 - 0x7fff8b537fff libstdc++.6.dylib (56) <08A26E98-26A7-3A23-A594-01EF923BA123> /usr/lib/libstdc++.6.dylib
0x7fff8b538000 - 0x7fff8b6adfff com.apple.CFNetwork (596.3.3 - 596.3.3) <3739DC8D-8610-3740-80EC-43E130779CB8> /System/Library/Frameworks/CFNetwork.framework/Versions/A/CFNetwork
0x7fff8b6b3000 - 0x7fff8b721fff com.apple.framework.IOKit (2.0.1 - 755.22.5) <1547DA6F-9793-30A2-8E92-7368DE84D46C> /System/Library/Frameworks/IOKit.framework/Versions/A/IOKit
0x7fff8b78e000 - 0x7fff8b79bfff libbz2.1.0.dylib (29) /usr/lib/libbz2.1.0.dylib
0x7fff8b83e000 - 0x7fff8b843fff libcompiler_rt.dylib (30) <08F8731D-5961-39F1-AD00-4590321D24A9> /usr/lib/system/libcompiler_rt.dylib
0x7fff8bbb3000 - 0x7fff8bbb4ff7 libsystem_sandbox.dylib (220.2) <6838A6FD-8626-3356-BB4F-BB4787216207> /usr/lib/system/libsystem_sandbox.dylib
0x7fff8cb62000 - 0x7fff8cb74ff7 libz.1.dylib (43) <2A1551E8-A272-3DE5-B692-955974FE1416> /usr/lib/libz.1.dylib
0x7fff8cb75000 - 0x7fff8cc02ff7 com.apple.SearchKit (1.4.0 - 1.4.0) <54A8069C-E497-3B07-BEA7-D3BC9DB5B649> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SearchKit.framework/Versions/A/SearchKit
0x7fff8cf0f000 - 0x7fff8d26cff7 com.apple.Foundation (6.8 - 945.16) <89BD68FD-72C8-35C1-94C6-3A07F097C50D> /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
0x7fff8d288000 - 0x7fff8d28bff7 libdyld.dylib (210.2.3) /usr/lib/system/libdyld.dylib
0x7fff8d414000 - 0x7fff8d4c5fff com.apple.LaunchServices (539.7 - 539.7) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/LaunchServices
0x7fff8d695000 - 0x7fff8d73bff7 com.apple.CoreServices.OSServices (557.6 - 557.6) <1BDB5456-0CE9-301C-99C1-8EFD0D2BFCCD> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/OSServices.framework/Versions/A/OSServices
0x7fff8d741000 - 0x7fff8d76cfff libxslt.1.dylib (11.3) <441776B8-9130-3893-956F-39C85FFA644F> /usr/lib/libxslt.1.dylib
0x7fff8d8d8000 - 0x7fff8d8ddfff libcache.dylib (57) <65187C6E-3FBF-3EB8-A1AA-389445E2984D> /usr/lib/system/libcache.dylib
0x7fff8d8de000 - 0x7fff8d8f5fff com.apple.CFOpenDirectory (10.8 - 151.10) /System/Library/Frameworks/OpenDirectory.framework/Versions/A/Frameworks/CFOpenDirectory.framework/Versions/A/CFOpenDirectory
0x7fff8d8f6000 - 0x7fff8d924ff7 libsystem_m.dylib (3022.6) <11B6081D-6212-3EAB-9975-BED6234BD6A5> /usr/lib/system/libsystem_m.dylib
0x7fff8d9f9000 - 0x7fff8da10fff com.apple.GenerationalStorage (1.1 - 132.3) /System/Library/PrivateFrameworks/GenerationalStorage.framework/Versions/A/GenerationalStorage
0x7fff8e23a000 - 0x7fff8e245fff libsystem_notify.dylib (98.5) /usr/lib/system/libsystem_notify.dylib
0x7fff8e4ab000 - 0x7fff8e4b2fff libcopyfile.dylib (89) <876573D0-E907-3566-A108-577EAD1B6182> /usr/lib/system/libcopyfile.dylib
0x7fff8e4b3000 - 0x7fff8e4c8ff7 libdispatch.dylib (228.23) /usr/lib/system/libdispatch.dylib
0x7fff8e4c9000 - 0x7fff8e4d1ff7 libsystem_dnssd.dylib (379.37) <616FC901-151E-38BF-B2C4-24A351C5FAAD> /usr/lib/system/libsystem_dnssd.dylib
0x7fff8ea13000 - 0x7fff8ea17fff libpam.2.dylib (20) /usr/lib/libpam.2.dylib
0x7fff8ea18000 - 0x7fff8ea99fff com.apple.Metadata (10.7.0 - 707.5) <4140B1F6-7D73-33C7-B3F2-4DB349C31AE9> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata
0x7fff8ec1d000 - 0x7fff8ec1eff7 libremovefile.dylib (23.2) <6763BC8E-18B8-3AD9-8FFA-B43713A7264F> /usr/lib/system/libremovefile.dylib
0x7fff8f007000 - 0x7fff8f066fff com.apple.AE (645.6 - 645.6) <44F403C1-660A-3543-AB9C-3902E02F936F> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/AE.framework/Versions/A/AE
0x7fff8f0a9000 - 0x7fff8f0aafff libDiagnosticMessagesClient.dylib (8) <8548E0DC-0D2F-30B6-B045-FE8A038E76D8> /usr/lib/libDiagnosticMessagesClient.dylib
0x7fff8f0b8000 - 0x7fff8f125ff7 com.apple.datadetectorscore (4.1 - 269.2) <4FD4A7CE-BB00-3AAB-B7AA-AE395D5400EC> /System/Library/PrivateFrameworks/DataDetectorsCore.framework/Versions/A/DataDetectorsCore
0x7fff8f17e000 - 0x7fff8f24aff7 libsystem_c.dylib (825.26) <4C9EB006-FE1F-3F8F-8074-DFD94CF2CE7B> /usr/lib/system/libsystem_c.dylib
0x7fff8f25a000 - 0x7fff8f571ff7 com.apple.CoreServices.CarbonCore (1037.5 - 1037.5) <731D8F92-1C52-3613-BA01-EFEA54DADF41> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/CarbonCore
0x7fff8f6dc000 - 0x7fff8f6e2fff libmacho.dylib (829) /usr/lib/system/libmacho.dylib
0x7fff8fa64000 - 0x7fff8fa72fff libcommonCrypto.dylib (60027) /usr/lib/system/libcommonCrypto.dylib
0x7fff8fc3b000 - 0x7fff8fc56ff7 libsystem_kernel.dylib (2050.22.13) <5A961E2A-CFB8-362B-BC43-122704AEB047> /usr/lib/system/libsystem_kernel.dylib
0x7fff8fca9000 - 0x7fff8fcb1fff liblaunch.dylib (442.26.2) <2F71CAF8-6524-329E-AC56-C506658B4C0C> /usr/lib/system/liblaunch.dylib
0x7fff90ed6000 - 0x7fff90eddfff com.apple.NetFS (5.0 - 4.0) <195D8EC9-72BB-3E04-A64D-E1A89B4850C1> /System/Library/Frameworks/NetFS.framework/Versions/A/NetFS
0x7fff90ede000 - 0x7fff90fdbfff libsqlite3.dylib (138.1) /usr/lib/libsqlite3.dylib
0x7fff91ef2000 - 0x7fff91ef3ff7 libSystem.B.dylib (169.3) /usr/lib/libSystem.B.dylib
0x7fff924a0000 - 0x7fff924c5ff7 libc++abi.dylib (26) /usr/lib/libc++abi.dylib
0x7fff925d6000 - 0x7fff927c0ff7 com.apple.CoreFoundation (6.8 - 744.18) /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff927c1000 - 0x7fff927c3fff libquarantine.dylib (52) <4BE2E642-A14F-340A-B482-5BD2AEFD9C24> /usr/lib/system/libquarantine.dylib
0x7fff92c62000 - 0x7fff92d5fff7 libxml2.2.dylib (22.3) <7FD09F53-83DA-3ECD-8DD9-870E1A2F0427> /usr/lib/libxml2.2.dylib
0x7fff92d72000 - 0x7fff93043ff7 com.apple.security (7.0 - 55179.11) <73958084-5BBC-3597-A751-7370B0C247E5> /System/Library/Frameworks/Security.framework/Versions/A/Security
0x7fff93217000 - 0x7fff93217fff libOpenScriptingUtil.dylib (148.3) /usr/lib/libOpenScriptingUtil.dylib
0x7fff93261000 - 0x7fff93283ff7 libxpc.dylib (140.42) /usr/lib/system/libxpc.dylib
0x7fff93313000 - 0x7fff93315ff7 libunc.dylib (25) <2FDC94A7-3039-3680-85F3-2164E63B464D> /usr/lib/system/libunc.dylib
0x7fff935c9000 - 0x7fff935cdff7 com.apple.TCC (1.0 - 1) <76A86876-2280-3849-8478-450E1A8C0E01> /System/Library/PrivateFrameworks/TCC.framework/Versions/A/TCC
0x7fff93acf000 - 0x7fff93addff7 libsystem_network.dylib (77.10) <2AAA67A1-525E-38F0-8028-1D2B64716611> /usr/lib/system/libsystem_network.dylib
0x7fff93ade000 - 0x7fff93affff7 libCRFSuite.dylib (33) /usr/lib/libCRFSuite.dylib
0x7fff93f5d000 - 0x7fff93f70ff7 libbsm.0.dylib (32) /usr/lib/libbsm.0.dylib
0x7fff9421b000 - 0x7fff94229ff7 libkxld.dylib (2050.22.13) <4AAF0573-8632-3D06-BE32-C5675F77638D> /usr/lib/system/libkxld.dylib
0x7fff94bce000 - 0x7fff94bcefff com.apple.CoreServices (57 - 57) <45F1466A-8264-3BB7-B0EC-E5E5BFBED143> /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices

External Modification Summary:
  Calls made by other processes targeting this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by this process:
    task_for_pid: 0
    thread_create: 0
    thread_set_state: 0
  Calls made by all processes on this machine:
    task_for_pid: 5298142
    thread_create: 0
    thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=84.5M resident=35.7M(42%) swapped_out_or_unallocated=48.9M(58%)
Writable regions: Total=38.5M written=2164K(5%) resident=2240K(6%) swapped_out=0K(0%) unallocated=36.3M(94%)

REGION TYPE          VIRTUAL
===========          =======
MALLOC                 29.3M
MALLOC guard page        32K
STACK GUARD            56.0M
Stack                  8200K
VM_ALLOCATE               4K
__DATA                 4028K
__LINKEDIT             53.4M
__TEXT                 31.1M
__UNICODE               544K
mapped file            16.0M
shared memory            12K
===========          =======
TOTAL                 198.4M

Implement netmalloc

The totemiba.c driver must allocate memory and register it with a protection domain before the InfiniBand hardware can transmit it. In the current implementation, totemsrp.c also allocates these same frames, which results in an extra memcpy when transmitting with libibverbs. Memory copies are to be avoided. The simple solution is for each network driver to provide a memory allocation function: when totemsrp wants a free frame, it requests one from the network driver.
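As a rough illustration of the proposed hook (the names totem_net_ops, net_frame_alloc and net_frame_free below are hypothetical, not existing corosync symbols), the driver would own the frame pool and totemsrp would borrow frames from it:

/*
 * Hypothetical sketch of a driver-provided frame allocator.  For totemiba
 * the driver could hand back buffers already registered with its protection
 * domain (e.g. via ibv_reg_mr), so totemsrp builds messages in place and no
 * extra memcpy is needed before transmit.
 */
#include <stddef.h>

struct totem_net_ops {
        /* Return a frame suitable for zero-copy transmit by this driver. */
        void *(*net_frame_alloc)(void *driver_context, size_t frame_len);
        /* Return the frame to the driver's pool once totemsrp is done. */
        void  (*net_frame_free)(void *driver_context, void *frame);
};

/* totemsrp side: request a free frame from the active driver instead of
 * allocating it locally. */
static void *srp_frame_get(const struct totem_net_ops *ops, void *ctx,
                           size_t frame_len)
{
        return ops->net_frame_alloc(ctx, frame_len);
}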

corosync crashes

corosync 2.3.3 with libqb-0.17.0 crashes periodically at exec/totemsrp.c:3016, i.e. the assertion
assert (instance->commit_token->memb_index <= instance->commit_token->addr_entries);

backtrace:

(gdb) info threads 
  Id   Target Id         Frame 
  2    Thread 0x7fcde859c700 (LWP 6977) 0x00007fcdeb394420 in sem_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
* 1    Thread 0x7fcdebe3c700 (LWP 6976) 0x00007fcdeb02e545 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007fcdeb02e545 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fcdeb0317c0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fcdeb0276f1 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fcdeba19a31 in memb_state_commit_token_update (instance=0x7fcde7d5f010) at totemsrp.c:3016
#4  memb_state_commit_enter (instance=instance@entry=0x7fcde7d5f010) at totemsrp.c:2118
#5  0x00007fcdeba1b9aa in message_handler_memb_commit_token (instance=<optimized out>, msg=<optimized out>, msg_len=228, endian_conversion_needed=<optimized out>)
    at totemsrp.c:4548
#6  0x00007fcdeba183cc in rrp_deliver_fn (context=0x7fcdecdd9390, msg=0x7fcdecddc618, msg_len=228) at totemrrp.c:1794
#7  0x00007fcdeba1340e in net_deliver_fn (fd=<optimized out>, revents=<optimized out>, data=0x7fcdecddc5b0) at totemudp.c:521
#8  0x00007fcdeb5afeef in ?? () from /usr/lib/libqb.so.0
#9  0x00007fcdeb5afad7 in qb_loop_run () from /usr/lib/libqb.so.0
#10 0x00007fcdebe61910 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:1314

Memory leak (a bad one)

Evidence indicates 2.3.4 has a severe memory leak; the problem might be even older than that.
A look through the CHANGELOG since then shows no mention of this problem or of a fix for it.

Here is a top excerpt from corosync 2.3.4 after running for under 15 hours:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15406 root      20   0 10.5g 9.2g 9168 S  2.0 58.8  11:50.07 corosync

It's showing a RES usage of 9.2 GB. Other systems with highly similar setups show the same pattern.

libqb is 0.16.0 (release 2.el6) on the systems both with and without the memory leak.
On the leaking systems nspr is 4.10.2; on the non-leaking systems it is 4.10.8. However, four days prior I made a configuration change and corosync did not show this behavior -- usage was steady at 16M. I reverted the configuration change, but the problem did not go away. No other change on the systems hints at a cause.

During the reported time I was debugging an application that uses the Corosync API. The application is started and stopped many times; sometimes it is left running. It uses only the CPG portion of the API.
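For reference, a minimal CPG client lifecycle looks roughly like the sketch below (the group name and printout are illustrative only; this is not the application being debugged). The point is that every cpg_initialize() on start-up is matched by a cpg_finalize() on shutdown, which matters when the application is started and stopped many times. Compile with -lcpg.

#include <stdio.h>
#include <string.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

static void deliver_cb(cpg_handle_t handle, const struct cpg_name *group,
                       uint32_t nodeid, uint32_t pid, void *msg, size_t msg_len)
{
        printf("message from node %u pid %u, %zu bytes\n", nodeid, pid, msg_len);
}

static cpg_callbacks_t callbacks = {
        .cpg_deliver_fn = deliver_cb,
        .cpg_confchg_fn = NULL,
};

int main(void)
{
        cpg_handle_t handle;
        struct cpg_name group;

        if (cpg_initialize(&handle, &callbacks) != CS_OK)
                return 1;

        strcpy(group.value, "test_group");          /* illustrative group name */
        group.length = strlen(group.value);

        if (cpg_join(handle, &group) == CS_OK) {
                cpg_dispatch(handle, CS_DISPATCH_ALL); /* process pending events */
                cpg_leave(handle, &group);
        }

        cpg_finalize(handle);                        /* release the daemon-side connection */
        return 0;
}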

A cron job that dumps ps output every minute has been running on these systems, making it possible to see the runaway memory growth. Here is a selected set of samples from a single running process:

TIME      RSS   SZ
17:53:01     3952 18271
17:54:02    18496 21889
17:58:01    70588 37823
18:00:01   113904 45790
18:06:01   207264 69177
18:10:01   245152 83857
18:15:01   338212 103154
18:20:01   402860 123200
...
19:00:01   985844 282619
19:10:01  1058220 322711
...
20:00:01  1960572 522400
20:30:01  2289444 642419

Here is a scrubbed version of the configuration when the problems were noticed:

quroum {
        provider: corosync_votequorum
        expected_votes: 2
}
aisexec {
        user:root
        group:root
}
#service {
#       name: pacemaker
#       ver: 1
#}
totem {
        version: 2

        # if "on" must use shared "corosync-keygen".
        secauth: off
        threads: 2
        rrp_mode: none
        transport: udpu
        interface {
                bindnetaddr: 172.24.0.0
                # Rings must be consecutively numbered, starting at 0.
                ringnumber: 0
                mcastport: 5561
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        logfile: /var/log/corosync.log
        to_syslog: no
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: on
        }
}

nodelist {
        node {
                ring0_addr: 138.x.x.x
        }
        node {
                ring0_addr: 172.24.2.61
        }
        node {
                ring0_addr: 172.24.1.61
        }
        node {
                ring0_addr: 172.24.1.37
        }
        node {
                ring0_addr: 172.24.2.37
        }
}

Currently, I am attempting to rule out configuration changes and possible library changes as culprits.

Remove localhost binding

Corosync binds to localhost if the configured interface is not found, and rebinds to localhost if the interface is lost during operation. This causes problems because a new address (127.0.0.1) appears in the membership. When the interface is brought back up, that entry is sent as part of the nodelist and never disappears, since localhost is always reachable on every node.

The idea is to never rebind to localhost.

This is a very frequently reported problem. It causes trouble with NetworkManager, because NetworkManager runs ifdown when a cable is unplugged, and most people also tend to test failover by simply taking the interface down with ifdown.
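A rough sketch of the kind of guard implied (the function names below are hypothetical, not the actual totemudp/totemudpu symbols): when re-resolving the bind address after an interface change, refuse a loopback result and keep the previous binding, or mark the interface as down, instead.

#include <netinet/in.h>
#include <arpa/inet.h>

/* 127.0.0.0/8 check for an IPv4 candidate address. */
static int is_loopback_v4(const struct in_addr *addr)
{
        return (ntohl(addr->s_addr) >> 24) == 127;
}

/* Hypothetical policy hook: never accept localhost as a rebind target; the
 * caller would keep the old binding (or report the ring as faulty) until the
 * real address comes back. */
static int rebind_address_acceptable(const struct in_addr *candidate)
{
        return !is_loopback_v4(candidate);
}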

Add mozilla-nss-devel note for OpenSUSE12.3

Running ./configure --disable-nss fails on openSUSE because nss is not found. INSTALL says to look for the "nss-devel" and "nspr-devel" packages in your distribution's repositories, but those package names don't exist in openSUSE. Could this be changed to "nss-devel", "nspr-devel", or "mozilla-nss-devel"? After installing the last of these, configure finds nss on openSUSE 12.3.

implement rdmaud

Currently our RDMA code uses librdmacm to set up connections. We are not certain this extra library is needed; it may be possible to use only ibverbs. If so, the totem code may become more reliable, especially around failure conditions.
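As a sketch of what an ibverbs-only path could look like (assuming an unreliable-datagram QP, as the name "rdmaud" suggests; error handling trimmed and device index 0 chosen arbitrarily), connection management via librdmacm is not required for UD traffic:

#include <infiniband/verbs.h>

int main(void)
{
        int num_devices = 0;
        struct ibv_device **devs = ibv_get_device_list(&num_devices);
        if (devs == NULL || num_devices == 0)
                return 1;

        struct ibv_context *ctx = ibv_open_device(devs[0]);
        struct ibv_pd *pd = ibv_alloc_pd(ctx);
        struct ibv_cq *cq = ibv_create_cq(ctx, 256, NULL, NULL, 0);

        struct ibv_qp_init_attr attr = {
                .send_cq = cq,
                .recv_cq = cq,
                .qp_type = IBV_QPT_UD,     /* datagram: no rdmacm connection setup */
                .cap = {
                        .max_send_wr  = 128,
                        .max_recv_wr  = 128,
                        .max_send_sge = 1,
                        .max_recv_sge = 1,
                },
        };
        struct ibv_qp *qp = ibv_create_qp(pd, &attr);

        /* From here: ibv_modify_qp() through INIT -> RTR -> RTS, exchange
         * QPN/LID out of band, then post sends and receives directly with
         * ibverbs. */

        ibv_destroy_qp(qp);
        ibv_destroy_cq(cq);
        ibv_dealloc_pd(pd);
        ibv_close_device(ctx);
        ibv_free_device_list(devs);
        return 0;
}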
