t-matsuo / resource-agents

This project forked from clusterlabs/resource-agents


pgsql RA(ocf resource agent) for Pacemaker and PostgreSQL streaming replication. See https://github.com/t-matsuo/resource-agents/wiki

Home Page: https://github.com/t-matsuo/resource-agents/wiki

License: GNU General Public License v2.0

Languages: Shell 81.14%, C 7.21%, Perl 7.10%, Makefile 1.31%, XSLT 0.96%, M4 1.63%, Python 0.57%, Roff 0.08%

resource-agents's Introduction

Resource agents

This repository contains resource agents (RAs) compliant with the Open Cluster Framework (OCF) specification.

These resource agents are used by two cluster resource management implementations:

  • Pacemaker
  • rgmanager

More information about OCF resource agents is available in the OCF Resource Agent Developer's guide (link below).

Where can I find more information?

resource-agents's People

Contributors

alan-r, astralbob, atomictoaster, beekhof, bmarzins, chrissie-c, davidvossel, dciabrin, dmuhamedagic, fabbione, feist, fghaas, hideoyamauchi, horms, jbrassow, jnpkrn, krig, lge, lhh, marxsk, mgrzybek, nozawat, oalbrigt, raoulbhatia, rohara, ryan-mccabe, sebastic, swhiteho, t-matsuo, teigland


resource-agents's Issues

PostgreSQL promotion method.

Greetings,

I have actually some code for this, but wanted to open a discussion on it first, code is here:

soulhunter@64a10f2

As you can see in that patch, I simply shut down PostgreSQL, remove recovery.conf (I could also rename it, but we recreate it as needed), and start it again. The idea is to avoid a timeline switch and almost eliminate the need for a shared archive (provided the standbys are "close enough" to the newly promoted one).
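For reference, a minimal shell sketch of that sequence (not the actual patch; it assumes pg_ctl is on the PATH and $PGDATA points at the standby's data directory):

    # Stop the standby cleanly, drop recovery.conf so the next start comes up
    # as a primary, then start it again -- avoiding the timeline switch that a
    # regular "pg_ctl promote" would cause.
    pg_ctl -D "$PGDATA" -m fast -w stop
    rm -f "$PGDATA/recovery.conf"    # could also be renamed; it is recreated as needed
    pg_ctl -D "$PGDATA" -w start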

I know the extra WAL shipping will work, but it has the disadvantage that it will ship the WAL files twice (once through streaming replication, and again through WAL shipping). We may have an extra cold-standby around as a WAL archive (in addition to a WAL archive on the PRI), and have the HS use that archive, but it is a more complex setup.

Using this promotion method still allows you to keep an archive if you want, so that badly lagged standbys can catch up, but the archive becomes much less important (remember that a timeline switch requires the .history file(s) to be available, and these are not streamed; so if we allow the timeline switch, the WAL archive becomes increasingly important, even if a standby is lagged by just one WAL segment).

What do you think?

Ildefonso Camargo

Clarification needed

Hi,

I am using the pgsql resource agent you wrote with pacemaker and corosync.
Things work quite smoothly and I am happy with the general behaviour of the RA. Thank you for that!
There is just one case where I would have expected a different behaviour. First, let me be clear that I am neither a PG admin nor a replication guru, so I might be missing something here.
The case that "worries" me is when a master is demoted or stopped and there is a slave to take over the resource as a new master.
When that happens and the old master gets back into the cluster later on, it won't be able to smoothly become a slave of the new master.

As far as I understand it this is due to the following code:

if [ "$1" = "master" -a "$OCF_RESKEY_CRM_meta_notify_slave_uname" = " " ]; then
ocf_log info "Removing $PGSQL_LOCK."
rm -f $PGSQL_LOCK
fi

which checks whether any slave is still in the cluster when the master is stopped (the lock file is removed only if none remains).
Is it to avoid having the old master come back into the cluster as a new master (in case there is no master at that time)? Are there other cases where it is useful (or is it for a completely different scenario)?

If that is the only reason, would a startup check of $OCF_RESKEY_CRM_meta_notify_master_uname for a live master help (removing the lock file if one is present)? See the sketch below.
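For illustration, a hedged sketch of that check (the variable and helper names follow the RA's conventions, but the condition and placement are my assumption):

    # At start time: if the notification environment reports a live master on
    # another node, treat our lock file as stale and remove it so this node
    # can come up as a slave.
    if [ -n "$OCF_RESKEY_CRM_meta_notify_master_uname" \
         -a "$OCF_RESKEY_CRM_meta_notify_master_uname" != " " ]; then
        ocf_log info "Live master detected; removing $PGSQL_LOCK."
        rm -f $PGSQL_LOCK
    fi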

Thank you for shedding some light on this for me.

Postgres Master-role migration

Hi guys,

I have a postgres master/slave setup on SLES11 SP3. For maintenance purposes I need a procedure to move the master role to the other node. I've tried the command 'crm_resource --resource msPostgresql --ban --master --host node1 --lifetime "P60S"'. The master role is indeed moved to node2, but node1 stops with status disconnected (the PGSQL.lock file remains in the tmp directory). I have to manually delete the lock file and do a resource cleanup. Node1 should remain as a slave after the master migration. Is this a bug or expected behaviour?
PS: What is the recommended version of the pgsql RA, the one shipped with SLES or the latest from git? Thanks.

PGSQL - Synchronous replication mode uses wrong location for postgresql.conf

In the pgsql resource agent, in pgsql_validate_all() (line 1560), the synchronous replication configuration is written to $OCF_RESKEY_pgdata/postgresql.conf instead of $OCF_RESKEY_config.

This is problematic for PostgreSQL installations on Ubuntu, where the default configuration location is /etc/postgresql/9.1/main/postgresql.conf, which is not in the $PGDATA directory.
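A minimal sketch of the suggested correction (the exact settings the RA appends are omitted, and the rep_mode.conf include shown here is an assumption; the point is only that the target file should come from $OCF_RESKEY_config rather than being derived from $OCF_RESKEY_pgdata):

    # Write replication-related settings to the configured postgresql.conf
    # (e.g. /etc/postgresql/9.1/main/postgresql.conf on Ubuntu), not to
    # $OCF_RESKEY_pgdata/postgresql.conf.
    CONFIG_FILE="$OCF_RESKEY_config"
    echo "include '$OCF_RESKEY_tmpdir/rep_mode.conf'" >> "$CONFIG_FILE"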

error in function "is_node_online"

Hello,

There should not be a blank after "^Node $1". If the blank is present, the function works incorrectly.

is_node_online() {
crm_mon -1 -n | grep -e "^Node $1 " | grep -q -v "OFFLINE"
return $?
}
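A hedged sketch of the corrected function, simply dropping the trailing blank as suggested:

    is_node_online() {
        crm_mon -1 -n | grep -e "^Node $1" | grep -q -v "OFFLINE"
        return $?
    }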

Cannot start Postgres master on Ubuntu

Hi,

I used your RA on Ubuntu 12.04, Pacemaker 1.1.6, Postgres 9.2.

I edited the paths as follows:

 primitive pgsql ocf:heartbeat:pgsql \
    params \
        pgctl="/usr/lib/postgresql/9.2/bin/pg_ctl" \
        psql="/usr/lib/postgresql/9.2/bin/psql" \
        pgdata="/etc/postgresql/9.2/hapg/" \
        start_opt="-p 5432" \
        rep_mode="sync" \
        node_list="pm01 pm02" \
        restore_command="cp /db/data/pg_archive/%f %p" \
        primary_conninfo_opt="keepalives_idle=60 keepalives_interval=5 keepalives_count=5" \
        master_ip="192.168.3.200" \
        stop_escalate="0" \
    op start   timeout="60s" interval="0s"  on-fail="restart" \
    op monitor timeout="60s" interval="7s" on-fail="restart" \
    op monitor timeout="60s" interval="2s"  on-fail="restart" role="Master" \
    op promote timeout="60s" interval="0s"  on-fail="restart" \
    op demote  timeout="60s" interval="0s"  on-fail="block" \
    op stop    timeout="60s" interval="0s"  on-fail="block" \
    op notify  timeout="60s" interval="0s"

And there was an error:

pgsql:0_start_0 (node=pm01, call=3, rc=-2, status=Timed Out): unknown exec error

I have not configured the other node yet. I also note that Pacemaker 1.1.6 uses time parameters without the "s" suffix (e.g. timeout="60").

I have configured Postgres replication successfully; it has been verified on both master and slave:

2012-12-11 17:09:11 ICT LOG:  database system is ready to accept connections
2012-12-11 17:09:11 ICT LOG:  autovacuum launcher started
2012-12-11 17:09:12 ICT LOG:  incomplete startup packet

I tried to start Postgres manually with:

/usr/lib/postgresql/9.2/bin/pg_ctl start -D /etc/postgresql/9.2/hapg -o "-p 5432"

(hapg is my Postgres cluster)

Here is crm_mon output:

============
Last updated: Tue Dec 11 18:00:05 2012
Last change: Tue Dec 11 17:58:12 2012 via cibadmin on pm01
Stack: openais
Current DC: pm01 - partition WITHOUT quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ pm01 ]
OFFLINE: [ pm02 ]


Failed actions:
    pgsql:0_start_0 (node=pm01, call=3, rc=-2, status=Timed Out): unknown exec error

Do you have any idea?

trouble detecting the master

Hi, again!

I am testing the cluster on corosync 2.3.x, so I had to build the packages from source:
libqb, corosync, resource-agents, pacemaker.
Then I saw problems on the slave servers: they are constantly searching for the master server.
I found this string in your code:

crm_mon -n1 | tr -d "\t" | tr -d " " | grep "^pgsql:.*[):]Master"

But in the old cluster version crm_mon returns this string:

pgsql:0(ocf::heartbeat:pgsql):Master

And in the new version:

pgsql(ocf::heartbeat:pgsql):Master

For now I have simply removed the colon from the regular expression (see the sketch below).
I saw your conversation about "OCF_RESKEY_CRM_meta_notify_master_uname".
Did you ever solve this problem?
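For illustration, a hedged pattern that matches both output styles (my workaround sketch, not an upstream change):

    crm_mon -n1 | tr -d "\t" | tr -d " " \
        | grep -E "^pgsql(:[0-9]+)?\(ocf::heartbeat:pgsql\):Master"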

In any case, here are the versions of the packages installed from source.

# yum list installed libqb corosync resource-agents pacemaker
corosync.x86_64          2.3.0-1.21.298a.el6      @/corosync-2.3.0-1.21.298a.el6.x86_64
libqb.x86_64             0.14.4-1.38.07c9.el6     @/libqb-0.14.4-1.38.07c9.el6.x86_64
pacemaker.x86_64         1.1.10-1.el6             installed
resource-agents.x86_64   3.9.5-1.35.a645.el6      installed

Even if pgsql doesn't run, "pgsql-data-status" will be defined.

Hi, again!

I am continuing my experiments with a cluster of two nodes plus a quorum node.
Today I built the following scheme:

  1. On the quorum node, set the attribute:
crm_attribute --type nodes --node-uname dev-cluster2-node2 --attr-name quorumnode -v yes -l forever

or

node attribute dev-cluster2-node2 set quorumnode yes
  2. Next, build your standard schema, but set clone-max=N-1 for all clones and define the locations:
location rsc_location-3 msPostgresql \
rule -inf: not_defined default_ping_set or default_ping_set lt 100 \
rule -inf: defined quorumnode
location quorumban pingCheck \
rule -inf: defined quorumnode

Everything looks right.

# crm_mon -1n
Last updated: Wed Aug 21 17:44:36 2013
Last change: Wed Aug 21 16:33:07 2013 via crm_attribute on dev-cluster2-node4
Stack: cman
Current DC: dev-cluster2-node4 - partition with quorum
Version: 1.1.11-1.el6-4f672bc
3 Nodes configured
5 Resources configured


Node dev-cluster2-node2: online
Node dev-cluster2-node3: online
        pgsql   (ocf::heartbeat:pgsql): Started
        pingCheck       (ocf::pacemaker:ping):  Started
Node dev-cluster2-node4: online
        pgsql   (ocf::heartbeat:pgsql): Master
        pingCheck       (ocf::pacemaker:ping):  Started
        master-ip       (ocf::heartbeat:IPaddr2):       Started

Even crm_resource displays the resources without a "stopped" status...

# crm_resource --list
 master-ip      (ocf::heartbeat:IPaddr2):       Started
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ dev-cluster2-node4.unix.tensor.ru ]
     Slaves: [ dev-cluster2-node3.unix.tensor.ru ]
 Clone Set: clnPingCheck [pingCheck]
     Started: [ dev-cluster2-node3.unix.tensor.ru dev-cluster2-node4.unix.tensor.ru ]

...except that the attribute is still set.

# crm_mon -A1
Last updated: Wed Aug 21 17:48:09 2013
Last change: Wed Aug 21 16:33:07 2013 via crm_attribute on dev-cluster2-node4
Stack: cman
Current DC: dev-cluster2-node4 - partition with quorum
Version: 1.1.11-1.el6-4f672bc
3 Nodes configured
5 Resources configured


Online: [ dev-cluster2-node2 dev-cluster2-node3 dev-cluster2-node4 ]

 master-ip      (ocf::heartbeat:IPaddr2):       Started dev-cluster2-node4
 Master/Slave Set: msPostgresql [pgsql]
     Masters: [ dev-cluster2-node4 ]
     Slaves: [ dev-cluster2-node3 ]
 Clone Set: clnPingCheck [pingCheck]
     Started: [ dev-cluster2-node3 dev-cluster2-node4 ]

Node Attributes:
* Node dev-cluster2-node2:
    + pgsql-data-status                 : DISCONNECT
    + quorumnode                        : yes
* Node dev-cluster2-node3:
    + default_ping_set                  : 100
    + master-pgsql                      : -INFINITY
    + pgsql-data-status                 : STREAMING|ASYNC
    + pgsql-status                      : HS:async
* Node dev-cluster2-node4:
    + default_ping_set                  : 100
    + master-pgsql                      : 1000
    + pgsql-data-status                 : LATEST
    + pgsql-master-baseline             : 0000000002000078
    + pgsql-status                      : PRI

pgsql slave does not run after a forced reboot on CentOS

The essence of the problem is that "su" always returns 0.
After the restart,

  1. The system tries to stop all resources.
  2. This calls pgsql_real_monitor,
  3. which calls pgsql_status.
  4. pgsql_status sees the old PID file and tries to send a signal to the process.
  5. BUT: it does this through runasowner, which runs the command via "su".

And here is the problem: su always returns true, whether the process exists or not.

My quick workaround was to change this line:

    ocf_run $quietrun $loglevel su $OCF_RESKEY_pgdba -c "cd $OCF_RESKEY_pgdata; $*"

to this:

    ocf_run $quietrun $loglevel sudo -u $OCF_RESKEY_pgdba sh -c "cd $OCF_RESKEY_pgdata ; $*"

But I did not like it.

I realize that this problem is probably better addressed to CentOS, since on Ubuntu, SUSE, FreeBSD, and macOS su works as it should, but...

How to check:

cat ./1.sh

#!/bin/sh

res=`su otheruser -c 'whoami11111 ; exit $?'`
echo "---"$res"---"$?"---"

If the command is wrong, it prints: ------127---
If it is a normal command: ---otheruser---0---

error on configure primitive

I have installed:
heartbeat 1:3.0.3-1ubuntu1 Subsystem for High-Availability Linux
pacemaker 1.0.8+hg15494-2ubuntu2

on ubuntu 10.04 server amd64
Linux test1.example.com 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43 UTC 2011 x86_64 GNU/Linux

I have the same software on both servers.

crm(live)# status

Last updated: Tue Jun 26 18:06:43 2012
Stack: Heartbeat
Current DC: test2.example.com (e67f40dc-34fc-483c-b30f-d22141f15ea7) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, unknown expected votes
1 Resources configured.

Online: [ test1.example.com test2.example.com ]

ClusterIP (ocf::heartbeat:IPaddr2): Started test1.example.com

When I try to add your pgsql script I get this error:
crm(live)# configure primitive postgresql ocf:heartbeat:pgsql
lrmadmin[15081]: 2012/06/26_18:08:32 ERROR: lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply message of rmetadata with function get_ret_from_msg.
ERROR: ocf:heartbeat:pgsql: could not parse meta-data:
ERROR: ocf:heartbeat:pgsql: no such resource agent

Restarting PostgreSQL in a cluster environment

Hello Takatoshi,

I have a question about restarting PostgreSQL when it runs in a cluster environment. For example, I have changed a postgres parameter which requires a restart. How can I correctly restart postgres in this case?

pgsql_validate_all fails on the Master and PostgreSQL is not initialized

Hi,

I'm trying to make it work on Debian 6 using PostgreSQL 9.1.2 (the version from backports) and Corosync 1.1.6.

At line 1799, pgsql_validate_all always fails with this error:

Feb 03 15:12:30 bdd2 lrmd: [27051]: info: RA output: (HA_POSTGRESQL:1:start:stderr) [: 1799:
Feb 03 15:12:30 bdd2 lrmd: [27051]: info: RA output: (HA_POSTGRESQL:1:start:stderr) 9.1: unexpected operator
Feb 03 15:12:30 bdd2 lrmd: [27051]: info: RA output: (HA_POSTGRESQL:1:start:stderr)

So I commented out line 1799.

Now PostgreSQL starts, but all servers are in "HS:alone" status and the master is not found.
I think the script runs too fast and does not wait for the PostgreSQL master to be initialized (line 1797).

On the master server I always see this in the log (with pghost="127.0.0.1"):

Feb 03 15:17:31 bdd2 lrmd: [27051]: info: RA output: (HA_POSTGRESQL:1:start:stderr) psql: could not connect to server: Connection refused
Is the server running on host "127.0.0.1" and accepting
TCP/IP connections on port 5432?

Or (without pghost):

Feb 03 15:20:58 bdd2 lrmd: [27051]: info: RA output: (HA_POSTGRESQL:1:start:stderr) psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?

LOCK file is not removed when PGSQL is stopped after quorum is lost.

Hi!
I found trouble again. :)
I am experimenting with quorum loss.
After quorum is lost, the LOCK file is not deleted.
I looked at the code:

    if  [ "$1" = "master" -a "$OCF_RESKEY_CRM_meta_notify_slave_uname" = " " ]; then
        ocf_log info "Removing $PGSQL_LOCK."
        rm -f $PGSQL_LOCK
    fi

Where is $OCF_RESKEY_CRM_meta_notify_slave_uname defined? What does it mean?

The postgresql on the master node cannot start up?

Configuration: 2 machines, IPs 192.168.4.104 and 192.168.4.105, host names h105 and h104;
the master postgres is on h105.

My pacemaker configuration:
node $id="01211558-f31b-445a-8a97-615bf30fec35" h104 \
attributes pgsql-data-status="DISCONNECT"
node $id="3ed6fbd6-567f-4104-8447-6ef620599369" h105 \
attributes pgsql-data-status="LATEST"
primitive postgresql ocf:heartbeat:pgsql \
params pgctl="/var/lib/pgsql/pginstall/bin/pg_ctl" psql="/var/lib/pgsql/pginstall/bin/psql" pgdata="/var/lib/pgsql/data" logfile="/var/lib/pgsql/data/postgresql.log" start_opt="-p 5432" pgdba="postgres" rep_mode="async" node_list="h105 h104" master_ip="192.168.4.105" \
op start interval="0s" timeout="60s" on-fail="restart" \
op monitor interval="7s" timeout="60s" on-fail="restart" \
op promote interval="0s" timeout="60s" on-fail="restart" \
op demote interval="0s" timeout="60s" on-fail="block" \
op stop interval="0s" timeout="60s" on-fail="block"
ms msPostgres postgresql \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
location rsc_location msPostgres \
rule $id="rsc_location-rule" $role="Master" 200: #uname eq h105 \
rule $id="rsc_location-rule-0" $role="Master" 100: #uname eq h104
property $id="cib-bootstrap-options" \
dc-version="1.0.11-9af47ddebcad19e35a61b2a20301dc038018e8e8" \
cluster-infrastructure="Heartbeat" \
crmd-transition-delay="0s" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1341305646" \
expected-quorum-votes="2" \
symmetric-cluster="true" \
startup-fencing="false"
rsc_defaults $id="rsc-options" \
resource-stickiness="INFINITY" \
migration-threshold="1"

Start pacemaker on h105: /etc/init.d/heartbeat start

Then I get the following errors:

Last updated: Wed Jul 11 16:15:57 2012
Stack: Heartbeat
Current DC: h105 (3ed6fbd6-567f-4104-8447-6ef620599369) - partition with quorum
Version: 1.0.11-9af47ddebcad19e35a61b2a20301dc038018e8e8
2 Nodes configured, 2 expected votes
1 Resources configured.

Online: [ h105 ]
OFFLINE: [ h104 ]

Node Attributes:

  • Node h105:
    • master-postgresql:0 : -INFINITY
    • pgsql-data-status : LATEST
    • pgsql-status : STOP

Failed actions:
postgresql:0_start_0 (node=h105, call=3, rc=-2, status=Timed Out): unknown exec error

After looking at the PostgreSQL log, I found that PostgreSQL on h105 started in standby mode, and there was a recovery.conf file in the data directory. Why did PostgreSQL start in standby mode? How do I configure pacemaker so that PostgreSQL starts as master?

async mode

Hello Takatoshi,

Thank you very much for your work!

Sync mode works for me, but I need to set up PostgreSQL replication in asynchronous mode; I think it is the best mode for a high-load application. I edited the postgresql resource via crm -> configure -> resource -> edit postgresql and specified rep_mode="async", then restarted corosync. After the restart crm_mon -A shows:

Online: [ node1 ]
OFFLINE: [ node2 ]

Clone Set: clnPingCheck
Started: [ node1 ]
Stopped: [ pingCheck:1 ]

Node Attributes:

  • Node node1:
    • default_ping_set : 100
    • master-postgresql:0 : -INFINITY
    • pgsql-data-status : LATEST
    • pgsql-status : STOP

Failed actions:
postgresql:0_start_0 (node=node1, call=9, rc=-2, status=Timed Out): unknown exec error

pgstartup.log....

FATAL: could not open configuration file "/var/lib/pgsql/rep_mode.conf": No such file or directory

corosync.log....

Jan 26 23:00:16 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:16 INFO: server starting
Jan 26 23:00:16 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:16 INFO: PostgreSQL start command sent.
Jan 26 23:00:16 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:16 INFO: PostgreSQL is down
Jan 26 23:00:17 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:17 INFO: PostgreSQL is down
Jan 26 23:00:18 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:18 INFO: PostgreSQL is down
Jan 26 23:00:19 node1 lrmd: [17640]: info: RA output: (postgresql:0:start:stderr) 2012/01/26_23:00:19 INFO: PostgreSQL is down

Please help me resolve this problem.

Fail-over process

Hi,

When the Master failed, the Slave became the new Master.

How can I make the old Master become the new Slave automatically when it comes back?

I have had to do it manually on the old Master:

  1. Stop Postgres.
  2. Copy WAL files from the new Master:
    On the new Master:
    rsync -av /var/lib/postgresql/9.1/main/pg_xlog 192.168.3.2:/var/lib/postgresql/9.1/main/
  3. Remove PGSQL.lock file
  4. Restart corosync.

Then the old Master became the new Slave.

Data loss can occur when using 3 or more nodes

When using 3 or more nodes, rep_mode=sync can cause data loss.

Scenario

node1 : Master (PRI)
node2 : Slave (HS:sync)
node3 : Slave (HS:potential)

  1. The network is broken on node2
    -> PostgreSQL on node1 sends data to node3 and reports success to the client before updating node3's attributes
  2. node1 is broken
    -> Pacemaker promotes node2 because node2 is still HS:sync
    -> this causes data loss

using archive command

You have written in the wiki about a bug in PostgreSQL 9.1.1 that makes switching the Master fail. Because of this bug it is necessary to use "archive_command" in postgresql.conf and "restore_command" in recovery.conf. Please explain in more detail in which scenario this bug appears.

And if you use an archive of WAL records, how will it behave over a long period of time? The space occupied by archived WAL records is never cleared on the master server, and at some point the space will run out...

Thank you in advance!

pg9.3 socket dir

This parameter has been pluralised to unix_socket_directories in 9.3:

    [ $check_config_rc -eq 0 ] && : ${OCF_RESKEY_socketdir=`get_pgsql_param unix_socket_directory`}
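A hedged sketch of one way to cope with both names, trying the 9.3+ plural first and falling back to the old singular (get_pgsql_param and the surrounding variables are the RA's own; the fallback logic is my assumption, and unix_socket_directories may contain a comma-separated list):

    if [ $check_config_rc -eq 0 ]; then
        socketdir=`get_pgsql_param unix_socket_directories`
        [ -z "$socketdir" ] && socketdir=`get_pgsql_param unix_socket_directory`
        : ${OCF_RESKEY_socketdir=$socketdir}
    fi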

Access to pg_stat_replication

If monitor_user/monitor_password is specified and the monitor user doesn't have SUPERUSER privileges, pgsql-data-status on the slave is shown incorrectly. This is because only a SUPERUSER has access to the pg_stat_replication view.
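For illustration, a hedged example of the kind of query involved (the RA's actual query may differ): run as a non-superuser monitor user, the replication details in this view are not accessible, so the derived data-status attribute ends up wrong.

    # run as the monitor user; without SUPERUSER the replication details in
    # pg_stat_replication are not available, so the data-status cannot be
    # derived correctly
    psql -U "$OCF_RESKEY_monitor_user" -Atc \
        "SELECT application_name, state, sync_state FROM pg_stat_replication"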

not using monitor_user and monitor_password

(from #16)

Now, taking a closer look at the code, I can see that we are not actually always using monitor_user and monitor_password; look here:

(this is on pgsql_real_monitor, around line 772, master branch on your repository):

        output=`su $OCF_RESKEY_pgdba -c "cd $OCF_RESKEY_pgdata; \
$OCF_RESKEY_psql $psql_options -U $OCF_RESKEY_pgdba \
-Atc \"${CHECK_MS_SQL}\""`

As you can see, it is clearly using pgdba (-U $OCF_RESKEY_pgdba) instead of monitor_user, which is in the PGUSER variable. It also assumes the pgdba user has direct access to the DB; even though this is the case most of the time, it is not necessarily true, as it only takes a change in pg_hba.conf to disallow a direct, passwordless connection for the pgdba user through the Unix domain socket.

Later, line 797:

    runasowner -q $loglevel "$OCF_RESKEY_psql $psql_options \
-c '$OCF_RESKEY_monitor_sql'"

At this point it would use PGUSER and PGPASSWORD if available... but maybe setting the variables right before running the command would be better (instead of having PGUSER/PGPASSWORD set globally in the script for every action). See the sketch below.
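A minimal sketch of that suggestion, scoping the credentials to the single invocation rather than exporting PGUSER/PGPASSWORD globally (variable names follow the RA's conventions; the exact placement is an assumption):

    runasowner -q $loglevel "PGUSER=$OCF_RESKEY_monitor_user \
        PGPASSWORD=$OCF_RESKEY_monitor_password \
        $OCF_RESKEY_psql $psql_options -c '$OCF_RESKEY_monitor_sql'"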

Vip-slave in async mode

If I use async mode, the vip-slave resource doesn't move to my slave. I have changed the rsc_location-1 configuration so that vip-slave can migrate:

crm(live)configure# show rsc_location-1
location rsc_location-1 vip-slave \
rule $id="rsc_location-1-rule" 200: pgsql-status eq HS:sync \
rule $id="rsc_location-1-rule-0" 150: pgsql-status eq HS:async \
rule $id="rsc_location-1-rule-1" 100: pgsql-status eq PRI \
rule $id="rsc_location-1-rule-2" -inf: not_defined pgsql-status \
rule $id="rsc_location-1-rule-3" -inf: pgsql-status ne HS:sync and pgsql-status ne PRI and pgsql-status ne HS:async

What do you think: is it possible to use such a configuration?
