
orchestrator's Introduction

orchestrator upstream has moved

NOTE: orchestrator development is now active on https://github.com/github/orchestrator, where Issues and Pull Requests are accepted.

This repository is no longer the upstream and latest version of orchestrator.

The documentation in this repository is not up-to-date.


orchestrator [Manual]

Orchestrator is a MySQL replication topology management and visualization tool, allowing for:

Discovery

orchestrator actively crawls through your topologies and maps them. It reads basic MySQL info such as replication status and configuration.

It provides slick visualization of your topologies, including replication problems, even in the face of failures.

Refactoring

orchestrator understands replication rules. It knows about binlog file:position, GTID, Pseudo GTID, Binlog Servers.

Refactoring replication topologies can be a matter of dragging and dropping a replica under another master. Moving slaves around becomes safe: orchestrator will reject an illegal refactoring attempt.

Fine-grained control is achieved by various command line options.

Recovery

Orchestrator uses a holistic approach to detect master and intermediate master failures. Based on information gained from the topology itself, it recognizes a variety of failure scenarios.

Depending on configuration, it may perform automated recovery (or let the user choose the type of manual recovery). Intermediate master recovery is achieved internally by orchestrator. Master failover is supported by pre/post failure hooks.

The recovery process utilizes orchestrator's understanding of the topology and its ability to perform refactoring. It is based on state as opposed to configuration: orchestrator picks the best recovery method by investigating/evaluating the topology at the time of recovery itself.

The interface

Orchestrator supports:

  • Command line interface (love your debug messages, take control of automated scripting)
  • Web API (HTTP GET access)
  • Web interface, a slick one.

Orchestrator screenshot

More

  • Auditing
  • Supports Pseudo-GTID
  • Datacenter/physical location awareness
  • Maintenance/downtime server states
  • MySQL-Pool association
  • Run as a service; orchestrator multi-service HA
  • HTTP security/authentication methods
  • When working with orchestrator-agent, seed new/corrupt instances
  • More...

Read the Orchestrator Manual for comprehensive documentation.

Authored by Shlomi Noach at GitHub. Previously at Booking.com and Outbrain

orchestrator's People

Contributors

brianip, dgryski, dveeden, enisoc, grypyrg, jcesario, mantas-sidlauskas, orkestrov, rlowe, roman-vynar, shlomi-noach, shuhaowu, sjmudd, sqcesario, techwolf359, tomkrouper, voroonestic

orchestrator's Issues

Prevent move-below when invalid

move-below should be prevented upon:

  • replicating from newer major version to older major version (e.g. 5.6 to 5.5)
  • binlog_format incompatible (row|mixed->statement, row->mixed)
  • no (log_bin && log_slave_updates) on current sibling (==new master)
  • replication filters in place and are incompatible (will probably not handle at this stage)
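The first three checks above could be sketched roughly as follows. This is only an illustration: the `Instance` struct, its field names, and `CanMoveBelow` are hypothetical and not orchestrator's actual types, and replication filters are left out, as the issue suggests.

```go
package main

import (
	"errors"
	"fmt"
)

// Instance holds only the fields these checks need.
// Hypothetical type; not orchestrator's real data model.
type Instance struct {
	Version         [2]int // major version, e.g. {5, 6} for MySQL 5.6
	BinlogFormat    string // "STATEMENT", "MIXED" or "ROW"
	LogBin          bool
	LogSlaveUpdates bool
}

// binlogRank orders formats so that a master may only feed a replica whose
// format ranks equal or higher. Unknown strings rank as 0 in this sketch.
var binlogRank = map[string]int{"STATEMENT": 0, "MIXED": 1, "ROW": 2}

// newerMajor reports whether major version a is newer than b.
func newerMajor(a, b [2]int) bool {
	if a[0] != b[0] {
		return a[0] > b[0]
	}
	return a[1] > b[1]
}

// CanMoveBelow returns nil when instance may be moved below sibling,
// that is, when sibling may become the new master of instance.
func CanMoveBelow(instance, sibling Instance) error {
	switch {
	case newerMajor(sibling.Version, instance.Version):
		return errors.New("new master runs a newer major version than the replica")
	case binlogRank[sibling.BinlogFormat] > binlogRank[instance.BinlogFormat]:
		return errors.New("binlog_format of new master is incompatible with the replica")
	case !(sibling.LogBin && sibling.LogSlaveUpdates):
		return errors.New("new master lacks log_bin or log_slave_updates")
	}
	return nil
}

func main() {
	replica := Instance{Version: [2]int{5, 5}, BinlogFormat: "STATEMENT", LogBin: true, LogSlaveUpdates: true}
	sibling := Instance{Version: [2]int{5, 6}, BinlogFormat: "ROW", LogBin: true, LogSlaveUpdates: true}
	// A 5.6 master feeding a 5.5 replica is refused by the first check.
	fmt.Println(CanMoveBelow(replica, sibling))
}
```

Returning an error rather than a boolean lets the caller show the user which rule rejected the move.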

Master-slave alignment problem

Steps followed:

  1. Configured conf file and later started daemon
  2. Discovery -> searched for master (127.0.0.1 and port 3390)
  3. Discovery --> searched for slave (127.0.0.1 and port 3391)

Master and slave should be aligned properly as described in your documentation, but this is not happening in my case (see the attached screenshot).
master_slave

Daemon fails with error for .deb package

~$ cd /usr/local/orchestrator && ./orchestrator http
[martini] listening on :3000
2014-06-10 11:07:04 ERROR dial tcp 127.0.0.1:5532: connection refused
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x50676e]

goroutine 12 [running]:
runtime.panic(0x70cf60, 0xb285a8)
/usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
database/sql.(*Rows).Close(0x0, 0xc21005a580, 0x70)
/usr/local/go/src/pkg/database/sql/sql.go:1576 +0x1e
github.com/outbrain/sqlutils.QueryRowsMap(0xc21005a600, 0xc21005a580, 0x70, 0x7fdd57bb1e58, 0x7fdd57d29b48, ...)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/sqlutils/sqlutils.go:143 +0x119
github.com/outbrain/orchestrator/inst.ReadOutdatedInstanceKeys(0x674460, 0xc210052700, 0xecb289080, 0x3457510, 0xb2ffa0)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/inst/instance_dao.go:466 +0x1ee
github.com/outbrain/orchestrator/logic.ContinuousDiscovery()
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:130 +0x10e
created by github.com/outbrain/orchestrator/app.Http
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/app/http.go:50 +0x1db

goroutine 1 [IO wait]:
net.runtime_pollWait(0x7fdd57d299a0, 0x72, 0x0)
/usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc2100d2a70, 0x72, 0x7fdd57d27098, 0xb)
/usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc2100d2a70, 0xb, 0x7fdd57d27098)
/usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).accept(0xc2100d2a10, 0x826788, 0x0, 0x7fdd57d27098, 0xb)
/usr/local/go/src/pkg/net/fd_unix.go:382 +0x2c2
net.(*TCPListener).AcceptTCP(0xc210000408, 0x51db2b, 0x7fdd57b97ae0, 0x51db2b)
/usr/local/go/src/pkg/net/tcpsock_posix.go:233 +0x47
net.(*TCPListener).Accept(0xc210000408, 0x7fdd57d29a48, 0xc210000428, 0xc21005a500, 0x0)
/usr/local/go/src/pkg/net/tcpsock_posix.go:243 +0x27
net/http.(*Server).Serve(0xc2100c8d20, 0x7fdd57d28a00, 0xc210000408, 0x0, 0x0)
/usr/local/go/src/pkg/net/http/server.go:1622 +0x91
net/http.(*Server).ListenAndServe(0xc2100c8d20, 0xc2100c8d20, 0x7fdd57b97be0)
/usr/local/go/src/pkg/net/http/server.go:1612 +0xa0
net/http.ListenAndServe(0xc2100003e0, 0x5, 0x7fdd57d28980, 0xc2100513c0, 0x1, ...)
/usr/local/go/src/pkg/net/http/server.go:1677 +0x6d
github.com/go-martini/martini.(*Martini).Run(0xc2100513c0)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:84 +0x303
github.com/outbrain/orchestrator/app.Http(0x7fffc347c401)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/app/http.go:57 +0x224
main.main()
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/main.go:59 +0x518

goroutine 14 [chan receive]:
github.com/outbrain/orchestrator/logic.handleDiscoveryRequests(0x0, 0x0)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:39 +0x3d
created by github.com/outbrain/orchestrator/logic.ContinuousDiscovery
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:126 +0x67

goroutine 13 [finalizer wait]:
runtime.park(0x40ad80, 0xb2e8f8, 0xb2ace8)
/usr/local/go/src/pkg/runtime/proc.c:1342 +0x66
runfinq()
/usr/local/go/src/pkg/runtime/mgc0.c:2279 +0x84
runtime.goexit()
/usr/local/go/src/pkg/runtime/proc.c:1394

goroutine 17 [chan receive]:
database/sql.(*DB).connectionOpener(0xc21005a600)
/usr/local/go/src/pkg/database/sql/sql.go:574 +0x3e
created by database/sql.Open
/usr/local/go/src/pkg/database/sql/sql.go:436 +0x24d
mahesh@mahesh-OptiPlex-990:/usr/local/orchestrator$ sudo cd /usr/local/orchestrator && ./orchestrator http
[sudo] password for mahesh:
sudo: cd: command not found
mahesh@mahesh-OptiPlex-990:/usr/local/orchestrator$ sudo ./orchestrator http
[martini] listening on :3000
2014-06-10 11:08:24 ERROR dial tcp 127.0.0.1:5532: connection refused
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x20 pc=0x50676e]

goroutine 12 [running]:
runtime.panic(0x70cf60, 0xb285a8)
/usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
database/sql.(*Rows).Close(0x0, 0xc210058500, 0x70)
/usr/local/go/src/pkg/database/sql/sql.go:1576 +0x1e
github.com/outbrain/sqlutils.QueryRowsMap(0xc210058580, 0xc210058500, 0x70, 0x7f17c6340e58, 0x7f17c64b8a90, ...)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/sqlutils/sqlutils.go:143 +0x119
github.com/outbrain/orchestrator/inst.ReadOutdatedInstanceKeys(0x674460, 0xc210050700, 0xecb2890d0, 0x29b4e2e0, 0xb2ffa0)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/inst/instance_dao.go:466 +0x1ee
github.com/outbrain/orchestrator/logic.ContinuousDiscovery()
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:130 +0x10e
created by github.com/outbrain/orchestrator/app.Http
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/app/http.go:50 +0x1db

goroutine 1 [IO wait]:
net.runtime_pollWait(0x7f17c64b8960, 0x72, 0x0)
/usr/local/go/src/pkg/runtime/netpoll.goc:116 +0x6a
net.(*pollDesc).Wait(0xc2100cfa70, 0x72, 0x7f17c64b6098, 0xb)
/usr/local/go/src/pkg/net/fd_poll_runtime.go:81 +0x34
net.(*pollDesc).WaitRead(0xc2100cfa70, 0xb, 0x7f17c64b6098)
/usr/local/go/src/pkg/net/fd_poll_runtime.go:86 +0x30
net.(*netFD).accept(0xc2100cfa10, 0x826788, 0x0, 0x7f17c64b6098, 0xb)
/usr/local/go/src/pkg/net/fd_unix.go:382 +0x2c2
net.(*TCPListener).AcceptTCP(0xc210000408, 0x18, 0xc210039010, 0x51d903)
/usr/local/go/src/pkg/net/tcpsock_posix.go:233 +0x47
net.(*TCPListener).Accept(0xc210000408, 0x7f17c64b7968, 0xc2100d0060, 0x0, 0x0)
/usr/local/go/src/pkg/net/tcpsock_posix.go:243 +0x27
net/http.(*Server).Serve(0xc2100c5cd0, 0x7f17c64b79c0, 0xc210000408, 0x0, 0x0)
/usr/local/go/src/pkg/net/http/server.go:1622 +0x91
net/http.(*Server).ListenAndServe(0xc2100c5cd0, 0xc2100c5cd0, 0x7f17c6326be0)
/usr/local/go/src/pkg/net/http/server.go:1612 +0xa0
net/http.ListenAndServe(0xc2100003e0, 0x5, 0x7f17c64b7940, 0xc21004f3c0, 0x1, ...)
/usr/local/go/src/pkg/net/http/server.go:1677 +0x6d
github.com/go-martini/martini.(*Martini).Run(0xc21004f3c0)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:84 +0x303
github.com/outbrain/orchestrator/app.Http(0x7fff795c3901)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/app/http.go:57 +0x224
main.main()
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/main.go:59 +0x518

goroutine 14 [chan receive]:
github.com/outbrain/orchestrator/logic.handleDiscoveryRequests(0x0, 0x0)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:39 +0x3d
created by github.com/outbrain/orchestrator/logic.ContinuousDiscovery
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:126 +0x67

goroutine 13 [finalizer wait]:
runtime.park(0x40ad80, 0xb2e8f8, 0xb2ace8)
/usr/local/go/src/pkg/runtime/proc.c:1342 +0x66
runfinq()
/usr/local/go/src/pkg/runtime/mgc0.c:2279 +0x84
runtime.goexit()
/usr/local/go/src/pkg/runtime/proc.c:1394

goroutine 16 [chan receive]:
database/sql.(*DB).connectionOpener(0xc210058580)
/usr/local/go/src/pkg/database/sql/sql.go:574 +0x3e
created by database/sql.Open
/usr/local/go/src/pkg/database/sql/sql.go:436 +0x24d

prevent slave move upon problem

prevent move-up when:

  • unseen
  • unchecked
  • not replicating
  • lag > threshold

prevent move-below when either host:

  • unseen
  • unchecked
  • not replicating
  • lag > threshold
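A minimal sketch of these guards, assuming a hypothetical `Status` struct that carries the four conditions listed above. The names `MoveUpAllowed` and `MoveBelowAllowed` are made up for illustration, and the threshold of 10 seconds in `main` is just an example value.

```go
package main

import "fmt"

// Status is a hypothetical snapshot of what is known about one instance.
type Status struct {
	Seen        bool // instance was reachable recently
	Checked     bool // last check completed and was valid
	Replicating bool // replication threads are running
	LagSeconds  int
}

// moveProblem returns the first condition that should block a move,
// or an empty string when the instance looks healthy.
func moveProblem(s Status, maxLagSeconds int) string {
	switch {
	case !s.Seen:
		return "unseen"
	case !s.Checked:
		return "unchecked"
	case !s.Replicating:
		return "not replicating"
	case s.LagSeconds > maxLagSeconds:
		return "lag > threshold"
	}
	return ""
}

// MoveUpAllowed checks only the instance being moved.
func MoveUpAllowed(instance Status, maxLagSeconds int) error {
	if p := moveProblem(instance, maxLagSeconds); p != "" {
		return fmt.Errorf("move-up refused: %s", p)
	}
	return nil
}

// MoveBelowAllowed checks both hosts taking part in the move.
func MoveBelowAllowed(instance, sibling Status, maxLagSeconds int) error {
	hosts := []struct {
		name string
		s    Status
	}{{"instance", instance}, {"sibling", sibling}}
	for _, h := range hosts {
		if p := moveProblem(h.s, maxLagSeconds); p != "" {
			return fmt.Errorf("move-below refused: %s is %s", h.name, p)
		}
	}
	return nil
}

func main() {
	healthy := Status{Seen: true, Checked: true, Replicating: true, LagSeconds: 2}
	lagging := Status{Seen: true, Checked: true, Replicating: true, LagSeconds: 45}
	fmt.Println(MoveUpAllowed(healthy, 10))
	fmt.Println(MoveBelowAllowed(healthy, lagging, 10))
}
```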

audit all operations

create an audit table; log any:

  • move-(up|below),
  • (begin|end)-maintenance (except for those implied by move-*)
  • discover
  • forget (need to consider auto-forget as well)

Support a read only view in the front end

In larger organizations, it would be great to support a toggle in the frontend that disallows changing the topology but is still a nice way to see the different clusters' topology and lag status.

Last Check Invalid, --debug seems to lack detail on what that means.

I've attempted to attach Orchestrator to a host as such:

Discover > Added host: istsbdcs01-pub (this host exists in dns and resolves)

The output returns this:

2014-06-27 19:32:15 INFO Starting discovery at {Hostname:istsbdcs01-pub.isllc.local Port:3306}
2014-06-27 19:32:15 INFO No entry for &{Hostname:istsbdcs01-pub.isllc.local Port:3306}
2014-06-27 19:32:15 ERROR lookup : no such host

Is there a reason it can't look up the host when it resolves on the server and elsewhere? Is that even relevant to the error?

It states "Last Check Invalid" and gives no further detail as to what part of the check failed. Is there a plan to extend these messages to be more specific?

If possible add a mouse over describing the dots in .../web/clusters/

The line with instances shows up to 4 dots which seem to represent: total servers in the cluster, warning, critical, and not available.

  • It might be nice if a mouse-over describes this a little more completely to avoid us looking it up in the docs
  • It might also be nice to be able to click on these dots and see only these servers (so we can find out more details about the problem etc).

Support for configuration file for clusters

I am not sure if this already exists and just isn't documented but if it doesn't exist, it'd be great if Orchestrator can support a prewritten JSON file of the clusters it should contain in the dashboard.

As an ops person automating the deployment of an Orchestrator dashboard, I can make my task a lot faster by leveraging infrastructure search tools (chef search for example) to create this instead of having to manually add clusters when a new orchestrator dashboard is added. Especially if I manage over a dozen clusters (which is the case for me)
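One way this could work, sketched under assumptions: a deployment tool writes a JSON list of seed hosts, and a small helper turns each seed into a call to the `/api/discover/<host>/<port>` endpoint. The file format (`hostname`, `port`), the `Seed` type, and the example hostnames are hypothetical; nothing here is an existing orchestrator feature.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Seed is one cluster entry point in the hypothetical seed file.
type Seed struct {
	Hostname string `json:"hostname"`
	Port     int    `json:"port"`
}

// DiscoverURLs parses the seed file contents and returns one discovery
// URL per seed. A missing port defaults to 3306.
func DiscoverURLs(baseURL string, raw []byte) ([]string, error) {
	var seeds []Seed
	if err := json.Unmarshal(raw, &seeds); err != nil {
		return nil, fmt.Errorf("cannot parse seed file: %w", err)
	}
	urls := make([]string, 0, len(seeds))
	for _, s := range seeds {
		port := s.Port
		if port == 0 {
			port = 3306
		}
		urls = append(urls, fmt.Sprintf("%s/api/discover/%s/%d", baseURL, s.Hostname, port))
	}
	return urls, nil
}

func main() {
	// In practice this content would be generated, e.g. from a chef search.
	raw := []byte(`[
		{"hostname": "db-master-1.example.com", "port": 3306},
		{"hostname": "db-master-2.example.com"}
	]`)
	urls, err := DiscoverURLs("http://localhost:3000", raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, u := range urls {
		fmt.Println(u) // each URL would then be requested with HTTP GET
	}
}
```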

Martini error stack when attempting to discover a cluster

I tested the orchestrator db user on the target DB before posting this. My lack of knowledge of martini and Go may be a factor here rather than a bug, but when I try to add a cluster through the discover page, I get this error stack in the orchestrator.log

[martini] Started GET /api/discover/stdbcontact-002.sjc1.sendgrid.net/3306 for 10.0.2.2:60163
[martini] PANIC: Value not found for type auth.User
/usr/lib64/go/src/runtime/panic.go:387 (0x412bc8)
/usr/share/golang/src/github.com/go-martini/martini/router.go:352 (0x504281)
/usr/share/golang/src/github.com/go-martini/martini/router.go:229 (0x503364)
/usr/share/golang/src/github.com/go-martini/martini/router.go:112 (0x501fe8)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:119 (0x504ecb)
/usr/lib64/go/src/runtime/asm_amd64.s:403 (0x43b005)
/usr/lib64/go/src/reflect/value.go:419 (0x5821b5)
/usr/lib64/go/src/reflect/value.go:296 (0x580fac)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ea747)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x500171)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x500083)
/usr/share/golang/src/github.com/martini-contrib/gzip/gzip.go:37 (0x50c218)
/usr/lib64/go/src/runtime/asm_amd64.s:403 (0x43b005)
/usr/lib64/go/src/reflect/value.go:419 (0x5821b5)
/usr/lib64/go/src/reflect/value.go:296 (0x580fac)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ea747)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x500171)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x500083)
/usr/share/golang/src/github.com/go-martini/martini/recovery.go:142 (0x505509)
/usr/lib64/go/src/runtime/asm_amd64.s:402 (0x43af95)
/usr/lib64/go/src/reflect/value.go:419 (0x5821b5)
/usr/lib64/go/src/reflect/value.go:296 (0x580fac)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ea747)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x500171)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:164 (0x500083)
/usr/share/golang/src/github.com/go-martini/martini/logger.go:25 (0x504bcf)
/usr/lib64/go/src/runtime/asm_amd64.s:403 (0x43b005)
/usr/lib64/go/src/reflect/value.go:419 (0x5821b5)
/usr/lib64/go/src/reflect/value.go:296 (0x580fac)
/usr/share/golang/src/github.com/codegangsta/inject/inject.go:102 (0x5ea747)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:173 (0x500171)
/usr/share/golang/src/github.com/go-martini/martini/martini.go:69 (0x4feef4)
/usr/lib64/go/src/net/http/server.go:1703 (0x52986a)
/usr/lib64/go/src/net/http/server.go:1204 (0x527337)
/usr/lib64/go/src/runtime/asm_amd64.s:2232 (0x43d041)
2014/12/25 02:17:27 http: multiple response.WriteHeader calls
[martini] Completed 500 Internal Server Error in 1.80033ms

Config file

{
"MySQLTopologyUser": "orchestrator",
"MySQLTopologyPassword": "REDACTED",
"MySQLOrchestratorHost": "localhost",
"MySQLOrchestratorPort": 3306,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "orchestrator",
"MySQLOrchestratorPassword": "REDACTED",
"SlaveLagQuery": "",
"DiscoverByShowSlaveHosts": true,
"InstancePollSeconds": 60,
"UnseenInstanceForgetHours": 240,
"ReasonableReplicationLagSeconds" : 10,
"ReasonableMaintenanceReplicationLagSeconds" : 20,
"AuditPageSize": 20,
"SlaveStartPostWaitMilliseconds" : 1000,
"HTTPAuthUser": "",
"HTTPAuthPassword": ""
}

Let me know if I should post any more information.

web: note time since last seen on unseen instances

For unseen instances (visualized in gray or black), don't show replication lag but instead the time since last seen. Replication lag is irrelevant since it does not update while the instance cannot be accessed.

Privilege Error

I ran the grants as indicated in the documentation, but I get the following error:

2014-08-14 15:05:03 ERROR Error 1227: Access denied; you need (at least one of) the REPLICATION SLAVE privilege(s) for this operation

This is the result of my "show grant":
GRANT PROCESS, SUPER ON *.* TO 'orchestrator_act'@'drgordbreport' IDENTIFIED BY PASSWORD 'hashedpassword'

Perhaps the documentation is out of date for the latest release?

Edit: This seems to occur even if I add the "replication slave" privilege.

Add a /health URL that returns JSON with the health of the app as a whole

This is a common pattern in ops deployment of applications. Usually it would be a /health URL that returns, when healthy, return code 200 and a JSON showing all the things it checked. In orchestrator's case it would be a JSON that would look like this

{"orchestrator": "ok", "orchestrator_mysql": {"orchestrator_mysql_running": "ok", "orchestrator_mysql_query": "ok"}}

This would greatly simplify the task of monitoring orchestrator in any given infrastructure

Orchestrator still requires SUPER privilege even when the entire instance is read only

As far as I can tell, all the information needed for a read only orchestrator can come from either REPLICATION CLIENT and REPLICATION SLAVE privileges. Yet with an instance set as ReadOnly true, I still can't even properly discover instances without the super privilege.

In many environments, granting the SUPER privilege to a non-localhost user is not recommended, and if I am already deploying Orchestrator as a read-only tool, I would think SUPER privileges shouldn't be needed.

[1.0.3] Last Check Invalid, no such host

Ran 1.0.3 with --stack --debug, here's the output:

[martini] Started GET /api/discover/istsbdcs01-pub.isllc.local/3306
[martini] Completed 200 OK in 3.764292ms
2014-06-30 17:04:46 INFO Starting discovery at {Hostname:istsbdcs01-pub.isllc.local Port:3306}
2014-06-30 17:04:46 INFO No entry for &{Hostname:istsbdcs01-pub.isllc.local Port:3306}
2014-06-30 17:04:46 ERROR lookup : no such host
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/log/log.go:105 (0x42cec1)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/log/log.go:151 (0x42d31a)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/inst/instance_dao.go:154 (0x474dad)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:76 (0x4806ff)
/home/snoach/dev/outbrain/github/orchestrator/src/github.com/outbrain/orchestrator/logic/orchestrator.go:53 (0x480ed8)
/usr/local/go/src/pkg/runtime/proc.c:1394 (0x4177e0)

Here's the verification of DNS resolution:

[sbduser@istmysqlgw01 orchestrator]$ nslookup istsbdcs01-pub
Server: 10.1.104.30
Address: 10.1.104.30#53

Name: istsbdcs01-pub.isllc.local
Address: 10.11.1.80

[sbduser@istmysqlgw01 orchestrator]$ nslookup istsbdcs01-pub.isllc.local
Server: 10.1.104.30
Address: 10.1.104.30#53

Name: istsbdcs01-pub.isllc.local
Address: 10.11.1.80

It appears it can pull information on the mysql instance -- something it hasn't failed to do -- but something else is failing/causing it to throw the "last check invalid" error.

host_info

Orchestrator init script returns success code even if orchestrator didn't start successfully

To reproduce, just introduce bad syntax to the JSON config file and issue a restart for Orchestrator

[vagrant@default-centos-64 ~]$ sudo /etc/init.d/orchestrator restart
Stopping orchestrator                             /etc/init.d/orchestrator: line 56: kill: (4366) - No such process
Ok
Starting orchestrator...                          Ok
[vagrant@default-centos-64 ~]$ echo $?
0

Check Orchestrator status and it clearly failed to start

[vagrant@default-centos-64 ~]$ tail /var/log/orchestrator.log
2014-12-20 22:43:45 INFO starting
2014-12-20 22:43:45 FATAL Cannot read config file: /etc/orchestrator.conf.json invalid character '"' after object key:value pair
[vagrant@default-centos-64 ~]$ pgrep orchestrator
[vagrant@default-centos-64 ~]$

I may look at the init script and see if I can make a PR for this but I want to update the chef cookbook to support the new GA release first
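Independently of the init script fix, the failure mode can be illustrated in Go: validate the JSON config up front and turn a parse error into a non-zero exit code that a wrapper script could check. This is a sketch; the `Config` struct lists only two of the keys seen in config files for this tool, and `startupExitCode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Config lists two keys for illustration; the real file has many more.
type Config struct {
	MySQLOrchestratorHost string
	MySQLOrchestratorPort int
}

// startupExitCode returns 0 when the config parses and 1 otherwise,
// printing the parse error to stderr.
func startupExitCode(raw []byte) int {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		fmt.Fprintln(os.Stderr, "FATAL Cannot read config:", err)
		return 1
	}
	return 0
}

func main() {
	// A missing comma between two pairs, similar to the reproduction above.
	broken := []byte(`{"MySQLOrchestratorHost": "localhost" "MySQLOrchestratorPort": 3306}`)
	fmt.Println("exit code for broken config:", startupExitCode(broken))

	valid := []byte(`{"MySQLOrchestratorHost": "localhost", "MySQLOrchestratorPort": 3306}`)
	if code := startupExitCode(valid); code != 0 {
		os.Exit(code) // a supervising init script would see this status
	}
}
```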

status sidebar

add status sidebar on all screens, featuring:

  • maintained instances
  • error servers (replication lag)
  • error servers (unseen lately)
  • audit?

Identify duplicate hosts via server_id

Sometimes hosts are referred to by an IP or hostname that cannot be resolved from the orchestrator machine. This leads to duplication, with the same machine appearing twice in the topology, once by IP and once by hostname.
Per topology, we can assume two servers with the same server_id are either a terrible mistake or simply the very same instance.
At the very least, orchestrator should identify this (and perhaps present it as a problem).
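Detecting the duplicates could be as simple as grouping the instances of one topology by server_id, as in this sketch. The `Instance` struct and the example hosts are hypothetical.

```go
package main

import "fmt"

// Instance is a hypothetical, minimal view of a discovered server.
type Instance struct {
	Hostname string
	Port     int
	ServerID uint32
}

// DuplicateServerIDs groups the instances of one topology by server_id and
// returns only the groups that contain more than one instance.
func DuplicateServerIDs(topology []Instance) map[uint32][]Instance {
	byID := make(map[uint32][]Instance)
	for _, inst := range topology {
		byID[inst.ServerID] = append(byID[inst.ServerID], inst)
	}
	for id, group := range byID {
		if len(group) < 2 {
			delete(byID, id) // deleting during range is safe in Go
		}
	}
	return byID
}

func main() {
	topology := []Instance{
		{Hostname: "db1.example.com", Port: 3306, ServerID: 101},
		{Hostname: "10.0.0.5", Port: 3306, ServerID: 101}, // same machine, seen by IP
		{Hostname: "db2.example.com", Port: 3306, ServerID: 102},
	}
	for id, group := range DuplicateServerIDs(topology) {
		fmt.Printf("server_id %d appears %d times: %v\n", id, len(group), group)
	}
}
```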
