
coreos-mongodb

Deploy a MongoDB replica set to CoreOS like a boss. New members are auto-discovered via etcd.

Deploy

# remove any previous deployment (optional)
fleetctl destroy mongo@{1..3}.service mongo-replica-config.service

# choose the replica set name, then start the units
etcdctl set /mongo/replica/name myreplica

fleetctl start mongo@{1..3}.service mongo-replica-config.service
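The `{1..3}` part is ordinary shell brace expansion (bash/zsh), so the start command above submits three instances of the `mongo@.service` template unit:

```shell
# Brace expansion (bash/zsh) turns the template reference into three unit names:
echo mongo@{1..3}.service
```

This is why the commands must be run from a brace-expanding shell; a plain POSIX sh would pass the literal string through unexpanded.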

Connect

You can test connecting to your replica set from one of the nodes as follows (fleetctl-ssh is an alias defined in the Troubleshooting section below):

fleetctl-ssh

COREOS_PRIVATE_IPV4=xx.xx.xx.xxx; echo $COREOS_PRIVATE_IPV4

SITE_USR_ADMIN_PWD=$(etcdctl get /mongo/replica/siteUserAdmin/pwd); echo $SITE_USR_ADMIN_PWD

SITE_ROOT_PWD=$(etcdctl get /mongo/replica/siteRootAdmin/pwd); echo $SITE_ROOT_PWD

docker run -it --rm mongo:2.6 mongo $COREOS_PRIVATE_IPV4/admin  -u siteRootAdmin -p $SITE_ROOT_PWD


Welcome to the MongoDB shell.

Troubleshooting

In my shell rc file (~/.zsh_aliases) I have:

fleetctl-switch(){
  ssh-add ~/.docker/certs/key.pem
  export DOCKER_HOST=tcp://$1:2376
  export FLEETCTL_TUNNEL=$1:22
  alias etcdctl="ssh -A core@$1 'etcdctl'"
  alias fleetctl-ssh="fleetctl ssh $(fleetctl list-machines | cut -c1-8 | sed -n 2p)"
  RPROMPT="%{$fg[magenta]%}[fleetctl:$1]%{$reset_color%}"
}
destroy_mongo_replica() {
  export FLEETCTL_TUNNEL=$1:22
  fleetctl destroy mongo@{1..3}.service
  fleetctl destroy mongo-replica-config.service

  etcdctl rm /mongo/replica/siteRootAdmin --recursive
  etcdctl rm /mongo/replica/siteUserAdmin --recursive
  etcdctl rm /mongo/replica --recursive
  etcdctl set /mongo/replica/name myreplica

  echo 'Listing etcd /mongo dirs...'
  ssh -A core@$1 'etcdctl ls /mongo --recursive'

  echo "Cleaning $1 /var/mongo"
  ssh -A core@$1 'sudo rm -rf /var/mongo/*'
  ssh -A core@$1 'ls /var/mongo/'

  echo "Cleaning $2 /var/mongo"
  ssh -A core@$2 'sudo rm -rf /var/mongo/*'
  ssh -A core@$2 'ls /var/mongo/'

  echo "Cleaning $3 /var/mongo"
  ssh -A core@$3 'sudo rm -rf /var/mongo/*'
  ssh -A core@$3 'ls /var/mongo/'
}

To start,

fleetctl-switch xx.xx.xx.xx
fleetctl start mongo@{1..3}.service mongo-replica-config.service

To see what's happening on a server,

fleetctl-ssh

To delete all mongodb files,

destroy_mongo_replica <cluster ip 1> <cluster ip 2> <cluster ip 3>

How it works

The units follow the process explained in this tutorial.

I've split the process into three phases.

Phase 1

During the initial phase, mongod needs to run without the auth option and without a keyFile.

Only the first node of the replica set is started; the others wait for the key file to appear in etcd.

  • The siteUserAdmin and siteRootAdmin users are created on the first node, with random passwords stored in etcd.
  • The key file is generated and added to etcd.
  • All mongod instances are started.
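The passwords and the key file can be produced with nothing but openssl. A minimal sketch of what phase 1 does (the variable names match the Connect section above; the file name replica.key matches the issue logs below, and the base64-741 recipe is the standard MongoDB keyfile one):

```shell
# Random passwords for siteUserAdmin / siteRootAdmin
# (in the real units these end up under /mongo/replica/... in etcd)
SITE_USR_ADMIN_PWD=$(openssl rand -hex 16)
SITE_ROOT_PWD=$(openssl rand -hex 16)

# Shared key file for intra-replica-set authentication
openssl rand -base64 741 > replica.key
chmod 600 replica.key
```

Every member must use the same key file, which is why it is generated once and distributed through etcd rather than generated per node.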

Phase 2

During the second phase, all the nodes of the replica set are running and ready to be bound together.

  • rs.initiate is run on the first node.
  • rs.add is run for every node except the first one, which is added automatically.
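A sketch of how phase 2 can derive the rs.add() calls from the node list registered in etcd. The IPs and the NODES list are stand-ins here (in the real units the list would come from something like etcdctl ls /mongo/replica/nodes); the point is that every member except the first gets an rs.add() issued against the first node:

```shell
# Stand-in for the node IPs registered under /mongo/replica/nodes in etcd
NODES="172.17.8.101
172.17.8.102
172.17.8.103"

# The first node is covered by rs.initiate(), so it is skipped below
FIRST=$(echo "$NODES" | head -n 1)

# Emit one mongo command per remaining member, run against the first node
echo "$NODES" | tail -n +2 | while read -r ip; do
  echo "mongo $FIRST/admin --eval 'rs.add(\"$ip:27017\")'"
done
```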

Phase 3

The third phase is the final state: we keep watching etcd for new nodes, and any node that appears is added to the replica set.

Destroy and revert everything

# remove all units
$ fleetctl destroy mongo@{1..3}.service
$ fleetctl destroy mongo-replica-config.service
# or
$ fleetctl list-units --no-legend | awk '{print $1}' | xargs -I{} fleetctl destroy {}

# clean directories
$ fleetctl list-machines --fields="machine" --full --no-legend | xargs -I{} fleetctl ssh {} "sudo rm -rf /var/mongo/*"

(from inside one of the nodes)
$ etcdctl rm /mongo/replica/key
$ etcdctl rm --recursive /mongo/replica/siteRootAdmin
$ etcdctl rm --recursive /mongo/replica/siteUserAdmin
$ etcdctl rm --recursive /mongo/replica/nodes

License

MIT - Copyright (c) 2014 Auth0 Inc.

coreos-mongodb's People

Contributors

bleshik, jaigouk, jfromaniello, mattbailey


coreos-mongodb's Issues

Spinning up 3 nodes and failing

I'm having an issue with the nodes. mongo-replica-config runs fine and mongo@1 runs fine, but nodes 2 and 3 are failing with the reason "Repository mongo already being pulled by another client."

Spin up within Vagrant fails

I am spinning up 3 clean CoreOS nodes within Vagrant on OS X and I am getting failing nodes:

fleetctl list-units
UNIT                MACHINE             ACTIVE      SUB
mongo-replica-config.service    6cb80d22.../172.17.8.103    inactive    dead
mongo@<n>.service           6cb80d22.../172.17.8.103    active      running
mongo@<n>.service           eb4c2b8e.../172.17.8.102    failed      failed
mongo@<n>.service           efb84060.../172.17.8.101    failed      failed

This is what the mongo@ unit on core-03 gives me:

Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.066+0000 [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=731bdf3fc4d7
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.066+0000 [initandlisten] db version v2.6.5
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.067+0000 [initandlisten] git version: e99d4fcb4279c0279796f237aa92fe3b64560bf6
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.067+0000 [initandlisten] build info: Linux build8.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.068+0000 [initandlisten] allocator: tcmalloc
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.068+0000 [initandlisten] options: { replication: { replSet: "quokka" }, security: { keyFile: "/data/db/replica.key" } }
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.071+0000 [initandlisten] journal dir=/data/db/journal
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.071+0000 [initandlisten] recover : no journal files present, no recovery needed
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.182+0000 [initandlisten] info openExisting file size 16777216 but storageGlobalParams.smallfiles=false: /data/db/local.0
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.183+0000 [initandlisten] info openExisting file size 16777216 but storageGlobalParams.smallfiles=false: /data/db/admin.0
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.186+0000 [initandlisten] info openExisting file size 16777216 but storageGlobalParams.smallfiles=false: /data/db/admin.0
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.188+0000 [initandlisten] info openExisting file size 16777216 but storageGlobalParams.smallfiles=false: /data/db/local.0
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.188+0000 [initandlisten] waiting for connections on port 27017
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.197+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:10 core-03 bash[1231]: 2014-11-05T10:54:10.199+0000 [rsStart] replSet info you may need to run replSetInitiate -- rs.initiate() in the shell -- if that is not already done
Nov 05 10:54:11 core-03 bash[1231]: 2014-11-05T10:54:11.200+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:12 core-03 bash[1231]: 2014-11-05T10:54:12.200+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:13 core-03 ntpd[458]: Listen normally on 11 veth29f101e fe80::2850:39ff:fe89:b0cf UDP 123
Nov 05 10:54:13 core-03 ntpd[458]: Deleting interface #10 veth8b1a48f, fe80::90fd:64ff:fe7f:9824#123, interface stats: received=0, sent=0, dropped=0, active_time=62 secs
Nov 05 10:54:13 core-03 ntpd[458]: peers refreshed
Nov 05 10:54:13 core-03 bash[1231]: 2014-11-05T10:54:13.200+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:14 core-03 bash[1231]: 2014-11-05T10:54:14.200+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:15 core-03 bash[1231]: 2014-11-05T10:54:15.202+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:16 core-03 bash[1231]: 2014-11-05T10:54:16.202+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
Nov 05 10:54:17 core-03 bash[1231]: 2014-11-05T10:54:17.203+0000 [rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)

The mongo@ units on the two failed machines:

Nov 05 10:55:02 core-02 bash[2374]: mongo:2.6: The image you are pulling has been verified
Nov 05 10:55:02 core-02 bash[2374]: Status: Image is up to date for mongo:2.6
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] -job pull(mongo, 2.6) = OK (0)
Nov 05 10:55:02 core-02 docker[1865]: [info] DELETE /v1.15/containers/mongodb?force=1
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] +job rm(mongodb)
Nov 05 10:55:02 core-02 docker[1865]: No such container: mongodb
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] -job rm(mongodb) = ERR (1)
Nov 05 10:55:02 core-02 docker[1865]: [error] server.go:1207 Handler for DELETE /containers/{name:.*} returned error: No such container: mongodb
Nov 05 10:55:02 core-02 docker[1865]: [error] server.go:110 HTTP Error: statusCode=404 No such container: mongodb
Nov 05 10:55:02 core-02 bash[2381]: Error response from daemon: No such container: mongodb
Nov 05 10:55:02 core-02 bash[2381]: 2014/11/05 10:55:02 Error: failed to remove one or more containers
Nov 05 10:55:02 core-02 bash[2392]: 27017
Nov 05 10:55:02 core-02 bash[2391]: /bin/bash: /var/mongo/replica.key: No such file or directory
Nov 05 10:55:02 core-02 systemd[1]: mongo@<n>.service: main process exited, code=exited, status=1/FAILURE
Nov 05 10:55:02 core-02 bash[2392]: on
Nov 05 10:55:02 core-02 bash[2424]: /usr/bin/docker stop -t 60 mongodb || true;            /usr/bin/etcdctl set /mongo/replica/nodes/$COREOS_PRIVATE_IPV4/status off
Nov 05 10:55:02 core-02 docker[1865]: [info] POST /v1.15/containers/mongodb/stop?t=60
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] +job stop(mongodb)
Nov 05 10:55:02 core-02 docker[1865]: No such container: mongodb
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] -job stop(mongodb) = ERR (1)
Nov 05 10:55:02 core-02 docker[1865]: [error] server.go:1207 Handler for POST /containers/{name:.*}/stop returned error: No such container: mongodb
Nov 05 10:55:02 core-02 docker[1865]: [error] server.go:110 HTTP Error: statusCode=404 No such container: mongodb
Nov 05 10:55:02 core-02 bash[2424]: Error response from daemon: No such container: mongodb
Nov 05 10:55:02 core-02 bash[2424]: 2014/11/05 10:55:02 Error: failed to stop one or more containers
Nov 05 10:55:02 core-02 bash[2424]: off
Nov 05 10:55:02 core-02 systemd[1]: Failed to start mongod service.
Nov 05 10:55:02 core-02 systemd[1]: Unit mongo@<n>.service entered failed state.
Nov 05 10:55:02 core-02 systemd[1]: mongo@<n>.service holdoff time over, scheduling restart.
Nov 05 10:55:02 core-02 systemd[1]: Stopping mongod service...
Nov 05 10:55:02 core-02 systemd[1]: Starting mongod service...
Nov 05 10:55:02 core-02 docker[1865]: [info] POST /v1.15/images/create?fromImage=mongo%3A2.6
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] +job pull(mongo, 2.6)
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] +job trust_update_base()
Nov 05 10:55:02 core-02 docker[1865]: [27367adf] -job trust_update_base() = OK (0)
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] +job trust_key_check(/library/mongo)
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] -job trust_key_check(/library/mongo) = OK (0)
Nov 05 10:55:03 core-02 bash[2441]: mongo:2.6: The image you are pulling has been verified
Nov 05 10:55:03 core-02 bash[2441]: Status: Image is up to date for mongo:2.6
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] -job pull(mongo, 2.6) = OK (0)
Nov 05 10:55:03 core-02 docker[1865]: [info] DELETE /v1.15/containers/mongodb?force=1
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] +job rm(mongodb)
Nov 05 10:55:03 core-02 docker[1865]: No such container: mongodb
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] -job rm(mongodb) = ERR (1)
Nov 05 10:55:03 core-02 docker[1865]: [error] server.go:1207 Handler for DELETE /containers/{name:.*} returned error: No such container: mongodb
Nov 05 10:55:03 core-02 docker[1865]: [error] server.go:110 HTTP Error: statusCode=404 No such container: mongodb
Nov 05 10:55:03 core-02 bash[2449]: Error response from daemon: No such container: mongodb
Nov 05 10:55:03 core-02 bash[2449]: 2014/11/05 10:55:03 Error: failed to remove one or more containers
Nov 05 10:55:03 core-02 bash[2465]: 27017
Nov 05 10:55:03 core-02 bash[2464]: /bin/bash: /var/mongo/replica.key: No such file or directory
Nov 05 10:55:03 core-02 systemd[1]: mongo@<n>.service: main process exited, code=exited, status=1/FAILURE
Nov 05 10:55:03 core-02 bash[2465]: on
Nov 05 10:55:03 core-02 bash[2497]: /usr/bin/docker stop -t 60 mongodb || true;            /usr/bin/etcdctl set /mongo/replica/nodes/$COREOS_PRIVATE_IPV4/status off
Nov 05 10:55:03 core-02 docker[1865]: [info] POST /v1.15/containers/mongodb/stop?t=60
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] +job stop(mongodb)
Nov 05 10:55:03 core-02 docker[1865]: No such container: mongodb
Nov 05 10:55:03 core-02 docker[1865]: [27367adf] -job stop(mongodb) = ERR (1)
Nov 05 10:55:03 core-02 docker[1865]: [error] server.go:1207 Handler for POST /containers/{name:.*}/stop returned error: No such container: mongodb
Nov 05 10:55:03 core-02 docker[1865]: [error] server.go:110 HTTP Error: statusCode=404 No such container: mongodb
Nov 05 10:55:03 core-02 bash[2497]: Error response from daemon: No such container: mongodb
Nov 05 10:55:03 core-02 bash[2497]: 2014/11/05 10:55:03 Error: failed to stop one or more containers
Nov 05 10:55:03 core-02 bash[2497]: off
Nov 05 10:55:03 core-02 systemd[1]: Failed to start mongod service.
Nov 05 10:55:03 core-02 systemd[1]: Unit mongo@<n>.service entered failed state.
Nov 05 10:55:03 core-02 systemd[1]: mongo@<n>.service holdoff time over, scheduling restart.
Nov 05 10:55:03 core-02 systemd[1]: Stopping mongod service...
Nov 05 10:55:03 core-02 systemd[1]: Starting mongod service...
Nov 05 10:55:03 core-02 systemd[1]: mongo@<n>.service start request repeated too quickly, refusing to start.
Nov 05 10:55:03 core-02 systemd[1]: Failed to start mongod service.
Nov 05 10:55:03 core-02 systemd[1]: Unit mongo@<n>.service entered failed state.

I don't know what's going wrong, but for me this is not working at all.

Best regards,

Daniel
