tbarbugli / cassandra_snapshotter
A tool to backup cassandra nodes using snapshots and incremental backups on S3
License: Other
Hi all
I am looking for better documentation on how to use just SSH keys, with no password. I am getting an error from fabric that sudo requires a password. Do I need to set up sudo on the remote boxes to allow running the backup without a password?
Thanks
Chris
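For reference, the usual way to let fabric run the agent without prompting is a passwordless-sudo drop-in for the backup user. This is a sketch; the username is a placeholder, and you may want to restrict the allowed commands rather than granting ALL:

```
# /etc/sudoers.d/cassandra-backup  -- edit with visudo; "chris" is a placeholder
chris ALL=(ALL) NOPASSWD: ALL
```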
From the README.md:
if your bucket is in a region other than us-west-1, you should really specify the region on the command line; otherwise weird 'connection reset by peer' errors can appear as you'll be transferring files through us-west-1 over to e.g. eu-west-1
However, in the code I can see that the default region is us-east-1: https://github.com/tbarbugli/cassandra_snapshotter/blob/master/cassandra_snapshotter/utils.py#L46
After an incremental backup does the backups directory remain intact? To manage disk space on my cassandra nodes I would like to cleanup the backups directory after files get pushed to S3.
I am getting following error during restore. Any idea what is causing this failure?
Traceback (most recent call last):
File "/usr/local/bin/cassandra-snapshotter", line 9, in <module>
load_entry_point('cassandra-snapshotter==1.0.0', 'console_scripts', 'cassandra-snapshotter')()
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/main.py", line 284, in main
restore_backup(args)
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/main.py", line 110, in restore_backup
args.s3_bucket_name
TypeError: __init__() takes exactly 6 arguments (5 given)
My restore command is as follows:
cassandra-snapshotter --s3-bucket-name=mybucket --s3-bucket-region=us-west-2 --s3-base-path=/backups --aws-access-key-id=accesskey --aws-secret-access-key=secretkey restore --snapshot-name=LATEST --keyspace=mykeyspace --hosts=host1 --target-hosts=host1
Getting the following error when trying to run pip install; running Fedora 23, with a similar issue on Amazon Linux and Ubuntu.
$ pip install cassandra_snapshooter
Collecting cassandra-snapshooter
Could not find a version that satisfies the requirement cassandra-snapshooter (from versions: )
No matching distribution found for cassandra-snapshooter
[172.31.23.194] sudo: python -c "import os; print os.path.join('/var/lib/cassandra/data/', '*', '*', 'snapshots', '20140418145633', '*')"
[172.31.23.194] sudo: python -c "import glob; print '\n'.join(glob.glob('/var/lib/cassandra/data/*/*/snapshots/20140418145633/*'))"
[172.31.23.194] put: /tmp/tmpghxPeI -> /tmp/tmpghxPeI
Fatal error: put() encountered an exception while uploading '/tmp/tmpghxPeI'
Underlying exception:
Permission denied
Aborting.
Fatal error: One or more hosts failed while executing task 'upload_node_backups'
Given an S3 bucket with paths like cassandra-backups/staging/20141211231905, this line returns an array like [u'staging/', u'staging/20141211231905/']. The get_contents_as_string() call then tries to look up a path like cassandra-backups/staging/manifest.json, which of course fails with a 404.
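A minimal sketch of the expected fix (function names are hypothetical, not the tool's): the manifest key has to be built from the timestamped snapshot prefix returned by the listing, not from the first-level prefix:

```python
def manifest_key(prefix):
    # The manifest lives under the timestamped snapshot prefix.
    return prefix.rstrip("/") + "/manifest.json"

# Prefixes as returned by the delimiter listing in the report above:
prefixes = [u"staging/", u"staging/20141211231905/"]
# Keep only timestamped snapshot prefixes; "staging/" itself has no manifest.
snapshots = [p for p in prefixes if p.rstrip("/").split("/")[-1].isdigit()]
print(manifest_key(snapshots[0]))
```

Using the bare `staging/` prefix instead is exactly what produces the `cassandra-backups/staging/manifest.json` 404 above.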
Here's a backtrace:
Traceback (most recent call last):
File "/usr/bin/cassandra-snapshotter", line 9, in <module>
load_entry_point('cassandra-snapshotter==0.3.0', 'console_scripts', 'cassandra-snapshotter')()
File "/usr/lib/python2.6/site-packages/cassandra_snapshotter-0.3.0-py2.6.egg/cassandra_snapshotter/main.py", line 198, in main
run_backup(args)
File "/usr/lib/python2.6/site-packages/cassandra_snapshotter-0.3.0-py2.6.egg/cassandra_snapshotter/main.py", line 38, in run_backup
table=args.table
File "/usr/lib/python2.6/site-packages/cassandra_snapshotter-0.3.0-py2.6.egg/cassandra_snapshotter/snapshotting.py", line 435, in get_snapshot_for
for snapshot in self:
File "/usr/lib/python2.6/site-packages/cassandra_snapshotter-0.3.0-py2.6.egg/cassandra_snapshotter/snapshotting.py", line 445, in __iter__
self._read_s3()
File "/usr/lib/python2.6/site-packages/cassandra_snapshotter-0.3.0-py2.6.egg/cassandra_snapshotter/snapshotting.py", line 418, in _read_s3
manifest_data = mkey.get_contents_as_string()
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 1780, in get_contents_as_string
response_headers=response_headers)
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 1648, in get_contents_to_file
response_headers=response_headers)
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 1480, in get_file
query_args=None)
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 1512, in _get_file_internal
override_num_retries=override_num_retries)
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 343, in open
override_num_retries=override_num_retries)
File "/usr/lib/python2.6/site-packages/boto-2.34.0-py2.6.egg/boto/s3/key.py", line 303, in open_read
self.resp.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 404 Not Found
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>cassandra-backups/manifest.json</Key><RequestId>A777F43BA4BCC13F</RequestId><HostId>/qU27mGG5n7cPd4KP8UfcaiwTjGNzCP58HakJqYVhIn6KFWBoS6ZBps4pSj7btI/</HostId></Error>
Does this script support use of SSH config files? It doesn't seem to work for me. I have a host, user, port, and IdentityFile defined in my ~/.ssh/config file, but the script still says:
Fatal error: Needed to prompt for a connection or sudo password (host: my_vm)
I have verified that no password is required for SSH nor sudo once logged in. Here's the config:
Host my_vm
Hostname 127.0.0.1
Port=1122
User=ubuntu
IdentityFile ~/.ssh/id_rsa
We're facing an issue when trying to take the incremental backup using cassandra_snapshotter tool.
We've a three node cluster in Amazon cloud. We've enabled incremental backup and moving it to S3. But when we try to execute the snapshotter tool it gives permission denied error :
[ec2-154-421-401-82.compute-1.amazonaws.com] out: Traceback (most recent call last):
[ec2-154-421-401-82.compute-1.amazonaws.com] out: File "/usr/local/bin/cassandra-snapshotter-agent", line 9, in <module>
[ec2-154-421-401-82.compute-1.amazonaws.com] out: load_entry_point('cassandra-snapshotter==0.4.0', 'console_scripts', 'cassandra-snapshotter-agent')()
[ec2-154-421-401-82.compute-1.amazonaws.com] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 266, in main
[ec2-154-421-401-82.compute-1.amazonaws.com] out: args.incremental_backups
[ec2-154-421-401-82.compute-1.amazonaws.com] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 151, in put_from_manifest
[ec2-154-421-401-82.compute-1.amazonaws.com] out: os.remove(f)
It looks like some sort of permission error, but we've given full permission to the (t-9bc68ac040d511e5bab8a38736f22d26/) directory with no luck. We also ran the tool from the root account but still see the same error.
We also cleared all the snapshots in S3 and tried to execute the snapshotter again, and it worked fine; but after inserting some data and running the tool we hit the above error again.
It seems that AWS credentials are required to run the tool. It would be great to make them optional so the script can use the IAM role instead.
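A sketch of what that could look like (this is not the tool's actual parser): leave the credential flags optional and pass None through, so boto falls back to its usual credential chain, which on EC2 includes the instance's IAM role:

```python
import argparse

parser = argparse.ArgumentParser()
# No required=True: absent flags stay None.
parser.add_argument("--aws-access-key-id", default=None)
parser.add_argument("--aws-secret-access-key", default=None)

args = parser.parse_args([])  # simulate running with no credential flags

# boto.connect_s3(args.aws_access_key_id, args.aws_secret_access_key) with
# both values None resolves credentials from env vars, the boto config
# file, or the EC2 instance metadata service (the IAM role).
print(args.aws_access_key_id, args.aws_secret_access_key)
```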
Hi,
I just started using cassandra_snapshotter and I got a similar error twice when running it on my cluster (4 nodes, each seems to have ~18GB of data)
[172.31.10.42] out: boto.exception.S3ResponseError: S3ResponseError: 200 OK
Here are the outputs of s3cmd du (converted to GB) on the path/CLUSTERNAME/IP folders in S3. As you can see, the first pair is missing some data, while for my first try (second pair) it's not so obvious.
node 1: run 1 = 17.7458, run 2 = 12.0384
node 2: run 1 = 17.3071, run 2 = 17.3289
node 3: run 1 = 18.4775, run 2 = 18.5068
node 4: run 1 = 17.7230, run 2 = 17.7482
TLDR: I think either there's something wrong with S3 atm, or cassandra_snapshotter should really retry to upload the snapshot if there's a boto error...
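Something along these lines could paper over transient S3 errors (a hedged sketch, not the tool's code): wrap each upload in a small retry loop with exponential backoff.

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Usage with a flaky callable standing in for an S3 upload:
state = {"calls": 0}
def flaky_upload():
    state["calls"] += 1
    if state["calls"] < 3:
        raise IOError("S3ResponseError: 200 OK (truncated error body)")
    return "uploaded"

print(with_retries(flaky_upload, base_delay=0))
```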
Not sure if this issue is related.
We can just use the key_filename attribute in the fab env. I'll get to work on that.
I've just set up the snapshotter, and when I run the command, it compresses and seemingly transfers everything to S3 (looks like it's all there), but then it errors at the end with the nodetool ring command as follows:
[hostname] run: /usr/bin/nodetool ring
Traceback (most recent call last):
File "/usr/local/bin/cassandra-snapshotter", line 9, in <module>
load_entry_point('cassandra-snapshotter==0.5.0', 'console_scripts', 'cassandra-snapshotter')()
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/main.py", line 280, in main
run_backup(args)
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/main.py", line 76, in run_backup
worker.snapshot(snapshot)
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/snapshotting.py", line 351, in snapshot
self.write_ring_description(snapshot)
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/snapshotting.py", line 400, in write_ring_description
self.write_on_S3(snapshot.s3_bucket, ring_path, content)
File "/usr/local/lib/python2.7/site-packages/cassandra_snapshotter/snapshotting.py", line 394, in write_on_S3
key.set_contents_from_string(content)
File "/usr/lib/python2.7/dist-packages/boto/s3/key.py", line 1426, in set_contents_from_string
encrypt_key=encrypt_key)
File "/usr/lib/python2.7/dist-packages/boto/s3/key.py", line 1293, in set_contents_from_file
chunked_transfer=chunked_transfer, size=size)
File "/usr/lib/python2.7/dist-packages/boto/s3/key.py", line 750, in send_file
chunked_transfer=chunked_transfer, size=size)
File "/usr/lib/python2.7/dist-packages/boto/s3/key.py", line 951, in _send_file_internal
query_args=query_args
File "/usr/lib/python2.7/dist-packages/boto/s3/connection.py", line 665, in make_request
retry_handler=retry_handler
File "/usr/lib/python2.7/dist-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/lib/python2.7/dist-packages/boto/connection.py", line 940, in _mexe
request.body, request.headers)
File "/usr/lib/python2.7/dist-packages/boto/s3/key.py", line 884, in sender
response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>6C23CB43B06B5909</RequestId><HostId>z6yqnOVBCfLjnF0xTB1jyZaKpsVd4Q+Vgizvl9iJCxbgH573MFXuu9UCfCKlv1nvn2Fp/Ronxlo=</HostId></Error>
I've just discovered that this is happening because of encryption. Without the --s3-ssenc flag and with the bucket encryption policy removed, everything completes. Any known reason why the nodetool ring step fails when encryption is enabled?
S3 bucket policy
{
"Version": "2012-10-17",
"Id": "PutObjPolicy",
"Statement": [
{
"Sid": "DenyUnEncryptedObjectUploads",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::s3-bucket-name/*",
"Condition": {
"StringNotEquals": {
"s3:x-amz-server-side-encryption": "AES256"
}
}
}
]
}
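The policy above denies any PutObject that lacks the AES256 server-side-encryption header. In boto 2 that header is only sent when a call opts in via encrypt_key=True, so a code path that writes the ring file without it would get exactly this 403. A toy model of the policy's condition (the helper names are hypothetical):

```python
def put_headers(ssenc):
    """Header boto adds to a PUT when encrypt_key=True (simplified)."""
    return {"x-amz-server-side-encryption": "AES256"} if ssenc else {}

def policy_allows(headers):
    """Mirror of the DenyUnEncryptedObjectUploads condition above."""
    return headers.get("x-amz-server-side-encryption") == "AES256"

print(policy_allows(put_headers(True)))   # encrypted upload
print(policy_allows(put_headers(False)))  # unencrypted upload -> denied
```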
Should be really easy to do it (ps seems the right tool for the job).
Hey
I will probably knock this out myself, but this awesome tool may be lacking some documentation ... #justsayin
For instance:
cassandra-snapshotter -v --aws-access-key-id=redacted \
  --aws-secret-access-key=redacted \
  --s3-bucket-name=your_bucket_name --s3-bucket-region=us-west-1 --s3-ssenc \
  --s3-base-path=your_base_path restore --keyspace=your_keyspace --target-hosts=cassandra01,cassandra02
Also add more details on what happens during a restore; for instance, the program is going to download the snapshot out of the S3 repo to the machine you ran the command on, and then run sstableloader. You need enough disk space and sstableloader available. Also, what are the details on prepping the cluster? Do you need the schema to exist?
I am thinking of providing examples of various commands, plus more details on getting snapshots running properly. I am guessing that you have to list the keyspaces to get snapshotting running correctly.
Here is an example of a list command:
cassandra-snapshotter -v --aws-access-key-id=redacted \
  --aws-secret-access-key=redacted --s3-bucket-name=your_bucket_name \
  --s3-bucket-region=us-west-1 --s3-ssenc --s3-base-path=your_base_path list
Also we may want to link to https://help.ubuntu.com/community/SSH/OpenSSH/Keys for instructions on creating ssh shared keys.
The topic areas that I see needing some TLC are:
Thoughts? Comments?
I get the following error from each of my nodes:
[10.x.y.z] out: lzop 1.03
[10.x.y.z] out: LZO library 2.06
[10.x.y.z] out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
[10.x.y.z] out: Exception in thread Thread-2:
[10.x.y.z] out: Traceback (most recent call last):
[10.x.y.z] out: File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
[10.x.y.z] out: self.run()
[10.x.y.z] out: File "/usr/lib/python2.7/threading.py", line 504, in run
[10.x.y.z] out: self.__target(*self.__args, **self.__kwargs)
[10.x.y.z] out: File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
[10.x.y.z] out: put(task)
[10.x.y.z] out: PicklingError: Can't pickle <type '_hashlib.HASH'>: attribute lookup _hashlib.HASH failed
And then the snapshot seems painfully slow: it never quits, the processes keep running (though only using about 1% CPU), and 2h+ later a 9GB snapshot still hasn't completed.
Hey,
I would like to change the module to use docker-swarm discovery engine to execute the commands in order to produce the snapshots.
We are working in docker-swarm environment so the snapshotter will also be containerized.
Any quick advice on how / where to start ?
Script to restore from S3; things it should cover:
- TTL: how does it impact the SSTables?
- Need to test to make sure the expired tables do not come back.
$ pip install cassandra_snapshotter
Downloading/unpacking cassandra-snapshotter
Could not find any downloads that satisfy the requirement cassandra-snapshotter
No distributions at all found for cassandra-snapshotter
Storing complete log in /Users/arikfr/.pip/pip.log
Project isn't on pypi?
Some commands are run in parallel against multiple nodes; this sometimes makes debugging simple issues quite complex (especially for non python folks).
It would be great to have a validate command that spots the most common configuration mistakes:
. broken paramiko (eg. missing python dev libs)
. user needs to be in the sudoers group (with no pass prompt)
. missing lzop
. misconfigured S3 bucket / invalid keys
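A hypothetical validate command could look roughly like this (a Python 3 sketch; the check list and messages are assumptions, not the tool's code):

```python
import shutil
import subprocess

def validate_node():
    """Collect common configuration problems instead of failing mid-backup."""
    problems = []
    try:
        import paramiko  # noqa: F401 -- broken builds raise ImportError here
    except ImportError as exc:
        problems.append("paramiko not importable: %s" % exc)
    if shutil.which("lzop") is None:
        problems.append("lzop not found on PATH")
    try:
        # `sudo -n true` exits non-zero immediately if a password is required.
        subprocess.run(["sudo", "-n", "true"], capture_output=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        problems.append("passwordless sudo not available")
    return problems

for p in validate_node():
    print("WARNING:", p)
```

An S3 check (listing the bucket with the supplied keys) would slot in the same way.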
Hi,
I'm running the latest version on ubuntu 14 and c* 2.0.8.
I run the command:
ubuntu@cas1:~$ cassandra-snapshotter --aws-access-key-id XXXXXXXXXX --aws-secret-access-key XXXXXXXXXX --s3-bucket-name anodot-cas-backup-staging --s3-base-path monitoring backup --host=cas1 --keyspace=combined --table=definition
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621121914//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621123553//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621123751//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621123806//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621124654//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621124849//manifest.json'
WARNING:root:Response: u'Access Denied' manifest_path: u'monitoring/20150621130515//manifest.json'
[cas1] Executing task 'node_start_backup'
[cas1] Executing task 'upload_node_backups'
[cas1] sudo: cassandra-snapshotter-agent create-upload-manifest --manifest_path=/tmp/backupmanifest --snapshot_name=20150621145211 --snapshot_keyspaces=combined --snapshot_table=definition --data_path=/var/lib/cassandra/data/
[cas1] sudo: cassandra-snapshotter-agent put --aws-access-key-id=XXXXXXXX --aws-secret-access-key=XXXXXXX --s3-bucket-name=anodot-cas-backup-staging --s3-bucket-region=us-east-1 --s3-base-path=monitoring/20150621145211/cas1 --manifest=/tmp/backupmanifest --concurrency=4
[cas1] out: lzop 1.03
[cas1] out: LZO library 2.06
[cas1] out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
[cas1] out:
[cas1] Executing task 'clear_node_snapshot'
[cas1] sudo: /usr/bin/nodetool clearsnapshot -t "20150621145211"
[cas1] out: xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M -Xmn200M -XX:+HeapDumpOnOutOfMemoryError -Xss256k
[cas1] out: Requested clearing snapshot for: all keyspaces
[cas1] out:
[cas1] sudo: /usr/bin/nodetool ring
On S3 I'm unable to see the backup directory.
Thanks in advance,
Shay
Based on the history of the CHANGELOG I'm guessing this is e5bf7bb
We need to have some test coverage for the tool. Merging contributed changes is now fairly hard for this reason. @tonylixu @arikfr @chrislovecnm any input/suggestion is more than welcome!
Hello,
This tool looks great so far. I notice that I can set the nodetool path, so I tell the script where to find nodetool. However, if the JMX setup has authentication on it, nodetool will fail (as it needs -u and -pw params). Does this tool take that into account?
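It doesn't appear to today. A small sketch of what supporting it could look like (the -u/-pw flags are nodetool's own; the helper is hypothetical):

```python
def nodetool_cmd(nodetool_path, args, jmx_user=None, jmx_password=None):
    """Build a nodetool command line, adding JMX auth flags when configured."""
    cmd = [nodetool_path]
    if jmx_user:
        cmd += ["-u", jmx_user]
    if jmx_password:
        cmd += ["-pw", jmx_password]
    return cmd + args

print(nodetool_cmd("/usr/bin/nodetool", ["ring"], "admin", "secret"))
```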
I got an error when executing backup command:
cassandra-snapshotter --s3-bucket-name=XXXX \
--s3-bucket-region=us-east-1 \
--s3-base-path=XXXXX \
--aws-access-key-id=XXXXXXXXXXXX \
--aws-secret-access-key=XXXXXXXXXXXXXXXXXXX \
--s3-ssenc \
backup \
--hosts=lab_db \
--user=cassandra
[lab_db] Executing task 'node_start_backup'
Fatal error: Needed to prompt for a connection or sudo password (host: lab_db), but input would be ambiguous in parallel mode
Aborting.
Fatal error: One or more hosts failed while executing task 'node_start_backup'
Aborting.
[lab_db] Executing task 'clear_node_snapshot'
[lab_db] run: /usr/bin/nodetool clearsnapshot -t "20151211141001"
Fatal error: Needed to prompt for a connection or sudo password (host: lab_db), but input would be ambiguous in parallel mode
Aborting.
Fatal error: One or more hosts failed while executing task 'clear_node_snapshot'
Aborting.
Currently this happens:
ssl.CertificateError: hostname 'my.bucket.name.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'
This can usually be worked around by putting the bucket name at the beginning of the path.
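The root cause is that a TLS wildcard covers exactly one DNS label, so any dot in the bucket name pushes the virtual-hosted hostname outside *.s3.amazonaws.com. A simplified demonstration of the matching rule:

```python
def wildcard_matches(pattern, hostname):
    """Single-label TLS wildcard match (simplified RFC 6125 behaviour)."""
    p_labels, h_labels = pattern.split("."), hostname.split(".")
    if len(p_labels) != len(h_labels):
        return False
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

# "*.s3.amazonaws.com" covers exactly one label, so a dotted bucket name
# pushes the host out of the certificate's scope:
print(wildcard_matches("*.s3.amazonaws.com", "mybucket.s3.amazonaws.com"))
print(wildcard_matches("*.s3.amazonaws.com", "my.bucket.name.s3.amazonaws.com"))
```

In boto this is typically worked around with path-style addressing (OrdinaryCallingFormat), which keeps the bucket name out of the hostname.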
I know that you used sstableloader for restore. Do we want to do that, or just scp the file and have the agent uncompress it? Or have the agent download the files and restore itself, a reverse of the backup?
data directory structure changed
This is probably where I will start before the documentation, but cassandra-snapshotter --help is missing options, for instance --hosts, --keyspaces, etc.
Please support usage of IAM roles.
This will allow EC2 machines to utilize their machine credentials for S3 access rather than requiring an access/secret key to be hardcoded.
We're using Chef and PyChef to integrate it with Fabric. If we add integrated support for PyChef in cassandra_snapshotter, is it something you will merge in?
The integration will be by passing a Chef search string instead of a nodes list when starting the tool. For example, --chef instead of --nodes.
I've been reading the code before trying this out, and I noticed that when incremental_backups is True, put_from_manifest will delete files even if they failed to upload:
...
for ret in pool.imap(upload_file,
((bucket, f, destination_path(s3_base_path, f), s3_ssenc, buffer_size) for f in files)):
if not ret:
exit_code = 1
break
pool.terminate()
if incremental_backups:
for f in files:
os.remove(f) # DELETING FILES HERE EVEN IF exit_code == 1
exit(exit_code)
This doesn't seem wise. The easy solution is to make deletion conditional on the value of exit_code. It would be neater to delete those files that uploaded correctly and leave the files that didn't alone, but maybe it's overkill.
How: followed installation instructions (apt reported successful installation)
What: I get the following error when I type in cassandra-snapshotter --help
ubuntu@vsk01:~$ cassandra-snapshotter --help
Traceback (most recent call last):
File "/usr/local/bin/cassandra-snapshotter", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2749, in <module>
working_set = WorkingSet._build_master()
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 444, in _build_master
ws.require(requires)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 725, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 628, in resolve
raise DistributionNotFound(req)
pkg_resources.DistributionNotFound: ecdsa>=0.11
Environment:
ubuntu@vsk01:~$ uname -a
Linux vsk01 3.13.0-57-generic #95-Ubuntu SMP Fri Jun 19 09:28:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@vsk01:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.2 LTS
Release: 14.04
Codename: trusty
Dependencies listed:
Name: cassandra-snapshotter
Version: 0.4.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: argparse, fabric, boto
What is ecdsa and why is this dependency not met? Thanks!
I installed on datastax AMI using the directions in README. Here's the error:
$ cassandra-snapshotter --aws-access-key-id="**********" --aws-secret-access-key="**********" --s3-bucket-name="*********" --s3-bucket-region="us-west-2" --s3-base-path="" backup --hosts=cassandra
[cassandra] Executing task 'node_start_backup'
[cassandra] Executing task 'upload_node_backups'
[cassandra] sudo: cassandra-snapshotter-agent create-upload-manifest --manifest_path=/tmp/backupmanifest --snapshot_name=20140918010604 --snapshot_keyspaces= --snapshot_table= --data_path=/var/lib/cassandra/data/
[cassandra] out: Traceback (most recent call last):
[cassandra] out: File "/usr/local/bin/cassandra-snapshotter-agent", line 5, in <module>
[cassandra] out: from pkg_resources import load_entry_point
[cassandra] out: File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2711, in <module>
[cassandra] out: parse_requirements(__requires__), Environment()
[cassandra] out: File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 584, in resolve
[cassandra] out: raise DistributionNotFound(req)
[cassandra] out: pkg_resources.DistributionNotFound: boto>=2.29.1
[cassandra] out:
Fatal error: sudo() received nonzero return code 1 while executing!
Hey guys, I was seeing no S3 backup directory being generated when there is no change in the keyspace using this package. Is that expected?
Thanks
I get the following error when using cassandra-snapshotter:
out: cassandra-snapshotter-agent create-upload-manifest: error: argument --conf_path is required
My command is
cassandra-snapshotter --aws-access-key-id=SOMETHING --aws-secret-access-key=SOMEKEY --s3-bucket-name=mybucket --s3-bucket-region=us-east-1 --s3-base-path=dirname backup --hosts=54.x.x.x --user=username
This command worked fine earlier but fails since I used pip install to get the latest version on the node.
Please help me debug this further and fix it.
Hey, just wanted to know a few things:
1) Are the backups app-consistent or crash-consistent?
Thanks :)
Hello, we are using cassandra_snapshotter to take snapshots but are getting the following error regularly:
lzop 1.03
out: LZO library 2.06
out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
out: Traceback (most recent call last):
out: File "/usr/local/bin/cassandra-snapshotter-agent", line 9, in <module>
out: load_entry_point('cassandra-snapshotter==0.4.0', 'console_scripts', 'cassandra-snapshotter-agent')()
out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 266, in main
out: args.incremental_backups
out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 145, in put_from_manifest
out: for _ in pool.imap(upload_file, ((bucket, f, destination_path(s3_base_path, f), s3_ssenc, buffer_size) for f in files)):
out: File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
out: raise value
out: boto.exception.S3ResponseError: S3ResponseError: 200 OK
out:
out: InternalError
We encountered an internal error. Please try again.XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Can someone please help with why we are getting this when running snapshots? We have 5 nodes, each with about 100 GB of data.
The first time I run the tool, it works well, the subsequent times it gives an error:
[cassandrahost] out: Traceback (most recent call last):
[cassandrahost] out: File "/usr/local/bin/cassandra-snapshotter-agent", line 9, in <module>
[cassandrahost] out: load_entry_point('cassandra-snapshotter==0.4.0', 'console_scripts', 'cassandra-snapshotter-agent')()
[cassandrahost] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 266, in main
[cassandrahost] out: args.incremental_backups
[cassandrahost] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 151, in put_from_manifest
[cassandrahost] out: os.remove(f)
[cassandrahost] out: OSError: [Errno 13] Permission denied: '/mnt/cassandra/data/system/schema_triggers-0359bc7171233ee19a4ab9dfb11fc125/backups/la-4-big-Filter.db'
Are these files in the "backups" directory above something left over from before? There are a bunch of such files and it keeps failing on them. The permissions on these files are:
ls -l /mnt/cassandra/data/system/schema_triggers-0359bc7171233ee19a4ab9dfb11fc125/backups/la-4-big-Filter.db
-rw-r--r-- 1 cassandra cassandra 16 Feb 7 15:00 /mnt/cassandra/data/system/schema_triggers-0359bc7171233ee19a4ab9dfb11fc125/backups/la-4-big-Filter.db
I am running the snapshotter from my Mac, which SSHes into the node using the ubuntu user. Any suggestions why it is failing for me? Is it the permissions on the files? If so, why am I the only one hitting this issue?
Hi :)
Trying to get this to work on my EC2 instance...
When I run cassandra-snapshotter on my instance, I get the following output:
[x.x.x.x] Executing task 'node_start_backup'
[x.x.x.x] Executing task 'upload_node_backups'
[x.x.x.x] sudo: cassandra-snapshotter-agent create-upload-manifest --manifest_path=/tmp/backupmanifest --snapshot_name=20140724110014 --snapshot_keyspaces= --snapshot_table= --data_path=/var/lib/cassandra/data/
[x.x.x.x] sudo: cassandra-snapshotter-agent put --aws-access-key-id=<SNIP> --aws-secret-access-key=<SNIP> --s3-bucket-name=<SNIP> --s3-bucket-region=us-west-2 --s3-ssenc --s3-base-path=US_OR_Parser/20140724110014/x.x.x.x --manifest=/tmp/backupmanifest --concurrency=4
[x.x.x.x] out: lzop 1.03
[x.x.x.x] out: LZO library 2.06
[x.x.x.x] out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
[x.x.x.x] out:
[x.x.x.x] Executing task 'clear_node_snapshot'
[x.x.x.x] sudo: /usr/bin/nodetool clearsnapshot -t "20140724110014"
[x.x.x.x] out: Requested clearing snapshot for: all keyspaces
[x.x.x.x] out:
[x.x.x.x] sudo: /usr/bin/nodetool ring
And then it exits after a few seconds (code 0).
On S3, it creates folders, but the only two files it puts there are "manifest.json" and "ring". There are no backup files.
incremental_backups is enabled in cassandra.yaml, and JNA is also installed.
Adding --new-snapshot doesn't help.
Am I missing something here, or is something wrong?
Thanks :)
In the current implementation you run nodetool flush before the incremental backups upload. We were thinking of letting Cassandra flush whenever it feels it needs to, and uploading the files it generates between runs.
Is calling flush explicitly something you added after letting Cassandra flush when it "feels to", or have you not tried it differently?
Currently the backup command takes a --hosts option to run snapshot commands on each of the Cassandra nodes. It would be useful if specifying "localhost" alone would run the snapshot command locally rather than via SSH.
I could see that this might lead to end users thinking they could have localhost as one node among many non-local nodes. So maybe a new command, like local_backup or something?
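A rough sketch of the dispatch (the helper is hypothetical and assumes a POSIX shell; the remote branch stands in for fabric's run()):

```python
import subprocess

def run_on(host, cmd):
    """Run locally for 'localhost', otherwise hand off over SSH."""
    if host in ("localhost", "127.0.0.1"):
        return subprocess.run(cmd, shell=True, capture_output=True,
                              text=True).stdout
    # Remote path unchanged: delegate to ssh (or fabric's run()).
    return subprocess.run(["ssh", host, cmd], capture_output=True,
                          text=True).stdout

print(run_on("localhost", "echo local backup"))
```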
Hosts periodically throw the attached error, causing backups to fail:
boto-2.38
python-2.7.6
cassandra_snapshotter 0.4.0
[db4a.] out: File "/usr/local/bin/cassandra-snapshotter-agent", line 9, in <module>
[db4a.] out: load_entry_point('cassandra-snapshotter==0.4.0', 'console_scripts', 'cassandra-snapshotter-agent')()
[db4a.] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 266, in main
[db4a.] out: args.incremental_backups
[db4a.] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 145, in put_from_manifest
[db4a.] out: for _ in pool.imap(upload_file, ((bucket, f, destination_path(s3_base_path, f), s3_ssenc, buffer_size) for f in files)):
[db4a.] out: File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
[db4a.] out: raise value
[db4a.] out: boto.exception.S3ResponseError: S3ResponseError: 200 OK
[db4a.] out: <?xml version="1.0" encoding="UTF-8"?>
[db4a.] out: <Error><Code>InternalError</Code><Message>We encountered an internal error. Please try again.</Message><RequestId>RIDHERE</RequestId><HostId>IDHERE</HostId></Error>
[db4a.] out:
Fatal error: run() received nonzero return code 1 while executing!
Requested: cassandra-snapshotter-agent put (REST OF COMMAND)
Aborting.
It'd be cool if the hash that corresponds to what went out as 0.4.0 was tagged, and this tag pushed to the repo.
Somewhat akin to #49
Thanks
Hi,
I am making the following call but receiving a MalformedXML error. Any idea what I am missing?
cassandra-snapshotter --aws-access-key-id **** --aws-secret-access-key **** --s3-bucket-region **** --s3-bucket-name **** --s3-base-path **** backup --hosts localhost --nodetool-path ~/cassandra/cassandra/bin/nodetool --cassandra-bin-dir ~/cassandra/cassandra/bin/ --user ubuntu --cassandra-conf-path ~/cassandra/cassandra/conf
Do I need to create any policy on the S3 bucket side? Currently I have not configured any. Any pointer here would be of great help. Thanks!
Output:
[localhost] out: lzop 1.03
[localhost] out: LZO library 2.06
[localhost] out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
[localhost] out: lzop: lzop.c:351: f_open: Assertion `ft->name[0]' failed.
[localhost] out: lzop: lzop.c:351: f_open: Assertion `ft->name[0]' failed.
[localhost] out: lzop: lzop.c:351: f_open: Assertion `ft->name[0]' failed.
[localhost] out: lzop: lzop.c:351: f_open: Assertion `ft->name[0]' failed.
[localhost] out: Traceback (most recent call last):
[localhost] out: File "/usr/local/bin/cassandra-snapshotter-agent", line 9, in <module>
[localhost] out: load_entry_point('cassandra-snapshotter==0.4.0', 'console_scripts', 'cassandra-snapshotter-agent')()
[localhost] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 266, in main
[localhost] out: args.incremental_backups
[localhost] out: File "/usr/local/lib/python2.7/dist-packages/cassandra_snapshotter/agent.py", line 145, in put_from_manifest
[localhost] out: for _ in pool.imap(upload_file, ((bucket, f, destination_path(s3_base_path, f), s3_ssenc, buffer_size) for f in files)):
[localhost] out: File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
[localhost] out: raise value
[localhost] out: boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request
[localhost] out: <Error><Code>MalformedXML</Code><Message>The XML you provided was not well-formed or did not validate against our published schema</Message><RequestId>0056A9AE61068765</RequestId><HostId>PEIIsIHjnwy/331DR/DokeyFCWRIKqSxUcV/W31WXmNXmHYCiW5W4iZgF+bV5MvHZet6jx+ufIM=</HostId></Error>
[localhost] out: lzop: lzop.c:351: f_open: Assertion `ft->name[0]' failed.
[localhost] out:
When restoring a keyspace, it doesn't get created before loading the sstables, which generates the following error:
INFO:root:invoking: sstableloader --nodes localhost -v MY_KEYSPACE/MY_TABLE
Could not retrieve endpoint ranges:
InvalidRequestException(why:No such keyspace: MY_KEYSPACE)
Run with --debug to get full stack trace or --help to get help.
Even after creating the keyspace, the tables don't get created:
...
Skipping file IP_MY_KEYSPACE-MY_TABLE-jb-42-Data.db: column family MY_KEYSPACE.MY_TABLE doesn't exist
...
I have no idea what this means... is this a version issue?
[10.1.3.151] out: lzop 1.03
[10.1.3.151] out: LZO library 2.06
[10.1.3.151] out: Copyright (C) 1996-2010 Markus Franz Xaver Johannes Oberhumer
[10.1.3.151] out: Exception in thread Thread-2:
[10.1.3.151] out: Traceback (most recent call last):
[10.1.3.151] out: File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
[10.1.3.151] out: self.run()
[10.1.3.151] out: File "/usr/lib/python2.7/threading.py", line 504, in run
[10.1.3.151] out: self.__target(*self.__args, **self.__kwargs)
[10.1.3.151] out: File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
[10.1.3.151] out: put(task)
[10.1.3.151] out: PicklingError: Can't pickle <type '_hashlib.HASH'>: attribute lookup _hashlib.HASH failed
[10.1.3.151] out: