basho / riak_cs
Riak CS is simple, available cloud storage built on Riak.
Home Page: http://docs.basho.com/riakcs/latest/
License: Apache License 2.0
According to https://github.com/basho/riak_moss/wiki/User-Management, it should be possible for a non-admin user to fetch info about themselves via "s3cmd get s3://riak-cs/user -".
However, it looks like the admin user is always being checked. Output from redbug:start({riak_moss_wm_user, admin_check, [return]}, [{msgs,100},{print_depth,12}]):
00:21:07 <<0.16944.0>> {riak_moss_wm_user,admin_check,3} -> {{halt,403},
....
Provide the ability to have multiple users specified as administrators.
This brings up a question of which credential we use to configure riak_moss with stanchion.
Interesting that I can make a bucket with the same name appear twice in "s3cmd ls" output.
% s3cmd ls
2012-04-24 21:19 s3://decision
% s3cmd mb s3://decision
Bucket 's3://decision/' created
% s3cmd ls
2012-04-24 22:15 s3://decision
2012-04-24 21:19 s3://decision
% s3cmd mb s3://decision
Bucket 's3://decision/' created
% s3cmd ls
2012-04-24 22:15 s3://decision
2012-04-24 21:19 s3://decision
% s3cmd rb s3://decision
Bucket 's3://decision/' removed
% s3cmd ls
2012-04-24 21:19 s3://decision
% s3cmd rb s3://decision
ERROR: Access to bucket 'decision' was denied
Bucket 's3://decision/' removed
% s3cmd ls
2012-04-24 21:19 s3://decision
([email protected])18> C:get(<<"moss.buckets">>, <<"decision">>).
{ok,{r_object,<<"moss.buckets">>,<<"decision">>,
[{r_content,{dict,4,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],...},
{{[],[],[],[],[],[],[],[],[],[],...}}},
<<"0">>}],
[{<<35,9,254,249,79,140,128,76>>,{44,63502524922}}],
{dict,1,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],...},
{{[],[],[],[],[],[],[],[],[],[],[],...}}},
undefined}}
IIRC, I'd managed to get the system into this state by running "s3cmd mb s3://decision" and "s3cmd rb s3://decision" quickly, multiple times. (I was trying to get their counters incremented in the new stats-gathering stuff, and I'm impatient.)
app.config files:
[
...
{riak_moss, [
...
{admin_key, "key"},
{admin_secret, "secret"},
...
]},
...
].
[
...
{stanchion, [
...
{admin_key, "key1"},
{admin_secret, "secret1"},
...
]},
...
].
$ curl http://localhost:8080/user --data [email protected]\&name=dan
<?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code><Message>Access Denied</Message><Resource>/user</Resource><RequestId></RequestId></Error>
A log message in the Riak CS logs indicating that the new user request could not be completed because the admin credentials were not accepted by Stanchion.
No obvious errors.
This would be similar to the Riak ping resource available via the HTTP interface.
Additionally, it should return an unsuccessful response if the Riak node it's communicating with is unresponsive. A Riak node may become unresponsive due to LevelDB issues (basho/eleveldb#23).
Related issue:
https://help.basho.com/tickets/1381
riak_moss_wm_bucket:shift_to_owner calls riak_moss_utils:get_user with a binary representation of the key_id when it expects a string and this causes a badarg error.
Riak CS console output follows:
([email protected])1> 12:57:01.334 [error] webmachine error: path="/testbucket/"
{error,badarg,[{erlang,list_to_binary,[<<"Q5U52KT6JYEF2EUPL8DP">>],[]},{riak_moss_utils,get_user,2,[{file,"src/riak_moss_utils.erl"},{line,279}]},{riak_moss_wm_bucket,shift_to_owner,4,[{file,"src/riak_moss_wm_bucket.erl"},{line,120}]},{webmachine_resource,resource_call,3,[{file,"src/webmachine_resource.erl"},{line,166}]},{webmachine_resource,do,3,[{file,"src/webmachine_resource.erl"},{line,125}]},{webmachine_decision_core,resource_call,1,[{file,"src/webmachine_decision_core.erl"},{line,48}]},{webmachine_decision_core,decision,1,[{file,"src/webmachine_decision_core.erl"},{line,198}]},{webmachine_decision_core,handle_request,2,[{file,"src/webmachine_decision_core.erl"},{line,33}]}]}
When auth_bypass is enabled, querying usage reports via curl will fail with the below error:
$ curl http://localhost:8080/usage/ED8RGOBOL9PTHJR0J_BO?a
InvalidAccessKeyId
The AWS Access Key Id you provided does not exist in our records./usage/ED8RGOBOL9PTHJR0J_BO
Disabling auth_bypass and querying a usage report on the same key returns data.
os:timestamp() doesn't guarantee monotonically increasing timestamps, but apparently has less lock contention.
Use the following ruby script to reproduce: https://gist.github.com/db9006729bedd3b92ab7
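The clock caveat above can be sketched in Python (purely illustrative; the real code is Erlang, and the class name here is made up): a wrapper that never hands out a timestamp less than or equal to the one it issued before, which is one way to tolerate a wall clock that steps backwards.

```python
import time

class MonotonicStamp:
    """Hand out strictly increasing microsecond timestamps, even if the
    wall clock steps backwards (e.g. after an NTP adjustment)."""
    def __init__(self):
        self._last = 0

    def next(self):
        now = int(time.time() * 1_000_000)  # wall-clock microseconds
        # If the clock regressed or didn't advance, bump past the
        # previously issued stamp instead of repeating or going back.
        self._last = max(now, self._last + 1)
        return self._last

stamps = MonotonicStamp()
a, b = stamps.next(), stamps.next()
assert b > a  # strictly increasing, even within the same microsecond
```

The trade-off mirrors the Erlang one: erlang:now() guarantees monotonicity at the cost of more synchronization, while os:timestamp() is cheaper but can regress.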
Add a configuration option to allow system installers to choose whether to restrict user creation to admins or permit anyone to create a user directly.
Upload an object to CS.
Retrieve info for that object, for example via "s3cmd info".
The "Last mod" shown doesn't seem to be in GMT?
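For reference, HTTP dates such as the Last-Modified header behind "s3cmd info" are required to be expressed in GMT. A quick Python sanity check (illustrative; the timestamp is hypothetical):

```python
import calendar
from email.utils import formatdate, parsedate

epoch = 1335302388  # hypothetical modification time, 2012-04-24 21:19:48 UTC
header = formatdate(epoch, usegmt=True)
print(header)  # Tue, 24 Apr 2012 21:19:48 GMT
# Round-trip: parsing the header back yields the same epoch second,
# confirming the header carries a GMT time, not a local one.
assert calendar.timegm(parsedate(header)) == epoch
assert header.endswith("GMT")
```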
Add an independent connection pool only for bucket listings and limit it by default to 5.
For the move_manifests_to_gc_bucket() function discussion, see the line notes (at least two of them, please read all) at: https://github.com/basho/riak_moss/pull/147/files#r1072657
A while back, the "top" option was added to the riak-admin utility to make it easy for folks to use the OTP app "etop". Such a thing is missing from the RCS packaging: there is no riak-cs-admin utility at the moment. I'm not tremendously fussy about where it goes, but "riak-cs-admin top" would be consistent with the Riak product.
We should have a suite of tests to ensure our compatibility with the official Java S3 SDK.
Source management: the KV backend sources in the riak_moss repo cause eunit and dialyzer problems. Those problems can be solved by: 1. moving them to the riak_kv repo, or 2. adding a rebar dependency riak_moss -> riak_kv and taking on the big chain of subsequent deps that that would create.
For the MapReduce BEAM files, do we need to create another dir (added via erl -pz) for BEAM files that aren't part of the Riak packaging but are also separate from the basho-patches dir?
Since the default webmachine error handler is used, any time there is a Real Error, the user will see a webmachine/mochiweb generated response that is likely to include an Erlang stacktrace. We probably don't want users seeing those, so we should write a Riak CS custom error handler.
Worse yet, there is a bug in the default error handler that means if we ever encounter an error while generating the stream for a get-object response, we will attempt to start generating that stream over afresh as the error message. See https://github.com/basho/webmachine/issues/60
For both the riak_moss and stanchion apps, I believe that we should be using start() rather than start_link() when establishing a PB socket connection. If the Riak service is unavailable, the newly-started gen_server process will immediately exit for a non-'normal' reason which then kills the proc that called start_link(). We don't want that death to happen immediately. So, use start() at first and only if successful then link to the PB socket pid.
Research topic: Is this a case where the link is always undesirable?
When using riak-cs-storage batch to generate storage usage samples, I'm now getting data back like so:
"Storage":{"Samples":[{"StartTime":"20120607T004100Z","EndTime":"20120607T004100Z"},{"StartTime":"20120606T195208Z","EndTime":"20120606T195208Z"}],
It appears these samples are missing the actual usage data.
Current modules that are copied from Riak CS -> Stanchion
y = copied
n = different
s = similar
n stanchion.app.src
y stanchion_acl_utils.erl
n stanchion_app.erl
y stanchion_auth.erl
y stanchion_blockall_auth.erl
y stanchion_manifest_fsm.erl
y stanchion_manifest_resolution.erl
y stanchion_manifest_utils.erl
y stanchion_passthru_auth.erl
s stanchion_response.erl
n stanchion_server.erl
n stanchion_server_sup.erl
n stanchion_sup.erl
y stanchion_utils.erl
n stanchion_web.erl
n stanchion_wm_acl.erl
n stanchion_wm_bucket.erl
n stanchion_wm_buckets.erl
y stanchion_wm_error_handler.erl
n stanchion_wm_users.erl
s stanchion_wm_utils.erl
n velvet.erl
libssl0.9.8
$ sudo apt-get install libssl0.9.8
Install the riak-cs_1.0.0-1_amd64.deb package:
$ sudo dpkg -i riak-cs_1.0.0-1_amd64.deb
$ sudo riak-cs start
/usr/sbin/riak-cs: line 8: /usr/lib/riak-cs/lib/env.sh: Permission denied
/usr/sbin/riak-cs: line 11: check_user: command not found
/usr/sbin/riak-cs: line 19: node_down_check: command not found
mkdir: missing operand
Try `mkdir --help' for more information.
/usr/sbin/riak-cs: line 25: /run_erl: No such file or directory
/usr/lib/riak-cs/lib/env.sh
$ ls -al /usr/lib/riak-cs/lib/env.sh
-rw------- 1 root root 2302 2012-04-02 13:23 /usr/lib/riak-cs/lib/env.sh
$ sudo chmod -R a+r /usr/lib/riak-cs/lib/env.sh
$ sudo riak-cs start
grep: /etc/riak-cs/vm.args: Permission denied
vm.args needs to have either a -name or -sname parameter.
/etc/riak-cs/
$ ls -al /etc/riak-cs/
total 28
drwxr-xr-x 2 root root 4096 2012-04-14 19:51 .
drwxr-xr-x 82 root root 4096 2012-04-14 19:51 ..
-rw------- 1 root root 4512 2012-04-02 13:23 app.config
-rw------- 1 root root 1009 2012-04-02 13:23 cert.pem
-rw------- 1 root root 887 2012-04-02 13:23 key.pem
-rw------- 1 root root 1125 2012-04-02 13:23 vm.args
$ sudo chmod -R a+r /etc/riak-cs/
$ sudo riak-cs start
escript: Failed to open file: /usr/lib/riak-cs/erts-5.8.5/bin/nodetool
/usr/lib/riak-cs/erts-5.8.5/bin
$ ls -al /usr/lib/riak-cs/erts-5.8.5/bin
total 4532
drwxr-xr-x 2 root root 4096 2012-04-14 19:51 .
drwxr-xr-x 8 root root 4096 2012-04-14 19:51 ..
-rwxr-xr-x 1 root root 2053264 2012-04-02 13:24 beam
-rwxr-xr-x 1 root root 2249808 2012-04-02 13:24 beam.smp
-rwxr-xr-x 1 root root 14504 2012-04-02 13:24 child_setup
-rwxr-xr-x 1 root root 26928 2012-04-02 13:24 ct_run
-rwxr-xr-x 1 root root 47632 2012-04-02 13:24 epmd
-rwxr-xr-x 1 root root 1182 2012-04-02 13:23 erl
-rwxr-xr-x 1 root root 31056 2012-04-02 13:24 erlc
-rwxr-xr-x 1 root root 48048 2012-04-02 13:24 erlexec
-rwxr-xr-x 1 root root 26992 2012-04-02 13:24 escript
-rwxr-xr-x 1 root root 10360 2012-04-02 13:24 heart
-rwxr-xr-x 1 root root 47568 2012-04-02 13:24 inet_gethost
-rw------- 1 root root 4874 2012-04-02 13:23 nodetool
-rwxr-xr-x 1 root root 22960 2012-04-02 13:24 run_erl
-rwxr-xr-x 1 root root 1183 2012-04-02 13:23 start
-rwxr-xr-x 1 root root 14488 2012-04-02 13:24 to_erl
$ sudo chmod a+rx /usr/lib/riak-cs/erts-5.8.5/bin/nodetool
$ sudo riak-cs start
Expected: Riak CS would start without issue the first time.
Actual: several permission settings had to be adjusted before Riak CS could be started.
Use the dss-stats1 branch as a base and create a resource so the stats information can be retrieved via HTTP.
riak-cs_1.0.0-1_amd64.deb
$ sudo dpkg -i riak-cs_1.0.0-1_amd64.deb
Selecting previously deselected package riak-cs.
(Reading database ... 26683 files and directories currently installed.)
Unpacking riak-cs (from riak-cs_1.0.0-1_amd64.deb) ...
dpkg: dependency problems prevent configuration of riak-cs:
riak-cs depends on libssl0.9.8 (>= 0.9.8m-1); however:
Version of libssl0.9.8 on system is 0.9.8k-7ubuntu8.8.
dpkg: error processing riak-cs (--install):
dependency problems - leaving unconfigured
Processing triggers for man-db ...
Processing triggers for ureadahead ...
Errors were encountered while processing:
riak-cs
Expected: Riak CS would install.
Actual: Riak CS doesn't install because of a dependency issue.
The call in riak_moss_acl_utils:get_owner_data/2 to the function riak_moss_utils:get_user/2 is bogus. Instead, I think it should be a call to riak_moss_utils:get_user_by_index/3, but that function requires a RiakPid, and there isn't one easily available.
(Note: this error was found by Dialyzer.)
Currently, the user API supports retrieving the object user/key_id for a particular user's information. However, some clients escape the '/' characters in the request URI, so this becomes user%2Fkey_id, which does not work with Riak CS.
We ran into this issue with stats, and decided to change the URI format, as Amazon's S3 service currently allows access to an object with its '/' characters escaped or not (which is handled incorrectly, as those two resources are not the same).
Consider changing the URI or allowing this access to ensure the highest level of compatibility with existing Amazon S3 clients.
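The ambiguity can be sketched as follows (Python, illustrative only; the helper name is made up). If the server unescapes %2F before routing, both spellings reach the same user resource:

```python
from urllib.parse import unquote

def normalize_user_path(raw_path):
    """Treat an escaped slash in the user resource the same as a literal
    one, so 'user%2Fkey_id' and 'user/key_id' address the same resource.
    A real server must be careful not to unescape slashes where they
    are significant (e.g. inside object keys)."""
    return unquote(raw_path)

assert (normalize_user_path("/user%2Fkey_id")
        == normalize_user_path("/user/key_id")
        == "/user/key_id")
```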
The archiver process should abort the archival of the current timeslice if the total time for archiving it exceeds some configured time, to prevent compounded backlog. Currently timeouts only happen on a per-user-slice storage operation basis.
This issue was originally filed as sprint.ly 95, a sub-task of the initial usage-tracking story, and later as sprint.ly 202.
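The proposed budget could look like this sketch (Python, names illustrative, not the actual archiver API): the per-user-slice timeout stays as it is, and a new whole-timeslice budget aborts the remainder once exceeded.

```python
import time

def archive_timeslice(user_slices, slice_budget_s, per_user_timeout_s):
    """Archive per-user slices, but abort the whole timeslice once total
    elapsed time exceeds slice_budget_s, so one slow slice cannot
    compound into an ever-growing backlog."""
    started = time.monotonic()
    archived = []
    for user, archive_fun in user_slices:
        if time.monotonic() - started > slice_budget_s:
            return archived, 'aborted'   # skip the rest of this timeslice
        archive_fun(per_user_timeout_s)  # existing per-user-slice timeout
        archived.append(user)
    return archived, 'complete'

done, status = archive_timeslice([("alice", lambda t: None)], 60, 5)
assert status == 'complete' and done == ["alice"]
```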
The block_server fetches chunks using r=1. If any Riak node is down there is a high probability that the downed node houses a primary vnode for one of the file blocks. This leads to a scenario where the responding vnode may respond with {error, notfound} and this causes the file retrieval to fail. If the requesting client is s3cmd then this manifests as s3cmd hanging and chewing up cpu time.
If a user that doesn't exist attempts to put a file into a bucket that doesn't exist (haven't tried it if the bucket does exist, sorry), I see this on the CS console:
17:02:51.571 [error] webmachine error: path="/test/foo1"
{error,{badrecord,key_context},[{riak_moss_wm_key,forbidden,2,[{file,"src/riak_moss_wm_key.erl"},{line,79}]},{webmachine_resource,resource_call,3,[{file,"src/webmachine_resource.erl"},{line,166}]},{webmachine_resource,do,3,[{file,"src/webmachine_resource.erl"},{line,125}]},{webmachine_decision_core,resource_call,1,[{file,"src/webmachine_decision_core.erl"},{line,48}]},{webmachine_decision_core,decision,1,[{file,"src/webmachine_decision_core.erl"},{line,198}]},{webmachine_decision_core,handle_request,2,[{file,"src/webmachine_decision_core.erl"},{line,33}]},{webmachine_mochiweb,loop,1,[{file,"src/webmachine_mochiweb.erl"},{line,89}]},{mochiweb_http,headers,5,[{file,"src/mochiweb_http.erl"},{line,136}]}]}
And the s3cmd output says:
ERROR: mismatched tag: line 22, column 133
ERROR: Parameter problem: Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/
The HTTP response code is 500, and the body contains the exception shown above.
If I add a catch, such as catch extract_user(..., in riak_moss_wm_key.erl line 79, I instead see this on the CS console:
17:08:49.997 [error] Supervisor poolboy_sup had child riak_moss_riakc_pool_worker started with {riak_moss_riakc_pool_worker,start_link,undefined} at <0.129.0> exit with reason killed in context child_terminated
And s3cmd correctly (right?) says:
ERROR: S3 error: 403 (InvalidAccessKeyId): The AWS Access Key Id you provided does not exist in our records.
Users' key_secret fields are currently stored as plaintext. They should be stored using a salted hash to increase security.
Changes for this should not be merged until Issue 152 is completed and merged. Otherwise we are unable to help in the case that someone misplaces their key_secret.
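A minimal sketch of the salted-hash storage (Python, illustrative; PBKDF2 is chosen here just as an example of a salted KDF). One caveat worth noting: S3-style request signing computes an HMAC with the plaintext secret, so storing only a digest also implies rethinking how signatures are verified.

```python
import hashlib
import hmac
import os

def hash_secret(key_secret, salt=None):
    """Derive a salted, slow hash of a user's key_secret so only the
    digest is stored at rest."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac('sha256', key_secret.encode(), salt, 100_000)
    return salt, digest

def check_secret(key_secret, salt, digest):
    candidate = hashlib.pbkdf2_hmac('sha256', key_secret.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

salt, digest = hash_secret("s3cr3t")
assert check_secret("s3cr3t", salt, digest)
assert not check_secret("wrong", salt, digest)
```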
Attempting to retrieve a key with a leading slash is causing a key_context error.
Expected([200, 206]) <=> Actual(500 InternalServerError)
request => {:connect_timeout=>60, :headers=>{"Date"=>"Wed, 13 Jun 2012 14:52:57 +0000", "Authorization"=>"AWS PQXW7L5DVF9EVAGWEERV:c0/K332V8Scea1TMWkDWe/j/3K4=", "Host"=>"bucket-1339599177-70306578377940.localhost:8080"}, :instrumentor_name=>"excon", :mock=>false, :read_timeout=>60, :retry_limit=>4, :ssl_ca_file=>"/Users/cmeiklejohn/.rbenv/versions/1.9.3-p194/lib/ruby/gems/1.9.1/gems/excon-0.13.4/data/cacert.pem", :ssl_verify_peer=>true, :write_timeout=>60, :host=>"bucket-1339599177-70306578377940.localhost", :path=>"/key-1339599177-70306578377940", :port=>"8080", :query=>nil, :scheme=>"http", :expects=>[200, 206], :idempotent=>true, :method=>"GET"}
response => #<Excon::Response:0x007fe30fed28b8 @body="<html><head><title>500 Internal Server Error</title></head><body><h1>Internal Server Error</h1>The server encountered an error while processing this request:<br><pre>{error,{badrecord,key_context},\n [{riak_moss_wm_key,forbidden,2,\n [{file,\"src/riak_moss_wm_key.erl\"},{line,79}]},\n {webmachine_resource,resource_call,3,\n [{file,\"src/webmachine_resource.erl\"},\n {line,166}]},\n {webmachine_resource,do,3,\n [{file,\"src/webmachine_resource.erl\"},\n {line,125}]},\n {webmachine_decision_core,resource_call,1,\n [{file,\"src/webmachine_decision_core.erl\"},\n {line,48}]},\n {webmachine_decision_core,decision,1,\n [{file,\"src/webmachine_decision_core.erl\"},\n {line,198}]},\n {webmachine_decision_core,handle_request,2,\n [{file,\"src/webmachine_decision_core.erl\"},\n
I've seen this while running a load test (100% "s3cmd put" operations with a 200MB file):
18:07:59.902 [warning] Error occurred trying to query <<"1d3f9e079ad5ce4d02cb0997a0176d28f3fbcef54e835c8209ba3eeb33bd9a76">> in user index <<"c_id_bin">>. Reason: <<"Error sending inputs: {noproc,\n [{erlang,link,[<0.3455.17>]},\n {riak_kv_pipe_index,queue_existing_pipe,4},\n {riak_kv_mrc_pipe,'-send_inputs_async/3-fun-0-',\n 3}]}">>
18:07:59.903 [warning] Failed to retrieve key_id for user "foobar" with canonical_id "1d3f9e079ad5ce4d02cb0997a0176d28f3fbcef54e835c8209ba3eeb33bd9a76"
Using Riak 1.1 and using CS from the slf-chash-locality-experiment2 branch, which is based on commit a056658 (master as of June 18 2012).
A webmachine resource process exits with reason normal if it tries to send data to a socket that was closed by the client. The process exiting in this way skips the call to the access logger, so neither the operation nor the bytes it sent are counted.
One fix for this was committed as https://github.com/basho/webmachine/pull/63 but was deemed unusable due to the mess that is shutting down Riak connections after requests complete.
Attempting to create a user with te\[email protected] as the email:
curl http://127.0.0.1:8080/user --data "email=te\[email protected]&name=test"
Expected: either create the new user or return an error indicating the email is invalid. Actual response:
<html><head><title>500 Internal Server Error</title></head><body><h1>Internal Server Error</h1>The server encountered an error while processing this request:<br><pre>{error,{exit,{fatal,{{invalid_name,"<\"{\\\"p"},
{file,file_name_unknown},
{line,4},
{col,39}}},
[{xmerl_scan,fatal,2},
{xmerl_scan,scan_element,7},
{xmerl_scan,scan_content,11},
{xmerl_scan,scan_element,12},
{xmerl_scan,scan_content,11},
{xmerl_scan,scan_element,12},
{xmerl_scan,scan_content,11},
{xmerl_scan,scan_element,12}]}}</pre><P><HR><ADDRESS>mochiweb+webmachine web server</ADDRESS></body></html>
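One way to avoid handing xmerl an address it cannot serialize is to validate the email before creating the user. A rough sketch (Python, illustrative; the regex is a sanity check, not full RFC 5322, and the addresses are hypothetical):

```python
import re

# Reject characters like '\' up front, before they can break
# XML generation further down the request path.
EMAIL_RE = re.compile(r'^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')

def valid_email(addr):
    return EMAIL_RE.fullmatch(addr) is not None

assert valid_email("someone@example.com")
assert not valid_email("some\\one@example.com")  # literal backslash: reject
```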
Marcel reported this while migrating some Java code from S3 to CS. The following is the relevant Java snippet:
ObjectMetadata metaData = new ObjectMetadata();
metaData.addUserMetadata("title", "Test\u00A0");
s3.putObject(new PutObjectRequest(bucketName, key, new FileInputStream(createSampleFile()), metaData));
Riak CS currently returns list keys requests with all the keys in one response. S3 limits responses to 1,000 keys per request, and also includes an IsTruncated bool to tell the user whether all the requested keys have been sent. We should consider implementing this functionality. As it stands now, you can't ask for only 50 keys to be returned. Here are the corresponding S3 docs.
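The truncation semantics can be sketched like so (Python, illustrative; the function is made up, but the field names follow S3's ListBucket response):

```python
def list_keys(all_keys, max_keys=1000, marker=None):
    """Return at most max_keys keys in order after `marker`, plus an
    IsTruncated flag and the marker the client passes to fetch the
    next page."""
    keys = sorted(all_keys)
    if marker is not None:
        keys = [k for k in keys if k > marker]
    page, rest = keys[:max_keys], keys[max_keys:]
    return {
        'Contents': page,
        'IsTruncated': bool(rest),
        'NextMarker': page[-1] if rest else None,
    }

keys = [f"key-{i:04d}" for i in range(120)]
page1 = list_keys(keys, max_keys=50)
assert page1['IsTruncated'] and len(page1['Contents']) == 50
page2 = list_keys(keys, max_keys=50, marker=page1['NextMarker'])
assert page2['Contents'][0] == "key-0050"
```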