sftpcloudfs's Introduction

sftp cloudfs

This is an SFTP (SSH File Transfer Protocol) interface to OpenStack Object Storage, providing a service that acts as a proxy between an SFTP client and a storage service.

The username/password pair used to open the SFTP session is validated using the authentication service of the files/storage service to get an authentication token.

The communication between the client and the SFTP daemon is always encrypted, and the SFTP service supports HTTPS communication with the remote files/storage service.

Limited SCP support is available since version 0.10.

Install

Requirements:

  • python (2.6)
  • paramiko (1.7.6+; 1.17.0+ recommended)
  • python-swiftclient (2.0+)
  • python-daemon (1.5.5)
  • ftp-cloudfs (0.35+)
  • python-memcached (1.45)

These are the minimum recommended versions based on our testing environment.

You may need to create a host key with ssh-keygen.
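
For example (the key path is only a suggestion; it must match the host-key-file option described below):

ssh-keygen -t rsa -b 4096 -N '' -f /etc/sftpcloudfs/rsa_key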

To install the software, run the following command:

python setup.py install

Or using pip:

pip install sftp-cloudfs

Please use the latest pip version; old versions may have bugs. You can upgrade pip with pip itself: pip install --upgrade pip.

Usage

Once installed you can run the service with the sftpcloudfs executable, which supports the following options:

--version             show program's version number and exit
-h, --help            show this help message and exit
-a AUTHURL, --auth-url=AUTHURL
                      Authentication URL
--insecure            Allow to access servers without checking SSL certs
-k HOST_KEY, --host-key-file=HOST_KEY
                      Host RSA key used by the server
-b BIND_ADDRESS, --bind-address=BIND_ADDRESS
                      Address to bind (default: 127.0.0.1)
-p PORT, --port=PORT  Port to bind (default: 8022)
--server-ident=SERVER_IDENT
                      Server ident to use when sending the SSH banner to the
                      client (default: sftpcloudfs_VER)
--memcache=MEMCACHE   Memcache server(s) to be used for cache (ip:port)
-l LOG_FILE, --log-file=LOG_FILE
                      Log into provided file
-f, --foreground      Run in the foreground (don't detach from terminal)
--disable-scp         Disable SCP support (default: enabled)
--syslog              Enable logging to system logger (daemon facility)
-v, --verbose         Show detailed information on logging
--pid-file=PID_FILE   Full path to the pid file location
--uid=UID/USER        UID or USER to drop the privileges to when in daemon mode
--gid=GID/GROUP       GID or GROUP to drop the privileges to when in daemon mode
--keystone-auth       Use OpenStack Identity Service (Keystone, requires keystoneclient)
--keystone-auth-version=VERSION
                      Identity API version to be used, default: 2.0
--keystone-region-name=REGION_NAME
                      Region name to be used in Keystone auth
--keystone-tenant-separator=TENANT_SEPARATOR
                      Character used to separate tenant_name/username in
                      Keystone auth, default: TENANT.USERNAME
--keystone-domain-separator=DOMAIN_SEPARATOR
                      Character used to separate project_name/project_domain_name
                      and username/user_domain_name in Keystone auth v3, default: @
--keystone-service-type=SERVICE_TYPE
                      Service type to be used in Keystone auth,
                      default: object-store
--keystone-endpoint-type=ENDPOINT_TYPE
                      Endpoint type to be used in Keystone auth,
                      default: publicURL
--storage-policy=STORAGE_POLICY
                      Swift storage policy to be used (optional)
                      Access to other containers will be denied
--config=CONFIG       Use an alternative configuration file

The default location for the configuration file is /etc/sftpcloudfs.conf.
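
For example, to run the service in the foreground with verbose logging (the auth URL and key path below are placeholders):

sftpcloudfs -a https://auth.example.com/v1.0 -k /etc/sftpcloudfs/rsa_key --memcache=127.0.0.1:11211 -f -v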

Memcache is optional but highly recommended for better performance. Any Memcache server must be secured to prevent unauthorized access to the cached data.

By default Swift auth 1.0 is used, which is compatible with OpenStack Object Storage (Swift) using the swauth authentication middleware.

Optionally the OpenStack Identity Service (aka Keystone) v2.0 or v3 can be used. Currently python-keystoneclient (0.3.2+ recommended) is required to use Keystone auth, which can be enabled with the keystone-auth option.
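
For example, against a hypothetical Keystone v2.0 endpoint (clients then log in as TENANT.USERNAME, following the tenant separator option):

sftpcloudfs --keystone-auth -a http://keystone.example.com:5000/v2.0 -k /etc/sftpcloudfs/rsa_key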

The server supports large files (over the 5GB default limit) by splitting the file into parts under a .part subdirectory and using a manifest file to access them as a single file.
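
As an illustration of the manifest technique (Swift "dynamic large objects"), here is a minimal python-swiftclient sketch that uploads parts under a .part prefix and creates a zero-byte manifest presenting them as one object; the endpoint, credentials, container and file names are placeholders, and this is not the server's actual code:

from swiftclient.client import Connection

conn = Connection(authurl='http://127.0.0.1:8080/auth/v1.0',  # placeholder endpoint
                  user='account:user', key='secret')          # placeholder credentials

SEGMENT = 64 * 1024 * 1024  # example split size in bytes

with open('big.bin', 'rb') as f:
    part = 0
    while True:
        chunk = f.read(SEGMENT)
        if not chunk:
            break
        # each part is stored under the .part "subdirectory" prefix
        conn.put_object('container', 'big.bin.part/%08d' % part, chunk)
        part += 1

# the zero-byte manifest object makes the parts readable as a single file
conn.put_object('container', 'big.bin', '',
                headers={'X-Object-Manifest': 'container/big.bin.part/'})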

With the storage-policy parameter, you can restrict user access to a single policy. If no name is specified, the default policy is used (and if no other policies are defined, Policy-0 is considered the default).

Please check the example configuration file for further details.

License

This is free software under the terms of the MIT license (see the COPYING file included in this package).

The server is loosely based on the BSD-licensed sftpd server code from:

http://code.google.com/p/pyfilesystem/

Contact and support

The project website is at:

https://github.com/memset/sftpcloudfs

There you can file bug reports, ask for help or contribute patches.

Contributors

c35sys, edwardbetts, mnederlof, ncw, reidrac


sftpcloudfs's Issues

Performance improvement, packet size 16Kbits, chunk

Hello,

Currently I use this project to send data to my Swift infrastructure. We encounter performance issues when reading/writing big files (more than a dozen MB) from/to Swift. We did several tests; one of them, using rclone, shows that when uploading a big file (2GB) into Swift using rclone's swift backend we reach a speed of 200MB/s (both ways: upload/download). Then, using rclone's sftp backend pointing to sftpcloudfs (running locally), we only reach 6MB/s (local to Swift) and 3MB/s (Swift to local).
It looks like there's a huge overhead reading the remote file in small chunks. Do you think there's a way to improve this? If so, could you point me at where to look in the code? :-)

sftpcloudfs to run on system launch

I am trying to make sftpcloudfs run on startup to ensure the service comes back up after a reboot.

How can I do this on my Plesk install on CentOS 6.4 (Final)?

Also, should it run under a specific user account or just root?

Objects are not cached in memcache

Hello,

I configured a memcache server (host:port) in sftpcloudfs.conf, and the server is running, but files uploaded over SFTP are not being cached; I checked using memcache stats. I'm using sftpcloudfs from GitHub on Ubuntu 14.10 (the package in Ubuntu is broken and can't get files from OpenStack Swift).

Caching in ftp-cloudfs is working with memcache.

keystone v3

Are there any plans to support Keystone's v3 authentication? If not, would you be open to a patch?

SCP recursive sub-directory upload issue

When uploading sub-directories containing more than one file with scp, an extra \x00 is inserted into the channel, causing an error while parsing the record in recv_inner().

How to reproduce:

--- a/sftpcloudfs/scp.py
+++ b/sftpcloudfs/scp.py
@@ -143,9 +143,15 @@ class SCPHandler(threading.Thread):
         if '\n' not in self.buffer:
             while True:
                 chunk = self.channel.recv(1024)
+                self.log.info('[recv_line] chunk=%s', chunk)
+                hex_chunk = "".join(x.encode('hex') for x in chunk)
+                self.log.info('[recv_line] hex chunk=%s', hex_chunk)
                 self.buffer += chunk
+                self.log.info('[recv_line] buffer=%s', self.buffer)
+                hex_buffer = "".join(x.encode('hex') for x in self.buffer)
+                self.log.info('[recv_line] hex buffer=%s', hex_buffer)
                 if '\n' in chunk:
                     break

Logs:

Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] chunk=D0755 0 in
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] hex chunk=4430373535203020696e0a
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] buffer=D0755 0 in
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] hex buffer=4430373535203020696e0a
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] chunk=D0755 0 sub1
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] hex chunk=4430373535203020737562310a
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] buffer=D0755 0 sub1
Jan  6 14:56:17 node-xxx paramiko[18892]: INFO: [recv_line] hex buffer=4430373535203020737562310a
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] chunk=C0644 65536 file2
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex chunk=43303634342036353533362066696c65320a
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] buffer=C0644 65536 file2
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex buffer=43303634342036353533362066696c65320a
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] chunk=
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex chunk=00
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] buffer=
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex buffer=00
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] chunk=C0644 32768 file1
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex chunk=43303634342033323736382066696c65310a
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] buffer=
Jan  6 14:56:18 node-xxx paramiko[18892]: INFO: [recv_line] hex buffer=0043303634342033323736382066696c65310a

My (dirty) workaround:

--- a/sftpcloudfs/scp.py
+++ b/sftpcloudfs/scp.py
@@ -143,6 +143,8 @@ class SCPHandler(threading.Thread):
         if '\n' not in self.buffer:
             while True:
                 chunk = self.channel.recv(1024)
+                if chunk == '\x00':
+                    continue
                 self.buffer += chunk
                 if '\n' in chunk:
                     break

Any hints on what's causing this behaviour?

Update dependencies

Update the dependencies to the latest supported versions, then test and fix any incompatibilities.

Rate limiting support?

Even though it is suggested that sftpcloudfs run on a high port, as a non-root user, it will most likely be made accessible to clients on *:22 - via a rewrite, or similar - to prevent users from having to connect to a non-standard port.

On that basis it will get hit by dictionary attacks which should ideally be rate-limited.

My initial thought was that we could use the memcached instance to count failed login attempts and drop clients who made too many, or implement some backoff.

However sftpcloudfs doesn't directly use memcached. Instead it merely passes the details over to the ftp-cloudfs library.

Would you accept a patch which implemented rate-limiting by actually keeping track of connection failures in memcached, instantiating a connection to a memcached instance if the memcached options were supplied?

Or, failing that, would you have an alternative preferred approach to handling rate-limiting?
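
A minimal sketch of the counting approach proposed above, using python-memcached; the key prefix, threshold and expiry window are hypothetical, and this is not existing sftpcloudfs code:

import memcache

MAX_FAILURES = 5   # hypothetical threshold
WINDOW = 300       # seconds before the failure counter expires

mc = memcache.Client(['127.0.0.1:11211'])

def login_allowed(client_ip):
    # allow the login only while the client is under the threshold
    failures = mc.get('sftp-fail-%s' % client_ip)
    return failures is None or int(failures) < MAX_FAILURES

def record_failure(client_ip):
    key = 'sftp-fail-%s' % client_ip
    # add() fails if the key already exists, so the expiry window starts
    # on the first failure; incr() then bumps the counter atomically
    if not mc.add(key, 1, time=WINDOW):
        mc.incr(key)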

Port to Python 3

Python 2.7 will reach end of life in 2020, so we should ensure that the project works with Python 3.

Supporting Python 2 is optional.

Memcache, advantages and setup

I am trying to run sftpcloudfs on a production server to send backup files over to Cloud Files.

What are the benefits of using memcache?

How do I set it up correctly?

Not able to put files

Hi,

I have followed the instructions on your website as it is and installed sftpcloudfs along with memcache.

I am able to log in to the SFTP server with WinSCP / FileZilla. I am also able to download files from my container on OpenStack Swift.

The only issue I am facing is that I am not able to PUT a file into the container. The error I am getting in syslog is below.

May 13 17:20:22 Storage-1 paramiko[3]: DEBUG: open((<sftpcloudfs.server.SFTPServerInterface object at 0x23c30d0>, u'/ftp-cloudfs/404error.html', 705, <SFTPAttributes: [ size=30 ]>),{}): enter

May 13 17:20:22 Storage-1 paramiko[3]: DEBUG: SFTPHandle(path=u'/ftp-cloudfs/404error.html', flags=705)

May 13 17:20:23 Storage-1 paramiko[3]: INFO: open(u'/ftp-cloudfs/404error.html', 705, <SFTPAttributes: [ size=30 ]>) from ('124.153.69.242', 23710): HTTPConnectionPool(host='simplidrivedel.netmagicsolutions.comPUT', port=80): Max retries exceeded with url: / (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)

May 13 17:20:23 Storage-1 paramiko[3]: DEBUG: open: returns 4

May 13 17:20:24 Storage-1 paramiko[3]: DEBUG: stat((<sftpcloudfs.server.SFTPServerInterface object at 0x23c30d0>, u'/ftp-cloudfs/404error.html'),{}): enter

May 13 17:20:24 Storage-1 paramiko[3]: INFO: stat(u'/ftp-cloudfs/404error.html',) from ('124.153.69.242', 23710): [Errno 2] No such file or directory 404error.html

May 13 17:20:24 Storage-1 paramiko[3]: DEBUG: stat: returns 2

May 13 17:20:26 Storage-1 object-auditor Begin object audit "forever" mode (ZBF).

Default auth url

ftp-cloudfs uses the Rackspace Cloud Files auth URL as the default auth URL.

We should probably do the same (it makes sense, as this is a Cloud Files proxy).

Pidfile is used in foreground mode

When sftpcloudfs is started with --foreground, the pidfile is still checked for existence, and possibly created.

We're using sftpcloudfs with Ubuntu's upstart, which doesn't use pidfiles. I found it quite counterintuitive that a pidfile is checked and created even though it is not configured or required.

I see two solutions:

  • Don't create a pidfile when it is not configured (my preference)
  • Don't create a pidfile when --foreground is set.

I'm willing to implement either of these solutions, once a decision has been made.
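
A sketch of the first option under stated assumptions: python-daemon's DaemonContext and a pid_file value that may be unset; this is not sftpcloudfs's actual code.

from daemon import DaemonContext
from daemon.pidlockfile import PIDLockFile

def make_daemon_context(pid_file=None):
    # only create and lock a pidfile when one was configured
    pidfile = PIDLockFile(pid_file) if pid_file else None
    return DaemonContext(pidfile=pidfile)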

Missing dep while installing via pip?

I am installing sftp-cloudfs via pip, which seems to install fine according to this log: http://pastie.org/private/6yiqvbsq9ipdtxvp6uchha

When trying to run it, I am getting:

chmouel@swift-all-in-one:~$ sftpcloudfs 
Traceback (most recent call last):
  File "/usr/local/bin/sftpcloudfs", line 4, in <module>
    from sftpcloudfs.main import Main
  File "/usr/local/lib/python2.6/dist-packages/sftpcloudfs/main.py", line 35, in <module>
    from daemon.pidlockfile import PIDLockFile
ImportError: No module named pidlockfile

This is because Ubuntu has an old version of python-pidlock by default which doesn't have daemon.pidlockfile and takes precedence in the Python path:

http://pastie.org/private/r7ca5l0f7xmk2d47ogcaga

I worked around this in ftpcloudfs by using my own implementation in utils.PidFile.

error: certificate was not OK

I am trying to use your code to securely send files to my Rackspace Cloud files via SFTP.

However, after following your instructions from QuickStart.md on how to generate an RSA key, I am unable to connect to the sftp proxy.
This is what I use to test if it works:
curl sftp://127.0.0.1:8022 -u username
This is the error message I get:
curl: (51) SSL peer certificate or SSH remote key was not OK

It only works when I specify --insecure, which is not good.
curl sftp://127.0.0.1:8022 --insecure -u username

Would you be able to help me fix this issue?

PS: I have already created a valid SSL certificate which I paid for via RapidSSL; could this be of any use?

Endpoint for 'object-store' not found

We have sftpcloudfs installed and running on CentOS 6.4, but are having an issue when configuring for Keystone authentication. The following error is being logged:

[Errno 5] Endpoint for 'object-store' not found - have you specified a region?

The following config file is being used:

# sftpcloudfs configuration file
#

[sftpcloudfs]

# Authentication URL (eg Memstore, RackSpace Cloud Files, OpenStack)
auth-url = http://192.168.20.50:5000/v2.0

# Host RSA key used by the server.
host-key-file = /etc/sftpcloudfs/rsa_key

# Address to bind.
bind-address = 192.168.20.50

# Port to bind.
port = 8022

# SCP support.
# scp-support = yes

# Memcache server(s) for external cache (eg 127.0.0.1:11211)
# Can be a comma-separated list.
memcache = 192.168.20.50:11211

# Maximum number of workers.
# max-children = 20

# Large file support.
# Specify a size in MB to split large files.
# split-large-files = (empty)

# Hide .part directory from large files
# hide-part-dir = no

# Log file location.
# log-file = (empty)

# Enable logging to system logger (daemon facility).
syslog = yes

# Show detailed information on logging.
verbose = yes

# Pid file location when in daemon mode.
# pid-file = (empty)

# UID to drop privileges when in daemon mode.
# uid = (empty)

# GID to drop privileges when in daemon mode.
# gid = (empty)

# Auth 2.0 (Keystone), requires keystoneclient
keystone-auth = yes

# Region name to be used with Auth 2.0 (optional)
keystone-region-name = regionOne

# Tenant separator to be used with Auth 2.0 (eg. TENANT.USERNAME)
keystone-tenant-separator = ':'

# Service type to be used with Auth 2.0.
keystone-service-type = 'object-store'

# Endpoint type to be used with Auth 2.0.
keystone-endpoint-type = 'publicurl'

# EOF

RFE: Check permissions before uploading big files

The communication between the SFTP daemon and Swift is over HTTP, which means that when uploading a big file without write permissions, we need to upload the whole file before getting an error.

It would be a good idea to test the permissions first (e.g. by creating a zero-byte file), so the SFTP service returns the error ASAP.
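
A minimal sketch of the suggested probe using python-swiftclient; the helper and the probe object name are illustrative, not part of the server:

from swiftclient.exceptions import ClientException

def can_write(conn, container, name):
    """Probe write access with a zero-byte object; conn is a
    swiftclient Connection."""
    probe = name + '.permcheck'  # hypothetical probe object name
    try:
        # create and delete a zero-byte object so a permission error
        # surfaces before the client uploads the whole file
        conn.put_object(container, probe, '')
        conn.delete_object(container, probe)
        return True
    except ClientException:
        return False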
