Giter VIP home page Giter VIP logo

gcsfs's Introduction

GCSFS

A Python filesystem abstraction of Google Cloud Storage (GCS) implemented as a PyFilesystem2 extension.

https://travis-ci.org/Othoz/gcsfs.svg?branch=master https://readthedocs.org/projects/fs-gcsfs/badge/?version=latest

With GCSFS, you can interact with Google Cloud Storage as if it was a regular filesystem.

Apart from the nicer interface, this will highly decouple your code from the underlying storage mechanism: Exchanging the storage backend with an in-memory filesystem for testing or any other filesystem like S3FS becomes as easy as replacing gs://bucket_name with mem:// or s3://bucket_name.

For a full reference on all the PyFilesystem possibilities, take a look at the PyFilesystem Docs!

Documentation

Installing

Install the latest GCSFS version by running:

$ pip install fs-gcsfs

Or in case you are using conda:

$ conda install -c conda-forge fs-gcsfs

Examples

Instantiating a filesystem on Google Cloud Storage (for a full reference visit the Documentation):

from fs_gcsfs import GCSFS
gcsfs = GCSFS(bucket_name="mybucket")

Alternatively you can use a FS URL to open up a filesystem:

from fs import open_fs
gcsfs = open_fs("gs://mybucket/root_path?project=test&api_endpoint=http%3A//localhost%3A8888&strict=False")

Supported query parameters are:

  • project (str): Google Cloud project to use
  • api_endpoint (str): URL-encoded endpoint that will be passed to the GCS client's client_options
  • strict ("True" or "False"): Whether GCSFS will be opened in strict mode

You can use GCSFS like your local filesystem:

>>> from fs_gcsfs import GCSFS
>>> gcsfs = GCSFS(bucket_name="mybucket")
>>> gcsfs.tree()
├── foo
│   ├── bar
│   │   ├── file1.txt
│   │   └── file2.csv
│   └── baz
│       └── file3.txt
└── file4.json
>>> gcsfs.listdir("foo")
["bar", "baz"]
>>> gcsfs.isdir("foo/bar")
True

Uploading a file is as easy as:

from fs_gcsfs import GCSFS
gcsfs = GCSFS(bucket_name="mybucket")
with open("local/path/image.jpg", "rb") as local_file:
    with gcsfs.open("path/on/bucket/image.jpg", "wb") as gcs_file:
        gcs_file.write(local_file.read())

You can even sync an entire bucket on your local filesystem by using PyFilesystem's utility methods:

from fs_gcsfs import GCSFS
from fs.osfs import OSFS
from fs.copy import copy_fs

gcsfs = GCSFS(bucket_name="mybucket")
local_fs = OSFS("local/path")

copy_fs(gcsfs, local_fs)

For exploring all the possibilities of GCSFS and other filesystems implementing the PyFilesystem interface, we recommend visiting the official PyFilesystem Docs!

Development

To develop on this project make sure you have pipenv installed and run the following from the root directory of the project:

$ pipenv install --dev --three

This will create a virtualenv with all packages and dev-packages installed.

Tests

All CI tests run against an actual GCS bucket provided by Othoz.

In order to run the tests against your own bucket, make sure to set up a Service Account with all necessary permissions:

  • storage.objects.get
  • storage.objects.list
  • storage.objects.create
  • storage.objects.update
  • storage.objects.delete

All five permissions listed above are e.g. included in the predefined Cloud Storage IAM Role roles/storage.objectAdmin.

Expose your bucket name as an environment variable $TEST_BUCKET and run the tests via:

$ pipenv run pytest

Note that the tests mostly wait for I/O, therefore it makes sense to highly parallelize them with xdist, e.g. by running the tests with:

$ pipenv run pytest -n 10

Credits

Credits go to S3FS which was the main source of inspiration and shares a lot of code with GCSFS.

gcsfs's People

Contributors

birnbaum avatar btschroer avatar mathiaseitz avatar trendelkampschroer avatar bgroenks96 avatar elephantum avatar dvgica avatar bemeyvif avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.