
Working with Jupyter Notebook in IBM Cloud

Hello, today we are going to explain how to create a Jupyter notebook in IBM Cloud, and how to upload and download files from the cloud using Python.

Step 1. Login to IBM Cloud

First, log in to IBM Cloud:

https://cloud.ibm.com/login

image-20230226112742146

Step 2. Creation of the Jupyter notebook in IBM Cloud.

In the Dashboard, click Create resource,

image-20230226114015439

then search for Watson Studio and click Create.

image-20230226114203777

Enter the necessary information

  • Open Watson Studio by clicking Resource list - IBM Cloud and launch IBM Cloud Pak for Data

    image-20230226114646771

    Click New project

    image-20230226120110304

    After the project has been created, you can open it

    image-20230226120338066

    Once the project is open, click on "Assets"

    image-20230226120442148

Then click New asset

image-20230226120545427

and type Jupyter Notebook

image-20230226120648094

Click on "Notebook"

image-20230226120952866

Now click on “Create Notebook”. We can use this notebook to get started with IBM Cloud.

Step 3. Connect Jupyter notebook to IBM Cloud Storage

Cloud storage lets you keep data and files in an off-site location that can be accessed either via the public internet or through a dedicated private network. The data you send off-site becomes the responsibility of a third-party cloud provider. The vendor owns, secures, operates, and supports the servers and the relevant services, and guarantees that you have access to the data whenever you need it.

Assuming you have an account in IBM Cloud Object Storage, click Customize your bucket

image-20230226122605261

Give your bucket a unique name, for example ruslanmv-bucket; you can choose any name you like, as long as it is globally unique

image-20230226123403897

Step 4. Getting the credentials for the Python connection

In this part you will copy your own credentials to use in the next step.

Go to Service credentials and create new credentials

image-20230226125815236

and copy the API details

image-20230226130431914

For endpoint_url you have to select Configuration under your bucket and copy the private URL

image-20230226135913108

You should choose the endpoint according to your setup: if you are working with a public bucket, use the public endpoint; in my case I will use only private buckets, so we choose the private endpoint. (A short sketch of how the two URLs differ follows the screenshot below.)

image-20230226140053190
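For reference, the private service endpoint differs from the public one only by a private. infix in the hostname. A minimal sketch of how you might build either URL, assuming the eu-de region used later in this tutorial (use_private and region are illustrative variables):

# Build the COS service endpoint for a region; the private endpoint
# just inserts "private." after "s3." (assumption: eu-de region)
use_private = True
region = 'eu-de'
infix = 'private.' if use_private else ''
service_endpoint = 'https://s3.' + infix + region + '.cloud-object-storage.appdomain.cloud'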

Step 5. Connect Jupyter Notebook with IBM Cloud

Python support is provided through a fork of the boto3 library with features to make the most of IBM Cloud Object Storage.

The package that we will use is ibm-cos-sdk, which should already be installed in the default notebook environment; otherwise you can install it with pip install ibm-cos-sdk, as shown below.
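If the import in the next step fails, you can install the SDK directly from a notebook cell; the leading ! tells Jupyter to run the line as a shell command:

# Install IBM's COS SDK into the notebook environment
!pip install ibm-cos-sdk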

Let us return to the notebook that you created and first load the following libraries.

import ibm_boto3                                     # IBM's fork of boto3 for Cloud Object Storage
from ibm_botocore.client import Config, ClientError  # client configuration and error handling
import pandas as pd                                  # used below to build a sample DataFrame
import io                                            # in-memory file objects for uploads/downloads

In this part you should paste the credentials that you copied in Step 4:

cos_credentials={
  "apikey": "*********************",
  "endpoints": "https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints",
  "iam_apikey_description": "Auto-generated for key 0c6590b0-9625-4bf3-951b-389a7a5507f2",
  "iam_apikey_name": "cloud-object-storage-74-cos-standard-f10",
  "iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Writer",
  "iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/ccfb479a77e943759e482fa2198a5775::serviceid:ServiceId-0862d7bc-d250-4f89-a0bd-5dacd01dee4c",
  "resource_instance_id": "crn:v1:bluemix:public:cloud-object-storage:global:a/ccfb479a77e943759e482fa2198a5775:7c098bc1-1550-469d-b10d-e801e8b45412::"
}

# Constants for IBM COS values
auth_endpoint = 'https://iam.cloud.ibm.com/identity/token' # Current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints
service_endpoint = 'https://s3.private.eu-de.cloud-object-storage.appdomain.cloud'

Step 6. Client operations

A client provides a low-level interface to the COS S3 API. This allows for processing HTTP responses directly, rather than making use of abstracted methods and attributes provided by a resource to access the information contained in headers or XML response payloads.

# Create client
cos = ibm_boto3.client('s3',
                       ibm_api_key_id=cos_credentials['apikey'],
                       ibm_service_instance_id=cos_credentials['resource_instance_id'],
                       ibm_auth_endpoint=auth_endpoint,
                       config=Config(signature_version='oauth'),
                       endpoint_url=service_endpoint)
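For comparison, the same credentials also work with the higher-level resource interface mentioned above; a minimal sketch (cos_resource is an illustrative name, and the rest of this tutorial keeps using the client):

# Create a resource: an object-oriented, higher-level interface to COS
cos_resource = ibm_boto3.resource('s3',
                                  ibm_api_key_id=cos_credentials['apikey'],
                                  ibm_service_instance_id=cos_credentials['resource_instance_id'],
                                  ibm_auth_endpoint=auth_endpoint,
                                  config=Config(signature_version='oauth'),
                                  endpoint_url=service_endpoint)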

You can check all the possible operations available on the IBM COS S3 client by typing

dir(cos)

and you get

['_PY_TO_OP_NAME',
 '__class__',
 '__delattr__',
 .
 .
 .
 'abort_multipart_upload',
 'add_legal_hold',
 'can_paginate',
 'complete_multipart_upload',
 'copy',
 'copy_object',
 'create_bucket',
 'create_multipart_upload',
 'delete_bucket',
 'delete_bucket_analytics_configuration',
 'delete_bucket_cors',
 'delete_bucket_inventory_configuration',
 'delete_bucket_lifecycle',
 'delete_bucket_metrics_configuration',
 'delete_bucket_policy',
 'delete_bucket_replication',
 'delete_bucket_tagging',
 'delete_bucket_website',
 'delete_legal_hold',
 'delete_object',
 'delete_object_tagging',
 'delete_objects',
 'delete_public_access_block',
 'download_file',
 'download_fileobj',
 'exceptions',
 'extend_object_retention',
 'generate_presigned_post',
 'generate_presigned_url',
 'get_bucket_accelerate_configuration',
 'get_bucket_acl',
 'get_bucket_analytics_configuration',
 'get_bucket_aspera',
 'get_bucket_cors',
 'get_bucket_inventory_configuration',
 'get_bucket_lifecycle_configuration',
 'get_bucket_location',
 'get_bucket_metrics_configuration',
 'get_bucket_protection_configuration',
 'get_bucket_replication',
 'get_bucket_tagging',
 'get_bucket_versioning',
 'get_bucket_website',
 'get_object',
 'get_object_acl',
 'get_object_tagging',
 'get_object_torrent',
 'get_paginator',
 'get_public_access_block',
 'get_waiter',
 'head_bucket',
 'head_object',
 'list_bucket_analytics_configurations',
 'list_bucket_inventory_configurations',
 'list_bucket_metrics_configurations',
 'list_buckets',
 'list_buckets_extended',
 'list_legal_holds',
 'list_multipart_uploads',
 'list_object_versions',
 'list_objects',
 'list_objects_v2',
 'list_parts',
 'meta',
 'put_bucket_accelerate_configuration',
 'put_bucket_acl',
 'put_bucket_analytics_configuration',
 'put_bucket_cors',
 'put_bucket_inventory_configuration',
 'put_bucket_lifecycle_configuration',
 'put_bucket_metrics_configuration',
 'put_bucket_protection_configuration',
 'put_bucket_replication',
 'put_bucket_tagging',
 'put_bucket_versioning',
 'put_bucket_website',
 'put_object',
 'put_object_acl',
 'put_object_tagging',
 'put_public_access_block',
 'restore_object',
 'upload_file',
 'upload_fileobj',
 'upload_part',
 'upload_part_copy',
 'waiter_names']

List current buckets

cos.list_buckets()

or simply

for bucket in cos.list_buckets()['Buckets']:
    print(bucket['Name'])
apachesparkfundamentals-donotdelete-pr-krj4dovovltwtz
jupyternotebookproject-donotdelete-pr-ijhhmanwzaepps
mypersonalbucket1
ruslanmv-bucket
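In the same way you can list the objects inside a single bucket; a minimal sketch using the standard list_objects_v2 call (assuming the ruslanmv-bucket created earlier):

# List up to 1000 objects stored in a bucket, with their sizes in bytes
response = cos.list_objects_v2(Bucket='ruslanmv-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])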

Creating a sample DataFrame

# Import pandas library
import pandas as pd

# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])

# print dataframe.
df
   Name  Age
0   tom   10
1  nick   15
2  juli   14
df.to_csv('out.csv')
!pwd
/home/wsuser/work
!ls
out.csv
#importing the os module
import os

#to get the current working directory
directory = os.getcwd()
# Join various path components
file_name=os.path.join(directory, "out.csv")
print(file_name)
/home/wsuser/work/out.csv

File Uploads

import logging

def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket

    :param file_name: File to upload
    :param bucket: Bucket to upload to
    :param object_name: S3 object name. If not specified then file_name is used
    :return: True if file was uploaded, else False
    """

    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = os.path.basename(file_name)

    # Upload the file
    s3_client = cos
    try:
        response = s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True

If you want to upload your file to a new bucket that you have not created yet, you can create one by specifying a unique name for it:

cos.create_bucket(Bucket='ruslanmv-bucket')

and then you can upload your file

upload_file(file_name,'ruslanmv-bucket')
True
# Upload file out.csv from /home/wsuser/work/ folder into project bucket as out.csv
cos.upload_file(Filename=file_name,Bucket='ruslanmv-bucket',Key='out.csv')
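You can also upload straight from memory without writing a local file first; a sketch using upload_fileobj and the io module imported in Step 5 (the object key out_in_memory.csv is just an example):

# Serialize the DataFrame into an in-memory buffer and upload it as an object
csv_buffer = io.BytesIO(df.to_csv(index=False).encode('utf-8'))
cos.upload_fileobj(csv_buffer, 'ruslanmv-bucket', 'out_in_memory.csv')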

File Downloads

cos.download_file(Bucket='ruslanmv-bucket',Key='out.csv',Filename='downloaded_out.csv')
!ls
downloaded_out.csv  out2.csv  out.csv
# download file like object 
with open('object_out.csv', 'wb') as data:
    cos.download_fileobj('ruslanmv-bucket', 'out.csv', data)
!ls -ltr
total 12
-rw-rw---- 1 wsuser wscommon 39 Feb 26 15:24 out.csv
-rw-rw---- 1 wsuser wscommon 39 Feb 26 15:24 downloaded_out.csv
-rw-rw---- 1 wsuser wscommon 39 Feb 26 15:37 object_out.csv
from ibm_botocore.client import Config
import ibm_boto3

def download_file_cos(local_file_name,bucket,key):  
    s3_client = cos
    try:
        res=s3_client.download_file(Bucket=bucket,Key=key,Filename=local_file_name)
    except Exception as e:
        print(Exception, e)
    else:
        print('File Downloaded')
download_file_cos('downloaded_out2.csv','ruslanmv-bucket','out.csv')
File Downloaded
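Likewise, a download does not have to touch the local disk: you can read an object straight into pandas. A minimal sketch with get_object, which is what the io import from Step 5 is useful for:

# Fetch the object and parse its body directly into a DataFrame
obj = cos.get_object(Bucket='ruslanmv-bucket', Key='out.csv')
df_from_cos = pd.read_csv(io.BytesIO(obj['Body'].read()))
print(df_from_cos.head())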

Create Buckets

cos.create_bucket(Bucket='ruslanmv-bucket1-test')
{'ResponseMetadata': {'RequestId': '2d47234e-4165-4801-a813-e3c45efe9c15',
  'HostId': '',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sun, 26 Feb 2023 15:48:40 GMT',
   'x-clv-request-id': '2d47234e-4165-4801-a813-e3c45efe9c15',
   'server': 'Cleversafe',
   'x-clv-s3-version': '2.5',
   'x-amz-request-id': '2d47234e-4165-4801-a813-e3c45efe9c15',
   'content-length': '0'},
  'RetryAttempts': 0}}
for bucket in cos.list_buckets()['Buckets']:
    print(bucket['Name'])
apachesparkfundamentals-donotdelete-pr-krj4dovovltwtz
jupyternotebookproject-donotdelete-pr-ijhhmanwzaepps
mypersonalbucket1
ruslanmv-bucket
ruslanmv-bucket1-test

Delete Buckets

cos.delete_bucket(Bucket='ruslanmv-bucket1-test')
{'ResponseMetadata': {'RequestId': '0015434a-b64e-44ec-9c3b-9cb5f275e2a4',
  'HostId': '',
  'HTTPStatusCode': 204,
  'HTTPHeaders': {'date': 'Sun, 26 Feb 2023 15:50:31 GMT',
   'x-clv-request-id': '0015434a-b64e-44ec-9c3b-9cb5f275e2a4',
   'server': 'Cleversafe',
   'x-clv-s3-version': '2.5',
   'x-amz-request-id': '0015434a-b64e-44ec-9c3b-9cb5f275e2a4'},
  'RetryAttempts': 0}}
for bucket in cos.list_buckets()['Buckets']:
    print(bucket['Name'])
apachesparkfundamentals-donotdelete-pr-krj4dovovltwtz
jupyternotebookproject-donotdelete-pr-ijhhmanwzaepps
mypersonalbucket1
ruslanmv-bucket
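Keep in mind that delete_bucket only succeeds on an empty bucket (the test bucket above happened to be empty). If a bucket still contains objects, delete them first; a minimal sketch:

# Remove every object in the bucket, then delete the bucket itself
response = cos.list_objects_v2(Bucket='ruslanmv-bucket1-test')
for obj in response.get('Contents', []):
    cos.delete_object(Bucket='ruslanmv-bucket1-test', Key=obj['Key'])
cos.delete_bucket(Bucket='ruslanmv-bucket1-test')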

Additionally, you can inspect the contents of your uploaded files yourself in the IBM Cloud console

image-20230226202152307

For more commands, you can visit the IBM COS S3 documentation here

Congratulations! We have practiced how to upload and download files using IBM Cloud.
