Giter VIP home page Giter VIP logo

pandas_redshift's Introduction

pandas_redshift

This package is designed to make it easier to get data from redshift into a pandas DataFrame and vice versa. The pandas_redshift package only supports python3.

Installation

pip install pandas-redshift

Example

import pandas_redshift as pr

Connect to redshift. If port is not supplied it will be set to amazon default 5439

pr.connect_to_redshift(dbname = <dbname>,
                        host = <host>,
                        port = <port>,
                        user = <user>,
                        password = <password>)

Query redshift and return a pandas DataFrame.

data = pr.redshift_to_pandas('select * from gawronski.nba_shots_log')

data.ix[0:5,0:5]

GAME_ID1                   MATCHUP LOCATION  W  FINAL_MARGIN
0  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24
1  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24
2  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24
3  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24
4  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24
5  21400899  MAR 4, 2015 - CHA @ BKN        A  W            24

Write a pandas DataFrame to redshift. Requires access to an S3 bucket and previously running pr.connect_to_redshift.

If the table currently exists IT WILL BE DROPPED and then the pandas DataFrame will be put in it's place.

# Connect to S3
pr.connect_to_s3(aws_access_key_id = <aws_access_key_id>,
                aws_secret_access_key = <aws_secret_access_key>,
                bucket = <bucket>,
                subdirectory = <subdirectory>)

# Write the DataFrame to S3 and then to redshift
pr.pandas_to_redshift(data_frame = data,
                        redshift_table_name = 'gawronski.nba_shots_log')

Other options:

pr.pandas_to_redshift(data_frame,
                        redshift_table_name,
                        # Defaults:
                        column_data_types = None, # A list of column data types. If not supplied all columns will default to varchar(256)
                        index = False,
                        save_local = False, # If set to True a csv from the data frame will save in the current directory
                        delimiter = ',',
                        quotechar = '"',
                        dateformat = 'auto',
                        timeformat = 'auto')

Redshift data types: http://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html

Finally close the cursor, commit and close the database connection, and remove variables from the environment.

pr.close_up_shop()

pandas_redshift's People

Contributors

agawronski avatar

Watchers

Pankesh Bamotra avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.