Giter VIP home page Giter VIP logo

bytesofintelligences / ibm-project-03-analyzing-spreadsheet-data-with-python Goto Github PK

View Code? Open in Web Editor NEW
3.0 2.0 1.0 1.43 MB

Spreadsheets are computer programs that allow users to enter, view, and change information in a gridlike format. One of the most useful programs on a PC is a spreadsheet, which allows users to organize data in tabular form. A spreadsheet's primary purpose is to store numerical data and sometimes a few words of text.

Home Page: https://cognitiveclass.ai/courses/course-v1:IBM+GPXX0AYQEN+v1

License: MIT License

Jupyter Notebook 100.00%
data-analysis data-analysis-python data-analyst data-science excel python spreadsheet

ibm-project-03-analyzing-spreadsheet-data-with-python's Introduction


IBM Project 03 Analyzing Spreadsheet Data with Python

Description

About Spreadsheets are computer programs that allow users to enter, view, and change information in a gridlike format. One of the most useful programs on a PC is a spreadsheet, which allows users to organize data in tabular form. A spreadsheet's primary purpose is to store numerical data and sometimes a few words of text.

Features

Analyzing Spreadsheet Data with Python

Estimated time needed: 40 minutes

We've all been there: sitting at the computer, a cup of coffee in hand, ready to begin the day by tackling a spreadsheet. But once we open it, all we see are numbers upon numbers staring back at us. It can be a little intimidating tackling these numbers, especially with larger spreadsheets, but not to worry! With a few tricks up its sleeve, Python is here to give us a hand!

Interested in learning more about this? Well, you're at the right place!

In this project, we'll be giving you an introduction to the tools that you can use to load spreadsheet data into Python, manipulate it, and write it back to the original file.

What You'll Need

You should be familiar with basic concepts of computer programming and the syntaxes of Python programming language. This refers to the concept of list, dictionary, function, class, instance variable, method, etc.

What You'll Learn

After completing this guided project, you will be able to:

  1. Understand the definition of a dataset.
  2. Understand the concept of Cell and Sheets in spreadsheet.
  3. Load a sheet in an xlsx or xls file to Python.
  4. Perform basic spreadsheet operations such as subsetting, filtering, calculating column mean, median, max and mean, etc.
  5. Write to a sheet in an xlsx or xls file from Python.

Exercise 1: Let's Define Data!

So, let's talk about data. We've been throwing this term around a lot: analyzing data, manipulating data and writing data. What exactly is this "data?

Data is defined as "a collection of numbers, characters, images or other items that provide information of something" (SDM, 3rd CE, Deveaux et al.). In statistical sciences, we store data in files that records cases and variables:

image

As shown in the table above, each entry is called a case, and each case includes value(s) for variable(s). Cases are often deleted due to missing data.

Exercise 2: Explore a Spreadsheet!

Now that we know the definition of data, we can dive into a spreadsheet!

A spreadsheet is a computer application for organization, analysis, and storage of data in tabular form. xlsx and xls are the two most popular formats for spreadsheet files. They can be created, opened, and saved using spreadsheet applications such as Microsoft Office Excel and Google Sheets.

image

import requests
import os

if not os.path.isfile("./Airport_Data.xlsx"):
    r = requests.get("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/analysing-spreadsheet-data-with-python/Airport_Data.xlsx")
    f = open("./Airport_Data.xlsx", mode = "wb+")
    f.write(r.content)
    f.close()

To load the spreadsheet into Python, we can use pandas's read_excel() function. When calling this function, you'll need to specify:

  • The path to the xls or xlsx file, and
  • The name of the sheet in which you wish to load into Python.

Both will be passed in format of the string.

import pandas as pd

df_facilities = pd.read_excel("./Airport_Data.xlsx", sheet_name="Facilities")

Author(s)

Weiqing Wang

Weiqing is a Data Scientist intern at IBM Canada Ltd. Weiqing holds an Honours Bachelor of Science from the University of Toronto with two specialist degrees, respectively in computer science and statistical sciences. He is presently working towards a graduate degree in computer science at the University of Toronto.

Other contributors

Kathy An, Yasmine Hemmati

Change Log

Date (YYYY-MM-DD) Version Changed By Change Description
2021-08-10 1.0 Weiqing Wang Initial Version Created.

© IBM Corporation 2021. All rights reserved.

ibm-project-03-analyzing-spreadsheet-data-with-python's People

Contributors

bytesofintelligences avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.