

Introduction to CSV and JSON

In the last exercise, we saw how to read the data stored in log files and parse those files to generate reports. Log files and other plain text files are examples of what we call raw text data: text whose file format doesn't specify any particular data structure.

However, most of the time you will work with data that has some structure. Common examples include Excel sheets, databases, JSON, and XML. You then have to process that data to find insights.

For this exercise, let's assume you work as a data analyst at a healthcare startup and you have to analyse Covid-19 research data stored in JSON files and generate CSV files from it.

JSON and CSV are two of the most popular formats for storing and processing data.
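As a rough idea of what "JSON in, CSV out" can look like, here is a minimal sketch using only Python's standard json and csv modules. The records and field names (title, year, authors) are made up for illustration and are not the actual research data; the helper name json_to_csv is also hypothetical.

```python
import csv
import io
import json

# Hypothetical records; the field names are assumptions for illustration only.
JSON_TEXT = """
[
  {"title": "Study A", "year": 2020, "authors": ["Lee", "Khan"]},
  {"title": "Study B", "year": 2021, "authors": ["Osei"]}
]
"""

def json_to_csv(json_text):
    """Convert a JSON array of records into CSV text."""
    records = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["title", "year", "authors"], lineterminator="\n"
    )
    writer.writeheader()
    for rec in records:
        row = dict(rec)
        # A CSV cell holds one value, so join the list of authors into a string.
        row["authors"] = "; ".join(rec["authors"])
        writer.writerow(row)
    return out.getvalue()

csv_text = json_to_csv(JSON_TEXT)
print(csv_text)
```

The nested list of authors has to be squeezed into a single cell here, which is one of the trade-offs of moving from JSON to CSV.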

Learning Objectives

  1. Build on Python skills you learned in the previous unit by expanding your knowledge of writing, reading, and parsing data in formats like JSON and CSV.
  2. Learn how to use Python libraries like Pandas and json.

CSV

Tabular data is data in a spreadsheet-like format, such as what you'd see in Excel or Google Sheets. It has rows and columns. Often, each row is a single observation and each column is a specific variable. The same variables are recorded for each observation (otherwise some rows end up with empty cells). File formats: .csv, .tsv, .xls, .xlsx

CSV stands for "Comma-Separated Values".
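To make the row-and-column idea concrete, the sketch below parses a small CSV string with Python's standard csv module. The column names and values (patient_id, age, test_result) are hypothetical sample data, and read_rows is an illustrative helper, not part of any library.

```python
import csv
import io

# Hypothetical sample data: a header row, then one row per observation.
CSV_TEXT = """patient_id,age,test_result
p001,34,negative
p002,51,positive
p003,,negative
"""

def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

rows = read_rows(CSV_TEXT)
for row in rows:
    # Every value arrives as a string; a missing value is an empty string.
    print(row["patient_id"], row["age"] or "NA", row["test_result"])
```

Note that the third row has no age, which shows up as an empty cell rather than a missing column.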

JSON

Hierarchical data is a data format in which values can be nested within each other. With hierarchical data structures, you can record different information about each observation. You generally want to avoid "flattening" hierarchical data into a tabular data structure because it's often not space efficient. For example, if you have a dataset of food products and clothes products, you probably want to know the expiration date for the food and the clothing size for the clothes. In a hierarchical structure you don't need to specify that you don't know the size for the food or the expiration date for the clothes, but in a tabular data structure you would. As a result, you'd end up with a lot of NA cells that aren't very informative and waste space. File formats: .json, .xml
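The food and clothes example can be sketched in a few lines of Python. The product records below are invented for illustration, and flatten is a hypothetical helper written for this sketch; it simply gives every record the same columns and fills the gaps with "NA".

```python
import json

# Hypothetical product records following the food/clothes example above.
products = [
    {"name": "milk", "type": "food", "expiration_date": "2024-01-31"},
    {"name": "t-shirt", "type": "clothes", "size": "M"},
]

def flatten(records, missing="NA"):
    """Give every record the same columns, filling gaps with a placeholder."""
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)
    rows = [[rec.get(col, missing) for col in columns] for rec in records]
    return columns, rows

columns, rows = flatten(products)

# Hierarchical view: each record lists only the fields it actually has.
print(json.dumps(products, indent=2))

# Tabular view: every row has every column, so unknown fields become "NA".
print(columns)
for row in rows:
    print(row)
```

With only two records, two of the eight cells in the table are already "NA"; the share tends to grow as more kinds of products with their own fields are added.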
