For the old version of this lab go here.
NASA wants to go to Mars! Before they build their rocket, NASA needs to track information about all of the planets in the Solar System. Use SQL to help NASA create, alter, and insert data into a database that stores all of this important information. Then practice querying the database with various SELECT
statements. We will select different columns, and employ other SQL clauses like WHERE
to return the data we would like.
You will be able to:
- Retrieve all the information from a table
- Retrieve a subset of records from a table using a
WHERE
clause - Retrieve a subset of columns from a table
To start, let's create a database using sqlite3. To do this, import the package and create a connection as we did in the previous lecture. This method connects to a database if it exists, or creates a new one if none exists. In this case, we'll create a new database called planets.db.
# Import sqlite3, then create the SQL database/connect to it (creating the database connection will create a DB)
Now create a cursor object so that you can execute statements through your connection.
#Your code here; create a cursor object so you can execute statement against the database.
Write the necessary SQL to create a table using the CREATE TABLE
command. Call the table planets
.
Remember: your create table statement should be formatted like the following:
CREATE TABLE table_name (
# column_names and data types here
);
NASA is interested in comparing each planet across several characteristics. They want to know each planet's name, color, number of moons, and mass (relative to earth). Your columns should be the following types:
column | type |
---|---|
id | integer |
name | text |
color | text |
num_of_moons | integer |
mass | real |
Notes: Make sure to set the
id
column as the table's primary key.
#Your code here; create the table as described above.
NASA notices that several of the planets have rings around them. However, we do not have a column to keep track of this information. Use the ALTER TABLE
statement to add a column called rings
with a data type of boolean
to the planets
table. Write the code below to use sqlite3 to execute that SQL query against your database.
# Your code for reading and executing alter.sql
Populate the table with data for the nine planets that constitute the Solar System using the INSERT INTO
command. The relevant information is provided in the table below:
name | color | num_of_moons | mass | rings |
---|---|---|---|---|
Mercury | gray | 0 | 0.55 | no |
Venus | yellow | 0 | 0.82 | no |
Earth | blue | 1 | 1.00 | no |
Mars | red | 2 | 0.11 | no |
Jupiter | orange | 53 | 317.90 | no |
Saturn | hazel | 62 | 95.19 | yes |
Uranus | light blue | 27 | 14.54 | yes |
Neptune | dark blue | 14 | 17.15 | yes |
Pluto | brown | 5 | 0.003 | no |
Refer to the SQLite3 documentation to remember how to express boolean values in SQLite3.
Hint: to save some tedious typing, feel free to open up this cell and copy and paste some of the text to modify in your insert command below.
#Your code here; add data to the table
NASA has confirmed that Jupiter has another 15 moons! Write an UPDATE
command so that Jupiter has 68 moons instead of 53.
Hint: you probably need to use a
WHERE
statement to accomplish this task.
# Your code to update the table
Wait just a moment! NASA decided that Pluto is no longer a planet. Remove Pluto from the table using the DELETE FROM
command.
# Your code to delete pluto here
We will be querying data from the planets
table we just created. We can see it again below:
name | color | num_of_moons | mass | rings |
---|---|---|---|---|
Mercury | gray | 0 | 0.55 | no |
Venus | yellow | 0 | 0.82 | no |
Earth | blue | 1 | 1.00 | no |
Mars | red | 2 | 0.11 | no |
Jupiter | orange | 67 | 317.90 | no |
Saturn | hazel | 62 | 95.19 | yes |
Uranus | light blue | 27 | 14.54 | yes |
Neptune | dark blue | 14 | 17.15 | yes |
Write SQL queries for each of the statements below. You can also preview the results as a nice pandas DataFrame if you want to see the output like this:
Let's take brief look at nicely displaying the results of the sql queries you are about to write. Specifically, we'll look at how to pipe your sql queries into Pandas so that you can use your standard data processing techniques and build your workflow into pipelines.
import pandas as pd
#Demonstrating running a query and previewing results as pandas DataFrame
results = cur.execute("""select * from planets;""").fetchall()
#Alternatively we could do this in two steps:
#cur.execute("""select * from planets;""")
#results = cur.fetchall()
df = pd.DataFrame(results)
df. columns = [i[0] for i in cur.description]
df.head()
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
id | name | color | num_of_moons | mass | rings | |
---|---|---|---|---|---|---|
0 | 1 | Mercury | gray | 0 | 0.55 | 0 |
1 | 2 | Venus | yellow | 0 | 0.82 | 0 |
2 | 3 | Earth | blue | 1 | 1.00 | 0 |
3 | 4 | Mars | red | 2 | 0.11 | 0 |
4 | 5 | Jupiter | orange | 68 | 317.90 | 0 |
For simplicity, you may wish to make this process a function:
def sql_select_to_df(SQL_COMMAND, cur=cur):
results = cur.execute(SQL_COMMAND).fetchall()
df = pd.DataFrame(results)
df.columns = [i[0] for i in cur.description]
return df
df = sql_select_to_df("""select * from planets;""")
df.head()
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
id | name | color | num_of_moons | mass | rings | |
---|---|---|---|---|---|---|
0 | 1 | Mercury | gray | 0 | 0.55 | 0 |
1 | 2 | Venus | yellow | 0 | 0.82 | 0 |
2 | 3 | Earth | blue | 1 | 1.00 | 0 |
3 | 4 | Mars | red | 2 | 0.11 | 0 |
4 | 5 | Jupiter | orange | 68 | 317.90 | 0 |
With that, it's time to practice your sql skills!
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
name | color | |
---|---|---|
0 | Mercury | gray |
1 | Venus | yellow |
2 | Earth | blue |
3 | Mars | red |
4 | Jupiter | orange |
5 | Saturn | hazel |
6 | Uranus | light blue |
7 | Neptune | dark blue |
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
id | name | color | num_of_moons | mass | rings | |
---|---|---|---|---|---|---|
0 | 5 | Jupiter | orange | 68 | 317.90 | 0 |
1 | 6 | Saturn | hazel | 62 | 95.19 | 1 |
2 | 7 | Uranus | light blue | 27 | 14.54 | 1 |
3 | 8 | Neptune | dark blue | 14 | 17.15 | 1 |
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
name | mass | |
---|---|---|
0 | Mercury | 0.55 |
1 | Venus | 0.82 |
2 | Earth | 1.00 |
3 | Mars | 0.11 |
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
name | color | |
---|---|---|
0 | Jupiter | orange |
1 | Saturn | hazel |
2 | Uranus | light blue |
3 | Neptune | dark blue |
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
id | name | color | num_of_moons | mass | rings | |
---|---|---|---|---|---|---|
0 | 3 | Earth | blue | 1 | 1.00 | 0 |
1 | 4 | Mars | red | 2 | 0.11 | 0 |
#Your code here
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
name | color | |
---|---|---|
0 | Earth | blue |
1 | Uranus | light blue |
2 | Neptune | dark blue |
Congratulations! NASA is one step closer to embarking upon its mission to Mars. In this lab, we created a table to track all the planets in the solar system, altered the table to include another column, inserted values to populate the table, and we deleted data from the table. Then we practiced writing select statements that query a single table to get specific information. We also used other clauses and specified column names to cherry pick the data we wanted to retrieve.