MongoDB Charts
Set Up the Datasets
Download the Airbnb dataset, unzip it, and import it into your MongoDB Atlas cluster. See the command line tools documentation for directions on how to use mongoimport with MongoDB Atlas. The included dataset is sized to fit into a free M0 cluster on MongoDB Atlas.
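As a rough illustration of the import step, the sketch below assembles a mongoimport invocation in Python. It assumes the unzipped dataset is a JSON file, which the text does not confirm. The connection string and file path are placeholders to replace with your own values, and build_mongoimport_command is a hypothetical helper name. The database and collection names are the ones used later in this walkthrough.

```python
import shlex


def build_mongoimport_command(uri, collection, file_path):
    """Assemble a mongoimport invocation as an argument list (hypothetical helper)."""
    return [
        "mongoimport",
        "--uri", uri,                # Atlas connection string, including the database name
        "--collection", collection,  # target collection
        "--file", file_path,         # path to the unzipped dataset file
    ]


# Placeholder values: substitute your own credentials, cluster host, and file name.
command = build_mongoimport_command(
    "mongodb+srv://<user>:<password>@<cluster-host>/airbnb",
    "sanFranciscoListingsAndReviews",
    "<path-to-unzipped-dataset>.json",
)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(command))
```

If the file turns out to be in a different format, follow the command line tools directions for the matching options instead.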
Analysing Airbnb Data
Add a Data Source
- With a dataset in place, you'll want to enable Charts for your project.
- Select the Charts link on the left-hand side, then select the Activate MongoDB Charts button.
- Add a new data source to your project, choosing the cluster, database, and collection. For this example, I'll use the sanFranciscoListingsAndReviews collection and accept the default permissions.
Create a Dashboard
- Go to the Dashboards tab
- Click New Dashboard
- Enter a name and description
Add Some Charts
With the dashboard created, you can now add charts to it.
Multi-Series Stacked Bar Chart
Let's locate the neighborhoods in San Francisco that have the most Airbnb properties and split them out by property type.
- Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
- For the Chart Type, select Bar / Stacked.
- Use the following in the Encoding sections for the chart data:
  - X Axis: _id, Count aggregation
  - Y Axis: address.suburb
    - Sort By: Aggregated Value, Descending
    - Limit: 30
  - Series: property_type
- Add a name to the chart, like Properties by Location
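To make the encodings above more concrete, here is a minimal Python sketch of the grouping they describe: count listings per address.suburb, split the count by property_type, sort suburbs by total count, and keep the top 30. The sample documents and the function name are hypothetical, and ranking by the total across all series is an assumption about how the sort applies to a stacked chart.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents carrying only the fields the chart uses.
listings = [
    {"_id": 1, "address": {"suburb": "Mission District"}, "property_type": "Apartment"},
    {"_id": 2, "address": {"suburb": "Mission District"}, "property_type": "House"},
    {"_id": 3, "address": {"suburb": "Mission District"}, "property_type": "Apartment"},
    {"_id": 4, "address": {"suburb": "Noe Valley"}, "property_type": "House"},
]


def properties_by_location(docs, limit=30):
    """Count listings per suburb, split by property type, largest suburbs first."""
    per_suburb = defaultdict(Counter)
    for doc in docs:
        per_suburb[doc["address"]["suburb"]][doc["property_type"]] += 1
    # Sort by the aggregated value (total count per suburb), descending, then limit.
    ranked = sorted(
        per_suburb.items(),
        key=lambda item: sum(item[1].values()),
        reverse=True,
    )
    return [(suburb, dict(series)) for suburb, series in ranked[:limit]]


result = properties_by_location(listings)
for suburb, series in result:
    print(suburb, series)
```

Each tuple corresponds to one bar, and each entry in the inner dictionary corresponds to one stacked segment of that bar.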
Your chart should look something like:
Show the areas with the most expensive properties, and correlate with the review scores.
Data Source: Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
Chart Title: Price by Location, Colored by Location Review Score
Chart Type: Bar / Colored
Encodings:
- X Axis: price, Mean aggregation
- Y Axis: address.suburb
  - Sort By: Aggregated Value, Descending
  - Limit: 30
- Color: review_scores.review_scores_location, Mean aggregation
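The same encodings can be read as a small aggregation: average price per suburb, average location review score per suburb, sorted by average price. The sketch below works on hypothetical in-memory documents with plain floats for price. How Charts treats listings with no location score is not stated here, so skipping them when averaging is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample documents; one listing has no location review score.
listings = [
    {"address": {"suburb": "Pacific Heights"}, "price": 300.0,
     "review_scores": {"review_scores_location": 10}},
    {"address": {"suburb": "Pacific Heights"}, "price": 500.0,
     "review_scores": {"review_scores_location": 9}},
    {"address": {"suburb": "Mission District"}, "price": 100.0,
     "review_scores": {"review_scores_location": 8}},
    {"address": {"suburb": "Mission District"}, "price": 200.0,
     "review_scores": {}},
]


def price_by_location(docs, limit=30):
    """Mean price and mean location score per suburb, most expensive first."""
    prices = defaultdict(list)
    scores = defaultdict(list)
    for doc in docs:
        suburb = doc["address"]["suburb"]
        prices[suburb].append(doc["price"])
        score = doc.get("review_scores", {}).get("review_scores_location")
        if score is not None:  # assumption: missing scores are left out of the mean
            scores[suburb].append(score)
    rows = [
        {
            "suburb": suburb,
            "mean_price": mean(values),
            "mean_location_score": mean(scores[suburb]) if scores[suburb] else None,
        }
        for suburb, values in prices.items()
    ]
    rows.sort(key=lambda row: row["mean_price"], reverse=True)
    return rows[:limit]


rows = price_by_location(listings)
for row in rows:
    print(row)
```

The mean_price value drives the bar length and the mean_location_score value drives the bar color.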
Your chart should look something like:
Show the number of properties with various review scores
Data Source: Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
Chart Title: Review Score Histogram
Chart Type: Column / Grouped
Encodings:
- X Axis: review_scores.review_scores_rating
  - Binning: On, bin size 5
- Y Axis: _id, Count aggregation
- Filter (to hide properties with no review score): { 'review_scores.review_scores_rating': { $gte: 0 }}
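A histogram with a bin size of 5 can be sketched in plain Python as follows. The sample documents are hypothetical, and aligning each bin to a multiple of the bin size is an assumption about how the bins are placed. Listings without a rating are skipped, which is the effect the filter is meant to have.

```python
from collections import Counter

# Hypothetical sample documents; the last one has no rating at all.
listings = [
    {"_id": 1, "review_scores": {"review_scores_rating": 98}},
    {"_id": 2, "review_scores": {"review_scores_rating": 96}},
    {"_id": 3, "review_scores": {"review_scores_rating": 87}},
    {"_id": 4, "review_scores": {}},
]


def review_score_histogram(docs, bin_size=5):
    """Count listings per rating bin, keyed by the lower edge of each bin."""
    counts = Counter()
    for doc in docs:
        rating = doc.get("review_scores", {}).get("review_scores_rating")
        if rating is None or rating < 0:
            continue  # mirrors the filter: keep only ratings >= 0
        bin_start = (rating // bin_size) * bin_size
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


histogram = review_score_histogram(listings)
print(histogram)
```

Here the ratings 96 and 98 land in the bin starting at 95, and 87 lands in the bin starting at 85.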
Your chart should look something like:
Show the types of properties that contain pools.
Data Source: Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
Chart Title: Properties with Pools
Chart Type: Circular / Donut
Encodings:
- Label: property_type
- Arc: _id, Count aggregation
- Filter (to show only properties with pools): { amenities: 'Pool' }
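Because amenities is an array, the filter { amenities: 'Pool' } matches any listing whose array contains the value 'Pool'. A minimal sketch of the resulting counts, using hypothetical sample documents and a hypothetical function name:

```python
from collections import Counter

# Hypothetical sample documents with an amenities array.
listings = [
    {"_id": 1, "property_type": "House", "amenities": ["Wifi", "Pool"]},
    {"_id": 2, "property_type": "Apartment", "amenities": ["Wifi"]},
    {"_id": 3, "property_type": "House", "amenities": ["Pool", "Kitchen"]},
    {"_id": 4, "property_type": "Condominium", "amenities": ["Pool"]},
]


def properties_with_pools(docs):
    """Count listings per property type, keeping only those that list a pool."""
    counts = Counter(
        doc["property_type"]
        for doc in docs
        if "Pool" in doc.get("amenities", [])  # array-contains match
    )
    return dict(counts)


pool_counts = properties_with_pools(listings)
print(pool_counts)
```

Each key becomes a labelled arc of the donut, sized by its count.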
Your chart should look something like:
Show the number of reviews of any property, month by month
Data Source: Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
Chart Title: Reviews Over Time
Chart Type: Area / Discrete
Encodings:
- X Axis: reviews.date
  - Array Reduction: Unwind Array
  - Binning: On, bin by month
- Y Axis: _id, Count aggregation
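Unwinding the reviews array produces one row per review rather than one per listing, and those rows are then grouped by month. A sketch under the assumption that each review carries a date value, using hypothetical sample documents:

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample documents; each listing embeds an array of reviews.
listings = [
    {"_id": 1, "reviews": [
        {"date": datetime(2018, 1, 5)},
        {"date": datetime(2018, 1, 20)},
        {"date": datetime(2018, 2, 3)},
    ]},
    {"_id": 2, "reviews": [{"date": datetime(2018, 2, 14)}]},
    {"_id": 3, "reviews": []},
]


def reviews_per_month(docs):
    """Count reviews across all listings, binned by calendar month."""
    counts = Counter()
    for doc in docs:
        for review in doc.get("reviews", []):  # unwind: one row per review
            counts[review["date"].strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


monthly = reviews_per_month(listings)
print(monthly)
```

A listing with no reviews contributes nothing to the counts, so it does not appear in the chart.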
Your chart should look something like:
Show the number of properties with and without pools by area
Data Source: Select the San Francisco Airbnb (airbnb.sanFranciscoListingsAndReviews) dataset as the data source.
Chart Title: Pool or No Pool
Chart Type: Column / Stacked
Encodings:
- X Axis: address.suburb
- Y Axis: _id, Count aggregation
- Series: amenities
  - Array Reduction: Existence of value: "Pool"
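The existence reduction turns the amenities array into a single true or false value per listing, which then acts as the series. A minimal sketch with hypothetical sample documents; the output keys pool and no_pool are illustrative names, not labels produced by Charts:

```python
from collections import Counter, defaultdict

# Hypothetical sample documents.
listings = [
    {"_id": 1, "address": {"suburb": "Mission District"}, "amenities": ["Pool"]},
    {"_id": 2, "address": {"suburb": "Mission District"}, "amenities": ["Wifi"]},
    {"_id": 3, "address": {"suburb": "Mission District"}, "amenities": ["Wifi", "Pool"]},
    {"_id": 4, "address": {"suburb": "Noe Valley"}, "amenities": []},
]


def pool_or_no_pool(docs):
    """Count listings per suburb, split by whether 'Pool' is among the amenities."""
    per_suburb = defaultdict(Counter)
    for doc in docs:
        has_pool = "Pool" in doc.get("amenities", [])  # reduce the array to a boolean
        per_suburb[doc["address"]["suburb"]][has_pool] += 1
    return {
        suburb: {"pool": counts[True], "no_pool": counts[False]}
        for suburb, counts in per_suburb.items()
    }


by_suburb = pool_or_no_pool(listings)
print(by_suburb)
```

Each suburb becomes one column, stacked from its pool and no_pool counts.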