532_group_22's Issues

Writeup for Section 1

Section 1: Motivation and Purpose
rubric={reasoning:8,writing:2}

In a few sentences, provide motivation for why you are creating a dashboard. Who is your target audience, and what role are you embodying? What problem could your dashboard solve for the intended user? You can read the Project background section for some rough ideas. Be brief and clear.

Example writeup:

Our role: Data scientist consultancy firm

Target audience: Health care administrators

Missed medical appointments cost the healthcare system a lot of money and affect the quality of care. If we could understand what factors lead to missed appointments, it may be possible to reduce their frequency. To address this challenge, we propose building a data visualization app that allows health care administrators to visually explore a dataset of missed appointments to identify common factors. Our app will show the distribution of factors contributing to appointment show/no show and allow users to explore different aspects of this data by filtering and re-ordering on different variables in order to compare factors that contribute to absence.

Document your functions' functionality (Optional)

You have all already written good docstrings for your functions, right??? Well then, congrats! Your good habits have been rewarded with free points in this lab. If not, this is your chance to remedy the situation. Write proper docstrings for all functions, including a description of what the function parameters do, as you have learnt in previous courses. Clear comments where needed in the code are also a plus.
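
As a reminder of what a proper docstring might look like, here is a minimal sketch in the NumPy docstring style. The function name, its parameters, and the row layout are hypothetical and not taken from this repo.

```python
def filter_by_year(records, year):
    """Return the records reported in a given year.

    Parameters
    ----------
    records : list of dict
        Rows of the dataset; each row must have a "year" key.
    year : int
        The year to keep.

    Returns
    -------
    list of dict
        The rows whose "year" equals `year`, in their original order.
    """
    # Keep only the rows that match the requested year.
    return [row for row in records if row["year"] == year]
```

The summary line says what the function does, and each parameter gets a type and a one-line description, which is the part the task asks for.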

Missing dependency - alt.data_transformers

The line alt.data_transformers.enable("data_server") in tab1.py causes an error when I try to run it with the current environment file. I'm not sure what the dependency is called, but it needs to be added to the yaml file if we keep that line in.
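
A possible lead: as far as I know, the "data_server" transformer is provided by a separate package called altair_data_server, so adding that to the yaml file may be the fix. I have not verified this against our environment. The sketch below uses only the standard library to check whether that package is importable; the helper name is hypothetical.

```python
from importlib.util import find_spec


def data_server_available():
    """Return True if the altair_data_server package can be imported."""
    # find_spec returns None when a top-level package is not installed.
    return find_spec("altair_data_server") is not None


# Possible use in tab1.py (assumes `import altair as alt` above):
# if data_server_available():
#     alt.data_transformers.enable("data_server")
```

Guarding the call this way would let tab1.py run with the default transformer when the package is missing, instead of raising an error.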

Deployment on Heroku

rubric={accuracy:3}

Deploy your app on Heroku and include the link to your deployed dashboard clearly visible near the top of your README.
Don't push to Heroku after the milestone deadline.
We will compare the milestone release commits with the deployed app, so updating it after the deadline will incur a late penalty. If you want your newest changes deployed online, you can create a new Heroku repo.
This week, you're also going to set up Heroku's GitHub integration to automate your deploys, so that you have a branch in your GitHub repo that is automatically deployed to Heroku.
Create a new branch for this on GitHub that you name deployment.
Don't wait to deploy until Saturday night; you will not have time to solve potential issues.
Deploy early and check that things are working, then redeploy every now and then.
After making the milestone release, make a final push to Heroku to redeploy the milestone app.
Make sure to remove debug=True when you are deploying to Heroku; there should not be a blue debug button on the page your target audience will visit!
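
One way to avoid forgetting the debug flag is to make it opt-in through an environment variable, so it is off unless explicitly switched on locally. This is only a sketch: the variable name APP_DEBUG and the helper debug_enabled are hypothetical, and the commented usage assumes a Dash object named app.

```python
import os


def debug_enabled(environ=None):
    """Return True only when APP_DEBUG is explicitly set to a truthy value."""
    # Fall back to the real process environment when no mapping is passed in.
    environ = os.environ if environ is None else environ
    value = environ.get("APP_DEBUG", "").strip().lower()
    return value in {"1", "true", "yes"}


# In app.py (assumption: `app` is the Dash object):
# if __name__ == "__main__":
#     app.run_server(debug=debug_enabled())
```

With this pattern the default is debug off, so a deployment that never sets APP_DEBUG cannot show the debug button by accident.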

Setup app reviews with Heroku (Optional)

Heroku has a neat functionality where you can set it up to automatically deploy branches when PRs are opened on GitHub. This way you can test the dashboard live while reviewing a PR without downloading your collaborator's branch and running it locally. Set up your repo accordingly and create at least one PR that triggers an auto-deployment. Link this PR in canvas-submission.html.

Update folder structure

GitHub folder structure

Since we now have a mix of many different file types,
let's tidy things up a bit.
Use a project structure similar to what we learnt in 521:

project/
├── data/            .csv .hdf .pkl .feather
│   ├── processed/
│   └── raw/
├── src/             .py .R
├── reports/         .ipynb .Rmd
├── doc/             .md
├── environment.yaml (or/and requirements.txt)
├── README.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
└── LICENSE.md

The difference between the reports and doc folders
is that the former contains analytic reports often involving code
(such as notebooks)
whereas the latter is more project documentation.
So where should you put your project proposal and reflections?
I would suggest the doc folder,
but remember that these are guidelines and not strict rules;
there are other sensible folder structures too.
You can upload any analysis you do along the way to explore the data in the reports folder,
but the analysis itself will not be reviewed by the TAs.

Milestone 2 Feedback

Hi team 22,

  • I found your reflection document very easy to read, and it was organized in a way that made it very easy to understand what you have and haven’t deployed, why or why not, etc. Thank you!
  • Overall, well done getting most of your components working! The dashboard is clean, and not too cluttered.
  • In regards to your question about manually cleaning the y-axis of your Tab 1 bar graph, removing the [number] from each label would be more than enough. I don’t think you have to also abbreviate the provinces, not that I think this would actually be too hard with a set of replace commands. I would instead focus on ordering the data in the bar graph, either from highest to lowest or by province. The current order doesn’t make sense and I think is limiting the amount of useful information that someone can get from looking at it. Having the x-axis at the bottom of the graph outside the immediate view is not ideal. Consider putting it at the top of the graph as well, and even more ideal would be for it to float as you scroll. Since region is not a selection on Tab 1 you could even break the bar chart into multiple plots, one for each region, and then order each from highest to lowest.
  • For tab 2: As I was using this tab, I found it a little frustrating that my location selection was lost when I toggled between the Province and CMA options. Thank you for defining CMA in your reflection for me! Otherwise, well done on this tab, the selections work, and the graphs are well done.
  • Nice touch adding a graph save option. It would be helpful if you made a note to let the user know they can do this. It was hard to notice.

Writeup for Section 2

Section 2: Description of the data
rubric={reasoning:8,writing:2}

You are allowed to select any dataset you want for this project, as long as you have the license to use it publicly. Warning: finding a good data set can take a lot of time and effort. We therefore recommend that you select one that you have worked with in a previous lab in MDS and that you are already familiar with (for example the Gapminder, movie, or language data sets from 531 (all are on OneDrive)).

A few datasets that have been popular in previous years:

https://www.kaggle.com/zynicide/wine-reviews/data
https://www.kaggle.com/osmi/mental-health-in-tech-survey
https://github.com/themarshallproject/city-crime

Good general resources for finding interesting datasets:

https://github.com/fivethirtyeight/data
https://github.com/the-pudding/data
https://www.kaggle.com/datasets

In your proposal, briefly describe the dataset and the variables that you will visualize. If you are planning to visualize a lot of columns, provide a high level descriptor of the variable types rather than listing every single column. For example, indicate that the dataset contains a variety of categorical variables for demographics and provide a brief list rather than describing every single variable. You may also want to consider visualizing a smaller set of variables given the short duration of this project. This might include brief exploratory data analysis for you to grasp what could be interesting aspects to look at in your data. We will not be grading the EDA aspect, but feel free to include your EDA notebooks in the public GitHub repo, so that you have everything in one place.

Example writeup:

We will be visualizing a dataset of approximately 300,000 missed patient appointments. Each appointment has 15 associated variables that describe the patient who made the appointment (patient_id, gender, age), the health status (health_status) of the patient (Hypertension, Diabetes, Alcohol intake, physical disabilities), information about the appointment itself (appointment_id, appointment_date), whether the patient showed up (status), and if a text message was sent to the patient about the appointment (sms_sent). Using this data we will also derive a new variable, which is the predicted probability that a patient will show up for their appointment (prob_show).

Remember, if your dataset has a lot of columns, stick to summaries and avoid listing out every single column. The example also differentiates columns that come with the dataset (e.g. age) from new variables that you might derive for your visualizations (e.g. prob_show); you should make a similar distinction in your write-up if you can. Another example of a good description of a dataset is the Kaggle world happiness report.

Milestone 1 Feedback

Hi Group 22,

  • I enjoyed reading your proposal. It was well written and included the necessary information in appropriate detail.

  • I like how there is an order to how the user would use the dashboard and a reason behind how you set it up. The dashboard concept is clear and well designed.

  • Tab 1: I am worried the text in the bar graph might be too small to read if you show all the data. Subsetting or a scroll bar might help the user look at the data more easily. Where will you put the legend for the choropleth map?

  • Tab 2: will there be a max number of CMAs Sarah can select to compare?

  • Will the abbreviation CMA be explained anywhere in the dashboard?

Update README based on peer feedback

#67
README.md file
As a potential user of your app, I really like the way you wrote the readme file. It has a clear structure and sections are well explained and organized.

  • For the screenshot that you included in the readme file, I think it would be great to include the screenshot of your real app interface instead of the prototype given that it was milestone2.

  • For the convenience of potential contributors who are interested in contributing to your project, it would be great to include the instructions on how they can run the app, and suggestions on what aspects they can contribute to for the project.

Peer feedback from Group21

Hi, Group 22,
Hope you are doing great. This is Frank from group 21. After looking at your proposal and the dashboard you have been working on, I am really interested in this project. Here is my feedback on your milestone2 release.

README.md file

  • As a potential user of your app, I really like the way you wrote the readme file. It has a clear structure and sections are well explained and organized.
  • For the screenshot that you included in the readme file, I think it would be great to include the screenshot of your real app interface instead of the prototype given that it was milestone2.
  • For the convenience of potential contributors who are interested in contributing to your project, it would be great to include the instructions on how they can run the app, and suggestions on what aspects they can contribute to for the project.

proposal.md file

This file looks great to me.

dashboard interface

  • It's great to see a lot of interactivities on all plots on your dashboard interface.
  • However, regarding usability, I think the appearance can be improved. For example,
    • the font of the two tabs could be bigger & bolded.
    • You can also add more white space between filters.
    • would be great if you add a side note to explain what the empty plots mean based on user selections (e.g. no data).

Thank you! Let me know if you have any questions or want to have a further discussion with me.

Update About on repo

Please modify your GitHub repo description in the top right corner where it says "About" to include

  • A short description of your app (might already be there)
  • The link to your deployed dashboard (it is often useful to still keep this in the README as well)
  • A few keywords describing which plots, widgets, and interactions you have used in your dashboards, like I have done in my demo app.

If I have time during the break, I will use these to make a resource where you can easily find each other's dashboards without searching through the public GitHub repos directly. I think this will be useful to reference back to for capstone and later. You can DM me on slack if your group does not want to be part of this for some reason.

Submit Milestone 2 to Canvas

  • Once you have finished the work for this milestone
    you must create a release on GitHub.com before the submission deadline.
  • The only file you need to submit to Canvas is the one called canvas-submission.html.
    • I changed the file ending to HTML so it renders up on canvas.
    • This file is in your github.ubc.ca repo.
    • Submit this file manually on Canvas and only once per group.
    • Make sure to add a link to your milestone release in this file and leave the rest as is (this facilitates grading).

Deployment on Heroku

2. Deployment on Heroku

rubric={accuracy:8}

  • Deploy your app on Heroku
    and include the link to your deployed dashboard clearly visible near the top of your README.
  • Don't push to Heroku after the milestone deadline.
    • We will compare the milestone release commits with the deployed app,
      so updating it after the deadline will incur a late penalty.
      If you want your newest changes online,
      you can create a new Heroku repo.
  • Since your app.py will be inside the src folder,
    you need to change the Procfile to web: gunicorn src.app:server
    instead of what it is in the dash deployment docs.
  • I recommend creating requirements.txt manually
    and pinning only the versions of dash and plotly.
    • Don't forget to include gunicorn.
  • Don't wait to deploy until Saturday night
    after you have implemented every single feature you want.
    You will not have time to solve potential issues.
    • Deploy early and check that things are working,
      then redeploy every now and then,
      especially after adding new package dependencies.
    • After making the milestone2 release,
      make a final push to Heroku to redeploy the milestone2 app.

Design update

  • Add app description on sidebar
  • Peer feedback #67 You can also add more white space between filters.
  • TA feedback #66 Nice touch adding a graph save option. It would be helpful if you made a note to let the user know they can do this. It was hard to notice.
  • Peer feedback #67 Would be great if you add a side note to explain what the empty plots mean based on user selections (e.g. no data).
  • Peer feedback #67 : the font of the two tabs could be bigger & bolded.

Sketch and Description

  1. Description of your app & sketch
    rubric={viz:10}

Building from your research questions and usage scenarios, give a high-level description of the interface for the app you will build. Remember to be realistic about your expectations and plans since you will actually be implementing this app (but again, you will not be penalized if you need to adjust a bit in later milestones). It is better to design a slightly more limited app that you have time to implement well, instead of a complicated app that you don't have time to finish. At the same time, you cannot just make a single barchart and call it a day. The app needs to have a few plot panels; use the visualizations from previous students shown in lecture one as a complexity target for the final app.

In this description you are not required to use terminology specific to Dash apps (i.e. widgets, components, etc...) or make reference to specific Python or R libraries. Your sketch can be hand-drawn or mocked up using a graphics editor. If you can show the app visual design & interaction design in a single image that is ideal, but if you need more space to show some other planned features of your app you can include max three images for this proposal.

The description should be about 200-300 words and live in the README.md file of your GitHub.com repository. The sketch should be linked in the README.md file of your GitHub.com repository underneath the high level description so that the image shows up on GitHub.

Example description

The app contains a landing page that shows the distribution (depending on data type, bar chart, density chart etc.) of dataset factors (hypertension, physical disabilities etc.) color-coded according to whether patients showed up or didn't show up for an appointment. From a dropdown list, users can filter out variables from the distribution display, by patient demographics (e.g. only show female patients), by appointment data (e.g. if SMS was sent), and finally by the date range of appointments. A different dropdown menu will allow users to re-order variables according to the probability of patients being a no-show or in alphabetical order of comorbidities. Users can compare the distribution of comorbidities by scrolling down through the app interface.

Example sketch

[Image: example dashboard sketch]

This sketch was drawn using PowerPoint with icons from the Noun Project. You can use other graphics tools (e.g. Inkscape, GIMP, Photoshop, Illustrator) or you can even draw your app by hand and upload the scanned version of your drawing. Whatever you choose to do, make sure that the final image in your report is legible.

Interactive Dash app in Python and Altair

3. Interactive Dash app in Python and Altair

rubric={accuracy:10, quality:5, viz:15}

  • Implement the dashboard you outlined in your proposal.
  • Keep your usage scenario and target audience in mind when designing your interface.
  • Aim to implement most of your dashboard's functionality, but not everything.
    • Since the complexity varies between proposals,
      the rough goal here is to have around 3 plots
      and most of their widgets
      and interactivity implemented
    • The app should be clearly usable,
      so focus on the most important things first.
    • In the upcoming milestones you will have time to improve your app
      based on your proposal and the feedback you have received.
    • The TAs will give you feedback on how to adjust the overall complexity
      of your final app for milestone 3 and 4 (if needed). For this milestone,
      use the above directions.
  • Your interface should be as self-documenting as possible,
    with appropriate labels for panes and widgets,
    legends documenting the meaning of visual encodings,
    and a meaningful title for the app.
  • Note that TAs will be grading your app on Heroku in a full-screen window

It can be easy to get sucked into a rabbit hole when trying to implement a stubborn feature
(I know this all too well myself =p).
While it is important to build your troubleshooting skills,
it is often even more important to build your time management skills
and we do not want one annoying bug to prevent you from completing your app.
Compromises may need to be made - this is a short project.
You can add the bells and whistles at the later milestones and
if you're struggling with a particularly tough problem,
save it for later and ask a TA for help!

Update repo folder structure

We might want to consider updating the folders. For instance, there are 3 image files for the sketch which should probably be in a folder. Just not sure what to name it :P
If we change the location of the image files we need to remember to update the references in the README and any other document that links to them.

add colors to bar chart by province

I think adding the ability to filter by provinces would be cool. That way the CMA barplot on the right won’t be as long, we would just show CMAs in the provinces selected. It would also help for the choropleth since it will change the scale based on only the province of interest.

This could be implemented as a multi-select dropdown. Let me know in the comments if you like the idea or not!
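
A rough sketch of the filtering step behind such a multi-select dropdown, in plain Python. The row layout and the key names "cma" and "province" are assumptions for illustration, not the columns in our data.

```python
def cmas_in_provinces(rows, selected_provinces):
    """Keep only the CMA rows that belong to one of the selected provinces.

    An empty selection is treated as "no filter" and returns every row.
    """
    if not selected_provinces:
        return list(rows)
    wanted = set(selected_provinces)
    return [row for row in rows if row["province"] in wanted]


# Hypothetical example rows.
rows = [
    {"cma": "Toronto", "province": "Ontario"},
    {"cma": "Ottawa", "province": "Ontario"},
    {"cma": "Vancouver", "province": "British Columbia"},
]
ontario_only = cmas_in_provinces(rows, ["Ontario"])
```

Treating an empty selection as "show everything" is a design choice; the alternative is to show an empty plot until the user picks a province.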

Misc fixes/improvements (includes Joel feedback)

  • Add a Title to the tabs (+ perhaps a short description/explanation)
  • remove the coding numbers that are part of the Geographies and the Violation Descriptions
  • PEI is being recorded as a CMA. For Geography = "Prince Edward Island [11]", we need to change the value of the Province column to "PROVINCE".
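
A minimal sketch of the last two fixes, working on rows as plain dicts rather than on our actual data frame. The "Geography" and "Province" column names and the PEI label come from the item above; the helper names and the assumption that codes always appear as a trailing bracketed suffix are mine.

```python
import re

# Matches a trailing bracketed code such as " [11]".
CODE_SUFFIX = re.compile(r"\s*\[[^\]]*\]\s*$")


def strip_code(label):
    """Remove a trailing bracketed code from a label, if there is one."""
    return CODE_SUFFIX.sub("", label)


def fix_row(row):
    """Return a cleaned copy of one row without modifying the original."""
    fixed = dict(row)
    # Patch the PEI level first, while the original label is still intact.
    if fixed["Geography"] == "Prince Edward Island [11]":
        fixed["Province"] = "PROVINCE"
    fixed["Geography"] = strip_code(fixed["Geography"])
    return fixed
```

The order matters: the PEI check compares against the uncleaned label, so it has to run before the code is stripped.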

Reflection

In this section, your group should document what you have implemented in your dashboard so far and explain what is not yet implemented. It is important that you include what you know is not working in your dashboard, so that your TAs can distinguish between features in development and bugs. Since this is the last milestone, you really need to motivate well why you have chosen not to include any feature that you were previously planning to include.

This week it is suitable to include thoughts on the feedback you received from your peer and/or TA, e.g.

  • Has it been easy to use your app?
  • Are there recurring themes in your feedback on what is good and what can be improved?
  • Is there any feedback (or other insight) that you have found particularly valuable during your dashboard development?

This section should be around 300-500 words and the reflection-milestone4.md document should live in your GitHub.com repo in the doc folder.

Add text to app

I think it would be useful to add some text explaining how to use the app, what some of the key terms in the data mean, and where the data is from.

Reflection Document

4. Reflection

rubric={reasoning:6}

In this section,
your group should document what you have implemented in your dashboard so far
and explain what is not yet implemented.
It is important that you include what you know is not working in your dashboard,
so that your TAs can distinguish between features in development and bugs.

Reflect on what you think your dashboard does well,
what its limitations are,
and what would be good future improvements and additions.
This section should not be more than 500 words
and the reflection-milestone2.md document should live in your GitHub.com repo
in the doc folder.

Improve the README (Optional)

5. Improve the README (Optional)

rubric={reasoning:2}

Expand on the README file to be a welcoming place for anyone coming
to your project for the first time.
For your project,
your README should cater to at least two groups of people
(on bigger projects these can be separated and put in different files):

  1. Those potentially interested in using your dashboard
    • Include motivation behind your project and clearly explain
      what problem you are solving and why it is important.
    • You do not have to include detailed usage instructions,
      just high level what they can do in your dashboard and the deployed link.
    • This is a good example
  2. Those interested in helping you develop your dashboard
    • Potential contributors are interested in the above as well,
      but also need to know how they can install your app and how to run it locally
      (maybe they are great in Altair but have never used Dash).
    • Suggestions for what you would like help with and how to work in your project,
      some of this can go in contributing also.
    • This is an example of a program I made as part of my thesis.

Including a table of contents can be useful,
as well as a short GIF of your dashboard doing something impressive.
No matter how many nice words you put down,
seeing the functionality right when they land on your GH page
is very useful to evoke interest.

Tab 2 Updates

  • Troubleshoot: Plots not showing up
  • Location dropdown: have list of locations remain after switching from Province to CMA and back.
  • Set default plots on Tab 2
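
For the location dropdown item, one possible approach is to remember the last selection per mode and restore it on switching back. The sketch below is plain Python with hypothetical names; in the app the remembered dict could live in something like a dcc.Store, which is an assumption about how we would wire it up.

```python
def switch_mode(new_mode, current_mode, current_selection, remembered):
    """Save the selection of the mode being left and restore the other one.

    Returns a tuple (selection_for_new_mode, updated_remembered).
    """
    # Copy so the caller's dict is not modified in place.
    updated = dict(remembered)
    updated[current_mode] = list(current_selection)
    return updated.get(new_mode, []), updated


# Province -> CMA: nothing stored for CMA yet, so the dropdown starts empty.
cma_selection, memory = switch_mode("CMA", "Province", ["Ontario"], {})
# CMA -> Province: the earlier Province selection comes back.
province_selection, memory = switch_mode("Province", "CMA", ["Toronto"], memory)
```

Returning a new dict instead of mutating the old one fits callback-style code, where state is passed in and out on every update.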

Implement value sorting in bar chart

  • At minimum, sort the bar chart from largest value to smallest value
  • Optional: allow the user to flip the sorting order (largest to smallest, smallest to largest)
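
A small sketch of the sorting logic with a flag for flipping the order. The row layout and the key name "value" are hypothetical; in the actual chart the same effect may be achievable through the sort option of the Altair axis encoding rather than by sorting the data first.

```python
def sort_bars(rows, value_key="value", largest_first=True):
    """Return the rows ordered by their bar value.

    largest_first=True gives largest to smallest; False flips the order.
    """
    return sorted(rows, key=lambda row: row[value_key], reverse=largest_first)


# Hypothetical example rows.
bars = [
    {"name": "A", "value": 2},
    {"name": "B", "value": 5},
    {"name": "C", "value": 1},
]
descending = [row["name"] for row in sort_bars(bars)]
ascending = [row["name"] for row in sort_bars(bars, largest_first=False)]
```

The largest_first flag could be driven directly by a toggle or radio button in the sidebar.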

Tab 2

#26
This milestone

  • 4 Plots
  • CMA vs Province radio button
  • CMA/Province selection (based on radio button response)

Future milestone

  • Move legend to menu on left
  • Metric dropdown

Submit Milestone 4 to Canvas

Once you have finished the work for this milestone you must create a release on GitHub.com before the submission deadline.
Please read the GitHub documentation on how to create a release via the online interface. Name your release with the respective milestone name.
We will grade all files in the repo at the state they were in when you created the release. This means that you can continue to make changes in the repo without worrying about messing up your grading for the previous milestone.
The only file you need to submit to Canvas is the one called canvas-submission.html.
This file is in your github.ubc.ca repo.
Submit this file manually on Canvas and only once per group.
Make sure to add a link to your milestone release in this file and leave the rest as is (this facilitates grading).
Optionally include a link to an auto-deployed Heroku PR (see section 6).

Writeup Section 3

Section 3: Research questions and usage scenarios
rubric={reasoning:12,writing:2}

The purpose of this section is to get you to think about how your target audience might use the app you're designing and to account for those needs in the proposal.

For this it can be helpful to create a brief persona description of a member in your intended target audience and write a small user story for what they might do with your app. User stories are typically written in a narrative style and include the specific context of usage, tasks associated with that use context, and a hypothetical walkthrough of how the user would accomplish those tasks with your app. If you are using a Kaggle dataset, you may use their "Overview (inspiration)" to create your usage scenario.

An example usage scenario with tasks (tasks are indicated in brackets, i.e. [task], and are optional to include)

Mary is a policy maker with the Canadian Ministry of Health and she wants to understand what factors lead to missed appointments in order to devise an intervention that improves attendance numbers. She wants to be able to [explore] a dataset in order to [compare] the effect of different variables on absenteeism and [identify] the most relevant variables around which to frame her intervention policy. When Mary logs on to the "Missed Appointments app", she will see an overview of all the available variables in her dataset, according to the number of people that did or did not show up to their medical appointment. She can filter out variables for head-to-head comparisons, and/or rank patients according to their predicted probability of missing an appointment. When she does so, Mary may notice that "physical disability" appears to be a strong predictor of missing appointments, and in fact patients with a physical disability also have the largest number of missed appointments. She hypothesizes that patients with a physical disability could be having a hard time finding transportation to their appointments, and decides she needs to conduct a follow-on study since transportation information is not captured in her current dataset.

Note that in the above example, "physical disability" being an important variable is fictional. You don't need to conduct an analysis of your data to figure out what is important or not. Instead, estimate what someone might find, and how they may use this information.

Tab 1

#26

This milestone

  • Violation per CMA plot
  • Violation dropdown
  • Metric dropdown

Future milestone

  • Violation per province plot (in progress, currently commented out)
  • Subcategory dropdown
  • Sorting for CMA plot
  • Year slider

Address TA Feedback

#40

  • Subsetting or a scroll bar for barchart on tab 1 might help the user look at the data more easily
  • Put legend for the choropleth map somewhere
  • Limit CMAs to select
  • Display CMA abbreviation
