Giter VIP home page Giter VIP logo

cip-error-messages's Introduction

cip-error-messages

This repo contains the code used in this paper: insert-link-to-paper. error_message_code contains the code used to generate the different error message types. All other folders contain code used in the data analysis. Note that the data for this project lives in the Code in Place 3 firestore, and was not published to github to maintain privacy of the students.

  • error_message_code/ contains the code used to generate the different error message types.

    • frontend/ has the files were copied directly from the Code in Place React web application code.
    • backend/ has the files directly copied from the firebase repo for creating Forum based error messages.
  • data_files/ contains all of the data files for this project. Note that Sierra did not push most of them for privacy. Contact Sierra if you want access to these files.

    • sl_and_student_data.csv contains the age, gender, city, country, occupation, section_id, country's HDI, longitude, latitude for each student and section leader. Sierra downloaded each user's role (sl or student), gender, age, city, country, occupation, section id from the users/ collection in Code in Place firestore in 2023. She specifically downloaded all users who had section data, and a role that was either ‘sl’ or ‘student’. She performed three manual edits of this data - (1) removed Brahm who was listed as a student, (2) added user YVrXNIB6vTR1GbdNzExVlgYo2D72 as an sl, because they are listed as a ta but also lead a section, and (3) removed Chris Piech Test who was listed as a student. The HDI, latitude, and longitude columns were added programmatically based on each user's location.
    • student_logs/ contains the IDE logs per student. There is one file per student, titled {student_id}.csv. Each row contains the 'code', 'error_message', 'error', 'output', 'assnId', 'type', 'title', 'ogTimestamp', 'serverTimestamp', 'projectId', and 'unitTestResults' for every time the student ran their code. The 'error' is the compiler generated error, while the 'error_message' is the enhanced error that is shown to the student.
    • short_term_data/ contains the parsed data on how students resolve their errors in the short term
    • long_term_data/ contains the parsed data on how students resolve their errors in the long term
    • hdi.xlsx is data on the Human Development Index (HDI) for many countries, downloaded from the United Nations Programme in 2023 (note, that this data is from 2021-22 however)
    • demographics_with_error_data.csv contains demographics information on most of the students (it missing data on 50 students for an unknown reason). This information comes from the files firebase/scripts/admission/accept_adults_original.csv and firebase/scripts/admission/accept_minors.csv. This is used for the experience column.
  • download_scripts/ contains the scripts needed to download data from the Code in Place 2023 firebase. To run these scripts, you must clone the Code in Place firebase repo, then copy and run each script from firebase/scripts/download_data/

    • download_logs_per_student.py downloads one csv per student containing their IDE logs
    • download_user_data.py downloads one csv containing demographic information on each student
  • parsing_scripts/

    • add_hdi.py adds an hdi column to sl_and_student_data.csv
    • add_error_message_type.py adds an error_message_type column to each row of each student log file that says what error message type was used, and each row of sl_and_student_data.csv that says the error message type they used, if they just used one for the whole course.
    • add_experience.py adds an experience column to each row of log file, that says what error message type was used
    • add_avg_error_run_length.py adds an avg_error_run_length column to sl_and_student_data.csv that contains the average error run length of each student for the course.
    • analyze_demographics_results.py analyzes the statsitical significance of the demographics results
    • analyze_long_term_results.py analyzes the statsitical significance of the long term results
    • analyze_short_term_results.py analyzes the statsitical significance of the short term results
    • count_users_and_errors.py counts the number of students that used each error message type, and the number of errors made of each error message type. Sierra used this code to generate table 1 in the paper.
    • graph_demographics_results.py graphs the time to resolve errors for HDI, gender, and programming experience
    • graph_long_term_results.py graphs the long term results for different error message types
    • graph_short_term_results.py graphs the short term results for different error message types
    • parse_long_term_results.py determines how students' rate of error, and time to resolve their error, changes throughout the course
    • parse_short_term_results.py determines the rate that users make the same error in the subsequent run, and the number of runs it takes a user to resolve an error, and writes this information to output files
    • error_message_types.py contains common information about error message types
    • clean_error_message.py contains functions for turning raw errors into more general errors (removes function and variable names, etc)
    • karel_info.py contains information about karel assignments

cip-error-messages's People

Contributors

sierrawang avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.