revhelper-replication-package-msr2017's Introduction

Predicting Usefulness of Code Review Comments using Textual Features and Developer Experience

Accepted Papers at MSR 2017

Predicting Usefulness of Code Review Comments using Textual Features and Developer Experience
Mohammad Masudur Rahman, Chanchal K. Roy, and Raula G. Kula

Download this paper: PDF

Impact of Continuous Integration on Code Reviews
Mohammad Masudur Rahman and Chanchal K. Roy

Download this paper: PDF

Abstract: Although peer code review is widely adopted in both commercial and open source development, existing studies suggest that such code reviews often contain a significant amount of non-useful review comments. Unfortunately, to date, no tools or techniques exist that can provide automatic support in improving those non-useful comments. In this paper, we first report a comparative study between useful and non-useful review comments where we contrast them using their textual characteristics and reviewers' experience. Then, based on the findings from the study, we develop RevHelper, a prediction model that can help developers improve their code review comments through automatic prediction of their usefulness during review submission. A comparative study using 1,116 review comments suggested that useful comments share more vocabulary with the changed code, contain salient items like relevant code elements, and are written by reviewers who are generally more experienced. Experiments using 1,482 review comments report that our model can predict comment usefulness with 66% prediction accuracy, which is promising. Comparison with three variants of a baseline model using a case study validates our empirical findings and demonstrates the potential of our model.

Comparative Study

The review comments below are used for our comparative study between useful and non-useful comments:

  • CS (256)
  • SM (281)
  • MS (288)
  • SR (291)
  • All useful comments (618)
  • All non-useful comments (498)

Review Usefulness Prediction

Prediction Model

Auxiliary Items for Replication:

  • Stop words
  • Python keywords
  • Readability Ease calculator library
  • Regex for question identification: "[?]($|\s)" and sentence identification: "[?!.]($|\s)"
  • Regex for code element identification: "`{3}[\S+\s+]+`{3}|`[\S+]+`|`[\S+\s+]+`"
  • GitHub API library
  • Git Bash
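
The regular expressions listed above can be tried out with a short script. The following is a minimal sketch in Python, not the authors' implementation: the three patterns are copied verbatim from the list, while the names `count_matches` and `describe_comment` and the sample comment are hypothetical and introduced only for illustration.

```python
import re

# Patterns copied verbatim from the auxiliary items list.
QUESTION_RE = re.compile(r"[?]($|\s)")
SENTENCE_RE = re.compile(r"[?!.]($|\s)")
CODE_ELEMENT_RE = re.compile(r"`{3}[\S+\s+]+`{3}|`[\S+]+`|`[\S+\s+]+`")


def count_matches(pattern, text):
    """Count non-overlapping matches (hypothetical helper).

    finditer is used instead of findall because the patterns contain a
    capturing group, and findall would return the group text instead.
    """
    return sum(1 for _ in pattern.finditer(text))


def describe_comment(comment):
    """Return simple textual counts for one review comment (hypothetical helper)."""
    return {
        "questions": count_matches(QUESTION_RE, comment),
        "sentences": count_matches(SENTENCE_RE, comment),
        "code_elements": [m.group(0) for m in CODE_ELEMENT_RE.finditer(comment)],
    }


if __name__ == "__main__":
    sample = "Why not reuse `parse_args()` here? This duplicates the logic."
    print(describe_comment(sample))
```

Note that the code element pattern is greedy, so a comment with several backtick-quoted spans that contain whitespace may be reported as a single match. Whether the original pipeline post-processes such cases is not stated here.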

Model Training & Testing

  • All useful comments (618)
  • All non-useful comments (498)

Model Validation & Case study

  • CS (81)
  • SM (99)
  • MS (99)
  • SR (87)
  • All useful comments (262)
  • All non-useful comments (104)
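
As a sanity check when downloading the files, the counts listed in this README can be cross-checked against each other. The sketch below is an illustration only: the numbers are copied from the lists above, while the variable names and the `total` helper are hypothetical. It confirms that the per-system and per-label counts agree, and that the two sets together add up to the 1,482 comments mentioned in the abstract.

```python
# Counts copied from the lists in this README; names are illustrative only.
study_by_system = {"CS": 256, "SM": 281, "MS": 288, "SR": 291}
study_by_label = {"useful": 618, "non-useful": 498}
validation_by_system = {"CS": 81, "SM": 99, "MS": 99, "SR": 87}
validation_by_label = {"useful": 262, "non-useful": 104}


def total(counts):
    """Sum the comment counts in one breakdown (hypothetical helper)."""
    return sum(counts.values())


study_total = total(study_by_system)
validation_total = total(validation_by_system)

# Each set should have the same size whether split by system or by label.
assert study_total == total(study_by_label)
assert validation_total == total(validation_by_label)

print("comparative study / training set:", study_total)
print("validation / case study set:", validation_total)
print("combined:", study_total + validation_total)
```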

Please cite our work as

@inproceedings{msr2017masud, 
author = {Rahman, M. M. and Roy, C. K. and Kula, R. G. }, 
title = {{Predicting Usefulness of Code Review Comments using Textual Features and Developer Experience}}, 
booktitle = {Proc. MSR}, 
year = {2017}, 
pages = {215--226} }

Download this paper: PDF

Do you also want to check CORRECT?

Something not working as expected?

Contact: Masud Rahman ([email protected])

OR

Create an issue from here
