String-related Bugs

This repository contains the dataset and scripts for the paper "No Strings Attached: An Empirical Study of String-related Software Bugs".

To cite this work, use: "Aryaz Eghbali and Michael Pradel. 2020. No Strings Attached: An Empirical Study of String-related Software Bugs. In 35th IEEE/ACM International Conference on Automated Software Engineering (ASE ’20), September 21–25, 2020, Virtual Event, Australia" (https://doi.org/10.1145/3324884.3416576).

Dataset Description

  • The main.csv file contains all bugs from our study. The columns are as follows:
    • Bug Description: A short textual description of the bug.
    • Commit URL: The GitHub URL of the commit that fixed the bug.
    • String literal fix: 1 if the bugfix fixed a string literal.
      • Code: 1 if the string literal is some software code.
      • Path: 1 if the string literal is a path (URL, file path, etc.) to some resource.
    • Regex: 1 if the bug is caused by a regular expression.
      • Mistake in literal: 1 if the bug is caused by a wrong literal in the regex.
      • Grouping: 1 if the bug is caused by wrong grouping.
      • New case: 1 if the bug is fixed by adding more cases to the regex.
      • Use regex at all: 1 if the bug is fixed by using regular expressions.
      • Escaping: 1 if the bug is caused by not escaping some special characters.
      • Repetition/optional: 1 if the bug is caused by wrong repetition/existence directives.
      • Anchors: 1 if the bug is caused by wrong anchors.
      • replace() added or removed: 1 if the bug is fixed by using or removing the replace() method call
    • Script as string: 1 if the bug is caused by some script that is used as a string value.
    • API: 1 if the bug is caused by some string-related API.
    • String literal --> function/config generated: 1 if the bug is fixed by using a function-generated string or a value read from a configuration source instead of a fixed string.
    • Additional operation/comparison required: 1 if the bug is fixed by applying more operations or comparisons.
    • Wrong path: 1 if a path to some resource is wrong.
    • Treat as/cast to string: 1 if the bug is fixed by treating a value as or casting a value to the string type.
    • String comparison using exact format: 1 if the bug is caused by comparing a value to a fixed string with exact equality operations.
    • Case-sensitive comparison: 1 if the bug is caused by comparing strings case-sensitively.
    • Should have been a string literal: 1 if the bug is fixed by using a string literal instead of some variable, method call, etc.
    • Empty string problems: 1 if the bug is caused by empty strings.
    • Remove some operation/comparison: 1 if the bug is fixed by removing some operations or comparisons.
    • Core: 1 if the component affected by the bug is the core component.
    • Testing: 1 if the testing part of the code is affected by the bug.
    • Utility: 1 if some utility function/class/etc. is affected by the bug.
    • Build: 1 if the build process is affected by the bug.
    • Demo: 1 if the demonstration code is affected by the bug.
    • UI: 1 if the user interface is affected by the bug.
    • Consequence: A short description of the bug's consequences.
    • Incorrect output: 1 if the output is wrong because of the bug. The output can be any visual output from textual to graphical user interfaces.
    • Corrupt a file: 1 if the bug corrupts the contents of a file.
    • Cause problem in specific OS: 1 if the bug causes problems on specific operating system(s).
    • Causes problem in s specific software: 1 if the bug causes problems in specific software, such as web browsers.
    • Test fails: 1 if some test fails because of this bug.
    • Task not done: 1 if the task that was supposed to be done by the code is not performed (partially or completely).
    • Interpreter/commandline error: 1 if the bug results in an interpreter or command-line error.
    • Authentication fails: 1 if authentication to some protected resource fails because of the bug.
    • Inexecutable program part: 1 if some part of the program does not get executed because of the bug.
    • Tool/Software warning: 1 if the bug causes a warning in some other tool or software.
    • Cause problem in a specific device: 1 if the bug causes problems on specific device(s).
    • Security risk: 1 if the bug poses a security risk.
  • The changes.csv file contains statistics on how many characters and lines are changed in each of the bugfixes studied.
  • The file rootcause_component.csv contains a matrix that counts, for each root cause, how many bugs affect each component. The file rootcause_consequence.csv contains a matrix that counts, for each root cause, how many bugs lead to each consequence.
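
As a rough illustration of how the 0/1 columns can be aggregated, the following Python sketch counts flagged bugs per column and builds a root cause by component co-occurrence count similar in spirit to rootcause_component.csv. It is not one of the study's scripts. It assumes that the header row of main.csv uses the column names listed above verbatim and that a set flag is stored as the string "1"; the function names are hypothetical.

```python
import csv
import io


def load_rows(csv_file):
    """Read every data row of an open CSV file into a list of dicts."""
    return list(csv.DictReader(csv_file))


def is_set(row, column):
    """Return True if the cell in `column` holds the flag value "1" (assumed encoding)."""
    return (row.get(column) or "").strip() == "1"


def count_flags(rows, columns):
    """Count, for each column, the rows in which the flag is set."""
    return {col: sum(is_set(row, col) for row in rows) for col in columns}


def cooccurrence(rows, root_causes, components):
    """Count the rows in which both a root cause and a component are flagged."""
    return {
        (cause, comp): sum(is_set(row, cause) and is_set(row, comp) for row in rows)
        for cause in root_causes
        for comp in components
    }


# Tiny in-memory stand-in for main.csv (made-up rows, not data from the study).
sample = io.StringIO(
    "Bug Description,Regex,API,Core,UI\n"
    "bad pattern,1,,1,\n"
    "wrong call,,1,1,\n"
    "label typo,1,,,1\n"
)
rows = load_rows(sample)
flag_counts = count_flags(rows, ["Regex", "API"])
matrix = cooccurrence(rows, ["Regex", "API"], ["Core", "UI"])
```

To run this on the real dataset, open the file with open("main.csv", newline="", encoding="utf-8") and pass the file object to load_rows; the encoding is an assumption.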

Scripts

The following scripts (located in the "scripts" directory) were used for various analyses:

  • extract_all_patches.sh: Extracts all patches (commits) into separate files (one file per patch).
  • small_fixes_in_js.py: Selects commits with a small number (4) of changed lines in .js files.
  • extract_interesting_commits.sh: Copies the small commits selected by the small_fixes_in_js.py script into a file for manual inspection.
  • repair.js: For each bugfix, checks whether the tokens added by the fix already appear in the vicinity of the code change, searching at most 100 lines away.
  • compare_with_jshint.py: Runs JSHint and checks whether any JSHint warning is reported on the lines changed by a commit. Also uses .jshintrc.
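
The small-fix selection step can be sketched as follows. This is not the code of small_fixes_in_js.py; it is a minimal sketch that assumes each patch file is a unified diff as produced by git, and that "(4)" means at most 4 added or removed lines in .js files. The function names and the max_lines parameter are hypothetical.

```python
import re

# Hunk header, e.g. "@@ -10,3 +10,4 @@"; an omitted length means 1 line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def changed_lines_per_file(patch_text):
    """Map each file in a git patch to its number of added plus removed lines."""
    counts = {}
    current = None
    old_left = new_left = 0  # lines still expected in the current hunk body
    for line in patch_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                new_left -= 1
                counts[current] += 1
            elif line.startswith("-"):
                old_left -= 1
                counts[current] += 1
            elif not line.startswith("\\"):  # skip "\ No newline at end of file"
                old_left -= 1  # a context line belongs to both sides
                new_left -= 1
            continue
        if line.startswith("diff --git "):
            current = line.split(" b/", 1)[-1]
            counts[current] = 0
        else:
            match = HUNK_RE.match(line)
            if match and current is not None:
                old_left = int(match.group(1) or 1)
                new_left = int(match.group(2) or 1)
    return counts


def is_small_js_fix(patch_text, max_lines=4):
    """Return True if the patch changes between 1 and max_lines lines in .js files."""
    counts = changed_lines_per_file(patch_text)
    js_changed = sum(n for path, n in counts.items() if path.endswith(".js"))
    return 0 < js_changed <= max_lines
```

Using the hunk header lengths to delimit each hunk body keeps file headers ("---", "+++") and the trailing "-- " signature of a patch file from being counted as changed lines.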
