Giter VIP home page Giter VIP logo

randoopscraper's Introduction

Randoop scraper is a project intended to allow quickly gathering literal values from Java source files for use by the Randoop unit test generator tool https://github.com/randoop/randoop . The tool scans individual files or directory trees looking for Java source files and scans each file looking for literal values, integers, characters, bytes, floating-point, and strings for use as literal values during Randoop test suite creation. The values found are then formatted for use by Randoop using the --literals-file command-line flag.

The parsing performed on the Java source files is performed using the Javaparser library, https://github.com/javaparser which permits extracting literal values from source with minimal effort. Each value found during the parsing is normalized and formatted as a string for use in the Randoop compatible literals file creation phase. Due to the way Javaparser operates, certain types of literal values are treated as equivalent types despite their use in in the Java source. An example of this is byte, short and long values in the original Java source are all considered to be integer values by the Javaparser library, despite these having been defined explicitly as one of the aforementioned types. This tool does not attempt determine what was actually defined in the original code for these but instead checks each value to determine what Java types it can be applicable to, and creates literal values according to this. To do this the tool compares the integer literal value to MAX_BYTE, MAX_SHORT and MAX_LONG as well as the minimum values defined by Java, and assigns the value to all literal types that can support the value, e.g. a value of 3 could be a byte, a short, a long or an int value, thus it would create one for each type for inclusion into the literals file created for Randoop. Similarly, the Java double and float types also have this comparison to maximum/minimum to determine what types that particular literal value could apply to. The Javaparser also elides the negation operators from the literal values it supplies to this tool, however this is mitigated by the mutation operations.

The tool also performs mutation upon the literal values found to enhance the value set and provide more interesting values for use by Randoop. The mutations applied are as follows:

  • Integer and floating point value negation
  • Integer value mutation by off-by-one
  • Floating point reciprocal
Each literal value found is associated with the particular Java class in which it was found, and subclasses found within a Java program source file are considered to be part of the outer class, so literals found within them are treated as belonging to the outer class. Also, redundant values are eliminated during the literals collection phase so only a single version of any literal value is stored for a particular Java class.

During creation of the Randoop literals text file (the output of this tool) duplicate values that are already part of the default Randoop literals set are eliminated.

Owing to the way Randoop works, this method may be suboptimal due to the combinatoric explosion that can result when Randoop performs input selection for methods, particularly those that require multiple literal values to be supplied. Thus, using this tool you should be aware of this and inspect the output manually to eliminate those values that are not considered interesting. Any values that are known to be required by Randoop for testing your application should also be manually inserted into the literals file created, e.g. login and password information for interfacing to web resources.

randoopscraper's People

Contributors

waynemot avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

randoopscraper's Issues

Need to remove new-lines from strings

Randoop doesn't allow new-line (^J) characters embedded into string literals due to the way it parses the literals list input file. This tool needs to remove the new-line characters from the string literals reported by Javaparser before saving them. Strangely it doesn't care about carriage return (^M) characters being embedded, at least on Mac OS and presumably Linux. It may be important in Windows but has not been tested there.

Handle backslash character literals

Randoop does not allow for backslashed character literals to be used in the literals file specs, e.g. \t \r \n all cause heartburn for Randoop (exception error) thus these need to either be translated into their respective decimal equivalent value or removed from the stored values.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.