APIBench

APIBench is the benchmark for evaluating the performance of API recommendation approaches released in paper "Revisiting, Benchmarking and Exploring API Recommendation: How Far Are We?".

Download the Benchmark

As GitHub does not hold large datasets, you can download the benchmark dataset at Zenodo.

Benchmark Details

Currently APIBench contains two sub-dataset for evaluating the performance of query-based and code-based API recommendation approaches, namely APIBench-Q and APIBench-C. Each sub-dataset has a Java version and a Python version.

Task Definition

Here we give the definitions of query-based and code-based API recommendation:

Query-based API recommendation: Approaches for query-based API recommendation aim at providing related APIs to developers given a query which describes programming requirements in natural language. The approaches can inform developers which API to use for a programming task.

Code-based API recommendation: Approaches for code-based API recommendation aim at predicting the next API given the code surrounding the point of prediction. They can directly improve the efficiency of coding.

APIBench-Q

APIBench-Q is located in the folder APIBench/APIBench_Q. There are a Python folder and a Java folder in it for the Python and Java version of APIBench-Q.

Currently there are two json files in each version:

Original{Java,Python}Queries.json: It contains the original queries, corrsponding APIs and API classes, along with the source the query collected from (Stack Overflow or Tutorial Websites).

Reformulated{Java,Python}Queries.json : It contains all the elements in Original{Java,Python}Queries.json. Besides, it contains the reformulated queries derive from the original queries. The reformulated queries are wrapped by @.

Currently we apply the following query reformulation techniques to process the original queries:

Technique	Source
RACK	https://github.com/masud-technope/RACK-Replication-Package
NLP2API	https://github.com/masud-technope/NLP2API-Replication-Package
SEQUER	https://github.com/kbcao/sequer
Google Prediction Service	http://suggestqueries.google.com/complete/search?
NLPAUG	https://github.com/makcedward/nlpaug

Number of queries in APIBench-Q:

	Original	Expanded	Modified
Python	4,309	173,517	224,068
Java	6,563	400,126	341,276

APIBench-C

APIBench-C is located in the folder APIBench/APIBench_C. There are a Python folder and a Java folder in it for the Python and Java version of APIBench-C.

Currently there are two zip files and a json file in each version:

{Java,Python}_MetaData.zip: It contains the metadata json files for code in each domain. The long, normal and short kewords indicate the function with extremely long, moderate and extremely short lengths.

{Java, Python}_Code.zip: It contains the source code of repositories we mined from GitHub at April, 2021. Only .py and .java files are reserved.

{Java, Python}RepoInfo.json: It contains the information of Github repositories we collected, including the lines of code, code ratio, domain, files, forks, stars, addresses, etc.

The structure of metadata:

"$filename$": {
		"$classname$" or "@Global@": {
      "$funcname$":{
        "@FuncLoc@":[
          $startlineno$,
          $endlineno$
        ],
        "$APIname$":[
          [
            $lineno$,
            $columnno$,
            "pure" or "attr",  #whether called from a class
            "Front", "Mdiddle" or "Back" #location of recommendation point
          ],
          ...
          ,
          "Standard", "User-defined", "Popular", or "Unknown" #category of APIs
        ]
      }
    }
}

Statistics of APIBench-C:

Baseline Results

To facilitate further research on API recommendation and reduce the burden of re-implementing different baselines, we release all the evaluation results and outputs of 11 baselines along with 4 IDEs discussed in the paper.

Please go to the experiment_resultsfolder for further detailed information.

Cite Us

If you use our benchmark dataset and related experiment results or code, please cite us:

@misc{peng2021revisiting,
      title={Revisiting, Benchmarking and Exploring API Recommendation: How Far Are We?}, 
      author={Yun Peng and Shuqing Li and Wenwei Gu and Yichen Li and Wenxuan Wang and Cuiyun Gao and Michael Lyu},
      year={2021},
      eprint={2112.12653},
      archivePrefix={arXiv},
      primaryClass={cs.SE}
}

Contact

If you have any questions, please contact [email protected].

nashid / apibench Goto Github PK

apibench's Introduction

APIBench

Download the Benchmark

Benchmark Details

Task Definition

APIBench-Q

APIBench-C

Baseline Results

Cite Us

Contact

apibench's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent