requirement-traceability-analysis's Introduction

Requirement Traceability Tool

This project creates a tool that identifies and visualizes the trace links of a software project, given its requirements and software development repository. A short video explains the architecture and the tool.

Used Technologies

Programming Language: Python

Graph DB: Neo4j

Authors

How to Run

The project requires Python 3.10. You can download it from the official Python website.

Prepare virtual environment

We recommend using a Python virtual environment created with the venv module.

Clone the project using the following shell command.

  git clone https://github.com/ersoykadir/Requirement-Traceability-Analysis.git

Create a virtual environment,

  python -m venv /path_to_your_venv

and activate it.

For Linux:
  . path_to_your_venv/bin/activate
For Windows:
  path_to_your_venv\Scripts\activate

Navigate to the root directory of the project and install the required dependencies.

  cd ./Requirement-Traceability-Analysis
  pip install -r requirements.txt

Navigate to the traceGraph directory:

cd traceGraph

Create a .env file with the following content:

GITHUB_USERNAME= < your github username >
GITHUB_TOKEN= < your github token >
NEO4J_PASSWORD= password
NEO4J_USERNAME= neo4j
NEO4J_URI= bolt://localhost:7687
OPENAI_API_KEY= < your openai token >
PRETRAINED_MODEL_PATH= < path to your pre-trained word-vector model >
FILTER_BEFORE= < OPTIONAL, provide to filter out software artifacts before a certain date >

You can find a template file named .env.example in the traceGraph directory. We have chosen neo4j/password as the default credentials. Please don't change the default Neo4j credentials, since they are also used when creating the Neo4j Docker container; if you do change them, update the docker-compose file as well. The OpenAI API key is used to obtain word embeddings from OpenAI's text-embedding-ada-002 model. Any pre-trained word-vector model can also be used by providing its path. We used word2vec's GoogleNews-vectors-negative300.
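As a quick sanity check before running the tool, the .env content can be parsed and checked for missing values. The following is a minimal sketch using only the standard library; the helper names `parse_env` and `missing_keys`, the sample values, and the choice of which keys count as required are assumptions for illustration, not part of the project.

```python
# Assumption: these keys are always needed; OPENAI_API_KEY and
# PRETRAINED_MODEL_PATH are only needed for some search methods.
REQUIRED_KEYS = [
    "GITHUB_USERNAME",
    "GITHUB_TOKEN",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_URI",
]


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Hypothetical sample content with an empty token.
sample = """
GITHUB_USERNAME= octocat
GITHUB_TOKEN=
NEO4J_PASSWORD= password
NEO4J_USERNAME= neo4j
NEO4J_URI= bolt://localhost:7687
"""
config = parse_env(sample)
print(missing_keys(config))
```

In practice the text would come from reading the .env file in the traceGraph directory.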


Prepare Neo4j server

A Neo4j server is required to use our dashboard. Assuming Docker is installed, we provide a Docker Compose file to speed up the installation process. At the root directory of the project, run

  docker compose up -d

As an alternative to the dockerized version, Neo4j Desktop can also be used. Just don't forget to add the APOC plugin.


Run the tool

Navigate to the traceGraph directory under the root.

cd ./traceGraph

BEWARE! If you want to use a repository other than our example repositories, you must provide its requirements.txt file.

Open the directory named data_reponame and create requirements.txt in it:

cd data_reponame
touch requirements.txt

Then copy your project's requirements into it.


Run main.py with three command-line arguments:

python ./main.py <git_repo> <search_method> <options>
  • <git_repo> repo_owner/repo_name

    • e.g. ersoykadir/Requirement-Traceability-Analysis
  • <search_method> indicates the method used for searching traces.

    • keyword: Keyword extraction method for capturing traces.
    • tf-idf vector: tf-idf vector method for capturing traces.
    • word-vector: Word-vector method for capturing traces, requires a pre-trained model.
    • llm-vector: Word-vector embeddings taken from openai.
  • <options> run options

    • rt: requirement tree mode,

      • includes parent requirements for keyword extraction, requires a file named 'requirements.txt' in the root directory of the repository
    • rg: reset graph,

      • deletes the graph pickle to re-create the graph from scratch
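The argument layout above could be parsed roughly as follows. This is a sketch with `argparse`, not the project's actual main.py; the function name `parse_cli` and the exact method strings (in particular `tf-idf`) are assumptions and may differ from what the tool accepts.

```python
import argparse

# Assumption: these strings mirror the documented search methods.
SEARCH_METHODS = ["keyword", "tf-idf", "word-vector", "llm-vector"]
RUN_OPTIONS = {"rt", "rg"}


def parse_cli(argv):
    """Parse <git_repo> <search_method> <options> into a namespace."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("git_repo", help="repo_owner/repo_name")
    parser.add_argument("search_method", choices=SEARCH_METHODS)
    parser.add_argument("options", nargs="*", help="rt and/or rg")
    args = parser.parse_args(argv)

    # Validate the free-form parts by hand.
    if args.git_repo.count("/") != 1:
        parser.error("git_repo must look like repo_owner/repo_name")
    unknown = set(args.options) - RUN_OPTIONS
    if unknown:
        parser.error("unknown options: " + ", ".join(sorted(unknown)))
    return args


args = parse_cli(
    ["ersoykadir/Requirement-Traceability-Analysis", "keyword", "rt", "rg"]
)
print(args.git_repo, args.search_method, args.options)
```

The run options are checked manually rather than through `choices`, which keeps the behaviour predictable when no options are given.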

A Config file controls everything related to run configuration. Note that the tool needs the requirement specifications of the repository you provide. The two repositories used during the development of this tool are available in the repo, together with their requirements.

Dashboard

Navigate to http://neodash.graphapp.io/ to view the dashboard. From the menu, choose New Dashboard and provide the default Neo4j credentials mentioned above. From the left menu bar, navigate to Load. On the page that opens, choose the Select from Neo4j option and select our pre-uploaded dashboard template. The dashboard should now be ready for use. If the dashboard template doesn't show up, use the Select from File option instead and take the dashboard template from our repository. Navigate to http://localhost:7474/ to view the graph database.

Example queries for Neo4j graph database

The following query returns the traces for a specific requirement number.

MATCH p=(r)-[t:tracesTo]->(a) 
WHERE r.number='<req_number>'
RETURN *

For keyword search, the following query returns the traces which have a specific keyword match.

MATCH p=(r)-[t:tracesTo]->(a) 
WHERE t.keyword='<keyword>'
RETURN *

The following query returns the traces between the requirement and the specific artifact type.

MATCH p=(r)-[t:tracesTo]->(a:<artifact_type>) 
WHERE r.number='<req_number>'
RETURN *
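When these queries are issued from Python, it is usually safer to pass values as query parameters than to paste them into the query text. The sketch below only builds the query string and parameter dictionary; the function name `trace_query` and the label `Issue` are assumptions for illustration. The artifact type is validated and interpolated directly, since node labels generally cannot be supplied as ordinary parameters.

```python
import re

# Only plain identifiers are accepted as labels, since the label is
# inserted into the query text rather than passed as a parameter.
LABEL_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def trace_query(req_number, artifact_type=None, keyword=None):
    """Build a tracesTo query and its parameters for one requirement."""
    label = ""
    if artifact_type is not None:
        if not LABEL_PATTERN.match(artifact_type):
            raise ValueError("invalid artifact type: " + repr(artifact_type))
        label = ":" + artifact_type

    conditions = ["r.number = $req_number"]
    params = {"req_number": req_number}
    if keyword is not None:
        conditions.append("t.keyword = $keyword")
        params["keyword"] = keyword

    query = (
        f"MATCH p=(r)-[t:tracesTo]->(a{label}) "
        f"WHERE {' AND '.join(conditions)} "
        "RETURN *"
    )
    return query, params


# 'Issue' is a hypothetical label; use the labels present in your graph.
query, params = trace_query("1.1.2.2", artifact_type="Issue")
print(query)
print(params)
```

The resulting pair can then be handed to whichever Neo4j client is in use, for example as the query text and parameters of a session's run call.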

requirement-traceability-analysis's People

Contributors

ersoykadir, codingaku, aydemirfb, suzan-uskudarli

requirement-traceability-analysis's Issues

Initial Research - Traceability

Issue Description

We have settled on Traceability as our research topic. We need to do a literature review for Traceability.
We can do a keyword search on titles, abstracts and keywords. We might also look at the references of the papers we find to extend our research.

Side Note: Read papers carefully, evaluate whether they are easy to read, and look at the methods they use to present their content. Also keep in mind:

  • What can I do to improve this research, and how can it be extended?
  • What methods have been used (NLP...)?
  • Is it implemented? If so, look for the code.

Step Details

Steps that will be performed:

  • Find similar papers and research.
  • Document them with summaries on wiki.

Final Actions

Wiki page for the research must be reviewed.

Deadline of the Issue

23.59 - 04.03.2023

Setting up Milestones

Issue Description

Milestones are essential in any software project to improve the level of organization. In this context, we need to set some fundamental milestones to set a clearer roadmap.
These include:

  • Literature Research (2 weeks)
  • Development Phase (5 weeks)
  • Evaluation Phase (3 weeks)
  • Documentation Phase (3 weeks)

Step Details

Steps that will be performed:

  • Create milestones on issues section
  • Link issues to the milestone

Final Actions

Although the milestones will set general deadlines, changes may happen on the schedule.

Deadline of the Issue

23.59 - 04.03.2023

Initial Research - Requirements Analysis

Issue Description

We have started researching similar papers on topics such as requirements analysis and requirements tracking. We need to document the results on the wiki.

Step Details

Steps that will be performed:

  • Find similar papers and research
  • Document them with summaries on wiki.

Final Actions

Wiki page for initial research must be reviewed.

Deadline of the Issue

23.59 - 23.02.2023

Evaluation of Bounswe Repos - Traceability Graph

Issue Description

As mentioned and documented in Issue #9, Bounswe projects Learnify and BUcademy have been scanned and traceability links between requirements and software artifacts are noted in a text format. For these projects, a traceability graph with various nodes and links to visualize the traces between artifacts needs to be drawn.

The graph may have several link types:

  • Requirements to issues may have a "similarity" type of link.
  • Issues to PRs may have an "implements" type of link.
  • Commits to PR may have an "includes" type of link.

The graph may have several node types. These may include requirement, issue, and commit.

The nodes may have properties:

  • The issue node may have a title and description.
  • The requirement node may have vagueness and the number of connected nodes.
  • PRs may have the title, description, and merged status.
  • Commits may have messages.

Other node and link types are welcome along with the ones discussed.
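The node and link types listed above can be sketched as a small in-memory data model. This is only an illustration of the proposed structure, not the project's implementation; the class names `Node`, `Link` and `TraceGraph`, and all identifiers in the example, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str  # e.g. "requirement", "issue", "pr" or "commit"
    properties: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    link_type: str  # e.g. "similarity", "implements" or "includes"


class TraceGraph:
    def __init__(self):
        self.nodes = {}
        self.links = []

    def add_node(self, node):
        self.nodes[node.node_id] = node

    def add_link(self, source, target, link_type):
        # Both endpoints must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before linking")
        self.links.append(Link(source, target, link_type))

    def degree(self, node_id):
        """Number of links touching a node (the 'nodes connected' property)."""
        return sum(1 for link in self.links if node_id in (link.source, link.target))


# Hypothetical example data.
graph = TraceGraph()
graph.add_node(Node("REQ-1", "requirement", {"text": "Users shall log in."}))
graph.add_node(Node("ISSUE-7", "issue", {"title": "Login page"}))
graph.add_node(Node("PR-9", "pr", {"title": "Add login", "merged": True}))
graph.add_node(Node("COMMIT-a1", "commit", {"message": "login form"}))
graph.add_link("REQ-1", "ISSUE-7", "similarity")
graph.add_link("ISSUE-7", "PR-9", "implements")
graph.add_link("COMMIT-a1", "PR-9", "includes")
print(graph.degree("PR-9"))
```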

Step Details

Projects that will be graphed are:

Final Actions

The graphs must be linked to the Wiki page for documentation purposes.

Deadline of the Issue

26.03.2023 - 23.59

Research On Repositories

Issue Description

The aim for this week is to look for repos on GitHub and examine them from the perspective of the newly learned methods.

  • BounSWE repos are good to observe; also look for open-source projects.
  • What can be done for traceability? What kind of information is available for use in repos?
  • WordNet, NLTK, Wikidata, and spaCy are some of the tools that we will experiment with to test information retrieval and semantic relation methods.

Step Details

Steps that will be performed:

  • Observe repositories: what can be achieved in terms of tracing?
  • Observe tools
  • Try the tools to get an idea of how feasible the suggestions from the first step are

Final Actions

Ideas and findings must be documented on wiki.

Deadline of the Issue

11:00- 13.03.2023

Evaluation of Bounswe Repos - 2022 Group 3 and Group 2

Issue Description

As discussed in Meeting 3, some repos in Bounswe will be evaluated manually. This means the requirements and their traces on commit messages, issue and PR descriptions and any other artifact will be noted. At the same time, the time each requirement took to complete will be noted. Moreover, if duplicate requirements exist, they will be pointed out.

In addition, frequently used keywords in any software artifact need to be noted, since they are required for similarity analysis. If the number of issues related to a requirement is higher than usual, the work it took should be documented as well.
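Although this evaluation was planned as a manual step, counting frequent keywords across artifact texts is easy to sketch. The example below assumes a tiny hand-picked stop-word list and made-up issue titles; the function name `keyword_frequencies` is hypothetical.

```python
import re
from collections import Counter

# Assumption: a very small stop-word list, for illustration only.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "from", "are"}


def keyword_frequencies(texts, top_n=5):
    """Count lowercase words across texts, ignoring stop words and short tokens."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)


# Hypothetical artifact titles.
titles = [
    "Add login page",
    "Fix login validation",
    "Login endpoint returns token",
]
print(keyword_frequencies(titles, top_n=1))
```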

For the final evaluation, the requirements that have been completed and the ones that have not been implemented should be noted to show the progress the application has.

The findings will overall show a quality analysis of the project based on requirements.

Step Details

Projects to be evaluated:

Final Actions

The findings need to be documented on the project Wiki.

Deadline of the Issue

19.03.2023 - 23.59

Creating a First Page for Wiki

Issue Description

A wiki first page is needed to describe the project. The page must include the title of the project, brief description, the aim and the contributors.

Final Actions

Review for the adequacy is required.

Deadline of the Issue

23.59 - 23.02.2023

Keyword Extraction : Dependency Parsing Links

Issue Description

With Issue #11, the keywords of the requirements were extracted for tracing. However, the initial trace results are too noisy to work with. The noise needs to be reduced by using more relevant keywords. One way to achieve this is with dependency parsing links.

Step Details

Steps that will be performed:

  • Create a program that shows dependency parsing links.
  • Use the program on keywords extracted.
  • Increase the importance of keywords with parsing links.

Final Actions

The findings need to be documented in Wiki page.

Deadline of the Issue

10.04.2023 - 23.59

TraceGraph: Creating Neo4j nodes

Issue Description

We will store our software artifact data in neo4j. We need to convert our parsing methods such that artifact data from github will be parsed into neo4j nodes.

Step Details

Steps that will be performed:

  • Learn basic neo4j queries
  • Parse artifact data into neo4j nodes

Final Actions

The findings need to be documented in Wiki page.

Deadline of the Issue

24.04.2023 @23.59

Evaluation of Prototype Results

Issue Description

The initial prototype with captured traces exists. The next step is to compare the results with a set of ground truth and document the evaluation.

Initializing Ground Truth Set:
For this purpose, a subset of requirements from the Learnify App Repository will be selected and their traces will be documented. This will be a manual analysis made by @codingAku and @ersoykadir.
The selected requirement subset is as follows:

  • 1.1.1.2.1. Users shall provide their usernames and passwords to log in. - @codingAku
  • 1.1.1.4.2. Users shall be authenticated after verification and be logged in. - @ersoykadir
  • 1.1.2.14.1. Users shall be able to annotate post images and texts in learning spaces. - @codingAku
  • 1.1.2.14.2. Users shall be able to view annotations made by other users. - @ersoykadir
  • 1.1.2.15.1. Users shall be able to see all learning spaces they created or enrolled in. - @codingAku
  • 1.1.3.2.5.1. Participants shall be able to create community events for that learning space. - @codingAku
  • 1.1.2.2 Users shall be able to edit their profile page. - @ersoykadir
  • 1.1.3.2.7.1. Participants of a learning space shall be able to create discussion posts. - @codingAku

The results of semantically related traces in the software artifacts will be used as ground truth set for evaluation.

The results of initial prototype:

The tool will be run with two different trace selection methods; keyword extraction and word vectors. The results will be compared with the ground truth set to document recall and precision values.

Step Details

Steps that will be performed:

  • Create ground truth set
  • Execute the tool with different trace capture methods
  • Document recall and precision values.
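Precision and recall for a set of captured traces against the ground truth can be computed as in the sketch below. Traces are modelled as (requirement number, artifact identifier) pairs; the artifact identifiers in the example are made up, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted, ground_truth):
    """Return (precision, recall) for two collections of trace pairs."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_positives = len(predicted & ground_truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical traces for requirement 1.1.2.2.
ground_truth = {
    ("1.1.2.2", "issue-12"),
    ("1.1.2.2", "issue-45"),
    ("1.1.2.2", "pr-30"),
}
predicted = {
    ("1.1.2.2", "issue-12"),
    ("1.1.2.2", "issue-99"),
}
precision, recall = precision_recall(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here one of the two predicted traces is correct (precision 0.5), and one of the three true traces is found (recall about 0.33).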

Final Actions

The findings need to be documented in a tabular format.

Deadline of the Issue

09.05.2023

Improving Dependency Parsing Keyword Extraction

Issue Description

Currently, the dependency parsing does not include conjunction links to catch verb-object pairs. An example is given below:
[Image: example dependency parse]

The custom pipeline needs to be updated to capture the verb-object pairs with conjunction depth.

Step Details

Steps that will be performed:

  • Implement deeper dependency link search
  • Include the resulting pairs in the returned keywords
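The idea of following conjunction links can be illustrated on a hand-written dependency parse. In the sketch below each token is a (text, dependency label, head index) tuple, with the root pointing at itself; the parse is written by hand for the phrase "annotate post images and texts" and is not the output of the project's pipeline. The function name `verb_object_pairs` is hypothetical.

```python
def verb_object_pairs(tokens):
    """Collect (verb, object) pairs, following 'conj' links from each direct object."""
    # Map each head index to the indices of its children.
    children = {}
    for index, (_, _, head) in enumerate(tokens):
        if head != index:
            children.setdefault(head, []).append(index)

    def conjuncts(index):
        """Return index plus every token reachable through 'conj' links."""
        found = [index]
        for child in children.get(index, []):
            if tokens[child][1] == "conj":
                found.extend(conjuncts(child))
        return found

    pairs = []
    for index, (_, dep, head) in enumerate(tokens):
        if dep == "dobj":
            verb = tokens[head][0]
            for obj in conjuncts(index):
                pairs.append((verb, tokens[obj][0]))
    return pairs


# Hand-written parse: (text, dependency label, head index).
tokens = [
    ("annotate", "ROOT", 0),
    ("post", "compound", 2),
    ("images", "dobj", 0),
    ("and", "cc", 2),
    ("texts", "conj", 2),
]
print(verb_object_pairs(tokens))
```

Without the conjunction step only ("annotate", "images") would be captured. This sketch handles conjoined objects only; conjoined verbs would need a similar traversal from the verb side.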

Final Actions

The trace capture must be re-run with new verb object pairs.

Deadline of the Issue

08.05.2023

Tracing between Issue and PRs

Issue Description

Currently, the tool creates trace links between Requirement nodes and the rest of the software artifacts. This architecture should be updated to have traces as follows:

  • Requirements --tracesTo-> Issues
  • Issues --tracesTo-> PRs

The results must be stored in the Neo4j graph database.

Step Details

Steps that will be performed:

  • Change the Neo4j query to trace Requirements to Issues
  • Write new query to trace Issues to PRs

Final Actions

The final graph must be reviewed.

Deadline of the Issue

09.05.2023

Create Custom Sidebar and Meeting Template

Issue Description

A custom sidebar is needed to improve organization and provide easy access to wiki pages. Existing wiki pages must be linked.
Secondly, a meeting template that includes the time, Zoom link, agenda and action items shall be created.

Step Details

Steps that will be performed:

  • Create a custom sidebar and link existing pages
  • Create a meeting template

Final Actions

Review is needed to finalize.

Deadline of the Issue

23.59 - 24.02.2023

Create an Issue Template

To have a better organization with issues, an issue template is needed.
The template needs to have:

  • Description
  • Steps to Perform
  • Final Actions
  • Final Deadline

After the template is created, the related file will be in a directory in main branch.

Deadline :

  • 13.00 - 23.02.2023

Keyword Extraction on Requirements

Issue Description

The traceability graphs are complete. However, the main search for similarity links consisted of manually searching for important keywords from the requirement specifications in the software artifacts. This process needs to be done automatically.

For the purpose, the following methods may be used:

  • Dependency parsing
  • Part of speech parsing

Step Details

Steps that will be performed:

  • Research keyword extraction methods
  • Test available engines
  • Document the effectiveness

Final Actions

The documentation must be on Wiki.

Deadline of the Issue

4.04.2023

Initial Automatization of Trace Links

Issue Description

We have manually built, graphed and analyzed traces. Now, we need to automate the link and graph creation. So we will create a data structure for software artifacts to store them. Then we will proceed with building traces between the artifacts automatically, via keyword-based and semantic matching.

Step Details

Steps that will be performed:

  • Download the issue, PR and commit data from the GitHub API (beware of the rate limit!) and the requirements from the GitHub wiki.
  • Decide on the data structure to store each artifact as a node. (With titles, descriptions and properties)
  • Parse the github data into the structure.
  • Create the script to build the trace links between nodes.
    • Research and implement a keyword extraction method.
    • Use it on software artifacts to extract keywords and search them among other artifacts, aiming to build the traces.
  • Visualize the results (can be text based for starters, later graphs will be used)
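For the rate limit mentioned in the first step, GitHub REST API responses carry `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers, the latter being a UTC epoch timestamp. A minimal sketch of deciding how long to pause between requests follows; the function name `seconds_until_reset` and the header values in the example are made up for illustration.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before the next API request."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {key.lower(): value for key, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = int(lowered.get("x-ratelimit-reset", "0"))
    now = time.time() if now is None else now
    return float(max(0, reset_at - now))


# Hypothetical header values from an exhausted quota.
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"}
print(seconds_until_reset(exhausted, now=1700000000))
```

A download loop could call `time.sleep` with the returned value after each response before requesting the next page.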

Final Actions

Document the results on wiki.

Deadline of the Issue

03.04.2023 @23.59

Writing Abstract of the project

Issue Description

We need to write an abstract section for the project Requirement Traceability Tool.

Step Details

Steps that will be performed:

  • Research abstract writing formulas
  • Write the draft abstract
  • Evaluate and finalize.

Final Actions

Next weekly meeting, the abstract will be discussed with our professors.

Deadline of the Issue

28.05.2023

Midterm Report

Issue Description

We need to prepare our midterm report. It will essentially contain what we have done so far.
We already have on wiki:

Step Details

There are 7 sections to the report

  • Introduction
    • Broad Impact
    • Ethical Considerations
  • Project Definition and Planning
    • Project Definition - Ecenur
    • Project Planning - Kadir
      • Milestones and meeting notes
    • Project Time and Resource Estimation
      • How long will it take? What resources are required?
    • Success Criteria
    • Risk Analysis
    • Team Work (if applicable)
  • Related Work (Literature survey) - Ecenur
  • Methodology (Solution methods.)
  • Requirements Specification (Use case diagrams.)
  • Design
    • Information Structure (ER Diagrams)
    • Information Flow (Activity diagrams, sequence diagrams, Business Process Modeling Notation.)
    • System Design (Class diagrams, module diagrams.) - Kadir
    • User Interface Design (if applicable)
  • Implementation and Testing
    • Implementation
    • Testing TBD
    • Deployment TBD
  • Results TBD
  • Conclusion TBD

Final Actions

  • Review the overall report.
  • Send the report to Advisors
  • Submit the report through Moodle (both students).

Deadline of the Issue

09.04.2023 @23.59

Creating Glossary Page

Issue Description

A glossary page containing formal definitions of keywords related to the project development needs to be created on the wiki.

Step Details

Steps that will be performed:

  • Create a glossary page
  • Include keywords and reference
  • Add hyperlink to sidebar.

Final Actions

The glossary page will be updated frequently with the progress in the project.

Deadline of the Issue

5.03.2023 - 23.59
