konveyor / tackle-container-advisor

Recommends containerization plan for legacy applications.

Home Page: https://konveyor.github.io/tackle-container-advisor-docs/

License: Apache License 2.0

Dockerfile 0.61% Python 97.66% Makefile 0.24% Batchfile 0.29% Shell 1.19%

container image modernization

tackle-container-advisor's Introduction

Tackle Container Advisor (TCA)

Usage

Development

Purpose

TCA provides APIs to standardize natural-language descriptions of technology stack components, cluster a portfolio of technology stacks into groups of similar stacks, and match technology stacks to Docker, OpenShift, or operator catalog images. For example, consider the following application descriptions.

1. App1: rhel, db2, java, tomcat
2. App2: .net, java, oracle db
3. App3: dot net, java, oracle dbms

TCA takes the following steps to recommend a containerization plan.

Standardize: Map natural-language inputs to the relevant named entities of technology stacks in our knowledge base. For details on the knowledge base, see the db folder. For example, the inputs for App1, App2, and App3 are mapped to the following named entities.

1. App1: rhel: RedHat Enterprise Linux, db2: DB2, java: Java, tomcat: Apache Tomcat
2. App2: .net: .NET, java: Java, oracle db: Oracle DB
3. App3: dot net: .NET, java: Java, oracle dbms: Oracle DB

Clustering: Cluster the standardized technology stack components into groups of similar technology stacks. For example, the standardized technology stacks for App1, App2, and App3 are grouped into two clusters.

1. Cluster1: {App1}
2. Cluster2: {App2, App3}
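
The grouping above can be reproduced with a minimal sketch, assuming two stacks belong to the same cluster only when their standardized entity sets are identical. That rule is an assumption made for illustration; TCA's own clustering may use a looser similarity measure. The function name `cluster_by_entities` is hypothetical.

```python
from collections import defaultdict

# Standardized stacks taken from the example above.
standardized = {
    "App1": ["RedHat Enterprise Linux", "DB2", "Java", "Apache Tomcat"],
    "App2": [".NET", "Java", "Oracle DB"],
    "App3": [".NET", "Java", "Oracle DB"],
}

def cluster_by_entities(stacks):
    """Group application names whose standardized entity sets are identical.

    Assumption: exact set equality stands in for "similar technology stacks".
    """
    groups = defaultdict(list)
    for app, entities in stacks.items():
        groups[frozenset(entities)].append(app)
    # Sort apps within each group, then the groups, for a stable result.
    return sorted(sorted(apps) for apps in groups.values())

clusters = cluster_by_entities(standardized)
for i, apps in enumerate(clusters, start=1):
    print(f"Cluster{i}: {{{', '.join(apps)}}}")
```

Because App2 and App3 standardize to the same entities even though their raw inputs differ, they land in one cluster, while App1 stays on its own.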

Containerize: Determine whether a technology stack is fully containerizable, partially containerizable, or not containerizable at all. If a technology stack is recommended as fully or partially containerizable, TCA also suggests container images from the DockerHub or OpenShift image catalogs. Custom user-defined catalogs can also be provided for matching. For example, if a user chooses DockerHub images, TCA generates the following images.

1. Cluster1: tomcat|https://hub.docker.com/_/tomcat
2. Cluster2: db2|https://hub.docker.com/r/ibmcom/db2

For OpenShift, TCA generates the following images.

1. Cluster1: tomcat|https://access.redhat.com/containers/#/registry.access.redhat.com/jboss-webserver-3/webserver31-tomcat8-openshift
2. Cluster2: db2|https://access.redhat.com/containers/#/cp.stg.icr.io/cp/ftm/base/ftm-db2-base

TCA Pipeline

The pipeline ingests raw inputs from client data and standardizes them to generate named entities and versions. To standardize (normalize) raw inputs, we use a TF-IDF similarity-based approach. To find container images, we represent images in terms of named entities as well. This normalized representation lets TCA match legacy applications with container images and suggest the best available recommendations.
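
To illustrate the TF-IDF similarity idea, here is a minimal, standard-library-only sketch. It is not TCA's implementation: the tiny knowledge base `KB`, the character-trigram features, the smoothed IDF formula, and the function names are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter

# Hypothetical mini knowledge base: entity name -> known mentions.
KB = {
    "RedHat Enterprise Linux": ["rhel", "red hat enterprise linux"],
    "DB2": ["db2", "ibm db2"],
    "Java": ["java"],
    "Apache Tomcat": ["tomcat", "apache tomcat"],
    ".NET": [".net", "dotnet"],
    "Oracle DB": ["oracle db", "oracle database"],
}

def ngrams(text, n=3):
    """Character n-grams of the lowercased text, padded with one space per side."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

MENTIONS = [(entity, m) for entity, ms in KB.items() for m in ms]
# Document frequency: in how many mentions each n-gram occurs.
DF = Counter(g for _, m in MENTIONS for g in set(ngrams(m)))
N = len(MENTIONS)

def tfidf(text):
    """TF-IDF vector as a dict; n-grams unseen in the KB get the largest IDF."""
    counts = Counter(ngrams(text))
    return {g: c * (math.log((1 + N) / (1 + DF[g])) + 1) for g, c in counts.items()}

def cosine(a, b):
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def standardize(raw):
    """Return (entity, score) for the KB mention most similar to the raw input."""
    query = tfidf(raw)
    score, entity = max((cosine(query, tfidf(m)), e) for e, m in MENTIONS)
    return entity, score

for raw in ["rhel", "dot net", "oracle dbms"]:
    entity, score = standardize(raw)
    print(f"{raw!r} -> {entity} ({score:.2f})")
```

Character n-grams are used here so that near-miss spellings such as "dot net" and "oracle dbms" still overlap with known mentions; a word-level vectorizer would be an equally reasonable choice.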

Code of Conduct

Refer to Konveyor's Code of Conduct page

tackle-container-advisor's Issues

Techtokens get split based on colon

Some tech tokens are getting split on ":".

Tableau:SI

SI gets separated from Tableau in recommendation
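
A minimal sketch of how this kind of split can arise, assuming (hypothetically) that the tokenizer treats both "," and ":" as separators. The function names below are illustrative and are not taken from the TCA code base.

```python
import re

def split_naive(text):
    # Hypothetical tokenizer that treats both "," and ":" as separators.
    return [t.strip() for t in re.split(r"[,:]", text) if t.strip()]

def split_on_commas(text):
    # Alternative: split on commas only, so "Tableau:SI" stays one token.
    return [t.strip() for t in text.split(",") if t.strip()]

sample = "Tableau:SI, Java"
print(split_naive(sample))
print(split_on_commas(sample))
```

If the colon is needed as a separator elsewhere, an allow-list of known colon-containing tokens would be another option.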

Expand the current KG with Openshift and DockerHub images

Expand ACA knowledge base with entity versions

We need to capture versions for entities in the following order

OS
App
App Servers
Runtime
PLs

Static code analysis using SonarQube and git actions

Check whether SonarQube is open source or not.
If not, we need to find an equivalent static code analysis tool.

Add the version standardizer feature to the containerization assessment API

Refactor the ACA Code to create containerization recommendation API

As a user I should be able to get containerization recommendation for APIs

overall deployment pipeline for GBS

Publish Tutorials on TCA

KG refactoring for Containerization

Performance Bottleneck on Compose App

The recommendation step has a bottleneck: processing one record takes about 0.6 seconds. This applies only to the compose-app module; the other modules are fast. Thanks to @Salai123 for reporting it.

Older code with old mentions (2k+)

Request Received 1631602243.2549899
Triggered Planning 1631602243.2552867
call containerization plan  1631602243.3072283
detect access token 1631602243.3079007
detected access token 1631602243.3079774
composed app 1631602243.4333582
missing infer tech 1631602243.4339461
app validate 1631602243.434026
assessment ui 1631602243.4362257
planning ui 1631602243.4396195
return final response  1631602243.4432805

New Code with new mentions (6k+)

Request Received 1631609691.3319073
Triggered Planning 1631609691.3320363
call containerization plan  1631609691.3685682
detect access token 1631609691.368657
detected access token 1631609691.368682
composed app 1631609691.911315
app validate 1631609691.911479
assessment ui 1631609691.9115021
return final response  1631609691.9117076
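
The per-step cost can be read off the two traces with a short script. This is a sketch that assumes each value is a Unix timestamp in seconds and that each line is logged when its step completes, so the gap to the previous line is attributed to that step. `Decimal` is used to avoid floating-point rounding on such large values.

```python
from decimal import Decimal

OLD_LOG = """\
Request Received 1631602243.2549899
Triggered Planning 1631602243.2552867
call containerization plan  1631602243.3072283
detect access token 1631602243.3079007
detected access token 1631602243.3079774
composed app 1631602243.4333582
missing infer tech 1631602243.4339461
app validate 1631602243.434026
assessment ui 1631602243.4362257
planning ui 1631602243.4396195
return final response  1631602243.4432805
"""

NEW_LOG = """\
Request Received 1631609691.3319073
Triggered Planning 1631609691.3320363
call containerization plan  1631609691.3685682
detect access token 1631609691.368657
detected access token 1631609691.368682
composed app 1631609691.911315
app validate 1631609691.911479
assessment ui 1631609691.9115021
return final response  1631609691.9117076
"""

def step_durations(log):
    """Return (label, seconds since the previous line) for each line after the first."""
    entries = []
    for line in log.splitlines():
        if line.strip():
            label, stamp = line.rsplit(None, 1)
            entries.append((label.strip(), Decimal(stamp)))
    return [(label, stamp - prev) for (_, prev), (label, stamp) in zip(entries, entries[1:])]

def slowest(log):
    """Return the (label, seconds) pair with the largest gap."""
    return max(step_durations(log), key=lambda item: item[1])

for name, log in (("old", OLD_LOG), ("new", NEW_LOG)):
    label, seconds = slowest(log)
    print(f"{name}: slowest step is '{label}' at {seconds} s")
```

Under those assumptions, the "composed app" step grows from 0.1253808 s with the old mentions to 0.542633 s with the new ones, which accounts for most of the roughly 0.58 s end-to-end time in the second trace.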

Unit test has an old API

One of the unit tests uses an old API:

app_data = [
    {
        'application_id': 'App ID 0114',
        'application_name': 'App Name 0114',
        'application_description': 'App Desc 0114',
        'component_name': 'Comp 1',
        'operating_system': 'RHEL',
        'programming_languages': 'Java',
        'middleware': 'WebSphere Application Server',
        'database': 'db2 10.0',
        'integration_services_and_additional_softwares': 'Redis',
        'technology_summary': 'angularJs,express.js,jenkins',
        'versioning_tool_type': '1',
        'application_inbound_interfaces': 5,
        'application_outbound_interfaces': 1,
        'devops_maturity_level': 'Moderate',
        'devops_tooling': 'Jenkins, Git, JIRA',
        'test_automation_%': '50%',
        'performance_testing_enabled': 'No'
    }
]

Can we change it?

Update ACA DB to add refactored ACA KG for containerization API

We will add all information from refactored KG to ACA DB. It will be 2 DBs. One community DB and another enterprise DB.

Update Documents required for containerization API

The documents for the containerization API need to be updated.

ACA backend API (Salai)
KG utils (Lambert)

Expand the current KG with the generated mentions

Given the generated mentions, we need to add more entities to the current KG.

Verify the expanded KG with the new mentions

We need to verify the mentions and QIDs.

Create a benchmarking for entity standardization

Trigger unit tests using git actions

Expand ACA KG with Environment variables for Docker and Openshift images

New UI requirement: data upload

Improve the performance for serial processing.

KG refactoring for Containerization

Update Automated Tests for the containerization API

Update tests for the containerization API.

Create disposition recommendations

We have to address the various labels for the 6Rs. One of them is upgrade (refactor).

Expand the operators catalog

Expanded the operators' catalog to add new operators

Containerization API

As a user I should be able to get the containerization recommendation for my legacy apps.

OpenShift recommendations have errors

{
  "status": 201,
  "message": "Container recommendation generated!",
  "containerization": [
    {
      "Name": "App",
      "Desc": "string",
      "Cmpt": "string",
      "Valid": true,
      "Ref Dockers": "1. {'openjdk_Linux': 'https://access.redhat.com/containers/#/registry.access.redhat.com/ubi8/openjdk-11'}",
      "Confidence": 0.56,
      "Reason": "Containerization feasibility unknown for COTS applications: Apache Tomcat, MySQL",
      "Recommend": "Partially Containerize"
    }
  ]
}

Create 6R disposition downstream task

Can we add the confidence scores for all the recommendations we generate?

Remove/Update the duplicate code in entity standardization

We have a few utility files and the sim applier duplicated in both entity standardization and the backend API.

Create an API to extract environment variables for images

Either an API or a script to generate JSON representations

ACA Deployment

Additional Version Information

Requiring inputs on couple of things

Dear Team,

What is expected for the technology summary here? It seems unable to produce an output.
How can bulk files be uploaded on the page? I can see that only one input is allowed.

Input : -
[
  {
    "application_name": "Sample",
    "application_description": "This is a Sample App",
    "component_name": "Framework",
    "operating_system": "RHEL",
    "programming_languages": "Java",
    "middleware": "Kafka",
    "database": "PostGreSQL",
    "integration_services_and_additional_softwares": "",
    "technology_summary": "Web"
  }
]

Output : -
{
  "status": 201,
  "message": "Assessment completed successfully!",
  "assessment": [
    {
      "Name": "Sample",
      "Desc": "This is a Sample App",
      "Cmpt": "Framework",
      "OS": "{'RHEL': {'Linux|Red Hat Enterprise Linux': ''}}",
      "Lang": "{'Java': {'Java': ''}}",
      "App Server": "{}",
      "Dependent Apps": "{'Kafka': {'Apache Kafka': ''}, 'PostGreSQL': {'PostgreSQL': ''}}",
      "Runtime": "{}",
      "Libs": "{}",
      "Reason": "Reason 101: Medium or low confidence for the inferred data: {\"Web\": {\"Java|Java Web Start\": \"\", \"Apache HTTP Server\": \"\", \"Java|Google Web Toolkit (GWT)\": \"\"}}",
      "KG Version": "1.0.0"
    }
  ]
}
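
For anyone consuming this output programmatically: fields such as "OS" and "Dependent Apps" appear to carry Python-style dict literals encoded as strings. A minimal sketch for decoding them, assuming that format holds for every such field (the helper name `standardized_entities` is hypothetical, and the meaning of the empty inner values is not stated in the output above):

```python
import ast

# A subset of the assessment entry shown above.
assessment = {
    "Name": "Sample",
    "OS": "{'RHEL': {'Linux|Red Hat Enterprise Linux': ''}}",
    "Lang": "{'Java': {'Java': ''}}",
    "App Server": "{}",
    "Dependent Apps": "{'Kafka': {'Apache Kafka': ''}, 'PostGreSQL': {'PostgreSQL': ''}}",
}

def standardized_entities(entry, field):
    """Map each raw mention in a field to the list of entity names it resolved to."""
    # literal_eval only evaluates literals, so it is safe for untrusted strings.
    mapping = ast.literal_eval(entry[field])
    return {mention: list(entities) for mention, entities in mapping.items()}

for field in ("OS", "Lang", "App Server", "Dependent Apps"):
    print(field, "->", standardized_entities(assessment, field))
```

An empty result, as for "App Server" here, means no entity was recorded for that field in this response.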

Update KG utils to generate files for containerization API

This will involve connecting to DB and generating files required for the containerization API

Refactor ACA original KG for containerization API (Community Edition)

We need to refactor the original ACA KG to add data for the community edition. We also need to generate the enterprise edition for GBS consumption.

Add unit tests for containerization API

Containerization API

Remove Version Complexity from the containerization part

We don't need to release version complexity as of now. We can remove it entirely.

Documentation needed to interface with TCA from Tackle Inventory

Hi @kaliaanup , I'm onboarding to the Tackle project along with @jortel and @mansam right now.

We are working to learn how the Tackle App Inventory can interface with other tools in the Tackle suite, and trying to define the requirements of the App Inventory data model.

It would be helpful to have a reference doc on the public interface of TCA so that we can get an idea of how TCA would tie in with Tackle Inventory, and what changes might be needed to the Tackle Inventory data model:

Expected inputs / outputs of TCA (examples would be useful)
Methods for invoking TCA (REST? CLI? Something else?)
Is there a Quay.io container auto-build set up so we can consume TCA easily running as a Pod on OpenShift?
Is there a deploy YAML available with a deployment definition for TCA and any required auxiliary OpenShift resources (e.g. namespaces, services, routes)? This would let us quickly play around with TCA using other info provided. We may be able to assist here if needed.

Do documents detailing the public interface of TCA exist? Even some simple examples would be useful to get started.

cc @rromannissen @jwmatthews @PhilipCattanach