adoptium / aqa-test-tools

Home of Test Results Summary Service (TRSS) and PerfNext. These tools are designed to improve our ability to monitor and triage tests at the Adoptium project. The code is generic enough that it is extensible for use by any project that needs to monitor multiple CI servers and aggregate their results.

License: Apache License 2.0

JavaScript 8.59% HTML 1.12% CSS 0.13% Shell 0.06% Batchfile 0.01% Ruby 0.06% Jupyter Notebook 89.72% Python 0.21% EJS 0.09% Dockerfile 0.01% Groovy 0.01%

aqa-test-tools's People

Contributors

2001asjad, amandanguyeen, andixiong, avishreekh, awsafsakif, dependabot[bot], dhlee49, gdams, ichristod, jiayaolinn, joeyleeeeeee97, karianna, llxia, longyuzhang, momanmalik, nadeensami, oscarqq, piyush286, prajwalborkar, renfeiw, say-droid427, smlambert, snyk-bot, sophiaxu0424, suhaasya, tommytwm, variony, xius666, yanshan24, yuehan-lin


aqa-test-tools's Issues

Add Parsers & Perf Graph for AcmeAir

Problem Description

Parsers need to be added for the following benchmarks so that they can be monitored using the existing tools (Perf Graphs) and tools under development (Tabular View), allowing the summarization and visualization of the perf results.

Benchmark Parsers to Add

Acmeair: https://github.com/acmeair/acmeair
Octane: https://chromium.github.io/octane/

Steps for Adding a Parser

I'm adding some steps here that I provided in the internal GitHub so that others can refer to this documentation when adding more parsers in the future.

Our parsers in Test Result Summary Service (https://github.com/AdoptOpenJDK/openjdk-test-tools/tree/master/TestResultSummaryService/parsers) expect all the results to be in the Jenkins output so that we don't need to parse results from different files. Hence, we might need to copy the results from some log file into the main Jenkins output. If TRSS is monitoring that Jenkins pipeline and it's marked as "Perf" in the build monitoring list (as we discussed in our meeting), it will parse all the results once the builds are done.

All Jenkins builds should output the following info (Testci string, Benchmark Name, Benchmark Variant, Product) at the start of each benchmark build.

Sample Output:

echo "********** START OF NEW TESTCI BENCHMARK JOB **********"
echo "Benchmark Name: LibertyStartupDT Benchmark Variant: 17dev-4way-0-256-qs"
echo "Benchmark Product: jdk8u181-b13-openj9-0.9.0"

We use the following regexes in TRSS for parsing that info as shown below:

const benchmarkDelimiterRegex = /[\r\n]\*\*\*\*\*\*\*\*\*\* START OF NEW TESTCI BENCHMARK JOB \*\*\*\*\*\*\*\*\*\*[\r\n]/;
const benchmarkNameRegex = /[\r\n]Benchmark Name: (.*) Benchmark Variant: .*[\r\n]/;
const benchmarkVariantRegex = /[\r\n]Benchmark Name: .* Benchmark Variant: (.*)[\r\n]/;
const benchmarkProductRegex = /[\r\n]Benchmark Product: (.*)[\r\n]/;

For startup and footprint, we start an app multiple times (usually 8 times). Before each run, we print "Warm run <Iteration#>" (i.e. outerRegex), which works as an outer regex to split the output into smaller sections to parse. Then we use another regex (i.e. regex) to parse the metric value from each smaller section, so we can record results for each iteration. You can see sample parsed data here (#73 (comment)).

For throughput, we just use one regex (i.e. "Throughput: ") for AcmeAir. For startup, AcmeAir can use the parser that is already in the code.

Snippet for Liberty Startup

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/TestResultSummaryService/parsers/BenchmarkMetric.js#L13-L28
 

Sample Liberty AcmeAir Startup Job:

Warm run 0 //Outer Regex
Footprint (kb)=148944 //Inner Regex
Startup time: 4530 //Inner Regex
...
Warm run 1 //Outer Regex
Footprint (kb)=148120 //Inner Regex
Startup time: 4134 //Inner Regex

...
# Other Iterations

Sample Liberty AcmeAir Throughput Job:

Throughput: 2424.19 //Regex
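
Putting the pieces above together, here is a minimal sketch of the two-level parsing: the outer regex splits the log into per-iteration sections, and inner regexes pull each metric out of every section. The regexes and metric names below are illustrative, not the exact ones in BenchmarkMetric.js.

// Sketch only: split on the iteration delimiter, then match each metric per section.
const outerRegex = /Warm run \d+/;
const metricRegexes = {
    "Footprint in kb": /Footprint \(kb\)=(\d+)/,
    "Startup time in ms": /Startup time: (\d+)/,
};

function parseIterations(output) {
    // Everything before the first "Warm run" line is dropped by slice(1).
    const sections = output.split(outerRegex).slice(1);
    return sections.map(section => {
        const metrics = {};
        for (const [name, regex] of Object.entries(metricRegexes)) {
            const match = section.match(regex);
            metrics[name] = match ? parseFloat(match[1]) : null;
        }
        return metrics;
    });
}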

To Add a Perf Graph

For adding a perf graph widget, you can refer to one of the existing perf graphs such as ODM: https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/master/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx

Simplify Benchmark Parser Design

Problem Description

Currently, the benchmark parser design is slightly complicated since it uses terminology such as regexRepeat and outerRegex, which requires one to have a good understanding of BenchmarkParser.js in order to add parsers for any new benchmarks.

We want to simplify the design as much as possible so that one can easily add parsers with minimal knowledge of the codebase. Simplifying the parser design will become significantly more helpful once we start adding more and more benchmarks to PerfNext and the Openjdk-tests framework.

Proposed Changes

1) Remove regexRepeat

It's confusing to decide whether regexRepeat should be set to true or false, as different benchmarks require different values, as shown below.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/TestResultSummaryService/parsers/BenchmarkMetric.js#L13-L19

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/TestResultSummaryService/parsers/BenchmarkMetric.js#L29-L34

In BenchmarkParser.js, we should just use the regex to split the output into blocks and check whether there are multiple values for that metric.

2) Factor out outerRegex

Currently, each metric has its own outerRegex, which is redundant. The benchmark should hold the outerRegex instead of each metric, since the outerRegex is the same for all metrics under a benchmark. We'll stick to this design to simplify things.

As shown below, both the Footprint and Startup metrics have the same outerRegex, so it could be moved up to the benchmark level. This step would significantly reduce code size once we add more combinations of benchmarks and metrics.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/TestResultSummaryService/parsers/BenchmarkMetric.js#L13-L25
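
One possible shape for the simplified definition (a sketch only; the property names are hypothetical): the outerRegex lives once at the benchmark level, each metric carries only its own regex, and regexRepeat disappears because the parser simply splits on outerRegex and collects however many values it finds.

// Hypothetical simplified structure, not the current BenchmarkMetric.js format.
const LibertyStartup = {
    outerRegex: /Warm run \d+/,                       // one delimiter for the whole benchmark
    metrics: {
        "Footprint in kb": /Footprint \(kb\)=(\d+)/,
        "Startup time in ms": /Startup time: (\d+)/,
    },
};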

3) Soft-code Liberty

Currently, the parser for Liberty throughput is hard-coded to run #5. This could change depending on the number of warmup and measure runs. It also doesn't store the values from multiple measure runs. We need to store the value of each measure run in the value array, similar to how we do it for the Startup and Footprint metrics.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/TestResultSummaryService/parsers/BenchmarkMetric.js#L8

4) Remove redundant checks from BenchmarkParser.js

Some checks might not be needed.

5) Add detailed comments

This would save time for someone who isn't familiar with the Benchmark parser code and just wants to add regexes for new benchmarks. Also, we wouldn't need to provide instructions every time as done here: #119 (comment)

Assigned Contributors

I'll be working with Dong (@dhlee49) on the new design.

Support for Third Party Code Used in PerfNext

Before PerfNext was open-sourced, we were using a couple of third party libraries. While open-sourcing, we had to remove them when the code was moved to the open AdoptOpenJDK openjdk-test-tools repo since we shouldn't maintain third party code here.

Due to other higher priority issues and resource constraints, we had just removed the third party libraries without adding proper support for some of them.

Currently, one would be required to get those libraries from various places in order to run or deploy PerfNext directly from the openjdk-test-tools repo. Hence, we need to add proper support for that third party code by using either online-hosted libraries or npm modules so that it's easy for anyone to use PerfNext.

I'll be working on adding this support.

TRSS install throws error on got-2

I am trying to use TRSS for our daytrader3 application. When I do an npm install, I get the following error:

npm ERR! code E404
npm ERR! 404 Not Found: [email protected]

npm ERR! A complete log of this run can be found in:
npm ERR! $HOME/.npm/_logs/2018-09-24T05_17_54_494Z-debug.log

The log file says:
error code E404
error 404 Not Found: [email protected]

The node version is v10.8.0 and the npm version is v6.4.1.

TRS Server can't use Jenkins Password that has Special Characters

Relevant Code Snippet

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/414ec78d4edadf6542ba2a53357e73037494ba21/TestResultSummaryService/JenkinsInfo.js#L64-L77

JenkinsInfo.js can't use a password that has special characters such as "!" to communicate with Jenkins properly, giving a 401 Unauthorized error.

Helpful Links

https://github.com/jansepar/node-jenkins-api
We tried replacing the original code with this snippet but didn't have much luck:

// Password that needs to be %-encoded
const { URL } = require('url');
const jenkinsapi = require('jenkins-api');   // node-jenkins-api module (assumed to be installed as jenkins-api)
const jenkinsUrl = new URL('https://[email protected]');
jenkinsUrl.password = 'some_weirdPASSWORD123!@#$%^&*()~`\\/;\'';
var jenkins = jenkinsapi.init(jenkinsUrl.href);

Workaround

  1. Use a password that doesn't have special characters.
  2. Use a token instead of a password: https://stackoverflow.com/questions/45466090/how-to-get-the-api-token-for-jenkins.
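
A third option worth trying (an untested sketch, assuming the node-jenkins-api module is installed as the jenkins-api npm package): percent-encode the credentials with encodeURIComponent before embedding them in the Jenkins URL, instead of assigning the raw password to URL.password.

const jenkinsapi = require('jenkins-api');

// Sketch: encode user and password so characters like "!" survive the URL.
// The host and credentials below are placeholders.
function buildJenkinsUrl(host, user, password) {
    return 'https://' + encodeURIComponent(user) + ':' +
        encodeURIComponent(password) + '@' + host;
}

const jenkins = jenkinsapi.init(buildJenkinsUrl('jenkins.example.com', 'user', 'p@ss!word'));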

Define API for aggregate dashboard data

From #16, data that matters:
[ # pass, # fail, # excluded, # lastPass, # lastFail, # lastExcluded, platform, impl, version, testGroup ]

definitions:
TA = totalsArray is [totalPass, totalFail, totalSkipped, totalExcludes]
sdkID = combo of Jenkins Job/BuildID info and SHAs that uniquely define the sdk binary being tested (shas of CL, VM, OMR, ...)

API:
TA = getTotals(sdkID)
TA = getTotalsForGroup(testGroup, sdkID)
TA = getTotalsForPlatform(platform, sdkID)
TA = getTotalsForJDKImpl(impl, sdkID)
TA = getTotalsForLevel(impl, sdkID)
TA = getTotalsForPlatformAndGroup(platform, testGroup, sdkID)
TA = getTotalsForLevelPlatformImpl(level, platform, impl, sdkID)
TA = getTotalsForLevelGroupPlatformImpl(level, group, platform, impl, sdkID)
sdkID = getPreviousBuildID(sdkID)
sdkID = getPreviousReleaseID(sdkID)

The underlying implementation of this API can be optimized and reuse common code (most methods do the same thing: pull pass/fail/skip/excludes data from the DB and add it up to create the totalsArray). In the initial MVP, excludes data is not populated.

Note: we will use the combo of url, buildName and BuildNum as sdkID until we update the parser code to store shas of CL, VM, OMR.
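
A sketch of how the methods could share one implementation (the collection name, field names and sdkID mapping below are assumptions about the schema, not the actual one):

// Sketch: every getTotalsFor* call funnels into one query builder.
const sdkIDToQuery = ({ url, buildName, buildNum }) => ({ url, buildName, buildNum });

async function getTotals(db, sdkID, filters = {}) {
    // filters may carry platform, impl, level, testGroup, etc.
    const query = { ...sdkIDToQuery(sdkID), ...filters };
    const docs = await db.collection('testResults').find(query).toArray();
    return docs.reduce(
        (ta, d) => [ta[0] + (d.passed || 0), ta[1] + (d.failed || 0),
                    ta[2] + (d.skipped || 0), ta[3] + (d.excluded || 0)],
        [0, 0, 0, 0]   // TA = [totalPass, totalFail, totalSkipped, totalExcludes]
    );
}

const getTotalsForPlatform = (db, platform, sdkID) => getTotals(db, sdkID, { platform });
const getTotalsForGroup = (db, testGroup, sdkID) => getTotals(db, sdkID, { testGroup });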

Store Build Configuration Output For Each Benchmark Run

Problem Description

Originally, we were running multiple iterations of the benchmarks in one Jenkins build (i.e. the old design). Now, we're moving to a new design in #24 in which a parent build launches multiple child builds, each child build acting as a single benchmark iteration. Hence, we'll be running multiple child jobs instead of one Jenkins build with multiple iterations inside, so that we can interleave Jenkins builds.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/eb9c0d302759787e15afe96fe8d24b0e2b1f907c/TestResultSummaryService/parsers/BenchmarkParser.js#L5

Currently, we're just getting the output for all the iterations and not storing any output before the first ********** START OF NEW TESTCI BENCHMARK JOB **********. This is a bug with the old design since we would only show the output of the first iteration (value.tests[0]._id) when the user clicks on a Jenkins build, even though that Jenkins build may have multiple iterations, which won't be displayed. This problem will be fixed with the new design.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/eb9c0d302759787e15afe96fe8d24b0e2b1f907c/test-result-summary-client/src/Build/TopLevelBuilds.jsx#L73-L86

Proposed Changes

We should store the complete output or some useful info before the start of each benchmark (i.e. before ********** START OF NEW TESTCI BENCHMARK JOB ********** is printed). This would give us access to the output of the steps used to download and configure all the benchmark material before the actual benchmark run.

Support Plugin structure

Different users may want to add additional functions or store additional information in the database. The current design is not flexible enough to allow users to do so without changing the core structure. We need to create a plugin structure to allow users to add their own functions and features. The idea is that the application can run its main features with or without the plugins. Also, the plugin files may or may not need to be stored in this repo. The application will search the plugin folder and run all plugins available at runtime.
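
A minimal sketch of such a loader (the folder name and the plugin hook name are assumptions): the application scans a plugins folder at startup, loads whatever is there, and still runs normally when the folder is empty or missing.

const fs = require('fs');
const path = require('path');

// Sketch: load every .js file found in a plugins/ folder at runtime.
function loadPlugins(pluginDir = path.join(__dirname, 'plugins')) {
    if (!fs.existsSync(pluginDir)) return [];          // main features run fine without plugins
    return fs.readdirSync(pluginDir)
        .filter(f => f.endsWith('.js'))
        .map(f => require(path.join(pluginDir, f)));
}

// Hypothetical hook: a plugin may export onBuildParsed(buildData) to store extra data.
async function runPlugins(plugins, buildData) {
    for (const plugin of plugins) {
        if (typeof plugin.onBuildParsed === 'function') {
            await plugin.onBuildParsed(buildData);
        }
    }
}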

Add logo on the header

It would be nice to have the Adopt logo added on the header.

  • Color needs to be changed to white as the header is black
  • It should be clickable and link to the main page

BlueOcean link does not work for builds under folders

Dynamically Fetch Perf Pipeline Names & Support Multiple Metrics in Perf Graphs

Problem Description

Currently, Perf Graph uses hard-coded names for pipelines. We should dynamically get this list from the database by looking at all the perf pipelines stored in it, as is done for Tabular View.

I've shown relevant snippets from ODM graphs, but other graphs have the same issue.

Hard-coded Pipeline Names
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/628e800b2e8a7005e80a156dd0ebf2b41ae39eb4/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx#L11-L13

Also, Perf Graphs currently support the display of only one metric. We should display all metrics related to a benchmark run on the graphs. We should also have the option to switch the display of different metrics on and off, so we can limit the data for easier visualization.

Support for one metric only
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/628e800b2e8a7005e80a156dd0ebf2b41ae39eb4/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx#L111-L116
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/628e800b2e8a7005e80a156dd0ebf2b41ae39eb4/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx#L185

Furthermore, we currently assume that the supported metric is at index 0. That assumption was only valid initially, when we were parsing just one metric for ODM. As we add more metrics in the parser, the order could be different.

Assumption that no longer holds
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/628e800b2e8a7005e80a156dd0ebf2b41ae39eb4/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx#L90

Proposed Changes

  1. To fetch pipeline names, we can use the same or a similar query to the one used in Tabular View (see the sketch below).
  2. To display all metrics, we'll need to loop through the metrics array and show their data on different lines.
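
For item 1, the query could look roughly like this (a sketch; the field name follows the getTabularData.js snippet shown later in this dump, and the filter that marks a pipeline as perf is an assumption):

// Sketch: fetch distinct perf pipeline names from the database instead of hard-coding them.
async function getPerfPipelineNames(db) {
    return db.distinct("buildName", { type: "Perf" });
}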

This issue will be looked at after we add the support for aggregated data in #106 and clean up the Perf Graph code in #118.

Store HW Specific Benchmark Variables for Perf Testing

Currently, for internal testing, we use master_machine_list.xml, a file with info about each machine's HW and machine-specific benchmark variables, with PerfNext to dynamically populate HW-specific benchmark variables. We would need a similar file to enable perf testing on AdoptOpenJDK.

Some Options for Storing HW Specific Benchmark Variables

1) Use Environment Variables for Jenkins Node

https://<JenkinsURL>/computer/<machineName>/api/

Pros:
  • Everything in one place
  • Can possibly be integrated with the new machines data file generated from openjdk-jenkins-helper for PerfNext: adoptium/jenkins-helper#25
  • PerfNext and Openjdk-tests Framework can use just one file that has all info about the machine and the HW specific benchmark variables

Cons:
  • Hacky! We won't really be using the environment vars as expected. Instead of using the vars in bash (i.e. $VAR_NAME), we'll be storing the XML data and parsing that.
  • "Agent Config History" doesn't keep track of changes to environment vars.
  • The UI isn't friendly for editing configs

2) Use some existing Git Repo

Pros:
  • Clean approach
  • Git takes care of versioning
  • Can be easily edited

Cons:
  • Won't integrate with the new machines data file generated from openjdk-jenkins-helper for PerfNext: adoptium/jenkins-helper#25
  • PerfNext and Openjdk-tests Framework would need to use 2 separate files instead of just one that has all the info about the machine and the HW specific benchmark variables

Ignore invalid build url when monitoring builds

Currently, we do not check the exception message. We try to connect to the build url 5 times regardless of what kind of exception we get, and then set the status to Done.

In fact, if the error is 404 (invalid url), we need to ignore this url. But for other errors (i.e., ESOCKETTIMEDOUT), we should retry and keep the build status as NotDone. That way, we can try it again in the next round.
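
A sketch of the intended behaviour (how the error surfaces its status code is an assumption about the Jenkins client):

// Sketch: decide whether to drop a build url or retry it in the next round.
function classifyMonitorError(err) {
    const message = String((err && err.message) || err);
    if (message.includes('404')) {
        // Invalid url: ignore it and mark the build Done so we stop retrying.
        return { status: 'Done', retry: false };
    }
    // Transient errors (i.e., ESOCKETTIMEDOUT): keep the build NotDone and retry later.
    return { status: 'NotDone', retry: true };
}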

Extend Functionality for Dynamically Fetching HW Specific Environment Variables

Problem Description

For running any benchmark, we need to fetch the HW-specific variables that change depending on the machine selected. We also populate these HW-specific variables in the PerfNext GUI so that users can change the affinity and other commands if needed.


Currently, environment variables are hard-coded for just one machine. So you would need to manually change environment variables such as CLIENT, DB_MACHINE, LIBERTY_HOST and AFFINITY while running some benchmarks, especially the ones that require more than one machine, such as Liberty DayTrader.

One example of hard-coded configs:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/config/benchmarks/data_simple/Liberty.xml#L18-L31

Proposed Solution

Task 1:

We should remove all machine-specific environment variables from the various XML config files under PerfNext/config/benchmarks/data_simple, since they are already present in /config/master_machine_list.xml, the file with all the machine-specific configs that the user provides while deploying PerfNext.

Snippet from master_machine_list.xml

                    <capability id="41" name="LibertyDayTrader" bits="all">
                        <property id="1" name="client">perfxposh10G</property>
                        <property id="2" name="dbMachine">perfxposh10G</property>
                        <property id="3" name="dbHome">/home/db2inst1/</property>
                        <property id="4" name="dbName">day30r</property>
                        <property id="5" name="dbUserName">db2inst1</property>
                        <property id="6" name="appServer">DayTrader3</property>
                        <property id="7" name="dbPort">50000</property>
                        <property id="8" name="libertyPort">9080</property>
                        <property id="9" name="scriptName">tradelite.sh</property>
                        <property id="10" name="clientWorkDir">/java/perffarm/liberty</property>
                        <property id="11" name="libertyHost">kermit10G</property>
                    </capability>

Task 2:

While PerfNext was being developed internally, I had already added the majority of the functionality for fetching HW-specific environment variables, such as CPU affinity variables, by coding various functions (shown below) in PerfNext/public/lib/js/util.js. We need to extend this functionality to fetch other HW-specific environment variables that might be missing.


Snippet Code from PerfNext/public/lib/js/util.js:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/public/lib/js/util.js#L423-L635

Assigned Contributors

Awsaf (@pinicman) from my team would work on adding this functionality.

Performance Analysis Tools (Proposal from Developer JumpStart Tech Challenge)

Proposal Name: Performance Analysis Tools
Proposal Owner: Piyush Gupta / Shelley Lambert / Lan Xia
Technical Mentor: Piyush Gupta
Team: @AdamNatale @Variony @acarnegie @armaanfarhadi @kguirguis

Temporary Branch for the Development of these Features:

https://github.com/AdoptOpenJDK/openjdk-test-tools/tree/pat

Blurb (Short overview of the Proposal):

Since performance is crucial for any product that we use in our lives, developers are always striving to evaluate and boost our performance on various workloads by running the latest releases and development builds against different benchmarks and by identifying opportunities for compiler optimizations. As part of this JumpStart Challenge, we’re looking for people to brainstorm and develop a new solution that would enhance our capabilities to spot performance issues with ease.

Currently, the Performance Measurement & Analysis (PMA) team and the Runtimes test team collaborate to build new tools and infrastructure to adapt to the changing requirements of users and to open-source development concepts. These tools, such as PerfNext and Test Result Summary Service (TRSS), have been pushed to the open AdoptOpenJDK repo: https://github.com/AdoptOpenJDK/openjdk-test-tools. Under TRSS, we have a dashboard for displaying performance results from daily runs. While this dashboard has some basic functionality for displaying numbers, we could add new features for identifying and monitoring regressions and automating the investigation of these issues. This in turn would improve the efficiency of our performance monitoring and drive faster turnaround as issues are detected.

Please describe the business problem your customers (e.g. external clients, internal team, etc.) are experiencing OR the improvement/opportunity that could be brought to them.

Our PMA team manages the performance monitoring and problem investigation for Eclipse OpenJ9 (https://www.eclipse.org/openj9/) and Java releases from AdoptOpenJDK (https://adoptopenjdk.net/) on all supported hardware platforms. We are also responsible for publishing official performance scores for each Java release. Since performance is of paramount importance to the Java customers, we strive to evaluate and boost our performance on various workloads by running our latest releases and development builds against different benchmarks and by identifying opportunities for compiler optimizations.

Due to the large number of benchmark variants and platforms, it's challenging to identify performance regressions and gains. Currently, we rely on a tool called Traffic Lights that helps display the results from the performance runs. This tool is old, not very flexible, and lacks performance monitoring abilities. As a result, we need to develop a new solution that would enhance our capabilities to spot performance issues with ease.

Developers need to know quickly whether their changes cause performance regressions. The sooner this is discovered, the ‘cheaper’ it is to correct and fix the code that introduced the regression. Developers depend on PMA team to run benchmarks, measure and analyze performance results. The PMA team is understaffed and cannot possibly keep up with the growing number of requests from dev team. An effort has begun to make it MUCH easier for developers to run benchmarks and analyze results themselves. Easy-to-use tools empower developers, making them more autonomous and our projects more agile.

What is the key issue (customer pain) or benefit that motivates the need for this project?

Key issue: Performance testing is hard and not standardized, making it difficult for developers.
Key benefit: With easier tooling and approaches, we ‘crowd-source’ the task of performance measurement, empower the development team and make projects more agile.

We have some features we already want to see incorporated, which we understand are common tasks manually done by developers. Some of these include the use of profiling tools and looking at additional inputs (such as JIT or GC logs) to gather and correlate more data for problem determination.

Better data visualization of results is also an area of great interest: given the data we gathered, what is the most compelling way to represent it so that it's quickly communicated and shared with interested parties?

We need to brainstorm the features that need to be added to TRSS and then choose and implement the ones that would provide most benefit to all developers. Currently, PMA team members would be required to look at the graphs and carry out further investigations by launching some more runs and identifying the commit that might be responsible for the regression.

Developing these new tools would benefit everyone since we’ll be able to triage new regressions more easily. Having automated monitoring abilities would significantly reduce PMA team’s workload, allowing it to go deeper into the code issues and to help developers to resolve issues faster.

How might the results of the project be used after the Challenge?

Results of this challenge would be reviewed and potentially incorporated into our live tools.

What are the key technical and business goals?

Technical: Design and develop new features that would help in identifying and investigating Java performance regressions with ease

Business: Display performance results and identify regressions such that the PMA team can improve efficiency while scaling up on Eclipse OpenJ9 performance coverage. Easily articulate the benefits of our products to potential customers.

What specialized skills might be beneficial for the project?

  • Experience with web development (React, Node.js, JavaScript)
  • Ability to contribute to the new Jenkins based performance infrastructure and to develop new features to meet performance analysis needs
  • Data visualization experience
  • Statistical analysis

Store build information in a file

BuildStatus.jsx contains information about the builds that TRSS needs to monitor in the dashboard (not insert into the db).

 OpenJ9: {
        url: "https://ci.eclipse.org/openj9/",
        builds: [
            "Pipeline-Build-Test-All",
            "Pipeline-Build-Test-JDK8-linux_390-64_cmprssptrs",
...

Instead of keeping this information in a *.js file, we should keep it in a .json file. That way, changing the information does not require a rebuild, and each user can have their own file.
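
A sketch of that change (the file name and shape are assumptions; the JSON simply mirrors today's object):

// Sketch: read the monitored-build list from a JSON file at runtime, so editing it
// needs no rebuild and each user can keep their own copy.
const fs = require('fs');

function loadMonitoredBuilds(file = './buildInfo.json') {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// buildInfo.json would look like:
// { "OpenJ9": { "url": "https://ci.eclipse.org/openj9/",
//               "builds": ["Pipeline-Build-Test-All", "..."] } }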

New feature to add brief notes to a particular pipeline or job

To aid concurrent triage efforts (where several people are triaging builds at the same time), add the ability to annotate jobs in TRSS.

This can also be a location where automated search for existing issues dumps links to found issues, to further aid triage.

Extract Perf Results From Adopt Perf Builds

Currently, TRSS can't parse the perf builds running at Adopt (https://ci.adoptopenjdk.net/view/Test_perf/) as perf builds using the perf parser (i.e. BenchmarkParser). We mainly need to do the following in order to achieve that:

  1. Decide on a convention for benchmark name and variant, since Adopt tests (https://github.com/AdoptOpenJDK/openjdk-tests/tree/master/perf) just use testCaseName and don't print out the benchmark name and variant strings expected by TRSS's regexes. Only the Liberty test does so.

Since benchmarks are classified in folders on Adopt, maybe we can use the folder name or something corresponding to that for benchmark name and we can use the testCaseName as the variant. I need to look into that more.

Related Issue: adoptium/aqa-tests#1144

  2. Add missing parsers for benchmarks such as Dacapo and others.

Use Aggregated Perf Results in Perf Compare

Problem Description

With #73, we've added the ability to aggregate the perf results from multiple iterations. We should update Perf Compare to use and display aggregated perf results in order to comply with the new design.

Proposed Changes

Perf Compare should use the aggregated data for any build that's passed in the input for test or baseline build.

It should be able to support the following comparisons:

  1. Master Build vs Master Build
  2. Child Build vs Child Build
  3. Master Build vs Child Build

Since we'll be using the aggregated data for each build, Perf Compare will get a significant speed boost when displaying the results.

Data to Display:

For parent build:

Display the aggregated data for the parent and the raw data for each of its child builds.

For child build:

Display the aggregated data and the raw data for each test iteration.

Assigned Contributors

I'll be working with Sophia (@sophiaxu0424) from my team to add this feature.

Add Machine Schema for Validating Machine Data File for PerfNext

Currently, PerfNext fetches the latest machine list from a server as shown below.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/eb9c0d302759787e15afe96fe8d24b0e2b1f907c/PerfNext/app/apis/machines.js#L8-L12

PerfNext expects the machine data file to be in a certain format so that it can be parsed accordingly and be used for populating the machine list and hardware environment variables associated with them.

We need to add the machine schema that's expected by PerfNext so that it can be used to validate any machine data generated on any Jenkins server before being used with PerfNext. For example, this machine schema could be used by adoptium/jenkins-helper#25 for validation.

Add Parsers for Various Benchmark Variants from PerfNext, Perffarm & Adopt

Problem Description

Currently, TRSS supports parsing of both PerfNext and Perffarm jobs. While PerfNext builds that use Jenkins are parsed right away after runs are done, Perffarm builds are parsed when Perf Compare is used.

While TRSS does have some parsers, it doesn't have benchmark parsers for many of the benchmark variants that exist on PerfNext and Perffarm, configurations that are used frequently. As a result, we need to manually parse them for the time being.

Furthermore, even if we have parsers for some configs, we parse only some metrics and haven't added support for others, which might be important to get full performance evaluation of a build.

Some of these configs were newly added to launcher tools in order to support new variants and benchmark versions. Also, parsers could have been missed for some of the less popular benchmark variants, something that we should still add in order to extend coverage.

Benchmark Parsers to Add

Perffarm & PerfNext:

  • Different Startup and throughput variants for Liberty
  • Various ODM variants
  • HiBench
  • SPECjbb2015
  • Any other missing

Adopt:

  • BumbleBench
  • Dacapo
  • Idle Micro
  • Liberty
  • Renaissance

Proposed Changes

Assigned Contributors

I'll be working with Dong (@dhlee49) from my team on this design.

Update test parser logic

Recently some of the builds produced large output (~150M-200M).

For example:
https://ci.adoptopenjdk.net/view/all/job/openjdk11_j9_external_extended_tomcat_x86-64_linux/6/console

With #59, TRSS only stores the last 12M of output per test/build. But before storing, TRSS will process the whole output (in this case ~150M-200M) to figure out the related info. If it is a test build, TRSS will split the whole output per test and store each test's output individually.

In this case, TRSS needs to process ~150M-200M of output to figure out how many tests are within it. The test parser uses regex to find matches for tests, which is very CPU intensive when the output is large. Currently, 100% CPU is used and the UI stops responding to other requests.

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7492 root      20   0 1896376 732992  26264 R 100.0  9.0 507:41.88 node
 7259 mongodb   20   0  307916  70356  24696 S   0.7  0.9   3:35.06 mongod

Instead of running regexes over the whole output to find test output, we may need to update the test parser logic to process the output line by line and, when a test is found, store the test-related info.

If needed, we can further update the logic to stream the output and process it block by block.
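
A sketch of the line-by-line direction using Node's readline over a stream (the marker regex that starts a new test is illustrative, not the parser's real one):

const readline = require('readline');

// Sketch: walk the output one line at a time, cutting it into per-test chunks,
// instead of running regexes over the entire 150M-200M string at once.
async function splitTestsByLine(inputStream, testStartRegex) {
    const rl = readline.createInterface({ input: inputStream, crlfDelay: Infinity });
    const tests = [];
    let current = null;
    for await (const line of rl) {
        const match = line.match(testStartRegex);
        if (match) {
            if (current) tests.push(current);          // store the previous test's info
            current = { name: match[1], output: [] };
        } else if (current) {
            current.output.push(line);
        }
    }
    if (current) tests.push(current);
    return tests;
}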

Aggregate and sub-aggregate tests dashboard

Nice enhancement idea from Martijn:

Use the external tests to track Java version support amongst popular libraries and frameworks so we can also identify which ones need help.

Display a name-of-project-java-version-tested-against-<pass|fail>

Matrix such as this (with hyperlinks to actual builds):
application      status  jdkversion  implementation  platform
scala-jdk8_j9    pass    8           j9              x64_linux_docker
scala-jdk9_hs    pass    etc...

where we have columns for application, jdkversion, implementation, platform and status (so they can be sorted by each), covering all apps (elasticsearch, wildfly, etc.) for jdk8, 9, 10 and 11.

Fetch Latest Machines Data File

Problem Description

Currently, PerfNext expects the machine data file, master_machine_list.xml, to be placed at /config/master_machine_list.xml. PerfNext's backend has an API called '/api/machinesInfo', as shown below, which is used by PerfNext's frontend to get all the HW-related info about the machines available on PerfNext for performance runs. More details regarding HW specific environment variables can be found in #32.

API Call from Frontend:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/public/lib/js/benchmarks.js#L41-L45

Backend API:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/app/apis/machines.js#L14-L19

Not fetching this file directly from the original source (i.e. the build server) means it has to be updated manually every time a new machine is added to the build server or a machine-specific configuration changes. Otherwise, PerfNext would not reflect the most up-to-date machine data.

Proposed Solution

Instead of storing that machine file (i.e. master_machine_list.xml), PerfNext should download it directly from the build server (i.e. Jenkins, Axxon or something else) or wherever that machine data file might be hosted, just as PerfNext does for fetching build info in PerfNext/app/apis/builds.js.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/app/apis/builds.js#L12-L33

This solution would make sure that PerfNext always accesses and uses the latest machine data file from whatever location is specified by the PerfNext host.

The user should provide the URL for the machine data file in PerfNext/config/APP_DATA.json and set any credentials required to download it in the PerfNext/config/credentials.json file.
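
A sketch of that flow (the config keys and the use of basic auth are assumptions):

const fs = require('fs');
const https = require('https');

// Sketch: download master_machine_list.xml from the URL configured by the PerfNext host.
function fetchMachineList() {
    const appData = JSON.parse(fs.readFileSync('./config/APP_DATA.json', 'utf8'));
    const credentials = JSON.parse(fs.readFileSync('./config/credentials.json', 'utf8'));
    // machineListUrl, user and password are hypothetical key names.
    const options = { auth: credentials.user + ':' + credentials.password };
    return new Promise((resolve, reject) => {
        https.get(appData.machineListUrl, options, res => {
            let body = '';
            res.on('data', chunk => (body += chunk));
            res.on('end', () => resolve(body));
        }).on('error', reject);
    });
}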

Assigned Contributors

Awsaf (@pinicman) from my team would work on adding this functionality.

Use Common Benchmark Metric Router Design for All Perf Tools

Problem Description

Currently, TRSS has 2 parsing mechanisms for extracting benchmark metric values.

The first mechanism is used for Jenkins (used by PerfNext and Adopt builds) to parse and store results automatically when a build finishes.

The second mechanism is used for the Axxon scheduler (currently used by Perffarm) to extract numbers from a CSV results file generated by Perffarm. Some files related to this mechanism are also used in all frontend perf tools, such as Tabular View and Perf Compare.

Maintaining 2 mechanisms adds too much redundancy, making it harder to maintain and support the different tools.

Proposed Changes

  • Unify both the parsing mechanisms.
  • Add higherbetter and units to BenchmarkMetric. Currently, BenchmarkMetric doesn't have that information and we rely on BenchmarkVariants for it.
  • Get rid of BenchmarkVariants file.
  • Make an API so that all perf tools can request the BenchmarkMetric file from the backend and use it accordingly (see the sketch below).
  • Update perf tools to use that API to fetch BenchmarkMetric file instead of using BenchmarkVariants file.
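
For the API item above, a sketch of what the endpoint could return (assuming an Express-style route handler like the others under TestResultSummaryService/routes; the require path is illustrative):

// Sketch: serve BenchmarkMetric from the backend so every perf tool reads one source of truth.
const benchmarkMetric = require('../parsers/BenchmarkMetric');

module.exports = async (req, res) => {
    // higherbetter and units would now live inside BenchmarkMetric itself.
    res.json(benchmarkMetric);
};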

Assigned Contributors

I'll be working with Dong (@dhlee49) from my team on this design.

Move Duplicate Code for Perf Graphs to Common Utils Library

Problem Description

Currently, we have 3 Perf widgets for the Dashboard: DayTrader, ODM and SPECjbb2015. There is significant duplication of code between those 3 widgets, since each new widget was created by copying and modifying the code of the first widget instead of using a common library.

All perf graphs have the same purpose of displaying perf results for different benchmarks run on different platforms. Besides some benchmark-specific data, everything else is the same among those widgets.


The graphs have some minor feature differences because some of the features added by #84 were not extended to all 3 perf widgets.

As a result, it's not easy to add new widgets for new benchmarks without duplicating code from some existing widget.

Proposed Changes

We should use a library to keep the common code, in order to avoid duplicating code across different benchmark widgets. For example, utils.js currently has just one function, parseSHA, which is used in multiple widgets, but there is still plenty of scope to clean up code by moving more common code into this library.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/test-result-summary-client/src/Dashboard/Widgets/Graph/utils.js#L1-L2

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/test-result-summary-client/src/Dashboard/Widgets/Graph/ODM.jsx#L159-L160

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/test-result-summary-client/src/Dashboard/Widgets/Graph/DayTrader3.jsx#L151-L152

TRS Code Incorrectly Uses Jenkins "Build" instead of "Project" At Various Places

Test Result Summary uses the Jenkins term "build" incorrectly instead of "project" in some places, so it's easy to get confused about whether the code is actually referring to a "project" or a "build".

Sorry to be pedantic. I know it's not a big deal, but it would be good to update the wrong references so that the code is easier to understand and maintain.

Terminology

https://jenkins.io/doc/book/glossary/

Project
A user-configured description of work which Jenkins should perform, such as building a piece of software, etc.
Example: PerfNext-Pipeline, Grinder, Daily-ODM

Build:
Result of a single execution of a Project
Example: Different Builds: 1, 2, ..., 99, 100...

Job
A deprecated term, synonymous with Project.

Some references that should use "project" instead of "build":

TRSS

  • BuildMonitor.js (File Name)
  • EventHandler.js (Function Name: monitorBuild())
  • getTopLevelBuildNames.js (File Name & var: "buildName")
  • getBuildHistory.js (File Name)

TRSC

  • TopLevelBuilds.jsx (/api/getBuildHistory?buildName=${buildName}&url=${url} & builds[url][buildName] in updateData())

MongoDB

  • buildList (Collection Name)

Node Packet Manager for TRSC (yarn vs npm)

Currently, README.md for TRSS says that yarn should be used for installing modules (i.e. yarn install) but we actually have test-result-summary-client/package-lock.json, which we get when we use npm (i.e. npm install).

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/cd7fa61f2dc94b7ce67ff78275ffd38911c52cec/TestResultSummaryService/README.md#L33

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/cd7fa61f2dc94b7ce67ff78275ffd38911c52cec/test-result-summary-client/package-lock.json#L1-L100

We should either update the README.md from yarn install to npm install, or add yarn.lock and remove package-lock.json if we want to use yarn install.

Remove React Workshop Guide from TRSC README.md

README.md for TRSC (https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/master/test-result-summary-client/README.md) has been copied from https://github.com/reach/react-fundamentals-workshop/blob/master/3-state/README.md. We should remove this guide from our repo since it's already outdated, and add a hyperlink to the original README.md from reach/react-fundamentals-workshop so that everyone can refer to the most up-to-date info. This change would also free up our README.md, which the guide currently clutters, so that developers can focus on the info related to this specific project.


Ability to Configure User Specific Data such as Build Server URL

Currently, PerfCompare uses a hard-coded string http://perffarmServer as a sample Build Server URL at the following places:

  • openjdk-test-tools/test-result-summary-client/src/PerfCompare/PerfCompare.jsx
  • /openjdk-test-tools/TestResultSummaryService/routes/getPerffarmRunCSV.js

We need to use a variable instead of the http://perffarmServer placeholder in order to set the Build Server URL, which is used to fetch benchmark results from the build server. One solution could be to read user-specific information, such as URLs to various servers, from a configuration file into placeholder variables. We might need a slightly different way of doing so for the TRSS client.

This could be done similarly to how PerfNext deals with user-specific configs: PerfNext reads the user-specific data from PerfNext/config/APP_DATA.json in /openjdk-test-tools/PerfNext/app.js when the server starts. That's just an example, so we can explore better solutions.
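
A sketch of the server-side half (the config file name and key are assumptions, mirroring how PerfNext reads APP_DATA.json):

// Sketch: read the build server URL from a config file instead of the
// hard-coded http://perffarmServer placeholder.
const fs = require('fs');

const userConfig = JSON.parse(fs.readFileSync('./config/trssConfig.json', 'utf8'));
const perffarmServerUrl = userConfig.perffarmServerUrl || 'http://perffarmServer';

// getPerffarmRunCSV.js could then build its request from perffarmServerUrl
// rather than from the literal placeholder string.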

Sophia Xu (@sophiaxu0424) would be working on this feature.

Dynamically Populate Machine List

Problem Description

Currently, PerfNext displays a static list of machines available for launching benchmark runs, as shown below. PerfNext's frontend uses the /api/machines API to get the list of machines. The PerfNext host is required to put this machine list inside /config/machines.json.

API Call from Frontend:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/public/lib/js/benchmarks.js#L35-L39

Backend API:
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/app/apis/machines.js#L10-L12

Static Population of Machine List
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/public/lib/js/util.js#L4-L13


Proposed Solution

PerfNext should dynamically populate the machine-list dropdown menu depending on the selected platform, using the machine data file to search for all available machines that meet the platform and benchmark requirements. More info about the machine data file can be found in #33.
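
A sketch of the filtering step, once the machine data file has been parsed into objects (the object shape is hypothetical):

// Sketch: keep only the machines that match the platform the user selected
// and that advertise the capability the benchmark needs.
function machinesForPlatform(machines, platform, requiredCapability) {
    return machines
        .filter(m => m.platform === platform)
        .filter(m => !requiredCapability || (m.capabilities || []).includes(requiredCapability))
        .map(m => m.name);
}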

Assigned Contributors

Awsaf (@pinicman) from my team would work on adding this functionality.

Optimize Tabular View Code

In the Tabular View (#37), there are a few places where the code seems pretty costly to run. In the interest of time, and considering that this was the first commit for Tabular View, we'll revisit the code to see how we can optimize it in order to reduce CPU usage.

For example, we are using distinct to get unique values for platforms and benchmarks. distinct is expensive to run.

Snippet from getTabularData.js

const platforms = await db.distinct("buildName", query);
const benchmarks = await db.distinct("aggregateInfo.benchmarkName", query);

For more details, please refer to #131 (comment).
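
One possible direction (an unmeasured sketch, assuming the db wrapper also exposes aggregate() like the native driver): collect both sets in a single aggregation pass instead of two separate distinct calls.

// Sketch: one pass over the matching documents instead of two distinct() scans.
const [result] = await db.aggregate([
    { $match: query },
    { $group: {
        _id: null,
        platforms: { $addToSet: "$buildName" },
        benchmarks: { $addToSet: "$aggregateInfo.benchmarkName" },
    } },
]).toArray();
const platforms = result ? result.platforms : [];
const benchmarks = result ? result.benchmarks : [];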

Aggregate Perf Results From Multiple Benchmark Iterations

Problem Description

Currently, we don't aggregate numbers for multiple benchmark iterations when each Jenkins build is stored in the database. As a result, all results such as average, median and confidence interval need to be calculated when Perf Compare is used to compare 2 builds. This design is not preferred due to the following reasons:
1) It takes time to generate Perf Reports through Perf Compare.
2) Aggregated results are not stored so they would need to be generated every time they are needed even though they don't change.
3) It requires more CPU time and puts unnecessary pressure on the database.

These issues should be resolved with the proposed changes mentioned below. This would significantly improve the speed of getting results, which would be needed for different views such as Dashboard (#28) and Tabular View (#37).

Proposed Changes

  1. Move the math library from frontend to backend: https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/eb9c0d302759787e15afe96fe8d24b0e2b1f907c/test-result-summary-client/src/PerfCompare/lib/BenchmarkMath.js
  2. Generate all the aggregated results for a master build that may have a single child job or multiple child jobs, and store them in the parent object in the testResults collection (see the sketch after this list).
  • Additional data that needs to be added to Parent Object: benchmarkName, benchmarkVariant, benchmarkProduct, testData.
  • Additional data in testData for Parent Object: Aggregated numbers for all metrics: Mean, Median, Confidence Interval, Min, Max, StdDev
  • Note: For Liberty startup, there will only be 1 index in testData.metrics.[0].value for parent object.
  3. Instead of Perf Compare generating the perf numbers, it should just make a request to the backend to query the database and fetch the stored results.
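
A sketch of the per-metric aggregation itself (BenchmarkMath.js already has equivalents; this only illustrates what would be stored on the parent object, and the confidence interval here uses a normal approximation, which may differ from BenchmarkMath.js):

// Sketch: aggregate one metric's values across all child iterations.
function aggregateMetric(values) {
    const n = values.length;
    const mean = values.reduce((a, b) => a + b, 0) / n;
    const sorted = [...values].sort((a, b) => a - b);
    const median = n % 2 ? sorted[(n - 1) / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    const stddev = n > 1
        ? Math.sqrt(values.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1))
        : 0;
    const ci = 1.96 * stddev / Math.sqrt(n);   // 95% confidence interval half-width
    return { mean, median, min: sorted[0], max: sorted[n - 1], stddev, ci };
}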

Assigned Contributors

Sophia (@sophiaxu0424) from my team will work on this feature.

Tabular View for Comparing Baseline and Test Builds for Selected Platforms & Metrics

Background About Benchmarking

For benchmarking, we always launch several iterations of a benchmark with a specific build to get performance results for various metrics such as throughput and startup time. These raw numbers are not very useful on their own since they could change when the benchmark is run on another platform, when the machine state isn't identical or when the configs are slightly different. Hence, we always use a baseline to gauge the performance of a newer test build.

While comparing baseline and test builds, it's important to use a relative number (Build 1 Score / Build 2 Score) instead of an absolute number (Build 1 Score - Build 2 Score) to look at the performance gap, since the absolute difference doesn't mean much on its own, could change, and could have a significantly varying range.

We usually use this formula for the comparison:

Scenario           Example Metrics            Comparison Formula
Higher is better   Throughput                 Test Build / Baseline Build
Lower is better    Startup time, Footprint    Baseline Build / Test Build
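
In code form, the comparison reduces to one small helper (a sketch):

// Sketch: relative score in percent, following the table above.
// A result above 100% means the test build beats the baseline.
function relativeScore(testScore, baselineScore, higherIsBetter) {
    const ratio = higherIsBetter ? testScore / baselineScore : baselineScore / testScore;
    return ratio * 100;
}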

Details about the Proposed Feature

Test Result Summary (TRS) should have the ability to create and show tabular views for comparing baseline and test builds. Each view should show the relative comparison between the baseline and test build as a percentage, with one result cell per metric and platform. These result cells should be colored to classify the performance according to the color scheme below.

Color Scheme for Result Cells
These tabular views would be extremely helpful in finding regressions. I'm going to show the benefits of these tabular views with 2 examples.

Example 1:

SPEC Benchmarks

The tabular view above shows the results of all the SPEC benchmarks run on different platforms. From the results above, we can identify the following regressions easily:

  1. x64 Regression (OS: Linux, Windows, macOS) for SPECjEnterprise & SPECjbb2015
  2. Multi-benchmark Regression (SPECjEnterprise & SPECjbb2015) for x64 (Same as the 1st)
  3. Single Platform Regression (Linux s390x) for SPECjbb2005

Example 2

Micro Benchmarks

The tabular view above shows the results of all the micro benchmarks run on different platforms. From the results above, we can identify the following regressions easily:

  1. Cross-platform Linux Regression (HW: x64, ppcle64 & s390x) for ILOG ODM
  2. Power Regression (OS: Linux & AIX) for HiBench

Requirements for Tabular Views

Basic requirements of this tabular comparison view:

  1. Should show the relative comparison percent between baseline and test build for all platforms and metrics selected for that specific view.
  2. Should show the results of the latest test build against the latest runs for the baseline.
  3. Should identify the score ranges with different colors.
  4. Hovering over a result cell should show some basic information such as Java versions, confidence intervals and average scores.
  5. Open a new window with a unique URL when a result cell is clicked to show the full details of the runs, such as the score of each iteration for all metrics from those runs for both baseline and test build. This detailed view should be the same as the one shown when one clicks on the detailed view URL from the graph view, being developed for issue #28.
  6. Should be configurable to show different platforms and metrics that are selected by the user for that specific view.
  7. Ability to show the historic data for all previous weeks. This ability would help in finding the first build that showed a regression.
  8. Ability to use one baseline build with different test builds of the same platform, even though that baseline build may not have been interleaved (more details about interleaving here: adoptium/aqa-tests#850 & #24) with any of those test builds. Say you want two table views: one comparing OpenJDK8-OpenJ9 GA vs OpenJDK8-Hotspot Latest, and another comparing OpenJDK8-OpenJ9 GA vs OpenJDK8-OpenJ9 Latest. Both views use the same baseline, OpenJDK8-OpenJ9 GA. While running these 3 builds, we could have interleaved the baseline build with one of the two test builds (i.e. OpenJDK8-OpenJ9), so we wouldn't want to run the baseline again with the second test build (OpenJDK8-Hotspot) since the baseline would essentially give the same score, which saves significant machine time.

Advanced Requirements for Tabular Views

  1. Ability to show the "Best So Far" build from all the data (to be included in the Graph Timeline view as well)
  2. Ability to monitor a specific cell
  3. Ability to show the difference between the current and previous week for all cells
  4. Ability to show only the result cells that have changed since last week
  5. Ability to check and uncheck a specific cell to monitor for possible regression
  6. Ability to link a GitHub issue to one or more cells

Assigned Contributors

My team would work on adding this functionality.

Handle document size > 16M

Max document size is 16M in MongoDB https://docs.mongodb.com/manual/reference/limits/

Some of the builds produce very large output (~57M) for 4 tests:
https://ci.adoptopenjdk.net/view/all/job/openjdk11_hs_externaltest_x86-64_linux/169/console

Even though TRSS only stores one test's output per document, it is still too large to insert.

2:12:56 PM - debug: update newData url=https://ci.adoptopenjdk.net, buildNameStr=openjdk11_hs_externaltest_x86-64_linux, buildNum=169, _bsontype=ObjectID, 0=92, 1=139, 2=226, 3=78, 4=160, 5=198, 6=233, 7=26, 8=158, 9=73, 10=192, 11=212, type=Test, status=Done, timestamp=1552592655173, buildUrl=https://ci.adoptopenjdk.net/job/openjdk11_hs_externaltest_x86-64_linux/169/, buildDuration=46983740, buildResult=FAILURE, parserType=Test, machine=Jenkins, total=0, executed=0, passed=0, failed=0, skipped=0, startBy=upstream project "build-scripts/jobs/jdk11u/jdk11u-linux-x64-hotspot" build number 141, artifactory=null
2:12:56 PM - error: Exception in BuildProcessor:  message=document is larger than the maximum size 16777216, name=MongoError, stack=MongoError: document is larger than the maximum size 16777216
    at Function.MongoError.create (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb-core/lib/error.js:31:11)
    at toError (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/utils.js:139:22)
    at addToOperationsList (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/bulk/unordered.js:154:51)
    at UnorderedBulkOperation.raw (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/bulk/unordered.js:387:7)
    at bulkWrite (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/collection.js:646:12)
    at /Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/collection.js:540:5
    at new Promise (<anonymous>)
    at Collection.insertMany (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/collection.js:539:10)
    at Collection.insert (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/node_modules/mongodb/lib/collection.js:835:15)
    at OutputDB.populateDB (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/Database.js:16:25)
    at Promise.all.tests.map (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/DataManager.js:101:51)
    at Array.map (<anonymous>)
    at DataManager.updateBuildWithOutput (/Users/lanxia/workspace/ttss/openjdk-test-tools/TestResultSummaryService/DataManager.js:98:55)
    at process.internalTickCallback (internal/process/next_tick.js:77:7), driver=true

We are actively working on reducing the test output and splitting the large builds into smaller ones: adoptium/aqa-tests#834

I think TRSS should handle this case gracefully. For now, we should store the last ~12M of output (leaving some space for other data).

Parsing issue with latest build

TRSS cannot display the Test build page. I suspect this is related to a test output change. We need to check the parser code and make sure the console output can be parsed and displayed properly.

Update Benchmark Versions & Add Profiling Parameters to PerfNext Launcher

Problem Description

Currently, all benchmark configs are using older versions of the benchmarks. We should update them to the latest versions listed below.

Benchmark      Current Version    Version to Use
Liberty        19.0.0.2           19.0.0.9
ODM            8.8.1              8.10.0
SPECjbb2015    SPECjbb2015GMR     specjbb2015v101_jaxb24

Also, we recently extended the support for profiling tools from Liberty to ODM. In the future, this profiling ability will be extended to other benchmarks. As part of that work, we also added the capability in our scripts to use the profiling parameters passed by the user, or to set them to defaults if nothing is specified.

These profiling parameters should be added to the relevant benchmark configs so that users can tune these params according to their needs.

We should add all these parameters mentioned below and anything else that I might have missed:

  1. PROFILING_TOOL
  2. PROFILING_JAVA_OPTIONS
  3. PROFILING_PROFILE_TIME
  4. PROFILING_SLEEP_TIME
  5. PERF_SAMPLING_PERIOD
  6. PERF_EVENT

Proposed Work

  • Update benchmark versions
  • Add any new or missing profiling parameters to PerfNext Launcher

Selected jobs are the ones that we care about:

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b3510949c0d6f24650f48d393769ec5d10c35c2d/PerfNext/config/benchmarks/data_simple/Liberty.xml#L3
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b3510949c0d6f24650f48d393769ec5d10c35c2d/PerfNext/config/benchmarks/data_simple/Liberty.xml#L239
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b3510949c0d6f24650f48d393769ec5d10c35c2d/PerfNext/config/benchmarks/data_simple/ODM.xml#L3
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b3510949c0d6f24650f48d393769ec5d10c35c2d/PerfNext/config/benchmarks/data_simple/ODM.xml#L27

  • After updating the versions and adding new profiling options, we should test our changes with both ODM and Liberty for all the supported profiling tools for each benchmark. We should make sure to test one (i.e. DT7, any ODM) of the 2 jobs from each benchmark suite extensively (i.e. Run all the 7 currently supported profiling tools as mentioned below). For the other jobs (i.e. DT3, other ODM job), we could do some sanity testing to make sure things work.

Default Testing: Avoid setting profiling parameters unless required. You might just need to set PROFILING_TOOL. Leave the non-profiling options at their defaults.

Non-default Testing: Explicitly set all profiling parameters to some value similar to their default but not the same as the default value. For example, if the default value for PROFILING_SLEEP_TIME is 60, test it with 30 or something.

Currently Supported Profiling Tools:

  1. jprof tprof
  2. jprof scs
  3. jprof callflow
  4. jprof calltree
  5. jprof rtarcf
  6. perf stat
  7. perf record

Increase the TRSS Server efficiency by running multiple node in parallel

With the current code, we are using a single node process to run the TRSS server. As we monitor more and more projects, the TRSS server can become too busy processing Jenkins outputs. As a result, the TRSS server may not be able to respond to client requests in a timely manner.

With the current structure, we can easily split it into two node servers:

  • frontend - for responding to client requests (mainly for querying the database)
  • backend - for querying Jenkins and inserting into/updating the database.

We do not want to use fork() for this split because it would complicate the logic and add lots of if conditions, and the two tasks are very different.

Once the two node servers are created, we can further improve efficiency by leveraging the multiple processors on the machine: use fork() to create multiple backend workers and process Jenkins jobs in parallel.

Note: a flag is needed to keep the worker id and to track which job is being processed by which backend worker. Also, we should set a timeout after which we assume a worker is dead and restart it. A sketch of this worker management is shown below.
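A minimal sketch of the backend worker management, assuming a hypothetical ./backendWorker.js script, message shape, and timeout value (none of this is existing TRSS code):

// Minimal sketch only; the worker script, message protocol and timeout are assumptions.
const { fork } = require('child_process');

const WORKER_COUNT = 4;
const JOB_TIMEOUT_MS = 10 * 60 * 1000; // assume a worker is dead after 10 minutes

const workers = new Map(); // worker id -> { worker, currentJob, startedAt }

function startWorker(id) {
    const worker = fork('./backendWorker.js'); // hypothetical worker script
    workers.set(id, { worker, currentJob: null, startedAt: null });

    worker.on('message', msg => {
        if (msg.type === 'done') {
            const entry = workers.get(id);
            entry.currentJob = null; // job finished, free the worker
            entry.startedAt = null;
        }
    });

    worker.on('exit', () => startWorker(id)); // restart crashed or killed workers
}

function dispatch(id, job) {
    const entry = workers.get(id);
    entry.currentJob = job;
    entry.startedAt = Date.now();
    entry.worker.send(job); // hand the Jenkins job to the backend worker
}

// watchdog: kill workers that exceed the timeout; the 'exit' handler respawns them
setInterval(() => {
    for (const entry of workers.values()) {
        if (entry.startedAt && Date.now() - entry.startedAt > JOB_TIMEOUT_MS) {
            entry.worker.kill();
        }
    }
}, 60 * 1000);

for (let i = 0; i < WORKER_COUNT; i++) startWorker(i);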

image

In summary, there are the following steps:

  • Break node server into two: frontend and backend

  • configure forever service on the server

  • create multiple backend workers

  • update readme

Support build deletion

In order to avoid having too much data in the database, we need to support build deletion. Users need to provide # Builds to Keep when they add a new build to the build monitor list.

screen shot 2018-08-29 at 11 31 46 am

The Delete button in the build monitor list will not delete historical builds. It only removes the build from the build monitor list.

Before we insert any build into the database, the program will check the number of builds already stored. If that number exceeds # Builds to Keep, older builds will be deleted. Otherwise, deletion is skipped.
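A minimal sketch of the pruning step, assuming a MongoDB collection with hypothetical buildName and timestamp fields (not the actual TRSS schema):

// Minimal sketch only; the collection name and field names are assumptions.
async function pruneOldBuilds(db, buildName, numBuildsToKeep) {
    const collection = db.collection('testResults');
    const count = await collection.countDocuments({ buildName });
    if (count <= numBuildsToKeep) return; // nothing to delete

    // find the oldest builds beyond the keep limit and remove them
    const stale = await collection
        .find({ buildName })
        .sort({ timestamp: 1 })
        .limit(count - numBuildsToKeep)
        .toArray();
    await collection.deleteMany({ _id: { $in: stale.map(b => b._id) } });
}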

Ability to Interleave Performance Runs for Baseline & Test Builds

Background About Interleaved Runs

While doing the benchmarking comparison between different builds, it's crucial to interleave runs for both baseline and test builds in order to get the most consistent and reliable results.

For benchmarking, we always launch several iterations of a benchmark with a specific build to get performance results for various metrics such as throughput and startup time. These numbers in isolation are not very useful, since they can change when the benchmark is run on another platform, when the machine state isn't identical, or when the configs are slightly different. Hence, we always use a baseline to gauge the performance of a newer test build.

There are various machine factors that could affect the numbers between iterations, even when they run with the same configs. Interleaving runs helps avoid those issues and ensures that the same factors affect both the baseline and the test build runs. In order to keep this issue short, I won't get into the benefits and scenarios of interleaved runs.

If T = Test Build; B = Baseline Build, # = Iteration

Interleaving Run Pattern:
Do alternate iteration of each baseline and test build in a ping-pong fashion.
T1, B1, T2, B2, T3, B3

Non-interleaved Run Pattern:
Do all iterations for one build and then do all iterations for another.
T1, T2, T3, B1, B2, B3
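A minimal sketch of how the launch order for child builds could be generated (the function name is hypothetical):

// Minimal sketch only; generates the child-build launch order.
// With a baseline, test (T) and baseline (B) iterations alternate.
function launchOrder(iterations, withBaseline) {
    const order = [];
    for (let i = 1; i <= iterations; i++) {
        order.push(`T${i}`);
        if (withBaseline) order.push(`B${i}`);
    }
    return order;
}

console.log(launchOrder(3, true));  // [ 'T1', 'B1', 'T2', 'B2', 'T3', 'B3' ]
console.log(launchOrder(3, false)); // [ 'T1', 'T2', 'T3' ]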

Related Issue for Openjdk-tests Framework

adoptium/aqa-tests#850

Background About PerfNext

Currently, PerfNext does not have the capability to launch interleaved runs. It uses non-interleaved runs and launches all iterations under one Jenkins job using an iteration loop, as shown below:

iteration=0
while [ "$iteration" -lt 3 ]
do
    echo "Start of iteration $iteration"
    echo ""
    echo "********** START OF NEW TESTCI BENCHMARK JOB **********"
    echo "Benchmark Name: LibertyStartupDT Benchmark Variant: 17dev-4way-0-256-qs"
    echo "Benchmark Product: pxa6480sr6-20190123_02"
    echo ""
    # Export benchmark vars
    export JDK_OPTIONS="-Xmx256m"

    ## HW Specific Environment Vars ##
    # Export HW vars
    bash ./bin/sufp_benchmark.sh
    echo "End of iteration $iteration"
    iteration=$((iteration+1))
done

Background About TRS

Once the Jenkins build is done, TestResultSummaryService (TRSS) stores the raw Jenkins build output, which contains the output for all benchmark iterations of a build, in MongoDB. TRSS then parses the data for each iteration (in /openjdk-test-tools/TestResultSummaryService/parsers/BenchmarkParser.js) by matching the delimiter string echo "********** START OF NEW TESTCI BENCHMARK JOB **********", which is printed in the Jenkins output for each iteration, and stores the results in another MongoDB collection. These parsed numbers for the various benchmark metrics are used to calculate aggregates such as average, confidence interval, min, max and median for each build, enabling test-result-summary-client (TRSC) to display performance charts and Perf Compare to compare two builds.

Proposed Features for PerfNext:

  1. When baseline is not checked, launch the test build using a parent pipeline build that launches a child build for each iteration.
    For example, PerfNext -> Jenkins parent pipeline -> launches child pipeline builds for the test build with the sequence: T1, T2, T3
  2. When baseline is checked, launch both the test and baseline builds using a parent pipeline build that launches child builds for alternating iterations of the baseline and test builds in a ping-pong fashion.
    For example, PerfNext -> Jenkins parent pipeline -> launches child pipeline builds with the sequence: T1, B1, T2, B2, T3, B3
  3. Output the parent pipeline build URL to the user

Proposed Features for TRS:

  1. Redesign the parser to use the parent pipeline build URL to browse through each child pipeline build and parse data for each iteration
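A minimal sketch of how the redesigned parser could walk the parent pipeline; getChildBuilds() and parseIterationOutput() are hypothetical helpers passed in as parameters, not the actual TRSS API:

// Minimal sketch only; the helpers are injected so the example is self-contained.
async function parseParentPipeline(parentBuildUrl, { getChildBuilds, parseIterationOutput }) {
    const childBuilds = await getChildBuilds(parentBuildUrl); // one child build per iteration
    const iterations = [];
    for (const child of childBuilds) {
        const output = await child.getConsoleOutput();
        iterations.push(parseIterationOutput(output)); // reuse the existing per-iteration parsing
    }
    return iterations;
}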

Assigned Contributors

Members from my team, Awsaf (@pinicman) and Sophia (@sophiaxu0424), will start working on the PerfNext and TRS features, respectively.

Enable PerfNext To Use Openjdk-tests Framework

Background

Currently, PerfNext uses its own pipeline to launch benchmark jobs, a process requiring several setup steps such as downloading and setting up benchmark and SDK packages.

Proposal

PerfNext should use the Openjdk-tests framework (https://github.com/AdoptOpenJDK/openjdk-tests) via its pipeline scripts, which already provide much of the functionality PerfNext needs. This move would streamline the execution of performance tests into the same CI pipelines used for other system and functional tests.

Currently, PerfNext has an API called /api/benchengine/submit (aka BenchEngine) that generates the necessary scripts and submits the request to Jenkins.
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/b528a63ea146080f4bb1430740a734c2af87df7e/PerfNext/app/apis/BenchEngine/parser.js#L42-L54

As shown above, the /api/benchengine/submit API currently sends two scripts to the Jenkins pipeline:
  • setupScript: performs all the setup tasks, such as downloading the benchmark and SDK packages.
  • benchmarkScript: exports all the necessary environment variables and runs the main benchmark script.

We could get rid of setupScript since the Openjdk-tests framework can already handle that. We would still need the ability to generate benchmarkScript, so that developers can change default configs and PerfNext can pass the custom benchmark script to the Openjdk-tests framework to run.

Details

I'll be adding more details to this issue soon.

Show Aggregated Perf Results in Perf Graph View

Problem Description

With #73, we've added the ability to aggregate the perf results from multiple iterations. We should update Perf Graph View to use and display aggregated perf results in order to comply with the new design.

Proposed Changes

Besides the info that's already displayed, show all aggregated data for each build:

  • max
  • min
  • median
  • stddev
  • CI
  • iteration
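A minimal sketch of how these aggregates could be computed from the per-iteration values; the 95% confidence interval below uses a normal approximation expressed as a percentage of the mean, which is an assumption rather than TRSS's exact formula:

// Minimal sketch only; `values` is a plain array of per-iteration metric values.
function aggregate(values) {
    const n = values.length;
    const mean = values.reduce((a, b) => a + b, 0) / n;
    const sorted = [...values].sort((a, b) => a - b);
    const median = n % 2 ? sorted[(n - 1) / 2]
                         : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    const stddev = Math.sqrt(values.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1));
    const ci = (1.96 * stddev) / Math.sqrt(n) / mean * 100; // half-width, as % of mean
    return { max: Math.max(...values), min: Math.min(...values),
             mean, median, stddev, ci, iteration: n };
}

console.log(aggregate([101.2, 99.8, 100.5, 98.9]));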

Assigned Contributors

I'll be working with Sophia (@sophiaxu0424) from my team to add this feature.

Delay Generation of HW Specific Variables

Problem Description

Currently, we fetch the latest machine data file every time PerfNext is loaded. If a benchmark is selected before the machine data has been fetched (fetching can take a few seconds), we can get an error while generating the HW environment variables.

Error

image

parser.js: Entering generateHWENV()
/Users/piyush/Work/Git/openjdk-test-tools/PerfNext/app/apis/BenchEngine/parser.js:453
	    var envVar = HW_ENV[property].$.name;
	                                    ^

TypeError: Cannot read property 'name' of undefined
    at generateHWENV (/Users/piyush/Work/Git/openjdk-test-tools/PerfNext/app/apis/BenchEngine/parser.js:453:38)
    at /Users/piyush/Work/Git/openjdk-test-tools/PerfNext/app/apis/BenchEngine/parser.js:393:23
    at /Users/piyush/Work/Git/openjdk-test-tools/PerfNext/app/apis/BenchEngine/parser.js:423:9
    at FSReqWrap.readFileAfterClose [as oncomplete] (internal/fs/read_file_context.js:53:3)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] start: `node app.js`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/piyush/.npm/_logs/2019-04-01T23_40_41_512Z-debug.log
Piyushs-MacBook-Pro:PerfNext piyush$

Related Snippets

Frontend: Call to Backend for Getting Machine List
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/PerfNext/public/lib/js/benchmarks.js#L34-L40

Backend: Get machine list
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/PerfNext/app/apis/machines.js#L7-L24

Snippet to Generate HW Specific Variables
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/524d16e3784c17f4af6cee75f9105eb929397792/PerfNext/app/apis/BenchEngine/parser.js#L448-L460

Proposed Changes

We need a workaround to prevent this issue, for example by delaying the generation of the HW-specific environment variables until the machine data has been fetched.
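A minimal sketch of one possible approach; fetchMachineData() is a hypothetical function passed in as a parameter and the HW_ENV property access mirrors the error above, so this is not the actual PerfNext code:

// Minimal sketch only; the fetch function is injected so the example is self-contained.
let machineDataPromise = null;

function getMachineData(fetchMachineData) {
    // start the fetch once and let every caller await the same promise
    if (!machineDataPromise) {
        machineDataPromise = fetchMachineData();
    }
    return machineDataPromise;
}

async function generateHWENVSafely(property, fetchMachineData) {
    const HW_ENV = await getMachineData(fetchMachineData); // wait for the machine data
    if (!HW_ENV[property]) {
        throw new Error(`Unknown HW property: ${property}`); // fail clearly instead of crashing
    }
    return HW_ENV[property].$.name;
}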
