
Comments (16)

llxia commented on June 12, 2024

from aqa-test-tools.

piyush286 commented on June 12, 2024

Currently, we look at the Pipeline name to classify the builds as shown below.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/be9fbd0d28fb7a0b6a82e85aec4702bee42a9113/TestResultSummaryService/parsers/ParentBuild.js#L39-L44

Both perf and non-perf pipelines on Adopt start with the Test- string, and hence even the perf jobs would be classified as test. Example of an Adopt perf pipeline name (https://ci.adoptopenjdk.net/view/Test_perf/): Test_openjdk8_j9_sanity.perf_x86-64_linux. According to the current design, we expect perf builds to start with PerfNext, which should be changed to adhere to Adopt's naming convention of all pipeline names starting with Test-.

Test and perf builds use different parsers. Currently, we can only call one of them and not both. Perf builds should actually be a subset of the test builds, for which we should call both the test parser and the perf parser.

Partial Solution

    if (buildName.match(/^PerfNext/) || buildName.match(/.*perf_.*/)) {
        type = "Perf";
    } else if (buildName.match(/^Test-/)) {
        type = "Test";
    }

The snippet above would only be a partial solution, since it would make perf builds use the perf parser instead of the test parser, and hence the functional results of the perf builds would not show up on the Result Grid view and other views.

Proposed Changes

We need to change the existing design so that a perf build can first be parsed as a test build to get the functional results (i.e., # of passed and failed jobs, etc.) and then use the additional perf parser to get the perf results.
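As a sketch of that flow (the function names here are illustrative, not the actual TRSS API), every Test/Perf build would go through the test parser, and builds whose pipeline name marks them as perf would additionally go through the perf parser:

```javascript
// Illustrative sketch only: classify a build from its pipeline name,
// then run the test parser for every Test/Perf build and the perf
// parser on top of it for Perf builds.
function classifyBuild(buildName) {
    if (buildName.match(/^PerfNext/) || buildName.match(/perf_/)) {
        return "Perf";
    } else if (buildName.match(/^Test/)) {
        return "Test";
    }
    return "Build";
}

function parseBuild(buildName, output, testParser, perfParser) {
    // Functional results (passed/failed counts, etc.) always come
    // from the test parser.
    const result = testParser(output);
    // Perf builds additionally get benchmark metrics.
    if (classifyBuild(buildName) === "Perf") {
        result.perfData = perfParser(output);
    }
    return result;
}
```

With this shape, a perf build keeps its functional results and just gains an extra perfData field.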


piyush286 commented on June 12, 2024

Updated Proposed Changes

Had a discussion with @llxia. These are some of the changes that we'll need to make.

https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/6e18606c88eefe4e2b984a1094e0a4e027444b13/TestResultSummaryService/parsers/BenchmarkParser.js#L5-L11

  • In Test.js, after all relevant test data has been parsed, check the pipeline name to see whether it has "perf" in it. If it doesn't, do nothing extra. Else, send the testStr to the BenchmarkParser and extract the relevant data.
  • Push the test data along with the additional perf data.

Relevant snippet from Test.js (see next comment by Lan):
https://github.com/AdoptOpenJDK/openjdk-test-tools/blob/6e18606c88eefe4e2b984a1094e0a4e027444b13/TestResultSummaryService/parsers/Test.js#L67-L88

  • Test rigorously to see nothing's broken
  • Update PerfNext to print the new marker lines that the parser's regexes expect, such as Running test, instead of the old ones such as START OF NEW TESTCI BENCHMARK JOB
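As a rough sketch (the exact regex TRSS ends up using may differ), a Running test line could be matched like this to recover the test case name:

```javascript
// Illustrative only: extract the test case name from an Adopt
// "Running test" log line.
const runningTestRegex = /Running test (\S+)/;

function extractTestCaseName(line) {
    const m = line.match(runningTestRegex);
    return m ? m[1] : null;
}
```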


piyush286 commented on June 12, 2024

Some TODOs that were discovered while working on another PR:

#141 (comment)

BenchmarkParser.js
  • We need to consider the case where 'tests' would have multiple benchmarks/variants and related metrics data inside.
DataManagerPerf.js and DataManagerAggregate.js
  • We need to consider the case when aggregateInfo has multiple elements inside it, like different benchmarkName / benchmarkVariant sets.
Parser.js / Test.js
  • Will use the extractJavaVersion and extractJdkDate functions to get the JDK version and date for Test builds in a future PR when we enable the parsing of Adopt builds.


piyush286 commented on June 12, 2024

More details about design:

Benchmark Name and Variant

  • For PerfNext, we already set the benchmark name and variant. For Adopt builds, we don't really have both, so in that case we'll use the testCaseName that's printed in the Running test line. For example, for OpenLibertyStartupDT7_0, benchmarkName would be OpenLibertyStartupDT7 and benchmarkVariant would be 0.
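That naming rule can be sketched as splitting the testCaseName at its last underscore (an illustrative helper, not existing TRSS code):

```javascript
// Illustrative sketch: split an Adopt testCaseName at its last
// underscore into benchmark name and variant.
function splitTestCaseName(testCaseName) {
    const i = testCaseName.lastIndexOf("_");
    return {
        benchmarkName: testCaseName.slice(0, i),
        benchmarkVariant: testCaseName.slice(i + 1),
    };
}
```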

Parser to Use

    "OpenLibertyStartupDT7": {
        "0": "LibertyStartup"
        // No need for "1": "LibertyStartup" for the reason mentioned below
    },

  • In order to avoid adding variants 0, 1, 2 and so on, we should just use 0 to find the parser, since all variants would have the same parser. That way, we wouldn't have to update our TRSS parser code every time a new variant for a test case is added in the openjdk-tests repo. For each test, there would always be a 0 by default, so we would need to add support only once whenever a new test case is added on Adopt.
  • To find the parser to use for a test (i.e. OpenLibertyStartupDT7_11), we should check whether its variant (i.e. 11) is a number (even though it's stored as a string in the database). If so, we should use "0" to get to the required parser. For example, OpenLibertyStartupDT7_0, OpenLibertyStartupDT7_5, OpenLibertyStartupDT7_11 and OpenLibertyStartupDT7_1 should all go to the parser for OpenLibertyStartupDT7_0.
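A sketch of that variant-"0" lookup rule (the map shape mirrors the snippet above; benchmarkParserMap and getParserName are made-up names):

```javascript
// Illustrative sketch of the variant-"0" lookup rule: any purely
// numeric variant resolves to the parser registered under "0".
const benchmarkParserMap = {
    OpenLibertyStartupDT7: { "0": "LibertyStartup" },
};

function getParserName(benchmarkName, benchmarkVariant) {
    const variants = benchmarkParserMap[benchmarkName];
    if (!variants) return null;
    // Variants are stored as strings in the database; map numeric
    // ones down to "0" before the lookup.
    const key = /^\d+$/.test(benchmarkVariant) ? "0" : benchmarkVariant;
    return variants[key] || null;
}
```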


sophiaxu0424 commented on June 12, 2024

This is the result of one of the parsed Jenkins builds. Under tests, the first and last elements are the pre-test and post-test results; the others are the test/perf functional results. If a perf test has been applied, we would have the variables benchmarkName, benchmarkVariant and testData.

[Screenshot: parsed Jenkins build result]

Detailed testData:

[Screenshot: detailed testData]


sophiaxu0424 commented on June 12, 2024

Attached the original testResult and the one with my code added. No effect on it.
Before:
[Screenshots: testResult before the change]

After:
[Screenshots: testResult after the change]

Link to JSON text files:

before.txt: https://ibm.ent.box.com/file/608025533273
after.txt: https://ibm.ent.box.com/file/608025163537

Result:
Except for the ObjectId and the newly added variables, the whole tests frame was not affected.


sophiaxu0424 commented on June 12, 2024

Also did not see any effect on the test numbers here:

[Screenshot: test numbers]


llxia commented on June 12, 2024

Just want to confirm: the numbers are the same before and after the change for both perf and non-perf builds?


sophiaxu0424 commented on June 12, 2024

Yes, there's no difference in the numbers before and after my code changes for both non-perf and perf builds. The numbers basically come from the testSummary, and I compared its frame before and after my code and did not see changes.


piyush286 commented on June 12, 2024

Initial Design for Aggregating Multiple Benchmarks in One Jenkins Build

AggregateInfo

[
    {
        "benchmarkName" : "bumbleBench-ArrayListSortCollectionsBench",
        "benchmarkVariant" : "0",
        "metrics" : [ 
            {
                "name" : "Score",
                "value" : {
                    "mean" : 1990239,
                    "max" : 1990239,
                    "min" : 1990239,
                    "median" : 1990239,
                    "stddev" : 0,
                    "CI" : 0,
                    "validIterations" : 1
                }
            }, 
        ]
    },
    {
        "benchmarkName" : "bumbleBench-ArrayListSortComparatorBench",
        "benchmarkVariant" : "0",
         ...........
    },
] 

benchmarkAggregate

[
    {
        "bname" : "bumbleBench-ArrayListSortCollectionsBench",
        "bvariant" : "0",
        "benchmarkMetricsCollection" : {
            "Score" : [ 
                1990239
            ]
        }
    },
    {
        "bname" : "bumbleBench-ArrayListSortComparatorBench",
        "bvariant" : "0",
         ...........
    },
]

More Efficient Design for Aggregating Multiple Benchmarks in One Jenkins Build

We should just store the raw values inside rawValues and add it to aggregateInfo instead of using a separate data structure, a design that would save us from storing a lot of duplicate info such as benchmark names, variants and metric names. If it's not too much work, we should also rename value to statValues, since value isn't a very descriptive name for what it stores.

AggregateInfo

[
    {
        "benchmarkName" : "bumbleBench-ArrayListSortCollectionsBench",
        "benchmarkVariant" : "0",
        "metrics" : [ 
            {
                "name" : "Score",
                "statValues" : {
                    "mean" : 1990239,
                    "max" : 1990239,
                    "min" : 1990239,
                    "median" : 1990239,
                    "stddev" : 0,
                    "CI" : 0,
                    "validIterations" : 1
                },
                "rawValues" : [
                       1990239,
                       1990245,
                       ...........
                ]
            }, 
        ]
    },
    {
        "benchmarkName" : "bumbleBench-ArrayListSortComparatorBench",
        "benchmarkVariant" : "0",
         ...........
    },
] 
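A sketch of how statValues could be derived from rawValues (an illustrative helper, not existing TRSS code; the CI field is omitted here since it depends on the chosen confidence-interval method):

```javascript
// Illustrative sketch: aggregate raw iteration values into the
// statValues shape shown above (CI omitted).
function computeStatValues(rawValues) {
    const sorted = [...rawValues].sort((a, b) => a - b);
    const n = sorted.length;
    const mean = sorted.reduce((sum, v) => sum + v, 0) / n;
    // Median: middle element, or average of the two middle elements.
    const median = n % 2
        ? sorted[(n - 1) / 2]
        : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    // Population standard deviation over the valid iterations.
    const stddev = Math.sqrt(
        sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n);
    return {
        mean,
        max: sorted[n - 1],
        min: sorted[0],
        median,
        stddev,
        validIterations: n,
    };
}
```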


sophiaxu0424 commented on June 12, 2024

This is the new database frame for tests (no change) and aggregateInfo (changed) based on the new design above:

tests (3 different benchmarks with 2 iterations each):
[Screenshot: tests frame]

New aggregateInfo design (3 elements based on the 3 benchmarks, each including the benchmark information and its related statistic values and raw values):
[Screenshot: aggregateInfo frame]


piyush286 commented on June 12, 2024

@sophiaxu0424 Thanks for the quick change :) Let's make the values plural: statValues and rawValues. Thanks!


sophiaxu0424 commented on June 12, 2024

New design for aggRawMetricValues, changed from an array to a JSON object. (This is only displayed here for illustration; it does not exist in the database):

[Screenshot: aggRawMetricValues as a JSON object]


piyush286 commented on June 12, 2024

@sophiaxu0424 The data should still follow the structure mentioned in #136 (comment). Changing our logic as mentioned in #179 (comment) shouldn't really change that. We don't need to store aggRawMetricValues.

Update: Nvm! Missed reading

display for showing directly here, not existing in database


piyush286 commented on June 12, 2024

Closing this since it's complete.

