complianceascode / content-test-filtering

System for Automatic Filtering of Tests for ComplianceAsCode project

content-test-filtering

Automatic tests filtering for ComplianceAsCode/content project.

The project analyses the changed files, decides from those changes what has been affected, and finally selects what should be tested.

Requirements

Python 3.4 or newer
Git 1.7 or newer
GitPython (Python package)
PyYAML (Python package)
DeepDiff (Python package)
Jinja2 (Python package)
xmldiff (Python package)

Installation

Fedora/RHEL:

yum install python3 git
pip3 install gitpython PyYAML deepdiff Jinja2 xmldiff
git clone https://github.com/mildas/content-test-filtering

WARNING: Analysis of Jinja files requires the ComplianceAsCode dependencies. See the ComplianceAsCode Developer Guide.

Usage

How to obtain changes:

Pull request number - get changes from a ComplianceAsCode/content PR
Branch name - name of a branch

positional arguments:
  {pr,branch}  Subcommands: pr, branch
    pr         Compare base against selected pull request
    branch     Compare base against selected branch

Both options have common optional arguments:

  -h, --help            show this help message and exit
  --base BASE_BRANCH    Base branch against which we want to compare, default is "master" branch
  --repository REPOSITORY_PATH
                        Path to ComplianceAsCode repository. If not provided, the repository will be cloned into
                        /tmp folder.
  --remote_repo REMOTE_REPO
                        Remote repository for pulling, updating and finding branches (Pull requests). Default is
                        https://github.com/ComplianceAsCode/content.
  --local               Do not pull from remote, use local branches. Mainly for testing purposes.
  --verbose             Increase verbose level of logging to DEBUG level. Default level is INFO
  --output-tests        Output only list of tests. Completely turns off all other outputs.
  --output-format {raw,markdown}
                        Output format.
  --output {commands,json}
                        Output from the tool.
  --profile             Print only profile tests.
  --rule                Print only rule tests.

Remote vs local analysis

By default, changes are obtained from the remote repository. When no optional arguments are passed, the ComplianceAsCode/content repository is cloned, the branch with the changes is fetched, and the changes are then analysed locally.

To skip cloning and fetch only the changes, pass --repository. To analyse local changes, use the branch subcommand (pull request numbers are unknown in a local repository) together with --local (disables branch fetching) and --repository (path to the local repository).
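
The local-analysis invocation described above can be sketched as an argument list. This is only an illustration: build_local_command is a hypothetical helper, and placing the branch name last is an assumption based on the reproducer commands on this page (which put the PR number last), not something the usage text states.

```python
def build_local_command(repository_path, branch_name, base_branch="master"):
    """Build the argv for a purely local branch analysis (hypothetical helper)."""
    return [
        "python3", "content_test_filtering.py",
        "branch",                         # PR numbers are unknown in a local repository
        "--base", base_branch,            # "master" is the documented default
        "--local",                        # do not fetch branches from the remote
        "--repository", repository_path,  # path to the local repository
        branch_name,                      # assumed to be the trailing positional argument
    ]


if __name__ == "__main__":
    print(" ".join(build_local_command("/path/to/content", "my-feature-branch")))
```

The list form can be passed directly to subprocess.run without going through a shell.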

Output format

By default, the project prints human-readable output with the identified rule and profile changes and the recommended tests to execute. GitHub comments support Markdown, so --output-format markdown formats the output for posting as a comment.

For machine-readable output, use --output json. It prints separate output for profile changes and for rule-related changes:

profile changes - profile and product keys
rule changes - rule, product, bash, and ansible keys
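
A consumer of the JSON output could tell the two kinds of records apart by their keys. The sketch below assumes each record arrives as one JSON object; the key names come from the list above, but the value types used in the example are guesses, and classify_record is a hypothetical helper rather than part of the tool.

```python
import json

# Key sets taken from the README; everything else here is an assumption.
PROFILE_KEYS = {"profile", "product"}
RULE_KEYS = {"rule", "product", "bash", "ansible"}


def classify_record(raw_json):
    """Return 'rule', 'profile', or 'unknown' for one JSON object."""
    keys = set(json.loads(raw_json))
    if RULE_KEYS <= keys:
        return "rule"
    if PROFILE_KEYS <= keys:
        return "profile"
    return "unknown"


if __name__ == "__main__":
    # Example values are illustrative only.
    example = '{"rule": "audit_rules_immutable", "product": "fedora", "bash": "True", "ansible": "False"}'
    print(classify_record(example))
```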

Tests

Tests are located in the tests/ directory and use BATS.

For each test script, a testing repository is prepared. Each test scenario has a setup phase and a clean-up phase: setup creates a temporary file for the changes, and clean-up restores the repository to its previous state.

content-test-filtering's Issues

Traceback when analysing Bash remediation of `rsyslog_files_permissions`

token_before: "s/[\" token_after: "s/[\"

Traceback (most recent call last):
  File "/home/ggasparb/workspace/github/content-test-filtering/content_test_filtering.py", line 45, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/BashAnalysis.py", line 123, in process_analysis
    self.analyse_bash()
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/BashAnalysis.py", line 91, in analyse_bash
    token_before = tokens_before.get_token()
  File "/usr/lib64/python3.9/shlex.py", line 109, in get_token
    raw = self.read_token()
  File "/usr/lib64/python3.9/shlex.py", line 191, in read_token
    raise ValueError("No closing quotation")
ValueError: No closing quotation

The problem is the backslash that escapes the double quotes. The bash content that is being parsed is this one:
https://github.com/ComplianceAsCode/content/blob/fc6bdac51301debae58900bac597183cd7039620/linux_os/guide/system/logging/ensure_rsyslog_log_file_configuration/rsyslog_files_permissions/bash/shared.sh#L39

CLEANED_PATHS=$(sed -e "s/[\"')]//g; /\\/etc.*\.conf/d; /\\/dev\\//d" <<< "${FILTERED_PATHS}")

CTF only partially identifies the token, probably because of this unusual sed command construction.

Reproducer: python3 content_test_filtering.py pr --rule --output json 7715

I don't know whether we can work around this here or whether this is a legitimate problem with the content. Either way, we could catch the ValueError exception so that it doesn't crash.
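
Catching the exception, as suggested above, might look like the following. This is a minimal sketch and not the project's code: tokenize_or_none is a hypothetical helper, and returning None so the caller can fall back to treating the whole file as changed is just one possible policy.

```python
import shlex


def tokenize_or_none(source):
    """Tokenize shell-like text; return None when shlex cannot parse it."""
    lexer = shlex.shlex(source)
    try:
        return list(lexer)
    except ValueError:
        # Raised, for example, as "No closing quotation" when shlex
        # misreads the quoting of the input.
        return None


if __name__ == "__main__":
    print(tokenize_or_none("echo hello"))       # ['echo', 'hello']
    print(tokenize_or_none('echo "unclosed'))   # None
```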

Traceback when adding CentOS Stream 9 derivative product to CaC/content

./ctf/content_test_filtering.py pr --rule --output json 7878

Traceback (most recent call last):
  File "/__w/content/content/./ctf/content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/__w/content/content/ctf/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/__w/content/content/ctf/ctf/analysis/OVALAnalysis.py", line 215, in process_analysis
    self.diff_struct.add_changed_product_by_rule(self.rule_name, msg=msg)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 122, in add_changed_product_by_rule
    product_name = self.get_rule_products(rule_name)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 89, in get_rule_products
    with open(ruleyml_path) as f:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Error: Process completed with exit code 1.

Path to test_suite.py needs dir prefix

In ComplianceAsCode/content#5609, the content-test-filtering instructed to run:

build_product fedora
test_suite.py rule --libvirt qemu:///system test-suite-vm --remediate ansible --datastream build/ssg-fedora-ds.xml audit_rules_immutable

The second command should probably be:

tests/test_suite.py rule --libvirt qemu:///system test-suite-vm --remediate ansible --datastream build/ssg-fedora-ds.xml audit_rules_immutable

Note: I don't know how --remediate ansible is actually working. The option I see in test_suite.py is --remediate-using.

Traceback when analysing OVAL from shared/checks/oval directory

python3 ./ctf/content_test_filtering.py pr --verbose --rule --output json 7635

DEBUG   OVALAnalysis           - Analyzing OVAL file /shared/checks/oval/var_firewall_package_is_nftables.xml
DEBUG   OVALAnalysis           - Rule name: var_firewall_package_is_nftables
DEBUG   DiffStruct             - rule.yml was not found for var_firewall_package_is_nftables
Traceback (most recent call last):
  File "/__w/content/content/./ctf/content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/__w/content/content/ctf/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/__w/content/content/ctf/ctf/analysis/OVALAnalysis.py", line 235, in process_analysis
    self.diff_struct.add_changed_product_by_rule(self.rule_name, msg=msg)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 122, in add_changed_product_by_rule
    product_name = self.get_rule_products(rule_name)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 89, in get_rule_products
    with open(ruleyml_path) as f:
TypeError: expected str, bytes or os.PathLike object, not NoneType
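
The debug log shows that rule.yml was not found, so the path passed to open() is None. One way to avoid the crash is to check the path first. The function below is a hypothetical stand-in for get_rule_products, and the comma-separated prodtype format is an assumption.

```python
import re


def read_prodtype(ruleyml_path):
    """Return the products listed in a rule.yml, or None if unavailable."""
    if ruleyml_path is None:
        # rule.yml was not found, so there is nothing to open.
        return None
    with open(ruleyml_path) as f:
        for line in f:
            match = re.match(r"^prodtype:\s*(.+)$", line)
            if match:
                # Assumed format: "prodtype: product1,product2"
                return [p.strip() for p in match.group(1).split(",")]
    return None
```

Callers then need to handle a None result, for example by falling back to a default product.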

Feature request: Handle recursive macros

See example code of recursive macros at:

ComplianceAsCode/content@bef2156

And use case:
ComplianceAsCode/content#10534

Addition of test scenarios in a pull request should have its rule reported as changed

This PR ComplianceAsCode/content#7313 adds a new templated rule and test scenarios. It's a good idea to report the rule id in the JSON output format in this case so the rule is tested by the SSGTS GH integration.

[RFE] Use openscap-ci credentials when adding new comments

Unbind from your personal account

Traceback when analysing jinja macros

python3 ./content_test_filtering.py pr --output-format markdown 7468

Traceback (most recent call last):
  File "/home/ggasparb/workspace/github/content-test-filtering/./content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/JinjaAnalysis.py", line 293, in process_analysis
    self.analyse_macros(changed_macros)
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/JinjaAnalysis.py", line 156, in analyse_macros
    self.analyse_macros_in_rules(macros)
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/JinjaAnalysis.py", line 117, in analyse_macros_in_rules
    del default_macros[macro.name]
KeyError: '_make_bash_variable_assignment'
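
The KeyError comes from deleting a key that is not in the dictionary. A tolerant removal could use dict.pop with a default, as sketched below. remove_macros is a hypothetical helper; whether silently skipping unknown macros is the right behaviour here is an open question.

```python
def remove_macros(default_macros, macro_names):
    """Remove the named macros, ignoring names that are not present."""
    for name in macro_names:
        # pop() with a default never raises KeyError, unlike "del d[key]".
        default_macros.pop(name, None)
    return default_macros


if __name__ == "__main__":
    macros = {"known_macro": "..."}
    print(remove_macros(macros, ["known_macro", "_make_bash_variable_assignment"]))
```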

RFE - Option to skip rules not applicable for RHEL

When reviewing a PR which changed rules only applicable to OL, through an explicit definition of prodtype: ol7, it was noted that CI tests only relevant for RHEL and Fedora were executed and failed with the following output:

---------Bash Remediation Logs---------
ERROR - Rule 'xccdf_org.ssgproject.content_rule_accounts_authorized_local_users_sidadm_orasid' isn't present in benchmark 'xccdf_org.ssgproject.content_benchmark_RHEL-8' in '/tmp/ssgts-ds-gzmbdvrc'
ERROR - No tests found matching the rule ID(s) 'accounts_authorized_local_users_sidadm_orasid, java_jre_insecure_prompt_locked, java_jre_enable_jws, java_jre_security_revocation_check_locked, java_jre_validation_crl_locked, java_jre_validation_ocsp, java_jre_deployment_config_mandatory, java_jre_lock_untrusted_sources, java_jre_deployment_properties_exists, java_jre_validation_crl, java_jre_disable_untrusted_sources, java_jre_validation_ocsp_locked, java_jre_accepted_sites_exists, java_jre_unsigned_applications, java_jre_unsigned_applications_locked, java_jre_security_revocation_check, java_jre_blacklist_check_locked, java_jre_enable_jws_locked, java_jre_deployment_config_properties, java_jre_blacklist_check, java_jre_disable_untrusted_sources_locked, java_jre_lock_untrusted_sources_locked, java_jre_accepted_sites_properties, java_jre_insecure_prompt, java_jre_deployment_config_exists'
WARNING - Nothing has been tested!

The output is correct since the rule is indeed not available for RHEL8, as in this example.

In such cases, testing this rule in RHEL is not necessary since the result will always fail.
It would be good to have an option or other mechanism to not test rules in some products when the rules are not applicable for the respective product.

References:
PR: ComplianceAsCode/content#9398
Tests CS8: https://github.com/ComplianceAsCode/content/runs/7998438537?check_suite_focus=true
Tests CS9: https://github.com/ComplianceAsCode/content/runs/7998421987?check_suite_focus=true

TypeError: expected str, bytes or os.PathLike object, not NoneType in CaC/content

This seems to be the same error message from #23

python3 ./ctf/content_test_filtering.py pr --rule --output json 7908 > output.json

Traceback (most recent call last):
  File "/__w/content/content/./ctf/content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/__w/content/content/ctf/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/__w/content/content/ctf/ctf/analysis/BashAnalysis.py", line 103, in process_analysis
    self.diff_struct.add_changed_product_by_rule(self.rule_name, msg=msg)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 122, in add_changed_product_by_rule
    product_name = self.get_rule_products(rule_name)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 89, in get_rule_products
    with open(ruleyml_path) as f:
TypeError: expected str, bytes or os.PathLike object, not NoneType
Error: Process completed with exit code 1.

Add detection of product(s) when test scenarios are updated/added

CTF should properly detect which product(s) are targeted by updated or newly added test scenarios. Currently this doesn't work as expected, for example see https://github.com/ComplianceAsCode/content/pull/7634/checks?check_run_id=3687722905 from ComplianceAsCode/content#7634 where RHEL7 scenarios got added but CTF detected product as RHEL8.

Use collections.defaultdict to simplify code

The code in DiffLogging.py often looks like

            if rule in self.rules:
                self.rules[rule].add(msg)
            else:
                self.rules[rule] = {msg}

You can use the defaultdict type to save yourself some boring code:

https://docs.python.org/3.8/library/collections.html#collections.defaultdict
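
A small sketch of the suggested change follows. RuleLog is a hypothetical stand-in for the class in DiffLogging.py; only the self.rules attribute and the add pattern are taken from the snippet above.

```python
from collections import defaultdict


class RuleLog:
    """Hypothetical stand-in for the logging class in DiffLogging.py."""

    def __init__(self):
        # Missing keys are created on first access as empty sets.
        self.rules = defaultdict(set)

    def add_rule_log(self, rule, msg):
        # No membership check needed any more.
        self.rules[rule].add(msg)


if __name__ == "__main__":
    log = RuleLog()
    log.add_rule_log("audit_rules_immutable", "Bash remediation changed")
    log.add_rule_log("audit_rules_immutable", "OVAL check changed")
    print(dict(log.rules))
```

One thing to keep in mind is that merely reading a missing key from a defaultdict also creates it, so lookups that should not add entries need .get() or an explicit membership test.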

Traceback when file is moved but no changes are done

When a file is moved to a different place but no changes are made, CTF ends with a traceback.

The flag when a file is changed is M.

Reproducer
python3 content_test_filtering.py pr --base 354110f9e9779bf55e2360193b6a5fee16cdd04f --remote_repo https://github.com/ComplianceAsCode/content --verbose --rule --output json 8529 > output.json

Stacktrace

DEBUG   JinjaAnalysis          - Analyzing Jinja macro file shared/macros/macros-ansible.jinja
Traceback (most recent call last):
  File "/home/ggasparb/workspace/github/content-test-filtering/content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/JinjaAnalysis.py", line 290, in process_analysis
    diff = self.load_diff()
  File "/home/ggasparb/workspace/github/content-test-filtering/ctf/analysis/JinjaAnalysis.py", line 208, in load_diff
    diff = diff["values_changed"]["root"]["diff"]
KeyError: 'values_changed'
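
When a file is only moved, the diff has no values_changed entry, so the lookup fails. A guarded lookup is sketched below. It assumes the diff object behaves like a dictionary, and extract_text_diff is a hypothetical helper, not the project's load_diff.

```python
def extract_text_diff(diff):
    """Return the textual diff, or None when the content did not change."""
    changed = diff.get("values_changed")
    if not changed:
        # A moved-but-unmodified file produces no "values_changed" entry.
        return None
    return changed["root"]["diff"]


if __name__ == "__main__":
    print(extract_text_diff({}))
    print(extract_text_diff({"values_changed": {"root": {"diff": "@@ -1 +1 @@"}}}))
```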

Typo in string

https://github.com/mildas/content-test-filtering/blob/cbd26212d08c5482c572dfe8ff6c4209fe13c4a1/ctf/analysis/AnsibleAnalysis.py#L111

Should be Templatization

Possibly undefined variables in diff.py - create_file_records_from_diff

Either file_before or file_after (or both) may be undefined depending on the flag
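
One way to avoid the possibly undefined names is to initialise both before branching on the flag. The sketch below is hypothetical: the flag letters follow git's status codes (A, D, M) and may differ from what diff.py actually checks, and the two reader callables stand in for however the file contents are loaded.

```python
def select_file_versions(flag, read_before, read_after):
    """Return (file_before, file_after), with None for a missing side."""
    # Both names are always bound, whatever the flag is.
    file_before = None
    file_after = None
    if flag in ("M", "D"):   # modified or deleted: an old version exists
        file_before = read_before()
    if flag in ("M", "A"):   # modified or added: a new version exists
        file_after = read_after()
    return file_before, file_after


if __name__ == "__main__":
    print(select_file_versions("A", lambda: "old", lambda: "new"))
```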

Do a complete parsing procedure of CaC/content files to find culprits of bad syntax

Sometimes a PR touches files that cannot be parsed by CTF. We should run the parser over all files to detect the problematic ones up front and prevent issues that cannot easily be fixed during PR review.

product will always be rhel8 // replace prodtype logic

With ComplianceAsCode/content#11378, prodtype was removed, so this code

content-test-filtering/ctf/DiffStruct.py

Lines 95 to 96 in 552403f

    if not prodtype_line:
        return None

will always return None and

content-test-filtering/ctf/DiffStruct.py

Lines 103 to 114 in 552403f

    def add_changed_rule(self, rule_name, product_name=None, msg=""):
        self.add_rule_log(rule_name, msg)
        if not product_name:
            product_name = self.get_rule_products(rule_name)
            if product_name:
                product_name = product_name[0]
                logger.debug("Rule %s is part of %s datastream.", rule_name, product_name)
            else:
                product_name = "rhel8"
                logger.debug("Rule %s is not part of any datastream. "
                             "Added default rhel8 value", rule_name)
        self.changed_rules[product_name].add(rule_name)

will always set rhel8 as product

Traceback in PR 10574

It probably needs to ignore the tests folder.

python3 ./ctf/content_test_filtering.py pr --base 328eac5d78ee756d158c389a91633f5dd74a5d60 --remote_repo https://github.com/ComplianceAsCode/content --verbose --rule --output json 10574 > output.json

DEBUG   BashAnalysis           - Analyzing bash file tests/unit/bash/execute_tests.sh
DEBUG   BashAnalysis           - Rule name: unit
DEBUG   DiffStruct             - rule.yml was not found for unit
Traceback (most recent call last):
  File "/__w/content/content/./ctf/content_test_filtering.py", line 44, in <module>
    diff_structure = diff_analysis.analyse_file(file_record)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/content/content/ctf/ctf/diff_analysis.py", line 54, in analyse_file
    return file_analyzer.process_analysis()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/content/content/ctf/ctf/analysis/BashAnalysis.py", line 122, in process_analysis
    self.analyse_bash()
  File "/__w/content/content/ctf/ctf/analysis/BashAnalysis.py", line 95, in analyse_bash
    self.diff_struct.add_changed_product_by_rule(self.rule_name, msg=msg)
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 122, in add_changed_product_by_rule
    product_name = self.get_rule_products(rule_name)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/__w/content/content/ctf/ctf/DiffStruct.py", line 89, in get_rule_products
    with open(ruleyml_path) as f:
         ^^^^^^^^^^^^^^^^^^
TypeError: expected str, bytes or os.PathLike object, not NoneType
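
Ignoring the top-level tests folder, as suggested above, could be done with a path filter before a file is handed to an analyser. This is a sketch under assumptions: is_analysable and the ignored_prefixes default are hypothetical, and paths are taken to be relative to the repository root, as in the debug log.

```python
def is_analysable(path, ignored_prefixes=("tests/",)):
    """Return False for files under directories that should be skipped."""
    # str.startswith accepts a tuple of prefixes.
    return not path.startswith(ignored_prefixes)


if __name__ == "__main__":
    changed = [
        "tests/unit/bash/execute_tests.sh",
        "shared/macros/macros-ansible.jinja",
    ]
    print([p for p in changed if is_analysable(p)])
```

Only the top-level tests/ directory is matched, so per-rule tests directories deeper in the tree are still analysed.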

[RFE] Edit existing comment instead of always adding a new one

Identify whether a comment from content-test-filtering already exists and edit it instead of always adding a new one.

The tool marks a rule for testing which has not been touched

Example is here:
ComplianceAsCode/content#8327

The tool marked the rule audit_rules_kernel_module_loading but audit_rules_kernel_module_loading_delete, audit_rules_kernel_module_loading_init and audit_rules_kernel_module_loading_finit were modified.

    def add_changed_rule(self, rule_name, product_name=None, msg=""):
        self.add_rule_log(rule_name, msg)
        if not product_name:
            product_name = self.get_rule_products(rule_name)
            if product_name:
                product_name = product_name[0]
                logger.debug("Rule %s is part of %s datastream.", rule_name, product_name)
            else:
                product_name = "rhel8"
                logger.debug("Rule %s is not part of any datastream. "
                             "Added default rhel8 value", rule_name)
        self.changed_rules[product_name].add(rule_name)