
detect-secrets's Introduction


detect-secrets

About

detect-secrets is an aptly named module for (surprise, surprise) detecting secrets within a code base.

However, unlike other similar packages that solely focus on finding secrets, this package is designed with the enterprise client in mind: providing a backwards compatible, systematic means of:

  1. Preventing new secrets from entering the code base,
  2. Detecting if such preventions are explicitly bypassed, and
  3. Providing a checklist of secrets to roll, and migrate off to a more secure storage.

This way, you create a separation of concerns: you accept that there may currently be secrets hiding in your large repository (this is what we refer to as a baseline), but you prevent the issue from getting any larger, without taking on the potentially gargantuan effort of moving existing secrets away.

It does this by running periodic diff outputs against heuristically crafted regex statements, to identify whether any new secret has been committed. This way, it avoids the overhead of digging through all git history, as well as the need to scan the entire repository every time.
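To make the baseline idea concrete, here is a minimal sketch of how "new" findings could be separated from ones already recorded. It is not the library's implementation: the function names, the simplified filename-to-hashes layout, and the choice of SHA-1 are assumptions made for illustration.

```python
import hashlib


def hash_secret(value: str) -> str:
    """Hash a secret so the stored record never contains the raw value (assumed SHA-1)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def new_findings(baseline: dict, current: dict) -> dict:
    """Return findings present in `current` but absent from `baseline`.

    Both arguments map a filename to a set of hashed secret values
    (a simplified, hypothetical stand-in for a real baseline file).
    """
    result = {}
    for filename, hashes in current.items():
        unseen = hashes - baseline.get(filename, set())
        if unseen:
            result[filename] = unseen
    return result


baseline = {"config.ini": {hash_secret("old-password")}}
current = {
    "config.ini": {hash_secret("old-password"), hash_secret("hunter2")},
    "app.py": {hash_secret("api-key-123")},
}

fresh = new_findings(baseline, current)
print(sorted(fresh))  # ['app.py', 'config.ini']
```

Only the entries missing from the baseline are reported, which is why existing secrets do not block new commits.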

For a look at recent changes, please see CHANGELOG.md.

If you are looking to contribute, please see CONTRIBUTING.md.

For more detailed guides, see the rest of the project documentation.

Examples

Quickstart:

Create a baseline of potential secrets currently found in your git repository.

$ detect-secrets scan > .secrets.baseline

or, to run it from a different directory:

$ detect-secrets -C /path/to/directory scan > /path/to/directory/.secrets.baseline

Scanning non-git tracked files:

$ detect-secrets scan test_data/ --all-files > .secrets.baseline

Adding New Secrets to Baseline:

This will rescan your codebase, and:

  1. Update/upgrade your baseline to be compatible with the latest version,
  2. Add any new secrets it finds to your baseline,
  3. Remove any secrets no longer in your codebase

This will also preserve any labelled secrets you have.

$ detect-secrets scan --baseline .secrets.baseline

For baselines older than version 0.9, just recreate them.

Alerting off newly added secrets:

Scanning Staged Files Only:

$ git diff --staged --name-only -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline

Scanning All Tracked Files:

$ git ls-files -z | xargs -0 detect-secrets-hook --baseline .secrets.baseline

Viewing All Enabled Plugins:

$ detect-secrets scan --list-all-plugins
ArtifactoryDetector
AWSKeyDetector
AzureStorageKeyDetector
BasicAuthDetector
CloudantDetector
DiscordBotTokenDetector
GitHubTokenDetector
GitLabTokenDetector
Base64HighEntropyString
HexHighEntropyString
IbmCloudIamDetector
IbmCosHmacDetector
JwtTokenDetector
KeywordDetector
MailchimpDetector
NpmDetector
PrivateKeyDetector
SendGridDetector
SlackDetector
SoftlayerDetector
SquareOAuthDetector
StripeDetector
TwilioKeyDetector

Disabling Plugins:

$ detect-secrets scan --disable-plugin KeywordDetector --disable-plugin AWSKeyDetector

If you want to only run a specific plugin, you can do:

$ detect-secrets scan --list-all-plugins | \
    grep -v 'BasicAuthDetector' | \
    sed "s#^#--disable-plugin #g" | \
    xargs detect-secrets scan test_data
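The same argument list can be built in Python if you are driving the CLI from a script. This is a sketch only: `only_plugin_args` is a hypothetical helper, and the plugin names are a short excerpt of the list shown earlier.

```python
def only_plugin_args(all_plugins, keep):
    """Build `--disable-plugin` arguments for every plugin except `keep`."""
    if keep not in all_plugins:
        raise ValueError(f"unknown plugin: {keep}")
    args = []
    for name in all_plugins:
        if name != keep:
            args += ["--disable-plugin", name]
    return args


# Excerpt of the plugin list, for illustration only.
plugins = ["AWSKeyDetector", "BasicAuthDetector", "KeywordDetector"]
command = [
    "detect-secrets", "scan", "test_data",
    *only_plugin_args(plugins, "BasicAuthDetector"),
]
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run` as-is. Note that it matches plugin names exactly, whereas `grep -v` in the shell version matches substrings.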

Auditing a Baseline:

This is an optional step to label the results in your baseline. It can be used to narrow down your checklist of secrets to migrate, or to better configure your plugins to improve its signal-to-noise ratio.

$ detect-secrets audit .secrets.baseline

Usage in Other Python Scripts

Basic Use:

import json

from detect_secrets import SecretsCollection
from detect_secrets.settings import default_settings

# Scan a single file with the default plugins and filters.
secrets = SecretsCollection()
with default_settings():
    secrets.scan_file('test_data/config.ini')

# Print the findings as formatted JSON.
print(json.dumps(secrets.json(), indent=2))

More Advanced Configuration:

from detect_secrets import SecretsCollection
from detect_secrets.settings import transient_settings

secrets = SecretsCollection()
with transient_settings({
    # Only run scans with only these plugins.
    # This format is the same as the one that is saved in the generated baseline.
    'plugins_used': [
        # Example of configuring a built-in plugin
        {
            'name': 'Base64HighEntropyString',
            'limit': 5.0,
        },

        # Example of using a custom plugin
        {
            'name': 'HippoDetector',
            'path': 'file:///Users/aaronloo/Documents/github/detect-secrets/testing/plugins.py',
        },
    ],

    # We can also specify whichever additional filters we want.
    # This is an example of using the function `is_identified_by_ML_model` within the
    # local file `./private-filters/example.py`.
    'filters_used': [
        {
            'path': 'file://private-filters/example.py::is_identified_by_ML_model',
        },
    ]
}) as settings:
    # If we want to make any further adjustments to the created settings object (e.g.
    # disabling default filters), we can do so as such.
    settings.disable_filters(
        'detect_secrets.filters.heuristic.is_prefixed_with_dollar_sign',
        'detect_secrets.filters.heuristic.is_likely_id_string',
    )

    secrets.scan_file('test_data/config.ini')

Installation

$ pip install detect-secrets
✨🍰✨

Install via brew:

$ brew install detect-secrets

Usage

detect-secrets comes with three different tools, and there is often confusion around which one to use. Use this handy checklist to help you decide:

  1. Do you want to add secrets to your baseline? If so, use detect-secrets scan.
  2. Do you want to alert off new secrets not in the baseline? If so, use detect-secrets-hook.
  3. Are you analyzing the baseline itself? If so, use detect-secrets audit.

Adding Secrets to Baseline

$ detect-secrets scan --help
usage: detect-secrets scan [-h] [--string [STRING]] [--only-allowlisted]
                           [--all-files] [--baseline FILENAME]
                           [--force-use-all-plugins] [--slim]
                           [--list-all-plugins] [-p PLUGIN]
                           [--base64-limit [BASE64_LIMIT]]
                           [--hex-limit [HEX_LIMIT]]
                           [--disable-plugin DISABLE_PLUGIN]
                           [-n | --only-verified]
                           [--exclude-lines EXCLUDE_LINES]
                           [--exclude-files EXCLUDE_FILES]
                           [--exclude-secrets EXCLUDE_SECRETS]
                           [--word-list WORD_LIST_FILE] [-f FILTER]
                           [--disable-filter DISABLE_FILTER]
                           [path [path ...]]

Scans a repository for secrets in code. The generated output is compatible
with `detect-secrets-hook --baseline`.

positional arguments:
  path                  Scans the entire codebase and outputs a snapshot of
                        currently identified secrets.

optional arguments:
  -h, --help            show this help message and exit
  --string [STRING]     Scans an individual string, and displays configured
                        plugins' verdict.
  --only-allowlisted    Only scans the lines that are flagged with `allowlist
                        secret`. This helps verify that individual exceptions
                        are indeed non-secrets.

scan options:
  --all-files           Scan all files recursively (as compared to only
                        scanning git tracked files).
  --baseline FILENAME   If provided, will update existing baseline by
                        importing settings from it.
  --force-use-all-plugins
                        If a baseline is provided, detect-secrets will default
                        to loading the plugins specified by that baseline.
                        However, this may also mean it doesn't perform the
                        scan with the latest plugins. If this flag is
                        provided, it will always use the latest plugins
  --slim                Slim baselines are created with the intention of
                        minimizing differences between commits. However, they
                        are not compatible with the `audit` functionality, and
                        slim baselines will need to be remade to be audited.

plugin options:
  Configure settings for each secret scanning ruleset. By default, all
  plugins are enabled unless explicitly disabled.

  --list-all-plugins    Lists all plugins that will be used for the scan.
  -p PLUGIN, --plugin PLUGIN
                        Specify path to custom secret detector plugin.
  --base64-limit [BASE64_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 4.5.
  --hex-limit [HEX_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 3.0.
  --disable-plugin DISABLE_PLUGIN
                        Plugin class names to disable. e.g.
                        Base64HighEntropyString

filter options:
  Configure settings for filtering out secrets after they are flagged by the
  engine.

  -n, --no-verify       Disables additional verification of secrets via
                        network call.
  --only-verified       Only flags secrets that can be verified.
  --exclude-lines EXCLUDE_LINES
                        If lines match this regex, it will be ignored.
  --exclude-files EXCLUDE_FILES
                        If filenames match this regex, it will be ignored.
  --exclude-secrets EXCLUDE_SECRETS
                        If secrets match this regex, it will be ignored.
  --word-list WORD_LIST_FILE
                        Text file with a list of words, if a secret contains a
                        word in the list we ignore it.
  -f FILTER, --filter FILTER
                        Specify path to custom filter. May be a python module
                        path (e.g.
                        detect_secrets.filters.common.is_invalid_file) or a
                        local file path (e.g.
                        file://path/to/file.py::function_name).
  --disable-filter DISABLE_FILTER
                        Specify filter to disable. e.g.
                        detect_secrets.filters.common.is_invalid_file

Blocking Secrets not in Baseline

$ detect-secrets-hook --help
usage: detect-secrets-hook [-h] [-v] [--version] [--baseline FILENAME]
                           [--list-all-plugins] [-p PLUGIN]
                           [--base64-limit [BASE64_LIMIT]]
                           [--hex-limit [HEX_LIMIT]]
                           [--disable-plugin DISABLE_PLUGIN]
                           [-n | --only-verified]
                           [--exclude-lines EXCLUDE_LINES]
                           [--exclude-files EXCLUDE_FILES]
                           [--exclude-secrets EXCLUDE_SECRETS]
                           [--word-list WORD_LIST_FILE] [-f FILTER]
                           [--disable-filter DISABLE_FILTER]
                           [filenames [filenames ...]]

positional arguments:
  filenames             Filenames to check.

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose mode.
  --version             Display version information.
  --json                Print detect-secrets-hook output as JSON
  --baseline FILENAME   Explicitly ignore secrets through a baseline generated
                        by `detect-secrets scan`

plugin options:
  Configure settings for each secret scanning ruleset. By default, all
  plugins are enabled unless explicitly disabled.

  --list-all-plugins    Lists all plugins that will be used for the scan.
  -p PLUGIN, --plugin PLUGIN
                        Specify path to custom secret detector plugin.
  --base64-limit [BASE64_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 4.5.
  --hex-limit [HEX_LIMIT]
                        Sets the entropy limit for high entropy strings. Value
                        must be between 0.0 and 8.0, defaults to 3.0.
  --disable-plugin DISABLE_PLUGIN
                        Plugin class names to disable. e.g.
                        Base64HighEntropyString

filter options:
  Configure settings for filtering out secrets after they are flagged by the
  engine.

  -n, --no-verify       Disables additional verification of secrets via
                        network call.
  --only-verified       Only flags secrets that can be verified.
  --exclude-lines EXCLUDE_LINES
                        If lines match this regex, it will be ignored.
  --exclude-files EXCLUDE_FILES
                        If filenames match this regex, it will be ignored.
  --exclude-secrets EXCLUDE_SECRETS
                        If secrets match this regex, it will be ignored.
  -f FILTER, --filter FILTER
                        Specify path to custom filter. May be a python module
                        path (e.g.
                        detect_secrets.filters.common.is_invalid_file) or a
                        local file path (e.g.
                        file://path/to/file.py::function_name).
  --disable-filter DISABLE_FILTER
                        Specify filter to disable. e.g.
                        detect_secrets.filters.common.is_invalid_file

We recommend setting this up as a pre-commit hook. One way to do this is by using the pre-commit framework:

# .pre-commit-config.yaml
repos:
-   repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
    -   id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
        exclude: package.lock.json
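Note that the `exclude` value is a regular expression rather than a literal filename. The sketch below assumes that pre-commit searches each file path with it as a Python regex. It shows why `package.lock.json` also covers `package-lock.json`, and what a stricter pattern might look like (the stricter pattern is a suggestion, not part of the original configuration).

```python
import re

# Pattern from the configuration above: each "." matches any character.
loose = re.compile(r"package.lock.json")

# A stricter, hypothetical alternative that only matches the exact filename.
strict = re.compile(r"(^|/)package-lock\.json$")

for path in ["package-lock.json", "web/package-lock.json", "package.json"]:
    print(path, bool(loose.search(path)), bool(strict.search(path)))
```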

Inline Allowlisting

There are times when we want to exclude a false positive from blocking a commit, without creating a baseline to do so. You can do so by adding a comment as such:

secret = "hunter2"      # pragma: allowlist secret

or

//  pragma: allowlist nextline secret
const secret = "hunter2";

Auditing Secrets in Baseline

$ detect-secrets audit --help
usage: detect-secrets audit [-h] [--diff] [--stats]
                      [--report] [--only-real | --only-false]
                      [--json]
                      filename [filename ...]

Auditing a baseline allows analysts to label results, and optimize plugins for
the highest signal-to-noise ratio for their environment.

positional arguments:
  filename      Audit a given baseline file to distinguish the difference
                between false and true positives.

optional arguments:
  -h, --help    show this help message and exit
  --diff        Allows the comparison of two baseline files, in order to
                effectively distinguish the difference between various plugin
                configurations.
  --stats       Displays the results of an interactive auditing session which
                have been saved to a baseline file.
  --report      Displays a report with the secrets detected

reporting:
  Display a summary with all the findings and the made decisions. To be used with the report mode (--report).

  --only-real   Only includes real secrets in the report
  --only-false  Only includes false positives in the report

analytics:
  Quantify the success of your plugins based on the labelled results in your
  baseline. To be used with the statistics mode (--stats).

  --json        Outputs results in a machine-readable format.

Configuration

This tool operates through a system of plugins and filters.

  • Plugins find secrets in code
  • Filters ignore false positives to increase scanning precision

You can adjust both to suit your precision/recall needs.

Plugins

There are three different strategies we employ to try and find secrets in code:

  1. Regex-based Rules

    These are the most common type of plugin, and work well with well-structured secrets. These secrets can optionally be verified, which increases scanning precision. However, solely depending on these may negatively affect the recall of your scan.

  2. Entropy Detector

    This searches for "secret-looking" strings through a variety of heuristic approaches. This is great for non-structured secrets, but may require tuning to adjust the scanning precision.

  3. Keyword Detector

    This ignores the secret value, and searches for variable names that are often associated with assigning secrets with hard-coded values. This is great for "non-secret-looking" strings (e.g. le3tc0de passwords), but may require tuning filters to adjust the scanning precision.
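As an illustration of the entropy strategy, here is a minimal Shannon entropy sketch. It is not the plugin's actual code: the function name is hypothetical, and comparing the result against a limit such as the 3.0 hex default is an assumption about how the score is used.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(text).values()
    )


HEX_LIMIT = 3.0  # default quoted for --hex-limit in the help output

for candidate in ["aaaaaaaa", "abababab", "0123456789abcdef"]:
    score = shannon_entropy(candidate)
    print(candidate, score, score > HEX_LIMIT)
```

Repetitive strings score low and evenly distributed ones score high, which is why raising or lowering the limit trades precision against recall.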

Want to find a secret that we don't currently catch? You can also (easily) develop your own plugin, and use it with the engine! For more information, check out the plugin documentation.

Filters

detect-secrets comes with several different in-built filters that may suit your needs.

--exclude-lines

Sometimes, you want to be able to globally allow certain lines in your scan, if they match a specific pattern. You can specify a regex rule as such:

$ detect-secrets scan --exclude-lines 'password = (blah|fake)'

Or you can specify multiple regex rules as such:

$ detect-secrets scan --exclude-lines 'password = blah' --exclude-lines 'password = fake'
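The two forms above are interchangeable. A small sketch, assuming each pattern is searched for anywhere in the line (`is_excluded` is a hypothetical helper, not a library function):

```python
import re


def is_excluded(line, patterns):
    """Return True if any pattern is found anywhere in the line."""
    return any(re.search(pattern, line) for pattern in patterns)


combined = [r"password = (blah|fake)"]
separate = [r"password = blah", r"password = fake"]

for line in ["password = blah", "password = fake", "password = hunter2"]:
    print(line, is_excluded(line, combined), is_excluded(line, separate))
```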

--exclude-files

Sometimes, you want to be able to ignore certain files in your scan. You can specify a regex pattern to do so, and if the filename meets this regex pattern, it will not be scanned:

$ detect-secrets scan --exclude-files '.*\.signature$'

Or you can specify multiple regex patterns as such:

$ detect-secrets scan --exclude-files '.*\.signature$' --exclude-files '.*/i18n/.*'

--exclude-secrets

Sometimes, you want to be able to ignore certain secret values in your scan. You can specify a regex rule as such:

$ detect-secrets scan --exclude-secrets '(fakesecret|\${.*})'

Or you can specify multiple regex rules as such:

$ detect-secrets scan --exclude-secrets 'fakesecret' --exclude-secrets '\${.*}'
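To check what such patterns match before adding them to a scan, you can try them against sample values. This sketch assumes Python regex syntax and search semantics; `is_excluded_secret` is a hypothetical helper.

```python
import re

# "\$" is a literal dollar sign; "{.*}" matches a brace-delimited placeholder.
patterns = [re.compile(r"fakesecret"), re.compile(r"\${.*}")]


def is_excluded_secret(secret):
    """Return True if any exclusion pattern is found in the secret value."""
    return any(pattern.search(secret) for pattern in patterns)


for value in ["fakesecret", "${DB_PASSWORD}", "hunter2"]:
    print(value, is_excluded_secret(value))
```

Because the patterns are searched rather than fully matched, a value such as `my-fakesecret-1` would be excluded too. Anchor the pattern with `^` and `$` if that is too broad.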

Inline Allowlisting

Sometimes, you want to apply an exclusion to a specific line, rather than globally excluding it. You can do so with inline allowlisting as such:

API_KEY = 'this-will-ordinarily-be-detected-by-a-plugin'    # pragma: allowlist secret

These comments are supported in multiple languages. e.g.

const GoogleCredentialPassword = "something-secret-here";     //  pragma: allowlist secret

You can also use:

# pragma: allowlist nextline secret
API_KEY = 'WillAlsoBeIgnored'

This may be a convenient way for you to ignore secrets, without needing to regenerate the entire baseline again. If you need to explicitly search for these allowlisted secrets, you can also do:

$ detect-secrets scan --only-allowlisted
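The effect of the two pragma forms can be sketched as follows. This is a simplified illustration using plain substring checks, not the tool's actual matching logic, and `allowlisted_line_numbers` is a hypothetical helper.

```python
SAME_LINE = "pragma: allowlist secret"
NEXT_LINE = "pragma: allowlist nextline secret"


def allowlisted_line_numbers(source: str) -> set:
    """Return the 1-based line numbers that the pragmas exempt from scanning."""
    allowed = set()
    lines = source.splitlines()
    for number, line in enumerate(lines, start=1):
        if SAME_LINE in line:
            allowed.add(number)
        if NEXT_LINE in line and number < len(lines):
            allowed.add(number + 1)
    return allowed


source = """API_KEY = 'abc'  # pragma: allowlist secret
# pragma: allowlist nextline secret
TOKEN = 'def'
PASSWORD = 'ghi'
"""
print(sorted(allowlisted_line_numbers(source)))  # [1, 3]
```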

Want to write more custom logic to filter out false positives? Check out how to do this in our filters documentation.

Extensions

wordlist

The --exclude-secrets flag allows you to specify regex rules to exclude secret values. However, if you want to specify a large list of words instead, you can use the --word-list flag.

To use this feature, be sure to install the pyahocorasick package, or simply use:

$ pip install detect-secrets[word_list]

Then, you can use it as such:

$ cat wordlist.txt
not-a-real-secret
$ cat sample.ini
password = not-a-real-secret

# Will show results
$ detect-secrets scan sample.ini

# No results found
$ detect-secrets scan --word-list wordlist.txt sample.ini
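Conceptually, the word list drops any secret that contains one of the listed words. A minimal sketch follows, using a plain substring check rather than the Aho-Corasick automaton that pyahocorasick provides. The helper names, the case-insensitive comparison, and the `min_length` parameter are assumptions for illustration.

```python
def load_words(text: str, min_length: int = 3) -> set:
    """Parse one word per line, skipping blanks and very short entries."""
    words = set()
    for raw in text.splitlines():
        word = raw.strip().lower()
        if len(word) >= min_length:
            words.add(word)
    return words


def contains_listed_word(secret: str, words: set) -> bool:
    """Return True if the secret contains any word from the list."""
    lowered = secret.lower()
    return any(word in lowered for word in words)


words = load_words("not-a-real-secret\n")
print(contains_listed_word("not-a-real-secret", words))  # True
print(contains_listed_word("hunter2", words))            # False
```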

Gibberish Detector

The Gibberish Detector is a simple ML model that attempts to determine whether a secret value is actually gibberish, on the assumption that real secret values are not word-like.

To use this feature, be sure to install the gibberish-detector package, or use:

$ pip install detect-secrets[gibberish]

Check out the gibberish-detector package for more information on how to train the model. A pre-trained model (seeded by processing RFCs) will be included for easy use.

You can also specify your own model as such:

$ detect-secrets scan --gibberish-model custom.model

This is not enabled by default, given that it will ignore word-like secrets such as password.

Caveats

This is not meant to be a sure-fire solution to prevent secrets from entering the codebase. Only proper developer education can truly do that. This pre-commit hook merely implements several heuristics to try and prevent obvious cases of committing secrets.

Things That Won't Be Prevented:

  • Multi-line secrets
  • Default passwords that don't trigger the KeywordDetector (e.g. login = "hunter2")
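To see why the second case slips through, consider a simplified keyword check. This is a sketch and not the KeywordDetector's real rule set: the keyword list and the assignment regex below are hypothetical.

```python
import re

# Hypothetical keyword list; the real detector's list is not reproduced here.
KEYWORDS = ("password", "secret", "api_key", "token")

# Matches simple `name = "value"` assignments with single or double quotes.
ASSIGNMENT = re.compile(r"(?P<name>\w+)\s*=\s*[\"'](?P<value>[^\"']+)[\"']")


def flagged_by_keyword(line: str) -> bool:
    """Flag a line when the assigned variable name contains a known keyword."""
    match = ASSIGNMENT.search(line)
    if match is None:
        return False
    name = match.group("name").lower()
    return any(keyword in name for keyword in KEYWORDS)


print(flagged_by_keyword('password = "hunter2"'))  # True
print(flagged_by_keyword('login = "hunter2"'))     # False
```

The value is identical in both lines, but only the variable name is inspected, so `login` goes unnoticed.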

FAQ

General

  • "Did not detect git repository." warning encountered, even though I'm in a git repo.

    Check to see whether your git version is >= 1.8.5. If not, please upgrade it then try again. More details here.

Windows

  • detect-secrets audit displays "Not a valid baseline file!" after creating baseline.

    Ensure the file encoding of your baseline file is UTF-8. More details here.

detect-secrets's People

Contributors

0atman, abdelrahman-thafeer, asottile, calvinli, cclauss, chenrui333, danielpops, dariuszporowski, dependabot[bot], dgzlopes, domanchi, dryoni, dw-ernest, edoverflow, gdemarcsek, gliptak, gruebel, jpdakran, justineyster, kevinhock, lorenzodb1, louistrezzini, nickiaconis, nimrodkor, ninoseki, oicmudkips, pablosnt, sindrig, sinkettu, syn-4ck


detect-secrets's Issues

File extension specific exclusions

For example, you may want to exclude checksum in .json files only, but not in any other file type. Or something like that.

Or no answer = in .tf files

Performance issues

I've been trying to run this tool against the discourse repository.
It seems to get stuck when encountering this file:

[secrets_collection]    INFO    Checking file: ./plugins/discourse-narrative-bot/lib/discourse_narrative_bot/certificate_generator.rb

Any workaround/suggestion?

Create performance testing benchmarks

This tool is meant to be fast. Not blazingly fast, but fast nonetheless. Before we can optimize on speed though, we need to create some testing frameworks to accurately measure performance of the engine.

This way, we can make improvements, and note the speed differential for different regexes / features.

--audit crashes on non-existent files

When you audit a baseline, if a file has been removed, it crashes the audit command and you lose all your progress.

Progress should be saved after each keystroke, and system calls should be protected by a try ... catch block

How to reproduce

  1. Create a baseline with secrets
  2. Remove one of the files referenced in the baseline
  3. Audit the baseline
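One way the suggested protection could look, sketched in Python (where the construct is `try ... except`). `read_line_or_none` is a hypothetical helper, not part of detect-secrets.

```python
import os
import tempfile


def read_line_or_none(filename, line_number):
    """Return the requested 1-based line, or None if the file cannot be read."""
    try:
        with open(filename, encoding="utf-8") as handle:
            for current, line in enumerate(handle, start=1):
                if current == line_number:
                    return line.rstrip("\n")
    except OSError:
        # Covers files removed since the baseline was created.
        return None
    return None


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "settings.py")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("password = 'hunter2'\n")
    before = read_line_or_none(path, 1)
    os.remove(path)
    after = read_line_or_none(path, 1)

print(before)  # password = 'hunter2'
print(after)   # None
```

An audit loop could then skip entries for which the helper returns None, instead of aborting the session.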

pre-commit hook removes audited secrets

Steps to Reproduce

$ detect-secrets scan test_data/short_files/first_line.py > .secrets.baseline
$ echo "delete the secret from test_data/short_files/first_line.py"
$ PYTHONPATH=`pwd` detect_secrets/pre_commit_hook.py --baseline .secrets.baseline test_data/short_files/first_line.py
$ git diff .secrets.baseline

`--import` baseline could be smarter

Issues

  • When using --import <baseline_filename>, a baseline is created without knowledge of the existing baseline file. This means that the current baseline file will be scanned for secrets (which it will clearly find, due to the secret hashes stored in there).

  • When we upgrade baselines, we currently need to perform a two-liner (without sponge):

$ detect-secrets scan --import .secrets.baseline > .secrets.baseline.new
$ mv .secrets.baseline.new .secrets.baseline

If we already know the filename we're importing from (as compared to reading from stdin), we should also write to it.

Suggested Fix

$ detect-secrets scan --upgrade .secrets.baseline

This will write results to the provided file, and ignore the false positives in the current baseline file.
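Writing back to the file that was just read is usually done through a temporary file, so that a crash mid-write does not corrupt the baseline. A minimal sketch of that pattern follows. `write_baseline_in_place` is a hypothetical helper, and the dictionary is a simplified stand-in for a real baseline.

```python
import json
import os
import tempfile


def write_baseline_in_place(path, baseline):
    """Write JSON to a temp file in the same directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(baseline, handle, indent=2)
            handle.write("\n")
        os.replace(temp_path, path)  # replaces the old file in one step
    except BaseException:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    baseline_path = os.path.join(workdir, ".secrets.baseline")
    write_baseline_in_place(baseline_path, {"version": "0.9.1", "results": {}})
    write_baseline_in_place(baseline_path, {"version": "0.10.3", "results": {}})
    with open(baseline_path, encoding="utf-8") as handle:
        stored = json.load(handle)
    leftovers = [name for name in os.listdir(workdir) if name.endswith(".tmp")]

print(stored["version"])  # 0.10.3
```

This removes the need for the separate `mv` step (or `sponge`) in the two-liner shown earlier.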

`pre-commit autoupdate` always fails if detect-secrets is present

hookid: detect-secrets                

The supplied baseline may be incompatible with the current                  
version of detect-secrets. Please recreate your baseline to                 
avoid potential mis-configurations.   

Current Version: 0.10.3               
Baseline Version: 0.9.1

detect-secrets should heal the repository when this happens rather than requiring manual intervention.
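Detecting that a baseline is older than the installed tool needs a numeric comparison, since comparing version strings directly gives the wrong answer for the versions above. A sketch with hypothetical helper names, assuming purely numeric dotted versions:

```python
def parse_version(text):
    """Turn a dotted version such as '0.10.3' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def baseline_is_older(baseline_version, current_version):
    """Return True when the baseline predates the running version."""
    return parse_version(baseline_version) < parse_version(current_version)


print(baseline_is_older("0.9.1", "0.10.3"))  # True
print("0.9.1" < "0.10.3")                    # False: plain string comparison
```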

--scan option is just completely missing

The docs suggest using the --scan option, but it doesn't exist:

$ detect-secrets --scan > .secrets.baseline
usage: detect-secrets [-h] [--base64-limit BASE64_LIMIT]
                      [--hex-limit HEX_LIMIT] [-v] [--initialize [INITIALIZE]]
                      [--exclude EXCLUDE]
detect-secrets: error: unrecognized arguments: --scan

pre-commit hook removes plugins from baseline (on version 0.8.8)

$ tox -e pre-commit -- run detect-secrets --all-files
[detect-secrets] Detect secrets..........................................Failed
hookid: detect-secrets

Files were modified by this hook. Additional output:

Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.

ERROR: InvocationError: '/nail/home/louist/pg/puppet/.tox/pre-commit/bin/pre-commit run detect-secrets --verbose --all-files'
________________________________________________________________________________________________________________________________________________________ summary _________________________________________________________________________________________________________________________________________________________
ERROR:   pre-commit: commands failed
diff --git a/.secrets.baseline b/.secrets.baseline
index 70a2e0c..37fd650 100644
--- a/.secrets.baseline
+++ b/.secrets.baseline
@@ -1,19 +1,7 @@
 {
   "exclude_regex": "^(\\.git|venv|vendor|secrets)",
-  "generated_at": "2018-06-13T09:34:56Z",
-  "plugins_used": [
-    {
-      "limit": 4.5,
-      "name": "Base64HighEntropyString"
-    },
-    {
-      "limit": 3,
-      "name": "HexHighEntropyString"
-    },
-    {
-      "name": "PrivateKeyDetector"
-    }
-  ],
+  "generated_at": "2018-06-18T09:46:46Z",
+  "plugins_used": [],
   "results": {
     "Puppetfile": [
       {

Output "The baseline file was updated." dialogue once

We output the same thing over and over again.

For example, for one repo, running pre-commit run detect-secrets --all-files outputs:

Detect secrets...........................................................Failed
hookid: detect-secrets

Files were modified by this hook. Additional output:

The baseline file was updated.
Probably to keep line numbers of secrets up-to-date.
Please `git add .secrets.baseline`, thank you.


The baseline file was updated.
Probably to keep line numbers of secrets up-to-date.
Please `git add .secrets.baseline`, thank you.


Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
Your baseline file (.secrets.baseline) is unstaged.
`git add .secrets.baseline` to fix this.
The baseline file was updated.
Probably to keep line numbers of secrets up-to-date.
Please `git add .secrets.baseline`, thank you.


The baseline file was updated.
Probably to keep line numbers of secrets up-to-date.
Please `git add .secrets.baseline`, thank you.

We should try to do this just once.
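
One possible way to do this, sketched under assumptions: collect the messages during the run and report each distinct message a single time, in the order it was first seen. The MessageOnce class and its method names are hypothetical and are not part of detect-secrets.

```python
class MessageOnce(object):
    """Collects messages and keeps each distinct one only once, in first-seen order."""

    def __init__(self):
        self._seen = set()
        self._ordered = []

    def add(self, message):
        # Ignore anything we have already recorded.
        if message not in self._seen:
            self._seen.add(message)
            self._ordered.append(message)

    def flush(self):
        return list(self._ordered)


messages = MessageOnce()
for _ in range(4):
    messages.add('The baseline file was updated.')
    messages.add('Your baseline file (.secrets.baseline) is unstaged.')

print('\n'.join(messages.flush()))
```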

Slow performance when scanning a non ini file with millions of lines

When scanning a non-ini file with more than 1 million lines, the scan hangs at the line below.

(self._analyze_ini_file(add_header=True), configparser.Error,),

I traced this back to configparser and found that the following line is extremely inefficient: it adds every offending line (essentially every line in the file) to the error message through string concatenation.

self.message += '\n\t[line %2d]: %s' % (lineno, line)

I did not have the patience to wait for the scan to finish; on my laptop it hung for more than 10 minutes.

We need a more efficient way to scan large non-ini files.
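
One cheap mitigation, sketched here as an assumption rather than as the project's fix, is to sample the first lines of a file and skip the ini parser entirely when they do not look like ini content. The looks_like_ini helper, its regex, and the sample size are all hypothetical, and the check is only a heuristic (for instance, it rejects indented continuation lines).

```python
import re

# Accepts a section header, a key/value pair, a comment, or a blank line.
_INI_LINE = re.compile(r'^\s*(\[.+\]|[^=:\s][^=:]*[=:].*|[#;].*)?\s*$')


def looks_like_ini(lines, sample_size=100):
    """Return False as soon as one of the sampled lines is clearly not ini-like."""
    for index, line in enumerate(lines):
        if index >= sample_size:
            break
        if not _INI_LINE.match(line):
            return False
    return True
```

Because the function stops at the first offending line, a file with millions of non-ini lines is rejected after reading one line instead of accumulating an error message for every line.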

Improve accuracy of aws_secret_access_key in keyword detector

This is one of the rare keywords whose secret has a specific shape, so we should validate the length and character set of the captured group, so that e.g. foo or bar are not captured with that keyword. This isn't possible for most keywords, since e.g. passwords can be anything.
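
A minimal sketch of such a shape check, assuming that an AWS secret access key is 40 characters drawn from the base64 alphabet. The pattern and the function name are illustrative and are not taken from the keyword plugin.

```python
import re

# Assumption: 40 characters from the base64 alphabet.
AWS_SECRET_KEY_SHAPE = re.compile(r'^[A-Za-z0-9/+=]{40}$')


def looks_like_aws_secret_key(value):
    """True only when the captured value has the assumed length and alphabet."""
    return bool(AWS_SECRET_KEY_SHAPE.match(value))
```

With this check, the example key from the AWS documentation, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY, passes, while short values such as foo or bar do not.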

Detect same secret multiple times in the same line or file

From #52, we're able to do:

$ detect-secrets scan --string '012345678a'

but what happens if the string contains two or more secrets? e.g.

$ detect-secrets scan --string '"0123456789a" and "0123456789b"'

Right now, we're only going to show the scanned results for the first secret. But you can imagine it's kinda weird UX to only show results for the first one (silently ignoring the second).
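
As a rough illustration of reporting every candidate instead of only the first, the sketch below uses re.finditer to walk over all quoted substrings. It assumes that candidates are delimited by matching quotes; the regex and the function name are hypothetical.

```python
import re

# Matches a quoted substring and captures its content (group 2).
QUOTED = re.compile(r'''(["'])(.+?)\1''')


def quoted_candidates(string):
    """Return the content of every quoted substring, in order of appearance."""
    return [match.group(2) for match in QUOTED.finditer(string)]
```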

baselines break with version bumps

Currently, baselines have no notion of which version of detect-secrets created them. This makes things slightly annoying, because a major version bump could invalidate old baselines, requiring the user to recreate the baseline to be compliant once again.

At the very least, we should have baselines know which detect-secrets version created it, so we can be aware when this happens.

E.g. #26
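
A small sketch of what such a check could look like, assuming the baseline stores the creating version under a "version" key and that only the major component matters. The function name is hypothetical.

```python
def baseline_needs_regeneration(baseline, current_version):
    """Flag baselines with no recorded version or a different major version."""
    recorded = baseline.get('version')
    if recorded is None:
        # Baseline predates version tracking, so we cannot trust its format.
        return True
    return recorded.split('.')[0] != current_version.split('.')[0]
```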

Add an --audit functionality to audit created baseline

Currently, we use the baseline for two purposes:

  1. Acknowledgement of current true positive secrets in the code base, AND
  2. Whitelisted false positives.

For migration purposes, we need some way to distinguish them, so that we can count how many secrets are left to move to more secure storage. Therefore, the proposed way of displaying secrets in the baseline is:

{
    "type": "High Entropy String",
    "line_number": 19,
    "hashed_secret": "b8b693a3759e023b509093f4cacf0a3c973266fc",
    "is_secret": false
}

Then, we should make an --audit command line flag that prompts users through each of their secrets found, verifies with them whether it is indeed a secret, and makes the appropriate baseline change.
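
The prompting loop could look roughly like the sketch below. It assumes baseline entries are plain dictionaries and takes the prompt as a callable, so it can be driven by input() or by canned answers; audit_secrets and the 'y'/'n' answer convention are hypothetical.

```python
def audit_secrets(secrets, ask):
    """Label each entry in place; `ask` returns 'y', 'n', or anything else to skip."""
    for secret in secrets:
        answer = ask(secret)
        if answer == 'y':
            secret['is_secret'] = True
        elif answer == 'n':
            secret['is_secret'] = False
        # Any other answer leaves the entry unlabelled.
    return secrets


baseline_entries = [{
    'type': 'High Entropy String',
    'line_number': 19,
    'hashed_secret': 'b8b693a3759e023b509093f4cacf0a3c973266fc',
}]
answers = iter(['n'])
audited = audit_secrets(baseline_entries, lambda secret: next(answers))
```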

`pragma: whitelist secret` doesn't support additional text

detect-secrets doesn't complain about:

whitelisted_api_key: "DEADBEEF1234" # pragma: whitelist secret

but complains about:

whitelisted_api_key: "DEADBEEF1234" # pragma: whitelist secret blah

If it's a comment, it should support text that follows it.
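
One way to allow trailing text is to match the pragma with a word boundary after it instead of anchoring it to the end of the line. The pattern below is a sketch and not the regex detect-secrets actually uses; it also assumes #-style comments only.

```python
import re

# The trailing \b lets arbitrary text follow the pragma.
WHITELIST_PRAGMA = re.compile(r'#\s*pragma:\s*whitelist\s+secret\b')


def is_whitelisted(line):
    return WHITELIST_PRAGMA.search(line) is not None
```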

Line exclude regex just for keyword detector

In #127 we added a line exclude for all plugins and reverted 15a6e6a, which was a line exclude regex for just the high-entropy plugins. This was partially due to imagined use cases of things someone may want to exclude from all plugins (e.g. lines whose first non-whitespace character is a #), and partially for general cleanliness (DRY).

The keyword detector, however, is a young plugin with a rather ambitious goal. Users shouldn't have to add things that may be specific to their codebases to the FALSE_POSITIVES dict, and version bumps shouldn't be needed to trim false positives.

This is a good-first-issue, because it was done very similarly in 15a6e6a.

Inconsistent analyze_string return values

Right now the private_key plugin returns a dictionary of the form {'filename': PotentialSecret}, whereas the high_entropy_strings plugin returns a dictionary of the form {PotentialSecret.__hash__(): PotentialSecret}. We should make these consistent. I lean towards {PotentialSecret.__hash__(): PotentialSecret}, because I think the rest of the code assumes this. So e.g.

-            output[filename] = PotentialSecret(
+            secret = PotentialSecret(
                 self.secret_type,
                 filename,
                 line_num,
                 string,
             )
+            output[secret] = secret

Respect plugin list from baseline

Some of the plugins, in particular the entropy-based and keyword plugins, can generate a relatively high number of false positives. When some of our teams use detect-secrets, they choose to exclude certain plugins (with or without also excluding some files). Currently, if you run a scan with a --no-xxx-scan option, the list of plugins used is persisted in the baseline file.

If a developer or automation system picks up the repo with no pre-commit hook set up, and is unaware of the exclude list, they can run into a problem: they run detect-secrets --update baseline, and the baseline file is regenerated with all plugins enabled.

Would the community entertain the idea that a detect-secrets --update baseline scan uses the plugin list from the baseline instead of all plugins (the default setting)? Additional options could be added for scanning the repo with more plugins than the ones in the baseline.

We have something implemented in our fork (offline in our GHE), and we'd like to hear some feedback on the problem before submitting a big PR.

Audit doesn't highlight on python2

Doing e.g. detect-secrets audit .secrets.baseline won't highlight the relevant secret, which makes it kind of frustrating to use.

I didn't dive into what is causing this, b/c I switched to a python 3 venv, but we should fix it.

Consider using truffleHogRegexes

Hi,

I run truffleHog. I recognize other projects have spun out doing similar things, so a little while ago I broke the regexes into their own package:

https://github.com/dxa4481/truffleHogRegexes

The thinking is I'd ideally like to get the whole community contributing regexes to one place, even if the underlying engine and technology is different.

It's on pypi, feel free to include the regex library.

Handle un-scannable files more gracefully

Having a bunch of
INFO: Checking file: some_image.png
WARNING: some_image.png failed to load.
from

log.warning("%s failed to load.", filename)

is not ideal. We know we cannot scan certain files e.g. images, so we should behave more gracefully.

In detect-secrets-server we already have the IGNORED_FILE_EXTENSIONS tuple we made to skip files like this

IGNORED_FILE_EXTENSIONS = (
    '7z',
    'bmp',
    'bz2',
    'dmg',
    'exe',
    'gif',
    'gz',
    'ico',
    'jar',
    'jpg',
    'jpeg',
    'png',
    'rar',
    'realm',
    's7z',
    'tar',
    'tif',
    'tiff',
    'webp',
    'zip',
)

Maybe we should move it to detect-secrets, change it to a dict, and use it.
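
A sketch of how the extension check could work is shown below. Note that it swaps in a frozenset rather than the dict suggested above, since only membership tests are needed; the should_skip helper is hypothetical, and the extension list is abbreviated for the example.

```python
import os

# Abbreviated copy of the tuple above, stored as a frozenset for fast lookups.
IGNORED_FILE_EXTENSIONS = frozenset((
    '7z', 'bmp', 'bz2', 'gif', 'gz', 'jpg', 'jpeg', 'png', 'tar', 'zip',
))


def should_skip(filename):
    """True when the file's final extension is on the ignore list."""
    extension = os.path.splitext(filename)[1].lstrip('.').lower()
    return extension in IGNORED_FILE_EXTENSIONS
```

Files for which should_skip returns True could then be skipped quietly, before any attempt to load them produces a warning.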

Same secret multiple times in the same file

Hello,

I'm trying to scan using detect-secrets --all-files and I noticed that the tool does not detect multiple instances of the same secret in a single file. It will simply flag the first instance found. The comment located here makes me believe that this is by design. Is it possible to change this behavior or is the tool not designed for this?

[audit functionality] Handle Unicode better

Someone reported to us the following redacted stacktrace:

...detect_secrets/core/audit.py", line 347, in _highlight_secret
    secret_line[index_of_secret + len(raw_secret):],
UnicodeEncodeError: 'ascii' codec can't encode character u'\xef' in position 51: ordinal not in range(128)

Fix sequential false-positives

To name a few:
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
frozenset('0123456789ABCDEFabcdef')
"abcdefghijklmnopqrstuvwxyz=/"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
'0123456789abcdef'
'0123456789ABCDEF'
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
'0123456789ABCDEFabcdef'
'abcdefghijklmnopqrstuvwxyz'
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

None of these should trigger, as they are sequential. We could do secret in that_string, or maybe keep a set of sequential strings, but that's the naive first-thought solution.
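
The naive approach can be sketched as a substring test against a handful of known sequences. The list of sequences and the is_sequential name are assumptions chosen to cover the examples above, not an exhaustive or official list.

```python
import string

# Known sequential strings; any substring of these is treated as non-secret.
SEQUENTIAL_STRINGS = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + '+/',
    string.ascii_lowercase + string.ascii_uppercase + string.digits,
    string.digits + string.ascii_uppercase,
    string.digits + 'ABCDEF' + 'abcdef',
    string.digits + 'abcdef',
    string.ascii_lowercase + '=/',
)


def is_sequential(candidate):
    """True when the candidate is a non-empty substring of a known sequence."""
    return bool(candidate) and any(
        candidate in sequence for sequence in SEQUENTIAL_STRINGS
    )
```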

detect-secrets --scan outside/of/repository/ does not work

Perhaps in baseline.py we can do something similar to

+        if os.path.isdir(
+            os.path.join(rootdir, '.git')
+        ):
+            # This only works when you run it on the root directory of another repository
+            git_ls_files_args = [
+                'git',
+                '--git-dir', os.path.join(rootdir, '.git'),
+                'ls-files',
+            ]
+        else:
+            # This only works when you run it on a folder or file in the current repository
+            git_ls_files_args = [
+                'git',
+                'ls-files',
+                rootdir,
+            ]
         with open(os.devnull, 'w') as fnull:
             git_files = subprocess.check_output(
-                [
-                    'git',
-                    'ls-files',
-                    rootdir,
-                ],
+                git_ls_files_args,
                 stderr=fnull,
             )

However, I tested this, and although the output from the command was good, the resulting baseline was not.

Also note that the above only works if the directory outside the current repository is the root directory of that repository, not if it's some/other/repo/folder_inside_that_repo. I can imagine we could loop through all parents and see if one of them has a .git directory, but that feels really dirty. We could maybe try something similar to

    subprocess.check_output(
        ('git', 'remote', 'get-url', 'origin'),
        cwd=os.path.dirname(filename)
    ).decode('utf8').strip()
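
For comparison, the "loop through all parents" idea mentioned above could be sketched as follows. The find_git_root helper is hypothetical, and it assumes that .git is a directory, which does not hold for every checkout layout.

```python
import os


def find_git_root(path):
    """Walk upwards from path and return the first directory containing .git."""
    current = os.path.abspath(path)
    if not os.path.isdir(current):
        current = os.path.dirname(current)

    while True:
        if os.path.isdir(os.path.join(current, '.git')):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a repository.
            return None
        current = parent
```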

Reduce false positives for private keys

We're less concerned about private keys if they are encrypted with a passphrase. An example format is:

-----BEGIN RSA PRIVATE KEY-----                                                    
Proc-Type: 4,ENCRYPTED                                                             
DEK-Info: AES-128-CBC,99AD1487680054D5E49D263D3E4CBFEB

We probably can use this heuristic to reduce flagged data.
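
A minimal sketch of that heuristic, assuming the key uses the PEM layout shown above, where a Proc-Type header marks the key as encrypted. The function name is hypothetical, and other key formats would need their own checks.

```python
ENCRYPTED_MARKER = 'Proc-Type: 4,ENCRYPTED'


def is_passphrase_protected(pem_text):
    """True when any line of the PEM block is the encryption marker."""
    return any(
        line.strip() == ENCRYPTED_MARKER for line in pem_text.splitlines()
    )
```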

Should 'secret_key' be added to the keyword plugin?

While trying out detect-secrets for the first time today, I noticed the term 'secret_key' is not present in the keyword plugin (see https://github.com/Yelp/detect-secrets/blob/master/detect_secrets/plugins/keyword.py#L38). This means it doesn't find the very common Django SECRET_KEY variable. I was wondering if folks think 'secret_key' should be added to the keyword plugin's blacklist...but as a newcomer to this library I wasn't sure if that would cause consternation, since it would basically point out SECRET_KEY for any and all Django projects (if I understand the plugin correctly).

Just looking to start the conversation! Thanks!

Add keyword exclude to baseline

In #132 a keyword exclude option was added, but we didn't write what it was to the baseline. This is problematic because e.g. detect-secrets scan --update .secrets.baseline will re-scan and never use the keyword exclude.

I realized this before releasing 0.12.1 but thought it was pretty low priority, as it is mostly used to enable us to not have to bump detect-secrets and detect-secrets-server versions to make additions to the FALSE_POSITIVES dictionary in the keyword plugin, and sync it upstream asynchronously.

-v Verbosity overflow, 3+ v's cause a KeyError

Repro: detect-secrets --scan . -vvv

Here lies the relevant stack trace:

Traceback (most recent call last):
  File "/hey/three_six/bin/detect-secrets", line 11, in <module>
    sys.exit(main())
  File "/hey/three_six/lib/python3.6/site-packages/detect_secrets/main.py", line 27, in main
    log.set_debug_level(args.verbose)
  File "/hey/three_six/lib/python3.6/site-packages/detect_secrets/core/log.py", line 46, in _set_debug_level
    self.setLevel(mapping[debug_level])
KeyError: 3
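
One possible fix is to clamp the number of -v flags to the highest level the mapping knows about, instead of indexing the mapping directly. The mapping below is an assumption for illustration; the actual levels in log.py may differ.

```python
import logging

# Assumed mapping from the number of -v flags to a logging level.
VERBOSITY_TO_LEVEL = {
    0: logging.ERROR,
    1: logging.INFO,
    2: logging.DEBUG,
}


def level_for_verbosity(count):
    """Clamp the flag count into the known range before the lookup."""
    highest = max(VERBOSITY_TO_LEVEL)
    return VERBOSITY_TO_LEVEL[max(0, min(count, highest))]
```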

Keyword Detector is not used in 0.10.5

Hello,

When I run a scan on a repo, KeywordDetector is not used.
If this is expected, why is it disabled?

{
  "exclude_regex": null,
  "generated_at": "2018-11-26T09:42:29Z",
  "plugins_used": [
    {
      "base64_limit": 4.5,
      "name": "Base64HighEntropyString"
    },
    {
      "name": "BasicAuthDetector"
    },
    {
      "hex_limit": 3,
      "name": "HexHighEntropyString"
    },
    {
      "name": "PrivateKeyDetector"
    }
  ],
  "results": {},
  "version": "0.10.5"
}

Thank you

[audit functionality] Return the correct occurrence of the secret text in a line

If you have e.g.

self.thepassword = "thepassword" 

and run the soon-to-be-merged keyword detector, then run the audit functionality, it will highlight the first occurrence instead of the second. This is a bug.

This is because in audit.py we find the index of the secret

index_of_secret = secret_line.lower().index(raw_secret.lower())

and we don't return the index of the secret in the secret_generator method of plugins.

This is somewhat related to the issue of handling multiple secrets on the same line

# TODO: Handle multiple secrets on single line.
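
The difference can be seen in a small sketch: searching for the raw secret text finds the first occurrence, whereas the span of the regex capture group points at the value that was actually matched. The ASSIGNMENT pattern is illustrative only and is not the keyword plugin's regex.

```python
import re

# Illustrative pattern; the capture group is the secret value.
ASSIGNMENT = re.compile(r'password\s*=\s*"([^"]+)"', re.IGNORECASE)


def secret_span(line):
    """Return (start, end) of the captured secret, or None if nothing matches."""
    match = ASSIGNMENT.search(line)
    if match is None:
        return None
    return match.span(1)


line = 'self.thepassword = "thepassword"'
naive_index = line.lower().index('thepassword')  # lands inside the attribute name
span = secret_span(line)                         # lands on the quoted value
```

If plugins returned this span along with the secret, the audit code could highlight the exact match instead of searching the line again.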

Keyword plugin regex.search throws exception in 0.11.3

Hello,

I've tested the new version and I get this error:

Traceback (most recent call last):
  File "/home/.../detect-secrets-allprojects/venv/bin/detect-secrets", line 11, in <module>
    sys.exit(main())
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/main.py", line 43, in main
    _perform_scan(args, plugins),
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/main.py", line 118, in _perform_scan
    args.all_files,
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/core/baseline.py", line 52, in initialize
    output.scan_file(file)
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/core/secrets_collection.py", line 185, in scan_file
    self._extract_secrets_from_file(f, filename_key)
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/core/secrets_collection.py", line 282, in _extract_secrets_from_file
    results.update(plugin.analyze(f, filename))
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/plugins/base.py", line 32, in analyze
    secrets = self.analyze_string(line, line_num, filename)
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/plugins/keyword.py", line 136, in analyze_string
    filetype=determine_file_type(filename),
  File "/home/.../detect-secrets-allprojects/venv/local/lib/python3.5/site-packages/detect_secrets/plugins/keyword.py", line 157, in secret_generator
    match = REGEX.search(lowered_string)

Do you have any idea what causes this?

Best regards,

Tioborto

ini-like file not parsing

$ cat environment.variables
PASSWORD=d1bc8d3ba4afc7e109612cb73acbdddac052c93025aa1f82942edabb7deb82a1
$ detect-secrets scan environment.variables
{
  "exclude_regex": null,
  "generated_at": "2018-12-21T20:02:03Z",
  "plugins_used": [
    {
      "base64_limit": 4.5,
      "name": "Base64HighEntropyString"
    },
    {
      "name": "BasicAuthDetector"
    },
    {
      "hex_limit": 3,
      "name": "HexHighEntropyString"
    },
    {
      "name": "PrivateKeyDetector"
    }
  ],
  "results": {},
  "version": "0.11.0"
}

Environment variables look like .ini files, but without a header. We should be able to capture this using this fact.
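
A sketch of that idea: prepend a dummy section header so that configparser accepts the header-less content. The section name and the helper are hypothetical, and read_string requires Python 3.

```python
import configparser

DUMMY_SECTION = 'global'  # hypothetical placeholder section name


def parse_headerless_ini(content):
    """Parse KEY=VALUE content by placing it under a dummy section."""
    # interpolation=None keeps '%' characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string('[{}]\n{}'.format(DUMMY_SECTION, content))
    return dict(parser[DUMMY_SECTION])
```

Note that ConfigParser lower-cases option names by default, so PASSWORD comes back as password.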

Scan throws a UnicodeEncodeError when ini file has unicode on RHS

Scanning a Unicode file (e.g. detect-secrets scan poc.foo) results in

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 76: ordinal not in range(128)

The last place in the stack trace is

re.escape(values_list[current_value_list_index]),

https://github.com/Yelp/detect-secrets/pull/129/files was a previous fix for something similar, where from __future__ import unicode_literals fixed the issue easily. I opened ini_file_parser.py in my site-packages with vim, added that line, and it fixed the error. So we just need to make a PR similar to #129.

configparser requirement not in setup.py

While testing what I uploaded to Test PyPI, I did detect-secrets --help (on Python 2), and got:

ImportError: No module named configparser (high_entropy_strings.py)

flake8, in requirements-dev.txt, requires configparser, so we never ran into this during make test. 😮 Pretty sneaky!

We should add an option to the Makefile that runs make test with only the setup.py dependencies installed.

But for now, python-future suggests doing from configparser import ConfigParser, so that is what I will do.
