pip-license-checker

License compliance tool. Detect license names and types for Python PyPI packages. Identify license types for given license names obtained by third-party tools. Great coverage of free/libre and open source licenses of all types: public domain, permissive, copyleft.

Supported formats:

  • Python: packages or requirements.txt (detect license name and license type)
  • Clojure: EDN files generated by lein-licenses (detect license type)
  • JavaScript: CSV files generated by license-checker (detect license type)
  • iOS: Apple Plist files generated by CocoaPods Acknowledgements plugin (detect license type)
  • Android: JSON files generated by Gradle License Plugin (detect license type)
  • Other: CSV files with package name and license name columns (detect license type).

Installation

You can install pip-license-checker by pulling a Docker image, building from the source code, or plugging the GitHub Action into your CI pipeline.

I. Docker

There are two options for getting a docker image:

  1. Pulling an official image from Docker Hub
docker pull pilosus/pip-license-checker

Use a specific version tag (it matches the version of the tool in the repo) or just latest.

  2. Building a docker image yourself
git clone https://github.com/pilosus/pip-license-checker.git
cd pip-license-checker
docker build -t pilosus/pip-license-checker .

II. Compiling from source code

  1. Install Leiningen

  2. Get the source code

git clone https://github.com/pilosus/pip-license-checker.git
cd pip-license-checker

It's enough to start using the tool with lein. But you can optionally compile a standalone jar-file too:

  3. (Optional) Compile uberjar file
lein uberjar
cd target/uberjar
java -jar pip-license-checker-[version]-standalone.jar [args] [options]

III. GitHub Action for CI integration

If your project is hosted on GitHub, try the GitHub Action based on pip-license-checker.

Usage

Docker

docker run -it --rm pilosus/pip-license-checker \
  java -jar app.jar 'aiostream==0.4.3' 'pygit2' 'aiohttp>3.7.1'

When checking files (e.g. with the tool's --requirements or --external options), mount a host directory containing the files with docker's -v option:

docker run -v `pwd`:/volume \
    -e GITHUB_TOKEN=your-gh-token \
    -it --rm pilosus/pip-license-checker \
    java -jar app.jar --exclude 'pylint.*' \
    --requirements '/volume/requirements.txt' \
    --external '/volume/licenses.csv' \
    --fail StrongCopyleft \
    --fails-only \
    --verbose

Command line tool

The examples below assume you are using the lein tool. If you'd like to use the standalone jar file, just substitute lein run with java -jar pip-license-checker-[version]-standalone.jar.

### see usage message
lein run

### check a single package
lein run piny==0.6.0
lein run aiostream

### include pre-release and development versions
lein run --pre aiohttp

### scan all packages in requirements file
lein run --requirements resources/requirements.txt

aiohttp:3.7.2                       Apache Software License                                 Permissive
piny:0.6.0                          MIT License                                             Permissive
aiostream:0.4.3                     GPLv3                                                   StrongCopyleft
mo-collections:4.30.21121           Mozilla Public License 2.0 (MPL 2.0)                    WeakCopyleft
aiocache:0.11.1                     BSD 3-Clause "New" or "Revised" License                 Permissive
aiokafka:0.6.0                      Apache Software License                                 Permissive
aiopg:122.3.5                       Error                                                   Error
telegram-bot-framework:3.15.2       GNU Affero General Public License v3 or later (AGPLv3+) NetworkCopyleft
aio-throttle:1.6.2                  MIT License                                             Permissive
workflow-tools:0.6.0                Apache Software License                                 Permissive
Synx:0.0.3                          Other/Proprietary License                               Other

### scan packages matching regex pattern
### e.g. all lines except containing "aio.*" packages
lein run --requirements resources/requirements.txt --exclude 'aio.*'

piny:0.6                       MIT License                    Permissive
workflow-tools:0.5.0           Apache Software License        Permissive
Synx                           Other/Proprietary License      Other

Help

Run application with lein run or use --help option with standalone jar for more details.

lein run

pip-license-checker - license compliance tool to identify dependencies license names and types.

Usage:
pip-license-checker [options]... [package]...

Description:
  package	List of Python package names in format `name[specifier][version]`

  -v, --verbose                            0                                       Verbosity level: error (-v), info (-vv), debug (-vvv)
  -r, --requirements REQUIREMENTS_FILE     []                                      Python pip requirement file name
  -x, --external FILE_NAME                 []                                      File containing package names and license names
      --external-format FILE_FORMAT        csv                                     External file format: csv, cocoapods, gradle
      --external-options OPTS_EDN_STRING   {:skip-header true, :skip-footer true}  String of options map in EDN format
      --report-format FORMAT               stdout                                  Report format: stdout, json, json-pretty, csv
      --formatter PRINTF_FMT               %-35s %-55s %-20s %-40s                 Printf-style formatter string for stdout report formatting
  -f, --fail LICENSE_TYPE                  #{}                                     Return non-zero exit code if license type is found
  -e, --exclude REGEX                                                              PCRE to exclude packages with matching names
      --exclude-license REGEX                                                      PCRE to exclude packages with matching license names
      --[no-]pre                           false                                   Include pre-release and development versions. By default, use only stable versions
      --[no-]totals                        false                                   Print totals for license types
      --[no-]with-totals                   nil                                     [deprecated '0.41.0'] Print totals for license types
      --[no-]totals-only                   false                                   Print only totals for license types
      --[no-]headers                       false                                   Print report headers
      --[no-]table-headers                 nil                                     [deprecated '0.41.0'] Print table headers
      --[no-]fails-only                    false                                   Print only packages of license types specified with --fail flags
      --[no-]parallel                      true                                    Run requests in parallel
      --[no-]exit                          true                                    Exit program, used for CLI mode
      --rate-limits REQUESTS/MILLISECONDS  {:requests 120, :millis 60000}          Rate limit requests to public APIs
      --github-token TOKEN                 nil                                     GitHub OAuth Token to increase rate-limits. Defaults to GITHUB_TOKEN env
  -h, --help                                                                       Print this help message

Examples:
pip-license-checker django
pip-license-checker aiohttp==3.7.2 piny==0.6.0 django
pip-license-checker --pre 'aiohttp<4'
pip-license-checker --totals --headers --requirements resources/requirements.txt
pip-license-checker --totals-only -r file1.txt -r file2.txt -r file3.txt
pip-license-checker -r resources/requirements.txt django aiohttp==3.7.1 --exclude 'aio.*'
pip-license-checker -r resources/requirements.txt --rate-limits 10/1000
pip-license-checker -r resources/requirements.github.txt --github-token your-token
pip-license-checker -x resources/external.csv --exclude-license '(?i).*(?:mit|bsd).*'
pip-license-checker -x resources/external.csv --external-options '{:skip-header false}'
pip-license-checker -x resources/external.cocoapods --external-format cocoapods
pip-license-checker -x resources/external.edn --external-format edn --formatter '%-70s %-60s %-35s %-40s'
pip-license-checker -r resources/requirements.txt --report-format 'json-pretty' --totals --headers -vvv

License types

The following valid license types are available (to be used with --fail option):

  • NetworkCopyleft (GNU Affero GPL and the like)
  • StrongCopyleft (GNU GPL and the like)
  • WeakCopyleft (file-based or module-based copyleft like MPL or EPL)
  • Copyleft (includes all of above)
  • Permissive (BSD-style licenses)
  • Other (EULA, licenses not detected by the tool)
  • Error (no license name found or provided)

External file formats and their options

The tool supports multiple file formats containing package names and license names. The external file format can be specified with the --external-format option.

Options specific to the file format can be specified with the --external-options as a string in EDN format.

csv

A CSV file generated by license-checker or any other third-party tool.

At least two columns need to be present in the file: package name and license name.

Default --external-options are the following:

{:skip-header true :package-column-index 0 :license-column-index 1}
  • skip-header [boolean]: skip the first line in a CSV file because it's a header line
  • package-column-index [integer]: index of the column with a package name (zero-based indexing)
  • license-column-index [integer]: index of the column with a license name (zero-based indexing)
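
To make the three options concrete, here is a minimal Python sketch of how such a CSV could be read. It is an illustration only, not the tool's (Clojure) implementation; the function name `read_external_csv` and the snake_case option names are assumptions made for this example.

```python
import csv
import io

# Defaults mirror the option map above; the snake_case keys are an assumption
# of this sketch, the tool itself takes an EDN map.
DEFAULT_OPTIONS = {
    "skip_header": True,
    "package_column_index": 0,
    "license_column_index": 1,
}


def read_external_csv(text, **overrides):
    """Return (package, license) pairs parsed from CSV text."""
    opts = {**DEFAULT_OPTIONS, **overrides}
    rows = list(csv.reader(io.StringIO(text)))
    if opts["skip_header"]:
        rows = rows[1:]  # drop the header line
    return [
        (row[opts["package_column_index"]], row[opts["license_column_index"]])
        for row in rows
        if row  # ignore blank lines
    ]


if __name__ == "__main__":
    sample = "package,license\ndep1,MIT License\ndep2,GPLv2\n"
    for package, license_name in read_external_csv(sample):
        print(package, license_name)
```

Passing `skip_header=False` corresponds to `--external-options '{:skip-header false}'` for files without a header line.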

edn

An EDN file generated by lein-licenses, a Leiningen plugin for Clojure that detects the licenses of a project's dependencies.

Default --external-options are the following:

{:fully-qualified-names true}
  • {:fully-qualified-names [boolean]}: if true, use fully-qualified dependency names, e.g. org.clojars.vrs/pip-license-checker:0.30.0. Otherwise, try to strip the Clojars group or Maven groupId part, e.g. pip-license-checker:0.30.0. Default value is true.
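
The effect of the option can be sketched in a few lines of Python. This is only an illustration of the described behaviour, assuming the group part is everything before the last `/`; the helper name `display_name` is hypothetical.

```python
def display_name(dependency, fully_qualified_names=True):
    """Return the dependency name as it would appear in the report."""
    if fully_qualified_names:
        return dependency
    # Keep only the part after the last "/" (artifact name and version).
    return dependency.rsplit("/", 1)[-1]


if __name__ == "__main__":
    name = "org.clojars.vrs/pip-license-checker:0.30.0"
    print(display_name(name))
    print(display_name(name, fully_qualified_names=False))
```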

cocoapods

A property list (plist) file generated by CocoaPods Acknowledgements plugin.

See available options in the documentation.

gradle

A JSON file generated by Gradle License Plugin.

See available options in the documentation.

External file options

The following valid external file options are available (to be used with --external-options option) for the external formats:

  • csv: '{:skip-header [boolean]}' -- skip the first (header) line of the csv file or not.
  • cocoapods: see the documentation.
  • gradle: see the documentation

FAQ

Q1. Does the tool consider the Python package's version? What if a package changes its license over time?

The tool resolves the version for Python packages just as the pip package manager does. It also checks the license only for the resolved version of the package.

Q2. How do I check all Python dependencies for my project, both explicit and transitive ones?

pip-license-checker checks only explicitly defined dependencies, without transitive ones. The easiest way to check all dependencies is to export them with pip as a list and then run the tool with that list:

pip freeze > requirements-all.txt
lein run -r requirements-all.txt

Q3. Does the tool consider PEP-508 extras and markers specified for requirements?

PEP 508 indeed allows specifying extra packages to be installed for the package as well as markers describing the rules when the dependency should be used:

requests[security];python_version<"3.9"

The tool ignores both extras and markers. Use the recipe for the Q2 if you need extras/markers to have an effect on the final list of dependencies to be checked.
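
What "ignoring" means for a requirement line can be sketched as follows. This is a simplified Python illustration, not the tool's parser: it assumes a plain name-based requirement (no direct URL references), and the function name is hypothetical.

```python
import re


def strip_extras_and_markers(requirement):
    """Reduce a PEP 508 line to the bare name and version specifier."""
    # Drop the environment marker: everything after the first ";".
    spec = requirement.split(";", 1)[0]
    # Drop extras: the bracketed list that follows the package name.
    spec = re.sub(r"\[[^\]]*\]", "", spec)
    return spec.strip()


if __name__ == "__main__":
    print(strip_extras_and_markers('requests[security];python_version<"3.9"'))
```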

Disclaimer

pip-license-checker is provided on an "as-is" basis and makes no warranties regarding any information provided through it, and disclaims liability for damages resulting from using it. Using pip-license-checker does not constitute legal advice nor does it create an attorney-client relationship.

License

Copyright © 2020-2023 Vitaly Samigullin

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.

SPDX-License-Identifier: EPL-2.0 OR GPL-2.0 WITH Classpath-exception-2.0

pip-license-checker's Issues

Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "20160909030348"

While running the latest docker image over my project's requirements.txt, came across a traceback. I traced the failure down to graphql-core==3.1.3 in the requirements file. Turns out this is reproducible with just docker run -it --rm pilosus/pip-license-checker java -jar app.jar graphql-core too.

Traceback

Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "20160909030348"
	at java.base/java.util.concurrent.FutureTask.report(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.get(Unknown Source)
	at clojure.core$deref_future.invokeStatic(core.clj:2300)
	at clojure.core$future_call$reify__8454.deref(core.clj:6974)
	at clojure.core$deref.invokeStatic(core.clj:2320)
	at clojure.core$deref.invoke(core.clj:2306)
	at clojure.core$map$fn__5866.invoke(core.clj:2753)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:58)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at clojure.core$concat$fn__5493.invoke(core.clj:725)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:51)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at pip_license_checker.core$process_requirements.invokeStatic(core.clj:80)
	at pip_license_checker.core$_main.invokeStatic(core.clj:222)
	at pip_license_checker.core$_main.doInvoke(core.clj:217)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at pip_license_checker.core.main(Unknown Source)
Caused by: java.lang.NumberFormatException: For input string: "20160909030348"
	at java.base/java.lang.NumberFormatException.forInputString(Unknown Source)
	at java.base/java.lang.Integer.parseInt(Unknown Source)
	at java.base/java.lang.Integer.parseInt(Unknown Source)
	at pip_license_checker.version$parse_number.invokeStatic(version.clj:35)
	at pip_license_checker.version$parse_letter_version.invokeStatic(version.clj:50)
	at pip_license_checker.version$validate_version.invokeStatic(version.clj:98)
	at pip_license_checker.version$parse_version.invokeStatic(version.clj:125)
	at pip_license_checker.pypi$get_releases$fn__4439.invoke(pypi.clj:58)
	at clojure.core$map$fn__5866.invoke(core.clj:2755)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:51)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at clojure.core$filter$fn__5893.invoke(core.clj:2809)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:51)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at clojure.core$filter$fn__5893.invoke(core.clj:2809)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:51)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at clojure.core$filter$fn__5893.invoke(core.clj:2809)
	at clojure.lang.LazySeq.sval(LazySeq.java:42)
	at clojure.lang.LazySeq.seq(LazySeq.java:51)
	at clojure.lang.RT.seq(RT.java:535)
	at clojure.core$seq__5402.invokeStatic(core.clj:137)
	at pip_license_checker.version$filter_versions.invokeStatic(version.clj:436)
	at pip_license_checker.version$get_version.invokeStatic(version.clj:463)
	at pip_license_checker.pypi$get_requirement_version.invokeStatic(pypi.clj:72)
	at pip_license_checker.pypi$requirement__GT_license.invokeStatic(pypi.clj:155)
	at pip_license_checker.pypi$get_parsed_requiements$fn__4479.invoke(pypi.clj:188)
	at clojure.core$pmap$fn__8462$fn__8463.invoke(core.clj:7022)
	at clojure.core$binding_conveyor_fn$fn__5754.invoke(core.clj:2030)
	at clojure.lang.AFn.call(AFn.java:18)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
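
The "Caused by" frames show the version segment being passed to Integer.parseInt, which only accepts values that fit a 32-bit signed int. A quick check (written in Python purely for the arithmetic; the helper name is made up) shows that the timestamp-like segment is far out of that range:

```python
INT32_MAX = 2**31 - 1  # upper bound of a Java int: 2147483647


def fits_java_int(digits):
    """True if a non-negative decimal string fits into a 32-bit signed int."""
    return int(digits) <= INT32_MAX


if __name__ == "__main__":
    print(fits_java_int("3"))
    print(fits_java_int("20160909030348"))
```

Parsing such segments with a wider or arbitrary-precision integer type would presumably avoid the exception.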

Support markers

Markers allow specifying constraints for a package: platform, OS, Python version, etc.

We need to implement:

  1. -m/--markers="python_version>'2',os_name=='a' and os_name=='b',..." CLI option for markers
  2. Markers parsing
  3. Filtering for markers

Version resolution

For now, only equality specifiers work as expected; all other specifiers translate into "get the latest version".
We need to use real version resolution.
Take packaging as a role model for version resolution.
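
To show the intended behaviour, here is a deliberately small Python sketch of specifier-aware resolution. It is not what packaging does in full: it assumes purely numeric dotted versions and a single specifier, and it ignores pre-releases, epochs, wildcards, and the ~= operator. All names are illustrative.

```python
import operator
import re

OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}
SPECIFIER = re.compile(r"\s*(==|!=|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*")


def release_key(version):
    """Turn "3.7.0" into (3, 7); trailing zeros are dropped so 3.7 == 3.7.0."""
    parts = [int(part) for part in version.split(".")]
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches(version, specifier):
    """Check one version against one specifier such as ">3.7.1"."""
    match = SPECIFIER.fullmatch(specifier)
    if match is None:
        raise ValueError(f"unsupported specifier: {specifier!r}")
    op, target = match.groups()
    return OPERATORS[op](release_key(version), release_key(target))


def resolve(available, specifier):
    """Pick the highest available version that satisfies the specifier."""
    candidates = [v for v in available if matches(v, specifier)]
    return max(candidates, key=release_key) if candidates else None


if __name__ == "__main__":
    versions = ["3.6.2", "3.7.1", "3.7.2", "3.8.0"]
    print(resolve(versions, ">3.7.1"))
    print(resolve(versions, "<3.7"))
```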

Feature request: for a given project license, a distribution profile and dependency licenses return required actions

For now, using the tool requires some expertise. A user should understand the differences between the license types (e.g. permissive, weak copyleft, strong copyleft, "copyleft over the network") in order to understand the consequences for the project whose dependencies they are checking.

We need to "transfer" some expertise to the tool itself by supporting:

  • a project license (e.g. Proprietary)

  • a project distribution profile (e.g. mobile-app)

  • project dependencies with their license names/types (already implemented at the time of writing)

  • returning/detecting required actions e.g.:

  • add the original copyright notice

  • disclose the source code

  • (re)-license the project under the same or compatible strong-copyleft license

More verbosity for errors

When an Error occurs it may mean too many different things:

  • package not found
  • license name in package meta not found
  • all kinds of network errors (including rate limits related)

We need to know what exactly has happened, either by extending Error with comments or by sending a list of errors to stderr.

License :: OSI Approved trove classifier should not be considered as a license name

For now scikit-learn gets recognized as Other type, because it has the following in metadata:

"classifiers": [
      ...
      "License :: OSI Approved",
      ...
]
"license": "new BSD"

The PyPI page shows the license as OSI Approved (new BSD), meaning it recognizes that the classifier carries only the generic type without a specific license name.

The packaging docs say:

The license argument doesn’t have to indicate the license under which your package is being released, although you may optionally do so if you want. If you’re using a standard, well-known license, then your main indication can and should be via the classifiers argument. Classifiers exist for all major open-source licenses.

The “license” argument is more typically used to indicate differences from well-known licenses, or to include your own, unique license. As a general rule, it’s a good idea to use a standard, well-known license, both to avoid confusion and because some organizations avoid software whose license is unapproved.

So the algorithm for choosing the licenses should be amended so that:

  1. Check the license in classifiers first; if a specific license is found, use it, otherwise go to step 2.
    License :: OSI Approved is considered to be a common license description, not the license name!
  2. Check the license field; if found, use it, otherwise go to step 3.
  3. Check the GitHub repo license.
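
The three steps can be sketched like this. It is a Python illustration of the proposed order rather than the tool's Clojure code; `github_lookup` is a hypothetical callback standing in for the GitHub API request, and the metadata keys follow the JSON snippet above.

```python
# Classifiers that only state a license family, not a concrete license name.
GENERIC_CLASSIFIERS = {"License :: OSI Approved"}


def license_from_classifiers(classifiers):
    """Return the first specific license named by a trove classifier."""
    for classifier in classifiers:
        if classifier.startswith("License ::") and classifier not in GENERIC_CLASSIFIERS:
            return classifier.split(" :: ")[-1]
    return None


def choose_license_name(metadata, github_lookup=lambda: None):
    """Apply the order: classifiers, then license field, then GitHub."""
    name = license_from_classifiers(metadata.get("classifiers") or [])
    if name:
        return name
    name = (metadata.get("license") or "").strip()
    if name:
        return name
    return github_lookup()


if __name__ == "__main__":
    scikit_like = {"classifiers": ["License :: OSI Approved"], "license": "new BSD"}
    print(choose_license_name(scikit_like))
```

With the generic classifier skipped, the scikit-learn-style metadata falls through to the license field instead of being reported as Other.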

Support detecting license types for deps with known licenses

Instead of passing in requirements.txt to the program, we may pass in a list of dependencies with their licenses to detect/gather stats about license types.

File format: TBD (possibly csv), see data.csv

dep1,MIT License
dep2,GPLv2
dep3,EULA
dep4,Error

CLI argument: TBD (possibly --external)

CLI options to print totals

Totals options

Let's introduce two new options:

  1. --with-totals

Standard output with totals details appended

  2. --totals-only

Only totals details in program output

Totals details output

Totals should reflect total counters for each license type detected:

Total permissive: 10 
Total copyleft: 1
Total other: 0
Total errors: 0
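
A minimal Python sketch of how these counters could be produced from the detected license types. The label mapping and the function name are assumptions of this example; only the output shape comes from the proposal above.

```python
from collections import Counter

# (license type, label used in the totals line); order defines output order.
LABELS = [
    ("Permissive", "permissive"),
    ("Copyleft", "copyleft"),
    ("Other", "other"),
    ("Error", "errors"),
]


def totals_lines(license_types):
    """Count license types and render one "Total ..." line per type."""
    counts = Counter(license_types)
    return [f"Total {label}: {counts[key]}" for key, label in LABELS]


if __name__ == "__main__":
    for line in totals_lines(["Permissive", "Permissive", "Copyleft"]):
        print(line)
```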

Support for the popular CI/CD platforms

For now, we support GitHub Actions. We need to extend the support to all the popular CI/CD platforms, either at the documentation level, or as a plugin:

  • Jenkins
  • GitLab CI
  • Travis CI
  • CircleCI
  • ...

Support setup.py dependencies

For now we recognize only requirements files. We need to also support setup.py dependencies:

  1. Distinguish it from a requirements file (simply match the name against the predefined setup.py)
  2. Parse the file, find a list of dependencies in the install-requires section
  3. Do regular processing for the requirements found

Limitations

We don't want to reinvent the Python parser. That's why we should do the simplest parsing for install-requires: check for an explicit list of strings. Variables and function calls assigned to install-requires won't be parsed.

Maybe also add a verbosity flag and a stdout notification for when a setup.py file is detected but install-requires parsing fails (the field exists but does not contain a list of strings).
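
The "explicit list of strings only" rule can be illustrated in Python, where the standard ast module does the parsing for free. This is only a sketch of the acceptance rule (the tool itself is written in Clojure and would need its own approach); it assumes the keyword is spelled install_requires, as in setuptools, and the function name is hypothetical.

```python
import ast


def install_requires_from_source(source):
    """Return install_requires if it is a literal list of strings, else None."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both setup(...) and setuptools.setup(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg != "install_requires":
                continue
            value = keyword.value
            if isinstance(value, (ast.List, ast.Tuple)) and all(
                isinstance(item, ast.Constant) and isinstance(item.value, str)
                for item in value.elts
            ):
                return [item.value for item in value.elts]
            return None  # present, but a variable or a function call
    return None


if __name__ == "__main__":
    code = 'from setuptools import setup\nsetup(name="x", install_requires=["piny==0.6.0"])\n'
    print(install_requires_from_source(code))
```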

Fallback to the GitHub API to detect a Python dep's license name should be visible to a user

For now, we are trying to detect Python dep's license name this way:

  1. Metadata's trove classifier (trove classifiers are recommended for OSI-approved FLOSS licenses)
  2. Metadata's license field (recommended for licenses not available for trove classifiers, e.g. FLOSS license with exceptions or EULA)
  3. GitHub repo license

The problem with the GitHub API response for license name is that it is not version-specific, but rather HEAD-specific.
If we want to detect a license name for package:0.1.2, but the HEAD is pointing to the package:1.0.0 we can easily end up with the wrong verdict if the package has changed its license since the version 0.1.2.

What to do?

  1. Try to implement more sophisticated heuristics (e.g. check out the code to version branch/tag, both v0.1.2 and 0.1.2, try to parse LICENSE or COPYING)
  2. Use the GitHub API as we do now, add an additional column to the report:
| Package           | License Name                               | License ID                | License Type   | License Source        |
| package1:0.1.2    | Apache 2.0 License                         | Apache-2.0                | Permissive     | External              |
| package2:3.141592 | GNU General Public License v2 or any later | GPL-2.0-or-later          | StrongCopyleft | External              |
| package3:21.09    | Other/Proprietary License (EULA)           | NA                        | Other          | PythonMetaClassifiers |
| package4          | GPL-3.0 Linking Exception                  | GPL-3.0-linking-exception | WeakCopyleft   | PythonMetaLicense     |
| package5:2.19.2   | null                                       | NA                        | Error          | PythonGitHub          |
  3. Introduce a flag option --fail-license-source SOURCE_NAME, so that a user who needs stricter checks may always get notified if the GitHub API fallback with its known disadvantages is triggered.

Step 1 is arguably laborious to implement and error-prone (a dep's version does not necessarily match a branch name or a tag), and it may require adding GitHub API token support (the API limits unauthenticated clients to 60 requests per hour, so multiple requests to the API may easily lead to a 429 status code for exceeding the limits, especially for checks with longish lists of deps).

I'd go with steps 2 and 3 and not implement step 1.

Fine grained copyleft detection: weak/partial copyleft, strong copyleft, "over the network"/SaaS-copyleft

For now we detect only Copyleft licenses.
But copyleft licenses differ so much in nature that the same license can behave drastically differently across use cases, e.g. software linking a GPLv3 library that runs on the company's private server versus the same software distributed to the company's customers.

We have to introduce fine grained copyleft detection:

  1. Weak/Partial copyleft
    E.g. LGPL, MPL, GPL with linking exception
    Value for --fail flag: WeakCopyleft

  2. Strong copyleft
    E.g. GPL v2 or later
    Value for --fail flag: StrongCopyleft

  3. SaaS-copyleft
    E.g. AGPL, SSPL
    Value for --fail flag: NetworkCopyleft

We probably also want to support --fail "Copyleft" to include all kinds of copyleft
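
How a generic Copyleft value could expand into the three fine-grained types is sketched below in Python. The function names are hypothetical and the snippet only illustrates the proposed semantics of the --fail values.

```python
COPYLEFT_TYPES = {"WeakCopyleft", "StrongCopyleft", "NetworkCopyleft"}


def expand_fail_types(values):
    """Expand "Copyleft" into all fine-grained copyleft types."""
    expanded = set()
    for value in values:
        if value == "Copyleft":
            expanded |= COPYLEFT_TYPES
        else:
            expanded.add(value)
    return expanded


def should_fail(found_types, fail_values):
    """True if any detected license type is among the types to fail on."""
    return bool(set(found_types) & expand_fail_types(fail_values))


if __name__ == "__main__":
    print(should_fail(["Permissive", "WeakCopyleft"], ["Copyleft"]))
    print(should_fail(["Permissive", "WeakCopyleft"], ["StrongCopyleft"]))
```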

Project standalone documentation

As the project gets bigger, so does the README. It's hard to navigate now.
Although getting help from lein run --help is a nice way to look up the available options, it cannot cover all the topics that are not strictly about the tool but relate to how the tool is run in certain scenarios.
cljdoc is a nice autodoc for the library uploaded to the Clojars, yet it suffers from the same limitations as the embedded help.

All in all, a standalone website with the documentation is needed to cover:

  • tool's basic help
  • library's autodoc
  • articles related to the tool's topic of interest (license types, what real-life scenarios require attention to what types of licenses, etc.)

The platforms for hosting static web pages to consider:

Fallback to license from github repo

Priorities for getting license name:

  1. Trove classifiers
  2. "license" field

Some packages ignore both of these. In this case we should try to find github repository of the project in one of the following fields:

  "home_page": "https://github.com/aio-libs/aiohttp",
  "project_url": "https://pypi.org/project/aiohttp/",
  "project_urls": {
    "CI: Azure Pipelines": "https://dev.azure.com/aio-libs/aiohttp/_build",
    "Chat: Gitter": "https://gitter.im/aio-libs/Lobby",
    "Coverage: codecov": "https://codecov.io/github/aio-libs/aiohttp",
    "Docs: RTD": "https://docs.aiohttp.org",
    "GitHub: issues": "https://github.com/aio-libs/aiohttp/issues",
    "GitHub: repo": "https://github.com/aio-libs/aiohttp",
    "Homepage": "https://github.com/aio-libs/aiohttp"
  },

then try to get the license via github api:
https://docs.github.com/en/free-pro-team@latest/rest/reference/licenses#get-the-license-for-a-repository
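
A possible shape of the lookup, sketched in Python under the assumption that only home_page and project_urls are inspected. The helper names are illustrative; the endpoint path follows the GitHub REST API for repository licenses linked above.

```python
import re

# Matches https://github.com/<owner>/<repo>[/anything]
GITHUB_REPO = re.compile(r"^https?://github\.com/([^/\s]+)/([^/\s#?]+)")


def find_github_repo(info):
    """Return (owner, repo) from the first GitHub URL in the metadata."""
    urls = [info.get("home_page")]
    urls.extend((info.get("project_urls") or {}).values())
    for url in urls:
        match = GITHUB_REPO.match(url or "")
        if match:
            owner, repo = match.groups()
            if repo.endswith(".git"):
                repo = repo[: -len(".git")]
            return owner, repo
    return None


def license_api_url(owner, repo):
    """Build the REST endpoint that returns the repository's license."""
    return f"https://api.github.com/repos/{owner}/{repo}/license"


if __name__ == "__main__":
    info = {
        "home_page": "https://github.com/aio-libs/aiohttp",
        "project_urls": {"Docs: RTD": "https://docs.aiohttp.org"},
    }
    print(license_api_url(*find_github_repo(info)))
```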

Project logo

It would be nice to have a logo.

It could possibly play on the meaning of the word pip, e.g. a slice of orange with seeds, tangerine seeds, or apple seeds.
I guess that incorporating some sort of a fruit slice with seeds will be much more positive than using a representation of license/agreement (e.g. a list of legal papers) or a representation of search/checks (checklist, magnifying glass, scales).

Some photos for inspiration.

Technical requirements

  • PNG format with transparent background
  • Adapted to be placed both on whitish and blackish backgrounds
  • Square size

Cannot get package meta when requirement file uses --find-links

To reproduce:

  1. Create a requirements.txt
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.7.0+cpu
  2. Pass it in to the checker and get an error
lein run --requirements requirements.txt
torch:1.7.0+cpu                     Error                                                   Error      

The problem is that we plug in an external link with a package version that is absent from the global PyPI index. That's why we cannot compare versions and get the error

Use SPDX license identifiers

We must adopt the SPDX License List, because SPDX license identifiers are precise, unambiguous, short, and are gradually being adopted (see more details).

The exact matching with the SPDX id means the exact license name and version, which in turn means the exact license type.
SPDX id exact matching must be a prioritized way of license name/type detection with the fallback to existing heuristics.

Having an unambiguous license identifier means that in addition to excluding licenses with the --exclude-license REGEX option we could also exclude licenses by predefined identifiers.

Preliminary TODO

  1. Introduce SPDX ids map (Clojure's keyword to a string holding a SPDX id)
  2. Introduce a mapping from SPDX id to license type (public domain, permissive, weak/strong/network copyleft, proprietary, etc.)
  3. Revise existing regex (PCRE) to match a license's SPDX id, instead of a license type.
    See SPDX matching guidelines
PCRE -> SPDX id -> License type
  4. Include the SPDX id in the report's output (sure enough, only
    free/libre/open-source licenses are expected to have an SPDX id, others should get an NA or equivalent)
| Package           | License Name                               | License ID                | License Type   |
| package1:0.1.2    | Apache 2.0 License                         | Apache-2.0                | Permissive     |
| package2:3.141592 | GNU General Public License v2 or any later | GPL-2.0-or-later          | StrongCopyleft |
| package3:21.09    | Other/Proprietary License (EULA)           | NA                        | Other          |
| package4          | GPL-3.0 Linking Exception                  | GPL-3.0-linking-exception | WeakCopyleft   |
| package5:2.19.2   | null                                       | NA                        | Error          |

Could be, possibly, an optional column turned on/off with the option flag?

  5. Introduce --exclude-license-id option to exclude by SPDX id
  6. Introduce --fail-license-id option to fail if a package with the provided license SPDX id found
  7. Introduce stats for SPDX ids found with --stats-by-license-id option
  8. (breaking change) Change --with-totals to --stats-by-license-type to provide stats for license types found a
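
The PCRE -> SPDX id -> License type chain from the TODO could look roughly like this. It is a Python sketch with a tiny, hand-picked pattern table; the regexes are illustrative assumptions and not the tool's actual patterns.

```python
import re

# Step 1: license name pattern -> SPDX id (first match wins).
SPDX_PATTERNS = [
    (re.compile(r"apache.*2\.0", re.IGNORECASE), "Apache-2.0"),
    (re.compile(r"^mit\b", re.IGNORECASE), "MIT"),
    (re.compile(r"mozilla public license.*2\.0|mpl[- ]?2\.0", re.IGNORECASE), "MPL-2.0"),
]

# Step 2: SPDX id -> license type.
SPDX_TYPES = {
    "Apache-2.0": "Permissive",
    "MIT": "Permissive",
    "MPL-2.0": "WeakCopyleft",
}


def classify(license_name):
    """Return (SPDX id, license type); unknown names get ("NA", "Other")."""
    for pattern, spdx_id in SPDX_PATTERNS:
        if pattern.search(license_name):
            return spdx_id, SPDX_TYPES[spdx_id]
    return "NA", "Other"


if __name__ == "__main__":
    for name in ["Apache 2.0 License", "MIT License", "EULA"]:
        print(name, classify(name))
```

Keeping the type lookup keyed by SPDX id means the regexes only have to identify a license, not decide its type.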

Make GitHub Action

Make GHA so that pip-license-checker could be incorporated into GitHub-based CI

Formatting long lines

Some package names or license names are way too long for existing report formatting (see example on Print report step).

There are the following possible solutions to the problem:

  1. Introduce an option with printf format string to be used instead of the hardcoded values
  2. Calculate optimal formatting automatically by analyzing all the rows of the final report. Possibly, add an option flag to turn on/off auto-formatting
  3. Introduce a new --external-format called edn to support lein-licenses with --external-options to shorten fully-qualified package names (just like for gradle format)
  4. Truncate long lines to the limit in accordance with the formatter's constants.

Option 3 seems to be orthogonal to the problem.
Option 4 may be implemented along with option 1: we truncate the strings but allow to set the limits via the option flag.
Option 2 sounds good, although examples of existing software choosing this approach would be a good idea.
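
Options 2 and 4 can be combined: compute each column's width from the data, but cap it at a limit and truncate anything longer. A rough Python sketch (function name and the default limit are assumptions; the limit is expected to be at least 4 so the ellipsis fits):

```python
def format_rows(rows, limit=60):
    """Left-align columns to the widest cell, capped at `limit` characters."""
    if not rows:
        return []
    widths = [
        min(limit, max(len(row[i]) for row in rows))
        for i in range(len(rows[0]))
    ]
    lines = []
    for row in rows:
        cells = []
        for width, cell in zip(widths, row):
            if len(cell) > width:
                cell = cell[: width - 3] + "..."  # truncate overly long values
            cells.append(cell.ljust(width))
        lines.append(" ".join(cells).rstrip())
    return lines


if __name__ == "__main__":
    report = [
        ("piny:0.6.0", "MIT License", "Permissive"),
        ("aiostream:0.4.3", "GPLv3", "StrongCopyleft"),
    ]
    print("\n".join(format_rows(report)))
```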

Allow custom host for PyPI mirrors

PyPI URL is hardcoded in the checker's code.
Given that people may be using PyPI mirrors, we need a CLI option to pass in a custom URL for JSON API
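
A sketch of what a configurable base URL could mean in practice, written in Python for illustration. It assumes the mirror exposes the same JSON API paths as pypi.org (/pypi/<name>/json and /pypi/<name>/<version>/json); the function name and the mirror host in the usage lines are made up.

```python
def package_json_url(name, version=None, base_url="https://pypi.org"):
    """Build the JSON API URL for a package on PyPI or a compatible mirror."""
    base = base_url.rstrip("/")
    if version:
        return f"{base}/pypi/{name}/{version}/json"
    return f"{base}/pypi/{name}/json"


if __name__ == "__main__":
    print(package_json_url("aiohttp"))
    print(package_json_url("aiohttp", "3.7.2", base_url="https://mirror.example.org/"))
```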

Fine-grained permissive license detection: public-domain-like licenses, notice-required licenses

For now, we lump all "permissive" licenses into one type Permissive.
A user may want to differentiate between

  1. Licenses that require preserving copyright notice, i.e. attribution (like Apache-2.0 or MIT)
    They may require a prominently visible copyright notice.

  2. Public-domain-equivalent licenses (like WTFPL, CC0, 0BSD). They:

  • require no attribution
  • are legally complex: public domain is legislation-dependent, and dedicating software to the public domain at will may be problematic in Continental Europe at least (so the copyright may still hold)
  • can be susceptible to patent trolling

We may split the Permissive type into PermissiveNotice and PermissivePublicDomain respectively.

Report formatting: output format, columns control

For now, the only way to output the check results is by printing to the stdout. This is fine, especially after #80 has been implemented.

But once we introduce new columns, License SPDX ID (see #85 ) and License source (see #89), we may need to:

  1. Control what columns to be shown/hidden from the output

We can do that with the flag

--report-columns '[:package :license-id :license-name :license-source :license-type]'

Passing in the EDN vector enables a user to set the column presence and their order.

  2. Export as a file with the support for some popular file formats: csv, json, xml, edn.

We can do that with the flag --report-format FILE_FORMAT with the default value of stdout.
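
Both ideas, column selection and a file format, can be combined in one small routine. The Python sketch below renders CSV only; the dictionary keys mirror the column keywords proposed above, and the function name is hypothetical.

```python
import csv
import io


def render_csv(items, columns):
    """Render report items as CSV with the requested columns, in order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for item in items:
        # Missing values become empty cells.
        writer.writerow([item.get(column, "") for column in columns])
    return buffer.getvalue()


if __name__ == "__main__":
    items = [
        {"package": "piny:0.6.0", "license-name": "MIT License", "license-type": "Permissive"},
    ]
    print(render_csv(items, ["package", "license-type"]), end="")
```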
