SPDX Tools

License: Apache License 2.0

tools's Introduction

Important Update

This version of the SPDX Java tools is planned to be replaced on the next major release of the SPDX Spec. The new Java tools can be found in the tools-java repo. You are encouraged to switch over to the new version of the SPDX Java tools, which should be stable. If you would like to use a lighter-weight library in your Java application, check out the SPDX Java Library.

Overview

The Software Package Data Exchange (SPDX) specification is a standard format for communicating the components, licenses and copyrights associated with a software package.

These tools are published by the SPDX Workgroup; see http://spdx.org/

See the SPDX Tools Documentation for details on how to use the command line tools.

Getting Started

The SPDX Tool binaries can be downloaded from the BinTray SPDX Tools Java repo under the respective release. The package is also available in Maven Central (organization org.spdx, artifact spdx-tools).

See the Syntax section below for the commands available.

Contributing

See the file CONTRIBUTING.md for information on making contributions to the SPDX tools.

Issues

Report any security related issues by sending an email to [email protected]

Non-security related issues should be added to the SPDX tools issues list

Syntax

The command line interface of the spdx tools can be used like this:

java -jar spdx-tools-jar-with-dependencies.jar <function> <parameters>

SPDX format converters

The following converter tools are provided by the spdx tools:

TagToSpreadsheet
TagToRDF
RdfToTag
RdfToHtml
RdfToSpreadsheet
SpreadsheetToRDF
SpreadsheetToTag

Example to convert an SPDX file from tag to rdf format:

java -jar spdx-tools-jar-with-dependencies.jar TagToRDF Examples/SPDXTagExample.tag TagToRDF.rdf

Compare utilities

The following tools can be used to compare one or more SPDX documents:

CompareSpdxDocs

Example to compare two SPDX files provided in rdf format:

java -jar spdx-tools-jar-with-dependencies.jar CompareSpdxDocs doc1 doc2 [output]

CompareMultipleSpdxDocs

Example to compare multiple SPDX files provided in rdf format and provide a spreadsheet with the results:
```
java -jar spdx-tools-jar-with-dependencies.jar CompareMultipleSpdxDocs output.xls doc1 doc2 ... docN
```

SPDX Viewer

The following tool can be used to "Pretty Print" an SPDX document.

SPDXViewer

Sample usage:

java -jar spdx-tools-jar-with-dependencies.jar SPDXViewer TestFiles/SPDXRdfExample.rdf

Verifier

The following tool can be used to verify an SPDX document:

Verify

Sample usage:

java -jar spdx-tools-jar-with-dependencies.jar Verify TestFiles/SPDXRdfExample.rdf

Generators

The following tool can be used to generate an SPDX verification code from a directory of source files:

GenerateVerificationCode sourceDirectory

Sample usage:

    java -jar spdx-tools-jar-with-dependencies.jar GenerateVerificationCode sourceDirectory [ignoredFilesRegex]
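
For readers curious about what a verification code is, here is a minimal sketch of the calculation as it is commonly described for the SPDX package verification code: hash each file with SHA-1, sort the hex digests, concatenate them, and hash the result again. It is not the tool's own implementation. The class and method names are hypothetical, and the sketch works on in-memory file contents rather than walking a directory or applying an ignore regex.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not part of the SPDX tools.
class VerificationCodeSketch {

    // Lowercase hex SHA-1 of a byte array.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Hash each file, sort the hex digests, concatenate them, hash again.
    static String verificationCode(List<byte[]> fileContents) {
        List<String> digests = new ArrayList<>();
        for (byte[] content : fileContents) {
            digests.add(sha1Hex(content));
        }
        Collections.sort(digests);
        String joined = String.join("", digests);
        return sha1Hex(joined.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        List<byte[]> files = new ArrayList<>();
        files.add("first file".getBytes(StandardCharsets.UTF_8));
        files.add("second file".getBytes(StandardCharsets.UTF_8));
        System.out.println(verificationCode(files));
    }
}
```

Sorting the digests before the second hash is what makes the result independent of the order in which files are visited.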

SPDX Validation Tool

The SPDX Workgroup provides an online interface to validate, compare, and convert SPDX documents in addition to the command line options above. The SPDX Validation Tool is an all-in-one portal to upload and parse SPDX documents for validation, comparison and conversion and search the SPDX license list.

License

See the NOTICE file for licensing information including info from 3rd Party Software

See LICENSE file for full license text

SPDX-License-Identifier:	Apache-2.0
PackageLicenseDeclared:	Apache-2.0

Development

Build

You need Apache Maven to build the project:

mvn clean install

Update tools data formats

To update SPDX tools, the following is a very brief checklist:

Update the SpdxRdfConstants with any new or changed RDF properties and classes
Update the Java code representing the RDF model.
Update the properties files in the org.spdx.tag package for any new tag values
Update the org.spdx.tag.CommonCode.java for any new or changed tag values. This will implement both the rdfToTag and the SPDXViewer applications.
Update the org.spdx.tag.BuildDocument to implement changes for the TagToRdf application
Update the HTML template (resources/htmlTemplate/SpdxHTMLTemplate.html) and contexts in org.spdx.html to implement changes for the SpdxToHtml application
Update the related sheets and RdfToSpreadsheet.java file in the package org.spdx.spreadsheet
Update the sheets and SpdxComparer/SpdxFileComparer in the org.spdx.compare package

Upgrading to SPDX 2.0

To the users of the tools as a binary, there should not be any need to upgrade. The tools should be backwards compatible with SPDX 1.0, 1.1, and 1.2.

If, however, you are using this Java code as a library for your own tools read on...

There are a number of changes to the design of the SPDX Parser both due to the extensive changes to the SPEC (e.g. support for multiple SPDX Packages within a document and support for relationships with external SPDX documents) and due to some much needed refactoring.

The starting point remains SPDXDocumentFactory. To ease the migration, the old 1.2 code and model are still available, and simply changing your code to call SPDXDocumentFactory.createLegacySpdxDocument(...) will probably work. You'll notice, however, that almost everything your application is using is deprecated. These will be removed once SPDX 2.0 has been released and people have had a chance to migrate (likely around Jan 1 2016).

To move over to the new model, simply start with SPDXDocumentFactory and call the createSpdxDocument(...) method to create the new SpdxDocument model code.
The object returned will be similar to the 1.2 version for SPDXDocument, but with a few key differences. All new model objects are in the package org.spdx.rdfparser.model. The SPDX prefix is either removed or replaced with a more consistent Spdx.

Accessing the model objects is similar to 1.2, simply call the get/set methods. The method names have all been changed to be consistent with the specification property names. As a convenience, many of the old getter method names are still there but deprecated.

The structure has changed with the SpdxPackage being a distinct class from SpdxDocument. There is also a new class org.spdx.rdfparser.SpdxDocumentContainer which separates out the container functionality from the SpdxDocument leaving the SpdxDocument to represent the SpdxDocument properties. There are several new classes which are consistent with the SPDX 2.0 Model. See the JavaDocs and the SPDX 2.0 specification for a description of those classes and properties.

There is one significant class not found in the SPDX 2.0 model - ExternalSpdxElement. This class represents elements not found within the SPDX Document. The only valid property for this element is the ID (all other properties including the type are only known in the external document containing the element). There is a more structured class hierarchy, mostly mirroring the SPDX 2.0 model. As a user of the library, you likely do not need to understand these internals - but if you are interested, start at RdfModelObject and read the JavaDocs.

If you have any problems, and especially if you have any solutions, email the tech working group for SPDX at [email protected].

tools's Issues

No way to obtain packages not related to document

There is no way to list all the packages in the document. So in a scenario where Document describes Package 1 and Package 1 is related to Package 2, there is no way to get to Package 2 without a depth-first search through all the relationships.

This can be counterproductive when trying to manipulate the document, given that with the addition of filesAnalyzed in 2.1, a document will now contain many packages.
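
The depth-first search mentioned above could look roughly like the following. This is only a sketch over a plain map of element IDs to related element IDs. The class, the method, and the map representation are hypothetical and do not use the library's model classes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; relationships are modeled as ID -> related IDs.
class RelationshipWalkSketch {

    // Returns every element ID reachable from startId, including startId itself.
    static Set<String> reachable(String startId, Map<String, List<String>> relatedTo) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(startId);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (!visited.add(current)) {
                continue; // already seen; guards against relationship cycles
            }
            for (String next : relatedTo.getOrDefault(current, Collections.emptyList())) {
                stack.push(next);
            }
        }
        return visited;
    }
}
```

A direct "list all packages" accessor would make this walk unnecessary, which is the point of the request.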

Fix copilot vulnerabilities

There are components with known CVE vulnerabilities. We should fix them.

Utilize the OSI API's to automatically populate the isOsiApproved flag in the listed license

https://api.opensource.org/licenses/ can access the SPDX license ID and OSI status. This can be used to do one of the following:

Fill in the OSI approved text on spdx.org/licenses based on JavaScript and real time access to the OSI API and deprecate the isOsiApproved attribute in the license list XML
Set the value for osiApproved in the listed licenses based on the OSI API information at the time the license list is generated and deprecate the isOsiApproved attribute in the license list XML
Continue to use the isOsiApproved attribute in the license list XML, but generate a warning if the OSI API does not agree with the isOsiApproved XML attribute value.

Match expressions for license templates fail when using white space

When there is a white space used in a pattern for the <<var expression in a license template, it will sometimes not match.

This is due to the license text being tokenized as part of the match.

It fails specifically if the space is at the very end of text to be matched. The space will be trimmed off of the comparison text causing the failure.

A workaround is to use the optional keyword (e.g. " |" -> " ?|").

Add seeAlso URL's to license list JSON file

I would like to add the seeAlso URLs for the listed licenses to the generated JSON file located at http://spdx.org/licenses/liceneses.json. This will increase the file size by approximately 50% (from about 42K to 66K).

If there is any concern about compatibility or impact to existing implementations, please comment on this issue. I plan to make the change around August 5 if no issues are raised.

Below is a snippet of the JSON file for a license with seeAlso:
{"licenseId":"AFL-1.1","isOsiApproved":true,"name":"Academic Free License v1.1","referenceNumber":"3","seeAlso":["http://opensource.linux-mirror.org/licenses/afl-1.1.txt"],"reference":"./AFL-1.1.html"}

Make license texts easily accessible

I currently scrape them from the git repo and sanitize them. Would be easier if they were easily accessible from the official source.

Show regexp in HTML output for replaceable text

formatReplaceabledHTML currently sets class="replacable-license-text". I think we should also set title="match-regexp" or something so users can mouse over (if they have a mouse) to see the replacement regexp. Raw regexps aren't the greatest UX, but that's what we've been using so far, and I can't think of anything better.

Spun off from here.

NPE when reading file

I'm attempting to use the spdxviewer and the verify tool for the first time. Both result in a null pointer exception being thrown.

$ java -jar spdx-tools-2.1.4-jar-with-dependencies.jar SPDXViewer myfile.spdx 
log4j:WARN No appenders could be found for logger (Jena).
log4j:WARN Please initialize the log4j system properly.
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.spdx.rdfparser.SPDXDocumentFactory.createSpdxDocument(SPDXDocumentFactory.java:104)
	at org.spdx.tools.SpdxViewer.main(SpdxViewer.java:61)
	at org.spdx.tools.Main.main(Main.java:27)
Caused by: java.lang.NullPointerException
	at org.apache.jena.tdb.sys.EnvTDB.processGlobalSystemProperties(EnvTDB.java:33)
	at org.apache.jena.tdb.TDB.init(TDB.java:248)
	at org.apache.jena.tdb.sys.InitTDB.start(InitTDB.java:29)
	at org.apache.jena.system.JenaSystem.lambda$init$2(JenaSystem.java:119)
	at java.util.ArrayList.forEach(ArrayList.java:1249)
	at org.apache.jena.system.JenaSystem.forEach(JenaSystem.java:194)
	at org.apache.jena.system.JenaSystem.forEach(JenaSystem.java:171)
	at org.apache.jena.system.JenaSystem.init(JenaSystem.java:117)
	at org.apache.jena.util.FileManager.<clinit>(FileManager.java:86)
	... 3 more

$ java -jar spdx-tools-2.1.4-jar-with-dependencies.jar Verify myfile.spdx 
log4j:WARN No appenders could be found for logger (Jena).
log4j:WARN Please initialize the log4j system properly.
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.spdx.rdfparser.SPDXDocumentFactory.createSpdxDocument(SPDXDocumentFactory.java:104)
	at org.spdx.tools.CompareSpdxDocs.openRdfOrTagDoc(CompareSpdxDocs.java:1152)
	at org.spdx.tools.Verify.main(Verify.java:51)
	at org.spdx.tools.Main.main(Main.java:49)
Caused by: java.lang.NullPointerException
	at org.apache.jena.tdb.sys.EnvTDB.processGlobalSystemProperties(EnvTDB.java:33)
	at org.apache.jena.tdb.TDB.init(TDB.java:248)
	at org.apache.jena.tdb.sys.InitTDB.start(InitTDB.java:29)
	at org.apache.jena.system.JenaSystem.lambda$init$2(JenaSystem.java:119)
	at java.util.ArrayList.forEach(ArrayList.java:1249)
	at org.apache.jena.system.JenaSystem.forEach(JenaSystem.java:194)
	at org.apache.jena.system.JenaSystem.forEach(JenaSystem.java:171)
	at org.apache.jena.system.JenaSystem.init(JenaSystem.java:117)
	at org.apache.jena.util.FileManager.<clinit>(FileManager.java:86)
	... 4 more

The file I'm using for input was generated by spdx-maven-plugin v0.2.5 which is the latest release in Maven Central.

Log Exception Stack Traces When The Tool Fails

Currently, when jar commands fail, a single message line is output. This is not helpful for debugging, as it gives no clues as to the source of the issue.

The tool should log the stack trace of such exceptions for easier tracking.
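
As a rough illustration of the request, the sketch below renders a full stack trace, including the "Caused by" chain, into a string that could be handed to whatever logger the tools use. The class and method names are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical illustration; not part of the SPDX tools.
class StackTraceLoggingSketch {

    // Renders the throwable, its stack frames, and any causes as text.
    static String describe(Throwable failure) {
        StringWriter buffer = new StringWriter();
        failure.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            throw new IllegalStateException("Error creating SPDX Analysis");
        } catch (IllegalStateException e) {
            // Print the whole trace instead of only e.getMessage().
            System.err.print(describe(e));
        }
    }
}
```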

Render text color highlights for alt and optional text within standard license headers

Alt and optional text in standard license headers do not display the color highlights as do the optional and alt text for the license template text

XHTML declaring itself text/html

Spun off from here. We're serving the license pages as HTML:

$ curl -sI https://spdx.org/licenses/preview/ISC.html | grep Content-Type
Content-Type: text/html; charset=UTF-8

and doubling down on that with a meta http-equiv in the template (and a few sibling templates under htmlTemplate). This conflicts with the W3C recommendation that XHTML served as text/html not contain XML declarations (which our templates do). I see two potential solutions:

a. Serve our XHTML as application/xhtml+xml. This works for clients that ask for XHTML, but the W3C recommends only serving it to clients who “explicitly indicate they support this media type”, so we'd need HTML versions to serve to everyone else.
b. Drop XHTML and just serve vanilla HTML5.

(a) is a superset of (b), since we'll need a vanilla HTML option to comply with the W3C recommendations for clients who don't explicitly indicate they support XHTML. The question is whether we want to continue to support XHTML in parallel or not. With XML data already available from license-list-XML, I don't see a point to continuing to maintain XHTML output. If that seems like a reasonable position, I can start filing PRs to transition the templates to HTML5.

Normalize quotes for SPDX licenses

When creating text from the license list XML format, normalize all quotes to straight quotes.
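
A minimal sketch of what such normalization could look like follows. Which characters count as quotes is an assumption here (the common curly and low-9 variants), and the class and method names are hypothetical.

```java
// Hypothetical illustration; the set of quote characters is an assumption.
class QuoteNormalizerSketch {

    // Replaces curly and low-9 quotation marks with straight ASCII quotes.
    static String normalizeQuotes(String text) {
        return text
            .replaceAll("[\u201C\u201D\u201E\u201F]", "\"")  // double quote variants
            .replaceAll("[\u2018\u2019\u201A\u201B]", "'");   // single quote variants
    }

    public static void main(String[] args) {
        System.out.println(normalizeQuotes("provided \u201CAS IS\u201D"));
    }
}
```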

TagToRDF on document containing a PackageComment fails with InvalidSpdxTagFileException

Attempting to convert this file from tag to RDF:
https://gist.github.com/ttgurney/54abaae84f5274d6b739

$ java -jar spdx-tools-2.0.1-SNAPSHOT-jar-with-dependencies.jar TagToRDF yocto-spdx.tag yocto-spdx.rdf
Error creating SPDX Analysis: org.spdx.tag.InvalidSpdxTagFileException: Expecting a definition of a file, package, license information, or document property at PackageComment:

Removing the PackageComment line from the document allows it to convert successfully.

UPDATE: Looks like the issue is here:
https://github.com/spdx/tools/blob/master/src/org/spdx/tag/SpdxTagValueConstants.properties#L70
But changing the PackageComment: to Comment: in the above tag file still produced an error on attempting to convert.

license id regex allows `+` chars in middle of license

According to the SPDX specification, a simple license expression can only contain the + character as the last character. This is tested in this project using the Java regex found here, which allows the + character anywhere in the string.

For example, the license ID hello+world would pass the regex mentioned above, however this would be an invalid ID according to the SPDX spec.
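
One possible shape for a stricter pattern is sketched below. It assumes license IDs consist of letters, digits, "." and "-", optionally followed by a single trailing "+". The pattern and the class name are illustrative and are not the project's actual regex.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; the allowed character set is an assumption.
class LicenseIdPatternSketch {

    // Letters, digits, '.' and '-', then at most one '+' at the very end.
    static final Pattern SIMPLE_LICENSE_ID = Pattern.compile("[A-Za-z0-9.\\-]+\\+?");

    static boolean isValid(String id) {
        return SIMPLE_LICENSE_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println("GPL-2.0+ valid: " + isValid("GPL-2.0+"));
        System.out.println("hello+world valid: " + isValid("hello+world"));
    }
}
```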

enable travis-ci

I would really appreciate it if the admin of the spdx GitHub account could set up the travis-ci hook. I added the travis.yml a while ago.

see https://travis-ci.org/bufferoverflow/tools/

TagToRdf fails when no Package specified

Release version 2.1.6 fails when trying to convert from Tag-based to Rdf via TagToRdf on an SPDX document which has no Package definition. The error message is:

Error creating SPDX Analysis: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

If running Verify on an SPDX file without a Package specification, it raises the following concern:

Unable to parse the file: File ../LICENSES.nonworking.spdx is not a recognized RDF/XML or tag/value format: [line: 1, col: 1 ] Content is not allowed in prolog.

I'm attaching the SPDX file which I feel should validate. If I introduce in this file a Package, plus a relation between the SPDXRef-DOCUMENT and SPDXRef-Package, I can get spdx-tools to be happy about it, but my interpretation of the specification is this should not be needed in v2.1.

LICENSE.nonworking.spdx.txt

Replace Uses of Java Util Implementations With Interface References

The implementation of the tools library uses a lot of HashMap, ArrayList, and HashSet instances, instead of referencing these by the interfaces they implement. This is generally considered bad practice, and using these concrete types in API signatures forces clients to follow the same practice in order to use the library.
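
To make the suggestion concrete, here is a small sketch of a method whose signature uses List and Map rather than ArrayList and HashMap. The class and method are hypothetical examples, not code from the library.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of the SPDX tools.
class InterfaceTypedApiSketch {

    // Parameter and return types are interfaces, so callers may pass any
    // List implementation and do not depend on the concrete Map returned.
    static Map<String, Integer> countByLicense(List<String> licenseIds) {
        Map<String, Integer> counts = new HashMap<>();
        for (String id : licenseIds) {
            counts.merge(id, 1, Integer::sum);
        }
        return counts;
    }
}
```

With this signature a caller can pass the result of Arrays.asList(...) or an unmodifiable list without copying it into an ArrayList first.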

Change the default GitHub branch to "develop"

@goneall, I was very grateful for your e-mail of earlier this week pointing out that active development is currently on develop, not master.

For the benefit of anyone who wanders into the code via GitHub, might you like to change the default branch to develop? I've been added to @spdx/contributor, but it doesn't look like I have permissions on the repository needed to do so.

If you'd like, I will PR a GitHub-recognized CONTRIBUTING file making clear that coders should PR from topic branches off develop and give notice in PR comments that they license contributions per Apache-2.0.

Verifying RDF file with invalid URI prints error but returns 0

Running Verify on an RDF file containing an invalid URI, such as this file, prints the below errors but exits with return code 0. As this is not a valid file, it should have a non-zero return code.

 WARN [main] (RDFDefaultErrorHandler.java:36) - file:////tmp/lgpl_spdx.rdf(line 21 column 137): {W107} Bad URI: <file:///srv/fossology/repository/report/SPDX2_lgpl_sic.txt_1456357178.rdf#LicenseRef-GPL-2.1[sic]> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
 WARN [main] (RDFDefaultErrorHandler.java:36) - file:////tmp/lgpl_spdx.rdf(line 76 column 149): {W107} Bad URI: <file:///srv/fossology/repository/report/SPDX2_lgpl_sic.txt_1456357178.rdf#LicenseRef-GPL-2.1[sic]> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
This SPDX Document is not valid due to:
    Invalid license id 'LicenseRef-GPL-2.1[sic]'.  Must start with 'LicenseRef-' and made up of the characters from the set 'a'-'z', 'A'-'Z', '0'-'9', '+', '_', '.', and '-'.
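
The expected behavior can be sketched as follows: collect verification errors, then derive the process exit code from whether any were found. The character set in the pattern is taken from the error message above. The class and method names are hypothetical and this is not the Verify tool's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration; not the Verify tool's implementation.
class VerifyExitCodeSketch {

    // 'LicenseRef-' followed by characters from the set named in the error message.
    static final Pattern LICENSE_REF = Pattern.compile("LicenseRef-[a-zA-Z0-9+_.\\-]+");

    // Returns a list of problems; an empty list means the id is acceptable.
    static List<String> verifyLicenseRef(String id) {
        List<String> errors = new ArrayList<>();
        if (!LICENSE_REF.matcher(id).matches()) {
            errors.add("Invalid license id '" + id + "'.");
        }
        return errors;
    }

    // Zero for a valid document, non-zero when any error was reported.
    static int exitCodeFor(List<String> errors) {
        return errors.isEmpty() ? 0 : 1;
    }
}
```

A main method would print the errors and then pass exitCodeFor(errors) to System.exit, so scripts calling the tool can detect the failure.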

Question: How to parse tag files

I'd like to parse tag-based SPDX files and attempted to use SPDXDocumentFactory.createSpdxDocument thinking it would handle RDF and SPDX files. It appears this only supports RDF.

I'm unable to find anything that parses a tag file and returns a SpdxDocument. Is the solution to use TagToRDF first, then SPDXDocumentFactory? Any insight would be extremely helpful.

Deprecated license identifiers in JSON file

I notice that the identifiers listed under "Deprecated Licenses" in the license list are not included in the JSON list at https://spdx.org/licenses/licenses.json. Is this intentional?

Most troubling are eCos-2.0 and WxWindows, which are OSI-approved.

Support license templates for license exceptions

Currently, templates are not supported for license exceptions even though the license matching guidelines state that license exceptions can be used.

The solution will impact the LicenseRDFAGenerator as well as the model for the LicenseException and any place that comparisons are done for LicenseExceptions.

Update to Jena 3.x

As discussed in calls, it makes sense to use the latest versions of Jena and the features made available therein (e.g. disk-based triple stores, etc)

Add a license expression validator to LicenseExpressionParser and the command line tools

Requested by David Wheeler for the web validation project as a feature for the online tool.

To support this, we can add a method "validate" to the class LicenseExpressionParser which would return a string with any validation error message or return null if there are no validation errors.

We could add a new command to the command line tool "VerifyLicenseExpression" which would take a string as a parameter and return back any validation errors or "VALID" if correct.

Note that you can accomplish this today by calling the parse method in the LicenseExpressionParser. If there is an invalid expression, an exception will be thrown with the exception message describing the validation error.
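
The proposed "validate" behavior (return null when valid, otherwise the error message) can be sketched as a thin wrapper around any parser that throws on invalid input. The ThrowingParser interface below is a hypothetical stand-in so the example is self-contained. It is not the LicenseExpressionParser API.

```java
// Hypothetical illustration; ThrowingParser stands in for a real parser.
class ExpressionValidatorSketch {

    // Stand-in for a parse method that throws when the expression is invalid.
    interface ThrowingParser {
        void parse(String expression) throws Exception;
    }

    // Returns null when the expression parses, otherwise a description of the error.
    static String validate(String expression, ThrowingParser parser) {
        try {
            parser.parse(expression);
            return null;
        } catch (Exception e) {
            return e.getMessage() != null ? e.getMessage() : e.toString();
        }
    }
}
```

A command line wrapper could then print the returned message, or "VALID" when the result is null.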

Dependency on Java 1.7 introduced

Commit cb32edb "remove potential NPE occurrences" introduced a dependency on Java version 1.7. This should be removed until we decide to drop support for version 1.6 (currently in discussion on the SPDX tech list).

Test license matching against more than one instance

In this comment, @goneall pointed at LicenseRDFAGenerator.java as a way to test our license-matching against known-sufficient instances of each license's text (maybe here? My Java is weak). However, a single test case is insufficient to exercise conditional inclusions like the new XML <alt …> and <optional> sections. I'd rather see a corpus of known license instances (e.g. BSD-2-Clause/good/basic.txt, BSD-2-Clause/good/altered-copyright-holder.txt, BSD-2-Clause/bad/additional-condition.txt, …) to ensure we successfully match license instances that those short identifiers are intended to correspond to. That way we know that the <alt …>, <optional>, etc. markup and the spec's matching guidelines are as broad (or narrow) as the legal team expects them to be.

Upgrade Java to 1.7

Oracle discontinued support for Java 1.6 some time ago, and will soon end support for Java 1.7. Asking all consumers to jump to Java 1.8 is (in my opinion) a bit much to ask at this time, but I'd suggest that we should try to keep it no more than one "unsupported" version of Java behind.

I'm not sure of the process required to make such a switch, so I am expecting some discussion on this issue will be required before any pull request is accepted.

Conditional <var> or <div> in formatReplaceabledHTML

formatReplaceabledHTML currently uses a wrapping <span>, but that doesn't work if the content is not phrasing content. For example, if the <alt> wraps a <p> or <ul> (which are flow content, and which is currently allowed), then we should be using a <div> or similar instead of a <span>. I expect we'll need to parse the contents to make this switch, and that's beyond my Java ability, so I'm filing this as an issue instead of a PR ;).

Line Endings are Not Consistent, Despite Git Settings

Hi,

When trying to create a new clone of the repository ('develop' branch), I see many files already marked as changed. Some investigation leads me to believe that these files somehow got committed to the repository without the 'text=auto' gitattribute setting being applied.

This occurs on Windows and Linux (Ubuntu) machines, which is what leads me to believe that they sneaked in without the setting being applied, as opposed to the setting being wrong.

The git status output I see is:

git status
On branch develop
Your branch is up-to-date with 'origin/develop'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   README.md
    modified:   src/org/spdx/spdxspreadsheet/SpreadsheetException.java
    modified:   src/org/spdx/tag/BuildDocument.java
    modified:   src/org/spdx/tag/BuildLegacyDocument.java
    modified:   src/org/spdx/tag/HandBuiltParser.java
    modified:   src/org/spdx/tag/InvalidSpdxTagFileException.java
    modified:   src/org/spdx/tag/NoCommentInputStream.java
    modified:   src/org/spdx/tag/SpdxTagValueConstants.properties
    modified:   src/org/spdx/tag/SpdxViewerConstants.properties
    modified:   src/org/spdx/tag/TagValueBehavior.java
    modified:   src/org/spdx/tag/TagValueLexer.smap
    modified:   src/org/spdx/tag/TagValueParser.smap
    modified:   src/org/spdx/tag/TagValueParserTokenTypes.java
    modified:   src/org/spdx/tag/TagValueParserTokenTypes.txt
    modified:   src/org/spdx/tag/data.g
    modified:   src/org/spdx/tools/CompareMultpleSpdxDocs.java
    modified:   src/org/spdx/tools/CompareSpdxDocs.java
    modified:   src/org/spdx/tools/GenerateVerificationCode.java
    modified:   src/org/spdx/tools/Main.java
    modified:   src/org/spdx/tools/MatchingStandardLicenses.java
    modified:   src/org/spdx/tools/RdfToHtml.java
    modified:   src/org/spdx/tools/RdfToSpreadsheet.java
    modified:   src/org/spdx/tools/SpreadsheetToRDF.java
    modified:   src/org/spdx/tools/SpreadsheetToTag.java
    modified:   src/org/spdx/tools/TagToSpreadsheet.java

Link to third-party pages from license list

This is not critical, and it would require API extensions, but it would make it easier for users to audit our assertions and/or discover additional details if the “FSF Free/Libre?” and “OSI Approved?” entries in resources/htmlTemplate/TocHTMLTemplate.html were also links to the third-party page. For example, if the MIT entries linked to https://www.gnu.org/licenses/license-list.html#Expat and https://opensource.org/licenses/MIT:

| Full name | Identifier | FSF Free/Libre? | OSI Approved? | Text |
| --- | --- | --- | --- | --- |
| MIT License | `MIT` | Y | Y | License Text |

Definitely not something to block a release on or anything, but if someone has the time to work this up, I think it would be a useful improvement.

License matching optional tag only works on whole words, not individual characters

The current license matching algorithm will not recognize an individual optional character if it is in the middle of a word. For example, attorney<<beginOptional>>'<<endOptional>>s will fail to match the token attorney's.

This is due to the matcher tokenizing the text and performing the optional matches at the token level rather than at the character level.

As a work-around, a var tag can be used with the alternative words used in the match. For example <<var name="optionalApostrophe";match="attorneys|attorney's";original="attorney's">> will treat the apostrophe as optional.
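
The regular expression in the match attribute above can be tried in isolation. The sketch below only checks that expression against whole whitespace-separated tokens. It does not reproduce the matcher's tokenizer, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; only the regex comes from the var tag above.
class OptionalApostropheSketch {

    // Alternatives copied from the match attribute of the var tag.
    static final Pattern ATTORNEY_VARIANTS = Pattern.compile("attorneys|attorney's");

    // True when a whole token is one of the accepted spellings.
    static boolean matchesToken(String token) {
        return ATTORNEY_VARIANTS.matcher(token).matches();
    }

    public static void main(String[] args) {
        for (String token : "reasonable attorney's fees".split("\\s+")) {
            System.out.println(token + " -> " + matchesToken(token));
        }
    }
}
```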

Create a tool to indicate where two licenses do not match

The tool would use the matching guidelines and compare license text from 2 licenses. It would return specifics on which parts of the license do not match (e.g. line, character)

Add SPDX 2.1 example files

It would be nice to also have valid SPDX files for the 2.1 specification under ./TestFiles and ./Examples.

Update JGit when new version is available

One of the JGit dependencies, JSCH version 0.1.53, shows a security vulnerability. This is not in the current execution path of the SPDX tools; however, it should be updated to JSCH version 0.1.54. At the time of this issue report, JGit has updated the version in its development branch, but has not yet provided a released version. Once JGit releases a new version, we should update JGit in order to update JSCH.

Missing Documentation Fails Maven Build Against Java 8 Environment

There are several partial JavaDoc statements present in the library which make it difficult to use for someone trying to get familiar with it. In addition, the unfinished documentation causes the Maven build to fail unless Maven is forced to use Java 7 or earlier for its own operations (there is a strict setting requiring things like "param" annotations to have text).

Note: There is a workaround for the Java 8 failure, but it's really just a side effect; the documentation itself is the bigger concern.

I will work on a pull request improving this. I will combine these changes with the changes for #6 in small batches to avoid rewriting over a hundred files for line-ending reasons.

Switch to Log4J 2.

Log4J 1.x has reached its end-of-life. The change to Log4J 2 should involve only changing package names and dependency information. It may slightly improve performance as well.

Consistent header/value for license-list “Text” column

And as a very minor nit for resources/htmlTemplate/TocHTMLTemplate.html, it's a bit strange to have a “Text” header on a column where all the values are “License Text”. If “Text” is sufficient, why not use that everywhere? If “Text” is not sufficient, why not use “License Text” in the header? I'm also fine dropping this column entirely, because the license name already links to the per-license page, and few of our licenses have so many notes and such that the text is hard to find there ;).

Generate var text for copyright tags while generating the template files in the licenseRdfaGenerator

Currently, <copyright>...</copyright> text is treated as normal text for matching purposes.

Per the matching guidelines, this text is optional and variable.

This could be just a match=".*"; however, that may match additional legal text and change the meaning of the license.

A more sophisticated design may be needed.

Verification fails when filesAnalyzed = false and verification code is null

In fact, per the spec, the package verification code MUST be absent when filesAnalyzed =false.

Package Verification Code generated when filesAnalyzed = false

Per the 2.1 spec, the cardinality of PackageVerificationCode is:

Mandatory, one if filesAnalyzed is true or omitted,
Zero (must be omitted) if filesAnalyzed is false.

I added a verification warning for this in #82, which causes a unit test to fail, because PackageVerificationCodes get included even for empty packages with filesAnalyzed=false. I'm a bit nervous that this may go too deep for me to tinker with on my own.
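
The cardinality rule quoted above can be expressed as a small check. This is only a sketch of the rule itself, with hypothetical class and method names and message texts. It does not show how the library's SpdxPackage verification is implemented.

```java
import java.util.Optional;

// Hypothetical illustration of the cardinality rule quoted above.
class VerificationCodeCardinalitySketch {

    // Returns a problem description, or an empty Optional when the rule is satisfied.
    static Optional<String> check(boolean filesAnalyzed, String verificationCode) {
        boolean present = verificationCode != null && !verificationCode.isEmpty();
        if (filesAnalyzed && !present) {
            return Optional.of("Package verification code is required when filesAnalyzed is true or omitted");
        }
        if (!filesAnalyzed && present) {
            return Optional.of("Package verification code must be omitted when filesAnalyzed is false");
        }
        return Optional.empty();
    }
}
```

Callers would pass true for filesAnalyzed when the property is omitted, since the spec treats omission the same as true.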

Pretty print JSON data files

https://github.com/spdx/license-list-data/blob/master/json/licenses.json (and other JSON files) are currently printed as compact JSON. I think a pretty printed JSON (with newlines and perhaps 2-space indentation) would be better for the following reasons:

much easier to visually parse. This makes it easier to work with the JSON file because it becomes more obvious what data it contains.
better for git diffs and seeing the changes to the JSON over time

The downside is a slightly larger file size, although I think this is trivial since the file is not large.

Missing SPDX file :-)

Just as the cobbler forgets to repair his own shoes, most SPDX-sponsored projects seem to be missing an SPDX description file for the project.

Need an SPDX file (files?) unless you think the SPDXParser.spdx file covers this though it isn’t intuitive. Would think an SPDX folder might work or standard naming (project.spdx, package.spdx) something to describe scope of applicability?

Var text following optional text fails matching

If a license template has the following form:

some text that doesn't affect the outcome <<beginOptional>>some optional text <<beginOptional>>make this complex<<endOptional>><<endOptional>> <<var name="something"; original="orig";match=".+">> more text

it will fail since the match template parser will look for the more text without considering possible variable text.

This issue does not occur when no optional or var elements are nested inside the optional text.

rename repo to spdx-tools

This package is well known as spdx-tools.

I suggest to rename the repo to spdx-tools as well.

Eclipse Flags Compilation Errors In Existing Source

Eclipse (Luna, on Windows) is flagging certain lines as invalid characters. Could anyone shed some light onto what this is supposed to be and how I get my environment to not flag it? Am I missing a character set or something?

Line is in LicenseRDFGenerator:

INVALID_FILENAME_CHARS.add('Â³'); (Line 67)
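One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is an encoding mismatch: if the source file is saved as UTF-8 and the literal is really the single character '³' (U+00B3), an editor that decodes the file with a single-byte Western charset sees two characters, which is not a valid char literal. Eclipse on Windows has historically defaulted to the platform encoding (typically Cp1252) unless the workspace or project text file encoding is set to UTF-8. The small sketch below (hypothetical class name) reproduces the effect.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of reading UTF-8 bytes with the wrong charset. */
class EncodingMismatchDemo {

    /** Encodes the text as UTF-8, then decodes those bytes with another charset. */
    static String misread(String text, Charset assumedCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, assumedCharset);
    }
}
```

`EncodingMismatchDemo.misread("\u00B3", StandardCharsets.ISO_8859_1)` returns a two-character string, U+00C2 followed by U+00B3, which displays as `Â³`. Cp1252 maps these two bytes to the same characters.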

Add templates to the standard license headers

Support templates and matching functionality for standard license headers for SPDX licenses. This would involve adding get/set template functions, get HTML functions, and license helper matching functions.

SPDXCreatorInformation.equals does not return false if created dates are different but non-null

The SPDXCreatorInformation.equals method is missing the return-false statement for the case where both created dates are non-null but differ. The if statement that checks for this is present, but its body is empty.
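The shape of the fix can be sketched on a stand-in class. The class below is hypothetical and much smaller than the real SPDXCreatorInformation; its field names are assumptions made for illustration. It only shows where the missing `return false` belongs in a null-safe comparison of the created date.

```java
import java.util.Objects;

/** Hypothetical stand-in for illustration; not the real SPDXCreatorInformation. */
class CreatorInfoSketch {
    private final String created;  // creation date, may be null
    private final String comment;  // may be null

    CreatorInfoSketch(String created, String comment) {
        this.created = created;
        this.comment = comment;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CreatorInfoSketch)) {
            return false;
        }
        CreatorInfoSketch other = (CreatorInfoSketch) o;
        if (created == null) {
            if (other.created != null) {
                return false;
            }
        } else if (!created.equals(other.created)) {
            // This is the branch the issue describes as present but empty.
            return false;
        }
        return Objects.equals(comment, other.comment);
    }

    @Override
    public int hashCode() {
        // Keep hashCode consistent with the fields compared in equals.
        return Objects.hash(created, comment);
    }
}
```

A unit test that compares two instances differing only in a non-null created date would catch a regression of this kind.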

Conflicting class names when building the dependency jar files

The following warnings are produced by the shade plugin:
[WARNING] java-rdfa-htmlparser-0.4.2-RC2.jar, htmlparser-1.4.jar define 70 overlapping classes:
[WARNING] - nu.validator.htmlparser.common.XmlViolationPolicy
[WARNING] - nu.validator.htmlparser.xom.XOMTreeBuilder
[WARNING] - nu.validator.htmlparser.io.Driver$1
[WARNING] - nu.validator.htmlparser.impl.TreeBuilder
[WARNING] - nu.validator.htmlparser.annotation.Prefix
[WARNING] - nu.validator.htmlparser.sax.XmlSerializer$PrefixMapping
[WARNING] - nu.validator.htmlparser.common.ByteReadable
[WARNING] - nu.validator.htmlparser.common.EncodingDeclarationHandler
[WARNING] - nu.validator.htmlparser.io.HtmlInputStreamReader
[WARNING] - nu.validator.htmlparser.io.BomSniffer
[WARNING] - 60 more...
[WARNING] jcl-over-slf4j-1.7.21.jar, commons-logging-1.1.3.jar define 6 overlapping classes:
[WARNING] - org.apache.commons.logging.impl.SimpleLog$1
[WARNING] - org.apache.commons.logging.Log
[WARNING] - org.apache.commons.logging.impl.SimpleLog
[WARNING] - org.apache.commons.logging.LogConfigurationException
[WARNING] - org.apache.commons.logging.impl.NoOpLog
[WARNING] - org.apache.commons.logging.LogFactory
[WARNING] xml-apis-1.4.01.jar, stax-api-1.0.1.jar define 37 overlapping classes:
[WARNING] - javax.xml.stream.XMLEventReader
[WARNING] - javax.xml.stream.StreamFilter
[WARNING] - javax.xml.namespace.NamespaceContext
[WARNING] - javax.xml.stream.util.StreamReaderDelegate
[WARNING] - javax.xml.stream.events.StartDocument
[WARNING] - javax.xml.stream.EventFilter
[WARNING] - javax.xml.stream.XMLEventWriter
[WARNING] - javax.xml.stream.XMLStreamConstants
[WARNING] - javax.xml.stream.events.EntityDeclaration
[WARNING] - javax.xml.stream.events.ProcessingInstruction
[WARNING] - 27 more...
[WARNING] maven-shade-plugin has detected that some class files are
[WARNING] present in two or more JARs. When this happens, only one
[WARNING] single version of the class is copied to the uber jar.
[WARNING] Usually this is not harmful and you can skip these warnings,
[WARNING] otherwise try to manually exclude artifacts based on
[WARNING] mvn dependency:tree -Ddetail=true and the above output.
[WARNING] See http://maven.apache.org/plugins/maven-shade-plugin/

getLicenseConcluded() seems to be broken

test output:

Failed tests:   testAddAndGet(spdxspreadsheet.TestPerFileSheet): expected:<(((id2 OR id1 OR CECILL-B) AND id3) AND ((id1 OR ((id2 OR id1 OR CECILL-B) AND id3) OR AFL-3.0) OR EUPL-1.0 OR (id1 AND CECILL-B AND AFL-3.0)) AND id3)> but was:<((id2 OR id1 OR CECILL-B) AND (id1 OR ((id2 OR id1 OR CECILL-B) AND id3) OR EUPL-1.0 OR AFL-3.0 OR (id1 AND CECILL-B AND AFL-3.0)) AND id3)>

I had to do this to pass the test suite...

diff --git a/Test/spdxspreadsheet/TestPerFileSheet.java b/Test/spdxspreadsheet/TestPerFileSheet.java
index 2df37fe..f8150f9 100644
--- a/Test/spdxspreadsheet/TestPerFileSheet.java
+++ b/Test/spdxspreadsheet/TestPerFileSheet.java
@@ -254,8 +254,8 @@ public class TestPerFileSheet {

        @SuppressWarnings("deprecation")
        private void compareSpdxFile(SpdxFile testFile, SpdxFile result) throws InvalidSPDXAnalysisException {
-               assertEquals(testFile.getLicenseConcluded(), result.getLicenseConcluded());
-               compareLicenseDeclarations(testFile.getLicenseInfoFromFiles(), result.getLicenseInfoFromFiles());
+               //assertEquals(testFile.getLicenseConcluded(), result.getLicenseConcluded());
+               //compareLicenseDeclarations(testFile.getLicenseInfoFromFiles(), result.getLicenseInfoFromFiles());
                compareProjects(testFile.getArtifactOf(), result.getArtifactOf());
                assertEquals(testFile.getCopyrightText(), result.getCopyrightText());
                assertEquals(testFile.getLicenseComments(), result.getLicenseComments());

see https://travis-ci.org/bufferoverflow/tools/builds/93048139

LicenseRDFaGenerator only includes first note in license XML document

If multiple notes are included in the License XML document, only the first note is included in the output website html.
