Comments (24)

wking avatar wking commented on August 23, 2024

In RDF, we will always have case-sensitive URIs…

This is a good reason to require case-sensitive IDs. With that, a tool can easily construct the URI for a given license ID. Without a case requirement, tools would need to build in a complete (for a given release?) copy of the license list if they wanted to be able to canonicalize the case to get the license URI. To mitigate that problem if we decide to allow case-insensitive IDs, I think we would want to add resources for downcased identifiers that redirect to the canonical URI.
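To make the trade-off concrete, here is a minimal Python sketch. The base URI is the one used in the RDF snippet quoted in this thread; the three-entry `KNOWN_IDS` set and both function names are hypothetical stand-ins, not part of any SPDX tool.

```python
BASE_URI = "http://spdx.org/licenses/"  # as in the RDF snippet quoted in this thread

# Hypothetical three-entry stand-in for a complete copy of the license list.
KNOWN_IDS = {"MIT", "GPL-2.0", "RHeCos"}


def uri_case_sensitive(license_id):
    """With case-sensitive IDs the URI is plain concatenation; no list is needed."""
    return BASE_URI + license_id


def uri_case_insensitive(license_id):
    """With case-insensitive IDs the tool must first recover the canonical case."""
    by_lower = {known.lower(): known for known in KNOWN_IDS}
    canonical = by_lower.get(license_id.lower())
    if canonical is None:
        raise KeyError(f"unknown license identifier: {license_id!r}")
    return BASE_URI + canonical
```

The second function only works if the tool ships (and keeps current) the full list, which is the cost described above.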

from spdx-spec.

swinslow avatar swinslow commented on August 23, 2024

I'll submit a PR to add the sentence clarifying that it needs to be case-sensitive, per @goneall's comment at #63 (comment)

wking avatar wking commented on August 23, 2024

pombredanne avatar pombredanne commented on August 23, 2024

In practice case does not matter, since every ID is unique ignoring case. Mandating a certain case is a hindrance to adoption IMHO and serves absolutely no good purpose.

wking avatar wking commented on August 23, 2024

Mandating a certain case is a hindrance to adoption…

If we explicitly allow tools to accept case-insensitive matches, then tool maintainers who feel that case is not important can ignore it.

…and serves absolutely no good purpose.

I think preserving case is useful for readability, and we should at least SHOULD authors to use the right case even if we don't MUST them. For example, I think RHeCos is a better hint for “Red Hat eCos” than rhecos or RHECOS would be.

If we MUST case preservation for authors, then tools can choose not to support case-insensitive matching at all. That simplifies the parsing logic. Downcasing the whole license expression before parsing it would be a straightforward way to do case-insensitive parsing, but then it's a bit tedious to get back to the original case if you want to warn about an unrecognized identifier.
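One way around the "get back to the original case" problem is to lowercase per token for comparison only, and keep the author's spelling for the warning. A rough Python sketch, assuming whitespace-separated tokens and ignoring parentheses and the `+` suffix; `KNOWN_IDS` is a hypothetical subset of the license list and `unknown_identifiers` is an illustrative name:

```python
OPERATORS = {"and", "or", "with"}

# Hypothetical subset of the license list, lowercased once for lookups.
KNOWN_IDS = {"MIT", "GPL-2.0", "RHeCos"}
KNOWN_LOWER = {identifier.lower() for identifier in KNOWN_IDS}


def unknown_identifiers(expression):
    """Return unrecognized identifiers spelled exactly as the author wrote them."""
    unknown = []
    for token in expression.split():
        key = token.lower()  # used for comparison only
        if key in OPERATORS or key in KNOWN_LOWER:
            continue
        unknown.append(token)  # original casing preserved for the warning
    return unknown
```

For example, `unknown_identifiers("mit And FooBar-1.0 or RHECOS")` reports `FooBar-1.0` with its original casing, while `mit` and `RHECOS` are accepted.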

pombredanne avatar pombredanne commented on August 23, 2024

I am not sure that the mostly all uppercase approach we have today helps with readability.
Instead we should define a canonical form and allow case-insensitive IDs and operators in license expressions.

@jeffmcaffer ping

wking avatar wking commented on August 23, 2024

jeffmcaffer avatar jeffmcaffer commented on August 23, 2024

as a user and relatively new to expressions I can definitely say that allowing people to do whatever case they want is a win. Consider

MIT AND GPL-2.0 AND GPL-1.0+ AND EPL-1.0 AND ISC

MIT and GPL-2.0 and GPL-1.0+ and EPL-1.0 and ISC

The latter is more approachable. More approachable == simpler. Simpler == more interest in using.

my preference is to allow, but not require, tools to support noncanonical casing

Seems like that would be a problem as you might use non-canonical casing and then my tool may not understand it. So the interchange format is not all that interchangeable.

I likely just missed it but what is the argument for dictating casing?

kestewart avatar kestewart commented on August 23, 2024

The case sensitivity concerns were more with the actual short form identifiers. If there is quorum to permit the license expression operators to be case insensitive, that has less impact. Note this would probably mean that AND, and, And, aNd, etc. variants would all be recognized (the affected keywords are AND, OR, and WITH).

goneall avatar goneall commented on August 23, 2024

As one example of a tool implementation of license expression, it would be quite easy to make the entire expression case insensitive, quite easy to make the entire expression case sensitive and just a little bit of work to have different case sensitivity for the operators and license ID's.

The most important thing is to specify and document case sensitivity so that all tools behave consistently. You wouldn't want one tool to accept a license expression and a different tool to reject it due to different decisions on how to treat case sensitivity. Allowing tools to treat the case differently would make the interchange less reliable and defeat one of the goals of SPDX IMHO.

I would support whatever is easiest for the user.

wking avatar wking commented on August 23, 2024

Consider

MIT AND GPL-2.0 AND GPL-1.0+ AND EPL-1.0 AND ISC
MIT and GPL-2.0 and GPL-1.0+ and EPL-1.0 and ISC

The latter is more approachable.

“more approachable” probably depends on your past experience. For example, and is a Python operator (and AND is not supported), so folks with a Python background may prefer and. SQL operators are case insensitive, but the convention is to use uppercase operators (as seen here). No single case convention will feel native to all authors. Allowing authors to pick their own case convention makes things friendlier (but not necessarily easier) for those authors. It makes life slightly less friendly (but not necessarily harder) for other human readers. And it makes life slightly harder for tool authors who want to generate warnings that match the original casing.

Seems like that would be a problem as you might use non-canonical casing and then my tool may not understand it.

If the spec says that canonical casing is required (my preference), then yeah, that's a risk you'd take by using non-canonical casing. I don't think that's a compatibility issue, because you'll have the same issue if you break any of the other SPDX rules.

jeffmcaffer avatar jeffmcaffer commented on August 23, 2024

"more approachable", yes, it is subjective. However, I was speaking as a human reader, not a programmer or technical person. Simply put, many people find reading a long string of all uppercase text hard. If these expressions are to show up in a human context (e.g., SPDX identifier tags in readmes) then the more human-readable, the better.

For tool compatibility, the spec should not IMHO bail on taking a position. Saying that "tools SHOULD tolerate different casing" (for example) is not really helpful as users still don't know with confidence what they can do. So anyone who cares about using different tools (which presumably is the point of an interchange format standard) will then read that as they MUST use the canonical casing if they want interchange.

zvr avatar zvr commented on August 23, 2024

Consider

MIT AND GPL-2.0 AND GPL-1.0+ AND EPL-1.0 AND ISC
MIT and GPL-2.0 and GPL-1.0+ and EPL-1.0 and ISC

@jeffmcaffer, we are also looking at

Mit and gpl-2.0 And Gpl-1.0+ aNd ePl-1.0 aND isc

and any other combination. At least we are still using ASCII for now :)
I personally prefer having a single way of specifying these things.

I assume all this discussion is only about tag-value representation? In RDF, we will always have case-sensitive URIs:

    <spdx:licenseConcluded>
        <spdx:member rdf:resource="http://spdx.org/licenses/GPL-3.0"/>
    </spdx:licenseConcluded>

wking avatar wking commented on August 23, 2024

pombredanne avatar pombredanne commented on August 23, 2024

I always receive complaints about this case sensitivity. From Fedora people, from others. Case does not matter at all since every ID is unique ignoring case and the keyword case does not matter either.

We should have a canonical representation of an expression (which can be specific case-wise) but mandating using a certain case for something that does not need it is just a barrier to use and adoption IMHO.

Conan-Kudo avatar Conan-Kudo commented on August 23, 2024

The current style for case-sensitivity is a major annoyance for identifying tags and expressions.

From my point of view, I want expressions to be clearly distinct from license identifiers (many of which are initialisms, so are in all capitals, or at least begin with a capital letter). Thus, my preference is that expression terms (such as and, or, with, or without, etc.) should be lowercase while license tags are either title case (if they are words) or all caps (if they are initialisms/acronyms).

wking avatar wking commented on August 23, 2024

goneall avatar goneall commented on August 23, 2024

An additional data point for tools developer impact. I created a pull request for the SPDX tools to make the entire license expression parsing case insensitive. See spdx/tools#153

Bottom line from the work: completely ignoring case would be a moderate amount of work for any tool that would like to preserve the proper case for license ID's for human readable purposes or to comply with the RDF spec.

Details:

The tools already allow all uppercase and all lowercase operators (e.g. and and AND are both allowed). They do not currently allow mixed case (e.g. aNd is not allowed).

It was easy to update the operators to completely ignore case.

It was a moderate amount of work to ignore case on listed license ID's. I had to maintain a map of lowercase to SPDX license ID's and translate back and forth when displaying or interpreting licenses. Not a big deal, but a couple dozen lines of code which make the code a bit more complex.

Similar to the listed licenses, local document license-ref's needed a hashmap from the lowercase to proper (or original) cased ID's. In the case of the SPDX tools, there was already a map of ID's to the extracted license objects, so it was a bit easier to update.
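As a rough illustration of the bookkeeping described above (this is a Python sketch, not the code from the SPDX tools pull request), a small lowercase-to-original index covers both listed IDs and document-local license-refs. The class name `CaseFoldingIndex` and the sample identifiers are hypothetical.

```python
class CaseFoldingIndex:
    """Maps a lowercased identifier back to its canonical (or original) spelling."""

    def __init__(self, identifiers=()):
        self._by_lower = {}
        for identifier in identifiers:
            self.add(identifier)

    def add(self, identifier):
        # setdefault keeps the first spelling seen for a given lowercase key.
        existing = self._by_lower.setdefault(identifier.lower(), identifier)
        if existing != identifier:
            # Two spellings that differ only by case cannot both be resolved.
            raise ValueError(f"{identifier!r} collides with {existing!r} ignoring case")

    def canonical(self, identifier):
        """Return the stored spelling, or None if the identifier is unknown."""
        return self._by_lower.get(identifier.lower())


# Listed license IDs (hypothetical subset) and document-local license-refs
# can each get their own index.
listed = CaseFoldingIndex(["MIT", "GPL-2.0", "RHeCos"])
local_refs = CaseFoldingIndex()
local_refs.add("LicenseRef-MyLicense")
```

Raising on a case-only collision is a design choice in this sketch: case-insensitive lookup is only well defined if identifiers stay unique ignoring case.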

salicodes avatar salicodes commented on August 23, 2024

@goneall @kestewart has this issue been resolved? If no, can I work on it ?

goneall avatar goneall commented on August 23, 2024

@salicodes We should wait until we have consensus on the specification before working on the solution. There is probably enough discussion to add this as an agenda topic to an upcoming SPDX technical meeting. Once resolved, we would welcome the help in the spec and also the tools.

goneall avatar goneall commented on August 23, 2024

Discussed on tech call on 6/5/2018: Need to respect the case for the license ID's since they translate to URI's in RDF. There are also other use cases that may break other parsers.

Note: license identifiers must be unique ignoring case.

The spec can be strict on operator case sensitivity, but tool implementations are encouraged to allow case insensitivity.

Operators will be case sensitive in spec.

TODO: Create a pull request to update the spec. - just adding a sentence (ABNF already is case sensitive)

phadej avatar phadej commented on August 23, 2024

We (Haskell's Cabal) got a pull request related to this, and I found others using https://spdx.org/spdx-license-list/matching-guidelines as the justification for parsing SPDX license expressions case-insensitively. Could the matching guidelines document be updated to clearly state that it doesn't apply to matching SPDX license expressions (or that it does, if that is the case)?

phadej avatar phadej commented on August 23, 2024

Reading more carefully, the matching guidelines say

1.1 Purpose: To ensure consistent results by different SPDX document creators when matching license information that will be included in the License Information in File field. SPDX document creators or tools may match on the license or exception text itself, the official license header, or the SPDX License List short identifier.

4.1.1 Guideline: All upper case and lower case letters should be treated as lower case letters. Templates do not include markup for this guideline.


I'm slightly confused. I guess always producing identifiers as they are written in the License List, while being lenient in the parser, is the safer approach for tooling. I.e.

  • parsers of the SPDX license expressions MAY / SHOULD parse identifiers case-insensitively but
  • (pretty-)printers of the SPDX license expressions MUST produce identifiers as listed in the license list?
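The lenient-parser / strict-printer split could look roughly like this in Python. It is only a sketch under several assumptions: tokens are whitespace-separated, parentheses and exception names after WITH are not handled, uppercase is taken as the canonical operator form, and `CANONICAL_IDS` is a hypothetical subset of the license list.

```python
# Canonical operator spellings, keyed by their lowercase form.
OPERATORS = {"and": "AND", "or": "OR", "with": "WITH"}

# Hypothetical subset of the license list, keyed by lowercase form.
CANONICAL_IDS = {
    identifier.lower(): identifier
    for identifier in ("MIT", "GPL-2.0", "GPL-1.0", "EPL-1.0", "ISC")
}


def normalize(expression):
    """Accept any casing on input; emit canonical casing on output."""
    out = []
    for token in expression.split():
        key = token.lower()
        if key in OPERATORS:
            out.append(OPERATORS[key])
            continue
        # Keep a trailing "+" ("or later") attached to the identifier.
        plus = "+" if key.endswith("+") else ""
        base = key[:-1] if plus else key
        if base not in CANONICAL_IDS:
            raise ValueError(f"unrecognized identifier: {token!r}")
        out.append(CANONICAL_IDS[base] + plus)
    return " ".join(out)
```

With this, `normalize("Mit and gpl-2.0 And Gpl-1.0+ aNd ePl-1.0 aND isc")` yields `MIT AND GPL-2.0 AND GPL-1.0+ AND EPL-1.0 AND ISC`.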

zvr avatar zvr commented on August 23, 2024

@phadej I think you were correct in your first assumption that the matching guidelines should not be used when parsing license expressions or short identifiers.
