Giter VIP home page Giter VIP logo

Comments (7)

bathos avatar bathos commented on July 18, 2024 1

I’d be happy to help. I haven’t dug into the source here yet, but I’ll take a look.

I think (?) PVP uses the tokenizer from postcss itself, so it’s possible that some stuff, like the examples with escapes inside idents, might concern postcss rather than this lib.

from postcss-values-parser.

shellscape avatar shellscape commented on July 18, 2024

You can use conditions in the parser to detect when you're inside of a function like url(...). Ignoring the double slashes there would need to be for all urls, with and without a protocol too. see: https://github.com/shellscape/postcss-values-parser/blob/master/lib/parser.js#L302

It should be possible. Are double-slash comments part of any future spec for CSS standard?

from postcss-values-parser.

alexander-akait avatar alexander-akait commented on July 18, 2024

@shellscape let's do it 👍 No it is not part of standard. But we support less/scss/sass and i think better parse this only in loose mode

from postcss-values-parser.

bathos avatar bathos commented on July 18, 2024

This has caused postcss-values-parser to silently parse some CSS incorrectly.

The source sequence "//" lexes as two delim-tokens, "/" and "/", in CSS. These tokens have no impact on the tokenization of subsequent source text in the line where they appear and they don’t create comment tokens.

CSS is extremely extensible by design, but it does not achieve this property by leaving lots of open syntactic space at the lexical level — it leaves none at all, in fact. Instead, its extensibility stems from a layered-application-of-grammars approach, recycling a small handful of grouping constructs and ignoring whatever isn’t recognized. This carefully(?) designed system permits use of new features without breaking older agents. It’s the reason we generally don’t need ‘Babel for CSS’ and it also helps avoid bikeshedding over ever-scarcer growlix resources. The catch though is that in general, any extensions to CSS must occur on top of the higher level, syntactic layer rather than down at lexical (tokenization) level. The lexical grammar already describes a language whose membership is every string, and current and future CSS modules must be able to rely on that fact, especially as we get closer to CSS.registerProperty().

Are double-slash comments part of any future spec for CSS standard?

For these and other reasons, I think it is pretty unlikely for CSS to ever introduce line comments, but if it did, it could not use "//" as its starting delimiter, since it’s already recognized CSS in other places, and CSS must retain backwards compatibility.

The SASS landing page says it is ‘completely compatible with all versions of CSS.’ I won’t claim this is false, mainly because it’s not really clear to me what is meant by it, but neither SASS nor SCSS have ever been supersets of CSS.


That might seem like an academic concern at first, but there really are CSS grammars which expect and understand two "/" tokens in a row, rather than discarding them. For properties whose value grammars include multiple list-of-numbers parts, "/" is the delim-token typically used to delimit the lists. These grammars sometimes permit empty members, which means "/" can be followed directly by "/". For example, in border-image, two slashes after the slice value (20) indicates elision of the width value, letting us skip ahead to declaring a 10px offset while leaving the width as its default value:

border-image: url(/poop.png) 20 // 10px;

image

Postcss handles this correctly. It gives us a Declaration node whose value is 'url(/poop.png) 20 // 10px':

image

However, postcss-values-parser parses this string as SASS, not CSS. It replaces the last four tokens with a comment token that doesn’t exist.

image

I’m sure these cases aren’t encountered often, and I’d guess that most devs would — provided they noticed the incorrect output — figure out they they can get around the issue by adding an extra space between the slashes. However I would suggest reconsidering this behavior being enabled by default. Security exploits made possible by passing source text trusted to be one language into a parser that expects another have a long tradition. I see that one bug caused by this has already been addressed (#65), but really every place "//.*" produces a comment node is a bug if this is meant to be able to parse CSS.

from postcss-values-parser.

shellscape avatar shellscape commented on July 18, 2024

@bathos can you link to the spec for use of double slashes like you've screenshotted?

from postcss-values-parser.

bathos avatar bathos commented on July 18, 2024

Yep! Actually I’ll try to enumerate all the cases I can think of where the collision becomes apparent. It may not all be equally useful info, but it could help paint a picture.


In CSS Syntax § 4.3.1 Consume a token, the last step will produce a delim-token for each "/". These tokens will generally end up in ‘raw’ component value lists which higher-level grammars take as input.

The example above is border-image (CSS Backgrounds § 6.7). The grammar for a border-image component value list is given as:

<'border-image-source'> || <'border-image-slice'>
[ / <'border-image-width'> | / <'border-image-width'>? / <'border-image-outset'> ]? ||
<'border-image-repeat'>

The meanings of ||, [...], etc can be found in CSS Values. The / above represents a literal delim-token whose value is "/".

The portion of interest is / <'border-image-width'>? /. The ? says that the preceding member is optional, so this production can be expanded to two alternatives, the latter of which is just two "/" delim tokens in succession. These could have comment or whitespace tokens between them or they could not.

The use of slashes-for-lists-of-lists (or in other contexts where a delimiter prevents ambiguity) occurs in a good number of other CSS value grammars. The second paragraph in CSS Values § 2.1 gives "/" license to appear without quotes, like comma, because of its similar role as a separator. The specific pattern above, though — of a grammar including an optional item between two slashes — is rare. I only know of one other example of that pattern, the rather similar (and recent) mask-border.


We don’t need to look for cases where two "/" delim tokens are expected, though, to find cases where two source text slashes are expected. This was encountered previously with the URL-with-relative-protocol issue, for example.

In any place where a "/" delim token would be recognized, so too would be a "/" delim token followed or preceded by a comment. The delim token could be acting as a separator or it could be a division operator.

color: hsl(0deg 0% 0% //* this is an inline comment */ 0.5);
width: calc(100vh //* this is an inline comment */ 10);

SASS considers the first of these to have a line comment (and an unclosed hsl func). PVP actually parses it as CSS, not SASS, though. Although that’s the behavior I would advocate for as default, things have gotten a bit weird now: PVP has invented a language which is neither CSS nor SASS :)


A slash can appear in some other tokens besides URLs, delims, and comments, too. One of them is ident, where it could be the last source character. As a consequence, anywhere <custom-ident> occurs in a grammar, two adjacent slashes could appear, either because a comment or a a slash delim follows.

grid-column: ident-ending-with-slash\//1;
will-change: foo\//*bar*/;

CSS and SASS are actually in agreement on the meaning of both of these examples. That makes sense: the escaped slashes in the idents are consumed greedily by both. However, even though SASS doesn’t think either of these examples have line comments, PVP does. That hints at a distinct issue related to ident lexing, though.


So far we’ve only looked at what happens in (predefined) component value grammars that we’re trying to match. An unrecognized property is still valid CSS, though. The following stylesheet contains a color property (and no comments).

foo {
  //; color: red;
}

New component value grammars are added pretty often. It’s not unusual for them to introduce token combos which haven’t been seen before. They can do that without fear because the component value system provides a kind of safe playground. The grammars are free to be just about anything provided just a few rules are followed, which are something like this roughly:

  1. they must adhere to the CSS lexical grammar
  2. any braces — () {} [] — must appear in balanced pairs
  3. any ; delim tokens may occur only between braces.

From this system we get backwards and forwards compatibility for free. This is pretty cool!

In many ways in this system, forwards compat matters more than backwards compat. That becomes especially clear when looking at the new CSS.registerProperty function. So far only Chrome has implemented this, and it’s still behind a flag. It allows specifying the component value grammar of a custom property entirely in userland.

When determining what’s safe to extend, I think the thing to check isn’t ‘are any specific items (property grammars, etc) defined right now with which this would collide,’ but rather ‘can this be built using the lego system that forms the basis of CSS’. PVP is a pretty important & valuable ecosystem tool, so my feeling is that it should probably not ‘spoil the commons’ by default by melting down legos, build tool or not. On the other hand, this isn’t my project and I don’t know how its authors prioritize that stuff.

I think there are a few bugs to address related to // regardless of whether sass comments become opt-in (or opt-out), though.

from postcss-values-parser.

shellscape avatar shellscape commented on July 18, 2024

@bathos I really applaud all you've put into this. That's a wealth of information. I'm all for improving the parser, but I'd need some outside assistance in making this particular bug adhere to the spec.

from postcss-values-parser.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.