Comments (17)

translatenix commented on May 21, 2024

Is there a timeline to fix this? Developing on WSL is really painful. Gradle and esp. IntelliJ are extremely slow (30s pause is normal), and important IntelliJ features such as showing the JDK source code don't work.
I think fixing this might require changing the package cache file layout (no : character in filenames).

from pkl.

translatenix commented on May 21, 2024

@bioball Windows support is very important to me. I'd be willing to work on this first step, but I'd need some guidance wrt. changing the cache file layout.

bioball commented on May 21, 2024

How about we percent-encode them?

According to their docs, these characters are reserved:

  • < (less than)
  • > (greater than)
  • : (colon)
  • " (double quote)
  • / (forward slash)
  • \ (backslash)
  • | (vertical bar or pipe)
  • ? (question mark)
  • * (asterisk)

Forward slash is also reserved on macOS and Linux, so we don't need to worry about encoding it (Pkl filenames cannot contain forward slashes, and URI forward slashes are treated as path separators).

For Windows only, it probably makes sense to percent-encode all other characters. Or, are there other encodings that people use?

Here's some literature:

https://stackoverflow.com/questions/1184176/how-can-i-safely-encode-a-string-in-java-to-use-as-a-filename
https://stackoverflow.com/questions/1077935/will-urlencode-fix-this-problem-with-illegal-characters-in-file-names-c

I think we'd want to take the same approach for writing output files. For instance, what will be written when you run pkl eval -m . foo.pkl here?

// foo.pkl
output {
  files {
    ["foo:bar.txt"] { text = "foo:bar" }
  }
}

translatenix commented on May 21, 2024

Don't we want the exact same layout across operating systems? Seems desirable for portability/tooling/etc.
Gradle's dependency cache used to be non-portable, and it was causing a lot of pain.
gradle/gradle#1338

bioball commented on May 21, 2024

I can see wanting that, especially if you are using the cache dir as a way to vendor dependencies in a repo.

In that case, it might make more sense to think of this as a separate problem from writing file paths with -o or -m. In those cases, it'd be surprising if -o "foo:bar.txt" somehow mangled that output file on linux/macOS.

In that case, maybe it makes the most sense to always percent encode those characters when writing to the cache dir.

translatenix commented on May 21, 2024

Is mangling required for -o/-m? Can’t -o "foo:bar.txt" just fail on Windows? Another option would be to make it fail everywhere (“non-portable output path”). Anyway, I agree that this is a separate concern.
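For illustration, the "fail with a non-portable output path" option could look roughly like the sketch below. This is not Pkl's actual behavior; the class name `OutputPathCheck`, its methods, and the error message are hypothetical, and the reserved set is taken from the Windows list quoted earlier in this thread.

```java
import java.util.Optional;

// Hypothetical sketch: reject output file names that contain characters
// reserved on Windows, instead of mangling them.
class OutputPathCheck {
  // Reserved characters from the list quoted earlier in the thread.
  // '/' is omitted because it is treated as a path separator.
  private static final String WINDOWS_RESERVED = "<>:\"\\|?*";

  /** Returns the first reserved character in a single path segment, if any. */
  static Optional<Character> firstReservedChar(String segment) {
    for (int i = 0; i < segment.length(); i++) {
      char c = segment.charAt(i);
      if (WINDOWS_RESERVED.indexOf(c) >= 0) {
        return Optional.of(c);
      }
    }
    return Optional.empty();
  }

  /** Throws if the segment cannot be written on every supported OS. */
  static void requirePortable(String segment) {
    Optional<Character> bad = firstReservedChar(segment);
    if (bad.isPresent()) {
      throw new IllegalArgumentException(
          "Non-portable output path: '" + segment
              + "' contains reserved character '" + bad.get() + "'");
    }
  }

  public static void main(String[] args) {
    requirePortable("foo.txt"); // passes silently
    try {
      requirePortable("foo:bar.txt");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Whether such a check runs only on Windows or on every OS is exactly the design question raised above; the sketch works either way.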

bioball commented on May 21, 2024

What does Windows do right now if you try to create a filename with a reserved character? E.g. if you do echo "hello" > foo:bar.txt?

Note: for percent-encoding, we will also need to percent-encode the literal %.
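A minimal sketch of what this could look like, assuming only the Windows-reserved characters listed earlier plus the literal % get encoded. The class `CacheNameCodec` and its methods are hypothetical names for illustration, not part of Pkl.

```java
// Hypothetical sketch of percent-encoding for cache file names.
class CacheNameCodec {
  // Windows-reserved characters from the list above, plus '%' itself so
  // that decoding is unambiguous. '/' is left alone (path separator).
  private static final String RESERVED = "<>:\"\\|?*%";

  static String encode(String name) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < name.length(); i++) {
      char c = name.charAt(i);
      if (RESERVED.indexOf(c) >= 0) {
        // All reserved characters are ASCII, so two hex digits suffice.
        sb.append(String.format("%%%02X", (int) c));
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  static String decode(String encoded) {
    StringBuilder sb = new StringBuilder();
    int i = 0;
    while (i < encoded.length()) {
      char c = encoded.charAt(i);
      if (c != '%') {
        sb.append(c);
        i++;
        continue;
      }
      if (i + 2 >= encoded.length()) {
        throw new IllegalArgumentException("Truncated escape in: " + encoded);
      }
      int hi = Character.digit(encoded.charAt(i + 1), 16);
      int lo = Character.digit(encoded.charAt(i + 2), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("Bad escape in: " + encoded);
      }
      sb.append((char) (hi * 16 + lo));
      i += 3;
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode("localhost:0")); // localhost%3A0
    // Without encoding '%', the literal name "100%3A" would be left as-is
    // and then decode to "100:", so the mapping would not be reversible.
    System.out.println(encode("100%3A")); // 100%253A
    System.out.println(decode(encode("foo:bar%.txt"))); // foo:bar%.txt
  }
}
```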

translatenix commented on May 21, 2024

> What does Windows do right now if you try to create a filename with a reserved character?

PowerShell:

echo "hello" > foo:bar.txt
Out-File: Cannot find drive. A drive with the name 'foo' does not exist.
echo "hello" > $home\foo:bar.txt   # creates a file named "foo", but it's empty

Java 21:

jshell> new File("foo:bar.txt").createNewFile()   // creates a file named "foo"
jshell> Files.createFile(Path.of("foo:bar.txt"))
|  Exception java.nio.file.InvalidPathException: Illegal char <:> at index 3: foo:bar.txt

File explorer:
Can't create file with invalid path

translatenix commented on May 21, 2024

I assume the package URL -> file path conversion should be unique, to rule out name collisions. Should it also be reversible?

bioball commented on May 21, 2024

Yeah, ideally reversible, which is why percent-encoding seems like a good choice here.

mitchcapper commented on May 21, 2024

echo "hello" > $home\foo:bar.txt
jshell> new File("foo:bar.txt").createNewFile() // creates a file named "foo"

To note, the colon here is only kind of an invalid character. Really what you are doing is writing to alternate data streams. This is why the file can appear but is 'empty'. For details: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3

This is just one character but will be slightly different than other path characters that are truly invalid (in terms of if/what errors thrown).

Also why from a command line this works:

> echo "hey" > "somefile.txt:alt_stream" && cat somefile.txt && echo "..." && cat "somefile.txt:alt_stream"
"..."
"hey"

bioball commented on May 21, 2024

Hm... this impacts pkldoc too. This means that generated pkldoc websites should also use percent encoding, otherwise we can't write directories or filenames onto Windows.

@mitchcapper good to know. I don't think that changes the fact that we should encode these characters.

mitchcapper commented on May 21, 2024

> I don't think that changes the fact that we should encode these characters.

Right, it certainly should not be on the whitelist: it is not valid as a path character, only in some APIs for alternate stream access.

bioball commented on May 21, 2024

WRT encoding:

  • Scala does not do any encoding; generating scaladoc on Windows will error if a class name contains a reserved character.
  • Kotlin's Dokka wraps reserved characters in square brackets, then the character code inside. For example, : becomes [58].
  • Doesn't seem like Swift's DocC has figured out a solution here yet: https://forums.swift.org/t/docc-colons-in-filenames/57917/27

Using percent-encoding is pretty ugly, because you get doubly-encoded URL paths.

Kotlin's encoding seems nicer. A nice benefit here is that the file paths match the URL paths. However, it works for them because square bracket chars aren't allowed as identifier names. In Pkl, the only character not allowed in an identifier is the backtick literal.

Encoding   File path        URL path
Percent    localhost%3A0    localhost%253A0
Dokka      localhost[58]0   localhost[58]0

bioball commented on May 21, 2024

Maybe we can copy Dokka. We can simply represent verbatim [ as [[. Multiple [ literals would just be double the amount of [.

I don't think we need a special way to represent ]; it only has meaning if it is preceded by the regex \[\d{2}.

Literal      Encoded
foo:bar      foo[58]bar
foo[58]bar   foo[[58]bar
foo[[        foo[[[[
foo[:bar     foo[[[58]bar
(illegal)    foo[bar
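
The table above can be reproduced with a short sketch along these lines. The class `BracketCodec` and its methods are hypothetical and not taken from Pkl or Dokka. One assumption beyond the table: the decoder accepts any run of decimal digits between the brackets, because some reserved characters have three-digit codes (for example, | is 124).

```java
// Hypothetical sketch of the Dokka-style encoding discussed above:
// reserved characters become [<decimal code>], and a literal '[' is doubled.
class BracketCodec {
  // Windows-reserved characters from the list earlier in the thread.
  private static final String RESERVED = "<>:\"\\|?*";

  static String encode(String literal) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < literal.length(); i++) {
      char c = literal.charAt(i);
      if (c == '[') {
        sb.append("[[");
      } else if (RESERVED.indexOf(c) >= 0) {
        sb.append('[').append((int) c).append(']');
      } else {
        sb.append(c);
      }
    }
    return sb.toString();
  }

  static String decode(String encoded) {
    StringBuilder sb = new StringBuilder();
    int i = 0;
    while (i < encoded.length()) {
      char c = encoded.charAt(i);
      if (c != '[') {
        // Includes ']', which needs no escaping outside an escape sequence.
        sb.append(c);
        i++;
      } else if (i + 1 < encoded.length() && encoded.charAt(i + 1) == '[') {
        sb.append('[');
        i += 2;
      } else {
        int close = encoded.indexOf(']', i + 1);
        String digits = close < 0 ? "" : encoded.substring(i + 1, close);
        boolean allDigits = !digits.isEmpty()
            && digits.chars().allMatch(ch -> ch >= '0' && ch <= '9');
        if (!allDigits) {
          throw new IllegalArgumentException(
              "Illegal encoding at index " + i + ": " + encoded);
        }
        sb.append((char) Integer.parseInt(digits));
        i = close + 1;
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode("foo:bar"));       // foo[58]bar
    System.out.println(encode("foo[58]bar"));    // foo[[58]bar
    System.out.println(encode("foo[["));         // foo[[[[
    System.out.println(encode("foo[:bar"));      // foo[[[58]bar
    System.out.println(decode("foo[[[58]bar"));  // foo[:bar
  }
}
```

Decoding reads left to right, so a `[[` pair is always consumed as a literal bracket before an escape is considered; a lone `[` that does not start a valid escape, as in `foo[bar`, is rejected.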

One downside is that we now use four bytes to represent one byte, which might be a problem with URL addresses. But this is an edge case that maybe we can live with. Also, this is still better than percent-encoding, which uses five bytes for URL paths.

Another thought: I think this new encoding necessitates packages-2 (we use packages-1 as our cache dir right now).

bioball commented on May 21, 2024

Proposal here: apple/pkl-evolution#3

bioball commented on May 21, 2024

This works now per #489.

Might need --depth=1 in order for it to work.
