Comments (8)
Whatever the outcome of this discussion, CLIs depending on pystow MUST provide a way to change the location. I have now had cases, for example using OAK with ODK, where all the processing outside PWD is lost after the run because the process (say, an ontology query) is running inside the Docker container. In fact, it's possible that the caller does not have write permission outside PWD at all, which needs to be considered.
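One way a CLI could honor this requirement is to accept an explicit location and fall back to a default only when none is given. The sketch below is hypothetical: the `--data-dir` flag, the `example-cli` name, and the `choose_data_dir` helper are illustrative assumptions, not part of pystow, OAK, or ODK. It prefers the flag, then the PYSTOW_HOME environment variable (assumed here to hold an absolute base path), and finally a folder under the current working directory, which may be the only writable place in a containerized run.

```python
import argparse
import os
from pathlib import Path


def choose_data_dir(argv, environ=None, cwd=None):
    """Pick a storage directory: flag first, then environment, then PWD."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # Hypothetical CLI; the flag name is an assumption for illustration.
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument(
        "--data-dir",
        type=Path,
        default=None,
        help="where downloaded and cached files are stored",
    )
    args = parser.parse_args(argv)

    if args.data_dir is not None:
        # An explicit choice by the caller always wins.
        return args.data_dir
    if environ.get("PYSTOW_HOME"):
        # Assumed environment override for the base directory.
        return Path(environ["PYSTOW_HOME"])
    # Last resort: stay inside the working directory.
    return cwd / ".data"
```

Calling `choose_data_dir(["--data-dir", "/work/cache"])` returns the path given on the command line regardless of the environment, so a wrapper script inside a container can always redirect the output.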
from pystow.
I noticed in the docs that the location of the data is configurable:
If you want to use an alternate folder name to .data inside the home directory, you can set the PYSTOW_NAME environment variable. For example, if you set PYSTOW_NAME=mydata, then the following code for the pykeen app will create the $HOME/mydata/pykeen/ directory
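The lookup the docs describe can be mimicked with the standard library alone. This is a minimal sketch of the described behavior, not pystow's actual implementation; `resolve_app_dir` is a hypothetical helper, and unlike the behavior quoted above it only computes the path and does not create the directory.

```python
import os
from pathlib import Path


def resolve_app_dir(app, environ=None, home=None):
    """Return <home>/<PYSTOW_NAME or .data>/<app> without touching the disk."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    # PYSTOW_NAME replaces the default folder name ".data" when it is set.
    folder = environ.get("PYSTOW_NAME", ".data")
    return home / folder / app
```

With `PYSTOW_NAME=mydata` and a home directory of `/home/alice`, the helper yields `/home/alice/mydata/pykeen` for the pykeen app, matching the example in the quoted docs.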
from pystow.
I agree that if pystow is used as a dependency of some package that someone installs, it might not be clear what it is and what the .data folder is. I wonder if, instead of naming it .pystow_data, which assumes someone knows what pystow is, naming it simply .pydata would at least convey the fact that "this is a data folder for Python packages".
from pystow.
I've never used a package that created a .data folder, so I'm curious if you guys or anyone else has an example of another package doing this that would create the conflict.
I don't want to make the name Python-specific or pystow-specific because the concept transcends languages. I have actually been planning to write an R port of this that should have a similar interface (and potentially a higher impact, since R users are terrible with reproducibility).
There are so many application-specific folders littering the home directory now that I'd rather keep this one generic as a reminder that it should supersede the other ones.
from pystow.
I'm not aware of any such packages either. My thinking was based on the sheer number of software and data professionals in the world, as well as hobbyists, and the impossibility of knowing whether any of them are using a .data folder, especially in custom/bespoke workflows that wouldn't be public knowledge. That it is such a generic name, and that it seemed like an obvious choice to you, makes it seem at least plausible to me that it could also be an obvious choice for someone else.
You bring up a good point though. If you want to create something that is not pystow-specific or language-specific, then a somewhat generic name is in order. In this case, I think you need to start thinking about branding. If you want this thing in its language-agnostic form to gain mindshare, maybe it should have a pithy name to help brand it. In any case, I've updated adeft to use appdirs to place the models in the platform-specific user data location (which I think is a better fit for adeft), so I no longer feel personally responsible here.
from pystow.
I agree with having the layout transcend language or a particular implementation, but there is a set of conventions at play here, so I think having a more specific name is warranted, and I agree with @steppi that it's not unlikely that other applications will choose .data.
from pystow.
Can someone point to a concrete example of another application (any platform/language) that’s using the .data directory in the home folder?
If that’s really an issue, there are several ways to configure where pystow puts its home directory, either by specifying it explicitly or by falling back to the XDG standard.
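The fallback mentioned here presumably refers to the XDG Base Directory convention, under which user data goes in `$XDG_DATA_HOME` or, when that variable is unset or empty, in `$HOME/.local/share`. The sketch below shows that rule with the standard library only; `xdg_data_dir` is a hypothetical helper, and how pystow itself implements the fallback is not shown in this thread.

```python
import os
from pathlib import Path


def xdg_data_dir(app, environ=None, home=None):
    """Resolve an app's data directory following the XDG data-home rule."""
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)
    base = environ.get("XDG_DATA_HOME")
    # The spec says an unset or empty value means $HOME/.local/share,
    # and that relative paths are invalid and should be ignored.
    if not base or not os.path.isabs(base):
        return home / ".local" / "share" / app
    return Path(base) / app
```

Because the result lives under a per-application subfolder of a well-known location, it avoids sharing a generically named directory such as .data directly in the home folder.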
from pystow.
Can someone point to a concrete example of another application (any platform/language) that’s using the .data directory in the home folder?
I’m not aware of any examples. To summarize my thoughts:
- .data is a generic name and if you think it’s a natural choice it’s not unlikely someone else will too.
- The consequences of a clash could be catastrophic. I misconfigured pystow before due to gh-11 and it happily deleted all of the content in the clashing directory.
- The direct users of pystow are Python package developers. The users of these packages may not even know that pystow exists and shouldn’t have to worry about configuring it.
Even if the frequency of clashes is very small, that a clash could lead to catastrophic results for a user is enough to scare me away from using pystow in one of my own packages. Absence of evidence isn’t necessarily evidence of absence, and that I’m not aware of any applications using a .data directory doesn’t make me feel secure given my complete ignorance of the bespoke workflows that are used in different teams/groups/labs.
That said, I don’t intend to push any further on this and hope I don’t come off as too aggressive.
from pystow.