chendaniely / pyprojroot
Finding project directories in Python (data science) projects, just like in R rprojroot and here packages
License: MIT License
Some functions raise an exception when passed a pathlib object.
Wrapping str() around here() fixes this, but if here() had an option like `output='str'` that returns a string object instead of a pathlib object, it would make the code a little cleaner.
This would break if used like here(output='str') / folder1 / folder2, since the / operator is overloaded only for pathlib objects, not strings.
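The breakage described above can be reproduced with the standard library alone. A minimal sketch, where the path is a made-up stand-in for whatever here() returns:

```python
from pathlib import Path

# Made-up stand-in for the value returned by here(); not a real project path.
root = Path("/tmp/project")

# Path objects overload "/", so chaining works.
joined = root / "folder1" / "folder2"

# Plain strings do not support "/", so the same expression raises TypeError.
try:
    str(root) / "folder1"
    str_join_failed = False
except TypeError:
    str_join_failed = True
```

Converting to a string only at the call site that needs one keeps the / chaining available everywhere else.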
Hi everyone:
I'm going to try to consolidate the messages from a bunch of github issues into this one issue + response.
So far the ones I've seen are:
First, apologies.
GitHub pings are something I don't really see because all my important GitHub work and communications happen over Slack channels, or even emails. Also, a lot of GitHub pings come from my own students, and are mostly from teaching examples that usually don't prompt me to do anything urgent.
What I'm going to be doing over the next few weeks is merging in all the PRs that don't have any major merge conflicts. What ended up happening in the past was that I accepted a good feature into the project that came with a bunch of type hints and unit tests that I don't really know how to work with, and it really broke a lot of my own maintenance of the project. I think #32 just fixed that, so thank you.
Moving forward, I might need to just work on getting all the incoming features merged in without worrying about the unit tests, and slowly work on figuring that part out on my own. There are a few things that I would like to do with this project.
There are a few conferences I'm planning to attend with sprints that this project seems good for (PyCascades, PyCon, SciPy). It would also be really nice to revisit #14 and actually get this project published through PyOpenSci, now that Leah is full time on the organization.
I'll try to track down all the people with forks and improvements over the next few weeks and invite them to PR back upstream so we can all be on the authorship when the proposal to PyOpenSci goes in.
I'll post more in this issue as things progress along.
If you need to reach out to me my gmail account is the same as this username (and should be able to be tracked in the commit history?)
Triage Issues + Current PRs
See what state the repository would need to be in for the PyCascades sprint on March 20th.
Dan
I found an ITP on the debian-python list about your project.
Sorry, I don't understand all details.
Can you improve your README and explain in a bit more detail what the package does and what its advantage is? Currently I don't see one.
I only see very simple path handling, which could also be done by or via pathlib itself. What does your package add to the pathlib functionality?
JetBrains' PyCharm creates a .idea folder in the project root.
This should be added to the list of recognized roots.
#3 added tests to the repository, but after merging a few PRs the tests are broken.
Even if I git reset --hard d381ef9, I get the following errors:
$ pytest -k test_here
============================= test session starts ==============================
platform linux -- Python 3.7.3, pytest-5.0.1, py-1.8.0, pluggy-0.12.0
rootdir: /home/dchen/git/hub/pyprojroot
plugins: doctestplus-0.3.0, arraydiff-0.3, openfiles-0.3.2, remotedata-0.3.1
collected 25 items / 1 deselected / 24 selected
tests/test_pyprojroot.py .....F.....F.....F.....F [100%]
=================================== FAILURES ===================================
____________________________ test_here[stuff-.dvc] _____________________________
self = PosixPath('/.git')
def __str__(self):
"""Return the string representation of the path, suitable for
passing to system calls."""
try:
> return self._str
E AttributeError: _str
/home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
During handling of the above exception, another exception occurred:
tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_stuff__dvc_0')
proj_file = '.dvc', child_dir = 'stuff'
@pytest.mark.parametrize(
"proj_file",
[
".git",
".here",
"my_project.Rproj",
"requirements.txt",
"setup.py",
".dvc",
],
)
@pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
def test_here(tmpdir, proj_file, child_dir):
"""
This test uses pytest's tmpdir facilities to create a simulated project
directory, and checks that the path is correct.
"""
# Make proj_file
tmpdir = Path(tmpdir)
p = tmpdir / proj_file
with p.open("w") as fpath:
fpath.write("blah")
# Make child dirs
(tmpdir / child_dir).mkdir(parents=True)
os.chdir(tmpdir / child_dir)
assert os.getcwd() == str(tmpdir / child_dir)
# Check that proj
> path = here()
/home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
E RecursionError: maximum recursion depth exceeded while calling a Python object
!!! Recursion detected (same locals & position)
_____________________________ test_here[src-.dvc] ______________________________
self = PosixPath('/.git')
def __str__(self):
"""Return the string representation of the path, suitable for
passing to system calls."""
try:
> return self._str
E AttributeError: _str
/home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
During handling of the above exception, another exception occurred:
tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_src__dvc_0')
proj_file = '.dvc', child_dir = 'src'
@pytest.mark.parametrize(
"proj_file",
[
".git",
".here",
"my_project.Rproj",
"requirements.txt",
"setup.py",
".dvc",
],
)
@pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
def test_here(tmpdir, proj_file, child_dir):
"""
This test uses pytest's tmpdir facilities to create a simulated project
directory, and checks that the path is correct.
"""
# Make proj_file
tmpdir = Path(tmpdir)
p = tmpdir / proj_file
with p.open("w") as fpath:
fpath.write("blah")
# Make child dirs
(tmpdir / child_dir).mkdir(parents=True)
os.chdir(tmpdir / child_dir)
assert os.getcwd() == str(tmpdir / child_dir)
# Check that proj
> path = here()
/home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
E RecursionError: maximum recursion depth exceeded while calling a Python object
!!! Recursion detected (same locals & position)
_____________________________ test_here[data-.dvc] _____________________________
self = PosixPath('/.git')
def __str__(self):
"""Return the string representation of the path, suitable for
passing to system calls."""
try:
> return self._str
E AttributeError: _str
/home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
During handling of the above exception, another exception occurred:
tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_data__dvc_0')
proj_file = '.dvc', child_dir = 'data'
@pytest.mark.parametrize(
"proj_file",
[
".git",
".here",
"my_project.Rproj",
"requirements.txt",
"setup.py",
".dvc",
],
)
@pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
def test_here(tmpdir, proj_file, child_dir):
"""
This test uses pytest's tmpdir facilities to create a simulated project
directory, and checks that the path is correct.
"""
# Make proj_file
tmpdir = Path(tmpdir)
p = tmpdir / proj_file
with p.open("w") as fpath:
fpath.write("blah")
# Make child dirs
(tmpdir / child_dir).mkdir(parents=True)
os.chdir(tmpdir / child_dir)
assert os.getcwd() == str(tmpdir / child_dir)
# Check that proj
> path = here()
/home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
E RecursionError: maximum recursion depth exceeded while calling a Python object
!!! Recursion detected (same locals & position)
__________________________ test_here[data/hello-.dvc] __________________________
self = PosixPath('/.git')
def __str__(self):
"""Return the string representation of the path, suitable for
passing to system calls."""
try:
> return self._str
E AttributeError: _str
/home/dchen/anaconda3/lib/python3.7/pathlib.py:697: AttributeError
During handling of the above exception, another exception occurred:
tmpdir = PosixPath('/tmp/pytest-of-dchen/pytest-3/test_here_data_hello__dvc_0')
proj_file = '.dvc', child_dir = 'data/hello'
@pytest.mark.parametrize(
"proj_file",
[
".git",
".here",
"my_project.Rproj",
"requirements.txt",
"setup.py",
".dvc",
],
)
@pytest.mark.parametrize("child_dir", ["stuff", "src", "data", "data/hello"])
def test_here(tmpdir, proj_file, child_dir):
"""
This test uses pytest's tmpdir facilities to create a simulated project
directory, and checks that the path is correct.
"""
# Make proj_file
tmpdir = Path(tmpdir)
p = tmpdir / proj_file
with p.open("w") as fpath:
fpath.write("blah")
# Make child dirs
(tmpdir / child_dir).mkdir(parents=True)
os.chdir(tmpdir / child_dir)
assert os.getcwd() == str(tmpdir / child_dir)
# Check that proj
> path = here()
/home/dchen/git/hub/pyprojroot/tests/test_pyprojroot.py:40:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:15: in here
proj_path = pyprojroot(pl.Path('.').cwd(), proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
/home/dchen/git/hub/pyprojroot/pyprojroot/pyprojroot.py:10: in pyprojroot
return pyprojroot(p.parent, proj_files)
E RecursionError: maximum recursion depth exceeded while calling a Python object
!!! Recursion detected (same locals & position)
============== 4 failed, 20 passed, 1 deselected in 0.70 seconds ===============
Spyder projects create a .spyproject directory that should be added to the list of files/folders to look for.
While here was mainly used to read data, it's also useful for writing data to locations.
However, it's a little clunky to create folders before using here.
For example:
pl.Path(here("./data/f1/f2/n1/n1", warn=False)).mkdir(parents=True, exist_ok=True)
df.to_csv(here("./data/f1/f2/n1/n1/data.csv", warn=False))
This makes sure all the folders for "./data/f1/f2/n1/n1" are created before writing.
We can save on the explicit pathlib import and make the save a single line if here has an option akin to create_dirs=True.
It would turn the above code into:
df.to_csv(here("./data/f1/f2/n1/n1/data.csv", create_dirs=True))
The create_dirs parameter would also set warn=False automatically (since you're most likely going to use this idiom to write to a new file).
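Until such an option exists, the same idiom can be wrapped in a small helper. A minimal sketch, assuming a hypothetical function named writable_path that takes the project root explicitly; it is not part of pyprojroot, and the create_dirs name simply mirrors the proposal:

```python
from pathlib import Path


def writable_path(root, relative, create_dirs=True):
    """Return root/relative, creating missing parent folders when asked.

    Hypothetical helper for illustration; not part of pyprojroot.
    """
    target = Path(root) / relative
    if create_dirs:
        # Only the parent folders are created; the file itself is left
        # for the caller to write.
        target.parent.mkdir(parents=True, exist_ok=True)
    return target
```

With a helper like this the write becomes a single call, for example df.to_csv(writable_path(root, "data/f1/f2/n1/n1/data.csv")), where root is whatever project root you already have.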
I know Python 2 is EOL, but there are a lot of geospatial analysts clinging to it: Python 2.x still ships with ArcGIS Desktop and people are reluctant to change, and this module could simplify a lot of hacky path issues.
With that in mind, would you be interested in a PR to add Python 2 backwards compatibility for pyprojroot? I'm happy to write and submit the changes, but if it's not something you want to add to the module, I understand.
From https://stackoverflow.com/questions/67631/how-to-import-a-module-given-the-full-path:
we can actually import module files from a full path. Since here() returns a full path, we can write a function that makes importing easier.
I don't want to name the function import, but something like:
import pyprojroot
from pyprojroot import here
# proj_import is the proposed function; it does not exist yet
# will append `src` to the import path (temporarily)
pyprojroot.proj_import(here("./src"))
import my_module1
import my_module2
# will import module directly, given a .py script
pyprojroot.proj_import(here("./src/my_module.py"))
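One way the proposed function could work is with the standard importlib machinery from the linked Stack Overflow answer. A minimal sketch, assuming the hypothetical name proj_import from the proposal; nothing here is an existing pyprojroot API:

```python
import importlib.util
import sys
from pathlib import Path


def proj_import(path):
    """Hypothetical helper for illustration; not part of pyprojroot.

    Directory: put it on sys.path so a plain `import` finds modules in it.
    .py file: load the file and return the resulting module object.
    """
    path = Path(path).resolve()
    if path.is_dir():
        if str(path) not in sys.path:
            sys.path.insert(0, str(path))
        return None
    spec = importlib.util.spec_from_file_location(path.stem, str(path))
    module = importlib.util.module_from_spec(spec)
    # Register before executing so imports of the module's own name resolve.
    sys.modules[path.stem] = module
    spec.loader.exec_module(module)
    return module
```

Unlike the proposal, this sketch leaves the directory on sys.path rather than adding it temporarily; a context manager that removes the entry on exit would be needed for that behavior.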
This project should have continuous integration (especially since the tests in v0.2.0 are broken), but there are lots of options, including:
I can help set this up. Which one do you prefer?
First, thanks for making this. It's been on my TODO list for a long time: https://github.com/jamesmyatt/pyprojroot
The interface of this library is very different from that used by rprojroot. Would it be OK to change it to be closer? In particular, the main function is find_root and it takes some criterion as the main input. What do you think?
Hi there!
I wonder whether it is possible to cut a new release after the major refactor last year.
PyPI shows 0.2.0 from 2019: https://pypi.org/project/pyprojroot/#history
What can I do to help with this?
In conda, I get the following error:
ClobberError: This transaction has incompatible packages due to a shared path.
packages: conda-forge/noarch::pyprojroot-0.2.0-py_0, conda-forge/noarch::shtab-1.3.4-pyhd3deb0d_0
path: 'lib/site-packages/tests/__init__.py'
This suggests that this package is distributing a module called tests, which is wrong.
I assume that the same problem will occur with pip too.
In my experience, the best solution is a src layout as described in https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure and https://hynek.me/articles/testing-packaging/#src.
It's also worth noting that the "tests outside application code" pattern usually doesn't make the tests directory into a module: https://docs.pytest.org/en/latest/goodpractices.html?highlight=src#tests-outside-application-code. Note that this good practice guide also strongly suggests a src layout.
edit: 450c40f fixes it so that the old way still works.
The biggest change from #20 (thanks @jamesmyatt) is that the import statement will need to change.
Old way:
from pyprojroot import here
New way:
from pyprojroot.here import here
The changes are mainly to make this package more flexible in specifying a base root directory, like how the R here package builds on top of the rprojroot library.
Hi, I get the following error on import. What can I do?
flo@comp:/projects/bla$ sudo pip3 install pyprojroot
Requirement already satisfied: pyprojroot in /usr/local/lib/python3.5/dist-packages (0.1.1)
flo@comp:/projects/bla$ python3
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyprojroot import here
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/pyprojroot/__init__.py", line 1, in <module>
from .pyprojroot import *
File "/usr/local/lib/python3.5/dist-packages/pyprojroot/pyprojroot.py", line 22
warnings.warn(f"Path doesn't exist: {pth}")
^
SyntaxError: invalid syntax
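The caret in the traceback points at an f-string. F-strings were added in Python 3.6, so the Python 3.5.3 interpreter shown above cannot parse that line. A minimal sketch of an equivalent warning that parses on 3.5, using str.format; the function name warn_if_missing is hypothetical and is not the library's own code:

```python
import warnings
from pathlib import Path


def warn_if_missing(pth):
    """Warn when pth does not exist, without using an f-string.

    Hypothetical helper for illustration; not pyprojroot's implementation.
    """
    pth = Path(pth)
    if not pth.exists():
        # str.format works on Python versions that predate f-strings.
        warnings.warn("Path doesn't exist: {}".format(pth))
    return pth
```

Running the package on Python 3.6 or newer avoids the error without any code change.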
pyprojroot/pyprojroot/pyprojroot.py
Line 44 in d7ddccd
There is a problem with assuming cwd if this code gets used from outside the assumed project. It would be the user's responsibility to provide something like __file__ as a starting dir.
Hi @chendaniely, thank you for putting pyprojroot together, I use it all the time now.
Just ran into an issue where I was outside my project directory (without realizing it) and got a cryptic RecursionError:
16 if len(found) > 0:
17 return path
---> 18 return py_project_root(path.parent, project_files)
19
20
RecursionError: maximum recursion depth exceeded while calling a Python object
Obviously it's my fault for being in the wrong directory.
But I wonder if there's a way to catch this and give a user a more informative error message though?
i.e. something like
try:
    return py_project_root(path.parent, project_files)
except RecursionError as e:
    print("did not find project root, are you in the project directory?")
    raise e
(I don't know if that's an idiomatic way to add info and re-raise the error, but something like that, which gives a little more info to a user who might be lost in some other problem and confused by the error, like me.)
Just a suggestion. Thanks!
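The recursion never terminates because the parent of the filesystem root is the root itself, so path.parent stops changing once the search reaches the top. An alternative to catching RecursionError is to walk the ancestors in a loop and raise a descriptive error when they run out. A minimal sketch, assuming a hypothetical function name find_project_root and plain file-name markers (no glob patterns); it is not the library's implementation:

```python
from pathlib import Path


def find_project_root(start, project_files):
    """Walk upward from start until a directory holds one of project_files.

    Hypothetical iterative variant of the recursive search quoted above.
    """
    start = Path(start).resolve()
    # start.parents ends at the filesystem root, so the loop always finishes.
    for candidate in [start, *start.parents]:
        if any((candidate / name).exists() for name in project_files):
            return candidate
    raise FileNotFoundError(
        "did not find project root starting from {}; "
        "are you in the project directory?".format(start)
    )
```

Passing the starting directory explicitly also lets a caller use the location of a script instead of the current working directory.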
Currently the tool will find any path containing a .dvc folder; however, there are times when it is necessary to initialize dvc in a subdirectory. This comes up when, e.g., packaging the mydata.csv.dvc files via, say, importlib to let users treat datasets as dependencies in the code itself.
On that note, dvc actually does this for you, with dvc.api.Repo.find_root(). It may be worth modularizing the assumptions here a bit, to let users assemble the "right" assumptions for what a "root directory" is, while providing sane defaults for the rest.
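One way to modularize those assumptions is to express each one as a small predicate and let the caller choose which predicates apply. A minimal sketch; the names has_file, has_dir and find_root are hypothetical and only illustrate the idea, they are not the pyprojroot API:

```python
from pathlib import Path


def has_file(name):
    """Criterion: the directory contains a file called name."""
    return lambda directory: (directory / name).is_file()


def has_dir(name):
    """Criterion: the directory contains a subdirectory called name."""
    return lambda directory: (directory / name).is_dir()


def find_root(start, criteria):
    """Return the nearest ancestor of start that satisfies any criterion."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if any(criterion(candidate) for criterion in criteria):
            return candidate
    raise FileNotFoundError("no directory above {} matched".format(start))
```

A caller who initializes dvc in a subdirectory could then leave has_dir(".dvc") out of the list and search with has_dir(".git") alone, while a default list covers everyone else.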