Giter VIP home page Giter VIP logo

Comments (14)

sgugger avatar sgugger commented on May 25, 2024 3

It may break other packages too. I think in general libraries should avoid this type of global patching especially on core modules like pathlib.

I personally think it is rather silly to check if something is an instance of pathlib.Path by checking if it has some attributes instead of using isinstance(o, Path) but I may be missing some context into why it is done this way. Patching things is one of the great ways in python to add some missing behavior + what if one day a write method is indeed added to Pathlib objectcs?

In any case, if there is a way to fix this error internally, I can review any PR. It is very unlikely we will remove this (or stop adding useful methods to objects of other libraries).

from fastcore.

parano avatar parano commented on May 25, 2024 1

@sgugger thanks for the reply - yes I'm all for using patch but I think it should only be applied in a local scope.

Unfortunately, I don't think this workaround(importing original before importing fastai2) will work, because the fastcore patch is modifying the same loaded module in memory that BasePath in your example code is pointing to.

Screen Shot 2020-04-23 at 11 34 46 AM

I can probably avoid the error by deleting those attributes and adding back afterward in my code. Hope this thread is helpful for others.

from fastcore.

sgugger avatar sgugger commented on May 25, 2024 1

Ok, the problem is then that subclassing Path is next to impossible: if I do something like:

from pathlib.Path import Path as BasePath
Path = class(BasePth): pass

I then get problems when using it (specific error is that it has no _flavour).

I'm open to do this subclass to contain the patching, but I remember spending hours trying to have subclasses of Path a while ago without succeeding...

The easiest way I see is to have a context manager that temporarily deletes the methods we patch, I'm also open to have that has part of fastcore since we are the one adding that stuff, so it's kind of expected we provide a way to temporarily disable it.

from fastcore.

KushajveerSingh avatar KushajveerSingh commented on May 25, 2024

The above error means that the object should be saved as a binary object, but by default this is not done in fastcore. So I changed the following lines and the error is resolved.

# Original (in write function of utils.py)
with self.open('w', encoding=encoding) as f: f.write(txt)

# Change the above line to
with self.open('wb') as f: f.write(txt)

To integrate the above change to fastcore, I think the solution would be to use try-except statement, as it would not change the functionality of any fastcore module.

The below code solves the problem.

@patch
def write(self:Path, txt, encoding='utf8'):
    "Write `txt` to `self`, creating directories as needed"
    self.parent.mkdir(parents=True,exist_ok=True)
    try:  
        with self.open('w', encoding=encoding) as f: f.write(txt)
    except TypeError: 
        with self.open('wb') as f: f.write(txt)

However, input from @sgugger is needed on how this needs to be integrated into fastcore. I can submit a PR if needed.

from fastcore.

KushajveerSingh avatar KushajveerSingh commented on May 25, 2024

Do you want to make changes to resolve this issue, as this issue is specific to a library and the change is more suited to be made in that library?

The reason for the above statement is that this would also require to change the read function. I tested the read function and it results in an infinite loop for some reason when loading the saved yaml object.

Also, if this issue has to be resolved it can be easily done by deleting write, read, readlines attributes of the pathlib.Path class. So using the below lines in the code solves the issue:

from ruamel.yaml import YAML
from fastcore.all import *

delattr(Path, 'write')
delattr(Path, 'read')
delattr(Path, 'readlines')

yaml.dump(y, Path("test.yml")) # This line works now

from fastcore.

parano avatar parano commented on May 25, 2024

@sgugger the reason ruamel.yaml did it that way was for supporting python2. pathlib was introduced in python3.4 and libraries supporting both python2 and 3 use other implementations such as pathlib2 or other custom pathlib implementation under python2. This way ruamel.yaml works with all those pathlib variations as long as they follow the same interface as the builtin pathlib. If pathlib actually added a write method one day and its signature or implementation were different than the code in fastcore, fastcore will have to change as well right?

Personally I'd prefer an approach that avoids patch and replace it with a utility function. e.g.:

# fastai.util.path_utils.py
def write(path:Path, text, encoding="utf-8"):
      ...

and replace usages like path.write(text) with:

from fastcore.util import path_utils
path_utils.write(path, text)

We could also patch it within the fastai project's scope, something like this would work:

# fastai.util.path.py
from pathlib import Path as BuiltinPath

class Path(BuiltinPath):
     def write(self, text, encoding="utf-8"):
         ...

In other fastai code, replace from pathlib import Path with:

from fastcore.util.path import Path

Happy to help make this contribution either this way or the try ... except ... approach @KushajveerSingh provided above if we can get consensus, let me know what you think.

Thanks!

from fastcore.

parano avatar parano commented on May 25, 2024

Do you want to make changes to resolve this issue, as this issue is specific to a library and the change is more suited to be made in that library?

The reason for the above statement is that this would also require to change the read function. I tested the read function and it results in an infinite loop for some reason when loading the saved yaml object.

Also, if this issue has to be resolved it can be easily done by deleting write, read, readlines attributes of the pathlib.Path class. So using the below lines in the code solves the issue:

from ruamel.yaml import YAML
from fastcore.all import *

delattr(Path, 'write')
delattr(Path, 'read')
delattr(Path, 'readlines')

yaml.dump(y, Path("test.yml")) # This line works now

That will break some other fastai/fastcore functionality right?

from fastcore.

KushajveerSingh avatar KushajveerSingh commented on May 25, 2024

That will break some other fastai/fastcore functionality right?
I don't know exactly. Just for reference, there are three functions that I suggested to delete:

  1. write :- you can replace this function with the try-except version and you are good to go
  2. readlines :- I don't think this function is much used, so you can ignore it.
  3. read :- Yeah this will get affected. But the sole reason being I have not yet gone through the source code of why this function is not working with ruamel.yaml. When I tested this function it was an infinite loop or something but I did not dig into it much. So maybe with some work, the problem can be found and then just a refractoring of the code.

from fastcore.

sgugger avatar sgugger commented on May 25, 2024

I maintain that the problem does not come from fastcore but from the way the test is done ruamel.yaml. Relying on attributes in a dynamic language like Python is dangerous, for the reason we are seeing with this bug, and the test should test for actual pathlib or pathlib2 objects by having imports conditioned on a version of python.

Saying you should never patch things in a language that lets you do that is like saying you should never use the word "the" in English.

Instead of wanting us to change our code, you can simply import the original Path before import fastai2 like such:

from pathlib import Path as BasePath

and then use this one (or plain strings...) in ruamel.yaml.

I am closing this issue as a result.

from fastcore.

sgugger avatar sgugger commented on May 25, 2024

Ah, I see. Is that what you meant with your workaround (cause that would still make Path have a write attribute, but pathlib.Path would be untouched)?

from fastcore.

parano avatar parano commented on May 25, 2024

@sgugger yes that's what I mean, with that workaround, all fastai code can still use the patched Path, where pathlib.Path remains untouched.

from fastcore.

sgugger avatar sgugger commented on May 25, 2024

Added a context manager, you should now be able to do:

with remove_patches_path():
    yaml.dump(y, Path("/tmp/test.yml"))

and it should work.

from fastcore.

parano avatar parano commented on May 25, 2024

@sgugger awesome, thanks for adding this!

from fastcore.

domdfcoding avatar domdfcoding commented on May 25, 2024

pathlib.Path has a hardcoded check in __new__ for cls being pathlib.Path. In a subclass that is False, so it assumes that cls is instead one of the flavours of Path. Is is possible to create a subclass of Path, but it requires a bit more work:

import os
import pathlib


class FastCorePath(pathlib.Path):
	"""Subclass of pathlib.Path with extra fastcore methods.

	Path represents a filesystem path but unlike PurePath, also offers
	methods to do system calls on path objects. Depending on your system,
	instantiating a Path will return either a PosixPath or a WindowsPath
	object. You can also instantiate a PosixPath or WindowsPath directly,
	but cannot instantiate a WindowsPath on a POSIX system or vice versa.
	"""

	def __new__(cls, *args, **kwargs):
		if cls is FastCorePath:
			cls = PosixFastCorePath if os.name == 'nt' else WindowsFastCorePath

		self = cls._from_parts(args, init=False)  # type: ignore
		if not self._flavour.is_supported:
			raise NotImplementedError(f"cannot instantiate {cls.__name__!r} on your system")

		self._init()
		return self


class PosixFastCorePath(FastCorePath, pathlib.PurePosixPath):
	"""Path subclass for non-Windows systems.

	On a POSIX system, instantiating a FastCorePath object should return an instance of this class.
	"""
	__slots__ = ()


class WindowsFastCorePath(FastCorePath, pathlib.PureWindowsPath):
	"""Path subclass for Windows systems.

	On a Windows system, instantiating a FastCorePath object should return an instance of this class.
	"""
	__slots__ = ()

	def owner(self):
		raise NotImplementedError("Path.owner() is unsupported on this system")

	def group(self):
		raise NotImplementedError("Path.group() is unsupported on this system")

	def is_mount(self):
		raise NotImplementedError("Path.is_mount() is unsupported on this system")

Adding the fastcore-specific methods to FastCorePath will also make them available in PosixFastCorePath and WindowsFastCorePath. Fastcore can then use this class internally without monkeypatching pathlib.Path

I have been using a subclass created in this way for a while now without issue.

from fastcore.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.