Comments (11)

cosmicBboy commented on May 31, 2024

Care to provide additional detail and a proposal? E.g. what can be removed and what part of the codebase would need to be refactored?

from pandera.

ulfaslakprecis commented on May 31, 2024

I apologize for not providing more detail, but the main problem is just that the dependency graph is huge. I'm raising this issue out of concern that this is a problem. It is in our case, and it may be for others who would like to use it as a dependency.

[Screenshot, 2023-10-03 at 11:29:16]

I can't start going into how this can best be pruned, but I'm sure there is a way. Open-source libraries will typically have up to tens of packages in their dependency graph, but this package has thousands. Surely there's something that can be done about this?

I understand if this is an annoying request to make, but I'm only here because I really like Pandera and wish I could use it at work.

GOGKI commented on May 31, 2024

I appreciate what the developers have created. I have exactly the same situation as @ulfaslakprecis in my org. Having all dev dependencies in the requirements means that if we want to productionize it, we will need to create enormous containers just to perform a simple validation. Not all the functionality is needed for the core, and for other teams that's a clear no-go. There are lots of dependency management tools (like poetry or pip-tools) that, among other options, target this issue. If you want to go for a native pip solution, it is possible as well:
https://peps.python.org/pep-0508/
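
As a rough illustration of the PEP 508 approach, the sketch below splits requirements into a small core and named optional groups, then renders them as requirement strings with an `extra` marker. The package names and group names are hypothetical placeholders, not pandera's actual dependency lists, and `requirement_lines` is a helper invented for this example.

```python
# Hypothetical split, for illustration only (not pandera's actual lists).
CORE_REQUIREMENTS = ["numpy", "pandas"]
OPTIONAL_REQUIREMENTS = {
    "io": ["pyyaml"],
    "strategies": ["hypothesis"],
}


def requirement_lines(core, extras):
    """Render PEP 508 requirement strings.

    Core requirements are emitted as-is. Optional ones get an
    `extra == "<name>"` marker, so they only apply when that extra
    is explicitly requested at install time.
    """
    lines = list(core)
    for extra_name in sorted(extras):
        for requirement in extras[extra_name]:
            lines.append(f'{requirement}; extra == "{extra_name}"')
    return lines


if __name__ == "__main__":
    for line in requirement_lines(CORE_REQUIREMENTS, OPTIONAL_REQUIREMENTS):
        print(line)
```

With setuptools, the same split would typically be passed as `install_requires=CORE_REQUIREMENTS` and `extras_require=OPTIONAL_REQUIREMENTS`, so that a plain `pip install <package>` pulls in only the core list and `pip install "<package>[io]"` adds the optional group.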

ulfaslakprecis commented on May 31, 2024

@GOGKI I started building pandabear recently. It has a similar/near-identical API to pandera but it ONLY does pandas dataframe/series validation. Still very beta, but input is much appreciated.

cosmicBboy commented on May 31, 2024

Happy to support work on making pandera more light-weight. @ulfaslakprecis any appetite for contributing to pandera as opposed to building + maintaining a brand new project?

cosmicBboy commented on May 31, 2024

I also wanted to better understand the issue here. The items listed in the dependency graph are not necessarily what you get when you pip install pandera. That list is exhaustive, built from all the **/requirements* files and GitHub Actions; those packages are not installed with a plain pip install pandera.

Without installing all of the extras, the packages installed are listed here:
https://github.com/unionai-oss/pandera/blob/main/setup.py#L47-L57
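
One way to check this locally, sketched with the standard-library `importlib.metadata` module: read the requirements declared by the installed distribution and separate the ones guarded by an `extra` marker from the ones that are always installed. The helper names `split_requirements` and `installed_core_requirements` are made up for this example, and the result covers only direct requirements, not transitive ones.

```python
from importlib import metadata


def split_requirements(requires):
    """Split PEP 508 requirement strings into (core, optional) lists.

    A requirement counts as optional when its environment marker
    (the part after ';') mentions `extra`.
    """
    core, optional = [], []
    for requirement in requires or []:
        _, _, marker = requirement.partition(";")
        target = optional if "extra" in marker else core
        target.append(requirement.strip())
    return core, optional


def installed_core_requirements(dist_name):
    """Return the direct, non-extra requirements of an installed distribution.

    Returns None when the distribution is not installed.
    """
    try:
        requires = metadata.requires(dist_name)
    except metadata.PackageNotFoundError:
        return None
    core, _ = split_requirements(requires)
    return core


if __name__ == "__main__":
    # Prints None if pandera is not installed in this environment.
    print(installed_core_requirements("pandera"))
```

For the full set of packages that actually land in an environment, including transitive dependencies, comparing `pip freeze` output before and after the install in a fresh virtual environment is a simple alternative.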

That said, I do think we could get rid of multimethod, wrapt, and packaging off the bat. pydantic and typeguard can potentially be cordoned off into their own extras.

> Having all dev dependencies in the requirements, means that if we want to productionize it, we will need to create enormous containers just to perform a simple validation

@GOGKI just so I understand this, do you only need to install core pandera when you need to productionize your code? What unexpected/unwanted dependencies do you get?

cosmicBboy commented on May 31, 2024

> @GOGKI just so I understand this, do you only need to install core pandera when you need to productionize your code? What unexpected/unwanted dependencies do you get?

Same question to you, @ulfaslakprecis. What dependencies do you consider too heavyweight in your pandera installation (not the dependency graph reported by GitHub, but the ones that are actually installed when you pip install pandera)?

z4m0 commented on May 31, 2024

@cosmicBboy in our case we are having issues with typeguard. Pandera uses typeguard>=3.0.2 and jaxtyping uses typeguard==2.13.3 which makes them incompatible. So having typeguard as an optional dependency would possibly fix the problem.

Allowing typeguard 2 would also fix our problem.
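
To make the conflict concrete, here is a minimal sketch that checks a few candidate typeguard versions against both constraints quoted above. It uses plain tuple comparison rather than a real resolver, the helper names are invented for this example, and the candidate list is illustrative rather than a complete release history.

```python
def parse_version(text):
    """Turn a simple dotted version like '3.0.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def allowed_by_pandera(version):
    # Constraint quoted in the comment above: typeguard>=3.0.2
    return parse_version(version) >= (3, 0, 2)


def allowed_by_jaxtyping(version):
    # Constraint quoted in the comment above: typeguard==2.13.3
    return parse_version(version) == (2, 13, 3)


def compatible_versions(candidates):
    """Return the candidates that satisfy both constraints at once."""
    return [
        version
        for version in candidates
        if allowed_by_pandera(version) and allowed_by_jaxtyping(version)
    ]


if __name__ == "__main__":
    # Illustrative candidates; no version can satisfy both constraints.
    print(compatible_versions(["2.13.3", "3.0.2", "4.1.0"]))
```

Because the only version jaxtyping accepts is below pandera's lower bound, the intersection is empty, which is why either relaxing the lower bound or making typeguard optional would resolve it.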

cosmicBboy commented on May 31, 2024

@z4m0 see #1563

cosmicBboy commented on May 31, 2024

@ulfaslakprecis any comments on #1365 (comment)?

If not, I'm going to close this issue in the next few days.

Created a separate issue to capture slimming down the dependencies of a bare pandera installation, but the initial claim in this issue

> We want to use Pandera in our organization's codebase, but a some evaluation deemed it unusable at the moment, due to the ENORMOUS (213 pages long) dependency graph.

is actually a non-issue, since the GitHub-reported dependency graph naively reports dependencies in requirements files and not the dependencies actually entailed by pip install pandera.

cosmicBboy commented on May 31, 2024

Closing now
