Comments (6)
To add to @smackesey: I can also confirm this bug exists on the main
branch, with a minimal Dockerfile example:
FROM python:3.10
ENV PIP_ROOT_USER_ACTION=ignore
RUN pip install --upgrade pip
RUN pip install git+https://github.com/unionai-oss/pandera@main
ENTRYPOINT python -c 'import pandera'
> docker build . --no-cache -t 'pandera-main' && docker run 'pandera-main'
[+] Building 19.2s (7/7) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 228B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10 1.3s
=> CACHED [1/3] FROM docker.io/library/python:3.10@sha256:1b2e0805e24189fbba4e55b9bee89e3c25533cbe4fb71ae151f3e7ae0c9b86c5 0.0s
=> [2/3] RUN pip install --upgrade pip 3.8s
=> [3/3] RUN pip install git+https://github.com/unionai-oss/pandera@main 13.4s
=> exporting to image 0.6s
=> => exporting layers 0.6s
=> => writing image sha256:475ba4457bd99ca1fc9eb1cc7425bc96820a0e152278ae55d9c7983b228c047d 0.0s
=> => naming to docker.io/library/pandera-main 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.10/site-packages/pandera/__init__.py", line 4, in <module>
    import pandera.backends
  File "/usr/local/lib/python3.10/site-packages/pandera/backends/__init__.py", line 6, in <module>
    import pandera.backends.pandas
  File "/usr/local/lib/python3.10/site-packages/pandera/backends/pandas/__init__.py", line 5, in <module>
    import pandera.typing
  File "/usr/local/lib/python3.10/site-packages/pandera/typing/__init__.py", line 9, in <module>
    from pandera.typing import (
  File "/usr/local/lib/python3.10/site-packages/pandera/typing/dask.py", line 5, in <module>
    from pandera.typing.common import DataFrameBase, IndexBase, SeriesBase
  File "/usr/local/lib/python3.10/site-packages/pandera/typing/common.py", line 11, in <module>
    from pandera.engines import numpy_engine, pandas_engine, pyspark_engine
  File "/usr/local/lib/python3.10/site-packages/pandera/engines/pyspark_engine.py", line 17, in <module>
    import pyspark.sql.types as pst
ModuleNotFoundError: No module named 'pyspark'
from pandera.
I can confirm I am also affected by this. For now, I am pinning to 0.15.2
in my use cases.
Buggy line of code: https://github.com/unionai-oss/pandera/blob/main/pandera/typing/common.py#L11
I can also confirm the issue is in that line. It looks like the pyspark imports need to be wrapped in conditional logic, similar to the geopandas ones.
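For anyone unfamiliar with the pattern, here is a minimal sketch of what "wrapping an optional import in conditional logic" can look like. This is not the actual change in pandera; the helper `optional_import`, the flag `PYSPARK_INSTALLED`, and the function `available_engines` are hypothetical names used only for illustration.

```python
import importlib
from types import ModuleType
from typing import List, Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it is not installed.

    ModuleNotFoundError is a subclass of ImportError, so a missing
    optional dependency is caught here instead of crashing the import
    of the package that depends on it.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module-level guard: resolved once at import time.
pst = optional_import("pyspark.sql.types")
PYSPARK_INSTALLED = pst is not None


def available_engines() -> List[str]:
    """List engine names, adding pyspark only when it can be imported."""
    engines = ["numpy", "pandas"]
    if PYSPARK_INSTALLED:
        engines.append("pyspark")
    return engines


if __name__ == "__main__":
    print(available_engines())
```

With a guard like this, a plain `import` of the package would no longer fail when pyspark is absent; only code paths that actually need pyspark would have to check the flag first.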
#1268 should fix this!
I will cut a bugfix release later today.
Just cut a 0.16.1 release. @smackesey, can you try again?
It works now, thanks!