Comments (6)
Using Pyright/MyPy, you'll get a static type-checking warning if you try to pass object() to time_zone here. The only way this could perhaps be built on (from a type-checking perspective) is to maintain a literal of all possible time zones and use that instead of str.
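As a rough illustration of the literal idea, here is a minimal sketch. The `TimeZoneName` alias and the `describe_dtype` function are hypothetical stand-ins, not part of polars, and the alias is heavily truncated; a real one would have to list every IANA name and be kept in sync with the time zone database.

```python
from typing import Literal, Optional, get_args

# Hypothetical, truncated alias: a real version would enumerate every IANA name.
TimeZoneName = Literal["UTC", "Europe/London", "Asia/Tokyo"]


def describe_dtype(time_zone: Optional[TimeZoneName] = None) -> str:
    """Hypothetical stand-in for a dtype constructor that takes a time zone."""
    return f"Datetime(time_zone={time_zone!r})"


# A static checker accepts this call because the string is in the Literal.
print(describe_dtype("Europe/London"))

# A static checker would flag describe_dtype("cabbage"), although Python itself
# would still run it: Literal is not enforced at runtime.
print(get_args(TimeZoneName))
```

The trade-off is that the check only helps users who run a type checker, and only for string literals known at analysis time, so it complements runtime validation rather than replacing it.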
thanks for reporting - the validation currently happens later:
In [9]: pl.Series([datetime(2020, 1, 1)], dtype=pl.Datetime(time_zone='cabbage'))
---------------------------------------------------------------------------
ComputeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 pl.Series([datetime(2020, 1, 1)], dtype=pl.Datetime(time_zone='cabbage'))
File ~/scratch/.venv/lib/python3.11/site-packages/polars/series/series.py:312, in Series.__init__(self, name, values, dtype, strict, nan_to_null, dtype_if_empty)
309 raise TypeError(msg)
311 if isinstance(values, Sequence):
--> 312 self._s = sequence_to_pyseries(
313 name,
314 values,
315 dtype=dtype,
316 strict=strict,
317 nan_to_null=nan_to_null,
318 )
320 elif values is None:
321 self._s = sequence_to_pyseries(name, [], dtype=dtype)
File ~/scratch/.venv/lib/python3.11/site-packages/polars/_utils/construction/series.py:235, in sequence_to_pyseries(name, values, dtype, strict, nan_to_null)
225 if values_tz != "UTC" and dtype_tz is None:
226 warnings.warn(
227 "Constructing a Series with time-zone-aware "
228 "datetimes results in a Series with UTC time zone. "
(...)
233 stacklevel=find_stacklevel(),
234 )
--> 235 return s.dt.replace_time_zone(dtype_tz or "UTC")._s
236 return s._s
238 elif (
239 _check_for_numpy(value)
240 and isinstance(value, np.ndarray)
241 and len(value.shape) == 1
242 ):
File ~/scratch/.venv/lib/python3.11/site-packages/polars/series/utils.py:107, in call_expr.<locals>.wrapper(self, *args, **kwargs)
105 expr = getattr(expr, namespace)
106 f = getattr(expr, func.__name__)
--> 107 return s.to_frame().select_seq(f(*args, **kwargs)).to_series()
File ~/scratch/.venv/lib/python3.11/site-packages/polars/dataframe/frame.py:7906, in DataFrame.select_seq(self, *exprs, **named_exprs)
7883 def select_seq(
7884 self, *exprs: IntoExpr | Iterable[IntoExpr], **named_exprs: IntoExpr
7885 ) -> DataFrame:
7886 """
7887 Select columns from this DataFrame.
7888
(...)
7904 select
7905 """
-> 7906 return self.lazy().select_seq(*exprs, **named_exprs).collect(_eager=True)
File ~/scratch/.venv/lib/python3.11/site-packages/polars/lazyframe/frame.py:1810, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, no_optimization, streaming, background, _eager)
1807 if background:
1808 return InProcessQuery(ldf.collect_concurrently())
-> 1810 return wrap_df(ldf.collect())
ComputeError: unable to parse time zone: 'cabbage'. Please check the Time Zone Database for a list of available time zones
Hi @MarcoGorelli, should we add some checks here
polars/py-polars/polars/datatypes/classes.py
Lines 461 to 463 in 2970c57
Similar to what's done here
polars/py-polars/polars/_utils/convert.py
Lines 163 to 194 in 2970c57
Or should we leave the error as it is? Do you have any thoughts on this? If possible, I'd like to improve it.
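One possible shape for such an early check, as a sketch only: the helper name `validate_time_zone` is hypothetical, and it uses the standard-library `zoneinfo` module rather than whatever polars uses internally, so the set of accepted names may not match polars exactly.

```python
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def validate_time_zone(time_zone: Optional[str]) -> Optional[str]:
    """Hypothetical helper: reject unknown time zone names up front."""
    if time_zone is None:
        # No time zone means a naive datetime dtype; nothing to validate.
        return None
    try:
        # ZoneInfo looks the key up in the system or tzdata time zone database.
        ZoneInfo(time_zone)
    except (ZoneInfoNotFoundError, ValueError) as exc:
        msg = f"unable to parse time zone: {time_zone!r}"
        raise ValueError(msg) from exc
    return time_zone


try:
    validate_time_zone("cabbage")
except ValueError as err:
    print(err)
```

Calling something like this when the dtype is constructed would surface the error at the line that introduced the bad name, instead of later when a Series with actual values is built.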
Related to this: it's a bit concerning that the invalid time zone can end up in Rust without it being validated:
import polars as pl
dtype = pl.Datetime(time_zone="invalid_time_zone")
s = pl.Series(dtype=dtype)
print(s)
shape: (0,)
Series: '' [datetime[μs, invalid_time_zone]]
[
]
true, validation only kicks in if there's an actual date in the column. maybe it can happen earlier
hey for testing time zones in polars, can we add df[col_name].dt.tz to get a series of time zone strings?
could be more performant than
def tz_from_df(df: pl.DataFrame, date_col_name: str = "date") -> str:
    tz = df.schema[date_col_name].time_zone
    return tz