Comments (4)
Maybe consider opening an issue at IEEE_754? ;)
from polars.
@Macfly to expound a bit on Ritchie's answer: numbers are stored in the computer in binary format with a limited number of bits, so not every number can be represented exactly. If you take a look at https://www.omnicalculator.com/other/floating-point, select Number to Floating Point, and enter 799.32, you'll see that the number stored as a 32-bit float actually represents 799.32000732421875. If you set the number of bits to 64, you get a whole lot more precision, but it's still not exact: 799.32000000000005.
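If you'd rather check this locally than with the calculator, here is a minimal sketch using only the Python standard library. The helper name `to_float32` is made up for illustration; it relies on `struct` rounding to the nearest 32-bit float, and on `Decimal(float)` showing the exact stored binary value.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float.

    The result is handed back as a Python float (64-bit), which can hold
    every 32-bit float exactly. The name is hypothetical, not a library API.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 799.32
as_f32 = to_float32(value)

# Decimal(float) converts the stored binary value exactly, with no display rounding.
exact_f32 = Decimal(as_f32)
exact_f64 = Decimal(value)

print(exact_f32)  # 799.32000732421875
print(exact_f64)  # much closer to 799.32, but still not exactly 799.32
```

Neither width stores 799.32 itself; the 64-bit value is just a far closer neighbor.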
I understand the floating-point accuracy issue, but why do NumPy and pandas display the value as 799.32 with float32?
Is there something special about polars that makes it always show the exact floating-point representation?
NumPy actually does the same thing:
>>> pl.Series([799.32], dtype=pl.Float32).item()
799.3200073242188
>>> np.float32(799.32).item()
799.3200073242188
>>> np.float32(799.32).astype(np.float64)
799.3200073242188
When you do Series([799.32], dtype=pl.Float32)[0], you are actually performing these operations:
>>> np.float64(799.32).astype(np.float32).astype(np.float64).item()
799.3200073242188
You start with a 64-bit Python float, which gets cast to a 32-bit float by polars, and then back to a 64-bit Python float when you do Series[0].
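The round trip can be reproduced without polars or NumPy. The sketch below uses only the standard library; `to_float32` and `shortest_float32_repr` are hypothetical helpers written for this comment, not library functions. It also illustrates one plausible reason the displays differ: as far as I know, NumPy prints a float32 scalar with the fewest digits that still identify it at 32-bit precision, whereas a Python float is printed with enough digits to identify it at 64-bit precision.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, widened back to 64 bits."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def shortest_float32_repr(x32: float) -> str:
    """Hypothetical helper: fewest significant digits that round-trip at 32-bit precision."""
    for digits in range(1, 18):
        text = f"{x32:.{digits}g}"
        # Parse the candidate text and check it lands on the same 32-bit float.
        if to_float32(float(text)) == x32:
            return text
    return repr(x32)


# 64-bit Python float -> 32-bit float -> back to a 64-bit Python float
stored = to_float32(799.32)

print(repr(stored))                   # 799.3200073242188
print(shortest_float32_repr(stored))  # 799.32
```

Both lines describe the same stored value; only the number of digits needed to pin it down changes with the assumed precision.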