hgrecco / pint-pandas

Pandas support for pint

License: Other

Jupyter Notebook 13.47% Python 86.53%

pint-pandas's Introduction

Pint-Pandas

Pandas support for pint

>>> import pandas as pd
>>> import pint_pandas

>>> df = pd.DataFrame({
...     "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
...     "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
... })
>>> df['power'] = df['torque'] * df['angular_velocity']
>>> df.dtypes
torque                                       pint[foot * force_pound]
angular_velocity                         pint[revolutions_per_minute]
power               pint[foot * force_pound * revolutions_per_minute]
dtype: object

Documentation

Full documentation is available at http://pint-pandas.readthedocs.io/

Quick Installation

To install Pint-Pandas, simply:

$ pip install pint-pandas

or utilizing conda, with the conda-forge channel:

$ conda install -c conda-forge pint-pandas

and then simply enjoy it!

pint-pandas's Issues

Error applying assignment operators to series

@hgrecco thanks very much for the recent 0.2 release! I was already making use of the new #47 by installing the dev build, but am happy to see it released 🙂

I came across some edge cases when using assignment operators and compound assignment operators (e.g., +=); particularly with empty slices of a series (see the following code).

>>> import pandas as pd
>>> import pint
>>> import pint_pandas
>>> import pytest
>>> 
>>> 
>>> ureg = pint.get_application_registry()
>>> 
>>> pint_pandas.show_versions()
{'numpy': '1.20.1', 'pandas': '1.1.5', 'pint': '0.17', 'pint_pandas': '0.2'}
>>> durations = pd.Series(1.0, index=range(5))
>>> durations.loc[:] += 1  # OK
>>> durations.loc[[]] += 1  # OK
>>> durations.loc[:] = 1  # OK
>>> durations.loc[[]] = 1  # OK
>>> 
>>> durations = pd.Series(1.0, index=range(5), dtype="pint[meter]")
>>> 
>>> durations[:] += 1 * ureg.meter  # OK
>>> with pytest.raises(IndexError):
...     durations[[]] += 1 * ureg.meter  # Fails: `IndexError: list index out of range`
>>> with pytest.raises(TypeError):
...     durations[:] = 1 * ureg.meter  # Fails: `TypeError: object of type 'int' has no len()`
>>> with pytest.raises(TypeError):
...     durations[[]] = 1 * ureg.meter  # Fails: `TypeError: object of type 'int' has no len()`

As shown in the example above, both the = operator and the += operator work on a series with dtype=float, whether the slice is empty or not. With a pint array series, however, these operators fail under the following conditions:

Operator	Series	Result
`=`	Non-empty	❌ `TypeError`
`=`	Empty	❌ `TypeError`
`+=`	Non-empty	✔️ OK
`+=`	Empty	❌ `IndexError`

I was able to work around the issue in my case by checking the size of the slice before affected assignment statements, skipping the assignment as needed.
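The size check described above can be sketched as a small helper. This is only an illustration of the workaround, not part of pint-pandas: the name `iadd_nonempty` is hypothetical, and it mirrors the `+=` path only (read, add, write back), skipping the write when the indexer is an empty list-like.

```python
def iadd_nonempty(target, indexer, value):
    """Apply ``target[indexer] += value`` unless ``indexer`` is an empty list-like.

    Returns True if the assignment ran, False if it was skipped.
    """
    # Slices and scalar labels have no len(); only sized indexers are checked.
    if hasattr(indexer, "__len__") and len(indexer) == 0:
        return False
    # Same read-add-write sequence that ``target[indexer] += value`` performs.
    target[indexer] = target[indexer] + value
    return True
```

With the series from the report, `iadd_nonempty(durations, [], 1 * ureg.meter)` would return False without touching the series, while `iadd_nonempty(durations, slice(None), 1 * ureg.meter)` would take the same path as `durations[:] += 1 * ureg.meter`.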

feature request: add possibility to quantify with an external dict

Currently, quantify() can take units from a multi-level column index by passing a level value. It would be handy to also quantify by means of an external dictionary.
For example:

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})  # no specific dtypes, no multi-level
df = df.pint.quantify(by={"a": "cm", "b": "bar"})
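Until something like the proposed `by=` argument exists, a similar effect can probably be had by turning the unit dictionary into a dtype mapping. The helper below is a hypothetical sketch (the name `pint_dtypes` is not part of pint-pandas); it only builds the `pint[...]` dtype strings already used in the README.

```python
def pint_dtypes(units):
    """Map column names to pint dtype strings, e.g. {"a": "cm"} -> {"a": "pint[cm]"}."""
    return {column: f"pint[{unit}]" for column, unit in units.items()}
```

Assuming `pint_pandas` has been imported so that the `pint[...]` dtypes are registered, the mapping could then be passed to pandas: `df = df.astype(pint_dtypes({"a": "cm", "b": "bar"}))`.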

Problems plotting

import pandas as pd
import pint

ureg = pint.UnitRegistry()

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})
print(df)

df.plot()

yields:

Traceback (most recent call last):
  File "/Users/grecco/Documents/code/simbio/examples/test_plot.py", line 13, in <module>
    df.plot()
  File "/Users/grecco/anaconda3/envs/sci38/lib/python3.8/site-packages/pandas/plotting/_core.py", line 847, in __call__
    return plot_backend.plot(data, kind=kind, **kwargs)
  File "/Users/grecco/anaconda3/envs/sci38/lib/python3.8/site-packages/pandas/plotting/_matplotlib/__init__.py", line 61, in plot
    plot_obj.generate()
  File "/Users/grecco/anaconda3/envs/sci38/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 261, in generate
    self._compute_plot_data()
  File "/Users/grecco/anaconda3/envs/sci38/lib/python3.8/site-packages/pandas/plotting/_matplotlib/core.py", line 410, in _compute_plot_data
    raise TypeError("no numeric data to plot")
TypeError: no numeric data to plot

Any ideas how to fix this without monkey patching pandas? Any way to declare what is a "numeric type"?

Pandas `apply` function does not work with series containing `PintArray`s?

I noticed that the apply function does not work with PintArray series. The following code demonstrates this:

import pandas as pd
import pint
ureg = pint.UnitRegistry()
pint.PintType.ureg = ureg

def g(x):
    return x+1*ureg.day
df = pd.DataFrame({'A':pd.Series([1,2,3,4], dtype='pint[day]'),'B':pd.Series([5,6,7,8], dtype='pint[day]')})
res = df['A'].apply(g)
print(type(df['A'].values))
print(type(res.values))

The output is

<class 'pintpandas.pint_array.PintArray'>
<class 'numpy.ndarray'>

res.values is actually a numpy array containing Quantity elements and a dtype of object. I was looking through the pandas docs for ExtensionArrays and it seems that apply is not mentioned anywhere. Also looked through the code for pintarrays and found the following:

PintArray._add_arithmetic_ops()
PintArray._add_comparison_ops()

I have a feeling that apply is not added with those. Does the pandas extension framework purposely exclude apply, or has the pint-pandas project simply not implemented it yet? In my opinion this is a very important feature as frequently operations on columns are more complex than simply performing arithmetic operations or comparisons....

I'd be happy to add this myself if it is relatively straightforward ... I'd probably need some guidance in the right direction though...

How to use with geopandas?

I am interested in using this with Geopandas; are there any specific instructions for integrating pint / pint-pandas with geopandas?

Example on README page doesn't work

import pint must be replaced with import pint_pandas.

Fixing the tests

I'd like to add a PR which fixes #51 but it's not possible at the moment because the tests wouldn't pass, even if I made the right change. More generally, at the moment we can't really do any meaningful pull request reviews because the tests are failing.

@andrewgsavage I think you have the best overview of this, can you see a way to get the tests going again? For example, could we split #48 into two: 1. fix tests, 2. fix offset units? That would help things move and give a clearer sense of what functionality is still not yet implemented (because it's waiting on fixes to e.g. pandas-dev/pandas#35131).

Move to github actions?

As of 2020-06-20 the tests are failing.

Travis doesn't seem happy with our CI, particularly Python 3.7 and 3.8 are failing for some reason. Maybe we should move to github actions as they seem to streamline things in my experience (or at least that was my experience with https://github.com/openscm/openscm-units, see https://github.com/openscm/openscm-units/tree/master/.github/workflows).

Thoughts @hgrecco @andrewgsavage

Support Pandas 1.0.0

Pandas has released version 1.0.0. Therefore we should be able to build against a stable API now.

Very long runtimes for DataFrames with units

Using pint dtypes increases run time significantly compared to standard pandas DataFrames.

Example:

import numpy as np
import pandas as pd
import pint_pandas


def make_df(size, pint_units=True, dtype=float):
    if pint_units:
        dist_unit ='pint[m]'
        time_unit = 'pint[s]'
    else:
        dist_unit = dtype
        time_unit = dtype
    return pd.DataFrame(
        {'distance': pd.Series(np.arange(1, size + 1, dtype=dtype), dtype=dist_unit),
         'time':  pd.Series(np.arange(1, size + 1, dtype=dtype), dtype=time_unit)
        }
    )


n = 1_000_000
df_pint = make_df(n)
df = make_df(n, pint_units=False)

Now, time in a Notebook:

tp = %timeit -o df_pint['speed'] = df_pint['distance'] / df_pint[('time')]
822 ms ± 12 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

t = %timeit -o df['speed'] = df.distance * df.time
3.61 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

tp.average / t.average
227.46385324897062

The unit version takes more than 200x as long as the DataFrame without units. Am I doing something wrong? The computation should be done by NumPy in the background. Therefore, I would expect very little performance difference.

PintArray should implement Series._reduce

PintArray is missing an implementation of pd.Series._reduce:
https://github.com/pandas-dev/pandas/blob/cf25c5cd3ab1f647f780167df622c74b737fa8f5/pandas/core/series.py#L3709, to support reduction functions like min/max/sum/prod.

I'm pretty psyched about this new EA integration with pandas, I've wanted something like this for a long time... good job! 👍

Implement numpy __array_ufunc__

Some numpy functions, like np.absolute, when applied to a Series containing a PintArray, return a series with object dtype.

Implementing numpy's __array_ufunc__ would allow functions like np.absolute to return a series containing a PintArray.

Current behaviour

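import numpy as np
import pandas as pd
import pint
import pint_pandas

PA_ = pint_pandas.PintArray  # assumed alias; PA_ is not defined in the original snippet
pa = PA_([2, 3.5], dtype="in")
s = pd.Series(pa)
np.absolute(s)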

0    2 inch
1    3 inch
dtype: object

Desired behaviour

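import numpy as np
import pandas as pd
import pint
import pint_pandas

PA_ = pint_pandas.PintArray  # assumed alias; PA_ is not defined in the original snippet
pa = PA_([2, 3.5], dtype="in")
s = pd.Series(pa)
np.absolute(s)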

0    2
1    3
dtype: pint[inch]

Display of decimal places

How do I control the number of decimal places that are displayed when Pint objects with float magnitudes are displayed in a Pandas DataFrame?

Plan

I saw a nice example of a duck typed extension array and thought it would be applicable to pint.
A PeriodArray stores its frequency in its PeriodDtype,

pd.core.dtypes.dtypes.PeriodDtype(freq = "M")
period[M]

So I've made a branch where PintArray stores its unit in PintType, which works nicely:

import pandas as pd
import pintpandas

df=pd.DataFrame({
        "length":pd.Series([1,2],dtype="pint[m]"),
        "area"  :pd.Series([1,4],dtype="pint[m^2]"),
             } )
print(df)
  length area
0      1    1
1      2    4


df.dtypes
length         pint[meter]
area      pint[meter ** 2]
dtype: object


df.length
0    1
1    2
Name: length, dtype: pint[meter]

Previously the quantity containing a 1d array was stored in PintArray._data. Now that the unit is in the EAtype, ._data can store the magnitudes like most other EAs do, making the implementation more relatable. I'd prefer to use this duck typed version in the future.

That leaves three versions which it'd be good to have history for (although that could just be in my repo?)

pint/master, failing tests
andrewgsavage/pint/uprev_pandas, passing all but one test (test needs redefining)
andrewgsavage/pint-pandas/duck_typed, passing all but same test

Should we push each of those versions to this repo in that order to maintain the history/review changes?

dequantify with dataframes of mixed types

I have a dataframe where I have a couple of columns of PintArray types and a couple of boolean columns.
I need to be able to dequantify the columns so I can save everything to a csv, but the .pint.dequantify()
causes an error.

File "..\lib\site-packages\pint_pandas\pint_array.py", line 730, in dequantify
    df_columns["units"] = [
File "..\lib\site-packages\pint_pandas\pint_array.py", line 731, in <listcomp>
    formatter_func(df[col].values.units) for col in df.columns
AttributeError: 'numpy.ndarray' object has no attribute 'units'

Is there a way to get pint_pandas to skip the non-pintarray columns?
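One possible way to skip the non-pint columns is to split the frame by dtype first and dequantify only the pint part. The helper below is a hypothetical sketch, not a pint-pandas function; it assumes that pint dtypes render as strings starting with `pint[`, as in the `df.dtypes` output shown in the README.

```python
def split_pint_columns(dtypes):
    """Split column names into (pint_columns, other_columns) by dtype name.

    ``dtypes`` is any mapping-like object with ``.items()`` yielding
    (column, dtype) pairs, for example ``df.dtypes``.
    """
    pint_cols, other_cols = [], []
    for column, dtype in dtypes.items():
        # Assumption: pint dtypes print as "pint[<unit>]".
        if str(dtype).startswith("pint["):
            pint_cols.append(column)
        else:
            other_cols.append(column)
    return pint_cols, other_cols
```

With the frame from the report, `pint_cols, other_cols = split_pint_columns(df.dtypes)` would allow calling `df[pint_cols].pint.dequantify()` on the unit-bearing columns only and writing `df[other_cols]` out alongside the result.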

Thanks

Why is scalar added to list then immediately removed?

pint-pandas/pintpandas/pint_array.py

Line 244 in 2cf891b

value = value[0]

The intent of this code is unclear. Is this vestigial code that could be removed?

Upload to pypi

Is there anything else we'd like to do before uploading it to pypi?

Cannot operate with Quantity and Quantity of different registries.

Dear all,

I follow the code on the pandas support page and in a last step I try to integrate the flow rate over time:

import pandas as pd
import io
import pint
import pintpandas

ureg = pint.UnitRegistry()

pintpandas.PintType.ureg = ureg

test_data = '''speed,mech power,torque,rail pressure,fuel flow rate,fluid power
rpm,kW,N m,bar,l/min,kW
1000.0,,10.0,1000.0,10.0,
1100.0,,10.0,100000000.0,10.0,
1200.0,,10.0,1000.0,10.0,
1200.0,,10.0,1000.0,10.0,'''

df = pd.read_csv(io.StringIO(test_data),header=[0,1])

df_ = df.pint.quantify(level=-1)

test = df_["fuel flow rate"] * 10 * ureg["minute"]

I get the following error:

ValueError: Cannot operate with Quantity and Quantity of different registries.

Is it linked to a pint / pintpandas update I recently made? Before this code was running without error. (I have pandas 0.25.3, pint 0.11 and pint-pandas 0.1.dev0 installed).

Thanks for your assistance.

Best regards,

data dtype issues

Hi,

[edit: add versions]
python 3.8
pint-pandas: 0.1
pandas: 1.1.3
numpy: 1.19.4
[/edit]

Thanks for this promising piece of software! Playing around with it, I discovered an issue with how the dtype of the data (values) is determined.

Watch the types in the torque column:

import pandas as pd
import pint_pandas as pintpandas

df = pd.DataFrame({
    "torque": pd.Series([1.1, 2, 2.4, 3.6], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})

print(df)

  torque angular_velocity
0    1.1                1
1    2.0                2
2    2.4                2
3    3.6                3

Now, running the same code but with an integer as the first torque item turns the whole torque column into integers:

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2.4, 3.6], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})

print(df)

  torque angular_velocity
0      1                1
1      2                2
2      2                2
3      3                3
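If the magnitude type is indeed taken from the first element, as the two outputs above suggest, one way to sidestep the truncation is to coerce every magnitude to float before building the series. The helper name below is hypothetical and the snippet is only a sketch of that idea.

```python
def as_float_magnitudes(values):
    """Return the magnitudes as a list of floats so that no element is an int."""
    return [float(v) for v in values]
```

The torque column would then be built as `pd.Series(as_float_magnitudes([1, 2, 2.4, 3.6]), dtype="pint[lbf ft]")`.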

Pint-pandas having its own UnitRegistry()

Dear all,

As of today, PintType has its own UnitRegistry(), which of course is incompatible with other registries. I.e.:

df = pd.read_csv('dummy.csv', header=[0,1])
df = df.pint.quantify()

ureg = pint.UnitRegistry()
uregi = df.A.dtype.ureg

df.A+3*ureg.s # fails
df.A+3*uregi.s # works

Since it is possible elsewhere in Pint to have custom registries, it seems more consistent to also allow this in pint-pandas. Would it be possible, for instance, to let df.pint.quantify() take an optional ureg argument? I may contribute when/if I find time (for me this is only recreational programming).

Update to pandas 1.0.4 prior to release

The current status for the tests (run in my computer) is:

failed: 63
passed: 265
ignored: 2

Broken with Pandas 1.2

Pandas 1.2 seems to have broken pint-pandas

/Work/pint-pandas/pint_pandas/pint_array.py in _create_method(cls, op, coerce_to_dtype)
    615             return res
    616 
--> 617         op_name = ops._get_op_name(op, True)
    618         return set_function_name(_binop, op_name, cls)
    619 

AttributeError: module 'pandas.core.ops' has no attribute '_get_op_name'

can we have this uploaded to pypi so we can pip install directly

is there a reason why this is not done?
and we are having to do it with git links and pip?

bug in construction of temperature quantities

For temperature quantities except for those in degK, this error would be raised when constructing:

OffsetUnitCalculusError: Ambiguous operation with offset unit (degC).

The issue is due to the way that multiplication is used for constructing PintArrays.

Rename module

Running import pint-pandas doesn't work. Seems hyphens aren't recommended in module names. Can the module be renamed to pintpandas or similar?

Notebook doesn't work and is a duplicate of the notebook in pint

The notebook here doesn't work. There's an updated version in pint. https://github.com/hgrecco/pint/blob/master/docs/pint-pandas.ipynb

Shall we remove it?

TypeError: object of type 'numpy.float64' has no len()

After merging all the pending PRs and linking to Pint 0.10.1, I am getting this error in many tests:
TypeError: object of type 'numpy.float64' has no len()

ping @andrewgsavage

TypeError: unhashable type: 'numpy.ndarray'

Hi,

I am totally new to programming and have been trying to learn Python on my own since last month. While plotting a graph I get the error "TypeError: unhashable type: 'numpy.ndarray'" and I am not sure why. I searched the web and it seems this error has been reported and answered before; however, the solution does not appear to apply to my case.

Here is the code

import numpy as np
from pandas.plotting import register_matplotlib_converters
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.pylab import rcParams
from datetime import datetime

rcParams["figure.figsize"] = 10,6

dataset = pd.read_csv(r"D:\Data Science\supsales1.csv")  # raw string so backslashes are not treated as escapes
print(dataset)

dataset["Month"].astype(object)
dataset["Sales"].astype(float)

plt.xlabel("Sales")
plt.ylabel("Month")
plt.plot(dataset)

Below are the Error Messages

C:\Users\Welcome\PycharmProjects\Nilesh\venv\Scripts\python.exe C:/Users/Welcome/PycharmProjects/Nilesh/practice.py
Month Sales
0 Jan 951686
1 Feb 2386742
2 Mar 1068704
3 Apr 641233
4 May 1611669
Traceback (most recent call last):
  File "C:/Users/Welcome/PycharmProjects/Nilesh/practice.py", line 18, in <module>
    plt.plot(dataset)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\pyplot.py", line 2811, in plot
    is not None else {}), **kwargs)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\__init__.py", line 1810, in inner
    return func(ax, *args, **kwargs)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\axes\_axes.py", line 1611, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\axes\_base.py", line 393, in _grab_next_args
    yield from self._plot_args(this, kwargs)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\axes\_base.py", line 370, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\axes\_base.py", line 205, in _xy_from_xy
    by = self.axes.yaxis.update_units(y)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\axis.py", line 1473, in update_units
    default = self.converter.default_units(data, self)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\category.py", line 103, in default_units
    axis.set_units(UnitData(data))
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\category.py", line 169, in __init__
    self.update(data)
  File "C:\Users\Welcome\PycharmProjects\Nilesh\venv\lib\site-packages\matplotlib\category.py", line 186, in update
    for val in OrderedDict.fromkeys(data):
TypeError: unhashable type: 'numpy.ndarray'

Process finished with exit code 1

Index loses quantity information

import pandas as pd
import pint

ureg = pint.UnitRegistry()

time = [1., 2., 2., 3.] * ureg.second
df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
}, index=time)

print(df.index)

outputs:

Float64Index([1.0, 2.0, 2.0, 3.0], dtype='float64')

TypeError when calling numpy method on series containing PintArray

Do PintArrays support numpy operations? Take something like

import pandas as pd
import pint
import numpy as np
ureg = pint.UnitRegistry()
pint.PintType.ureg = ureg

def f(x):
    return np.log(x/1*ureg.day)

df = pd.DataFrame({'A':pd.Series([1,2,3,4], dtype='pint[day]'),'B':pd.Series([5,6,7,8], dtype='pint[day]')})
f(df['A'])

I get an error:

TypeError: loop of ufunc does not support argument 0 of type Quantity which has no callable log method

Whereas:

np.log10(1*ureg.day/(1*ureg.day))

Works.
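Independently of whether ufuncs are supported on the series, the two expressions above are not equivalent: Python evaluates `x/1*ureg.day` left to right as `(x/1)*ureg.day`, whereas the working example divides by the parenthesised product. The sketch below shows the difference with plain floats standing in for the quantity and the unit; it does not by itself explain the TypeError.

```python
x, day = 8.0, 2.0  # plain floats standing in for a quantity and a unit

as_written = x / 1 * day      # parsed as (x / 1) * day
as_intended = x / (1 * day)   # divides by the whole product

print(as_written, as_intended)
```

With real units, the first form would carry day squared rather than being dimensionless, so `f` would likely need `x / (1 * ureg.day)` before the logarithm is meaningful.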

conda-forge install doesn't work

I see instructions on the project homepage to install from conda-forge. But the project isn't listed on the conda-forge website (https://conda-forge.org/feedstocks/), and copying the command string into my Anaconda prompt results in a PackagesNotFoundError.

Can you make PintType.ureg settable?

Would it make sense to have PintType.ureg use pint.get_application_registry() to allow using a common UnitRegistry? In the past I have just done PintType.ureg = myUREG, but that gets overridden if pint_array gets imported elsewhere. It would be great to be able to guarantee that all PintTypes would (or could) use a common UnitRegistry.

cumsum() changes dtype to object

2019-12-28 00:00:52.470569300+00:00     -661212.7629936477
2019-12-28 00:00:56.358419200+00:00    -1383400.0425049507
2019-12-28 00:00:58.889932700+00:00     -894082.3358455619
2019-12-28 00:01:01.422784900+00:00     -874452.8101116775
                                              ...         
2019-12-28 23:57:23.719467400+00:00                    0.0
2019-12-28 23:58:00.080144500+00:00      2151.174441538055
2019-12-28 23:58:02.064671800+00:00                    0.0
2019-12-28 23:59:39.443425300+00:00      5761.132695684438
2019-12-28 23:59:41.927944600+00:00                    0.0
Length: 12928, dtype: pint[joule]

i.cumsum()

2019-12-28 00:00:52.470569300+00:00     -661212.7629936477 joule
2019-12-28 00:00:56.358419200+00:00    -2044612.8054985984 joule
2019-12-28 00:00:58.889932700+00:00    -2938695.1413441603 joule
2019-12-28 00:01:01.422784900+00:00     -3813147.951455838 joule
                                                 ...            
2019-12-28 23:57:23.719467400+00:00    -4767888585.5850525 joule
2019-12-28 23:58:00.080144500+00:00     -4767886434.410611 joule
2019-12-28 23:58:02.064671800+00:00     -4767886434.410611 joule
2019-12-28 23:59:39.443425300+00:00     -4767880673.277915 joule
2019-12-28 23:59:41.927944600+00:00     -4767880673.277915 joule
Length: 12928, dtype: object

This is likely caused by np.cumsum on a PintArray resulting in a numpy array of Quantity objects because it falls back to a loop of + operations?

My manual workaround is:
my_series.to_frame('power').pint.dequantify().cumsum().pint.quantify()

Pint creating a second registry

Hi. I don't know if this is an issue/bug....but the following code is giving inconsistent registry results....

If I run it after a kernel restart, Pint only creates one registry. The output looks something like:

Constants Dictionary Unit Registries
g:  140391382456736
cdg:  140391382456736
Fluid Dictionary Unit Registries
fdex:  140391382456736
DataFrame Unit Registries
ID:  140391382456736
V:  140391382456736

If I run the code again, the registries for the last two quantities stay the same, but new registries are created for the first three quantities....

Constants Dictionary Unit Registries
g:  140391416170096
cdg:  140391416170096
Fluid Dictionary Unit Registries
fdex:  140391416170096
DataFrame Unit Registries
ID:  140391382456736
V:  140391382456736

Clearing the variables doesn't have any effect.

Constants Dictionary Unit Registries
g:  140391416999456
cdg:  140391416999456
Fluid Dictionary Unit Registries
fdex:  140391416999456
DataFrame Unit Registries
ID:  140391382456736
V:  140391382456736

The only way I get back to one registry is to restart the kernel.

It's probably just something I'm doing wrong....but I've searched all the help/examples I can find and haven't stumbled upon something to explain it.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from __future__ import print_function, absolute_import, division
import pandas as pd
import pint # Units
import pint_pandas as ppi

# Unit Registry
ureg = pint.UnitRegistry(auto_reduce_dimensions=False)
pint.set_application_registry(ureg)
ppi.PintType.ureg = ureg
ppi.PintType.ureg.default_format = "~P"

# Constants and Parameters
#Acceleration Due to Gravity
g = 1.0 * ureg.g_0
g.ito(ureg.foot / ureg.second**2 ) # ft/s2

print('Constants Dictionary Unit Registries')
print('g: ', id(g._REGISTRY))

# Constants Dict
const_dict = {}
const_dict['g'] = g
const_dict['SL_LB'] = 1.0*ureg.slug/(1.0*ureg.slug).to(ureg.pound) # slug/lb
const_dict['rho'] = 68.48 * ureg.pound / ureg.foot**3  #lbMass/ft^3
const_dict['eta'] = 0.6644 * ureg.centipoise

# Check registry of constants dictionary
cdg = const_dict['g']
print('cdg: ', id(cdg._REGISTRY))

# Fluid Dict
fluid_dict = {}
fluid_dict['Dens_SL'] = const_dict['rho'] * const_dict['SL_LB'] # slugs/ft3
fluid_dict['DynVisc_LBFT2'] = const_dict['eta'].to(ureg.force_pound * ureg.second / ureg.foot**2) #lbF-s/ft2

# Check registry of fluid dictionary
fdex = fluid_dict['Dens_SL']
print('Fluid Dictionary Unit Registries')
print('fdex: ', id(fdex._REGISTRY))

# Small segment of the input data
Dij = [4.0]*4
Lij = [2000.0]*4
data = {'Dij': Dij, 'Lij': Lij}
arc = pd.DataFrame(data=data)

# Put Data into DataFrame
df = pd.DataFrame({
        "Dij": ppi.PintArray(arc['Dij'], dtype=ureg.inch),
        "Lij": ppi.PintArray(arc['Lij'], dtype=ureg.foot)
    }, index=arc.index)

print('DataFrame Unit Registries')
ID = df.at[0,'Dij']
print('ID: ', id(ID._REGISTRY))

v = list(range(1,3))
vels = ppi.PintArray(v, dtype=ureg.foot/ureg.second)

# Check registry of Pint Array content
vt=vels[0]
print('V: ', id(vt._REGISTRY))

vvels = [vels]*len(arc)
vv = list(zip(arc.index.values, vvels))
vd = dict(vv)
pwdf = pd.DataFrame(vd, index=v)
pwdf2 = pwdf.T

# Breaks here....
#for y in vels:
#    ReNumb = y * ID * fluid_dict['Dens_SL'] / fluid_dict['DynVisc_LBFT2']
#    ReNumb.ito_reduced_units()
#    print(ReNumb)

Documentation page doesn't render correctly, shows source code of Jupyter notebook

Browsing to https://pint.readthedocs.io/en/stable/pint-pandas.html with Firefox 89 shows the following:

Instead of showing the marked up content, it appears to show the source code of a Jupyter notebook.

`sqrt` doesn't work on PintArrays

The following code fails in the last line:

import numpy as np
import pandas as pa
import pint_pandas

x = pint_pandas.PintArray(np.arange(4),"m")
print(np.sqrt(x.quantity))
print(np.sqrt(x))

Difficulty reproducing readme example

Hi,

I was very excited to try pint-pandas, but I have some trouble reproducing the example from your readme. I tried installing pint-pandas with both conda and pip (not at the same time, of course :)). Would you have time to give me some pointers?

import pandas as pd
import pint

df = pd.DataFrame({
    "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
    "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
})
df['power'] = df['torque'] * df['angular_velocity']
df.dtypes

outputs:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-62be7ca8289c> in <module>
      3 
      4 df = pd.DataFrame({
----> 5     "torque": pd.Series([1, 2, 2, 3], dtype="pint[lbf ft]"),
      6     "angular_velocity": pd.Series([1, 2, 2, 3], dtype="pint[rpm]"),
      7 })

~\anaconda3\envs\foci_modeling\lib\site-packages\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    282                 data = {}
    283             if dtype is not None:
--> 284                 dtype = self._validate_dtype(dtype)
    285 
    286             if isinstance(data, MultiIndex):

~\anaconda3\envs\foci_modeling\lib\site-packages\pandas\core\generic.py in _validate_dtype(cls, dtype)
    342         """ validate the passed dtype """
    343         if dtype is not None:
--> 344             dtype = pandas_dtype(dtype)
    345 
    346             # a compound dtype

~\anaconda3\envs\foci_modeling\lib\site-packages\pandas\core\dtypes\common.py in pandas_dtype(dtype)
   1797     # raise a consistent TypeError if failed
   1798     try:
-> 1799         npdtype = np.dtype(dtype)
   1800     except SyntaxError as err:
   1801         # np.dtype uses `eval` which can raise SyntaxError

TypeError: data type 'pint[lbf ft]' not understood

My env on Windows 10:

# packages in environment at C:\Users\Christine\anaconda3\envs\foci_modeling:
#
# Name                    Version                   Build  Channel
argon2-cffi               20.1.0           py37hcc03f2d_2    conda-forge
asttokens                 2.0.4                    pypi_0    pypi
async_generator           1.10                       py_0    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
autopep8                  1.5.5              pyh44b312d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
bleach                    3.3.0              pyh44b312d_0    conda-forge
brotlipy                  0.7.0           py37hcc03f2d_1001    conda-forge
ca-certificates           2020.12.5            h5b45459_0    conda-forge
cachecontrol              0.12.6                   pypi_0    pypi
cachetools                4.2.1                    pypi_0    pypi
certifi                   2020.12.5        py37h03978a9_1    conda-forge
cffi                      1.14.5           py37hd8e9650_0    conda-forge
chardet                   4.0.0            py37h03978a9_1    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
cryptography              3.4.7            py37h20c650d_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
decorator                 4.4.2                      py_0    conda-forge
defusedxml                0.6.0                      py_0    conda-forge
entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
executing                 0.5.4                    pypi_0    pypi
firebase-admin            4.5.1                    pypi_0    pypi
freetype                  2.10.4               h546665d_1    conda-forge
google-api-core           1.26.0                   pypi_0    pypi
google-api-python-client  1.12.8                   pypi_0    pypi
google-auth               1.26.1                   pypi_0    pypi
google-auth-httplib2      0.0.4                    pypi_0    pypi
google-cloud-core         1.6.0                    pypi_0    pypi
google-cloud-firestore    2.0.2                    pypi_0    pypi
google-cloud-storage      1.36.0                   pypi_0    pypi
google-crc32c             1.1.2                    pypi_0    pypi
google-resumable-media    1.2.0                    pypi_0    pypi
googleapis-common-protos  1.52.0                   pypi_0    pypi
grpcio                    1.35.0                   pypi_0    pypi
httplib2                  0.19.0                   pypi_0    pypi
icecream                  2.1.0                    pypi_0    pypi
icu                       68.1                 h0e60522_0    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
importlib-metadata        3.4.0            py37h03978a9_0    conda-forge
importlib_metadata        3.4.0                hd8ed1ab_0    conda-forge
importlib_resources       5.1.2            py37h03978a9_0    conda-forge
intel-openmp              2020.3             h57928b3_311    conda-forge
ipykernel                 5.3.4            py37h5ca1d4c_0    anaconda
ipython                   7.20.0           py37heaed05f_2    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.6.3              pyhd3deb0d_0    conda-forge
jedi                      0.18.0           py37h03978a9_2    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h8ffe710_0    conda-forge
jsonschema                3.2.0                      py_2    conda-forge
jupyter                   1.0.0            py37h03978a9_6    conda-forge
jupyter_client            6.1.11             pyhd8ed1ab_1    conda-forge
jupyter_console           6.2.0                      py_0    conda-forge
jupyter_core              4.7.1            py37h03978a9_0    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
kiwisolver                1.3.1            py37h8c56517_1    conda-forge
lcms2                     2.12                 h2a16943_0    conda-forge
libblas                   3.9.0                     8_mkl    conda-forge
libcblas                  3.9.0                     8_mkl    conda-forge
libclang                  11.0.1          default_h5c34c98_1    conda-forge
liblapack                 3.9.0                     8_mkl    conda-forge
libpng                    1.6.37               h1d00b33_2    conda-forge
libsodium                 1.0.18               h8d14728_1    conda-forge
libtiff                   4.2.0                hc10be44_0    conda-forge
lz4-c                     1.9.3                h8ffe710_0    conda-forge
m2w64-gcc-libgfortran     5.3.0                         6    conda-forge
m2w64-gcc-libs            5.3.0                         7    conda-forge
m2w64-gcc-libs-core       5.3.0                         7    conda-forge
m2w64-gmp                 6.1.0                         2    conda-forge
m2w64-libwinpthread-git   5.0.0.4634.697f757               2    conda-forge
markupsafe                1.1.1            py37hcc03f2d_3    conda-forge
matplotlib                3.3.4            py37h03978a9_0    conda-forge
matplotlib-base           3.3.4            py37h3379fd5_0    conda-forge
mistune                   0.8.4           py37hcc03f2d_1003    conda-forge
mkl                       2020.4             hb70f87d_311    conda-forge
msgpack                   1.0.2                    pypi_0    pypi
msys2-conda-epoch         20160418                      1    conda-forge
multipledispatch          0.6.0                      py_0    conda-forge
natsort                   7.1.1              pyhd8ed1ab_0    conda-forge
nbclient                  0.5.2              pyhd8ed1ab_0    conda-forge
nbconvert                 6.0.7            py37h03978a9_3    conda-forge
nbformat                  5.1.2              pyhd8ed1ab_1    conda-forge
nest-asyncio              1.4.3              pyhd8ed1ab_0    conda-forge
notebook                  6.2.0            py37h03978a9_0    conda-forge
numpy                     1.20.2           py37hcbcd69c_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openssl                   1.1.1k               h8ffe710_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.2.2            py37h08fd248_0    conda-forge
pandas-flavor             0.2.0                      py_0    conda-forge
pandoc                    2.11.4               h8ffe710_0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
parso                     0.8.1              pyhd8ed1ab_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.1.1            py37h96663a1_0    conda-forge
pint                      0.17               pyhd8ed1ab_0    conda-forge
pint-pandas               0.2                      pypi_0    pypi
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
plotly                    4.14.3                     py_0    plotly
prometheus_client         0.9.0              pyhd3deb0d_0    conda-forge
prompt-toolkit            3.0.16             pyha770c72_0    conda-forge
prompt_toolkit            3.0.16               hd8ed1ab_0    conda-forge
proto-plus                1.13.0                   pypi_0    pypi
protobuf                  3.14.0                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pycodestyle               2.6.0              pyh9f0ad1d_0    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pygments                  2.8.0              pyhd8ed1ab_0    conda-forge
pyjanitor                 0.20.14            pyhd8ed1ab_0    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.12.3           py37h03978a9_7    conda-forge
pyqt-impl                 5.12.3           py37hf2a7229_7    conda-forge
pyqt5-sip                 4.19.18          py37hf2a7229_7    conda-forge
pyqtchart                 5.12             py37hf2a7229_7    conda-forge
pyqtwebengine             5.12.1           py37hf2a7229_7    conda-forge
pyrsistent                0.17.3           py37hcc03f2d_2    conda-forge
pysocks                   1.7.1            py37h03978a9_3    conda-forge
python                    3.7.9           h7840368_100_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pywin32                   300              py37hcc03f2d_0    conda-forge
pywinpty                  0.5.7            py37hc8dfbb8_1    conda-forge
pyzmq                     20.0.0           py37h0d95fc2_1    conda-forge
qt                        5.12.9               h5909a2a_3    conda-forge
qtconsole                 5.0.2              pyhd8ed1ab_0    conda-forge
qtpy                      1.9.0                      py_0    conda-forge
requests                  2.25.1             pyhd3deb0d_0    conda-forge
retrying                  1.3.3                      py_2    conda-forge
rsa                       4.7.1                    pypi_0    pypi
scikit-learn              0.24.1           py37heb15398_0    conda-forge
scipy                     1.6.0            py37h6db1a17_0    conda-forge
seaborn                   0.11.0                     py_0    anaconda
send2trash                1.5.0                      py_0    conda-forge
setuptools                49.6.0           py37h03978a9_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sqlite                    3.34.0               h8ffe710_0    conda-forge
statsmodels               0.12.2           py37hda49f71_0    conda-forge
terminado                 0.9.2            py37h03978a9_0    conda-forge
testpath                  0.4.4                      py_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
tk                        8.6.10               h8ffe710_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tornado                   6.1              py37hcc03f2d_1    conda-forge
tqdm                      4.58.0             pyhd8ed1ab_0    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
uritemplate               3.0.1                    pypi_0    pypi
urllib3                   1.26.3                   pypi_0    pypi
vc                        14.2                 hb210afc_3    conda-forge
vs2015_runtime            14.28.29325          h5e1d092_3    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
widgetsnbextension        3.5.1            py37h03978a9_4    conda-forge
win_inet_pton             1.1.0            py37h03978a9_2    conda-forge
wincertstore              0.2             py37h03978a9_1006    conda-forge
winpty                    0.4.3                         4    conda-forge
xarray                    0.17.0             pyhd8ed1ab_0    conda-forge
xlrd                      2.0.1              pyhd8ed1ab_3    conda-forge
xz                        5.2.5                h62dcd97_1    conda-forge
zeromq                    4.3.3                h0e60522_3    conda-forge
zipp                      3.4.0                      py_0    conda-forge
zlib                      1.2.11            h62dcd97_1010    conda-forge
zstd                      1.4.9                h6255e5f_0    conda-forge

Thanks for working on pint-pandas!
-M.

PintArray's `data_dtype` is ignored

PintArray.__init__ ignores the data_dtype argument.

Something like this would fix it:
#56

pint pandas DataFrame throws exception on standard df methods like sum(), min() and max()

Description

When using pint-enabled DataFrames, well-known pandas DataFrame methods such as sum(), min() and max() raise an exception. Example:
TypeError: cannot perform min with type pint[kilogram * meter]

Expected behaviour

Using standard DataFrame methods should work with pint enabled DataFrames, preserving the column units and returning a calculated value that has a pint unit attached to it.

Steps to reproduce

Setting the pint_pandas unit registry to a ureg that can also be used outside the dataframes:

import pandas as pd
import pint
import pint_pandas
u = pint.UnitRegistry()
pint_pandas.PintType.ureg = u

Creating a Pint Pandas DataFrame as usual:

PA_ = pint_pandas.PintArray

df = pd.DataFrame({
        "torque": PA_([1, 2, 2, 3], dtype=u.kg*u.m),
        })

Invoking a Pandas DataFrame method results in an exception:

df['torque'].min()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-488b600ceff2> in <module>
----> 1 df['torque'].min()

/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
  11463         if level is not None:
  11464             return self._agg_by_level(name, axis=axis, level=level, skipna=skipna)
> 11465         return self._reduce(
  11466             func, name=name, axis=axis, skipna=skipna, numeric_only=numeric_only
  11467         )
/usr/local/lib/python3.8/dist-packages/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   4225         if isinstance(delegate, ExtensionArray):
   4226             # dispatch to ExtensionArray interface
-> 4227             return delegate._reduce(name, skipna=skipna, **kwds)
   4228 
   4229         else:
/usr/local/lib/python3.8/dist-packages/pandas/core/arrays/base.py in _reduce(self, name, skipna, **kwargs)
   1145         TypeError : subclass does not define reductions
   1146         """
-> 1147         raise TypeError(f"cannot perform {name} with type {self.dtype}")
   1148 
   1149     def __hash__(self):
TypeError: cannot perform min with type pint[kilogram * meter]

Workaround

It is possible to remove the quantities before the operation and re-attach them to the result value afterwards:
df.pint.dequantify()['torque'].min()[0] * df.torque.pint.units
But this is tedious.

Note: The workaround only works with columns that do not have spaces in the column name.

support for pandas v0.25.0

This version doesn't work with pandas v0.25.0.

Dependencies

Related to #6, which of Pint's dependencies do we want to include? As it's simpler, I'd prefer to only support Pint with a full installation (i.e. make the latest numpy and uncertainties dependencies too) rather than support all the different combinations of numpy versions and other libraries (hence move away from the way things are done in Pint).

Python versions to support

As it says in the title. I think the things we have to consider are:

versions supported by Pint (many)
versions supported by Pandas (2.7, 3.5+)
each extra Python version we support makes things trickier

My preference would be to only support the same Python versions as Pandas, thoughts @hgrecco and @andrewgsavage?

.apply not working "properly"

In our project we make heavy use of DataFrame.apply and Series.apply. Using these on pint-typed data does not set
the dtype of the result properly. A simple workaround is to call apply as follows:

df = pd.DataFrame([2, 3], dtype="pint[W]")
df["new"] = df.apply(lambda x: x[0]).astype("pint[W]")

This requires me to set the unit manually, which I'd like to avoid.
As a workaround we patch pandas' apply functions with these functions:

def df_apply(manual_self, *args, **kwargs):
    """
    A pint friendly version of pandas DataFrame.apply.

    Normally `pd.DataFrame.apply` would not set the dtype for the result properly.
    """

    res = PintApply.original_df_apply(manual_self, *args, **kwargs)
    if isinstance(res, pd.DataFrame):
        cols_with_units = [hasattr(res[col][0], "units") for col in res]
        if all(cols_with_units):
            types = {col: f"pint[{res[col][0].units}]" for col in res}
            magnitudes = res.applymap(lambda x: x.magnitude)
            res = magnitudes.astype(types)
            return res
        elif any(cols_with_units):
            raise Exception(
                "This DataFrame contains pint and none pint values - don't mix!"
            )
    elif isinstance(res, pd.Series):
        if hasattr(res[0], "units"):
            unit = res[0].units
            magnitude = res.transform(lambda x: x.magnitude)
            if str(unit) == "":
                return magnitude.astype("pint[dimensionless]")
            return magnitude.astype(f"pint[{unit}]")
    return res

@staticmethod
def series_apply(manual_self, *args, **kwarg):
    """
    A pint friendly version of pandas Series.apply.

    Normally `pd.Series.apply` would not set the dtype for the result properly.
    """

    res = PintApply.original_series_apply(manual_self, *args, **kwarg)
    if hasattr(res[0], "units"):
        unit = res[0].units
        magnitude = res.transform(lambda x: x.magnitude)
        if str(unit) == "":
            return magnitude.astype("pint[dimensionless]")
        return magnitude.astype(f"pint[{unit}]")
    return res

If this is a solution you'd like to have in pint_pandas I'll prepare a PR; if it's a non-issue I'll be happy to learn
about a better solution.

Shall we release a new version?

My idea is to release a new version by the end of the week

Can we close any additional issues/PRs by then?
Shall we wait for any particular issue/PR?

License

I suggest adding a license to this repo (e.g. the same as in Pint).

Unit Conversions based on updated registry

I've updated the registry with a new quantity

Q_ = ureg.Quantity
ureg.define('sacks = 94 * pounds = sack')

[WORKS FINE]
lbs = Q_(94, 'pounds')

print(lbs.to('sacks'))

[WORKS]

yy=Q_(0.000961390592872627,'m**3/kg')

yy.to('ft**3/sack')

[Doesn't Work]
[This part works]
df=pd.DataFrame({'test':pd.Series([0.000961390592872627], dtype="pint[m**3/kg]")})

[DOESN'T WORK]
df['test'][0].to('ft**3/sacks')

UndefinedUnitError: 'sacks' is not defined in the unit registry

Plotting of pandas dataframe

The matplotlib plotting support of pint as shown in the documentation is more or less working (cf. hgrecco/pint#760 - remove the axhline and axvline calls to get it working). However, if a pandas dataframe is created using a PintArray as shown in the docs, there is another issue:

import pint
import pandas as pd
PA_= pint.PintArray
ureg=pint.UnitRegistry()
ureg.setup_matplotlib()
df = pd.DataFrame({
        "length" : pd.Series([1,2], dtype="pint[m]"),
        "width" : PA_([2,3], dtype="pint[m]")
    })
df.length.plot()

This yields a TypeError: Empty 'DataFrame': no numeric data to plot . The same error appears if df.plot() is issued.

A more matplotlib oriented call:

plt.plot('length', data=df)

yields a different error: IndexError: tuple index out of range

My setup is:

Anyway, thanks for your great work! I really appreciate using pint in my projects.

For completeness, here are the full tracebacks of the errors above:

`df.length.plot()`

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-86-98a16b4be98e> in <module>
      8         "width" : PA_([2,3], dtype="pint[m]")
      9     })
---> 10 df.plot()

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\pandas\plotting\_core.py in __call__(self, x, y, kind, ax, subplots, sharex, sharey, layout, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, secondary_y, sort_columns, **kwds)
   2937                           fontsize=fontsize, colormap=colormap, table=table,
   2938                           yerr=yerr, xerr=xerr, secondary_y=secondary_y,
-> 2939                           sort_columns=sort_columns, **kwds)
   2940     __call__.__doc__ = plot_frame.__doc__
   2941 

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\pandas\plotting\_core.py in plot_frame(data, x, y, kind, ax, subplots, sharex, sharey, layout, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, secondary_y, sort_columns, **kwds)
   1968                  yerr=yerr, xerr=xerr,
   1969                  secondary_y=secondary_y, sort_columns=sort_columns,
-> 1970                  **kwds)
   1971 
   1972 

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\pandas\plotting\_core.py in _plot(data, x, y, subplots, ax, kind, **kwds)
   1796         plot_obj = klass(data, subplots=subplots, ax=ax, kind=kind, **kwds)
   1797 
-> 1798     plot_obj.generate()
   1799     plot_obj.draw()
   1800     return plot_obj.result

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\pandas\plotting\_core.py in generate(self)
    247     def generate(self):
    248         self._args_adjust()
--> 249         self._compute_plot_data()
    250         self._setup_subplots()
    251         self._make_plot()

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\pandas\plotting\_core.py in _compute_plot_data(self)
    362         if is_empty:
    363             raise TypeError('Empty {0!r}: no numeric data to '
--> 364                             'plot'.format(numeric_data.__class__.__name__))
    365 
    366         self.data = numeric_data

TypeError: Empty 'DataFrame': no numeric data to plot

`plt.plot('length', data=df)`

IndexError                                Traceback (most recent call last)
<ipython-input-80-1c9c8127f2cc> in <module>
----> 1 plt.plot('length', data=df)

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\matplotlib\pyplot.py in plot(scalex, scaley, data, *args, **kwargs)
   2809     return gca().plot(
   2810         *args, scalex=scalex, scaley=scaley, **({"data": data} if data
-> 2811         is not None else {}), **kwargs)
   2812 
   2813 

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\matplotlib\__init__.py in inner(ax, data, *args, **kwargs)
   1808                         "the Matplotlib list!)" % (label_namer, func.__name__),
   1809                         RuntimeWarning, stacklevel=2)
-> 1810             return func(ax, *args, **kwargs)
   1811 
   1812         inner.__doc__ = _add_data_doc(inner.__doc__,

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\matplotlib\axes\_axes.py in plot(self, scalex, scaley, *args, **kwargs)
   1609         kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D._alias_map)
   1610 
-> 1611         for line in self._get_lines(*args, **kwargs):
   1612             self.add_line(line)
   1613             lines.append(line)

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\matplotlib\axes\_base.py in _grab_next_args(self, *args, **kwargs)
    391                 this += args[0],
    392                 args = args[1:]
--> 393             yield from self._plot_args(this, kwargs)
    394 
    395 

D:\WinPython-64bit-3.7.2.0\python-3.7.2.amd64\lib\site-packages\matplotlib\axes\_base.py in _plot_args(self, tup, kwargs)
    376             func = self._makefill
    377 
--> 378         ncx, ncy = x.shape[1], y.shape[1]
    379         if ncx > 1 and ncy > 1 and ncx != ncy:
    380             cbook.warn_deprecated("2.2", "cycling among columns of inputs "

IndexError: tuple index out of range

Seems like this should work but UndefinedUnitError

Following along with the documentation of pint-pandas and pint as best I could, I have an implementation which does not seem to want to work. This reproduces my UndefinedUnitError.

python: 3.8.8
pint: 0.17
pint-pandas: 0.2

import pandas as pd
import pint
import pint_pandas
ureg = pint.UnitRegistry()
ureg.load_definitions('pint_unit_definitions.txt')
pint.set_application_registry(ureg)


df = pd.DataFrame([[4,5,6],[1,3,4]], dtype='pint[bpm]' )

My pint_unit_definitions.txt file looks like this:

minute = 60 * second = min
beats_per_minute = beat / minute = bpm
hertz = counts / second = hz
beat = [heart_beats] = b

Am I doing something wrong? Thanks!

unstacking and (built-in) dtype

It looks like there is some kind of dtype inference when unstacking data, isn't there? This may be a problem with my environment (a corporate one that I cannot control and that is not really sane), and I am not sure that you will be able to reproduce the following issue. Say you have

>>> pd.DataFrame({"ang_vel": pd.Series([1., 2.1, 2., 3.], dtype="pint[rpm]")}).unstack()
ang_vel  0    1.0
         1    2.1
         2    2.0
         3    3.0
dtype: pint[revolutions_per_minute]

Everything is fine. Now consider the following:

>>> pd.DataFrame({"ang_vel": pd.Series([1, 2.1, 2., 3.], dtype="pint[rpm]")}).unstack()
>>> #                                   ^
ang_vel  0    1
         1    2
         2    2
         3    3
dtype: pint[revolutions_per_minute]

This may even be a desired behaviour 0o. Which I doubt :)

Thx. Pint is amazing anyway.

PintArray fails to fit into a pickle jar

I am wondering if anyone has a quick workaround for pickling a PintArray. I am getting an error when running the following snippet (Python 3.8, pint-pandas 0.2):

import pickle, pint_pandas
pickle.dumps(pint_pandas.PintArray([1.], dtype="m"))

# AttributeError: Can't pickle local object 'build_quantity_class.<locals>.Quantity'

It is also possible I am doing something totally wrong. Any ideas? Thx
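
The error message itself can be reproduced without pint. The standard-library sketch below assumes, based only on the name `build_quantity_class.<locals>.Quantity` in the message, that the Quantity class is created inside a function. pickle stores classes by their importable name, and a class defined inside a function has no such name. All names in the sketch are illustrative.

```python
import pickle


def build_local_class():
    # The class is created inside a function, so its qualified name is
    # "build_local_class.<locals>.Quantity" and pickle cannot look it up.
    class Quantity:
        def __init__(self, value):
            self.value = value

    return Quantity


class ModuleLevelQuantity:
    # Defined at module level, so pickle can normally find it by name.
    def __init__(self, value):
        self.value = value


def can_pickle(obj):
    """Return True if pickle.dumps succeeds for obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


LocalQuantity = build_local_class()
local_ok = can_pickle(LocalQuantity(1.0))         # fails: local class
module_ok = can_pickle(ModuleLevelQuantity(1.0))  # usually succeeds in a script
```

Until the array itself can be pickled, one hedged workaround is to pickle the plain magnitudes together with the unit as a string, and rebuild the PintArray after loading.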