
cftime's People

Contributors

5tefan, aidanheerdegen, andersy005, archangegabriel, barronh, bjlittle, cclauss, cgohlke, ckhroulev, davidhassell, dependabot[bot], djhoese, dopplershift, hetland, huard, jamesp, jswhit, keewis, matthew-brett, mcgibbon, mcuntz, mdecker, ocefpaf, richli, shoyer, spencerkclark, timoroth, zbruick

cftime's Issues

netcdftime cannot be used with netCDF4-python<=1.3.1

As discussed in #32, when netCDF4 is installed its version of netcdftime overrides any installation of this package. It is very difficult to expect scientific python users to have the most up to date version of netCDF4, so up to date that it does not exist yet. This gives very strange behavior that is difficult to debug unless you've already run into it.

There is a really simple fix for this, which is to rename the netcdftime package, or provide an alias package name. This would remove a massive package incompatibility problem.

At the very least, netcdftime should require that these older versions of netCDF4 are not installed. This would allow me to tell users that they need to install netcdftime to get X functionality, with a guarantee that when they do so they will not run into these import problems. One way to do this would be to check in netcdftime whether netCDF4 is installed and, if it is, to add a dependency on netCDF4 having at least the required version.
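
A minimal sketch of such a check, assuming it would run at import or install time. The helper names are hypothetical and not part of netcdftime; the cutoff version is taken from the issue title (netCDF4-python<=1.3.1).

```python
import re
from importlib import metadata

# Assumption taken from the issue title: netCDF4 releases up to and
# including 1.3.1 ship their own netcdftime module.
LAST_BUNDLING_VERSION = (1, 3, 1)


def parse_version(text):
    """Return the first three numeric components of a version string."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def netcdf4_conflicts(version_text):
    """True if this netCDF4 version would shadow a standalone netcdftime."""
    return parse_version(version_text) <= LAST_BUNDLING_VERSION


def check_installed_netcdf4():
    """Raise ImportError if the installed netCDF4 is old enough to conflict."""
    try:
        installed = metadata.version("netCDF4")
    except metadata.PackageNotFoundError:
        return  # netCDF4 is not installed, so nothing can conflict
    if netcdf4_conflicts(installed):
        raise ImportError(
            "netCDF4 %s bundles its own netcdftime; please upgrade netCDF4"
            % installed
        )
```

Calling check_installed_netcdf4() early would turn the confusing import behaviour into an explicit error message.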

Julian Day issue

I am getting strange results with the function cftime.DateFromJulianDay using the Gregorian Calendar:

cftime.DateFromJulianDay(1684958.5,calendar='gregorian')
cftime.DatetimeProlepticGregorian(-100, 3, 2, 23, 59, 59, 999967, -1, 1)

Thus effectively, the 3rd March, -100 BC.
A very small change to the Julian Day gives the 2nd March, -100 BC.

cftime.DateFromJulianDay(1684958.50001,calendar='gregorian')
cftime.DatetimeProlepticGregorian(-100, 3, 2, 0, 0, 0, 863974, -1, 1)

The issue also affects the standard calendar:

cftime.DateFromJulianDay(1684958.5,calendar='standard')
# returns cftime.DatetimeGregorian(-100, 3, 2, 23, 59, 59, 999967, -1, 1)
cftime.DateFromJulianDay(1684958.50001,calendar='standard')
# returns cftime.DatetimeGregorian(-100, 3, 2, 0, 0, 0, 863974, -1, 1)

According to this conversion website (http://aa.usno.navy.mil/jdconverter?ID=AA&jd=1684958.5), 2nd March, -100 BC should be correct.

I think that cftime uses the Jean Meeus algorithm, which should be correct for negative years (as long as the JD is positive).
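
For reference, here is a sketch of the Meeus conversion from Julian Day to calendar date as it is commonly stated. It is not cftime's actual implementation; the function name is hypothetical, and it assumes the mixed calendar (Julian before JD 2299161, Gregorian afterwards) with astronomical year numbering, where year 0 is 1 BC.

```python
import math


def julian_day_to_date(jd):
    """Convert a Julian Day to (year, month, day_with_fraction).

    Years use astronomical numbering (year 0 is 1 BC, year -99 is 100 BC).
    Only non-negative Julian Days are supported.
    """
    if jd < 0:
        raise ValueError("this sketch only handles non-negative Julian Days")
    z = math.floor(jd + 0.5)   # integer part: the day starting at midnight
    f = jd + 0.5 - z           # fractional part of that day
    if z < 2299161:            # before 1582-10-15: Julian calendar
        a = z
    else:                      # Gregorian calendar: apply century correction
        alpha = int((z - 1867216.25) / 36524.25)
        a = z + 1 + alpha - alpha // 4
    b = a + 1524
    c = int((b - 122.1) / 365.25)
    d = int(365.25 * c)
    e = int((b - d) / 30.6001)
    day = b - d - int(30.6001 * e) + f
    month = e - 1 if e < 14 else e - 13
    year = c - 4716 if month > 2 else c - 4715
    return year, month, day


print(julian_day_to_date(1684958.5))
print(julian_day_to_date(2451545.0))
```

With this sketch, JD 1684958.5 comes out as 2 March of astronomical year -99 (100 BC) at 00:00, which agrees with the day of month given by the converter cited above.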

Duplicate def of `date2index`

There are two separate definitions of date2index in _netcdftime.pyx, at lines 286 and 1309. I assume one of these should go away.

different requirements in conda recipe and setup.py

cftime seems to think it requires a different set of packages to run in the conda recipe:

    - python
    - setuptools
    - numpy x.x

and in setup.py (via requirements.txt):

numpy
cython
setuptools>=18.0

The requirement for cython appears to be retained in such a way that a conda environment that includes cftime but not cython, and tries to make use of pkg_resources, gets the error:

...
pkg_resources.DistributionNotFound: The 'cython' distribution was not found and is required by cftime

The error is resolved by installing cython or including it as an (unnecessary) dependency of any conda package that depends on cftime but it would be more convenient if cython were only included in setup.py in setup_requires and not in install_requires.
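
A rough sketch of the split being asked for, written as the keyword arguments that would be passed to setuptools.setup(). The dictionary name is hypothetical and the lists simply mirror the requirements quoted above; this is not the project's actual setup.py.

```python
# Hypothetical keyword arguments for setuptools.setup(). cython is listed
# only as a build-time requirement, so installed metadata would no longer
# claim that cftime needs cython at run time.
SETUP_KWARGS = {
    "setup_requires": ["setuptools>=18.0", "cython"],
    "install_requires": ["numpy", "setuptools>=18.0"],
}

# In setup.py this would be used as: setup(name="cftime", **SETUP_KWARGS)
```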

So far, I haven't been able to create a super simple test that reproduces the issue. I see it when I try to build a conda package from a repo that I'm just starting to develop. Here is where I seem to need to include cython, though I don't think it should be needed:
https://github.com/xylar/misomip1analysis/blob/initial_stub/conda/recipe/meta.yaml#L28

timezone formats not supported correctly

The timezone parsing does not seem to follow the same standard that I can find on Wikipedia (https://en.wikipedia.org/w/index.php?title=ISO_8601&oldid=793312108#Time_zone_designators). Unfortunately, I cannot find any primary ISO source for the exact timezone format, so I have to rely solely on Wikipedia for the time being.
The page referenced in the code (https://github.com/Unidata/netcdftime/blob/1904de5c81fff7e40f3bbb23906dc2db83d7aaa1/netcdftime/_netcdftime.pyx#L33) is also no longer available. But judging from the archived version (https://web.archive.org/web/20161009150515/http://delete.me.uk/2005/03/iso8601.html), it suffers from some, but not all, of the format errors (see below).

What I think is wrong:

  • Colons are optional in the timezone specification (according to Wikipedia). Right now the timezone string "+01" cannot be recognized, even though this as well as "+0100" should be valid.
  • Both hour and minute specification should consist of exactly two numbers - not one OR two. This was actually done right in the JavaScript code quoted as reference. If this is fixed, then there won't be any ambiguities introduced by making the colon optional, either.

Both problems are relatively easy to fix by slightly modifying the regexes and maybe the parser code itself. If we can agree on the above points, I can work on a pull request to correct this.
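
To make the two points above concrete, here is a small sketch of a stricter designator parser. It is an illustration only: the pattern and function name are hypothetical and are not the regexes currently used in the package.

```python
import re

# Either the literal "Z", or a sign followed by exactly two hour digits and,
# optionally, exactly two minute digits with or without a separating colon.
TZ_PATTERN = re.compile(r"^(?:Z|([+-])(\d{2})(?::?(\d{2}))?)$")


def parse_tz_offset(text):
    """Return the UTC offset in minutes for a time zone designator."""
    match = TZ_PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognized time zone designator: %r" % text)
    sign, hours, minutes = match.groups()
    if sign is None:  # the literal "Z" means UTC
        return 0
    total = int(hours) * 60 + int(minutes or 0)
    return -total if sign == "-" else total
```

Because hours and minutes must each be exactly two digits, "+01", "+0100" and "+01:00" all parse to the same offset without ambiguity, while a single-digit form such as "+1" is rejected.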

Windows wheels missing in 1.0.3

1.0.3 is lacking Windows wheels. Releasing a new patch version without releasing the wheels at the same time means that anyone trying to pip install it on Windows without specifying an exact version now sees failures. 1.0.2.1 has Windows wheels, and 1.0.3 as a patch release should be fully compatible with the old version, but this is not the case if the wheels are missing. This is especially annoying since cftime is often pulled in as a transitive dependency, e.g. from the netCDF4 package. So pinning netCDF4 to a fixed version alone doesn't help; I would also have to pin the cftime version to 1.0.2.1 in that case to prevent any possible issues.

The point of this issue is to highlight the problem of release timelines. It's not enough to upload binary wheels days or weeks after the source package was released to PyPI. Ideally it should be an atomic operation where a new version immediately comes with all the wheels.

Don't cythonize in the clean target.

cythonize should not be executed in the clean target; doing so breaks the Debian package build by modifying the source tree outside the build chroot.

The following patch fixes the issue:

--- a/setup.py
+++ b/setup.py
@@ -75,9 +75,14 @@ if FLAG_COVERAGE in sys.argv or os.envir
         sys.argv.remove(FLAG_COVERAGE)
     print('enable: "linetrace" Cython compiler directive')
 
-extension = Extension('{}._{}'.format(NAME, NAME),
-                      sources=[CYTHON_FNAME],
-                      define_macros=DEFINE_MACROS)
+ext_modules = []
+if "clean" not in sys.argv:
+    extension = Extension('{}._{}'.format(NAME, NAME),
+                          sources=[CYTHON_FNAME],
+                          define_macros=DEFINE_MACROS)
+    ext_modules = cythonize(extension,
+                            compiler_directives=COMPILER_DIRECTIVES,
+                            language_level=2)
 
 setup(
     name=NAME,
@@ -89,9 +94,7 @@ setup(
     cmdclass={'clean_cython': CleanCython},
     packages=[NAME],
     version=extract_version(),
-    ext_modules=cythonize(extension,
-                          compiler_directives=COMPILER_DIRECTIVES,
-                          language_level=2),
+    ext_modules=ext_modules,
     setup_requires=load('setup.txt'),
     install_requires=load('requirements.txt'),
     tests_require=load('requirements-dev.txt'))

Use Appveyor for CI on Windows

I think we should probably test this package on Windows. I have no experience setting up Appveyor, so I'm looking for someone who does and wants to see this package supported on Windows.

Stable release

netcdf4 version 1.4.0 uses cftime and it will be released soon on PyPI. (It is already available on GitHub and conda-forge is shipping it.)

It would be nice if we could have a stable release of cftime to accompany netcdf4 1.4.0.

DateFromJulianDay : day is out of range for month

The following was raised in SciTools/cf-units#77:

>>> import cftime
>>> cftime.DateFromJulianDay(2450022.5, "standard")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cftime/_cftime.pyx", line 709, in cftime._cftime.DateFromJulianDay
ValueError: day is out of range for month

I haven't taken the time to dig further into this, but it looks like reasonable input to the function.

When doing a bit of digging, I also tried:

>>> cftime.DateFromJulianDay(2450022.5, "standard", only_use_cftime_datetimes=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cftime/_cftime.pyx", line 709, in cftime._cftime.DateFromJulianDay
  File "cftime/_cftime.pyx", line 1759, in cftime._cftime.DatetimeGregorian.__init__
  File "cftime/_cftime.pyx", line 1903, in cftime._cftime.assert_valid_date
ValueError: invalid day number provided in cftime.DatetimeGregorian(1995, 10, 32, 23, 59, 59, 999952, -1, 1)

dayofyr and dayofwk after cftime.datetime.replace

#66 enabled working dayofyr and dayofwk attributes for all cftime dates constructed manually, which is great!

I've noticed that if one constructs datetime objects using replace, the dayofyr and dayofwk attributes are not automatically updated:

In [21]: date = cftime.DatetimeNoLeap(1, 2, 1)

In [22]: date.dayofyr
Out[22]: 32

In [23]: date.replace(year=2, month=5).dayofyr
Out[23]: 32

A workaround is to specify dayofwk=-1 within replace:

In [24]: date.replace(year=2, month=5, dayofwk=-1).dayofyr
Out[24]: 121

Is this intentional? Should we not pass down the old dayofyr and dayofwk attributes to the new date object in replace? I'm happy to provide a PR to change this behavior, if desired.
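
For reference, the expected values in the session above can be checked by hand. The sketch below computes the day of year for a 365-day ("noleap") calendar; the helper name is hypothetical and is not cftime API.

```python
# Days in each month of a 365-day ("noleap") calendar.
DAYS_IN_MONTH_NOLEAP = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def day_of_year_noleap(month, day):
    """Return the 1-based day of year for a date in a noleap calendar."""
    return sum(DAYS_IN_MONTH_NOLEAP[:month - 1]) + day


print(day_of_year_noleap(2, 1))  # 1 February -> 32
print(day_of_year_noleap(5, 1))  # 1 May -> 121
```

So 32 is right for the original date, and 121 is what one would expect after replacing the month with 5.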

ECCN for netcdftime

If you have already classified this software with ECCN, please confirm the applicable number.

If you do not have your software classified with an ECCN, please kindly answer the following questions so that we may self-assess:

1. Does the Software perform any encryption or utilize any encryption processes? Y/N
2. If the answer is YES to question 1, please indicate if the encryption is coded into the application or separately called (such as using SSL)
3. If the answer is YES to question 1, please indicate what function(s) the cryptography/encryption serves

A. Copyright protection purposes (includes using a license key/code)
B. User authentication purposes
C. A core part of the functionality, such as to encrypt databases
D. To encrypt communications between the software and a host system

Any differences between datetime and phony datetimes?

I've written a PR for Sympl (code) to start integrating netcdftime datetime-like objects into Sympl models. Since the documentation is quite sparse and a little vague, I thought I should ask if there are any differences between the built-in datetime and netcdftime's datetime-like objects that I should be aware of and warn users about? You're welcome to comment on the PR directly.

I was also confused as to why there was a DatetimeProlepticGregorian to override the built-in datetime, so I've used the built-in one instead for that calendar option. Any insight into whether this is a good idea would be appreciated.

Partial datetime string parsing in cftime

Recently in the xarray mailing list (here) and in aospy (here) the need has arisen for the partial datetime string parsing that is currently implemented in xarray as private API, which we therefore don't want to rely on. @spencerkclark argued, and I think I agree, that this functionality is outside the scope of xarray in terms of making it public API.

Separately, @mcgibbon raised the notion of porting it to cftime, an idea I also like. Does this seem reasonable? It seems like such a common datetime-related need that it would fit well within the scope of cftime.

CC: @shoyer for any thoughts from the xarray side

new release?

There have been 134 commits since the release of cftime 1.0.0 in May.

Given the large number of recent bugfixes and feature additions, perhaps a new release soon would be appropriate?

Thanks to everyone who works on this important package.

Problem with masked array

Not sure what's going on here:

import numpy as np
from cftime import num2date
num2date(np.ma.array([0., 1., 2.]), 'hours since 1980-06-03 12:00')

gives

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-56-fbdbe6e3afd0> in <module>()
      3 time_units = 'hours since 1980-06-03 12:00'
      4 
----> 5 num2date(np.ma.array([0., 1., 2.]), time_units)

cftime/_cftime.pyx in cftime._cftime.num2date()

cftime/_cftime.pyx in cftime._cftime.utime.num2date()

~/miniconda3/envs/py36/lib/python3.6/site-packages/numpy/core/fromnumeric.py in reshape(a, newshape, order)
    277            [5, 6]])
    278     """
--> 279     return _wrapfunc(a, 'reshape', newshape, order=order)
    280 
    281 

~/miniconda3/envs/py36/lib/python3.6/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     49 def _wrapfunc(obj, method, *args, **kwds):
     50     try:
---> 51         return getattr(obj, method)(*args, **kwds)
     52 
     53     # An AttributeError occurs if the object does not have

ValueError: cannot reshape array of size 1 into shape (3,)

It works if I convert the data to a regular array.

Julian Calendar can not calculate day of year.

There's an error in the num2date function when using the Julian calendar in cftime. num2date does not correctly produce the day of year, e.g.:

import cftime
In [6]: cftime.num2date(16,'day since 1950-01-01 00:00:00.0000000',calendar='365_day')
Out[6]: cftime._cftime.DatetimeNoLeap(1950, 1, 17, 0, 0, 0, 0, 1, 17)

In [7]: cftime.num2date(16,'day since 1950-01-01 00:00:00.0000000',calendar='julian')
Out[7]: cftime._cftime.DatetimeJulian(1950, 1, 17, 0, 0, 0, 0, -1, 1)

The day of year (final number in the date output) is calculated correctly in the 365_day calendar, but is set to 1 in the julian calendar.

Question regarding precision of datetime arithmetic

We are working on a cftime-compatible version of pandas' resample functionality in xarray (pydata/xarray#2593). In order to accomplish this, we are making heavy use of the datetime arithmetic functionality in cftime. There are two contexts where this occurs:

  • Adding a datetime.timedelta object to a cftime.datetime object to produce another cftime.datetime object
  • Taking the difference between two cftime.datetime objects to produce a datetime.timedelta object.

It is my understanding that in the first context, integer arithmetic is used, so the result is microsecond-precise; in the second context, however, things follow a different code path (using date2num) and arithmetic is not necessarily exact (in my tests on a Mac this issue seems to have mostly gone away in cftime version 1.0.3.4; however @jwenfai has found on Windows that issues persist).

We are finding that having an exact result for the difference between two datetimes is important for our new use-case. One potential solution is to write a function that computes exact datetime differences.

Here is a potential function I came up with. Do you think that this would be a safe workaround for this issue, or might there be issues I'm not anticipating? Thanks for your help.

from datetime import timedelta

def exact_cftime_datetime_difference(a, b):
    """Exact computation of b - a

    Assumes:

        a = a_0 + a_m
        b = b_0 + b_m

    Here a_0, and b_0 represent the input dates rounded 
    down to the nearest second, and a_m, and b_m represent
    the remaining microseconds associated with date a and
    date b.

    We can then express the value of b - a as:

        b - a = (b_0 + b_m) - (a_0 + a_m) = b_0 - a_0 + b_m - a_m
    
    By construction, we know that b_0 - a_0 must be a round number
    of seconds.  Therefore we can take the result of b_0 - a_0 using
    ordinary cftime.datetime arithmetic and round to the nearest
    second.  b_m - a_m is the remainder, in microseconds, and we
    can simply add this to the rounded timedelta.

    Parameters
    ----------
    a : cftime.datetime
        Input datetime
    b : cftime.datetime
        Input datetime

    Returns
    -------
    datetime.timedelta
    """
    seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
    seconds = int(round(seconds.total_seconds()))
    microseconds = b.microsecond - a.microsecond
    return timedelta(seconds=seconds, microseconds=microseconds)

Add a version of num2date that strictly returns netcdftime.datetime objects?

In response to pydata/xarray#1252 (comment), @shoyer wrote:

I think netcdftime should have a version of num2date that always returns
netcdftime.datetime objects. It's really error prone to have functions that return
different types dependent on input values.

Would this be something folks would be open to? Somewhat selfishly this would make things easier downstream in pydata/xarray#1252, because we would only need to keep track of two datetime types for indexing and it would hopefully simplify the logic to encode datetimes for faithful roundtripping via netcdftime.num2date.

num2date largely does this already. I think it currently only returns datetime.datetime objects in the following situations (though please correct me if there are others):

  • If dates are encoded using a calendar type of 'gregorian' or 'proleptic_gregorian'
  • If the units argument provided to num2date includes a date after the beginning of the Gregorian calendar and the dates are encoded using a calendar type of 'standard'.

If we could agree on how this alternative version of num2date would work in these situations I would be happy to put together a PR. I'm not sure if we might want to define a new function or expose this via a keyword argument in the existing num2date.

Unit test for rich comparison fix

A fix for fully functioning rich comparisons on Python 3 was introduced in #53 but without a unit test. I just want to record in an issue what such a unit test would look like so that I haven't forgotten when I get time to write it!

I am imagining creating two test classes that can be compared to cftime instances: one designed to play nice, and one that isn't. Instances of these classes should be compared to a cftime instance in both directions, i.e. a .cmp. b and b .cmp. a. The results will then depend on which version of Python is being used.

Building with icc fails

I have been trying to build with icc, but that fails because gcc is used instead of icc for linking the shared object (.so).

Other extensions seem to work fine.
You can reproduce this issue as well without an extra compiler:

CC=echo CXX=echo python3 setup.py build_ext

I have tested other extensions, and they use echo for compiling and linking, whereas cftime uses it only for compiling, and gcc for linking.

This issue is present in master, as well as cftime-1.0.0 from pypi.

netcdftime.datetime refers to DatetimeProlepticGregorian

It would be nice if the super class for all netcdftime datetime objects (netcdftime._netcdftime.datetime) were exposed in the public API; this would allow one to succinctly do type checking.

Currently, netcdftime.datetime refers to the DatetimeProlepticGregorian object. Is this intended, or could it be changed to point to netcdftime._netcdftime.datetime?

In [1]: from netcdftime import datetime, DatetimeAllLeap

In [2]: datetime(1, 1, 1)
Out[2]: netcdftime._netcdftime.DatetimeProlepticGregorian(1, 1, 1, 0, 0, 0, 0, -1, 1)

In [3]: test = DatetimeAllLeap(1, 1, 1)

In [4]: isinstance(test, datetime)
Out[4]: False

In [5]: from netcdftime._netcdftime import datetime as super_datetime

In [6]: isinstance(test, super_datetime)
Out[6]: True

cc @shoyer
xref: pydata/xarray#1252 (comment)

cftime.real_datetime not pandas compatible

See pandas-dev/pandas#23419 for context.

The issue is the following:

import pandas, numpy, netCDF4
time=netCDF4.num2date([0,1,2,3,4,5,500,1000], 'days since 1801-10-01')
pandas.Series(numpy.arange(len(time)), index=time)

Fails with

AttributeError: 'real_datetime' object has no attribute 'nanosecond'

@spencerkclark says that this is due to the fact that netcdf4 now always returns cftime objects by default. I guess it is a necessary change (having different time objects to deal with depending on the calendar is a pain), but it makes me wonder about the level of compatibility between cftime objects and the rest of the pydata stack.

I'm no specialist at all here, so I don't know what it would involve, but it would be nice if cftime objects could be handled by pandas.

The easiest fix seems to be simply adding a nanosecond attribute to the cftime.real_datetime objects.
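
A minimal sketch of that idea, using a stand-in subclass of the built-in datetime. The class name is hypothetical, and whether this attribute alone is enough for pandas has not been verified here.

```python
from datetime import datetime


class RealDatetimeWithNanosecond(datetime):
    """Stand-in for cftime.real_datetime with an extra attribute.

    The built-in datetime only resolves microseconds, so the nanosecond
    part is always zero.
    """

    nanosecond = 0


stamp = RealDatetimeWithNanosecond(1801, 10, 1)
print(stamp.nanosecond)
```

Since the class attribute is shared by all instances, no constructor changes are needed, and instances still behave as ordinary datetime objects.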

num2date for dates before the start of the Gregorian calendar

xref: pydata/xarray#1929

Using the current version of netCDF4.num2date the following works:

>>> from netCDF4 import num2date
>>> import numpy as np
>>> num2date(np.arange(2), 'days since 1000-01-01', 'gregorian')
array([datetime.datetime(1000, 1, 1, 0, 0),
       datetime.datetime(1000, 1, 2, 0, 0)], dtype=object)

However, using the current netcdftime.num2date, it does not:

>>> from netcdftime import num2date
>>> import numpy as np
>>> num2date(np.arange(2), 'days since 1000-01-01', 'gregorian')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netcdftime/_netcdftime.pyx", line 283, in netcdftime._netcdftime.num2date
  File "netcdftime/_netcdftime.pyx", line 1165, in netcdftime._netcdftime.utime.num2date
  File "netcdftime/_netcdftime.pyx", line 673, in netcdftime._netcdftime.DateFromJulianDay
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

This error came up in the Xarray test suite (not in a real-world use-case). In my reading, it seems the Gregorian calendar does not start until October 15th, 1582, so should this be possible (as it is in netCDF4) or should it not be? In other words, in this case is Xarray testing something that doesn't make sense? How does (or should?) netcdftime handle this situation?

Towards the first standalone release of netcdftime

My apologies for getting this started and then letting it slide for so long. I'd like to aim for getting a first release of this package by the end of February.

I have created a milestone on the issue tracker for this release. Some other things to consider:

  • what version number should we start with?
  • how should we go about removing the current netcdftime from netcdf4-python

cc @dopplershift, @spencerkclark, @jswhit

from @dopplershift:

I'm happy to help step up on the infrastructure side (well, in a couple weeks probably), but I'm a little uncomfortable with stepping up fully since I don't have much technical knowledge in the space netcdftime is operating. So I'm happy to cut releases and stuff, but I'm not in possession of answers when it comes to solutions to problems.

If you could help get the documentation working, I can spend some cycles on getting the test infrastructure working again. I'm hoping @spencerkclark can help wrap up his open issues in that time frame.

Full support for UDUNITS years units

@jhamman Hope all is well with you since the AOSPy Workshop! I'm putting an issue in here to address some missing capabilities that are CF compliant, but not fully supported by the netcdftime module.

This issue can be immediately seen by trying to initialize a utime instance with the 'common_year' or 'year' units (which are UDUNITS supported):

utime('common_years since 1-1-1 0:0:0', calendar='noleap')

which immediately fails with:

ValueError: units must be one of 'seconds', 'minutes', 'hours' or 'days' (or singular version of these), got 'years'

My understanding is that this should work, since 'common_year' is a fixed multiple of 'day' (i.e., 365 days, regardless of calendar). However, netcdftime does not support any year-like units.
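
As a sketch of what this would mean in practice, a fixed-length year unit can be reduced to days before any calendar logic is applied. The table and function name below are hypothetical; the factor of 365 days per common_year is the one stated above.

```python
# Number of days represented by one of each unit. Only the units needed for
# this illustration are listed.
DAYS_PER_UNIT = {"days": 1, "common_years": 365}


def to_days(value, units):
    """Convert a value in 'days' or 'common_years' to a number of days."""
    key = units if units.endswith("s") else units + "s"  # accept singular
    if key not in DAYS_PER_UNIT:
        raise ValueError("unsupported units: %r" % units)
    return value * DAYS_PER_UNIT[key]
```

With such a conversion, a value such as 2 common_years since a reference date could be handled exactly like 730 days since the same date.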

I'm interested in implementing this ASAP, so I might fork the repo and put in a PR soon-ish.

Datetime comparison

copy of Unidata/netcdf4-python#639

When a netcdftime datetime object is unable to be compared to another object, the richcmp method raises a TypeError. In previous versions of the module the method would return NotImplemented, causing the appropriate method on the other object to be called and allowing classes to be written which could be compared against netcdftime datetime objects. With the TypeError behaviour this is no longer the case.

Can the previous behaviour be restored?
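
The difference between the two behaviours can be illustrated with stand-in classes (all names here are hypothetical; PhonyDate only mimics the comparison logic, not netcdftime itself). When a comparison method returns NotImplemented, Python tries the reflected method on the other operand and raises TypeError itself only if that also declines.

```python
class PhonyDate:
    """Stand-in for a netcdftime datetime; only ordering is modelled."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __lt__(self, other):
        if not isinstance(other, PhonyDate):
            # Defer to the other operand instead of raising TypeError here.
            return NotImplemented
        return self.ordinal < other.ordinal


class Friendly:
    """A user class that knows how to compare itself with PhonyDate."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def __gt__(self, other):
        # Called as the reflected operation for `phony_date < friendly`.
        if isinstance(other, PhonyDate):
            return self.ordinal > other.ordinal
        return NotImplemented


class Unfriendly:
    """A user class with no comparison support at all."""


print(PhonyDate(1) < Friendly(2))  # handled by Friendly.__gt__
try:
    PhonyDate(1) < Unfriendly()
except TypeError as error:
    print("TypeError:", error)     # raised by Python, not by PhonyDate
```

If PhonyDate.__lt__ raised TypeError directly, the comparison with Friendly would fail too, which is the regression described above.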

Local build of docs differs from web documentation

I'm looking to submit a documentation PR, but am having trouble getting documentation to build locally with the same result as the web docs.

Steps I am following:

  1. Clone repository locally
  2. python setup.py build
  3. python setup.py install
  4. cd docs && make html

Result I expect: The docs in _build/html should be the same as on the web.

Result I get: During build, this is the log:

[mcgibbon@stcu docs]$ make html
Running Sphinx v1.5.2
making output directory...
loading pickled environment... not yet created
loading intersphinx inventory from https://docs.python.org/objects.inv...
intersphinx inventory has moved: https://docs.python.org/objects.inv -> https://docs.python.org/3/objects.inv
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: 1 added, 0 changed, 0 removed
reading sources... [100%] index                                                                                                                               
/home/disk/eos4/mcgibbon/python/netcdftime/docs/index.rst:11: WARNING: missing attribute mentioned in :members: or __all__: module netcdftime, attribute date2num
/home/disk/eos4/mcgibbon/python/netcdftime/docs/index.rst:11: WARNING: missing attribute mentioned in :members: or __all__: module netcdftime, attribute num2date
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index                                                                                                                                
generating indices... genindex py-modindex
writing additional pages... search
copying static files... done
copying extra files... done
dumping search index in English (code: en) ... done
dumping object inventory... done
build succeeded, 2 warnings.

Build finished. The HTML pages are in _build/html.

In the resulting html docs, date2num and num2date are missing.

Further to all this, if I add netcdftime.datetime to the list of automodule members on netcdftime, I get the docstring "alias of DatetimeProlepticGregorian", rather than the docstring I get in IPython:

In [4]: nt.datetime?
Docstring:
Phony datetime object which mimics the python datetime object,
but allows for dates that don't exist in the proleptic gregorian calendar.

Supports timedelta operations by overloading + and -.

Has strftime, timetuple, replace, __repr__, and __str__ methods. The
format of the string produced by __str__ is controlled by self.format
(default %Y-%m-%d %H:%M:%S). Supports comparisons with other phony
datetime instances using the same calendar; comparison with
datetime.datetime instances is possible for netcdftime.datetime
instances using 'gregorian' and 'proleptic_gregorian' calendars.

Instance variables are year,month,day,hour,minute,second,microsecond,dayofwk,dayofyr,
format, and calendar.
File:      /home/disk/p/mcgibbon/anaconda/lib/python2.7/site-packages/netcdftime/_netcdftime.so
Type:      type

Install with pipenv fails for version 1.0.3

On Debian, in a Dockerfile with a Pipfile containing cftime = "*":

RUN pip install pipenv && pipenv install

results in:

pipenv.patched.notpip._internal.exceptions.InstallationError: Command "python setup.py egg_info" failed with error code 1 in /tmp/tmp3dy86zfkbuild/cftime/

Versions 1.0.0 and 1.0.2.1 do not have this problem.

Installation with pip install cftime works for both versions.

More roundoff problems

I don't know if this is related to #54 but I have noticed what I would call "unwanted precision" when interconverting between dates and numbers:

In [1]: from cftime import num2date, date2num
In [2]: calendar = 'noleap'
In [3]: u1970 = 'days since 1970-01-01' 
In [4]: u1980 = 'days since 1980-01-01' 
In [5]: date_1980 = num2date(0,u1980,calendar)
In [6]: date_1980
Out[6]: cftime.DatetimeNoLeap(1980, 1, 1, 0, 0, 0, 0, 6, 1)
In [7]: date2num(date_1980,u1970,calendar)
Out[7]: 3650.000000000001

I think this could be characterised as erroneous. The return value of the last date2num call should be exactly 3650.
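
The expected value follows from integer arithmetic alone, since every year in the noleap calendar has 365 days. A small sketch, with hypothetical helper names and no claim about how date2num is implemented:

```python
# Days elapsed before the first of each month in a 365-day calendar.
DAYS_BEFORE_MONTH_NOLEAP = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)


def noleap_ordinal(year, month, day):
    """Return an integer day count for a date in the noleap calendar."""
    return year * 365 + DAYS_BEFORE_MONTH_NOLEAP[month - 1] + (day - 1)


def days_between_noleap(start, end):
    """Exact number of whole days from start to end, as (y, m, d) tuples."""
    return noleap_ordinal(*end) - noleap_ordinal(*start)


print(days_between_noleap((1970, 1, 1), (1980, 1, 1)))  # 10 * 365 = 3650
```

Because only integers are involved, no roundoff can appear for whole-day differences.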

Raise error upon construction of out-of-bounds datetime?

In pydata/xarray#1252 I am working on a NetCDFTimeIndex that is intended to bring some of the features of pandas's DatetimeIndex (for now namely field accessors and partial datetime string indexing) to time indexes that use netcdftime._netcdftime.datetime objects.

Currently one can construct nonsensical datetimes using netcdftime._netcdftime.datetime objects:

In [1]: from netcdftime import DatetimeNoLeap

In [2]: DatetimeNoLeap(1, 45, 45)
Out[2]: netcdftime._netcdftime.DatetimeNoLeap(1, 45, 45, 0, 0, 0, 0, -1, 1)

Would it be possible for this kind of expression to raise an error? This would be nice, because that way if one tries to index with a slice involving an out-of-bounds datetime, it would automatically cause an error, rather than behave in the fashion below:

In [1]: import xarray as xr

In [2]: from xarray.conventions.netcdftimeindex import NetCDFTimeIndex

In [3]: from netcdftime import DatetimeNoLeap

In [4]: dates = [DatetimeNoLeap(1, 1, 1), DatetimeNoLeap(1, 2, 1), DatetimeNoLeap(2, 1, 1), DatetimeNoLeap(2, 2, 1)]

In [5]: da = xr.DataArray([1, 2, 3, 4], coords=[NetCDFTimeIndex(dates)], dims=['time'])

In [6]: da.sel(time=slice(DatetimeNoLeap(1, 1, 1), DatetimeNoLeap(1, 45, 45)))
Out[6]:
<xarray.DataArray (time: 2)>
array([1, 2])
Coordinates:
  * time     (time) object    1-01-01 00:00:00    1-02-01 00:00:00
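
A sketch of the kind of validation being requested, for the noleap calendar only. The function name is hypothetical and the check is deliberately limited to month and day ranges.

```python
# Days in each month of a 365-day ("noleap") calendar.
DAYS_IN_MONTH_NOLEAP = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def check_noleap_date(year, month, day):
    """Raise ValueError if (year, month, day) is not a valid noleap date."""
    if not 1 <= month <= 12:
        raise ValueError("invalid month %d in %d-%d-%d" % (month, year, month, day))
    if not 1 <= day <= DAYS_IN_MONTH_NOLEAP[month - 1]:
        raise ValueError("invalid day %d in %d-%d-%d" % (day, year, month, day))


check_noleap_date(1, 2, 1)        # passes silently
try:
    check_noleap_date(1, 45, 45)  # the date used in the slice above
except ValueError as error:
    print(error)
```

Running a check like this in the constructor would make the slice above fail early instead of silently returning the first two elements.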

cc @shoyer

Consider switching to a repr with a zero-padded year?

xref: pydata/xarray#1252 (comment)

When encoding datetime-like objects for storage in netCDF files, xarray follows the standard convention (converting dates to a series of numbers representing some time unit since a reference date). The units are encoded as a string attribute of a variable in a netCDF file, e.g. 'days since 2000-01-01 00:00:00'. In the process of constructing this string upon saving the file, xarray uses the repr of the datetime object stored in the array. Additionally, xarray's datetime decoding logic depends on the use of the pandas.Timestamp constructor to assess whether a reference date with a standard calendar (encoded as a string) can be represented using a Timestamp object (allowing the time series to be decoded fully without the optional dependency of netCDF4) or needs to be represented using a netcdftime object (requiring the optional dependency of netCDF4 to be decoded).

While experimenting with round-tripping arrays containing datetime objects outside the Timestamp-valid range to netCDF files and back to xarray objects within pydata/xarray#1252, I found that the non-zero-padded repr of netcdftime objects (used in encoding the dates) and the datetime parser used by pandas in attempting to decode the dates do not always play well together. See for example:

In [1]: import xarray as xr

In [2]: from netcdftime import DatetimeProlepticGregorian

In [3]: da = xr.DataArray([DatetimeProlepticGregorian(1, 1, 1), DatetimeProlepticGregorian(1, 2, 1)])

In [4]: da
Out[4]:
<xarray.DataArray (dim_0: 2)>
array([netcdftime._netcdftime.DatetimeProlepticGregorian(1, 1, 1, 0, 0, 0, 0, -1, 1),
       netcdftime._netcdftime.DatetimeProlepticGregorian(1, 2, 1, 0, 0, 0, 0, -1, 1)], dtype=object)
Dimensions without coordinates: dim_0

In [5]: da.to_dataset(name='time').to_netcdf('test-roundtrip.nc')

In [6]: xr.open_dataset('test-roundtrip.nc')
Out[6]:
<xarray.Dataset>
Dimensions:  (dim_0: 2)
Dimensions without coordinates: dim_0
Data variables:
    time     (dim_0) datetime64[ns] 2001-01-01 2001-02-01

Upon closer inspection, one can find that this could be fixed by using a repr with a zero-padded year when encoding the datetimes:

In [7]: ds = xr.open_dataset('test-roundtrip.nc', decode_times=False)

In [8]: ds.time
Out[8]:
<xarray.DataArray 'time' (dim_0: 2)>
array([ 0, 31])
Dimensions without coordinates: dim_0
Attributes:
    units:     days since    1-01-01 00:00:00
    calendar:  proleptic_gregorian

In [9]: ds.time.attrs['units'] = 'days since 0001-01-01 00:00:00'

In [10]: xr.decode_cf(ds)
/Users/spencerclark/xarray-dev/xarray/xarray/conventions/coding.py:416: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range
  result = decode_cf_datetime(example_value, units, calendar)
Out[10]: /Users/spencerclark/xarray-dev/xarray/xarray/conventions/coding.py:435: RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range
  calendar=self.calendar)

<xarray.Dataset>
Dimensions:  (dim_0: 2)
Dimensions without coordinates: dim_0
Data variables:
    time     (dim_0) object 0001-01-01 0001-02-01

Would using repr with a zero-padded year possibly make sense for netcdftime or should we look for an alternative solution to this issue?
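
As a rough sketch of the difference, the snippet below contrasts a zero-padded year with a space-padded one. The function name `format_reference_date` is hypothetical, and the space-padded format string is only my guess at how the current string is produced, not netcdftime's actual code.

```python
def format_reference_date(year, month, day, hour=0, minute=0, second=0):
    """Format a reference date with a zero-padded, four-digit year."""
    return "%04d-%02d-%02d %02d:%02d:%02d" % (year, month, day, hour, minute, second)


# Zero-padded year, as proposed in this issue.
padded = format_reference_date(1, 1, 1)

# Space-padded year (assumed format), matching the units attribute seen in the file.
space_padded = "%4d-%02d-%02d %02d:%02d:%02d" % (1, 1, 1, 0, 0, 0)

print("days since " + padded)
print("days since " + space_padded)
```

The first string gives `days since 0001-01-01 00:00:00`; the second reproduces the `days since    1-01-01 00:00:00` form that the pandas parser misreads.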

datetime object dayofyear and dayofwk issue

netcdftime.datetime objects constructed directly don't have the correct day of year or day of week; they always give a value of 1 and -1 respectively. Objects returned by utime.num2date have the correct values, except for those with Julian calendars, which still have the issue.

>>> import netcdftime
>>> print netcdftime.__version__
1.4.1
>>> date = netcdftime.DatetimeNoLeap(2000, 1, 2)
>>> print repr(date)
netcdftime._netcdftime.DatetimeNoLeap(2000, 1, 2, 0, 0, 0, 0, -1, 1)
>>> print date.dayofyr 
1
>>> print date.dayofwk
-1

Python version: 2.7
netCDF4 version: 1.2.7
Installed via conda forge.
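
For reference, the expected day of year for the no-leap case can be computed directly from the month lengths. This is a minimal sketch with a hypothetical helper name, `day_of_year_noleap`; it is not how netcdftime computes `dayofyr` internally.

```python
# Hypothetical helper, not part of netcdftime: month lengths in a no-leap calendar.
DAYS_IN_MONTH_NOLEAP = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def day_of_year_noleap(month, day):
    """Return the 1-based day of year for a date in a no-leap calendar."""
    # Sum the lengths of all months before this one, then add the day of month.
    return sum(DAYS_IN_MONTH_NOLEAP[:month - 1]) + day
```

For the date above, `day_of_year_noleap(1, 2)` gives 2, whereas the directly constructed object reports 1.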

1.0.3 fails to build on 32bit architectures due to test failures (TypeError: object of type 'numpy.int32' has no len())

The Debian package build for cftime 1.0.3 failed on 32bit architectures due to test failures, see:

https://buildd.debian.org/status/package.php?p=cftime

The buildlog contains many TypeError issues like this:

=================================== FAILURES ===================================
____________________________ cftimeTestCase.runTest ____________________________

self = <test_cftime.cftimeTestCase testMethod=runTest>

    def runTest(self):
        """testing cftime"""
        # test mixed julian/gregorian calendar
        # check attributes.
        self.assertTrue(self.cdftime_mixed.units == 'hours')
        self.assertTrue(
            str(self.cdftime_mixed.origin) == '0001-01-01 00:00:00')
        self.assertTrue(
            self.cdftime_mixed.unit_string == 'hours since 0001-01-01 00:00:00')
        self.assertTrue(self.cdftime_mixed.calendar == 'standard')
        # check date2num method. (date before switch)
        d = datetime(1582, 10, 4, 23)
        t1 = self.cdftime_mixed.date2num(d)
        assert_almost_equal(t1, 13865687.0)
        # check num2date method.
>       d2 = self.cdftime_mixed.num2date(t1)

test/test_cftime.py:99: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
cftime/_cftime.pyx:874: in cftime._cftime.utime.num2date
    ???
cftime/_cftime.pyx:492: in cftime._cftime.DateFromJulianDay
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   TypeError: object of type 'numpy.int32' has no len()

Full buildlogs for the architectures in question: armel, armhf, i386, mips, mipsel, hppa, hurd-i386

setup.py imports from Cython before checking for Cython

As discussed in Unidata/netcdf4-python#767, setup.py calls from Cython.Build import cythonize before it requires Cython to be installed. As a consequence, building without Cython previously installed fails with the following error:

Traceback (most recent call last):
  File "setup.py", line 3, in <module>
    from Cython.Build import cythonize
ImportError: No module named Cython.Build
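
One possible pattern, sketched here as an assumption rather than as the project's actual fix, is to guard the import so that a missing Cython produces a clear message or a fallback instead of a bare traceback. The name `load_cythonize` is hypothetical.

```python
def load_cythonize():
    """Return Cython's cythonize function, or None if Cython is not installed."""
    try:
        from Cython.Build import cythonize
    except ImportError:
        return None
    return cythonize


cythonize = load_cythonize()
if cythonize is None:
    # A real setup.py might fall back to a pre-generated C file here,
    # or exit with an explanatory message.
    message = "Cython is required to build from the .pyx source"
else:
    message = "Cython found"
print(message)
```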

Uninformative error message from _dateparse

cftime._cftime._dateparse returns quite opaque error messages that can lead to issues tracking down errors.

For example:

>>> cftime._cftime._dateparse('days_since_1900-01-01')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cftime/_cftime.pyx", line 56, in cftime._cftime._dateparse
IndexError: list index out of range

The code assumes the string can be split by whitespace into two strings. In the above example it cannot.
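
A minimal sketch of a more informative failure, assuming the units string should have the form `<unit> since <reference date>`. The function `split_units` is hypothetical and is not the existing `_dateparse` implementation.

```python
def split_units(units):
    """Split '<unit> since <reference date>' into (unit, reference date string)."""
    # Split on whitespace at most twice so the date part keeps its internal spaces.
    parts = units.split(None, 2)
    if len(parts) < 3 or parts[1].lower() != "since":
        raise ValueError(
            "could not parse time units %r: "
            "expected '<unit> since <reference date>'" % (units,))
    return parts[0], parts[2]
```

With this, `split_units('days_since_1900-01-01')` raises a `ValueError` that names the offending string, rather than an `IndexError`.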

num2date returns 01:00:00 or 00:59:59 depending on calendar

Good afternoon,

Here is an issue I posted on Unidata/netcdf4-python#810 two weeks ago, and I have been advised to submit it here. Here it is below:

You may already be aware of this problem, but I didn't find how to solve it.
I have an output with hourly intervals written in days since 1979-01-01 00:00:00.
The first time step is 0. The second is 0.041666666667.
I have noticed that num2date does not give the same results when reading the second time step depending on the calendar used. It sometimes gives 1979-01-01 01:00:00 (which is what we want), and sometimes 1979-01-01 00:59:59 (see below).
How can we fix it so num2date would give the correct result (01:00:00, 02:00:00, etc.) whatever calendar is used?

Thank you for your help,
Marie-Estelle

time_in1=0.041666666667
print(time_in_units)
days since 1979-01-01 00:00:00
dt_in1 = num2date(time_in1,time_in_units,calendar='standard')
str(dt_in1)
'1979-01-01 01:00:00'
dt_in1 = num2date(time_in1,time_in_units,calendar='gregorian')
str(dt_in1)
'1979-01-01 01:00:00'
dt_in1 = num2date(time_in1,time_in_units,calendar='proleptic_gregorian')
str(dt_in1)
'1979-01-01 01:00:00'
dt_in1 = num2date(time_in1,time_in_units,calendar='noleap')
str(dt_in1)
'1979-01-01 00:59:59'
dt_in1 = num2date(time_in1,time_in_units,calendar='365_day')
str(dt_in1)
'1979-01-01 00:59:59'
dt_in1 = num2date(time_in1,time_in_units,calendar='360_day')
str(dt_in1)
'1979-01-01 00:59:59'
dt_in1 = num2date(time_in1,time_in_units,calendar='julian')
str(dt_in1)
'1979-01-01 01:00:00'
dt_in1 = num2date(time_in1,time_in_units,calendar='all_leap')
str(dt_in1)
'1979-01-01 00:59:59'
dt_in1 = num2date(time_in1,time_in_units,calendar='366_day')
str(dt_in1)
'1979-01-01 00:59:59'
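
To illustrate how a result can come out one second short, here is a sketch using only the standard library. It assumes the effect comes from an intermediate value landing a hair under a whole second and then being truncated; I have not confirmed that this is what num2date does internally, and both function names are hypothetical.

```python
from datetime import datetime, timedelta


def days_to_datetime_truncating(days, origin):
    """Convert fractional days to a datetime, truncating to whole seconds."""
    return origin + timedelta(seconds=int(days * 86400))


def days_to_datetime_rounding(days, origin):
    """Convert fractional days to a datetime, rounding to the nearest second."""
    return origin + timedelta(seconds=round(days * 86400))


origin = datetime(1979, 1, 1)
slightly_low = 0.0416666666  # just under 1/24 of a day, as roundoff might produce

print(days_to_datetime_truncating(slightly_low, origin))  # 1979-01-01 00:59:59
print(days_to_datetime_rounding(slightly_low, origin))    # 1979-01-01 01:00:00
```

Rounding to the nearest second (or microsecond) instead of truncating is one way the 00:59:59 results could be avoided.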

Installation from source on PyPI fails with ValueError

The sdist archive, cftime-1.0.3.1.tar.gz, on PyPI contains an absolute path in cftime.egg-info\SOURCES.txt such that installation from source fails:

<snip>
running egg_info
writing cftime.egg-info\PKG-INFO
writing dependency_links to cftime.egg-info\dependency_links.txt
writing requirements to cftime.egg-info\requires.txt
writing top-level names to cftime.egg-info\top_level.txt
reading manifest file 'cftime.egg-info\SOURCES.txt'
Traceback (most recent call last):
  File "setup.py", line 122, in <module>
    'License :: OSI Approved'])
  File "X:\Python36\lib\site-packages\setuptools\__init__.py", line 143, in setup
    return distutils.core.setup(**attrs)
  File "X:\Python36\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "X:\Python36\lib\distutils\dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "X:\Python36\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "X:\Python36\lib\site-packages\wheel\bdist_wheel.py", line 224, in run
    self.run_command('install')
  File "X:\Python36\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "X:\Python36\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "X:\Python36\lib\site-packages\setuptools\command\install.py", line 61, in run
    return orig.install.run(self)
  File "X:\Python36\lib\distutils\command\install.py", line 557, in run
    self.run_command(cmd_name)
  File "X:\Python36\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "X:\Python36\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "X:\Python36\lib\site-packages\setuptools\command\install_egg_info.py", line 34, in run
    self.run_command('egg_info')
  File "X:\Python36\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "X:\Python36\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "X:\Python36\lib\site-packages\setuptools\command\egg_info.py", line 296, in run
    self.find_sources()
  File "X:\Python36\lib\site-packages\setuptools\command\egg_info.py", line 303, in find_sources
    mm.run()
  File "X:\Python36\lib\site-packages\setuptools\command\egg_info.py", line 534, in run
    self.add_defaults()
  File "X:\Python36\lib\site-packages\setuptools\command\egg_info.py", line 577, in add_defaults
    self.read_manifest()
  File "X:\Python36\lib\site-packages\setuptools\command\sdist.py", line 199, in read_manifest
    self.filelist.append(line)
  File "X:\Python36\lib\site-packages\setuptools\command\egg_info.py", line 476, in append
    path = convert_path(item)
  File "X:\Python36\lib\distutils\util.py", line 125, in convert_path
    raise ValueError("path '%s' cannot be absolute" % pathname)
ValueError: path '/private/tmp/cftime/cftime/_cftime.c' cannot be absolute

decode 'months since' for 360_day calendars

The IRI Data Library contains tons of datasets with the following time attributes:

	float32 T(T) ;
		T:standard_name = time ;
		T:pointwidth = 1.0 ;
		T:long_name = Time ;
		T:calendar = 360 ;
		T:expires = 1538524800 ;
		T:gridtype = 0 ;
		T:units = months since 1960-01-01 ;

(edited to correct typo in original post)

(example dataset)

I would like to be able to open and decode these datasets in xarray, with time decoding handled by cftime.

There are two problems:

  1. The calendar is not a valid CF calendar. It should be 360_day. But that is easy to fix by rewriting the calendar attribute.
  2. Even with calendar = 360_day, months is not considered a valid time unit

However, in a 360-day calendar month==30 days, so this should be valid.

This was discussed over in Unidata/netcdf4-python#434 (comment), where @jswhit commented:

A pull request allowing months since when calendar is 360_day would be welcome
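
A rough sketch of the arithmetic such a pull request would need, assuming every month is exactly 30 days and every year exactly 360 days. The function `months_since_to_date_360` is hypothetical and not part of cftime; it returns a plain tuple rather than a cftime datetime object.

```python
import math


def months_since_to_date_360(months, ref_year, ref_month=1, ref_day=1):
    """Return (year, month, day, day_fraction) for 'months since' in a 360-day calendar."""
    # Days from the start of year 0 to the reference date, plus the month offset.
    ref_days = ref_year * 360 + (ref_month - 1) * 30 + (ref_day - 1)
    total_days = ref_days + months * 30.0
    whole_days = math.floor(total_days)
    day_fraction = total_days - whole_days
    year, day_in_year = divmod(int(whole_days), 360)
    month, day = divmod(day_in_year, 30)
    return year, month + 1, day + 1, day_fraction
```

For example, 0.5 months since 1960-01-01 comes out as the 16th of January 1960, the midpoint of the first 30-day month.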

improve accuracy of Julian day calculations

The accuracy of the current algorithm is about a millisecond, which can cause surprising results due to roundoff errors (see issue #54).

One simple way to improve the accuracy would be to modify the routines so that they represent the Julian day as two floats, instead of just one, as is done with jdcal. From the jdcal pypi page:

"Julian dates are stored in two floating point numbers (double). Julian dates, and Modified Julian dates, are large numbers. If only one number is used, then the precision of the time stored is limited. Using two numbers, time can be split in a manner that will allow maximum precision. For example, the first number could be the Julian date for the beginning of a day and the second number could be the fractional day. Calculations that need the latter part can now work with maximum precision."

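A small sketch of why splitting helps, using plain Python floats. The values are illustrative only: 2451544.5 is the Julian date of 2000-01-01 00:00, and the tiny day fraction is chosen to fall below the resolution of a single double at that magnitude. This shows the idea only and is not jdcal's or cftime's code.

```python
jd_midnight = 2451544.5  # Julian date at the start of 2000-01-01
fraction = 1e-11         # fraction of a day, a bit under one microsecond

# Single-float representation: the small fraction is absorbed by rounding.
single = jd_midnight + fraction
lost = single - jd_midnight

# Two-float representation: whole part and fractional part are kept separately.
pair = (jd_midnight, fraction)
seconds_into_day = pair[1] * 86400.0

print(lost)              # 0.0, the fraction did not survive the addition
print(seconds_into_day)  # about 8.64e-07 seconds, still recoverable
```

Calculations that only need the time of day can work from the second number and keep full precision.
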
Negative dates in CF and cftime

It has been suggested that my issue may be of interest/discussed/resolved through this group. I’ll present the questions pre-emptively as it may help understand the description of my problem:
Are CF dates constrained as being positive years?
Are code updates planned/required to allow for the use of the ISO/DIS 8601-2 standard to allow for negative dates?

We have NetCDF data files (with SeaDataNet and CF conventions) with a date channel as http://vocab.nerc.ac.uk/collection/P01/current/CJDY1101/
double TIME(INSTANCE) ;
TIME:long_name = "Chronological Julian Date" ;
TIME:sdn_parameter_urn = "SDN:P01::CJDY1101" ;
TIME:sdn_parameter_name = "Julian Date (chronological)" ;
TIME:sdn_uom_urn = "SDN:P06::UTAA" ;
TIME:sdn_uom_name = "Days" ;
TIME:units = "days since -4713-01-01T00:00:00Z" ;
TIME:standard_name = "time" ;
TIME:axis = "T" ;
TIME:ancillary_variables = "TIME_SEADATANET_QC" ;
TIME:calendar = "julian" ;
TIME:_FillValue = -99999. ;
byte TIME_SEADATANET_QC(INSTANCE) ;
….

Running the data file through the CFchecker software (http://pumatest.nerc.ac.uk/cgi-bin/cf-checker.pl) fails with
File "netcdftime/_netcdftime.pyx", line 715, in netcdftime._netcdftime.utime.__init__ (netcdftime/_netcdftime.c:11201)
ValueError: negative reference year in time units, must be >= 1

This was reported to the CFchecker software, with the response from the developer that
cfunits is throwing the error
"netCDF4-python throws an error if real world calendars have negative years (https://github.com/Unidata/cftime/blob/master/cftime/_cftime.pyx#L140-L147)"

The following trail seems to imply that we are working correctly with the date, in particular having a negative year for the time origin:
http://cfconventions.org/Conformance/conformance.html

https://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html#bp_Calendar-Date-Time
refers to the udunits using ISO8601

https://en.wikipedia.org/wiki/ISO_8601
refers to
To represent years before 0000 or after 9999, the standard also permits the expansion of the year representation but only by prior agreement between the sender and the receiver.

https://www.iso.org/news/2017/02/Ref2164.html
refers to simply adding a minus sign.

https://www.unidata.ucar.edu/software/thredds/current/netcdf-java/CDM/CalendarDateTime.html refers to a minus date.

I hope this rather long description makes some sort of sense.
Ray
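
For what it is worth, reading a signed year out of such a units string is straightforward. The sketch below is a hypothetical parser, `parse_reference_ymd`, not the cftime implementation, and it says nothing about whether CF permits negative years; it only shows that the string itself can be parsed.

```python
import re

# Unit name, the word "since", then a signed year, month and day.
_REF_DATE = re.compile(r"^\s*(\w+)\s+since\s+(-?\d+)-(\d{1,2})-(\d{1,2})")


def parse_reference_ymd(units):
    """Return (unit, year, month, day) from '<unit> since <date>', allowing year < 1."""
    match = _REF_DATE.match(units)
    if match is None:
        raise ValueError("could not parse time units %r" % (units,))
    unit, year, month, day = match.groups()
    return unit, int(year), int(month), int(day)
```

Applied to the units above, `parse_reference_ymd("days since -4713-01-01T00:00:00Z")` returns `('days', -4713, 1, 1)`.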
