
Comments (5)

shawnbrown avatar shawnbrown commented on June 3, 2024

Hello--thanks for filing this issue. I'd like to replicate your problem as accurately as I can before I start addressing the issue.

I have some sample code below but I'm not sure what you're using as the requirement:

from datetime import datetime
import pandas as pd
from datatest import validate

data = pd.Series([
    None,
    None,
    None,
    datetime(2010, 12, 31),
    datetime(2010, 12, 31),
])

requirement = ???  # <- What is this?
validate(data, requirement)

Can you tell me what your requirement value is?

from datatest.

Belightar avatar Belightar commented on June 3, 2024

Thanks for your reply.

from datetime import datetime, timedelta
import pandas as pd
from datatest import validate, accepted, Extra

data = pd.Series([
    None,
    None,
    None,
    datetime(2010, 12, 31),
    datetime(2010, 12, 31),
])

Today = datetime.today()
Tomorrow = Today + timedelta(days=1)

def date_requirement(var_datetime):
    return pd.Timestamp(year=2000, month=1, day=1) < var_datetime < \
            pd.Timestamp(year=Tomorrow.year, month=Tomorrow.month, day=Tomorrow.day)

with accepted(Extra(pd.NaT)):
    validate(data, date_requirement)

Here I want to accept the NaT values. I tried pd.NaT, np.datetime64('NaT'), and the NanToken method mentioned in the documentation, and the results are the same:

datatest.ValidationError: does not satisfy date_requirement() (3 differences): [
    Invalid(numpy.datetime64('NaT')),
    Invalid(numpy.datetime64('NaT')),
    Invalid(numpy.datetime64('NaT')),
]
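
For readers following along, here is a minimal sketch of why the three None entries show up as NaT differences. It uses only pandas (not datatest), the shortened Series and the `in_range` helper with its fixed upper bound are assumptions made for illustration, and it only demonstrates general pandas behavior: None is stored as NaT in a datetime Series, and ordering comparisons against NaT evaluate to False.

```python
from datetime import datetime

import pandas as pd

# A shortened version of the Series from the report above.
data = pd.Series([None, None, datetime(2010, 12, 31)])

# pandas stores the None entries as NaT once the Series holds datetimes.
missing_count = int(data.isna().sum())

begin = pd.Timestamp(year=2000, month=1, day=1)
end = pd.Timestamp(year=2100, month=1, day=1)  # arbitrary bound, illustration only


def in_range(value):
    """Hypothetical stand-in for date_requirement() with fixed bounds."""
    return begin < value < end


# Ordering comparisons with NaT are False, so the chained check fails.
nat_result = in_range(pd.NaT)
valid_result = in_range(pd.Timestamp(2010, 12, 31))

print(missing_count, nat_result, valid_result)
```

So a range-style requirement function rejects every NaT entry unless those differences are explicitly accepted.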


shawnbrown avatar shawnbrown commented on June 3, 2024

Ah, OK. As a stopgap, you can use the accepted.args() method together with the pd.isna() function:

...

with accepted.args(pd.isna):
    validate(data, date_requirement)

The accepted.args() method accepts differences whose args satisfy a given predicate. And by using pd.isna() as the predicate, you can accept differences that contain NaT, NaN, or other "missing value" objects.
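
As a rough illustration of why a predicate is a convenient way to match these values, the sketch below checks what pd.isna() returns for several "missing value" objects. It shows plain pandas and NumPy behavior only and makes no claim about datatest internals; the `candidates` dictionary and its labels are made up for the example.

```python
import numpy as np
import pandas as pd

# Several "missing value" objects plus one real timestamp for contrast.
candidates = {
    "pd.NaT": pd.NaT,
    "np.datetime64('NaT')": np.datetime64("NaT"),
    "float('nan')": float("nan"),
    "None": None,
    "Timestamp": pd.Timestamp(2010, 12, 31),
}

# pd.isna() treats all of the missing-value objects alike.
results = {label: bool(pd.isna(value)) for label, value in candidates.items()}

# NaT does not compare equal to itself, so equality is a poor test for it.
nat_equals_itself = bool(pd.NaT == pd.NaT)

for label, is_missing in results.items():
    print(f"{label}: {is_missing}")
print("pd.NaT == pd.NaT:", nat_equals_itself)
```

Because NaT does not compare equal to itself, a predicate such as pd.isna() is a more dependable check than an equality comparison.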

For a longer-term solution, I want to bring the handling of these NaT values in line with how datatest handles other NaN values (as documented here). I will follow up when I have addressed this issue more thoroughly.


Belightar avatar Belightar commented on June 3, 2024

Thank you so much.
Your code works well in my project.
And yes, I also used pd.isna to check whether a value is pd.NaT or not. (Is this the only way?) I simply dropped those rows and then ran datatest.
I've used Python and programmed for 3 years and hadn't realized there are differences among bool, np.bool_ or pd.NaT, pd.Nan, np.nan, nan before.
I've learned a lot from your work, and thanks again for your patience.


shawnbrown avatar shawnbrown commented on June 3, 2024

I'm glad you found it helpful. I noticed that your date_requirement() function is checking for an interval. If it suits your needs, you could also use the validate.interval() method:

...

begin_date = pd.Timestamp(year=2000, month=1, day=1)
tomorrow = pd.Timestamp(datetime.today() + timedelta(days=1))

with accepted.args(pd.isna):
    validate.interval(data, begin_date, tomorrow)

One difference with this approach is that time differences trigger Deviation objects that contain a timedelta. There are some how-to documents for date handling that you might find helpful as well.
