Comments (5)
Hello--thanks for filing this issue. I'd like to replicate your problem as accurately as I can before I start addressing the issue.
I have some sample code below but I'm not sure what you're using as the requirement:
from datetime import datetime
import pandas as pd
from datatest import validate
data = pd.Series([
    None,
    None,
    None,
    datetime(2010, 12, 31),
    datetime(2010, 12, 31),
])
requirement = ??? # <- What is this?
validate(data, requirement)
Can you tell me what your requirement value is?
from datatest.
Thanks for your reply.
from datetime import datetime, timedelta
import pandas as pd
from datatest import validate, accepted, Extra
data = pd.Series([
    None,
    None,
    None,
    datetime(2010, 12, 31),
    datetime(2010, 12, 31),
])
Today = datetime.today()
Tomorrow = Today + timedelta(days=1)
def date_requirement(var_datetime):
    return pd.Timestamp(year=2000, month=1, day=1) < var_datetime < \
        pd.Timestamp(year=Tomorrow.year, month=Tomorrow.month, day=Tomorrow.day)
with accepted(Extra(pd.NaT)):
    validate(data, date_requirement)
Here I want to accept the NaT values. I tried pd.NaT, np.datetime64('NaT'), and the NanToken approach mentioned in the documentation, and the result is the same each time:
datatest.ValidationError: does not satisfy date_requirement() (3 differences): [
    Invalid(numpy.datetime64('NaT')),
    Invalid(numpy.datetime64('NaT')),
    Invalid(numpy.datetime64('NaT')),
]
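For anyone reading along, here is a minimal sketch of why the NaT rows show up as differences in the first place. It uses only pandas, and the fixed upper bound and the in_range() name are stand-ins I made up for illustration; they are not part of the original code. Comparisons against NaT evaluate to false, so the chained comparison in the requirement rejects every NaT value. NaT also does not compare equal to itself, which is one plausible reason an equality-based acceptance does not match it (I have not confirmed how datatest compares differences internally).

```python
import pandas as pd

lower = pd.Timestamp(year=2000, month=1, day=1)
upper = pd.Timestamp(year=2100, month=1, day=1)  # hypothetical fixed upper bound

def in_range(value):
    """Same shape as date_requirement(), but with a fixed upper bound."""
    return lower < value < upper

# A real date inside the interval satisfies the requirement.
ok_result = in_range(pd.Timestamp(year=2010, month=12, day=31))

# Any ordering comparison with NaT is false, so the requirement fails.
nat_result = in_range(pd.NaT)

# NaT is not equal to itself, much like NaN.
nat_equals_itself = pd.NaT == pd.NaT
```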
Ah, OK. As a stopgap, you can use the accepted.args() method together with the pd.isna() function:
...
with accepted.args(pd.isna):
    validate(data, date_requirement)
The accepted.args() method accepts differences whose args satisfy a given predicate. And by using pd.isna() as the predicate, you can accept differences that contain NaT, NaN, or other "missing value" objects.
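To make the predicate's behavior concrete, here is a small sketch that calls pd.isna() directly on a few candidate values, outside of datatest. The candidates list is only an illustration; the point is that the same function returns true for every "missing value" flavor and false for a real timestamp.

```python
import numpy as np
import pandas as pd

# Illustrative values: four kinds of "missing" plus one real date.
candidates = [
    pd.NaT,
    np.datetime64('NaT'),
    float('nan'),
    None,
    pd.Timestamp(year=2010, month=12, day=31),
]

# pd.isna() treats all of the missing-value objects the same way.
flags = [bool(pd.isna(value)) for value in candidates]
```

Because the predicate does not rely on equality, it sidesteps the problem that NaT and NaN do not compare equal to themselves.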
For a longer-term solution, I want to bring the handling of these NaT values in line with how datatest handles other NaN values (as documented here). I will follow up when I have addressed this issue more thoroughly.
Thank you so much.
Your code works well in my project.
And yes, I also used pd.isna to check whether a value is pd.NaT or not. (Is this the only way?) I simply dropped those rows and then ran datatest.
I've used Python and programmed for 3 years and hadn't realized there are differences among bool, np.bool_ or pd.NaT, pd.Nan, np.nan, nan before.
I've learnt a lot from your work, and thanks again for your patience.
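On the "is this the only way?" question, a rough sketch of the drop-the-rows approach using plain pandas is below. The variable names are mine, and this only prepares the data; whether dropping rows is appropriate depends on whether missing dates should count as valid in your project.

```python
from datetime import datetime

import pandas as pd

data = pd.Series([
    None,
    None,
    None,
    datetime(2010, 12, 31),
    datetime(2010, 12, 31),
])

# Count the missing entries (None is stored as NaT in a datetime Series).
missing_count = int(data.isna().sum())

# Two equivalent ways to keep only the rows that hold a real date.
present = data.dropna()
also_present = data[data.notna()]
```

The filtered Series can then be validated without any acceptance, at the cost of no longer seeing how many values were missing.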
I'm glad you found it helpful. I noticed that your date_requirement() function is checking for an interval. If it suits your needs, you could also use the validate.interval() method:
...
begin_date = pd.Timestamp(year=2000, month=1, day=1)
tomorrow = pd.Timestamp(datetime.today() + timedelta(days=1))
with accepted.args(pd.isna):
    validate.interval(data, begin_date, tomorrow)
One difference with this approach is that time differences trigger Deviation objects that contain a timedelta. There are some how-to documents for date handling that you might find helpful as well.
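As a rough illustration of the kind of timedelta involved, the sketch below computes how far a date falls outside an interval using pandas alone. The end_date value and the offset_from_interval() helper are hypothetical, and I am assuming the reported offset is measured from the nearest bound; check the datatest documentation for the exact contents of a Deviation.

```python
import pandas as pd

begin_date = pd.Timestamp(year=2000, month=1, day=1)
end_date = pd.Timestamp(year=2020, month=1, day=1)  # hypothetical upper bound

def offset_from_interval(value):
    """Return the signed distance from the nearest bound, or zero if inside."""
    if value < begin_date:
        return value - begin_date  # negative: before the interval
    if value > end_date:
        return value - end_date    # positive: after the interval
    return pd.Timedelta(0)

early = offset_from_interval(pd.Timestamp(year=1999, month=12, day=30))
inside = offset_from_interval(pd.Timestamp(year=2010, month=12, day=31))
late = offset_from_interval(pd.Timestamp(year=2020, month=1, day=4))
```

Subtracting two Timestamp objects yields a pd.Timedelta, which is a subclass of the standard library's datetime.timedelta.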