Comments (3)
Thanks for the suggestion @j-adamczyk! Any other features you'd suggest for these task types?
@noamzbr thank you for the fast response.
This requires a mix of regression tests and NLP tests.
From the tabular quickstart, the interesting checks are:
- train-test performance
- regression error distribution
- prediction drift
- simple model comparison
Some tabular regression checks do not make sense for NLP: weak segments performance (since segments are not well defined for NLP), boosting overfit (since NLP models typically do not use boosting), and model inference time (which is naturally long for NLP).
From the NLP text classification quickstart:
- text property outliers
- unknown tokens
- under annotated property segments
- under annotated metadata segments
- text duplicates
- special characters
Image regression could also be added in a very similar way, but that is outside the scope of this issue.
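To make the proposal concrete, here is a minimal sketch of the combined check list in plain Python. It does not import deepchecks or use any of its API; the check names are copied from the lists above as plain strings, and the helper `proposed_nlp_regression_checks` is a hypothetical name used only for illustration.

```python
# Sketch only: plain strings, not deepchecks classes.
# Names are taken from the two quickstart lists in this comment.
TABULAR_REGRESSION_CHECKS = [
    "train-test performance",
    "regression error distribution",
    "prediction drift",
    "simple model comparison",
]

# Tabular regression checks left out, with the reason given above.
EXCLUDED_FOR_NLP = {
    "weak segments performance": "segments are not well defined for NLP",
    "boosting overfit": "NLP does not typically use boosting",
    "model inference time": "naturally long for NLP",
}

NLP_CHECKS = [
    "text property outliers",
    "unknown tokens",
    "under annotated property segments",
    "under annotated metadata segments",
    "text duplicates",
    "special characters",
]


def proposed_nlp_regression_checks():
    """Return the proposed NLP regression check list (hypothetical helper)."""
    combined = TABULAR_REGRESSION_CHECKS + NLP_CHECKS
    # Guard against an excluded check slipping into the combined list.
    return [name for name in combined if name not in EXCLUDED_FOR_NLP]


if __name__ == "__main__":
    for name in proposed_nlp_regression_checks():
        print("-", name)
```

How these names would map onto actual check classes is left open here, since that depends on how the maintainers choose to add the task type.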
@noamzbr any news on this? As far as I understand, this just combines two existing things, and no genuinely new code is needed.