Comments (4)
Hi @ogencoglu, sounds like a cool idea, thanks. Any thoughts on how to approach clustering them?
from pydataset.
I think it is just manual work. Not all datasets may be suitable for this, but many machine learning people look for datasets to try their algorithms/implementations on a smaller scale before moving on to well-known benchmark datasets.
Agreed, it'd be nice to filter for regression or classification, but I don't see how you could properly categorize datasets.
A regression dataset could be a classification dataset, and vice versa, depending on your preprocessing strategy (e.g. binning) and your choice of target feature.
For example, the canonical iris dataset, normally used for classification, could be viewed as a regression dataset too.
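To make the point concrete, here is a minimal sketch using only the standard library. The rows are hand-typed in the shape of iris and are illustrative only; the column names, bin edges, and labels are assumptions picked for the example, not anything pydataset defines.

```python
from bisect import bisect_right

# Hand-typed rows shaped like iris; the values are illustrative only.
rows = [
    {"petal_length": 1.4, "petal_width": 0.2, "species": "setosa"},
    {"petal_length": 4.7, "petal_width": 1.4, "species": "versicolor"},
    {"petal_length": 6.0, "petal_width": 2.5, "species": "virginica"},
]

# View 1: classification. The target is the categorical species column.
classification_targets = [row["species"] for row in rows]

# View 2: regression. The target is a continuous column, here petal_width.
regression_targets = [row["petal_width"] for row in rows]


def bin_value(value, edges, labels):
    """Map a continuous value to a label using sorted bin edges.

    With n edges there must be n + 1 labels; a value equal to an edge
    falls into the bin above it.
    """
    return labels[bisect_right(edges, value)]


# View 3: binning turns the continuous target back into classes.
# The edges and labels are arbitrary choices for this sketch.
EDGES = [1.0, 2.0]
LABELS = ["narrow", "medium", "wide"]
binned_targets = [bin_value(t, EDGES, LABELS) for t in regression_targets]
```

The same three rows yield a classification target, a regression target, or a second classification target, depending only on which column is picked and whether it is binned.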
My idea was something similar to UCI data repo:
http://archive.ics.uci.edu/ml/datasets.html
The column could be called "Default Task". Some datasets may even have both Classification and Regression.
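A rough sketch of what such a column could look like, assuming a hand-curated catalog. The field name `default_tasks`, the helper `filter_by_task`, and the task labels attached to each dataset are all hypothetical and only illustrate the idea; none of this is part of pydataset.

```python
# Hypothetical catalog: the field name "default_tasks" and the task labels
# assigned below are assumptions for illustration, not pydataset metadata.
CATALOG = [
    {"dataset_id": "iris", "default_tasks": {"classification"}},
    {"dataset_id": "mtcars", "default_tasks": {"regression"}},
    {"dataset_id": "titanic", "default_tasks": {"classification"}},
    # A dataset may carry more than one default task.
    {"dataset_id": "diamonds", "default_tasks": {"classification", "regression"}},
]


def filter_by_task(catalog, task):
    """Return the ids of all catalog entries that list `task` as a default."""
    return [
        entry["dataset_id"]
        for entry in catalog
        if task in entry["default_tasks"]
    ]


regression_ids = filter_by_task(CATALOG, "regression")
classification_ids = filter_by_task(CATALOG, "classification")
```

Storing the tasks as a set rather than a single string lets one dataset appear under both filters, which covers the "both Classification and Regression" case.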