Comments (4)
- Datasets, word vectors, etc. are called resources.
- Every resource has a config:
  - resource name (unique)
  - resource type (identifies the postprocess)
  - resource hashtag
  - resource link
  - anything else the download module needs (as little as possible)
- Download pipeline:
  - given a resource name
  - find its config
  - if not cached, download from the link, process after download (found by type), and save to the cache
  - if cached, read from local disk
  - check the hashtag
  - process before read (found by type) (*)
  - provide to other modules
- Import a benchmark from local:
  - given a resource name and a local path
  - check the hashtag
  - import into the cache
  - continue from step (*)
- Import temporary resources from local:
  - given a local path and type
  - continue from step (*)
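
The download pipeline above could be sketched roughly as follows. This is only an illustration under stated assumptions, not cotk's actual implementation: `ResourceConfig`, `Processor`, `load_resource`, and the injectable `download` callable are hypothetical names, the hashtag is assumed to be a SHA-256 hex digest of the cached file, and the hashtag is assumed to be checked on every read.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, Dict
from urllib.request import urlretrieve


@dataclass
class ResourceConfig:
    """Hypothetical config record with the fields listed in the proposal."""
    name: str     # unique resource name
    type: str     # selects the processor
    hashtag: str  # assumed here: SHA-256 hex digest of the cached file
    link: str     # download location


@dataclass
class Processor:
    """Hypothetical pair of per-type hooks."""
    postprocess: Callable[[Path], None]  # runs once, right after download
    preprocess: Callable[[Path], Any]    # runs on every read, step (*)


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large resources need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _download(link: str, dst: Path) -> None:
    """Default downloader; can be swapped out, e.g. in tests."""
    urlretrieve(link, str(dst))


def load_resource(
    name: str,
    configs: Dict[str, ResourceConfig],
    processors: Dict[str, Processor],
    cache_dir: Path,
    download: Callable[[str, Path], None] = _download,
) -> Any:
    config = configs[name]                  # find config by resource name
    processor = processors[config.type]     # find processor by type
    cached = cache_dir / config.name
    if not cached.exists():                 # not cached: download and save
        cache_dir.mkdir(parents=True, exist_ok=True)
        download(config.link, cached)
        processor.postprocess(cached)       # process after download
    if sha256_of(cached) != config.hashtag:  # check hashtag
        raise ValueError(f"hashtag mismatch for resource {name!r}")
    return processor.preprocess(cached)     # step (*): process before read
```

Importing a benchmark from a local path would reuse the same tail of this function: copy the file into `cache_dir`, then run the hashtag check and step (*).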
from cotk.
TODO:
- import temporary resources from local
- fill the dataset config and type
- Change file_path to file_id:
  - if file_id is like "resources://RESOURCE_ID", find RESOURCE_ID in configs. (Check the config type == MSCOCO)
  - if file_id starts with "http" or "https", download and cache it. (set resources type = MSCOCO)
  - otherwise, file_id is a local path. Don't cache it. (set resources type = MSCOCO)
- change other dataloader & wordvector
- dataset download path:
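
The three file_id cases above amount to a small dispatch on the string prefix. A minimal sketch, assuming hypothetical helper names (`classify_file_id`, `should_cache`) and assuming the URL check matches the full `http://` and `https://` schemes rather than the bare prefix:

```python
from typing import Tuple

RESOURCE_PREFIX = "resources://"


def classify_file_id(file_id: str) -> Tuple[str, str]:
    """Return (kind, value), where kind is 'resource', 'url', or 'local'."""
    if file_id.startswith(RESOURCE_PREFIX):
        # value is the RESOURCE_ID to look up in the configs
        return "resource", file_id[len(RESOURCE_PREFIX):]
    if file_id.startswith(("http://", "https://")):
        # value is the URL to download and cache
        return "url", file_id
    # anything else is treated as a local path
    return "local", file_id


def should_cache(kind: str) -> bool:
    """Local paths are read in place; the other two kinds go through the cache."""
    return kind in ("resource", "url")
```

For example, `classify_file_id("resources://MSCOCO")` returns `("resource", "MSCOCO")`, while `classify_file_id("./data/mscoco")` returns `("local", "./data/mscoco")` and is not cached.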
Done