Comments (6)
You can give it a try.
Be aware that there are some differences between BERT and ALBERT in modelling.py.
Why do you want to train a multilingual model?
from albert_zh.
I want to use the model with Vietnamese. I know the important change is parameter sharing. How can I train it on my language? Thanks for the support =)
1. Replace vocab.txt in ./albert_config with your own vocabulary, then set non_chinese to True when creating the pretraining data with create_pretraining_data.py.
2. Then run pretraining with run_pretraining.py.
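For the first step, a replacement vocab.txt has to come from somewhere. The sketch below is one minimal way to build it, not the repo's own method: it assumes a plain-text corpus with one sentence per line and splits on whitespace (which roughly matches Vietnamese syllables), whereas BERT-style models normally use a WordPiece vocabulary. The function names `build_vocab` and `write_vocab` and the `min_count` and `max_size` parameters are hypothetical; only the five special tokens follow the usual BERT convention.

```python
from collections import Counter

# Special tokens used by BERT-style vocabularies; [PAD] conventionally comes first.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_vocab(lines, min_count=1, max_size=30000):
    """Return special tokens followed by corpus tokens in descending frequency.

    Assumption: whitespace tokenization, which is only a rough stand-in
    for a real WordPiece vocabulary.
    """
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # Sort by descending count, then by token, so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    tokens = [tok for tok, n in ranked
              if n >= min_count and tok not in SPECIAL_TOKENS]
    return SPECIAL_TOKENS + tokens[: max_size - len(SPECIAL_TOKENS)]


def write_vocab(vocab, path):
    """Write one token per line, the layout BERT-style vocab files use."""
    with open(path, "w", encoding="utf-8") as f:
        for tok in vocab:
            f.write(tok + "\n")
```

The resulting list can be written over ./albert_config/vocab.txt with `write_vocab` before running create_pretraining_data.py. Keep in mind that the vocabulary size in the model config has to match the number of lines in the new file.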
Okay. Thanks for the support. Best repo =)
I have tried pretraining on my own dataset. The loss becomes very small, but the accuracy does not improve. How can I improve the result?
@brightmart Can we have a multilingual model for just Chinese and English? In practical scenarios we run into many English words in app names, music titles, the names of all of Apple's products and so on, and Google's multilingual model covers too many languages. Our daily life cannot do without English. You can see that Apple tries to use pure Chinese in its products, for example replacing Finder with 访达, which I think is a total mess.
A language model for just Chinese and English could have a huge impact on both research and industry, and many multilingual tasks could benefit from it.