aybaras / bert-base-turkish-qa

A Turkish question answering system made by fine-tuning BERTurk and XLM-Roberta models.

Home Page: https://huggingface.co/dbmdz/bert-base-turkish-128k-cased

License: MIT License

Jupyter Notebook 100.00%
question-answering bert-model bert-fine-tuning turkish-nlp nlp natural-language-processing nlp-machine-learning question-answering-system haystack huggingface

BERT-base-Turkish-QA

A Turkish question answering system built by fine-tuning BERTurk, a BERT-base transformer model for Turkish, and XLM-R. We trained the models on several Turkish data sets, measured exact match and F1 scores on each evaluation set, and compared the results. For our final models, we concatenated all of the Turkish data sets into one and trained on the whole training set. You can check out our final models: Turkish BERTurk Based Model and Turkish XLM-R Based Model.

This project was carried out during our joint internship at SESTEK Speech Enabled Software Technologies.
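The exact match and F1 scores mentioned above are usually computed per question in the SQuAD style and then averaged. The following is a minimal sketch of that idea, not the evaluation script used in this project. The function names are hypothetical, and the normalization (lowercasing, stripping ASCII punctuation, collapsing whitespace) is an assumption; Python's default lowercasing does not handle the Turkish dotted and dotless I specially.

```python
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between the predicted and the reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    if not pred_tokens or not true_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == true_tokens)
    # Multiset intersection counts each shared token at most as often as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

A prediction that contains the reference answer plus extra words gets no exact match credit but a partial F1 score, which is why F1 is consistently higher than exact match in the results.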

Data Sets

OkanVK's Turkish Reading Comprehension Question Answering Data Set

TQuAD (Turkish Question Answering Data Set)

XQuAD (Cross-lingual Question Answering Data Set)

Kuzgunlar's Data Set
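Concatenating data sets like the ones listed above can be sketched as follows, assuming each one is stored in SQuAD-style JSON (a top-level "data" list of articles, each with "paragraphs" that hold "qas"). That format and the helper names are assumptions made for illustration; this is not the project's own preprocessing code.

```python
import json


def load_squad_file(path):
    """Read one SQuAD-style JSON file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def merge_squad_datasets(datasets, version="merged"):
    """Concatenate the article lists of several SQuAD-style data sets."""
    merged = {"version": version, "data": []}
    for dataset in datasets:
        merged["data"].extend(dataset["data"])
    return merged


def count_questions(dataset):
    """Count the question entries across all articles and paragraphs."""
    return sum(
        len(paragraph["qas"])
        for article in dataset["data"]
        for paragraph in article["paragraphs"]
    )
```

Comparing `count_questions` before and after merging is a quick check that no questions were lost. Question ids may collide across sources, so a real pipeline might also need to make them unique.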

Base Models

BERTurk

XLM-R

Model Comparison

| Base Model | Training Set | Evaluation Set | Exact Match | F1 | Epochs | max_seq_length | doc_stride | learning_rate |
|---|---|---|---|---|---|---|---|---|
| bert-base-turkish-128k-cased | whole_train_dataset | whole_dev_dataset | 62.48 | 81.60 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | whole_train_dataset | okanvk_dev | 62.48 | 81.66 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | whole_train_dataset | tquad_dev | 62.22 | 80.42 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | whole_train_dataset | xquad.tr | 45.89 | 66.37 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Tquad_train | whole_dev_dataset | 56.32 | 76.86 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Tquad_train | okanvk_dev | 56.31 | 76.87 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Tquad_train | tquad_dev | 57.40 | 78.68 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Tquad_train | xquad.tr | 41.76 | 60.83 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Okanvk_train | whole_dev_dataset | 60.37 | 80.53 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Okanvk_train | okanvk_dev | 60.37 | 80.63 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Okanvk_train | tquad_dev | 58.63 | 78.43 | 2 | 512 | 128 | 3e-5 |
| bert-base-turkish-128k-cased | Okanvk_train | xquad.tr | 46.38 | 70.74 | 2 | 512 | 128 | 3e-5 |
| XLM-R Base | whole_train_dataset | whole_dev_dataset | 54.60 | 76.83 | 2 | 512 | 128 | 3e-5 |
| XLM-R Base | whole_train_dataset | okanvk_dev | 54.68 | 76.94 | 2 | 512 | 128 | 3e-5 |
| XLM-R Base | whole_train_dataset | tquad_dev | 53.48 | 75.26 | 2 | 512 | 128 | 3e-5 |
| XLM-R Base | whole_train_dataset | xquad.tr | 42.27 | 61.72 | 2 | 512 | 128 | 3e-5 |
| Savasy QA (not base) | Tquad_train | tquad_dev | 62.56 | 80.48 | | | | |
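The max_seq_length and doc_stride hyperparameters control how a context that is too long for the model is split into overlapping windows. The sketch below illustrates this under stated assumptions: doc_stride is treated as the step between consecutive window starts, and the window length is the token budget left for the context after the question and special tokens. Some toolkits instead interpret the stride as the number of overlapping tokens, and the source does not say which training script was used. The function name and the budget of 480 tokens in the usage note are hypothetical.

```python
def split_into_windows(num_tokens, window_len, doc_stride):
    """Return (start, end) token spans that cover a context of num_tokens tokens.

    Assumption: doc_stride is the step between consecutive window starts.
    """
    if window_len <= 0 or doc_stride <= 0:
        raise ValueError("window_len and doc_stride must be positive")
    windows = []
    start = 0
    while True:
        end = min(start + window_len, num_tokens)
        windows.append((start, end))
        if end == num_tokens:
            # The last window reaches the end of the context.
            break
        start += doc_stride
    return windows
```

For example, `split_into_windows(1000, 480, 128)` yields six spans, from `(0, 480)` to `(640, 1000)`, while a context of 300 tokens fits in the single span `(0, 300)`. The model predicts an answer span per window, and the highest-scoring prediction across windows is typically kept.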

Authors

👤 Aras Güngöre

👤 Aybars Manav
