This repository provides the code for a Japanese NLI (natural language inference) model, obtained by fine-tuning a pretrained masked language model.
In terms of overall accuracy, the model performs comparably to the results reported in the JGLUE [Kurihara et al. 2022] and JSICK [Yanaka and Mineshima 2022] papers:
| Model | JGLUE-JNLI valid accuracy [%] | JSICK test accuracy [%] |
|---|---|---|
| [Kurihara et al. 2022] | 91.9 | N/A |
| [Yanaka and Mineshima 2022] | N/A | 89.1 |
| Ours (fine-tuned on both JNLI and JSICK) | 90.9 | 89.0 |
- Hitomi Yanaka and Koji Mineshima. Compositional Evaluation on Japanese Textual Entailment and Similarity. TACL 2022.
- Kentaro Kurihara, Daisuke Kawahara, and Tomohide Shibata. JGLUE: Japanese General Language Understanding Evaluation. LREC 2022.
- Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. EMNLP-IJCNLP 2019.
- Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. Unsupervised Cross-lingual Representation Learning at Scale. ACL 2020.
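
As a usage illustration, the fine-tuned checkpoint can be loaded with the Hugging Face `transformers` text-classification pipeline. This is a minimal sketch, not the repository's official usage instructions: `path/to/checkpoint` is a placeholder for wherever the model is saved, and the label names in the output depend on the checkpoint's `id2label` configuration.

```python
from transformers import pipeline

# "path/to/checkpoint" is a placeholder, not a published model ID.
classifier = pipeline("text-classification", model="path/to/checkpoint")

# NLI input is a (premise, hypothesis) pair.
result = classifier({"text": "男の子がピアノを弾いている。", "text_pair": "男の子が楽器を演奏している。"})
print(result)  # e.g. [{'label': 'entailment', 'score': ...}], depending on id2label
```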
Yes, we tested only a single run :( We fixed the random seeds as follows:

```python
import random

import numpy as np
import torch

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
```
The model was fine-tuned on both of the following datasets:

- JSICK
- JGLUE
We converted the string labels into integers using the following mapping:

```python
label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}
```
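
For illustration, here is a minimal sketch of applying this mapping while reading the training data. The file path and column names are hypothetical placeholders, not the repository's actual data layout:

```python
import pandas as pd

label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}

# Hypothetical TSV path and column names; adjust to the actual data layout.
df = pd.read_csv("data/train.tsv", sep="\t")
df["label"] = df["label"].map(label2int)

# Drop rows with missing or unknown labels, then cast to int.
df = df.dropna(subset=["label"])
df["label"] = df["label"].astype(int)
```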
We mimicked a batch size of 128 using gradient accumulation: an actual batch size of 32 with 4 accumulation steps (32 * 4 = 128); see the sketch after the configuration below.
```python
batch_size=32,            # per-step batch size
shuffle=True,
epochs=3,
accumulation_steps=4,     # 32 * 4 = 128 effective batch size
optimizer_params={'lr': 5e-5},
warmup_steps=math.ceil(0.1 * len(data)),  # learning-rate warmup proportional to the data size (10%)
```
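
For reference, here is a minimal, self-contained sketch of a gradient-accumulation loop with these hyperparameters. The dummy data and linear model are placeholders so the snippet runs on its own; they stand in for the tokenized sentence pairs and the fine-tuned encoder with a 3-way classification head, and are not the repository's actual training code:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

batch_size = 32
accumulation_steps = 4  # 32 * 4 = 128 effective batch size

# Dummy features/labels standing in for the tokenized premise-hypothesis pairs,
# and a linear layer standing in for the fine-tuned encoder + 3-way head.
features = torch.randn(512, 16)
labels = torch.randint(0, 3, (512,))
train_loader = DataLoader(TensorDataset(features, labels), batch_size=batch_size, shuffle=True)
model = nn.Linear(16, 3)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    optimizer.zero_grad()
    for step, (x, y) in enumerate(train_loader):
        loss = criterion(model(x), y)
        # Scale the loss so the accumulated gradient matches one large-batch update.
        (loss / accumulation_steps).backward()
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```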