Code-switching Research Papers

This is a list of important papers on computational linguistic approaches to code-switching research. The list will be updated over time.

Tutorial

Microsoft Research (Monojit Choudhury, Kalika Bali, Anirudh Srinivasan, and Sandipan Dandapat) gave a comprehensive tutorial on code-mixing at EMNLP 2019. You can also check the following link.

Language Identification and POS Tagging

  • Mager, et al. (2019) Subword-Level Language Identification for Intra-Word Code-Switching. NAACL [Paper]
  • Kelsey Ball and Dan Garrette. (2018) Part-of-Speech Tagging for Code-Switched, Transliterated Texts without Explicit Language Identification. EMNLP [Paper]
  • Zeynep Yirmibesoglu and Gulsen Eryigit. (2018) Detecting Code-Switching between Turkish-English Language Pair. Workshop W-NUT, EMNLP [Paper]
  • Mave, et al. (2018) Language Identification and Analysis of Code-Switched Social Media Text. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Victor Soto and Julia Hirschberg. (2018) Joint Part-of-Speech and Language ID Tagging for Code-Switched Data. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Bullock, et al. (2018) Predicting the presence of a Matrix Language in code-switching. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Soto, et al. (2018) The Role of Cognate Words, POS Tags, and Entrainment in Code-Switching. Interspeech [Paper]
  • Barman, et al. (2016) Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling. 2nd Workshop on Computational Approaches to Code-Switching, ACL [Paper]
  • Thamar Solorio and Yang Liu. (2008) Learning to Predict Code-Switching Points. EMNLP [Paper]
  • Dau-Cheng Lyu and Ren-Yuan Lyu. (2008) Language Identification on Code-Switching Utterances Using Multiple Cues. Interspeech [Paper]

Language Modeling and Speech Recognition

  • Winata, et al. (2019) Code-Switched Language Models Using Neural Based Synthetic Data from Parallel Sentences. CoNLL [Paper]
  • Hila Gonen and Yoav Goldberg (2019) Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training. EMNLP [Paper]
  • Lee, et al. (2019) Linguistically Motivated Parallel Data Augmentation for Code-switch Language Modeling. Interspeech [Paper]
  • Victor Soto and Julia Hirschberg (2019) Improving Code-Switched Language Modeling Performance Using Cognate Features. Interspeech [Paper]
  • Chang, et al. (2019) Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. Interspeech [Paper]
  • Zeng, et al. (2019) On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition. Interspeech [Paper]
  • Taneja, et al. (2019) Exploiting Monolingual Speech Corpora for Code-mixed Speech Recognition. Interspeech [Paper]
  • Grandee Lee, Haizhou Li. (2019) Word and Class Common Space Embedding for Code-switch Language Modelling. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) [Paper]
  • Hamed, et al. (2019) Code-Switching Language Modeling with Bilingual Word Embeddings: A Case Study for Egyptian Arabic-English. International Conference on Speech and Computer [Paper]
  • Nakayama, et al. (2018) Speech Chain for Semi-Supervised Learning of Japanese-English Code-Switching ASR and TTS. IEEE Spoken Language Technology Workshop (SLT) [Paper]
  • Jesse Emond, Bhuwana Ramabhadran, Brian Roark, Pedro Moreno, and Min Ma. (2018) Transliteration Based Approaches to Improve Code-Switched Speech Recognition Performance. IEEE Spoken Language Technology Workshop (SLT) [Paper]
  • Ganji Sreeram and Rohit Sinha. (2018) Exploiting Parts-of-Speech for Improved Textual Modeling of Code-Switching Data. 2018 Twenty Fourth National Conference on Communications (NCC) [Paper]
  • Garg, et al. (2018) Code-switched Language Models Using Dual RNNs and Same-Source Pretraining. EMNLP [Paper]
  • Ewald van der Westhuizen and Thomas R. Niesler. (2018) Synthesised bigrams using word embeddings for code-switched ASR of four South African language pairs. Computer Speech and Language [Paper]
  • Biswas, et al. (2018) Multilingual Neural Network Acoustic Modelling for ASR of Under-Resourced English-isiZulu Code-Switched Speech. Interspeech [Paper]
  • Winata, et al. (2018) Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper] [Code]
  • Chandu, et al. (2018) Language Informed Modeling of Code-Switched Text. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Pratapa, et al. (2018) Language Modeling for Code-Mixing: The Role of Linguistic Theory based Synthetic Data. ACL [Paper]
  • Sivasankaran, et al. (2018) Phone Merging For Code-Switched Speech Recognition. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Garg, et al. (2018) Dual Language Models for Code Switched Speech Recognition. Interspeech [Paper]
  • Baheti, et al. (2017) Curriculum Design for Code-switching: Experiments with Language Identification and Language Modeling with Deep Neural Networks. ICON [Paper]
  • Adel, et al. (2015) Syntactic and Semantic Features For Code-Switching Factored Language Models. IEEE Transactions on Audio, Speech, and Language Processing [Paper]
  • Ying Li and Pascale Fung. (2014) Code switch language modeling with Functional Head Constraint. ICASSP [Paper]
  • Ying Li and Pascale Fung. (2014) Language Modeling with Functional Head Constraint for Code Switching Speech Recognition. EMNLP [Paper]
  • Adel, et al. (2013) Combination of Recurrent Neural Networks and Factored Language Models for Code-Switching Language Modeling. ACL [Paper]
  • Adel, et al. (2013) Recurrent neural network language modeling for code switching conversational speech. ICASSP [Paper]
  • Vu, et al. (2012) A First Speech Recognition System for Mandarin-English Code-Switch Conversational Speech. ICASSP [Paper]
  • Ying Li and Pascale Fung. (2012) Code-switch Language Model with Inversion Constraints for Mixed Language Speech Recognition. COLING [Paper]
  • Li, et al. (2011) Asymmetric acoustic modeling of mixed language speech. ICASSP [Paper]

Metric

  • Guzman, et al. (2017) Metrics for modeling code-switching across corpora. Interspeech [Paper]

Representation Learning

  • Winata, et al. (2019) Hierarchical Meta-Embeddings for Code-Switching Named Entity Recognition. EMNLP [Paper] [Code]
  • Pratapa, et al. (2018) Word Embeddings for Code-Mixed Language Processing. EMNLP [Paper]

Machine Translation

  • Thoudam Doren Singh and Thamar Solorio. (2017) Towards Translating Mixed-Code Comments from Social Media. CICLing [Paper]

Named Entity Recognition

  • Winata, et al. (2019) Learning Multilingual Meta-Embeddings for Code-Switching Named Entity Recognition. RepL4NLP, ACL [Paper] [Code]
  • Aguilar, et al. (2018) Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Wang, et al. (2018) Code-Switched Named Entity Recognition with Embedding Attention. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Winata, et al. (2018) Bilingual Character Representation for Efficiently Addressing Out-of-Vocabulary Words in Code-Switching Named Entity Recognition. 3rd Workshop of Computational Approaches to Linguistic Code-switching, ACL [Paper]
  • Aguilar, et al. (2017) A Multi-task Approach for Named Entity Recognition in Social Media Data. 3rd Workshop on Noisy User-generated Text, EMNLP [Paper]

Linguistics

  • Fairchild, Sarah, and Janet G. Van Hell. (2017) Determiner-noun code-switching in Spanish heritage speakers. Bilingualism: Language and Cognition 20.1 (2017): 150-161. [Paper]
  • Bhatt, Rakesh M., and Agnes Bolonyai. (2011) Code-switching and the optimal grammar of bilingual language use. Bilingualism: Language and Cognition 14.4 (2011): 522-546. [Paper]
  • Lipski (2005) Code-switching or Borrowing? No sé so no puedo decir, you know. Second Workshop on Spanish Sociolinguistics [Paper]
  • Roberto R. Heredia and Jeanette Altarriba (2001) Bilingual Language Mixing: Why Do Bilinguals Code-Switch? SAGE Publications [Paper]
  • Belazi, et al. (1994) Code switching and X-bar theory: The functional head constraint. Linguistic inquiry Vol 25 No.2 Spring [Paper]
  • Shana Poplack (1980) Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics 18(7-8) [Paper]
  • Pfaff, Carol W. (1979) Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language: 291-318. [Paper]
  • Shana Poplack (1978) Syntactic structure and social function of code-switching. Vol. 2. Centro de Estudios Puertorriqueños, City University of New York [Paper]
  • Gumperz, J. J., & Hernandez, E. (1969) Cognitive aspects of bilingual communication. Institute of International Studies, University of California [Paper]

Affective Computing

  • Wang, et al. (2017) Emotion Analysis in Code-Switching Text With Joint Factor Graph Model. IEEE/ACM Transactions on Audio, Speech, and Language Processing [Paper]
  • Wang, et al. (2016) A Bilingual Attention Network for Code-switched Emotion Prediction. COLING [Paper]
  • Sophia Lee and Zhongqing Wang (2015) Emotion in Code-switching Texts: Corpus Construction and Analysis. Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing [Paper]
  • Wang, et al. (2015) Emotion Detection in Code-switching Texts via Bilingual and Sentimental Information. ACL [Paper]

Dialog and Conversational System

  • Gupta, et al. (2018) Uncovering Code-Mixed Challenges: A Framework for Linguistically Driven Question Generation and Neural based Question Answering. CoNLL [Paper]

Syntax

  • Özlem Çetinoğlu and Çağrı Çöltekin (2019) Challenges of Annotating a Code-Switching Treebank. SyntaxFest [Paper]

Social Linguistics

  • Yoder, et al. (2017) Code-Switching as a Social Act: The Case of Arabic Wikipedia Talk Pages. Proceedings of the Second Workshop on Natural Language Processing and Computational Social Science, ACL [Paper]
  • Agarwal, et al. (2017) I may talk in English but gaali toh Hindi mein hi denge: A study of English-Hindi code-switching and swearing pattern on social networks. International Conference on Communication Systems and Networks (COMSNETS) [Paper]

Corpus

  • Singh, et al. (2018) A Twitter Corpus for Hindi-English Code Mixed POS Tagging. Sixth International Workshop on Natural Language Processing for Social Media, ACL [Paper]
  • Li, et al. (2012) A Mandarin-English Code-Switching Corpus. LREC [Paper]
  • Lyu, et al. (2010) SEAME: A Mandarin-English Code-Switching Speech Corpus in South-East Asia. Interspeech [Paper]
  • Lyu, et al. (2010) An Analysis of a Mandarin-English Code-switching Speech Corpus: SEAME. Age [Paper]

Dialog and Conversational System

  • Banerjee, et al. (2018) A Dataset for Building Code-Mixed Goal Oriented Conversation Systems. COLING [Paper]

Annotation Tool

  • Shah, et al. (2019) CoSSAT: Code-Switched Speech Annotation Tool. Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP [Paper]

Other useful resources

Book

  • Torres Cacoullos and Travis (2018) Bilingualism in the Community. Cambridge University Press

Contributors

gentaiscool
