
acl23-big-tech-nlp's Introduction

The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research


This repository contains the code and data for the ACL 2023 paper The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research.


Data

To use our preprocessed data, see data/processed_data.zip.
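
As a minimal sketch of unpacking the archive from Python: the helper name `extract_processed_data` and the output directory `data/processed` are assumptions made for illustration, and the contents of the archive are not documented here, so the function only reports whatever file names it finds.

```python
import zipfile
from pathlib import Path


def extract_processed_data(zip_path="data/processed_data.zip", out_dir="data/processed"):
    """Extract the archive into out_dir and return the member names it contained.

    The default out_dir is a hypothetical location, not one prescribed by the repository.
    """
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        raise FileNotFoundError(f"{zip_path} not found; run this from the repository root")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

Calling `extract_processed_data()` from the repository root should unpack the files and return their names, which is a quick way to see what the archive provides before opening the notebooks.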

To reproduce the data for later years, you will need to download the anthology.bib from the ACL Anthology and place it in the data folder.
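
Before running the dataset notebook on a fresh download, it can help to check that the file is in place and covers the years you expect. The following is a rough sketch, not part of the repository: the function name `count_entries_per_year` is hypothetical, and the regular expression assumes each entry carries a four-digit `year` field on its own line, as in the citation entry in this README.

```python
import re
from collections import Counter
from pathlib import Path

# Matches lines such as: year = "2023",  year = {2023},  year = 2023,
YEAR_FIELD = re.compile(r'^\s*year\s*=\s*["{]?(\d{4})', re.IGNORECASE | re.MULTILINE)


def count_entries_per_year(bib_path="data/anthology.bib"):
    """Return a Counter mapping publication year to the number of year fields found."""
    text = Path(bib_path).read_text(encoding="utf-8")
    return Counter(int(year) for year in YEAR_FIELD.findall(text))
```

For example, `sorted(count_entries_per_year().items())[-5:]` lists the five most recent years present in the file together with their entry counts.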

You can find the list of universities here and an export of big tech companies here.

To reproduce the dataset, you can use the notebook notebooks/datasets.ipynb.

Analysis

To run parts of the paper's analysis, use the notebook notebooks/analysis.ipynb.

How to Cite

@inproceedings{abdalla-etal-2023-elephant,
    title = "The Elephant in the Room: Analyzing the Presence of Big Tech in Natural Language Processing Research",
    author = "Abdalla, Mohamed  and
      Wahle, Jan Philip  and
      Lima Ruas, Terry  and
      N{\'e}v{\'e}ol, Aur{\'e}lie  and
      Ducel, Fanny  and
      Mohammad, Saif  and
      Fort, Karen",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-long.734",
    doi = "10.18653/v1/2023.acl-long.734",
    pages = "13141--13160",
    abstract = "Recent advances in deep learning methods for natural language processing (NLP) have created new business opportunities and made NLP research critical for industry development. As one of the big players in the field of NLP, together with governments and universities, it is important to track the influence of industry on research. In this study, we seek to quantify and characterize industry presence in the NLP community over time. Using a corpus with comprehensive metadata of 78,187 NLP publications and 701 resumes of NLP publication authors, we explore the industry presence in the field since the early 90s. We find that industry presence among NLP authors has been steady before a steep increase over the past five years (180{\%} growth from 2017 to 2022). A few companies account for most of the publications and provide funding to academic researchers through grants and internships. Our study shows that the presence and impact of the industry on natural language processing research are significant and fast-growing. This work calls for increased transparency of industry influence in the field.",
}
