
gt-nlp-class's Introduction

CS 4650 and 7650

(Note about registration: registration is currently restricted to students pursuing CS degrees for which this course is an essential requirement. Unfortunately, the enrollment is already at the limit of the classroom space, so this restriction is unlikely to be lifted.)

  • Course: Natural Language Understanding
  • Instructor: Jacob Eisenstein
  • Semester: Spring 2018
  • Time: Mondays and Wednesdays, 3:00-4:15pm
  • TAs: Murali Raghu Babu, James Mullenbach, Yuval Pinter, Zhewei Sun
  • Schedule
  • Recaps from previous classes

This course gives an overview of modern data-driven techniques for natural language processing. The course moves from shallow bag-of-words models to richer structural representations of how words interact to create meaning. At each level, we will discuss the salient linguistic phenomena and the most successful computational models. Along the way we will cover machine learning techniques that are especially relevant to natural language processing.

Learning goals

  • Acquire the fundamental linguistic concepts that are relevant to language technology. This goal will be assessed in the short homework assignments and the exams.
  • Analyze and understand state-of-the-art algorithms and statistical techniques for reasoning about linguistic data. This goal will be assessed in the exams and the assigned projects.
  • Implement state-of-the-art algorithms and statistical techniques for reasoning about linguistic data. This goal will be assessed in the assigned projects.
  • Adapt and apply state-of-the-art language technology to new problems and settings. This goal will be assessed in assigned projects.
  • (7650 only) Read and understand current research on natural language processing. This goal will be assessed in assigned projects.

Readings will be drawn mainly from my notes. Additional readings may be assigned from published papers, blog posts, and tutorials.

Supplemental textbooks

These are completely optional, but might deepen your understanding of the material.

Grading

The graded material for the course will consist of:

  • Seven short homework assignments, of which you must do six. Most of these involve performing linguistic annotation on some text of your choice. The purpose is to get a basic understanding of key linguistic concepts. Each assignment should take less than an hour. Each homework is worth 2 points (12 total). (Many of these homeworks are implemented as quizzes on Canvas.)
  • Four assigned problem sets. These involve building and using NLP techniques which are at or near the state-of-the-art. The purpose is to learn how to implement natural language processing software, and to have fun. These assignments must be done individually. Each problem set is worth twelve points (48 total). Students enrolled in CS 7650 will have an additional, research-oriented component to the problem sets.
  • An in-class midterm exam, worth 20 points, and a final exam, worth 20 points. The purpose of these exams is to assess understanding of the core theoretical concepts, and to encourage you to review and synthesize your understanding of these concepts.

Barring a personal emergency or an institute-approved absence, you must take each exam on the day indicated in the schedule. Job interviews and travel plans are generally not a reason for an institute-approved absence. See here for more information on GT policy about absences.

Late policy

Problem sets will be accepted up to 72 hours late, at a penalty of 2 points per 24 hours. (Maximum score after missing the deadline: 10/12; maximum score 24 hours after the deadline: 8/12, etc.) It is usually best just to turn in what you have at the due date. Late homeworks will not be accepted. This late policy is intended to ensure fair and timely evaluation.
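
As an illustration only, here is one way to read this schedule in code. This hypothetical helper (not part of any course tooling) assumes the problem sets' 12-point scale and that each further 2-point deduction applies starting at exactly 24, 48, and 72 hours, matching the parenthetical above:

```python
# Hypothetical helper (not part of any course tooling): maximum attainable
# problem-set score, out of 12, as a function of hours past the deadline.
def max_score(hours_late, full_score=12, penalty=2, window=72):
    if hours_late <= 0:
        return full_score                  # on time
    if hours_late > window:
        return 0                           # not accepted beyond 72 hours
    periods = int(hours_late // 24) + 1    # deductions step at 0h, 24h, 48h late
    return full_score - penalty * periods

assert max_score(1) == 10    # just after the deadline: 10/12
assert max_score(24) == 8    # 24 hours after the deadline: 8/12
assert max_score(73) == 0    # more than 72 hours late: not accepted
```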

Getting help

My office hours follow the Wednesday classes (4:15-5:15PM) and take place in the classroom when it is available.

TA office hours are in CCB commons (1st floor) unless otherwise announced on Piazza.

  • Murali: Friday 10AM-11AM
  • James: Thursday 11AM-12PM
  • Yuval: Tuesday 3PM-4PM
  • Zhewei: Monday 1PM-2PM

Online help

Please use Piazza rather than personal email to ask questions. This helps other students, who may have the same question. Personal emails may not be answered. If you cannot make it to office hours, please use Piazza to make an appointment. It is unlikely that I will be able to chat if you make an unscheduled visit to my office. The same is true for the TAs.

Class policies

Attendance will not be taken, but you are responsible for knowing what happens in every class. If you cannot attend class, make sure you check in with someone who was there.

Respect your classmates and your instructor by preventing distractions. This means be on time, turn off your cellphone, and save side conversations for after class. If you can't read something I wrote on the board, or if you think I made a mistake in a derivation, please raise your hand and tell me!

Using a laptop in class is likely to reduce your educational attainment. This has been documented by multiple studies, which are nicely summarized in the following article:

I am not going to ban laptops, as long as they are not a distraction to anyone but the user. But I suggest you try pen and paper for a few weeks, and see if it helps.

Prerequisites

The official prerequisite for CS 4650 is CS 3510/3511, "Design and Analysis of Algorithms." This prerequisite is essential because understanding natural language processing algorithms requires familiarity with dynamic programming, as well as automata and formal language theory: finite-state and context-free languages, NP-completeness, etc. While course prerequisites are not enforced for graduate students, prior exposure to analysis of algorithms is very strongly recommended.

Furthermore, this course assumes:

  • Good coding ability, corresponding to at least a third- or fourth-year undergraduate CS major. Assignments will be in Python.
  • Background in basic probability, linear algebra, and calculus.

People sometimes want to take the course without having all of these prerequisites. Frequent cases are:

  • Junior CS students with strong programming skills but limited theoretical and mathematical background,
  • Non-CS students with strong mathematical background but limited programming experience.

Students in the first group suffer on the exams and don't understand the lectures, and students in the second group suffer on the problem sets. My advice is to get the background material first, and then take this course.

One of the goals of the assigned work is to assess your individual progress in meeting the learning objectives of the course. You may discuss the homework and projects with other students, but your work must be your own -- particularly all coding and writing. For example:

Examples of acceptable collaboration

  • Alice and Bob discuss alternatives for storing large, sparse vectors of feature counts, as required by a problem set. (One possible design is sketched after this list.)
  • Bob is confused about how to implement the Viterbi algorithm, and asks Alice for a conceptual description of her strategy. (A sketch at that conceptual level also follows this list.)
  • Alice asks Bob if he encountered a failure condition at a "sanity check" in a coding assignment, and Bob explains at a conceptual level how he overcame that failure condition.
  • Alice is having trouble getting adequate performance from her part-of-speech tagger. She finds a blog page or research paper that gives her some new ideas, which she implements.
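
To make the first item concrete, here is the kind of design Alice and Bob could legitimately discuss: storing feature counts in a dictionary keyed by feature, so that only observed features consume memory. This is a minimal sketch of one common choice, not a prescribed solution for any problem set:

```python
# A minimal sketch: sparse feature counts stored as a Counter keyed by
# (label, word) tuples; only observed features take up memory.
from collections import Counter

def feature_counts(tokens, label):
    """Bag-of-words features for one labeled instance."""
    counts = Counter()
    for tok in tokens:
        counts[(label, tok)] += 1
    counts[(label, "**OFFSET**")] += 1  # bias feature; the name is illustrative
    return counts

# A dense vector over all (label, word) pairs would be mostly zeros;
# the dictionary stores only the nonzero entries.
print(feature_counts("they can fish".split(), "NOUN"))
```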
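
Likewise, here is the level of conceptual description of Viterbi that is fine to share, written as an illustrative sketch. The additive scoring model and the function signature are assumptions for this example, not any problem set's API; copying an actual implementation would fall under the unacceptable examples below:

```python
# Illustrative Viterbi sketch for an additive scoring model:
# score(m, y_m, y_{m-1}) = emission[m][y_m] + transition[y_{m-1}][y_m].
def viterbi(emission, transition, tags):
    M = len(emission)
    v = [{t: emission[0][t] for t in tags}]  # best score of a prefix ending in t
    back = []   # back[m-1][t]: best predecessor of tag t at position m
    for m in range(1, M):
        v.append({})
        back.append({})
        for t in tags:
            best_prev = max(tags, key=lambda s: v[m - 1][s] + transition[s][t])
            back[m - 1][t] = best_prev
            v[m][t] = v[m - 1][best_prev] + transition[best_prev][t] + emission[m][t]
    # trace backpointers from the best final tag
    path = [max(tags, key=lambda t: v[M - 1][t])]
    for m in range(M - 2, -1, -1):
        path.append(back[m][path[-1]])
    return list(reversed(path))

tags = ("N", "V")
emission = [{"N": 2.0, "V": 0.0}, {"N": 0.0, "V": 1.0}]
transition = {"N": {"N": -1.0, "V": 1.0}, "V": {"N": 0.0, "V": -1.0}}
assert viterbi(emission, transition, tags) == ["N", "V"]
```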

Examples of unacceptable collaboration

  • Alice and Bob work together to write code for storing feature counts.
  • Alice and Bob divide the assignment into parts, and each write the code for their part, and then share their solutions with each other to complete the assignment.
  • Alice or Bob obtain a solution to a previous year's assignment or to a related assignment in another class, and use it as the starting point for their own solutions.
  • Bob is having trouble getting adequate performance from his part-of-speech tagger. He finds source code online, and copies it into his own submission.
  • Alice wants to win the Kaggle competition for a problem set. She finds the test set online, and customizes her submission to do well on it.

Some assignments will involve written responses. Using other people’s text or figures without attribution is plagiarism, and is never acceptable.

Suspected cases of academic misconduct will be (and have been!) referred to the Honor Advisory Council. For any questions involving these or any other Academic Honor Code issues, please consult me, my teaching assistants, or http://www.honor.gatech.edu.

gt-nlp-class's People

Contributors

jacobeisenstein · jiyfeng · logan-life · muralibalusu12 · sandeepsoni · stevenbedrick · umashanthi · yuvalpinter


gt-nlp-class's Issues

Small typo

In the 'draft 3 june' version, at line 1892, there is an accidental word duplication:

Classification of these relations relations can be performed by searching for characteristic patterns between pairs of words

The word relations appears twice.

By the way: great read so far, thanks a lot for open-sourcing this!

Typo line 10316

"An indicator random variable is a functions" -> "An indicator random variable is a function"

A typo in the textbook

Dear Professor Eisenstein,

I found a typo in the textbook. The left-hand side of the equation [15.33] has more closing parentheses than opening ones.

With best regards,

Katô Taisei

Possible typo in Equation 2.58

Dear Prof. Eisenstein,

I guess there should be a minus sign at the beginning of the right side of the equation:

l_{LogReg} (\theta; x^{(i)}, y^{(i)}) = - \theta f(x^{(i)}, y^{(i)}) + rest of the equation

This can also be illustrated by Equation 2.60.
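
For reference, here is the conventional form of the logistic regression negative log-likelihood, written out only to show where the minus sign lands; this is the standard textbook formulation, not necessarily the book's exact equation:

```latex
\ell_{\text{LogReg}}(\theta; x^{(i)}, y^{(i)})
  = -\theta \cdot f(x^{(i)}, y^{(i)})
  + \log \sum_{y' \in \mathcal{Y}} \exp\left( \theta \cdot f(x^{(i)}, y') \right)
```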

Thank you for sharing your book! I enjoy reading it.

Weiwei

Possible error in 4.4.1

Hi there,

I think the selected sentence should read: "A high-recall classifier is preferred when false positives are cheaper than false negatives."

Thanks.

Typos in the notes, chapter 6

I've found some small typos in chapter 6 of the notes, in the version from October 21, 2018:

  • Page 133, line 8: $p^\star(Duluth)$ → $p^\star_1(Duluth)$ (in order to be consistent with the probability of Francisco)
  • Page 140, line -3: "a perplexities of around" → "perplexities of around"
  • Page 140, equation [6.42] – the comma is in the exponent
  • Page 141, line -5: "n-gramsor RNNs" → "n-grams or RNNs"
  • Page 141, line -6: "character" overflows into the margin

And one suggestion: I think the article by Melis et al., On the State of the Art of Evaluation in Neural Language Models, might be relevant for subsection 6.3.2, as it makes a strong case for hyperparameter tuning.

That said, thank you for making the notes available – they seem like a very fine resource!

Typo in the notes

Hi,

I know this is very minor, but there is a typo in the notes (eisenstein-nlp-notes-snapshot.pdf).

It is at the top of page 86, when discussing effective counts just above equation (5.14):

dominator -> denominator

(dominate, exterminate?)

Possible missing subscript on Equation for CRFs

Dear professor Eisenstein,

Thank you very much for making the book publicly available.
I would like to point out a possible typo in the equation in section 7.5.3.1:

Should the s function not be written s_m, indicating the transition between labels y_{m-1} and y_m at the m-th position of the underlying w sequence?

Thank you very much,
Luiz C F Ribeiro

Errors in the textbook

Dear Professor Eisenstein,

I found some errors in the printed version of Introduction to Natural Language Processing.

  • On Notation (page xiii), the base of the exponential and logarithm should be e because the base-2 exponent and the base-2 logarithm do not satisfy (\exp x)' = \exp x and (\log x)' = 1 / x, respectively, which are used in, for example, equation [2.26].
  • On Exercise 5-5, 'This is the "direct transfer" baseline' should be appended to the first bullet item.
  • On Exercise 6-9, the definition of the Riemann zeta function is \sum_{r = 1}^\infty r^{-s}.
  • In Algorithm 11 (on page 142),
    • On lines 1 and 4, k iterates from 0 to K, but it means that there are K + 1 tags. Also, there is a missing comma on line 1.
    • On line 9, b_m should be b_{m+1}.
  • On equation [7.86], n (both in the numerator and the denominator) should iterate until M + 1.
  • On equation [7.88], n should start from m + 1. And the following equation must be \sum_{k' \in \mathcal{Y}} \exp s_{m + 1}(k', k)\sum_{\boldsymbol{y}_{m + 1: M}: Y_{m + 1} = k'} \prod_{n = m + 2}^{M + 1} \exp s_n(y_n, y_{n - 1}).

With best regards,

Katô Taisei

Typo on line 697 of the June 20, 2018 edition

Professor Eisenstein,
"It is also includes" (last 4 words on line 697, p.24): I think you meant "It also includes".
P.S. Thanks for making the draft of your textbook available. I've just started reading it and it's been very enjoyable.

Suggestion for the commit message

Dear @jacobeisenstein,

Thank you for continually improving the quality of your NLP notes. However, I find it difficult to locate the actual changes, since your commit messages are very brief, without any details. This is also due to the nature of PDF files, which make diffs harder to track.

If you don't mind, I recommend adding more detail about the changes to the commit message, at least the location of the diff (e.g., which page and section). That way, instead of re-downloading your notes, I can just correct my copy by hand.

I hope this isn't too much to ask. Thanks!

HW1 Uploading?

There isn't an assignment listed on Tsquare. Will there be, or is it posted somewhere else?

Notes error: Bayes exercise?

Hi,

I have to say that I am not totally sure, but I think that there is a mistake in the review of basic probability, in the example using Bayes' rule (Lana Linguist's pattern matcher).

Page 19, equation (1.23) states that

Pr(T ∩ ¬G) = Pr(T | G) × Pr(¬G)

instead of

Pr(T ∩ ¬G) = Pr(T | ¬G) × Pr(¬G)

Because of this, I initially thought the numbers were also wrong, with Pr(T | ¬G) = .995 rather than .005, and that the rest of the derivation up to equation (1.28) was affected.

EDIT: Pr(T | ¬G) is indeed .005, so the numbers are not wrong; only the conditional in equation (1.23) needs to be corrected.

Under my original (mistaken) reading, the four equations below would have had to change:

Pr(T ∩ ¬G) = Pr(T | ¬G) × Pr(¬G) = 0.995 × (1 − 10^−5) ≈ 0.995   (1.23)
Pr(T) = 0.95 × 10^−5 + 0.995 × (1 − 10^−5) ≈ 0.995   (1.25)
Pr(G | T) = 0.95 × 10^−5 / (0.95 × 10^−5 + 0.995 × (1 − 10^−5))   (1.27)
≈ 0.00001   (1.28)

...which would have made the pattern matcher even less useful than previously thought!
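
As a sanity check, here is a standalone computation of both readings. The event semantics are my paraphrase of the example (T = the pattern matcher fires, G = the sentence is grammatical), and the 0.0019 figure is what the notes presumably derive:

```python
# Standalone Bayes-rule check of both readings discussed above.
p_g = 1e-5            # Pr(G): prior probability of a grammatical sentence
p_t_given_g = 0.95    # Pr(T | G)

for p_t_given_not_g in (0.005, 0.995):
    p_t = p_t_given_g * p_g + p_t_given_not_g * (1 - p_g)  # total probability
    p_g_given_t = p_t_given_g * p_g / p_t                  # Bayes' rule
    print(f"Pr(T|~G) = {p_t_given_not_g}: Pr(G|T) = {p_g_given_t:.6f}")

# Pr(T|~G) = 0.005 gives Pr(G|T) ~ 0.0019 (the numbers in the notes);
# Pr(T|~G) = 0.995 would give Pr(G|T) ~ 0.00001, the retracted correction.
```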

Possible Typo Lines 817 to 818

Hi, I think there's a typo on page 33, lines 817 to 818, where you use the word "describes" twice. Maybe you meant to say: "Algorithm 1 describes the generative model, the Naive Bayes classifier, with parameters..."

Brandon Peck

Kneser-Ney Smoothing formula 6.23

It seems that formula 6.23 does not agree with what Wikipedia gives.

I find that the probabilities sum to less than 1 according to the formula in the notes.
Correct me if I'm wrong.
Thanks.
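
For what it's worth, the interpolated formulation from the literature does normalize; here is a self-contained numerical check. This sketch uses the standard interpolated Kneser-Ney equations with absolute discounting, which may differ from equation 6.23 in the notes, and the toy corpus is arbitrary:

```python
# Numerical check that interpolated Kneser-Ney bigram probabilities sum to 1.
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()
bigrams = Counter(zip(corpus, corpus[1:]))
context_count = Counter(corpus[:-1])        # counts of each word as a bigram context
vocab = set(corpus)
d = 0.75                                    # absolute discount

continuations = Counter(w for (v, w) in bigrams)  # N1+(., w)
total_bigram_types = len(bigrams)                 # N1+(., .)

def p_kn(w, v):
    p_cont = continuations[w] / total_bigram_types
    lam = d * sum(1 for (v2, _) in bigrams if v2 == v) / context_count[v]
    return max(bigrams[(v, w)] - d, 0) / context_count[v] + lam * p_cont

for v in ("the", "cat"):
    print(v, sum(p_kn(w, v) for w in vocab))      # both sums should be ~1.0
```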

BibTeX entry

Dear professor,

Once again, thank you very much for making this material publicly available. Is there a default BibTeX entry that we should use to cite the notes?

Thank you.

Formal Language Theory and Dijkstra's algorithm, Chapter 9

Chapter 9, page 194 states that for the membership operation we should (or can) use Dijkstra's algorithm for a DFA. Surely with a DFA the membership test is much simpler, as a string has a unique path: we need only follow the M edges labelled with the input symbol sequence, then check whether the resulting state is a final state in F. Thus there is no need for Dijkstra, and the complexity is linear in the string size (the size of the DFA is irrelevant). The need for Dijkstra would arise if we used an NFA that hasn't been determinised, or an FST. My comment also pertains to p. 197, below [9.5], which repeats the point about shortest-paths for WFSAs.

Maybe I'm missing something, please let me know.
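
For concreteness, the linear-time membership test described above looks like this. The encoding of the DFA (a dict from (state, symbol) pairs to states) and the example automaton are illustrative assumptions:

```python
# Minimal sketch of DFA membership: follow the unique transition for each
# input symbol, then test whether the resulting state is final. O(M) time.
def accepts(delta, start, finals, string):
    state = start
    for symbol in string:
        if (state, symbol) not in delta:
            return False                   # no outgoing transition: reject
        state = delta[(state, symbol)]
    return state in finals

# A DFA for the language a*b:
delta = {("q0", "a"): "q0", ("q0", "b"): "q1"}
assert accepts(delta, "q0", {"q1"}, "aab")
assert not accepts(delta, "q0", {"q1"}, "abb")
```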

A small typo

Hi Jacob,

The footnote 13 (under line 1057) of the class note (version June 1, 2018):

A function f is convex iff αf(x_i) + (1 − α)f(x_j) ≥ f(αx_i + (1 − α)x_j),

Should 'iff' be 'if'? Thanks a lot!

Best,

S

question

Hi, thank you for your great book.
I am confused about the subscript in formula 3.29: I think z_k should be z_j (sorry for the incorrect typesetting).
Thanks.

Any plan to publish the book?

Dear @jacobeisenstein,

First of all, I really appreciate you writing an NLP book that covers recent trends. I know this is too early, but is there any plan to turn this "book" into an actual published book? Sorry if I missed a statement from you about this.

Besides, I'm eager to read the book, but the content does not seem final yet. Could you give some notes on which chapters or sections might still be at an early stage and prone to errors? Is that what you mark with the "*" sign in the TOC? Thank you, I can't wait for the final version.
