Comments (4)

seancarmody commented on July 17, 2024

It looks as though something unusual is happening as a result of the inverted commas. If you have a look at the Google ngram viewer page linked to below, you'll see the same result as the ngramr code generates.

https://books.google.com/ngrams/graph?content=international+institutions%2Cinternational+order%2Cinternational+regimes&year_start=1900&year_end=2019&corpus=26&smoothing=2

Note that this chart is case sensitive, so will not include the variants with the i's capitalised.

I'm not sure exactly what is happening with the ngram chart you've created directly in the Google viewer, but I note there are warning messages displayed, including

Replaced "international order" with " international order " to match how we processed the books.

Also, the frequencies in the inverted comma chart are far lower (by two orders of magnitude) than in the chart without inverted commas, so it looks as though it's missing a lot of cases. I would therefore suggest that the results you are getting from the ngramr code are in fact more accurate. To get a case-insensitive search, you can use the parameters case_ins=TRUE and aggregate=TRUE (without the latter, case variants such as 'international institutions' and 'International institutions' are reported as separate series).

As an aside, the sample code is a little long-winded and you can instead use something like this:

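library(ngramr)  # provides ngram() and ggram()
# Plot the three phrases, case-insensitive, with case variants summed into one series each
ggram(c("international order", "international institutions", "international regimes"),
      year_start=1900, year_end=2019, smoothing=2, case_ins=TRUE, aggregate=TRUE)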

or

data_long <- ngram(c("international order", "international institutions", "international regimes"), 
                   year_start=1900, year_end=2019, smoothing=2, case_ins=TRUE, aggregate=TRUE)
ggram(data_long)

While that doesn't completely clarify what is going on, with any luck this is enough to keep you going. Let me know how you go.

from ngramr.

seancarmody commented on July 17, 2024

Let me take a look and get back to you...

bfbraum commented on July 17, 2024

Ah, fantastic, thanks so much. I had no idea that the ngram interface was so fragile. Noted for future reference, and thanks for a really cool and easy-to-use package (easier than the ngram viewer itself, it turns out....)

seancarmody commented on July 17, 2024

No problem. Happy to help! I hadn't been aware of this particular peculiarity myself. I'm also conscious that the fragility of the interface can translate into fragility of the package, since the package just scrapes the results of calls to the web page.

This comparison highlights more clearly the difference between searches with and without inverted commas:

https://books.google.com/ngrams/graph?content=%22international+institutions%22%2Cinternational+institutions&year_start=1800&year_end=2019&corpus=26&smoothing=3&direct_url=t1%3B%2C%22%20international%20institutions%20%22%3B%2Cc0%3B.t1%3B%2Cinternational%20institutions%3B%2Cc0
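For anyone who wants to see what that link encodes, here is a rough sketch in base R that decodes its query string. It only unpacks the URL quoted above and does not call Google or ngramr. The variable names (url, query, pairs, params) are my own, and reading direct_url as "the phrases as Google rewrote them" is an assumption based on the warning message quoted earlier in this thread.

```r
# URL copied verbatim from the comparison link above
url <- paste0(
  "https://books.google.com/ngrams/graph?content=%22international+institutions%22",
  "%2Cinternational+institutions&year_start=1800&year_end=2019&corpus=26&smoothing=3",
  "&direct_url=t1%3B%2C%22%20international%20institutions%20%22%3B%2Cc0%3B.t1%3B%2C",
  "international%20institutions%3B%2Cc0"
)

# Drop everything up to and including "?", then split into name=value pairs
query <- sub("^[^?]*\\?", "", url)
pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)

# Decode each value: "+" stands for a space, %XX escapes go through URLdecode()
decode_value <- function(p) utils::URLdecode(gsub("+", " ", p[2], fixed = TRUE))
params <- stats::setNames(
  vapply(pairs, decode_value, character(1)),
  vapply(pairs, function(p) p[1], character(1))
)

# What was typed into the search box
print(params[["content"]])

# What the viewer apparently searched for (assumed meaning of direct_url)
print(params[["direct_url"]])
```

In the decoded direct_url value the quoted phrase appears with padding spaces inside the inverted commas, which matches the "Replaced ... to match how we processed the books" warning, while the unquoted phrase is passed through unchanged.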
