carmen-python's People

Contributors

hughhan1, jackjyzhang, kite1988, mdredze, query

carmen-python's Issues

Incorporate timezone information

The timezone a user is posting in can help narrow down possible locations.

We would need to look into:

  1. Is this information easily available in the tweet?
  2. Do we have timezone information of a location from GeoNames?
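On the second question, the GeoNames "geoname" dump is tab-separated with 19 columns, and the 18th holds an IANA timezone ID. A minimal sketch of reading that column and using it to narrow candidates might look like the following. The helper names (`read_geonames_rows`, `filter_by_timezone`, `make_row`) are hypothetical, the sample rows are placeholders rather than real GeoNames data, and the sketch assumes the tweet's timezone has already been normalized to an IANA ID.

```python
import csv
import io

# Column order of the GeoNames "geoname" dump (tab-separated, 19 fields).
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def read_geonames_rows(handle):
    """Yield one dict per well-formed GeoNames row from an open text handle."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) == len(GEONAMES_COLUMNS):
            yield dict(zip(GEONAMES_COLUMNS, fields))


def filter_by_timezone(rows, timezone):
    """Keep only candidate rows whose timezone equals the given IANA ID."""
    return [row for row in rows if row["timezone"] == timezone]


def make_row(name, admin1, timezone):
    """Build a placeholder dump line; all other values are dummies."""
    fields = ["0", name, name, "", "0", "0", "P", "PPL", "US", "",
              admin1, "", "", "", "0", "", "0", timezone, "2020-01-01"]
    return "\t".join(fields)


SAMPLE = "\n".join([
    make_row("Springfield", "TN", "America/Chicago"),
    make_row("Springfield", "MA", "America/New_York"),
])

rows = list(read_geonames_rows(io.StringIO(SAMPLE)))
central = filter_by_timezone(rows, "America/Chicago")
print([row["admin1_code"] for row in central])
```

Two places with the same name are separated here by timezone alone, which is the kind of narrowing the issue describes.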

Move to GeoNames database

The current database is internal and separate from any Twitter or other place IDs. We want this tool to be easily extensible, so we will move to a popular geographical database, GeoNames.

Current progress is on the branch geonames-alexandra, in the folder carmen/geonames_mapping:

  • Convert GeoNames entries to the Carmen jsonlines format
  • "Map" GeoNames entries to the internal Carmen locations. The main purpose is to (1) fill in missing location information and (2) combine the aliases for more robust lookup
  • Evaluate Carmen's coverage with only GeoNames, only internal locations, and the combined GeoNames/internal
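The first two steps could be sketched roughly as below. This is an illustration only: the output keys (`id`, `city`, `countrycode`, `latitude`, `longitude`, `aliases`) are an assumed schema, not Carmen's confirmed jsonlines format, and both function names are hypothetical. The conversion also assumes the GeoNames row describes a populated place, so that its name can be used as the city.

```python
import json


def geonames_to_carmen(row):
    """Convert one GeoNames row (a dict of strings) to a Carmen-style dict.

    The output keys are an assumed schema for illustration.
    """
    # GeoNames stores alternate names as one comma-separated string.
    aliases = [a for a in row.get("alternatenames", "").split(",") if a]
    return {
        "id": int(row["geonameid"]),
        "city": row["name"],
        "countrycode": row["country_code"],
        "latitude": float(row["latitude"]),
        "longitude": float(row["longitude"]),
        "aliases": aliases,
    }


def merge_locations(internal, geonames):
    """Fill empty fields of an internal location and union the aliases."""
    merged = dict(internal)
    for key, value in geonames.items():
        # Only fill gaps; never overwrite an existing internal value.
        if key != "aliases" and merged.get(key) in (None, ""):
            merged[key] = value
    combined = []
    for alias in internal.get("aliases", []) + geonames.get("aliases", []):
        if alias not in combined:
            combined.append(alias)
    merged["aliases"] = combined
    return merged


# Placeholder inputs; the coordinates and GeoNames ID are made up.
row = {"geonameid": "123", "name": "Buffalo", "alternatenames": "Buffalo,BUF",
       "latitude": "42.9", "longitude": "-78.9", "country_code": "US"}
internal = {"id": 2583, "city": "Buffalo", "state": "New York",
            "latitude": None, "aliases": ["buffalo ny"]}

merged = merge_locations(internal, geonames_to_carmen(row))
line = json.dumps(merged)  # one jsonlines record
print(line)
```

Keeping the internal values and filling only the gaps means the internal ID survives the merge, which matters if other data still references it.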

Add profile location merge script

The user profile information is no longer part of the basic tweet and needs to be queried separately. For convenience, we should have a merge script that is run before Carmen and takes in (1) the tweet JSON file and (2) the user file.
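A minimal sketch of such a merge step follows. It assumes both inputs are JSON lines, that each tweet carries an `author_id` matching the `id` of a user record, and that Carmen would read the profile from a `user` key; the function names are hypothetical.

```python
import json


def index_users(user_lines):
    """Map user ID (as a string) to the parsed user object."""
    users = {}
    for line in user_lines:
        line = line.strip()
        if line:
            user = json.loads(line)
            users[str(user["id"])] = user
    return users


def merge_profiles(tweet_lines, users):
    """Yield tweets with the matching user object attached under "user"."""
    for line in tweet_lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        user = users.get(str(tweet.get("author_id")))
        if user is not None:
            tweet["user"] = user
        yield tweet


# In practice these would be open file handles; lists work the same way.
user_lines = ['{"id": "10", "location": "Buffalo, NY"}']
tweet_lines = [
    '{"id": "1", "text": "hello", "author_id": "10"}',
    '{"id": "2", "text": "no profile", "author_id": "99"}',
]

merged = list(merge_profiles(tweet_lines, index_users(user_lines)))
for tweet in merged:
    print(json.dumps(tweet))
```

Tweets without a matching user are passed through unchanged so that no input is silently dropped.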

Add support for Twitter API v2

Carmen should be able to handle tweets queried with both the v1 and v2 APIs.

API v2 information: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/place

Details below

@nimahassanpour Carmen was written for the old Twitter API, not v2. Your country info is at the top level of the JSON, but Carmen expects that information to be under place, and the user-specified location is usually under user.

https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/object-model/tweet

I haven't used the new API yet. So it just puts the requested fields in the top-level of the JSON instead of nested?
https://developer.twitter.com/en/docs/twitter-api/tweets/lookup/api-reference/get-tweets#tab2

Originally posted by @AADeLucia in #3 (comment)
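One way to bridge the two layouts is to reshape a v2 payload before passing it on. The sketch below assumes a v2 response in which the tweet references a place through `geo.place_id` and an author through `author_id`, with the full objects listed under `includes`. The function name is hypothetical, and whether Carmen accepts this subset of v1 fields has not been verified here.

```python
# Place fields copied over when present; chosen for illustration.
PLACE_KEYS = ("id", "name", "full_name", "country", "country_code", "place_type")


def v2_to_v1_layout(tweet, includes):
    """Return a v1-style dict with nested "place" and "user" objects."""
    places = {p["id"]: p for p in includes.get("places", [])}
    users = {u["id"]: u for u in includes.get("users", [])}

    converted = {"id_str": tweet["id"], "text": tweet.get("text", "")}

    # v2 keeps only a place ID on the tweet; the place itself is in includes.
    place = places.get((tweet.get("geo") or {}).get("place_id"))
    if place is not None:
        converted["place"] = {k: place[k] for k in PLACE_KEYS if k in place}

    # The user-specified location sits on the expanded author object.
    user = users.get(tweet.get("author_id"))
    if user is not None:
        converted["user"] = {"id_str": user["id"],
                             "location": user.get("location", "")}
    return converted


# Placeholder payload for illustration.
tweet = {"id": "1", "text": "hello", "author_id": "10",
         "geo": {"place_id": "abc"}}
includes = {
    "places": [{"id": "abc", "name": "Buffalo", "full_name": "Buffalo, NY",
                "country": "United States", "country_code": "US",
                "place_type": "city"}],
    "users": [{"id": "10", "location": "Buffalo"}],
}

converted = v2_to_v1_layout(tweet, includes)
print(converted)
```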

Compatibility with twitter api 2

Hi,

Thank you for this useful library. I modified the code a little to make it work with Twitter API version 2. I tested it on some tweets I acquired from the academic research endpoint using the 'twarc2' library, and it works on them. I used the following pages to change the code.

https://visual-data-format-migration-tool.glitch.me/
https://developer.twitter.com/en/docs/twitter-api/migrate/data-formats

I have created a fork and committed the changes there. I thought I would post it here in case someone finds it useful.

https://github.com/manisci/carmen-python_api2

Thank you,
Mani Sotoodeh

Using the Python API returning incomplete location object

I am trying to resolve tweets "on the fly" from the Twitter stream API:

def get_location(data):
    location = resolver.resolve_tweet(data)
    return location

Typical results:

(False, Location(country='United States', known=True, id=2645))
(False, Location(country='United States', state='California', known=True, id=440))
(False, Location(country='United States', state='New York', county='Erie County', city='Buffalo', known=True, id=2583))
(False, Location(country='United States', state='Tennessee', county='Robertson County', city='Springfield', known=True, id=3243))
(False, Location(country='United States', state='North Carolina', county='Davidson County', city='Thomasville', known=True, id=4604))
(False, Location(country='United States', state='Michigan', known=True, id=3046))
(False, Location(country='United States', state='Florida', known=True, id=3033))
(False, Location(country='United States', state='District of Columbia', county='District of Columbia', city='Washington', known=True, id=2575))
(False, Location(country='United States', state='District of Columbia', known=True, id=3032))

How do I call to include latitude, longitude, and resolution_method?

The frontend returns everything. This is a great program. Thank you!
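The printed form of an object can omit fields that the object still carries. Assuming the Location object exposes `latitude`, `longitude`, and `resolution_method` as attributes (this thread does not confirm it), a defensive way to collect them is with `getattr` and a default. The helper name is hypothetical, and the stand-in object below only mimics a resolved location with placeholder values; it is not Carmen's class.

```python
from types import SimpleNamespace

FIELDS = ("country", "state", "county", "city",
          "latitude", "longitude", "resolution_method")


def location_to_dict(location):
    """Collect the named attributes, using None for any that are absent."""
    return {name: getattr(location, name, None) for name in FIELDS}


# Stand-in for a resolved location; coordinates are placeholder values.
stand_in = SimpleNamespace(country="United States", state="New York",
                           county="Erie County", city="Buffalo",
                           latitude=42.9, longitude=-78.9,
                           known=True, id=2583)

details = location_to_dict(stand_in)
print(details)
```

With the real resolver, the same helper would be applied to the second element of the tuple returned by `resolve_tweet`, as shown in the results above.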

Tweet JSON format?

Hi, this tool looks great, but I'm wondering if it is still supported?

I ran the example from your API documentation, and the output is pasted below.

import json

import carmen

# Read the first line of the file, which holds one tweet as a JSON string.
with open('test.json', 'r') as f:
    tweet_json = f.readline()

tweet = json.loads(tweet_json)
resolver = carmen.get_resolver()
resolver.load_locations()
location = resolver.resolve_tweet(tweet)
print(location)

[image: screenshot of the output, not preserved]

A line from my tweet JSON file looks like this:

{
 "id": "1361068983001219000",
 "text": "sometext"
}

I'd greatly appreciate your help.
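Going by the discussion elsewhere on this page, Carmen looks for location information under `place` and for the user-specified location under `user`. A quick check like the sketch below shows whether a tweet contains either; the function name is hypothetical, and the check covers only those two fields.

```python
import json


def location_sources(tweet):
    """List which v1-style location fields are present in a tweet dict."""
    found = []
    if tweet.get("place"):
        found.append("place")
    if (tweet.get("user") or {}).get("location"):
        found.append("user.location")
    return found


# The tweet from this issue has only an ID and text.
minimal = json.loads('{"id": "1361068983001219000", "text": "sometext"}')
print(location_sources(minimal))

# A tweet with a profile location, for comparison.
with_profile = {"id": "1", "text": "hi", "user": {"location": "Buffalo"}}
print(location_sources(with_profile))
```

A tweet that yields an empty list gives a resolver that relies on these fields nothing to work with.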
