utdeventdata's Introduction

UTDEventData ver. 1.0.0

The UTDEventData R package provides an interface to extract data from the UTD Event Data server. This package is stable and actively maintained/updated. Your comments, feedback and suggestions are welcome.
If you have any questions about the package, please contact Marcus Sianan [email protected], or open an issue (https://github.com/KateHyoung/UTDEventData/issues).

Note: Our server now provides access to the 'CLINE_PHOENIX_LNNYT' data, which contains several million events drawn from 17.5 million New York Times news stories (1945 - 2019) and is provided by the Open Event Data Alliance. You can find more information by clicking the link here.

This package is part of the "Modernizing Political Event Data for Big Data Social Science Research" project. More information can be found on the project webpage.

Several functions to preview and download data are listed below. More details of these methods are illustrated in the vignette.

  • citeData( ): for citing the package and data tables in the UTD server for publications
  • DataTables( ): for looking up data tables in the UTD server
  • tableVar( ): for looking up the variables of a data table
  • previewData( ): for previewing the data structure of a data table
  • pullData( ): for downloading data by countries and time periods
  • entireData( ): for downloading an entire data table
  • getQuerySize( ): for measuring the size of requested data from the UTD server
  • sendQuery( ): for requesting built queries from the API server to download data
  • Table: a reference class

Leaf Query Block functions:

  • returnTimes( ): create a query block by time periods
  • returnCountries( ): create a query block by countries
  • returnLatLon( ): create a query block by latitude and longitude
  • returnDyad( ): create a query block of a dyad for both source and target actors
  • returnRegExp( ): create a query block by pattern of attributes in a data table

Branch Query Block functions:

  • orList( ): match records that satisfy any of the child query blocks
  • andList( ): match records that satisfy all of the child query blocks

Installation

Without the vignette:

devtools::install_github("KateHyoung/UTDEventData") 

With the vignette:

devtools::install_github("KateHyoung/UTDEventData", build_vignettes=TRUE)

Users with newer versions of R may need to follow this format:

install.packages("devtools")
library(remotes)
install_github("KateHyoung/UTDEventData")
library(devtools)
library(UTDEventData)

Retrieve an API key

Access to the UTD data server requires an API key. To obtain one, follow the link and fill out the form: https://eventdata.utdallas.edu/signup. Please check your spam and junk email folders if you do not receive the API key in your inbox.

Using the API key

Method 1: Pass the key as the first argument

You will need to pass the key on every function call.

k <- '...your API key....'
DataTables(utd_api_key = k)

Method 2: Store the key in an environment variable

Set the default API key by setting the environment variable UTDAPIKEY.

Sys.setenv(UTDAPIKEY = "...your API key...")

DataTables()
tableVar(table = "icews", lword = "target")

Note: Method 2 currently works only with DataTables(), tableVar(), and previewData(). We plan to expand this method to other functions that require an API key.

Further examples assume the API key is set in an environment variable.
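Because Method 2 depends on the UTDAPIKEY environment variable being present, it can help to fail early when it is missing. The following is a minimal sketch in base R; get_utd_key() is a hypothetical helper written for illustration and is not part of UTDEventData.

```r
# Hypothetical helper (not part of UTDEventData): return the key stored in
# an environment variable, or stop with a clear message if it is not set.
get_utd_key <- function(var = "UTDAPIKEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.")
  }
  key
}

# Possible usage with functions that need the key passed explicitly:
# DataTables(utd_api_key = get_utd_key())
```

The helper only reads the variable; it does not validate the key against the server.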

Data Preview

Retrieve a sample of 100 observations.

dataSample <- previewData(table_name = "PHOENIX_RT")
View(dataSample)

Data Download (quick)

pullData() can be used to retrieve data subsetted by country names and dates.

subset1 <- pullData(table_name = "phoenix_rt", country = list('canada','China'), start = '20171101', end = '20171102', citation = TRUE)
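The start and end values in the examples appear to be dates written as YYYYMMDD strings. Assuming that format, they can be built from Date objects in base R rather than typed by hand; the variable names below are illustrative only.

```r
# Build YYYYMMDD strings from Date objects (format inferred from the
# README examples, so treat it as an assumption).
first_day <- as.Date("2017-11-01")
start <- format(first_day, "%Y%m%d")
end   <- format(first_day + 1, "%Y%m%d")  # one day later

# These strings could then be passed on, for example:
# pullData(table_name = "phoenix_rt", country = list("canada", "China"),
#          start = start, end = end, citation = TRUE)
```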

Data Download (custom)

More complex queries with intersections, unions and multiple sets of constraints may be submitted via the sendQuery() function. More details on this method are provided in the vignette.

Example Usage

# Download Phoenix RT events for Russia and Syria, January to March 2018
dt <- pullData('utd_api_key', "Phoenix_rt", list("RUS", "SYR"), start = "20180101", end = "20180331", citation = FALSE)

# Keep the fight events, identified by their CAMEO codes
Fgt <- dt[dt$code %in% c("190", "191", "192", "193", "194", "195", "1951", "1952", "196"), ]
Fgt <- Fgt[, 1:23] # remove url and oid columns

tb <- table(Fgt$country_code, Fgt$month) # monthly incidents

barplot(tb, main = "Monthly Fight Incidents between RUS and SYR", col = c("darkblue", "red"),
        legend = rownames(tb), beside = TRUE, xlab = "Month in 2018")

The plot shows military-related fights between Russia and Syria by month, from January 2018 to March 2018. Event types are identified by CAMEO codes in the Phoenix real-time data.
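The filtering and tabulation steps can be tried without an API key on a small hand-made data frame. The rows below are invented for illustration and are not real Phoenix RT records; only the column names code, country_code and month are taken from the example above.

```r
# Made-up rows for illustration only; not real Phoenix RT records.
toy <- data.frame(
  code         = c("190", "042", "1951", "193", "010"),
  country_code = c("RUS", "RUS", "SYR", "SYR", "RUS"),
  month        = c("01", "01", "02", "03", "03"),
  stringsAsFactors = FALSE
)

# CAMEO codes used in the example above to identify fight events.
fight_codes <- c("190", "191", "192", "193", "194", "195",
                 "1951", "1952", "196")

# Keep only the fight events, then count them by country and month.
fights <- toy[toy$code %in% fight_codes, ]
tb <- table(fights$country_code, fights$month)
print(tb)
```

Comparing codes as character strings avoids accidental matches that numeric comparison could introduce when codes carry leading zeros.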

Vignette

Access the vignette by executing the following R snippet. This requires that the package was installed with build_vignettes=TRUE.

vignette("UTDEventData")

Alternatively, download the PDF version here

Authors

Marcus Sianan [email protected] (Maintainer)

Dr. Patrick T. Brandt [email protected]
Dr. Vito D'Orazio [email protected]
Dr. Latifur Khan [email protected]
Dr. HyoungAh(Kate) Kim [email protected]
Michael J. Shoemate [email protected]
Sayeed Salam [email protected]
Jared Looper [email protected]

Community Guidelines

This project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. Submit feedback, bug reports, and feature requests here. You may request to store a dataset in the UTD Event Data server by contacting one of the authors. Those who request to store data as collaborators also agree to abide by the terms specified in the Contributor Code of Conduct.

License

GPL-3
This package is supported by the RIDIR project funded by the National Science Foundation, Grant No. SBE-SMA-1539302.

utdeventdata's People

Contributors

katehyoung, marcusmms, andrewheiss, danielskatz, shoeboxam

utdeventdata's Issues

Querying by event code in Phoenix RT

I want to find all events in Phoenix RT with, say, code 90. I see that a data frame generated by the query does indeed contain a column 'code' with the respective CAMEO code, but tableVar(apikey, 'PHOENIX_RT') shows no variable with that name. And indeed, I can't query it with returnRegExp (though no error is shown):

> evcode <- returnRegExp("5WSDHiyObsBUBu9JR4PVdg5E6lrkStRM", 'PHOENIX_RT', "90", "code")
> evcode
NULL

So can this be addressed in any way? An option for querying by event code would be awesome!

More options for querying Dyad

1. Is it possible to query a dyad with two or more source/target countries? E.g. I want to search for events where the source actor is CHN and the target actors are USA, CAN and RUS. I've tried several options like:

#writing in line
returnDyad('PHOENIX_RT', "CHN", "USA" "CAN" "RUS")
#making a vector
returnDyad('PHOENIX_RT', "CHN", c("USA", "CAN", "RUS")) 
#making a list
returnDyad('PHOENIX_RT', "CHN", list("USA", "CAN", "RUS"))

None of these work properly, however: when I run them with sendQuery, I just get an empty data frame.

2. Can I query only by source or target? E.g. I need all events where the source country is AFG, so I'd want to write something like
returnDyad('PHOENIX_RT',"AFG", )
But there is no default argument for the source or target, so apparently I can't leave it blank.

So perhaps there are solutions, but I can't find any?

adding progress bar

Could you add a progress bar to the pullData() function so we can monitor progress while receiving data?

Thanks!

The Table class isn't working

Following the vignette, creating an object from the Table class doesn't work:

# creating an object
obj <- Table$new()
# Error: object 'Table' not found

Error when using pullData

I am using the example provided in the vignette to get data from phoenix_rt:

k <- '...My API Key...'
subset1 <- pullData(utd_api_key = k, table_name = "phoenix_rt", country = list('canada','China'), start = '20171101', end = '20171102', T)

But I get the error below:
Error in readLines(curl::curl(url_submit), warn = FALSE) :
HTTP error 404.

Managing Duplicate Events

Hi There,

I am using returnDyad( ) to extract data for two actors, and I am wondering if there is any way to remove the duplicates after extraction.

I have duplicated data for several entries with different URLs or text_sources.
I am attempting to do the cleaning myself, but before that I was wondering if there is a way to extract "distinct events".
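One possible post-extraction approach, sketched in base R, is to treat rows as the same event when they agree on every column except those that only describe the report. The data and column names below are invented for illustration; which columns identify the report rather than the event in a given table is an assumption to check against the real data.

```r
# Made-up rows for illustration; the column names are assumptions.
events <- data.frame(
  date   = c("20180101", "20180101", "20180102"),
  source = c("RUS", "RUS", "RUS"),
  target = c("SYR", "SYR", "SYR"),
  code   = c("190", "190", "042"),
  url    = c("http://a.example/1", "http://b.example/2", "http://a.example/3"),
  stringsAsFactors = FALSE
)

# Columns that describe the report rather than the event itself.
report_cols <- c("url")
key_cols <- setdiff(names(events), report_cols)

# Keep the first row for each distinct combination of the key columns.
distinct_events <- events[!duplicated(events[, key_cols, drop = FALSE]), ]
```

This keeps the first report of each event and silently drops the others, so any information unique to the dropped rows, such as their URLs, is lost.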

Error when downloading ICEWS entire data

I use the command below to download the entire ICEWS dataset:

k <- '...My API Key...'
data_ICEWS <- entireData(utd_api_key = k, table_name = 'ICEWS',
                         citation = FALSE)

But I get the error below:
Error in curl::curl_download(url_submit, tmp) : Empty reply from server

The command works for the "CLINE_PHOENIX_FBIS" table.

It seems pullData() also doesn't work well with the ICEWS table.
