
rga's Introduction

R Google Analytics

This is a package for extracting data from Google Analytics into R.

The package uses the OAuth 2.0 protocol to access the Google Analytics API.

News / changelist

  • Pulling data in batches has been added
  • Pulling unsampled data has been added
  • No more SSL errors (thanks to Schaun Wheeler, who also has been added as a collaborator!)
  • No more parsing errors
  • A bunch of tweaks

Installation

Manually

Since rga is still under development, it is not yet on CRAN, so please install the development version. You can get the latest version from GitHub as follows.

Install the devtools package:

install.packages("devtools")
library(devtools)

And then run the install_github command:

install_github("skardhamar/rga")
library(rga)

Authenticating

The principle of this package is to create an instance of the API authentication, which is a reference class (sometimes called R5, built on S4 via setRefClass). This instance contains all the functions needed to extract data, and all the data needed for authentication and reauthentication. The class is in essence self-sustaining.

This means that you can create as many instances as you need.

Basic use

The instance is created with the rga.open command:

rga.open(instance="ga")

This will check whether the instance has already been created and, if it has, prepare the token. If the instance has not been created, it will create the instance and redirect the client to a browser for authentication with Google.

You then have to authorize the application. Google will then output an access code, which you need to enter in the R console.

Advanced use

If you want to store the instance locally, this can be done by adding the where attribute:

rga.open(instance="ga", where="~/ga.rga")

This means that even if you delete the .RData workspace, the package will make sure you have access to the API.

Use own Google API Client

If you want to use your own Google API Client, you need to provide its credentials in the rga.open call:

rga.open(instance = "ga", 
		 client.id = "862341168163-qtefv92ckvn2gveav66im725c3gqj728.apps.googleusercontent.com", 
		 client.secret = "orSEbf0-S76VZv6RMHe46z_N")

Create a project in Google API Console to acquire client.id and client.secret.

Extracting data

In order to extract data from the instance, there are a couple of commands to use. The most important one is $getData:

ga$getData(ids, start.date, end.date, 
		   metrics = "ga:visits", dimensions = "ga:date", 
		   sort = "", filters = "", segment = "",
		   start = 1, max = 1000)

This will output the data in a data frame, with all the correct formats applied.

ids refers to one's site-specific Analytics "profile ID". If one doesn't know it, it can typically be found in URLs in the GA interface as the number following a "p" in the URL (e.g. the ID for a URL like https://www.google.com/analytics/web/#report/visitors-overview/a18912926w37930778p37491797/ would be 37491797).
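As a rough illustration of the lookup described above, the profile ID can be pulled out of such a URL with a regular expression. The helper name `extract_profile_id` is hypothetical and not part of rga; the pattern assumes the `a…w…p…` segment shown in the example URL.

```r
# Hypothetical helper (not part of rga): extract the profile ID from a GA report URL.
extract_profile_id <- function(url) {
  # Match the a<account>w<property>p<profile> segment and capture the profile digits.
  m <- regmatches(url, regexec("a[0-9]+w[0-9]+p([0-9]+)", url))[[1]]
  if (length(m) < 2) return(NA_character_)  # no such segment in the URL
  m[2]
}

url <- "https://www.google.com/analytics/web/#report/visitors-overview/a18912926w37930778p37491797/"
profile_id <- extract_profile_id(url)
print(profile_id)
```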

The syntax for dimensions/filters/segments follows the one dictated by Google - please refer to the Google Analytics API documentation such as "Dimensions & Metrics Reference" for further information. (Note that the argument to metrics is a comma-delimited character-string, not a vector of character-strings, so one specifies arguments like "ga:pageviews,ga:sessions,ga:visitors".)
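If the metric names are held in a character vector, they can be collapsed into the single comma-delimited string that the metrics argument expects. This is a small base-R sketch; the variable names are illustrative.

```r
# Metric names kept as a vector for convenience (illustrative values from the text above).
metrics_vec <- c("ga:pageviews", "ga:sessions", "ga:visitors")

# Collapse into one comma-delimited string, which is the form described for the metrics argument.
metrics <- paste(metrics_vec, collapse = ",")
print(metrics)
```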

The dates default to the current day, meaning that if you don't supply them, only data from today will be extracted.
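To avoid relying on the defaults, an explicit date range can be built in base R. This sketch assumes the dates are passed as "YYYY-MM-DD" strings, which is the form used in the query examples in this document; the 30-day window is arbitrary.

```r
# Build an explicit 30-day window ending yesterday (window length is an arbitrary choice).
end.date   <- format(Sys.Date() - 1,  "%Y-%m-%d")
start.date <- format(Sys.Date() - 30, "%Y-%m-%d")

# These strings would then be passed as start.date / end.date in the query.
print(c(start.date, end.date))
```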

Extracting more observations than 10,000

The Google Analytics API has a limit of 10,000 observations per pull. The package therefore makes it possible to extract data in batches. The $getData function will now show a message if not all of the observations are being extracted.

In order to extract this data, just use the batch-attribute, for example:

ga$getData(ids, batch = TRUE, start.date, end.date, 
		   metrics = "ga:visits", dimensions = "ga:date,ga:medium,ga:source", 
		   sort = "", filters = "", segment = "")

Alternatively you can set batch to an integer, and the function will pull the data in batches of that size. If you just set it to TRUE, it will automatically pull the data in batches of 10,000 observations (the maximum allowed).

Notice that in this example the max-attribute is missing. If this is the case, the function will automatically pull ALL the data. If, however, you set the max-attribute and set batch to TRUE, the function will pull the data in batches until it reaches the max.
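To make the batching idea concrete, here is a minimal sketch of how a total number of observations can be split into start/end ranges. It is not the package's implementation; `batch_ranges` is a hypothetical helper and the total of 52906 is just an example value.

```r
# Hypothetical helper: split `total` observations into ranges of at most `batch` rows.
batch_ranges <- function(total, batch = 10000) {
  starts <- seq(1, total, by = batch)        # first row of each batch
  ends   <- pmin(starts + batch - 1, total)  # last row, capped at the total
  data.frame(start = starts, end = ends, size = ends - starts + 1)
}

ranges <- batch_ranges(52906)
print(ranges)
```

The final row shows a short last batch: its size is the remainder, not the full batch size.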

Get the first date with data

In order to get the date that contains the first data, use the function:

ga$getFirstDate(ids)

This function will do a lookup for first available data.

Get the data unsampled

In some cases where there is a large amount of data, Google Analytics will return sampled data. In order to avoid this, you can partition the query into multiple small queries (day-by-day). One cause of sampling is a query that includes more than 500,000 sessions and is not one of the pre-aggregated queries. Using an advanced segment or filter in a query will generally mean that sampling will occur if the 500,000 sessions are exceeded.

You can get this day-by-day data by using the walk-attribute, which in effect will 'walk' through the data set day-by-day. This results in unsampled data (set batch to TRUE to retrieve ALL data), for example:

ga$getData(ids, batch = TRUE, walk = TRUE, 
		   start.date, end.date, 
		   metrics = "ga:visits,ga:transactions", 
		   dimensions="ga:keyword",
		   filters="ga:country==Denmark;ga:medium==organic")

However, this will result in a lot of requests to the API, which can result in hitting the quota limit. So use with care.
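To get a feel for how many requests a walk implies, the date range can be expanded into one entry per day. This is a base-R sketch, not the package's code; `walk_dates` is a hypothetical helper, and each date would correspond to at least one API request.

```r
# Hypothetical helper: list every day between two dates, formatted as "YYYY-MM-DD".
walk_dates <- function(start.date, end.date) {
  format(seq(as.Date(start.date), as.Date(end.date), by = "day"), "%Y-%m-%d")
}

dates <- walk_dates("2014-06-01", "2014-06-30")
# One query per date (more if a single day needs several batches).
print(length(dates))
```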

rga's People

Contributors

antoniotajuelo, artemklevtsov, gwern, hkjinlee, kusara, luxa-makiyama, mattpolicastro, onthestairs, schaunwheeler, tcarnus, willempaling


rga's Issues

Error in ga.data$error : $ operator is invalid for atomic vectors

Everything worked fine till yesterday, but now all I get is

ga$getData(XXXXXXXX,
start.date = "2012-06-01",
end.date = "2013-04-10",
metrics = "ga:visits",
filters = "ga:keyword==(not provided);ga:source==google;ga:medium==organic",
dimensions = "ga:date",
max = 1500,
sort = "ga:date")
Error in ga.data$error : $ operator is invalid for atomic vectors

when I try to query for data. However, when I use

profiles <- ga$getProfiles();
head(profiles)

I get data, so I suppose that the generic API access is not the issue. Any clues?

Error in charToDate(x) : character string is not in a standard unambiguous format

Have been using the package quite happily until today it spits out this error message:

  Error in charToDate(x) : character string is not in a standard unambiguous format

I'm looping over several accounts to rollup the data, with the same start and end date, so not sure why this occurs for only one fetch.....

ah, I see now it is when I include the "ga:date" dimension in the fetch.

i.e.

 dimensions = 'ga:campaign,ga:adGroup,ga:transactionId' 

runs successfully but:

    dimensions = 'ga:date,ga:campaign,ga:adGroup,ga:transactionId'

...triggers the error message on some of the profiles (I suspect empty result ones) and breaks the loop.
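For context, the same message can be reproduced in base R when a compact date string is parsed without a format. The sketch below assumes ga:date values arrive as "YYYYMMDD" strings; whether an empty or malformed date column is what trips the package here is only a guess.

```r
# ga:date values are assumed to look like "20140101".
raw <- c("20140101", "20140102")

# Parsing with an explicit format works.
parsed <- as.Date(raw, format = "%Y%m%d")

# Parsing without a format fails, because the string matches none of the default formats.
msg <- tryCatch(as.character(as.Date("20140101")),
                error = function(e) conditionMessage(e))
print(parsed)
print(msg)
```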

I have also wrapped the GA data call in a try() command, so I'm not sure why this error breaks the loop - am I not catching all the errors correctly? I put this in to get around the 0 results that have occurred before:

for (i in 1:length(GAProfileTable$id)) {

print(paste("Fetching:",GAProfileTable$name[i],
            "ProfileID:",GAProfileTable$id[i], 
            start_date, end_date, metrics, dimensions,
            "sort:", sort,"filters:",filters, "segment:",segment))
# If max is -1, then no limit on max and it will get all results
if (max_results<0) {
  GA.data <- try(ga$getData(GAProfileTable$id[i], batch = TRUE, start.date = start_date, end.date = end_date, 
                          metrics = metrics, dimensions = dimensions, 
                          sort = sort, filters = filters, segment = segment,
                          start = start_index), silent = TRUE)
} else {      #otherwise it will get max results.
  GA.data <- try(ga$getData(GAProfileTable$id[i], batch = TRUE, start.date = start_date, end.date = end_date, 
                          metrics = metrics, dimensions = dimensions, 
                          sort = sort, filters = filters, segment = segment,
                          start = start_index, max = max_results), silent = TRUE)

}

      #if error, class is "try-error"
      if (class(GA.data) != "try-error"){
        print("Fetch Successful, writing")
        GA.data$name <- GAProfileTable$name[i]
        GA.data$currency <- GAProfileTable$currency[i]
        Results <- rbind(Results,GA.data)
      } else {
        ErrorMessage <- paste("WARK WARK ERROR WITH GAProfileTable$id = ",
                              GAProfileTable$id[i],
                              "name:",
                              GAProfileTable$name[i])
        Results <- rbind(Results,GA.data)
        print(ErrorMessage)
        print(GA.data)
      }    

}

...but perhaps that helps narrow it down?
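One thing that may be worth checking in a loop like this is that a failed fetch is never passed to rbind(), and that the failure test uses inherits() rather than comparing class() directly. The sketch below shows that pattern with tryCatch(); `fetch_profile` is a hypothetical stand-in for the ga$getData() call, not part of rga.

```r
# Hypothetical stand-in for ga$getData(); it fails for one profile ID.
fetch_profile <- function(id) {
  if (id == "bad") stop("no results")
  data.frame(id = id, visits = 1L, stringsAsFactors = FALSE)
}

results <- list()
failed  <- character(0)

for (id in c("a", "bad", "b")) {
  # Turn any error into a value so the loop can continue.
  res <- tryCatch(fetch_profile(id), error = function(e) e)
  if (inherits(res, "error")) {
    message("Fetch failed for ", id, ": ", conditionMessage(res))
    failed <- c(failed, id)
    next  # skip the rbind for failed fetches
  }
  results[[id]] <- res
}

# Combine only the successful fetches.
Results <- do.call(rbind, results)
print(Results)
```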

could not find function "days"

Hi,

I did a fresh R install and when I try some query that demands WALK I get the following error:

Error in format(as.POSIXct(start.date) + days(i), "%Y-%m-%d") : 
  could not find function "days"

The query:

data=ga$getData(xxxx, batch = TRUE, walk = TRUE, 
                      "2011-09-01","2014-02-25", 
                      metrics = "ga:transactions,ga:itemQuantity,ga:itemRevenue,ga:transactionTax,ga:transactionShipping,ga:localItemRevenue,ga:localTransactionTax,ga:localTransactionShipping,ga:sessions,ga:bounces",  
                      dimensions="ga:date,ga:sourceMedium,ga:browser,ga:deviceCategory,ga:hostname,ga:country,ga:currencyCode")

My system details:

platform       x86_64-apple-darwin13.1.0   
arch           x86_64                      
os             darwin13.1.0                
system         x86_64, darwin13.1.0        
status                                     
major          3                           
minor          1.0                         
year           2014                        
month          04                          
day            10                          
svn rev        65387                       
language       R                           
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance  

Any tips on how to fix this?

Thanks
Joao
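A function named days() is exported by the lubridate package, so loading lubridate before running the query may be enough; that the package relies on lubridate here is an assumption. As an illustration of what the failing expression computes, the sketch below defines a base-R stand-in for days() and evaluates the same kind of expression.

```r
# Stand-in for the missing days() helper (assumption: the package expects lubridate::days).
days <- function(n) as.difftime(n, units = "days")

start.date <- "2011-09-01"

# Same shape as the expression in the error message, with an explicit time zone
# so that daylight-saving changes cannot shift the result.
next_day <- format(as.POSIXct(start.date, tz = "UTC") + days(1), "%Y-%m-%d")
print(next_day)
```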

Feature: Metadata API

I'm fairly new to R and Analytics, so please forgive me if this duplicates earlier requests or is outside the scope of this project:

I'd love to be able to quickly replace metrics/dimensions as column names with their uiName attribute when I graph/table them into my final report (e.g. pageviewsPerSession would become "Pages / Session", per the Dev Guide). Is there any way to tack on a Metadata API call to avoid complex/maintenance-heavy find/replace code?
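Until something like that exists, a small lookup table can do the renaming. This is only a sketch: `rename_with_ui` is a hypothetical helper, the lookup contains just the one pair mentioned above, and any further entries would have to be filled in from the Metadata API or by hand.

```r
# Lookup from API name to display name; only the pair from the text above is included.
ui_names <- c(pageviewsPerSession = "Pages / Session")

# Hypothetical helper: rename any columns that appear in the lookup, leave the rest alone.
rename_with_ui <- function(df, lookup) {
  hit <- names(df) %in% names(lookup)
  names(df)[hit] <- unname(lookup[names(df)[hit]])
  df
}

df <- data.frame(pageviewsPerSession = 2.4, date = "2014-06-01")
renamed <- rename_with_ui(df, ui_names)
print(names(renamed))
```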

Single data

Hello,
First of all, I apologize for my English; my level is basic.

I am currently working on a statistics project about my university's website, so I am working with Google Analytics and R.

I would like to know whether rga can retrieve data for individual visitors.

For example:

the query:
ga$getData(id, start.date="2013-09-01", end.date="2013-09-30",metrics = "ga:visits",dimensions = "ga:exitPagePath",sort = "-ga:visits",start = 1, max = 30)

I get a table with the URL and the number of visits, but I would like to get data per person, i.e. I need information about one person at a time so I can study people's habits.

I am not interested in people's names; I only want to know each person's behavior.

I hope you understand my issue.

Best,

José Luis.

rga call cannot pass authenticating

Hi there,

I installed rga per the web site https://github.com/skardhamar/rga. Environment is R 3.0.1 and Windows 8. The following is the R code (I did not put my real client.id and secret here):

library(devtools)
library(rga)
rga.open(instance="ga", client.id = "MyClientID(no problem)", client.secret = "MyCientSecret(no problem)")

It still brings me to the Google site for a code (I did not expect this to happen, per skardhamar's page). I got the following error after entering it in the R console. When I used the simplified form rga.open(instance="ga") or rga.open(instance="ga", where="~/ga.rga"), the problem persists.

Error in function (type, msg, asError = TRUE) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

Any advice? Thanks!
Weihong
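The `function (type, msg, asError = TRUE)` signature in the error appears to come from RCurl. If rga does send its requests through RCurl (an assumption), a commonly used workaround is to point RCurl at the CA bundle it ships with, via the global RCurlOptions option, before calling rga.open. A sketch:

```r
# Locate the CA certificate bundle shipped with RCurl (empty string if RCurl is not installed).
ca_bundle <- system.file("CurlSSL", "cacert.pem", package = "RCurl")

if (nzchar(ca_bundle)) {
  # RCurl handles pick up these defaults from the RCurlOptions global option.
  options(RCurlOptions = list(cainfo = ca_bundle))
}
print(ca_bundle)
```

Whether this resolves the problem depends on how the installed version of rga makes its HTTPS calls.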

Couldn't connect to host

Are there any configuration options to set the proxy? There are some restrictions on my network, so I need to use RGA through a proxy. I have tried setInternet2(use = TRUE) and set the proxy in the system internet options, and used the download.file method to test my proxy settings. The proxy seems to be active in download.file, but when I try rga.open it still reports "couldn't connect to host".
How can I solve this problem? Many thanks.

rga.open(instance="ga")
Browse URL: https://accounts.google.com/o/oauth2/auth?scope=https://www.googleapis.com/auth/analytics.readonly&state=%2Fprofile&redirect_uri=urn:ietf:wg:oauth:2.0:oob&response_type=code&client_id=862341168163-qtefv92ckvn2gveav66im725c3gqj728.apps.googleusercontent.com&approval_prompt=force&access_type=offline
Please enter code here: 4/FLbMwvBAjmakNWGU5j9xm-O-qQmn.AmsGK4hYmmQWgrKXntQAax3-2WwJegI
Error :function (type, msg, asError = TRUE) : couldn't connect to host

Wrong properties shown

I work with 2 emails for accessing Google Analytics. Each one has access to different accounts within Google Analytics. I use email A at my house, and email B at work.

Now, I'm using RGA (from CRAN) within R, and at my house I need to access the Google Analytics accounts from email B (work email, which has access to specific Google Analytics accounts).

The problem is that when using this code:

'ga_token <- authorize(client.id, client.secret, cache = TRUE, verbose = getOption("rga.verbose", FALSE))'
Then I use this to get accounts:

'get_accounts(start.index = NULL, max.results = NULL, ga_token, verbose = getOption("rga.verbose", FALSE))'

The problem is that I want to get the accounts for email B, but no matter what, I just get the accounts related to email A.

I've deleted all my Google Analytics API projects (from A and B), and recreated the API project for email B. But no matter what, I just get the accounts for email A.

My Google API project was created with B (the email with access to the desired account), but I just see accounts related to A.

What can I do? Thanks

Installation failed through proxy

Which web addresses and ports should I request to be unblocked so that I can install rga successfully through an internal proxy?

My office computer uses a proxy to connect to the external network. I can install the devtools package. I can download https://github.com/skardhamar/rga/archive/master.zip with a browser.

But when I ran install_github("rga", "skardhamar") in R console, it gave me "
Downloading rga.zip from https://github.com/skardhamar/rga/archive/master.zip
Error in function (type, msg, asError = TRUE) : couldn't connect to host "

I tried set_config(use_proxy ...) command with my own proxy, it didn't work due to SSL certificate problem.

Error in fromJSON(getURL(url) : unexpected escaped character '.\.' at post 21

Got this bug, but not for all GA accounts, just this one.

library("rga")
rga.open(instance="ga", where="~/ga.rga")
Accounts <- ga$getAccounts()
Profiles <- ga$getProfiles()

Error: Unexpected escaped character '.' at pos 46

Segments <- ga$getSegments()

Error: unexpected escaped character '.' at pos 21

Goals <- ga$getGoals()
WebProperties <- ga$getWebProperties()

Unable to install on R 3.1.1.

When I try to install rga on the latest version of R (3.1.1.), it mentions this warning:

'Warning in install.packages :
package ‘C:/.../rga-master.zip’ is not available (for R version 3.1.1)'

I am trying to install directly from the zip file as the install_github command is not working on my system. I have successfully installed using this method on the previous version of R.

Any ideas much appreciated.

I can't connect to Google Analytics

Hello,

I recently got the latest version of R, specifically Version 0.98.945 for Mac, and now I can't connect to Google Analytics. Before, I would write:

query <- QueryBuilder()

and a Safari window would open automatically to authorize my account and paste the access token, but Safari doesn't open now.

Could the software update be causing a problem?

SSL Certificate Problem

Hello,
Great function!

I'm using my own API project, with client.id and client.secret. It still asks for the code, which I get from the webpage and enter into R. When I enter it into R, I get this error message:
Error in function (type, msg, asError = TRUE) : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

The new batch flag not working?

I downloaded the latest version, but the batch = TRUE flag doesn't seem to be live?

ga$getData(ids, start.date, end.date,
metrics = "ga:visits", dimensions = "ga:date,ga:medium,ga:source",
sort = "", filters = "", segment = "")

Works fine.

ga$getData(ids, batch = TRUE, start.date, end.date,
metrics = "ga:visits", dimensions = "ga:date,ga:medium,ga:source",
sort = "", filters = "", segment = "")

Error in ga$getData(GAProfileTable$id[i], batch = TRUE, start.date = start_date, :
unused argument(s) (batch = TRUE)

I can see the new methods such as ga$getDataInBatches etc in the package, so I think I have the correct commit.

Auth Error: Error in fromJSON(raw.data) : STRING_ELT() can only be applied to a 'character vector', not a 'raw'

I've recently started getting this issue when I try to authenticate.

traceback()
5: .Call("fromJSON", json_str, unexpected.escape, PACKAGE = "rjson")
4: fromJSON(raw.data)
3: .rga.authenticate(client.id = client.id, client.secret = client.secret, 
       code = code, redirect.uri = redirect.uri)
2: .rga.getToken(client.id, client.secret)
1: rga.open(instance = "ga", client.id = "willems-client-id", 
       client.secret = "willems-secret")

Any ideas? I'll fork and try to fix it...but if anyone else is having this problem and know what it might be related to, let me know.

Not sure if it's due to having too many JSON libraries loaded. With jsonlite, there's 3 of them now.

Issue with Batched Calls

When my query is greater than 10,000 the last batch throws an error and doesn't return the data frame. It appears to make one more batch than there are observations in the data.

Run (8/100): observations [70001;80000]. Batch size: 10000
Recieved: 4517 observations
Run (9/100): observations [80001;90000]. Batch size: 10000
Error in .self$getData(ids = ids, start.date = start.date, end.date = end.date, :
no results: 74517
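One way a batching loop can avoid the extra request is to stop as soon as a batch comes back shorter than the batch size. The sketch below illustrates that stopping rule with a stub; `fetch_batch` and `pull_all` are hypothetical and are not the package's functions.

```r
# Hypothetical stub: pretend the API holds `total` rows and return the requested slice.
fetch_batch <- function(start, size, total = 74517) {
  if (start > total) return(integer(0))
  seq(start, min(start + size - 1, total))
}

# Pull batches until one comes back short, which means nothing is left to request.
pull_all <- function(batch = 10000) {
  chunks <- list()
  start <- 1
  repeat {
    chunk <- fetch_batch(start, batch)
    chunks[[length(chunks) + 1]] <- chunk
    if (length(chunk) < batch) break
    start <- start + batch
  }
  list(rows = unlist(chunks), requests = length(chunks))
}

res <- pull_all()
print(res$requests)
print(length(res$rows))
```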

Output no longer consolidating when walk=T

When "walk=TRUE" I'm now seeing an output, for example, of:
[screenshot: 2014-06-05, 3:19:14 pm]
For the above output, we used to see something like
[screenshot: 2014-06-05, 3:19:19 pm]
where all of the corresponding entries were consolidated.

Thank you!

rga.open Error

Hey there,

Everything worked fine till yesterday, but now all I get is;

rga.open(instance="ga")
Error: could not find function "rga.open"

Any ideas as to why?

Stupid Question #2: Error: Attempt to apply non-function

After successfully authenticating and confirming that the instance returns no errors in a new session, the basic query in the README file returns the error "Error: attempt to apply non-function". I used the query as-is, with the exception that I changed the ids value to ga:xxxxxx, where xxxxxx is my account value.

I am sure this is something that I am doing wrong, but any idea what I am missing? I am on a Windows 7 machine using R 2.15.

Include RGA's date in each chunk of a walk

Hi guys,

Rather than extracting ga:date as a dimension in a query, @jdeboer had an excellent idea with his plugin to include date in the output of a Walk query. That allows you to extract more dimensions when date is inferred by the parameters of the walk function.

Would it be as simple as the following change?

    getDataInWalks = function(total, max, batch, ids, start.date, end.date, date.format,
                              metrics, dimensions, sort, filters, segment, fields, envir) {
        # this function will extract data day-by-day (to avoid sampling)
        walks.max <- ceiling(as.numeric(difftime(end.date, start.date, units = "days")))
        chunk.list <- vector("list", walks.max + 1)

        for (i in 0:(walks.max)) {
            date <- format(as.POSIXct(start.date) + days(i), "%Y-%m-%d")

            message(paste("Run (", i + 1, "/", walks.max + 1, "): for date ", date, sep = ""))
            chunk <- .self$getData(ids = ids, start.date = date, end.date = date, date.format = date.format,
                                   metrics = metrics, dimensions = dimensions, sort = sort, filters = filters,
                                   segment = segment, fields = fields, envir = envir, max = max,
                                   rbr = TRUE, messages = FALSE, return.url = FALSE, batch = batch)
            message(paste("Received:", nrow(chunk), "observations"))
            chunk$walk_date <- date
            chunk.list[[i + 1]] <- chunk
        }

        return(do.call(rbind, chunk.list, envir = envir))
    }

Incorrect batch size with getData + walk

When using this method, the message reporting the final batch size appears to be wrong. For example:
Pulling 52906 observations in batches of 10000
Run (1/6): observations [1;10000]. Batch size: 10000
Received 10000 observations
Run (2/6): observations [10001;20000]. Batch size: 10000
Received 10000 observations
Run (3/6): observations [20001;30000]. Batch size: 10000
Received 10000 observations
Run (4/6): observations [30001;40000]. Batch size: 10000
Received 10000 observations
Run (5/6): observations [40001;50000]. Batch size: 10000
Received 10000 observations
Run (6/6): observations [50001;52906]. Batch size: 42906
Received 2906 observations

Automatic pulling of all data (when the max attribute is excluded) doesn't work when walk=T

When the max attribute is excluded from the ga$getData() function AND batch=T AND walk=T, the automatic pulling of all rows isn't working.

Ideally, we'd walk through the time period pulling data for one day at a time, and on each day we'd detect the total number of rows and pull all of the data for that day.

For example, for June 1–30, with batch=T, walk=T and the max attribute excluded, I currently see
[screenshot: 2014-07-24, 11:44:31 am]

Ideally, though, the automatic detection of the max number of rows would occur on each day, similar to how it does when the query includes only one day. For example, for June 1 only, with batch=T, walk=T and the max attribute excluded, I currently see
[screenshot: 2014-07-24, 11:52:21 am]

Let me know if any additional information would be helpful.

Thanks for the awesome package!

Profile permissions

I'm trying to set rga up and can't seem to make a query. I've tried with several GA accounts, all where I have full admin permissions.

The code:

ga$getData(ids, start.date, end.date, metrics = "ga:visits", dimensions = "ga:date",  sort = "", filters = "", segment = "", start = 1, max = 1000)

The error that comes back:

Error in ga$getData(ids, start.date, end.date, metrics = "ga:visits",  : 
  error in fetching data: User does not have sufficient permissions for this profile.
Called from: top level 

why i have this Error: Error: attempt to apply non-function

r<-c(0,0.31,0.41,0.66,1)
x<-c(0.25,0.5,1,1.5,2)
a<-c()
f<-function(n){
for (i in 1:4){
a[i]=(x[i+1]-x[i])/(r[i+1]-r[i])
}
a

p=numeric
for(j in 1:n){
u<-runif(1,0,1)
for (k in 1:4){
if ((u[j]>=r[k]) & (u[j]< r[k+1]))
p[j]=x[k]+ak
}
}
p
}
f(10)
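One possible reading of the function above, offered only as a sketch. It assumes that p is meant to be a numeric vector of length n, that a fresh uniform value is drawn for every sample, and that `ak` stands for `a[k] * (u - r[k])`, i.e. linear interpolation within segment k. None of these assumptions is confirmed by the post.

```r
r <- c(0, 0.31, 0.41, 0.66, 1)
x <- c(0.25, 0.5, 1, 1.5, 2)

f <- function(n) {
  a <- diff(x) / diff(r)   # slope of each of the four segments
  u <- runif(n)            # one uniform draw per sample
  # Index (1..4) of the segment [r[k], r[k+1]) that each draw falls into.
  k <- findInterval(u, r, rightmost.closed = TRUE)
  # ASSUMED formula: interpolate linearly inside the segment.
  x[k] + a[k] * (u - r[k])
}

set.seed(1)
p <- f(10)
print(p)
```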

"Improve filters handling" deletion of whitespace causing errors

Hey @skardhamar,

Love this package. Looks like the latest release is causing some difficulties, though.

With this update (b905b59), the new deletion of whitespace in the filters input makes it impossible to filter by any string with whitespace.

For example: filters = ga:eventCategory=="Purchase Button"
is converted to: filters = ga:eventCategory=="PurchaseButton"

I edited the getData function for a temporary fix, but is it possible to remove this from the package?

Thank you!
Brian
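As a sketch of a gentler clean-up, whitespace could be trimmed only at the ends of the string and around the ";" (AND) and "," (OR) separators, leaving spaces inside values intact. `clean_filters` is a hypothetical helper, not the package's code, and it does not handle escaped commas inside values.

```r
# Hypothetical helper: tidy whitespace without touching spaces inside filter values.
clean_filters <- function(filters) {
  filters <- gsub("^\\s+|\\s+$", "", filters)  # trim both ends
  gsub("\\s*([;,])\\s*", "\\1", filters)       # trim around the AND/OR separators
}

cleaned <- clean_filters(" ga:eventCategory==Purchase Button ; ga:medium==organic ")
print(cleaned)
```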

Can't pull data for more than 10k a day even with batch

I reinstalled R from scratch

R version 3.1.0 (2014-04-10) -- "Spring Dance"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

I have a script that worked before:

ga.data = ga$getData("ga:1111111", batch = TRUE, walk = TRUE, "2012-28-01", "2012-12-31",
metrics = "ga:sessions,ga:transactions,ga:goal4completions,ga:pageviews,ga:bounces,ga:sessionDuration",
dimensions = "ga:keyword,ga:sourceMedium,ga:date,ga:region,ga:landingPagePath,ga:adContent,ga:adMatchedQuery",
sort = "", filters = "ga:browser=~(Chrome|Firefox|Internet Explorer|Opera|Safari|YaBrowser)", segment = "")

And now it pulls only the first 10k rows of data and doesn't use batch :(

no batch

Can't get ga:date for visits

Hi. I'm trying to pull the mobile visits for October with this code:

moviles-oct <- ga$getData(id, batch = TRUE, walk = TRUE, start.date="2014-10-01", end.date="2014-10-31", metrics = "ga:sessions", dimensions = "ga:date", sort = "", start = 1, max = 10,000)

but I get this error:

Pulling 10 observations in batches of 10
Run (1/1): observations [1;10]. Batch size: 10
Error in format.POSIXlt(as.POSIXlt(x), ...) : invalid 'format' argument

What is going on? I just want to try October first, and when I get the query working, I want to apply this to all of 2014. I think the problem is the format of the date column, because when I use ga:month and ga:day I get what I want. Please help.
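One detail that may explain both the "Pulling 10 observations" message and the format error: R has no thousands separator, so `max = 10,000` is read as `max = 10` followed by a separate positional argument `000` (that is, 0). The sketch below shows the effect with a made-up function; `demo` and its argument order are assumptions, not rga's actual signature.

```r
# demo() is hypothetical; its argument order is an assumption, not rga's signature.
demo <- function(ids, date.format = "%Y-%m-%d", max = 1000) {
  list(date.format = date.format, max = max)
}

# The comma splits the number: max becomes 10, and 000 lands in the next free argument.
res <- demo("ga:1", max = 10,000)
print(res$max)
print(res$date.format)
```

Writing `max = 10000` avoids the stray argument.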

trying to get slot "className" from an object (class "factor") that is not an S4 object

Hi, First of all thanks for the development.

My issue is that 3 days ago the package was working for me. Since 2 days ago I get an error when requesting data via ga$getData().

Expected output is a dataframe but what I get instead is an error in formatting:

Error in is(object, Class) :
trying to get slot "className" from an object (class "factor") that is not an S4 object
In addition: Warning messages:
1: In [<-.factor(*tmp*, formats$dataType == "STRING", value = c(NA, :
invalid factor level, NAs generated
2: In [<-.factor(*tmp*, formats$dataType == "INTEGER", value = c(NA_integer_, :
invalid factor level, NAs generated
3: In [<-.factor(*tmp*, formats$dataType == "PERCENT", value = c(NA_integer_, :
invalid factor level, NAs generated
4: In [<-.factor(*tmp*, formats$dataType == "TIME", value = c(NA_integer_, :
invalid factor level, NAs generated
5: In [<-.factor(*tmp*, formats$dataType == "CURRENCY", value = c(NA_integer_, :
invalid factor level, NAs generated
6: In [<-.factor(*tmp*, formats$name == "date", value = c(NA_integer_, :
invalid factor level, NAs generated

I tried to solve it myself, but couldn't succeed. Any suggestions as to what might cause the problem here? My guess is that it is the update Google made to their API on the 14th. See: https://groups.google.com/forum/?fromgroups=#!topic/google-analytics-api-notify/j4crWq0bRb8. Thanks in advance!

Greetings, Menno

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-w64-mingw32/x64 (64-bit)

API rga

Hi,

I am trying to find information about rga. When I type help in RStudio, I only get a small amount of information about rga:

"ga.open {rga} R Documentation
Open a GA API instance

Description

This function creates a Google Analytics API instance, bound to one Google Account.

Usage

rga.open(instance = "ga", client.id, client.secret, where, envir = .GlobalEnv)
Arguments

instance
this should be the name of your reference handle. By default it's set to "ga". This will be the class, that will carry all the API functions and token data.

client.id
this should be set to the client id provided by the Google Analytics API. By default it is set to a default application.

client.secret
this should be set to the client secret provided by the Google Analytics API. By default it is set to a default application.

where
if this is set to an arbitrary file, the instance will be saved here and later acquired from here. Use this for continuous work with the Google Analytics API.

envir
where should the instance be stored.

Value

This function returns a class of type rga, which contains all the methods for data extraction and keeps itself authenticated.

Note

When this function runs for the first time, it will open an available browser, and direct you to the authentication on Google. You need to accept the authentication, and copy the code Google gives you, this code needs to be inputted into the console.

If you can't run the function interactively (on a server for example), just create the instance in an environment where you can, and copy the instance to the non-interactive environment.

Examples

Not run:

rga.open(instance = "ga");
profiles <- ga$getProfiles();
# explore class:
ga$getRefClass()

End(Not run)"

Nevertheless, if you type ga$RefClass() in RStudio, you can see more methods:

"callSuper", "copy", "explore", "export", "field", "getAccounts", "getClass",
"getData", "getDataInBatches", "getDataInWalks", "getFirstDate", "getGoals",
"getProfiles", "getRefClass", "getSegments", "getToken", "getWebProperties", "import",
"initFields", "initialize", "isToken", "isTokenExpired", "isWhere", "prepare",
"processManagementData", "refreshToken", "setToken", "show", "status",
"tokenExpiresIn", "trace", "untrace", "usingMethods"

Where is the full documentation for rga?

Best,

José Luis.

Error in names(row) <- ga.headers$name : 'names' attribute [6] must be the same length as the vector [1]

Hi there.

I'm trying to pull a filtered view using this rga call:

ga.funneltrouble = ga$getData(ga.profiles[1,1],batch=TRUE,walk=TRUE,
'2013-06-01','2014-07-23',
metrics='ga:uniquePageviews,ga:sessions,ga:bounceRate',
dimensions='ga:date,ga:pageTitle,ga:sourceMedium',
filters='ga:pageTitle==blah1,ga:pageTitle==blah2,
ga:pageTitle==blah3,ga:pageTitle==blah4,ga:pageTitle==blah5')

And I get this error: Error in names(row) <- ga.headers$name : 'names' attribute [6] must be the same length as the vector [1].
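As a side note, the filter string above spans two lines inside the quotes, so it contains a line break. Building the OR-list programmatically avoids that. A small base-R sketch, using the placeholder titles from the query:

```r
# Placeholder page titles from the query above.
titles <- c("blah1", "blah2", "blah3", "blah4", "blah5")

# Join the conditions with "," (OR) into a single string with no embedded line breaks.
filters <- paste0("ga:pageTitle==", titles, collapse = ",")
print(filters)
```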

I expect this call to come up with a few empty rows. It works fine without the filter, and I think the error code is referring to this block in /R/core.R starting at line 156.

 # did not return any results
            if (!inherits(ga.data$rows, "matrix") && !rbr) {
                stop(paste("no results:", ga.data$totalResults))
            } else if (!inherits(ga.data$rows, "matrix") && rbr) {
                # return data.frame with NA, if row-by-row setting is true
                row <- as.data.frame(matrix(rep(NA, length(ga.headers$name), nrow = 1)))
                names(row) <- ga.headers$name
                return(row)
            }

I'm somewhat new to R, so I could be wrong here, but in the row <- as.data.frame(matrix(rep(NA, length(ga.headers$name), nrow = 1))) line it looks to me like the closing paren for rep() is on the outside of nrow=1 when it should be on the inside.
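A small sketch of the difference, using hypothetical header names in place of ga.headers$name (six names, to match the error message). It only shows how the placement of nrow = 1 changes the shape of the resulting data frame; it is not the package's actual fix.

```r
# Hypothetical header names standing in for ga.headers$name.
header_names <- c("date", "pageTitle", "sourceMedium",
                  "uniquePageviews", "sessions", "bounceRate")

# If nrow = 1 never reaches matrix(), the vector becomes one column of six rows,
# so assigning six names to it would fail.
one_column <- as.data.frame(matrix(rep(NA, length(header_names))))
print(dim(one_column))

# With nrow = 1 passed to matrix(), the result is one row with six columns.
row <- as.data.frame(matrix(rep(NA, length(header_names)), nrow = 1))
names(row) <- header_names
print(row)
```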

I'd really appreciate any help you could give me on fixing this.

Authentication Error

When I run

rga.open(instance = "ga")

It returns the error
Error in get(instance, envir = envir, mode = "S4") :
object 'ga' of mode 'S4' was not found

I'm on R 2.15 on a Mac.

use ga:nextPagePath

Hi,

I understand that if a website contains links to other pages in the same property, we can see them. But what happens if my website contains links that point to other websites?

For example, I ran this query:

query$Init(start.date = "2014-01-01",
end.date = "2014-05-31",
metrics = "ga:users",
dimensions="ga:nextPagePath",
sort="-ga:users",
table.id = paste("ga:",id,sep="",collapse=","),
access_token=access_token)

table<- ga$GetReportData(query)

The resulting table only contains URLs from my own property, not URLs from other properties. Is there any way to get the URLs of all the links contained in my website?

Thanks.

SSL certificate problem, verify that the CA cert is OK

Hello, I am trying to use this library to access our GA account, and have run into a problem. When I use the recommended commands: rga.open(instance="ga") I am directed through the Google Analytics login screen via my browser, and given the approval code - which I paste into R. At this point, I see the following error:

Error in function (type, msg, asError = TRUE) :
SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

I have enabled the Google Analytics API, but I get the sense that I'm missing something basic? Any guidance would be appreciated. I'm on 64-bit Windows 7, using both plain R 2.15.2 as well as R Studio 0.97.314.

Many thanks,
Matt

Error when a single batch or walk step returns 0 results

When a pull returns 0 results, I get the error message:
Error in .self$getData(ids = ids, start.date = start.date, end.date = end.date, :
no results: 85533

For a single run, this wouldn't be a problem, but difficulties arise when batch or walk are set to true.

For example, if I try to pull data for the entire month of May and execute the getData function with batch=F and walk=T, the package collects data for May 1st, then May 2nd, .... If there are 0 results for May 3rd, the program outputs the error message posted above and I lose all the results from May 1st and 2nd.

A similar problem occurs when batch=T. If my "max=" is set too high, the program crashes at the first occurrence of a batch with 0 results. In this case, it's easy enough to lower my "max=" to adjust for the correct number of data-containing pulls, but when we also have walk=T, lowering the "max=" value creates complications. If each day doesn't require the same number of batches, we might set the "max=" value too low (and lose data), or we might set the "max=" value too high (and reach a batch on a later day with no results).

Here is a screenshot of the output when both batch=T and walk=T:
[Screenshot: console output, 2014-06-26]
On the first day, there were only 85533 results, but I know on the second day there would be ~650000, which is why I ran the program with such a high "max=" value.
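One possible workaround on the caller's side is to walk the dates manually and catch the error for each day, so that earlier days are kept. The sketch below assumes a hypothetical `fetch_day` function standing in for a single-day ga$getData call; it fails on one date to mimic the "no results" error.

```r
# Hypothetical stand-in for a single-day pull; it errors when a day has no rows.
fetch_day <- function(day) {
  if (day == as.Date("2014-05-03")) stop("no results: 0")
  data.frame(date = day, visits = 1L)
}

days <- seq(as.Date("2014-05-01"), as.Date("2014-05-05"), by = "day")

# Pull each day separately; a failing day yields NULL instead of aborting the loop.
results <- lapply(days, function(day) {
  tryCatch(
    fetch_day(day),
    error = function(e) {
      message("Skipping ", format(day), ": ", conditionMessage(e))
      NULL
    }
  )
})

# Drop the skipped days and combine the rest into one data frame.
combined <- do.call(rbind, Filter(Negate(is.null), results))
print(combined)
```

This keeps the results from May 1st and 2nd even though May 3rd fails, at the cost of one request per day.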

Problems with fetching data (R 3.1.0) - Error in ga.data$error : object of type 'externalptr' is not subsettable

Hi!

Yesterday I updated my version of R (from 3.0.2 to 3.1.0) and ga$getData stopped working.
First, R asked me to install the XML package.
Second, when I tried to use two or more metrics, ga$getData was interrupted by the error "Error in ga.data$error : object of type 'externalptr' is not subsettable". When I used only one metric, I got data without any errors.

Ex.:
two metrics:

index.overview <- ga$getData(XXXXXX,
                             start.date = "2014-01-01",
                             end.date = "2014-03-31",
                             metrics = "ga:pageviews, ga:visits")

Error in ga.data$error : object of type 'externalptr' is not subsettable

one metric:

index.overview <- ga$getData(XXXXXX,
                             start.date = "2014-01-01",
                             end.date = "2014-03-31",
                             metrics = "ga:pageviews")

head(index.overview)
date pageviews
1 2014-01-01 254242
2 2014-01-02 415720
3 2014-01-03 504666
4 2014-01-04 572571
5 2014-01-05 613999
6 2014-01-06 632270

I would be very grateful if you could tell me what I am doing wrong or how I can patch this problem.
Thank you.

Cannot authenticate

Both

 > rga.open(instance="ga")

and rga.open with client.id and client.secret give

  Error in get(instance, envir = envir, mode = "S4") : 
  object 'ga' of mode 'S4' was not found

In RStudio 0.97.314.

Bad Request Error

A few months back, I was able to successfully connect and query Google Analytics with this package. I have a new project but can not connect. For reference, I am on Windows 7 32-bit. Here is how I got to my error:

  1. Navigated to a new directory.
  2. Ran the following code
    options(RCurlOptions = list(verbose = FALSE, capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE)); rga.open(instance="uga", where="uga.rga");
  3. I was prompted to add the code from my browser, after which I got the following error:

Error: Bad Request

This is new to me, and as I said, I was able to connect in the past. Any ideas?

Universal analytics

Thanks for a great library,

just a little note regarding Universal Analytics.

Anyone using it should write requests like:

ua$getData instead of ga$getData

and

rga.open(instance = "ua",

This might look straightforward, but I didn't find it in the readme file.
Thanks again for the helpful stuff.

Not Walking for more than 10,000 rows

Hi,

I'm trying to get more than 10k records per day. I tried max=50000 and still nothing.
I had the idea that the WALK option was working in the past.

data=ga$getData(xxxx, batch = TRUE, walk = TRUE, max = 50000,
"2014-01-01","2014-01-30",
metrics = "ga:sessions, ga:bounces,ga:pageviews,ga:uniquePageviews,ga:transactions,ga:itemRevenue,ga:itemQuantity,ga:uniquePurchases,ga:sessionDuration",
dimensions="ga:date,ga:campaign,ga:source,ga:medium,ga:country")

Run (1/30): for date 2014-01-01
Received: 9738 observations
Run (2/30): for date 2014-01-02
Received: 10000 observations
Run (3/30): for date 2014-01-03
Received: 10000 observations
Run (4/30): for date 2014-01-04
Received: 10000 observations
Run (5/30): for date 2014-01-05
Received: 10000 observations
Run (6/30): for date 2014-01-06

Here are my system details:

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.3.3 rga_0.8 httr_0.5 jsonlite_0.9.12 RCurl_1.95-4.1 bitops_1.0-6
loaded via a namespace (and not attached):
[1] digest_0.6.4 memoise_0.2.1 plyr_1.8.1 Rcpp_0.11.2 stringr_0.6.2 tools_3.1.1

Any idea how to fix this?
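For context, the Core Reporting API returns at most 10,000 rows per request, so larger result sets have to be fetched page by page using a start index. The sketch below only shows the arithmetic for splitting a day into pages; `total_results` is a hypothetical figure, and how rga passes these values internally is not shown in this thread.

```r
# Hypothetical total for one day, as reported by the API's totalResults field.
total_results <- 25000L
page_size <- 10000L  # per-request row limit of the API

# 1-based start index of each page.
start_index <- seq(1L, total_results, by = page_size)

# Number of rows expected on each page; the last page may be shorter.
page_rows <- pmin(page_size, total_results - start_index + 1L)

print(data.frame(start_index = start_index, rows = page_rows))
```

A day that reports exactly 10,000 observations, as in the log above, is therefore likely truncated at the first page.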

particular query

Hi,

I want to extract particular information about my website.

My website has a CONTACT button. I want to know whether R Google Analytics can run this particular query: "number of people who click on Contact per day". Which metric and dimensions do I have to use in R?

I think the dimension is ga:date, but I don't know the metric.

Thank you.

Problems with Cyrillic encoding

Hey Guys

I tried to download a report with the ga:keyword dimension.
The keywords in the report look like this:

[Screenshot of the report's keyword column]

Is there any way to fix it?

I use R 3.01 with the latest version of rga
on Windows Server 2012.

invalid input in 'utf8towcs' reading diacritics

I am attempting to pull a list of supplied search terms from GA via this library, and am having a problem when the script comes across phrases with diacritics in them. The current example is the string "Granulites and crustal di(CHARACTER HERE)erentiation" (copied and pasted from the GA interface). The command I am trying to issue is:

result <- ga$getData(GAprofile, start.date = days[i], end.date = days[i], metrics=("ga:visits,ga:visitsWithEvent,ga:totalEvents,ga:eventValue"), dimensions = "ga:date,ga:eventCategory,ga:eventAction,ga:eventLabel", sort = "", filters = "ga:eventCategory==Discovery", segment = "")

The problematic characters, if it matters, are coming from the ga:eventLabel field within the dimensions attribute. Should I be passing this content through a filter/converter before storing it in the data frame?
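One way such a filter could look, as a sketch only: check the strings with base R's validUTF8() and re-encode them with iconv() before storing them. The label below is a hypothetical string containing an invalid byte, not the actual value from GA, and whether this matches the real cause of the 'utf8towcs' error is an assumption.

```r
# Hypothetical label containing a byte (0xff) that is not valid UTF-8.
bad_label <- rawToChar(as.raw(c(0x64, 0x69, 0xff, 0x65)))

# validUTF8() flags strings that would trip up functions expecting UTF-8.
is_valid <- validUTF8(bad_label)

# iconv() with sub = "byte" replaces each unconvertible byte with its hex code
# instead of returning NA, so the rest of the label survives.
clean_label <- iconv(bad_label, from = "UTF-8", to = "UTF-8", sub = "byte")
print(clean_label)
```

Applied to a whole column, the same iconv() call works on a character vector, so it could be run on the ga:eventLabel values after each pull.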
