From CRAN:
install.packages("googleAnalyticsR")
Or the latest development version on GitHub:
remotes::install_github("8-bit-sheep/googleAnalyticsR")
Examples and tutorials available at https://8-bit-sheep.com/googleAnalyticsR.
Use the Google Analytics API from R
Home Page: https://8-bit-sheep.com/googleAnalyticsR/
License: Other
When running the following code:
df <- dim_filter(dimension = "ga:campaign", operator = "REGEXP", expressions = "welcome")
data_fetch <- google_analytics_4(ga_id,
                                 date_range = c("2016-01-01", "2016-12-31"),
                                 metrics = c("ga:itemRevenue", "ga:itemQuantity"),
                                 dimensions = c("ga:campaign", "ga:transactionId", "ga:dateHour",
                                                "ga:productBrand", "ga:productName"),
                                 dim_filters = df,
                                 anti_sample = TRUE)
It throws the following error:
Error in checkGoogleAPIError(req) :
JSON fetch error: Invalid JSON payload received. Unknown name "dimension_name" at 'report_requests[0].dimension_filter_clauses': Cannot find field.
Invalid JSON payload received. Unknown name "not" at 'report_requests[0].dimension_filter_clauses': Cannot find field.
Invalid value at 'report_requests[0].dimension_filter_clauses.operator' (TYPE_ENUM), "REGEXP"
Invalid JSON payload received. Unknown name "expressions" at 'report_requests[0].dimension_filter_clauses': Cannot find field.
Invalid JSON payload received. Unknown name "case_sensitive" at 'report_requests[0].dimension_filter_clauses': Cannot find field.
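For what it's worth, those "Unknown name … at 'report_requests[0].dimension_filter_clauses'" messages are what the v4 API returns when a bare dim_filter is sent where a filter clause object is expected. A sketch of the likely fix, wrapping the filter with filter_clause_ga4() before passing it to dim_filters (argument names as in the package docs; ga_id assumed to exist):

```r
library(googleAnalyticsR)

# Build the single filter, then wrap it in a clause object —
# dim_filters expects the clause wrapper, not the raw filter.
df <- dim_filter(dimension = "ga:campaign",
                 operator = "REGEXP",
                 expressions = "welcome")
fc <- filter_clause_ga4(list(df))

data_fetch <- google_analytics_4(ga_id,
                                 date_range = c("2016-01-01", "2016-12-31"),
                                 metrics = c("ga:itemRevenue", "ga:itemQuantity"),
                                 dimensions = c("ga:campaign", "ga:transactionId",
                                                "ga:dateHour", "ga:productBrand",
                                                "ga:productName"),
                                 dim_filters = fc,
                                 anti_sample = TRUE)
```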
Dear Mark,
First of all, thank you for writing this great R package! I am delighted to be able to use the API v4 directly from R.
I tried out the cohort example from the vignette, but it didn't behave the way I expected.
## first make a cohort group
cohort4 <- make_cohort_group(list("cohort 1" = c("2015-08-01", "2015-08-01"),
                                  "cohort 2" = c("2015-07-01", "2015-07-01")))
## then call cohort report. No date_range and must include metrics and dimensions
## from the cohort list
cohort_example <- google_analytics_4(ga_id,
                                     dimensions = c('cohort'),
                                     cohort = cohort4,
                                     metrics = c('cohortTotalUsers'))
This only returns a value for cohort 2. My first thought was that we didn't have data for cohort 1, but if I swapped the date ranges, I got only data for 2015-08-01. In fact, if I added a third cohort, I got the last two:
## first make a cohort group
cohort4 <- make_cohort_group(list("cohort 1" = c("2015-08-01", "2015-08-01"),
                                  "cohort 2" = c("2015-07-01", "2015-07-01"),
                                  "cohort 3" = c("2015-08-01", "2015-08-01")))
## then call cohort report. No date_range and must include metrics and dimensions
## from the cohort list
cohort_example <- google_analytics_4(ga_id,
                                     dimensions = c('cohort'),
                                     cohort = cohort4,
                                     metrics = c('cohortTotalUsers'))
The above returns cohorts 2 and 3, so it seems to be a simple off-by-one. I am using R 3.2.4, and I installed via install.packages today.
I tried an anti-sampling query, but ran into the following issue:
ga_auth()
unsampled_data_fetch <- google_analytics_4(ga_id,
                                           date_range = c("2015-01-01", "2015-06-21"),
                                           metrics = c("users", "sessions", "bounceRate"),
                                           dimensions = c("date", "landingPagePath", "source"),
                                           anti_sample = TRUE)
....
anti_sample set to TRUE. Mitigating sampling via multiple API calls.
Finding how much sampling in data request...
Downloaded [10] rows from a total of [49581].
Data is sampled, based on 1.1% of sessions. Use argument anti_sample = TRUE to request unsampled data.
Finding number of sessions for anti-sample calculations...
Downloaded [172] rows from a total of [172].
Calculated [102] batches are needed to download approx. [59497] rows unsampled.
Attempting hourly anti-sampling...
Finding number of hourly sessions for anti-sample calculations...
Downloaded [24] rows from a total of [24].
Anti-sample call covering 24 hours: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23
Error in google_analytics_4(viewId = viewId, date_range = c(the_day, the_day), :
object 'out' not found
Do you have any ideas about the issue?
Downloaded [21189] rows from a total of [21207].
Error in seq.default(from = 50000, to = all_rows, by = reqRowLimit) :
wrong sign in 'by' argument
It looks like the total row count changes when you query data from today, possibly due to non-golden (not yet finalised) data.
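For what it's worth, the "wrong sign" failure reproduces with plain seq(): once the reported total (21,207 here) drops below the 50,000 rows already fetched, the batch sequence would have to run backwards. A minimal sketch using the numbers from the log above (10,000 stands in for reqRowLimit):

```r
# seq() refuses to count down with a positive step, which is exactly
# what happens when the total row count shrinks below the batch offset.
res <- tryCatch(seq(from = 50000, to = 21207, by = 10000),
                error = function(e) conditionMessage(e))
print(res)
```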
Hi Mark,
Another issue using the v4 API. The following query returns 126,399 rows. It is expected to return 26,399 rows. As far as I can tell, it's just repeating some of the result rows. The rows that I checked looked to have the correct data.
data.googleAnalyticsR <- google_analytics_4(ga_id,
                                            dimensions = c('ga:month', "ga:year", "ga:landingPagePath"),
                                            date_range = c("2015-04-01", "2015-04-30"),
                                            metrics = c('ga:sessions', "ga:bounceRate",
                                                        "ga:avgSessionDuration", "ga:pageviewsPerSession"),
                                            max = 1000000)
When I ran the equivalent queries on RGA and RGoogleAnalytics, I got the expected behavior. These are, I believe, using the v3 API.
tmp.query.list <- Init(start.date = "2015-04-01",
                       end.date = "2015-04-30",
                       dimensions = c('ga:month', "ga:year", "ga:landingPagePath"),
                       metrics = c('ga:sessions', "ga:bounceRate",
                                   "ga:avgSessionDuration", "ga:pageviewsPerSession"),
                       max.results = 1000000,
                       table.id = "ga:XXXXXX")
tmp.query <- QueryBuilder(tmp.query.list)
data.RGoogleAnalytics <- GetReportData(tmp.query, token, split_daywise = F, delay = 0)
data.RGA <- get_ga(profileId = "ga:XXXXXX",
                   dimensions = c('ga:month', "ga:year", "ga:landingPagePath"),
                   metrics = c('ga:sessions', "ga:bounceRate",
                               "ga:avgSessionDuration", "ga:pageviewsPerSession"),
                   start.date = '2015-04-01',
                   end.date = '2015-04-30',
                   max = 1000000)
Thanks and let me know if you need any more sleuthing. Happy to help--it's the least I can do.
Best,
David
gadata <- google_analytics(id = XXXXXX,
                           start = start.date, end = end.date,
                           metrics = c("uniquePageviews"),
                           dimensions = c("pageTitle", "date", "channelGrouping"))
head(gadata)
pageTitle date channelGrouping uniquePageviews
1 (not set) 2015-03-01 Direct 95
2 (not set) 2015-03-01 Email Newsletter 190
3 (not set) 2015-03-01 Organic Search 475
4 (not set) 2015-03-01 Paid Search 285
5 (not set) 2015-03-01 Referral 95
6 (not set) 2015-03-02 Direct 285
A list of users per accountId/webPropertyId/ViewId
Example account with 91.98% sampling per day.
Running account_list <- google_analytics_account_list()
gives an error:
request: https://www.googleapis.com/analytics/v3/management/accountSummaries/
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
Traceback shows:
traceback()
7: stop("numbers of columns of arguments do not match")
6: rbind(deparse.level, ...)
5: f(init, x[[i]])
4: Reduce(rbind, listNameToDFCol(wp_prep, "accountId"))
3: data_parse_function(req$content, ...)
2: acc_sum()
1: google_analytics_account_list()
Retrieving actual data works fine, including the google_analytics_meta() function.
Not working first time....
google_analytics(gaId,
                 start = "2016-01-25",
                 end = "2016-01-25",
                 metrics = c("pageviews"),
                 dimensions = c("dimension3", "pagePath"),
                 samplingLevel = "WALK",
                 filters = "ga:dimension3!~_scUid",
                 max_results = 20000)
Doesn't fetch dates correctly:
raw <- google_analytics(gaId,
                        start = "2016-01-25",
                        end = "2016-01-26",
                        metrics = c("pageviews"),
                        dimensions = c("dimension3", "pagePath"),
                        samplingLevel = "WALK",
                        filters = "ga:dimension3!~_scUid",
                        max_results = 20000)
When trying to use the walk option for samplingLevel, only 10,000 rows are returned for each api call.
full <- google_analytics(profile,
                         start = singleStart, end = singleEnd,
                         metrics = "ga:sessions,ga:itemRevenue",
                         dimensions = "ga:date,ga:dimension14,ga:dimension16,ga:medium,ga:source,ga:userType",
                         samplingLevel = "WALK",
                         max_results = 100000)
When adding a segment, the total drops to 1,000 per api call.
segment <- google_analytics(profile,
                            start = singleStart, end = singleEnd,
                            metrics = "ga:sessions,ga:itemRevenue",
                            dimensions = "ga:date,ga:dimension14,ga:dimension16,ga:medium,ga:source,ga:userType",
                            segment = "sessions::condition::!ga:eventAction=@Create Account",
                            samplingLevel = "WALK",
                            max_results = 100000)
Is my script correct or is there another way to walk through the results?
Treat this better:
Error in fetch_google_analytics_4(requests, merge = TRUE) :
List of dataframes have non-identical column names. Got NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
9.
stop("List of dataframes have non-identical column names. Got ",
paste(lapply(out, function(x) names(x)), collapse = " "))
8.
fetch_google_analytics_4(requests, merge = TRUE)
7.
google_analytics_4(viewId = viewId, date_range = c(x$start_date,
x$end_date), metrics = metrics, dimensions = dimensions,
dim_filters = dim_filters, met_filters = met_filters, filtersExpression = filtersExpression,
order = order, segments = segments, pivots = pivots, cohorts = cohorts, ...
6.
FUN(X[[i]], ...)
5.
lapply(new_date_ranges, function(x) {
if (x$range_date > 1) {
myMessage("Anti-sample call covering ", x$range_date,
" days: ", x$start_date, ", ", x$end_date, level = 3) ...
4.
anti_sample(viewId = viewId, date_range = date_range, metrics = metrics,
dimensions = dimensions, dim_filters = dim_filters, met_filters = met_filters,
filtersExpression = filtersExpression, order = order, segments = segments,
pivots = pivots, cohorts = cohorts, metricFormat = metricFormat, ...
3.
googleAnalyticsR::google_analytics_4(viewId = id, date_range = c(start,
end), metrics = metrics, dimensions = dimensions, filtersExpression = filters,
segments = segment, anti_sample = anti_sample, max = max_results) at misc.R#188
2.
iihGoogleAnalytics(id = ga_viewId, start = as.Date("2016-01-01"),
end = Sys.Date(), metrics = c("adClicks", "CTR", "CPC", "sessions",
"bounces", "adCost", "goalCompletionsAll", "goal9Completions",
"goal11Completions"), dimensions = c("date", "campaign", ... at xxx_functions.R#213
1.
SEM_data(ga_viewId = ga_viewId)
pageToken or the v3 way?
The below query works and returns sampled results:
start.date <- "2015-05-01"
end.date <- "2015-11-10"
ga.data <- google_analytics(id = 93625103,
                            start = start.date, end = end.date,
                            metrics = c("uniquePageviews"),
                            dimensions = c("pageTitle", "date", "channelGrouping"),
                            filters = "ga:pageTitle%3D%3DXXXXX,ga:pageTitle%3D%3DXXXXX;ga:country%3D%3DNetherlands;ga:deviceCategory%3D%3Ddesktop",
                            max = 100000)
However, when I add samplingLevel = "WALK" to the query, I get:
Request to profileId: ()
Error in if (x$kind == "analytics#gaData") { : argument is of length zero
In the order_type function, should this line:
testthat::expect_type(field, character)
be:
testthat::expect_type(field, "character")
Can't seem to get a query with this option to work.
The query:
se <- segment_element("transactionRevenue",
                      operator = "GREATER_THAN",
                      type = "METRIC",
                      comparisonValue = 0,
                      scope = "SESSION")
sv_simple <- segment_vector_simple(list(list(se)))
seg_defined <- segment_define(list(sv_simple))
segment4 <- segment_ga4("simple", user_segment = seg_defined)
ga_conversion_paths.df <- google_analytics_4(
  ga_id,
  date_range = as.character(date_range),
  metrics = c('hits', 'transactionRevenue'),
  dimensions = c("dimension18", "dimension17", "eventAction",
                 "eventLabel", "segment", "dimension8"),
  filtersExpression = "ga:eventLabel!~^:;ga:dimension18!=false;ga:eventCategory==clientid",
  segments = segment4,
  anti_sample = TRUE,
  anti_sample_batches = 15
)
The response:
anti_sample set to TRUE. Mitigating sampling via multiple API calls.
Finding how much sampling in data request...
Downloaded [10] rows from a total of [258213].
Data is sampled, based on 53.8% of sessions.
Calculated [3] batches are needed to download approx. [309856] rows unsampled.
Anti-sample call covering 14 days: 2016-10-01, 2016-10-14
Request Status Code: 502
Error: lexical error: invalid char in json text.
<!DOCTYPE html> <html lang=en>
(right here) ------^
Hi,
maybe I'm just not fully understanding the authentication process. But as I followed the setup, the only thing I need to do is place the .json from the Google APIs Admin Console and point to it with the environment variable:
Sys.setenv(GA_AUTH_FILE = "/Users/michaelsinner/rStudio/test/auth/myAuth.json")
When I try to run a simple test code I get this error:
No authorization yet in this session!
NOTE: a .httr-oauth file exists in current working directory.
Run gar_auth() to use the credentials cached for this session.
Token doesn't exist
Error in acc_sum() : Invalid Token
In addition: Warning message:
In checkTokenAPI(shiny_access_token) : Invalid local token
What am I missing? Is there an issue with the auto authentication with JSON file?
It could just be there is one day left, not the session limiting it to one day
This may confuse:
> ga_auth()
Auto-auth - .httr-oauth
Authenticated
<Token>
<oauth_endpoint>
authorize: https://accounts.google.com/o/oauth2/auth
access: https://accounts.google.com/o/oauth2/token
validate: https://www.googleapis.com/oauth2/v1/tokeninfo
revoke: https://accounts.google.com/o/oauth2/revoke
<oauth_app> google
key: 289759286325-da3fr5kq4nl4nkhmhs2uft776kdsggbo.apps.googleusercontent.com
secret: <hidden>
<credentials> access_token, token_type, expires_in, refresh_token
---
The following code returns:
Error: could not find function "dynamicSegment"
It's weird. googleAnalyticsR is loaded, and the help is showing, but R is saying the function doesn't exist. (Put aside that I might not have the syntax right yet... but it seems like there's something more fundamental I'm missing here.)
library(googleAnalyticsR)
segments_list <- list(
  segment_ga4("All Visits", "gaid::-1"),
  dynamicSegment("Non-Paid", sessionSegment = "sessions::condition::ga:channelGrouping=~(Organic.Search)|(Direct)|(Referral)"),
  dynamicSegment("Paid", sessionSegment = "sessions::condition::ga:channelGrouping=~(Paid.Search)|(Display)|(Video)|(Social)|(Email)|(Other)")
)
Parsing broken
The v4 GA quotas will make this a lot easier for not so complicated fetches.
The traditional per-day fetch to avoid sampling works in most cases, as it keeps each API fetch below the sampling limit, but it breaks when that is not the case, and it is very inefficient if the data is only lightly sampled.
The API reports how much of the data is sampled, so take this from the first call to calculate the batch sizes, and use those to split the data into unsampled calls instead.
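The arithmetic could be sketched like this (illustrative function and variable names, not the package's internals): the sampling rate reported by the first call tells you roughly how many sessions one unsampled call can cover, which fixes the number of date batches.

```r
# Split a date range into the fewest contiguous batches that should each
# stay under the sampling threshold, given the sampling rate reported by
# an initial call. Purely illustrative names.
plan_batches <- function(start_date, end_date, total_sessions, sampling_rate) {
  # sessions one call can cover before sampling kicks in
  sessions_per_call <- total_sessions * sampling_rate
  n_batches <- ceiling(total_sessions / sessions_per_call)
  dates <- seq(as.Date(start_date), as.Date(end_date), by = "day")
  # assign each day to one of the n_batches contiguous groups
  split(dates, cut(seq_along(dates), breaks = n_batches, labels = FALSE))
}

batches <- plan_batches("2015-01-01", "2015-01-10",
                        total_sessions = 59497, sampling_rate = 0.25)
length(batches)  # 4 batches for data sampled at 25%
```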
I run this query for generating a list of goals:
list_goals(accountId = accountId, webPropertyId = uaCode_sb, profileId = "~all", start.index = NULL, max.results = NULL)
It returns the following error:
> Error: Variables must be length 1 or 8.
> Problem variables: 'id', 'accountId', 'webPropertyId', 'internalWebPropertyId', 'profileId', 'name', 'value', 'active', 'type', 'created', 'updated', 'urlDestinationDetails.url', 'urlDestinationDetails.caseSensitive', 'urlDestinationDetails.matchType', 'urlDestinationDetails.firstStepRequired', 'eventDetails.useEventValue'
Similar queries for custom metrics, custom datasources, etc. work fine. Not sure this is a bug, but I don't understand how to solve this based on the error message.
It should rerun for the full amount.
It didn't parse when a new field "starred" appeared in the web property and view list.
It needs to be more robust to new columns, perhaps by relying on dplyr's bind_rows?
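To illustrate the bind_rows idea with a toy sketch (not the package's actual parsing code): base rbind errors when a new column such as "starred" appears, while dplyr::bind_rows fills the gap with NA.

```r
library(dplyr)

# Two API responses: the second carries the new "starred" column,
# as happened with the web property and view list.
old <- data.frame(id = "UA-1", name = "Site A", stringsAsFactors = FALSE)
new <- data.frame(id = "UA-2", name = "Site B", starred = TRUE,
                  stringsAsFactors = FALSE)

# rbind(old, new) would stop with:
#   "numbers of columns of arguments do not match"
combined <- bind_rows(old, new)  # old rows get starred = NA instead
```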
Fetch from management API for 360 properties
Windows 10 64 bit.
R 3.3.2
RStudio Version 1.0.44
The code:
account_list = google_analytics_account_list()
returns the following error:
Error : df is not a data frame
API Data failed to parse. Returning parsed from JSON content.
Use this to test against your data_parse_function.
How can it be solved?
From Jimmy Glenn, he uses batching to speed up multi-account fetching in v3:
[.....] we have over 100 GA properties. We have a rollup property, but we're not on GA premium. Being able to run the same query on each property in a batch format is much faster. For a simple pageview query, batching takes 15 - 20 seconds versus 50 seconds using google_analytics calls. That time savings adds up when pulling traffic to each site by article or section.
ga_pars <- list(ids = ga_views$gaId[1],
                'start-date' = start_date, 'end-date' = end_date,
                metrics = 'ga:users,ga:sessions',
                output = 'json')
# converting dates to character and URL-encoding
ga_pars <- lapply(ga_pars, as.character) %>%
  lapply(., function(x) URLencode(x, reserved = T))
f <- gar_api_generator("https://www.googleapis.com/analytics/v3/data/ga",
                       "GET",
                       pars_args = ga_pars,
                       data_parse_function = parse_google_analytics)
output <- gar_batch_walk(f,
                         walk_vector = ga_views$gaId,
                         gar_pars = ga_pars,
                         pars_walk = 'ids',
                         data_frame_output = FALSE)
results <- lapply(output, plyr::ldply) %>% plyr::ldply()
Hello Mark,
I'm struggling to figure out how googleAuthR works. I've written a script to compare Google and Adobe. I'm scheduling the script for every week. When I run the script live myself in R, the authentication works (obviously as I get the pop up to authorize my Google account). However when I schedule the script to run, I have authentication issues. Not sure what I'm doing wrong or if it is the order I have the code in.
I've run the script and have the .httr-oauth file saved in the same folder as the R script. Do I have the order wrong? Should there be something in gar_auth()? Did I screw up the options? Thanks.
library(googleAnalyticsR)
library(googleAuthR)
gar_auth()
options(googleAuthR.client_id = "uxxxx.apps.googleusercontent.com")
options(googleAuthR.client_secret = "xxxx")
options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/analytics")
Things like the management API. Maybe put some of the dashboard macros into another library.
Something happens in the second 50000 batch that gives a "503: service error", look at JSON boxes.
> gaaa <- getGoogleAnalytics(config, historic = FALSE, auth_file = "ga.httr-oauth")
## getGoogleAnalytics
anti_sample set to TRUE. Mitigating sampling via multiple API calls.
Finding how much sampling in data request...
Auto-refreshing stale OAuth token.
Downloaded [10] rows from a total of [601845].
No sampling found, returning call
Downloaded [50000] rows from a total of [601845].
Request Status Code: 503
Trying again: 1 of 5
Trying again: 2 of 5
Under some circumstances it looks to fail.
list(structure(list(columnHeader = structure(list(dimensions = list(
"ga:eventLabel"), metricHeader = structure(list(metricHeaderEntries = list(
structure(list(name = "ga:users", type = "INTEGER"), .Names = c("name",
"type"))), pivotHeaders = list(structure(list(totalPivotGroupsCount = 1L), .Names = "totalPivotGroupsCount"))), .Names = c("metricHeaderEntries",
"pivotHeaders"))), .Names = c("dimensions", "metricHeader")),
data = structure(list(rows = list(structure(list(dimensions = list(
"http://producer.imglobal.com/producerdocuments.ashx?a=524451&documentid=2645"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("http://www.expatriatehealthcare.com/Broker/WFTI"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("http://www.internationalrail.com/"),
metrics = list(structure(list(values = list("2"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("http://www.piau-engaly.com/"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("http://www.saintlary.com/hiver/index.php"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("http://www.travel-claims.net/"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://medicaltravelcompared.co.uk/affiliate/q/rothwelltowler"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://producer.imglobal.com/international-insurance-plans.aspx?imgac=524451"),
metrics = list(structure(list(values = list("3"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://quote.freespirittravelinsurance.com/a/3145"),
metrics = list(structure(list(values = list("2"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://secure.guestfirst.co.uk/"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://uk.trustpilot.com/review/www.world-first.co.uk"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/Quote/GLOBAL_FUSION/pre-quote?imgac=524451"),
metrics = list(structure(list(values = list("2"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/Quote/globehopper_group/pre-quote?imgac=524451"),
metrics = list(structure(list(values = list("2"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/quote/globehopper_multitrip_group?imgac=524451"),
metrics = list(structure(list(values = list("1"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/Quote/globehopper_multitrip/pre-quote?imgac=524451"),
metrics = list(structure(list(values = list("9"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/quote/globehopper_platinum?imgac=524451"),
metrics = list(structure(list(values = list("3"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
)), structure(list(dimensions = list("https://www.imgeurope.co.uk/purchase/Quote/globehopper/pre-quote?imgac=524451"),
metrics = list(structure(list(values = list("33"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("dimensions", "metrics"
))), totals = list(structure(list(values = list("65"), pivotValueRegions = list(
structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions"))), rowCount = 17L, minimums = list(structure(list(
values = list("1"), pivotValueRegions = list(structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions"))), maximums = list(structure(list(values = list(
"33"), pivotValueRegions = list(structure(list(), .Names = character(0)))), .Names = c("values",
"pivotValueRegions")))), .Names = c("rows", "totals", "rowCount",
"minimums", "maximums"))), .Names = c("columnHeader", "data"
)))
Apparently it's an error with the client libraries, so why does it happen here too?
http://stackoverflow.com/questions/39537395/file-name-in-uploaddata-google-analytics
I run the following code with or without anti_sample:
cf <- dim_filter("dimension7", "EXACT", campaign_name, not = FALSE)
fc <- filter_clause_ga4(list(cf))
gaDataFunnel_dev1 <- google_analytics_4(viewId,
                                        date_range = c(dateStart, dateYesterday),
                                        dimensions = c("deviceCategory", "dimension7"),
                                        metrics = c("uniquePageviews"),
                                        order = order_type("deviceCategory", "ASCENDING"),
                                        dim_filters = fc,
                                        anti_sample = TRUE)
The code with anti_sample = TRUE returns the same values separated into two rows; without it, the table is no longer split into multiple rows.
This seems to have started only in the past few days; it didn't happen before.
I'm running version 0.3.0 of the package.
Make it easier to port google_analytics calls to google_analytics_4. Things like filters get evaluated to filtersExpression when you should be using dim_filters or met_filters. Move start and end to date_range, set max_results to max, etc.
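A sketch of the intended mapping (hypothetical values; argument names as described above):

```r
# v3-style call
gadata <- google_analytics(id = ga_id,
                           start = "2016-01-01", end = "2016-01-31",
                           metrics = "sessions",
                           dimensions = "date",
                           filters = "ga:medium==organic",
                           max_results = 10000)

# its v4 equivalent after porting: start/end become date_range,
# filters becomes a dim_filters clause, max_results becomes max
mf <- filter_clause_ga4(list(dim_filter("medium", "EXACT", "organic")))
gadata <- google_analytics_4(ga_id,
                             date_range = c("2016-01-01", "2016-01-31"),
                             metrics = "sessions",
                             dimensions = "date",
                             dim_filters = mf,
                             max = 10000)
```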
It checks the prefix, so things like (ga:sessions/ga:users) fail, as ( is detected and replaced with ga:(.
For example, batch-walked data produces a lot of message spam; add some verbosity levels to the feedback.
Hello,
It seems like the package won't recognise mcf dimensions and metrics: it tries to prepend the "ga:" string instead of "mcf:".
I'm not sure if it's just a bug or if mcf support requires more work.
When running a query with the filter "ga:pageTitle=@at&T", it looks like the filter is not URL-encoded.
I added this to line 15 of google_analytics and it appears to work:
if (!is.null(filters)) {
  filters <- utils::URLencode(filters, reserved = TRUE)
}
There's probably a much more elegant solution out there, not sure I'm proficient enough to find it.
Saw it with default channel grouping via Simon
It looks like multi_account_batching does not refresh the OAuth token. If I start an R session with a query where multi_account_batching = TRUE, I get an error. If I run a standard google_analytics query, the console prints "Auto-refreshing stale OAuth token." After that, I have no errors with multi_account_batching = TRUE.
gadata <- google_analytics(id = XXXXXX,
                           start = start.date, end = end.date,
                           metrics = c("uniquePageviews"),
                           dimensions = c("pageTitle", "date", "channelGrouping"),
                           filters = c("ga:channelGrouping%3D%3DSocial",
                                       "ga:channelGrouping%3D%3DDirect"))
returns:
Error in checkGoogleAPIError(req) :
JSON fetch error: Invalid value 'c("ga:channelGrouping==Social", "ga:channelGrouping==Direct").
Values must match the following regular expression: 'ga:.+'
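For reference, the v3 filters argument is a single string rather than a character vector: conditions joined with "," are ORed, and with ";" ANDed. A sketch of a likely fix (filter values are illustrative):

```r
# Join the two conditions with "," (OR); ";" would AND them instead.
my_filters <- paste("ga:channelGrouping==Social",
                    "ga:channelGrouping==Direct",
                    sep = ",")
# then pass it as a single string:
# google_analytics(id, start, end, ..., filters = my_filters)
print(my_filters)
```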
The documentation for google_analytics() states the default value for max_results is -1:
https://github.com/MarkEdmondson1234/googleAnalyticsR_public/blob/master/man/google_analytics.Rd#L10
However, looking into the code, it is 100:
https://github.com/MarkEdmondson1234/googleAnalyticsR_public/blob/master/R/getData.R#L35
Explicitly setting max_results to -1 doesn't lead to the described behaviour of automatically querying all data:
https://github.com/MarkEdmondson1234/googleAnalyticsR_public/blob/master/man/google_analytics.Rd#L29
It leads to an error:
JSON fetch error: Invalid value '-1' for max-results parameter. Valid values are between 0 and 10000.
Which makes complete sense, because no special behaviour for max_results = -1 seems to be implemented.
Another thing I noticed is that for values > 10000 automatically all data is fetched. I just saw that this is documented as a TODO in the code ;) https://github.com/MarkEdmondson1234/googleAnalyticsR_public/blob/master/R/fetch_functions.R#L2
But batching is also used for exactly 10000, so the following condition could probably be safely changed to max_results <= 10000 (i.e. not < 10000):
https://github.com/MarkEdmondson1234/googleAnalyticsR_public/blob/master/R/getData.R#L83
At the moment setting max to 1,000,000 means it will try to fetch 1000 times even if there are only 10 results, which is daft.