Hi all,
I'm using the googleLanguageR package version 0.2.0.9 to transcribe German phone calls to text with the Google Speech-to-text API (speaker diarization is turned on, two speakers).
However, whenever I transcribe a file that is longer than 60 seconds (i.e., one that I store in a Cloud Storage bucket and then access via its URI), I get a warning message.
Here is my code:
my_config <- list(encoding = "LINEAR16",
                  enableSpeakerDiarization = TRUE,
                  diarizationSpeakerCount = 2)
testcall <- "gs://[bucket]/testcall.wav"
apicall <- gl_speech(testcall, sampleRateHertz = 8000, languageCode = "de-DE",
                     asynch = TRUE, customConfig = my_config)
testcall_transcript <- gl_speech_op(apicall)
The transcription succeeds, but R gives me this warning message:
Warning message:
In value[[3L]](cond) : Could not parse object with names:
As a result of this warning, the structure of the two returned data frames seems to be a little mixed up.
When I call str(testcall_transcript) it gives me the following output:
List of 2
$ transcript:Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 2 obs. of 2 variables:
..$ transcript: chr [1:2] "ja hallo ja und vergebe Zusatzdaten und zwar hat er was mache ich als nicht anzumerken ist einfach machen 815" "und wie heißt die Variable die drin da diese Datei ein Kratzer nennst Zusatzdaten Zusatzdaten vorgangs-id nicht"| __truncated__
..$ confidence: chr [1:2] "0.8421874" "0.8393924"
$ timings :List of 2
..$ :'data.frame': 1 obs. of 3 variables:
.. ..$ transcript: chr "ja hallo ja und vergebe Zusatzdaten und zwar hat er was mache ich als nicht anzumerken ist einfach machen 815"
.. ..$ confidence: num 0.842
.. ..$ words :List of 1
.. .. ..$ :'data.frame': 20 obs. of 3 variables:
.. .. .. ..$ startTime: chr [1:20] "0s" "17.600s" "18.100s" "18.200s" ...
.. .. .. ..$ endTime : chr [1:20] "17.600s" "18.100s" "18.200s" "19.100s" ...
.. .. .. ..$ word : chr [1:20] "ja" "hallo" "ja" "und" ...
..$ :'data.frame': 1 obs. of 3 variables:
.. ..$ transcript: chr "und wie heißt die Variable die drin da diese Datei ein Kratzer nennst Zusatzdaten Zusatzdaten vorgangs-id nicht"| __truncated__
.. ..$ confidence: num 0.839
.. ..$ words :List of 1
.. .. ..$ :'data.frame': 43 obs. of 4 variables:
.. .. .. ..$ startTime : chr [1:43] "0s" "17.600s" "18.100s" "18.200s" ...
.. .. .. ..$ endTime : chr [1:43] "17.600s" "18.100s" "18.200s" "19.100s" ...
.. .. .. ..$ word : chr [1:43] "ja" "hallo" "ja" "und" ...
.. .. .. ..$ speakerTag: int [1:43] 1 1 1 1 1 1 1 1 1 1 ...
Everything looks fine so far, but when I try to work with the $timings element, I can't get at the $speakerTag variable. I need speakerTag together with the corresponding start and end times in order to determine the timestamps at which a speaker turn happens.
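To make the goal concrete, here is a minimal sketch of the speaker-turn computation I have in mind. It runs on a small made-up word table: the column names match the str() output above, but the speakerTag values are invented for illustration, and to_seconds is just a hypothetical helper name.

```r
# Made-up word-level table; column names follow the str() output,
# the speakerTag values are hypothetical.
words <- data.frame(
  startTime  = c("0s", "17.600s", "18.100s", "18.200s"),
  endTime    = c("17.600s", "18.100s", "18.200s", "19.100s"),
  word       = c("ja", "hallo", "ja", "und"),
  speakerTag = c(1L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# The API returns times as strings such as "17.600s"; drop the trailing "s".
to_seconds <- function(x) as.numeric(sub("s$", "", x))
words$start_sec <- to_seconds(words$startTime)
words$end_sec   <- to_seconds(words$endTime)

# A turn starts at the first word and wherever the tag differs from the previous word.
is_turn_start <- c(TRUE, diff(words$speakerTag) != 0)
turns <- words[is_turn_start, c("speakerTag", "start_sec")]
turns
```

Each row of turns would then give the speaker and the time at which that speaker starts talking.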
For a short file (less than 60 seconds), R gives me this output (working perfectly):
> transcript_short$timings$speakerTag
[1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[68] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1
For the long file R gives me this output:
> testcall_transcript$timings$speakerTag
NULL
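Going by the str() output above, $timings for the long file is an unnamed list of data frames rather than a single data frame, which would explain the NULL. The tags appear to sit one level deeper, in the words list column of the second element. Below is a rough sketch of how they might be pulled out. It uses a mock object that only mimics the printed structure (shortened, with invented speakerTag values, not real API output), so whether it holds for every response is an assumption on my part.

```r
# Mock object shaped like the str() output for the long file (not real API output).
mock_transcript <- list(
  transcript = data.frame(
    transcript = c("ja hallo", "ja hallo und"),
    confidence = c("0.8421874", "0.8393924"),
    stringsAsFactors = FALSE
  ),
  timings = list(
    data.frame(transcript = "ja hallo", confidence = 0.842, stringsAsFactors = FALSE),
    data.frame(transcript = "ja hallo und", confidence = 0.839, stringsAsFactors = FALSE)
  )
)

# Attach the word-level data frames as list columns, as in the str() output.
mock_transcript$timings[[1]]$words <- list(data.frame(
  startTime = c("0s", "17.600s"),
  endTime   = c("17.600s", "18.100s"),
  word      = c("ja", "hallo"),
  stringsAsFactors = FALSE
))
mock_transcript$timings[[2]]$words <- list(data.frame(
  startTime  = c("0s", "17.600s", "18.100s"),
  endTime    = c("17.600s", "18.100s", "18.200s"),
  word       = c("ja", "hallo", "und"),
  speakerTag = c(1L, 1L, 2L),  # hypothetical values
  stringsAsFactors = FALSE
))

# `$speakerTag` on the unnamed list finds nothing, hence NULL.
direct_access <- mock_transcript$timings$speakerTag

# Collect the word tables and keep the one that carries a speakerTag column.
word_frames  <- lapply(mock_transcript$timings, function(res) res$words[[1]])
has_tags     <- vapply(word_frames, function(w) "speakerTag" %in% names(w), logical(1))
tagged_words <- word_frames[has_tags][[1]]
tagged_words$speakerTag
```

With the real object, the equivalent direct path would presumably be testcall_transcript$timings[[2]]$words[[1]]$speakerTag, but I would still prefer the same flat structure that the short file returns.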
Any ideas on how this can be fixed? Extracting the speakerTags is crucial for my further data processing.
Thanks! :)