No content is retrieved, potential error at the readHTMLTable stage. about scholar HOT 13 CLOSED

jkeirstead commented on July 18, 2024

No content is retrieved, potential error at the readHTMLTable stage.

from scholar.

Comments (13)

bastienboutonnet commented on July 18, 2024

Looks like the problem is still there. I'm not sure if I can be more helpful but it may be a good idea to look into this. I've been able to reproduce the problem using a different scholar profile, and on a different computer.

Not sure where exactly the problem resides but it would seem that when trying to read the table from the page using readHTMLTable(url) no content is retrieved:

> t=readHTMLTable("http://scholar.google.com/citations?hl=en&user=qZLGnroAAAAJ")
> t
named list()

jefferis commented on July 18, 2024

I can confirm that I see this. See also http://stackoverflow.com/questions/33741372/google-server-gives-a-server-error-with-the-first-request-in-private-browsing-mo

That SO post notes that repeating the request bypasses the issue. Doing:

library(httr)
url <- 'https://scholar.google.com/citations?hl=en&user=qZLGnroAAAAJ'
# First request: did not return the profile page for me
res <- GET(url)
content(res)
# Second, identical request: returned the expected content
res2 <- GET(url)
content(res2)

worked for me the second time, while getURL never worked.

bastienboutonnet commented on July 18, 2024

I can confirm that this works indeed.

If readHTMLTable() is run on the content of res2 (readHTMLTable(content(res2))), then we obtain the tables needed for the rest of the functions to work.

What does @jkeirstead recommend in terms of mending this issue? Should the functions be written so that a test for content retrieval is performed and if this fails a pull of the content using the method outlined above (twice) is performed and the rest of the function runs on the content of the object?

Not sure how long this issue with Google will remain; it seems to have been going on for quite some days already. But this seems like a workable fix, and if it is judged necessary I'd be happy to help implement it :)
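
The test-and-retry idea suggested above could look roughly like the sketch below. It is only an illustration, not the package's code: fetch_tables_with_retry and httr_fetch are hypothetical names, and the fetcher is passed in as an argument so the retry logic can be tried without touching the network.

```r
# Hypothetical helper (not part of the scholar package).
# `fetch` is any function that takes a URL and returns a list of tables,
# which is empty when nothing could be read from the page.
fetch_tables_with_retry <- function(url, fetch, max_tries = 2) {
  for (i in seq_len(max_tries)) {
    # Treat an error the same way as an empty result and try again
    tables <- tryCatch(fetch(url), error = function(e) list())
    if (length(tables) > 0) return(tables)
  }
  stop("No tables retrieved from ", url, " after ", max_tries, " attempts")
}

# Assumed fetcher mirroring the workaround in this thread: request the page
# with httr, then parse the returned HTML text with the XML package.
httr_fetch <- function(url) {
  res <- httr::GET(url)
  html <- httr::content(res, as = "text", encoding = "UTF-8")
  XML::readHTMLTable(XML::htmlParse(html, asText = TRUE))
}
```

It would be called as fetch_tables_with_retry(url, httr_fetch), with max_tries = 2 matching the "request twice" workaround described above.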

Kyeongan commented on July 18, 2024

I came here from Stackoverflow.com. Your R package is pretty nice, and it seems we are now hitting the same issue with Google. I would like to discuss ideas for solving it and share anything that is helpful.

jkeirstead commented on July 18, 2024

Thanks for raising this issue and posting the fix. I'm inclined to wait to see if Google fixes this since that's what the error message suggests.

bastienboutonnet commented on July 18, 2024

Makes sense. If it takes too long and you want the fixes implemented, I'll be happy to help.

rogiersbart commented on July 18, 2024

Ok guys, great! Just noticed the bug looking at an empty citation history plot on my personal blog. Let's hope they fix this soon!

LechMadeyski commented on July 18, 2024

"Fixed the issue by having cookies when it requests URLs." see http://stackoverflow.com/questions/33741372/google-server-gives-a-server-error-with-the-first-request-in-private-browsing-mo

jefferis commented on July 18, 2024

That makes sense. I think httr GET looks after the cookie state.

jkeirstead commented on July 18, 2024

Thanks @LechMadeyski. That does indeed seem to be the problem; will try to get a fix out shortly.

jkeirstead commented on July 18, 2024

This has now been fixed and the latest version is available on dev; a CRAN release should be out very soon.

For those who are curious, the problem was that cookies have to be accepted in order to access the content. The package now performs a one-off check for a dummy URL and then maintains a persistent Curl handle for future queries.
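
For illustration only, the "one-off check plus persistent handle" pattern described above might be sketched as follows with httr. This is not the code that went into the package: make_session is a hypothetical name, the priming path "/" is an assumption (the actual dummy URL is not stated here), and the get and new_handle arguments exist only so the logic can be exercised without network access.

```r
# Hypothetical session factory (not the scholar package's implementation).
# It creates one handle for the host and reuses it for every request, so any
# cookies set by the first response are sent along with later requests.
make_session <- function(base_url, get = httr::GET, new_handle = httr::handle) {
  h <- new_handle(base_url)
  primed <- FALSE
  function(path, query = list()) {
    if (!primed) {
      # One-off priming request so the server can set its cookies.
      # The path "/" is an assumption made for this sketch.
      get(handle = h, path = "/")
      primed <<- TRUE
    }
    get(handle = h, path = path, query = query)
  }
}
```

Assuming httr is installed, usage would be along the lines of session <- make_session("https://scholar.google.com") followed by session("/citations", query = list(hl = "en", user = "qZLGnroAAAAJ")).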

guillaumelobet commented on July 18, 2024

It appears the issue is back, at least for me. I am trying to compile data from several colleagues (so multiple get_profile() queries), and I randomly get stuck with the error "Error in tables[[1]] : subscript out of bounds".

Any ideas how to fix this, or any workaround?
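
One possible stopgap for batch queries, offered as a sketch rather than a confirmed fix (the cause of the recurrence is not established in this thread), is to wrap each lookup in tryCatch, pause, and retry a few times. fetch_profiles is a hypothetical helper; get_one stands for whichever lookup function is in use, for example get_profile.

```r
# Hypothetical batch wrapper (not part of the scholar package).
# Calls `get_one(id)` for each id, retrying after an error up to `tries`
# times with `pause` seconds between attempts. Ids that never succeed are
# kept in the result as NULL so they can be inspected or re-run later.
fetch_profiles <- function(ids, get_one, tries = 3, pause = 1) {
  out <- lapply(ids, function(id) {
    for (i in seq_len(tries)) {
      res <- tryCatch(get_one(id), error = function(e) NULL)
      if (!is.null(res)) return(res)
      Sys.sleep(pause)  # wait before the next attempt
    }
    NULL
  })
  stats::setNames(out, ids)
}
```

With the package loaded, this would be called as fetch_profiles(ids, get_profile), where ids is a character vector of Scholar ids.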

pzhaonet commented on July 18, 2024

I also have the same issue. Does anyone know how to fix it?

get_profile(id = "TErVoUAAAAJ")
