
elsapy's People

Contributors

adam3smith, ale-de-vries, katrinleinweber, ma-ji, matzech


elsapy's Issues

No abstract in results

Hi all,

I'm relatively new to Python, but I'm trying to get all Scopus results based on multiple combinations of synonyms. This works fine, except that I can't get the abstract into my results. In the Elsevier API it is documented as "dc:description" (https://dev.elsevier.com/sc_search_views.html). How do I get the abstract text into my dataframe results?
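A minimal sketch of one way to pull the abstract into a DataFrame, assuming results is the list of entry dicts that ElsSearch.results returns and that the API key is entitled to a view that includes dc:description (the field names follow the Scopus search view docs; the sample entry below is made up for illustration):

```python
import pandas as pd

def add_abstracts(results):
    """Build a DataFrame including the abstract ('dc:description') from a
    list of Scopus search result entries. Entries without an abstract
    (e.g. when the key lacks the COMPLETE view) get None."""
    rows = []
    for entry in results:
        rows.append({
            "title": entry.get("dc:title"),
            "doi": entry.get("prism:doi"),
            # 'dc:description' holds the abstract when the view returns it
            "abstract": entry.get("dc:description"),
        })
    return pd.DataFrame(rows)

# Hypothetical minimal entry for illustration:
sample = [{"dc:title": "A paper", "prism:doi": "10.1000/x",
           "dc:description": "An abstract."}]
df = add_abstracts(sample)
```

In practice one would pass doc_srch.results in place of the sample list.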

Non-subscriber "bad handshake error"

I'm not an Elsevier Developer Portal subscriber, and I'm getting the following error when trying to use ElsSearch:

[Screenshot of the "bad handshake" SSL error]

The code fails when trying to execute the search; I'm working on a corporate network. What could be going wrong?


Get complete info from affiliation

Hi friends, I need to get complete author data for publications retrieved from an affiliation query, but I can't find the author attributes. How can I do it?

Affiliation example

# Initialize affiliation with ID as string
my_aff = ElsAffil(affil_id = '60101411')
if my_aff.read(client):
    print ("my_aff.name: ", my_aff.name)
    my_aff.write()
else:
    print ("Read affiliation failed.")

my_aff.name: American Institute of Steel Construction


ELSAPY - ELSSEARCH

Hi!

I have a question about ElsSearch. How can I write the results to disk? I want to download detailed information about publications, but I get these errors:

AttributeError: 'ElsSearch' object has no attribute 'read'
AttributeError: 'ElsSearch' object has no attribute 'write'
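ElsSearch indeed has no read/write methods; its results attribute is a plain list of dicts, so one way to persist it (a sketch, not an elsapy API) is:

```python
import json

def save_results(results, path):
    """Write a list of search-result dicts (e.g. ElsSearch.results)
    to disk as pretty-printed JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)

# Example with a stand-in result list:
save_results([{"dc:title": "Example"}], "results.json")
```

After a real search one would call save_results(doc_srch.results, "results.json").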

Parsing resources: include the protocol in response resource URIs

I am querying some information about authors, and the URIs aren't correctly formed. To elaborate, making the following call to the API:

http://api.elsevier.com/content/author/author_id/AUTHOR_ID:57193429229?apiKey=##############&view=ENHANCED&httpAccept=application/rdf%2Bxml

returns all resources on the api.elsevier.com domain without a protocol scheme. An example of such a resource is ://api.elsevier.com/content/serial/title/issn/21945357

For now, I am parsing and correcting each URI while reading the RDF with Sesame Rio. But it would be great if this is fixed from the provider.

Key error

I'm trying to get the code working but keep running into problems with the key. I've copied the example program and set up config with my API key but I get the error:
[Screenshot of the KeyError]

Which seems reasonable because when I print config it looks like this:

{'cells': [{'cell_type': 'code', 'execution_count': 2, 'metadata': {}, 'outputs': [{'data': {'text/plain': ["{'apikey': 'xxxxxxxxxxxxxxxxxxxxxxxxxxx', 'insttoken': ''}"]}, 'execution_count': 2, 'metadata': {}, 'output_type': 'execute_result'}], 'source': ['{\n', ' 'apikey': "xxxxxxxxxxxxxxxxxxxxxxxxxxx",\n', ' "insttoken": ""\n', '}']}, {'cell_type': 'code', 'execution_count': None, 'metadata': {}, 'outputs': [], 'source': []}], 'metadata': {'kernelspec': {'display_name': 'Python 3', 'language': 'python', 'name': 'python3'}, 'language_info': {'codemirror_mode': {'name': 'ipython', 'version': 3}, 'file_extension': '.py', 'mimetype': 'text/x-python', 'name': 'python', 'nbconvert_exporter': 'python', 'pygments_lexer': 'ipython3', 'version': '3.7.4'}}, 'nbformat': 4, 'nbformat_minor': 2}

So the API key is inside it, but 'apikey' is not actually a key in the dictionary.

Why am I getting this problem and how can I fix it?
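For reference, config.json should be a plain JSON file (not a saved notebook) containing just these two keys, as in the elsapy example; the values below are placeholders:

```json
{
    "apikey": "xxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "insttoken": ""
}
```

The dump above shows a Jupyter notebook's structure, which suggests the config was saved via the notebook interface rather than as a raw text file.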

author keywords are not accessible

Hi, I am trying to retrieve author keywords using the following code:

from elsapy.elsclient import ElsClient
from elsapy.elsprofile import ElsAuthor, ElsAffil
from elsapy.elsdoc import FullDoc, AbsDoc
from elsapy.elssearch import ElsSearch
import json
    
## Load configuration
con_file = open("config.json")
config = json.load(con_file)
con_file.close()

## Initialize client
client = ElsClient(config['apikey'])
# client.inst_token = config['insttoken']

query = "TITLE-ABS-KEY(earthquakes)&view=COMPLETE"

doc_srch = ElsSearch(query,'scopus')
doc_srch.execute(client, get_all = True)
print ("doc_srch has", len(doc_srch.results), "results.")

However, the resulting dict does not contain any fields related to author keywords. Can you please tell me what the status of this is, and whether I need to do something different? My API key is through institutional access, which has a COMPLETE subscription.

Thank you very much.
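One workaround sometimes used (an assumption on my part — _uri is a private attribute and may change between elsapy versions) is to append the view parameter to the search URI itself before executing, since appending &view=COMPLETE inside the query string may end up percent-encoded as part of the query rather than treated as a separate parameter:

```python
def with_view(uri, view="COMPLETE"):
    """Append a Scopus 'view' parameter to a search URI.

    Intended for use on ElsSearch._uri before calling execute(), e.g.:
        doc_srch = ElsSearch("TITLE-ABS-KEY(earthquakes)", "scopus")
        doc_srch._uri = with_view(doc_srch._uri)
        doc_srch.execute(client, get_all=True)
    """
    # Use '&' when the URI already has a query string, '?' otherwise
    sep = "&" if "?" in uri else "?"
    return uri + sep + "view=" + view
```

Whether COMPLETE actually returns authkeywords still depends on the key's entitlements.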

Retrieve citing articles

Hi, first of all, thank you for making the APIs work! I have a question about the cited-by function. I have some "seed articles," and I need to "trace" the articles that cite these seed articles (citing articles). I noticed that there is a "scopus-citedby" field (a URL link) in the results of the search function. Is there a way to get the "citing articles" in JSON or XML format? Thanks! (I've reviewed the docs carefully, but maybe I missed it.)

ElsAuthor.read_docs returns false despite institutional API key

Hello there

I am trying to use the read_docs method from ElsAuthor, but it is returning false.
Here is the log message. I am using an API key from my institution, and I am able to use FullDoc to extract a full document.

2020-02-18 08:50:21,365 - elsapy.elsprofile - WARNING - ('HTTP 400 Error from https://api.elsevier.com/content/author/author_id/AUTHOR_ID:7003886815?view=documents\nand using headers {\'X-ELS-APIKey\': \'Key', \'User-Agent\': \'elsapy-v0.5.0\', \'Accept\': \'application/json\'}:\n{"service-error":{"status":{"statusCode":"INVALID_INPUT","statusText":"View parameter entered is not valid for this service"}}}',)

According to this post, this seems to be related to a permission problem. What type of permission is required?

Search results questions

While searching for some keywords, I realized that I always got 5k results. Is that the limit for results per query? It is quite strange that for several different keywords I always got exactly 5k results.

The follow-up question is: are the 5k results the most recent ones, going back year by year? Or are they randomly chosen?

How to access EMTREE using python

I want to programmatically access the EMBASE EMTREE thesaurus to check for synonyms/indexed terms in real time, based on given input.
So far, I haven't found any tutorial or how-to guide for this. Can you provide some info on how to do it?

errors while running the code

I'm trying the Python script, and though I have a valid API key, I get errors:


$ python exampleProg.py
Read author failed.
Read affiliation failed.
Read document failed.
pii_doc.title: Establishing the fair allocation of international aviation carbon emission rights
doi_doc.title: Sensitive Sequencing Method for KRAS Mutation Detection by Pyrosequencing
Load documents (Y/N)?
--> Y
Read docs for author failed.
Read docs for affiliation failed.
Traceback (most recent call last):
File "exampleProg.py", line 87, in
auth_srch.execute(client)
File "/Users/aaragon/Local/elsapy/elsapy/elssearch.py", line 79, in execute
api_response = els_client.exec_request(self._uri)
File "/Users/aaragon/Local/elsapy/elsapy/elsclient.py", line 117, in exec_request
raise requests.HTTPError("HTTP " + str(r.status_code) + " Error from " + URL + "\nand using headers " + str(headers) + ":\n" + r.text)
requests.exceptions.HTTPError: HTTP 500 Error from https://api.elsevier.com/content/search/author?query=authlast(keuskamp)
and using headers {'X-ELS-APIKey': '[edited_out]', 'User-Agent': 'elsapy-v0.3.2', 'Accept': 'application/json'}:
{"service-error":{"status":{"statusCode":"GENERAL_SYSTEM_ERROR","statusText":"Unrecoverable service exception occurred"}}}

Does anyone know why I'm getting these problems? Should I try an older version of the code?

Citations

Is there any way to get info about the articles cited by a specified document? On dev.elsevier.com it is called the citations overview.

Author doc_list

I'm trying to retrieve doc lists by author ID. I'm able to establish the connection, read data (so I can get full name, etc), but after running read_docs the doc_list returns as None. I've tried it with authors that I know, as well as the example from the wiki and all return None. It doesn't throw an error, just returns None.

e.g.
myAuth = ElsAuthor(author_id="7202909704")
myAuth.read_docs(my_client)
print(myAuth.doc_list, type(myAuth.doc_list))

None <class 'NoneType'>

read_docs() method fail

I am trying to run exampleProg.py, but the read_docs() method (both for author and for affiliation) fails.

In config.json I have an API key, but the insttoken is empty:

{
    "apikey": "MY_API_KEY",
    "insttoken": ""
}

This is the output from the script:

$ python exampleProg.py
my_auth.full_name:  Paul John McGuiness
my_aff.name:  American Institute of Steel Construction
scp_doc.title:  Control of somatic tissue differentiation by the long non-coding RNA TINCR
pii_doc.title:  Establishing the fair allocation of international aviation carbon emission rights
doi_doc.title:  Sensitive Sequencing Method for KRAS Mutation Detection by Pyrosequencing
Load documents (Y/N)?
--> y
Read docs for author failed.
Read docs for affiliation failed.
auth_srch has 13 results.
aff_srch has 25 results.
doc_srch has 259 results.
doc_srch has 25 results.

This is a problem because I need this method: I have a list of author IDs and I need to retrieve their publications.

Why does read_docs fail?

Retrieve index keywords

I am using this module to automatically perform a simple Scopus search and save some article metadata (basically keywords), which I then analyze to look for correlations.

If I do a manual search from the website and export its results, I am able to select two fields called "Author Keywords" and "Index Keywords" for each record. When I use the API, however, the field "Index Keywords" appears to be missing even when I add the parameter &view=COMPLETE (see the minimal working example below). Is there a way to get the full metadata for the search results?

## Import modules
from elsapy.elsclient import ElsClient
from elsapy.elssearch import ElsSearch
import json

## Load configuration
with open("config.json") as con_file:
    config = json.load(con_file)  # Only my API key in this file
client = ElsClient(config['apikey'])
    
## Define query and execute search
query = 'PUBYEAR > 2015 AND TITLE-ABS-KEY(Masonry+Earthquake)&view=COMPLETE'
doc_srch = ElsSearch(query, 'scopus')
doc_srch.execute(client, get_all=False)

## Print obtained fields for the first record
print(doc_srch.results[0].keys())
>>>dict_keys(['@_fa', 'link', 'prism:url', 'dc:identifier', 'eid', 'dc:title', 'dc:creator', 'prism:publicationName', 'prism:issn', 'prism:volume', 'prism:pageRange', 'prism:coverDate', 'prism:coverDisplayDate', 'prism:doi', 'pii', 'dc:description', 'citedby-count', 'affiliation', 'prism:aggregationType', 'subtype', 'subtypeDescription', 'author-count', 'author', 'authkeywords', 'source-id'])

I have double-checked that:

  1. The field "Index Keywords" exists and contains information that I can export using the manual search.
  2. This data is not included when I retrieve the same records using the API.

Institution Token is not associated with API Key

Hi,
I just installed elsapy on Windows and I can't run the exampleProg.py code.
For example, when I run

from elsapy.elsclient import ElsClient
from elsapy.elsprofile import ElsAuthor, ElsAffil
from elsapy.elsdoc import FullDoc, AbsDoc
from elsapy.elssearch import ElsSearch
import json

## Load configuration
con_file = open("config.json")
config = json.load(con_file)
con_file.close()

## Initialize client
client = ElsClient(config['apikey'])
client.inst_token = config['insttoken']

## Initialize doc search object using Scopus and execute search, retrieving 
#   all results
doc_srch = ElsSearch("TITLE-ABS-KEY-AUTH ( artificial AND intelligence) AND PUBYEAR > 2015",'scopus')
doc_srch.execute(client, get_all = True)
print ("doc_srch has", len(doc_srch.results), "results.")

I received this error :

HTTPError                                 Traceback (most recent call last)
<ipython-input-5-699a5fdad0e9> in <module>
      2 #   all results
      3 doc_srch = ElsSearch("TITLE-ABS-KEY-AUTH ( artificial AND intelligence) AND PUBYEAR > 2015",'scopus')
----> 4 doc_srch.execute(client, get_all = True)
      5 print ("doc_srch has", len(doc_srch.results), "results.")

~\.conda\envs\django\lib\site-packages\elsapy\elssearch.py in execute(self, els_client, get_all)
     93             all results for the search, up to a maximum of 5,000."""
     94         ## TODO: add exception handling
---> 95         api_response = els_client.exec_request(self._uri)
     96         self._tot_num_res = int(api_response['search-results']['opensearch:totalResults'])
     97         self._results = api_response['search-results']['entry']

~\.conda\envs\django\lib\site-packages\elsapy\elsclient.py in exec_request(self, URL)
    119         else:
    120             self._status_msg="HTTP " + str(r.status_code) + " Error from " + URL + " and using headers " + str(headers) + ": " + r.text
--> 121             raise requests.HTTPError("HTTP " + str(r.status_code) + " Error from " + URL + "\nand using headers " + str(headers) + ":\n" + r.text)

HTTPError: HTTP 401 Error from https://api.elsevier.com/content/search/scopus?query=TITLE-ABS-KEY-AUTH+%28+artificial+AND+intelligence%29+AND+PUBYEAR+%3E+2015
and using headers {'X-ELS-APIKey': '04e82c772fb3e580796d2fb98c48e57c', 'User-Agent': 'elsapy-v0.5.0', 'Accept': 'application/json', 'X-ELS-Insttoken': 'ENTER_INSTTOKEN_HERE_IF_YOU_HAVE_ONE_ELSE_DELETE'}:
{"service-error":{"status":{"statusCode":"AUTHENTICATION_ERROR","statusText":"Institution Token is not associated with API Key"}}}

I run my code in JupyterLab 2.2.6, using elsapy version 0.5.0 and numpy 1.19.0.

Problem with queries

I am trying to run some search queries using the code below.

"""An example program that uses the elsapy module"""

from elsapy.elsclient import ElsClient
from elsapy.elsprofile import ElsAuthor, ElsAffil
from elsapy.elsdoc import FullDoc, AbsDoc
from elsapy.elssearch import ElsSearch
import json
    
## Load configuration
con_file = open("config.json")
config = json.load(con_file)
con_file.close()

## Initialize client
client = ElsClient(config['apikey'])
doc_srch = ElsSearch('deoxy+xylulose+phosphate+reductoisomerase+ki','sciencedirect')
print(doc_srch._uri)
doc_srch.execute(client, get_all = True)
print ("doc_srch has", len(doc_srch.results), "results.")

But I am getting the error below. The call to 'api_response = els_client.exec_request(self._uri)' seems not to work properly, maybe related to URL encoding.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-203-19f1c4c761d5> in <module>
     16 doc_srch = ElsSearch('deoxy+xylulose+phosphate+reductoisomerase+ki','sciencedirect')
     17 print(doc_srch._uri)
---> 18 doc_srch.execute(client, get_all = True)
     19 print ("doc_srch has", len(doc_srch.results), "results.")

~/anaconda3/envs/base-copy/lib/python3.6/site-packages/elsapy/elssearch.py in execute(self, els_client, get_all)
     74         ## TODO: add exception handling
     75         api_response = els_client.exec_request(self._uri)
---> 76         self._tot_num_res = int(api_response['search-results']['opensearch:totalResults'])
     77         self._results = api_response['search-results']['entry']
     78         if get_all is True:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Strangely, other queries work. For example, using 'deoxy+xylulose+phosphate+ki' as the query works. I can't spot any important difference between the two.
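For what it's worth, the TypeError arises because execute() calls int() on a value that is missing from the response. A defensive sketch of that lookup (my own helper, not elsapy code) makes the failure visible instead of crashing:

```python
def total_results(api_response):
    """Safely read the total-results count from a Scopus/ScienceDirect
    search response. Returns 0 when the expected keys are missing --
    the situation that produces int(None) and the TypeError above."""
    value = ((api_response or {}).get("search-results") or {}) \
        .get("opensearch:totalResults")
    return int(value) if value is not None else 0
```

Printing the raw api_response when total_results() comes back 0 would show what the server actually returned for the failing query (often an error payload rather than a result set).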

Setting num_res on ElsClient has no effect

It is possible to pass an argument when instantiating an ElsClient to set the number of results to retrieve with each request. The num_res property is correctly set but unused; the first request issued does not include a count parameter, and subsequent requests are made using the next link in the response, which stipulates the default (count=25).
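Until that is fixed, a possible workaround (an assumption on my part, again relying on ElsSearch's private _uri attribute) is to add an explicit count parameter to the first request yourself:

```python
def with_count(uri, count=100):
    """Append an explicit 'count' parameter to a search URI, so the
    first page returns 'count' results instead of the server default.
    Intended for use on ElsSearch._uri before calling execute()."""
    sep = "&" if "?" in uri else "?"
    return uri + sep + "count=" + str(count)
```

Subsequent pages would still follow the server's next link, so this only affects the first request unless the next link is rewritten too.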

My API key does not work

I tried using [email protected] to establish communication with your team, but no one has responded to me for two weeks already. My API key does not work. I guess I need a token from Elsevier. How can I get it?

error with date

I am running this query:

search_query = 'TITLE-ABS-KEY("smart working" OR "smart work" OR smartworking OR smartwork) AND SUBJAREA(ENGI) AND LOAD-DATE AFT 19800101'
search_results = ElsSearch(search_query,  'scopus')
search_results.execute(client, get_all = True)

and since one of the Scopus dates has a weird format (1972-00-01), I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/opt/anaconda3/lib/python3.7/site-packages/dateutil/parser/_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
654         try:
--> 655             ret = self._build_naive(res, default)
656         except ValueError as e:

~/opt/anaconda3/lib/python3.7/site-packages/dateutil/parser/_parser.py in _build_naive(self, res, default)
   1240 
-> 1241         naive = default.replace(**repl)
   1242 

ValueError: month must be in 1..12

The above exception was the direct cause of the following exception:

ParserError                               Traceback (most recent call last)
pandas/_libs/tslibs/conversion.pyx in 
pandas._libs.tslibs.conversion.convert_str_to_tsobject()

pandas/_libs/tslibs/parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()

~/opt/anaconda3/lib/python3.7/site-packages/dateutil/parser/_parser.py in parse(timestr, parserinfo, **kwargs)
   1373     else:
-> 1374         return DEFAULTPARSER.parse(timestr, **kwargs)
 1375 

~/opt/anaconda3/lib/python3.7/site-packages/dateutil/parser/_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    656         except ValueError as e:
--> 657             six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
   658 

 ~/opt/anaconda3/lib/python3.7/site-packages/six.py in raise_from(value, from_value)

ParserError: month must be in 1..12: 1972-00-01

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-49-192886904661> in <module>
  1 search_query = 'TITLE-ABS-KEY("smart working" OR "smart work" OR smartworking OR smartwork) AND SUBJAREA(ENGI) AND LOAD-DATE AFT 19800101'
  2 search_results = ElsSearch(search_query, 'scopus')
----> 3 search_results.execute(client, get_all = True)

~/opt/anaconda3/lib/python3.7/site-packages/elsapy/elssearch.py in execute(self, els_client, get_all)
    105         with open('dump.json', 'w') as f:
    106             f.write(json.dumps(self._results))
--> 107         self.results_df = recast_df(pd.DataFrame(self._results))
    108 
    109     def hasAllResults(self):

~/opt/anaconda3/lib/python3.7/site-packages/elsapy/utils.py in recast_df(df)
     41             logger.info("Converting {}".format(date_field))
     42             df[date_field] = df[date_field].apply(
---> 43                     pd.Timestamp)
     44     return df

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   3846             else:
   3847                 values = self.astype(object).values
-> 3848                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   3849 
   3850         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()

pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()

pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_str_to_tsobject()

ValueError: could not convert string to Timestamp

any idea how to fix this?

Enhancement request: provide a wrapper for the Scopus Analyzer API

Good day (again)! :)

Do you have any plans to write a python wrapper for the Scopus Analyzer API? For example:
https://www.scopus.com/term/analyzer.uri?src=s&sort=plf-f&sdt=cite&sot=cite&sl=18&citedAuthorId=55698719300

I would like to retrieve a year-by-year count for a list of authors. Unfortunately, I have not been able to find where the Analyzer API is even documented, so I am not exactly sure where to begin.

Hope all is well,

Unable to run the exampleProg.py...

This appears to say that my API key is not good? Any hints? Thank you!

$ python3 exampleProg.py
Read author failed.
Read affiliation failed.
scp_doc.title: Control of somatic tissue differentiation by the long non-coding RNA TINCR
pii_doc.title: Establishing the fair allocation of international aviation carbon emission rights
doi_doc.title: Sensitive Sequencing Method for KRAS Mutation Detection by Pyrosequencing
Load documents (Y/N)?
--> Y
Read docs for author failed.
Read docs for affiliation failed.
Traceback (most recent call last):
File "exampleProg.py", line 89, in
auth_srch.execute(client)
File "/home/chunnan/ElsaPy/elsapy/elsapy/elssearch.py", line 75, in execute
api_response = els_client.exec_request(self._uri)
File "/home/chunnan/ElsaPy/elsapy/elsapy/elsclient.py", line 119, in exec_request
raise requests.HTTPError("HTTP " + str(r.status_code) + " Error from " + URL + "\nand using headers " + str(headers) + ":\n" + r.text)
requests.exceptions.HTTPError: HTTP 401 Error from https://api.elsevier.com/content/search/author?query=authlast(keuskamp)
and using headers {'Accept': 'application/json', 'User-Agent': 'elsapy-v0.4.6', 'X-ELS-APIKey': '[edited-out]'}:
{"service-error":{"status":{"statusCode":"AUTHORIZATION_ERROR","statusText":"The requestor is not authorized to access the requested view or fields of the resource"}}}

ScienceDirect queries not working

Hello,
I'm trying to make some ScienceDirect queries work, e.g.:

doc_srch = ElsSearch("tak(renal failure)", 'sciencedirect')

or

doc_srch = ElsSearch("title-abstr-key(renal failure)", 'sciencedirect')

as described in the ScienceDirect Search Tips.

The error says that the query cannot be translated:

{"service-error":{"status":{"statusCode":"INVALID_INPUT","statusText":"Unable to translate query provided. Error=[The search contained no translatable search terms. EmptyQueryException - No translatable query terms found in field.]"}}}

Any suggestions on this?

'readDocs()' vs 'read_docs()' in wiki

Hello!
I noticed that the wiki suggests using the .readDocs() function of the ElsAuthor and ElsAffil objects. However, this function is currently called read_docs(). Perhaps it is worth fixing in the wiki documentation.

Some DOIs not working in retrieving abstract text

I’m trying to use the Elsapy module to extract the abstracts of documents on certain topics.
I am able to do this but, unfortunately, only for a fraction of the documents found.

For example, a particular search returns 16 documents but I am only able to extract the information (e.g. abstracts) from 4 of them.

Upon further inspection, it seems that the documents I can’t get the abstracts of:
- don’t have a PII
- have DOIs that don’t work.

I have tested the DOIs in the Article Retrieval interactive API guide:
- the ones that returned abstracts worked fine
- the other ones return the error:

RESOURCE_NOT_FOUND: The resource specified cannot be found.

Even though I have found the original articles and checked their DOI is correct.

An example of one that didn’t work is:
Sengupta, N. K., & Sibley, C. G. (2019). The political attitudes and subjective wellbeing of the one percent. Journal of Happiness Studies, 20(7), 2125-2140. doi:10.1007/s10902-018-0038-4

I have found that the ones that do ‘work’ all have DOIs of the general form:
10.1016/j.ssmph.2019.100471
10.1016/j.apacoust.2015.03.004

Please let me know if you know why this is and how I can fix it.

pathlib module not found

Installing python 3.6 with Anaconda includes pathlib2 instead of pathlib, which then throws an error when importing elsclient

Is it worth looking into replacing this library, or doing something like the following:

try:
    import pathlib
except ImportError:
    import pathlib2 as pathlib

exampleProg.py does not work properly

Hi everyone,

I'm trying to run the example as it's given, but I don't get the expected behavior while reading docs:

$ python exampleProg.py
my_auth.full_name: Paul John McGuiness
my_aff.name: American Institute of Steel Construction
scp_doc.title: Control of somatic tissue differentiation by the long non-coding RNA TINCR
pii_doc.title: Establishing the fair allocation of international aviation carbon emission rights
doi_doc.title: Sensitive Sequencing Method for KRAS Mutation Detection by Pyrosequencing
Load documents (Y/N)?
--> Y
Read docs for author failed.
Read docs for affiliation failed.
auth_srch has 13 results.
aff_srch has 25 results.
doc_srch has 184 results.

Perhaps I'm not using the latest version of the API because I installed it through pip? Thanks for the help.

query error

Good day,

I have a list of names I wish to iterate through to gather author profiles, but I seem to have constructed the query incorrectly. Unfortunately, after reading the various documentation, it is not clear to me what exactly is wrong with the query. Would you be able to provide any insight?

my.py

fnames = []
lnames = []
with open('authors.txt', 'r') as f:
    reader = csv.reader(f, delimiter='\t')
    for fname, lname in reader:
        fnames.append(fname)
        lnames.append(lname)

numNames = len(fnames)
print ("Imported ", numNames, " names")

con_file = open("config.json")
config = json.load(con_file)
con_file.close()

client = ElsClient(config['apikey'])
client.inst_token = config['insttoken']

for i in range(numNames):
    query = 'AUTHOR-NAME(AUTHLASTNAME(' + lnames[i] + ') AUTHFIRST(' + fnames[i] + ')) AND AFFIL("University of Illinois" OR "University of Illinois at Urbana-Champaign"'
    print ("Searching: ", query)
    auth_srch = ElsSearch(query, 'author')
    auth_srch.execute(client)
    print ("auth_srch has", len(auth_srch.results), "results.")

and the error:

(C:\Users\sac\AppData\Local\Continuum\Anaconda3\envs\scopus) C:\Users\sac\scopus_project\elsapy>python my.py
Imported 533 names
Searching: AUTHOR-NAME(AUTHLASTNAME(Endres) AUTHFIRST(A Bryan)) AND AFFIL("University of Illinois" OR "University of Illinois at Urbana-Champaign"
Traceback (most recent call last):
  File "my.py", line 62, in <module>
    auth_srch.execute(client)
  File "C:\Users\sac\scopus_project\elsapy\elsapy\elssearch.py", line 75, in execute
    api_response = els_client.exec_request(self._uri)
  File "C:\Users\sac\scopus_project\elsapy\elsapy\elsclient.py", line 119, in exec_request
    raise requests.HTTPError("HTTP " + str(r.status_code) + " Error from " + URL + "\nand using headers " + str(headers) + ":\n" + r.text)
requests.exceptions.HTTPError: HTTP 400 Error from https://api.elsevier.com/content/search/author?query=AUTHOR-NAME(AUTHLASTNAME(Endres) AUTHFIRST(A Bryan)) AND AFFIL("University of Illinois" OR "University of Illinois at Urbana-Champaign"
and using headers {'X-ELS-APIKey': 'xxxxxxxxxxxxxxxxxxxxxxx', 'User-Agent': 'elsapy-v0.4.2', 'Accept': 'application/json'}:
{"service-error":{"status":{"statusCode":"INVALID_INPUT","statusText":"Unable to translate query provided."}}}
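As far as I can tell, the query in the report is missing the closing ')' after the AFFIL(...) clause, which would explain the "Unable to translate query" error. A sketch (helper names are mine) that builds the query with balanced parentheses, plus a quick balance check:

```python
def build_author_query(first, last):
    """Build the author search query with balanced parentheses; note the
    final ')' closing the AFFIL clause, which the reported query lacks."""
    return ('AUTHOR-NAME(AUTHLASTNAME(' + last + ') AUTHFIRST(' + first + ')) '
            'AND AFFIL("University of Illinois" '
            'OR "University of Illinois at Urbana-Champaign")')

def parens_balanced(s):
    """Return True if every '(' in s has a matching ')'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if depth < 0:
            return False
    return depth == 0
```

Running parens_balanced over each generated query before submitting it would catch this class of INVALID_INPUT error locally.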

FullDoc does not contain full-text

Hello! When I run the FullDoc snippet below (from the example script), only the abstract is retrieved and not the full text, as seen in the "JSON result" below. How can this code be modified to access the full text?

## ScienceDirect (full-text) document example using DOI
doi_doc = FullDoc(doi = '10.1016/S1525-1578(10)60571-5')
if doi_doc.read(client):
    print ("doi_doc.title: ", doi_doc.title)
    doi_doc.write()   
else:
    print ("Read document failed.")

JSON result:

{"coredata": {"prism:url": "https://api.elsevier.com/content/article/pii/S1525157810605715", "dc:identifier": "doi:10.1016/S1525-1578(10)60571-5", "eid": "1-s2.0-S1525157810605715", "prism:doi": "10.1016/S1525-1578(10)60571-5", "pii": "S1525-1578(10)60571-5", "dc:title": "Sensitive Sequencing Method for KRAS Mutation Detection by Pyrosequencing ", "prism:publicationName": "The Journal of Molecular Diagnostics", "prism:aggregationType": "Journal", "pubType": "fla", "prism:issn": "15251578", "prism:volume": "7", "prism:issueIdentifier": "3", "prism:startingPage": "413", "prism:endingPage": "421", "prism:pageRange": "413-421", "prism:number": "3", "dc:format": "application/json", "prism:coverDate": "2005-08-31", "prism:coverDisplayDate": "August 2005", "prism:copyright": "Copyright \u00a9 2005 American Society for Investigative Pathology and Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.", "prism:publisher": "American Society for Investigative Pathology and Association for Molecular Pathology. Published by Elsevier Inc.", "dc:creator": [{"@_fa": "true", "$": "Ogino, Shuji"}, {"@_fa": "true", "$": "Kawasaki, Takako"}, {"@_fa": "true", "$": "Brahmandam, Mohan"}, {"@_fa": "true", "$": "Yan, Liying"}, {"@_fa": "true", "$": "Cantor, Mami"}, {"@_fa": "true", "$": "Namgyal, Chungdak"}, {"@_fa": "true", "$": "Mino-Kenudson, Mari"}, {"@_fa": "true", "$": "Lauwers, Gregory Y."}, {"@_fa": "true", "$": "Loda, Massimo"}, {"@_fa": "true", "$": "Fuchs, Charles S."}], "dc:description": "\n                  Both benign and malignant tumors represent heterogenous tissue containing tumor cells and non-neoplastic mesenchymal and inflammatory cells. To detect a minority of mutant KRAS alleles among abundant wild-type alleles, we developed a sensitive DNA sequencing assay using Pyrosequencing, ie, nucleotide extension sequencing with an allele quantification capability. 
We designed our Pyrosequencing assay for use with whole-genome-amplified DNA from paraffin-embedded tissue. Assessing various mixtures of DNA from mutant KRAS cell lines and DNA from a wild-type KRAS cell line, we found that mutation detection rates for Pyrosequencing were superior to dideoxy sequencing. In addition, Pyrosequencing proved superior to dideoxy sequencing in the detection of KRAS mutations from DNA mixtures of paraffin-embedded colon cancer and normal tissue as well as from paraffin-embedded pancreatic cancers. Quantification of mutant alleles by Pyrosequencing was precise and useful for assay validation, monitoring, and quality assurance. Our Pyrosequencing method is simple, robust, and sensitive, with a detection limit of approximately 5% mutant alleles. It is particularly useful for tumors containing abundant non-neoplastic cells. In addition, the applicability of this assay for DNA amplified by whole-genome amplification technique provides an expanded source of DNA for large-scale studies.\n               ", "openaccess": "0", "openaccessArticle": false, "openaccessType": null, "openArchiveArticle": false, "openaccessSponsorName": null, "openaccessSponsorType": null, "openaccessUserLicense": null, "link": [{"@href": "https://api.elsevier.com/content/article/pii/S1525157810605715", "@rel": "self", "@_fa": "true"}, {"@href": "https://www.sciencedirect.com/science/article/pii/S1525157810605715", "@rel": "scidir", "@_fa": "true"}]}, "scopus-id": "23844497341", "scopus-eid": "2-s2.0-23844497341", "pubmed-id": "16049314", "link": {"@href": "https://api.elsevier.com/content/abstract/scopus_id/23844497341", "@rel": "abstract"}, "originalText": {"xocs:doc": {"xocs:meta": {"xocs:open-access": {"xocs:oa-article-status": {"@is-open-access": "0", "@is-open-archive": "0"}}, "xocs:available-online-date": {"@yyyymmdd": "20101231", "$": "2010-12-31"}}}}}

different number of results???

Why is the number different between doc_srch.results and doc_srch.tot_num_res?

I read the explanation for the two functions, but I did not understand the difference.
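Not an official answer, but in elsapy `tot_num_res` reports the total number of matches known to the server, while `results` holds only what was actually downloaded; by default `execute()` fetches just the first page (25 entries for Scopus) unless you pass `get_all=True`. A small sketch of the arithmetic involved, assuming the Scopus default page size:

```python
import math

def pages_needed(tot_num_res: int, page_size: int = 25) -> int:
    """Number of requests a get_all=True search will issue: the server
    reports tot_num_res total matches, but each response carries at
    most one page of them."""
    return math.ceil(tot_num_res / page_size)

# After srch.execute(client) with the default get_all=False,
# len(srch.results) is at most page_size even when srch.tot_num_res
# is much larger; srch.execute(client, get_all=True) pages through
# the rest.
print(pages_needed(103))  # -> 5
```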

Error getting citation info for REFEID

Hi.
I'm trying to get citation info for REFEID (https://api.elsevier.com/content/search/scopus?apiKey=xxx&view=COMPLETE&httpAccept=application/json&suppressNavLinks=true&query=REFEID(2-s2.0-84864329732)&field=eid,prism:coverDate&date=2018)

I have hidden my apiKey.
I'm getting error:
{"service-error":{"status":{"statusCode":"INVALID_INPUT","statusText":"Use of certain field restrictions in the search query is not allowed for this requestor."}}}
I think my organization (Slovenian IZUM) doesn't have such restrictions. What can I do to obtain this data?

Sincerely,
Dejan Fajt
Developer
IZUM - Slovenia

Pagination breaks when returned @next value contains a '+'

I am working with the ScopusSearch API. My code uses the cursor and loops through result pages by extracting the '@next' key. However, this breaks when the @next key value contains a '+'.

I have checked that everywhere in my code, I am treating the cursor variable and updated cursor variable as str type.

Can you please suggest what may be happening?

I am attaching a few outputs for different queries so as to show you that it breaks only when there is a '+' in the @next key value.
Output 1: (screenshot attached)

Output 2: (screenshot attached)

Output 3: (screenshot attached)

Thanks!
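One plausible cause: the raw `@next` cursor is being re-submitted unencoded, and a `+` in a URL query string is decoded server-side as a space, corrupting the cursor. A hedged sketch of percent-encoding the cursor before reusing it (the sample cursor value is made up):

```python
from urllib.parse import quote

def encode_cursor(raw_cursor: str) -> str:
    """Percent-encode a pagination cursor for safe reuse in a URL.

    safe="" forces every reserved character to be escaped, so '+'
    becomes %2B instead of being read back as a space."""
    return quote(raw_cursor, safe="")

print(encode_cursor("AoJ+cursorvalue="))  # -> AoJ%2Bcursorvalue%3D
```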

How to download science direct files in XML format rather than json?

Hello,

So I want to download the ScienceDirect full-article files in XML rather than the default JSON from your API. I did try to change the elsentity.py write function to download XML, but it gives me a blank file with an error message:
TypeError: must be str, not dict

How can I fix this and get data in XML? I can see that it is possible to do that through your interactive API:

https://dev.elsevier.com/sciencedirect.html#!/Article_Retrieval/ArticleRetrieval_0_1

I need to do this because I only need a portion of the text, specifically data from the Materials and Methods section. This is possible only by extracting tags from the XML; in the JSON, everything is inside the description tag.

Hope to receive your help soon.
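Not a maintainer, but elsapy's client hard-codes `Accept: application/json`, while the Article Retrieval API will return XML if asked for `text/xml`. A sketch that bypasses elsapy and sets the header directly with the standard library (the PII and key below are placeholders):

```python
from urllib.request import Request, urlopen

def xml_request(pii: str, api_key: str) -> Request:
    """Build a request that asks the Article Retrieval API for XML
    instead of the default JSON."""
    return Request(
        "https://api.elsevier.com/content/article/pii/" + pii,
        headers={"X-ELS-APIKey": api_key, "Accept": "text/xml"},
    )

def fetch_article_xml(pii: str, api_key: str) -> bytes:
    """Download the full-text XML; write the bytes straight to a .xml
    file rather than through elsentity's JSON-oriented write()."""
    with urlopen(xml_request(pii, api_key)) as resp:
        return resp.read()
```

The returned bytes can then be parsed with `xml.etree.ElementTree` to pull out just the Materials and Methods section.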

readDocs

I would like to retrieve the documents as you showed in the example Python script. However, I always fail. Could you please let me know how I can get access to all the documents of an author? I only need the DOIs for further analysis of citations, etc. Thanks
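Not the author, but a sketch of one route: `ElsAuthor.read_docs(client)` populates `doc_list` with Scopus search records, and the DOI, when present, sits under the `'prism:doi'` key. The helper below only parses that list; the commented lines show the assumed elsapy usage:

```python
# Assumed usage (requires an authenticated ElsClient `client`):
#   my_auth = ElsAuthor(uri='https://api.elsevier.com/content/author/author_id/7004367821')
#   if my_auth.read_docs(client):
#       dois = dois_from_doc_list(my_auth.doc_list)

def dois_from_doc_list(doc_list):
    """Collect DOIs from Scopus search records, skipping entries
    (e.g. some conference papers) that carry no DOI."""
    return [doc["prism:doi"] for doc in doc_list if doc.get("prism:doi")]
```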

Scopus (Abtract) document example

Initialize document with ID as integer

scp_doc = AbsDoc(scp_id = 56923985200)
if scp_doc.read(client):
    print ("scp_doc.title: ", scp_doc.title)
    scp_doc.write()
else:
    print ("Read document failed.")

ScienceDirect (full-text) document example using PII

pii_doc = FullDoc(sd_pii = 'S1674927814000082')
if pii_doc.read(client):
    print ("pii_doc.title: ", pii_doc.title)
    pii_doc.write()
else:
    print ("Read document failed.")

ScienceDirect (full-text) document example using DOI

doi_doc = FullDoc(doi = '10.1016/S1525-1578(10)60571-5')
if doi_doc.read(client):
    print ("doi_doc.title: ", doi_doc.title)
    doi_doc.write()
else:
    print ("Read document failed.")

An explanation of what to do with the files?

Hi there,

I'm rather new to programming and completely new to GitHub. I didn't really understand what I was supposed to do with the elsapy files on GitHub and went on to waste a lot of time trying to use pip, and trying to understand what people normally do on GitHub (which has nothing to do with me).

Perhaps a brief explanation of how the files are supposed to work may save hassle at both ends?

I've barely started (but at least I don't get a 'No Module elsapy'!)... so you may still hear from me...xD

Thank you for the git and thank you in advance for any further help!

Cheers,

Alfonso

Problem searching by subject classification

I am trying to use ElsSearch to retrieve DOI's for documents classified by subject area.
The query

 subj_srch = ElsSearch(word + '+SUBJAREA(' + subj + ')','scidir')
 subj_srch.execute(client)

(where subj varies from "AGRI" to "BIOC", for example) does not yield distinct results for different subject areas. If I remove the word query, I get an error. How can I search by subject classification?
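Not maintainer advice, but one thing stands out: joining the term and the filter with `'+'` leaves the subject clause dangling, whereas Scopus-style queries expect an explicit AND. A sketch of building the query string that way (whether SUBJAREA is honored by the `'scidir'` index is a separate question; it is documented for `'scopus'`):

```python
def subject_query(term: str, subj: str) -> str:
    """Combine a free-text term with a subject-area filter using an
    explicit boolean operator."""
    return f"{term} AND SUBJAREA({subj})"

print(subject_query("photosynthesis", "AGRI"))
# -> photosynthesis AND SUBJAREA(AGRI)
```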

Filter using keywords/tags associated with an article.

Hi,
This isn't really an error report. I need to filter my search results to get only articles that have certain terms as their tags/keywords, but I am not sure how to do it.

It would be great if someone could help with that.

Nipun
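One approach, assuming the Scopus index: the search grammar has a KEY() field that matches author keywords, so the filter can be expressed in the query itself. A sketch of composing such a query for ElsSearch:

```python
def keyword_query(*keywords: str) -> str:
    """Build a Scopus query requiring every given author keyword,
    via the KEY() field of the search grammar."""
    return " AND ".join(f"KEY({kw})" for kw in keywords)

print(keyword_query("machine learning", "genomics"))
# -> KEY(machine learning) AND KEY(genomics)
```

The result would then be passed as `ElsSearch(keyword_query("machine learning"), 'scopus')`.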

Extracting reference list

Hello,
I am trying to extract reference list of known papers, using scopus ids.
I tested a script a few weeks ago and it seemed to work fine. If I run it now, I get this error:

Traceback (most recent call last):
  File "scp_id_to_reference_list.py", line 60, in <module>
    ref_list = scp_doc.data['item']['bibrecord']['tail']['bibliography']['reference']
KeyError: 'item'

The code looks like this:

#! /usr/bin/env python3
from elsapy.elsclient import ElsClient
from elsapy.elsprofile import ElsAuthor, ElsAffil
from elsapy.elsdoc import FullDoc, AbsDoc
from elsapy.elssearch import ElsSearch
import json

# Load configuration
con_file = open("config.json")
config = json.load(con_file)
con_file.close()

scp_id_list = ### list of ids

client = ElsClient(config['apikey'])

# keyed by Scopus ID below, so this must be a dict
# (indexing a list() with a string ID would raise TypeError)
all_ref_lists = dict()

for scp_id_use in scp_id_list:
    scp_doc = AbsDoc(scp_id = scp_id_use)
    if scp_doc.read(client):
        print ("scp_doc.title: ", scp_doc.title)
        scp_doc.write()
    else:
        print ("Read document failed.")

    # get the reference list of this article
    ref_list = scp_doc.data['item']['bibrecord']['tail']['bibliography']['reference']

    ref_list_json = list()

    # save the reference list as a json file
    for i in range(0, len(ref_list)):
        ref_info = ref_list[i]['ref-info']
        ref_json = json.dumps(ref_info)
        ref_list_json.append(json.loads(ref_json))

    all_ref_lists[scp_id_use] = ref_list_json
Any idea on what might be wrong?
Thanks,
Noemi
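Not a maintainer, but `KeyError: 'item'` usually means the response simply lacks the `item` branch: the Abstract Retrieval API only includes the bibliography in the FULL view (which requires subscriber entitlements), and elsewhere `tail` can come back as null. A defensive walk of the payload, so missing pieces yield an empty list instead of a crash (key names follow the traceback above):

```python
def safe_reference_list(abs_data):
    """Extract the reference list from an Abstract Retrieval payload,
    tolerating responses where 'item' is absent or 'tail' is null."""
    tail = abs_data.get("item", {}).get("bibrecord", {}).get("tail") or {}
    refs = tail.get("bibliography", {}).get("reference", [])
    # a single reference is returned as a dict rather than a list
    return refs if isinstance(refs, list) else [refs]
```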

Subject Area Information from ElsSearch

Hey there,

I'm currently constructing an ElsSearch for multiple papers using their EIDs. While I can retrieve almost all the information, such as authors, affiliations, etc., I cannot retrieve data on a paper's subject area. My current code for the search is:

ElsSearch('EID(2-s2.0-0000001493) OR EID(2-s2.0-0000000118) &view=COMPLETE','scopus')
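Not an official answer, but the Scopus Search views (including COMPLETE) don't appear to carry subject-area classifications; those come back from the Abstract Retrieval API per document. Assuming you fetch each record with AbsDoc and get the usual abstracts-retrieval-response body, the parsing might look like:

```python
def subject_areas(abs_payload):
    """Pull the human-readable subject-area names out of an Abstract
    Retrieval response body; '$' holds the display text in Elsevier's
    JSON convention."""
    areas = abs_payload.get("subject-areas") or {}
    entries = areas.get("subject-area", [])
    if isinstance(entries, dict):  # single classification
        entries = [entries]
    return [e.get("$") for e in entries]
```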

ElsAffil.read_docs always returns false (HTTP 400)

This snippet from your example does not work. The client gets a HTTP 400 error with the message "View parameter entered is not valid for this service". This fails with the snippet below as well as a larger script I wrote to download document information for a list of affiliation IDs.

Code to reproduce issue:

from elsapy.elsclient import ElsClient
from elsapy.elsprofile import ElsAuthor, ElsAffil
from elsapy.elsdoc import FullDoc, AbsDoc
from elsapy.elssearch import ElsSearch
import json


config = json.load(open('elsevier_config.json'))
client = ElsClient(config['apikey'])

my_aff = ElsAffil(affil_id=60101411)
if my_aff.read(client):  # this part works
    print(f'my_aff.name: {my_aff.name}') 
    my_aff.write()

if my_aff.read_docs(client):  # this returns false for every affiliation id I've thrown at it
    print(f'my_aff.doc_list has {len(my_aff.doc_list)} items')
    my_aff.write_docs()
else:
    print('Read docs for affiliation failed.')

Log (verbatim except for censored API key)

2019-10-15 10:29:52,937 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:29:53,315 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:29:53,316 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:29:53,317 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:29:53,318 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:29:53,319 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:31:01,646 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:31:02,017 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:31:02,018 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:31:02,019 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:31:02,020 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:31:02,021 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:31:54,665 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:31:54,982 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:31:54,983 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:31:54,984 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:31:54,985 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:31:54,986 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:31:55,666 - elsapy.elsclient - INFO - Sending GET request to https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:31:56,049 - elsapy.elsentity - INFO - Data loaded for https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:31:56,090 - elsapy.elsentity - INFO - Wrote https://api.elsevier.com/content/affiliation/affiliation_id/60101411 to file
2019-10-15 10:32:35,218 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:32:35,570 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:32:35,571 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:32:35,572 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:32:35,573 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:32:35,575 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:32:52,392 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:32:52,708 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:32:52,710 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:32:52,711 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:32:52,712 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:32:52,713 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:32:53,392 - elsapy.elsclient - INFO - Sending GET request to https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:32:53,687 - elsapy.elsentity - INFO - Data loaded for https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:32:53,721 - elsapy.elsentity - INFO - Wrote https://api.elsevier.com/content/affiliation/affiliation_id/60101411 to file
2019-10-15 10:33:14,511 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:33:14,862 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:33:14,863 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:33:14,864 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:33:14,865 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:33:14,867 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:34:17,816 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:34:18,146 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:34:18,147 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:34:18,148 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:34:18,150 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:34:18,151 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:34:18,817 - elsapy.elsclient - INFO - Sending GET request to https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:34:19,112 - elsapy.elsentity - INFO - Data loaded for https://api.elsevier.com/content/affiliation/affiliation_id/60101411
2019-10-15 10:34:19,146 - elsapy.elsentity - INFO - Wrote https://api.elsevier.com/content/affiliation/affiliation_id/60101411 to file
2019-10-15 10:34:20,113 - elsapy.elsclient - INFO - Sending GET request to https://api.elsevier.com/content/affiliation/affiliation_id/60101411?view=documents
2019-10-15 10:34:20,301 - elsapy.elsprofile - WARNING - ('HTTP 400 Error from https://api.elsevier.com/content/affiliation/affiliation_id/60101411?view=documents\nand using headers {\'X-ELS-APIKey\': \'********************************\', \'User-Agent\': \'elsapy-v0.5.0\', \'Accept\': \'application/json\'}:\n{"service-error":{"status":{"statusCode":"INVALID_INPUT","statusText":"View parameter entered is not valid for this service"}}}',)
2019-10-15 10:34:43,720 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:34:44,077 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:34:44,078 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:34:44,079 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:34:44,080 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:34:44,081 - elsapy.elssearch - INFO - Module loaded.
2019-10-15 10:35:00,518 - elsapy.elsclient - INFO - Module loaded.
2019-10-15 10:35:00,865 - elsapy.elsentity - INFO - Module loaded.
2019-10-15 10:35:00,866 - elsapy.utils - INFO - Module loaded.
2019-10-15 10:35:00,867 - elsapy.elsprofile - INFO - Module loaded.
2019-10-15 10:35:00,868 - elsapy.elsdoc - INFO - Module loaded.
2019-10-15 10:35:00,869 - elsapy.elssearch - INFO - Module loaded.
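For anyone else hitting this: the error suggests the Affiliation Retrieval API no longer accepts `view=documents`, which is what `read_docs` requests. A workaround that seems to behave is to go through the Scopus Search API with an AF-ID query instead; a sketch:

```python
def affiliation_doc_query(affil_id) -> str:
    """Scopus Search query returning an affiliation's documents,
    sidestepping the rejected view=documents parameter."""
    return f"AF-ID({affil_id})"

# Assumed elsapy usage, with an authenticated `client`:
#   srch = ElsSearch(affiliation_doc_query(60101411), 'scopus')
#   srch.execute(client, get_all=True)
#   print(len(srch.results))
print(affiliation_doc_query(60101411))  # -> AF-ID(60101411)
```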
