iclr2023-openreviewdata's Issues

Cannot crawl the data from the OpenReview website

Hi there, I tried to run parse_data.py to crawl data from OpenReview. Unfortunately, it did not work. The error messages are below. Can anybody give me a hand? Thank you!

ipython parse_data.py
Offset: 0 Data: 0
Offset: 1000 Data: 1000
Offset: 2000 Data: 2000
Offset: 3000 Data: 3000
Offset: 4000 Data: 3809
Number of submissions: 3809
Number of papers (including old): 4874
0%| | 0/4874 [00:00<?, ?it/s]
0%| | 0/4874 [00:00<?, ?it/s]

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/dongxingshuai/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/home/dongxingshuai/research_associate/nlp/ICLR2023-OpenReviewData-main/notebooks/parse_data.py", line 166, in filter_data
withdraw = 1 if 'Withdrawn_Submission' in meta_note[0]['invitation'] else 0
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

IndexError Traceback (most recent call last)
File ~/research_associate/nlp/ICLR2023-OpenReviewData-main/notebooks/parse_data.py:195
190 # In[59]:
191
192
193 # filter data in a pool of processes
194 with Pool(8) as p:
--> 195 filtered_notes = list(tqdm(p.imap(filter_data, notes), total=len(notes)))
198 # In[60]:
199
200
201 # create dataframe
202 ratings = pd.DataFrame(filtered_notes)

File ~/anaconda3/envs/nlp/lib/python3.8/site-packages/tqdm/notebook.py:249, in tqdm_notebook.__iter__(self)
247 try:
248 it = super(tqdm_notebook, self).__iter__()
--> 249 for obj in it:
250 # return super(tqdm...) will not catch exception
251 yield obj
252 # NB: except ... [ as ...] breaks IPython async KeyboardInterrupt

File ~/anaconda3/envs/nlp/lib/python3.8/site-packages/tqdm/std.py:1182, in tqdm.__iter__(self)
1179 time = self._time
1181 try:
-> 1182 for obj in iterable:
1183 yield obj
1184 # Update and possibly print the progressbar.
1185 # Note: does not call self.update(1) for speed optimisation.

File ~/anaconda3/envs/nlp/lib/python3.8/multiprocessing/pool.py:868, in IMapIterator.next(self, timeout)
866 if success:
867 return value
--> 868 raise value

IndexError: list index out of range
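
Judging only from the traceback, `meta_note` seems to be an empty list for at least one paper, so `meta_note[0]` raises the IndexError inside the worker process. Below is a minimal sketch of a guard for that line. It assumes `meta_note` is a list of dicts that may carry an `'invitation'` string; `withdrawn_flag` is a hypothetical helper name and is not part of parse_data.py.

```python
def withdrawn_flag(meta_note):
    """Return 1 if the first meta note marks a withdrawn submission, else 0.

    Assumption: meta_note is a (possibly empty) list of dicts, each of
    which may have an 'invitation' string, as line 166 of the traceback
    suggests.
    """
    if not meta_note:
        # Empty list: there is no note to inspect, so report "not withdrawn"
        # instead of indexing into it and raising IndexError.
        return 0
    invitation = meta_note[0].get('invitation', '')
    return 1 if 'Withdrawn_Submission' in invitation else 0
```

Whether an empty `meta_note` should count as "not withdrawn" or the paper should be skipped entirely is a judgment call I cannot settle from the error output alone.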
