ranjaykrishna / visual_genome_python_driver
A python wrapper for the Visual Genome API
License: MIT License
Hi there,
Is there a version of the API that works with Python 3?
Hi,
I was wondering how you determined the common images between COCO and Visual Genome. I need to use the common images for a project. Is there a way to access these images on the website?
Thanks!
Dear Visual Genome authors,
It seems that the website is not working, so it is impossible to get data through the API. Is there any other way to get your data?
Thank you!
The RetrieveData() function in utils.py raises this error:
ValueError Traceback (most recent call last)
in ()
3 response = connection.getresponse()
4 jsonString = response.read()
----> 5 data = json.loads(jsonString)
/usr/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
337 parse_int is None and parse_float is None and
338 parse_constant is None and object_pairs_hook is None and not kw):
--> 339 return _default_decoder.decode(s)
340 if cls is None:
341 cls = JSONDecoder
/usr/lib/python2.7/json/decoder.pyc in decode(self, s, _w)
362
363 """
--> 364 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
365 end = _w(s, end).end()
366 if end != len(s):
/usr/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx)
380 obj, end = self.scan_once(s, idx)
381 except StopIteration:
--> 382 raise ValueError("No JSON object could be decoded")
383 return obj, end
ValueError: No JSON object could be decoded
http://visualgenome.org/api/v0/api_home.html
The page lists two versions of the question answers: V1.2 at 803.19MB and V1.0 at 201.09MB. However, both links lead to the same zip file, which I believe is V1.0.
While iterating through all the question-answers, I find that a large portion of the ids have no question-answer list. I am wondering whether the missing question-answers are only available in V1.2. If so, would you please point the download link to the correct file?
Thanks!
AttributeError: module 'visual_genome.api' has no attribute 'GetImageIdsInRange'
There is a missing ] in local.py, line 32:
output.append(utils.ParseRegionDescriptions(image['regions'], imageMap[image['id']))
The last argument should be imageMap[image['id']].
When I tried
qas = api.get_all_QAs(qtotal=10)
it raises KeyError: 'qa_id'. It seems that many QAs are missing.
Also, the QA JSON I downloaded from the dataset website is only 200MB, not the 800MB stated on the website.
This should enable developers to import the API from other Python scripts, without having to download and copy the src
folder each time the API is required. I can submit a PR to fix this issue if you want.
Several bounding box coordinates in region_descriptions.json file contain negative values for x and/or y. If I understand correctly, the x and y coordinates indicate the lower left corner with the origin at the lower left corner of the image, therefore should never be negative... I am using version 1.2 of the dataset.
I count 5,408,689 region descriptions (which, surprisingly, is more than the 4,297,502 listed on the Visual Genome homepage), of which 5,654 contain a negative coordinate. There are also cases like region 5456063, in which even y + height is negative. What should one do about these cases?
Some examples with large negative coordinates:
{'height': 25, 'x': 179, 'phrase': 'A cloud in the sky.', 'image_id': 2330845, 'region_id': 5851305, 'width': 55, 'y': -904}
{'height': 6, 'x': 216, 'phrase': 'A window on a building.', 'image_id': 2315874, 'region_id': 5455943, 'width': 8, 'y': -710}
{'height': 24, 'x': 181, 'phrase': 'A leaf on a stem.', 'image_id': 2315869, 'region_id': 5456063, 'width': 25, 'y': -871}
Some examples with small negative coordinates (more frequent):
{'height': 35, 'x': 480, 'phrase': 'glass window on the building', 'image_id': 2417220, 'region_id': 5503839, 'width': 10, 'y': -2}
{'x': 1, 'height': 70, 'phrase': 'white bowl behind plate', 'width': 277, 'image_id': 2417836, 'y': -1, 'region_id': 5514392}
{'x': -1, 'height': 327, 'phrase': 'white plate with food', 'width': 500, 'image_id': 2417836, 'y': 45, 'region_id': 5514399}
{'x': -1, 'height': 207, 'phrase': 'wooden blocks of tile covering floor', 'width': 498, 'image_id': 2417841, 'y': 291, 'region_id': 5514473}
{'x': 415, 'height': 327, 'phrase': 'telephone wire running up wall', 'width': 82, 'image_id': 2417841, 'y': -1, 'region_id': 5514478}
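One possible way to handle such boxes, sketched here as an assumption rather than a recommendation from the dataset authors, is to flag regions with a negative x or y and clip them to the image bounds, dropping any box that lies entirely outside the image. The helper names and the 500x375 image size below are hypothetical; real sizes would come from the image metadata.

```python
def has_negative_origin(region):
    """Return True when the region's x or y coordinate is negative."""
    return region["x"] < 0 or region["y"] < 0


def clip_region(region, image_width, image_height):
    """Clip a region to the image bounds; return None if nothing remains."""
    x0 = max(0, region["x"])
    y0 = max(0, region["y"])
    x1 = min(image_width, region["x"] + region["width"])
    y1 = min(image_height, region["y"] + region["height"])
    if x1 <= x0 or y1 <= y0:
        return None  # the box lies entirely outside the image
    clipped = dict(region)
    clipped.update(x=x0, y=y0, width=x1 - x0, height=y1 - y0)
    return clipped


# Two of the regions quoted above, reduced to their geometry.
regions = [
    {"region_id": 5456063, "x": 181, "y": -871, "width": 25, "height": 24},
    {"region_id": 5514399, "x": -1, "y": 45, "width": 500, "height": 327},
]

# The 500x375 image size is a placeholder, not a value from the dataset.
flagged = [r for r in regions if has_negative_origin(r)]
clipped = [clip_region(r, 500, 375) for r in flagged]
```

With these inputs, the first region is discarded because even y + height is negative, while the second is only trimmed by one pixel on its left edge.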
I had to change:
from src import vg
to:
from src import api as vg
to make it work.
Hello!
Thank you for this wrapper! It is very useful for my work. I found a small mistake in the tutorial in README.md. There you suggest extracting ids in some range by calling
api.get_image_ids_in_range(startIndex=0, endIndex=100)
In fact, the keyword arguments are called start_index and end_index.
How can I get the region_graphs of an image using local.py?
I ask because many images have no region_graph at all.
Dear @ranjaykrishna
First of all, thank you very much for sharing the work.
I was wondering if you could share the file names which overlap with images in the COCO dataset for users who would like to avoid any data leakage during training? I did check your comment on another closed issue (#30 (comment)), but it doesn't provide specific file names.
I believe this would benefit many users looking to train with Visual Genome data and then evaluate their model on COCO images.
Many thanks,
Gyungin
I am trying to execute the following code, but it asks for a scene_graph.json file. Where can I find that file?
import visual_genome.local as vg

# Convert full .json files to image-specific .jsons, save these to 'data/by-id'.
# These files will take up a total ~1.1G space on disk.
vg.save_scene_graphs_by_id(data_dir='data/', image_data_dir='data/by-id/')

# Load scene graphs in 'data/by-id', from index 0 to 200.
# We'll only keep scene graphs with at least 1 relationship.
scene_graphs = vg.get_scene_graphs(start_index=0, end_index=-1, min_rels=1,
                                   data_dir='data/', image_data_dir='data/by-id/')

print len(scene_graphs)
# 149
print scene_graphs[0].objects
# [clock, street, shade, man, sneakers, headlight, car, bike, bike, sign, building, ... , street, sidewalk, trees, car, work truck]
Hello everyone,
Does anyone know how to load all the files of this dataset into one database for easy access to all the information in it? If not, where should the dataset files be placed locally so that the functions can access them?
Hello.
This line (No. 123):
qas.append(QA(info['qa_id'], image_map[info['image_id']], info['question'], info['answer'], qos, aos))
But there are no "qa_id" and "image_id" in info. There are "id" and "image" instead so it should be
qas.append(QA(info['id'], image_map[info['image']], info['question'], info['answer'], qos, aos))
no?
Another problem, this one from the tutorial. The line
qas = api.get_all_QAs(qtotal=10)
should, I believe, return 10 QAs, but the result has length 1000. I believe there is something wrong in get_all_QAs. Please check it.
Hi @ranjaykrishna,
Great dataset. I am trying to match the Visual Genome images to the ones in COCO. Based on the meta data, 103077 images have a flickr_id, 51498 of which also have a coco_id. For the remaining 51579, I tried using their flickr_id to find the corresponding COCO id, but these flickr_ids do not appear in the COCO dataset, at least as far as the flickr_urls indicate.
Can you tell me how to find the COCO images/ids corresponding to the 51579 images that don't have a coco_id?
Thanks!
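For anyone reproducing these counts, here is a minimal sketch that splits image metadata into entries with a usable coco_id and entries that only have a flickr_id. It assumes each entry is a dict with coco_id and flickr_id keys, as in the metadata described above, and that a missing id is stored either as None or as -1; both the function name and that missing-value convention are assumptions.

```python
def split_by_coco_id(image_entries):
    """Separate entries with a coco_id from those with only a flickr_id."""
    def is_valid(value):
        # Assumption: missing ids appear as None (JSON null) or -1.
        return value is not None and value != -1

    with_coco, flickr_only = [], []
    for entry in image_entries:
        if is_valid(entry.get("coco_id")):
            with_coco.append(entry)
        elif is_valid(entry.get("flickr_id")):
            flickr_only.append(entry)
    return with_coco, flickr_only


# Made-up entries for illustration only; these are not real dataset rows.
sample = [
    {"image_id": 1, "coco_id": 1001, "flickr_id": 5001},
    {"image_id": 2, "coco_id": None, "flickr_id": 5002},
    {"image_id": 3, "coco_id": -1, "flickr_id": None},
]
with_coco, flickr_only = split_by_coco_id(sample)
```

The set of image ids in with_coco is the part that would need to be excluded to avoid overlap with COCO; the flickr_only entries are the ones the question above is about.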
According to the API documentation, the file question_answers.json should have the following structure:
[
    {
        "image_id": 2317993,
        "qas": [
            {
                "qa_id": 912402,
                "question": "Where are the clouds?",
                "answer": "sky",
                "question_synsets": [
                    {
                        "synset_name": "cloud.n.01",
                        "entity_name": "cloud",
                        "entity_idx_start": 14,
                        "entity_idx_end": 20
                    },
                ],
                "answer_synsets": [
                    {
                        "synset_name": "sky.n.01",
                        "entity_name": "sky",
                        "entity_idx_start": 0,
                        "entity_idx_end": 3
                    },
                ]
            },
        ]
    },
]
However, the values present on the actual JSON file look like this:
[
    {
        "id": 912402,
        "image": 2317993,
        "question": "Where are the clouds?",
        "answer": "Sky."
    },
]
As you can see, the two JSON formats differ, and the actual file carries less information than the documentation describes: the QA entries do not contain the corresponding objects and synsets.
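Until the file and the documentation agree, a reader can normalise both layouts into one shape. The sketch below is only an illustration: the function names are hypothetical, and it assumes the two layouts look exactly like the samples shown above (nested qas lists with qa_id, or flat entries with id and image).

```python
def first_present(mapping, keys, default=None):
    """Return the value of the first key found in mapping, else default."""
    for key in keys:
        if key in mapping:
            return mapping[key]
    return default


def normalize_qa(entry, image_id=None):
    """Map a QA entry from either layout onto one set of field names."""
    return {
        "qa_id": first_present(entry, ("qa_id", "id")),
        "image_id": first_present(entry, ("image_id", "image"), image_id),
        "question": entry["question"],
        "answer": entry["answer"],
        # Synsets are absent in the flat layout, so default to empty lists.
        "question_synsets": entry.get("question_synsets", []),
        "answer_synsets": entry.get("answer_synsets", []),
    }


def iter_qas(data):
    """Yield normalised QA dicts from the documented or the flat layout."""
    for item in data:
        if "qas" in item:  # documented layout: QAs nested per image
            for qa in item["qas"]:
                yield normalize_qa(qa, image_id=item.get("image_id"))
        else:  # flat layout: one QA per top-level entry
            yield normalize_qa(item)


documented = [{"image_id": 2317993, "qas": [
    {"qa_id": 912402, "question": "Where are the clouds?", "answer": "sky"}]}]
flat = [{"id": 912402, "image": 2317993,
         "question": "Where are the clouds?", "answer": "Sky."}]

from_documented = list(iter_qas(documented))
from_flat = list(iter_qas(flat))
```

Both inputs produce entries with the same qa_id and image_id fields, so downstream code no longer depends on which version of the file was downloaded.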
Is there a range of the image ids that correspond to the COCO ids? Or are the COCO images randomly distributed?
Thank you very much!
If all VG images are taken from MSCOCO, then why do a few images not have a valid coco_id? For instance, image 2359297 has a coco_id of -1.
Thank you.
Hi, if anyone has Python 3.x installed and wishes to run the demo Jupyter notebook, try cloning this repo instead:
git clone https://github.com/alibabadoufu/visual_genome_python_driver.git
I forked the original repo and fixed some non-working code.
Hi Ranjay,
Thanks a lot for providing the community with such an amazing resource. I tried to recreate a working snapshot of VisualGenome on my computer by downloading the following JSON files:
attribute_synsets.json
object_alias.txt
objects.json
relationship_alias.txt
relationships.json
attributes.json
image_data.json
object_synsets.json
region_descriptions.json
relationship_synsets.json
scene_graphs.json
synsets.json
I'm not really interested in the QA data, so I haven't downloaded it.
Using the official API that you provide, I tried to read the scene graph for the image with ID "2381815". However, it seems that no synsets are associated with the objects or the relationships. I tried to retrieve the scene graph from the online API, but it seems to be unavailable at the moment. Other images seem to have all the required information (apart from some of the objects). Am I missing some of the metadata files? Any ideas about this problem?
Thanks a lot!
I have been unable to use the GetSceneGraphOfImage function in the API recently. When I used it about a month ago, it was extremely slow, taking from 1 to 10 minutes to load a single scene graph. As of last night, I have been unable to acquire any scene graphs. Once I call the function, it seems to hang indefinitely (for at least 10-12 hours).
If I interrupt the process while it's hanging, I get the following trace:
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-8-2265c0abc8cf> in <module>()
----> 1 scene_graph = vg.GetSceneGraphOfImage(id=image_id)
/Users/eric/code/visual_genome_python_driver/src/api.pyc in GetSceneGraphOfImage(id)
71 def GetSceneGraphOfImage(id=61512):
72 image = GetImageData(id=id)
---> 73 data = utils.RetrieveData('/api/v0/images/' + str(id) + '/graph')
74 if 'detail' in data and data['detail'] == 'Not found.':
75 return None
/Users/eric/code/visual_genome_python_driver/src/utils.pyc in RetrieveData(request)
18 connection = httplib.HTTPSConnection("visualgenome.org", '443')
19 connection.request("GET", request)
---> 20 response = connection.getresponse()
21 jsonString = response.read()
22 data = json.loads(jsonString)
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.pyc in getresponse(self, buffering)
1134
1135 try:
-> 1136 response.begin()
1137 assert response.will_close != _UNKNOWN
1138 self.__state = _CS_IDLE
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.pyc in begin(self)
451 # read until we get a non-100 response
452 while True:
--> 453 version, status, reason = self._read_status()
454 if status != CONTINUE:
455 break
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.pyc in _read_status(self)
407 def _read_status(self):
408 # Initialize with Simple-Response defaults
--> 409 line = self.fp.readline(_MAXLINE + 1)
410 if len(line) > _MAXLINE:
411 raise LineTooLong("header line")
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.pyc in readline(self, size)
478 while True:
479 try:
--> 480 data = self._sock.recv(self._rbufsize)
481 except error, e:
482 if e.args[0] == EINTR:
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.pyc in recv(self, buflen, flags)
732 "non-zero flags not allowed in calls to recv() on %s" %
733 self.__class__)
--> 734 return self.read(buflen)
735 else:
736 return self._sock.recv(buflen, flags)
/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ssl.pyc in read(self, len, buffer)
619 v = self._sslobj.read(len, buffer)
620 else:
--> 621 v = self._sslobj.read(len or 1024)
622 return v
623 except SSLError as x:
KeyboardInterrupt:
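The trace shows the call blocked in a socket read with no timeout set. As a workaround sketch, not the driver's actual code, a Python 3 version of the request could pass a timeout to the connection and turn both a stalled connection and a non-JSON body into a clear error. The function name, the 30-second default, and the connection_cls parameter (there only so the function can be tried without network access) are all assumptions.

```python
import http.client
import json


def retrieve_data(path, host="visualgenome.org", timeout=30.0,
                  connection_cls=http.client.HTTPSConnection):
    """GET a JSON document, failing fast instead of hanging forever."""
    connection = connection_cls(host, 443, timeout=timeout)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        body = response.read()
    except OSError as exc:  # includes socket timeouts and refused connections
        raise RuntimeError("request to %s%s failed: %s" % (host, path, exc)) from exc
    finally:
        connection.close()
    try:
        return json.loads(body)
    except ValueError as exc:  # e.g. an HTML error page instead of JSON
        raise RuntimeError("response from %s%s is not JSON" % (host, path)) from exc
```

A call such as retrieve_data('/api/v0/images/61512/graph') would then raise RuntimeError after the timeout rather than block for hours, and the same check covers the case where the server returns something that json.loads cannot decode.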
Hello,
I have searched everywhere but can't seem to find which flickr_id values correspond to the actual Flickr30K dataset.
Is there any way to link the images?
Thank you very much in advance.
When I add the following line to my Python 3 code and run it:
from src import api as vg
I get the above-mentioned error. Please help.
Note: I have cloned the repo with git and have even run setup.py successfully.
Since GetAllImageData() returns the result of json.load() directly, it returns a list of dictionaries, not a list of Image objects. As a result, the following code in GetAllRegionDescriptions() and GetAllQAs() will not work:
imageData = GetAllImageData()
imageMap = {}
for d in imageData:
    imageMap[d.id] = d
Because d is a dictionary rather than a parsed image object, d.id raises an error here.
I think it would be better to change the return value of GetAllImageData() to a list of Image objects.
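The proposed change could look roughly like the sketch below. It uses a stand-in Image namedtuple rather than the driver's own Image class, and it assumes each metadata entry carries its identifier under either id or image_id; both of these, along with the function names, are illustrative assumptions.

```python
from collections import namedtuple

# Stand-in for the driver's Image class; the field list is an assumption.
Image = namedtuple("Image", ["id", "url", "width", "height", "coco_id", "flickr_id"])


def parse_image_data(entry):
    """Turn one metadata dictionary into an Image object."""
    image_id = entry["image_id"] if "image_id" in entry else entry["id"]
    return Image(
        id=image_id,
        url=entry.get("url"),
        width=entry.get("width"),
        height=entry.get("height"),
        coco_id=entry.get("coco_id"),
        flickr_id=entry.get("flickr_id"),
    )


def build_image_map(entries):
    """Index parsed Image objects by id, so that image_map[some_id].id works."""
    return {image.id: image for image in map(parse_image_data, entries)}


# Made-up entry for illustration only.
entries = [{"id": 1, "url": "https://example.org/1.jpg", "width": 800,
            "height": 600, "coco_id": None, "flickr_id": None}]
image_map = build_image_map(entries)
```

With the entries parsed first, attribute access such as d.id behaves as the calling code expects.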
Hello,
I've been trying to use the get_scene_graph_of_image method, but it seems to hang indefinitely.