
dataset's Introduction

Functional Map of the World (fMoW) Dataset

There are two versions of the dataset: fMoW-full and fMoW-rgb. fMoW-full is in TIFF format, contains 4-band and 8-band multispectral imagery, and is quite large at ~3.5TB in size. fMoW-rgb is in JPEG format, all multispectral imagery has been converted to RGB, and it is significantly smaller in size at ~200GB.

Please see the fMoW flyer for more info about the challenge. Note that the fMoW challenge has now ended.

Reference

If you use our dataset or code, please cite our paper:

@inproceedings{fmow2018,
  title={Functional Map of the World},
  author={Christie, Gordon and Fendley, Neil and Wilson, James and Mukherjee, Ryan},
  booktitle={CVPR},
  year={2018}
}

Categories

["airport", "airport_hangar", "airport_terminal", "amusement_park", "aquaculture", "archaeological_site", "barn", "border_checkpoint", "burial_site", "car_dealership", "construction_site", "crop_field", "dam", "debris_or_rubble", "educational_institution", "electric_substation", "factory_or_powerplant", "fire_station", "flooded_road", "fountain", "gas_station", "golf_course", "ground_transportation_station", "helipad", "hospital", "impoverished_settlement", "interchange", "lake_or_pond", "lighthouse", "military_facility", "multi-unit_residential", "nuclear_powerplant", "office_building", "oil_or_gas_facility", "park", "parking_lot_or_garage", "place_of_worship", "police_station", "port", "prison", "race_track", "railway_bridge", "recreational_facility", "road_bridge", "runway", "shipyard", "shopping_mall", "single-unit_residential", "smokestack", "solar_farm", "space_facility", "stadium", "storage_tank", "surface_mine", "swimming_pool", "toll_booth", "tower", "tunnel_opening", "waste_disposal", "water_treatment_facility", "wind_farm", "zoo"]

Download

Originally, there were two official ways to download the dataset: from AWS or using BitTorrent. However, the BitTorrent method is no longer actively maintained, leaving AWS as the primary recommended method for downloading the data.

AWS

The fMoW datasets are available on AWS for free at:

  • fMoW-full: s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full
  • fMoW-rgb: s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb

The data can be accessed through AWS using tools such as the AWS CLI. For example, to get a directory listing with the AWS CLI, run the following commands:

aws s3 ls s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/
aws s3 ls s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb/
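
The same bucket can also be read from Python. The following is a minimal sketch, not an official tool. It assumes the third-party boto3 package is installed and that the bucket permits anonymous (unsigned) reads, which this README does not confirm. The helper names `local_path_for_key` and `download_prefix` are hypothetical.

```python
from pathlib import Path, PurePosixPath

BUCKET = "spacenet-dataset"
RGB_PREFIX = "Hosted-Datasets/fmow/fmow-rgb/"


def local_path_for_key(key, prefix, dest_dir):
    """Map an S3 key under `prefix` to a file path under `dest_dir`."""
    relative = PurePosixPath(key).relative_to(PurePosixPath(prefix))
    return Path(dest_dir).joinpath(*relative.parts)


def download_prefix(bucket, prefix, dest_dir, max_files=None):
    """Download every object under `prefix`; returns the number of files fetched.

    Assumption: anonymous access is allowed. If it is not, drop the
    `config=` argument so that boto3 uses your configured credentials.
    """
    # Imported lazily so the path helper above works without boto3 installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    paginator = s3.get_paginator("list_objects_v2")
    count = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip "directory" placeholder keys
            target = local_path_for_key(key, prefix, dest_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket, key, str(target))
            count += 1
            if max_files is not None and count >= max_files:
                return count
    return count
```

Calling `download_prefix(BUCKET, RGB_PREFIX, "./fmow-rgb", max_files=10)` would fetch at most ten objects, which is a cheap way to check access before committing to the full ~200GB.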

To download the manifest.json.bz2 files that list all images and metadata present in each bucket, run the following commands:

aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-full/manifest.json.bz2 ./
aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb/manifest.json.bz2 ./
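
Once downloaded, a manifest can be inspected without unpacking it to disk first. This is a minimal sketch that assumes the file decompresses to a single JSON document, as its name suggests. The internal layout of the manifest is not described here, so the sketch only loads it and leaves interpretation to the reader. The function name `load_manifest` is hypothetical.

```python
import bz2
import json


def load_manifest(path):
    """Decompress a .json.bz2 file and parse its contents as JSON."""
    # Text mode ("rt") lets json.load read decoded characters directly.
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)
```

A reasonable first step is `manifest = load_manifest("manifest.json.bz2")` followed by `type(manifest)` and `len(manifest)` to see what kind of structure was loaded.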

BitTorrent (Deprecated)

NOTE: This download method is no longer maintained! Using the client of your choice, you can add the following torrent files to download the corresponding subsets of the fMoW dataset:

Additional details

The train and val sets were released to competitors with category labels and a rich set of metadata fields. The test and seq sets had category labels removed and a small amount of noise added to many metadata fields. Certain fields, such as GPS coordinates, were removed from all sets during the challenge. Now that the challenge has ended, the sequestered and ground truth data have been released. They contain all raw metadata, including category labels and GPS coordinates, for every image.

Joining these ground truth metadata files with the original test and seq imagery does require a small amount of effort. In each of the ground truth archives for fMoW-full and fMoW-rgb there is a mapping JSON file. This mapping file provides the association between each test and seq image and its corresponding metadata. You can also use this mapping file to reorganize the test and seq data into category and temporal sequence folders similar to the train and val sets.

Bounding box format

Bounding boxes are provided in the format [x, y, width, height], where the point (x, y) is the top-left corner of the box surrounding the object of interest. In other words, the four values correspond to [left, top, width, height].
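
Many libraries expect corner coordinates rather than a width and height, so a conversion is often needed. The following is a minimal sketch. The function names are hypothetical, and treating right and bottom as exclusive bounds (suitable for array slicing) is an assumption rather than something this README specifies.

```python
def xywh_to_ltrb(box):
    """Convert [x, y, width, height] to [left, top, right, bottom]."""
    left, top, width, height = box
    return [left, top, left + width, top + height]


def clip_to_image(ltrb, image_width, image_height):
    """Clamp a [left, top, right, bottom] box to the image extent."""
    left, top, right, bottom = ltrb
    return [
        max(0, left),
        max(0, top),
        min(image_width, right),
        min(image_height, bottom),
    ]


# Example box values taken from a ground truth metadata file.
box = [697, 1451, 5720, 2649]
print(xywh_to_ltrb(box))  # [697, 1451, 6417, 4100]
```

With an image loaded as a NumPy-style array, the converted box can be used as `image[top:bottom, left:right]`.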

Non-existent country codes

Some country codes in the dataset may not be valid. Please consider re-computing country codes using the underlying geographic coordinate metadata. See this issue for more details.
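
Before re-computing codes, it can help to find which records are affected. The sketch below flags values in the `country_code` field that do not even have the shape of an ISO Alpha-3 code (three uppercase letters), which catches reported cases such as KO- and CA-. It does not check membership in the official ISO list and does not re-compute anything from coordinates. The function names, and the assumption that metadata records are available as a list of dicts, are illustrative.

```python
import re
from collections import Counter

# Shape check only: three uppercase ASCII letters.
_ALPHA3_SHAPE = re.compile(r"[A-Z]{3}")


def has_alpha3_shape(code):
    """Return True if `code` looks like an ISO Alpha-3 code."""
    return isinstance(code, str) and _ALPHA3_SHAPE.fullmatch(code) is not None


def suspicious_codes(records):
    """Count country_code values that fail the shape check."""
    return Counter(
        record.get("country_code")
        for record in records
        if not has_alpha3_shape(record.get("country_code"))
    )
```

Records flagged this way are the ones most worth re-geocoding from their coordinate metadata.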

License

This data is licensed under the Functional Map of the World Challenge Public License. This new license is similar to the previous license with modifications to clarify that algorithms trained on challenge data are not considered adapted material.

dataset's People

Contributors

gordonac, mukhery

dataset's Issues

Cannot download torrent RGB dataset

Hey, both the fMoW-rgb train and val torrent and the fMoW-rgb test torrent are down (not seeded). Will they ever be available again? I was about to use them in my research. Thanks.

GPS data for train/val

Hi there,

I was wondering if GPS coordinate data for the train and validation sets are made available anywhere? Thank you!

Ask for FMoW dataset

When I attempted to download the fMoW dataset, I noticed that only torrent files are available on GitHub or AWS. However, when I tried the torrent file, no peer appeared to hold any of the data, making it impossible for me to obtain the corresponding dataset. Could you please provide a more stable download method? Thank you very much for your assistance.

Nonexistent ISO Alpha-3 country codes

The paper mentions that the metadata includes ISO Alpha-3 country codes (Appendix I). However, the country_code field in the metadata JSON files includes two codes that do not exist as ISO Alpha-3 codes: KO- and CA-.

  • KO- is not listed as a country code on the ISO website. KO is listed as an unassigned Alpha-2 code: link. All locations in the fMoW dataset with this country code are in Kosovo. Kosovo does not have its own ISO Alpha-3 code; the correct code is the one for Serbia: SRB (just quoting the ISO website). There is an unofficial Alpha-2 code for Kosovo, as described here: XK.
  • The same applies to CA-: it does not exist as an Alpha-3 code. CA exists as the Alpha-2 code for Canada (ISO website), but the fMoW metadata uses the Alpha-3 code CAN to refer to Canada. The locations in the fMoW dataset with CA- are around the Caspian Sea, both in southern Russia (Dagestan) and Azerbaijan. I don't know why the proper codes, like RUS or AZE, wouldn't be used. Some of these locations are in Azerbaijan's capital, so this doesn't seem like disputed territory.

I'm posting this here not to get political but because I think this might be relevant to some people, and it is poorly documented by fMoW. E.g. I was trying to connect the fMoW countries to ISO 3 country codes in a shapefile to do operations with the locations. Please document this properly somewhere :-).

Use Sequential Images for Change Detection

As the image size varies for the same location at different timestamps, I wonder how to use the sequential images in tasks such as change detection, where the input image sizes should be the same.

Or is there an anchor point that aligns the positions of the sequential images?

About the `"raw_polygon"` field in metadata files

Hello,
AFAIK, there are some fields in the metadata files that are not documented, such as the "raw_polygon" field.

Is this supposed to be the polygon enclosing the image, with geospatial coordinates?
What is the syntax of the value of this field? Is it possible to convert it to GeoJSON?

Thanks

downloading via s3

I'm trying to download the data on S3 right now. I have opened an account and installed the command line interface, but I don't understand which command I need to download the RGB data. I tried something like:
aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb ./, but I get an error that the key Hosted-Dataset doesn't exist. I need to download and store the dataset in a specific file.
Thank you in advance.

Access Denied: Error while trying to download from s3

I have been trying to download the dataset using the following S3 command, but it reports "Access Denied":
aws s3 cp s3://spacenet-dataset/Hosted-Datasets/fmow/fmow-rgb . --recursive

fatal error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

What do box dimensions mean?

The ground truth metadata JSON files denote the bounding box dimensions as 'box': [697, 1451, 5720, 2649], but neither this repository nor the fMoW paper describes what these dimensions mean. You have to search the baseline code to find, in this function, that they represent [x, y, w, h], where [x, y] are the indices of the upper (?) left corner. Other bounding box annotation schemes exist (e.g. see here), so I think this would be nice to have clearly documented somewhere.
