
HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing

Figure: Examples of harvest piles circled in red.

HarvestNet

HarvestNet is a dataset for tracking farm activity by detecting harvest piles. This document introduces the procedures required for replicating the results in our paper.

Abstract

Small farms contribute to a large share of the productive land in developing countries. In regions such as sub-Saharan Africa, where 80% of farms are small (under 2 ha in size), the task of mapping smallholder cropland is an important part of tracking sustainability measures such as crop productivity. However, the visually diverse and nuanced appearance of small farms has limited the effectiveness of traditional approaches to cropland mapping. Here we introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems throughout the world. We present HarvestNet, a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023, collected using expert knowledge and satellite images, totaling 7k hand-labeled images and 2k ground collected labels. We also benchmark a set of baselines including SOTA models in remote sensing with our best models having around 80% classification performance on hand labelled data and 90%, 98% accuracy on ground truth data for Tigray, Amhara respectively. We also perform a visual comparison with a widely used pre-existing coverage map and show that our model detects an extra 56,621 hectares of cropland in Tigray. We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food insecure regions.

Overview

Our dataset consists of 7k labelled square SkySat images of size 512x512 pixels at a resolution of 0.5m per pixel. Each labelled image also corresponds to a PlanetScope image of size 56x56 pixels at a resolution of 4.77m per pixel, covering the same 256x256m geographic area. The labels are stored as train.csv and test.csv. Each row in the labelled dataset contains:

Field | Description
filename | Name of the corresponding SkySat and PlanetScope image
lat_1 | Latitude of top left corner of area
lon_1 | Longitude of top left corner of area
lat_2 | Latitude of bottom right corner of area
lon_2 | Longitude of bottom right corner of area
activity | Label for whether the image contains harvest pile activity
altitude | Altitude of the center of the image, in meters
lat_mean | Mean of lat_1 and lat_2
lon_mean | Mean of lon_1 and lon_2
year | Year of image capture
month | Month of image capture
day | Day of image capture
group | Contiguous overlapping group the area belongs to. If there is no overlap, group = -1
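Because overlapping areas share a group value, any further split of the labelled data (for example, a validation set carved out of train.csv) should keep each group on one side. The following is a minimal sketch of that idea, not the partitioning code from Dataset_Split.ipynb. The sample rows are made up, and the function names and the assumption that group is stored as a number are ours.

```python
import csv
import io
import random
from collections import defaultdict

# Hypothetical sample rows; the values are invented for illustration only.
SAMPLE_CSV = """filename,lat_1,lon_1,lat_2,lon_2,activity,group
0.tif,14.10,38.50,14.09,38.51,1,3
1.tif,14.10,38.51,14.09,38.52,0,3
2.tif,13.80,39.20,13.79,39.21,1,-1
3.tif,12.50,37.90,12.49,37.91,0,-1
4.tif,12.60,37.40,12.59,37.41,0,7
"""


def load_rows(handle):
    """Read label rows and parse the group column as an integer."""
    rows = []
    for row in csv.DictReader(handle):
        # Assumption: group is numeric; int(float(...)) also accepts "-1.0".
        row["group"] = int(float(row["group"]))
        rows.append(row)
    return rows


def split_by_group(rows, holdout_fraction=0.2, seed=0):
    """Split rows so that no overlap group ends up in both subsets."""
    buckets = defaultdict(list)
    for index, row in enumerate(rows):
        # group == -1 means "no overlap", so each such row is its own unit.
        if row["group"] == -1:
            key = ("single", index)
        else:
            key = ("group", row["group"])
        buckets[key].append(row)

    units = list(buckets.values())
    random.Random(seed).shuffle(units)

    target = holdout_fraction * len(rows)
    train, holdout = [], []
    for unit in units:
        # Fill the holdout set first, one whole unit at a time.
        (holdout if len(holdout) < target else train).extend(unit)
    return train, holdout


rows = load_rows(io.StringIO(SAMPLE_CSV))
train, holdout = split_by_group(rows, holdout_fraction=0.4)
print(len(train), len(holdout))
```

To use it on the real labels, replace `io.StringIO(SAMPLE_CSV)` with an open handle on datasets/train.csv. Since whole units are moved at once, the holdout size is only approximately the requested fraction.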

This dataset also includes ~150k unlabelled SkySat images. They have the same dimensions and a similar label format as our labelled dataset, but without the group and activity fields defined. Both the labelled and unlabelled datasets are included in /skysat_images.

The datasets and weights folders are not included in this repository. Please download the dataset from FigShare and place it in the root directory of this repository as shown below. We also provide pretrained weights for our models, which are also hosted on FigShare.

File path | Description


/datasets
┣ 📂 skysat_images
┃   ┗ 📜 0.tif
┃   ┗ ...
┃   ┗ 📜 xx.tif
┣ 📂 planetscope_images
┃   ┗ 📜 0.png
┃   ┗ ...
┃   ┗ 📜 xx.png
┗ 📜 train.csv                  (labels for training set)
┗ 📜 test.csv                   (labels for test set)
┗ 📜 labels_all.csv             (labels for entire 150k dataset)

/weights
┣ 📂 swin_finetune
┣ 📂 swin_pretrain
┣ 📂 satmae_finetune
┗ 📜 resnet.pt
┗ 📜 satlas.pth

/src
┣ 📂 optim                      (custom optimizers)
┣ 📂 preprocessing              (helper scripts for creating dataset)
┣ 📂 scripts                    (helper scripts for running jobs on HPC)
┗ 📜 finetune_satlas.py         (main script for fine-tuning Satlas classifier)
┗ 📜 swin_pretrain.py           (main script for pretraining Swin V2 MAE)
┗ 📜 swin_finetune.py           (main script for finetuning Swin V2 classifier)
┗ 📜 train_resnet.py            (main script for finetuning Resnet50 classifier)

┗ 📜 config.py                  (configurations for training scripts)
┗ 📜 dataset.py                 (functions for loading datasets)
┗ 📜 eval_metrics.py            (functions for evaluation metrics)

/notebooks
┗ 📜 Dataset_Explorer.ipynb     (printing grid of positive images, plot histograms of dataset distribution)
┗ 📜 Dataset_Maker.ipynb        (creating csv containing image labels from .tif images)
┗ 📜 Dataset_Split.ipynb        (create dataset for MTurks, applying expert labels, overlap partitioning algorithm, train test split scripts)
┗ 📜 Image_Load.ipynb           (remove corrupted images from dataset)
┗ 📜 Labelling.ipynb            (used by experts to label images)
┗ 📜 Migration.ipynb            (combine disjoint labels to one dataset)
┗ 📜 PlanetScope_Download.ipynb (download images from PlanetScope)
┗ 📜 SkySat_Clip_Bbox.ipynb     (create basic csv file for images in folder)
┗ 📜 SkySat_Clip.ipynb          (divide SkySat captures into 512x512 px images)
┗ 📜 SkySat_Download.ipynb      (download images from SkySat)
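After downloading, it can help to confirm that the datasets and weights folders sit where the layout above expects them. The sketch below is a hypothetical helper, not part of /src. The expected paths are copied from the listing, and we assume that a label's filename maps to a same-named .tif in skysat_images and .png in planetscope_images, whether or not the filename field includes an extension.

```python
from pathlib import Path

# Paths copied from the layout listing, relative to the repository root.
EXPECTED = [
    "datasets/skysat_images",
    "datasets/planetscope_images",
    "datasets/train.csv",
    "datasets/test.csv",
    "datasets/labels_all.csv",
    "weights/swin_finetune",
    "weights/swin_pretrain",
    "weights/satmae_finetune",
    "weights/resnet.pt",
    "weights/satlas.pth",
]


def missing_paths(root):
    """Return the expected files and folders that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


def image_paths(root, filename):
    """Return (SkySat .tif path, PlanetScope .png path) for one label row.

    Assumption: both images share the same base name as the label's filename.
    """
    stem = Path(filename).stem  # "0.tif" and "0" both give "0"
    datasets = Path(root) / "datasets"
    return (
        datasets / "skysat_images" / f"{stem}.tif",
        datasets / "planetscope_images" / f"{stem}.png",
    )


# Run from the repository root; an empty list means nothing is missing.
print(missing_paths("."))
```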

Environment setup

Create and activate a conda environment named harvest from our env.yaml:

conda env create -f env.yaml
conda activate harvest

Download data

Due to size limits and licensing restrictions, the original SkySat images must be downloaded from Planet Explorer. The pre-processing scripts are included in this repo.

  1. SkySat_Download.ipynb: Notebook to download specified SkySat assets. Please refer to the Planet SDK for Python repo to set up your Planet account.
  2. SkySat_Clip.ipynb: Notebook to clip given SkySat Collects into 512x512 px images and delete images that are partially empty.
  3. SkySat_Clip_Bbox.ipynb: Notebook to extract bounding box coordinates of each SkySat clipped image to be used to download PlanetScope images.
  4. PlanetScope_Download.ipynb: Notebook to download PlanetScope monthly basemaps using Google Earth Engine. Please refer to this NICFI access page to set up your Google Earth Engine account and gain access to the collection of interest.
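The clipping in step 2 can be pictured as laying a grid of 512x512 px windows over each SkySat collect and discarding tiles with missing data. The following is a simplified sketch in plain Python, not the code in SkySat_Clip.ipynb. It assumes non-overlapping windows, drops edge remainders that do not fill a whole tile, and treats a pixel value of 0 as "empty". All three are our assumptions.

```python
def tile_windows(height, width, tile=512):
    """Yield (row_off, col_off) for non-overlapping tiles that fit entirely."""
    for row_off in range(0, height - tile + 1, tile):
        for col_off in range(0, width - tile + 1, tile):
            yield row_off, col_off


def is_partially_empty(pixels, nodata=0):
    """Return True if any pixel in a 2D tile equals the assumed nodata value."""
    return any(value == nodata for row in pixels for value in row)


# A hypothetical 1100 x 1600 px collect yields a 2 x 3 grid of full tiles.
windows = list(tile_windows(height=1100, width=1600))
print(len(windows))  # 6
print(windows[-1])   # (512, 1024)
```

In practice each window would be read from the GeoTIFF with a raster library and the emptiness check applied to the resulting array; the list-of-lists form here only keeps the example dependency-free.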

Contributors

amnaalmgly, jonxuxu, rjlee6
