
Comments (15)

DotWang commented on May 30, 2024

@Li-Qingyun

The accuracies of "impervious_surface" in your table are None, which is obviously wrong.

In addition, the categories and their corresponding colors are officially defined, so I suggest that you do not change them.

We transform the labels of '5_Labels_all.zip' by direct mapping, since the "Undefined" category does not exist there. Here is our code; note we use skimage.io to load images:

import numpy as np
from skimage import io

palette = {0: (255, 255, 255),  # Impervious surfaces (white)
           1: (0, 0, 255),      # Buildings (blue)
           2: (0, 255, 255),    # Low vegetation (cyan)
           3: (0, 255, 0),      # Trees (green)
           4: (255, 255, 0),    # Cars (yellow)
           5: (255, 0, 0),      # Clutter (red)
           6: (0, 0, 0)}        # Undefined (black)

invert_palette = {v: k for k, v in palette.items()}

def convert_from_color(arr_3d, palette=invert_palette):
    """ RGB-color encoding to grayscale labels """
    arr_2d = np.zeros((arr_3d.shape[0], arr_3d.shape[1]), dtype=np.uint8)

    for c, i in palette.items():
        m = np.all(arr_3d == np.array(c).reshape(1, 1, 3), axis=2)
        arr_2d[m] = i

    return arr_2d

def load_img(imgPath):
    """
    Load image
    :param imgPath: path of the image to load
    :return: numpy array of the image
    """
    if imgPath.endswith('.tif'):
        img = io.imread(imgPath)
    else:
        raise ValueError(f'Unsupported image format: {imgPath}')
    return img
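
For reference, a minimal usage sketch of the two functions above (the tile paths are illustrative, not the dataset's exact filenames):

# Hypothetical tile path, for illustration only
rgb_label = load_img('top_potsdam_2_10_label.tif')  # (H, W, 3) RGB label image
label = convert_from_color(rgb_label)               # (H, W) uint8 map with values in 0-6
io.imsave('top_potsdam_2_10_label.png', label)      # save as a single-channel PNG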

Thus we set reduce_zero_label=False, num_classes=5, and ignore_index=5 to ignore the "Clutter" category.
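
For illustration, a minimal sketch of where these settings would sit in an mmseg-style config (field placement and file paths are assumptions, not necessarily the repo's exact files):

# Hypothetical placement; the actual config files may differ.
# configs/_base_/datasets/potsdam.py -- annotation loading in the pipeline:
dict(type='LoadAnnotations', reduce_zero_label=False)

# configs/swin/<your_config>.py -- both heads ignore the clutter class:
model = dict(
    decode_head=dict(num_classes=5, ignore_index=5),
    auxiliary_head=dict(num_classes=5, ignore_index=5))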

The corresponding transformation in mmseg is

    if to_label:
        color_map = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
                              [255, 255, 0], [0, 255, 0], [0, 255, 255],
                              [0, 0, 255]])

Note: the channel order is reversed (BGR) since mmcv uses OpenCV to read images.

If you use this function, since '5_Labels_all.zip' doesn't have the "black boundary", the labels will be transformed to 1-6 (here, clutter = 6).

(Correspondingly, '5_Labels_noBoundary.zip' will be transformed to 0-6.)

In this case, reduce_zero_label should be set to True (1-6 -> 0-5); then set num_classes=5 and ignore_index=5.
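
A quick numeric check of this mapping, mirroring mmseg's reduce-zero-label logic as a standalone sketch:

import numpy as np

seg = np.array([1, 2, 3, 4, 5, 6], dtype=np.uint8)  # converted labels from '5_Labels_all.zip'
seg[seg == 0] = 255   # avoid uint8 underflow (a no-op for this input)
seg = seg - 1         # reduce_zero_label: 1-6 -> 0-5
seg[seg == 254] = 255
print(seg)            # [0 1 2 3 4 5]; clutter is now 5, matching ignore_index=5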


DotWang commented on May 30, 2024

batchsize = samples_per_gpu * gpu_number

samples_per_gpu is set in configs/_base_/datasets/xxxx.py (xxxx is the dataset you use),

gpu_number is controlled by CUDA_VISIBLE_DEVICES,

and --nproc_per_node = gpu_number.

For example, samples_per_gpu in potsdam.py has been set to 8,

so if you want batchsize=16 and to run on GPUs 0 and 1, you can use

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --master_port=40001 tools/train.py \
    configs/upernet/upernet_our_r50_512x512_80k_potsdam_epoch300.py \
    --launcher 'pytorch'

In configs/swin/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py

We reset samples_per_gpu=4

Thus, if you need batchsize=8, please use

CUDA_VISIBLE_DEVICES=x,y python -m torch.distributed.launch --nproc_per_node=2 --master_port=xxxxx tools/train.py \
    configs/swin/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py \
    --launcher 'pytorch'


Li-Qingyun commented on May 30, 2024

@DotWang Thanks for your reply~

I have trained with configs/swin/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py; both 1x8 and 2x4 strategies were tested, but only achieved the following results.

The eval results of each eval_interval are as follows:

| iter  | aAcc  | mFscore | mIoU  |
|-------|-------|---------|-------|
| 8000  | 80.81 | /       | 59.73 |
| 16000 | 82.03 | /       | 61.41 |
| 24000 | 82.52 | /       | 61.96 |
| 32000 | 82.72 | /       | 62.41 |
| 40000 | 83.23 | /       | 62.97 |
| 48000 | 83.0  | 75.03   | 62.63 |
| 56000 | 82.69 | 74.63   | 62.27 |
| 64000 | 82.88 | 75.19   | 62.7  |
| 72000 | 83.35 | 75.57   | 63.29 |
| 80000 | 83.3  | 75.58   | 63.23 |

Hence I opened this issue to ask.
I'd appreciate your assistance!


DotWang commented on May 30, 2024

@Li-Qingyun How did you prepare the potsdam dataset?

This dataset has two image versions, RGB and IR-R-G,

and the labels also have two versions: with or without boundary.

In our implementation, we use '3_Ortho_IRRG.zip' and '5_Labels_all.zip'.

In addition, you can check the labels.

In our experiment, the labels in '5_Labels_all.zip' range from 0-5, so we directly ignore class 5.

The other kind of label additionally includes an undefined category.

Note we don't use the transformation function provided by mmsegmentation.

If you use it, you may need to adjust the corresponding settings,

such as

whether to set reduce_zero_label in configs/_base_/datasets/potsdam.py;

the settings of num_classes and ignore_index in configs/swin/your config file;

and the dataset file in mmseg/datasets/your dataset file.

We have not described these in the readme.md since they are highly customized.


Li-Qingyun commented on May 30, 2024

@DotWang Thanks for your support!
I followed mmseg's official guide for preparing datasets, in which '2_Ortho_RGB.zip' and '5_Labels_all_noBoundary.zip' are required.
What a huge difference between the segmentation results of RGB w/o boundary and IRRG w/ boundary.


Li-Qingyun commented on May 30, 2024

Oh, the zips actually used were '4_Ortho_RGBIR.zip' and '5_Labels_for_participants_no_Boundary.zip'.


Li-Qingyun commented on May 30, 2024

@DotWang Why does each image in RGBIR contain 3 channels?


DotWang commented on May 30, 2024

@Li-Qingyun An RGBIR image contains 4 channels: R, G, B, NIR.

You can use skimage to read it.

But since we use ordinary deep models that process 3-channel images, the RGBIR version is usually not used.
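
A minimal reading sketch with skimage (the tile name follows the dataset's naming scheme but is illustrative):

from skimage import io

img = io.imread('top_potsdam_2_10_RGBIR.tif')  # hypothetical tile name
print(img.shape)  # expected (6000, 6000, 4): R, G, B, NIR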


Li-Qingyun commented on May 30, 2024

@DotWang Thanks, I used cv2.imread, which reads images in 3-channel BGR mode by default.
I know too little about this dataset; thank you for your support. I will adjust the data and rerun the experiment.
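
For anyone hitting the same pitfall, a sketch of the difference (the tile name is illustrative):

import cv2

# Default flag is IMREAD_COLOR, which forces a 3-channel BGR image
img_bgr = cv2.imread('top_potsdam_2_10_RGBIR.tif')
# IMREAD_UNCHANGED should keep the file's original channels, including NIR
img_all = cv2.imread('top_potsdam_2_10_RGBIR.tif', cv2.IMREAD_UNCHANGED)
print(img_bgr.shape, img_all.shape)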


Li-Qingyun commented on May 30, 2024

@DotWang Hi, about the metrics provided by mmseg:

Does 'mFscore' correspond to mF1,
and 'aAcc' to OA?


DotWang commented on May 30, 2024

@Li-Qingyun yes


Li-Qingyun commented on May 30, 2024

> The other kind of label additionally includes an undefined category.
> Note we don't use the transformation function provided by mmsegmentation.
> If you use it, you may need to adjust the corresponding settings,
> such as
> whether to set reduce_zero_label in configs/_base_/datasets/potsdam.py;
> the settings of num_classes and ignore_index in configs/swin/your config file;
> and the dataset file in mmseg/datasets/your dataset file.
> We have not described these in the readme.md since they are highly customized.

I followed the custom potsdam.py in the repo, setting reduce_zero_label=False and ignore_index=5. The dataset was prepared with 'Semantic Segmentation/tools/convert_datasets/potsdam.py'; '3_Ortho_IRRG.zip' and '5_Labels_all.zip' were adopted.

My training achieved the OA (91.22) of UperNet + Swin-T-IMP, but only about 88.69 mFscore.


I think my setting of reduce_zero_label and ignore_index might be wrong.

I wrote a script to read the annotations (prepared by tools/convert_datasets/potsdam.py)
and found that, for '5_Labels_all.zip', the converted labels fall in 1~5.

[images: multiclass_mask_all, bin_mask_all]

For '5_Labels_noBoundary.zip', the converted labels fall in 0~5, in which 0 seems to be the boundary.

[images: multiclass_mask_noboundary, bin_mask_noboundary]

the IRRG image is:
[image: 2_10_0_0_512_512_IRRG]

And the CLASSES in both potsdam.py and potsdam_ori.py are:

CLASSES = ('impervious_surface', 'building', 'low_vegetation', 'tree',
           'car', 'clutter')

PALETTE = [[255, 255, 255], [0, 0, 255], [0, 255, 255], [0, 255, 0],
           [255, 255, 0], [255, 0, 0]]

I think the class that should be ignored is 'clutter', isn't it?

And the docstring of PotsdamDataset:

@DATASETS.register_module()
class PotsdamDataset(CustomDataset):
    """ISPRS Potsdam dataset.

    In segmentation map annotation for Potsdam dataset, 0 is the ignore index.
    ``reduce_zero_label`` should be set to True. The ``img_suffix`` and
    ``seg_map_suffix`` are both fixed to '.png'.
    """

says 0 is the ignore index, and `reduce_zero_label` should be set to True.

I'm still confused about the dataset preparation and the correct usage needed to reproduce the reported results as a baseline for my work.
I'd appreciate your help and would be willing to open a pull request for the dataset preparation.

Thanks for your quick replies.

The script is as follows:

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

from mmsegmentation.configs_rs._base_.potsdam import data as RGB_data
from mmsegmentation.configs_rs._base_.potsdam_IRRG import data as IRRG_data

from mmseg.datasets import build_dataset
RGB_trainset = build_dataset(RGB_data['train'])
IRRG_trainset = build_dataset(IRRG_data['train'])

palette_map = {
    '[255, 255, 255]': 'white',
    '[0, 0, 255]': 'blue',
    '[0, 255, 0]': 'green',
    '[255, 0, 0]': 'red',
    '[255, 255, 0]': 'yellow',
    '[255, 0, 255]': 'magenta',
    '[0, 255, 255]': 'cyan',
}

# CLASSES = ('impervious_surface', 'building', 'low_vegetation', 'tree',
#            'car', 'clutter')
CLASSES = ('clutter', 'impervious_surface', 'building', 'low_vegetation',
           'tree', 'car')

palette = RGB_trainset.PALETTE
print({k: palette_map[str(v)] for k, v in enumerate(palette)})


def save_ann_with_custom_palette(ann_path, output_path, ann_name):
    ann = Image.open(ann_path)
    ann_array = np.array(ann)
    print(f'{ann_name}: {np.unique(ann_array)}')
    save_bin_mask(ann_array, ann_name, output_path)
    h, w = ann_array.shape
    classes = np.unique(ann_array)
    out_ann = np.zeros((h, w, 3), dtype=np.uint8)  # uint8 so imshow treats values as 0-255
    for cls in classes:
        indices = np.nonzero(ann_array == cls)
        out_ann[indices] = palette[cls]
    plt.figure()
    plt.title(ann_name)
    plt.imshow(out_ann)
    plt.savefig(output_path + f'{ann_name}.png')


def save_bin_mask(ann_array: np.ndarray, remark: str, output_path):
    plt.figure()
    plt.suptitle(f'label {remark}')
    classes = np.unique(ann_array)
    _len = len(classes)
    subplot_w = int(np.ceil(np.sqrt(_len)))
    subplot_h = int(np.ceil(_len / subplot_w))
    gs = gridspec.GridSpec(subplot_h, subplot_w * 2)
    gs.update(wspace=0.8)
    for i, cls in enumerate(np.unique(ann_array)):
        bin_mask = (ann_array == cls).astype(np.float32)
        if _len - i >= subplot_w or _len % 2 == 0:
            plt.subplot(
                gs[i // subplot_w, i % subplot_w * 2: i % subplot_w * 2 + 2])
        else:
            plt.subplot(
                gs[i // subplot_w, i % subplot_w * 2 + 1: i % subplot_w * 2 + 3])
        plt.title(f'{cls}-{CLASSES[cls]}')
        # plt.title(f'class {cls} ({remark})')
        plt.imshow(bin_mask)
    plt.savefig(output_path + f'bin_mask {remark}')


ann0_path = f'/home/lqy/Desktop/DINO_semantic_seg/mmsegmentation/data' \
            f'/potsdam/ann_noboundary/train/2_10_0_0_512_512.png'
ann1_path = f'/home/lqy/Desktop/DINO_semantic_seg/mmsegmentation/data' \
            f'/potsdam/ann_all/train/2_10_0_0_512_512.png'
output_path = '/home/lqy/Desktop/DINO_semantic_seg/develop/dataset/'
save_ann_with_custom_palette(ann0_path, output_path, 'noboundary')
save_ann_with_custom_palette(ann1_path, output_path, 'all')


Li-Qingyun commented on May 30, 2024

@DotWang Thanks for your replies.

I did not find your dataset-preparation script in the repo at first; hence, I followed the official instructions of mmseg, which seem mismatched with the PotsdamDataset class in this repo's potsdam.py. The potsdam_ori.py is the one that should be used.

I searched the reduce_zero_label parameter globally and tried to understand how it takes effect. The core logic is as follows:

if self.reduce_zero_label:
    # avoid using underflow conversion
    gt_semantic_seg[gt_semantic_seg == 0] = 255
    gt_semantic_seg = gt_semantic_seg - 1
    gt_semantic_seg[gt_semantic_seg == 254] = 255

which relabels the zero (background) class to 255.

And there seem to be two places where reduce_zero_label takes effect:

  1. LoadAnnotations in the pipeline
  2. CustomDataset, when calculating eval metrics

They both do the same thing, which makes one wonder whether the action could be repeated.
I think the train annotations are converted by the one in LoadAnnotations and the val annotations by the one in CustomDataset. It seems we hardly ever call the eval function to verify the segmentation performance of the model on the training set; otherwise, the reduce operation would likely be performed twice.

Back to the topic: if the user follows mmseg's official dataset preparation, the labels seem to go through the following mapping process (colors are in RGB format):

{0 : (255, 255, 255),  # Impervious surfaces (white)
 1 : (0, 0, 255),     # Buildings (blue)
 2 : (0, 255, 255),   # Low vegetation (cyan)
 3 : (0, 255, 0),     # Trees (green)
 4 : (255, 255, 0),   # Cars (yellow)
 5 : (255, 0, 0),     # Clutter (red)
 6 : (0, 0, 0)}       # Undefined (black)
                  ↓   transformation in `convert_datasets/potsdam.py`
{0 : (0, 0, 0), # Undefined (black)
 1 : (255, 255, 255),  # Impervious surfaces (white)
 2 : (0, 0, 255),     # Buildings (blue)
 3 : (0, 255, 255),   # Low vegetation (cyan)
 4 : (0, 255, 0),     # Trees (green)
 5 : (255, 255, 0),   # Cars (yellow)
 6 : (255, 0, 0)}     # Clutter (red)
                  ↓   reduce_zero_label in `LoadAnnotations`
{0 : (255, 255, 255),  # Impervious surfaces (white)
 1 : (0, 0, 255),     # Buildings (blue)
 2 : (0, 255, 255),   # Low vegetation (cyan)
 3 : (0, 255, 0),     # Trees (green)
 4 : (255, 255, 0),   # Cars (yellow)
 5 : (255, 0, 0),     # Clutter (red)
 255 : (0, 0, 0)}       # Undefined (black)

Hence, in the official potsdam_ori.py, it is ignore_index=255.
It uses '5_Labels_noBoundary.zip', whose converted labels (the second mapping above) are [0 1 2 3 4 5 6].
When setting reduce_zero_label=True, the labels become [255 0 1 2 3 4 5], hence ignore_index was set to 255, which makes the Undefined background the ignored category. However, actually both 5 and 255 should be ignored, shouldn't they?

In the ViTAE-RS repo, '5_Labels_all.zip' is used, whose converted labels are [1 2 3 4 5 6].
When setting reduce_zero_label=True, the labels become [0 1 2 3 4 5], hence ignore_index was set to 5.
If reduce_zero_label=False, the labels stay [1 2 3 4 5 6] and the index to ignore would be 6, but then 0 becomes an extra, unused label. Hence, the +1 shift of the transformation should be dropped in this repo to keep the original [0 1 2 3 4 5], where 5 is the ignored Clutter class; the Undefined background class does not occur in '5_Labels_all.zip', so ignore_index=5.


DotWang commented on May 30, 2024

@Li-Qingyun
Haha, the script for preparing the Potsdam dataset was used in our previous projects, so we adopted it instead of the mmseg transformation in this work. We did not upload the script since we think mmseg is already highly customized for users.

Most of your understanding is right. The transformation convert_datasets/potsdam.py exists in the original mmseg; we didn't even use this folder and directly uploaded it as-is.

The mIoUs shown on the mmseg site include all categories except "Undefined". However, in the RS literature, "Clutter" is also considered background and does not take part in the metric calculation. In fact, whether or not to mask this category during training is OK either way. For convenience, we also ignore it when training models.


Li-Qingyun commented on May 30, 2024

@DotWang Thank you very much for your help and your detailed, patient explanations. I finally achieved the results in the paper and can focus on my own research. Wish you all the best with your research. Thank you!

