Comments (15)
The accuracy of "impervious_surface" in your table is None, which is obviously wrong.
In addition, the categories and their corresponding colors are officially defined, so I suggest that you do not change them.
We convert the labels of '5_Labels_all.zip' by direct mapping, since the "Undefined" category does not exist in it. Here is our code; note that we use skimage.io to load images:
import numpy as np
from skimage import io

palette = {0: (255, 255, 255),  # Impervious surfaces (white)
           1: (0, 0, 255),      # Buildings (blue)
           2: (0, 255, 255),    # Low vegetation (cyan)
           3: (0, 255, 0),      # Trees (green)
           4: (255, 255, 0),    # Cars (yellow)
           5: (255, 0, 0),      # Clutter (red)
           6: (0, 0, 0)}        # Undefined (black)
invert_palette = {v: k for k, v in palette.items()}

def convert_from_color(arr_3d, palette=invert_palette):
    """ RGB-color encoding to grayscale labels """
    arr_2d = np.zeros((arr_3d.shape[0], arr_3d.shape[1]), dtype=np.uint8)
    for c, i in palette.items():
        # Boolean mask of the pixels whose RGB triplet equals color c
        m = np.all(arr_3d == np.array(c).reshape(1, 1, 3), axis=2)
        arr_2d[m] = i
    return arr_2d

def load_img(imgPath):
    """
    Load image
    :param imgPath: path of the image to load
    :return: numpy array of the image
    """
    if imgPath.endswith('.tif'):
        img = io.imread(imgPath)
        # img = tif.read_image()
        # img = tifffile.imread(imgPath)
    else:
        raise ValueError('Install pillow and uncomment line in load_img')
    return img
Thus we set reduce_zero_label=False, num_classes=5 and ignore_index=5 to ignore the "Clutter" category.
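As a quick sanity check of the direct mapping described above, here is a minimal sketch that runs the same conversion on a toy tile. The tile itself is made up for illustration; only the palette comes from the comment, and the "Undefined" entry is left out because it is said not to occur in '5_Labels_all.zip'.

```python
import numpy as np

# Palette from the comment above: class index -> RGB color.
PALETTE = {
    0: (255, 255, 255),  # Impervious surfaces
    1: (0, 0, 255),      # Buildings
    2: (0, 255, 255),    # Low vegetation
    3: (0, 255, 0),      # Trees
    4: (255, 255, 0),    # Cars
    5: (255, 0, 0),      # Clutter
}
INVERT_PALETTE = {color: idx for idx, color in PALETTE.items()}


def convert_from_color(arr_3d, palette=INVERT_PALETTE):
    """Map an (H, W, 3) RGB label image to an (H, W) array of class indices."""
    arr_2d = np.zeros(arr_3d.shape[:2], dtype=np.uint8)
    for color, idx in palette.items():
        mask = np.all(arr_3d == np.array(color).reshape(1, 1, 3), axis=2)
        arr_2d[mask] = idx
    return arr_2d


# Hypothetical 2x3 tile that contains each of the six colors once.
toy_rgb = np.array(
    [[(255, 255, 255), (0, 0, 255), (0, 255, 255)],
     [(0, 255, 0), (255, 255, 0), (255, 0, 0)]],
    dtype=np.uint8,
)
toy_labels = convert_from_color(toy_rgb)
print(toy_labels)
```

With labels produced this way, clutter already sits at index 5, which is why no zero-label reduction is needed before ignoring it.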
The corresponding transformation in mmseg is
if to_label:
    color_map = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
                          [255, 255, 0], [0, 255, 0], [0, 255, 255],
                          [0, 0, 255]])
Note: the RGB values are reversed (BGR order) because mmcv uses OpenCV to read images.
If you use this function, since '5_Labels_all.zip' doesn't have the "black boundary", the labels will be converted to 1-6 (here, clutter=6).
(Correspondingly, '5_Labels_noBoundary.zip' will be converted to 0-6.)
In that case, reduce_zero_label should be True (1-6 -> 0-5); then set num_classes=5 and ignore_index=5.
from vitae-transformer-remote-sensing.
batchsize = samples_per_gpu * gpu_number
samples_per_gpu is set in configs/_base_/datasets/xxxx.py (xxxx is the dataset you use).
gpu_number is controlled by CUDA_VISIBLE_DEVICES, and --nproc_per_node should equal gpu_number.
For example, samples_per_gpu in potsdam.py has been set to 8, so if you want batchsize=16 on GPUs 0 and 1, you can use
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --master_port=40001 tools/train.py \
configs/upernet/upernet_our_r50_512x512_80k_potsdam_epoch300.py \
--launcher 'pytorch'
In configs/swin/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py we reset samples_per_gpu=4.
Thus, if you need batchsize=8, please use
CUDA_VISIBLE_DEVICES=x,y python -m torch.distributed.launch --nproc_per_node=2 --master_port=xxxxx tools/train.py \
configs/upernet/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py \
--launcher 'pytorch'
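The batch-size arithmetic in this comment can be checked with a small helper. This is only an illustrative sketch: `effective_batch_size` is a hypothetical function, not part of mmseg or this repo, and it assumes every GPU listed in CUDA_VISIBLE_DEVICES runs one training process.

```python
def effective_batch_size(samples_per_gpu, cuda_visible_devices):
    """Return samples_per_gpu times the number of GPUs listed in CUDA_VISIBLE_DEVICES."""
    gpu_ids = [d for d in cuda_visible_devices.split(",") if d.strip()]
    return samples_per_gpu * len(gpu_ids)


# potsdam.py sets samples_per_gpu=8; GPUs 0 and 1 are visible.
potsdam_batch = effective_batch_size(8, "0,1")
# The swin config resets samples_per_gpu=4; two GPUs are visible.
swin_batch = effective_batch_size(4, "0,1")
print(potsdam_batch, swin_batch)
```

The same number of GPUs should be passed to --nproc_per_node so that the launcher starts one process per visible device.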
@DotWang Thanks for your reply~
I have trained with configs/swin/upernet_swin_tiny_patch4_window7_512x512_80k_potsdam.py. Both the 1x8 and 2x4 strategies were tested, and they only achieved the results below.
The eval results at each eval_interval are as follows:
iter | aAcc | mFscore | mIoU |
---|---|---|---|
8000 | 80.81 | / | 59.73 |
16000 | 82.03 | / | 61.41 |
24000 | 82.52 | / | 61.96 |
32000 | 82.72 | / | 62.41 |
40000 | 83.23 | / | 62.97 |
48000 | 83.0 | 75.03 | 62.63 |
56000 | 82.69 | 74.63 | 62.27 |
64000 | 82.88 | 75.19 | 62.7 |
72000 | 83.35 | 75.57 | 63.29 |
80000 | 83.3 | 75.58 | 63.23 |
Hence I opened this issue to ask.
I would appreciate your assistance!
@Li-Qingyun How did you prepare the Potsdam dataset?
The dataset comes in two image versions, RGB and IR-R-G, and the labels also come in two versions: with or without boundary.
In our implementation, we use '3_Ortho_IRRG.zip' and '5_Labels_all.zip'.
In addition, you can check the labels.
In our experiments, the labels in '5_Labels_all.zip' range from 0 to 5, so we directly ignore class 5.
The other kind of label additionally includes an undefined category.
Note that we don't use the conversion function provided by mmsegmentation.
If you use it, you may need to adjust the corresponding settings, such as:
whether to set reduce_zero_label in configs/_base_/datasets/potsdam.py;
the num_classes and ignore_index settings in configs/swin/your config file;
and the dataset file in mmseg/datasets/your dataset file.
We have not described these in the readme.md since they are highly customized.
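To make the three settings concrete, here is a hedged config fragment written as plain Python dicts. It assumes mmseg 0.x-style key names (`LoadAnnotations`, `decode_head`, `auxiliary_head`); the exact layout of the config files in this repo is not shown in the thread, so treat the structure as an assumption. The values are the ones quoted for labels that already run 0-5 with clutter at 5.

```python
# Values quoted in this thread for labels mapped directly to 0-5 (clutter = 5).
REDUCE_ZERO_LABEL = False
NUM_CLASSES = 5
IGNORE_INDEX = 5

# Dataset side (configs/_base_/datasets/potsdam.py): annotation loading step.
# The step names follow mmseg 0.x conventions and are an assumption here.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=REDUCE_ZERO_LABEL),
]

# Model side (configs/swin/<your config file>): segmentation head settings.
model = dict(
    decode_head=dict(num_classes=NUM_CLASSES, ignore_index=IGNORE_INDEX),
    auxiliary_head=dict(num_classes=NUM_CLASSES, ignore_index=IGNORE_INDEX),
)
```

If the labels are instead produced by the mmseg conversion script (values 1-6), the thread suggests flipping reduce_zero_label to True while keeping the other two values.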
@DotWang Thanks for your support!
I followed mmseg's official dataset preparation guide, which requires '2_Ortho_RGB.zip' and '5_Labels_all_noBoundary.zip'.
I did not expect such a huge difference between the segmentation results of RGB w/o boundary and IRRG w/ boundary.
Oh, the zips I actually used are '4_Ortho_RGBIR.zip' and '5_Labels_for_participants_no_Boundary.zip'.
@DotWang Why does each image in RGBIR contain 3 channels?
@Li-Qingyun An RGBIR image contains 4 channels: R, G, B, NIR.
You can use skimage to read it.
But since we use ordinary deep models that process 3-channel images, RGBIR is usually not used.
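For readers who only have the 4-channel files, a 3-channel input can be obtained by channel indexing. The sketch below uses a synthetic array in place of a real tile and assumes the channel order stated above (R, G, B, NIR); whether the result matches '3_Ortho_IRRG.zip' pixel for pixel is not confirmed in this thread.

```python
import numpy as np

# Stand-in for a 4-channel tile of shape (H, W, 4) in R, G, B, NIR order.
# Each pixel holds consecutive values so the reordering is easy to follow.
rgbir = np.arange(2 * 2 * 4, dtype=np.uint8).reshape(2, 2, 4)

rgb = rgbir[..., :3]           # keep R, G, B and drop NIR
irrg = rgbir[..., [3, 0, 1]]   # reorder to NIR, R, G

print(rgb.shape, irrg[0, 0])
```

Either slice yields an array that an ordinary 3-channel model can take as input.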
@DotWang Thanks. I used cv2.imread, which read the images in 3-channel mode.
I know too little about this dataset; thank you for your support. I will adjust the data and rerun the experiment.
@DotWang Hi, a question about the metrics provided by mmseg:
does 'mFscore' correspond to mF1,
and does 'aAcc' correspond to OA?
@Li-Qingyun yes
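For reference, the three quantities in the table earlier in the thread can be written in terms of a confusion matrix. This is a minimal sketch of the standard definitions (overall accuracy, mean F1, mean IoU), not mmseg's own implementation; `segmentation_metrics` and the toy matrix are made up for illustration, and the helper assumes every class appears at least once so no division by zero occurs.

```python
import numpy as np


def segmentation_metrics(conf):
    """Return (OA, mean F1, mean IoU) from a confusion matrix.

    conf[i, j] counts pixels whose ground truth is class i and prediction is class j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)
    gt_total = conf.sum(axis=1)     # pixels per ground-truth class
    pred_total = conf.sum(axis=0)   # pixels per predicted class
    oa = tp.sum() / conf.sum()
    iou = tp / (gt_total + pred_total - tp)
    f1 = 2 * tp / (gt_total + pred_total)
    return oa, f1.mean(), iou.mean()


# Hypothetical two-class confusion matrix.
toy_conf = [[8, 2],
            [1, 9]]
oa, mf1, miou = segmentation_metrics(toy_conf)
print(round(oa, 4), round(mf1, 4), round(miou, 4))
```

Because the means run over classes, dropping or keeping a class such as clutter changes mF1 and mIoU even when OA barely moves.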
> Another kind of label extra includes an undefined category.
> Note we don't use the transformation function provided by mmsegmentation.
> If you use it, you may need to adjust corresponding settings
> such as
> whether to reduce_zero_label in configs/_base_/datasets/potsdam.py;
> settings of num_classes and ignore_index in configs/swin/your config file;
> and the dataset file in mmseg/datasets/your dataset file
> We have not described these in the readme.md since they are highly customized.
I followed the custom potsdam.py in the repo, setting reduce_zero_label=False and ignore_index=5. The dataset was prepared with Semantic Segmentation/tools/convert_datasets/potsdam.py, using '3_Ortho_IRRG.zip' and '5_Labels_all.zip'.
My training reached the OA (91.22) of upernet+swin-T-IMP, but an mFscore of only about 88.69.
I think my settings of reduce_zero_label and ignore_index might be wrong.
I wrote a script to read the annotations (prepared by tools/convert_datasets/potsdam.py) and found that, for '5_Labels_all.zip', the script converts the palette to 1~5,
while for '5_Labels_noBoundary.zip', it converts the palette to 0~5, in which 0 seems to be the boundary.
And the CLASSES in both potsdam.py and potsdam_ori.py are:
CLASSES = ('impervious_surface', 'building', 'low_vegetation', 'tree',
           'car', 'clutter')
PALETTE = [[255, 255, 255], [0, 0, 255], [0, 255, 255], [0, 255, 0],
           [255, 255, 0], [255, 0, 0]]
I think the class which should be ignored is 'clutter', isn't it?
And the docstring of PotsdamDataset:
@DATASETS.register_module()
class PotsdamDataset(CustomDataset):
    """ISPRS Potsdam dataset.

    In segmentation map annotation for Potsdam dataset, 0 is the ignore index.
    ``reduce_zero_label`` should be set to True. The ``img_suffix`` and
    ``seg_map_suffix`` are both fixed to '.png'.
    """
says that 0 is the ignore index and that ``reduce_zero_label`` should be set to True.
I'm still confused about how to prepare the dataset and what the correct usage is to reproduce the reported results as a baseline for my work.
I'd appreciate your help, and I'm willing to open a pull request for the dataset preparation.
Thanks for your quick replies.
The script is as follows:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
from mmsegmentation.configs_rs._base_.potsdam import data as RGB_data
from mmsegmentation.configs_rs._base_.potsdam_IRRG import data as IRRG_data
from mmseg.datasets import build_dataset

RGB_trainset = build_dataset(RGB_data['train'])
IRRG_trainset = build_dataset(IRRG_data['train'])

# Human-readable color names for the RGB triplets in the dataset PALETTE
palette_map = {
    '[255, 255, 255]': 'white',  # (255, 255, 255) is white, not black
    '[0, 0, 255]': 'blue',
    '[0, 255, 0]': 'green',
    '[255, 0, 0]': 'red',
    '[255, 255, 0]': 'yellow',
    '[255, 0, 255]': '',
    '[0, 255, 255]': 'cyan',
}
# CLASSES = ('impervious_surface', 'building', 'low_vegetation', 'tree',
#            'car', 'clutter')
CLASSES = ('clutter', 'impervious_surface', 'building', 'low_vegetation',
           'tree', 'car')
palette = RGB_trainset.PALETTE
print({k: palette_map[str(v)] for k, v in enumerate(palette)})


def save_ann_with_custom_palette(ann_path, output_path, ann_name):
    """Print the label values of one annotation and save it re-colored."""
    ann = Image.open(ann_path)
    ann_array = np.array(ann)
    print(f'{ann_name}: {np.unique(ann_array)}')
    save_bin_mask(ann_array, ann_name, output_path)
    h, w = ann_array.shape
    classes = np.unique(ann_array)
    # uint8 so that imshow interprets the values as 0-255 RGB
    out_ann = np.zeros((h, w, 3), dtype=np.uint8)
    for cls in classes:
        indices = np.nonzero(ann_array == cls)
        out_ann[indices] = palette[cls]
    plt.figure()
    plt.title(ann_name)
    plt.imshow(out_ann)
    plt.savefig(output_path + f'{ann_name}.png')


def save_bin_mask(ann_array: np.ndarray, remark: str, output_path):
    """Save one binary mask per label value found in the annotation."""
    plt.figure()
    plt.suptitle(f'label {remark}')
    classes = np.unique(ann_array)
    _len = len(classes)
    subplot_w = int(np.ceil(np.sqrt(_len)))
    subplot_h = int(np.ceil(_len / subplot_w))
    gs = gridspec.GridSpec(subplot_h, subplot_w * 2)
    gs.update(wspace=0.8)
    for i, cls in enumerate(np.unique(ann_array)):
        bin_mask = (ann_array == cls).astype(np.float32)
        if _len - i >= subplot_w or _len % 2 == 0:
            plt.subplot(
                gs[i // subplot_w, i % subplot_w * 2: i % subplot_w * 2 + 2])
        else:
            plt.subplot(
                gs[i // subplot_w, i % subplot_w * 2 + 1: i % subplot_w * 2 + 3])
        plt.title(f'{cls}-{CLASSES[cls]}')
        # plt.title(f'class {cls} ({remark})')
        plt.imshow(bin_mask)
    plt.savefig(output_path + f'bin_mask {remark}')


ann0_path = f'/home/lqy/Desktop/DINO_semantic_seg/mmsegmentation/data' \
            f'/potsdam/ann_noboundary/train/2_10_0_0_512_512.png'
ann1_path = f'/home/lqy/Desktop/DINO_semantic_seg/mmsegmentation/data' \
            f'/potsdam/ann_all/train/2_10_0_0_512_512.png'
output_path = '/home/lqy/Desktop/DINO_semantic_seg/develop/dataset/'
save_ann_with_custom_palette(ann0_path, output_path, 'noboundary')
save_ann_with_custom_palette(ann1_path, output_path, 'all')
@DotWang Thanks for your replies.
I did not find your dataset preparation script in the repo at first, so I followed mmseg's official instructions, which seem to be mismatched with the PotsdamDataset class in this repo's potsdam.py. potsdam_ori.py is the one that should be used.
I searched for the reduce_zero_label parameter globally and tried to understand how it takes effect. The core logic is as follows:
if self.reduce_zero_label:
    # avoid using underflow conversion
    gt_semantic_seg[gt_semantic_seg == 0] = 255
    gt_semantic_seg = gt_semantic_seg - 1
    gt_semantic_seg[gt_semantic_seg == 254] = 255
which relabels the background class as 255.
There seem to be two places where reduce_zero_label takes effect:
- LoadAnnotation in the pipeline
- CustomDataset, when calculating the eval metrics
They both do the same thing, so it is natural to wonder whether the operation gets applied twice.
My understanding is that the train annotations are converted by the one in LoadAnnotation and the val annotations by the one in CustomDataset. We hardly ever call the eval function to check the model's segmentation performance on the training set; otherwise, the reduce operation would likely be performed twice.
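The effect of this logic, and why applying it twice would be harmful, can be seen on a small array. The sketch below copies the three statements quoted above into a standalone helper; the helper name and the sample arrays are illustrative only.

```python
import numpy as np


def reduce_zero_label(seg):
    """Apply the quoted logic: 0 becomes 255, every other label shifts down by one."""
    seg = seg.copy()
    seg[seg == 0] = 255
    seg = seg - 1
    seg[seg == 254] = 255
    return seg


# Labels as produced for the noBoundary zip (0-6) and the 'all' zip (1-6).
no_boundary = np.arange(7, dtype=np.uint8)
labels_all = np.arange(1, 7, dtype=np.uint8)

once = reduce_zero_label(no_boundary)
twice = reduce_zero_label(once)  # a second pass shifts the classes again
print(once, twice, reduce_zero_label(labels_all))
```

After one pass the seven values become 255 followed by 0-5; after a second pass the former class 0 is also turned into 255 and the remaining classes shift down once more, which is why the operation must run exactly once per annotation.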
Back to the point: if the user follows mmseg's official dataset preparation, the labels seem to go through the following mapping process (the color format is RGB):
{0 : (255, 255, 255), # Impervious surfaces (white)
1 : (0, 0, 255), # Buildings (blue)
2 : (0, 255, 255), # Low vegetation (cyan)
3 : (0, 255, 0), # Trees (green)
4 : (255, 255, 0), # Cars (yellow)
5 : (255, 0, 0), # Clutter (red)
6 : (0, 0, 0)} # Undefined (black)
        |
        |  transformation in `convert_datasets/potsdam.py`
        v
{0 : (0, 0, 0), # Undefined (black)
1 : (255, 255, 255), # Impervious surfaces (white)
2 : (0, 0, 255), # Buildings (blue)
3 : (0, 255, 255), # Low vegetation (cyan)
4 : (0, 255, 0), # Trees (green)
5 : (255, 255, 0), # Cars (yellow)
6 : (255, 0, 0)} # Clutter (red)
        |
        |  reduce_zero_label in `LoadAnnotation`
        v
{0 : (255, 255, 255), # Impervious surfaces (white)
1 : (0, 0, 255), # Buildings (blue)
2 : (0, 255, 255), # Low vegetation (cyan)
3 : (0, 255, 0), # Trees (green)
4 : (255, 255, 0), # Cars (yellow)
5 : (255, 0, 0), # Clutter (red)
255 : (0, 0, 0)} # Undefined (black)
Hence, the official potsdam_ori.py uses ignore_index=255.
They use 'label_noBoundary.zip', whose converted labels (the second mapping) are [0 1 2 3 4 5 6].
With reduce_zero_label=True, the labels become [255 0 1 2 3 4 5], so ignore_index was set to 255, which makes the Undefined background the ignored category. However, shouldn't both 5 and 255 actually be ignored?
In ViTAE-RS, 'label_all.zip' is used, whose converted labels are [1 2 3 4 5 6].
With 'reduce_zero_label=True', the labels become [0 1 2 3 4 5], so ignore_index was set to 5.
With 'reduce_zero_label=False', the labels stay [1 2 3 4 5 6] and the ignored index would be 6, but then 0 is an extra label. Hence, the transformation should be removed in this repo to keep the original [0 1 2 3 4 5], where 5 is the ignored Clutter class and the Undefined background class 6 is not annotated, so ignore_index=5.
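On the question of ignoring both 5 and 255 in the noBoundary case: since only one ignore_index can be configured, one possible workaround is to fold the clutter label into the ignore value before training or evaluation. This is a hypothetical sketch, not something either repo is stated to do; `ignore_clutter` and its defaults are made up for illustration.

```python
import numpy as np

IGNORE = 255   # value used for the Undefined background after reduce_zero_label
CLUTTER = 5    # clutter index after reduce_zero_label


def ignore_clutter(seg, clutter_index=CLUTTER, ignore_index=IGNORE):
    """Return a copy of seg in which clutter pixels carry the ignore value."""
    out = seg.copy()
    out[out == clutter_index] = ignore_index
    return out


# Label values of the noBoundary route after reduce_zero_label.
reduced = np.array([255, 0, 1, 2, 3, 4, 5], dtype=np.uint8)
merged = ignore_clutter(reduced)
valid = merged != IGNORE   # pixels that would still count toward loss and metrics
print(merged, int(valid.sum()))
```

With the labels merged this way, a single ignore_index=255 and num_classes=5 would cover both categories.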
@Li-Qingyun
Haha, the script for preparing the Potsdam dataset comes from our previous projects, so we used it instead of the mmseg transformation in this work. We did not upload the script because we think mmseg is highly customizable for users.
Most of your understanding is right. The transformation convert_datasets/potsdam.py exists in the original mmseg; we did not use this folder at all and simply uploaded it as-is.
The mIoUs shown on the mmseg site include all categories except "Undefined". However, in the RS literature, "Clutter" is also treated as background and does not take part in the metric calculation. In fact, it is fine either way whether or not this category is masked during training. For convenience, we also ignore it when training models.
@DotWang Thank you very much for your help and your detailed, patient explanation. I finally reproduced the results in the paper and can now focus on my own research. I wish you all the best with your research. Thank you!