hasnainraz / semsegpipeline
A simpler way of reading and augmenting image segmentation data into TensorFlow
License: MIT License
Can we do the data augmentation with the map function?
It seems that no additional augmentation arguments can be passed to the _parse_data function.
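One common way to get extra settings into a per-element function is to bind them ahead of time, so that the function handed to a map call still takes only the image path and mask path. The sketch below uses plain Python and a hypothetical parse_and_augment function (not part of the repository) to show the binding; the assumption is that the bound callable would then be passed to the dataset's map in place of _parse_data.

```python
from functools import partial


def parse_and_augment(image_path, mask_path, flip, crop_percent):
    """Hypothetical stand-in for a per-element map function."""
    # A real pipeline would read and augment the files here; this sketch
    # only records which settings reached the function.
    return {"image": image_path, "mask": mask_path,
            "flip": flip, "crop_percent": crop_percent}


# Bind the augmentation settings up front. The resulting callable takes
# only the two per-element arguments.
map_fn = partial(parse_and_augment, flip=True, crop_percent=0.8)

pairs = [("img_0.png", "mask_0.png"), ("img_1.png", "mask_1.png")]
results = [map_fn(image, mask) for image, mask in pairs]
```

Storing the settings on the loader object and reading them from self inside the mapped method is an equivalent design.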
HasnainRaz,
I found that TensorFlow does not seem to support the TIFF (.tif) format. Are there any solutions?
Yours sincerely,
liuhao
It looks like image_size is supposed to be a single number according to the code, even though the documentation describes it as a tuple. It throws an error when I run the example unmodified.
Also, in Python 3.6, os.listdir() may return the file names in a different order for the image and mask directories, which can pair an image with the wrong mask. I got mismatched image-mask pairs from the dataset (using the 'example-use' code) when I used JPG images as the source, even after changing the extension in the code. I think it would be better to sort both lists so they are properly paired before any other operation or shuffling is applied, assuming the images are consistently named.
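A minimal sketch of the sorting suggestion, assuming each image and its mask share the same file name apart from the extension. The helper names list_sorted and check_pairs are made up for this example and are not part of the repository.

```python
import os
import tempfile


def list_sorted(directory, extension):
    """Return full paths of files with the given extension, sorted by name."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(extension))
    return [os.path.join(directory, n) for n in names]


def check_pairs(image_paths, mask_paths):
    """Raise ValueError if the lists differ in length or any stems differ."""
    if len(image_paths) != len(mask_paths):
        raise ValueError("Different number of images and masks")
    for image_path, mask_path in zip(image_paths, mask_paths):
        image_stem = os.path.splitext(os.path.basename(image_path))[0]
        mask_stem = os.path.splitext(os.path.basename(mask_path))[0]
        if image_stem != mask_stem:
            raise ValueError(f"Mismatched pair: {image_path} / {mask_path}")


# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    img_dir = os.path.join(root, "images")
    mask_dir = os.path.join(root, "masks")
    os.makedirs(img_dir)
    os.makedirs(mask_dir)
    for name in ("97", "76", "136"):
        open(os.path.join(img_dir, name + ".jpg"), "w").close()
        open(os.path.join(mask_dir, name + ".png"), "w").close()

    image_paths = list_sorted(img_dir, ".jpg")
    mask_paths = list_sorted(mask_dir, ".png")
    check_pairs(image_paths, mask_paths)
    stems = [os.path.splitext(os.path.basename(p))[0] for p in image_paths]
```

The sort is lexicographic, so "136" comes before "76". That is fine for pairing as long as both directories are sorted the same way.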
Since the images are parsed one at a time, the shuffling function doesn't do anything currently. It should be moved to after the prefetch command.
Great tool Hasnain, thank you for this!
This works if you have a 1:1 ratio of images to masks (i.e. one image file with one corresponding mask file).
What if your data is structured with more than one mask file (each mask file containing a single mask) for each image file (one mask.png file for each mask polygon shape)?
For example, say you have 3 masks on, say, image # 17. So your image file name is image_17.jpg.
Your mask files are:
mask_17_0.png
mask_17_1.png
mask_17_2.png
How would you modify this script to accept a list of mask files, rather than a single mask file, for each image?
'data/training/masks/mask_0_1.png',
'data/training/masks/mask_0_2.png',]
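One possible starting point for the multi-mask question is to group the mask paths by image number before building the dataset. The sketch below assumes the mask_<image>_<index>.png naming from the example above; MASK_PATTERN and group_masks are hypothetical names, and how the grouped masks are then combined (for instance, stacked along the channel axis) is left open.

```python
import re
from collections import defaultdict

# Assumed naming scheme: mask_<image number>_<mask index>.png
MASK_PATTERN = re.compile(r"mask_(\d+)_(\d+)\.png$")


def group_masks(mask_paths):
    """Map each image number to its mask paths, ordered by mask index."""
    groups = defaultdict(list)
    for path in mask_paths:
        match = MASK_PATTERN.search(path)
        if match is None:
            continue  # skip files that do not follow the naming scheme
        image_id = int(match.group(1))
        mask_id = int(match.group(2))
        groups[image_id].append((mask_id, path))
    return {image_id: [path for _, path in sorted(items)]
            for image_id, items in groups.items()}


mask_paths = ["masks/mask_17_2.png", "masks/mask_17_0.png",
              "masks/mask_17_1.png", "masks/mask_3_0.png"]
grouped = group_masks(mask_paths)
```

Here grouped[17] holds the three masks for image 17 in index order, so each image path can be paired with a list of mask paths instead of a single one.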
train_image_paths = [os.path.join(train_img_dir, x) for x in os.listdir(train_img_dir) if x.endswith('.png')]
train_mask_paths = [os.path.join(train_mask_dir, x) for x in os.listdir(train_mask_dir) if x.endswith('.png')]
print(train_image_paths[0:3])
print(train_mask_paths[0:3])
Which outputs:
['dataset/train_dir/uav_aerials/97.png', 'dataset/train_dir/uav_aerials/76.png', 'dataset/train_dir/uav_aerials/136.png']
['dataset/train_dir/uav_masks/97.png', 'dataset/train_dir/uav_masks/76.png', 'dataset/train_dir/uav_masks/136.png']
So I think the image and mask paths are correct.
My mask image is in RGB channel format, and the foreground color is (128,0,0)
color_list = [[0, 0, 0], [128, 0, 0]]
train_dl = DataLoader(image_paths=train_image_paths,
                      mask_paths=train_mask_paths,
                      image_size=(256,256),
                      channels=(3,3),
                      augment=True,
                      one_hot_encoding=True,
                      palette=color_list)
train_ds = train_dl.data_batch(batch_size=BATCH_SIZE,shuffle=True)
for image, mask in train_ds.take(1):
    print(image,mask)
2022-04-24 09:40:55.352224: W tensorflow/core/framework/op_kernel.cc:1733] INVALID_ARGUMENT: ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.
Traceback (most recent call last):
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 269, in __call__
return func(device, token, args)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 147, in __call__
outputs = self._call(device, args)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 154, in _call
ret = self._func(*args)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/impl/api.py", line 642, in wrapper
return func(*args, **kwargs)
File "/tmp/__autograph_generated_filenbagilbr.py", line 57, in _augmentation_func
ag__.if_stmt(ag__.ld(self).augment, if_body_1, else_body_1, get_state_1, set_state_1, ('image_f', 'mask_f'), 2)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1321, in if_stmt
_py_if_stmt(cond, body, orelse)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1374, in _py_if_stmt
return body() if cond else orelse()
File "/tmp/__autograph_generated_filenbagilbr.py", line 50, in if_body_1
ag__.if_stmt(ag__.ld(self).compose, if_body, else_body, get_state, set_state, ('image_f', 'mask_f'), 2)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1321, in if_stmt
_py_if_stmt(cond, body, orelse)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/operators/control_flow.py", line 1374, in _py_if_stmt
return body() if cond else orelse()
File "/tmp/__autograph_generated_filenbagilbr.py", line 47, in else_body
(image_f, mask_f) = ag__.converted_call(ag__.ld(augment_func), (ag__.ld(image_f), ag__.ld(mask_f)), None, fscope_1)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/impl/api.py", line 335, in converted_call
return _call_unconverted(f, args, kwargs, options, False)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/autograph/impl/api.py", line 459, in _call_unconverted
return f(*args)
File "/home/ylzhao/project/potsdam_seg/dataloader.py", line 95, in _crop_random
h = tf.cast(shape[0] * self.crop_percent, tf.int32)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 102, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Attempt to convert a value (None) with an unsupported type (<class 'NoneType'>) to a Tensor.
2022-04-24 09:40:55.369289: W tensorflow/core/framework/op_kernel.cc:1733] INVALID_ARGUMENT: ValueError: Tensor conversion requested dtype uint8 for Tensor with dtype float32: <tf.Tensor: shape=(256, 256, 2), dtype=float32, numpy=
array([[[1., 0.],
[1., 0.],
[1., 0.],
...,
[1., 0.],
[1., 0.],
[1., 0.]],
[[1., 0.],
[1., 0.],
[1., 0.],
...,
[1., 0.],
[1., 0.],
[1., 0.]],
[[1., 0.],
[1., 0.],
[1., 0.],
...,
[1., 0.],
[1., 0.],
[1., 0.]],
...,
[[1., 0.],
[1., 0.],
[1., 0.],
...,
[0., 1.],
[0., 1.],
[0., 1.]],
[[1., 0.],
[1., 0.],
[1., 0.],
...,
[0., 1.],
[0., 1.],
[0., 1.]],
[[1., 0.],
[1., 0.],
[1., 0.],
...,
[0., 1.],
[0., 1.],
[0., 1.]]], dtype=float32)>
Traceback (most recent call last):
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 269, in __call__
return func(device, token, args)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 147, in __call__
outputs = self._call(device, args)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 163, in _call
outputs = [
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 164, in <listcomp>
_maybe_copy_to_context_device(self._convert(x, dtype=dtype),
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/ops/script_ops.py", line 131, in _convert
return ops.convert_to_tensor(value, dtype=dtype)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/profiler/trace.py", line 183, in wrapped
return func(*args, **kwargs)
File "/home/ylzhao/anaconda3/envs/airbus/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1662, in convert_to_tensor
raise ValueError(
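Reading the two errors against the loader code suggests two likely causes, offered here as an interpretation rather than a confirmed diagnosis. First, the loader was created without crop_percent, so it stays None, and _crop_random fails when it is picked as the random augmentation. Second, the one-hot mask is cast to float32 while the map function appears to declare uint8 as the mask output type. The plain-Python sketch below only illustrates the two decisions; choose_augmentations and mask_dtype_name are hypothetical helpers, not functions from the repository.

```python
def choose_augmentations(crop_percent):
    """Return the augmentation names that are safe to apply.

    Random cropping is offered only when a crop percentage was configured.
    """
    names = ["brightness", "contrast", "saturation", "flip_left_right"]
    if crop_percent is not None:
        names.append("crop_random")
    return names


def mask_dtype_name(one_hot_encoding):
    """Return the dtype name the mask output should be declared with.

    A one-hot mask is float32 after encoding; a raw decoded mask is uint8.
    """
    return "float32" if one_hot_encoding else "uint8"


without_crop = choose_augmentations(None)
with_crop = choose_augmentations(0.8)
```

Under this reading, passing a crop_percent to the loader and declaring the mask output type according to one_hot_encoding would avoid both errors.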
Hi, thank you for your work! It looks like a really nice and simple data loading pipeline. Could you please modify the script so that it can return 3 masks (for the case of having more than one output)? In my case I only have 1 input, but it would also be nice to have the option for both multiple inputs and multiple outputs.
Thank you in advance.
Hi! Thank you for sharing your work.
I would like to ask about some issues I ran into, if you could please help me. First of all, the code I ran is a combination of your "dataloader.py" file and your example code:
from typing import List, Tuple
import tensorflow as tf
import random

AUTOTUNE = tf.data.experimental.AUTOTUNE


class DataLoader(object):
    """A TensorFlow Dataset API based loader for semantic segmentation problems."""

    def __init__(self, image_paths: List[str], mask_paths: List[str], image_size: Tuple[int],
                 channels: Tuple[int] = (3, 3), crop_percent: float = None, seed: int = None,
                 augment: bool = True, compose: bool = False, one_hot_encoding: bool = False, palette=None):
        """
        Initializes the data loader object
        Args:
            image_paths: List of paths of train images.
            mask_paths: List of paths of train masks (segmentation masks)
            image_size: Tuple, the final height, width of the loaded images.
            channels: Tuple of ints, first element is number of channels in images,
                      second is the number of channels in the mask image (needed to
                      correctly read the images into tensorflow and apply augmentations)
            crop_percent: Float in the range 0-1, defining percentage of image
                          to randomly crop.
            palette: A list of RGB pixel values in the mask. If specified, the mask
                     will be one hot encoded along the channel dimension.
            seed: An int, if not specified, chosen randomly. Used as the seed for
                  the RNG in the data pipeline.
        """
        self.image_paths = image_paths
        self.mask_paths = mask_paths
        self.palette = palette
        self.image_size = image_size
        self.augment = augment
        self.compose = compose
        self.one_hot_encoding = one_hot_encoding
        if crop_percent is not None:
            if 0.0 < crop_percent <= 1.0:
                self.crop_percent = tf.constant(crop_percent, tf.float32)
            elif 0 < crop_percent <= 100:
                self.crop_percent = tf.constant(crop_percent / 100., tf.float32)
            else:
                raise ValueError("Invalid value entered for crop size. Please use an "
                                 "integer between 0 and 100, or a float between 0 and 1.0")
        else:
            self.crop_percent = None
        self.channels = channels
        if seed is None:
            self.seed = random.randint(0, 1000)
        else:
            self.seed = seed

    def _corrupt_brightness(self, image, mask):
        """
        Randomly applies a random brightness change.
        """
        cond_brightness = tf.cast(tf.random.uniform(
            [], maxval=2, dtype=tf.int32), tf.bool)
        image = tf.cond(cond_brightness, lambda: tf.image.random_brightness(
            image, 0.1), lambda: tf.identity(image))
        return image, mask

    def _corrupt_contrast(self, image, mask):
        """
        Randomly applies a random contrast change.
        """
        cond_contrast = tf.cast(tf.random.uniform(
            [], maxval=2, dtype=tf.int32), tf.bool)
        image = tf.cond(cond_contrast, lambda: tf.image.random_contrast(
            image, 0.1, 0.8), lambda: tf.identity(image))
        return image, mask

    def _corrupt_saturation(self, image, mask):
        """
        Randomly applies a random saturation change.
        """
        cond_saturation = tf.cast(tf.random.uniform(
            [], maxval=2, dtype=tf.int32), tf.bool)
        image = tf.cond(cond_saturation, lambda: tf.image.random_saturation(
            image, 0.1, 0.8), lambda: tf.identity(image))
        return image, mask

    def _crop_random(self, image, mask):
        """
        Randomly crops image and mask in accord.
        """
        cond_crop_image = tf.cast(tf.random.uniform(
            [], maxval=2, dtype=tf.int32, seed=self.seed), tf.bool)
        shape = tf.cast(tf.shape(image), tf.float32)
        h = tf.cast(shape[0] * self.crop_percent, tf.int32)
        w = tf.cast(shape[1] * self.crop_percent, tf.int32)
        comb_tensor = tf.concat([image, mask], axis=2)
        comb_tensor = tf.cond(cond_crop_image, lambda: tf.image.random_crop(
            comb_tensor, [h, w, self.channels[0] + self.channels[1]], seed=self.seed), lambda: tf.identity(comb_tensor))
        image, mask = tf.split(comb_tensor, [self.channels[0], self.channels[1]], axis=2)
        return image, mask

    def _flip_left_right(self, image, mask):
        """
        Randomly flips image and mask left or right in accord.
        """
        comb_tensor = tf.concat([image, mask], axis=2)
        comb_tensor = tf.image.random_flip_left_right(comb_tensor, seed=self.seed)
        image, mask = tf.split(comb_tensor, [self.channels[0], self.channels[1]], axis=2)
        return image, mask

    def _resize_data(self, image, mask):
        """
        Resizes images to specified size.
        """
        image = tf.image.resize(image, self.image_size)
        mask = tf.image.resize(mask, self.image_size, method="nearest")
        return image, mask

    def _parse_data(self, image_paths, mask_paths):
        """
        Reads image and mask files depending on
        specified extension.
        """
        image_content = tf.io.read_file(image_paths)
        mask_content = tf.io.read_file(mask_paths)
        images = tf.image.decode_jpeg(image_content, channels=self.channels[0])
        masks = tf.image.decode_jpeg(mask_content, channels=self.channels[1])
        return images, masks

    def _one_hot_encode(self, image, mask):
        """
        Converts mask to a one-hot encoding specified by the semantic map.
        """
        one_hot_map = []
        for colour in self.palette:
            class_map = tf.reduce_all(tf.equal(mask, colour), axis=-1)
            one_hot_map.append(class_map)
        one_hot_map = tf.stack(one_hot_map, axis=-1)
        one_hot_map = tf.cast(one_hot_map, tf.float32)
        return image, one_hot_map

    @tf.function
    def _map_function(self, images_path, masks_path):
        image, mask = self._parse_data(images_path, masks_path)

        def _augmentation_func(image_f, mask_f):
            if self.augment:
                if self.compose:
                    image_f, mask_f = self._corrupt_brightness(image_f, mask_f)
                    image_f, mask_f = self._corrupt_contrast(image_f, mask_f)
                    image_f, mask_f = self._corrupt_saturation(image_f, mask_f)
                    image_f, mask_f = self._crop_random(image_f, mask_f)
                    image_f, mask_f = self._flip_left_right(image_f, mask_f)
                else:
                    options = [self._corrupt_brightness,
                               self._corrupt_contrast,
                               self._corrupt_saturation,
                               self._crop_random,
                               self._flip_left_right]
                    augment_func = random.choice(options)
                    image_f, mask_f = augment_func(image_f, mask_f)
            if self.one_hot_encoding:
                if self.palette is None:
                    raise ValueError('No Palette for one-hot encoding specified in the data loader! '
                                     'please specify one when initializing the loader.')
                image_f, mask_f = self._one_hot_encode(image_f, mask_f)
            image_f, mask_f = self._resize_data(image_f, mask_f)
            return image_f, mask_f

        return tf.py_function(_augmentation_func, [image, mask], [tf.float32, tf.uint8])

    def data_batch(self, batch_size, shuffle=False):
        """
        Reads data, normalizes it, shuffles it, then batches it, returns
        the next element in dataset op and the dataset initializer op.
        Inputs:
            batch_size: Number of images/masks in each batch returned.
            augment: Boolean, whether to augment data or not.
            shuffle: Boolean, whether to shuffle data in buffer or not.
            one_hot_encode: Boolean, whether to one hot encode the mask image or not.
                            Encoding will be done according to the palette specified when
                            initializing the object.
        Returns:
            data: A tf dataset object.
        """
        # Create dataset out of the 2 files:
        data = tf.data.Dataset.from_tensor_slices((self.image_paths, self.mask_paths))
        # Parse images and labels
        data = data.map(self._map_function, num_parallel_calls=AUTOTUNE)
        if shuffle:
            # Prefetch, shuffle then batch
            data = data.prefetch(AUTOTUNE).shuffle(random.randint(0, len(self.image_paths))).batch(batch_size)
        else:
            # Batch and prefetch
            data = data.batch(batch_size).prefetch(AUTOTUNE)
        return data

import os

MAIN_PATH = 'D:/code/test12/'
IMAGE_DIR_PATH = MAIN_PATH+'VOC2012/JPEGImages/'
MASK_DIR_PATH = MAIN_PATH+'VOC2012/SegmentationObject/'
BATCH_SIZE = 4

# create list of PATHS
image_paths = [os.path.join(IMAGE_DIR_PATH, x) for x in os.listdir(IMAGE_DIR_PATH) if x.endswith('.jpg')]
mask_paths = [os.path.join(MASK_DIR_PATH, x) for x in os.listdir(MASK_DIR_PATH) if x.endswith('.png')]

# Initialize the dataloader object
dataset = DataLoader(image_paths=image_paths,
                     mask_paths=mask_paths,
                     image_size=(256, 256),
                     crop_percent=0.8,
                     channels=(3, 1),
                     augment=True,
                     compose=False,
                     seed=47)

# Parse the images and masks, and return the data in batches, augmented optionally.
dataset = dataset.data_batch(batch_size=BATCH_SIZE,shuffle=True)

# Initialize the data queue
for image, mask in dataset:
    # Do whatever you want now
    pass

It's probably a silly question, but I'm quite new to Python: in order to import something (like from dataloader import DataLoader), you have to have installed it first, right? In this case, how is the "installation" done? What I did (which I think is not the right approach) was to copy the code of "dataloader.py" and paste it at the beginning of my code (your example, actually). So I ran the content of "dataloader.py" and then the rest of the code.
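On the import question: a single .py file does not need to be installed. Python can import it when the file sits in the same directory as the script being run, or in any directory listed in sys.path. The sketch below demonstrates this with a throwaway stand-in module named dataloader_demo (a made-up name, used so the example does not depend on the real dataloader.py).

```python
import importlib
import os
import sys
import tempfile

# Write a tiny stand-in module to a temporary directory. With the real
# project, this step is simply "save dataloader.py next to your script".
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "dataloader_demo.py"), "w") as handle:
    handle.write("class DataLoader:\n    pass\n")

# Make the directory importable. This is only needed when the module is
# not already in the directory of the script being run.
sys.path.insert(0, module_dir)

# Equivalent to: from dataloader_demo import DataLoader
module = importlib.import_module("dataloader_demo")
DataLoader = module.DataLoader
loader = DataLoader()
```

With dataloader.py saved next to the example script, the line from dataloader import DataLoader should work without copying the class into the script.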
I downloaded the PASCAL VOC dataset and added the paths in the code (IMAGE_DIR_PATH & MASK_DIR_PATH). When running the line dataset = dataset.data_batch(batch_size=BATCH_SIZE,shuffle=True), I get this error:
Traceback (most recent call last):
File "<ipython-input-15-d91f0c5e698b>", line 1, in <module>
dataset = dataset.data_batch(batch_size=BATCH_SIZE,shuffle=True)
File "<ipython-input-1-83f210ebd7b2>", line 202, in data_batch
data = tf.data.Dataset.from_tensor_slices((self.image_paths, self.mask_paths))
File "C:\Anaconda3\envs\spyder\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 682, in from_tensor_slices
return TensorSliceDataset(tensors)
File "C:\Anaconda3\envs\spyder\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3010, in __init__
batch_dim.assert_is_compatible_with(tensor_shape.Dimension(
File "C:\Anaconda3\envs\spyder\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 279, in assert_is_compatible_with
raise ValueError("Dimensions %s and %s are not compatible" %
ValueError: Dimensions 17125 and 2913 are not compatible
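The error reports that the image list has 17125 entries while the mask list has 2913, so the two lists cannot be sliced together. One way around it, sketched below with hypothetical helper names stem and match_by_stem, is to keep only the images that have a mask with the same base file name. The file names in the demo are placeholders, not real dataset entries.

```python
import os


def stem(path):
    """Return the file name without directory or extension."""
    return os.path.splitext(os.path.basename(path))[0]


def match_by_stem(image_paths, mask_paths):
    """Keep only images that have a mask with the same stem, in sorted order."""
    masks_by_stem = {stem(path): path for path in mask_paths}
    matched_images = []
    matched_masks = []
    for image_path in sorted(image_paths):
        mask_path = masks_by_stem.get(stem(image_path))
        if mask_path is not None:
            matched_images.append(image_path)
            matched_masks.append(mask_path)
    return matched_images, matched_masks


# Placeholder file names: three images, but only two of them have masks.
images = ["imgs/a.jpg", "imgs/b.jpg", "imgs/c.jpg"]
masks = ["masks/c.png", "masks/a.png"]
matched_images, matched_masks = match_by_stem(images, masks)
```

The two returned lists have equal length and are aligned element by element, which is what the loader's image_paths and mask_paths arguments need.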
one_hot_encode=True as an argument in dataset.data_batch(), is this right? When I do that, i.e. dataset = dataset.data_batch(batch_size=BATCH_SIZE,shuffle=True,one_hot_encode=True), and run this line, I get this error:
Traceback (most recent call last):
File "<ipython-input-16-ff6e561e9f62>", line 1, in <module>
dataset = dataset.data_batch(batch_size=BATCH_SIZE,shuffle=True,one_hot_encode=True)
TypeError: data_batch() got an unexpected keyword argument 'one_hot_encode'
Thank you in advance for your time and any possible help.