hamedmp / imageflow

A simple wrapper around TensorFlow for converting and importing (and, soon, training on) images.

License: Apache License 2.0

imageflow's Introduction

Notice: this version of imageflow is no longer maintained and requires a major update.

The TensorFlow version it targets is too old, and the library does not work as expected. You are welcome to add your use cases to the Issues as feature requests to be considered for new versions. Sorry for the inconvenience.

ImageFlow

A simple wrapper around TensorFlow for converting and importing (and, soon, training on) images.

Installation:

pip install imageflow

Usage:

import imageflow

Convert a directory of images and their labels to `.tfrecords`

Calling the following function creates a filename.tfrecords file in the converted_data directory under your project's root (the directory you call it from).

convert_images(images, labels, filename)

images should be an array of shape [-1, height, width, channel] with the same number of rows as labels.

Read distorted and normal data from `.tfrecords` in a multi-threaded manner:

# Distorted images for training
images, labels = distorted_inputs(filename='../my_data_raw/train.tfrecords',
                                  batch_size=FLAGS.batch_size,
                                  num_epochs=FLAGS.num_epochs,
                                  num_threads=5, imshape=[32, 32, 3], imsize=32)

# Normal images for validation
val_images, val_labels = inputs(filename='../my_data_raw/validation.tfrecords',
                                batch_size=FLAGS.batch_size,
                                num_epochs=FLAGS.num_epochs,
                                num_threads=5, imshape=[32, 32, 3])

Dependencies:

TensorFlow (version >= 0.7.0)
Numpy
Pillow

imageflow's Issues

Provide the test example for cifar prediction

Would you provide a test example based on test.tfrecords?
After the model has finished training, I want to get the predicted label for every test image and save the labels to a file.

IndexError: tuple index out of range in convert_to

Dear @HamedMP,
I use the files in the example folder to read images and convert them to a tfrecords file, but I get some errors.
I am using a grayscale transform, so my images are 2D arrays.
First, I got a "too many indices" error on the line: validation_images = train_images[:FLAGS.validation_size, :, :, :]
Changing [:FLAGS.validation_size, :, :, :] to [:FLAGS.validation_size] fixed that.
My other issue is "IndexError: tuple index out of range" in the convert_to function (on the line "rows = images.shape[1]").
I don't understand why we use "rows = images.shape[1]" when our images have different sizes and, moreover, have no depth axis in the grayscale version.

Best,
MRHajbabaei.
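
A possible reading of the error above, not confirmed in this thread: convert_to appears to expect a 4-D batch shaped [n, height, width, channel]. A grayscale batch has no channel axis, and a list of differently sized images typically collapses to a 1-D array in older NumPy versions, which would explain why shape[1] is out of range. The helper below is a hypothetical sketch (with_channel_axis is not part of imageflow) that works on plain shape tuples so it runs without NumPy.

```python
def with_channel_axis(shape):
    """Return a 4-D (n, height, width, channel) shape for a batch shape.

    Hypothetical helper, not part of imageflow. With a NumPy array the
    matching operation would be images[..., np.newaxis].
    """
    shape = tuple(shape)
    if len(shape) == 4:
        # Already (n, height, width, channel): nothing to do.
        return shape
    if len(shape) == 3:
        # Grayscale batch (n, height, width): add a channel axis of size 1.
        return shape + (1,)
    # A 1-D shape usually means the images differ in size, so they could
    # not be stacked into one array; resize them to a common size first.
    raise ValueError(
        "expected a 3-D or 4-D batch shape, got %r; "
        "resize all images to the same size first" % (shape,)
    )


grayscale_batch = with_channel_axis((100, 28, 28))
colour_batch = with_channel_axis((100, 32, 32, 3))
```

This only addresses the missing depth axis. Images of different sizes would still need to be resized to one common size before conversion.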

Anaconda

How can I install ImageFlow in Anaconda?

Image flow example: Receive only a portion of tfrecords images

Dear @HamedMP

I used your code from the example folder to train my network. However, when I start training I receive only a portion of the images in my tfrecords file (there even seems to be a formula for it: Number_of_images = 172*(FLAGS.num_epochs/FLAGS.batch_size), I guess).
I know that coord.should_stop() stops training when a thread should stop, but I cannot see why it stops the network so quickly, even though my epochs have not completed yet.

Another question: what does FLAGS.num_epochs do in this code? (I don't understand the difference between FLAGS.num_epochs in your code and epoch_numer=all_data_size/batch_size.)

thank you very much.
mohammad reza
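
On the num_epochs question, as far as I know: in TensorFlow's queue-based input pipeline, num_epochs caps how many times tf.train.string_input_producer hands out each file name. Once the records are used up the queue raises an out-of-range error and coord.should_stop() becomes true. So num_epochs counts passes over the data, whereas all_data_size/batch_size is the number of batches in one pass. The arithmetic can be sketched as follows; the dataset size and batch size are made-up numbers, and the sketch assumes the batching op drops a final partial batch.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Batches needed to see every example once (last one may be partial)."""
    return math.ceil(num_examples / batch_size)


def full_batches_before_stop(num_examples, batch_size, num_epochs):
    """Full batches available before the input queue runs dry.

    Assumes the reader yields each example num_epochs times and that a
    final partial batch is dropped.
    """
    total_examples = num_examples * num_epochs
    return total_examples // batch_size


# Hypothetical numbers: 1000 images, batches of 128, 2 passes over the data.
one_pass = steps_per_epoch(1000, 128)
until_stop = full_batches_before_stop(1000, 128, 2)
```

If training stops much earlier than expected, comparing the observed step count with this figure is a quick way to check whether num_epochs is the limiting factor.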

make a prediction with new image?

Hi HamedMP,

I have been studying this library and find it amazing.
I trained my own model and want to predict the label of a new image.
I'm not sure how to do it; could you share your knowledge on this? Thanks.

Regards,
Deeperic

Multilabel classification example

Would it be possible to also include an example of how to create the TFRecord with multiple labels? I've been working on this but am stuck; I'm not sure whether I'm interpreting the example incorrectly or what the issue is.

In def convert_to, I've changed 'label': _int64_feature(int(labels[index])) to 'label': labels[index,:], where labels is an array of n by 5--n examples with 5 different labels representing 5 different attributes of a possible image. To create the labels array, I use:

labels[i,:] = [_int64_feature(int(l1)), _int64_feature(int(l2)), _int64_feature(int(l3)), _int64_feature(int(l4)), _int64_feature(int(l5))]

where I loop through each line of my text file, parse it for the 5 specific numbers I need (read in as strings), cast them as ints, and then pass them to the _int64_feature function.

However, I get an error on _int64_feature(int(l5)) alone: TypeError: float() argument must be a string or a number. Before I ran into that error, using the convert_to function led to this one: TypeError: Parameter to MergeFrom() must be instance of same class: expected Feature got ndarray.

Apologies if this isn't the correct place to ask this question, I just figured that it'd be helpful for anyone else who's working on a similar thing to be able to refer to this issue to resolve their errors as well.
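
A possible way around both errors, offered as a sketch rather than a confirmed fix: keep the labels array purely numeric and build a single Feature per example at write time, instead of storing Feature objects inside the array. The second error message itself says a Feature was expected where an ndarray was passed. The parser below assumes a whitespace-separated line format, which the issue does not specify; parse_label_line is a hypothetical helper.

```python
def parse_label_line(line, num_labels=5):
    """Parse one text line into a list of plain Python ints.

    Assumes whitespace-separated integer fields (the real file format is
    not given in the issue).
    """
    fields = line.split()
    if len(fields) != num_labels:
        raise ValueError(
            "expected %d labels, got %d" % (num_labels, len(fields))
        )
    return [int(field) for field in fields]


# Two made-up lines standing in for the text file.
rows = [parse_label_line(line) for line in ["0 1 0 2 4", "1 0 3 0 2"]]

# rows now holds only ints, so it can be stored in an n-by-5 integer array.
# At write time, one Feature can carry all five values of example i, e.g.:
#   'label': tf.train.Feature(int64_list=tf.train.Int64List(value=rows[i]))
```

When reading the records back, the label feature would then need a matching fixed length of 5 instead of a scalar.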

Name conflict - disambiguation?

Hi, I'm the author of https://github.com/imazen/imageflow - another image processing tool. Do you think we should add a disambiguation paragraph on each repo?

my_cifar_train.py variables initialisation issue

from: init = tf.initialize_all_variables()
to: init = tf.group(tf.initialize_all_variables(), tf.initialize_local_variables())
as stated in: http://stackoverflow.com/questions/38136081/setting-num-epochs-on-tf-train-string-input-producer-produces-an-error

reader.py missing

It seems that reader.py doesn't get installed when you do pip install imageflow.

Edit: I looked in my package directory and did see the reader.py and util.py files, but for some reason an error is thrown when I attempt to import imageflow, saying that reader is missing. I was able to fix it by putting the files in the same directory as the .py file I am working with.

tensorflow tf.image.resize_images raise ValueError

I used ImageFlow to create a TensorFlow tfrecords file from my own images. My images all have the same size, 223x334x3. I tried the cifar example on them. Because the cifar convnet needs 32x32 images, I have to resize them.

My function to read the tfrecords file, decode the images and resize them is as follows:

def read_my_file_format(filename_queue):
    reader = tf.TFRecordReader()
    key, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(
        serialized_example,
        features={
            'image_raw': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64)
        })

    image = tf.decode_raw(features['image_raw'], tf.uint8)
    image = tf.cast(image, tf.float32)
    label = tf.cast(features['label'], tf.int32)

    image = tf.reshape(image, [223446])
    image.set_shape(223446)
    print 'get shape: ',image.get_shape()

    print 'image: ',image.get_shape()
    print 'label: ',label.get_shape()

    image = tf.image.resize_images(image, 32,32)
    image = tf.reshape(image,[32,32,3])

    #processed_example = some_processing(example)

    processed_example = image
    return processed_example, label

But when I call it, I get the following error:

Last executed 2016-07-22 23:58:40 in 71ms
get shape:  (223446,)
image:  (223446,)
label:  ()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-32-44cda3e31e6a> in <module>()
----> 1 tf.app.run()

/home/guanwanxian/anaconda2/envs/theano/lib/python2.7/site-packages/tensorflow/python/platform/app.pyc in run(main)
     28   f._parse_flags()
     29   main = main or sys.modules['__main__'].main
---> 30   sys.exit(main(sys.argv))

<ipython-input-7-2788f377abf1> in main(argv)
      1 def main(argv=None):
----> 2     train()

<ipython-input-31-4c9a489f07f7> in train(re_train, continue_from_pre)
      8         # Get images and labels for CIFAR-10.
      9         # images, labels = my_input.inputs()
---> 10         images, labels = input_pipeline(['../data/trainV2.tfrecords'],batch_size=128)
     11         val_images, val_labels = input_pipeline(['../data/testV2.tfrecords'],batch_size=128)
     12 

<ipython-input-3-ce4e93d1c196> in input_pipeline(filenames, batch_size, num_epochs)
      2     filename_queue = tf.train.string_input_producer(
      3                      filenames, num_epochs=num_epochs, shuffle=True)
----> 4     example, label = read_my_file_format(filename_queue)
      5     # min_after_dequeue defines how big a buffer we will randomly sample
      6     #   from -- bigger means better shuffling but slower start up and more

<ipython-input-30-8be4eeea8a06> in read_my_file_format(filename_queue)
     21     print 'label: ',label.get_shape()
     22 
---> 23     image = tf.image.resize_images(image, 32,32)
     24     image = tf.reshape(image,[32,32,3])
     25 

/home/guanwanxian/anaconda2/envs/theano/lib/python2.7/site-packages/tensorflow/python/ops/image_ops.pyc in resize_images(images, new_height, new_width, method, align_corners)
    629     images = array_ops.expand_dims(images, 0)
    630 
--> 631   _, height, width, depth = _ImageDimensions(images)
    632 
    633   # Handle tensor-valued sizes as well as Python integers.

ValueError: need more than 1 value to unpack

And my code to create the tfrecords file from images, with a little modification, is as follows:

def read_image_and_labels_from(path):
    '''read images and return image and label array
    '''

    directories = glob.glob(os.path.join(path,'*'))
    class_names = [os.path.basename(directory) for directory in directories]
    class_names.sort()
    num_classes = len(class_names)

    file_paths = glob.glob(os.path.join(path,'*/*'))
    file_paths = sorted(file_paths,
           key=lambda filename:os.path.basename(filename).split('.')[0])

    images = []
    labels = []
    shapes = []
    shape_set = set()
    for filename in file_paths:
        im = Image.open(filename)
        arr = np.asarray(im, np.uint8)
        images.append(arr)

        im_shape = arr.shape
        shapes.append(im_shape) # tuple of 3
        if im_shape not in shape_set:
            shape_set.add(im_shape)
        # image_name = os.path.basename(filename).split('.')[0]

        class_name = os.path.basename(os.path.dirname(filename))
        label_num = class_names.index(class_name)
        labels.append(np.asarray(label_num, np.uint32))

    images = np.array(images)
    labels = np.array(labels)
    shapes = np.array(shapes)
    print images.shape, labels.shape, shapes.shape
    print 'shape set: ', shape_set
    return images, labels, shapes


def convert_images_to_tfrecords_file(images, labels, shapes, directorys, name):
    num_examples = labels.shape[0]
    print('labels shape is: ', labels.shape[0])
    if num_examples != images.shape[0] != labels.shape[0]:
        raise ValueError("Images size %d does not match label size %d." %
                                         (images.shape[0], num_examples))

    filename = os.path.join(directorys, name + '.tfrecords')
    print('Writing', filename)
    writer = tf.python_io.TFRecordWriter(filename)
    for index in range(num_examples):
        image_raw = images[index].tostring()
        image_shape = shapes[index]
        example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64_feature(image_shape[0]),
                'width': _int64_feature(image_shape[1]),
                'depth': _int64_feature(image_shape[2]),
                'label': _int64_feature(int(labels[index])),
                'image_raw': _bytes_feature(image_raw)}))
        writer.write(example.SerializeToString())

So what's wrong with my code?
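
A likely cause, judging from the traceback rather than from any confirmed answer: resize_images unpacks batch, height, width and depth from its input, so it needs a 3-D or 4-D tensor, but the image is still flat (shape (223446,)) when it is passed in. Reshaping to [height, width, depth] before resizing should avoid the unpacking error. The sketch below only checks the arithmetic for that shape; unflatten_shape is a hypothetical helper, not a TensorFlow function.

```python
def unflatten_shape(flat_length, height, width, depth):
    """Return [height, width, depth] if it matches the flat buffer length."""
    if height * width * depth != flat_length:
        raise ValueError(
            "%d x %d x %d does not equal %d"
            % (height, width, depth, flat_length)
        )
    return [height, width, depth]


# Sizes taken from the issue: 223x334x3 images, flat length 223446.
shape = unflatten_shape(223446, 223, 334, 3)

# In the reader function the order would then become:
#   image = tf.reshape(image, shape)              # 3-D instead of flat
#   image = tf.image.resize_images(image, 32, 32)
```

With the image already 3-D, the trailing tf.reshape(image, [32, 32, 3]) should no longer be needed.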

ImportError: No module named 'reader'

I'm currently running Jupyter notebook using tensorflow-notebook. When I try to import imageflow, I get the following:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-5-f680d40931f4> in <module>()
----> 1 from imageflow import *

/opt/conda/lib/python3.5/site-packages/imageflow/__init__.py in <module>()
     27 
     28 from imageflow import *
---> 29 from reader import read_and_decode
     30 
     31 __author__ = 'HANEL'

ImportError: No module named 'reader'

Might-be-relevant-software versions:
Python - 3.5.2
Jupyter - 4.1.1
numpy - 1.10.4
Pillow - 3.2.0
tensorflow - 0.9.0
imageflow - 0.0.2

EDIT: Looks like it's a problem with Jupyter notebook, as everything works fine from the command line, so I'll close this.
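
For readers who hit the same traceback: one plausible cause, not confirmed in this thread, is that `from reader import read_and_decode` inside the package's __init__.py is an implicit relative import. Python 3 does not resolve such imports against the package directory, so the line only works if a top-level module named reader happens to be importable. That would also fit the workaround, reported in the "reader.py missing" issue, of copying the files next to the script. The sketch below reproduces the behaviour with a throwaway package; demo_pkg and demo_reader are made-up names.

```python
import importlib
import os
import sys
import tempfile


def try_import(init_source):
    """Build a temporary package and report whether importing it succeeds."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_pkg")
        os.makedirs(pkg_dir)
        # A sibling module inside the package, like imageflow's reader.py.
        with open(os.path.join(pkg_dir, "demo_reader.py"), "w") as f:
            f.write("def read_and_decode():\n    return 'decoded'\n")
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write(init_source)

        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_pkg")
            return True
        except ImportError:
            return False
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(root)
            sys.modules.pop("demo_pkg", None)
            sys.modules.pop("demo_pkg.demo_reader", None)


# Implicit relative import, as in the traceback above.
implicit_ok = try_import("from demo_reader import read_and_decode\n")
# Explicit relative import, which Python 3 resolves inside the package.
explicit_ok = try_import("from .demo_reader import read_and_decode\n")
```

If this is indeed the cause, changing the import to the explicit form (from .reader import read_and_decode) would be the corresponding fix on the package side.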

Create dataset from scratch

First of all, thank you @HamedMP for the huge amount of work.
In my opinion, a nice way to extend ImageFlow's functionality would be the ability to create a new dataset from scratch.
It could be very useful to anyone who needs to build their own CNN for image recognition that is not based on a pre-built dataset (e.g. CIFAR, MNIST or ImageNet).
What do you think?

per_image_whitening function does not exist

The function tensorflow.image.per_image_whitening (used here) no longer exists.

It has been renamed in TensorFlow 0.12 to tensorflow.image.per_image_standardization. See TensorFlow 0.12 release notes
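
Code that has to run on both sides of the rename could look the function up by name and fall back to the old one. The sketch below is one way to do that, not something imageflow ships: first_available is a hypothetical helper, and the two SimpleNamespace objects are stand-ins for tensorflow.image so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace


def first_available(namespace, *names):
    """Return the first attribute of namespace that exists among names."""
    for name in names:
        if hasattr(namespace, name):
            return getattr(namespace, name)
    raise AttributeError("none of %s found" % ", ".join(names))


# Stand-ins for tensorflow.image before and after the 0.12 rename.
old_api = SimpleNamespace(per_image_whitening=lambda image: "whitened")
new_api = SimpleNamespace(
    per_image_standardization=lambda image: "standardized"
)

# Prefer the new name, fall back to the old one.
PREFERRED = ("per_image_standardization", "per_image_whitening")
old_fn = first_available(old_api, *PREFERRED)
new_fn = first_available(new_api, *PREFERRED)
```

With TensorFlow imported as tf, the call would be first_available(tf.image, *PREFERRED).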
