gskielian / jpg-png-to-mnist-nn-format
Python/Bash scripts for creating custom Neural Net Training Data -- this repo is for the MNIST format
License: Apache License 2.0
https://blog.csdn.net/qq_44042678/article/details/131631917?spm=1001.2014.3001.5502
Brings together a number of issues
This script is working perfectly, except it is not taking in files from the folder named 22. I have 26 folders in both testing and training, and every image is loaded, converted, and compressed perfectly, except for those in folder #22.
I can make a folder #27, and the script will take in the images, same with 28 and so on. It's not my bitmaps because the same thing happens with the base EMNIST files.
Thanks.
Hi, @gskielian
Where is the image batches.meta.txt file?
I cannot change the file because it does not exist.
I am getting an error in this line of the code as follows:
label = int(filename.split('/')[2])
IndexError: list index out of range
So I tried changing it as follows:
label = (filename.split("\\")[2])
label1 = int(label)
Im = Image.open(filename)
pixel = Im.load()
width, height = Im.size
for x in range(0, width):
    for y in range(0, height):
        data_image.append(pixel[y, x])
data_label.append(label1)
and I again got an error:
label1 = int(label)
ValueError: invalid literal for int() with base 10: 'im9085.png'
I am using this script on the data provided in this repo.
How should I solve this error and get all 4 required files?
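Both errors above are consistent with the label being read from a fixed position in the split path: on some systems index 2 does not exist, on others it lands on the file name. A minimal sketch, assuming the `<root>/<label>/<image>` layout used in this thread and a hypothetical helper name `label_from_path`, reads the label from the parent directory name instead, whatever the separator:

```python
import posixpath


def label_from_path(filename):
    """Return the integer label taken from the image's parent directory name."""
    # Normalise Windows separators so the same logic works on every OS.
    normalised = filename.replace("\\", "/")
    parent = posixpath.basename(posixpath.dirname(normalised))
    # Raises ValueError if the parent directory name is not a number.
    return int(parent)


print(label_from_path("training-images/3/im9085.png"))
print(label_from_path(".\\training-images\\12\\im1.png"))
```

This does not depend on how many directory levels precede the label folder, which is the part that differs between the reports above.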
I could not add a tuple of color values to 'data_image':
data_image.append(pixel[y,x])
I changed it to data_image.append(pixel[y,x][0])
since my images are grayscale.
How can I add the whole tuple when the color channels differ?
Thank you.
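The "an integer is required" error reported here comes from appending a multi-channel pixel tuple to an array of unsigned bytes. A minimal sketch of two ways around it, using a hypothetical helper `append_pixel` (not part of the repo): average the channels into one grey value, or store the channels as consecutive bytes. The second option is an assumption about what a custom reader would accept, since the MNIST layout itself holds one byte per pixel.

```python
from array import array


def append_pixel(data_image, value, keep_channels=False):
    """Append one pixel, whether the image gave back an int or a tuple."""
    if isinstance(value, int):
        # Single-channel pixel: already one unsigned byte.
        data_image.append(value)
    elif keep_channels:
        # Store R, G, B as three consecutive bytes (no longer plain MNIST layout).
        data_image.extend(value[:3])
    else:
        # Collapse R, G, B to one grey value by simple averaging.
        r, g, b = value[:3]
        data_image.append((r + g + b) // 3)


data_image = array('B')
append_pixel(data_image, 200)                               # grayscale pixel
append_pixel(data_image, (67, 35, 20))                      # colour pixel, averaged
append_pixel(data_image, (67, 35, 20), keep_channels=True)  # colour pixel, all channels
print(list(data_image))
```

Converting each image with Pillow's `Image.convert("L")` before calling `load()` is another common route to single-byte pixels.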
Hi, @gskielian @vianamp
NameError: name 'width' is not defined
What's wrong with me?
Thanks in advance.
from @bemoregt.
there you go! the necessary modifications:
#hexval = "{0:#0{1}x}".format(len(FileList),6) # number of files in HEX
hexval = "{0:#0{1}x}".format(len(FileList),10) # number of files in HEX
# header for label array
header = array('B')
#header.extend([0,0,8,1,0,0])
#header.append(int('0x'+hexval[2:][:2],16))
#header.append(int('0x'+hexval[2:][2:],16))
header.extend([0,0,8,1])
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[4:][:2],16))
header.append(int('0x'+hexval[6:][:2],16))
header.append(int('0x'+hexval[8:][:2],16))
data_label = header + data_label
# additional header for images array
#if max([width,height]) <= 256:
# header.extend([0,0,0,width,0,0,0,height])
#else:
# raise ValueError('Image exceeds maximum size: 256x256 pixels');
hexval = "{0:#0{1}x}".format(width,10) # width in HEX
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[4:][:2],16))
header.append(int('0x'+hexval[6:][:2],16))
header.append(int('0x'+hexval[8:][:2],16))
hexval = "{0:#0{1}x}".format(height,10) # height in HEX
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[4:][:2],16))
header.append(int('0x'+hexval[6:][:2],16))
header.append(int('0x'+hexval[8:][:2],16))
header[3] = 3 # Changing MSB for image data (0x00000803)
I think the code below does not work correctly.
When I have just 2 dirs, such as
/0/xx.jpg
/1/yy.jpg
the [1:] slice means only /0/ is processed.
for dirname in os.listdir(name[0])[1:]: # [1:] Excludes .DS_Store from Mac OS
Hi,
the line:
for dirname in os.listdir(name[0])[1:]: (line 15)
is very problematic on any OS other than macOS.
There, the first class is excluded from the project!
I have done this instead:
if sys.platform == 'darwin':
    file_range = os.listdir(name[0])[1:]
else:
    file_range = os.listdir(name[0])
for dir_name in file_range:
    path = os.path.join(name[0], dir_name)
Because I have no Mac.
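An alternative to branching on the platform is to filter the listing itself, so that hidden entries such as .DS_Store are skipped wherever they appear and no class folder is dropped. This is a sketch only; `class_dirs` is a hypothetical helper name, and it assumes every class lives in its own sub-directory.

```python
import os


def class_dirs(root):
    """Return the class sub-directories of root, ignoring hidden entries."""
    return sorted(
        entry
        for entry in os.listdir(root)
        # Skip .DS_Store and other dot-files, and anything that is not a folder.
        if not entry.startswith(".") and os.path.isdir(os.path.join(root, entry))
    )
```

Note that `sorted` orders names as strings, so "10" comes before "2"; that only affects processing order, not which folders are included.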
Additionally, lines 45 and 46 do not work when the number of files needs more than 2 bytes:
header.append(int('0x'+hexval[2:][:2],16))
header.append(int('0x'+hexval[2:][2:],16))
And why do you use strings instead of a bitmask? My fix was:
for i in range(3, -1, -1):
    val = len(FileList) # int(hexval, 16)
    mask = 0xff << (i * 8)
    erg = (val & mask) >> (i * 8)
    print(str(erg))
    header.append(erg)
But I am a novice with TensorFlow. Is it required to use only 2 bytes for the number of files? If not, I would advise you to change that code!
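On the 2-byte question: the IDX format used by the MNIST files stores a 4-byte magic number followed by each dimension as a 4-byte big-endian integer, so the count is not limited to 2 bytes. A minimal sketch of building both headers with `struct` instead of string slicing or bit masks; the function names are hypothetical, and the dimension order (rows, then columns) follows the IDX convention.

```python
import struct


def idx_label_header(num_items):
    """Header for an IDX1 unsigned-byte file: magic 0x00000801, item count."""
    return struct.pack(">II", 0x00000801, num_items)


def idx_image_header(num_items, rows, cols):
    """Header for an IDX3 unsigned-byte file: magic 0x00000803, count, rows, cols."""
    return struct.pack(">IIII", 0x00000803, num_items, rows, cols)


print(list(idx_label_header(77000)))
print(list(idx_image_header(2, 28, 28)))
```

`">I"` packs an unsigned 32-bit big-endian integer, which is the same four bytes the mask loop above produces one at a time.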
Traceback (most recent call last):
File "convert-images-to-mnist-format.py", line 35, in
data_image.append(pixel[y,x])
TypeError: an integer is required
Do you know what can be wrong?
i have 28x28px image
from tensorflow.examples.tutorials.mnist import input_data
data_sets = input_data.read_data_sets('.', False)
images, labels = data_sets.train.next_batch(50)
print(labels)
When I use the code above to show the labels, they are always [1,1,1,1,......,1,1,1]. Can anyone help me?
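One way to narrow this down is to read the generated label file directly, without TensorFlow, and see whether it really contains only 1s. The sketch below assumes the file is a gzip-compressed IDX1 file with an 8-byte header (magic number, then count); `read_idx_labels` is a hypothetical helper name.

```python
import gzip
import struct
from array import array


def read_idx_labels(path):
    """Read a gzip-compressed IDX1 label file and return its labels."""
    with gzip.open(path, "rb") as f:
        magic, count = struct.unpack(">II", f.read(8))
        if magic != 0x00000801:
            raise ValueError("not an IDX1 unsigned-byte file: magic=%#010x" % magic)
        labels = array("B", f.read(count))
    if len(labels) != count:
        raise ValueError("expected %d labels, found %d" % (count, len(labels)))
    return labels
```

Calling `set(read_idx_labels("train-labels-idx1-ubyte.gz"))` would then show which distinct labels were written; a single value there points at the conversion step rather than at the training code.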
File "C:/Users/seanw/.spyder-py3/temp.py", line 42, in
label = int(filename.split('\\')[2])
ValueError: invalid literal for int() with base 10: 'im56908.png'
Hello,
I ran into a problem when running JPG-PNG-to-MNIST-NN-Format/convert-images-to-mnist-format.py:
Traceback (most recent call last):
File "/Users/shhgliu/git/JPG-PNG-to-MNIST-NN-Format/convert-images-to-mnist-format.py", line 35, in
data_image.append(pixel[y,x])
TypeError: an integer is required
After checking the code, I found the root cause: the typecode of data_image is 'B', which means it can only contain integers, but at line 35, data_image.append(pixel[y,x]), the expression "pixel[y,x]" returns a tuple rather than an integer.
Could you please check it? Thanks!
Hi,
I get an error at this line:
data_image.append(pixel[y,x])
TypeError: an integer is required (got type tuple)
Hi
I'm a newcomer to TensorFlow.
In particular, I'm trying to understand the format of the MNIST .gz files like train-images-idx3-ubyte.gz found at Dr. LeCun's website. I understood your code, but I don't know about "2. Change the appropriate labels in batches.meta.txt".
Could you please show me an example of batches.meta.txt?
Please let me know if you need more information or if I'm asking the wrong question or heading in an insensible direction. Thanks.
Why is it that after I run this file, I can't find the modified file and my initial data has disappeared. Can you help me?
data_image.append(pixel[y, x])
TypeError: an integer is required (got type tuple)
thank you very much for your help .
Is it possible to convert colored images? When I run the script with colored PNGs I get the following error:
convert-images-to-mnist-format.py
Traceback (most recent call last):
File "convert-images-to-mnist-format.py", line 35, in <module>
data_image.append(pixel[y,x])
TypeError: an integer is required
EDIT: I will use this repo
Hi,
I now have four files:
test-images-idx3-ubyte.gz
test-labels-idx1-ubyte.gz
train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz
How can I replace this line (#37) in the TensorFlow tutorial so that I can read the above-mentioned files into the mnist variable?
#mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
Thanks in advance. :)
I followed your readme and after running the Python code I get following error.
Traceback (most recent call last):
File "convert-images-to-mnist-format.py", line 71, in <module>
print average(pix[x,y])
File "convert-images-to-mnist-format.py", line 9, in average
return (pixel[0] + pixel[1] + pixel[2])/3
TypeError: 'int' object has no attribute '__getitem__'
Would you please give me a hint to solve this error?
And in your readme you said
Change the appropriate labels in batches.meta.txt
What do you mean by that?
invalid parameter - /0
test-images/0/0.png: PNG image data, 48x48, 8-bit grayscale, interlaced
invalid parameter - /0
test-images/0/1.png: PNG image data, 48x48, 8-bit grayscale, interlaced
invalid parameter - /0
test-images/0/2.png: PNG image data, 48x48, 8-bit grayscale, interlaced
and so on.
any solution to this problem?
This is a tuple, (67, 35, 20), and data_image.append() expects an integer. Please tell me what you intended and how to solve the problem.
When my dataset goes to 77000+ images, there is an error:
Traceback (most recent call last):
File "convert-images-to-mnist-format.py", line 46, in <module>
header.append(long('0x'+hexval[2:][2:],16))
OverflowError: unsigned byte integer is greater than maximum
Can anybody help me?
I think it is because there are too many images. How can I fix it?
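The overflow can be reproduced in isolation. The sketch below reuses the same format call and slicing as line 46 of the script, written for Python 3 (so `int` instead of `long`): with 77000 files the second slice holds three hex digits, which no longer fit in one unsigned byte. The final line shows one possible replacement, splitting the count into four big-endian bytes with `int.to_bytes`.

```python
from array import array

count = 77000
hexval = "{0:#0{1}x}".format(count, 6)   # same format call as the script
high = int('0x' + hexval[2:][:2], 16)    # first two hex digits
low = int('0x' + hexval[2:][2:], 16)     # everything after them

header = array('B')
header.append(high)
try:
    header.append(low)                   # larger than 255, so this fails
    overflowed = False
except OverflowError:
    overflowed = True

# Four big-endian bytes hold any count up to 2**32 - 1.
count_bytes = list(count.to_bytes(4, "big"))
print(hexval, high, low, overflowed, count_bytes)
```

With two header bytes the largest count that can be stored is 65535, which is why the problem only appears on larger datasets.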
File "C:/Users/PC5/Downloads/Compressed/JPG-PNG-to-MNIST-NN-Format-master/JPG-PNG-to-MNIST-NN-Format-master/convert-images-to-mnist-format.py", line 25, in
label = int(filename.split('/')[2])
IndexError: list index out of range
hello
My data is text. I want to write CNN and lsm code with TensorFlow in Python.
error: InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 576 values, but the requested shape requires a multiple of 20736
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Reshape/shape)]]
Sorry for my English, I am not good at it, and sorry for taking your time.
thank you
If I have already adjusted my framework to accept images larger than 28x28, will this encode the appropriate dimensions (i.e. 124x124)?
When I have 78200 images in total, the code breaks.
It was fixed by changing
array('B') => array('H')
The total size of the idx3-ubyte file
should be 1700x46x784 = 61308800 (#images * classes * image_size = total_size),
but sadly it comes out as 122617632.
Maybe the code should be changed for array('H').
Right now I don't want to go into the details of how the code works.
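The doubled size is what a wider typecode would produce: `array('B')` stores one byte per item, while `array('H')` stores at least two, so every pixel and every header entry takes twice the space and the file no longer matches the one-byte-per-pixel layout. A small worked check, assuming a 16-entry image header (4 magic bytes plus three 4-byte dimensions):

```python
from array import array

images = 1700 * 46          # 78200 images in total
pixels_per_image = 784      # 28 x 28
header_items = 16           # assumption: 4 magic bytes + 3 dimensions of 4 bytes each

items = header_items + images * pixels_per_image
size_as_B = items * array('B').itemsize
size_as_H = items * array('H').itemsize
print(size_as_B, size_as_H)
```

On platforms where `array('H').itemsize` is 2, `size_as_H` equals the reported 122617632, that is, 2 x (61308800 + 16). Keeping `array('B')` and widening only the count field in the header avoids the change in layout.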
After about 90% of this program had run, I hit a problem related to the operating system. I can't work out what I need to change in the PATH variable in the OS "Environment Variables" section. Because I cannot convert the data set into .gz files with this program, I am unable to continue with the rest of my project. For convenience, please see the attached output:
Hello! I found out that the MNIST dataset is loaded from a single .NPZ file when training a NN!
But the output of this code is very strange! Why are there 4 output files? How can I load my own dataset into a NN?
Where are x_train, y_train / x_test, y_test here?
I understand that the largest files are the training files and the small ones are the test files. But they are named very strangely (for example: ---idx3, ---idx1). I was confused.
I hope for your advice and help.
P.S. For example, how can I use it in this code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist_softmax.py
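On the naming: in the IDX format the digit after "idx" is the number of dimensions, so an idx3 file holds images (count x rows x columns) and an idx1 file holds one label per image. The train-images and train-labels files therefore play the role of x_train and y_train, and the test-images and test-labels files the role of x_test and y_test. A minimal sketch of reading an image file with the standard library only; `read_idx_images` is a hypothetical helper name, and turning the raw bytes into whatever array type a framework expects is left out.

```python
import gzip
import struct


def read_idx_images(path):
    """Read a gzip-compressed IDX3 image file; return (rows, cols, images)."""
    with gzip.open(path, "rb") as f:
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 0x00000803:
            raise ValueError("not an IDX3 unsigned-byte file: magic=%#010x" % magic)
        size = rows * cols
        # Each image is rows * cols unsigned bytes, stored row by row.
        images = [f.read(size) for _ in range(count)]
    return rows, cols, images
```

For example, `rows, cols, x_train = read_idx_images("train-images-idx3-ubyte.gz")` would give one `bytes` object per training image.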
ValueError: Image exceeds maximum size: 256x256 pixels
How do I solve this?