zhaoj9014 / multi-human-parsing
🔥🔥Official Repository for Multi-Human-Parsing (MHP)🔥🔥
Home Page: http://lv-mhp.github.io/
License: MIT License
First, thank you so much for open-sourcing such a useful project. I was trying to download the dataset but I got the following error:
"Sorry, you can't view or download this file at this time.
Too many users have viewed or downloaded this file recently. Please try accessing the file again later. If the file you are trying to access is particularly large or is shared with many people, it may take up to 24 hours to be able to view or download the file. If you still can't access a file after 24 hours, contact your domain administrator."
Then I tried Baidu, but everything was in Chinese, so I wasn't able to understand it. Is there another way to download the dataset without getting this error? Thanks a lot again!
Can you help me analyze this error?
Reading data...
Traceback (most recent call last):
File "train_step1.py", line 134, in <module>
reader = data_provider('train.list')
File "train_step1.py", line 92, in __init__
seg = img_reader.pad(seg,np.uint8,False)
File "/Multi-Human-Parsing/Nested_Adversarial_Networks/img_reader.py", line 42, in pad
res[start_point:start_point+a] = img
ValueError: could not broadcast input array from shape (281,500,3) into shape (281,500)
Does each pixel belong to only one label, even across different people in the same image?
Hi, @ZhaoJ9014
The current code for multi-person parsing evaluation does not support the PCP metric.
Will the code support PCP in the future?
File "/Users/rida/Documents/Virtual Retail/External Repos/Multi-Human-Parsing/Nested_Adversarial_Networks/train_e2e.py", line 437, in <module>
net = network()
TypeError: __init__() takes exactly 2 arguments (1 given)
Where are /data/reform_img/ and /data/reform_anno/, which are referenced here in train_step2.py?
mask = read_label(fname[0].replace('reform_img', 'reform_annot').replace('jpg', 'png'))
I read the label map, but I find that the order of the labels does not match the description in the README. For example, pixel value 4 doesn't represent Hair but Singlet. Can you give me the correct label order?
@ZhaoJ9014
Hi, Jian. Very nice work on the multiple human parsing dataset.
Is there any official evaluation code or online server for this task?
Please add pretrained models for use and maybe also for transfer learning. Thanks.
The description says:
Moreover, 2D human poses with 16 dense key points ("right-shoulder", "right-elbow", "right-wrist", "left-shoulder", "left-elbow", "left-wrist", "right-hip", "right-knee", "right-ankle", "left-hip", "left-knee", "left-ankle", "head", "neck", "spine" and "pelvis". Each key point has a flag indicating whether it is visible-0/occluded-1/out-of-image-2)
but that order is not correct when the points are visualized. In case anyone is interested, the correct order is the following:
["right-ankle", "right-knee", "right-hip", "left-hip", "left-knee", "left-ankle", "pelvis", "spine", "neck", "head", "right-wrist", "right-elbow", "right-shoulder", "left-shoulder", "left-elbow", "left-wrist"]
It should also be mentioned that the visibility flags don't seem to be correct: I see keypoints with negative coordinates whose flag is set to 0 (visible), so I manually set those to 2 (out of image).
I hope this info is useful to someone.
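The reordering and flag correction described above could be sketched as follows. This is a minimal sketch, not part of the dataset tooling: it assumes each person's pose is available as a (16, 3) array with columns x, y, flag, stored in the observed order. The helper names are hypothetical, and the check against the image width and height goes beyond the negative-coordinate case reported above.

```python
import numpy as np

# Order given in the dataset description.
DOCUMENTED_ORDER = [
    "right-shoulder", "right-elbow", "right-wrist",
    "left-shoulder", "left-elbow", "left-wrist",
    "right-hip", "right-knee", "right-ankle",
    "left-hip", "left-knee", "left-ankle",
    "head", "neck", "spine", "pelvis",
]

# Order observed when visualizing the points (as reported above).
OBSERVED_ORDER = [
    "right-ankle", "right-knee", "right-hip",
    "left-hip", "left-knee", "left-ankle",
    "pelvis", "spine", "neck", "head",
    "right-wrist", "right-elbow", "right-shoulder",
    "left-shoulder", "left-elbow", "left-wrist",
]


def to_documented_order(pose):
    """Reorder a (16, 3) pose from the observed order to the documented one."""
    pose = np.asarray(pose)
    index = [OBSERVED_ORDER.index(name) for name in DOCUMENTED_ORDER]
    return pose[index]


def fix_visibility(pose, width, height):
    """Set the flag to 2 (out of image) for keypoints outside the image.

    Assumption: columns are x, y, flag. The upper-bound check is an
    extension of the negative-coordinate fix described in the post.
    """
    pose = np.array(pose, dtype=float)
    x, y = pose[:, 0], pose[:, 1]
    outside = (x < 0) | (y < 0) | (x >= width) | (y >= height)
    pose[outside, 2] = 2
    return pose
```

Applying `fix_visibility` before `to_documented_order` (or the reverse) gives the same result, since the flag fix works row by row.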
Hi,
I don't know what's wrong.
I use Python 2.7, TensorFlow 1.2 and the VOC2012 dataset.
2018-11-18 21:48:15.769563: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769595: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769600: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769604: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:15.769608: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2018-11-18 21:48:16.205096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-11-18 21:48:16.205568: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.8475
pciBusID 0000:01:00.0
Total memory: 7.93GiB
Free memory: 7.12GiB
2018-11-18 21:48:16.205601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2018-11-18 21:48:16.205615: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2018-11-18 21:48:16.205632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
('loading from model:', u'./savings_bgfg/pretrain.ckpt')
Traceback (most recent call last):
File "train_step1.py", line 146, in <module>
loss = net.train(img_batch,lab_batch)
File "train_step1.py", line 62, in train
ls,_ = self.sess.run([self.loss,self.train_op],feed_dict={self.inp_holder:img_batch, self.lab_holder:lab_batch})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [3364,2] and labels shape [423200]
[[Node: bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape, bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
[[Node: bg_fg/Mean/_263 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_8656_bg_fg/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
Caused by op u'bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits', defined at:
File "train_step1.py", line 140, in <module>
net = network()
File "train_step1.py", line 26, in __init__
self.build_loss(seg_layer,lab_holder)
File "train_step1.py", line 47, in build_loss
seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=lab_reform,logits=seg_layer))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1703, in sparse_softmax_cross_entropy_with_logits
precise_logits, labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 2486, in _sparse_softmax_cross_entropy_with_logits
features=features, labels=labels, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): logits and labels must have the same first dimension, got logits shape [3364,2] and labels shape [423200]
[[Node: bg_fg/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits = SparseSoftmaxCrossEntropyWithLogits[T=DT_FLOAT, Tlabels=DT_INT64, _device="/job:localhost/replica:0/task:0/gpu:0"](bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape, bg_fg/SparseSoftmaxCrossEntropyWithLogits/Reshape_1)]]
[[Node: bg_fg/Mean/_263 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_8656_bg_fg/Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]
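The error above reports 3364 logit rows (58 × 58) against 423200 labels, which suggests the logits and the labels are at different spatial resolutions when they are flattened. A minimal NumPy sketch of bringing the logits to the label resolution before the per-pixel loss is shown here. It is not the repository's code; `resize_nearest` and `flatten_for_loss` are hypothetical helpers, and nearest-neighbour resizing is used only to keep the example short.

```python
import numpy as np


def resize_nearest(arr, out_h, out_w):
    """Nearest-neighbour resize of an array shaped (batch, h, w, ...)."""
    in_h, in_w = arr.shape[1], arr.shape[2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return arr[:, rows][:, :, cols]


def flatten_for_loss(logits, labels):
    """Resize logits (B, h, w, C) to labels (B, H, W), then flatten both."""
    _, height, width = labels.shape
    logits = resize_nearest(logits, height, width)
    flat_logits = logits.reshape(-1, logits.shape[-1])
    flat_labels = labels.reshape(-1)
    # This is the condition the TensorFlow op complains about.
    assert flat_logits.shape[0] == flat_labels.shape[0]
    return flat_logits, flat_labels
```

In a real training graph, bilinear upsampling of the logits is the more common choice. The point here is only that both tensors must cover the same number of pixels.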
Hello.
I'm trying to reproduce your model using the provided dataset, but I can't find in your code how to adapt the MHPv2 parsing annotations for Nested Adversarial Networks, as shown in your paper.
In particular, each image has multiple parsing images, and it is unclear how to load them for training.
How do I use your dataset with your model?
Thanks in advance.
Do train_step1.py, train_step2.py and train_step3.py use the same dataset for training? If so, why do they use different paths to get the training data?
Which version is supported?
I faced an issue in train_step1.py where my input JPEG path was actually two paths in a single string, which failed with:
Reading data...
Data file= train.list
f= <_io.TextIOWrapper name='train.list' mode='r' encoding='UTF-8'>
i= VOC2012/JPEGImages/2007_000032.jpg VOC2012/SegmentationClass/2007_000032.png
i= VOC2012/JPEGImages/2007_000032.jpg VOC2012/SegmentationClass/2007_000032.png
rest= VOC2012/JPEGImages/2007_000032.jpg
jpgfile= VOC2012/JPEGImages/2007_000032.jpg VOC2012/JPEGImages/2007_000032.jpg
Image path: VOC2012/JPEGImages/2007_000032.jpg VOC2012/JPEGImages/2007_000032.jpg
Traceback (most recent call last):
File "train_step1.py", line 142, in <module>
reader = data_provider('train.list')
File "train_step1.py", line 98, in __init__
jpg = img_reader.read_img(jpgfile,500,padding=True)
File "/home/pegasus/proj/human-parts-parsing/Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/img_reader.py", line 26, in read_img
raw_im = np.array(Image.open(im_path).convert('RGB'), np.uint8)
File "/home/pegasus/anaconda3/envs/mhp_nan/lib/python3.6/site-packages/PIL/Image.py", line 2609, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'VOC2012/JPEGImages/2007_000032.jpg\tVOC2012/JPEGImages/2007_000032.jpg'
After I removed the part "\tVOC2012/JPEGImages/2007_000032.jpg", I get:
Reading data...
Data file= train.list
f= <_io.TextIOWrapper name='train.list' mode='r' encoding='UTF-8'>
i= VOC2012/JPEGImages/2007_000032.jpg VOC2012/SegmentationClass/2007_000032.png
i= VOC2012/JPEGImages/2007_000032.jpg VOC2012/SegmentationClass/2007_000032.png
rest= VOC2012/JPEGImages/2007_000032.jpg
jpgfile= VOC2012/JPEGImages/2007_000032.jpg
Image path: VOC2012/JPEGImages/2007_000032.jpg
Traceback (most recent call last):
File "train_step1.py", line 142, in <module>
reader = data_provider('train.list')
File "train_step1.py", line 100, in __init__
seg = img_reader.pad(seg,np.uint8,False)
File "/home/pegasus/proj/human-parts-parsing/Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/img_reader.py", line 43, in pad
res[start_point:start_point+a] = img
ValueError: could not broadcast input array from shape (281,500,3) into shape (281,500)
I use Python 3, so I am not using the PIL package (which only supports 2.7) but Pillow, which could be causing this issue.
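The two failures above look like (a) a whole tab-separated line of train.list being used as the image path, and (b) a 3-channel array being written into a 2-D padded buffer. A minimal sketch of handling both is given here. `split_list_line` and `pad_to_square` are hypothetical helpers, not the functions in img_reader.py, and the "image path, then label path" layout of train.list is inferred from the log output.

```python
import numpy as np


def split_list_line(line):
    """Split one list line into (image_path, label_path).

    Assumption: the two paths are separated by a tab, falling back to
    whitespace when no tab is present.
    """
    line = line.strip()
    parts = line.split("\t") if "\t" in line else line.split()
    if len(parts) != 2:
        raise ValueError("expected '<image> <label>', got %r" % line)
    return parts[0], parts[1]


def pad_to_square(img, size, dtype=np.uint8):
    """Centre an (h, w) or (h, w, c) array on a zero canvas of size x size."""
    img = np.asarray(img, dtype=dtype)
    h, w = img.shape[:2]
    if h > size or w > size:
        raise ValueError("image %dx%d does not fit in %d" % (h, w, size))
    # Keep any trailing channel axis so 2-D labels and 3-D images both work.
    canvas = np.zeros((size, size) + img.shape[2:], dtype=dtype)
    top = (size - h) // 2
    left = (size - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas
```

For palette-mode label PNGs such as those in VOC2012/SegmentationClass, loading the file without converting it to RGB typically yields a 2-D array of class indices, which is what a 2-D padded buffer expects.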
As written in the paper, the NAN model is composed of three sub-nets, each trained with a prediction loss (compared with ground-truth labels) and an adversarial loss. However, I could not find the adversarial losses in the code. Are they not included, or are they somewhere that I missed? BTW, thanks for sharing the code.
The leaderboard page of the website mentions baseline models for both tasks. Are the code and weights for these models published?
Thanks
I find that all faces are occluded and their parsing labels are 0. Is this expected?
Hi, thanks for the dataset.
It seems there is no connection between poses and parsings.
For example, train/parsing_annos/1360_05_01.png and person_0 in train/pose_annos/1360.mat represent different persons.
Could you provide the connections between poses and parsings?
Also, the dataset on Google Drive seems to have been zipped on macOS. It unzips without problems on macOS, but Linux and Windows report that the zip file is corrupted.
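Until an official pose-to-parsing mapping is available, one possible workaround is to match each pose to the person mask that contains most of its keypoints. The sketch below is only a heuristic under stated assumptions: poses are (K, 3) arrays with columns x, y, flag, each per-person parsing annotation is a 2-D array that is nonzero on that person, and `match_poses_to_masks` is a hypothetical helper.

```python
import numpy as np


def match_poses_to_masks(poses, masks):
    """Return (assignment, scores).

    scores[i, j] counts keypoints of pose i that fall on mask j;
    assignment[i] is the best-scoring mask index for pose i.
    """
    scores = np.zeros((len(poses), len(masks)), dtype=int)
    for i, pose in enumerate(poses):
        pose = np.asarray(pose, dtype=float)
        xs = np.round(pose[:, 0]).astype(int)
        ys = np.round(pose[:, 1]).astype(int)
        for j, mask in enumerate(masks):
            mask = np.asarray(mask)
            h, w = mask.shape
            # Ignore keypoints that lie outside the image.
            inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
            scores[i, j] = np.count_nonzero(mask[ys[inside], xs[inside]])
    return scores.argmax(axis=1), scores
```

The per-pose argmax can assign two poses to the same mask in crowded scenes. A one-to-one assignment over `scores` would be stricter, but it is left out to keep the sketch short.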
output_sample.py does not seem to work. class_num is missing at first. Even if a number is given here, loading the pretrained model fails:
loading from model: ./savings_bgfg/pretrain.ckpt
2019-03-07 18:40:09.127626: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key bg_fg/SegLayer/conv_0/conv_0/bias not found in checkpoint
The LV-MHP-v1 README.txt says:
One image corresponds to multiple annotation files with the same prefix, one file per person. In each annotation file, the labels represent:
0: background
1: hat
2: hair
3: sunglass
4: upper-clothes
5: skirt
6: pants
7: dress
8: belt
9: left-shoe
10: right-shoe
11: face
12: left-leg
13: right-leg
14: left-arm
15: right-arm
16: bag
17: scarf
18: torso-skin
For example, the original photograph contains a hat, hair and so on, but the annotation looks like a black picture. How can I parse it to get the detailed information it contains?
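The annotation images look black because label values 0 to 18 are all very dark in an 8-bit image viewer. A minimal sketch for making them visible is to map each label id to a distinct colour. It assumes the annotation has already been loaded as a 2-D integer array. The palette is arbitrary, and `make_palette`, `colorize` and `present_labels` are hypothetical helpers rather than part of the dataset tooling.

```python
import numpy as np

# Label names in the order listed in the README excerpt above.
LABELS = [
    "background", "hat", "hair", "sunglass", "upper-clothes", "skirt",
    "pants", "dress", "belt", "left-shoe", "right-shoe", "face",
    "left-leg", "right-leg", "left-arm", "right-arm", "bag", "scarf",
    "torso-skin",
]


def make_palette(n, seed=0):
    """Build an (n, 3) uint8 palette: black background, bright random colours."""
    rng = np.random.RandomState(seed)
    palette = rng.randint(64, 256, size=(n, 3)).astype(np.uint8)
    palette[0] = 0
    return palette


def colorize(label_map, palette):
    """Turn an (h, w) label map into an (h, w, 3) RGB image."""
    return palette[np.asarray(label_map)]


def present_labels(label_map):
    """List the names of the labels that occur in the map."""
    return [LABELS[i] for i in np.unique(label_map) if i < len(LABELS)]
```

If a loaded annotation turns out to have three channels, which channel carries the label id would need to be checked before applying this.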
I cannot fully unzip the zip file downloaded from Google Drive for the MHP v2.0 dataset. I get the error message: The archive is corrupted.
I made a small change and now train step 1 runs, but all pixels are classified as background (black).
Does anyone have the same issue?
Initially the build_loss function is
    def build_loss(self,seg_layer,lab_holder):
        lab_reform = tf.expand_dims(lab_holder,-1)
        lab_reform = tf.image.resize_images(seg_layer,tf.shape(lab_reform)[1:3])
        lab_reform = tf.squeeze(lab_reform)
        seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_layer,labels=lab_reform))
I changed it to
    def build_loss(self,seg_layer,lab_holder):
        # lab_reform = tf.expand_dims(lab_holder, -1)
        # lab_reform = tf.image.resize_images(seg_layer, tf.shape(lab_reform)[1:3]) # z00445456
        seg_reform = tf.image.resize_images(seg_layer, tf.shape(lab_holder)[1:3]) # z00445456
        # lab_reform = tf.squeeze(lab_reform)
        seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_reform,labels=lab_holder))
and it runs without error. But the result is weird: all pixels are classified as background.
This error occurs when running 'output_sample.py'. I found that although there are some variable scopes when building the network in 'out_model.py', such as
    with tf.variable_scope('SegLayer'):
        mod = M.Model(feature)
        mod.convLayer(3, 2, dilation_rate=dilation)
after I run 'deploy_pretrain.py' there are still no corresponding values or definitions in the checkpoint files when I inspect them with Netron: everything is WideRes, with no SegLayer, MergingLayer, or inst_layer.
Thank you for your publication. However, there are many confusing annotations, such as instance bounding boxes with the value [-1,-1,1] or with a bottom-right coordinate smaller than the upper-left coordinate. Examples are 17608.jpg and 8865.jpg. I think this may be a mistake.
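A simple filter for the degenerate boxes described above could look like the following sketch. It assumes boxes are given as (x1, y1, x2, y2) in pixels with the upper-left corner first. The actual field layout in the annotation files is not confirmed here, and `is_valid_box` is a hypothetical helper.

```python
def is_valid_box(box, width=None, height=None):
    """Return True if box = (x1, y1, x2, y2) is a usable bounding box."""
    if len(box) != 4:
        return False          # e.g. the [-1, -1, 1] entries mentioned above
    x1, y1, x2, y2 = box
    if min(x1, y1, x2, y2) < 0:
        return False          # negative coordinates
    if x2 <= x1 or y2 <= y1:
        return False          # bottom-right not below and right of top-left
    if width is not None and x2 > width:
        return False
    if height is not None and y2 > height:
        return False
    return True
```

Skipping or logging the instances that fail this check is one way to keep training from crashing on the affected images.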
The network class in the old code does not set a default value, so training fails. Will the reworked NAN be released soon?
I have implemented the first- and second-stage networks in https://github.com/Windaway/Pytorch-Multi-Human-Parsing. But does the third-stage net have two versions? The arXiv paper has both an RPN and a GCN net.
The NAN rework defines yet another version. Which one is the baseline method?
Can we also save the occluded coordinates, as we do for isvisible?
There seems to be a small mistake (a stray comma) in the code where tf.nn.softmax is called at line 144:
/Multi-Human-Parsing_MHP/Nested_Adversarial_Networks/output_model.py", line 144
tf.nn.softmax(tf.image.resize_images(net_bgfg.seg_layer,tf.shape(img_holder)[1:3]),1)[:,:,:,1],
^
SyntaxError: invalid syntax
What value of class_num should be passed from out_sample?
I downloaded the MHP_v2 dataset from Baidu drive, but I can't find the pose annotations in it. Please tell me where I can find them, thanks.
Running python train_step1.py
gives the following error:
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
BN training: False
Conv_bias: False
Conv_bias: False
BN training: False
Conv_bias: True
Conv_bias: True
Traceback (most recent call last):
File "train_step1.py", line 137, in <module>
net = network()
File "train_step1.py", line 25, in __init__
self.build_loss(seg_layer,lab_holder)
File "train_step1.py", line 45, in build_loss
seg_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=seg_layer,labels=lab_reform))
File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 2084, in sparse_softmax_cross_entropy_with_logits
precise_logits, labels, name=name)
File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 7515, in sparse_softmax_cross_entropy_with_logits
labels=labels, name=name)
File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 609, in _apply_op_helper
param_name=input_name)
File "/home/skyuuka/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'labels' has DataType float32 not in list of allowed values: int32, int64
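The TypeError above indicates that the labels reach the loss as float32, while sparse cross-entropy expects integer class ids. Image resizing with interpolation generally yields floating-point output, so a label tensor that passes through such a resize would need to be cast back to an integer type. The sketch below is a NumPy illustration of the constraint, not TensorFlow's implementation: `sparse_softmax_cross_entropy` here is a hypothetical stand-in that makes both the dtype and the shape requirement explicit.

```python
import numpy as np


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy for logits (N, C) and integer labels (N,)."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    if not np.issubdtype(labels.dtype, np.integer):
        raise TypeError("labels must be integer class ids, got %s" % labels.dtype)
    if logits.shape[0] != labels.shape[0]:
        raise ValueError("logits and labels must have the same first dimension")
    # Subtract the row maximum for numerical stability before the log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(labels.shape[0]), labels]
```

With float labels, casting is the fix, for example `labels.astype(np.int64)` in NumPy or `tf.cast(labels, tf.int64)` in TensorFlow, provided the values are still whole class ids.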
But I still lack well-trained NAN and MH-Parser models for comparison.
Does anyone have them and could share them with me?
Dear Mr. Zhao,
As mentioned in the title, the script train_e2e.py has some syntax errors, such as missing commas in line 211 and inconsistent tabs and spaces in line 247. I need a human parsing tool for a downstream application, but these problems, together with the missing adversarial loss that is mentioned in the paper, make me unsure about the results and discourage me from training a model myself.
I sincerely look forward to a fix of the code and a released pre-trained model in the future.
Thank you.
Zhiyong
TypeError: __init__() missing 1 required positional argument: 'class_num'
I am interested in a more detailed classification of human body parts, something like this:
https://github.com/ashafaei/dense-depth-body-parts
How hard would it be to customize this project to provide more detail? I think finer resolution would help with identification when some parts are hidden or out of the camera view.
Thanks for any comments or hints on how to do this.
When I run train_step1.py, the following error is shown:
FileNotFoundError: [Errno 2] No such file or directory: 'VOC2012/JPEGImages/2012_001955.jpg VOC2012/JPEGImages/2012_001955.jpg'
The previous issue about this, #9, was closed, though I still don't see any link to download pretrained models. Any update on this, please?