ilovepose / darkpose

Distribution-Aware Coordinate Representation for Human Pose Estimation

Home Page: https://ilovepose.github.io/coco

License: Apache License 2.0

Makefile 0.03% Python 30.68% Cuda 67.66% C++ 0.03% Shell 0.76% Cython 0.84%

human-pose-estimation deep-learning coco-dataset mpii-dataset mscoco-keypoint

darkpose's Issues

How can I test it on arbitrary RGB image?

I've tried to write demo code, but I got stuck on how to interpret the output of the network:

import argparse
import os
import cv2
import numpy as np
import torch
import torchvision
import torchvision.transforms as transforms
from config import cfg
from config import update_config
from core.inference import get_final_preds
from utils.vis import save_debug_images
import glob
from models.pose_hrnet import get_pose_net

def parse_args():
	parser = argparse.ArgumentParser(description='Train keypoints network')
	# general
	parser.add_argument('--cfg',
						help='experiment configure file name',
						default='experiments/coco/hrnet/w48_384x288_adam_lr1e-3.yaml',
						type=str)

	parser.add_argument('opts',
						help="Modify config options using the command-line",
						default=None,
						nargs=argparse.REMAINDER)

	parser.add_argument('--modelDir',
						help='model directory',
						type=str,
						default='')
	parser.add_argument('--logDir',
						help='log directory',
						type=str,
						default='')
	parser.add_argument('--dataDir',
						help='data directory',
						type=str,
						default='./Inputs/')
	parser.add_argument('--prevModelDir',
						help='prev Model directory',
						type=str,
						default='')

	args = parser.parse_args()
	return args

def save_images(img, joints_pred, name):
	# img: BGR uint8 image array of shape (H, W, 3), as returned by cv2.imread
	# joints_pred: iterable of (x, y) pixel coordinates in the same image
	os.makedirs("Results", exist_ok=True)
	for joint in joints_pred:
		# draw each predicted joint as a small circle
		cv2.circle(img, (int(joint[0]), int(joint[1])), 2, [255, 0, 0], 2)
	cv2.imwrite(f"Results/{name}", img)

def main():
	normalize = transforms.Normalize(
			mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
		)
	transform = transforms.Compose([
		transforms.ToTensor(),
		normalize,
	])
	args = parse_args()
	update_config(cfg, args)
	image_size = np.array(cfg.MODEL.IMAGE_SIZE)

	model = get_pose_net(
		cfg, is_train=False
	)

	if cfg.TEST.MODEL_FILE:
		model.load_state_dict(torch.load(cfg.TEST.MODEL_FILE), strict=False)
	else:
		# final_output_dir is not defined anywhere in this script,
		# so the checkpoint path has to be given explicitly
		raise ValueError('Set TEST.MODEL_FILE to the checkpoint path')

	model = torch.nn.DataParallel(model, device_ids=cfg.GPUS).cuda()
	# switch BatchNorm/Dropout layers to inference behaviour
	model.eval()
	
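	img_path_l = sorted(glob.glob('./Inputs' + '/*'))
	with torch.no_grad():
		for path in img_path_l:
			name = os.path.basename(path)
			image = cv2.imread(path)
			if image is None:
				# skip files OpenCV cannot decode
				continue
			# cv2.resize expects (width, height); cfg.MODEL.IMAGE_SIZE is [width, height]
			image = cv2.resize(image, (int(image_size[0]), int(image_size[1])))
			if cfg.DATASET.COLOR_RGB:
				# OpenCV loads BGR; convert when the config expects RGB input
				image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
			# the model lives on the GPU, so the input must be moved there too
			input = transform(image).unsqueeze(0).cuda()
			outputs = model(input)
			if isinstance(outputs, list):
				output = outputs[-1]
			else:
				output = outputs
			# output holds one heatmap per joint: (batch, num_joints, H, W)
			print(f"{name} : {output.shape}")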
	

if __name__ == '__main__':
	main()

I don't know what to set for scale and center in get_final_preds.
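One possible way to fill in center and scale when the whole image is treated as the person box. This is a sketch under the assumption that get_final_preds follows the HRNet-style convention: center is the box midpoint in pixels, and scale is the box size divided by a pixel_std of 200 after the box has been padded to the network's aspect ratio. The helper name box_to_center_scale and the padding default are illustrative and not taken from the repository.

```python
import numpy as np

def box_to_center_scale(x, y, w, h, aspect_ratio, pixel_std=200.0, padding=1.25):
    """Convert an (x, y, w, h) box into a (center, scale) pair.

    Hypothetical helper: assumes scale = box size / pixel_std, with the box
    first extended to the network input aspect ratio (width / height).
    """
    center = np.array([x + 0.5 * w, y + 0.5 * h], dtype=np.float32)
    # Grow the shorter side so the box matches the input aspect ratio.
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio
    scale = np.array([w / pixel_std, h / pixel_std], dtype=np.float32) * padding
    return center, scale

# Whole 640x480 image as the box, for a 288x384 (width x height) network input.
center, scale = box_to_center_scale(0, 0, 640, 480, aspect_ratio=288 / 384)
print(center, scale)
```

If this convention holds, the network input would also need to be cropped with the matching affine transform rather than a plain cv2.resize, otherwise the decoded coordinates would not map back to the original image.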

Could you explain how to calculate the second derivative 'dxy' ?

In inference.py, function taylor, the 2nd derivative dxy is calculated by:

dxy = 0.25 * (hm[py+1][px+1] - hm[py-1][px+1] - hm[py+1][px-1] + hm[py-1][px-1])

So could you explain why is the equation like this?
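The expression matches the standard central finite-difference estimate of the mixed partial derivative with a spacing of one pixel: applying the central difference (f(x+1) - f(x-1)) / 2 along x and then again along y yields the four-corner formula with a coefficient of 1/4. A small numerical check, using an illustrative function f(x, y) = x * y whose mixed derivative is exactly 1:

```python
import numpy as np

# Synthetic "heatmap" f(x, y) = x * y, indexed as hm[y][x].
ys, xs = np.mgrid[0:5, 0:5].astype(float)
hm = xs * ys
px, py = 2, 2

# Four-corner formula from the question.
dxy = 0.25 * (hm[py+1][px+1] - hm[py-1][px+1] - hm[py+1][px-1] + hm[py-1][px-1])

# Same quantity built step by step: d/dx on the rows above and below,
# then a central difference of those two values along y.
dx_row_below = 0.5 * (hm[py+1][px+1] - hm[py+1][px-1])
dx_row_above = 0.5 * (hm[py-1][px+1] - hm[py-1][px-1])
dxy_nested = 0.5 * (dx_row_below - dx_row_above)

print(dxy, dxy_nested)
```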

code

Excuse me, I am a beginner in CS. I would like to know when you will release your code. Sorry to disturb you.

Tensor size mismatch when training takes hourglass as back-bone

When I tried to train the hourglass network with an input image size of 128 x 96, the code threw a tensor size mismatch error at this line:

DarkPose/lib/models/hourglass.py

Line 91 in 612fad5

out = up1 + up2

The shapes of up1 and up2 are torch.Size([4, 256, 4, 3]) and torch.Size([4, 256, 4, 2]). Does it have anything to do with the PyTorch version or the max-pooling layer?
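A sketch that traces the spatial sizes through a recursive hourglass. It assumes the stack operates at the 32 x 24 heatmap resolution (a quarter of 128 x 96), that each level uses a stride-2 max-pool which floors odd sizes, and that the upsampling doubles the size; the function name is illustrative.

```python
def hourglass_sizes(size, depth):
    """Return (skip_branch, upsampled_branch) sizes per level, outermost first."""
    levels = []
    for _ in range(depth):
        pooled = size // 2                 # stride-2 pooling floors odd sizes
        levels.append((size, pooled * 2))  # up1 keeps `size`, up2 is 2 * pooled
        size = pooled
    return levels

for name, size in (("height", 32), ("width", 24)):
    print(name, hourglass_sizes(size, depth=4))
```

Under these assumptions the width goes 24, 12, 6, 3, and the innermost level compares a skip branch of width 3 with an upsampled branch of width 2, which reproduces the reported shapes. The mismatch would then come from the size not being divisible by 2^depth rather than from the PyTorch version.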

How to download pretrained model for test?

Hi, I am new to DarkPose. Is there a link for downloading the pretrained model files?

Will you publish the code?

In the paper you mention a "model-agnostic plugin" - will you open source the code for your approach?

AI Challenger dataset is not accessible. How to reproduce your result of W48*?

I can't download the AI Challenger data (https://challenger.ai/dataset/keypoint).
How did you get the result for HRNet W48* (with extra data)?

Why are there two coefficients (0.5, 0.25) before the derivative and Hessian?

DarkPose/lib/core/inference.py

Line 51 in 0185b42

def taylor(hm, coord):

According to the Taylor series, the offset is equal to -InvH * g. In this code, however, the offset appears to be -2*InvH * g because of the two coefficients. Can you please explain it?
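One reading is that 0.5 and 0.25 are finite-difference coefficients rather than a scaling of the Newton step: 0.5 is the central-difference weight for a first derivative, and 0.25 is the weight of a second difference taken with a two-pixel step. The sketch below is an illustrative reimplementation under that assumption (taylor_refine and the synthetic heatmap are not the repository's code). It applies offset = -InvH * g to a log-Gaussian heatmap and recovers the sub-pixel center.

```python
import numpy as np

def taylor_refine(log_hm, px, py):
    """Refine an integer peak (px, py) of a log-heatmap with one Newton step."""
    # First derivatives: central differences with a one-pixel step (weight 0.5).
    dx = 0.5 * (log_hm[py, px + 1] - log_hm[py, px - 1])
    dy = 0.5 * (log_hm[py + 1, px] - log_hm[py - 1, px])
    # Second derivatives: differences with a two-pixel step (weight 1 / 2**2).
    dxx = 0.25 * (log_hm[py, px + 2] - 2 * log_hm[py, px] + log_hm[py, px - 2])
    dyy = 0.25 * (log_hm[py + 2, px] - 2 * log_hm[py, px] + log_hm[py - 2, px])
    dxy = 0.25 * (log_hm[py + 1, px + 1] - log_hm[py - 1, px + 1]
                  - log_hm[py + 1, px - 1] + log_hm[py - 1, px - 1])
    grad = np.array([dx, dy])
    hess = np.array([[dxx, dxy], [dxy, dyy]])
    offset = -np.linalg.solve(hess, grad)  # offset = -InvH * g
    return np.array([px, py], dtype=float) + offset

# Synthetic log-Gaussian heatmap with a sub-pixel center (illustrative values).
mu_x, mu_y, sigma = 10.3, 7.6, 2.0
ys, xs = np.mgrid[0:16, 0:24].astype(float)
log_hm = -((xs - mu_x) ** 2 + (ys - mu_y) ** 2) / (2 * sigma ** 2)

py, px = np.unravel_index(np.argmax(log_hm), log_hm.shape)
refined = taylor_refine(log_hm, int(px), int(py))
print((px, py), refined)
```

The integer argmax is not the true maximum, so the gradient there is non-zero, and the Newton step moves the estimate from the grid point toward the continuous peak.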

Hi, how long does it take to get a return from coco evaluation server?

Unable to reproduce the score presented as Model + Dark

Hi,

I ran test.py with the default HRNet w32 256 x 192 and HRNet w32 384 x 288 configurations. I am only able to reproduce the author's scores of 74.4 and 75.8 respectively.

Command Used :
python tools/test.py --cfg experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml TEST.MODEL_FILE <MODEL_PATH> TEST.USE_GT_BBOX False

The MODEL_FILE used was the original author's model.

and likewise for 384 x 288.

I observed that the function taylor(hm, coord) in lib/core/inference.py was being invoked, but I am not able to reproduce the results you provided.

What do I need to do to reproduce the results provided?

Thanks in advance

I have a question about the code

DarkPose/lib/dataset/JointsDataset.py /line 284-286
feat_stride = self.image_size / self.heatmap_size
mu_x = joint[0]
mu_y = joint[1]
My question is why the code is not as follows:
feat_stride = self.image_size / self.heatmap_size
mu_x = joint[0]/feat_stride[0]
mu_y = joint[1]/feat_stride[1]

I also searched JointsDataset.py and found that 'feat_stride' is not used anywhere. But if the heatmap is a 1/4 downsampling of the original image, I think using 'feat_stride' is a necessary step.

I want to know whether my understanding is wrong or the code is wrong. Thanks.
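For reference, a sketch of how a Gaussian target can be centered directly on a sub-pixel location given in heatmap coordinates. It assumes the joint has already been mapped from image space into heatmap space before this step, in which case no further division by feat_stride is needed. Whether the repository performs that mapping earlier is an assumption that this thread does not confirm, and the function name gaussian_target is illustrative.

```python
import numpy as np

def gaussian_target(mu_x, mu_y, heatmap_w, heatmap_h, sigma):
    """Gaussian heatmap of shape (heatmap_h, heatmap_w) centered at (mu_x, mu_y)."""
    xs = np.arange(heatmap_w, dtype=np.float32)
    ys = np.arange(heatmap_h, dtype=np.float32)[:, None]
    return np.exp(-((xs - mu_x) ** 2 + (ys - mu_y) ** 2) / (2 * sigma ** 2))

image_size = np.array([192, 256])      # [width, height]
heatmap_size = np.array([48, 64])      # [width, height]
feat_stride = image_size / heatmap_size

joint_image = np.array([131.0, 89.0])     # joint in input-image pixels
joint_heatmap = joint_image / feat_stride # same joint in heatmap pixels

target = gaussian_target(joint_heatmap[0], joint_heatmap[1],
                         heatmap_size[0], heatmap_size[1], sigma=2)
peak_y, peak_x = np.unravel_index(np.argmax(target), target.shape)
print(joint_heatmap, (peak_x, peak_y))
```

The division by feat_stride has to happen exactly once. If mu_x and mu_y arrive in image coordinates it belongs in the target generation, and if they arrive in heatmap coordinates it does not.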

code for 'standard coordinate decoding method'

Thanks for the great paper, can I get the code for standard coordinate decoding method that you used for Table 1 to reproduce the comparison?

The difference in Table 1 and Table 2 ?

Where is this DARK encode part of the algorithm?

Where is this DARK decode part of the algorithm? @xizero00

Will this repo release C++ inference code?

If flip test is not used, is DARK not helpful for performance?

Have you tested the performance of DARK when the flip test is not used? It seems that the flip test should be used together with DARK or standard shifting, but neither your paper nor HRNet mentions that.

For video inference, what is the fastest fps

Confusion about the principle approach

There is one point I did not understand, and I would like to ask the author and the other experts here. The article describes inferring the actual point "µ" from the maximum-probability location "m" extracted from the heatmap. In a continuous function, shouldn't the first-order derivative at the maximum point m be 0?

The Loss of Training Process is Very Low!!!

Hi, thanks for your great work! I have tried to train this model on my own dataset with ImageNet-pretrained weights. However, during training the loss is very low (about 0.00060 initially). Is this normal, and will it lead to vanishing gradients?
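A loss of this order of magnitude is plausible for heatmap regression, because most target pixels are zero. A rough estimate follows, assuming a mean-squared-error loss averaged over all heatmap pixels, a 64 x 48 heatmap, and a Gaussian target with sigma 2 (all illustrative values), for a network that outputs zeros everywhere:

```python
import numpy as np

h, w, sigma = 64, 48, 2.0
xs = np.arange(w)[None, :]
ys = np.arange(h)[:, None]

# Ground-truth heatmap: a single Gaussian blob in the middle of the map.
target = np.exp(-((xs - 24) ** 2 + (ys - 32) ** 2) / (2 * sigma ** 2))

# A completely uninformative prediction: all zeros.
pred = np.zeros_like(target)

mse = np.mean((pred - target) ** 2)
print(f"MSE of an all-zero prediction: {mse:.5f}")
```

Even this trivial prediction yields an error of only a few thousandths, so a small absolute loss at the start of training does not by itself indicate a problem.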

Question about a small detail

In evaluate.py, function calc_dists, the Euclidean distance will be calculated under the condition:

if target[n, c, 0] > 1 and target[n, c, 1] > 1:

It seems that you exclude the case where the target coordinates are in [0, 1], so why do this?

How to test the model on COCO test-dev set 2017

I want to test the model on the test-dev set, but tools/test.py only supports testing on the validation set (5000 images).

So my question is: how can I test the model on the COCO test-dev 2017 set and submit the json file to the CodaLab server?
I look forward to hearing from you. Thank you for your great work!

AP, AR on MPII Dataset

I am currently working on extending the OKS similarity metric to the MPII dataset. I haven't finished yet, so I may be unaware of some problems, but so far I haven't run into any. However, I wonder why none of the papers report AP and AR on the MPII dataset.

Why does everyone use PCK for MPII and AP for COCO? Is there any particular reason?

Also, if someone has already implemented this, could you share it in this thread?

Thanks.

Can I set up NMS on Windows?

Flip test shifting?

Sorry to bother you, I have a small question. In HRNet and SimpleBaseline, a flip strategy is often used at test time, and the flipped output is usually shifted by 1 pixel. But in your code I didn't find the shifting step. Could you tell me the reason?
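For context, a NumPy sketch of the flip-test step the question refers to. The function names, the joint pairing, and the optional one-pixel shift are illustrative assumptions modeled on the HRNet-style procedure described above, not code from this repository.

```python
import numpy as np

def flip_back(heatmaps, matched_pairs):
    """Undo a horizontal flip on (N, K, H, W) heatmaps and swap left/right joints."""
    restored = heatmaps[:, :, :, ::-1].copy()
    for left, right in matched_pairs:
        # Fancy indexing on the right-hand side copies, so this is a safe swap.
        restored[:, [left, right]] = restored[:, [right, left]]
    return restored

def shift_right_one_pixel(heatmaps):
    """Shift heatmaps one pixel to the right, keeping the first column as is."""
    shifted = heatmaps.copy()
    shifted[:, :, :, 1:] = heatmaps[:, :, :, :-1]
    return shifted

# One sample, two joints (a left/right pair), 4 x 6 heatmaps.
flipped_output = np.zeros((1, 2, 4, 6), dtype=np.float32)
flipped_output[0, 0, 2, 1] = 1.0  # peak predicted on the mirrored image

restored = flip_back(flipped_output, matched_pairs=[(0, 1)])
shifted = shift_right_one_pixel(restored)
print(np.argwhere(restored == 1.0), np.argwhere(shifted == 1.0))
```

In this sketch the restored peak lands in the paired channel at column W - 1 - x, and the shift moves it one further column to the right before the two outputs are averaged.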

inference demo?

Where is this DARK decode part of the algorithm?

Where is this DARK decode part of the algorithm? @xizero00

Explanation for COCO data filtering

It seems from your code that you are selectively discarding some annotations. If I understand correctly, you compute the center of all visible keypoints and the center of the bounding box annotation, and measure the ks between these two points.

However, it is not clear how you selected the values used in this heuristic. In particular:

How did you decide 0.2 in ks = np.exp(-1.0*(diff_norm2**2) / ((0.2)**2*2.0*area))
What does the computed metric = (0.2 / 16) * num_vis + 0.45 - 0.2 / 16 correspond to and what does the number 0.45 represent?
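Written out, the two expressions quoted above behave as follows. ks has the form of an OKS-style Gaussian similarity between the keypoint centroid and the box center, with 0.2 in the position of the per-keypoint constant, and the threshold grows linearly with the number of visible keypoints. The function names and the "keep if ks > threshold" rule are assumptions for illustration; the rationale for the constants is not stated in this thread.

```python
import numpy as np

def center_ks(joints_center, bbox_center, area):
    """Similarity between keypoint centroid and box center (formula from the question)."""
    diff_norm2 = np.linalg.norm(np.asarray(joints_center) - np.asarray(bbox_center), 2)
    return np.exp(-1.0 * (diff_norm2 ** 2) / ((0.2) ** 2 * 2.0 * area))

def ks_threshold(num_vis):
    """Threshold as a function of the number of visible keypoints (from the question)."""
    return (0.2 / 16) * num_vis + 0.45 - 0.2 / 16

for num_vis in (1, 9, 17):
    print(num_vis, round(ks_threshold(num_vis), 4))

# Hypothetical annotation: centroid 20 px from the box center, box area 100 x 200.
ks = center_ks([120.0, 200.0], [100.0, 200.0], area=100.0 * 200.0)
keep = ks > ks_threshold(num_vis=10)
print(round(float(ks), 4), bool(keep))
```

The threshold equals 0.45 + 0.0125 * (num_vis - 1), so 0.45 is the value required when exactly one keypoint is visible, and it rises to 0.65 when all 17 COCO keypoints are visible.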

Thanks a lot! Great work!

Pre-trained models

Hello, where is the w48 256x192 dark pre-trained model?

Why use different BLUR_KERNEL in post-processing?

For the input size of 128x96, BLUR_KERNEL is 3. However, BLUR_KERNEL is 11 for the input size of 256x192. Can you explain the reason? Thanks.

kernel size of gaussian_blur

Hi, thanks for your wonderful work. I still have one question. The kernel size of gaussian_blur for the output map is set to 11, which differs from that of the ground-truth map, which is 1-3. The paper says: "Specifically, to match the requirement of our method we propose exploiting a Gaussian kernel K with the same variation as the training data to smooth out the effects of multiple peaks in the heatmap h"

So how should the kernel size be set? Thanks.
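To make the relation between kernel size and sigma concrete, here is a sketch of how much of a Gaussian's mass a kernel of a given size can cover. It only illustrates kernel truncation; the sigma value and the helper name are assumptions, and it does not reproduce the repository's blur routine.

```python
import math

def kernel_coverage(ksize, sigma, tail=50):
    """Fraction of a sampled 1-D Gaussian's mass that falls inside a ksize-wide kernel."""
    radius = ksize // 2

    def weight(i):
        return math.exp(-(i * i) / (2.0 * sigma * sigma))

    inside = sum(weight(i) for i in range(-radius, radius + 1))
    total = sum(weight(i) for i in range(-tail, tail + 1))
    return inside / total

for ksize in (3, 7, 11):
    print(ksize, round(kernel_coverage(ksize, sigma=2.0), 3))
```

For a sigma of 2 pixels, an 11-tap kernel spans 2.5 sigma on each side and covers nearly all of the mass, while a 3-tap kernel truncates the Gaussian heavily. A kernel size and a sigma are therefore different quantities, and a small training sigma can still call for a kernel several times wider.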

Model weights look bad

About learning rate

Hello.
In the paper, the learning rate is described as:
"the base learning rate was fine-tuned to 2.5e-4, and decayed to 2.5e-5 and 2.5e-6 at the 90-th and 120-th epoch".
In the repo, the initial learning rate is 0.001.

Which one is better? Should I change it to the value described in the paper to reproduce the results?
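For comparison, the step schedule quoted from the paper can be written as a small function. The function name is illustrative, and the epoch counter is assumed to start at 0; the base rate, milestones, and decay factor are the ones quoted above.

```python
def step_lr(epoch, base_lr=2.5e-4, milestones=(90, 120), factor=0.1):
    """Learning rate after applying `factor` once per milestone already reached."""
    num_decays = sum(epoch >= m for m in milestones)
    return base_lr * factor ** num_decays

for epoch in (0, 89, 90, 119, 120, 139):
    print(epoch, step_lr(epoch))
```

Switching the repo's 0.001 to 2.5e-4 only changes base_lr here; the milestones and the factor of 0.1 stay the same.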

Where are pretrained weights of the best model?

Hi, thanks for your great work.

In README.md, there is an evaluation result for HRNet+DARK trained on MSCOCO + AI Challenger with some other techniques (indicated with *-+).
Where can I find the pretrained weights of this best-performing model? I can't find them at the link you provided.

Thanks in advance.

Results on MPII test dataset

Hi, thanks for your great work!

We would like to cite your paper. Could you share the results of your model on the MPII test set?

Inference script in Python for pre-trained models

I'd like to know whether the authors plan to publish a Python inference script for the pre-trained models.

About bottom-up

Can it be used with bottom-up models?

I wonder how you trained Hourglass model...

I downloaded your project and tried to train Hourglass, but got the following error:

=> creating output/coco/hourglass/hg4_128x96_d256x3_adam_lr2
=> creating log/coco/hourglass/hg4_128x96_d256x3_adam_lr2_2021-08-16-12-55
Namespace(cfg='experiments/coco/hourglass/hg4_128x96_d256x3_adam_lr2.5e-4.yaml', dataDir='', logDir='', modelDir='', opts=[], prevModelDir='')
AUTO_RESUME: True
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
  ENABLED: True
DATASET:
  COLOR_RGB: False
  DATASET: coco
  DATA_FORMAT: jpg
  FLIP: True
  HYBRID_JOINTS_TYPE:
  NUM_JOINTS_HALF_BODY: 8
  PROB_HALF_BODY: 0.0
  ROOT: data/coco
  ROT_FACTOR: 40
  SCALE_FACTOR: 0.3
  SELECT_DATA: False
  TEST_SET: val2017
  TRAIN_SET: train2017
DATA_DIR:
DEBUG:
  DEBUG: True
  SAVE_BATCH_IMAGES_GT: True
  SAVE_BATCH_IMAGES_PRED: True
  SAVE_HEATMAPS_GT: True
  SAVE_HEATMAPS_PRED: True
GPUS: (0,)
LOG_DIR: log
LOSS:
  TOPK: 8
  USE_DIFFERENT_JOINTS_WEIGHT: False
  USE_OHKM: False
  USE_TARGET_WEIGHT: True
MODEL:
  EXTRA:
    NUM_BLOCKS: 1
    NUM_FEATURES: 256
    NUM_STACKS: 4
  HEATMAP_SIZE: [24, 32]
  IMAGE_SIZE: [96, 128]
  INIT_WEIGHTS: False
  NAME: hourglass
  NUM_JOINTS: 17
  PRETRAINED: models/pytorch/imagenet/resnet50-19c8e357.pth
  SIGMA: 1
  TAG_PER_JOINT: True
  TARGET_TYPE: gaussian
OUTPUT_DIR: output
PIN_MEMORY: True
PRINT_FREQ: 100
RANK: 0
TEST:
  BATCH_SIZE_PER_GPU: 32
  BBOX_THRE: 1.0
  BLUR_KERNEL: 11
  COCO_BBOX_FILE: data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json
  FLIP_TEST: True
  IMAGE_THRE: 0.0
  IN_VIS_THRE: 0.2
  MODEL_FILE:
  NMS_THRE: 1.0
  OKS_THRE: 0.9
  POST_PROCESS: True
  SOFT_NMS: False
  USE_GT_BBOX: True
TRAIN:
  BATCH_SIZE_PER_GPU: 8
  BEGIN_EPOCH: 0
  CHECKPOINT:
  END_EPOCH: 140
  GAMMA1: 0.99
  GAMMA2: 0.0
  LR: 0.00025
  LR_FACTOR: 0.1
  LR_STEP: [90, 120]
  MOMENTUM: 0.9
  NESTEROV: False
  OPTIMIZER: adam
  RESUME: False
  SHUFFLE: True
  WD: 0.0001
WORKERS: 24
The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3
Error occurs, No graph saved
Traceback (most recent call last):
File "/home/fl/dark/tools/train.py", line 223, in
main()
File "/home/fl/dark/tools/train.py", line 111, in main
writer_dict['writer'].add_graph(model, (dump_input, ))
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/tensorboardX/writer.py", line 945, in add_graph
self._get_file_writer().add_graph(graph(model, input_to_model, verbose))
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 292, in graph
raise e
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 286, in graph
trace = torch.jit.trace(model, args)
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/jit/_trace.py", line 742, in trace
_module_class,
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/jit/_trace.py", line 940, in trace_module
_force_outplace,
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 182, in forward
y = self.hgi
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 887, in _call_impl
result = self._slow_forward(*input, **kwargs)
File "/home/fl/miniconda3/envs/pose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 860, in _slow_forward
result = self.forward(*input, **kwargs)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 95, in forward
return self._hour_glass_forward(self.depth, x)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 86, in _hour_glass_forward
low2 = self._hour_glass_forward(n-1, low1)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 86, in _hour_glass_forward
low2 = self._hour_glass_forward(n-1, low1)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 86, in _hour_glass_forward
low2 = self._hour_glass_forward(n-1, low1)
File "/home/fl/dark/tools/../lib/models/hourglass.py", line 91, in _hour_glass_forward
out = up1 + up2
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 3
Process finished with exit code 1

The code has a small flaw

In inference.py, in the function get_max_preds, I think the following line needs a - 1 appended:
preds[:, :, 0] = (preds[:, :, 0]) % width
It should be corrected to:
preds[:, :, 0] = (preds[:, :, 0]) % width - 1
What is your opinion?
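A quick check of the index arithmetic, assuming the heatmap is flattened row by row and argmax returns a 0-based index (as NumPy does). The variable names are illustrative.

```python
import numpy as np

height, width = 4, 6
hm = np.zeros((height, width), dtype=np.float32)
hm[2, 5] = 1.0  # peak at row 2, column 5 (last column)

idx = int(np.argmax(hm.reshape(-1)))  # 0-based index into the flattened map
x = idx % width    # column
y = idx // width   # row
print(idx, (x, y))
```

Under that assumption, idx % width is already the 0-based column, so subtracting 1 would move every prediction one pixel to the left and produce -1 for peaks in the first column.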

Why is the test result on HRNet with DarkPose the same as the result without DARK?

I ran DarkPose's open-source code. The test result on HRNet is the same as the result without DARK, with no improvement. What is the reason? The configuration file and the pretrained model are both the ones from HRNet.
