
rgbd-pose3d's People

Contributors

zimmerm


rgbd-pose3d's Issues

Hardware requirements?

Hi,

I am trying to run your 3D human pose estimation with an RGB-D camera (Intel RealSense R200), and I would like to know the hardware requirements. TensorFlow detects my two GPUs (an NVIDIA Quadro P4000 with 8 GB of memory and an NVIDIA GeForce GTX 1050 with 4 GB), and I managed to run forward_pass.py on one and ROSNode.py on the other, but I am getting the warning: "Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.40GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available." I could live with that if I got a meaningful result, but the TF tree of the human doesn't look correct, and I am not sure whether the cause is the lack of memory or something else.
(By the way, I can run OpenPose 2D detection on three cameras at the same time with this GPU configuration.)

Thank you in advance; I'm looking forward to your response!
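A minimal sketch of one thing worth trying, using standard TF 1.x session options (this is not code from the repo): pin each script to one GPU and let TensorFlow grow its allocations on demand, which helps tell whether the odd TF tree is really a memory issue.

# Hedged sketch, standard TF 1.x options (not repo code): pin the session to
# the larger GPU and allocate memory on demand instead of all at once.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True        # grab memory only as needed
config.gpu_options.visible_device_list = '0'  # e.g. the 8 GB Quadro P4000
session = tf.Session(config=config)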

2D projection resulting in points that 'jump around', as if snapping to a low-resolution grid

Hi there

Amazing work. We've started using your model on the output of an Azure Kinect DK, because the model included in their SDK does not handle occlusions very well.

One odd thing we noticed is that the 2D keypoints, when visualized on a video, don't move smoothly along with the real joints. They take discrete jumps of a certain number of pixels, as if the model output were constrained to a low-resolution grid.

Is this to be expected?

Thank you!
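One possible explanation, offered as an assumption rather than a confirmed cause: if the 2D keypoints are read off a downsampled scoremap with a hard argmax, they can only move in steps of the scoremap stride. A weighted centroid around the peak gives sub-pixel estimates; the following is a generic sketch, not the repo's decoding code.

# Hedged sketch: sub-pixel peak refinement on a single-keypoint scoremap,
# assuming the jumps come from a hard argmax on a low-resolution map.
import numpy as np

def subpixel_peak(scoremap, radius=2):
    """scoremap: (H, W) array of non-negative scores; returns (u, v) floats."""
    v0, u0 = np.unravel_index(np.argmax(scoremap), scoremap.shape)
    v_lo, v_hi = max(v0 - radius, 0), min(v0 + radius + 1, scoremap.shape[0])
    u_lo, u_hi = max(u0 - radius, 0), min(u0 + radius + 1, scoremap.shape[1])
    patch = scoremap[v_lo:v_hi, u_lo:u_hi]
    vv, uu = np.mgrid[v_lo:v_hi, u_lo:u_hi]
    w = patch / patch.sum()                      # normalized weights
    return float((uu * w).sum()), float((vv * w).sum())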

RGB to depth alignment

Hey there,

Thanks for your work!

I'm currently facing a problem aligning the depth map and RGB frame from your dataset.

My idea is to back-project the depth map into the depth camera's coordinate frame, transform it to world coordinates, and then project it back onto the color image plane. For the sake of testing, I've tried doing this without worrying about the depth values, and using the API you've provided as much as possible.

These are the steps I'm taking right now:

  1. Project from depth pixel space to the depth camera's coordinate frame with my own function (shown below):

import numpy as np

def project_from_view(depth, camid, calib_data):
    """ Back-project every depth pixel into the depth camera's coordinate frame. """
    depth_map = np.asarray(depth, dtype=np.float64)
    intrinsics = calib_data[camid][0]
    fx, fy = intrinsics[0][0], intrinsics[1][1]
    cx, cy = intrinsics[0][2], intrinsics[1][2]
    # np.indices returns (row, col) = (v, u); the intrinsics expect (u, v)
    v, u = np.indices(depth_map.shape)
    z = depth_map.flatten()
    x = (u.flatten() - cx) * z / fx   # x = (u - cx) * z / fx
    y = (v.flatten() - cy) * z / fy   # y = (v - cy) * z / fy
    return np.vstack((x, y, z)).T

  2. Project to the world coordinate frame with the function from your API:
    trafo_cam2world(pts_cam_d, depth_cam_id, calib_frame)

  3. Project from world coordinates to the color image plane with the function from your API:
    project_from_world_to_view(pts_world, color_cam_id, calib_frame)

After applying those steps I get the following map:
[image: projection results]

The depth map at the bottom right is shifted, but I'm sure that I've used the correct intrinsics and extrinsics.

Moreover, I've tested my projection function separately and don't think the issue is there.

Is this normal behavior? It's hard to believe that the reprojection gives such a big displacement while the 3D points are perfectly mapped and aligned.

Possible to run on CPU?

Hello,

I'm currently downloading the weights and the repo (it's going to take a while). I want to try running the network on a low-powered robot (quad-core Atom at 1.6 GHz, no dedicated GPU). I saw in the comments of forward_pass.py that it reaches 1-2 Hz on beefy GPUs. Do you think it is worth testing for my use case?

Thank you.
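For a quick feasibility check before deploying to the robot, hiding the GPUs forces TensorFlow onto the CPU, and timing one forward pass then gives a rough upper bound on the achievable rate. This is standard TensorFlow behavior, not repo-specific advice.

# Hedged sketch: benchmark a CPU-only run by hiding all GPUs.
# CUDA_VISIBLE_DEVICES must be set before tensorflow is imported.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''

import tensorflow as tf  # now sees no GPUs and falls back to the CPU
# ... build the graph as in forward_pass.py, then time session.run(...)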

What are the inputs to VoxelPoseNet during training?

Hi, thank you for your work!
I wonder how VoxelPoseNet is trained. Are its inputs the 2D ground truth, or the 2D predictions of the 2D network? And is it possible to cascade the 2D and 3D networks to make an end-to-end network?

How is the intrinsic calibration data defined?

Much appreciated for your help and quick reply!
I have another question, about the warped depth image. As your article mentions, the depth map is transformed into the color frame using the camera calibration.
The intrinsic calibration data in forward_pass.py seems to be the product of the depth camera's intrinsic parameters and the resolution scaling ratios.
Why not use the color camera's intrinsic calibration directly, since the depth map is already in the color space?
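For reference, this is how an intrinsic matrix is usually rescaled when the image resolution changes, which may be what forward_pass.py is doing; the function below is an illustrative sketch, not code from the repo.

# Hedged sketch: rescaling a 3x3 intrinsic matrix K after resizing an image
# from (src_h, src_w) to (dst_h, dst_w). fx and cx scale with the width
# ratio, fy and cy with the height ratio.
import numpy as np

def scale_intrinsics(K, src_hw, dst_hw):
    sx = dst_hw[1] / src_hw[1]  # width ratio
    sy = dst_hw[0] / src_hw[0]  # height ratio
    return np.diag([sx, sy, 1.0]) @ K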

Whose work is 'Naive Lifting'?

Hello, I am interested in your work and have a small question. I read your paper and saw that you compare your method against one called 'Naive Lifting', but you don't say whose work 'Naive Lifting' is. I want to study it further. Can you tell me where 'Naive Lifting' comes from? Thank you.

What is the channel order of the network output, [1, D, H, W, C] or [1, W, H, D, C]?

Thanks for your work!
I followed your VoxelPoseNet and rewrote it using TensorLayer.
The 3D network's results on my own data are not so good, and the channel order may be the reason.
The docstring of PoseNet3D._detect_scorevol says "Tensor scorevolume is [1, D, H, W, C]".
I didn't find any channel rearrangement in the code that follows, yet the final result uses 'xyz' order.

Any answers? Thanks a lot.
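For what it's worth, a [1, D, H, W, C] layout can be reconciled with an 'xyz' result by reversing the axis order when reading out the argmax; the snippet below is a sketch of that readout under the docstring's assumption, not the repo's actual code.

# Hedged sketch: argmax readout from a [1, D, H, W, C] score volume, with the
# (D, H, W) = (z, y, x) axes reversed to produce 'xyz' coordinates.
import numpy as np

scorevol = np.random.rand(1, 64, 64, 64, 18)  # dummy volume, shapes illustrative
coords_xyz = np.zeros((scorevol.shape[-1], 3))
for c in range(scorevol.shape[-1]):
    d, h, w = np.unravel_index(np.argmax(scorevol[0, :, :, :, c]),
                               scorevol.shape[1:4])
    coords_xyz[c] = (w, h, d)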

How did you calibrate the Kinect v2 devices when collecting the MKV dataset?

Hi, thank you so much for your work. We are interested in generating a pedestrian walking dataset using a similar approach. However, due to our limited resources, we first want to learn how others have done it. Would you mind sharing a brief overview of your Kinect calibration method? Did you use any special technique to make it work?

Problem with 3d plot

Thanks for the code!
But I simply can't draw a 3D plot with forward_pass.py. I tried to call the 3D COCO limb-plotting function, but the result doesn't look like the teaser, and hand_plot_3d simply doesn't work. What is the right way to get a plot like the teaser?
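In case it helps, a teaser-like plot can be produced with plain matplotlib; this is a generic sketch with made-up limb pairs, not the repo's own plotting utility.

# Hedged sketch: minimal 3D keypoint/limb plot with matplotlib.
# 'limbs' lists joint-index pairs and is illustrative, not the COCO definition.
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (enables 3d projection)

def plot_pose3d(coords_xyz, limbs=((0, 1), (1, 2))):
    """coords_xyz: (num_kp, 3) numpy array of keypoints."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(coords_xyz[:, 0], coords_xyz[:, 1], coords_xyz[:, 2])
    for i, j in limbs:
        ax.plot(coords_xyz[[i, j], 0], coords_xyz[[i, j], 1], coords_xyz[[i, j], 2])
    plt.show()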

Compatibility issues

Hi,

I am trying to run this code on Ubuntu 18.04 with CUDA 11.
More info about my driver and card:
NVIDIA-SMI 460.80
Driver Version: 460.80
CUDA Version: 11.2

I am encountering a series of problems with this recent configuration.
I tried to downgrade to CUDA 10, but the NVIDIA 418 drivers (required for CUDA 10) are not compatible with my graphics card.

Are you planning to update this code to more recent libraries?
Do you have any suggestions on how to use your module with more recent hardware configurations?

Thanks
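One workaround worth trying, with no guarantee it covers everything this repo uses (tf.contrib ops, for instance, are gone in TF 2.x): TensorFlow 2 ships a v1 compatibility layer that sometimes lets legacy graph-mode code run on recent CUDA versions.

# Hedged sketch: run legacy TF 1.x graph code under TF 2.x (which supports
# CUDA 11) via the compatibility layer. Untested with this repo.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# tf.Session, tf.placeholder, etc. keep working from here on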

Latency of ROS node

Hi, thanks for sharing the results of your research! When trying out your ROS node, I noticed there is sometimes quite a bit of latency, even though you set queue_size to 1 on both image subscribers. This is usually caused by a too-small buff_size. Basically, the trick is that whenever you subscribe to large messages (such as images) and want to achieve low latency by setting a small queue_size, you should also supply the keyword argument buff_size to the constructor and set it to queue_size * avg_msg_size_in_bytes.

In your code example, I set buff_size to 10000000 (~10 MB), so that an average image message now easily fits in there. This way, messages no longer queue up in the operating system's buffer, which is what induces the latency (and you may not even need a separate processing thread anymore).

This issue is only relevant for rospy, not roscpp. See here for further details: ros/ros_comm#536
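A sketch of the suggested change, with an illustrative node and topic name rather than the ones from ROSNode.py:

# Hedged sketch: low-latency image subscription in rospy. buff_size must be
# large enough to hold at least one full message, here ~10 MB.
import rospy
from sensor_msgs.msg import Image

def callback(msg):
    pass  # hand the image to the network here

rospy.init_node('pose_listener')
rospy.Subscriber('/camera/color/image_raw', Image, callback,
                 queue_size=1, buff_size=10000000)
rospy.spin()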

Measure of accuracy

Hi, thank you for releasing the code. I have a question about the way the results are reported in your paper. Although you compare your results with Tome et al., the comparison is only visual. Is there any additional resource online where you also compare the results in terms of mean per joint position error (MPJPE)?

Thanks!
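For reference, MPJPE as it is usually defined (this is the common convention, not necessarily the paper's exact protocol, which may include root alignment):

# Hedged sketch: mean per joint position error, the common definition.
import numpy as np

def mpjpe(pred, gt):
    """pred, gt: (num_joints, 3) arrays in the same coordinate frame (e.g. mm)."""
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))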

Where can I find the training dataset?

Thank you for your great work.
I want to reproduce your results, but I can't find the dataset for training and testing.
Has the dataset been published? If so, can you tell me where I can find it?
Thanks in advance.
