
Predict pose using my images #120

Open
agenthong opened this issue Dec 14, 2020 · 14 comments

@agenthong

Hi, @karfly
Thanks for sharing this great repo.
I've trained the model on the Human3.6M dataset. After that, I took 2D heatmaps of other images, unprojected them with my own calibration, and fed them to the trained model. But I get a result like this:
[image]
Is this the 3D pose?
I also think this result may be in a different coordinate system. If so, how can I get the corresponding poses in my images' coordinate system?

@karfly
Owner

karfly commented Dec 14, 2020

Hey, @agenthong!
Yes, this tensor contains the 3D locations of the joints. You can project them back onto the image using your camera's calibration matrix.
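For reference, a minimal sketch of that projection, assuming a 3x4 projection matrix P = K [R | t] and the joints as an (N, 3) array in the same coordinate frame; the function name is illustrative, not from this repo:

import numpy as np

def project_to_image(keypoints_3d, P):
    # Homogenize the (N, 3) joints to (N, 4)
    points_h = np.hstack([keypoints_3d, np.ones((len(keypoints_3d), 1))])
    # Apply the 3x4 projection matrix and divide by the homogeneous coordinate
    projected = points_h @ P.T
    return projected[:, :2] / projected[:, 2:3]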

@agenthong
Author

Hey, @agenthong!
Yes, this tensor contains the 3D locations of the joints. You can project them back onto the image using your camera's calibration matrix.

Thanks for replying.
Is this step in the code? Could you point me to where it is?

@karfly
Owner

karfly commented Dec 15, 2020

You can find an example here.

@chaisheng-dawnlight

Hi @agenthong @karfly, I also used my own data to predict a 3D pose, like this:
[Screenshot 2020-12-17 10:52:02]
My question is how to visualize the result like this:
[Screenshot 2020-12-17 10:55:28]
I generate the result as follows:
[Screenshot 2020-12-17 10:59:46]
Does the predicted result need further post-processing?

@agenthong
Author

You can find an example here.

Yeah, but I want to get the 3D joints; that example projects the tensor onto the 2D images.

@karfly
Owner

karfly commented Dec 17, 2020

@agenthong
To convert the 3D points to your camera's coordinate system, you need to apply the camera's rotation (R) and translation (t) to them.
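A minimal sketch of that transform, assuming R is the 3x3 world-to-camera rotation, t the 3-vector translation, and the joints an (N, 3) array in world coordinates (names are illustrative):

import numpy as np

def world_to_camera(points_world, R, t):
    # X_cam = R @ X_world + t, applied row-wise to the (N, 3) array
    return points_world @ R.T + t.reshape(1, 3)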

@karfly
Owner

karfly commented Dec 17, 2020

@chaisheng-dawnlight
Try visualizing these 3D points, without any further processing, using matplotlib's 3D scatter function (https://www.geeksforgeeks.org/3d-scatter-plotting-in-python-using-matplotlib/).
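A minimal plotting sketch along those lines; the (17, 3) placeholder array stands in for your predicted joints:

import numpy as np
import matplotlib.pyplot as plt

keypoints_3d = np.zeros((17, 3))  # replace with your predicted (N, 3) joints

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(keypoints_3d[:, 0], keypoints_3d[:, 1], keypoints_3d[:, 2])
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()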

@agenthong
Author

@agenthong
To convert the 3D points to your camera's coordinate system, you need to apply the camera's rotation (R) and translation (t) to them.

Thanks a lot! So does that mean this tensor contains the 3D joints in the world coordinate system?

@chaisheng-dawnlight

@chaisheng-dawnlight
Try visualizing these 3D points, without any further processing, using matplotlib's 3D scatter function (https://www.geeksforgeeks.org/3d-scatter-plotting-in-python-using-matplotlib/).

@karfly Hi, thanks for your reply. I used the scatter function to draw the 3D pose, but it still doesn't work. This is my visualization code:
[Screenshot 2020-12-18 19:36:49]

@karfly
Owner

karfly commented Dec 18, 2020

@agenthong
Actually, it's in the coordinate system of the 1st camera, which usually coincides with the world coordinate system.

@karfly
Owner

karfly commented Dec 18, 2020

@chaisheng-dawnlight
What plot do you get from this code?

@agenthong
Author

@agenthong
Actually, it's in the coordinate system of the 1st camera, which usually coincides with the world coordinate system.

I have GT 3D keypoints like this:
[image]
They are quite different from my result:
[image]
To sum up, what is the format of the output? And how can I compare my prediction with the GT in the same format?

@roselidev

roselidev commented May 26, 2021

Hi, I'm also experiencing a similar problem. I followed @karfly's suggestion:

@chaisheng-dawnlight
Try visualizing these 3D points, without any further processing, using matplotlib's 3D scatter function (https://www.geeksforgeeks.org/3d-scatter-plotting-in-python-using-matplotlib/).

The ground-truth points look like this (I used draw_3d_pose from this repository):

And the pretrained model's prediction looks like this:

I've plotted the 3D points as you suggested, and it looks like this:

I currently have no idea what error produced this result.

I used 4 views of a single pose, with the corresponding camera parameters. Here's the code for how I got the predicted 3D points.

# NOTE: load_config and VolumetricTriangulationNet come from this repo's mvn package;
# process_annotation_json and process_images_batch below are my own helper functions
import numpy as np
import torch

# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)

#OUTPUT#
#Loading pretrained weights from: ./model/pretrained/resnet/pose_resnet_4.5_pixels_human36m.pth
#Reiniting final layer filters: module.final_layer.weight
#Reiniting final layer biases: module.final_layer.bias
#Successfully loaded pretrained weights for backbone

# PROCESSING MODEL INPUT
annotations_path = ['./data/anno/16-1_001-C01_3D.json', './data/anno/16-1_001-C02_3D.json', './data/anno/16-1_001-C03_3D.json', './data/anno/16-1_001-C04_3D.json']
device = torch.device('cpu')
batch_keypoints_3d = []
cameras = []
for path in annotations_path:
    _, _, _, keypoints_3d, camera = process_annotation_json(path)
    batch_keypoints_3d.append(keypoints_3d)
    cameras.append(camera)

batch = {'cameras' : cameras, 'pred_keypoints_3d' : batch_keypoints_3d}
images_batch = process_images_batch(np.array(images_data))  # images_data: my own loaded input images
proj_matricies_batch = torch.stack([torch.stack([torch.from_numpy(cam.projection) for cam in c])for c in cameras])
proj_matricies_batch = proj_matricies_batch.float().to(device)

# FORWARD MODEL
keypoints_3d_pred, heatmaps_pred, volumes_pred, confidences_pred, cuboids_pred, coord_volumes_pred, base_points_pred = model(images_batch, proj_matricies_batch, batch)

@roselidev

roselidev commented May 27, 2021

I found that I hadn't loaded the pretrained weights, so I added code like this:
before:

# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)

after:

# MODEL LOAD
config = load_config('./model/pretrained/learnable_triangulation_volumetric/human36m_vol_softmax.yaml')
model = VolumetricTriangulationNet(config)
if config.model.init_weights:
    state_dict = torch.load(config.model.checkpoint)
    for key in list(state_dict.keys()):
        new_key = key.replace("module.", "")
        state_dict[new_key] = state_dict.pop(key)
    model.load_state_dict(state_dict, strict=True)
    print("Successfully loaded pretrained weights for whole model")

I took this code from train.py.

And the result is still much the same as before.

What I suspect is that keypoints_3d_pred is at a different coordinate scale than the Human3.6M GT data.
I'd appreciate any help on how I should process my 3D ground-truth points.
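Not specific to this repo, but one common sanity check when the prediction and GT might differ by a constant offset is to root-center both poses before comparing them; the root joint index (6, pelvis in a 17-joint Human3.6M layout) and millimetre units are assumptions, so check your own joint ordering:

import numpy as np

def root_centered(keypoints_3d, root_idx=6):
    # Subtract the root (pelvis) joint so both poses share an origin
    return keypoints_3d - keypoints_3d[root_idx:root_idx + 1]

def mpjpe(pred, gt):
    # Mean per-joint position error between two (N, 3) arrays
    return np.linalg.norm(pred - gt, axis=1).mean()

# usage sketch:
# error = mpjpe(root_centered(pred_keypoints), root_centered(gt_keypoints))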
