Hi, thanks for your nice work!
I'm confused by the initialization of the attention weights: they are all set to zero.
BEVFormer/projects/mmdet3d_plugin/bevformer/modules/temporal_self_attention.py, line 123 (commit 66b65f3)
Why are the weights set to zero? Won't gradient vanishing happen in the linear layer? Since the weights start at zero, won't the gradients of the weights also be zero during backpropagation?
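For what it's worth, here is a minimal PyTorch sketch (hypothetical, not the BEVFormer code itself) that probes this premise with a zero-initialized linear layer. The gradient with respect to the weight is `grad_output^T @ x`, which depends on the input rather than on the current weight value, so the weight's own gradient is generally nonzero; only the gradient flowing back *through* the layer to its input is zeroed while the weights remain at zero.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the zero-initialized attention-weight
# projection in question (not the actual BEVFormer module).
proj = nn.Linear(4, 4)
nn.init.constant_(proj.weight, 0.0)
nn.init.constant_(proj.bias, 0.0)

x = torch.randn(2, 4, requires_grad=True)
out = proj(x)      # all zeros, since weight and bias are zero
loss = out.sum()
loss.backward()

# dL/dW = grad_output^T @ x: depends on the input, not the weight,
# so the weight gradient is generally nonzero despite the zero init.
print(proj.weight.grad)   # nonzero in general

# dL/dx = grad_output @ W = 0 here: gradients to the *input* are
# blocked only until the weights move away from zero.
print(x.grad)             # all zeros on the first step
```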