Multi-Person 3D Motion Prediction with Multi-Range Transformers

We show the results of our method. Green represents the input and blue represents the predicted output. The results show that our method can predict smooth and natural multi-person 3D motion.

We compare our method with other methods. Green represents the input and blue represents the predicted output. Our results are not only the closest to the real recordings but also smooth and natural. The RNN-based method (SocialPool) quickly produces frozen motion. When predicting absolute skeleton joint positions, decoding from an input seed sequence (HRI) or adding the input sequence as a residual to the output (LTD) makes the predicted motion lag behind (hysteresis) and repeat the history.
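The residual effect described above can be illustrated with a toy sketch. This is not the LTD implementation; it only assumes a decoder that adds a predicted offset to the last observed pose. The function name `residual_decode`, the frame counts, and the joint count are hypothetical choices made for illustration. When the predicted offsets are close to zero, the output in absolute coordinates stays pinned to the last observed pose, which is one way a prediction can end up repeating the history.

```python
import numpy as np


def residual_decode(observed, predicted_offsets):
    """Toy residual decoder (illustrative assumption, not the LTD code).

    observed:          (T_in, J, 3) absolute joint positions of the input.
    predicted_offsets: (T_out, J, 3) offsets produced by some model.
    Returns (T_out, J, 3) absolute joint positions.
    """
    last_pose = observed[-1]  # (J, 3), the final observed frame
    # Every future frame is the last observed pose plus a predicted offset.
    return last_pose[np.newaxis] + predicted_offsets


# Hypothetical sizes: 15 input frames, 45 output frames, 13 joints.
rng = np.random.default_rng(0)
observed = rng.normal(size=(15, 13, 3))

# A model that predicts (near-)zero offsets, as an extreme case.
small_offsets = np.zeros((45, 13, 3))
future = residual_decode(observed, small_offsets)

# Largest per-joint distance between any predicted frame and the last input frame.
drift = np.linalg.norm(future - observed[-1], axis=-1).max()
print("max drift from last observed pose:", drift)
```

In this extreme case the drift is zero, so every predicted frame is a copy of the last input frame. A real model predicts non-zero offsets, but the sketch shows why the residual term anchors the output to the input history.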