In This work, a bi-directional long-short term memory (LSTM) framework is proposed to refine pose estimation and tracking for multiple people. The key idea of our algorithm is to learn the temporal consistencies of the human body shapes between subsequent frames. This helps removing the wrong sudden outliers and improve the general smoothness of the pose tracking. The proposed approach has been evaluated on PoseTrack dataset for both the validation and test subset sequences. The overall detection and tracking results have been improved over the frame-by-frame only baseline detection.