Paper ID | IVMSP-24.2 | ||
Paper Title | LIGHTWEIGHT HUMAN POSE ESTIMATION UNDER RESOURCE-LIMITED SCENES | ||
Authors | Zhe Zhang, Jie Tang, Gangshan Wu, Nanjing University, China | ||
Session | IVMSP-24: Applications 2 | ||
Location | Gather.Town | ||
Session Time | Thursday, 10 June, 15:30 - 16:15 | ||
Presentation Time | Thursday, 10 June, 15:30 - 16:15 | ||
Presentation | Poster | ||
Topic | Image, Video, and Multidimensional Signal Processing: [IVTEC] Image & Video Processing Techniques | ||
Abstract | Recent research on human pose estimation has achieved significant improvements. However, most existing methods pursue higher scores on benchmark datasets with complex architectures, ignoring deployment costs in practice. In this paper, we investigate the problem of lightweight human pose estimation under resource-limited scenes. We first redesign a lightweight bottleneck block using two concepts: depthwise convolution and an attention mechanism. Based on this block, we then present a single-stage Lightweight Pose Network (LPN). Our small network, LPN-50, has only 2.7M parameters and 1.0G FLOPs, making it much more lightweight than other popular networks. To overcome the training barrier, we propose an iterative training strategy that lets our LPNs reach their full potential and produce more accurate predictions. We empirically demonstrate the effectiveness and efficiency of our methods on the COCO keypoint detection benchmark. We also show the speed advantage of our lightweight network at inference time on a non-GPU platform. Specifically, LPN-50 achieves an AP score of 68.7 on the COCO test-dev set at an inference speed of 17 FPS on an Intel i7-8700K (6-core) CPU machine. |
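
The abstract names depthwise convolution as one of the two ideas behind the redesigned bottleneck block. The arithmetic below is a minimal sketch of why that choice reduces parameters; it is not the LPN block itself. The helper `conv_params`, the channel width of 64, and the 3x3 kernel are assumptions chosen for illustration and do not come from the paper.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Count the learnable parameters of a 2D convolution with a k x k kernel.

    Hypothetical helper for illustration. With `groups` > 1, each output
    channel only sees c_in / groups input channels.
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


channels = 64  # assumed width, not taken from the paper
kernel = 3     # assumed kernel size, not taken from the paper

# Standard convolution: every output channel mixes all input channels.
standard = conv_params(channels, channels, kernel)

# Depthwise convolution: groups == channels, one k x k filter per channel.
depthwise = conv_params(channels, channels, kernel, groups=channels)

ratio = standard // depthwise

print(f"standard 3x3 conv:  {standard} parameters")
print(f"depthwise 3x3 conv: {depthwise} parameters")
print(f"reduction factor:   {ratio}x")
```

Under these assumptions the depthwise layer needs a factor of `channels` fewer weights than the standard layer, because it drops the cross-channel mixing. A depthwise layer is usually paired with 1x1 convolutions that restore channel mixing, which a bottleneck block already contains.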