Paper ID | MMSP-5.3
Paper Title | A Triplet Appearance Parsing Network for Person Re-Identification
Authors | Mingfu Xiong, Engineering Research Center of Hubei Province for Clothing Information, School of Mathematics and Computer Science, Wuhan Textile University, China; Zhongyuan Wang, School of Computer Science, Wuhan University, Wuhan 430072, China; Ruhan He, Xinrong Hu, School of Mathematics and Computer Science, Wuhan Textile University, China; Ming Cheng, Engineering Research Center of Hubei Province for Clothing Information, School of Mathematics and Computer Science, Wuhan Textile University, China; Xiao Qin, Shelby Center for Engineering Technology, Samuel Ginn College of Engineering, Auburn University, USA; Jia Chen, School of Mathematics and Computer Science, Wuhan Textile University, China
Session | MMSP-5: Human Centric Multimedia 1
Location | Gather.Town
Session Time | Thursday, 10 June, 14:00 - 14:45
Presentation Time | Thursday, 10 June, 14:00 - 14:45
Presentation | Poster
Topic | Multimedia Signal Processing: Signal Processing for Multimedia Applications
Abstract | As a specific vision task, person re-identification has become a prevalent research topic in multimedia and computer vision. However, existing feature extraction methods struggle to cope with harsh real-world scenarios: the quality of the detected bounding boxes, together with cluttered backgrounds, leads to inhomogeneous and incoherent person representations. This study develops a Triplet person Appearance Parsing Framework (TAPF) that eliminates the surrounding interference within the bounding boxes for person re-identification. The framework consists of a triplet person parsing network and an integration mechanism for local and global person appearance information. Concretely, the triplet parsing network comprises a channel parsing module, a position parsing module, and a color parsing module, which extract the channel parsing descriptor, the regional descriptor, and the color perception descriptor of a person, respectively. Local and global flatten Gaussian operations are then performed to integrate these appearance parsing descriptors into a more discriminative person representation. Experiments on several public datasets, i.e., VIPeR and Market-1501, validate that the proposed algorithm achieves better performance for person re-identification.
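For orientation, below is a minimal PyTorch sketch of a three-branch parsing head of the kind the abstract describes: three parallel modules produce channel, regional, and color descriptors that are fused into one person representation. It is not the authors' implementation; the module names (ChannelParse, PositionParse, ColorParse), every layer choice, and the plain concatenation used in place of the local/global flatten Gaussian fusion are assumptions made purely for illustration.

    # Hedged sketch only: a generic triplet-branch parsing head, not the paper's TAPF code.
    import torch
    import torch.nn as nn

    class ChannelParse(nn.Module):
        """Channel parsing descriptor via global pooling + projection (assumed design)."""
        def __init__(self, in_ch, dim=256):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Linear(in_ch, dim)

        def forward(self, x):                          # x: (B, C, H, W)
            return self.fc(self.pool(x).flatten(1))    # (B, dim)

    class PositionParse(nn.Module):
        """Regional descriptor via a spatial attention map (assumed design)."""
        def __init__(self, in_ch, dim=256):
            super().__init__()
            self.attn = nn.Conv2d(in_ch, 1, kernel_size=1)
            self.fc = nn.Linear(in_ch, dim)

        def forward(self, x):
            w = torch.sigmoid(self.attn(x))            # (B, 1, H, W) attention weights
            pooled = (x * w).mean(dim=(2, 3))          # attention-weighted average pooling
            return self.fc(pooled)

    class ColorParse(nn.Module):
        """Color perception descriptor via a 1x1 projection (assumed design)."""
        def __init__(self, in_ch, dim=256):
            super().__init__()
            self.proj = nn.Conv2d(in_ch, dim, kernel_size=1)

        def forward(self, x):
            return self.proj(x).mean(dim=(2, 3))       # (B, dim)

    class TripletParsingHead(nn.Module):
        """Fuse the three descriptors; concatenation stands in for the paper's Gaussian fusion."""
        def __init__(self, in_ch=2048, dim=256):
            super().__init__()
            self.channel = ChannelParse(in_ch, dim)
            self.position = PositionParse(in_ch, dim)
            self.color = ColorParse(in_ch, dim)

        def forward(self, feat_map):
            return torch.cat(
                [self.channel(feat_map), self.position(feat_map), self.color(feat_map)],
                dim=1,
            )                                          # (B, 3 * dim)

    if __name__ == "__main__":
        head = TripletParsingHead()
        feats = torch.randn(4, 2048, 24, 8)            # e.g. a ResNet-50 backbone feature map
        print(head(feats).shape)                       # torch.Size([4, 768])

In this sketch, each branch reduces the shared backbone feature map to a fixed-length descriptor; any discriminative fusion (such as the local and global flatten Gaussian operations named in the abstract) could replace the concatenation step.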