TY - JOUR
AU1 - Liu, Tongcun
AU2 - Liu, Haoxin
AU3 - Wang, Yulong
AB - The explosive growth in online video streaming presents challenges for video understanding with high accuracy and low computation complexity. Recent methods have realized global video representation without considering the local spatial structures of the videos over time. In this paper, we propose a method called partial channel fusion (PCF), which exploits local spatio-temporal characteristics for video understanding. We also present an agnostic and effective module for PCF which can provide both high efficiency and high performance in a variety of networks. Rather than independently modeling the spatial structure and motion structure of videos, the PCF module enables information exchange among multiple frames by partially fusing channels over the temporal dimension. By inserting the PCF module into different layers of a 2D convolutional network (2D-convNet), the local and global spatio-temporal characteristics of videos can be captured. Experimental results on two challenging datasets demonstrate the superiority of PCF in improving the accuracy of 2D-convNets, advancing the state-of-the-art without increasing computational complexity.
TI - Exploiting local spatio-temporal characteristics for effective video understanding
JF - Multimedia Tools and Applications
DO - 10.1007/s11042-021-11093-7
DA - 2021-09-01
UR - https://www.deepdyve.com/lp/springer-journals/exploiting-local-spatio-temporal-characteristics-for-effective-video-pjJsVy418a
SP - 31821
EP - 31836
VL - 80
IS - 21-23
DP - DeepDyve
ER -