With the rapid growth of video surveillance applications and services, the volume of surveillance video has become extremely large, which makes human monitoring tedious and difficult. There is therefore a huge demand for smart surveillance techniques that can perform monitoring in an automatic or semi-automatic way. A number of challenges have arisen in the area of big surveillance data analysis and processing. Firstly, with the huge amount of surveillance video in storage, video analysis tasks such as event detection, action recognition, and video summarization are of increasing importance in applications including events-of-interest retrieval and abnormality detection. Secondly, semantic data (e.g. object trajectories and bounding boxes) has become an essential data type in surveillance systems owing to its growing size and complexity, introducing new challenging topics to the community, such as efficient semantic data processing and compression. Thirdly, with the rapid shift from static, centralized processing to dynamic computing across distributed video processing nodes and cameras, new challenges such as multi-camera analysis, person re-identification, and distributed video processing are emerging. To meet these challenges, there is a great need to extend existing approaches or explore new feasible techniques.

This is the 4th edition of our workshop. The first three were organized in conjunction with ICME 2019 (Shanghai, China), ICME 2020 (London, UK), and ICME 2021 (Shenzhen, China).

Scope & Topics

This workshop is intended to provide a forum for researchers and engineers to present their latest innovations and share their experiences on all aspects of the design and implementation of new surveillance video analysis and processing techniques. Topics of interest include, but are not limited to:

  • Action/activity recognition, and event detection in surveillance videos
  • Object detection and tracking in surveillance videos
  • Multi-camera surveillance networks and applications
  • Surveillance scene parsing, segmentation, and analysis
  • Crowd parsing, estimation and analysis
  • Person, group, or object re-identification
  • Summarization and synopsis of surveillance videos
  • Big Data processing in large-scale surveillance systems
  • Distributed, edge and fog computing for surveillance systems
  • Data compression in surveillance systems
  • Low-resolution video analysis and processing: recognition and object detection, restoration, denoising, enhancement, super-resolution
  • Surveillance from multiple modalities, including but not limited to UAVs, satellite imagery, dash cams, and wearables

Call for Papers

    The conference schedule can be found here
    List of paper IDs of accepted papers can be found here
Important Dates
    Paper Submission Due Date: March 26, 2022 (extended from March 12, 2022)
    Notification of Acceptance/Rejection: April 26, 2022 (extended from April 22, 2022)
    Camera-Ready Due Date: May 12, 2022 (extended from April 29, 2022)
    Workshop Date and Venue: July 22
Format Requirements & Templates
    Length: Papers must be no longer than 6 pages, including all text, figures, and references.
    Format: Workshop papers have the same format as regular papers. See the templates below. Submitted papers do not need to be double-blind.
    Important: A complete paper should be submitted using the above templates.
Submission Details
    Paper Submission Site:
    (Please make sure your paper is submitted to the correct track)
    Submissions may be accompanied by up to 20 MB of supplemental material following the same guidelines as regular and special session papers.
    Review: Reviews will be handled directly by the Organizers and the Technical Program Committee (TPC).
    Presentation guarantee: As with accepted Regular and Special Session papers, accepted Workshop papers must be registered by the author deadline and presented at the conference; otherwise they will not be included in IEEE Xplore. A workshop paper is covered by a full-conference registration only.
    Conference Location: Virtual


Time: July 22 (Friday) UTC+8 Talk/Presentation (Zoom link is here)
9:00 a.m.-9:55 a.m. Invited Keynote: Privacy-preserving Video Analytics
Chen Chen (University of Central Florida)
9:55 a.m.-10:00 a.m. Short Break
10:00 a.m.-11:00 a.m.
(15 mins per talk)
Session 1

PAMI-AD: An Activity Detector Exploiting Part-Attention and Motion Information in Surveillance Videos
Yunhao Du*; Zhihang Tong; Junfeng Wan ; Binyu Zhang; Yanyun Zhao
(Beijing University of Posts and Telecommunications)

Beiji Zou; Min Wang; LingZi Jiang; Yue Zhang; Shu Liu*
(Central South University)

Bottleneck Detection in Crowded Video Scenes utilizing Lagrangian Motion Analysis via Density and Arc Length Measures
Maik Simon*; Erik Bochinski; Markus Küchhold; Thomas Sikora
(Technische Universität Berlin)

CDTnet: Cross-Domain Transformer based on Attributes for Person Re-Identification
Mengyuan Guan*; Suncheng Xiang; Ting Liu; Yuzhuo Fu
(Shanghai Jiao Tong University)

11:00 a.m.-11:05 a.m. Short Break
11:05 a.m.-12:05 p.m.
(15 mins per talk)
Session 2

Interaction guided hand-held object detection
Kaiyuan Dong*; Yuang Zhang; Aixin Zhang
(Shanghai Jiao Tong University)

Ziran Qin*; Huanyu He; Weiyao Lin
(Shanghai Jiao Tong University)

Integer Network for Cross Platform Graph Data Lossless Compression
Ge Zhang1; Huanyu He1; Haiyang Wang2; Weiyao Lin1*
(1Shanghai Jiao Tong University, 2Clobotics)

Kuan-Hsien Liu*; Song-Jie Chen; Tsung-Jung Liu
(National Taichung University of Science and Technology)

Invited Keynote Speaker

Privacy-preserving Video Analytics

Abstract: Video-analytics-as-a-service enables a wide range of real-world applications, e.g., video surveillance, smart shopping systems like Amazon Go, and elderly person monitoring systems. A key concern in such services is the privacy of the videos being analyzed, as analyzing such information-rich video data may reveal personal information like an individual’s daily routine, home location, gender, race, clothes, etc. Therefore, there is a pressing need for solutions to privacy-preserving video analysis. In this talk, we will present our recent work on a novel self-supervised privacy-preserving action recognition framework. It removes privacy information from input video in a self-supervised manner without requiring privacy labels. Extensive experiments show that our framework achieves competitive performance compared to the supervised baseline for known action-privacy attributes. We also show that our method generalizes better to novel action-privacy attributes than the supervised baseline.

Biodata: Dr. Chen Chen is an Assistant Professor at the Center for Research in Computer Vision at UCF. He received his Ph.D. in Electrical Engineering from UT Dallas in 2016, receiving the David Daniel Fellowship (Best Doctoral Dissertation Award). His research interests include computer vision, efficient deep learning, and federated learning. He has been actively involved in several NSF and industry sponsored research projects, focusing on efficient resource-aware machine vision algorithms and systems development for large-scale camera networks. He is an Associate Editor of IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), Journal of Real-Time Image Processing, and IEEE Journal on Miniaturization for Air and Space Systems. He also served as an area chair for several conferences such as ECCV’2022, CVPR’2022, ACM-MM 2019-2022, ICME 2021 and 2022. According to Google Scholar, he has 10K+ citations and an h-index of 50.


Weiyao Lin
 wylin AT
John See
 johnsee AT
Xiatian Zhu (Eddy)
 eddy.zhuxt AT


Please feel free to send any questions or comments to:
wylin AT, johnsee AT, eddy.zhuxt AT