An International Workshop organised in conjunction with IEEE AVSS 2017, August 29, Lecce, Italy

Workshop motivation and description

Law enforcement agencies, in large criminal investigations, usually have to analyse vast amounts of audio-visual content, such as videos collected from street CCTV footage, audio recordings, hard drives with diverse sources of data, or online resources (e.g. YouTube). In order to piece together the story of events leading up to an incident, and to determine what happened afterwards, a typical procedure is to employ numerous officers and volunteers to watch many hours of content to locate, identify, understand and trace the movements of suspects, victims, witnesses, and even inanimate objects (e.g. luggage). However, studying large amounts of content is highly time consuming and demands substantial human resources. As an alternative, automated video processing and image/audio understanding would lead to a quantum leap in efficiency and effectiveness, and ultimately to better crime solving and prevention.

Given significant recent advances in the efficiency and effectiveness of video and audio analysis techniques, it is now timely to consider how to bridge the “research to practice” gap for such techniques so that they can be put in the hands of law enforcement agencies. Unfortunately, existing state-of-the-art solutions in image/audio processing usually fail when applied to audio-visual content “in the Wild”, i.e. from real-world data sources. Such content is recorded in uncontrolled environments (e.g. video captured without any control over focus, lighting and camera position), which results in data of poor quality (e.g. long-running continuous video sequences with many variations due to lighting changes) presenting significant analysis challenges.

This workshop focuses on the investigation of novel approaches for the analysis of video and audio to support security forces in the process of crime solving and prevention, targeting challenging real-world data sources. The goal is to present revisited and novel algorithms that show resilience when applied to challenging real content from CCTV, hard drives or online resources (e.g. YouTube). Only papers describing techniques with solid evidence of use and validation on video and audio “in the Wild” will be presented. The objective is to draw researchers’ attention to emerging strategies that are robust against the real challenges encountered when technologies developed in a laboratory environment are deployed in practice. To this end, each accepted paper will be offered the opportunity to showcase its approach via a practical demonstration of how it could be used in practice, during a dedicated demo session organized as part of the workshop.

The workshop will leverage the results of the EU-funded ASGARD, SURVANT, DANTE and FORENSOR projects but in the spirit of openness welcomes contributions from the broader research community.

Papers to be presented in the workshop cover topics related to:

  • robust video processing algorithms for face detection, object detection, logo detection;
  • object and human tracking, person re-identification;
  • video pre-processing, stabilization, colour enhancement;
  • action recognition, behaviour analysis and learning;
  • biometric analysis (soft biometrics such as gait/gesture, clothes, face/skin colour);
  • indexing and query optimization for very large multimedia collections;
  • benchmarking, introduction of new experimental datasets derived from real CCTV footage.