End-Users reviewing WP1 and WP7 results
The meeting was attended by 25 people, representatives of all partners engaged in the WP1 and WP7 work packages: GUT and PSI (leaders of the two WPs, respectively), AGH (project coordinator), and BUW, FHTW, PUT, and TUKE. The End-User Requirements Board was represented by four persons, from the Romanian Police, the Greek Ministry of Public Order and Citizen Protection, the Spanish Ministry of Defence, and the Polish company Microsystem.
The meeting started with an overview of the goals and deadlines of WP1 and WP7, two work packages dedicated to the processing of audio, video, and multimodal signals for live detection of potentially dangerous events.
Multiple aspects were discussed, and several prototypes were demonstrated:
- Integration Platform by PSI - a frontend for receiving and presenting events.
- Video prototypes by BUW - image processing for detecting barrier crossings (live demo streamed remotely from Wuppertal), and crowd monitoring employing a Time-of-Flight camera.
- Audio event classification by TUKE - live classification of gunshot sounds in noisy conditions was presented, achieving a high accuracy rate.
- Live assessment and on-the-fly adjustment of video quality for CCTV streaming by AGH - a live prototype demonstrating how high video quality can be maintained under varying network conditions: bandwidth fluctuations, transfer speed drops, lost packets, and others.
- Recognizing dangerous tools (e.g. guns) in images by AGH - image analysis dedicated to the detection of particular (predefined) objects in still images and videos.
- Digital watermarks - face blurring and photo tampering protection by AGH - a working application for hiding important information in an image watermark: “backup” copies of crucial parts of the image are stored as an invisible, difficult-to-remove watermark inside the image itself, allowing for tampering detection and recovery of the original.
- Video processing prototypes by PUT, covering numerous aspects:
  - people counting with a top-down facing camera;
  - application of stereovision to surveillance video: object separation by depth;
  - a method for combining video frames from a PTZ camera running a pre-programmed cycle into one panoramic image;
  - pedestrian crossing observation and red-light running detection;
  - fire detection in the image;
  - tram stop monitoring and track crossing detection;
  - people counting on a stadium tribune by locating empty seats;
  - people density map generation from long video recordings;
  - automatic object-following PTZ camera using the CamShift algorithm.
- Stereo cameras and depth estimation for video by FHTW - the use of two rectified and synchronized cameras allowed calculation of depth in the image and estimation of distances, 3D trajectories, and precise object sizes, facilitating accurate detection and tracking.
- Node Station and Central Station framework by GUT - a modular system for data acquisition, processing, and transmission. It allows creating a data processing pipeline adapted to the user's needs, and serves as an audio, video, and Bluetooth signal processing framework for both developers and end-users. The framework is platform independent (Windows, Linux, macOS, smartphones), and its algorithms are designed to be mutually interconnected and to exchange data in parallel. Examples of parking lot monitoring, outdoor monitoring, bank area monitoring, and others were presented.
- Bluetooth authorization and detection by GUT - a module dedicated to the detection and identification of Bluetooth devices. It allows for recognition of authorized users and for proximity assessment. In the presented scenarios, loss of a signal indicates theft of protected goods or abduction of a protected person.
- Crowd monitoring by GUT - a module for crowd flow monitoring. It is based on processing of video signals from typical IP cameras (as opposed to dedicated Time-of-Flight cameras). The solution performs real-time, accurate detection and counting of people entering and exiting through a virtual gate.
- Visual object detection, tracking, and event detection with automatic PTZ camera positioning (following an object) by GUT - a module for real-time analysis of video streams was presented. It performs object detection, tracking, and classification, and is able to detect predefined events such as crossing a barrier in an allowed/forbidden direction, various car parking events, abandoned luggage, a person entering a road, and others. The detected object of interest is automatically tracked in a multi-camera setup, and available PTZ cameras can be automatically positioned to follow it and provide a high-magnification image. Operation of the module was presented on live streams from outdoor scenes.
- Audio event detection, classification, and localization by GUT - a module for real-time analysis of audio streams was presented. It performs detection, classification, and localization of sound sources, and is trained to classify sounds related to threats (scream, explosion, gunshot, breaking glass). The localization information is used to automatically point a PTZ camera at the source and track it visually. The audio processing algorithms also provide an adjustable directional characteristic (overhearing a conversation from a user-defined direction in very noisy conditions).
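The watermark-based tamper protection demonstrated by AGH can be illustrated with a minimal sketch. The actual scheme is not described in this report; the function names and the simple least-significant-bit embedding below are illustrative assumptions only, showing the general idea of hiding backup bits inside pixel values so that later modification can be detected.

```python
# Illustrative sketch only: the real AGH watermarking method is not disclosed
# in the report. Here, payload bits are hidden in the LSBs of 8-bit pixels.

def embed_bits(pixels, bits):
    """Embed a bit string into the least-significant bits of 8-bit pixels."""
    out = list(pixels)
    for i, b in enumerate(bits):
        out[i] = (out[i] & 0xFE) | int(b)  # clear the LSB, set the payload bit
    return out

def extract_bits(pixels, n):
    """Read back the first n embedded bits."""
    return "".join(str(p & 1) for p in pixels[:n])

def is_tampered(pixels, expected_bits):
    """A mismatch between stored and expected bits signals modification."""
    return extract_bits(pixels, len(expected_bits)) != expected_bits
```

A real system would embed redundant, error-protected copies of image regions rather than raw bits, so the original content can also be recovered, as the report describes.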
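The FHTW stereo demonstration relies on the standard rectified-stereo relation: for two synchronized cameras with focal length f (in pixels) and baseline B (in metres), a point seen with horizontal disparity d pixels lies at depth Z = f·B/d. A minimal sketch (function and parameter names are ours):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres from a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length expressed in pixels
    baseline_m   -- distance between the camera centres in metres
    disparity_px -- horizontal pixel shift of the same point between views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For example, with f = 700 px and B = 0.12 m, a disparity of 14 px corresponds to a depth of 6 m; larger disparities mean closer objects, which is why depth resolution degrades with distance.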
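The proximity assessment in the GUT Bluetooth module could, for instance, be approximated with the common log-distance path-loss model; the report does not state the actual method, so the model, the reference RSSI, and the thresholds below are assumptions for illustration only.

```python
def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-59.0, path_loss_exp=2.0):
    """Rough distance estimate from RSSI via the log-distance path-loss model.

    rssi_at_1m_dbm -- calibrated signal strength at 1 m (assumed value)
    path_loss_exp  -- environment-dependent exponent (2.0 = free space)
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exp))

def device_present(rssi_dbm, threshold_m=5.0):
    """Signal loss (rssi None) or excessive distance would trigger an alarm,
    matching the theft/abduction scenarios described in the report."""
    return rssi_dbm is not None and estimate_distance_m(rssi_dbm) <= threshold_m
```

In practice RSSI is noisy, so a deployed system would smooth readings over time before declaring a device lost.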
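The virtual-gate counting in the GUT crowd monitoring module can be sketched in a simplified form: assuming per-object centroid tracks are already available from the video analysis (the detection stage itself is not shown), entries and exits are counted whenever a track crosses a horizontal gate line.

```python
def count_gate_crossings(tracks, gate_y):
    """Count objects crossing a horizontal virtual gate.

    tracks -- dict mapping an object id to its successive y-coordinates
    Returns (entering, exiting): entering = crossings from above the gate
    (y < gate_y) to below it; exiting = crossings in the opposite direction.
    """
    entering = exiting = 0
    for ys in tracks.values():
        for prev, cur in zip(ys, ys[1:]):
            if prev < gate_y <= cur:
                entering += 1
            elif prev >= gate_y > cur:
                exiting += 1
    return entering, exiting
```

Counting per consecutive pair of positions means an object that lingers on the gate line is counted only once per actual crossing.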
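Localizing a sound source with a microphone pair, as in the GUT audio module, is commonly done by estimating the time difference of arrival (TDOA) from a cross-correlation peak and converting it to a bearing; the module's internal algorithm is not disclosed in the report, so the following is a generic sketch of that standard technique.

```python
import math

def tdoa_samples(sig_a, sig_b, max_lag):
    """Delay (in samples) of sig_b relative to sig_a, found as the lag that
    maximizes the cross-correlation of the two signals."""
    n = len(sig_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(sig_a[i] * sig_b[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bearing_deg(lag, fs_hz, mic_spacing_m, speed_of_sound=343.0):
    """Convert a sample lag into a direction-of-arrival angle (degrees)."""
    ratio = speed_of_sound * lag / (fs_hz * mic_spacing_m)
    return math.degrees(math.asin(max(-1.0, min(1.0, ratio))))
```

A zero lag corresponds to a source directly in front of the pair (0°); real systems use several microphones and sub-sample interpolation for better angular resolution.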
The second part of the meeting was dedicated to a live presentation of three practical scenarios of multimodal event detection, aimed at validating the integration and concurrent operation of multiple Node Station modules.
- Scenario 1. The Chamber. A live demonstration of the framework configured to process multi-sensor data was presented. It comprises person authorization via Bluetooth devices, object tracking by a PTZ camera, door entry and exit detection by video analysis, and breaking-glass detection and localization by audio analysis with automatic PTZ camera positioning.
- Scenario 2. VIP abduction. A video demonstration of previous experiments involving multi-sensor data analysis was presented. It is dedicated to outdoor events involving cars and persons, and comprises Bluetooth-based abduction detection, video analysis of suspicious car and person behaviours, and audio detection and localization of explosions and gunshots.
- Scenario 3. Bank robbery. A video demonstration of previous experiments involving multi-sensor data analysis was presented. It is dedicated to indoor events with crowd and audio analysis, and comprises thermovision- and vision-based detection of people being forced to the floor, together with audio detection and localization of shouts and gunshots.
Each presentation and live demo was commented on by the End-Users and met with a positive response.
Multimodal event detection interface for the “Chamber” scenario.
User interface for detection of outdoor events.