- The paper introduces RF-Action, an end-to-end deep neural network that recognizes human actions from wireless signals.
- It can also identify interactions between people, such as a handshake.
- University of California researchers created a method to identify a person through a wall using video and Wi-Fi signals.
The new work by MIT researchers, “Making the Invisible Visible: Action Recognition Through Walls and Occlusions,” will be presented at the International Conference on Computer Vision (ICCV), a biennial research conference sponsored by the Institute of Electrical and Electronics Engineers. Together with CVPR, it is considered one of the top conferences in computer vision. The conference usually runs four to five days. This year it takes place in Seoul, South Korea, October 27 – November 2, 2019.
Human action recognition is a core task in computer vision. It has broad applications in video games, surveillance, gesture recognition, and behavior analysis. The work uses a neural network model that can detect human actions through walls and occlusions, and in poor lighting conditions. The model takes radio frequency (RF) signals as input, generates 3D human skeletons as an intermediate representation, and recognizes the actions and interactions of multiple people over time.
The paper introduces RF-Action, an end-to-end deep neural network that recognizes human actions from wireless signals. End-to-end deep learning is the idea of producing complex outputs directly from raw input features, such as a transcript from audio or a caption from an image.
A deep neural network is a neural network with a certain level of complexity, typically one with more than two layers. Each layer transforms the output of the previous one, which lets the network model complex relationships in the data.
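To make the "more than two layers" idea concrete, here is a minimal sketch of a forward pass through a three-layer network. It is an illustration only, not the RF-Action architecture: the helper names (`relu`, `dense`, `forward`) and the weights in `LAYERS` are made up for this example.

```python
def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass `inputs` through every layer, applying ReLU between layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # no ReLU after the final layer
            activations = relu(activations)
    return activations


# Hypothetical weights: three stacked layers (2 -> 2 -> 2 -> 1 units).
LAYERS = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, 0.5]),
    ([[2.0, -1.0]], [0.1]),
]

OUTPUT = forward([2.0, 1.0], LAYERS)
```

Each tuple in `LAYERS` is one layer, so adding depth means appending another `(weights, biases)` pair. Real networks learn these weights from data rather than having them written by hand.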
Usually, computer vision systems estimate human poses from video, and algorithms then identify the behavior of the people in the scene. The engineers from MIT developed an algorithm that combines multiple inputs: raw camera data is passed to the neural network, which also builds a skeletal model of each body. The model then analyzes these skeletons and selects the matching actions. It can also identify interactions, such as a handshake between two people.
To obtain the visual data from a multi-camera system, the AlphaPose algorithm was used to extract 2D skeletal models, which were then converted to 3D. AlphaPose is an accurate multi-person pose estimator and the first real-time open-source system of its kind. To match poses that correspond to the same person across frames, its authors also provide an efficient online pose tracker called Pose Flow.
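The article does not say how the 2D skeletons are converted to 3D. As a rough illustration of the general idea, the sketch below lifts matching 2D joints from two cameras into 3D. It assumes the simplest possible setup: two rectified, side-by-side cameras with a known focal length and baseline. The function names, joint names, and camera parameters are all hypothetical, and a real multi-camera system would use full calibration and triangulation instead.

```python
from typing import Dict, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def lift_joint(left: Point2D, right: Point2D, focal_px: float,
               baseline_m: float, center: Point2D) -> Point3D:
    """Recover a 3D point from one joint seen by two rectified cameras."""
    disparity = left[0] - right[0]  # horizontal shift between the two views
    if disparity <= 0:
        raise ValueError("joint must appear further left in the right image")
    depth = focal_px * baseline_m / disparity  # distance from the cameras
    x = (left[0] - center[0]) * depth / focal_px
    y = (left[1] - center[1]) * depth / focal_px
    return (x, y, depth)


def lift_skeleton(left_pose: Dict[str, Point2D], right_pose: Dict[str, Point2D],
                  focal_px: float, baseline_m: float,
                  center: Point2D) -> Dict[str, Point3D]:
    """Lift every joint that both cameras detected."""
    return {
        name: lift_joint(left_pose[name], right_pose[name],
                         focal_px, baseline_m, center)
        for name in left_pose
        if name in right_pose
    }


# Hypothetical 2D detections (pixels) of the same person in both views.
LEFT = {"head": (400.0, 300.0), "left_hand": (480.0, 400.0)}
RIGHT = {"head": (360.0, 300.0), "left_hand": (430.0, 400.0)}

SKELETON = lift_skeleton(LEFT, RIGHT, focal_px=800.0, baseline_m=0.5,
                         center=(320.0, 240.0))
```

With these made-up numbers the head lands 10 m from the cameras and the hand 8 m. A joint that is closer to the cameras shifts more between the two views.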
To scan through walls and other obstacles with RF signals, the engineers designed a transceiver. A transceiver is a device in which a transmitter and a receiver are combined and share common circuitry or a single housing. This one has two sets of antennas, one oriented vertically and one horizontally. As a result, the signals are captured as 2D images, and the neural network receives multiple images as input.
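One way to picture the input described above is as a short sequence of image pairs, one 2D heatmap per antenna set at each time step. The sketch below is only a data-layout illustration under that assumption. The `RFFrame` type, the `[time][channel][row][column]` ordering, and the tiny 4x6 heatmaps are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Heatmap = List[List[float]]  # rows x columns of signal intensity


@dataclass
class RFFrame:
    """One time step of RF data: one 2D heatmap per antenna set."""
    horizontal: Heatmap
    vertical: Heatmap


def heatmap_shape(heatmap: Heatmap) -> Tuple[int, int]:
    return (len(heatmap), len(heatmap[0]))


def stack_frames(frames: List[RFFrame]) -> List[List[Heatmap]]:
    """Arrange frames as [time][channel][row][col]; channel 0 is horizontal."""
    if not frames:
        raise ValueError("need at least one frame")
    expected = heatmap_shape(frames[0].horizontal)
    for frame in frames:
        shapes = (heatmap_shape(frame.horizontal), heatmap_shape(frame.vertical))
        if shapes != (expected, expected):
            raise ValueError("all heatmaps must share one shape")
    return [[frame.horizontal, frame.vertical] for frame in frames]


def input_shape(stacked: List[List[Heatmap]]) -> Tuple[int, int, int, int]:
    """Return (time steps, channels, rows, columns) of the stacked input."""
    first = stacked[0][0]
    return (len(stacked), len(stacked[0]), len(first), len(first[0]))


def blank(rows: int, cols: int, value: float = 0.0) -> Heatmap:
    """Build a constant heatmap, standing in for real measurements."""
    return [[value] * cols for _ in range(rows)]


# Three hypothetical time steps of 4x6 heatmaps.
WINDOW = [RFFrame(blank(4, 6, 0.1 * t), blank(4, 6, 0.2 * t)) for t in range(3)]
STACKED = stack_frames(WINDOW)
SHAPE = input_shape(STACKED)
```

Here `SHAPE` is `(3, 2, 4, 6)`: three time steps, two antenna channels, and 4x6 pixels per heatmap. Keeping the horizontal and vertical views as separate channels lets a network treat each time step like a two-channel image.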
Recently, University of California researchers created a method to identify a person through a wall using video and Wi-Fi signals. Their video-WiFi cross-modal, gait-based person identification system, XModal-ID, has a variety of applications, including surveillance and security. The approach makes it possible to determine whether the person behind the wall is the same as the one in a piece of video footage, using only a pair of off-the-shelf WiFi transceivers placed outside.