Final Reports of the practical course Computer Vision for Human-Computer-Interaction, WS 20/21

Human-Drone Interaction

Drones have become a common tool, which is utilized in many tasks such as aerial photography, surveillance, and delivery. However, operating a drone requires more and more interaction with the user. A natural and safe method for Human-Drone Interaction (HDI) is using gestures. This paper describes an HDI framework, which is based on the Robot Operating System (ROS) middleware. Our framework provides the functionality to control the movement of the drone with simple arm gestures and to follow the user while keeping a safe distance. We also propose a monocular distance estimation method, which is entirely based on image features. [pdf]

An extension of this work has been accepted at EUSIPCO 2021 [pdf]
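The gesture-to-motion mapping at the core of such a framework can be pictured as a small ROS node that translates recognized gesture labels into velocity commands. The following is a minimal sketch only: the topic names (/gesture, /cmd_vel) and the gesture label set are assumptions for illustration, not the interface used in the report, and the gesture recognition and monocular distance estimation themselves are not reproduced here.

    #!/usr/bin/env python
    # Minimal sketch of a gesture-to-velocity ROS node (rospy).
    # Topic names ("/gesture", "/cmd_vel") and gesture labels are assumptions,
    # not the names used in the report.
    import rospy
    from std_msgs.msg import String
    from geometry_msgs.msg import Twist

    # Map simple arm-gesture labels to (forward, yaw) velocity commands.
    GESTURE_TO_TWIST = {
        "arms_up":   (0.5, 0.0),   # move forward
        "left_arm":  (0.0, 0.5),   # yaw left
        "right_arm": (0.0, -0.5),  # yaw right
        "arms_down": (0.0, 0.0),   # hover
    }

    class GestureController(object):
        def __init__(self):
            self.pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
            rospy.Subscriber("/gesture", String, self.on_gesture)

        def on_gesture(self, msg):
            linear_x, angular_z = GESTURE_TO_TWIST.get(msg.data, (0.0, 0.0))
            twist = Twist()
            twist.linear.x = linear_x
            twist.angular.z = angular_z
            self.pub.publish(twist)

    if __name__ == "__main__":
        rospy.init_node("gesture_controller")
        GestureController()
        rospy.spin()

rospy.spin() keeps the node alive so that every incoming gesture label is immediately translated into a new velocity command on /cmd_vel.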

Material Classification in Construction Sites

Computer vision is being used in more and more areas of our lives. However, it is still underrepresented in construction site contexts, even though there are many possible applications in this area, for example, automatically monitoring the construction process and thus potentially improving efficiency. One possibility is to estimate the construction status based on the materials visible in a certain phase of construction. For example, the floor is initially made of concrete and will then be covered with wood in a later step. To monitor this progress, it is sufficient to take photographs of the construction site, which can then be analyzed in more detail. For this purpose, a pipeline was developed and implemented within the scope of this work. It uses a photograph of the construction site as input. The image is then segmented with respect to the spatial components of the room, which makes it possible to subsequently classify the material on the ground and thus obtain an approximation of the distribution of the materials used. The segmentation and classification are done with different CNNs, which were trained using different public datasets (ADE20k, OpenSurfaces, and MINC-2500). For testing, a custom dataset consisting of construction site images was annotated. The results show that the segmentation works very well, but that there is still room for improvement in the classification. [pdf]
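The two-stage idea, first segmenting the room into spatial components and then classifying the floor material to approximate a material distribution, can be sketched roughly as follows. The segmentation and material CNNs are treated as opaque callables here, and the floor label index and patch size are placeholder values rather than those of the ADE20k/MINC-2500 models used in the report.

    # Rough sketch of the segment-then-classify pipeline. The actual CNNs
    # (scene segmentation and material classification) are treated as opaque
    # callables; the floor label index and patch size are placeholder values.
    import numpy as np
    import torch

    def floor_material_distribution(image, seg_model, mat_model,
                                    floor_label=3, patch=64):
        """image: HxWx3 uint8 array; returns {material_class: fraction}."""
        with torch.no_grad():
            # 1) Segment the room into spatial components (wall, floor, ...).
            seg = seg_model(image)                 # HxW array of class ids
            floor_mask = (seg == floor_label)

            # 2) Classify the material of patches that lie mostly on the floor.
            votes = {}
            h, w = floor_mask.shape
            for y in range(0, h - patch, patch):
                for x in range(0, w - patch, patch):
                    if floor_mask[y:y + patch, x:x + patch].mean() < 0.8:
                        continue                   # skip mixed / non-floor patches
                    material = int(mat_model(image[y:y + patch, x:x + patch]))
                    votes[material] = votes.get(material, 0) + 1

        # 3) Normalize the votes into an approximate material distribution.
        total = sum(votes.values()) or 1
        return {m: n / total for m, n in votes.items()}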

Augmented Reality for Users with Low Vision

In this paper we use an AR device, specifically the HoloLens 2, to generate and display objects from the real world as 3D holograms. Our approach outsources computationally intensive workloads to reduce the overall runtime, maintain high accuracy, and thus improve the user experience. To achieve this, we use Unity with Microsoft's Mixed Reality Toolkit to create the AR app, and server software that performs instance segmentation and shape estimation using Mask R-CNN and Mesh R-CNN, respectively. [pdf]
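The outsourcing idea can be illustrated by a small HTTP endpoint to which the HoloLens client sends camera frames. The sketch below covers the server side only and substitutes torchvision's pre-trained Mask R-CNN for the report's Mask R-CNN/Mesh R-CNN setup; the /segment route and the JSON response layout are assumptions.

    # Minimal sketch of a segmentation server an AR client could call.
    # Uses torchvision's Mask R-CNN as a stand-in; the /segment route and
    # response format are assumptions, not the interface from the report.
    import io
    from flask import Flask, request, jsonify
    from PIL import Image
    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor

    app = Flask(__name__)
    # torchvision >= 0.13; older versions use pretrained=True instead.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    @app.route("/segment", methods=["POST"])
    def segment():
        # The AR app would POST a JPEG frame in the request body.
        image = Image.open(io.BytesIO(request.data)).convert("RGB")
        with torch.no_grad():
            pred = model([to_tensor(image)])[0]
        # Return confident boxes and labels; masks/meshes would follow the same path.
        keep = pred["scores"] > 0.7
        return jsonify({
            "boxes": pred["boxes"][keep].tolist(),
            "labels": pred["labels"][keep].tolist(),
        })

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)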

Wall-Wall Boundary Detection

Recently, there has been growing interest in developing learning-based methods to detect global structures for 3D scene modeling and understanding. For a student, practical work is a good opportunity to learn about and understand this technological development. Accurately extracting room structure lines is a key step in 3D scene reconstruction. The purpose of this exercise is to extract the wall boundaries in panoramic room images. For learning purposes, three different methods were tried: basic image processing, classical computer vision image processing, and image processing with deep learning, each building progressively on the previous one. The deep-learning approach uses a pre-trained model and, in the end, the boundary lines between the walls are accurately identified. [pdf]
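The classical computer-vision stage can be illustrated with an OpenCV edge-and-line sketch: in an upright panoramic image, wall-wall boundaries appear as roughly vertical lines, so Canny edge detection followed by a probabilistic Hough transform and a verticality filter yields a first set of boundary candidates. All thresholds below are placeholder values, and the deep-learning stage with the pre-trained model is not shown.

    # Sketch of the classical CV stage: find near-vertical lines in a panorama
    # as wall-wall boundary candidates. All thresholds are placeholder values.
    import cv2
    import numpy as np

    def wall_boundary_candidates(panorama_bgr, angle_tol_deg=10):
        gray = cv2.cvtColor(panorama_bgr, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (5, 5), 0)
        edges = cv2.Canny(gray, 50, 150)

        lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                                threshold=80, minLineLength=100, maxLineGap=10)
        candidates = []
        if lines is not None:
            for x1, y1, x2, y2 in lines[:, 0]:
                # Keep only near-vertical segments: wall-wall boundaries in an
                # upright panorama are approximately vertical.
                angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
                if abs(angle - 90) <= angle_tol_deg:
                    candidates.append((int(x1), int(y1), int(x2), int(y2)))
        return candidates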
