Kernelizing feature matching


In my Ph.D. work, Arnulf Graf, Barbara Caputo and I developed a simple extension of the Support Vector framework that allows feature matching to be performed inside the kernel evaluation. Although our initial proof needed to be revised, this publication was the first to connect local feature methods with kernel methods and to demonstrate significant gains in performance (Wallraven et al., 2003)!
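As a rough sketch of the underlying idea (not the exact formulation from the paper), a matching kernel compares two images represented as sets of local feature descriptors by letting each descriptor look for its best match in the other set; symmetrizing the result yields a single similarity value that can be fed to an SVM as a precomputed kernel. All function names and parameters below are illustrative.

```python
import numpy as np

def local_kernel(a, b, gamma=0.1):
    """Gaussian (RBF) similarity between two local feature descriptors."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def matching_kernel(A, B, gamma=0.1):
    """Symmetrized matching kernel between two images.

    A, B: arrays of shape (n_descriptors, descriptor_dim), one row per
    local descriptor (e.g., a patch around an interest point).  Each
    descriptor is matched to its best counterpart in the other image;
    averaging and symmetrizing gives one similarity value.
    Note: this construction is not guaranteed to be positive definite.
    """
    k_ab = np.mean([max(local_kernel(a, b, gamma) for b in B) for a in A])
    k_ba = np.mean([max(local_kernel(b, a, gamma) for a in A) for b in B])
    return 0.5 * (k_ab + k_ba)

# Usage: precompute the Gram matrix over all image pairs and pass it to
# sklearn.svm.SVC(kernel="precomputed").
```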


What information is available at a glance from a scene? Can we create computer vision algorithms that capture the “gist” of a scene?


Humans are able to get the “gist” of a scene after as little as 150 ms. Using perceptual experiments, we have shown that humans can estimate the horizon of a scene with high accuracy. First computational experiments indicate that this might be done using a global frequency-spectrum analysis (Herdtweck and Wallraven, 2010, 2013).
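A minimal sketch of such a frequency-based approach, assuming a simple block-averaged Fourier amplitude spectrum as the global descriptor (the actual features and model used in the papers may differ):

```python
import numpy as np

def spectrum_features(image, n_blocks=4):
    """Global frequency descriptor of a grayscale image.

    Computes the 2-D Fourier amplitude spectrum and averages it over a
    coarse grid of frequency bands, giving a short "gist"-like vector.
    """
    spec = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = spec.shape
    bh, bw = h // n_blocks, w // n_blocks
    feats = [spec[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].mean()
             for i in range(n_blocks) for j in range(n_blocks)]
    return np.log(np.asarray(feats) + 1e-8)

# A simple regressor (e.g. sklearn.linear_model.Ridge) mapping these
# features to the annotated horizon height would serve as a baseline
# model of horizon estimation from global spectral information.
```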


What information determines aesthetic judgments? Can a computer become an art critic?


With the recent advances in image processing, we ask how far state-of-the-art image measures can be used to model the aesthetic experience of observers looking at a painting. In a series of papers, we have shown that these measures already capture aspects of aesthetic judgments (e.g., Wallraven et al., 2009; Rigau et al., 2010) - the computer, however, will need much more training to become an art expert!
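To illustrate the general approach (the concrete measures used in the papers may differ), one can compute a handful of low-level image statistics and then regress them against observers’ aesthetic ratings:

```python
import numpy as np

def lowlevel_measures(rgb):
    """A few simple image statistics of the kind used as aesthetic predictors.

    rgb: float array of shape (H, W, 3) with values in [0, 1].
    Returns mean luminance, luminance contrast, colorfulness, and edge density.
    """
    lum = rgb.mean(axis=2)
    rg = rgb[..., 0] - rgb[..., 1]
    yb = 0.5 * (rgb[..., 0] + rgb[..., 1]) - rgb[..., 2]
    colorfulness = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    gy, gx = np.gradient(lum)
    edge_density = (np.hypot(gx, gy) > 0.1).mean()
    return np.array([lum.mean(), lum.std(), colorfulness, edge_density])

# Regressing such features against mean observer ratings (e.g. with a
# linear model) and inspecting the explained variance gives a first
# estimate of how much of the aesthetic judgment they capture.
```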

In another project, we have shown that although artworks are judged as more aesthetic than ordinary photographs, they are not actually remembered better. Several computational features were used to model the memorability of either category, but here, too, we found that low-level features have limited predictive power (Wallraven et al., 2015).


Evaluation of algorithms


In a visual “Turing test”, we found that people were unable to tell the difference between computer-graphics objects inserted into a scene and real objects, thus validating the approach.
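A sketch of how such a result can be quantified: treat each judgment as a forced choice and test whether accuracy differs from chance. The counts below are hypothetical.

```python
from scipy.stats import binomtest

# Hypothetical counts: each trial asks "which object is computer-generated?"
n_trials = 200    # total forced-choice judgments across observers
n_correct = 106   # judgments that correctly identified the rendered object

result = binomtest(n_correct, n_trials, p=0.5, alternative="two-sided")
print(f"accuracy = {n_correct / n_trials:.2f}, p = {result.pvalue:.3f}")
# A non-significant p-value means performance is indistinguishable from
# guessing, i.e. observers cannot reliably spot the rendered objects.
```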


Similarly, we used perceptual experiments to evaluate a sophisticated image interpolation technique and showed that it produced the fewest perceptual artefacts.

In addition, several algorithms for creating bas-reliefs were evaluated to determine which methods produce the most appealing effect.


Towards fast and efficient face landmark processing


In 2020, we set new benchmarks with a system capable of learning facial landmarks from very little supervision by leveraging large amounts of unsupervised pre-training. The system achieved excellent landmark localization after training with as few as 10 faces!

Code for this is available at: https://github.com/browatbn2/3FabRec
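The repository above contains the actual implementation. As a simplified, generic sketch of the few-shot idea only (not the 3FabRec architecture or its API), one could freeze an unsupervised pre-trained encoder and fit a small landmark head on a handful of annotated faces; the encoder interface and dimensions below are assumptions.

```python
import torch
import torch.nn as nn

class LandmarkHead(nn.Module):
    """Small trainable head mapping frozen encoder features to 2-D landmarks."""
    def __init__(self, feat_dim=512, n_landmarks=68):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_landmarks * 2)

    def forward(self, feats):
        return self.fc(feats).view(feats.size(0), -1, 2)

def finetune_few_shot(encoder, images, landmarks, epochs=200, lr=1e-3):
    """Fit only the head on a handful of annotated faces (e.g. 10 images).

    encoder: a pre-trained feature extractor returning (N, feat_dim), kept frozen.
    images: tensor (N, 3, H, W); landmarks: tensor (N, n_landmarks, 2).
    """
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)
    with torch.no_grad():
        feats = encoder(images)
    head = LandmarkHead(feat_dim=feats.size(1), n_landmarks=landmarks.size(1))
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(head(feats), landmarks)
        loss.backward()
        opt.step()
    return head
```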


Is it useful to pre-process EEG data when using deep neural networks for decoding?


Taeho Kang showed that deep neural networks do not benefit from pre-processing algorithms designed to remove artefacts from EEG data (i.e., independent-component-analysis approaches). Across three different brain-computer-interface tasks, the performance of deep networks did not improve significantly with this pre-processing, suggesting that the networks may be robust enough to “see through” the artefacts.
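A minimal sketch of this kind of comparison, assuming the same trials are available once raw and once after ICA-based cleaning; a simple linear classifier stands in here for the deep networks used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def decode_accuracy(epochs, labels):
    """Cross-validated decoding accuracy on EEG epochs.

    epochs: array (n_trials, n_channels, n_times); labels: (n_trials,).
    """
    X = epochs.reshape(len(epochs), -1)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, X, labels, cv=5).mean()

# Given the same trials once raw and once after ICA-based artefact
# removal (e.g. with mne.preprocessing.ICA), the question is simply
# whether the second accuracy is reliably higher than the first:
# acc_raw     = decode_accuracy(raw_epochs, labels)
# acc_cleaned = decode_accuracy(ica_cleaned_epochs, labels)
```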