Paper Group ANR 361
Human Activity Recognition for Mobile Robot. QBF as an Alternative to Courcelle’s Theorem. Predicting the Generalization Gap in Deep Networks with Margin Distributions. Elliptical Distributions-Based Weights-Determining Method for OWA Operators. Neural Multi-scale Image Compression. A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based …
Human Activity Recognition for Mobile Robot
Title | Human Activity Recognition for Mobile Robot |
Authors | Iyiola E. Olatunji |
Abstract | Due to the increasing number of mobile robots, including domestic robots for cleaning and maintenance in developed countries, human activity recognition is essential for congruent human-robot interaction. Although this is a challenging task for robots, learning human activities is expedient for autonomous mobile robots (AMRs) navigating an uncontrolled environment without guidance. Building a correct classifier for complex human actions is non-trivial, since simple actions can be combined into a complex human activity. In this paper, we train a model for human activity recognition using a convolutional neural network. We train and validate the model on the Vicon physical action dataset and also test it on our generated dataset (VMCUHK). Our experiments show that our method performs the human activity recognition task with high accuracy on both the Vicon physical action dataset and the VMCUHK dataset. |
Tasks | Activity Recognition, Human Activity Recognition |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07633v1 |
PDF | http://arxiv.org/pdf/1801.07633v1.pdf |
PWC | https://paperswithcode.com/paper/human-activity-recognition-for-mobile-robot |
Repo | |
Framework | |
QBF as an Alternative to Courcelle’s Theorem
Title | QBF as an Alternative to Courcelle’s Theorem |
Authors | Michael Lampis, Stefan Mengel, Valia Mitsou |
Abstract | We propose reductions to quantified Boolean formulas (QBF) as a new approach to showing fixed-parameter linear algorithms for problems parameterized by treewidth. We demonstrate the feasibility of this approach by giving new algorithms for several well-known problems from artificial intelligence that are in general complete for the second level of the polynomial hierarchy. By reduction from QBF we show that all resulting algorithms are essentially optimal in their dependence on the treewidth. Most of the problems that we consider were already known to be fixed-parameter linear by using Courcelle’s Theorem or dynamic programming, but we argue that our approach has clear advantages over these techniques: on the one hand, in contrast to Courcelle’s Theorem, we get concrete and tight guarantees for the runtime dependence on the treewidth. On the other hand, we avoid tedious dynamic programming and, after showing some normalization results for CNF-formulas, our upper bounds often boil down to a few lines. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08456v1 |
PDF | http://arxiv.org/pdf/1805.08456v1.pdf |
PWC | https://paperswithcode.com/paper/qbf-as-an-alternative-to-courcelles-theorem |
Repo | |
Framework | |
Predicting the Generalization Gap in Deep Networks with Margin Distributions
Title | Predicting the Generalization Gap in Deep Networks with Margin Distributions |
Authors | Yiding Jiang, Dilip Krishnan, Hossein Mobahi, Samy Bengio |
Abstract | As shown in recent research, deep neural networks can perfectly fit randomly labeled data, but with very poor accuracy on held-out data. This phenomenon indicates that loss functions such as cross-entropy are not a reliable indicator of generalization. This leads to the crucial question of how the generalization gap should be predicted from the training data and network parameters. In this paper, we propose such a measure, and conduct extensive empirical studies on how well it can predict the generalization gap. Our measure is based on the concept of the margin distribution, i.e., the distribution of distances of training points to the decision boundary. We find that it is necessary to use margin distributions at multiple layers of a deep network. On the CIFAR-10 and CIFAR-100 datasets, our proposed measure correlates very strongly with the generalization gap. In addition, we find the following other factors to be of importance: normalizing margin values for scale independence, using characterizations of the margin distribution rather than just the margin (closest distance to the decision boundary), and working in log space instead of linear space (effectively using a product of margins rather than a sum). Our measure can be easily applied to feedforward deep networks with any architecture and may point towards new training loss functions that could enable better generalization. |
Tasks | |
Published | 2018-09-28 |
URL | https://arxiv.org/abs/1810.00113v2 |
PDF | https://arxiv.org/pdf/1810.00113v2.pdf |
PWC | https://paperswithcode.com/paper/predicting-the-generalization-gap-in-deep |
Repo | |
Framework | |
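The margin computation the abstract describes can be sketched for the output (logit) layer of a linear classifier; the paper also uses first-order approximations of the margin at hidden layers and richer distribution statistics, which this illustrative sketch omits, and all function and variable names here are ours, not the authors':

```python
import numpy as np

def margin_distribution(features, W, b, labels):
    """Signed margins of points to the decision boundary of a linear
    classifier (scores = features @ W + b), normalized for scale
    independence, summarized by quartiles of the distribution."""
    scores = features @ W + b                     # (n, classes)
    n = len(labels)
    y = np.asarray(labels)
    # runner-up class for each point
    masked = scores.copy()
    masked[np.arange(n), y] = -np.inf
    j = masked.argmax(axis=1)
    # margin = score gap / norm of the separating direction
    gap = scores[np.arange(n), y] - scores[np.arange(n), j]
    denom = np.linalg.norm(W[:, y] - W[:, j], axis=0)
    margins = gap / denom
    # normalize by feature scale, then characterize the distribution
    margins = margins / (features.std() + 1e-12)
    return np.quantile(margins, [0.25, 0.5, 0.75])
```

The quartile vector (one per layer in the paper's setting) is the feature from which the generalization gap is regressed.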
Elliptical Distributions-Based Weights-Determining Method for OWA Operators
Title | Elliptical Distributions-Based Weights-Determining Method for OWA Operators |
Authors | Xiuyan Sha, Zeshui Xu, Chuancun Yin |
Abstract | The ordered weighted averaging (OWA) operators play a crucial role in aggregating multiple criteria evaluations into an overall assessment supporting the decision makers’ choice. One key step is to determine the associated weights. In this paper, we first briefly review some main methods for determining the weights by using distribution functions. Then we propose a new approach for determining OWA weights by using the RIM quantifier. Motivated by the idea of the normal distribution-based method for determining the OWA weights, we develop a method based on elliptical distributions for determining the OWA weights, and investigate some of its desirable properties. |
Tasks | |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.02909v1 |
PDF | http://arxiv.org/pdf/1809.02909v1.pdf |
PWC | https://paperswithcode.com/paper/elliptical-distributions-based-weights |
Repo | |
Framework | |
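As a concrete illustration (not taken from the paper), the standard RIM-quantifier construction of OWA weights the abstract refers to, and the OWA aggregation itself, can be sketched as follows; the quantifier Q(x) = x² is an arbitrary example choice, and the paper's elliptical-distribution-based weights are not reproduced here:

```python
import numpy as np

def rim_weights(Q, n):
    """OWA weights from a RIM quantifier Q: w_i = Q(i/n) - Q((i-1)/n).
    The weights telescope, so they sum to Q(1) - Q(0) = 1."""
    i = np.arange(1, n + 1)
    return Q(i / n) - Q((i - 1) / n)

def owa(values, weights):
    """Ordered weighted average: weights are applied to the values
    sorted in descending order, not to the values themselves."""
    return float(np.sort(values)[::-1] @ weights)
```

For example, with Q(x) = x² and n = 4 the weights are (1/16, 3/16, 5/16, 7/16), which emphasizes the smaller inputs.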
Neural Multi-scale Image Compression
Title | Neural Multi-scale Image Compression |
Authors | Ken Nakanishi, Shin-ichi Maeda, Takeru Miyato, Daisuke Okanohara |
Abstract | This study presents a new lossy image compression method that utilizes the multi-scale features of natural images. Our model consists of two networks: a multi-scale lossy autoencoder and a parallel multi-scale lossless coder. The multi-scale lossy autoencoder extracts the multi-scale image features into quantized variables, and the parallel multi-scale lossless coder enables rapid and accurate lossless coding of the quantized variables by encoding/decoding them in parallel. Our proposed model achieves performance comparable to the state-of-the-art model on Kodak and RAISE-1k dataset images, and it encodes a PNG image of size $768 \times 512$ in 70 ms with a single GPU and a single CPU process and decodes it into a high-fidelity image in approximately 200 ms. |
Tasks | Image Compression |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06386v1 |
PDF | http://arxiv.org/pdf/1805.06386v1.pdf |
PWC | https://paperswithcode.com/paper/neural-multi-scale-image-compression |
Repo | |
Framework | |
A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based on Modified U-Net
Title | A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based on Modified U-Net |
Authors | Yang Deng, Yao Sun, Yongpei Zhu, Mingwang Zhu, Wei Han, Kehong Yuan |
Abstract | Accurate segmentation of MR brain tissue is a crucial step for diagnosis, surgical planning, and treatment of brain abnormalities. However, it is a time-consuming task when performed by medical experts, so automatic and reliable segmentation methods are required. Choosing an appropriate training subset from a limited labeled dataset, rather than using the whole dataset, is also significant for saving training time. In addition, labeled medical data are too rare and expensive to obtain extensively, so choosing an appropriate subset of unlabeled data to annotate, while attaining at least the same performance, is also very meaningful. To solve these problems, we design an automatic segmentation method based on a U-shaped deep convolutional network and obtain excellent results, with average DSC metrics of 0.8610, 0.9131, and 0.9003 for Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM) respectively on the well-known IBSR18 dataset. We use a bootstrapping algorithm for selecting the most effective training data and achieve state-of-the-art segmentation performance using only 50% of the training data. Moreover, we propose a strategy of suggestive annotation for unlabeled MR brain tissue images based on the modified U-net. The proposed method is fast and can be used in clinical practice. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07510v4 |
PDF | http://arxiv.org/pdf/1807.07510v4.pdf |
PWC | https://paperswithcode.com/paper/a-strategy-of-mr-brain-tissue-images |
Repo | |
Framework | |
Advanced local motion patterns for macro and micro facial expression recognition
Title | Advanced local motion patterns for macro and micro facial expression recognition |
Authors | B. Allaert, IM. Bilasco, C. Djeraba |
Abstract | In this paper, we develop a new method that recognizes facial expressions on the basis of an innovative local motion patterns feature, with three main contributions. The first is the analysis of the temporal elasticity of the face skin and of face deformations during an expression. The second is a unified approach for both macro- and micro-expression recognition. The third is a step towards in-the-wild expression recognition, dealing with challenges such as varying intensities, varying expression activation patterns, illumination variation and small head pose variations. Our method outperforms state-of-the-art methods for micro-expression recognition and positions itself among the top-ranked state-of-the-art methods for macro-expression recognition. |
Tasks | Facial Expression Recognition |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01951v1 |
PDF | http://arxiv.org/pdf/1805.01951v1.pdf |
PWC | https://paperswithcode.com/paper/advanced-local-motion-patterns-for-macro-and |
Repo | |
Framework | |
Unsupervised Learning of Dense Optical Flow, Depth and Egomotion from Sparse Event Data
Title | Unsupervised Learning of Dense Optical Flow, Depth and Egomotion from Sparse Event Data |
Authors | Chengxi Ye, Anton Mitrokhin, Cornelia Fermüller, James A. Yorke, Yiannis Aloimonos |
Abstract | In this work we present a lightweight, unsupervised learning pipeline for \textit{dense} depth, optical flow and egomotion estimation from the sparse event output of the Dynamic Vision Sensor (DVS). To tackle this low-level vision task, we use a novel encoder-decoder neural network architecture, ECN. Our work is the first monocular pipeline that generates dense depth and optical flow from sparse event data only. The network works in self-supervised mode and has just 150k parameters. We evaluate our pipeline on the MVSEC self-driving dataset and present results for depth, optical flow and egomotion estimation. Due to the lightweight design, the inference part of the network runs at 250 FPS on a single GPU, making the pipeline ready for realtime robotics applications. Our experiments demonstrate significant improvements upon previous works that used deep learning on event data, as well as the ability of our pipeline to perform well during both day and night. |
Tasks | Optical Flow Estimation |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08625v2 |
PDF | http://arxiv.org/pdf/1809.08625v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-dense-optical-flow |
Repo | |
Framework | |
A Benchmark and Evaluation of Non-Rigid Structure from Motion
Title | A Benchmark and Evaluation of Non-Rigid Structure from Motion |
Authors | Sebastian Hoppe Nesgaard Jensen, Alessio Del Bue, Mads Emil Brix Doest, Henrik Aanæs |
Abstract | Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, allowing us to obtain 3D information from multiple images when the scene is dynamic. A main issue regarding the further development of this important computer vision topic is the lack of high-quality data sets. We address this issue by presenting a data set compiled for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of NRSfM, including potential directions forward, we present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 16 different methods with available code, which we argue reasonably spans the state of the art in NRSfM. We also hope that the presented public data set and evaluation will provide benchmark tools for further development in this field. |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08388v2 |
PDF | http://arxiv.org/pdf/1801.08388v2.pdf |
PWC | https://paperswithcode.com/paper/a-benchmark-and-evaluation-of-non-rigid |
Repo | |
Framework | |
One-Shot Item Search with Multimodal Data
Title | One-Shot Item Search with Multimodal Data |
Authors | Jonghwa Yim, Junghun James Kim, Daekyu Shin |
Abstract | In the task of near-similar image search, features from a deep neural network are often used to compare images and measure similarity. In the past, visual search focused only on image datasets without text data. However, since deep neural networks emerged, the performance of visual search has become high enough to apply it in many industries, from 3D data to multimodal data. Compared to the need for multimodal search, there has not been sufficient research. In this paper, we present a method of near-similar search over an image and text multimodal dataset. Previously, similar-image search, especially when searching shopping items, treated image and text separately, first finding similar items and then reordering the results, which regards image search and text matching as two different tasks. Our method, however, explores the data to compute k-nearest neighbors using both image and text. In our experiments on similar-item search, our system using multimodal data shows better performance than a single modality while adding only a minute amount of computing time. For the experiments, we collected more than 15 million accessory items and six million digital product items from online shopping websites, where each product item comprises item images, a title, categories, and a description. We then compare the performance of multimodal search to single-space search on these datasets. |
Tasks | Image Retrieval, Text Matching |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10969v2 |
PDF | http://arxiv.org/pdf/1811.10969v2.pdf |
PWC | https://paperswithcode.com/paper/one-shot-item-search-with-multimodal-data |
Repo | |
Framework | |
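A minimal sketch of multimodal nearest-neighbor search of the kind the abstract describes: normalize image and text embeddings, fuse them into one search vector, and retrieve by cosine similarity. The fusion scheme (weighted concatenation) and all names are illustrative assumptions, not the paper's method:

```python
import numpy as np

def fuse(img_emb, txt_emb, alpha=0.5):
    """Concatenate L2-normalized image and text embeddings into one
    search vector; alpha trades off the two modalities."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return np.hstack([alpha * img, (1 - alpha) * txt])

def knn(query, index, k=5):
    """Brute-force k-nearest neighbors by cosine similarity.
    Returns the indices of the k most similar index rows."""
    q = query / np.linalg.norm(query)
    rows = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = rows @ q
    return np.argsort(-sims)[:k]
```

At the scale reported in the paper (millions of items), the brute-force scan above would be replaced by an approximate nearest-neighbor index, but the fused query vector is the same.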
Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling
Title | Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling |
Authors | Liang Ding, Di Chang, Russell Malmberg, Aaron Martinez, David Robinson, Matthew Wicker, Hongfei Yan, Liming Cai |
Abstract | The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree width $\leq k$, for every fixed $k\geq 2$. In particular, we prove that approximation of a finite probabilistic system with such Markov networks has the minimum information loss when the network topology is achieved with a maximum spanning $k$-tree. While constructing a maximum spanning $k$-tree is intractable even for $k=2$, we show that polynomial algorithms can be ensured by a sufficient condition accommodated by many meaningful applications. In particular, we give an efficient algorithm for learning the optimal topology of higher-order correlations among random variables that belong to an underlying linear structure. |
Tasks | |
Published | 2018-01-21 |
URL | http://arxiv.org/abs/1801.06900v1 |
PDF | http://arxiv.org/pdf/1801.06900v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-learning-of-optimal-markov-network |
Repo | |
Framework | |
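The $k=1$ base case that the paper generalizes, the classical Chow-Liu construction, is simple enough to sketch: estimate pairwise mutual information and take a maximum spanning tree over it. This is the 1968 algorithm, not the paper's spanning $k$-tree method:

```python
import numpy as np
from itertools import combinations

def mutual_info(x, y):
    """Empirical mutual information between two discrete samples."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_edges(data):
    """Maximum spanning tree over pairwise mutual information (Kruskal
    with union-find).  data: (n_samples, n_vars) array of discrete
    values.  Returns the edges of the Chow-Liu tree, i.e. the k = 1
    case of the maximum spanning k-tree problem."""
    n_vars = data.shape[1]
    edges = sorted(
        ((mutual_info(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True)
    parent = list(range(n_vars))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in edges:           # greedily add the heaviest edges
        ri, rj = find(i), find(j)
        if ri != rj:                # ...that do not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

For $k \geq 2$ the greedy step above no longer yields an optimal structure, which is exactly the intractability the abstract refers to.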
A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome
Title | A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome |
Authors | Li Chen, Hailun Ding, Qi Li, Jiawei Zhu, Jian Peng, Haifeng Li |
Abstract | Convolutional neural networks (CNNs) are easily spoofed by adversarial examples, which lead to wrong classification results. Most defense methods focus only on improving the robustness of CNNs or on detecting adversarial examples; they are incapable of detecting and correctly classifying adversarial examples simultaneously. We find that adversarial examples and original images have diverse representations in the feature space, and that this difference grows as layers go deeper, which we call Adversarial Feature Separability (AFS). Inspired by AFS, we propose a defense framework based on the Adversarial Feature Genome (AFG), which can simultaneously detect adversarial examples and correctly classify them into their original classes. The AFG is an innovative encoding for both images and adversarial examples. It consists of group features and a mixed label. With the group features, which are visual representations of adversarial and original images obtained via a group visualization method, one can detect adversarial examples because of the AFS of group features. With the mixed label, one can trace back to the original label of an adversarial example. The classification of an adversarial example is then modeled as a multi-label classification problem trained on the AFG dataset, which recovers the original class of the adversarial example. Experiments show that the proposed framework not only effectively detects adversarial examples from different attack algorithms, but also correctly classifies them. Our framework potentially gives a new, data-driven perspective on improving the robustness of a CNN model. |
Tasks | Multi-Label Classification |
Published | 2018-12-25 |
URL | http://arxiv.org/abs/1812.10085v2 |
PDF | http://arxiv.org/pdf/1812.10085v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-feature-genome-a-data-driven |
Repo | |
Framework | |
Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle
Title | Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle |
Authors | Evgeniy Martyushev |
Abstract | The internal calibration of a pinhole camera is given by five parameters that are combined into an upper-triangular $3\times 3$ calibration matrix. If the skew parameter is zero and the aspect ratio is equal to one, then the camera is said to have a Euclidean image plane. In this paper, we propose a non-iterative self-calibration algorithm for a camera with a Euclidean image plane in the case where the remaining three internal parameters — the focal length and the principal point coordinates — are fixed but unknown. The algorithm requires a set of $N \geq 7$ point correspondences in two views and also the measured relative rotation angle between the views. We show that the problem generically has six solutions (including complex ones). The algorithm has been implemented and tested both on synthetic data and on a publicly available real dataset. The experiments demonstrate that the method is correct, numerically stable and robust. |
Tasks | Calibration |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11279v1 |
PDF | http://arxiv.org/pdf/1807.11279v1.pdf |
PWC | https://paperswithcode.com/paper/self-calibration-of-cameras-with-euclidean |
Repo | |
Framework | |
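The parameterization the abstract describes is easy to make concrete: with zero skew and unit aspect ratio, the five-parameter calibration matrix collapses to three unknowns. The helper name below is ours:

```python
import numpy as np

def euclidean_K(f, u, v):
    """Calibration matrix of a camera with a Euclidean image plane:
    zero skew and unit aspect ratio leave only the focal length f
    and the principal point (u, v) as unknowns."""
    return np.array([[f, 0.0, u],
                     [0.0, f, v],
                     [0.0, 0.0, 1.0]])
```

The paper's algorithm estimates exactly these three values (f, u, v) from the $N \geq 7$ correspondences and the known relative rotation angle.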
Comparing Computing Platforms for Deep Learning on a Humanoid Robot
Title | Comparing Computing Platforms for Deep Learning on a Humanoid Robot |
Authors | Alexander Biddulph, Trent Houlistion, Alexandre Mendes, Stephan K. Chalup |
Abstract | The goal of this study is to test two different computing platforms with respect to their suitability for running deep networks as part of a humanoid robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks, including pedestrian detection using deep neural networks. Some of the results were unexpected, but they demonstrate that each platform has both advantages and disadvantages when the computational performance and electrical power requirements of such a system are taken into account. |
Tasks | Pedestrian Detection |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03668v2 |
PDF | http://arxiv.org/pdf/1809.03668v2.pdf |
PWC | https://paperswithcode.com/paper/comparing-computing-platforms-for-deep |
Repo | |
Framework | |
How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level
Title | How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level |
Authors | Vlad Eidelman, Anastassia Kornilova, Daniel Argyle |
Abstract | Modeling U.S. Congressional legislation and roll-call votes has received significant attention in previous literature. However, while legislators across 50 state governments and D.C. propose over 100,000 bills each year, and on average enact over 30% of them, state-level analysis has received relatively less attention, due in part to the difficulty of obtaining the necessary data. Since each state legislature is guided by its own procedures, politics and issues, it is difficult to qualitatively assess the factors that affect the likelihood of a legislative initiative succeeding. Herein, we present several methods for modeling the likelihood of a bill receiving floor action across all 50 states and D.C. We utilize the lexical content of over 1 million bills, along with contextual legislature- and legislator-derived features, to build our predictive models, allowing a comparison of the factors that are important to the lawmaking process. Furthermore, we show that these signals hold complementary predictive power, together achieving an average improvement in accuracy of 18% over state-specific baselines. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05284v1 |
PDF | http://arxiv.org/pdf/1806.05284v1.pdf |
PWC | https://paperswithcode.com/paper/how-predictable-is-your-state-leveraging |
Repo | |
Framework | |