Paper Group ANR 361
Human Activity Recognition for Mobile Robot. QBF as an Alternative to Courcelle’s Theorem. Predicting the Generalization Gap in Deep Networks with Margin Distributions. Elliptical Distributions-Based Weights-Determining Method for OWA Operators. Neural Multi-scale Image Compression. A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based …
Human Activity Recognition for Mobile Robot
Title | Human Activity Recognition for Mobile Robot |
Authors | Iyiola E. Olatunji |
Abstract | Due to the increasing number of mobile robots, including domestic robots for cleaning and maintenance in developed countries, human activity recognition is essential for congruent human-robot interaction. Although this is a challenging task for robots, learning human activities is expedient for autonomous mobile robots (AMRs) navigating an uncontrolled environment without guidance. Building a correct classifier for complex human actions is non-trivial, since simple actions can be combined into a complex human activity. In this paper, we train a model for human activity recognition using a convolutional neural network. We train and validate the model on the Vicon physical action dataset and also test it on our generated dataset (VMCUHK). Our experiments show that our method performs the human activity recognition task with high accuracy on both the Vicon physical action dataset and the VMCUHK dataset. |
Tasks | Activity Recognition, Human Activity Recognition |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07633v1 |
PDF | http://arxiv.org/pdf/1801.07633v1.pdf |
PWC | https://paperswithcode.com/paper/human-activity-recognition-for-mobile-robot |
Repo | |
Framework | |
QBF as an Alternative to Courcelle’s Theorem
Title | QBF as an Alternative to Courcelle’s Theorem |
Authors | Michael Lampis, Stefan Mengel, Valia Mitsou |
Abstract | We propose reductions to quantified Boolean formulas (QBF) as a new approach to showing fixed-parameter linear algorithms for problems parameterized by treewidth. We demonstrate the feasibility of this approach by giving new algorithms for several well-known problems from artificial intelligence that are in general complete for the second level of the polynomial hierarchy. By reduction from QBF we show that all resulting algorithms are essentially optimal in their dependence on the treewidth. Most of the problems that we consider were already known to be fixed-parameter linear by using Courcelle’s Theorem or dynamic programming, but we argue that our approach has clear advantages over these techniques: on the one hand, in contrast to Courcelle’s Theorem, we get concrete and tight guarantees for the runtime dependence on the treewidth. On the other hand, we avoid tedious dynamic programming and, after showing some normalization results for CNF-formulas, our upper bounds often boil down to a few lines. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08456v1 |
PDF | http://arxiv.org/pdf/1805.08456v1.pdf |
PWC | https://paperswithcode.com/paper/qbf-as-an-alternative-to-courcelles-theorem |
Repo | |
Framework | |
Predicting the Generalization Gap in Deep Networks with Margin Distributions
Title | Predicting the Generalization Gap in Deep Networks with Margin Distributions |
Authors | Yiding Jiang, Dilip Krishnan, Hossein Mobahi, Samy Bengio |
Abstract | As shown in recent research, deep neural networks can perfectly fit randomly labeled data, but with very poor accuracy on held-out data. This phenomenon indicates that loss functions such as cross-entropy are not a reliable indicator of generalization. This leads to the crucial question of how the generalization gap should be predicted from the training data and network parameters. In this paper, we propose such a measure, and conduct extensive empirical studies on how well it can predict the generalization gap. Our measure is based on the concept of the margin distribution, i.e., the distribution of distances of training points to the decision boundary. We find that it is necessary to use margin distributions at multiple layers of a deep network. On the CIFAR-10 and CIFAR-100 datasets, our proposed measure correlates very strongly with the generalization gap. In addition, we find the following other factors to be of importance: normalizing margin values for scale independence, using characterizations of the margin distribution rather than just the margin (closest distance to the decision boundary), and working in log space instead of linear space (effectively using a product of margins rather than a sum). Our measure can be easily applied to feedforward deep networks with any architecture and may point towards new training loss functions that could enable better generalization. |
Tasks | |
Published | 2018-09-28 |
URL | https://arxiv.org/abs/1810.00113v2 |
PDF | https://arxiv.org/pdf/1810.00113v2.pdf |
PWC | https://paperswithcode.com/paper/predicting-the-generalization-gap-in-deep |
Repo | |
Framework | |
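The margin computation the abstract describes can be sketched for the output (logit) layer of a linear classifier; the paper also uses first-order approximations of the margin at hidden layers and richer distribution statistics, which this illustrative sketch omits, and all function and variable names here are ours, not the authors':

```python
import numpy as np

def margin_distribution(features, W, b, labels):
    """Signed margins of points to the decision boundary of a linear
    classifier (scores = features @ W + b), normalized for scale
    independence, summarized by quartiles of the distribution."""
    scores = features @ W + b                     # (n, classes)
    n = len(labels)
    y = np.asarray(labels)
    # runner-up class for each point
    masked = scores.copy()
    masked[np.arange(n), y] = -np.inf
    j = masked.argmax(axis=1)
    # margin = score gap / norm of the separating direction
    gap = scores[np.arange(n), y] - scores[np.arange(n), j]
    denom = np.linalg.norm(W[:, y] - W[:, j], axis=0)
    margins = gap / denom
    # normalize by feature scale, then characterize the distribution
    margins = margins / (features.std() + 1e-12)
    return np.quantile(margins, [0.25, 0.5, 0.75])
```

The quartile vector (one per layer in the paper's setting) is the feature from which the generalization gap is regressed.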
Elliptical Distributions-Based Weights-Determining Method for OWA Operators
Title | Elliptical Distributions-Based Weights-Determining Method for OWA Operators |
Authors | Xiuyan Sha, Zeshui Xu, Chuancun Yin |
Abstract | The ordered weighted averaging (OWA) operators play a crucial role in aggregating multiple criteria evaluations into an overall assessment supporting the decision makers’ choice. One key step is to determine the associated weights. In this paper, we first briefly review some main methods for determining the weights by using distribution functions. Then we propose a new approach for determining OWA weights by using the RIM quantifier. Motivated by the idea of the normal distribution-based method for determining the OWA weights, we develop a method based on elliptical distributions for determining the OWA weights, and investigate some of its desirable properties. |
Tasks | |
Published | 2018-09-09 |
URL | http://arxiv.org/abs/1809.02909v1 |
PDF | http://arxiv.org/pdf/1809.02909v1.pdf |
PWC | https://paperswithcode.com/paper/elliptical-distributions-based-weights |
Repo | |
Framework | |
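As a concrete illustration (not taken from the paper), the standard RIM-quantifier construction of OWA weights the abstract refers to, and the OWA aggregation itself, can be sketched as follows; the quantifier Q(x) = x² is an arbitrary example choice, and the paper's elliptical-distribution-based weights are not reproduced here:

```python
import numpy as np

def rim_weights(Q, n):
    """OWA weights from a RIM quantifier Q: w_i = Q(i/n) - Q((i-1)/n).
    The weights telescope, so they sum to Q(1) - Q(0) = 1."""
    i = np.arange(1, n + 1)
    return Q(i / n) - Q((i - 1) / n)

def owa(values, weights):
    """Ordered weighted average: weights are applied to the values
    sorted in descending order, not to the values themselves."""
    return float(np.sort(values)[::-1] @ weights)
```

For example, with Q(x) = x² and n = 4 the weights are (1/16, 3/16, 5/16, 7/16), which emphasizes the smaller inputs.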
Neural Multi-scale Image Compression
Title | Neural Multi-scale Image Compression |
Authors | Ken Nakanishi, Shin-ichi Maeda, Takeru Miyato, Daisuke Okanohara |
Abstract | This study presents a new lossy image compression method that utilizes the multi-scale features of natural images. Our model consists of two networks: a multi-scale lossy autoencoder and a parallel multi-scale lossless coder. The multi-scale lossy autoencoder extracts the multi-scale image features into quantized variables, and the parallel multi-scale lossless coder enables rapid and accurate lossless coding of the quantized variables by encoding/decoding them in parallel. Our proposed model achieves performance comparable to the state-of-the-art model on Kodak and RAISE-1k dataset images, and it encodes a PNG image of size $768 \times 512$ in 70 ms with a single GPU and a single CPU process and decodes it into a high-fidelity image in approximately 200 ms. |
Tasks | Image Compression |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06386v1 |
PDF | http://arxiv.org/pdf/1805.06386v1.pdf |
PWC | https://paperswithcode.com/paper/neural-multi-scale-image-compression |
Repo | |
Framework | |
A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based on Modified U-Net
Title | A Strategy of MR Brain Tissue Images’ Suggestive Annotation Based on Modified U-Net |
Authors | Yang Deng, Yao Sun, Yongpei Zhu, Mingwang Zhu, Wei Han, Kehong Yuan |
Abstract | Accurate segmentation of MR brain tissue is a crucial step for diagnosis, surgical planning, and treatment of brain abnormalities. However, it is a time-consuming task when performed by medical experts, so automatic and reliable segmentation methods are required. Choosing an appropriate training subset from a limited labeled dataset, rather than using the whole dataset, is also significant for saving training time. In addition, labeled medical data are too rare and expensive to obtain extensively, so choosing an appropriate subset of unlabeled data to annotate, while attaining at least the same performance, is also very meaningful. To solve these problems, we design an automatic segmentation method based on a U-shaped deep convolutional network and obtain excellent results, with average DSC metrics of 0.8610, 0.9131, and 0.9003 for Cerebrospinal Fluid (CSF), Gray Matter (GM), and White Matter (WM) respectively on the well-known IBSR18 dataset. We use a bootstrapping algorithm for selecting the most effective training data and achieve state-of-the-art segmentation performance using only 50% of the training data. Moreover, we propose a strategy of suggestive annotation for unlabeled MR brain tissue images based on the modified U-net. The proposed method is fast and can be used in clinical practice. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07510v4 |
PDF | http://arxiv.org/pdf/1807.07510v4.pdf |
PWC | https://paperswithcode.com/paper/a-strategy-of-mr-brain-tissue-images |
Repo | |
Framework | |
Advanced local motion patterns for macro and micro facial expression recognition
Title | Advanced local motion patterns for macro and micro facial expression recognition |
Authors | B. Allaert, IM. Bilasco, C. Djeraba |
Abstract | In this paper, we develop a new method that recognizes facial expressions on the basis of an innovative local motion patterns feature, with three main contributions. The first is the analysis of the temporal elasticity of the face skin and of face deformations during an expression. The second is a unified approach for both macro- and micro-expression recognition. The third is a step towards in-the-wild expression recognition, dealing with challenges such as varying intensities, varying expression activation patterns, illumination variation and small head pose variations. Our method outperforms state-of-the-art methods for micro-expression recognition and positions itself among the top-ranked state-of-the-art methods for macro-expression recognition. |
Tasks | Facial Expression Recognition |
Published | 2018-05-04 |
URL | http://arxiv.org/abs/1805.01951v1 |
PDF | http://arxiv.org/pdf/1805.01951v1.pdf |
PWC | https://paperswithcode.com/paper/advanced-local-motion-patterns-for-macro-and |
Repo | |
Framework | |
Unsupervised Learning of Dense Optical Flow, Depth and Egomotion from Sparse Event Data
Title | Unsupervised Learning of Dense Optical Flow, Depth and Egomotion from Sparse Event Data |
Authors | Chengxi Ye, Anton Mitrokhin, Cornelia Fermüller, James A. Yorke, Yiannis Aloimonos |
Abstract | In this work we present a lightweight, unsupervised learning pipeline for \textit{dense} depth, optical flow and egomotion estimation from the sparse event output of the Dynamic Vision Sensor (DVS). To tackle this low-level vision task, we use a novel encoder-decoder neural network architecture, ECN. Our work is the first monocular pipeline that generates dense depth and optical flow from sparse event data only. The network works in self-supervised mode and has just 150k parameters. We evaluate our pipeline on the MVSEC self-driving dataset and present results for depth, optical flow and egomotion estimation. Due to the lightweight design, the inference part of the network runs at 250 FPS on a single GPU, making the pipeline ready for realtime robotics applications. Our experiments demonstrate significant improvements upon previous works that used deep learning on event data, as well as the ability of our pipeline to perform well during both day and night. |
Tasks | Optical Flow Estimation |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08625v2 |
PDF | http://arxiv.org/pdf/1809.08625v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-dense-optical-flow |
Repo | |
Framework | |
A Benchmark and Evaluation of Non-Rigid Structure from Motion
Title | A Benchmark and Evaluation of Non-Rigid Structure from Motion |
Authors | Sebastian Hoppe Nesgaard Jensen, Alessio Del Bue, Mads Emil Brix Doest, Henrik Aanæs |
Abstract | Non-rigid structure from motion (NRSfM) is a long-standing and central problem in computer vision, allowing us to obtain 3D information from multiple images when the scene is dynamic. A main issue regarding the further development of this important computer vision topic is the lack of high-quality data sets. We address this issue by presenting a data set compiled for this purpose, which is made publicly available and is considerably larger than the previous state of the art. To validate the applicability of this data set, and to provide an investigation into the state of the art of NRSfM, including potential directions forward, we present a benchmark and a scrupulous evaluation using this data set. This benchmark evaluates 16 different methods with available code, which we argue reasonably spans the state of the art in NRSfM. We also hope that the presented public data set and evaluation will provide benchmark tools for further development in this field. |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08388v2 |
PDF | http://arxiv.org/pdf/1801.08388v2.pdf |
PWC | https://paperswithcode.com/paper/a-benchmark-and-evaluation-of-non-rigid |
Repo | |
Framework | |
One-Shot Item Search with Multimodal Data
Title | One-Shot Item Search with Multimodal Data |
Authors | Jonghwa Yim, Junghun James Kim, Daekyu Shin |
Abstract | In the task of near-similar image search, features from a deep neural network are often used to compare images and measure similarity. In the past, visual search focused only on image datasets without text data. However, since deep neural networks emerged, the performance of visual search has become high enough to apply it in many industries, from 3D data to multimodal data. Compared to the need for multimodal search, there has not been sufficient research. In this paper, we present a method of near-similar search over an image and text multimodal dataset. Previously, similar-image search, especially when searching shopping items, treated image and text separately, first finding similar items and then reordering the results, which regards image search and text matching as two different tasks. Our method, however, explores the data to compute k-nearest neighbors using both image and text. In our experiments on similar-item search, our system using multimodal data shows better performance than a single modality while adding only a minute amount of computing time. For the experiments, we collected more than 15 million accessory items and six million digital product items from online shopping websites, where each product item comprises item images, a title, categories, and a description. We then compare the performance of multimodal search to single-space search on these datasets. |
Tasks | Image Retrieval, Text Matching |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10969v2 |
PDF | http://arxiv.org/pdf/1811.10969v2.pdf |
PWC | https://paperswithcode.com/paper/one-shot-item-search-with-multimodal-data |
Repo | |
Framework | |
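A minimal sketch of multimodal nearest-neighbor search of the kind the abstract describes: normalize image and text embeddings, fuse them into one search vector, and retrieve by cosine similarity. The fusion scheme (weighted concatenation) and all names are illustrative assumptions, not the paper's method:

```python
import numpy as np

def fuse(img_emb, txt_emb, alpha=0.5):
    """Concatenate L2-normalized image and text embeddings into one
    search vector; alpha trades off the two modalities."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return np.hstack([alpha * img, (1 - alpha) * txt])

def knn(query, index, k=5):
    """Brute-force k-nearest neighbors by cosine similarity.
    Returns the indices of the k most similar index rows."""
    q = query / np.linalg.norm(query)
    rows = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = rows @ q
    return np.argsort(-sims)[:k]
```

At the scale reported in the paper (millions of items), the brute-force scan above would be replaced by an approximate nearest-neighbor index, but the fused query vector is the same.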
Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling
Title | Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling |
Authors | Liang Ding, Di Chang, Russell Malmberg, Aaron Martinez, David Robinson, Matthew Wicker, Hongfei Yan, Liming Cai |
Abstract | The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree width $\leq k$, for every fixed $k\geq 2$. In particular, we prove that approximation of a finite probabilistic system with such Markov networks has the minimum information loss when the network topology is achieved with a maximum spanning $k$-tree. While constructing a maximum spanning $k$-tree is intractable even for $k=2$, we show that polynomial algorithms can be ensured by a sufficient condition accommodated by many meaningful applications. In particular, we give an efficient algorithm for learning the optimal topology of higher-order correlations among random variables that belong to an underlying linear structure. |
Tasks | |
Published | 2018-01-21 |
URL | http://arxiv.org/abs/1801.06900v1 |
PDF | http://arxiv.org/pdf/1801.06900v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-learning-of-optimal-markov-network |
Repo | |
Framework | |
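The $k=1$ base case that the paper generalizes, the classical Chow-Liu construction, is simple enough to sketch: estimate pairwise mutual information and take a maximum spanning tree over it. This is the 1968 algorithm, not the paper's spanning $k$-tree method:

```python
import numpy as np
from itertools import combinations

def mutual_info(x, y):
    """Empirical mutual information between two discrete samples."""
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_edges(data):
    """Maximum spanning tree over pairwise mutual information (Kruskal
    with union-find).  data: (n_samples, n_vars) array of discrete
    values.  Returns the edges of the Chow-Liu tree, i.e. the k = 1
    case of the maximum spanning k-tree problem."""
    n_vars = data.shape[1]
    edges = sorted(
        ((mutual_info(data[:, i], data[:, j]), i, j)
         for i, j in combinations(range(n_vars), 2)),
        reverse=True)
    parent = list(range(n_vars))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in edges:           # greedily add the heaviest edges
        ri, rj = find(i), find(j)
        if ri != rj:                # ...that do not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

For $k \geq 2$ the greedy step above no longer yields an optimal structure, which is exactly the intractability the abstract refers to.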
A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome
Title | A Data-driven Adversarial Examples Recognition Framework via Adversarial Feature Genome |
Authors | Li Chen, Hailun Ding, Qi Li, Jiawei Zhu, Jian Peng, Haifeng Li |
Abstract | Convolutional neural networks (CNNs) are easily spoofed by adversarial examples, which lead to wrong classification results. Most defense methods focus only on improving the robustness of CNNs or on detecting adversarial examples; they are incapable of detecting and correctly classifying adversarial examples simultaneously. We find that adversarial examples and original images have diverse representations in the feature space, and that this difference grows as layers go deeper, which we call Adversarial Feature Separability (AFS). Inspired by AFS, we propose a defense framework based on the Adversarial Feature Genome (AFG), which can simultaneously detect adversarial examples and correctly classify them into their original classes. The AFG is an innovative encoding for both images and adversarial examples. It consists of group features and a mixed label. With the group features, which are visual representations of adversarial and original images obtained via a group visualization method, one can detect adversarial examples because of the AFS of group features. With the mixed label, one can trace back to the original label of an adversarial example. The classification of an adversarial example is then modeled as a multi-label classification problem trained on the AFG dataset, which recovers the original class of the adversarial example. Experiments show that the proposed framework not only effectively detects adversarial examples from different attack algorithms, but also correctly classifies them. Our framework potentially gives a new, data-driven perspective on improving the robustness of a CNN model. |
Tasks | Multi-Label Classification |
Published | 2018-12-25 |
URL | http://arxiv.org/abs/1812.10085v2 |
PDF | http://arxiv.org/pdf/1812.10085v2.pdf |
PWC | https://paperswithcode.com/paper/adversarial-feature-genome-a-data-driven |
Repo | |
Framework | |
Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle
Title | Self-Calibration of Cameras with Euclidean Image Plane in Case of Two Views and Known Relative Rotation Angle |
Authors | Evgeniy Martyushev |
Abstract | The internal calibration of a pinhole camera is given by five parameters that are combined into an upper-triangular $3\times 3$ calibration matrix. If the skew parameter is zero and the aspect ratio is equal to one, then the camera is said to have a Euclidean image plane. In this paper, we propose a non-iterative self-calibration algorithm for a camera with a Euclidean image plane in the case where the remaining three internal parameters — the focal length and the principal point coordinates — are fixed but unknown. The algorithm requires a set of $N \geq 7$ point correspondences in two views and also the measured relative rotation angle between the views. We show that the problem generically has six solutions (including complex ones). The algorithm has been implemented and tested both on synthetic data and on a publicly available real dataset. The experiments demonstrate that the method is correct, numerically stable and robust. |
Tasks | Calibration |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11279v1 |
PDF | http://arxiv.org/pdf/1807.11279v1.pdf |
PWC | https://paperswithcode.com/paper/self-calibration-of-cameras-with-euclidean |
Repo | |
Framework | |
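The parameterization the abstract describes is easy to make concrete: with zero skew and unit aspect ratio, the five-parameter calibration matrix collapses to three unknowns. The helper name below is ours:

```python
import numpy as np

def euclidean_K(f, u, v):
    """Calibration matrix of a camera with a Euclidean image plane:
    zero skew and unit aspect ratio leave only the focal length f
    and the principal point (u, v) as unknowns."""
    return np.array([[f, 0.0, u],
                     [0.0, f, v],
                     [0.0, 0.0, 1.0]])
```

The paper's algorithm estimates exactly these three values (f, u, v) from the $N \geq 7$ correspondences and the known relative rotation angle.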
Comparing Computing Platforms for Deep Learning on a Humanoid Robot
Title | Comparing Computing Platforms for Deep Learning on a Humanoid Robot |
Authors | Alexander Biddulph, Trent Houlistion, Alexandre Mendes, Stephan K. Chalup |
Abstract | The goal of this study is to test two different computing platforms with respect to their suitability for running deep networks as part of a humanoid robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks, including pedestrian detection using deep neural networks. Some of the results were unexpected, but they demonstrate that each platform has both advantages and disadvantages when the computational performance and electrical power requirements of such a system are taken into account. |
Tasks | Pedestrian Detection |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03668v2 |
PDF | http://arxiv.org/pdf/1809.03668v2.pdf |
PWC | https://paperswithcode.com/paper/comparing-computing-platforms-for-deep |
Repo | |
Framework | |
How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level
Title | How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level |
Authors | Vlad Eidelman, Anastassia Kornilova, Daniel Argyle |
Abstract | Modeling U.S. Congressional legislation and roll-call votes has received significant attention in previous literature. However, while legislators across 50 state governments and D.C. propose over 100,000 bills each year, and on average enact over 30% of them, state-level analysis has received relatively less attention, due in part to the difficulty of obtaining the necessary data. Since each state legislature is guided by its own procedures, politics and issues, it is difficult to qualitatively assess the factors that affect the likelihood of a legislative initiative succeeding. Herein, we present several methods for modeling the likelihood of a bill receiving floor action across all 50 states and D.C. We utilize the lexical content of over 1 million bills, along with contextual legislature- and legislator-derived features, to build our predictive models, allowing a comparison of the factors that are important to the lawmaking process. Furthermore, we show that these signals hold complementary predictive power, together achieving an average improvement in accuracy of 18% over state-specific baselines. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.05284v1 |
PDF | http://arxiv.org/pdf/1806.05284v1.pdf |
PWC | https://paperswithcode.com/paper/how-predictable-is-your-state-leveraging |
Repo | |
Framework | |