January 28, 2020

Paper Group ANR 861

OtoMechanic: Auditory Automobile Diagnostics via Query-by-Example


Title	OtoMechanic: Auditory Automobile Diagnostics via Query-by-Example
Authors	Max Morrison, Bryan Pardo
Abstract	Early detection and repair of failing components in automobiles reduces the risk of vehicle failure in life-threatening situations. Many automobile components in need of repair produce characteristic sounds. For example, loose drive belts emit a high-pitched squeaking sound, and bad starter motors have a characteristic whirring or clicking noise. Often drivers can tell that the sound of their car is not normal, but may not be able to identify the cause. To mitigate this knowledge gap, we have developed OtoMechanic, a web application to detect and diagnose vehicle component issues from their corresponding sounds. It compares a user’s recording of a problematic sound to a database of annotated sounds caused by failing automobile components. OtoMechanic returns the most similar sounds, and provides weblinks for more information on the diagnosis associated with each sound, along with an estimate of the similarity of each retrieved sound. In user studies, we find that OtoMechanic significantly increases diagnostic accuracy relative to a baseline accuracy of consumer performance.
Tasks
Published	2019-11-05
URL	https://arxiv.org/abs/1911.02073v1
PDF	https://arxiv.org/pdf/1911.02073v1.pdf
PWC	https://paperswithcode.com/paper/otomechanic-auditory-automobile-diagnostics
Repo
Framework
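
The retrieval step described in the abstract (compare a query recording against annotated sounds, then return the most similar ones with a similarity estimate) can be sketched as nearest-neighbour search over fixed-length audio embeddings. This is a minimal sketch only: the embedding vectors, the choice of cosine similarity, the function name `rank_by_similarity`, and the toy labels are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def rank_by_similarity(query, database, labels, top_k=3):
    """Return the top_k (label, cosine similarity) pairs for a query embedding.

    query:    1-D array, hypothetical embedding of the user's recording.
    database: 2-D array, one hypothetical embedding per annotated sound.
    labels:   diagnosis label for each database row.
    """
    query = np.asarray(query, dtype=float)
    database = np.asarray(database, dtype=float)
    # Normalise so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    sims = db @ q
    # Sort by descending similarity and keep the best top_k rows.
    order = np.argsort(-sims)[:top_k]
    return [(labels[i], float(sims[i])) for i in order]

# Toy data: three annotated sounds in a made-up 3-D embedding space.
database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
labels = ["loose drive belt", "bad starter motor", "worn brake pad"]
results = rank_by_similarity([0.9, 0.1, 0.0], database, labels, top_k=2)
```

With this toy data the query lies closest to the first database row, so "loose drive belt" is ranked first. A real system would need an audio front end to produce the embeddings, which this sketch leaves out.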

Questions to Guide the Future of Artificial Intelligence Research


Title	Questions to Guide the Future of Artificial Intelligence Research
Authors	Jordan Ott
Abstract	The field of machine learning has focused primarily on discretized sub-problems of intelligence (e.g. vision, speech, natural language), while neuroscience tends to be observation-heavy, providing few guiding theories. It is unlikely that artificial intelligence will emerge through only one of these disciplines. Instead, it is likely to be some amalgamation of their algorithmic and observational findings. As a result, there are a number of problems that should be addressed in order to select the beneficial aspects of both fields. In this article, we propose leading questions to guide the future of artificial intelligence research. There are clear computational principles on which the brain operates. The problem is finding these computational needles in a haystack of biological complexity. Biology has clear constraints, but by not using it as a guide we are constraining ourselves.
Tasks
Published	2019-12-21
URL	https://arxiv.org/abs/1912.10305v2
PDF	https://arxiv.org/pdf/1912.10305v2.pdf
PWC	https://paperswithcode.com/paper/questions-to-guide-the-future-of-artificial
Repo
Framework

Optimal Function Approximation with Relu Neural Networks


Title	Optimal Function Approximation with Relu Neural Networks
Authors	Bo Liu, Yi Liang
Abstract	We consider in this paper the optimal approximations of convex univariate functions with feed-forward Relu neural networks. We are interested in the following question: what is the minimal approximation error given the number of approximating linear pieces? We establish the necessary and sufficient conditions and uniqueness of optimal approximations, and give lower and upper bounds of the optimal approximation errors. Relu neural network architectures are then presented to generate these optimal approximations. Finally, we propose an algorithm to find the optimal approximations, as well as prove its convergence and validate it with experimental results.
Tasks
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03731v2
PDF	https://arxiv.org/pdf/1909.03731v2.pdf
PWC	https://paperswithcode.com/paper/optimal-function-approximation-with-relu
Repo
Framework
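
The link between ReLU networks and piecewise-linear approximation of a convex univariate function can be illustrated with a small sketch. It builds a one-hidden-layer ReLU network that interpolates a function at given knots and measures the resulting error. Note the assumptions: the knots are uniform and the network simply interpolates, so this is not the optimal approximation studied in the paper, and the names `relu_interpolant`, `knots`, and the test function x^2 are chosen only for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_interpolant(f, knots):
    """Build a one-hidden-layer ReLU network that interpolates f at the knots.

    The network is g(x) = f(k_0) + sum_i c_i * relu(x - k_i), where c_0 is the
    slope of the first linear piece and c_i (i >= 1) is the change in slope at
    knot k_i. For a convex f, every change in slope is non-negative.
    """
    knots = np.asarray(knots, dtype=float)
    values = f(knots)
    slopes = np.diff(values) / np.diff(knots)
    coeffs = np.concatenate(([slopes[0]], np.diff(slopes)))
    hinge_points = knots[:-1]  # one hidden unit per linear piece

    def g(x):
        x = np.asarray(x, dtype=float)
        return values[0] + relu(x[..., None] - hinge_points) @ coeffs

    return g, coeffs

f = lambda x: x ** 2
n_pieces = 4
knots = np.linspace(0.0, 1.0, n_pieces + 1)
g, coeffs = relu_interpolant(f, knots)

# Estimate the uniform error on a dense grid over [0, 1].
xs = np.linspace(0.0, 1.0, 10001)
max_error = float(np.max(np.abs(g(xs) - f(xs))))
```

For x^2 with uniform knots of spacing h, the chord on each piece overshoots by at most h^2/4, so with four pieces `max_error` is approximately 1/64. Moving the knots or shifting the pieces can reduce the error further, which is the kind of question the abstract addresses.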

Adversarial Analysis of Natural Language Inference Systems


Title	Adversarial Analysis of Natural Language Inference Systems
Authors	Tiffany Chien, Jugal Kalita
Abstract	The release of large natural language inference (NLI) datasets like SNLI and MNLI has led to rapid development and improvement of completely neural systems for the task. Most recently, heavily pre-trained, Transformer-based models like BERT and MT-DNN have reached near-human performance on these datasets. However, these standard datasets have been shown to contain many annotation artifacts, allowing models to shortcut understanding using simple fallible heuristics, and still perform well on the test set. So it is no surprise that many adversarial (challenge) datasets have been created that cause models trained on standard datasets to fail dramatically. Although extra training on this data generally improves model performance on just that type of data, transferring that learning to unseen examples is still partial at best. This work evaluates the failures of state-of-the-art models on existing adversarial datasets that test different linguistic phenomena, and finds that even though the models perform similarly on MNLI, they differ greatly in their robustness to these attacks. In particular, we find syntax-related attacks to be particularly effective across all models, so we provide a fine-grained analysis and comparison of model performance on those examples. We draw conclusions about the value of model size and multi-task learning (beyond comparing their standard test set performance), and provide suggestions for more effective training data.
Tasks	Multi-Task Learning, Natural Language Inference
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03441v1
PDF	https://arxiv.org/pdf/1912.03441v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-analysis-of-natural-language
Repo
Framework

Inferring and Improving Street Maps with Data-Driven Automation


Title	Inferring and Improving Street Maps with Data-Driven Automation
Authors	Favyen Bastani, Songtao He, Satvat Jagwani, Edward Park, Sofiane Abbar, Mohammad Alizadeh, Hari Balakrishnan, Sanjay Chawla, Sam Madden, Mohammad Amin Sadeghi
Abstract	Street maps are a crucial data source that help to inform a wide range of decisions, from navigating a city to disaster relief and urban planning. However, in many parts of the world, street maps are incomplete or lag behind new construction. Editing maps today involves a tedious process of manually tracing and annotating roads, buildings, and other map features. Over the past decade, many automatic map inference systems have been proposed to automatically extract street map data from satellite imagery, aerial imagery, and GPS trajectory datasets. However, automatic map inference has failed to gain traction in practice due to two key limitations: high error rates (low precision), which manifest in noisy inference outputs, and a lack of end-to-end system design to leverage inferred data to update existing street maps. At MIT and QCRI, we have developed a number of algorithms and approaches to address these challenges, which we combined into a new system we call Mapster. Mapster is a human-in-the-loop street map editing system that incorporates three components to robustly accelerate the mapping process over traditional tools and workflows: high-precision automatic map inference, data refinement, and machine-assisted map editing. Through an evaluation on a large-scale dataset including satellite imagery, GPS trajectories, and ground-truth map data in forty cities, we show that Mapster makes automation practical for map editing, and enables the curation of map datasets that are more complete and up-to-date at less cost.
Tasks
Published	2019-10-02
URL	https://arxiv.org/abs/1910.04869v2
PDF	https://arxiv.org/pdf/1910.04869v2.pdf
PWC	https://paperswithcode.com/paper/inferring-and-improving-street-maps-with-data
Repo
Framework

Weakly-Supervised Completion Moment Detection using Temporal Attention


Title	Weakly-Supervised Completion Moment Detection using Temporal Attention
Authors	Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen
Abstract	Monitoring the progression of an action towards completion offers fine-grained insight into the actor’s behaviour. In this work, we target detecting the completion moment of actions, that is, the moment when the action’s goal has been successfully accomplished. This has potential applications from surveillance to assistive living and human-robot interactions. Previous efforts required human annotations of the completion moment for training (i.e. full supervision). In this work, we present an approach for moment detection from weak video-level labels. Given both complete and incomplete sequences of the same action, we learn temporal attention, along with accumulated completion prediction from all frames in the sequence. We also demonstrate how the approach can be used when completion moment supervision is available. We evaluate and compare our approach on actions from three datasets, namely HMDB, UCF101 and RGBD-AC, and show that temporal attention improves detection in both weakly-supervised and fully-supervised settings.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.09920v1
PDF	https://arxiv.org/pdf/1910.09920v1.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-completion-moment-detection
Repo
Framework
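
One way to picture "temporal attention along with accumulated prediction from all frames" is attention-weighted pooling of per-frame votes. The sketch below is a loose illustration under stated assumptions: each frame is assumed to cast a vote for the completion frame index, the attention scores are assumed to come from some learned model that is not shown, and the name `attended_moment` is hypothetical. It does not reproduce the paper's architecture or training procedure.

```python
import numpy as np

def softmax(scores):
    # Subtract the maximum for numerical stability.
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

def attended_moment(frame_votes, attention_scores):
    """Combine per-frame completion-moment votes using temporal attention.

    frame_votes:      each frame's (hypothetical) vote for the completion frame.
    attention_scores: unnormalised relevance score per frame (assumed learned).
    Returns the attention-weighted vote and the normalised weights.
    """
    votes = np.asarray(frame_votes, dtype=float)
    weights = softmax(np.asarray(attention_scores, dtype=float))
    return float(weights @ votes), weights

# Toy sequence of 8 frames with made-up votes.
frame_votes = [6, 5, 5, 4, 5, 5, 5, 9]

# Equal scores reduce to a plain average of the votes.
uniform_estimate, uniform_weights = attended_moment(frame_votes, [0.0] * 8)

# A large score on the last frame makes its vote dominate.
peaked_estimate, _ = attended_moment(frame_votes, [0.0] * 7 + [10.0])
```

With equal scores the estimate is simply the mean vote; concentrating attention on a single frame pulls the estimate towards that frame's vote.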

Neural Networks on Groups


Title	Neural Networks on Groups
Authors	Stella Rose Biderman
Abstract	Although neural networks are typically used to approximate functions defined over $\mathbb{R}^n$, the successes of graph neural networks, point-cloud neural networks, and manifold deep learning, among other methods, have demonstrated the clear value of leveraging neural networks to approximate functions defined over more general spaces. The theory of neural networks has not kept up, however, and the relevant theoretical results (when they exist at all) have been proven on a case-by-case basis without a general theory or connection to classical work. The process of deriving new theoretical backing for each new type of network has become a bottleneck to understanding and validating new approaches. In this paper we extend the definition of neural networks to general topological groups and prove that neural networks with a single hidden layer and a bounded non-constant activation function can approximate any $\mathcal{L}^p$ function defined over any locally compact Abelian group. This framework and universal approximation theorem encompass all of the aforementioned contexts. We also derive important corollaries and extensions with minor modification, including the case for approximating continuous functions on a compact subset, neural networks with ReLU activation functions on a linearly bi-ordered group, and neural networks with affine transformations on a vector space. Our work obtains as special cases the recent theorems of Qi et al. [2017], Sennai et al. [2019], Keriven and Peyre [2019], and Maron et al. [2019].
Tasks
Published	2019-06-13
URL	https://arxiv.org/abs/1907.03742v2
PDF	https://arxiv.org/pdf/1907.03742v2.pdf
PWC	https://paperswithcode.com/paper/neural-networks-on-groups
Repo
Framework

Structured agents for physical construction


Title	Structured agents for physical construction
Authors	Victor Bapst, Alvaro Sanchez-Gonzalez, Carl Doersch, Kimberly L. Stachenfeld, Pushmeet Kohli, Peter W. Battaglia, Jessica B. Hamrick
Abstract	Physical construction—the ability to compose objects, subject to physical dynamics, to serve some function—is fundamental to human intelligence. We introduce a suite of challenging physical construction tasks inspired by how children play with blocks, such as matching a target configuration, stacking blocks to connect objects together, and creating shelter-like structures over target objects. We examine how a range of deep reinforcement learning agents fare on these challenges, and introduce several new approaches which provide superior performance. Our results show that agents which use structured representations (e.g., objects and scene graphs) and structured policies (e.g., object-centric actions) outperform those which use less structured representations, and generalize better beyond their training when asked to reason about larger scenes. Model-based agents which use Monte-Carlo Tree Search also outperform strictly model-free agents in our most challenging construction problems. We conclude that approaches which combine structured representations and reasoning with powerful learning are a key path toward agents that possess rich intuitive physics, scene understanding, and planning.
Tasks	Scene Understanding
Published	2019-04-05
URL	https://arxiv.org/abs/1904.03177v2
PDF	https://arxiv.org/pdf/1904.03177v2.pdf
PWC	https://paperswithcode.com/paper/structured-agents-for-physical-construction
Repo
Framework

Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation


Title	Road Scene Understanding by Occupancy Grid Learning from Sparse Radar Clusters using Semantic Segmentation
Authors	Liat Sless, Gilad Cohen, Bat El Shlomo, Shaul Oron
Abstract	Occupancy grid mapping is an important component in road scene understanding for autonomous driving. It encapsulates information about the drivable area and road obstacles, and enables safe autonomous driving. Radars are an emerging sensor in autonomous vehicle vision, becoming more widely used due to their long range sensing, low cost, and robustness to severe weather conditions. Despite recent advances in deep learning technology, occupancy grid mapping from radar data is still mostly done using classical filtering approaches. In this work, we propose learning the inverse sensor model used for occupancy grid mapping from clustered radar data. This is done in a data-driven approach that leverages computer vision techniques. This task is very challenging due to data sparsity and noise characteristics of the radar sensor. The problem is formulated as a semantic segmentation task and we show how it can be learned using lidar data for generating ground truth. We show both qualitatively and quantitatively that our learned occupancy net outperforms classic methods by a large margin using the recently released NuScenes real-world driving data.
Tasks	Autonomous Driving, Scene Understanding, Semantic Segmentation
Published	2019-03-31
URL	https://arxiv.org/abs/1904.00415v2
PDF	https://arxiv.org/pdf/1904.00415v2.pdf
PWC	https://paperswithcode.com/paper/self-supervised-occupancy-grid-learning-from
Repo
Framework

Deep Learning in a Computational Model for Conceptual Shifts in a Co-Creative Design System


Title	Deep Learning in a Computational Model for Conceptual Shifts in a Co-Creative Design System
Authors	Pegah Karimi, Mary Lou Maher, Nicholas Davis, Kazjon Grace
Abstract	This paper presents a computational model for conceptual shifts, based on a novelty metric applied to a vector representation generated through deep learning. This model is integrated into a co-creative design system, which enables a partnership between an AI agent and a human designer interacting through a sketching canvas. The AI agent responds to the human designer’s sketch with a new sketch that is a conceptual shift: intentionally varying the visual and conceptual similarity with increasingly more novelty. The paper presents the results of a user study showing that increasing novelty in the AI contribution is associated with higher creative outcomes, whereas low novelty leads to less creative outcomes.
Tasks
Published	2019-06-24
URL	https://arxiv.org/abs/1906.10188v1
PDF	https://arxiv.org/pdf/1906.10188v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-in-a-computational-model-for
Repo
Framework

Detecting Adversarial Attacks On Audio-Visual Speech Recognition


Title	Detecting Adversarial Attacks On Audio-Visual Speech Recognition
Authors	Pingchuan Ma, Stavros Petridis, Maja Pantic
Abstract	Adversarial attacks pose a threat to deep learning models. However, research on adversarial detection methods, especially in the multi-modal domain, is very limited. In this work, we propose an efficient and straightforward detection method based on the temporal correlation between audio and video streams. The main idea is that the correlation between audio and video in adversarial examples will be lower than in benign examples due to added adversarial noise. We use the synchronisation confidence score as a proxy for audio-visual correlation and based on it we can detect adversarial attacks. To the best of our knowledge, this is the first work on detection of adversarial attacks on audio-visual speech recognition models. We apply recent adversarial attacks on two audio-visual speech recognition models trained on the GRID and LRW datasets. The experimental results demonstrate that the proposed approach is an effective way of detecting such attacks.
Tasks	Audio-Visual Speech Recognition, Speech Recognition, Visual Speech Recognition
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08639v1
PDF	https://arxiv.org/pdf/1912.08639v1.pdf
PWC	https://paperswithcode.com/paper/detecting-adversarial-attacks-on-audio-visual
Repo
Framework
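
The detection idea in the abstract (a lower audio-visual synchronisation confidence suggests an adversarial example) amounts to thresholding a score. The sketch below shows one plausible way to set and apply such a threshold. Everything specific here is an assumption: the scores are made-up numbers, the synchronisation model that would produce them is not shown, and calibrating the threshold as a quantile of benign scores is an illustrative choice, not necessarily the authors' procedure.

```python
import numpy as np

def pick_threshold(benign_scores, false_positive_rate=0.05):
    """Choose a threshold so that roughly the given fraction of benign
    examples falls below it (and would be wrongly flagged)."""
    return float(np.quantile(np.asarray(benign_scores, dtype=float),
                             false_positive_rate))

def is_adversarial(sync_confidence, threshold):
    """Flag an example when its synchronisation confidence is below the threshold."""
    return sync_confidence < threshold

# Made-up synchronisation confidence scores for illustration only.
benign_scores = [8.0, 7.5, 9.1, 8.4, 7.9]
adversarial_scores = [3.2, 4.1, 2.8]

threshold = pick_threshold(benign_scores, false_positive_rate=0.05)
adversarial_flags = [is_adversarial(s, threshold) for s in adversarial_scores]
benign_flags = [is_adversarial(s, threshold) for s in benign_scores]
```

Lowering `false_positive_rate` makes the detector more conservative about flagging benign inputs, at the cost of missing adversarial examples whose scores are only slightly reduced.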

Improved algorithm for neuronal ensemble inference by Monte Carlo method


Title	Improved algorithm for neuronal ensemble inference by Monte Carlo method
Authors	Shun Kimura, Koujin Takeda
Abstract	Neuronal ensemble inference is one of the significant problems in the study of biological neural networks. Various methods have been proposed for ensemble inference from experimentally recorded activity data. Here we focus on the Bayesian inference approach for ensembles with a generative model, which was proposed in recent work. However, this method has a large computational cost, and the result sometimes gets stuck in a bad local maximum of the Bayesian inference. In this work, we give an improved Bayesian inference algorithm that addresses these problems. We modify the ensemble generation rule in the Markov chain Monte Carlo method, and introduce the idea of simulated annealing for hyperparameter control. We also compare the performance of ensemble inference between our algorithm and the original one.
Tasks	Bayesian Inference
Published	2019-11-15
URL	https://arxiv.org/abs/1911.06509v1
PDF	https://arxiv.org/pdf/1911.06509v1.pdf
PWC	https://paperswithcode.com/paper/improved-algorithm-for-neuronal-ensemble
Repo
Framework

Collage Inference: Achieving low tail latency during distributed image classification using coded redundancy models


Title	Collage Inference: Achieving low tail latency during distributed image classification using coded redundancy models
Authors	Krishna Narra, Zhifeng Lin, Ganesh Ananthanarayanan, Salman Avestimehr, Murali Annavaram
Abstract	Reducing the latency variance in machine learning inference is a key requirement in many applications. Variance is harder to control in a cloud deployment in the presence of stragglers. In spite of this challenge, inference is increasingly being done in the cloud, due to the advent of affordable machine learning as a service (MLaaS) platforms. Existing approaches to reduce variance rely on replication which is expensive and partially negates the affordability of MLaaS. In this work, we argue that MLaaS platforms also provide unique opportunities to cut the cost of redundancy. In MLaaS platforms, multiple inference requests are concurrently received by a load balancer which can then create a more cost-efficient redundancy coding across a larger collection of images. We propose a novel convolutional neural network model, Collage-CNN, to provide a low-cost redundancy framework. A Collage-CNN model takes a collage formed by combining multiple images and performs multi-image classification in one shot, albeit at slightly lower accuracy. We then augment a collection of traditional single image classifiers with a single Collage-CNN classifier which acts as a low-cost redundant backup. Collage-CNN then provides backup classification results if a single image classification straggles. Deploying the Collage-CNN models in the cloud, we demonstrate that the 99th percentile tail latency of inference can be reduced by 1.47X compared to replication based approaches while providing high accuracy. Also, variation in inference latency can be reduced by 9X with a slight increase in average inference latency.
Tasks	Image Classification
Published	2019-06-05
URL	https://arxiv.org/abs/1906.03999v1
PDF	https://arxiv.org/pdf/1906.03999v1.pdf
PWC	https://paperswithcode.com/paper/collage-inference-achieving-low-tail-latency
Repo
Framework
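
The fallback logic described in the abstract (use the collage model's result when a single-image classification straggles) can be sketched in a few lines. This is an illustrative sketch only: the deadline, the latencies, the labels, and the function name `resolve_predictions` are hypothetical, and the real system's request routing and collage decoding are not modelled.

```python
def resolve_predictions(single_results, collage_predictions, deadline_ms):
    """Pick a final label per image.

    single_results:      list of (latency_ms, label) from the single-image models.
    collage_predictions: list of backup labels decoded from one collage model
                         output, in the same image order.
    deadline_ms:         latency beyond which a single-image result counts as
                         a straggler.
    Returns a list of (label, source), where source is "single" or "collage".
    """
    final = []
    for (latency_ms, label), backup in zip(single_results, collage_predictions):
        if latency_ms <= deadline_ms:
            final.append((label, "single"))
        else:
            # Straggler: fall back to the collage model's prediction.
            final.append((backup, "collage"))
    return final

# Made-up batch of four requests; the third single-image classifier straggles.
single_results = [(40, "cat"), (45, "dog"), (400, "car"), (42, "tree")]
collage_predictions = ["cat", "dog", "truck", "tree"]
final = resolve_predictions(single_results, collage_predictions, deadline_ms=100)
```

In this toy batch only the third request exceeds the deadline, so its label comes from the collage backup. That backup may be slightly less accurate, which is the trade-off the abstract mentions.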

Verification of Very Low-Resolution Faces Using An Identity-Preserving Deep Face Super-Resolution Network


Title	Verification of Very Low-Resolution Faces Using An Identity-Preserving Deep Face Super-Resolution Network
Authors	Esra Ataer-Cansizoglu, Michael Jones, Ziming Zhang, Alan Sullivan
Abstract	Face super-resolution methods usually aim at producing visually appealing results rather than preserving distinctive features for further face identification. In this work, we propose a deep learning method for face verification on very low-resolution face images that involves identity-preserving face super-resolution. Our framework includes a super-resolution network and a feature extraction network. We train a VGG-based deep face recognition network (Parkhi et al. 2015) to be used as a feature extractor. Our super-resolution network is trained to minimize the feature distance between the high-resolution ground truth image and the super-resolved image, where features are extracted using our pre-trained feature extraction network. We carry out experiments on FRGC, Multi-PIE, LFW-a, and MegaFace datasets to evaluate our method in controlled and uncontrolled settings. The results show that the presented method outperforms conventional super-resolution methods in low-resolution face verification.
Tasks	Face Identification, Face Recognition, Face Verification, Super-Resolution
Published	2019-03-26
URL	http://arxiv.org/abs/1903.10974v1
PDF	http://arxiv.org/pdf/1903.10974v1.pdf
PWC	https://paperswithcode.com/paper/verification-of-very-low-resolution-faces
Repo
Framework
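
The training objective described in the abstract (minimise the distance between features of the super-resolved image and features of the high-resolution ground truth) can be sketched as follows. The sketch makes several assumptions that are not in the abstract: a fixed random linear map stands in for the frozen, pre-trained face recognition network, the distance is taken to be squared Euclidean, and the images are small random arrays. The names `extract_features` and `identity_loss` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pre-trained feature extractor (assumption): a fixed
# linear projection from a flattened 8x8 image to a 16-D feature vector.
PROJECTION = rng.standard_normal((16, 64))

def extract_features(image):
    return PROJECTION @ np.asarray(image, dtype=float).ravel()

def identity_loss(super_resolved, ground_truth):
    """Squared Euclidean distance between the two images' feature vectors."""
    diff = extract_features(super_resolved) - extract_features(ground_truth)
    return float(diff @ diff)

# Toy "images": a ground truth plus two reconstructions of different quality.
ground_truth = rng.random((8, 8))
good_sr = ground_truth + 0.01 * rng.standard_normal((8, 8))
poor_sr = ground_truth + 0.5 * rng.standard_normal((8, 8))

loss_good = identity_loss(good_sr, ground_truth)
loss_poor = identity_loss(poor_sr, ground_truth)
```

Because the loss is computed in feature space rather than pixel space, a reconstruction is penalised for changing what the recognition network sees, not for small pixel-level differences as such. In a real setup the gradient of this loss would flow back into the super-resolution network while the feature extractor stays fixed.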

Conversational Help for Task Completion and Feature Discovery in Personal Assistants


Title	Conversational Help for Task Completion and Feature Discovery in Personal Assistants
Authors	Madan Gopal Jhawar, Vipindeep Vangala, Nishchay Sharma, Ankur Hayatnagarkar, Mansi Saxena, Swati Valecha
Abstract	Intelligent Personal Assistants (IPAs) have become widely popular in recent times. Most of the commercial IPAs today support a wide range of skills including Alarms, Reminders, Weather Updates, Music, News, Factual Question Answering, etc. The list grows every day, making it difficult to remember the command structures needed to execute various tasks. An IPA must have the ability to communicate information about supported skills and direct users towards the right commands needed to execute them. Users interact with personal assistants in natural language. A query is defined to be a Help Query if it seeks information about a personal assistant’s capabilities, or asks for instructions to execute a task. In this paper, we propose an interactive system which identifies help queries and retrieves appropriate responses. Our system comprises a C-BiLSTM based classifier, which is a fusion of Convolutional Neural Networks (CNN) and Bidirectional LSTM (BiLSTM) architectures, to detect help queries, and a semantic Approximate Nearest Neighbours (ANN) module to map the query to an appropriate predefined response. Evaluation of our system on real-world queries from a commercial IPA and a detailed comparison with popular traditional machine learning and deep learning based models reveal that our system outperforms other approaches and returns relevant responses for help queries.
Tasks
Published	2019-07-16
URL	https://arxiv.org/abs/1907.07564v1
PDF	https://arxiv.org/pdf/1907.07564v1.pdf
PWC	https://paperswithcode.com/paper/conversational-help-for-task-completion-and
Repo
Framework