Paper Group ANR 207
Bayesian selection for the l2-Potts model regularization parameter: 1D piecewise constant signal denoising
Title | Bayesian selection for the l2-Potts model regularization parameter: 1D piecewise constant signal denoising |
Authors | Jordan Frecon, Nelly Pustelnik, Nicolas Dobigeon, Herwig Wendt, Patrice Abry |
Abstract | Piecewise constant denoising can be solved either by deterministic optimization approaches, based on the Potts model, or by stochastic Bayesian procedures. The former lead to low computational time but require the selection of a regularization parameter, whose value significantly impacts the achieved solution, and whose automated selection remains an involved and challenging problem. Conversely, fully Bayesian formalisms encapsulate the regularization parameter selection into hierarchical models, at the price of high computational costs. This contribution proposes an operational strategy that combines hierarchical Bayesian and Potts model formulations, with the double aim of automatically tuning the regularization parameter and of maintaining computational efficiency. The proposed procedure relies on formally connecting a Bayesian framework to a l2-Potts functional. Behaviors and performance for the proposed piecewise constant denoising and regularization parameter tuning techniques are studied qualitatively and assessed quantitatively, and shown to compare favorably against those of a fully Bayesian hierarchical procedure, both in accuracy and in computational load. |
Tasks | Denoising |
Published | 2016-08-27 |
URL | http://arxiv.org/abs/1608.07739v2 |
http://arxiv.org/pdf/1608.07739v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-selection-for-the-l2-potts-model |
Repo | |
Framework | |
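The ℓ2-Potts functional mentioned in the abstract penalizes a squared-error data fit plus a regularization parameter λ times the number of jumps in the estimate. As an illustrative sketch only (the classical O(n²) dynamic program for the 1D ℓ2-Potts problem, not the authors' Bayesian parameter-selection procedure; function names are assumptions):

```python
import numpy as np

def l2_potts_dp(y, lam):
    """Exact minimizer of sum_i (y_i - x_i)^2 + lam * (number of jumps in x),
    via the classical O(n^2) dynamic program over segment boundaries."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give O(1) segment cost: cost(l, r) = sum over [l, r] of (y_i - mean)^2.
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y * y)))

    def seg_cost(l, r):  # inclusive 0-based indices l..r
        m = r - l + 1
        tot = s1[r + 1] - s1[l]
        return (s2[r + 1] - s2[l]) - tot * tot / m

    best = np.full(n + 1, np.inf)  # best[k] = optimal energy of y[:k]
    best[0] = -lam                 # cancels the lam charged for the first segment
    last = np.zeros(n + 1, dtype=int)
    for r in range(1, n + 1):
        for l in range(r):
            c = best[l] + lam + seg_cost(l, r - 1)
            if c < best[r]:
                best[r], last[r] = c, l
    # Backtrack and fill each segment with its mean.
    x = np.empty(n)
    r = n
    while r > 0:
        l = last[r]
        x[l:r] = y[l:r].mean()
        r = l
    return x
```

With a small λ the solver keeps the true jump; with a very large λ every jump is too expensive and the output collapses to the global mean, which is exactly why the choice of λ studied in the paper matters.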
Hierarchical learning for DNN-based acoustic scene classification
Title | Hierarchical learning for DNN-based acoustic scene classification |
Authors | Yong Xu, Qiang Huang, Wenwu Wang, Mark D. Plumbley |
Abstract | In this paper, we present a deep neural network (DNN)-based acoustic scene classification framework. Two hierarchical learning methods are proposed to improve the DNN baseline performance by incorporating the hierarchical taxonomy information of environmental sounds. Firstly, the parameters of the DNN are initialized by the proposed hierarchical pre-training. A multi-level objective function is then adopted to add more constraints to the cross-entropy based loss function. A series of experiments was conducted on Task 1 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 challenge. The final DNN-based system achieved a 22.9% relative improvement on average scene classification error as compared with the Gaussian Mixture Model (GMM)-based benchmark system across four standard folds. |
Tasks | Acoustic Scene Classification, Scene Classification |
Published | 2016-07-13 |
URL | http://arxiv.org/abs/1607.03682v3 |
http://arxiv.org/pdf/1607.03682v3.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-learning-for-dnn-based-acoustic |
Repo | |
Framework | |
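The multi-level objective described in the abstract can be sketched as a fine-level cross-entropy plus a coarse-level term obtained by summing fine-class probabilities within each taxonomy group. The grouping and the weight `alpha` below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def multi_level_loss(logits, fine_label, groups, alpha=0.5):
    """Fine-level cross-entropy plus a coarse-level cross-entropy, where the
    coarse probability sums the fine-class probabilities in the taxonomy group.
    `groups` maps coarse id -> list of fine ids (illustrative taxonomy)."""
    p = np.exp(logits - logits.max())   # numerically stable softmax
    p /= p.sum()
    fine_ce = -np.log(p[fine_label])
    coarse_label = next(c for c, fs in groups.items() if fine_label in fs)
    coarse_p = sum(p[f] for f in groups[coarse_label])
    coarse_ce = -np.log(coarse_p)
    return fine_ce + alpha * coarse_ce
```

Because the coarse probability aggregates several fine classes, confusions within a taxonomy group are penalized less than confusions across groups, which is the constraint the multi-level objective adds.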
Demographic Dialectal Variation in Social Media: A Case Study of African-American English
Title | Demographic Dialectal Variation in Social Media: A Case Study of African-American English |
Authors | Su Lin Blodgett, Lisa Green, Brendan O’Connor |
Abstract | Though dialectal language is increasingly abundant on social media, few resources exist for developing NLP tools to handle such language. We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter. We propose a distantly supervised model to identify AAE-like language from demographics associated with geo-located messages, and we verify that this language follows well-known AAE linguistic phenomena. In addition, we analyze the quality of existing language identification and dependency parsing tools on AAE-like text, demonstrating that they perform poorly on such text compared to text associated with white speakers. We also provide an ensemble classifier for language identification which eliminates this disparity and release a new corpus of tweets containing AAE-like language. |
Tasks | Dependency Parsing, Language Identification |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08868v1 |
http://arxiv.org/pdf/1608.08868v1.pdf | |
PWC | https://paperswithcode.com/paper/demographic-dialectal-variation-in-social |
Repo | |
Framework | |
Robustness of classifiers: from adversarial to random noise
Title | Robustness of classifiers: from adversarial to random noise |
Authors | Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard |
Abstract | Several recent works have shown that state-of-the-art classifiers are vulnerable to worst-case (i.e., adversarial) perturbations of the datapoints. On the other hand, it has been empirically observed that these same classifiers are relatively robust to random noise. In this paper, we propose to study a \textit{semi-random} noise regime that generalizes both the random and worst-case noise regimes. We propose the first quantitative analysis of the robustness of nonlinear classifiers in this general noise regime. We establish precise theoretical bounds on the robustness of classifiers in this general regime, which depend on the curvature of the classifier’s decision boundary. Our bounds confirm and quantify the empirical observations that classifiers satisfying curvature constraints are robust to random noise. Moreover, we quantify the robustness of classifiers in terms of the subspace dimension in the semi-random noise regime, and show that our bounds remarkably interpolate between the worst-case and random noise regimes. We perform experiments and show that the derived bounds provide very accurate estimates when applied to various state-of-the-art deep neural networks and datasets. This result suggests bounds on the curvature of the classifiers’ decision boundaries that we support experimentally, and more generally offers important insights into the geometry of high dimensional classification problems. |
Tasks | |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08967v1 |
http://arxiv.org/pdf/1608.08967v1.pdf | |
PWC | https://paperswithcode.com/paper/robustness-of-classifiers-from-adversarial-to |
Repo | |
Framework | |
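For intuition about the gap between the two regimes the abstract interpolates between, consider a linear classifier: the worst-case perturbation moves straight along the weight vector w, while a random direction only has a ~1/√d component along w, so the required random-noise magnitude is on the order of √d times larger. This toy numerical check (a hedged illustration, not the paper's analysis of nonlinear classifiers) shows the scaling:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000
w = rng.normal(size=d)
w /= np.linalg.norm(w)
x = rng.normal(size=d)
margin = abs(w @ x + 0.1)   # distance to the boundary of f(x) = w.x + b, with b = 0.1

r_adv = margin              # ||w|| = 1, so worst-case (adversarial) robustness = margin
# Robustness along a random unit direction v: margin / |component of v along w|.
vals = []
for _ in range(200):
    v = rng.normal(size=d)
    v /= np.linalg.norm(v)
    vals.append(margin / abs(v @ w))
r_rand = np.median(vals)
ratio = r_rand / r_adv      # on the order of sqrt(d), here sqrt(1000) ~ 31.6
```

The median random-direction robustness exceeds the adversarial robustness by roughly a √d factor, matching the empirical observation that classifiers look robust to random noise yet fragile to worst-case perturbations.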
DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data
Title | DROW: Real-Time Deep Learning based Wheelchair Detection in 2D Range Data |
Authors | Lucas Beyer, Alexander Hermans, Bastian Leibe |
Abstract | We introduce the DROW detector, a deep learning based detector for 2D range data. Laser scanners are lighting invariant, provide accurate range data, and typically cover a large field of view, making them interesting sensors for robotics applications. So far, research on detection in laser range data has been dominated by hand-crafted features and boosted classifiers, potentially losing performance due to suboptimal design choices. We propose a Convolutional Neural Network (CNN) based detector for this task. We show how to effectively apply CNNs for detection in 2D range data, and propose a depth preprocessing step and voting scheme that significantly improve CNN performance. We demonstrate our approach on wheelchairs and walkers, obtaining state-of-the-art detection results. Apart from the training data, none of our design choices limits the detector to these two classes, though. We provide a ROS node for our detector and release our dataset containing 464k laser scans, out of which 24k were annotated. |
Tasks | |
Published | 2016-03-08 |
URL | http://arxiv.org/abs/1603.02636v2 |
http://arxiv.org/pdf/1603.02636v2.pdf | |
PWC | https://paperswithcode.com/paper/drow-real-time-deep-learning-based-wheelchair |
Repo | |
Framework | |
Numerical Attribute Extraction from Clinical Texts
Title | Numerical Attribute Extraction from Clinical Texts |
Authors | Sarath P R, Sunil Mandhan, Yoshiki Niwa |
Abstract | This paper describes an information extraction system that extends the system developed by team Hitachi for the “Disease/Disorder Template filling” task organized by the ShARe/CLEF eHealth Evaluation Lab 2014. In this extension module we focus on the extraction of numerical attributes and values from discharge summary records and on associating the correct relations between attributes and values. We solve the problem in two steps. The first step is extraction of numerical attributes and values, developed as a Named Entity Recognition (NER) model using Stanford NLP libraries. The second step is correctly associating the attributes with values, developed as a relation extraction module in the Apache cTAKES framework. We integrated the Stanford NER model as a cTAKES pipeline component and used it in the relation extraction module. The Conditional Random Field (CRF) algorithm is used for NER and Support Vector Machines (SVM) for relation extraction. For attribute-value relation extraction, we observe 95% accuracy using NER alone and a combined accuracy of 87% with NER and SVM. |
Tasks | Named Entity Recognition, Relation Extraction |
Published | 2016-01-31 |
URL | http://arxiv.org/abs/1602.00269v1 |
http://arxiv.org/pdf/1602.00269v1.pdf | |
PWC | https://paperswithcode.com/paper/numerical-atrribute-extraction-from-clinical |
Repo | |
Framework | |
Understanding User Instructions by Utilizing Open Knowledge for Service Robots
Title | Understanding User Instructions by Utilizing Open Knowledge for Service Robots |
Authors | Dongcai Lu, Feng Wu, Xiaoping Chen |
Abstract | Understanding user instructions in natural language is an active research topic in AI and robotics. Typically, natural user instructions are high-level and can be reduced into low-level tasks expressed in common verbs (e.g., 'take', 'get', 'put'). For robots to understand such instructions, one of the key challenges is to process high-level user instructions and achieve the specified tasks with robots’ primitive actions. To address this, we propose novel algorithms by utilizing semantic roles of common verbs defined in semantic dictionaries and integrating multiple open knowledge sources to generate task plans. Specifically, we present a new method for matching and recovering semantics of user instructions and a novel task planner that exploits functional knowledge of the robot’s action model. To verify and evaluate our approach, we implemented a prototype system using knowledge from several open resources. Experiments on our system confirmed the correctness and efficiency of our algorithms. Notably, our system has been deployed in the KeJia robot, which participated in the annual RoboCup@Home competitions in the past three years and achieved encouragingly high scores in the benchmark tests. |
Tasks | |
Published | 2016-06-09 |
URL | http://arxiv.org/abs/1606.02877v1 |
http://arxiv.org/pdf/1606.02877v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-user-instructions-by-utilizing |
Repo | |
Framework | |
Recurrent Neural Networks for Dialogue State Tracking
Title | Recurrent Neural Networks for Dialogue State Tracking |
Authors | Ondřej Plátek, Petr Bělohlávek, Vojtěch Hudeček, Filip Jurčíček |
Abstract | This paper discusses models for dialogue state tracking using recurrent neural networks (RNN). We present experiments on the standard dialogue state tracking (DST) dataset, DSTC2. On the one hand, RNN models became the state-of-the-art models in DST; on the other hand, most state-of-the-art models are only turn-based and require dataset-specific preprocessing (e.g. DSTC2-specific) in order to achieve such results. We implemented two architectures which can be used in incremental settings and require almost no preprocessing. We compare their performance to the benchmarks on DSTC2 and discuss their properties. With only trivial preprocessing, the performance of our models is close to the state-of-the-art results. |
Tasks | Dialogue State Tracking |
Published | 2016-06-28 |
URL | http://arxiv.org/abs/1606.08733v2 |
http://arxiv.org/pdf/1606.08733v2.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-dialogue-state |
Repo | |
Framework | |
Automatic 3D liver location and segmentation via convolutional neural networks and graph cut
Title | Automatic 3D liver location and segmentation via convolutional neural networks and graph cut |
Authors | Fang Lu, Fa Wu, Peijun Hu, Zhiyi Peng, Dexing Kong |
Abstract | Purpose: Segmentation of the liver from abdominal computed tomography (CT) images is an essential step in some computer assisted clinical interventions, such as surgery planning for living donor liver transplant (LDLT), radiotherapy and volume measurement. In this work, we develop a deep learning algorithm with graph cut refinement to automatically segment the liver in CT scans. Methods: The proposed method consists of two main steps: (i) simultaneous liver detection and probabilistic segmentation using 3D convolutional neural networks (CNNs); (ii) accuracy refinement of the initial segmentation with graph cut and the previously learned probability map. Results: The proposed approach was validated on forty CT volumes taken from two public databases, MICCAI-Sliver07 and 3Dircadb. For the MICCAI-Sliver07 test set, the calculated mean values of volumetric overlap error (VOE), relative volume difference (RVD), average symmetric surface distance (ASD), root mean square symmetric surface distance (RMSD) and maximum symmetric surface distance (MSD) are 5.9%, 2.7%, 0.91 mm, 1.88 mm, and 18.94 mm, respectively. For the 20 3Dircadb volumes, the calculated mean values of VOE, RVD, ASD, RMSD and MSD are 9.36%, 0.97%, 1.89 mm, 4.15 mm and 33.14 mm, respectively. Conclusion: The proposed method is fully automatic, without any user interaction. Quantitative results reveal that the proposed approach is efficient and accurate for hepatic volume estimation in a clinical setup. The high correlation between the automatic and manual references shows that the proposed method can be good enough to replace the time-consuming and non-reproducible manual segmentation method. |
Tasks | Computed Tomography (CT) |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.03012v1 |
http://arxiv.org/pdf/1605.03012v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-3d-liver-location-and-segmentation |
Repo | |
Framework | |
Why Deep Neural Networks for Function Approximation?
Title | Why Deep Neural Networks for Function Approximation? |
Authors | Shiyu Liang, R. Srikant |
Abstract | Recently there has been much interest in understanding why deep neural networks are preferred to shallow networks. We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, we consider univariate functions on a bounded interval and require a neural network to achieve an approximation error of $\varepsilon$ uniformly over the interval. We show that shallow networks (i.e., networks whose depth does not depend on $\varepsilon$) require $\Omega(\text{poly}(1/\varepsilon))$ neurons while deep networks (i.e., networks whose depth grows with $1/\varepsilon$) require $\mathcal{O}(\text{polylog}(1/\varepsilon))$ neurons. We then extend these results to certain classes of important multivariate functions. Our results are derived for neural networks which use a combination of rectifier linear units (ReLUs) and binary step units, two of the most popular types of activation functions. Our analysis builds on a simple observation: the multiplication of two bits can be represented by a ReLU. |
Tasks | |
Published | 2016-10-13 |
URL | http://arxiv.org/abs/1610.04161v2 |
http://arxiv.org/pdf/1610.04161v2.pdf | |
PWC | https://paperswithcode.com/paper/why-deep-neural-networks-for-function |
Repo | |
Framework | |
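The closing observation of the abstract, that the product of two bits can be represented by a single ReLU, is easy to verify directly: for x, y in {0, 1}, x*y = ReLU(x + y - 1):

```python
def relu(z):
    """Rectified linear unit."""
    return max(z, 0.0)

def bit_product(x, y):
    """For x, y in {0, 1}: x * y == ReLU(x + y - 1)."""
    return relu(x + y - 1.0)
```

The sum x + y - 1 reaches 0 only when both bits are 1, so the ReLU clips the three other cases to 0, reproducing the AND/product exactly.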
Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval
Title | Nested Invariance Pooling and RBM Hashing for Image Instance Retrieval |
Authors | Olivier Morère, Jie Lin, Antoine Veillard, Vijay Chandrasekhar, Tomaso Poggio |
Abstract | The goal of this work is the computation of very compact binary hashes for image instance retrieval. Our approach has two novel contributions. The first one is Nested Invariance Pooling (NIP), a method inspired from i-theory, a mathematical theory for computing group invariant transformations with feed-forward neural networks. NIP is able to produce compact and well-performing descriptors with visual representations extracted from convolutional neural networks. We specifically incorporate scale, translation and rotation invariances but the scheme can be extended to any arbitrary sets of transformations. We also show that using moments of increasing order throughout nesting is important. The NIP descriptors are then hashed to the target code size (32-256 bits) with a Restricted Boltzmann Machine with a novel batch-level regularization scheme specifically designed for the purpose of hashing (RBMH). A thorough empirical evaluation against the state-of-the-art shows that the results obtained both with the NIP descriptors and the NIP+RBMH hashes are consistently outstanding across a wide range of datasets. |
Tasks | Image Instance Retrieval |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04595v2 |
http://arxiv.org/pdf/1603.04595v2.pdf | |
PWC | https://paperswithcode.com/paper/nested-invariance-pooling-and-rbm-hashing-for |
Repo | |
Framework | |
Group Invariant Deep Representations for Image Instance Retrieval
Title | Group Invariant Deep Representations for Image Instance Retrieval |
Authors | Olivier Morère, Antoine Veillard, Jie Lin, Julie Petta, Vijay Chandrasekhar, Tomaso Poggio |
Abstract | Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower bitrates, they nevertheless present a number of drawbacks including the lack of robustness to common object transformations such as rotations compared with their interest point based FV counterparts. In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory for invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminativeness. Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated. We also show that our method, which has few parameters, is not prone to overfitting: improvements generalize well across datasets with different properties with regard to invariances. Finally, we show that our descriptors are able to compare favourably to other state-of-the-art compact descriptors in similar bit ranges, exceeding the highest retrieval results reported in the literature on some datasets. A dedicated dimensionality reduction step (quantization or hashing) may be able to further improve the competitiveness of the descriptors. |
Tasks | Dimensionality Reduction, Image Classification, Image Instance Retrieval, Quantization |
Published | 2016-01-09 |
URL | http://arxiv.org/abs/1601.02093v2 |
http://arxiv.org/pdf/1601.02093v2.pdf | |
PWC | https://paperswithcode.com/paper/group-invariant-deep-representations-for |
Repo | |
Framework | |
Solving Set Optimization Problems by Cardinality Optimization via Weak Constraints with an Application to Argumentation
Title | Solving Set Optimization Problems by Cardinality Optimization via Weak Constraints with an Application to Argumentation |
Authors | Wolfgang Faber, Mauro Vallati, Federico Cerutti, Massimiliano Giacomin |
Abstract | Optimization - minimization or maximization - in the lattice of subsets is a frequent operation in Artificial Intelligence tasks. Examples are subset-minimal model-based diagnosis, nonmonotonic reasoning by means of circumscription, or preferred extensions in abstract argumentation. Finding the optimum among many admissible solutions is often harder than finding admissible solutions with respect to both computational complexity and methodology. This paper addresses the former issue by means of an effective method for finding subset-optimal solutions. It is based on the relationship between cardinality-optimal and subset-optimal solutions, and the fact that many logic-based declarative programming systems provide constructs for finding cardinality-optimal solutions, for example maximum satisfiability (MaxSAT) or weak constraints in Answer Set Programming (ASP). Clearly each cardinality-optimal solution is also a subset-optimal one, and if the language also allows for the addition of particular restricting constructs (both MaxSAT and ASP do) then all subset-optimal solutions can be found by an iterative computation of cardinality-optimal solutions. As a showcase, the computation of preferred extensions of abstract argumentation frameworks using the proposed method is studied. |
Tasks | Abstract Argumentation |
Published | 2016-12-22 |
URL | http://arxiv.org/abs/1612.07589v1 |
http://arxiv.org/pdf/1612.07589v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-set-optimization-problems-by |
Repo | |
Framework | |
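The iteration the abstract describes, repeatedly asking a cardinality-minimizing solver for a smallest admissible set that does not include any previously found solution, can be sketched with a brute-force stand-in for the MaxSAT/ASP call. The helper names are illustrative, and the enumeration is exponential (the real systems delegate this to the solver):

```python
from itertools import combinations

def card_minimal(universe, admissible, blocked):
    """Stand-in for a MaxSAT/ASP call: a smallest admissible set that is not a
    superset of any previously found solution."""
    for k in range(len(universe) + 1):          # cardinality-ascending search
        for cand in combinations(universe, k):
            s = frozenset(cand)
            if admissible(s) and not any(b <= s for b in blocked):
                return s
    return None

def all_subset_minimal(universe, admissible):
    """Enumerate all subset-minimal admissible sets by iterating the
    cardinality-minimal computation with restricting constraints."""
    found = []
    while True:
        s = card_minimal(universe, admissible, found)
        if s is None:
            return found
        found.append(s)
```

Each returned set is subset-minimal: any strict admissible subset would either have been returned at a smaller cardinality or would itself contain a found solution, which would have blocked the returned set too.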
Action Classification via Concepts and Attributes
Title | Action Classification via Concepts and Attributes |
Authors | Amir Rosenfeld, Shimon Ullman |
Abstract | Classes in natural images tend to follow long tail distributions. This is problematic when there are insufficient training examples for rare classes. This effect is emphasized in compound classes, involving the conjunction of several concepts, such as those appearing in action-recognition datasets. In this paper, we propose to address this issue by learning how to utilize common visual concepts which are readily available. We detect the presence of prominent concepts in images and use them to infer the target labels instead of using visual features directly, combining tools from vision and natural-language processing. We validate our method on the recently introduced HICO dataset, reaching a mAP of 31.54%, and on the Stanford-40 Actions dataset, where the proposed method outperforms that obtained by direct visual features, obtaining an accuracy of 83.12%. Moreover, the method provides for each class a semantically meaningful list of keywords and relevant image regions relating it to its constituent concepts. |
Tasks | Action Classification, Temporal Action Localization |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07824v2 |
http://arxiv.org/pdf/1605.07824v2.pdf | |
PWC | https://paperswithcode.com/paper/action-classification-via-concepts-and |
Repo | |
Framework | |
Improving Human Action Recognition by Non-action Classification
Title | Improving Human Action Recognition by Non-action Classification |
Authors | Yang Wang, Minh Hoai |
Abstract | In this paper we consider the task of recognizing human actions in realistic video where human actions are dominated by irrelevant factors. We first study the benefits of removing non-action video segments, which are the ones that do not portray any human action. We then learn a non-action classifier and use it to down-weight irrelevant video segments. The non-action classifier is trained using ActionThread, a dataset with shot-level annotation for the occurrence or absence of a human action. The non-action classifier can be used to identify non-action shots with high precision and subsequently used to improve the performance of action recognition systems. |
Tasks | Action Classification, Temporal Action Localization |
Published | 2016-04-21 |
URL | http://arxiv.org/abs/1604.06397v2 |
http://arxiv.org/pdf/1604.06397v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-human-action-recognition-by-non |
Repo | |
Framework | |