October 17, 2019


Paper Group ANR 701

Robust Estimation via Robust Gradient Estimation. Image Score: How to Select Useful Samples. Multi-Scale Supervised Network for Human Pose Estimation. Hybrid Forests for Left Ventricle Segmentation using only the first slice label. Beyond black-boxes in Bayesian inverse problems and model validation: applications in solid mechanics of elastography. …

Robust Estimation via Robust Gradient Estimation


Title	Robust Estimation via Robust Gradient Estimation
Authors	Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, Pradeep Ravikumar
Abstract	We provide a new computationally-efficient class of estimators for risk minimization. We show that these estimators are robust for general statistical models: in the classical Huber epsilon-contamination model and in heavy-tailed settings. Our workhorse is a novel robust variant of gradient descent, and we provide conditions under which our gradient descent variant provides accurate estimators in a general convex risk minimization problem. We provide specific consequences of our theory for linear regression, logistic regression and for estimation of the canonical parameters in an exponential family. These results provide some of the first computationally tractable and provably robust estimators for these canonical statistical models. Finally, we study the empirical performance of our proposed methods on synthetic and real datasets, and find that our methods convincingly outperform a variety of baselines.
Tasks
Published	2018-02-19
URL	http://arxiv.org/abs/1802.06485v2
PDF	http://arxiv.org/pdf/1802.06485v2.pdf
PWC	https://paperswithcode.com/paper/robust-estimation-via-robust-gradient
Repo
Framework
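
The robust gradient descent idea in this abstract can be illustrated with a small sketch. The abstract does not say which robust gradient estimator is used, so the coordinate-wise median-of-means estimator below is an assumption chosen for brevity, as are the toy 1-D regression data, the function names, the step size, and the block count.

```python
import random
import statistics


def median_of_means_gradient(grads, num_blocks):
    """Coordinate-wise median of per-block mean gradients (assumed estimator)."""
    dim = len(grads[0])
    block_size = len(grads) // num_blocks
    block_means = []
    for b in range(num_blocks):
        block = grads[b * block_size:(b + 1) * block_size]
        block_means.append(
            [sum(g[j] for g in block) / len(block) for j in range(dim)]
        )
    return [statistics.median(m[j] for m in block_means) for j in range(dim)]


def robust_gradient_descent(xs, ys, steps=200, lr=0.1, num_blocks=50):
    """Fit y ~ w * x + b by gradient descent with a robust gradient estimate."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Per-sample gradients of the squared loss 0.5 * (w * x + b - y) ** 2.
        grads = [[(w * x + b - y) * x, (w * x + b - y)] for x, y in zip(xs, ys)]
        g = median_of_means_gradient(grads, num_blocks)
        w -= lr * g[0]
        b -= lr * g[1]
    return w, b


# Toy data: true slope 2 and intercept 1, with 2% of responses grossly corrupted.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]
for i in rng.sample(range(500), 10):
    ys[i] = 1000.0

w, b = robust_gradient_descent(xs, ys)
print(w, b)
```

With 50 blocks and 10 corrupted points, at most 10 block means are contaminated, so the coordinate-wise median is still determined by clean blocks. An ordinary mean of the per-sample gradients would be dragged far off by the corrupted responses.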

Image Score: How to Select Useful Samples


Title	Image Score: How to Select Useful Samples
Authors	Simiao Zuo, Jialin Wu
Abstract	There have long been debates on how we could interpret neural networks and understand the decisions our models make, and specifically on why deep neural networks tend to be error-prone when dealing with samples that output low softmax scores. We present an efficient approach to measure the confidence of decision-making steps by statistically investigating each unit’s contribution to that decision. Instead of focusing on how the models react to datasets, we study the datasets themselves given a pre-trained model. Our approach is capable of assigning a score to each sample within a dataset that measures the frequency of occurrence of that sample’s chain of activation. We demonstrate with experiments that our method could select useful samples to improve deep neural networks in a semi-supervised learning setting.
Tasks	Decision Making
Published	2018-12-02
URL	http://arxiv.org/abs/1812.00334v1
PDF	http://arxiv.org/pdf/1812.00334v1.pdf
PWC	https://paperswithcode.com/paper/image-score-how-to-select-useful-samples
Repo
Framework

Multi-Scale Supervised Network for Human Pose Estimation


Title	Multi-Scale Supervised Network for Human Pose Estimation
Authors	Lipeng Ke, Ming-Ching Chang, Honggang Qi, Siwei Lyu
Abstract	Human pose estimation is an important topic in computer vision with many applications including gesture and activity recognition. However, pose estimation from images is challenging due to appearance variations, occlusions, cluttered backgrounds, and complex activities. To alleviate these problems, we develop a robust pose estimation method based on the recent deep conv-deconv modules with two improvements: (1) multi-scale supervision of body keypoints, and (2) a global regression to improve structural consistency of keypoints. We refine keypoint detection heatmaps using layer-wise multi-scale supervision to better capture local contexts. Pose inference via keypoint association is optimized globally using a regression network at the end. Our method can effectively disambiguate keypoint matches in close proximity including the mismatch of left-right body parts, and better infer occluded parts. Experimental results show that our method achieves competitive performance among state-of-the-art methods on the MPII and FLIC datasets.
Tasks	Activity Recognition, Keypoint Detection, Pose Estimation
Published	2018-08-05
URL	http://arxiv.org/abs/1808.01623v1
PDF	http://arxiv.org/pdf/1808.01623v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-supervised-network-for-human-pose
Repo
Framework

Hybrid Forests for Left Ventricle Segmentation using only the first slice label


Title	Hybrid Forests for Left Ventricle Segmentation using only the first slice label
Authors	Ismaël Koné, Lahsen Boulmane
Abstract	Machine learning models produce state-of-the-art results in many MRI image segmentation tasks. However, most of these models are trained on very large datasets that come from experts' manual labeling. This labeling process is very time-consuming and costly in expert effort. Therefore, finding a way to reduce this cost is in high demand. In this paper, we propose a segmentation method which exploits the sequential structure of MRI images to nearly eliminate this labeling task. Only the first slice needs to be manually labeled to train the model, which then infers the next slice’s segmentation. The inference result is another datum used to train the model again. The updated model then infers the third slice, and the same process is carried out until the last slice. The proposed model is a combination of two Random Forest algorithms: the classical one and a recent one, namely Mondrian Forests. We applied our method to human left ventricle segmentation and the results are very promising. This method can also be used to generate labels.
Tasks
Published	2018-04-30
URL	http://arxiv.org/abs/1804.11317v1
PDF	http://arxiv.org/pdf/1804.11317v1.pdf
PWC	https://paperswithcode.com/paper/hybrid-forests-for-left-ventricle
Repo
Framework
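
The slice-by-slice procedure described in this abstract (label the first slice, predict the next one, feed the prediction back as training data, repeat) can be sketched as a short loop. The nearest-class-mean classifier below is only a stand-in for the paper's combination of Random Forests and Mondrian Forests, and the function names and toy intensity values are hypothetical.

```python
def fit_class_means(pixels, labels):
    """Stand-in model: mean intensity per class (0 = background, 1 = target)."""
    means = {}
    for cls in (0, 1):
        values = [p for p, l in zip(pixels, labels) if l == cls]
        means[cls] = sum(values) / len(values)
    return means


def predict(means, pixels):
    """Assign each pixel to the class whose mean intensity is closest."""
    return [min(means, key=lambda c: abs(p - means[c])) for p in pixels]


def propagate_labels(slices, first_labels):
    """Train on the labeled first slice, then label each later slice in turn,
    adding every prediction to the training set before moving on."""
    train_pixels = list(slices[0])
    train_labels = list(first_labels)
    all_labels = [list(first_labels)]
    for pixels in slices[1:]:
        model = fit_class_means(train_pixels, train_labels)
        predicted = predict(model, pixels)
        all_labels.append(predicted)
        # The inferred labels become additional training data.
        train_pixels.extend(pixels)
        train_labels.extend(predicted)
    return all_labels


# Three toy "slices" of four pixel intensities each; only the first is labeled.
slices = [
    [0.10, 0.20, 0.80, 0.90],
    [0.20, 0.30, 0.85, 0.95],
    [0.30, 0.40, 0.90, 1.00],
]
labels = propagate_labels(slices, first_labels=[0, 0, 1, 1])
print(labels)  # [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
```

This sketch refits the stand-in model from scratch at every step. An online learner, which is one reason a method might use Mondrian Forests, could instead be updated incrementally with each new slice.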

Beyond black-boxes in Bayesian inverse problems and model validation: applications in solid mechanics of elastography


Title	Beyond black-boxes in Bayesian inverse problems and model validation: applications in solid mechanics of elastography
Authors	Lukas Bruder, Phaedon-Stelios Koutsourelakis
Abstract	The present paper is motivated by one of the most fundamental challenges in inverse problems, that of quantifying model discrepancies and errors. While significant strides have been made in calibrating model parameters, the overwhelming majority of pertinent methods is based on the assumption of a perfect model. Motivated by problems in solid mechanics which, as all problems in continuum thermodynamics, are described by conservation laws and phenomenological constitutive closures, we argue that in order to quantify model uncertainty in a physically meaningful manner, one should break open the black-box forward model. In particular, we propose formulating an undirected probabilistic model that explicitly accounts for the governing equations and their validity. This recasts the solution of both forward and inverse problems as probabilistic inference tasks where the problem’s state variables should be compatible not only with the data but also with the governing equations. Even though the probability densities involved do not contain any black-box terms, they live in much higher-dimensional spaces. In combination with the intractability of the normalization constant of the undirected model employed, this poses significant challenges which we propose to address with a linearly-scaling, double-layer of Stochastic Variational Inference. We demonstrate the capabilities and efficacy of the proposed model in synthetic forward and inverse problems (with and without model error) in elastography.
Tasks
Published	2018-03-02
URL	http://arxiv.org/abs/1803.00930v3
PDF	http://arxiv.org/pdf/1803.00930v3.pdf
PWC	https://paperswithcode.com/paper/beyond-black-boxes-in-bayesian-inverse
Repo
Framework

Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft


Title	Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft
Authors	Stephan Alaniz
Abstract	Deep reinforcement learning has been successfully applied to several visual-input tasks using model-free methods. In this paper, we propose a model-based approach that combines learning a DNN-based transition model with Monte Carlo tree search to solve a block-placing task in Minecraft. Our learned transition model predicts the next frame and the rewards one step ahead given the last four frames of the agent’s first-person-view image and the current action. Then a Monte Carlo tree search algorithm uses this model to plan the best sequence of actions for the agent to perform. On the proposed task in Minecraft, our model-based approach reaches performance comparable to that of the Deep Q-Network, but learns faster and is thus more sample-efficient in training.
Tasks
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08456v1
PDF	http://arxiv.org/pdf/1803.08456v1.pdf
PWC	https://paperswithcode.com/paper/deep-reinforcement-learning-with-model
Repo
Framework

Lattice Identification and Separation: Theory and Algorithm


Title	Lattice Identification and Separation: Theory and Algorithm
Authors	Yuchen He, Sung Ha Kang
Abstract	Motivated by lattice mixture identification and grain boundary detection, we present a framework for lattice pattern representation and comparison, and propose an efficient algorithm for lattice separation. We define new scale and shape descriptors, which help to considerably reduce the size of equivalence classes of lattice bases. These finitely many equivalence relations are fully characterized by modular group theory. We construct the lattice space $\mathscr{L}$ based on the equivalent descriptors and define a metric $d_{\mathscr{L}}$ to accurately quantify the visual similarities and differences between lattices. Furthermore, we introduce the Lattice Identification and Separation Algorithm (LISA), which identifies each lattice pattern from superposed lattices. LISA finds lattice candidates from the high responses in the image spectrum, then sequentially extracts different layers of lattice patterns one by one. Analyzing the frequency components, we reveal the intricate dependency of LISA’s performance on particle radius, lattice density, and relative translations. Various numerical experiments are designed to show LISA’s robustness against a large number of lattice layers, moiré patterns and missing particles.
Tasks	Boundary Detection
Published	2018-12-19
URL	http://arxiv.org/abs/1901.02520v1
PDF	http://arxiv.org/pdf/1901.02520v1.pdf
PWC	https://paperswithcode.com/paper/lattice-identification-and-separation-theory
Repo
Framework

Chief complaint classification with recurrent neural networks


Title	Chief complaint classification with recurrent neural networks
Authors	Scott H Lee, Drew Levin, Pat Finley, Charles M Heilig
Abstract	Syndromic surveillance detects and monitors individual and population health indicators through sources such as emergency department records. Automated classification of these records can improve outbreak detection speed and diagnosis accuracy. Current syndromic systems rely on hand-coded keyword-based methods to parse written fields and may benefit from the use of modern supervised-learning classifier models. In this paper we implement two recurrent neural network models based on long short-term memory (LSTM) and gated recurrent unit (GRU) cells and compare them to two traditional bag-of-words classifiers: multinomial naive Bayes (MNB) and a support vector machine (SVM). The MNB classifier is one of only two machine learning algorithms currently being used for syndromic surveillance. All four models are trained to predict diagnostic code groups as defined by Clinical Classification Software, first to predict from discharge diagnosis, then from chief complaint fields. The classifiers are trained on 3.6 million de-identified emergency department records from a single United States jurisdiction. We compare performance of these models primarily using the F1 score. Using discharge diagnoses, the LSTM classifier performs best, though all models exhibit an F1 score above 96.00. The GRU performs best on chief complaints (F1=47.38), and MNB with bigrams performs worst (F1=39.40). Certain syndrome types are easier to detect than others. For example, the GRU model applied to chief complaints predicts alcohol-related disorders well (F1=78.91) but predicts influenza poorly (F1=14.80). In all instances, the RNN models outperformed the bag-of-words classifiers, suggesting deep learning models could substantially improve the automatic classification of unstructured text for syndromic surveillance.
Tasks
Published	2018-05-19
URL	http://arxiv.org/abs/1805.07574v2
PDF	http://arxiv.org/pdf/1805.07574v2.pdf
PWC	https://paperswithcode.com/paper/chief-complaint-classification-with-recurrent
Repo
Framework
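
Since the abstract compares all models by F1 score, a minimal sketch of the per-class F1 computation may help. The function name and the toy labels are hypothetical, and the abstract does not state how per-class scores were averaged, so only the single-class case is shown. The result here is a fraction in [0, 1], whereas the abstract reports F1 on a 0 to 100 scale.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with two syndrome labels.
true_labels = ["flu", "flu", "alcohol", "alcohol", "flu"]
predicted = ["flu", "alcohol", "alcohol", "alcohol", "alcohol"]

f1_alcohol = f1_score(true_labels, predicted, positive="alcohol")
f1_flu = f1_score(true_labels, predicted, positive="flu")
print(f1_alcohol, f1_flu)
```

In this toy case the "alcohol" class has perfect recall but precision 0.5 (F1 of 2/3), while "flu" has perfect precision but recall 1/3 (F1 of 0.5), which shows how F1 penalizes an imbalance in either direction.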

The Marchex 2018 English Conversational Telephone Speech Recognition System


Title	The Marchex 2018 English Conversational Telephone Speech Recognition System
Authors	Seongjun Hahm, Iroro Orife, Shane Walker, Jason Flaks
Abstract	In this paper, we describe recent performance improvements to the production Marchex speech recognition system for our spontaneous customer-to-business telephone conversations. In our previous work, we focused on in-domain language and acoustic model training. In this work we employ a state-of-the-art semi-supervised lattice-free maximum mutual information (LF-MMI) training process, which can supervise over full lattices from unlabeled audio. On Marchex English (ME), a modern evaluation set of conversational North American English, we observed a 3.3% (3.2% for agent, 3.6% for caller) reduction in absolute word error rate (WER) with 3x faster decoding speed over the performance of the 2017 production system. We expect this improvement to boost Marchex Call Analytics system performance, especially for the natural language processing pipeline.
Tasks	Language Modelling, Speech Recognition
Published	2018-11-05
URL	http://arxiv.org/abs/1811.02058v2
PDF	http://arxiv.org/pdf/1811.02058v2.pdf
PWC	https://paperswithcode.com/paper/the-marchex-2018-english-conversational
Repo
Framework
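
The results in this abstract are reported as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch using the standard word-level edit distance. The function name and the example sentences are hypothetical, and production scoring tools typically also normalize text before scoring, which is omitted here.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "please call me back tomorrow",
    "please call back tomorrow morning",
)
print(wer)  # 0.4 (one deletion and one insertion over five reference words)
```

An absolute WER reduction, as reported in the abstract, is a difference in percentage points between two such rates rather than a relative improvement.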

ET-Lasso: A New Efficient Tuning of Lasso-type Regularization for High-Dimensional Data


Title	ET-Lasso: A New Efficient Tuning of Lasso-type Regularization for High-Dimensional Data
Authors	Songshan Yang, Jiawei Wen, Xiang Zhan, Daniel Kifer
Abstract	L1 regularization (Lasso) has proven to be a versatile tool to select relevant features and estimate the model coefficients simultaneously, and has been widely used in many research areas such as genome studies, finance, and biomedical imaging. Despite its popularity, it is very challenging to guarantee the feature selection consistency of Lasso, especially when the dimension of the data is huge. One way to improve the feature selection consistency is to select an ideal tuning parameter. Traditional tuning criteria, such as cross-validation and BIC, mainly focus on minimizing the estimated prediction error or maximizing the posterior model probability, and may either be time-consuming or fail to control the false discovery rate (FDR) when the number of features is extremely large. Another way is to introduce pseudo-features to learn the importance of the original ones. Recently, the Knockoff filter was proposed to control the FDR when performing feature selection. However, its performance is sensitive to the choice of the expected FDR threshold. Motivated by these ideas, we propose a new method using pseudo-features to obtain an ideal tuning parameter. In particular, we present the Efficient Tuning of Lasso (ET-Lasso) to separate active and inactive features by adding permuted features as pseudo-features in linear models. The pseudo-features are constructed to be inactive by nature, which can be used to obtain a cutoff to select the tuning parameter that separates active and inactive features. Experimental studies on both simulations and real-world data applications are provided to show that ET-Lasso can effectively and efficiently select active features under a wide range of scenarios.
Tasks	Feature Selection
Published	2018-10-10
URL	https://arxiv.org/abs/1810.04513v2
PDF	https://arxiv.org/pdf/1810.04513v2.pdf
PWC	https://paperswithcode.com/paper/et-lasso-efficient-tuning-of-lasso-for-high
Repo
Framework
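
The pseudo-feature idea in this abstract can be sketched as follows: append row-permuted copies of the features, walk down a grid of Lasso penalties, and stop as soon as any permuted copy becomes active. This is one plausible reading of the abstract rather than the authors' exact procedure. The coordinate-descent solver, the stopping rule, the penalty grid, and the toy data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(columns, y, lam, sweeps=30):
    """Coordinate descent for (1 / 2n) * ||y - X b||^2 + lam * ||b||_1."""
    n = len(y)
    beta = [0.0] * len(columns)
    residual = list(y)
    norms = [sum(v * v for v in col) / n for col in columns]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual that excludes its own fit.
            rho = sum(c * r for c, r in zip(col, residual)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            if new != beta[j]:
                delta = new - beta[j]
                residual = [r - delta * c for r, c in zip(residual, col)]
                beta[j] = new
    return beta


def select_with_permuted_features(columns, y, lambdas, rng):
    """Keep the original features active just before any permuted copy enters."""
    p = len(columns)
    pseudo = []
    for col in columns:
        shuffled = list(col)
        rng.shuffle(shuffled)  # permuting rows breaks any link with y
        pseudo.append(shuffled)
    augmented = columns + pseudo
    selected = set()
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_cd(augmented, y, lam)
        active = {j for j, bj in enumerate(beta) if bj != 0.0}
        if any(j >= p for j in active):
            break  # a pseudo-feature entered: use the previous selection
        selected = active
    return selected


# Toy data: 5 features, of which only features 0 and 1 drive the response.
rng = random.Random(1)
n, p = 200, 5
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [3.0 * columns[0][i] - 2.0 * columns[1][i] + rng.gauss(0, 1) for i in range(n)]
lambdas = [2.0 * 0.8 ** k for k in range(20)]

selected = select_with_permuted_features(columns, y, lambdas, rng)
print(sorted(selected))
```

Because the permuted copies are inactive by construction, the penalty at which the first one enters gives a data-driven cutoff without cross-validation. The selected set can still include a truly inactive original feature that happens to enter before any permuted copy.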

Combining Deep and Depth: Deep Learning and Face Depth Maps for Driver Attention Monitoring


Title	Combining Deep and Depth: Deep Learning and Face Depth Maps for Driver Attention Monitoring
Authors	Guido Borghi
Abstract	Recently, deep learning approaches have achieved promising results in various fields of computer vision. In this paper, we investigate the combination of deep learning based methods and depth maps as input images to tackle the problem of driver attention monitoring. Moreover, we treat the concept of attention as Head Pose Estimation and Facial Landmark Detection tasks. Unlike other proposals in the literature, the proposed systems are able to work directly and only on raw depth data. All presented methods are trained and tested on two new public datasets, namely Pandora and MotorMark, achieving state-of-the-art results and running with real-time performance.
Tasks	Driver Attention Monitoring, Facial Landmark Detection, Head Pose Estimation, Pose Estimation
Published	2018-12-14
URL	http://arxiv.org/abs/1812.05831v1
PDF	http://arxiv.org/pdf/1812.05831v1.pdf
PWC	https://paperswithcode.com/paper/combining-deep-and-depth-deep-learning-and
Repo
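Framework

Robust positioning of drones for land use monitoring in strong terrain relief using vision-based navigation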


Title	Robust positioning of drones for land use monitoring in strong terrain relief using vision-based navigation
Authors	Oleg Kupervasser, Vitalii Sarychev, Alexander Rubinstein, Roman Yavich
Abstract	For land use monitoring, the main problem is robust positioning in urban canyons and strong terrain relief when only the GPS system is used. Indeed, satellite signal reflection and shielding in urban canyons and strong terrain relief cause problems with correct positioning. Using GNSS-RTK does not solve the problem completely, because in some complex situations the whole satellite system works incorrectly. We turn this weakness (urban canyons and strong terrain relief) into an advantage by using vision-based navigation with a map of the terrain relief. We investigate and demonstrate the effectiveness of this technology in the Xiaoshan region of China. The accuracy of the vision-based navigation system corresponds to that expected for these conditions. It was concluded that the maximum position error of vision-based navigation is 20 m and the maximum Euler angle error is 0.83 degrees. In the case of camera movement, the maximum position error is 30 m and the maximum Euler angle error is 2.2 degrees.
Tasks
Published	2018-02-20
URL	http://arxiv.org/abs/1803.00398v1
PDF	http://arxiv.org/pdf/1803.00398v1.pdf
PWC	https://paperswithcode.com/paper/robust-positioning-of-drones-for-land-use
Repo
Framework

Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training


Title	Pyramidal Person Re-IDentification via Multi-Loss Dynamic Training
Authors	Feng Zheng, Cheng Deng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Rongrong Ji
Abstract	Most existing Re-IDentification (Re-ID) methods are highly dependent on precise bounding boxes that enable images to be aligned with each other. However, due to challenging practical scenarios, current detection models often produce inaccurate bounding boxes, which inevitably degrade the performance of existing Re-ID algorithms. In this paper, we propose a novel coarse-to-fine pyramid model to relax the need for bounding boxes, which not only incorporates local and global information, but also integrates the gradual cues between them. The pyramid model is able to match at different scales and then search for the correct image of the same identity, even when the image pairs are not aligned. In addition, in order to learn discriminative identity representation, we explore a dynamic training scheme to seamlessly unify two losses and extract appropriate shared information between them. Experimental results clearly demonstrate that the proposed method achieves state-of-the-art results on three datasets. In particular, our approach exceeds the current best method by 9.5% on the most challenging CUHK03 dataset.
Tasks	Person Re-Identification
Published	2018-10-29
URL	https://arxiv.org/abs/1810.12193v3
PDF	https://arxiv.org/pdf/1810.12193v3.pdf
PWC	https://paperswithcode.com/paper/a-coarse-to-fine-pyramidal-model-for-person
Repo
Framework

Lexicosyntactic Inference in Neural Models


Title	Lexicosyntactic Inference in Neural Models
Authors	Aaron Steven White, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
Abstract	We investigate neural models’ ability to capture lexicosyntactic inferences: inferences triggered by the interaction of lexical and syntactic information. We take the task of event factuality prediction as a case study and build a factuality judgment dataset for all English clause-embedding verbs in various syntactic contexts. We use this dataset, which we make publicly available, to probe the behavior of current state-of-the-art neural systems, showing that these systems make certain systematic errors that are clearly visible through the lens of factuality prediction.
Tasks
Published	2018-08-19
URL	http://arxiv.org/abs/1808.06232v1
PDF	http://arxiv.org/pdf/1808.06232v1.pdf
PWC	https://paperswithcode.com/paper/lexicosyntactic-inference-in-neural-models
Repo
Framework

Learning Robust Manipulation Skills with Guided Policy Search via Generative Motor Reflexes


Title	Learning Robust Manipulation Skills with Guided Policy Search via Generative Motor Reflexes
Authors	Philipp Ennen, Pia Bresenitz, Rene Vossen, Frank Hees
Abstract	Guided Policy Search enables robots to learn control policies for complex manipulation tasks efficiently. Therein, the control policies are represented as high-dimensional neural networks which derive robot actions based on states. However, due to the small number of real-world trajectory samples in Guided Policy Search, the resulting neural networks are only robust in the neighbourhood of the trajectory distribution explored by real-world interactions. In this paper, we present a new policy representation called Generative Motor Reflexes, which is able to generate robust actions over a broader state space compared to previous methods. In contrast to prior state-action policies, Generative Motor Reflexes map states to parameters for a state-dependent motor reflex, which is then used to derive actions. Robustness is achieved by generating similar motor reflexes for many states. We evaluate the presented method in simulated and real-world manipulation tasks, including contact-rich peg-in-hole tasks. Using these evaluation tasks, we show that policies represented as Generative Motor Reflexes lead to robust manipulation skills outside the explored trajectory distribution as well, with lower training needs compared to previous methods.
Tasks
Published	2018-09-15
URL	http://arxiv.org/abs/1809.05714v2
PDF	http://arxiv.org/pdf/1809.05714v2.pdf
PWC	https://paperswithcode.com/paper/learning-robust-manipulation-skills-with
Repo
Framework