October 20, 2019

3224 words 16 mins read

Paper Group AWR 346

Densely Connected Bidirectional LSTM with Applications to Sentence Classification. Adversarial Spheres. Learning monocular depth estimation with unsupervised trinocular assumptions. Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation. Locally Private Bayesian Inference for Count Models. Multichannel Semantic …

Densely Connected Bidirectional LSTM with Applications to Sentence Classification

Title Densely Connected Bidirectional LSTM with Applications to Sentence Classification
Authors Zixiang Ding, Rui Xia, Jianfei Yu, Xiang Li, Jian Yang
Abstract Deep neural networks have recently been shown to achieve highly competitive performance in many computer vision tasks due to their ability to explore a much larger hypothesis space. However, since most deep architectures like stacked RNNs tend to suffer from the vanishing-gradient and overfitting problems, their effects are still understudied in many NLP tasks. Inspired by this, we propose a novel multi-layer RNN model called densely connected bidirectional long short-term memory (DC-Bi-LSTM) in this paper, which essentially represents each layer by the concatenation of its hidden state and all preceding layers’ hidden states, followed by recursively passing each layer’s representation to all subsequent layers. We evaluate our proposed model on five benchmark datasets of sentence classification. DC-Bi-LSTM with depth up to 20 can be successfully trained and obtains significant improvements over the traditional Bi-LSTM with the same or even fewer parameters. Moreover, our model has promising performance compared with the state-of-the-art approaches.
Tasks Sentence Classification
Published 2018-02-03
URL http://arxiv.org/abs/1802.00889v1
PDF http://arxiv.org/pdf/1802.00889v1.pdf
PWC https://paperswithcode.com/paper/densely-connected-bidirectional-lstm-with
Repo https://github.com/xuhaiming1996/Densely-Connected-Bidirectional-LSTM
Framework tf
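
The core of DC-Bi-LSTM is easy to sketch: each layer consumes the concatenation of the word embeddings and every preceding layer’s hidden states. Below is a minimal PyTorch sketch under assumed sizes (hidden width and depth here are illustrative, not the paper’s exact configuration):

```python
# Minimal sketch of densely connected stacked Bi-LSTMs: each layer's input
# is the concatenation of the embeddings and all earlier layers' outputs.
import torch
import torch.nn as nn

class DenseBiLSTM(nn.Module):
    def __init__(self, emb_dim=100, hidden=10, depth=5):
        super().__init__()
        self.layers = nn.ModuleList()
        in_dim = emb_dim
        for _ in range(depth):
            self.layers.append(nn.LSTM(in_dim, hidden, batch_first=True,
                                       bidirectional=True))
            in_dim += 2 * hidden  # next layer sees all previous outputs too

    def forward(self, x):          # x: (batch, seq_len, emb_dim)
        features = [x]
        for lstm in self.layers:
            out, _ = lstm(torch.cat(features, dim=-1))
            features.append(out)   # dense connection: keep every layer
        return torch.cat(features, dim=-1)

model = DenseBiLSTM()
pooled = model(torch.randn(2, 7, 100)).mean(dim=1)  # average-pooled sentence vector
```

Because each layer only adds a small bidirectional output to the running concatenation, the parameter count stays modest even at substantial depth, consistent with the abstract’s claim of matching Bi-LSTM with the same or fewer parameters.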

Adversarial Spheres

Title Adversarial Spheres
Authors Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow
Abstract State of the art computer vision models have been shown to be vulnerable to small adversarial perturbations of the input. In other words, most images in the data distribution are both correctly classified by the model and are very close to a visually similar misclassified image. Despite substantial research interest, the cause of the phenomenon is still poorly understood and remains unsolved. We hypothesize that this counterintuitive behavior is a naturally occurring result of the high dimensional geometry of the data manifold. As a first step towards exploring this hypothesis, we study a simple synthetic dataset of classifying between two concentric high dimensional spheres. For this dataset we show a fundamental tradeoff between the amount of test error and the average distance to the nearest error. In particular, we prove that any model which misclassifies a small constant fraction of a sphere will be vulnerable to adversarial perturbations of size $O(1/\sqrt{d})$. Surprisingly, when we train several different architectures on this dataset, all of their error sets naturally approach this theoretical bound. As a result of the theory, the vulnerability of neural networks to small adversarial perturbations is a logical consequence of the amount of test error observed. We hope that our theoretical analysis of this very simple case will point the way forward to explore how the geometry of complex real-world data sets leads to adversarial examples.
Tasks
Published 2018-01-09
URL http://arxiv.org/abs/1801.02774v3
PDF http://arxiv.org/pdf/1801.02774v3.pdf
PWC https://paperswithcode.com/paper/adversarial-spheres
Repo https://github.com/xiaozhanguva/Measure-Concentration
Framework pytorch
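
The synthetic task is simple to reproduce: classify points drawn uniformly from two concentric $d$-dimensional spheres. A NumPy sketch (the radii and dimension below are illustrative choices, not necessarily the paper’s exact setup):

```python
# Sample uniformly from a d-dimensional sphere by normalizing Gaussians.
import numpy as np

def sample_sphere(n, d, radius, rng):
    x = rng.standard_normal((n, d))
    return radius * x / np.linalg.norm(x, axis=1, keepdims=True)

rng = np.random.default_rng(0)
d, R = 500, 1.3
inner = sample_sphere(1000, d, 1.0, rng)   # label 0
outer = sample_sphere(1000, d, R, rng)     # label 1
print("O(1/sqrt(d)) perturbation scale:", 1 / np.sqrt(d))
```

The last line prints the scale at which the paper’s theorem says adversarial perturbations must exist for any classifier with a constant fraction of errors on a sphere.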

Learning monocular depth estimation with unsupervised trinocular assumptions

Title Learning monocular depth estimation with unsupervised trinocular assumptions
Authors Matteo Poggi, Fabio Tosi, Stefano Mattoccia
Abstract Obtaining accurate depth measurements out of a single image represents a fascinating solution to 3D sensing. CNNs led to considerable improvements in this field, and recent trends replaced the need for ground-truth labels with geometry-guided image reconstruction signals enabling unsupervised training. Currently, for this purpose, state-of-the-art techniques rely on images acquired with a binocular stereo rig to predict inverse depth (i.e., disparity) according to the aforementioned supervision principle. However, these methods suffer from well-known problems near occlusions, the left image border, etc., inherited from the stereo setup. Therefore, in this paper, we tackle these issues by moving to a trinocular domain for training. Assuming the central image as the reference, we train a CNN to infer disparity representations pairing such image with frames on its left and right side. This strategy allows obtaining depth maps not affected by typical stereo artifacts. Moreover, since trinocular datasets are seldom available, we introduce a novel interleaved training procedure that enforces the trinocular assumption using existing binocular datasets. Exhaustive experimental results on the KITTI dataset confirm that our proposal outperforms state-of-the-art methods for unsupervised monocular depth estimation trained on binocular stereo pairs as well as any known methods relying on other cues.
Tasks Depth Estimation, Image Reconstruction, Monocular Depth Estimation
Published 2018-08-05
URL http://arxiv.org/abs/1808.01606v1
PDF http://arxiv.org/pdf/1808.01606v1.pdf
PWC https://paperswithcode.com/paper/learning-monocular-depth-estimation-with
Repo https://github.com/mattpoggi/3net
Framework tf
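
The trinocular supervision can be sketched as two binocular reconstruction losses that share the central view: the center image is reconstructed once from the left frame and once from the right frame. The sketch below is schematic: `warp` is a stub standing in for a disparity-based bilinear sampler, and the loss is plain L1 rather than the authors’ full photometric term:

```python
# Schematic trinocular loss: reconstruct the center view from both sides.
import torch

def warp(img, disp):
    # Placeholder: a real implementation resamples img along the epipolar
    # line using disp (e.g., via torch.nn.functional.grid_sample).
    return img  # stub so the sketch runs end to end

def trinocular_loss(left, center, right, disp_l, disp_r):
    return ((warp(left, disp_l) - center).abs().mean() +
            (warp(right, disp_r) - center).abs().mean())

c = torch.rand(1, 3, 64, 64)
loss = trinocular_loss(torch.rand_like(c), c, torch.rand_like(c),
                       torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```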

Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation

Title Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation
Authors Rania Briq, Michael Moeller, Juergen Gall
Abstract Weakly supervised semantic segmentation has been a subject of increased interest due to the scarcity of fully annotated images. We introduce a new approach for solving weakly supervised semantic segmentation with deep Convolutional Neural Networks (CNNs). The method introduces a novel layer which applies simplex projection on the output of a neural network using area constraints of class objects. The proposed method is general and can be seamlessly integrated into any CNN architecture. Moreover, the projection layer allows strongly supervised models to be adapted to weakly supervised models effortlessly by substituting ground truth labels. Our experiments have shown that applying such an operation on the output of a CNN improves the accuracy of semantic segmentation in a weakly supervised setting with image-level labels.
Tasks Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2018-07-24
URL http://arxiv.org/abs/1807.09169v1
PDF http://arxiv.org/pdf/1807.09169v1.pdf
PWC https://paperswithcode.com/paper/convolutional-simplex-projection-network-cspn
Repo https://github.com/briqr/CSPN
Framework caffe2
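
The layer’s core operation is a Euclidean projection onto a scaled simplex $\{x : x \ge 0, \sum_i x_i = a\}$, for which the classic sort-based algorithm (Duchi et al., 2008) applies. A NumPy sketch of just that projection; how CSPN wires it into a CNN and derives the area constraint $a$ from object sizes is described in the paper:

```python
# Euclidean projection of v onto {x : x >= 0, sum(x) = a}.
import numpy as np

def project_simplex(v, a=1.0):
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - a
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css)[0][-1]
    theta = css[rho] / (rho + 1.0)            # optimal shift
    return np.maximum(v - theta, 0.0)

print(project_simplex(np.array([0.5, 0.3, 0.9])))  # non-negative, sums to 1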

Locally Private Bayesian Inference for Count Models

Title Locally Private Bayesian Inference for Count Models
Authors Aaron Schein, Zhiwei Steven Wu, Alexandra Schofield, Mingyuan Zhou, Hanna Wallach
Abstract We present a general method for privacy-preserving Bayesian inference in Poisson factorization, a broad class of models that includes some of the most widely used models in the social sciences. Our method satisfies limited precision local privacy, a generalization of local differential privacy, which we introduce to formulate privacy guarantees appropriate for sparse count data. We develop an MCMC algorithm that approximates the locally private posterior over model parameters given data that has been locally privatized by the geometric mechanism (Ghosh et al., 2012). Our solution is based on two insights: 1) a novel reinterpretation of the geometric mechanism in terms of the Skellam distribution (Skellam, 1946) and 2) a general theorem that relates the Skellam to the Bessel distribution (Yuan & Kalbfleisch, 2000). We demonstrate our method in two case studies on real-world email data in which we show that our method consistently outperforms the commonly-used naive approach, obtaining higher quality topics in text and more accurate link prediction in networks. On some tasks, our privacy-preserving method even outperforms non-private inference which conditions on the true data.
Tasks Bayesian Inference, Link Prediction
Published 2018-03-22
URL http://arxiv.org/abs/1803.08471v3
PDF http://arxiv.org/pdf/1803.08471v3.pdf
PWC https://paperswithcode.com/paper/locally-private-bayesian-inference-for-count
Repo https://github.com/xandaschofield/locally_private_bpf_icml19
Framework none
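
The geometric mechanism itself is easy to sketch: it adds two-sided geometric noise with parameter $\alpha = e^{-\epsilon}$, which can be sampled as the difference of two i.i.d. geometric variables (a standard construction; the paper’s Skellam- and Bessel-based results go further and enable inference on the privatized data). A NumPy sketch:

```python
# Locally privatize counts with two-sided geometric noise (alpha = e^-eps).
import numpy as np

def privatize_counts(counts, epsilon, rng):
    alpha = np.exp(-epsilon)
    # numpy's geometric counts trials >= 1, so subtract 1 for support {0,1,...}
    g1 = rng.geometric(1 - alpha, size=counts.shape) - 1
    g2 = rng.geometric(1 - alpha, size=counts.shape) - 1
    return counts + g1 - g2   # difference of two geometrics = two-sided noise

rng = np.random.default_rng(0)
noisy = privatize_counts(np.array([3, 0, 7]), epsilon=1.0, rng=rng)
```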

Multichannel Semantic Segmentation with Unsupervised Domain Adaptation

Title Multichannel Semantic Segmentation with Unsupervised Domain Adaptation
Authors Kohei Watanabe, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada
Abstract Most contemporary robots have depth sensors, and research on semantic segmentation with RGBD images has shown that depth images boost the accuracy of segmentation. Since it is time-consuming to annotate images with semantic labels per pixel, it would be ideal if we could avoid this laborious work by utilizing an existing dataset or a synthetic dataset which we can generate on our own. Robot motions are often tested in a synthetic environment, where multichannel (e.g., RGB + depth + instance boundary) images plus their pixel-level semantic labels are available. However, models trained simply on synthetic images tend to demonstrate poor performance on real images. In order to address this, we propose two approaches that can efficiently exploit multichannel inputs combined with an unsupervised domain adaptation (UDA) algorithm. One is a fusion-based approach that uses depth images as inputs. The other is a multitask learning approach that uses depth images as outputs. We demonstrated that the segmentation results were improved by using a multitask learning approach with a post-process and created a benchmark for this task.
Tasks Domain Adaptation, Semantic Segmentation, Unsupervised Domain Adaptation
Published 2018-12-11
URL http://arxiv.org/abs/1812.04351v1
PDF http://arxiv.org/pdf/1812.04351v1.pdf
PWC https://paperswithcode.com/paper/multichannel-semantic-segmentation-with
Repo https://github.com/LittleWat/multichannel-semseg-with-uda
Framework pytorch
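
The fusion-based variant can be sketched as two encoders whose features are concatenated before the segmentation head. A minimal PyTorch sketch with toy layer sizes; the paper uses full segmentation backbones with a UDA algorithm on top:

```python
# Toy late-fusion segmentation net: separate RGB and depth encoders.
import torch
import torch.nn as nn

class FusionSeg(nn.Module):
    def __init__(self, n_classes=13):
        super().__init__()
        self.rgb_enc = nn.Conv2d(3, 16, 3, padding=1)
        self.depth_enc = nn.Conv2d(1, 16, 3, padding=1)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, rgb, depth):
        f = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.head(f)   # per-pixel class scores

net = FusionSeg()
logits = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
```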

FCHD: Fast and accurate head detection in crowded scenes

Title FCHD: Fast and accurate head detection in crowded scenes
Authors Aditya Vora, Vinay Chilaka
Abstract In this paper, we propose FCHD-Fully Convolutional Head Detector, an end-to-end trainable head detection model. Our proposed architecture is a single fully convolutional network which is responsible for both bounding box prediction and classification. This makes our model lightweight with low inference time and memory requirements. Along with run-time, our model has better overall average precision (AP), which is achieved by selecting anchor sizes based on the effective receptive field of the network. This can be concluded from our experiments on several head detection datasets with varying head counts. We achieve an AP of 0.70 on a challenging head detection dataset, which is comparable to some standard benchmarks. Along with this, our model runs at 5 FPS on an Nvidia Quadro M1000M for VGA resolution images. Code is available at https://github.com/aditya-vora/FCHD-Fully-Convolutional-Head-Detector.
Tasks Head Detection
Published 2018-09-24
URL https://arxiv.org/abs/1809.08766v3
PDF https://arxiv.org/pdf/1809.08766v3.pdf
PWC https://paperswithcode.com/paper/fchd-a-fast-and-accurate-head-detector
Repo https://github.com/aditya-vora/FCHD-Fully-Convolutional-Head-Detector
Framework pytorch
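
The anchor-selection idea can be sketched as generating square anchors at every feature-map cell, with scales kept near the network’s effective receptive field rather than spanning a wide pyramid. The stride and scales below are assumptions for illustration:

```python
# Square anchors at each feature-map cell, a few scales per location.
import numpy as np

def make_anchors(fm_h, fm_w, stride=16, scales=(2, 4)):
    boxes = []
    for y in range(fm_h):
        for x in range(fm_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                half = s * stride / 2.0
                boxes.append([cx - half, cy - half, cx + half, cy + half])
    return np.array(boxes)   # (fm_h * fm_w * len(scales), 4) in pixel coords

anchors = make_anchors(30, 40)   # for a 480x640 image at stride 16
```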

Word Mover’s Embedding: From Word2Vec to Document Embedding

Title Word Mover’s Embedding: From Word2Vec to Document Embedding
Authors Lingfei Wu, Ian E. H. Yen, Kun Xu, Fangli Xu, Avinash Balakrishnan, Pin-Yu Chen, Pradeep Ravikumar, Michael J. Witbrock
Abstract While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively little success in extending it to generate unsupervised sentence or document embeddings. Recent work has demonstrated that a distance measure between documents called \emph{Word Mover’s Distance} (WMD), which aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the \emph{Word Mover’s Embedding} (WME), a novel approach to building an unsupervised document (sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on problems of short length.
Tasks Document Embedding, Sentence Embedding, Text Classification, Word Embeddings
Published 2018-10-30
URL http://arxiv.org/abs/1811.01713v1
PDF http://arxiv.org/pdf/1811.01713v1.pdf
PWC https://paperswithcode.com/paper/word-movers-embedding-from-word2vec-to
Repo https://github.com/IBM/WordMoversEmbeddings
Framework none
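
The WME construction can be sketched as a random-feature map: a document is represented by its soft similarities to R random pseudo-documents. The sketch below substitutes a cheap relaxed WMD lower bound for exact optimal transport; the relaxation, $\gamma$, and all data are illustrative assumptions:

```python
# Random-feature document embedding from a (relaxed) Word Mover's Distance.
import numpy as np

def relaxed_wmd(A, B):
    # Lower bound on WMD: each word moves to its nearest counterpart.
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

def wme_features(doc, random_docs, gamma=1.0):
    return np.array([np.exp(-gamma * relaxed_wmd(doc, w))
                     for w in random_docs]) / np.sqrt(len(random_docs))

rng = np.random.default_rng(0)
vocab = rng.standard_normal((50, 20))               # pretend word vectors
doc = vocab[rng.integers(0, 50, size=8)]            # a document's word vectors
omegas = [vocab[rng.integers(0, 50, size=5)] for _ in range(16)]
phi = wme_features(doc, omegas)                     # 16-dim document embedding
```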

Categorizing Comparative Sentences

Title Categorizing Comparative Sentences
Authors Alexander Panchenko, Alexander Bondarenko, Mirco Franzek, Matthias Hagen, Chris Biemann
Abstract We tackle the tasks of automatically identifying comparative sentences and categorizing the intended preference (e.g., “Python has better NLP libraries than MATLAB” => (Python, better, MATLAB)). To this end, we manually annotate 7,199 sentences for 217 distinct target item pairs from several domains (27% of the sentences contain an oriented comparison in the sense of “better” or “worse”). A gradient boosting model based on pre-trained sentence embeddings reaches an F1 score of 85% in our experimental evaluation. The model can be used to extract comparative sentences for pro/con argumentation in comparative / argument search engines or debating technologies.
Tasks Argument Mining, Sentence Embeddings
Published 2018-09-17
URL https://arxiv.org/abs/1809.06152v2
PDF https://arxiv.org/pdf/1809.06152v2.pdf
PWC https://paperswithcode.com/paper/categorization-of-comparative-sentences-for
Repo https://github.com/ablx/comparative-arguments-thesis
Framework none
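
The modeling setup is straightforward to sketch: a gradient-boosting classifier over pre-computed sentence embeddings. Everything below (embeddings, labels, sizes) is stand-in data, not the paper’s corpus or features:

```python
# Gradient boosting over pretend sentence embeddings with comparison labels.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 512))                      # stand-in embeddings
y = rng.choice(["BETTER", "WORSE", "NONE"], size=200)    # stand-in labels
clf = GradientBoostingClassifier().fit(X, y)
print(clf.predict(X[:3]))
```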

SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception

Title SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception
Authors Yue Meng, Yongxi Lu, Aman Raj, Samuel Sunarjo, Rui Guo, Tara Javidi, Gaurav Bansal, Dinesh Bharadia
Abstract Unsupervised learning for geometric perception (depth, optical flow, etc.) is of great interest to autonomous systems. Recent works on unsupervised learning have made considerable progress on perceiving geometry; however, they usually ignore the coherence of objects and perform poorly in dark and noisy environments. In contrast, supervised learning algorithms, which are robust, require large labeled geometric datasets. This paper introduces SIGNet, a novel framework that provides robust geometry perception without requiring geometrically informative labels. Specifically, SIGNet integrates semantic information to make depth and flow predictions consistent with objects and robust to low lighting conditions. SIGNet is shown to improve upon the state-of-the-art unsupervised learning for depth prediction by 30% (in squared relative error). In particular, SIGNet improves the dynamic object class performance by 39% in depth prediction and 29% in flow prediction. Our code will be made available at https://github.com/mengyuest/SIGNet
Tasks 3D Geometry Perception, Depth Estimation, Monocular Depth Estimation, Optical Flow Estimation
Published 2018-12-13
URL http://arxiv.org/abs/1812.05642v2
PDF http://arxiv.org/pdf/1812.05642v2.pdf
PWC https://paperswithcode.com/paper/signet-semantic-instance-aided-unsupervised
Repo https://github.com/mengyuest/SIGNet
Framework tf

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation

Title Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
Authors Anurag Ranjan, Varun Jampani, Lukas Balles, Kihwan Kim, Deqing Sun, Jonas Wulff, Michael J. Black
Abstract We address the unsupervised learning of several interconnected problems in low-level vision: single view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions. Our key insight is that these four fundamental vision problems are coupled through geometric constraints. Consequently, learning to solve them together simplifies the problem because the solutions can reinforce each other. We go beyond previous work by exploiting geometry more explicitly and segmenting the scene into static and moving regions. To that end, we introduce Competitive Collaboration, a framework that facilitates the coordinated training of multiple specialized neural networks to solve complex problems. Competitive Collaboration works much like expectation-maximization, but with neural networks that act as both competitors to explain pixels that correspond to static or moving regions, and as collaborators through a moderator that assigns pixels to be either static or independently moving. Our novel method integrates all these problems in a common framework and simultaneously reasons about the segmentation of the scene into moving objects and the static background, the camera motion, depth of the static scene structure, and the optical flow of moving objects. Our model is trained without any supervision and achieves state-of-the-art performance among joint unsupervised methods on all sub-problems.
Tasks Depth Estimation, Monocular Depth Estimation, Motion Estimation, Motion Segmentation, Optical Flow Estimation
Published 2018-05-24
URL http://arxiv.org/abs/1805.09806v3
PDF http://arxiv.org/pdf/1805.09806v3.pdf
PWC https://paperswithcode.com/paper/competitive-collaboration-joint-unsupervised
Repo https://github.com/anuragranj/cc
Framework pytorch
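
The training schedule is the heart of Competitive Collaboration and can be sketched abstractly: a competition phase updates the two competitors with the moderator’s mask frozen, then a collaboration phase updates the moderator. All modules and losses below are tiny stand-ins showing only the alternation, not the paper’s networks or objectives:

```python
# Schematic Competitive Collaboration alternation with stand-in modules.
import torch
import torch.nn as nn

depth_cam = nn.Conv2d(3, 1, 3, padding=1)   # stand-in: static-scene competitor
flow_net  = nn.Conv2d(3, 2, 3, padding=1)   # stand-in: moving-region competitor
moderator = nn.Conv2d(3, 1, 3, padding=1)   # outputs soft mask m in [0, 1]

opt_comp = torch.optim.Adam(list(depth_cam.parameters()) +
                            list(flow_net.parameters()))
opt_mod  = torch.optim.Adam(moderator.parameters())

def photometric(x):                          # placeholder error map
    return x.abs()

for img in [torch.rand(1, 3, 32, 32) for _ in range(4)]:
    # Competition phase: mask fixed, each competitor explains its pixels.
    m = torch.sigmoid(moderator(img)).detach()
    loss_c = (m * photometric(depth_cam(img))).mean() \
           + ((1 - m) * photometric(flow_net(img))).mean()
    opt_comp.zero_grad(); loss_c.backward(); opt_comp.step()

    # Collaboration phase: moderator reassigns pixels given fixed competitors.
    m = torch.sigmoid(moderator(img))
    with torch.no_grad():
        e_static = photometric(depth_cam(img))
        e_motion = photometric(flow_net(img))
    loss_m = (m * e_static + (1 - m) * e_motion).mean()
    opt_mod.zero_grad(); loss_m.backward(); opt_mod.step()
```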

Latent Variable Time-varying Network Inference

Title Latent Variable Time-varying Network Inference
Authors Federico Tomasi, Veronica Tozzo, Saverio Salzo, Alessandro Verri
Abstract In many applications of finance, biology and sociology, complex systems involve entities interacting with each other. These processes have the peculiarity of evolving over time and of comprising latent factors, which influence the system without being explicitly measured. In this work we present latent variable time-varying graphical lasso (LTGL), a method for multivariate time-series graphical modelling that considers the influence of hidden or unmeasurable factors. The estimation of the contribution of the latent factors is embedded in the model which produces both sparse and low-rank components for each time point. In particular, the first component represents the connectivity structure of observable variables of the system, while the second represents the influence of hidden factors, assumed to be few with respect to the observed variables. Our model includes temporal consistency on both components, providing an accurate evolutionary pattern of the system. We derive a tractable optimisation algorithm based on the alternating direction method of multipliers, and develop a scalable and efficient implementation which exploits proximity operators in closed form. LTGL is extensively validated on synthetic data, achieving optimal performance in terms of accuracy, structure learning and scalability with respect to ground truth and state-of-the-art methods for graphical inference. We conclude with the application of LTGL to real case studies, from biology and finance, to illustrate how our method can be successfully employed to gain insights on multivariate time-series data.
Tasks Time Series
Published 2018-02-12
URL http://arxiv.org/abs/1802.03987v2
PDF http://arxiv.org/pdf/1802.03987v2.pdf
PWC https://paperswithcode.com/paper/latent-variable-time-varying-network
Repo https://github.com/fdtomasi/regain
Framework none
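
In symbols, the model estimates a sparse component $S_t$ and a positive semidefinite low-rank component $L_t$ per time point, with temporal-consistency penalties on both. A paraphrase of the objective based on the abstract (the specific penalty functions $\Psi, \Phi$ are configurable choices in the paper, not fixed here):

$$\min_{\{S_t\},\,\{L_t \succeq 0\}} \sum_{t=1}^{T} \Big[ -\ell(\hat{\Sigma}_t,\, S_t - L_t) + \alpha \lVert S_t \rVert_{\mathrm{od},1} + \tau \operatorname{tr}(L_t) \Big] + \beta \sum_{t=2}^{T} \Psi(S_t - S_{t-1}) + \eta \sum_{t=2}^{T} \Phi(L_t - L_{t-1})$$

where $\ell$ is the Gaussian log-likelihood of the empirical covariance $\hat{\Sigma}_t$ under the precision matrix $S_t - L_t$, the off-diagonal $\ell_1$ norm enforces sparsity, and the trace penalty encourages low rank.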

Contextual Bandits with Stochastic Experts

Title Contextual Bandits with Stochastic Experts
Authors Rajat Sen, Karthikeyan Shanmugam, Sanjay Shakkottai
Abstract We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem. In our problem setting, we assume access to a class of stochastic experts, where each expert is a conditional distribution over the arms given a context. We propose upper-confidence bound (UCB) algorithms for this problem, which employ two different importance sampling based estimators for the mean reward for each expert. Both these estimators leverage information leakage among the experts, thus using samples collected under all the experts to estimate the mean reward of any given expert. This leads to instance dependent regret bounds of $\mathcal{O}\left(\lambda(\pmb{\mu})\mathcal{M}\log T/\Delta \right)$, where $\lambda(\pmb{\mu})$ is a term that depends on the mean rewards of the experts, $\Delta$ is the smallest gap between the mean reward of the optimal expert and the rest, and $\mathcal{M}$ quantifies the information leakage among the experts. We show that under some assumptions $\lambda(\pmb{\mu})$ is typically $\mathcal{O}(\log N)$. We implement our algorithm with stochastic experts generated from cost-sensitive classification oracles and show superior empirical performance on real-world datasets, when compared to other state of the art contextual bandit algorithms.
Tasks Multi-Armed Bandits
Published 2018-02-23
URL http://arxiv.org/abs/1802.08737v1
PDF http://arxiv.org/pdf/1802.08737v1.pdf
PWC https://paperswithcode.com/paper/contextual-bandits-with-stochastic-experts
Repo https://github.com/rajatsen91/CB_StochasticExperts
Framework none
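
The estimators’ shared idea is importance sampling across experts: every logged (context, arm, reward) triple can score any expert by reweighting with the ratio of arm probabilities, which is how samples "leak" information between experts. A plain, unclipped version as a NumPy sketch; the paper’s two estimators add refinements beyond this:

```python
# Plain importance-sampling estimate of one expert's mean reward.
import numpy as np

def is_estimate(expert_probs, logged_probs, rewards):
    # expert_probs[i]: prob. the evaluated expert picks the logged arm
    # logged_probs[i]: prob. of that arm under the policy that logged it
    w = expert_probs / logged_probs
    return np.mean(w * rewards)

rng = np.random.default_rng(0)
logged = rng.uniform(0.1, 1.0, size=1000)
expert = rng.uniform(0.0, 1.0, size=1000)
rewards = rng.binomial(1, 0.3, size=1000)
print(is_estimate(expert, logged, rewards))
```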

Consistent Estimation of Propensity Score Functions with Oversampled Exposed Subjects

Title Consistent Estimation of Propensity Score Functions with Oversampled Exposed Subjects
Authors Sherri Rose
Abstract Observational cohort studies with oversampled exposed subjects are typically implemented to understand the causal effect of a rare exposure. Because the distribution of exposed subjects in the sample differs from the source population, estimation of a propensity score function (i.e., probability of exposure given baseline covariates) targets a nonparametrically nonidentifiable parameter. Consistent estimation of propensity score functions is an important component of various causal inference estimators, including double robust machine learning and inverse probability weighted estimators. This paper develops the use of the probability of exposure from the source population in a flexible computational implementation that can be used with any algorithm that allows observation weighting to produce consistent estimators of propensity score functions. Simulation studies and a hypothetical health policy intervention data analysis demonstrate low empirical bias and variance for these propensity score function estimators with observation weights in finite samples.
Tasks Causal Inference
Published 2018-05-20
URL http://arxiv.org/abs/1805.07684v2
PDF http://arxiv.org/pdf/1805.07684v2.pdf
PWC https://paperswithcode.com/paper/consistent-estimation-of-propensity-score
Repo https://github.com/sherrirose/ConditionalCohortSamples
Framework none
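
The weighting idea admits a short sketch: refit the propensity model with observation weights chosen so the weighted exposure prevalence matches the known source-population rate $q_0$ rather than the oversampled cohort rate. The weight construction below is the standard case-control style reweighting and is an assumption about the details, not the paper’s exact implementation:

```python
# Weighted propensity score fit that restores the source-population
# exposure rate q0 relative to the oversampled cohort rate.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_weighted_propensity(X, exposed, q0):
    q1 = exposed.mean()                           # oversampled exposure rate
    w = np.where(exposed == 1, q0 / q1, (1 - q0) / (1 - q1))
    return LogisticRegression().fit(X, exposed, sample_weight=w)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 4))
exposed = (rng.random(500) < 0.5).astype(int)     # oversampled to ~50%
model = fit_weighted_propensity(X, exposed, q0=0.05)
```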

Practical Window Setting Optimization for Medical Image Deep Learning

Title Practical Window Setting Optimization for Medical Image Deep Learning
Authors Hyunkwang Lee, Myeongchan Kim, Synho Do
Abstract The recent advancements in deep learning have allowed for numerous applications in computed tomography (CT), with potential to improve diagnostic accuracy, speed of interpretation, and clinical efficiency. However, the deep learning community has to date neglected window display settings - a key feature of clinical CT interpretation and an opportunity for additional optimization. Here we propose a window setting optimization (WSO) module that is fully trainable with convolutional neural networks (CNNs) to find optimal window settings for clinical performance. Our approach was inspired by the method commonly used by practicing radiologists to interpret CT images by adjusting window settings to increase the visualization of certain pathologies. Our approach provides optimal window ranges to enhance the conspicuity of abnormalities, and was used to enable performance enhancement for intracranial hemorrhage and urinary stone detection. On each task, the WSO model outperformed models trained over the full range of Hounsfield unit values in CT images, as well as images windowed with pre-defined settings. The WSO module can be readily applied to any analysis of CT images, and can be further generalized to tasks on other medical imaging modalities.
Tasks Computed Tomography (CT)
Published 2018-12-03
URL http://arxiv.org/abs/1812.00572v1
PDF http://arxiv.org/pdf/1812.00572v1.pdf
PWC https://paperswithcode.com/paper/practical-window-setting-optimization-for
Repo https://github.com/Synho/windows_optimization
Framework tf
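
The WSO module can be sketched as a trainable per-channel linear map over raw Hounsfield units followed by a saturating activation, so window center and width become learnable parameters. A minimal PyTorch sketch (the paper evaluates several activation variants; this shows a sigmoid one, and all sizes are illustrative):

```python
# Trainable windowing: a 1x1 conv (scale + bias per window) then sigmoid.
import torch
import torch.nn as nn

class WindowSetting(nn.Module):
    def __init__(self, n_windows=2):
        super().__init__()
        # each output channel learns one window's scale and offset
        self.conv = nn.Conv2d(1, n_windows, kernel_size=1)

    def forward(self, hu):        # hu: (batch, 1, H, W) raw Hounsfield units
        return torch.sigmoid(self.conv(hu))

wso = WindowSetting()
windowed = wso(torch.randn(1, 1, 64, 64) * 1000)  # feed into any CNN backbone
```

Because the module sits in front of the backbone and is trained end to end, the network can discover window settings suited to the target pathology instead of relying on pre-defined ones.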