Paper Group ANR 534
Papers in this group:
Language Modeling for Code-Switched Data: Challenges and Approaches
Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification
Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
Bayesian Alignments of Warped Multi-Output Gaussian Processes
Stabilizing Adversarial Nets With Prediction Methods
Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging
An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming
Towards Universal Semantic Tagging
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
Video Object Segmentation Without Temporal Information
The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback
EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features
Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets
Trimmed Density Ratio Estimation
Language Modeling for Code-Switched Data: Challenges and Approaches
Title | Language Modeling for Code-Switched Data: Challenges and Approaches |
Authors | Ganji Sreeram, Rohit Sinha |
Abstract | Lately, the problem of code-switching has gained a lot of attention and has emerged as an active area of research. In bilingual communities, speakers commonly embed the words and phrases of a non-native language into the syntax of the native language in their day-to-day communication. Although code-switching is a global phenomenon among multilingual communities, very limited acoustic and linguistic resources are available for it as yet. For developing effective speech-based applications, the ability of existing language technologies to deal with code-switched data cannot be overemphasized. Code-switching is broadly classified into two modes: inter-sentential and intra-sentential code-switching. In this work, we study the intra-sentential problem in the context of the code-switching language modeling task. The salient contributions of this paper include: (i) the creation of a Hindi-English code-switching text corpus by crawling a few blogging sites that educate about the usage of the Internet, (ii) the exploration of parts-of-speech features for more effective modeling of Hindi-English code-switched data by a monolingual language model (LM) trained on native (Hindi) language data, and (iii) the proposal of a novel textual factor, referred to as the code-switch factor (CS-factor), which allows the LM to predict code-switching instances. In the context of recognition of code-switched data, a substantial reduction in perplexity (PPL) is achieved with the use of POS factors, and the proposed CS-factor provides an independent as well as additive gain in PPL. |
Tasks | Language Modelling |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03541v1 |
http://arxiv.org/pdf/1711.03541v1.pdf | |
PWC | https://paperswithcode.com/paper/language-modeling-for-code-switched-data |
Repo | |
Framework | |
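To make the factor-based modeling idea concrete, here is a minimal, hypothetical sketch of a factor-augmented bigram LM with add-one smoothing: each token carries a textual factor (a binary code-switch tag standing in for the paper's POS and CS-factors), and perplexity is computed over the factored tokens. The toy corpus and the smoothing scheme are assumptions for illustration; this is not the authors' factored LM.

```python
# A minimal, hypothetical sketch of a factor-augmented bigram LM, illustrating how
# attaching a textual factor (here a binary code-switch tag "cs"/"nat") to each token
# changes the context used for prediction. It is NOT the authors' factored LM; the toy
# corpus and the add-one smoothing are assumptions for illustration only.
import math
from collections import defaultdict

def train_bigram(sentences):
    unigram, bigram = defaultdict(int), defaultdict(int)
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            unigram[prev] += 1
            bigram[(prev, cur)] += 1
    vocab = {w for s in sentences for w in s} | {"<s>", "</s>"}
    return unigram, bigram, len(vocab)

def perplexity(sentences, unigram, bigram, V):
    log_prob, n_tokens = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            p = (bigram[(prev, cur)] + 1) / (unigram[prev] + V)  # add-one smoothing
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

# Tokens annotated with a hypothetical code-switch factor: "nat" = native (Hindi),
# "cs" = embedded English word.
train = [["yeh|nat", "website|cs", "bahut|nat", "useful|cs", "hai|nat"],
         ["internet|cs", "ka|nat", "use|cs", "karna|nat", "seekho|nat"]]
test  = [["yeh|nat", "internet|cs", "useful|cs", "hai|nat"]]

uni, bi, V = train_bigram(train)
print("factored-token PPL:", round(perplexity(test, uni, bi, V), 2))
```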
Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification
Title | Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification |
Authors | John McKay, Vishal Monga, Raghu G. Raj |
Abstract | Sonar imaging has seen vast improvements over the last few decades due in part to advances in synthetic aperture Sonar (SAS). Sophisticated classification techniques can now be used in Sonar automatic target recognition (ATR) to locate mines and other threatening objects. Among the most promising of these methods is sparse reconstruction-based classification (SRC) which has shown an impressive resiliency to noise, blur, and occlusion. We present a coherent strategy for expanding upon SRC for Sonar ATR that retains SRC’s robustness while also being able to handle targets with diverse geometric arrangements, bothersome Rayleigh noise, and unavoidable background clutter. Our method, pose corrected sparsity (PCS), incorporates a novel interpretation of a spike and slab probability distribution towards use as a Bayesian prior for class-specific discrimination in combination with a dictionary learning scheme for localized patch extractions. Additionally, PCS offers the potential for anomaly detection in order to avoid false identifications of tested objects from outside the training set with no additional training required. Compelling results are shown using a database provided by the United States Naval Surface Warfare Center. |
Tasks | Anomaly Detection, Dictionary Learning |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08590v1 |
http://arxiv.org/pdf/1706.08590v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-sonar-atr-through-bayesian-pose |
Repo | |
Framework | |
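For reference, the sparse reconstruction-based classification (SRC) baseline that PCS extends can be sketched as follows: a test sample is sparse-coded over a dictionary of training samples and assigned to the class whose atoms give the smallest reconstruction residual. The synthetic data and scikit-learn's Lasso solver are assumptions; the pose correction, spike-and-slab prior, and patch-based dictionary learning of PCS are omitted.

```python
# A minimal sketch of plain sparse reconstruction-based classification (SRC), the
# baseline that PCS builds on; it omits the paper's pose correction, spike-and-slab
# prior, and patch-based dictionary learning. Synthetic data and the Lasso solver
# from scikit-learn are assumptions for illustration.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_feat, n_per_class, n_classes = 64, 20, 3

# Synthetic training dictionary: columns are (normalized) training samples per class.
centers = rng.normal(size=(n_classes, n_feat))
X_train = np.hstack([
    centers[c][:, None] + 0.3 * rng.normal(size=(n_feat, n_per_class))
    for c in range(n_classes)
])
X_train /= np.linalg.norm(X_train, axis=0, keepdims=True)
labels = np.repeat(np.arange(n_classes), n_per_class)

def src_classify(y, D, labels, alpha=0.01):
    """Sparse-code y over dictionary D and pick the class with smallest residual."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, y)
    x = coder.coef_
    residuals = []
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals.append(np.linalg.norm(y - D @ xc))
    return int(np.argmin(residuals))

y_test = centers[1] + 0.3 * rng.normal(size=n_feat)
y_test /= np.linalg.norm(y_test)
print("predicted class:", src_classify(y_test, X_train, labels))  # expected: 1
```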
Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond
Title | Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond |
Authors | Dong Huang, Chang-Dong Wang, Jian-Huang Lai, Chee-Keong Kwoh |
Abstract | The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, considerable recent efforts in ensemble clustering have been made by incorporating different subspace-based techniques. However, besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimilarity metrics. How to create and aggregate a large population of diversified metrics, and furthermore how to jointly investigate the multi-level diversity in the large populations of metrics, subspaces, and clusters in a unified framework, remains a surprisingly open problem in ensemble clustering. To tackle this problem, this paper proposes a novel multi-diversified ensemble clustering approach. In particular, we create a large number of diversified metrics by randomizing a scaled exponential similarity kernel, which are then coupled with random subspaces to form a large set of metric-subspace pairs. Based on the similarity matrices derived from these metric-subspace pairs, an ensemble of diversified base clusterings can thereby be constructed. Thereafter, an entropy-based criterion is adopted to explore the cluster-wise diversity in ensembles. By jointly exploiting the multi-level diversity in metrics, subspaces, and clusters, three specific ensemble clustering algorithms are finally presented. Experimental results on 30 real-world high-dimensional datasets (including 18 cancer gene expression datasets and 12 image/speech datasets) demonstrate the superiority of the proposed algorithms over the state of the art. |
Tasks | |
Published | 2017-10-09 |
URL | https://arxiv.org/abs/1710.03113v3 |
https://arxiv.org/pdf/1710.03113v3.pdf | |
PWC | https://paperswithcode.com/paper/toward-multi-diversified-ensemble-clustering |
Repo | |
Framework | |
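A rough sketch of the metric/subspace diversification idea follows: base clusterings are produced from random feature subspaces combined with scaled exponential similarity kernels of randomized bandwidth, and then aggregated through a co-association matrix. The kernel form, the parameter ranges, and the spectral consensus step are simplifying assumptions, not the paper's entropy-weighted algorithms.

```python
# A rough sketch of the metric/subspace diversification idea: many base clusterings are
# produced from random feature subspaces combined with exponential similarity kernels of
# randomized scale, then aggregated through a co-association matrix. The kernel form,
# parameter ranges, and the spectral consensus step are simplifying assumptions, not the
# paper's entropy-weighted algorithms.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

X, _ = make_blobs(n_samples=150, n_features=50, centers=4, random_state=0)
rng = np.random.default_rng(0)
n, n_base, k = X.shape[0], 20, 4
coassoc = np.zeros((n, n))

for _ in range(n_base):
    feat = rng.choice(X.shape[1], size=X.shape[1] // 2, replace=False)  # random subspace
    D = pairwise_distances(X[:, feat])
    sigma = np.median(D) * rng.uniform(0.5, 2.0)       # randomized kernel scale
    S = np.exp(-(D ** 2) / (2 * sigma ** 2))           # scaled exponential similarity
    base = SpectralClustering(n_clusters=k, affinity="precomputed",
                              random_state=0).fit_predict(S)
    coassoc += (base[:, None] == base[None, :])        # accumulate co-cluster votes

coassoc /= n_base
consensus = SpectralClustering(n_clusters=k, affinity="precomputed",
                               random_state=0).fit_predict(coassoc)
print("consensus cluster sizes:", np.bincount(consensus))
```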
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
Title | Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank |
Authors | Liang Zhao, Siyu Liao, Yanzhi Wang, Zhe Li, Jian Tang, Victor Pan, Bo Yuan |
Abstract | Recently, low displacement rank (LDR) matrices, or so-called structured matrices, have been proposed to compress large-scale neural networks. Empirical results have shown that neural networks with LDR weight matrices, referred to as LDR neural networks, can achieve a significant reduction in space and computational complexity while retaining high accuracy. We formally study LDR matrices in deep learning. First, we prove the universal approximation property of LDR neural networks under a mild condition on the displacement operators. We then show that the error bounds of LDR neural networks are as efficient as those of general neural networks, for both single-layer and multi-layer structures. Finally, we propose a back-propagation based training algorithm for general LDR neural networks. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00144v4 |
http://arxiv.org/pdf/1703.00144v4.pdf | |
PWC | https://paperswithcode.com/paper/theoretical-properties-for-neural-networks |
Repo | |
Framework | |
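The efficiency argument is easiest to see for a concrete LDR structure. The sketch below uses a circulant weight matrix, one common low-displacement-rank family: the layer stores only n parameters instead of n^2 and applies the weights with FFTs in O(n log n). The sizes are illustrative assumptions; the paper's results cover general displacement operators, not just the circulant case.

```python
# A minimal sketch of one common LDR structure: a layer whose weight matrix is circulant,
# so the matrix-vector product can be done with FFTs in O(n log n) and only n parameters
# are stored. The sizes and the single-layer setup are illustrative assumptions; the
# paper's results cover general low-displacement-rank operators, not just this case.
import numpy as np

rng = np.random.default_rng(0)
n = 8
c = rng.normal(size=n)          # first column of the circulant weight matrix
x = rng.normal(size=n)

# Dense reference: build the full circulant matrix explicitly (column j is c rolled by j).
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
y_dense = C @ x

# FFT-based product: a circulant matvec is a circular convolution,
# i.e. the inverse FFT of the elementwise product of spectra.
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

print(np.allclose(y_dense, y_fft))  # True: same layer output, n params instead of n^2
```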
Bayesian Alignments of Warped Multi-Output Gaussian Processes
Title | Bayesian Alignments of Warped Multi-Output Gaussian Processes |
Authors | Markus Kaiser, Clemens Otte, Thomas Runkler, Carl Henrik Ek |
Abstract | We propose a novel Bayesian approach to modelling nonlinear alignments of time series based on latent shared information. We apply the method to the real-world problem of finding common structure in the sensor data of wind turbines introduced by the underlying latent and turbulent wind field. The proposed model allows for both arbitrary alignments of the inputs and non-parametric output warpings to transform the observations. This gives rise to multiple deep Gaussian process models connected via latent generating processes. We present an efficient variational approximation based on nested variational compression and show how the model can be used to extract shared information between dependent time series, recovering an interpretable functional decomposition of the learning problem. We show results for an artificial data set and real-world data of two wind turbines. |
Tasks | Gaussian Processes, Time Series |
Published | 2017-10-08 |
URL | http://arxiv.org/abs/1710.02766v3 |
http://arxiv.org/pdf/1710.02766v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-alignments-of-warped-multi-output |
Repo | |
Framework | |
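A generative sketch of the model structure may help: each output is an output warping g_d of a shared latent function f evaluated at aligned inputs a_d(t). The kernel, alignments, and warpings below are hand-picked for illustration; the paper instead places GP priors on the alignments and warpings and fits them with nested variational compression.

```python
# A generative sketch of the model structure described above: each output is a warping
# g_d of a shared latent function f evaluated at an aligned input a_d(t). Everything
# here (kernel, alignments, warpings) is hand-picked for illustration; the paper instead
# places GP priors on the alignments and warpings and does variational inference.
import numpy as np

def rbf_kernel(t, lengthscale=0.4, var=1.0):
    d = t[:, None] - t[None, :]
    return var * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)

# Shared latent function f ~ GP(0, k), drawn once.
K = rbf_kernel(t) + 1e-8 * np.eye(len(t))
f = np.linalg.cholesky(K) @ rng.normal(size=len(t))

# Hypothetical alignments a_d and output warpings g_d for two "turbines".
a1, a2 = t, np.clip(t + 0.1 * np.sin(2 * np.pi * t), 0, 1)   # input alignments
g1, g2 = (lambda x: x), (lambda x: np.tanh(1.5 * x))          # output warpings

def interp_f(a):  # evaluate the shared function at aligned inputs
    return np.interp(a, t, f)

y1 = g1(interp_f(a1)) + 0.05 * rng.normal(size=len(t))
y2 = g2(interp_f(a2)) + 0.05 * rng.normal(size=len(t))
print("correlation between aligned outputs:", round(np.corrcoef(y1, y2)[0, 1], 3))
```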
Stabilizing Adversarial Nets With Prediction Methods
Title | Stabilizing Adversarial Nets With Prediction Methods |
Authors | Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, Tom Goldstein |
Abstract | Adversarial neural networks solve many important problems in data science, but are notoriously difficult to train. These difficulties come from the fact that optimal weights for adversarial nets correspond to saddle points, and not minimizers, of the loss function. The alternating stochastic gradient methods typically used for such problems do not reliably converge to saddle points, and when convergence does happen it is often highly sensitive to learning rates. We propose a simple modification of stochastic gradient descent that stabilizes adversarial networks. We show, both in theory and practice, that the proposed method reliably converges to saddle points, and is stable with a wider range of training parameters than a non-prediction method. This makes adversarial networks less likely to “collapse,” and enables faster training with larger learning rates. |
Tasks | |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07364v3 |
http://arxiv.org/pdf/1705.07364v3.pdf | |
PWC | https://paperswithcode.com/paper/stabilizing-adversarial-nets-with-prediction |
Repo | |
Framework | |
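The prediction step itself is simple enough to show on a toy saddle-point problem. In the sketch below, for f(x, y) = xy (saddle point at the origin), plain alternating gradient descent/ascent merely circles the saddle, while taking the ascent step against an extrapolated ("predicted") copy of x drives the iterates toward it. Step sizes and iteration counts are arbitrary illustrative choices.

```python
# A toy sketch of the prediction step on the bilinear saddle problem f(x, y) = x * y,
# whose saddle point is (0, 0). Plain alternating gradient descent/ascent circles the
# saddle without converging here, while updating y against a "predicted" (extrapolated)
# copy of x drives the iterates toward it. Step sizes and iteration counts are arbitrary.
import numpy as np

def run(prediction, steps=200, lr=0.2):
    x, y = 1.0, 1.0
    for _ in range(steps):
        x_new = x - lr * y              # descent step on x (grad_x f = y)
        # Prediction: extrapolate x before the ascent step on y.
        x_bar = x_new + (x_new - x) if prediction else x_new
        y = y + lr * x_bar              # ascent step on y (grad_y f = x)
        x = x_new
    return np.hypot(x, y)               # distance from the saddle point (0, 0)

print("plain alternating SGD :", round(run(prediction=False), 4))
print("with prediction step  :", round(run(prediction=True), 4))
```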
Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging
Title | Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging |
Authors | Jianyu Lin, Neil T. Clancy, Yang Hu, Ji Qi, Taran Tatla, Danail Stoyanov, Lena Maier-Hein, Daniel S. Elson |
Abstract | Intra-operative measurements of tissue shape and multi/hyperspectral information have the potential to provide surgical guidance and decision making support. We report an optical probe based system that combines sparse hyperspectral measurements and spectrally-encoded structured lighting (SL) for surface measurements. The system provides informative signals for navigation with a surgical interface. By rapidly switching between SL and white light (WL) modes, SL information is combined with structure-from-motion (SfM) from white light images, based on SURF feature detection and Lucas-Kanade (LK) optical flow, to provide quasi-dense surface shape reconstruction with known scale in real time. Furthermore, “super-spectral-resolution” was realized, whereby the RGB images and sparse hyperspectral data were integrated to recover dense pixel-level hyperspectral stacks, using convolutional neural networks to upscale the wavelength dimension. Validation and demonstration of this system are reported on ex vivo and in vivo animal and human experiments. |
Tasks | Decision Making, Optical Flow Estimation |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.06081v2 |
http://arxiv.org/pdf/1706.06081v2.pdf | |
PWC | https://paperswithcode.com/paper/endoscopic-depth-measurement-and-super |
Repo | |
Framework | |
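The white-light tracking component can be sketched with OpenCV: corners detected in one frame are tracked into the next with pyramidal Lucas-Kanade optical flow. Shi-Tomasi corners stand in here for SURF keypoints (SURF lives in OpenCV's non-free contrib module), and the two synthetic frames are placeholders; in the paper these tracks feed SfM and are combined with structured lighting to fix the scale.

```python
# A small sketch of the sparse feature tracking used between white-light frames:
# corners are detected in one frame and tracked into the next with pyramidal
# Lucas-Kanade optical flow. OpenCV's Shi-Tomasi detector stands in for SURF here,
# and the two synthetic frames are assumptions; the paper combines such tracks with
# structured lighting to recover scale.
import cv2
import numpy as np

# Two synthetic grayscale frames: a bright square translated by (5, 3) pixels.
prev_frame = np.zeros((240, 320), dtype=np.uint8)
next_frame = np.zeros((240, 320), dtype=np.uint8)
cv2.rectangle(prev_frame, (100, 100), (160, 160), 255, -1)
cv2.rectangle(next_frame, (105, 103), (165, 163), 255, -1)

# Detect corner features in the first frame (stand-in for SURF keypoints).
p0 = cv2.goodFeaturesToTrack(prev_frame, maxCorners=50, qualityLevel=0.01, minDistance=5)

# Track them into the second frame with pyramidal Lucas-Kanade.
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_frame, next_frame, p0, None,
                                           winSize=(21, 21), maxLevel=3)

good_old = p0[status.flatten() == 1].reshape(-1, 2)
good_new = p1[status.flatten() == 1].reshape(-1, 2)
print("mean estimated displacement:", np.round((good_new - good_old).mean(axis=0), 2))
```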
An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming
Title | An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming |
Authors | Yiyang Wang, Risheng Liu, Xiaoliang Song, Zhixun Su |
Abstract | In recent years, numerous vision and learning tasks have been (re)formulated as nonconvex and nonsmooth programming problems (NNPs). Although some algorithms have been proposed for particular problems, designing fast and flexible optimization schemes with theoretical guarantees is a challenging task for general NNPs. It has been observed that performing inexact inner iterations often benefits particular applications on a case-by-case basis, but the resulting convergence behaviors are still unclear. Motivated by these practical experiences, this paper designs a novel algorithmic framework, named inexact proximal alternating direction method (IPAD), for solving general NNPs. We demonstrate that any numerical algorithm can be incorporated into IPAD for solving subproblems, and that the convergence of the resulting hybrid schemes can be consistently guaranteed by a series of simple error conditions. Beyond the theoretical guarantees, numerical experiments on both synthetic and real-world data further demonstrate the superiority and flexibility of our IPAD framework for practical use. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08627v3 |
http://arxiv.org/pdf/1702.08627v3.pdf | |
PWC | https://paperswithcode.com/paper/an-optimization-framework-with-flexible |
Repo | |
Framework | |
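The inner/outer structure of inexact alternating schemes can be illustrated on a simple composite objective: each block update is a proximal subproblem that is solved only approximately, here by a fixed small number of proximal-gradient steps. The toy objective 0.5||Ax + By - b||^2 + lam||x||_1 + 0.5 mu||y||^2 is convex and chosen purely for clarity; it shows the pattern, not the paper's nonconvex convergence conditions or error criteria.

```python
# A schematic sketch of the inexact alternating idea: each block update is a proximal
# subproblem that is only solved approximately, by a fixed small number of inner
# (proximal-)gradient steps, rather than to high precision. The toy objective
# 0.5*||A x + B y - b||^2 + lam*||x||_1 + 0.5*mu*||y||^2 is chosen for simplicity and is
# convex; it illustrates the inner/outer structure, not the paper's nonconvex analysis.
import numpy as np

rng = np.random.default_rng(0)
m, n = 40, 30
A, B = rng.normal(size=(m, n)), rng.normal(size=(m, n))
b = rng.normal(size=m)
lam, mu = 0.1, 0.5

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(x, y):
    r = A @ x + B @ y - b
    return 0.5 * r @ r + lam * np.abs(x).sum() + 0.5 * mu * y @ y

x, y = np.zeros(n), np.zeros(n)
Lx = np.linalg.norm(A, 2) ** 2          # Lipschitz constants of the smooth parts
Ly = np.linalg.norm(B, 2) ** 2 + mu

for outer in range(30):
    # Inexact x-subproblem: a few proximal-gradient (ISTA) steps only.
    for _ in range(3):
        grad_x = A.T @ (A @ x + B @ y - b)
        x = soft_threshold(x - grad_x / Lx, lam / Lx)
    # Inexact y-subproblem: a few gradient steps on the smooth-in-y part.
    for _ in range(3):
        grad_y = B.T @ (A @ x + B @ y - b) + mu * y
        y = y - grad_y / Ly

print("final objective:", round(objective(x, y), 4))
```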
Towards Universal Semantic Tagging
Title | Towards Universal Semantic Tagging |
Authors | Lasha Abzianidze, Johan Bos |
Abstract | The paper proposes the task of universal semantic tagging: tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of the semantic tagset and show that (a) the tags provide semantically fine-grained information, and (b) they are suitable for cross-lingual semantic parsing. An application of the semantic tagging in the Parallel Meaning Bank supports both of these points, as the tags contribute to formal lexical semantics and their cross-lingual projection. As part of the application, we annotate a small corpus with the semantic tags and present a new baseline result for universal semantic tagging. |
Tasks | Semantic Parsing |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10381v1 |
http://arxiv.org/pdf/1709.10381v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-universal-semantic-tagging |
Repo | |
Framework | |
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
Title | A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue |
Authors | Mihail Eric, Christopher D. Manning |
Abstract | Task-oriented dialogue focuses on conversational agents that participate in user-initiated dialogues on domain-specific topics. In contrast to chatbots, which simply seek to sustain open-ended meaningful discourse, existing task-oriented agents usually explicitly model user intent and belief states. This paper examines bypassing such an explicit representation by depending on a latent neural embedding of state and learning selective attention to dialogue history together with copying to incorporate relevant prior context. We complement recent work by showing the effectiveness of simple sequence-to-sequence neural architectures with a copy mechanism. Our model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04024v3 |
http://arxiv.org/pdf/1701.04024v3.pdf | |
PWC | https://paperswithcode.com/paper/a-copy-augmented-sequence-to-sequence |
Repo | |
Framework | |
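The copy step can be sketched in a few lines of PyTorch using a pointer-generator-style mixture, a common formulation of copying: the decoder's vocabulary distribution is blended with the attention distribution scattered onto the source tokens' vocabulary ids, gated by a learned scalar. The tensor shapes and tiny vocabulary are illustrative, and the encoder/decoder producing the logits, attention, and gate is omitted; the paper's exact parameterization may differ.

```python
# A minimal sketch of the copy step only: the decoder's vocabulary distribution is mixed
# with an attention distribution scattered back onto the source tokens' vocabulary ids,
# gated by a learned p_gen. Tensor shapes and the tiny vocabulary are illustrative
# assumptions; the encoder/decoder producing `vocab_logits`, `attention`, and `p_gen`
# is omitted, and this pointer-generator-style mixture is a common formulation of
# copying rather than the paper's exact parameterization.
import torch

vocab_size, src_len = 10, 4
vocab_logits = torch.randn(vocab_size)                    # decoder's generation scores
attention = torch.softmax(torch.randn(src_len), dim=0)    # attention over source tokens
src_ids = torch.tensor([3, 7, 7, 2])                      # vocabulary ids of source tokens
p_gen = torch.sigmoid(torch.randn(()))                    # mixing gate in (0, 1)

p_vocab = torch.softmax(vocab_logits, dim=0)

# Scatter the attention mass onto the vocabulary positions of the source tokens
# (repeated source tokens accumulate their attention weights).
p_copy = torch.zeros(vocab_size).scatter_add_(0, src_ids, attention)

p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
print("sums to one:", torch.isclose(p_final.sum(), torch.tensor(1.0)).item())
print("most likely next token id:", int(p_final.argmax()))
```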
Video Object Segmentation Without Temporal Information
Title | Video Object Segmentation Without Temporal Information |
Authors | Kevis-Kokitsi Maninis, Sergi Caelles, Yuhua Chen, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc Van Gool |
Abstract | Video object segmentation, and video processing in general, has historically been dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded or some frames are missing in a sequence, the results of these methods can deteriorate significantly, or they may not produce any result at all. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS-S), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent video segmentation databases, which show that OSVOS-S is both the fastest and the most accurate method in the state of the art. |
Tasks | Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06031v2 |
http://arxiv.org/pdf/1709.06031v2.pdf | |
PWC | https://paperswithcode.com/paper/video-object-segmentation-without-temporal |
Repo | |
Framework | |
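The one-shot step can be sketched as fine-tuning a pretrained fully-convolutional network on the first frame and its mask, then segmenting every other frame independently. torchvision's fcn_resnet50 stands in for the paper's architecture (assuming a recent torchvision; the pretrained weights are downloaded on first use), and the random frame/mask tensors are placeholders; the parent-network training and the instance-level semantic cues of OSVOS-S are omitted.

```python
# A schematic sketch of the one-shot step: a pretrained fully-convolutional segmentation
# network is fine-tuned on just the first frame and its mask, then applied to every
# other frame independently (no temporal information). torchvision's fcn_resnet50 stands
# in for the paper's architecture, and the random "frame"/"mask" tensors are placeholders
# for real data; the parent-network training and the instance-level semantic cues of
# OSVOS-S are omitted.
import torch
import torch.nn as nn
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(weights="DEFAULT")
model.classifier[4] = nn.Conv2d(512, 1, kernel_size=1)   # binary foreground head

frame0 = torch.rand(1, 3, 240, 320)                      # placeholder first frame
mask0 = (torch.rand(1, 1, 240, 320) > 0.5).float()       # placeholder first-frame mask

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

model.train()
for step in range(20):                                   # one-shot fine-tuning loop
    optimizer.zero_grad()
    logits = model(frame0)["out"]
    loss = criterion(logits, mask0)
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    frame_t = torch.rand(1, 3, 240, 320)                 # any later frame, processed alone
    pred_mask = torch.sigmoid(model(frame_t)["out"]) > 0.5
print("predicted foreground pixels:", int(pred_mask.sum()))
```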
The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback
Title | The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback |
Authors | Amos Korman, Yoav Rodeh |
Abstract | We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of possibly dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of $d$ doors that are initially closed, and the aim is to open all of them as fast as possible. To open a door, the algorithm knocks on it, and it might open or not according to some probability distribution. This distribution may depend on which other doors are currently open, as well as on which other doors were open during each of the previous knocks on that door. The algorithm aims to minimize the expected time until all doors open. Crucially, it must act at any time without knowing whether or which other doors have already opened. In this work, we focus on scenarios where dependencies between doors are both positively correlated and acyclic. The fundamental distribution of a door describes the probability that it opens in the best of conditions (with respect to other doors being open or closed). We show that if, in two configurations of $d$ doors, corresponding doors share the same fundamental distribution, then these configurations have the same optimal running time up to a universal constant, no matter what the dependencies between doors and the distributions are. We also identify algorithms that are optimal up to a universal constant factor. For the case in which all doors share the same fundamental distribution, we additionally provide a simpler algorithm and a formula to calculate its running time. We furthermore analyse the price of lacking feedback for several configurations governed by standard fundamental distributions. In particular, we show that the price is logarithmic in $d$ for memoryless doors, but can potentially grow to be linear in $d$ for other distributions. We then turn our attention to investigating precise bounds. Even for the case of two doors, identifying the optimal sequence is an intriguing combinatorial question. Here, we study the case of two cascading memoryless doors. That is, the first door opens on each knock independently with probability $p_1$. The second door can only open if the first door is open, in which case it will open on each knock independently with probability $p_2$. We solve this problem almost completely by identifying algorithms that are optimal up to an additive term of 1. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06096v1 |
http://arxiv.org/pdf/1704.06096v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dependent-doors-problem-an-investigation |
Repo | |
Framework | |
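The two cascading memoryless doors are easy to simulate. The sketch below estimates the expected opening time of two feedback-free knocking schedules (strict alternation versus a 2:1 bias toward the first door); the schedules and parameters are arbitrary examples, not the near-optimal sequences derived in the paper, and only illustrate that the schedule matters when no feedback is available.

```python
# A small simulation of the two cascading memoryless doors: door 1 opens on each knock
# with probability p1; door 2 can only open once door 1 is open, and then opens on each
# knock with probability p2. The knocking sequences compared below (strict alternation
# vs. a 2:1 bias toward door 1) are arbitrary examples, not the near-optimal sequences
# derived in the paper; the point is only that the schedule matters without feedback.
import random

def expected_time(sequence_fn, p1, p2, trials=100_000, max_knocks=10_000):
    total = 0
    for _ in range(trials):
        open1 = open2 = False
        t = 0
        while not open2 and t < max_knocks:
            door = sequence_fn(t)          # which door to knock on at step t (no feedback)
            t += 1
            if door == 1 and not open1:
                open1 = random.random() < p1
            elif door == 2 and open1 and not open2:
                open2 = random.random() < p2
        total += t
    return total / trials

alternate = lambda t: 1 if t % 2 == 0 else 2          # 1, 2, 1, 2, ...
biased = lambda t: 1 if t % 3 != 2 else 2             # 1, 1, 2, 1, 1, 2, ...

p1, p2 = 0.2, 0.5
print("alternating schedule  :", round(expected_time(alternate, p1, p2), 2))
print("door-1-biased schedule:", round(expected_time(biased, p1, p2), 2))
```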
EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features
Title | EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features |
Authors | Dongrui Wu, Brent J. Lance, Vernon J. Lawhern, Stephen Gordon, Tzyy-Ping Jung, Chin-Teng Lin |
Abstract | Riemannian geometry has been successfully used in many brain-computer interface (BCI) classification problems and has demonstrated superior performance. In this paper, for the first time, it is applied to BCI regression problems, an important category of BCI applications. More specifically, we propose a new feature extraction approach for electroencephalogram (EEG) based BCI regression problems: a spatial filter is first used to increase the signal quality of the EEG trials and also to reduce the dimensionality of the covariance matrices, and then Riemannian tangent space features are extracted. We validate the performance of the proposed approach in reaction time estimation from EEG signals measured in a large-scale sustained-attention psychomotor vigilance task, and show that, compared with the traditional power band features, the tangent space features can reduce the root mean square estimation error by 4.30-8.30% and increase the estimation correlation coefficient by 6.59-11.13%. |
Tasks | EEG |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08533v1 |
http://arxiv.org/pdf/1704.08533v1.pdf | |
PWC | https://paperswithcode.com/paper/eeg-based-user-reaction-time-estimation-using |
Repo | |
Framework | |
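The tangent-space feature pipeline can be sketched with the pyriemann and scikit-learn packages (assumed installed): per-trial covariance matrices are mapped to the Riemannian tangent space and fed to a ridge regressor that predicts reaction times. The random arrays stand in for real EEG trials, and the spatial-filtering step described in the abstract is omitted.

```python
# A minimal sketch of the feature pipeline: covariance matrices are estimated per EEG
# trial and mapped into the Riemannian tangent space, then fed to a ridge regressor to
# predict reaction times. The random data stands in for real EEG, and the spatial
# filtering / dimensionality-reduction step described in the paper is omitted. Assumes
# the pyriemann and scikit-learn packages are installed.
import numpy as np
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 100, 8, 256
X = rng.standard_normal((n_trials, n_channels, n_samples))   # placeholder EEG trials
rt = rng.uniform(0.3, 1.5, size=n_trials)                     # placeholder reaction times

model = make_pipeline(
    Covariances(estimator="lwf"),   # shrinkage covariance per trial
    TangentSpace(metric="riemann"), # project SPD matrices to the tangent space
    Ridge(alpha=1.0),
)
model.fit(X[:80], rt[:80])
pred = model.predict(X[80:])
rmse = np.sqrt(np.mean((pred - rt[80:]) ** 2))
print("placeholder RMSE:", round(rmse, 3))
```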
Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets
Title | Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets |
Authors | Maciej Wielgosz, Andrzej Skoczeń, Matej Mertik |
Abstract | This paper presents a model based on the LSTM and GRU deep learning algorithms for facilitating anomaly detection in Large Hadron Collider superconducting magnets. We used high-resolution data available in the Post Mortem database to train a set of models and chose the best possible set of their hyper-parameters. Using a deep learning approach allowed us to examine a vast body of data and extract the fragments that require further expert examination and are regarded as anomalies. The presented method does not require tedious manual threshold setting and operator attention at the stage of system setup. Instead, an automatic approach is proposed, which according to our experiments achieves an accuracy of 99%. This is reached for the largest dataset of 302 MB and the following network architecture: a single-layer LSTM with 128 cells, 20 epochs of training, look_back=16, look_ahead=128, grid=100, and the Adam optimizer. All experiments were run on an Nvidia Tesla K80 GPU. |
Tasks | Anomaly Detection, Time Series |
Published | 2017-02-02 |
URL | http://arxiv.org/abs/1702.00833v1 |
http://arxiv.org/pdf/1702.00833v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-anomaly |
Repo | |
Framework | |
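A compact sketch of forecast-based anomaly detection in this spirit: an LSTM learns to predict a look-ahead window from a look-back window, and time steps with unusually large prediction error are flagged. The synthetic sine signal, the small window sizes, and the 5-sigma threshold are illustrative assumptions, not the paper's magnet data or exact settings.

```python
# A compact sketch of forecast-based anomaly detection in the spirit described above:
# an LSTM learns to predict the next window of a signal from a look-back window, and
# time steps with unusually large prediction error are flagged as anomalies. The
# synthetic sine signal, the small window sizes, and the error threshold are
# illustrative assumptions, not the paper's 302 MB magnet dataset or exact settings.
import numpy as np
import tensorflow as tf

look_back, look_ahead = 16, 8
t = np.arange(3000)
signal = np.sin(0.05 * t) + 0.05 * np.random.default_rng(0).standard_normal(t.size)
signal[2500:2520] += 2.0                      # injected anomaly near the end

def make_windows(x, lb, la):
    X, Y = [], []
    for i in range(len(x) - lb - la):
        X.append(x[i:i + lb])
        Y.append(x[i + lb:i + lb + la])
    return np.array(X)[..., None], np.array(Y)

X, Y = make_windows(signal, look_back, look_ahead)
split = 2000
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(look_back, 1)),
    tf.keras.layers.Dense(look_ahead),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], Y[:split], epochs=5, batch_size=64, verbose=0)

errors = np.mean((model.predict(X, verbose=0) - Y) ** 2, axis=1)
threshold = errors[:split].mean() + 5 * errors[:split].std()
print("flagged window indices:", np.where(errors > threshold)[0][:10])
```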
Trimmed Density Ratio Estimation
Title | Trimmed Density Ratio Estimation |
Authors | Song Liu, Akiko Takeda, Taiji Suzuki, Kenji Fukumizu |
Abstract | Density ratio estimation is a vital tool in both the machine learning and the statistics communities. However, due to the unbounded nature of the density ratio, the estimation procedure can be vulnerable to corrupted data points, which often push the estimated ratio toward infinity. In this paper, we present a robust estimator which automatically identifies and trims outliers. The proposed estimator has a convex formulation, and the global optimum can be obtained via subgradient descent. We analyze the parameter estimation error of this estimator under high-dimensional settings. Experiments are conducted to verify the effectiveness of the estimator. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03216v3 |
http://arxiv.org/pdf/1703.03216v3.pdf | |
PWC | https://paperswithcode.com/paper/trimmed-density-ratio-estimation |
Repo | |
Framework | |
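The trimming idea can be illustrated with a simplified stand-in estimator: a logistic regression separating numerator from denominator samples yields density-ratio estimates via its odds, and the numerator points with the most extreme estimated ratios are discarded before refitting. This classification-based estimator and the fixed 10% trimming fraction are assumptions for illustration, not the paper's convex trimmed estimator or its subgradient solver.

```python
# A simplified sketch of the trimming idea using a probabilistic-classification density
# ratio estimator: a logistic regression separating numerator from denominator samples
# yields ratio estimates p(x)/q(x) via its odds, and the numerator points with the most
# extreme estimated ratios are discarded before refitting. This classification-based
# estimator and the fixed 10% trimming fraction are illustrative stand-ins, not the
# paper's convex trimmed estimator or its subgradient solver.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xp = rng.normal(loc=0.0, scale=1.0, size=(500, 2))      # numerator samples ~ p
Xp[:25] += 8.0                                          # corrupted outliers in p
Xq = rng.normal(loc=0.5, scale=1.0, size=(500, 2))      # denominator samples ~ q

def fit_ratio(Xp, Xq):
    """Logistic regression on (p vs q) labels; its odds approximate the density ratio."""
    X = np.vstack([Xp, Xq])
    y = np.r_[np.ones(len(Xp)), np.zeros(len(Xq))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    prob = clf.predict_proba(Xp)[:, 1]
    return prob / np.clip(1.0 - prob, 1e-12, None)       # r(x) ~ P(p|x) / P(q|x)

ratios = fit_ratio(Xp, Xq)
keep = ratios <= np.quantile(ratios, 0.90)               # trim the top 10% of ratios
ratios_trimmed = fit_ratio(Xp[keep], Xq)

print("max ratio before trimming:", round(ratios.max(), 1))
print("max ratio after trimming :", round(ratios_trimmed.max(), 1))
```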