Paper Group ANR 534
Papers in this group:
Language Modeling for Code-Switched Data: Challenges and Approaches
Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification
Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
Bayesian Alignments of Warped Multi-Output Gaussian Processes
Stabilizing Adversarial Nets With Prediction Methods
Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging
An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming
Towards Universal Semantic Tagging
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
Video Object Segmentation Without Temporal Information
The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback
EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features
Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets
Trimmed Density Ratio Estimation
Language Modeling for Code-Switched Data: Challenges and Approaches
Title | Language Modeling for Code-Switched Data: Challenges and Approaches |
Authors | Ganji Sreeram, Rohit Sinha |
Abstract | Lately, the problem of code-switching has gained a lot of attention and has emerged as an active area of research. In bilingual communities, speakers commonly embed the words and phrases of a non-native language into the syntax of the native language in their day-to-day communication. Although code-switching is a global phenomenon among multilingual communities, very limited acoustic and linguistic resources are available for it as yet. For developing effective speech-based applications, the ability of existing language technologies to deal with code-switched data cannot be overemphasized. Code-switching is broadly classified into two modes: inter-sentential and intra-sentential code-switching. In this work, we study the intra-sentential problem in the context of the code-switching language modeling task. The salient contributions of this paper include: (i) the creation of a Hindi-English code-switching text corpus by crawling a few blogging sites that educate about the usage of the Internet, (ii) the exploration of parts-of-speech features for more effective modeling of Hindi-English code-switched data by a monolingual language model (LM) trained on native (Hindi) language data, and (iii) the proposal of a novel textual factor, referred to as the code-switch factor (CS-factor), which allows the LM to predict code-switching instances. In the context of recognition of code-switched data, a substantial reduction in perplexity (PPL) is achieved with the use of POS factors, and the proposed CS-factor provides an independent as well as additive gain in PPL. |
Tasks | Language Modelling |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03541v1 |
http://arxiv.org/pdf/1711.03541v1.pdf | |
PWC | https://paperswithcode.com/paper/language-modeling-for-code-switched-data |
Repo | |
Framework | |
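To make the factor-based modeling idea concrete, here is a minimal, hypothetical sketch of a factor-augmented bigram LM with add-one smoothing: each token carries a textual factor (a binary code-switch tag standing in for the paper's POS and CS-factors), and perplexity is computed over the factored tokens. The toy corpus and the smoothing scheme are assumptions for illustration; this is not the authors' factored LM.

```python
# A minimal, hypothetical sketch of a factor-augmented bigram LM, illustrating how
# attaching a textual factor (here a binary code-switch tag "cs"/"nat") to each token
# changes the context used for prediction. It is NOT the authors' factored LM; the toy
# corpus and the add-one smoothing are assumptions for illustration only.
import math
from collections import defaultdict

def train_bigram(sentences):
    unigram, bigram = defaultdict(int), defaultdict(int)
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            unigram[prev] += 1
            bigram[(prev, cur)] += 1
    vocab = {w for s in sentences for w in s} | {"<s>", "</s>"}
    return unigram, bigram, len(vocab)

def perplexity(sentences, unigram, bigram, V):
    log_prob, n_tokens = 0.0, 0
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            p = (bigram[(prev, cur)] + 1) / (unigram[prev] + V)  # add-one smoothing
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)

# Tokens annotated with a hypothetical code-switch factor: "nat" = native (Hindi),
# "cs" = embedded English word.
train = [["yeh|nat", "website|cs", "bahut|nat", "useful|cs", "hai|nat"],
         ["internet|cs", "ka|nat", "use|cs", "karna|nat", "seekho|nat"]]
test  = [["yeh|nat", "internet|cs", "useful|cs", "hai|nat"]]

uni, bi, V = train_bigram(train)
print("factored-token PPL:", round(perplexity(test, uni, bi, V), 2))
```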
Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification
Title | Robust Sonar ATR Through Bayesian Pose Corrected Sparse Classification |
Authors | John McKay, Vishal Monga, Raghu G. Raj |
Abstract | Sonar imaging has seen vast improvements over the last few decades due in part to advances in synthetic aperture Sonar (SAS). Sophisticated classification techniques can now be used in Sonar automatic target recognition (ATR) to locate mines and other threatening objects. Among the most promising of these methods is sparse reconstruction-based classification (SRC) which has shown an impressive resiliency to noise, blur, and occlusion. We present a coherent strategy for expanding upon SRC for Sonar ATR that retains SRC’s robustness while also being able to handle targets with diverse geometric arrangements, bothersome Rayleigh noise, and unavoidable background clutter. Our method, pose corrected sparsity (PCS), incorporates a novel interpretation of a spike and slab probability distribution towards use as a Bayesian prior for class-specific discrimination in combination with a dictionary learning scheme for localized patch extractions. Additionally, PCS offers the potential for anomaly detection in order to avoid false identifications of tested objects from outside the training set with no additional training required. Compelling results are shown using a database provided by the United States Naval Surface Warfare Center. |
Tasks | Anomaly Detection, Dictionary Learning |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08590v1 |
http://arxiv.org/pdf/1706.08590v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-sonar-atr-through-bayesian-pose |
Repo | |
Framework | |
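For reference, the sparse reconstruction-based classification (SRC) baseline that PCS extends can be sketched as follows: a test sample is sparse-coded over a dictionary of training samples and assigned to the class whose atoms give the smallest reconstruction residual. The synthetic data and scikit-learn's Lasso solver are assumptions; the pose correction, spike-and-slab prior, and patch-based dictionary learning of PCS are omitted.

```python
# A minimal sketch of plain sparse reconstruction-based classification (SRC), the
# baseline that PCS builds on; it omits the paper's pose correction, spike-and-slab
# prior, and patch-based dictionary learning. Synthetic data and the Lasso solver
# from scikit-learn are assumptions for illustration.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_feat, n_per_class, n_classes = 64, 20, 3

# Synthetic training dictionary: columns are (normalized) training samples per class.
centers = rng.normal(size=(n_classes, n_feat))
X_train = np.hstack([
    centers[c][:, None] + 0.3 * rng.normal(size=(n_feat, n_per_class))
    for c in range(n_classes)
])
X_train /= np.linalg.norm(X_train, axis=0, keepdims=True)
labels = np.repeat(np.arange(n_classes), n_per_class)

def src_classify(y, D, labels, alpha=0.01):
    """Sparse-code y over dictionary D and pick the class with smallest residual."""
    coder = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    coder.fit(D, y)
    x = coder.coef_
    residuals = []
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals.append(np.linalg.norm(y - D @ xc))
    return int(np.argmin(residuals))

y_test = centers[1] + 0.3 * rng.normal(size=n_feat)
y_test /= np.linalg.norm(y_test)
print("predicted class:", src_classify(y_test, X_train, labels))  # expected: 1
```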
Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond
Title | Toward Multi-Diversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond |
Authors | Dong Huang, Chang-Dong Wang, Jian-Huang Lai, Chee-Keong Kwoh |
Abstract | The rapid emergence of high-dimensional data in various areas has brought new challenges to current ensemble clustering research. To deal with the curse of dimensionality, considerable recent efforts in ensemble clustering have been made by incorporating different subspace-based techniques. However, besides the emphasis on subspaces, rather limited attention has been paid to the potential diversity in similarity/dissimilarity metrics. How to create and aggregate a large population of diversified metrics, and furthermore how to jointly investigate the multi-level diversity in the large populations of metrics, subspaces, and clusters in a unified framework, remains a surprisingly open problem in ensemble clustering. To tackle this problem, this paper proposes a novel multi-diversified ensemble clustering approach. In particular, we create a large number of diversified metrics by randomizing a scaled exponential similarity kernel, which are then coupled with random subspaces to form a large set of metric-subspace pairs. Based on the similarity matrices derived from these metric-subspace pairs, an ensemble of diversified base clusterings can thereby be constructed. Thereafter, an entropy-based criterion is adopted to explore the cluster-wise diversity in ensembles. By jointly exploiting the multi-level diversity in metrics, subspaces, and clusters, three specific ensemble clustering algorithms are finally presented. Experimental results on 30 real-world high-dimensional datasets (including 18 cancer gene expression datasets and 12 image/speech datasets) demonstrate the superiority of the proposed algorithms over the state of the art. |
Tasks | |
Published | 2017-10-09 |
URL | https://arxiv.org/abs/1710.03113v3 |
https://arxiv.org/pdf/1710.03113v3.pdf | |
PWC | https://paperswithcode.com/paper/toward-multi-diversified-ensemble-clustering |
Repo | |
Framework | |
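A rough sketch of the metric/subspace diversification idea follows: base clusterings are produced from random feature subspaces combined with scaled exponential similarity kernels of randomized bandwidth, and then aggregated through a co-association matrix. The kernel form, the parameter ranges, and the spectral consensus step are simplifying assumptions, not the paper's entropy-weighted algorithms.

```python
# A rough sketch of the metric/subspace diversification idea: many base clusterings are
# produced from random feature subspaces combined with exponential similarity kernels of
# randomized scale, then aggregated through a co-association matrix. The kernel form,
# parameter ranges, and the spectral consensus step are simplifying assumptions, not the
# paper's entropy-weighted algorithms.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import pairwise_distances

X, _ = make_blobs(n_samples=150, n_features=50, centers=4, random_state=0)
rng = np.random.default_rng(0)
n, n_base, k = X.shape[0], 20, 4
coassoc = np.zeros((n, n))

for _ in range(n_base):
    feat = rng.choice(X.shape[1], size=X.shape[1] // 2, replace=False)  # random subspace
    D = pairwise_distances(X[:, feat])
    sigma = np.median(D) * rng.uniform(0.5, 2.0)       # randomized kernel scale
    S = np.exp(-(D ** 2) / (2 * sigma ** 2))           # scaled exponential similarity
    base = SpectralClustering(n_clusters=k, affinity="precomputed",
                              random_state=0).fit_predict(S)
    coassoc += (base[:, None] == base[None, :])        # accumulate co-cluster votes

coassoc /= n_base
consensus = SpectralClustering(n_clusters=k, affinity="precomputed",
                               random_state=0).fit_predict(coassoc)
print("consensus cluster sizes:", np.bincount(consensus))
```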
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
Title | Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank |
Authors | Liang Zhao, Siyu Liao, Yanzhi Wang, Zhe Li, Jian Tang, Victor Pan, Bo Yuan |
Abstract | Recently, low displacement rank (LDR) matrices, or so-called structured matrices, have been proposed to compress large-scale neural networks. Empirical results have shown that neural networks with LDR weight matrices, referred to as LDR neural networks, can achieve a significant reduction in space and computational complexity while retaining high accuracy. We formally study LDR matrices in deep learning. First, we prove the universal approximation property of LDR neural networks under a mild condition on the displacement operators. We then show that the error bounds of LDR neural networks are as efficient as those of general neural networks, for both single-layer and multi-layer structures. Finally, we propose a back-propagation based training algorithm for general LDR neural networks. |
Tasks | |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00144v4 |
http://arxiv.org/pdf/1703.00144v4.pdf | |
PWC | https://paperswithcode.com/paper/theoretical-properties-for-neural-networks |
Repo | |
Framework | |
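The efficiency argument is easiest to see for a concrete LDR structure. The sketch below uses a circulant weight matrix, one common low-displacement-rank family: the layer stores only n parameters instead of n^2 and applies the weights with FFTs in O(n log n). The sizes are illustrative assumptions; the paper's results cover general displacement operators, not just the circulant case.

```python
# A minimal sketch of one common LDR structure: a layer whose weight matrix is circulant,
# so the matrix-vector product can be done with FFTs in O(n log n) and only n parameters
# are stored. The sizes and the single-layer setup are illustrative assumptions; the
# paper's results cover general low-displacement-rank operators, not just this case.
import numpy as np

rng = np.random.default_rng(0)
n = 8
c = rng.normal(size=n)          # first column of the circulant weight matrix
x = rng.normal(size=n)

# Dense reference: build the full circulant matrix explicitly (column j is c rolled by j).
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
y_dense = C @ x

# FFT-based product: a circulant matvec is a circular convolution,
# i.e. the inverse FFT of the elementwise product of spectra.
y_fft = np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

print(np.allclose(y_dense, y_fft))  # True: same layer output, n params instead of n^2
```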
Bayesian Alignments of Warped Multi-Output Gaussian Processes
Title | Bayesian Alignments of Warped Multi-Output Gaussian Processes |
Authors | Markus Kaiser, Clemens Otte, Thomas Runkler, Carl Henrik Ek |
Abstract | We propose a novel Bayesian approach to modelling nonlinear alignments of time series based on latent shared information. We apply the method to the real-world problem of finding common structure in the sensor data of wind turbines introduced by the underlying latent and turbulent wind field. The proposed model allows for both arbitrary alignments of the inputs and non-parametric output warpings to transform the observations. This gives rise to multiple deep Gaussian process models connected via latent generating processes. We present an efficient variational approximation based on nested variational compression and show how the model can be used to extract shared information between dependent time series, recovering an interpretable functional decomposition of the learning problem. We show results for an artificial data set and real-world data of two wind turbines. |
Tasks | Gaussian Processes, Time Series |
Published | 2017-10-08 |
URL | http://arxiv.org/abs/1710.02766v3 |
http://arxiv.org/pdf/1710.02766v3.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-alignments-of-warped-multi-output |
Repo | |
Framework | |
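A generative sketch of the model structure may help: each output is an output warping g_d of a shared latent function f evaluated at aligned inputs a_d(t). The kernel, alignments, and warpings below are hand-picked for illustration; the paper instead places GP priors on the alignments and warpings and fits them with nested variational compression.

```python
# A generative sketch of the model structure described above: each output is a warping
# g_d of a shared latent function f evaluated at an aligned input a_d(t). Everything
# here (kernel, alignments, warpings) is hand-picked for illustration; the paper instead
# places GP priors on the alignments and warpings and does variational inference.
import numpy as np

def rbf_kernel(t, lengthscale=0.4, var=1.0):
    d = t[:, None] - t[None, :]
    return var * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 100)

# Shared latent function f ~ GP(0, k), drawn once.
K = rbf_kernel(t) + 1e-8 * np.eye(len(t))
f = np.linalg.cholesky(K) @ rng.normal(size=len(t))

# Hypothetical alignments a_d and output warpings g_d for two "turbines".
a1, a2 = t, np.clip(t + 0.1 * np.sin(2 * np.pi * t), 0, 1)   # input alignments
g1, g2 = (lambda x: x), (lambda x: np.tanh(1.5 * x))          # output warpings

def interp_f(a):  # evaluate the shared function at aligned inputs
    return np.interp(a, t, f)

y1 = g1(interp_f(a1)) + 0.05 * rng.normal(size=len(t))
y2 = g2(interp_f(a2)) + 0.05 * rng.normal(size=len(t))
print("correlation between aligned outputs:", round(np.corrcoef(y1, y2)[0, 1], 3))
```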
Stabilizing Adversarial Nets With Prediction Methods
Title | Stabilizing Adversarial Nets With Prediction Methods |
Authors | Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, Tom Goldstein |
Abstract | Adversarial neural networks solve many important problems in data science, but are notoriously difficult to train. These difficulties come from the fact that optimal weights for adversarial nets correspond to saddle points, and not minimizers, of the loss function. The alternating stochastic gradient methods typically used for such problems do not reliably converge to saddle points, and when convergence does happen it is often highly sensitive to learning rates. We propose a simple modification of stochastic gradient descent that stabilizes adversarial networks. We show, both in theory and practice, that the proposed method reliably converges to saddle points, and is stable with a wider range of training parameters than a non-prediction method. This makes adversarial networks less likely to “collapse,” and enables faster training with larger learning rates. |
Tasks | |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07364v3 |
http://arxiv.org/pdf/1705.07364v3.pdf | |
PWC | https://paperswithcode.com/paper/stabilizing-adversarial-nets-with-prediction |
Repo | |
Framework | |
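The prediction step itself is simple enough to show on a toy saddle-point problem. In the sketch below, for f(x, y) = xy (saddle point at the origin), plain alternating gradient descent/ascent merely circles the saddle, while taking the ascent step against an extrapolated ("predicted") copy of x drives the iterates toward it. Step sizes and iteration counts are arbitrary illustrative choices.

```python
# A toy sketch of the prediction step on the bilinear saddle problem f(x, y) = x * y,
# whose saddle point is (0, 0). Plain alternating gradient descent/ascent circles the
# saddle without converging here, while updating y against a "predicted" (extrapolated)
# copy of x drives the iterates toward it. Step sizes and iteration counts are arbitrary.
import numpy as np

def run(prediction, steps=200, lr=0.2):
    x, y = 1.0, 1.0
    for _ in range(steps):
        x_new = x - lr * y              # descent step on x (grad_x f = y)
        # Prediction: extrapolate x before the ascent step on y.
        x_bar = x_new + (x_new - x) if prediction else x_new
        y = y + lr * x_bar              # ascent step on y (grad_y f = x)
        x = x_new
    return np.hypot(x, y)               # distance from the saddle point (0, 0)

print("plain alternating SGD :", round(run(prediction=False), 4))
print("with prediction step  :", round(run(prediction=True), 4))
```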
Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging
Title | Endoscopic Depth Measurement and Super-Spectral-Resolution Imaging |
Authors | Jianyu Lin, Neil T. Clancy, Yang Hu, Ji Qi, Taran Tatla, Danail Stoyanov, Lena Maier-Hein, Daniel S. Elson |
Abstract | Intra-operative measurements of tissue shape and multi/hyperspectral information have the potential to provide surgical guidance and decision making support. We report an optical probe based system that combines sparse hyperspectral measurements and spectrally-encoded structured lighting (SL) for surface measurements. The system provides informative signals for navigation with a surgical interface. By rapidly switching between SL and white light (WL) modes, SL information is combined with structure-from-motion (SfM) from white light images, based on SURF feature detection and Lucas-Kanade (LK) optical flow, to provide quasi-dense surface shape reconstruction with known scale in real time. Furthermore, “super-spectral-resolution” was realized, whereby the RGB images and sparse hyperspectral data were integrated to recover dense pixel-level hyperspectral stacks, using convolutional neural networks to upscale the wavelength dimension. Validation and demonstration of this system are reported on ex vivo and in vivo animal and human experiments. |
Tasks | Decision Making, Optical Flow Estimation |
Published | 2017-06-19 |
URL | http://arxiv.org/abs/1706.06081v2 |
http://arxiv.org/pdf/1706.06081v2.pdf | |
PWC | https://paperswithcode.com/paper/endoscopic-depth-measurement-and-super |
Repo | |
Framework | |
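The white-light tracking component can be sketched with OpenCV: corners detected in one frame are tracked into the next with pyramidal Lucas-Kanade optical flow. Shi-Tomasi corners stand in here for SURF keypoints (SURF lives in OpenCV's non-free contrib module), and the two synthetic frames are placeholders; in the paper these tracks feed SfM and are combined with structured lighting to fix the scale.

```python
# A small sketch of the sparse feature tracking used between white-light frames:
# corners are detected in one frame and tracked into the next with pyramidal
# Lucas-Kanade optical flow. OpenCV's Shi-Tomasi detector stands in for SURF here,
# and the two synthetic frames are assumptions; the paper combines such tracks with
# structured lighting to recover scale.
import cv2
import numpy as np

# Two synthetic grayscale frames: a bright square translated by (5, 3) pixels.
prev_frame = np.zeros((240, 320), dtype=np.uint8)
next_frame = np.zeros((240, 320), dtype=np.uint8)
cv2.rectangle(prev_frame, (100, 100), (160, 160), 255, -1)
cv2.rectangle(next_frame, (105, 103), (165, 163), 255, -1)

# Detect corner features in the first frame (stand-in for SURF keypoints).
p0 = cv2.goodFeaturesToTrack(prev_frame, maxCorners=50, qualityLevel=0.01, minDistance=5)

# Track them into the second frame with pyramidal Lucas-Kanade.
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_frame, next_frame, p0, None,
                                           winSize=(21, 21), maxLevel=3)

good_old = p0[status.flatten() == 1].reshape(-1, 2)
good_new = p1[status.flatten() == 1].reshape(-1, 2)
print("mean estimated displacement:", np.round((good_new - good_old).mean(axis=0), 2))
```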
An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming
Title | An Optimization Framework with Flexible Inexact Inner Iterations for Nonconvex and Nonsmooth Programming |
Authors | Yiyang Wang, Risheng Liu, Xiaoliang Song, Zhixun Su |
Abstract | In recent years, numerous vision and learning tasks have been (re)formulated as nonconvex and nonsmooth programming problems (NNPs). Although some algorithms have been proposed for particular problems, designing fast and flexible optimization schemes with theoretical guarantees is a challenging task for general NNPs. It has been observed that performing inexact inner iterations often benefits particular applications on a case-by-case basis, but the resulting convergence behaviors are still unclear. Motivated by these practical experiences, this paper designs a novel algorithmic framework, named inexact proximal alternating direction method (IPAD), for solving general NNPs. We demonstrate that any numerical algorithm can be incorporated into IPAD for solving subproblems, and that the convergence of the resulting hybrid schemes can be consistently guaranteed by a series of simple error conditions. Beyond the theoretical guarantees, numerical experiments on both synthetic and real-world data further demonstrate the superiority and flexibility of our IPAD framework for practical use. |
Tasks | |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08627v3 |
http://arxiv.org/pdf/1702.08627v3.pdf | |
PWC | https://paperswithcode.com/paper/an-optimization-framework-with-flexible |
Repo | |
Framework | |
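The inner/outer structure of inexact alternating schemes can be illustrated on a simple composite objective: each block update is a proximal subproblem that is solved only approximately, here by a fixed small number of proximal-gradient steps. The toy objective 0.5||Ax + By - b||^2 + lam||x||_1 + 0.5 mu||y||^2 is convex and chosen purely for clarity; it shows the pattern, not the paper's nonconvex convergence conditions or error criteria.

```python
# A schematic sketch of the inexact alternating idea: each block update is a proximal
# subproblem that is only solved approximately, by a fixed small number of inner
# (proximal-)gradient steps, rather than to high precision. The toy objective
# 0.5*||A x + B y - b||^2 + lam*||x||_1 + 0.5*mu*||y||^2 is chosen for simplicity and is
# convex; it illustrates the inner/outer structure, not the paper's nonconvex analysis.
import numpy as np

rng = np.random.default_rng(0)
m, n = 40, 30
A, B = rng.normal(size=(m, n)), rng.normal(size=(m, n))
b = rng.normal(size=m)
lam, mu = 0.1, 0.5

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(x, y):
    r = A @ x + B @ y - b
    return 0.5 * r @ r + lam * np.abs(x).sum() + 0.5 * mu * y @ y

x, y = np.zeros(n), np.zeros(n)
Lx = np.linalg.norm(A, 2) ** 2          # Lipschitz constants of the smooth parts
Ly = np.linalg.norm(B, 2) ** 2 + mu

for outer in range(30):
    # Inexact x-subproblem: a few proximal-gradient (ISTA) steps only.
    for _ in range(3):
        grad_x = A.T @ (A @ x + B @ y - b)
        x = soft_threshold(x - grad_x / Lx, lam / Lx)
    # Inexact y-subproblem: a few gradient steps on the smooth-in-y part.
    for _ in range(3):
        grad_y = B.T @ (A @ x + B @ y - b) + mu * y
        y = y - grad_y / Ly

print("final objective:", round(objective(x, y), 4))
```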
Towards Universal Semantic Tagging
Title | Towards Universal Semantic Tagging |
Authors | Lasha Abzianidze, Johan Bos |
Abstract | The paper proposes the task of universal semantic tagging: tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of the semantic tagset and show that (a) the tags provide semantically fine-grained information, and (b) they are suitable for cross-lingual semantic parsing. An application of the semantic tagging in the Parallel Meaning Bank supports both of these points, as the tags contribute to formal lexical semantics and their cross-lingual projection. As part of the application, we annotate a small corpus with the semantic tags and present a new baseline result for universal semantic tagging. |
Tasks | Semantic Parsing |
Published | 2017-09-29 |
URL | http://arxiv.org/abs/1709.10381v1 |
http://arxiv.org/pdf/1709.10381v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-universal-semantic-tagging |
Repo | |
Framework | |
A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue
Title | A Copy-Augmented Sequence-to-Sequence Architecture Gives Good Performance on Task-Oriented Dialogue |
Authors | Mihail Eric, Christopher D. Manning |
Abstract | Task-oriented dialogue focuses on conversational agents that participate in user-initiated dialogues on domain-specific topics. In contrast to chatbots, which simply seek to sustain open-ended meaningful discourse, existing task-oriented agents usually explicitly model user intent and belief states. This paper examines bypassing such an explicit representation by depending on a latent neural embedding of state and learning selective attention to dialogue history together with copying to incorporate relevant prior context. We complement recent work by showing the effectiveness of simple sequence-to-sequence neural architectures with a copy mechanism. Our model outperforms more complex memory-augmented models by 7% in per-response generation and is on par with the current state-of-the-art on DSTC2. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04024v3 |
http://arxiv.org/pdf/1701.04024v3.pdf | |
PWC | https://paperswithcode.com/paper/a-copy-augmented-sequence-to-sequence |
Repo | |
Framework | |
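The copy step can be sketched in a few lines of PyTorch using a pointer-generator-style mixture, a common formulation of copying: the decoder's vocabulary distribution is blended with the attention distribution scattered onto the source tokens' vocabulary ids, gated by a learned scalar. The tensor shapes and tiny vocabulary are illustrative, and the encoder/decoder producing the logits, attention, and gate is omitted; the paper's exact parameterization may differ.

```python
# A minimal sketch of the copy step only: the decoder's vocabulary distribution is mixed
# with an attention distribution scattered back onto the source tokens' vocabulary ids,
# gated by a learned p_gen. Tensor shapes and the tiny vocabulary are illustrative
# assumptions; the encoder/decoder producing `vocab_logits`, `attention`, and `p_gen`
# is omitted, and this pointer-generator-style mixture is a common formulation of
# copying rather than the paper's exact parameterization.
import torch

vocab_size, src_len = 10, 4
vocab_logits = torch.randn(vocab_size)                    # decoder's generation scores
attention = torch.softmax(torch.randn(src_len), dim=0)    # attention over source tokens
src_ids = torch.tensor([3, 7, 7, 2])                      # vocabulary ids of source tokens
p_gen = torch.sigmoid(torch.randn(()))                    # mixing gate in (0, 1)

p_vocab = torch.softmax(vocab_logits, dim=0)

# Scatter the attention mass onto the vocabulary positions of the source tokens
# (repeated source tokens accumulate their attention weights).
p_copy = torch.zeros(vocab_size).scatter_add_(0, src_ids, attention)

p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
print("sums to one:", torch.isclose(p_final.sum(), torch.tensor(1.0)).item())
print("most likely next token id:", int(p_final.argmax()))
```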
Video Object Segmentation Without Temporal Information
Title | Video Object Segmentation Without Temporal Information |
Authors | Kevis-Kokitsi Maninis, Sergi Caelles, Yuhua Chen, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc Van Gool |
Abstract | Video object segmentation, and video processing in general, has historically been dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded or some frames are missing in a sequence, the results of these methods can deteriorate significantly, or they may not produce any result at all. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS-S), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent video segmentation databases, which show that OSVOS-S is both the fastest and the most accurate method in the state of the art. |
Tasks | Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06031v2 |
http://arxiv.org/pdf/1709.06031v2.pdf | |
PWC | https://paperswithcode.com/paper/video-object-segmentation-without-temporal |
Repo | |
Framework | |
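The one-shot step can be sketched as fine-tuning a pretrained fully-convolutional network on the first frame and its mask, then segmenting every other frame independently. torchvision's fcn_resnet50 stands in for the paper's architecture (assuming a recent torchvision; the pretrained weights are downloaded on first use), and the random frame/mask tensors are placeholders; the parent-network training and the instance-level semantic cues of OSVOS-S are omitted.

```python
# A schematic sketch of the one-shot step: a pretrained fully-convolutional segmentation
# network is fine-tuned on just the first frame and its mask, then applied to every
# other frame independently (no temporal information). torchvision's fcn_resnet50 stands
# in for the paper's architecture, and the random "frame"/"mask" tensors are placeholders
# for real data; the parent-network training and the instance-level semantic cues of
# OSVOS-S are omitted.
import torch
import torch.nn as nn
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(weights="DEFAULT")
model.classifier[4] = nn.Conv2d(512, 1, kernel_size=1)   # binary foreground head

frame0 = torch.rand(1, 3, 240, 320)                      # placeholder first frame
mask0 = (torch.rand(1, 1, 240, 320) > 0.5).float()       # placeholder first-frame mask

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()

model.train()
for step in range(20):                                   # one-shot fine-tuning loop
    optimizer.zero_grad()
    logits = model(frame0)["out"]
    loss = criterion(logits, mask0)
    loss.backward()
    optimizer.step()

model.eval()
with torch.no_grad():
    frame_t = torch.rand(1, 3, 240, 320)                 # any later frame, processed alone
    pred_mask = torch.sigmoid(model(frame_t)["out"]) > 0.5
print("predicted foreground pixels:", int(pred_mask.sum()))
```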
The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback
Title | The Dependent Doors Problem: An Investigation into Sequential Decisions without Feedback |
Authors | Amos Korman, Yoav Rodeh |
Abstract | We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of possibly dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of $d$ doors that are initially closed, and the aim is to open all of them as fast as possible. To open a door, the algorithm knocks on it, and it might open or not according to some probability distribution. This distribution may depend on which other doors are currently open, as well as on which other doors were open during each of the previous knocks on that door. The algorithm aims to minimize the expected time until all doors open. Crucially, it must act at any time without knowing whether or which other doors have already opened. In this work, we focus on scenarios where dependencies between doors are both positively correlated and acyclic. The fundamental distribution of a door describes the probability that it opens in the best of conditions (with respect to other doors being open or closed). We show that if, in two configurations of $d$ doors, corresponding doors share the same fundamental distribution, then these configurations have the same optimal running time up to a universal constant, no matter what the dependencies between doors and the distributions are. We also identify algorithms that are optimal up to a universal constant factor. For the case in which all doors share the same fundamental distribution, we additionally provide a simpler algorithm and a formula to calculate its running time. We furthermore analyse the price of lacking feedback for several configurations governed by standard fundamental distributions. In particular, we show that the price is logarithmic in $d$ for memoryless doors, but can potentially grow to be linear in $d$ for other distributions. We then turn our attention to investigating precise bounds. Even for the case of two doors, identifying the optimal sequence is an intriguing combinatorial question. Here, we study the case of two cascading memoryless doors. That is, the first door opens on each knock independently with probability $p_1$. The second door can only open if the first door is open, in which case it will open on each knock independently with probability $p_2$. We solve this problem almost completely by identifying algorithms that are optimal up to an additive term of 1. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06096v1 |
http://arxiv.org/pdf/1704.06096v1.pdf | |
PWC | https://paperswithcode.com/paper/the-dependent-doors-problem-an-investigation |
Repo | |
Framework | |
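The two cascading memoryless doors are easy to simulate. The sketch below estimates the expected opening time of two feedback-free knocking schedules (strict alternation versus a 2:1 bias toward the first door); the schedules and parameters are arbitrary examples, not the near-optimal sequences derived in the paper, and only illustrate that the schedule matters when no feedback is available.

```python
# A small simulation of the two cascading memoryless doors: door 1 opens on each knock
# with probability p1; door 2 can only open once door 1 is open, and then opens on each
# knock with probability p2. The knocking sequences compared below (strict alternation
# vs. a 2:1 bias toward door 1) are arbitrary examples, not the near-optimal sequences
# derived in the paper; the point is only that the schedule matters without feedback.
import random

def expected_time(sequence_fn, p1, p2, trials=100_000, max_knocks=10_000):
    total = 0
    for _ in range(trials):
        open1 = open2 = False
        t = 0
        while not open2 and t < max_knocks:
            door = sequence_fn(t)          # which door to knock on at step t (no feedback)
            t += 1
            if door == 1 and not open1:
                open1 = random.random() < p1
            elif door == 2 and open1 and not open2:
                open2 = random.random() < p2
        total += t
    return total / trials

alternate = lambda t: 1 if t % 2 == 0 else 2          # 1, 2, 1, 2, ...
biased = lambda t: 1 if t % 3 != 2 else 2             # 1, 1, 2, 1, 1, 2, ...

p1, p2 = 0.2, 0.5
print("alternating schedule  :", round(expected_time(alternate, p1, p2), 2))
print("door-1-biased schedule:", round(expected_time(biased, p1, p2), 2))
```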
EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features
Title | EEG-Based User Reaction Time Estimation Using Riemannian Geometry Features |
Authors | Dongrui Wu, Brent J. Lance, Vernon J. Lawhern, Stephen Gordon, Tzyy-Ping Jung, Chin-Teng Lin |
Abstract | Riemannian geometry has been successfully used in many brain-computer interface (BCI) classification problems and has demonstrated superior performance. In this paper, for the first time, it is applied to BCI regression problems, an important category of BCI applications. More specifically, we propose a new feature extraction approach for electroencephalogram (EEG) based BCI regression problems: a spatial filter is first used to increase the signal quality of the EEG trials and also to reduce the dimensionality of the covariance matrices, and then Riemannian tangent space features are extracted. We validate the performance of the proposed approach in reaction time estimation from EEG signals measured in a large-scale sustained-attention psychomotor vigilance task, and show that, compared with the traditional power band features, the tangent space features can reduce the root mean square estimation error by 4.30-8.30% and increase the estimation correlation coefficient by 6.59-11.13%. |
Tasks | EEG |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08533v1 |
http://arxiv.org/pdf/1704.08533v1.pdf | |
PWC | https://paperswithcode.com/paper/eeg-based-user-reaction-time-estimation-using |
Repo | |
Framework | |
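The tangent-space feature pipeline can be sketched with the pyriemann and scikit-learn packages (assumed installed): per-trial covariance matrices are mapped to the Riemannian tangent space and fed to a ridge regressor that predicts reaction times. The random arrays stand in for real EEG trials, and the spatial-filtering step described in the abstract is omitted.

```python
# A minimal sketch of the feature pipeline: covariance matrices are estimated per EEG
# trial and mapped into the Riemannian tangent space, then fed to a ridge regressor to
# predict reaction times. The random data stands in for real EEG, and the spatial
# filtering / dimensionality-reduction step described in the paper is omitted. Assumes
# the pyriemann and scikit-learn packages are installed.
import numpy as np
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 100, 8, 256
X = rng.standard_normal((n_trials, n_channels, n_samples))   # placeholder EEG trials
rt = rng.uniform(0.3, 1.5, size=n_trials)                     # placeholder reaction times

model = make_pipeline(
    Covariances(estimator="lwf"),   # shrinkage covariance per trial
    TangentSpace(metric="riemann"), # project SPD matrices to the tangent space
    Ridge(alpha=1.0),
)
model.fit(X[:80], rt[:80])
pred = model.predict(X[80:])
rmse = np.sqrt(np.mean((pred - rt[80:]) ** 2))
print("placeholder RMSE:", round(rmse, 3))
```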
Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets
Title | Recurrent Neural Networks for anomaly detection in the Post-Mortem time series of LHC superconducting magnets |
Authors | Maciej Wielgosz, Andrzej Skoczeń, Matej Mertik |
Abstract | This paper presents a model based on the LSTM and GRU deep learning algorithms for facilitating anomaly detection in Large Hadron Collider superconducting magnets. We used high-resolution data available in the Post Mortem database to train a set of models and chose the best possible set of their hyper-parameters. Using a deep learning approach allowed us to examine a vast body of data and extract the fragments that require further expert examination and are regarded as anomalies. The presented method does not require tedious manual threshold setting and operator attention at the stage of system setup. Instead, an automatic approach is proposed, which according to our experiments achieves an accuracy of 99%. This is reached for the largest dataset of 302 MB and the following network architecture: a single-layer LSTM with 128 cells, 20 epochs of training, look_back=16, look_ahead=128, grid=100, and the Adam optimizer. All experiments were run on an Nvidia Tesla K80 GPU. |
Tasks | Anomaly Detection, Time Series |
Published | 2017-02-02 |
URL | http://arxiv.org/abs/1702.00833v1 |
http://arxiv.org/pdf/1702.00833v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-anomaly |
Repo | |
Framework | |
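A compact sketch of forecast-based anomaly detection in this spirit: an LSTM learns to predict a look-ahead window from a look-back window, and time steps with unusually large prediction error are flagged. The synthetic sine signal, the small window sizes, and the 5-sigma threshold are illustrative assumptions, not the paper's magnet data or exact settings.

```python
# A compact sketch of forecast-based anomaly detection in the spirit described above:
# an LSTM learns to predict the next window of a signal from a look-back window, and
# time steps with unusually large prediction error are flagged as anomalies. The
# synthetic sine signal, the small window sizes, and the error threshold are
# illustrative assumptions, not the paper's 302 MB magnet dataset or exact settings.
import numpy as np
import tensorflow as tf

look_back, look_ahead = 16, 8
t = np.arange(3000)
signal = np.sin(0.05 * t) + 0.05 * np.random.default_rng(0).standard_normal(t.size)
signal[2500:2520] += 2.0                      # injected anomaly near the end

def make_windows(x, lb, la):
    X, Y = [], []
    for i in range(len(x) - lb - la):
        X.append(x[i:i + lb])
        Y.append(x[i + lb:i + lb + la])
    return np.array(X)[..., None], np.array(Y)

X, Y = make_windows(signal, look_back, look_ahead)
split = 2000
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(look_back, 1)),
    tf.keras.layers.Dense(look_ahead),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], Y[:split], epochs=5, batch_size=64, verbose=0)

errors = np.mean((model.predict(X, verbose=0) - Y) ** 2, axis=1)
threshold = errors[:split].mean() + 5 * errors[:split].std()
print("flagged window indices:", np.where(errors > threshold)[0][:10])
```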
Trimmed Density Ratio Estimation
Title | Trimmed Density Ratio Estimation |
Authors | Song Liu, Akiko Takeda, Taiji Suzuki, Kenji Fukumizu |
Abstract | Density ratio estimation is a vital tool in both the machine learning and the statistics communities. However, due to the unbounded nature of the density ratio, the estimation procedure can be vulnerable to corrupted data points, which often push the estimated ratio toward infinity. In this paper, we present a robust estimator which automatically identifies and trims outliers. The proposed estimator has a convex formulation, and the global optimum can be obtained via subgradient descent. We analyze the parameter estimation error of this estimator under high-dimensional settings. Experiments are conducted to verify the effectiveness of the estimator. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03216v3 |
http://arxiv.org/pdf/1703.03216v3.pdf | |
PWC | https://paperswithcode.com/paper/trimmed-density-ratio-estimation |
Repo | |
Framework | |
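The trimming idea can be illustrated with a simplified stand-in estimator: a logistic regression separating numerator from denominator samples yields density-ratio estimates via its odds, and the numerator points with the most extreme estimated ratios are discarded before refitting. This classification-based estimator and the fixed 10% trimming fraction are assumptions for illustration, not the paper's convex trimmed estimator or its subgradient solver.

```python
# A simplified sketch of the trimming idea using a probabilistic-classification density
# ratio estimator: a logistic regression separating numerator from denominator samples
# yields ratio estimates p(x)/q(x) via its odds, and the numerator points with the most
# extreme estimated ratios are discarded before refitting. This classification-based
# estimator and the fixed 10% trimming fraction are illustrative stand-ins, not the
# paper's convex trimmed estimator or its subgradient solver.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xp = rng.normal(loc=0.0, scale=1.0, size=(500, 2))      # numerator samples ~ p
Xp[:25] += 8.0                                          # corrupted outliers in p
Xq = rng.normal(loc=0.5, scale=1.0, size=(500, 2))      # denominator samples ~ q

def fit_ratio(Xp, Xq):
    """Logistic regression on (p vs q) labels; its odds approximate the density ratio."""
    X = np.vstack([Xp, Xq])
    y = np.r_[np.ones(len(Xp)), np.zeros(len(Xq))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    prob = clf.predict_proba(Xp)[:, 1]
    return prob / np.clip(1.0 - prob, 1e-12, None)       # r(x) ~ P(p|x) / P(q|x)

ratios = fit_ratio(Xp, Xq)
keep = ratios <= np.quantile(ratios, 0.90)               # trim the top 10% of ratios
ratios_trimmed = fit_ratio(Xp[keep], Xq)

print("max ratio before trimming:", round(ratios.max(), 1))
print("max ratio after trimming :", round(ratios_trimmed.max(), 1))
```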