July 26, 2019

Paper Group ANR 770

Subword and Crossword Units for CTC Acoustic Models. Recursive Multikernel Filters Exploiting Nonlinear Temporal Structure. Riemannian Optimization via Frank-Wolfe Methods. Non-line-of-sight Imaging with Partial Occluders and Surface Normals. Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification. RCA …

Subword and Crossword Units for CTC Acoustic Models


Title	Subword and Crossword Units for CTC Acoustic Models
Authors	Thomas Zenkel, Ramon Sanabria, Florian Metze, Alex Waibel
Abstract	This paper proposes a novel approach to create a unit set for CTC-based speech recognition systems. By using Byte Pair Encoding, we learn a unit set of arbitrary size on a given training text. In contrast to using characters or words as units, this allows us to find a good trade-off between the size of our unit set and the available training data. We evaluate both Crossword units, which may span multiple words, and Subword units. By combining this approach with decoding methods that use a separate language model, we are able to achieve state-of-the-art results for grapheme-based CTC systems.
Tasks	Language Modelling, Speech Recognition
Published	2017-12-19
URL	http://arxiv.org/abs/1712.06855v2
PDF	http://arxiv.org/pdf/1712.06855v2.pdf
PWC	https://paperswithcode.com/paper/subword-and-crossword-units-for-ctc-acoustic
Repo
Framework
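
The Byte Pair Encoding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `learn_bpe`, the toy word list, and the tie-breaking rule are assumptions made for illustration. The sketch only merges symbols inside words (Subword units); learning Crossword units would presumably require keeping word boundaries as ordinary symbols, which the abstract does not spell out.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Each word starts as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties go to the lexicographically largest pair
        # (an arbitrary but deterministic choice).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe(["low", "low", "lower", "lowest"], 2)
```

The number of merges directly controls the size of the unit set, which is the trade-off the abstract refers to: zero merges leaves plain characters, while many merges approach whole words.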

Recursive Multikernel Filters Exploiting Nonlinear Temporal Structure


Title	Recursive Multikernel Filters Exploiting Nonlinear Temporal Structure
Authors	Steven Van Vaerenbergh, Simone Scardapane, Ignacio Santamaria
Abstract	In kernel methods, temporal information on the data is commonly included by using time-delayed embeddings as inputs. Recently, an alternative formulation was proposed by defining a gamma-filter explicitly in a reproducing kernel Hilbert space, giving rise to a complex model where multiple kernels operate on different temporal combinations of the input signal. In the original formulation, the kernels are then simply combined to obtain a single kernel matrix (for instance by averaging), which provides computational benefits but discards important information on the temporal structure of the signal. Inspired by works on multiple kernel learning, we overcome this drawback by considering the different kernels separately. We propose an efficient strategy to adaptively combine and select these kernels during the training phase. The resulting batch and online algorithms automatically learn to process highly nonlinear temporal information extracted from the input signal, which is implicitly encoded in the kernel values. We evaluate our proposal on several artificial and real tasks, showing that it can outperform classical approaches both in batch and online settings.
Tasks
Published	2017-06-12
URL	http://arxiv.org/abs/1706.03533v1
PDF	http://arxiv.org/pdf/1706.03533v1.pdf
PWC	https://paperswithcode.com/paper/recursive-multikernel-filters-exploiting
Repo
Framework

Riemannian Optimization via Frank-Wolfe Methods


Title	Riemannian Optimization via Frank-Wolfe Methods
Authors	Melanie Weber, Suvrit Sra
Abstract	We study projection-free methods for constrained Riemannian optimization. In particular, we propose the Riemannian Frank-Wolfe (RFW) method. We analyze non-asymptotic convergence rates of RFW to an optimum for (geodesically) convex problems, and to a critical point for nonconvex objectives. We also present a practical setting under which RFW can attain a linear convergence rate. As a concrete example, we specialize RFW to the manifold of positive definite matrices and apply it to two tasks: (i) computing the matrix geometric mean (Riemannian centroid); and (ii) computing the Bures-Wasserstein barycenter. Both tasks involve geodesically convex interval constraints, for which we show that the Riemannian “linear oracle” required by RFW admits a closed-form solution; this result may be of independent interest. We further specialize RFW to the special orthogonal group and show that here too, the Riemannian “linear oracle” can be solved in closed form. Here, we describe an application to the synchronization of data matrices (Procrustes problem). We complement our theoretical results with an empirical comparison of RFW against state-of-the-art Riemannian optimization methods and observe that RFW performs competitively on the task of computing Riemannian centroids.
Tasks
Published	2017-10-30
URL	https://arxiv.org/abs/1710.10770v3
PDF	https://arxiv.org/pdf/1710.10770v3.pdf
PWC	https://paperswithcode.com/paper/riemannian-frank-wolfe-with-application-to
Repo
Framework

Non-line-of-sight Imaging with Partial Occluders and Surface Normals


Title	Non-line-of-sight Imaging with Partial Occluders and Surface Normals
Authors	Felix Heide, Matthew O’Toole, Kai Zang, David Lindell, Steven Diamond, Gordon Wetzstein
Abstract	Imaging objects obscured by occluders is a significant challenge for many applications. A camera that could “see around corners” could help improve navigation and mapping capabilities of autonomous vehicles or make search and rescue missions more effective. Time-resolved single-photon imaging systems have recently been demonstrated to record optical information of a scene that can lead to an estimation of the shape and reflectance of objects hidden from the line of sight of a camera. However, existing non-line-of-sight (NLOS) reconstruction algorithms have been constrained in the types of light transport effects they model for the hidden scene parts. We introduce a factored NLOS light transport representation that accounts for partial occlusions and surface normals. Based on this model, we develop a factorization approach for inverse time-resolved light transport and demonstrate high-fidelity NLOS reconstructions for challenging scenes both in simulation and with an experimental NLOS imaging system.
Tasks	Autonomous Vehicles
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07134v3
PDF	http://arxiv.org/pdf/1711.07134v3.pdf
PWC	https://paperswithcode.com/paper/non-line-of-sight-imaging-with-partial
Repo
Framework

Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification


Title	Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification
Authors	Ji Wen
Abstract	Relation classification is an important semantic processing task in the field of natural language processing (NLP). In this paper, we present a novel model, the Structure Regularized Bidirectional Recurrent Convolutional Neural Network (SR-BRCNN), to classify the relation of two entities in a sentence, and a new dataset of Chinese Sanwen for named entity recognition and relation classification. Some state-of-the-art systems concentrate on modeling the shortest dependency path (SDP) between two entities leveraging convolutional or recurrent neural networks. We further explore how to make full use of the dependency relations information in the SDP and how to improve the model by the method of structure regularization. We propose a structure regularized model to learn relation representations along the SDP extracted from the forest formed by the structure regularized dependency tree, which reduces the complexity of the whole model and helps improve the $F_{1}$ score by 10.3. Experimental results show that our method outperforms the state-of-the-art approaches on the Chinese Sanwen task and performs as well on the SemEval-2010 Task 8 dataset. The Chinese Sanwen corpus developed and used in this paper will be released in the future.
Tasks	Named Entity Recognition, Relation Classification
Published	2017-11-06
URL	http://arxiv.org/abs/1711.02509v1
PDF	http://arxiv.org/pdf/1711.02509v1.pdf
PWC	https://paperswithcode.com/paper/structure-regularized-bidirectional-recurrent
Repo
Framework

RCAMP: A Resilient Communication-Aware Motion Planner for Mobile Robots with Autonomous Repair of Wireless Connectivity


Title	RCAMP: A Resilient Communication-Aware Motion Planner for Mobile Robots with Autonomous Repair of Wireless Connectivity
Authors	Sergio Caccamo, Ramviyas Parasuraman, Luigi Freda, Mario Gianni, Petter Ögren
Abstract	Mobile robots, be they autonomous or teleoperated, require stable communication with the base station to exchange valuable information. Given the stochastic elements in radio signal propagation, such as shadowing and fading, and the possibilities of unpredictable events or hardware failures, communication loss often presents a significant mission risk, both in terms of probability and impact, especially in Urban Search and Rescue (USAR) operations. Depending on the circumstances, disconnected robots are either abandoned or attempt to autonomously back-trace their way to the base station. Although recent results in Communication-Aware Motion Planning can be used to effectively manage connectivity with robots, there are no results focusing on autonomously re-establishing the wireless connectivity of a mobile robot without back-tracking or using detailed a priori information of the network. In this paper, we present a robust and online radio signal mapping method using Gaussian Random Fields and propose a Resilient Communication-Aware Motion Planner (RCAMP) that integrates the above signal mapping framework with a motion planner. RCAMP considers both the environment and the physical constraints of the robot, based on the available sensory information. We also propose a self-repair strategy using RCAMP that takes both connectivity and the goal position into account when driving to a connection-safe position in the event of a communication loss. We demonstrate the proposed planner in a set of realistic simulations of an exploration task in single or multi-channel communication scenarios.
Tasks	Motion Planning
Published	2017-10-18
URL	http://arxiv.org/abs/1710.09303v1
PDF	http://arxiv.org/pdf/1710.09303v1.pdf
PWC	https://paperswithcode.com/paper/rcamp-a-resilient-communication-aware-motion
Repo
Framework

Solving the “false positives” problem in fraud prediction


Title	Solving the “false positives” problem in fraud prediction
Authors	Roy Wedge, James Max Kanter, Santiago Moral Rubio, Sergio Iglesias Perez, Kalyan Veeramachaneni
Abstract	In this paper, we present an automated feature engineering based approach to dramatically reduce false positives in fraud prediction. False positives plague the fraud prediction industry. It is estimated that only 1 in 5 transactions declared as fraud is actually fraudulent and that roughly 1 in every 6 customers has had a valid transaction declined in the past year. To address this problem, we use the Deep Feature Synthesis algorithm to automatically derive behavioral features based on the historical data of the card associated with a transaction. We generate 237 features (>100 behavioral patterns) for each transaction, and use a random forest to learn a classifier. We tested our machine learning model on data from a large multinational bank and compared it to their existing solution. On unseen data comprising 1.852 million transactions, we were able to reduce the false positives by 54% and provide savings of 190K euros. We also assess how to deploy this solution, and whether it necessitates streaming computation for real-time scoring. We found that our solution can maintain similar benefits even when historical features are computed once every 7 days.
Tasks	Automated Feature Engineering, Feature Engineering
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07709v1
PDF	http://arxiv.org/pdf/1710.07709v1.pdf
PWC	https://paperswithcode.com/paper/solving-the-false-positives-problem-in-fraud
Repo
Framework
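
To make the idea of behavioral features derived from a card's history concrete, here is a small hand-written sketch. It is a stand-in, not Deep Feature Synthesis itself: the function name `behavioral_features`, the input format, and the four features are all assumptions chosen for illustration, whereas the paper derives 237 features automatically.

```python
from statistics import mean

def behavioral_features(history, amount):
    """Derive a few per-card features from past transaction amounts.

    `history` is a list of earlier amounts on the same card (hypothetical
    input format); `amount` is the transaction being scored.
    """
    if not history:
        # No past behaviour to compare against.
        return {"n_past": 0, "mean_past": 0.0, "max_past": 0.0, "ratio_to_mean": 0.0}
    avg = mean(history)
    return {
        "n_past": len(history),
        "mean_past": avg,
        "max_past": max(history),
        # How unusual the current amount is relative to the card's history.
        "ratio_to_mean": amount / avg if avg else 0.0,
    }

features = behavioral_features([10.0, 20.0, 30.0], 60.0)
```

Because such features depend only on past transactions, they could in principle be recomputed on a schedule rather than per transaction, which is the deployment question the abstract raises.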

Distance-based Self-Attention Network for Natural Language Inference


Title	Distance-based Self-Attention Network for Natural Language Inference
Authors	Jinbae Im, Sungzoon Cho
Abstract	The attention mechanism has been used as an ancillary means to help RNNs or CNNs. However, the Transformer (Vaswani et al., 2017) recently recorded state-of-the-art performance in machine translation with a dramatic reduction in training time by solely using attention. Motivated by the Transformer, the Directional Self Attention Network (Shen et al., 2017), a fully attention-based sentence encoder, was proposed. It showed good performance with various data by using forward and backward directional information in a sentence. However, that study did not consider the distance between words, an important feature when learning the local dependency to help understand the context of input text. We propose the Distance-based Self-Attention Network, which considers the word distance by using a simple distance mask in order to model the local dependency without losing the ability to model global dependency that is inherent to attention. Our model shows good performance with NLI data, and it records a new state-of-the-art result with SNLI data. Additionally, we show that our model has a strength in long sentences or documents.
Tasks	Machine Translation, Natural Language Inference
Published	2017-12-06
URL	http://arxiv.org/abs/1712.02047v1
PDF	http://arxiv.org/pdf/1712.02047v1.pdf
PWC	https://paperswithcode.com/paper/distance-based-self-attention-network-for
Repo
Framework
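
The "simple distance mask" can be pictured as a penalty on attention logits that grows with the distance between word positions. The sketch below assumes one plausible form, subtracting `alpha * |i - j|` before the softmax. The function name and the `alpha` parameter are illustrative assumptions, and the abstract does not give the exact formula.

```python
import math

def distance_masked_attention(scores, alpha=1.0):
    """Row-wise softmax of scores[i][j] - alpha * |i - j| (assumed mask form)."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Penalise each logit in proportion to the distance between positions.
        logits = [scores[i][j] - alpha * abs(i - j) for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# With equal raw scores, nearby words receive more attention than distant ones.
w = distance_masked_attention([[0.0] * 4 for _ in range(4)], alpha=1.0)
```

Because the mask only shifts the logits and never sets them to minus infinity, distant words keep a nonzero weight, so global dependencies remain reachable; setting `alpha=0` recovers unmasked attention.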

On reproduction of On the regularization of Wasserstein GANs


Title	On reproduction of On the regularization of Wasserstein GANs
Authors	Junghoon Seo, Taegyun Jeon
Abstract	This report has several purposes. First, our report is written to investigate the reproducibility of the submitted paper On the regularization of Wasserstein GANs (2018). Second, among the experiments performed in the submitted paper, five aspects were emphasized and reproduced: learning speed, stability, robustness against hyperparameters, estimating the Wasserstein distance, and various sampling methods. Finally, we identify which parts of the contribution can be reproduced, and at what cost in terms of resources. All source code for reproduction is open to the public.
Tasks
Published	2017-12-16
URL	http://arxiv.org/abs/1712.05882v1
PDF	http://arxiv.org/pdf/1712.05882v1.pdf
PWC	https://paperswithcode.com/paper/on-reproduction-of-on-the-regularization-of
Repo
Framework

Trimming the Independent Fat: Sufficient Statistics, Mutual Information, and Predictability from Effective Channel States


Title	Trimming the Independent Fat: Sufficient Statistics, Mutual Information, and Predictability from Effective Channel States
Authors	Ryan G. James, John R. Mahoney, James P. Crutchfield
Abstract	One of the most fundamental questions one can ask about a pair of random variables X and Y is the value of their mutual information. Unfortunately, this task is often stymied by the extremely large dimension of the variables. We might hope to replace each variable by a lower-dimensional representation that preserves the relationship with the other variable. The theoretically ideal implementation is the use of minimal sufficient statistics, where it is well-known that either X or Y can be replaced by their minimal sufficient statistic about the other while preserving the mutual information. While intuitively reasonable, it is not obvious or straightforward that both variables can be replaced simultaneously. We demonstrate that this is in fact possible: the information X’s minimal sufficient statistic preserves about Y is exactly the information that Y’s minimal sufficient statistic preserves about X. As an important corollary, we consider the case where one variable is a stochastic process’ past and the other its future and the present is viewed as a memoryful channel. In this case, the mutual information is the channel transmission rate between the channel’s effective states. That is, the past-future mutual information (the excess entropy) is the amount of information about the future that can be predicted using the past. Translating our result about minimal sufficient statistics, this is equivalent to the mutual information between the forward- and reverse-time causal states of computational mechanics. We close by discussing multivariate extensions to this use of minimal sufficient statistics.
Tasks
Published	2017-02-07
URL	http://arxiv.org/abs/1702.01831v1
PDF	http://arxiv.org/pdf/1702.01831v1.pdf
PWC	https://paperswithcode.com/paper/trimming-the-independent-fat-sufficient
Repo
Framework
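
The central claim, that both variables can be replaced by their minimal sufficient statistics at once without losing mutual information, can be checked numerically on a toy example. The sketch below assumes finite alphabets and a joint distribution listing only positive-probability pairs; the function names and the example distribution are made up for illustration. Each value of one variable is relabelled by the conditional distribution it induces on the other, so values with identical conditionals collapse into one state.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution given as {(x, y): p}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def sufficient_labels(joint):
    """Map each x to its conditional distribution p(y | x), rounded for comparison."""
    px = defaultdict(float)
    for (x, _), p in joint.items():
        px[x] += p
    ys = sorted({y for _, y in joint})
    return {x: tuple(round(joint.get((x, y), 0.0) / px[x], 12) for y in ys)
            for x in px}

def reduce_both(joint):
    """Replace X and Y simultaneously by their sufficient-statistic labels."""
    fx = sufficient_labels(joint)
    gy = sufficient_labels({(y, x): p for (x, y), p in joint.items()})
    reduced = defaultdict(float)
    for (x, y), p in joint.items():
        reduced[(fx[x], gy[y])] += p
    return dict(reduced)

# Toy joint: x=0 and x=1 induce the same p(y|x); y=0 and y=1 induce the same p(x|y).
joint = {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.2, (2, 2): 0.4}
reduced = reduce_both(joint)
```

In this example the three-by-three joint collapses to two states on each side, and the mutual information of `reduced` matches that of `joint` up to floating-point error.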

Visual Saliency Prediction Using a Mixture of Deep Neural Networks


Title	Visual Saliency Prediction Using a Mixture of Deep Neural Networks
Authors	Samuel Dodge, Lina Karam
Abstract	Visual saliency models have recently begun to incorporate deep learning to achieve predictive capacity much greater than previous unsupervised methods. However, most existing models predict saliency using local mechanisms limited to the receptive field of the network. We propose a model that incorporates global scene semantic information in addition to local information gathered by a convolutional neural network. Our model is formulated as a mixture of experts. Each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks’ output, with weights determined by a separate gating network. This gating network is guided by global scene information to predict weights. The expert networks and the gating network are trained simultaneously in an end-to-end manner. We show that our mixture formulation leads to improvement in performance over an otherwise identical non-mixture model that does not incorporate global scene information.
Tasks	Saliency Prediction
Published	2017-02-01
URL	http://arxiv.org/abs/1702.00372v1
PDF	http://arxiv.org/pdf/1702.00372v1.pdf
PWC	https://paperswithcode.com/paper/visual-saliency-prediction-using-a-mixture-of
Repo
Framework
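
The final mixing step described in the abstract, a weighted combination of expert outputs with weights from a gating network, can be sketched as follows. The softmax over gating scores and the names `mix_saliency` and `gate_logits` are assumptions for illustration; the abstract only states that the gating network predicts the weights.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mix_saliency(expert_maps, gate_logits):
    """Per-pixel weighted sum of expert saliency maps (lists of rows)."""
    weights = softmax(gate_logits)  # assumed gating normalisation
    h, w = len(expert_maps[0]), len(expert_maps[0][0])
    return [[sum(wt * m[r][c] for wt, m in zip(weights, expert_maps))
             for c in range(w)] for r in range(h)]

# Two 1x2 expert maps mixed with equal gating scores.
mixed = mix_saliency([[[1.0, 0.0]], [[0.0, 1.0]]], [0.0, 0.0])
```

In the paper both the experts and the gate are neural networks trained end to end; here the maps and gating scores are plain numbers so that only the mixing arithmetic is shown.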

Interactive, Intelligent Tutoring for Auxiliary Constructions in Geometry Proofs


Title	Interactive, Intelligent Tutoring for Auxiliary Constructions in Geometry Proofs
Authors	Ke Wang, Zhendong Su
Abstract	Geometry theorem proving forms a major and challenging component in the K-12 mathematics curriculum. A particularly difficult task is to add auxiliary constructions (i.e., additional lines or points) to aid proof discovery. Although there exist many intelligent tutoring systems proposed for geometry proofs, few teach students how to find auxiliary constructions, and the few exceptions are all limited by their underlying reasoning processes for supporting auxiliary constructions. This paper tackles these weaknesses of prior systems by introducing an interactive geometry tutor, the Advanced Geometry Proof Tutor (AGPT). It leverages a recent automated geometry prover to provide combined benefits that any geometry theorem prover or intelligent tutoring system alone cannot accomplish. In particular, AGPT not only can automatically process images of geometry problems directly, but also can interactively train and guide students toward discovering auxiliary constructions on their own. We have evaluated AGPT via a pilot study with 78 high school students. The study results show that, on training students how to find auxiliary constructions, there is no significant perceived difference between AGPT and human tutors, and AGPT is significantly more effective than the state-of-the-art geometry solver that produces human-readable proofs.
Tasks	Automated Theorem Proving
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07154v1
PDF	http://arxiv.org/pdf/1711.07154v1.pdf
PWC	https://paperswithcode.com/paper/interactive-intelligent-tutoring-for
Repo
Framework

A Survey on Multi-View Clustering


Title	A Survey on Multi-View Clustering
Authors	Guoqing Chao, Shiliang Sun, Jinbo Bi
Abstract	With advances in information acquisition technologies, multi-view data have become ubiquitous. Multi-view learning has thus become more and more popular in the machine learning and data mining fields. Multi-view unsupervised or semi-supervised learning, such as co-training and co-regularization, has gained considerable attention. Although multi-view clustering (MVC) methods have developed rapidly in recent years, there has not been a survey to summarize and analyze the current progress. Therefore, this paper reviews the common strategies for combining multiple views of data, and based on this summary we propose a novel taxonomy of MVC approaches. We further discuss the relationships between MVC and multi-view representation, ensemble clustering, multi-task clustering, and multi-view supervised and semi-supervised learning. Several representative real-world applications are elaborated. To promote future development of MVC, we envision several open problems that may require further investigation and thorough examination.
Tasks	Multi-View Learning
Published	2017-12-18
URL	http://arxiv.org/abs/1712.06246v2
PDF	http://arxiv.org/pdf/1712.06246v2.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-multi-view-clustering
Repo
Framework

Deep Local Binary Patterns


Title	Deep Local Binary Patterns
Authors	Kelwin Fernandes, Jaime S. Cardoso
Abstract	Local Binary Pattern (LBP) is a traditional descriptor for texture analysis that gained attention in the last decade. Owing to properties such as invariance to illumination translation and scaling, LBPs achieved state-of-the-art results in several applications. However, LBPs are not able to capture high-level features from the image, merely encoding features with low abstraction levels. In this work, we propose Deep LBP, which borrows ideas from the deep learning community to improve LBP expressiveness. By using parametrized data-driven LBP, we enable successive applications of the LBP operators with increasing abstraction levels. We validate the relevance of the proposed idea in several datasets from a wide range of applications. Deep LBP improved the performance of traditional and multiscale LBP in all cases.
Tasks	Texture Classification
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06597v1
PDF	http://arxiv.org/pdf/1711.06597v1.pdf
PWC	https://paperswithcode.com/paper/deep-local-binary-patterns
Repo
Framework
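
For readers unfamiliar with the traditional descriptor that Deep LBP builds on, here is a minimal sketch of the classic 3x3 LBP code for a single pixel. The neighbour ordering and the `>=` comparison follow one common convention and are assumptions here; the parametrized, data-driven operators proposed in the paper are not shown.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of the pixel at (r, c) from its 3x3 neighbourhood."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

img = [[5, 9, 1],
       [4, 6, 7],
       [2, 3, 8]]
code = lbp_code(img, 1, 1)  # bits set for the neighbours 9, 7 and 8
```

Since the code depends only on comparisons with the centre pixel, adding a constant to every pixel or multiplying all pixels by a positive factor leaves it unchanged, which is the illumination invariance mentioned in the abstract.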

Testing the limits of unsupervised learning for semantic similarity


Title	Testing the limits of unsupervised learning for semantic similarity
Authors	Richa Sharma, Muktabh Mayank Srivastava
Abstract	Semantic Similarity between two sentences can be defined as a way to determine how related or unrelated two sentences are. In terms of distributed representations, the task of Semantic Similarity can be thought of as generating sentence embeddings (dense vectors) that take both the context and the meaning of a sentence into account. Such embeddings can be produced by multiple methods; in this paper we evaluate LSTM auto encoders for generating these embeddings. Unsupervised algorithms (auto encoders, to be specific) just try to recreate their inputs, but they can be forced to learn order (and some inherent meaning to some extent) by creating proper bottlenecks. We evaluate how well algorithms trained only on plain English sentences can learn Semantic Similarity, without being given any sense of what the meaning of a sentence is.
Tasks	Semantic Similarity, Semantic Textual Similarity, Sentence Embeddings
Published	2017-10-23
URL	http://arxiv.org/abs/1710.08246v1
PDF	http://arxiv.org/pdf/1710.08246v1.pdf
PWC	https://paperswithcode.com/paper/testing-the-limits-of-unsupervised-learning
Repo
Framework
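
Once sentences are mapped to dense vectors, their relatedness is typically scored by comparing the vectors. The abstract does not say which measure is used, so the sketch below assumes cosine similarity, a common choice, and uses made-up two-dimensional embeddings in place of LSTM auto encoder outputs.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Define the similarity with a zero vector as 0.0 to avoid dividing by zero.
    return dot / norm if norm else 0.0

# Hypothetical sentence embeddings; real ones would come from the encoder.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction score close to 1 regardless of length, while orthogonal vectors score 0.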