Paper Group ANR 881
Learning to Navigate Using Mid-Level Visual Priors
Title | Learning to Navigate Using Mid-Level Visual Priors |
Authors | Alexander Sax, Jeffrey O. Zhang, Bradley Emi, Amir Zamir, Silvio Savarese, Leonidas Guibas, Jitendra Malik |
Abstract | How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. navigating a complex environment)? What are the consequences of not utilizing such visual priors in learning? We study these questions by integrating a generic perceptual skill set (a distance estimator, an edge detector, etc.) within a reinforcement learning framework (see Fig. 1). This skill set (“mid-level vision”) provides the policy with a more processed state of the world compared to raw images. Our large-scale study demonstrates that using mid-level vision results in policies that learn faster, generalize better, and achieve higher final performance, when compared to learning from scratch and/or using state-of-the-art visual and non-visual representation learning methods. We show that conventional computer vision objectives are particularly effective in this regard and can be conveniently integrated into reinforcement learning frameworks. Finally, we found that no single visual representation was universally useful for all downstream tasks, hence we computationally derive a task-agnostic set of representations optimized to support arbitrary downstream tasks. |
Tasks | Representation Learning |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.11121v1 |
https://arxiv.org/pdf/1912.11121v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-navigate-using-mid-level-visual |
Repo | |
Framework | |
Genetic Programming for Evolving Similarity Functions for Clustering: Representations and Analysis
Title | Genetic Programming for Evolving Similarity Functions for Clustering: Representations and Analysis |
Authors | Andrew Lensen, Bing Xue, Mengjie Zhang |
Abstract | Clustering is a difficult and widely-studied data mining task, with many varieties of clustering algorithms proposed in the literature. Nearly all algorithms use a similarity measure such as a distance metric (e.g. Euclidean distance) to decide which instances to assign to the same cluster. These similarity measures are generally pre-defined and cannot be easily tailored to the properties of a particular dataset, which leads to limitations in the quality and the interpretability of the clusters produced. In this paper, we propose a new approach to automatically evolving similarity functions for a given clustering algorithm by using genetic programming. We introduce a new genetic programming-based method which automatically selects a small subset of features (feature selection) and then combines them using a variety of functions (feature construction) to produce dynamic and flexible similarity functions that are specifically designed for a given dataset. We demonstrate how the evolved similarity functions can be used to perform clustering using a graph-based representation. The results of a variety of experiments across a range of large, high-dimensional datasets show that the proposed approach can achieve higher and more consistent performance than the benchmark methods. We further extend the proposed approach to automatically produce multiple complementary similarity functions by using a multi-tree approach, which gives further performance improvements. We also analyse the interpretability and structure of the automatically evolved similarity functions to provide insight into how and why they are superior to standard distance metrics. |
Tasks | Feature Selection |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10264v1 |
https://arxiv.org/pdf/1910.10264v1.pdf | |
PWC | https://paperswithcode.com/paper/genetic-programming-for-evolving-similarity |
Repo | |
Framework | |
Parametric Graph-based Separable Transforms for Video Coding
Title | Parametric Graph-based Separable Transforms for Video Coding |
Authors | Hilmi E. Egilmez, Oguzhan Teke, Amir Said, Vadim Seregin, Marta Karczewicz |
Abstract | In many video coding systems, separable transforms (such as two-dimensional DCT-2) have been used to code block residual signals obtained after prediction. This paper proposes a parametric approach to build graph-based separable transforms (GBSTs) for video coding. Specifically, a GBST is derived from a pair of line graphs, whose weights are determined based on two non-negative parameters. As certain choices of those parameters correspond to the discrete sine and cosine transform types used in recent video coding standards (including DCT-2, DST-7 and DCT-8), this paper further optimizes these graph parameters to better capture residual block statistics and improve video coding efficiency. The proposed GBSTs are tested on the Versatile Video Coding (VVC) reference software, and the experimental results show that about 0.4% average coding gain is achieved over the existing set of separable transforms constructed based on DCT-2, DST-7 and DCT-8 in VVC. |
Tasks | |
Published | 2019-11-16 |
URL | https://arxiv.org/abs/1911.06981v2 |
https://arxiv.org/pdf/1911.06981v2.pdf | |
PWC | https://paperswithcode.com/paper/parametric-graph-based-separable-transforms |
Repo | |
Framework | |
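The abstract above notes that certain line-graph parameter choices reproduce the DCT and DST types used in video coding. The best-known case can be checked numerically: the DCT-2 basis vectors are eigenvectors of the Laplacian of an unweighted line graph. The sketch below verifies only that case. The function names are illustrative, and the `alpha` and `beta` self-loop weights are an assumption about how two end-point parameters might enter the Laplacian; the abstract does not say how the paper parameterizes its graphs.

```python
import math

def line_graph_laplacian(n, alpha=0.0, beta=0.0):
    """Laplacian of a unit-weight line (path) graph on n nodes.

    alpha and beta are hypothetical self-loop weights at the two end
    nodes; alpha = beta = 0 gives the plain combinatorial Laplacian.
    """
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        # Each unit-weight edge (i, i+1) raises both degrees and couples the pair.
        lap[i][i] += 1.0
        lap[i + 1][i + 1] += 1.0
        lap[i][i + 1] -= 1.0
        lap[i + 1][i] -= 1.0
    lap[0][0] += alpha
    lap[n - 1][n - 1] += beta
    return lap

def dct2_basis_vector(n, k):
    """k-th (unnormalized) DCT-2 basis vector of length n."""
    return [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]

def max_eigen_residual(n):
    """Largest entry of |L u_k - lambda_k u_k| over all DCT-2 vectors u_k."""
    lap = line_graph_laplacian(n)
    worst = 0.0
    for k in range(n):
        u = dct2_basis_vector(n, k)
        lam = 2.0 - 2.0 * math.cos(math.pi * k / n)  # eigenvalue paired with u_k
        for i in range(n):
            lu_i = sum(lap[i][j] * u[j] for j in range(n))
            worst = max(worst, abs(lu_i - lam * u[i]))
    return worst
```

A residual near machine precision from `max_eigen_residual(8)` indicates that the DCT-2 vectors diagonalize this Laplacian. A separable 2-D transform would apply one such 1-D basis along rows and another along columns.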
Estimating Normalizing Constants for Log-Concave Distributions: Algorithms and Lower Bounds
Title | Estimating Normalizing Constants for Log-Concave Distributions: Algorithms and Lower Bounds |
Authors | Rong Ge, Holden Lee, Jianfeng Lu |
Abstract | Estimating the normalizing constant of an unnormalized probability distribution has important applications in computer science, statistical physics, machine learning, and statistics. In this work, we consider the problem of estimating the normalizing constant $Z=\int_{\mathbb{R}^d} e^{-f(x)}\,\mathrm{d}x$ to within a multiplicative factor of $1 \pm \varepsilon$ for a $\mu$-strongly convex and $L$-smooth function $f$, given query access to $f(x)$ and $\nabla f(x)$. We give both algorithms and lower bounds for this problem. Using an annealing algorithm combined with a multilevel Monte Carlo method based on underdamped Langevin dynamics, we show that $\widetilde{\mathcal{O}}\Bigl(\frac{d^{4/3}\kappa + d^{7/6}\kappa^{7/6}}{\varepsilon^2}\Bigr)$ queries to $\nabla f$ are sufficient, where $\kappa= L / \mu$ is the condition number. Moreover, we provide an information-theoretic lower bound, showing that at least $\frac{d^{1-o(1)}}{\varepsilon^{2-o(1)}}$ queries are necessary. This provides a first nontrivial lower bound for the problem. |
Tasks | |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03043v1 |
https://arxiv.org/pdf/1911.03043v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-normalizing-constants-for-log |
Repo | |
Framework | |
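The annealing idea in the abstract above can be illustrated on a one-dimensional toy problem. The sketch below writes $Z$ as a known starting constant times a telescoping product of ratios $Z_{i+1}/Z_i$, each estimated as a sample mean. Everything specific is an assumption made for illustration: the target $f(x) = x^2/2$, the Gaussian annealing schedule, and exact Gaussian sampling used as a stand-in for the underdamped Langevin sampler. The paper's multilevel Monte Carlo component is not reproduced.

```python
import math
import random

def annealed_normalizing_constant(sigma2_schedule, n_samples=20000, seed=0):
    """Estimate Z = integral of exp(-x^2 / 2) dx by annealing (1-D toy).

    Stage i uses f_i(x) = x^2/2 + x^2/(2 * sigma2_i); the final stage is
    the target f(x) = x^2/2 itself.
    """
    rng = random.Random(seed)
    # Inverse annealing variances; the trailing 0.0 is the un-annealed target.
    inv = [1.0 / s2 for s2 in sigma2_schedule] + [0.0]
    # Z_0 is available in closed form because stage 0 is exactly Gaussian here.
    z = math.sqrt(2.0 * math.pi / (1.0 + inv[0]))
    for cur, nxt in zip(inv, inv[1:]):
        std = math.sqrt(1.0 / (1.0 + cur))  # exact sampler, stand-in for Langevin
        total = 0.0
        for _ in range(n_samples):
            x = rng.gauss(0.0, std)
            # exp(f_i(x) - f_{i+1}(x)); its mean under p_i equals Z_{i+1} / Z_i
            total += math.exp(0.5 * (cur - nxt) * x * x)
        z *= total / n_samples
    return z
```

With a doubling schedule such as `[0.25, 0.5, 1, 2, 4, 8, 16]` the estimate should land close to the exact value $\sqrt{2\pi}$. Successive stages need to overlap enough for each ratio estimator to have finite variance, which is why the schedule grows geometrically rather than jumping straight to the target.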
From Persistent Homology to Reinforcement Learning with Applications for Retail Banking
Title | From Persistent Homology to Reinforcement Learning with Applications for Retail Banking |
Authors | Jeremy Charlier |
Abstract | Retail banking services are one of the pillars of modern economic growth. However, the evolution of clients’ habits in modern societies and the recent European regulations promoting more competition mean that retail banks will encounter serious challenges over the next few years, endangering their activities. They now face an impossible compromise: maximizing the satisfaction of their hyper-connected clients while avoiding any risk of default and remaining compliant with regulations. Therefore, advanced and novel research concepts are a serious game-changer to gain a competitive advantage. In this context, we investigate in this thesis different concepts bridging the gap between persistent homology, neural networks, recommender engines and reinforcement learning with the aim of improving the quality of the retail banking services. Our contribution is threefold. First, we highlight how to overcome insufficient financial data by generating artificial data using generative models and persistent homology. Then, we present how to perform accurate financial recommendations in multiple dimensions. Finally, we underline a reinforcement learning model-free approach to determine the optimal policy of money management based on the aggregated financial transactions of the clients. Our experimental data sets, extracted from well-known institutions where the privacy and the confidentiality of the clients were not put at risk, support our contributions. In this work, we provide the motivations of our retail banking research project, describe the theory employed to improve the financial services quality and evaluate our methodologies quantitatively and qualitatively for each of the proposed research scenarios. |
Tasks | |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.11573v1 |
https://arxiv.org/pdf/1911.11573v1.pdf | |
PWC | https://paperswithcode.com/paper/from-persistent-homology-to-reinforcement |
Repo | |
Framework | |
Recurrent Connectivity Aids Recognition of Partly Occluded Objects
Title | Recurrent Connectivity Aids Recognition of Partly Occluded Objects |
Authors | Markus Roland Ernst, Jochen Triesch, Thomas Burwick |
Abstract | Feedforward convolutional neural networks are the prevalent model of core object recognition. For challenging conditions, such as occlusion, neuroscientists believe that the recurrent connectivity in the visual cortex aids object recognition. In this work we investigate if and how artificial neural networks can also benefit from recurrent connectivity. For this we systematically compare architectures comprised of bottom-up (B), lateral (L) and top-down (T) connections. To evaluate performance, we introduce two novel stereoscopic occluded object datasets, which bridge the gap from classifying digits to recognizing 3D objects. The task consists of recognizing one target object occluded by multiple occluder objects. We find that recurrent models perform significantly better than their feedforward counterparts, which were matched in parametric complexity. We show that for challenging stimuli, the recurrent feedback is able to correctly revise the initial feedforward guess of the network. Overall, our results suggest that both artificial and biological neural networks can exploit recurrence for improved object recognition. |
Tasks | Object Recognition |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.06175v1 |
https://arxiv.org/pdf/1909.06175v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-connectivity-aids-recognition-of |
Repo | |
Framework | |
On the Variance of Unbiased Online Recurrent Optimization
Title | On the Variance of Unbiased Online Recurrent Optimization |
Authors | Tim Cooijmans, James Martens |
Abstract | The recently proposed Unbiased Online Recurrent Optimization algorithm (UORO, arXiv:1702.05043) uses an unbiased approximation of RTRL to achieve fully online gradient-based learning in RNNs. In this work we analyze the variance of the gradient estimate computed by UORO, and propose several possible changes to the method which reduce this variance both in theory and practice. We also contribute significantly to the theoretical and intuitive understanding of UORO (and its existing variance reduction technique), and demonstrate a fundamental connection between its gradient estimate and the one that would be computed by REINFORCE if small amounts of noise were added to the RNN’s hidden units. |
Tasks | |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02405v1 |
http://arxiv.org/pdf/1902.02405v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-variance-of-unbiased-online-recurrent |
Repo | |
Framework | |
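The abstract above concerns an unbiased approximation and its variance. One standard building block for such estimators, assumed here for illustration and not taken from the abstract, is the random-sign rank-one trick: a sum of outer products $\sum_i a_i b_i^\top$ is replaced by the single outer product $(\sum_i \nu_i a_i)(\sum_i \nu_i b_i)^\top$ with independent signs $\nu_i \in \{-1, +1\}$. The sketch enumerates every sign vector, so the mean and the total variance are exact rather than sampled. All function names are illustrative.

```python
import itertools

def exact_sum(a_vecs, b_vecs):
    """Full-rank target: sum over i of the outer product a_i b_i^T."""
    rows, cols = len(a_vecs[0]), len(b_vecs[0])
    return [[sum(a[r] * b[c] for a, b in zip(a_vecs, b_vecs))
             for c in range(cols)] for r in range(rows)]

def rank_one_estimate(a_vecs, b_vecs, signs):
    """Outer product (sum_i s_i a_i)(sum_i s_i b_i)^T for one sign vector."""
    a_mix = [sum(s * a[r] for s, a in zip(signs, a_vecs))
             for r in range(len(a_vecs[0]))]
    b_mix = [sum(s * b[c] for s, b in zip(signs, b_vecs))
             for c in range(len(b_vecs[0]))]
    return [[x * y for y in b_mix] for x in a_mix]

def mean_and_total_variance(a_vecs, b_vecs):
    """Exact mean and summed entrywise variance over all 2^n sign vectors."""
    target = exact_sum(a_vecs, b_vecs)
    rows, cols = len(target), len(target[0])
    all_signs = list(itertools.product((-1.0, 1.0), repeat=len(a_vecs)))
    mean = [[0.0] * cols for _ in range(rows)]
    variance = 0.0
    for signs in all_signs:
        est = rank_one_estimate(a_vecs, b_vecs, signs)
        for r in range(rows):
            for c in range(cols):
                mean[r][c] += est[r][c] / len(all_signs)
                variance += (est[r][c] - target[r][c]) ** 2 / len(all_signs)
    return mean, variance
```

The mean matches `exact_sum` because the cross terms $\nu_i \nu_j$ with $i \neq j$ average to zero. Those same cross terms make up the variance, which is the quantity a variance-reduction scheme would aim to shrink.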
Protein Fold Family Recognition From Unassigned Residual Dipolar Coupling Data
Title | Protein Fold Family Recognition From Unassigned Residual Dipolar Coupling Data |
Authors | Rishi Mukhopadhyay, Paul Shealy, Homayoun Valafar |
Abstract | Despite many advances in computational modeling of protein structures, these methods have not been widely utilized by experimental structural biologists. Two major obstacles are preventing the transition from a purely-experimental to a purely-computational mode of protein structure determination. The first problem is that most computational methods need a large library of computed structures that span a large variety of protein fold families, while structural genomics initiatives have slowed in their ability to provide novel protein folds in recent years. The second problem is an unwillingness to trust computational models that have no experimental backing. In this paper we test a potential solution to these problems that we have called Probability Density Profile Analysis (PDPA) that utilizes unassigned residual dipolar coupling data that are relatively cheap to acquire from NMR experiments. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00383v1 |
https://arxiv.org/pdf/1911.00383v1.pdf | |
PWC | https://paperswithcode.com/paper/protein-fold-family-recognition-from |
Repo | |
Framework | |
ROBO: Robust, Fully Neural Object Detection for Robot Soccer
Title | ROBO: Robust, Fully Neural Object Detection for Robot Soccer |
Authors | Marton Szemenyei, Vladimir Estivill-Castro |
Abstract | Deep Learning has become exceptionally popular in the last few years due to its success in computer vision and other fields of AI. However, deep neural networks are computationally expensive, which limits their application in low-power embedded systems, such as mobile robots. In this paper, an efficient neural network architecture is proposed for the problem of detecting relevant objects in robot soccer environments. The ROBO model’s increase in efficiency is achieved by exploiting the peculiarities of the environment. Compared to the state-of-the-art Tiny YOLO model, the proposed network provides an approximately 35-fold decrease in run time, while achieving superior average precision, although at the cost of slightly worse localization accuracy. |
Tasks | Object Detection |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10949v1 |
https://arxiv.org/pdf/1910.10949v1.pdf | |
PWC | https://paperswithcode.com/paper/robo-robust-fully-neural-object-detection-for |
Repo | |
Framework | |
A Deep Convolutional Network for Seismic Shot-Gather Image Quality Classification
Title | A Deep Convolutional Network for Seismic Shot-Gather Image Quality Classification |
Authors | Eduardo Betine Bucker, Antonio José Grandson Busson, Ruy Luiz Milidiú, Sérgio Colcher, Bruno Pereira Dias, André Bulcão |
Abstract | Deep Learning-based models, such as Convolutional Neural Networks, have led to significant advancements in several areas of computing applications. Seismogram quality assurance is a relevant Geophysics task, since in the early stages of seismic processing, we are required to identify and fix noisy sail lines. In this work, we introduce a real-world seismogram quality classification dataset based on 6,613 examples, manually labeled by human experts as good, bad or ugly, according to their noise intensity. This dataset is used to train a CNN classifier for seismic shot-gather quality prediction. In our empirical evaluation, we observe an F1-score of 93.56% on the test set. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01148v1 |
https://arxiv.org/pdf/1912.01148v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-convolutional-network-for-seismic-shot |
Repo | |
Framework | |
Transfer Learning with Edge Attention for Prostate MRI Segmentation
Title | Transfer Learning with Edge Attention for Prostate MRI Segmentation |
Authors | Xiangxiang Qin |
Abstract | Prostate cancer is one of the common diseases in men, and it is the most common malignant tumor in developed countries. Studies have shown that the male prostate incidence rate is as high as 2.5% to 16%. Currently, the incidence of prostate cancer in Asia is lower than that in the West, but it is increasing rapidly. If prostate cancer can be found as early as possible and treated in time, it will have a high survival rate. Early detection is therefore of great significance for the diagnosis and treatment of prostate cancer. In this paper, we propose a transfer learning method based on a deep neural network for prostate MRI segmentation. In addition, we design a multi-level edge attention module using wavelet decomposition to overcome the difficulty of ambiguous boundaries in prostate MRI segmentation tasks. The prostate images were provided by the MICCAI Grand Challenge: Prostate MR Image Segmentation 2012 (PROMISE12) challenge dataset. |
Tasks | Semantic Segmentation, Transfer Learning |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09847v1 |
https://arxiv.org/pdf/1912.09847v1.pdf | |
PWC | https://paperswithcode.com/paper/191209847 |
Repo | |
Framework | |
Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees
Title | Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees |
Authors | Laurent Orseau, Levi H. S. Lelis, Tor Lattimore |
Abstract | We introduce and analyze two parameter-free linear-memory tree search algorithms. Under mild assumptions we prove our algorithms are guaranteed to perform only a logarithmic factor more node expansions than A* when the search space is a tree. Previously, the best guarantee for a linear-memory algorithm under similar assumptions was achieved by IDA*, which in the worst case expands quadratically more nodes than in its last iteration. Empirical results support the theory and demonstrate the practicality and robustness of our algorithms. Furthermore, they are fast and easy to implement. |
Tasks | |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03242v1 |
https://arxiv.org/pdf/1906.03242v1.pdf | |
PWC | https://paperswithcode.com/paper/zooming-cautiously-linear-memory-heuristic |
Repo | |
Framework | |
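For context on the baseline named in the abstract above, here is a minimal sketch of IDA* on a tree. It shows why the method needs only linear memory (a depth-first search with a cost bound) and why it re-expands nodes (every iteration restarts from the root with a larger bound). This is the classical baseline, not either of the paper's two algorithms. The callback names are illustrative and the tree is assumed to be finite.

```python
import math

def ida_star(root, children, cost, heuristic, is_goal):
    """Iterative-deepening A* on a finite tree.

    Returns (solution_cost, expansions); solution_cost is None if no goal
    exists. Memory use is proportional to the current search depth.
    """
    expansions = 0

    def search(node, g, bound):
        nonlocal expansions
        f = g + heuristic(node)
        if f > bound:
            return None, f  # report the f-value that exceeded the bound
        if is_goal(node):
            return g, f
        expansions += 1
        next_bound = math.inf
        for child in children(node):
            found, t = search(child, g + cost(node, child), bound)
            if found is not None:
                return found, t
            next_bound = min(next_bound, t)
        return None, next_bound

    bound = heuristic(root)
    while bound < math.inf:
        # Each iteration restarts from the root with the smallest f-value
        # that exceeded the previous bound.
        found, bound = search(root, 0.0, bound)
        if found is not None:
            return found, expansions
    return None, expansions
```

On a tree where each iteration raises the bound only enough to admit a few new nodes, the repeated restarts account for the quadratic worst case the abstract attributes to IDA*.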
GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition
Title | GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition |
Authors | Hugo Braun, Justin Luitjens, Ryan Leary, Tim Kaldewey, Daniel Povey |
Abstract | We present an optimized weighted finite-state transducer (WFST) decoder capable of online streaming and offline batch processing of audio using Graphics Processing Units (GPUs). The decoder is efficient in memory utilization and input/output (I/O) bandwidth, and uses a novel Viterbi implementation designed to maximize parallelism. The reduced memory footprint allows the decoder to process significantly larger graphs than previously possible, while optimizing I/O increases the number of simultaneous streams supported. GPU preprocessing of lattice segments enables intermediate lattice results to be returned to the requestor during streaming inference. Collectively, the proposed algorithm yields up to a 240x speedup over single-core CPU decoding, and up to 40x faster decoding than the current state-of-the-art GPU decoder, while returning equivalent results. This decoder design enables deployment of production-grade ASR models on a large spectrum of systems, ranging from large data center servers to low-power edge devices. |
Tasks | Speech Recognition |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10032v2 |
https://arxiv.org/pdf/1910.10032v2.pdf | |
PWC | https://paperswithcode.com/paper/gpu-accelerated-viterbi-exact-lattice-decoder |
Repo | |
Framework | |
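As background for the Viterbi search mentioned in the abstract above, the sketch below shows the textbook dynamic program on a small hidden Markov model in the log domain. It is a deliberately simplified, sequential CPU illustration: it does not use a WFST, produces a single best path with no lattice, and has none of the paper's GPU parallelism. The function name and the dictionary-based model layout are assumptions chosen for readability.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_state_path, best_log_score) for a discrete HMM.

    log_start[s], log_trans[p][s] and log_emit[s][obs] hold log-probabilities.
    """
    # best[s] is the best log-score of any path that ends in state s.
    best = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in states:
            # Pick the predecessor that maximizes the score of reaching s.
            p = max(states, key=lambda q: prev_best[q] + log_trans[q][s])
            best[s] = prev_best[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        backpointers.append(pointers)
    # Trace the best final state back to the start.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

The inner loop over `states` is independent for each target state at a given time step, which is the kind of per-frame parallelism a GPU implementation can exploit.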
Machine learning applied to quantum synchronization-assisted probing
Title | Machine learning applied to quantum synchronization-assisted probing |
Authors | Gabriel Garau Estarellas, Gian Luca Giorgi, Miguel C. Soriano, Roberta Zambrini |
Abstract | A probing scheme is considered with an accessible and controllable qubit, used to probe an out-of-equilibrium system consisting of a second qubit interacting with an environment. Quantum spontaneous synchronization between the probe and the system emerges in this model and, by tuning the probe frequency, can occur both in-phase and in anti-phase. We analyze the capability of machine learning in this probing scheme based on quantum synchronization. An artificial neural network is used to infer, from a probe observable, main dissipation features, such as the environment Ohmicity index. The efficiency of the algorithm in the presence of some noise in the dataset is also considered. We show that the performance in both classification and regression is significantly improved due to the in/anti-phase synchronization transition. This opens the way to the characterization of environments with arbitrary spectral densities. |
Tasks | |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05230v1 |
http://arxiv.org/pdf/1901.05230v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-applied-to-quantum |
Repo | |
Framework | |
On the Realization of Compositionality in Neural Networks
Title | On the Realization of Compositionality in Neural Networks |
Authors | Joris Baan, Jana Leible, Mitja Nikolaus, David Rau, Dennis Ulmer, Tim Baumgärtner, Dieuwke Hupkes, Elia Bruni |
Abstract | We present a detailed comparison of two types of sequence-to-sequence models trained to conduct a compositional task. The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has been shown to be an effective method for encouraging more compositional solutions (Hupkes et al., 2019). We first confirm that the models with attentive guidance indeed infer more compositional solutions than the baseline, by training them on the lookup table task presented by Liška et al. (2019). We then do an in-depth analysis of the structural differences between the two model types, focusing in particular on the organisation of the parameter space and the hidden layer activations, and find noticeable differences in both these aspects. Guided networks focus more on the components of the input rather than the sequence as a whole and develop small functional groups of neurons with specific purposes that use their gates more selectively. Results from parameter heat maps, component swapping and graph analysis also indicate that guided networks exhibit a more modular structure with a small number of specialized, strongly connected neurons. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01634v2 |
https://arxiv.org/pdf/1906.01634v2.pdf | |
PWC | https://paperswithcode.com/paper/on-the-realization-of-compositionality-in |
Repo | |
Framework | |