July 29, 2019

3140 words 15 mins read

Paper Group ANR 78

Introduction to Tensor Decompositions and their Applications in Machine Learning. Model compression as constrained optimization, with application to neural nets. Part I: general framework. The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning. Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention. …

Introduction to Tensor Decompositions and their Applications in Machine Learning

Title Introduction to Tensor Decompositions and their Applications in Machine Learning
Authors Stephan Rabanser, Oleksandr Shchur, Stephan Günnemann
Abstract Tensors are multidimensional arrays of numerical values and therefore generalize matrices to multiple dimensions. While tensors first emerged in the psychometrics community in the $20^{\text{th}}$ century, they have since spread to numerous other disciplines, including machine learning. Tensors and their decompositions are especially beneficial in unsupervised learning settings, but are also gaining popularity in other sub-disciplines such as temporal and multi-relational data analysis. The scope of this paper is to give a broad overview of tensors, their decompositions, and how they are used in machine learning. As part of this, we introduce basic tensor concepts, discuss why tensors can be considered more rigid than matrices with respect to the uniqueness of their decomposition, explain the most important factorization algorithms and their properties, provide concrete examples of tensor decomposition applications in machine learning, conduct a case study on tensor-based estimation of mixture models, describe the current state of research, and provide references to available software libraries.
Tasks
Published 2017-11-29
URL http://arxiv.org/abs/1711.10781v1
PDF http://arxiv.org/pdf/1711.10781v1.pdf
PWC https://paperswithcode.com/paper/introduction-to-tensor-decompositions-and
Repo
Framework
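
The CP (CANDECOMP/PARAFAC) factorization covered by this survey is typically computed with alternating least squares. Below is a minimal NumPy sketch of rank-R CP-ALS for a 3-way tensor; the unfolding convention, random initialization, and fixed iteration count are our illustrative choices, not the paper's reference implementation.

```python
# Minimal CP-ALS sketch for a 3-way tensor, NumPy only.
import numpy as np

def unfold(X, mode):
    """Matricize a tensor along `mode` (C-order columns)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(U, V):
    """Column-wise Kronecker product; row order matches `unfold` above."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

def cp_als(X, rank, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, rank)) for s in X.shape)  # 3-way only
    for _ in range(n_iter):
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Build an exactly rank-3 tensor and check that ALS recovers it.
rng = np.random.default_rng(1)
A0, B0, C0 = rng.standard_normal((4, 3)), rng.standard_normal((5, 3)), rng.standard_normal((6, 3))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=3)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # relative error, ~0
```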

Model compression as constrained optimization, with application to neural nets. Part I: general framework

Title Model compression as constrained optimization, with application to neural nets. Part I: general framework
Authors Miguel Á. Carreira-Perpiñán
Abstract Compressing neural nets is an active research problem, given the large size of state-of-the-art nets for tasks such as object recognition, and the computational limits imposed by mobile devices. We give a general formulation of model compression as constrained optimization. This includes many types of compression: quantization, low-rank decomposition, pruning, lossless compression and others. Then, we give a general algorithm to optimize this nonconvex problem based on the augmented Lagrangian and alternating optimization. This results in a “learning-compression” algorithm, which alternates a learning step of the uncompressed model, independent of the compression type, with a compression step of the model parameters, independent of the learning task. This simple, efficient algorithm is guaranteed to find the best compressed model for the task in a local sense under standard assumptions. We present separately in several companion papers the development of this general framework into specific algorithms for model compression based on quantization, pruning and other variations, including experimental results on compressing neural nets and other models.
Tasks Model Compression, Object Recognition, Quantization
Published 2017-07-05
URL http://arxiv.org/abs/1707.01209v1
PDF http://arxiv.org/pdf/1707.01209v1.pdf
PWC https://paperswithcode.com/paper/model-compression-as-constrained-optimization-1
Repo
Framework
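
As a concrete illustration of the alternation the abstract describes, here is a hedged NumPy sketch of the learning-compression (LC) idea for the special case of scalar quantization: the L-step fits the model under a quadratic penalty pulling the weights toward their compressed version, and the C-step compresses (here, by 1-D k-means on the weights). The toy quadratic loss, penalty schedule, and hyperparameters are our assumptions, not the paper's experimental setup.

```python
import numpy as np

def kmeans_1d(x, k, n_iter=20):
    """Tiny Lloyd's algorithm: the C-step for quantization snaps each
    weight to one of k learned centroids."""
    centers = np.quantile(x, np.linspace(0, 1, k))
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return centers[labels]  # Delta(theta): the decompressed weights

# Toy task (our assumption): L(w) = 0.5 * ||H w - b||^2.
rng = np.random.default_rng(0)
H, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
w = np.linalg.lstsq(H, b, rcond=None)[0]        # reference (uncompressed) model
lam, mu = np.zeros_like(w), 1e-3                 # multipliers and penalty weight

for _ in range(30):
    theta = kmeans_1d(w - lam / mu, k=4)         # C-step: compress w - lam/mu
    # L-step: argmin_w L(w) + (mu/2)||w - theta - lam/mu||^2, in closed form
    w = np.linalg.solve(H.T @ H + mu * np.eye(20),
                        H.T @ b + mu * theta + lam)
    lam = lam - mu * (w - theta)                 # multiplier update
    mu *= 1.2                                    # drive w toward feasibility

print("distinct weight values:", np.unique(np.round(theta, 6)).size)  # <= 4
```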

The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning

Title The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning
Authors Anil Usumezbas, Ricardo Fabbri, Benjamin Kimia
Abstract The three-dimensional reconstruction of scenes from multiple views has made impressive strides in recent years, chiefly by methods correlating isolated feature points, intensities, or curvilinear structure. In the general setting, i.e., without requiring controlled acquisition, a limited number of objects, abundant patterns on objects, or object curves that follow particular models, the majority of these methods produce unorganized point clouds, meshes, or voxel representations of the reconstructed scene, with some exceptions producing 3D drawings as networks of curves. Many applications, e.g., robotics, urban planning, industrial design, and hard surface modeling, however, require structured representations that make explicit 3D curves, surfaces, and their spatial relationships. Surface reconstruction can now be constrained by the 3D drawing, which acts as a scaffold on which to hang the computed representations, leading to increased robustness and quality of reconstruction. This paper presents one way of completing such 3D drawings with surface reconstructions, by exploring occlusion reasoning through lofting algorithms.
Tasks
Published 2017-07-13
URL http://arxiv.org/abs/1707.03946v1
PDF http://arxiv.org/pdf/1707.03946v1.pdf
PWC https://paperswithcode.com/paper/the-surfacing-of-multiview-3d-drawings-via
Repo
Framework

Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention

Title Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention
Authors Alican Bozkurt, Kivanc Kose, Jaume Coll-Font, Christi Alessi-Fox, Dana H. Brooks, Jennifer G. Dy, Milind Rajadhyaksha
Abstract Reflectance confocal microscopy (RCM) is an effective, non-invasive pre-screening tool for skin cancer diagnosis, but it requires extensive training and experience to assess accurately. There are few quantitative tools available to standardize image acquisition and analysis, and the ones that are available are not interpretable. In this study, we use a recurrent neural network with attention over convolutional network features. We apply it to delineate skin strata in vertically-oriented stacks of transverse RCM image slices in an interpretable manner. We introduce a new attention mechanism called Toeplitz attention, which constrains the attention map to have a Toeplitz structure. Testing our model on an expert-labeled dataset of 504 RCM stacks, we achieve 88.17% image-wise classification accuracy, which is the current state of the art.
Tasks
Published 2017-12-01
URL http://arxiv.org/abs/1712.00192v1
PDF http://arxiv.org/pdf/1712.00192v1.pdf
PWC https://paperswithcode.com/paper/delineation-of-skin-strata-in-reflectance
Repo
Framework
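
The Toeplitz constraint in the abstract means the attention logit between slice positions i and j depends only on the offset i - j, so the whole map is parameterized by a single vector of length 2T-1. A speculative NumPy sketch (the parameter names and row-wise softmax normalization are our assumptions, not the paper's exact formulation):

```python
import numpy as np

def toeplitz_attention(features, offset_logits):
    """features: (T, D) per-slice features; offset_logits: (2T-1,) parameters."""
    T = features.shape[0]
    idx = np.arange(T)
    logits = offset_logits[idx[:, None] - idx[None, :] + T - 1]  # (T, T) Toeplitz
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)                # row-wise softmax
    return weights @ features                                    # attended features

T, D = 8, 16
rng = np.random.default_rng(0)
context = toeplitz_attention(rng.standard_normal((T, D)),
                             rng.standard_normal(2 * T - 1))
print(context.shape)  # (8, 16); every diagonal of the map shares one weight
```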

Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach

Title Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach
Authors Shun Yang, Wenshuo Wang, Chang Liu, Kevin Deng, J. Karl Hedrick
Abstract Deep learning-based approaches have been widely used for training controllers for autonomous vehicles due to their powerful ability to approximate nonlinear functions or policies. However, the training process usually requires large labeled data sets and takes a lot of time. In this paper, we analyze the influence of features on the performance of controllers trained using convolutional neural networks (CNNs), which provides a guideline for feature selection to reduce computation cost. We collect a large set of data using The Open Racing Car Simulator (TORCS) and classify the image features into three categories (sky-related, roadside-related, and road-related features). We then design two experimental frameworks to investigate the importance of each single feature for training a CNN controller. The first framework uses the training data with all three features included to train a controller, which is then tested with data that has one feature removed to evaluate the feature’s effects. The second framework is trained with the data that has one feature excluded, while all three features are included in the test data. Different driving scenarios are selected to test and analyze the trained controllers using the two experimental frameworks. The experiment results show that (1) the road-related features are indispensable for training the controller, (2) the roadside-related features are useful to improve the generalizability of the controller to scenarios with complicated roadside information, and (3) the sky-related features contribute little to training an end-to-end autonomous vehicle controller.
Tasks Autonomous Vehicles, Feature Selection
Published 2017-03-28
URL http://arxiv.org/abs/1703.09744v1
PDF http://arxiv.org/pdf/1703.09744v1.pdf
PWC https://paperswithcode.com/paper/feature-analysis-and-selection-for-training
Repo
Framework
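
An illustrative reconstruction of the two ablation frameworks, assuming a trivial linear "controller" (flattened pixels to steering angle) and crude band masks standing in for the sky / roadside / road regions. All of these stand-ins are our assumptions; the paper trains a CNN on TORCS imagery with semantically defined regions.

```python
import numpy as np

FEATURES = ("sky", "roadside", "road")

def mask_feature(images, feature):                 # images: (N, H, W)
    out = images.copy()
    H, W = images.shape[1:]
    if feature == "sky":
        out[:, :H // 3] = 0                        # blank the top band
    elif feature == "road":
        out[:, 2 * H // 3:] = 0                    # blank the bottom band
    else:                                          # roadside: side margins
        out[:, :, :W // 5] = 0
        out[:, :, 4 * W // 5:] = 0
    return out

def train(images, angles):
    X = images.reshape(len(images), -1)
    return np.linalg.lstsq(X, angles, rcond=None)[0]

def evaluate(w, images, angles):
    return float(np.mean((images.reshape(len(images), -1) @ w - angles) ** 2))

rng = np.random.default_rng(0)
imgs, angles = rng.random((200, 24, 32)), rng.standard_normal(200)

w_all = train(imgs, angles)
for f in FEATURES:
    # Framework 1: train on all features, test with one removed.
    mse1 = evaluate(w_all, mask_feature(imgs, f), angles)
    # Framework 2: train with one removed, test on all features.
    mse2 = evaluate(train(mask_feature(imgs, f), angles), imgs, angles)
    print(f, round(mse1, 3), round(mse2, 3))
```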

A Machine Learning Approach to Routing

Title A Machine Learning Approach to Routing
Authors Asaf Valadarsky, Michael Schapira, Dafna Shahaf, Aviv Tamar
Abstract Can ideas and techniques from machine learning be leveraged to automatically generate “good” routing configurations? We investigate the power of data-driven routing protocols. Our results suggest that applying ideas and techniques from deep reinforcement learning to this context yields high performance, motivating further research along these lines.
Tasks
Published 2017-08-10
URL http://arxiv.org/abs/1708.03074v2
PDF http://arxiv.org/pdf/1708.03074v2.pdf
PWC https://paperswithcode.com/paper/a-machine-learning-approach-to-routing
Repo
Framework

Learning to Disambiguate by Asking Discriminative Questions

Title Learning to Disambiguate by Asking Discriminative Questions
Authors Yining Li, Chen Huang, Xiaoou Tang, Chen-Change Loy
Abstract The ability to ask questions is a powerful tool for gathering information in order to learn about the world and resolve ambiguities. In this paper, we explore a novel problem of generating discriminative questions to help disambiguate visual instances. Our work can be seen as a complement and new extension to the rich body of research on image captioning and question answering. We introduce the first large-scale dataset with over 10,000 carefully annotated image-question tuples to facilitate benchmarking. In particular, each tuple consists of a pair of images and, on average, 4.6 discriminative questions (as positive samples) and 5.9 non-discriminative questions (as negative samples). In addition, we present an effective method for visual discriminative question generation. The method can be trained in a weakly supervised manner, without discriminative image-question tuples, using just existing visual question answering datasets. Promising results are shown against representative baselines through quantitative evaluations and user studies.
Tasks Image Captioning, Question Answering, Question Generation, Visual Question Answering
Published 2017-08-09
URL http://arxiv.org/abs/1708.02760v1
PDF http://arxiv.org/pdf/1708.02760v1.pdf
PWC https://paperswithcode.com/paper/learning-to-disambiguate-by-asking
Repo
Framework

Speaker-independent Speech Separation with Deep Attractor Network

Title Speaker-independent Speech Separation with Deep Attractor Network
Authors Yi Luo, Zhuo Chen, Nima Mesgarani
Abstract Despite the recent success of deep learning for many speech processing tasks, single-microphone, speaker-independent speech separation remains challenging for two main reasons. The first is the arbitrary order of the target and masker speakers in the mixture (the permutation problem), and the second is the unknown number of speakers in the mixture (the output dimension problem). We propose a novel deep learning framework for speech separation that addresses both of these issues. We use a neural network to project the time-frequency representation of the mixture signal into a high-dimensional embedding space. A reference point (attractor) is created in the embedding space to represent each speaker, defined as the centroid of that speaker’s time-frequency embeddings. The time-frequency embeddings of each speaker are then forced to cluster around the corresponding attractor point, which is used to determine the time-frequency assignment of the speaker. We propose three methods for finding the attractors for each source in the embedding space and compare their advantages and limitations. The objective function for the network is the standard signal reconstruction error, which enables end-to-end operation during both training and test phases. We evaluated our system on two- and three-speaker mixtures from the Wall Street Journal dataset (WSJ0) and report comparable or better performance than other state-of-the-art deep learning methods for speech separation.
Tasks Speech Separation
Published 2017-07-12
URL http://arxiv.org/abs/1707.03634v3
PDF http://arxiv.org/pdf/1707.03634v3.pdf
PWC https://paperswithcode.com/paper/speaker-independent-speech-separation-with
Repo
Framework
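
A minimal NumPy sketch of the attractor computation described above: during training, each attractor is the centroid of one speaker's time-frequency embeddings, and soft masks come from embedding-to-attractor similarity. Shapes and random inputs are purely illustrative; in the real system the embeddings are produced by a trained network.

```python
import numpy as np

def attractor_masks(V, Y):
    """V: (TF, K) embedding per time-frequency bin; Y: (TF, C) ideal binary masks."""
    A = (Y.T @ V) / (Y.sum(axis=0)[:, None] + 1e-8)  # (C, K) attractors = centroids
    sim = V @ A.T                                    # bin-to-attractor similarity
    e = np.exp(sim - sim.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)          # (TF, C) soft masks

TF, K, C = 1000, 20, 2
rng = np.random.default_rng(0)
V = rng.standard_normal((TF, K))
Y = np.eye(C)[rng.integers(0, C, TF)]                # one-hot dominant speaker per bin
masks = attractor_masks(V, Y)
print(masks.shape, masks.sum(axis=1)[:3])            # (1000, 2); rows sum to 1
```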

Vehicle Routing with Drones

Title Vehicle Routing with Drones
Authors Rami Daknama, Elisabeth Kraus
Abstract We introduce a package service model where trucks as well as drones can deliver packages. Drones can travel on trucks or fly, but while flying, drones can only carry one package at a time and have to return to a truck to charge after each delivery. We present a heuristic algorithm to solve the problem of finding a good schedule for all drones and trucks. The algorithm is based on two nested local searches, so the definition of suitable neighbourhoods of solutions is crucial for the algorithm. Empirical tests show that our algorithm performs significantly better than a natural greedy algorithm. Moreover, the savings compared to solutions without drones turn out to be substantial, suggesting that delivery systems might benefit considerably from using drones in addition to trucks.
Tasks
Published 2017-05-18
URL http://arxiv.org/abs/1705.06431v1
PDF http://arxiv.org/pdf/1705.06431v1.pdf
PWC https://paperswithcode.com/paper/vehicle-routing-with-drones
Repo
Framework
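
The paper's heuristic is built from two nested local searches. As background only, here is a generic first-improvement local-search skeleton of the kind such heuristics build on, demonstrated on a toy tour-length objective with a swap neighbourhood; the authors' actual neighbourhoods over truck routes and drone assignments are considerably more involved.

```python
import itertools, math, random

def local_search(initial, neighbours, cost):
    """First-improvement hill climbing over a user-supplied neighbourhood."""
    best, best_cost = initial, cost(initial)
    improved = True
    while improved:
        improved = False
        for cand in neighbours(best):
            c = cost(cand)
            if c < best_cost:
                best, best_cost, improved = cand, c, True
                break                      # take the first improving move
    return best, best_cost

def tour_length(points, tour):
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def swap_neighbours(tour):
    for i, j in itertools.combinations(range(len(tour)), 2):
        t = list(tour)
        t[i], t[j] = t[j], t[i]
        yield tuple(t)

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(8)]
tour, length = local_search(tuple(range(8)), swap_neighbours,
                            lambda t: tour_length(pts, t))
print(round(length, 3))
```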

Break it Down for Me: A Study in Automated Lyric Annotation

Title Break it Down for Me: A Study in Automated Lyric Annotation
Authors Lucas Sterckx, Jason Naradowsky, Bill Byrne, Thomas Demeester, Chris Develder
Abstract Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation. We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task.
Tasks Text Simplification
Published 2017-08-11
URL http://arxiv.org/abs/1708.03492v1
PDF http://arxiv.org/pdf/1708.03492v1.pdf
PWC https://paperswithcode.com/paper/break-it-down-for-me-a-study-in-automated
Repo
Framework

Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit

Title Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit
Authors Brendan Maginnis, Pierre H. Richemond
Abstract Recurrent neural network architectures excel at processing sequences by modelling dependencies over different timescales. The recently introduced Recurrent Weighted Average (RWA) unit captures long-term dependencies far better than an LSTM on several challenging tasks. The RWA achieves this by applying attention to each input and computing a weighted average over the full history of its computations. Unfortunately, the RWA cannot change the attention it has assigned to previous timesteps, and so struggles with carrying out consecutive tasks or tasks with changing requirements. We present the Recurrent Discounted Attention (RDA) unit, which builds on the RWA by additionally allowing the discounting of the past. We empirically compare our model to RWA, LSTM and GRU units on several challenging tasks. On tasks with a single output, the RWA, RDA and GRU units learn much more quickly than the LSTM, and with better performance. On the multiple-sequence copy task, our RDA unit learns the task three times as quickly as the LSTM or GRU units, while the RWA fails to learn at all. On the Wikipedia character prediction task, the LSTM performs best, but it is followed closely by our RDA unit. Overall, our RDA unit performs well and is sample efficient on a large variety of sequence tasks.
Tasks
Published 2017-05-23
URL http://arxiv.org/abs/1705.08480v2
PDF http://arxiv.org/pdf/1705.08480v2.pdf
PWC https://paperswithcode.com/paper/efficiently-applying-attention-to-sequential
Repo
Framework
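
A sketch of the discounted running weighted average we understand to be the core of the RDA cell: a numerator/denominator pair accumulates the attention-weighted history, and a discount gate in (0, 1) lets the unit down-weight the past. The maps producing the candidate, attention, and discount values here are simplified single-layer assumptions, not the paper's exact parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rda_forward(xs, params, hidden):
    Wz, Wa, Wg = params                       # each (input_dim, hidden)
    n, d = np.zeros(hidden), np.full(hidden, 1e-8)
    hs = []
    for x in xs:
        z = np.tanh(x @ Wz)                   # candidate value
        w = np.exp(np.clip(x @ Wa, -10, 10))  # attention weight for this step
        g = sigmoid(x @ Wg)                   # discount applied to the past
        n = g * n + w * z                     # discounted weighted numerator
        d = g * d + w                         # discounted total weight
        hs.append(np.tanh(n / d))             # output: normalized running average
    return np.array(hs)

rng = np.random.default_rng(0)
xs = rng.standard_normal((12, 5))
params = [rng.standard_normal((5, 8)) * 0.1 for _ in range(3)]
print(rda_forward(xs, params, hidden=8).shape)   # (12, 8)
```

With g fixed at 1 the update reduces to the RWA's undiscounted average over the full history, which is exactly the limitation the RDA's gate removes.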

Constructing Datasets for Multi-hop Reading Comprehension Across Documents

Title Constructing Datasets for Multi-hop Reading Comprehension Across Documents
Authors Johannes Welbl, Pontus Stenetorp, Sebastian Riedel
Abstract Most reading comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods, but currently there exist no resources to train and test this capability. We propose a novel task to encourage the development of models for text understanding across multiple documents and to investigate the limits of existing methods. In our task, a model learns to seek and combine evidence, effectively performing multi-hop (alias multi-step) inference. We devise a methodology to produce datasets for this task, given a collection of query-answer pairs and thematically linked documents. Two datasets from different domains are induced, and we identify potential pitfalls and devise circumvention strategies. We evaluate two previously proposed competitive models and find that one can integrate information across documents. However, both models struggle to select relevant information, as providing documents guaranteed to be relevant greatly improves their performance. While the models outperform several strong baselines, their best accuracy reaches 42.9%, compared to human performance at 74.0%, leaving ample room for improvement.
Tasks Multi-Hop Reading Comprehension, Reading Comprehension
Published 2017-10-17
URL http://arxiv.org/abs/1710.06481v2
PDF http://arxiv.org/pdf/1710.06481v2.pdf
PWC https://paperswithcode.com/paper/constructing-datasets-for-multi-hop-reading
Repo
Framework
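
A toy sketch of the dataset-construction idea: treat documents as nodes linked by shared entities, and keep a (query, answer) pair only if the answer is reachable from the query's starting document through a chain of documents, forcing multi-hop evidence. The example data and helper names are entirely illustrative, not the paper's pipeline.

```python
from collections import deque

docs = {                                  # doc -> entities it mentions (toy data)
    "d1": {"Hanging Gardens", "Babylon"},
    "d2": {"Babylon", "Iraq"},
    "d3": {"Iraq", "Baghdad"},
}

def hop_distance(start_entity, answer_entity):
    """BFS over documents linked by shared entities; returns chain length in docs."""
    starts = [d for d, ents in docs.items() if start_entity in ents]
    seen, queue = set(starts), deque((d, 1) for d in starts)
    while queue:
        doc, hops = queue.popleft()
        if answer_entity in docs[doc]:
            return hops
        for other, ents in docs.items():
            if other not in seen and docs[doc] & ents:
                seen.add(other)
                queue.append((other, hops + 1))
    return None

# Keep the sample only if answering requires combining at least two documents.
print(hop_distance("Hanging Gardens", "Iraq"))  # 2 -> a genuine multi-hop chain
```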

Best Practices for Applying Deep Learning to Novel Applications

Title Best Practices for Applying Deep Learning to Novel Applications
Authors Leslie N. Smith
Abstract This report is targeted to groups who are subject matter experts in their application but deep learning novices. It contains practical advice for those interested in testing the use of deep neural networks on applications that are novel for deep learning. We suggest making your project more manageable by dividing it into phases. For each phase this report contains numerous recommendations and insights to assist novice practitioners.
Tasks
Published 2017-04-05
URL http://arxiv.org/abs/1704.01568v1
PDF http://arxiv.org/pdf/1704.01568v1.pdf
PWC https://paperswithcode.com/paper/best-practices-for-applying-deep-learning-to
Repo
Framework

CRNN: A Joint Neural Network for Redundancy Detection

Title CRNN: A Joint Neural Network for Redundancy Detection
Authors Xinyu Fu, Eugene Ch’ng, Uwe Aickelin, Simon See
Abstract This paper proposes a novel framework for detecting redundancy in supervised sentence categorisation. Unlike a traditional singleton neural network, our model combines a character-aware convolutional neural network (Char-CNN) with a character-aware recurrent neural network (Char-RNN) to form a convolutional recurrent neural network (CRNN). Our model benefits from the Char-CNN in that only salient features are selected and fed into the integrated Char-RNN, while the Char-RNN effectively learns long-sequence semantics via a sophisticated update mechanism. We compare our framework against state-of-the-art text classification algorithms on four popular benchmarking corpora. Our model achieves a competitive precision rate, recall ratio, and F1 score on the Google-news dataset. For the twenty-news-groups data stream, our algorithm obtains the best precision rate, recall ratio, and F1 score. For the Brown Corpus, our framework obtains the best F1 score, with precision and recall almost equivalent to the top competitor. For the question classification collection, CRNN produces the best recall rate and F1 score and a comparable precision rate. We also analyse the impact of three different RNN hidden recurrent cells on performance and their runtime efficiency, and observe that MGU achieves the best runtime with performance comparable to GRU and LSTM. For TFIDF-based algorithms, we experiment with word2vec, GloVe, and sent2vec embeddings and report their performance differences.
Tasks Text Classification
Published 2017-06-04
URL http://arxiv.org/abs/1706.01069v1
PDF http://arxiv.org/pdf/1706.01069v1.pdf
PWC https://paperswithcode.com/paper/crnn-a-joint-neural-network-for-redundancy
Repo
Framework
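
A rough NumPy sketch of the Char-CNN to Char-RNN pipeline described above: a 1-D convolution over character embeddings picks out salient n-gram features, which a small recurrent layer then reads in sequence. We use a plain tanh RNN and random weights purely for shape illustration; the paper's cells (GRU, LSTM, MGU) and training procedure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
V, E, T = 60, 16, 40          # charset size, embedding dim, chars per sentence
F, width, H = 32, 5, 24       # conv filters, filter width, RNN hidden size

emb    = rng.standard_normal((V, E)) * 0.1
conv   = rng.standard_normal((F, width, E)) * 0.1
Wx, Wh = rng.standard_normal((F, H)) * 0.1, rng.standard_normal((H, H)) * 0.1

chars = rng.integers(0, V, T)             # one sentence as character ids
X = emb[chars]                            # (T, E) character embeddings

# Char-CNN: valid 1-D convolution + ReLU keeps only salient local features.
windows = np.stack([X[t:t + width] for t in range(T - width + 1)])  # (T', w, E)
feats = np.maximum(np.einsum('twe,fwe->tf', windows, conv), 0.0)    # (T', F)

# Char-RNN: read the convolutional feature sequence recurrently.
h = np.zeros(H)
for f in feats:
    h = np.tanh(f @ Wx + h @ Wh)
print(h.shape)                            # (24,) final sentence representation
```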

Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds

Title Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds
Authors Lijun Zhang, Tianbao Yang, Rong Jin
Abstract Although there exist plentiful theories of empirical risk minimization (ERM) for supervised learning, current theoretical understanding of ERM for a related problem, stochastic convex optimization (SCO), is limited. In this work, we strengthen the realm of ERM for SCO by exploiting smoothness and strong convexity conditions to improve the risk bounds. First, we establish an $\widetilde{O}(d/n + \sqrt{F_*/n})$ risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where $d$ is the dimensionality of the problem, $n$ is the number of samples, and $F_*$ is the minimal risk. Thus, when $F_*$ is small we obtain an $\widetilde{O}(d/n)$ risk bound, which is analogous to the $\widetilde{O}(1/n)$ optimistic rate of ERM for supervised learning. Second, if the objective function is also $\lambda$-strongly convex, we prove an $\widetilde{O}(d/n + \kappa F_*/n)$ risk bound, where $\kappa$ is the condition number, and improve it to $O(1/[\lambda n^2] + \kappa F_*/n)$ when $n=\widetilde{\Omega}(\kappa d)$. As a result, we obtain an $O(\kappa/n^2)$ risk bound under the condition that $n$ is large and $F_*$ is small, which, to the best of our knowledge, is the first $O(1/n^2)$-type risk bound for ERM. Third, we stress that the above results are established in a unified framework, which allows us to derive new risk bounds under weaker conditions, e.g., without convexity of the random function and Lipschitz continuity of the expected function. Finally, we demonstrate that to achieve an $O(1/[\lambda n^2] + \kappa F_*/n)$ risk bound for supervised learning, the $\widetilde{\Omega}(\kappa d)$ requirement on $n$ can be replaced with $\Omega(\kappa^2)$, which is dimensionality-independent.
Tasks
Published 2017-02-07
URL http://arxiv.org/abs/1702.02030v1
PDF http://arxiv.org/pdf/1702.02030v1.pdf
PWC https://paperswithcode.com/paper/empirical-risk-minimization-for-stochastic
Repo
Framework