Paper Group ANR 78
Introduction to Tensor Decompositions and their Applications in Machine Learning
Title | Introduction to Tensor Decompositions and their Applications in Machine Learning |
Authors | Stephan Rabanser, Oleksandr Shchur, Stephan Günnemann |
Abstract | Tensors are multidimensional arrays of numerical values and therefore generalize matrices to multiple dimensions. While tensors first emerged in the psychometrics community in the $20^{\text{th}}$ century, they have since then spread to numerous other disciplines, including machine learning. Tensors and their decompositions are especially beneficial in unsupervised learning settings, but are gaining popularity in other sub-disciplines like temporal and multi-relational data analysis, too. The scope of this paper is to give a broad overview of tensors, their decompositions, and how they are used in machine learning. As part of this, we are going to introduce basic tensor concepts, discuss why tensors can be considered more rigid than matrices with respect to the uniqueness of their decomposition, explain the most important factorization algorithms and their properties, provide concrete examples of tensor decomposition applications in machine learning, conduct a case study on tensor-based estimation of mixture models, talk about the current state of research, and provide references to available software libraries. |
Tasks | |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10781v1 |
PDF | http://arxiv.org/pdf/1711.10781v1.pdf |
PWC | https://paperswithcode.com/paper/introduction-to-tensor-decompositions-and |
Repo | |
Framework | |
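To make the survey's central object concrete, here is a minimal sketch of the CP decomposition fitted by alternating least squares (ALS), one of the factorization algorithms the paper covers. The rank, tensor sizes, and random synthetic data are illustrative assumptions, not anything from the paper.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` unfolding: rows indexed by that mode, remaining modes flattened."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Kronecker product of two factor matrices."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def cp_als(T, rank, n_iter=100, seed=0):
    """Fit T ~ sum_r a_r (outer) b_r (outer) c_r by cycling least-squares updates."""
    rng = np.random.default_rng(seed)
    I, J, K = T.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        A = unfold(T, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = unfold(T, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = unfold(T, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

# Recover the factors of a synthetic rank-3 tensor.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((s, 3)) for s in (4, 5, 6))
T = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(T, rank=3)
T_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print(np.linalg.norm(T - T_hat) / np.linalg.norm(T))  # ~0 at the true rank
```

The relative error printed at the end should be near zero when the fitted rank matches the true rank; ALS can need more iterations or restarts in degenerate cases.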
Model compression as constrained optimization, with application to neural nets. Part I: general framework
Title | Model compression as constrained optimization, with application to neural nets. Part I: general framework |
Authors | Miguel Á. Carreira-Perpiñán |
Abstract | Compressing neural nets is an active research problem, given the large size of state-of-the-art nets for tasks such as object recognition, and the computational limits imposed by mobile devices. We give a general formulation of model compression as constrained optimization. This includes many types of compression: quantization, low-rank decomposition, pruning, lossless compression and others. Then, we give a general algorithm to optimize this nonconvex problem based on the augmented Lagrangian and alternating optimization. This results in a “learning-compression” algorithm, which alternates a learning step of the uncompressed model, independent of the compression type, with a compression step of the model parameters, independent of the learning task. This simple, efficient algorithm is guaranteed to find the best compressed model for the task in a local sense under standard assumptions. We present separately in several companion papers the development of this general framework into specific algorithms for model compression based on quantization, pruning and other variations, including experimental results on compressing neural nets and other models. |
Tasks | Model Compression, Object Recognition, Quantization |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01209v1 |
PDF | http://arxiv.org/pdf/1707.01209v1.pdf |
PWC | https://paperswithcode.com/paper/model-compression-as-constrained-optimization-1 |
Repo | |
Framework | |
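The abstract's "learning-compression" alternation can be sketched on a toy problem. Below, the loss is a simple quadratic, the compression step is scalar k-means quantization, and the penalty schedule, step sizes, and codebook size are illustrative assumptions; the paper develops the general augmented-Lagrangian formulation, not this specific instance.

```python
import numpy as np

rng = np.random.default_rng(0)
w_ref = rng.standard_normal(50)     # stands in for well-trained reference weights

def loss_grad(w):                   # gradient of the toy loss L(w) = 0.5*||w - w_ref||^2
    return w - w_ref

def c_step(w, k=4, iters=20):
    """Compression step: best k-level scalar quantization of w (1-D k-means)."""
    codebook = np.quantile(w, np.linspace(0.1, 0.9, k))
    for _ in range(iters):
        assign = np.argmin(np.abs(w[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = w[assign == j].mean()
    return codebook[assign]

w = w_ref.copy()
lam = np.zeros_like(w)              # Lagrange multiplier estimates
theta = c_step(w)                   # decompressed model Delta(theta)
for mu in (1e-2 * 1.5 ** t for t in range(30)):   # increasing penalty schedule
    for _ in range(100):            # L-step: learn w, coupled to the compressed model
        w -= 0.1 / (1 + mu) * (loss_grad(w) + mu * (w - theta) - lam)
    theta = c_step(w - lam / mu)    # C-step: compress, independent of the task
    lam -= mu * (w - theta)         # multiplier update (augmented Lagrangian)
print(np.abs(w - theta).max())      # w is driven onto the quantized feasible set
```

Note how the L-step never sees the quantizer and the C-step never sees the loss, which is exactly the separation the abstract emphasizes.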
The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning
Title | The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning |
Authors | Anil Usumezbas, Ricardo Fabbri, Benjamin Kimia |
Abstract | The three-dimensional reconstruction of scenes from multiple views has made impressive strides in recent years, chiefly by methods correlating isolated feature points, intensities, or curvilinear structure. In the general setting, i.e., without requiring controlled acquisition, a limited number of objects, abundant patterns on objects, or object curves that follow particular models, the majority of these methods produce unorganized point clouds, meshes, or voxel representations of the reconstructed scene, with some exceptions producing 3D drawings as networks of curves. Many applications, e.g., robotics, urban planning, industrial design, and hard surface modeling, however, require structured representations which make explicit 3D curves, surfaces, and their spatial relationships. Reconstructing surface representations can now be constrained by the 3D drawing acting as a scaffold on which to hang the computed representations, leading to increased robustness and quality of reconstruction. This paper presents one way of completing such 3D drawings with surface reconstructions, by exploring occlusion reasoning through lofting algorithms. |
Tasks | |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.03946v1 |
PDF | http://arxiv.org/pdf/1707.03946v1.pdf |
PWC | https://paperswithcode.com/paper/the-surfacing-of-multiview-3d-drawings-via |
Repo | |
Framework | |
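As a point of reference for the lofting primitive the paper builds on, here is a minimal ruled-surface loft between two 3D polylines. This illustrates only the lofting operation itself, under assumed inputs; the paper's occlusion reasoning and 3D-drawing scaffolding are not modeled here.

```python
import numpy as np

def loft(curve_a, curve_b, n_sections=10):
    """Ruled surface: linearly interpolate two (N, 3) polylines point by point."""
    t = np.linspace(0.0, 1.0, n_sections)[:, None, None]
    return (1.0 - t) * curve_a[None] + t * curve_b[None]   # (n_sections, N, 3)

theta = np.linspace(0.0, np.pi, 50)
lower = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
upper = lower + np.array([0.2, 0.0, 1.0])   # a shifted copy of the arc, one unit up
surface = loft(lower, upper)
print(surface.shape)                        # (10, 50, 3) grid of surface points
```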
Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention
Title | Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention |
Authors | Alican Bozkurt, Kivanc Kose, Jaume Coll-Font, Christi Alessi-Fox, Dana H. Brooks, Jennifer G. Dy, Milind Rajadhyaksha |
Abstract | Reflectance confocal microscopy (RCM) is an effective, non-invasive pre-screening tool for skin cancer diagnosis, but it requires extensive training and experience to assess accurately. There are few quantitative tools available to standardize image acquisition and analysis, and the ones that are available are not interpretable. In this study, we use a recurrent neural network with attention on convolutional network features. We apply it to delineate skin strata in vertically-oriented stacks of transverse RCM image slices in an interpretable manner. We introduce a new attention mechanism called Toeplitz attention, which constrains the attention map to have a Toeplitz structure. Testing our model on an expert-labeled dataset of 504 RCM stacks, we achieve 88.17% image-wise classification accuracy, which is the current state of the art. |
Tasks | |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00192v1 |
PDF | http://arxiv.org/pdf/1712.00192v1.pdf |
PWC | https://paperswithcode.com/paper/delineation-of-skin-strata-in-reflectance |
Repo | |
Framework | |
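One plausible reading of the Toeplitz constraint is that the attention an output position pays to an input position depends only on their relative offset, so the map of attention logits is a Toeplitz matrix. The sketch below builds such a map; the one-logit-per-offset parameterization, the sizes, and the offset scores are assumptions for illustration, and the paper couples the mechanism with a recurrent network over convolutional RCM features.

```python
import numpy as np

def toeplitz_attention(offset_scores, n):
    """Row-stochastic (n, n) attention map whose logits are Toeplitz.

    offset_scores has length 2n-1, one logit per relative offset j - i in
    {-(n-1), ..., n-1}; entry (i, j) reads offset_scores[j - i + n - 1].
    """
    idx = np.arange(n)
    logits = offset_scores[idx[None, :] - idx[:, None] + n - 1]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)   # softmax normalizes each row

n = 6
scores = -0.5 * np.abs(np.arange(-(n - 1), n))   # e.g. favour nearby slices
print(np.round(toeplitz_attention(scores, n), 3))
```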
Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach
Title | Feature Analysis and Selection for Training an End-to-End Autonomous Vehicle Controller Using the Deep Learning Approach |
Authors | Shun Yang, Wenshuo Wang, Chang Liu, Kevin Deng, J. Karl Hedrick |
Abstract | Deep learning-based approaches have been widely used for training controllers for autonomous vehicles due to their powerful ability to approximate nonlinear functions or policies. However, the training process usually requires large labeled data sets and takes a lot of time. In this paper, we analyze the influence of features on the performance of controllers trained using convolutional neural networks (CNNs), which provides a guideline for feature selection to reduce computation cost. We collect a large set of data using The Open Racing Car Simulator (TORCS) and classify the image features into three categories (sky-related, roadside-related, and road-related features). We then design two experimental frameworks to investigate the importance of each single feature for training a CNN controller. The first framework uses the training data with all three features included to train a controller, which is then tested with data that has one feature removed to evaluate the feature’s effects. The second framework is trained with data that has one feature excluded, while all three features are included in the test data. Different driving scenarios are selected to test and analyze the trained controllers using the two experimental frameworks. The experimental results show that (1) the road-related features are indispensable for training the controller, (2) the roadside-related features are useful for improving the generalizability of the controller to scenarios with complicated roadside information, and (3) the sky-related features contribute little to training an end-to-end autonomous vehicle controller. |
Tasks | Autonomous Vehicles, Feature Selection |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09744v1 |
PDF | http://arxiv.org/pdf/1703.09744v1.pdf |
PWC | https://paperswithcode.com/paper/feature-analysis-and-selection-for-training |
Repo | |
Framework | |
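The two experimental frameworks can be mimicked on a toy problem. In the sketch below, a ridge-regression "controller" on synthetic images stands in for the CNN, and zeroing image bands (top = sky, bottom = road, sides = roadside) stands in for feature removal; all of this is an illustrative assumption, while the paper uses TORCS imagery and CNN controllers.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 16
imgs = rng.random((500, H, W))
steer = imgs[:, H // 2 :, :].mean(axis=(1, 2))   # label depends only on the "road" band

def mask(x, region):
    """Zero out one feature region: top = sky, bottom = road, sides = roadside."""
    out = x.copy()
    if region == "sky":
        out[:, : H // 4, :] = 0
    elif region == "road":
        out[:, H // 2 :, :] = 0
    else:  # roadside
        out[:, :, : W // 8] = 0
        out[:, :, W - W // 8 :] = 0
    return out

def fit(x, y):
    """Ridge-regression 'controller' on flattened pixels."""
    X = x.reshape(len(x), -1)
    return np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)

def mse(w, x, y):
    return float(np.mean((x.reshape(len(x), -1) @ w - y) ** 2))

w_full = fit(imgs, steer)
for r in ("sky", "roadside", "road"):
    f1 = mse(w_full, mask(imgs, r), steer)            # framework 1: mask at test time
    f2 = mse(fit(mask(imgs, r), steer), imgs, steer)  # framework 2: mask at train time
    print(f"{r:9s}  F1={f1:.4f}  F2={f2:.4f}")        # masking "road" hurts most
```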
A Machine Learning Approach to Routing
Title | A Machine Learning Approach to Routing |
Authors | Asaf Valadarsky, Michael Schapira, Dafna Shahaf, Aviv Tamar |
Abstract | Can ideas and techniques from machine learning be leveraged to automatically generate “good” routing configurations? We investigate the power of data-driven routing protocols. Our results suggest that applying ideas and techniques from deep reinforcement learning to this context yields high performance, motivating further research along these lines. |
Tasks | |
Published | 2017-08-10 |
URL | http://arxiv.org/abs/1708.03074v2 |
PDF | http://arxiv.org/pdf/1708.03074v2.pdf |
PWC | https://paperswithcode.com/paper/a-machine-learning-approach-to-routing |
Repo | |
Framework | |
Learning to Disambiguate by Asking Discriminative Questions
Title | Learning to Disambiguate by Asking Discriminative Questions |
Authors | Yining Li, Chen Huang, Xiaoou Tang, Chen-Change Loy |
Abstract | The ability to ask questions is a powerful tool to gather information in order to learn about the world and resolve ambiguities. In this paper, we explore a novel problem of generating discriminative questions to help disambiguate visual instances. Our work can be seen as a complement and new extension to the rich research studies on image captioning and question answering. We introduce the first large-scale dataset with over 10,000 carefully annotated image-question tuples to facilitate benchmarking. In particular, each tuple consists of a pair of images and, on average, 4.6 discriminative questions (as positive samples) and 5.9 non-discriminative questions (as negative samples). In addition, we present an effective method for visual discriminative question generation. The method can be trained in a weakly supervised manner using only existing visual question answering datasets, without discriminative image-question tuples. Promising results are shown against representative baselines through quantitative evaluations and user studies. |
Tasks | Image Captioning, Question Answering, Question Generation, Visual Question Answering |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02760v1 |
PDF | http://arxiv.org/pdf/1708.02760v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-disambiguate-by-asking |
Repo | |
Framework | |
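A sketch of the dataset unit the abstract describes: a pair of images with discriminative questions as positives and non-discriminative ones as negatives. The field names below are hypothetical, chosen for illustration; the abstract only specifies the averages of roughly 4.6 positives and 5.9 negatives per tuple.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AmbiguousPair:
    image_a: str                       # path or URL of the first image
    image_b: str                       # path or URL of the second image
    discriminative: List[str] = field(default_factory=list)      # positive samples
    non_discriminative: List[str] = field(default_factory=list)  # negative samples

pair = AmbiguousPair(
    "dog_park.jpg", "dog_beach.jpg",
    discriminative=["Where is the dog?"],
    non_discriminative=["What animal is shown?"],
)
print(pair)
```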
Speaker-independent Speech Separation with Deep Attractor Network
Title | Speaker-independent Speech Separation with Deep Attractor Network |
Authors | Yi Luo, Zhuo Chen, Nima Mesgarani |
Abstract | Despite the recent success of deep learning for many speech processing tasks, single-microphone, speaker-independent speech separation remains challenging for two main reasons. The first is the arbitrary order of the target and masker speakers in the mixture (the permutation problem), and the second is the unknown number of speakers in the mixture (the output dimension problem). We propose a novel deep learning framework for speech separation that addresses both of these issues. We use a neural network to project the time-frequency representation of the mixture signal into a high-dimensional embedding space. A reference point (attractor) is created to represent each speaker, defined as the centroid of the speaker in the embedding space. The time-frequency embeddings of each speaker are then forced to cluster around the corresponding attractor point, which is used to determine the time-frequency assignment of the speaker. We propose three methods for finding the attractors for each source in the embedding space and compare their advantages and limitations. The objective function for the network is the standard signal reconstruction error, which enables end-to-end operation during both training and test phases. We evaluated our system on two- and three-speaker mixtures from the Wall Street Journal (WSJ0) dataset and report performance comparable to or better than other state-of-the-art deep learning methods for speech separation. |
Tasks | Speech Separation |
Published | 2017-07-12 |
URL | http://arxiv.org/abs/1707.03634v3 |
PDF | http://arxiv.org/pdf/1707.03634v3.pdf |
PWC | https://paperswithcode.com/paper/speaker-independent-speech-separation-with |
Repo | |
Framework | |
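The attractor mechanism can be sketched numerically: embed every time-frequency bin, form one attractor per speaker as the centroid of that speaker's embeddings (using oracle assignments, as available in training), and derive masks from embedding-attractor similarity. The random embeddings below stand in for a trained network's output; dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tf, K, n_spk = 1000, 20, 2                 # T-F bins, embedding dim, speakers
V = rng.standard_normal((n_tf, K))           # embeddings, one per T-F bin
Y = np.eye(n_spk)[rng.integers(0, n_spk, n_tf)]   # oracle speaker assignments

# Attractors: per-speaker centroids in the embedding space.
attractors = (Y.T @ V) / Y.sum(axis=0)[:, None]        # (n_spk, K)

# Masks: softmax over similarity between each embedding and each attractor.
logits = V @ attractors.T                              # (n_tf, n_spk)
logits -= logits.max(axis=1, keepdims=True)
masks = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(masks.sum(axis=1)[:3])                 # per-bin mask values sum to one
```

In the actual system the masks multiply the mixture spectrogram, and the signal reconstruction error then trains the embedding network end to end.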
Vehicle Routing with Drones
Title | Vehicle Routing with Drones |
Authors | Rami Daknama, Elisabeth Kraus |
Abstract | We introduce a package service model where trucks as well as drones can deliver packages. Drones can travel on trucks or fly; but while flying, drones can only carry one package at a time and have to return to a truck to charge after each delivery. We present a heuristic algorithm to solve the problem of finding a good schedule for all drones and trucks. The algorithm is based on two nested local searches, so the definition of suitable neighbourhoods of solutions is crucial for the algorithm. Empirical tests show that our algorithm performs significantly better than a natural greedy algorithm. Moreover, the savings compared to solutions without drones turn out to be substantial, suggesting that delivery systems might considerably benefit from using drones in addition to trucks. |
Tasks | |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06431v1 |
PDF | http://arxiv.org/pdf/1705.06431v1.pdf |
PWC | https://paperswithcode.com/paper/vehicle-routing-with-drones |
Repo | |
Framework | |
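For reference, here is a generic local-search skeleton of the kind the algorithm nests (an outer search over truck routes, an inner one over drone schedules). The 2-opt neighbourhood on a single tour shown here is an illustrative stand-in, not the paper's neighbourhood definition.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((12, 2))                    # delivery locations
dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)

def tour_len(tour):
    return sum(dist[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def local_search(tour):
    """Apply improving 2-opt moves (segment reversals) until none remain."""
    improved = True
    while improved:
        improved = False
        for i, j in itertools.combinations(range(1, len(tour)), 2):
            cand = tour[:i] + tour[i:j][::-1] + tour[j:]
            if tour_len(cand) < tour_len(tour) - 1e-12:
                tour, improved = cand, True
    return tour

tour = local_search(list(range(len(pts))))
print(round(tour_len(tour), 3))              # length of a locally optimal tour
```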
Break it Down for Me: A Study in Automated Lyric Annotation
Title | Break it Down for Me: A Study in Automated Lyric Annotation |
Authors | Lucas Sterckx, Jason Naradowsky, Bill Byrne, Thomas Demeester, Chris Develder |
Abstract | Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation. We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task. |
Tasks | Text Simplification |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03492v1 |
PDF | http://arxiv.org/pdf/1708.03492v1.pdf |
PWC | https://paperswithcode.com/paper/break-it-down-for-me-a-study-in-automated |
Repo | |
Framework | |
Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit
Title | Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit |
Authors | Brendan Maginnis, Pierre H. Richemond |
Abstract | Recurrent neural network architectures excel at processing sequences by modelling dependencies over different timescales. The recently introduced Recurrent Weighted Average (RWA) unit captures long-term dependencies far better than an LSTM on several challenging tasks. The RWA achieves this by applying attention to each input and computing a weighted average over the full history of its computations. Unfortunately, the RWA cannot change the attention it has assigned to previous timesteps, and so struggles with carrying out consecutive tasks or tasks with changing requirements. We present the Recurrent Discounted Attention (RDA) unit, which builds on the RWA by additionally allowing the discounting of the past. We empirically compare our model to RWA, LSTM and GRU units on several challenging tasks. On tasks with a single output, the RWA, RDA and GRU units learn much more quickly than the LSTM, and with better performance. On the multiple-sequence copy task, our RDA unit learns the task three times as quickly as the LSTM or GRU units, while the RWA fails to learn at all. On the Wikipedia character prediction task, the LSTM performs best but is followed closely by our RDA unit. Overall, our RDA unit performs well and is sample-efficient on a large variety of sequence tasks. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08480v2 |
PDF | http://arxiv.org/pdf/1705.08480v2.pdf |
PWC | https://paperswithcode.com/paper/efficiently-applying-attention-to-sequential |
Repo | |
Framework | |
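The recurrence the abstract describes can be sketched directly: the RWA keeps running attention-weighted numerator and denominator sums over the whole history, and the RDA additionally discounts both by a gate before each update. The toy dimensions, random parameters, and the exact gating parameterization below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, T = 4, 8, 25                    # input dim, hidden dim, sequence length
Wz, Wa, Wg = (0.1 * rng.standard_normal((D + H, H)) for _ in range(3))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h = np.zeros(H)
num = np.zeros(H)                     # attention-weighted numerator
den = np.zeros(H)                     # attention normalizer (denominator)
for x in rng.standard_normal((T, D)):
    xh = np.concatenate([x, h])
    z = np.tanh(xh @ Wz)              # candidate values
    a = np.exp(xh @ Wa)               # unnormalized attention weights (positive)
    g = sigmoid(xh @ Wg)              # discount gate in (0, 1): the RDA's addition
    num = g * num + z * a             # with g fixed at 1 this reduces to the RWA
    den = g * den + a
    h = np.tanh(num / den)            # hidden state = squashed weighted average
print(h)
```

The gate is what lets the unit shrink the weight of old timesteps, which the plain RWA cannot do.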
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
Title | Constructing Datasets for Multi-hop Reading Comprehension Across Documents |
Authors | Johannes Welbl, Pontus Stenetorp, Sebastian Riedel |
Abstract | Most reading comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods, but currently there exist no resources to train and test this capability. We propose a novel task to encourage the development of models for text understanding across multiple documents and to investigate the limits of existing methods. In our task, a model learns to seek and combine evidence, effectively performing multi-hop (alias multi-step) inference. We devise a methodology to produce datasets for this task, given a collection of query-answer pairs and thematically linked documents. Two datasets from different domains are induced, and we identify potential pitfalls and devise circumvention strategies. We evaluate two previously proposed competitive models and find that one can integrate information across documents. However, both models struggle to select relevant information, as providing documents guaranteed to be relevant greatly improves their performance. While the models outperform several strong baselines, their best accuracy reaches 42.9%, compared to human performance at 74.0%, leaving ample room for improvement. |
Tasks | Multi-Hop Reading Comprehension, Reading Comprehension |
Published | 2017-10-17 |
URL | http://arxiv.org/abs/1710.06481v2 |
PDF | http://arxiv.org/pdf/1710.06481v2.pdf |
PWC | https://paperswithcode.com/paper/constructing-datasets-for-multi-hop-reading |
Repo | |
Framework | |
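A simplified sketch of the chaining idea behind the construction: starting from an entity in the query, follow shared-entity links across documents until one containing the answer is reached, so that no single document suffices. The toy corpus and the traversal below are illustrative assumptions, not the paper's exact methodology.

```python
from collections import deque

docs = {
    "d1": {"mentions": {"Hanging Gardens", "Babylon"}},
    "d2": {"mentions": {"Babylon", "Iraq"}},
    "d3": {"mentions": {"Iraq", "Baghdad"}},
}

def evidence_chain(start_entity, answer_entity):
    """Breadth-first search over documents linked by shared entity mentions."""
    queue = deque([(start_entity, [])])
    seen = set()
    while queue:
        entity, chain = queue.popleft()
        for name, doc in docs.items():
            if name not in seen and entity in doc["mentions"]:
                seen.add(name)
                new_chain = chain + [name]
                if answer_entity in doc["mentions"]:
                    return new_chain           # documents a model must combine
                for nxt in doc["mentions"] - {entity}:
                    queue.append((nxt, new_chain))
    return None

print(evidence_chain("Hanging Gardens", "Baghdad"))   # ['d1', 'd2', 'd3']
```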
Best Practices for Applying Deep Learning to Novel Applications
Title | Best Practices for Applying Deep Learning to Novel Applications |
Authors | Leslie N. Smith |
Abstract | This report is targeted to groups who are subject matter experts in their application but deep learning novices. It contains practical advice for those interested in testing the use of deep neural networks on applications that are novel for deep learning. We suggest making your project more manageable by dividing it into phases. For each phase this report contains numerous recommendations and insights to assist novice practitioners. |
Tasks | |
Published | 2017-04-05 |
URL | http://arxiv.org/abs/1704.01568v1 |
PDF | http://arxiv.org/pdf/1704.01568v1.pdf |
PWC | https://paperswithcode.com/paper/best-practices-for-applying-deep-learning-to |
Repo | |
Framework | |
CRNN: A Joint Neural Network for Redundancy Detection
Title | CRNN: A Joint Neural Network for Redundancy Detection |
Authors | Xinyu Fu, Eugene Ch’ng, Uwe Aickelin, Simon See |
Abstract | This paper proposes a novel framework for detecting redundancy in supervised sentence categorisation. Unlike a traditional singleton neural network, our model incorporates a character-aware convolutional neural network (Char-CNN) with a character-aware recurrent neural network (Char-RNN) to form a convolutional recurrent neural network (CRNN). Our model benefits from Char-CNN in that only salient features are selected and fed into the integrated Char-RNN. Char-RNN effectively learns long-sequence semantics via a sophisticated update mechanism. We compare our framework against state-of-the-art text classification algorithms on four popular benchmark corpora. For instance, our model achieves competitive precision, recall, and F1 score on the Google News dataset. For the twenty-newsgroups data stream, our algorithm obtains the optimum precision, recall, and F1 score. For the Brown Corpus, our framework obtains the best F1 score and almost equivalent precision and recall to the top competitor. For the question classification collection, CRNN produces the optimal recall and F1 score and comparable precision. We also analyse the impact of three different RNN hidden recurrent cells on performance and their runtime efficiency. We observe that the MGU achieves the optimal runtime and comparable performance against the GRU and LSTM. For TFIDF-based algorithms, we experiment with word2vec, GloVe, and sent2vec embeddings and report their performance differences. |
Tasks | Text Classification |
Published | 2017-06-04 |
URL | http://arxiv.org/abs/1706.01069v1 |
PDF | http://arxiv.org/pdf/1706.01069v1.pdf |
PWC | https://paperswithcode.com/paper/crnn-a-joint-neural-network-for-redundancy |
Repo | |
Framework | |
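The composition the abstract describes (a character-level CNN selecting salient local features, whose sequence then feeds a recurrent layer) can be sketched as follows. Layer sizes, the GRU cell (the paper also compares LSTM and MGU), and the vocabulary size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, n_chars=128, emb=16, conv_ch=32, hidden=64, n_classes=4):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.conv = nn.Conv1d(emb, conv_ch, kernel_size=5, padding=2)
        self.pool = nn.MaxPool1d(kernel_size=2)    # keep only salient features
        self.rnn = nn.GRU(conv_ch, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, char_ids):                    # (batch, seq_len) int64
        x = self.embed(char_ids).transpose(1, 2)    # (batch, emb, seq_len)
        x = self.pool(torch.relu(self.conv(x)))     # (batch, conv_ch, seq_len // 2)
        _, h = self.rnn(x.transpose(1, 2))          # recurrent pass over CNN features
        return self.out(h[-1])                      # class logits

logits = CRNN()(torch.randint(0, 128, (8, 100)))
print(logits.shape)                                 # torch.Size([8, 4])
```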
Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds
Title | Empirical Risk Minimization for Stochastic Convex Optimization: $O(1/n)$- and $O(1/n^2)$-type of Risk Bounds |
Authors | Lijun Zhang, Tianbao Yang, Rong Jin |
Abstract | Although there exist plentiful theories of empirical risk minimization (ERM) for supervised learning, the current theoretical understanding of ERM for a related problem, stochastic convex optimization (SCO), is limited. In this work, we strengthen the realm of ERM for SCO by exploiting smoothness and strong convexity conditions to improve the risk bounds. First, we establish an $\widetilde{O}(d/n + \sqrt{F_*/n})$ risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where $d$ is the dimensionality of the problem, $n$ is the number of samples, and $F_*$ is the minimal risk. Thus, when $F_*$ is small we obtain an $\widetilde{O}(d/n)$ risk bound, which is analogous to the $\widetilde{O}(1/n)$ optimistic rate of ERM for supervised learning. Second, if the objective function is also $\lambda$-strongly convex, we prove an $\widetilde{O}(d/n + \kappa F_*/n)$ risk bound, where $\kappa$ is the condition number, and improve it to $O(1/[\lambda n^2] + \kappa F_*/n)$ when $n=\widetilde{\Omega}(\kappa d)$. As a result, we obtain an $O(\kappa/n^2)$ risk bound under the condition that $n$ is large and $F_*$ is small, which, to the best of our knowledge, is the first $O(1/n^2)$-type of risk bound for ERM. Third, we stress that the above results are established in a unified framework, which allows us to derive new risk bounds under weaker conditions, e.g., without convexity of the random function and Lipschitz continuity of the expected function. Finally, we demonstrate that to achieve an $O(1/[\lambda n^2] + \kappa F_*/n)$ risk bound for supervised learning, the $\widetilde{\Omega}(\kappa d)$ requirement on $n$ can be replaced with $\Omega(\kappa^2)$, which is dimensionality-independent. |
Tasks | |
Published | 2017-02-07 |
URL | http://arxiv.org/abs/1702.02030v1 |
PDF | http://arxiv.org/pdf/1702.02030v1.pdf |
PWC | https://paperswithcode.com/paper/empirical-risk-minimization-for-stochastic |
Repo | |
Framework | |
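A toy instance of the setting studied, assuming least squares as the nonnegative, convex and smooth random function: the empirical risk minimizer is computed in closed form, and its excess risk shrinks when $F_*$ is small and $n$ is large, in the spirit of the bounds above. Dimensions and the data distribution are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 2000
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.01 * rng.standard_normal(n)    # small noise, so F_* is small

# ERM for f(w; x, y) = 0.5 * (w.x - y)^2 is ordinary least squares.
w_erm = np.linalg.lstsq(X, y, rcond=None)[0]

# Empirical proxy for the excess risk F(w_erm) - F_*.
excess = 0.5 * np.mean((X @ (w_erm - w_true)) ** 2)
print(f"excess risk ~ {excess:.2e} (small when F_* is small and n is large)")
```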