July 28, 2019

3031 words 15 mins read

Paper Group ANR 461

The Compressed Model of Residual CNDS. On the organization of grid and place cells: Neural de-noising via subspace learning. Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy. A Survey of Model Compression and Acceleration for Deep Neural Networks. Texture Fuzzy Segmentation using Skew Divergence Adaptive …

The Compressed Model of Residual CNDS

Title The Compressed Model of Residual CNDS
Authors Hussam Qassim, David Feinzimer, Abhishek Verma
Abstract Convolutional neural networks have achieved great success in recent years, yet methods for maximizing their performance are still in their infancy, and optimizing their size and training time remains far from the ambitions of researchers. In this paper, we propose a new convolutional neural network that combines several techniques to improve convolutional neural networks in terms of both speed and size. We take our previous model, Residual-CNDS (ResCNDS), which addressed the problems of slow convergence, overfitting, and degradation, and compress it. The resulting model, called Residual-Squeeze-CNDS (ResSquCNDS), demonstrates both our established technique for adding residual learning and our approach to compressing convolutional neural networks. Our compression scheme is adapted from the SqueezeNet model but is more generalizable: it can be applied to almost any neural network model, and it fully integrates with residual learning, which addresses the degradation problem very successfully. We trained the proposed model on the very large-scale MIT Places365-Standard scene dataset, and the results support our hypothesis that the compressed model inherits the strengths of the previous ResCNDS8 model: it achieves almost the same Top-1 and Top-5 validation accuracy while being 87.64% smaller in size and 13.33% faster to train.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.06419v1
PDF http://arxiv.org/pdf/1706.06419v1.pdf
PWC https://paperswithcode.com/paper/the-compressed-model-of-residual-cnds
Repo
Framework
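
As a rough illustration of the compression idea, here is a minimal sketch (assuming PyTorch) of a SqueezeNet-style fire module wrapped in a residual shortcut, in the spirit of Residual-Squeeze-CNDS. The channel sizes are illustrative, not taken from the paper.

```python
# Sketch: SqueezeNet-style fire module with a residual connection.
# Hypothetical sizes; not the paper's actual architecture.
import torch
import torch.nn as nn

class ResidualFire(nn.Module):
    def __init__(self, channels, squeeze=16, expand=64):
        super().__init__()
        # 2 * expand must equal `channels` for the identity skip to type-check
        assert 2 * expand == channels
        # 1x1 "squeeze" convolution reduces the channel count
        self.squeeze = nn.Conv2d(channels, squeeze, kernel_size=1)
        # parallel 1x1 and 3x3 "expand" convolutions, concatenated
        self.expand1 = nn.Conv2d(squeeze, expand, kernel_size=1)
        self.expand3 = nn.Conv2d(squeeze, expand, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.relu(self.squeeze(x))
        out = torch.cat([self.expand1(s), self.expand3(s)], dim=1)
        # residual shortcut: add the input back before the nonlinearity
        return self.relu(out + x)

block = ResidualFire(channels=128)
y = block(torch.randn(1, 128, 56, 56))  # -> torch.Size([1, 128, 56, 56])
```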

On the organization of grid and place cells: Neural de-noising via subspace learning

Title On the organization of grid and place cells: Neural de-noising via subspace learning
Authors David M. Schwartz, O. Ozan Koyluoglu
Abstract Place cells in the hippocampus are active when an animal visits a certain location (referred to as a place field) within an environment. Grid cells in the medial entorhinal cortex (MEC) respond at multiple locations, with firing fields that form a periodic and hexagonal tiling of the environment. The joint activity of grid and place cell populations, as a function of location, forms a neural code for space. An ensemble of codes is generated by varying grid and place cell population parameters. For each code in this ensemble, codewords are generated by stimulating a network with a discrete set of locations. In this manuscript, we develop an understanding of the relationships between coding theoretic properties of these combined populations and code construction parameters. These relationships are revisited by measuring the performances of biologically realizable algorithms implemented by networks of place and grid cell populations, as well as constraint neurons, which perform de-noising operations. Objectives of this work include the investigation of coding theoretic limitations of the mammalian neural code for location and how communication between grid and place cell networks may improve the accuracy of each population’s representation. Simulations demonstrate that de-noising mechanisms analyzed here can significantly improve fidelity of this neural representation of space. Further, patterns observed in connectivity of each population of simulated cells suggest that inter-hippocampal-medial-entorhinal-cortical connectivity decreases downward along the dorsoventral axis.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04602v2
PDF http://arxiv.org/pdf/1712.04602v2.pdf
PWC https://paperswithcode.com/paper/on-the-organization-of-grid-and-place-cells
Repo
Framework
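
A toy sketch (numpy only) of the de-noising principle studied here: project noisy population activity onto a low-dimensional subspace learned from clean codewords. This stands in loosely for the biologically realizable constraint networks in the paper; the dimensions and noise level are made up.

```python
# Sketch: de-noising neural codewords by projection onto a learned subspace.
import numpy as np

rng = np.random.default_rng(0)
# a rank-5 population code embedded in a 50-dimensional activity space
codewords = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
noisy = codewords + 0.3 * rng.standard_normal(codewords.shape)

# learn the code subspace from the top principal components of clean codewords
mean = codewords.mean(axis=0)
_, _, Vt = np.linalg.svd(codewords - mean, full_matrices=False)
basis = Vt[:5]  # top-5 components span the code subspace

# de-noise: project the noisy activity onto the learned subspace
denoised = mean + (noisy - mean) @ basis.T @ basis
print(np.linalg.norm(noisy - codewords), np.linalg.norm(denoised - codewords))
```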

Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy

Title Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy
Authors Asit Mishra, Debbie Marr
Abstract Deep learning networks have achieved state-of-the-art accuracies on computer vision workloads like image classification and object detection. The performant systems, however, typically involve big models with numerous parameters. Once trained, a challenging aspect for such top-performing models is deployment on resource-constrained inference systems - the models (often deep networks or wide networks or both) are compute- and memory-intensive. Low-precision numerics and model compression using knowledge distillation are popular techniques for lowering both the compute requirements and the memory footprint of these deployed models. In this paper, we study the combination of these two techniques and show that the performance of low-precision networks can be significantly improved by using knowledge distillation techniques. Our approach, Apprentice, achieves state-of-the-art accuracies using ternary precision and 4-bit precision for variants of the ResNet architecture on the ImageNet dataset. We present three schemes by which one can apply knowledge distillation techniques to various stages of the train-and-deploy pipeline.
Tasks Image Classification, Model Compression, Object Detection
Published 2017-11-15
URL http://arxiv.org/abs/1711.05852v1
PDF http://arxiv.org/pdf/1711.05852v1.pdf
PWC https://paperswithcode.com/paper/apprentice-using-knowledge-distillation
Repo
Framework
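
For context, here is a minimal sketch (assuming PyTorch) of the standard knowledge distillation loss a framework like Apprentice builds on: a low-precision student matches the temperature-softened logits of a full-precision teacher in addition to the hard labels. The temperature and weighting are illustrative values, not the paper's.

```python
# Sketch: soft-target distillation loss (Hinton-style), illustrative constants.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # soft targets: KL divergence between temperature-softened distributions
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients keep a comparable magnitude
    # hard targets: ordinary cross-entropy against the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```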

A Survey of Model Compression and Acceleration for Deep Neural Networks

Title A Survey of Model Compression and Acceleration for Deep Neural Networks
Authors Yu Cheng, Duo Wang, Pan Zhou, Tao Zhang
Abstract Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past few years, tremendous progress has been made in this area. In this paper, we survey the recently developed techniques for compacting and accelerating CNN models. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing are described first, after which the other techniques are introduced. For each scheme, we provide insightful analysis regarding the performance, related applications, advantages, and drawbacks. We then go through a few very recent additional successful methods, for example, dynamic capacity networks and stochastic depth networks. After that, we survey the evaluation metrics, the main datasets used for evaluating model performance, and recent benchmarking efforts. Finally, we conclude the paper by discussing remaining challenges and possible directions on this topic.
Tasks Model Compression
Published 2017-10-23
URL https://arxiv.org/abs/1710.09282v8
PDF https://arxiv.org/pdf/1710.09282v8.pdf
PWC https://paperswithcode.com/paper/a-survey-of-model-compression-and
Repo
Framework
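
As a concrete instance of the first scheme the survey covers, here is a minimal sketch (numpy) of magnitude-based parameter pruning: zero out the weights with the smallest absolute values. The sparsity level is an illustrative choice.

```python
# Sketch: magnitude pruning, the simplest form of parameter pruning.
import numpy as np

def prune_by_magnitude(weights, sparsity=0.9):
    """Return a copy of `weights` with the smallest `sparsity` fraction zeroed."""
    threshold = np.quantile(np.abs(weights).ravel(), sparsity)
    mask = np.abs(weights) >= threshold
    # the mask is typically kept fixed during a sparse fine-tuning phase
    return weights * mask, mask

w = np.random.randn(256, 256)
w_pruned, mask = prune_by_magnitude(w, sparsity=0.9)
print(mask.mean())  # roughly 0.1 of the weights survive
```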

Texture Fuzzy Segmentation using Skew Divergence Adaptive Affinity Functions

Title Texture Fuzzy Segmentation using Skew Divergence Adaptive Affinity Functions
Authors José F. S. Neto, Waldson P. N. Leandro, Matheus A. Gadelha, Tiago S. Santos, Bruno M. Carvalho, Edgar Garduño
Abstract Digital image segmentation is the process of assigning distinct labels to different objects in a digital image, and the fuzzy segmentation algorithm has been successfully used in the segmentation of images from a wide variety of sources. However, the traditional fuzzy segmentation algorithm fails to segment objects that are characterized by textures whose patterns cannot be successfully described by simple statistics computed over a very restricted area. In this paper, we propose an extension of the fuzzy segmentation algorithm that uses adaptive textural affinity functions to perform the segmentation of such objects on bidimensional images. The adaptive affinity functions compute their appropriate neighborhood size as they compute the texture descriptors surrounding the seed spels (spatial elements), according to the characteristics of the texture being processed. The algorithm then segments the image with an appropriate neighborhood for each object. We performed experiments on mosaic images that were composed using images from the Brodatz database, and compared our results with the ones produced by a recently published texture segmentation algorithm, showing the applicability of our method.
Tasks Semantic Segmentation
Published 2017-10-07
URL http://arxiv.org/abs/1710.02754v1
PDF http://arxiv.org/pdf/1710.02754v1.pdf
PWC https://paperswithcode.com/paper/texture-fuzzy-segmentation-using-skew
Repo
Framework
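
A small sketch (numpy) of the skew divergence the affinity functions are named after: $KL(p \| \alpha q + (1-\alpha) p)$, which stays finite even when $q$ has empty bins, making it practical for comparing texture histograms. The histograms below are toy data.

```python
# Sketch: skew divergence between two (texture) histograms.
import numpy as np

def skew_divergence(p, q, alpha=0.99):
    p = p / p.sum()
    q = q / q.sum()
    mix = alpha * q + (1.0 - alpha) * p  # smooth q toward p to avoid log(0)
    support = p > 0
    return float(np.sum(p[support] * np.log(p[support] / mix[support])))

hist_a = np.array([5.0, 1.0, 0.0, 4.0])
hist_b = np.array([4.0, 2.0, 1.0, 3.0])
print(skew_divergence(hist_a, hist_b))
```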

Learning Control for Air Hockey Striking using Deep Reinforcement Learning

Title Learning Control for Air Hockey Striking using Deep Reinforcement Learning
Authors Ayal Taitler, Nahum Shimkin
Abstract We consider the task of learning control policies for a robotic mechanism striking a puck in an air hockey game. The control signal is a direct command to the robot’s motors. We employ a model-free deep reinforcement learning framework to learn the motoric skills of striking the puck accurately in order to score. We propose certain improvements to the standard learning scheme which make the deep Q-learning algorithm feasible when it might otherwise fail. Our improvements include integrating prior knowledge into the learning scheme and accounting for the changing distribution of samples in the experience replay buffer. Finally, we present our simulation results for aimed striking, which demonstrate the successful learning of this task and the improvement in algorithm stability due to the proposed modifications.
Tasks Q-Learning
Published 2017-02-26
URL http://arxiv.org/abs/1702.08074v2
PDF http://arxiv.org/pdf/1702.08074v2.pdf
PWC https://paperswithcode.com/paper/learning-control-for-air-hockey-striking
Repo
Framework
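
For reference, here is a compact sketch (assuming PyTorch) of the vanilla deep Q-learning update the paper builds on; the prior-knowledge and replay-distribution improvements it proposes are not reproduced here, and the network shapes are placeholders.

```python
# Sketch: one DQN update step with a target network; shapes are illustrative.
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 3))
target_net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 3))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_step(states, actions, rewards, next_states, dones, gamma=0.99):
    # Q(s, a) for the actions actually taken (actions must be int64 indices)
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # bootstrap the target from a slowly-updated copy of the network
        target = rewards + gamma * (1 - dones) * target_net(next_states).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```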

Massive Data Clustering in Moderate Dimensions from the Dual Spaces of Observation and Attribute Data Clouds

Title Massive Data Clustering in Moderate Dimensions from the Dual Spaces of Observation and Attribute Data Clouds
Authors Fionn Murtagh
Abstract Cluster analysis of very high dimensional data can benefit from the properties of such high dimensionality. Informally expressed, in this work, our focus is on the analogous situation when the dimensionality is moderate to small, relative to a massively sized set of observations. Mathematically expressed, these are the dual spaces of observations and attributes. The point cloud of observations is in attribute space, and the point cloud of attributes is in observation space. In this paper, we begin by summarizing various perspectives related to methodologies that are used in multivariate analytics. We draw on these to establish an efficient clustering processing pipeline, both partitioning and hierarchical clustering.
Tasks
Published 2017-04-06
URL http://arxiv.org/abs/1704.01871v1
PDF http://arxiv.org/pdf/1704.01871v1.pdf
PWC https://paperswithcode.com/paper/massive-data-clustering-in-moderate
Repo
Framework
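
A sketch (scikit-learn and scipy) of the kind of two-stage pipeline the paper discusses for massive data: a fast partition of the observation cloud first, then hierarchical clustering on the group representatives. The sizes and cluster counts are illustrative.

```python
# Sketch: partition-then-hierarchical clustering pipeline for massive data.
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.rand(20_000, 10)  # many observations, moderate dimensionality

# stage 1: coarse k-means partition of the massive observation cloud
km = KMeans(n_clusters=200, n_init=10, random_state=0).fit(X)

# stage 2: hierarchical (Ward) clustering of the 200 centroids
Z = linkage(km.cluster_centers_, method="ward")
coarse_labels = fcluster(Z, t=10, criterion="maxclust")

# map each observation to its final cluster via its centroid
labels = coarse_labels[km.labels_]
```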

This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News

Title This Just In: Fake News Packs a Lot in Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire than Real News
Authors Benjamin D. Horne, Sibel Adali
Abstract The problem of fake news has gained a lot of attention as it is claimed to have had a significant impact on the 2016 US Presidential Election. Fake news is not a new problem, and its spread in social networks is well-studied. Often an underlying assumption in discussions of fake news is that it is written to look like real news, fooling the reader who does not check the reliability of the sources or the arguments in its content. Through a unique study of three data sets and features that capture the style and the language of articles, we show that this assumption is not true. Fake news in most cases is more similar to satire than to real news, leading us to conclude that persuasion in fake news is achieved through heuristics rather than the strength of arguments. We show that overall title structure and the use of proper nouns in titles are very significant in differentiating fake from real news. This leads us to conclude that fake news is targeted at audiences who are not likely to read beyond titles and is aimed at creating mental associations between entities and claims.
Tasks
Published 2017-03-28
URL http://arxiv.org/abs/1703.09398v1
PDF http://arxiv.org/pdf/1703.09398v1.pdf
PWC https://paperswithcode.com/paper/this-just-in-fake-news-packs-a-lot-in-title
Repo
Framework
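
A toy sketch (standard library only) of one of the stylistic features the study highlights: how heavily a title is packed with proper-noun-like tokens. Capitalization is used here as a crude stand-in for the real part-of-speech tagging such a study would use.

```python
# Sketch: crude proper-noun density feature for headlines.
def proper_noun_ratio(title):
    words = title.split()
    if len(words) < 2:
        return 0.0
    # skip the first word, which is capitalized regardless of word type
    caps = sum(1 for w in words[1:] if w[:1].isupper())
    return caps / (len(words) - 1)

print(proper_noun_ratio("BREAKING: Clinton Aide Arrested In FBI Email Probe"))
print(proper_noun_ratio("Senate passes budget resolution after long debate"))
```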

Learning Topic-Sensitive Word Representations

Title Learning Topic-Sensitive Word Representations
Authors Marzieh Fadaee, Arianna Bisazza, Christof Monz
Abstract Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00441v1
PDF http://arxiv.org/pdf/1705.00441v1.pdf
PWC https://paperswithcode.com/paper/learning-topic-sensitive-word-representations
Repo
Framework
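
A rough sketch (assuming gensim) of the general idea: tag each token with a document-level topic id so that word2vec learns a separate vector per (word, topic) pair. The paper uses a Hierarchical Dirichlet Process and soft topic distributions; a hard, hand-assigned topic id is used here purely for brevity.

```python
# Sketch: topic-tagged tokens give one embedding per word sense.
from gensim.models import Word2Vec

docs = [["the", "bank", "approved", "the", "loan"],
        ["we", "walked", "along", "the", "river", "bank"]]
doc_topics = [0, 1]  # pretend topic ids from a topic model (HDP in the paper)

tagged = [[f"{w}#{t}" for w in doc] for doc, t in zip(docs, doc_topics)]
model = Word2Vec(tagged, vector_size=50, min_count=1, window=3)

# "bank#0" (finance) and "bank#1" (geography) now have separate vectors
print(model.wv["bank#0"][:5], model.wv["bank#1"][:5])
```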

Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization

Title Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization
Authors Zhouyuan Huo, Bin Gu, Ji Liu, Heng Huang
Abstract Stochastic composition optimization has drawn much attention recently and has been successful in many emerging applications of machine learning, statistical analysis, and reinforcement learning. In this paper, we focus on the composition problem with a nonsmooth regularization penalty. Previous works either have slow convergence rates or do not provide a complete convergence analysis for the general problem. In this paper, we tackle these two issues by proposing a new stochastic composition optimization method for the composition problem with a nonsmooth regularization penalty. In our method, we apply a variance reduction technique to accelerate convergence. To the best of our knowledge, our method admits the fastest convergence rate for stochastic composition optimization: for the strongly convex composition problem, our algorithm is proved to admit linear convergence; for the general composition problem, our algorithm significantly improves the state-of-the-art convergence rate from $O(T^{-1/2})$ to $O((n_1+n_2)^{2/3}T^{-1})$. Finally, we apply our proposed algorithm to portfolio management and policy evaluation in reinforcement learning. Experimental results verify our theoretical analysis.
Tasks
Published 2017-11-10
URL http://arxiv.org/abs/1711.03937v2
PDF http://arxiv.org/pdf/1711.03937v2.pdf
PWC https://paperswithcode.com/paper/accelerated-method-for-stochastic-composition
Repo
Framework
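
A toy sketch (numpy) of the problem structure, $\min_x f(g(x)) + h(x)$ with nonsmooth $h = \lambda\|x\|_1$: a running estimate tracks the inner function $g$, a sampled chain-rule gradient drives the update, and a proximal (soft-thresholding) step handles the penalty. This shows only the basic two-timescale prox-SCGD template, not the paper's variance-reduced acceleration; all constants are illustrative.

```python
# Sketch: basic proximal stochastic composition gradient (no variance reduction).
import numpy as np

rng = np.random.default_rng(0)
n1, n2, d, m = 50, 50, 10, 8
A = rng.standard_normal((n1, m, d))   # g(x) = mean_i A_i @ x
b = rng.standard_normal((n2, m))      # f(y) = mean_j 0.5 * ||y - b_j||^2
lam, eta, beta = 0.05, 0.05, 0.2

def prox_l1(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

x = np.zeros(d)
y = np.zeros(m)                       # running estimate of g(x)
for k in range(3000):
    i, j = rng.integers(n1), rng.integers(n2)
    y = (1 - beta) * y + beta * (A[i] @ x)    # sampled inner-function tracker
    grad = A[i].T @ (y - b[j])                # sampled chain-rule gradient
    x = prox_l1(x - eta * grad, eta * lam)    # proximal (soft-threshold) step
```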

Stability Selection for Structured Variable Selection

Title Stability Selection for Structured Variable Selection
Authors George Philipp, Seunghak Lee, Eric P. Xing
Abstract In variable or graph selection problems, finding a right-sized model or controlling the number of false positives is notoriously difficult. Recently, a meta-algorithm called Stability Selection was proposed that can provide reliable finite-sample control of the number of false positives. Its benefits were demonstrated when used in conjunction with the lasso and orthogonal matching pursuit algorithms. In this paper, we investigate the applicability of stability selection to structured selection algorithms: the group lasso and the structured input-output lasso. We find that using stability selection often increases the power of both algorithms, but that the presence of complex structure reduces the reliability of error control under stability selection. We give strategies for setting tuning parameters to obtain a good model size under stability selection, and highlight its strengths and weaknesses compared to the competing methods screen-and-clean and cross-validation. We give guidelines about when to use which error control method.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04688v1
PDF http://arxiv.org/pdf/1712.04688v1.pdf
PWC https://paperswithcode.com/paper/stability-selection-for-structured-variable
Repo
Framework
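
A minimal sketch (scikit-learn) of the stability selection meta-algorithm in its original lasso form: refit on many random subsamples and keep the variables whose selection frequency exceeds a threshold. The regularization strength, number of runs, and threshold are illustrative choices.

```python
# Sketch: stability selection wrapped around the lasso.
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, alpha=0.1, n_runs=100, threshold=0.6, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_runs):
        idx = rng.choice(n, size=n // 2, replace=False)  # half-sized subsample
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += coef != 0                              # tally selections
    freq = counts / n_runs
    return np.where(freq >= threshold)[0], freq

X = np.random.randn(200, 50)
y = X[:, 0] * 3 + X[:, 1] * 2 + np.random.randn(200)
selected, freq = stability_selection(X, y)
print(selected)  # ideally the truly active variables {0, 1}
```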

Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently

Title Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently
Authors Muthuraman Chidambaram, Yanjun Qi
Abstract The idea of style transfer has largely only been explored in image-based tasks, which we attribute in part to the specific nature of loss functions used for style transfer. We propose a general formulation of style transfer as an extension of generative adversarial networks, by using a discriminator to regularize a generator with an otherwise separate loss function. We apply our approach to the task of learning to play chess in the style of a specific player, and present empirical evidence for the viability of our approach.
Tasks Style Transfer
Published 2017-02-22
URL http://arxiv.org/abs/1702.06762v2
PDF http://arxiv.org/pdf/1702.06762v2.pdf
PWC https://paperswithcode.com/paper/style-transfer-generative-adversarial
Repo
Framework
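
A schematic sketch (assuming PyTorch) of the paper's general formulation: the generator minimizes its otherwise separate task loss plus an adversarial term from a discriminator trained on examples of the target style (in the paper, a specific player's moves). Networks, sizes, and the style weight are placeholders.

```python
# Sketch: discriminator-regularized generator loss for style transfer.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))
D = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
bce = nn.BCEWithLogitsLoss()

def generator_loss(state, target, style_weight=0.5):
    out = G(state)
    task = nn.functional.mse_loss(out, target)     # the otherwise separate loss
    # adversarial regularizer: D should believe the output is in-style
    adv = bce(D(out), torch.ones(out.size(0), 1))
    return task + style_weight * adv

def discriminator_loss(style_examples, state):
    real = bce(D(style_examples), torch.ones(style_examples.size(0), 1))
    fake = bce(D(G(state).detach()), torch.zeros(state.size(0), 1))
    return real + fake
```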

Deep Learning to Attend to Risk in ICU

Title Deep Learning to Attend to Risk in ICU
Authors Phuoc Nguyen, Truyen Tran, Svetha Venkatesh
Abstract Modeling physiological time-series in ICU is of high clinical importance. However, data collected within ICU are irregular in time and often contain missing measurements. Since absence of a measure would signify its lack of importance, the missingness is indeed informative and might reflect the decision making by the clinician. Here we propose a deep learning architecture that can effectively handle these challenges for predicting ICU mortality outcomes. The model is based on Long Short-Term Memory, and has layered attention mechanisms. At the sensing layer, the model decides whether to observe and incorporate parts of the current measurements. At the reasoning layer, evidences across time steps are weighted and combined. The model is evaluated on the PhysioNet 2012 dataset showing competitive and interpretable results.
Tasks Decision Making, Time Series
Published 2017-07-17
URL http://arxiv.org/abs/1707.05010v1
PDF http://arxiv.org/pdf/1707.05010v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-to-attend-to-risk-in-icu
Repo
Framework
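
A minimal sketch (assuming PyTorch) of the reasoning-layer idea: score each time step of an LSTM's output, softmax the scores into attention weights, and combine the weighted evidence into a single vector for the mortality prediction. Feature counts and sizes are made up, not taken from the PhysioNet setup.

```python
# Sketch: LSTM with attention pooled over time steps for a binary outcome.
import torch
import torch.nn as nn

class AttentiveLSTM(nn.Module):
    def __init__(self, n_features=37, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)   # one attention score per time step
        self.out = nn.Linear(hidden, 1)     # mortality logit

    def forward(self, x):                   # x: (batch, time, features)
        h, _ = self.lstm(x)                 # h: (batch, time, hidden)
        w = torch.softmax(self.score(h), dim=1)  # weights across time steps
        context = (w * h).sum(dim=1)        # weighted combination of evidence
        return self.out(context)

model = AttentiveLSTM()
logit = model(torch.randn(4, 48, 37))       # e.g. 48 hourly steps, 37 channels
```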

Slope Stability Analysis with Geometric Semantic Genetic Programming

Title Slope Stability Analysis with Geometric Semantic Genetic Programming
Authors Juncai Xu, Zhenzhong Shen, Qingwen Ren, Xin Xie, Zhengyu Yang
Abstract Genetic programming has been widely used in the engineering field. Compared with conventional genetic programming and artificial neural networks, geometric semantic genetic programming (GSGP) is superior in convergence and computing efficiency. In this paper, GSGP is adopted for the classification and regression analysis of a sample dataset. Furthermore, a model for slope stability analysis is established on the basis of geometric semantics. According to the results of the study based on GSGP, the method can analyze slope stability objectively and is highly precise in predicting slope stability and safety factors. Hence, the predicted results can be used as a reference for slope safety design.
Tasks
Published 2017-08-30
URL http://arxiv.org/abs/1708.09116v2
PDF http://arxiv.org/pdf/1708.09116v2.pdf
PWC https://paperswithcode.com/paper/slope-stability-analysis-with-geometric
Repo
Framework
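
A tiny sketch (pure Python) of the geometric semantic crossover at the core of GSGP: offspring semantics are a convex combination of the parents', which keeps the search inside the segment between them in semantic space and is what gives GSGP its favorable convergence behavior.

```python
# Sketch: geometric semantic crossover between two candidate programs.
import random

def gs_crossover(parent1, parent2):
    """parent1, parent2: callables mapping an input vector to a prediction."""
    r = random.random()  # random constant in [0, 1], fixed at creation time
    return lambda x: r * parent1(x) + (1.0 - r) * parent2(x)

p1 = lambda x: x[0] + 2.0 * x[1]
p2 = lambda x: x[0] * x[1]
child = gs_crossover(p1, p2)
print(child([1.0, 3.0]))  # lies between p1([1,3]) = 7.0 and p2([1,3]) = 3.0
```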

Forecasting Across Time Series Databases using Recurrent Neural Networks on Groups of Similar Series: A Clustering Approach

Title Forecasting Across Time Series Databases using Recurrent Neural Networks on Groups of Similar Series: A Clustering Approach
Authors Kasun Bandara, Christoph Bergmeir, Slawek Smyl
Abstract With the advent of Big Data, databases containing large quantities of similar time series are nowadays available in many applications. Forecasting time series in these domains with traditional univariate forecasting procedures leaves great potential for producing accurate forecasts untapped. Recurrent neural networks (RNNs), and in particular Long Short-Term Memory (LSTM) networks, have recently proven that they are able to outperform state-of-the-art univariate time series forecasting methods in this context when trained across all available time series. However, if the time series database is heterogeneous, accuracy may degenerate, so that on the way towards fully automatic forecasting methods in this space, a notion of similarity between the time series needs to be built into the methods. To this end, we present a prediction model that can be used with different types of RNN models on subgroups of similar time series, which are identified by time series clustering techniques. We assess our proposed methodology using LSTM networks, a widely popular RNN variant. Our method achieves competitive results on benchmarking datasets under competition evaluation procedures. In particular, in terms of mean sMAPE accuracy, it consistently outperforms the baseline LSTM model and outperforms all other methods on the CIF2016 forecasting competition dataset.
Tasks Time Series, Time Series Clustering, Time Series Forecasting
Published 2017-10-09
URL http://arxiv.org/abs/1710.03222v2
PDF http://arxiv.org/pdf/1710.03222v2.pdf
PWC https://paperswithcode.com/paper/forecasting-across-time-series-databases
Repo
Framework
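
A sketch (scikit-learn) of the "cluster, then train one model per group" strategy: describe each series with cheap summary features, group by those features, and fit a separate forecaster per cluster. The paper trains LSTMs per subgroup; any per-group model slots into the final step, and the features below are illustrative stand-ins for proper time series clustering.

```python
# Sketch: group similar series by summary features, then model per group.
import numpy as np
from sklearn.cluster import KMeans

series = [np.random.rand(60) * s for s in (1, 1, 10, 10)]  # toy database

def features(ts):
    # cheap descriptors: level, spread, and lag-1 autocorrelation
    return [ts.mean(), ts.std(), np.corrcoef(ts[:-1], ts[1:])[0, 1]]

F = np.array([features(ts) for ts in series])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(F)

groups = {c: [ts for ts, l in zip(series, labels) if l == c]
          for c in set(labels)}
# ... then train one RNN/LSTM forecaster per group on the series in groups[c]
```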