July 28, 2019


Paper Group ANR 211

An Attention-Based Deep Net for Learning to Rank. Image Segmentation of Multi-Shaped Overlapping Objects. Topology Analysis of International Networks Based on Debates in the United Nations. Kernel Scaling for Manifold Learning and Classification. Batch Reinforcement Learning on the Industrial Benchmark: First Experiences. Long-Term Ensemble Learnin …

An Attention-Based Deep Net for Learning to Rank


Title	An Attention-Based Deep Net for Learning to Rank
Authors	Baiyang Wang, Diego Klabjan
Abstract	In information retrieval, learning to rank constructs a machine-based ranking model which, given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to this problem, and in this paper, we propose an attention-based deep neural network which better incorporates different embeddings of the queries and search results with an attention-based mechanism. This model also applies a decoder mechanism to learn the ranks of the search results in a listwise fashion. The embeddings are trained with convolutional neural networks or the word2vec model. We demonstrate the performance of this model with image retrieval and text querying data sets.
Tasks	Image Retrieval, Information Retrieval, Learning-To-Rank
Published	2017-02-20
URL	http://arxiv.org/abs/1702.06106v3
PDF	http://arxiv.org/pdf/1702.06106v3.pdf
PWC	https://paperswithcode.com/paper/an-attention-based-deep-net-for-learning-to
Repo
Framework

Image Segmentation of Multi-Shaped Overlapping Objects


Title	Image Segmentation of Multi-Shaped Overlapping Objects
Authors	Kumar Abhinav, Jaideep Singh Chauhan, Debasis Sarkar
Abstract	In this work, we propose a new segmentation algorithm for images containing convex objects present in multiple shapes with a high degree of overlap. The proposed algorithm is carried out in two steps. First, we identify the visible contours, segment them using concave points, and group the segments belonging to the same object. The next step is to assign a shape identity to these grouped contour segments. For images containing objects in multiple shapes, we first identify the shape classes of the contours and then assign a shape entity to these classes. We provide comprehensive experiments with our algorithm on two crystal image datasets. One dataset comprises images containing objects in multiple shapes overlapping each other, and the other contains standard images with objects present in a single shape. We test our algorithm against two baselines, and it outperforms both.
Tasks	Semantic Segmentation
Published	2017-11-06
URL	http://arxiv.org/abs/1711.02217v1
PDF	http://arxiv.org/pdf/1711.02217v1.pdf
PWC	https://paperswithcode.com/paper/image-segmentation-of-multi-shaped
Repo
Framework

Topology Analysis of International Networks Based on Debates in the United Nations


Title	Topology Analysis of International Networks Based on Debates in the United Nations
Authors	Stefano Gurciullo, Slava Mikhaylov
Abstract	In complex, high-dimensional and unstructured data it is often difficult to extract meaningful patterns. This is especially the case when dealing with textual data. Recent studies in machine learning, information theory and network science have developed several novel instruments to extract the semantics of unstructured data, and harness it to build a network of relations. Such approaches serve as an efficient tool for dimensionality reduction and pattern detection. This paper applies semantic network science to extract ideological proximity in the international arena, by focusing on the data from General Debates in the UN General Assembly on the topics of high salience to the international community. The UN General Debate corpus (UNGDC) covers all high-level debates in the UN General Assembly from 1970 to 2014, covering all UN member states. The research proceeds in three main steps. First, Latent Dirichlet Allocation (LDA) is used to extract the topics of the UN speeches, and therefore semantic information. Each country is then assigned a vector specifying the exposure to each of the topics identified. This intermediate output is then used to construct a network of countries based on information-theoretical metrics, where the links capture similar vectorial patterns in the topic distributions. The topology of the networks is then analyzed through network properties like density, path length and clustering. Finally, we identify specific topological features of our networks using the map equation framework to detect communities in our networks of countries.
Tasks	Dimensionality Reduction
Published	2017-07-29
URL	http://arxiv.org/abs/1707.09491v1
PDF	http://arxiv.org/pdf/1707.09491v1.pdf
PWC	https://paperswithcode.com/paper/topology-analysis-of-international-networks
Repo
Framework
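
The network-construction step described in the abstract above can be sketched as follows. The abstract does not name the information-theoretic metric, so the Jensen-Shannon divergence, the threshold value, and the toy topic vectors here are all assumptions chosen for illustration, not the authors' setup.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def similarity_network(topic_vectors, threshold):
    """Link two countries when their topic distributions diverge by less than threshold."""
    n = len(topic_vectors)
    adjacency = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            if js_divergence(topic_vectors[i], topic_vectors[j]) < threshold:
                adjacency[i, j] = adjacency[j, i] = True
    return adjacency

# Hypothetical topic exposures for three countries over three topics.
topics = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
])
network = similarity_network(topics, threshold=0.1)
```

In this toy input the first two countries have similar topic exposure and end up linked, while the third stays isolated. Properties such as density or clustering could then be computed on `network`.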

Kernel Scaling for Manifold Learning and Classification


Title	Kernel Scaling for Manifold Learning and Classification
Authors	Ofir Lindenbaum, Moshe Salhov, Arie Yeredor, Amir Averbuch
Abstract	Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. Setting the kernel’s scale parameter, also referred to as the kernel’s bandwidth, strongly affects the performance of the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek a scale which is best at capturing the manifold’s intrinsic dimension. For classification, we propose three methods for estimating the scale, which optimize the classification results in different senses. The proposed frameworks are simulated on artificial and on real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
Tasks	Dimensionality Reduction
Published	2017-07-04
URL	https://arxiv.org/abs/1707.01093v2
PDF	https://arxiv.org/pdf/1707.01093v2.pdf
PWC	https://paperswithcode.com/paper/kernel-scaling-for-manifold-learning-and
Repo
Framework
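
To make the role of the scale parameter concrete, here is a minimal sketch of a Gaussian kernel with an explicit bandwidth. The median heuristic shown is a common baseline and is not one of the methods proposed in the paper; the function names and the `exp(-d^2 / (2 * scale^2))` convention are assumptions.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

def gaussian_kernel(X, scale):
    """Gaussian kernel matrix; `scale` is the bandwidth."""
    return np.exp(-pairwise_sq_dists(X) / (2.0 * scale ** 2))

def median_heuristic(X):
    """Baseline bandwidth: the median pairwise distance (not the paper's estimator)."""
    d2 = pairwise_sq_dists(X)
    upper = d2[np.triu_indices_from(d2, k=1)]
    return np.sqrt(np.median(upper))

X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
scale = median_heuristic(X)
K = gaussian_kernel(X, scale)
```

A small scale drives the off-diagonal entries of `K` toward zero, and a large one drives them toward one, which is why the choice affects downstream classification and embedding quality.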

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences


Title	Batch Reinforcement Learning on the Industrial Benchmark: First Experiences
Authors	Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
Abstract	The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.
Tasks
Published	2017-05-20
URL	http://arxiv.org/abs/1705.07262v2
PDF	http://arxiv.org/pdf/1705.07262v2.pdf
PWC	https://paperswithcode.com/paper/batch-reinforcement-learning-on-the
Repo
Framework

Long-Term Ensemble Learning of Visual Place Classifiers


Title	Long-Term Ensemble Learning of Visual Place Classifiers
Authors	Xiaoxiao Fei, Kanji Tanaka, Yichu Fang, Akitaka Takayama
Abstract	This paper addresses the problem of cross-season visual place classification (VPC) from a novel perspective of long-term map learning. Our goal is to enable transfer learning efficiently from one season to the next, at a small constant cost, and without wasting the robot’s available long-term memory by memorizing very large amounts of training data. To realize a good tradeoff between generalization and specialization abilities, we employ an ensemble of deep convolutional neural network (DCN) classifiers and consider the task of scheduling (when and which classifiers to retrain), given a previous season’s DCN classifiers as the sole prior knowledge. We present a unified framework for retraining scheduling and discuss practical implementation strategies. Furthermore, we address the task of partitioning a robot’s workspace into places to define place classes in an unsupervised manner, rather than using uniform partitioning, so as to maximize VPC performance. Experiments using the publicly available NCLT dataset revealed that retraining scheduling of a DCN classifier ensemble is crucial and that performance is significantly increased by using planned scheduling.
Tasks	Transfer Learning
Published	2017-09-16
URL	http://arxiv.org/abs/1709.05470v1
PDF	http://arxiv.org/pdf/1709.05470v1.pdf
PWC	https://paperswithcode.com/paper/long-term-ensemble-learning-of-visual-place
Repo
Framework

Early Stopping without a Validation Set


Title	Early Stopping without a Validation Set
Authors	Maren Mahsereci, Lukas Balles, Christoph Lassner, Philipp Hennig
Abstract	Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training set and a smaller validation set to obtain an ongoing estimate of the generalization performance. We propose a novel early stopping criterion that is based on fast-to-compute local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression, as well as neural networks.
Tasks
Published	2017-03-28
URL	http://arxiv.org/abs/1703.09580v3
PDF	http://arxiv.org/pdf/1703.09580v3.pdf
PWC	https://paperswithcode.com/paper/early-stopping-without-a-validation-set
Repo
Framework

Learning Graph Representations with Embedding Propagation


Title	Learning Graph Representations with Embedding Propagation
Authors	Alberto Garcia-Duran, Mathias Niepert
Abstract	We propose Embedding Propagation (EP), an unsupervised learning framework for graph-structured data. EP learns vector representations of graphs by passing two types of messages between neighboring nodes. Forward messages consist of label representations such as representations of words and other attributes associated with the nodes. Backward messages consist of gradients that result from aggregating the label representations and applying a reconstruction loss. Node representations are finally computed from the representation of their labels. With significantly fewer parameters and hyperparameters, an instance of EP is competitive with and often outperforms state-of-the-art unsupervised and semi-supervised learning methods on a range of benchmark data sets.
Tasks
Published	2017-10-09
URL	http://arxiv.org/abs/1710.03059v1
PDF	http://arxiv.org/pdf/1710.03059v1.pdf
PWC	https://paperswithcode.com/paper/learning-graph-representations-with-embedding
Repo
Framework

Binary-decomposed DCNN for accelerating computation and compressing model without retraining


Title	Binary-decomposed DCNN for accelerating computation and compressing model without retraining
Authors	Ryuji Kamiya, Takayoshi Yamashita, Mitsuru Ambai, Ikuro Sato, Yuji Yamauchi, Hironobu Fujiyoshi
Abstract	Recent trends show recognition accuracy increasing even more profoundly. The inference process of Deep Convolutional Neural Networks (DCNN) involves a large number of parameters, requires a large amount of computation, and can be very slow. The large number of parameters also requires large amounts of memory. This results in increasingly long computation times and large model sizes. To deploy DCNNs on mobile and other low-performance devices, model sizes must be compressed and computation must be accelerated. To that end, this paper proposes Binary-decomposed DCNN, which resolves these issues without the need for retraining. Our method replaces real-valued inner-product computations with binary inner-product computations in existing network models to accelerate computation of inference and decrease model size without the need for retraining. Binary computations can be done at high speed using logical operators such as XOR and AND, together with bit counting. In tests using AlexNet with the ImageNet classification task, speed increased by a factor of 1.79, models were compressed by approximately 80%, and the increase in error rate was limited to 1.20%. With VGG-16, speed increased by a factor of 2.07, model sizes decreased by 81%, and error increased by only 2.16%.
Tasks
Published	2017-09-14
URL	http://arxiv.org/abs/1709.04731v1
PDF	http://arxiv.org/pdf/1709.04731v1.pdf
PWC	https://paperswithcode.com/paper/binary-decomposed-dcnn-for-accelerating
Repo
Framework
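
The abstract's remark that binary inner products reduce to logical operators and bit counting can be illustrated with a small sketch. It covers only the core identity for two vectors with entries in {+1, -1}; the paper's decomposition of real-valued weights into binary bases is not reproduced, and the helper names are hypothetical.

```python
def pack_signs(values):
    """Pack a sequence of +1/-1 values into an int, one bit per element (bit 1 means +1)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return bits

def binary_dot(a_bits, b_bits, n):
    """Inner product of two packed +1/-1 vectors of length n."""
    # XOR sets a bit wherever the signs differ; each mismatch contributes -1
    # instead of +1, so the dot product is n - 2 * (number of mismatches).
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
```

Here `result` and `reference` agree, which is the property that lets a real-valued multiply-accumulate loop be replaced by an XOR followed by a population count.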

A Survey on Optical Character Recognition System


Title	A Survey on Optical Character Recognition System
Authors	Noman Islam, Zeeshan Islam, Nazia Noor
Abstract	Optical Character Recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into its constituent characters. Despite decades of intense research, developing OCR with capabilities comparable to those of humans still remains an open challenge. Due to this challenging nature, researchers from industry and academic circles have directed their attention towards Optical Character Recognition. Over the last few years, the number of academic laboratories and companies involved in research on Character Recognition has increased dramatically. This research aims at summarizing the research done so far in the field of OCR. It provides an overview of different aspects of OCR and discusses corresponding proposals aimed at resolving issues of OCR.
Tasks	Optical Character Recognition
Published	2017-10-03
URL	http://arxiv.org/abs/1710.05703v1
PDF	http://arxiv.org/pdf/1710.05703v1.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-optical-character-recognition
Repo
Framework

Cortical microcircuits as gated-recurrent neural networks


Title	Cortical microcircuits as gated-recurrent neural networks
Authors	Rui Ponte Costa, Yannis M. Assael, Brendan Shillingford, Nando de Freitas, Tim P. Vogels
Abstract	Cortical circuits exhibit intricate recurrent architectures that are remarkably similar across different brain areas. Such stereotyped structure suggests the existence of common computational principles. However, such principles have remained largely elusive. Inspired by gated-memory networks, namely long short-term memory networks (LSTMs), we introduce a recurrent neural network in which information is gated through inhibitory cells that are subtractive (subLSTM). We propose a natural mapping of subLSTMs onto known canonical excitatory-inhibitory cortical microcircuits. Our empirical evaluation across sequential image classification and language modelling tasks shows that subLSTM units can achieve similar performance to LSTM units. These results suggest that cortical circuits can be optimised to solve complex contextual problems and propose a novel view on their computational function. Overall, our work provides a step towards unifying recurrent networks as used in machine learning with their biological counterparts.
Tasks	Image Classification, Language Modelling, Sequential Image Classification
Published	2017-11-07
URL	http://arxiv.org/abs/1711.02448v2
PDF	http://arxiv.org/pdf/1711.02448v2.pdf
PWC	https://paperswithcode.com/paper/cortical-microcircuits-as-gated-recurrent
Repo
Framework
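
The idea of subtractive gating can be sketched as a single recurrent step. The abstract only states that gating is subtractive, so the exact update equations, the weight layout, and all names below are assumptions for illustration: the input and output gates are subtracted where a standard LSTM would multiply, and the forget gate stays multiplicative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sublstm_step(x, h_prev, c_prev, W, b):
    """One step of a subtractive-gated recurrent unit (illustrative form).

    W has shape (4 * n, d + n) and b has shape (4 * n,), where d is the
    input size and n the hidden size (a hypothetical stacked layout).
    """
    n = h_prev.shape[0]
    pre = W @ np.concatenate([x, h_prev]) + b
    # Candidate input z and gates i, f, o all pass through a sigmoid.
    z, i, f, o = (sigmoid(pre[k * n:(k + 1) * n]) for k in range(4))
    c = f * c_prev + z - i   # input gate is subtracted, not multiplied
    h = sigmoid(c) - o       # output gate is subtracted, not multiplied
    return h, c

d, n = 3, 2
W = np.zeros((4 * n, d + n))
b = np.zeros(4 * n)
h, c = sublstm_step(np.ones(d), np.zeros(n), np.zeros(n), W, b)
```

With all-zero weights every gate equals 0.5, so the subtractions cancel and both `h` and `c` stay at zero, which is a quick sanity check on the wiring.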

Deep Control - a simple automatic gain control for memory efficient and high performance training of deep convolutional neural networks


Title	Deep Control - a simple automatic gain control for memory efficient and high performance training of deep convolutional neural networks
Authors	Brendan Ruff
Abstract	Training a deep convolutional neural net typically starts with a random initialisation of all filters in all layers, which severely reduces the forward signal and back-propagated error and leads to slow and sub-optimal training. Techniques that counter this focus on either increasing the signal or increasing the gradients adaptively, but the model behaves very differently at the beginning of training compared to later, when stable pathways through the net have been established. To compound this problem, the effective minibatch size varies greatly between layers at different depths and between individual filters, as activation sparsity typically increases with depth. This leads to a reduction in effective learning rate, since gradients may superpose rather than add, and it further compounds the covariate shift problem, as deeper neurons are less able to adapt to upstream shift. Proposed here is a method of automatic gain control of the signal, built into each convolutional neuron, that achieves performance equivalent or superior to batch normalisation and is compatible with single-sample or minibatch gradient descent. The same model is used both for training and inference. The technique comprises a scaled per-sample map mean subtraction from the raw convolutional filter output, followed by scaling of the difference.
Tasks
Published	2017-06-13
URL	http://arxiv.org/abs/1706.03907v1
PDF	http://arxiv.org/pdf/1706.03907v1.pdf
PWC	https://paperswithcode.com/paper/deep-control-a-simple-automatic-gain-control
Repo
Framework
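
The final sentence of the abstract describes the operation compactly, and one possible reading of it can be sketched in NumPy. How the two scale factors are parameterised or learned is not stated in the abstract, so `mean_scale` and `gain` are hypothetical per-channel parameters.

```python
import numpy as np

def gain_control(feature_maps, mean_scale, gain):
    """Subtract a scaled per-sample, per-map mean, then scale the difference.

    feature_maps: raw convolution outputs, shape (batch, channels, height, width).
    mean_scale, gain: one value per channel (hypothetical parameterisation).
    """
    # Mean over the spatial map of each sample and channel, so no batch
    # statistics are needed and a single sample is handled the same way.
    map_mean = feature_maps.mean(axis=(2, 3), keepdims=True)
    ms = np.reshape(mean_scale, (1, -1, 1, 1))
    g = np.reshape(gain, (1, -1, 1, 1))
    return g * (feature_maps - ms * map_mean)

x = np.ones((2, 3, 4, 4))
centered = gain_control(x, mean_scale=[1.0, 1.0, 1.0], gain=[2.0, 2.0, 2.0])
scaled_only = gain_control(x, mean_scale=[0.0, 0.0, 0.0], gain=[2.0, 2.0, 2.0])
```

Because the statistics come from each sample's own feature map, the same computation applies during training and inference, consistent with the abstract's claim that one model serves both.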

Approximating Continuous Functions by ReLU Nets of Minimal Width


Title	Approximating Continuous Functions by ReLU Nets of Minimal Width
Authors	Boris Hanin, Mark Sellke
Abstract	This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? It turns out that this minimal width is exactly equal to $d_{in}+1.$ That is, if all the hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, ReLU nets can only express a very limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by ReLU nets in which all hidden layers have width exactly $d_{in}+1.$ Our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb R^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. We obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$.
Tasks
Published	2017-10-31
URL	http://arxiv.org/abs/1710.11278v2
PDF	http://arxiv.org/pdf/1710.11278v2.pdf
PWC	https://paperswithcode.com/paper/approximating-continuous-functions-by-relu
Repo
Framework

Semantic Augmented Reality Environment with Material-Aware Physical Interactions


Title	Semantic Augmented Reality Environment with Material-Aware Physical Interactions
Authors	Long Chen, Karl Francis, Wen Tang
Abstract	In Augmented Reality (AR) environments, realistic interactions between the virtual and real objects play a crucial role in user experience. Recent advances in AR have largely focused on developing geometry-aware environments, but little has been done in dealing with interactions at the semantic level. High-level scene understanding and semantic descriptions in AR would allow effective design of complex applications and enhanced user experience. In this paper, we present a novel approach and a prototype system that enables a deeper understanding of semantic properties of the real-world environment, so that realistic physical interactions between the real and the virtual objects can be generated. A material-aware AR environment has been created based on deep material learning using a fully convolutional network (FCN). State-of-the-art dense Simultaneous Localisation and Mapping (SLAM) has been used for the semantic mapping. Together with efficient accelerated 3D ray casting, natural and realistic physical interactions are generated for interactive AR games. Our approach has significant impact on the future development of advanced AR systems and applications.
Tasks	Scene Understanding
Published	2017-08-03
URL	http://arxiv.org/abs/1708.01208v3
PDF	http://arxiv.org/pdf/1708.01208v3.pdf
PWC	https://paperswithcode.com/paper/semantic-augmented-reality-environment-with
Repo
Framework

SEP-Nets: Small and Effective Pattern Networks


Title	SEP-Nets: Small and Effective Pattern Networks
Authors	Zhe Li, Xiaoyu Wang, Xutao Lv, Tianbao Yang
Abstract	While going deeper has been witnessed to improve the performance of convolutional neural networks (CNN), going smaller for CNN has received increasing attention recently due to its attractiveness for mobile/embedded applications. How to design a small network while retaining the performance of large and deep CNNs (e.g., Inception Nets, ResNets) remains an active and important topic. Although there are already intensive studies on compressing the size of CNNs, the considerable drop in performance is still a key concern in many designs. This paper addresses this concern with several new contributions. First, we propose a simple yet powerful method for compressing the size of deep CNNs based on parameter binarization. The striking difference from most previous work on parameter binarization/quantization lies in the different treatment of $1\times 1$ convolutions and $k\times k$ convolutions ($k>1$), where we only binarize $k\times k$ convolutions into binary patterns. The resulting networks are referred to as pattern networks. By doing this, we show that previous deep CNNs such as GoogLeNet and Inception-type Nets can be compressed dramatically with a marginal drop in performance. Second, in light of the different functionalities of $1\times 1$ (data projection/transformation) and $k\times k$ convolutions (pattern extraction), we propose a new block structure codenamed the pattern residual block that adds transformed feature maps generated by $1\times 1$ convolutions to the pattern feature maps generated by $k\times k$ convolutions, based on which we design a small network with $\sim 1$ million parameters. Combined with our parameter binarization, we achieve better performance on ImageNet than similarly sized networks, including the recently released Google MobileNets.
Tasks	Quantization
Published	2017-06-13
URL	http://arxiv.org/abs/1706.03912v1
PDF	http://arxiv.org/pdf/1706.03912v1.pdf
PWC	https://paperswithcode.com/paper/sep-nets-small-and-effective-pattern-networks
Repo
Framework
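
The selective treatment of $1\times 1$ and $k\times k$ convolutions can be sketched as a rule applied per weight tensor. The sign-plus-scale binarization used here is a common scheme and an assumption; the abstract does not specify how the binary patterns are derived, and the function name is hypothetical.

```python
import numpy as np

def binarize_spatial_filters(weights):
    """Binarize k x k filters (k > 1) and leave 1 x 1 filters untouched.

    weights: convolution weights, shape (out_channels, in_channels, k_h, k_w).
    """
    k_h, k_w = weights.shape[2:]
    if k_h == 1 and k_w == 1:
        return weights  # 1 x 1 projections keep their real-valued weights
    # Assumed scheme: one real scale per output filter times a +1/-1 pattern.
    scale = np.abs(weights).mean(axis=(1, 2, 3), keepdims=True)
    signs = np.where(weights >= 0, 1.0, -1.0)
    return scale * signs

pointwise = np.arange(8.0).reshape(2, 4, 1, 1)
spatial = np.arange(-9.0, 9.0).reshape(2, 1, 3, 3)
pointwise_out = binarize_spatial_filters(pointwise)
spatial_out = binarize_spatial_filters(spatial)
```

After this step each $k\times k$ filter needs one real number plus one bit per weight, while the $1\times 1$ layers keep full precision for projection.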