Paper Group ANR 211
An Attention-Based Deep Net for Learning to Rank. Image Segmentation of Multi-Shaped Overlapping Objects. Topology Analysis of International Networks Based on Debates in the United Nations. Kernel Scaling for Manifold Learning and Classification. Batch Reinforcement Learning on the Industrial Benchmark: First Experiences. Long-Term Ensemble Learnin …
An Attention-Based Deep Net for Learning to Rank
Title | An Attention-Based Deep Net for Learning to Rank |
Authors | Baiyang Wang, Diego Klabjan |
Abstract | In information retrieval, learning to rank constructs a machine-based ranking model which, given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to this problem, and in this paper, we propose an attention-based deep neural network which better incorporates different embeddings of the queries and search results with an attention-based mechanism. This model also applies a decoder mechanism to learn the ranks of the search results in a listwise fashion. The embeddings are trained with convolutional neural networks or the word2vec model. We demonstrate the performance of this model with image retrieval and text querying data sets. |
Tasks | Image Retrieval, Information Retrieval, Learning-To-Rank |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.06106v3 |
http://arxiv.org/pdf/1702.06106v3.pdf | |
PWC | https://paperswithcode.com/paper/an-attention-based-deep-net-for-learning-to |
Repo | |
Framework | |
Image Segmentation of Multi-Shaped Overlapping Objects
Title | Image Segmentation of Multi-Shaped Overlapping Objects |
Authors | Kumar Abhinav, Jaideep Singh Chauhan, Debasis Sarkar |
Abstract | In this work, we propose a new segmentation algorithm for images containing convex objects present in multiple shapes with a high degree of overlap. The proposed algorithm is carried out in two steps. In the first step, we identify the visible contours, segment them using concave points, and group the segments belonging to the same object. The next step is to assign a shape identity to these grouped contour segments. For images containing objects in multiple shapes, we first identify the shape classes of the contours and then assign a shape entity to these classes. We provide comprehensive experiments with our algorithm on two crystal image datasets. One dataset comprises images containing objects in multiple shapes overlapping each other, and the other contains standard images with objects present in a single shape. We test our algorithm against two baselines, and it outperforms both. |
Tasks | Semantic Segmentation |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.02217v1 |
http://arxiv.org/pdf/1711.02217v1.pdf | |
PWC | https://paperswithcode.com/paper/image-segmentation-of-multi-shaped |
Repo | |
Framework | |
Topology Analysis of International Networks Based on Debates in the United Nations
Title | Topology Analysis of International Networks Based on Debates in the United Nations |
Authors | Stefano Gurciullo, Slava Mikhaylov |
Abstract | In complex, high-dimensional and unstructured data it is often difficult to extract meaningful patterns. This is especially the case when dealing with textual data. Recent studies in machine learning, information theory and network science have developed several novel instruments to extract the semantics of unstructured data, and harness it to build a network of relations. Such approaches serve as an efficient tool for dimensionality reduction and pattern detection. This paper applies semantic network science to extract ideological proximity in the international arena, by focusing on the data from General Debates in the UN General Assembly on topics of high salience to the international community. The UN General Debate corpus (UNGDC) covers all high-level debates in the UN General Assembly from 1970 to 2014, covering all UN member states. The research proceeds in three main steps. First, Latent Dirichlet Allocation (LDA) is used to extract the topics of the UN speeches, and therefore semantic information. Each country is then assigned a vector specifying the exposure to each of the topics identified. This intermediate output is then used to construct a network of countries based on information-theoretical metrics, where the links capture similar vectorial patterns in the topic distributions. The topology of the networks is then analyzed through network properties like density, path length and clustering. Finally, we identify specific topological features of our networks using the map equation framework to detect communities in our networks of countries. |
Tasks | Dimensionality Reduction |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09491v1 |
http://arxiv.org/pdf/1707.09491v1.pdf | |
PWC | https://paperswithcode.com/paper/topology-analysis-of-international-networks |
Repo | |
Framework | |
Kernel Scaling for Manifold Learning and Classification
Title | Kernel Scaling for Manifold Learning and Classification |
Authors | Ofir Lindenbaum, Moshe Salhov, Arie Yeredor, Amir Averbuch |
Abstract | Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. Setting the kernel’s scale parameter, also referred to as the kernel’s bandwidth, strongly affects the performance of the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek a scale which is best at capturing the manifold’s intrinsic dimension. For classification, we propose three methods for estimating the scale, which optimize the classification results in different senses. The proposed frameworks are simulated on artificial and on real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task. |
Tasks | Dimensionality Reduction |
Published | 2017-07-04 |
URL | https://arxiv.org/abs/1707.01093v2 |
https://arxiv.org/pdf/1707.01093v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-scaling-for-manifold-learning-and |
Repo | |
Framework | |
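As a rough illustration of why the scale parameter discussed in the Kernel Scaling abstract matters, the sketch below evaluates a Gaussian kernel at two bandwidths. The kernel form exp(-||x - y||^2 / (2 * scale^2)) and the names gaussian_kernel and scale are assumptions made for illustration; the abstract does not state which kernel or parameterization the authors use, and this is not their scale-estimation method.

```python
import math

def gaussian_kernel(x, y, scale):
    """Gaussian kernel value for two equal-length points (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * scale ** 2))

# Two points at Euclidean distance 5.
p, q = [0.0, 0.0], [3.0, 4.0]

# A narrow bandwidth treats the points as almost unrelated,
# a wide one treats them as close neighbours.
narrow = gaussian_kernel(p, q, scale=1.0)
wide = gaussian_kernel(p, q, scale=10.0)
print(narrow, wide)
```

The same pair of points can look nearly independent or nearly identical depending on the bandwidth, which is why a poorly chosen scale can hide the structure a downstream task relies on.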
Batch Reinforcement Learning on the Industrial Benchmark: First Experiences
Title | Batch Reinforcement Learning on the Industrial Benchmark: First Experiences |
Authors | Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing |
Abstract | The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions. |
Tasks | |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07262v2 |
http://arxiv.org/pdf/1705.07262v2.pdf | |
PWC | https://paperswithcode.com/paper/batch-reinforcement-learning-on-the |
Repo | |
Framework | |
Long-Term Ensemble Learning of Visual Place Classifiers
Title | Long-Term Ensemble Learning of Visual Place Classifiers |
Authors | Xiaoxiao Fei, Kanji Tanaka, Yichu Fang, Akitaka Takayama |
Abstract | This paper addresses the problem of cross-season visual place classification (VPC) from a novel perspective of long-term map learning. Our goal is to enable transfer learning efficiently from one season to the next, at a small constant cost, and without wasting the robot’s available long-term memory by memorizing very large amounts of training data. To realize a good tradeoff between generalization and specialization abilities, we employ an ensemble of deep convolutional neural network (DCN) classifiers and consider the task of scheduling (when and which classifiers to retrain), given a previous season’s DCN classifiers as the sole prior knowledge. We present a unified framework for retraining scheduling and discuss practical implementation strategies. Furthermore, we address the task of partitioning a robot’s workspace into places to define place classes in an unsupervised manner, rather than using uniform partitioning, so as to maximize VPC performance. Experiments using the publicly available NCLT dataset revealed that retraining scheduling of a DCN classifier ensemble is crucial and that performance is significantly increased by using planned scheduling. |
Tasks | Transfer Learning |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05470v1 |
http://arxiv.org/pdf/1709.05470v1.pdf | |
PWC | https://paperswithcode.com/paper/long-term-ensemble-learning-of-visual-place |
Repo | |
Framework | |
Early Stopping without a Validation Set
Title | Early Stopping without a Validation Set |
Authors | Maren Mahsereci, Lukas Balles, Christoph Lassner, Philipp Hennig |
Abstract | Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training and a smaller validation set to obtain an ongoing estimate of the generalization performance. We propose a novel early stopping criterion that is based on fast-to-compute local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression, as well as neural networks. |
Tasks | |
Published | 2017-03-28 |
URL | http://arxiv.org/abs/1703.09580v3 |
http://arxiv.org/pdf/1703.09580v3.pdf | |
PWC | https://paperswithcode.com/paper/early-stopping-without-a-validation-set |
Repo | |
Framework | |
Learning Graph Representations with Embedding Propagation
Title | Learning Graph Representations with Embedding Propagation |
Authors | Alberto Garcia-Duran, Mathias Niepert |
Abstract | We propose Embedding Propagation (EP), an unsupervised learning framework for graph-structured data. EP learns vector representations of graphs by passing two types of messages between neighboring nodes. Forward messages consist of label representations such as representations of words and other attributes associated with the nodes. Backward messages consist of gradients that result from aggregating the label representations and applying a reconstruction loss. Node representations are finally computed from the representation of their labels. With significantly fewer parameters and hyperparameters an instance of EP is competitive with and often outperforms state of the art unsupervised and semi-supervised learning methods on a range of benchmark data sets. |
Tasks | |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.03059v1 |
http://arxiv.org/pdf/1710.03059v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-graph-representations-with-embedding |
Repo | |
Framework | |
Binary-decomposed DCNN for accelerating computation and compressing model without retraining
Title | Binary-decomposed DCNN for accelerating computation and compressing model without retraining |
Authors | Ryuji Kamiya, Takayoshi Yamashita, Mitsuru Ambai, Ikuro Sato, Yuji Yamauchi, Hironobu Fujiyoshi |
Abstract | Recent trends show recognition accuracy increasing even more profoundly. The inference process of Deep Convolutional Neural Networks (DCNN) involves a large number of parameters, requires a large amount of computation, and can be very slow. The large number of parameters also requires large amounts of memory. This results in increasingly long computation times and large model sizes. To deploy DCNNs on mobile and other low-performance devices, model sizes must be compressed and computation must be accelerated. To that end, this paper proposes Binary-decomposed DCNN, which resolves these issues without the need for retraining. Our method replaces real-valued inner-product computations with binary inner-product computations in existing network models to accelerate computation of inference and decrease model size without the need for retraining. Binary computations can be done at high speed using logical operators such as XOR and AND, together with bit counting. In tests using AlexNet with the ImageNet classification task, speed increased by a factor of 1.79, models were compressed by approximately 80%, and increase in error rate was limited to 1.20%. With VGG-16, speed increased by a factor of 2.07, model sizes decreased by 81%, and error increased by only 2.16%. |
Tasks | |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04731v1 |
http://arxiv.org/pdf/1709.04731v1.pdf | |
PWC | https://paperswithcode.com/paper/binary-decomposed-dcnn-for-accelerating |
Repo | |
Framework | |
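The Binary-decomposed DCNN abstract notes that binary inner products can be computed with logical operators and bit counting. The sketch below shows only that general identity for two vectors with entries in {-1, +1}: if each vector is packed into an integer (bit set for +1), the dot product equals n minus twice the number of differing bits. The helper names pack_signs and binary_dot are hypothetical, and this is not the paper's decomposition of real-valued weights, which the abstract does not detail.

```python
def pack_signs(values):
    """Pack +1/-1 values into an int; bit i is set when values[i] == +1."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Matching positions contribute +1 and differing positions -1, so the
    result is n - 2 * popcount(a XOR b).
    """
    differing = bin(a_bits ^ b_bits).count("1")
    return n - 2 * differing

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
packed = binary_dot(pack_signs(a), pack_signs(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

On hardware, the XOR and the bit count each handle a whole machine word at once, which is where the speed-up over element-wise multiply-and-add comes from.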
A Survey on Optical Character Recognition System
Title | A Survey on Optical Character Recognition System |
Authors | Noman Islam, Zeeshan Islam, Nazia Noor |
Abstract | Optical Character Recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into its constituent characters. Despite decades of intense research, developing OCR with capabilities comparable to those of humans remains an open challenge. Because of this challenge, researchers from industry and academia have directed their attention towards Optical Character Recognition. Over the last few years, the number of academic laboratories and companies involved in research on Character Recognition has increased dramatically. This survey aims to summarize the research done so far in the field of OCR. It provides an overview of different aspects of OCR and discusses corresponding proposals aimed at resolving issues of OCR. |
Tasks | Optical Character Recognition |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.05703v1 |
http://arxiv.org/pdf/1710.05703v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-optical-character-recognition |
Repo | |
Framework | |
Cortical microcircuits as gated-recurrent neural networks
Title | Cortical microcircuits as gated-recurrent neural networks |
Authors | Rui Ponte Costa, Yannis M. Assael, Brendan Shillingford, Nando de Freitas, Tim P. Vogels |
Abstract | Cortical circuits exhibit intricate recurrent architectures that are remarkably similar across different brain areas. Such stereotyped structure suggests the existence of common computational principles. However, such principles have remained largely elusive. Inspired by gated-memory networks, namely long short-term memory networks (LSTMs), we introduce a recurrent neural network in which information is gated through inhibitory cells that are subtractive (subLSTM). We propose a natural mapping of subLSTMs onto known canonical excitatory-inhibitory cortical microcircuits. Our empirical evaluation across sequential image classification and language modelling tasks shows that subLSTM units can achieve similar performance to LSTM units. These results suggest that cortical circuits can be optimised to solve complex contextual problems and propose a novel view on their computational function. Overall our work provides a step towards unifying recurrent networks as used in machine learning with their biological counterparts. |
Tasks | Image Classification, Language Modelling, Sequential Image Classification |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02448v2 |
http://arxiv.org/pdf/1711.02448v2.pdf | |
PWC | https://paperswithcode.com/paper/cortical-microcircuits-as-gated-recurrent |
Repo | |
Framework | |
Deep Control - a simple automatic gain control for memory efficient and high performance training of deep convolutional neural networks
Title | Deep Control - a simple automatic gain control for memory efficient and high performance training of deep convolutional neural networks |
Authors | Brendan Ruff |
Abstract | Training a deep convolutional neural net typically starts with a random initialisation of all filters in all layers, which severely reduces the forward signal and back-propagated error and leads to slow and sub-optimal training. Techniques that counter this focus on adaptively increasing either the signal or the gradients, but the model behaves very differently at the beginning of training than later, when stable pathways through the net have been established. To compound this problem, the effective minibatch size varies greatly between layers at different depths and between individual filters, because activation sparsity typically increases with depth. This reduces the effective learning rate, since gradients may superpose rather than add, and further compounds the covariate shift problem, as deeper neurons are less able to adapt to upstream shift. Proposed here is a method of automatic gain control of the signal, built into each convolutional neuron, that achieves performance equivalent or superior to batch normalisation and is compatible with single-sample or minibatch gradient descent. The same model is used for both training and inference. The technique comprises a scaled per-sample map mean subtraction from the raw convolutional filter output, followed by scaling of the difference. |
Tasks | |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.03907v1 |
http://arxiv.org/pdf/1706.03907v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-control-a-simple-automatic-gain-control |
Repo | |
Framework | |
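The last sentence of the Deep Control abstract describes the operation as a scaled per-sample map mean subtraction followed by scaling of the difference. One plausible reading, sketched below for a single feature map of a single sample, is y = gain * (x - mean_scale * mean(x)). The function name gain_control and the parameters mean_scale and gain are hypothetical; the abstract does not say how these factors are set or learned, so this is an interpretation rather than the author's implementation.

```python
def gain_control(feature_map, mean_scale, gain):
    """Apply y = gain * (x - mean_scale * mean(x)) to one 2-D feature map.

    feature_map is a list of rows holding the raw filter outputs for one
    sample; the mean is taken over that map only, so no batch statistics
    are involved.
    """
    values = [v for row in feature_map for v in row]
    mean = sum(values) / len(values)
    return [[gain * (v - mean_scale * mean) for v in row] for row in feature_map]

raw = [[1.0, 3.0],
       [5.0, 7.0]]  # mean of this map is 4.0

centred = gain_control(raw, mean_scale=1.0, gain=2.0)
print(centred)
```

Because the statistic is computed per sample and per map, the same code path serves single-sample and minibatch use, which matches the abstract's remark that one model is used for training and inference.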
Approximating Continuous Functions by ReLU Nets of Minimal Width
Title | Approximating Continuous Functions by ReLU Nets of Minimal Width |
Authors | Boris Hanin, Mark Sellke |
Abstract | This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of $d_{in}$ variables arbitrarily well? It turns out that this minimal width is exactly equal to $d_{in}+1.$ That is, if all the hidden layer widths are bounded by $d_{in}$, then even in the infinite depth limit, ReLU nets can only express a very limited class of functions, and, on the other hand, any continuous function on the $d_{in}$-dimensional unit cube can be approximated to arbitrary precision by ReLU nets in which all hidden layers have width exactly $d_{in}+1.$ Our construction in fact shows that any continuous function $f:[0,1]^{d_{in}}\to\mathbb R^{d_{out}}$ can be approximated by a net of width $d_{in}+d_{out}$. We obtain quantitative depth estimates for such an approximation in terms of the modulus of continuity of $f$. |
Tasks | |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11278v2 |
http://arxiv.org/pdf/1710.11278v2.pdf | |
PWC | https://paperswithcode.com/paper/approximating-continuous-functions-by-relu |
Repo | |
Framework | |
Semantic Augmented Reality Environment with Material-Aware Physical Interactions
Title | Semantic Augmented Reality Environment with Material-Aware Physical Interactions |
Authors | Long Chen, Karl Francis, Wen Tang |
Abstract | In an Augmented Reality (AR) environment, realistic interactions between virtual and real objects play a crucial role in user experience. Recent advances in AR have largely focused on developing geometry-aware environments, but little has been done to deal with interactions at the semantic level. High-level scene understanding and semantic descriptions in AR would allow effective design of complex applications and enhanced user experience. In this paper, we present a novel approach and a prototype system that enables a deeper understanding of the semantic properties of the real-world environment, so that realistic physical interactions between real and virtual objects can be generated. A material-aware AR environment has been created based on deep material learning using a fully convolutional network (FCN). State-of-the-art dense Simultaneous Localisation and Mapping (SLAM) has been used for the semantic mapping. Together with efficient accelerated 3D ray casting, natural and realistic physical interactions are generated for interactive AR games. Our approach has significant impact on the future development of advanced AR systems and applications. |
Tasks | Scene Understanding |
Published | 2017-08-03 |
URL | http://arxiv.org/abs/1708.01208v3 |
http://arxiv.org/pdf/1708.01208v3.pdf | |
PWC | https://paperswithcode.com/paper/semantic-augmented-reality-environment-with |
Repo | |
Framework | |
SEP-Nets: Small and Effective Pattern Networks
Title | SEP-Nets: Small and Effective Pattern Networks |
Authors | Zhe Li, Xiaoyu Wang, Xutao Lv, Tianbao Yang |
Abstract | While going deeper has been shown to improve the performance of convolutional neural networks (CNN), going smaller has received increasing attention recently due to its attractiveness for mobile/embedded applications. How to design a small network while retaining the performance of large and deep CNNs (e.g., Inception Nets, ResNets) remains an active and important topic. Although there are already intensive studies on compressing the size of CNNs, the considerable drop in performance is still a key concern in many designs. This paper addresses this concern with several new contributions. First, we propose a simple yet powerful method for compressing the size of deep CNNs based on parameter binarization. The striking difference from most previous work on parameter binarization/quantization lies in the different treatment of $1\times 1$ convolutions and $k\times k$ convolutions ($k>1$), where we only binarize $k\times k$ convolutions into binary patterns. The resulting networks are referred to as pattern networks. By doing this, we show that previous deep CNNs such as GoogLeNet and Inception-type Nets can be compressed dramatically with a marginal drop in performance. Second, in light of the different functionalities of $1\times 1$ (data projection/transformation) and $k\times k$ convolutions (pattern extraction), we propose a new block structure codenamed the pattern residual block that adds transformed feature maps generated by $1\times 1$ convolutions to the pattern feature maps generated by $k\times k$ convolutions, based on which we design a small network with $\sim 1$ million parameters. Combined with our parameter binarization, we achieve better performance on ImageNet than similar-sized networks, including the recently released Google MobileNets. |
Tasks | Quantization |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.03912v1 |
http://arxiv.org/pdf/1706.03912v1.pdf | |
PWC | https://paperswithcode.com/paper/sep-nets-small-and-effective-pattern-networks |
Repo | |
Framework | |