Paper Group ANR 620
Metabolize Neural Network
Title | Metabolize Neural Network |
Authors | Dan Dai, Zhiwen Yu, Yang Hu, Wenming Cao, Mingnan Luo |
Abstract | Cell metabolism is the most basic and important part of human function. Neural networks in deep learning stem from neuronal activity, so the significance of a metabolize neural network (MetaNet) for model construction is self-evident. In this study, we explore neuronal metabolism for shallow networks from two aspects: proliferation and autophagy. First, we propose different neuron proliferation methods that construct a self-growing network during the metabolism cycle. Proliferating neurons alleviates the resource waste and insufficient model learning that arise when a network is initialized with too many or too few parameters. We then combine this with an autophagy mechanism during model self-construction to ablate under-expressed neurons. MetaNet can automatically determine the number of neurons during training and thereby reduce resource consumption. We verify the performance of the proposed methods on the MNIST, Fashion-MNIST and CIFAR-10 datasets. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00837v1 |
http://arxiv.org/pdf/1809.00837v1.pdf | |
PWC | https://paperswithcode.com/paper/metabolize-neural-network |
Repo | |
Framework | |
Dependent Gated Reading for Cloze-Style Question Answering
Title | Dependent Gated Reading for Cloze-Style Question Answering |
Authors | Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, Prasad Tadepalli |
Abstract | We present a novel deep learning architecture to address the cloze-style question answering task. Existing approaches employ reading mechanisms that do not fully exploit the interdependency between the document and the query. In this paper, we propose a novel dependent gated reading bidirectional GRU network (DGR) to efficiently model the relationship between the document and the query during encoding and decision making. Our evaluation shows that DGR obtains highly competitive performance on well-known machine comprehension benchmarks such as the Children’s Book Test (CBT-NE and CBT-CN) and Who Did What (WDW, Strict and Relaxed). Finally, we extensively analyze and validate our model by ablation and attention studies. |
Tasks | Decision Making, Question Answering, Reading Comprehension |
Published | 2018-05-26 |
URL | http://arxiv.org/abs/1805.10528v2 |
http://arxiv.org/pdf/1805.10528v2.pdf | |
PWC | https://paperswithcode.com/paper/dependent-gated-reading-for-cloze-style |
Repo | |
Framework | |
Table-to-Text: Describing Table Region with Natural Language
Title | Table-to-Text: Describing Table Region with Natural Language |
Authors | Junwei Bao, Duyu Tang, Nan Duan, Zhao Yan, Yuanhua Lv, Ming Zhou, Tiejun Zhao |
Abstract | In this paper, we present a generative model to generate a natural language sentence describing a table region, e.g., a row. The model maps a row from a table to a continuous vector and then generates a natural language sentence by leveraging the semantics of a table. To deal with rare words appearing in a table, we develop a flexible copying mechanism that selectively replicates contents from the table in the output sequence. Extensive experiments demonstrate the accuracy of the model and the power of the copying mechanism. On two synthetic datasets, WIKIBIO and SIMPLEQUESTIONS, our model improves the current state-of-the-art BLEU-4 score from 34.70 to 40.26 and from 33.32 to 39.12, respectively. Furthermore, we introduce an open-domain dataset WIKITABLETEXT including 13,318 explanatory sentences for 4,962 tables. Our model achieves a BLEU-4 score of 38.23, which outperforms template based and language model based approaches. |
Tasks | Language Modelling |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11234v1 |
http://arxiv.org/pdf/1805.11234v1.pdf | |
PWC | https://paperswithcode.com/paper/table-to-text-describing-table-region-with |
Repo | |
Framework | |
Probabilistic Class-Specific Discriminant Analysis
Title | Probabilistic Class-Specific Discriminant Analysis |
Authors | Alexandros Iosifidis |
Abstract | In this paper we formulate a probabilistic model for class-specific discriminant subspace learning. The proposed model can naturally incorporate the multi-modal structure of the negative class, which is neglected by existing class-specific methods. Moreover, it can be directly used to define a probabilistic classification rule in the discriminant subspace. We show that existing class-specific discriminant analysis methods are special cases of the proposed probabilistic model and, by casting them as probabilistic models, they can be extended to class-specific classifiers. We illustrate the performance of the proposed model in both verification and classification problems. |
Tasks | |
Published | 2018-12-14 |
URL | https://arxiv.org/abs/1812.05980v4 |
https://arxiv.org/pdf/1812.05980v4.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-class-specific-discriminant |
Repo | |
Framework | |
Object Relation Detection Based on One-shot Learning
Title | Object Relation Detection Based on One-shot Learning |
Authors | Li Zhou, Jian Zhao, Jianshu Li, Li Yuan, Jiashi Feng |
Abstract | Detecting the relations among objects, such as “cat on sofa” and “person ride horse”, is a crucial task in image understanding, and beneficial to bridging the semantic gap between images and natural language. Despite the remarkable progress of deep learning in detection and recognition of individual objects, it is still a challenging task to localize and recognize the relations between objects due to the complex combinatorial nature of various kinds of object relations. Inspired by the recent advances in one-shot learning, we propose a simple yet effective Semantics Induced Learner (SIL) model for solving this challenging task. Learning in a one-shot manner can enable a detection model to adapt to a huge number of object relations with diverse appearance effectively and robustly. In addition, the SIL combines bottom-up and top-down attention mechanisms, therefore enabling attention at the level of vision and semantics favorably. Within our proposed model, the bottom-up mechanism, which is based on Faster R-CNN, proposes object regions, and the top-down mechanism selects and integrates visual features according to semantic information. Experiments demonstrate the effectiveness of our framework over other state-of-the-art methods on two large-scale data sets for object relation detection. |
Tasks | One-Shot Learning |
Published | 2018-07-16 |
URL | http://arxiv.org/abs/1807.05857v1 |
http://arxiv.org/pdf/1807.05857v1.pdf | |
PWC | https://paperswithcode.com/paper/object-relation-detection-based-on-one-shot |
Repo | |
Framework | |
A survey on policy search algorithms for learning robot controllers in a handful of trials
Title | A survey on policy search algorithms for learning robot controllers in a handful of trials |
Authors | Konstantinos Chatzilygeroudis, Vassilis Vassiliades, Freek Stulp, Sylvain Calinon, Jean-Baptiste Mouret |
Abstract | Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word “big-data”, we refer to this challenge as “micro-data reinforcement learning”. We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and optimizing the computing time. |
Tasks | |
Published | 2018-07-06 |
URL | https://arxiv.org/abs/1807.02303v5 |
https://arxiv.org/pdf/1807.02303v5.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-policy-search-algorithms-for |
Repo | |
Framework | |
Metalearning with Hebbian Fast Weights
Title | Metalearning with Hebbian Fast Weights |
Authors | Tsendsuren Munkhdalai, Adam Trischler |
Abstract | We unify recent neural approaches to one-shot learning with older ideas of associative memory in a model for metalearning. Our model learns jointly to represent data and to bind class labels to representations in a single shot. It builds representations via slow weights, learned across tasks through SGD, while fast weights constructed by a Hebbian learning rule implement one-shot binding for each new task. On the Omniglot, Mini-ImageNet, and Penn Treebank one-shot learning benchmarks, our model achieves state-of-the-art results. |
Tasks | Omniglot, One-Shot Learning |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.05076v1 |
http://arxiv.org/pdf/1807.05076v1.pdf | |
PWC | https://paperswithcode.com/paper/metalearning-with-hebbian-fast-weights |
Repo | |
Framework | |
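The one-shot binding mentioned in the abstract above can be illustrated with a classical outer-product associative memory. This is a minimal sketch of a generic Hebbian rule, not the authors' model: the function names `hebbian_bind` and `recall` are hypothetical, and the keys are assumed to be orthonormal so that recall is exact.

```python
def hebbian_bind(keys, values):
    """Build a fast-weight matrix as the sum of outer products value x key.

    Assumption: each key stands in for a learned representation and each
    value for a label vector; both names are illustrative only.
    """
    rows, cols = len(values[0]), len(keys[0])
    weights = [[0.0] * cols for _ in range(rows)]
    for key, value in zip(keys, values):
        for r in range(rows):
            for c in range(cols):
                weights[r][c] += value[r] * key[c]
    return weights


def recall(weights, key):
    """Retrieve the value bound to `key` with one matrix-vector product."""
    return [sum(w * k for w, k in zip(row, key)) for row in weights]


# Two orthonormal keys, each bound to a one-hot label in a single pass.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
fast_weights = hebbian_bind(keys, values)
```

With orthonormal keys, `recall(fast_weights, keys[0])` returns the first label vector unchanged. With correlated keys the recalled vectors would mix, which is one reason a learned representation matters.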
Dynamic Spectrum Matching with One-shot Learning
Title | Dynamic Spectrum Matching with One-shot Learning |
Authors | Jinchao Liu, Stuart J. Gibson, James Mills, Margarita Osadchy |
Abstract | Convolutional neural networks (CNN) have been shown to provide a good solution for classification problems that utilize data obtained from vibrational spectroscopy. Moreover, CNNs are capable of identification from noisy spectra without the need for additional preprocessing. However, their application in practical spectroscopy is limited due to two shortcomings. The effectiveness of the classification using CNNs drops rapidly when only a small number of spectra per substance are available for training (which is a typical situation in real applications). Additionally, to accommodate new, previously unseen substance classes, the network must be retrained which is computationally intensive. Here we address these issues by reformulating a multi-class classification problem with a large number of classes, but a small number of samples per class, to a binary classification problem with sufficient data available for representation learning. Namely, we define the learning task as identifying pairs of inputs as belonging to the same or different classes. We achieve this using a Siamese convolutional neural network. A novel sampling strategy is proposed to address the imbalance problem in training the Siamese Network. The trained network can effectively classify samples of unseen substance classes using just a single reference sample (termed as one-shot learning in the machine learning community). Our results demonstrate better accuracy than other practical systems to date, while allowing effortless updates of the system’s database with novel substance classes. |
Tasks | One-Shot Learning, Representation Learning |
Published | 2018-06-23 |
URL | http://arxiv.org/abs/1806.09981v1 |
http://arxiv.org/pdf/1806.09981v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-spectrum-matching-with-one-shot |
Repo | |
Framework | |
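The reformulation described in the abstract above, from many-class classification to a same/different decision over pairs, can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: `balanced_pairs` uses plain undersampling rather than the authors' sampling strategy, and `similarity` is a hypothetical stand-in for a trained Siamese network.

```python
import itertools
import random


def balanced_pairs(labels, rng):
    """Turn class labels into (i, j, target) pairs: 1 = same class, 0 = different.

    Different-class pairs usually far outnumber same-class pairs, so both
    groups are cut to the size of the smaller one (simple undersampling,
    an assumption made for this sketch).
    """
    same, diff = [], []
    for i, j in itertools.combinations(range(len(labels)), 2):
        (same if labels[i] == labels[j] else diff).append((i, j))
    k = min(len(same), len(diff))
    pairs = [(i, j, 1) for i, j in rng.sample(same, k)]
    pairs += [(i, j, 0) for i, j in rng.sample(diff, k)]
    rng.shuffle(pairs)
    return pairs


def one_shot_classify(query, references, similarity):
    """Return the label whose single reference sample is most similar to the query."""
    return max(references, key=lambda label: similarity(query, references[label]))


labels = ["a", "a", "b", "b", "c"]
pairs = balanced_pairs(labels, random.Random(0))
```

Adding a new substance class in this setup means adding one entry to `references`; nothing is retrained.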
A Review of Literature on Parallel Constraint Solving
Title | A Review of Literature on Parallel Constraint Solving |
Authors | Ian P. Gent, Ciaran McCreesh, Ian Miguel, Neil C. A. Moore, Peter Nightingale, Patrick Prosser, Chris Unsworth |
Abstract | As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation. Under consideration in Theory and Practice of Logic Programming (TPLP). |
Tasks | |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.10981v1 |
http://arxiv.org/pdf/1803.10981v1.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-literature-on-parallel-constraint |
Repo | |
Framework | |
Securing Behavior-based Opinion Spam Detection
Title | Securing Behavior-based Opinion Spam Detection |
Authors | Shuaijun Ge, Guixiang Ma, Sihong Xie, Philip S. Yu |
Abstract | Review spam is prevalent in e-commerce, where it is used to maliciously manipulate product rankings and customers' decisions. While spam generated by simple spamming strategies can be detected effectively, hardened spammers can evade regular detectors via more advanced spamming strategies. Previous work gave more attention to evasion against text-based and graph-based detectors, but evasion against behavior-based detectors has been largely ignored, leading to vulnerabilities in spam detection systems. Since real evasion data are scarce, we first propose EMERAL (Evasion via Maximum Entropy and Rating sAmpLing) to generate spam that evades certain existing detectors. EMERAL can simulate spammers with different goals and levels of knowledge about the detectors, targeting different stages of the life cycle of target products. We show that in the evasion-defense dynamic, only a few evasion types are meaningful to the spammers, and no spammer can evade too many detection signals at the same time. We reveal that some evasions are quite insidious and can defeat all detection signals. We then propose DETER (Defense via Evasion generaTion using EmeRal), based on model re-training on diverse evasive samples generated by EMERAL. Experiments confirm that DETER is more accurate in detecting both suspicious time windows and individual spam reviews. In terms of security, DETER is versatile enough to be vaccinated against diverse and unexpected evasions, is agnostic about evasion strategy and can be released without privacy concerns. |
Tasks | |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03739v1 |
http://arxiv.org/pdf/1811.03739v1.pdf | |
PWC | https://paperswithcode.com/paper/securing-behavior-based-opinion-spam |
Repo | |
Framework | |
Zigzag Learning for Weakly Supervised Object Detection
Title | Zigzag Learning for Weakly Supervised Object Detection |
Authors | Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian |
Abstract | This paper addresses weakly supervised object detection with only image-level supervision at the training stage. Previous approaches train detection models with entire images all at once, making the models prone to being trapped in suboptimal solutions due to the introduced false positive examples. Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds. Towards this goal, we first develop a criterion named mean Energy Accumulation Scores (mEAS) to automatically measure and rank localization difficulty of an image containing the target object, and accordingly learn the detector progressively by feeding examples with increasing difficulty. In this way, the model can be well prepared by training on easy examples for learning from more difficult ones and thus gain a stronger detection ability more efficiently. Furthermore, we introduce a novel masking regularization strategy over the high level convolutional feature maps to avoid overfitting initial samples. These two modules formulate a zigzag learning process, where progressive learning endeavors to discover reliable object instances, and masking regularization increases the difficulty of finding object instances properly. We achieve 47.6% mAP on PASCAL VOC 2007, surpassing the state of the art by a large margin. |
Tasks | Object Detection, Weakly Supervised Object Detection |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09466v1 |
http://arxiv.org/pdf/1804.09466v1.pdf | |
PWC | https://paperswithcode.com/paper/zigzag-learning-for-weakly-supervised-object |
Repo | |
Framework | |
XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks
Title | XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks |
Authors | Andrawes Al Bahou, Geethan Karunaratne, Renzo Andri, Lukas Cavigelli, Luca Benini |
Abstract | Deploying state-of-the-art CNNs requires power-hungry processors and off-chip memory. This precludes the implementation of CNNs in low-power embedded systems. Recent research shows CNNs sustain extreme quantization, binarizing their weights and intermediate feature maps, thereby saving 8-32x memory and collapsing energy-intensive sum-of-products into XNOR-and-popcount operations. We present XNORBIN, an accelerator for binary CNNs with computation tightly coupled to memory for aggressive data reuse. Implemented in UMC 65nm technology, XNORBIN achieves an energy efficiency of 95 TOp/s/W and an area efficiency of 2.0 TOp/s/MGE at 0.8 V. |
Tasks | Quantization |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.05849v1 |
http://arxiv.org/pdf/1803.05849v1.pdf | |
PWC | https://paperswithcode.com/paper/xnorbin-a-95-topsw-hardware-accelerator-for |
Repo | |
Framework | |
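The collapse of a sum-of-products into XNOR-and-popcount operations, as stated in the abstract above, follows from a simple identity for vectors with entries in {+1, -1}: if +1 is encoded as bit 1 and -1 as bit 0, the dot product equals twice the number of matching bits minus the vector length. The sketch below shows this in software only; the helper names `pack_bits` and `binary_dot` are hypothetical and say nothing about how the XNORBIN hardware is organised.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # bit is 1 where the two signs agree
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n             # agreements minus disagreements


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
```

Here `result` agrees with the ordinary dot product `sum(x * y for x, y in zip(a, b))`, but is computed with one XOR, one NOT, one mask and one bit count.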
Inverse Rational Control: Inferring What You Think from How You Forage
Title | Inverse Rational Control: Inferring What You Think from How You Forage |
Authors | Zhengwei Wu, Paul Schrater, Xaq Pitkow |
Abstract | Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning. Inferring an agent’s internal model is a crucial ingredient in social interactions (theory of mind), for imitation learning, and for interpreting neural activities of behaving agents. Here we describe a generic method to model an agent’s behavior under an environment with uncertainty, and infer the agent’s internal model, reward function, and dynamic beliefs. We apply our method to a simulated agent performing a naturalistic foraging task. We assume the agent behaves rationally — that is, they take actions that optimize their subjective utility according to their understanding of the task and its relevant causal variables. We model this rational solution as a Partially Observable Markov Decision Process (POMDP) where the agent may make wrong assumptions about the task parameters. Given the agent’s sensory observations and actions, we learn its internal model and reward function by maximum likelihood estimation over a set of task-relevant parameters. The Markov property of the POMDP enables us to characterize the transition probabilities between internal belief states and iteratively estimate the agent’s policy using a constrained Expectation-Maximization (EM) algorithm. We validate our method on simulated agents performing suboptimally on a foraging task currently used in many neuroscience experiments, and successfully recover their internal model and reward function. Our work lays a critical foundation to discover how the brain represents and computes with dynamic beliefs. |
Tasks | Imitation Learning |
Published | 2018-05-24 |
URL | https://arxiv.org/abs/1805.09864v4 |
https://arxiv.org/pdf/1805.09864v4.pdf | |
PWC | https://paperswithcode.com/paper/inverse-pomdp-inferring-what-you-think-from |
Repo | |
Framework | |
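The dynamic beliefs mentioned in the abstract above are, in a standard discrete POMDP, updated by a Bayes filter: predict with the transition model, then reweight by the observation likelihood. The sketch below is that textbook update for a fixed action, not the paper's inference procedure; `belief_update` and its argument layout are assumptions made for illustration.

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step over discrete states.

    belief[s]         : prior probability of state s
    transition[s][s2] : P(next state s2 | state s) for the action taken
    obs_likelihood[s2]: P(observation received | next state s2)
    """
    n = len(belief)
    # Prediction: push the prior through the transition model.
    predicted = [sum(transition[s][s2] * belief[s] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight by how well each state explains the observation.
    unnormalised = [obs_likelihood[s2] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalised]


prior = [0.5, 0.5]
stay_put = [[1.0, 0.0], [0.0, 1.0]]   # identity transition, for illustration
posterior = belief_update(prior, stay_put, [0.9, 0.1])
```

An agent with wrong assumptions about the task corresponds, in this picture, to `transition` or `obs_likelihood` differing from the true environment.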
Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
Title | Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification |
Authors | Zhao Kang, Xiao Lu, Jinfeng Yi, Zenglin Xu |
Abstract | Multiple kernel learning (MKL) is generally believed to perform better than single-kernel methods. However, some empirical studies show that this is not always true: a combination of multiple kernels may yield even worse performance than a single kernel. There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noise and carelessly designed algorithms. In this paper, we propose a novel MKL framework by following two intuitive assumptions: (i) each kernel is a perturbation of the consensus kernel; and (ii) the kernel that is close to the consensus kernel should be assigned a large weight. Impressively, the proposed method can automatically assign an appropriate weight to each kernel without introducing additional parameters, as existing methods do. The proposed framework is integrated into a unified framework for graph-based clustering and semi-supervised classification. We have conducted experiments on multiple benchmark datasets and our empirical results verify the superiority of the proposed framework. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07697v1 |
http://arxiv.org/pdf/1806.07697v1.pdf | |
PWC | https://paperswithcode.com/paper/self-weighted-multiple-kernel-learning-for |
Repo | |
Framework | |
Opening the black box of neural nets: case studies in stop/top discrimination
Title | Opening the black box of neural nets: case studies in stop/top discrimination |
Authors | Thomas Roxlo, Matthew Reece |
Abstract | We introduce techniques for exploring the functionality of a neural network and extracting simple, human-readable approximations to its performance. By performing gradient ascent on the input space of the network, we are able to produce large populations of artificial events which strongly excite a given classifier. By studying the populations of these events, we then directly produce what are essentially contour maps of the network’s classification function. Combined with a suite of tools for identifying the input dimensions deemed most important by the network, we can utilize these maps to efficiently interpret the dominant criteria by which the network makes its classification. As a test case, we study networks trained to discriminate supersymmetric stop production in the dilepton channel from Standard Model backgrounds. In the case of a heavy stop decaying to a light neutralino, we find individual neurons with large mutual information with $m_{T2}^{\ell\ell}$, a human-designed variable for optimizing the analysis. The network selects events with significant missing $p_T$ oriented azimuthally away from both leptons, efficiently rejecting $t\overline{t}$ background. In the case of a light stop with three-body decays to $Wb{\widetilde \chi}$ and little phase space, we find neurons that smoothly interpolate between a similar top-rejection strategy and an ISR-tagging strategy allowing for more missing momentum. We also find that a neural network trained on a stealth stop parameter point learns novel angular correlations. |
Tasks | |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.09278v1 |
http://arxiv.org/pdf/1804.09278v1.pdf | |
PWC | https://paperswithcode.com/paper/opening-the-black-box-of-neural-nets-case |
Repo | |
Framework | |
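Gradient ascent on the input space, as described in the abstract above, keeps the network's weights fixed and moves the input in the direction that raises the classifier output. The sketch below uses a single logistic unit so that the gradient can be written by hand; the weights, step size and function names are hypothetical, and a real network would need automatic differentiation.

```python
import math


def classifier(x, w, b):
    """Logistic unit: sigmoid(w . x + b), standing in for a trained network."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def ascend_input(x, w, b, step=0.5, iters=50):
    """Move the input along d(output)/d(input) while the weights stay fixed."""
    x = list(x)
    for _ in range(iters):
        p = classifier(x, w, b)
        # For a logistic unit, d p / d x_i = p * (1 - p) * w_i.
        grad = [p * (1.0 - p) * wi for wi in w]
        x = [xi + step * gi for xi, gi in zip(x, grad)]
    return x


w, b = [1.0, -2.0], 0.0
start = [0.0, 0.0]
excited = ascend_input(start, w, b)
```

Starting from many random inputs instead of one fixed `start` yields a population of strongly exciting inputs, which is the kind of sample the abstract describes studying.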