October 18, 2019


Paper Group ANR 620

Metabolize Neural Network. Dependent Gated Reading for Cloze-Style Question Answering. Table-to-Text: Describing Table Region with Natural Language. Probabilistic Class-Specific Discriminant Analysis. Object Relation Detection Based on One-shot Learning. A survey on policy search algorithms for learning robot controllers in a handful of trials. Met …

Metabolize Neural Network


Title	Metabolize Neural Network
Authors	Dan Dai, Zhiwen Yu, Yang Hu, Wenming Cao, Mingnan Luo
Abstract	Cell metabolism is the most basic and important part of human function, and neural networks in deep learning stem from neuronal activity. The significance of a metabolizing neural network (MetaNet) for model construction is therefore self-evident. In this study, we explore neuronal metabolism for shallow networks from two aspects: proliferation and autophagy. First, we propose different neuron proliferation methods that construct a self-growing network over the metabolism cycle. Proliferating neurons alleviates the wasted resources and insufficient model learning that arise when a network is initialized with too many or too few parameters. We then combine proliferation with an autophagy mechanism during model self-construction to ablate under-expressed neurons. MetaNet can automatically determine the number of neurons during training and thereby reduce resource consumption. We verify the performance of the proposed methods on the MNIST, Fashion-MNIST, and CIFAR-10 datasets.
Tasks
Published	2018-09-04
URL	http://arxiv.org/abs/1809.00837v1
PDF	http://arxiv.org/pdf/1809.00837v1.pdf
PWC	https://paperswithcode.com/paper/metabolize-neural-network
Repo
Framework

Dependent Gated Reading for Cloze-Style Question Answering


Title	Dependent Gated Reading for Cloze-Style Question Answering
Authors	Reza Ghaeini, Xiaoli Z. Fern, Hamed Shahbazi, Prasad Tadepalli
Abstract	We present a novel deep learning architecture to address the cloze-style question answering task. Existing approaches employ reading mechanisms that do not fully exploit the interdependency between the document and the query. In this paper, we propose a novel dependent gated reading bidirectional GRU network (DGR) to efficiently model the relationship between the document and the query during encoding and decision making. Our evaluation shows that DGR obtains highly competitive performance on well-known machine comprehension benchmarks such as the Children’s Book Test (CBT-NE and CBT-CN) and Who Did What (WDW, Strict and Relaxed). Finally, we extensively analyze and validate our model by ablation and attention studies.
Tasks	Decision Making, Question Answering, Reading Comprehension
Published	2018-05-26
URL	http://arxiv.org/abs/1805.10528v2
PDF	http://arxiv.org/pdf/1805.10528v2.pdf
PWC	https://paperswithcode.com/paper/dependent-gated-reading-for-cloze-style
Repo
Framework

Table-to-Text: Describing Table Region with Natural Language


Title	Table-to-Text: Describing Table Region with Natural Language
Authors	Junwei Bao, Duyu Tang, Nan Duan, Zhao Yan, Yuanhua Lv, Ming Zhou, Tiejun Zhao
Abstract	In this paper, we present a generative model to generate a natural language sentence describing a table region, e.g., a row. The model maps a row from a table to a continuous vector and then generates a natural language sentence by leveraging the semantics of a table. To deal with rare words appearing in a table, we develop a flexible copying mechanism that selectively replicates contents from the table in the output sequence. Extensive experiments demonstrate the accuracy of the model and the power of the copying mechanism. On two synthetic datasets, WIKIBIO and SIMPLEQUESTIONS, our model improves the current state-of-the-art BLEU-4 score from 34.70 to 40.26 and from 33.32 to 39.12, respectively. Furthermore, we introduce an open-domain dataset WIKITABLETEXT including 13,318 explanatory sentences for 4,962 tables. Our model achieves a BLEU-4 score of 38.23, which outperforms template based and language model based approaches.
Tasks	Language Modelling
Published	2018-05-29
URL	http://arxiv.org/abs/1805.11234v1
PDF	http://arxiv.org/pdf/1805.11234v1.pdf
PWC	https://paperswithcode.com/paper/table-to-text-describing-table-region-with
Repo
Framework
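
The copying mechanism described in the abstract above can be illustrated with a small sketch. The abstract does not give the exact formulation, so this assumes a common design in which a scalar gate mixes a vocabulary distribution with an attention distribution over table cells. The function name, the gate value, and the example words are all hypothetical.

```python
def mix_generate_and_copy(p_generate, copy_attention, table_cells, gate):
    """Blend a vocabulary distribution with a copy distribution over table cells.

    gate is the (assumed) probability of generating from the vocabulary;
    1 - gate is the probability of copying a cell from the table.
    """
    final = {word: gate * p for word, p in p_generate.items()}
    for cell, attention in zip(table_cells, copy_attention):
        final[cell] = final.get(cell, 0.0) + (1.0 - gate) * attention
    return final


# Hypothetical decoder step: the rare word "Reykjavik" is not in the vocabulary,
# but it can still be produced by copying it from the table row.
p_generate = {"was": 0.5, "born": 0.3, "<unk>": 0.2}
table_cells = ["Reykjavik", "1987"]
copy_attention = [0.9, 0.1]

final = mix_generate_and_copy(p_generate, copy_attention, table_cells, gate=0.4)
best_word = max(final, key=final.get)
```

Because both input distributions sum to one, the mixture does too, and a rare table entry can outrank the out-of-vocabulary token.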

Probabilistic Class-Specific Discriminant Analysis


Title	Probabilistic Class-Specific Discriminant Analysis
Authors	Alexandros Iosifidis
Abstract	In this paper we formulate a probabilistic model for class-specific discriminant subspace learning. The proposed model can naturally incorporate the multi-modal structure of the negative class, which is neglected by existing class-specific methods. Moreover, it can be directly used to define a probabilistic classification rule in the discriminant subspace. We show that existing class-specific discriminant analysis methods are special cases of the proposed probabilistic model and, by casting them as probabilistic models, they can be extended to class-specific classifiers. We illustrate the performance of the proposed model in both verification and classification problems.
Tasks
Published	2018-12-14
URL	https://arxiv.org/abs/1812.05980v4
PDF	https://arxiv.org/pdf/1812.05980v4.pdf
PWC	https://paperswithcode.com/paper/probabilistic-class-specific-discriminant
Repo
Framework

Object Relation Detection Based on One-shot Learning


Title	Object Relation Detection Based on One-shot Learning
Authors	Li Zhou, Jian Zhao, Jianshu Li, Li Yuan, Jiashi Feng
Abstract	Detecting the relations among objects, such as “cat on sofa” and “person ride horse”, is a crucial task in image understanding, and beneficial to bridging the semantic gap between images and natural language. Despite the remarkable progress of deep learning in detection and recognition of individual objects, it is still a challenging task to localize and recognize the relations between objects due to the complex combinatorial nature of various kinds of object relations. Inspired by the recent advances in one-shot learning, we propose a simple yet effective Semantics Induced Learner (SIL) model for solving this challenging task. Learning in a one-shot manner can enable a detection model to adapt to a huge number of object relations with diverse appearance effectively and robustly. In addition, the SIL combines bottom-up and top-down attention mechanisms, therefore enabling attention at the level of both vision and semantics. Within our proposed model, the bottom-up mechanism, which is based on Faster R-CNN, proposes object regions, and the top-down mechanism selects and integrates visual features according to semantic information. Experiments demonstrate the effectiveness of our framework over other state-of-the-art methods on two large-scale data sets for object relation detection.
Tasks	One-Shot Learning
Published	2018-07-16
URL	http://arxiv.org/abs/1807.05857v1
PDF	http://arxiv.org/pdf/1807.05857v1.pdf
PWC	https://paperswithcode.com/paper/object-relation-detection-based-on-one-shot
Repo
Framework

A survey on policy search algorithms for learning robot controllers in a handful of trials


Title	A survey on policy search algorithms for learning robot controllers in a handful of trials
Authors	Konstantinos Chatzilygeroudis, Vassilis Vassiliades, Freek Stulp, Sylvain Calinon, Jean-Baptiste Mouret
Abstract	Most policy search algorithms require thousands of training episodes to find an effective policy, which is often infeasible with a physical robot. This survey article focuses on the extreme other end of the spectrum: how can a robot adapt with only a handful of trials (a dozen) and a few minutes? By analogy with the word “big-data”, we refer to this challenge as “micro-data reinforcement learning”. We show that a first strategy is to leverage prior knowledge on the policy structure (e.g., dynamic movement primitives), on the policy parameters (e.g., demonstrations), or on the dynamics (e.g., simulators). A second strategy is to create data-driven surrogate models of the expected reward (e.g., Bayesian optimization) or the dynamical model (e.g., model-based policy search), so that the policy optimizer queries the model instead of the real system. Overall, all successful micro-data algorithms combine these two strategies by varying the kind of model and prior knowledge. The current scientific challenges essentially revolve around scaling up to complex robots (e.g., humanoids), designing generic priors, and optimizing the computing time.
Tasks
Published	2018-07-06
URL	https://arxiv.org/abs/1807.02303v5
PDF	https://arxiv.org/pdf/1807.02303v5.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-policy-search-algorithms-for
Repo
Framework

Metalearning with Hebbian Fast Weights


Title	Metalearning with Hebbian Fast Weights
Authors	Tsendsuren Munkhdalai, Adam Trischler
Abstract	We unify recent neural approaches to one-shot learning with older ideas of associative memory in a model for metalearning. Our model learns jointly to represent data and to bind class labels to representations in a single shot. It builds representations via slow weights, learned across tasks through SGD, while fast weights constructed by a Hebbian learning rule implement one-shot binding for each new task. On the Omniglot, Mini-ImageNet, and Penn Treebank one-shot learning benchmarks, our model achieves state-of-the-art results.
Tasks	Omniglot, One-Shot Learning
Published	2018-07-12
URL	http://arxiv.org/abs/1807.05076v1
PDF	http://arxiv.org/pdf/1807.05076v1.pdf
PWC	https://paperswithcode.com/paper/metalearning-with-hebbian-fast-weights
Repo
Framework
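
The one-shot binding of labels to representations mentioned in the abstract above can be sketched with the classic Hebbian outer-product rule. This is an illustration of the general idea only; the abstract does not state the paper's exact update, and the function names and toy vectors are hypothetical.

```python
def hebbian_bind(keys, values):
    """Build a fast-weight matrix W = sum_i outer(value_i, key_i) in a single pass."""
    rows, cols = len(values[0]), len(keys[0])
    weights = [[0.0] * cols for _ in range(rows)]
    for key, value in zip(keys, values):
        for r in range(rows):
            for c in range(cols):
                weights[r][c] += value[r] * key[c]
    return weights


def recall(weights, key):
    """Read out the value bound to a key via a matrix-vector product."""
    return [sum(w * x for w, x in zip(row, key)) for row in weights]


# Toy representations (orthonormal, so recall is exact) and one-hot class labels.
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]

fast_weights = hebbian_bind(keys, labels)
recalled = recall(fast_weights, keys[1])
```

With orthonormal keys the stored label is recovered exactly; with correlated keys the readout would be a blend of the stored labels.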

Dynamic Spectrum Matching with One-shot Learning


Title	Dynamic Spectrum Matching with One-shot Learning
Authors	Jinchao Liu, Stuart J. Gibson, James Mills, Margarita Osadchy
Abstract	Convolutional neural networks (CNN) have been shown to provide a good solution for classification problems that utilize data obtained from vibrational spectroscopy. Moreover, CNNs are capable of identification from noisy spectra without the need for additional preprocessing. However, their application in practical spectroscopy is limited due to two shortcomings. The effectiveness of the classification using CNNs drops rapidly when only a small number of spectra per substance are available for training (which is a typical situation in real applications). Additionally, to accommodate new, previously unseen substance classes, the network must be retrained which is computationally intensive. Here we address these issues by reformulating a multi-class classification problem with a large number of classes, but a small number of samples per class, to a binary classification problem with sufficient data available for representation learning. Namely, we define the learning task as identifying pairs of inputs as belonging to the same or different classes. We achieve this using a Siamese convolutional neural network. A novel sampling strategy is proposed to address the imbalance problem in training the Siamese Network. The trained network can effectively classify samples of unseen substance classes using just a single reference sample (termed as one-shot learning in the machine learning community). Our results demonstrate better accuracy than other practical systems to date, while allowing effortless updates of the system’s database with novel substance classes.
Tasks	One-Shot Learning, Representation Learning
Published	2018-06-23
URL	http://arxiv.org/abs/1806.09981v1
PDF	http://arxiv.org/pdf/1806.09981v1.pdf
PWC	https://paperswithcode.com/paper/dynamic-spectrum-matching-with-one-shot
Repo
Framework
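
The reformulation described in the abstract above, from many-class classification to a same-or-different decision on pairs, can be sketched as follows. The balanced subsampling shown here is a simple stand-in and is not the paper's sampling strategy, which the abstract does not specify. All names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def make_pairs(samples, rng):
    """Turn (sample_id, class_label) items into balanced same/different pairs.

    Returns a list of ((id_a, id_b), target) with target 1 for same class, 0 otherwise.
    """
    positives, negatives = [], []
    for (id_a, label_a), (id_b, label_b) in combinations(samples, 2):
        target = int(label_a == label_b)
        (positives if target else negatives).append(((id_a, id_b), target))
    # Different-class pairs vastly outnumber same-class pairs, so subsample
    # both groups to the same size (an assumed, naive way to handle the imbalance).
    k = min(len(positives), len(negatives))
    pairs = rng.sample(positives, k) + rng.sample(negatives, k)
    rng.shuffle(pairs)
    return pairs


# Six hypothetical spectra, two per substance class.
samples = [(f"s{i}", label) for i, label in enumerate(["A", "A", "B", "B", "C", "C"])]
pairs = make_pairs(samples, random.Random(0))
```

Six samples give 15 possible pairs, of which only 3 share a class, which shows why some rebalancing is needed before training a pairwise model.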

A Review of Literature on Parallel Constraint Solving


Title	A Review of Literature on Parallel Constraint Solving
Authors	Ian P. Gent, Ciaran McCreesh, Ian Miguel, Neil C. A. Moore, Peter Nightingale, Patrick Prosser, Chris Unsworth
Abstract	As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation. Under consideration in Theory and Practice of Logic Programming (TPLP).
Tasks
Published	2018-03-29
URL	http://arxiv.org/abs/1803.10981v1
PDF	http://arxiv.org/pdf/1803.10981v1.pdf
PWC	https://paperswithcode.com/paper/a-review-of-literature-on-parallel-constraint
Repo
Framework

Securing Behavior-based Opinion Spam Detection


Title	Securing Behavior-based Opinion Spam Detection
Authors	Shuaijun Ge, Guixiang Ma, Sihong Xie, Philip S. Yu
Abstract	Review spam is prevalent in e-commerce, where it is used to maliciously manipulate product rankings and customers' decisions. While spam generated by simple spamming strategies can be detected effectively, hardened spammers can evade regular detectors via more advanced spamming strategies. Previous work gave more attention to evasion against text and graph-based detectors, but evasions against behavior-based detectors are largely ignored, leading to vulnerabilities in spam detection systems. Since real evasion data are scarce, we first propose EMERAL (Evasion via Maximum Entropy and Rating sAmpLing) to generate spam that evades certain existing detectors. EMERAL can simulate spammers with different goals and levels of knowledge about the detectors, targeting different stages of the life cycle of target products. We show that in the evasion-defense dynamic, only a few evasion types are meaningful to the spammers, and no spammer will be able to evade too many detection signals at the same time. We reveal that some evasions are quite insidious and can defeat all detection signals. We then propose DETER (Defense via Evasion generaTion using EmeRal), based on model re-training on diverse evasive samples generated by EMERAL. Experiments confirm that DETER is more accurate in detecting both suspicious time windows and individual spamming reviews. In terms of security, DETER is versatile enough to be vaccinated against diverse and unexpected evasions, is agnostic about evasion strategy and can be released without privacy concern.
Tasks
Published	2018-11-09
URL	http://arxiv.org/abs/1811.03739v1
PDF	http://arxiv.org/pdf/1811.03739v1.pdf
PWC	https://paperswithcode.com/paper/securing-behavior-based-opinion-spam
Repo
Framework

Zigzag Learning for Weakly Supervised Object Detection


Title	Zigzag Learning for Weakly Supervised Object Detection
Authors	Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian
Abstract	This paper addresses weakly supervised object detection with only image-level supervision at the training stage. Previous approaches train detection models with entire images all at once, making the models prone to being trapped in sub-optimums due to the introduced false positive examples. Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds. Towards this goal, we first develop a criterion named mean Energy Accumulation Scores (mEAS) to automatically measure and rank localization difficulty of an image containing the target object, and accordingly learn the detector progressively by feeding examples with increasing difficulty. In this way, the model can be well prepared by training on easy examples for learning from more difficult ones and thus gain a stronger detection ability more efficiently. Furthermore, we introduce a novel masking regularization strategy over the high level convolutional feature maps to avoid overfitting initial samples. These two modules formulate a zigzag learning process, where progressive learning endeavors to discover reliable object instances, and masking regularization increases the difficulty of finding object instances properly. We achieve 47.6% mAP on PASCAL VOC 2007, surpassing the state of the art by a large margin.
Tasks	Object Detection, Weakly Supervised Object Detection
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09466v1
PDF	http://arxiv.org/pdf/1804.09466v1.pdf
PWC	https://paperswithcode.com/paper/zigzag-learning-for-weakly-supervised-object
Repo
Framework

XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks


Title	XNORBIN: A 95 TOp/s/W Hardware Accelerator for Binary Convolutional Neural Networks
Authors	Andrawes Al Bahou, Geethan Karunaratne, Renzo Andri, Lukas Cavigelli, Luca Benini
Abstract	Deploying state-of-the-art CNNs requires power-hungry processors and off-chip memory. This precludes the implementation of CNNs in low-power embedded systems. Recent research shows CNNs sustain extreme quantization, binarizing their weights and intermediate feature maps, thereby saving 8-32× memory and collapsing energy-intensive sum-of-products into XNOR-and-popcount operations. We present XNORBIN, an accelerator for binary CNNs with computation tightly coupled to memory for aggressive data reuse. Implemented in UMC 65nm technology, XNORBIN achieves an energy efficiency of 95 TOp/s/W and an area efficiency of 2.0 TOp/s/MGE at 0.8 V.
Tasks	Quantization
Published	2018-03-05
URL	http://arxiv.org/abs/1803.05849v1
PDF	http://arxiv.org/pdf/1803.05849v1.pdf
PWC	https://paperswithcode.com/paper/xnorbin-a-95-topsw-hardware-accelerator-for
Repo
Framework
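
The abstract above mentions collapsing sum-of-products into XNOR-and-popcount operations. A minimal software sketch of that identity for vectors with entries in {-1, +1} follows. It illustrates the arithmetic only, not the XNORBIN hardware, and the helper names are hypothetical.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, value in enumerate(values):
        if value == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors given their packed bits."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 wherever the signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))  # ordinary sum of products
result = xnor_popcount_dot(pack_bits(a), pack_bits(b), len(a))
```

Each matching sign contributes +1 and each mismatch contributes -1, so the dot product equals twice the number of matches minus the vector length.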

Inverse Rational Control: Inferring What You Think from How You Forage


Title	Inverse Rational Control: Inferring What You Think from How You Forage
Authors	Zhengwei Wu, Paul Schrater, Xaq Pitkow
Abstract	Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning. Inferring an agent’s internal model is a crucial ingredient in social interactions (theory of mind), for imitation learning, and for interpreting neural activities of behaving agents. Here we describe a generic method to model an agent’s behavior under an environment with uncertainty, and infer the agent’s internal model, reward function, and dynamic beliefs. We apply our method to a simulated agent performing a naturalistic foraging task. We assume the agent behaves rationally — that is, they take actions that optimize their subjective utility according to their understanding of the task and its relevant causal variables. We model this rational solution as a Partially Observable Markov Decision Process (POMDP) where the agent may make wrong assumptions about the task parameters. Given the agent’s sensory observations and actions, we learn its internal model and reward function by maximum likelihood estimation over a set of task-relevant parameters. The Markov property of the POMDP enables us to characterize the transition probabilities between internal belief states and iteratively estimate the agent’s policy using a constrained Expectation-Maximization (EM) algorithm. We validate our method on simulated agents performing suboptimally on a foraging task currently used in many neuroscience experiments, and successfully recover their internal model and reward function. Our work lays a critical foundation to discover how the brain represents and computes with dynamic beliefs.
Tasks	Imitation Learning
Published	2018-05-24
URL	https://arxiv.org/abs/1805.09864v4
PDF	https://arxiv.org/pdf/1805.09864v4.pdf
PWC	https://paperswithcode.com/paper/inverse-pomdp-inferring-what-you-think-from
Repo
Framework
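
The dynamic beliefs in a POMDP, as referenced in the abstract above, are typically maintained with a Bayes filter: predict with the transition model, then reweight by the observation likelihood. The sketch below shows one such update for a two-state world. It omits actions for brevity, and the state meanings and probabilities are hypothetical, not taken from the paper.

```python
def belief_update(belief, transition, likelihood):
    """One Bayes-filter step over discrete hidden states.

    belief[s]        : current probability of state s
    transition[s][t] : probability of moving from state s to state t
    likelihood[t]    : probability of the received observation given state t
    """
    n = len(belief)
    predicted = [sum(belief[s] * transition[s][t] for s in range(n)) for t in range(n)]
    unnormalized = [predicted[t] * likelihood[t] for t in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical foraging box: state 0 = no food available, state 1 = food available.
transition = [[0.9, 0.1], [0.2, 0.8]]
likelihood_of_cue = [0.2, 0.7]  # the observed cue is more likely when food is available

belief = belief_update([0.5, 0.5], transition, likelihood_of_cue)
```

An agent with wrong assumptions about the task would run the same update with mis-specified `transition` or `likelihood` values, which is the kind of internal model the abstract proposes to infer.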

Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification


Title	Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
Authors	Zhao Kang, Xiao Lu, Jinfeng Yi, Zenglin Xu
Abstract	Multiple kernel learning (MKL) methods are generally believed to perform better than single kernel methods. However, some empirical studies show that this is not always true: the combination of multiple kernels may yield even worse performance than using a single kernel. There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noise and carelessly designed algorithms. In this paper, we propose a novel MKL framework by following two intuitive assumptions: (i) each kernel is a perturbation of the consensus kernel; and (ii) the kernel that is close to the consensus kernel should be assigned a large weight. Impressively, the proposed method can automatically assign an appropriate weight to each kernel without introducing the additional parameters that existing methods require. The proposed framework is integrated into a unified framework for graph-based clustering and semi-supervised classification. We have conducted experiments on multiple benchmark datasets and our empirical results verify the superiority of the proposed framework.
Tasks
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07697v1
PDF	http://arxiv.org/pdf/1806.07697v1.pdf
PWC	https://paperswithcode.com/paper/self-weighted-multiple-kernel-learning-for
Repo
Framework
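
The two assumptions in the abstract above (each kernel perturbs a consensus kernel, and kernels near the consensus get large weights) can be illustrated with a simple alternating scheme. This is a sketch under stated assumptions, not the paper's optimization: the inverse-distance weight rule, the weighted-average consensus, the `eps` guard, and the toy kernels are all hypothetical choices made for illustration.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two square matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def weighted_average(kernels, weights):
    """Entry-wise weighted mean of the kernel matrices."""
    total = sum(weights)
    n = len(kernels[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, kernels)) / total
             for j in range(n)] for i in range(n)]


def self_weight(kernels, iterations=5, eps=1e-8):
    """Alternate between estimating a consensus kernel and reweighting each kernel."""
    weights = [1.0] * len(kernels)
    for _ in range(iterations):
        consensus = weighted_average(kernels, weights)
        # Assumed rule: weight is inversely proportional to distance from the consensus.
        weights = [1.0 / (2.0 * frobenius_distance(k, consensus) + eps) for k in kernels]
    return weights, consensus


# Two kernels that agree with each other and one outlier.
kernels = [
    [[1.0, 0.80], [0.80, 1.0]],
    [[1.0, 0.82], [0.82, 1.0]],
    [[1.0, -0.50], [-0.50, 1.0]],
]
weights, consensus = self_weight(kernels)
```

No weighting hyperparameter is tuned here: the weights follow from the distances alone, so the outlier kernel is down-weighted automatically.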

Opening the black box of neural nets: case studies in stop/top discrimination


Title	Opening the black box of neural nets: case studies in stop/top discrimination
Authors	Thomas Roxlo, Matthew Reece
Abstract	We introduce techniques for exploring the functionality of a neural network and extracting simple, human-readable approximations to its performance. By performing gradient ascent on the input space of the network, we are able to produce large populations of artificial events which strongly excite a given classifier. By studying the populations of these events, we then directly produce what are essentially contour maps of the network’s classification function. Combined with a suite of tools for identifying the input dimensions deemed most important by the network, we can utilize these maps to efficiently interpret the dominant criteria by which the network makes its classification. As a test case, we study networks trained to discriminate supersymmetric stop production in the dilepton channel from Standard Model backgrounds. In the case of a heavy stop decaying to a light neutralino, we find individual neurons with large mutual information with $m_{T2}^{\ell\ell}$, a human-designed variable for optimizing the analysis. The network selects events with significant missing $p_T$ oriented azimuthally away from both leptons, efficiently rejecting $t\overline{t}$ background. In the case of a light stop with three-body decays to $Wb{\widetilde \chi}$ and little phase space, we find neurons that smoothly interpolate between a similar top-rejection strategy and an ISR-tagging strategy allowing for more missing momentum. We also find that a neural network trained on a stealth stop parameter point learns novel angular correlations.
Tasks
Published	2018-04-24
URL	http://arxiv.org/abs/1804.09278v1
PDF	http://arxiv.org/pdf/1804.09278v1.pdf
PWC	https://paperswithcode.com/paper/opening-the-black-box-of-neural-nets-case
Repo
Framework
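
The gradient ascent on the input space described in the abstract above can be sketched on a toy classifier. This sketch uses a fixed logistic unit with hand-picked weights as a stand-in for a trained network; the weights, step size, and function names are hypothetical.

```python
import math

# Hypothetical stand-in for a trained classifier: a single logistic unit.
WEIGHTS = [2.0, -1.0]
BIAS = -0.5


def score(x):
    """Classifier output in (0, 1) for input x."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x):
    """Gradient of the score with respect to the input, not the weights."""
    s = score(x)
    return [s * (1.0 - s) * w for w in WEIGHTS]


def ascend(x, steps=50, learning_rate=0.5):
    """Move the input uphill on the classifier score while the weights stay fixed."""
    x = list(x)
    for _ in range(steps):
        gradient = input_gradient(x)
        x = [xi + learning_rate * gi for xi, gi in zip(x, gradient)]
    return x


start = [0.0, 0.0]
excited = ascend(start)
```

Repeating this from many random starting points would give a population of artificial inputs that strongly excite the classifier, which can then be studied as the abstract describes.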