January 30, 2020

Paper Group ANR 419

Is artificial data useful for biomedical Natural Language Processing algorithms?


Title	Is artificial data useful for biomedical Natural Language Processing algorithms?
Authors	Zixu Wang, Julia Ive, Sumithra Velupillai, Lucia Specia
Abstract	A major obstacle to the development of Natural Language Processing (NLP) methods in the biomedical domain is data accessibility. This problem can be addressed by generating medical data artificially. Most previous studies have focused on the generation of short clinical text, and evaluation of the data utility has been limited. We propose a generic methodology to guide the generation of clinical text with key phrases. We use the artificial data as additional training data in two key biomedical NLP tasks: text classification and temporal relation extraction. We show that artificially generated training data used in conjunction with real training data can lead to performance boosts for data-greedy neural network algorithms. We also demonstrate the usefulness of the generated data for NLP setups where it fully replaces real training data.
Tasks	Relation Extraction, Text Classification
Published	2019-07-01
URL	https://arxiv.org/abs/1907.01055v2
PDF	https://arxiv.org/pdf/1907.01055v2.pdf
PWC	https://paperswithcode.com/paper/is-artificial-data-useful-for-biomedical
Repo
Framework

Towards Integration of Statistical Hypothesis Tests into Deep Neural Networks


Title	Towards Integration of Statistical Hypothesis Tests into Deep Neural Networks
Authors	Ahmad Aghaebrahimian, Mark Cieliebak
Abstract	We report our ongoing work on a new deep architecture that works in tandem with a statistical test procedure to jointly train on texts and their label descriptions for multi-label and multi-class classification tasks. A statistical hypothesis testing method is used to extract the most informative words for each given class. These words are used as a class description for more label-aware text classification. The intuition is to help the model concentrate on more informative words rather than more frequent ones. The model leverages label descriptions in addition to the input text to enhance text classification performance. Our method is entirely data-driven, depends on no source of information other than the training data, and is adaptable to different classification problems by providing appropriate training data without major hyper-parameter tuning. We trained and tested our system on several publicly available datasets, where we improved the state of the art on one set by a wide margin and obtained competitive results on all the others.
Tasks	Text Classification
Published	2019-06-15
URL	https://arxiv.org/abs/1906.06550v1
PDF	https://arxiv.org/pdf/1906.06550v1.pdf
PWC	https://paperswithcode.com/paper/towards-integration-of-statistical-hypothesis
Repo
Framework
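
The abstract above does not say which hypothesis test is used to pick informative words. As a minimal sketch, assuming a chi-square test of independence on a 2x2 word/class contingency table (a common choice, not confirmed by the source), the function below scores words for one class. The name `chi_square_scores` and the toy documents are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score words by chi-square association with the `target` class.

    Assumption: a 2x2 chi-square statistic over document counts; the
    paper's actual test is not named in the abstract.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos
    in_pos, in_neg = Counter(), Counter()
    for text, y in zip(docs, labels):
        words = set(text.lower().split())  # document frequency, not term frequency
        (in_pos if y == target else in_neg).update(words)

    scores = {}
    for w in set(in_pos) | set(in_neg):
        a = in_pos[w]   # target-class docs containing w
        b = in_neg[w]   # other docs containing w
        c = n_pos - a   # target-class docs without w
        d = n_neg - b   # other docs without w
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # Keep only words that are positively associated with the target class.
        if denom == 0 or a * d <= b * c:
            scores[w] = 0.0
        else:
            scores[w] = n * (a * d - b * c) ** 2 / denom
    return scores


docs = [
    "the match ended with late goal",
    "the striker scored the goal",
    "the market fell on rate fears",
    "the bank raised the rate",
]
labels = ["sport", "sport", "finance", "finance"]
sport_scores = chi_square_scores(docs, labels, "sport")
best_sport_word = max(sport_scores, key=sport_scores.get)
```

A frequent but uninformative word such as "the" gets a score of zero here because it occurs in every document, which is the behaviour the abstract's intuition asks for.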


SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement


Title	SCEF: A Support-Confidence-aware Embedding Framework for Knowledge Graph Refinement
Authors	Yu Zhao, Ji Liu
Abstract	Knowledge graph (KG) refinement mainly aims at KG completion and correction (i.e., error detection). However, most conventional KG embedding models focus only on KG completion, under the unreasonable assumption that all facts in the KG hold without noise, and ignore error detection, which is also essential for KG refinement. In this paper, we propose a novel support-confidence-aware KG embedding framework (SCEF), which implements KG completion and correction simultaneously by learning knowledge representations with both triple support and triple confidence. Specifically, we build the model's energy function by combining a conventional translation-based model with support and confidence. To make our triple support-confidence more sufficient and robust, we consider not only the internal structural information in the KG, studying approximate relation entailment as triple confidence constraints, but also external textual evidence, proposing two kinds of triple support based on entity types and descriptions respectively. Through extensive experiments on real-world datasets, we demonstrate SCEF’s effectiveness.
Tasks
Published	2019-02-18
URL	https://arxiv.org/abs/1902.06377v2
PDF	https://arxiv.org/pdf/1902.06377v2.pdf
PWC	https://paperswithcode.com/paper/scef-a-support-confidence-aware-embedding
Repo
Framework
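
The abstract above describes an energy function that combines a translation-based model with support and confidence, but gives no formula. The sketch below shows only the generic idea, assuming a TransE-style energy and a margin loss scaled by a per-triple confidence in [0, 1]. It is not SCEF's actual objective, and `transe_energy` and `weighted_margin_loss` are hypothetical names.

```python
import math


def transe_energy(h, r, t):
    """TransE-style energy ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def weighted_margin_loss(pos, neg, confidence, margin=1.0):
    """Margin ranking loss for one (true, corrupted) triple pair.

    Assumption: the confidence simply scales the loss, so triples believed
    to be noisy contribute less to training.
    """
    gap = margin + transe_energy(*pos) - transe_energy(*neg)
    return confidence * max(0.0, gap)


h, r = [1.0, 0.0], [0.0, 1.0]
t_true = [1.0, 1.0]   # h + r lands exactly on t_true
t_fake = [1.0, 1.5]   # corrupted tail, 0.5 away from h + r

loss_trusted = weighted_margin_loss((h, r, t_true), (h, r, t_fake), confidence=1.0)
loss_doubted = weighted_margin_loss((h, r, t_true), (h, r, t_fake), confidence=0.2)
```

With these toy vectors, the low-confidence triple produces a smaller loss than the fully trusted one, so it pulls the embeddings less.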

A Unifying Framework for Variance Reduction Algorithms for Finding Zeroes of Monotone Operators


Title	A Unifying Framework for Variance Reduction Algorithms for Finding Zeroes of Monotone Operators
Authors	Xun Zhang, William B. Haskell, Zhisheng Ye
Abstract	A wide range of optimization problems can be recast as monotone inclusion problems. We propose a unifying framework for solving the monotone inclusion problem with randomized Forward-Backward algorithms. Our framework covers many existing deterministic and stochastic algorithms. Under various conditions, we can establish both sublinear and linear convergence rates in expectation for the algorithms covered by this framework. In addition, we consider algorithm design as well as asynchronous randomized Forward algorithms. Numerical experiments demonstrate the worth of the new algorithms that emerge from our framework.
Tasks
Published	2019-06-22
URL	https://arxiv.org/abs/1906.09437v1
PDF	https://arxiv.org/pdf/1906.09437v1.pdf
PWC	https://paperswithcode.com/paper/a-unifying-framework-for-variance-reduction
Repo
Framework
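
For readers unfamiliar with Forward-Backward splitting, here is a minimal sketch of the classical deterministic scheme on a one-dimensional toy problem: minimise 0.5 (x - 3)^2 + |x|. The randomized, variance-reduced variants the abstract refers to are not reproduced here, and the function names and toy problem are assumptions made for illustration.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |x| (the backward step for an L1 term)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def forward_backward(grad_f, prox_g, x0, step, iters=200):
    """Iterate x <- prox_{step * g}(x - step * grad_f(x))."""
    x = x0
    for _ in range(iters):
        x = prox_g(x - step * grad_f(x), step)  # forward step, then backward step
    return x


lam = 1.0
x_star = forward_backward(
    grad_f=lambda x: x - 3.0,                           # gradient of 0.5 * (x - 3)^2
    prox_g=lambda v, s: soft_threshold(v, s * lam),     # prox of s * lam * |x|
    x0=0.0,
    step=0.5,
)
```

The minimiser of this toy problem is the soft-thresholded value of 3 with threshold 1, that is x = 2, and the iteration converges to it.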

The Kikuchi Hierarchy and Tensor PCA


Title	The Kikuchi Hierarchy and Tensor PCA
Authors	Alexander S. Wein, Ahmed El Alaoui, Cristopher Moore
Abstract	For the tensor PCA (principal component analysis) problem, we propose a new hierarchy of increasingly powerful algorithms with increasing runtime. Our hierarchy is analogous to the sum-of-squares (SOS) hierarchy but is instead inspired by statistical physics and related algorithms such as belief propagation and AMP (approximate message passing). Our level-$\ell$ algorithm can be thought of as a linearized message-passing algorithm that keeps track of $\ell$-wise dependencies among the hidden variables. Specifically, our algorithms are spectral methods based on the Kikuchi Hessian, which generalizes the well-studied Bethe Hessian to the higher-order Kikuchi free energies. It is known that AMP, the flagship algorithm of statistical physics, has substantially worse performance than SOS for tensor PCA. In this work we ‘redeem’ the statistical physics approach by showing that our hierarchy gives a polynomial-time algorithm matching the performance of SOS. Our hierarchy also yields a continuum of subexponential-time algorithms, and we prove that these achieve the same (conjecturally optimal) tradeoff between runtime and statistical power as SOS. Our proofs are much simpler than prior work, and also apply to the related problem of refuting random $k$-XOR formulas. The results we present here apply to tensor PCA for tensors of all orders, and to $k$-XOR when $k$ is even. Our methods suggest a new avenue for systematically obtaining optimal algorithms for Bayesian inference problems, and our results constitute a step toward unifying the statistical physics and sum-of-squares approaches to algorithm design.
Tasks	Bayesian Inference
Published	2019-04-08
URL	https://arxiv.org/abs/1904.03858v2
PDF	https://arxiv.org/pdf/1904.03858v2.pdf
PWC	https://paperswithcode.com/paper/the-kikuchi-hierarchy-and-tensor-pca
Repo
Framework

People infer recursive visual concepts from just a few examples


Title	People infer recursive visual concepts from just a few examples
Authors	Brenden M. Lake, Steven T. Piantadosi
Abstract	Machine learning has made major advances in categorizing objects in images, yet the best algorithms miss important aspects of how people learn and think about categories. People can learn richer concepts from fewer examples, including causal models that explain how members of a category are formed. Here, we explore the limits of this human ability to infer causal “programs” – latent generating processes with nontrivial algorithmic properties – from one, two, or three visual examples. People were asked to extrapolate the programs in several ways, for both classifying and generating new examples. As a theory of these inductive abilities, we present a Bayesian program learning model that searches the space of programs for the best explanation of the observations. Although variable, people’s judgments are broadly consistent with the model and inconsistent with several alternatives, including a pre-trained deep neural network for object recognition, indicating that people can learn and reason with rich algorithmic abstractions from sparse input data.
Tasks	Object Recognition
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08034v2
PDF	https://arxiv.org/pdf/1904.08034v2.pdf
PWC	https://paperswithcode.com/paper/people-infer-recursive-visual-concepts-from
Repo
Framework

Collaboration of AI Agents via Cooperative Multi-Agent Deep Reinforcement Learning


Title	Collaboration of AI Agents via Cooperative Multi-Agent Deep Reinforcement Learning
Authors	Niranjan Balachandar, Justin Dieter, Govardana Sachithanandam Ramachandran
Abstract	There are many AI tasks involving multiple interacting agents where agents should learn to cooperate and collaborate to effectively perform the task. Here we develop and evaluate various multi-agent protocols to train agents to collaborate with teammates in grid soccer. We train and evaluate our multi-agent methods against a team operating with a smart hand-coded policy. As a baseline, we train agents concurrently and independently, with no communication. Our collaborative protocols were parameter sharing, coordinated learning with communication, and counterfactual policy gradients. Against the hand-coded team, the team trained with parameter sharing and the team trained with coordinated learning performed the best, scoring on 89.5% and 94.5% of episodes respectively when playing against the hand-coded team. Against the parameter sharing team, with adversarial training the coordinated learning team scored on 75% of the episodes, indicating it is the most adaptable of our methods. The insights gained from our work can be applied to other domains where multi-agent collaboration could be beneficial.
Tasks
Published	2019-06-30
URL	https://arxiv.org/abs/1907.00327v1
PDF	https://arxiv.org/pdf/1907.00327v1.pdf
PWC	https://paperswithcode.com/paper/collaboration-of-ai-agents-via-cooperative
Repo
Framework

Improved training of binary networks for human pose estimation and image recognition


Title	Improved training of binary networks for human pose estimation and image recognition
Authors	Adrian Bulat, Georgios Tzimiropoulos, Jean Kossaifi, Maja Pantic
Abstract	Big neural networks trained on large datasets have advanced the state-of-the-art for a large variety of challenging problems, improving performance by a large margin. However, under low memory and limited computational power constraints, the accuracy on the same problems drops considerably. In this paper, we propose a series of techniques that significantly improve the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary). We evaluate the proposed improvements on two diverse tasks: fine-grained recognition (human pose estimation) and large-scale image recognition (ImageNet classification). Specifically, we introduce a series of novel methodological changes including: (a) more appropriate activation functions, (b) reverse-order initialization, (c) progressive quantization, and (d) network stacking, and show that these additions significantly improve existing state-of-the-art network binarization techniques. Additionally, for the first time, we also investigate the extent to which network binarization and knowledge distillation can be combined. When tested on the challenging MPII dataset, our method shows a performance improvement of more than 4% in absolute terms. Finally, we further validate our findings by applying the proposed techniques for large-scale object recognition on the ImageNet dataset, on which we report a reduction of error rate by 4%.
Tasks	Object Recognition, Pose Estimation, Quantization
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05868v1
PDF	http://arxiv.org/pdf/1904.05868v1.pdf
PWC	https://paperswithcode.com/paper/improved-training-of-binary-networks-for
Repo
Framework

Definitively Identifying an Inherent Limitation to Actual Cognition


Title	Definitively Identifying an Inherent Limitation to Actual Cognition
Authors	Arthur Charlesworth
Abstract	A century ago, discoveries of a serious kind of logical error made separately by several leading mathematicians led to acceptance of a sharply enhanced standard for rigor within what ultimately became the foundation for Computer Science. By 1931, Gödel had obtained a definitive and remarkable result: an inherent limitation to that foundation. The resulting limitation is not applicable to actual human cognition, to even the smallest extent, unless both of these extremely brittle assumptions hold: humans are infallible reasoners and reason solely via formal inference rules. Both assumptions are contradicted by empirical data from well-known Cognitive Science experiments. This article investigates how a novel multi-part methodology recasts computability theory within Computer Science to obtain a definitive limitation whose application to human cognition avoids assumptions contradicting empirical data. The limitation applies to individual humans, to finite sets of humans, and more generally to any real-world entity.
Tasks
Published	2019-05-29
URL	https://arxiv.org/abs/1905.13010v2
PDF	https://arxiv.org/pdf/1905.13010v2.pdf
PWC	https://paperswithcode.com/paper/definitively-identifying-an-inherent
Repo
Framework

Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning


Title	Leveraging Communication Topologies Between Learning Agents in Deep Reinforcement Learning
Authors	Dhaval Adjodah, Dan Calacci, Abhimanyu Dubey, Anirudh Goyal, Peter Krafft, Esteban Moro, Alex Pentland
Abstract	A common technique to improve learning performance in deep reinforcement learning (DRL) and many other machine learning algorithms is to run multiple learning agents in parallel. A neglected component in the development of these algorithms has been how best to arrange the learning agents involved to improve distributed search. Here we draw upon results from the networked optimization literature suggesting that arranging learning agents in communication networks other than fully connected topologies (the implicit way agents are commonly arranged) can improve learning. We explore the relative performance of four popular families of graphs and observe that one such family (Erdős-Rényi random graphs) empirically outperforms the de facto fully-connected communication topology across several DRL benchmark tasks. Additionally, we observe that 1000 learning agents arranged in an Erdős-Rényi graph can perform as well as 3000 agents arranged in the standard fully-connected topology, showing the large learning improvement possible when carefully designing the topology over which agents communicate. We complement these empirical results with a theoretical investigation of why our alternate topologies perform better. Overall, our work suggests that distributed machine learning algorithms could be made more effective if the communication topology between learning agents were optimized.
Tasks
Published	2019-02-16
URL	https://arxiv.org/abs/1902.06740v2
PDF	https://arxiv.org/pdf/1902.06740v2.pdf
PWC	https://paperswithcode.com/paper/communication-topologies-between-learning
Repo
Framework
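
To make the two topologies in the abstract above concrete, the sketch below builds an Erdős-Rényi G(n, p) graph, in which each possible edge is included independently with probability p, alongside the fully connected baseline. How agents actually exchange information over these edges is not described in the abstract and is not modelled here; the function names and the values of n and p are illustrative assumptions.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Undirected G(n, p) graph as an adjacency dict: node -> set of neighbours."""
    rng = random.Random(seed)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:  # include edge {i, j} with probability p
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


def fully_connected(n):
    """Every agent communicates with every other agent."""
    return {i: set(range(n)) - {i} for i in range(n)}


sparse = erdos_renyi(n=20, p=0.3, seed=1)   # hypothetical agent count and density
dense = fully_connected(20)
edge_count = sum(len(v) for v in sparse.values()) // 2
```

Setting p = 1 recovers the fully connected topology, so the random graph family contains the standard arrangement as a special case.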

An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration


Title	An Application-Specific VLIW Processor with Vector Instruction Set for CNN Acceleration
Authors	Andreas Bytyn, Rainer Leupers, Gerd Ascheid
Abstract	In recent years, neural networks have surpassed classical algorithms in areas such as object recognition, e.g. in the well-known ImageNet challenge. As a result, great effort is being put into developing fast and efficient accelerators, especially for Convolutional Neural Networks (CNNs). In this work we present ConvAix, a fully C-programmable processor, which – contrary to many existing architectures – does not rely on a hard-wired array of multiply-and-accumulate (MAC) units. Instead it maps computations onto independent vector lanes making use of a carefully designed vector instruction set. The presented processor is targeted towards latency-sensitive applications and is capable of executing up to 192 MAC operations per cycle. ConvAix operates at a target clock frequency of 400 MHz in 28nm CMOS, thereby offering state-of-the-art performance with proper flexibility within its target domain. Simulation results for several 2D convolutional layers from well known CNNs (AlexNet, VGG-16) show an average ALU utilization of 72.5% using vector instructions with 16 bit fixed-point arithmetic. Compared to other well-known designs which are less flexible, ConvAix offers competitive energy efficiency of up to 497 GOP/s/W while even surpassing them in terms of area efficiency and processing speed.
Tasks	Object Recognition
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05106v1
PDF	http://arxiv.org/pdf/1904.05106v1.pdf
PWC	https://paperswithcode.com/paper/an-application-specific-vliw-processor-with
Repo
Framework
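
The peak throughput implied by the figures in the abstract above can be worked out directly. The sketch below uses only the stated 192 MAC operations per cycle and the 400 MHz clock. Counting one MAC as two operations (a multiply and an add) is a common convention but an assumption here, since the abstract does not say how it counts operations.

```python
macs_per_cycle = 192      # from the abstract
clock_hz = 400e6          # 400 MHz target clock, from the abstract
ops_per_mac = 2           # assumption: one multiply plus one add

peak_gmacs = macs_per_cycle * clock_hz / 1e9   # giga-MACs per second
peak_gops = ops_per_mac * peak_gmacs           # giga-operations per second

print(f"peak: {peak_gmacs:.1f} GMAC/s, {peak_gops:.1f} GOP/s")
```

Under these assumptions the peak is 76.8 GMAC/s, or 153.6 GOP/s.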

Beyond NP: Quantifying over Answer Sets


Title	Beyond NP: Quantifying over Answer Sets
Authors	Giovanni Amendola, Francesco Ricca, Mirek Truszczynski
Abstract	Answer Set Programming (ASP) is a logic programming paradigm featuring a purely declarative language with comparatively high modeling capabilities. Indeed, ASP can model problems in NP in a compact and elegant way. However, modeling problems beyond NP with ASP is known to be complicated, on the one hand, and limited to problems in $\Sigma^P_2$ on the other. Inspired by the way Quantified Boolean Formulas extend SAT formulas to model problems beyond NP, we propose an extension of ASP that introduces quantifiers over stable models of programs. We name the new language ASP with Quantifiers (ASP(Q)). In the paper we identify computational properties of ASP(Q); we highlight its modeling capabilities by reporting natural encodings of several complex problems with applications in artificial intelligence and number theory; and we compare ASP(Q) with related languages. Arguably, ASP(Q) allows one to model problems in the Polynomial Hierarchy in a direct way, providing an elegant expansion of ASP beyond the class NP. Under consideration for acceptance in TPLP.
Tasks
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09559v1
PDF	https://arxiv.org/pdf/1907.09559v1.pdf
PWC	https://paperswithcode.com/paper/beyond-np-quantifying-over-answer-sets
Repo
Framework

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering


Title	Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Authors	Guan-Lin Chao, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tür, Jindong Chen, Ian Lane
Abstract	Understanding and conversing about dynamic scenes is one of the key capabilities of AI agents that navigate the environment and convey useful information to humans. Video question answering is a specific scenario of such AI-human interaction where an agent generates a natural language response to a question regarding the video of a dynamic scene. Incorporating features from multiple modalities, which often provide supplementary information, is one of the challenging aspects of video question answering. Furthermore, a question often concerns only a small segment of the video, hence encoding the entire video sequence using a recurrent neural network is not computationally efficient. Our proposed question-guided video representation module efficiently generates the token-level video summary guided by each word in the question. The learned representations are then fused with the question to generate the answer. Through empirical evaluation on the Audio Visual Scene-aware Dialog (AVSD) dataset, our proposed models in single-turn and multi-turn question answering achieve state-of-the-art performance on several automatic natural language generation evaluation metrics.
Tasks	Question Answering, Text Generation, Video Question Answering
Published	2019-07-31
URL	https://arxiv.org/abs/1907.13280v1
PDF	https://arxiv.org/pdf/1907.13280v1.pdf
PWC	https://paperswithcode.com/paper/learning-question-guided-video-representation
Repo
Framework

A Discrete CVAE for Response Generation on Short-Text Conversation


Title	A Discrete CVAE for Response Generation on Short-Text Conversation
Authors	Jun Gao, Wei Bi, Xiaojiang Liu, Junhui Li, Guodong Zhou, Shuming Shi
Abstract	Neural conversation models such as encoder-decoder models tend to generate bland and generic responses. Some researchers propose to use the conditional variational autoencoder (CVAE), which maximizes the lower bound on the conditional log-likelihood on a continuous latent variable. With different sampled latent variables, the model is expected to generate diverse responses. Although the CVAE-based models have shown tremendous potential, their improvement in generating high-quality responses is still unsatisfactory. In this paper, we introduce a discrete latent variable with an explicit semantic meaning to improve the CVAE on short-text conversation. A major advantage of our model is that we can exploit the semantic distance between the latent variables to maintain good diversity between the sampled latent variables. Accordingly, we propose a two-stage sampling approach to enable efficient diverse variable selection from a large latent space assumed in the short-text conversation task. Experimental results indicate that our model outperforms various kinds of generation models under both automatic and human evaluations and generates more diverse and informative responses.
Tasks	Short-Text Conversation
Published	2019-11-22
URL	https://arxiv.org/abs/1911.09845v1
PDF	https://arxiv.org/pdf/1911.09845v1.pdf
PWC	https://paperswithcode.com/paper/a-discrete-cvae-for-response-generation-on-1
Repo
Framework

Stochastic Bandits with Delayed Composite Anonymous Feedback


Title	Stochastic Bandits with Delayed Composite Anonymous Feedback
Authors	Siddhant Garg, Aditya Kumar Akash
Abstract	We explore a novel setting of the Multi-Armed Bandit (MAB) problem inspired by real-world applications, which we call bandits with “stochastic delayed composite anonymous feedback” (SDCAF). In SDCAF, the rewards on pulling arms are stochastic with respect to time but spread over a fixed number of time steps in the future after pulling the arm. The complexity of this problem stems from the anonymous feedback to the player and the stochastic generation of the reward. Due to the aggregated nature of the rewards, the player is unable to associate the reward with a particular time step from the past. We present two algorithms for this more complicated setting of SDCAF using phase-based extensions of the UCB algorithm. We perform regret analysis to show sub-linear theoretical guarantees for both algorithms.
Tasks
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01161v2
PDF	https://arxiv.org/pdf/1910.01161v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-bandits-with-delayed-composite
Repo
Framework
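
As background for the UCB algorithm that the abstract above extends, here is a minimal sketch of plain UCB1 with immediate, non-anonymous feedback. It is not the phase-based variant the authors propose, and it does not model delayed or composite rewards. The arm means, horizon, and function names are illustrative assumptions.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; return how often each arm was pulled."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise its estimate
        else:
            # Choose the arm with the largest empirical mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


rng = random.Random(0)
means = [0.2, 0.5, 0.8]  # hypothetical Bernoulli arm means


def bernoulli_pull(arm):
    return 1.0 if rng.random() < means[arm] else 0.0


counts = ucb1(bernoulli_pull, n_arms=3, horizon=5000)
```

In the SDCAF setting the reward observed at a given step would be a sum of pieces from earlier pulls, so the per-arm update in this loop would no longer be valid as written. This is the difficulty the abstract describes and addresses with phase-based extensions.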