Paper Group ANR 1556
Knowledge Enhanced Attention for Robust Natural Language Inference
Title | Knowledge Enhanced Attention for Robust Natural Language Inference |
Authors | Alexander Hanbo Li, Abhinav Sethy |
Abstract | Neural network models have been very successful at achieving high accuracy on natural language inference (NLI) tasks. However, as demonstrated in recent literature, when tested on some simple adversarial examples, most of the models suffer a significant drop in performance. This raises concerns about the robustness of NLI models. In this paper, we propose to make NLI models robust by incorporating external knowledge into the attention mechanism using a simple transformation. We apply the new attention to two popular types of NLI models: one is a Transformer encoder, and the other is a decomposable model, and show that our method can significantly improve their robustness. Moreover, when combined with BERT pretraining, our method achieves human-level performance on the adversarial SNLI data set. |
Tasks | Natural Language Inference |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00102v1 |
https://arxiv.org/pdf/1909.00102v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-enhanced-attention-for-robust |
Repo | |
Framework | |
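The abstract above only states that external knowledge enters the attention mechanism through a simple transformation; the exact transformation is not given. Below is a minimal, hypothetical numpy sketch of one way such knowledge injection can look: an external relation matrix (e.g., derived from WordNet) is added to the attention logits before the softmax. The function names, the additive form, and the `weight` parameter are assumptions for illustration, not the paper's formulation.

```python
# Illustrative sketch (not the paper's exact method): attention logits between
# premise and hypothesis tokens are augmented with external-knowledge relation
# scores before the softmax.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def knowledge_enhanced_attention(premise, hypothesis, relation, weight=1.0):
    """premise: (m, d) token vectors, hypothesis: (n, d) token vectors,
    relation: (m, n) external-knowledge scores (e.g., 1.0 for related
    word pairs, 0.0 otherwise)."""
    logits = premise @ hypothesis.T / np.sqrt(premise.shape[1])
    logits = logits + weight * relation          # inject external knowledge
    return softmax(logits, axis=-1)              # premise-to-hypothesis attention

# toy usage
m, n, d = 4, 5, 8
rng = np.random.default_rng(0)
attn = knowledge_enhanced_attention(rng.normal(size=(m, d)),
                                    rng.normal(size=(n, d)),
                                    rng.integers(0, 2, size=(m, n)).astype(float))
print(attn.shape, attn.sum(axis=-1))             # rows sum to 1
```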
DLBC: A Deep Learning-Based Consensus in Blockchains for Deep Learning Services
Title | DLBC: A Deep Learning-Based Consensus in Blockchains for Deep Learning Services |
Authors | Boyang Li, Changhao Chenli, Xiaowei Xu, Yiyu Shi, Taeho Jung |
Abstract | With the growing number of artificial intelligence applications, training deep neural network (DNN) models has become an increasingly common task. However, training a good deep learning model incurs enormous computation cost and energy consumption. Recently, blockchain has been widely used, and during its operation a huge amount of computation resources is wasted on the Proof of Work (PoW) consensus. In this paper, we propose DLBC to exploit the computation power of miners for deep learning training as proof of useful work instead of calculating hash values. It distinguishes itself from recent proof-of-useful-work mechanisms by addressing several of their limitations. Specifically, DLBC handles multiple tasks, larger models and training datasets, and introduces a comprehensive ranking mechanism that considers task difficulty (e.g., model complexity, network burden, data size, queue length). We also apply DNN watermarking [1] to improve robustness. In Section V, the average digital-signature overheads are 1.25, 0.001, 0.002 and 0.98 seconds, and the average network overheads are 3.77, 3.01, 0.37 and 0.41 seconds, respectively. Embedding a watermark takes 3 epochs and removing a watermark takes 30 epochs. This penalty for removing a watermark prevents attackers from stealing, improving, and resubmitting DL models from honest miners. |
Tasks | Semantic Segmentation |
Published | 2019-04-15 |
URL | https://arxiv.org/abs/1904.07349v2 |
https://arxiv.org/pdf/1904.07349v2.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-computation-power-of-blockchain |
Repo | |
Framework | |
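The abstract mentions a ranking mechanism over task difficulty factors (model complexity, network burden, data size, queue length) but does not give a formula. The sketch below shows one hypothetical way to combine those factors into a single score; the weights and the linear form are assumptions, not DLBC's actual mechanism.

```python
# Hypothetical task-ranking score over the difficulty factors listed in the
# abstract. The weighting scheme is illustrative only.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    model_params: int      # model complexity (parameter count)
    network_mb: float      # data to transfer over the network, in MB
    dataset_gb: float      # training-data size, in GB
    queue_len: int         # tasks already waiting

def difficulty_score(t: Task, w=(1e-6, 0.5, 2.0, 1.0)) -> float:
    # Larger score = harder task; the weights are arbitrary for this sketch.
    return (w[0] * t.model_params + w[1] * t.network_mb
            + w[2] * t.dataset_gb + w[3] * t.queue_len)

tasks = [
    Task("cifar10-resnet", 11_000_000, 160.0, 0.15, 3),
    Task("imagenet-vgg", 138_000_000, 550.0, 150.0, 1),
]
for t in sorted(tasks, key=difficulty_score, reverse=True):
    print(f"{t.name}: {difficulty_score(t):.1f}")
```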
Frame-Recurrent Video Inpainting by Robust Optical Flow Inference
Title | Frame-Recurrent Video Inpainting by Robust Optical Flow Inference |
Authors | Yifan Ding, Chuan Wang, Haibin Huang, Jiaming Liu, Jue Wang, Liqiang Wang |
Abstract | In this paper, we present a new inpainting framework for recovering missing regions of video frames. Compared with image inpainting, performing this task on video presents new challenges, such as how to preserve temporal consistency and spatial details, and how to handle arbitrary input video sizes and lengths quickly and efficiently. To this end, we propose a novel deep learning architecture which incorporates ConvLSTM and optical flow for modeling the spatial-temporal consistency in videos. It also saves considerable computational resources, so our method can handle videos with larger frame sizes and arbitrary length in a streaming fashion, in real time. Furthermore, to generate an accurate optical flow from corrupted frames, we propose a robust flow generation module, where two sources of flow are fed in and a flow blending network is trained to fuse them. We conduct extensive experiments to evaluate our method in various scenarios and on different datasets, both qualitatively and quantitatively. The experimental results demonstrate the superiority of our method compared with state-of-the-art inpainting approaches. |
Tasks | Image Inpainting, Optical Flow Estimation, Video Inpainting |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.02882v1 |
https://arxiv.org/pdf/1905.02882v1.pdf | |
PWC | https://paperswithcode.com/paper/frame-recurrent-video-inpainting-by-robust |
Repo | |
Framework | |
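The robust flow generation module fuses two sources of flow with a trained blending network. As a minimal illustration of the fusion step only, the numpy sketch below blends two flow fields with a per-pixel weight map; in the paper the weights would come from the learned blending network, whereas here the map is simply an input.

```python
# Minimal sketch of the flow-fusion idea: two candidate optical-flow fields
# are blended per pixel by a weight map (here given, not learned).
import numpy as np

def blend_flows(flow_a, flow_b, weight_map):
    """flow_a, flow_b: (H, W, 2) flow fields; weight_map: (H, W) in [0, 1]."""
    w = weight_map[..., None]                 # broadcast over the 2 flow channels
    return w * flow_a + (1.0 - w) * flow_b

H, W = 64, 64
rng = np.random.default_rng(0)
fused = blend_flows(rng.normal(size=(H, W, 2)),
                    rng.normal(size=(H, W, 2)),
                    rng.uniform(size=(H, W)))
print(fused.shape)                            # (64, 64, 2)
```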
Riemannian optimization on the simplex of positive definite matrices
Title | Riemannian optimization on the simplex of positive definite matrices |
Authors | Bamdev Mishra, Hiroyuki Kasai, Pratik Jawanpuria |
Abstract | We discuss optimization-related ingredients for the Riemannian manifold defined by the constraint $\mathbf{X}_1 + \mathbf{X}_2 + \ldots + \mathbf{X}_K = \mathbf{I}$, where the matrix $\mathbf{X}_i \succ 0$ is symmetric positive definite of size $n\times n$ for all $i \in \{1,\ldots,K\}$. For the case $n = 1$, the constraint boils down to the popular standard simplex constraint. |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10436v1 |
https://arxiv.org/pdf/1906.10436v1.pdf | |
PWC | https://paperswithcode.com/paper/riemannian-optimization-on-the-simplex-of |
Repo | |
Framework | |
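For intuition about the constraint set, the numpy sketch below constructs a random feasible point: arbitrary positive definite matrices $A_i$ are "normalized" by $S^{-1/2} A_i S^{-1/2}$ with $S = \sum_i A_i$, so the results are positive definite and sum to the identity. This is only a way to generate points on the manifold, not the optimization machinery developed in the paper.

```python
# Generate a random point satisfying X_1 + ... + X_K = I with each X_i
# symmetric positive definite. For n = 1 this reduces to the standard simplex.
import numpy as np

def random_point_on_simplex_of_pd(K, n, rng):
    A = []
    for _ in range(K):
        B = rng.normal(size=(n, n))
        A.append(B @ B.T + 1e-6 * np.eye(n))       # random PD matrix
    S = sum(A)
    lam, V = np.linalg.eigh(S)                     # S is symmetric PD
    S_inv_sqrt = V @ np.diag(lam ** -0.5) @ V.T
    return [S_inv_sqrt @ Ai @ S_inv_sqrt for Ai in A]

rng = np.random.default_rng(0)
X = random_point_on_simplex_of_pd(K=3, n=4, rng=rng)
print(np.allclose(sum(X), np.eye(4)))              # True: the X_i sum to I
```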
Embedding Biomedical Ontologies by Jointly Encoding Network Structure and Textual Node Descriptors
Title | Embedding Biomedical Ontologies by Jointly Encoding Network Structure and Textual Node Descriptors |
Authors | Sotiris Kotitsas, Dimitris Pappas, Ion Androutsopoulos, Ryan McDonald, Marianna Apidianaki |
Abstract | Network Embedding (NE) methods, which map network nodes to low-dimensional feature vectors, have wide applications in network analysis and bioinformatics. Many existing NE methods rely only on network structure, overlooking other information associated with the nodes, e.g., text describing the nodes. Recent attempts to combine the two sources of information only consider local network structure. We extend NODE2VEC, a well-known NE method that considers broader network structure, to also consider textual node descriptors using recurrent neural encoders. Our method is evaluated on link prediction in two networks derived from UMLS. Experimental results demonstrate the effectiveness of the proposed approach compared to previous work. |
Tasks | Link Prediction, Network Embedding |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05939v2 |
https://arxiv.org/pdf/1906.05939v2.pdf | |
PWC | https://paperswithcode.com/paper/embedding-biomedical-ontologies-by-jointly |
Repo | |
Framework | |
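As a hedged illustration of the core idea, namely producing node embeddings from textual descriptors with a recurrent encoder and training them on random-walk co-occurrences, the PyTorch sketch below encodes each node's descriptor with a GRU and scores walk pairs with a skip-gram-style dot product. The class name, dimensions, and loss are assumptions, not the paper's exact model.

```python
# Illustrative sketch: a GRU turns a node's textual descriptor into its
# embedding, and embeddings of nodes co-occurring on node2vec-style walks are
# pulled together with a dot-product objective.
import torch
import torch.nn as nn

class TextNodeEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, descriptor_tokens):            # (batch, seq_len) token ids
        _, h = self.gru(self.tok(descriptor_tokens))
        return h.squeeze(0)                           # (batch, hid_dim) node vectors

enc = TextNodeEncoder(vocab_size=1000)
center = torch.randint(1, 1000, (8, 12))              # descriptors of center nodes
context = torch.randint(1, 1000, (8, 12))              # descriptors of walk neighbours
score = (enc(center) * enc(context)).sum(-1)           # skip-gram-style similarity
loss = nn.functional.binary_cross_entropy_with_logits(score, torch.ones(8))
loss.backward()
print(float(loss))
```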
Meta reinforcement learning as task inference
Title | Meta reinforcement learning as task inference |
Authors | Jan Humplik, Alexandre Galashov, Leonard Hasenclever, Pedro A. Ortega, Yee Whye Teh, Nicolas Heess |
Abstract | Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes proposals to learn the learning algorithm itself, an idea also known as meta learning. One formal interpretation of this idea is as a partially observable multi-task RL problem in which task information is hidden from the agent. Such unknown-task problems can be reduced to Markov decision processes (MDPs) by augmenting an agent's observations with an estimate of the belief about the task based on past experience. However, estimating the belief state is intractable in most partially observed MDPs. We propose a method that separately learns the policy and the task belief by taking advantage of various kinds of privileged information. Our approach can be very effective at solving standard meta-RL environments, as well as a complex continuous-control environment with sparse rewards that requires long-term memory. |
Tasks | Continuous Control, Meta-Learning |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06424v2 |
https://arxiv.org/pdf/1905.06424v2.pdf | |
PWC | https://paperswithcode.com/paper/meta-reinforcement-learning-as-task-inference |
Repo | |
Framework | |
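A hedged PyTorch sketch of the task-inference component described above: an LSTM summarizes the trajectory of observations, actions, and rewards, and is trained with the privileged task label (available only at training time) to output a belief over tasks; the policy, not shown, would condition on this belief. Shapes and names are illustrative assumptions.

```python
# Sketch of a belief network trained with privileged task labels.
import torch
import torch.nn as nn

class TaskBeliefNet(nn.Module):
    def __init__(self, obs_dim, act_dim, n_tasks, hid=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim + act_dim + 1, hid, batch_first=True)
        self.head = nn.Linear(hid, n_tasks)

    def forward(self, obs, act, rew):                 # (B, T, ...) tensors
        x = torch.cat([obs, act, rew.unsqueeze(-1)], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                  # belief logits over tasks

B, T, obs_dim, act_dim, n_tasks = 16, 50, 8, 2, 5
net = TaskBeliefNet(obs_dim, act_dim, n_tasks)
logits = net(torch.randn(B, T, obs_dim), torch.randn(B, T, act_dim), torch.randn(B, T))
loss = nn.functional.cross_entropy(logits, torch.randint(0, n_tasks, (B,)))
loss.backward()
print(float(loss))
```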
Providing Advanced Access to Historical War Memoirs Through the Identification of Events, Participants and Roles
Title | Providing Advanced Access to Historical War Memoirs Through the Identification of Events, Participants and Roles |
Authors | Marco Rovera, Federico Nanni, Simone Paolo Ponzetto |
Abstract | The progressive digitization of historical archives provides new, often domain-specific, textual resources that report on facts and events that happened in the past; among them, memoirs are a very common type of primary source. In this paper, we present an approach for extracting information from historical war memoirs and turning it into structured knowledge. This is based on the semantic notions of events, participants and roles. We assess each of the key steps of our approach quantitatively and provide a graph-based representation of the extracted knowledge, which allows the end user to move between close and distant reading of the collection. |
Tasks | |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.05439v1 |
http://arxiv.org/pdf/1904.05439v1.pdf | |
PWC | https://paperswithcode.com/paper/providing-advanced-access-to-historical-war |
Repo | |
Framework | |
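As a small illustration of the graph-based representation of extracted (event, participant, role) knowledge mentioned above, the sketch below stores such triples in a networkx multigraph. The triples are invented placeholders, not material extracted from any memoir.

```python
# Store (event, participant, role) triples as a labeled multigraph.
import networkx as nx

triples = [
    ("Battle_of_X", "Unit_A", "attacker"),
    ("Battle_of_X", "Unit_B", "defender"),
    ("Retreat_of_Y", "Unit_B", "agent"),
]

G = nx.MultiDiGraph()
for event, participant, role in triples:
    G.add_node(event, kind="event")
    G.add_node(participant, kind="participant")
    G.add_edge(event, participant, role=role)

print(G.number_of_nodes(), G.number_of_edges())
print([d["role"] for _, _, d in G.edges(data=True)])
```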
Deep Learning for Plasma Tomography and Disruption Prediction from Bolometer Data
Title | Deep Learning for Plasma Tomography and Disruption Prediction from Bolometer Data |
Authors | Diogo R. Ferreira, Pedro J. Carvalho, Horácio Fernandes |
Abstract | The use of deep learning is facilitating a wide range of data processing tasks in many areas. The analysis of fusion data is no exception, since there is a need to process large amounts of data collected from the diagnostic systems attached to a fusion device. Fusion data involve images and time series, making them natural candidates for convolutional and recurrent neural networks. In this work, we describe how CNNs can be used to reconstruct the plasma radiation profile, and we discuss the potential of using RNNs for disruption prediction based on the same input data. Both approaches have been applied at JET using data from a multi-channel diagnostic system. Similar approaches can be applied to other fusion devices and diagnostics. |
Tasks | Time Series |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.13257v1 |
https://arxiv.org/pdf/1910.13257v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-plasma-tomography-and |
Repo | |
Framework | |
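For a rough sense of the CNN-based tomography direction described above, the PyTorch sketch below maps a vector of bolometer channel measurements to a 2D radiation profile with a dense layer followed by transposed convolutions. The channel count, grid size, and layer configuration are assumptions for illustration, not the architecture used at JET.

```python
# Sketch: bolometer measurement vector -> 2D radiation profile.
import torch
import torch.nn as nn

class BoloToProfile(nn.Module):
    def __init__(self, n_channels=56, grid=(120, 80)):
        super().__init__()
        self.grid = grid
        self.fc = nn.Linear(n_channels, 32 * (grid[0] // 4) * (grid[1] // 4))
        self.up = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):                      # x: (B, n_channels)
        h = self.fc(x).view(-1, 32, self.grid[0] // 4, self.grid[1] // 4)
        return self.up(h).squeeze(1)           # (B, H, W) radiation profile

model = BoloToProfile()
print(model(torch.randn(4, 56)).shape)         # torch.Size([4, 120, 80])
```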
Evolutionary Multi-Objective Optimization Driven by Generative Adversarial Networks (GANs)
Title | Evolutionary Multi-Objective Optimization Driven by Generative Adversarial Networks (GANs) |
Authors | Cheng He, Shihua Huang, Ran Cheng, Kay Chen Tan, Yaochu Jin |
Abstract | Recently, more and more works have proposed to drive evolutionary algorithms using machine learning models. Usually, the performance of such model-based evolutionary algorithms depends heavily on the training quality of the adopted models. Since a certain amount of data (i.e., the candidate solutions generated by the algorithm) is usually required for model training, performance deteriorates rapidly as the problem scale increases, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GAN; the offspring solutions are then sampled from the trained GAN. Thanks to the powerful generative ability of GANs, our proposed algorithm is capable of generating promising offspring solutions in high-dimensional decision spaces with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables. Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.04966v1 |
https://arxiv.org/pdf/1910.04966v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-multi-objective-optimization-1 |
Repo | |
Framework | |
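The sketch below illustrates one generation of the reproduction scheme described above: parent solutions are split into "real" (better half) and "fake" (worse half) samples, a small GAN is fitted on them, and offspring are sampled from the generator. The split criterion, network sizes, and training budget are assumptions for the sketch, not the paper's algorithm.

```python
# One illustrative generation of GAN-driven offspring generation.
import torch
import torch.nn as nn

def gan_offspring(parents, fitness, n_offspring=20, d_latent=8, steps=200):
    d = parents.shape[1]
    order = torch.argsort(fitness)                   # smaller fitness = better (assumption)
    real, fake = parents[order[: len(parents) // 2]], parents[order[len(parents) // 2:]]

    G = nn.Sequential(nn.Linear(d_latent, 32), nn.ReLU(), nn.Linear(32, d), nn.Sigmoid())
    D = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    for _ in range(steps):
        z = torch.randn(len(real), d_latent)
        # discriminator: real parents vs. (worse parents + generated samples)
        d_loss = bce(D(real), torch.ones(len(real), 1)) + \
                 bce(D(torch.cat([fake, G(z).detach()])), torch.zeros(len(fake) + len(real), 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # generator: fool the discriminator
        g_loss = bce(D(G(z)), torch.ones(len(real), 1))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    with torch.no_grad():
        return G(torch.randn(n_offspring, d_latent))  # offspring in [0, 1]^d

parents = torch.rand(40, 10)                          # 40 candidate solutions in [0, 1]^10
offspring = gan_offspring(parents, fitness=parents.sum(dim=1))
print(offspring.shape)                                # torch.Size([20, 10])
```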
Image recognition from raw labels collected without annotators
Title | Image recognition from raw labels collected without annotators |
Authors | Fatih Furkan Yilmaz, Reinhard Heckel |
Abstract | Image classification problems are typically addressed by first collecting examples with candidate labels, second cleaning the candidate labels manually, and third training a deep neural network on the clean examples. The manual labeling step is often the most expensive one, as it requires workers to label millions of images. In this paper we propose to work without any explicitly labeled data by i) directly training the deep neural network on the noisy candidate labels, and ii) early stopping the training to avoid overfitting. With this procedure we exploit an intriguing property of standard overparameterized convolutional neural networks trained with (stochastic) gradient descent: clean labels are fitted faster than noisy ones. We consider two classification problems, a subset of ImageNet and CIFAR-10. For both, we construct large candidate datasets without any explicit human annotations, which contain only 10%-50% correctly labeled examples per class. We show that training on the candidate examples and regularizing through early stopping gives higher test performance for both problems than training on the original, clean data. This is possible because the candidate datasets contain a huge number of clean examples, and, as we show in this paper, the noise generated through the label collection process is not nearly as adversarial for learning as the noise generated by randomly flipping labels. |
Tasks | Image Classification |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.09055v3 |
https://arxiv.org/pdf/1910.09055v3.pdf | |
PWC | https://paperswithcode.com/paper/leveraging-inductive-bias-of-neural-networks |
Repo | |
Framework | |
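A toy sketch of the "train on noisy labels, then stop early" recipe: a small classifier is fitted on synthetically corrupted labels while accuracy on a clean held-out set is tracked, and the best epoch is recorded as the early-stopping point. The synthetic data, the 40% flip rate, and the use of a clean validation set are simplifications, not the paper's ImageNet/CIFAR setup.

```python
# Train on noisy labels; track clean held-out accuracy to pick the early stop.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(2000, 20)
y_clean = (X[:, 0] + X[:, 1] > 0).long()
flip = torch.rand(2000) < 0.4                       # 40% of training labels corrupted
y_noisy = torch.where(flip, 1 - y_clean, y_clean)

train_X, val_X = X[:1500], X[1500:]
train_y, val_y = y_noisy[:1500], y_clean[1500:]     # validate on clean labels

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
best_acc, best_epoch = 0.0, -1
for epoch in range(100):
    loss = nn.functional.cross_entropy(model(train_X), train_y)
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        acc = (model(val_X).argmax(1) == val_y).float().mean().item()
    if acc > best_acc:
        best_acc, best_epoch = acc, epoch           # early-stopping point
print(f"best clean accuracy {best_acc:.3f} at epoch {best_epoch}")
```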
Processing-In-Memory Acceleration of Convolutional Neural Networks for Energy-Efficiency, and Power-Intermittency Resilience
Title | Processing-In-Memory Acceleration of Convolutional Neural Networks for Energy-Efficiency, and Power-Intermittency Resilience |
Authors | Arman Roohi, Shaahin Angizi, Deliang Fan, Ronald F DeMara |
Abstract | Herein, a bit-wise Convolutional Neural Network (CNN) in-memory accelerator is implemented using Spin-Orbit Torque Magnetic Random Access Memory (SOT-MRAM) computational sub-arrays. It utilizes a novel AND-Accumulation method capable of significantly reducing energy consumption within convolutional layers and performs various low bit-width CNN inference operations entirely within MRAM. Power-intermittency resilience is also enhanced by retaining the partial state information needed to maintain computational forward progress, which is advantageous for battery-less IoT nodes. Simulation results indicate $\sim$5.4$\times$ higher energy-efficiency and 9$\times$ speedup over ReRAM-based acceleration, or roughly $\sim$9.7$\times$ higher energy-efficiency and 13.5$\times$ speedup over recent CMOS-only approaches, while maintaining inference accuracy comparable to baseline designs. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07864v1 |
http://arxiv.org/pdf/1904.07864v1.pdf | |
PWC | https://paperswithcode.com/paper/processing-in-memory-acceleration-of |
Repo | |
Framework | |
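The numpy sketch below illustrates why AND-accumulation suffices for low bit-width inference: when activations and weights are both binarized to {0, 1}, an ordinary multiply-accumulate dot product equals the count of positions where the bitwise AND is 1, which is the kind of operation the in-memory sub-arrays can perform.

```python
# For {0, 1} operands, dot product == popcount of the bitwise AND.
import numpy as np

rng = np.random.default_rng(0)
act = rng.integers(0, 2, size=64)           # binarized activations
wgt = rng.integers(0, 2, size=64)           # binarized weights

and_accumulate = np.sum(act & wgt)          # AND then accumulate
dot_product = int(act @ wgt)                # ordinary multiply-accumulate
print(and_accumulate, dot_product, and_accumulate == dot_product)  # equal
```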
FCA2VEC: Embedding Techniques for Formal Concept Analysis
Title | FCA2VEC: Embedding Techniques for Formal Concept Analysis |
Authors | Dominik Dürrschnabel, Tom Hanika, Maximilian Stubbemann |
Abstract | Embedding large and high-dimensional data into low-dimensional vector spaces is a necessary task to computationally cope with contemporary data sets. Superseding latent semantic analysis, recent approaches like word2vec or node2vec are well-established tools in this realm. In the present paper we add to this line of research by introducing fca2vec, a family of embedding techniques for formal concept analysis (FCA). Our investigation contributes to two distinct lines of research. First, we enable the application of FCA notions to large data sets. In particular, we demonstrate how the cover relation of a concept lattice can be retrieved from a computationally feasible embedding. Second, we show an enhancement of the classical node2vec approach in low dimension. For both directions, the overall FCA constraint of explainable results is preserved. We evaluate our novel procedures by computing fca2vec on different data sets such as wiki44 (a dense part of the Wikidata knowledge graph), the Mushroom data set and a publication network derived from the FCA community. |
Tasks | |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11496v1 |
https://arxiv.org/pdf/1911.11496v1.pdf | |
PWC | https://paperswithcode.com/paper/fca2vec-embedding-techniques-for-formal |
Repo | |
Framework | |
A Survey on Deep Learning Architectures for Image-based Depth Reconstruction
Title | A Survey on Deep Learning Architectures for Image-based Depth Reconstruction |
Authors | Hamid Laga |
Abstract | Estimating depth from RGB images is a long-standing ill-posed problem, which has been explored for decades by the computer vision, graphics, and machine learning communities. In this article, we provide a comprehensive survey of the recent developments in this field. We focus on works that use deep learning techniques to estimate depth from one or multiple images. Deep learning, coupled with the availability of large training datasets, has revolutionized the way the depth reconstruction problem is approached by the research community. In this article, we survey more than 100 key contributions that appeared in the past five years, summarize the most commonly used pipelines, and discuss their benefits and limitations. Looking back at what has been achieved so far, we also conjecture what the future may hold for learning-based depth reconstruction research. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06113v1 |
https://arxiv.org/pdf/1906.06113v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-deep-learning-architectures-for |
Repo | |
Framework | |
Robust modal regression with direct log-density derivative estimation
Title | Robust modal regression with direct log-density derivative estimation |
Authors | Hiroaki Sasaki, Tomoya Sakai, Takafumi Kanamori |
Abstract | Modal regression is aimed at estimating the global mode (i.e., global maximum) of the conditional density function of the output variable given input variables, and has led to regression methods robust against heavy-tailed or skewed noise. The conditional mode is often estimated through maximization of the modal regression risk (MRR). In order to apply a gradient method for the maximization, the fundamental challenge is accurate approximation of the gradient of the MRR, not the MRR itself. To overcome this challenge, in this paper, we take a novel approach of directly approximating the gradient of the MRR. To approximate the gradient, we develop kernelized and neural-network-based versions of the least-squares log-density derivative estimator, which directly approximates the derivative of the log-density without density estimation. With direct approximation of the MRR gradient, we first propose a modal regression method with kernels, and derive a new parameter update rule based on a fixed-point method. The derived update rule is then theoretically proved to have a monotonic hill-climbing property towards the conditional mode. Furthermore, we indicate that our approach of directly approximating the gradient is compatible with recent sophisticated stochastic gradient methods (e.g., Adam), and then propose another modal regression method based on neural networks. Finally, the superior performance of the proposed methods is demonstrated on various artificial and benchmark datasets. |
Tasks | Density Estimation |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08280v1 |
https://arxiv.org/pdf/1910.08280v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-modal-regression-with-direct-log |
Repo | |
Framework | |
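For intuition about the fixed-point, hill-climbing flavor of the update rule described above, the numpy sketch below runs a classical kernel (mean-shift-style) fixed-point iteration toward the conditional mode. It is not the paper's least-squares log-density derivative estimator; it only illustrates the "iterate a fixed-point rule to climb to the mode" idea on heavy-tailed data.

```python
# Classical kernel fixed-point iteration for the conditional mode.
import numpy as np

def conditional_mode(x_query, X, Y, y0, h_x=0.5, h_y=0.5, iters=50):
    wx = np.exp(-0.5 * ((X - x_query) / h_x) ** 2)       # weights over training inputs
    y = y0
    for _ in range(iters):
        wy = np.exp(-0.5 * ((Y - y) / h_y) ** 2)
        w = wx * wy
        y = np.sum(w * Y) / np.sum(w)                    # fixed-point / mean-shift step
    return y

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, 500)
Y = np.sin(X) + rng.standard_t(df=2, size=500) * 0.1     # heavy-tailed noise
print(conditional_mode(0.0, X, Y, y0=np.median(Y)))      # close to sin(0) = 0
```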
Continuous-time Discounted Mirror-Descent Dynamics in Monotone Concave Games
Title | Continuous-time Discounted Mirror-Descent Dynamics in Monotone Concave Games |
Authors | Bolin Gao, Lacra Pavel |
Abstract | In this paper, we consider concave continuous-kernel games characterized by monotonicity properties and propose discounted mirror descent-type dynamics. We introduce two classes of dynamics whereby the associated mirror map is constructed based on a strongly convex or a Legendre regularizer. Depending on the properties of the regularizer we show that these new dynamics can converge asymptotically in concave games with monotone (negative) pseudo-gradient. Furthermore, we show that when the regularizer enjoys strong convexity, the resulting dynamics can converge even in games with hypo-monotone (negative) pseudo-gradient, which corresponds to a shortage of monotonicity. |
Tasks | |
Published | 2019-12-07 |
URL | https://arxiv.org/abs/1912.03460v1 |
https://arxiv.org/pdf/1912.03460v1.pdf | |
PWC | https://paperswithcode.com/paper/continuous-time-discounted-mirror-descent |
Repo | |
Framework | |
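As a numerical illustration, the sketch below Euler-integrates one common form of discounted mirror-descent dynamics, $\dot{z} = v(x) - \gamma z$ with $x$ given by an entropic (softmax) mirror map, on a two-player zero-sum game, whose pseudo-gradient is monotone. The specific form of the dynamics and the game are assumptions for the sketch and may differ from the exact dynamics analyzed in the paper; with discounting, the trajectories settle at the mixed equilibrium.

```python
# Euler integration of discounted mirror-descent dynamics with a softmax map.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

A = np.array([[1.0, -1.0], [-1.0, 1.0]])      # matching-pennies payoff for player 1
gamma, dt, steps = 0.5, 0.01, 20000
z1, z2 = np.zeros(2), np.zeros(2)             # dual (score) variables

for _ in range(steps):
    x1, x2 = softmax(z1), softmax(z2)         # primal strategies via the mirror map
    v1, v2 = A @ x2, -A.T @ x1                # pseudo-gradients of the two players
    z1 = z1 + dt * (v1 - gamma * z1)
    z2 = z2 + dt * (v2 - gamma * z2)

print(softmax(z1), softmax(z2))               # approx. [0.5, 0.5] each (mixed equilibrium)
```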