January 26, 2020


Paper Group ANR 1375


Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications


Title	Fault Matters: Sensor Data Fusion for Detection of Faults using Dempster-Shafer Theory of Evidence in IoT-Based Applications
Authors	Nimisha Ghosh, Rourab Paul, Satyabrata Maity, Krishanu Maity, Sayantan Saha
Abstract	Fault detection in sensor nodes is a pertinent issue that has been an important area of research for a very long time. However, it has not yet been explored much in the context of the Internet of Things (IoT). IoT applications work with massive amounts of data, so the responsibility for guaranteeing the accuracy of that data also lies with them. Moreover, many important and critical decisions are made based on these data, so ensuring their correctness and accuracy is also very important. Also, the detection needs to be as precise as possible to avoid negative alerts. For this purpose, this work has adopted Dempster-Shafer Theory of Evidence, which is a popular learning method, to collate the information from sensors and reach a decision regarding the faulty status of a sensor node. To verify the validity of the proposed method, simulations have been performed on a benchmark data set and on data collected through a test bed in a laboratory set-up. For the different types of faults, the proposed method shows competitive accuracy for both the benchmark (99.8%) and laboratory (99.9%) data sets when compared to other state-of-the-art machine learning techniques.
Tasks	Fault Detection
Published	2019-06-24
URL	https://arxiv.org/abs/1906.09769v1
PDF	https://arxiv.org/pdf/1906.09769v1.pdf
PWC	https://paperswithcode.com/paper/fault-matters-sensor-data-fusion-for
Repo
Framework
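
The fusion step named in the abstract can be illustrated with Dempster's rule of combination. The following is a minimal sketch, not the authors' implementation: the two-hypothesis frame (`faulty`, `normal`), the function name `combine`, and the mass values for the two sensors are all assumptions made up for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the common hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalise by 1 - K so the fused masses sum to 1 again.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical frame of discernment and sensor evidence (illustrative values).
FAULTY = frozenset({"faulty"})
NORMAL = frozenset({"normal"})
EITHER = FAULTY | NORMAL  # mass left on "don't know"

sensor_a = {FAULTY: 0.6, NORMAL: 0.1, EITHER: 0.3}
sensor_b = {FAULTY: 0.7, NORMAL: 0.2, EITHER: 0.1}

fused = combine(sensor_a, sensor_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

When both sources lean towards the same hypothesis, the fused belief in it exceeds either individual belief, which is the property that makes the rule attractive for pooling readings from several sensors.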

A model for a Lindenmayer reconstruction algorithm


Title	A model for a Lindenmayer reconstruction algorithm
Authors	Diego Gabriel Krivochen, Beth Phillips
Abstract	Given an input string s and a specific Lindenmayer system (the so-called Fibonacci grammar), we define an automaton which is capable of (i) determining whether s belongs to the set of strings that the Fibonacci grammar can generate (in other words, if s corresponds to a generation of the grammar) and, if so, (ii) reconstructing the previous generation.
Tasks
Published	2019-01-24
URL	http://arxiv.org/abs/1901.08407v1
PDF	http://arxiv.org/pdf/1901.08407v1.pdf
PWC	https://paperswithcode.com/paper/a-model-for-a-lindenmayer-reconstruction
Repo
Framework
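
The two questions posed in the abstract (is s a generation, and what was the previous generation) can be sketched directly for a small Fibonacci grammar. This is a plain procedural sketch rather than the automaton the paper defines, and the alphabet, the rules a → ab and b → a, and the axiom "a" are assumptions chosen for illustration; the paper may use a different notation.

```python
# Assumed Fibonacci grammar: a -> ab, b -> a, starting from the axiom "a".
RULES = {"a": "ab", "b": "a"}
AXIOM = "a"


def rewrite(s):
    """Apply every rule in parallel to produce the next generation."""
    return "".join(RULES[symbol] for symbol in s)


def is_generation(s):
    """Check membership by generating forward until the length is reached."""
    generation = AXIOM
    while len(generation) < len(s):
        generation = rewrite(generation)
    return generation == s


def previous_generation(s):
    """Invert the rewriting: 'ab' came from 'a', a lone 'a' came from 'b'.

    Returns None when s cannot be parsed as the output of the rules.
    """
    out = []
    i = 0
    while i < len(s):
        if s[i] != "a":
            return None  # a 'b' can only occur directly after an 'a'
        if i + 1 < len(s) and s[i + 1] == "b":
            out.append("a")
            i += 2
        else:
            out.append("b")
            i += 1
    return "".join(out)


candidate = "abaab"
if is_generation(candidate):
    print(previous_generation(candidate))
```

Because each generation is strictly longer than the one before it from the second generation onwards, the forward check terminates after a number of rewrites logarithmic in the length of s.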

Surf at MEDIQA 2019: Improving Performance of Natural Language Inference in the Clinical Domain by Adopting Pre-trained Language Model


Title	Surf at MEDIQA 2019: Improving Performance of Natural Language Inference in the Clinical Domain by Adopting Pre-trained Language Model
Authors	Jiin Nam, Seunghyun Yoon, Kyomin Jung
Abstract	While deep learning techniques have shown promising results in many natural language processing (NLP) tasks, they have not been widely applied to the clinical domain. The lack of large datasets and the pervasive use of domain-specific language (i.e. abbreviations and acronyms) in the clinical domain cause slower progress than in general NLP tasks. To fill this gap, we employ word/subword-level models that adopt large-scale data-driven methods such as pre-trained language models and transfer learning to analyze text in the clinical domain. Empirical results demonstrate the superiority of the proposed methods, which achieve 90.6% accuracy on a medical-domain natural language inference task. Furthermore, we inspect the independent strengths of the proposed approaches both quantitatively and qualitatively. This analysis will help researchers select the necessary components when building models for the medical domain.
Tasks	Language Modelling, Natural Language Inference, Transfer Learning
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07854v1
PDF	https://arxiv.org/pdf/1906.07854v1.pdf
PWC	https://paperswithcode.com/paper/surf-at-mediqa-2019-improving-performance-of
Repo
Framework

Self-Binarizing Networks


Title	Self-Binarizing Networks
Authors	Fayez Lahoud, Radhakrishna Achanta, Pablo Márquez-Neila, Sabine Süsstrunk
Abstract	We present a method to train self-binarizing neural networks, that is, networks that evolve their weights and activations during training to become binary. To obtain similar binary networks, existing methods rely on the sign activation function. This function, however, has no gradients for non-zero values, which makes standard backpropagation impossible. To circumvent the difficulty of training a network relying on the sign activation function, these methods alternate between floating-point and binary representations of the network during training, which is sub-optimal and inefficient. We approach the binarization task by training on a unique representation involving a smooth activation function, which is iteratively sharpened during training until it becomes a binary representation equivalent to the sign activation function. Additionally, we introduce a new technique to perform binary batch normalization that simplifies the conventional batch normalization by transforming it into a simple comparison operation. This is unlike existing methods, which are forced to retain the conventional floating-point-based batch normalization. Our binary networks, apart from displaying advantages of lower memory and computation as compared to conventional floating-point and binary networks, also show higher classification accuracy than existing state-of-the-art methods on multiple benchmark datasets.
Tasks
Published	2019-02-02
URL	http://arxiv.org/abs/1902.00730v1
PDF	http://arxiv.org/pdf/1902.00730v1.pdf
PWC	https://paperswithcode.com/paper/self-binarizing-networks
Repo
Framework
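
Two ideas from the abstract lend themselves to a small numeric illustration. The sketch below assumes a scaled tanh as the smooth activation (the abstract does not name the function) and uses the standard algebraic fact that taking the sign of an affine batch-normalization output is equivalent to comparing the input with a threshold. All function names and parameter values are hypothetical, and this is not the paper's code.

```python
import math


def sharpened_tanh(x, nu):
    """Smooth activation that approaches sign(x) as the scale nu grows."""
    return math.tanh(nu * x)


def sign(x):
    return (x > 0) - (x < 0)


def bn_then_sign(x, mean, std, gamma, beta):
    """Floating-point batch normalization followed by binarization."""
    y = gamma * (x - mean) / std + beta
    return 1 if y >= 0 else -1


def bn_as_comparison(x, mean, std, gamma, beta):
    """Same result using one comparison (assumes gamma != 0 and std > 0)."""
    threshold = mean - beta * std / gamma
    if gamma > 0:
        return 1 if x >= threshold else -1
    return 1 if x <= threshold else -1  # negative gamma flips the inequality


inputs = [-2.0, -0.3, 0.3, 2.0]
for nu in (1, 10, 1000):
    print(nu, [round(sharpened_tanh(x, nu), 3) for x in inputs])
```

The threshold can be precomputed once per channel after training, so the normalization costs a single comparison at inference time in this simplified setting.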

Compositional Deep Learning


Title	Compositional Deep Learning
Authors	Bruno Gavranović
Abstract	Neural networks have become an increasingly popular tool for solving many real-world problems. They are a general framework for differentiable optimization which includes many other machine learning approaches as special cases. In this thesis we build a category-theoretic formalism around a class of neural networks exemplified by CycleGAN. CycleGAN is a collection of neural networks, closed under composition, whose inductive bias is increased by enforcing composition invariants, i.e. cycle-consistencies. Inspired by Functorial Data Migration, we specify the interconnection of these networks using a categorical schema, and network instances as set-valued functors on this schema. We also frame neural network architectures, datasets, models, and a number of other concepts in a categorical setting and thus show a special class of functors, rather than functions, can be learned using gradient descent. We use the category-theoretic framework to conceive a novel neural network architecture whose goal is to learn the task of object insertion and object deletion in images with unpaired data. We test the architecture on three different datasets and obtain promising results.
Tasks
Published	2019-07-16
URL	https://arxiv.org/abs/1907.08292v1
PDF	https://arxiv.org/pdf/1907.08292v1.pdf
PWC	https://paperswithcode.com/paper/compositional-deep-learning
Repo
Framework

Improving Multi-Head Attention with Capsule Networks


Title	Improving Multi-Head Attention with Capsule Networks
Authors	Shuhao Gu, Yang Feng
Abstract	Multi-head attention advances neural machine translation by working out multiple versions of attention in different subspaces, but the neglect of semantic overlapping between subspaces increases the difficulty of translation and consequently hinders the further improvement of translation performance. In this paper, we employ capsule networks to comb the information from the multiple heads of the attention so that similar information can be clustered and unique information can be preserved. To this end, we adopt two routing mechanisms, Dynamic Routing and EM Routing, to perform the clustering and separation. We conducted experiments on Chinese-to-English and English-to-German translation tasks and obtained consistent improvements over the strong Transformer baseline.
Tasks	Machine Translation
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00188v1
PDF	https://arxiv.org/pdf/1909.00188v1.pdf
PWC	https://paperswithcode.com/paper/improving-multi-head-attention-with-capsule
Repo
Framework

Towards Latent Space Optimality for Auto-Encoder Based Generative Models


Title	Towards Latent Space Optimality for Auto-Encoder Based Generative Models
Authors	Arnab Kumar Mondal, Sankalan Pal Chowdhury, Aravind Jayendran, Parag Singla, Himanshu Asnani, Prathosh AP
Abstract	The field of neural generative models is dominated by the highly successful Generative Adversarial Networks (GANs) despite their challenges, such as training instability and mode collapse. Auto-Encoders (AE) with regularized latent space provide an alternative framework for generative models, although their performance levels have not reached that of GANs. In this work, we identify one of the causes for the under-performance of AE-based models and propose a remedial measure. Specifically, we hypothesize that the dimensionality of the AE model’s latent space has a critical effect on the quality of the generated data. Under the assumption that nature generates data by sampling from a “true” generative latent space followed by a deterministic non-linearity, we show that the optimal performance is obtained when the dimensionality of the latent space of the AE-model matches that of the “true” generative latent space. Further, we propose an algorithm called the Latent Masked Generative Auto-Encoder (LMGAE), in which the dimensionality of the model’s latent space is brought closer to that of the “true” generative latent space, via a novel procedure to mask the spurious latent dimensions. We demonstrate through experiments on synthetic and several real-world datasets that the proposed formulation yields generation quality that is better than the state-of-the-art AE-based generative models and is comparable to that of GANs.
Tasks
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04564v1
PDF	https://arxiv.org/pdf/1912.04564v1.pdf
PWC	https://paperswithcode.com/paper/towards-latent-space-optimality-for-auto
Repo
Framework

Graduate Employment Prediction with Bias


Title	Graduate Employment Prediction with Bias
Authors	Teng Guo, Feng Xia, Shihao Zhen, Xiaomei Bai, Dongyu Zhang, Zitao Liu, Jiliang Tang
Abstract	Failure to land a job could cause serious social consequences for college students, such as drunkenness and suicide. In addition to academic performance, unconscious biases can become a key obstacle for graduating students seeking jobs. Thus, it is necessary to understand these unconscious biases so that we can help these students at an early stage with more personalized intervention. In this paper, we develop a framework, i.e., MAYA (Multi-mAjor emploYment stAtus), to predict students’ employment status while considering biases. The framework consists of four major components. Firstly, we solve the heterogeneity of student courses by embedding academic performance into a unified space. Secondly, we apply a generative adversarial network (GAN) to overcome the class imbalance problem. Thirdly, we adopt Long Short-Term Memory (LSTM) with a novel dropout mechanism to comprehensively capture sequential information among semesters. Finally, we design a bias-based regularization to capture the job market biases. We conduct extensive experiments on a large-scale educational dataset and the results demonstrate the effectiveness of our prediction framework.
Tasks
Published	2019-12-27
URL	https://arxiv.org/abs/1912.12012v1
PDF	https://arxiv.org/pdf/1912.12012v1.pdf
PWC	https://paperswithcode.com/paper/graduate-employment-prediction-with-bias
Repo
Framework

Recognizing Part Attributes with Insufficient Data


Title	Recognizing Part Attributes with Insufficient Data
Authors	Xiangyun Zhao, Yi Yang, Feng Zhou, Xiao Tan, Yuchen Yuan, Yingze Bao, Ying Wu
Abstract	Recognizing attributes of objects and their parts is important to many computer vision applications. Although great progress has been made to apply object-level recognition, recognizing the attributes of parts remains less applicable since the training data for part attributes recognition is usually scarce especially for internet-scale applications. Furthermore, most existing part attribute recognition methods rely on the part annotation which is more expensive to obtain. To solve the data insufficiency problem and get rid of dependence on the part annotation, we introduce a novel Concept Sharing Network (CSN) for part attribute recognition. A great advantage of CSN is its capability of recognizing the part attribute (a combination of part location and appearance pattern) that has insufficient or zero training data, by learning the part location and appearance pattern respectively from the training data that usually mix them in a single label. Extensive experiments on CUB-200-2011 [51], CelebA [35] and a newly proposed human attribute dataset demonstrate the effectiveness of CSN and its advantages over other methods, especially for the attributes with few training samples. Further experiments show that CSN can also perform zero-shot part attribute recognition. The code will be made available at https://github.com/Zhaoxiangyun/Concept-Sharing-Network.
Tasks
Published	2019-08-09
URL	https://arxiv.org/abs/1908.03335v2
PDF	https://arxiv.org/pdf/1908.03335v2.pdf
PWC	https://paperswithcode.com/paper/recognizing-part-attributes-with-insufficient
Repo
Framework

IITP at MEDIQA 2019: Systems Report for Natural Language Inference, Question Entailment and Question Answering


Title	IITP at MEDIQA 2019: Systems Report for Natural Language Inference, Question Entailment and Question Answering
Authors	Dibyanayan Bandyopadhyay, Baban Gain, Tanik Saikh, Asif Ekbal
Abstract	This paper presents the experiments carried out as part of our participation in the MEDIQA challenge, a shared task (Abacha et al., 2019). We participated in all three tasks defined in this shared task: (i) Natural Language Inference (NLI), (ii) Recognizing Question Entailment (RQE), and (iii) their application in medical Question Answering (QA). We submitted runs using multiple deep-learning-based systems for each of these three tasks: five system results for each of the NLI and RQE tasks, and four system results for the QA task. The systems yield encouraging results in all three tasks. The highest performance obtained in the NLI, RQE and QA tasks is 81.8%, 53.2%, and 71.7%, respectively.
Tasks	Natural Language Inference, Question Answering
Published	2019-06-14
URL	https://arxiv.org/abs/1906.06332v1
PDF	https://arxiv.org/pdf/1906.06332v1.pdf
PWC	https://paperswithcode.com/paper/iitp-at-mediqa-2019-systems-report-for
Repo
Framework

DeepLABNet: End-to-end Learning of Deep Radial Basis Networks with Fully Learnable Basis Functions


Title	DeepLABNet: End-to-end Learning of Deep Radial Basis Networks with Fully Learnable Basis Functions
Authors	Andrew Hryniowski, Alexander Wong
Abstract	From fully connected neural networks to convolutional neural networks, the learned parameters within a neural network have been primarily relegated to the linear parameters (e.g., convolutional filters). The non-linear functions (e.g., activation functions) have largely remained, with few exceptions in recent years, parameter-less, static throughout training, and seen limited variation in design. Largely ignored by the deep learning community, radial basis function (RBF) networks provide an interesting mechanism for learning more complex non-linear activation functions in addition to the linear parameters in a network. However, the interest in RBF networks has waned over time due to the difficulty of integrating RBFs into more complex deep neural network architectures in a tractable and stable manner. In this work, we present a novel approach that enables end-to-end learning of deep RBF networks with fully learnable activation basis functions in an automatic and tractable manner. We demonstrate that our approach for enabling the use of learnable activation basis functions in deep neural networks, which we will refer to as DeepLABNet, is an effective tool for automated activation function learning within complex network architectures.
Tasks
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09257v1
PDF	https://arxiv.org/pdf/1911.09257v1.pdf
PWC	https://paperswithcode.com/paper/deeplabnet-end-to-end-learning-of-deep-radial
Repo
Framework

Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection


Title	Bayesian Variational Autoencoders for Unsupervised Out-of-Distribution Detection
Authors	Erik Daxberger, José Miguel Hernández-Lobato
Abstract	Despite their successes, deep neural networks may make unreliable predictions when faced with test data drawn from a distribution different to that of the training data, constituting a major problem for AI safety. While this has recently motivated the development of methods to detect such out-of-distribution (OoD) inputs, a robust solution is still lacking. We propose a new probabilistic, unsupervised approach to this problem based on a Bayesian variational autoencoder model, which estimates a full posterior distribution over the decoder parameters using stochastic gradient Markov chain Monte Carlo, instead of fitting a point estimate. We describe how information-theoretic measures based on this posterior can then be used to detect OoD inputs both in input space and in the model’s latent space. We empirically show the effectiveness of our approach.
Tasks	Out-of-Distribution Detection
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05651v2
PDF	https://arxiv.org/pdf/1912.05651v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-variational-autoencoders-for-1
Repo
Framework

Learning to Generate Dense Point Clouds with Textures on Multiple Categories


Title	Learning to Generate Dense Point Clouds with Textures on Multiple Categories
Authors	Tao Hu, Geng Lin, Zhizhong Han, Matthias Zwicker
Abstract	3D reconstruction from images is a core problem in computer vision. With recent advances in deep learning, it has become possible to recover plausible 3D shapes even from single RGB images for the first time. However, obtaining detailed geometry and texture for objects with arbitrary topology remains challenging. In this paper, we propose a novel approach for reconstructing point clouds from RGB images. Unlike other methods, we can recover dense point clouds with hundreds of thousands of points, and we also include RGB textures. In addition, we train our model on multiple categories which leads to superior generalization to unseen categories compared to previous techniques. We achieve this using a two-stage approach, where we first infer an object coordinate map from the input RGB image, and then obtain the final point cloud using a reprojection and completion step. We show results on standard benchmarks that demonstrate the advantages of our technique. Code is available at https://github.com/TaoHuUMD/3D-Reconstruction.
Tasks	3D Reconstruction
Published	2019-12-22
URL	https://arxiv.org/abs/1912.10545v1
PDF	https://arxiv.org/pdf/1912.10545v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-generate-dense-point-clouds-with
Repo
Framework

Multi-hop Reading Comprehension via Deep Reinforcement Learning based Document Traversal


Title	Multi-hop Reading Comprehension via Deep Reinforcement Learning based Document Traversal
Authors	Alex Long, Joel Mason, Alan Blair, Wei Wang
Abstract	Reading Comprehension has received significant attention in recent years as high quality Question Answering (QA) datasets have become available. Despite state-of-the-art methods achieving strong overall accuracy, Multi-Hop (MH) reasoning remains particularly challenging. To address MH-QA specifically, we propose a Deep Reinforcement Learning based method capable of learning sequential reasoning across large collections of documents so as to pass a query-aware, fixed-size context subset to existing models for answer extraction. Our method is comprised of two stages: a linker, which decomposes the provided support documents into a graph of sentences, and an extractor, which learns where to look based on the current question and already-visited sentences. The result of the linker is a novel graph structure at the sentence level that preserves logical flow while still allowing rapid movement between documents. Importantly, we demonstrate that the sparsity of the resultant graph is invariant to context size. This translates to fewer decisions required from the Deep-RL trained extractor, allowing the system to scale effectively to large collections of documents. The importance of sequential decision making in the document traversal step is demonstrated by comparison to standard IE methods, and we additionally introduce a BM25-based IR baseline that retrieves documents relevant to the query only. We examine the integration of our method with existing models on the recently proposed QAngaroo benchmark and achieve consistent increases in accuracy across the board, as well as a 2-3x reduction in training time.
Tasks	Decision Making, Multi-Hop Reading Comprehension, Question Answering, Reading Comprehension
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09438v1
PDF	https://arxiv.org/pdf/1905.09438v1.pdf
PWC	https://paperswithcode.com/paper/multi-hop-reading-comprehension-via-deep
Repo
Framework
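
The abstract mentions a BM25-based IR baseline that retrieves documents relevant to the query only. A minimal sketch of BM25 scoring is given below for orientation; the whitespace tokenizer, the parameter values k1 = 1.5 and b = 0.75 (common defaults), the smoothed IDF variant, and the toy documents are assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = term_freq[term] + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical support documents and query.
documents = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
scores = bm25_scores("cat mat", documents)
ranking = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
print(ranking, [round(s, 3) for s in scores])
```

Such a baseline scores each document independently against the query, which is exactly what a sequential traversal policy is meant to improve on for multi-hop questions.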

Self-supervised audio representation learning for mobile devices


Title	Self-supervised audio representation learning for mobile devices
Authors	Marco Tagliasacchi, Beat Gfeller, Félix de Chaumont Quitry, Dominik Roblek
Abstract	We explore self-supervised models that can be potentially deployed on mobile devices to learn general purpose audio representations. Specifically, we propose methods that exploit the temporal context in the spectrogram domain. One method estimates the temporal gap between two short audio segments extracted at random from the same audio clip. The other methods are inspired by Word2Vec, a popular technique used to learn word embeddings, and aim at reconstructing a temporal spectrogram slice from past and future slices or, alternatively, at reconstructing the context of surrounding slices from the current slice. We focus our evaluation on small encoder architectures, which can be potentially run on mobile devices during both inference (re-using a common learned representation across multiple downstream tasks) and training (capturing the true data distribution without compromising users’ privacy when combined with federated learning). We evaluate the quality of the embeddings produced by the self-supervised learning models, and show that they can be re-used for a variety of downstream tasks, and for some tasks even approach the performance of fully supervised models of similar size.
Tasks	Representation Learning, Word Embeddings
Published	2019-05-24
URL	https://arxiv.org/abs/1905.11796v1
PDF	https://arxiv.org/pdf/1905.11796v1.pdf
PWC	https://paperswithcode.com/paper/190511796
Repo
Framework
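
The temporal-gap task described in the abstract needs training pairs made of two segments from the same clip plus a target gap. The sketch below shows one way such a pair could be sampled; the function name, the segment length, and the choice to normalise the gap by the clip length are assumptions for illustration and are not taken from the paper.

```python
import random


def sample_gap_pair(num_frames, segment_len, rng):
    """Pick two segment start frames from one clip and return their gap.

    Returns (first_start, second_start, normalised_gap), where the gap is
    the distance between the start frames divided by the clip length.
    """
    max_start = num_frames - segment_len
    if max_start < 0:
        raise ValueError("clip is shorter than one segment")
    first, second = sorted(rng.randint(0, max_start) for _ in range(2))
    gap = (second - first) / num_frames
    return first, second, gap


rng = random.Random(0)  # fixed seed so the sketch is reproducible
first, second, gap = sample_gap_pair(num_frames=1000, segment_len=96, rng=rng)
print(first, second, gap)
```

An encoder would then embed both segments, and a small regression head would be trained to predict the gap from the two embeddings, so no manual labels are needed.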