January 31, 2020


Paper Group AWR 420

A Constructive Approach for One-Shot Training of Neural Networks Using Hypercube-Based Topological Coverings. Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator. SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression. Pre-Training of Deep Bid …

A Constructive Approach for One-Shot Training of Neural Networks Using Hypercube-Based Topological Coverings


Title	A Constructive Approach for One-Shot Training of Neural Networks Using Hypercube-Based Topological Coverings
Authors	W. Brent Daniel, Enoch Yeung
Abstract	In this paper we present a novel constructive approach for training deep neural networks using geometric approaches. We show that a topological covering can be used to define a class of distributed linear matrix inequalities, which in turn directly specify the shape and depth of a neural network architecture. The key insight is a fundamental relationship between linear matrix inequalities and their ability to bound the shape of data, and the rectified linear unit (ReLU) activation function employed in modern neural networks. We show that unit cover geometry and cover porosity are two design variables in cover-constructive learning that play a critical role in defining the complexity of the model and generalizability of the resulting neural network classifier. In the context of cover-constructive learning, these findings underscore the age-old trade-off between model complexity and overfitting (as quantified by the number of elements in the data cover) and generalizability on test data. Finally, we benchmark our algorithm on the Iris, MNIST, and Wine datasets and show that the constructive algorithm is able to train a deep neural network classifier in one shot, achieving equal or superior levels of training and test classification accuracy with reduced training time.
Tasks
Published	2019-01-09
URL	http://arxiv.org/abs/1901.02878v1
PDF	http://arxiv.org/pdf/1901.02878v1.pdf
PWC	https://paperswithcode.com/paper/a-constructive-approach-for-one-shot-training
Repo	https://github.com/Vassago1911/nnet_cubes
Framework	none
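
The link the abstract draws between linear inequalities and ReLU can be illustrated with a small sketch. It assumes axis-aligned hypercubes given by a center and a half-width, and it is not the paper's construction: the function names and the example cover are hypothetical. A point lies inside a cube exactly when every bounding inequality holds, which is the same as the sum of ReLU-activated violations being zero.

```python
def relu(z):
    return max(0.0, z)


def cube_violation(x, center, half_width):
    """Sum of ReLU-activated violations of the 2*d inequalities bounding a cube."""
    total = 0.0
    for xi, ci in zip(x, center):
        total += relu(xi - ci - half_width)  # upper face: xi <= ci + half_width
        total += relu(ci - xi - half_width)  # lower face: xi >= ci - half_width
    return total


def in_cover(x, cubes):
    """A point is covered if it lies inside at least one cube (zero violation)."""
    return any(cube_violation(x, c, r) == 0.0 for c, r in cubes)


# Hypothetical two-cube cover of one class in 2D.
cover = [((0.0, 0.0), 1.0), ((3.0, 3.0), 0.5)]
inside = in_cover((0.5, -0.5), cover)
outside = in_cover((2.0, 2.0), cover)
print(inside, outside)
```

Adding more, smaller cubes makes the cover fit the training data more tightly, which is the complexity versus generalizability trade-off the abstract describes.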

Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator


Title	Rethinking Softmax with Cross-Entropy: Neural Network Classifier as Mutual Information Estimator
Authors	Zhenyue Qin, Dongwoo Kim, Tom Gedeon
Abstract	Mutual information is widely applied to learn latent representations of observations, whilst its implication in classification neural networks remains to be better explained. We show that optimising the parameters of classification neural networks with softmax cross-entropy is equivalent to maximising the mutual information between inputs and labels under the balanced data assumption. Through experiments on synthetic and real datasets, we show that softmax cross-entropy can estimate mutual information approximately. When applied to image classification, this relation helps approximate the point-wise mutual information between an input image and a label without modifying the network structure. To this end, we propose infoCAM, informative class activation map, which highlights regions of the input image that are the most relevant to a given label based on differences in information. The activation map helps localise the target object in an input image. Through experiments on the semi-supervised object localisation task with two real-world datasets, we evaluate the effectiveness of our information-theoretic approach.
Tasks	Fine-Grained Image Classification, Image Classification, Weakly-Supervised Object Localization
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10688v3
PDF	https://arxiv.org/pdf/1911.10688v3.pdf
PWC	https://paperswithcode.com/paper/rethinking-softmax-with-cross-entropy-neural
Repo	https://github.com/ZhenyueQin/Research-Softmax-with-Mutual-Information
Framework	tf
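
One way to see the connection is a standard variational bound, shown here as an illustration and not necessarily the exact estimator used in the paper. With K balanced classes, H(Y) = log K, and the expected softmax cross-entropy upper-bounds H(Y|X), so log K minus the cross-entropy gives a lower-bound estimate of I(X;Y). The logits below are made up for the example.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(batch_logits, labels):
    """Mean negative log-probability of the true label, in nats."""
    losses = [-math.log(softmax(l)[y]) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


def mi_lower_bound(batch_logits, labels, num_classes):
    """log K - CE: a bound on I(X;Y) that assumes balanced classes."""
    return math.log(num_classes) - cross_entropy(batch_logits, labels)


labels = [0, 1]
uninformative = [[0.0, 0.0], [0.0, 0.0]]    # classifier ignores the input
confident = [[10.0, -10.0], [-10.0, 10.0]]  # classifier is nearly certain
bound_uninformative = mi_lower_bound(uninformative, labels, 2)
bound_confident = mi_lower_bound(confident, labels, 2)
print(bound_uninformative, bound_confident)
```

A classifier that ignores its input yields a bound of about zero, while a confident and correct one approaches log K, the maximum mutual information for K balanced classes.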

SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression


Title	SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression
Authors	Christos Baziotis, Ion Androutsopoulos, Ioannis Konstas, Alexandros Potamianos
Abstract	Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compression, where the first and last sequences are the input and reconstructed sentences, respectively, while the middle sequence is the compressed sentence. Constraining the length of the latent word sequences forces the model to distill important information from the input. A pretrained language model, acting as a prior over the latent sequences, encourages the compressed sentences to be human-readable. Continuous relaxations enable us to sample from categorical distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. The proposed model does not require parallel text-summary pairs, achieving promising results in unsupervised sentence compression on benchmark datasets.
Tasks	Language Modelling, Sentence Compression, Unsupervised Sentence Compression
Published	2019-04-07
URL	https://arxiv.org/abs/1904.03651v2
PDF	https://arxiv.org/pdf/1904.03651v2.pdf
PWC	https://paperswithcode.com/paper/seq3-differentiable-sequence-to-sequence-to
Repo	https://github.com/cbaziotis/seq3
Framework	pytorch
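
The abstract says continuous relaxations make sampling from categorical distributions differentiable. The sketch below assumes the Gumbel-softmax relaxation, one widely used choice; the logits and the temperature are hypothetical values, and the code only shows the sampling step, not the full autoencoder.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) via inverse transform
        noisy.append((logit + gumbel) / temperature)
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
soft_sample = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
print(soft_sample)
```

Lower temperatures push the sample toward a one-hot vector, which approximates a discrete word choice while keeping the operation differentiable with respect to the logits.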

Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information


Title	Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information
Authors	Seonwoo Min, Seunghyun Park, Siwon Kim, Hyun-Soo Choi, Sungroh Yoon
Abstract	Motivation: Bridging the exponential gap between the number of unlabeled and labeled protein sequences, a couple of works have adopted semi-supervised learning for protein sequence modeling. They pre-train a model with a substantial amount of unlabeled data and transfer the learned representations to various downstream tasks. Nonetheless, the current pre-training methods mostly rely on a language modeling pre-training task and often show limited performance. Therefore, a pertinent protein-specific pre-training task is necessary to better capture the information contained within the protein sequences. Results: In this paper, we introduce a novel pre-training scheme called PLUS, which stands for Protein sequence representations Learned Using Structural information. PLUS consists of masked language modeling and a protein-specific pre-training task, namely same family prediction. PLUS can be used to pre-train various model architectures. In this work, we mainly use PLUS to pre-train a recurrent neural network (RNN) and refer to the resulting model as PLUS-RNN. It advances the state-of-the-art pre-training methods on six out of seven tasks, i.e., (1) three protein(-pair)-level classification, (2) two protein-level regression, and (3) two amino-acid-level classification tasks. Furthermore, we present results from our ablation studies and qualitative interpretation analyses to better understand the strengths of PLUS-RNN. Availability: The codes and pre-trained models are available at https://github.com/mswzeus/PLUS/
Tasks	Language Modelling
Published	2019-11-25
URL	https://arxiv.org/abs/1912.05625v2
PDF	https://arxiv.org/pdf/1912.05625v2.pdf
PWC	https://paperswithcode.com/paper/pre-training-of-deep-bidirectional-protein
Repo	https://github.com/mswzeus/PLUS
Framework	pytorch
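
The two pre-training tasks named in the abstract can be sketched as data preparation steps. This is a simplified illustration under stated assumptions: the mask token, the 15% mask rate, the toy sequence, and the family identifiers are all hypothetical and are not taken from the PLUS code.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def mask_sequence(residues, mask_rate, rng):
    """Return (masked_input, targets); a target is None where nothing was masked."""
    masked, targets = [], []
    for residue in residues:
        if rng.random() < mask_rate:
            masked.append(MASK)
            targets.append(residue)  # the model must recover this residue
        else:
            masked.append(residue)
            targets.append(None)
    return masked, targets


def same_family_label(family_a, family_b):
    """Binary target for a protein pair: 1 if both belong to the same family."""
    return 1 if family_a == family_b else 0


rng = random.Random(0)
masked, targets = mask_sequence(list("MKTAYIAKQR"), mask_rate=0.15, rng=rng)
print(masked, targets)
```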

Multi-Channel Graph Neural Network for Entity Alignment


Title	Multi-Channel Graph Neural Network for Entity Alignment
Authors	Yixin Cao, Zhiyuan Liu, Chengjiang Li, Zhiyuan Liu, Juanzi Li, Tat-Seng Chua
Abstract	Entity alignment typically suffers from the issues of structural heterogeneity and limited seed alignments. In this paper, we propose a novel Multi-channel Graph Neural Network model (MuGNN) to learn alignment-oriented knowledge graph (KG) embeddings by robustly encoding two KGs via multiple channels. Each channel encodes KGs via different relation weighting schemes with respect to self-attention towards KG completion and cross-KG attention for pruning exclusive entities respectively, which are further combined via pooling techniques. Moreover, we also infer and transfer rule knowledge for completing two KGs consistently. MuGNN is expected to reconcile the structural differences of two KGs, and thus make better use of seed alignments. Extensive experiments on five publicly available datasets demonstrate our superior performance (5% Hits@1 up on average).
Tasks	Entity Alignment
Published	2019-08-26
URL	https://arxiv.org/abs/1908.09898v1
PDF	https://arxiv.org/pdf/1908.09898v1.pdf
PWC	https://paperswithcode.com/paper/multi-channel-graph-neural-network-for-entity-1
Repo	https://github.com/thunlp/MuGNN
Framework	pytorch

Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks


Title	Deep pNML: Predictive Normalized Maximum Likelihood for Deep Neural Networks
Authors	Koby Bibas, Yaniv Fogel, Meir Feder
Abstract	The Predictive Normalized Maximum Likelihood (pNML) scheme has been recently suggested for universal learning in the individual setting, where both the training and test samples are individual data. The goal of universal learning is to compete with a "genie" or reference learner that knows the data values, but is restricted to use a learner from a given model class. The pNML minimizes the associated regret for any possible value of the unknown label. Furthermore, its min-max regret can serve as a pointwise measure of learnability for the specific training and data sample. In this work we examine the pNML and its associated learnability measure for the Deep Neural Network (DNN) model class. As shown, the pNML outperforms the commonly used Empirical Risk Minimization (ERM) approach and provides robustness against adversarial attacks. Together with its learnability measure it can detect out-of-distribution test examples, be tolerant to noisy labels and serve as a confidence measure for the ERM. Finally, we extend the pNML to a "twice universal" solution, that provides universality for model class selection and generates a learner competing with the best one from all model classes.
Tasks
Published	2019-04-28
URL	https://arxiv.org/abs/1904.12286v2
PDF	https://arxiv.org/pdf/1904.12286v2.pdf
PWC	https://paperswithcode.com/paper/deep-pnml-predictive-normalized-maximum
Repo	https://github.com/kobybibas/deep_pnml_experiments
Framework	pytorch
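
The normalization step of pNML can be sketched as follows. For each candidate label, the genie refits the model with the test point assigned that label and reports the probability it then gives to that label; pNML normalizes these values, and the log of the normalizer is the regret used as the learnability measure. The refitting itself is omitted here, and the input probabilities are made-up numbers.

```python
import math


def pnml(genie_probs):
    """genie_probs[y]: probability a model refit with label y assigns to y."""
    normalizer = sum(genie_probs)
    prediction = [p / normalizer for p in genie_probs]
    regret = math.log(normalizer)  # larger regret means a less learnable test point
    return prediction, regret


# Hypothetical genie probabilities for a three-class problem.
prediction, regret = pnml([0.9, 0.6, 0.3])
print(prediction, regret)
```

When refitting can make every label look plausible, the normalizer grows and so does the regret, which is why the abstract suggests it as a signal for out-of-distribution inputs.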


Neural Related Work Summarization with a Joint Context-driven Attention Mechanism


Title	Neural Related Work Summarization with a Joint Context-driven Attention Mechanism
Authors	Yongzhen Wang, Xiaozhong Liu, Zheng Gao
Abstract	Conventional solutions to automatic related work summarization rely heavily on human-engineered features. In this paper, we develop a neural data-driven summarizer by leveraging the seq2seq paradigm, in which a joint context-driven attention mechanism is proposed to measure the contextual relevance within full texts and a heterogeneous bibliography graph simultaneously. Our motivation is to maintain the topic coherency between a related work section and its target document, where both the textual and graphic contexts play a big role in characterizing the relationship among scientific publications accurately. Experimental results on a large dataset show that our approach achieves a considerable improvement over a typical seq2seq summarizer and five classical summarization baselines.
Tasks
Published	2019-01-28
URL	http://arxiv.org/abs/1901.09492v1
PDF	http://arxiv.org/pdf/1901.09492v1.pdf
PWC	https://paperswithcode.com/paper/neural-related-work-summarization-with-a
Repo	https://github.com/kuadmu/2018EMNLP
Framework	none

Efficient Method for Categorize Animals in the Wild


Title	Efficient Method for Categorize Animals in the Wild
Authors	Abulikemu Abuduweili, Xin Wu, Xingchen Tao
Abstract	Automatic species classification in camera traps would greatly help biodiversity monitoring and species analysis on Earth. To accelerate the development of automatic species classification, “Microsoft AI for Earth” prepared a challenge for the FGVC6 workshop at CVPR 2019, called the “iWildCam 2019 competition”. In this work, we propose an efficient method for categorizing animals in the wild. We transfer state-of-the-art ImageNet-pretrained models to the problem. To improve the generalization and robustness of the model, we utilize efficient image augmentation and regularization strategies, such as cutout, mixup and label smoothing. Finally, we use ensemble learning to increase the performance of the model. Thanks to advanced regularization strategies and ensemble learning, we placed in the top 7 of 336 on the final leaderboard. Source code of this work is available at https://github.com/Walleclipse/iWildCam_2019_FGVC6
Tasks	Image Augmentation
Published	2019-07-30
URL	https://arxiv.org/abs/1907.13037v1
PDF	https://arxiv.org/pdf/1907.13037v1.pdf
PWC	https://paperswithcode.com/paper/efficient-method-for-categorize-animals-in
Repo	https://github.com/Walleclipse/iWildCam_2019_FGVC6
Framework	pytorch
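
Two of the regularization strategies named in the abstract, label smoothing and mixup, are simple enough to sketch in a few lines. This is a generic illustration, not the competition code: the feature vectors are toy lists standing in for images, and the smoothing factor and the Beta parameter are assumed values.

```python
import random


def smooth_labels(label, num_classes, epsilon):
    """Label smoothing: move epsilon of the probability mass to a uniform distribution."""
    return [(1.0 - epsilon) * (1.0 if k == label else 0.0) + epsilon / num_classes
            for k in range(num_classes)]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two inputs and of their (soft) label vectors."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


rng = random.Random(0)
lam = rng.betavariate(0.2, 0.2)  # mixing weight from Beta(alpha, alpha); alpha = 0.2 is assumed
y_a = smooth_labels(0, 3, epsilon=0.1)
y_b = smooth_labels(2, 3, epsilon=0.1)
x_mix, y_mix = mixup([1.0, 0.0], y_a, [0.0, 1.0], y_b, lam)
print(lam, x_mix, y_mix)
```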

Visual Discourse Parsing


Title	Visual Discourse Parsing
Authors	Arjun R Akula, Song-Chun Zhu
Abstract	Text-level discourse parsing aims to unmask how two segments (or sentences) in the text are related to each other. We propose the task of Visual Discourse Parsing, which requires understanding discourse relations among scenes in a video. Here we use the term scene to refer to a subset of video frames that can better summarize the video. In order to collect a dataset for learning discourse cues from videos, one needs to manually identify the scenes from a large pool of video frames and then annotate the discourse relations between them. This is clearly a time-consuming, expensive and tedious task. In this work, we propose an approach to identify discourse cues from the videos without the need to explicitly identify and annotate the scenes. We also present a novel dataset containing 310 videos and the corresponding discourse cues to evaluate our approach. We believe that many of the multi-discipline Artificial Intelligence problems such as Visual Dialog and Visual Storytelling would greatly benefit from the use of visual discourse cues.
Tasks	Visual Dialog, Visual Storytelling
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02252v2
PDF	http://arxiv.org/pdf/1903.02252v2.pdf
PWC	https://paperswithcode.com/paper/visual-discourse-parsing
Repo	https://github.com/arjunakula/Visual-Discourse-Parsing
Framework	none

Tapering Analysis of Airways with Bronchiectasis


Title	Tapering Analysis of Airways with Bronchiectasis
Authors	Kin Quan, Rebecca J. Shipley, Ryutaro Tanno, Graeme McPhillips, Vasileios Vavourakis, David Edwards, Joseph Jacob, John R. Hurst, David J. Hawkes
Abstract	Bronchiectasis is the permanent dilation of airways. Patients with the disease can suffer recurrent exacerbations, reducing their quality of life. The gold standard for diagnosing and monitoring bronchiectasis is inspection of chest computed tomography (CT) scans. A clinician examines the broncho-arterial ratio to determine if an airway is bronchiectatic. The visual analysis assumes the blood vessel diameter remains constant, although this assumption is disputed in the literature. We propose a simple measurement of tapering along the airways to diagnose and monitor bronchiectasis. To this end, we constructed a pipeline to measure the cross-sectional area along the airways at contiguous intervals, starting from the carina to the most distal point observable. Using a phantom with calibrated 3D printed structures, we show that the precision and accuracy of our algorithm extend to the sub-voxel level. The tapering measurement is robust to bifurcations along the airway and was applied to chest CT images acquired in clinical practice. The result is a statistical difference in tapering rate between airways with bronchiectasis and controls. Our code is available at https://github.com/quan14/AirwayTaperingInCT.
Tasks	Computed Tomography (CT)
Published	2019-09-14
URL	https://arxiv.org/abs/1909.06604v1
PDF	https://arxiv.org/pdf/1909.06604v1.pdf
PWC	https://paperswithcode.com/paper/tapering-analysis-of-airways-with
Repo	https://github.com/quan14/AirwayTaperingInCT
Framework	none
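
The abstract does not spell out how the tapering rate is computed from the cross-sectional areas. The sketch below assumes one plausible summary, the least-squares slope of log area against distance along the airway, and runs it on synthetic data; the function name and the numbers are hypothetical, and the released code should be consulted for the actual measurement.

```python
import math


def log_area_slope(arclengths_mm, areas_mm2):
    """Least-squares slope of log(area) against arclength (negative when tapering)."""
    logs = [math.log(a) for a in areas_mm2]
    n = len(logs)
    mean_s = sum(arclengths_mm) / n
    mean_l = sum(logs) / n
    cov = sum((s - mean_s) * (l - mean_l) for s, l in zip(arclengths_mm, logs))
    var = sum((s - mean_s) ** 2 for s in arclengths_mm)
    return cov / var


# Synthetic airway whose area shrinks exponentially from the carina outward.
arclengths = [0.0, 10.0, 20.0, 30.0, 40.0]
areas = [50.0 * math.exp(-0.02 * s) for s in arclengths]
slope = log_area_slope(arclengths, areas)
print(slope)
```

Under this assumption, a dilated airway would show a slope closer to zero than a healthy one, since its area shrinks more slowly along its length.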

FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age


Title	FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age
Authors	Kimmo Kärkkäinen, Jungseock Joo
Abstract	Existing public face datasets are strongly biased toward Caucasian faces, and other races (e.g., Latino) are significantly underrepresented. This can lead to inconsistent model accuracy, limit the applicability of face analytic systems to non-White race groups, and adversely affect research findings based on such skewed data. To mitigate the race bias in these datasets, we construct a novel face image dataset, containing 108,501 images, with an emphasis on balanced race composition in the dataset. We define 7 race groups: White, Black, Indian, East Asian, Southeast Asian, Middle East, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups. Evaluations were performed on existing face attribute datasets as well as novel image datasets to measure generalization performance. We find that the model trained from our dataset is substantially more accurate on novel datasets and the accuracy is consistent between race and gender groups.
Tasks
Published	2019-08-14
URL	https://arxiv.org/abs/1908.04913v1
PDF	https://arxiv.org/pdf/1908.04913v1.pdf
PWC	https://paperswithcode.com/paper/fairface-face-attribute-dataset-for-balanced
Repo	https://github.com/joojs/fairface
Framework	none

Controllable Sentence Simplification


Title	Controllable Sentence Simplification
Authors	Louis Martin, Benoît Sagot, Éric de la Clergerie, Antoine Bordes
Abstract	Text simplification aims at making a text easier to read and understand by simplifying grammar and structure while keeping the underlying information identical. It is often considered an all-purpose generic task where the same simplification is suitable for all; however, multiple audiences can benefit from simplified text in different ways. We adapt a discrete parametrization mechanism that provides explicit control on simplification systems based on Sequence-to-Sequence models. As a result, users can condition the simplifications returned by a model on parameters such as length, amount of paraphrasing, lexical complexity and syntactic complexity. We also show that carefully chosen values of these parameters allow out-of-the-box Sequence-to-Sequence models to outperform their standard counterparts on simplification benchmarks. Our model, which we call ACCESS (as shorthand for AudienCe-CEntric Sentence Simplification), increases the state of the art to 41.87 SARI on the WikiLarge test set, a +1.42 gain over previously reported scores.
Tasks	Text Simplification
Published	2019-10-07
URL	https://arxiv.org/abs/1910.02677v2
PDF	https://arxiv.org/pdf/1910.02677v2.pdf
PWC	https://paperswithcode.com/paper/controllable-sentence-simplification
Repo	https://github.com/facebookresearch/access
Framework	none
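
A discrete parametrization of this kind is commonly implemented by prepending special tokens to the source sentence so the model can condition on them. The sketch below assumes that approach; the token names, the 0.05 bucket size, and the example sentence are hypothetical and may differ from the released ACCESS code.

```python
def control_token(name, value, step=0.05):
    """Discretize a ratio to the nearest step and render it as a special token."""
    bucket = round(round(value / step) * step, 2)
    return f"<{name}_{bucket:.2f}>"


def add_controls(source, controls):
    """Prefix the source sentence with one control token per parameter."""
    prefix = " ".join(control_token(n, v) for n, v in controls.items())
    return f"{prefix} {source}"


example = add_controls(
    "The incident was subsequently investigated by the authorities .",
    {"Length": 0.8, "Paraphrase": 0.62},  # target ratios relative to the source
)
print(example)
```

At training time the ratios would be computed from each source and target pair; at inference time the user picks them to steer the output.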

Modelling Airway Geometry as Stock Market Data using Bayesian Changepoint Detection


Title	Modelling Airway Geometry as Stock Market Data using Bayesian Changepoint Detection
Authors	Kin Quan, Ryutaro Tanno, Michael Duong, Arjun Nair, Rebecca Shipley, Mark Jones, Christopher Brereton, John Hurst, David Hawkes, Joseph Jacob
Abstract	Numerous lung diseases, such as idiopathic pulmonary fibrosis (IPF), exhibit dilatation of the airways. Accurate measurement of dilatation enables assessment of the progression of disease. Unfortunately, the combination of image noise and airway bifurcations causes high variability in the profiles of cross-sectional areas, rendering the identification of affected regions very difficult. Here we introduce a noise-robust method for automatically detecting the location of progressive airway dilatation given two profiles of the same airway acquired at different time points. We propose a probabilistic model of abrupt relative variations between profiles and perform inference via Reversible Jump Markov Chain Monte Carlo sampling. We demonstrate the efficacy of the proposed method on two datasets: (i) images of healthy airways with simulated dilatation; (ii) pairs of real images of IPF-affected airways acquired at 1-year intervals. Our model is able to detect the starting location of airway dilatation with an accuracy of 2.5 mm on simulated data. The experiments on the IPF dataset display reasonable agreement with radiologists. We can compute a relative change in airway volume that may be useful for quantifying IPF disease progression. The code is available at https://github.com/quan14/Modelling_Airway_Geometry_as_Stock_Market_Data
Tasks
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12225v2
PDF	https://arxiv.org/pdf/1906.12225v2.pdf
PWC	https://paperswithcode.com/paper/modelling-airway-geometry-as-stock-market
Repo	https://github.com/quan14/Modelling_Airway_Geometry_as_Stock_Market_Data
Framework	none

U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation


Title	U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
Authors	Junho Kim, Minjae Kim, Hyeonwoo Kang, Kwanghee Lee
Abstract	We propose a novel method for unsupervised image-to-image translation, which incorporates a new attention module and a new learnable normalization function in an end-to-end manner. The attention module guides our model to focus on more important regions distinguishing between source and target domains based on the attention map obtained by the auxiliary classifier. Unlike previous attention-based methods, which cannot handle the geometric changes between domains, our model can translate both images requiring holistic changes and images requiring large shape changes. Moreover, our new AdaLIN (Adaptive Layer-Instance Normalization) function helps our attention-guided model to flexibly control the amount of change in shape and texture by learned parameters depending on datasets. Experimental results show the superiority of the proposed method compared to the existing state-of-the-art models with a fixed network architecture and hyper-parameters. Our code and datasets are available at https://github.com/taki0112/UGATIT or https://github.com/znxlwm/UGATIT-pytorch.
Tasks	Image-to-Image Translation, Unsupervised Image-To-Image Translation
Published	2019-07-25
URL	https://arxiv.org/abs/1907.10830v3
PDF	https://arxiv.org/pdf/1907.10830v3.pdf
PWC	https://paperswithcode.com/paper/u-gat-it-unsupervised-generative-attentional
Repo	https://github.com/taki0112/UGATIT
Framework	tf
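
A minimal sketch of the normalization idea, assuming AdaLIN interpolates between instance-normalized and layer-normalized activations with a learned weight rho kept in [0, 1]. Here gamma and beta are plain scalars for brevity, and the feature map is a toy list of channels; none of the names come from the released code.

```python
import math


def _mean_var(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def _normalize(values, mean, var, eps):
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def adalin(channels, rho, gamma, beta, eps=1e-5):
    """channels: list of per-channel activation lists for one sample."""
    rho = min(max(rho, 0.0), 1.0)  # keep the mixing weight in [0, 1]
    # Layer statistics are shared across all channels of the sample.
    layer_mean, layer_var = _mean_var([v for ch in channels for v in ch])
    out = []
    for ch in channels:
        inst = _normalize(ch, *_mean_var(ch), eps)  # per-channel statistics
        layer = _normalize(ch, layer_mean, layer_var, eps)
        out.append([gamma * (rho * i + (1.0 - rho) * l) + beta
                    for i, l in zip(inst, layer)])
    return out


feature_map = [[1.0, 3.0], [10.0, 30.0]]
instance_like = adalin(feature_map, rho=1.0, gamma=1.0, beta=0.0)
layer_like = adalin(feature_map, rho=0.0, gamma=1.0, beta=0.0)
print(instance_like, layer_like)
```

With rho near 1 each channel is normalized on its own, and with rho near 0 the relative scale between channels is preserved, so a learned rho lets the network choose between the two behaviors.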

Adversarial Colorization Of Icons Based On Structure And Color Conditions


Title	Adversarial Colorization Of Icons Based On Structure And Color Conditions
Authors	Tsai-Ho Sun, Chien-Hsun Lai, Sai-Keung Wong, Yu-Shuen Wang
Abstract	We present a system to help designers create icons that are widely used in banners, signboards, billboards, homepages, and mobile apps. Designers are tasked with drawing contours, whereas our system colorizes contours in different styles. This goal is achieved by training a dual conditional generative adversarial network (GAN) on our collected icon dataset. One condition requires the generated image and the drawn contour to possess a similar contour, while the other anticipates the image and the referenced icon to be similar in color style. Accordingly, the generator takes a contour image and a man-made icon image to colorize the contour, and then the discriminators determine whether the result fulfills the two conditions. The trained network is able to colorize icons demanded by designers and greatly reduces their workload. For the evaluation, we compared our dual conditional GAN to several state-of-the-art techniques. Experimental results demonstrate that our network outperforms the previous networks. Finally, we will provide the source code, icon dataset, and trained network for public use.
Tasks	Colorization
Published	2019-10-03
URL	https://arxiv.org/abs/1910.05253v1
PDF	https://arxiv.org/pdf/1910.05253v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-colorization-of-icons-based-on
Repo	https://github.com/jxcodetw/Adversarial-Colorization-Of-Icons-Based-On-Structure-And-Color-Conditions
Framework	pytorch