February 1, 2020

3124 words 15 mins read

Paper Group AWR 369

Phrase Grounding by Soft-Label Chain Conditional Random Field. A Novel Re-weighting Method for Connectionist Temporal Classification. A Hybrid Neural Network Model for Commonsense Reasoning. Knowledge Graph Embedding for Ecotoxicological Effect Prediction. Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet …

Phrase Grounding by Soft-Label Chain Conditional Random Field

Title Phrase Grounding by Soft-Label Chain Conditional Random Field
Authors Jiacheng Liu, Julia Hockenmaier
Abstract The phrase grounding task aims to ground each entity mention in a given caption of an image to a corresponding region in that image. Although there are clear dependencies between how different mentions of the same caption should be grounded, previous structured prediction methods that aim to capture such dependencies need to resort to approximate inference or non-differentiable losses. In this paper, we formulate phrase grounding as a sequence labeling task where we treat candidate regions as potential labels, and use neural chain Conditional Random Fields (CRFs) to model dependencies among regions for adjacent mentions. In contrast to standard sequence labeling tasks, the phrase grounding task is defined such that there may be multiple correct candidate regions. To address this multiplicity of gold labels, we define so-called Soft-Label Chain CRFs, and present an algorithm that enables convenient end-to-end training. Our method establishes a new state-of-the-art on phrase grounding on the Flickr30k Entities dataset. Analysis shows that our model benefits both from the entity dependencies captured by the CRF and from the soft-label training regime. Our code is available at github.com/liujch1998/SoftLabelCCRF
Tasks Phrase Grounding, Structured Prediction
Published 2019-09-01
URL https://arxiv.org/abs/1909.00301v1
PDF https://arxiv.org/pdf/1909.00301v1.pdf
PWC https://paperswithcode.com/paper/phrase-grounding-by-soft-label-chain
Repo https://github.com/liujch1998/SoftLabelCCRF
Framework pytorch
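
To make the soft-label idea above concrete, here is a minimal PyTorch sketch of training a chain CRF against soft (distributional) targets: the loss is the log-partition minus the expected gold score, with per-position target distributions treated as independent. The names and the exact objective are illustrative, not necessarily the paper's formulation.

```python
import torch

def crf_soft_label_loss(unary, trans, soft_targets):
    """Chain-CRF negative log-likelihood generalized to soft targets.

    unary:        (T, K) per-mention scores for K candidate regions
    trans:        (K, K) transition scores between adjacent mentions
    soft_targets: (T, K) rows are probability distributions over regions
    """
    T, K = unary.shape
    # log-partition via the forward algorithm
    alpha = unary[0]
    for t in range(1, T):
        alpha = unary[t] + torch.logsumexp(alpha.unsqueeze(1) + trans, dim=0)
    log_z = torch.logsumexp(alpha, dim=0)

    # expected score of the "gold" sequence under the soft labels
    # (positions treated as independent for this sketch)
    expected_unary = (soft_targets * unary).sum()
    expected_trans = sum(
        (soft_targets[t - 1].unsqueeze(1) * soft_targets[t].unsqueeze(0) * trans).sum()
        for t in range(1, T)
    )
    return log_z - (expected_unary + expected_trans)
```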

A Novel Re-weighting Method for Connectionist Temporal Classification

Title A Novel Re-weighting Method for Connectionist Temporal Classification
Authors Hongzhu Li, Weiqiang Wang
Abstract The connectionist temporal classification (CTC) enables end-to-end sequence learning by maximizing the probability of correctly recognizing sequences during training. With an extra $blank$ class, the CTC implicitly converts recognizing a sequence into classifying each timestep within the sequence. But the CTC loss is not intuitive for such a classification task, so the class imbalance within each sequence, caused by the overwhelming number of $blank$ timesteps, is a knotty problem. In this paper, we define a piece-wise function as the pseudo ground-truth to reinterpret the CTC loss based on sequences as a cross-entropy loss based on timesteps. The cross-entropy form makes it easy to re-weight the CTC loss. Experiments on text recognition show that the weighted CTC loss solves the class imbalance problem and facilitates convergence, generally leading to better results than the plain CTC loss. Besides this, the reinterpretation of CTC offers a brand-new perspective that may be useful in other situations.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10619v1
PDF http://arxiv.org/pdf/1904.10619v1.pdf
PWC https://paperswithcode.com/paper/a-novel-re-weighting-method-for-connectionist
Repo https://github.com/vadimkantorov/ctc
Framework pytorch
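
The re-weighting idea, once CTC is read as a per-timestep cross entropy, can be sketched as follows; the paper's pseudo ground-truth is a specific piece-wise function, whereas here pseudo_targets is simply assumed given and blank_weight is an illustrative knob.

```python
import torch
import torch.nn.functional as F

def weighted_framewise_ce(log_probs, pseudo_targets, blank=0, blank_weight=0.1):
    """Per-timestep cross entropy against a pseudo ground truth, with the
    blank class down-weighted to counter its dominance.

    log_probs:      (T, C) network log-probabilities for one sequence
    pseudo_targets: (T,)   pseudo ground-truth class per timestep
                    (e.g. derived from a CTC alignment; hypothetical here)
    """
    weights = torch.ones(log_probs.size(1))
    weights[blank] = blank_weight          # re-weight the blank class
    return F.nll_loss(log_probs, pseudo_targets, weight=weights)
```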

A Hybrid Neural Network Model for Commonsense Reasoning

Title A Hybrid Neural Network Model for Commonsense Reasoning
Authors Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao
Abstract This paper proposes a hybrid neural network (HNN) model for commonsense reasoning. An HNN consists of two component models, a masked language model and a semantic similarity model, which share a BERT-based contextual encoder but use different model-specific input and output layers. HNN obtains new state-of-the-art results on three classic commonsense reasoning tasks, pushing the WNLI benchmark to 89%, the Winograd Schema Challenge (WSC) benchmark to 75.1%, and the PDP60 benchmark to 90.0%. An ablation study shows that language models and semantic similarity models are complementary approaches to commonsense reasoning, and HNN effectively combines the strengths of both. The code and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
Tasks Language Modelling, Semantic Similarity, Semantic Textual Similarity
Published 2019-07-27
URL https://arxiv.org/abs/1907.11983v1
PDF https://arxiv.org/pdf/1907.11983v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-neural-network-model-for-commonsense
Repo https://github.com/namisan/mt-dnn
Framework pytorch
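
A toy skeleton of the two-head arrangement described above (shared contextual encoder, separate masked-LM and semantic-similarity output layers); the real model uses a BERT encoder, and all sizes and names here are placeholders.

```python
import torch
import torch.nn as nn

class HybridHeads(nn.Module):
    """Shared contextual encoder feeding two task-specific heads: a
    masked-LM head and a semantic-similarity head (illustrative only)."""

    def __init__(self, vocab_size=30522, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.mlm_head = nn.Linear(hidden, vocab_size)    # scores candidate fillers
        self.sim_head = nn.Linear(hidden, 1)             # similarity score

    def forward(self, token_ids):
        h = self.encoder(self.embed(token_ids))          # (B, T, H)
        return self.mlm_head(h), self.sim_head(h[:, 0])  # per-token LM logits, sequence-level score
```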

Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Title Knowledge Graph Embedding for Ecotoxicological Effect Prediction
Authors Erik Bryhn Myklebust, Ernesto Jimenez-Ruiz, Jiaoyan Chen, Raoul Wolf, Knut Erik Tollefsen
Abstract Exploring the effects a chemical compound has on a species takes considerable experimental effort. Appropriate methods for estimating and suggesting new effects can dramatically reduce the work that needs to be done by a laboratory. In this paper we explore the suitability of using a knowledge graph embedding approach for ecotoxicological effect prediction. A knowledge graph has been constructed from publicly available data sets, including a species taxonomy and chemical classification and similarity. The publicly available effect data is integrated into the knowledge graph using ontology alignment techniques. Our experimental results show that the knowledge-graph-based approach improves on the selected baselines.
Tasks Graph Embedding, Knowledge Graph Embedding
Published 2019-07-02
URL https://arxiv.org/abs/1907.01328v3
PDF https://arxiv.org/pdf/1907.01328v3.pdf
PWC https://paperswithcode.com/paper/knowledge-graph-embedding-for
Repo https://github.com/Erik-BM/NIVAUC
Framework tf
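
As one concrete instance of knowledge graph embedding for this setting, a TransE-style scorer over (chemical, affects, species) triples could look like the sketch below; the paper evaluates embedding models on its constructed graph, and TransE is used here only as a familiar example.

```python
import torch
import torch.nn as nn

class TransE(nn.Module):
    """Minimal TransE-style triple scorer (one common embedding choice,
    not necessarily the model used in the paper)."""

    def __init__(self, n_entities, n_relations, dim=64):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)

    def score(self, head, rel, tail):
        # lower distance = more plausible triple, e.g. (chemical, affects, species)
        return (self.ent(head) + self.rel(rel) - self.ent(tail)).norm(p=2, dim=-1)
```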

Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians

Title Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians
Authors Vardan Papyan
Abstract We consider deep classifying neural networks. We expose a structure in the derivative of the logits with respect to the parameters of the model, which is used to explain the existence of outliers in the spectrum of the Hessian. Previous works decomposed the Hessian into two components, attributing the outliers to one of them, the so-called Covariance of gradients. We show this term is not a Covariance but a second moment matrix, i.e., it is influenced by means of gradients. These means possess an additive two-way structure that is the source of the outliers in the spectrum. This structure can be used to approximate the principal subspace of the Hessian using certain “averaging” operations, avoiding the need for high-dimensional eigenanalysis. We corroborate this claim across different datasets, architectures and sample sizes.
Tasks
Published 2019-01-24
URL http://arxiv.org/abs/1901.08244v1
PDF http://arxiv.org/pdf/1901.08244v1.pdf
PWC https://paperswithcode.com/paper/measurements-of-three-level-hierarchical
Repo https://github.com/deep-lab/DeepnetHessian
Framework pytorch
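
The central observation, that the "covariance of gradients" term is really a second moment matrix and therefore carries the gradient means, amounts to the identity E[gg^T] = Cov(g) + mean(g)mean(g)^T. A small numerical check of that decomposition on a matrix of per-example gradients:

```python
import torch

def second_moment_decomposition(grads):
    """Split the gradient second-moment matrix into covariance plus the
    outer product of the mean, illustrating the distinction the abstract
    draws (the means are what drive the spectrum outliers).

    grads: (N, P) matrix of per-example gradients (N samples, P parameters)
    """
    mean = grads.mean(dim=0)                             # (P,)
    second_moment = grads.t() @ grads / grads.size(0)    # E[g g^T]
    centered = grads - mean
    covariance = centered.t() @ centered / grads.size(0)
    # second_moment == covariance + outer(mean, mean), up to numerical error
    return second_moment, covariance, torch.outer(mean, mean)
```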

Dependency-Guided LSTM-CRF for Named Entity Recognition

Title Dependency-Guided LSTM-CRF for Named Entity Recognition
Authors Zhanming Jie, Wei Lu
Abstract Dependency tree structures capture long-distance and syntactic relationships between words in a sentence. The syntactic relations (e.g., nominal subject, object) can potentially infer the existence of certain named entities. In addition, the performance of a named entity recognizer could benefit from the long-distance dependencies between the words in dependency trees. In this work, we propose a simple yet effective dependency-guided LSTM-CRF model to encode the complete dependency trees and capture the above properties for the task of named entity recognition (NER). The data statistics show strong correlations between the entity types and dependency relations. We conduct extensive experiments on several standard datasets and demonstrate the effectiveness of the proposed model in improving NER and achieving state-of-the-art performance. Our analysis reveals that the significant improvements mainly result from the dependency relations and long-distance interactions provided by dependency trees.
Tasks Named Entity Recognition
Published 2019-09-23
URL https://arxiv.org/abs/1909.10148v1
PDF https://arxiv.org/pdf/1909.10148v1.pdf
PWC https://paperswithcode.com/paper/190910148
Repo https://github.com/IIITian-Chandan/Solar_Data_modeling
Framework tf
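
One simple way to realize dependency-guided features, sketched below, is to concatenate each word's embedding with its dependency head's embedding and a relation embedding before the BiLSTM; this illustrates the idea rather than the paper's exact architecture, and the CRF layer is omitted.

```python
import torch
import torch.nn as nn

class DepGuidedEncoder(nn.Module):
    """Each word is concatenated with the representation of its dependency
    head and of the dependency relation before a BiLSTM encoder."""

    def __init__(self, vocab_size, n_relations, dim=100):
        super().__init__()
        self.word = nn.Embedding(vocab_size, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.lstm = nn.LSTM(3 * dim, dim, bidirectional=True, batch_first=True)

    def forward(self, words, head_index, rel_ids):
        w = self.word(words)                                  # (B, T, D)
        # gather each word's dependency-head embedding
        heads = torch.gather(w, 1, head_index.unsqueeze(-1).expand_as(w))
        x = torch.cat([w, heads, self.rel(rel_ids)], dim=-1)  # (B, T, 3D)
        out, _ = self.lstm(x)
        return out                                            # features for a CRF tagger
```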

Towards Photographic Image Manipulation with Balanced Growing of Generative Autoencoders

Title Towards Photographic Image Manipulation with Balanced Growing of Generative Autoencoders
Authors Ari Heljakka, Arno Solin, Juho Kannala
Abstract We present a generative autoencoder that provides fast encoding, faithful reconstructions (e.g., retaining the identity of a face), sharp generated/reconstructed samples in high resolutions, and a well-structured latent space that supports semantic manipulation of the inputs. There are no current autoencoder or GAN models that satisfactorily achieve all of these. We build on the progressively growing autoencoder model PIONEER, for which we completely alter the training dynamics based on a careful analysis of recently introduced normalization schemes. We show significantly improved visual and quantitative results for face identity conservation in CelebA-HQ. Our model achieves state-of-the-art disentanglement of latent space, both quantitatively and via realistic image attribute manipulations. On the LSUN Bedrooms dataset, we improve the disentanglement performance of the vanilla PIONEER, despite having a simpler model. Overall, our results indicate that the PIONEER networks provide a way towards photorealistic face manipulation.
Tasks
Published 2019-04-12
URL https://arxiv.org/abs/1904.06145v2
PDF https://arxiv.org/pdf/1904.06145v2.pdf
PWC https://paperswithcode.com/paper/towards-photographic-image-manipulation-with
Repo https://github.com/AaltoVision/balanced-pioneer
Framework pytorch

Assessing the Lexico-Semantic Relational Knowledge Captured by Word and Concept Embeddings

Title Assessing the Lexico-Semantic Relational Knowledge Captured by Word and Concept Embeddings
Authors Ronald Denaux, Jose Manuel Gomez-Perez
Abstract Deep learning currently dominates the benchmarks for various NLP tasks and, at the basis of such systems, words are frequently represented as embeddings (vectors in a low-dimensional space) learned from large text corpora; various algorithms have been proposed to learn both word and concept embeddings. One of the claimed benefits of such embeddings is that they capture knowledge about semantic relations. Such embeddings are most often evaluated through tasks such as predicting human-rated similarity and analogy, which only test a few, often ill-defined, relations. In this paper, we propose a method for (i) reliably generating word and concept pair datasets for a wide range of relations by using a knowledge graph and (ii) evaluating to what extent pre-trained embeddings capture those relations. We evaluate the approach against a proprietary and a public knowledge graph and analyze the results, showing which lexico-semantic relational knowledge is captured by current embedding learning approaches.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.11042v1
PDF https://arxiv.org/pdf/1909.11042v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-lexico-semantic-relational
Repo https://github.com/rdenaux/embrelassess
Framework pytorch
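
The evaluation protocol can be pictured as fitting a simple probe on embedding pairs generated from the knowledge graph and reading its held-out accuracy as evidence that a relation is (or is not) captured. A hedged sketch with stand-in data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_relation_probe(emb_a, emb_b, labels):
    """Probe whether pre-trained embeddings capture a given relation.

    emb_a, emb_b: (N, D) pre-trained embeddings of the pair members
    labels:       (N,)   1 if the relation holds between a and b, else 0
                  (pairs would be generated from a knowledge graph; here
                  they are assumed given)
    """
    features = np.concatenate([emb_a, emb_b, emb_a - emb_b], axis=1)
    probe = LogisticRegression(max_iter=1000).fit(features, labels)
    return probe  # probe.score(...) on held-out pairs estimates how well
                  # the embeddings encode this relation
```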

A Modular Task-oriented Dialogue System Using a Neural Mixture-of-Experts

Title A Modular Task-oriented Dialogue System Using a Neural Mixture-of-Experts
Authors Jiahuan Pei, Pengjie Ren, Maarten de Rijke
Abstract End-to-end Task-oriented Dialogue Systems (TDSs) have attracted a lot of attention for their superiority (e.g., in terms of global optimization) over pipeline modularized TDSs. Previous studies on end-to-end TDSs use a single-module model to generate responses for complex dialogue contexts. However, no model consistently outperforms the others in all cases. We propose a neural Modular Task-oriented Dialogue System (MTDS) framework, in which a few expert bots are combined to generate the response for a given dialogue context. MTDS consists of a chair bot and several expert bots. Each expert bot is specialized for a particular situation, e.g., one domain, one type of system action, etc. The chair bot coordinates multiple expert bots and adaptively selects an expert bot to generate the appropriate response. We further propose a Token-level Mixture-of-Expert (TokenMoE) model to implement MTDS, where the expert bots predict multiple tokens at each timestamp and the chair bot determines the final generated token by fully taking into consideration the outputs of all expert bots. Both the chair bot and the expert bots are jointly trained in an end-to-end fashion. To verify the effectiveness of TokenMoE, we carry out extensive experiments on a benchmark dataset. Compared with the baseline using a single-module model, our TokenMoE improves inform rate by 8.1% and success rate by 0.8%.
Tasks Task-Oriented Dialogue Systems
Published 2019-07-10
URL https://arxiv.org/abs/1907.05346v1
PDF https://arxiv.org/pdf/1907.05346v1.pdf
PWC https://paperswithcode.com/paper/a-modular-task-oriented-dialogue-system-using
Repo https://github.com/budzianowski/multiwoz
Framework pytorch
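
The token-level mixing can be sketched as the chair bot softmax-weighting the experts' next-token distributions at each decoding step; the gating and training details in the paper are richer than this.

```python
import torch

def chair_combine(expert_logits, chair_weights):
    """Token-level mixture of experts (loose sketch of the TokenMoE idea).

    expert_logits: (E, V) next-token logits from E expert bots
    chair_weights: (E,)   chair scores for the experts at this step
    """
    expert_probs = torch.softmax(expert_logits, dim=-1)        # (E, V)
    gate = torch.softmax(chair_weights, dim=-1).unsqueeze(-1)  # (E, 1)
    mixed = (gate * expert_probs).sum(dim=0)                   # (V,)
    return mixed.argmax()                                      # final token id
```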

Incrementally Improving Graph WaveNet Performance on Traffic Prediction

Title Incrementally Improving Graph WaveNet Performance on Traffic Prediction
Authors Sam Shleifer, Clara McCreery, Vamsi Chitters
Abstract We present a series of modifications which improve upon Graph WaveNet’s previously state-of-the-art performance on the METR-LA traffic prediction task. The goal of this task is to predict the future speed of traffic at each sensor in a network using the past hour of sensor readings. Graph WaveNet (GWN) is a spatio-temporal graph neural network which interleaves graph convolution to aggregate information from nearby sensors and dilated convolutions to aggregate information from the past. We improve GWN by (1) using better hyperparameters, (2) adding connections that allow larger gradients to flow back to the early convolutional layers, and (3) pretraining on an easier short-term traffic prediction task. These modifications reduce the mean absolute error by 0.06 on the METR-LA task, nearly equal to GWN’s improvement over its predecessor. These improvements generalize to the PEMS-BAY dataset, with similar relative magnitude. We also show that ensembling separate models for short- and long-term predictions further improves performance. Code is available at https://github.com/sshleifer/Graph-WaveNet.
Tasks Traffic Prediction
Published 2019-12-11
URL https://arxiv.org/abs/1912.07390v1
PDF https://arxiv.org/pdf/1912.07390v1.pdf
PWC https://paperswithcode.com/paper/incrementally-improving-graph-wavenet
Repo https://github.com/nnzhan/Graph-WaveNet
Framework pytorch
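
Of the listed modifications, the short-/long-term ensembling is easy to illustrate: take early forecast steps from the model trained on short horizons and later steps from the long-horizon model. The split point below is an assumption, not a value from the paper.

```python
import torch

def horizon_ensemble(short_model, long_model, x, split_step=3):
    """Combine a short-horizon model and a long-horizon model into one
    forecast (one simple reading of the ensembling in the abstract).

    x: past sensor readings; both models are assumed to return
       (batch, horizon, n_sensors) forecasts.
    """
    short_pred = short_model(x)
    long_pred = long_model(x)
    # early steps from the short-term specialist, later steps from the other
    return torch.cat([short_pred[:, :split_step], long_pred[:, split_step:]], dim=1)
```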

Learning meters of Arabic and English poems with Recurrent Neural Networks: a step forward for language understanding and synthesis

Title Learning meters of Arabic and English poems with Recurrent Neural Networks: a step forward for language understanding and synthesis
Authors Waleed A. Yousef, Omar M. Ibrahime, Taha M. Madbouly, Moustafa A. Mahmoud
Abstract Recognizing a piece of writing as a poem or prose is usually easy for the majority of people; however, only specialists can determine which meter a poem belongs to. In this paper, we build Recurrent Neural Network (RNN) models that can classify poems according to their meters from plain text. The input text is encoded at the character level and directly fed to the models without feature handcrafting. This is a step forward for machine understanding and synthesis of languages in general, and the Arabic language in particular. Among the 16 poem meters of Arabic and the 4 meters of English, the networks were able to correctly classify poems with overall accuracies of 96.38% and 82.31%, respectively. The poem datasets used to conduct this research were massive, over 1.5 million verses, and were crawled from different nontechnical sources, mostly Arabic and English literature sites, in different heterogeneous and unstructured formats. These datasets are now made publicly available in a clean, structured, and documented format for future research. To the best of the authors’ knowledge, this research is the first to address classifying poem meters with a machine learning approach in general, and with a featureless RNN-based approach in particular. In addition, the dataset is the first publicly available dataset ready for the purpose of future computational research.
Tasks
Published 2019-05-07
URL https://arxiv.org/abs/1905.05700v1
PDF https://arxiv.org/pdf/1905.05700v1.pdf
PWC https://paperswithcode.com/paper/190505700
Repo https://github.com/DrWaleedAYousef/My-Stuff-To-Share
Framework none
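
The model family described above, a character-level RNN classifier over raw poem text, might look like this minimal sketch (layer sizes and the GRU choice are illustrative):

```python
import torch
import torch.nn as nn

class MeterClassifier(nn.Module):
    """Character-level RNN classifier: raw character indices in,
    poem meter class out."""

    def __init__(self, n_chars, n_meters, dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * dim, n_meters)   # e.g. 16 Arabic + 4 English meters

    def forward(self, char_ids):                  # (B, T) character indices
        _, h = self.rnn(self.embed(char_ids))     # h: (2, B, D), last states of both directions
        return self.out(torch.cat([h[0], h[1]], dim=-1))  # (B, n_meters) logits
```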

Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening

Title Transfer Learning Using Ensemble Neural Networks for Organic Solar Cell Screening
Authors Arindam Paul, Dipendra Jha, Reda Al-Bahrani, Wei-keng Liao, Alok Choudhary, Ankit Agrawal
Abstract Organic solar cells are a promising technology for solving the clean energy crisis in the world. However, generating candidate chemical compounds for solar cells is a time-consuming process requiring thousands of hours of laboratory analysis. For a solar cell, the most important property is the power conversion efficiency, which is dependent on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved to be very useful in building predictive models for HOMO values of donor structures of Organic Photovoltaic Cells (OPVs). Since experimental datasets are limited in size, current machine learning models are trained on data derived from calculations based on density functional theory (DFT). Molecular line notations such as SMILES or InChI are popular input representations for describing the molecular structure of donor molecules. The two types of line representations encode different information: for example, SMILES defines the bond types while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values, and leverages the potential of transfer learning from a sizeable DFT-computed dataset (Harvard CEP) to build more robust predictive models for the relatively smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed and experimental values for 350 and 243 molecules, respectively. Our results demonstrate significant performance improvement from the use of transfer learning and from leveraging both molecular representations.
Tasks Transfer Learning
Published 2019-03-07
URL https://arxiv.org/abs/1903.03178v4
PDF https://arxiv.org/pdf/1903.03178v4.pdf
PWC https://paperswithcode.com/paper/transfer-learning-using-ensemble-neural-nets
Repo https://github.com/paularindam/SINet
Framework tf
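
The transfer-learning recipe, pretrain on the large DFT-computed Harvard CEP data and then fine-tune on the small HOPV data, can be sketched generically; the loaders, epoch counts, and regression-target handling below are assumptions.

```python
import torch
import torch.nn as nn

def pretrain_then_finetune(model, big_loader, small_loader, lr=1e-3):
    """Generic transfer-learning loop: train a HOMO regressor on the large
    dataset first, then continue training the same weights on the small one.
    Loaders are assumed to yield (features, homo_value) batches."""
    loss_fn = nn.MSELoss()
    for loader, epochs in [(big_loader, 5), (small_loader, 50)]:
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss = loss_fn(model(x).squeeze(-1), y)
                loss.backward()
                opt.step()
    return model
```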

Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

Title Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs
Authors Denis Mazur, Vage Egiazarian, Stanislav Morozov, Artem Babenko
Abstract Learning useful representations is a key ingredient to the success of modern machine learning. Currently, representation learning mostly relies on embedding data into Euclidean space. However, recent work has shown that data in some domains is better modeled by non-Euclidean metric spaces, and inappropriate geometry can result in inferior performance. In this paper, we aim to eliminate the inductive bias imposed by the embedding space geometry. Namely, we propose to map data into more general non-vector metric spaces: a weighted graph with a shortest path distance. By design, such graphs can model arbitrary geometry with a proper configuration of edges and weights. Our main contribution is PRODIGE: a method that learns a weighted graph representation of data end-to-end by gradient descent. Greater generality and fewer model assumptions make PRODIGE more powerful than existing embedding-based approaches. We confirm the superiority of our method via extensive experiments on a wide range of tasks, including classification, compression, and collaborative filtering.
Tasks Representation Learning
Published 2019-10-08
URL https://arxiv.org/abs/1910.03524v4
PDF https://arxiv.org/pdf/1910.03524v4.pdf
PWC https://paperswithcode.com/paper/beyond-vector-spaces-compact-data
Repo https://github.com/stanis-morozov/prodige
Framework pytorch
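
A minimal sketch of the "weighted graph learned by gradient descent" idea: find the shortest path on detached weights, then return the sum of the learnable weights along that path so gradients reach exactly the edges used. PRODIGE additionally learns probabilistic edge existence and sparsifies the graph, which this sketch ignores.

```python
import networkx as nx
import torch

class GraphDistance(torch.nn.Module):
    """Differentiable weighted-graph distance between two nodes: the path is
    found on detached weights, and the returned distance is the sum of the
    learnable weights along that path (a simplification of PRODIGE)."""

    def __init__(self, n_nodes):
        super().__init__()
        self.raw = torch.nn.Parameter(torch.randn(n_nodes, n_nodes))

    def forward(self, i, j):
        w = torch.nn.functional.softplus(self.raw)      # positive edge weights
        g = nx.from_numpy_array(w.detach().numpy())     # dense graph on current weights
        path = nx.shortest_path(g, i, j, weight="weight")
        # gradient flows only to the edges on the chosen path
        return sum(w[a, b] for a, b in zip(path[:-1], path[1:]))
```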

Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

Title Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation
Authors Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan
Abstract Cross-view image translation is challenging because it involves images with drastically different views and severe deformation. In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map. The proposed SelectionGAN explicitly utilizes the semantic information and consists of two stages. In the first stage, the condition image and the target semantic map are fed into a cycled semantic-guided generation network to produce initial coarse results. In the second stage, we refine the initial results by using a multi-channel attention selection mechanism. Moreover, uncertainty maps automatically learned from attentions are used to guide the pixel loss for better network optimization. Extensive experiments on Dayton, CVUSA and Ego2Top datasets show that our model is able to generate significantly better results than the state-of-the-art methods. The source code, data and trained models are available at https://github.com/Ha0Tang/SelectionGAN.
Tasks Bird View Synthesis, Cross-View Image-to-Image Translation, Image-to-Image Translation
Published 2019-04-15
URL http://arxiv.org/abs/1904.06807v2
PDF http://arxiv.org/pdf/1904.06807v2.pdf
PWC https://paperswithcode.com/paper/multi-channel-attention-selection-gan-with
Repo https://github.com/Ha0Tang/HandGestureRecognition
Framework none
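
The second-stage selection mechanism can be pictured as fusing several candidate generations with softmax-normalized attention maps, as in the heavily simplified sketch below (the uncertainty maps and the cascaded semantic guidance are omitted).

```python
import torch

def attention_selection(candidates, attention_logits):
    """Fuse N candidate generations pixel-wise with softmax attention maps.

    candidates:       (B, N, 3, H, W) intermediate generated images
    attention_logits: (B, N, 1, H, W) one attention map per candidate
    """
    attn = torch.softmax(attention_logits, dim=1)   # normalize across the N channels
    return (attn * candidates).sum(dim=1)           # (B, 3, H, W) fused output
```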

Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes

Title Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Authors Nathan Kallus, Masatoshi Uehara
Abstract Off-policy evaluation (OPE) in reinforcement learning allows one to evaluate novel decision policies without needing to conduct exploration, which is often costly or otherwise infeasible. We consider for the first time the semiparametric efficiency limits of OPE in Markov decision processes (MDPs), where actions, rewards, and states are memoryless. We show existing OPE estimators may fail to be efficient in this setting. We develop a new estimator based on cross-fold estimation of $q$-functions and marginalized density ratios, which we term double reinforcement learning (DRL). We show that DRL is efficient when both components are estimated at fourth-root rates and is also doubly robust when only one component is consistent. We investigate these properties empirically and demonstrate the performance benefits due to harnessing memorylessness.
Tasks
Published 2019-08-22
URL https://arxiv.org/abs/1908.08526v2
PDF https://arxiv.org/pdf/1908.08526v2.pdf
PWC https://paperswithcode.com/paper/double-reinforcement-learning-for-efficient
Repo https://github.com/CausalML/DoubleReinforcementLearningMDP
Framework none
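
A hedged sketch of a doubly-robust-style estimate assembled from the three ingredients the abstract names (a q-function, its induced state value, and a marginalized density ratio); the exact DRL estimator and its cross-fold fitting procedure are in the paper, and this is only a rough illustration.

```python
def drl_estimate(trajs, q, v, ratio, gamma=0.99):
    """Rough doubly-robust-style off-policy value estimate.

    trajs: list of trajectories, each a list of (s, a, r, s_next) tuples
    q:     callable q(s, a), fitted action-value function (stand-in)
    v:     callable v(s), its induced state value under the target policy
    ratio: callable ratio(s, a), marginalized density ratio (stand-in)
    """
    total = 0.0
    for traj in trajs:
        s0 = traj[0][0]
        est = v(s0)                        # model-based baseline
        for t, (s, a, r, s_next) in enumerate(traj):
            correction = r + gamma * v(s_next) - q(s, a)
            est += (gamma ** t) * ratio(s, a) * correction
        total += est
    return total / len(trajs)
```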