January 31, 2020

Paper Group AWR 370

A Logic-Driven Framework for Consistency of Neural Models

Title A Logic-Driven Framework for Consistency of Neural Models
Authors Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar
Abstract While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a learning framework for constraining models using logic rules to regularize them away from inconsistency. Our framework can leverage both labeled and unlabeled examples and is directly compatible with off-the-shelf learning schemes without model redesign. We instantiate our framework on natural language inference, where experiments show that enforcing invariants stated in logic can help make the predictions of neural models both accurate and consistent.
Tasks Natural Language Inference
Published 2019-08-31
URL https://arxiv.org/abs/1909.00126v4
PDF https://arxiv.org/pdf/1909.00126v4.pdf
PWC https://paperswithcode.com/paper/a-logic-driven-framework-for-consistency-of
Repo https://github.com/utahnlp/consistency
Framework pytorch
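
As a rough illustration of how a logic rule can become a regularizer, the sketch below penalizes violations of an implication between two predicted probabilities. The rule used here (if a pair is predicted as a contradiction, the reversed pair should be too) and the product-logic relaxation are assumptions chosen for illustration; the abstract does not specify the rules or the relaxation, and `implication_loss` is a hypothetical helper, not the authors' code.

```python
import math


def implication_loss(p_antecedent: float, p_consequent: float) -> float:
    """Penalty for violating 'antecedent -> consequent' under a product-logic relaxation.

    The relaxed truth value of the implication is min(1, p_consequent / p_antecedent),
    so its negative log is max(0, log p_antecedent - log p_consequent).
    """
    eps = 1e-12  # guard against log(0)
    return max(0.0, math.log(p_antecedent + eps) - math.log(p_consequent + eps))


# Hypothetical model outputs: probability of "contradiction" for a pair and its reverse.
p_forward = 0.9  # P(contradiction | premise, hypothesis)
p_reverse = 0.3  # P(contradiction | hypothesis, premise)

violated = implication_loss(p_forward, p_reverse)   # positive: the rule is violated
satisfied = implication_loss(p_reverse, p_forward)  # zero: the rule already holds
```

A penalty of this kind needs only model outputs, not gold labels, which is one way a rule-based term could be computed on unlabeled examples.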

Document Expansion by Query Prediction

Title Document Expansion by Query Prediction
Authors Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho
Abstract One technique to improve the retrieval effectiveness of a search engine is to expand documents with terms that are related or representative of the documents’ content. From the perspective of a question answering system, this might comprise questions the document can potentially answer. Following this observation, we propose a simple method that predicts which queries will be issued for a given document and then expands it with those predictions with a vanilla sequence-to-sequence model, trained using datasets consisting of pairs of query and relevant documents. By combining our method with a highly-effective re-ranking component, we achieve the state of the art in two retrieval tasks. In a latency-critical regime, retrieval results alone (without re-ranking) approach the effectiveness of more computationally expensive neural re-rankers but are much faster.
Tasks Passage Re-Ranking, Question Answering
Published 2019-04-17
URL https://arxiv.org/abs/1904.08375v2
PDF https://arxiv.org/pdf/1904.08375v2.pdf
PWC https://paperswithcode.com/paper/document-expansion-by-query-prediction
Repo https://github.com/castorini/Anserini
Framework none
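
The expansion step described in the abstract can be sketched as appending predicted queries to the document text before it is indexed. This is a minimal sketch, not the authors' pipeline: `expand_document` and `toy_predictor` are hypothetical names, and the predictor is a hard-coded stand-in for a trained sequence-to-sequence model.

```python
from typing import Callable, List


def expand_document(doc: str, predict_queries: Callable[[str], List[str]]) -> str:
    """Return the document text with its predicted queries appended."""
    queries = predict_queries(doc)
    return " ".join([doc] + queries)


def toy_predictor(doc: str) -> List[str]:
    """Hypothetical stand-in for a trained query-prediction model."""
    return ["what causes tides", "why do tides happen"]


expanded = expand_document("Tides are caused by the moon's gravity.", toy_predictor)
```

The expanded string is what would be handed to the indexer, so a later query such as "why do tides happen" can match terms that the original document never contained.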

Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks

Title Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks
Authors Qi She, Anqi Wu
Abstract Latent dynamics discovery is challenging in extracting complex dynamics from high-dimensional noisy neural data. Many dimensionality reduction methods have been widely adopted to extract low-dimensional, smooth and time-evolving latent trajectories. However, simple state transition structures, linear embedding assumptions, or inflexible inference networks impede the accurate recovery of dynamic portraits. In this paper, we propose a novel latent dynamic model that is capable of capturing nonlinear, non-Markovian, long short-term time-dependent dynamics via recurrent neural networks and tackling complex nonlinear embedding via non-parametric Gaussian process. Due to the complexity and intractability of the model and its inference, we also provide a powerful inference network with bi-directional long short-term memory networks that encode both past and future information into posterior distributions. In the experiment, we show that our model outperforms other state-of-the-art methods in reconstructing insightful latent dynamics from both simulated and experimental neural datasets with either Gaussian or Poisson observations, especially in the low-sample scenario. Our codes and additional materials are available at https://github.com/sheqi/GP-RNN_UAI2019.
Tasks Dimensionality Reduction, Time Series, Time Series Analysis
Published 2019-07-01
URL https://arxiv.org/abs/1907.00650v1
PDF https://arxiv.org/pdf/1907.00650v1.pdf
PWC https://paperswithcode.com/paper/neural-dynamics-discovery-via-gaussian
Repo https://github.com/sheqi/GP-RNN_UAI2019
Framework tf

Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation

Title Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
Authors Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki
Abstract We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of CCG corpora exploiting cheaper resources of dependency trees. Our solution is conceptually simple, and not relying on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When applied to the proposed method, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems.
Tasks Domain Adaptation
Published 2019-06-05
URL https://arxiv.org/abs/1906.01834v1
PDF https://arxiv.org/pdf/1906.01834v1.pdf
PWC https://paperswithcode.com/paper/automatic-generation-of-high-quality-ccgbanks
Repo https://github.com/masashi-y/depccg
Framework none

SafeLife 1.0: Exploring Side Effects in Complex Environments

Title SafeLife 1.0: Exploring Side Effects in Complex Environments
Authors Carroll L. Wainwright, Peter Eckersley
Abstract We present SafeLife, a publicly available reinforcement learning environment that tests the safety of reinforcement learning agents. It contains complex, dynamic, tunable, procedurally generated levels with many opportunities for unsafe behavior. Agents are graded both on their ability to maximize their explicit reward and on their ability to operate safely without unnecessary side effects. We train agents to maximize rewards using proximal policy optimization and score them on a suite of benchmark levels. The resulting agents are performant but not safe—they tend to cause large side effects in their environments—but they form a baseline against which future safety research can be measured.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01217v1
PDF https://arxiv.org/pdf/1912.01217v1.pdf
PWC https://paperswithcode.com/paper/safelife-10-exploring-side-effects-in-complex
Repo https://github.com/PartnershipOnAI/safelife
Framework none

Crop Yield Prediction Using Deep Neural Networks

Title Crop Yield Prediction Using Deep Neural Networks
Authors Saeed Khaki, Lizhi Wang
Abstract Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires fundamental understanding of the functional relationship between yield and these interactive factors, and to reveal such relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have a superior prediction accuracy, with a root-mean-square-error (RMSE) being 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without significant drop in the prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on the crop yield than genotype.
Tasks Feature Selection
Published 2019-02-07
URL https://arxiv.org/abs/1902.02860v3
PDF https://arxiv.org/pdf/1902.02860v3.pdf
PWC https://paperswithcode.com/paper/crop-yield-prediction-using-deep-neural
Repo https://github.com/saeedkhaki92/Yield-Prediction-DNN
Framework tf
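
The abstract reports RMSE relative to the average yield and to the standard deviation. The sketch below shows how those two ratios can be computed on made-up numbers. The yield values are invented for illustration, and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
import math
from typing import Sequence, Tuple


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> Tuple[float, float]:
    """Return RMSE as a fraction of the mean and of the (population) standard deviation."""
    err = rmse(y_true, y_pred)
    mean = sum(y_true) / len(y_true)
    std = math.sqrt(sum((t - mean) ** 2 for t in y_true) / len(y_true))
    return err / mean, err / std


# Invented yields, for illustration only.
observed = [100.0, 120.0, 80.0, 100.0]
predicted = [110.0, 110.0, 90.0, 90.0]

error = rmse(observed, predicted)
frac_of_mean, frac_of_std = relative_rmse(observed, predicted)
```

On these toy values the RMSE is 10, which is 10% of the mean yield of 100 and about 71% of the standard deviation.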

Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs

Title Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs
Authors Soumya Sharma, Bishal Santra, Abhik Jana, T. Y. S. S. Santosh, Niloy Ganguly, Pawan Goyal
Abstract Recently, biomedical version of embeddings obtained from language models such as BioELMo have shown state-of-the-art results for the textual inference task in the medical domain. In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. Specifically, we experiment with fusing embeddings obtained from knowledge graph with the state-of-the-art approaches for NLI task (ESIM model). We also experiment with fusing the domain-specific sentiment information for the task. Experiments conducted on MedNLI dataset clearly show that this strategy improves the baseline BioELMo architecture for the Medical NLI task.
Tasks Knowledge Graphs
Published 2019-08-31
URL https://arxiv.org/abs/1909.00160v1
PDF https://arxiv.org/pdf/1909.00160v1.pdf
PWC https://paperswithcode.com/paper/incorporating-domain-knowledge-into-medical
Repo https://github.com/soummyaah/KGMedNLI
Framework pytorch

The Nipple-Areola Complex for Criminal Identification

Title The Nipple-Areola Complex for Criminal Identification
Authors Wojciech Michal Matkowski, Krzysztof Matkowski, Adams Wai-Kin Kong, Cory Lloyd Hall
Abstract In digital and multimedia forensics, identification of child sexual offenders based on digital evidence images is highly challenging due to the fact that the offender’s face or other obvious characteristics such as tattoos are occluded, covered, or not visible at all. Nevertheless, other naked body parts, e.g., chest are still visible. Some researchers proposed skin marks, skin texture, vein or androgenic hair patterns for criminal and victim identification. There are no available studies of nipple-areola complex (NAC) for offender identification. In this paper, we present a study of offender identification based on the NAC, and we present NTU-Nipple-v1 dataset, which contains 2732 images of 428 different male nipple-areolae. Popular deep learning and hand-crafted recognition methods are evaluated on the provided dataset. The results indicate that the NAC can be a useful characteristic for offender identification.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11651v1
PDF https://arxiv.org/pdf/1905.11651v1.pdf
PWC https://paperswithcode.com/paper/the-nipple-areola-complex-for-criminal
Repo https://github.com/BFLTeam/NTU_Dataset
Framework none

AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs

Title AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs
Authors Gabriele Abbati, Philippe Wenk, Michael A Osborne, Andreas Krause, Bernhard Schölkopf, Stefan Bauer
Abstract Stochastic differential equations are an important modeling class in many disciplines. Consequently, there exist many methods relying on various discretization and numerical integration schemes. In this paper, we propose a novel, probabilistic model for estimating the drift and diffusion given noisy observations of the underlying stochastic system. Using state-of-the-art adversarial and moment matching inference techniques, we avoid the discretization schemes of classical approaches. This leads to significant improvements in parameter accuracy and robustness given random initial guesses. On four established benchmark systems, we compare the performance of our algorithms to state-of-the-art solutions based on extended Kalman filtering and Gaussian processes.
Tasks Gaussian Processes
Published 2019-02-22
URL https://arxiv.org/abs/1902.08480v2
PDF https://arxiv.org/pdf/1902.08480v2.pdf
PWC https://paperswithcode.com/paper/ares-and-mars-adversarial-and-mmd-minimizing
Repo https://github.com/gabb7/AReS-MaRS
Framework tf
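
For readers unfamiliar with the "MMD" in the title, the sketch below computes a biased estimate of the squared maximum mean discrepancy between two sets of scalar samples. The RBF kernel, the bandwidth, and the biased estimator are assumptions made for illustration; the abstract does not describe the kernel or estimator the authors use.

```python
import math
from typing import Sequence


def rbf_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    k_xx = sum(rbf_kernel(a, b, bandwidth) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(rbf_kernel(a, b, bandwidth) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(rbf_kernel(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


# Invented samples, for illustration only.
reference = [0.0, 0.1, 0.2]
close = [0.05, 0.15, 0.25]
far = [5.0, 5.1, 5.2]

mmd_close = mmd_squared(reference, close)
mmd_far = mmd_squared(reference, far)
```

The estimate is zero when both sample sets are identical and grows as the two sets move apart, which is what makes it usable as a quantity to minimize when matching simulated samples to observations.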

Gendered Pronoun Resolution using BERT and an extractive question answering formulation

Title Gendered Pronoun Resolution using BERT and an extractive question answering formulation
Authors Rakesh Chada
Abstract The resolution of ambiguous pronouns is a longstanding challenge in Natural Language Understanding. Recent studies have suggested gender bias among state-of-the-art coreference resolution systems. As an example, Google AI Language team recently released a gender-balanced dataset and showed that performance of these coreference resolvers is significantly limited on the dataset. In this paper, we propose an extractive question answering (QA) formulation of pronoun resolution task that overcomes this limitation and shows much lower gender bias (0.99) on their dataset. This system uses fine-tuned representations from the pre-trained BERT model and outperforms the existing baseline by a significant margin (22.2% absolute improvement in F1 score) without using any hand-engineered features. This QA framework is equally performant even without the knowledge of the candidate antecedents of the pronoun. An ensemble of QA and BERT-based multiple choice and sequence classification models further improves the F1 (23.3% absolute improvement upon the baseline). This ensemble model was submitted to the shared task for the 1st ACL workshop on Gender Bias for Natural Language Processing. It ranked 9th on the final official leaderboard. Source code is available at https://github.com/rakeshchada/corefqa
Tasks Coreference Resolution, Question Answering
Published 2019-06-09
URL https://arxiv.org/abs/1906.03695v1
PDF https://arxiv.org/pdf/1906.03695v1.pdf
PWC https://paperswithcode.com/paper/gendered-pronoun-resolution-using-bert-and-an
Repo https://github.com/rakeshchada/corefqa
Framework pytorch

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering

Title Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering
Authors Gene-Ping Yang, Chao-I Tuan, Hung-Yi Lee, Lin-shan Lee
Abstract Speech separation has been very successful with deep learning techniques. Substantial effort has been reported based on approaches over spectrogram, which is well known as the standard time-and-frequency cross-domain representation for speech signals. It is highly correlated to the phonetic structure of speech, or “how the speech sounds” when perceived by human, but primarily frequency domain features carrying temporal behaviour. Very impressive work achieving speech separation over time domain was reported recently, probably because waveforms in time domain may describe the different realizations of speech in a more precise way than spectrogram. In this paper, we propose a framework properly integrating the above two directions, hoping to achieve both purposes. We construct a time-and-frequency feature map by concatenating the 1-dim convolution encoded feature map (for time domain) and the spectrogram (for frequency domain), which was then processed by an embedding network and clustering approaches very similar to those used in time and frequency domain prior works. In this way, the information in the time and frequency domains, as well as the interactions between them, can be jointly considered during embedding and clustering. Very encouraging results (state-of-the-art to our knowledge) were obtained with WSJ0-2mix dataset in preliminary experiments.
Tasks Speech Separation
Published 2019-04-16
URL http://arxiv.org/abs/1904.07845v1
PDF http://arxiv.org/pdf/1904.07845v1.pdf
PWC https://paperswithcode.com/paper/improved-speech-separation-with-time-and
Repo https://github.com/r06944010/improved-speech-separation
Framework tf

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers

Title SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers
Authors Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough
Abstract The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment. The Internet of Things (IoT) promises to inject machine learning into many of these every-day objects via tiny, cheap MCUs. However, these resource-impoverished hardware platforms severely limit the complexity of machine learning models that can be deployed. For example, although convolutional neural networks (CNNs) achieve state-of-the-art results on many visual recognition tasks, CNN inference on MCUs is challenging due to severe finite memory limitations. To circumvent the memory challenge associated with CNNs, various alternatives have been proposed that do fit within the memory budget of an MCU, albeit at the cost of prediction accuracy. This paper challenges the idea that CNNs are not suitable for deployment on MCUs. We demonstrate that it is possible to automatically design CNNs which generalize well, while also being small enough to fit onto memory-limited MCUs. Our Sparse Architecture Search method combines neural architecture search with pruning in a single, unified approach, which learns superior models on four popular IoT datasets. The CNNs we find are more accurate and up to $4.35\times$ smaller than previous approaches, while meeting the strict MCU working memory constraint.
Tasks Neural Architecture Search
Published 2019-05-28
URL https://arxiv.org/abs/1905.12107v1
PDF https://arxiv.org/pdf/1905.12107v1.pdf
PWC https://paperswithcode.com/paper/sparse-sparse-architecture-search-for-cnns-on
Repo https://github.com/patrickthomashansen/ubit_uTensor_demo
Framework tf

Semantic Relatedness Based Re-ranker for Text Spotting

Title Semantic Relatedness Based Re-ranker for Text Spotting
Authors Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró
Abstract Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. We present a scenario where semantic similarity is not enough, and we devise a neural approach to learn semantic relatedness. The scenario is text spotting in the wild, where a text in an image (e.g. street sign, advertisement or bus destination) must be identified and recognized. Our goal is to improve the performance of vision systems by leveraging semantic information. Our rationale is that the text to be spotted is often related to the image context in which it appears (word pairs such as Delta-airplane, or quarters-parking are not similar, but are clearly related). We show how learning a word-to-word or word-to-sentence relatedness score can improve the performance of text spotting systems up to 2.9 points, outperforming other measures in a benchmark dataset.
Tasks Dimensionality Reduction, Natural Language Inference, Semantic Similarity, Semantic Textual Similarity, Text Spotting
Published 2019-09-17
URL https://arxiv.org/abs/1909.07950v2
PDF https://arxiv.org/pdf/1909.07950v2.pdf
PWC https://paperswithcode.com/paper/semantic-relatedness-based-re-ranker-for-text
Repo https://github.com/ahmedssabir/dataset
Framework none

SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings

Title SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings
Authors Todor Mihaylov, Preslav Nakov
Abstract We describe our system for finding good answers in a community forum, as defined in SemEval-2016, Task 3 on Community Question Answering. Our approach relies on several semantic similarity features based on fine-tuned word embeddings and topics similarities. In the main Subtask C, our primary submission was ranked third, with a MAP of 51.68 and accuracy of 69.94. In Subtask A, our primary submission was also third, with MAP of 77.58 and accuracy of 73.39.
Tasks Community Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published 2019-11-20
URL https://arxiv.org/abs/1911.08743v1
PDF https://arxiv.org/pdf/1911.08743v1.pdf
PWC https://paperswithcode.com/paper/semanticz-at-semeval-2016-task-3-ranking-1
Repo https://github.com/tbmihailov/semeval2016-task3-CQA
Framework none
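
The MAP scores quoted in the abstract are mean average precision over ranked answer lists. The sketch below shows one common way to compute it, under the assumption that every relevant answer appears somewhere in the ranked list, as is the case when re-ranking a fixed set of answers. The official SemEval scorer may differ in details, and the relevance lists here are invented.

```python
from typing import List, Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings: List[Sequence[bool]]) -> float:
    """Mean of average precision across all questions."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Invented relevance judgments for two questions, in ranked order.
question_1 = [True, False, True]
question_2 = [False, True]

map_score = mean_average_precision([question_1, question_2])
```

Here the first question scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3. The abstract reports the same kind of value on a 0 to 100 scale.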

Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks

Title Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks
Authors Niko Sünderhauf
Abstract Emerging object-based SLAM algorithms can build a graph representation of an environment comprising nodes for robot poses and object landmarks. However, while this map will contain static objects such as furniture or appliances, many moveable objects (e.g. the car keys, the glasses, or a magazine), are not suitable as landmarks and will not be part of the map due to their non-static nature. We show that Graph Convolutional Networks can learn navigation policies to find such unmapped objects by learning to exploit the hidden probabilistic model that governs where these objects appear in the environment. The learned policies can generalise to object classes unseen during training by using word vectors that express semantic similarity as representations for object nodes in the graph. Furthermore, we show that the policies generalise to unseen environments with only minimal loss of performance. We demonstrate that pre-training the policy network with a proxy task can significantly speed up learning, improving sample efficiency. Code for this paper is available at https://github.com/nikosuenderhauf/graphConvNetsForNavigation.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2019-09-16
URL https://arxiv.org/abs/1909.07376v1
PDF https://arxiv.org/pdf/1909.07376v1.pdf
PWC https://paperswithcode.com/paper/where-are-the-keys-learning-object-centric
Repo https://github.com/nikosuenderhauf/graphConvNetsForNavigation
Framework none