January 31, 2020


Paper Group AWR 370

A Logic-Driven Framework for Consistency of Neural Models. Document Expansion by Query Prediction. Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks. Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation. SafeLife 1.0: Exploring Side Effects in Complex Environments. Crop Yield Prediction Using Deep Neural …

A Logic-Driven Framework for Consistency of Neural Models


Title	A Logic-Driven Framework for Consistency of Neural Models
Authors	Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar
Abstract	While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a learning framework for constraining models using logic rules to regularize them away from inconsistency. Our framework can leverage both labeled and unlabeled examples and is directly compatible with off-the-shelf learning schemes without model redesign. We instantiate our framework on natural language inference, where experiments show that enforcing invariants stated in logic can help make the predictions of neural models both accurate and consistent.
Tasks	Natural Language Inference
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00126v4
PDF	https://arxiv.org/pdf/1909.00126v4.pdf
PWC	https://paperswithcode.com/paper/a-logic-driven-framework-for-consistency-of
Repo	https://github.com/utahnlp/consistency
Framework	pytorch
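
The abstract's idea of regularizing a model toward a logical invariant can be illustrated with a single rule. The following is a minimal sketch, assuming a symmetry rule for NLI (if P contradicts H, then H contradicts P) and a log-space penalty on the gap between the two predicted contradiction probabilities. The rule choice, the function name `symmetry_penalty`, and the penalty form are illustrative assumptions and are not confirmed by this listing as the authors' implementation.

```python
import math


def symmetry_penalty(p_contra_ph: float, p_contra_hp: float, eps: float = 1e-12) -> float:
    """Penalty for violating 'contradiction is symmetric' on one sentence pair.

    p_contra_ph: model probability of contradiction for (premise, hypothesis).
    p_contra_hp: model probability of contradiction for the swapped pair.
    Returns |log p_ph - log p_hp|, which is 0 when the two predictions agree.
    (Hypothetical helper; eps only guards against log(0).)
    """
    return abs(math.log(p_contra_ph + eps) - math.log(p_contra_hp + eps))


# Toy probabilities, not results from the paper.
consistent = symmetry_penalty(0.9, 0.9)
inconsistent = symmetry_penalty(0.9, 0.1)
```

A penalty of this kind needs no gold label, only the model's own predictions on a pair and its swapped counterpart, which is one way a consistency term could be computed on unlabeled examples.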

Document Expansion by Query Prediction


Title	Document Expansion by Query Prediction
Authors	Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho
Abstract	One technique to improve the retrieval effectiveness of a search engine is to expand documents with terms that are related or representative of the documents’ content. From the perspective of a question answering system, this might comprise questions the document can potentially answer. Following this observation, we propose a simple method that predicts which queries will be issued for a given document and then expands it with those predictions with a vanilla sequence-to-sequence model, trained using datasets consisting of pairs of query and relevant documents. By combining our method with a highly-effective re-ranking component, we achieve the state of the art in two retrieval tasks. In a latency-critical regime, retrieval results alone (without re-ranking) approach the effectiveness of more computationally expensive neural re-rankers but are much faster.
Tasks	Passage Re-Ranking, Question Answering
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08375v2
PDF	https://arxiv.org/pdf/1904.08375v2.pdf
PWC	https://paperswithcode.com/paper/document-expansion-by-query-prediction
Repo	https://github.com/castorini/Anserini
Framework	none

Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks


Title	Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks
Authors	Qi She, Anqi Wu
Abstract	Latent dynamics discovery is challenging in extracting complex dynamics from high-dimensional noisy neural data. Many dimensionality reduction methods have been widely adopted to extract low-dimensional, smooth and time-evolving latent trajectories. However, simple state transition structures, linear embedding assumptions, or inflexible inference networks impede the accurate recovery of dynamic portraits. In this paper, we propose a novel latent dynamic model that is capable of capturing nonlinear, non-Markovian, long short-term time-dependent dynamics via recurrent neural networks and tackling complex nonlinear embedding via non-parametric Gaussian process. Due to the complexity and intractability of the model and its inference, we also provide a powerful inference network with bi-directional long short-term memory networks that encode both past and future information into posterior distributions. In the experiment, we show that our model outperforms other state-of-the-art methods in reconstructing insightful latent dynamics from both simulated and experimental neural datasets with either Gaussian or Poisson observations, especially in the low-sample scenario. Our codes and additional materials are available at https://github.com/sheqi/GP-RNN_UAI2019.
Tasks	Dimensionality Reduction, Time Series, Time Series Analysis
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00650v1
PDF	https://arxiv.org/pdf/1907.00650v1.pdf
PWC	https://paperswithcode.com/paper/neural-dynamics-discovery-via-gaussian
Repo	https://github.com/sheqi/GP-RNN_UAI2019
Framework	tf

Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation


Title	Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
Authors	Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki
Abstract	We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of CCG corpora exploiting cheaper resources of dependency trees. Our solution is conceptually simple, and not relying on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When applied to the proposed method, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems.
Tasks	Domain Adaptation
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01834v1
PDF	https://arxiv.org/pdf/1906.01834v1.pdf
PWC	https://paperswithcode.com/paper/automatic-generation-of-high-quality-ccgbanks
Repo	https://github.com/masashi-y/depccg
Framework	none

SafeLife 1.0: Exploring Side Effects in Complex Environments


Title	SafeLife 1.0: Exploring Side Effects in Complex Environments
Authors	Carroll L. Wainwright, Peter Eckersley
Abstract	We present SafeLife, a publicly available reinforcement learning environment that tests the safety of reinforcement learning agents. It contains complex, dynamic, tunable, procedurally generated levels with many opportunities for unsafe behavior. Agents are graded both on their ability to maximize their explicit reward and on their ability to operate safely without unnecessary side effects. We train agents to maximize rewards using proximal policy optimization and score them on a suite of benchmark levels. The resulting agents are performant but not safe—they tend to cause large side effects in their environments—but they form a baseline against which future safety research can be measured.
Tasks
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01217v1
PDF	https://arxiv.org/pdf/1912.01217v1.pdf
PWC	https://paperswithcode.com/paper/safelife-10-exploring-side-effects-in-complex
Repo	https://github.com/PartnershipOnAI/safelife
Framework	none

Crop Yield Prediction Using Deep Neural Networks


Title	Crop Yield Prediction Using Deep Neural Networks
Authors	Saeed Khaki, Lizhi Wang
Abstract	Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires fundamental understanding of the functional relationship between yield and these interactive factors, and to reveal such relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have a superior prediction accuracy, with a root-mean-square-error (RMSE) being 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without significant drop in the prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on the crop yield than genotype.
Tasks	Feature Selection
Published	2019-02-07
URL	https://arxiv.org/abs/1902.02860v3
PDF	https://arxiv.org/pdf/1902.02860v3.pdf
PWC	https://paperswithcode.com/paper/crop-yield-prediction-using-deep-neural
Repo	https://github.com/saeedkhaki92/Yield-Prediction-DNN
Framework	tf
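
The abstract reports RMSE as a percentage of the average yield and of the standard deviation. A minimal sketch of that arithmetic follows, using made-up toy numbers. The helper name `relative_rmse` and the use of the population standard deviation are assumptions, since the listing does not say which variant the authors used.

```python
import math
import statistics


def relative_rmse(y_true, y_pred):
    """Return (RMSE, RMSE / mean of y_true, RMSE / std of y_true)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(squared_errors) / len(y_true))
    mean = statistics.fmean(y_true)
    std = statistics.pstdev(y_true)  # population std: an assumption
    return rmse, rmse / mean, rmse / std


# Toy yields for illustration only, not data from the paper.
y_true = [100.0, 120.0, 80.0, 110.0]
y_pred = [105.0, 115.0, 85.0, 105.0]
rmse, frac_of_mean, frac_of_std = relative_rmse(y_true, y_pred)
```

Normalizing by the mean puts the error on the scale of the typical yield, while normalizing by the standard deviation compares the model against always predicting the average, which would score roughly 100% on that measure.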

Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs


Title	Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs
Authors	Soumya Sharma, Bishal Santra, Abhik Jana, T. Y. S. S. Santosh, Niloy Ganguly, Pawan Goyal
Abstract	Recently, biomedical version of embeddings obtained from language models such as BioELMo have shown state-of-the-art results for the textual inference task in the medical domain. In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. Specifically, we experiment with fusing embeddings obtained from knowledge graph with the state-of-the-art approaches for NLI task (ESIM model). We also experiment with fusing the domain-specific sentiment information for the task. Experiments conducted on MedNLI dataset clearly show that this strategy improves the baseline BioELMo architecture for the Medical NLI task.
Tasks	Knowledge Graphs
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00160v1
PDF	https://arxiv.org/pdf/1909.00160v1.pdf
PWC	https://paperswithcode.com/paper/incorporating-domain-knowledge-into-medical
Repo	https://github.com/soummyaah/KGMedNLI
Framework	pytorch

The Nipple-Areola Complex for Criminal Identification


Title	The Nipple-Areola Complex for Criminal Identification
Authors	Wojciech Michal Matkowski, Krzysztof Matkowski, Adams Wai-Kin Kong, Cory Lloyd Hall
Abstract	In digital and multimedia forensics, identification of child sexual offenders based on digital evidence images is highly challenging due to the fact that the offender’s face or other obvious characteristics such as tattoos are occluded, covered, or not visible at all. Nevertheless, other naked body parts, e.g., chest are still visible. Some researchers proposed skin marks, skin texture, vein or androgenic hair patterns for criminal and victim identification. There are no available studies of nipple-areola complex (NAC) for offender identification. In this paper, we present a study of offender identification based on the NAC, and we present NTU-Nipple-v1 dataset, which contains 2732 images of 428 different male nipple-areolae. Popular deep learning and hand-crafted recognition methods are evaluated on the provided dataset. The results indicate that the NAC can be a useful characteristic for offender identification.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11651v1
PDF	https://arxiv.org/pdf/1905.11651v1.pdf
PWC	https://paperswithcode.com/paper/the-nipple-areola-complex-for-criminal
Repo	https://github.com/BFLTeam/NTU_Dataset
Framework	none

AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs


Title	AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs
Authors	Gabriele Abbati, Philippe Wenk, Michael A Osborne, Andreas Krause, Bernhard Schölkopf, Stefan Bauer
Abstract	Stochastic differential equations are an important modeling class in many disciplines. Consequently, there exist many methods relying on various discretization and numerical integration schemes. In this paper, we propose a novel, probabilistic model for estimating the drift and diffusion given noisy observations of the underlying stochastic system. Using state-of-the-art adversarial and moment matching inference techniques, we avoid the discretization schemes of classical approaches. This leads to significant improvements in parameter accuracy and robustness given random initial guesses. On four established benchmark systems, we compare the performance of our algorithms to state-of-the-art solutions based on extended Kalman filtering and Gaussian processes.
Tasks	Gaussian Processes
Published	2019-02-22
URL	https://arxiv.org/abs/1902.08480v2
PDF	https://arxiv.org/pdf/1902.08480v2.pdf
PWC	https://paperswithcode.com/paper/ares-and-mars-adversarial-and-mmd-minimizing
Repo	https://github.com/gabb7/AReS-MaRS
Framework	tf
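
The "MMD" in the title refers to maximum mean discrepancy, a kernel-based distance between two sets of samples. The sketch below computes the standard biased estimate of squared MMD for 1-D samples. The RBF kernel, the bandwidth, and the function names are illustrative assumptions; the listing does not state which kernel or estimator the authors use.

```python
import math


def rbf(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth: float = 1.0) -> float:
    """Biased (V-statistic) estimate of squared MMD between two 1-D samples."""
    kxx = sum(rbf(a, b, bandwidth) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, bandwidth) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


# Toy samples for illustration only.
same = mmd_squared([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
shifted = mmd_squared([0.0, 1.0, 2.0], [5.0, 6.0, 7.0])
```

Identical samples give an estimate of zero and well-separated samples give a clearly positive value, which is the property that makes the quantity usable as a sample-matching objective.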

Gendered Pronoun Resolution using BERT and an extractive question answering formulation


Title	Gendered Pronoun Resolution using BERT and an extractive question answering formulation
Authors	Rakesh Chada
Abstract	The resolution of ambiguous pronouns is a longstanding challenge in Natural Language Understanding. Recent studies have suggested gender bias among state-of-the-art coreference resolution systems. As an example, Google AI Language team recently released a gender-balanced dataset and showed that performance of these coreference resolvers is significantly limited on the dataset. In this paper, we propose an extractive question answering (QA) formulation of pronoun resolution task that overcomes this limitation and shows much lower gender bias (0.99) on their dataset. This system uses fine-tuned representations from the pre-trained BERT model and outperforms the existing baseline by a significant margin (22.2% absolute improvement in F1 score) without using any hand-engineered features. This QA framework is equally performant even without the knowledge of the candidate antecedents of the pronoun. An ensemble of QA and BERT-based multiple choice and sequence classification models further improves the F1 (23.3% absolute improvement upon the baseline). This ensemble model was submitted to the shared task for the 1st ACL workshop on Gender Bias for Natural Language Processing. It ranked 9th on the final official leaderboard. Source code is available at https://github.com/rakeshchada/corefqa
Tasks	Coreference Resolution, Question Answering
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03695v1
PDF	https://arxiv.org/pdf/1906.03695v1.pdf
PWC	https://paperswithcode.com/paper/gendered-pronoun-resolution-using-bert-and-an
Repo	https://github.com/rakeshchada/corefqa
Framework	pytorch

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering


Title	Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering
Authors	Gene-Ping Yang, Chao-I Tuan, Hung-Yi Lee, Lin-shan Lee
Abstract	Speech separation has been very successful with deep learning techniques. Substantial effort has been reported based on approaches over spectrogram, which is well known as the standard time-and-frequency cross-domain representation for speech signals. It is highly correlated to the phonetic structure of speech, or “how the speech sounds” when perceived by human, but primarily frequency domain features carrying temporal behaviour. Very impressive work achieving speech separation over time domain was reported recently, probably because waveforms in time domain may describe the different realizations of speech in a more precise way than spectrogram. In this paper, we propose a framework properly integrating the above two directions, hoping to achieve both purposes. We construct a time-and-frequency feature map by concatenating the 1-dim convolution encoded feature map (for time domain) and the spectrogram (for frequency domain), which was then processed by an embedding network and clustering approaches very similar to those used in time and frequency domain prior works. In this way, the information in the time and frequency domains, as well as the interactions between them, can be jointly considered during embedding and clustering. Very encouraging results (state-of-the-art to our knowledge) were obtained with WSJ0-2mix dataset in preliminary experiments.
Tasks	Speech Separation
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07845v1
PDF	http://arxiv.org/pdf/1904.07845v1.pdf
PWC	https://paperswithcode.com/paper/improved-speech-separation-with-time-and
Repo	https://github.com/r06944010/improved-speech-separation
Framework	tf

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers


Title	SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers
Authors	Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough
Abstract	The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment. The Internet of Things (IoT) promises to inject machine learning into many of these every-day objects via tiny, cheap MCUs. However, these resource-impoverished hardware platforms severely limit the complexity of machine learning models that can be deployed. For example, although convolutional neural networks (CNNs) achieve state-of-the-art results on many visual recognition tasks, CNN inference on MCUs is challenging due to severe finite memory limitations. To circumvent the memory challenge associated with CNNs, various alternatives have been proposed that do fit within the memory budget of an MCU, albeit at the cost of prediction accuracy. This paper challenges the idea that CNNs are not suitable for deployment on MCUs. We demonstrate that it is possible to automatically design CNNs which generalize well, while also being small enough to fit onto memory-limited MCUs. Our Sparse Architecture Search method combines neural architecture search with pruning in a single, unified approach, which learns superior models on four popular IoT datasets. The CNNs we find are more accurate and up to $4.35\times$ smaller than previous approaches, while meeting the strict MCU working memory constraint.
Tasks	Neural Architecture Search
Published	2019-05-28
URL	https://arxiv.org/abs/1905.12107v1
PDF	https://arxiv.org/pdf/1905.12107v1.pdf
PWC	https://paperswithcode.com/paper/sparse-sparse-architecture-search-for-cnns-on
Repo	https://github.com/patrickthomashansen/ubit_uTensor_demo
Framework	tf

Semantic Relatedness Based Re-ranker for Text Spotting


Title	Semantic Relatedness Based Re-ranker for Text Spotting
Authors	Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró
Abstract	Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. We present a scenario where semantic similarity is not enough, and we devise a neural approach to learn semantic relatedness. The scenario is text spotting in the wild, where a text in an image (e.g. street sign, advertisement or bus destination) must be identified and recognized. Our goal is to improve the performance of vision systems by leveraging semantic information. Our rationale is that the text to be spotted is often related to the image context in which it appears (word pairs such as Delta-airplane, or quarters-parking are not similar, but are clearly related). We show how learning a word-to-word or word-to-sentence relatedness score can improve the performance of text spotting systems up to 2.9 points, outperforming other measures in a benchmark dataset.
Tasks	Dimensionality Reduction, Natural Language Inference, Semantic Similarity, Semantic Textual Similarity, Text Spotting
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07950v2
PDF	https://arxiv.org/pdf/1909.07950v2.pdf
PWC	https://paperswithcode.com/paper/semantic-relatedness-based-re-ranker-for-text
Repo	https://github.com/ahmedssabir/dataset
Framework	none

SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings


Title	SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings
Authors	Todor Mihaylov, Preslav Nakov
Abstract	We describe our system for finding good answers in a community forum, as defined in SemEval-2016, Task 3 on Community Question Answering. Our approach relies on several semantic similarity features based on fine-tuned word embeddings and topics similarities. In the main Subtask C, our primary submission was ranked third, with a MAP of 51.68 and accuracy of 69.94. In Subtask A, our primary submission was also third, with MAP of 77.58 and accuracy of 73.39.
Tasks	Community Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08743v1
PDF	https://arxiv.org/pdf/1911.08743v1.pdf
PWC	https://paperswithcode.com/paper/semanticz-at-semeval-2016-task-3-ranking-1
Repo	https://github.com/tbmihailov/semeval2016-task3-CQA
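Framework	none

Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks


Title	Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks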
Authors	Niko Sünderhauf
Abstract	Emerging object-based SLAM algorithms can build a graph representation of an environment comprising nodes for robot poses and object landmarks. However, while this map will contain static objects such as furniture or appliances, many moveable objects (e.g. the car keys, the glasses, or a magazine), are not suitable as landmarks and will not be part of the map due to their non-static nature. We show that Graph Convolutional Networks can learn navigation policies to find such unmapped objects by learning to exploit the hidden probabilistic model that governs where these objects appear in the environment. The learned policies can generalise to object classes unseen during training by using word vectors that express semantic similarity as representations for object nodes in the graph. Furthermore, we show that the policies generalise to unseen environments with only minimal loss of performance. We demonstrate that pre-training the policy network with a proxy task can significantly speed up learning, improving sample efficiency. Code for this paper is available at https://github.com/nikosuenderhauf/graphConvNetsForNavigation.
Tasks	Semantic Similarity, Semantic Textual Similarity
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07376v1
PDF	https://arxiv.org/pdf/1909.07376v1.pdf
PWC	https://paperswithcode.com/paper/where-are-the-keys-learning-object-centric
Repo	https://github.com/nikosuenderhauf/graphConvNetsForNavigation
Framework	none