Paper Group AWR 370
A Logic-Driven Framework for Consistency of Neural Models
Title | A Logic-Driven Framework for Consistency of Neural Models |
Authors | Tao Li, Vivek Gupta, Maitrey Mehta, Vivek Srikumar |
Abstract | While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a learning framework for constraining models using logic rules to regularize them away from inconsistency. Our framework can leverage both labeled and unlabeled examples and is directly compatible with off-the-shelf learning schemes without model redesign. We instantiate our framework on natural language inference, where experiments show that enforcing invariants stated in logic can help make the predictions of neural models both accurate and consistent. |
Tasks | Natural Language Inference |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00126v4, https://arxiv.org/pdf/1909.00126v4.pdf |
PWC | https://paperswithcode.com/paper/a-logic-driven-framework-for-consistency-of |
Repo | https://github.com/utahnlp/consistency |
Framework | pytorch |
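The abstract above talks about "invariants stated in logic" without naming one. As a minimal sketch, assume one such invariant is symmetry of contradiction: if a model labels (P, H) as a contradiction, it should label (H, P) the same way. The function name `symmetry_violation_rate`, the label strings, and the toy lookup-table "model" are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Callable, Iterable, Tuple

Label = str  # assumed label set: "entailment", "contradiction", "neutral"


def symmetry_violation_rate(
    predict: Callable[[str, str], Label],
    pairs: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of applicable pairs that break the rule
    contradiction(P, H) -> contradiction(H, P)."""
    applicable = 0
    violations = 0
    for premise, hypothesis in pairs:
        if predict(premise, hypothesis) != "contradiction":
            continue  # the rule's antecedent does not hold, nothing to check
        applicable += 1
        if predict(hypothesis, premise) != "contradiction":
            violations += 1
    return violations / applicable if applicable else 0.0


# Hypothetical toy "model": a lookup table standing in for a neural classifier.
toy_predictions = {
    ("A man sleeps.", "A man is awake."): "contradiction",
    ("A man is awake.", "A man sleeps."): "neutral",
    ("It rains.", "It is dry."): "contradiction",
    ("It is dry.", "It rains."): "contradiction",
}


def toy_predict(premise: str, hypothesis: str) -> Label:
    return toy_predictions.get((premise, hypothesis), "neutral")


rate = symmetry_violation_rate(toy_predict, list(toy_predictions))
print(f"symmetry violation rate: {rate:.3f}")
```

A rate like this only measures inconsistency. The framework described in the abstract goes further and uses such rules as a training-time regularizer, which this sketch does not attempt.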
Document Expansion by Query Prediction
Title | Document Expansion by Query Prediction |
Authors | Rodrigo Nogueira, Wei Yang, Jimmy Lin, Kyunghyun Cho |
Abstract | One technique to improve the retrieval effectiveness of a search engine is to expand documents with terms that are related or representative of the documents’ content. From the perspective of a question answering system, this might comprise questions the document can potentially answer. Following this observation, we propose a simple method that predicts which queries will be issued for a given document and then expands it with those predictions with a vanilla sequence-to-sequence model, trained using datasets consisting of pairs of query and relevant documents. By combining our method with a highly-effective re-ranking component, we achieve the state of the art in two retrieval tasks. In a latency-critical regime, retrieval results alone (without re-ranking) approach the effectiveness of more computationally expensive neural re-rankers but are much faster. |
Tasks | Passage Re-Ranking, Question Answering |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08375v2, https://arxiv.org/pdf/1904.08375v2.pdf |
PWC | https://paperswithcode.com/paper/document-expansion-by-query-prediction |
Repo | https://github.com/castorini/Anserini |
Framework | none |
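The expansion step described in the abstract above (predict likely queries for a document, then append them before indexing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: `expand_document`, `stub_predict_queries`, and the `num_queries` parameter are hypothetical names, and the stub simply returns fixed strings where a trained sequence-to-sequence model would be used.

```python
from typing import Callable, List


def expand_document(
    document: str,
    predict_queries: Callable[[str], List[str]],
    num_queries: int = 3,
) -> str:
    """Append up to num_queries predicted queries to the document text."""
    queries = predict_queries(document)[:num_queries]
    return " ".join([document] + queries)


def stub_predict_queries(document: str) -> List[str]:
    # Hypothetical stand-in for a trained sequence-to-sequence model.
    return ["what causes tides", "why does the sea level change"]


doc = "Tides are caused by the gravitational pull of the moon."
expanded = expand_document(doc, stub_predict_queries)
print(expanded)
```

The expanded text, rather than the original document, would then be passed to the indexer, so query terms absent from the original passage can still match at retrieval time.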
Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks
Title | Neural Dynamics Discovery via Gaussian Process Recurrent Neural Networks |
Authors | Qi She, Anqi Wu |
Abstract | Latent dynamics discovery is challenging in extracting complex dynamics from high-dimensional noisy neural data. Many dimensionality reduction methods have been widely adopted to extract low-dimensional, smooth and time-evolving latent trajectories. However, simple state transition structures, linear embedding assumptions, or inflexible inference networks impede the accurate recovery of dynamic portraits. In this paper, we propose a novel latent dynamic model that is capable of capturing nonlinear, non-Markovian, long short-term time-dependent dynamics via recurrent neural networks and tackling complex nonlinear embedding via non-parametric Gaussian process. Due to the complexity and intractability of the model and its inference, we also provide a powerful inference network with bi-directional long short-term memory networks that encode both past and future information into posterior distributions. In the experiment, we show that our model outperforms other state-of-the-art methods in reconstructing insightful latent dynamics from both simulated and experimental neural datasets with either Gaussian or Poisson observations, especially in the low-sample scenario. Our codes and additional materials are available at https://github.com/sheqi/GP-RNN_UAI2019. |
Tasks | Dimensionality Reduction, Time Series, Time Series Analysis |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00650v1, https://arxiv.org/pdf/1907.00650v1.pdf |
PWC | https://paperswithcode.com/paper/neural-dynamics-discovery-via-gaussian |
Repo | https://github.com/sheqi/GP-RNN_UAI2019 |
Framework | tf |
Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
Title | Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation |
Authors | Masashi Yoshikawa, Hiroshi Noji, Koji Mineshima, Daisuke Bekki |
Abstract | We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatic generation of CCG corpora exploiting cheaper resources of dependency trees. Our solution is conceptually simple, and not relying on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When applied to the proposed method, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems. |
Tasks | Domain Adaptation |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01834v1, https://arxiv.org/pdf/1906.01834v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-generation-of-high-quality-ccgbanks |
Repo | https://github.com/masashi-y/depccg |
Framework | none |
SafeLife 1.0: Exploring Side Effects in Complex Environments
Title | SafeLife 1.0: Exploring Side Effects in Complex Environments |
Authors | Carroll L. Wainwright, Peter Eckersley |
Abstract | We present SafeLife, a publicly available reinforcement learning environment that tests the safety of reinforcement learning agents. It contains complex, dynamic, tunable, procedurally generated levels with many opportunities for unsafe behavior. Agents are graded both on their ability to maximize their explicit reward and on their ability to operate safely without unnecessary side effects. We train agents to maximize rewards using proximal policy optimization and score them on a suite of benchmark levels. The resulting agents are performant but not safe—they tend to cause large side effects in their environments—but they form a baseline against which future safety research can be measured. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01217v1, https://arxiv.org/pdf/1912.01217v1.pdf |
PWC | https://paperswithcode.com/paper/safelife-10-exploring-side-effects-in-complex |
Repo | https://github.com/PartnershipOnAI/safelife |
Framework | none |
Crop Yield Prediction Using Deep Neural Networks
Title | Crop Yield Prediction Using Deep Neural Networks |
Authors | Saeed Khaki, Lizhi Wang |
Abstract | Crop yield is a highly complex trait determined by multiple factors such as genotype, environment, and their interactions. Accurate yield prediction requires fundamental understanding of the functional relationship between yield and these interactive factors, and to reveal such relationship requires both comprehensive datasets and powerful algorithms. In the 2018 Syngenta Crop Challenge, Syngenta released several large datasets that recorded the genotype and yield performances of 2,267 maize hybrids planted in 2,247 locations between 2008 and 2016 and asked participants to predict the yield performance in 2017. As one of the winning teams, we designed a deep neural network (DNN) approach that took advantage of state-of-the-art modeling and solution techniques. Our model was found to have a superior prediction accuracy, with a root-mean-square-error (RMSE) being 12% of the average yield and 50% of the standard deviation for the validation dataset using predicted weather data. With perfect weather data, the RMSE would be reduced to 11% of the average yield and 46% of the standard deviation. We also performed feature selection based on the trained DNN model, which successfully decreased the dimension of the input space without significant drop in the prediction accuracy. Our computational results suggested that this model significantly outperformed other popular methods such as Lasso, shallow neural networks (SNN), and regression tree (RT). The results also revealed that environmental factors had a greater effect on the crop yield than genotype. |
Tasks | Feature Selection |
Published | 2019-02-07 |
URL | https://arxiv.org/abs/1902.02860v3, https://arxiv.org/pdf/1902.02860v3.pdf |
PWC | https://paperswithcode.com/paper/crop-yield-prediction-using-deep-neural |
Repo | https://github.com/saeedkhaki92/Yield-Prediction-DNN |
Framework | tf |
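The abstract above reports RMSE as a percentage of the average yield and of the standard deviation. A minimal sketch of that normalization is given below. The function name `relative_rmse` and the sample numbers are hypothetical, and using the population standard deviation (`statistics.pstdev`) is an assumption, since the abstract does not say which estimator was used.

```python
import math
from statistics import mean, pstdev
from typing import Sequence, Tuple


def relative_rmse(
    actual: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float]:
    """Return (RMSE / mean of actual, RMSE / std of actual)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    rmse = math.sqrt(mean(squared_errors))
    # Assumption: population standard deviation of the observed values.
    return rmse / mean(actual), rmse / pstdev(actual)


# Made-up yields, for illustration only.
actual_yield = [100.0, 120.0, 80.0, 100.0]
predicted_yield = [110.0, 110.0, 90.0, 90.0]
pct_of_mean, pct_of_std = relative_rmse(actual_yield, predicted_yield)
print(f"RMSE is {pct_of_mean:.1%} of the mean and {pct_of_std:.1%} of the std")
```

Reporting both ratios is useful because the first shows the error relative to typical yield, while the second shows how much better the model does than always predicting the mean (a ratio of 100% of the standard deviation).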
Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs
Title | Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs |
Authors | Soumya Sharma, Bishal Santra, Abhik Jana, T. Y. S. S. Santosh, Niloy Ganguly, Pawan Goyal |
Abstract | Recently, biomedical version of embeddings obtained from language models such as BioELMo have shown state-of-the-art results for the textual inference task in the medical domain. In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. Specifically, we experiment with fusing embeddings obtained from knowledge graph with the state-of-the-art approaches for NLI task (ESIM model). We also experiment with fusing the domain-specific sentiment information for the task. Experiments conducted on MedNLI dataset clearly show that this strategy improves the baseline BioELMo architecture for the Medical NLI task. |
Tasks | Knowledge Graphs |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00160v1, https://arxiv.org/pdf/1909.00160v1.pdf |
PWC | https://paperswithcode.com/paper/incorporating-domain-knowledge-into-medical |
Repo | https://github.com/soummyaah/KGMedNLI |
Framework | pytorch |
The Nipple-Areola Complex for Criminal Identification
Title | The Nipple-Areola Complex for Criminal Identification |
Authors | Wojciech Michal Matkowski, Krzysztof Matkowski, Adams Wai-Kin Kong, Cory Lloyd Hall |
Abstract | In digital and multimedia forensics, identification of child sexual offenders based on digital evidence images is highly challenging due to the fact that the offender’s face or other obvious characteristics such as tattoos are occluded, covered, or not visible at all. Nevertheless, other naked body parts, e.g., chest are still visible. Some researchers proposed skin marks, skin texture, vein or androgenic hair patterns for criminal and victim identification. There are no available studies of nipple-areola complex (NAC) for offender identification. In this paper, we present a study of offender identification based on the NAC, and we present NTU-Nipple-v1 dataset, which contains 2732 images of 428 different male nipple-areolae. Popular deep learning and hand-crafted recognition methods are evaluated on the provided dataset. The results indicate that the NAC can be a useful characteristic for offender identification. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11651v1, https://arxiv.org/pdf/1905.11651v1.pdf |
PWC | https://paperswithcode.com/paper/the-nipple-areola-complex-for-criminal |
Repo | https://github.com/BFLTeam/NTU_Dataset |
Framework | none |
AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs
Title | AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs |
Authors | Gabriele Abbati, Philippe Wenk, Michael A Osborne, Andreas Krause, Bernhard Schölkopf, Stefan Bauer |
Abstract | Stochastic differential equations are an important modeling class in many disciplines. Consequently, there exist many methods relying on various discretization and numerical integration schemes. In this paper, we propose a novel, probabilistic model for estimating the drift and diffusion given noisy observations of the underlying stochastic system. Using state-of-the-art adversarial and moment matching inference techniques, we avoid the discretization schemes of classical approaches. This leads to significant improvements in parameter accuracy and robustness given random initial guesses. On four established benchmark systems, we compare the performance of our algorithms to state-of-the-art solutions based on extended Kalman filtering and Gaussian processes. |
Tasks | Gaussian Processes |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08480v2, https://arxiv.org/pdf/1902.08480v2.pdf |
PWC | https://paperswithcode.com/paper/ares-and-mars-adversarial-and-mmd-minimizing |
Repo | https://github.com/gabb7/AReS-MaRS |
Framework | tf |
Gendered Pronoun Resolution using BERT and an extractive question answering formulation
Title | Gendered Pronoun Resolution using BERT and an extractive question answering formulation |
Authors | Rakesh Chada |
Abstract | The resolution of ambiguous pronouns is a longstanding challenge in Natural Language Understanding. Recent studies have suggested gender bias among state-of-the-art coreference resolution systems. As an example, Google AI Language team recently released a gender-balanced dataset and showed that performance of these coreference resolvers is significantly limited on the dataset. In this paper, we propose an extractive question answering (QA) formulation of pronoun resolution task that overcomes this limitation and shows much lower gender bias (0.99) on their dataset. This system uses fine-tuned representations from the pre-trained BERT model and outperforms the existing baseline by a significant margin (22.2% absolute improvement in F1 score) without using any hand-engineered features. This QA framework is equally performant even without the knowledge of the candidate antecedents of the pronoun. An ensemble of QA and BERT-based multiple choice and sequence classification models further improves the F1 (23.3% absolute improvement upon the baseline). This ensemble model was submitted to the shared task for the 1st ACL workshop on Gender Bias for Natural Language Processing. It ranked 9th on the final official leaderboard. Source code is available at https://github.com/rakeshchada/corefqa |
Tasks | Coreference Resolution, Question Answering |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03695v1, https://arxiv.org/pdf/1906.03695v1.pdf |
PWC | https://paperswithcode.com/paper/gendered-pronoun-resolution-using-bert-and-an |
Repo | https://github.com/rakeshchada/corefqa |
Framework | pytorch |
Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering
Title | Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering |
Authors | Gene-Ping Yang, Chao-I Tuan, Hung-Yi Lee, Lin-shan Lee |
Abstract | Speech separation has been very successful with deep learning techniques. Substantial effort has been reported based on approaches over spectrogram, which is well known as the standard time-and-frequency cross-domain representation for speech signals. It is highly correlated to the phonetic structure of speech, or “how the speech sounds” when perceived by human, but primarily frequency domain features carrying temporal behaviour. Very impressive work achieving speech separation over time domain was reported recently, probably because waveforms in time domain may describe the different realizations of speech in a more precise way than spectrogram. In this paper, we propose a framework properly integrating the above two directions, hoping to achieve both purposes. We construct a time-and-frequency feature map by concatenating the 1-dim convolution encoded feature map (for time domain) and the spectrogram (for frequency domain), which was then processed by an embedding network and clustering approaches very similar to those used in time and frequency domain prior works. In this way, the information in the time and frequency domains, as well as the interactions between them, can be jointly considered during embedding and clustering. Very encouraging results (state-of-the-art to our knowledge) were obtained with WSJ0-2mix dataset in preliminary experiments. |
Tasks | Speech Separation |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07845v1, http://arxiv.org/pdf/1904.07845v1.pdf |
PWC | https://paperswithcode.com/paper/improved-speech-separation-with-time-and |
Repo | https://github.com/r06944010/improved-speech-separation |
Framework | tf |
SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers
Title | SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers |
Authors | Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough |
Abstract | The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment. The Internet of Things (IoT) promises to inject machine learning into many of these every-day objects via tiny, cheap MCUs. However, these resource-impoverished hardware platforms severely limit the complexity of machine learning models that can be deployed. For example, although convolutional neural networks (CNNs) achieve state-of-the-art results on many visual recognition tasks, CNN inference on MCUs is challenging due to severe finite memory limitations. To circumvent the memory challenge associated with CNNs, various alternatives have been proposed that do fit within the memory budget of an MCU, albeit at the cost of prediction accuracy. This paper challenges the idea that CNNs are not suitable for deployment on MCUs. We demonstrate that it is possible to automatically design CNNs which generalize well, while also being small enough to fit onto memory-limited MCUs. Our Sparse Architecture Search method combines neural architecture search with pruning in a single, unified approach, which learns superior models on four popular IoT datasets. The CNNs we find are more accurate and up to $4.35\times$ smaller than previous approaches, while meeting the strict MCU working memory constraint. |
Tasks | Neural Architecture Search |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12107v1, https://arxiv.org/pdf/1905.12107v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-sparse-architecture-search-for-cnns-on |
Repo | https://github.com/patrickthomashansen/ubit_uTensor_demo |
Framework | tf |
Semantic Relatedness Based Re-ranker for Text Spotting
Title | Semantic Relatedness Based Re-ranker for Text Spotting |
Authors | Ahmed Sabir, Francesc Moreno-Noguer, Lluís Padró |
Abstract | Applications such as textual entailment, plagiarism detection or document clustering rely on the notion of semantic similarity, and are usually approached with dimension reduction techniques like LDA or with embedding-based neural approaches. We present a scenario where semantic similarity is not enough, and we devise a neural approach to learn semantic relatedness. The scenario is text spotting in the wild, where a text in an image (e.g. street sign, advertisement or bus destination) must be identified and recognized. Our goal is to improve the performance of vision systems by leveraging semantic information. Our rationale is that the text to be spotted is often related to the image context in which it appears (word pairs such as Delta-airplane, or quarters-parking are not similar, but are clearly related). We show how learning a word-to-word or word-to-sentence relatedness score can improve the performance of text spotting systems up to 2.9 points, outperforming other measures in a benchmark dataset. |
Tasks | Dimensionality Reduction, Natural Language Inference, Semantic Similarity, Semantic Textual Similarity, Text Spotting |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07950v2, https://arxiv.org/pdf/1909.07950v2.pdf |
PWC | https://paperswithcode.com/paper/semantic-relatedness-based-re-ranker-for-text |
Repo | https://github.com/ahmedssabir/dataset |
Framework | none |
SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings
Title | SemanticZ at SemEval-2016 Task 3: Ranking Relevant Answers in Community Question Answering Using Semantic Similarity Based on Fine-tuned Word Embeddings |
Authors | Todor Mihaylov, Preslav Nakov |
Abstract | We describe our system for finding good answers in a community forum, as defined in SemEval-2016, Task 3 on Community Question Answering. Our approach relies on several semantic similarity features based on fine-tuned word embeddings and topics similarities. In the main Subtask C, our primary submission was ranked third, with a MAP of 51.68 and accuracy of 69.94. In Subtask A, our primary submission was also third, with MAP of 77.58 and accuracy of 73.39. |
Tasks | Community Question Answering, Question Answering, Semantic Similarity, Semantic Textual Similarity, Word Embeddings |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08743v1, https://arxiv.org/pdf/1911.08743v1.pdf |
PWC | https://paperswithcode.com/paper/semanticz-at-semeval-2016-task-3-ranking-1 |
Repo | https://github.com/tbmihailov/semeval2016-task3-CQA |
Framework | none |
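The SemanticZ results above are reported as MAP (mean average precision) over ranked answer lists. As a rough sketch of how that metric is commonly computed, the snippet below averages precision at each relevant rank per question and then averages across questions. The function names and the toy rankings are hypothetical, and the official SemEval scorer may differ in details, for example in how it treats questions with no relevant answers or applies a rank cutoff.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average of precision@k over the ranks k that hold a relevant answer."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    # Assumption: a question with no relevant answers contributes 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Mean of average_precision over all questions."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Each inner list marks, in ranked order, whether an answer is relevant.
toy_rankings = [
    [True, False, True],  # AP = (1/1 + 2/3) / 2
    [False, True],        # AP = (1/2) / 1
]
map_score = mean_average_precision(toy_rankings)
print(f"MAP = {100 * map_score:.2f}")
```

Multiplying by 100, as in the print statement, matches the 0 to 100 scale on which the abstract reports its scores.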
Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks
Title | Where are the Keys? – Learning Object-Centric Navigation Policies on Semantic Maps with Graph Convolutional Networks |
Authors | Niko Sünderhauf |
Abstract | Emerging object-based SLAM algorithms can build a graph representation of an environment comprising nodes for robot poses and object landmarks. However, while this map will contain static objects such as furniture or appliances, many moveable objects (e.g. the car keys, the glasses, or a magazine), are not suitable as landmarks and will not be part of the map due to their non-static nature. We show that Graph Convolutional Networks can learn navigation policies to find such unmapped objects by learning to exploit the hidden probabilistic model that governs where these objects appear in the environment. The learned policies can generalise to object classes unseen during training by using word vectors that express semantic similarity as representations for object nodes in the graph. Furthermore, we show that the policies generalise to unseen environments with only minimal loss of performance. We demonstrate that pre-training the policy network with a proxy task can significantly speed up learning, improving sample efficiency. Code for this paper is available at https://github.com/nikosuenderhauf/graphConvNetsForNavigation. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07376v1, https://arxiv.org/pdf/1909.07376v1.pdf |
PWC | https://paperswithcode.com/paper/where-are-the-keys-learning-object-centric |
Repo | https://github.com/nikosuenderhauf/graphConvNetsForNavigation |
Framework | none |