Paper Group ANR 1585
Discourse Level Factors for Sentence Deletion in Text Simplification
Title | Discourse Level Factors for Sentence Deletion in Text Simplification |
Authors | Yang Zhong, Chao Jiang, Wei Xu, Junyi Jessy Li |
Abstract | This paper presents a data-driven study focusing on analyzing and predicting sentence deletion — a prevalent but understudied phenomenon in document simplification — on a large English text simplification corpus. We inspect various document and discourse factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected. We reveal that professional editors utilize different strategies to meet readability standards of elementary and middle schools. To predict whether a sentence will be deleted during simplification to a certain level, we harness automatically aligned data to train a classification model. Evaluated on our manually annotated data, our best models reached F1 scores of 65.2 and 59.7 for this task at the levels of elementary and middle school, respectively. We find that discourse level factors contribute to the challenging task of predicting sentence deletion for simplification. |
Tasks | Text Simplification |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10384v2 |
https://arxiv.org/pdf/1911.10384v2.pdf | |
PWC | https://paperswithcode.com/paper/discourse-level-factors-for-sentence-deletion |
Repo | |
Framework | |
Unaligned Image-to-Sequence Transformation with Loop Consistency
Title | Unaligned Image-to-Sequence Transformation with Loop Consistency |
Authors | Siyang Wang, Justin Lazarow, Kwonjoon Lee, Zhuowen Tu |
Abstract | We tackle the problem of modeling sequential visual phenomena. Given examples of a phenomenon that can be divided into discrete time steps, we aim to take an input from any such time and realize this input at all other time steps in the sequence. Furthermore, we aim to do this without ground-truth aligned sequences – avoiding the difficulty of gathering aligned data. This generalizes the unpaired image-to-image problem from generating pairs to generating sequences. We extend cycle consistency to loop consistency and alleviate difficulties associated with learning in the resulting long chains of computation. We show competitive results compared to existing image-to-image techniques when modeling several different data sets including the Earth’s seasons and aging of human faces. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04149v1 |
https://arxiv.org/pdf/1910.04149v1.pdf | |
PWC | https://paperswithcode.com/paper/unaligned-image-to-sequence-transformation |
Repo | |
Framework | |
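The loop-consistency idea above can be illustrated with a toy numpy sketch (ours, not the paper's GAN training setup): composing every per-step generator around the loop, e.g. spring → summer → autumn → winter → spring, should return the input, and the deviation is penalized.

```python
import numpy as np

def loop_consistency_loss(generators, x):
    """L1 penalty for a chain of generators failing to close the loop.

    Applying each per-step generator in order should map the input
    back onto itself; any residual is the loop-consistency loss.
    """
    out = x
    for g in generators:
        out = g(out)
    return float(np.mean(np.abs(out - x)))

# Toy loop: four 90-degree rotations compose to the identity.
rot90 = np.array([[0.0, -1.0], [1.0, 0.0]])
step = lambda v: rot90 @ v
x = np.array([1.0, 2.0])
closed = loop_consistency_loss([step] * 4, x)  # full loop: loss ~ 0
broken = loop_consistency_loss([step] * 3, x)  # incomplete loop: loss > 0
```

In the paper the generators are learned networks and the loss is backpropagated through the whole chain; this sketch only shows the closure property being enforced.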
Explainable Observer-Classifier for Explainable Binary Decisions
Title | Explainable Observer-Classifier for Explainable Binary Decisions |
Authors | Stephan Alaniz, Zeynep Akata |
Abstract | Explanations help develop a better understanding of the rationale behind the predictions of a deep neural network and improve trust. We propose an explainable observer-classifier framework that exposes the steps taken through the decision-making process in a transparent manner. Instead of assigning a label to an image in a single step, our model makes iterative binary sub-decisions, and as a byproduct reveals a decision tree in the form of an introspective explanation. In addition, our model creates rationalizations as it assigns each binary decision a semantic meaning in the form of attributes imitating human annotations. On six benchmark datasets with increasing size and granularity, our model outperforms classical decision trees and generates easy-to-understand binary decision sequences explaining the network’s predictions. |
Tasks | Decision Making |
Published | 2019-02-05 |
URL | https://arxiv.org/abs/1902.01780v2 |
https://arxiv.org/pdf/1902.01780v2.pdf | |
PWC | https://paperswithcode.com/paper/xoc-explainable-observer-classifier-for |
Repo | |
Framework | |
Comparing Machine Learning Approaches for Table Recognition in Historical Register Books
Title | Comparing Machine Learning Approaches for Table Recognition in Historical Register Books |
Authors | Stéphane Clinchant, Hervé Déjean, Jean-Luc Meunier, Eva Lang, Florian Kleber |
Abstract | We present in this paper experiments on Table Recognition in hand-written registry books. We first explain how the problem of row and column detection is modeled, and then compare two Machine Learning approaches (Conditional Random Field and Graph Convolutional Network) for detecting these table elements. Evaluation was conducted on death records provided by the Archive of the Diocese of Passau. Both methods show similar results, an F1 score of 89, a quality which allows for Information Extraction. The software and dataset are openly available. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.11901v1 |
https://arxiv.org/pdf/1906.11901v1.pdf | |
PWC | https://paperswithcode.com/paper/comparing-machine-learning-approaches-for |
Repo | |
Framework | |
Visual cryptography in single-pixel imaging
Title | Visual cryptography in single-pixel imaging |
Authors | Shuming Jiao, Jun Feng, Yang Gao, Ting Lei, Xiaocong Yuan |
Abstract | Two novel visual cryptography (VC) schemes are proposed by combining VC with single-pixel imaging (SPI) for the first time. It is pointed out that the overlapping of visual key images in VC is similar to the superposition of pixel intensities by a single-pixel detector in SPI. In the first scheme, QR-code VC is designed by using opaque sheets instead of transparent sheets. The secret image can be recovered when identical illumination patterns are projected onto multiple visual key images and a single detector is used to record the total light intensities. In the second scheme, the secret image is shared by multiple illumination pattern sequences and it can be recovered when the visual key patterns are projected onto identical items. The application of VC can be extended to more diversified scenarios by our proposed schemes. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.05033v1 |
https://arxiv.org/pdf/1911.05033v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-cryptography-in-single-pixel-imaging |
Repo | |
Framework | |
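The paper's schemes hinge on a single-pixel detector summing intensities across overlapped key patterns. As a hedged illustration of the underlying visual-cryptography idea only (the classic 2-out-of-2 share construction, not the paper's SPI-specific schemes):

```python
import numpy as np

def make_shares(secret, rng=None):
    """Classic 2-out-of-2 visual secret sharing over a binary image.

    Either share alone is uniformly random noise; combining both
    shares (here with XOR, physically by overlapping patterns)
    reveals the secret image.
    """
    rng = np.random.default_rng(rng)
    share1 = rng.integers(0, 2, size=secret.shape)
    share2 = share1 ^ secret  # second share encodes the secret relative to the first
    return share1, share2

secret = np.array([[1, 0, 1], [0, 1, 0]])
s1, s2 = make_shares(secret, rng=0)
recovered = s1 ^ s2
```

In the proposed SPI variants, the "combination" step is performed optically: illumination patterns are projected onto the key images and the total reflected intensity recorded by one detector plays the role of the overlap.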
The Task Analysis Cell Assembly Perspective
Title | The Task Analysis Cell Assembly Perspective |
Authors | Dan Diaper, Chris Huyck |
Abstract | An entirely novel synthesis combines the applied cognitive psychology of a task analytic approach with a neural cell assembly perspective that models both brain and mind function during task performance; similar cell assemblies could be implemented as an artificially intelligent neural network. A simplified cell assembly model is introduced and this leads to several new representational formats that, in combination, are demonstrated as suitable for analysing tasks. The advantages of using neural models are exposed and compared with previous research that has used symbolic artificial intelligence production systems, which make no attempt to model neurophysiology. For cognitive scientists, the approach provides an easy and practical introduction to thinking about brains, minds and artificial intelligence in terms of cell assemblies. In the future, subsequent developments have the potential to lead to a new, general theory of psychology and neurophysiology, supported by cell assembly based artificial intelligences. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10481v2 |
https://arxiv.org/pdf/1910.10481v2.pdf | |
PWC | https://paperswithcode.com/paper/the-task-analysis-cell-assembly-perspective |
Repo | |
Framework | |
Not All Words are Equal: Video-specific Information Loss for Video Captioning
Title | Not All Words are Equal: Video-specific Information Loss for Video Captioning |
Authors | Jiarong Dong, Ke Gao, Xiaokai Chen, Junbo Guo, Juan Cao, Yongdong Zhang |
Abstract | An ideal description for a given video should fix its gaze on salient and representative content, which is capable of distinguishing this video from others. However, the distribution of different words is unbalanced in video captioning datasets, where distinctive words for describing video-specific salient objects are far less frequent than common words such as ‘a’, ‘the’, and ‘person’. The dataset bias often results in recognition error or detail deficiency of salient but unusual objects. To address this issue, we propose a novel learning strategy called Information Loss, which focuses on the relationship between the video-specific visual content and corresponding representative words. Moreover, a framework with hierarchical visual representations and an optimized hierarchical attention mechanism is established to capture the most salient spatial-temporal visual information, which fully exploits the potential strength of the proposed learning strategy. Extensive experiments demonstrate that the ingenious guidance strategy together with the optimized architecture outperforms state-of-the-art video captioning methods on MSVD with CIDEr score 87.5, and achieves superior CIDEr score 47.7 on MSR-VTT. We also show that our Information Loss is generic and improves various models by significant margins. |
Tasks | Video Captioning |
Published | 2019-01-01 |
URL | http://arxiv.org/abs/1901.00097v1 |
http://arxiv.org/pdf/1901.00097v1.pdf | |
PWC | https://paperswithcode.com/paper/not-all-words-are-equal-video-specific |
Repo | |
Framework | |
Hierarchically Clustered Representation Learning
Title | Hierarchically Clustered Representation Learning |
Authors | Su-Jin Shin, Kyungwoo Song, Il-Chul Moon |
Abstract | The joint optimization of representation learning and clustering in the embedding space has experienced a breakthrough in recent years. In spite of the advance, clustering with representation learning has been limited to flat-level categories, which often involves cohesive clustering with a focus on instance relations. To overcome the limitations of flat clustering, we introduce hierarchically-clustered representation learning (HCRL), which simultaneously optimizes representation learning and hierarchical clustering in the embedding space. Compared with a few prior works, HCRL is the first to consider generating deep embeddings from every component of the hierarchy, not just the leaf components. In addition to obtaining hierarchically clustered embeddings, we can reconstruct data at various abstraction levels, infer the intrinsic hierarchical structure, and learn the level-proportion features. We conducted evaluations with image and text domains, and our quantitative analyses showed competent likelihoods and the best accuracies compared with the baselines. |
Tasks | Representation Learning |
Published | 2019-01-28 |
URL | http://arxiv.org/abs/1901.09906v2 |
http://arxiv.org/pdf/1901.09906v2.pdf | |
PWC | https://paperswithcode.com/paper/hierarchically-clustered-representation |
Repo | |
Framework | |
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Title | Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds |
Authors | Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, Alekh Agarwal |
Abstract | We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. We show that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems. |
Tasks | Active Learning |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03671v2 |
https://arxiv.org/pdf/1906.03671v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-batch-active-learning-by-diverse |
Repo | |
Framework | |
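The two BADGE stages described above can be sketched in numpy; function names are ours, and for determinism this sketch starts from the largest-gradient point rather than sampling the first center, a simplification of the paper's k-means++ seeding.

```python
import numpy as np

def gradient_embeddings(feats, probs):
    """Hallucinated last-layer gradient for each unlabeled point.

    Treat the model's own argmax prediction as the label; the
    cross-entropy gradient w.r.t. the final linear layer is then
    (p - onehot(argmax p)) outer-product penultimate features.
    Its norm grows with predictive uncertainty.
    """
    n, k = probs.shape
    onehot = np.eye(k)[probs.argmax(axis=1)]
    return ((probs - onehot)[:, :, None] * feats[:, None, :]).reshape(n, -1)

def kmeanspp_select(emb, batch_size, seed=0):
    """k-means++ style seeding over gradient embeddings: favors points
    that are far (in gradient space) from everything chosen so far,
    trading off uncertainty and diversity without extra hyperparameters."""
    rng = np.random.default_rng(seed)
    chosen = [int(np.linalg.norm(emb, axis=1).argmax())]
    d2 = np.sum((emb - emb[chosen[0]]) ** 2, axis=1)
    while len(chosen) < batch_size:
        nxt = int(rng.choice(len(emb), p=d2 / d2.sum()))  # prob. proportional to squared distance
        chosen.append(nxt)
        d2 = np.minimum(d2, np.sum((emb - emb[nxt]) ** 2, axis=1))
    return chosen
```

In practice `feats` and `probs` come from a forward pass of the current model over the unlabeled pool, and the selected indices are sent for labeling.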
Reward Shaping via Meta-Learning
Title | Reward Shaping via Meta-Learning |
Authors | Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu |
Abstract | Reward shaping is one of the most effective methods to tackle the crucial yet challenging problem of credit assignment in Reinforcement Learning (RL). However, designing shaping functions usually requires much expert knowledge and hand-engineering, and the difficulties are further exacerbated given multiple similar tasks to solve. In this paper, we consider reward shaping on a distribution of tasks, and propose a general meta-learning framework to automatically learn efficient reward shaping on newly sampled tasks, assuming only a shared state space but not necessarily a shared action space. We first derive the theoretically optimal reward shaping in terms of credit assignment in model-free RL. We then propose a value-based meta-learning algorithm to extract an effective prior over the optimal reward shaping. The prior can be applied directly to new tasks, or provably adapted to the task-posterior while solving the task within a few gradient updates. We demonstrate the effectiveness of our shaping through significantly improved learning efficiency and interpretable visualizations across various settings, including notably a successful transfer from DQN to DDPG. |
Tasks | Meta-Learning |
Published | 2019-01-27 |
URL | http://arxiv.org/abs/1901.09330v1 |
http://arxiv.org/pdf/1901.09330v1.pdf | |
PWC | https://paperswithcode.com/paper/reward-shaping-via-meta-learning |
Repo | |
Framework | |
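The framework builds on potential-based reward shaping, which leaves optimal policies unchanged because the shaping terms telescope along any trajectory. A minimal sketch (the toy potential values are an assumption for illustration; the paper ties the optimal potential to the value function):

```python
def shaped_reward(r, phi_s, phi_s_next, gamma=0.99):
    """Potential-based shaping: add F = gamma * phi(s') - phi(s)
    to the environment reward. This form provably preserves the
    set of optimal policies (Ng et al.)."""
    return r + gamma * phi_s_next - phi_s

# Along a trajectory the shaping terms telescope: with gamma = 1 the
# total shaped return differs from the raw return only by phi(end) - phi(start).
phi = {0: 0.0, 1: 2.0, 2: 5.0}   # toy potential over three states
rewards = [1.0, 1.0]             # rewards for transitions 0 -> 1 -> 2
shaped = [shaped_reward(r, phi[s], phi[s + 1], gamma=1.0)
          for s, r in enumerate(rewards)]
```

The meta-learning contribution is in *learning* such a potential (as a prior over tasks) rather than hand-designing it; that machinery is not sketched here.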
Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network
Title | Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network |
Authors | Jinghui Qin, Ziwei Xie, Yukai Shi, Wushao Wen |
Abstract | Recently, deep learning based single image super-resolution (SR) approaches have achieved great progress. State-of-the-art SR methods usually adopt a feed-forward pipeline to establish a non-linear mapping between low-resolution (LR) and high-resolution (HR) images. However, because they treat all image regions equally without considering the diversity of difficulty, these approaches hit an upper bound for optimization. To address this issue, we propose a novel SR approach that processes each region within an image according to its difficulty. Specifically, we propose a dual-way SR network in which one branch is trained to focus on easy image regions and the other is trained to handle hard image regions. To identify whether a region is easy or hard, we propose a novel image difficulty recognition network based on a PSNR prior. Our SR approach, which uses the region mask to adaptively apply the dual-way SR network, yields superior results. Extensive experiments on several standard benchmarks (e.g., Set5, Set14, BSD100, and Urban100) show that our approach achieves state-of-the-art performance. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05802v2 |
http://arxiv.org/pdf/1904.05802v2.pdf | |
PWC | https://paperswithcode.com/paper/difficulty-aware-image-super-resolution-via |
Repo | |
Framework | |
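One way to realize the PSNR prior mentioned above is to label a training patch "hard" when a cheap upscaler reconstructs it poorly; such labels can then supervise the difficulty-recognition network. This is a hedged sketch of that labeling step (the threshold and helper names are ours, not the paper's):

```python
import numpy as np

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, peak]."""
    mse = np.mean((a - b) ** 2)
    return float(10.0 * np.log10(peak ** 2 / max(mse, 1e-12)))

def difficulty_labels(hr_patches, upscaled_patches, threshold_db=30.0):
    """Label a patch 'hard' (True) when a cheap upscale of it scores
    a low PSNR against the ground-truth HR patch.

    At inference the trained recognizer predicts this mask from the LR
    input alone and routes patches to the easy or hard SR branch.
    """
    return [psnr(u, h) < threshold_db
            for h, u in zip(hr_patches, upscaled_patches)]
```

The 30 dB cutoff is illustrative; in the paper the difficulty boundary is learned/tuned rather than fixed by hand.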
Regularized Context Gates on Transformer for Machine Translation
Title | Regularized Context Gates on Transformer for Machine Translation |
Authors | Xintong Li, Lemao Liu, Rui Wang, Guoping Huang, Max Meng |
Abstract | Context gates are effective for controlling the contributions of the source and target contexts in recurrent neural network (RNN) based neural machine translation (NMT). However, it is challenging to extend them to the advanced Transformer architecture, which is more complicated than an RNN. This paper first provides a method to identify source and target contexts and then introduces a gate mechanism to control the source and target contributions in the Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method that guides the learning of the gates with supervision automatically generated using pointwise mutual information. Extensive experiments on 4 translation datasets demonstrate that the proposed model obtains an average gain of 1.0 BLEU over a strong Transformer baseline. |
Tasks | Machine Translation |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11020v1 |
https://arxiv.org/pdf/1908.11020v1.pdf | |
PWC | https://paperswithcode.com/paper/regularized-context-gates-on-transformer-for |
Repo | |
Framework | |
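The gating mechanism described above can be sketched in its generic form (as used in RNN-based NMT; weight shapes and names here are illustrative, and the paper's Transformer-specific context identification and PMI regularization are not shown):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_gate(source, target, w_s, w_t, b):
    """Elementwise gate g in (0, 1) blending the two contexts:
    output = g * source + (1 - g) * target, so every output
    component lies between the corresponding source and target values."""
    g = sigmoid(source @ w_s + target @ w_t + b)
    return g * source + (1.0 - g) * target

rng = np.random.default_rng(0)
d = 4
source, target = rng.normal(size=d), rng.normal(size=d)
w_s, w_t, b = rng.normal(size=(d, d)), rng.normal(size=(d, d)), np.zeros(d)
blended = context_gate(source, target, w_s, w_t, b)
```

The regularizer in the paper additionally pushes `g` toward word-level supervision derived from pointwise mutual information, counteracting the gate's tendency to collapse toward one context.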
FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation
Title | FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation |
Authors | Le-Ha Hoang, Muhammad Abdullah Hanif, Muhammad Shafique |
Abstract | Deep Neural Networks (DNNs) are widely being adopted for safety-critical applications, e.g., healthcare and autonomous driving. Inherently, they are considered to be highly error-tolerant. However, recent studies have shown that hardware faults that impact the parameters of a DNN (e.g., weights) can have drastic impacts on its classification accuracy. In this paper, we perform a comprehensive error resilience analysis of DNNs subjected to hardware faults (e.g., permanent faults) in the weight memory. The outcome of this analysis is leveraged to propose a novel error mitigation technique which squashes the high-intensity faulty activation values to alleviate their impact. We achieve this by replacing the unbounded activation functions with their clipped versions. We also present a method to systematically define the clipping values of the activation functions that result in increased resilience of the networks against faults. We evaluate our technique on the AlexNet and the VGG-16 DNNs trained for the CIFAR-10 dataset. The experimental results show that our mitigation technique significantly improves the resilience of the DNNs to faults. For example, the proposed technique offers on average 68.92% improvement in the classification accuracy of resilience-optimized VGG-16 model at 1e-5 fault rate, when compared to the base network without any fault mitigation. |
Tasks | Autonomous Driving |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00941v1 |
https://arxiv.org/pdf/1912.00941v1.pdf | |
PWC | https://paperswithcode.com/paper/ft-clipact-resilience-analysis-of-deep-neural |
Repo | |
Framework | |
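The mitigation itself is simple to sketch: replace the unbounded ReLU with a clipped version so that a fault-inflated activation cannot propagate. The percentile rule for choosing the clip threshold below is an illustrative stand-in, not the paper's exact selection procedure:

```python
import numpy as np

def clipped_relu(x, clip_value):
    """ReLU with an upper bound: a bit flip that drives a weight huge
    yields a huge activation, which clipping squashes before it can
    corrupt downstream layers."""
    return np.minimum(np.maximum(x, 0.0), clip_value)

# A fault-free activation profile on clean data can suggest the clip
# threshold, e.g. a high percentile of observed activations.
clean_acts = np.array([0.1, 0.8, 1.5, 2.0])
threshold = float(np.percentile(clean_acts, 99))
faulty = np.array([-1.0, 0.5, 1e6])  # 1e6 mimics a high-order bit flip
safe = clipped_relu(faulty, threshold)
```

Normal activations pass through unchanged while the faulty value is squashed to the threshold, which is what makes the clipped network far more resilient at the reported fault rates.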
Deep Exemplar Networks for VQA and VQG
Title | Deep Exemplar Networks for VQA and VQG |
Authors | Badri N. Patro, Vinay P. Namboodiri |
Abstract | In this paper, we consider the problem of solving semantic tasks such as ‘Visual Question Answering’ (VQA), where one aims to answer questions related to an image, and ‘Visual Question Generation’ (VQG), where one aims to generate a natural question pertaining to an image. Solutions for VQA and VQG tasks have been proposed using variants of encoder-decoder deep learning based frameworks that have shown impressive performance. Humans, however, often show generalization by relying on exemplar based approaches. For instance, the work by Tversky and Kahneman suggests that humans use exemplars when making categorizations and decisions. In this work, we propose the incorporation of exemplar based approaches towards solving these problems. Specifically, we show that an exemplar based module can be incorporated in almost any of the deep learning architectures proposed in the literature and that the addition of such a block results in improved performance for solving these tasks. Thus, just as the incorporation of attention is now considered de facto useful for solving these tasks, incorporating exemplars can similarly improve any proposed architecture. We provide extensive empirical analysis through various architectures, ablations, and state-of-the-art comparisons. |
Tasks | Question Answering, Question Generation, Visual Question Answering |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09551v1 |
https://arxiv.org/pdf/1912.09551v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-exemplar-networks-for-vqa-and-vqg |
Repo | |
Framework | |
Happy Together: Learning and Understanding Appraisal From Natural Language
Title | Happy Together: Learning and Understanding Appraisal From Natural Language |
Authors | Arun Rajendran, Chiyu Zhang, Muhammad Abdul-Mageed |
Abstract | In this paper, we explore various approaches for learning two types of appraisal components from happy language. We focus on ‘agency’ of the author and the ‘sociality’ involved in happy moments based on the HappyDB dataset. We develop models based on deep neural networks for the task, including uni- and bi-directional long short-term memory networks, with and without attention. We also experiment with a number of novel embedding methods, such as embedding from neural machine translation (as in CoVe) and embedding from language models (as in ELMo). We compare our results to those acquired by several traditional machine learning methods. Our best models achieve 87.97% accuracy on agency and 93.13% accuracy on sociality, both of which are significantly higher than our baselines. |
Tasks | Machine Translation |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03677v1 |
https://arxiv.org/pdf/1906.03677v1.pdf | |
PWC | https://paperswithcode.com/paper/happy-together-learning-and-understanding |
Repo | |
Framework | |