January 26, 2020

2777 words 14 mins read

Paper Group ANR 1585

Discourse Level Factors for Sentence Deletion in Text Simplification

Title Discourse Level Factors for Sentence Deletion in Text Simplification
Authors Yang Zhong, Chao Jiang, Wei Xu, Junyi Jessy Li
Abstract This paper presents a data-driven study focusing on analyzing and predicting sentence deletion — a prevalent but understudied phenomenon in document simplification — on a large English text simplification corpus. We inspect various document and discourse factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected. We reveal that professional editors utilize different strategies to meet readability standards of elementary and middle schools. To predict whether a sentence will be deleted during simplification to a certain level, we harness automatically aligned data to train a classification model. Evaluated on our manually annotated data, our best models reached F1 scores of 65.2 and 59.7 for this task at the levels of elementary and middle school, respectively. We find that discourse level factors contribute to the challenging task of predicting sentence deletion for simplification.
Tasks Text Simplification
Published 2019-11-23
URL https://arxiv.org/abs/1911.10384v2
PDF https://arxiv.org/pdf/1911.10384v2.pdf
PWC https://paperswithcode.com/paper/discourse-level-factors-for-sentence-deletion
Repo
Framework
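
As a rough illustration of the prediction task the abstract describes, here is a minimal sketch of a sentence-deletion classifier evaluated with F1. The features (relative document position, length, a crude entity proxy) and the toy data are placeholders of my own, not the discourse features or corpus used in the paper.

```python
# Minimal sketch (not the authors' implementation): predict whether a sentence
# is deleted during simplification, using a few illustrative document-level
# features. Feature names and training examples are hypothetical stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

def featurize(sentence, doc_position, doc_length):
    """Toy features: relative position in document, sentence length, etc."""
    tokens = sentence.split()
    return [
        doc_position / max(doc_length, 1),    # discourse position in the document
        len(tokens),                          # sentence length
        sum(t[0].isupper() for t in tokens),  # rough proxy for named entities
    ]

# toy data: (sentence, position, doc_length, deleted?)
train = [
    ("The committee convened in March to review the budget.", 0, 5, 0),
    ("Analysts, however, remained deeply skeptical of the plan.", 3, 5, 1),
    ("The school opened in 1950.", 1, 5, 0),
    ("Critics argued the methodology was fundamentally flawed.", 4, 5, 1),
]
X = np.array([featurize(s, p, n) for s, p, n, _ in train])
y = np.array([d for *_, d in train])

clf = LogisticRegression().fit(X, y)
print("train F1:", f1_score(y, clf.predict(X)))
```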

Unaligned Image-to-Sequence Transformation with Loop Consistency

Title Unaligned Image-to-Sequence Transformation with Loop Consistency
Authors Siyang Wang, Justin Lazarow, Kwonjoon Lee, Zhuowen Tu
Abstract We tackle the problem of modeling sequential visual phenomena. Given examples of a phenomenon that can be divided into discrete time steps, we aim to take an input from any such time and realize this input at all other time steps in the sequence. Furthermore, we aim to do this without ground-truth aligned sequences, avoiding the difficulty of gathering aligned data. This generalizes the unpaired image-to-image problem from generating pairs to generating sequences. We extend cycle consistency to loop consistency and alleviate difficulties associated with learning in the resulting long chains of computation. We show competitive results compared to existing image-to-image techniques when modeling several different data sets, including the Earth’s seasons and the aging of human faces.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04149v1
PDF https://arxiv.org/pdf/1910.04149v1.pdf
PWC https://paperswithcode.com/paper/unaligned-image-to-sequence-transformation
Repo
Framework
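
A minimal sketch of what a loop-consistency term could look like, assuming one generator per transition between consecutive time steps: composing the generators all the way around the loop should reproduce the input. The linear "generators" and L1 reconstruction penalty are stand-ins, not the paper's networks or full objective (which also includes adversarial terms).

```python
# Sketch of a loop-consistency objective (assumptions, not the paper's code):
# pushing an input through every per-step generator around the full loop of
# time steps should reproduce the input.
import torch
import torch.nn as nn

n_steps, dim = 4, 32
# one toy generator per transition t -> t+1 (mod n_steps)
generators = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_steps)])

def loop_consistency_loss(x, start=0):
    """Push x through every generator around the loop and compare to the input."""
    out = x
    for i in range(n_steps):
        out = generators[(start + i) % n_steps](out)
    return nn.functional.l1_loss(out, x)

x = torch.randn(8, dim)          # a batch of inputs from time step `start`
loss = loop_consistency_loss(x)  # would be combined with adversarial losses
loss.backward()
print(float(loss))
```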

Explainable Observer-Classifier for Explainable Binary Decisions

Title Explainable Observer-Classifier for Explainable Binary Decisions
Authors Stephan Alaniz, Zeynep Akata
Abstract Explanations help develop a better understanding of the rationale behind the predictions of a deep neural network and improve trust. We propose an explainable observer-classifier framework that exposes the steps taken through the decision-making process in a transparent manner. Instead of assigning a label to an image in a single step, our model makes iterative binary sub-decisions and, as a byproduct, reveals a decision tree in the form of an introspective explanation. In addition, our model creates rationalizations, as it assigns each binary decision a semantic meaning in the form of attributes imitating human annotations. On six benchmark datasets with increasing size and granularity, our model outperforms classical decision trees and generates easy-to-understand binary decision sequences explaining the network’s predictions.
Tasks Decision Making
Published 2019-02-05
URL https://arxiv.org/abs/1902.01780v2
PDF https://arxiv.org/pdf/1902.01780v2.pdf
PWC https://paperswithcode.com/paper/xoc-explainable-observer-classifier-for
Repo
Framework
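
To make the idea of iterative binary sub-decisions concrete, here is a toy sketch in which each decision bit halves the remaining label set, so the bit sequence doubles as the explanation path. The shared linear decision head and the label-halving scheme are assumptions for illustration, not the authors' architecture.

```python
# Toy sketch of iterative binary sub-decisions (an assumption-level illustration,
# not the authors' model): each step halves the candidate label set, so the
# sequence of bits doubles as an introspective explanation.
import torch
import torch.nn as nn

n_classes, feat_dim, depth = 8, 16, 3      # 2 ** depth == n_classes
decider = nn.Linear(feat_dim + depth, 1)   # shared binary decision head

def classify(features):
    labels = list(range(n_classes))
    bits = []
    for step in range(depth):
        # condition the decision on the features plus the decision history so far
        history = torch.zeros(depth)
        history[:len(bits)] = torch.tensor(bits, dtype=torch.float)
        bit = int(torch.sigmoid(decider(torch.cat([features, history]))) > 0.5)
        bits.append(bit)
        half = len(labels) // 2
        labels = labels[half:] if bit else labels[:half]
    return labels[0], bits  # predicted class and its explanation path

pred, path = classify(torch.randn(feat_dim))
print(pred, path)
```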

Comparing Machine Learning Approaches for Table Recognition in Historical Register Books

Title Comparing Machine Learning Approaches for Table Recognition in Historical Register Books
Authors Stéphane Clinchant, Hervé Déjean, Jean-Luc Meunier, Eva Lang, Florian Kleber
Abstract In this paper, we present experiments on table recognition in handwritten register books. We first explain how the problem of row and column detection is modeled, and then compare two machine learning approaches (Conditional Random Field and Graph Convolutional Network) for detecting these table elements. Evaluation was conducted on death records provided by the Archive of the Diocese of Passau. Both methods show similar results, with an F1 score of 89, a quality which allows for information extraction. The software is open source and the dataset is openly available.
Tasks
Published 2019-06-14
URL https://arxiv.org/abs/1906.11901v1
PDF https://arxiv.org/pdf/1906.11901v1.pdf
PWC https://paperswithcode.com/paper/comparing-machine-learning-approaches-for
Repo
Framework
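
For the graph-based side of the comparison, a minimal one-layer graph-convolution sketch: text lines become nodes, spatial neighbours are linked, and each node receives a label (for example, whether it starts a new table row). The features, adjacency, and label set here are invented for illustration and do not reflect the paper's actual model or data.

```python
# A minimal one-layer GCN sketch (an illustrative assumption, not the paper's
# model): text lines on a page become graph nodes, edges connect spatial
# neighbours, and node classification assigns labels such as "starts a new row".
import torch
import torch.nn as nn

n_nodes, in_dim, n_labels = 6, 4, 2       # toy page with 6 text lines
X = torch.randn(n_nodes, in_dim)          # geometric/text features per line
A = torch.eye(n_nodes)                    # adjacency with self-loops
for i in range(n_nodes - 1):              # link each line to the next one
    A[i, i + 1] = A[i + 1, i] = 1.0

deg_inv_sqrt = A.sum(1).pow(-0.5)
A_norm = deg_inv_sqrt[:, None] * A * deg_inv_sqrt[None, :]  # symmetric normalization

W1 = nn.Linear(in_dim, 16)
W2 = nn.Linear(16, n_labels)
logits = W2(torch.relu(A_norm @ W1(X)))   # GCN propagation then per-node labels
print(logits.argmax(dim=1))               # predicted label per text line
```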

Visual cryptography in single-pixel imaging

Title Visual cryptography in single-pixel imaging
Authors Shuming Jiao, Jun Feng, Yang Gao, Ting Lei, Xiaocong Yuan
Abstract Two novel visual cryptography (VC) schemes are proposed by combining VC with single-pixel imaging (SPI) for the first time. It is pointed out that the overlapping of visual key images in VC is similar to the superposition of pixel intensities by a single-pixel detector in SPI. In the first scheme, QR-code VC is designed by using opaque sheets instead of transparent sheets. The secret image can be recovered when identical illumination patterns are projected onto multiple visual key images and a single detector is used to record the total light intensities. In the second scheme, the secret image is shared by multiple illumination pattern sequences and it can be recovered when the visual key patterns are projected onto identical items. The application of VC can be extended to more diversified scenarios by our proposed schemes.
Tasks
Published 2019-11-12
URL https://arxiv.org/abs/1911.05033v1
PDF https://arxiv.org/pdf/1911.05033v1.pdf
PWC https://paperswithcode.com/paper/visual-cryptography-in-single-pixel-imaging
Repo
Framework
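
A toy illustration of the two ingredients the abstract connects: share-based secret recovery (here a naive XOR sharing rather than classical transparency stacking) and single-pixel measurements, each of which is a patterned sum over the scene recorded by one detector. Both pieces are simplified stand-ins, not the proposed schemes.

```python
# Toy sketch of the analogy the paper draws (illustrative only): a single-pixel
# detector records one intensity per illumination pattern, i.e. a sum over the
# scene, much as combining visual-cryptography shares aggregates their content.
import numpy as np

rng = np.random.default_rng(0)
secret = rng.integers(0, 2, size=(8, 8))          # binary secret image

# naive (2,2) sharing: share2 hides the secret relative to a random share1
share1 = rng.integers(0, 2, size=secret.shape)
share2 = np.bitwise_xor(secret, share1)
recovered = np.bitwise_xor(share1, share2)        # combining shares reveals it
assert (recovered == secret).all()

# single-pixel imaging: each measurement is a patterned sum over the scene
patterns = rng.integers(0, 2, size=(16, 8, 8))    # illumination patterns
measurements = (patterns * secret).sum(axis=(1, 2))
print(measurements[:5])
```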

The Task Analysis Cell Assembly Perspective

Title The Task Analysis Cell Assembly Perspective
Authors Dan Diaper, Chris Huyck
Abstract An entirely novel synthesis combines the applied cognitive psychology of a task analytic approach with a neural cell assembly perspective that models both brain and mind function during task performance; similar cell assemblies could be implemented as an artificially intelligent neural network. A simplified cell assembly model is introduced and this leads to several new representational formats that, in combination, are demonstrated as suitable for analysing tasks. The advantages of using neural models are exposed and compared with previous research that has used symbolic artificial intelligence production systems, which make no attempt to model neurophysiology. For cognitive scientists, the approach provides an easy and practical introduction to thinking about brains, minds and artificial intelligence in terms of cell assemblies. In the future, subsequent developments have the potential to lead to a new, general theory of psychology and neurophysiology, supported by cell assembly based artificial intelligences.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1910.10481v2
PDF https://arxiv.org/pdf/1910.10481v2.pdf
PWC https://paperswithcode.com/paper/the-task-analysis-cell-assembly-perspective
Repo
Framework

Not All Words are Equal: Video-specific Information Loss for Video Captioning

Title Not All Words are Equal: Video-specific Information Loss for Video Captioning
Authors Jiarong Dong, Ke Gao, Xiaokai Chen, Junbo Guo, Juan Cao, Yongdong Zhang
Abstract An ideal description for a given video should fix its gaze on salient and representative content, which is capable of distinguishing this video from others. However, the distribution of different words is unbalanced in video captioning datasets, where distinctive words for describing video-specific salient objects are far less frequent than common words such as ‘a’, ‘the’, and ‘person’. The dataset bias often results in recognition errors or detail deficiency for salient but unusual objects. To address this issue, we propose a novel learning strategy called Information Loss, which focuses on the relationship between the video-specific visual content and the corresponding representative words. Moreover, a framework with hierarchical visual representations and an optimized hierarchical attention mechanism is established to capture the most salient spatial-temporal visual information, which fully exploits the potential strength of the proposed learning strategy. Extensive experiments demonstrate that the ingenious guidance strategy together with the optimized architecture outperforms state-of-the-art video captioning methods on MSVD with a CIDEr score of 87.5, and achieves a superior CIDEr score of 47.7 on MSR-VTT. We also show that our Information Loss is generic and improves various models by significant margins.
Tasks Video Captioning
Published 2019-01-01
URL http://arxiv.org/abs/1901.00097v1
PDF http://arxiv.org/pdf/1901.00097v1.pdf
PWC https://paperswithcode.com/paper/not-all-words-are-equal-video-specific
Repo
Framework
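
A hedged sketch of a word-weighted captioning loss in the spirit of the abstract: rare, content-bearing words contribute more to the loss than function words. The inverse-log-frequency weights below are my stand-in; the paper defines its own video-specific information measure.

```python
# Hedged sketch of a word-weighted captioning loss (inverse-frequency weights
# are a stand-in here; the paper defines its own video-specific measure):
# rare, content-bearing words contribute more than 'a', 'the', 'person'.
import torch
import torch.nn.functional as F

vocab = ["<pad>", "a", "the", "person", "juggles", "flaming", "torches"]
word_counts = torch.tensor([0, 5000, 4800, 3000, 12, 8, 6], dtype=torch.float)
weights = 1.0 / torch.log1p(word_counts.clamp(min=1.0))   # rarer => larger weight
weights[0] = 0.0                                          # ignore padding

logits = torch.randn(5, len(vocab))                       # 5 decoding steps
targets = torch.tensor([3, 4, 5, 6, 0])                   # "person juggles flaming torches <pad>"
loss = F.cross_entropy(logits, targets, weight=weights)
print(float(loss))
```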

Hierarchically Clustered Representation Learning

Title Hierarchically Clustered Representation Learning
Authors Su-Jin Shin, Kyungwoo Song, Il-Chul Moon
Abstract The joint optimization of representation learning and clustering in the embedding space has experienced a breakthrough in recent years. In spite of this advance, clustering with representation learning has been limited to flat-level categories, which often involves cohesive clustering with a focus on instance relations. To overcome the limitations of flat clustering, we introduce hierarchically clustered representation learning (HCRL), which simultaneously optimizes representation learning and hierarchical clustering in the embedding space. Compared with the few prior works, HCRL is the first to consider the generation of deep embeddings from every component of the hierarchy, not just the leaf components. In addition to obtaining hierarchically clustered embeddings, we can reconstruct data at various abstraction levels, infer the intrinsic hierarchical structure, and learn the level-proportion features. We conducted evaluations on image and text domains, and our quantitative analyses showed competitive likelihoods and the best accuracies compared with the baselines.
Tasks Representation Learning
Published 2019-01-28
URL http://arxiv.org/abs/1901.09906v2
PDF http://arxiv.org/pdf/1901.09906v2.pdf
PWC https://paperswithcode.com/paper/hierarchically-clustered-representation
Repo
Framework
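
As a much-simplified, two-stage stand-in for the idea (HCRL itself optimizes the embeddings and the hierarchy jointly), one can embed the data and then read off clusters at several levels of an agglomerative hierarchy:

```python
# Simplified two-stage stand-in (not HCRL's joint optimization): embed the data,
# then read off clusters at coarse and fine levels of an agglomerative hierarchy.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(20, 10)) for c in (0.0, 2.0, 4.0, 6.0)])

Z = PCA(n_components=2).fit_transform(X)          # stand-in for learned embeddings
for k in (2, 4):                                  # coarse and fine levels
    labels = AgglomerativeClustering(n_clusters=k).fit_predict(Z)
    print(f"{k} clusters:", np.bincount(labels))
```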

Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

Title Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
Authors Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, Alekh Agarwal
Abstract We design a new algorithm for batch active learning with deep neural network models. Our algorithm, Batch Active learning by Diverse Gradient Embeddings (BADGE), samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, a strategy designed to incorporate both predictive uncertainty and sample diversity into every selected batch. Crucially, BADGE trades off between diversity and uncertainty without requiring any hand-tuned hyperparameters. We show that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems.
Tasks Active Learning
Published 2019-06-09
URL https://arxiv.org/abs/1906.03671v2
PDF https://arxiv.org/pdf/1906.03671v2.pdf
PWC https://paperswithcode.com/paper/deep-batch-active-learning-by-diverse
Repo
Framework
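
A compact sketch of the two ingredients named in the abstract, as I read it: "hallucinated" last-layer gradient embeddings built from the model's own predicted label, and a k-means++-style seeding step that selects a batch that is both high-magnitude and spread out. This is an approximation for illustration, not the released implementation.

```python
# Approximate sketch of BADGE's two ingredients (not the authors' code):
# (1) hallucinated last-layer gradient embeddings, using the model's own
#     prediction as the label; (2) k-means++-style seeding over those embeddings
#     so the selected batch is both uncertain (high-magnitude) and diverse.
import numpy as np

rng = np.random.default_rng(0)
n_pool, n_classes, h_dim, batch = 200, 5, 8, 10

probs = rng.dirichlet(np.ones(n_classes), size=n_pool)    # model's softmax outputs
feats = rng.normal(size=(n_pool, h_dim))                  # penultimate-layer features

onehot = np.eye(n_classes)[probs.argmax(1)]
# gradient of cross-entropy w.r.t. last-layer weights, flattened: (p - y_hat) x h
g = ((probs - onehot)[:, :, None] * feats[:, None, :]).reshape(n_pool, -1)

# k-means++-style seeding on the gradient embeddings
chosen = [int(np.argmax(np.linalg.norm(g, axis=1)))]
for _ in range(batch - 1):
    d2 = np.min(((g[:, None, :] - g[chosen][None]) ** 2).sum(-1), axis=1)
    d2[chosen] = 0.0                                      # never re-pick a point
    chosen.append(int(rng.choice(n_pool, p=d2 / d2.sum())))
print("selected indices:", chosen)
```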

Reward Shaping via Meta-Learning

Title Reward Shaping via Meta-Learning
Authors Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
Abstract Reward shaping is one of the most effective methods to tackle the crucial yet challenging problem of credit assignment in Reinforcement Learning (RL). However, designing shaping functions usually requires much expert knowledge and hand-engineering, and the difficulties are further exacerbated given multiple similar tasks to solve. In this paper, we consider reward shaping on a distribution of tasks, and propose a general meta-learning framework to automatically learn efficient reward shaping on newly sampled tasks, assuming only a shared state space but not necessarily a shared action space. We first derive the theoretically optimal reward shaping in terms of credit assignment in model-free RL. We then propose a value-based meta-learning algorithm to extract an effective prior over the optimal reward shaping. The prior can be applied directly to new tasks, or provably adapted to the task posterior while solving the task within a few gradient updates. We demonstrate the effectiveness of our shaping through significantly improved learning efficiency and interpretable visualizations across various settings, including notably a successful transfer from DQN to DDPG.
Tasks Meta-Learning
Published 2019-01-27
URL http://arxiv.org/abs/1901.09330v1
PDF http://arxiv.org/pdf/1901.09330v1.pdf
PWC https://paperswithcode.com/paper/reward-shaping-via-meta-learning
Repo
Framework
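
For readers unfamiliar with reward shaping itself, the standard potential-based formulation looks like the sketch below; it preserves optimal policies. The paper's actual contribution, meta-learning the shaping across tasks, is not shown, and the linear potential is a placeholder.

```python
# Minimal potential-based reward-shaping sketch (a standard textbook form; the
# paper's meta-learning of the shaping across tasks is not shown here):
# shaped reward r' = r + gamma * phi(s') - phi(s), which preserves optimal policies.
import numpy as np

gamma = 0.99
n_states = 5
phi = np.linspace(0.0, 1.0, n_states)   # stand-in potential, e.g. a value estimate

def shaped_reward(r, s, s_next):
    return r + gamma * phi[s_next] - phi[s]

# a tiny transition: moving toward the goal gets a positive shaping bonus
print(shaped_reward(r=0.0, s=1, s_next=2))
```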

Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network

Title Difficulty-aware Image Super Resolution via Deep Adaptive Dual-Network
Authors Jinghui Qin, Ziwei Xie, Yukai Shi, Wushao Wen
Abstract Recently, deep learning based single-image super-resolution (SR) approaches have achieved great progress. State-of-the-art SR methods usually adopt a feed-forward pipeline to establish a non-linear mapping between low-resolution (LR) and high-resolution (HR) images. However, by treating all image regions equally without considering their varying difficulty, these approaches meet an upper bound for optimization. To address this issue, we propose a novel SR approach that processes each image region differently according to its difficulty. Specifically, we propose a dual-way SR network in which one branch is trained to focus on easy image regions and the other is trained to handle hard image regions. To identify whether a region is easy or hard, we propose a novel image difficulty recognition network based on a PSNR prior. Our SR approach, which uses the resulting region mask to adaptively drive the dual-way SR network, yields superior results. Extensive experiments on several standard benchmarks (e.g., Set5, Set14, BSD100, and Urban100) show that our approach achieves state-of-the-art performance.
Tasks Image Super-Resolution, Super-Resolution
Published 2019-04-11
URL http://arxiv.org/abs/1904.05802v2
PDF http://arxiv.org/pdf/1904.05802v2.pdf
PWC https://paperswithcode.com/paper/difficulty-aware-image-super-resolution-via
Repo
Framework
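
An illustrative sketch of the routing idea as described in the abstract: a binary difficulty mask sends each region through either a light "easy" branch or a deeper "hard" branch, and the outputs are merged with that mask. The branches, the mask, and the omission of the upscaling step are simplifications of my own, not the released model.

```python
# Illustrative sketch (my reading of the abstract, not the released model): a
# binary difficulty mask routes each region either to a light "easy" branch or
# a deeper "hard" branch, and the two outputs are merged with that mask.
# The upscaling step of real SR is omitted to keep the sketch small.
import torch
import torch.nn as nn

easy_branch = nn.Conv2d(3, 3, 3, padding=1)                    # shallow path
hard_branch = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                            nn.ReLU(),
                            nn.Conv2d(16, 3, 3, padding=1))    # deeper path

def dual_way_sr(lr_image, difficulty_mask):
    """difficulty_mask: 1 where the region is hard, 0 where it is easy."""
    return (difficulty_mask * hard_branch(lr_image)
            + (1 - difficulty_mask) * easy_branch(lr_image))

lr = torch.randn(1, 3, 32, 32)
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()   # stand-in for the PSNR-prior recognizer
print(dual_way_sr(lr, mask).shape)
```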

Regularized Context Gates on Transformer for Machine Translation

Title Regularized Context Gates on Transformer for Machine Translation
Authors Xintong Li, Lemao Liu, Rui Wang, Guoping Huang, Max Meng
Abstract Context gates are effective at controlling the contributions from the source and target contexts in recurrent neural network (RNN) based neural machine translation (NMT). However, it is challenging to extend them to the advanced Transformer architecture, which is more complicated than an RNN. This paper first provides a method to identify the source and target contexts and then introduces a gate mechanism to control the source and target contributions in the Transformer. In addition, to further reduce the bias problem in the gate mechanism, this paper proposes a regularization method to guide the learning of the gates with supervision automatically generated using pointwise mutual information. Extensive experiments on 4 translation datasets demonstrate that the proposed model obtains an average gain of 1.0 BLEU over a strong Transformer baseline.
Tasks Machine Translation
Published 2019-08-29
URL https://arxiv.org/abs/1908.11020v1
PDF https://arxiv.org/pdf/1908.11020v1.pdf
PWC https://paperswithcode.com/paper/regularized-context-gates-on-transformer-for
Repo
Framework
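
A sketch of a context gate plus a regularization term, following the abstract's description: a sigmoid gate mixes the source and target contexts, and an auxiliary loss nudges the gate toward a supervision signal. In the paper that signal comes from pointwise mutual information; below it is just a placeholder tensor.

```python
# Sketch of a context gate plus a regularizer (my reading of the abstract):
# a sigmoid gate mixes source context s and target context t, and an extra term
# nudges the gate toward a supervision signal g_star (derived from pointwise
# mutual information in the paper; a placeholder tensor here).
import torch
import torch.nn as nn

d = 64
gate_proj = nn.Linear(2 * d, d)

def gated_context(s, t, g_star, reg_weight=0.1):
    g = torch.sigmoid(gate_proj(torch.cat([s, t], dim=-1)))     # context gate
    mixed = g * s + (1 - g) * t                                 # gated combination
    reg = reg_weight * ((g.mean(dim=-1) - g_star) ** 2).mean()  # gate regularizer
    return mixed, reg

s, t = torch.randn(8, d), torch.randn(8, d)
g_star = torch.full((8,), 0.6)     # placeholder for the PMI-based supervision
mixed, reg = gated_context(s, t, g_star)
print(mixed.shape, float(reg))
```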

FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation

Title FT-ClipAct: Resilience Analysis of Deep Neural Networks and Improving their Fault Tolerance using Clipped Activation
Authors Le-Ha Hoang, Muhammad Abdullah Hanif, Muhammad Shafique
Abstract Deep Neural Networks (DNNs) are widely being adopted for safety-critical applications, e.g., healthcare and autonomous driving. Inherently, they are considered to be highly error-tolerant. However, recent studies have shown that hardware faults that impact the parameters of a DNN (e.g., weights) can have drastic impacts on its classification accuracy. In this paper, we perform a comprehensive error resilience analysis of DNNs subjected to hardware faults (e.g., permanent faults) in the weight memory. The outcome of this analysis is leveraged to propose a novel error mitigation technique which squashes the high-intensity faulty activation values to alleviate their impact. We achieve this by replacing the unbounded activation functions with their clipped versions. We also present a method to systematically define the clipping values of the activation functions that result in increased resilience of the networks against faults. We evaluate our technique on the AlexNet and VGG-16 DNNs trained on the CIFAR-10 dataset. The experimental results show that our mitigation technique significantly improves the resilience of the DNNs to faults. For example, the proposed technique offers, on average, a 68.92% improvement in the classification accuracy of the resilience-optimized VGG-16 model at a fault rate of 1e-5, compared to the base network without any fault mitigation.
Tasks Autonomous Driving
Published 2019-12-02
URL https://arxiv.org/abs/1912.00941v1
PDF https://arxiv.org/pdf/1912.00941v1.pdf
PWC https://paperswithcode.com/paper/ft-clipact-resilience-analysis-of-deep-neural
Repo
Framework
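
The mitigation idea reduces to replacing unbounded activations with clipped ones so that a fault-corrupted weight cannot inflate an activation arbitrarily. A minimal sketch follows; the clipping threshold is a placeholder, whereas the paper proposes a systematic way to choose it per layer.

```python
# Minimal sketch of the mitigation idea: replace unbounded ReLUs with clipped
# ones so a fault-corrupted weight cannot blow an activation up arbitrarily.
# The threshold 6.0 is a placeholder; the paper derives the values systematically.
import torch
import torch.nn as nn

def clipped_relu(threshold):
    return nn.Hardtanh(min_val=0.0, max_val=threshold)

model = nn.Sequential(
    nn.Linear(32, 64), clipped_relu(6.0),   # bounded instead of nn.ReLU()
    nn.Linear(64, 10),
)

x = torch.randn(4, 32)
print(model(x).shape)
```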

Deep Exemplar Networks for VQA and VQG

Title Deep Exemplar Networks for VQA and VQG
Authors Badri N. Patro, Vinay P. Namboodiri
Abstract In this paper, we consider the problem of solving semantic tasks such as ‘Visual Question Answering’ (VQA), where one aims to answer questions related to an image, and ‘Visual Question Generation’ (VQG), where one aims to generate a natural question pertaining to an image. Solutions for the VQA and VQG tasks have been proposed using variants of encoder-decoder deep learning frameworks that have shown impressive performance. Humans, however, often show generalization by relying on exemplar-based approaches. For instance, the work by Tversky and Kahneman suggests that humans use exemplars when making categorizations and decisions. In this work, we propose the incorporation of exemplar-based approaches to solve these problems. Specifically, we show that an exemplar-based module can be incorporated into almost any of the deep learning architectures proposed in the literature, and that the addition of such a block results in improved performance on these tasks. Thus, just as the incorporation of attention is now considered de facto useful for solving these tasks, incorporating exemplars can similarly be considered to improve any proposed architecture. We provide extensive empirical analysis through various architectures, ablations, and state-of-the-art comparisons.
Tasks Question Answering, Question Generation, Visual Question Answering
Published 2019-12-19
URL https://arxiv.org/abs/1912.09551v1
PDF https://arxiv.org/pdf/1912.09551v1.pdf
PWC https://paperswithcode.com/paper/deep-exemplar-networks-for-vqa-and-vqg
Repo
Framework
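
A rough sketch of what an exemplar block could look like based on the abstract: retrieve the nearest stored exemplar in feature space and fuse its features with the query image's features before the downstream VQA/VQG head. The feature bank, similarity measure, and fusion layer are assumptions for illustration, not the authors' architecture.

```python
# Rough sketch of an exemplar block as the abstract describes it (an assumption,
# not the authors' architecture): retrieve the nearest training exemplar in
# feature space and fuse it with the query image's features before the VQA/VQG head.
import torch
import torch.nn as nn

d = 128
exemplar_bank = torch.randn(1000, d)      # features of stored training exemplars
fuse = nn.Linear(2 * d, d)

def exemplar_augment(img_feat):
    sims = img_feat @ exemplar_bank.t()                  # similarity to the bank
    nearest = exemplar_bank[sims.argmax(dim=-1)]         # closest exemplar per query
    return fuse(torch.cat([img_feat, nearest], dim=-1))  # fused representation

img_feat = torch.randn(4, d)
print(exemplar_augment(img_feat).shape)   # plugs into an existing VQA/VQG model
```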

Happy Together: Learning and Understanding Appraisal From Natural Language

Title Happy Together: Learning and Understanding Appraisal From Natural Language
Authors Arun Rajendran, Chiyu Zhang, Muhammad Abdul-Mageed
Abstract In this paper, we explore various approaches for learning two types of appraisal components from happy language. We focus on ‘agency’ of the author and the ‘sociality’ involved in happy moments based on the HappyDB dataset. We develop models based on deep neural networks for the task, including uni- and bi-directional long short-term memory networks, with and without attention. We also experiment with a number of novel embedding methods, such as embedding from neural machine translation (as in CoVe) and embedding from language models (as in ELMo). We compare our results to those acquired by several traditional machine learning methods. Our best models achieve 87.97% accuracy on agency and 93.13% accuracy on sociality, both of which are significantly higher than our baselines.
Tasks Machine Translation
Published 2019-06-09
URL https://arxiv.org/abs/1906.03677v1
PDF https://arxiv.org/pdf/1906.03677v1.pdf
PWC https://paperswithcode.com/paper/happy-together-learning-and-understanding
Repo
Framework
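
In the spirit of the recurrent models compared in the paper, a tiny BiLSTM text classifier for one appraisal label (e.g., agency); the vocabulary size, pooling (mean instead of attention), and hyperparameters are placeholders, not the paper's setup.

```python
# Tiny BiLSTM classifier sketch in the spirit of the models compared (the
# preprocessing and hyperparameters are placeholders, not the paper's setup):
# classify a happy moment for one appraisal label, e.g. 'agency'.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab_size=5000, emb=100, hidden=64, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, token_ids):
        h, _ = self.lstm(self.emb(token_ids))
        return self.out(h.mean(dim=1))      # mean-pool instead of attention

model = BiLSTMClassifier()
tokens = torch.randint(0, 5000, (8, 20))    # a batch of tokenized happy moments
print(model(tokens).shape)                  # logits for the binary appraisal label
```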