January 29, 2020

3204 words 16 mins read

Paper Group ANR 546

Long-Term Progress and Behavior Complexification in Competitive Co-Evolution

Title Long-Term Progress and Behavior Complexification in Competitive Co-Evolution
Authors Luca Simione, Stefano Nolfi
Abstract The possibility of using competitive evolutionary algorithms to generate long-term progress is normally prevented by convergence on limit-cycle dynamics, in which the evolving agents keep progressing against their current competitors by periodically rediscovering previously adopted solutions over and over again. This leads to local but not to global progress, i.e., progress against all possible competitors. We propose a new competitive algorithm capable of producing long-term global progress thanks to its ability to identify and filter out opportunistic variations, i.e., variations leading to progress against current competitors and retrogression against other competitors. The efficacy of the method is validated on the co-evolution of predator and prey robots, a classic scenario that has been used in related research. The accumulation of global progress over many generations leads to effective solutions that involve the production of rather articulated behaviors. The complexity of the behavior displayed by the evolving robots tends to increase across generations, although progress in performance is not always accompanied by behavior complexification.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08303v1
PDF https://arxiv.org/pdf/1909.08303v1.pdf
PWC https://paperswithcode.com/paper/long-term-progress-and-behavior
Repo
Framework
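
As a rough illustration of the filtering idea in the abstract above, here is a minimal Python sketch: a candidate variation is accepted only if it gains against the current competitors without regressing against an archive of past ones. All names (`play`, the archive, the acceptance thresholds) are hypothetical stand-ins, not the authors' published algorithm.

```python
import random

def play(agent, competitor):
    """Placeholder for one predator-prey episode returning the agent's score.
    In the real setup this would run a robot simulation."""
    return random.random()

def evaluate(agent, competitors):
    """Mean score of `agent` against a set of competitors."""
    return sum(play(agent, c) for c in competitors) / len(competitors)

def accept_variation(parent, child, current, archive):
    """Keep a variation only if it helps against current competitors
    without regressing against archived past ones -- the filter that
    blocks 'opportunistic' variations."""
    gain_now = evaluate(child, current) - evaluate(parent, current)
    gain_past = evaluate(child, archive) - evaluate(parent, archive)
    return gain_now > 0 and gain_past >= 0
```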

Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech

Title Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech
Authors William N. Havard, Jean-Pierre Chevrot, Laurent Besacier
Abstract In this paper, we study how word-like units are represented and activated in a recurrent neural model of visually grounded speech. The model used in our experiments is trained to project an image and its spoken description into a common representation space. We show that a recurrent model trained on spoken sentences implicitly segments its input into word-like units and reliably maps them to their correct visual referents. We introduce a methodology originating from linguistics to analyse the representations learned by neural networks – the gating paradigm – and show that the correct representation of a word is only activated if the network has access to the first phoneme of the target word, suggesting that the network does not rely on a global acoustic pattern. Furthermore, we find that not all speech frames (MFCC vectors in our case) play an equal role in the final encoded representation of a given word; some frames have a crucial effect on it. Finally, we suggest that word representations could be activated through a process of lexical competition.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08491v1
PDF https://arxiv.org/pdf/1909.08491v1.pdf
PWC https://paperswithcode.com/paper/word-recognition-competition-and-activation
Repo
Framework
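
A minimal sketch of how a gating-paradigm probe could be run against such a model, assuming a toy GRU encoder over MFCC frames; the real model, its projection space, and the similarity measure are all stand-ins:

```python
import torch
import torch.nn as nn

# Toy stand-in: a GRU 'encoder' over 13-dimensional MFCC frames.
encoder = nn.GRU(input_size=13, hidden_size=32, batch_first=True)

def gating_curve(frames, target):
    """Encode successively longer prefixes of a word's MFCC frames and
    track similarity of each partial encoding to the full-word target."""
    sims = []
    for t in range(1, frames.size(0) + 1):
        _, h = encoder(frames[:t].unsqueeze(0))  # encode the prefix
        sims.append(torch.cosine_similarity(h[-1], target, dim=-1).item())
    return sims  # expected to jump once the initial phonemes are available

frames = torch.randn(20, 13)   # 20 MFCC frames of one word
target = torch.randn(1, 32)    # embedding of the full word
curve = gating_curve(frames, target)
```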

LapEPI-Net: A Laplacian Pyramid EPI structure for Learning-based Dense Light Field Reconstruction

Title LapEPI-Net: A Laplacian Pyramid EPI structure for Learning-based Dense Light Field Reconstruction
Authors Gaochang Wu, Yebin Liu, Lu Fang, Tianyou Chai
Abstract For the densely sampled light field (LF) reconstruction problem, existing approaches focus on a depth-free framework to achieve non-Lambertian performance. However, they are trapped in an “either aliasing or blurring” trade-off: pre-filtering the aliasing components (caused by the angular sparsity of the input LF) always leads to a blurry result. In this paper, we address this challenge by introducing an elaborately designed epipolar plane image (EPI) structure within a learning-based framework. Specifically, we start by analytically showing that decreasing the spatial scale of an EPI is more efficient at addressing the aliasing problem than simply adopting pre-filtering. Accordingly, we design a Laplacian Pyramid EPI (LapEPI) structure that contains both low-spatial-scale EPIs (for aliasing) and high-frequency residuals (for blurring) to resolve the trade-off. We then propose a novel network architecture for the LapEPI structure, termed LapEPI-net. To ensure non-Lambertian performance, we adopt a transfer-learning strategy, first pre-training the network with natural images and then fine-tuning it with unstructured LFs. Extensive experiments demonstrate the high performance and robustness of the proposed approach in tackling the aliasing-or-blurring problem as well as non-Lambertian reconstruction.
Tasks Transfer Learning
Published 2019-02-17
URL http://arxiv.org/abs/1902.06221v1
PDF http://arxiv.org/pdf/1902.06221v1.pdf
PWC https://paperswithcode.com/paper/lapepi-net-a-laplacian-pyramid-epi-structure
Repo
Framework
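
To make the LapEPI input structure concrete, the sketch below builds a standard Laplacian pyramid over a 2-D EPI with NumPy/SciPy: a low-spatial-scale (anti-aliased) base plus high-frequency residuals. The level count and Gaussian filter are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def laplacian_pyramid(epi, levels=3):
    """Decompose an EPI into high-frequency residuals plus a coarse,
    anti-aliased base level (the LapEPI-style input)."""
    pyramid, current = [], epi.astype(np.float64)
    for _ in range(levels - 1):
        low = gaussian_filter(current, sigma=1.0)
        down = low[::2, ::2]                       # halve the spatial scale
        up = zoom(down, 2, order=1)[:current.shape[0], :current.shape[1]]
        pyramid.append(current - up)               # high-frequency residual
        current = down
    pyramid.append(current)                        # coarsest (aliasing-robust) level
    return pyramid

bands = laplacian_pyramid(np.random.rand(64, 64))
```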

Employ Multimodal Machine Learning for Content quality analysis

Title Employ Multimodal Machine Learning for Content quality analysis
Authors Eric Du, Xiaoyong Li
Abstract The task of identifying high-quality content is becoming increasingly important, as it can improve overall reading time and CTR (click-through rate) estimates. Previous quality analysis has focused on a single modality, such as image or text, but on today’s mainstream media sites much information is presented in combined image-and-text form. In this paper we propose a multimodal quality recognition approach for predicting a quality score. First we use two feature extractors, one for the image and another for the text. We then train a Siamese network with a rank loss as the optimization objective. Compared with other approaches, ours obtains more accurate results.
Tasks
Published 2019-09-01
URL https://arxiv.org/abs/1909.01793v1
PDF https://arxiv.org/pdf/1909.01793v1.pdf
PWC https://paperswithcode.com/paper/employ-multimodal-machine-learning-for
Repo
Framework
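
A minimal PyTorch sketch of a Siamese scorer with a margin-based rank loss, in the spirit of the abstract; the fused 128-d features and network sizes are assumptions, and in the paper the inputs would come from the separate image and text extractors:

```python
import torch
import torch.nn as nn

class QualityRanker(nn.Module):
    """Siamese scorer: one shared network scores both items; a margin
    ranking loss pushes the higher-quality item's score above the other's."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats_a, feats_b):
        return self.score(feats_a), self.score(feats_b)

model = QualityRanker(dim=128)
loss_fn = nn.MarginRankingLoss(margin=1.0)
a, b = torch.randn(8, 128), torch.randn(8, 128)   # fused image+text features
sa, sb = model(a, b)
target = torch.ones(8, 1)                          # label: item a is higher quality
loss = loss_fn(sa, sb, target)
```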

Implicit Dimension Identification in User-Generated Text with LSTM Networks

Title Implicit Dimension Identification in User-Generated Text with LSTM Networks
Authors Victor Makarenkov, Ido Guy, Niva Hazon, Tamar Meisels, Bracha Shapira, Lior Rokach
Abstract In the process of online storytelling, individual users create and consume highly diverse content that contains a great deal of implicit beliefs and not plainly expressed narrative. It is hard to manually detect these implicit beliefs, intentions and moral foundations of the writers. We study and investigate two different tasks, each of which reflects the difficulty of detecting a user’s implicit knowledge, intent or belief that may be based on the writer’s moral foundation: 1) political perspective detection in news articles, and 2) identification of informational vs. conversational questions in community question answering (CQA) archives. For both tasks we first describe new, interesting annotated datasets and make them publicly available. Second, we compare various classification algorithms and show the differences in their performance on both tasks. Third, in the political perspective detection task we utilize a narrative representation language of local press to identify perspective differences between presumably neutral American and British press.
Tasks Community Question Answering, Question Answering
Published 2019-01-26
URL http://arxiv.org/abs/1901.09219v2
PDF http://arxiv.org/pdf/1901.09219v2.pdf
PWC https://paperswithcode.com/paper/implicit-dimension-identification-in-user
Repo
Framework
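
For concreteness, a minimal LSTM classifier of the kind such comparisons typically include (vocabulary size, dimensions, and the two-class setup are illustrative assumptions, not the paper's exact models):

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Embed tokens, run an LSTM, classify from the final hidden state."""
    def __init__(self, vocab_size, embed_dim=100, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, classes)

    def forward(self, token_ids):
        _, (h, _) = self.lstm(self.embed(token_ids))
        return self.out(h[-1])   # logits over perspective / question-type labels

logits = LSTMClassifier(vocab_size=20000)(torch.randint(0, 20000, (4, 50)))
```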

Recognizing Instagram Filtered Images with Feature De-stylization

Title Recognizing Instagram Filtered Images with Feature De-stylization
Authors Zhe Wu, Zuxuan Wu, Bharat Singh, Larry S. Davis
Abstract Deep neural networks have been shown to suffer from poor generalization when small perturbations are added (like Gaussian noise), yet little work has been done to evaluate their robustness to more natural image transformations like photo filters. This paper presents a study on how popular pretrained models are affected by commonly used Instagram filters. To this end, we introduce ImageNet-Instagram, a filtered version of ImageNet, where 20 popular Instagram filters are applied to each image in ImageNet. Our analysis suggests that simple structure-preserving filters which only alter the global appearance of an image can lead to large differences in the convolutional feature space. To improve generalization, we introduce a lightweight de-stylization module that predicts parameters used for scaling and shifting feature maps to “undo” the changes incurred by filters, inverting the process of style transfer tasks. We further demonstrate the module can be readily plugged into modern CNN architectures together with skip connections. We conduct extensive studies on ImageNet-Instagram, and show quantitatively and qualitatively that the proposed module, among other things, can effectively improve generalization by simply learning normalization parameters without retraining the entire network, thus recovering the alterations in the feature space caused by the filters.
Tasks Style Transfer
Published 2019-12-30
URL https://arxiv.org/abs/1912.13000v1
PDF https://arxiv.org/pdf/1912.13000v1.pdf
PWC https://paperswithcode.com/paper/recognizing-instagram-filtered-images-with
Repo
Framework
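
A hedged sketch of what such a lightweight de-stylization module could look like: per-channel scale and shift parameters predicted from global feature statistics and applied residually. The pooling-based conditioning and layer sizes here are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class DeStylizer(nn.Module):
    """Predict per-channel scale and shift from global image statistics
    and apply them residually to 'undo' filter-induced feature shifts."""
    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channels, 2 * channels)

    def forward(self, feat):
        stats = self.pool(feat).flatten(1)            # global appearance summary
        gamma, beta = self.fc(stats).chunk(2, dim=1)
        gamma = gamma[..., None, None]
        beta = beta[..., None, None]
        return feat * (1 + gamma) + beta              # residual-style correction

y = DeStylizer(64)(torch.randn(2, 64, 32, 32))
```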

PAC-Bayes under potentially heavy tails

Title PAC-Bayes under potentially heavy tails
Authors Matthew J. Holland
Abstract We derive PAC-Bayesian learning guarantees for heavy-tailed losses, and obtain a novel optimal Gibbs posterior which enjoys finite-sample excess risk bounds at logarithmic confidence. Our core technique itself makes use of PAC-Bayesian inequalities in order to derive a robust risk estimator, which by design is easy to compute. In particular, only assuming that the first three moments of the loss distribution are bounded, the learning algorithm derived from this estimator achieves nearly sub-Gaussian statistical error, up to the quality of the prior.
Tasks
Published 2019-05-20
URL https://arxiv.org/abs/1905.07900v2
PDF https://arxiv.org/pdf/1905.07900v2.pdf
PWC https://paperswithcode.com/paper/pac-bayes-under-potentially-heavy-tails
Repo
Framework
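
For orientation, one classical PAC-Bayes bound for losses bounded in $[0,1]$ — the benign setting this paper relaxes toward heavy tails — states that with probability at least $1-\delta$, simultaneously for all posteriors $\rho$, given prior $\pi$ and $n$ samples:

$$
\mathbb{E}_{h\sim\rho}\!\left[R(h)\right] \;\le\; \mathbb{E}_{h\sim\rho}\!\left[\widehat{R}_n(h)\right] \;+\; \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi)+\ln\frac{2\sqrt{n}}{\delta}}{2n}}
$$

Heavy-tailed losses break the boundedness assumption behind such bounds, which is why the paper substitutes a robust, easy-to-compute risk estimator before deriving its guarantees.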

Modeling Graph Structure in Transformer for Better AMR-to-Text Generation

Title Modeling Graph Structure in Transformer for Better AMR-to-Text Generation
Authors Jie Zhu, Junhui Li, Muhua Zhu, Longhua Qian, Min Zhang, Guodong Zhou
Abstract Recent studies on AMR-to-text generation often formalize the task as a sequence-to-sequence (seq2seq) learning problem by converting an Abstract Meaning Representation (AMR) graph into a word sequence. Graph structures are further modeled into the seq2seq framework in order to utilize the structural information in the AMR graphs. However, previous approaches only consider the relations between directly connected concepts while ignoring the rich structure in AMR graphs. In this paper we eliminate such a strong limitation and propose a novel structure-aware self-attention approach to better model the relations between indirectly connected concepts in the state-of-the-art seq2seq model, i.e., the Transformer. In particular, a few different methods are explored to learn structural representations between two concepts. Experimental results on English AMR benchmark datasets show that our approach significantly outperforms the state of the art with 29.66 and 31.82 BLEU scores on LDC2015E86 and LDC2017T10, respectively. To the best of our knowledge, these are the best results achieved so far by supervised models on the benchmarks.
Tasks Text Generation
Published 2019-08-31
URL https://arxiv.org/abs/1909.00136v1
PDF https://arxiv.org/pdf/1909.00136v1.pdf
PWC https://paperswithcode.com/paper/modeling-graph-structure-in-transformer-for
Repo
Framework
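
A sketch of the general structure-aware self-attention mechanism (in the spirit of relation-aware attention), where attention scores get an extra term from learned representations of the path between two concepts. The single-head, unbatched form and the additive content-structure term are simplifying assumptions, not the authors' exact formulation:

```python
import torch
import torch.nn.functional as F

def structure_aware_attention(q, k, v, rel):
    """q, k, v: (n, d) concept representations; rel: (n, n, d) learned
    encodings of the AMR path between concepts i and j."""
    d = q.size(-1)
    scores = q @ k.t()                                    # content-content term
    scores = scores + torch.einsum('id,ijd->ij', q, rel)  # content-structure term
    attn = F.softmax(scores / d ** 0.5, dim=-1)
    return attn @ v

n, d = 5, 16
q = k = v = torch.randn(n, d)
rel = torch.randn(n, n, d)   # would come from learned AMR-path encodings
out = structure_aware_attention(q, k, v, rel)
```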

W-RNN: News text classification based on a Weighted RNN

Title W-RNN: News text classification based on a Weighted RNN
Authors Dan Wang, Jibing Gong, Yaxi Song
Abstract Most information is stored as text, so text mining is regarded as having high commercial potential. Aiming at the semantic-constraint problem of classification methods based on sparse representation, we propose a weighted recurrent neural network (W-RNN) that can fully extract serialized semantic information from text. To address the high feature dimensionality and unclear semantic relationships in text data representation, we first use word vectors to represent the vocabulary in the text and a Recurrent Neural Network (RNN) to extract features from the serialized text data. The word vectors are then automatically weighted and summed, using the intermediate outputs of the RNN, to form the text representation vector. Finally, a neural network is used for classification. W-RNN is evaluated on a news dataset and shown to be superior to four baseline methods in precision, recall, F1 and loss, making it well suited to text classification.
Tasks Text Classification
Published 2019-09-28
URL https://arxiv.org/abs/1909.13077v1
PDF https://arxiv.org/pdf/1909.13077v1.pdf
PWC https://paperswithcode.com/paper/w-rnn-news-text-classification-based-on-a
Repo
Framework
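
One plausible reading of the weighting step, sketched in PyTorch: the RNN's intermediate outputs produce a weight per time step, and the weighted sum of states forms the text vector. The GRU cell, softmax weighting, and sizes are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class WRNN(nn.Module):
    """Weighted-RNN sketch: hidden states yield per-step weights; the
    weighted sum of states becomes the text representation vector."""
    def __init__(self, vocab, embed_dim=100, hidden=128, classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden, batch_first=True)
        self.weight = nn.Linear(hidden, 1)
        self.out = nn.Linear(hidden, classes)

    def forward(self, ids):
        states, _ = self.rnn(self.embed(ids))               # (B, T, H)
        alpha = torch.softmax(self.weight(states), dim=1)   # per-step weights
        text_vec = (alpha * states).sum(dim=1)              # weighted sum
        return self.out(text_vec)

logits = WRNN(vocab=20000)(torch.randint(0, 20000, (4, 30)))
```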

Privacy, Altruism, and Experience: Estimating the Perceived Value of Internet Data for Medical Uses

Title Privacy, Altruism, and Experience: Estimating the Perceived Value of Internet Data for Medical Uses
Authors Gilie Gefen, Omer Ben-Porat, Moshe Tennenholtz, Elad Yom-Tov
Abstract People increasingly turn to the Internet when they have a medical condition. The data they create during this process is a valuable source for medical research and for future health services. However, utilizing these data could come at a cost to user privacy. Thus, it is important to balance the perceived value that users assign to these data with the value of the services derived from them. Here we describe experiments where methods from Mechanism Design were used to elicit a truthful valuation from users for their Internet data and for services to screen people for medical conditions. In these experiments, 880 people from around the world were asked to participate in an auction to provide their data for uses differing in their contribution to the participant, to society, and in the disease they addressed. Some users were offered monetary compensation for their participation, while others were asked to pay to participate. Our findings show that 99% of people were willing to contribute their data in exchange for monetary compensation and an analysis of their data, while 53% were willing to pay to have their data analyzed. The average perceived value users assigned to their data was estimated at US$49. The value they assigned to being screened for a specific cancer was US$22, and the value of this service when offered to the general public was likewise US$22. Participants requested higher compensation when notified that their data would be used to analyze a more severe condition. They were willing to pay more to have their data analyzed when the condition was more severe, when they had higher education, or if they had recently experienced a serious medical condition.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08562v3
PDF https://arxiv.org/pdf/1906.08562v3.pdf
PWC https://paperswithcode.com/paper/privacy-altruism-and-experience-estimating
Repo
Framework

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation

Title An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Authors Vincent Michalski, Vikram Voleti, Samira Ebrahimi Kahou, Anthony Ortiz, Pascal Vincent, Chris Pal, Doina Precup
Abstract Batch normalization has been widely used to improve optimization in deep neural networks. While the uncertainty in batch statistics can act as a regularizer, using these dataset statistics specific to the training set impairs generalization in certain tasks. Recently, alternative methods for normalizing feature activations in neural networks have been proposed. Among them, group normalization has been shown to yield similar, in some domains even superior performance to batch normalization. All these methods utilize a learned affine transformation after the normalization operation to increase representational power. Methods used in conditional computation define the parameters of these transformations as learnable functions of conditioning information. In this work, we study whether and where the conditional formulation of group normalization can improve generalization compared to conditional batch normalization. We evaluate performances on the tasks of visual question answering, few-shot learning, and conditional image generation.
Tasks Conditional Image Generation, Few-Shot Learning, Image Generation, Question Answering, Visual Question Answering
Published 2019-07-31
URL https://arxiv.org/abs/1908.00061v1
PDF https://arxiv.org/pdf/1908.00061v1.pdf
PWC https://paperswithcode.com/paper/an-empirical-study-of-batch-normalization-and
Repo
Framework
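
A minimal sketch of conditional group normalization as studied here: the affine parameters applied after normalization become functions of conditioning information (e.g., a class or question embedding). Group count and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalGroupNorm(nn.Module):
    """Group normalization whose affine scale and shift are predicted
    from a conditioning vector instead of being fixed learned constants."""
    def __init__(self, channels, cond_dim, groups=8):
        super().__init__()
        self.norm = nn.GroupNorm(groups, channels, affine=False)
        self.to_gamma_beta = nn.Linear(cond_dim, 2 * channels)

    def forward(self, feat, cond):
        gamma, beta = self.to_gamma_beta(cond).chunk(2, dim=1)
        gamma = gamma[..., None, None]
        beta = beta[..., None, None]
        return self.norm(feat) * (1 + gamma) + beta

y = ConditionalGroupNorm(64, cond_dim=16)(torch.randn(2, 64, 8, 8),
                                          torch.randn(2, 16))
```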

Modeling Disease Progression In Retinal OCTs With Longitudinal Self-Supervised Learning

Title Modeling Disease Progression In Retinal OCTs With Longitudinal Self-Supervised Learning
Authors Antoine Rivail, Ursula Schmidt-Erfurth, Wolf-Dieter Vogl, Sebastian M. Waldstein, Sophie Riedl, Christoph Grechenig, Zhichao Wu, Hrvoje Bogunović
Abstract Longitudinal imaging is capable of capturing both the static anatomical structures and the dynamic changes of morphology resulting from aging or disease progression. Self-supervised learning makes it possible to learn new representations from large amounts of available unlabelled data without any expert knowledge. We propose a deep learning self-supervised approach to model disease progression from longitudinal retinal optical coherence tomography (OCT). Our self-supervised model benefits from a generic time-related task: learning to estimate the time interval between pairs of scans acquired from the same patient. This task is (i) easy to implement, (ii) able to use irregularly sampled data, (iii) tolerant to poor registration, and (iv) free of additional annotations. This novel method learns a representation that focuses on progression-specific information only, which can be transferred to other types of longitudinal problems. We transfer the learnt representation to the clinically highly relevant task of predicting the onset of an advanced stage of age-related macular degeneration within a given time interval based on a single OCT scan. The boost in prediction accuracy, in comparison to a network learned from scratch or transferred from traditional tasks, demonstrates that our pretrained self-supervised representation captures clinically meaningful information.
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09420v3
PDF https://arxiv.org/pdf/1910.09420v3.pdf
PWC https://paperswithcode.com/paper/modeling-disease-progression-in-retinal-octs
Repo
Framework
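
The pretext task reduces to a small regression loss over scan pairs. A toy PyTorch sketch, with placeholder encoder/regressor modules standing in for the paper's OCT networks:

```python
import torch
import torch.nn as nn

# Toy stand-ins; the paper uses a CNN encoder over OCT scans.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 16))
regressor = nn.Linear(32, 1)

def time_interval_loss(scan_a, scan_b, delta_t):
    """Self-supervised pretext task: regress the time gap between two
    scans of the same patient from their concatenated embeddings."""
    za, zb = encoder(scan_a), encoder(scan_b)
    pred = regressor(torch.cat([za, zb], dim=1)).squeeze(-1)
    return nn.functional.mse_loss(pred, delta_t)

loss = time_interval_loss(torch.randn(4, 1, 32, 32),
                          torch.randn(4, 1, 32, 32),
                          torch.rand(4))               # gaps, e.g. in years
```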

Lifelong GAN: Continual Learning for Conditional Image Generation

Title Lifelong GAN: Continual Learning for Conditional Image Generation
Authors Mengyao Zhai, Lei Chen, Fred Tung, Jiawei He, Megha Nawhal, Greg Mori
Abstract Lifelong learning is challenging for deep neural networks due to their susceptibility to catastrophic forgetting. Catastrophic forgetting occurs when a trained network is not able to maintain its ability to accomplish previously learned tasks when it is trained to perform new tasks. We study the problem of lifelong learning for generative models, extending a trained network to new conditional generation tasks without forgetting previous tasks, while assuming access to the training data for the current task only. In contrast to state-of-the-art memory replay based approaches which are limited to label-conditioned image generation tasks, a more generic framework for continual learning of generative models under different conditional image generation settings is proposed in this paper. Lifelong GAN employs knowledge distillation to transfer learned knowledge from previous networks to the new network. This makes it possible to perform image-conditioned generation tasks in a lifelong learning setting. We validate Lifelong GAN for both image-conditioned and label-conditioned generation tasks, and provide qualitative and quantitative results to show the generality and effectiveness of our method.
Tasks Conditional Image Generation, Continual Learning, Image Generation
Published 2019-07-23
URL https://arxiv.org/abs/1907.10107v2
PDF https://arxiv.org/pdf/1907.10107v2.pdf
PWC https://paperswithcode.com/paper/lifelong-gan-continual-learning-for
Repo
Framework
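
The distillation step can be sketched as follows: a frozen copy of the previous generator provides targets that the new generator must reproduce on auxiliary inputs, so previously learned conditional behaviors are preserved. The toy generators and the L1 objective are assumptions; the paper's conditioning and auxiliary data differ.

```python
import torch
import torch.nn as nn

def distillation_loss(new_G, old_G, noise, cond):
    """Knowledge distillation for continual GAN training: match the
    frozen old generator's outputs while learning the new task."""
    with torch.no_grad():
        target = old_G(noise, cond)     # frozen snapshot of the old generator
    return nn.functional.l1_loss(new_G(noise, cond), target)

class ToyG(nn.Module):
    """Stand-in conditional generator mapping (noise, condition) to an output."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(64 + 10, 256)
    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=1))

old_G, new_G = ToyG(), ToyG()
loss = distillation_loss(new_G, old_G, torch.randn(8, 64), torch.randn(8, 10))
```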

cGANs with Conditional Convolution Layer

Title cGANs with Conditional Convolution Layer
Authors Min-Cheol Sagong, Yong-Goo Shin, Yoon-Jae Yeo, Seung Park, Sung-Jea Ko
Abstract Conditional generative adversarial networks (cGANs) have been widely researched to generate class conditional images using a single generator. However, in the conventional cGANs techniques, it is still challenging for the generator to learn condition-specific features, since a standard convolutional layer with the same weights is used regardless of the condition. In this paper, we propose a novel convolution layer, called the conditional convolution layer, which directly generates different feature maps by employing the weights which are adjusted depending on the conditions. More specifically, in each conditional convolution layer, the weights are conditioned in a simple but effective way through filter-wise scaling and channel-wise shifting operations. In contrast to the conventional methods, the proposed method with a single generator can effectively handle condition-specific characteristics. The experimental results on CIFAR, LSUN and ImageNet datasets show that the generator with the proposed conditional convolution layer achieves a higher quality of conditional image generation than that with the standard convolution layer.
Tasks Conditional Image Generation, Image Generation
Published 2019-06-03
URL https://arxiv.org/abs/1906.00709v1
PDF https://arxiv.org/pdf/1906.00709v1.pdf
PWC https://paperswithcode.com/paper/190600709
Repo
Framework
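
A hedged sketch of the two weight-modulation operations named in the abstract: filter-wise scaling of a shared kernel and channel-wise shifting of the output, both indexed by the class label. The single-condition-per-batch simplification and the initialization are illustrative choices, not the paper's exact layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondConv2d(nn.Module):
    """Conditional convolution sketch: scale the shared kernel per filter
    and shift the output per channel, as functions of the class label."""
    def __init__(self, in_ch, out_ch, k, num_classes):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.scale = nn.Embedding(num_classes, out_ch)   # filter-wise scaling
        self.shift = nn.Embedding(num_classes, out_ch)   # channel-wise shifting
        nn.init.ones_(self.scale.weight)
        nn.init.zeros_(self.shift.weight)
        self.pad = k // 2

    def forward(self, x, y):
        # For simplicity, one shared condition `y` (a 0-dim tensor) per batch.
        w = self.weight * self.scale(y).view(-1, 1, 1, 1)
        out = F.conv2d(x, w, padding=self.pad)
        return out + self.shift(y).view(1, -1, 1, 1)

layer = CondConv2d(3, 16, 3, num_classes=10)
out = layer(torch.randn(2, 3, 8, 8), torch.tensor(4))
```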

The Ambiguous World of Emotion Representation

Title The Ambiguous World of Emotion Representation
Authors Vidhyasaharan Sethu, Emily Mower Provost, Julien Epps, Carlos Busso, Nicholas Cummins, Shrikanth Narayanan
Abstract Artificial intelligence and machine learning systems have demonstrated huge improvements and human-level parity in a range of activities, including speech recognition, face recognition and speaker verification. However, these diverse tasks share a key commonality that is not true in affective computing: the ground truth information that is inferred can be unambiguously represented. This observation provides some hints as to why affective computing, despite having attracted the attention of researchers for years, may still not be considered a mature field of research. A key reason for this is the lack of a common mathematical framework to describe all the relevant elements of emotion representations. This paper proposes the AMBiguous Emotion Representation (AMBER) framework to address this deficiency. AMBER is a unified framework that explicitly describes categorical, numerical and ordinal representations of emotions, including time-varying representations. In addition to explaining the core elements of AMBER, the paper also discusses how some of the commonly employed emotion representation schemes can be viewed through the AMBER framework, and concludes with a discussion of how the proposed framework can be used to reason about current and future affective computing systems.
Tasks Face Recognition, Speaker Verification, Speech Recognition
Published 2019-09-01
URL https://arxiv.org/abs/1909.00360v1
PDF https://arxiv.org/pdf/1909.00360v1.pdf
PWC https://paperswithcode.com/paper/the-ambiguous-world-of-emotion-representation
Repo
Framework