Paper Group ANR 487
A Hybrid Word-Character Approach to Abstractive Summarization. Diverse Beam Search for Increased Novelty in Abstractive Summarization. Image-derived generative modeling of pseudo-macromolecular structures - towards the statistical assessment of Electron CryoTomography template matching. Conditional Noise-Contrastive Estimation of Unnormalised Model …
A Hybrid Word-Character Approach to Abstractive Summarization
Title | A Hybrid Word-Character Approach to Abstractive Summarization |
Authors | Chieh-Teng Chang, Chi-Chia Huang, Chih-Yuan Yang, Jane Yung-Jen Hsu |
Abstract | Automatic abstractive text summarization is an important and challenging research topic in natural language processing. Among many widely used languages, Chinese has the special property that a single character carries rich information, comparable to a word. Existing Chinese text summarization methods adopt either purely character-based or purely word-based representations and therefore fail to fully exploit the information carried by both. To accurately capture the essence of articles, we propose a hybrid word-character approach (HWC) which preserves the advantages of both word-based and character-based representations. We evaluate the advantage of the proposed HWC approach by applying it to two existing methods and find that it achieves state-of-the-art performance with a margin of 24 ROUGE points on the widely used LCSTS dataset. In addition, we identify an issue in the LCSTS dataset and offer a script that removes overlapping pairs (a summary and a short text) to create a clean dataset for the community. The proposed HWC approach also achieves the best performance on the new, clean LCSTS dataset. |
Tasks | Abstractive Text Summarization, Text Summarization |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09968v2 |
PDF | http://arxiv.org/pdf/1802.09968v2.pdf |
PWC | https://paperswithcode.com/paper/a-hybrid-word-character-approach-to |
Repo | |
Framework | |
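The abstract above does not spell out how the word- and character-level representations are mixed, so the following is only a minimal sketch of one plausible hybrid tokenization scheme: keep in-vocabulary words whole and fall back to characters for everything else. The function name `hybrid_tokenize`, the fallback rule, and the toy vocabulary are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of a hybrid word-character tokenization for Chinese text.
# Assumption: in-vocabulary words stay at the word level, everything else is
# split into characters. The segmentation would come from any word segmenter.

def hybrid_tokenize(segmented_words, word_vocab):
    """segmented_words: list of word strings; word_vocab: words kept whole."""
    tokens = []
    for word in segmented_words:
        if word in word_vocab:
            tokens.append(word)          # word-level token
        else:
            tokens.extend(list(word))    # character-level fallback
    return tokens

# Toy usage with a pre-segmented sentence and a tiny vocabulary.
words = ["自然", "语言", "处理", "很", "有趣"]
vocab = {"自然", "处理", "很"}
print(hybrid_tokenize(words, vocab))     # ['自然', '语', '言', '处理', '很', '有', '趣']
```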
Diverse Beam Search for Increased Novelty in Abstractive Summarization
Title | Diverse Beam Search for Increased Novelty in Abstractive Summarization |
Authors | André Cibils, Claudiu Musat, Andreea Hossman, Michael Baeriswyl |
Abstract | Text summarization condenses a text to a shorter version while retaining the important information. Abstractive summarization is a recent development that generates new phrases, rather than simply copying or rephrasing sentences within the original text. Recently, neural sequence-to-sequence models have achieved good results in the field of abstractive summarization, which opens new possibilities and applications for industrial purposes. However, most practitioners observe that these models still use large parts of the original text in the output summaries, making them often similar to extractive frameworks. To address this drawback, we first introduce a new metric to measure how much of a summary is extracted from the input text. Secondly, we present a novel method that relies on a diversity factor in computing the neural network loss, to improve the diversity of the summaries generated by any neural abstractive model implementing beam search. Finally, we show that this method not only makes the system less extractive, but also improves the overall ROUGE score of state-of-the-art methods by at least 2 points. |
Tasks | Abstractive Text Summarization, Text Summarization |
Published | 2018-02-05 |
URL | http://arxiv.org/abs/1802.01457v1 |
PDF | http://arxiv.org/pdf/1802.01457v1.pdf |
PWC | https://paperswithcode.com/paper/diverse-beam-search-for-increased-novelty-in |
Repo | |
Framework | |
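The abstract above introduces a metric for how much of a summary is copied from the input but does not define it. Below is a minimal sketch assuming one natural definition, the fraction of summary n-grams that also occur verbatim in the source; the function names and the choice of n are illustrative, not the paper's.

```python
# Hedged sketch of an "extractiveness" score: the fraction of summary n-grams
# that also appear verbatim in the source. The paper's exact metric may differ.

def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def copied_fraction(source, summary, n=3):
    src, summ = source.split(), summary.split()
    summ_ngrams = ngrams(summ, n)
    if not summ_ngrams:
        return 0.0
    overlap = summ_ngrams & ngrams(src, n)
    return len(overlap) / len(summ_ngrams)

source = "the quick brown fox jumps over the lazy dog near the river bank"
summary = "the quick brown fox jumps near the river"
print(copied_fraction(source, summary, n=3))  # high value -> summary is largely extractive
```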
Image-derived generative modeling of pseudo-macromolecular structures - towards the statistical assessment of Electron CryoTomography template matching
Title | Image-derived generative modeling of pseudo-macromolecular structures - towards the statistical assessment of Electron CryoTomography template matching |
Authors | Kai Wen Wang, Xiangrui Zeng, Xiaodan Liang, Zhiguang Huo, Eric P. Xing, Min Xu |
Abstract | Cellular Electron CryoTomography (CECT) is a 3D imaging technique that captures information about the structure and spatial organization of macromolecular complexes within single cells, in a near-native state and at sub-molecular resolution. Although template matching is often used to locate macromolecules in a CECT image, it is insufficient as it only measures relative structural similarity. Therefore, it is preferable to assess the statistical credibility of the decision through hypothesis testing, which requires many templates derived from a diverse population of macromolecular structures. Due to the very limited number of known structures, we need a generative model to efficiently and reliably sample pseudo-structures from the complex distribution of macromolecular structures. To address this challenge, we propose a novel image-derived approach for performing hypothesis testing for template matching by constructing generative models using the generative adversarial network. Finally, we conducted hypothesis testing experiments for template matching on both simulated and experimental subtomograms, allowing us to conclude the identity of subtomograms with high statistical credibility and to significantly reduce false positives. |
Tasks | |
Published | 2018-05-12 |
URL | http://arxiv.org/abs/1805.04634v1 |
PDF | http://arxiv.org/pdf/1805.04634v1.pdf |
PWC | https://paperswithcode.com/paper/image-derived-generative-modeling-of-pseudo |
Repo | |
Framework | |
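A conceptual sketch of the statistical assessment described in the entry above, under strong simplifications: the matching score is plain normalized cross-correlation, and the null distribution comes from scores against randomly generated stand-ins for generator-sampled pseudo-structures. The shapes, score function, and sampling below are all illustrative assumptions, not the paper's pipeline.

```python
# Conceptual sketch: turn a template-matching score into an empirical p-value
# by comparing it against scores obtained with sampled pseudo-structures.
# The real pipeline (GAN sampling, alignment, missing-wedge handling) is far richer.
import numpy as np

def match_score(subtomogram, template):
    a = (subtomogram - subtomogram.mean()) / (subtomogram.std() + 1e-8)
    b = (template - template.mean()) / (template.std() + 1e-8)
    return float((a * b).mean())

rng = np.random.default_rng(0)
subtomogram = rng.normal(size=(16, 16, 16))
true_template = subtomogram + rng.normal(scale=2.0, size=subtomogram.shape)

# Stand-ins for pseudo-structures drawn from a trained generative model.
pseudo_templates = [rng.normal(size=subtomogram.shape) for _ in range(500)]

observed = match_score(subtomogram, true_template)
null_scores = np.array([match_score(subtomogram, t) for t in pseudo_templates])
p_value = (np.sum(null_scores >= observed) + 1) / (len(null_scores) + 1)
print(f"observed score {observed:.3f}, empirical p-value {p_value:.4f}")
```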
Conditional Noise-Contrastive Estimation of Unnormalised Models
Title | Conditional Noise-Contrastive Estimation of Unnormalised Models |
Authors | Ciwan Ceylan, Michael U. Gutmann |
Abstract | Many parametric statistical models are not properly normalised and only specified up to an intractable partition function, which renders parameter estimation difficult. Examples of unnormalised models are Gibbs distributions, Markov random fields, and neural network models in unsupervised deep learning. In previous work, the estimation principle called noise-contrastive estimation (NCE) was introduced where unnormalised models are estimated by learning to distinguish between data and auxiliary noise. An open question is how to best choose the auxiliary noise distribution. We here propose a new method that addresses this issue. The proposed method shares with NCE the idea of formulating density estimation as a supervised learning problem but in contrast to NCE, the proposed method leverages the observed data when generating noise samples. The noise can thus be generated in a semi-automated manner. We first present the underlying theory of the new method, show that score matching emerges as a limiting case, validate the method on continuous and discrete valued synthetic data, and show that we can expect an improved performance compared to NCE when the data lie in a lower-dimensional manifold. Then we demonstrate its applicability in unsupervised deep learning by estimating a four-layer neural image model. |
Tasks | Density Estimation |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03664v1 |
PDF | http://arxiv.org/pdf/1806.03664v1.pdf |
PWC | https://paperswithcode.com/paper/conditional-noise-contrastive-estimation-of |
Repo | |
Framework | |
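A toy sketch of the conditional noise-contrastive idea from the entry above, under one simplifying assumption: the noise is a symmetric Gaussian perturbation of each data point, so the conditional noise densities cancel in the classifier's log-odds and the objective reduces to a logistic loss on the difference of unnormalised log-densities. The 1-D Gaussian model and all hyperparameters are illustrative, not the paper's experiments.

```python
# Toy sketch of conditional noise-contrastive estimation (CNCE), assuming
# symmetric perturbation noise y = x + eps, so the objective reduces to
# -log sigmoid(log phi(x) - log phi(y)).
# Unnormalised model: phi(x; theta) = exp(-0.5 * theta * x**2).
import numpy as np

rng = np.random.default_rng(0)
true_theta = 0.5                               # data ~ N(0, 1 / true_theta)
x = rng.normal(scale=np.sqrt(1 / true_theta), size=5000)
y = x + rng.normal(scale=1.0, size=x.shape)    # conditional (symmetric) noise samples

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta, lr = 2.0, 0.05                          # initial guess, step size
for _ in range(2000):
    g = -0.5 * theta * (x**2 - y**2)           # log phi(x) - log phi(y)
    # gradient of mean(-log sigmoid(g)) with respect to theta
    grad = np.mean(-(1.0 - sigmoid(g)) * (-0.5 * (x**2 - y**2)))
    theta -= lr * grad
print(f"estimated theta = {theta:.3f} (true value {true_theta})")
```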
Conformation Clustering of Long MD Protein Dynamics with an Adversarial Autoencoder
Title | Conformation Clustering of Long MD Protein Dynamics with an Adversarial Autoencoder |
Authors | Yunlong Liu, L. Mario Amzel |
Abstract | Recent developments in specialized computer hardware have greatly accelerated atomic-level Molecular Dynamics (MD) simulations. A single GPU-attached cluster is capable of producing microsecond-length trajectories in reasonable amounts of time. Multiple protein states and a large number of microstates associated with folding and with the function of the protein can be observed as conformations sampled in the trajectories. Clustering those conformations, however, is needed for identifying protein states, evaluating transition rates and understanding protein behavior. In this paper, we propose a novel data-driven generative conformation clustering method based on the adversarial autoencoder (AAE) and provide the associated software implementation Cong. The method was tested using a 208-microsecond MD simulation of the fast-folding peptide Trp-Cage (20 residues) obtained from the D.E. Shaw Research Group. The proposed clustering algorithm identifies many of the salient features of the folding process by grouping a large number of conformations that share common features not easily identifiable in the trajectory. |
Tasks | |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12313v1 |
PDF | http://arxiv.org/pdf/1805.12313v1.pdf |
PWC | https://paperswithcode.com/paper/conformation-clustering-of-long-md-protein |
Repo | |
Framework | |
Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances
Title | Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances |
Authors | Thao Minh Le, Nobuyuki Shimizu, Takashi Miyazaki, Koichi Shinoda |
Abstract | With the widespread use of intelligent systems, such as smart speakers, addressee recognition has become a concern in human-computer interaction, as more and more people expect such systems to understand complicated social scenes, including those outdoors, in cafeterias, and in hospitals. Because previous studies typically focused only on pre-specified tasks with limited conversational situations such as controlling smart homes, we created a mock dataset called Addressee Recognition in Visual Scenes with Utterances (ARVSU) that contains a vast body of image variations in visual scenes with an annotated utterance and a corresponding addressee for each scenario. We also propose a multi-modal deep-learning-based model that takes different human cues, specifically eye gaze and transcripts of an utterance corpus, into account to predict the conversational addressee from a specific speaker’s view in various real-life conversational scenarios. To the best of our knowledge, we are the first to introduce an end-to-end deep learning model that combines vision and utterance transcripts for addressee recognition. As a result, our study suggests that future addressee recognition can reach the ability to understand human intention in many social situations previously unexplored, and our multi-modal dataset is a first step in promoting research in this field. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04288v1 |
PDF | http://arxiv.org/pdf/1809.04288v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-based-multi-modal-addressee |
Repo | |
Framework | |
Monaural source enhancement maximizing source-to-distortion ratio via automatic differentiation
Title | Monaural source enhancement maximizing source-to-distortion ratio via automatic differentiation |
Authors | Hiroaki Nakajima, Yu Takahashi, Kazunobu Kondo, Yuji Hisaminato |
Abstract | Recently, deep neural networks (DNNs) have enabled a breakthrough in monaural source enhancement. Through a training step using a large amount of data, a DNN estimates a mapping between mixed signals and clean signals. During training, we use an objective function that numerically expresses the quality of the mapping learned by the DNN. In conventional methods, the L1 norm, L2 norm, and Itakura-Saito divergence are often used as objective functions. Recently, an objective function based on short-time objective intelligibility (STOI) has also been proposed. However, these functions only indicate the similarity between the clean signal and the signal estimated by the DNN. In other words, they do not directly express the quality of noise reduction or source enhancement. Motivated by this fact, this paper adopts the signal-to-distortion ratio (SDR) as the objective function. Since SDR essentially reflects the signal-to-noise ratio (SNR), maximizing SDR addresses the above problem. The experimental results revealed that the proposed method achieved better performance than the conventional methods. |
Tasks | |
Published | 2018-06-15 |
URL | http://arxiv.org/abs/1806.05791v1 |
PDF | http://arxiv.org/pdf/1806.05791v1.pdf |
PWC | https://paperswithcode.com/paper/monaural-source-enhancement-maximizing-source |
Repo | |
Framework | |
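The abstract above states that SDR itself is used as the training objective through automatic differentiation but gives no formula. The sketch below assumes the plain definition SDR = 10·log10(‖clean‖² / ‖clean − estimate‖²) and wraps its negative as a PyTorch loss; the tiny convolutional "enhancer" is a placeholder, not the paper's network.

```python
# Sketch of a negative-SDR training criterion, assuming the plain definition
# SDR = 10 * log10(||clean||^2 / ||clean - estimate||^2); the paper's exact
# formulation (e.g., allowing a scaled target) may differ.
import torch

def neg_sdr_loss(estimate: torch.Tensor, clean: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """estimate, clean: tensors of shape (batch, samples); returns a scalar to minimize."""
    noise = clean - estimate
    sdr = 10.0 * torch.log10(
        (clean.pow(2).sum(dim=-1) + eps) / (noise.pow(2).sum(dim=-1) + eps)
    )
    return -sdr.mean()                       # maximizing SDR == minimizing its negative

# Toy usage: any differentiable enhancement model can be trained against this loss.
mixture, clean = torch.randn(4, 16000), torch.randn(4, 16000)
enhancer = torch.nn.Conv1d(1, 1, kernel_size=9, padding=4)   # placeholder "model"
estimate = enhancer(mixture.unsqueeze(1)).squeeze(1)
loss = neg_sdr_loss(estimate, clean)
loss.backward()                              # gradients flow via automatic differentiation
print(float(loss))
```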
Statistical transformer networks: learning shape and appearance models via self supervision
Title | Statistical transformer networks: learning shape and appearance models via self supervision |
Authors | Anil Bas, William A. P. Smith |
Abstract | We generalise Spatial Transformer Networks (STN) by replacing the parametric transformation of a fixed, regular sampling grid with a deformable, statistical shape model which is itself learnt. We call this a Statistical Transformer Network (StaTN). By training a network containing a StaTN end-to-end for a particular task, the network learns the optimal nonrigid alignment of the input data for the task. Moreover, the statistical shape model is learnt with no direct supervision (such as landmarks) and can be reused for other tasks. Besides training for a specific task, we also show that a StaTN can learn a shape model using generic loss functions. This includes a loss inspired by the minimum description length principle in which an appearance model is also learnt from scratch. In this configuration, our model learns an active appearance model and a means to fit the model from scratch with no supervision at all, even identity labels. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02541v1 |
PDF | http://arxiv.org/pdf/1804.02541v1.pdf |
PWC | https://paperswithcode.com/paper/statistical-transformer-networks-learning |
Repo | |
Framework | |
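A rough sketch of the core StaTN idea described above: the fixed parametric sampling grid of an STN is replaced by a linear (statistical) model over sampling coordinates. Here the deformation basis is random and applied directly to the dense grid, whereas the paper learns the shape model end-to-end with no direct supervision; every tensor below is an illustrative assumption.

```python
# Sketch: a deformable sampling grid built from a linear shape model and used
# with differentiable resampling. In a StaTN both the basis and the per-image
# parameters would be learnt; here they are random placeholders.
import torch
import torch.nn.functional as F

N, H, W, K = 2, 32, 32, 5
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
mean_grid = torch.stack([xs, ys], dim=-1)            # (H, W, 2), identity sampling grid
basis = 0.05 * torch.randn(K, H, W, 2)               # K deformation basis "shapes"
params = torch.randn(N, K, requires_grad=True)       # per-image shape parameters (from an encoder)

grid = mean_grid + torch.einsum("nk,khwc->nhwc", params, basis)   # deformable grid
images = torch.randn(N, 3, H, W)
warped = F.grid_sample(images, grid, align_corners=False)         # differentiable resampling
warped.mean().backward()                                          # gradients reach the shape params
print(warped.shape)
```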
Gaussian Process Latent Variable Alignment Learning
Title | Gaussian Process Latent Variable Alignment Learning |
Authors | Ieva Kazlauskaite, Carl Henrik Ek, Neill D. F. Campbell |
Abstract | We present a model that can automatically learn alignments between high-dimensional data in an unsupervised manner. Our proposed method casts alignment learning in a framework where both alignment and data are modelled simultaneously. Further, we automatically infer groupings of different types of sequences within the same dataset. We derive a probabilistic model built on non-parametric priors that allows for flexible warps while at the same time providing means to specify interpretable constraints. We demonstrate the efficacy of our approach with superior quantitative performance to the state-of-the-art approaches and provide examples to illustrate the versatility of our model in automatic inference of sequence groupings, absent from previous approaches, as well as easy specification of high level priors for different modalities of data. |
Tasks | |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02603v3 |
PDF | http://arxiv.org/pdf/1803.02603v3.pdf |
PWC | https://paperswithcode.com/paper/gaussian-process-latent-variable-alignment |
Repo | |
Framework | |
On the Covariance-Hessian Relation in Evolution Strategies
Title | On the Covariance-Hessian Relation in Evolution Strategies |
Authors | Ofer M. Shir, Amir Yehudayoff |
Abstract | We consider Evolution Strategies operating only with isotropic Gaussian mutations on positive quadratic objective functions, and investigate the covariance matrix when constructed out of selected individuals by truncation. We prove that the covariance matrix over $(1,\lambda)$-selected decision vectors becomes proportional to the inverse of the landscape Hessian as the population-size $\lambda$ increases. This generalizes a previous result that proved an equivalent phenomenon when sampling was assumed to take place in the vicinity of the optimum. It further confirms the classical hypothesis that statistical learning of the landscape is an inherent characteristic of standard Evolution Strategies, and that this distinguishing capability stems only from the usage of isotropic Gaussian mutations and rank-based selection. We provide broad numerical validation for the proven results, and present empirical evidence for their generalization to $(\mu,\lambda)$-selection. |
Tasks | |
Published | 2018-06-10 |
URL | https://arxiv.org/abs/1806.03674v2 |
PDF | https://arxiv.org/pdf/1806.03674v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-covariance-hessian-relation-in |
Repo | |
Framework | |
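The stated result lends itself to a quick numerical check: sample isotropic Gaussian mutations on a positive definite quadratic, keep the best offspring by truncation, and compare the covariance of the selected vectors with the inverse Hessian. The sketch below uses (μ,λ)-truncation with a small selection ratio; the population sizes and the landscape are arbitrary choices for illustration, not the paper's settings.

```python
# Numerical illustration of the covariance-Hessian relation: isotropic Gaussian
# mutations plus truncation selection on the quadratic f(x) = 0.5 * x^T H x.
# The covariance of the selected offspring becomes (up to scale) proportional
# to inv(H), so C @ H approaches a multiple of the identity.
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.normal(size=(n, n))
H = A @ A.T + n * np.eye(n)                              # a positive definite "landscape Hessian"

def selected_covariance(lam, mu, trials=200, sigma=1.0):
    selected = []
    for _ in range(trials):
        offspring = sigma * rng.normal(size=(lam, n))    # isotropic mutations around the optimum
        fitness = 0.5 * np.einsum("ij,jk,ik->i", offspring, H, offspring)
        best = offspring[np.argsort(fitness)[:mu]]       # truncation: keep the mu best
        selected.append(best)
    return np.cov(np.concatenate(selected, axis=0), rowvar=False)

C = selected_covariance(lam=2000, mu=20)
ratio = C @ H                                            # should approach c * I
print(np.round(ratio / np.trace(ratio) * n, 2))          # close to the identity matrix
```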
Single-View Food Portion Estimation: Learning Image-to-Energy Mappings Using Generative Adversarial Networks
Title | Single-View Food Portion Estimation: Learning Image-to-Energy Mappings Using Generative Adversarial Networks |
Authors | Shaobo Fang, Zeman Shao, Runyu Mao, Chichen Fu, Deborah A. Kerr, Carol J. Boushey, Edward J. Delp, Fengqing Zhu |
Abstract | Due to the growing concern over chronic diseases and other health problems related to diet, there is a need to develop accurate methods to estimate an individual’s food and energy intake. Measuring accurate dietary intake is an open research problem. In particular, accurate food portion estimation is challenging since the process of food preparation and consumption imposes large variations on food shapes and appearances. In this paper, we present a food portion estimation method to estimate food energy (kilocalories) from food images using Generative Adversarial Networks (GAN). We introduce the concept of an “energy distribution” for each food image. To train the GAN, we design a food image dataset based on ground truth food labels and segmentation masks for each food image as well as energy information associated with the food image. Our goal is to learn the mapping of the food image to the food energy. We can then estimate food energy based on the energy distribution. We show that an average energy estimation error rate of 10.89% can be obtained by learning the image-to-energy mapping. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09670v2 |
PDF | http://arxiv.org/pdf/1802.09670v2.pdf |
PWC | https://paperswithcode.com/paper/single-view-food-portion-estimation-learning |
Repo | |
Framework | |
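A minimal sketch of the "energy distribution" idea from the abstract above: once a per-pixel energy map is predicted for a food photo, the portion's energy is the sum of that map over the food region. The GAN that would produce the map is omitted; the map, mask, and per-pixel values below are synthetic placeholders.

```python
# Sketch of estimating total food energy from a per-pixel energy map and a
# segmentation mask. Both arrays are synthetic stand-ins for model outputs.
import numpy as np

rng = np.random.default_rng(0)
energy_map = np.clip(rng.normal(loc=0.05, scale=0.02, size=(256, 256)), 0, None)  # kcal per pixel
food_mask = np.zeros((256, 256), dtype=bool)
food_mask[64:192, 64:192] = True                     # placeholder segmentation of the food item

estimated_kcal = float(energy_map[food_mask].sum())  # integrate the energy distribution
print(f"estimated energy: {estimated_kcal:.1f} kcal")
```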
Learning to Recommend with Multiple Cascading Behaviors
Title | Learning to Recommend with Multiple Cascading Behaviors |
Authors | Chen Gao, Xiangnan He, Dahua Gan, Xiangning Chen, Fuli Feng, Yong Li, Tat-Seng Chua, Lina Yao, Yang Song, Depeng Jin |
Abstract | Most existing recommender systems leverage user behavior data of one type only, such as the purchase behavior in E-commerce that is directly related to the business KPI (Key Performance Indicator) of conversion rate. Besides the key behavioral data, we argue that other forms of user behavior also provide valuable signals, such as views, clicks, adding a product to a shopping cart, and so on. They should be taken into account properly to provide quality recommendations for users. In this work, we contribute a new solution named NMTR (short for Neural Multi-Task Recommendation) for learning recommender systems from user multi-behavior data. We develop a neural network model to capture the complicated and multi-type interactions between users and items. In particular, our model accounts for the cascading relationship among different types of behaviors (e.g., a user must click on a product before purchasing it). To fully exploit the signal in the data of multiple types of behaviors, we perform a joint optimization based on the multi-task learning framework, where the optimization on a behavior is treated as a task. Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data. Further analysis shows that modeling multiple behaviors is particularly useful for providing recommendations for sparse users who have very few interactions. |
Tasks | Multi-Task Learning, Recommendation Systems |
Published | 2018-09-21 |
URL | https://arxiv.org/abs/1809.08161v4 |
PDF | https://arxiv.org/pdf/1809.08161v4.pdf |
PWC | https://paperswithcode.com/paper/learning-to-recommend-with-multiple-cascading |
Repo | |
Framework | |
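One plausible reading of the cascading multi-task design described above, sketched in PyTorch: shared user and item embeddings, one scoring head per behavior type, and each behavior's probability gated by the previous behavior in the cascade (view, then click, then purchase), trained with a joint loss. The class name, dimensions, and equal task weights are assumptions, not the NMTR specifics.

```python
# Hedged sketch of cascaded multi-behavior prediction: shared embeddings, one
# score per behavior type, each behavior's probability gated by the previous
# one in the cascade. The actual NMTR architecture and loss weighting differ.
import torch
import torch.nn as nn

class CascadedMultiTaskRec(nn.Module):
    def __init__(self, n_users, n_items, dim=32, n_behaviors=3):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        # one scoring head per behavior type, all sharing the embeddings
        self.heads = nn.ModuleList([nn.Linear(dim, 1) for _ in range(n_behaviors)])

    def forward(self, users, items):
        h = self.user_emb(users) * self.item_emb(items)       # element-wise interaction
        prob, probs = torch.ones(users.shape[0]), []
        for head in self.heads:                               # cascade over behavior types
            prob = prob * torch.sigmoid(head(h).squeeze(-1))  # later behavior implies the earlier one
            probs.append(prob)
        return torch.stack(probs, dim=-1)                     # (batch, n_behaviors)

model = CascadedMultiTaskRec(n_users=100, n_items=200)
users, items = torch.randint(0, 100, (8,)), torch.randint(0, 200, (8,))
labels = torch.randint(0, 2, (8, 3)).float()                  # view / click / purchase labels
loss = nn.functional.binary_cross_entropy(model(users, items), labels)  # joint multi-task loss
loss.backward()
print(float(loss))
```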
Image Provenance Analysis at Scale
Title | Image Provenance Analysis at Scale |
Authors | Daniel Moreira, Aparna Bharati, Joel Brogan, Allan Pinto, Michael Parowski, Kevin W. Bowyer, Patrick J. Flynn, Anderson Rocha, Walter J. Scheirer |
Abstract | Prior art has shown it is possible to estimate, through image processing and computer vision techniques, the types and parameters of transformations that have been applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image given the original images. This is a problem that recently has received the name of image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis, which works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored for the problem at hand, as well as novel techniques for obtaining the provenance graph that expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments for each stage of the pipeline is provided, comparing the proposed solution with state-of-the-art results, employing previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results. |
Tasks | |
Published | 2018-01-19 |
URL | http://arxiv.org/abs/1801.06510v2 |
PDF | http://arxiv.org/pdf/1801.06510v2.pdf |
PWC | https://paperswithcode.com/paper/image-provenance-analysis-at-scale |
Repo | |
Framework | |
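A conceptual sketch of one common way such a provenance graph can be obtained, which may or may not match the paper's exact construction: compute pairwise dissimilarities between the retrieved images and keep a spanning tree whose edges suggest parent-child relations. The descriptors, the distance, and the undirected tree below are all simplifying assumptions.

```python
# Conceptual sketch: a provenance graph as a spanning tree over pairwise image
# dissimilarities. The real pipeline (filtering, directed edges, transformation
# estimation) is considerably richer than this toy.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 128))                                    # stand-in image descriptors
diff = np.linalg.norm(features[:, None] - features[None, :], axis=-1)   # dissimilarity matrix

tree = minimum_spanning_tree(diff).toarray()
edges = [(i, j) for i in range(len(tree)) for j in range(len(tree)) if tree[i, j] > 0]
print("undirected provenance edges:", edges)
```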
On self-play computation of equilibrium in poker
Title | On self-play computation of equilibrium in poker |
Authors | Mikhail Goykhman |
Abstract | We compare performance of the genetic algorithm and the counterfactual regret minimization algorithm in computing the near-equilibrium strategies in the simplified poker games. We focus on the von Neumann poker and the simplified version of the Texas Hold’Em poker, and test outputs of the considered algorithms against analytical expressions defining the Nash equilibrium strategies. We comment on the performance of the studied algorithms against opponents deviating from equilibrium. |
Tasks | |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09282v1 |
PDF | http://arxiv.org/pdf/1805.09282v1.pdf |
PWC | https://paperswithcode.com/paper/on-self-play-computation-of-equilibrium-in |
Repo | |
Framework | |
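The counterfactual regret minimization baseline mentioned in the entry above is built on regret matching. The sketch below runs that update in self-play on rock-paper-scissors, where the average strategy approaches the uniform Nash equilibrium; poker CFR applies the same update at every information set of the game tree, which this toy omits.

```python
# Sketch of regret matching, the update at the heart of counterfactual regret
# minimization (CFR), run in self-play on rock-paper-scissors.
import numpy as np

payoff = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])    # row player's payoff matrix
rng = np.random.default_rng(0)

def regret_matching(regret):
    positive = np.maximum(regret, 0)
    return positive / positive.sum() if positive.sum() > 0 else np.ones_like(regret) / len(regret)

regrets = [np.zeros(3), np.zeros(3)]
strategy_sums = [np.zeros(3), np.zeros(3)]
for _ in range(20000):
    strategies = [regret_matching(r) for r in regrets]
    actions = [rng.choice(3, p=s) for s in strategies]
    utilities = [payoff[actions[0], actions[1]], -payoff[actions[0], actions[1]]]
    for p, opp_action in ((0, actions[1]), (1, actions[0])):
        action_utils = payoff[:, opp_action] if p == 0 else -payoff[opp_action, :]
        regrets[p] += action_utils - utilities[p]           # regret of not having played each action
        strategy_sums[p] += strategies[p]
print([np.round(s / s.sum(), 3) for s in strategy_sums])    # approaches [1/3, 1/3, 1/3] for both players
```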
Conditioning Optimization of Extreme Learning Machine by Multitask Beetle Antennae Swarm Algorithm
Title | Conditioning Optimization of Extreme Learning Machine by Multitask Beetle Antennae Swarm Algorithm |
Authors | Xixian Zhang, Zhijing Yang, Faxian Cao, Jiangzhong Cao, Meilin Wang, Nian Cai |
Abstract | The extreme learning machine (ELM), a simple and fast neural network, has shown good performance in various areas. Unlike the general single hidden layer feedforward neural network (SLFN), the input weights and hidden-layer biases of an ELM are generated randomly, so that training the model takes only a little computational overhead. However, the strategy of selecting input weights and biases at random may result in an ill-posed problem. Aiming to optimize the conditioning of ELM, we propose an effective particle swarm heuristic algorithm called the Multitask Beetle Antennae Swarm Algorithm (MBAS), which is inspired by the structures of the artificial bee colony (ABC) algorithm and the Beetle Antennae Search (BAS) algorithm. The proposed MBAS is then applied to optimize the input weights and biases of the ELM. Experimental results show that the proposed method is capable of simultaneously reducing the condition number and the regression error, and of achieving good generalization performance. |
Tasks | |
Published | 2018-11-22 |
URL | http://arxiv.org/abs/1811.09100v1 |
PDF | http://arxiv.org/pdf/1811.09100v1.pdf |
PWC | https://paperswithcode.com/paper/conditioning-optimization-of-extreme-learning |
Repo | |
Framework | |
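For reference, the sketch below shows a bare-bones ELM regressor: random input weights and biases, a single least-squares solve for the output weights, and the hidden-layer condition number that the proposed MBAS optimizer aims to reduce. The optimizer itself is not reproduced; the toy data, activation, and layer size are arbitrary assumptions.

```python
# Minimal sketch of an extreme learning machine (ELM) regressor: random hidden
# weights, one least-squares solve for the output weights, plus the condition
# number of the hidden-layer matrix that conditioning optimization targets.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 - X[:, 2]          # toy regression target

n_hidden = 50
W = rng.normal(size=(X.shape[1], n_hidden))                 # random input weights
b = rng.normal(size=n_hidden)                               # random biases
H = np.tanh(X @ W + b)                                      # hidden-layer output matrix

beta = np.linalg.pinv(H) @ y                                 # output weights via least squares
print("condition number of H:", np.linalg.cond(H))
print("training RMSE:", np.sqrt(np.mean((H @ beta - y) ** 2)))
```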