October 20, 2019

2992 words 15 mins read

Paper Group AWR 268

Adaptive Affinity Fields for Semantic Segmentation. Recursive Chaining of Reversible Image-to-image Translators For Face Aging. Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making. QUOTA: The Quantile Option Architecture for Reinforcement Learning. A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing …

Adaptive Affinity Fields for Semantic Segmentation

Title Adaptive Affinity Fields for Semantic Segmentation
Authors Tsung-Wei Ke, Jyh-Jing Hwang, Ziwei Liu, Stella X. Yu
Abstract Semantic segmentation has made much progress with increasingly powerful pixel-wise classifiers and incorporating structural priors via Conditional Random Fields (CRF) or Generative Adversarial Networks (GAN). We propose a simpler alternative that learns to verify the spatial structure of segmentation during training only. Unlike existing approaches that enforce semantic labels on individual pixels and match labels between neighbouring pixels, we propose the concept of Adaptive Affinity Fields (AAF) to capture and match the semantic relations between neighbouring pixels in the label space. We use adversarial learning to select the optimal affinity field size for each semantic category. It is formulated as a minimax problem, optimizing our segmentation neural network in a best worst-case learning scenario. AAF is versatile for representing structures as a collection of pixel-centric relations, easier to train than GAN and more efficient than CRF without run-time inference. Our extensive evaluations on PASCAL VOC 2012, Cityscapes, and GTA5 datasets demonstrate its above-par segmentation performance and robust generalization across domains.
Tasks Semantic Segmentation
Published 2018-03-27
URL http://arxiv.org/abs/1803.10335v3
PDF http://arxiv.org/pdf/1803.10335v3.pdf
PWC https://paperswithcode.com/paper/adaptive-affinity-fields-for-semantic
Repo https://github.com/twke18/Adaptive_Affinity_Fields
Framework tf
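
The core of AAF is a pairwise loss defined in the label space rather than on individual pixels. The snippet below is a minimal NumPy sketch of such an affinity loss for a single neighbour offset: predictions of pixels that share a ground-truth label are pulled together, while predictions across a label boundary are pushed apart by a margin. The adversarial selection of per-class field sizes from the paper is omitted, and all tensors are random stand-ins.

```python
# Minimal sketch of a pairwise affinity loss in the label space; not the
# authors' exact AAF formulation (which also selects field sizes adversarially).
import numpy as np

def affinity_loss(probs, labels, offset=(0, 1), margin=3.0, eps=1e-8):
    dy, dx = offset                      # assume non-negative offsets for brevity
    H, W, _ = probs.shape
    p, q = probs[:H - dy, :W - dx], probs[dy:, dx:]
    yp, yq = labels[:H - dy, :W - dx], labels[dy:, dx:]

    # Symmetric KL divergence between neighbouring predictions.
    kl = np.sum(p * np.log((p + eps) / (q + eps))
                + q * np.log((q + eps) / (p + eps)), axis=-1)
    same = (yp == yq)
    # Pull predictions together inside a region, push them apart across edges.
    loss = np.where(same, kl, np.maximum(0.0, margin - kl))
    return loss.mean()

probs = np.random.dirichlet(np.ones(21), size=(64, 64))   # (64, 64, 21) softmax outputs
labels = np.random.randint(0, 21, size=(64, 64))          # (64, 64) ground-truth labels
print(affinity_loss(probs, labels))
```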

Recursive Chaining of Reversible Image-to-image Translators For Face Aging

Title Recursive Chaining of Reversible Image-to-image Translators For Face Aging
Authors Ari Heljakka, Arno Solin, Juho Kannala
Abstract This paper addresses the modeling and simulation of progressive changes over time, such as human face aging. By treating the age phases as a sequence of image domains, we construct a chain of transformers that map images from one age domain to the next. Leveraging recent adversarial image translation methods, our approach requires no training samples of the same individual at different ages. Here, the model must be flexible enough to translate a child face to a young adult, and all the way through adulthood to old age. We find that some transformers in the chain can be recursively applied on their own output to cover multiple phases, compressing the chain. The structure of the chain also unearths information about the underlying physical process. We demonstrate the performance of our method with precise and intuitive metrics, and visually match the face aging state of the art.
Tasks
Published 2018-02-14
URL http://arxiv.org/abs/1802.05023v2
PDF http://arxiv.org/pdf/1802.05023v2.pdf
PWC https://paperswithcode.com/paper/recursive-chaining-of-reversible-image-to
Repo https://github.com/AaltoVision/img-transformer-chain
Framework tf
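
A small sketch of the chained-translator idea: each generator maps faces from one age domain to the next, and a single generator may be applied recursively to its own output to cover several consecutive phases. The "generators" below are placeholder functions, not trained image translators.

```python
# Toy sketch of chaining image-to-image translators, with recursive reuse of a
# generator across consecutive phases. The transforms are stand-ins.
import numpy as np

def age_through_chain(img, generators, repeats):
    """Apply generator g_i repeats[i] times, in order along the chain."""
    for g, r in zip(generators, repeats):
        for _ in range(r):
            img = g(img)
    return img

darken = lambda x: np.clip(x * 0.95, 0, 1)     # placeholder "aging" steps
soften = lambda x: np.clip(x + 0.02, 0, 1)
face = np.random.rand(128, 128, 3)             # stand-in face image
aged = age_through_chain(face, [soften, darken], repeats=[1, 3])
print(aged.shape)
```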

Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making

Title Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making
Authors Daniel Seita, Nawid Jamali, Michael Laskey, Ajay Kumar Tanwani, Ron Berenstein, Prakash Baskaran, Soshi Iba, John Canny, Ken Goldberg
Abstract A fundamental challenge in manipulating fabric for clothes folding and textiles manufacturing is computing “pick points” to effectively modify the state of an uncertain manifold. We present a supervised deep transfer learning approach to locate pick points using depth images for invariance to color and texture. We consider the task of bed-making, where a robot sequentially grasps and pulls at pick points to increase blanket coverage. We perform physical experiments with two mobile manipulator robots, the Toyota HSR and the Fetch, and three blankets of different colors and textures. We compare coverage results from (1) human supervision, (2) a baseline of picking at the uppermost blanket point, and (3) learned pick points. On a quarter-scale twin bed, a model trained with combined data from the two robots achieves 92% blanket coverage compared with 83% for the baseline and 95% for human supervisors. The model transfers to two novel blankets and achieves 93% coverage. Average coverage results of 92% for 193 beds suggest that transfer-invariant robot pick points on fabric can be effectively learned.
Tasks Decision Making, Deformable Object Manipulation, Transfer Learning
Published 2018-09-26
URL https://arxiv.org/abs/1809.09810v3
PDF https://arxiv.org/pdf/1809.09810v3.pdf
PWC https://paperswithcode.com/paper/robot-bed-making-deep-transfer-learning-using
Repo https://github.com/DanielTakeshi/fast_grasp_detect
Framework tf
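
The paper frames pick-point detection as supervised learning on depth images. Below is an illustrative stand-in for that setup: a small convolutional regressor mapping a depth image to normalized (x, y) pick-point coordinates. The actual repository uses a TensorFlow grasp-detection network; this sketch only shows the shape of the problem.

```python
# Illustrative pick-point regressor on depth images; not the authors' network.
import torch
import torch.nn as nn

class PickPointNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 16, 64),
                                  nn.ReLU(), nn.Linear(64, 2))  # (x, y) pick point

    def forward(self, depth):                  # depth: (B, 1, H, W)
        return self.head(self.features(depth))

net = PickPointNet()
depth = torch.rand(4, 1, 224, 224)             # batch of depth images
target = torch.rand(4, 2)                      # normalized pick-point coordinates
loss = nn.functional.mse_loss(net(depth), target)
loss.backward()
print(loss.item())
```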

QUOTA: The Quantile Option Architecture for Reinforcement Learning

Title QUOTA: The Quantile Option Architecture for Reinforcement Learning
Authors Shangtong Zhang, Borislav Mavrin, Linglong Kong, Bo Liu, Hengshuai Yao
Abstract In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration, based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only its mean. QUOTA provides a new dimension for exploration by making use of both the optimism and the pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
Tasks Decision Making, Distributional Reinforcement Learning
Published 2018-11-05
URL http://arxiv.org/abs/1811.02073v2
PDF http://arxiv.org/pdf/1811.02073v2.pdf
PWC https://paperswithcode.com/paper/quota-the-quantile-option-architecture-for
Repo https://github.com/pihey1995/DistributionalRL
Framework pytorch
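
The key idea is that action selection can be driven by individual quantiles of the value distribution rather than its mean. The toy sketch below shows quantile-based greedy action selection, where choosing a low quantile gives pessimistic behaviour and a high quantile gives optimistic exploration; the paper's full option framework, which learns when to switch quantiles, is not reproduced.

```python
# Toy quantile-based action selection in the spirit of QUOTA (simplified).
import numpy as np

rng = np.random.default_rng(0)
n_actions, n_quantiles = 4, 11
Z = rng.normal(size=(n_actions, n_quantiles))      # quantile estimates Z[a, j]
Z.sort(axis=1)

def act(Z, option, eps=0.1):
    """Greedy action under the quantile index selected by `option`."""
    if rng.random() < eps:
        return int(rng.integers(Z.shape[0]))
    return int(np.argmax(Z[:, option]))

print("pessimistic option:", act(Z, option=1))                 # low quantile
print("mean-based choice :", int(np.argmax(Z.mean(axis=1))))   # classic criterion
print("optimistic option :", act(Z, option=n_quantiles - 1))   # high quantile
```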

A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing Values

Title A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing Values
Authors Ruifei Cui, Ioan Gabriel Bucur, Perry Groot, Tom Heskes
Abstract We consider the problem of learning parameters of latent variable models from mixed (continuous and ordinal) data with missing values. We propose a novel Bayesian Gaussian copula factor (BGCF) approach that is consistent under certain conditions and that is quite robust to the violations of these conditions. In simulations, BGCF substantially outperforms two state-of-the-art alternative approaches. An illustration on the 'Holzinger & Swineford 1939' dataset indicates that BGCF is favorable over the so-called robust maximum likelihood (MLR) even if the data match the assumptions of MLR.
Tasks Latent Variable Models
Published 2018-06-12
URL http://arxiv.org/abs/1806.04610v1
PDF http://arxiv.org/pdf/1806.04610v1.pdf
PWC https://paperswithcode.com/paper/a-novel-bayesian-approach-for-latent-variable
Repo https://github.com/cuiruifei/CopulaFactorModel
Framework none
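
A Gaussian copula treats each observed variable, continuous or ordinal, through its ranks, mapping it to latent normal scores before modeling correlations. The sketch below shows only this deterministic rank-to-normal-scores core on toy data with missing values; the paper's actual contribution is a Bayesian Gibbs sampler over a copula factor model, which is not shown here.

```python
# Rank-based inverse normal transform: the deterministic core of the Gaussian
# copula idea, not the Bayesian copula factor model itself.
import numpy as np
from scipy import stats

def normal_scores(x):
    """Map a 1-D array to normal scores via its empirical CDF (NaNs kept as NaN)."""
    z = np.full_like(x, np.nan, dtype=float)
    ok = ~np.isnan(x)
    ranks = stats.rankdata(x[ok])
    z[ok] = stats.norm.ppf(ranks / (ok.sum() + 1))
    return z

rng = np.random.default_rng(1)
cont = rng.gamma(2.0, size=200)                    # skewed continuous variable
ordi = rng.integers(1, 5, size=200).astype(float)  # ordinal variable
ordi[rng.random(200) < 0.1] = np.nan               # some missing values

Z = np.column_stack([normal_scores(cont), normal_scores(ordi)])
print(np.corrcoef(Z[~np.isnan(Z).any(axis=1)].T))  # correlation on the latent scale
```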

Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News

Title Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News
Authors Luís Borges, Bruno Martins, Pável Calado
Abstract Fake news is nowadays an issue of pressing concern, given its recent rise as a potential threat to high-quality journalism and well-informed public discourse. The Fake News Challenge (FNC-1) was organized in 2017 to encourage the development of machine learning-based classification systems for stance detection (i.e., for identifying whether a particular news article agrees, disagrees, discusses, or is unrelated to a particular news headline), thus helping in the detection and analysis of possible instances of fake news. This article presents a new approach to tackle this stance detection problem, based on the combination of string similarity features with a deep neural architecture that leverages ideas previously advanced in the context of learning efficient text representations, document classification, and natural language inference. Specifically, we use bi-directional Recurrent Neural Networks, together with max-pooling over the temporal/sequential dimension and neural attention, for representing (i) the headline, (ii) the first two sentences of the news article, and (iii) the entire news article. These representations are then combined/compared, complemented with similarity features inspired by other FNC-1 approaches, and passed to a final layer that predicts the stance of the article towards the headline. We also explore the use of external sources of information, specifically large datasets of sentence pairs originally proposed for training and evaluating natural language inference methods, in order to pre-train specific components of the neural network architecture (e.g., the RNNs used for encoding sentences). The obtained results attest to the effectiveness of the proposed ideas and show that our model, particularly when considering pre-training and the combination of neural representations with similarity features, slightly outperforms the previous state of the art.
Tasks Document Classification, Natural Language Inference, Representation Learning, Stance Detection
Published 2018-11-02
URL http://arxiv.org/abs/1811.00706v1
PDF http://arxiv.org/pdf/1811.00706v1.pdf
PWC https://paperswithcode.com/paper/combining-similarity-features-and-deep
Repo https://github.com/LuisPB7/fnc-msc
Framework none
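
The encoder described in the abstract is a bidirectional RNN with max-pooling over time, whose outputs for the headline and the article are then combined NLI-style. The PyTorch sketch below illustrates that encoder and pairing step with made-up dimensions; attention, the hand-crafted similarity features, and NLI pre-training are omitted.

```python
# BiLSTM + max-pooling sentence encoder and NLI-style pairing (illustrative only).
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, tokens):                # tokens: (B, T) integer ids
        out, _ = self.rnn(self.emb(tokens))   # (B, T, 2*hidden)
        return out.max(dim=1).values          # max-pool over time -> (B, 2*hidden)

enc = BiLSTMMaxEncoder()
headline = torch.randint(1, 10000, (8, 12))
body = torch.randint(1, 10000, (8, 60))
h, b = enc(headline), enc(body)
pair = torch.cat([h, b, (h - b).abs(), h * b], dim=-1)   # standard NLI-style pairing
print(pair.shape)                                        # (8, 1024)
```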

WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration

Title WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration
Authors Alexander Kirillov, Natalia Krizhanovsky, Andrew Krizhanovsky
Abstract The problem of word sense disambiguation (WSD) is considered in the article. Given a set of synonyms (synsets) and sentences containing these synonyms, the task is to automatically select the meaning of the word in the sentence. 1285 sentences were tagged by experts, who selected one of the dictionary meanings for each target word. To solve the WSD problem, an algorithm based on a new method of calculating the proximity of vector-word contexts is proposed. In order to achieve higher accuracy, a preliminary epsilon-filtration of words is performed, both in the sentence and in the set of synonyms. An extensive program of experiments was carried out: four algorithms were implemented, including the new one, and in a number of cases the new algorithm shows better results. The developed software and the tagged corpus have an open license and are available online. Wiktionary and Wikisource are used. A brief description of this work can be viewed in slides (https://goo.gl/9ak6Gt), and a video lecture in Russian on this research is available online (https://youtu.be/-DLmRkepf58).
Tasks Word Sense Disambiguation
Published 2018-05-24
URL http://arxiv.org/abs/1805.09559v2
PDF http://arxiv.org/pdf/1805.09559v2.pdf
PWC https://paperswithcode.com/paper/wsd-algorithm-based-on-a-new-method-of-vector
Repo https://github.com/componavt/wcorpus
Framework none
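
The algorithm scores candidate senses by the proximity of vector-word contexts after an epsilon-filtration step that drops context words too dissimilar from the target. The sketch below illustrates that pipeline with random stand-in word vectors, so its output is arbitrary; the paper uses trained embeddings and a more elaborate proximity measure.

```python
# Toy vector-based WSD with epsilon-filtration of context words (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
vec = {w: rng.normal(size=50) for w in
       ["bank", "river", "water", "money", "loan", "shore", "deposit"]}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def disambiguate(target, context, synsets, eps=0.0):
    # epsilon-filtration: keep only context words sufficiently close to the target
    kept = [w for w in context if cos(vec[w], vec[target]) > eps]
    ctx = np.mean([vec[w] for w in kept], axis=0) if kept else vec[target]
    scores = {s: cos(ctx, np.mean([vec[w] for w in words], axis=0))
              for s, words in synsets.items()}
    return max(scores, key=scores.get)

synsets = {"finance": ["money", "loan", "deposit"], "geo": ["river", "water", "shore"]}
# With random vectors the winner is arbitrary; with real embeddings it is not.
print(disambiguate("bank", ["river", "water", "money"], synsets))
```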

Norm-Ranging LSH for Maximum Inner Product Search

Title Norm-Ranging LSH for Maximum Inner Product Search
Authors Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng
Abstract Neyshabur and Srebro proposed Simple-LSH, the state-of-the-art hashing method for maximum inner product search (MIPS) with a performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing-based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, a novel similarity metric is formulated. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH at the same recall, thus significantly benefiting applications that involve MIPS.
Tasks
Published 2018-09-24
URL http://arxiv.org/abs/1809.08782v2
PDF http://arxiv.org/pdf/1809.08782v2.pdf
PWC https://paperswithcode.com/paper/norm-ranging-lsh-for-maximum-inner-product
Repo https://github.com/xinyandai/similarity-search
Framework none
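
Norm-ranging LSH partitions the dataset by 2-norm and applies the Simple-LSH MIPS-to-cosine transform within each partition using its local maximum norm. The sketch below builds such per-range indexes with plain random-hyperplane hashing; the paper's dedicated similarity metric for querying across sub-indexes is not included.

```python
# Norm-range partitioning + Simple-LSH transform, indexed with random hyperplanes.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 32)) * rng.gamma(1.0, size=(10000, 1))  # long-tailed norms

def simple_lsh_transform(S):
    """Scale by the local max norm and append sqrt(1 - ||x||^2) as an extra coordinate."""
    Xs = S / np.linalg.norm(S, axis=1).max()
    extra = np.sqrt(np.clip(1.0 - (Xs ** 2).sum(axis=1), 0.0, None))
    return np.hstack([Xs, extra[:, None]])

norms = np.linalg.norm(X, axis=1)
splits = np.array_split(np.argsort(norms), 4)       # 4 norm ranges (sub-datasets)
planes = rng.normal(size=(33, 16))                  # shared random hyperplanes

indexes = []
for idx in splits:
    P = simple_lsh_transform(X[idx])
    codes = (P @ planes > 0).astype(np.uint8)       # 16-bit hash code per item
    indexes.append((idx, codes))
print([c.shape for _, c in indexes])
```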

Learning Symmetry Consistent Deep CNNs for Face Completion

Title Learning Symmetry Consistent Deep CNNs for Face Completion
Authors Xiaoming Li, Ming Liu, Jieru Zhu, Wangmeng Zuo, Meng Wang, Guosheng Hu, Lei Zhang
Abstract Deep convolutional networks (CNNs) have achieved great success in face completion, generating plausible facial structures. These methods, however, are limited in maintaining global consistency among face components and recovering fine facial details. On the other hand, reflectional symmetry is a prominent property of face images that benefits face recognition and consistency modeling, yet it remains uninvestigated in deep face completion. In this work, we leverage two kinds of symmetry-enforcing subnets to form a symmetry-consistent CNN model (i.e., SymmFCNet) for effective face completion. For missing pixels on only one of the half-faces, an illumination-reweighted warping subnet is developed to guide the warping and illumination reweighting of the other half-face. As for missing pixels on both half-faces, we present a generative reconstruction subnet together with a perceptual symmetry loss to enforce symmetry consistency of the recovered structures. SymmFCNet is constructed by stacking the generative reconstruction subnet upon the illumination-reweighted warping subnet, and can be learned end-to-end from a training set of unaligned face images. Experiments show that SymmFCNet can generate high-quality results on images with synthetic and real occlusion, and performs favorably against the state of the art.
Tasks Face Recognition, Facial Inpainting
Published 2018-12-19
URL http://arxiv.org/abs/1812.07741v1
PDF http://arxiv.org/pdf/1812.07741v1.pdf
PWC https://paperswithcode.com/paper/learning-symmetry-consistent-deep-cnns-for
Repo https://github.com/csxmli2016/SymmFCNet
Framework pytorch
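
One ingredient of the method is a perceptual symmetry loss that compares deep features of the completed face with those of its mirrored counterpart inside the missing region. The sketch below is a heavily simplified version that assumes a roughly aligned frontal face, so a horizontal flip stands in for the learned warping and illumination reweighting; the feature extractor is an untrained stand-in.

```python
# Simplified "symmetry consistency" loss under a frontal-face/flip assumption.
import torch
import torch.nn as nn

feat = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                     nn.Conv2d(16, 16, 3, padding=1))   # stand-in feature extractor

def symmetry_loss(completed, mask):
    """completed: (B,3,H,W) network output; mask: (B,1,H,W), 1 = missing pixels."""
    f = feat(completed)
    f_flip = feat(torch.flip(completed, dims=[3]))       # mirror along the width axis
    return ((f - f_flip).abs() * mask).sum() / mask.sum().clamp(min=1)

img = torch.rand(2, 3, 64, 64, requires_grad=True)
mask = (torch.rand(2, 1, 64, 64) > 0.8).float()
symmetry_loss(img, mask).backward()
print(img.grad.shape)
```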

Anticipation in Human-Robot Cooperation: A Recurrent Neural Network Approach for Multiple Action Sequences Prediction

Title Anticipation in Human-Robot Cooperation: A Recurrent Neural Network Approach for Multiple Action Sequences Prediction
Authors Paul Schydlo, Mirko Rakovic, Lorenzo Jamone, José Santos-Victor
Abstract Close human-robot cooperation is a key enabler for new developments in advanced manufacturing and assistive applications. Close cooperation requires robots that can predict human actions and intent, and understand human non-verbal cues. Recent approaches based on neural networks have led to encouraging results on the human action prediction problem, both in continuous and discrete spaces. Our approach extends the research in this direction. Our contributions are three-fold. First, we validate the use of gaze and body pose cues as a means of predicting human action through a feature selection method. Next, we address two shortcomings of the existing literature: predicting multiple and variable-length action sequences. This is achieved by introducing an encoder-decoder recurrent neural network topology in the discrete action prediction problem. In addition, we theoretically demonstrate the importance of predicting multiple action sequences as a means of estimating the stochastic reward in a human-robot cooperation scenario. Finally, we show the ability to effectively train the prediction model on an action prediction dataset involving human motion data, and explore the influence of the model’s parameters on its performance. Source code repository: https://github.com/pschydlo/ActionAnticipation
Tasks Feature Selection
Published 2018-02-28
URL http://arxiv.org/abs/1802.10503v3
PDF http://arxiv.org/pdf/1802.10503v3.pdf
PWC https://paperswithcode.com/paper/anticipation-in-human-robot-cooperation-a
Repo https://github.com/pschydlo/ActionAnticipation
Framework tf
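
The architectural contribution is an encoder-decoder RNN that consumes observed gaze/pose features and emits a variable-length sequence of discrete future actions. The PyTorch sketch below shows that topology with a greedy decoding loop; dimensions and decoding details are illustrative choices, not the authors' configuration.

```python
# Encoder-decoder RNN for discrete action-sequence prediction (illustrative).
import torch
import torch.nn as nn

class ActionAnticipator(nn.Module):
    def __init__(self, feat_dim=20, n_actions=10, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.decoder = nn.GRUCell(n_actions, hidden)
        self.out = nn.Linear(hidden, n_actions)
        self.n_actions = n_actions

    def forward(self, obs, max_len=5):        # obs: (B, T, feat_dim) gaze/pose features
        _, h = self.encoder(obs)
        h = h.squeeze(0)                      # (B, hidden) summary of the observation
        inp = torch.zeros(obs.size(0), self.n_actions)
        preds = []
        for _ in range(max_len):              # greedy decoding of future actions
            h = self.decoder(inp, h)
            logits = self.out(h)
            preds.append(logits.argmax(dim=-1))
            inp = nn.functional.one_hot(preds[-1], self.n_actions).float()
        return torch.stack(preds, dim=1)      # (B, max_len) predicted action ids

model = ActionAnticipator()
print(model(torch.rand(4, 30, 20)).shape)     # torch.Size([4, 5])
```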

Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides

Title Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides
Authors Naofumi Tomita, Behnaz Abdollahi, Jason Wei, Bing Ren, Arief Suriawinata, Saeed Hassanpour
Abstract Deep learning-based methods, such as the sliding window approach for cropped-image classification and heuristic aggregation for whole-slide inference, for analyzing histological patterns in high-resolution microscopy images have shown promising results. These approaches, however, require a laborious annotation process and are fragmented. This diagnostic study collected deidentified high-resolution histological images (N = 379) for training a new model composed of a convolutional neural network and a grid-based attention network, trainable without region-of-interest annotations. Histological images of patients who underwent endoscopic esophagus and gastroesophageal junction mucosal biopsy between January 1, 2016, and December 31, 2018, at Dartmouth-Hitchcock Medical Center (Lebanon, New Hampshire) were collected. The method achieved a mean accuracy of 0.83 in classifying 123 test images. These results were comparable with or better than the performance from the current state-of-the-art sliding window approach, which was trained with regions of interest. Results of this study suggest that the proposed attention-based deep neural network framework for Barrett esophagus and esophageal adenocarcinoma detection is important because it is based solely on tissue-level annotations, unlike existing methods that are based on regions of interest. This new model is expected to open avenues for applying deep learning to digital pathology.
Tasks Crop Classification, Image Classification, Medical Object Detection
Published 2018-11-20
URL https://arxiv.org/abs/1811.08513v2
PDF https://arxiv.org/pdf/1811.08513v2.pdf
PWC https://paperswithcode.com/paper/finding-a-needle-in-the-haystack-attention
Repo https://github.com/BMIRDS/deepslide
Framework pytorch
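
The model combines a tile-level CNN with a grid-based attention network so that only slide-level (tissue-level) labels are needed. The sketch below shows that attention-pooling structure for a single slide treated as a bag of tiles; the layer sizes are placeholders and the real model differs in detail.

```python
# Attention pooling over slide tiles, trainable from slide-level labels only.
import torch
import torch.nn as nn

class AttentionSlideClassifier(nn.Module):
    def __init__(self, emb=64, n_classes=4):
        super().__init__()
        self.tile_cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, emb), nn.ReLU())
        self.attn = nn.Sequential(nn.Linear(emb, 32), nn.Tanh(), nn.Linear(32, 1))
        self.cls = nn.Linear(emb, n_classes)

    def forward(self, tiles):                   # tiles: (N_tiles, 3, H, W), one slide
        e = self.tile_cnn(tiles)                # (N, emb) tile embeddings
        w = torch.softmax(self.attn(e), dim=0)  # (N, 1) attention over tiles
        slide = (w * e).sum(dim=0)              # attention-pooled slide embedding
        return self.cls(slide), w.squeeze(1)

model = AttentionSlideClassifier()
logits, weights = model(torch.rand(36, 3, 64, 64))   # 6x6 grid of tiles
print(logits.shape, weights.shape)                   # (4,), (36,)
```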

Persistence Bag-of-Words for Topological Data Analysis

Title Persistence Bag-of-Words for Topological Data Analysis
Authors Bartosz Zieliński, Michał Lipiński, Mateusz Juda, Matthias Zeppelzauer, Paweł Dłotko
Abstract Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs exhibit, however, complex structure and are difficult to integrate in today’s machine learning workflows. This paper introduces persistence bag-of-words: a novel and stable vectorized representation of PDs that enables the seamless integration with machine learning. Comprehensive experiments show that the new representation achieves state-of-the-art performance and beyond in much less time than alternative approaches.
Tasks Topological Data Analysis
Published 2018-12-21
URL https://arxiv.org/abs/1812.09245v3
PDF https://arxiv.org/pdf/1812.09245v3.pdf
PWC https://paperswithcode.com/paper/persistence-bag-of-words-for-topological-data
Repo https://github.com/bziiuj/pcodebooks
Framework none
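
Persistence bag-of-words vectorizes a persistence diagram by quantizing its (birth, persistence) points against a codebook learned from training diagrams and histogramming the assignments. The sketch below implements that plain pipeline with k-means on toy diagrams; the stable and weighted variants studied in the paper are not shown.

```python
# Plain persistence bag-of-words: codebook by k-means, histogram per diagram.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def random_diagram(n):
    birth = rng.random(n)
    death = birth + rng.exponential(0.2, size=n)
    return np.column_stack([birth, death])         # toy (birth, death) pairs

diagrams = [random_diagram(rng.integers(20, 60)) for _ in range(30)]

def to_birth_persistence(d):
    return np.column_stack([d[:, 0], d[:, 1] - d[:, 0]])

codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(
    np.vstack([to_birth_persistence(d) for d in diagrams]))

def pbow(diagram):
    words = codebook.predict(to_birth_persistence(diagram))
    return np.bincount(words, minlength=16) / len(words)   # normalized histogram

print(pbow(diagrams[0]))
```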

Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding

Title Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding
Authors Deunsol Yoon, Dongbok Lee, SangKeun Lee
Abstract In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule networks (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods on the Stanford Natural Language Inference (SNLI) dataset with the least number of parameters, while showing competitive results on the Stanford Sentiment Treebank (SST) dataset.
Tasks Natural Language Inference, Sentence Embedding
Published 2018-08-22
URL http://arxiv.org/abs/1808.07383v1
PDF http://arxiv.org/pdf/1808.07383v1.pdf
PWC https://paperswithcode.com/paper/dynamic-self-attention-computing-attention
Repo https://github.com/dsindex/iclassifier
Framework pytorch
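
DSA replaces learned attention weights with a dynamic-routing update: routing logits start at zero and are refined by the agreement between each word vector and the current sentence vector. The NumPy sketch below shows that iteration in its barest form; the actual model adds learned projections and multiple attention vectors.

```python
# Bare dynamic-routing attention over word vectors (capsule-style update).
import numpy as np

def squash(v, eps=1e-8):
    n = np.linalg.norm(v) + eps
    return (n ** 2 / (1 + n ** 2)) * v / n

def dynamic_self_attention(H, iters=3):
    """H: (T, d) word vectors -> (d,) sentence vector and attention weights."""
    logits = np.zeros(H.shape[0])
    for _ in range(iters):
        a = np.exp(logits) / np.exp(logits).sum()   # softmax over words
        s = squash(a @ H)                           # candidate sentence vector
        logits = logits + H @ s                     # agreement update
    return s, a

words = np.random.default_rng(0).normal(size=(12, 64))
sent, attn = dynamic_self_attention(words)
print(sent.shape, attn.round(3))
```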

Practical methods for graph two-sample testing

Title Practical methods for graph two-sample testing
Authors Debarghya Ghoshdastidar, Ulrike von Luxburg
Abstract Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies in statistics and learning theory have provided some theoretical insights about such high-dimensional graph testing problems, but the practicality of the developed theoretical methods remains an open question. In this paper, we consider the problem of two-sample testing of large graphs. We demonstrate the practical merits and limitations of existing theoretical tests and their bootstrapped variants. We also propose two new tests based on asymptotic distributions. We show that these tests are computationally less expensive and, in some cases, more reliable than the existing methods.
Tasks
Published 2018-11-30
URL http://arxiv.org/abs/1811.12752v1
PDF http://arxiv.org/pdf/1811.12752v1.pdf
PWC https://paperswithcode.com/paper/practical-methods-for-graph-two-sample
Repo https://github.com/gdebarghya/Network-TwoSampleTesting
Framework none
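
The setting is two samples of graphs on a common node set, and the question is whether they come from the same model. The sketch below is a deliberately crude baseline, an entrywise z-test on edge frequencies with a Bonferroni cutoff, meant only to illustrate the problem; it is not one of the tests proposed in the paper.

```python
# Toy two-sample test on edge frequencies of two graph samples (baseline only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 30, 20                                     # 30 nodes, 20 graphs per sample
mask = np.triu(np.ones((n, n), dtype=bool), 1)    # upper-triangular edge positions

def sample_er(p, size):
    A = (rng.random((size, n, n)) < p) & mask
    return (A | A.transpose(0, 2, 1)).astype(float)   # symmetric adjacency matrices

G1, G2 = sample_er(0.20, m), sample_er(0.25, m)

p1, p2 = G1.mean(axis=0), G2.mean(axis=0)         # edge-frequency estimates
var = p1 * (1 - p1) / m + p2 * (1 - p2) / m
iu = np.triu_indices(n, 1)
z = (p1 - p2)[iu] / np.sqrt(var[iu] + 1e-12)

t_stat = np.abs(z).max()
alpha, n_pairs = 0.05, len(z)
threshold = stats.norm.ppf(1 - alpha / (2 * n_pairs))   # crude Bonferroni cutoff
print(f"max |z| = {t_stat:.2f}, reject H0: {t_stat > threshold}")
```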

AXNet: ApproXimate computing using an end-to-end trainable neural network

Title AXNet: ApproXimate computing using an end-to-end trainable neural network
Authors Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang
Abstract Neural network based approximate computing is a universal architecture promising to deliver tremendous energy-efficiency gains for many error-resilient applications. To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor. The approximator provides the approximate results, while the predictor predicts whether the input data is safe to approximate with the given quality requirement. However, it is non-trivial and time-consuming to make these two neural networks coordinate—they have different optimization objectives—by training them separately. This paper proposes a novel neural network structure—AXNet—that fuses the two NNs into a holistic, end-to-end trainable NN. Leveraging the philosophy of multi-task learning, AXNet can tremendously improve the invocation (the proportion of safe-to-approximate samples) and reduce the approximation error. The training effort also decreases significantly. Experimental results show 50.7% more invocation and substantial cuts in training time compared to the existing neural network based approximate computing framework.
Tasks Multi-Task Learning
Published 2018-07-27
URL http://arxiv.org/abs/1807.10458v2
PDF http://arxiv.org/pdf/1807.10458v2.pdf
PWC https://paperswithcode.com/paper/axnet-approximate-computing-using-an-end-to
Repo https://github.com/ACA-Lab-SJTU/approximate-computing
Framework tf
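
AXNet fuses the approximator and the predictor into one network with a shared trunk and two heads, trained jointly as a multi-task problem. The PyTorch sketch below shows that structure with an arbitrary target function and loss weighting; layer sizes and the quality bound are illustrative, not the paper's design.

```python
# Shared trunk with an approximator head and a "safe to approximate" predictor head.
import torch
import torch.nn as nn

class AXNet(nn.Module):
    def __init__(self, in_dim=8, out_dim=1, hidden=32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.approximator = nn.Linear(hidden, out_dim)     # approximate function value
        self.predictor = nn.Linear(hidden, 1)              # "safe to approximate" logit

    def forward(self, x):
        h = self.trunk(x)
        return self.approximator(h), self.predictor(h)

net = AXNet()
x = torch.rand(64, 8)
y = x.sum(dim=1, keepdim=True)                   # stand-in for the exact function
approx, safe_logit = net(x)
safe_label = ((approx - y).abs() < 0.5).float().detach()   # within the quality bound?
loss = nn.functional.mse_loss(approx, y) + \
       nn.functional.binary_cross_entropy_with_logits(safe_logit, safe_label)
loss.backward()
print(float(loss))
```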