Paper Group AWR 268
Adaptive Affinity Fields for Semantic Segmentation. Recursive Chaining of Reversible Image-to-image Translators For Face Aging. Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making. QUOTA: The Quantile Option Architecture for Reinforcement Learning. A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing …
Adaptive Affinity Fields for Semantic Segmentation
Title | Adaptive Affinity Fields for Semantic Segmentation |
Authors | Tsung-Wei Ke, Jyh-Jing Hwang, Ziwei Liu, Stella X. Yu |
Abstract | Semantic segmentation has made much progress with increasingly powerful pixel-wise classifiers and incorporating structural priors via Conditional Random Fields (CRF) or Generative Adversarial Networks (GAN). We propose a simpler alternative that learns to verify the spatial structure of segmentation during training only. Unlike existing approaches that enforce semantic labels on individual pixels and match labels between neighbouring pixels, we propose the concept of Adaptive Affinity Fields (AAF) to capture and match the semantic relations between neighbouring pixels in the label space. We use adversarial learning to select the optimal affinity field size for each semantic category. It is formulated as a minimax problem, optimizing our segmentation neural network in a best worst-case learning scenario. AAF is versatile for representing structures as a collection of pixel-centric relations, easier to train than GAN and more efficient than CRF without run-time inference. Our extensive evaluations on PASCAL VOC 2012, Cityscapes, and GTA5 datasets demonstrate its above-par segmentation performance and robust generalization across domains. |
Tasks | Semantic Segmentation |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10335v3 |
http://arxiv.org/pdf/1803.10335v3.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-affinity-fields-for-semantic |
Repo | https://github.com/twke18/Adaptive_Affinity_Fields |
Framework | tf |
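The affinity-field idea in the abstract above lends itself to a short illustration. The NumPy sketch below is mine (function and parameter names are not from the authors' repository): it computes a toy region-wise affinity loss for a single field size, pulling neighbouring predictions together where the labels agree and pushing them apart across label boundaries. AAF additionally selects the field size per class adversarially, which is omitted here.

```python
import numpy as np

def affinity_loss(probs, labels, size=1, margin=3.0):
    """Toy pixel-affinity loss in the spirit of AAF (not the authors' code).

    probs:  (H, W, C) softmax predictions
    labels: (H, W) integer class map
    size:   affinity field radius (the quantity AAF selects adversarially per class)
    """
    h, w, _ = probs.shape
    eps = 1e-8
    loss, count = 0.0, 0
    # 8-neighbourhood at the given field size
    offsets = [(dy, dx) for dy in (-size, 0, size) for dx in (-size, 0, size)
               if not (dy == 0 and dx == 0)]
    for dy, dx in offsets:
        ys, xs = slice(max(dy, 0), h + min(dy, 0)), slice(max(dx, 0), w + min(dx, 0))
        ys2, xs2 = slice(max(-dy, 0), h + min(-dy, 0)), slice(max(-dx, 0), w + min(-dx, 0))
        p, q = probs[ys, xs], probs[ys2, xs2]
        same = (labels[ys, xs] == labels[ys2, xs2])
        # KL divergence between neighbouring predicted distributions
        kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
        # pull predictions together inside a region, push apart across boundaries
        loss += np.where(same, kl, np.maximum(0.0, margin - kl)).mean()
        count += 1
    return loss / count
```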
Recursive Chaining of Reversible Image-to-image Translators For Face Aging
Title | Recursive Chaining of Reversible Image-to-image Translators For Face Aging |
Authors | Ari Heljakka, Arno Solin, Juho Kannala |
Abstract | This paper addresses the modeling and simulation of progressive changes over time, such as human face aging. By treating the age phases as a sequence of image domains, we construct a chain of transformers that map images from one age domain to the next. Leveraging recent adversarial image translation methods, our approach requires no training samples of the same individual at different ages. Here, the model must be flexible enough to translate a child face to a young adult, and all the way through the adulthood to old age. We find that some transformers in the chain can be recursively applied on their own output to cover multiple phases, compressing the chain. The structure of the chain also unearths information about the underlying physical process. We demonstrate the performance of our method with precise and intuitive metrics, and visually match with the face aging state-of-the-art. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05023v2 |
http://arxiv.org/pdf/1802.05023v2.pdf | |
PWC | https://paperswithcode.com/paper/recursive-chaining-of-reversible-image-to |
Repo | https://github.com/AaltoVision/img-transformer-chain |
Framework | tf |
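As a small illustration of the chaining idea from the abstract above, the hypothetical snippet below applies a list of translator functions in sequence and reuses one translator on its own output to cover several age phases. The translators themselves (adversarially trained image-to-image generators) are stand-ins and not shown.

```python
def age_through_chain(image, transformers, repeats):
    """Apply a chain of image-to-image translators phase by phase.

    transformers: list of callables, one per transition between age domains
    repeats:      how many times each translator is applied; a value > 1 means the
                  same translator is reused recursively on its own output,
                  compressing the chain as described in the abstract.
    """
    out = image
    for translate, k in zip(transformers, repeats):
        for _ in range(k):
            out = translate(out)
    return out

# e.g. one "adult aging" translator reused for three later phases (hypothetical names):
# aged = age_through_chain(face, [child_to_adult, adult_step], repeats=[1, 3])
```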
Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making
Title | Deep Transfer Learning of Pick Points on Fabric for Robot Bed-Making |
Authors | Daniel Seita, Nawid Jamali, Michael Laskey, Ajay Kumar Tanwani, Ron Berenstein, Prakash Baskaran, Soshi Iba, John Canny, Ken Goldberg |
Abstract | A fundamental challenge in manipulating fabric for clothes folding and textiles manufacturing is computing “pick points” to effectively modify the state of an uncertain manifold. We present a supervised deep transfer learning approach to locate pick points using depth images for invariance to color and texture. We consider the task of bed-making, where a robot sequentially grasps and pulls at pick points to increase blanket coverage. We perform physical experiments with two mobile manipulator robots, the Toyota HSR and the Fetch, and three blankets of different colors and textures. We compare coverage results from (1) human supervision, (2) a baseline of picking at the uppermost blanket point, and (3) learned pick points. On a quarter-scale twin bed, a model trained with combined data from the two robots achieves 92% blanket coverage compared with 83% for the baseline and 95% for human supervisors. The model transfers to two novel blankets and achieves 93% coverage. Average coverage results of 92% for 193 beds suggest that transfer-invariant robot pick points on fabric can be effectively learned. |
Tasks | Decision Making, Deformable Object Manipulation, Transfer Learning |
Published | 2018-09-26 |
URL | https://arxiv.org/abs/1809.09810v3 |
https://arxiv.org/pdf/1809.09810v3.pdf | |
PWC | https://paperswithcode.com/paper/robot-bed-making-deep-transfer-learning-using |
Repo | https://github.com/DanielTakeshi/fast_grasp_detect |
Framework | tf |
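The pick-point learning setup can be sketched compactly. The PyTorch model below is an illustrative stand-in, not the network from the fast_grasp_detect repository: it regresses a normalized (x, y) pick point from a single-channel depth image, and training would minimize the distance to human-labelled pick points pooled across both robots.

```python
import torch
import torch.nn as nn

class PickPointNet(nn.Module):
    """Minimal sketch: regress an (x, y) pick point from a depth image.
    Not the authors' architecture; layer sizes are illustrative."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)   # normalized pick point in [0, 1]^2

    def forward(self, depth):          # depth: (B, 1, H, W)
        z = self.features(depth).flatten(1)
        return torch.sigmoid(self.head(z))

# Training minimizes an L2 loss between predicted and human-labelled pick points;
# transfer across robots amounts to pooling their depth images into one training set.
model = PickPointNet()
loss = nn.MSELoss()(model(torch.rand(4, 1, 120, 160)), torch.rand(4, 2))
```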
QUOTA: The Quantile Option Architecture for Reinforcement Learning
Title | QUOTA: The Quantile Option Architecture for Reinforcement Learning |
Authors | Shangtong Zhang, Borislav Mavrin, Linglong Kong, Bo Liu, Hengshuai Yao |
Abstract | In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only the mean. QUOTA provides a new dimension for exploration via making use of both optimism and pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators. |
Tasks | Decision Making, Distributional Reinforcement Learning |
Published | 2018-11-05 |
URL | http://arxiv.org/abs/1811.02073v2 |
http://arxiv.org/pdf/1811.02073v2.pdf | |
PWC | https://paperswithcode.com/paper/quota-the-quantile-option-architecture-for |
Repo | https://github.com/pihey1995/DistributionalRL |
Framework | pytorch |
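A toy version of quantile-based action selection makes the optimism/pessimism dimension concrete. In the sketch below (the names and the fixed band partitioning are my simplification), each option acts greedily on a different band of quantile estimates; the paper learns the option-selection policy rather than fixing it.

```python
import numpy as np

def quota_act(quantiles, option, n_options=3):
    """Toy quantile-band action selection (not the authors' agent).

    quantiles: (n_actions, n_quantiles) quantile estimates of each action's
               return distribution.
    option:    which band of quantiles to act greedily on; low bands are
               pessimistic, high bands optimistic, as described in the abstract.
    """
    n_actions, n_quantiles = quantiles.shape
    band = n_quantiles // n_options
    lo, hi = option * band, (option + 1) * band
    scores = quantiles[:, lo:hi].mean(axis=1)   # value of each action under this option
    return int(np.argmax(scores))

# A higher-level policy (learned with an intra-option algorithm in the paper)
# would choose `option`; here we just compare a pessimistic and an optimistic pick.
q = np.sort(np.random.randn(4, 9), axis=1)
a_pessimistic = quota_act(q, option=0)
a_optimistic = quota_act(q, option=2)
```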
A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing Values
Title | A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing Values |
Authors | Ruifei Cui, Ioan Gabriel Bucur, Perry Groot, Tom Heskes |
Abstract | We consider the problem of learning parameters of latent variable models from mixed (continuous and ordinal) data with missing values. We propose a novel Bayesian Gaussian copula factor (BGCF) approach that is consistent under certain conditions and quite robust to violations of these conditions. In simulations, BGCF substantially outperforms two state-of-the-art alternative approaches. An illustration on the 'Holzinger & Swineford 1939' dataset indicates that BGCF is favorable over the so-called robust maximum likelihood (MLR) even if the data match the assumptions of MLR. |
Tasks | Latent Variable Models |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04610v1 |
http://arxiv.org/pdf/1806.04610v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-bayesian-approach-for-latent-variable |
Repo | https://github.com/cuiruifei/CopulaFactorModel |
Framework | none |
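One concrete building block of copula-based modeling is the rank-based transform of mixed columns to latent Gaussian scores. The sketch below shows only that step, under my own simplifications (the paper's BGCF inference is a Gibbs sampler over the copula factor model, which is not reproduced here); missing values simply stay missing.

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_scores(column):
    """Rank-based transform to latent Gaussian scores, a standard building block
    of Gaussian copula models (the paper's BGCF sampler is not shown).
    Missing values (NaN) are left missing."""
    z = np.full(column.shape, np.nan)
    obs = ~np.isnan(column)
    ranks = rankdata(column[obs])          # average ranks handle ties (ordinal data)
    u = ranks / (obs.sum() + 1.0)          # map ranks into (0, 1)
    z[obs] = norm.ppf(u)                   # latent Gaussian scores
    return z

# Correlations of the scored columns give a starting point for estimating the
# copula correlation matrix that the factor model is then fit to.
X = np.array([[1.0, 2, np.nan], [2.0, 1, 3], [3.0, 3, 1], [np.nan, 2, 2]])
Z = np.column_stack([normal_scores(X[:, j]) for j in range(X.shape[1])])
```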
Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News
Title | Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News |
Authors | Luís Borges, Bruno Martins, Pável Calado |
Abstract | Fake news is nowadays an issue of pressing concern, given its recent rise as a potential threat to high-quality journalism and well-informed public discourse. The Fake News Challenge (FNC-1) was organized in 2017 to encourage the development of machine learning-based classification systems for stance detection (i.e., for identifying whether a particular news article agrees, disagrees, discusses, or is unrelated to a particular news headline), thus helping in the detection and analysis of possible instances of fake news. This article presents a new approach to tackle this stance detection problem, based on the combination of string similarity features with a deep neural architecture that leverages ideas previously advanced in the context of learning efficient text representations, document classification, and natural language inference. Specifically, we use bi-directional Recurrent Neural Networks, together with max-pooling over the temporal/sequential dimension and neural attention, for representing (i) the headline, (ii) the first two sentences of the news article, and (iii) the entire news article. These representations are then combined/compared, complemented with similarity features inspired by other FNC-1 approaches, and passed to a final layer that predicts the stance of the article towards the headline. We also explore the use of external sources of information, specifically large datasets of sentence pairs originally proposed for training and evaluating natural language inference methods, in order to pre-train specific components of the neural network architecture (e.g., the RNNs used for encoding sentences). The obtained results attest to the effectiveness of the proposed ideas and show that our model, particularly when considering pre-training and the combination of neural representations together with similarity features, slightly outperforms the previous state-of-the-art. |
Tasks | Document Classification, Natural Language Inference, Representation Learning, Stance Detection |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00706v1 |
http://arxiv.org/pdf/1811.00706v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-similarity-features-and-deep |
Repo | https://github.com/LuisPB7/fnc-msc |
Framework | none |
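The architecture described above can be summarized in a compact sketch. The PyTorch model below is a simplification with invented dimensions: bi-directional RNN encoders with max-pooling over time, representations that are combined and compared, and hand-crafted similarity features concatenated before the four-way stance classifier. Neural attention and the NLI-based pre-training are omitted.

```python
import torch
import torch.nn as nn

class StanceClassifier(nn.Module):
    """Compact sketch of the described design (sizes are illustrative):
    BiGRU encoders with max-pooling, combined/compared representations,
    plus hand-crafted similarity features."""
    def __init__(self, vocab, emb=100, hid=128, n_sim=10, n_classes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
        self.out = nn.Sequential(
            nn.Linear(4 * 2 * hid + n_sim, 256), nn.ReLU(),
            nn.Linear(256, n_classes),   # agree / disagree / discuss / unrelated
        )

    def encode(self, tokens):            # (B, T) token ids
        h, _ = self.rnn(self.embed(tokens))
        return h.max(dim=1).values       # max-pool over the sequence

    def forward(self, headline, article, sim_feats):
        h, a = self.encode(headline), self.encode(article)
        pair = torch.cat([h, a, torch.abs(h - a), h * a, sim_feats], dim=-1)
        return self.out(pair)

model = StanceClassifier(vocab=5000)
logits = model(torch.randint(0, 5000, (2, 12)),
               torch.randint(0, 5000, (2, 60)),
               torch.rand(2, 10))
```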
WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration
Title | WSD algorithm based on a new method of vector-word contexts proximity calculation via epsilon-filtration |
Authors | Alexander Kirillov, Natalia Krizhanovsky, Andrew Krizhanovsky |
Abstract | The problem of word sense disambiguation (WSD) is considered in the article. Given a set of synonyms (synsets) and sentences containing these synonyms, it is necessary to select the meaning of the word in the sentence automatically. 1285 sentences were tagged by experts, namely, one of the dictionary meanings was selected by experts for target words. To solve the WSD-problem, an algorithm based on a new method of vector-word contexts proximity calculation is proposed. In order to achieve higher accuracy, a preliminary epsilon-filtering of words is performed, both in the sentence and in the set of synonyms. An extensive program of experiments was carried out. Four algorithms are implemented, including a new algorithm. Experiments have shown that in a number of cases the new algorithm shows better results. The developed software and the tagged corpus have an open license and are available online. Wiktionary and Wikisource are used. A brief description of this work can be viewed in slides (https://goo.gl/9ak6Gt). Video lecture in Russian on this research is available online (https://youtu.be/-DLmRkepf58). |
Tasks | Word Sense Disambiguation |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09559v2 |
http://arxiv.org/pdf/1805.09559v2.pdf | |
PWC | https://paperswithcode.com/paper/wsd-algorithm-based-on-a-new-method-of-vector |
Repo | https://github.com/componavt/wcorpus |
Framework | none |
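A rough sketch of the matching-with-filtering idea follows. It is a simplification with my own function names, not the exact proximity calculation from the paper: context words weakly related to a candidate synset (cosine similarity below epsilon) are dropped before the averaged context vector is compared with the synset centroid.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def disambiguate(context_vecs, synsets_vecs, eps=0.2):
    """Rough sketch of epsilon-filtered context/synset matching for WSD
    (a simplification, not the authors' exact proximity measure).

    context_vecs: word vectors from the sentence (target word removed)
    synsets_vecs: one list of word vectors per candidate synset
    """
    best, best_score = 0, -np.inf
    for i, syn in enumerate(synsets_vecs):
        centroid = np.mean(syn, axis=0)
        # epsilon-filtering: drop context words weakly related to this synset
        kept = [c for c in context_vecs if cosine(c, centroid) >= eps]
        if not kept:
            continue
        score = cosine(np.mean(kept, axis=0), centroid)
        if score > best_score:
            best, best_score = i, score
    return best  # index of the selected meaning
```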
Norm-Ranging LSH for Maximum Inner Product Search
Title | Norm-Ranging LSH for Maximum Inner Product Search |
Authors | Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng |
Abstract | Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art hashing method for maximum inner product search (MIPS) with performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, a novel similarity metric is formulated. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH for the same recall, thus significantly benefiting applications that involve MIPS. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08782v2 |
http://arxiv.org/pdf/1809.08782v2.pdf | |
PWC | https://paperswithcode.com/paper/norm-ranging-lsh-for-maximum-inner-product |
Repo | https://github.com/xinyandai/similarity-search |
Framework | none |
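The partitioning idea is easy to sketch. Below, a dataset is split by 2-norm into sub-datasets and Simple-LSH (scale by the local maximum norm, append the residual dimension, hash with sign random projections) is built per partition. This is an illustration under my own parameter choices; the paper's rank-based similarity metric for merging candidates across partitions is not shown.

```python
import numpy as np

def simple_lsh_index(X, n_bits, rng):
    """Simple-LSH on one sub-dataset: scale by the local max norm, append the
    residual dimension so every point has unit norm, then hash with sign
    random projections."""
    m = np.linalg.norm(X, axis=1).max()
    Xs = X / m
    aug = np.sqrt(np.maximum(0.0, 1.0 - np.sum(Xs ** 2, axis=1, keepdims=True)))
    P = rng.standard_normal((X.shape[1] + 1, n_bits))
    return (np.hstack([Xs, aug]) @ P > 0), P, m

def norm_ranging_index(X, n_parts=4, n_bits=16, seed=0):
    """Hedged sketch of norm-ranging LSH: split the dataset by 2-norm so each
    sub-dataset is normalized by its own, much tighter, maximum norm."""
    rng = np.random.default_rng(seed)
    order = np.argsort(np.linalg.norm(X, axis=1))
    parts = np.array_split(order, n_parts)
    return [(idx, simple_lsh_index(X[idx], n_bits, rng)) for idx in parts]

X = np.random.randn(1000, 32) * np.random.rand(1000, 1) * 5  # long-tailed norms
index = norm_ranging_index(X)
# Querying hashes the (zero-augmented) query with each sub-index's projections
# and merges candidates across sub-datasets, e.g. with the paper's metric.
```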
Learning Symmetry Consistent Deep CNNs for Face Completion
Title | Learning Symmetry Consistent Deep CNNs for Face Completion |
Authors | Xiaoming Li, Ming Liu, Jieru Zhu, Wangmeng Zuo, Meng Wang, Guosheng Hu, Lei Zhang |
Abstract | Deep convolutional networks (CNNs) have achieved great success in face completion to generate plausible facial structures. These methods, however, are limited in maintaining global consistency among face components and recovering fine facial details. On the other hand, reflectional symmetry is a prominent property of face images and benefits face recognition and consistency modeling, yet it remains uninvestigated in deep face completion. In this work, we leverage two kinds of symmetry-enforcing subnets to form a symmetry-consistent CNN model (i.e., SymmFCNet) for effective face completion. For missing pixels on only one of the half-faces, an illumination-reweighted warping subnet is developed to guide the warping and illumination reweighting of the other half-face. For missing pixels on both half-faces, we present a generative reconstruction subnet together with a perceptual symmetry loss to enforce symmetry consistency of recovered structures. The SymmFCNet is constructed by stacking the generative reconstruction subnet upon the illumination-reweighted warping subnet, and can be learned end-to-end from a training set of unaligned face images. Experiments show that SymmFCNet can generate high quality results on images with synthetic and real occlusion, and performs favorably against the state of the art. |
Tasks | Face Recognition, Facial Inpainting |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07741v1 |
http://arxiv.org/pdf/1812.07741v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-symmetry-consistent-deep-cnns-for |
Repo | https://github.com/csxmli2016/SymmFCNet |
Framework | pytorch |
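The symmetry-consistency idea can be illustrated with a much simpler penalty than the paper's. The sketch below compares an inpainted face with its horizontal flip, optionally in a feature space; the actual SymmFCNet establishes correspondence through illumination-reweighted warping rather than a naive flip, so treat this only as an illustration of the constraint.

```python
import torch
import torch.nn.functional as F

def symmetry_consistency_loss(completed, mask, feat_extractor=None):
    """Illustrative flip-symmetry penalty (my simplification of the paper's
    perceptual symmetry loss): completed pixels should agree with the
    horizontally flipped face, optionally compared in a feature space.

    completed: (B, 3, H, W) inpainted, roughly frontal and aligned faces
    mask:      (B, 1, H, W) with 1 where pixels were missing
    """
    flipped = torch.flip(completed, dims=[3])
    if feat_extractor is not None:            # e.g. a pretrained VGG slice
        return F.l1_loss(feat_extractor(completed), feat_extractor(flipped))
    return (mask * (completed - flipped).abs()).sum() / (mask.sum() + 1e-8)
```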
Anticipation in Human-Robot Cooperation: A Recurrent Neural Network Approach for Multiple Action Sequences Prediction
Title | Anticipation in Human-Robot Cooperation: A Recurrent Neural Network Approach for Multiple Action Sequences Prediction |
Authors | Paul Schydlo, Mirko Rakovic, Lorenzo Jamone, José Santos-Victor |
Abstract | Close human-robot cooperation is a key enabler for new developments in advanced manufacturing and assistive applications. Close cooperation requires robots that can predict human actions and intent, and understand human non-verbal cues. Recent approaches based on neural networks have led to encouraging results in the human action prediction problem, both in continuous and discrete spaces. Our approach extends the research in this direction. Our contributions are three-fold. First, we validate the use of gaze and body pose cues as a means of predicting human action through a feature selection method. Next, we address two shortcomings of the existing literature: predicting multiple and variable-length action sequences. This is achieved by introducing an encoder-decoder recurrent neural network topology in the discrete action prediction problem. In addition, we theoretically demonstrate the importance of predicting multiple action sequences as a means of estimating the stochastic reward in a human-robot cooperation scenario. Finally, we show the ability to effectively train the prediction model on an action prediction dataset, involving human motion data, and explore the influence of the model’s parameters on its performance. Source code repository: https://github.com/pschydlo/ActionAnticipation |
Tasks | Feature Selection |
Published | 2018-02-28 |
URL | http://arxiv.org/abs/1802.10503v3 |
http://arxiv.org/pdf/1802.10503v3.pdf | |
PWC | https://paperswithcode.com/paper/anticipation-in-human-robot-cooperation-a |
Repo | https://github.com/pschydlo/ActionAnticipation |
Framework | tf |
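The encoder-decoder formulation for discrete action prediction is summarized in the minimal PyTorch sketch below. Dimensions and the feature encoding are my assumptions, not the published topology: the encoder summarizes gaze/pose observations, and the decoder unrolls a variable-length sequence of future action logits.

```python
import torch
import torch.nn as nn

class ActionSeq2Seq(nn.Module):
    """Minimal encoder-decoder sketch for discrete action-sequence prediction
    (gaze/pose features in, variable-length action labels out)."""
    def __init__(self, feat_dim=20, hid=64, n_actions=10):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hid, batch_first=True)
        self.decoder = nn.GRU(n_actions, hid, batch_first=True)
        self.out = nn.Linear(hid, n_actions)

    def forward(self, observations, prev_actions_onehot):
        _, h = self.encoder(observations)              # summarize observed cues
        dec, _ = self.decoder(prev_actions_onehot, h)  # unroll future actions
        return self.out(dec)                           # logits per future step

model = ActionSeq2Seq()
logits = model(torch.rand(2, 15, 20), torch.rand(2, 5, 10))
# Sampling several decoded sequences yields the multiple action sequences the
# abstract uses to estimate the stochastic reward of a cooperation plan.
```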
Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides
Title | Attention-Based Deep Neural Networks for Detection of Cancerous and Precancerous Esophagus Tissue on Histopathological Slides |
Authors | Naofumi Tomita, Behnaz Abdollahi, Jason Wei, Bing Ren, Arief Suriawinata, Saeed Hassanpour |
Abstract | Deep learning-based methods, such as the sliding window approach for cropped-image classification and heuristic aggregation for whole-slide inference, for analyzing histological patterns in high-resolution microscopy images have shown promising results. These approaches, however, require a laborious annotation process and are fragmented. This diagnostic study collected deidentified high-resolution histological images (N = 379) for training a new model composed of a convolutional neural network and a grid-based attention network, trainable without region-of-interest annotations. Histological images of patients who underwent endoscopic esophagus and gastroesophageal junction mucosal biopsy between January 1, 2016, and December 31, 2018, at Dartmouth-Hitchcock Medical Center (Lebanon, New Hampshire) were collected. The method achieved a mean accuracy of 0.83 in classifying 123 test images. These results were comparable with or better than the performance from the current state-of-the-art sliding window approach, which was trained with regions of interest. Results of this study suggest that the proposed attention-based deep neural network framework for Barrett esophagus and esophageal adenocarcinoma detection is important because it is based solely on tissue-level annotations, unlike existing methods that are based on regions of interest. This new model is expected to open avenues for applying deep learning to digital pathology. |
Tasks | Crop Classification, Image Classification, Medical Object Detection |
Published | 2018-11-20 |
URL | https://arxiv.org/abs/1811.08513v2 |
https://arxiv.org/pdf/1811.08513v2.pdf | |
PWC | https://paperswithcode.com/paper/finding-a-needle-in-the-haystack-attention |
Repo | https://github.com/BMIRDS/deepslide |
Framework | pytorch |
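The grid-based attention aggregation can be illustrated with a generic attention-pooling module. The sketch below (sizes are illustrative, not the published architecture) weights tile-level CNN features with learned attention and classifies the pooled slide representation, which is what allows training from tissue-level labels without region-of-interest annotations.

```python
import torch
import torch.nn as nn

class GridAttentionAggregator(nn.Module):
    """Sketch of attention pooling over tile features for slide-level
    classification with only tissue-level labels (an illustration of the idea,
    not the published architecture)."""
    def __init__(self, feat_dim=512, n_classes=4):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, 128), nn.Tanh(),
                                  nn.Linear(128, 1))
        self.cls = nn.Linear(feat_dim, n_classes)

    def forward(self, tile_feats):                      # (B, n_tiles, feat_dim) from a CNN
        w = torch.softmax(self.attn(tile_feats), dim=1) # per-tile attention weight
        slide = (w * tile_feats).sum(dim=1)             # attention-weighted pooling
        return self.cls(slide), w.squeeze(-1)

agg = GridAttentionAggregator()
logits, weights = agg(torch.rand(2, 36, 512))           # e.g. a 6x6 grid of tiles
```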
Persistence Bag-of-Words for Topological Data Analysis
Title | Persistence Bag-of-Words for Topological Data Analysis |
Authors | Bartosz Zieliński, Michał Lipiński, Mateusz Juda, Matthias Zeppelzauer, Paweł Dłotko |
Abstract | Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs exhibit, however, complex structure and are difficult to integrate in today’s machine learning workflows. This paper introduces persistence bag-of-words: a novel and stable vectorized representation of PDs that enables the seamless integration with machine learning. Comprehensive experiments show that the new representation achieves state-of-the-art performance and beyond in much less time than alternative approaches. |
Tasks | Topological Data Analysis |
Published | 2018-12-21 |
URL | https://arxiv.org/abs/1812.09245v3 |
https://arxiv.org/pdf/1812.09245v3.pdf | |
PWC | https://paperswithcode.com/paper/persistence-bag-of-words-for-topological-data |
Repo | https://github.com/bziiuj/pcodebooks |
Framework | none |
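The representation itself is straightforward to sketch. The toy version below maps each diagram point to (birth, persistence) coordinates, quantizes the points against a k-means codebook, and counts them into a fixed-length histogram; the stable, persistence-weighted variants discussed in the paper are omitted.

```python
import numpy as np
from sklearn.cluster import KMeans

def persistence_bag_of_words(diagrams, n_words=32, seed=0):
    """Toy persistence bag-of-words (illustrative only). Each diagram is an
    (n, 2) array of (birth, death) pairs; points are quantized into a codebook
    and counted into a histogram, one fixed-length vector per diagram."""
    to_bp = lambda d: np.column_stack([d[:, 0], d[:, 1] - d[:, 0]])
    all_pts = np.vstack([to_bp(d) for d in diagrams])
    codebook = KMeans(n_clusters=n_words, n_init=10, random_state=seed).fit(all_pts)
    feats = []
    for d in diagrams:
        words = codebook.predict(to_bp(d))
        feats.append(np.bincount(words, minlength=n_words).astype(float))
    return np.array(feats)   # ready for any standard ML pipeline

dgms = [np.sort(np.random.rand(50, 2), axis=1) for _ in range(5)]
X = persistence_bag_of_words(dgms)
```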
Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding
Title | Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding |
Authors | Deunsol Yoon, Dongbok Lee, SangKeun Lee |
Abstract | In this paper, we propose Dynamic Self-Attention (DSA), a new self-attention mechanism for sentence embedding. We design DSA by modifying dynamic routing in capsule networks (Sabour et al., 2017) for natural language processing. DSA attends to informative words with a dynamic weight vector. We achieve new state-of-the-art results among sentence encoding methods on the Stanford Natural Language Inference (SNLI) dataset with the least number of parameters, while showing comparative results on the Stanford Sentiment Treebank (SST) dataset. |
Tasks | Natural Language Inference, Sentence Embedding |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07383v1 |
http://arxiv.org/pdf/1808.07383v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-self-attention-computing-attention |
Repo | https://github.com/dsindex/iclassifier |
Framework | pytorch |
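The routing-based attention can be caricatured in a few lines. The NumPy sketch below is my simplification (the paper's learned transformations and capsule-style squashing are omitted): attention logits are updated iteratively by the agreement between each word vector and the current sentence vector.

```python
import numpy as np

def dynamic_self_attention(word_vecs, n_iter=3):
    """Rough sketch of attention computed by dynamic routing, in the spirit of
    DSA (learned transformations from the paper are omitted).

    word_vecs: (T, d) contextualized word vectors
    Returns a sentence vector and the per-word attention weights.
    """
    T, d = word_vecs.shape
    logits = np.zeros(T)                               # routing logits, updated iteratively
    for _ in range(n_iter):
        attn = np.exp(logits) / np.exp(logits).sum()   # softmax over words
        s = attn @ word_vecs                           # candidate sentence vector
        s = s / (np.linalg.norm(s) + 1e-12)            # squash-like normalization
        logits = logits + word_vecs @ s                # agreement updates the logits
    return s, attn
```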
Practical methods for graph two-sample testing
Title | Practical methods for graph two-sample testing |
Authors | Debarghya Ghoshdastidar, Ulrike von Luxburg |
Abstract | Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies in statistics and learning theory have provided some theoretical insights about such high-dimensional graph testing problems, but the practicality of the developed theoretical methods remains an open question. In this paper, we consider the problem of two-sample testing of large graphs. We demonstrate the practical merits and limitations of existing theoretical tests and their bootstrapped variants. We also propose two new tests based on asymptotic distributions. We show that these tests are computationally less expensive and, in some cases, more reliable than the existing methods. |
Tasks | |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12752v1 |
http://arxiv.org/pdf/1811.12752v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-methods-for-graph-two-sample |
Repo | https://github.com/gdebarghya/Network-TwoSampleTesting |
Framework | none |
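For orientation only, the snippet below shows a naive baseline two-sample test on graph populations, comparing edge densities with a Welch t-test. This is explicitly not one of the paper's proposed tests, which exploit the full adjacency structure and asymptotic distributions that remain valid with very few graphs per sample.

```python
import numpy as np
from scipy import stats

def random_graph(n, p, rng):
    """Symmetric Erdos-Renyi adjacency matrix, for the toy example below."""
    upper = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return upper + upper.T

def edge_density_test(graphs_a, graphs_b):
    """Naive baseline two-sample test (not one of the paper's tests): compare
    mean edge densities of the two graph samples with a Welch t-test."""
    dens = lambda A: A[np.triu_indices_from(A, k=1)].mean()
    da = np.array([dens(A) for A in graphs_a])
    db = np.array([dens(B) for B in graphs_b])
    return stats.ttest_ind(da, db, equal_var=False)

rng = np.random.default_rng(0)
sample_a = [random_graph(50, 0.10, rng) for _ in range(5)]
sample_b = [random_graph(50, 0.15, rng) for _ in range(5)]
print(edge_density_test(sample_a, sample_b))
```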
AXNet: ApproXimate computing using an end-to-end trainable neural network
Title | AXNet: ApproXimate computing using an end-to-end trainable neural network |
Authors | Zhenghao Peng, Xuyang Chen, Chengwen Xu, Naifeng Jing, Xiaoyao Liang, Cewu Lu, Li Jiang |
Abstract | Neural network based approximate computing is a universal architecture promising to gain tremendous energy-efficiency for many error resilient applications. To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor. The approximator provides the approximate results, while the predictor predicts whether the input data is safe to approximate with the given quality requirement. However, it is non-trivial and time-consuming to make these two neural networks coordinate, since they have different optimization objectives, by training them separately. This paper proposes a novel neural network structure, AXNet, that fuses the two NNs into a holistic, end-to-end trainable NN. Leveraging the philosophy of multi-task learning, AXNet can tremendously improve the invocation (the proportion of safe-to-approximate samples) and reduce the approximation error. The training effort also decreases significantly. Experiment results show 50.7% more invocation and substantial cuts of training time when compared to existing neural network based approximate computing frameworks. |
Tasks | Multi-Task Learning |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10458v2 |
http://arxiv.org/pdf/1807.10458v2.pdf | |
PWC | https://paperswithcode.com/paper/axnet-approximate-computing-using-an-end-to |
Repo | https://github.com/ACA-Lab-SJTU/approximate-computing |
Framework | tf |
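The fused two-head design is easy to sketch in multi-task form. The PyTorch snippet below shares one trunk between an approximator head and a safe-to-approximate predictor head and trains both jointly; layer sizes, the loss weighting, and the quality bound are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class AXNetSketch(nn.Module):
    """Hedged sketch of the fused approximator/predictor idea: one trunk with
    two heads trained jointly in a multi-task fashion."""
    def __init__(self, in_dim=8, out_dim=1, hid=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU(),
                                   nn.Linear(hid, hid), nn.ReLU())
        self.approximator = nn.Linear(hid, out_dim)   # approximate function output
        self.predictor = nn.Linear(hid, 1)            # safe-to-approximate logit

    def forward(self, x):
        z = self.trunk(x)
        return self.approximator(z), self.predictor(z)

def axnet_loss(approx, safe_logit, target, quality_bound=0.1):
    err = (approx - target).abs()
    safe_label = (err < quality_bound).float().detach()  # within the quality requirement?
    return err.mean() + nn.functional.binary_cross_entropy_with_logits(safe_logit, safe_label)

net = AXNetSketch()
x, y = torch.rand(32, 8), torch.rand(32, 1)
a, s = net(x)
loss = axnet_loss(a, s, y)
```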