February 1, 2020

Paper Group AWR 352

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes. Deep learning-based electroencephalography analysis: a systematic review. A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos. The Hitchhiker’s Guide to LDA. Anchor Diffusion for Unsupervised Video Object Segmentation. Understanding the Impact of Label Granularity on CNN-based Image Classification. Improving the robustness of ImageNet classifiers using elements of human visual cognition. Fact Discovery from Knowledge Base via Facet Decomposition. Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned. Cross-Domain Generalization of Neural Constituency Parsers. SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version). A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts. PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images. Explicit Sentence Compression for Neural Machine Translation. Diversity Transfer Network for Few-Shot Learning.

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

Title Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Authors Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li
Abstract Unsupervised deep learning for optical flow computation has achieved promising results. Most existing deep-net based methods rely on image brightness consistency and local smoothness constraints to train the networks. Their performance degrades in regions with repetitive textures or occlusions. In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning. In particular, we investigate multiple ways of enforcing the epipolar constraint in flow estimation. To alleviate a “chicken-and-egg” type of problem encountered in dynamic scenes, where multiple motions may be present, we propose a low-rank constraint as well as a union-of-subspaces constraint for training. Experimental results on various benchmark datasets show that our method achieves competitive performance compared with supervised methods and outperforms state-of-the-art unsupervised deep-learning methods.
Tasks Optical Flow Estimation
Published 2019-04-08
URL http://arxiv.org/abs/1904.03848v1
PDF http://arxiv.org/pdf/1904.03848v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-deep-epipolar-flow-for
Repo https://github.com/yiranzhong/EPIflow.git
Framework none
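
To make the epipolar constraint above concrete: for flow induced purely by camera motion in a static scene, every pixel x and its flow-displaced match x' should satisfy x'^T F x = 0, where F is the fundamental matrix between the two frames. The snippet below is a minimal numpy sketch of that constraint as a training penalty, not the authors’ exact loss; all names are ours.

```python
import numpy as np

def epipolar_flow_loss(pts, flow, F):
    """Algebraic epipolar error of a flow field.

    pts:  (N, 2) pixel coordinates in the first frame
    flow: (N, 2) estimated flow vectors
    F:    (3, 3) fundamental matrix between the two frames
    An epipolar-consistent flow satisfies x2^T F x1 = 0 per pixel.
    """
    ones = np.ones((pts.shape[0], 1))
    x1 = np.hstack([pts, ones])          # homogeneous coords, frame 1
    x2 = np.hstack([pts + flow, ones])   # flow-displaced coords, frame 2
    residual = np.einsum('ni,ij,nj->n', x2, F, x1)
    return np.mean(residual ** 2)
```

In a dynamic scene this residual cannot be driven to zero by a single F, which is exactly the “chicken-and-egg” issue the low-rank and union-of-subspaces constraints are meant to handle.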

Deep learning-based electroencephalography analysis: a systematic review

Title Deep learning-based electroencephalography analysis: a systematic review
Authors Yannick Roy, Hubert Banville, Isabela Albuquerque, Alexandre Gramfort, Tiago H. Falk, Jocelyn Faubert
Abstract Electroencephalography (EEG) is a complex signal and can require several years of training to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. In this work, we review 156 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches in order to inform future research and formulate recommendations. Various data items were extracted for each study pertaining to 1) the data, 2) the preprocessing methodology, 3) the DL design choices, 4) the results, and 5) the reproducibility of the experiments. Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours. As for the model, 40% of the studies used convolutional neural networks (CNNs), while 14% used recurrent neural networks (RNNs), most often with a total of 3 to 10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was 5.4% across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. To help the field progress, we provide a list of recommendations for future studies and we make our summary table of DL and EEG papers available and invite the community to contribute.
Tasks Brain Decoding, EEG, Time Series
Published 2019-01-16
URL http://arxiv.org/abs/1901.05498v2
PDF http://arxiv.org/pdf/1901.05498v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-based-electroencephalography
Repo https://github.com/hubertjb/dl-eeg-review
Framework none
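
For context on the dominant architecture family the review identifies, here is a deliberately tiny PyTorch CNN over raw EEG time series, in the 3-to-10-layer range the review reports as most common. The channel count, layer sizes, and class count are illustrative placeholders, not values from any reviewed study.

```python
import torch.nn as nn

class TinyEEGNet(nn.Module):
    """A small CNN over raw EEG of shape (batch, channels, time)."""
    def __init__(self, n_channels=22, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).squeeze(-1))
```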

A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos

Title A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos
Authors Tal Hakim, Ilan Shimshoni
Abstract Assessment of motion quality has recently gained high demand in a variety of domains. The ability to automatically assess subject motion in videos captured by cheap devices, such as Kinect cameras, is essential for monitoring clinical rehabilitation processes, improving motor skills, and motion learning tasks. The need to pay attention to low-level details while accurately tracking the motion stages makes this task very challenging. In this work, we introduce A-MAL, an automatic motion assessment learning algorithm that learns solely from properly-performed motion videos without further annotations. It is powered by a deviation time-segmentation algorithm, a parameter relevance detection algorithm, a novel time-warping algorithm based on automatic detection of common temporal points-of-interest, and a textual-feedback generation mechanism. We demonstrate our method on motions from the Fugl-Meyer Assessment (FMA) test, which is typically administered by occupational therapists to monitor patients’ recovery after strokes.
Tasks
Published 2019-07-23
URL https://arxiv.org/abs/1907.10004v3
PDF https://arxiv.org/pdf/1907.10004v3.pdf
PWC https://paperswithcode.com/paper/a-mal-automatic-motion-assessment-learning
Repo https://github.com/skvp-owner/a-mal
Framework none
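
A-MAL’s time warping is driven by automatically detected temporal points-of-interest; as a simpler stand-in for the alignment step, here is classic dynamic time warping between two skeleton-feature sequences. This illustrates the general idea of aligning motions of different speeds, not the paper’s algorithm.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping between two pose-feature sequences.

    seq_a: (Ta, D), seq_b: (Tb, D), e.g. flattened 3D joint positions
    per frame. Returns the accumulated alignment cost.
    """
    ta, tb = len(seq_a), len(seq_b)
    cost = np.full((ta + 1, tb + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, ta + 1):
        for j in range(1, tb + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[ta, tb]
```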

The Hitchhiker’s Guide to LDA

Title The Hitchhiker’s Guide to LDA
Authors Chen Ma
Abstract The Latent Dirichlet Allocation (LDA) model is a famous model in the topic modeling field; it has been studied for years due to its extensive application value in industry and academia. However, the mathematical derivation of the LDA model is challenging, which makes it difficult for beginners to learn. To help beginners learn LDA, this book analyzes the mathematical derivation of LDA in detail and introduces all the background knowledge needed to make it easy to understand. The book also contains the author’s unique insights. It should be noted that this book is written in Chinese.
Tasks
Published 2019-08-07
URL https://arxiv.org/abs/1908.03142v2
PDF https://arxiv.org/pdf/1908.03142v2.pdf
PWC https://paperswithcode.com/paper/the-hitchhikers-guide-to-lda
Repo https://github.com/MachineIntellect/GibbsLDA_plus
Framework none
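
Since the companion repo is a Gibbs-sampling LDA implementation, here is the standard collapsed Gibbs update that the book’s derivation builds toward, as a minimal sketch with symmetric priors alpha and beta.

```python
import numpy as np

def gibbs_sweep(docs, z, n_dk, n_kw, n_k, alpha, beta, K, V):
    """One sweep of collapsed Gibbs sampling for LDA.

    docs: list of word-id lists; z: matching list of topic assignments.
    n_dk: (D, K) doc-topic counts, n_kw: (K, V) topic-word counts,
    n_k: (K,) topic totals. Counts are updated in place.
    """
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # remove the current assignment from the counts
            n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
            # full conditional p(z = k | rest)
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            k = np.random.choice(K, p=p / p.sum())
            # record the new assignment
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
            z[d][i] = k
```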

Anchor Diffusion for Unsupervised Video Object Segmentation

Title Anchor Diffusion for Unsupervised Video Object Segmentation
Authors Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H. S. Torr
Abstract Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow. Despite their complexity, these kinds of approaches tend to favour short-term temporal dependencies and are thus prone to accumulating inaccuracies, which cause drift over time. Moreover, simple (static) image segmentation models, alone, can perform competitively against these methods, which further suggests that the way temporal dependencies are modelled should be reconsidered. Motivated by these observations, in this paper we explore simple yet effective strategies to model long-term temporal dependencies. Inspired by the non-local operators of [70], we introduce a technique to establish dense correspondences between pixel embeddings of a reference “anchor” frame and the current one. This allows the learning of pairwise dependencies at arbitrarily long distances without conditioning on intermediate frames. Without online supervision, our approach can suppress the background and precisely segment the foreground object even in challenging scenarios, while maintaining consistent performance over time. With a mean IoU of 81.7%, our method ranks first on the DAVIS-2016 leaderboard of unsupervised methods, while still being competitive against state-of-the-art online semi-supervised approaches. We further evaluate our method on the FBMS dataset and the ViSal video saliency dataset, showing results competitive with the state of the art.
Tasks Optical Flow Estimation, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2019-10-24
URL https://arxiv.org/abs/1910.10895v1
PDF https://arxiv.org/pdf/1910.10895v1.pdf
PWC https://paperswithcode.com/paper/anchor-diffusion-for-unsupervised-video-1
Repo https://github.com/yz93/anchor-diff-VOS
Framework pytorch
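
The key operation is a dense non-local affinity between pixel embeddings of the anchor frame and the current frame, used to aggregate anchor features for every current pixel. A minimal PyTorch sketch follows; the shapes and softmax normalization are our assumptions, not the paper’s exact formulation.

```python
import torch
import torch.nn.functional as F

def anchor_propagate(anchor_emb, cur_emb):
    """Aggregate anchor-frame features for each current-frame pixel.

    anchor_emb, cur_emb: (C, H, W) pixel embeddings of the anchor and
    the current frame. Every current pixel attends to every anchor
    pixel, so no intermediate frames are needed.
    """
    C, H, W = cur_emb.shape
    a = anchor_emb.reshape(C, H * W)           # (C, N) anchor pixels
    c = cur_emb.reshape(C, H * W)              # (C, N) current pixels
    affinity = F.softmax(c.t() @ a, dim=-1)    # (N, N): current -> anchor
    out = a @ affinity.t()                     # weighted anchor features
    return out.reshape(C, H, W)
```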

Understanding the Impact of Label Granularity on CNN-based Image Classification

Title Understanding the Impact of Label Granularity on CNN-based Image Classification
Authors Zhuo Chen, Ruizhou Ding, Ting-Wu Chin, Diana Marculescu
Abstract In recent years, supervised learning using Convolutional Neural Networks (CNNs) has achieved great success in image classification tasks, and large scale labeled datasets have contributed significantly to this achievement. However, the definition of a label is often application dependent. For example, an image of a cat can be labeled as “cat” or perhaps more specifically “Persian cat.” We refer to this as label granularity. In this paper, we conduct extensive experiments using various datasets to demonstrate and analyze how and why training based on fine-grain labeling, such as “Persian cat,” can improve CNN accuracy on classifying coarse-grain classes, in this case “cat.” The experimental results show that training CNNs with fine-grain labels improves both the network’s optimization and generalization capabilities, as intuitively it encourages the network to learn more features, and hence increases classification accuracy on coarse-grain classes on all datasets considered. Moreover, fine-grain labels enhance data efficiency in CNN training. For example, a CNN trained with fine-grain labels and only 40% of the total training data can achieve higher accuracy than a CNN trained with the full training dataset and coarse-grain labels. These results point to two possible applications of this work: (i) with sufficient human resources, one can improve CNN performance by re-labeling the dataset with fine-grain labels, and (ii) with limited human resources, to improve CNN performance, rather than collecting more training data, one may instead use fine-grain labels for the dataset. We further propose a metric called Average Confusion Ratio to characterize the effectiveness of fine-grain labeling, and show its use through extensive experimentation. Code is available at https://github.com/cmu-enyac/Label-Granularity.
Tasks Image Classification
Published 2019-01-21
URL http://arxiv.org/abs/1901.07012v1
PDF http://arxiv.org/pdf/1901.07012v1.pdf
PWC https://paperswithcode.com/paper/understanding-the-impact-of-label-granularity
Repo https://github.com/cmu-enyac/Label-Granularity
Framework pytorch
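
The evaluation protocol implied by the abstract, training on fine-grain labels and scoring on coarse-grain classes, reduces to mapping each predicted fine class to its coarse parent. A small numpy sketch; the fine-to-coarse mapping itself must come from the dataset.

```python
import numpy as np

def coarse_accuracy(fine_logits, coarse_labels, fine_to_coarse):
    """Evaluate a fine-grain-trained classifier on coarse classes.

    fine_logits:    (N, n_fine) network outputs over fine labels.
    coarse_labels:  (N,) ground-truth coarse class ids.
    fine_to_coarse: (n_fine,) array mapping each fine class to its
                    coarse parent (e.g. 'Persian cat' -> 'cat').
    """
    fine_pred = fine_logits.argmax(axis=1)
    coarse_pred = fine_to_coarse[fine_pred]
    return (coarse_pred == coarse_labels).mean()
```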

Improving the robustness of ImageNet classifiers using elements of human visual cognition

Title Improving the robustness of ImageNet classifiers using elements of human visual cognition
Authors A. Emin Orhan, Brenden M. Lake
Abstract We investigate the robustness properties of image recognition models equipped with two features inspired by human vision, an explicit episodic memory and a shape bias, at the ImageNet scale. As reported in previous work, we show that an explicit episodic memory improves the robustness of image recognition models against small-norm adversarial perturbations under some threat models. It does not, however, improve the robustness against more natural, and typically larger, perturbations. Learning more robust features during training appears to be necessary for robustness in this second sense. We show that features derived from a model that was encouraged to learn global, shape-based representations (Geirhos et al., 2019) not only improve robustness against natural perturbations, but, when used in conjunction with an episodic memory, also provide additional robustness against adversarial perturbations. Finally, we address three important design choices for the episodic memory: memory size, dimensionality of the memories and the retrieval method. We show that to make the episodic memory more compact, it is preferable to reduce the number of memories by clustering them, instead of reducing their dimensionality.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08416v2
PDF https://arxiv.org/pdf/1906.08416v2.pdf
PWC https://paperswithcode.com/paper/improving-the-robustness-of-imagenet
Repo https://github.com/eminorhan/robust-vision
Framework pytorch
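
The episodic-memory component amounts to a cache of training embeddings queried at test time. Below is a minimal nearest-neighbour sketch; cosine retrieval and uniform voting are illustrative choices rather than the paper’s exact retrieval method, which the authors treat as a design variable alongside memory size and dimensionality.

```python
import numpy as np

def episodic_predict(query, mem_keys, mem_labels, n_classes, k=50):
    """Classify by retrieving the k nearest stored memories.

    query:      (D,) embedding of the test image.
    mem_keys:   (M, D) cached training embeddings (the episodic memory).
    mem_labels: (M,) their class labels.
    """
    keys = mem_keys / np.linalg.norm(mem_keys, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    idx = np.argsort(-(keys @ q))[:k]   # top-k by cosine similarity
    votes = np.bincount(mem_labels[idx], minlength=n_classes)
    return votes.argmax()
```

Compacting the memory by clustering, as the paper recommends, would simply replace mem_keys with cluster centroids and mem_labels with the clusters’ majority labels.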

Fact Discovery from Knowledge Base via Facet Decomposition

Title Fact Discovery from Knowledge Base via Facet Decomposition
Authors Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam
Abstract During the past few decades, knowledge bases (KBs) have experienced rapid growth. Nevertheless, most KBs still suffer from serious incompleteness. Researchers have proposed many tasks, such as knowledge base completion and relation prediction, to help build the representation of KBs. However, several issues remain unsettled when it comes to enriching KBs. Knowledge base completion and relation prediction assume that we know two elements of a fact triple and aim to predict the missing one. This assumption is too restrictive in practice and prevents these methods from discovering new facts directly. To address this issue, we propose a new task, namely, fact discovery from knowledge base. This task only requires that we know the head entity, and the goal is to discover facts associated with it. To tackle this new problem, we propose a novel framework that decomposes the discovery problem into several facet discovery components. We also propose a novel auto-encoder based facet component to estimate some facets of the fact. Besides, we propose a feedback learning component to share information between the facets. We evaluate our framework using a benchmark dataset, and the experimental results show that it achieves promising results. We also conduct extensive analysis of our framework in discovering different kinds of facts. The source code of this paper can be obtained from https://github.com/thunlp/FFD.
Tasks Knowledge Base Completion
Published 2019-04-21
URL http://arxiv.org/abs/1904.09540v1
PDF http://arxiv.org/pdf/1904.09540v1.pdf
PWC https://paperswithcode.com/paper/fact-discovery-from-knowledge-base-via-facet
Repo https://github.com/thunlp/FFD
Framework none
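
The facet-decomposition idea can be caricatured as scoring each candidate (relation, tail) pair for a given head by combining independent facet scores. In the sketch below, rel_score and tail_score are hypothetical stand-ins for the paper’s facet components (e.g. the auto-encoder based one), and combining them by product is our choice, not the paper’s.

```python
def discover_facts(head, relations, entities, rel_score, tail_score, top_n=10):
    """Rank candidate (relation, tail) facts for a known head entity.

    rel_score(h, r) and tail_score(h, t) are hypothetical per-facet
    scoring functions; multiplying them is one illustrative way to
    share evidence across facets.
    """
    scored = [(rel_score(head, r) * tail_score(head, t), r, t)
              for r in relations for t in entities]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:top_n]
```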

Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned

Title Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned
Authors Max Berrendorf, Evgeniy Faerman, Valentyn Melnychuk, Volker Tresp, Thomas Seidl
Abstract In this work, we focus on the problem of entity alignment in Knowledge Graphs (KG) and we report on our experiences when applying a Graph Convolutional Network (GCN) based model for this task. Variants of GCN are used in multiple state-of-the-art approaches and therefore it is important to understand the specifics and limitations of GCN-based models. Despite serious efforts, we were not able to fully reproduce the results from the original paper, and after a thorough audit of the code provided by the authors, we concluded that their implementation differs from the architecture described in the paper. In addition, several tricks are required to make the model work, and some of them are not very intuitive. We provide an extensive ablation study to quantify the effects these tricks and changes of architecture have on final performance. Furthermore, we examine current evaluation approaches and systematize available benchmark datasets. We believe that people interested in KG matching might profit from our work, as well as novices entering the field.
Tasks Entity Alignment, Knowledge Graphs
Published 2019-11-19
URL https://arxiv.org/abs/1911.08342v2
PDF https://arxiv.org/pdf/1911.08342v2.pdf
PWC https://paperswithcode.com/paper/knowledge-graph-entity-alignment-with-graph
Repo https://github.com/Valentyn1997/kg-alignment-lessons-learned
Framework pytorch
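
Once a GCN (or any encoder) has embedded both KGs into a shared space, the alignment step itself is a nearest-neighbour search, which is also how the benchmarks compute Hits@1. A brute-force numpy version, fine for benchmark-sized KGs:

```python
import numpy as np

def align_entities(emb_kg1, emb_kg2):
    """Greedy nearest-neighbour alignment between two embedded KGs.

    emb_kg1: (N1, D) and emb_kg2: (N2, D) entity embeddings trained so
    that aligned entities end up close. Returns, for each KG1 entity,
    the index of its nearest KG2 entity by L2 distance.
    """
    dists = np.linalg.norm(emb_kg1[:, None, :] - emb_kg2[None, :, :], axis=-1)
    return dists.argmin(axis=1)
```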

Cross-Domain Generalization of Neural Constituency Parsers

Title Cross-Domain Generalization of Neural Constituency Parsers
Authors Daniel Fried, Nikita Kitaev, Dan Klein
Abstract Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing – but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks.
Tasks Constituency Parsing, Domain Generalization
Published 2019-07-09
URL https://arxiv.org/abs/1907.04347v1
PDF https://arxiv.org/pdf/1907.04347v1.pdf
PWC https://paperswithcode.com/paper/cross-domain-generalization-of-neural
Repo https://github.com/dpfried/rnng-bert
Framework tf

SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)

Title SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)
Authors Francesco Morandin, Gianluca Amato, Marco Fantozzi, Rosa Gini, Carlo Metta, Maurizio Parton
Abstract We develop a new model that can be applied to any perfect-information two-player zero-sum game to target a high score, and thus perfect play. We integrate this model into the Monte Carlo tree search and policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap, and to minimize suboptimal moves. We develop a family of agents that can target high scores against any opponent, and recover from a very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction.
Tasks
Published 2019-05-26
URL https://arxiv.org/abs/1905.10863v3
PDF https://arxiv.org/pdf/1905.10863v3.pdf
PWC https://paperswithcode.com/paper/sai-a-sensible-artificial-intelligence-that-1
Repo https://github.com/sai-dev/sai
Framework none
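
The learning pipeline referenced here is AlphaGo-style MCTS with a policy/value network, where child selection at each tree node follows the PUCT rule. A minimal sketch of that rule; the attribute names and exploration constant are our choices, not SAI’s:

```python
import math

def puct_select(node, c_puct=1.5):
    """Select a child by the PUCT rule used in AlphaGo-style MCTS.

    node.children maps moves to child nodes carrying a visit count N,
    a mean value Q, and a network prior P (hypothetical attributes).
    """
    total = sum(ch.N for ch in node.children.values())

    def score(ch):
        # exploitation (Q) plus prior-weighted exploration bonus
        return ch.Q + c_puct * ch.P * math.sqrt(total) / (1 + ch.N)

    return max(node.children.items(), key=lambda kv: score(kv[1]))
```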

A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts

Title A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts
Authors Hossein Keshavarz, Shohreh Tabatabayi Seifi, Mohammad Izadi
Abstract In this paper, we propose a novel approach for measuring the degree of similarity between the categories of two pieces of Persian text, published as descriptions of two separate advertisements. We built an appropriate dataset for this work from a dataset of advertisements posted on an e-commerce website. We generated a significant number of paired texts from this dataset and assigned each pair a score from 0 to 3, indicating the degree of similarity between the domains of the pair. We represent words with word embedding vectors derived from word2vec, and deep neural network models are then used to represent texts. Finally, we concatenate the absolute difference and the element-wise multiplication of the two text representations and use a fully-connected neural network to produce a probability distribution vector over the scores of the pairs. Through a supervised learning approach, we trained our model on a GPU, and our best model achieved an F1 score of 0.9865.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.09690v2
PDF https://arxiv.org/pdf/1909.09690v2.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-based-approach-for-measuring
Repo https://github.com/hossein-kshvrz/text_domain_similarity
Framework none
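
The pairing layer is described concretely enough to sketch: the two text representations are combined by concatenating their absolute difference with their element-wise product, then passed to a small fully-connected scorer over the four similarity levels. A PyTorch sketch, assuming 300-dimensional word2vec-based representations; the hidden size is our choice:

```python
import torch
import torch.nn as nn

def pair_features(u, v):
    """Combine two text representations u, v of shape (batch, D) by
    concatenating |u - v| with the element-wise product u * v."""
    return torch.cat([torch.abs(u - v), u * v], dim=-1)

# Minimal scoring head: logits over the four similarity scores 0-3.
scorer = nn.Sequential(
    nn.Linear(2 * 300, 128),
    nn.ReLU(),
    nn.Linear(128, 4),
)

# usage: logits = scorer(pair_features(u, v))
```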

PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images

Title PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images
Authors Anirban Santara, Jayeeta Datta, Sourav Sarkar, Ankur Garg, Kirti Padia, Pabitra Mitra
Abstract Hyperspectral images of land-cover captured by airborne or satellite-mounted sensors provide a rich source of information about the chemical composition of the materials present in a given place. This makes hyperspectral imaging an important tool for earth sciences, land-cover studies, and military and strategic applications. However, the scarcity of labeled training examples and spatial variability of spectral signature are two of the biggest challenges faced by hyperspectral image classification. In order to address these issues, we aim to develop a framework for material-agnostic information retrieval in hyperspectral images based on Positive-Unlabelled (PU) classification. Given a hyperspectral scene, the user labels some positive samples of a material he/she is looking for and our goal is to retrieve all the remaining instances of the query material in the scene. Additionally, we require the system to work equally well for any material in any scene without the user having to disclose the identity of the query material. This material-agnostic nature of the framework provides it with superior generalization abilities. We explore two alternative approaches to solve the hyperspectral image classification problem within this framework. The first approach is an adaptation of non-negative risk estimation based PU learning for hyperspectral data. The second approach is based on one-versus-all positive-negative classification where the negative class is approximately sampled using a novel spectral-spatial retrieval model. We propose two annotator models - uniform and blob - that represent the labelling patterns of a human annotator. We compare the performances of the proposed algorithms for each annotator model on three benchmark hyperspectral image datasets - Indian Pines, Pavia University and Salinas.
Tasks Hyperspectral Image Classification, Image Classification, Information Retrieval
Published 2019-04-09
URL http://arxiv.org/abs/1904.04547v1
PDF http://arxiv.org/pdf/1904.04547v1.pdf
PWC https://paperswithcode.com/paper/punch-positive-unlabelled-classification
Repo https://github.com/HSISeg/HSISeg
Framework none
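
The first PUNCH approach adapts non-negative risk estimation for PU learning to hyperspectral data. For reference, this is the standard non-negative PU risk estimator of Kiryo et al. (2017) with a logistic surrogate loss; whether PUNCH uses exactly this surrogate is an assumption on our part.

```python
import torch
import torch.nn.functional as F

def nnpu_loss(scores_pos, scores_unl, prior):
    """Non-negative PU risk estimator (Kiryo et al., 2017).

    scores_pos / scores_unl: classifier outputs on labelled-positive
    and unlabelled pixels; prior: class prior pi of the positive class.
    softplus(-z) / softplus(z) are the logistic losses for predicting
    positive / negative.
    """
    r_pos = F.softplus(-scores_pos).mean()   # positive-class risk
    # negative-class risk estimated from unlabelled data, with the
    # positive contribution subtracted out
    r_neg = F.softplus(scores_unl).mean() - prior * F.softplus(scores_pos).mean()
    # clamping keeps the estimated negative risk non-negative
    return prior * r_pos + torch.clamp(r_neg, min=0.0)
```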

Explicit Sentence Compression for Neural Machine Translation

Title Explicit Sentence Compression for Neural Machine Translation
Authors Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao
Abstract State-of-the-art Transformer-based neural machine translation (NMT) systems still follow a standard encoder-decoder framework, in which the source sentence representation can be well handled by an encoder with a self-attention mechanism. Though a Transformer-based encoder may effectively capture general information in its resulting source sentence representation, the backbone information, which stands for the gist of the sentence, is not specifically focused on. In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT. In practice, an explicit sentence compression objective is used to learn the backbone information of a sentence. We propose three ways, backbone source-side fusion, target-side fusion, and both-side fusion, to integrate the compressed sentence into NMT. Our empirical tests on the WMT English-to-French and English-to-German translation tasks show that the proposed sentence compression method significantly improves translation performance over strong baselines.
Tasks Machine Translation, Sentence Compression
Published 2019-12-27
URL https://arxiv.org/abs/1912.11980v1
PDF https://arxiv.org/pdf/1912.11980v1.pdf
PWC https://paperswithcode.com/paper/explicit-sentence-compression-for-neural
Repo https://github.com/bcmi220/esc4nmt
Framework pytorch
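
The paper integrates the compressed “backbone” sentence into NMT via source-side, target-side, or both-side fusion. As one plausible reading of source-side fusion, here is a gated-attention sketch in PyTorch, a simplified stand-in rather than the authors’ exact architecture:

```python
import torch
import torch.nn as nn

class SourceSideFusion(nn.Module):
    """Gated fusion of a compressed-sentence ('backbone') representation
    into the NMT encoder output. d_model must be divisible by the
    number of attention heads."""
    def __init__(self, d_model, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_out, backbone):
        # enc_out: (src_len, batch, d), backbone: (cmp_len, batch, d)
        ctx, _ = self.attn(enc_out, backbone, backbone)  # attend to backbone
        g = torch.sigmoid(self.gate(torch.cat([enc_out, ctx], dim=-1)))
        return g * enc_out + (1 - g) * ctx   # gated mix of the two streams
```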

Diversity Transfer Network for Few-Shot Learning

Title Diversity Transfer Network for Few-Shot Learning
Authors Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xinyu Zhang, Chang Huang, Wenyu Liu, Bo Wang
Abstract Few-shot learning is a challenging task that aims at training a classifier for unseen classes with only a few training examples. The main difficulty of few-shot learning lies in the lack of intra-class diversity within insufficient training samples. To alleviate this problem, we propose a novel generative framework, Diversity Transfer Network (DTN), that learns to transfer latent diversities from known categories and composite them with support features to generate diverse samples for novel categories in feature space. The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works. Besides, an organized auxiliary task co-training over known categories is proposed to stabilize the meta-training process of DTN. We perform extensive experiments and ablation studies on three datasets, i.e., miniImageNet, CIFAR100 and CUB. The results show that DTN, with single-stage training and faster convergence speed, obtains the state-of-the-art results among the feature generation based few-shot learning methods. Code and supplementary material are available at: https://github.com/Yuxin-CV/DTN
Tasks Few-Shot Learning
Published 2019-12-31
URL https://arxiv.org/abs/1912.13182v1
PDF https://arxiv.org/pdf/1912.13182v1.pdf
PWC https://paperswithcode.com/paper/diversity-transfer-network-for-few-shot
Repo https://github.com/Yuxin-CV/DTN
Framework pytorch
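
The generative step in DTN can be pictured as taking the feature difference between two samples of a known class as a “diversity” vector and compositing it with a support feature of a novel class. A minimal PyTorch sketch of that idea; the generator below is a stand-in for the paper’s network:

```python
import torch
import torch.nn as nn

class DiversityTransfer(nn.Module):
    """Hallucinate novel-class samples in feature space by transferring
    intra-class variation from a known category."""
    def __init__(self, d_feat):
        super().__init__()
        self.generator = nn.Sequential(
            nn.Linear(2 * d_feat, d_feat),
            nn.ReLU(),
            nn.Linear(d_feat, d_feat),
        )

    def forward(self, support, ref_a, ref_b):
        # support: novel-class feature; ref_a, ref_b: two samples of
        # the same known class, whose difference carries the diversity
        diversity = ref_a - ref_b
        return self.generator(torch.cat([support, diversity], dim=-1))
```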