Paper Group AWR 352
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes. Deep learning-based electroencephalography analysis: a systematic review. A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos. The Hitchhiker’s Guide to LDA. Anchor Diffusion for Unsupervised Video Object Segmentation. Understanding the Impact of Label Granularity on CNN-based Image Classification. Improving the robustness of ImageNet classifiers using elements of human visual cognition. Fact Discovery from Knowledge Base via Facet Decomposition. Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned. Cross-Domain Generalization of Neural Constituency Parsers. SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version). A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts. PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images. Explicit Sentence Compression for Neural Machine Translation. Diversity Transfer Network for Few-Shot Learning.
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Title | Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes |
Authors | Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li |
Abstract | Unsupervised deep learning for optical flow computation has achieved promising results. Most existing deep-net based methods rely on image brightness consistency and local smoothness constraints to train the networks. Their performance degrades in regions where repetitive textures or occlusions occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning. In particular, we investigate multiple ways of enforcing the epipolar constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem encountered in dynamic scenes where multiple motions may be present, we propose a low-rank constraint as well as a union-of-subspaces constraint for training. Experimental results on various benchmarking datasets show that our method achieves competitive performance compared with supervised methods and outperforms state-of-the-art unsupervised deep-learning methods. |
Tasks | Optical Flow Estimation |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03848v1 |
PDF | http://arxiv.org/pdf/1904.03848v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-deep-epipolar-flow-for |
Repo | https://github.com/yiranzhong/EPIflow.git |
Framework | none |
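The core idea, penalizing flow vectors whose implied correspondences violate the epipolar constraint x2^T F x1 = 0, can be sketched in a few lines. Below is a minimal NumPy illustration that assumes the fundamental matrix F is already known; the paper additionally estimates F and handles dynamic scenes with low-rank and union-of-subspaces constraints, which this sketch omits.

```python
import numpy as np

def epipolar_flow_loss(flow, F):
    """Penalize flow correspondences that violate the epipolar constraint.

    flow : (H, W, 2) array of per-pixel displacements (u, v).
    F    : (3, 3) fundamental matrix relating the two frames (assumed given).

    For each pixel x1 and its match x2 = x1 + flow, a static scene satisfies
    x2^T F x1 = 0; we average the geometric distance of x2 to the line F x1.
    """
    H, W, _ = flow.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ones = np.ones_like(xs, dtype=np.float64)
    x1 = np.stack([xs, ys, ones], axis=-1).reshape(-1, 3)   # homogeneous pixels
    x2 = x1.copy()
    x2[:, 0] += flow[..., 0].ravel()
    x2[:, 1] += flow[..., 1].ravel()

    lines = x1 @ F.T                                        # epipolar lines F x1
    algebraic = np.abs(np.sum(x2 * lines, axis=1))          # |x2^T F x1|
    # Normalize by the line gradient to get a point-to-line distance.
    dist = algebraic / np.sqrt(lines[:, 0]**2 + lines[:, 1]**2 + 1e-8)
    return dist.mean()
```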
Deep learning-based electroencephalography analysis: a systematic review
Title | Deep learning-based electroencephalography analysis: a systematic review |
Authors | Yannick Roy, Hubert Banville, Isabela Albuquerque, Alexandre Gramfort, Tiago H. Falk, Jocelyn Faubert |
Abstract | Electroencephalography (EEG) is a complex signal and can require several years of training to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. In this work, we review 156 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches in order to inform future research and formulate recommendations. Various data items were extracted for each study pertaining to 1) the data, 2) the preprocessing methodology, 3) the DL design choices, 4) the results, and 5) the reproducibility of the experiments. Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours. As for the model, 40% of the studies used convolutional neural networks (CNNs), while 14% used recurrent neural networks (RNNs), most often with a total of 3 to 10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was 5.4% across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. To help the field progress, we provide a list of recommendations for future studies and we make our summary table of DL and EEG papers available and invite the community to contribute. |
Tasks | Brain Decoding, EEG, Time Series |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05498v2 |
PDF | http://arxiv.org/pdf/1901.05498v2.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-based-electroencephalography |
Repo | https://github.com/hubertjb/dl-eeg-review |
Framework | none |
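For readers new to the area, the review's finding that most studies apply shallow CNNs (3 to 10 layers) directly to raw EEG time series translates into models of roughly the following shape. This PyTorch sketch is purely illustrative: the channel counts, kernel sizes, and pooling choices are hypothetical and not taken from any reviewed paper.

```python
import torch
import torch.nn as nn

class ShallowEEGNet(nn.Module):
    """Illustrative three-layer 1-D CNN over raw multi-channel EEG windows.

    Input: (batch, n_channels, n_samples), e.g. 64-channel one-second
    epochs sampled at 128 Hz. All sizes are hypothetical choices.
    """
    def __init__(self, n_channels=64, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=25, padding=12),
            nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=11, padding=5),
            nn.BatchNorm1d(64), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, n_classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: 8 one-second epochs of 64-channel EEG at 128 Hz.
logits = ShallowEEGNet()(torch.randn(8, 64, 128))
```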
A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos
Title | A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos |
Authors | Tal Hakim, Ilan Shimshoni |
Abstract | Assessment of motion quality has recently gained high demand in a variety of domains. The ability to automatically assess subject motion in videos that were captured by cheap devices, such as Kinect cameras, is essential for monitoring clinical rehabilitation processes, for improving motor skills and for motion learning tasks. The need to pay attention to low-level details while accurately tracking the motion stages makes this task very challenging. In this work, we introduce A-MAL, an automatic, strong motion assessment learning algorithm that learns only from properly-performed motion videos without further annotations, powered by a deviation time-segmentation algorithm, a parameter relevance detection algorithm, a novel time-warping algorithm that is based on automatic detection of common temporal points-of-interest, and a textual-feedback generation mechanism. We demonstrate our method on motions from the Fugl-Meyer Assessment (FMA) test, which is typically administered by occupational therapists to monitor patients’ recovery after strokes. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.10004v3 |
PDF | https://arxiv.org/pdf/1907.10004v3.pdf |
PWC | https://paperswithcode.com/paper/a-mal-automatic-motion-assessment-learning |
Repo | https://github.com/skvp-owner/a-mal |
Framework | none |
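A-MAL's time-warping algorithm is novel and anchored on automatically detected temporal points-of-interest, so it is not reproduced here; as a point of reference, the classic dynamic time warping it departs from can be sketched as follows, assuming per-frame joint-coordinate vectors.

```python
import numpy as np

def dtw_cost(a, b):
    """Classic dynamic time warping between two skeleton sequences.

    a: (T1, D), b: (T2, D) arrays of per-frame joint-coordinate vectors.
    Returns the cumulative alignment cost. A-MAL's warping is instead
    anchored on detected points-of-interest, which this baseline omits.
    """
    T1, T2 = len(a), len(b)
    D = np.full((T1 + 1, T2 + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, T1 + 1):
        for j in range(1, T2 + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[T1, T2]
```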
The Hitchhiker’s Guide to LDA
Title | The Hitchhiker’s Guide to LDA |
Authors | Chen Ma |
Abstract | The Latent Dirichlet Allocation (LDA) model is a well-known model in the topic modeling field; it has been studied for years due to its extensive application value in industry and academia. However, the mathematical derivation of the LDA model is challenging, which makes it difficult for beginners to learn. To help beginners learn LDA, this book analyzes the mathematical derivation of LDA in detail and introduces all the necessary background knowledge to make it easy to understand. The book also contains the author’s unique insights. It should be noted that this book is written in Chinese. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.03142v2 |
PDF | https://arxiv.org/pdf/1908.03142v2.pdf |
PWC | https://paperswithcode.com/paper/the-hitchhikers-guide-to-lda |
Repo | https://github.com/MachineIntellect/GibbsLDA_plus |
Framework | none |
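The derivation such a guide builds toward is the standard collapsed Gibbs sampling update used by Gibbs-sampling LDA implementations like the linked repo: p(z = k | rest) is proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta). A minimal NumPy sketch of one such update (variable names are ours, not the repo's):

```python
import numpy as np

def gibbs_update(d, w, z_old, ndk, nkw, nk, alpha, beta):
    """One collapsed Gibbs sampling step for LDA.

    Resamples the topic of a single token (document d, word id w) from
    p(z=k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
    ndk: (D, K) doc-topic counts, nkw: (K, V) topic-word counts,
    nk: (K,) topic totals.
    """
    K, V = nkw.shape
    # Remove the token's old assignment from the counts.
    ndk[d, z_old] -= 1; nkw[z_old, w] -= 1; nk[z_old] -= 1
    # Full conditional over topics (unnormalized).
    p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
    z_new = np.random.choice(K, p=p / p.sum())
    # Add the token back under its new topic.
    ndk[d, z_new] += 1; nkw[z_new, w] += 1; nk[z_new] += 1
    return z_new
```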
Anchor Diffusion for Unsupervised Video Object Segmentation
Title | Anchor Diffusion for Unsupervised Video Object Segmentation |
Authors | Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H. S. Torr |
Abstract | Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow. Despite their complexity, these kinds of approaches tend to favour short-term temporal dependencies and are thus prone to accumulating inaccuracies, which cause drift over time. Moreover, simple (static) image segmentation models, alone, can perform competitively against these methods, which further suggests that the way temporal dependencies are modelled should be reconsidered. Motivated by these observations, in this paper we explore simple yet effective strategies to model long-term temporal dependencies. Inspired by the non-local operators of [70], we introduce a technique to establish dense correspondences between pixel embeddings of a reference “anchor” frame and the current one. This allows the learning of pairwise dependencies at arbitrarily long distances without conditioning on intermediate frames. Without online supervision, our approach can suppress the background and precisely segment the foreground object even in challenging scenarios, while maintaining consistent performance over time. With a mean IoU of 81.7%, our method ranks first on the DAVIS-2016 leaderboard of unsupervised methods, while still being competitive against state-of-the-art online semi-supervised approaches. We further evaluate our method on the FBMS dataset and the ViSal video saliency dataset, showing results competitive with the state of the art. |
Tasks | Optical Flow Estimation, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10895v1 |
PDF | https://arxiv.org/pdf/1910.10895v1.pdf |
PWC | https://paperswithcode.com/paper/anchor-diffusion-for-unsupervised-video-1 |
Repo | https://github.com/yz93/anchor-diff-VOS |
Framework | pytorch |
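The anchor mechanism boils down to a non-local, dot-product correspondence between the anchor frame's pixel embeddings and the current frame's. Below is a hedged PyTorch sketch of that operation; the paper's exact normalization and propagation details may differ.

```python
import torch
import torch.nn.functional as F

def anchor_attention(anchor_feat, current_feat):
    """Dense anchor-to-current correspondence via dot-product attention.

    anchor_feat, current_feat: (B, C, H, W) pixel embeddings.
    Each current-frame pixel aggregates anchor-frame features weighted
    by embedding similarity, a non-local operation in the spirit of
    the paper (exact details are an assumption here).
    """
    B, C, H, W = anchor_feat.shape
    a = anchor_feat.flatten(2)                      # (B, C, HW_anchor)
    c = current_feat.flatten(2)                     # (B, C, HW_cur)
    affinity = torch.bmm(c.transpose(1, 2), a)      # (B, HW_cur, HW_anchor)
    weights = F.softmax(affinity / C ** 0.5, dim=-1)
    out = torch.bmm(a, weights.transpose(1, 2))     # (B, C, HW_cur)
    return out.view(B, C, H, W)
```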
Understanding the Impact of Label Granularity on CNN-based Image Classification
Title | Understanding the Impact of Label Granularity on CNN-based Image Classification |
Authors | Zhuo Chen, Ruizhou Ding, Ting-Wu Chin, Diana Marculescu |
Abstract | In recent years, supervised learning using Convolutional Neural Networks (CNNs) has achieved great success in image classification tasks, and large-scale labeled datasets have contributed significantly to this achievement. However, the definition of a label is often application dependent. For example, an image of a cat can be labeled as “cat” or, more specifically, “Persian cat.” We refer to this as label granularity. In this paper, we conduct extensive experiments using various datasets to demonstrate and analyze how and why training based on fine-grain labeling, such as “Persian cat,” can improve CNN accuracy on classifying coarse-grain classes, in this case “cat.” The experimental results show that training CNNs with fine-grain labels improves both the network’s optimization and generalization capabilities, as intuitively it encourages the network to learn more features, and hence increases classification accuracy on coarse-grain classes on all datasets considered. Moreover, fine-grain labels enhance data efficiency in CNN training. For example, a CNN trained with fine-grain labels and only 40% of the total training data can achieve higher accuracy than a CNN trained with the full training dataset and coarse-grain labels. These results point to two possible applications of this work: (i) with sufficient human resources, one can improve CNN performance by re-labeling the dataset with fine-grain labels, and (ii) with limited human resources, to improve CNN performance, rather than collecting more training data, one may instead use fine-grain labels for the dataset. We further propose a metric called Average Confusion Ratio to characterize the effectiveness of fine-grain labeling, and show its use through extensive experimentation. Code is available at https://github.com/cmu-enyac/Label-Granularity. |
Tasks | Image Classification |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.07012v1 |
PDF | http://arxiv.org/pdf/1901.07012v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-the-impact-of-label-granularity |
Repo | https://github.com/cmu-enyac/Label-Granularity |
Framework | pytorch |
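The evaluation protocol implied by the abstract, training on fine-grain labels but measuring coarse-grain accuracy, only requires collapsing fine predictions through a fine-to-coarse map. A minimal sketch; the mapping array and the collapse-by-argmax rule are our assumptions, and one could instead sum fine-class probabilities per coarse class.

```python
import numpy as np

def coarse_accuracy(fine_logits, coarse_labels, fine_to_coarse):
    """Evaluate a fine-grain-trained classifier on coarse classes.

    fine_logits:    (N, n_fine) network outputs over fine labels.
    coarse_labels:  (N,) ground-truth coarse class ids.
    fine_to_coarse: (n_fine,) int array mapping each fine label to its
                    coarse class, e.g. persian_cat -> cat.
    """
    fine_pred = fine_logits.argmax(axis=1)       # predicted fine label
    coarse_pred = fine_to_coarse[fine_pred]      # collapse to coarse label
    return (coarse_pred == coarse_labels).mean()
```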
Improving the robustness of ImageNet classifiers using elements of human visual cognition
Title | Improving the robustness of ImageNet classifiers using elements of human visual cognition |
Authors | A. Emin Orhan, Brenden M. Lake |
Abstract | We investigate the robustness properties of image recognition models equipped with two features inspired by human vision, an explicit episodic memory and a shape bias, at the ImageNet scale. As reported in previous work, we show that an explicit episodic memory improves the robustness of image recognition models against small-norm adversarial perturbations under some threat models. It does not, however, improve the robustness against more natural, and typically larger, perturbations. Learning more robust features during training appears to be necessary for robustness in this second sense. We show that features derived from a model that was encouraged to learn global, shape-based representations (Geirhos et al., 2019) not only improve the robustness against natural perturbations but, when used in conjunction with an episodic memory, also provide additional robustness against adversarial perturbations. Finally, we address three important design choices for the episodic memory: memory size, dimensionality of the memories and the retrieval method. We show that to make the episodic memory more compact, it is preferable to reduce the number of memories by clustering them instead of reducing their dimensionality. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08416v2 |
PDF | https://arxiv.org/pdf/1906.08416v2.pdf |
PWC | https://paperswithcode.com/paper/improving-the-robustness-of-imagenet |
Repo | https://github.com/eminorhan/robust-vision |
Framework | pytorch |
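An explicit episodic memory of the kind studied here is essentially a cache of labeled feature embeddings queried by similarity at test time. The sketch below is schematic, with an illustrative retrieval rule (k and the similarity kernel are our choices); per the paper's finding, one would compress `keys` by clustering the memories, e.g. class-wise k-means centroids, rather than by reducing their dimensionality.

```python
import numpy as np

class EpisodicMemory:
    """Cache of (embedding, label) pairs queried by cosine similarity.

    A schematic stand-in for a cache-style episodic memory; the
    retrieval rule here is an illustrative choice, not the paper's.
    """
    def __init__(self, keys, labels, n_classes):
        # keys: (M, d) stored embeddings; labels: (M,) int class ids.
        self.keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
        self.labels, self.n_classes = labels, n_classes

    def predict(self, query, k=50):
        q = query / np.linalg.norm(query)
        sims = self.keys @ q                      # cosine similarities
        top = np.argsort(sims)[-k:]               # k nearest memories
        # Similarity-weighted vote over the retrieved labels.
        votes = np.bincount(self.labels[top], weights=sims[top],
                            minlength=self.n_classes)
        return votes.argmax()
```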
Fact Discovery from Knowledge Base via Facet Decomposition
Title | Fact Discovery from Knowledge Base via Facet Decomposition |
Authors | Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam |
Abstract | During the past few decades, knowledge bases (KBs) have experienced rapid growth. Nevertheless, most KBs still suffer from serious incompleteness. Researchers have proposed many tasks, such as knowledge base completion and relation prediction, to help build the representation of KBs. However, some issues remain unsettled when it comes to enriching KBs. Knowledge base completion and relation prediction assume that we know two elements of a fact triple and are going to predict the missing one. This assumption is too restrictive in practice and prevents these tasks from discovering new facts directly. To address this issue, we propose a new task, namely, fact discovery from knowledge base. This task only requires that we know the head entity, and the goal is to discover facts associated with it. To tackle this new problem, we propose a novel framework that decomposes the discovery problem into several facet discovery components. We also propose a novel auto-encoder based facet component to estimate some facets of the fact. Besides, we propose a feedback learning component to share information between the facets. We evaluate our framework using a benchmark dataset, and the experimental results show that our framework achieves promising results. We also conduct extensive analysis of our framework in discovering different kinds of facts. The source code of this paper can be obtained from https://github.com/thunlp/FFD. |
Tasks | Knowledge Base Completion |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09540v1 |
PDF | http://arxiv.org/pdf/1904.09540v1.pdf |
PWC | https://paperswithcode.com/paper/fact-discovery-from-knowledge-base-via-facet |
Repo | https://github.com/thunlp/FFD |
Framework | none |
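Schematically, facet decomposition turns head-only fact discovery into scoring every candidate (head, r, t) with several facet components and combining the scores. The sketch below shows only that outer loop; the actual auto-encoder based facet models, the feedback learning component, and the true combination rule live in the paper and its repo, so everything here is illustrative.

```python
import numpy as np

def discover_facts(head, facet_scorers, relations, tails, top_n=10):
    """Generic shape of facet-decomposed fact discovery.

    Given only a head entity, every candidate (head, r, t) is scored by
    each facet component and the scores are combined (multiplicatively
    here, as an assumption). facet_scorers is a list of callables
    scorer(head, r, t) -> float in [0, 1].
    """
    candidates = [(head, r, t) for r in relations for t in tails]
    scores = np.ones(len(candidates))
    for scorer in facet_scorers:          # e.g. a relation facet, a tail facet
        scores *= np.array([scorer(*c) for c in candidates])
    order = np.argsort(scores)[::-1][:top_n]
    return [candidates[i] for i in order]
```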
Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned
Title | Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned |
Authors | Max Berrendorf, Evgeniy Faerman, Valentyn Melnychuk, Volker Tresp, Thomas Seidl |
Abstract | In this work, we focus on the problem of entity alignment in Knowledge Graphs (KG) and we report on our experiences when applying a Graph Convolutional Network (GCN) based model for this task. Variants of GCN are used in multiple state-of-the-art approaches, and therefore it is important to understand the specifics and limitations of GCN-based models. Despite serious efforts, we were not able to fully reproduce the results from the original paper, and after a thorough audit of the code provided by the authors, we concluded that their implementation differs from the architecture described in the paper. In addition, several tricks are required to make the model work, and some of them are not very intuitive. We provide an extensive ablation study to quantify the effects these tricks and changes of architecture have on the final performance. Furthermore, we examine current evaluation approaches and systematize available benchmark datasets. We believe that people interested in KG matching might profit from our work, as well as novices entering the field. |
Tasks | Entity Alignment, Knowledge Graphs |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08342v2 |
PDF | https://arxiv.org/pdf/1911.08342v2.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graph-entity-alignment-with-graph |
Repo | https://github.com/Valentyn1997/kg-alignment-lessons-learned |
Framework | pytorch |
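GCN-based entity alignment models of the family audited here typically embed both KGs with a GCN and train a margin loss that pulls known aligned pairs together while pushing corrupted pairs apart. A sketch of that standard objective follows; the studied model's exact loss and the "tricks" the ablation covers differ in detail.

```python
import torch
import torch.nn.functional as F

def alignment_margin_loss(emb1, emb2, pos_pairs, neg_pairs, margin=3.0):
    """Margin-based alignment loss typical of GCN entity-alignment models.

    emb1, emb2: (n1, d), (n2, d) GCN embeddings of the two KGs.
    pos_pairs:  (P, 2) long tensor of known aligned entity indices.
    neg_pairs:  (P, 2) long tensor of corrupted (non-aligned) pairs.
    L1 distance and the margin value are conventional choices.
    """
    d_pos = F.pairwise_distance(emb1[pos_pairs[:, 0]],
                                emb2[pos_pairs[:, 1]], p=1)
    d_neg = F.pairwise_distance(emb1[neg_pairs[:, 0]],
                                emb2[neg_pairs[:, 1]], p=1)
    # Aligned pairs should be closer than corrupted pairs by the margin.
    return F.relu(d_pos - d_neg + margin).mean()
```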
Cross-Domain Generalization of Neural Constituency Parsers
Title | Cross-Domain Generalization of Neural Constituency Parsers |
Authors | Daniel Fried, Nikita Kitaev, Dan Klein |
Abstract | Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing – but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks. |
Tasks | Constituency Parsing, Domain Generalization |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04347v1 |
PDF | https://arxiv.org/pdf/1907.04347v1.pdf |
PWC | https://paperswithcode.com/paper/cross-domain-generalization-of-neural |
Repo | https://github.com/dpfried/rnng-bert |
Framework | tf |
SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)
Title | SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version) |
Authors | Francesco Morandin, Gianluca Amato, Marco Fantozzi, Rosa Gini, Carlo Metta, Maurizio Parton |
Abstract | We develop a new model that can be applied to any perfect-information two-player zero-sum game to target a high score, and thus perfect play. We integrate this model into the Monte Carlo tree search-policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap, and to minimize suboptimal moves. We develop a family of agents that can target high scores against any opponent, and recover from very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction. |
Tasks | |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10863v3 |
PDF | https://arxiv.org/pdf/1905.10863v3.pdf |
PWC | https://paperswithcode.com/paper/sai-a-sensible-artificial-intelligence-that-1 |
Repo | https://github.com/sai-dev/sai |
Framework | none |
A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts
Title | A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts |
Authors | Hossein Keshavarz, Shohreh Tabatabayi Seifi, Mohammad Izadi |
Abstract | In this paper, we propose a novel approach for measuring the degree of similarity between the categories of two pieces of Persian text, which were published as descriptions of two separate advertisements. We built an appropriate dataset for this work using a dataset consisting of advertisements posted on an e-commerce website. We generated a significant number of paired texts from this dataset and assigned each pair a score from 0 to 3, which indicates the degree of similarity between the domains of the pair. In this work, we represent words with word embedding vectors derived from word2vec. Deep neural network models are then used to represent the texts. Eventually, we employ the concatenation of the absolute difference and the element-wise multiplication of the two text representations, followed by a fully-connected neural network, to produce a probability distribution vector over the score of each pair. Through a supervised learning approach, we trained our model on a GPU, and our best model achieved an F1 score of 0.9865. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.09690v2 |
PDF | https://arxiv.org/pdf/1909.09690v2.pdf |
PWC | https://paperswithcode.com/paper/a-deep-learning-based-approach-for-measuring |
Repo | https://github.com/hossein-kshvrz/text_domain_similarity |
Framework | none |
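The matching head is described concretely enough to sketch: the two text representations a and b are compared via |a - b| and a * b, concatenated, and mapped by a fully-connected network to a distribution over the scores 0 to 3. In the PyTorch sketch below, the upstream text encoders are omitted and the hidden sizes are illustrative choices, not the paper's.

```python
import torch
import torch.nn as nn

class DomainSimilarityHead(nn.Module):
    """Matching head from the abstract: compare two text representations
    via absolute difference and element-wise product, then map the
    concatenation to a distribution over similarity scores 0..3.
    Dimensions here are hypothetical."""
    def __init__(self, text_dim=300, hidden=128, n_scores=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * text_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_scores),
        )

    def forward(self, a, b):
        # a, b: (batch, text_dim) encoded advertisement descriptions.
        feats = torch.cat([(a - b).abs(), a * b], dim=-1)
        return self.mlp(feats).softmax(dim=-1)   # P(score = 0..3)
```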
PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images
Title | PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images |
Authors | Anirban Santara, Jayeeta Datta, Sourav Sarkar, Ankur Garg, Kirti Padia, Pabitra Mitra |
Abstract | Hyperspectral images of land-cover captured by airborne or satellite-mounted sensors provide a rich source of information about the chemical composition of the materials present in a given place. This makes hyperspectral imaging an important tool for earth sciences, land-cover studies, and military and strategic applications. However, the scarcity of labeled training examples and spatial variability of spectral signature are two of the biggest challenges faced by hyperspectral image classification. In order to address these issues, we aim to develop a framework for material-agnostic information retrieval in hyperspectral images based on Positive-Unlabelled (PU) classification. Given a hyperspectral scene, the user labels some positive samples of a material he/she is looking for and our goal is to retrieve all the remaining instances of the query material in the scene. Additionally, we require the system to work equally well for any material in any scene without the user having to disclose the identity of the query material. This material-agnostic nature of the framework provides it with superior generalization abilities. We explore two alternative approaches to solve the hyperspectral image classification problem within this framework. The first approach is an adaptation of non-negative risk estimation based PU learning for hyperspectral data. The second approach is based on one-versus-all positive-negative classification where the negative class is approximately sampled using a novel spectral-spatial retrieval model. We propose two annotator models, uniform and blob, that represent the labelling patterns of a human annotator. We compare the performances of the proposed algorithms for each annotator model on three benchmark hyperspectral image datasets: Indian Pines, Pavia University and Salinas. |
Tasks | Hyperspectral Image Classification, Image Classification, Information Retrieval |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04547v1 |
PDF | http://arxiv.org/pdf/1904.04547v1.pdf |
PWC | https://paperswithcode.com/paper/punch-positive-unlabelled-classification |
Repo | https://github.com/HSISeg/HSISeg |
Framework | none |
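The first approach adapts non-negative risk estimation (nnPU, Kiryo et al., 2017), whose estimator is standard and worth stating: R = pi * R_p^+ + max(0, R_u^- - pi * R_p^-), where pi is the class prior of the query material. A PyTorch sketch under stated assumptions: the prior must be estimated in practice, and the softplus (logistic) surrogate loss is our choice.

```python
import torch

def nn_pu_risk(pos_out, unl_out, prior, loss=torch.nn.functional.softplus):
    """Non-negative PU risk estimator (Kiryo et al., 2017).

    pos_out: raw scores for labelled-positive pixels.
    unl_out: raw scores for unlabelled pixels.
    prior:   class prior pi of the positive (query-material) class.
    A positive score means "predicted as the query material"; softplus(-z)
    and softplus(z) serve as the positive- and negative-label losses.
    """
    r_pos = loss(-pos_out).mean()        # positives classified as positive
    r_pos_neg = loss(pos_out).mean()     # positives treated as negative
    r_unl_neg = loss(unl_out).mean()     # unlabelled treated as negative
    # Clamp keeps the estimated negative risk non-negative.
    neg_risk = r_unl_neg - prior * r_pos_neg
    return prior * r_pos + torch.clamp(neg_risk, min=0.0)
```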
Explicit Sentence Compression for Neural Machine Translation
Title | Explicit Sentence Compression for Neural Machine Translation |
Authors | Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao |
Abstract | State-of-the-art Transformer-based neural machine translation (NMT) systems still follow a standard encoder-decoder framework, in which source sentence representation can be well done by an encoder with a self-attention mechanism. Though a Transformer-based encoder may effectively capture general information in its resulting source sentence representation, the backbone information, which stands for the gist of a sentence, is not specifically focused on. In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT. In practice, an explicit sentence compression goal is used to learn the backbone information in a sentence. We propose three ways, namely backbone source-side fusion, target-side fusion, and both-side fusion, to integrate the compressed sentence into NMT. Our empirical tests on the WMT English-to-French and English-to-German translation tasks show that the proposed sentence compression method significantly improves the translation performance over strong baselines. |
Tasks | Machine Translation, Sentence Compression |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.11980v1 |
PDF | https://arxiv.org/pdf/1912.11980v1.pdf |
PWC | https://paperswithcode.com/paper/explicit-sentence-compression-for-neural |
Repo | https://github.com/bcmi220/esc4nmt |
Framework | pytorch |
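One plausible reading of the source-side fusion (the paper proposes three fusion variants, and the exact wiring may differ from this sketch) is an attention-plus-gate module that lets each source position consult the compressed "backbone" representation:

```python
import torch
import torch.nn as nn

class SourceSideFusion(nn.Module):
    """Gated fusion of a compressed-sentence representation into the
    NMT encoder output. This is a hedged illustration of source-side
    fusion, not the paper's exact module; sizes are illustrative.
    """
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, src_states, comp_states):
        # src_states:  (B, S, d) encoder states of the full sentence.
        # comp_states: (B, C, d) states of the compressed sentence.
        ctx, _ = self.attn(src_states, comp_states, comp_states)
        g = torch.sigmoid(self.gate(torch.cat([src_states, ctx], dim=-1)))
        # Per-position gate mixes original and backbone-aware context.
        return g * src_states + (1 - g) * ctx
```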
Diversity Transfer Network for Few-Shot Learning
Title | Diversity Transfer Network for Few-Shot Learning |
Authors | Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xinyu Zhang, Chang Huang, Wenyu Liu, Bo Wang |
Abstract | Few-shot learning is a challenging task that aims at training a classifier for unseen classes with only a few training examples. The main difficulty of few-shot learning lies in the lack of intra-class diversity within insufficient training samples. To alleviate this problem, we propose a novel generative framework, Diversity Transfer Network (DTN), that learns to transfer latent diversities from known categories and composite them with support features to generate diverse samples for novel categories in feature space. The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works. Besides, an organized auxiliary task co-training over known categories is proposed to stabilize the meta-training process of DTN. We perform extensive experiments and ablation studies on three datasets, i.e., miniImageNet, CIFAR100 and CUB. The results show that DTN, with single-stage training and faster convergence speed, obtains the state-of-the-art results among the feature generation based few-shot learning methods. Code and supplementary material are available at: https://github.com/Yuxin-CV/DTN |
Tasks | Few-Shot Learning |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13182v1 |
PDF | https://arxiv.org/pdf/1912.13182v1.pdf |
PWC | https://paperswithcode.com/paper/diversity-transfer-network-for-few-shot |
Repo | https://github.com/Yuxin-CV/DTN |
Framework | pytorch |
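The generative step can be pictured as follows: the intra-class variation of a reference pair drawn from a known category is re-applied to a novel-class support feature to synthesize an extra sample, which then feeds the meta-classification loss. The sketch below is schematic; the layer sizes and the residual composition are our assumptions, not DTN's published architecture.

```python
import torch
import torch.nn as nn

class DiversityGenerator(nn.Module):
    """Schematic diversity transfer in feature space.

    support: (B, d) feature of a novel-class support sample.
    ref_a, ref_b: (B, d) features of two samples from the same known
    class, whose difference carries intra-class variation.
    """
    def __init__(self, feat_dim=640):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, support, ref_a, ref_b):
        diversity = ref_a - ref_b    # variation within a known class
        # Composite the transferred variation with the support feature.
        return self.net(torch.cat([support, diversity], dim=-1)) + support
```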