Paper Group AWR 352
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes. Deep learning-based electroencephalography analysis: a systematic review. A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos. The Hitchhiker’s Guide to LDA. Anchor Diffusion for Unsupervised Video Object Segmentation. Understanding the Impact of Label Granularity on CNN-based Image Classification. Improving the robustness of ImageNet classifiers using elements of human visual cognition. Fact Discovery from Knowledge Base via Facet Decomposition. Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned. Cross-Domain Generalization of Neural Constituency Parsers. SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version). A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts. PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images. Explicit Sentence Compression for Neural Machine Translation. Diversity Transfer Network for Few-Shot Learning.
Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Title | Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes |
Authors | Yiran Zhong, Pan Ji, Jianyuan Wang, Yuchao Dai, Hongdong Li |
Abstract | Unsupervised deep learning for optical flow computation has achieved promising results. Most existing deep-net based methods rely on image brightness consistency and local smoothness constraints to train the networks. Their performance degrades in regions where repetitive textures or occlusions occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical flow method which incorporates global geometric constraints into network learning. In particular, we investigate multiple ways of enforcing the epipolar constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem encountered in dynamic scenes where multiple motions may be present, we propose a low-rank constraint as well as a union-of-subspaces constraint for training. Experimental results on various benchmarking datasets show that our method achieves competitive performance compared with supervised methods and outperforms state-of-the-art unsupervised deep-learning methods. |
Tasks | Optical Flow Estimation |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03848v1 |
PDF | http://arxiv.org/pdf/1904.03848v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-deep-epipolar-flow-for |
Repo | https://github.com/yiranzhong/EPIflow.git |
Framework | none |
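The core idea, penalizing flow vectors whose implied correspondences violate the epipolar constraint x2^T F x1 = 0, can be sketched in a few lines. Below is a minimal NumPy illustration that assumes the fundamental matrix F is already known; the paper additionally estimates F and handles dynamic scenes with low-rank and union-of-subspaces constraints, which this sketch omits.

```python
import numpy as np

def epipolar_flow_loss(flow, F):
    """Penalize flow correspondences that violate the epipolar constraint.

    flow : (H, W, 2) array of per-pixel displacements (u, v).
    F    : (3, 3) fundamental matrix relating the two frames (assumed given).

    For each pixel x1 and its match x2 = x1 + flow, a static scene satisfies
    x2^T F x1 = 0; we average the geometric distance of x2 to the line F x1.
    """
    H, W, _ = flow.shape
    ys, xs = np.mgrid[0:H, 0:W]
    ones = np.ones_like(xs, dtype=np.float64)
    x1 = np.stack([xs, ys, ones], axis=-1).reshape(-1, 3)   # homogeneous pixels
    x2 = x1.copy()
    x2[:, 0] += flow[..., 0].ravel()
    x2[:, 1] += flow[..., 1].ravel()

    lines = x1 @ F.T                                        # epipolar lines F x1
    algebraic = np.abs(np.sum(x2 * lines, axis=1))          # |x2^T F x1|
    # Normalize by the line gradient to get a point-to-line distance.
    dist = algebraic / np.sqrt(lines[:, 0]**2 + lines[:, 1]**2 + 1e-8)
    return dist.mean()
```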
Deep learning-based electroencephalography analysis: a systematic review
Title | Deep learning-based electroencephalography analysis: a systematic review |
Authors | Yannick Roy, Hubert Banville, Isabela Albuquerque, Alexandre Gramfort, Tiago H. Falk, Jocelyn Faubert |
Abstract | Electroencephalography (EEG) is a complex signal and can require several years of training to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. In this work, we review 156 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches in order to inform future research and formulate recommendations. Various data items were extracted for each study pertaining to 1) the data, 2) the preprocessing methodology, 3) the DL design choices, 4) the results, and 5) the reproducibility of the experiments. Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours. As for the model, 40% of the studies used convolutional neural networks (CNNs), while 14% used recurrent neural networks (RNNs), most often with a total of 3 to 10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was 5.4% across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. To help the field progress, we provide a list of recommendations for future studies and we make our summary table of DL and EEG papers available and invite the community to contribute. |
Tasks | Brain Decoding, EEG, Time Series |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05498v2 |
PDF | http://arxiv.org/pdf/1901.05498v2.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-based-electroencephalography |
Repo | https://github.com/hubertjb/dl-eeg-review |
Framework | none |
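For readers new to the area, the review's finding that most studies apply shallow CNNs (3 to 10 layers) directly to raw EEG time series translates into models of roughly the following shape. This PyTorch sketch is purely illustrative: the channel counts, kernel sizes, and pooling choices are hypothetical and not taken from any reviewed paper.

```python
import torch
import torch.nn as nn

class ShallowEEGNet(nn.Module):
    """Illustrative three-layer 1-D CNN over raw multi-channel EEG windows.

    Input: (batch, n_channels, n_samples), e.g. 64-channel one-second
    epochs sampled at 128 Hz. All sizes are hypothetical choices.
    """
    def __init__(self, n_channels=64, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=25, padding=12),
            nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=11, padding=5),
            nn.BatchNorm1d(64), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(64, n_classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: 8 one-second epochs of 64-channel EEG at 128 Hz.
logits = ShallowEEGNet()(torch.randn(8, 64, 128))
```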
A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos
Title | A-MAL: Automatic Motion Assessment Learning from Properly Performed Motions in 3D Skeleton Videos |
Authors | Tal Hakim, Ilan Shimshoni |
Abstract | Assessment of motion quality has recently gained high demand in a variety of domains. The ability to automatically assess subject motion in videos that were captured by cheap devices, such as Kinect cameras, is essential for monitoring clinical rehabilitation processes, for improving motor skills and for motion learning tasks. The need to pay attention to low-level details while accurately tracking the motion stages makes this task very challenging. In this work, we introduce A-MAL, an automatic, strong motion assessment learning algorithm that learns only from properly-performed motion videos without further annotations, powered by a deviation time-segmentation algorithm, a parameter relevance detection algorithm, a novel time-warping algorithm that is based on automatic detection of common temporal points-of-interest, and a textual-feedback generation mechanism. We demonstrate our method on motions from the Fugl-Meyer Assessment (FMA) test, which is typically administered by occupational therapists to monitor patients’ recovery after strokes. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.10004v3 |
PDF | https://arxiv.org/pdf/1907.10004v3.pdf |
PWC | https://paperswithcode.com/paper/a-mal-automatic-motion-assessment-learning |
Repo | https://github.com/skvp-owner/a-mal |
Framework | none |
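A-MAL's time-warping algorithm is novel and anchored on automatically detected temporal points-of-interest, so it is not reproduced here; as a point of reference, the classic dynamic time warping it departs from can be sketched as follows, assuming per-frame joint-coordinate vectors.

```python
import numpy as np

def dtw_cost(a, b):
    """Classic dynamic time warping between two skeleton sequences.

    a: (T1, D), b: (T2, D) arrays of per-frame joint-coordinate vectors.
    Returns the cumulative alignment cost. A-MAL's warping is instead
    anchored on detected points-of-interest, which this baseline omits.
    """
    T1, T2 = len(a), len(b)
    D = np.full((T1 + 1, T2 + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, T1 + 1):
        for j in range(1, T2 + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[T1, T2]
```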
The Hitchhiker’s Guide to LDA
Title | The Hitchhiker’s Guide to LDA |
Authors | Chen Ma |
Abstract | The Latent Dirichlet Allocation (LDA) model is a well-known model in the topic modeling field; it has been studied for years due to its extensive application value in industry and academia. However, the mathematical derivation of the LDA model is challenging, which makes it difficult for beginners to learn. To help beginners learn LDA, this book analyzes the mathematical derivation of LDA in detail and introduces all the necessary background knowledge to make it easy to understand. The book also contains the author’s unique insights. It should be noted that this book is written in Chinese. |
Tasks | |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.03142v2 |
PDF | https://arxiv.org/pdf/1908.03142v2.pdf |
PWC | https://paperswithcode.com/paper/the-hitchhikers-guide-to-lda |
Repo | https://github.com/MachineIntellect/GibbsLDA_plus |
Framework | none |
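The derivation such a guide builds toward is the standard collapsed Gibbs sampling update used by Gibbs-sampling LDA implementations like the linked repo: p(z = k | rest) is proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta). A minimal NumPy sketch of one such update (variable names are ours, not the repo's):

```python
import numpy as np

def gibbs_update(d, w, z_old, ndk, nkw, nk, alpha, beta):
    """One collapsed Gibbs sampling step for LDA.

    Resamples the topic of a single token (document d, word id w) from
    p(z=k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta).
    ndk: (D, K) doc-topic counts, nkw: (K, V) topic-word counts,
    nk: (K,) topic totals.
    """
    K, V = nkw.shape
    # Remove the token's old assignment from the counts.
    ndk[d, z_old] -= 1; nkw[z_old, w] -= 1; nk[z_old] -= 1
    # Full conditional over topics (unnormalized).
    p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
    z_new = np.random.choice(K, p=p / p.sum())
    # Add the token back under its new topic.
    ndk[d, z_new] += 1; nkw[z_new, w] += 1; nk[z_new] += 1
    return z_new
```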
Anchor Diffusion for Unsupervised Video Object Segmentation
Title | Anchor Diffusion for Unsupervised Video Object Segmentation |
Authors | Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H. S. Torr |
Abstract | Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow. Despite their complexity, these kinds of approaches tend to favour short-term temporal dependencies and are thus prone to accumulating inaccuracies, which cause drift over time. Moreover, simple (static) image segmentation models, alone, can perform competitively against these methods, which further suggests that the way temporal dependencies are modelled should be reconsidered. Motivated by these observations, in this paper we explore simple yet effective strategies to model long-term temporal dependencies. Inspired by the non-local operators of [70], we introduce a technique to establish dense correspondences between pixel embeddings of a reference “anchor” frame and the current one. This allows the learning of pairwise dependencies at arbitrarily long distances without conditioning on intermediate frames. Without online supervision, our approach can suppress the background and precisely segment the foreground object even in challenging scenarios, while maintaining consistent performance over time. With a mean IoU of 81.7%, our method ranks first on the DAVIS-2016 leaderboard of unsupervised methods, while still being competitive against state-of-the-art online semi-supervised approaches. We further evaluate our method on the FBMS dataset and the ViSal video saliency dataset, showing results competitive with the state of the art. |
Tasks | Optical Flow Estimation, Semantic Segmentation, Unsupervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10895v1 |
PDF | https://arxiv.org/pdf/1910.10895v1.pdf |
PWC | https://paperswithcode.com/paper/anchor-diffusion-for-unsupervised-video-1 |
Repo | https://github.com/yz93/anchor-diff-VOS |
Framework | pytorch |
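The anchor mechanism boils down to a non-local, dot-product correspondence between the anchor frame's pixel embeddings and the current frame's. Below is a hedged PyTorch sketch of that operation; the paper's exact normalization and propagation details may differ.

```python
import torch
import torch.nn.functional as F

def anchor_attention(anchor_feat, current_feat):
    """Dense anchor-to-current correspondence via dot-product attention.

    anchor_feat, current_feat: (B, C, H, W) pixel embeddings.
    Each current-frame pixel aggregates anchor-frame features weighted
    by embedding similarity, a non-local operation in the spirit of
    the paper (exact details are an assumption here).
    """
    B, C, H, W = anchor_feat.shape
    a = anchor_feat.flatten(2)                      # (B, C, HW_anchor)
    c = current_feat.flatten(2)                     # (B, C, HW_cur)
    affinity = torch.bmm(c.transpose(1, 2), a)      # (B, HW_cur, HW_anchor)
    weights = F.softmax(affinity / C ** 0.5, dim=-1)
    out = torch.bmm(a, weights.transpose(1, 2))     # (B, C, HW_cur)
    return out.view(B, C, H, W)
```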
Understanding the Impact of Label Granularity on CNN-based Image Classification
Title | Understanding the Impact of Label Granularity on CNN-based Image Classification |
Authors | Zhuo Chen, Ruizhou Ding, Ting-Wu Chin, Diana Marculescu |
Abstract | In recent years, supervised learning using Convolutional Neural Networks (CNNs) has achieved great success in image classification tasks, and large-scale labeled datasets have contributed significantly to this achievement. However, the definition of a label is often application dependent. For example, an image of a cat can be labeled as “cat” or, more specifically, “Persian cat.” We refer to this as label granularity. In this paper, we conduct extensive experiments using various datasets to demonstrate and analyze how and why training based on fine-grain labeling, such as “Persian cat,” can improve CNN accuracy on classifying coarse-grain classes, in this case “cat.” The experimental results show that training CNNs with fine-grain labels improves both the network’s optimization and generalization capabilities, as intuitively it encourages the network to learn more features, and hence increases classification accuracy on coarse-grain classes on all datasets considered. Moreover, fine-grain labels enhance data efficiency in CNN training. For example, a CNN trained with fine-grain labels and only 40% of the total training data can achieve higher accuracy than a CNN trained with the full training dataset and coarse-grain labels. These results point to two possible applications of this work: (i) with sufficient human resources, one can improve CNN performance by re-labeling the dataset with fine-grain labels, and (ii) with limited human resources, to improve CNN performance, rather than collecting more training data, one may instead use fine-grain labels for the dataset. We further propose a metric called Average Confusion Ratio to characterize the effectiveness of fine-grain labeling, and show its use through extensive experimentation. Code is available at https://github.com/cmu-enyac/Label-Granularity. |
Tasks | Image Classification |
Published | 2019-01-21 |
URL | http://arxiv.org/abs/1901.07012v1 |
PDF | http://arxiv.org/pdf/1901.07012v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-the-impact-of-label-granularity |
Repo | https://github.com/cmu-enyac/Label-Granularity |
Framework | pytorch |
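The evaluation protocol implied by the abstract, training on fine-grain labels but measuring coarse-grain accuracy, only requires collapsing fine predictions through a fine-to-coarse map. A minimal sketch; the mapping array and the collapse-by-argmax rule are our assumptions, and one could instead sum fine-class probabilities per coarse class.

```python
import numpy as np

def coarse_accuracy(fine_logits, coarse_labels, fine_to_coarse):
    """Evaluate a fine-grain-trained classifier on coarse classes.

    fine_logits:    (N, n_fine) network outputs over fine labels.
    coarse_labels:  (N,) ground-truth coarse class ids.
    fine_to_coarse: (n_fine,) int array mapping each fine label to its
                    coarse class, e.g. persian_cat -> cat.
    """
    fine_pred = fine_logits.argmax(axis=1)       # predicted fine label
    coarse_pred = fine_to_coarse[fine_pred]      # collapse to coarse label
    return (coarse_pred == coarse_labels).mean()
```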
Improving the robustness of ImageNet classifiers using elements of human visual cognition
Title | Improving the robustness of ImageNet classifiers using elements of human visual cognition |
Authors | A. Emin Orhan, Brenden M. Lake |
Abstract | We investigate the robustness properties of image recognition models equipped with two features inspired by human vision, an explicit episodic memory and a shape bias, at the ImageNet scale. As reported in previous work, we show that an explicit episodic memory improves the robustness of image recognition models against small-norm adversarial perturbations under some threat models. It does not, however, improve the robustness against more natural, and typically larger, perturbations. Learning more robust features during training appears to be necessary for robustness in this second sense. We show that features derived from a model that was encouraged to learn global, shape-based representations (Geirhos et al., 2019) not only improve the robustness against natural perturbations but, when used in conjunction with an episodic memory, also provide additional robustness against adversarial perturbations. Finally, we address three important design choices for the episodic memory: memory size, dimensionality of the memories and the retrieval method. We show that to make the episodic memory more compact, it is preferable to reduce the number of memories by clustering them instead of reducing their dimensionality. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08416v2 |
PDF | https://arxiv.org/pdf/1906.08416v2.pdf |
PWC | https://paperswithcode.com/paper/improving-the-robustness-of-imagenet |
Repo | https://github.com/eminorhan/robust-vision |
Framework | pytorch |
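An explicit episodic memory of the kind studied here is essentially a cache of labeled feature embeddings queried by similarity at test time. The sketch below is schematic, with an illustrative retrieval rule (k and the similarity kernel are our choices); per the paper's finding, one would compress `keys` by clustering the memories, e.g. class-wise k-means centroids, rather than by reducing their dimensionality.

```python
import numpy as np

class EpisodicMemory:
    """Cache of (embedding, label) pairs queried by cosine similarity.

    A schematic stand-in for a cache-style episodic memory; the
    retrieval rule here is an illustrative choice, not the paper's.
    """
    def __init__(self, keys, labels, n_classes):
        # keys: (M, d) stored embeddings; labels: (M,) int class ids.
        self.keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
        self.labels, self.n_classes = labels, n_classes

    def predict(self, query, k=50):
        q = query / np.linalg.norm(query)
        sims = self.keys @ q                      # cosine similarities
        top = np.argsort(sims)[-k:]               # k nearest memories
        # Similarity-weighted vote over the retrieved labels.
        votes = np.bincount(self.labels[top], weights=sims[top],
                            minlength=self.n_classes)
        return votes.argmax()
```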
Fact Discovery from Knowledge Base via Facet Decomposition
Title | Fact Discovery from Knowledge Base via Facet Decomposition |
Authors | Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam |
Abstract | During the past few decades, knowledge bases (KBs) have experienced rapid growth. Nevertheless, most KBs still suffer from serious incompleteness. Researchers have proposed many tasks, such as knowledge base completion and relation prediction, to help build the representation of KBs. However, some issues remain unsettled when it comes to enriching KBs. Knowledge base completion and relation prediction assume that we know two elements of a fact triple and are going to predict the missing one. This assumption is too restrictive in practice and prevents these tasks from discovering new facts directly. To address this issue, we propose a new task, namely, fact discovery from knowledge base. This task only requires that we know the head entity, and the goal is to discover facts associated with it. To tackle this new problem, we propose a novel framework that decomposes the discovery problem into several facet discovery components. We also propose a novel auto-encoder based facet component to estimate some facets of the fact. Besides, we propose a feedback learning component to share information between the facets. We evaluate our framework using a benchmark dataset, and the experimental results show that our framework achieves promising results. We also conduct extensive analysis of our framework in discovering different kinds of facts. The source code of this paper can be obtained from https://github.com/thunlp/FFD. |
Tasks | Knowledge Base Completion |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09540v1 |
PDF | http://arxiv.org/pdf/1904.09540v1.pdf |
PWC | https://paperswithcode.com/paper/fact-discovery-from-knowledge-base-via-facet |
Repo | https://github.com/thunlp/FFD |
Framework | none |
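Schematically, facet decomposition turns head-only fact discovery into scoring every candidate (head, r, t) with several facet components and combining the scores. The sketch below shows only that outer loop; the actual auto-encoder based facet models, the feedback learning component, and the true combination rule live in the paper and its repo, so everything here is illustrative.

```python
import numpy as np

def discover_facts(head, facet_scorers, relations, tails, top_n=10):
    """Generic shape of facet-decomposed fact discovery.

    Given only a head entity, every candidate (head, r, t) is scored by
    each facet component and the scores are combined (multiplicatively
    here, as an assumption). facet_scorers is a list of callables
    scorer(head, r, t) -> float in [0, 1].
    """
    candidates = [(head, r, t) for r in relations for t in tails]
    scores = np.ones(len(candidates))
    for scorer in facet_scorers:          # e.g. a relation facet, a tail facet
        scores *= np.array([scorer(*c) for c in candidates])
    order = np.argsort(scores)[::-1][:top_n]
    return [candidates[i] for i in order]
```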
Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned
Title | Knowledge Graph Entity Alignment with Graph Convolutional Networks: Lessons Learned |
Authors | Max Berrendorf, Evgeniy Faerman, Valentyn Melnychuk, Volker Tresp, Thomas Seidl |
Abstract | In this work, we focus on the problem of entity alignment in Knowledge Graphs (KG) and we report on our experiences when applying a Graph Convolutional Network (GCN) based model for this task. Variants of GCN are used in multiple state-of-the-art approaches, and therefore it is important to understand the specifics and limitations of GCN-based models. Despite serious efforts, we were not able to fully reproduce the results from the original paper, and after a thorough audit of the code provided by the authors, we concluded that their implementation differs from the architecture described in the paper. In addition, several tricks are required to make the model work, and some of them are not very intuitive. We provide an extensive ablation study to quantify the effects these tricks and changes of architecture have on the final performance. Furthermore, we examine current evaluation approaches and systematize available benchmark datasets. We believe that people interested in KG matching might profit from our work, as well as novices entering the field. |
Tasks | Entity Alignment, Knowledge Graphs |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08342v2 |
PDF | https://arxiv.org/pdf/1911.08342v2.pdf |
PWC | https://paperswithcode.com/paper/knowledge-graph-entity-alignment-with-graph |
Repo | https://github.com/Valentyn1997/kg-alignment-lessons-learned |
Framework | pytorch |
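GCN-based entity alignment models of the family audited here typically embed both KGs with a GCN and train a margin loss that pulls known aligned pairs together while pushing corrupted pairs apart. A sketch of that standard objective follows; the studied model's exact loss and the "tricks" the ablation covers differ in detail.

```python
import torch
import torch.nn.functional as F

def alignment_margin_loss(emb1, emb2, pos_pairs, neg_pairs, margin=3.0):
    """Margin-based alignment loss typical of GCN entity-alignment models.

    emb1, emb2: (n1, d), (n2, d) GCN embeddings of the two KGs.
    pos_pairs:  (P, 2) long tensor of known aligned entity indices.
    neg_pairs:  (P, 2) long tensor of corrupted (non-aligned) pairs.
    L1 distance and the margin value are conventional choices.
    """
    d_pos = F.pairwise_distance(emb1[pos_pairs[:, 0]],
                                emb2[pos_pairs[:, 1]], p=1)
    d_neg = F.pairwise_distance(emb1[neg_pairs[:, 0]],
                                emb2[neg_pairs[:, 1]], p=1)
    # Aligned pairs should be closer than corrupted pairs by the margin.
    return F.relu(d_pos - d_neg + margin).mean()
```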
Cross-Domain Generalization of Neural Constituency Parsers
Title | Cross-Domain Generalization of Neural Constituency Parsers |
Authors | Daniel Fried, Nikita Kitaev, Dan Klein |
Abstract | Neural parsers obtain state-of-the-art results on benchmark treebanks for constituency parsing – but to what degree do they generalize to other domains? We present three results about the generalization of neural parsers in a zero-shot setting: training on trees from one corpus and evaluating on out-of-domain corpora. First, neural and non-neural parsers generalize comparably to new domains. Second, incorporating pre-trained encoder representations into neural parsers substantially improves their performance across all domains, but does not give a larger relative improvement for out-of-domain treebanks. Finally, despite the rich input representations they learn, neural parsers still benefit from structured output prediction of output trees, yielding higher exact match accuracy and stronger generalization both to larger text spans and to out-of-domain corpora. We analyze generalization on English and Chinese corpora, and in the process obtain state-of-the-art parsing results for the Brown, Genia, and English Web treebanks. |
Tasks | Constituency Parsing, Domain Generalization |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04347v1 |
PDF | https://arxiv.org/pdf/1907.04347v1.pdf |
PWC | https://paperswithcode.com/paper/cross-domain-generalization-of-neural |
Repo | https://github.com/dpfried/rnng-bert |
Framework | tf |
SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version)
Title | SAI: a Sensible Artificial Intelligence that plays with handicap and targets high scores in 9x9 Go (extended version) |
Authors | Francesco Morandin, Gianluca Amato, Marco Fantozzi, Rosa Gini, Carlo Metta, Maurizio Parton |
Abstract | We develop a new model that can be applied to any perfect-information two-player zero-sum game to target a high score, and thus perfect play. We integrate this model into the Monte Carlo tree search-policy iteration learning pipeline introduced by Google DeepMind with AlphaGo. Training this model on 9x9 Go produces a superhuman Go player, thus proving that it is stable and robust. We show that this model can be used to effectively play with both positional and score handicap, and to minimize suboptimal moves. We develop a family of agents that can target high scores against any opponent, and recover from very severe disadvantage against weak opponents. To the best of our knowledge, these are the first effective achievements in this direction. |
Tasks | |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10863v3 |
PDF | https://arxiv.org/pdf/1905.10863v3.pdf |
PWC | https://paperswithcode.com/paper/sai-a-sensible-artificial-intelligence-that-1 |
Repo | https://github.com/sai-dev/sai |
Framework | none |
A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts
Title | A Deep Learning-Based Approach for Measuring the Domain Similarity of Persian Texts |
Authors | Hossein Keshavarz, Shohreh Tabatabayi Seifi, Mohammad Izadi |
Abstract | In this paper, we propose a novel approach for measuring the degree of similarity between the categories of two pieces of Persian text, which were published as descriptions of two separate advertisements. We built an appropriate dataset for this work using a dataset consisting of advertisements posted on an e-commerce website. We generated a significant number of paired texts from this dataset and assigned each pair a score from 0 to 3, which indicates the degree of similarity between the domains of the pair. In this work, we represent words with word embedding vectors derived from word2vec. Deep neural network models are then used to represent the texts. Eventually, we employ the concatenation of the absolute difference and the element-wise multiplication of the two text representations, followed by a fully-connected neural network, to produce a probability distribution vector over the score of each pair. Through a supervised learning approach, we trained our model on a GPU, and our best model achieved an F1 score of 0.9865. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.09690v2 |
PDF | https://arxiv.org/pdf/1909.09690v2.pdf |
PWC | https://paperswithcode.com/paper/a-deep-learning-based-approach-for-measuring |
Repo | https://github.com/hossein-kshvrz/text_domain_similarity |
Framework | none |
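The matching head is described concretely enough to sketch: the two text representations a and b are compared via |a - b| and a * b, concatenated, and mapped by a fully-connected network to a distribution over the scores 0 to 3. In the PyTorch sketch below, the upstream text encoders are omitted and the hidden sizes are illustrative choices, not the paper's.

```python
import torch
import torch.nn as nn

class DomainSimilarityHead(nn.Module):
    """Matching head from the abstract: compare two text representations
    via absolute difference and element-wise product, then map the
    concatenation to a distribution over similarity scores 0..3.
    Dimensions here are hypothetical."""
    def __init__(self, text_dim=300, hidden=128, n_scores=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * text_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_scores),
        )

    def forward(self, a, b):
        # a, b: (batch, text_dim) encoded advertisement descriptions.
        feats = torch.cat([(a - b).abs(), a * b], dim=-1)
        return self.mlp(feats).softmax(dim=-1)   # P(score = 0..3)
```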
PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images
Title | PUNCH: Positive UNlabelled Classification based information retrieval in Hyperspectral images |
Authors | Anirban Santara, Jayeeta Datta, Sourav Sarkar, Ankur Garg, Kirti Padia, Pabitra Mitra |
Abstract | Hyperspectral images of land-cover captured by airborne or satellite-mounted sensors provide a rich source of information about the chemical composition of the materials present in a given place. This makes hyperspectral imaging an important tool for earth sciences, land-cover studies, and military and strategic applications. However, the scarcity of labeled training examples and spatial variability of spectral signature are two of the biggest challenges faced by hyperspectral image classification. In order to address these issues, we aim to develop a framework for material-agnostic information retrieval in hyperspectral images based on Positive-Unlabelled (PU) classification. Given a hyperspectral scene, the user labels some positive samples of a material he/she is looking for and our goal is to retrieve all the remaining instances of the query material in the scene. Additionally, we require the system to work equally well for any material in any scene without the user having to disclose the identity of the query material. This material-agnostic nature of the framework provides it with superior generalization abilities. We explore two alternative approaches to solve the hyperspectral image classification problem within this framework. The first approach is an adaptation of non-negative risk estimation based PU learning for hyperspectral data. The second approach is based on one-versus-all positive-negative classification where the negative class is approximately sampled using a novel spectral-spatial retrieval model. We propose two annotator models, uniform and blob, that represent the labelling patterns of a human annotator. We compare the performances of the proposed algorithms for each annotator model on three benchmark hyperspectral image datasets: Indian Pines, Pavia University and Salinas. |
Tasks | Hyperspectral Image Classification, Image Classification, Information Retrieval |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04547v1 |
PDF | http://arxiv.org/pdf/1904.04547v1.pdf |
PWC | https://paperswithcode.com/paper/punch-positive-unlabelled-classification |
Repo | https://github.com/HSISeg/HSISeg |
Framework | none |
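The first approach adapts non-negative risk estimation (nnPU, Kiryo et al., 2017), whose estimator is standard and worth stating: R = pi * R_p^+ + max(0, R_u^- - pi * R_p^-), where pi is the class prior of the query material. A PyTorch sketch under stated assumptions: the prior must be estimated in practice, and the softplus (logistic) surrogate loss is our choice.

```python
import torch

def nn_pu_risk(pos_out, unl_out, prior, loss=torch.nn.functional.softplus):
    """Non-negative PU risk estimator (Kiryo et al., 2017).

    pos_out: raw scores for labelled-positive pixels.
    unl_out: raw scores for unlabelled pixels.
    prior:   class prior pi of the positive (query-material) class.
    A positive score means "predicted as the query material"; softplus(-z)
    and softplus(z) serve as the positive- and negative-label losses.
    """
    r_pos = loss(-pos_out).mean()        # positives classified as positive
    r_pos_neg = loss(pos_out).mean()     # positives treated as negative
    r_unl_neg = loss(unl_out).mean()     # unlabelled treated as negative
    # Clamp keeps the estimated negative risk non-negative.
    neg_risk = r_unl_neg - prior * r_pos_neg
    return prior * r_pos + torch.clamp(neg_risk, min=0.0)
```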
Explicit Sentence Compression for Neural Machine Translation
Title | Explicit Sentence Compression for Neural Machine Translation |
Authors | Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao |
Abstract | State-of-the-art Transformer-based neural machine translation (NMT) systems still follow a standard encoder-decoder framework, in which source sentence representation can be well done by an encoder with a self-attention mechanism. Though a Transformer-based encoder may effectively capture general information in its resulting source sentence representation, the backbone information, which stands for the gist of a sentence, is not specifically focused on. In this paper, we propose an explicit sentence compression method to enhance the source sentence representation for NMT. In practice, an explicit sentence compression goal is used to learn the backbone information in a sentence. We propose three ways, namely backbone source-side fusion, target-side fusion, and both-side fusion, to integrate the compressed sentence into NMT. Our empirical tests on the WMT English-to-French and English-to-German translation tasks show that the proposed sentence compression method significantly improves the translation performance over strong baselines. |
Tasks | Machine Translation, Sentence Compression |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.11980v1 |
PDF | https://arxiv.org/pdf/1912.11980v1.pdf |
PWC | https://paperswithcode.com/paper/explicit-sentence-compression-for-neural |
Repo | https://github.com/bcmi220/esc4nmt |
Framework | pytorch |
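One plausible reading of the source-side fusion (the paper proposes three fusion variants, and the exact wiring may differ from this sketch) is an attention-plus-gate module that lets each source position consult the compressed "backbone" representation:

```python
import torch
import torch.nn as nn

class SourceSideFusion(nn.Module):
    """Gated fusion of a compressed-sentence representation into the
    NMT encoder output. This is a hedged illustration of source-side
    fusion, not the paper's exact module; sizes are illustrative.
    """
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, src_states, comp_states):
        # src_states:  (B, S, d) encoder states of the full sentence.
        # comp_states: (B, C, d) states of the compressed sentence.
        ctx, _ = self.attn(src_states, comp_states, comp_states)
        g = torch.sigmoid(self.gate(torch.cat([src_states, ctx], dim=-1)))
        # Per-position gate mixes original and backbone-aware context.
        return g * src_states + (1 - g) * ctx
```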
Diversity Transfer Network for Few-Shot Learning
Title | Diversity Transfer Network for Few-Shot Learning |
Authors | Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xinyu Zhang, Chang Huang, Wenyu Liu, Bo Wang |
Abstract | Few-shot learning is a challenging task that aims at training a classifier for unseen classes with only a few training examples. The main difficulty of few-shot learning lies in the lack of intra-class diversity within insufficient training samples. To alleviate this problem, we propose a novel generative framework, Diversity Transfer Network (DTN), that learns to transfer latent diversities from known categories and composite them with support features to generate diverse samples for novel categories in feature space. The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works. Besides, an organized auxiliary task co-training over known categories is proposed to stabilize the meta-training process of DTN. We perform extensive experiments and ablation studies on three datasets, i.e., miniImageNet, CIFAR100 and CUB. The results show that DTN, with single-stage training and faster convergence speed, obtains the state-of-the-art results among the feature generation based few-shot learning methods. Code and supplementary material are available at: https://github.com/Yuxin-CV/DTN |
Tasks | Few-Shot Learning |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13182v1 |
PDF | https://arxiv.org/pdf/1912.13182v1.pdf |
PWC | https://paperswithcode.com/paper/diversity-transfer-network-for-few-shot |
Repo | https://github.com/Yuxin-CV/DTN |
Framework | pytorch |
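The generative step can be pictured as follows: the intra-class variation of a reference pair drawn from a known category is re-applied to a novel-class support feature to synthesize an extra sample, which then feeds the meta-classification loss. The sketch below is schematic; the layer sizes and the residual composition are our assumptions, not DTN's published architecture.

```python
import torch
import torch.nn as nn

class DiversityGenerator(nn.Module):
    """Schematic diversity transfer in feature space.

    support: (B, d) feature of a novel-class support sample.
    ref_a, ref_b: (B, d) features of two samples from the same known
    class, whose difference carries intra-class variation.
    """
    def __init__(self, feat_dim=640):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
        )

    def forward(self, support, ref_a, ref_b):
        diversity = ref_a - ref_b    # variation within a known class
        # Composite the transferred variation with the support feature.
        return self.net(torch.cat([support, diversity], dim=-1)) + support
```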