February 1, 2020

Paper Group AWR 228

Large-Scale Multi-Label Text Classification on EU Legislation


Title	Large-Scale Multi-Label Text Classification on EU Legislation
Authors	Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Ion Androutsopoulos
Abstract	We consider Large-Scale Multi-Label Text Classification (LMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, annotated with ~4.3k EUROVOC labels, which is suitable for LMTC, few- and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with label-wise attention perform better than other current state of the art methods. Domain-specific WORD2VEC and context-sensitive ELMO embeddings further improve performance. We also find that considering only particular zones of the documents is sufficient. This allows us to bypass BERT’s maximum text length limit and fine-tune BERT, obtaining the best results in all but zero-shot learning cases.
Tasks	Multi-Label Text Classification, Text Classification, Zero-Shot Learning
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02192v1
PDF	https://arxiv.org/pdf/1906.02192v1.pdf
PWC	https://paperswithcode.com/paper/large-scale-multi-label-text-classification-1
Repo	https://github.com/iliaschalkidis/lmtc-eurlex57k
Framework	tf
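
The label-wise attention mentioned in the abstract can be illustrated with a minimal pure-Python sketch. It assumes each label owns a query vector that scores every token state (for example, BiGRU outputs), so that each label pools its own document vector. The function names, the dot-product scoring, and the per-label sigmoid output are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def label_wise_attention(token_states, label_queries):
    """Pool one document vector per label.

    token_states: list of T hidden vectors of size d (e.g. BiGRU outputs).
    label_queries: list of L query vectors of size d (hypothetical learned
    parameters, one per label).
    """
    dim = len(token_states[0])
    doc_vectors = []
    for query in label_queries:
        # Each label attends over the tokens with its own weights.
        weights = softmax([dot(state, query) for state in token_states])
        pooled = [
            sum(w * state[i] for w, state in zip(weights, token_states))
            for i in range(dim)
        ]
        doc_vectors.append(pooled)
    return doc_vectors


def label_probabilities(doc_vectors, label_outputs):
    """Independent sigmoid per label, as is usual in multi-label setups."""
    return [
        1.0 / (1.0 + math.exp(-dot(vec, out)))
        for vec, out in zip(doc_vectors, label_outputs)
    ]
```

Because every label has its own attention distribution, two labels can focus on different zones of the same document.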

GEOMetrics: Exploiting Geometric Structure for Graph-Encoded Objects


Title	GEOMetrics: Exploiting Geometric Structure for Graph-Encoded Objects
Authors	Edward J. Smith, Scott Fujimoto, Adriana Romero, David Meger
Abstract	Mesh models are a promising approach for encoding the structure of 3D objects. Current mesh reconstruction systems predict uniformly distributed vertex locations of a predetermined graph through a series of graph convolutions, leading to compromises with respect to performance or resolution. In this paper, we argue that the graph representation of geometric objects allows for additional structure, which should be leveraged for enhanced reconstruction. Thus, we propose a system which properly benefits from the advantages of the geometric structure of graph-encoded objects by introducing (1) a graph convolutional update preserving vertex information; (2) an adaptive splitting heuristic allowing detail to emerge; and (3) a training objective operating both on the local surfaces defined by vertices as well as the global structure defined by the mesh. Our proposed method is evaluated on the task of 3D object reconstruction from images with the ShapeNet dataset, where we demonstrate state-of-the-art performance, both visually and numerically, while having far smaller space requirements by generating adaptive meshes.
Tasks	3D Object Reconstruction, Object Reconstruction
Published	2019-01-31
URL	http://arxiv.org/abs/1901.11461v1
PDF	http://arxiv.org/pdf/1901.11461v1.pdf
PWC	https://paperswithcode.com/paper/geometrics-exploiting-geometric-structure-for
Repo	https://github.com/EdwardSmith1884/GEOMetrics
Framework	pytorch

Mask Combination of Multi-layer Graphs for Global Structure Inference


Title	Mask Combination of Multi-layer Graphs for Global Structure Inference
Authors	Eda Bayram, Dorina Thanou, Elif Vural, Pascal Frossard
Abstract	Structure inference is an important task for network data processing and analysis in data science. In recent years, quite a few approaches have been developed to learn the graph structure underlying a set of observations captured in a data space. Although real-world data is often acquired in settings where relationships are influenced by a priori known rules, this domain knowledge is still not well exploited in structure inference problems. In this paper, we identify the structure of signals defined in a data space whose inner relationships are encoded by multi-layer graphs. We aim at properly exploiting the information originating from each layer to infer the global structure underlying the signals. We thus present a novel method for combining the multiple graphs into a global graph using mask matrices, which are estimated through an optimization problem that accommodates the multi-layer graph information and a signal representation model. The proposed mask combination method also estimates the contribution of each graph layer in the structure of signals. The experiments conducted both on synthetic and real-world data suggest that integrating the multi-layer graph representation of the data in the structure inference framework enhances the learning procedure considerably by adapting to the quality and the quantity of the input data.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.10114v1
PDF	https://arxiv.org/pdf/1910.10114v1.pdf
PWC	https://paperswithcode.com/paper/mask-combination-of-multi-layer-graphs-for
Repo	https://github.com/bayrameda/MaskLearning
Framework	none
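
To make the idea of combining graph layers with mask matrices concrete, the sketch below assumes the global graph is the sum of each layer's adjacency matrix multiplied elementwise by its mask. That combination rule and the name `combine_layers` are assumptions for illustration; the abstract does not spell out the exact form, and the masks here are given by hand rather than estimated through the optimization problem the paper describes.

```python
def combine_layers(layers, masks):
    """Combine graph layers into one global adjacency matrix.

    layers: list of n x n adjacency matrices (nested lists), one per layer.
    masks: list of n x n mask matrices, one per layer (hypothetical values;
    in the paper they are learned, here they are simply inputs).
    Returns the sum over layers of mask * adjacency, taken elementwise.
    """
    n = len(layers[0])
    combined = [[0.0] * n for _ in range(n)]
    for adjacency, mask in zip(layers, masks):
        for i in range(n):
            for j in range(n):
                combined[i][j] += mask[i][j] * adjacency[i][j]
    return combined
```

A mask entry close to zero suppresses that layer's edge in the global graph, while an entry close to one keeps it.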

Neural Imaging Pipelines - the Scourge or Hope of Forensics?


Title	Neural Imaging Pipelines - the Scourge or Hope of Forensics?
Authors	Pawel Korus, Nasir Memon
Abstract	Forensic analysis of digital photographs relies on intrinsic statistical traces introduced at the time of their acquisition or subsequent editing. Such traces are often removed by post-processing (e.g., down-sampling and re-compression applied upon distribution in the Web) which inhibits reliable provenance analysis. Increasing adoption of computational methods within digital cameras further complicates the process and renders explicit mathematical modeling infeasible. While this trend challenges forensic analysis even in near-acquisition conditions, it also creates new opportunities. This paper explores end-to-end optimization of the entire image acquisition and distribution workflow to facilitate reliable forensic analysis at the end of the distribution channel, where state-of-the-art forensic techniques fail. We demonstrate that a neural network can be trained to replace the entire photo development pipeline, and jointly optimized for high-fidelity photo rendering and reliable provenance analysis. Such optimized neural imaging pipeline allowed us to increase image manipulation detection accuracy from approx. 45% to over 90%. The network learns to introduce carefully crafted artifacts, akin to digital watermarks, which facilitate subsequent manipulation detection. Analysis of performance trade-offs indicates that most of the gains can be obtained with only minor distortion. The findings encourage further research towards building more reliable imaging pipelines with explicit provenance-guaranteeing properties.
Tasks	Image Manipulation Detection
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10707v1
PDF	http://arxiv.org/pdf/1902.10707v1.pdf
PWC	https://paperswithcode.com/paper/neural-imaging-pipelines-the-scourge-or-hope
Repo	https://github.com/pkorus/neural-imaging
Framework	tf

Adversarial Representation Learning for Robust Patient-Independent Epileptic Seizure Detection


Title	Adversarial Representation Learning for Robust Patient-Independent Epileptic Seizure Detection
Authors	Xiang Zhang, Lina Yao, Manqing Dong, Zhe Liu, Yu Zhang, Yong Li
Abstract	Objective: Epilepsy is a chronic neurological disorder characterized by the occurrence of spontaneous seizures, which affects about one percent of the world’s population. Most current seizure detection approaches rely strongly on patient history records and thus fail in the patient-independent setting of detecting seizures in new patients. To overcome this limitation, we propose a robust and explainable epileptic seizure detection model that effectively learns from seizure states while eliminating inter-patient noise. Methods: A complex deep neural network model is proposed to learn the pure seizure-specific representation from the raw non-invasive electroencephalography (EEG) signals through adversarial training. Furthermore, to enhance the explainability, we develop an attention mechanism to automatically learn the importance of each EEG channel in the seizure diagnosis procedure. Results: The proposed approach is evaluated over the Temple University Hospital EEG (TUH EEG) database. The experimental results illustrate that our model outperforms the competitive state-of-the-art baselines with low latency. Moreover, the designed attention mechanism is shown to be able to provide fine-grained information for pathological analysis. Conclusion and significance: We propose an effective and efficient patient-independent diagnosis approach for epileptic seizures based on raw EEG signals without manual feature engineering, which is a step toward large-scale deployment for real-life use.
Tasks	EEG, Feature Engineering, Representation Learning, Seizure Detection
Published	2019-09-18
URL	https://arxiv.org/abs/1909.10868v2
PDF	https://arxiv.org/pdf/1909.10868v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-representation-learning-for
Repo	https://github.com/gabi-a/EEG-Literature
Framework	none

Alleviating Feature Confusion for Generative Zero-shot Learning


Title	Alleviating Feature Confusion for Generative Zero-shot Learning
Authors	Jingjing Li, Mengmeng Jing, Ke Lu, Lei Zhu, Yang Yang, Zi Huang
Abstract	Lately, generative adversarial networks (GANs) have been successfully applied to zero-shot learning (ZSL) and achieved state-of-the-art performance. By synthesizing virtual unseen visual features, GAN-based methods convert the challenging ZSL task into a supervised learning problem. However, GAN-based ZSL methods have to train the generator on the seen categories and further apply it to unseen instances. An inevitable issue of such a paradigm is that the synthesized unseen features are prone to seen references and incapable of reflecting the novelty and diversity of real unseen instances. In a nutshell, the synthesized features are confusing. One cannot tell unseen categories from seen ones using the synthesized features. As a result, the synthesized features are too subtle to be classified in generalized zero-shot learning (GZSL), which involves both seen and unseen categories at the test stage. In this paper, we first introduce the feature confusion issue. Then, we propose a new feature generating network, named alleviating feature confusion GAN (AFC-GAN), to address the issue. Specifically, we present a boundary loss which maximizes the decision boundary of seen categories and unseen ones. Furthermore, a novel metric named feature confusion score (FCS) is proposed to quantify the feature confusion. Extensive experiments on five widely used datasets verify that our method is able to outperform previous state-of-the-art methods under both ZSL and GZSL protocols.
Tasks	Zero-Shot Learning
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07615v1
PDF	https://arxiv.org/pdf/1909.07615v1.pdf
PWC	https://paperswithcode.com/paper/alleviating-feature-confusion-for-generative
Repo	https://github.com/lijin118/AFC-GAN
Framework	pytorch

A Variational-Sequential Graph Autoencoder for Neural Architecture Performance Prediction


Title	A Variational-Sequential Graph Autoencoder for Neural Architecture Performance Prediction
Authors	David Friede, Jovita Lukasik, Heiner Stuckenschmidt, Margret Keuper
Abstract	In computer vision research, the process of automating architecture engineering, Neural Architecture Search (NAS), has gained substantial interest. In the past, NAS was hardly accessible to researchers without access to large-scale compute systems, due to very long compute times for the recurrent search and evaluation of new candidate architectures. The NAS-Bench-101 dataset facilitates a paradigm change towards classical methods such as supervised learning to evaluate neural architectures. In this paper, we propose a graph encoder built upon Graph Neural Networks (GNN). We demonstrate the effectiveness of the proposed encoder on NAS performance prediction for seen architecture types as well as unseen ones (i.e., zero-shot prediction). We also provide a new variational-sequential graph autoencoder (VS-GAE) based on the proposed graph encoder. The VS-GAE is specialized in encoding and decoding graphs of varying length utilizing GNNs. Experiments on different sampling methods show that the embedding space learned by our VS-GAE increases the stability on the accuracy prediction task.
Tasks	Neural Architecture Search
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05317v1
PDF	https://arxiv.org/pdf/1912.05317v1.pdf
PWC	https://paperswithcode.com/paper/a-variational-sequential-graph-autoencoder
Repo	https://github.com/jovitalukasik/vs_gae
Framework	pytorch

Towards better substitution-based word sense induction


Title	Towards better substitution-based word sense induction
Authors	Asaf Amrami, Yoav Goldberg
Abstract	Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses. Recent work obtains strong results by clustering lexical substitutes derived from pre-trained RNN language models (ELMo). Adapting the method to BERT improves the scores even further. We extend the previous method to support a dynamic rather than a fixed number of clusters as supported by other prominent methods, and propose a method for interpreting the resulting clusters by associating them with their most informative substitutes. We then perform extensive error analysis revealing the remaining sources of errors in the WSI task. Our code is available at https://github.com/asafamr/bertwsi.
Tasks	Word Sense Induction
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12598v2
PDF	https://arxiv.org/pdf/1905.12598v2.pdf
PWC	https://paperswithcode.com/paper/towards-better-substitution-based-word-sense
Repo	https://github.com/asafamr/bertwsi
Framework	pytorch

Cumulative link models for deep ordinal classification


Title	Cumulative link models for deep ordinal classification
Authors	Víctor-Manuel Vargas, Pedro-Antonio Gutiérrez, César Hervás-Martínez
Abstract	This paper proposes a deep convolutional neural network model for ordinal regression by considering a family of probabilistic ordinal link functions in the output layer. The link functions are those used for cumulative link models, which are traditional statistical linear models based on projecting each pattern into a 1-dimensional space. A set of ordered thresholds splits this space into the different classes of the problem. In our case, the projections are estimated by a non-linear deep neural network. To further improve the results, we combine these ordinal models with a loss function that takes into account the distance between the categories, based on the weighted Kappa index. Three different link functions are studied in the experimental study, and the results are contrasted with statistical analysis. The experiments run over two different ordinal classification problems and the statistical tests confirm that these models improve the results of a nominal model and outperform other robust proposals considered in the literature.
Tasks
Published	2019-05-27
URL	https://arxiv.org/abs/1905.13392v2
PDF	https://arxiv.org/pdf/1905.13392v2.pdf
PWC	https://paperswithcode.com/paper/190513392
Repo	https://github.com/ayrna/deep-ordinal-clm
Framework	tf
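
The threshold mechanism described in the abstract can be sketched as follows: a scalar projection is compared against ordered thresholds, cumulative probabilities come from a link function, and class probabilities are their successive differences. The sketch assumes the logit link, which is one common choice for cumulative link models; the function name and the example thresholds are illustrative, and in the paper the projection would come from the network's final layer.

```python
import math


def sigmoid(z):
    """Logistic function, the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def clm_probabilities(projection, thresholds):
    """Class probabilities of a cumulative link model with a logit link.

    projection: the 1-dimensional value produced for one pattern.
    thresholds: strictly increasing list of K - 1 cut points.
    Returns K probabilities, using P(y <= k) = sigmoid(threshold_k - projection).
    """
    cumulative = [sigmoid(t - projection) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        # P(y = k) is the difference between consecutive cumulative values.
        probabilities.append(value - previous)
        previous = value
    return probabilities
```

For instance, `clm_probabilities(0.0, [-1.0, 1.0])` places the projection midway between the two thresholds, so the middle of the three classes receives the largest probability.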

Gaussian Embedding of Large-scale Attributed Graphs


Title	Gaussian Embedding of Large-scale Attributed Graphs
Authors	Bhagya Hettige, Yuan-Fang Li, Weiqing Wang, Wray Buntine
Abstract	Graph embedding methods transform high-dimensional and complex graph contents into low-dimensional representations. They are useful for a wide range of graph analysis tasks including link prediction, node classification, recommendation and visualization. Most existing approaches represent graph nodes as point vectors in a low-dimensional embedding space, ignoring the uncertainty present in the real-world graphs. Furthermore, many real-world graphs are large-scale and rich in content (e.g. node attributes). In this work, we propose GLACE, a novel, scalable graph embedding method that preserves both graph structure and node attributes effectively and efficiently in an end-to-end manner. GLACE effectively models uncertainty through Gaussian embeddings, and supports inductive inference of new nodes based on their attributes. In our comprehensive experiments, we evaluate GLACE on real-world graphs, and the results demonstrate that GLACE significantly outperforms state-of-the-art embedding methods on multiple graph analysis tasks.
Tasks	Graph Embedding, Link Prediction, Node Classification
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00536v1
PDF	https://arxiv.org/pdf/1912.00536v1.pdf
PWC	https://paperswithcode.com/paper/gaussian-embedding-of-large-scale-attributed
Repo	https://github.com/bhagya-hettige/GLACE
Framework	tf
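
Representing a node as a Gaussian rather than a point means that dissimilarity between nodes has to compare distributions. As a minimal sketch, the function below computes the KL divergence between two Gaussians with diagonal covariance, which is one common choice for such embeddings. Whether GLACE uses exactly this measure is not stated in the abstract, so treat it as an assumption; the function name is illustrative.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for Gaussians with diagonal covariance.

    mu_p, mu_q: mean vectors; var_p, var_q: per-dimension variances (> 0).
    Sums the closed-form one-dimensional KL term over all dimensions.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
    return 0.5 * total
```

The measure is asymmetric: a narrow distribution sitting inside a broad one yields a smaller divergence than the reverse, which lets the variances carry uncertainty information that point vectors cannot.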

Exploiting Parallelism Opportunities with Deep Learning Frameworks


Title	Exploiting Parallelism Opportunities with Deep Learning Frameworks
Authors	Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks
Abstract	State-of-the-art machine learning frameworks support a wide variety of design features to enable a flexible machine learning programming interface and to ease the programmability burden on machine learning developers. Identifying and using a performance-optimal setting in feature-rich frameworks, however, involves a non-trivial amount of performance characterization and domain-specific knowledge. This paper takes a deep dive into analyzing the performance impact of key design features and the role of parallelism. The observations and insights distill into a simple set of guidelines that one can use to achieve much higher training and inference speedup. The evaluation results show that our proposed performance tuning guidelines outperform both the Intel and TensorFlow recommended settings by 1.29x and 1.34x, respectively, across a diverse set of real-world deep learning models.
Tasks
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04705v1
PDF	https://arxiv.org/pdf/1908.04705v1.pdf
PWC	https://paperswithcode.com/paper/exploiting-parallelism-opportunities-with
Repo	https://github.com/Emma926/mcbench
Framework	caffe2

Self-Supervised Learning of 3D Human Pose using Multi-view Geometry


Title	Self-Supervised Learning of 3D Human Pose using Multi-view Geometry
Authors	Muhammed Kocabas, Salih Karagoz, Emre Akbas
Abstract	Training accurate 3D human pose estimators requires a large amount of 3D ground-truth data, which is costly to collect. Various weakly or self-supervised pose estimation methods have been proposed due to the lack of 3D data. Nevertheless, these methods, in addition to 2D ground-truth poses, require either additional supervision in various forms (e.g. unpaired 3D ground-truth data, a small subset of labels) or the camera parameters in multi-view settings. To address these problems, we present EpipolarPose, a self-supervised learning method for 3D human pose estimation, which does not need any 3D ground-truth data or camera extrinsics. During training, EpipolarPose estimates 2D poses from multi-view images, and then utilizes epipolar geometry to obtain a 3D pose and camera geometry, which are subsequently used to train a 3D pose estimator. We demonstrate the effectiveness of our approach on standard benchmark datasets, i.e. Human3.6M and MPI-INF-3DHP, where we set the new state-of-the-art among weakly/self-supervised methods. Furthermore, we propose a new performance measure, Pose Structure Score (PSS), which is a scale-invariant, structure-aware measure to evaluate the structural plausibility of a pose with respect to its ground truth. Code and pretrained models are available at https://github.com/mkocabas/EpipolarPose
Tasks	3D Human Pose Estimation, Pose Estimation
Published	2019-03-06
URL	http://arxiv.org/abs/1903.02330v2
PDF	http://arxiv.org/pdf/1903.02330v2.pdf
PWC	https://paperswithcode.com/paper/self-supervised-learning-of-3d-human-pose
Repo	https://github.com/mkocabas/EpipolarPose
Framework	pytorch

Meta-Inverse Reinforcement Learning with Probabilistic Context Variables


Title	Meta-Inverse Reinforcement Learning with Probabilistic Context Variables
Authors	Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon
Abstract	Providing a suitable reward function to reinforcement learning can be difficult in many real world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations, several major challenges remain. First, existing IRL methods learn reward functions from scratch, requiring large numbers of demonstrations to correctly infer the reward for each task the agent may need to perform. Second, existing methods typically assume homogeneous demonstrations for a single behavior or task, while in practice, it might be easier to collect datasets of heterogeneous but related behaviors. To this end, we propose a deep latent variable model that is capable of learning rewards from demonstrations of distinct but related tasks in an unsupervised way. Critically, our model can infer rewards for new, structurally-similar tasks from a single demonstration. Our experiments on multiple continuous control tasks demonstrate the effectiveness of our approach compared to state-of-the-art imitation and inverse reinforcement learning methods.
Tasks	Continuous Control
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09314v2
PDF	https://arxiv.org/pdf/1909.09314v2.pdf
PWC	https://paperswithcode.com/paper/meta-inverse-reinforcement-learning-with
Repo	https://github.com/ermongroup/MetaIRL
Framework	none

PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation


Title	PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Authors	Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, Jian Sun
Abstract	In this work, we present a novel data-driven method for robust 6DoF object pose estimation from a single RGBD image. Unlike previous methods that directly regress pose parameters, we tackle this challenging task with a keypoint-based approach. Specifically, we propose a deep Hough voting network to detect 3D keypoints of objects and then estimate the 6D pose parameters in a least-squares fitting manner. Our method is a natural extension of 2D-keypoint approaches that successfully work on RGB-based 6DoF estimation. It allows us to fully utilize the geometric constraint of rigid objects with the extra depth information and is easy for a network to learn and optimize. Extensive experiments were conducted to demonstrate the effectiveness of 3D-keypoint detection in the 6D pose estimation task. Experimental results also show our method outperforms the state-of-the-art methods by large margins on several benchmarks. Code and video are available at https://github.com/ethnhe/PVN3D.git.
Tasks	6D Pose Estimation, Keypoint Detection, Pose Estimation
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04231v2
PDF	https://arxiv.org/pdf/1911.04231v2.pdf
PWC	https://paperswithcode.com/paper/pvn3d-a-deep-point-wise-3d-keypoints-voting
Repo	https://github.com/ethnhe/PVN3D
Framework	pytorch

Semantic Product Search


Title	Semantic Product Search
Authors	Priyanka Nigam, Yiwei Song, Vijai Mohan, Vihan Lakshman, Weitian Ding, Ankit Shingavi, Choon Hui Teo, Hao Gu, Bing Yin
Abstract	We study the problem of semantic matching in product search, that is, given a customer query, retrieve all semantically related products from the catalog. Pure lexical matching via an inverted index falls short in this respect due to several factors: a) lack of understanding of hypernyms, synonyms, and antonyms, b) fragility to morphological variants (e.g. “woman” vs. “women”), and c) sensitivity to spelling errors. To address these issues, we train a deep learning model for semantic matching using customer behavior data. Much of the recent work on large-scale semantic search using deep learning focuses on ranking for web search. In contrast, semantic matching for product search presents several novel challenges, which we elucidate in this paper. We address these challenges by a) developing a new loss function that has an inbuilt threshold to differentiate between random negative examples, impressed but not purchased examples, and positive examples (purchased items), b) using average pooling in conjunction with n-grams to capture short-range linguistic patterns, c) using hashing to handle out of vocabulary tokens, and d) using a model parallel training architecture to scale across 8 GPUs. We present compelling offline results that demonstrate at least 4.7% improvement in Recall@100 and 14.5% improvement in mean average precision (MAP) over baseline state-of-the-art semantic search methods using the same tokenization method. Moreover, we present results and discuss learnings from online A/B tests which demonstrate the efficacy of our method.
Tasks	Tokenization
Published	2019-07-01
URL	https://arxiv.org/abs/1907.00937v1
PDF	https://arxiv.org/pdf/1907.00937v1.pdf
PWC	https://paperswithcode.com/paper/semantic-product-search
Repo	https://github.com/ducta-tiki/semranker
Framework	tf
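
Two of the ideas in the abstract lend themselves to a short sketch: a loss with built-in thresholds that treats purchased, impressed-but-not-purchased, and random negative examples differently, and hashing to give out-of-vocabulary tokens an id. The code below is an illustration under assumptions: the threshold values, the hinge form, the bucket count, and the function names are hypothetical and are not taken from the paper.

```python
import zlib


def three_part_hinge(score, label, pos_min=0.9, impressed_max=0.55,
                     random_max=0.2):
    """Threshold-based loss on a query-product similarity score.

    label: "purchased", "impressed" (shown but not bought), or "random".
    The three thresholds are hypothetical defaults chosen for illustration.
    """
    if label == "purchased":
        # Penalize positives that score below the positive threshold.
        return max(0.0, pos_min - score)
    if label == "impressed":
        # Impressed items may score moderately, but not too high.
        return max(0.0, score - impressed_max)
    if label == "random":
        # Random negatives should score low.
        return max(0.0, score - random_max)
    raise ValueError(f"unknown label: {label}")


def token_id(token, vocab, num_buckets=1000):
    """Map a token to an id, hashing unknown tokens into extra buckets.

    vocab: dict from known token to id in range(len(vocab)).
    Unknown tokens get a deterministic id in
    range(len(vocab), len(vocab) + num_buckets) via CRC32.
    """
    if token in vocab:
        return vocab[token]
    return len(vocab) + zlib.crc32(token.encode("utf-8")) % num_buckets
```

CRC32 is used instead of Python's built-in `hash` because string hashes from `hash` vary between interpreter runs, which would make the bucket assignment unstable.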