October 21, 2019

Paper Group AWR 21

Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study. Excitation Dropout: Encouraging Plasticity in Deep Neural Networks. Hand Pose Estimation via Latent 2.5D Heatmap Regression. Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning. Large-Scale Multi-Domain Belief Track …

Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study

Title Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study
Authors Maral Dadvar, Kai Eckert
Abstract Cyberbullying is a disturbing online misbehaviour with troubling consequences. It appears in different forms, and in most social networks it is in textual format. Automatic detection of such incidents requires intelligent systems. Most existing studies have approached this problem with conventional machine learning models, and most of the models developed in these studies are adaptable to only a single social network at a time. In recent studies, deep learning based models have found their way into the detection of cyberbullying incidents, with the claim that they can overcome the limitations of the conventional models and improve detection performance. In this paper, we investigate the findings of one such recent study. We successfully reproduced its findings and validated them using the same datasets, namely Wikipedia, Twitter, and Formspring. We then extended the work by applying the developed methods to a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models on a new social media platform. We also transferred the models trained on one platform to another and evaluated their performance there. Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset. We believe that the deep learning based models could further benefit from integrating other sources of information, such as the profile information of users in social networks.
Tasks
Published 2018-12-19
URL http://arxiv.org/abs/1812.08046v1
PDF http://arxiv.org/pdf/1812.08046v1.pdf
PWC https://paperswithcode.com/paper/cyberbullying-detection-in-social-networks
Repo https://github.com/sweta20/Detecting-Cyberbullying-Across-SMPs
Framework tf
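
The models under study are deep text classifiers trained on word sequences. As a rough illustration of the model family being reproduced, here is a minimal Keras sketch of a word-level LSTM classifier; the vocabulary size, sequence length, and layer sizes are illustrative assumptions, not the authors' settings.

```python
import tensorflow as tf

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 20000, 100, 50  # assumed hyperparameters

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.LSTM(64, dropout=0.25),          # sentence encoder
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P(post is bullying)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=10)
```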

Excitation Dropout: Encouraging Plasticity in Deep Neural Networks

Title Excitation Dropout: Encouraging Plasticity in Deep Neural Networks
Authors Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, Vittorio Murino
Abstract We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction, defined as the firing of neurons along specific paths. In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout. In essence, we drop out with higher probability those neurons which contribute more to decision making at training time. This approach penalizes high-saliency neurons that are most relevant for model prediction, i.e., those having stronger evidence. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to keep minimizing the loss, resulting in a plasticity-like behavior that is also characteristic of human brains. We demonstrate better generalization ability, increased utilization of network neurons, and higher resilience to network compression using several metrics over four image/video recognition benchmarks.
Tasks Decision Making, Video Recognition
Published 2018-05-23
URL https://arxiv.org/abs/1805.09092v2
PDF https://arxiv.org/pdf/1805.09092v2.pdf
PWC https://paperswithcode.com/paper/excitation-dropout-encouraging-plasticity-in
Repo https://github.com/andreazuna89/Excitation-Dropout
Framework caffe2
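
The exact evidence computation uses Excitation Backprop, which is beyond a short snippet, but the dropout mechanism itself can be sketched. Below is a simplified stand-in in PyTorch: neurons are dropped with probability proportional to their normalized evidence, so that the expected dropout rate matches a base rate; the clamping and rescaling details are our assumptions, not the paper's exact formulation.

```python
import torch

def excitation_dropout(x, evidence, base_p=0.5, training=True):
    """Drop neurons with probability proportional to their evidence.
    x, evidence: (batch, num_neurons); evidence is any per-neuron saliency
    (the paper obtains it via Excitation Backprop; here it is assumed given)."""
    if not training:
        return x
    e = evidence / evidence.sum(dim=1, keepdim=True).clamp_min(1e-8)
    p_drop = (base_p * e.size(1) * e).clamp(0.0, 0.95)  # mean rate ~= base_p
    keep = torch.bernoulli(1.0 - p_drop)
    return x * keep / (1.0 - p_drop)                    # inverted-dropout rescale
```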

Hand Pose Estimation via Latent 2.5D Heatmap Regression

Title Hand Pose Estimation via Latent 2.5D Heatmap Regression
Authors Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz
Abstract Estimating the 3D pose of a hand is an essential part of human-computer interaction. Estimating 3D pose using depth or multi-view sensors has become easier with recent advances in computer vision, however, regressing pose from a single RGB image is much less straightforward. The main difficulty arises from the fact that 3D pose requires some form of depth estimates, which are ambiguous given only an RGB image. In this paper we propose a new method for 3D hand pose estimation from a monocular image through a novel 2.5D pose representation. Our new representation estimates pose up to a scaling factor, which can be estimated additionally if a prior of the hand size is given. We implicitly learn depth maps and heatmap distributions with a novel CNN architecture. Our system achieves the state-of-the-art estimation of 2D and 3D hand pose on several challenging datasets in presence of severe occlusions.
Tasks Hand Pose Estimation, Pose Estimation
Published 2018-04-25
URL http://arxiv.org/abs/1804.09534v1
PDF http://arxiv.org/pdf/1804.09534v1.pdf
PWC https://paperswithcode.com/paper/hand-pose-estimation-via-latent-25d-heatmap
Repo https://github.com/Janus-Shiau/ood_confidence_tensorflow
Framework tf
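
The key property of the 2.5D representation is that the absolute (scale-normalized) depth can be recovered in closed form: given 2D joint locations in normalized camera coordinates and root-relative depths, constraining one bone to a known length yields a quadratic in the root depth. The sketch below follows that idea; variable names and the choice of joint pair are our assumptions.

```python
import numpy as np

def recover_root_depth(xn, yn, zn, xm, ym, zm, bone_len=1.0):
    """Recover the absolute root depth Z_root from 2.5D pose.
    (xn, yn), (xm, ym): normalized image coords of a joint pair (K^-1 applied);
    zn, zm: their root-relative depths. With X_i = x_i * (Z_root + z_i), the
    pair's 3D distance equals bone_len, giving a*Z^2 + b*Z + c = 0 in Z_root."""
    dx, dy, dz = xn - xm, yn - ym, zn - zm
    cx = xn * zn - xm * zm
    cy = yn * zn - ym * zm
    a = dx ** 2 + dy ** 2
    b = 2.0 * (dx * cx + dy * cy)
    c = cx ** 2 + cy ** 2 + dz ** 2 - bone_len ** 2
    disc = max(b ** 2 - 4.0 * a * c, 0.0)       # guard against numeric noise
    return (-b + np.sqrt(disc)) / (2.0 * a)     # take the larger (positive) root
```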

Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning

Title Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning
Authors Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan
Abstract Skeleton-based action recognition has made great progress recently, but many problems still remain unsolved. For example, most previous methods model the representations of skeleton sequences without rich spatial structure information or detailed temporal dynamics features. In this paper, we propose a novel model with spatial reasoning and temporal stack learning (SR-TSL) for skeleton-based action recognition, which consists of a spatial reasoning network (SRN) and a temporal stack learning network (TSLN). The SRN captures the high-level spatial structural information within each frame with a residual graph neural network, while the TSLN models the detailed temporal dynamics of skeleton sequences with a composition of multiple skip-clip LSTMs. During training, we propose a clip-based incremental loss to optimize the model. We perform extensive experiments on the SYSU 3D Human-Object Interaction dataset and the NTU RGB+D dataset, and verify the effectiveness of each network in our model. The comparison results illustrate that our approach achieves much better results than state-of-the-art methods.
Tasks Human-Object Interaction Detection, Skeleton Based Action Recognition, Temporal Action Localization
Published 2018-05-07
URL http://arxiv.org/abs/1805.02335v2
PDF http://arxiv.org/pdf/1805.02335v2.pdf
PWC https://paperswithcode.com/paper/skeleton-based-action-recognition-with-1
Repo https://github.com/yfsong0709/RA-GCNv1
Framework pytorch
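
Of the two networks, the clip-based incremental loss is the easiest to convey in a few lines: the model emits class logits after each additional clip, and predictions made with more of the sequence in view are penalized more heavily. The linear weighting below is our simplified reading of the scheme, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def clip_incremental_loss(clip_logits, target):
    """clip_logits: (T, batch, num_classes) logits after clips 1..T;
    target: (batch,) action labels. Later clips get larger weight, so the
    model is pushed to refine its prediction as evidence accumulates."""
    T = clip_logits.size(0)
    losses = [F.cross_entropy(clip_logits[t], target) * (t + 1) / T
              for t in range(T)]
    return torch.stack(losses).sum()
```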

Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing

Title Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing
Authors Osman Ramadan, Paweł Budzianowski, Milica Gašić
Abstract Robust dialogue belief tracking is a key component in maintaining good quality dialogue systems. The tasks that dialogue systems are trying to solve are becoming increasingly complex, requiring scalability to multi-domain, semantically rich dialogues. However, most current approaches have difficulty scaling up with the number of domains because the model parameters depend on the dialogue ontology. In this paper, a novel approach is introduced that fully utilizes semantic similarity between dialogue utterances and the ontology terms, allowing information to be shared across domains. The evaluation is performed on a recently collected multi-domain dialogue dataset, one order of magnitude larger than currently available corpora. Our model demonstrates great capability in handling multi-domain dialogues, while simultaneously outperforming existing state-of-the-art models on single-domain dialogue tracking tasks.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-07-17
URL http://arxiv.org/abs/1807.06517v1
PDF http://arxiv.org/pdf/1807.06517v1.pdf
PWC https://paperswithcode.com/paper/large-scale-multi-domain-belief-tracking-with
Repo https://github.com/osmanio2/multi-domain-belief-tracking
Framework tf
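
The ontology-independence claim can be made concrete with a small sketch: every (slot, value) ontology term is embedded into the same semantic space as the utterance, and tracking reduces to a similarity score, so no parameter count grows with the ontology. The encoders and dimensions below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class OntologyFreeTracker(nn.Module):
    """Score an ontology term against an utterance in a shared space."""
    def __init__(self, embed_dim=300, hidden=64):
        super().__init__()
        self.utt_enc = nn.GRU(embed_dim, hidden, batch_first=True)
        self.term_proj = nn.Linear(embed_dim, hidden)

    def forward(self, utt_embeds, term_embed):
        # utt_embeds: (batch, seq, embed_dim) word vectors of the utterance;
        # term_embed: (batch, embed_dim) embedding of a slot-value term.
        _, h = self.utt_enc(utt_embeds)
        u = h.squeeze(0)                       # utterance summary vector
        t = self.term_proj(term_embed)         # term vector in the same space
        return torch.sigmoid((u * t).sum(-1))  # P(term belongs in belief state)
```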

Explainable and Explicit Visual Reasoning over Scene Graphs

Title Explainable and Explicit Visual Reasoning over Scene Graphs
Authors Jiaxin Shi, Hanwang Zhang, Juanzi Li
Abstract We aim to dismantle the prevalent black-box neural architectures used in complex visual reasoning tasks, into the proposed eXplainable and eXplicit Neural Modules (XNMs), which advance beyond existing neural module networks towards using scene graphs — objects as nodes and the pairwise relationships as edges — for explainable and explicit reasoning with structured knowledge. XNMs allow us to pay more attention to teach machines how to “think”, regardless of what they “look”. As we will show in the paper, by using scene graphs as an inductive bias, 1) we can design XNMs in a concise and flexible fashion, i.e., XNMs merely consist of 4 meta-types, which significantly reduce the number of parameters by 10 to 100 times, and 2) we can explicitly trace the reasoning-flow in terms of graph attentions. XNMs are so generic that they support a wide range of scene graph implementations with various qualities. For example, when the graphs are detected perfectly, XNMs achieve 100% accuracy on both CLEVR and CLEVR CoGenT, establishing an empirical performance upper-bound for visual reasoning; when the graphs are noisily detected from real-world images, XNMs are still robust to achieve a competitive 67.5% accuracy on VQAv2.0, surpassing the popular bag-of-objects attention models without graph structures.
Tasks Visual Reasoning
Published 2018-12-05
URL http://arxiv.org/abs/1812.01855v2
PDF http://arxiv.org/pdf/1812.01855v2.pdf
PWC https://paperswithcode.com/paper/explainable-and-explicit-visual-reasoning
Repo https://github.com/shijx12/XNM-Net
Framework pytorch
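
The abstract's claim of only four meta-types is concrete enough to sketch. Assuming a scene graph with node features, edge features, and soft attention vectors, the XNM meta-modules can be caricatured as the following operations; shapes, names, and the use of sigmoid/min are our assumptions rather than the paper's exact definitions.

```python
import torch

def attend_node(node_feats, query):
    """AttendNode: soft-select the nodes matching a query vector.
    node_feats: (N, d); query: (d,) -> attention (N,)."""
    return torch.sigmoid(node_feats @ query)

def attend_edge(edge_feats, query):
    """AttendEdge: soft-select pairwise relationships.
    edge_feats: (N, N, d); query: (d,) -> attention (N, N)."""
    return torch.sigmoid(edge_feats @ query)

def transfer(node_attn, edge_attn):
    """Transfer: propagate node attention along attended edges, e.g. from
    'the cube' to 'the thing left of the cube'."""
    return (edge_attn.T @ node_attn).clamp(max=1.0)

def logic_and(attn_a, attn_b):
    """Logic: combine attentions; min is a soft AND (max would be OR)."""
    return torch.minimum(attn_a, attn_b)
```

Tracing the attention vectors through such a module chain is what makes the reasoning flow explicit.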

Deep learning for time series classification: a review

Title Deep learning for time series classification: a review
Authors Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller
Abstract Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase in time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising, as deep learning has seen very successful applications in recent years. DNNs have indeed revolutionized the field of computer vision, especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community, in which we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.
Tasks Time Series, Time Series Classification
Published 2018-09-12
URL https://arxiv.org/abs/1809.04356v4
PDF https://arxiv.org/pdf/1809.04356v4.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-time-series-classification
Repo https://github.com/hyungjik/TSC_practice
Framework none
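
Among the architectures benchmarked, the review finds the fully convolutional network (FCN) and ResNet to be the strongest end-to-end baselines. A minimal sketch of the FCN in Keras, using the commonly cited 128/256/128 filter and 8/5/3 kernel configuration:

```python
import tensorflow as tf

def build_fcn(series_len, n_classes):
    """FCN baseline for univariate TSC: three Conv1D blocks, then global
    average pooling instead of flattening, keeping the model small."""
    inp = tf.keras.Input(shape=(series_len, 1))
    x = inp
    for filters, kernel in [(128, 8), (256, 5), (128, 3)]:
        x = tf.keras.layers.Conv1D(filters, kernel, padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    out = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inp, out)
```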

Towards generative adversarial networks as a new paradigm for radiology education

Title Towards generative adversarial networks as a new paradigm for radiology education
Authors Samuel G. Finlayson, Hyunkwang Lee, Isaac S. Kohane, Luke Oakden-Rayner
Abstract Medical students and radiology trainees typically view thousands of images in order to “train their eye” to detect the subtle visual patterns necessary for diagnosis. Nevertheless, infrastructural and legal constraints often make it difficult to access and quickly query an abundance of images with a user-specified feature set. In this paper, we use a conditional generative adversarial network (GAN) to synthesize $1024\times1024$ pixel pelvic radiographs that can be queried with conditioning on fracture status. We demonstrate that the conditional GAN learns features that distinguish fractures from non-fractures by training a convolutional neural network exclusively on images sampled from the GAN and achieving an AUC of $>0.95$ on a held-out set of real images. We conduct additional analysis of the images sampled from the GAN and describe ongoing work to validate educational efficacy.
Tasks
Published 2018-12-04
URL http://arxiv.org/abs/1812.01547v1
PDF http://arxiv.org/pdf/1812.01547v1.pdf
PWC https://paperswithcode.com/paper/towards-generative-adversarial-networks-as-a
Repo https://github.com/namkugkim/Study_deeplearning
Framework pytorch
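
The evaluation logic is worth spelling out, since it is what supports the educational claim: the classifier never sees a real radiograph during training, yet discriminates fractures on real held-out data. A control-flow sketch, where `generator` and `classifier` are placeholders for the actual models (the specific methods on them are assumptions):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_gan_as_teacher(generator, classifier,
                            real_images, real_labels, n_synth=10000):
    """Train on GAN samples only; measure AUC on held-out real images."""
    labels = np.random.randint(0, 2, size=n_synth)   # fracture / no fracture
    synth = generator.sample(labels)                 # label-conditioned sampling
    classifier.fit(synth, labels)                    # synthetic data only
    scores = classifier.predict_proba(real_images)[:, 1]
    return roc_auc_score(real_labels, scores)        # paper reports > 0.95
```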

MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval

Title MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval
Authors Xin Liu, Zhikai Hu, Haibin Ling, Yiu-ming Cheung
Abstract Hashing has recently sparked a great revolution in cross-modal retrieval because of its low storage cost and high query speed. Recent cross-modal hashing methods often learn unified or equal-length hash codes to represent the multi-modal data and make them intuitively comparable. However, such unified or equal-length hash representations inherently sacrifice representation scalability, because the data from different modalities may not have one-to-one correspondence and could be encoded more efficiently by hash codes of unequal lengths. To mitigate these problems, this paper exploits a related and relatively unexplored problem: encoding the heterogeneous data with varying hash lengths and generalizing cross-modal retrieval to various challenging scenarios. To this end, a generalized and flexible cross-modal hashing framework, termed Matrix Tri-Factorization Hashing (MTFH), is proposed to work seamlessly in various settings, including paired or unpaired multi-modal data and equal or varying hash length encoding scenarios. More specifically, MTFH exploits an efficient objective function to flexibly learn the modality-specific hash codes with different length settings, while synchronously learning two semantic correlation matrices that make the different hash representations of heterogeneous data semantically comparable. As a result, the derived hash codes are more semantically meaningful for various challenging cross-modal retrieval tasks. Extensive experiments on public benchmark datasets highlight the superiority of MTFH under various retrieval scenarios and show its competitive performance against the state of the art.
Tasks Cross-Modal Retrieval, Semantic Textual Similarity
Published 2018-05-04
URL https://arxiv.org/abs/1805.01963v2
PDF https://arxiv.org/pdf/1805.01963v2.pdf
PWC https://paperswithcode.com/paper/mtfh-a-matrix-tri-factorization-hashing
Repo https://github.com/starxliu/MTFH
Framework none
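
The tri-factorization at the heart of MTFH can be caricatured in a few lines: the semantic affinity matrix S between the two modalities is approximated as B1^T R B2, where B1 and B2 are hash codes of different lengths and R is a real-valued correlation matrix that makes them comparable. The alternating updates below are naive heuristics for illustration only, not the paper's optimization.

```python
import numpy as np

def _sign(M):
    return np.where(M >= 0, 1.0, -1.0)

def mtfh_sketch(S, q1=32, q2=64, iters=10):
    """S: (n1, n2) semantic affinity between modalities.
    Returns hash codes B1 (q1, n1), B2 (q2, n2) of *unequal* lengths and the
    correlation matrix R (q1, q2) linking them, so that S ~= B1.T @ R @ B2."""
    n1, n2 = S.shape
    B1 = _sign(np.random.randn(q1, n1))
    B2 = _sign(np.random.randn(q2, n2))
    for _ in range(iters):
        R = np.linalg.pinv(B1.T) @ S @ np.linalg.pinv(B2)  # fit correlations
        B1 = _sign(R @ B2 @ S.T)     # greedy sign update toward the target S
        B2 = _sign(R.T @ B1 @ S)
    return B1, B2, R
```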

Interpretable Adversarial Perturbation in Input Embedding Space for Text

Title Interpretable Adversarial Perturbation in Input Embedding Space for Text
Authors Motoki Sato, Jun Suzuki, Hiroyuki Shindo, Yuji Matsumoto
Abstract Following great success in the image processing field, the idea of adversarial training has been applied to tasks in the natural language processing (NLP) field. One promising approach directly applies adversarial training developed in the image processing field to the input word embedding space instead of the discrete input space of texts. However, this approach gives up interpretability: unlike methods that generate adversarial texts, the perturbed embeddings no longer correspond to actual words, even though they significantly improve the performance of NLP tasks. This paper restores interpretability to such methods by restricting the directions of perturbations toward the existing words in the input embedding space. As a result, we can straightforwardly reconstruct each perturbed input as an actual text, by interpreting the perturbations as replacements of words in the sentence, while maintaining or even improving the task performance.
Tasks
Published 2018-05-08
URL http://arxiv.org/abs/1805.02917v1
PDF http://arxiv.org/pdf/1805.02917v1.pdf
PWC https://paperswithcode.com/paper/interpretable-adversarial-perturbation-in
Repo https://github.com/aonotas/interpretable_adv
Framework none
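
The restriction that makes the perturbation interpretable is easy to sketch: instead of moving the embedding in an arbitrary gradient direction, move it only along directions that point at existing words, so the perturbed input can be read back as word replacements. The top-k softmax weighting below is our simplification of the paper's weighting.

```python
import torch
import torch.nn.functional as F

def interpretable_perturbation(w, vocab_embeds, grad, eps=1.0, topk=10):
    """w: (d,) one token's embedding; vocab_embeds: (V, d); grad: (d,)
    gradient of the loss w.r.t. w. Returns a perturbation that is a convex
    mix of directions toward real words."""
    dirs = F.normalize(vocab_embeds - w, dim=1)   # unit direction to each word
    align = dirs @ grad                           # loss increase per direction
    weights = torch.zeros_like(align)
    top = align.topk(topk)
    weights[top.indices] = F.softmax(top.values, dim=0)
    return eps * (weights.unsqueeze(1) * dirs).sum(0)
```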

Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing

Title Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing
Authors Jill-Jênn Vie, Hisashi Kashima
Abstract Knowledge tracing is a sequence prediction problem where the goal is to predict the outcomes of students over questions as they are interacting with a learning platform. By tracking the evolution of a student's knowledge, one can optimize instruction. Existing methods are based either on temporal latent variable models or on factor analysis with temporal features. We here show that factorization machines (FMs), a model for regression or classification, encompass several existing models in the educational literature as special cases, notably the additive factor model, the performance factor model, and multidimensional item response theory. We show, using several real datasets of tens of thousands of users and items, that FMs can estimate student knowledge accurately and quickly, even when student data is sparsely observed, and can handle side information such as multiple knowledge components and the number of attempts at the item or skill level. Our approach allows us to fit student models of higher dimension than existing models, and provides a testbed for trying new combinations of features in order to improve existing models.
Tasks Knowledge Tracing, Latent Variable Models
Published 2018-11-08
URL http://arxiv.org/abs/1811.03388v2
PDF http://arxiv.org/pdf/1811.03388v2.pdf
PWC https://paperswithcode.com/paper/knowledge-tracing-machines-factorization
Repo https://github.com/jilljenn/ktm
Framework tf
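
The FM that underlies KTMs is compact enough to write out. Given a sparse encoding x of one interaction (user one-hot, item one-hot, skill indicators, attempt counts, ...), the probability of a correct answer is the sigmoid of the second-order FM score; the snippet uses the standard O(dk) pairwise-interaction identity.

```python
import numpy as np

def fm_predict_proba(x, w0, w, V):
    """p(correct) = sigmoid(w0 + <w, x> + sum_{i<j} <V_i, V_j> x_i x_j).
    x: (d,) sparse feature vector; w0: bias; w: (d,) weights;
    V: (d, k) feature embeddings."""
    pairwise = 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))
    score = w0 + w @ x + pairwise
    return 1.0 / (1.0 + np.exp(-score))
```

Which features go into x is exactly what lets FMs recover AFM, PFA, or MIRT as special cases.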

Discrete Factorization Machines for Fast Feature-based Recommendation

Title Discrete Factorization Machines for Fast Feature-based Recommendation
Authors Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang
Abstract User and item features from side information are crucial for accurate recommendation. However, the large number of feature dimensions, e.g., usually larger than 10^7, results in expensive storage and computational cost. This prohibits fast recommendation, especially on mobile applications where the computational resource is very limited. In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation. DFM binarizes the real-valued model parameters (e.g., float32) of every feature embedding into binary codes (e.g., boolean), and thus supports efficient storage and fast user-item score computation. To avoid the severe quantization loss of binarization, we propose a convergent updating rule that resolves the challenging discrete optimization of DFM. Through extensive experiments on two real-world datasets, we show that 1) DFM consistently outperforms state-of-the-art binarized recommendation models, and 2) DFM shows very competitive performance compared to its real-valued counterpart (FM), demonstrating minimal quantization loss. This work was accepted by IJCAI 2018.
Tasks Quantization
Published 2018-05-06
URL http://arxiv.org/abs/1805.02232v3
PDF http://arxiv.org/pdf/1805.02232v3.pdf
PWC https://paperswithcode.com/paper/discrete-factorization-machines-for-fast
Repo https://github.com/hanliu95/DFM
Framework none
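
The serving-time payoff of binarization is the easiest part to illustrate: once every embedding is a {-1, +1} code, user-item scoring reduces to matching bits, which maps to cheap bitwise arithmetic on a device. Aggregating multiple feature codes into one vector per user/item is assumed already done; the naive sign binarization shown is exactly the step DFM's convergent update rule improves on.

```python
import numpy as np

def naive_binarize(embeddings):
    """Baseline binarization; DFM instead learns binary codes directly to
    avoid this step's quantization loss."""
    return np.where(embeddings >= 0, 1.0, -1.0)

def dfm_score(user_codes, item_codes):
    """Inner product of +/-1 codes = (#matching bits - #mismatching bits),
    i.e., Hamming-style matching. user_codes: (n_users, q);
    item_codes: (n_items, q) -> (n_users, n_items) score matrix."""
    return user_codes @ item_codes.T
```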

On the Continuity of Rotation Representations in Neural Networks

Title On the Continuity of Rotation Representations in Neural Networks
Authors Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, Hao Li
Abstract In neural networks, it is often desirable to work with various representations of the same space. For example, 3D rotations can be represented with quaternions or Euler angles. In this paper, we advance a definition of a continuous representation, which can be helpful for training deep neural networks. We relate this to topological concepts such as homeomorphism and embedding. We then investigate what are continuous and discontinuous representations for 2D, 3D, and n-dimensional rotations. We demonstrate that for 3D rotations, all representations are discontinuous in the real Euclidean spaces of four or fewer dimensions. Thus, widely used representations such as quaternions and Euler angles are discontinuous and difficult for neural networks to learn. We show that the 3D rotations have continuous representations in 5D and 6D, which are more suitable for learning. We also present continuous representations for the general case of the n-dimensional rotation group SO(n). While our main focus is on rotations, we also show that our constructions apply to other groups such as the orthogonal group and similarity transforms. We finally present empirical results, which show that our continuous rotation representations outperform discontinuous ones for several practical problems in graphics and vision, including a simple autoencoder sanity test, a rotation estimator for 3D point clouds, and an inverse kinematics solver for 3D human poses.
Tasks
Published 2018-12-17
URL http://arxiv.org/abs/1812.07035v3
PDF http://arxiv.org/pdf/1812.07035v3.pdf
PWC https://paperswithcode.com/paper/on-the-continuity-of-rotation-representations
Repo https://github.com/tik0/6d
Framework none
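
The paper's 6D representation is simple to state: a network regresses two 3D vectors, and Gram-Schmidt orthonormalization maps them to a rotation matrix. Because this map has a continuous right inverse, it avoids the discontinuities that make quaternions and Euler angles hard to regress. A direct numpy implementation:

```python
import numpy as np

def rotation_from_6d(r6):
    """Map a 6D vector (two stacked 3D vectors) to a rotation matrix via
    Gram-Schmidt, following the paper's construction."""
    a1, a2 = r6[:3], r6[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - (b1 @ a2) * b1            # remove the component along b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)               # complete a right-handed frame
    return np.stack([b1, b2, b3], axis=1)  # columns form the orthonormal basis
```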

Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models

Title Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models
Authors Shoubo Hu, Zhitang Chen, Vahid Partovi Nia, Laiwan Chan, Yanhui Geng
Abstract The inference of the causal relationship between a pair of observed variables is a fundamental problem in science, and most existing approaches are based on a single causal model. In practice, however, observations are often collected from multiple sources with heterogeneous causal models due to certain uncontrollable factors, which renders causal analysis results obtained by a single model questionable. In this paper, we generalize the Additive Noise Model (ANM) to a mixture model, which consists of a finite number of ANMs, and provide conditions for its causal identifiability. To conduct model estimation, we propose the Gaussian Process Partially Observable Model (GPPOM), and incorporate independence enforcement into it to learn the latent parameter associated with each observation. Causal inference and clustering according to the underlying generating mechanisms of the mixture model are addressed in this work. Experiments on synthetic and real data demonstrate the effectiveness of our proposed approach.
Tasks Causal Inference
Published 2018-09-23
URL http://arxiv.org/abs/1809.08568v3
PDF http://arxiv.org/pdf/1809.08568v3.pdf
PWC https://paperswithcode.com/paper/causal-inference-and-mechanism-clustering-of
Repo https://github.com/amber0309/ANM-MM
Framework none
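
The single-ANM building block that the paper generalizes is short enough to sketch: regress each direction, and prefer the one whose residuals are more independent of the input, here scored with a biased HSIC estimate (kernel bandwidth and GP regressor defaults are our choices). The paper's actual contribution, handling a *mixture* of such models via GPPOM, is not captured by this sketch.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimate with Gaussian kernels: trace(K H L H) / (n-1)^2."""
    n = len(x)
    gram = lambda v: np.exp(-(v[:, None] - v[None, :]) ** 2 / (2 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(gram(x) @ H @ gram(y) @ H) / (n - 1) ** 2

def anm_direction(x, y):
    """Return the direction whose regression residuals are more independent."""
    def residual_dep(a, b):
        gp = GaussianProcessRegressor().fit(a[:, None], b)
        return hsic(a, b - gp.predict(a[:, None]))
    return "x->y" if residual_dep(x, y) < residual_dep(y, x) else "y->x"
```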

The Blessings of Multiple Causes

Title The Blessings of Multiple Causes
Authors Yixin Wang, David M. Blei
Abstract Causal inference from observational data often assumes “ignorability,” that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for the deconfounder, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genome-wide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating closer-to-truth causal effects.
Tasks Causal Inference
Published 2018-05-17
URL http://arxiv.org/abs/1805.06826v3
PDF http://arxiv.org/pdf/1805.06826v3.pdf
PWC https://paperswithcode.com/paper/the-blessings-of-multiple-causes
Repo https://github.com/rajat641/CSE-472-Causality-Study
Framework none
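
The deconfounder recipe itself is only a few steps, which the sketch below caricatures with PCA-style factors standing in for the paper's factor models. Note that the full method requires a predictive model check before the substitute confounder can be trusted; that step is deliberately skipped here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def deconfounder_sketch(causes, outcome, n_factors=5):
    """causes: (n, m) matrix of assigned causes; outcome: (n,).
    1) Fit a factor model to the causes.
    2) Use the inferred factors as a substitute confounder.
    3) Estimate effects with the substitute adjusted for."""
    Z = PCA(n_components=n_factors).fit_transform(causes)
    X = np.hstack([causes, Z])                   # adjust for the substitute
    coef = LinearRegression().fit(X, outcome).coef_
    return coef[:causes.shape[1]]                # per-cause effect estimates
```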