Paper Group AWR 21
Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study. Excitation Dropout: Encouraging Plasticity in Deep Neural Networks. Hand Pose Estimation via Latent 2.5D Heatmap Regression. Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning. Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing. Explainable and Explicit Visual Reasoning over Scene Graphs. Deep learning for time series classification: a review. Towards generative adversarial networks as a new paradigm for radiology education. MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval. Interpretable Adversarial Perturbation in Input Embedding Space for Text. Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing. Discrete Factorization Machines for Fast Feature-based Recommendation. On the Continuity of Rotation Representations in Neural Networks. Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models. The Blessings of Multiple Causes.
Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study
Title | Cyberbullying Detection in Social Networks Using Deep Learning Based Models; A Reproducibility Study |
Authors | Maral Dadvar, Kai Eckert |
Abstract | Cyberbullying is a disturbing online misbehaviour with troubling consequences. It appears in different forms, and in most social networks it is textual. Automatic detection of such incidents requires intelligent systems. Most existing studies have approached this problem with conventional machine learning models, and the majority of the developed models are adaptable to only a single social network at a time. In recent studies, deep learning based models have found their way into the detection of cyberbullying incidents, on the claim that they can overcome the limitations of conventional models and improve detection performance. In this paper, we investigate the findings of one such recent study. We successfully reproduced its results and validated them on the same three datasets the authors used: Wikipedia, Twitter, and Formspring. We then extended the work by applying the developed methods to a new YouTube dataset (~54k posts by ~4k users) and investigated the performance of the models on a new social media platform. We also evaluated how models trained on one platform transfer to another. Our findings show that the deep learning based models outperform the machine learning models previously applied to the same YouTube dataset. We believe the deep learning based models could further benefit from integrating other sources of information, such as the profile information of users in social networks. |
Tasks | |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.08046v1 |
http://arxiv.org/pdf/1812.08046v1.pdf | |
PWC | https://paperswithcode.com/paper/cyberbullying-detection-in-social-networks |
Repo | https://github.com/sweta20/Detecting-Cyberbullying-Across-SMPs |
Framework | tf |
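The cross-platform experiment described above reduces to a simple protocol: fit on one platform, then score the frozen model elsewhere. A minimal sketch, assuming a scikit-learn-style `model` and hypothetical dataset dictionaries (none of these names come from the linked repo):

```python
from sklearn.metrics import f1_score

def cross_platform_f1(model, train, others):
    # Fit on one platform's data, then score the unchanged model on the
    # other platforms' test sets to measure transfer.
    model.fit(train["texts"], train["labels"])
    return {name: f1_score(d["labels"], model.predict(d["texts"]))
            for name, d in others.items()}
```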
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks
Title | Excitation Dropout: Encouraging Plasticity in Deep Neural Networks |
Authors | Andrea Zunino, Sarah Adel Bargal, Pietro Morerio, Jianming Zhang, Stan Sclaroff, Vittorio Murino |
Abstract | We propose a guided dropout regularizer for deep networks based on the evidence of a network prediction, defined as the firing of neurons in specific paths. In this work, we utilize the evidence at each neuron to determine the probability of dropout, rather than dropping out neurons uniformly at random as in standard dropout. In essence, we drop out with higher probability those neurons which contribute more to decision making at training time. This approach penalizes high-saliency neurons that are most relevant for model prediction, i.e. those having stronger evidence. By dropping such high-saliency neurons, the network is forced to learn alternative paths in order to maintain loss minimization, resulting in plasticity-like behavior, a characteristic of human brains as well. We demonstrate better generalization ability, increased utilization of network neurons, and higher resilience to network compression using several metrics over four image/video recognition benchmarks. |
Tasks | Decision Making, Video Recognition |
Published | 2018-05-23 |
URL | https://arxiv.org/abs/1805.09092v2 |
https://arxiv.org/pdf/1805.09092v2.pdf | |
PWC | https://paperswithcode.com/paper/excitation-dropout-encouraging-plasticity-in |
Repo | https://github.com/andreazuna89/Excitation-Dropout |
Framework | caffe2 |
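The core mechanism, dropping neurons with probability tied to their evidence, can be sketched in a few lines. This is a simplified stand-in, not the paper's exact retaining-probability formula (which is derived from Excitation Backprop saliency); `relevance` is assumed to be a per-neuron evidence vector that sums to 1:

```python
import torch

def excitation_dropout(activations, relevance, base_rate=0.5):
    # Scale relevance so the average drop probability equals base_rate;
    # high-evidence neurons are then dropped with higher probability.
    p_drop = (base_rate * relevance.numel() * relevance).clamp(0.0, 0.99)
    mask = torch.bernoulli(1.0 - p_drop)
    # Inverted-dropout rescaling keeps the expected activation magnitude.
    return activations * mask / (1.0 - p_drop)
```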
Hand Pose Estimation via Latent 2.5D Heatmap Regression
Title | Hand Pose Estimation via Latent 2.5D Heatmap Regression |
Authors | Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz |
Abstract | Estimating the 3D pose of a hand is an essential part of human-computer interaction. Estimating 3D pose using depth or multi-view sensors has become easier with recent advances in computer vision; however, regressing pose from a single RGB image is much less straightforward. The main difficulty arises from the fact that 3D pose requires some form of depth estimate, which is ambiguous given only an RGB image. In this paper we propose a new method for 3D hand pose estimation from a monocular image through a novel 2.5D pose representation. Our new representation estimates pose up to a scaling factor, which can be estimated additionally if a prior on the hand size is given. We implicitly learn depth maps and heatmap distributions with a novel CNN architecture. Our system achieves state-of-the-art 2D and 3D hand pose estimation on several challenging datasets in the presence of severe occlusions. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09534v1 |
http://arxiv.org/pdf/1804.09534v1.pdf | |
PWC | https://paperswithcode.com/paper/hand-pose-estimation-via-latent-25d-heatmap |
Repo | https://github.com/Janus-Shiau/ood_confidence_tensorflow |
Framework | tf |
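The 2.5D idea can be made concrete: each keypoint carries pixel coordinates plus a root-relative depth, and the missing root depth is fixed by requiring one reference bone to have a known length (unit length if pose is only needed up to scale). A hedged sketch of this lifting step, assuming pinhole intrinsics `K`; the reference bone index pair here is illustrative:

```python
import numpy as np

def lift_2p5d(uv, z_rel, K, bone=(0, 9), bone_len=1.0):
    """Recover Z_root so the reference bone has length bone_len, then
    back-project. uv: (N, 2) pixels, z_rel: (N,) root-relative depths."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    # Ray directions: a keypoint at depth Z sits at Z * d in camera space.
    d = np.stack([(uv[:, 0] - cx) / fx, (uv[:, 1] - cy) / fy,
                  np.ones(len(uv))], axis=1)
    i, j = bone
    a = d[i] - d[j]                        # term multiplying Z_root
    b = z_rel[i] * d[i] - z_rel[j] * d[j]  # constant term
    # ||a * Z_root + b||^2 = bone_len^2 is a quadratic in Z_root.
    A, B, C = a @ a, 2 * a @ b, b @ b - bone_len ** 2
    z_root = (-B + np.sqrt(max(B * B - 4 * A * C, 0.0))) / (2 * A)
    return d * (z_root + z_rel)[:, None]   # (N, 3) camera-space joints
```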
Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning
Title | Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning |
Authors | Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan |
Abstract | Skeleton-based action recognition has made great progress recently, but many problems still remain unsolved. For example, most of the previous methods model the representations of skeleton sequences without abundant spatial structure information and detailed temporal dynamics features. In this paper, we propose a novel model with spatial reasoning and temporal stack learning (SR-TSL) for skeleton-based action recognition, which consists of a spatial reasoning network (SRN) and a temporal stack learning network (TSLN). The SRN captures the high-level spatial structural information within each frame with a residual graph neural network, while the TSLN models the detailed temporal dynamics of skeleton sequences with a composition of multiple skip-clip LSTMs. During training, we propose a clip-based incremental loss to optimize the model. We perform extensive experiments on the SYSU 3D Human-Object Interaction dataset and the NTU RGB+D dataset and verify the effectiveness of each network of our model. The comparison results show that our approach achieves much better results than state-of-the-art methods. |
Tasks | Human-Object Interaction Detection, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2018-05-07 |
URL | http://arxiv.org/abs/1805.02335v2 |
http://arxiv.org/pdf/1805.02335v2.pdf | |
PWC | https://paperswithcode.com/paper/skeleton-based-action-recognition-with-1 |
Repo | https://github.com/yfsong0709/RA-GCNv1 |
Framework | pytorch |
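Of the pieces above, the clip-based incremental loss is the most self-contained to sketch. One plausible reading, an assumption rather than the paper's exact weighting scheme, is to penalize the classifier after every clip so predictions become confident as early as possible:

```python
import torch
import torch.nn.functional as F

def clip_incremental_loss(clip_logits, labels):
    # clip_logits: (num_clips, batch, num_classes), one prediction per
    # skip-clip LSTM step; labels: (batch,). Averaging per-clip losses
    # pushes every prefix of the sequence toward the correct action.
    return torch.stack([F.cross_entropy(step, labels)
                        for step in clip_logits]).mean()
```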
Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing
Title | Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing |
Authors | Osman Ramadan, Paweł Budzianowski, Milica Gašić |
Abstract | Robust dialogue belief tracking is a key component in maintaining good quality dialogue systems. The tasks that dialogue systems are trying to solve are becoming increasingly complex, requiring scalability to multi-domain, semantically rich dialogues. However, most current approaches have difficulty scaling to more domains because the model parameters depend on the dialogue ontology. In this paper, a novel approach is introduced that fully utilizes semantic similarity between dialogue utterances and the ontology terms, allowing the information to be shared across domains. The evaluation is performed on a recently collected multi-domain dialogue dataset, one order of magnitude larger than currently available corpora. Our model handles multi-domain dialogues well while simultaneously outperforming existing state-of-the-art models on single-domain dialogue tracking tasks. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.06517v1 |
http://arxiv.org/pdf/1807.06517v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-multi-domain-belief-tracking-with |
Repo | https://github.com/osmanio2/multi-domain-belief-tracking |
Framework | tf |
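The key trick, decoupling model parameters from the ontology by scoring utterances against ontology-term embeddings, can be sketched schematically. This is not the paper's actual network; the encoders producing the vectors are assumed to exist:

```python
import torch
import torch.nn.functional as F

def slot_value_scores(utterance_vec, term_vecs):
    # Score each (domain, slot, value) term embedding against the encoded
    # utterance; adding a domain adds term embeddings, not parameters.
    sims = F.cosine_similarity(term_vecs, utterance_vec.unsqueeze(0), dim=1)
    return torch.softmax(sims, dim=0)
```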
Explainable and Explicit Visual Reasoning over Scene Graphs
Title | Explainable and Explicit Visual Reasoning over Scene Graphs |
Authors | Jiaxin Shi, Hanwang Zhang, Juanzi Li |
Abstract | We aim to dismantle the prevalent black-box neural architectures used in complex visual reasoning tasks into the proposed eXplainable and eXplicit Neural Modules (XNMs), which advance beyond existing neural module networks towards using scene graphs — objects as nodes and the pairwise relationships as edges — for explainable and explicit reasoning with structured knowledge. XNMs allow us to focus on teaching machines how to “think”, regardless of what they “look” at. As we show in the paper, by using scene graphs as an inductive bias, 1) we can design XNMs in a concise and flexible fashion: XNMs consist of merely 4 meta-types, which significantly reduces the number of parameters, by 10 to 100 times, and 2) we can explicitly trace the reasoning flow in terms of graph attentions. XNMs are generic enough to support a wide range of scene graph implementations of varying quality. For example, when the graphs are detected perfectly, XNMs achieve 100% accuracy on both CLEVR and CLEVR CoGenT, establishing an empirical performance upper bound for visual reasoning; when the graphs are noisily detected from real-world images, XNMs remain robust, achieving a competitive 67.5% accuracy on VQAv2.0 and surpassing the popular bag-of-objects attention models without graph structures. |
Tasks | Visual Reasoning |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01855v2 |
http://arxiv.org/pdf/1812.01855v2.pdf | |
PWC | https://paperswithcode.com/paper/explainable-and-explicit-visual-reasoning |
Repo | https://github.com/shijx12/XNM-Net |
Framework | pytorch |
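Two of the four meta-types are easy to sketch: attending to nodes given a query, and transferring attention along edges. A minimal sketch under assumed tensor shapes (this is not the released XNM-Net code):

```python
import torch

def attend_node(node_feats, query):
    # Node attention: score each scene-graph node against a query vector.
    return torch.softmax(node_feats @ query, dim=0)

def transfer(node_attn, edge_weights):
    # Move attention along weighted edges; edge_weights[i, j] is the
    # relevance of edge i -> j to the relation being reasoned about.
    attn = edge_weights.T @ node_attn
    return attn / attn.sum().clamp(min=1e-6)
```

Chaining such modules along a parsed program is what makes the reasoning flow traceable as graph attentions.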
Deep learning for time series classification: a review
Title | Deep learning for time series classification: a review |
Authors | Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller |
Abstract | Time Series Classification (TSC) is an important and challenging problem in data mining. With the increasing availability of time series data, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising, as deep learning has seen very successful applications in recent years. DNNs have indeed revolutionized the field of computer vision, especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open-source deep learning framework to the TSC community in which we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date. |
Tasks | Time Series, Time Series Classification |
Published | 2018-09-12 |
URL | https://arxiv.org/abs/1809.04356v4 |
https://arxiv.org/pdf/1809.04356v4.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-time-series-classification |
Repo | https://github.com/hyungjik/TSC_practice |
Framework | none |
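Among the architectures the review benchmarks, the fully convolutional network (FCN) is a simple and strong baseline. A sketch in PyTorch; the review's released framework is in Keras, so this is a port rather than the authors' code:

```python
import torch.nn as nn

class FCN(nn.Module):
    # Three Conv1d blocks followed by global average pooling, a common
    # strong baseline for univariate and multivariate TSC.
    def __init__(self, in_channels, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 128, 8, padding=4), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 256, 5, padding=2), nn.BatchNorm1d(256), nn.ReLU(),
            nn.Conv1d(256, 128, 3, padding=1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):  # x: (batch, channels, time)
        return self.net(x)
```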
Towards generative adversarial networks as a new paradigm for radiology education
Title | Towards generative adversarial networks as a new paradigm for radiology education |
Authors | Samuel G. Finlayson, Hyunkwang Lee, Isaac S. Kohane, Luke Oakden-Rayner |
Abstract | Medical students and radiology trainees typically view thousands of images in order to “train their eye” to detect the subtle visual patterns necessary for diagnosis. Nevertheless, infrastructural and legal constraints often make it difficult to access and quickly query an abundance of images with a user-specified feature set. In this paper, we use a conditional generative adversarial network (GAN) to synthesize $1024\times1024$ pixel pelvic radiographs that can be queried with conditioning on fracture status. We demonstrate that the conditional GAN learns features that distinguish fractures from non-fractures by training a convolutional neural network exclusively on images sampled from the GAN and achieving an AUC of $>0.95$ on a held-out set of real images. We conduct additional analysis of the images sampled from the GAN and describe ongoing work to validate educational efficacy. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01547v1 |
http://arxiv.org/pdf/1812.01547v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-generative-adversarial-networks-as-a |
Repo | https://github.com/namkugkim/Study_deeplearning |
Framework | pytorch |
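The evaluation protocol in the abstract, training a classifier only on GAN output and testing on real images, is compact enough to sketch. `generator` and `classifier` are hypothetical stand-ins, not names from the paper's code:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def synthetic_only_auc(generator, classifier, real_x, real_y, n=10000):
    # Train exclusively on conditional-GAN samples with their conditioning
    # labels, then measure AUC on a held-out set of real radiographs.
    labels = np.random.randint(0, 2, size=n)   # fracture / no-fracture
    fakes = generator.sample(labels)           # conditional sampling
    classifier.fit(fakes, labels)              # never sees a real image
    return roc_auc_score(real_y, classifier.predict_proba(real_x)[:, 1])
```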
MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval
Title | MTFH: A Matrix Tri-Factorization Hashing Framework for Efficient Cross-Modal Retrieval |
Authors | Xin Liu, Zhikai Hu, Haibin Ling, Yiu-ming Cheung |
Abstract | Hashing has recently sparked a great revolution in cross-modal retrieval because of its low storage cost and high query speed. Recent cross-modal hashing methods often learn unified or equal-length hash codes to represent the multi-modal data and make them intuitively comparable. However, such unified or equal-length hash representations can inherently sacrifice representation scalability, because the data from different modalities may not have one-to-one correspondence and could be encoded more efficiently by hash codes of unequal lengths. To mitigate these problems, this paper explores a related and relatively unexplored problem: encoding heterogeneous data with hash codes of varying lengths and generalizing cross-modal retrieval to various challenging scenarios. To this end, a generalized and flexible cross-modal hashing framework, termed Matrix Tri-Factorization Hashing (MTFH), is proposed to work seamlessly in various settings, including paired or unpaired multi-modal data and equal or varying hash-length encoding scenarios. More specifically, MTFH exploits an efficient objective function to flexibly learn modality-specific hash codes with different length settings, while synchronously learning two semantic correlation matrices that correlate the different hash representations and make the heterogeneous data comparable. As a result, the derived hash codes are more semantically meaningful for various challenging cross-modal retrieval tasks. Extensive experiments on public benchmark datasets highlight the superiority of MTFH under various retrieval scenarios and show its competitive performance against the state of the art. |
Tasks | Cross-Modal Retrieval, Semantic Textual Similarity |
Published | 2018-05-04 |
URL | https://arxiv.org/abs/1805.01963v2 |
https://arxiv.org/pdf/1805.01963v2.pdf | |
PWC | https://paperswithcode.com/paper/mtfh-a-matrix-tri-factorization-hashing |
Repo | https://github.com/starxliu/MTFH |
Framework | none |
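The part of MTFH that departs from standard hashing is easy to isolate: codes of different lengths become comparable through a learned semantic correlation matrix. A toy sketch of the scoring step only; learning `M` via the tri-factorization objective is omitted:

```python
import numpy as np

def cross_modal_scores(Hx, Hy, M):
    # Hx: (n, q) image codes, Hy: (m, g) text codes, M: (q, g) learned
    # correlation matrix; retrieval needs no equal code lengths.
    return Hx @ M @ Hy.T
```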
Interpretable Adversarial Perturbation in Input Embedding Space for Text
Title | Interpretable Adversarial Perturbation in Input Embedding Space for Text |
Authors | Motoki Sato, Jun Suzuki, Hiroyuki Shindo, Yuji Matsumoto |
Abstract | Following its great success in the image processing field, the idea of adversarial training has been applied to tasks in the natural language processing (NLP) field. One promising approach directly applies adversarial training developed in the image processing field to the input word embedding space instead of the discrete input space of texts. However, in exchange for significantly improving the performance of NLP tasks, this approach abandons the interpretability of generating actual adversarial texts. This paper restores interpretability to such methods by restricting the directions of perturbations toward the existing words in the input embedding space. As a result, each perturbed input can be straightforwardly reconstructed as an actual text by interpreting the perturbations as word replacements in the sentence, while maintaining or even improving task performance. |
Tasks | |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02917v1 |
http://arxiv.org/pdf/1805.02917v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-adversarial-perturbation-in |
Repo | https://github.com/aonotas/interpretable_adv |
Framework | none |
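The direction restriction can be sketched directly: the perturbation is built only from directions pointing toward existing word embeddings, so it always corresponds to interpretable word replacements. The alignment-softmax weighting below is a simplification, not the paper's exact formulation:

```python
import numpy as np

def word_directed_perturbation(x, vocab_emb, grad, eps=0.01):
    # Restrict the perturbation of embedding x to a convex combination of
    # unit directions toward existing words, weighted by how well each
    # direction aligns with the adversarial (loss-increasing) gradient.
    dirs = vocab_emb - x
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True) + 1e-8
    align = dirs @ grad
    alpha = np.exp(align - align.max())
    alpha /= alpha.sum()
    return eps * (alpha[:, None] * dirs).sum(axis=0)
```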
Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing
Title | Knowledge Tracing Machines: Factorization Machines for Knowledge Tracing |
Authors | Jill-Jênn Vie, Hisashi Kashima |
Abstract | Knowledge tracing is a sequence prediction problem where the goal is to predict the outcomes of students over questions as they interact with a learning platform. By tracking the evolution of a student's knowledge, one can optimize instruction. Existing methods are based either on temporal latent variable models or on factor analysis with temporal features. We show here that the factorization machine (FM), a model for regression or classification, encompasses several existing models in the educational literature as special cases, notably the additive factor model, the performance factor model, and multidimensional item response theory. We show, using several real datasets of tens of thousands of users and items, that FMs can estimate student knowledge accurately and quickly even when student data is sparsely observed, and can handle side information such as multiple knowledge components and the number of attempts at the item or skill level. Our approach allows us to fit student models of higher dimension than existing models and provides a testbed for trying new combinations of features in order to improve existing models. |
Tasks | Knowledge Tracing, Latent Variable Models |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03388v2 |
http://arxiv.org/pdf/1811.03388v2.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-tracing-machines-factorization |
Repo | https://github.com/jilljenn/ktm |
Framework | tf |
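Since the paper's point is that one FM subsumes several student models depending on which feature blocks are switched on, the prediction itself is just the standard second-order FM. A sketch using Rendle's O(dk) identity; the modeling choices live entirely in the sparse encoding `x`:

```python
import numpy as np

def fm_predict_proba(x, w0, w, V):
    # x: (d,) sparse encoding of (student, item, skills, attempt counts);
    # w0: bias, w: (d,) linear weights, V: (d, k) latent factors.
    pairwise = 0.5 * np.sum((x @ V) ** 2 - (x ** 2) @ (V ** 2))
    logit = w0 + w @ x + pairwise
    return 1.0 / (1.0 + np.exp(-logit))  # P(student answers correctly)
```

Encoding only students and items recovers item response theory; adding skills and attempt counters moves toward AFM/PFA-style models.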
Discrete Factorization Machines for Fast Feature-based Recommendation
Title | Discrete Factorization Machines for Fast Feature-based Recommendation |
Authors | Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang |
Abstract | Side-information features of users and items are crucial for accurate recommendation. However, the large number of feature dimensions, e.g., usually larger than 10^7, results in expensive storage and computational cost. This prohibits fast recommendation, especially in mobile applications where computational resources are very limited. In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation. DFM binarizes the real-valued model parameters (e.g., float32) of every feature embedding into binary codes (e.g., boolean), and thus supports efficient storage and fast user-item score computation. To avoid the severe quantization loss of the binarization, we propose a convergent updating rule that resolves the challenging discrete optimization of DFM. Through extensive experiments on two real-world datasets, we show that 1) DFM consistently outperforms state-of-the-art binarized recommendation models, and 2) DFM shows very competitive performance compared to its real-valued version (FM), demonstrating minimal quantization loss. This work was accepted at IJCAI 2018. |
Tasks | Quantization |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02232v3 |
http://arxiv.org/pdf/1805.02232v3.pdf | |
PWC | https://paperswithcode.com/paper/discrete-factorization-machines-for-fast |
Repo | https://github.com/hanliu95/DFM |
Framework | none |
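With every embedding entry in {-1, +1}, the FM pairwise-interaction score collapses to cheap arithmetic. A sketch of scoring with binary codes; training via the convergent discrete updates is the paper's contribution and is omitted here:

```python
import numpy as np

def dfm_interaction_score(active, B):
    # active: indices of the non-zero features; B: (d, k) binary codes
    # with entries in {-1, +1}.
    Z = B[active]
    s = Z.sum(axis=0)
    # sum_{i<j} <z_i, z_j> = 0.5*(||s||^2 - n*k) because each ||z_i||^2 = k.
    return 0.5 * (s @ s - len(active) * B.shape[1])
```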
On the Continuity of Rotation Representations in Neural Networks
Title | On the Continuity of Rotation Representations in Neural Networks |
Authors | Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, Hao Li |
Abstract | In neural networks, it is often desirable to work with various representations of the same space. For example, 3D rotations can be represented with quaternions or Euler angles. In this paper, we advance a definition of a continuous representation, which can be helpful for training deep neural networks. We relate this to topological concepts such as homeomorphism and embedding. We then investigate which representations of 2D, 3D, and n-dimensional rotations are continuous and which are not. We demonstrate that for 3D rotations, all representations in real Euclidean spaces of four or fewer dimensions are discontinuous. Thus, widely used representations such as quaternions and Euler angles are discontinuous and difficult for neural networks to learn. We show that 3D rotations have continuous representations in 5D and 6D, which are more suitable for learning. We also present continuous representations for the general case of the n-dimensional rotation group SO(n). While our main focus is on rotations, we also show that our constructions apply to other groups, such as the orthogonal group and similarity transforms. We finally present empirical results showing that our continuous rotation representations outperform discontinuous ones on several practical problems in graphics and vision, including a simple autoencoder sanity test, a rotation estimator for 3D point clouds, and an inverse kinematics solver for 3D human poses. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.07035v3 |
http://arxiv.org/pdf/1812.07035v3.pdf | |
PWC | https://paperswithcode.com/paper/on-the-continuity-of-rotation-representations |
Repo | https://github.com/tik0/6d |
Framework | none |
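The 6D representation the paper advocates maps two 3D vectors to a rotation matrix via Gram-Schmidt, which is continuous in the inputs. A compact sketch:

```python
import torch
import torch.nn.functional as F

def rot6d_to_matrix(x6):
    # x6: (..., 6) network output; returns (..., 3, 3) rotation matrices.
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = F.normalize(a1, dim=-1)
    # Remove the b1 component from a2, then normalize (Gram-Schmidt).
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack([b1, b2, b3], dim=-1)  # columns b1, b2, b3
```

Every step is smooth for generic inputs, which is why this parameterization trains more easily than quaternion or Euler-angle outputs.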
Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models
Title | Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models |
Authors | Shoubo Hu, Zhitang Chen, Vahid Partovi Nia, Laiwan Chan, Yanhui Geng |
Abstract | The inference of the causal relationship between a pair of observed variables is a fundamental problem in science, and most existing approaches are based on a single causal model. In practice, however, observations are often collected from multiple sources with heterogeneous causal models due to certain uncontrollable factors, which renders causal analysis results obtained with a single model questionable. In this paper, we generalize the Additive Noise Model (ANM) to a mixture model consisting of a finite number of ANMs, and provide the condition for its causal identifiability. For model estimation, we propose the Gaussian Process Partially Observable Model (GPPOM) and incorporate independence enforcement into it to learn the latent parameter associated with each observation. Both causal inference and clustering according to the underlying generating mechanisms of the mixture model are addressed in this work. Experiments on synthetic and real data demonstrate the effectiveness of our proposed approach. |
Tasks | Causal Inference |
Published | 2018-09-23 |
URL | http://arxiv.org/abs/1809.08568v3 |
http://arxiv.org/pdf/1809.08568v3.pdf | |
PWC | https://paperswithcode.com/paper/causal-inference-and-mechanism-clustering-of |
Repo | https://github.com/amber0309/ANM-MM |
Framework | none |
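The building block, the additive noise model test for a single mechanism, is easy to sketch, although the paper's contribution (GPPOM plus mechanism clustering) goes well beyond it. A basic direction score, with correlation standing in for a proper HSIC independence test:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def anm_dependence(x, y):
    # Fit y = f(x) + e nonparametrically, then measure how dependent the
    # residual still is on the putative cause x.
    gp = GaussianProcessRegressor().fit(x.reshape(-1, 1), y)
    resid = y - gp.predict(x.reshape(-1, 1))
    return abs(np.corrcoef(x, resid)[0, 1])
```

Comparing `anm_dependence(x, y)` with `anm_dependence(y, x)` and choosing the direction with weaker dependence is the classic single-ANM decision rule; the mixture model instead clusters observations by their generating mechanism first.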
The Blessings of Multiple Causes
Title | The Blessings of Multiple Causes |
Authors | Yixin Wang, David M. Blei |
Abstract | Causal inference from observational data often assumes “ignorability,” that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for the deconfounder, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genome-wide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating closer-to-truth causal effects. |
Tasks | Causal Inference |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06826v3 |
http://arxiv.org/pdf/1805.06826v3.pdf | |
PWC | https://paperswithcode.com/paper/the-blessings-of-multiple-causes |
Repo | https://github.com/rajat641/CSE-472-Causality-Study |
Framework | none |
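The deconfounder's two-stage recipe is compact enough to sketch, with PCA standing in for the paper's probabilistic factor model and the predictive model check omitted (that check is essential in the actual method):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def deconfounder_effects(A, y, k=5):
    # Stage 1: infer latent factors over the multiple causes A; they act
    # as a substitute for the unobserved confounder.
    Z = PCA(n_components=k).fit_transform(A)
    # Stage 2: estimate cause effects while adjusting for the substitute.
    reg = LinearRegression().fit(np.hstack([A, Z]), y)
    return reg.coef_[: A.shape[1]]  # coefficients on the causes
```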