January 31, 2020


Paper Group AWR 436

TransNet: A deep network for fast detection of common shot transitions. Representation Learning-Assisted Click-Through Rate Prediction. Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample. Theory of the Frequency Principle for General Deep Neural Networks. Persistency of Excitation for Robustness of Neural Networks. Cascaded LSTMs …

TransNet: A deep network for fast detection of common shot transitions


Title	TransNet: A deep network for fast detection of common shot transitions
Authors	Tomáš Souček, Jaroslav Moravec, Jakub Lokoč
Abstract	Shot boundary detection (SBD) is an important first step in many video processing applications. This paper presents a simple modular convolutional neural network architecture that achieves state-of-the-art results on the RAI dataset with well above real-time inference speed even on a single mediocre GPU. The network employs dilated convolutions and operates just on small resized frames. The training process employed randomly generated transitions using selected shots from the TRECVID IACC.3 dataset. The code and a selected trained network will be available at https://github.com/soCzech/TransNet.
Tasks	Boundary Detection
Published	2019-06-08
URL	https://arxiv.org/abs/1906.03363v1
PDF	https://arxiv.org/pdf/1906.03363v1.pdf
PWC	https://paperswithcode.com/paper/transnet-a-deep-network-for-fast-detection-of
Repo	https://github.com/soCzech/TransNet
Framework	tf
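
The dilated convolutions mentioned in the abstract can be illustrated with a minimal 1D sketch in NumPy. This is not the TransNet architecture: the function name `dilated_conv1d`, the difference kernel, and the toy per-frame signal are assumptions made purely for illustration. The point is that a larger dilation widens the temporal span seen by each output without adding weights.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation form)."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    n_out = len(signal) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the receptive field")
    out = np.empty(n_out)
    for i in range(n_out):
        # Pick every `dilation`-th sample inside the receptive field.
        taps = signal[i : i + span : dilation]
        out[i] = np.dot(taps, kernel)
    return out

# Hypothetical per-frame statistic (e.g. mean intensity of 10 frames).
frame_means = np.arange(10.0)
y1 = dilated_conv1d(frame_means, [1, 0, -1], dilation=1)  # spans 3 frames
y4 = dilated_conv1d(frame_means, [1, 0, -1], dilation=4)  # spans 9 frames
```

With the same three weights, the dilation-4 filter compares frames eight steps apart, which is the kind of longer temporal context that gradual transitions need.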

Representation Learning-Assisted Click-Through Rate Prediction


Title	Representation Learning-Assisted Click-Through Rate Prediction
Authors	Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Chao Qi, Zhaojie Liu, Yanlong Du
Abstract	Click-through rate (CTR) prediction is a critical task in online advertising systems. Most existing methods mainly model the feature-CTR relationship and suffer from the data sparsity issue. In this paper, we propose DeepMCP, which models other types of relationships in order to learn more informative and statistically reliable feature representations, and in consequence to improve the performance of CTR prediction. In particular, DeepMCP contains three parts: a matching subnet, a correlation subnet and a prediction subnet. These subnets model the user-ad, ad-ad and feature-CTR relationships, respectively. When these subnets are jointly optimized under the supervision of the target labels, the learned feature representations have both good prediction powers and good representation abilities. Experiments on two large-scale datasets demonstrate that DeepMCP outperforms several state-of-the-art models for CTR prediction.
Tasks	Click-Through Rate Prediction, Representation Learning
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04365v3
PDF	https://arxiv.org/pdf/1906.04365v3.pdf
PWC	https://paperswithcode.com/paper/representation-learning-assisted-click
Repo	https://github.com/oywtece/deepmcp
Framework	tf

Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample


Title	Quasi-Newton Methods for Deep Learning: Forget the Past, Just Sample
Authors	Albert S. Berahas, Majid Jahani, Martin Takáč
Abstract	We present two sampled quasi-Newton methods: sampled LBFGS and sampled LSR1. Contrary to the classical variants of these methods that sequentially build (inverse) Hessian approximations as the optimization progresses, our proposed methods sample points randomly around the current iterate to produce these approximations. As a result, the approximations constructed make use of more reliable (recent and local) information, and do not depend on past information that could be significantly stale. Our proposed algorithms are efficient in terms of accessed data points (epochs) and have enough concurrency to take advantage of distributed computing environments. We provide convergence guarantees for our proposed methods. Numerical tests on a toy classification problem and on popular benchmarking neural network training tasks reveal that the methods outperform their classical variants and are competitive with first-order methods such as ADAM.
Tasks
Published	2019-01-28
URL	https://arxiv.org/abs/1901.09997v3
PDF	https://arxiv.org/pdf/1901.09997v3.pdf
PWC	https://paperswithcode.com/paper/quasi-newton-methods-for-deep-learning-forget
Repo	https://github.com/OptMLGroup/SQN
Framework	tf
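
The idea in the abstract of building curvature information from points sampled around the current iterate, rather than from past iterates, can be sketched as follows. This is a rough illustration and not the authors' implementation: the sampling radius, the number of pairs, the Gaussian sampling, and the quadratic test problem are all assumptions, and the direction is computed with the standard L-BFGS two-loop recursion.

```python
import numpy as np

def sample_curvature_pairs(grad, w, num_pairs=5, radius=0.1, rng=None):
    """Build (s, y) pairs from random points near w instead of past iterates."""
    rng = np.random.default_rng(0) if rng is None else rng
    g = grad(w)
    pairs = []
    for _ in range(num_pairs):
        s = radius * rng.standard_normal(w.shape)  # random displacement
        y = grad(w + s) - g                        # gradient difference
        if s @ y > 1e-10:                          # keep positive-curvature pairs only
            pairs.append((s, y))
    return pairs

def two_loop_direction(g, pairs):
    """Standard L-BFGS two-loop recursion; returns -H g."""
    q = g.copy()
    alphas = []
    for s, y in reversed(pairs):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((a, rho))
    if pairs:
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)  # initial inverse-Hessian scaling
    for (s, y), (a, rho) in zip(pairs, reversed(alphas)):
        b = rho * (y @ q)
        q += (a - b) * s
    return -q

# Toy problem (assumption): f(w) = 0.5 * w^T A w with an ill-conditioned A.
A = np.diag([1.0, 10.0])
grad = lambda w: A @ w
w = np.array([1.0, 1.0])
pairs = sample_curvature_pairs(grad, w)
direction = two_loop_direction(grad(w), pairs)
```

Because every pair is generated at the current iterate, none of the curvature information is stale, which is the property the abstract emphasizes.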

Theory of the Frequency Principle for General Deep Neural Networks


Title	Theory of the Frequency Principle for General Deep Neural Networks
Authors	Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang
Abstract	Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training. The F-Principle has been very useful in providing both qualitative and quantitative understandings of DNNs. In this paper, we rigorously investigate the F-Principle for the training dynamics of a general DNN at three stages: initial stage, intermediate stage, and final stage. For each stage, a theorem is provided in terms of proper quantities characterizing the F-Principle. Our results are general in the sense that they work for multilayer networks with general activation functions, population densities of data, and a large class of loss functions. Our work lays a theoretical foundation of the F-Principle for a better understanding of the training process of DNNs.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09235v2
PDF	https://arxiv.org/pdf/1906.09235v2.pdf
PWC	https://paperswithcode.com/paper/theory-of-the-frequency-principle-for-general
Repo	https://github.com/xuzhiqin1990/F-Principle
Framework	tf

Persistency of Excitation for Robustness of Neural Networks


Title	Persistency of Excitation for Robustness of Neural Networks
Authors	Kamil Nar, S. Shankar Sastry
Abstract	When an online learning algorithm is used to estimate the unknown parameters of a model, the signals interacting with the parameter estimates should not decay too quickly for the optimal values to be discovered correctly. This requirement is referred to as persistency of excitation, and it arises in various contexts, such as optimization with stochastic gradient methods, exploration for multi-armed bandits, and adaptive control of dynamical systems. While training a neural network, the iterative optimization algorithm involved also creates an online learning problem, and consequently, correct estimation of the optimal parameters requires persistent excitation of the network weights. In this work, we analyze the dynamics of the gradient descent algorithm while training a two-layer neural network with two different loss functions, the squared-error loss and the cross-entropy loss; and we obtain conditions to guarantee persistent excitation of the network weights. We then show that these conditions are difficult to satisfy when a multi-layer network is trained for a classification task, for the signals in the intermediate layers of the network become low-dimensional during training and fail to remain persistently exciting. To provide a remedy, we delve into the classical regularization terms used for linear models, reinterpret them as a means to ensure persistent excitation of the model parameters, and propose an algorithm for neural networks by building an analogy. The results in this work shed some light on why adversarial examples have become a challenging problem for neural networks, why merely augmenting training data sets will not be an effective approach to address them, and why there may not exist a data-independent regularization term for neural networks, which involve only the model parameters but not the training data.
Tasks	Multi-Armed Bandits
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01043v1
PDF	https://arxiv.org/pdf/1911.01043v1.pdf
PWC	https://paperswithcode.com/paper/persistency-of-excitation-for-robustness-of
Repo	https://github.com/nar-k/persistent-excitation
Framework	pytorch

Cascaded LSTMs based Deep Reinforcement Learning for Goal-driven Dialogue


Title	Cascaded LSTMs based Deep Reinforcement Learning for Goal-driven Dialogue
Authors	Yue Ma, Xiaojie Wang, Zhenjiang Dong, Hong Chen
Abstract	This paper proposes a deep neural network model for jointly modeling Natural Language Understanding (NLU) and Dialogue Management (DM) in goal-driven dialogue systems. There are three parts in this model. A Long Short-Term Memory (LSTM) at the bottom of the network encodes utterances in each dialogue turn into a turn embedding. Dialogue embeddings are learned by an LSTM in the middle of the network, and updated by feeding in all turn embeddings. The top part is a forward Deep Neural Network which converts dialogue embeddings into the Q-values of different dialogue actions. The cascaded LSTMs based reinforcement learning network is jointly optimized by using the rewards received at each dialogue turn as the only supervision information. There are no explicit NLU or dialogue states in the network. Experimental results show that our model outperforms both a traditional Markov Decision Process (MDP) model and a single LSTM with Deep Q-Network on meeting room booking tasks. Visualization of dialogue embeddings illustrates that the model can learn the representation of dialogue states.
Tasks	Dialogue Management
Published	2019-10-31
URL	https://arxiv.org/abs/1910.14229v1
PDF	https://arxiv.org/pdf/1910.14229v1.pdf
PWC	https://paperswithcode.com/paper/cascaded-lstms-based-deep-reinforcement
Repo	https://github.com/Damcy/cascadeLSTMDRL
Framework	tf

Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection


Title	Skip-GANomaly: Skip Connected and Adversarially Trained Encoder-Decoder Anomaly Detection
Authors	Samet Akçay, Amir Atapour-Abarghouei, Toby P. Breckon
Abstract	Despite inherent ill-definition, anomaly detection is a research endeavor of great interest within machine learning and visual scene understanding alike. Most commonly, anomaly detection is considered as the detection of outliers within a given data distribution based on some measure of normality. The most significant challenge in real-world anomaly detection problems is that available data is highly imbalanced towards normality (i.e. non-anomalous) and contains at most a subset of all possible anomalous samples - hence limiting the use of well-established supervised learning methods. By contrast, we introduce an unsupervised anomaly detection model, trained only on the normal (non-anomalous, plentiful) samples in order to learn the normality distribution of the domain and hence detect abnormality based on deviation from this model. Our proposed approach employs an encoder-decoder convolutional neural network with skip connections to thoroughly capture the multi-scale distribution of the normal data distribution in high-dimensional image space. Furthermore, utilizing an adversarial training scheme for this chosen architecture provides superior reconstruction both within high-dimensional image space and a lower-dimensional latent vector space encoding. Minimizing the reconstruction error metric within both the image and hidden vector spaces during training aids the model to learn the distribution of normality as required. Higher reconstruction metrics during subsequent test and deployment are thus indicative of a deviation from this normal distribution, hence indicative of an anomaly. Experimentation over established anomaly detection benchmarks and challenging real-world datasets, within the context of X-ray security screening, shows the unique promise of such a proposed approach.
Tasks	Anomaly Detection, Scene Understanding, Unsupervised Anomaly Detection
Published	2019-01-25
URL	http://arxiv.org/abs/1901.08954v1
PDF	http://arxiv.org/pdf/1901.08954v1.pdf
PWC	https://paperswithcode.com/paper/skip-ganomaly-skip-connected-and
Repo	https://github.com/samet-akcay/skip-ganomaly
Framework	pytorch
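
The abstract describes scoring test samples by reconstruction error in both image space and latent space. A minimal sketch of such a combined score is given below. The specific norms (mean absolute error for images, mean squared error for latent vectors), the `weight` parameter, and the function name are assumptions for illustration and are not taken from the paper or its repository.

```python
import numpy as np

def anomaly_score(x, x_hat, z, z_hat, weight=0.5):
    """Blend image-space and latent-space reconstruction errors into one score."""
    recon_error = np.mean(np.abs(x - x_hat))   # image-space error (assumed L1)
    latent_error = np.mean((z - z_hat) ** 2)   # latent-space error (assumed L2)
    return weight * recon_error + (1.0 - weight) * latent_error

# Toy inputs standing in for an image and its latent code (assumptions).
x = np.zeros((4, 4))
z = np.zeros(8)
score_normal = anomaly_score(x, x, z, z)            # perfect reconstruction
score_odd = anomaly_score(x, x + 1.0, z, z + 2.0)   # poor reconstruction
```

A sample the model reconstructs well scores near zero, while a poorly reconstructed one scores higher and would be flagged once the score passes a chosen threshold.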

Joint Group Feature Selection and Discriminative Filter Learning for Robust Visual Object Tracking


Title	Joint Group Feature Selection and Discriminative Filter Learning for Robust Visual Object Tracking
Authors	Tianyang Xu, Zhen-Hua Feng, Xiao-Jun Wu, Josef Kittler
Abstract	We propose a new Group Feature Selection method for Discriminative Correlation Filters (GFS-DCF) based visual object tracking. The key innovation of the proposed method is to perform group feature selection across both channel and spatial dimensions, thus to pinpoint the structural relevance of multi-channel features to the filtering system. In contrast to the widely used spatial regularisation or feature selection methods, to the best of our knowledge, this is the first time that channel selection has been advocated for DCF-based tracking. We demonstrate that our GFS-DCF method is able to significantly improve the performance of a DCF tracker equipped with deep neural network features. In addition, our GFS-DCF enables joint feature selection and filter learning, achieving enhanced discrimination and interpretability of the learned filters. To further improve the performance, we adaptively integrate historical information by constraining filters to be smooth across temporal frames, using an efficient low-rank approximation. By design, specific temporal-spatial-channel configurations are dynamically learned in the tracking process, highlighting the relevant features, and alleviating the performance degrading impact of less discriminative representations and reducing information redundancy. The experimental results obtained on OTB2013, OTB2015, VOT2017, VOT2018 and TrackingNet demonstrate the merits of our GFS-DCF and its superiority over the state-of-the-art trackers. The code is publicly available at https://github.com/XU-TIANYANG/GFS-DCF.
Tasks	Feature Selection, Object Tracking, Visual Object Tracking
Published	2019-07-30
URL	https://arxiv.org/abs/1907.13242v2
PDF	https://arxiv.org/pdf/1907.13242v2.pdf
PWC	https://paperswithcode.com/paper/joint-group-feature-selection-and
Repo	https://github.com/XU-TIANYANG/GFS-DCF
Framework	none

Trivializations for Gradient-Based Optimization on Manifolds


Title	Trivializations for Gradient-Based Optimization on Manifolds
Authors	Mario Lezcano-Casado
Abstract	We introduce a framework to study the transformation of problems with manifold constraints into unconstrained problems through parametrizations in terms of a Euclidean space. We call these parametrizations “trivializations”. We prove conditions under which a trivialization is sound in the context of gradient-based optimization and we show how two large families of trivializations have overall favorable properties, but also suffer from a performance issue. We then introduce “dynamic trivializations”, which solve this problem, and we show how these form a family of optimization methods that lie between trivializations and Riemannian gradient descent, and combine the benefits of both of them. We then show how to implement these two families of trivializations in practice for different matrix manifolds. To this end, we prove a formula for the gradient of the exponential of matrices, which can be of practical interest on its own. Finally, we show how dynamic trivializations improve the performance of existing methods on standard tasks designed to test long-term memory within neural networks.
Tasks
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09501v2
PDF	https://arxiv.org/pdf/1909.09501v2.pdf
PWC	https://paperswithcode.com/paper/trivializations-for-gradient-based
Repo	https://github.com/Lezcano/expRNN
Framework	pytorch
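
The notion of replacing a manifold constraint with an unconstrained Euclidean parameter can be illustrated on the orthogonal group. The choice of map here is an assumption, since the abstract does not spell out the parametrizations: any unconstrained square matrix `A` is sent to `exp(A - A^T)`, which is orthogonal because `A - A^T` is skew-symmetric. The helper `expm_taylor` is a simple hand-written matrix exponential used to keep the sketch self-contained; it is not the paper's implementation.

```python
import numpy as np

def expm_taylor(M, terms=20):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, ord=1)
    # Scale M down so the Taylor series converges quickly.
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    # Undo the scaling: exp(M) = exp(M / 2^j)^(2^j).
    for _ in range(squarings):
        result = result @ result
    return result

def orthogonal_from_unconstrained(A):
    """Map an arbitrary square matrix to an orthogonal one."""
    skew = A - A.T
    return expm_taylor(skew)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # unconstrained parameter
Q = orthogonal_from_unconstrained(A)
```

A gradient-based optimizer can then update `A` freely, and `Q` remains orthogonal at every step by construction.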

CCKS 2019 Shared Task on Inter-Personal Relationship Extraction


Title	CCKS 2019 Shared Task on Inter-Personal Relationship Extraction
Authors	Haitao Wang, Zhengqiu He, Tong Zhu, Hao Shao, Wenliang Chen, Min Zhang
Abstract	The CCKS2019 shared task was devoted to inter-personal relationship extraction. Given two person entities and at least one sentence containing these two entities, participating teams are asked to predict the relationship between the entities according to a given relation list. This year, 358 teams from various universities and organizations participated in this task. In this paper, we present the task definition, the description of data and the evaluation methodology used during this shared task. We also present a brief overview of the various methods adopted by the participating teams. Finally, we present the evaluation results.
Tasks
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11337v1
PDF	https://arxiv.org/pdf/1908.11337v1.pdf
PWC	https://paperswithcode.com/paper/ccks-2019-shared-task-on-inter-personal
Repo	https://github.com/ccks2019-ipre/baseline
Framework	tf

Robust Multi-Modality Multi-Object Tracking


Title	Robust Multi-Modality Multi-Object Tracking
Authors	Wenwei Zhang, Hui Zhou, Shuyang Sun, Zhe Wang, Jianping Shi, Chen Change Loy
Abstract	Multi-sensor perception is crucial to ensure reliability and accuracy in autonomous driving systems, while multi-object tracking (MOT) improves that by tracing the sequential movement of dynamic objects. Most current approaches for multi-sensor multi-object tracking either lack reliability because they rely tightly on a single input source (e.g., center camera), or are not accurate enough because they fuse the results from multiple sensors in post-processing without fully exploiting the inherent information. In this study, we design a generic sensor-agnostic multi-modality MOT framework (mmMOT), where each modality (i.e., sensor) is capable of performing its role independently to preserve reliability, while further improving its accuracy through a novel multi-modality fusion module. Our mmMOT can be trained in an end-to-end manner, which enables joint optimization of the base feature extractor of each modality and an adjacency estimator across modalities. Our mmMOT also makes the first attempt to encode deep representations of point clouds in the data association process in MOT. We conduct extensive experiments to evaluate the effectiveness of the proposed framework on the challenging KITTI benchmark and report state-of-the-art performance. Code and models are available at https://github.com/ZwwWayne/mmMOT.
Tasks	Autonomous Driving, Multi-Object Tracking, Object Tracking
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03850v1
PDF	https://arxiv.org/pdf/1909.03850v1.pdf
PWC	https://paperswithcode.com/paper/robust-multi-modality-multi-object-tracking
Repo	https://github.com/ZwwWayne/mmMOT
Framework	pytorch

Analyzing Dynamical Brain Functional Connectivity As Trajectories on Space of Covariance Matrices


Title	Analyzing Dynamical Brain Functional Connectivity As Trajectories on Space of Covariance Matrices
Authors	Mengyu Dai, Zhengwu Zhang, Anuj Srivastava
Abstract	Human brain functional connectivity (FC) is often measured as the similarity of functional MRI responses across brain regions when a brain is either resting or performing a task. This paper aims to statistically analyze the dynamic nature of FC by representing the collective time-series data, over a set of brain regions, as a trajectory on the space of covariance matrices, or symmetric positive-definite matrices (SPDMs). We use a recently developed metric on the space of SPDMs for quantifying differences across FC observations, and for clustering and classification of FC trajectories. To facilitate large-scale and high-dimensional data analysis, we propose a novel, metric-based dimensionality reduction technique to reduce data from large SPDMs to small SPDMs. We illustrate this comprehensive framework using data from the Human Connectome Project (HCP) database for multiple subjects and tasks, with task classification rates that match or outperform state-of-the-art techniques.
Tasks	Dimensionality Reduction, Time Series
Published	2019-04-10
URL	https://arxiv.org/abs/1904.05449v2
PDF	https://arxiv.org/pdf/1904.05449v2.pdf
PWC	https://paperswithcode.com/paper/analyzing-dynamical-brain-functional
Repo	https://github.com/dzld00/unsupervised-cov-sequence-dim-reduction
Framework	none
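
Representing multi-region time series as a trajectory of covariance matrices can be sketched with a sliding window. The abstract does not say how the trajectory is constructed, so the sliding-window scheme, the window and step sizes, and the small diagonal regularizer `eps` are all assumptions; the metric on SPDMs used in the paper is not reproduced here.

```python
import numpy as np

def covariance_trajectory(ts, window, step=1, eps=1e-6):
    """Turn a (time, regions) array into a sequence of SPD covariance matrices."""
    ts = np.asarray(ts, dtype=float)
    n_time, n_regions = ts.shape
    mats = []
    for start in range(0, n_time - window + 1, step):
        segment = ts[start : start + window]
        cov = np.cov(segment, rowvar=False)        # regions are columns
        mats.append(cov + eps * np.eye(n_regions)) # keep strictly positive definite
    return np.stack(mats)

# Synthetic signal standing in for fMRI responses: 50 time points, 3 regions.
rng = np.random.default_rng(0)
signal = rng.standard_normal((50, 3))
trajectory = covariance_trajectory(signal, window=20, step=10)
```

Each matrix in `trajectory` is one point on the space of SPDMs, and consecutive points trace how connectivity changes over time.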

Radius Adaptive Convolutional Neural Network


Title	Radius Adaptive Convolutional Neural Network
Authors	Meisam Rakhshanfar
Abstract	Convolutional neural networks (CNNs) are widely used in computer vision applications. In networks that deal with images, convolutions are the most time-consuming layers. Usually, the solution to address the computation cost is to decrease the number of trainable parameters. This solution usually comes at the cost of reduced accuracy. Another problem with this technique is that the cost of memory access is usually not taken into account, which results in an insignificant speedup gain. The number of operations and memory accesses in a standard convolution layer is independent of the input content, which limits certain accelerations. We propose a simple modification to a standard convolution to bridge this gap. We propose an adaptive convolution that adopts different kernel sizes (or radii) based on the content. The network can learn and select the proper radius based on the input content in a soft decision manner. Our proposed radius-adaptive convolutional neural network (RACNN) has a similar number of weights to a standard one, yet results show it can reach higher speeds. The code has been made available at: https://github.com/meisamrf/racnn.
Tasks
Published	2019-11-25
URL	https://arxiv.org/abs/1911.11079v1
PDF	https://arxiv.org/pdf/1911.11079v1.pdf
PWC	https://paperswithcode.com/paper/radius-adaptive-convolutional-neural-network
Repo	https://github.com/meisamrf/racnn
Framework	tf

Sampling random graph homomorphisms and applications to network data analysis


Title	Sampling random graph homomorphisms and applications to network data analysis
Authors	Hanbaek Lyu, Facundo Memoli, David Sivakoff
Abstract	A graph homomorphism is a map between two graphs that preserves adjacency relations. We consider the problem of sampling a random graph homomorphism from a graph $F$ into a large network $\mathcal{G}$. When $\mathcal{G}$ is the complete graph with $q$ nodes, this becomes the well-known problem of sampling uniform $q$-colorings of $F$. We propose two complementary MCMC algorithms for sampling random graph homomorphisms and establish bounds on their mixing times and concentration of their time averages. Based on our sampling algorithms, we propose a novel framework for network data analysis that circumvents some of the drawbacks in methods based on independent and neighborhood sampling. Various time averages of the MCMC trajectory give us real-, function-, and network-valued computable observables, including well-known ones such as homomorphism density and average clustering coefficient. One of the main observables we propose is called the conditional homomorphism density profile, which reveals hierarchical structure of the network. Furthermore, we show that these network observables are stable with respect to a suitably renormalized cut distance between networks. We also provide various examples and simulations demonstrating our framework through synthetic and real-world networks. For instance, we apply our framework to analyze Word Adjacency Networks of a data set of 45 novels and propose an authorship attribution scheme using motif sampling and conditional homomorphism density profiles.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09483v1
PDF	https://arxiv.org/pdf/1910.09483v1.pdf
PWC	https://paperswithcode.com/paper/sampling-random-graph-homomorphisms-and
Repo	https://github.com/HanbaekLyu/ONMF_ONTF_NDL
Framework	none
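
The definition of a graph homomorphism and its link to $q$-colorings can be made concrete on tiny graphs. The sketch below enumerates all maps by brute force, which is only feasible for very small inputs and is not one of the MCMC samplers the paper proposes; the function names and the example graphs are assumptions for illustration.

```python
from itertools import product

def is_homomorphism(mapping, f_edges, g_adj):
    """True if every edge (u, v) of F lands on an edge of G."""
    return all(mapping[v] in g_adj[mapping[u]] for u, v in f_edges)

def count_homomorphisms(f_nodes, f_edges, g_adj):
    """Count adjacency-preserving maps from F into G by exhaustive search."""
    g_nodes = list(g_adj)
    count = 0
    for image in product(g_nodes, repeat=len(f_nodes)):
        mapping = dict(zip(f_nodes, image))
        if is_homomorphism(mapping, f_edges, g_adj):
            count += 1
    return count

def complete_graph(q):
    """Adjacency sets of the complete graph on q nodes (no self-loops)."""
    return {i: {j for j in range(q) if j != i} for i in range(q)}

path = ([0, 1, 2], [(0, 1), (1, 2)])
triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k3 = complete_graph(3)

n_path = count_homomorphisms(*path, k3)      # proper 3-colorings of a path
n_tri = count_homomorphisms(*triangle, k3)   # proper 3-colorings of a triangle
density_path = n_path / len(k3) ** len(path[0])  # homomorphism density
```

Since the target is the complete graph on three nodes, each homomorphism is exactly a proper 3-coloring, so the path admits 3 x 2 x 2 = 12 of them and the triangle admits 3! = 6.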

Learning from Fact-checkers: Analysis and Generation of Fact-checking Language


Title	Learning from Fact-checkers: Analysis and Generation of Fact-checking Language
Authors	Nguyen Vo, Kyumin Lee
Abstract	In fighting against fake news, many fact-checking systems comprised of human-based fact-checking sites (e.g., snopes.com and politifact.com) and automatic detection systems have been developed in recent years. However, online users still keep sharing fake news even when it has been debunked. This means that early fake news detection may be insufficient and we need another complementary approach to mitigate the spread of misinformation. In this paper, we introduce a novel application of text generation for combating fake news. In particular, we (1) leverage online users named fact-checkers, who cite fact-checking sites as credible evidence to fact-check information in public discourse; (2) analyze linguistic characteristics of fact-checking tweets; and (3) propose and build a deep learning framework to generate responses with fact-checking intention to increase the fact-checkers’ engagement in fact-checking activities. Our analysis reveals that the fact-checkers tend to refute misinformation and use formal language (e.g. few swear words and little Internet slang). Our framework successfully generates relevant responses, and outperforms competing models by achieving up to 30% improvements. Our qualitative study also confirms the superiority of our generated responses compared with responses generated from the existing models.
Tasks	Fake News Detection, Text Generation
Published	2019-10-05
URL	https://arxiv.org/abs/1910.02202v1
PDF	https://arxiv.org/pdf/1910.02202v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-fact-checkers-analysis-and
Repo	https://github.com/nguyenvo09/LearningFromFactCheckers
Framework	none