January 28, 2020


Paper Group ANR 1039

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition. A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging. NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions. Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation. Metric Gaussian V …

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition


Title	Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition
Authors	Zhao Zhang, Zemin Tang, Zheng Zhang, Yang Wang, Jie Qin, Meng Wang
Abstract	Deep Convolutional Neural Networks (CNNs) have achieved great success in pattern recognition, such as recognizing text in images. But existing CNN-based frameworks still have several drawbacks: 1) the traditional pooling operation may lose important feature information and is unlearnable; 2) the traditional convolution operation optimizes slowly and the hierarchical features from different layers are not fully utilized. In this work, we address these problems by developing a novel deep network model called Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet). Specifically, we design a further dense block called intensive block to extract the feature information, where the original inputs and two dense blocks are connected tightly. To encode data appropriately, we present the concepts of dense fusion block and further dense fusion operations for our new intensive block. By adding short connections to different layers, the feature flow and coupling between layers are enhanced. We also replace the traditional convolution by depthwise separable convolution to make the operation efficient. To prevent important feature information being lost to a certain extent, we use a convolution operation with stride 2 to replace the original pooling operation in the customary transition layers. The recognition results on large-scale Chinese string and MNIST datasets show that our IntensiveNet can deliver enhanced recognition results, compared with other related deep models.
Tasks
Published	2019-12-13
URL	https://arxiv.org/abs/1912.06446v2
PDF	https://arxiv.org/pdf/1912.06446v2.pdf
PWC	https://paperswithcode.com/paper/fully-convolutional-intensive-feature-flow
Repo
Framework
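The two efficiency ideas named in the abstract above (depthwise separable convolution, and a stride-2 convolution in place of pooling) can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' code: the function names, the 3x3 kernel, the 64/128 channel counts, and the 28x28 input are all assumptions chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def conv_output_size(n, k, stride, padding):
    """Spatial output size of a convolution on an n x n input."""
    return (n + 2 * padding - k) // stride + 1


# Hypothetical layer: 3x3 kernel, 64 -> 128 channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)

# A stride-2 convolution halves a 28x28 map, as 2x2 pooling would,
# but with learnable weights.
downsampled = conv_output_size(28, k=3, stride=2, padding=1)
```

For this hypothetical layer the separable form needs 8,768 weights instead of 73,728, and the stride-2 convolution reduces a 28x28 feature map to 14x14.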

A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging


Title	A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging
Authors	Fei Gao, Hyunsoo Yoon, Teresa Wu, Xianghua Chu
Abstract	Object detection, segmentation and classification are three common tasks in medical image analysis. Multi-task deep learning (MTL) tackles these three tasks jointly, which provides several advantages, including saving computing time and resources and improving robustness against overfitting. However, existing multitask deep models start with each task as an individual task and integrate the tasks, conducted in parallel, at the end of the architecture with one cost function. Such an architecture fails to take advantage of the combined power of the features from each individual task at an early stage of the training. In this research, we propose a new architecture, FTMTLNet, an MTL enabled by feature transferring. Traditional transfer learning deals with the same or similar task from different data sources (a.k.a. domains). The underlying assumption is that the knowledge gained from source domains may help the learning task on the target domain. Our proposed FTMTLNet utilizes the different tasks from the same domain. Considering features from the tasks are different views of the domain, the combined feature maps can be well exploited using knowledge from multiple views to enhance the generalizability. To evaluate the validity of the proposed approach, FTMTLNet is compared with models from the literature including 8 classification models, 4 detection models and 3 segmentation models using a public full field digital mammogram dataset for breast cancer diagnosis. Experimental results show that the proposed FTMTLNet outperforms the competing models in classification and detection and has comparable results in segmentation.
Tasks	Object Detection, Transfer Learning
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01828v1
PDF	https://arxiv.org/pdf/1906.01828v1.pdf
PWC	https://paperswithcode.com/paper/a-feature-transfer-enabled-multi-task-deep
Repo
Framework

NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions


Title	NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions
Authors	Haekyu Park, Fred Hohman, Duen Horng Chau
Abstract	As deep neural networks are increasingly used in solving high-stake problems, there is a pressing need to understand their internal decision mechanisms. Visualization has helped address this problem by assisting with interpreting complex deep neural networks. However, current tools often support only single data instances, or visualize layers in isolation. We present NeuralDivergence, an interactive visualization system that uses activation distributions as a high-level summary of what a model has learned. NeuralDivergence enables users to interactively summarize and compare activation distributions across layers, classes, and instances (e.g., pairs of adversarial attacked and benign images), helping them gain better understanding of neural network models.
Tasks
Published	2019-06-02
URL	https://arxiv.org/abs/1906.00332v1
PDF	https://arxiv.org/pdf/1906.00332v1.pdf
PWC	https://paperswithcode.com/paper/190600332
Repo
Framework

Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation


Title	Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation
Authors	Kyongmin Yeo
Abstract	We present a data-driven model to reconstruct nonlinear dynamics from very sparse time series data, which relies on the strength of the echo state network (ESN) in learning nonlinear representation of data. With an assumption of the universal function approximation capability of ESN, it is shown that the reconstruction problem can be formulated as a fixed-point problem, in which the trajectory of the dynamical system is a fixed point of the ESN. An under-relaxed fixed-point iteration is proposed to reconstruct the nonlinear dynamics from a sparse observation. The proposed fixed-point ESN is tested against both univariate and multivariate chaotic dynamical systems by randomly removing up to 95% of the data. It is shown that the fixed-point ESN is able to reconstruct the complex dynamics from only 5~10% of the data. For a relatively simple non-chaotic dynamical system, the numerical experiments on a forced van der Pol oscillator show that it is possible to reconstruct the nonlinear dynamics from only 1~2% of the data.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04059v1
PDF	https://arxiv.org/pdf/1906.04059v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-reconstruction-of-nonlinear
Repo
Framework
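The abstract above relies on an under-relaxed fixed-point iteration. As a rough illustration of that general technique only (not of the paper's ESN model), the sketch below applies the update x <- (1 - alpha) * x + alpha * f(x) to a scalar map. The function name, the map f(x) = -2x + 3, and the value alpha = 0.5 are assumptions picked so that the plain iteration x <- f(x) diverges while the relaxed one converges.

```python
def under_relaxed_fixed_point(f, x0, alpha=0.5, tol=1e-12, max_iter=10000):
    """Iterate x <- (1 - alpha) * x + alpha * f(x) until the step is below tol."""
    x = x0
    for _ in range(max_iter):
        x_new = (1.0 - alpha) * x + alpha * f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def f(x):
    # Toy map with fixed point x = 1; |f'(x)| = 2, so x <- f(x) diverges.
    return -2.0 * x + 3.0


# With alpha = 0.5 the relaxed map is -0.5 * x + 1.5, a contraction.
fixed_point = under_relaxed_fixed_point(f, x0=0.0, alpha=0.5)
```

In the paper's setting the unknown is a whole trajectory rather than a scalar, but the role of the relaxation factor is the same: it damps each update so that the iteration can settle on a fixed point.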

Metric Gaussian Variational Inference


Title	Metric Gaussian Variational Inference
Authors	Jakob Knollmüller, Torsten A. Enßlin
Abstract	Solving Bayesian inference problems approximately with variational approaches can provide fast and accurate results. Capturing correlation within the approximation requires an explicit parametrization. This intrinsically limits this approach to moderately dimensional problems, or requires the strongly simplifying mean-field approach. We propose Metric Gaussian Variational Inference (MGVI) as a method that goes beyond mean-field. Here correlations between all model parameters are taken into account, while still scaling linearly in computational time and memory. With this method we achieve higher accuracy and in many cases a significant speedup compared to traditional methods. MGVI is an iterative method that performs a series of Gaussian approximations to the posterior. We alternate between approximating the covariance with the inverse Fisher information metric evaluated at an intermediate mean estimate and optimizing the KL-divergence for the given covariance with respect to the mean. This procedure is iterated until the uncertainty estimate is self-consistent with the mean parameter. We achieve linear scaling by never storing the covariance explicitly. Instead we draw samples from the approximating distribution relying on an implicit representation and numerical schemes to approximately solve linear equations. Those samples are used to approximate the KL-divergence and its gradient. The usage of natural gradient descent allows for rapid convergence. Formulating the Bayesian model in standardized coordinates makes MGVI applicable to any inference problem with continuous parameters. We demonstrate the high accuracy of MGVI by comparing it to HMC and its fast convergence relative to other established methods in several examples. We investigate real-data applications, as well as synthetic examples of varying size and complexity and up to a million model parameters.
Tasks	Bayesian Inference
Published	2019-01-30
URL	https://arxiv.org/abs/1901.11033v3
PDF	https://arxiv.org/pdf/1901.11033v3.pdf
PWC	https://paperswithcode.com/paper/metric-gaussian-variational-inference
Repo
Framework
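The abstract above mentions solving linear equations numerically without ever storing the covariance matrix. One common scheme of that kind is the conjugate gradient method applied to an operator given only as a function. The sketch below illustrates that idea; it is an assumption that it resembles what MGVI does internally, and the tridiagonal operator `apply_metric` is a hypothetical stand-in for a Fisher metric, not taken from the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(apply_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for x = 0
    p = list(r)            # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def apply_metric(v):
    """Hypothetical SPD operator: 3 on the diagonal, -1 on the off-diagonals.

    The matrix is never formed; memory use stays linear in len(v).
    """
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(3.0 * v[i] - left - right)
    return out


rhs = [1.0] * 50
solution = conjugate_gradient(apply_metric, rhs)
residual = max(abs(a - b) for a, b in zip(apply_metric(solution), rhs))
```

Each iteration needs only one application of the operator and a few vector operations, which is what keeps both time per iteration and memory linear in the number of parameters.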


Fair treatment allocations in social networks


Title	Fair treatment allocations in social networks
Authors	James Atwood, Hansa Srinivasan, Yoni Halpern, D Sculley
Abstract	Simulations of infectious disease spread have long been used to understand how epidemics evolve and how to effectively treat them. However, comparatively little attention has been paid to understanding the fairness implications of different treatment strategies – that is, how might such strategies distribute the expected disease burden differentially across various subgroups or communities in the population? In this work, we define the precision disease control problem – the problem of optimally allocating vaccines in a social network in a step-by-step fashion – and we use the ML Fairness Gym to simulate epidemic control and study it from both an efficiency and fairness perspective. We then present an exploratory analysis of several different environments and discuss the fairness implications of different treatment strategies.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.05489v1
PDF	https://arxiv.org/pdf/1911.05489v1.pdf
PWC	https://paperswithcode.com/paper/fair-treatment-allocations-in-social-networks
Repo
Framework

Balancing Multi-level Interactions for Session-based Recommendation


Title	Balancing Multi-level Interactions for Session-based Recommendation
Authors	Yujia Zheng, Siyi Liu, Zailei Zhou
Abstract	Predicting user actions based on anonymous sessions is a challenge to general recommendation systems because the lack of user profiles heavily limits data-driven models. Recently, session-based recommendation methods have achieved remarkable results in dealing with this task. However, the upper bound of performance can still be boosted through the innovative exploration of limited data. In this paper, we propose a novel method, namely Intra- and Inter-session Interaction-aware Graph-enhanced Network, to take inter-session item-level interactions into account. Different from existing intra-session item-level interactions and session-level collaborative information, our introduced data represents complex item-level interactions between different sessions. For mining the new data without breaking the equilibrium of the model between different interactions, we construct an intra-session graph and an inter-session graph for the current session. The former focuses on item-level interactions within a single session and the latter models those between items among neighborhood sessions. Then different approaches are employed to encode the information of the two graphs according to their different structures, and the generated latent vectors are combined to balance the model across different scopes. Experiments on real-world datasets verify that our method outperforms other state-of-the-art methods.
Tasks	Recommendation Systems, Session-Based Recommendations
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13527v1
PDF	https://arxiv.org/pdf/1910.13527v1.pdf
PWC	https://paperswithcode.com/paper/balancing-multi-level-interactions-for
Repo
Framework

Real-time and robust multiple-view gender classification using gait features in video surveillance


Title	Real-time and robust multiple-view gender classification using gait features in video surveillance
Authors	Trung Dung Do, Hakil Kim, Van Huan Nguyen
Abstract	It is common to view people in real applications walking in arbitrary directions, holding items, or wearing heavy coats. These factors are challenges in gait-based application methods because they significantly change a person’s appearance. This paper proposes a novel method for classifying human gender in real time using gait information. The use of an average gait image (AGI), rather than a gait energy image (GEI), allows this method to be computationally efficient and robust against view changes. A viewpoint (VP) model is created for automatically determining the viewing angle during the testing phase. A distance signal (DS) model is constructed to remove any areas with an attachment (carried items, worn coats) from a silhouette to reduce the interference in the resulting classification. Finally, the human gender is classified using multiple view-dependent classifiers trained using a support vector machine. Experiment results confirm that the proposed method achieves a high accuracy of 98.8% on the CASIA Dataset B and outperforms the recent state-of-the-art methods.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01013v1
PDF	https://arxiv.org/pdf/1905.01013v1.pdf
PWC	https://paperswithcode.com/paper/real-time-and-robust-multiple-view-gender
Repo
Framework

Graph Embeddings at Scale


Title	Graph Embeddings at Scale
Authors	C. Bayan Bruss, Anish Khazane, Jonathan Rider, Richard Serpe, Saurabh Nagrecha, Keegan E. Hines
Abstract	Graph embedding is a popular algorithmic approach for creating vector representations for individual vertices in networks. Training these algorithms at scale is important for creating embeddings that can be used for classification, ranking, recommendation and other common applications in industry. While industrial systems exist for training graph embeddings on large datasets, many of these distributed architectures are forced to partition copious amounts of data and model logic across many worker nodes. In this paper, we propose a distributed infrastructure that completely avoids graph partitioning, dynamically creates size constrained computational graphs across worker nodes, and uses highly efficient indexing operations for updating embeddings that allow the system to function at scale. We show that our system can scale an existing embeddings algorithm - skip-gram - to train on the open-source Friendster network (68 million vertices) and on an internal heterogeneous graph (50 million vertices). We measure the performance of our system on two key quantitative metrics: link-prediction accuracy and rate of convergence. We conclude this work by analyzing how a greater number of worker nodes actually improves our system’s performance on the aforementioned metrics and discuss our next steps for rigorously evaluating the embedding vectors produced by our system.
Tasks	Graph Embedding, graph partitioning, Link Prediction
Published	2019-07-03
URL	https://arxiv.org/abs/1907.01705v1
PDF	https://arxiv.org/pdf/1907.01705v1.pdf
PWC	https://paperswithcode.com/paper/graph-embeddings-at-scale
Repo
Framework

Variations on the Chebyshev-Lagrange Activation Function


Title	Variations on the Chebyshev-Lagrange Activation Function
Authors	Yuchen Li, Frank Rudzicz, Jekaterina Novikova
Abstract	We seek to improve the data efficiency of neural networks and present novel implementations of parameterized piece-wise polynomial activation functions. The parameters are the y-coordinates of n+1 Chebyshev nodes per hidden unit and Lagrangian interpolation between the nodes produces the polynomial on [-1, 1]. We show results for different methods of handling inputs outside [-1, 1] on synthetic datasets, finding significant improvements in capacity of expression and accuracy of interpolation in models that compute some form of linear extrapolation from either end. We demonstrate competitive or state-of-the-art performance on the classification of images (MNIST and CIFAR-10) and minimally-correlated vectors (DementiaBank) when we replace ReLU or tanh with linearly extrapolated Chebyshev-Lagrange activations in deep residual architectures.
Tasks
Published	2019-06-24
URL	https://arxiv.org/abs/1906.10064v1
PDF	https://arxiv.org/pdf/1906.10064v1.pdf
PWC	https://paperswithcode.com/paper/variations-on-the-chebyshev-lagrange
Repo
Framework
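The construction described in the abstract above (y-coordinates at n+1 Chebyshev nodes, Lagrange interpolation on [-1, 1], linear extrapolation outside) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes Chebyshev nodes of the first kind, estimates the boundary slope with a central finite difference, and uses hypothetical function names throughout.

```python
import math


def chebyshev_nodes(n):
    """n + 1 Chebyshev nodes of the first kind on (-1, 1) (an assumed choice)."""
    return [math.cos((2 * k + 1) * math.pi / (2 * (n + 1))) for k in range(n + 1)]


def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for m, xm in enumerate(xs):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total


def activation(ys, x, h=1e-5):
    """Polynomial on [-1, 1]; linear extrapolation from the nearer end outside."""
    xs = chebyshev_nodes(len(ys) - 1)
    if -1.0 <= x <= 1.0:
        return lagrange(xs, ys, x)
    edge = 1.0 if x > 1.0 else -1.0
    # Slope of the polynomial at the boundary, via central finite difference.
    slope = (lagrange(xs, ys, edge + h) - lagrange(xs, ys, edge - h)) / (2 * h)
    return lagrange(xs, ys, edge) + slope * (x - edge)


nodes = chebyshev_nodes(3)
identity_params = list(nodes)             # y = x at the nodes: identity activation
square_params = [x * x for x in nodes]    # y = x^2 at the nodes
```

With `square_params` the unit behaves like x^2 inside [-1, 1] and continues along the tangent line at x = 1 for larger inputs, so the output grows linearly rather than polynomially outside the interval. In a trained network the y-coordinates would be learnable parameters.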

Towards Online End-to-end Transformer Automatic Speech Recognition


Title	Towards Online End-to-end Transformer Automatic Speech Recognition
Authors	Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe
Abstract	The Transformer self-attention network has recently shown promising performance as an alternative to recurrent neural networks in end-to-end (E2E) automatic speech recognition (ASR) systems. However, Transformer has a drawback in that the entire input sequence is required to compute self-attention. We have proposed a block processing method for the Transformer encoder by introducing a context-aware inheritance mechanism. An additional context embedding vector handed over from the previously processed block helps to encode not only local acoustic information but also global linguistic, channel, and speaker attributes. In this paper, we extend it towards an entire online E2E ASR system by introducing an online decoding process inspired by monotonic chunkwise attention (MoChA) into the Transformer decoder. Our novel MoChA training and inference algorithms exploit the unique properties of Transformer, whose attentions are not always monotonic or peaky, and have multiple heads and residual connections of the decoder layers. Evaluations on the Wall Street Journal (WSJ) and AISHELL-1 show that our proposed online Transformer decoder outperforms conventional chunkwise approaches.
Tasks	Speech Recognition
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11871v1
PDF	https://arxiv.org/pdf/1910.11871v1.pdf
PWC	https://paperswithcode.com/paper/towards-online-end-to-end-transformer
Repo
Framework

Information-Theoretic Local Minima Characterization and Regularization


Title	Information-Theoretic Local Minima Characterization and Regularization
Authors	Zhiwei Jia, Hao Su
Abstract	Recent advances in deep learning theory have evoked the study of generalizability across different local minima of deep neural networks (DNNs). While current work focused on either discovering properties of good local minima or developing regularization techniques to induce good local minima, no approach exists that can tackle both problems. We achieve these two goals successfully in a unified manner. Specifically, based on the Fisher information we propose a metric both strongly indicative of generalizability of local minima and effectively applied as a practical regularizer. We provide theoretical analysis including a generalization bound and empirically demonstrate the success of our approach in both capturing and improving the generalizability of DNNs. Experiments are performed on CIFAR-10 and CIFAR-100 for various network architectures.
Tasks
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08192v1
PDF	https://arxiv.org/pdf/1911.08192v1.pdf
PWC	https://paperswithcode.com/paper/information-theoretic-local-minima-1
Repo
Framework

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function


Title	Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Authors	Zihan Zhang, Xiangyang Ji
Abstract	We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle which is able to efficiently learn Reinforcement Learning (RL) problems modeled by a Markov decision process (MDP) with finite state-action space. By evaluating the state-pair difference of the optimal bias function $h^{*}$, the proposed algorithm achieves a regret bound of $\tilde{O}(\sqrt{SAHT})$ (where $\tilde{O}$ means $O$ with log factors ignored) for an MDP with $S$ states and $A$ actions, in the case that an upper bound $H$ on the span of $h^{*}$, i.e., $sp(h^{*})$, is known. This result outperforms the best previous regret bound $\tilde{O}(S\sqrt{AHT})$ (Fruit et al., 2019) by a factor of $\sqrt{S}$. Furthermore, this regret bound matches the lower bound of $\Omega(\sqrt{SAHT})$ (Jaksch et al., 2010) up to a logarithmic factor. As a consequence, we show that there is a near optimal regret bound of $\tilde{O}(\sqrt{SADT})$ for MDPs with a finite diameter $D$ compared to the lower bound of $\Omega(\sqrt{SADT})$ (Jaksch et al., 2010).
Tasks
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05110v3
PDF	https://arxiv.org/pdf/1906.05110v3.pdf
PWC	https://paperswithcode.com/paper/regret-minimization-for-reinforcement
Repo
Framework
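The $\sqrt{S}$ improvement quoted in the abstract above can be checked by dividing the previous bound by the new one, ignoring logarithmic factors and using only the symbols defined there:

```latex
\[
\frac{S\sqrt{AHT}}{\sqrt{SAHT}}
  = \frac{S\sqrt{AHT}}{\sqrt{S}\,\sqrt{AHT}}
  = \frac{S}{\sqrt{S}}
  = \sqrt{S}.
\]
```

The same ratio shows why the new bound meets the $\Omega(\sqrt{SAHT})$ lower bound up to log factors, while the earlier bound was a factor of $\sqrt{S}$ above it.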

The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning


Title	The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning
Authors	Steffen Wolf, Alberto Bailoni, Constantin Pape, Nasim Rahaman, Anna Kreshuk, Ullrich Köthe, Fred A. Hamprecht
Abstract	Image partitioning, or segmentation without semantics, is the task of decomposing an image into distinct segments, or equivalently to detect closed contours. Most prior work either requires seeds, one per segment; or a threshold; or formulates the task as multicut / correlation clustering, an NP-hard problem. Here, we propose a greedy algorithm for signed graph partitioning, the “Mutex Watershed”. Unlike seeded watershed, the algorithm can accommodate not only attractive but also repulsive cues, allowing it to find a previously unspecified number of segments without the need for explicit seeds or a tunable threshold. We also prove that this simple algorithm solves to global optimality an objective function that is intimately related to the multicut / correlation clustering integer linear programming formulation. The algorithm is deterministic, very simple to implement, and has empirically linearithmic complexity. When presented with short-range attractive and long-range repulsive cues from a deep neural network, the Mutex Watershed gives the best results currently known for the competitive ISBI 2012 EM segmentation benchmark.
Tasks	graph partitioning
Published	2019-04-25
URL	http://arxiv.org/abs/1904.12654v1
PDF	http://arxiv.org/pdf/1904.12654v1.pdf
PWC	https://paperswithcode.com/paper/190412654
Repo
Framework
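The greedy signed graph partitioning described in the abstract above can be sketched with a union-find structure plus a set of mutual-exclusion ("mutex") constraints per cluster. This is a minimal sketch based on the abstract's description, not the authors' implementation. The edge format `(priority, u, v, attractive)`, the function name, and the toy graph are assumptions made for illustration.

```python
def mutex_watershed(num_nodes, edges):
    """Greedy partitioning of a signed graph.

    edges: iterable of (priority, u, v, attractive), processed by decreasing
    priority. Attractive edges merge clusters unless a mutex constraint
    forbids it; repulsive edges add a mutex constraint between clusters.
    """
    parent = list(range(num_nodes))
    mutex = [set() for _ in range(num_nodes)]  # per root: roots it must avoid

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for _, u, v, attractive in sorted(edges, key=lambda e: -e[0]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                       # already in the same segment
        if attractive:
            if rv in mutex[ru]:
                continue                   # merge forbidden by a mutex
            parent[rv] = ru                # merge cluster rv into ru
            for r in mutex[rv]:            # re-point constraints at new root
                mutex[r].discard(rv)
                mutex[r].add(ru)
            mutex[ru] |= mutex[rv]
            mutex[rv] = set()
        else:
            mutex[ru].add(rv)
            mutex[rv].add(ru)
    return [find(i) for i in range(num_nodes)]


# Toy chain 0-1-2-3 with one long-range repulsive edge between 0 and 3.
toy_edges = [
    (0.9, 0, 1, True),
    (0.8, 2, 3, True),
    (0.6, 0, 3, False),
    (0.4, 1, 2, True),
]
labels = mutex_watershed(4, toy_edges)
```

In the toy graph the repulsive edge between nodes 0 and 3 is processed before the weak attractive edge between 1 and 2, so the merge is blocked and two segments remain, with no seeds or threshold involved. The sort dominates the running time, which is consistent with the linearithmic behavior the abstract reports.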

Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text


Title	Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Authors	Guangneng Hu, Yu Zhang, Qiang Yang
Abstract	Collaborative filtering (CF) is the key technique for recommender systems (RSs). CF exploits user-item behavior interactions (e.g., clicks) only and hence suffers from the data sparsity issue. One research thread is to integrate auxiliary information such as product reviews and news titles, leading to hybrid filtering methods. Another thread is to transfer knowledge from other source domains, such as improving movie recommendation with knowledge from the book domain, leading to transfer learning methods. In real life, no single service can satisfy all of a user's information needs. This motivates us to exploit both auxiliary and source information for RSs in this paper. We propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH) methods for cross-domain recommendation with unstructured text in an end-to-end manner. TMH attentively extracts useful content from unstructured text via a memory module and selectively transfers knowledge from a source domain via a transfer network. On two real-world datasets, TMH shows better performance in terms of three ranking metrics compared with various baselines. We conduct thorough analyses to understand how the text content and transferred knowledge help the proposed model.
Tasks	Recommendation Systems, Transfer Learning
Published	2019-01-22
URL	http://arxiv.org/abs/1901.07199v1
PDF	http://arxiv.org/pdf/1901.07199v1.pdf
PWC	https://paperswithcode.com/paper/transfer-meets-hybrid-a-synthetic-approach
Repo
Framework