January 28, 2020

3197 words 16 mins read

Paper Group ANR 1039

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition. A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging. NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions. Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation. Metric Gaussian V …

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition

Title Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition
Authors Zhao Zhang, Zemin Tang, Zheng Zhang, Yang Wang, Jie Qin, Meng Wang
Abstract Deep Convolutional Neural Networks (CNNs) have obtained great success in pattern recognition, such as recognizing the text in images. But existing CNN-based frameworks still have several drawbacks: 1) the traditional pooling operation may lose important feature information and is unlearnable; 2) the traditional convolution operation optimizes slowly and the hierarchical features from different layers are not fully utilized. In this work, we address these problems by developing a novel deep network model called Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet). Specifically, we design a further dense block called an intensive block to extract the feature information, where the original inputs and two dense blocks are connected tightly. To encode data appropriately, we present the concepts of dense fusion block and further dense fusion operations for our new intensive block. By adding short connections to different layers, the feature flow and coupling between layers are enhanced. We also replace the traditional convolution by depthwise separable convolution to make the operation efficient. To prevent important feature information from being lost, we use a convolution operation with stride 2 to replace the original pooling operation in the customary transition layers. The recognition results on large-scale Chinese string and MNIST datasets show that our IntensiveNet can deliver enhanced recognition results compared with other related deep models.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.06446v2
PDF https://arxiv.org/pdf/1912.06446v2.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-intensive-feature-flow
Repo
Framework
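
The abstract names several concrete building blocks. As a reading aid, here is a minimal PyTorch sketch (not the authors' implementation) of three of them: a depthwise separable convolution, dense concatenation of features within a block, and a stride-2 convolution used in place of pooling in the transition layer. All layer sizes are illustrative.

```python
# Minimal PyTorch sketch of ideas from the abstract: depthwise separable
# convolutions, dense concatenation of features, and a stride-2 convolution
# used instead of pooling in the transition layer. Layer sizes are illustrative.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # depthwise: one filter per input channel, then a 1x1 pointwise mix
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return torch.relu(self.pointwise(self.depthwise(x)))

class Transition(nn.Module):
    """Downsample with a learnable stride-2 convolution instead of pooling."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)

    def forward(self, x):
        return torch.relu(self.conv(x))

class IntensiveBlockSketch(nn.Module):
    """Concatenate the block input with the outputs of two conv stages,
    mimicking the dense (short-connection) feature flow described above."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = DepthwiseSeparableConv(ch, ch)
        self.conv2 = DepthwiseSeparableConv(2 * ch, ch)

    def forward(self, x):
        f1 = self.conv1(x)
        f2 = self.conv2(torch.cat([x, f1], dim=1))
        return torch.cat([x, f1, f2], dim=1)

x = torch.randn(1, 16, 32, 32)
y = Transition(48, 32)(IntensiveBlockSketch(16)(x))
print(y.shape)  # torch.Size([1, 32, 16, 16])
```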

A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging

Title A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging
Authors Fei Gao, Hyunsoo Yoon, Teresa Wu, Xianghua Chu
Abstract Object detection, segmentation and classification are three common tasks in medical image analysis. Multi-task deep learning (MTL) tackles these three tasks jointly, which provides several advantages, such as saving computing time and resources and improving robustness against overfitting. However, existing multi-task deep models start with each task as an individual task and integrate the tasks, conducted in parallel, at the end of the architecture with one cost function. Such an architecture fails to take advantage of the combined power of the features from each individual task at an early stage of the training. In this research, we propose a new architecture, FTMTLNet, an MTL enabled by feature transferring. Traditional transfer learning deals with the same or similar task from different data sources (a.k.a. domains). The underlying assumption is that the knowledge gained from source domains may help the learning task on the target domain. Our proposed FTMTLNet utilizes different tasks from the same domain. Considering that features from the tasks are different views of the domain, the combined feature maps can be well exploited using knowledge from multiple views to enhance the generalizability. To evaluate the validity of the proposed approach, FTMTLNet is compared with models from the literature, including 8 classification models, 4 detection models and 3 segmentation models, using a public full-field digital mammogram dataset for breast cancer diagnosis. Experimental results show that the proposed FTMTLNet outperforms the competing models in classification and detection and has comparable results in segmentation.
Tasks Object Detection, Transfer Learning
Published 2019-06-05
URL https://arxiv.org/abs/1906.01828v1
PDF https://arxiv.org/pdf/1906.01828v1.pdf
PWC https://paperswithcode.com/paper/a-feature-transfer-enabled-multi-task-deep
Repo
Framework
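
The early feature-transfer idea can be illustrated with a small sketch. The following PyTorch snippet (a toy stand-in, not FTMTLNet itself) feeds intermediate features from a segmentation branch into a classification branch rather than merging the tasks only at the loss; the branch shapes and task heads are assumptions for illustration.

```python
# A hedged sketch of the feature-transfer idea: intermediate features from one
# task branch are fed into another branch early, rather than only merging
# task losses at the end. Branch shapes and task heads are illustrative.
import torch
import torch.nn as nn

class FeatureTransferMTL(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.seg_branch = nn.Conv2d(16, 8, 3, padding=1)             # segmentation features
        self.cls_branch = nn.Sequential(                              # classification consumes
            nn.Conv2d(16 + 8, 16, 3, padding=1), nn.ReLU(),           # transferred seg features
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))
        self.seg_head = nn.Conv2d(8, 1, 1)

    def forward(self, x):
        shared = self.backbone(x)
        seg_feat = torch.relu(self.seg_branch(shared))
        seg_out = self.seg_head(seg_feat)                   # per-pixel mask logits
        cls_out = self.cls_branch(torch.cat([shared, seg_feat], dim=1))
        return seg_out, cls_out

seg, cls = FeatureTransferMTL()(torch.randn(2, 1, 64, 64))
print(seg.shape, cls.shape)  # torch.Size([2, 1, 64, 64]) torch.Size([2, 2])
```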

NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions

Title NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions
Authors Haekyu Park, Fred Hohman, Duen Horng Chau
Abstract As deep neural networks are increasingly used in solving high-stake problems, there is a pressing need to understand their internal decision mechanisms. Visualization has helped address this problem by assisting with interpreting complex deep neural networks. However, current tools often support only single data instances, or visualize layers in isolation. We present NeuralDivergence, an interactive visualization system that uses activation distributions as a high-level summary of what a model has learned. NeuralDivergence enables users to interactively summarize and compare activation distributions across layers, classes, and instances (e.g., pairs of adversarial attacked and benign images), helping them gain better understanding of neural network models.
Tasks
Published 2019-06-02
URL https://arxiv.org/abs/1906.00332v1
PDF https://arxiv.org/pdf/1906.00332v1.pdf
PWC https://paperswithcode.com/paper/190600332
Repo
Framework
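
The core quantity behind such a comparison is easy to sketch: summarize a layer by the histogram of its activations for a group of inputs and compare two groups with a divergence. The snippet below is a minimal illustration of that idea, not the NeuralDivergence tool itself, and the synthetic activations are assumptions.

```python
# A minimal sketch (not the NeuralDivergence tool) of the underlying idea:
# summarize a layer by the distribution of its activations and compare two
# groups of inputs (e.g., benign vs. adversarially attacked) with a divergence.
import numpy as np

def activation_histogram(acts, bins):
    hist, _ = np.histogram(acts, bins=bins)
    return (hist + 1e-9) / (hist.sum() + 1e-9 * len(hist))  # smoothed probabilities

def kl_divergence(p, q):
    return float(np.sum(p * np.log(p / q)))

# Pretend these are flattened activations of one layer for two input groups.
rng = np.random.default_rng(0)
acts_benign = rng.normal(0.0, 1.0, size=10_000)
acts_attacked = rng.normal(0.5, 1.3, size=10_000)

bins = np.linspace(-5, 5, 51)
p = activation_histogram(acts_benign, bins)
q = activation_histogram(acts_attacked, bins)
print(f"KL(benign || attacked) = {kl_divergence(p, q):.3f}")
```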

Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation

Title Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation
Authors Kyongmin Yeo
Abstract We present a data-driven model to reconstruct nonlinear dynamics from very sparse time series data, which relies on the strength of the echo state network (ESN) in learning nonlinear representations of data. Under the assumption of the universal function approximation capability of the ESN, it is shown that the reconstruction problem can be formulated as a fixed-point problem, in which the trajectory of the dynamical system is a fixed point of the ESN. An under-relaxed fixed-point iteration is proposed to reconstruct the nonlinear dynamics from a sparse observation. The proposed fixed-point ESN is tested against both univariate and multivariate chaotic dynamical systems by randomly removing up to 95% of the data. It is shown that the fixed-point ESN is able to reconstruct the complex dynamics from only 5~10% of the data. For a relatively simple non-chaotic dynamical system, numerical experiments on a forced van der Pol oscillator show that it is possible to reconstruct the nonlinear dynamics from only 1~2% of the data.
Tasks
Published 2019-06-10
URL https://arxiv.org/abs/1906.04059v1
PDF https://arxiv.org/pdf/1906.04059v1.pdf
PWC https://paperswithcode.com/paper/data-driven-reconstruction-of-nonlinear
Repo
Framework
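
The under-relaxed fixed-point iteration is easy to sketch. In the snippet below, a simple ridge-regressed autoregressive predictor stands in for the ESN (an explicit simplification): missing entries are repeatedly replaced by a relaxed mix of their current guess and the model's one-step predictions.

```python
# A hedged sketch of the under-relaxed fixed-point idea: missing entries of a
# time series are repeatedly replaced by a model's one-step predictions, with
# relaxation factor `omega`. A ridge-regressed linear AR model stands in for
# the echo state network used in the paper.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 20 * np.pi, 2000)
x = np.sin(t) + 0.5 * np.sin(3 * t)                 # "true" dynamics
mask = rng.random(x.size) < 0.9                     # ~90% of points removed
x_obs = np.where(mask, np.nan, x)

def fit_predict(series, lag=5, ridge=1e-3):
    """Fit x[i] ~ x[i-lag:i] by ridge regression, return one-step predictions."""
    X = np.stack([series[i - lag:i] for i in range(lag, series.size)])
    y = series[lag:]
    w = np.linalg.solve(X.T @ X + ridge * np.eye(lag), X.T @ y)
    pred = series.copy()
    pred[lag:] = X @ w
    return pred

x_hat = np.where(mask, 0.0, x_obs)                  # initial guess at the fixed point
omega = 0.3                                         # under-relaxation factor
for _ in range(50):
    pred = fit_predict(x_hat)
    x_hat[mask] = (1 - omega) * x_hat[mask] + omega * pred[mask]

print("RMSE on missing points:", np.sqrt(np.mean((x_hat[mask] - x[mask]) ** 2)))
```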

Metric Gaussian Variational Inference

Title Metric Gaussian Variational Inference
Authors Jakob Knollmüller, Torsten A. Enßlin
Abstract Solving Bayesian inference problems approximately with variational approaches can provide fast and accurate results. Capturing correlation within the approximation requires an explicit parametrization. This intrinsically limits the approach to either moderately dimensional problems or the strongly simplifying mean-field approximation. We propose Metric Gaussian Variational Inference (MGVI) as a method that goes beyond mean-field. Here correlations between all model parameters are taken into account, while still scaling linearly in computational time and memory. With this method we achieve higher accuracy and, in many cases, a significant speedup compared to traditional methods. MGVI is an iterative method that performs a series of Gaussian approximations to the posterior. We alternate between approximating the covariance with the inverse Fisher information metric evaluated at an intermediate mean estimate and optimizing the KL-divergence for the given covariance with respect to the mean. This procedure is iterated until the uncertainty estimate is self-consistent with the mean parameter. We achieve linear scaling by never storing the covariance explicitly. Instead, we draw samples from the approximating distribution, relying on an implicit representation and numerical schemes to approximately solve linear equations. Those samples are used to approximate the KL-divergence and its gradient. The use of natural gradient descent allows for rapid convergence. Formulating the Bayesian model in standardized coordinates makes MGVI applicable to any inference problem with continuous parameters. We demonstrate the high accuracy of MGVI by comparing it to HMC and its fast convergence relative to other established methods in several examples. We investigate real-data applications, as well as synthetic examples of varying size and complexity, with up to a million model parameters.
Tasks Bayesian Inference
Published 2019-01-30
URL https://arxiv.org/abs/1901.11033v3
PDF https://arxiv.org/pdf/1901.11033v3.pdf
PWC https://paperswithcode.com/paper/metric-gaussian-variational-inference
Repo
Framework
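
As a reading aid, the alternation described above can be written compactly. The notation below follows the abstract rather than the paper's exact symbols, so treat it as a paraphrase.

```latex
% Compact restatement of the MGVI alternation; notation is illustrative.
\begin{align*}
  \text{(covariance step)}\quad
    & \Sigma \;\approx\; M(\bar{\xi})^{-1},
      \qquad M(\xi) = \text{Fisher information metric at } \xi, \\
  \text{(mean step)}\quad
    & \bar{\xi} \;\leftarrow\; \arg\min_{\xi_0}\;
      \mathrm{KL}\!\left(\mathcal{N}(\xi_0, \Sigma)\,\middle\|\, p(\xi \mid d)\right),
\end{align*}
% iterated until the mean and the uncertainty estimate are self-consistent;
% the KL term and its gradient are estimated from samples of N(xi_bar, Sigma)
% drawn via an implicit representation rather than an explicit covariance.
```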

Fair treatment allocations in social networks

Title Fair treatment allocations in social networks
Authors James Atwood, Hansa Srinivasan, Yoni Halpern, D Sculley
Abstract Simulations of infectious disease spread have long been used to understand how epidemics evolve and how to effectively treat them. However, comparatively little attention has been paid to understanding the fairness implications of different treatment strategies – that is, how might such strategies distribute the expected disease burden differentially across various subgroups or communities in the population? In this work, we define the precision disease control problem – the problem of optimally allocating vaccines in a social network in a step-by-step fashion – and we use the ML Fairness Gym to simulate epidemic control and study it from both an efficiency and fairness perspective. We then present an exploratory analysis of several different environments and discuss the fairness implications of different treatment strategies.
Tasks
Published 2019-11-01
URL https://arxiv.org/abs/1911.05489v1
PDF https://arxiv.org/pdf/1911.05489v1.pdf
PWC https://paperswithcode.com/paper/fair-treatment-allocations-in-social-networks
Repo
Framework
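
A toy version of the setup is sketched below: disease spread is simulated on a random contact network, a fixed vaccine budget is allocated each step under two simple policies, and the resulting infection burden is reported per subgroup. This is a minimal stand-in, not the ML Fairness Gym environment; all parameters, policies, and the group structure are illustrative.

```python
# A hedged toy sketch: simulate spread on a small random contact network,
# allocate a fixed number of vaccines each step under two simple policies,
# and compare infection burden per subgroup.
import numpy as np

rng = np.random.default_rng(0)
n, steps, budget, p_transmit = 200, 30, 5, 0.05
adj = rng.random((n, n)) < 0.03
adj = np.triu(adj, 1); adj = adj | adj.T                  # undirected contacts
group = rng.integers(0, 2, size=n)                        # two subgroups

def simulate(policy):
    infected = np.zeros(n, bool); infected[rng.choice(n, 3, replace=False)] = True
    vaccinated = np.zeros(n, bool)
    for _ in range(steps):
        # vaccinate `budget` susceptible nodes according to the policy
        candidates = np.flatnonzero(~infected & ~vaccinated)
        if candidates.size:
            scores = policy(candidates, infected)
            vaccinated[candidates[np.argsort(-scores)[:budget]]] = True
        # each infected contact transmits independently with prob p_transmit
        exposure = adj[:, infected].sum(axis=1)
        newly = rng.random(n) < 1 - (1 - p_transmit) ** exposure
        infected |= newly & ~vaccinated
    return np.array([infected[group == g].mean() for g in (0, 1)])

degree_policy = lambda cand, inf: adj[cand].sum(axis=1)   # target high-degree nodes
random_policy = lambda cand, inf: rng.random(cand.size)
print("burden per group (degree):", simulate(degree_policy))
print("burden per group (random):", simulate(random_policy))
```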

Balancing Multi-level Interactions for Session-based Recommendation

Title Balancing Multi-level Interactions for Session-based Recommendation
Authors Yujia Zheng, Siyi Liu, Zailei Zhou
Abstract Predicting user actions based on anonymous sessions is a challenge for general recommendation systems because the lack of user profiles heavily limits data-driven models. Recently, session-based recommendation methods have achieved remarkable results in dealing with this task. However, the upper bound of performance can still be raised through innovative exploration of the limited data. In this paper, we propose a novel method, namely the Intra- and Inter-session Interaction-aware Graph-enhanced Network, to take inter-session item-level interactions into account. Different from existing intra-session item-level interactions and session-level collaborative information, our introduced data represents complex item-level interactions between different sessions. To mine the new data without breaking the equilibrium of the model between different interactions, we construct an intra-session graph and an inter-session graph for the current session. The former focuses on item-level interactions within a single session, and the latter models those between items among neighborhood sessions. Then different approaches are employed to encode the information of the two graphs according to their different structures, and the generated latent vectors are combined to balance the model across different scopes. Experiments on real-world datasets verify that our method outperforms other state-of-the-art methods.
Tasks Recommendation Systems, Session-Based Recommendations
Published 2019-10-29
URL https://arxiv.org/abs/1910.13527v1
PDF https://arxiv.org/pdf/1910.13527v1.pdf
PWC https://paperswithcode.com/paper/balancing-multi-level-interactions-for
Repo
Framework
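
The two graphs can be sketched with plain Python. The snippet below builds an intra-session graph from consecutive clicks in the current session and an inter-session graph from items that co-occur in overlapping neighborhood sessions; the toy sessions and the neighborhood rule are assumptions, not the paper's exact construction.

```python
# A small sketch of the two graphs: an intra-session graph from item
# transitions within the current session, and an inter-session graph linking
# its items to items co-occurring in neighborhood sessions. Toy data.
from collections import defaultdict

sessions = [["a", "b", "c"], ["b", "c", "d"], ["a", "c", "e"]]
current = ["a", "b", "c"]

intra = defaultdict(set)
for u, v in zip(current, current[1:]):          # consecutive clicks in the session
    intra[u].add(v)

inter = defaultdict(set)
neighborhood = [s for s in sessions if set(s) & set(current) and s != current]
for s in neighborhood:
    for u in s:
        for v in s:
            if u != v and (u in current or v in current):
                inter[u].add(v)

print("intra-session edges:", {k: sorted(v) for k, v in intra.items()})
print("inter-session edges:", {k: sorted(v) for k, v in inter.items()})
```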

Real-time and robust multiple-view gender classification using gait features in video surveillance

Title Real-time and robust multiple-view gender classification using gait features in video surveillance
Authors Trung Dung Do, Hakil Kim, Van Huan Nguyen
Abstract In real applications, it is common to see people walking in arbitrary directions, holding items, or wearing heavy coats. These factors are challenges for gait-based methods because they significantly change a person’s appearance. This paper proposes a novel method for classifying human gender in real time using gait information. The use of an average gait image (AGI), rather than a gait energy image (GEI), allows this method to be computationally efficient and robust against view changes. A viewpoint (VP) model is created for automatically determining the viewing angle during the testing phase. A distance signal (DS) model is constructed to remove any areas with an attachment (carried items, worn coats) from a silhouette to reduce interference in the resulting classification. Finally, the human gender is classified using multiple view-dependent classifiers trained using a support vector machine. Experimental results confirm that the proposed method achieves a high accuracy of 98.8% on the CASIA Dataset B and outperforms recent state-of-the-art methods.
Tasks
Published 2019-05-03
URL https://arxiv.org/abs/1905.01013v1
PDF https://arxiv.org/pdf/1905.01013v1.pdf
PWC https://paperswithcode.com/paper/real-time-and-robust-multiple-view-gender
Repo
Framework
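
The AGI itself is simply a per-pixel average of binary silhouettes over a gait cycle. The sketch below computes it and feeds the flattened result to a linear SVM; the random silhouettes and labels are stand-ins, and the viewpoint and distance-signal models are not reproduced.

```python
# A hedged sketch of the average gait image (AGI) idea: average a person's
# binary silhouettes over one gait cycle, flatten the result, and feed it to
# an SVM. Random data stands in for real silhouettes.
import numpy as np
from sklearn.svm import SVC

def average_gait_image(silhouettes):
    """silhouettes: (frames, H, W) binary stack for one gait cycle."""
    return silhouettes.mean(axis=0)

rng = np.random.default_rng(0)
X, y = [], []
for label in (0, 1):                                   # toy gender labels
    for _ in range(30):
        sil = (rng.random((20, 64, 44)) < 0.3 + 0.1 * label).astype(float)
        X.append(average_gait_image(sil).ravel())
        y.append(label)
X, y = np.array(X), np.array(y)

perm = rng.permutation(len(y))
train, test = perm[:-12], perm[-12:]
clf = SVC(kernel="linear").fit(X[train], y[train])
print("held-out accuracy:", clf.score(X[test], y[test]))
```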

Graph Embeddings at Scale

Title Graph Embeddings at Scale
Authors C. Bayan Bruss, Anish Khazane, Jonathan Rider, Richard Serpe, Saurabh Nagrecha, Keegan E. Hines
Abstract Graph embedding is a popular algorithmic approach for creating vector representations for individual vertices in networks. Training these algorithms at scale is important for creating embeddings that can be used for classification, ranking, recommendation and other common applications in industry. While industrial systems exist for training graph embeddings on large datasets, many of these distributed architectures are forced to partition copious amounts of data and model logic across many worker nodes. In this paper, we propose a distributed infrastructure that completely avoids graph partitioning, dynamically creates size constrained computational graphs across worker nodes, and uses highly efficient indexing operations for updating embeddings that allow the system to function at scale. We show that our system can scale an existing embeddings algorithm - skip-gram - to train on the open-source Friendster network (68 million vertices) and on an internal heterogeneous graph (50 million vertices). We measure the performance of our system on two key quantitative metrics: link-prediction accuracy and rate of convergence. We conclude this work by analyzing how a greater number of worker nodes actually improves our system’s performance on the aforementioned metrics and discuss our next steps for rigorously evaluating the embedding vectors produced by our system.
Tasks Graph Embedding, graph partitioning, Link Prediction
Published 2019-07-03
URL https://arxiv.org/abs/1907.01705v1
PDF https://arxiv.org/pdf/1907.01705v1.pdf
PWC https://paperswithcode.com/paper/graph-embeddings-at-scale
Repo
Framework
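
The single-machine core of such a pipeline is straightforward to sketch: sample random walks over the graph and train skip-gram on them, treating walks as sentences and vertices as words. The snippet below assumes the gensim package (version 4 or later) is available and is not the authors' distributed system.

```python
# A hedged sketch (not the authors' distributed infrastructure) of the core
# step: generate random walks over a graph and train a skip-gram model on
# them, so each vertex gets an embedding vector.
import random
from gensim.models import Word2Vec

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("b", "d"), ("c", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

def random_walks(adj, walks_per_node=10, walk_len=8, seed=0):
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(walk_len - 1):
                node = rng.choice(adj[node])
                walk.append(node)
            walks.append(walk)
    return walks

# Treat each walk as a "sentence" and each vertex as a "word" for skip-gram.
model = Word2Vec(sentences=random_walks(adj), vector_size=16, window=3,
                 min_count=1, sg=1, epochs=20, seed=0)
print(model.wv["a"][:4])                       # first few embedding dimensions
```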

Variations on the Chebyshev-Lagrange Activation Function

Title Variations on the Chebyshev-Lagrange Activation Function
Authors Yuchen Li, Frank Rudzicz, Jekaterina Novikova
Abstract We seek to improve the data efficiency of neural networks and present novel implementations of parameterized piece-wise polynomial activation functions. The parameters are the y-coordinates of n+1 Chebyshev nodes per hidden unit, and Lagrangian interpolation between the nodes produces the polynomial on [-1, 1]. We show results for different methods of handling inputs outside [-1, 1] on synthetic datasets, finding significant improvements in capacity of expression and accuracy of interpolation in models that compute some form of linear extrapolation from either end. We demonstrate competitive or state-of-the-art performance on the classification of images (MNIST and CIFAR-10) and minimally-correlated vectors (DementiaBank) when we replace ReLU or tanh with linearly extrapolated Chebyshev-Lagrange activations in deep residual architectures.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10064v1
PDF https://arxiv.org/pdf/1906.10064v1.pdf
PWC https://paperswithcode.com/paper/variations-on-the-chebyshev-lagrange
Repo
Framework
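
The activation is concrete enough to sketch directly: fix n+1 Chebyshev nodes on [-1, 1], treat their y-values as the learnable parameters, evaluate the Lagrange interpolant inside the interval, and extrapolate linearly from each end outside it. The node count and y-values below are illustrative, and the extrapolation slopes are taken by finite differences rather than the paper's exact scheme.

```python
# A sketch of the Chebyshev-Lagrange activation: Lagrange interpolation
# through learnable y-values at Chebyshev nodes on [-1, 1], with linear
# extrapolation outside the interval. Values here are illustrative.
import numpy as np

n = 4
k = np.arange(n + 1)
nodes = np.cos((2 * k + 1) / (2 * (n + 1)) * np.pi)       # Chebyshev nodes in (-1, 1)
y = np.tanh(2 * nodes)                                     # stand-in for learned y-values

def lagrange(x, nodes, y):
    """Evaluate the Lagrange interpolant through (nodes, y) at points x."""
    x = np.atleast_1d(x).astype(float)
    out = np.zeros_like(x)
    for j in range(len(nodes)):
        basis = np.ones_like(x)
        for m in range(len(nodes)):
            if m != j:
                basis *= (x - nodes[m]) / (nodes[j] - nodes[m])
        out += y[j] * basis
    return out

def activation(x, eps=1e-4):
    """Polynomial on [-1, 1]; linear extrapolation from each end outside it."""
    x = np.atleast_1d(x).astype(float)
    out = lagrange(np.clip(x, -1, 1), nodes, y)
    slope_lo = (lagrange(-1 + eps, nodes, y) - lagrange(-1.0, nodes, y)) / eps
    slope_hi = (lagrange(1.0, nodes, y) - lagrange(1 - eps, nodes, y)) / eps
    out[x < -1] = lagrange(-1.0, nodes, y) + slope_lo * (x[x < -1] + 1)
    out[x > 1] = lagrange(1.0, nodes, y) + slope_hi * (x[x > 1] - 1)
    return out

print(activation(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
```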

Towards Online End-to-end Transformer Automatic Speech Recognition

Title Towards Online End-to-end Transformer Automatic Speech Recognition
Authors Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe
Abstract The Transformer self-attention network has recently shown promising performance as an alternative to recurrent neural networks in end-to-end (E2E) automatic speech recognition (ASR) systems. However, the Transformer has a drawback in that the entire input sequence is required to compute self-attention. We have previously proposed a block processing method for the Transformer encoder by introducing a context-aware inheritance mechanism. An additional context embedding vector handed over from the previously processed block helps to encode not only local acoustic information but also global linguistic, channel, and speaker attributes. In this paper, we extend it towards an entire online E2E ASR system by introducing an online decoding process inspired by monotonic chunkwise attention (MoChA) into the Transformer decoder. Our novel MoChA training and inference algorithms exploit the unique properties of the Transformer, whose attentions are not always monotonic or peaky, and which has multiple heads and residual connections in the decoder layers. Evaluations on the Wall Street Journal (WSJ) and AISHELL-1 datasets show that our proposed online Transformer decoder outperforms conventional chunkwise approaches.
Tasks Speech Recognition
Published 2019-10-25
URL https://arxiv.org/abs/1910.11871v1
PDF https://arxiv.org/pdf/1910.11871v1.pdf
PWC https://paperswithcode.com/paper/towards-online-end-to-end-transformer
Repo
Framework
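
The block-processing idea on the encoder side can be illustrated with a toy sketch: consume the input in fixed-size blocks and hand a context embedding summarizing each block over to the next one. The "encoder" below is a single tanh projection, not a Transformer layer, and the summary rule is an assumption.

```python
# A hedged sketch of block processing for a streaming encoder: the input is
# consumed in fixed-size blocks, and a context embedding summarizing each
# processed block is prepended to the next one. Toy encoder, not a Transformer.
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(size=(100, 8))                 # (time, feature) acoustic frames
block_size, d = 25, 8
W = rng.normal(scale=0.1, size=(d, d))             # stand-in for encoder weights

context = np.zeros(d)                              # context embedding handed over
outputs = []
for start in range(0, frames.shape[0], block_size):
    block = frames[start:start + block_size]
    augmented = np.vstack([context[None, :], block])      # prepend context vector
    encoded = np.tanh(augmented @ W)                       # toy per-frame encoding
    outputs.append(encoded[1:])                            # drop the context position
    context = encoded.mean(axis=0)                         # summary for the next block
print(np.vstack(outputs).shape)                            # (100, 8)
```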

Information-Theoretic Local Minima Characterization and Regularization

Title Information-Theoretic Local Minima Characterization and Regularization
Authors Zhiwei Jia, Hao Su
Abstract Recent advances in deep learning theory have evoked the study of generalizability across different local minima of deep neural networks (DNNs). While current work has focused either on discovering properties of good local minima or on developing regularization techniques to induce good local minima, no approach exists that can tackle both problems. We achieve these two goals successfully in a unified manner. Specifically, based on the Fisher information we propose a metric that is both strongly indicative of the generalizability of local minima and effectively applicable as a practical regularizer. We provide theoretical analysis, including a generalization bound, and empirically demonstrate the success of our approach in both capturing and improving the generalizability of DNNs. Experiments are performed on CIFAR-10 and CIFAR-100 for various network architectures.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08192v1
PDF https://arxiv.org/pdf/1911.08192v1.pdf
PWC https://paperswithcode.com/paper/information-theoretic-local-minima-1
Repo
Framework
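
One simple way to turn such a quantity into a regularizer is sketched below: add a multiple of the trace of the empirical Fisher information, computed from gradients of the log-likelihood, to the training loss. This illustrates the general recipe only; the paper's exact metric and regularizer are not reproduced.

```python
# A hedged sketch of using Fisher information as a regularizer: add the trace
# of the empirical Fisher (sum of squared per-parameter gradients of the
# log-likelihood) to the training loss. General recipe, not the paper's metric.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))
x, y = torch.randn(16, 10), torch.randint(0, 3, (16,))

def fisher_trace(model, x, y):
    """Trace of the empirical Fisher, via gradients of the log-likelihood."""
    log_lik = -F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(log_lik, model.parameters(), create_graph=True)
    return sum((g ** 2).sum() for g in grads)

lam = 1e-3
loss = F.cross_entropy(model(x), y) + lam * fisher_trace(model, x, y)
loss.backward()                     # gradients now include the regularizer term
print(float(loss))
```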

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function

Title Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Authors Zihan Zhang, Xiangyang Ji
Abstract We present an algorithm based on the \emph{Optimism in the Face of Uncertainty} (OFU) principle which is able to efficiently learn Reinforcement Learning (RL) problems modeled by a Markov decision process (MDP) with a finite state-action space. By evaluating the state-pair difference of the optimal bias function $h^{*}$, the proposed algorithm achieves a regret bound of $\tilde{O}(\sqrt{SAHT})$\footnote{The symbol $\tilde{O}$ means $O$ with log factors ignored.} for an MDP with $S$ states and $A$ actions, in the case that an upper bound $H$ on the span of $h^{*}$, i.e., $sp(h^{*})$, is known. This result improves on the best previous regret bound $\tilde{O}(S\sqrt{AHT})$ \citep{fruit2019improved} by a factor of $\sqrt{S}$. Furthermore, this regret bound matches the lower bound of $\Omega(\sqrt{SAHT})$ \citep{jaksch2010near} up to a logarithmic factor. As a consequence, we show that there is a near-optimal regret bound of $\tilde{O}(\sqrt{SADT})$ for MDPs with a finite diameter $D$, compared to the lower bound of $\Omega(\sqrt{SADT})$ \citep{jaksch2010near}.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05110v3
PDF https://arxiv.org/pdf/1906.05110v3.pdf
PWC https://paperswithcode.com/paper/regret-minimization-for-reinforcement
Repo
Framework

The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning

Title The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning
Authors Steffen Wolf, Alberto Bailoni, Constantin Pape, Nasim Rahaman, Anna Kreshuk, Ullrich Köthe, Fred A. Hamprecht
Abstract Image partitioning, or segmentation without semantics, is the task of decomposing an image into distinct segments, or equivalently to detect closed contours. Most prior work either requires seeds, one per segment; or a threshold; or formulates the task as multicut / correlation clustering, an NP-hard problem. Here, we propose a greedy algorithm for signed graph partitioning, the “Mutex Watershed”. Unlike seeded watershed, the algorithm can accommodate not only attractive but also repulsive cues, allowing it to find a previously unspecified number of segments without the need for explicit seeds or a tunable threshold. We also prove that this simple algorithm solves to global optimality an objective function that is intimately related to the multicut / correlation clustering integer linear programming formulation. The algorithm is deterministic, very simple to implement, and has empirically linearithmic complexity. When presented with short-range attractive and long-range repulsive cues from a deep neural network, the Mutex Watershed gives the best results currently known for the competitive ISBI 2012 EM segmentation benchmark.
Tasks graph partitioning
Published 2019-04-25
URL http://arxiv.org/abs/1904.12654v1
PDF http://arxiv.org/pdf/1904.12654v1.pdf
PWC https://paperswithcode.com/paper/190412654
Repo
Framework
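
The greedy procedure can be sketched in a few lines: process attractive and repulsive edges in order of decreasing weight, merge clusters along attractive edges unless a mutex constraint forbids it, and record mutex constraints along repulsive edges unless the endpoints are already merged. The union-find bookkeeping below is simplified and the toy graph is an assumption, so treat it as an illustration rather than the reference implementation.

```python
# A hedged sketch of the greedy Mutex Watershed idea on a toy signed graph.
parent = {}
mutex = {}                                   # root -> set of roots it must not join

def find(u):
    parent.setdefault(u, u)
    while parent[u] != u:
        parent[u] = parent[parent[u]]        # path halving
        u = parent[u]
    return u

def cluster(edges):
    # edges: (weight, u, v, attractive?) processed in decreasing weight order
    for w, u, v, attractive in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if attractive:
            if rv in mutex.get(ru, set()):
                continue                     # blocked by an earlier repulsive edge
            parent[rv] = ru                  # merge clusters and their mutex sets
            merged = mutex.get(ru, set()) | mutex.get(rv, set())
            mutex[ru] = merged
            for r in merged:                 # repoint constraints from rv to ru
                mutex.setdefault(r, set()).discard(rv)
                mutex[r].add(ru)
        else:
            mutex.setdefault(ru, set()).add(rv)
            mutex.setdefault(rv, set()).add(ru)

edges = [(0.9, "a", "b", True), (0.8, "b", "c", False),
         (0.7, "c", "d", True), (0.4, "b", "d", True)]
cluster(edges)
print({node: find(node) for node in "abcd"})  # {'a': 'a', 'b': 'a', 'c': 'c', 'd': 'c'}
```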

Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text

Title Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Authors Guangneng Hu, Yu Zhang, Qiang Yang
Abstract Collaborative filtering (CF) is the key technique for recommender systems (RSs). CF exploits user-item behavior interactions (e.g., clicks) only and hence suffers from the data sparsity issue. One research thread is to integrate auxiliary information such as product reviews and news titles, leading to hybrid filtering methods. Another thread is to transfer knowledge from other source domains, such as improving the movie recommendation with knowledge from the book domain, leading to transfer learning methods. In real life, no single service can satisfy all of a user's information needs. This motivates us to exploit both auxiliary and source information for RSs in this paper. We propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH) methods for cross-domain recommendation with unstructured text in an end-to-end manner. TMH attentively extracts useful content from unstructured text via a memory module and selectively transfers knowledge from a source domain via a transfer network. On two real-world datasets, TMH shows better performance in terms of three ranking metrics in comparison with various baselines. We conduct thorough analyses to understand how the text content and the transferred knowledge help the proposed model.
Tasks Recommendation Systems, Transfer Learning
Published 2019-01-22
URL http://arxiv.org/abs/1901.07199v1
PDF http://arxiv.org/pdf/1901.07199v1.pdf
PWC https://paperswithcode.com/paper/transfer-meets-hybrid-a-synthetic-approach
Repo
Framework
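
The two ingredients named in the abstract can be illustrated with a small numpy sketch: an attentive read over word vectors acting as the text memory, and a gated combination with a user embedding transferred from a source domain. All dimensions, data, and the gating rule are assumptions for illustration.

```python
# A hedged sketch of an attentive memory read over text vectors plus a gated
# combination with a transferred source-domain embedding. Toy data and gating.
import numpy as np

rng = np.random.default_rng(0)
d = 8
query = rng.normal(size=d)                      # target-domain user-item query
text_memory = rng.normal(size=(12, d))          # word vectors of a review / title
source_user = rng.normal(size=d)                # embedding learned in the source domain

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

attn = softmax(text_memory @ query)             # attention over memory slots
text_vec = attn @ text_memory                   # attentively extracted content

gate = 1 / (1 + np.exp(-(query @ source_user))) # scalar transfer gate
combined = text_vec + gate * source_user        # selectively transferred knowledge
score = combined @ query                        # toy preference score
print(round(float(score), 3))
```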