Paper Group ANR 1039
Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition
Title | Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition |
Authors | Zhao Zhang, Zemin Tang, Zheng Zhang, Yang Wang, Jie Qin, Meng Wang |
Abstract | Deep Convolutional Neural Networks (CNNs) have achieved great success in pattern recognition, such as recognizing text in images. But existing CNN-based frameworks still have several drawbacks: 1) the traditional pooling operation may lose important feature information and is unlearnable; 2) the traditional convolution operation optimizes slowly and the hierarchical features from different layers are not fully utilized. In this work, we address these problems by developing a novel deep network model called Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet). Specifically, we design a further dense block called intensive block to extract the feature information, where the original inputs and two dense blocks are connected tightly. To encode data appropriately, we present the concepts of dense fusion block and further dense fusion operations for our new intensive block. By adding short connections to different layers, the feature flow and coupling between layers are enhanced. We also replace the traditional convolution by depthwise separable convolution to make the operation efficient. To limit the loss of important feature information, we use a convolution operation with stride 2 to replace the original pooling operation in the customary transition layers. The recognition results on large-scale Chinese string and MNIST datasets show that our IntensiveNet can deliver enhanced recognition results, compared with other related deep models. |
Tasks | |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06446v2 |
https://arxiv.org/pdf/1912.06446v2.pdf | |
PWC | https://paperswithcode.com/paper/fully-convolutional-intensive-feature-flow |
Repo | |
Framework | |
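The efficiency argument for depthwise separable convolution, and the effect of replacing pooling with a stride-2 convolution, can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the channel counts (64 in, 128 out), the 3 x 3 kernel, and the 32-pixel input are illustrative assumptions, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def conv_output_size(size, k, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - k) // stride + 1


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
halved = conv_output_size(32, k=3, stride=2, padding=1)

print(standard, separable, halved)  # prints: 73728 8768 16
```

With these assumed sizes the separable layer needs roughly 8x fewer weights, and the stride-2 convolution halves the spatial resolution just as 2 x 2 pooling would, but with learnable filters.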
A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging
Title | A Feature Transfer Enabled Multi-Task Deep Learning Model on Medical Imaging |
Authors | Fei Gao, Hyunsoo Yoon, Teresa Wu, Xianghua Chu |
Abstract | Object detection, segmentation and classification are three common tasks in medical image analysis. Multi-task deep learning (MTL) tackles these three tasks jointly, which provides several advantages, such as saving computing time and resources and improving robustness against overfitting. However, existing multitask deep models start with each task as an individual task and integrate the tasks, conducted in parallel, at the end of the architecture with one cost function. Such an architecture fails to take advantage of the combined power of the features from each individual task at an early stage of the training. In this research, we propose a new architecture, FTMTLNet, an MTL enabled by feature transferring. Traditional transfer learning deals with the same or similar task from different data sources (a.k.a. domains). The underlying assumption is that the knowledge gained from source domains may help the learning task on the target domain. Our proposed FTMTLNet utilizes the different tasks from the same domain. Considering features from the tasks are different views of the domain, the combined feature maps can be well exploited using knowledge from multiple views to enhance the generalizability. To evaluate the validity of the proposed approach, FTMTLNet is compared with models from the literature including 8 classification models, 4 detection models and 3 segmentation models using a public full field digital mammogram dataset for breast cancer diagnosis. Experimental results show that the proposed FTMTLNet outperforms the competing models in classification and detection and has comparable results in segmentation. |
Tasks | Object Detection, Transfer Learning |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01828v1 |
https://arxiv.org/pdf/1906.01828v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-transfer-enabled-multi-task-deep |
Repo | |
Framework | |
NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions
Title | NeuralDivergence: Exploring and Understanding Neural Networks by Comparing Activation Distributions |
Authors | Haekyu Park, Fred Hohman, Duen Horng Chau |
Abstract | As deep neural networks are increasingly used in solving high-stake problems, there is a pressing need to understand their internal decision mechanisms. Visualization has helped address this problem by assisting with interpreting complex deep neural networks. However, current tools often support only single data instances, or visualize layers in isolation. We present NeuralDivergence, an interactive visualization system that uses activation distributions as a high-level summary of what a model has learned. NeuralDivergence enables users to interactively summarize and compare activation distributions across layers, classes, and instances (e.g., pairs of adversarial attacked and benign images), helping them gain better understanding of neural network models. |
Tasks | |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.00332v1 |
https://arxiv.org/pdf/1906.00332v1.pdf | |
PWC | https://paperswithcode.com/paper/190600332 |
Repo | |
Framework | |
Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation
Title | Data-driven Reconstruction of Nonlinear Dynamics from Sparse Observation |
Authors | Kyongmin Yeo |
Abstract | We present a data-driven model to reconstruct nonlinear dynamics from very sparse time series data, which relies on the strength of the echo state network (ESN) in learning a nonlinear representation of data. With an assumption of the universal function approximation capability of ESN, it is shown that the reconstruction problem can be formulated as a fixed-point problem, in which the trajectory of the dynamical system is a fixed point of the ESN. An under-relaxed fixed-point iteration is proposed to reconstruct the nonlinear dynamics from a sparse observation. The proposed fixed-point ESN is tested against both univariate and multivariate chaotic dynamical systems by randomly removing up to 95% of the data. It is shown that the fixed-point ESN is able to reconstruct the complex dynamics from only 5~10% of the data. For a relatively simple non-chaotic dynamical system, the numerical experiments on a forced van der Pol oscillator show that it is possible to reconstruct the nonlinear dynamics from only 1~2% of the data. |
Tasks | |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04059v1 |
https://arxiv.org/pdf/1906.04059v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-reconstruction-of-nonlinear |
Repo | |
Framework | |
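The abstract above relies on an under-relaxed fixed-point iteration. As a minimal sketch of that general technique only (not the paper's ESN-based scheme), the update blends the current iterate with the mapped value, `x <- (1 - alpha) * x + alpha * g(x)`. The scalar map `math.cos`, the relaxation factor `alpha = 0.5`, and the tolerance are illustrative assumptions; in the paper the role of `g` is played by the trained network acting on a whole trajectory.

```python
import math


def under_relaxed_fixed_point(g, x0, alpha=0.5, tol=1e-10, max_iter=1000):
    """Iterate x <- (1 - alpha) * x + alpha * g(x) until the step is below tol.

    alpha = 1 recovers the plain fixed-point iteration; 0 < alpha < 1
    damps each update, which can stabilize an otherwise oscillating iteration.
    """
    x = x0
    for _ in range(max_iter):
        x_new = (1.0 - alpha) * x + alpha * g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Stand-in map for illustration: the fixed point of cos(x) is about 0.739.
root = under_relaxed_fixed_point(math.cos, x0=1.0)
print(abs(root - math.cos(root)) < 1e-8)  # prints: True
```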
Metric Gaussian Variational Inference
Title | Metric Gaussian Variational Inference |
Authors | Jakob Knollmüller, Torsten A. Enßlin |
Abstract | Solving Bayesian inference problems approximately with variational approaches can provide fast and accurate results. Capturing correlation within the approximation requires an explicit parametrization. This intrinsically limits this approach to either problems of moderate dimension or the strongly simplifying mean-field approach. We propose Metric Gaussian Variational Inference (MGVI) as a method that goes beyond mean-field. Here correlations between all model parameters are taken into account, while still scaling linearly in computational time and memory. With this method we achieve higher accuracy and in many cases a significant speedup compared to traditional methods. MGVI is an iterative method that performs a series of Gaussian approximations to the posterior. We alternate between approximating the covariance with the inverse Fisher information metric evaluated at an intermediate mean estimate and optimizing the KL-divergence for the given covariance with respect to the mean. This procedure is iterated until the uncertainty estimate is self-consistent with the mean parameter. We achieve linear scaling by never storing the covariance explicitly. Instead we draw samples from the approximating distribution relying on an implicit representation and numerical schemes to approximately solve linear equations. Those samples are used to approximate the KL-divergence and its gradient. The usage of natural gradient descent allows for rapid convergence. Formulating the Bayesian model in standardized coordinates makes MGVI applicable to any inference problem with continuous parameters. We demonstrate the high accuracy of MGVI by comparing it to HMC and its fast convergence relative to other established methods in several examples. We investigate real-data applications, as well as synthetic examples of varying size and complexity and up to a million model parameters. |
Tasks | Bayesian Inference |
Published | 2019-01-30 |
URL | https://arxiv.org/abs/1901.11033v3 |
https://arxiv.org/pdf/1901.11033v3.pdf | |
PWC | https://paperswithcode.com/paper/metric-gaussian-variational-inference |
Repo | |
Framework | |
Fair treatment allocations in social networks
Title | Fair treatment allocations in social networks |
Authors | James Atwood, Hansa Srinivasan, Yoni Halpern, D Sculley |
Abstract | Simulations of infectious disease spread have long been used to understand how epidemics evolve and how to effectively treat them. However, comparatively little attention has been paid to understanding the fairness implications of different treatment strategies – that is, how might such strategies distribute the expected disease burden differentially across various subgroups or communities in the population? In this work, we define the precision disease control problem – the problem of optimally allocating vaccines in a social network in a step-by-step fashion – and we use the ML Fairness Gym to simulate epidemic control and study it from both an efficiency and fairness perspective. We then present an exploratory analysis of several different environments and discuss the fairness implications of different treatment strategies. |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.05489v1 |
https://arxiv.org/pdf/1911.05489v1.pdf | |
PWC | https://paperswithcode.com/paper/fair-treatment-allocations-in-social-networks |
Repo | |
Framework | |
Balancing Multi-level Interactions for Session-based Recommendation
Title | Balancing Multi-level Interactions for Session-based Recommendation |
Authors | Yujia Zheng, Siyi Liu, Zailei Zhou |
Abstract | Predicting user actions based on anonymous sessions is a challenge to general recommendation systems because the lack of user profiles heavily limits data-driven models. Recently, session-based recommendation methods have achieved remarkable results in dealing with this task. However, the upper bound of performance can still be boosted through the innovative exploration of limited data. In this paper, we propose a novel method, namely Intra-and Inter-session Interaction-aware Graph-enhanced Network, to take inter-session item-level interactions into account. Different from existing intra-session item-level interactions and session-level collaborative information, our introduced data represents complex item-level interactions between different sessions. For mining the new data without breaking the equilibrium of the model between different interactions, we construct an intra-session graph and an inter-session graph for the current session. The former focuses on item-level interactions within a single session and the latter models those between items among neighborhood sessions. Then different approaches are employed to encode the information of two graphs according to different structures, and the generated latent vectors are combined to balance the model across different scopes. Experiments on real-world datasets verify that our method outperforms other state-of-the-art methods. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13527v1 |
https://arxiv.org/pdf/1910.13527v1.pdf | |
PWC | https://paperswithcode.com/paper/balancing-multi-level-interactions-for |
Repo | |
Framework | |
Real-time and robust multiple-view gender classification using gait features in video surveillance
Title | Real-time and robust multiple-view gender classification using gait features in video surveillance |
Authors | Trung Dung Do, Hakil Kim, Van Huan Nguyen |
Abstract | It is common to view people in real applications walking in arbitrary directions, holding items, or wearing heavy coats. These factors are challenges in gait-based application methods because they significantly change a person’s appearance. This paper proposes a novel method for classifying human gender in real time using gait information. The use of an average gait image (AGI), rather than a gait energy image (GEI), allows this method to be computationally efficient and robust against view changes. A viewpoint (VP) model is created for automatically determining the viewing angle during the testing phase. A distance signal (DS) model is constructed to remove any areas with an attachment (carried items, worn coats) from a silhouette to reduce the interference in the resulting classification. Finally, the human gender is classified using multiple view-dependent classifiers trained using a support vector machine. Experiment results confirm that the proposed method achieves a high accuracy of 98.8% on the CASIA Dataset B and outperforms the recent state-of-the-art methods. |
Tasks | |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01013v1 |
https://arxiv.org/pdf/1905.01013v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-and-robust-multiple-view-gender |
Repo | |
Framework | |
Graph Embeddings at Scale
Title | Graph Embeddings at Scale |
Authors | C. Bayan Bruss, Anish Khazane, Jonathan Rider, Richard Serpe, Saurabh Nagrecha, Keegan E. Hines |
Abstract | Graph embedding is a popular algorithmic approach for creating vector representations for individual vertices in networks. Training these algorithms at scale is important for creating embeddings that can be used for classification, ranking, recommendation and other common applications in industry. While industrial systems exist for training graph embeddings on large datasets, many of these distributed architectures are forced to partition copious amounts of data and model logic across many worker nodes. In this paper, we propose a distributed infrastructure that completely avoids graph partitioning, dynamically creates size constrained computational graphs across worker nodes, and uses highly efficient indexing operations for updating embeddings that allow the system to function at scale. We show that our system can scale an existing embeddings algorithm - skip-gram - to train on the open-source Friendster network (68 million vertices) and on an internal heterogeneous graph (50 million vertices). We measure the performance of our system on two key quantitative metrics: link-prediction accuracy and rate of convergence. We conclude this work by analyzing how a greater number of worker nodes actually improves our system’s performance on the aforementioned metrics and discuss our next steps for rigorously evaluating the embedding vectors produced by our system. |
Tasks | Graph Embedding, graph partitioning, Link Prediction |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01705v1 |
https://arxiv.org/pdf/1907.01705v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-embeddings-at-scale |
Repo | |
Framework | |
Variations on the Chebyshev-Lagrange Activation Function
Title | Variations on the Chebyshev-Lagrange Activation Function |
Authors | Yuchen Li, Frank Rudzicz, Jekaterina Novikova |
Abstract | We seek to improve the data efficiency of neural networks and present novel implementations of parameterized piece-wise polynomial activation functions. The parameters are the y-coordinates of n+1 Chebyshev nodes per hidden unit, and Lagrangian interpolation between the nodes produces the polynomial on [-1, 1]. We show results for different methods of handling inputs outside [-1, 1] on synthetic datasets, finding significant improvements in capacity of expression and accuracy of interpolation in models that compute some form of linear extrapolation from either end. We demonstrate competitive or state-of-the-art performance on the classification of images (MNIST and CIFAR-10) and minimally-correlated vectors (DementiaBank) when we replace ReLU or tanh with linearly extrapolated Chebyshev-Lagrange activations in deep residual architectures. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.10064v1 |
https://arxiv.org/pdf/1906.10064v1.pdf | |
PWC | https://paperswithcode.com/paper/variations-on-the-chebyshev-lagrange |
Repo | |
Framework | |
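The construction described in the abstract above (y-values at n+1 Chebyshev nodes, an interpolating polynomial on [-1, 1], and linear extrapolation outside) can be sketched with NumPy. Several details here are assumptions rather than the paper's definitions: the nodes are taken as the roots of the degree-(n+1) Chebyshev polynomial of the first kind, the interpolant is obtained with `np.polyfit` (which coincides with the Lagrange interpolant when exactly n+1 points are fitted with degree n), and the extrapolation continues the endpoint value with the endpoint slope.

```python
import numpy as np


def chebyshev_nodes(n):
    """n + 1 Chebyshev nodes of the first kind on (-1, 1) (assumed node choice)."""
    k = np.arange(n + 1)
    return np.cos((2 * k + 1) * np.pi / (2 * (n + 1)))


def make_activation(y):
    """Build an activation from the y-coordinates at the Chebyshev nodes."""
    y = np.asarray(y, dtype=float)
    n = len(y) - 1
    nodes = chebyshev_nodes(n)
    coeffs = np.polyfit(nodes, y, n)   # degree-n polynomial through n + 1 points
    dcoeffs = np.polyder(coeffs)       # its derivative, used for the end slopes

    def activation(x):
        x = np.asarray(x, dtype=float)
        clipped = np.clip(x, -1.0, 1.0)
        value = np.polyval(coeffs, clipped)
        slope = np.polyval(dcoeffs, clipped)
        # Inside [-1, 1] the correction term is zero; outside, the function
        # continues linearly with the slope at the nearest endpoint.
        return value + slope * (x - clipped)

    return activation


# Illustrative parameters: sample x**2 at 4 nodes, so the interpolant is x**2.
nodes = chebyshev_nodes(3)
f = make_activation(nodes ** 2)
print(np.allclose(f(nodes), nodes ** 2))  # prints: True
```

In a trainable layer the `y` values would be learned parameters per hidden unit; here they are fixed so the behavior is easy to check by hand.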
Towards Online End-to-end Transformer Automatic Speech Recognition
Title | Towards Online End-to-end Transformer Automatic Speech Recognition |
Authors | Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe |
Abstract | The Transformer self-attention network has recently shown promising performance as an alternative to recurrent neural networks in end-to-end (E2E) automatic speech recognition (ASR) systems. However, Transformer has a drawback in that the entire input sequence is required to compute self-attention. We have proposed a block processing method for the Transformer encoder by introducing a context-aware inheritance mechanism. An additional context embedding vector handed over from the previously processed block helps to encode not only local acoustic information but also global linguistic, channel, and speaker attributes. In this paper, we extend it towards an entire online E2E ASR system by introducing an online decoding process inspired by monotonic chunkwise attention (MoChA) into the Transformer decoder. Our novel MoChA training and inference algorithms exploit the unique properties of Transformer, whose attentions are not always monotonic or peaky, and have multiple heads and residual connections of the decoder layers. Evaluations of the Wall Street Journal (WSJ) and AISHELL-1 show that our proposed online Transformer decoder outperforms conventional chunkwise approaches. |
Tasks | Speech Recognition |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11871v1 |
https://arxiv.org/pdf/1910.11871v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-online-end-to-end-transformer |
Repo | |
Framework | |
Information-Theoretic Local Minima Characterization and Regularization
Title | Information-Theoretic Local Minima Characterization and Regularization |
Authors | Zhiwei Jia, Hao Su |
Abstract | Recent advances in deep learning theory have evoked the study of generalizability across different local minima of deep neural networks (DNNs). While current work focused on either discovering properties of good local minima or developing regularization techniques to induce good local minima, no approach exists that can tackle both problems. We achieve these two goals successfully in a unified manner. Specifically, based on the Fisher information we propose a metric both strongly indicative of generalizability of local minima and effectively applied as a practical regularizer. We provide theoretical analysis including a generalization bound and empirically demonstrate the success of our approach in both capturing and improving the generalizability of DNNs. Experiments are performed on CIFAR-10 and CIFAR-100 for various network architectures. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08192v1 |
https://arxiv.org/pdf/1911.08192v1.pdf | |
PWC | https://paperswithcode.com/paper/information-theoretic-local-minima-1 |
Repo | |
Framework | |
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Title | Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function |
Authors | Zihan Zhang, Xiangyang Ji |
Abstract | We present an algorithm based on the Optimism in the Face of Uncertainty (OFU) principle which is able to efficiently learn Reinforcement Learning (RL) problems modeled by a Markov decision process (MDP) with finite state-action space. By evaluating the state-pair difference of the optimal bias function $h^{*}$, the proposed algorithm achieves a regret bound of $\tilde{O}(\sqrt{SAHT})$ (the symbol $\tilde{O}$ means $O$ with log factors ignored) for an MDP with $S$ states and $A$ actions, in the case that an upper bound $H$ on the span of $h^{*}$, i.e., $sp(h^{*})$, is known. This result outperforms the best previous regret bound $\tilde{O}(S\sqrt{AHT})$ \citep{fruit2019improved} by a factor of $\sqrt{S}$. Furthermore, this regret bound matches the lower bound of $\Omega(\sqrt{SAHT})$ \citep{jaksch2010near} up to a logarithmic factor. As a consequence, we show that there is a near optimal regret bound of $\tilde{O}(\sqrt{SADT})$ for MDPs with a finite diameter $D$ compared to the lower bound of $\Omega(\sqrt{SADT})$ \citep{jaksch2010near}. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05110v3 |
https://arxiv.org/pdf/1906.05110v3.pdf | |
PWC | https://paperswithcode.com/paper/regret-minimization-for-reinforcement |
Repo | |
Framework | |
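The claimed improvement factor can be checked by dividing the previous bound by the new one, ignoring the log factors hidden by $\tilde{O}$. The second line assumes, as the abstract's "as a consequence" suggests but does not spell out, that the diameter $D$ is a valid upper bound on $sp(h^{*})$, so that $H = D$ may be substituted.

```latex
\frac{S\sqrt{AHT}}{\sqrt{SAHT}}
  = \frac{S\sqrt{AHT}}{\sqrt{S}\,\sqrt{AHT}}
  = \sqrt{S},
\qquad
\tilde{O}\bigl(\sqrt{SAHT}\bigr)\Big|_{H = D}
  = \tilde{O}\bigl(\sqrt{SADT}\bigr).
```

For example, with $S = 100$ states the new bound is smaller by a factor of $\sqrt{100} = 10$.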
The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning
Title | The Mutex Watershed and its Objective: Efficient, Parameter-Free Image Partitioning |
Authors | Steffen Wolf, Alberto Bailoni, Constantin Pape, Nasim Rahaman, Anna Kreshuk, Ullrich Köthe, Fred A. Hamprecht |
Abstract | Image partitioning, or segmentation without semantics, is the task of decomposing an image into distinct segments, or equivalently to detect closed contours. Most prior work either requires seeds, one per segment; or a threshold; or formulates the task as multicut / correlation clustering, an NP-hard problem. Here, we propose a greedy algorithm for signed graph partitioning, the “Mutex Watershed”. Unlike seeded watershed, the algorithm can accommodate not only attractive but also repulsive cues, allowing it to find a previously unspecified number of segments without the need for explicit seeds or a tunable threshold. We also prove that this simple algorithm solves to global optimality an objective function that is intimately related to the multicut / correlation clustering integer linear programming formulation. The algorithm is deterministic, very simple to implement, and has empirically linearithmic complexity. When presented with short-range attractive and long-range repulsive cues from a deep neural network, the Mutex Watershed gives the best results currently known for the competitive ISBI 2012 EM segmentation benchmark. |
Tasks | graph partitioning |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.12654v1 |
http://arxiv.org/pdf/1904.12654v1.pdf | |
PWC | https://paperswithcode.com/paper/190412654 |
Repo | |
Framework | |
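The greedy procedure for signed graph partitioning described in the abstract above can be sketched with a union-find structure plus per-cluster exclusion sets. This is a minimal illustration written from the abstract's description, not the authors' implementation: the edge format `(u, v, weight, is_attractive)`, the function name, and the toy graph are all assumptions. Edges are visited from strongest to weakest; an attractive edge merges two clusters unless a mutual-exclusion constraint separates them, and a repulsive edge records such a constraint unless the endpoints are already merged.

```python
def mutex_watershed(num_nodes, edges):
    """Greedy signed graph partitioning; returns one cluster label per node.

    edges: iterable of (u, v, weight, is_attractive) with weight >= 0
    (hypothetical format chosen for this sketch).
    """
    parent = list(range(num_nodes))
    mutex = [set() for _ in range(num_nodes)]  # per-root sets of excluded roots

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, _, attractive in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # already in the same cluster, nothing to decide
        if attractive:
            if rv in mutex[ru]:
                continue  # an earlier, stronger repulsive edge forbids the merge
            parent[rv] = ru
            # Re-point constraints that referenced rv to the surviving root ru.
            for r in mutex[rv]:
                mutex[r].discard(rv)
                mutex[r].add(ru)
            mutex[ru] |= mutex[rv]
            mutex[rv] = set()
        else:
            mutex[ru].add(rv)
            mutex[rv].add(ru)

    return [find(i) for i in range(num_nodes)]


# Toy chain 0-1-2-3 with one repulsive edge between nodes 0 and 3.
toy_edges = [
    (0, 1, 0.9, True),
    (2, 3, 0.8, True),
    (0, 3, 0.5, False),
    (1, 2, 0.3, True),
]
labels = mutex_watershed(4, toy_edges)
print(labels[0] == labels[1], labels[2] == labels[3], labels[0] != labels[2])
# prints: True True True
```

In the toy graph the weak attractive edge between nodes 1 and 2 is rejected because the stronger repulsive edge between 0 and 3 was processed first, so two segments emerge without seeds or a threshold.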
Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Title | Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text |
Authors | Guangneng Hu, Yu Zhang, Qiang Yang |
Abstract | Collaborative filtering (CF) is the key technique for recommender systems (RSs). CF exploits user-item behavior interactions (e.g., clicks) only and hence suffers from the data sparsity issue. One research thread is to integrate auxiliary information such as product reviews and news titles, leading to hybrid filtering methods. Another thread is to transfer knowledge from other source domains, such as improving movie recommendation with knowledge from the book domain, leading to transfer learning methods. In real life, no single service can satisfy all of a user's information needs. This motivates us to exploit both auxiliary and source information for RSs in this paper. We propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH) methods for cross-domain recommendation with unstructured text in an end-to-end manner. TMH attentively extracts useful content from unstructured text via a memory module and selectively transfers knowledge from a source domain via a transfer network. On two real-world datasets, TMH shows better performance than various baselines in terms of three ranking metrics. We conduct thorough analyses to understand how the text content and transferred knowledge help the proposed model. |
Tasks | Recommendation Systems, Transfer Learning |
Published | 2019-01-22 |
URL | http://arxiv.org/abs/1901.07199v1 |
http://arxiv.org/pdf/1901.07199v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-meets-hybrid-a-synthetic-approach |
Repo | |
Framework | |