October 17, 2019

2925 words 14 mins read

Paper Group ANR 921

Interpolatron: Interpolation or Extrapolation Schemes to Accelerate Optimization for Deep Neural Networks

Title Interpolatron: Interpolation or Extrapolation Schemes to Accelerate Optimization for Deep Neural Networks
Authors Guangzeng Xie, Yitan Wang, Shuchang Zhou, Zhihua Zhang
Abstract In this paper we explore acceleration techniques for large-scale nonconvex optimization problems, with a special focus on deep neural networks. The extrapolation scheme is a classical approach for accelerating stochastic gradient descent for convex optimization, but it typically does not work well for nonconvex optimization. Alternatively, we propose an interpolation scheme to accelerate nonconvex optimization and call the method Interpolatron. We explain the motivation behind Interpolatron and conduct a thorough empirical analysis. Empirical results on very deep DNNs (e.g., 98-layer and 200-layer ResNets) on CIFAR-10 and ImageNet show that Interpolatron can converge much faster than state-of-the-art methods such as SGD with momentum and Adam. Furthermore, Anderson’s acceleration, in which mixing coefficients are computed by least-squares estimation, can also be used to improve performance. Both Interpolatron and Anderson’s acceleration are easy to implement and tune. We also show that Interpolatron has a linear convergence rate under certain regularity assumptions.
Tasks
Published 2018-05-17
URL http://arxiv.org/abs/1805.06753v1
PDF http://arxiv.org/pdf/1805.06753v1.pdf
PWC https://paperswithcode.com/paper/interpolatron-interpolation-or-extrapolation
Repo
Framework
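
The mixing idea behind Anderson’s acceleration mentioned in the abstract can be sketched in a few lines. The snippet below is a minimal numpy illustration on a toy quadratic problem, not the authors’ Interpolatron: it keeps a window of recent iterates, computes least-squares mixing coefficients over their residuals, and interpolates the candidate updates. The learning rate, window size, and test problem are assumptions for illustration only.

```python
# Minimal Anderson-style mixing on a toy quadratic; NOT the paper's algorithm.
import numpy as np

def grad(x, A, b):
    return A @ x - b                      # gradient of 0.5 x^T A x - b^T x

def anderson_gd(A, b, lr=0.01, m=5, iters=60):
    x = np.zeros_like(b)
    past_x, past_g = [], []               # recent iterates and their GD updates
    for _ in range(iters):
        g = x - lr * grad(x, A, b)        # plain gradient-descent map g(x)
        past_x.append(x.copy()); past_g.append(g.copy())
        past_x, past_g = past_x[-m:], past_g[-m:]
        R = np.stack([gi - xi for xi, gi in zip(past_x, past_g)], axis=1)
        # mixing coefficients: minimise ||R a||^2 subject to sum(a) = 1
        M = R.T @ R + 1e-10 * np.eye(R.shape[1])
        a = np.linalg.solve(M, np.ones(R.shape[1]))
        a = a / a.sum()
        x = np.stack(past_g, axis=1) @ a  # interpolate the candidate updates
    return x

A = np.diag([1.0, 10.0, 100.0])
b = np.ones(3)
print(anderson_gd(A, b))                  # should approach A^{-1} b = [1, 0.1, 0.01]
```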

Invariant and Equivariant Graph Networks

Title Invariant and Equivariant Graph Networks
Authors Haggai Maron, Heli Ben-Hamu, Nadav Shamir, Yaron Lipman
Abstract Invariant and equivariant networks have been successfully used for learning images, sets, point clouds, and graphs. A basic challenge in developing such networks is finding the maximal collection of invariant and equivariant linear layers. Although this question is answered for the first three examples (for popular transformations, at least), a full characterization of invariant and equivariant linear layers for graphs is not known. In this paper we provide a characterization of all permutation invariant and equivariant linear layers for (hyper-)graph data, and show that their dimensions, in the case of edge-value graph data, are 2 and 15, respectively. More generally, for graph data defined on k-tuples of nodes, the dimensions are the k-th and 2k-th Bell numbers. Orthogonal bases for the layers are computed, including a generalization to multi-graph data. The constant number of basis elements and their characteristics allow the networks to be successfully applied to graphs of different sizes. From the theoretical point of view, our results generalize and unify recent advances in equivariant deep learning. In particular, we show that our model is capable of approximating any message passing neural network. Applying these new linear layers in a simple deep neural network framework is shown to achieve results comparable to the state of the art and better expressivity than previous invariant and equivariant bases.
Tasks
Published 2018-12-24
URL http://arxiv.org/abs/1812.09902v2
PDF http://arxiv.org/pdf/1812.09902v2.pdf
PWC https://paperswithcode.com/paper/invariant-and-equivariant-graph-networks
Repo
Framework
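
The dimension counts quoted in the abstract (2 and 15 for edge-value data, and more generally the k-th and 2k-th Bell numbers) are easy to sanity-check numerically. The short snippet below computes Bell numbers via the Bell triangle; it is only a check of the counting claim, not an implementation of the layers themselves.

```python
def bell(n):
    """n-th Bell number via the Bell triangle."""
    row = [1]
    for _ in range(n):
        nxt = [row[-1]]
        for v in row:
            nxt.append(nxt[-1] + v)
        row = nxt
    return row[0]

# Edge-valued graph data corresponds to k = 2:
print(bell(2), bell(4))            # -> 2 15  (invariant / equivariant dimensions)
for k in range(1, 5):
    print(k, bell(k), bell(2 * k))
```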

Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift

Title Deep Cocktail Network: Multi-source Unsupervised Domain Adaptation with Category Shift
Authors Ruijia Xu, Ziliang Chen, Wangmeng Zuo, Junjie Yan, Liang Lin
Abstract Unsupervised domain adaptation (UDA) conventionally assumes labeled source samples coming from a single underlying source distribution. In practical scenarios, however, labeled data are typically collected from diverse sources. The multiple sources are different not only from the target but also from each other, and thus the domain adapters should not be modeled in the same way. Moreover, those sources may not completely share their categories, which further brings a new transfer challenge called category shift. In this paper, we propose a deep cocktail network (DCTN) to battle the domain and category shifts among multiple sources. Motivated by the theoretical results of Mansour et al. (2009), which show that the target distribution can be represented as a weighted combination of the source distributions, multi-source unsupervised domain adaptation via DCTN is performed in two alternating steps: i) it deploys multi-way adversarial learning to minimize the discrepancy between the target and each of the multiple source domains, which also yields source-specific perplexity scores denoting the possibility that a target sample belongs to each source domain; ii) the multi-source category classifiers are integrated with the perplexity scores to classify target samples, and the pseudo-labeled target samples together with the source samples are used to update the multi-source category classifiers and the feature extractor. We evaluate DCTN on three domain adaptation benchmarks, which clearly demonstrate the superiority of our framework.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2018-03-02
URL http://arxiv.org/abs/1803.00830v1
PDF http://arxiv.org/pdf/1803.00830v1.pdf
PWC https://paperswithcode.com/paper/deep-cocktail-network-multi-source
Repo
Framework
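
The target-classification step described above, where per-source category classifiers are weighted by source-specific perplexity scores, can be illustrated with a small sketch. The weighting rule below (a softmax over the scores) is an assumption for illustration; the exact integration used in DCTN may differ.

```python
# Hedged sketch of combining multi-source classifier outputs with perplexity
# scores; not the authors' exact formulation.
import numpy as np

def combine_multi_source(class_probs, perplexity):
    """class_probs: list of (num_classes,) arrays, one per source classifier.
    perplexity:     (num_sources,) scores; higher = target looks more like
                    that source (an assumption made for this sketch)."""
    w = np.exp(perplexity - perplexity.max())
    w /= w.sum()                                   # softmax over sources
    fused = sum(wi * p for wi, p in zip(w, class_probs))
    return fused / fused.sum()

probs_a = np.array([0.7, 0.2, 0.1])                # source A classifier
probs_b = np.array([0.3, 0.4, 0.3])                # source B classifier
scores = np.array([2.0, 0.5])                      # target resembles A more
print(combine_multi_source([probs_a, probs_b], scores))
```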

Towards Providing Explanations for AI Planner Decisions

Title Towards Providing Explanations for AI Planner Decisions
Authors Rita Borgo, Michael Cashmore, Daniele Magazzeni
Abstract In order to engender trust in AI, humans must understand what an AI system is trying to achieve, and why. To meet this need, the underlying AI process must produce justifications and explanations that are both transparent and comprehensible to the user. AI Planning is well placed to address this challenge. In this paper we present a methodology for providing initial explanations of the decisions made by a planner. Explanations are created by allowing the user to suggest alternative actions in plans and then compare the resulting plans with the one found by the planner. The methodology is implemented in the new XAI-Plan framework.
Tasks
Published 2018-10-15
URL http://arxiv.org/abs/1810.06338v1
PDF http://arxiv.org/pdf/1810.06338v1.pdf
PWC https://paperswithcode.com/paper/towards-providing-explanations-for-ai-planner
Repo
Framework
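
The explanation loop described above, where the user suggests an alternative action, the planner re-plans with that action forced in, and the two plans are compared, can be sketched as follows. The make_plan interface and the toy planner are purely hypothetical stand-ins; the abstract does not describe XAI-Plan's actual API.

```python
# Hypothetical sketch of a contrastive-explanation loop; not the XAI-Plan code.
from typing import Callable, List, Tuple

Plan = List[str]

def explain_alternative(make_plan: Callable[[List[str]], Tuple[Plan, float]],
                        suggested_action: str) -> str:
    base_plan, base_cost = make_plan([])                  # planner's own choice
    alt_plan, alt_cost = make_plan([suggested_action])    # force the suggestion
    if alt_cost > base_cost:
        return (f"Including '{suggested_action}' raises plan cost from "
                f"{base_cost} to {alt_cost}; the planner's plan is cheaper.")
    return f"'{suggested_action}' yields an equally good plan: {alt_plan}"

# Toy stand-in planner, for illustration only.
def toy_planner(forced):
    if "take_detour" in forced:
        return ["load", "take_detour", "deliver"], 12.0
    return ["load", "drive", "deliver"], 8.0

print(explain_alternative(toy_planner, "take_detour"))
```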

JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features

Title JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features
Authors Hongru Liang, Haozheng Wang, Jun Wang, Shaodi You, Zhe Sun, Jin-Mao Wei, Zhenglu Yang
Abstract Learning social media content is the basis of many real-world applications, including information retrieval and recommendation systems, among others. In contrast with previous works that focus mainly on single-modal or bi-modal learning, we propose to learn social media content by jointly fusing textual, acoustic, and visual information (JTAV). Effective strategies, namely attBiGRU and DCRNN, are proposed to extract fine-grained features from each modality. We also introduce cross-modal fusion and attentive pooling techniques to integrate multi-modal information comprehensively. Extensive experimental evaluation conducted on real-world datasets demonstrates that our proposed model outperforms state-of-the-art approaches by a large margin.
Tasks Information Retrieval, Recommendation Systems
Published 2018-06-05
URL http://arxiv.org/abs/1806.01483v1
PDF http://arxiv.org/pdf/1806.01483v1.pdf
PWC https://paperswithcode.com/paper/jtav-jointly-learning-social-media-content
Repo
Framework
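
A rough sketch of the attentive-pooling step mentioned in the abstract is given below: each modality's item-level features are pooled into a single vector via attention weights, and the pooled vectors are concatenated. Feature dimensions and the scoring function are assumptions; the paper's attBiGRU and DCRNN extractors are not reproduced.

```python
# Minimal attentive pooling over per-modality features; dimensions are made up.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pool(features, query):
    """features: (num_items, d) modality features; query: (d,) context vector."""
    scores = features @ query                 # one attention score per item
    alpha = softmax(scores)
    return alpha @ features                   # weighted sum -> (d,)

rng = np.random.default_rng(0)
text  = rng.normal(size=(5, 8))               # e.g. word-level text features
audio = rng.normal(size=(3, 8))               # e.g. frame-level audio features
image = rng.normal(size=(4, 8))               # e.g. region-level visual features
query = rng.normal(size=8)

fused = np.concatenate([attentive_pool(m, query) for m in (text, audio, image)])
print(fused.shape)                            # (24,) joint multi-modal vector
```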

JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs

Title JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs
Authors Eunji Jeong, Sungwoo Cho, Gyeong-In Yu, Joo Seong Jeong, Dong-Jin Shin, Byung-Gon Chun
Abstract The rapid evolution of deep neural networks is demanding deep learning (DL) frameworks not only to satisfy the requirement of quickly executing large computations, but also to support straightforward programming models for quickly implementing and experimenting with complex network structures. However, existing frameworks fail to excel in both departments simultaneously, leading to divergent efforts for optimizing performance and improving usability. This paper presents JANUS, a system that combines the advantages of both sides by transparently converting an imperative DL program written in Python, the de-facto scripting language for DL, into an efficiently executable symbolic dataflow graph. JANUS can convert various dynamic features of Python, including dynamic control flow, dynamic types, and impure functions, into elements of a symbolic dataflow graph. Experiments demonstrate that JANUS can achieve fast DL training by exploiting the techniques imposed by symbolic graph-based DL frameworks, while at the same time maintaining the simple and flexible programmability of imperative DL frameworks.
Tasks
Published 2018-12-04
URL http://arxiv.org/abs/1812.01329v2
PDF http://arxiv.org/pdf/1812.01329v2.pdf
PWC https://paperswithcode.com/paper/janus-fast-and-flexible-deep-learning-via
Repo
Framework
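
To make the "dynamic features of Python" concrete, the snippet below uses the standard ast module to spot data-dependent control flow in an imperative training function, which is the kind of construct JANUS must turn into graph elements. It is an illustration of the problem, not of the JANUS implementation.

```python
# Illustration only: detect dynamic control flow in an imperative model function.
import ast, inspect

def training_step(x, w, steps):
    for _ in range(steps):          # dynamic control flow: Python loop
        if (x * w) > 0:             # data-dependent branch
            w = w - 0.1 * x
        else:
            w = w + 0.1 * x
    return w

tree = ast.parse(inspect.getsource(training_step))
dynamic_nodes = [type(n).__name__ for n in ast.walk(tree)
                 if isinstance(n, (ast.If, ast.For, ast.While))]
print(dynamic_nodes)                # e.g. ['For', 'If'] -> must become graph ops
```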

Reciprocal Attention Fusion for Visual Question Answering

Title Reciprocal Attention Fusion for Visual Question Answering
Authors Moshiur R Farazi, Salman H Khan
Abstract Existing attention mechanisms attend either to local image-grid features or to object-level features for Visual Question Answering (VQA). Motivated by the observation that questions can relate to both object instances and their parts, we propose a novel attention mechanism that jointly considers the reciprocal relationships between the two levels of visual detail. The bottom-up attention thus generated is further coalesced with top-down information to focus only on the scene elements that are most relevant to a given question. Our design hierarchically fuses multi-modal information, i.e., language, object-level, and grid-level features, through an efficient tensor decomposition scheme. The proposed model improves the state-of-the-art single-model performance from 67.9% to 68.2% on VQAv1 and from 65.7% to 67.4% on VQAv2, demonstrating a significant boost.
Tasks Question Answering, Visual Question Answering
Published 2018-05-11
URL http://arxiv.org/abs/1805.04247v2
PDF http://arxiv.org/pdf/1805.04247v2.pdf
PWC https://paperswithcode.com/paper/reciprocal-attention-fusion-for-visual
Repo
Framework
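
The "efficient tensor decomposition scheme" for fusing language with object- and grid-level features can be illustrated with a low-rank bilinear fusion sketch. The rank, dimensions, and random projections below are made up for illustration and are not the paper's parameters.

```python
# Hedged sketch of low-rank bilinear fusion of a question vector and a visual
# feature; sizes and weights are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
d_q, d_v, d_out, rank = 16, 32, 10, 4

Wq = rng.normal(size=(rank, d_out, d_q)) * 0.1   # per-rank question projections
Wv = rng.normal(size=(rank, d_out, d_v)) * 0.1   # per-rank visual projections

def fuse(q, v):
    # sum over rank of elementwise products of the two projections
    return sum((Wq[r] @ q) * (Wv[r] @ v) for r in range(rank))

q = rng.normal(size=d_q)                          # question embedding
v = rng.normal(size=d_v)                          # object- or grid-level feature
print(fuse(q, v).shape)                           # (10,) fused representation
```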

Optimized Participation of Multiple Fusion Functions in Consensus Creation: An Evolutionary Approach

Title Optimized Participation of Multiple Fusion Functions in Consensus Creation: An Evolutionary Approach
Authors Elaheh Rashedi, Abdolreza Mirzaei
Abstract Recent studies show that ensemble methods enhance the stability and robustness of unsupervised learning. These approaches have been successfully utilized to construct multiple clusterings and combine them into one representative consensus clustering of improved quality. The quality of the consensus clustering depends directly on the fusion functions used in the combination. In this article, hierarchical clustering ensemble techniques are extended by introducing a new evolutionary fusion function. In the proposed method, multiple hierarchical clusterings are generated via bagging. Thereafter, the consensus clustering is obtained by using the search capability of a genetic algorithm over the different aggregated clusterings produced by different fusion functions. In an empirical study on several popular data sets, the quality of the proposed method is compared with that of regular clustering ensembles. Experimental results demonstrate the improved accuracy of the aggregated clustering results.
Tasks
Published 2018-05-31
URL http://arxiv.org/abs/1805.12270v1
PDF http://arxiv.org/pdf/1805.12270v1.pdf
PWC https://paperswithcode.com/paper/optimized-participation-of-multiple-fusion
Repo
Framework
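
The consensus step can be illustrated with a small co-association sketch: several base clusterings (here standing in for the bagged hierarchical clusterings) vote on whether two samples belong together, and the resulting matrix is cut into a final clustering. The genetic-algorithm search over fusion functions that the paper proposes is not reproduced here.

```python
# Hedged sketch of consensus clustering via a co-association matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def consensus(labelings, n_clusters):
    """labelings: (n_runs, n_samples) integer cluster labels from base runs."""
    n = labelings.shape[1]
    co = np.zeros((n, n))
    for lab in labelings:
        co += (lab[:, None] == lab[None, :]).astype(float)
    co /= len(labelings)                      # fraction of runs that agree
    dist = 1.0 - co
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

runs = np.array([[0, 0, 1, 1, 2, 2],
                 [0, 0, 0, 1, 1, 1],
                 [0, 0, 1, 1, 1, 2]])
print(consensus(runs, n_clusters=3))
```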

Online Non-Additive Path Learning under Full and Partial Information

Title Online Non-Additive Path Learning under Full and Partial Information
Authors Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Holakou Rahmanian, Manfred K. Warmuth
Abstract We study the problem of online path learning with non-additive gains, which is a central problem appearing in several applications, including ensemble structured prediction. We present new online algorithms for path learning with non-additive count-based gains for the three settings of full information, semi-bandit and full bandit with very favorable regret guarantees. A key component of our algorithms is the definition and computation of an intermediate context-dependent automaton that enables us to use existing algorithms designed for additive gains. We further apply our methods to the important application of ensemble structured prediction. Finally, beyond count-based gains, we give an efficient implementation of the EXP3 algorithm for the full bandit setting with an arbitrary (non-additive) gain.
Tasks Structured Prediction
Published 2018-04-18
URL http://arxiv.org/abs/1804.06518v4
PDF http://arxiv.org/pdf/1804.06518v4.pdf
PWC https://paperswithcode.com/paper/online-non-additive-path-learning-under-full
Repo
Framework
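
For reference, a vanilla EXP3 learner over a small explicit set of paths (treated as arms) looks as follows; only the chosen path's gain is observed, matching the full-bandit setting. The paper's contribution is an efficient, automaton-based implementation for exponentially many paths, which this toy sketch does not attempt.

```python
# Vanilla EXP3 over three explicit arms; toy gains, not from the paper.
import numpy as np

def exp3(gain_fn, n_arms, T, gamma=0.1, seed=0):
    """gain_fn(arm) returns a gain in [0, 1]; only the chosen arm's gain is seen."""
    rng = np.random.default_rng(seed)
    weights = np.ones(n_arms)
    total = 0.0
    for _ in range(T):
        probs = (1 - gamma) * weights / weights.sum() + gamma / n_arms
        arm = rng.choice(n_arms, p=probs)
        g = gain_fn(arm)
        total += g
        # importance-weighted exponential update for the observed arm only
        weights[arm] *= np.exp(gamma * g / (n_arms * probs[arm]))
    return total / T, probs

path_gain = [0.2, 0.8, 0.5]                 # toy gains for three fixed paths
avg_gain, final_probs = exp3(lambda a: path_gain[a], n_arms=3, T=2000)
print(round(avg_gain, 3), final_probs.round(3))
```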

Inference of the three-dimensional chromatin structure and its temporal behavior

Title Inference of the three-dimensional chromatin structure and its temporal behavior
Authors Bianca-Cristina Cristescu, Zalán Borsos, John Lygeros, María Rodríguez Martínez, Maria Anna Rapsomaniki
Abstract Understanding the three-dimensional (3D) structure of the genome is essential for elucidating vital biological processes and their links to human disease. To determine how the genome folds within the nucleus, chromosome conformation capture methods such as HiC have recently been employed. However, computational methods that exploit the resulting high-throughput, high-resolution data still suffer from important limitations. In this work, we explore the idea of manifold learning for 3D chromatin structure inference and present a novel method, REcurrent Autoencoders for CHromatin 3D structure prediction (REACH-3D). Our framework employs autoencoders with recurrent neural units to reconstruct the chromatin structure. In comparison to existing methods, REACH-3D makes no transfer-function assumption and permits dynamic analysis. Evaluating REACH-3D on synthetic data indicated high agreement with the ground truth. When tested on real experimental HiC data, REACH-3D most faithfully recovered the expected biological properties and obtained the highest correlation coefficient with microscopy measurements. Lastly, REACH-3D was applied to dynamic HiC data, where it successfully modeled chromatin conformation during the cell cycle.
Tasks
Published 2018-11-22
URL http://arxiv.org/abs/1811.09619v1
PDF http://arxiv.org/pdf/1811.09619v1.pdf
PWC https://paperswithcode.com/paper/inference-of-the-three-dimensional-chromatin
Repo
Framework

Rehabilitating the ColorChecker Dataset for Illuminant Estimation

Title Rehabilitating the ColorChecker Dataset for Illuminant Estimation
Authors Ghalia Hemrit, Graham D. Finlayson, Arjan Gijsenij, Peter Gehler, Simone Bianco, Brian Funt, Mark Drew, Lilong Shi
Abstract In a previous work, it was shown that there is a curious problem with the benchmark ColorChecker dataset for illuminant estimation. To wit, this dataset has at least 3 different sets of ground-truths. Typically, for a single algorithm a single ground-truth is used. But then different algorithms, whose performance is measured with respect to different ground-truths, are compared against each other and then ranked. This makes no sense. We show in this paper that there are also errors in how each ground-truth set was calculated. As a result, all performance rankings based on the ColorChecker dataset (and there are scores of these) are inaccurate. In this paper, we re-generate a new ‘recommended’ set of ground-truths based on the calculation methodology described by Shi and Funt. We then review the performance evaluation of a range of illuminant estimation algorithms. Compared with the legacy ground-truths, we find that the differences in how algorithms perform can be large, with many local rankings of algorithms being reversed. Finally, we draw the reader’s attention to our new ‘open’ data repository which, we hope, will allow the ColorChecker set to be rehabilitated and once again become a useful benchmark for illuminant estimation algorithms.
Tasks
Published 2018-05-30
URL http://arxiv.org/abs/1805.12262v3
PDF http://arxiv.org/pdf/1805.12262v3.pdf
PWC https://paperswithcode.com/paper/rehabilitating-the-colorchecker-dataset-for
Repo
Framework
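
The rankings discussed above are typically reported as the recovery angular error between the estimated and ground-truth illuminants, which is why the choice of ground-truth set changes every reported number. A standard angular-error computation (not code from the paper) is shown below.

```python
# Standard recovery angular error between two illuminant RGB vectors.
import numpy as np

def angular_error_deg(est_rgb, gt_rgb):
    est = np.asarray(est_rgb, dtype=float)
    gt = np.asarray(gt_rgb, dtype=float)
    cos = np.dot(est, gt) / (np.linalg.norm(est) * np.linalg.norm(gt))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

estimate     = [0.9, 1.0, 0.7]     # estimated illuminant (RGB, up to scale)
ground_truth = [1.0, 1.0, 0.8]
print(round(angular_error_deg(estimate, ground_truth), 2))
```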

Improving Long-Horizon Forecasts with Expectation-Biased LSTM Networks

Title Improving Long-Horizon Forecasts with Expectation-Biased LSTM Networks
Authors Aya Abdelsalam Ismail, Timothy Wood, Héctor Corrada Bravo
Abstract State-of-the-art forecasting methods using Recurrent Neural Networks (RNNs) based on Long Short-Term Memory (LSTM) cells have shown exceptional performance on short-horizon forecasts, e.g., given a set of predictor features, forecast a target value for the next few time steps in the future. However, in many applications, the performance of these methods decays as the forecasting horizon extends beyond these few time steps. This paper explores the challenges of long-horizon forecasting using LSTM networks. We illustrate the long-horizon forecasting problem in datasets from neuroscience and energy supply management. We then propose expectation-biasing, an approach motivated by the literature on Dynamic Belief Networks, as a solution to improve long-horizon forecasting with LSTMs. We propose two LSTM architectures along with two methods for expectation biasing that significantly outperform standard practice.
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.06776v1
PDF http://arxiv.org/pdf/1804.06776v1.pdf
PWC https://paperswithcode.com/paper/improving-long-horizon-forecasts-with
Repo
Framework
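
The notion of expectation biasing can be illustrated, very loosely, by pulling a model's multi-step forecast toward an empirical expectation of the series as the horizon grows. The blending rule below is purely an assumption for illustration; the paper's two LSTM architectures and biasing methods are not reproduced here.

```python
# Heavily hedged sketch of biasing a long-horizon forecast toward an
# empirical expectation; not the paper's method.
import numpy as np

def bias_toward_expectation(model_forecast, expectation, decay=0.9):
    """model_forecast: (H,) multi-step forecast; expectation: scalar or (H,)."""
    h = np.arange(len(model_forecast))
    trust = decay ** h                      # trust the model less as h grows
    return trust * model_forecast + (1.0 - trust) * expectation

history = np.sin(np.linspace(0, 20, 200)) + 0.1 * np.random.default_rng(0).normal(size=200)
expectation = history.mean()                # long-run expectation of the series
raw_forecast = np.full(24, history[-1])     # e.g. a naive "last value" forecast
print(bias_toward_expectation(raw_forecast, expectation)[:5].round(3))
```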

Layer Trajectory LSTM

Title Layer Trajectory LSTM
Authors Jinyu Li, Changliang Liu, Yifan Gong
Abstract It is popular to stack LSTM layers to get better modeling power, especially when a large amount of training data is available. However, an LSTM-RNN with too many vanilla LSTM layers is very hard to train, and the gradient-vanishing issue persists if the network goes too deep. This issue can be partially solved by adding skip connections between layers, as in residual LSTMs. In this paper, we propose a layer trajectory LSTM (ltLSTM) which builds a layer-LSTM using all the layer outputs from a standard multi-layer time-LSTM. This layer-LSTM scans the outputs of the time-LSTMs and uses the summarized layer trajectory information for final senone classification. The forward propagation of the time-LSTM and the layer-LSTM can be handled in two separate threads in parallel, so the network computation time is the same as that of the standard time-LSTM. With a layer-LSTM running through the layers, a gated path is provided from the output layer to the bottom layer, alleviating the gradient-vanishing issue. Trained on 30 thousand hours of EN-US Microsoft internal data, the proposed ltLSTM performed significantly better than the standard multi-layer LSTM and the residual LSTM, with up to 9.0% relative word error rate reduction across different tasks.
Tasks
Published 2018-08-28
URL http://arxiv.org/abs/1808.09522v1
PDF http://arxiv.org/pdf/1808.09522v1.pdf
PWC https://paperswithcode.com/paper/layer-trajectory-lstm
Repo
Framework

Training Neural Speech Recognition Systems with Synthetic Speech Augmentation

Title Training Neural Speech Recognition Systems with Synthetic Speech Augmentation
Authors Jason Li, Ravi Gadde, Boris Ginsburg, Vitaly Lavrukhin
Abstract Building an accurate automatic speech recognition (ASR) system requires a large dataset that contains many hours of labeled speech samples produced by a diverse set of speakers. The lack of such open, free datasets is one of the main issues preventing advancements in ASR research. To address this problem, we propose to augment a natural speech dataset with synthetic speech. We train very large end-to-end neural speech recognition models using the LibriSpeech dataset augmented with synthetic speech. These new models achieve state-of-the-art Word Error Rate (WER) for character-level models without an external language model.
Tasks Language Modelling, Speech Recognition
Published 2018-11-02
URL http://arxiv.org/abs/1811.00707v1
PDF http://arxiv.org/pdf/1811.00707v1.pdf
PWC https://paperswithcode.com/paper/training-neural-speech-recognition-systems
Repo
Framework

Planning and Learning with Stochastic Action Sets

Title Planning and Learning with Stochastic Action Sets
Authors Craig Boutilier, Alon Cohen, Amit Daniely, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans
Abstract In many practical uses of reinforcement learning (RL), the set of actions available at a given state is a random variable, with realizations governed by an exogenous stochastic process. Somewhat surprisingly, the foundations for such sequential decision processes have so far been unaddressed. In this work, we formalize and investigate MDPs with stochastic action sets (SAS-MDPs) to provide these foundations. We show that optimal policies and value functions in this model have a structure that admits a compact representation. From an RL perspective, we show that Q-learning with sampled action sets is sound. In model-based settings, we consider two important special cases: when individual actions are available with independent probabilities, and a sampling-based model for unknown distributions. We develop poly-time value and policy iteration methods for both cases; and in the first, we offer a poly-time linear programming solution.
Tasks Q-Learning
Published 2018-05-07
URL http://arxiv.org/abs/1805.02363v1
PDF http://arxiv.org/pdf/1805.02363v1.pdf
PWC https://paperswithcode.com/paper/planning-and-learning-with-stochastic-action
Repo
Framework
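
The claim that "Q-learning with sampled action sets is sound" can be illustrated with a tabular sketch in which, at every step, the agent may only act over (and bootstrap over) the action subset that happened to be realized. The toy chain MDP and availability probabilities below are assumptions, not from the paper.

```python
# Toy tabular Q-learning with randomly sampled available-action sets.
import numpy as np

def sas_q_learning(n_states, n_actions, step, avail_prob=0.7,
                   episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
    rng = np.random.default_rng(0)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0
        for _ in range(50):
            avail = np.flatnonzero(rng.random(n_actions) < avail_prob)
            if avail.size == 0:
                continue                               # no action available now
            if rng.random() < eps:
                a = rng.choice(avail)
            else:
                a = avail[np.argmax(Q[s, avail])]      # greedy over available only
            s2, r, done = step(s, a, rng)
            avail2 = np.flatnonzero(rng.random(n_actions) < avail_prob)
            target = r if done or avail2.size == 0 else r + gamma * Q[s2, avail2].max()
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
            if done:
                break
    return Q

# Toy chain MDP: action 1 moves right, others stay; reaching the last state pays 1.
def chain_step(s, a, rng, n=5):
    s2 = min(s + 1, n - 1) if a == 1 else s
    done = s2 == n - 1
    return s2, float(done), done

print(sas_q_learning(5, 3, chain_step).round(2))
```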