Paper Group ANR 1211
Incremental embedding for temporal networks
Title | Incremental embedding for temporal networks |
Authors | Tomasz Kajdanowicz, Kamil Tagowski, Maciej Falkiewicz, Piotr Bielak, Przemysław Kazienko, Nitesh V. Chawla |
Abstract | Prediction over edges and nodes in graphs requires appropriate and efficiently achieved data representation. Recent research on representation learning for dynamic networks has resulted in significant progress. However, the more precise and accurate the methods, the greater their computational and memory complexity. Here, we introduce ICMEN - the first-in-class incremental meta-embedding method that produces vector representations of nodes respecting temporal dependencies in the graph. ICMEN efficiently constructs nodes’ embeddings from historical representations by convex linear combinations, making the process less memory demanding than state-of-the-art embedding algorithms. The method is capable of constructing representations for inactive and new nodes without a need to re-embed. The results of link prediction on several real-world datasets show that, by applying the ICMEN incremental meta-method to any base embedding approach, we obtain similar results while saving memory and computational power. Taken together, our work proposes a new way of efficient online representation learning in dynamic complex networks. |
Tasks | Link Prediction, Representation Learning |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03423v1 |
http://arxiv.org/pdf/1904.03423v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-embedding-for-temporal-networks |
Repo | |
Framework | |
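The abstract above describes building node embeddings as convex combinations of historical representations. A minimal sketch of that idea follows, assuming a single fixed mixing weight `alpha` and a simple fallback for inactive and new nodes; the function names, the weight, and the fallback rules are illustrative assumptions, not the weighting scheme of ICMEN itself.

```python
def convex_combine(previous, current, alpha):
    """Mix two embedding vectors with weights alpha and (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]


def update_embeddings(history, snapshot, alpha=0.5):
    """Combine historical embeddings with those of the newest graph snapshot.

    history and snapshot map node ids to equal-length lists of floats.
    Assumed fallbacks: a node missing from the snapshot (inactive) keeps its
    old vector; a node missing from the history (new) takes the snapshot vector.
    """
    updated = {}
    for node in history.keys() | snapshot.keys():
        if node in history and node in snapshot:
            updated[node] = convex_combine(history[node], snapshot[node], alpha)
        elif node in history:
            updated[node] = list(history[node])
        else:
            updated[node] = list(snapshot[node])
    return updated


if __name__ == "__main__":
    history = {"a": [1.0, 0.0], "b": [0.0, 2.0]}
    snapshot = {"a": [0.0, 1.0], "c": [3.0, 3.0]}
    print(update_embeddings(history, snapshot, alpha=0.25))
```

Only the combined vectors need to be kept between snapshots in this sketch, which is one way to read the memory saving the abstract mentions.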
Quantum Graph Neural Networks
Title | Quantum Graph Neural Networks |
Authors | Guillaume Verdon, Trevor McCourt, Enxhell Luzhnica, Vikash Singh, Stefan Leichenauer, Jack Hidary |
Abstract | We introduce Quantum Graph Neural Networks (QGNN), a new class of quantum neural network ansatze which are tailored to represent quantum processes which have a graph structure, and are particularly suitable to be executed on distributed quantum systems over a quantum network. Along with this general class of ansatze, we introduce further specialized architectures, namely, Quantum Graph Recurrent Neural Networks (QGRNN) and Quantum Graph Convolutional Neural Networks (QGCNN). We provide four example applications of QGNNs: learning Hamiltonian dynamics of quantum systems, learning how to create multipartite entanglement in a quantum network, unsupervised learning for spectral clustering, and supervised learning for graph isomorphism classification. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12264v1 |
https://arxiv.org/pdf/1909.12264v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-graph-neural-networks-1 |
Repo | |
Framework | |
Modeling Latent Sentence Structure in Neural Machine Translation
Title | Modeling Latent Sentence Structure in Neural Machine Translation |
Authors | Joost Bastings, Wilker Aziz, Ivan Titov, Khalil Sima’an |
Abstract | Recently it was shown that linguistic structure predicted by a supervised parser can be beneficial for neural machine translation (NMT). In this work we investigate a more challenging setup: we incorporate sentence structure as a latent variable in a standard NMT encoder-decoder and induce it in such a way as to benefit the translation task. We consider German-English and Japanese-English translation benchmarks and observe that when using RNN encoders the model makes no or very limited use of the structure induction apparatus. In contrast, CNN and word-embedding-based encoders rely on latent graphs and force them to encode useful, potentially long-distance, dependencies. |
Tasks | Machine Translation |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06436v1 |
http://arxiv.org/pdf/1901.06436v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-latent-sentence-structure-in-neural |
Repo | |
Framework | |
A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network
Title | A Cooperative Multi-Agent Reinforcement Learning Framework for Resource Balancing in Complex Logistics Network |
Authors | Xihan Li, Jia Zhang, Jiang Bian, Yunhai Tong, Tie-Yan Liu |
Abstract | Resource balancing within complex transportation networks is one of the most important problems in the real logistics domain. Traditional solutions to these problems leverage combinatorial optimization with demand and supply forecasting. However, the high complexity of transportation routes, severe uncertainty of future demand and supply, together with non-convex business constraints make it extremely challenging in the traditional resource management field. In this paper, we propose a novel sophisticated multi-agent reinforcement learning approach to address these challenges. In particular, inspired by the externalities, especially the interactions among resource agents, we introduce an innovative cooperative mechanism for state and reward design, resulting in more effective and efficient transportation. Extensive experiments on a simulated ocean transportation service demonstrate that our new approach can stimulate cooperation among agents and lead to much better performance. Compared with traditional solutions based on combinatorial optimization, our approach can give rise to a significant improvement in terms of both performance and stability. |
Tasks | Combinatorial Optimization, Multi-agent Reinforcement Learning |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00714v1 |
http://arxiv.org/pdf/1903.00714v1.pdf | |
PWC | https://paperswithcode.com/paper/a-cooperative-multi-agent-reinforcement |
Repo | |
Framework | |
Towards Scalable Gaussian Process Modeling
Title | Towards Scalable Gaussian Process Modeling |
Authors | Piyush Pandita, Jesper Kristensen, Liping Wang |
Abstract | Numerous engineering problems of interest to the industry are often characterized by expensive black-box objective experiments or computer simulations. Obtaining insight into the problem or performing subsequent optimizations requires hundreds of thousands of evaluations of the objective function which is most often a practically unachievable task. Gaussian Process (GP) surrogate modeling replaces the expensive function with a cheap-to-evaluate data-driven probabilistic model. While the GP does not assume a functional form of the problem, it is defined by a set of parameters, called hyperparameters. The hyperparameters define the characteristics of the objective function, such as smoothness, magnitude, periodicity, etc. Accurately estimating these hyperparameters is a key ingredient in developing a reliable and generalizable surrogate model. Markov chain Monte Carlo (MCMC) is a ubiquitously used Bayesian method to estimate these hyperparameters. At the GE Global Research Center, a customized industry-strength Bayesian hybrid modeling framework utilizing the GP, called GEBHM, has been employed and validated over many years. GEBHM is very effective on problems of small and medium size, typically less than 1000 training points. However, the GP does not scale well in time with a growing dataset and problem dimensionality which can be a major impediment in such problems. In this work, we extend and implement in GEBHM an Adaptive Sequential Monte Carlo (ASMC) methodology for training the GP enabling the modeling of large-scale industry problems. This implementation saves computational time (especially for large-scale problems) while not sacrificing predictability over the current MCMC implementation. We demonstrate the effectiveness and accuracy of GEBHM with ASMC on four mathematical problems and on two challenging industry applications of varying complexity. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11313v1 |
https://arxiv.org/pdf/1907.11313v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-scalable-gaussian-process-modeling |
Repo | |
Framework | |
Continual learning: A comparative study on how to defy forgetting in classification tasks
Title | Continual learning: A comparative study on how to defy forgetting in classification tasks |
Authors | Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh, Tinne Tuytelaars |
Abstract | Artificial neural networks thrive in solving the classification problem for a particular rigid task, where the network resembles a static entity of knowledge, acquired through generalized learning behaviour from a distinct training phase. However, endeavours to extend this knowledge without targeting the original task usually result in a catastrophic forgetting of this task. Continual learning shifts this paradigm towards a network that can continually accumulate knowledge over different tasks without the need for retraining from scratch, with methods in particular aiming to alleviate forgetting. We focus on task-incremental classification, where tasks arrive in a batch-like fashion, and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 10 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize which method performs best, both on the balanced Tiny Imagenet dataset and on a large-scale unbalanced iNaturalist dataset. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage. |
Tasks | Continual Learning |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08383v1 |
https://arxiv.org/pdf/1909.08383v1.pdf | |
PWC | https://paperswithcode.com/paper/continual-learning-a-comparative-study-on-how |
Repo | |
Framework | |
Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education
Title | Dolphin: A Spoken Language Proficiency Assessment System for Elementary Education |
Authors | Wenbiao Ding, Guowei Xu, Tianqiao Liu, Weiping Fu, Yujia Song, Chaoyou Guo, Cong Kong, Songfan Yang, Gale Yan Huang, Zitao Liu |
Abstract | Spoken language proficiency is critically important for children’s growth and personal development. Due to the limited and imbalanced educational resources in China, elementary students barely have chances to improve their oral language skills in classes. Verbal fluency tasks (VFTs) were invented to let the students practice their spoken language proficiency after school. VFTs are simple but concrete math related questions that ask students to not only report answers but speak out the entire thinking process. In spite of the great success of VFTs, they bring a heavy grading burden to elementary teachers. To alleviate this problem, we develop Dolphin, a spoken language proficiency assessment system for Chinese elementary education. Dolphin is able to automatically evaluate both phonological fluency and semantic relevance of students’ VFT answers. We conduct a wide range of offline and online experiments to demonstrate the effectiveness of Dolphin. In our offline experiments, we show that Dolphin improves both phonological fluency and semantic relevance evaluation performance when compared to state-of-the-art baselines on real-world educational data sets. In our online A/B experiments, we test Dolphin with 183 teachers from 2 major cities (Hangzhou and Xi’an) in China for 10 weeks and the results show that VFT assignments grading coverage is improved by 22%. |
Tasks | |
Published | 2019-08-01 |
URL | https://arxiv.org/abs/1908.00358v3 |
https://arxiv.org/pdf/1908.00358v3.pdf | |
PWC | https://paperswithcode.com/paper/dolphin-a-verbal-fluency-evaluation-system |
Repo | |
Framework | |
Asymptotic Risk of Bezier Simplex Fitting
Title | Asymptotic Risk of Bezier Simplex Fitting |
Authors | Akinori Tanaka, Akiyoshi Sannai, Ken Kobayashi, Naoki Hamada |
Abstract | The Bezier simplex fitting is a novel data modeling technique which exploits geometric structures of data to approximate the Pareto front of multi-objective optimization problems. There are two fitting methods based on different sampling strategies. The inductive skeleton fitting employs a stratified subsampling from each skeleton of a simplex, whereas the all-at-once fitting uses a non-stratified sampling which treats a simplex as a whole. In this paper, we analyze the asymptotic risks of those Bézier simplex fitting methods and derive the optimal subsample ratio for the inductive skeleton fitting. It is shown that the inductive skeleton fitting with the optimal ratio has a smaller risk when the degree of a Bezier simplex is less than three. Those results are verified numerically under small to moderate sample sizes. In addition, we provide two complementary applications of our theory: a generalized location problem and a multi-objective hyper-parameter tuning of the group lasso. The former can be represented by a Bezier simplex of degree two where the inductive skeleton fitting outperforms. The latter can be represented by a Bezier simplex of degree three where the all-at-once fitting gets an advantage. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06924v1 |
https://arxiv.org/pdf/1906.06924v1.pdf | |
PWC | https://paperswithcode.com/paper/asymptotic-risk-of-bezier-simplex-fitting |
Repo | |
Framework | |
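For readers unfamiliar with the object being fitted, a Bezier simplex of degree D can be evaluated as a sum of control points weighted by multinomial Bernstein terms. The sketch below shows only that standard evaluation formula, not either fitting method from the abstract; the function names and the dictionary layout for control points are assumptions made for illustration.

```python
from math import factorial


def multinomial(index):
    """Multinomial coefficient D! / (d_1! * ... * d_M!) for a multi-index."""
    coeff = factorial(sum(index))
    for d in index:
        coeff //= factorial(d)
    return coeff


def bezier_simplex(control_points, t):
    """Evaluate a Bezier simplex at barycentric coordinates t.

    control_points maps multi-indices (tuples of non-negative ints that all
    sum to the same degree D) to points given as lists of floats.
    t is a tuple of non-negative weights that sum to 1.
    """
    dim = len(next(iter(control_points.values())))
    out = [0.0] * dim
    for index, point in control_points.items():
        # Bernstein weight: multinomial coefficient times prod(t_i ** d_i).
        weight = float(multinomial(index))
        for t_i, d_i in zip(t, index):
            weight *= t_i ** d_i
        for k in range(dim):
            out[k] += weight * point[k]
    return out


if __name__ == "__main__":
    # Degree-two curve (a 1-simplex) with three control points.
    curve = {(2, 0): [0.0], (1, 1): [1.0], (0, 2): [0.0]}
    print(bezier_simplex(curve, (0.5, 0.5)))
```

With degree one the formula reduces to plain barycentric interpolation of the vertices, which is a convenient sanity check.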
The application of Convolutional Neural Networks to Detect Slow, Sustained Deformation in InSAR Timeseries
Title | The application of Convolutional Neural Networks to Detect Slow, Sustained Deformation in InSAR Timeseries |
Authors | N. Anantrasirichai, J. Biggs, F. Albino, D. Bull |
Abstract | Automated systems for detecting deformation in satellite InSAR imagery could be used to develop a global monitoring system for volcanic and urban environments. Here we explore the limits of a CNN for detecting slow, sustained deformations in wrapped interferograms. Using synthetic data, we estimate a detection threshold of 3.9 cm for deformation signals alone, and 6.3 cm when atmospheric artefacts are considered. Over-wrapping reduces this to 1.8 cm and 5.0 cm respectively as more fringes are generated without altering SNR. We test the approach on timeseries of cumulative deformation from Campi Flegrei and Dallol, where over-wrapping improves classification performance by up to 15%. We propose a mean-filtering method for combining results of different wrap parameters to flag deformation. At Campi Flegrei, deformation of 8.5 cm/yr was detected after 60 days and at Dallol, deformation of 3.5 cm/yr was detected after 310 days. This corresponds to cumulative displacements of 3 cm and 4 cm, consistent with estimates based on synthetic data. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02321v1 |
https://arxiv.org/pdf/1909.02321v1.pdf | |
PWC | https://paperswithcode.com/paper/the-application-of-convolutional-neural |
Repo | |
Framework | |
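The abstract refers to wrapped interferograms and to over-wrapping that generates more fringes. A rough sketch of phase wrapping is given below, assuming a C-band wavelength of about 5.6 cm (not stated in the abstract) and modelling over-wrapping as a multiplicative `gain` applied to the phase before wrapping; both the wavelength and the `gain` parameter are assumptions, and the authors' exact wrap parameters may differ.

```python
import math

# Assumed radar wavelength in centimetres (typical C-band value); hypothetical here.
WAVELENGTH_CM = 5.6


def displacement_to_phase(displacement_cm, wavelength_cm=WAVELENGTH_CM):
    """Unwrapped phase in radians for a line-of-sight displacement.

    The signal travels to the ground and back, so half a wavelength of
    motion corresponds to one full 2*pi cycle.
    """
    return 4.0 * math.pi * displacement_cm / wavelength_cm


def wrap(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def wrapped_phase(displacement_cm, gain=1, wavelength_cm=WAVELENGTH_CM):
    """Wrapped phase after scaling by an assumed over-wrapping gain."""
    return wrap(gain * displacement_to_phase(displacement_cm, wavelength_cm))


def fringe_count(displacement_cm, gain=1, wavelength_cm=WAVELENGTH_CM):
    """Number of 2*pi cycles (fringes) produced by a displacement."""
    phase = gain * displacement_to_phase(displacement_cm, wavelength_cm)
    return abs(phase) / (2.0 * math.pi)


if __name__ == "__main__":
    for gain in (1, 2, 4):
        print(gain, fringe_count(3.9, gain=gain))
```

Under these assumptions a larger gain turns the same displacement into proportionally more fringes, which is the effect the abstract attributes to over-wrapping.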
Counterfactual Reasoning for Fair Clinical Risk Prediction
Title | Counterfactual Reasoning for Fair Clinical Risk Prediction |
Authors | Stephen Pfohl, Tony Duan, Daisy Yi Ding, Nigam H. Shah |
Abstract | The use of machine learning systems to support decision making in healthcare raises questions as to what extent these systems may introduce or exacerbate disparities in care for historically underrepresented and mistreated groups, due to biases implicitly embedded in observational data in electronic health records. To address this problem in the context of clinical risk prediction models, we develop an augmented counterfactual fairness criteria to extend the group fairness criteria of equalized odds to an individual level. We do so by requiring that the same prediction be made for a patient, and a counterfactual patient resulting from changing a sensitive attribute, if the factual and counterfactual outcomes do not differ. We investigate the extent to which the augmented counterfactual fairness criteria may be applied to develop fair models for prolonged inpatient length of stay and mortality with observational electronic health records data. As the fairness criteria is ill-defined without knowledge of the data generating process, we use a variational autoencoder to perform counterfactual inference in the context of an assumed causal graph. While our technique provides a means to trade off maintenance of fairness with reduction in predictive performance in the context of a learned generative model, further work is needed to assess the generality of this approach. |
Tasks | Counterfactual Inference, Decision Making |
Published | 2019-07-14 |
URL | https://arxiv.org/abs/1907.06260v1 |
https://arxiv.org/pdf/1907.06260v1.pdf | |
PWC | https://paperswithcode.com/paper/counterfactual-reasoning-for-fair-clinical |
Repo | |
Framework | |
Map Matching Algorithm for Large-scale Datasets
Title | Map Matching Algorithm for Large-scale Datasets |
Authors | David Fiedler, Michal Čáp, Jan Nykl, Pavol Žilecký, Martin Schaefer |
Abstract | GPS receivers embedded in cell phones and connected vehicles generate a series of location measurements that can be used for various analytical purposes. A common pre-processing step of this data is the so-called map matching. The goal of map matching is to infer the trajectory that the device followed in a road network from a potentially sparse series of noisy location measurements. Although accurate and robust map matching algorithms based on probabilistic models exist, they are computationally heavy and thus impractical for processing of large datasets. In this paper, we present a scalable map-matching algorithm based on Dijkstra's shortest path method that is both accurate and applicable to large datasets. Our experiments on a publicly available dataset showed that the proposed method achieves accuracy that is comparable to that of the existing map matching methods using only a fraction of the computational resources. As a result, our algorithm can be used to efficiently process large datasets of noisy and potentially sparse location data that would be unexploitable using existing techniques due to their high computational requirements. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1910.05312v1 |
https://arxiv.org/pdf/1910.05312v1.pdf | |
PWC | https://paperswithcode.com/paper/map-matching-algorithm-for-large-scale |
Repo | |
Framework | |
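The map-matching method above is described as building on Dijkstra's shortest path method. The sketch below shows only that building block on a toy road graph; the adjacency-list layout and the function name are assumptions, and the way the authors score GPS measurements against candidate roads is not reproduced here.

```python
import heapq


def dijkstra(graph, source, target):
    """Return (cost, path) of a cheapest route from source to target.

    graph maps a node to a list of (neighbour, non-negative edge cost) pairs.
    If the target is unreachable, returns (inf, []).
    """
    dist = {source: 0.0}
    prev = {}
    visited = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale queue entry
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


if __name__ == "__main__":
    roads = {
        "A": [("B", 1.0), ("C", 4.0)],
        "B": [("C", 2.0), ("D", 5.0)],
        "C": [("D", 1.0)],
        "D": [],
    }
    print(dijkstra(roads, "A", "D"))
```

In a map-matching setting the edge costs would presumably mix road length with the distance from each road segment to nearby GPS points, but that cost design is left open here.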
The Curious Case of Neural Text Degeneration
Title | The Curious Case of Neural Text Degeneration |
Authors | Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi |
Abstract | Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators. The counter-intuitive empirical observation is that even though the use of likelihood as training objective leads to high quality models for a broad range of language understanding tasks, using likelihood as a decoding objective leads to text that is bland and strangely repetitive. In this paper, we reveal surprising distributional differences between human text and machine text. In addition, we find that decoding strategies alone can dramatically affect the quality of machine text, even when generated from exactly the same neural language model. Our findings motivate Nucleus Sampling, a simple but effective method to draw the best out of neural generation. By sampling text from the dynamic nucleus of the probability distribution, which allows for diversity while effectively truncating the less reliable tail of the distribution, the resulting text better demonstrates the quality of human text, yielding enhanced diversity without sacrificing fluency and coherence. |
Tasks | Language Modelling |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.09751v2 |
https://arxiv.org/pdf/1904.09751v2.pdf | |
PWC | https://paperswithcode.com/paper/the-curious-case-of-neural-text-degeneration |
Repo | |
Framework | |
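Nucleus Sampling, as summarized above, samples from the most probable tokens and truncates the unreliable tail. A minimal sketch over a plain list of next-token probabilities follows; the function names and the `top_p` parameter name are illustrative assumptions rather than the authors' code.

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose mass reaches top_p.

    probs is a list of next-token probabilities indexed by token id.
    Returns {token_id: renormalized probability} for the kept tokens.
    """
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, mass = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        mass += p
        if mass >= top_p:
            break  # everything after this point is the truncated tail
    return {idx: p / mass for idx, p in kept}


def nucleus_sample(probs, top_p, rng=random):
    """Draw one token id from the renormalized nucleus."""
    nucleus = nucleus_filter(probs, top_p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    next_token_probs = [0.5, 0.3, 0.15, 0.05]
    print(nucleus_filter(next_token_probs, top_p=0.7))
    print(nucleus_sample(next_token_probs, top_p=0.7))
```

The nucleus is dynamic in the sense that its size depends on the shape of the distribution: a peaked distribution keeps few tokens, a flat one keeps many.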
Rapidly-Exploring Quotient-Space Trees: Motion Planning using Sequential Simplifications
Title | Rapidly-Exploring Quotient-Space Trees: Motion Planning using Sequential Simplifications |
Authors | Andreas Orthey, Marc Toussaint |
Abstract | Motion planning problems can be simplified by admissible projections of the configuration space to sequences of lower-dimensional quotient-spaces, called sequential simplifications. To exploit sequential simplifications, we present the Quotient-space Rapidly-exploring Random Trees (QRRT) algorithm. QRRT takes as input a start and a goal configuration, and a sequence of quotient-spaces. The algorithm grows trees on the quotient-spaces both sequentially and simultaneously to guarantee a dense coverage. QRRT is shown to be (1) probabilistically complete, and (2) can reduce the runtime by at least one order of magnitude. However, we show in experiments that the runtime varies substantially between different quotient-space sequences. To find out why, we perform an additional experiment, showing that the more narrow an environment, the more a quotient-space sequence can reduce runtime. |
Tasks | Motion Planning |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01350v2 |
https://arxiv.org/pdf/1906.01350v2.pdf | |
PWC | https://paperswithcode.com/paper/rapidly-exploring-quotient-space-trees-motion |
Repo | |
Framework | |
Geometric Estimation of Multivariate Dependency
Title | Geometric Estimation of Multivariate Dependency |
Authors | Salimeh Yasaei Sekeh, Alfred O. Hero |
Abstract | This paper proposes a geometric estimator of dependency between a pair of multivariate samples. The proposed estimator of dependency is based on a randomly permuted geometric graph (the minimal spanning tree) over the two multivariate samples. This estimator converges to a quantity that we call the geometric mutual information (GMI), which is equivalent to the Henze-Penrose divergence [1] between the joint distribution of the multivariate samples and the product of the marginals. The GMI has many of the same properties as standard MI but can be estimated from empirical data without density estimation, making it scalable to large datasets. The proposed empirical estimator of GMI is simple to implement, involving the construction of an MST spanning over both the original data and a randomly permuted version of this data. We establish asymptotic convergence of the estimator and convergence rates of the bias and variance for smooth multivariate density functions belonging to a Hölder class. We demonstrate the advantages of our proposed geometric dependency estimator in a series of experiments. |
Tasks | Density Estimation |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08594v1 |
https://arxiv.org/pdf/1905.08594v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-estimation-of-multivariate |
Repo | |
Framework | |
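The estimator described above builds a minimal spanning tree over the original pairs and a randomly permuted copy. A rough sketch is given below, assuming the Friedman-Rafsky form of the Henze-Penrose estimate (one minus a scaled count of MST edges that join the two groups); the function names are hypothetical, and the exact sample splitting and normalization used in the paper may differ.

```python
import math
import random


def mst_edges(points):
    """Edges of a Euclidean minimum spanning tree (Prim's algorithm, O(n^2))."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def cross_edge_count(points, labels):
    """Number of MST edges whose endpoints carry different labels."""
    return sum(1 for u, v in mst_edges(points) if labels[u] != labels[v])


def gmi_estimate(xs, ys, rng=random):
    """Assumed GMI sketch: joint pairs versus pairs with ys randomly permuted.

    xs and ys are equal-length lists of tuples. The result is clamped at 0.
    """
    shuffled = list(ys)
    rng.shuffle(shuffled)
    joint = [x + y for x, y in zip(xs, ys)]           # samples from the joint
    product = [x + y for x, y in zip(xs, shuffled)]   # mimics product of marginals
    m, n = len(joint), len(product)
    labels = [0] * m + [1] * n
    r = cross_edge_count(joint + product, labels)
    return max(0.0, 1.0 - r * (m + n) / (2.0 * m * n))


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [(rng.random(),) for _ in range(100)]
    print(gmi_estimate(xs, xs, rng))  # y equals x, so dependence is strong
```

No density estimate appears anywhere in the sketch; only pairwise distances are used, which is the property the abstract highlights.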
Image Super-Resolution Using Attention Based DenseNet with Residual Deconvolution
Title | Image Super-Resolution Using Attention Based DenseNet with Residual Deconvolution |
Authors | Zhuangzi Li |
Abstract | Image super-resolution is a challenging task and has attracted increasing attention in research and industrial communities. In this paper, we propose a novel end-to-end Attention-based DenseNet with Residual Deconvolution, named ADRD. In our ADRD, a weighted dense block, in which the current layer receives weighted features from all previous levels, is proposed to adaptively capture valuable features in dense layers. A novel spatial attention module is also presented to generate a group of attentive maps for emphasizing informative regions. In addition, we design an innovative strategy to upsample residual information via the deconvolution layer, so that the high-frequency details can be accurately upsampled. Extensive experiments conducted on publicly available datasets demonstrate the promising performance of the proposed ADRD against state-of-the-art methods, both quantitatively and qualitatively. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.05282v1 |
https://arxiv.org/pdf/1907.05282v1.pdf | |
PWC | https://paperswithcode.com/paper/image-super-resolution-using-attention-based |
Repo | |
Framework | |