Paper Group ANR 506
Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models. Laplacian Matrix for Dimensionality Reduction and Clustering. Reconstructing commuters network using machine learning and urban indicators. Variational Bayes on Manifolds. Encouraging an Appropriate Representation Simplifies Training of Neural Networks. ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image. Detailed comparison of communication efficiency of split learning and federated learning. A deep learning model for early prediction of Alzheimer’s disease dementia based on hippocampal MRI. Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization. Learning Neural PDE Solvers with Convergence Guarantees. Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text. Decision Automation for Electric Power Network Recovery. Weakly-supervised Knowledge Graph Alignment with Adversarial Learning. Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation. Spot Evasion Attacks: Adversarial Examples for License Plate Recognition Systems with Convolutional Neural Networks.
Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models
Title | Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models |
Authors | Romit Maulik, Vishwas Rao, Sandeep Madireddy, Bethany Lusch, Prasanna Balaprakash |
Abstract | Rapid simulations of advection-dominated problems are vital for multiple engineering and geophysical applications. In this paper, we present a long short-term memory neural network to approximate the nonlinear component of the reduced-order model (ROM) of an advection-dominated partial differential equation. This is motivated by the fact that the nonlinear term is the most expensive component of a successful ROM. For our approach, we utilize a Galerkin projection to isolate the linear and the transient components of the dynamical system and then use discrete empirical interpolation to generate training data for supervised learning. We note that the numerical time-advancement and linear-term computation of the system ensure a greater preservation of physics than does a process that is fully modeled. Our results show that the proposed framework recovers transient dynamics accurately without nonlinear term computations in full-order space and represents a cost-effective alternative to solely equation-based ROMs. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.09144v2 |
PDF | https://arxiv.org/pdf/1909.09144v2.pdf |
PWC | https://paperswithcode.com/paper/using-recurrent-neural-networks-for-nonlinear |
Repo | |
Framework | |
Laplacian Matrix for Dimensionality Reduction and Clustering
Title | Laplacian Matrix for Dimensionality Reduction and Clustering |
Authors | Laurenz Wiskott, Fabian Schönfeld |
Abstract | Many problems in machine learning can be expressed by means of a graph with nodes representing training samples and edges representing the relationship between samples in terms of similarity, temporal proximity, or label information. Graphs can in turn be represented by matrices. A special example is the Laplacian matrix, which allows us to assign each node a value that varies only slightly between strongly connected nodes and more substantially between distant nodes. Such an assignment can be used to extract a useful feature representation, find a good embedding of the data in a low-dimensional space, or perform clustering on the original samples. In these lecture notes we first introduce the Laplacian matrix and then present a small number of algorithms designed around it. |
Tasks | Dimensionality Reduction |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08381v1 |
PDF | https://arxiv.org/pdf/1909.08381v1.pdf |
PWC | https://paperswithcode.com/paper/laplacian-matrix-for-dimensionality-reduction |
Repo | |
Framework | |
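The construction described in the abstract is compact enough to sketch directly. The toy below is our illustration, not code from the lecture notes (the graph, its two clusters, and the sign threshold are invented for the example): it builds the unnormalized Laplacian L = D - A for two triangles joined by a single edge, then clusters the nodes using the sign pattern of the Fiedler vector, the eigenvector of the second-smallest eigenvalue.

```python
import numpy as np

# Toy adjacency matrix: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # unnormalized graph Laplacian

# eigh returns eigenvalues in ascending order; the smallest is 0 with a
# constant eigenvector, and the next one (the Fiedler vector) varies
# little within each strongly connected cluster but changes sign
# between the two clusters.
eigvals, eigvecs = np.linalg.eigh(L)
fiedler = eigvecs[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)
```

The same eigenvectors, taken a few at a time, give the low-dimensional embedding used by Laplacian eigenmaps and spectral clustering.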
Reconstructing commuters network using machine learning and urban indicators
Title | Reconstructing commuters network using machine learning and urban indicators |
Authors | Gabriel Spadon, Andre C. P. L. F. de Carvalho, Jose F. Rodrigues-Jr, Luiz G. A. Alves |
Abstract | Human mobility has a significant impact on several layers of society, from infrastructural planning and economics to the spread of diseases and crime. Representing the system as a complex network, in which nodes are assigned to regions (e.g., a city) and links indicate the flow of people between two of them, physics-inspired models have been proposed to quantify the number of people migrating from one city to the other. Despite the advances made by these models, our ability to predict the number of commuters and reconstruct mobility networks remains limited. Here, we propose an alternative approach using machine learning and 22 urban indicators to predict the flow of people and reconstruct the intercity commuters network. Our results reveal that predictions based on machine learning algorithms and urban indicators can reconstruct the commuters network with 90.4% accuracy and describe 77.6% of the variance observed in the flow of people between cities. We also identify the features essential to recovering the network structure and the urban indicators most related to commuting patterns. As previously reported, distance plays a significant role in commuting, but other indicators, such as Gross Domestic Product (GDP) and unemployment rate, are also driving forces for people to commute. We believe that our results shed new light on the modeling of migration and reinforce the role of urban indicators in commuting patterns. Also, because link prediction and network reconstruction are still open challenges in network science, our results have implications for other areas, such as economics, the social sciences, and biology, where node attributes can give us information about the existence of links connecting entities in a network. |
Tasks | Link Prediction |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03512v1 |
PDF | https://arxiv.org/pdf/1908.03512v1.pdf |
PWC | https://paperswithcode.com/paper/reconstructing-commuters-network-using |
Repo | |
Framework | |
Variational Bayes on Manifolds
Title | Variational Bayes on Manifolds |
Authors | Minh-Ngoc Tran, Dang H. Nguyen, Duy Nguyen |
Abstract | Variational Bayes (VB) has become a widely used tool for Bayesian inference in statistics and machine learning. Nonetheless, the development of existing VB algorithms has so far generally been restricted to the case where the variational parameter space is Euclidean, which hinders the potential broad application of VB methods. This paper extends the scope of VB to the case where the variational parameter space is a Riemannian manifold. We develop an efficient manifold-based VB algorithm that exploits both the geometric structure of the constrained parameter space and the information geometry of the manifold of VB approximating probability distributions. Our algorithm is provably convergent and achieves convergence rates of order $\mathcal O(1/\sqrt{T})$ and $\mathcal O(1/T^{2-2\epsilon})$ for a non-convex evidence lower bound function and a strongly retraction-convex evidence lower bound function, respectively. We develop in particular two manifold VB algorithms, Manifold Gaussian VB and Manifold Neural Net VB, and demonstrate through numerical experiments that the proposed algorithms are stable, less sensitive to initialization, and compare favourably to existing VB methods. |
Tasks | Bayesian Inference |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03097v2 |
PDF | https://arxiv.org/pdf/1908.03097v2.pdf |
PWC | https://paperswithcode.com/paper/variational-bayes-on-manifolds |
Repo | |
Framework | |
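The project-step-retract recipe that manifold optimization methods like this build on can be illustrated on the simplest possible manifold. The sketch below is our toy, not the paper's algorithm: a linear objective over the unit sphere stands in for an evidence lower bound whose variational parameter is constrained to a manifold. Each iteration projects the Euclidean gradient onto the tangent space at the current point, takes a step, and retracts back to the sphere by renormalizing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "ELBO": L(mu) = a . mu, maximized over the unit sphere.
a = np.array([3.0, 1.0, -2.0])
mu = rng.standard_normal(3)
mu /= np.linalg.norm(mu)          # start on the manifold

for _ in range(200):
    grad = a                              # Euclidean gradient of a . mu
    tangent = grad - (grad @ mu) * mu     # project onto tangent space at mu
    mu = mu + 0.1 * tangent               # Riemannian gradient step
    mu /= np.linalg.norm(mu)              # retraction: back to the sphere

# At the optimum mu aligns with a / ||a||.
print(mu @ (a / np.linalg.norm(a)))
```

Respecting the constraint through the retraction, rather than optimizing in the ambient space and clipping afterwards, is the structural idea the abstract refers to as exploiting the geometry of the parameter space.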
Encouraging an Appropriate Representation Simplifies Training of Neural Networks
Title | Encouraging an Appropriate Representation Simplifies Training of Neural Networks |
Authors | Krisztian Buza |
Abstract | A common assumption about neural networks is that they can learn appropriate internal representations on their own; see, e.g., end-to-end learning. In this work we challenge this assumption. We consider two simple tasks and show that state-of-the-art training algorithms fail, although the model itself is able to represent an appropriate solution. We demonstrate that encouraging an appropriate internal representation allows the same model to solve these tasks. While we do not claim that it is impossible to solve these tasks by other means (such as neural networks with more layers), our results illustrate that integrating domain knowledge in the form of a desired internal representation may improve the generalization ability of neural networks. |
Tasks | |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07245v1 |
PDF | https://arxiv.org/pdf/1911.07245v1.pdf |
PWC | https://paperswithcode.com/paper/encouraging-an-appropriate-representation |
Repo | |
Framework | |
ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image
Title | ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image |
Authors | Yida Wang, David Joseph Tan, Nassir Navab, Federico Tombari |
Abstract | We propose a novel model for 3D semantic completion from a single depth image, based on a single encoder and three separate generators used to reconstruct different geometric and semantic representations of the original and completed scene, all sharing the same latent space. To transfer information between the geometric and semantic branches of the network, we introduce paths between them that concatenate features at corresponding network layers. Motivated by the limited number of training samples from real scenes, an interesting attribute of our architecture is its capacity to supplement the existing dataset by generating a new training dataset of high-quality, realistic scenes that even include occlusion and real noise. We build the new dataset by sampling features directly from the latent space, which generates pairs of partial volumetric surfaces and completed volumetric semantic surfaces. Moreover, we utilize multiple discriminators to increase the accuracy and realism of the reconstructions. We demonstrate the benefits of our approach on standard benchmarks for the two most common completion tasks: semantic 3D scene completion and 3D object completion. |
Tasks | |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01106v1 |
PDF | https://arxiv.org/pdf/1909.01106v1.pdf |
PWC | https://paperswithcode.com/paper/forknet-multi-branch-volumetric-semantic |
Repo | |
Framework | |
Detailed comparison of communication efficiency of split learning and federated learning
Title | Detailed comparison of communication efficiency of split learning and federated learning |
Authors | Abhishek Singh, Praneeth Vepakomma, Otkrist Gupta, Ramesh Raskar |
Abstract | We compare the communication efficiency of two compelling distributed machine learning approaches: split learning and federated learning. We show useful settings under which each method outperforms the other in terms of communication efficiency. We consider various practical scenarios of distributed learning setups and juxtapose the two methods under various real-life conditions. We consider settings with small and large numbers of clients, as well as small models (1M-6M parameters), large models (10M-200M parameters), and very large models (1-100 billion parameters). We show that increasing the number of clients or the model size favors the split learning setup over federated learning, while increasing the number of data samples while keeping the number of clients or the model size low makes federated learning more communication-efficient. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.09145v1 |
PDF | https://arxiv.org/pdf/1909.09145v1.pdf |
PWC | https://paperswithcode.com/paper/detailed-comparison-of-communication |
Repo | |
Framework | |
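The trade-off the abstract describes follows from a back-of-envelope cost model. The sketch below uses deliberately simplified formulas of our own, not the paper's exact accounting: federated learning ships the full model to and from every client each round, while split learning ships per-sample activations and gradients at the cut layer. All sizes and counts are illustrative.

```python
def federated_comm(num_clients, model_params, rounds, bytes_per_param=4):
    # Each round, every client downloads and then uploads the full model.
    return 2 * num_clients * model_params * rounds * bytes_per_param

def split_comm(num_samples, activation_size, epochs, bytes_per_param=4):
    # Each sample's forward activations go up and its gradients come
    # back at the cut layer, once per epoch.
    return 2 * num_samples * activation_size * epochs * bytes_per_param

# Many clients and a large (100M-parameter) model: split learning wins.
fed = federated_comm(num_clients=100, model_params=100_000_000, rounds=10)
spl = split_comm(num_samples=50_000, activation_size=4096, epochs=10)
print(fed / 1e9, spl / 1e9)   # total gigabytes transferred
```

Shrinking the model and client count while growing the dataset flips the comparison in federated learning's favor, matching the abstract's conclusion.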
A deep learning model for early prediction of Alzheimer’s disease dementia based on hippocampal MRI
Title | A deep learning model for early prediction of Alzheimer’s disease dementia based on hippocampal MRI |
Authors | Hongming Li, Mohamad Habes, David A. Wolk, Yong Fan |
Abstract | Introduction: It is challenging at baseline to predict when and which individuals who meet criteria for mild cognitive impairment (MCI) will ultimately progress to Alzheimer’s disease (AD) dementia. Methods: A deep learning method is developed and validated based on MRI scans of 2146 subjects (803 for training and 1343 for validation) to predict MCI subjects’ progression to AD dementia in a time-to-event analysis setting. Results: The deep learning time-to-event model predicted individual subjects’ progression to AD dementia with a concordance index (C-index) of 0.762 on 439 ADNI testing MCI subjects with follow-up durations from 6 to 78 months (quartiles: [24, 42, 54]) and a C-index of 0.781 on 40 AIBL testing MCI subjects with follow-up durations from 18 to 54 months (quartiles: [18, 36, 54]). The predicted progression risk also clustered individual subjects into subgroups with significant differences in their progression time to AD dementia (p<0.0002). Improved performance for predicting progression to AD dementia (C-index=0.864) was obtained when the deep learning based progression risk was combined with baseline clinical measures. Conclusion: Our method provides a cost-effective and accurate means for prognosis and could facilitate enrollment in clinical trials of individuals likely to progress within a specific temporal period. |
Tasks | |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07282v1 |
PDF | http://arxiv.org/pdf/1904.07282v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-learning-model-for-early-prediction-of |
Repo | |
Framework | |
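The concordance index reported above can be computed from first principles. Below is a minimal sketch of the standard C-index definition, not the authors' code, and the four toy subjects are invented: the C-index is the fraction of comparable subject pairs whose predicted risks are ordered consistently with their observed progression times.

```python
def concordance_index(times, events, risks):
    """C-index: fraction of comparable pairs whose predicted risks agree
    with the observed time ordering. A pair (i, j) is comparable when
    the earlier time belongs to an observed event (not a censored
    follow-up); ties in predicted risk count half."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy cohort: higher predicted risk should mean earlier progression.
times = [12, 24, 36, 60]      # months to progression / last follow-up
events = [1, 1, 1, 0]         # 0 = censored at last follow-up
risks = [0.9, 0.7, 0.4, 0.2]  # model-predicted risk scores
print(concordance_index(times, events, risks))  # perfectly concordant -> 1.0
```

A C-index of 0.5 corresponds to random ranking, which is why values like the paper's 0.762-0.864 indicate substantial predictive signal.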
Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization
Title | Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization |
Authors | Zhenxun Zhuang, Ashok Cutkosky, Francesco Orabona |
Abstract | Stochastic Gradient Descent (SGD) has played a central role in machine learning. However, it requires a carefully hand-picked stepsize for fast convergence, which is notoriously tedious and time-consuming to tune. Over the last several years, a plethora of adaptive gradient-based algorithms have emerged to ameliorate this problem. They have proved efficient in reducing the labor of tuning in practice, but many of them lack theoretical guarantees even in the convex setting. In this paper, we propose new surrogate losses to cast the problem of learning the optimal stepsizes for the stochastic optimization of a non-convex smooth objective function as an online convex optimization problem. This allows the use of no-regret online algorithms to compute optimal stepsizes on the fly. In turn, this yields an SGD algorithm with self-tuned stepsizes that guarantees convergence rates automatically adaptive to the level of noise. |
Tasks | Stochastic Optimization |
Published | 2019-01-25 |
URL | https://arxiv.org/abs/1901.09068v2 |
PDF | https://arxiv.org/pdf/1901.09068v2.pdf |
PWC | https://paperswithcode.com/paper/surrogate-losses-for-online-learning-of |
Repo | |
Framework | |
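The general idea of treating the stepsize as the decision of an online learner can be sketched on a toy problem. This is our crude illustration, not the paper's method: the paper constructs convex surrogate losses with regret guarantees, whereas the toy below simply nudges the log-stepsize up or down according to the sign of the correlation between successive noisy gradients, a hypergradient-style stand-in for a surrogate signal.

```python
import math
import random

random.seed(0)

# Toy objective: f(x) = 0.5 * x^2, observed through noisy gradients.
def noisy_grad(x):
    return x + random.gauss(0.0, 0.1)

x, eta = 5.0, 0.01            # iterate and (self-tuned) stepsize
beta, eta_max = 0.02, 1.0     # online learning rate for eta, and a cap
g_prev = noisy_grad(x)
for _ in range(500):
    x -= eta * g_prev
    g = noisy_grad(x)
    # Online multiplicative update of the stepsize: grow eta while
    # successive gradients point the same way, shrink it otherwise.
    eta = min(eta_max, eta * math.exp(beta if g * g_prev > 0 else -beta))
    g_prev = g

print(x, eta)   # x near the minimum, eta grown from its tiny initial value
```

The appeal mirrored here is that the initial stepsize barely matters: the online learner raises it while progress is consistent and backs off once the noise dominates.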
Learning Neural PDE Solvers with Convergence Guarantees
Title | Learning Neural PDE Solvers with Convergence Guarantees |
Authors | Jun-Ting Hsieh, Shengjia Zhao, Stephan Eismann, Lucia Mirabella, Stefano Ermon |
Abstract | Partial differential equations (PDEs) are widely used across the physical and computational sciences. Decades of research and engineering have gone into designing fast iterative solution methods. Existing solvers are general purpose, but may be sub-optimal for specific classes of problems. In contrast to existing hand-crafted solutions, we propose an approach to learn a fast iterative solver tailored to a specific domain. We achieve this goal by learning to modify the updates of an existing solver using a deep neural network. Crucially, our approach is proven to preserve strong correctness and convergence guarantees. After training on a single geometry, our model generalizes to a wide variety of geometries and boundary conditions, and achieves a 2-3x speedup compared to state-of-the-art solvers. |
Tasks | |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.01200v1 |
PDF | https://arxiv.org/pdf/1906.01200v1.pdf |
PWC | https://paperswithcode.com/paper/learning-neural-pde-solvers-with-convergence-1 |
Repo | |
Framework | |
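The key structural property in the abstract, that the learned modification must preserve the base solver's fixed point, can be seen in a tiny 1D Poisson example. The sketch below is our illustration, not the paper's network: the "learned" correction is a hand-set multiple of how much a further plain Jacobi step would still move the iterate, so it vanishes exactly at the solution and cannot break correctness, only change the speed of convergence.

```python
import numpy as np

n = 33                       # grid points on [0, 1], zero Dirichlet boundaries
h = 1.0 / (n - 1)
x = np.linspace(0, 1, n)
f = np.pi**2 * np.sin(np.pi * x)   # right-hand side of -u'' = f

def jacobi_step(u):
    # Standard Jacobi update for the discretized 1D Poisson problem.
    u_new = u.copy()
    u_new[1:-1] = 0.5 * (u[:-2] + u[2:] + h**2 * f[1:-1])
    return u_new

def learned_step(u, w=1.0):
    # Solver with a correction driven by the remaining update of a plain
    # step; w stands in for the trained network. Since the correction is
    # zero at the fixed point, the discrete solution is preserved.
    v = jacobi_step(u)
    return v + w * (jacobi_step(v) - v)

# Reference: solve the discretized tridiagonal system directly.
A = (np.diag(2.0 * np.ones(n - 2)) - np.diag(np.ones(n - 3), 1)
     - np.diag(np.ones(n - 3), -1)) / h**2
u_star = np.zeros(n)
u_star[1:-1] = np.linalg.solve(A, f[1:-1])

u_plain, u_learn = np.zeros(n), np.zeros(n)
for _ in range(500):
    u_plain = jacobi_step(u_plain)
    u_learn = learned_step(u_learn)

err_plain = np.abs(u_plain - u_star).max()
err_learn = np.abs(u_learn - u_star).max()
print(err_plain, err_learn)   # modified solver is closer after equal iterations
```

In the paper the correction is a deep network trained to accelerate convergence; the point of the sketch is only the residual-driven structure that makes the guarantee possible.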
Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text
Title | Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text |
Authors | Murali Karthick Baskar, Shinji Watanabe, Ramon Astudillo, Takaaki Hori, Lukáš Burget, Jan Černocký |
Abstract | Sequence-to-sequence automatic speech recognition (ASR) models require large quantities of data to attain high performance. For this reason, there has been a recent surge of interest in unsupervised and semi-supervised training of such models. This work builds upon recent results showing notable improvements in semi-supervised training using cycle-consistency and related techniques. Such techniques derive training procedures and losses able to leverage unpaired speech and/or text data by combining ASR with Text-to-Speech (TTS) models. In particular, this work proposes a new semi-supervised loss combining an end-to-end differentiable ASR$\rightarrow$TTS loss with a TTS$\rightarrow$ASR loss. The method is able to leverage both unpaired speech and text data to outperform recently proposed related techniques in terms of word error rate (WER). We provide extensive results analyzing the impact of data quantity and of the speech and text modalities, and show consistent gains across the WSJ and Librispeech corpora. Our code is provided in ESPnet to reproduce the experiments. |
Tasks | Semi-Supervised Image Classification, Speech Recognition |
Published | 2019-04-30 |
URL | https://arxiv.org/abs/1905.01152v2 |
PDF | https://arxiv.org/pdf/1905.01152v2.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-sequence-to-sequence-asr |
Repo | |
Framework | |
Decision Automation for Electric Power Network Recovery
Title | Decision Automation for Electric Power Network Recovery |
Authors | Yugandhar Sarkale, Saeed Nozhati, Edwin K. P. Chong, Bruce R. Ellingwood |
Abstract | Critical infrastructure systems such as electric power networks, water networks, and transportation systems play a major role in the welfare of any community. In the aftermath of disasters, their recovery is of paramount importance; orderly and efficient recovery involves the assignment of limited resources (a combination of human repair workers and machines) to repair damaged infrastructure components. The decision maker must also deal with uncertainty in the outcome of the resource-allocation actions during recovery. The manual assignment of resources is seldom optimal despite the expertise of the decision maker, because of the large number of choices and the uncertainty in the consequences of sequential decisions. This combinatorial assignment problem under uncertainty is known to be NP-hard. We propose a novel decision technique that addresses the massive number of decision choices in large-scale real-world problems; in addition, our method features an experiential learning component that adaptively determines the utilization of computational resources based on the performance of a small number of choices. Our framework is closed-loop and naturally incorporates all the attractive features of such a decision-making system. In contrast to myopic approaches, which do not account for the future effects of current choices, our methodology has an anticipatory learning component that effectively incorporates lookahead into the solutions. To this end, we leverage the theory of regression analysis, Markov decision processes (MDPs), multi-armed bandits, and stochastic models of community damage from natural disasters to develop a method for near-optimal recovery of communities. Our method contributes to the general problem of MDPs with massive action spaces, with application to the recovery of communities affected by hazards. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00699v3 |
PDF | https://arxiv.org/pdf/1910.00699v3.pdf |
PWC | https://paperswithcode.com/paper/decision-automation-for-electric-power |
Repo | |
Framework | |
Weakly-supervised Knowledge Graph Alignment with Adversarial Learning
Title | Weakly-supervised Knowledge Graph Alignment with Adversarial Learning |
Authors | Meng Qu, Jian Tang, Yoshua Bengio |
Abstract | This paper studies aligning knowledge graphs from different sources or languages. Most existing methods train supervised models for the alignment, which usually require a large number of aligned knowledge triplets. However, such a large number of aligned knowledge triplets may not be available, or may be expensive to obtain, in many domains. Therefore, in this paper we propose to study aligning knowledge graphs in a fully unsupervised or weakly supervised fashion, i.e., without or with only a few aligned triplets. We propose an unsupervised framework that aligns the entity and relation embeddings of different knowledge graphs with an adversarial learning framework. Moreover, a regularization term that maximizes the mutual information between the embeddings of different knowledge graphs is used to mitigate the problem of mode collapse when learning the alignment functions. Such a framework can be further seamlessly integrated with existing supervised methods by utilizing a limited number of aligned triplets as guidance. Experimental results on multiple datasets demonstrate the effectiveness of our proposed approach in both the unsupervised and the weakly supervised settings. |
Tasks | Knowledge Graphs |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03179v1 |
PDF | https://arxiv.org/pdf/1907.03179v1.pdf |
PWC | https://paperswithcode.com/paper/weakly-supervised-knowledge-graph-alignment-1 |
Repo | |
Framework | |
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation
Title | Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation |
Authors | Chia-Hsuan Lee, Yun-Nung Chen, Hung-Yi Lee |
Abstract | Spoken question answering (SQA) is challenging due to the complex reasoning required on top of spoken documents. Recent studies have also shown the catastrophic impact of automatic speech recognition (ASR) errors on SQA. Therefore, this work proposes to mitigate ASR errors by aligning the mismatch between ASR hypotheses and their corresponding reference transcriptions. An adversarial model is applied to this domain adaptation task, which forces the model to learn domain-invariant features that the QA model can effectively utilize in order to improve SQA results. The experiments successfully demonstrate the effectiveness of our proposed model, whose results surpass the previous best model by 2% in EM score. |
Tasks | Domain Adaptation, Question Answering, Speech Recognition |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07904v1 |
PDF | http://arxiv.org/pdf/1904.07904v1.pdf |
PWC | https://paperswithcode.com/paper/mitigating-the-impact-of-speech-recognition-1 |
Repo | |
Framework | |
Spot Evasion Attacks: Adversarial Examples for License Plate Recognition Systems with Convolutional Neural Networks
Title | Spot Evasion Attacks: Adversarial Examples for License Plate Recognition Systems with Convolutional Neural Networks |
Authors | Ya-guan Qian, Dan-feng Ma, Bin Wang, Jun Pan, Jia-min Wang, Jian-hai Chen, Wu-jie Zhou, Jing-sheng Lei |
Abstract | Recent studies have shown that convolutional neural networks (CNNs) for image recognition are vulnerable to evasion attacks with carefully manipulated adversarial examples. Previous work primarily focused on how to generate adversarial examples close to source images by introducing pixel-level perturbations into the whole image or specific parts of it. In this paper, we propose an evasion attack on CNN classifiers in the context of License Plate Recognition (LPR), which adds predetermined perturbations to specific regions of license plate images, simulating naturally formed spots (such as sludge). The problem is therefore modeled as an optimization process searching for optimal perturbation positions, which differs from previous work that treats pixel values as decision variables. Note that this is a complex nonlinear optimization problem, and we use a genetic-algorithm-based approach to obtain optimal perturbation positions. In experiments, we use the proposed algorithm to generate various adversarial examples in the form of rectangles, circles, ellipses, and spot clusters. Experimental results show that these adversarial examples are almost unnoticeable to human eyes, yet they fool HyperLPR with an attack success rate of over 93%. We therefore believe that this kind of spot evasion attack poses a great threat to current LPR systems and needs to be investigated further by the security community. |
Tasks | License Plate Recognition |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1911.00927v2 |
PDF | https://arxiv.org/pdf/1911.00927v2.pdf |
PWC | https://paperswithcode.com/paper/spot-evasion-attacks-adversarial-examples-for |
Repo | |
Framework | |
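The search procedure in the abstract, a genetic algorithm over perturbation positions rather than pixel values, can be sketched generically. The target LPR network is not available here, so the fitness below is an invented stand-in that rewards spots near the plate centre; in the real attack it would be the recognition model's confidence drop when spots are painted at those positions. Grid size, spot count, and GA hyperparameters are all illustrative.

```python
import random

random.seed(0)
GRID, N_SPOTS = 16, 3   # plate modeled as a GRID x GRID cell grid

def attack_score(spots):
    # Stand-in fitness (our assumption): higher near the plate centre,
    # where the characters sit. Replace with a query to the target CNN
    # in a real attack.
    return sum(2 * GRID - abs(r - GRID // 2) - abs(c - GRID // 2)
               for r, c in spots)

def random_spots():
    return [(random.randrange(GRID), random.randrange(GRID))
            for _ in range(N_SPOTS)]

def mutate(spots):
    out = list(spots)
    out[random.randrange(N_SPOTS)] = (random.randrange(GRID),
                                      random.randrange(GRID))
    return out

def crossover(a, b):
    cut = random.randrange(1, N_SPOTS)
    return a[:cut] + b[cut:]

pop = [random_spots() for _ in range(30)]
init_best = max(attack_score(p) for p in pop)
for _ in range(60):
    pop.sort(key=attack_score, reverse=True)
    elite = pop[:10]          # elitism: the best individuals always survive
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(20)]

best = max(pop, key=attack_score)
print(attack_score(best), best)
```

Because only spot positions are evolved, the perturbation stays constrained to plausible-looking blemishes, which is what makes the attack hard to notice compared with unconstrained pixel perturbations.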