February 1, 2020

Paper Group AWR 268

Machine Learning Approach to Earthquake Rupture Dynamics

Title Machine Learning Approach to Earthquake Rupture Dynamics
Authors Sabber Ahamed, Eric G. Daub
Abstract Simulating dynamic rupture propagation is challenging due to the uncertainties involved in the underlying physics of fault slip, stress conditions, and frictional properties of the fault. A trial-and-error approach is often used to determine the unknown parameters describing rupture, but running many simulations usually requires human review to determine how to adjust parameter values and is thus not very efficient. To reduce the computational cost and improve our ability to determine reasonable stress and friction parameters, we take advantage of machine learning. We develop two models for earthquake rupture propagation, using the artificial neural network (ANN) and random forest (RF) algorithms, to predict whether a rupture can break a geometric heterogeneity on a fault. We train the models using a database of 1600 dynamic rupture simulations computed numerically. Fault geometry, stress conditions, and friction parameters vary in each simulation. We cross-validate and test the predictive power of the models using an additional 400 simulated ruptures. Both the RF and ANN models predict rupture propagation with more than 81% accuracy, and model parameters can be used to infer the underlying factors most important for rupture propagation. Both models are computationally efficient: evaluating the 400 test cases requires only a fraction of a second, opening up applications of dynamic rupture modeling that have previously not been possible due to the computational demands of physics-based rupture simulations.
Tasks
Published 2019-06-14
URL https://arxiv.org/abs/1906.06250v1
PDF https://arxiv.org/pdf/1906.06250v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-approach-to-earthquake
Repo https://github.com/msahamed/machine_learning_earthquake_rupture
Framework none
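
As a rough illustration of the pipeline described above, the sketch below trains a random forest to classify whether a rupture breaks a fault heterogeneity. The feature matrix here is random stand-in data; the real features (stress conditions, friction parameters, fault geometry) would come from the paper's rupture simulations.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for the 1600 training simulations: one row per rupture, with columns
# playing the role of stress, friction, and geometry parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(1600, 8))
y = rng.integers(0, 2, size=1600)   # 1 = rupture breaks the heterogeneity

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

clf.fit(X, y)
# Feature importances hint at which physical factors control propagation.
print(clf.feature_importances_)
```

Because inference is a single forward pass through the trained trees, scoring 400 held-out ruptures takes a fraction of a second, which is the speed-up the abstract emphasizes.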

PILOT: Physics-Informed Learned Optimized Trajectories for Accelerated MRI

Title PILOT: Physics-Informed Learned Optimized Trajectories for Accelerated MRI
Authors Tomer Weiss, Ortal Senouf, Sanketh Vedula, Oleg Michailovich, Michael Zibulevsky, Alex Bronstein
Abstract Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, not to mention their adverse contribution to the relatively high cost of MRI examinations. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. Meanwhile, recent works in optical computational imaging have demonstrated growing success in the concurrent learning-based design of data acquisition and image reconstruction schemes. In this work, we propose a novel approach to learning optimal schemes for the conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of the resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and back-propagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.
Tasks Image Reconstruction, Semantic Segmentation
Published 2019-09-12
URL https://arxiv.org/abs/1909.05773v3
PDF https://arxiv.org/pdf/1909.05773v3.pdf
PWC https://paperswithcode.com/paper/pilot-physics-informed-learned-optimal
Repo https://github.com/tomer196/PILOT
Framework pytorch
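
To make the hardware constraint concrete, here is a minimal sketch of treating the k-space trajectory as a learnable tensor and penalizing violations of gradient and slew-rate limits. The limits, time step, and trajectory length below are illustrative assumptions, not the paper's settings.

```python
import torch

# Learnable k-space trajectory: 512 samples in 2D k-space.
traj = torch.nn.Parameter(0.1 * torch.randn(512, 2))

def hardware_penalty(k, dt=1e-5, g_max=40e-3, s_max=200.0):
    g = torch.diff(k, dim=0) / dt   # gradient waveform ~ first derivative
    s = torch.diff(g, dim=0) / dt   # slew rate ~ second derivative
    return torch.relu(g.abs() - g_max).sum() + torch.relu(s.abs() - s_max).sum()

# During training this penalty would be added to the reconstruction loss, so
# the trajectory and the reconstruction network are optimized end to end.
loss = hardware_penalty(traj)   # + reconstruction_loss(...)
loss.backward()
```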

A Lightweight Optical Flow CNN – Revisiting Data Fidelity and Regularization

Title A Lightweight Optical Flow CNN – Revisiting Data Fidelity and Regularization
Authors Tak-Wai Hui, Xiaoou Tang, Chen Change Loy
Abstract For over four decades, the majority of work on optical flow estimation has used variational methods. With the advance of machine learning, some recent works have attempted to address the problem using convolutional neural networks (CNNs) and have shown promising results. FlowNet2, the state-of-the-art CNN, requires over 160M parameters to achieve accurate flow estimation. Our LiteFlowNet2 outperforms FlowNet2 on the Sintel and KITTI benchmarks, while being 25.3 times smaller in model size and 3.1 times faster in running speed. LiteFlowNet2 is built on the foundation laid by conventional methods, and its components mirror the roles of data fidelity and regularization in variational methods. We compute optical flow in a spatial-pyramid formulation like SPyNet, but through a novel lightweight cascaded flow inference. It provides high flow estimation accuracy through early correction with seamless incorporation of descriptor matching. Flow regularization is used to ameliorate the issue of outliers and vague flow boundaries through feature-driven local convolutions. Our network also has an effective structure for pyramidal feature extraction and embraces feature warping rather than image warping as practiced in FlowNet2 and SPyNet. Compared to LiteFlowNet, LiteFlowNet2 improves the optical flow accuracy on Sintel Clean by 23.3%, Sintel Final by 12.8%, KITTI 2012 by 19.6%, and KITTI 2015 by 18.8%, while being 2.2 times faster. Our network protocol and trained models are made publicly available on https://github.com/twhui/LiteFlowNet2.
Tasks Optical Flow Estimation
Published 2019-03-15
URL https://arxiv.org/abs/1903.07414v3
PDF https://arxiv.org/pdf/1903.07414v3.pdf
PWC https://paperswithcode.com/paper/a-lightweight-optical-flow-cnn-revisiting
Repo https://github.com/twhui/LiteFlowNet2
Framework none
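
Feature warping, which the abstract contrasts with the image warping used in FlowNet2 and SPyNet, can be sketched in a few lines with PyTorch's grid_sample. This is a generic implementation of the operation, not LiteFlowNet2's actual code.

```python
import torch
import torch.nn.functional as F

def warp_features(feat, flow):
    """Warp a feature map (B, C, H, W) by an optical flow field (B, 2, H, W)."""
    b, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(feat.device)   # (2, H, W)
    pos = grid.unsqueeze(0) + flow                                # displaced positions
    # Normalize to [-1, 1] as expected by grid_sample.
    gx = 2.0 * pos[:, 0] / (w - 1) - 1.0
    gy = 2.0 * pos[:, 1] / (h - 1) - 1.0
    return F.grid_sample(feat, torch.stack((gx, gy), dim=-1), align_corners=True)

feat = torch.randn(1, 64, 32, 32)
flow = torch.zeros(1, 2, 32, 32)   # zero flow -> identity warp
assert torch.allclose(warp_features(feat, flow), feat, atol=1e-5)
```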

Multi-Turn Beam Search for Neural Dialogue Modeling

Title Multi-Turn Beam Search for Neural Dialogue Modeling
Authors Ilia Kulikov, Jason Lee, Kyunghyun Cho
Abstract In neural dialogue modeling, a neural network is trained to predict the next utterance, and at inference time, an approximate decoding algorithm is used to generate next utterances given previous ones. While this autoregressive framework allows us to model the whole conversation during training, inference is highly suboptimal, as a wrong utterance can affect future utterances. While beam search yields better results than greedy search does, we argue that it is still greedy in the context of the entire conversation, in that it does not consider future utterances. We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns. Given a set of candidates for the next utterance, we unroll the conversation for a number of turns and identify the candidate utterance in the initial hypothesis set that gives rise to the most likely sequence of future utterances. We empirically validate our approach by conducting human evaluation using the Persona-Chat dataset, and find that our multi-turn beam search generates significantly better dialogue responses. We propose three approximations to the partner model, and observe that more informed partner models give better performance.
Tasks
Published 2019-06-01
URL https://arxiv.org/abs/1906.00141v2
PDF https://arxiv.org/pdf/1906.00141v2.pdf
PWC https://paperswithcode.com/paper/190600141
Repo https://github.com/nyu-dl/dl4dial-mt-beam
Framework none
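
The unrolling procedure can be summarized in a short sketch. The `model.beam_search(context, k)` interface below, returning k (utterance, score) pairs, is a hypothetical stand-in for whatever decoding API the implementation exposes.

```python
def multi_turn_beam_search(model, partner, context, k=5, turns=2):
    """Pick the candidate whose unrolled future conversation scores highest."""
    best, best_score = None, float("-inf")
    for utterance, score in model.beam_search(context, k):   # initial hypothesis set
        ctx, total = context + [utterance], score
        for _ in range(turns):                               # unroll future turns
            reply, s = partner.beam_search(ctx, 1)[0]        # partner's likely reply
            ctx, total = ctx + [reply], total + s
            resp, s = model.beam_search(ctx, 1)[0]           # our likely follow-up
            ctx, total = ctx + [resp], total + s
        if total > best_score:
            best, best_score = utterance, total
    return best
```

The three partner-model approximations mentioned in the abstract would correspond to different choices of `partner` in this interface.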

Guided learning for weakly-labeled semi-supervised sound event detection

Title Guided learning for weakly-labeled semi-supervised sound event detection
Authors Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian
Abstract We propose a simple but efficient method termed Guided Learning for weakly-labeled semi-supervised sound event detection (SED). There are two sub-targets implied in weakly-labeled SED: audio tagging and boundary detection. Instead of designing a single model by trading off between the two sub-targets, we design a teacher model aimed at audio tagging to guide a student model aimed at boundary detection in learning from the unlabeled data. The guidance is guaranteed by the audio tagging performance gap between the two models. Meanwhile, the student model, freed from this trade-off, is able to provide better boundary detection results. We propose a principle for designing these two models based on the relation between the temporal compression scale and the two sub-targets. We also propose an end-to-end semi-supervised learning process for the two models, in which their abilities improve in alternation. Experiments on the DCASE2018 Task4 dataset show that our approach achieves competitive performance.
Tasks Audio Tagging, Boundary Detection, Sound Event Detection
Published 2019-06-06
URL https://arxiv.org/abs/1906.02517v5
PDF https://arxiv.org/pdf/1906.02517v5.pdf
PWC https://paperswithcode.com/paper/what-you-need-is-a-more-professional-teacher
Repo https://github.com/Kikyo-16/Sound_event_detection
Framework tf
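
One training step of the teacher-student scheme might look like the following sketch, assuming `teacher` and `student` both map a batch of audio features to clip-level tag probabilities. The thresholding of teacher outputs into pseudo labels is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def guided_step(teacher, student, x_weak, y_weak, x_unlab):
    """Teacher (audio tagging) guides student (boundary detection) on unlabeled clips."""
    loss_teacher = F.binary_cross_entropy(teacher(x_weak), y_weak)
    loss_student = F.binary_cross_entropy(student(x_weak), y_weak)
    with torch.no_grad():
        pseudo = (teacher(x_unlab) > 0.5).float()   # teacher-made clip-level labels
    loss_student = loss_student + F.binary_cross_entropy(student(x_unlab), pseudo)
    return loss_teacher, loss_student
```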

Mutual Information Maximization in Graph Neural Networks

Title Mutual Information Maximization in Graph Neural Networks
Authors Xinhan Di, Pengqian Yu, Rui Bu, Mingchao Sun
Abstract A variety of graph neural network (GNN) frameworks for representation learning on graphs have been developed recently. These frameworks rely on an aggregation and iteration scheme to learn the representation of nodes. However, information between nodes is inevitably lost in this scheme during learning. In order to reduce the loss, we extend the GNN frameworks by examining the aggregation and iteration scheme through the lens of mutual information. We propose a new approach that enlarges the normal neighborhood used in GNN aggregation, aiming to maximize mutual information. Based on a series of experiments conducted on several benchmark datasets, we show that the proposed approach improves state-of-the-art performance on four types of graph tasks, including supervised and semi-supervised graph classification, graph link prediction, and graph edge generation and classification.
Tasks Graph Classification, Link Prediction, Representation Learning
Published 2019-05-21
URL https://arxiv.org/abs/1905.08509v4
PDF https://arxiv.org/pdf/1905.08509v4.pdf
PWC https://paperswithcode.com/paper/neighborhood-enlargement-in-graph-neural
Repo https://github.com/CODE-SUBMIT/Graph_Neighborhood_1
Framework pytorch
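
One plausible reading of "enlarging the normal neighborhood" is to aggregate over a multi-hop neighborhood rather than the usual 1-hop adjacency; the sketch below does this with A + A^2 on a toy dense graph. This is an illustration of the idea, not the paper's exact operator.

```python
import torch

def enlarged_aggregate(adj, h, weight):
    """Aggregate node features over the (row-normalized) 2-hop neighborhood."""
    a2 = (adj + adj @ adj).clamp(max=1.0)   # enlarged neighborhood
    a2 = a2 / a2.sum(dim=1, keepdim=True)   # row-normalize
    return torch.relu(a2 @ h @ weight)

adj = torch.eye(5) + torch.diag(torch.ones(4), 1)   # toy chain graph with self-loops
h = torch.randn(5, 16)                              # node features
print(enlarged_aggregate(adj, h, torch.randn(16, 8)).shape)   # torch.Size([5, 8])
```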

ADMM for Efficient Deep Learning with Global Convergence

Title ADMM for Efficient Deep Learning with Global Convergence
Authors Junxiang Wang, Fuxun Yu, Xiang Chen, Liang Zhao
Abstract The Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered to be a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer. However, as an emerging domain, several challenges remain, including 1) the lack of global convergence guarantees, 2) slow convergence towards solutions, and 3) cubic time complexity with regard to feature dimensions. In this paper, we propose a novel optimization framework for deep learning via ADMM (dlADMM) to address these challenges simultaneously. The parameters in each layer are updated backward and then forward so that parameter information in each layer is exchanged efficiently. The time complexity is reduced from cubic to quadratic in the (latent) feature dimensions via a dedicated algorithm design that solves the subproblems using iterative quadratic approximations and backtracking. Finally, we provide the first proof of global convergence for an ADMM-based method (dlADMM) on a deep neural network problem under mild conditions. Experiments on benchmark datasets demonstrate that our proposed dlADMM algorithm outperforms most of the comparison methods.
Tasks Stochastic Optimization
Published 2019-05-31
URL https://arxiv.org/abs/1905.13611v2
PDF https://arxiv.org/pdf/1905.13611v2.pdf
PWC https://paperswithcode.com/paper/admm-for-efficient-deep-learning-with-global
Repo https://github.com/xianggebenben/dlADMM
Framework tf
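
The backward-then-forward update order is the part that is easy to show in isolation. In the skeleton below, `update_layer` is a hypothetical stand-in for the paper's quadratic-approximation-with-backtracking subproblem solver.

```python
def dladmm_iteration(layers, update_layer):
    """One dlADMM sweep: update layer parameters backward, then forward."""
    for l in reversed(range(len(layers))):   # backward sweep (layer L .. 1)
        layers[l] = update_layer(layers, l)
    for l in range(len(layers)):             # forward sweep (layer 1 .. L)
        layers[l] = update_layer(layers, l)
    return layers
```

Sweeping in both directions is what lets parameter information propagate through all layers within a single iteration, rather than only in one direction as in backpropagation.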

Deep Probabilistic Modeling of Glioma Growth

Title Deep Probabilistic Modeling of Glioma Growth
Authors Jens Petersen, Paul F. Jäger, Fabian Isensee, Simon A. A. Kohl, Ulf Neuberger, Wolfgang Wick, Jürgen Debus, Sabine Heiland, Martin Bendszus, Philipp Kickingereder, Klaus H. Maier-Hein
Abstract Existing approaches to modeling the dynamics of brain tumor growth, specifically glioma, employ biologically inspired models of cell diffusion, using image data to estimate the associated parameters. In this work, we propose an alternative approach based on recent advances in probabilistic segmentation and representation learning that implicitly learns growth dynamics directly from data without an underlying explicit model. We present evidence that our approach is able to learn a distribution of plausible future tumor appearances conditioned on past observations of the same tumor.
Tasks Representation Learning
Published 2019-07-09
URL https://arxiv.org/abs/1907.04064v1
PDF https://arxiv.org/pdf/1907.04064v1.pdf
PWC https://paperswithcode.com/paper/deep-probabilistic-modeling-of-glioma-growth
Repo https://github.com/jenspetersen/probabilistic-unet
Framework pytorch

DeepXDE: A deep learning library for solving differential equations

Title DeepXDE: A deep learning library for solving differential equations
Authors Lu Lu, Xuhui Meng, Zhiping Mao, George E. Karniadakis
Abstract Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. The PINN algorithm is simple, and it can be applied to different types of PDEs, including integro-differential equations, fractional PDEs, and stochastic PDEs. Moreover, from the implementation point of view, PINNs solve inverse problems as easily as forward problems. We propose a new residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. For pedagogical reasons, we compare the PINN algorithm to a standard finite element method. We also present a Python library for PINNs, DeepXDE, which is designed to serve both as an educational tool for the classroom and as a research tool for solving problems in computational science and engineering. Specifically, DeepXDE can solve forward problems given initial and boundary conditions, as well as inverse problems given some extra measurements. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry, and enables user code to be compact, closely resembling the mathematical formulation. We introduce the usage of DeepXDE and its customizability, and we demonstrate the capability of PINNs and the user-friendliness of DeepXDE on five different examples. More broadly, DeepXDE contributes to the rapid development of the emerging field of Scientific Machine Learning.
Tasks
Published 2019-07-10
URL https://arxiv.org/abs/1907.04502v2
PDF https://arxiv.org/pdf/1907.04502v2.pdf
PWC https://paperswithcode.com/paper/deepxde-a-deep-learning-library-for-solving
Repo https://github.com/lululxvi/deepxde
Framework tf
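
A minimal usage example in the spirit of the library follows, solving u''(x) = 2 on [-1, 1] with zero Dirichlet boundaries (exact solution x^2 - 1). The API names follow early releases of DeepXDE (e.g., dde.maps.FNN, top-level dde.DirichletBC) and may differ in current versions.

```python
import deepxde as dde

def pde(x, y):
    # Residual of u'' - 2 = 0.
    return dde.grad.hessian(y, x) - 2

geom = dde.geometry.Interval(-1, 1)
bc = dde.DirichletBC(geom, lambda x: 0, lambda x, on_boundary: on_boundary)
data = dde.data.PDE(geom, pde, bc, num_domain=16, num_boundary=2,
                    solution=lambda x: x**2 - 1, num_test=100)

net = dde.maps.FNN([1, 50, 50, 1], "tanh", "Glorot uniform")
model = dde.Model(data, net)
model.compile("adam", lr=1e-3, metrics=["l2 relative error"])
model.train(epochs=10000)
```

Note how the user code mirrors the mathematical statement of the problem: geometry, residual, boundary conditions, and network are each declared in one or two lines.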

Learning Activation Functions: A new paradigm for understanding Neural Networks

Title Learning Activation Functions: A new paradigm for understanding Neural Networks
Authors Mohit Goyal, Rajan Goyal, Brejesh Lall
Abstract The scope of research in the domain of activation functions remains limited and centered around improving the ease of optimization or the generalization quality of neural networks (NNs). However, to develop a deeper understanding of deep learning, it becomes important to look at the nonlinear component of NNs more carefully. In this paper, we aim to provide a generic form of activation function, along with appropriate mathematical grounding, so as to allow for insights into the working of NNs. We propose "Self-Learnable Activation Functions" (SLAF), which are learned during training and are capable of approximating most existing activation functions. SLAF is given as a weighted sum of pre-defined basis elements which can provide a good approximation of the optimal activation function. The coefficients of these basis elements allow a search over the space of continuous functions (which includes the conventional activations). We propose various training routines which can be used to achieve strong performance with SLAF-equipped neural networks (SLNNs). We prove that SLNNs can approximate any neural network with Lipschitz-continuous activations to arbitrary accuracy, highlighting their capacity and possible equivalence with standard NNs. Moreover, SLNNs can be completely represented as a collection of finite-degree polynomials up to the very last layer, obviating several hyperparameters such as width and depth. Since the optimization of SLNNs remains a challenge, we show that using SLAF alongside standard activations (such as ReLU) can provide performance improvements with only a small increase in the number of parameters.
Tasks
Published 2019-06-23
URL https://arxiv.org/abs/1906.09529v2
PDF https://arxiv.org/pdf/1906.09529v2.pdf
PWC https://paperswithcode.com/paper/learning-activation-functions-a-new-paradigm
Repo https://github.com/mohit1997/SLAF
Framework tf
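
A minimal PyTorch sketch of the idea, assuming a plain polynomial basis (1, x, x^2, ...); the paper's exact basis and training routines may differ.

```python
import torch

class SLAF(torch.nn.Module):
    """Self-learnable activation: a weighted sum of polynomial basis elements."""
    def __init__(self, degree=4):
        super().__init__()
        self.coeffs = torch.nn.Parameter(torch.zeros(degree + 1))
        self.coeffs.data[1] = 1.0   # start close to the identity function

    def forward(self, x):
        # sum_i c_i * x^i, applied elementwise like any activation.
        return sum(c * x**i for i, c in enumerate(self.coeffs))

x = torch.randn(8, 16)
print(SLAF()(x).shape)   # torch.Size([8, 16])
```

Since the coefficients are ordinary parameters, they are trained jointly with the network weights by the usual optimizer.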

Ultra Fast Medoid Identification via Correlated Sequential Halving

Title Ultra Fast Medoid Identification via Correlated Sequential Halving
Authors Tavor Z. Baharav, David N. Tse
Abstract The medoid of a set of n points is the point in the set that minimizes the sum of distances to other points. It can be determined exactly in O(n^2) time by computing the distances between all pairs of points. Previous works show that one can significantly reduce the number of distance computations needed by adaptively querying distances. The resulting randomized algorithm is obtained by a direct conversion of the computation problem to a multi-armed bandit statistical inference problem. In this work, we show that we can better exploit the structure of the underlying computation problem by modifying the traditional bandit sampling strategy and using it in conjunction with a suitably chosen multi-armed bandit algorithm. Four to five orders of magnitude gains over exact computation are obtained on real data, in terms of both number of distance computations needed and wall clock time. Theoretical results are obtained to quantify such gains in terms of data parameters. Our code is publicly available online at https://github.com/TavorB/Correlated-Sequential-Halving.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04356v2
PDF https://arxiv.org/pdf/1906.04356v2.pdf
PWC https://paperswithcode.com/paper/ultra-fast-medoid-identification-via
Repo https://github.com/TavorB/Correlated-Sequential-Halving
Framework none
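
The core idea, simplified, is that every candidate's average distance can be estimated from a shared random sample of reference points, and the candidate set can then be repeatedly halved. The sketch below is a simplified rendition under those assumptions, not the paper's algorithm.

```python
import numpy as np

def approx_medoid(points, budget=32, rng=np.random.default_rng(0)):
    """Bandit-style medoid search: shared reference samples + successive halving."""
    candidates = np.arange(len(points))
    est = np.zeros(len(points))   # running sums of sampled distances
    while len(candidates) > 1:
        refs = rng.choice(len(points), size=budget)   # shared ("correlated") sample
        for i in candidates:
            est[i] += np.linalg.norm(points[i] - points[refs], axis=1).sum()
        order = candidates[np.argsort(est[candidates])]
        candidates = order[: max(1, len(candidates) // 2)]   # halve the field
    return candidates[0]

pts = np.random.default_rng(1).normal(size=(200, 5))
print("approximate medoid index:", approx_medoid(pts))
```

Sharing one reference sample across all surviving candidates is what correlates the arms, so that comparisons between candidates are less noisy than with independent sampling.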

An Energy Approach to the Solution of Partial Differential Equations in Computational Mechanics via Machine Learning: Concepts, Implementation and Applications

Title An Energy Approach to the Solution of Partial Differential Equations in Computational Mechanics via Machine Learning: Concepts, Implementation and Applications
Authors Esteban Samaniego, Cosmin Anitescu, Somdatta Goswami, Vien Minh Nguyen-Thanh, Hongwei Guo, Khader Hamdia, Timon Rabczuk, Xiaoying Zhuang
Abstract Partial Differential Equations (PDEs) are fundamental to the mathematical modeling of different phenomena in science and engineering. Solving them is a crucial step towards a precise knowledge of the behaviour of natural and engineered systems. In general, analytical methods are not enough to solve PDEs that represent real systems to an acceptable degree; one has to resort to discretization methods. For engineering problems, probably the best known option is the finite element method (FEM), although powerful alternatives such as mesh-free methods and Isogeometric Analysis (IGA) are also available. The fundamental idea is to approximate the solution of the PDE by means of functions specifically built to have some desirable properties. In this contribution, we explore Deep Neural Networks (DNNs) as an option for approximation. They have shown impressive results in areas such as visual recognition, and are regarded here as function approximation machines. There is great flexibility in defining their structure, and important advances in their architecture and in the efficiency of the algorithms that implement them make DNNs a very interesting alternative for approximating the solution of a PDE. We concentrate on applications of interest to computational mechanics. Most contributions that have explored this possibility have adopted a collocation strategy. Here, we instead focus on mechanical problems and analyze the energetic formulation of the PDE: the energy of a mechanical system seems to be the natural loss function for a machine learning method approaching a mechanical problem. As proofs of concept, we deal with several problems and explore the capabilities of the method for applications in engineering.
Tasks
Published 2019-08-27
URL https://arxiv.org/abs/1908.10407v2
PDF https://arxiv.org/pdf/1908.10407v2.pdf
PWC https://paperswithcode.com/paper/an-energy-approach-to-the-solution-of-partial
Repo https://github.com/ISM-Weimar/DeepEnergyMethods
Framework tf
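
A toy instance of the energy-as-loss idea: a 1D bar on [0, 1], fixed at x = 0 and under unit distributed load, where the network minimizes the potential energy integral of (1/2)u'^2 - u approximated by a sum over collocation points. This is a minimal sketch of the approach, not the paper's implementation.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.linspace(0, 1, 100).reshape(-1, 1).requires_grad_(True)

for _ in range(3000):
    u = x * net(x)   # multiplying by x hard-enforces u(0) = 0
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    energy = (0.5 * du**2 - u).mean()   # Riemann-sum approximation of E[u]
    opt.zero_grad()
    energy.backward()
    opt.step()

# Exact solution is u(x) = x - x^2/2, so u(1) should approach 0.5.
print(net(torch.tensor([[1.0]])).item())
```

Minimizing the energy functional rather than a pointwise PDE residual is exactly what distinguishes this formulation from the collocation strategies the abstract mentions.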

Progressive Augmentation of GANs

Title Progressive Augmentation of GANs
Authors Dan Zhang, Anna Khoreva
Abstract Training of Generative Adversarial Networks (GANs) is notoriously fragile, requiring a careful balance between the generator and the discriminator to be maintained in order to perform well. To mitigate this issue we introduce a new regularization technique: progressive augmentation of GANs (PA-GAN). The key idea is to gradually increase the task difficulty of the discriminator by progressively augmenting its input or feature space, thus enabling continuous learning of the generator. We show that the proposed progressive augmentation preserves the original GAN objective, does not compromise the discriminator's optimality, and encourages a healthy competition between the generator and discriminator, leading to a better-performing generator. We experimentally demonstrate the effectiveness of PA-GAN across different architectures and on multiple benchmarks for the image synthesis task, achieving on average an improvement of about 3 points in FID score.
Tasks Image Generation
Published 2019-01-29
URL https://arxiv.org/abs/1901.10422v3
PDF https://arxiv.org/pdf/1901.10422v3.pdf
PWC https://paperswithcode.com/paper/pa-gan-improving-gan-training-by-progressive
Repo https://github.com/boschresearch/PA-GAN
Framework tf
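
One way to read the augmentation mechanism: at level s, the discriminator input is paired with s random bits, and its real/fake target is flipped by the parity of those bits, so the task grows harder as s increases. The sketch below illustrates that reading and should not be taken as the paper's exact formulation.

```python
import torch

def augment(x, is_real, level):
    """Pair a batch with `level` random bits; target = real/fake label XOR parity."""
    bits = torch.randint(0, 2, (x.shape[0], level), dtype=torch.float32)
    target = (float(is_real) + bits.sum(dim=1)) % 2   # parity-flipped label
    return x, bits, target

x = torch.randn(4, 3, 32, 32)
_, bits, target = augment(x, is_real=True, level=2)
print(bits, target)   # discriminator must use the bits to recover the label
```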

Learning Nearest Neighbor Graphs from Noisy Distance Samples

Title Learning Nearest Neighbor Graphs from Noisy Distance Samples
Authors Blake Mason, Ardhendu Tripathy, Robert Nowak
Abstract We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate the efficiency of our method empirically and theoretically, needing only O(n log(n) Delta^-2) queries in favorable settings, where the Delta^-2 factor accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT Zappos50K dataset, we apply our algorithm to learn which shoes people believe are most similar and show that it beats both an active baseline and ordinal embedding.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.13267v1
PDF https://arxiv.org/pdf/1905.13267v1.pdf
PWC https://paperswithcode.com/paper/learning-nearest-neighbor-graphs-from-noisy
Repo https://github.com/blakemas/nngraph
Framework none
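
A simplified sketch of the adaptive-querying ingredient: repeatedly query the noisy oracle, maintain confidence intervals on each candidate distance, and stop once one candidate is separated from the rest. The confidence radius below is a generic Hoeffding-style choice, assumed for illustration.

```python
import numpy as np

def nearest_neighbor(oracle, i, others, delta=0.05, max_pulls=10000):
    """Find i's nearest neighbor among `others` using a noisy distance oracle."""
    mean = {j: oracle(i, j) for j in others}
    pulls = {j: 1 for j in others}
    for _ in range(max_pulls):
        rad = {j: np.sqrt(np.log(2 * len(others) / delta) / pulls[j]) for j in others}
        best = min(others, key=lambda j: mean[j])
        if all(mean[best] + rad[best] <= mean[j] - rad[j]
               for j in others if j != best):
            return best   # confidently separated from all other candidates
        j = min(others, key=lambda j: mean[j] - rad[j])   # most ambiguous arm
        mean[j] = (mean[j] * pulls[j] + oracle(i, j)) / (pulls[j] + 1)
        pulls[j] += 1
    return min(others, key=lambda j: mean[j])
```

Repeating this subroutine for every item i yields the nearest neighbor graph; the paper additionally exploits the triangle inequality to prune candidates, which this sketch omits.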

Regression Planning Networks

Title Regression Planning Networks
Authors Danfei Xu, Roberto Martín-Martín, De-An Huang, Yuke Zhu, Silvio Savarese, Li Fei-Fei
Abstract Recent learning-to-plan methods have shown promising results on planning directly from observation space. Yet, their ability to plan for long-horizon tasks is limited by the accuracy of the prediction model. On the other hand, classical symbolic planners show remarkable capabilities in solving long-horizon tasks, but they require predefined symbolic rules and symbolic states, restricting their real-world applicability. In this work, we combine the benefits of these two paradigms and propose a learning-to-plan method that can directly generate a long-term symbolic plan conditioned on high-dimensional observations. We borrow the idea of regression (backward) planning from the classical planning literature and introduce Regression Planning Networks (RPN), a neural network architecture that plans backward starting at a task goal and generates a sequence of intermediate goals that reaches the current observation. We show that our model not only inherits many favorable traits from symbolic planning, e.g., the ability to solve previously unseen tasks, but can also learn from visual inputs in an end-to-end manner. We evaluate the capabilities of RPN in a grid world environment and a simulated 3D kitchen environment featuring complex visual scenes and long task horizons, and show that it achieves near-optimal performance in completely new task instances.
Tasks
Published 2019-09-28
URL https://arxiv.org/abs/1909.13072v1
PDF https://arxiv.org/pdf/1909.13072v1.pdf
PWC https://paperswithcode.com/paper/regression-planning-networks
Repo https://github.com/danfeiX/RPN
Framework none