Paper Group AWR 268
Machine Learning Approach to Earthquake Rupture Dynamics
Title | Machine Learning Approach to Earthquake Rupture Dynamics |
Authors | Sabber Ahamed, Eric G. Daub |
Abstract | Simulating dynamic rupture propagation is challenging due to the uncertainties involved in the underlying physics of fault slip, stress conditions, and frictional properties of the fault. A trial-and-error approach is often used to determine the unknown parameters describing rupture, but running many simulations usually requires human review to determine how to adjust parameter values and is thus not very efficient. To reduce the computational cost and improve our ability to determine reasonable stress and friction parameters, we take advantage of machine learning. We develop two models for earthquake rupture propagation, using the artificial neural network (ANN) and random forest (RF) algorithms, to predict whether a rupture can break through a geometric heterogeneity on a fault. We train the models on a database of 1600 numerically computed dynamic rupture simulations, in which fault geometry, stress conditions, and friction parameters vary. We cross-validate and test the predictive power of the models using an additional 400 simulated ruptures. Both the RF and ANN models predict rupture propagation with more than 81% accuracy, and the model parameters can be used to infer the underlying factors most important for rupture propagation. Both models are computationally efficient: the 400 test predictions require only a fraction of a second, opening up potential applications of dynamic rupture modeling that have previously not been possible due to the computational demands of physics-based rupture simulations. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06250v1 |
https://arxiv.org/pdf/1906.06250v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-approach-to-earthquake |
Repo | https://github.com/msahamed/machine_learning_earthquake_rupture |
Framework | none |
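The classification setup in the abstract maps naturally onto off-the-shelf tools. Below is a minimal sketch using scikit-learn, with synthetic stand-ins for the simulation features (the real inputs are fault geometry, stress, and friction parameters from the authors' rupture database):

```python
# Sketch of the rupture-propagation classifiers described above, using
# scikit-learn. The features and labels are synthetic stand-ins, not the
# authors' simulation database.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                  # 2000 simulations, 8 parameters
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)   # 1 = rupture breaks the barrier

# 1600 ruptures for training, 400 held out, mirroring the paper's split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=400, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
ann = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000,
                    random_state=0).fit(X_tr, y_tr)

print("RF accuracy: ", accuracy_score(y_te, rf.predict(X_te)))
print("ANN accuracy:", accuracy_score(y_te, ann.predict(X_te)))
# Feature importances hint at which physical parameters control propagation.
print("RF feature importances:", rf.feature_importances_)
```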
PILOT: Physics-Informed Learned Optimized Trajectories for Accelerated MRI
Title | PILOT: Physics-Informed Learned Optimized Trajectories for Accelerated MRI |
Authors | Tomer Weiss, Ortal Senouf, Sanketh Vedula, Oleg Michailovich, Michael Zibulevsky, Alex Bronstein |
Abstract | Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, not to mention their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success in the concurrent learning-based design of data acquisition and image reconstruction schemes. In this work, we propose a novel approach to learning optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of the resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and back-propagation operators. We also demonstrate the effectiveness of the proposed solution applied to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks. |
Tasks | Image Reconstruction, Semantic Segmentation |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05773v3 |
https://arxiv.org/pdf/1909.05773v3.pdf | |
PWC | https://paperswithcode.com/paper/pilot-physics-informed-learned-optimal |
Repo | https://github.com/tomer196/PILOT |
Framework | pytorch |
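One core ingredient of PILOT is keeping a learnable k-space trajectory hardware-feasible. A minimal PyTorch sketch of that idea treats the trajectory as a trainable tensor and penalizes violations of gradient-amplitude and slew-rate limits via finite differences; the constants and the stand-in reconstruction loss are illustrative, not the paper's values:

```python
# Sketch (PyTorch) of the hardware-feasibility idea in PILOT: the k-space
# trajectory is a trainable tensor, and the gradient and slew-rate limits
# become soft penalties on its finite differences (up to physical constants
# such as the gyromagnetic ratio, omitted here).
import torch

n_samples = 512
traj = torch.nn.Parameter(torch.randn(n_samples, 2) * 0.01)  # (kx, ky) samples

g_max, s_max, dt = 0.04, 200.0, 1e-5      # illustrative hardware limits

def hardware_penalty(k):
    grad = (k[1:] - k[:-1]) / dt          # gradient waveform ~ dk/dt
    slew = (grad[1:] - grad[:-1]) / dt    # slew rate ~ d^2k/dt^2
    over_g = torch.relu(grad.abs() - g_max).pow(2).sum()
    over_s = torch.relu(slew.abs() - s_max).pow(2).sum()
    return over_g + over_s

opt = torch.optim.Adam([traj], lr=1e-3)
for step in range(100):
    opt.zero_grad()
    recon_loss = traj.pow(2).mean()       # stand-in for the reconstruction loss
    loss = recon_loss + 1e-3 * hardware_penalty(traj)
    loss.backward()
    opt.step()
```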
A Lightweight Optical Flow CNN – Revisiting Data Fidelity and Regularization
Title | A Lightweight Optical Flow CNN – Revisiting Data Fidelity and Regularization |
Authors | Tak-Wai Hui, Xiaoou Tang, Chen Change Loy |
Abstract | For over four decades, the majority of work has addressed the problem of optical flow estimation using variational methods. With the advance of machine learning, some recent works have attempted to address the problem using convolutional neural networks (CNNs) and have shown promising results. FlowNet2, the state-of-the-art CNN, requires over 160M parameters to achieve accurate flow estimation. Our LiteFlowNet2 outperforms FlowNet2 on the Sintel and KITTI benchmarks, while being 25.3 times smaller in model size and 3.1 times faster in running speed. LiteFlowNet2 is built on the foundation laid by conventional methods, and its components play the corresponding roles of data fidelity and regularization in variational methods. We compute optical flow in a spatial-pyramid formulation, as in SPyNet, but through a novel lightweight cascaded flow inference, which provides high flow estimation accuracy through early correction with seamless incorporation of descriptor matching. Flow regularization is used to ameliorate the issue of outliers and vague flow boundaries through feature-driven local convolutions. Our network also has an effective structure for pyramidal feature extraction and embraces feature warping rather than image warping as practiced in FlowNet2 and SPyNet. Compared to LiteFlowNet, LiteFlowNet2 improves the optical flow accuracy on Sintel Clean by 23.3%, Sintel Final by 12.8%, KITTI 2012 by 19.6%, and KITTI 2015 by 18.8%, while being 2.2 times faster. Our network protocol and trained models are made publicly available at https://github.com/twhui/LiteFlowNet2. |
Tasks | Optical Flow Estimation |
Published | 2019-03-15 |
URL | https://arxiv.org/abs/1903.07414v3 |
https://arxiv.org/pdf/1903.07414v3.pdf | |
PWC | https://paperswithcode.com/paper/a-lightweight-optical-flow-cnn-revisiting |
Repo | https://github.com/twhui/LiteFlowNet2 |
Framework | none |
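Feature warping, which LiteFlowNet2 adopts instead of the image warping used in FlowNet2 and SPyNet, can be sketched in a few lines of PyTorch: features of the second frame are resampled at positions displaced by the current flow estimate. Shapes here are illustrative:

```python
# Sketch (PyTorch) of feature warping: a feature map from frame 2 is
# bilinearly sampled at positions shifted by the current flow estimate.
import torch
import torch.nn.functional as F

def warp_features(feat2, flow):
    """feat2: (B, C, H, W) features of frame 2; flow: (B, 2, H, W) in pixels."""
    B, _, H, W = feat2.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(feat2.device)  # (2, H, W)
    coords = base.unsqueeze(0) + flow                             # (B, 2, H, W)
    # Normalize to [-1, 1]; grid_sample expects a (B, H, W, 2) grid of (x, y).
    coords_x = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)
    return F.grid_sample(feat2, grid, align_corners=True)

feat2 = torch.randn(1, 64, 32, 48)
flow = torch.zeros(1, 2, 32, 48)   # zero flow -> warped features equal feat2
warped = warp_features(feat2, flow)
```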
Multi-Turn Beam Search for Neural Dialogue Modeling
Title | Multi-Turn Beam Search for Neural Dialogue Modeling |
Authors | Ilia Kulikov, Jason Lee, Kyunghyun Cho |
Abstract | In neural dialogue modeling, a neural network is trained to predict the next utterance, and at inference time, an approximate decoding algorithm is used to generate next utterances given previous ones. While this autoregressive framework allows us to model the whole conversation during training, inference is highly suboptimal, as a wrong utterance can affect future utterances. While beam search yields better results than greedy search does, we argue that it is still greedy in the context of the entire conversation, in that it does not consider future utterances. We propose a novel approach for conversation-level inference by explicitly modeling the dialogue partner and running beam search across multiple conversation turns. Given a set of candidates for the next utterance, we unroll the conversation for a number of turns and identify the candidate utterance in the initial hypothesis set that gives rise to the most likely sequence of future utterances. We empirically validate our approach by conducting human evaluation using the Persona-Chat dataset, and find that our multi-turn beam search generates significantly better dialogue responses. We propose three approximations to the partner model, and observe that more informed partner models give better performance. |
Tasks | |
Published | 2019-06-01 |
URL | https://arxiv.org/abs/1906.00141v2 |
https://arxiv.org/pdf/1906.00141v2.pdf | |
PWC | https://paperswithcode.com/paper/190600141 |
Repo | https://github.com/nyu-dl/dl4dial-mt-beam |
Framework | none |
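A schematic of the multi-turn search in plain Python: each candidate next utterance is scored by unrolling the conversation several turns with a partner model and accumulating the log-probability of the resulting future. The generate/score functions are hypothetical stubs, not the authors' API:

```python
# Schematic of multi-turn beam search. generate_candidates, partner_reply,
# and log_prob are stubs standing in for beam search over the dialogue
# model, the partner model, and the model's scoring function.
import math, random

def generate_candidates(history, k=3):
    return [f"reply-{i}-to:{history[-1]}" for i in range(k)]

def partner_reply(history):
    return "partner:" + history[-1]

def log_prob(utterance, history):
    random.seed(hash((utterance, tuple(history))) % (2**32))
    return math.log(random.uniform(0.1, 1.0))

def multi_turn_score(candidate, history, depth=2):
    """Score a candidate by the likelihood of the unrolled future conversation."""
    h, total = history + [candidate], 0.0
    for _ in range(depth):
        reply = partner_reply(h)             # model the dialogue partner
        total += log_prob(reply, h)
        h.append(reply)
        nxt = generate_candidates(h, k=1)[0] # our model's next utterance
        total += log_prob(nxt, h)
        h.append(nxt)
    return total

history = ["hi there"]
best = max(generate_candidates(history), key=lambda c: multi_turn_score(c, history))
```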
Guided learning for weakly-labeled semi-supervised sound event detection
Title | Guided learning for weakly-labeled semi-supervised sound event detection |
Authors | Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian |
Abstract | We propose a simple but efficient method, termed Guided Learning, for weakly-labeled semi-supervised sound event detection (SED). There are two sub-targets implied in weakly-labeled SED: audio tagging and boundary detection. Instead of designing a single model that trades off between the two sub-targets, we design a teacher model aimed at audio tagging to guide a student model aimed at boundary detection in learning from the unlabeled data. The guidance is guaranteed by the audio tagging performance gap between the two models. Meanwhile, the student model, liberated from the trade-off, is able to produce better boundary detection results. We propose a principle for designing these two models based on the relation between the temporal compression scale and the two sub-targets. We also propose an end-to-end semi-supervised learning process for the two models that enables their abilities to improve alternately. Experiments on the DCASE2018 Task4 dataset show that our approach achieves competitive performance. |
Tasks | Audio Tagging, Boundary Detection, Sound Event Detection |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02517v5 |
https://arxiv.org/pdf/1906.02517v5.pdf | |
PWC | https://paperswithcode.com/paper/what-you-need-is-a-more-professional-teacher |
Repo | https://github.com/Kikyo-16/Sound_event_detection |
Framework | tf |
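The guidance signal can be sketched in PyTorch: on unlabeled clips, the tagging-oriented teacher's clip-level predictions supervise the boundary-oriented student's pooled frame-level predictions. Both networks below are toy stand-ins for the paper's CNN designs:

```python
# Sketch (PyTorch) of the teacher-student guidance on unlabeled audio.
import torch
import torch.nn as nn

n_classes, T = 10, 500
teacher = nn.Sequential(nn.Linear(64, n_classes))   # clip-level tagger (toy)
student = nn.Sequential(nn.Linear(64, n_classes))   # frame-level detector (toy)

feats = torch.randn(8, T, 64)                       # unlabeled batch of clips

with torch.no_grad():
    # The teacher tags the whole clip (heavy temporal pooling favors tagging).
    clip_pseudo = torch.sigmoid(teacher(feats).mean(dim=1))

frame_logits = student(feats)                           # (B, T, n_classes)
clip_student = torch.sigmoid(frame_logits).mean(dim=1)  # pool frames to clip level

# The student keeps frame resolution for boundaries, but its pooled clip
# predictions are pulled toward the teacher's pseudo tags.
guidance_loss = nn.functional.binary_cross_entropy(clip_student, clip_pseudo)
guidance_loss.backward()
```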
Mutual Information Maximization in Graph Neural Networks
Title | Mutual Information Maximization in Graph Neural Networks |
Authors | Xinhan Di, Pengqian Yu, Rui Bu, Mingchao Sun |
Abstract | A variety of graph neural network (GNN) frameworks for representation learning on graphs have been developed recently. These frameworks rely on an aggregation and iteration scheme to learn the representation of nodes. However, information between nodes is inevitably lost in this scheme during learning. In order to reduce the loss, we extend the GNN frameworks by examining the aggregation and iteration scheme through the lens of mutual information. We propose a new approach that enlarges the normal neighborhood in the aggregation of GNNs, aiming at maximizing mutual information. Based on a series of experiments conducted on several benchmark datasets, we show that the proposed approach improves the state-of-the-art performance on four types of graph tasks, including supervised and semi-supervised graph classification, graph link prediction, and graph edge generation and classification. |
Tasks | Graph Classification, Link Prediction, Representation Learning |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08509v4 |
https://arxiv.org/pdf/1905.08509v4.pdf | |
PWC | https://paperswithcode.com/paper/neighborhood-enlargement-in-graph-neural |
Repo | https://github.com/CODE-SUBMIT/Graph_Neighborhood_1 |
Framework | pytorch |
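A toy NumPy sketch of the neighborhood-enlargement idea: standard aggregation mixes 1-hop neighbors, while the enlarged variant also mixes in 2-hop neighbors so that each update carries more information between nodes. The mixing weight alpha is an illustrative choice, not taken from the paper:

```python
# Toy sketch of enlarged-neighborhood aggregation on a 4-node path graph.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)    # path graph on 4 nodes
H = np.eye(4)                                # one-hot node features

A1 = A + np.eye(4)                           # self-loops + 1-hop neighborhood
A2 = ((A @ A) > 0).astype(float)             # reachable within 2 hops
alpha = 0.5                                  # illustrative mixing weight

def aggregate(H, W):
    enlarged = A1 + alpha * A2               # enlarged neighborhood operator
    deg = enlarged.sum(axis=1, keepdims=True)
    return np.tanh((enlarged / deg) @ H @ W) # degree-normalized aggregation

W = np.random.default_rng(0).normal(size=(4, 4))
H_next = aggregate(H, W)                     # each node now sees 2-hop context
```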
ADMM for Efficient Deep Learning with Global Convergence
Title | ADMM for Efficient Deep Learning with Global Convergence |
Authors | Junxiang Wang, Fuxun Yu, Xiang Chen, Liang Zhao |
Abstract | The Alternating Direction Method of Multipliers (ADMM) has been used successfully in many conventional machine learning applications and is considered a useful alternative to Stochastic Gradient Descent (SGD) as a deep learning optimizer. However, as an emerging domain, several challenges remain, including 1) the lack of global convergence guarantees, 2) slow convergence towards solutions, and 3) cubic time complexity with regard to feature dimensions. In this paper, we propose a novel optimization framework for deep learning via ADMM (dlADMM) to address these challenges simultaneously. The parameters in each layer are updated backward and then forward so that the parameter information in each layer is exchanged efficiently. The time complexity is reduced from cubic to quadratic in the (latent) feature dimensions via a dedicated algorithm design for the subproblems that uses iterative quadratic approximations and backtracking. Finally, we provide the first proof of global convergence for an ADMM-based method (dlADMM) in a deep neural network problem under mild conditions. Experiments on benchmark datasets demonstrate that our proposed dlADMM algorithm outperforms most of the comparison methods. |
Tasks | Stochastic Optimization |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13611v2 |
https://arxiv.org/pdf/1905.13611v2.pdf | |
PWC | https://paperswithcode.com/paper/admm-for-efficient-deep-learning-with-global |
Repo | https://github.com/xianggebenben/dlADMM |
Framework | tf |
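The backward-then-forward update ordering can be illustrated on a toy two-layer linear model, with each layer's least-squares subproblem solved exactly. This shows only the sweep order; the full dlADMM scheme additionally carries auxiliary variables, multipliers, quadratic approximations, and backtracking:

```python
# Toy illustration of dlADMM's sweep order on a two-layer linear model.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = rng.normal(size=(100, 2))
W1 = rng.normal(size=(5, 4))
W2 = rng.normal(size=(4, 2))

def solve_W2(W1):  # output-layer subproblem: min ||X W1 W2 - Y||^2 over W2
    return np.linalg.lstsq(X @ W1, Y, rcond=None)[0]

def solve_W1(W2):  # hidden-layer subproblem: min ||X W1 W2 - Y||^2 over W1
    # vec(X W1 W2) = (W2^T kron X) vec(W1) with column-major vec.
    Kmat = np.kron(W2.T, X)
    vecW1 = np.linalg.lstsq(Kmat, Y.reshape(-1, order="F"), rcond=None)[0]
    return vecW1.reshape(W1.shape, order="F")

for sweep in range(5):
    # Backward sweep: last layer first, then the earlier layer.
    W2 = solve_W2(W1)
    W1 = solve_W1(W2)
    # Forward sweep: first layer first, then the later layer. (In full
    # dlADMM, multipliers change between sweeps, so these are not redundant.)
    W1 = solve_W1(W2)
    W2 = solve_W2(W1)
    print(sweep, np.linalg.norm(X @ W1 @ W2 - Y) ** 2)
```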
Deep Probabilistic Modeling of Glioma Growth
Title | Deep Probabilistic Modeling of Glioma Growth |
Authors | Jens Petersen, Paul F. Jäger, Fabian Isensee, Simon A. A. Kohl, Ulf Neuberger, Wolfgang Wick, Jürgen Debus, Sabine Heiland, Martin Bendszus, Philipp Kickingereder, Klaus H. Maier-Hein |
Abstract | Existing approaches to modeling the dynamics of brain tumor growth, specifically glioma, employ biologically inspired models of cell diffusion, using image data to estimate the associated parameters. In this work, we propose an alternative approach based on recent advances in probabilistic segmentation and representation learning that implicitly learns growth dynamics directly from data without an underlying explicit model. We present evidence that our approach is able to learn a distribution of plausible future tumor appearances conditioned on past observations of the same tumor. |
Tasks | Representation Learning |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04064v1 |
https://arxiv.org/pdf/1907.04064v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-probabilistic-modeling-of-glioma-growth |
Repo | https://github.com/jenspetersen/probabilistic-unet |
Framework | pytorch |
DeepXDE: A deep learning library for solving differential equations
Title | DeepXDE: A deep learning library for solving differential equations |
Authors | Lu Lu, Xuhui Meng, Zhiping Mao, George E. Karniadakis |
Abstract | Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. The PINN algorithm is simple, and it can be applied to different types of PDEs, including integro-differential equations, fractional PDEs, and stochastic PDEs. Moreover, from the implementation point of view, PINNs solve inverse problems as easily as forward problems. We propose a new residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. For pedagogical reasons, we compare the PINN algorithm to a standard finite element method. We also present a Python library for PINNs, DeepXDE, which is designed to serve both as an educational tool for the classroom and as a research tool for solving problems in computational science and engineering. Specifically, DeepXDE can solve forward problems given initial and boundary conditions, as well as inverse problems given some extra measurements. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry, and enables the user code to be compact, resembling closely the mathematical formulation. We introduce the usage and customizability of DeepXDE, and we also demonstrate the capability of PINNs and the user-friendliness of DeepXDE on five different examples. More broadly, DeepXDE contributes to the more rapid development of the emerging field of Scientific Machine Learning. |
Tasks | |
Published | 2019-07-10 |
URL | https://arxiv.org/abs/1907.04502v2 |
https://arxiv.org/pdf/1907.04502v2.pdf | |
PWC | https://paperswithcode.com/paper/deepxde-a-deep-learning-library-for-solving |
Repo | https://github.com/lululxvi/deepxde |
Framework | tf |
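A minimal usage sketch in the style of the DeepXDE documentation, solving u''(x) = 2 on [-1, 1] with u(-1) = u(1) = 1 (exact solution u = x^2). Module paths follow recent releases (dde.icbc, dde.nn) and may differ in older versions:

```python
# Minimal DeepXDE-style sketch of a forward PINN solve.
import deepxde as dde

def pde(x, y):
    dy_xx = dde.grad.hessian(y, x)
    return dy_xx - 2.0                      # residual of u'' = 2

geom = dde.geometry.Interval(-1.0, 1.0)
bc = dde.icbc.DirichletBC(geom, lambda x: 1.0,
                          lambda x, on_boundary: on_boundary)

data = dde.data.PDE(geom, pde, bc, num_domain=32, num_boundary=2)
net = dde.nn.FNN([1, 32, 32, 1], "tanh", "Glorot uniform")

model = dde.Model(data, net)
model.compile("adam", lr=1e-3)
model.train(iterations=5000)   # older versions use the epochs= keyword
```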
Learning Activation Functions: A new paradigm for understanding Neural Networks
Title | Learning Activation Functions: A new paradigm for understanding Neural Networks |
Authors | Mohit Goyal, Rajan Goyal, Brejesh Lall |
Abstract | The scope of research in the domain of activation functions remains limited and centered around improving the ease of optimization or the generalization quality of neural networks (NNs). However, to develop a deeper understanding of deep learning, it becomes important to look at the nonlinear component of NNs more carefully. In this paper, we aim to provide a generic form of activation function, along with appropriate mathematical grounding, so as to allow for insights into the working of NNs in the future. We propose "Self-Learnable Activation Functions" (SLAF), which are learned during training and are capable of approximating most of the existing activation functions. A SLAF is given as a weighted sum of pre-defined basis elements which can serve as a good approximation of the optimal activation function. The coefficients of these basis elements allow a search in the entire space of continuous functions (encompassing all the conventional activations). We propose various training routines which can be used to achieve performance with SLAF-equipped neural networks (SLNNs). We prove that SLNNs can approximate any neural network with Lipschitz-continuous activations to any arbitrary error, highlighting their capacity and possible equivalence with standard NNs. Also, SLNNs can be completely represented as a collection of finite-degree polynomials up to the very last layer, obviating several hyperparameters like width and depth. Since the optimization of SLNNs is still a challenge, we show that using SLAF along with standard activations (like ReLU) can provide performance improvements with only a small increase in the number of parameters. |
Tasks | |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09529v2 |
https://arxiv.org/pdf/1906.09529v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-activation-functions-a-new-paradigm |
Repo | https://github.com/mohit1997/SLAF |
Framework | tf |
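A PyTorch sketch of a self-learnable activation: a weighted sum of fixed basis elements, here a polynomial basis as one plausible choice, whose coefficients are trained along with the network weights:

```python
# Sketch (PyTorch) of a self-learnable activation over a polynomial basis.
import torch
import torch.nn as nn

class SLAF(nn.Module):
    def __init__(self, degree=3):
        super().__init__()
        self.degree = degree
        # One coefficient per basis element {1, x, x^2, ..., x^degree}.
        self.coeffs = nn.Parameter(torch.zeros(degree + 1))
        self.coeffs.data[1] = 1.0           # initialize near the identity

    def forward(self, x):
        basis = torch.stack([x ** k for k in range(self.degree + 1)], dim=-1)
        return (basis * self.coeffs).sum(dim=-1)

net = nn.Sequential(nn.Linear(4, 8), SLAF(), nn.Linear(8, 1))
out = net(torch.randn(16, 4))   # coefficients learn jointly with the weights
```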
Ultra Fast Medoid Identification via Correlated Sequential Halving
Title | Ultra Fast Medoid Identification via Correlated Sequential Halving |
Authors | Tavor Z. Baharav, David N. Tse |
Abstract | The medoid of a set of n points is the point in the set that minimizes the sum of distances to other points. It can be determined exactly in O(n^2) time by computing the distances between all pairs of points. Previous works show that one can significantly reduce the number of distance computations needed by adaptively querying distances. The resulting randomized algorithm is obtained by a direct conversion of the computation problem to a multi-armed bandit statistical inference problem. In this work, we show that we can better exploit the structure of the underlying computation problem by modifying the traditional bandit sampling strategy and using it in conjunction with a suitably chosen multi-armed bandit algorithm. Four to five orders of magnitude gains over exact computation are obtained on real data, in terms of both number of distance computations needed and wall clock time. Theoretical results are obtained to quantify such gains in terms of data parameters. Our code is publicly available online at https://github.com/TavorB/Correlated-Sequential-Halving. |
Tasks | |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04356v2 |
https://arxiv.org/pdf/1906.04356v2.pdf | |
PWC | https://paperswithcode.com/paper/ultra-fast-medoid-identification-via |
Repo | https://github.com/NEURIPS-anonymous-2019/Correlated-Sequential-Halving |
Framework | none |
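The flavor of the sampling idea in a short NumPy sketch: estimate every point's average distance from a shared random subset of reference points, keep the most promising candidates, and check those exactly. This is a simplified scheme, not the paper's Correlated Sequential Halving algorithm:

```python
# Sketch: cheap sampled estimates of average distance, exact check of the
# top candidates. Uses ~50n + 20n distance computations instead of n^2.
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(2000, 10))
n = len(pts)

refs = rng.choice(n, size=50, replace=False)   # shared reference sample
est = np.linalg.norm(pts[:, None, :] - pts[refs][None, :, :],
                     axis=2).mean(axis=1)      # estimated avg distance per point

candidates = np.argsort(est)[:20]              # survivors of the cheap estimate
exact = [np.linalg.norm(pts - pts[c], axis=1).sum() for c in candidates]
medoid = candidates[int(np.argmin(exact))]     # exact winner among survivors
```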
An Energy Approach to the Solution of Partial Differential Equations in Computational Mechanics via Machine Learning: Concepts, Implementation and Applications
Title | An Energy Approach to the Solution of Partial Differential Equations in Computational Mechanics via Machine Learning: Concepts, Implementation and Applications |
Authors | Esteban Samaniego, Cosmin Anitescu, Somdatta Goswami, Vien Minh Nguyen-Thanh, Hongwei Guo, Khader Hamdia, Timon Rabczuk, Xiaoying Zhuang |
Abstract | Partial Differential Equations (PDEs) are fundamental to the mathematical modeling of different phenomena in science and engineering. Solving them is a crucial step towards precise knowledge of the behaviour of natural and engineered systems. In general, in order to solve PDEs that represent real systems to an acceptable degree, analytical methods are usually not enough, and one has to resort to discretization methods. For engineering problems, probably the best-known option is the finite element method (FEM), although powerful alternatives such as mesh-free methods and Isogeometric Analysis (IGA) are also available. The fundamental idea is to approximate the solution of the PDE by means of functions specifically built to have some desirable properties. In this contribution, we explore Deep Neural Networks (DNNs) as an option for approximation. They have shown impressive results in areas such as visual recognition, and they are regarded here as function approximation machines. There is great flexibility in defining their structure, and important advances in their architecture and in the efficiency of the algorithms that implement them make DNNs a very interesting alternative for approximating the solution of a PDE. We concentrate on applications of interest for Computational Mechanics. Most contributions exploring this possibility have adopted a collocation strategy. In this contribution, we concentrate on mechanical problems and analyze the energetic format of the PDE: the energy of a mechanical system seems to be the natural loss function for a machine learning method approaching a mechanical problem. As proofs of concept, we deal with several problems and explore the capabilities of the method for applications in engineering. |
Tasks | |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10407v2 |
https://arxiv.org/pdf/1908.10407v2.pdf | |
PWC | https://paperswithcode.com/paper/an-energy-approach-to-the-solution-of-partial |
Repo | https://github.com/ISM-Weimar/DeepEnergyMethods |
Framework | tf |
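A PyTorch sketch of the energy-as-loss idea on a 1D elastic bar: minimize the potential energy Pi(u) = ∫ ((1/2) u'(x)^2 - f u(x)) dx over a neural displacement field with u(0) = 0 built in by construction. Unit stiffness and load f = 1 are illustrative choices:

```python
# Sketch (PyTorch) of the deep energy method on a 1D bar, x in [0, 1].
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

def u(x):
    return x * net(x)                     # hard-enforces u(0) = 0

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x = torch.linspace(0.0, 1.0, 101).reshape(-1, 1)

for step in range(2000):
    opt.zero_grad()
    xr = x.clone().requires_grad_(True)
    ur = u(xr)
    du = torch.autograd.grad(ur.sum(), xr, create_graph=True)[0]  # u'(x)
    # Uniform-grid average approximating the energy integral on [0, 1].
    energy = (0.5 * du.pow(2) - 1.0 * ur).mean()
    energy.backward()
    opt.step()
# Minimizing the energy recovers u(x) ~ x - x^2/2, the exact solution here.
```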
Progressive Augmentation of GANs
Title | Progressive Augmentation of GANs |
Authors | Dan Zhang, Anna Khoreva |
Abstract | Training of Generative Adversarial Networks (GANs) is notoriously fragile, requiring a careful balance between the generator and the discriminator in order to perform well. To mitigate this issue we introduce a new regularization technique - progressive augmentation of GANs (PA-GAN). The key idea is to gradually increase the task difficulty of the discriminator by progressively augmenting its input or feature space, thus enabling continuous learning of the generator. We show that the proposed progressive augmentation preserves the original GAN objective, does not compromise the discriminator’s optimality, and encourages a healthy competition between the generator and discriminator, leading to a better-performing generator. We experimentally demonstrate the effectiveness of PA-GAN across different architectures and on multiple benchmarks for the image synthesis task, on average achieving a ~3-point improvement of the FID score. |
Tasks | Image Generation |
Published | 2019-01-29 |
URL | https://arxiv.org/abs/1901.10422v3 |
https://arxiv.org/pdf/1901.10422v3.pdf | |
PWC | https://paperswithcode.com/paper/pa-gan-improving-gan-training-by-progressive |
Repo | https://github.com/boschresearch/PA-GAN |
Framework | tf |
Learning Nearest Neighbor Graphs from Noisy Distance Samples
Title | Learning Nearest Neighbor Graphs from Noisy Distance Samples |
Authors | Blake Mason, Ardhendu Tripathy, Robert Nowak |
Abstract | We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people’s preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate the efficiency of our method both empirically and theoretically, needing only O(n log(n) Δ^{-2}) queries in favorable settings, where Δ^{-2} accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT Zappos50K dataset, we apply our algorithm to learn which shoes people believe are most similar, and show that it beats both an active baseline and ordinal embedding. |
Tasks | |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13267v1 |
https://arxiv.org/pdf/1905.13267v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-nearest-neighbor-graphs-from-noisy |
Repo | https://github.com/blakemas/nngraph |
Framework | none |
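A compact sketch of the active-sampling idea: given only a noisy distance oracle, find one item's nearest neighbor by repeatedly querying surviving candidates and eliminating those whose confidence intervals are dominated. The Hoeffding-style interval widths and the synthetic oracle are stand-ins:

```python
# Sketch: successive elimination of nearest-neighbor candidates for item 0
# under a noisy distance oracle.
import numpy as np

rng = np.random.default_rng(0)
true_d = np.array([0.0, 0.3, 0.35, 0.8, 0.9])   # true distances from item 0

def oracle(j):                                  # noisy distance query
    return true_d[j] + rng.normal(scale=0.2)

alive = list(range(1, len(true_d)))
sums = np.zeros(len(true_d))
counts = np.zeros(len(true_d))

for t in range(1, 2000):
    for j in alive:                             # query each surviving candidate
        sums[j] += oracle(j)
        counts[j] += 1
    means = sums[alive] / counts[alive]
    width = np.sqrt(2 * np.log(1 + t) / counts[alive])  # Hoeffding-style bound
    best_upper = (means + width).min()
    # Keep only candidates whose lower bound is not dominated.
    alive = [j for j, m, w in zip(alive, means, width) if m - w <= best_upper]
    if len(alive) == 1:
        break

nearest = alive[0]                              # item 1 with high probability
```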
Regression Planning Networks
Title | Regression Planning Networks |
Authors | Danfei Xu, Roberto Martín-Martín, De-An Huang, Yuke Zhu, Silvio Savarese, Li Fei-Fei |
Abstract | Recent learning-to-plan methods have shown promising results on planning directly from observation space. Yet, their ability to plan for long-horizon tasks is limited by the accuracy of the prediction model. On the other hand, classical symbolic planners show remarkable capabilities in solving long-horizon tasks, but they require predefined symbolic rules and symbolic states, restricting their real-world applicability. In this work, we combine the benefits of these two paradigms and propose a learning-to-plan method that can directly generate a long-term symbolic plan conditioned on high-dimensional observations. We borrow the idea of regression (backward) planning from the classical planning literature and introduce Regression Planning Networks (RPN), a neural network architecture that plans backward starting at a task goal and generates a sequence of intermediate goals that reaches the current observation. We show that our model not only inherits many favorable traits from symbolic planning, e.g., the ability to solve previously unseen tasks, but can also learn from visual inputs in an end-to-end manner. We evaluate the capabilities of RPN in a grid world environment and a simulated 3D kitchen environment featuring complex visual scenes and long task horizons, and show that it achieves near-optimal performance on completely new task instances. |
Tasks | |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.13072v1 |
https://arxiv.org/pdf/1909.13072v1.pdf | |
PWC | https://paperswithcode.com/paper/regression-planning-networks |
Repo | https://github.com/danfeiX/RPN |
Framework | none |
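A schematic of regression (backward) planning in plain Python: starting from the final goal, a subgoal predictor proposes the precondition to satisfy next, until a subgoal is directly reachable from the current observation; the plan is then executed forward. Both predicates below are hypothetical stubs for RPN's learned networks:

```python
# Schematic of backward goal regression with stub predicates.
def subgoal_net(goal):                 # stub: predict the preceding subgoal
    return goal - 1

def reachable(observation, goal):      # stub: one-step reachability test
    return goal - observation <= 1

def regression_plan(observation, final_goal):
    plan = [final_goal]
    while not reachable(observation, plan[-1]):
        plan.append(subgoal_net(plan[-1]))   # walk backward from the goal
    return list(reversed(plan))              # execute forward: nearest first

print(regression_plan(observation=0, final_goal=5))  # [1, 2, 3, 4, 5]
```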