February 1, 2020


Paper Group AWR 174


VS-Net: Variable splitting network for accelerated parallel MRI reconstruction


Title	VS-Net: Variable splitting network for accelerated parallel MRI reconstruction
Authors	Jinming Duan, Jo Schlemper, Chen Qin, Cheng Ouyang, Wenjia Bai, Carlo Biffi, Ghalib Bello, Ben Statton, Declan P O’Regan, Daniel Rueckert
Abstract	In this work, we propose a deep learning approach for parallel magnetic resonance imaging (MRI) reconstruction, termed a variable splitting network (VS-Net), for an efficient, high-quality reconstruction of undersampled multi-coil MR data. We formulate the generalized parallel compressed sensing reconstruction as an energy minimization problem, for which a variable splitting optimization method is derived. Based on this formulation we propose a novel, end-to-end trainable deep neural network architecture by unrolling the resulting iterative process of such variable splitting scheme. VS-Net is evaluated on complex valued multi-coil knee images for 4-fold and 6-fold acceleration factors. We show that VS-Net outperforms state-of-the-art deep learning reconstruction algorithms, in terms of reconstruction accuracy and perceptual quality. Our code is publicly available at https://github.com/j-duan/VS-Net.
Tasks
Published	2019-07-19
URL	https://arxiv.org/abs/1907.10033v1
PDF	https://arxiv.org/pdf/1907.10033v1.pdf
PWC	https://paperswithcode.com/paper/vs-net-variable-splitting-network-for
Repo	https://github.com/j-duan/VS-Net
Framework	pytorch

Analysis of Optimization Algorithms via Sum-of-Squares


Title	Analysis of Optimization Algorithms via Sum-of-Squares
Authors	Sandra S. Y. Tan, Antonios Varvitsiotis, Vincent Y. F. Tan
Abstract	In this work, we introduce a new framework for unifying and systematizing the performance analysis of first-order black-box optimization algorithms for unconstrained convex minimization. The low-cost iteration complexity enjoyed by first-order algorithms renders them particularly relevant for applications in machine learning and large-scale data analysis. Our approach is based on sum-of-squares optimization, which allows us to introduce a hierarchy of semidefinite programs (SDPs) that give increasingly better convergence bounds for higher levels of the hierarchy. The (dual of the) first level of the sum-of-squares hierarchy corresponds to the SDP reformulation of the Performance Estimation Problem, first introduced by Drori and Teboulle [Math. Program., 145(1):451-482, 2014] and developed further by Taylor, Hendrickx, and Glineur [Math. Program., 161(1):307-345, 2017]. Illustrating the usefulness of our approach, we recover, in a unified manner, several known convergence bounds for four widely used first-order algorithms, and also derive new convergence results for noisy gradient descent with inexact line search methods.
Tasks
Published	2019-06-11
URL	https://arxiv.org/abs/1906.04648v3
PDF	https://arxiv.org/pdf/1906.04648v3.pdf
PWC	https://paperswithcode.com/paper/analysis-of-optimization-algorithms-via-sum
Repo	https://github.com/sandratsy/SumsOfSquares
Framework	none

Variational Quantum Circuits for Deep Reinforcement Learning


Title	Variational Quantum Circuits for Deep Reinforcement Learning
Authors	Samuel Yen-Chi Chen, Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, Hsi-Sheng Goan
Abstract	State-of-the-art machine learning approaches are based on classical von Neumann computing architectures and have been widely used in many industrial and academic domains. With the recent development of quantum computing, several tech giants have attempted new quantum circuits for machine learning tasks. However, existing quantum machine learning struggles to simulate classical deep learning models because of the intractability of deep quantum circuits. Thus, it is necessary to design approximated quantum algorithms for quantum machine learning. This work explores variational quantum circuits for deep reinforcement learning. Specifically, we reshape classical deep reinforcement learning techniques such as experience replay and target networks into a representation of variational quantum circuits. In addition, we use a quantum information encoding scheme to reduce the number of model parameters to the scale of $\mathrm{poly}(\log N)$, in contrast to $\mathrm{poly}(N)$ in a standard configuration. Moreover, our variational quantum circuits can be deployed on many near-term noisy intermediate-scale quantum machines.
Tasks	Quantum Machine Learning
Published	2019-06-30
URL	https://arxiv.org/abs/1907.00397v2
PDF	https://arxiv.org/pdf/1907.00397v2.pdf
PWC	https://paperswithcode.com/paper/variational-quantum-circuits-and-deep
Repo	https://github.com/ycchen1989/Var-QuantumCircuits-DeepRL
Framework	pytorch

Residual Pyramid Learning for Single-Shot Semantic Segmentation


Title	Residual Pyramid Learning for Single-Shot Semantic Segmentation
Authors	Xiaoyu Chen, Xiaotian Lou, Lianfa Bai, Jing Han
Abstract	Pixel-level semantic segmentation is a computationally expensive task, especially when the input is large. In segmentation models, an extra decoder structure is often employed in addition to the feature extractor to recover spatial information. In this paper, we put forward a method for single-shot segmentation with a feature residual pyramid network (RPNet), which learns the main part and the residuals of the segmentation by decomposing the label at different levels of residual blocks. Specifically, we use the residual features to learn the edges and details, and the identity features to learn the main part of targets. At test time, the predicted residuals are used to enhance the details of the top-level prediction. Residual learning blocks split the network into several shallow sub-networks, which facilitates the training of RPNet. We then evaluate the proposed method and compare it with recent state-of-the-art methods on CamVid and Cityscapes. The proposed single-shot segmentation based on RPNet achieves impressive results with high efficiency on pixel-level segmentation.
Tasks	Semantic Segmentation
Published	2019-03-23
URL	http://arxiv.org/abs/1903.09746v1
PDF	http://arxiv.org/pdf/1903.09746v1.pdf
PWC	https://paperswithcode.com/paper/residual-pyramid-learning-for-single-shot
Repo	https://github.com/superlxt/RPNet-Pytorch
Framework	pytorch

Hint-Based Training for Non-Autoregressive Machine Translation


Title	Hint-Based Training for Non-Autoregressive Machine Translation
Authors	Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Liwei Wang, Tie-Yan Liu
Abstract	Due to the unparallelizable nature of the autoregressive factorization, AutoRegressive Translation (ART) models have to generate tokens sequentially during decoding and thus suffer from high inference latency. Non-AutoRegressive Translation (NART) models were proposed to reduce the inference time, but could only achieve inferior translation accuracy. In this paper, we propose a novel approach that leverages hints from hidden states and word alignments to help the training of NART models. The results show significant improvement over previous NART models on the WMT14 En-De and De-En datasets and are even comparable to a strong LSTM-based ART baseline, while being one order of magnitude faster in inference.
Tasks	Machine Translation
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06708v1
PDF	https://arxiv.org/pdf/1909.06708v1.pdf
PWC	https://paperswithcode.com/paper/hint-based-training-for-non-autoregressive-1
Repo	https://github.com/zhuohan123/hint-nart
Framework	none

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity


Title	Beyond BLEU: Training Neural Machine Translation with Semantic Similarity
Authors	John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig
Abstract	While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can substantially improve final translation accuracy. However, training with BLEU has some limitations: it doesn’t assign partial credit, it has a limited range of output values, and it can penalize semantically correct hypotheses if they differ lexically from the reference. In this paper, we introduce an alternative reward function for optimizing NMT systems that is based on recent work in semantic similarity. We evaluate on four disparate languages translated to English, and find that training with our proposed metric results in better translations as evaluated by BLEU, semantic similarity, and human evaluation, and also that the optimization procedure converges faster. Analysis suggests that this is because the proposed metric is more conducive to optimization, assigning partial credit and providing more diversity in scores than BLEU.
Tasks	Machine Translation, Semantic Similarity, Semantic Textual Similarity
Published	2019-09-14
URL	https://arxiv.org/abs/1909.06694v1
PDF	https://arxiv.org/pdf/1909.06694v1.pdf
PWC	https://paperswithcode.com/paper/beyond-bleu-training-neural-machine
Repo	https://github.com/jwieting/beyond-bleu
Framework	pytorch

Skip prediction using boosting trees based on acoustic features of tracks in sessions


Title	Skip prediction using boosting trees based on acoustic features of tracks in sessions
Authors	Andrés Ferraro, Dmitry Bogdanov, Xavier Serra
Abstract	The Spotify Sequential Skip Prediction Challenge focuses on predicting whether a track in a session will be skipped by the user. In this paper, we describe our approach to this problem and the final system that was submitted to the challenge by our team from the Music Technology Group (MTG) under the name “aferraro”. This system combines the predictions of multiple boosted-tree models trained with features extracted from the sessions and the tracks. The proposed approach achieves good overall performance (MAA of 0.554), with our model ranked 14th out of more than 600 submissions in the final leaderboard.
Tasks
Published	2019-03-28
URL	http://arxiv.org/abs/1903.11833v1
PDF	http://arxiv.org/pdf/1903.11833v1.pdf
PWC	https://paperswithcode.com/paper/skip-prediction-using-boosting-trees-based-on
Repo	https://github.com/andrebola/skip-challenge-wsdm
Framework	none

Neural Similarity Learning


Title	Neural Similarity Learning
Authors	Weiyang Liu, Zhen Liu, James M. Rehg, Le Song
Abstract	Inner product-based convolution has been the cornerstone of convolutional neural networks (CNNs), enabling end-to-end learning of visual representation. By generalizing the inner product with a bilinear matrix, we propose the neural similarity, which serves as a learnable parametric similarity measure for CNNs. Neural similarity naturally generalizes the convolution and enhances flexibility. Further, we consider neural similarity learning (NSL) in order to learn the neural similarity adaptively from training data. Specifically, we propose two different ways of learning the neural similarity: static NSL and dynamic NSL. Interestingly, dynamic neural similarity turns the CNN into a dynamic inference network. By regularizing the bilinear matrix, NSL can be viewed as learning the shape of the kernel and the similarity measure simultaneously. We further justify the effectiveness of NSL from a theoretical viewpoint. Most importantly, NSL shows promising performance in visual recognition and few-shot learning, validating the superiority of NSL over the inner product-based convolution counterparts.
Tasks	Few-Shot Learning
Published	2019-10-28
URL	https://arxiv.org/abs/1910.13003v3
PDF	https://arxiv.org/pdf/1910.13003v3.pdf
PWC	https://paperswithcode.com/paper/neural-similarity-learning
Repo	https://github.com/wy1iu/NSL
Framework	none
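
The idea of generalizing the inner product with a bilinear matrix can be illustrated on a single kernel/patch pair. The following is a minimal sketch, assuming the similarity has the form w^T M x; the function names, the toy vectors, and the particular matrices are illustrative assumptions, not the authors' implementation. With M set to the identity the ordinary inner product is recovered, and a diagonal M with a zero entry shows one way a constrained M could act like a change of kernel shape.

```python
def inner_product(w, x):
    """Ordinary inner product used by standard convolution."""
    return sum(wi * xi for wi, xi in zip(w, x))


def neural_similarity(w, x, m):
    """Bilinear similarity w^T M x (assumed form, see lead-in)."""
    mx = [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(w))]
    return inner_product(w, mx)


# Toy flattened kernel and input patch (hypothetical values).
kernel = [1.0, -2.0, 0.5]
patch = [0.2, 0.1, 4.0]

# M = identity: the bilinear form reduces to the plain inner product.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

# A diagonal M reweights positions; the zero entry masks the last position.
diagonal = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

print(neural_similarity(kernel, patch, identity))
print(neural_similarity(kernel, patch, diagonal))
```

In a learned setting M would be a trainable parameter (static) or predicted from the input (dynamic); neither training procedure is shown here.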

Global Explanations of Neural Networks: Mapping the Landscape of Predictions


Title	Global Explanations of Neural Networks: Mapping the Landscape of Predictions
Authors	Mark Ibrahim, Melissa Louie, Ceena Modarres, John Paisley
Abstract	A barrier to the wider adoption of neural networks is their lack of interpretability. While local explanation methods exist for one prediction, most global attributions still reduce neural network decisions to a single set of features. In response, we present an approach for generating global attributions called GAM, which explains the landscape of neural network predictions across subpopulations. GAM augments global explanations with the proportion of samples that each attribution best explains and specifies which samples are described by each attribution. Global explanations also have tunable granularity to detect more or fewer subpopulations. We demonstrate that GAM’s global explanations 1) yield the known feature importances of simulated data, 2) match feature weights of interpretable statistical models on real data, and 3) are intuitive to practitioners through user studies. With more transparent predictions, GAM can help ensure neural network decisions are generated for the right reasons.
Tasks
Published	2019-02-06
URL	http://arxiv.org/abs/1902.02384v1
PDF	http://arxiv.org/pdf/1902.02384v1.pdf
PWC	https://paperswithcode.com/paper/global-explanations-of-neural-networks
Repo	https://github.com/capitalone/global-attribution-mapping
Framework	none

Simple Black-box Adversarial Attacks


Title	Simple Black-box Adversarial Attacks
Authors	Chuan Guo, Jacob R. Gardner, Yurong You, Andrew Gordon Wilson, Kilian Q. Weinberger
Abstract	We propose an intriguingly simple method for the construction of adversarial images in the black-box setting. In contrast to the white-box scenario, constructing black-box adversarial images has the additional constraint of a query budget, and efficient attacks remain an open problem to date. With only the mild assumption of continuous-valued confidence scores, our highly query-efficient algorithm utilizes the following simple iterative principle: we randomly sample a vector from a predefined orthonormal basis and either add it to or subtract it from the target image. Despite its simplicity, the proposed method can be used for both untargeted and targeted attacks, resulting in unprecedented query efficiency in both settings. We demonstrate the efficacy and efficiency of our algorithm on several real-world settings including the Google Cloud Vision API. We argue that our proposed algorithm should serve as a strong baseline for future black-box attacks, in particular because it is extremely fast and its implementation requires less than 20 lines of PyTorch code.
Tasks
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07121v2
PDF	https://arxiv.org/pdf/1905.07121v2.pdf
PWC	https://paperswithcode.com/paper/simple-black-box-adversarial-attacks-1
Repo	https://github.com/cg563/simple-blackbox-attack
Framework	pytorch
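
The iterative principle in the abstract (sample a direction from an orthonormal basis, then try adding or subtracting it) can be sketched in a few lines. This is a hedged illustration in plain Python rather than the authors' PyTorch code: `toy_model` is a hypothetical stand-in for a black-box classifier, the standard basis is used as the orthonormal basis, and the rule of keeping a step only when the confidence in the true label drops is an assumption about how the confidence scores are used in the untargeted case.

```python
import math
import random


def toy_model(x):
    """Hypothetical black box: softmax over two fixed linear scores."""
    weights = [[1.0, -1.0, 0.5, 0.0], [-1.0, 1.0, 0.0, 0.5]]
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def simple_blackbox_attack(model, x, label, eps=0.2, max_steps=100, seed=0):
    """Untargeted attack: lower the confidence of `label` using queries only."""
    rng = random.Random(seed)
    x = list(x)
    best = model(x)[label]
    # Standard basis vectors, visited in random order without repetition.
    order = list(range(len(x)))
    rng.shuffle(order)
    for idx in order[:max_steps]:
        for step in (eps, -eps):
            candidate = list(x)
            candidate[idx] += step
            prob = model(candidate)[label]
            if prob < best:  # keep the step only if confidence drops
                x, best = candidate, prob
                break
    return x, best


original = [1.0, 0.0, 0.0, 0.0]
adversarial, confidence = simple_blackbox_attack(toy_model, original, label=0)
print(toy_model(original)[0], confidence)
```

Each accepted or rejected step costs one query to the model, which is why the number of tried directions bounds the query budget in this sketch.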

Estimating post-editing effort: a study on human judgements, task-based and reference-based metrics of MT quality


Title	Estimating post-editing effort: a study on human judgements, task-based and reference-based metrics of MT quality
Authors	Carolina Scarton, Mikel L. Forcada, Miquel Esplà-Gomis, Lucia Specia
Abstract	Devising metrics to assess translation quality has always been at the core of machine translation (MT) research. Traditional automatic reference-based metrics, such as BLEU, have shown correlations with human judgements of adequacy and fluency and have been paramount for the advancement of MT system development. Crowd-sourcing has popularised and enabled the scalability of metrics based on human judgements, such as subjective direct assessments (DA) of adequacy, that are believed to be more reliable than reference-based automatic metrics. Finally, task-based measurements, such as post-editing time, are expected to provide a more detailed evaluation of the usefulness of translations for a specific task. Therefore, while DA averages adequacy judgements to obtain an appraisal of (perceived) quality independently of the task, and reference-based automatic metrics try to objectively estimate quality also in a task-independent way, task-based metrics are measurements obtained either during or after performing a specific task. In this paper we argue that, although expensive, task-based measurements are the most reliable when estimating MT quality in a specific task; in our case, this task is post-editing. To that end, we report experiments on a dataset with newly-collected post-editing indicators and show their usefulness when estimating post-editing effort. Our results show that task-based metrics comparing machine-translated and post-edited versions are the best at tracking post-editing effort, as expected. These metrics are followed by DA, and then by metrics comparing the machine-translated version and independent references. We suggest that MT practitioners should be aware of these differences and acknowledge their implications when deciding how to evaluate MT for post-editing purposes.
Tasks	Machine Translation
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06204v1
PDF	https://arxiv.org/pdf/1910.06204v1.pdf
PWC	https://paperswithcode.com/paper/estimating-post-editing-effort-a-study-on
Repo	https://github.com/carolscarton/iwslt2019
Framework	none

Mapping Supervised Bilingual Word Embeddings from English to low-resource languages


Title	Mapping Supervised Bilingual Word Embeddings from English to low-resource languages
Authors	Sourav Dutta
Abstract	It is very challenging to work with low-resource languages due to the inadequate availability of data. Using a dictionary to map independently trained word embeddings into a shared vector space has proved to be very useful in learning bilingual embeddings in the past. Here we have tried to map individual embeddings of words in English and their corresponding translated words in low-resource languages like Estonian, Slovenian, Slovakian, and Hungarian. We have used a supervised learning approach. We report accuracy scores through various retrieval strategies which show that it is possible to approach challenging tasks in Natural Language Processing like machine translation for such languages, provided that we have at least some amount of proper bilingual data. We also conclude that we can follow an unsupervised learning path on monolingual text data as that is more suitable for low-resource languages.
Tasks	Machine Translation, Word Embeddings
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06411v1
PDF	https://arxiv.org/pdf/1910.06411v1.pdf
PWC	https://paperswithcode.com/paper/mapping-supervised-bilingual-word-embeddings
Repo	https://github.com/SouravDutta91/map-low-resource-embeddings
Framework	none
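
Mapping one embedding space onto another with a seed dictionary, then scoring by retrieval, can be illustrated with toy 2-D vectors. The sketch below assumes a least-squares linear map and cosine nearest-neighbour retrieval; the abstract does not state which mapping objective or retrieval strategies were used, so both are stand-ins, and the word vectors are invented for illustration.

```python
import math


def fit_linear_map(src, tgt):
    """Least-squares 2x2 matrix W minimizing sum ||W x - y||^2 over pairs."""
    a = [[0.0, 0.0], [0.0, 0.0]]  # sum of x x^T
    b = [[0.0, 0.0], [0.0, 0.0]]  # sum of y x^T
    for x, y in zip(src, tgt):
        for i in range(2):
            for j in range(2):
                a[i][j] += x[i] * x[j]
                b[i][j] += y[i] * x[j]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_inv = [[a[1][1] / det, -a[0][1] / det],
             [-a[1][0] / det, a[0][0] / det]]
    return [[sum(b[i][k] * a_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_map(w, x):
    return [w[0][0] * x[0] + w[0][1] * x[1], w[1][0] * x[0] + w[1][1] * x[1]]


def cosine(u, v):
    dot = sum(ui * vi for ui, vi in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def nearest(query, vocab):
    """Return the target word whose vector is most similar to the query."""
    return max(vocab, key=lambda word: cosine(query, vocab[word]))


# Toy embeddings: the target space is the source space rotated by 90 degrees.
english = {"dog": [1.0, 0.0], "cat": [0.0, 1.0],
           "house": [1.0, 1.0], "tree": [-1.0, 1.0]}
estonian = {"koer": [0.0, 1.0], "kass": [-1.0, 0.0],
            "maja": [-1.0, 1.0], "puu": [-1.0, -1.0]}

# Seed dictionary used for supervision; "tree" is held out for evaluation.
seed = [("dog", "koer"), ("cat", "kass"), ("house", "maja")]
w = fit_linear_map([english[s] for s, _ in seed], [estonian[t] for _, t in seed])

print(nearest(apply_map(w, english["tree"]), estonian))
```

Retrieval accuracy on held-out dictionary pairs is then the fraction of source words whose nearest mapped neighbour is the correct translation.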

Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures


Title	Data-Driven Approach to Encoding and Decoding 3-D Crystal Structures
Authors	Jordan Hoffmann, Louis Maestrati, Yoshihide Sawada, Jian Tang, Jean Michel Sellier, Yoshua Bengio
Abstract	Generative models have achieved impressive results in many domains including image and text generation. In the natural sciences, generative models have led to rapid progress in automated drug discovery. Many of the current methods focus on either 1-D or 2-D representations of typically small, drug-like molecules. However, many molecules require 3-D descriptors and exceed the chemical complexity of commonly used datasets. We present a method to encode and decode the position of atoms in 3-D molecules from a dataset of nearly 50,000 stable crystal unit cells that vary from containing 1 to over 100 atoms. We construct a smooth and continuous 3-D density representation of each crystal based on the positions of different atoms. Two different neural networks were trained on a dataset of over 120,000 three-dimensional samples of single and repeating crystal structures, made by rotating the single unit cells. The first, an Encoder-Decoder pair, constructs a compressed latent space representation of each molecule and then decodes this description into an accurate reconstruction of the input. The second network segments the resulting output into atoms and assigns each atom an atomic number. By generating compressed, continuous latent space representations of molecules we are able to decode random samples, interpolate between two molecules, and alter known molecules.
Tasks	Drug Discovery, Text Generation
Published	2019-09-03
URL	https://arxiv.org/abs/1909.00949v1
PDF	https://arxiv.org/pdf/1909.00949v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-approach-to-encoding-and-decoding
Repo	https://github.com/hoffmannjordan/Encoding-Decoding-3D-Crystals
Framework	pytorch
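
One common way to turn atom positions into a smooth 3-D density is to place a Gaussian on each atom and evaluate the sum on a voxel grid. The sketch below assumes exactly that; the abstract does not specify the kernel, and the grid size, `sigma`, and the example coordinates are illustrative choices. Periodic images of the unit cell and atom-type channels are ignored for brevity.

```python
import math


def density_grid(atoms, size=8, box=1.0, sigma=0.1):
    """Sum of isotropic Gaussians centred on atoms, sampled at voxel centres.

    atoms: list of (x, y, z) coordinates inside a cubic box of side `box`.
    Returns a size x size x size nested list of density values.
    """
    step = box / size
    grid = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for i in range(size):
        for j in range(size):
            for k in range(size):
                px, py, pz = (i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step
                total = 0.0
                for ax, ay, az in atoms:
                    d2 = (px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2
                    total += math.exp(-d2 / (2.0 * sigma ** 2))
                grid[i][j][k] = total
    return grid


# Two hypothetical atoms placed exactly on voxel centres (indices 2 and 6).
atoms = [(0.3125, 0.3125, 0.3125), (0.8125, 0.8125, 0.8125)]
grid = density_grid(atoms)
print(grid[2][2][2], grid[0][0][0])
```

A grid like this is the kind of input an encoder-decoder could compress; recovering discrete atoms from a decoded density is a separate segmentation step and is not shown.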

Relational Dose-Response Modeling for Cancer Drug Studies


Title	Relational Dose-Response Modeling for Cancer Drug Studies
Authors	Wesley Tansey, Christopher Tosh, David M. Blei
Abstract	Exploratory cancer drug studies test multiple tumor cell lines against multiple candidate drugs. The goal in each paired (cell line, drug) experiment is to map out the dose-response curve of the cell line as the dose level of the drug increases. The level of natural variation and technical noise in these experiments is high, even when multiple replicates are run. Further, running all possible combinations of cell lines and drugs may be prohibitively expensive, leading to missing data. Thus, estimating the dose-response curve is a denoising and imputation task. We cast this task as a functional matrix factorization problem: finding low-dimensional structure in a matrix where every entry is a noisy function evaluated at a set of discrete points. We propose Bayesian Tensor Filtering (BTF), a hierarchical Bayesian model of matrices of functions. BTF captures the smoothness in each individual function while also being locally adaptive to sharp discontinuities. The BTF model can incorporate many types of likelihoods, making it flexible enough to handle a wide variety of data. We derive efficient Gibbs samplers for three classes of likelihoods: (i) Gaussian, for which updates are fully conjugate; (ii) binomial and related likelihoods, for which updates are conditionally conjugate through Polya-Gamma augmentation; and (iii) non-conjugate likelihoods, for which we develop an analytic truncated elliptical slice sampling routine. We compare BTF against a state-of-the-art method for dynamic Poisson matrix factorization, showing BTF better reconstructs held out data in synthetic experiments. Finally, we build a dose-response model around BTF and apply it to real data from two multi-sample, multi-drug cancer studies. We show that the BTF-based dose-response model outperforms the current standard approach in biology. Code is available at https://github.com/tansey/functionalmf.
Tasks	Denoising, Drug Discovery, Imputation
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04072v2
PDF	https://arxiv.org/pdf/1906.04072v2.pdf
PWC	https://paperswithcode.com/paper/bayesian-tensor-filtering-smooth-locally
Repo	https://github.com/tansey/functionalmf
Framework	none

Scaffold-based molecular design using graph generative model


Title	Scaffold-based molecular design using graph generative model
Authors	Jaechang Lim, Sang-Yeon Hwang, Seungsu Kim, Seokhyun Moon, Woo Youn Kim
Abstract	Searching for new molecules in areas like drug discovery often starts from the core structures of candidate molecules to optimize the properties of interest. This practice calls for a strategy of designing molecules that retain a particular scaffold as a substructure. On this account, our present work proposes a scaffold-based molecular generative model. The model generates molecular graphs by extending the graph of a scaffold through sequential additions of vertices and edges. In contrast to previous related models, our model guarantees that the generated molecules retain the given scaffold. Our evaluation of the model using unseen scaffolds showed validity, uniqueness, and novelty of generated molecules as high as with seen scaffolds. This confirms that the model can generalize the learned chemical rules of adding atoms and bonds rather than simply memorizing the mapping from scaffolds to molecules during learning. Furthermore, despite the restraint of fixing core structures, our model could simultaneously control multiple molecular properties when generating new molecules.
Tasks	Drug Discovery
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13639v1
PDF	https://arxiv.org/pdf/1905.13639v1.pdf
PWC	https://paperswithcode.com/paper/scaffold-based-molecular-design-using-graph
Repo	https://github.com/jaechanglim/GGM
Framework	pytorch