Paper Group NAWR 7
Building English-to-Serbian Machine Translation System for IMDb Movie Reviews
Title | Building English-to-Serbian Machine Translation System for IMDb Movie Reviews |
Authors | Pintu Lohar, Maja Popović, Andy Way |
Abstract | This paper reports the results of the first experiment dealing with the challenges of building a machine translation system for user-generated content involving a complex South Slavic language. We focus on translation of English IMDb user movie reviews into Serbian, in a low-resource scenario. We explore potentials and limits of (i) phrase-based and neural machine translation systems trained on out-of-domain clean parallel data from news articles and (ii) creating an additional synthetic in-domain parallel corpus by machine-translating the English IMDb corpus into Serbian. Our main findings are that morphology and syntax are better handled by the neural approach than by the phrase-based approach even in this low-resource mismatched domain scenario; however, the situation is different for the lexical aspect, especially for person names. This finding also indicates that, in general, machine translation of person names into Slavic languages (especially those which require/allow transcription) should be investigated more systematically. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3715/ |
https://www.aclweb.org/anthology/W19-3715 | |
PWC | https://paperswithcode.com/paper/building-english-to-serbian-machine |
Repo | https://github.com/m-popovic/imdb-corpus-for-MT |
Framework | none |
Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses
Title | Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses |
Authors | Jerome Rony, Luiz G. Hafemann, Luiz S. Oliveira, Ismail Ben Ayed, Robert Sabourin, Eric Granger |
Abstract | Research on adversarial examples in computer vision tasks has shown that small, often imperceptible changes to an image can induce misclassification, which has security implications for a wide range of image processing systems. Considering L2 norm distortions, the Carlini and Wagner attack is presently the most effective white-box attack in the literature. However, this method is slow since it performs a line-search for one of the optimization terms, and often requires thousands of iterations. In this paper, an efficient approach is proposed to generate gradient-based attacks that induce misclassifications with low L2 norm, by decoupling the direction and the norm of the adversarial perturbation that is added to the image. Experiments conducted on the MNIST, CIFAR-10 and ImageNet datasets indicate that our attack achieves comparable results to the state-of-the-art (in terms of L2 norm) with considerably fewer iterations (as few as 100 iterations), which opens the possibility of using these attacks for adversarial training. Models trained with our attack achieve state-of-the-art robustness against white-box gradient-based L2 attacks on the MNIST and CIFAR-10 datasets, outperforming the Madry defense when the attacks are limited to a maximum norm. |
Tasks | Adversarial Attack, Adversarial Defense |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Rony_Decoupling_Direction_and_Norm_for_Efficient_Gradient-Based_L2_Adversarial_Attacks_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Rony_Decoupling_Direction_and_Norm_for_Efficient_Gradient-Based_L2_Adversarial_Attacks_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/decoupling-direction-and-norm-for-efficient-1 |
Repo | https://github.com/jeromerony/fast_adversarial |
Framework | pytorch |
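The decoupling idea in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation (see the linked repository for that): it assumes a hand-written linear binary classifier, and the function name `ddn_linear_attack` and the values of `alpha`, `gamma` and the initial `eps` are hypothetical choices made for illustration. Each iteration takes a step along the normalized gradient (direction), then separately shrinks or grows the allowed L2 norm depending on whether the current point is already misclassified (norm), and finally rescales the perturbation to that norm.

```python
import math


def ddn_linear_attack(x, w, b, steps=100, alpha=0.1, gamma=0.05, eps=1.0):
    """Toy L2 attack that updates direction and norm separately.

    The (assumed) classifier predicts class 1 when w.x + b > 0. The attack
    searches for a small perturbation that flips a class-1 input to class 0.
    Returns (best_norm, best_delta), or (None, None) if no flip was found.
    """
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    # Normalized gradient direction that lowers the score.
    # For a linear model this direction is the same at every point.
    g = [-wi / w_norm for wi in w]

    def score(p):
        return sum(wi * pi for wi, pi in zip(w, p)) + b

    delta = [0.0] * len(x)
    x_adv = list(x)
    best_norm, best_delta = None, None
    for _ in range(steps):
        # Direction: accumulate a step along the normalized gradient.
        delta = [d + alpha * gi for d, gi in zip(delta, g)]
        # Norm: shrink the budget if the current point is adversarial, else grow it.
        if score(x_adv) < 0:
            eps *= 1.0 - gamma
        else:
            eps *= 1.0 + gamma
        # Rescale the perturbation so that its L2 norm equals eps.
        d_norm = math.sqrt(sum(d * d for d in delta))
        delta = [eps * d / d_norm for d in delta]
        x_adv = [xi + d for xi, d in zip(x, delta)]
        # Keep the smallest perturbation that flips the prediction.
        if score(x_adv) < 0 and (best_norm is None or eps < best_norm):
            best_norm, best_delta = eps, list(delta)
    return best_norm, best_delta


if __name__ == "__main__":
    norm, delta = ddn_linear_attack([1.0, 2.0], [1.0, 1.0], -1.0)
    print("smallest adversarial L2 norm found:", norm)
```

For a linear classifier the smallest possible adversarial norm is the distance to the decision boundary, `|w.x + b| / ||w||`, so the result can be checked against that value. On a deep network the gradient would come from automatic differentiation and would change at every step.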
Limited Data Rolling Bearing Fault Diagnosis with Few-shot Learning
Title | Limited Data Rolling Bearing Fault Diagnosis with Few-shot Learning |
Authors | Ansi Zhang, Shaobo Li, Yuxin Cui, Wanli Yang, Rongzhi Dong, Jianjun Hu |
Abstract | This paper focuses on bearing fault diagnosis with limited training data. A major challenge in fault diagnosis is the infeasibility of obtaining sufficient training samples for every fault type under all working conditions. Recently, deep learning-based fault diagnosis methods have achieved promising results. However, most of these methods require a large amount of training data. In this study, we propose a deep neural network-based few-shot learning approach for rolling bearing fault diagnosis with limited data. Our model is based on the siamese neural network, which learns by exploiting sample pairs of the same or different categories. Experimental results on the standard Case Western Reserve University (CWRU) bearing fault diagnosis benchmark dataset showed that our few-shot learning approach is more effective in fault diagnosis with limited data availability. When tested in different noise environments with a minimal amount of training data, the performance of our few-shot learning model surpasses that of the baseline at reasonable noise levels. When evaluated on test sets with new fault types or new working conditions, few-shot models work better than the baseline trained with all fault types. All our models and datasets in this study are open sourced and can be downloaded from https://mekhub.cn/as/fault_diagnosis_with_few-shot_learning/ . |
Tasks | Few-Shot Learning |
Published | 2019-08-22 |
URL | https://ieeexplore.ieee.org/abstract/document/8793060 |
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8793060 | |
PWC | https://paperswithcode.com/paper/limited-data-rolling-bearing-fault-diagnosis |
Repo | https://github.com/SNBQT/Limited-Data-Rolling-Bearing-Fault-Diagnosis-with-Few-shot-Learning |
Framework | none |
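The abstract above says the siamese network learns from sample pairs of the same or different categories. A minimal sketch of how such training pairs might be built from a small labeled set is shown below. The helper name `make_pairs` and the fault labels in the usage example are hypothetical, and the paper's own sampling scheme may differ (for instance, it may sample pairs at random instead of enumerating all of them).

```python
import itertools


def make_pairs(samples, labels):
    """Return (sample_a, sample_b, same_class) for every unordered pair.

    same_class is 1 when both samples share a label and 0 otherwise; this is
    the binary target a siamese network is typically trained to predict.
    """
    pairs = []
    for i, j in itertools.combinations(range(len(samples)), 2):
        pairs.append((samples[i], samples[j], int(labels[i] == labels[j])))
    return pairs


if __name__ == "__main__":
    # Hypothetical signal identifiers and fault labels, for illustration only.
    signals = ["sig_a", "sig_b", "sig_c"]
    faults = ["inner_race", "inner_race", "ball"]
    for pair in make_pairs(signals, faults):
        print(pair)
```

With n labeled samples this yields n(n-1)/2 pairs, which is one reason pair-based training can extract more supervision from a limited dataset than per-sample classification.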
Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks
Title | Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks |
Authors | Seungjoo Yoo, Hyojin Bahng, Sunghyo Chung, Junsoo Lee, Jaehyuk Chang, Jaegul Choo |
Abstract | Despite recent advancements in deep learning-based automatic colorization, existing approaches are still limited when it comes to few-shot learning. Existing models require a significant amount of training data. To tackle this issue, we present a novel memory-augmented colorization model MemoPainter that can produce high-quality colorization with limited data. In particular, our model is able to capture rare instances and successfully colorize them. Also, we propose a novel threshold triplet loss that enables unsupervised training of memory networks without the need for class labels. Experiments show that our model has superior quality in both few-shot and one-shot colorization tasks. |
Tasks | Colorization, Few-Shot Learning |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Yoo_Coloring_With_Limited_Data_Few-Shot_Colorization_via_Memory_Augmented_Networks_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Yoo_Coloring_With_Limited_Data_Few-Shot_Colorization_via_Memory_Augmented_Networks_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/coloring-with-limited-data-few-shot |
Repo | https://github.com/dongheehand/MemoPainter-PyTorch |
Framework | pytorch |
Scalable Bayesian inference of dendritic voltage via spatiotemporal recurrent state space models
Title | Scalable Bayesian inference of dendritic voltage via spatiotemporal recurrent state space models |
Authors | Ruoxi Sun, Ian Kinsella, Scott Linderman, Liam Paninski |
Abstract | Recent advances in optical voltage sensors have brought us closer to a critical goal in cellular neuroscience: imaging the full spatiotemporal voltage on a dendritic tree. However, current sensors and imaging approaches still face significant limitations in SNR and sampling frequency; therefore statistical denoising and interpolation methods remain critical for understanding single-trial spatiotemporal dendritic voltage dynamics. Previous denoising approaches were either based on an inadequate linear voltage model or scaled poorly to large trees. Here we introduce a scalable fully Bayesian approach. We develop a generative nonlinear model that requires few parameters per compartment of the cell but is nonetheless flexible enough to sample realistic spatiotemporal data. The model captures different dynamics in each compartment and leverages biophysical knowledge to constrain intra- and inter-compartmental dynamics. We obtain a full posterior distribution over spatiotemporal voltage via an augmented Gibbs sampling algorithm. The nonlinear smoother model outperforms previously developed linear methods, and scales to much larger systems than previous methods based on sequential Monte Carlo approaches. |
Tasks | Bayesian Inference, Denoising |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9206-scalable-bayesian-inference-of-dendritic-voltage-via-spatiotemporal-recurrent-state-space-models |
http://papers.nips.cc/paper/9206-scalable-bayesian-inference-of-dendritic-voltage-via-spatiotemporal-recurrent-state-space-models.pdf | |
PWC | https://paperswithcode.com/paper/scalable-bayesian-inference-of-dendritic |
Repo | https://github.com/SunRuoxi/Voltage_ |
Framework | none |
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning
Title | Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning |
Authors | Gregory Farquhar, Shimon Whiteson, Jakob Foerster |
Abstract | Gradient-based methods for optimisation of objectives in stochastic settings with unknown or intractable dynamics require estimators of derivatives. We derive an objective that, under automatic differentiation, produces low-variance unbiased estimators of derivatives at any order. Our objective is compatible with arbitrary advantage estimators, which allows the control of the bias and variance of any-order derivatives when using function approximation. Furthermore, we propose a method to trade off bias and variance of higher order derivatives by discounting the impact of more distant causal dependencies. We demonstrate the correctness and utility of our estimator in analytically tractable MDPs and in meta-reinforcement-learning for continuous control. |
Tasks | Continuous Control |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9026-loaded-dice-trading-off-bias-and-variance-in-any-order-score-function-gradient-estimators-for-reinforcement-learning |
http://papers.nips.cc/paper/9026-loaded-dice-trading-off-bias-and-variance-in-any-order-score-function-gradient-estimators-for-reinforcement-learning.pdf | |
PWC | https://paperswithcode.com/paper/loaded-dice-trading-off-bias-and-variance-in-1 |
Repo | https://github.com/oxwhirl/loaded-dice |
Framework | none |
Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning
Title | Distilling Discrimination and Generalization Knowledge for Event Detection via Delta-Representation Learning |
Authors | Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun |
Abstract | Event detection systems rely on discrimination knowledge to distinguish ambiguous trigger words and generalization knowledge to detect unseen/sparse trigger words. Current neural event detection approaches focus on trigger-centric representations, which work well on distilling discrimination knowledge, but poorly on learning generalization knowledge. To address this problem, this paper proposes a Delta-learning approach to distill discrimination and generalization knowledge by effectively decoupling, incrementally learning and adaptively fusing event representation. Experiments show that our method significantly outperforms previous approaches on unseen/sparse trigger words, and achieves state-of-the-art performance on both ACE2005 and KBP2017 datasets. |
Tasks | Representation Learning |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1429/ |
https://www.aclweb.org/anthology/P19-1429 | |
PWC | https://paperswithcode.com/paper/distilling-discrimination-and-generalization |
Repo | https://github.com/luyaojie/delta-learning-for-ed |
Framework | pytorch |
A Self-Training Approach for Short Text Clustering
Title | A Self-Training Approach for Short Text Clustering |
Authors | Amir Hadifar, Lucas Sterckx, Thomas Demeester, Chris Develder |
Abstract | Short text clustering is a challenging problem when adopting traditional bag-of-words or TF-IDF representations, since these lead to sparse vector representations of the short texts. Low-dimensional continuous representations or embeddings can counter that sparseness problem: their high representational power is exploited in deep clustering algorithms. While deep clustering has been studied extensively in computer vision, relatively little work has focused on NLP. The method we propose learns discriminative features from both an autoencoder and a sentence embedding, then uses assignments from a clustering algorithm as supervision to update weights of the encoder network. Experiments on three short text datasets empirically validate the effectiveness of our method. |
Tasks | Sentence Embedding, Text Clustering |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4322/ |
https://www.aclweb.org/anthology/W19-4322 | |
PWC | https://paperswithcode.com/paper/a-self-training-approach-for-short-text |
Repo | https://github.com/hadifar/stc_clustering |
Framework | none |
Octave Deep Plane-Sweeping Network: Reducing Spatial Redundancy for Learning-Based Plane-Sweeping Stereo
Title | Octave Deep Plane-Sweeping Network: Reducing Spatial Redundancy for Learning-Based Plane-Sweeping Stereo |
Authors | R. Komatsu, H. Fujii, Y. Tamura, A. Yamashita, H. Asama |
Abstract | In this paper, we propose the octave deep plane-sweeping network (OctDPSNet). OctDPSNet is a novel learning-based plane-sweeping stereo, which drastically reduces the required GPU memory and computation time while achieving a state-of-the-art depth estimation accuracy. Inspired by octave convolution, we divide image features into high and low spatial frequency features, and two cost volumes are generated from these using our proposed plane-sweeping module. To reduce spatial redundancy, the resolution of the cost volume from the low spatial frequency features is set to half that of the high spatial frequency features, which enables the memory consumption and computational cost to be reduced. After refinement, the two cost volumes are integrated into a final cost volume through our proposed pixel-wise “squeeze-and-excitation” based attention mechanism, and the depth maps are estimated from the final cost volume. We evaluate the proposed model on five datasets: SUN3D, RGB-D SLAM, MVS, Scenes11, and ETH3D. Our model outperforms previous methods on five datasets while drastically reducing the memory consumption and computational cost. Our source code is available at https://github.com/matsuren/octDPSNet. |
Tasks | Depth Estimation, Stereo Depth Estimation |
Published | 2019-10-14 |
URL | https://ieeexplore.ieee.org/document/8867874 |
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8867874 | |
PWC | https://paperswithcode.com/paper/octave-deep-plane-sweeping-network-reducing |
Repo | https://github.com/matsuren/octDPSNet |
Framework | pytorch |
Effective Adversarial Regularization for Neural Machine Translation
Title | Effective Adversarial Regularization for Neural Machine Translation |
Authors | Motoki Sato, Jun Suzuki, Shun Kiyono |
Abstract | A regularization technique based on adversarial perturbation, which was initially developed in the field of image processing, has been successfully applied to text classification tasks and has yielded attractive improvements. We aim to further leverage this promising methodology into more sophisticated and critical neural models in the natural language processing field, i.e., neural machine translation (NMT) models. However, it is not trivial to apply this methodology to such models. Thus, this paper investigates the effectiveness of several possible configurations of applying the adversarial perturbation and reveals that the adversarial regularization technique can significantly and consistently improve the performance of widely used NMT models, such as LSTM-based and Transformer-based models. |
Tasks | Machine Translation, Text Classification |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1020/ |
https://www.aclweb.org/anthology/P19-1020 | |
PWC | https://paperswithcode.com/paper/effective-adversarial-regularization-for |
Repo | https://github.com/pfnet-research/vat_nmt |
Framework | none |
Difference of Convolution for Deep Compressive Sensing
Title | Difference of Convolution for Deep Compressive Sensing |
Authors | Thuong Nguyen Canh, Byeungwoo Jeon |
Abstract | Deep learning-based compressive sensing (DCS) has improved compressive sensing (CS) with fast, high-quality reconstruction. Researchers have further extended it to multi-scale DCS, which improves reconstruction quality based on wavelet decomposition. In this work, we mimic the Difference of Gaussian via convolution and propose a scheme named Difference of Convolution-based multi-scale DCS (DoC-DCS). Unlike multi-scale DCS based on a well-designed filter in the wavelet domain, the proposed DoC-DCS learns the decomposition and thereby outperforms other state-of-the-art compressive sensing methods. |
Tasks | Compressive Sensing |
Published | 2019-09-22 |
URL | https://github.com/AtenaKid/DoC-DCS |
PWC | https://paperswithcode.com/paper/difference-of-convolution-for-deep |
Repo | https://github.com/AtenaKid/DoC-DCS |
Framework | none |
Automated characterization of noise distributions in diffusion MRI data
Title | Automated characterization of noise distributions in diffusion MRI data |
Authors | Samuel St-Jean, Alberto De Luca, Chantal M. W. Tax, Max A. Viergever, Alexander Leemans |
Abstract | Purpose: To understand and characterize noise distributions in parallel imaging for diffusion MRI. Theory and Methods: Two new automated methods using the moments and the maximum likelihood equations of the Gamma distribution were developed. Simulations using stationary and spatially varying noncentral chi noise distributions were created for two diffusion weightings with SENSE or GRAPPA reconstruction and 8, 12 or 32 receiver coils. Furthermore, MRI data of a water phantom with different combinations of multiband and SENSE acceleration were acquired on a 3T scanner along with noise-only measurements. Finally, an in vivo dataset was acquired at 3T using multiband acceleration and GRAPPA reconstruction. Estimation of the noise distribution was performed with the proposed methods and compared with 3 other existing algorithms. Results: Simulations showed that assuming a Rician distribution can lead to misestimation in parallel imaging. Results on the acquired datasets showed that signal leakage in multiband can lead to a misestimation of the parameters. Noise maps are robust to these artifacts, but may misestimate parameters in some cases. The proposed algorithms herein can estimate both parameters of the noise distribution, are robust to signal leakage artifacts and perform best when used on acquired noise maps. Conclusion: Misestimation of the correct noise distribution can hamper further processing such as bias correction and denoising, especially when the measured distribution differs too much from the actual signal distribution e.g., due to artifacts. The use of noise maps can yield more robust estimates than the use of diffusion weighted images as input for algorithms. |
Tasks | Denoising |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12121 |
https://arxiv.org/pdf/1906.12121.pdf | |
PWC | https://paperswithcode.com/paper/automated-characterization-of-noise |
Repo | https://github.com/samuelstjean/autodmri |
Framework | none |
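The abstract above mentions estimators built on the moments of the Gamma distribution. As background only, here is a minimal sketch of the textbook method-of-moments estimator for a Gamma distribution, where shape = mean^2 / variance and scale = variance / mean. This is not the authors' algorithm, which targets noise parameters in MRI data and is available in the linked repository. The function name `gamma_method_of_moments` and the synthetic data are assumptions made for illustration.

```python
import random
import statistics


def gamma_method_of_moments(samples):
    """Estimate Gamma (shape, scale) by matching the sample mean and variance.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, which gives the two formulas below.
    """
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


if __name__ == "__main__":
    # Synthetic data with assumed true parameters shape=3 and scale=2.
    rng = random.Random(0)
    data = [rng.gammavariate(3.0, 2.0) for _ in range(20000)]
    shape, scale = gamma_method_of_moments(data)
    print("estimated shape:", shape, "estimated scale:", scale)
```

Moment matching needs no iterative solver, which makes it a cheap first estimate. Maximum likelihood estimation of the Gamma shape has no closed form and is usually solved numerically.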
R2D2: Reliable and Repeatable Detector and Descriptor
Title | R2D2: Reliable and Repeatable Detector and Descriptor |
Authors | Jerome Revaud, Cesar De Souza, Martin Humenberger, Philippe Weinzaepfel |
Abstract | Interest point detection and local feature description are fundamental steps in many computer vision applications. Classical approaches are based on a detect-then-describe paradigm where separate handcrafted methods are used to first identify repeatable keypoints and then represent them with a local descriptor. Neural networks trained with metric learning losses have recently caught up with these techniques, focusing on learning repeatable saliency maps for keypoint detection or learning descriptors at the detected keypoint locations. In this work, we argue that repeatable regions are not necessarily discriminative and can therefore lead to the selection of suboptimal keypoints. Furthermore, we claim that descriptors should be learned only in regions for which matching can be performed with high confidence. We thus propose to jointly learn keypoint detection and description together with a predictor of the local descriptor discriminativeness. This allows us to avoid ambiguous areas, thus leading to reliable keypoint detection and description. Our detection-and-description approach simultaneously outputs sparse, repeatable and reliable keypoints that outperform state-of-the-art detectors and descriptors on the HPatches dataset and on the recent Aachen Day-Night localization benchmark. |
Tasks | Interest Point Detection, Keypoint Detection, Metric Learning |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9407-r2d2-reliable-and-repeatable-detector-and-descriptor |
http://papers.nips.cc/paper/9407-r2d2-reliable-and-repeatable-detector-and-descriptor.pdf | |
PWC | https://paperswithcode.com/paper/r2d2-reliable-and-repeatable-detector-and |
Repo | https://github.com/naver/r2d2 |
Framework | pytorch |
Sampling Networks and Aggregate Simulation for Online POMDP Planning
Title | Sampling Networks and Aggregate Simulation for Online POMDP Planning |
Authors | Hao (Jackson) Cui, Roni Khardon |
Abstract | The paper introduces a new algorithm for planning in partially observable Markov decision processes (POMDP) based on the idea of aggregate simulation. The algorithm uses product distributions to approximate the belief state and shows how to build a representation graph of an approximate action-value function over belief space. The graph captures the result of simulating the model in aggregate under independence assumptions, giving a symbolic representation of the value function. The algorithm supports large observation spaces using sampling networks, a representation of the process of sampling values of observations, which is integrated into the graph representation. Following previous work in MDPs this approach enables action selection in POMDPs through gradient optimization over the graph representation. This approach complements recent algorithms for POMDPs which are based on particle representations of belief states and an explicit search for action selection. Our approach enables scaling to large factored action spaces in addition to large state spaces and observation spaces. An experimental evaluation demonstrates that the algorithm provides excellent performance relative to state of the art in large POMDP problems. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9121-sampling-networks-and-aggregate-simulation-for-online-pomdp-planning |
http://papers.nips.cc/paper/9121-sampling-networks-and-aggregate-simulation-for-online-pomdp-planning.pdf | |
PWC | https://paperswithcode.com/paper/sampling-networks-and-aggregate-simulation |
Repo | https://github.com/hcui01/SNAP |
Framework | none |
Park: An Open Platform for Learning-Augmented Computer Systems
Title | Park: An Open Platform for Learning-Augmented Computer Systems |
Authors | Hongzi Mao, Parimarjan Negi, Akshay Narayan, Hanrui Wang, Jiacheng Yang, Haonan Wang, Ryan Marcus, Ravichandra Addanki, Mehrdad Khani Shirkoohi, Songtao He, Vikram Nathan, Frank Cangialosi, Shaileshh Venkatakrishnan, Wei-Hung Weng, Song Han, Tim Kraska, Mohammad Alizadeh |
Abstract | We present Park, a platform for researchers to experiment with Reinforcement Learning (RL) for computer systems. Using RL for improving the performance of systems has a lot of potential, but is also in many ways very different from, for example, using RL for games. Thus, in this work we first discuss the unique challenges RL for systems has, and then propose Park, an open extensible platform that makes it easier for ML researchers to work on systems problems. Currently, Park consists of 12 real-world system-centric optimization problems with one common, easy-to-use interface. Finally, we present the performance of existing RL approaches on those 12 problems and outline potential areas of future work. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8519-park-an-open-platform-for-learning-augmented-computer-systems |
http://papers.nips.cc/paper/8519-park-an-open-platform-for-learning-augmented-computer-systems.pdf | |
PWC | https://paperswithcode.com/paper/park-an-open-platform-for-learning-augmented |
Repo | https://github.com/park-project/park |
Framework | tf |