January 28, 2020


Paper Group ANR 937

Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss. Expanding the Text Classification Toolbox with Cross-Lingual Embeddings. Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms. Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD. DA …

Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss


Title	Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss
Authors	Jia Li, Jinming Su, Changqun Xia, Yonghong Tian
Abstract	With the aid of attention mechanisms to weight the image features adaptively, recent advanced deep learning-based salient object detection models encourage the predicted results to approximate the ground-truth masks with as large predictable areas as possible. However, these methods do not pay enough attention to small areas prone to misprediction. In this way, it is still tough to accurately locate salient objects due to the existence of regions with indistinguishable foreground and background and regions with complex or fine structures. To address these problems, we propose a novel network with purificatory mechanism and structural similarity loss. Specifically, in order to better locate preliminary salient objects, we first introduce the promotion attention, which is based on spatial and channel attention mechanisms to promote attention to salient regions. Subsequently, for the purpose of restoring the indistinguishable regions that can be regarded as error-prone regions of one model, we propose the rectification attention, which is learned from the areas of wrong prediction and guides the network to focus on error-prone regions, thus rectifying errors. Through these two attentions, we use the Purificatory Mechanism to impose strict weights on different regions of the whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects. In addition to paying different attention to these hard-to-distinguish regions, we also consider the structural constraints on complex regions and propose the Structural Similarity Loss. The proposed loss models the region-level pair-wise relationship between regions to assist these regions to calibrate their own saliency values. In experiments, the proposed approach efficiently outperforms 19 state-of-the-art methods on six datasets with a notable margin.
Tasks	Object Detection, Salient Object Detection
Published	2019-12-18
URL	https://arxiv.org/abs/1912.08393v1
PDF	https://arxiv.org/pdf/1912.08393v1.pdf
PWC	https://paperswithcode.com/paper/salient-object-detection-with-purificatory
Repo
Framework

Expanding the Text Classification Toolbox with Cross-Lingual Embeddings


Title	Expanding the Text Classification Toolbox with Cross-Lingual Embeddings
Authors	Meryem M’hamdi, Robert West, Andreea Hossmann, Michael Baeriswyl, Claudiu Musat
Abstract	Most work in text classification and Natural Language Processing (NLP) focuses on English or a handful of other languages that have text corpora of hundreds of millions of words. This is creating a new version of the digital divide: the artificial intelligence (AI) divide. Transfer-based approaches, such as Cross-Lingual Text Classification (CLTC) - the task of categorizing texts written in different languages into a common taxonomy, are a promising solution to the emerging AI divide. Recent work on CLTC has focused on demonstrating the benefits of using bilingual word embeddings as features, relegating the CLTC problem to a mere benchmark based on a simple averaged perceptron. In this paper, we explore more extensively and systematically two flavors of the CLTC problem: news topic classification and textual churn intent detection (TCID) in social media. In particular, we test the hypothesis that embeddings with context are more effective, by multi-tasking the learning of multilingual word embeddings and text classification; we explore neural architectures for CLTC; and we move from bi- to multi-lingual word embeddings. For all architectures, types of word embeddings and datasets, we notice a consistent gain trend in favor of multilingual joint training, especially for low-resourced languages.
Tasks	Intent Detection, Multilingual Word Embeddings, Text Classification, Word Embeddings
Published	2019-03-23
URL	http://arxiv.org/abs/1903.09878v2
PDF	http://arxiv.org/pdf/1903.09878v2.pdf
PWC	https://paperswithcode.com/paper/expanding-the-text-classification-toolbox
Repo
Framework

Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms


Title	Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms
Authors	Karl M. Koerich, Mohammad Esmailpour, Sajjad Abdoli, Alceu S. Britto Jr., Alessandro L. Koerich
Abstract	This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Some commonly used adversarial attacks to images have been applied to Mel-frequency and short-time Fourier transform spectrograms, and such perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce perturbed spectrograms that are visually imperceptible by humans. Furthermore, the audio waveforms reconstructed from the perturbed spectrograms are also able to fool a 1D CNN trained on the original audio. Experimental results on a dataset of western music have shown that the 2D CNN achieves up to 81.87% of mean accuracy on legitimate examples and such performance drops to 12.09% on adversarial examples. Likewise, the 1D CNN achieves up to 78.29% of mean accuracy on original audio samples and such performance drops to 27.91% on adversarial audio waveforms reconstructed from the perturbed spectrograms.
Tasks
Published	2019-10-22
URL	https://arxiv.org/abs/1910.10106v2
PDF	https://arxiv.org/pdf/1910.10106v2.pdf
PWC	https://paperswithcode.com/paper/cross-representation-transferability-of
Repo
Framework

Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD


Title	Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD
Authors	Kosuke Haruki, Taiji Suzuki, Yohei Hamakawa, Takeshi Toda, Ryuji Sakai, Masahiro Ozawa, Mitsuhiro Kimura
Abstract	Large-batch stochastic gradient descent (SGD) is widely used for training in distributed deep learning because of its training-time efficiency. However, extremely large-batch SGD leads to poor generalization and easily converges to sharp minima, which prevents naive large-scale data-parallel SGD (DP-SGD) from converging to good minima. To overcome this difficulty, we propose gradient noise convolution (GNC), which effectively smooths sharper minima of the loss function. For DP-SGD, GNC utilizes so-called gradient noise, which is induced by stochastic gradient variation and convolved to the loss function as a smoothing effect. GNC computation can be performed by simply computing the stochastic gradient on each parallel worker and merging them, and is therefore extremely easy to implement. Due to convolving with the gradient noise, which tends to spread along a sharper direction of the loss function, GNC can effectively smooth sharp minima and achieve better generalization, whereas isotropic random noise cannot. We empirically show this effect by comparing GNC with isotropic random noise, and show that it achieves state-of-the-art generalization performance for large-scale deep neural network optimization.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.10822v1
PDF	https://arxiv.org/pdf/1906.10822v1.pdf
PWC	https://paperswithcode.com/paper/gradient-noise-convolution-gnc-smoothing-loss
Repo
Framework

DASGAN – Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images


Title	DASGAN – Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images
Authors	Ansh Kapil, Tobias Wiestler, Simon Lanzmich, Abraham Silva, Keith Steele, Marlon Rebelatto, Guenter Schmidt, Nicolas Brieu
Abstract	The analysis of the tumor environment on digital histopathology slides is becoming key for the understanding of the immune response against cancer, supporting the development of novel immuno-therapies. We introduce here a novel deep learning solution to the related problem of tumor epithelium segmentation. While most existing deep learning segmentation approaches are trained on time-consuming and costly manual annotation on a single stain domain (PD-L1), we leverage here semi-automatically labeled images from a second stain domain (Cytokeratin-CK). We introduce an end-to-end trainable network that jointly segments tumor epithelium on PD-L1 while leveraging unpaired image-to-image translation between CK and PD-L1, therefore completely bypassing the need for serial sections or re-staining of slides. Extending the method to differentiate between PD-L1 positive and negative tumor epithelium regions enables the automated estimation of the PD-L1 Tumor Cell (TC) score. Quantitative experimental results demonstrate the accuracy of our approach against state-of-the-art segmentation methods.
Tasks	Domain Adaptation, Image-to-Image Translation
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11118v1
PDF	https://arxiv.org/pdf/1906.11118v1.pdf
PWC	https://paperswithcode.com/paper/dasgan-joint-domain-adaptation-and
Repo
Framework

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts


Title	Robust Neural Machine Translation for Clean and Noisy Speech Transcripts
Authors	Mattia Antonino Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico
Abstract	Neural machine translation models have been shown to achieve high quality when trained and fed with well-structured and punctuated input texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. As in our application scenarios transcripts might be post-edited by human experts, we propose adaptation strategies to train a single system that can translate either clean or noisy input with no supervision on the input type. Our experimental results on a public speech translation data set show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial with test data of the same type, but produces a small degradation when translating clean text. Adapting on both clean and noisy variants of the same data leads to the best results on both input types.
Tasks	Machine Translation, Speech Recognition
Published	2019-10-22
URL	https://arxiv.org/abs/1910.10238v1
PDF	https://arxiv.org/pdf/1910.10238v1.pdf
PWC	https://paperswithcode.com/paper/robust-neural-machine-translation-for-clean
Repo
Framework

Movie Recommender Systems: Implementation and Performance Evaluation


Title	Movie Recommender Systems: Implementation and Performance Evaluation
Authors	Mojdeh Saadati, Syed Shihab, Mohammed Shaiqur Rahman
Abstract	Over the years, explosive growth in the number of items in the catalog of e-commerce businesses, such as Amazon, Netflix, Pandora, etc., has warranted the development of recommender systems to guide consumers towards their desired products based on their preferences and tastes. Some of the popular approaches for building recommender systems by mining user-derived input datasets are content-based systems, collaborative filtering, latent-factor systems using Singular Value Decomposition (SVD), and Restricted Boltzmann Machines (RBM). In this project, user-user collaborative filtering, item-item collaborative filtering, content-based recommendation, SVD, and neural networks were chosen for implementation in Python to predict the user ratings of unwatched movies for each user, and their performances were evaluated and compared.
Tasks	Recommendation Systems
Published	2019-09-16
URL	https://arxiv.org/abs/1909.12749v1
PDF	https://arxiv.org/pdf/1909.12749v1.pdf
PWC	https://paperswithcode.com/paper/movie-recommender-systems-implementation-and
Repo
Framework
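
As a rough illustration of the user-user collaborative filtering mentioned in the abstract above, here is a minimal Python sketch. It is not the authors' implementation: the function names `cosine` and `predict`, the toy `ratings` dictionary, and the choice of cosine similarity over co-rated movies are all assumptions made for this example.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity computed over the movies both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = sqrt(sum(u[m] ** 2 for m in common))
    norm_v = sqrt(sum(v[m] ** 2 for m in common))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(ratings, user, movie):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user has rated the movie.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        num += sim * other_ratings[movie]
        den += abs(sim)
    return num / den if den else None

# Hypothetical toy data: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 1},
    "bob": {"A": 5, "B": 1, "C": 4},
    "carol": {"A": 1, "B": 5, "C": 2},
}
alice_c = predict(ratings, "alice", "C")
```

In this toy data, alice's tastes match bob's exactly, so her predicted rating for movie C lands closer to bob's 4 than to carol's 2.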

Hardening Random Forest Cyber Detectors Against Adversarial Attacks


Title	Hardening Random Forest Cyber Detectors Against Adversarial Attacks
Authors	Giovanni Apruzzese, Mauro Andreolini, Michele Colajanni, Mirco Marchetti
Abstract	Machine learning algorithms are effective in several applications, but they are not as successful when applied to intrusion detection in cyber security. Due to the high sensitivity to their training data, cyber detectors based on machine learning are vulnerable to targeted adversarial attacks that involve the perturbation of initial samples. Existing defenses assume unrealistic scenarios; their results are underwhelming in non-adversarial settings; or they can be applied only to machine learning algorithms that perform poorly for cyber security. We present an original methodology for countering adversarial perturbations targeting intrusion detection systems based on random forests. As a practical application, we integrate the proposed defense method in a cyber detector analyzing network traffic. The experimental results on millions of labelled network flows show that the new detector has a twofold value: it outperforms state-of-the-art detectors that are subject to adversarial attacks; it exhibits robust results both in adversarial and non-adversarial scenarios.
Tasks	Intrusion Detection
Published	2019-12-09
URL	https://arxiv.org/abs/1912.03790v1
PDF	https://arxiv.org/pdf/1912.03790v1.pdf
PWC	https://paperswithcode.com/paper/hardening-random-forest-cyber-detectors
Repo
Framework

MVF-Net: Multi-View 3D Face Morphable Model Regression


Title	MVF-Net: Multi-View 3D Face Morphable Model Regression
Authors	Fanzi Wu, Linchao Bao, Yajing Chen, Yonggen Ling, Yibing Song, Songnan Li, King Ngi Ngan, Wei Liu
Abstract	We address the problem of recovering the 3D geometry of a human face from a set of facial images in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. There is an inherent drawback in the single-view setting: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. A novel approach is proposed to regress 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multiview geometric constraints are incorporated into the network by establishing dense correspondences between different views leveraging a novel self-supervised view alignment loss. The main ingredient of the view alignment loss is a differentiable dense optical flow estimator that can backpropagate the alignment errors between an input view and a synthetic rendering from another input view, which is projected to the target view through the 3D shape to be inferred. Through minimizing the view alignment loss, better 3D shapes can be recovered such that the synthetic projections from one view to another can better align with the observed image. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM methods.
Tasks	Optical Flow Estimation
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04473v1
PDF	http://arxiv.org/pdf/1904.04473v1.pdf
PWC	https://paperswithcode.com/paper/mvf-net-multi-view-3d-face-morphable-model
Repo
Framework

PageRank algorithm for Directed Hypergraph


Title	PageRank algorithm for Directed Hypergraph
Authors	Loc Tran, Tho Quan, An Mai
Abstract	During the last two decades, the World Wide Web’s link structure has typically been modeled as a directed graph. In this paper, we model the World Wide Web’s link structure as a directed hypergraph. Moreover, we develop the PageRank algorithm for this directed hypergraph. Due to the lack of World Wide Web directed hypergraph datasets, we apply the PageRank algorithm to a metabolic network, which is itself a directed hypergraph. The experiments show that our novel PageRank algorithm is successfully applied to this metabolic network.
Tasks
Published	2019-08-29
URL	https://arxiv.org/abs/1909.01132v1
PDF	https://arxiv.org/pdf/1909.01132v1.pdf
PWC	https://paperswithcode.com/paper/pagerank-algorithm-for-directed-hypergraph
Repo
Framework
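
For context on the PageRank entry above, the following is a minimal sketch of standard power-iteration PageRank on an ordinary directed graph. It does not implement the paper's directed-hypergraph variant; the function name `pagerank`, the damping factor of 0.85, the uniform handling of dangling nodes, and the toy graph are assumptions chosen for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on an ordinary directed graph.

    `out_links` maps each node to the list of nodes it links to.
    """
    nodes = sorted(
        set(out_links) | {v for targets in out_links.values() for v in targets}
    )
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                # Each node splits its damped rank equally among its targets.
                new_rank[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical toy graph: node -> nodes it links to.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
```

In the toy graph, node `c` collects links from three nodes and ends up with the highest score, while `d`, which nothing links to, keeps only the teleportation share `(1 - damping) / n`.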

Adaptive Trade-Offs in Off-Policy Learning


Title	Adaptive Trade-Offs in Off-Policy Learning
Authors	Mark Rowland, Will Dabney, Rémi Munos
Abstract	A great variety of off-policy learning algorithms exist in the literature, and new breakthroughs in this area continue to be made, improving theoretical understanding and yielding state-of-the-art reinforcement learning algorithms. In this paper, we take a unifying view of this space of algorithms, and consider their trade-offs of three fundamental quantities: update variance, fixed-point bias, and contraction rate. This leads to new perspectives of existing methods, and also naturally yields novel algorithms for off-policy evaluation and control. We develop one such algorithm, C-trace, demonstrating that it is able to more efficiently make these trade-offs than existing methods in use, and that it can be scaled to yield state-of-the-art performance in large-scale environments.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07478v1
PDF	https://arxiv.org/pdf/1910.07478v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-trade-offs-in-off-policy-learning
Repo
Framework

Multi-Perspective, Simultaneous Embedding


Title	Multi-Perspective, Simultaneous Embedding
Authors	Md Iqbal Hossain, Vahan Huroyan, Stephen Kobourov, Raymundo Navarrete
Abstract	We describe a method for simultaneous visualization of multiple pairwise distances in 3 dimensional (3D) space. Given the distance matrices that correspond to 2 dimensional projections of a 3 dimensional object (dataset) the goal is to recover the 3 dimensional object (dataset). We propose an approach that uses 3D to place the points, along with projections (planes) that preserve each of the given distance matrices. Our multi-perspective, simultaneous embedding (MPSE) method is based on non-linear dimensionality reduction that generalizes multidimensional scaling. We consider two versions of the problem: in the first one we are given the input distance matrices and the projections (e.g., if we have 3 different projections we can use the three orthogonal directions of the unit cube). In the second version of the problem we also compute the best projections as part of the optimization. We experimentally evaluate MPSE using synthetic datasets that illustrate the quality of the resulting solutions. Finally, we provide a functional prototype which implements both settings.
Tasks	Dimensionality Reduction
Published	2019-09-13
URL	https://arxiv.org/abs/1909.06485v1
PDF	https://arxiv.org/pdf/1909.06485v1.pdf
PWC	https://paperswithcode.com/paper/multi-perspective-simultaneous-embedding
Repo
Framework

3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders


Title	3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders
Authors	Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni, Antonio de Marvao, Stuart A. Cook, Declan P. O’Regan, Daniel Rueckert
Abstract	Accurate segmentation of heart structures imaged by cardiac MR is key for the quantitative analysis of pathology. High-resolution 3D MR sequences enable whole-heart structural imaging but are time-consuming, expensive to acquire and they often require long breath holds that are not suitable for patients. Consequently, multiplanar breath-hold 2D cine sequences are standard practice but are disadvantaged by lack of whole-heart coverage and low through-plane resolution. To address this, we propose a conditional variational autoencoder architecture able to learn a generative model of 3D high-resolution left ventricular (LV) segmentations which is conditioned on three 2D LV segmentations of one short-axis and two long-axis images. By only employing these three 2D segmentations, our model can efficiently reconstruct the 3D high-resolution LV segmentation of a subject. When evaluated on 400 unseen healthy volunteers, our model yielded an average Dice score of $87.92 \pm 0.15$ and outperformed competing architectures.
Tasks	Cardiac Segmentation
Published	2019-02-28
URL	http://arxiv.org/abs/1902.11000v1
PDF	http://arxiv.org/pdf/1902.11000v1.pdf
PWC	https://paperswithcode.com/paper/3d-high-resolution-cardiac-segmentation
Repo
Framework
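
The Dice score reported in the abstract above measures overlap between a predicted and a reference segmentation. A minimal sketch of the standard formula, 2|A ∩ B| / (|A| + |B|), is shown below; the function name `dice_score`, the flat-list mask representation, and the toy masks are assumptions for this example and are not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly; this convention is a choice made here.
    return 1.0 if total == 0 else 2.0 * overlap / total

# Hypothetical flattened masks: 1 marks a voxel labelled as left ventricle.
pred_mask = [1, 1, 1, 0, 0, 0]
true_mask = [0, 1, 1, 1, 0, 0]
score = dice_score(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A value of 1 means the two masks coincide and 0 means they share no voxels; multiplying by 100 gives the percentage scale on which such scores are often reported.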

Deep Generative Quantile-Copula Models for Probabilistic Forecasting


Title	Deep Generative Quantile-Copula Models for Probabilistic Forecasting
Authors	Ruofeng Wen, Kari Torkkola
Abstract	We introduce a new category of multivariate conditional generative models and demonstrate its performance and versatility in probabilistic time series forecasting and simulation. Specifically, the output of quantile regression networks is expanded from a set of fixed quantiles to the whole Quantile Function by a univariate mapping from a latent uniform distribution to the target distribution. Then the multivariate case is solved by learning such quantile functions for each dimension’s marginal distribution, followed by estimating a conditional Copula to associate these latent uniform random variables. The quantile functions and copula, together defining the joint predictive distribution, can be parameterized by a single implicit generative Deep Neural Network.
Tasks	Time Series, Time Series Forecasting
Published	2019-07-24
URL	https://arxiv.org/abs/1907.10697v1
PDF	https://arxiv.org/pdf/1907.10697v1.pdf
PWC	https://paperswithcode.com/paper/deep-generative-quantile-copula-models-for
Repo
Framework
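
Two ingredients named in the abstract above can be sketched in a few lines: the pinball (quantile) loss commonly used to train quantile regression models, and the idea of a quantile function as a mapping from a latent uniform variable to the target distribution. This is an illustration only, not the paper's model: the closed-form exponential quantile function stands in for a learned network, and the names `pinball_loss` and `exp_quantile` are assumptions.

```python
import math
import random

def pinball_loss(y, y_hat, q):
    """Pinball loss for quantile level q in (0, 1)."""
    diff = y - y_hat
    return max(q * diff, (q - 1.0) * diff)

def exp_quantile(u, rate=1.0):
    """Quantile function of an exponential distribution (stand-in for a learned one)."""
    return -math.log(1.0 - u) / rate

# Pushing uniform draws through the quantile function yields samples
# from the target distribution (inverse transform sampling).
rng = random.Random(0)
samples = [exp_quantile(rng.random(), rate=2.0) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

With `rate=2.0` the target distribution has mean 0.5, so `sample_mean` should be close to that value. The pinball loss is asymmetric: at `q = 0.9`, under-predicting costs nine times as much as over-predicting by the same amount.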

A Practical Framework for Solving Center-Based Clustering with Outliers


Title	A Practical Framework for Solving Center-Based Clustering with Outliers
Authors	Hu Ding, Haikuo Yu
Abstract	Clustering has many important applications in computer science, but real-world datasets often contain outliers. Moreover, the existence of outliers can make clustering problems much more challenging. In this paper, we propose a practical framework for solving the problems of $k$-center/median/means clustering with outliers. The framework is actually very simple: we just need to take a small sample from the input and run an existing approximation algorithm on the sample. However, our analysis is fundamentally different from the previous sampling-based ideas. In particular, the size of the sample is independent of the input data size and dimensionality. To explain the effectiveness of random sampling in theory, we introduce a ‘significance’ criterion and prove that the performance of our framework depends on the significance degree of the given instance. The result proposed in this paper falls under the umbrella of beyond worst-case analysis in terms of clustering with outliers. The experiments suggest that our framework can achieve clustering results comparable to existing methods while greatly reducing the running time.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10143v2
PDF	https://arxiv.org/pdf/1905.10143v2.pdf
PWC	https://paperswithcode.com/paper/a-practical-framework-for-solving-center
Repo
Framework