Paper Group ANR 937
Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss. Expanding the Text Classification Toolbox with Cross-Lingual Embeddings. Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms. Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD. DA …
Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss
Title | Salient Object Detection with Purificatory Mechanism and Structural Similarity Loss |
Authors | Jia Li, Jinming Su, Changqun Xia, Yonghong Tian |
Abstract | With the aid of attention mechanisms to weight the image features adaptively, recent advanced deep learning-based salient object detection models encourage the predicted results to approximate the ground-truth masks with as large predictable areas as possible. However, these methods do not pay enough attention to small areas prone to misprediction. As a result, it is still difficult to accurately locate salient objects due to the existence of regions with indistinguishable foreground and background and regions with complex or fine structures. To address these problems, we propose a novel network with a purificatory mechanism and a structural similarity loss. Specifically, in order to better locate preliminary salient objects, we first introduce the promotion attention, which is based on spatial and channel attention mechanisms to promote attention to salient regions. Subsequently, for the purpose of restoring the indistinguishable regions that can be regarded as error-prone regions of one model, we propose the rectification attention, which is learned from the areas of wrong prediction and guides the network to focus on error-prone regions, thus rectifying errors. Through these two attentions, we use the Purificatory Mechanism to impose strict weights on different regions of the whole salient objects and purify results from hard-to-distinguish regions, thus accurately predicting the locations and details of salient objects. In addition to paying different attention to these hard-to-distinguish regions, we also consider the structural constraints on complex regions and propose the Structural Similarity Loss. The proposed loss models the region-level pair-wise relationship between regions to assist these regions in calibrating their own saliency values. In experiments, the proposed approach outperforms 19 state-of-the-art methods on six datasets by a notable margin. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08393v1 |
https://arxiv.org/pdf/1912.08393v1.pdf | |
PWC | https://paperswithcode.com/paper/salient-object-detection-with-purificatory |
Repo | |
Framework | |
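The promotion attention described above builds on standard spatial and channel attention. As a rough illustration, the following PyTorch sketch combines a channel reweighting branch with a spatial attention map; the layer widths, kernel size, and the way the two branches are fused are assumptions for illustration, not the paper's architecture.

```python
# Minimal sketch of a channel + spatial attention block, in the spirit of the
# "promotion attention" described above. Layer widths and the fusion of the
# two attentions are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

class PromotionAttentionSketch(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatially, excite per channel.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: one map over H x W from pooled channel statistics.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)                       # reweight channels
        pooled = torch.cat([x.mean(1, keepdim=True),
                            x.max(1, keepdim=True).values], dim=1)
        return x * self.spatial_conv(pooled)              # reweight locations

feat = torch.randn(2, 64, 32, 32)
print(PromotionAttentionSketch(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```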
Expanding the Text Classification Toolbox with Cross-Lingual Embeddings
Title | Expanding the Text Classification Toolbox with Cross-Lingual Embeddings |
Authors | Meryem M’hamdi, Robert West, Andreea Hossmann, Michael Baeriswyl, Claudiu Musat |
Abstract | Most work in text classification and Natural Language Processing (NLP) focuses on English or a handful of other languages that have text corpora of hundreds of millions of words. This is creating a new version of the digital divide: the artificial intelligence (AI) divide. Transfer-based approaches, such as Cross-Lingual Text Classification (CLTC), the task of categorizing texts written in different languages into a common taxonomy, are a promising solution to the emerging AI divide. Recent work on CLTC has focused on demonstrating the benefits of using bilingual word embeddings as features, relegating the CLTC problem to a mere benchmark based on a simple averaged perceptron. In this paper, we explore more extensively and systematically two flavors of the CLTC problem: news topic classification and textual churn intent detection (TCID) in social media. In particular, we test the hypothesis that embeddings with context are more effective, by multi-tasking the learning of multilingual word embeddings and text classification; we explore neural architectures for CLTC; and we move from bi- to multilingual word embeddings. For all architectures, types of word embeddings and datasets, we observe a consistent gain in favor of multilingual joint training, especially for low-resource languages. |
Tasks | Intent Detection, Multilingual Word Embeddings, Text Classification, Word Embeddings |
Published | 2019-03-23 |
URL | http://arxiv.org/abs/1903.09878v2 |
http://arxiv.org/pdf/1903.09878v2.pdf | |
PWC | https://paperswithcode.com/paper/expanding-the-text-classification-toolbox |
Repo | |
Framework | |
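As a point of reference for the baseline the paper criticizes and improves on, the sketch below classifies documents from different languages with a single linear model over averaged word vectors that are assumed to live in one shared cross-lingual space. The tiny embedding table, labels, and documents are toy stand-ins, not real data.

```python
# Hedged sketch of the simplest CLTC baseline: average cross-lingually aligned
# word vectors and feed them to a linear classifier. The embedding table here
# is a toy stand-in for real aligned multilingual embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assume embeddings for different languages already live in one shared space.
emb = {
    "price":  np.array([0.9, 0.1]),   "cancel":   np.array([0.8, 0.2]),
    "prix":   np.array([0.88, 0.12]), "match":    np.array([0.1, 0.9]),
    "football": np.array([0.05, 0.95]), "équipe": np.array([0.12, 0.9]),
}

def doc_vector(tokens):
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

train_docs = [["price", "cancel"], ["cancel"], ["match", "football"], ["équipe"]]
labels = ["churn", "churn", "sports", "sports"]   # e.g. churn intent vs. topic
clf = LogisticRegression().fit([doc_vector(d) for d in train_docs], labels)

# A French document is classified with the same model, no retraining needed.
print(clf.predict([doc_vector(["prix", "cancel"])]))   # ['churn'] expected
```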
Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms
Title | Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms |
Authors | Karl M. Koerich, Mohammad Esmailpour, Sajjad Abdoli, Alceu S. Britto Jr., Alessandro L. Koerich |
Abstract | This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Several adversarial attacks commonly used on images are applied to Mel-frequency and short-time Fourier transform spectrograms, and the perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce perturbed spectrograms that are visually imperceptible to humans. Furthermore, the audio waveforms reconstructed from the perturbed spectrograms are also able to fool a 1D CNN trained on the original audio. Experimental results on a dataset of western music show that the 2D CNN achieves a mean accuracy of up to 81.87% on legitimate examples, which drops to 12.09% on adversarial examples. Likewise, the 1D CNN achieves a mean accuracy of up to 78.29% on original audio samples, which drops to 27.91% on adversarial audio waveforms reconstructed from the perturbed spectrograms. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10106v2 |
https://arxiv.org/pdf/1910.10106v2.pdf | |
PWC | https://paperswithcode.com/paper/cross-representation-transferability-of |
Repo | |
Framework | |
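The attack pipeline above starts from standard image attacks applied to spectrograms. The sketch below shows an FGSM step of that pipeline on a stand-in spectrogram and a toy 2D CNN; the waveform reconstruction stage (e.g. via Griffin-Lim) is only mentioned in a comment and not implemented.

```python
# Minimal FGSM sketch on a (mel) spectrogram classifier, illustrating the kind
# of perturbation studied above. The tiny CNN and random "spectrogram" are
# placeholders; reconstructing audio from the perturbed spectrogram
# (e.g. with Griffin-Lim) is the follow-up step and is not shown here.
import torch
import torch.nn as nn

model = nn.Sequential(                      # toy 2D CNN over 1 x 128 x 128 inputs
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)

spec = torch.rand(1, 1, 128, 128, requires_grad=True)   # stand-in spectrogram
label = torch.tensor([3])

loss = nn.functional.cross_entropy(model(spec), label)
loss.backward()

eps = 0.01                                   # perturbation budget
adv_spec = (spec + eps * spec.grad.sign()).clamp(0, 1).detach()
print((adv_spec - spec.detach()).abs().max())  # bounded by eps, visually imperceptible
```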
Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD
Title | Gradient Noise Convolution (GNC): Smoothing Loss Function for Distributed Large-Batch SGD |
Authors | Kosuke Haruki, Taiji Suzuki, Yohei Hamakawa, Takeshi Toda, Ryuji Sakai, Masahiro Ozawa, Mitsuhiro Kimura |
Abstract | Large-batch stochastic gradient descent (SGD) is widely used for training in distributed deep learning because of its training-time efficiency; however, extremely large-batch SGD leads to poor generalization and easily converges to sharp minima, which prevents naive large-scale data-parallel SGD (DP-SGD) from converging to good minima. To overcome this difficulty, we propose gradient noise convolution (GNC), which effectively smooths the sharper minima of the loss function. For DP-SGD, GNC utilizes the so-called gradient noise, which is induced by stochastic gradient variation and convolved with the loss function to produce a smoothing effect. GNC computation can be performed by simply computing the stochastic gradient on each parallel worker and merging them, and is therefore extremely easy to implement. Because it convolves the loss with the gradient noise, which tends to spread along sharper directions of the loss function, GNC can effectively smooth sharp minima and achieve better generalization, whereas isotropic random noise cannot. We empirically show this effect by comparing GNC with isotropic random noise, and show that it achieves state-of-the-art generalization performance for large-scale deep neural network optimization. |
Tasks | |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10822v1 |
https://arxiv.org/pdf/1906.10822v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-noise-convolution-gnc-smoothing-loss |
Repo | |
Framework | |
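The abstract describes using the variation of per-worker stochastic gradients as anisotropic noise that smooths the loss. The sketch below is one possible interpretation of that idea on a toy quadratic loss; the exact GNC update rule is not given in the abstract, so the perturb-then-step scheme here is an assumption for illustration only.

```python
# Hedged sketch of the gradient-noise idea on a toy quadratic loss: per-worker
# stochastic gradients are averaged, the deviation of one worker from the mean
# is used as anisotropic "gradient noise", and the gradient is evaluated at a
# point shifted by that noise (loss smoothing) before the update. This update
# rule is an interpretation for illustration, not the paper's exact algorithm.
import numpy as np

def grad(w, noise):
    """Gradient of the toy loss ||w||^2, plus minibatch-style noise."""
    return 2 * w + noise

rng = np.random.default_rng(0)
w = np.array([2.0, -3.0])                    # parameters
lr, n_workers = 0.1, 8

for step in range(100):
    worker_grads = np.stack([grad(w, 0.3 * rng.standard_normal(2))
                             for _ in range(n_workers)])
    mean_grad = worker_grads.mean(axis=0)
    gradient_noise = worker_grads[0] - mean_grad     # deviation of one worker
    smoothed_grad = grad(w + gradient_noise, 0.0)    # gradient at shifted point
    w -= lr * smoothed_grad                          # large-batch style update

print(w)    # approaches the minimum at the origin
```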
DASGAN – Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images
Title | DASGAN – Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images |
Authors | Ansh Kapil, Tobias Wiestler, Simon Lanzmich, Abraham Silva, Keith Steele, Marlon Rebelatto, Guenter Schmidt, Nicolas Brieu |
Abstract | The analysis of the tumor environment on digital histopathology slides is becoming key to understanding the immune response against cancer, supporting the development of novel immuno-therapies. We introduce here a novel deep learning solution to the related problem of tumor epithelium segmentation. While most existing deep learning segmentation approaches are trained on time-consuming and costly manual annotations in a single stain domain (PD-L1), we leverage semi-automatically labeled images from a second stain domain (Cytokeratin, CK). We introduce an end-to-end trainable network that jointly segments tumor epithelium on PD-L1 images while leveraging unpaired image-to-image translation between CK and PD-L1, thereby completely bypassing the need for serial sections or re-staining of slides. Extending the method to differentiate between PD-L1 positive and negative tumor epithelium regions enables the automated estimation of the PD-L1 Tumor Cell (TC) score. Quantitative experimental results demonstrate the accuracy of our approach against state-of-the-art segmentation methods. |
Tasks | Domain Adaptation, Image-to-Image Translation |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11118v1 |
https://arxiv.org/pdf/1906.11118v1.pdf | |
PWC | https://paperswithcode.com/paper/dasgan-joint-domain-adaptation-and |
Repo | |
Framework | |
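The final step mentioned above, estimating the PD-L1 Tumor Cell (TC) score from the positive/negative epithelium segmentation, can be approximated as an area ratio. The sketch below illustrates that scoring step on random placeholder masks; the area-ratio formula is an assumption for illustration, not necessarily the paper's exact estimator.

```python
# Hedged sketch of the scoring step: once epithelium has been segmented into
# PD-L1 positive and negative regions, the Tumor Cell (TC) score can be
# approximated as an area ratio. The random masks are placeholders and the
# area-ratio approximation is an assumption, not the paper's formula.
import numpy as np

rng = np.random.default_rng(42)
pos_mask = rng.random((512, 512)) > 0.7                  # PD-L1 positive epithelium
neg_mask = (rng.random((512, 512)) > 0.6) & ~pos_mask    # PD-L1 negative epithelium

tc_score = 100.0 * pos_mask.sum() / max(pos_mask.sum() + neg_mask.sum(), 1)
print(f"Estimated TC score: {tc_score:.1f}%")
```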
Robust Neural Machine Translation for Clean and Noisy Speech Transcripts
Title | Robust Neural Machine Translation for Clean and Noisy Speech Transcripts |
Authors | Mattia Antonino Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico |
Abstract | Neural machine translation models have been shown to achieve high quality when trained and fed with well-structured and punctuated input texts. Unfortunately, the latter condition is not met in spoken language translation, where the input is generated by an automatic speech recognition (ASR) system. In this paper, we study how to adapt a strong NMT system to make it robust to typical ASR errors. Since, in our application scenarios, transcripts might be post-edited by human experts, we propose adaptation strategies to train a single system that can translate either clean or noisy input with no supervision on the input type. Our experimental results on a public speech translation dataset show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial on test data of the same type, but produces a small degradation when translating clean text. Adapting on both clean and noisy variants of the same data leads to the best results on both input types. |
Tasks | Machine Translation, Speech Recognition |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10238v1 |
https://arxiv.org/pdf/1910.10238v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-neural-machine-translation-for-clean |
Repo | |
Framework | |
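The adaptation strategy above trains one model on both clean and noisy variants of the same data. As a rough illustration of how such a mixed fine-tuning set can be assembled, the sketch below pairs each clean source sentence with a crudely "ASR-like" copy (lowercased, unpunctuated) sharing the same target; the paper itself adapts on real ASR transcripts rather than simulated noise.

```python
# Hedged sketch of the data-mixing adaptation strategy: the same source
# sentence appears both as clean, punctuated text and as an ASR-style
# transcript, sharing one target translation, so a single model is fine-tuned
# on both input types. The corruption here is a crude stand-in for real ASR.
import re

def asr_like(text: str) -> str:
    return re.sub(r"[^\w\s]", "", text).lower()   # drop punctuation and casing

clean_pairs = [("How are you today?", "Wie geht es dir heute?")]
mixed_pairs = []
for src, tgt in clean_pairs:
    mixed_pairs.append((src, tgt))            # clean variant
    mixed_pairs.append((asr_like(src), tgt))  # noisy variant, same target
print(mixed_pairs)
```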
Movie Recommender Systems: Implementation and Performance Evaluation
Title | Movie Recommender Systems: Implementation and Performance Evaluation |
Authors | Mojdeh Saadati, Syed Shihab, Mohammed Shaiqur Rahman |
Abstract | Over the years, explosive growth in the number of items in the catalogs of e-commerce businesses, such as Amazon, Netflix, Pandora, etc., has warranted the development of recommender systems to guide consumers towards their desired products based on their preferences and tastes. Some of the popular approaches for building recommender systems from user-derived input datasets are content-based systems, collaborative filtering, latent-factor systems using Singular Value Decomposition (SVD), and Restricted Boltzmann Machines (RBM). In this project, user-user collaborative filtering, item-item collaborative filtering, content-based recommendation, SVD, and neural networks were implemented in Python to predict the user ratings of unwatched movies for each user, and their performances were evaluated and compared. |
Tasks | Recommendation Systems |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.12749v1 |
https://arxiv.org/pdf/1909.12749v1.pdf | |
PWC | https://paperswithcode.com/paper/movie-recommender-systems-implementation-and |
Repo | |
Framework | |
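Of the approaches listed above, the latent-factor (SVD) method is the most compact to illustrate. The sketch below factorizes a tiny mean-centred rating matrix with a truncated SVD and predicts an unobserved rating; the matrix, rank, and mean-centring choice are illustrative, not the project's exact setup.

```python
# Minimal sketch of the latent-factor approach: truncated-SVD factors on a
# mean-centred user-item rating matrix, used to predict unseen ratings.
import numpy as np

R = np.array([[5, 4, 0, 1],     # rows: users, cols: movies, 0 = unrated
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

mask = R > 0
user_means = R.sum(axis=1) / mask.sum(axis=1)       # mean of observed ratings
centred = np.where(mask, R - user_means[:, None], 0)

U, s, Vt = np.linalg.svd(centred, full_matrices=False)
k = 2                                               # latent factor rank
pred = user_means[:, None] + U[:, :k] * s[:k] @ Vt[:k]

print(np.round(pred[0, 2], 2))   # predicted rating of user 0 for movie 2
```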
Hardening Random Forest Cyber Detectors Against Adversarial Attacks
Title | Hardening Random Forest Cyber Detectors Against Adversarial Attacks |
Authors | Giovanni Apruzzese, Mauro Andreolini, Michele Colajanni, Mirco Marchetti |
Abstract | Machine learning algorithms are effective in several applications, but they are not as successful when applied to intrusion detection in cyber security. Due to their high sensitivity to training data, cyber detectors based on machine learning are vulnerable to targeted adversarial attacks that involve the perturbation of initial samples. Existing defenses either assume unrealistic scenarios, produce underwhelming results in non-adversarial settings, or can be applied only to machine learning algorithms that perform poorly for cyber security. We present an original methodology for countering adversarial perturbations targeting intrusion detection systems based on random forests. As a practical application, we integrate the proposed defense method into a cyber detector analyzing network traffic. The experimental results on millions of labelled network flows show that the new detector has a twofold value: it outperforms state-of-the-art detectors that are subject to adversarial attacks, and it exhibits robust results both in adversarial and non-adversarial scenarios. |
Tasks | Intrusion Detection |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.03790v1 |
https://arxiv.org/pdf/1912.03790v1.pdf | |
PWC | https://paperswithcode.com/paper/hardening-random-forest-cyber-detectors |
Repo | |
Framework | |
MVF-Net: Multi-View 3D Face Morphable Model Regression
Title | MVF-Net: Multi-View 3D Face Morphable Model Regression |
Authors | Fanzi Wu, Linchao Bao, Yajing Chen, Yonggen Ling, Yibing Song, Songnan Li, King Ngi Ngan, Wei Liu |
Abstract | We address the problem of recovering the 3D geometry of a human face from a set of facial images in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. There is an inherent drawback in the single-view setting: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. A novel approach is proposed to regress 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multi-view geometric constraints are incorporated into the network by establishing dense correspondences between different views, leveraging a novel self-supervised view alignment loss. The main ingredient of the view alignment loss is a differentiable dense optical flow estimator that can backpropagate the alignment errors between an input view and a synthetic rendering from another input view, which is projected to the target view through the 3D shape to be inferred. By minimizing the view alignment loss, better 3D shapes can be recovered such that the synthetic projections from one view to another better align with the observed images. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM methods. |
Tasks | Optical Flow Estimation |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04473v1 |
http://arxiv.org/pdf/1904.04473v1.pdf | |
PWC | https://paperswithcode.com/paper/mvf-net-multi-view-3d-face-morphable-model |
Repo | |
Framework | |
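The core ingredient named above is a differentiable dense optical-flow warp whose alignment error can be backpropagated. The sketch below implements such a differentiable backward warp with PyTorch's grid_sample and an L1 alignment loss; the random images and zero flow are placeholders, since in the paper the flow is tied to the rendered 3DMM geometry.

```python
# Hedged sketch of the core of a view-alignment loss: a dense flow field warps
# one view towards another, and the photometric difference is the alignment
# error. Random images and zero flow are placeholders for illustration.
import torch
import torch.nn.functional as F

def warp(image, flow):
    """Backward-warp `image` (N,C,H,W) by a per-pixel `flow` (N,2,H,W)."""
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack([xs, ys], dim=0).float().unsqueeze(0)        # (1,2,H,W)
    coords = base + flow
    # Normalise pixel coordinates to [-1, 1] for grid_sample.
    coords_x = 2 * coords[:, 0] / (w - 1) - 1
    coords_y = 2 * coords[:, 1] / (h - 1) - 1
    grid = torch.stack([coords_x, coords_y], dim=-1)                # (N,H,W,2)
    return F.grid_sample(image, grid, align_corners=True)

view_a = torch.rand(1, 3, 64, 64)
view_b = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64, requires_grad=True)    # dense flow (dx, dy)
alignment_loss = (warp(view_b, flow) - view_a).abs().mean()
alignment_loss.backward()                                # gradients reach the flow
print(alignment_loss.item())
```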
PageRank algorithm for Directed Hypergraph
Title | PageRank algorithm for Directed Hypergraph |
Authors | Loc Tran, Tho Quan, An Mai |
Abstract | During the last two decades, the World Wide Web’s link structure has commonly been modeled as a directed graph. In this paper, we instead model the World Wide Web’s link structure as a directed hypergraph and develop the PageRank algorithm for this directed hypergraph. Due to the lack of World Wide Web directed hypergraph datasets, we apply the PageRank algorithm to a metabolic network, which is itself a directed hypergraph. The experiments show that our novel PageRank algorithm is successfully applied to this metabolic network. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1909.01132v1 |
https://arxiv.org/pdf/1909.01132v1.pdf | |
PWC | https://paperswithcode.com/paper/pagerank-algorithm-for-directed-hypergraph |
Repo | |
Framework | |
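Since the abstract does not spell out the transition rule, the sketch below shows one natural way to run PageRank-style power iteration on a directed hypergraph: mass leaving a vertex is split uniformly over the hyperedges whose tail contains it, and then uniformly over each hyperedge's head set. This construction is an assumption for illustration, not necessarily the paper's.

```python
# Hedged sketch of PageRank power iteration on a directed hypergraph.
# Each hyperedge has a tail set and a head set; the transition rule used
# here (uniform over outgoing hyperedges, then uniform over the head set)
# is an illustrative choice.
import numpy as np

n = 5
hyperedges = [({0, 1}, {2}), ({2}, {3, 4}), ({3}, {0}), ({4}, {0, 1})]

P = np.zeros((n, n))                       # P[i, j] = prob. of moving i -> j
for i in range(n):
    out_edges = [(tail, head) for tail, head in hyperedges if i in tail]
    if not out_edges:                      # dangling vertex: jump uniformly
        P[i] = 1.0 / n
        continue
    for tail, head in out_edges:
        for j in head:
            P[i, j] += 1.0 / (len(out_edges) * len(head))

damping, r = 0.85, np.full(n, 1.0 / n)
for _ in range(100):
    r = (1 - damping) / n + damping * r @ P
print(np.round(r, 3))                      # stationary importance scores
```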
Adaptive Trade-Offs in Off-Policy Learning
Title | Adaptive Trade-Offs in Off-Policy Learning |
Authors | Mark Rowland, Will Dabney, Rémi Munos |
Abstract | A great variety of off-policy learning algorithms exist in the literature, and new breakthroughs in this area continue to be made, improving theoretical understanding and yielding state-of-the-art reinforcement learning algorithms. In this paper, we take a unifying view of this space of algorithms, and consider their trade-offs of three fundamental quantities: update variance, fixed-point bias, and contraction rate. This leads to new perspectives of existing methods, and also naturally yields novel algorithms for off-policy evaluation and control. We develop one such algorithm, C-trace, demonstrating that it is able to more efficiently make these trade-offs than existing methods in use, and that it can be scaled to yield state-of-the-art performance in large-scale environments. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07478v1 |
https://arxiv.org/pdf/1910.07478v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-trade-offs-in-off-policy-learning |
Repo | |
Framework | |
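For context on the trade-offs discussed above, the sketch below computes a return-based off-policy correction with truncated per-step importance weights in the style of Retrace(lambda); the truncation is the classic knob that trades update variance against fixed-point bias and contraction speed. This is generic background, not the paper's C-trace algorithm, whose details the abstract does not give.

```python
# Retrace-style off-policy correction with truncated importance weights,
# c_t = lambda * min(1, pi/mu). Generic background for the family of
# algorithms the paper analyses, not the C-trace algorithm itself.
import numpy as np

rng = np.random.default_rng(0)
T, gamma, lam = 6, 0.99, 0.9
rewards = rng.random(T)
q = rng.random(T + 1)                      # current value estimates along the path
pi_probs = rng.uniform(0.2, 1.0, T)        # target-policy probs of taken actions
mu_probs = rng.uniform(0.2, 1.0, T)        # behaviour-policy probs of taken actions

traces = lam * np.minimum(1.0, pi_probs / mu_probs)   # truncation bounds variance
correction, acc = 0.0, 1.0
for t in range(T):
    td_error = rewards[t] + gamma * q[t + 1] - q[t]   # simple TD error proxy
    correction += (gamma ** t) * acc * td_error
    if t + 1 < T:
        acc *= traces[t + 1]               # running product of traces

print(q[0] + correction)                   # corrected off-policy return estimate
```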
Multi-Perspective, Simultaneous Embedding
Title | Multi-Perspective, Simultaneous Embedding |
Authors | Md Iqbal Hossain, Vahan Huroyan, Stephen Kobourov, Raymundo Navarrete |
Abstract | We describe a method for the simultaneous visualization of multiple pairwise distances in three-dimensional (3D) space. Given distance matrices that correspond to 2D projections of a 3D object (dataset), the goal is to recover the 3D object (dataset). We propose an approach that uses 3D to place the points, along with projections (planes) that preserve each of the given distance matrices. Our multi-perspective, simultaneous embedding (MPSE) method is based on non-linear dimensionality reduction that generalizes multidimensional scaling. We consider two versions of the problem: in the first, we are given the input distance matrices and the projections (e.g., if we have 3 different projections we can use the three orthogonal directions of the unit cube); in the second, we also compute the best projections as part of the optimization. We experimentally evaluate MPSE using synthetic datasets that illustrate the quality of the resulting solutions. Finally, we provide a functional prototype that implements both settings. |
Tasks | Dimensionality Reduction |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06485v1 |
https://arxiv.org/pdf/1909.06485v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-perspective-simultaneous-embedding |
Repo | |
Framework | |
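In the fixed-projection variant described above, MPSE minimizes an MDS-style stress summed over the given perspectives. The sketch below writes down that combined stress for two axis-aligned projections of a 3D point set; the random target distances are placeholders, and in practice the objective would then be minimized over the 3D positions (e.g. by gradient descent).

```python
# Hedged sketch of the fixed-projection MPSE objective: place points in 3D and
# measure a combined MDS-style stress under two given axis-aligned projections.
# Random targets stand in for real input distance matrices.
import numpy as np

rng = np.random.default_rng(1)
n = 10
X = rng.standard_normal((n, 3))                       # 3D point positions

# Axis-aligned projections onto the xy- and xz-planes (given, not optimised).
projections = [np.array([[1, 0, 0], [0, 1, 0]]).T,
               np.array([[1, 0, 0], [0, 0, 1]]).T]
targets = [rng.random((n, n)) for _ in projections]   # toy distance matrices
targets = [(D + D.T) / 2 for D in targets]

def pairwise(P):
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(-1) + 1e-12)

def total_stress(X):
    return sum(((pairwise(X @ proj) - D) ** 2).sum()
               for proj, D in zip(projections, targets))

print(total_stress(X))     # the quantity MPSE minimises over the 3D positions
```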
3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders
Title | 3D High-Resolution Cardiac Segmentation Reconstruction from 2D Views using Conditional Variational Autoencoders |
Authors | Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni, Antonio de Marvao, Stuart A. Cook, Declan P. O’Regan, Daniel Rueckert |
Abstract | Accurate segmentation of heart structures imaged by cardiac MR is key for the quantitative analysis of pathology. High-resolution 3D MR sequences enable whole-heart structural imaging but are time-consuming, expensive to acquire and they often require long breath holds that are not suitable for patients. Consequently, multiplanar breath-hold 2D cine sequences are standard practice but are disadvantaged by lack of whole-heart coverage and low through-plane resolution. To address this, we propose a conditional variational autoencoder architecture able to learn a generative model of 3D high-resolution left ventricular (LV) segmentations which is conditioned on three 2D LV segmentations of one short-axis and two long-axis images. By only employing these three 2D segmentations, our model can efficiently reconstruct the 3D high-resolution LV segmentation of a subject. When evaluated on 400 unseen healthy volunteers, our model yielded an average Dice score of $87.92 \pm 0.15$ and outperformed competing architectures. |
Tasks | Cardiac Segmentation |
Published | 2019-02-28 |
URL | http://arxiv.org/abs/1902.11000v1 |
http://arxiv.org/pdf/1902.11000v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-high-resolution-cardiac-segmentation |
Repo | |
Framework | |
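The model above is a conditional variational autoencoder whose decoder is conditioned on the 2D-view segmentations. The sketch below shows the conditioning and the ELBO on flattened toy tensors with small MLPs; the paper's model is convolutional and operates on full 3D segmentations, so every shape and layer here is an illustrative assumption.

```python
# Minimal conditional-VAE sketch: a (flattened) 3D segmentation is encoded to a
# latent code, and the decoder is conditioned on an encoding of the 2D views.
# Shapes, MLP layers and the toy data are illustrative assumptions.
import torch
import torch.nn as nn

d_3d, d_2d, d_lat = 512, 128, 16            # flattened toy dimensionalities

encoder = nn.Sequential(nn.Linear(d_3d + d_2d, 64), nn.ReLU(),
                        nn.Linear(64, 2 * d_lat))
decoder = nn.Sequential(nn.Linear(d_lat + d_2d, 64), nn.ReLU(),
                        nn.Linear(64, d_3d))

seg_3d = torch.rand(8, d_3d)                # flattened 3D LV segmentations
views_2d = torch.rand(8, d_2d)              # encoding of the three 2D views

stats = encoder(torch.cat([seg_3d, views_2d], dim=1))
mu, log_var = stats.chunk(2, dim=1)
z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)    # reparameterisation

recon = torch.sigmoid(decoder(torch.cat([z, views_2d], dim=1)))
recon_loss = nn.functional.binary_cross_entropy(recon, (seg_3d > 0.5).float())
kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
loss = recon_loss + kl                       # ELBO to minimise
loss.backward()
print(loss.item())
```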
Deep Generative Quantile-Copula Models for Probabilistic Forecasting
Title | Deep Generative Quantile-Copula Models for Probabilistic Forecasting |
Authors | Ruofeng Wen, Kari Torkkola |
Abstract | We introduce a new category of multivariate conditional generative models and demonstrate its performance and versatility in probabilistic time series forecasting and simulation. Specifically, the output of quantile regression networks is expanded from a set of fixed quantiles to the whole Quantile Function by a univariate mapping from a latent uniform distribution to the target distribution. Then the multivariate case is solved by learning such quantile functions for each dimension’s marginal distribution, followed by estimating a conditional Copula to associate these latent uniform random variables. The quantile functions and copula, together defining the joint predictive distribution, can be parameterized by a single implicit generative Deep Neural Network. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10697v1 |
https://arxiv.org/pdf/1907.10697v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-generative-quantile-copula-models-for |
Repo | |
Framework | |
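The generative scheme above samples dependent uniforms through a copula and maps each dimension through its marginal quantile function. The sketch below reproduces that sampling recipe with a fixed Gaussian copula and closed-form quantile functions standing in for the learned quantile networks and the conditional copula.

```python
# Hedged sketch of quantile-copula sampling: draw correlated latent uniforms
# through a Gaussian copula, then push each dimension through its marginal
# quantile function. Closed-form quantile functions and a fixed correlation
# stand in for the learned, conditional components described above.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.7],
                 [0.7, 1.0]])                 # copula correlation (assumed)
L = np.linalg.cholesky(corr)

z = rng.standard_normal((10_000, 2)) @ L.T    # correlated Gaussians
u = norm.cdf(z)                               # latent uniforms with dependence

# Marginal quantile functions (stand-ins for the learned networks).
q1 = lambda u: -np.log1p(-u) / 0.5            # Exponential(rate=0.5) quantile fn
q2 = lambda u: norm.ppf(u, loc=10.0, scale=2) # Normal(10, 2) quantile fn

samples = np.column_stack([q1(u[:, 0]), q2(u[:, 1])])
print(np.corrcoef(samples.T)[0, 1])           # dependence carried by the copula
```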
A Practical Framework for Solving Center-Based Clustering with Outliers
Title | A Practical Framework for Solving Center-Based Clustering with Outliers |
Authors | Hu Ding, Haikuo Yu |
Abstract | Clustering has many important applications in computer science, but real-world datasets often contain outliers, and their existence can make the clustering problems much more challenging. In this paper, we propose a practical framework for solving the problems of $k$-center/median/means clustering with outliers. The framework is actually very simple: we just take a small sample from the input and run an existing approximation algorithm on the sample. However, our analysis is fundamentally different from previous sampling-based ideas. In particular, the size of the sample is independent of the input data size and dimensionality. To explain the effectiveness of random sampling in theory, we introduce a ‘significance’ criterion and prove that the performance of our framework depends on the significance degree of the given instance. The result proposed in this paper falls under the umbrella of beyond-worst-case analysis for clustering with outliers. The experiments suggest that our framework can achieve clustering results comparable to existing methods while greatly reducing the running time. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10143v2 |
https://arxiv.org/pdf/1905.10143v2.pdf | |
PWC | https://paperswithcode.com/paper/a-practical-framework-for-solving-center |
Repo | |
Framework | |
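The framework above is simple enough to sketch end to end: sample a small subset, run an off-the-shelf approximation algorithm (here plain k-means) on the sample only, then assign every point to its nearest center and flag the z farthest points as outliers. The sample size, data, and choice of k-means are illustrative, not the paper's theoretical setting.

```python
# Hedged sketch of the sampling framework: cluster a small uniform sample,
# then assign all points to the nearest center and treat the z farthest
# points as outliers. Sample size and data are illustrative choices.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
inliers = rng.standard_normal((5_000, 2)) + rng.choice([-5, 0, 5], (5_000, 1))
outliers = rng.uniform(-30, 30, (50, 2))
data = np.vstack([inliers, outliers])

k, z, sample_size = 3, 50, 200                       # clusters, outliers, sample
sample = data[rng.choice(len(data), sample_size, replace=False)]
centers = KMeans(n_clusters=k, n_init=10, random_state=0).fit(sample).cluster_centers_

dists = np.linalg.norm(data[:, None, :] - centers[None], axis=2).min(axis=1)
flagged = np.argsort(dists)[-z:]                     # farthest points = outliers
print(f"{np.mean(flagged >= len(inliers)):.0%} of flagged points are true outliers")
```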