April 2, 2020

Paper Group ANR 250

Polarizing Front Ends for Robust CNNs. Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation. Exploring Visual Patterns in Projected Human and Machine Decision-Making Paths. Detection of Information Hiding at Anti-Copying 2D Barcodes. The Whole Is Greater Than the Sum of Its Nonrigid Parts. A Federated Learning Framewor …

Polarizing Front Ends for Robust CNNs


Title	Polarizing Front Ends for Robust CNNs
Authors	Can Bakiskan, Soorya Gopalakrishnan, Metehan Cekic, Upamanyu Madhow, Ramtin Pedarsani
Abstract	The vulnerability of deep neural networks to small, adversarially designed perturbations can be attributed to their “excessive linearity.” In this paper, we propose a bottom-up strategy for attenuating adversarial perturbations using a nonlinear front end which polarizes and quantizes the data. We observe that ideal polarization can be utilized to completely eliminate perturbations, develop algorithms to learn approximately polarizing bases for data, and investigate the effectiveness of the proposed strategy on the MNIST and Fashion MNIST datasets.
Tasks
Published	2020-02-22
URL	https://arxiv.org/abs/2002.09580v1
PDF	https://arxiv.org/pdf/2002.09580v1.pdf
PWC	https://paperswithcode.com/paper/polarizing-front-ends-for-robust-cnns
Repo
Framework

Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation


Title	Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation
Authors	Guohui Ding, Joewie J. Koh, Kelly Merckaert, Bram Vanderborght, Marco M. Nicotra, Christoffer Heckman, Alessandro Roncone, Lijun Chen
Abstract	We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL). We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL), where each agent applies Q-learning with individual reward functions; and game-theoretic RL (GT-RL), where the agents update their Q-values based on the Nash equilibrium of a bimatrix Q-value game. We validate the proposed approaches in the setting of cooperative object manipulation with two simulated robot arms. Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
Tasks	Q-Learning
Published	2020-03-21
URL	https://arxiv.org/abs/2003.09540v1
PDF	https://arxiv.org/pdf/2003.09540v1.pdf
PWC	https://paperswithcode.com/paper/distributed-reinforcement-learning-for-1
Repo
Framework
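
The abstract describes DA-RL as each agent applying Q-learning with its own reward function. A minimal sketch of that idea, assuming a tabular setting, could look like the following. The state names, action names, reward values, and hyperparameters are hypothetical and are not taken from the paper, and the game-theoretic GT-RL update is not shown.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Two independent learners, each with its own table and its own reward signal.
actions = ["push", "hold"]            # hypothetical action set
q_tables = [defaultdict(float), defaultdict(float)]
rewards = [1.0, 0.5]                  # hypothetical per-agent rewards for one transition

for q, r in zip(q_tables, rewards):
    q_update(q, state="s0", action="push", reward=r, next_state="s1", actions=actions)
```

Each agent keeps a separate table here, so no Q-values are shared between the two learners.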

Exploring Visual Patterns in Projected Human and Machine Decision-Making Paths


Title	Exploring Visual Patterns in Projected Human and Machine Decision-Making Paths
Authors	Andreas Hinterreiter, Christian Steinparz, Moritz Schöfl, Holger Stitz, Marc Streit
Abstract	In problem solving, the paths towards solutions can be viewed as a sequence of decisions. The decisions, made by humans or computers, describe a trajectory through a high-dimensional representation space of the problem. Using dimensionality reduction, these trajectories can be visualized in lower dimensional space. Such embedded trajectories have previously been applied to a wide variety of data, but so far, almost exclusively the self-similarity of single trajectories has been analyzed. In contrast, we describe patterns emerging from drawing many trajectories—for different initial conditions, end states, or solution strategies—in the same embedding space. We argue that general statements about the problem solving tasks and solving strategies can be made by interpreting these patterns. We explore and characterize such patterns in trajectories resulting from human and machine-made decisions in a variety of application domains: logic puzzles (Rubik’s cube), strategy games (chess), and optimization problems (neural network training). In the context of Rubik’s cube, we present a physical interactive demonstrator that uses trajectory visualization to provide immediate feedback to users regarding the consequences of their decisions. We also discuss the importance of suitably chosen representation spaces and similarity metrics for the embedding.
Tasks	Decision Making, Dimensionality Reduction
Published	2020-01-20
URL	https://arxiv.org/abs/2001.08372v1
PDF	https://arxiv.org/pdf/2001.08372v1.pdf
PWC	https://paperswithcode.com/paper/exploring-visual-patterns-in-projected-human
Repo
Framework

Detection of Information Hiding at Anti-Copying 2D Barcodes


Title	Detection of Information Hiding at Anti-Copying 2D Barcodes
Authors	Ning Xie, Ji Hu, Junjie Chen, Qiqi Zhang, Changsheng Chen
Abstract	This paper concerns the problem of detecting the use of information hiding at anti-copying 2D barcodes. Prior hidden information detection schemes are either heuristic-based or Machine Learning (ML) based. The key limitation of prior heuristic-based schemes is that they do not answer the fundamental question of why the information hidden at a 2D barcode can be detected. The key limitation of prior ML-based schemes is that they lack robustness because a printed 2D barcode is highly dependent on its environment, and thus an information hiding detection scheme trained in one environment often does not work well in another environment. In this paper, we propose two hidden information detection schemes for existing anti-copying 2D barcodes. The first scheme directly uses the pixel distance to detect the use of an information hiding scheme in a 2D barcode, and is referred to as the Pixel Distance Based Detection (PDBD) scheme. The second scheme first calculates the variance of the raw signal and the covariance between the recovered signal and the raw signal, and then, based on the variance results, detects the use of an information hiding scheme in a 2D barcode; it is referred to as the Pixel Variance Based Detection (PVBD) scheme. Moreover, we design advanced IC attacks to evaluate the security of two existing anti-copying 2D barcodes. We implemented our schemes and conducted an extensive performance comparison between our schemes and prior schemes under different capturing devices, such as a scanner and a camera phone. Our experimental results show that the PVBD scheme can correctly detect the existence of the hidden information at both the 2LQR code and the LCAC 2D barcode. Moreover, the success probability of our IC attacks reaches 0.6538 for the 2LQR code and 1 for the LCAC 2D barcode.
Tasks
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09316v1
PDF	https://arxiv.org/pdf/2003.09316v1.pdf
PWC	https://paperswithcode.com/paper/detection-of-information-hiding-at-anti
Repo
Framework

The Whole Is Greater Than the Sum of Its Nonrigid Parts


Title	The Whole Is Greater Than the Sum of Its Nonrigid Parts
Authors	Oshri Halimi, Ido Imanuel, Or Litany, Giovanni Trappolini, Emanuele Rodolà, Leonidas Guibas, Ron Kimmel
Abstract	According to Aristotle, a philosopher in Ancient Greece, “the whole is greater than the sum of its parts”. This observation was adopted to explain human perception by the Gestalt psychology school of thought in the twentieth century. Here, we claim that observing part of an object which was previously acquired as a whole, one could deal with both partial matching and shape completion in a holistic manner. More specifically, given the geometry of a full, articulated object in a given pose, as well as a partial scan of the same object in a different pose, we address the problem of matching the part to the whole while simultaneously reconstructing the new pose from its partial observation. Our approach is data-driven, and takes the form of a Siamese autoencoder without the requirement of a consistent vertex labeling at inference time; as such, it can be used on unorganized point clouds as well as on triangle meshes. We demonstrate the practical effectiveness of our model in the applications of single-view deformable shape completion and dense shape correspondence, both on synthetic and real-world geometric data, where we outperform prior work on these tasks by a large margin.
Tasks
Published	2020-01-27
URL	https://arxiv.org/abs/2001.09650v1
PDF	https://arxiv.org/pdf/2001.09650v1.pdf
PWC	https://paperswithcode.com/paper/the-whole-is-greater-than-the-sum-of-its-1
Repo
Framework

A Federated Learning Framework for Privacy-preserving and Parallel Training


Title	A Federated Learning Framework for Privacy-preserving and Parallel Training
Authors	Tien-Dung Cao, Tram Truong-Huu, Hien Tran, Khanh Tran
Abstract	The deployment of deep learning in practice has been hindered by two issues: the computational cost of model training and the privacy of training data such as medical or healthcare records. The large size of both learning models and datasets incurs a massive computational cost, requiring efficient approaches to speed up the training phase. While parallel and distributed learning can address the issue of computational overhead, preserving the privacy of training data and intermediate results (e.g., gradients) remains a hard problem. Enabling parallel training of deep learning models on distributed datasets while preserving data privacy is even more complex and challenging. In this paper, we develop and implement FEDF, a distributed deep learning framework for privacy-preserving and parallel training. The framework allows a model to be learned on multiple geographically distributed training datasets (which may belong to different owners) without revealing any information about each dataset or the intermediate results. We formally prove the convergence of the learning model when training with the developed framework, as well as its privacy-preserving property. We carry out extensive experiments to evaluate the performance of the framework in terms of speedup ratio, the approximation to the upper-bound performance (when training centrally), and communication overhead between the master and training workers. The results show that the developed framework achieves a speedup of up to 9x compared to the centralized training approach while keeping the performance of the models within 4.5% of the centrally trained models. The proposed framework also reduces the amount of data exchanged between the master and training workers by up to 34% compared to existing work.
Tasks
Published	2020-01-22
URL	https://arxiv.org/abs/2001.09782v1
PDF	https://arxiv.org/pdf/2001.09782v1.pdf
PWC	https://paperswithcode.com/paper/a-federated-learning-framework-for-privacy
Repo
Framework

Speaker-aware speech-transformer


Title	Speaker-aware speech-transformer
Authors	Zhiyun Fan, Jie Li, Shiyu Zhou, Bo Xu
Abstract	Recently, end-to-end (E2E) models have become a competitive alternative to conventional hybrid automatic speech recognition (ASR) systems. However, they still suffer from speaker mismatch between training and testing conditions. In this paper, we use Speech-Transformer (ST) as the study platform to investigate speaker-aware training of E2E models. We propose a model called Speaker-Aware Speech-Transformer (SAST), which is a standard ST equipped with a speaker attention module (SAM). The SAM has a static speaker knowledge block (SKB) that is made of i-vectors. At each time step, the encoder output attends to the i-vectors in the block and generates a weighted combined speaker embedding vector, which helps the model to normalize the speaker variations. The SAST model trained in this way becomes independent of specific training speakers and thus generalizes better to unseen testing speakers. We investigate different factors of SAM. Experimental results on the AISHELL-1 task show that SAST achieves a relative 6.5% CER reduction (CERR) over the speaker-independent (SI) baseline. Moreover, we demonstrate that SAST still works quite well even if the i-vectors in the SKB all come from a data source other than the acoustic training set.
Tasks	Speech Recognition
Published	2020-01-02
URL	https://arxiv.org/abs/2001.01557v1
PDF	https://arxiv.org/pdf/2001.01557v1.pdf
PWC	https://paperswithcode.com/paper/speaker-aware-speech-transformer
Repo
Framework
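
The abstract says the encoder output attends to the i-vectors in the speaker knowledge block and produces a weighted combined speaker embedding. The sketch below illustrates that step with plain dot-product attention. The scoring function, the absence of learned projections, and the toy vectors are all assumptions, since the abstract does not specify them.

```python
import math


def attend(query, memory):
    """Dot-product attention: softmax(query . m_i) weights, then a weighted sum of rows.

    query  : one encoder output frame (list of floats)
    memory : rows standing in for i-vectors (same dimension as query in this sketch)
    """
    scores = [sum(q * m for q, m in zip(query, row)) for row in memory]
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[0])
    combined = [sum(w * row[d] for w, row in zip(weights, memory)) for d in range(dim)]
    return weights, combined


# Toy knowledge block with two 2-dimensional "i-vectors" (hypothetical values).
block = [[1.0, 0.0], [0.0, 1.0]]
weights, speaker_embedding = attend([0.0, 0.0], block)
```

A real model would typically project the encoder output and the i-vectors into a shared space before scoring; that projection is omitted here.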

ROAM: Random Layer Mixup for Semi-Supervised Learning in Medical Imaging


Title	ROAM: Random Layer Mixup for Semi-Supervised Learning in Medical Imaging
Authors	Tariq Bdair, Nassir Navab, Shadi Albarqouni
Abstract	Medical image segmentation is one of the major challenges addressed by machine learning methods. Yet deep learning methods depend heavily on large amounts of annotated data, which are time-consuming and costly to obtain. Semi-supervised learning methods approach this problem by leveraging an abundant amount of unlabeled data along with a small amount of labeled data in the training process. Recently, the MixUp regularizer [32] has been successfully introduced to semi-supervised learning methods, showing superior performance [3]. MixUp augments the model with new data points through linear interpolation of the data at the input space. In this paper, we argue that this option is limited; instead, we propose ROAM, a random layer mixup, which encourages the network to be less confident for interpolated data points at a randomly selected space. This avoids over-fitting and enhances the generalization ability. We validate our method on publicly available datasets on whole-brain image segmentation (MALC), achieving state-of-the-art results in fully supervised (89.8%) and semi-supervised (87.2%) settings with relative improvements of up to 2.75% and 16.73%, respectively.
Tasks	Brain Image Segmentation, Medical Image Segmentation, Semantic Segmentation
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09439v1
PDF	https://arxiv.org/pdf/2003.09439v1.pdf
PWC	https://paperswithcode.com/paper/roam-random-layer-mixup-for-semi-supervised
Repo
Framework
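
The abstract describes MixUp as linear interpolation of data points at the input space. A minimal sketch of that interpolation follows. The Beta(0.4, 0.4) draw for the mixing coefficient is an assumed, commonly used choice and is not stated in the abstract. ROAM itself would apply the same kind of interpolation at a randomly selected layer rather than only at the input, which is not shown here.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their label vectors: lam * a + (1 - lam) * b."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


# Mixing coefficient drawn from a Beta distribution (assumed hyperparameters).
lam = random.betavariate(0.4, 0.4)
x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam)
```

Because the labels are blended with the same coefficient as the inputs, the mixed label vector still sums to one when both inputs are one-hot.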

Automatic segmentation of spinal multiple sclerosis lesions: How to generalize across MRI contrasts?


Title	Automatic segmentation of spinal multiple sclerosis lesions: How to generalize across MRI contrasts?
Authors	Olivier Vincent, Charley Gros, Joseph Paul Cohen, Julien Cohen-Adad
Abstract	Despite recent improvements in medical image segmentation, the ability to generalize across imaging contrasts remains an open issue. To tackle this challenge, we implement Feature-wise Linear Modulation (FiLM) to leverage physics knowledge within the segmentation model and learn the characteristics of each contrast. Interestingly, a well-optimised U-Net reached the same performance as our FiLMed-Unet on a multi-contrast dataset (0.72 of Dice score), which suggests that there is a bottleneck in spinal MS lesion segmentation different from the generalization across varying contrasts. This bottleneck likely stems from inter-rater variability, which is estimated at 0.61 of Dice score in our dataset.
Tasks	Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published	2020-03-09
URL	https://arxiv.org/abs/2003.04377v2
PDF	https://arxiv.org/pdf/2003.04377v2.pdf
PWC	https://paperswithcode.com/paper/automatic-segmentation-of-spinal-multiple
Repo
Framework
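
Both figures quoted in the abstract (0.72 for the models and 0.61 for inter-rater agreement) are Dice scores. For reference, the Dice score between two binary masks can be computed as in the sketch below. Returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_score(pred, target):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for flat binary masks of 0s and 1s."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Two toy 4-voxel masks that overlap on one voxel.
score = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```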

Rethinking Curriculum Learning with Incremental Labels and Adaptive Compensation


Title	Rethinking Curriculum Learning with Incremental Labels and Adaptive Compensation
Authors	Madan Ravi Ganesh, Jason J. Corso
Abstract	Like humans, deep networks learn better when samples are organized and introduced in a meaningful order or curriculum (Weinshall et al., 2018). While conventional approaches to curriculum learning emphasize the difficulty of samples as the core incremental strategy, they force networks to learn from small subsets of data while introducing pre-computation overheads. In this work, we propose Learning with Incremental Labels and Adaptive Compensation (LILAC), which takes a novel approach to curriculum learning. LILAC emphasizes incrementally learning labels instead of incrementally learning difficult samples. It works in two distinct phases: first, in the incremental label introduction phase, we recursively reveal ground-truth labels in small installments while using a fake label for the remaining data. Second, in the adaptive compensation phase, we compensate for failed predictions by adaptively altering the target vector to a smoother distribution. We evaluate LILAC against the closest comparable methods in batch and curriculum learning and label smoothing, across three standard image benchmarks: CIFAR-10, CIFAR-100, and STL-10. We show that our method outperforms batch learning with higher mean recognition accuracy as well as lower standard deviation in performance consistently across all benchmarks. We further extend LILAC to show the highest performance on CIFAR-10 for methods using simple data augmentation while exhibiting label-order invariance among other properties.
Tasks	Data Augmentation
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04529v1
PDF	https://arxiv.org/pdf/2001.04529v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-curriculum-learning-with-1
Repo
Framework
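
The adaptive compensation phase is described as altering the target vector to a smoother distribution for failed predictions. The sketch below uses standard label smoothing as a stand-in for that smoother target. The exact smoothing rule and the value of `eps` used by LILAC are not given in the abstract, so both are assumptions.

```python
def smooth_target(num_classes, true_class, eps=0.1):
    """Standard label smoothing: (1 - eps) on the true class plus eps / K on every class."""
    off = eps / num_classes
    target = [off] * num_classes
    target[true_class] += 1.0 - eps
    return target


def compensated_target(num_classes, true_class, predicted_class, eps=0.1):
    """Keep the one-hot target for correct predictions and smooth it for failed ones."""
    if predicted_class == true_class:
        return [1.0 if c == true_class else 0.0 for c in range(num_classes)]
    return smooth_target(num_classes, true_class, eps)


# A sample of class 2 that the network misclassified as class 0 (hypothetical values).
target = compensated_target(num_classes=4, true_class=2, predicted_class=0, eps=0.2)
```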

Deep Affinity Net: Instance Segmentation via Affinity


Title	Deep Affinity Net: Instance Segmentation via Affinity
Authors	Xingqian Xu, Mang Tik Chiu, Thomas S. Huang, Honghui Shi
Abstract	Most of the modern instance segmentation approaches fall into two categories: region-based approaches in which object bounding boxes are detected first and later used in cropping and segmenting instances; and keypoint-based approaches in which individual instances are represented by a set of keypoints followed by a dense pixel clustering around those keypoints. Despite the maturity of these two paradigms, we would like to report an alternative affinity-based paradigm where instances are segmented based on densely predicted affinities and graph partitioning algorithms. Such affinity-based approaches indicate that high-level graph features other than regions or keypoints can be directly applied in the instance segmentation task. In this work, we propose Deep Affinity Net, an effective affinity-based approach accompanied with a new graph partitioning algorithm Cascade-GAEC. Without bells and whistles, our end-to-end model results in 32.4% AP on Cityscapes val and 27.5% AP on test. It achieves the best single-shot result as well as the fastest running time among all affinity-based models. It also outperforms the region-based method Mask R-CNN.
Tasks	graph partitioning, Instance Segmentation, Semantic Segmentation
Published	2020-03-15
URL	https://arxiv.org/abs/2003.06849v1
PDF	https://arxiv.org/pdf/2003.06849v1.pdf
PWC	https://paperswithcode.com/paper/deep-affinity-net-instance-segmentation-via
Repo
Framework

Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning


Title	Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning
Authors	Andreas Kirsch, Clare Lyle, Yarin Gal
Abstract	The information bottleneck (IB) principle offers both a mechanism to explain how deep neural networks train and generalize, as well as a regularized objective with which to train models. However, multiple competing objectives have been proposed based on this principle. Moreover, the information-theoretic quantities in the objective are difficult to compute for large deep neural networks, and this limits its use as a training objective. In this work, we review these quantities, compare and unify previously proposed objectives and relate them to surrogate objectives more friendly to optimization. We find that these surrogate objectives allow us to apply the information bottleneck to modern neural network architectures. We demonstrate our insights on Permutation-MNIST, MNIST and CIFAR10.
Tasks
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12537v1
PDF	https://arxiv.org/pdf/2003.12537v1.pdf
PWC	https://paperswithcode.com/paper/unpacking-information-bottlenecks-unifying
Repo
Framework

XSepConv: Extremely Separated Convolution


Title	XSepConv: Extremely Separated Convolution
Authors	Jiarong Chen, Zongqing Lu, Jing-Hao Xue, Qingmin Liao
Abstract	Depthwise convolution has gradually become an indispensable operation for modern efficient neural networks and larger kernel sizes ($\ge5$) have been applied to it recently. In this paper, we propose a novel extremely separated convolutional block (XSepConv), which fuses spatially separable convolutions into depthwise convolution to further reduce both the computational cost and parameter size of large kernels. Furthermore, an extra $2\times2$ depthwise convolution coupled with improved symmetric padding strategy is employed to compensate for the side effect brought by spatially separable convolutions. XSepConv is designed to be an efficient alternative to vanilla depthwise convolution with large kernel sizes. To verify this, we use XSepConv for the state-of-the-art architecture MobileNetV3-Small and carry out extensive experiments on four highly competitive benchmark datasets (CIFAR-10, CIFAR-100, SVHN and Tiny-ImageNet) to demonstrate that XSepConv can indeed strike a better trade-off between accuracy and efficiency.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12046v1
PDF	https://arxiv.org/pdf/2002.12046v1.pdf
PWC	https://paperswithcode.com/paper/xsepconv-extremely-separated-convolution
Repo
Framework
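
To see why separating a large depthwise kernel reduces parameter size, a rough count helps. The sketch below compares a k×k depthwise kernel with a 1×k and k×1 pair plus the extra 2×2 depthwise kernel mentioned in the abstract. It ignores biases and normalization layers, and the exact layer arrangement inside XSepConv is an assumption.

```python
def depthwise_params(channels, k):
    """Weights in a k x k depthwise convolution: one k x k kernel per channel."""
    return channels * k * k


def separated_depthwise_params(channels, k, extra=2):
    """Weights in a 1 x k plus k x 1 depthwise pair and an extra x extra depthwise kernel."""
    return channels * (2 * k + extra * extra)


# Hypothetical layer with 16 channels.
for k in (5, 7):
    full = depthwise_params(16, k)
    separated = separated_depthwise_params(16, k)
    print(f"k={k}: depthwise={full}, separated={separated}")
```

The per-channel cost grows as k² for the full kernel and as 2k + 4 for the separated form, so the gap widens as the kernel size increases.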

Identification of Choquet capacity in multicriteria sorting problems through stochastic inverse analysis


Title	Identification of Choquet capacity in multicriteria sorting problems through stochastic inverse analysis
Authors	Renata Pelissari, Leonardo Tomazeli Duarte
Abstract	In multicriteria decision aiding (MCDA), the Choquet integral has been used as an aggregation operator to deal with the case of interacting decision criteria. While the application of the Choquet integral to ranking problems has been receiving most of the attention, this paper instead focuses on multicriteria sorting problems (MCSP). In the Choquet integral context, a practical problem that arises is related to the elicitation of parameters known as the Choquet capacities. We address the problem of Choquet capacity identification for MCSP by applying the Stochastic Acceptability Multicriteria Analysis (SMAA), proposing the SMAA-S-Choquet method. The proposed method is also able to model uncertain data that may be present in both the decision matrix and the limiting profiles, the latter being a parameter associated with the sorting problematic. We also introduce two new descriptive measures in order to conduct reverse analysis regarding the capacities: the Scenario Acceptability Index and the Scenario Central Capacity vector.
Tasks
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12530v1
PDF	https://arxiv.org/pdf/2003.12530v1.pdf
PWC	https://paperswithcode.com/paper/identification-of-choquet-capacity-in
Repo
Framework
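
For readers unfamiliar with the aggregation operator, the discrete Choquet integral can be sketched as follows. The criterion names, scores, and capacity values are hypothetical, scores are assumed to be non-negative, and the SMAA-based capacity identification proposed in the paper is not shown.

```python
def choquet_integral(values, capacity):
    """Discrete Choquet integral of non-negative scores.

    values   : dict mapping criterion -> score
    capacity : dict mapping frozenset of criteria -> weight in [0, 1]
    Scores are sorted in ascending order, and each increment is weighted by the
    capacity of the coalition of criteria whose score is at least that high.
    """
    order = sorted(values, key=values.get)
    total, previous = 0.0, 0.0
    for i, criterion in enumerate(order):
        coalition = frozenset(order[i:])
        total += (values[criterion] - previous) * capacity[coalition]
        previous = values[criterion]
    return total


# Two interacting criteria with a non-additive capacity (hypothetical numbers).
scores = {"a": 0.2, "b": 0.6}
capacity = {
    frozenset(): 0.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
result = choquet_integral(scores, capacity)
```

When the capacity is additive, the integral reduces to an ordinary weighted sum, which is a quick sanity check for any implementation.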

Introduction to Rare-Event Predictive Modeling for Inferential Statisticians – A Hands-On Application in the Prediction of Breakthrough Patents


Title	Introduction to Rare-Event Predictive Modeling for Inferential Statisticians – A Hands-On Application in the Prediction of Breakthrough Patents
Authors	Daniel Hain, Roman Jurowetzki
Abstract	Recent years have seen a substantial development of quantitative methods, mostly led by the computer science community with the goal of developing better machine learning applications, mainly focused on predictive modeling. However, economic, management, and technology forecasting research has up to now been hesitant to apply predictive modeling techniques and workflows. In this paper, we introduce a machine learning (ML) approach to quantitative analysis geared towards optimizing predictive performance, contrasting it with standard practices in inferential statistics, which focus on producing good parameter estimates. We discuss the potential synergies between the two fields against the backdrop of this, at first glance, “target-incompatibility”. We discuss fundamental concepts in predictive modeling, such as out-of-sample model validation, variable and model selection, generalization, and hyperparameter tuning procedures. We provide a hands-on introduction to predictive modeling for a quantitative social science audience, while aiming at demystifying computer science jargon. We use the example of “high-quality” patent identification, guiding the reader through various model classes and procedures for data pre-processing, modeling, and validation. We start off with more familiar, easy-to-interpret model classes (Logit and Elastic Nets), continue with less familiar non-parametric approaches (Classification Trees and Random Forest), and finally present artificial neural network architectures, first a simple feed-forward network and then a deep autoencoder geared towards anomaly detection. Instead of limiting ourselves to the introduction of standard ML techniques, we also present state-of-the-art yet approachable techniques from artificial neural networks and deep learning to predict rare phenomena of interest.
Tasks	Anomaly Detection, Model Selection
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13441v1
PDF	https://arxiv.org/pdf/2003.13441v1.pdf
PWC	https://paperswithcode.com/paper/introduction-to-rare-event-predictive
Repo
Framework