October 17, 2019

3234 words 16 mins read

Paper Group ANR 690

TS-Net: Combining modality specific and common features for multimodal patch matching. How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?. Adaptive Task Allocation for Mobile Edge Learning. AI Fairness for People with Disabilities: Point of View. In Defense of Single-column Networks for Crowd Counting. U-SLADS: Unsupe …

TS-Net: Combining modality specific and common features for multimodal patch matching

Title TS-Net: Combining modality specific and common features for multimodal patch matching
Authors Sovann En, Alexis Lechervy, Frédéric Jurie
Abstract Multimodal patch matching addresses the problem of finding the correspondences between image patches from two different modalities, e.g. RGB vs sketch or RGB vs near-infrared. The comparison of patches of different modalities can be done by discovering the information common to both modalities (Siamese-like approaches) or the modality-specific information (Pseudo-Siamese-like approaches). We observed that neither of these two scenarios is optimal. This motivates us to propose a three-stream architecture, dubbed TS-Net, combining the benefits of the two. In addition, we show that adding extra constraints in the intermediate layers of such networks further boosts the performance. Experiments on three multimodal datasets show significant performance gains in comparison with Siamese and Pseudo-Siamese networks.
Tasks
Published 2018-06-05
URL http://arxiv.org/abs/1806.01550v1
PDF http://arxiv.org/pdf/1806.01550v1.pdf
PWC https://paperswithcode.com/paper/ts-net-combining-modality-specific-and-common
Repo
Framework
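
As a rough illustration of the three-stream idea described above, here is a hedged PyTorch sketch with one weight-shared (Siamese) stream for common information, two modality-specific streams, and a small matching head. The layer sizes, the fusion by concatenation, and the `TSNetSketch`/`conv_branch` names are my own assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch of a three-stream (TS-Net-style) patch matcher in PyTorch.
# Layer sizes and the fusion head are illustrative assumptions, not the paper's design.
import torch
import torch.nn as nn

def conv_branch():
    # Small feature extractor; each stream instantiates its own copy.
    return nn.Sequential(
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(4),
        nn.Flatten(),
        nn.Linear(64 * 4 * 4, 128),
    )

class TSNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.siamese = conv_branch()          # shared weights: common information
        self.branch_a = conv_branch()         # modality-A specific
        self.branch_b = conv_branch()         # modality-B specific
        self.head = nn.Sequential(            # decides match / non-match
            nn.Linear(4 * 128, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, patch_a, patch_b):
        common = torch.cat([self.siamese(patch_a), self.siamese(patch_b)], dim=1)
        specific = torch.cat([self.branch_a(patch_a), self.branch_b(patch_b)], dim=1)
        return self.head(torch.cat([common, specific], dim=1))

# Usage: score = TSNetSketch()(torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64))
```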

How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?

Title How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?
Authors Shuoyang Ding, Kevin Duh
Abstract Using pre-trained word embeddings as the input layer is a common practice in many natural language processing (NLP) tasks, but it is largely neglected for neural machine translation (NMT). In this paper, we conducted a systematic analysis of the effect of using pre-trained source-side monolingual word embeddings in NMT. We compared several strategies, such as fixing or updating the embeddings during NMT training on varying amounts of data, and we also proposed a novel strategy called dual-embedding that blends the fixing and updating strategies. Our results suggest that pre-trained embeddings can be helpful if properly incorporated into NMT, especially when parallel data is limited or additional in-domain monolingual data is readily available.
Tasks Machine Translation, Word Embeddings
Published 2018-06-05
URL http://arxiv.org/abs/1806.01515v2
PDF http://arxiv.org/pdf/1806.01515v2.pdf
PWC https://paperswithcode.com/paper/how-do-source-side-monolingual-word
Repo
Framework
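
The dual-embedding strategy can be pictured as an input layer that combines a frozen pre-trained table with a trainable one. The sketch below uses concatenation and hypothetical dimensions; the paper's actual blending scheme may differ.

```python
# Illustrative dual-embedding input layer: one frozen pre-trained table plus one
# trainable table, concatenated per token. The blending by concatenation is an
# assumption; the paper's exact dual-embedding formulation may differ.
import torch
import torch.nn as nn

class DualEmbedding(nn.Module):
    def __init__(self, pretrained_weights: torch.Tensor, trainable_dim: int = 256):
        super().__init__()
        vocab_size, pre_dim = pretrained_weights.shape
        self.fixed = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)
        self.learned = nn.Embedding(vocab_size, trainable_dim)
        self.out_dim = pre_dim + trainable_dim

    def forward(self, token_ids):
        return torch.cat([self.fixed(token_ids), self.learned(token_ids)], dim=-1)

# Usage with random "pre-trained" vectors for a 10k-word vocabulary:
emb = DualEmbedding(torch.randn(10000, 300))
src = torch.randint(0, 10000, (4, 20))   # batch of 4 sentences, 20 tokens each
print(emb(src).shape)                    # torch.Size([4, 20, 556])
```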

Adaptive Task Allocation for Mobile Edge Learning

Title Adaptive Task Allocation for Mobile Edge Learning
Authors Umair Mohammad, Sameh Sorour
Abstract This paper aims to establish a new optimization paradigm for implementing realistic distributed learning algorithms, with performance guarantees, on wireless edge nodes with heterogeneous computing and communication capacities. We will refer to this new paradigm as `Mobile Edge Learning (MEL)'. The problem of dynamic task allocation for MEL is considered in this paper with the aim of maximizing the learning accuracy, while guaranteeing that the total times of data distribution/aggregation over heterogeneous channels, and local computing iterations at the heterogeneous nodes, are bounded by a preset duration. The problem is first formulated as a quadratically-constrained integer linear problem. Since it is NP-hard, the paper relaxes it into a non-convex problem over real variables. We then propose two solutions: one based on deriving analytical upper bounds on the optimal solution of this relaxed problem using Lagrangian analysis and KKT conditions, and one based on suggest-and-improve starting from equal batch allocation. The merits of these proposed solutions are exhibited by comparing their performance to both numerical approaches and the equal task allocation approach.
Tasks
Published 2018-11-09
URL http://arxiv.org/abs/1811.03748v2
PDF http://arxiv.org/pdf/1811.03748v2.pdf
PWC https://paperswithcode.com/paper/adaptive-task-allocation-for-mobile-edge
Repo
Framework
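
The core tension in the abstract, maximizing learning work under a preset time budget across heterogeneous nodes, can be illustrated with a toy allocation rule. The sketch below is not the paper's Lagrangian/KKT or suggest-and-improve solution; the linear time model and all the rates are invented purely for illustration.

```python
# Toy illustration of the allocation trade-off (NOT the paper's solvers): give each
# edge node a batch size proportional to how much data it can process and transmit
# within a preset deadline. Rates and the linear time model are assumptions.

def allocate_batches(total_samples, compute_rate, link_rate, bytes_per_sample, deadline):
    """compute_rate[k]: samples/s node k can process; link_rate[k]: bytes/s of its channel."""
    # Per-sample wall-clock cost at node k (compute + communication), linear-time assumption.
    per_sample_time = [1.0 / c + bytes_per_sample / r
                       for c, r in zip(compute_rate, link_rate)]
    capacity = [deadline / t for t in per_sample_time]   # max samples each node can finish in time
    total_capacity = sum(capacity)
    # Proportional split, capped by each node's own capacity.
    return [int(min(cap, total_samples * cap / total_capacity)) for cap in capacity]

# Three heterogeneous nodes, 10k samples, 1 KB per sample, 5-second deadline.
print(allocate_batches(10_000, [800, 400, 200], [2e6, 1e6, 5e5], 1_000, 5.0))
```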

AI Fairness for People with Disabilities: Point of View

Title AI Fairness for People with Disabilities: Point of View
Authors Shari Trewin
Abstract We consider how fair treatment in society for people with disabilities might be impacted by the rise in the use of artificial intelligence, and especially machine learning methods. We argue that fairness for people with disabilities is different from fairness for other protected attributes such as age, gender or race. One major difference is the extreme diversity of ways in which disabilities manifest and people adapt. Secondly, disability information is highly sensitive and not always shared, precisely because of the potential for discrimination. Given these differences, we explore definitions of fairness and how well they work in the disability space. Finally, we suggest ways of approaching fairness for people with disabilities in AI applications.
Tasks
Published 2018-11-26
URL http://arxiv.org/abs/1811.10670v1
PDF http://arxiv.org/pdf/1811.10670v1.pdf
PWC https://paperswithcode.com/paper/ai-fairness-for-people-with-disabilities
Repo
Framework

In Defense of Single-column Networks for Crowd Counting

Title In Defense of Single-column Networks for Crowd Counting
Authors Ze Wang, Zehao Xiao, Kai Xie, Qiang Qiu, Xiantong Zhen, Xianbin Cao
Abstract Crowd counting, usually addressed by density estimation, has become an increasingly important topic in computer vision due to its widespread applications in video surveillance, urban planning, and intelligence gathering. However, it is essentially a challenging task because of the greatly varied sizes of objects, coupled with severe occlusions and the vague appearance of extremely small individuals. Existing methods heavily rely on multi-column learning architectures to extract multi-scale features, which however suffer from heavy computational cost, an especially undesirable property for crowd counting. In this paper, we propose the single-column counting network (SCNet) for efficient crowd counting without relying on multi-column networks. SCNet consists of residual fusion modules (RFMs) for multi-scale feature extraction, a pyramid pooling module (PPM) for information fusion, and a sub-pixel convolutional module (SPCM) followed by a bilinear upsampling layer for resolution recovery. These modules enable our SCNet to fully capture multi-scale features in a compact single-column architecture and estimate high-resolution density maps in an efficient way. In addition, we provide a principled paradigm for density map generation and data augmentation for training, which further improves performance. Extensive experiments on three benchmark datasets show that our SCNet delivers new state-of-the-art performance and surpasses previous methods by large margins, demonstrating the great effectiveness of SCNet as a single-column network for crowd counting.
Tasks Crowd Counting, Data Augmentation, Density Estimation
Published 2018-08-18
URL http://arxiv.org/abs/1808.06133v1
PDF http://arxiv.org/pdf/1808.06133v1.pdf
PWC https://paperswithcode.com/paper/in-defense-of-single-column-networks-for
Repo
Framework
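
The resolution-recovery part of the pipeline (sub-pixel convolution followed by bilinear upsampling) is easy to sketch in PyTorch. The channel counts, the single 2x shuffle and the `SubPixelUpsample` name are assumptions; SCNet's RFM and PPM modules are not shown.

```python
# Minimal sketch of resolution recovery via sub-pixel convolution + bilinear upsampling.
# Channel counts and scale factors are illustrative, not SCNet's exact configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubPixelUpsample(nn.Module):
    def __init__(self, in_ch=64, scale=2):
        super().__init__()
        # Conv produces scale^2 x channels; PixelShuffle rearranges them into spatial resolution.
        self.conv = nn.Conv2d(in_ch, in_ch * scale * scale, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)
        self.to_density = nn.Conv2d(in_ch, 1, kernel_size=1)

    def forward(self, feats, out_size):
        x = self.shuffle(self.conv(feats))                                       # 2x sub-pixel upsampling
        x = F.interpolate(x, size=out_size, mode="bilinear", align_corners=False)
        return torch.relu(self.to_density(x))                                    # non-negative density map

# feats: backbone features at 1/8 resolution of a 512x512 image.
density = SubPixelUpsample()(torch.randn(1, 64, 64, 64), out_size=(512, 512))
print(density.shape)   # torch.Size([1, 1, 512, 512])
```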

U-SLADS: Unsupervised Learning Approach for Dynamic Dendrite Sampling

Title U-SLADS: Unsupervised Learning Approach for Dynamic Dendrite Sampling
Authors Yan Zhang, Xiang Huang, Nicola Ferrier, Emine B. Gulsoy, Charudatta Phatak
Abstract Novel data acquisition schemes have been an emerging need for scanning microscopy based imaging techniques to reduce the time in data acquisition and to minimize probing radiation in sample exposure. Various sparse sampling schemes have been studied and are ideally suited for such applications, where the images can be reconstructed from a sparse set of measurements. Dynamic sparse sampling methods, particularly supervised learning based iterative sampling algorithms, have shown promising results for sampling pixel locations on the edges or boundaries during imaging. However, dynamic sampling for imaging skeleton-like objects such as metal dendrites remains difficult. Here, we present a new unsupervised learning approach using Hierarchical Gaussian Mixture Models (HGMM) to dynamically sample metal dendrites. This technique is particularly useful when users are interested in fast imaging of the primary and secondary arms of metal dendrites during solidification processes in materials science.
Tasks
Published 2018-07-06
URL http://arxiv.org/abs/1807.02233v1
PDF http://arxiv.org/pdf/1807.02233v1.pdf
PWC https://paperswithcode.com/paper/u-slads-unsupervised-learning-approach-for
Repo
Framework
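
One way to picture a hierarchical Gaussian mixture over measured pixel locations is a two-level fit: coarse components for primary arms, refined per-component mixtures for secondary arms, with candidate sampling locations scored by their density under the hierarchy. The scikit-learn sketch below follows that reading and is not the paper's algorithm; the component counts and the acquisition heuristic are assumptions.

```python
# Hedged sketch of a two-level hierarchical Gaussian mixture (HGMM) on measured pixel
# coordinates, with a simple density-based heuristic for choosing the next sample points.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
measured_xy = rng.uniform(0, 256, size=(500, 2))   # stand-in for already-measured pixels

# Level 1: coarse components (think: primary arms).
top = GaussianMixture(n_components=4, random_state=0).fit(measured_xy)
labels = top.predict(measured_xy)

# Level 2: refine each coarse component with its own finer mixture (secondary arms).
children = [GaussianMixture(n_components=3, random_state=0).fit(measured_xy[labels == k])
            for k in range(top.n_components)]

# Score candidate locations: higher density under the hierarchy -> more likely dendrite,
# so sample there next (one simple acquisition heuristic, not the paper's rule).
candidates = rng.uniform(0, 256, size=(1000, 2))
density = np.max([c.score_samples(candidates) for c in children], axis=0)
next_points = candidates[np.argsort(density)[-10:]]
print(next_points.shape)   # (10, 2)
```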

Kernel Density Estimation-Based Markov Models with Hidden State

Title Kernel Density Estimation-Based Markov Models with Hidden State
Authors Gustav Eje Henter, Arne Leijon, W. Bastiaan Kleijn
Abstract We consider Markov models of stochastic processes where the next-step conditional distribution is defined by a kernel density estimator (KDE), similar to Markov forecast densities and certain time-series bootstrap schemes. The KDE Markov models (KDE-MMs) we discuss are nonlinear, nonparametric, fully probabilistic representations of stationary processes, based on techniques with strong asymptotic consistency properties. The models generate new data by concatenating points from the training data sequences in a context-sensitive manner, together with some additive driving noise. We present novel EM-type maximum-likelihood algorithms for data-driven bandwidth selection in KDE-MMs. Additionally, we augment the KDE-MMs with a hidden state, yielding a new model class, KDE-HMMs. The added state variable captures non-Markovian long memory and signal structure (e.g., slow oscillations), complementing the short-range dependences described by the Markov process. The resulting joint Markov and hidden-Markov structure is appealing for modelling complex real-world processes such as speech signals. We present guaranteed-ascent EM-update equations for model parameters in the case of Gaussian kernels, as well as relaxed update formulas that greatly accelerate training in practice. Experiments demonstrate increased held-out set probability for KDE-HMMs on several challenging natural and synthetic data series, compared to traditional techniques such as autoregressive models, HMMs, and their combinations.
Tasks Density Estimation, Time Series
Published 2018-07-30
URL http://arxiv.org/abs/1807.11320v1
PDF http://arxiv.org/pdf/1807.11320v1.pdf
PWC https://paperswithcode.com/paper/kernel-density-estimation-based-markov-models
Repo
Framework
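
The generation mechanism described in the abstract, context-sensitive concatenation of training points plus driving noise, can be sketched with a fixed-bandwidth Gaussian kernel over one-step transitions. The bandwidths here are hand-set for illustration; the paper learns them with EM-type algorithms, and the hidden-state KDE-HMM extension is not shown.

```python
# Minimal sketch of KDE Markov model sampling with Gaussian kernels: weight training
# transitions by how close their predecessor is to the current value, pick one, and
# continue from its successor plus driving noise. Bandwidth and noise are assumptions.
import numpy as np

rng = np.random.default_rng(1)
train = np.sin(np.linspace(0, 20, 400)) + 0.05 * rng.standard_normal(400)
prev, nxt = train[:-1], train[1:]                  # observed one-step transitions

def kde_mm_sample(x_t, bandwidth=0.1, noise=0.05):
    w = np.exp(-0.5 * ((prev - x_t) / bandwidth) ** 2)   # Gaussian kernel on the context
    w /= w.sum()
    j = rng.choice(len(prev), p=w)                 # context-sensitive choice of a transition
    return nxt[j] + noise * rng.standard_normal()  # successor plus additive driving noise

x = [train[0]]
for _ in range(200):
    x.append(kde_mm_sample(x[-1]))
print(len(x), round(float(np.mean(x)), 3))
```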

Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information

Title Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information
Authors Yannik Peeters, Arnoud V. den Boer, Michel Mandjes
Abstract Motivated by several practical applications, we consider assortment optimization over a continuous spectrum of products represented by the unit interval, where the seller’s problem consists of determining the optimal subset of products to offer to potential customers. To describe the relation between assortment and customer choice, we propose a probabilistic choice model that forms the continuous counterpart of the widely studied discrete multinomial logit model. We consider the seller’s problem under incomplete information, propose a stochastic-approximation type of policy, and show that its regret – its performance loss compared to the optimal policy – is only logarithmic in the time horizon. We complement this result by showing a matching lower bound on the regret of any policy, implying that our policy is asymptotically optimal. We then show that adding a capacity constraint significantly changes the structure of the problem, by constructing an instance in which the regret of any policy after $T$ time periods is bounded below by a positive constant times $T^{2/3}$. We propose a policy based on kernel-density estimation techniques, and show that its regret is bounded above by a constant times $T^{2/3}$. Numerical illustrations show that our policies outperform or are on par with alternatives based on discretizing the product space.
Tasks Density Estimation
Published 2018-07-17
URL https://arxiv.org/abs/1807.06497v3
PDF https://arxiv.org/pdf/1807.06497v3.pdf
PWC https://paperswithcode.com/paper/continuous-assortment-optimization-with-logit
Repo
Framework
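
For orientation, one natural way to write the continuous counterpart of the multinomial logit choice probabilities over an offered assortment $S \subseteq [0,1]$ is the following; this is my reading of the setup, not necessarily the paper's exact formulation:

$$
f(x \mid S) \;=\; \frac{e^{v(x)}}{1 + \int_{S} e^{v(y)}\,\mathrm{d}y} \quad (x \in S),
\qquad
\Pr(\text{no purchase} \mid S) \;=\; \frac{1}{1 + \int_{S} e^{v(y)}\,\mathrm{d}y},
$$

where $v$ is an unknown utility function over the product spectrum and $f(\cdot \mid S)$ is the purchase density over the offered set. The seller then seeks the assortment $S$ maximizing expected revenue $\int_S r(x)\, f(x \mid S)\,\mathrm{d}x$ for a revenue function $r$, without knowing $v$.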

Deep Neural Networks for ECG-free Cardiac Phase and End-Diastolic Frame Detection on Coronary Angiographies

Title Deep Neural Networks for ECG-free Cardiac Phase and End-Diastolic Frame Detection on Coronary Angiographies
Authors Costin Ciusdel, Alexandru Turcea, Andrei Puiu, Lucian Itu, Lucian Calmac, Emma Weiss, Cornelia Margineanu, Elisabeta Badila, Martin Berger, Thomas Redel, Tiziano Passerini, Mehmet Gulsun, Puneet Sharma
Abstract Invasive coronary angiography (ICA) is the gold standard in Coronary Artery Disease (CAD) imaging. Detection of the end-diastolic frame (EDF) and, in general, cardiac phase detection on each temporal frame of a coronary angiography acquisition is of significant importance for the anatomical and non-invasive functional assessment of CAD. This task is generally performed via manual frame selection or semi-automated selection based on simultaneously acquired ECG signals - thus introducing the requirement of simultaneous ECG recordings. We evaluate the performance of a purely image based workflow based on deep neural networks for fully automated cardiac phase and EDF detection on coronary angiographies. A first deep neural network (DNN), trained to detect coronary arteries, is employed to preselect a subset of frames in which coronary arteries are well visible. A second DNN predicts cardiac phase labels for each frame. ECG signals are used only in the training and evaluation phases of the second DNN, to provide ground truth labels for each angiographic frame. The networks were trained on 17800 coronary angiographies from 3900 patients and evaluated on 27900 coronary angiographies from 6250 patients. No exclusion criteria related to patient state, previous interventions, or pathology were formulated. Cardiac phase detection had an accuracy of 97.6%, a sensitivity of 97.6% and a specificity of 97.5% on the evaluation set. EDF prediction had a precision of 97.4% and a recall of 96.9%. Several sub-group analyses were performed, indicating that the cardiac phase detection performance is largely independent of acquisition angles and the heart rate of the patient. The execution time of cardiac phase detection for one angiographic series was on average less than five seconds on a standard workstation.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.02797v1
PDF http://arxiv.org/pdf/1811.02797v1.pdf
PWC https://paperswithcode.com/paper/deep-neural-networks-for-ecg-free-cardiac
Repo
Framework
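
The two-stage workflow in the abstract (frame preselection by a coronary-visibility network, then per-frame phase classification) can be sketched as a simple inference pipeline. Both tiny CNNs, the 0.5 visibility threshold and the two-class phase labels below are hypothetical stand-ins; the abstract does not specify the actual architectures.

```python
# Hedged sketch of the two-stage inference flow: a first network keeps frames where
# coronary arteries are visible, a second assigns a cardiac phase per kept frame.
import torch
import torch.nn as nn

def make_frame_classifier(num_outputs):
    # Tiny CNN stand-in for the paper's DNNs (real architectures are not given here).
    return nn.Sequential(
        nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, num_outputs),
    )

vessel_detector = make_frame_classifier(1)   # stage 1: coronary visibility score
phase_classifier = make_frame_classifier(2)  # stage 2: systole vs diastole label

@torch.no_grad()
def detect_phases(frames):                   # frames: (T, 1, H, W) angiography series
    visible = torch.sigmoid(vessel_detector(frames)).squeeze(1) > 0.5
    kept = frames[visible]
    phases = phase_classifier(kept).argmax(dim=1) if len(kept) else torch.empty(0, dtype=torch.long)
    return visible, phases                   # which frames were kept, and their phase labels

visible, phases = detect_phases(torch.randn(30, 1, 128, 128))
print(int(visible.sum()), phases.shape)
```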

Causal Inference via Kernel Deviance Measures

Title Causal Inference via Kernel Deviance Measures
Authors Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh
Abstract Discovering the causal structure among a set of variables is a fundamental problem in many areas of science. In this paper, we propose Kernel Conditional Deviance for Causal Inference (KCDC), a fully nonparametric causal discovery method based on purely observational data. From a novel interpretation of the notion of asymmetry between cause and effect, we derive a corresponding asymmetry measure using the framework of reproducing kernel Hilbert spaces. Based on this, we propose three decision rules for causal discovery. We demonstrate the wide applicability of our method across a range of diverse synthetic datasets. Furthermore, we test our method on real-world time series data and the real-world benchmark dataset Tubingen Cause-Effect Pairs, where we outperform existing state-of-the-art methods.
Tasks Causal Discovery, Causal Inference, Time Series
Published 2018-04-12
URL http://arxiv.org/abs/1804.04622v1
PDF http://arxiv.org/pdf/1804.04622v1.pdf
PWC https://paperswithcode.com/paper/causal-inference-via-kernel-deviance-measures
Repo
Framework
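
A hedged reading of the approach: estimate conditional mean embeddings in an RKHS with kernelised ridge regression, then score each causal direction by how much the embedding's norm varies across conditioning values, taking the lower-variability direction as causal. The NumPy sketch below follows that reading; the kernel widths, the ridge parameter and the variance-of-norms deviance are my assumptions, and the paper proposes several decision rules rather than this single one.

```python
# Hedged sketch of a KCDC-style deviance score: variance of the RKHS norms of
# conditional mean embeddings, compared across the two candidate directions.
import numpy as np

def rbf(a, b, gamma=1.0):
    d = a[:, None] - b[None, :]
    return np.exp(-gamma * d ** 2)

def deviance(x, y, lam=1e-3):
    n = len(x)
    Kx, Ky = rbf(x, x), rbf(y, y)
    # beta[:, i] holds the weights of the conditional mean embedding of Y | X = x_i.
    beta = np.linalg.solve(Kx + lam * n * np.eye(n), Kx)
    norms = np.sqrt(np.einsum("ji,jk,ki->i", beta, Ky, beta))   # ||mu_{Y|X=x_i}||_H
    return np.var(norms)

rng = np.random.default_rng(0)
x = rng.standard_normal(300)
y = np.tanh(2 * x) + 0.1 * rng.standard_normal(300)             # simulated mechanism X -> Y
direction = "X -> Y" if deviance(x, y) < deviance(y, x) else "Y -> X"
print(direction)
```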

Learning Discriminative Video Representations Using Adversarial Perturbations

Title Learning Discriminative Video Representations Using Adversarial Perturbations
Authors Jue Wang, Anoop Cherian
Abstract Adversarial perturbations are noise-like patterns that can subtly change the data, while failing an otherwise accurate classifier. In this paper, we propose to use such perturbations for improving the robustness of video representations. To this end, given a well-trained deep model for per-frame video recognition, we first generate adversarial noise adapted to this model. Using the original data features from the full video sequence and their perturbed counterparts, as two separate bags, we develop a binary classification problem that learns a set of discriminative hyperplanes – as a subspace – that will separate the two bags from each other. This subspace is then used as a descriptor for the video, dubbed discriminative subspace pooling. As the perturbed features belong to data classes that are likely to be confused with the original features, the discriminative subspace will characterize parts of the feature space that are more representative of the original data, and thus may provide robust video representations. To learn such descriptors, we formulate a subspace learning objective on the Stiefel manifold and resort to Riemannian optimization methods for solving it efficiently. We provide experiments on several video datasets and demonstrate state-of-the-art results.
Tasks Action Recognition In Videos, Video Recognition
Published 2018-07-24
URL http://arxiv.org/abs/1807.09380v2
PDF http://arxiv.org/pdf/1807.09380v2.pdf
PWC https://paperswithcode.com/paper/learning-discriminative-video-representations
Repo
Framework
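
The first step described above, generating adversarial noise adapted to a trained per-frame model, can be sketched with plain FGSM. The stand-in linear classifier and epsilon below are assumptions, and the paper's subsequent subspace pooling via Riemannian optimization on the Stiefel manifold is not shown.

```python
# Hedged sketch of FGSM-style adversarial noise for each frame of a video, with respect
# to a trained per-frame classifier. Classifier and epsilon are illustrative stand-ins.
import torch
import torch.nn as nn

frame_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))   # stand-in classifier

def fgsm_noise(frames, labels, eps=0.03):
    frames = frames.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(frame_model(frames), labels)
    loss.backward()
    return eps * frames.grad.sign()            # adversarial perturbation per frame

video = torch.rand(16, 3, 32, 32)              # 16 frames
labels = torch.randint(0, 10, (16,))
noise = fgsm_noise(video, labels)
original_bag, perturbed_bag = video, video + noise   # the two "bags" to be separated later
print(noise.shape)
```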

Identity-preserving Face Recovery from Portraits

Title Identity-preserving Face Recovery from Portraits
Authors Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz
Abstract Recovering the latent photorealistic faces from their artistic portraits aids human perception and facial analysis. However, a recovery process that can preserve identity is challenging because the fine details of real faces can be distorted or lost in stylized images. In this paper, we present a new Identity-preserving Face Recovery from Portraits (IFRP) to recover latent photorealistic faces from unaligned stylized portraits. Our IFRP method consists of two components: Style Removal Network (SRN) and Discriminative Network (DN). The SRN is designed to transfer feature maps of stylized images to the feature maps of the corresponding photorealistic faces. By embedding spatial transformer networks into the SRN, our method can compensate for misalignments of stylized faces automatically and output aligned realistic face images. The role of the DN is to enforce recovered faces to be similar to authentic faces. To ensure the identity preservation, we promote the recovered and ground-truth faces to share similar visual features via a distance measure which compares features of recovered and ground-truth faces extracted from a pre-trained VGG network. We evaluate our method on a large-scale synthesized dataset of real and stylized face pairs and attain state of the art results. In addition, our method can recover photorealistic faces from previously unseen stylized portraits, original paintings and human-drawn sketches.
Tasks
Published 2018-01-08
URL http://arxiv.org/abs/1801.02279v2
PDF http://arxiv.org/pdf/1801.02279v2.pdf
PWC https://paperswithcode.com/paper/identity-preserving-face-recovery-from
Repo
Framework
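
The identity-preservation term, comparing recovered and ground-truth faces through a pre-trained VGG network, is straightforward to sketch. The layer cut-off (up to relu3_3) and the MSE distance below are assumptions; the SRN and DN networks themselves are not shown.

```python
# Minimal sketch of a VGG-feature identity loss: compare features of the recovered face
# and the ground-truth face extracted from a fixed, pre-trained VGG network.
import torch
import torch.nn as nn
from torchvision.models import vgg19

vgg_features = vgg19(weights="IMAGENET1K_V1").features[:16].eval()   # up to relu3_3 (assumed cut-off)
for p in vgg_features.parameters():
    p.requires_grad_(False)                                          # fixed feature extractor

def identity_loss(recovered, ground_truth):
    return nn.functional.mse_loss(vgg_features(recovered), vgg_features(ground_truth))

loss = identity_loss(torch.rand(2, 3, 128, 128), torch.rand(2, 3, 128, 128))
print(loss.item())
```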

Feature Selection using Stochastic Gates

Title Feature Selection using Stochastic Gates
Authors Yutaro Yamada, Ofir Lindenbaum, Sahand Negahban, Yuval Kluger
Abstract Feature selection problems have been extensively studied for linear estimation, for instance, Lasso, but less emphasis has been placed on feature selection for non-linear functions. In this study, we propose a method for feature selection in high-dimensional non-linear function estimation problems. The new procedure is based on minimizing the $\ell_0$ norm of the vector of indicator variables that represent if a feature is selected or not. Our approach relies on the continuous relaxation of Bernoulli distributions, which allows our model to learn the parameters of the approximate Bernoulli distributions via gradient descent. This general framework simultaneously minimizes a loss function while selecting relevant features. Furthermore, we provide an information-theoretic justification of incorporating Bernoulli distribution into our approach and demonstrate the potential of the approach on synthetic and real-life applications.
Tasks Feature Selection
Published 2018-10-09
URL https://arxiv.org/abs/1810.04247v6
PDF https://arxiv.org/pdf/1810.04247v6.pdf
PWC https://paperswithcode.com/paper/feature-selection-using-stochastic-gates
Repo
Framework
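
A hedged sketch of the gating idea: multiply each feature by a clipped-Gaussian gate and penalise the expected number of open gates, a Gaussian CDF term that serves as the relaxed $\ell_0$ penalty. The sigma, the clipping and the exact penalty form follow my reading of the approach and may differ in detail from the paper.

```python
# Hedged PyTorch sketch of stochastic gates for feature selection.
import torch
import torch.nn as nn

class StochasticGates(nn.Module):
    def __init__(self, num_features, sigma=0.5):
        super().__init__()
        self.mu = nn.Parameter(0.5 * torch.ones(num_features))
        self.sigma = sigma

    def forward(self, x):
        noise = self.sigma * torch.randn_like(self.mu) if self.training else 0.0
        gate = torch.clamp(self.mu + noise, 0.0, 1.0)     # relaxed Bernoulli indicator
        return x * gate

    def expected_l0(self):
        # P(gate_d > 0) = Phi(mu_d / sigma); summing gives the expected count of selected features.
        normal = torch.distributions.Normal(0.0, 1.0)
        return normal.cdf(self.mu / self.sigma).sum()

gates = StochasticGates(20)
x = torch.randn(8, 20)
loss = gates(x).pow(2).mean() + 0.1 * gates.expected_l0()   # toy loss + sparsity penalty
loss.backward()
print(gates.mu.grad.shape)
```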

MRI to FDG-PET: Cross-Modal Synthesis Using 3D U-Net For Multi-Modal Alzheimer’s Classification

Title MRI to FDG-PET: Cross-Modal Synthesis Using 3D U-Net For Multi-Modal Alzheimer’s Classification
Authors Apoorva Sikka, Skand Vishwanath Peri, Deepti R. Bathula
Abstract Recent studies suggest that combined analysis of Magnetic resonance imaging (MRI) that measures brain atrophy and positron emission tomography (PET) that quantifies hypo-metabolism provides improved accuracy in diagnosing Alzheimer’s disease. However, such techniques are limited by the availability of corresponding scans of each modality. Current work focuses on a cross-modal approach to estimate FDG-PET scans for the given MR scans using a 3D U-Net architecture. The use of the complete MR image instead of a local patch based approach helps in capturing non-local and non-linear correlations between MRI and PET modalities. The quality of the estimated PET scans is measured using quantitative metrics such as MAE, PSNR and SSIM. The efficacy of the proposed method is evaluated in the context of Alzheimer’s disease classification. The accuracy using only MRI is 70.18% while joint classification using synthesized PET and MRI is 74.43% with a p-value of $0.06$. The significant improvement in diagnosis demonstrates the utility of the synthesized PET scans for multi-modal analysis.
Tasks
Published 2018-07-26
URL http://arxiv.org/abs/1807.10111v2
PDF http://arxiv.org/pdf/1807.10111v2.pdf
PWC https://paperswithcode.com/paper/mri-to-fdg-pet-cross-modal-synthesis-using-3d
Repo
Framework
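
A minimal one-level 3D U-Net for volume-to-volume regression (MR volume in, PET-like volume out) is sketched below. The depth, channel widths and single skip connection are illustrative; the paper's network is deeper and operates on the complete MR image.

```python
# Minimal one-level 3D U-Net sketch for MR -> PET-like volume regression.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv3d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv3d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = block(1, 16)
        self.down = nn.MaxPool3d(2)
        self.mid = block(16, 32)
        self.up = nn.ConvTranspose3d(32, 16, kernel_size=2, stride=2)
        self.dec = block(32, 16)                 # 16 skip + 16 upsampled channels
        self.out = nn.Conv3d(16, 1, 1)

    def forward(self, mri):
        e = self.enc(mri)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([e, self.up(m)], dim=1))   # skip connection
        return self.out(d)                                # synthesised PET-like volume

pet = TinyUNet3D()(torch.randn(1, 1, 32, 64, 64))
print(pet.shape)   # torch.Size([1, 1, 32, 64, 64])
```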

Cognitive Deficit of Deep Learning in Numerosity

Title Cognitive Deficit of Deep Learning in Numerosity
Authors Xiaolin Wu, Xi Zhang, Xiao Shu
Abstract Subitizing, or the sense of small natural numbers, is an innate cognitive function of humans and primates; it responds to visual stimuli prior to the development of any symbolic skills, language or arithmetic. Given successes of deep learning (DL) in tasks of visual intelligence and given the primitivity of number sense, a tantalizing question is whether DL can comprehend numbers and perform subitizing. But somewhat disappointingly, extensive experiments in the style of cognitive psychology demonstrate that the examples-driven black box DL cannot see through superficial variations in visual representations and distill the abstract notion of natural number, a task that children perform with high accuracy and confidence. The failure is apparently due to the learning method, not the CNN computational machinery itself. A recurrent neural network capable of subitizing does exist, which we construct by encoding a mechanism of mathematical morphology into the CNN convolutional kernels. Also, we investigate, using subitizing as a test bed, ways to aid the black box DL with cognitive priors derived from human insight. Our findings are mixed and interesting, pointing to both the cognitive deficit of pure DL and some measured successes of boosting DL by predetermined cognitive implements. This case study of DL in cognitive computing is meaningful because visual numerosity represents a minimum level of human intelligence.
Tasks
Published 2018-02-09
URL http://arxiv.org/abs/1802.05160v4
PDF http://arxiv.org/pdf/1802.05160v4.pdf
PWC https://paperswithcode.com/paper/cognitive-deficit-of-deep-learning-in
Repo
Framework