Paper Group ANR 690
TS-Net: Combining modality specific and common features for multimodal patch matching. How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?. Adaptive Task Allocation for Mobile Edge Learning. AI Fairness for People with Disabilities: Point of View. In Defense of Single-column Networks for Crowd Counting. U-SLADS: Unsupe …
TS-Net: Combining modality specific and common features for multimodal patch matching
Title | TS-Net: Combining modality specific and common features for multimodal patch matching |
Authors | Sovann En, Alexis Lechervy, Frédéric Jurie |
Abstract | Multimodal patch matching addresses the problem of finding correspondences between image patches from two different modalities, e.g. RGB vs sketch or RGB vs near-infrared. The comparison of patches of different modalities can be done by discovering the information common to both modalities (Siamese-like approaches) or the modality-specific information (Pseudo-Siamese-like approaches). We observed that neither of these two scenarios is optimal. This motivates us to propose a three-stream architecture, dubbed TS-Net, combining the benefits of the two. In addition, we show that adding extra constraints in the intermediate layers of such networks further boosts the performance. Experiments on three multimodal datasets show significant performance gains in comparison with Siamese and Pseudo-Siamese networks. |
Tasks | |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01550v1 |
http://arxiv.org/pdf/1806.01550v1.pdf | |
PWC | https://paperswithcode.com/paper/ts-net-combining-modality-specific-and-common |
Repo | |
Framework | |
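The three-stream idea above lends itself to a compact sketch: a Siamese stream with shared weights extracts features common to both modalities, a Pseudo-Siamese stream with modality-specific encoders extracts modality-specific features, and a metric head scores their concatenation. Below is a minimal PyTorch sketch of that structure, assuming toy encoders, layer sizes and a simple metric head; it is not the authors' exact architecture.

```python
import torch
import torch.nn as nn

def small_cnn():
    # Toy patch encoder; the real TS-Net streams are deeper.
    return nn.Sequential(
        nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class ThreeStreamMatcher(nn.Module):
    """Sketch: a Siamese stream (shared weights) plus a Pseudo-Siamese stream
    (modality-specific weights), with all features concatenated into a metric head."""
    def __init__(self):
        super().__init__()
        self.shared = small_cnn()          # Siamese: common features
        self.enc_a = small_cnn()           # Pseudo-Siamese: modality A
        self.enc_b = small_cnn()           # Pseudo-Siamese: modality B
        self.metric = nn.Sequential(nn.Linear(4 * 64, 128), nn.ReLU(),
                                    nn.Linear(128, 1))

    def forward(self, patch_a, patch_b):
        common = torch.cat([self.shared(patch_a), self.shared(patch_b)], dim=1)
        specific = torch.cat([self.enc_a(patch_a), self.enc_b(patch_b)], dim=1)
        return self.metric(torch.cat([common, specific], dim=1))  # match logit

# Usage on random 64x64 single-channel patches.
model = ThreeStreamMatcher()
score = model(torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64))
print(score.shape)  # torch.Size([4, 1])
```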
How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?
Title | How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation? |
Authors | Shuoyang Ding, Kevin Duh |
Abstract | Using pre-trained word embeddings as the input layer is a common practice in many natural language processing (NLP) tasks, but it is largely neglected for neural machine translation (NMT). In this paper, we conducted a systematic analysis of the effect of using pre-trained source-side monolingual word embeddings in NMT. We compared several strategies, such as fixing or updating the embeddings during NMT training on varying amounts of data, and we also proposed a novel strategy called dual-embedding that blends the fixing and updating strategies. Our results suggest that pre-trained embeddings can be helpful if properly incorporated into NMT, especially when parallel data is limited or additional in-domain monolingual data is readily available. |
Tasks | Machine Translation, Word Embeddings |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01515v2 |
http://arxiv.org/pdf/1806.01515v2.pdf | |
PWC | https://paperswithcode.com/paper/how-do-source-side-monolingual-word |
Repo | |
Framework | |
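A hedged sketch of the dual-embedding strategy described above: one copy of the pre-trained source-side embeddings stays frozen, a second copy is updated during NMT training, and the two views are concatenated per token. The vocabulary size, embedding dimension, and the choice to concatenate rather than sum are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DualEmbedding(nn.Module):
    """Dual-embedding input layer: a frozen copy of the pre-trained embeddings
    plus a trainable copy, concatenated per token."""
    def __init__(self, pretrained: torch.Tensor):
        super().__init__()
        self.fixed = nn.Embedding.from_pretrained(pretrained, freeze=True)
        self.tuned = nn.Embedding.from_pretrained(pretrained.clone(), freeze=False)

    def forward(self, token_ids):
        return torch.cat([self.fixed(token_ids), self.tuned(token_ids)], dim=-1)

# Usage with a random stand-in for pre-trained monolingual embeddings.
vocab_size, dim = 10000, 300
pretrained = torch.randn(vocab_size, dim)
emb = DualEmbedding(pretrained)
out = emb(torch.randint(0, vocab_size, (2, 7)))   # (batch, seq_len)
print(out.shape)                                  # torch.Size([2, 7, 600])
```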
Adaptive Task Allocation for Mobile Edge Learning
Title | Adaptive Task Allocation for Mobile Edge Learning |
Authors | Umair Mohammad, Sameh Sorour |
Abstract | This paper aims to establish a new optimization paradigm for implementing realistic distributed learning algorithms, with performance guarantees, on wireless edge nodes with heterogeneous computing and communication capacities. We refer to this new paradigm as 'Mobile Edge Learning (MEL)'. The problem of dynamic task allocation for MEL is considered in this paper with the aim of maximizing learning accuracy, while guaranteeing that the total times of data distribution/aggregation over heterogeneous channels, and of local computing iterations at the heterogeneous nodes, are bounded by a preset duration. The problem is first formulated as a quadratically-constrained integer linear problem. Since it is NP-hard, the paper relaxes it into a non-convex problem over real variables. We then propose two solutions: one derives analytical upper bounds on the optimal solution of the relaxed problem using Lagrangian analysis and KKT conditions, and the other applies suggest-and-improve starting from an equal batch allocation. The merits of the proposed solutions are demonstrated by comparing their performance to both numerical approaches and the equal task allocation approach. |
Tasks | |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.03748v2 |
http://arxiv.org/pdf/1811.03748v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-task-allocation-for-mobile-edge |
Repo | |
Framework | |
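The suggest-and-improve step can be illustrated with a toy allocation loop: start from an equal batch split and greedily shift batches from the most-loaded node to the least-loaded node, as long as a per-node time estimate stays within the preset duration. The linear time model and the constants below are assumptions for illustration only, not the paper's formulation.

```python
def allocate_batches(per_batch_time, comm_time, budget, total_batches):
    """Toy suggest-and-improve: start from an equal split, then repeatedly move one
    batch from the most-loaded node to the least-loaded node while every node's
    estimated time (comm + compute, assumed linear in its batch count) fits the budget."""
    n = len(per_batch_time)
    alloc = [total_batches // n] * n                             # equal suggestion
    node_time = lambda k: comm_time[k] + per_batch_time[k] * alloc[k]

    for _ in range(10 * total_batches):                          # improve
        slow = max(range(n), key=node_time)
        fast = min(range(n), key=node_time)
        if slow == fast or alloc[slow] == 0:
            break
        alloc[slow] -= 1
        alloc[fast] += 1
        if node_time(fast) > budget:                             # undo infeasible move
            alloc[slow] += 1
            alloc[fast] -= 1
            break
    assert all(node_time(k) <= budget for k in range(n)), "no feasible allocation found"
    return alloc

# Three heterogeneous edge nodes, 30 batches, 10-second duration budget.
print(allocate_batches([0.2, 0.5, 1.0], [1.0, 1.5, 2.0], budget=10.0, total_batches=30))
```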
AI Fairness for People with Disabilities: Point of View
Title | AI Fairness for People with Disabilities: Point of View |
Authors | Shari Trewin |
Abstract | We consider how fair treatment in society for people with disabilities might be impacted by the rise in the use of artificial intelligence, and especially machine learning methods. We argue that fairness for people with disabilities is different from fairness with respect to other protected attributes such as age, gender or race. One major difference is the extreme diversity in the ways disabilities manifest and people adapt. Secondly, disability information is highly sensitive and not always shared, precisely because of the potential for discrimination. Given these differences, we explore definitions of fairness and how well they work in the disability space. Finally, we suggest ways of approaching fairness for people with disabilities in AI applications. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10670v1 |
http://arxiv.org/pdf/1811.10670v1.pdf | |
PWC | https://paperswithcode.com/paper/ai-fairness-for-people-with-disabilities |
Repo | |
Framework | |
In Defense of Single-column Networks for Crowd Counting
Title | In Defense of Single-column Networks for Crowd Counting |
Authors | Ze Wang, Zehao Xiao, Kai Xie, Qiang Qiu, Xiantong Zhen, Xianbin Cao |
Abstract | Crowd counting, usually addressed by density estimation, has become an increasingly important topic in computer vision due to its widespread applications in video surveillance, urban planning, and intelligence gathering. However, it is an essentially challenging task because of greatly varying object sizes, coupled with severe occlusions and the vague appearance of extremely small individuals. Existing methods rely heavily on multi-column learning architectures to extract multi-scale features, which, however, incur heavy computational cost that is especially undesirable for crowd counting. In this paper, we propose the single-column counting network (SCNet) for efficient crowd counting without relying on multi-column networks. SCNet consists of residual fusion modules (RFMs) for multi-scale feature extraction, a pyramid pooling module (PPM) for information fusion, and a sub-pixel convolutional module (SPCM) followed by a bilinear upsampling layer for resolution recovery. These modules enable SCNet to fully capture multi-scale features in a compact single-column architecture and estimate high-resolution density maps efficiently. In addition, we provide a principled paradigm for density map generation and data augmentation for training, which further improves performance. Extensive experiments on three benchmark datasets show that SCNet delivers new state-of-the-art performance and surpasses previous methods by large margins, demonstrating the effectiveness of SCNet as a single-column network for crowd counting. |
Tasks | Crowd Counting, Data Augmentation, Density Estimation |
Published | 2018-08-18 |
URL | http://arxiv.org/abs/1808.06133v1 |
http://arxiv.org/pdf/1808.06133v1.pdf | |
PWC | https://paperswithcode.com/paper/in-defense-of-single-column-networks-for |
Repo | |
Framework | |
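A minimal single-column sketch wiring together the kinds of modules named in the abstract: stacked residual fusion blocks, a sub-pixel convolution (PixelShuffle) and a bilinear upsampling step producing a density map whose sum is the count estimate. The pyramid pooling module is omitted for brevity; channel counts and depths are assumptions, not the authors' SCNet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualFusionModule(nn.Module):
    """Sketch of an RFM: two parallel dilated branches fused with a residual connection."""
    def __init__(self, ch):
        super().__init__()
        self.b1 = nn.Conv2d(ch, ch, 3, padding=1, dilation=1)
        self.b2 = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        y = self.fuse(torch.cat([self.b1(x), self.b2(x)], dim=1))
        return F.relu(x + y)

class SingleColumnCounter(nn.Module):
    """Single-column sketch: stem -> stacked RFMs -> sub-pixel conv (PixelShuffle)
    -> bilinear upsampling -> one-channel density map."""
    def __init__(self, ch=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(),
                                  nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.body = nn.Sequential(ResidualFusionModule(ch), ResidualFusionModule(ch))
        self.spcm = nn.Sequential(nn.Conv2d(ch, ch * 4, 3, padding=1),
                                  nn.PixelShuffle(2))      # x2 resolution recovery
        self.head = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        h = self.spcm(self.body(self.stem(x)))
        h = F.interpolate(h, size=x.shape[-2:], mode='bilinear', align_corners=False)
        return F.relu(self.head(h))                        # non-negative density

model = SingleColumnCounter()
density = model(torch.randn(1, 3, 256, 256))
print(density.shape, density.sum().item())   # predicted count is the density sum
```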
U-SLADS: Unsupervised Learning Approach for Dynamic Dendrite Sampling
Title | U-SLADS: Unsupervised Learning Approach for Dynamic Dendrite Sampling |
Authors | Yan Zhang, Xiang Huang, Nicola Ferrier, Emine B. Gulsoy, Charudatta Phatak |
Abstract | Novel data acquisition schemes are an emerging need for scanning-microscopy-based imaging techniques, both to reduce acquisition time and to minimize the probing radiation the sample is exposed to. Various sparse sampling schemes have been studied and are ideally suited for such applications, where images can be reconstructed from a sparse set of measurements. Dynamic sparse sampling methods, particularly supervised-learning-based iterative sampling algorithms, have shown promising results for sampling pixel locations on edges or boundaries during imaging. However, dynamic sampling for imaging skeleton-like objects such as metal dendrites remains difficult. Here, we present a new unsupervised learning approach using Hierarchical Gaussian Mixture Models (HGMM) to dynamically sample metal dendrites. This technique is very useful when users are interested in fast imaging of the primary and secondary arms of metal dendrites during solidification processes in materials science. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02233v1 |
http://arxiv.org/pdf/1807.02233v1.pdf | |
PWC | https://paperswithcode.com/paper/u-slads-unsupervised-learning-approach-for |
Repo | |
Framework | |
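One rough way a hierarchical GMM could drive dynamic sampling: fit a coarse mixture to already-measured dendrite pixel locations, refine each component with a child mixture, and propose the next measurement points where the hierarchical model assigns the highest density. The component counts and the proposal rule below are assumptions for illustration, not the U-SLADS algorithm itself.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for (row, col) coordinates of already-measured pixels that hit dendrite.
measured = np.vstack([rng.normal([30, 30], 3, size=(200, 2)),
                      rng.normal([70, 60], 5, size=(200, 2))])

# Level 1: coarse mixture over measured dendrite locations (e.g., primary arms).
parent = GaussianMixture(n_components=2, random_state=0).fit(measured)
labels = parent.predict(measured)

# Level 2: refine each parent component with a child mixture (e.g., secondary arms).
children = [GaussianMixture(n_components=2, random_state=0).fit(measured[labels == k])
            for k in range(parent.n_components)]

# Propose next measurement locations: grid points with the highest density
# under the hierarchical model (parent weights times child densities).
grid = np.stack(np.meshgrid(np.arange(100), np.arange(100)), axis=-1).reshape(-1, 2).astype(float)
density = sum(wk * np.exp(c.score_samples(grid))
              for wk, c in zip(parent.weights_, children))
next_points = grid[np.argsort(density)[-10:]]
print(next_points)
```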
Kernel Density Estimation-Based Markov Models with Hidden State
Title | Kernel Density Estimation-Based Markov Models with Hidden State |
Authors | Gustav Eje Henter, Arne Leijon, W. Bastiaan Kleijn |
Abstract | We consider Markov models of stochastic processes where the next-step conditional distribution is defined by a kernel density estimator (KDE), similar to Markov forecast densities and certain time-series bootstrap schemes. The KDE Markov models (KDE-MMs) we discuss are nonlinear, nonparametric, fully probabilistic representations of stationary processes, based on techniques with strong asymptotic consistency properties. The models generate new data by concatenating points from the training data sequences in a context-sensitive manner, together with some additive driving noise. We present novel EM-type maximum-likelihood algorithms for data-driven bandwidth selection in KDE-MMs. Additionally, we augment the KDE-MMs with a hidden state, yielding a new model class, KDE-HMMs. The added state variable captures non-Markovian long memory and signal structure (e.g., slow oscillations), complementing the short-range dependences described by the Markov process. The resulting joint Markov and hidden-Markov structure is appealing for modelling complex real-world processes such as speech signals. We present guaranteed-ascent EM-update equations for model parameters in the case of Gaussian kernels, as well as relaxed update formulas that greatly accelerate training in practice. Experiments demonstrate increased held-out set probability for KDE-HMMs on several challenging natural and synthetic data series, compared to traditional techniques such as autoregressive models, HMMs, and their combinations. |
Tasks | Density Estimation, Time Series |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11320v1 |
http://arxiv.org/pdf/1807.11320v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-density-estimation-based-markov-models |
Repo | |
Framework | |
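The generative mechanism described in the abstract (concatenating training points in a context-sensitive manner plus driving noise) can be sketched directly: weight every observed transition by a Gaussian kernel comparing the current state to the training states, resample a successor, and add noise. The fixed bandwidths below are assumptions; the paper learns them with EM-type algorithms.

```python
import numpy as np

def kde_mm_sample(train, n_steps, bw_state=0.3, bw_noise=0.1, seed=0):
    """Sketch of sampling from a first-order KDE Markov model: the next-step
    conditional density given state x is a mixture over observed transitions
    (x_t, x_{t+1}), weighted by a Gaussian kernel on x vs. each x_t, plus noise."""
    rng = np.random.default_rng(seed)
    prev, nxt = train[:-1], train[1:]
    x = train[0]
    path = [x]
    for _ in range(n_steps):
        w = np.exp(-0.5 * ((x - prev) / bw_state) ** 2)   # kernel weights on the current state
        w /= w.sum()
        j = rng.choice(len(nxt), p=w)                     # pick a training successor
        x = nxt[j] + rng.normal(0.0, bw_noise)            # additive driving noise
        path.append(x)
    return np.array(path)

# Toy training series: a noisy slow oscillation.
t = np.linspace(0, 20 * np.pi, 2000)
train = np.sin(t) + 0.05 * np.random.default_rng(1).normal(size=t.size)
print(kde_mm_sample(train, n_steps=10))
```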
Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information
Title | Continuous Assortment Optimization with Logit Choice Probabilities under Incomplete Information |
Authors | Yannik Peeters, Arnoud V. den Boer, Michel Mandjes |
Abstract | Motivated by several practical applications, we consider assortment optimization over a continuous spectrum of products represented by the unit interval, where the seller’s problem consists of determining the optimal subset of products to offer to potential customers. To describe the relation between assortment and customer choice, we propose a probabilistic choice model that forms the continuous counterpart of the widely studied discrete multinomial logit model. We consider the seller’s problem under incomplete information, propose a stochastic-approximation type of policy, and show that its regret – its performance loss compared to the optimal policy – is only logarithmic in the time horizon. We complement this result by showing a matching lower bound on the regret of any policy, implying that our policy is asymptotically optimal. We then show that adding a capacity constraint significantly changes the structure of the problem, by constructing an instance in which the regret of any policy after $T$ time periods is bounded below by a positive constant times $T^{2/3}$. We propose a policy based on kernel-density estimation techniques, and show that its regret is bounded above by a constant times $T^{2/3}$. Numerical illustrations show that our policies outperform or are on par with alternatives based on discretizing the product space. |
Tasks | Density Estimation |
Published | 2018-07-17 |
URL | https://arxiv.org/abs/1807.06497v3 |
https://arxiv.org/pdf/1807.06497v3.pdf | |
PWC | https://paperswithcode.com/paper/continuous-assortment-optimization-with-logit |
Repo | |
Framework | |
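A toy numerical illustration under an assumed continuous analogue of the MNL (purchase weight proportional to exp(v(x)) on the offered set, no-purchase weight 1): compute the expected revenue of interval assortments by numerical integration and grid-search the best interval as a full-information benchmark. The utility and revenue functions are made up for illustration; this is neither the paper's exact model nor its incomplete-information policy.

```python
import numpy as np

v = lambda x: 1.0 - 4.0 * (x - 0.6) ** 2      # assumed utility over products in [0, 1]
w = lambda x: x                                # assumed revenue per product (e.g., price = x)

def expected_revenue(a, b, n=2000):
    """Expected revenue of the interval assortment [a, b] under the assumed
    continuous-MNL form, via a simple Riemann-sum integration."""
    x = np.linspace(a, b, n)
    dx = (b - a) / (n - 1)
    weights = np.exp(v(x))
    denom = 1.0 + np.sum(weights) * dx         # 1 accounts for the no-purchase option
    return np.sum(w(x) * weights) * dx / denom

# Full-information benchmark: grid search over interval assortments.
grid = np.linspace(0.0, 1.0, 51)
best = max(((a, b) for a in grid for b in grid if a < b),
           key=lambda ab: expected_revenue(*ab))
print(best, expected_revenue(*best))
```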
Deep Neural Networks for ECG-free Cardiac Phase and End-Diastolic Frame Detection on Coronary Angiographies
Title | Deep Neural Networks for ECG-free Cardiac Phase and End-Diastolic Frame Detection on Coronary Angiographies |
Authors | Costin Ciusdel, Alexandru Turcea, Andrei Puiu, Lucian Itu, Lucian Calmac, Emma Weiss, Cornelia Margineanu, Elisabeta Badila, Martin Berger, Thomas Redel, Tiziano Passerini, Mehmet Gulsun, Puneet Sharma |
Abstract | Invasive coronary angiography (ICA) is the gold standard in Coronary Artery Disease (CAD) imaging. Detection of the end-diastolic frame (EDF) and, in general, cardiac phase detection on each temporal frame of a coronary angiography acquisition is of significant importance for the anatomical and non-invasive functional assessment of CAD. This task is generally performed via manual frame selection or semi-automated selection based on simultaneously acquired ECG signals, thus introducing the requirement of simultaneous ECG recordings. We evaluate the performance of a purely image-based workflow built on deep neural networks for fully automated cardiac phase and EDF detection on coronary angiographies. A first deep neural network (DNN), trained to detect coronary arteries, is employed to preselect a subset of frames in which coronary arteries are well visible. A second DNN predicts cardiac phase labels for each frame. Only in the training and evaluation phases for the second DNN are ECG signals used to provide ground truth labels for each angiographic frame. The networks were trained on 17800 coronary angiographies from 3900 patients and evaluated on 27900 coronary angiographies from 6250 patients. No exclusion criteria related to patient state, previous interventions, or pathology were formulated. Cardiac phase detection had an accuracy of 97.6%, a sensitivity of 97.6% and a specificity of 97.5% on the evaluation set. EDF prediction had a precision of 97.4% and a recall of 96.9%. Several sub-group analyses were performed, indicating that cardiac phase detection performance is largely independent of acquisition angles and the patient's heart rate. The execution time of cardiac phase detection for one angiographic series was on average less than five seconds on a standard workstation. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02797v1 |
http://arxiv.org/pdf/1811.02797v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-for-ecg-free-cardiac |
Repo | |
Framework | |
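The two-stage, ECG-free workflow can be sketched with two placeholder classifiers: the first preselects frames in which the coronary tree is well visible, the second assigns a cardiac phase label to each retained frame. The networks below are untrained stand-ins with assumed input sizes and label sets; they only illustrate the data flow.

```python
import torch
import torch.nn as nn

def tiny_classifier(n_out):
    # Placeholder, untrained stand-in for each DNN in the two-stage workflow.
    return nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_out))

artery_detector = tiny_classifier(1)   # stage 1: is the coronary tree well visible?
phase_classifier = tiny_classifier(2)  # stage 2: e.g. systole vs. diastole per frame

frames = torch.randn(40, 1, 128, 128)  # one angiographic series (stand-in data)

with torch.no_grad():
    visible = torch.sigmoid(artery_detector(frames)).squeeze(1) > 0.5  # preselect frames
    selected = frames[visible]
    phases = phase_classifier(selected).argmax(dim=1)                  # per-frame phase label

print(int(visible.sum()), phases.tolist()[:10])
```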
Causal Inference via Kernel Deviance Measures
Title | Causal Inference via Kernel Deviance Measures |
Authors | Jovana Mitrovic, Dino Sejdinovic, Yee Whye Teh |
Abstract | Discovering the causal structure among a set of variables is a fundamental problem in many areas of science. In this paper, we propose Kernel Conditional Deviance for Causal Inference (KCDC), a fully nonparametric causal discovery method based on purely observational data. From a novel interpretation of the notion of asymmetry between cause and effect, we derive a corresponding asymmetry measure using the framework of reproducing kernel Hilbert spaces. Based on this, we propose three decision rules for causal discovery. We demonstrate the wide applicability of our method across a range of diverse synthetic datasets. Furthermore, we test our method on real-world time series data and the real-world benchmark dataset Tubingen Cause-Effect Pairs, where we outperform existing state-of-the-art methods. |
Tasks | Causal Discovery, Causal Inference, Time Series |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04622v1 |
http://arxiv.org/pdf/1804.04622v1.pdf | |
PWC | https://paperswithcode.com/paper/causal-inference-via-kernel-deviance-measures |
Repo | |
Framework | |
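A simplified score in the spirit of KCDC (not the paper's exact statistic or decision rules): estimate conditional mean embeddings of Y given each observed x with kernel ridge weights, take their RKHS norms, and compare the variability of those norms in both directions; the direction with the smaller deviance is read as causal. Kernels, bandwidths and the toy data are assumptions.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    return np.exp(-gamma * np.subtract.outer(a, b) ** 2)

def conditional_deviance(x, y, lam=1e-3):
    """Variability of the RKHS norms of estimated conditional mean embeddings of y given x."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    Kx, Ky = rbf(x, x), rbf(y, y)
    beta = np.linalg.solve(Kx + lam * n * np.eye(n), Kx)          # column j: weights for x_j
    norms = np.sqrt(np.einsum('ij,ik,kj->j', beta, Ky, beta))     # ||mu_{Y|x_j}||
    return norms.std()

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 300)
y = np.tanh(x) + 0.1 * rng.normal(size=300)                       # ground truth: x causes y

s_xy, s_yx = conditional_deviance(x, y), conditional_deviance(y, x)
print('inferred direction:', 'x -> y' if s_xy < s_yx else 'y -> x', round(s_xy, 3), round(s_yx, 3))
```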
Learning Discriminative Video Representations Using Adversarial Perturbations
Title | Learning Discriminative Video Representations Using Adversarial Perturbations |
Authors | Jue Wang, Anoop Cherian |
Abstract | Adversarial perturbations are noise-like patterns that can subtly change the data, while failing an otherwise accurate classifier. In this paper, we propose to use such perturbations for improving the robustness of video representations. To this end, given a well-trained deep model for per-frame video recognition, we first generate adversarial noise adapted to this model. Using the original data features from the full video sequence and their perturbed counterparts, as two separate bags, we develop a binary classification problem that learns a set of discriminative hyperplanes – as a subspace – that will separate the two bags from each other. This subspace is then used as a descriptor for the video, dubbed discriminative subspace pooling. As the perturbed features belong to data classes that are likely to be confused with the original features, the discriminative subspace will characterize parts of the feature space that are more representative of the original data, and thus may provide robust video representations. To learn such descriptors, we formulate a subspace learning objective on the Stiefel manifold and resort to Riemannian optimization methods for solving it efficiently. We provide experiments on several video datasets and demonstrate state-of-the-art results. |
Tasks | Action Recognition In Videos, Video Recognition |
Published | 2018-07-24 |
URL | http://arxiv.org/abs/1807.09380v2 |
http://arxiv.org/pdf/1807.09380v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-discriminative-video-representations |
Repo | |
Framework | |
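A compact sketch of the subspace-pooling step: original and perturbed per-frame features form two bags, and a set of orthonormal hyperplanes (a point on the Stiefel manifold) is learned to separate them, here with plain gradient steps followed by a QR retraction rather than the paper's Riemannian optimizer. The feature dimensions, the shifted-noise stand-in for adversarial perturbations, and the hinge objective are assumptions.

```python
import torch

torch.manual_seed(0)
d, k, n = 128, 8, 200                                # feature dim, subspace size, frames

feats = torch.randn(n, d)                             # per-frame features from a trained model (stand-in)
perturbed = feats + 0.4 + 0.1 * torch.randn(n, d)     # stand-in for the adversarially perturbed bag

X = torch.cat([feats, perturbed])                     # the two bags stacked
y = torch.cat([torch.ones(n), -torch.ones(n)])        # +1 original, -1 perturbed

W = torch.linalg.qr(torch.randn(d, k)).Q              # start on the Stiefel manifold (orthonormal columns)
W.requires_grad_(True)
opt = torch.optim.SGD([W], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    proj = (X @ W) * y.unsqueeze(1)                   # each column of W acts as one separating hyperplane
    loss = torch.clamp(1.0 - proj, min=0).mean()      # hinge loss over all hyperplanes
    loss.backward()
    opt.step()
    with torch.no_grad():                             # retraction: re-orthonormalise the columns
        W.copy_(torch.linalg.qr(W).Q)

video_descriptor = W.detach()                         # the learned subspace summarises the video
print(video_descriptor.shape)                         # torch.Size([128, 8])
```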
Identity-preserving Face Recovery from Portraits
Title | Identity-preserving Face Recovery from Portraits |
Authors | Fatemeh Shiri, Xin Yu, Fatih Porikli, Richard Hartley, Piotr Koniusz |
Abstract | Recovering the latent photorealistic faces from their artistic portraits aids human perception and facial analysis. However, a recovery process that can preserve identity is challenging because the fine details of real faces can be distorted or lost in stylized images. In this paper, we present a new Identity-preserving Face Recovery from Portraits (IFRP) method to recover latent photorealistic faces from unaligned stylized portraits. Our IFRP method consists of two components: a Style Removal Network (SRN) and a Discriminative Network (DN). The SRN is designed to transfer feature maps of stylized images to the feature maps of the corresponding photorealistic faces. By embedding spatial transformer networks into the SRN, our method can compensate for misalignments of stylized faces automatically and output aligned realistic face images. The role of the DN is to enforce recovered faces to be similar to authentic faces. To ensure identity preservation, we promote the recovered and ground-truth faces to share similar visual features via a distance measure that compares features of recovered and ground-truth faces extracted from a pre-trained VGG network. We evaluate our method on a large-scale synthesized dataset of real and stylized face pairs and attain state-of-the-art results. In addition, our method can recover photorealistic faces from previously unseen stylized portraits, original paintings and human-drawn sketches. |
Tasks | |
Published | 2018-01-08 |
URL | http://arxiv.org/abs/1801.02279v2 |
http://arxiv.org/pdf/1801.02279v2.pdf | |
PWC | https://paperswithcode.com/paper/identity-preserving-face-recovery-from |
Repo | |
Framework | |
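The identity-preserving term can be sketched as a perceptual loss on features from a pre-trained VGG network, as mentioned in the abstract. The truncation point, loss type and input handling below are assumptions, not the paper's exact configuration (the snippet downloads ImageNet weights on first use).

```python
import torch
import torch.nn as nn
from torchvision import models

class IdentityLoss(nn.Module):
    """Sketch of a perceptual identity term: compare deep VGG-19 features of the
    recovered face and the ground-truth face."""
    def __init__(self, layer=21):                  # truncate after a mid-level VGG-19 activation
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features[:layer]
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg = vgg.eval()
        self.criterion = nn.L1Loss()

    def forward(self, recovered, ground_truth):
        return self.criterion(self.vgg(recovered), self.vgg(ground_truth))

# Usage with random stand-in images (3x224x224; ImageNet-normalised inputs in practice).
loss_fn = IdentityLoss()
loss = loss_fn(torch.rand(2, 3, 224, 224), torch.rand(2, 3, 224, 224))
print(loss.item())
```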
Feature Selection using Stochastic Gates
Title | Feature Selection using Stochastic Gates |
Authors | Yutaro Yamada, Ofir Lindenbaum, Sahand Negahban, Yuval Kluger |
Abstract | Feature selection problems have been extensively studied for linear estimation, for instance with the Lasso, but less emphasis has been placed on feature selection for non-linear functions. In this study, we propose a method for feature selection in high-dimensional non-linear function estimation problems. The new procedure is based on minimizing the $\ell_0$ norm of the vector of indicator variables that represent whether a feature is selected or not. Our approach relies on a continuous relaxation of Bernoulli distributions, which allows our model to learn the parameters of the approximate Bernoulli distributions via gradient descent. This general framework simultaneously minimizes a loss function while selecting relevant features. Furthermore, we provide an information-theoretic justification for incorporating Bernoulli distributions into our approach and demonstrate the potential of the approach on synthetic and real-life applications. |
Tasks | Feature Selection |
Published | 2018-10-09 |
URL | https://arxiv.org/abs/1810.04247v6 |
https://arxiv.org/pdf/1810.04247v6.pdf | |
PWC | https://paperswithcode.com/paper/feature-selection-using-stochastic-gates |
Repo | |
Framework | |
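A hedged sketch of the gating idea: each feature is multiplied by a clipped-Gaussian gate, and the expected number of open gates serves as a smooth surrogate for the $\ell_0$ penalty, trained jointly with a simple predictor. The noise scale, penalty weight and toy regression task are illustrative assumptions, not the paper's exact formulation.

```python
import math
import torch
import torch.nn as nn

class StochasticGates(nn.Module):
    """Each feature is multiplied by a clipped-Gaussian gate; the expected number of
    open gates is penalised as a smooth surrogate for the l0 norm of the selection vector."""
    def __init__(self, n_features, sigma=0.5):
        super().__init__()
        self.mu = nn.Parameter(0.5 * torch.ones(n_features))
        self.sigma = sigma

    def forward(self, x):
        noise = self.sigma * torch.randn_like(self.mu) if self.training else 0.0
        z = torch.clamp(self.mu + noise, 0.0, 1.0)   # relaxed Bernoulli gate per feature
        return x * z

    def open_gate_penalty(self):
        # Sum over features of P(gate > 0) under the Gaussian relaxation.
        return torch.sum(0.5 * (1.0 + torch.erf(self.mu / (self.sigma * math.sqrt(2.0)))))

# Toy regression in which only the first two of twenty features matter.
torch.manual_seed(0)
X = torch.randn(512, 20)
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * torch.randn(512)

gates, head = StochasticGates(20), nn.Linear(20, 1)
opt = torch.optim.Adam(list(gates.parameters()) + list(head.parameters()), lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(head(gates(X)).squeeze(1), y) + 0.1 * gates.open_gate_penalty()
    loss.backward()
    opt.step()

print(torch.clamp(gates.mu.detach(), 0, 1))   # features 0 and 1 should keep clearly larger gates
```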
MRI to FDG-PET: Cross-Modal Synthesis Using 3D U-Net For Multi-Modal Alzheimer’s Classification
Title | MRI to FDG-PET: Cross-Modal Synthesis Using 3D U-Net For Multi-Modal Alzheimer’s Classification |
Authors | Apoorva Sikka, Skand Vishwanath Peri, Deepti. R. Bathula |
Abstract | Recent studies suggest that combined analysis of Magnetic resonance imaging (MRI) that measures brain atrophy and positron emission tomography (PET) that quantifies hypo-metabolism provides improved accuracy in diagnosing Alzheimer’s disease. However, such techniques are limited by the availability of corresponding scans of each modality. Current work focuses on a cross-modal approach to estimate FDG-PET scans for the given MR scans using a 3D U-Net architecture. The use of the complete MR image instead of a local patch based approach helps in capturing non-local and non-linear correlations between MRI and PET modalities. The quality of the estimated PET scans is measured using quantitative metrics such as MAE, PSNR and SSIM. The efficacy of the proposed method is evaluated in the context of Alzheimer’s disease classification. The accuracy using only MRI is 70.18% while joint classification using synthesized PET and MRI is 74.43% with a p-value of $0.06$. The significant improvement in diagnosis demonstrates the utility of the synthesized PET scans for multi-modal analysis. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10111v2 |
http://arxiv.org/pdf/1807.10111v2.pdf | |
PWC | https://paperswithcode.com/paper/mri-to-fdg-pet-cross-modal-synthesis-using-3d |
Repo | |
Framework | |
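A minimal 3D U-Net sketch for whole-volume MR-to-PET regression with a single downsampling level and one skip connection; the real network is deeper, and the channel counts and volume size here are assumptions.

```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv3d(cin, cout, 3, padding=1), nn.BatchNorm3d(cout), nn.ReLU(),
                         nn.Conv3d(cout, cout, 3, padding=1), nn.BatchNorm3d(cout), nn.ReLU())

class TinyUNet3D(nn.Module):
    """Minimal 3D U-Net: encoder, one pooling step, decoder with a skip connection,
    regressing a synthetic PET volume from a whole MR volume."""
    def __init__(self):
        super().__init__()
        self.enc1 = block(1, 8)
        self.down = nn.MaxPool3d(2)
        self.enc2 = block(8, 16)
        self.up = nn.ConvTranspose3d(16, 8, kernel_size=2, stride=2)
        self.dec1 = block(16, 8)
        self.out = nn.Conv3d(8, 1, 1)

    def forward(self, x):
        s1 = self.enc1(x)                          # full-resolution features
        bottleneck = self.enc2(self.down(s1))      # half-resolution features
        up = self.up(bottleneck)                   # back to full resolution
        return self.out(self.dec1(torch.cat([up, s1], dim=1)))  # skip connection

model = TinyUNet3D()
mri = torch.randn(1, 1, 32, 64, 64)   # (batch, channel, depth, height, width) stand-in MR volume
synth_pet = model(mri)
print(synth_pet.shape)                # same spatial shape as the input volume
```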
Cognitive Deficit of Deep Learning in Numerosity
Title | Cognitive Deficit of Deep Learning in Numerosity |
Authors | Xiaolin Wu, Xi Zhang, Xiao Shu |
Abstract | Subitizing, or the sense of small natural numbers, is an innate cognitive function of humans and primates; it responds to visual stimuli prior to the development of any symbolic skills, language or arithmetic. Given the successes of deep learning (DL) in tasks of visual intelligence and the primitivity of number sense, a tantalizing question is whether DL can comprehend numbers and perform subitizing. Somewhat disappointingly, extensive experiments in the style of cognitive psychology demonstrate that examples-driven black-box DL cannot see through superficial variations in visual representations and distill the abstract notion of natural number, a task that children perform with high accuracy and confidence. The failure is apparently due to the learning method, not the CNN computational machinery itself. A recurrent neural network capable of subitizing does exist, which we construct by encoding a mechanism of mathematical morphology into the CNN convolutional kernels. We also investigate, using subitizing as a test bed, ways to aid black-box DL with cognitive priors derived from human insight. Our findings are mixed and interesting, pointing both to the cognitive deficit of pure DL and to some measured successes of boosting DL with predetermined cognitive implements. This case study of DL in cognitive computing is meaningful, for visual numerosity represents a minimum level of human intelligence. |
Tasks | |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.05160v4 |
http://arxiv.org/pdf/1802.05160v4.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-deficit-of-deep-learning-in |
Repo | |
Framework | |