Paper Group ANR 799
Coupled dictionary learning for unsupervised change detection between multi-sensor remote sensing images. LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks. State Gradients for RNN Memory Analysis. Modeling of Facial Aging and Kinship: A Survey. Building Ethically Bounded AI. ForensicTransfer: Weakly-supervised Dom …
Coupled dictionary learning for unsupervised change detection between multi-sensor remote sensing images
Title | Coupled dictionary learning for unsupervised change detection between multi-sensor remote sensing images |
Authors | Vinicius Ferraris, Nicolas Dobigeon, Yanna Cavalcanti, Thomas Oberlin, Marie Chabert |
Abstract | Archetypal scenarios for change detection generally consider two images acquired through sensors of the same modality. However, in some specific cases such as emergency situations, the only images available may be those acquired through sensors of different modalities. This paper addresses the problem of detecting, in an unsupervised manner, changes between two observed images acquired by sensors of different modalities with possibly different resolutions. These sensor dissimilarities introduce additional issues in the context of operational change detection that are not addressed by most of the classical methods. This paper introduces a novel framework to effectively exploit the available information by modelling the two observed images as a sparse linear combination of atoms belonging to a pair of coupled overcomplete dictionaries learnt from each observed image. As they cover the same geographical location, codes are expected to be globally similar, except for possible changes in sparse spatial locations. Thus, the change detection task is envisioned through a dual code estimation which enforces spatial sparsity in the difference between the estimated codes associated with each image. This problem is formulated as an inverse problem which is iteratively solved using an efficient proximal alternating minimization algorithm accounting for nonsmooth and nonconvex functions. The proposed method is applied to real images with simulated yet realistic and real changes. A comparison with state-of-the-art change detection methods evidences the accuracy of the proposed strategy. |
Tasks | Dictionary Learning |
Published | 2018-07-21 |
URL | https://arxiv.org/abs/1807.08118v2 |
https://arxiv.org/pdf/1807.08118v2.pdf | |
PWC | https://paperswithcode.com/paper/coupled-dictionary-learning-for-unsupervised |
Repo | |
Framework | |
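The abstract's core idea, sparse codes over learnt dictionaries whose difference is sparse at change locations, can be illustrated with a minimal numpy sketch. This is not the paper's coupled-dictionary algorithm (which jointly learns two dictionaries and solves a nonconvex proximal problem); it is a toy single-dictionary ISTA solver where the change map is simply the code difference, and all sizes and signals are made-up values.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Solve min_x 0.5*||y - D x||^2 + lam*||x||_1 by ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ x - y)              # gradient of the quadratic data term
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((30, 60))          # toy overcomplete dictionary
x_true = np.zeros(60); x_true[[3, 17]] = [2.0, -1.5]
y1 = D @ x_true                            # "image 1" patch
x2_true = x_true.copy(); x2_true[3] = 0.0  # a sparse change: one atom disappears
y2 = D @ x2_true                           # "image 2" patch

x1, x2 = ista(D, y1), ista(D, y2)
change = np.abs(x1 - x2)                   # change indicator: sparse code difference
```

The unchanged atom (index 17) yields a near-zero entry in `change`, while the changed atom (index 3) stands out, which is the mechanism the paper's dual code estimation exploits.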
LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks
Title | LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks |
Authors | Ziming Zhang |
Abstract | In this paper we propose solving localized multiple kernel learning (LMKL) using LMKL-Net, a feedforward deep neural network. In contrast to previous works, as a learning principle we propose {\em parameterizing} both the gating function for learning kernel combination weights and the multiclass classifier in LMKL using an attentional network (AN) and a multilayer perceptron (MLP), respectively. In this way we can learn the (nonlinear) decision function in LMKL (approximately) by sequential applications of AN and MLP. Empirically on benchmark datasets we demonstrate that overall LMKL-Net can not only outperform the state-of-the-art MKL solvers in terms of accuracy, but also be trained about {\em two orders of magnitude} faster with much smaller memory footprint for large-scale learning. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08656v1 |
http://arxiv.org/pdf/1805.08656v1.pdf | |
PWC | https://paperswithcode.com/paper/lmkl-net-a-fast-localized-multiple-kernel |
Repo | |
Framework | |
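The LMKL-Net architecture described above, an attentional network that produces per-sample kernel weights feeding an MLP classifier, can be sketched as a numpy forward pass. This is a rough illustration under made-up dimensions and random weights, with simple elementwise "kernel feature" views standing in for actual kernel maps; it is not the paper's trained model.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
n, d, m, h, c = 8, 5, 3, 16, 4            # samples, input dim, kernels, hidden, classes

X = rng.standard_normal((n, d))
# m "kernel feature" views of the data (toy stand-ins for kernel maps)
phi = [X, np.tanh(X), X ** 2]

Wg = rng.standard_normal((d, m))          # attentional gating network (one layer)
gates = softmax(X @ Wg)                   # per-sample kernel weights, rows sum to 1

combined = sum(gates[:, [k]] * phi[k] for k in range(m))  # gated kernel combination

W1, W2 = rng.standard_normal((d, h)), rng.standard_normal((h, c))
logits = np.maximum(combined @ W1, 0.0) @ W2              # MLP classifier head
probs = softmax(logits)
```

The gating makes the kernel combination input-dependent ("localized"), which is what distinguishes LMKL from plain MKL with fixed global kernel weights.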
State Gradients for RNN Memory Analysis
Title | State Gradients for RNN Memory Analysis |
Authors | Lyan Verwimp, Hugo Van hamme, Vincent Renkens, Patrick Wambacq |
Abstract | We present a framework for analyzing what the state in RNNs remembers from its input embeddings. Our approach is inspired by backpropagation, in the sense that we compute the gradients of the states with respect to the input embeddings. The gradient matrix is decomposed with Singular Value Decomposition to analyze which directions in the embedding space are best transferred to the hidden state space, characterized by the largest singular values. We apply our approach to LSTM language models and investigate to what extent and for how long certain classes of words are remembered on average for a certain corpus. Additionally, the extent to which a specific property or relationship is remembered by the RNN can be tracked by comparing a vector characterizing that property with the direction(s) in embedding space that are best preserved in hidden state space. |
Tasks | |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04264v2 |
http://arxiv.org/pdf/1805.04264v2.pdf | |
PWC | https://paperswithcode.com/paper/state-gradients-for-rnn-memory-analysis |
Repo | |
Framework | |
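The gradient-plus-SVD analysis the abstract describes can be reproduced on a toy recurrent cell. The sketch below computes the Jacobian of the final state with respect to an early input embedding by finite differences (the paper uses backpropagation) and decomposes it with SVD; all dimensions and weights are made-up values.

```python
import numpy as np

rng = np.random.default_rng(2)
d_e, d_h, T = 4, 6, 5                      # embedding dim, state dim, sequence length
U = rng.standard_normal((d_h, d_e)) * 0.5  # input-to-state weights
W = rng.standard_normal((d_h, d_h)) * 0.3  # recurrent weights

def final_state(embeddings):
    """Simple tanh RNN; returns the hidden state after the whole sequence."""
    h = np.zeros(d_h)
    for e in embeddings:
        h = np.tanh(W @ h + U @ e)
    return h

E = rng.standard_normal((T, d_e))

# Jacobian of h_T w.r.t. the first input embedding, by central finite differences
eps = 1e-5
J = np.zeros((d_h, d_e))
for j in range(d_e):
    Ep = E.copy(); Ep[0, j] += eps
    Em = E.copy(); Em[0, j] -= eps
    J[:, j] = (final_state(Ep) - final_state(Em)) / (2 * eps)

# top singular directions = embedding directions best preserved in the state
Uj, s, Vt = np.linalg.svd(J)
```

The leading right-singular vectors in `Vt` are the embedding-space directions that survive `T` steps of recurrence, which is exactly what the paper tracks per word class and time lag.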
Modeling of Facial Aging and Kinship: A Survey
Title | Modeling of Facial Aging and Kinship: A Survey |
Authors | Markos Georgopoulos, Yannis Panagakis, Maja Pantic |
Abstract | Computational facial models that capture properties of facial cues related to aging and kinship increasingly attract the attention of the research community, enabling the development of reliable methods for age progression, age estimation, age-invariant facial characterization, and kinship verification from visual data. In this paper, we review recent advances in modeling of facial aging and kinship. In particular, we provide an up-to date, complete list of available annotated datasets and an in-depth analysis of geometric, hand-crafted, and learned facial representations that are used for facial aging and kinship characterization. Moreover, evaluation protocols and metrics are reviewed and notable experimental results for each surveyed task are analyzed. This survey allows us to identify challenges and discuss future research directions for the development of robust facial models in real-world conditions. |
Tasks | Age Estimation |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04636v2 |
http://arxiv.org/pdf/1802.04636v2.pdf | |
PWC | https://paperswithcode.com/paper/modeling-of-facial-aging-and-kinship-a-survey |
Repo | |
Framework | |
Building Ethically Bounded AI
Title | Building Ethically Bounded AI |
Authors | Francesca Rossi, Nicholas Mattei |
Abstract | The more AI agents are deployed in scenarios with possibly unexpected situations, the more they need to be flexible, adaptive, and creative in achieving the goal we have given them. Thus, a certain level of freedom to choose the best path to the goal is inherent in making AI robust and flexible enough. At the same time, however, the pervasive deployment of AI in our life, whether AI is autonomous or collaborating with humans, raises several ethical challenges. AI agents should be aware of and follow appropriate ethical principles, and should thus exhibit properties such as fairness or other virtues. These ethical principles should define the boundaries of AI’s freedom and creativity. However, it is still a challenge to understand how to specify and reason with ethical boundaries in AI agents and how to combine them appropriately with subjective preferences and goal specifications. Some initial attempts employ either a data-driven example-based approach for both, or a symbolic rule-based approach for both. We envision a modular approach where any AI technique can be used for any of these essential ingredients in decision making or decision support systems, paired with a contextual approach to define their combination and relative weight. In a world where neither humans nor AI systems work in isolation, but are tightly interconnected, e.g., the Internet of Things, we also envision a compositional approach to building ethically bounded AI, where the ethical properties of each component can be fruitfully exploited to derive those of the overall system. In this paper we define and motivate the notion of ethically-bounded AI, we describe two concrete examples, and we outline some outstanding challenges. |
Tasks | Decision Making |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03980v1 |
http://arxiv.org/pdf/1812.03980v1.pdf | |
PWC | https://paperswithcode.com/paper/building-ethically-bounded-ai |
Repo | |
Framework | |
ForensicTransfer: Weakly-supervised Domain Adaptation for Forgery Detection
Title | ForensicTransfer: Weakly-supervised Domain Adaptation for Forgery Detection |
Authors | Davide Cozzolino, Justus Thies, Andreas Rössler, Christian Riess, Matthias Nießner, Luisa Verdoliva |
Abstract | Distinguishing manipulated from real images is becoming increasingly difficult as new sophisticated image forgery approaches come out by the day. Naive classification approaches based on Convolutional Neural Networks (CNNs) show excellent performance in detecting image manipulations when they are trained on a specific forgery method. However, on examples from unseen manipulation approaches, their performance drops significantly. To address this limitation in transferability, we introduce Forensic-Transfer (FT). We devise a learning-based forensic detector which adapts well to new domains, i.e., novel manipulation methods, and can handle scenarios where only a handful of fake examples are available during training. To this end, we learn a forensic embedding based on a novel autoencoder-based architecture that can be used to distinguish between real and fake imagery. The learned embedding acts as a form of anomaly detector; namely, an image manipulated by an unseen method will be detected as fake provided it maps sufficiently far away from the cluster of real images. Compared to prior works, FT shows significant improvements in transferability, which we demonstrate in a series of experiments on cutting-edge benchmarks. For instance, on unseen examples, we achieve up to 85% in terms of accuracy, and with only a handful of seen examples, our performance already reaches around 95%. |
Tasks | Domain Adaptation |
Published | 2018-12-06 |
URL | https://arxiv.org/abs/1812.02510v2 |
https://arxiv.org/pdf/1812.02510v2.pdf | |
PWC | https://paperswithcode.com/paper/forensictransfer-weakly-supervised-domain |
Repo | |
Framework | |
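The "embedding as anomaly detector" mechanism in the abstract, flagging an image as fake when its embedding lies far from the cluster of real images, reduces to a simple distance test. The sketch below uses random vectors as stand-in embeddings and a percentile threshold fit on real data only; it is not the paper's autoencoder, and all numbers are made-up values.

```python
import numpy as np

rng = np.random.default_rng(3)
# toy embeddings: "real" images cluster near the origin; a novel manipulation
# maps far away from that cluster (simulated here by a large mean shift)
real = rng.standard_normal((200, 8)) * 0.5
unseen_fake = rng.standard_normal((20, 8)) * 0.5 + 4.0

centroid = real.mean(axis=0)
real_dist = np.linalg.norm(real - centroid, axis=1)
tau = np.quantile(real_dist, 0.95)         # threshold fit from real images only

def is_fake(embedding):
    """Flag as fake anything farther from the real centroid than the threshold."""
    return np.linalg.norm(embedding - centroid) > tau

flags = np.array([is_fake(e) for e in unseen_fake])
```

Because the threshold depends only on real images, the detector can reject manipulations from methods never seen during training, which is the transferability property FT targets.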
Memoryless Exact Solutions for Deterministic MDPs with Sparse Rewards
Title | Memoryless Exact Solutions for Deterministic MDPs with Sparse Rewards |
Authors | Joshua R. Bertram, Peng Wei |
Abstract | We propose an algorithm for deterministic continuous Markov Decision Processes with sparse rewards that computes the optimal policy exactly with no dependency on the size of the state space. The algorithm has time complexity of $O( R^3 \times A^2 )$ and memory complexity of $O( R \times A )$, where $R$ is the number of reward sources and $A$ is the number of actions. Furthermore, we describe a companion algorithm that can follow the optimal policy from any initial state without computing the entire value function, instead computing on-demand the value of states as they are needed. The algorithm to solve the MDP does not depend on the size of the state space for either time or memory complexity, and the ability to follow the optimal policy is linear in time and space with the path length of following the optimal policy from the initial state. We demonstrate the algorithm's operation side by side with value iteration on tractable MDPs. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.07220v1 |
http://arxiv.org/pdf/1805.07220v1.pdf | |
PWC | https://paperswithcode.com/paper/memoryless-exact-solutions-for-deterministic |
Repo | |
Framework | |
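A toy example can show why, with deterministic dynamics and sparse rewards, values need not be tabulated over the whole state space. In the sketch below (which is not the paper's algorithm, just an intuition-builder with made-up numbers), rewards are treated as one-shot terminal payoffs on a small deterministic chain, so the value of a state depends only on its distance to each of the $R$ reward sources.

```python
import numpy as np
from collections import deque

# deterministic transitions on a small chain: actions move left or right
n_states = 10
def step(s, a):                 # a in {-1, +1}
    return min(max(s + a, 0), n_states - 1)

rewards = {2: 1.0, 8: 5.0}      # R sparse reward sources (toy values)
gamma = 0.9

def dist_to(goal):
    """BFS shortest step-count from every state to `goal` (backwards search)."""
    d = {goal: 0}
    q = deque([goal])
    while q:
        s = q.popleft()
        for t in range(n_states):
            for a in (-1, 1):
                if step(t, a) == s and t not in d:
                    d[t] = d[s] + 1
                    q.append(t)
    return d

# value of "go collect one reward source": reward discounted by travel time
V = np.zeros(n_states)
for s in range(n_states):
    V[s] = max(gamma ** dist_to(g)[s] * r for g, r in rewards.items())
```

Only distances to reward sources matter, so the work scales with the number of sources and actions rather than the number of states, the property the paper's $O(R^3 \times A^2)$ bound formalizes in a more general setting.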
A Simple Framework to Leverage State-Of-The-Art Single-Image Super-Resolution Methods to Restore Light Fields
Title | A Simple Framework to Leverage State-Of-The-Art Single-Image Super-Resolution Methods to Restore Light Fields |
Authors | Reuben A. Farrugia, C. Guillemot |
Abstract | Plenoptic cameras offer a cost effective solution to capture light fields by multiplexing multiple views on a single image sensor. However, the high angular resolution is achieved at the expense of reducing the spatial resolution of each view by orders of magnitude compared to the raw sensor image. While light field super-resolution is still at an early stage, the field of single image super-resolution (SISR) has recently seen significant advances with the use of deep learning techniques. This paper describes a simple framework allowing us to leverage state-of-the-art SISR techniques for light fields, while taking into account specific light field geometrical constraints. The idea is to first compute a representation compacting most of the light field energy into as few components as possible. This is achieved by aligning the light field using optical flows and then by decomposing the aligned light field using singular value decomposition (SVD). The principal basis captures the information that is coherent across all the views, while the other bases contain the high angular frequencies. Super-resolving this principal basis using an SISR method allows us to super-resolve all the information that is coherent across the entire light field. This framework allows the proposed light field super-resolution method to inherit the benefits of the SISR method used. Experimental results show that the proposed method is competitive, and most of the time superior, to recent light field super-resolution methods in terms of both PSNR and SSIM quality metrics, with a lower complexity. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10449v1 |
http://arxiv.org/pdf/1809.10449v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-framework-to-leverage-state-of-the |
Repo | |
Framework | |
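The energy-compaction step at the heart of this framework, SVD of the aligned views so that a principal basis carries the cross-view-coherent content, is easy to demonstrate. The sketch below skips the optical-flow alignment and super-resolution stages and uses a synthetic "light field" (shared scene plus small view-specific detail) with made-up sizes.

```python
import numpy as np

rng = np.random.default_rng(4)
n_views, h, w = 9, 16, 16
base = rng.standard_normal((h, w))
# toy aligned light field: each view = shared scene + small view-specific detail
views = np.stack([base + 0.05 * rng.standard_normal((h, w)) for _ in range(n_views)])

M = views.reshape(n_views, -1)                # one row per (aligned) view
U, s, Vt = np.linalg.svd(M, full_matrices=False)

principal = np.outer(U[:, 0], s[0] * Vt[0])   # rank-1 part: coherent across views
residual = M - principal                      # "high angular frequency" remainder

# energy compaction: the principal basis carries nearly all light field energy
ratio = s[0] ** 2 / (s ** 2).sum()
```

Because almost all energy sits in the principal basis, super-resolving that single component (with any SISR network) effectively super-resolves the content shared by every view.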
An Asynchronous Distributed Expectation Maximization Algorithm For Massive Data: The DEM Algorithm
Title | An Asynchronous Distributed Expectation Maximization Algorithm For Massive Data: The DEM Algorithm |
Authors | Sanvesh Srivastava, Glen DePalma, Chuanhai Liu |
Abstract | The family of Expectation-Maximization (EM) algorithms provides a general approach to fitting flexible models for large and complex data. The expectation (E) step of EM-type algorithms is time-consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of EM called the Distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes called \emph{workers} and \emph{managers} for performing the E step and the maximization step (M step), respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of the DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of the local E step. The managers perform the M step after they have received results from a $\gamma$-fraction of the workers, where $\gamma$ is a fixed constant in $(0, 1]$. The sequence of parameter estimates generated by the DEM algorithm retains the attractive properties of EM: convergence of the sequence of parameter estimates to a local mode and linear global rate of convergence. Across diverse simulations focused on linear mixed-effects models, the DEM algorithm is significantly faster than competing EM-type algorithms while having a similar accuracy. The DEM algorithm maintains its superior empirical performance on a movie ratings database consisting of 10 million ratings. |
Tasks | |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07533v1 |
http://arxiv.org/pdf/1806.07533v1.pdf | |
PWC | https://paperswithcode.com/paper/an-asynchronous-distributed-expectation |
Repo | |
Framework | |
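The worker/manager protocol described in the abstract can be sketched on a tiny problem: a 1-D two-component Gaussian mixture with known unit variances and fixed equal mixing weights (simplifications not in the paper). Workers compute E-step sufficient statistics on their shards; the manager runs the M step using only the first γ-fraction of shards, simulating "whichever workers report back first". All data and settings are made-up values.

```python
import numpy as np

rng = np.random.default_rng(5)
# data: two 1-D Gaussian components (unit variance, unknown means -2 and 3)
x = np.concatenate([rng.normal(-2, 1, 3000), rng.normal(3, 1, 3000)])
rng.shuffle(x)
shards = np.array_split(x, 10)             # data partitioned across 10 workers

mu = np.array([-1.0, 1.0])                 # initial mean estimates
gamma_frac = 0.6                           # manager waits for 60% of workers

for _ in range(30):
    # E step, run per worker on its local shard (equal priors assumed)
    stats = []
    for shard in shards:
        ll = -0.5 * (shard[:, None] - mu[None, :]) ** 2
        r = np.exp(ll - ll.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)  # responsibilities
        stats.append((r.sum(axis=0), r.T @ shard))
    # manager M step from the first gamma-fraction of workers to report
    k = int(np.ceil(gamma_frac * len(shards)))
    n_k = sum(s[0] for s in stats[:k])
    s_k = sum(s[1] for s in stats[:k])
    mu = s_k / n_k                         # weighted-mean update per component
```

Because shards are random partitions of the same population, the partial-statistics M step still moves the means toward the true values, which is the intuition behind DEM's convergence guarantee.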
Towards Monocular Digital Elevation Model (DEM) Estimation by Convolutional Neural Networks - Application on Synthetic Aperture Radar Images
Title | Towards Monocular Digital Elevation Model (DEM) Estimation by Convolutional Neural Networks - Application on Synthetic Aperture Radar Images |
Authors | Gabriele Costante, Thomas A. Ciarfuglia, Filippo Biondi |
Abstract | Synthetic aperture radar (SAR) interferometry (InSAR) is performed using repeat-pass geometry. The InSAR technique is used to estimate the topographic reconstruction of the earth surface. The main problem of the range-Doppler focusing technique is the nature of the two-dimensional SAR result, affected by the layover indetermination. In order to resolve this problem, a minimum of two sensor acquisitions, separated by a baseline and extended in the cross-slant-range, are needed. However, given its multi-temporal nature, these techniques are vulnerable to variations in atmospheric and Earth environment parameters, in addition to physical platform instabilities. Furthermore, either two radars are needed or an interferometric cycle is required (that spans from days to weeks), which makes real-time DEM estimation impossible. In this work, the authors propose a novel experimental alternative to the InSAR method that uses single-pass acquisitions, using a data driven approach implemented by Deep Neural Networks. We propose a fully Convolutional Neural Network (CNN) Encoder-Decoder architecture, training it on radar images in order to estimate DEMs from single pass image acquisitions. Our results on a set of Sentinel images show that this method is able to learn to some extent the statistical properties of the DEM. The results of this exploratory analysis are encouraging and open the way to the solution of the single-pass DEM estimation problem with data driven approaches. |
Tasks | |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05387v1 |
http://arxiv.org/pdf/1803.05387v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-monocular-digital-elevation-model-dem |
Repo | |
Framework | |
Testing the Efficient Network TRaining (ENTR) Hypothesis: initially reducing training image size makes Convolutional Neural Network training for image recognition tasks more efficient
Title | Testing the Efficient Network TRaining (ENTR) Hypothesis: initially reducing training image size makes Convolutional Neural Network training for image recognition tasks more efficient |
Authors | Thomas Cherico Wanger, Peter Frohn |
Abstract | Convolutional Neural Networks (CNN) for image recognition tasks are seeing rapid advances in the available architectures and how networks are trained based on large computational infrastructure and standard datasets with millions of images. In contrast, performance and time constraints, for example of small devices and free cloud GPUs, necessitate efficient network training (i.e., highest accuracy in the shortest inference time possible), often on small datasets. Here, we hypothesize that initially decreasing image size during training makes the training process more efficient, because pre-shaping weights with small images and later utilizing these weights with larger images reduces initial network parameters and total inference time. We test this Efficient Network TRaining (ENTR) Hypothesis by training pre-trained Residual Network (ResNet) models (ResNet18, 34, & 50) on three small datasets (steel microstructures, bee images, and geographic aerial images) with a free cloud GPU. Based on three training regimes in which image size is i) not increased, ii) gradually increased, or iii) increased in one step over the training process, we show that initially reducing image size increases training efficiency consistently across datasets and networks. We interpret these results mechanistically in the framework of regularization theory. Support for the ENTR hypothesis is an important contribution, because network efficiency improvements for image recognition tasks are needed for practical applications. In the future, it will be exciting to see how the ENTR hypothesis holds for large standard datasets like ImageNet or CIFAR, to better understand the underlying mechanisms, and how these results compare to other fields such as structural learning. |
Tasks | |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11583v1 |
http://arxiv.org/pdf/1807.11583v1.pdf | |
PWC | https://paperswithcode.com/paper/testing-the-efficient-network-training-entr |
Repo | |
Framework | |
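The three training regimes compared in the abstract amount to three image-size schedules. The sketch below encodes them as a single function; the concrete sizes (64 to 224 pixels) and the halfway switch point are illustrative choices, not the authors' exact settings.

```python
def image_size_schedule(epoch, n_epochs, min_size=64, max_size=224, regime="gradual"):
    """Toy image-size schedules for the three ENTR training regimes.
    Sizes and switch points are illustrative, not the paper's settings."""
    if regime == "constant":          # i) never reduce: full size throughout
        return max_size
    if regime == "one_step":          # iii) small images first, one jump to full size
        return min_size if epoch < n_epochs // 2 else max_size
    # ii) "gradual": linearly interpolate from min_size to max_size
    frac = epoch / max(n_epochs - 1, 1)
    return int(round(min_size + frac * (max_size - min_size)))

sizes = [image_size_schedule(e, 10) for e in range(10)]
```

In a training loop, the returned size would drive the resize transform of the data loader each epoch, so early epochs see cheap low-resolution batches that pre-shape the weights.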
Fast forwarding Egocentric Videos by Listening and Watching
Title | Fast forwarding Egocentric Videos by Listening and Watching |
Authors | Vinicius S. Furlan, Ruzena Bajcsy, Erickson R. Nascimento |
Abstract | The remarkable technological advance in well-equipped wearable devices is pushing an increasing production of long first-person videos. However, since most of these videos have long and tedious parts, they are forgotten or never seen. Despite a large number of techniques proposed to fast-forward these videos by highlighting relevant moments, most of them are image based only. Most of these techniques disregard other relevant sensors present in the current devices such as high-definition microphones. In this work, we propose a new approach to fast-forward videos using psychoacoustic metrics extracted from the soundtrack. These metrics can be used to estimate the annoyance of a segment, allowing our method to emphasize moments of sound pleasantness. The efficiency of our method is demonstrated through qualitative results and quantitative results as far as speed-up and instability are concerned. |
Tasks | |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04620v1 |
http://arxiv.org/pdf/1806.04620v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-forwarding-egocentric-videos-by |
Repo | |
Framework | |
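The mechanism described, mapping per-segment annoyance to a playback speed-up so pleasant audio is emphasized, can be sketched in a few lines. The annoyance scores, speed range, and segment length below are all made-up values; the paper derives its scores from actual psychoacoustic metrics rather than random numbers.

```python
import numpy as np

rng = np.random.default_rng(6)
# per-segment "annoyance" scores from psychoacoustic metrics (toy values in [0, 1])
annoyance = rng.uniform(0.0, 1.0, 12)

min_speed, max_speed = 1.0, 10.0
# pleasant (low-annoyance) segments get a low speed-up, annoying ones are skipped fast
speedup = min_speed + annoyance * (max_speed - min_speed)

fps = 30.0
seg_seconds = 5.0
kept_frames = (seg_seconds * fps / speedup).astype(int)  # frames sampled per segment
```

Concatenating `kept_frames[i]` uniformly sampled frames from each segment yields a fast-forward video that lingers on pleasant audio and races through annoying stretches.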
Approach for Video Classification with Multi-label on YouTube-8M Dataset
Title | Approach for Video Classification with Multi-label on YouTube-8M Dataset |
Authors | Kwangsoo Shin, Junhyeong Jeon, Seungbin Lee, Boyoung Lim, Minsoo Jeong, Jongho Nang |
Abstract | Video traffic is increasing at a considerable rate due to the spread of personal media and advancements in media technology. Accordingly, there is a growing need for techniques to automatically classify moving images. This paper uses NetVLAD and NetFV models with the Huber loss function for the video classification problem, and the YouTube-8M dataset to verify the experiments. We tried various configurations adapted to the dataset and optimized hyperparameters, ultimately obtaining a GAP score of 0.8668. |
Tasks | Video Classification |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.08671v3 |
http://arxiv.org/pdf/1808.08671v3.pdf | |
PWC | https://paperswithcode.com/paper/approach-for-video-classification-with-multi |
Repo | |
Framework | |
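The Huber loss this paper adopts is quadratic for small residuals and linear for large ones, which dampens the influence of outlier labels. A minimal numpy version (a standard definition, not code from the paper) with hand-computed toy inputs:

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Elementwise Huber loss: quadratic near zero, linear in the tails."""
    r = np.abs(y_true - y_pred)
    return np.where(r <= delta, 0.5 * r ** 2, delta * (r - 0.5 * delta))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.2, 0.0, 3.0])
loss = huber(y_true, y_pred).mean()   # residual 3.0 is penalized linearly, not squared
```

The last prediction (residual 3.0) contributes 2.5 instead of the squared-error 4.5, which is why Huber training is less sensitive to grossly wrong multi-label targets.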
A Parameterized Complexity View on Description Logic Reasoning
Title | A Parameterized Complexity View on Description Logic Reasoning |
Authors | Ronald de Haan |
Abstract | Description logics are knowledge representation languages that have been designed to strike a balance between expressivity and computational tractability. Many different description logics have been developed, and numerous computational problems for these logics have been studied for their computational complexity. However, essentially all complexity analyses of reasoning problems for description logics use the one-dimensional framework of classical complexity theory. The multi-dimensional framework of parameterized complexity theory is able to provide a much more detailed image of the complexity of reasoning problems. In this paper we argue that the framework of parameterized complexity has a lot to offer for the complexity analysis of description logic reasoning problems—when one takes a progressive and forward-looking view on parameterized complexity tools. We substantiate our argument by means of three case studies. The first case study is about the problem of concept satisfiability for the logic ALC with respect to nearly acyclic TBoxes. The second case study concerns concept satisfiability for ALC concepts parameterized by the number of occurrences of union operators and the number of occurrences of full existential quantification. The third case study offers a critical look at data complexity results from a parameterized complexity point of view. These three case studies are representative for the wide range of uses for parameterized complexity methods for description logic problems. |
Tasks | |
Published | 2018-08-11 |
URL | http://arxiv.org/abs/1808.03852v1 |
http://arxiv.org/pdf/1808.03852v1.pdf | |
PWC | https://paperswithcode.com/paper/a-parameterized-complexity-view-on |
Repo | |
Framework | |
Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning
Title | Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning |
Authors | Uta Büchler, Biagio Brattoli, Björn Ommer |
Abstract | Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As surrogate task, we jointly address ordering of visual data in the spatial and temporal domain. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning we propose a sampling policy that adapts to the state of the network being trained. Therefore, new permutations are sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest neighbor retrieval. |
Tasks | Transfer Learning, Video Classification |
Published | 2018-07-30 |
URL | http://arxiv.org/abs/1807.11293v1 |
http://arxiv.org/pdf/1807.11293v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-spatiotemporal-self-supervision-by |
Repo | |
Framework | |
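The core sampling idea, drawing permutations in proportion to their expected utility for the current network rather than uniformly, can be sketched as a softmax policy. The utility numbers below are toy values; in the paper they come from a learned, state-dependent policy, not a fixed vector.

```python
import numpy as np

rng = np.random.default_rng(7)
n_perms = 6
# estimated utility of each candidate permutation for improving the network
# (toy numbers; the paper learns these from the training state)
utility = np.array([0.1, 0.5, 2.0, 0.2, 1.5, 0.05])

def sampling_policy(utility, temperature=1.0):
    """Softmax policy: higher expected utility => sampled more often."""
    z = utility / temperature
    z = z - z.max()                 # numerical stabilization
    p = np.exp(z)
    return p / p.sum()

p = sampling_policy(utility)
draws = rng.choice(n_perms, size=5000, p=p)
counts = np.bincount(draws, minlength=n_perms)
```

Re-estimating `utility` as the network evolves turns this into the adaptive curriculum the paper describes: early-useful permutations fade out once the features have absorbed them.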