October 17, 2019

3061 words 15 mins read

Paper Group ANR 949

Facial Expression Analysis under Partial Occlusion: A Survey

Title Facial Expression Analysis under Partial Occlusion: A Survey
Authors Ligang Zhang, Brijesh Verma, Dian Tjondronegoro, Vinod Chandran
Abstract Automatic machine-based Facial Expression Analysis (FEA) has made substantial progress in the past few decades, driven by its importance for applications in psychology, security, health, entertainment and human computer interaction. The vast majority of completed FEA studies are based on non-occluded faces collected in a controlled laboratory environment. Automatic expression recognition tolerant to partial occlusion remains less understood, particularly in real-world scenarios. In recent years, efforts investigating techniques to handle partial occlusion for FEA have increased. The context is right for a comprehensive review of these developments and of the current state of the art. This survey provides such a review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion critical for robust performance in FEA systems. It outlines existing challenges in overcoming partial occlusion and discusses possible opportunities in advancing the technology. To the best of our knowledge, it is the first FEA survey dedicated to occlusion and aimed at promoting better informed and benchmarked future work.
Tasks
Published 2018-02-24
URL http://arxiv.org/abs/1802.08784v1
PDF http://arxiv.org/pdf/1802.08784v1.pdf
PWC https://paperswithcode.com/paper/facial-expression-analysis-under-partial
Repo
Framework

Kernel Embedding Approaches to Orbit Determination of Spacecraft Clusters

Title Kernel Embedding Approaches to Orbit Determination of Spacecraft Clusters
Authors Srinagesh Sharma, James W. Cutler
Abstract This paper presents a novel formulation and solution of orbit determination over finite time horizons as a learning problem. We present an approach to orbit determination under very broad conditions that are satisfied for n-body problems. These weak conditions allow us to perform orbit determination with noisy and highly non-linear observations, such as those presented by range-rate only (Doppler only) observations. We show that domain generalization and distribution regression techniques can learn to estimate the orbits of a group of satellites and identify individual satellites, especially with prior understanding of correlations between orbits, and we provide asymptotic convergence conditions. The approach presented requires only visibility and observability of the underlying state from observations and is particularly useful for autonomous spacecraft operations using low-cost ground stations or sensors. We validate the orbit determination approach using observations of two spacecraft (GRIFEX and MCubed-2) along with synthetic datasets of multiple spacecraft deployments and lunar orbits. We also provide a comparison with a standard technique, the extended Kalman filter (EKF), under highly noisy conditions.
Tasks Domain Generalization
Published 2018-03-01
URL http://arxiv.org/abs/1803.00650v1
PDF http://arxiv.org/pdf/1803.00650v1.pdf
PWC https://paperswithcode.com/paper/kernel-embedding-approaches-to-orbit
Repo
Framework
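
The abstract above frames orbit determination as learning from sets of noisy observations via distribution regression. As a rough illustration of that idea (not the authors' implementation), the sketch below embeds each observation set with a kernel mean embedding and regresses orbital parameters with kernel ridge regression; the synthetic data, kernel bandwidth and two-dimensional parameterization are placeholders.

```python
# A minimal sketch (not the paper's implementation) of distribution regression with
# kernel mean embeddings: each training example is a *set* of noisy range-rate
# samples, and the target is an orbital-parameter vector.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

def mean_embedding_gram(sets_a, sets_b, gamma=1.0):
    """Gram matrix of mean embeddings: K[i, j] = mean_{x in A_i, y in B_j} k(x, y)."""
    K = np.zeros((len(sets_a), len(sets_b)))
    for i, A in enumerate(sets_a):
        for j, B in enumerate(sets_b):
            K[i, j] = rbf_kernel(A, B, gamma=gamma).mean()
    return K

# Synthetic stand-in data: 50 "passes", each a set of 100 scalar range-rate samples,
# labelled with a 2-D orbital-parameter vector (purely illustrative).
params = rng.uniform(-1, 1, size=(50, 2))
passes = [rng.normal(loc=p[0], scale=0.1 + 0.2 * abs(p[1]), size=(100, 1)) for p in params]

K_train = mean_embedding_gram(passes, passes)
model = KernelRidge(alpha=1e-3, kernel="precomputed").fit(K_train, params)

K_test = mean_embedding_gram(passes[:5], passes)   # embed new passes against the training set
print(model.predict(K_test))                       # estimated orbital parameters
```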

A Multi-Modal Approach to Infer Image Affect

Title A Multi-Modal Approach to Infer Image Affect
Authors Ashok Sundaresan, Sugumar Murugesan, Sean Davis, Karthik Kappaganthu, ZhongYi Jin, Divya Jain, Anurag Maunder
Abstract The group affect or emotion in an image of people can be inferred by extracting features about both the people in the picture and the overall makeup of the scene. The state of the art on this problem investigates a combination of facial features, scene extraction and even audio tonality. This paper combines three additional modalities, namely human pose, text-based tagging and CNN-extracted features/predictions. To the best of our knowledge, this is the first time all of these modalities have been extracted using deep neural networks. We evaluate the performance of our approach against baselines and identify insights throughout this paper.
Tasks
Published 2018-03-13
URL http://arxiv.org/abs/1803.05070v1
PDF http://arxiv.org/pdf/1803.05070v1.pdf
PWC https://paperswithcode.com/paper/a-multi-modal-approach-to-infer-image-affect
Repo
Framework
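
As a hedged illustration of the multi-modal idea described above, the sketch below fuses placeholder feature vectors for the face, scene, pose and text modalities by simple concatenation followed by a linear classifier. The extractors, feature dimensions and fusion strategy are assumptions for illustration, not the paper's pipeline.

```python
# Late fusion over the modalities mentioned in the abstract. The per-modality
# extractors here are placeholders (random features); only the fusion pattern
# is illustrated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_images = 200

# Placeholder feature vectors per modality (in practice: outputs of deep models).
face_feats  = rng.normal(size=(n_images, 128))   # facial expression features
scene_feats = rng.normal(size=(n_images, 256))   # global scene descriptor
pose_feats  = rng.normal(size=(n_images, 64))    # human-pose encoding
text_feats  = rng.normal(size=(n_images, 300))   # embedding of predicted tags
labels      = rng.integers(0, 3, size=n_images)  # negative / neutral / positive affect

# Late fusion: concatenate modality features, then fit a single classifier.
fused = np.concatenate([face_feats, scene_feats, pose_feats, text_feats], axis=1)
clf = LogisticRegression(max_iter=1000).fit(fused, labels)
print("train accuracy:", clf.score(fused, labels))
```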

Institutional Metaphors for Designing Large-Scale Distributed AI versus AI Techniques for Running Institutions

Title Institutional Metaphors for Designing Large-Scale Distributed AI versus AI Techniques for Running Institutions
Authors Alexander Boer, Giovanni Sileno
Abstract Artificial Intelligence (AI) started out with an ambition to reproduce the human mind, but, as the sheer scale of that ambition became apparent, quickly retreated into either studying specialized intelligent behaviours, or proposing overarching architectural concepts for interfacing specialized intelligent behaviour components, conceived of as agents in a kind of organization. This agent-based modeling paradigm, in turn, proves to have interesting applications in understanding, simulating, and predicting the behaviour of social and legal structures on an aggregate level. This chapter examines a number of relevant cross-cutting concerns, conceptualizations, modeling problems and design challenges in large-scale distributed Artificial Intelligence, as well as in institutional systems, and identifies potential grounds for novel advances.
Tasks
Published 2018-03-09
URL http://arxiv.org/abs/1803.03407v1
PDF http://arxiv.org/pdf/1803.03407v1.pdf
PWC https://paperswithcode.com/paper/institutional-metaphors-for-designing-large
Repo
Framework

Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition

Title Interpreting DNN output layer activations: A strategy to cope with unseen data in speech recognition
Authors Vikramjit Mitra, Horacio Franco
Abstract Unseen data can degrade the performance of deep neural net acoustic models. To cope with unseen data, adaptation techniques are deployed. For unlabeled unseen data, one must generate some hypothesis given an existing model, which is then used as the label for model adaptation. However, assessing the goodness of the hypothesis can be difficult, and an erroneous hypothesis can lead to poorly trained models. In such cases, a strategy to select data having a reliable hypothesis can ensure better model adaptation. This work proposes a data-selection strategy for DNN model adaptation, where DNN output layer activations are used to ascertain the goodness of a generated hypothesis. In a DNN acoustic model, the output layer activations are used to generate target class probabilities. Under unseen data conditions, the difference between the most probable target and the next most probable target decreases compared to that for seen data, indicating that the model may be uncertain while generating its hypothesis. This work proposes a strategy to assess a model’s performance by analyzing the output layer activations using a distance measure between the most likely target and the next most likely target, which is then used to select data for unsupervised adaptation.
Tasks Speech Recognition
Published 2018-02-16
URL http://arxiv.org/abs/1802.06861v1
PDF http://arxiv.org/pdf/1802.06861v1.pdf
PWC https://paperswithcode.com/paper/interpreting-dnn-output-layer-activations-a
Repo
Framework
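
A minimal sketch of the data-selection rule described in the abstract above: score each utterance by the average gap between its top two output-layer posteriors and keep only high-scoring utterances for unsupervised adaptation. The threshold and toy inputs are illustrative, not values from the paper.

```python
# Per frame, measure the gap between the most likely and second most likely class
# posterior, average it over an utterance, and keep only utterances whose gap
# exceeds a threshold.
import numpy as np

def utterance_confidence(posteriors):
    """posteriors: (frames, classes) softmax outputs of the acoustic model."""
    top2 = np.sort(posteriors, axis=1)[:, -2:]   # two largest per frame
    margins = top2[:, 1] - top2[:, 0]            # best minus second best
    return margins.mean()

def select_for_adaptation(utterances, threshold=0.5):
    """Keep utterances whose average top-2 margin suggests a reliable hypothesis."""
    return [u for u in utterances if utterance_confidence(u) >= threshold]

# Toy example: one confident and one uncertain utterance (3 classes).
confident = np.array([[0.9, 0.05, 0.05], [0.8, 0.15, 0.05]])
uncertain = np.array([[0.4, 0.35, 0.25], [0.45, 0.4, 0.15]])
print(len(select_for_adaptation([confident, uncertain], threshold=0.5)))  # -> 1
```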

A case for multiple and parallel RRAMs as synaptic model for training SNNs

Title A case for multiple and parallel RRAMs as synaptic model for training SNNs
Authors Aditya Shukla, Sidharth Prasad, Sandip Lashkare, Udayan Ganguly
Abstract To enable dense integration of model synapses in spiking neural network (SNN) hardware, various nano-scale devices are being considered. Such a device, besides exhibiting spike-time dependent plasticity (STDP), needs to be highly scalable, have a large endurance and require low energy for transitioning between states. In this work, we first introduce and empirically determine two new specifications for a synapse in SNNs: the number of conductance levels per synapse and the maximum learning rate. To the best of our knowledge, there are no RRAMs that meet the latter specification. As a solution, we propose the use of multiple PCMO-RRAMs in parallel within a synapse. During synaptic reading, all PCMO-RRAMs are read simultaneously, and for each synaptic conductance-change event, the STDP conductance update is initiated for only one RRAM, randomly picked from the set. Second, to validate our solution, we experimentally demonstrate STDP of conductance of a PCMO-RRAM and then show that, due to a large learning rate, a single PCMO-RRAM fails to model a synapse in the training of an SNN. Third, as anticipated, network training improves as more PCMO-RRAMs are added to the synapse. Fourth, we discuss the circuit requirements for implementing such a scheme and conclude that the requirements are within bounds. Thus, our work presents specifications for synaptic devices in trainable SNNs, indicates the shortcomings of state-of-the-art synaptic contenders, provides a solution to extrinsically meet the specifications, and discusses the peripheral circuitry that implements the solution.
Tasks
Published 2018-03-13
URL http://arxiv.org/abs/1803.04773v1
PDF http://arxiv.org/pdf/1803.04773v1.pdf
PWC https://paperswithcode.com/paper/a-case-for-multiple-and-parallel-rrams-as
Repo
Framework
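
The following toy sketch mirrors the read/update scheme described above: the synapse holds several parallel conductances, a read sums all of them, and each STDP event updates a single randomly chosen device, quantized to its discrete levels. The device count, number of levels and update magnitude are assumptions for illustration, not the paper's device parameters.

```python
# A toy behavioural model (not the paper's device physics) of a synapse built from
# multiple parallel RRAMs with discrete conductance levels.
import numpy as np

class ParallelRRAMSynapse:
    def __init__(self, n_devices=4, n_levels=16, g_min=0.0, g_max=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.levels = np.linspace(g_min, g_max, n_levels)
        self.g = self.rng.choice(self.levels, size=n_devices)

    def read(self):
        # All devices are read simultaneously; the effective weight is the parallel sum.
        return self.g.sum()

    def stdp_update(self, delta_g):
        # Only one randomly picked device is potentiated/depressed per event,
        # which divides the effective learning rate by the number of devices.
        i = self.rng.integers(len(self.g))
        target = np.clip(self.g[i] + delta_g, self.levels[0], self.levels[-1])
        self.g[i] = self.levels[np.argmin(np.abs(self.levels - target))]  # snap to a level

syn = ParallelRRAMSynapse()
print(syn.read())
syn.stdp_update(+0.1)
print(syn.read())
```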

ECG Arrhythmia Classification Using Transfer Learning from 2-Dimensional Deep CNN Features

Title ECG Arrhythmia Classification Using Transfer Learning from 2-Dimensional Deep CNN Features
Authors Milad Salem, Shayan Taheri, Jiann Shiun-Yuan
Abstract Due to recent advances in the area of deep learning, it has been demonstrated that a deep neural network, trained on a huge amount of data, can recognize cardiac arrhythmias better than cardiologists. Moreover, feature extraction was traditionally considered an integral part of ECG pattern recognition; however, recent findings have shown that deep neural networks can carry out the task of feature extraction directly from the data itself. In order to use deep neural networks for their accuracy and feature extraction, a high volume of training data is required, which in the case of independent studies is not pragmatic. To rise to this challenge, in this work the identification and classification of four ECG patterns are studied from a transfer learning perspective, transferring knowledge learned from the image classification domain to the ECG signal classification domain. It is demonstrated that feature maps learned in a deep neural network trained on great amounts of generic input images can be used as general descriptors for the ECG signal spectrograms and result in features that enable classification of arrhythmias. Overall, an accuracy of 97.23 percent is achieved in classifying nearly 7,000 instances by ten-fold cross-validation.
Tasks Image Classification, Transfer Learning
Published 2018-12-11
URL http://arxiv.org/abs/1812.04693v1
PDF http://arxiv.org/pdf/1812.04693v1.pdf
PWC https://paperswithcode.com/paper/ecg-arrhythmia-classification-using-transfer
Repo
Framework
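
A hedged sketch of the pipeline as described in the abstract: ECG segments are converted to spectrograms, passed through an ImageNet-pretrained CNN used purely as a fixed feature extractor, and classified under ten-fold cross-validation. The specific backbone (ResNet-18), spectrogram settings and random stand-in data are assumptions, not the paper's configuration.

```python
# Transfer learning from 2-D image features to ECG spectrograms (illustrative shapes).
import numpy as np
import torch
from scipy.signal import spectrogram
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

cnn = resnet18(weights=ResNet18_Weights.DEFAULT)
cnn.fc = torch.nn.Identity()          # use the pretrained network as a feature extractor
cnn.eval()

def ecg_to_feature(signal, fs=360):
    _, _, S = spectrogram(signal, fs=fs, nperseg=64)
    img = torch.tensor(np.log1p(S), dtype=torch.float32)
    img = img.unsqueeze(0).repeat(3, 1, 1)                           # fake 3 channels
    img = torch.nn.functional.interpolate(img.unsqueeze(0), size=(224, 224))
    with torch.no_grad():
        return cnn(img).squeeze(0).numpy()                           # 512-d descriptor

# Toy data: random "beats"; in practice, labelled arrhythmia segments would be used.
X = np.stack([ecg_to_feature(np.random.randn(2000)) for _ in range(40)])
y = np.tile(np.arange(4), 10)                                        # 4 balanced classes
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10))
```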

Diversity regularization in deep ensembles

Title Diversity regularization in deep ensembles
Authors Changjian Shui, Azadeh Sadat Mozafari, Jonathan Marek, Ihsen Hedhli, Christian Gagné
Abstract Calibrating the confidence of supervised learning models is important in a variety of contexts where the certainty over predictions should be reliable. However, it has been reported that deep neural network models are often too poorly calibrated to achieve complex tasks requiring reliable uncertainty estimates in their predictions. In this work, we propose a strategy for training deep ensembles with a diversity function regularization, which improves the calibration property while maintaining similar prediction accuracy.
Tasks Calibration
Published 2018-02-22
URL http://arxiv.org/abs/1802.07881v1
PDF http://arxiv.org/pdf/1802.07881v1.pdf
PWC https://paperswithcode.com/paper/diversity-regularization-in-deep-ensembles
Repo
Framework
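
A minimal sketch of diversity-regularized ensemble training, assuming one particular diversity term (penalizing pairwise agreement between member predictions); the paper's exact regularizer and weighting may differ.

```python
# Sum of member cross-entropies plus a penalty on pairwise agreement, so members
# are pushed toward diverse predictions while each still fits the labels.
import torch
import torch.nn as nn

def ensemble_loss(logits_list, targets, lam=0.1):
    ce = nn.functional.cross_entropy
    loss = sum(ce(logits, targets) for logits in logits_list)
    probs = [torch.softmax(l, dim=1) for l in logits_list]
    agreement = 0.0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            agreement = agreement + (probs[i] * probs[j]).sum(dim=1).mean()
    return loss + lam * agreement        # penalize agreement -> encourage diversity

# Toy usage with an ensemble of three small MLPs on random data.
members = [nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5)) for _ in range(3)]
opt = torch.optim.Adam([p for m in members for p in m.parameters()], lr=1e-3)
x, y = torch.randn(32, 20), torch.randint(0, 5, (32,))
opt.zero_grad()
loss = ensemble_loss([m(x) for m in members], y)
loss.backward()
opt.step()
```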

Peeking Behind Objects: Layered Depth Prediction from a Single Image

Title Peeking Behind Objects: Layered Depth Prediction from a Single Image
Authors Helisa Dhamo, Keisuke Tateno, Iro Laina, Nassir Navab, Federico Tombari
Abstract While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects. This limits the use of depth prediction in augmented and virtual reality applications, which aim at scene exploration by synthesizing the scene from a different vantage point, or at diminished reality. To address this issue, we shift the focus from conventional depth map prediction to the regression of a specific data representation called a Layered Depth Image (LDI), which contains information about the occluded regions in the reference frame and can fill in occlusion gaps in case of small view changes. We propose a novel approach based on Convolutional Neural Networks (CNNs) to jointly predict depth maps and foreground separation masks used to condition Generative Adversarial Networks (GANs) for hallucinating plausible color and depth in the initially occluded areas. We demonstrate the effectiveness of our approach for novel scene view synthesis from a single image.
Tasks Depth Estimation
Published 2018-07-23
URL http://arxiv.org/abs/1807.08776v1
PDF http://arxiv.org/pdf/1807.08776v1.pdf
PWC https://paperswithcode.com/paper/peeking-behind-objects-layered-depth
Repo
Framework
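
For readers unfamiliar with the representation named above, the sketch below shows a minimal Layered Depth Image container: each pixel carries an ordered stack of depth/color samples, with layer 0 matching a conventional depth map and deeper layers describing occluded surfaces. The two-layer layout and array types are assumptions for illustration, not the paper's implementation.

```python
# A small container for the Layered Depth Image (LDI) representation.
import numpy as np

class LayeredDepthImage:
    def __init__(self, height, width, n_layers=2):
        self.depth = np.full((n_layers, height, width), np.inf, dtype=np.float32)
        self.color = np.zeros((n_layers, height, width, 3), dtype=np.float32)

    def foreground(self):
        """Layer 0 is what a conventional single-image depth predictor would output."""
        return self.depth[0], self.color[0]

    def background(self, layer=1):
        """Deeper layers hold the occluded geometry needed for small view changes."""
        return self.depth[layer], self.color[layer]

ldi = LayeredDepthImage(240, 320)
print(ldi.foreground()[0].shape, ldi.background()[0].shape)
```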

Strong mixed-integer programming formulations for trained neural networks

Title Strong mixed-integer programming formulations for trained neural networks
Authors Ross Anderson, Joey Huchette, Christian Tjandraatmadja, Juan Pablo Vielma
Abstract We present an ideal mixed-integer programming (MIP) formulation for a rectified linear unit (ReLU) appearing in a trained neural network. Our formulation requires a single binary variable and no additional continuous variables beyond the input and output variables of the ReLU. We contrast it with an ideal “extended” formulation with a linear number of additional continuous variables, derived through standard techniques. An apparent drawback of our formulation is that it requires an exponential number of inequality constraints, but we provide a routine to separate the inequalities in linear time. We also prove that these exponentially-many constraints are facet-defining under mild conditions. Finally, we study network verification problems and observe that dynamically separating from the exponential inequalities 1) is much more computationally efficient and scalable than the extended formulation, 2) decreases the solve time of a state-of-the-art MIP solver by a factor of 7 on smaller instances, and 3) nearly matches the dual bounds of a state-of-the-art MIP solver on harder instances, after just a few rounds of separation and in orders of magnitude less time.
Tasks
Published 2018-11-20
URL http://arxiv.org/abs/1811.08359v2
PDF http://arxiv.org/pdf/1811.08359v2.pdf
PWC https://paperswithcode.com/paper/strong-mixed-integer-programming-formulations
Repo
Framework
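
For context, the sketch below encodes a single ReLU with the standard big-M mixed-integer formulation that stronger formulations (such as the ideal one proposed in this paper) improve upon; the weights, input bounds and the choice of PuLP for modeling are illustrative assumptions.

```python
# Standard big-M MIP encoding of y = max(0, w.x + b), shown for context only.
import pulp

w, b = [1.0, -2.0], 0.5
lo, hi = [-1.0, -1.0], [1.0, 1.0]                 # known bounds on the inputs

# Bounds on the pre-activation w.x + b over the input box.
L = sum(min(wi * l, wi * u) for wi, l, u in zip(w, lo, hi)) + b
U = sum(max(wi * l, wi * u) for wi, l, u in zip(w, lo, hi)) + b

prob = pulp.LpProblem("relu_big_m", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", lowBound=lo[i], upBound=hi[i]) for i in range(2)]
y = pulp.LpVariable("y", lowBound=0)
z = pulp.LpVariable("z", cat="Binary")            # indicator: is the unit active?

pre = pulp.lpSum(wi * xi for wi, xi in zip(w, x)) + b
prob += y >= pre                                  # y is at least the pre-activation
prob += y <= pre - L * (1 - z)                    # tight when z = 1 (active)
prob += y <= U * z                                # forces y = 0 when z = 0
prob += 1.0 * y                                   # objective: maximize the ReLU output
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(y))                              # -> 3.5 at x = (1, -1)
```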

Generative Adversarial Networks using Adaptive Convolution

Title Generative Adversarial Networks using Adaptive Convolution
Authors Nhat M. Nguyen, Nilanjan Ray
Abstract Most existing GAN architectures that generate images use transposed convolution or resize-convolution as the upsampling algorithm from lower- to higher-resolution feature maps in the generator. We argue that this kind of fixed operation makes it difficult for GANs to model objects that have very different visual appearances. We propose a novel adaptive convolution method that learns the upsampling algorithm based on the local context at each location to address this problem. We modify a baseline GAN architecture by replacing normal convolutions with adaptive convolutions in the generator. Experiments on the CIFAR-10 dataset show that our modified models improve the baseline model by a large margin. Furthermore, our models achieve state-of-the-art performance on the CIFAR-10 and STL-10 datasets in the unsupervised setting.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1802.02226v1
PDF http://arxiv.org/pdf/1802.02226v1.pdf
PWC https://paperswithcode.com/paper/generative-adversarial-networks-using
Repo
Framework
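
A simplified sketch of a per-location (adaptive) convolution in the spirit of the abstract above: a small network predicts a separate kernel for every spatial position from the local context, and the kernel is applied via unfold. The kernel-prediction network and layer sizes are assumptions; this is not the paper's exact operator.

```python
# Per-location convolution ("dynamic local filtering") as an adaptive alternative
# to a fixed convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k, self.out_ch = k, out_ch
        # Predicts out_ch * in_ch * k * k kernel weights per spatial location.
        self.kernel_net = nn.Conv2d(in_ch, out_ch * in_ch * k * k, 3, padding=1)

    def forward(self, x):
        b, c, h, w = x.shape
        k = self.k
        kernels = self.kernel_net(x)                       # (B, out*in*k*k, H, W)
        kernels = kernels.view(b, self.out_ch, c * k * k, h * w)
        patches = F.unfold(x, k, padding=k // 2)           # (B, in*k*k, H*W)
        patches = patches.unsqueeze(1)                     # (B, 1, in*k*k, H*W)
        out = (kernels * patches).sum(dim=2)               # (B, out, H*W)
        return out.view(b, self.out_ch, h, w)

layer = AdaptiveConv2d(8, 16)
print(layer(torch.randn(2, 8, 32, 32)).shape)              # torch.Size([2, 16, 32, 32])
```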

Deep Co-attention based Comparators For Relative Representation Learning in Person Re-identification

Title Deep Co-attention based Comparators For Relative Representation Learning in Person Re-identification
Authors Lin Wu, Yang Wang, Junbin Gao, Dacheng Tao
Abstract Person re-identification (re-ID) requires rapid, flexible yet discriminant representations to quickly generalize to unseen observations on-the-fly and recognize the same identity across disjoint camera views. Recent effective methods are developed in a pair-wise similarity learning system to detect a fixed set of features from distinct regions which are mapped to their vector embeddings for distance measuring. However, the most relevant and crucial parts of each image are detected independently, without referring to the dependency conditioned on one another. Also, these region-based methods rely on spatial manipulation to position the local features for comparable similarity measuring. To combat these limitations, in this paper we introduce the Deep Co-attention based Comparators (DCCs), which fuse the co-dependent representations of the paired images so as to focus on the relevant parts of both images and produce their relative representations. Given a pair of pedestrian images to be compared, the proposed model mimics the foveation of human eyes to detect distinct regions concurrently on both images, namely co-dependent features, and alternately attends to relevant regions to fuse them into the similarity learning. Our comparator is capable of producing dynamic representations relative to a particular sample every time, and is thus well suited to the case of re-identifying pedestrians on-the-fly. We perform extensive experiments to provide insights into and demonstrate the effectiveness of the proposed DCCs in person re-ID. Moreover, our approach has achieved state-of-the-art performance on three benchmark data sets: DukeMTMC-reID, CUHK03, and Market-1501.
Tasks Foveation, Person Re-Identification, Representation Learning
Published 2018-04-30
URL http://arxiv.org/abs/1804.11027v1
PDF http://arxiv.org/pdf/1804.11027v1.pdf
PWC https://paperswithcode.com/paper/deep-co-attention-based-comparators-for
Repo
Framework
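
A hedged sketch of the co-attention pattern described above: an affinity matrix between the spatial features of two pedestrian images lets each image attend to the other, and the attended features are fused into paired relative representations. This simplification omits the full DCC architecture, its feature backbone and training losses.

```python
# Co-attention between the spatial features of a pair of images.
import torch
import torch.nn.functional as F

def co_attend(feat_a, feat_b):
    """feat_a, feat_b: (B, C, N) spatial features (N = H*W) of the two images."""
    affinity = torch.einsum('bcn,bcm->bnm', feat_a, feat_b)          # (B, N, M)
    a_to_b = torch.einsum('bcm,bnm->bcn', feat_b, F.softmax(affinity, dim=2))
    b_to_a = torch.einsum('bcn,bnm->bcm', feat_a, F.softmax(affinity, dim=1))
    # Fuse each image's features with what it attends to in the other image.
    rep_a = torch.cat([feat_a, a_to_b], dim=1).mean(dim=2)           # (B, 2C)
    rep_b = torch.cat([feat_b, b_to_a], dim=1).mean(dim=2)
    return rep_a, rep_b

fa, fb = torch.randn(4, 256, 49), torch.randn(4, 256, 49)
ra, rb = co_attend(fa, fb)
print(ra.shape, torch.cosine_similarity(ra, rb).shape)               # similarity per pair
```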

Fairness Through Computationally-Bounded Awareness

Title Fairness Through Computationally-Bounded Awareness
Authors Michael P. Kim, Omer Reingold, Guy N. Rothblum
Abstract We study the problem of fair classification within the versatile framework of Dwork et al. [ITCS ‘12], which assumes the existence of a metric that measures similarity between pairs of individuals. Unlike earlier work, we do not assume that the entire metric is known to the learning algorithm; instead, the learner can query this arbitrary metric a bounded number of times. We propose a new notion of fairness called metric multifairness and show how to achieve this notion in our setting. Metric multifairness is parameterized by a similarity metric $d$ on pairs of individuals to classify and a rich collection ${\cal C}$ of (possibly overlapping) “comparison sets” over pairs of individuals. At a high level, metric multifairness guarantees that similar subpopulations are treated similarly, as long as these subpopulations are identified within the class ${\cal C}$.
Tasks
Published 2018-03-08
URL http://arxiv.org/abs/1803.03239v2
PDF http://arxiv.org/pdf/1803.03239v2.pdf
PWC https://paperswithcode.com/paper/fairness-through-computationally-bounded
Repo
Framework

Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization

Title Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization
Authors Rad Niazadeh, Tim Roughgarden, Joshua R. Wang
Abstract In this paper we study the fundamental problems of maximizing a continuous non-monotone submodular function over the hypercube, both with and without coordinate-wise concavity. This family of optimization problems has several applications in machine learning, economics, and communication systems. Our main result is the first $\frac{1}{2}$-approximation algorithm for continuous submodular function maximization; this approximation factor of $\frac{1}{2}$ is the best possible for algorithms that only query the objective function at polynomially many points. For the special case of DR-submodular maximization, i.e. when the submodular function is also coordinate-wise concave along all coordinates, we provide a different $\frac{1}{2}$-approximation algorithm that runs in quasilinear time. Both of these results improve upon prior work [Bian et al, 2017, Soma and Yoshida, 2017]. Our first algorithm uses novel ideas such as reducing the guaranteed approximation problem to analyzing a zero-sum game for each coordinate, and incorporates the geometry of this zero-sum game to fix the value at this coordinate. Our second algorithm exploits coordinate-wise concavity to identify a monotone equilibrium condition sufficient for getting the required approximation guarantee, and hunts for the equilibrium point using binary search. We further run experiments to verify the performance of our proposed algorithms in related machine learning applications.
Tasks
Published 2018-05-24
URL http://arxiv.org/abs/1805.09480v1
PDF http://arxiv.org/pdf/1805.09480v1.pdf
PWC https://paperswithcode.com/paper/optimal-algorithms-for-continuous-non
Repo
Framework

Learning Inward Scaled Hypersphere Embedding: Exploring Projections in Higher Dimensions

Title Learning Inward Scaled Hypersphere Embedding: Exploring Projections in Higher Dimensions
Authors Muhammad Kamran Janjua, Shah Nawaz, Alessandro Calefati, Ignazio Gallo
Abstract The majority of current dimensionality reduction or retrieval techniques rely on embedding the learned feature representations onto a computable metric space. Once the learned features are mapped, a distance metric aids the bridging of gaps between similar instances. Since the scaled projection is not exploited in these methods, discriminative embedding onto a hyperspace becomes a challenge. In this paper, we propose to inwardly scale feature representations in proportion to projecting them onto a hypersphere manifold for discriminative analysis. We further propose a novel, yet simpler, convolutional neural network based architecture and extensively evaluate the proposed methodology in the context of classification and retrieval tasks, obtaining results comparable to state-of-the-art techniques.
Tasks Dimensionality Reduction
Published 2018-10-16
URL http://arxiv.org/abs/1810.07037v1
PDF http://arxiv.org/pdf/1810.07037v1.pdf
PWC https://paperswithcode.com/paper/learning-inward-scaled-hypersphere-embedding
Repo
Framework
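
A minimal sketch of the idea as read from the abstract: L2-normalize features onto the unit hypersphere and apply a learnable inward (less than one) scale before classification. The sigmoid-bounded scale and layer sizes are assumptions; the paper's exact scaling rule may differ.

```python
# Hypersphere embedding with an inward (<= 1) learnable scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InwardScaledEmbedding(nn.Module):
    def __init__(self, in_dim, embed_dim, n_classes):
        super().__init__()
        self.proj = nn.Linear(in_dim, embed_dim)
        self.scale = nn.Parameter(torch.tensor(0.0))      # sigmoid keeps the scale in (0, 1)
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, x):
        z = F.normalize(self.proj(x), dim=1)               # project onto the unit hypersphere
        z = torch.sigmoid(self.scale) * z                  # inward scaling of the embedding
        return self.classifier(z), z

model = InwardScaledEmbedding(128, 64, 10)
logits, emb = model(torch.randn(8, 128))
print(logits.shape, emb.norm(dim=1))                       # embedding norms shrink below 1
```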