October 16, 2019

Paper Group ANR 979

A method to Suppress Facial Expression in Posed and Spontaneous Videos. SRP: Efficient class-aware embedding learning for large-scale data via supervised random projections. Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks. Global Convergence to the Equilibrium of GANs using Variational Inequalities. Mining …

A method to Suppress Facial Expression in Posed and Spontaneous Videos


Title	A method to Suppress Facial Expression in Posed and Spontaneous Videos
Authors	Ghada Zamzmi, Gabriel Ruiz, Matthew Shreve, Dmitry Goldgof, Rangachar Kasturi, Sudeep Sarkar
Abstract	We address the problem of suppressing facial expressions in videos because expressions can hinder the retrieval of important information in applications such as face recognition. To achieve this, we present an optical strain suppression method that removes any facial expression without requiring training for a specific expression. For each frame in a video, an optical strain map that provides the strain magnitude value at each pixel is generated; this strain map is then utilized to neutralize the expression by replacing pixels of high strain values with pixels from a reference face frame. Experimental results of testing the method on various expressions namely happiness, sadness, and anger for two publicly available data sets (i.e., BU-4DFE and AM-FED) show the ability of our method in suppressing facial expressions.
Tasks	Face Recognition
Published	2018-10-04
URL	http://arxiv.org/abs/1810.02401v1
PDF	http://arxiv.org/pdf/1810.02401v1.pdf
PWC	https://paperswithcode.com/paper/a-method-to-suppress-facial-expression-in
Repo
Framework
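
The replacement step described in the abstract above (swap high-strain pixels for pixels from a reference frame) can be sketched roughly as follows. This is a minimal illustration only: it assumes the strain map is already computed and the two frames are aligned, and the function name `suppress_expression` and the `threshold` value are hypothetical, not taken from the paper.

```python
import numpy as np

def suppress_expression(frame, reference, strain, threshold=0.5):
    """Replace high-strain pixels of `frame` with pixels from `reference`.

    frame, reference: aligned arrays of shape (H, W) or (H, W, C).
    strain: array of shape (H, W) holding a strain magnitude per pixel.
    threshold: hypothetical cutoff; the paper's actual criterion may differ.
    """
    mask = strain > threshold      # True where the strain magnitude is high
    out = frame.copy()             # leave the input frame untouched
    out[mask] = reference[mask]    # copy reference pixels into masked regions
    return out

# Tiny synthetic example: two of the four pixels exceed the threshold.
frame = np.full((2, 2), 10.0)
reference = np.zeros((2, 2))
strain = np.array([[0.9, 0.1], [0.2, 0.8]])
result = suppress_expression(frame, reference, strain)
```

Computing the optical strain map itself (from optical flow between frames) is not shown here.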

SRP: Efficient class-aware embedding learning for large-scale data via supervised random projections


Title	SRP: Efficient class-aware embedding learning for large-scale data via supervised random projections
Authors	Amir-Hossein Karimi, Alexander Wong, Ali Ghodsi
Abstract	Supervised dimensionality reduction strategies have been of great interest. However, current supervised dimensionality reduction approaches are difficult to scale for situations characterized by large datasets given the high computational complexities associated with such methods. While stochastic approximation strategies have been explored for unsupervised dimensionality reduction to tackle this challenge, such approaches are not well-suited for accelerating computational speed for supervised dimensionality reduction. Motivated to tackle this challenge, in this study we explore a novel direction of directly learning optimal class-aware embeddings in a supervised manner via the notion of supervised random projections (SRP). The key idea behind SRP is that, rather than performing spectral decomposition (or approximations thereof), which is computationally prohibitive for large-scale data, we instead perform a direct decomposition by leveraging kernel approximation theory and the symmetry of the Hilbert-Schmidt Independence Criterion (HSIC) measure of dependence between the embedded data and the labels. Experimental results on five different synthetic and real-world datasets demonstrate that the proposed SRP strategy for class-aware embedding learning can be very promising in producing embeddings that are highly competitive with existing supervised dimensionality reduction methods (e.g., SPCA and KSPCA) while achieving 1-2 orders of magnitude better computational performance. As such, this efficient approach to learning embeddings for dimensionality reduction can be a powerful tool for large-scale data analysis and visualization.
Tasks	Dimensionality Reduction
Published	2018-11-07
URL	http://arxiv.org/abs/1811.03166v1
PDF	http://arxiv.org/pdf/1811.03166v1.pdf
PWC	https://paperswithcode.com/paper/srp-efficient-class-aware-embedding-learning
Repo
Framework

Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks


Title	Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks
Authors	Henggang Cui, Vladan Radosavljevic, Fang-Chieh Chou, Tsung-Han Lin, Thi Nguyen, Tzu-Kuo Huang, Jeff Schneider, Nemanja Djuric
Abstract	Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. However, despite large interest and a number of industry players working in the autonomous domain, there still remains more to be done in order to develop a system capable of operating at a level comparable to the best human drivers. One reason for this is the high uncertainty of traffic behavior and the large number of situations that an SDV may encounter on the roads, making it very difficult to create a fully generalizable system. To ensure safe and efficient operations, an autonomous vehicle is required to account for this uncertainty and to anticipate a multitude of possible behaviors of traffic actors in its surroundings. We address this critical problem and present a method to predict multiple possible trajectories of actors while also estimating their probabilities. The method encodes each actor’s surrounding context into a raster image, used as input by deep convolutional networks to automatically derive relevant features for the task. Following extensive offline evaluation and comparison to state-of-the-art baselines, the method was successfully tested on SDVs in closed-course tests.
Tasks	Autonomous Driving
Published	2018-09-18
URL	http://arxiv.org/abs/1809.10732v2
PDF	http://arxiv.org/pdf/1809.10732v2.pdf
PWC	https://paperswithcode.com/paper/multimodal-trajectory-predictions-for
Repo
Framework

Global Convergence to the Equilibrium of GANs using Variational Inequalities


Title	Global Convergence to the Equilibrium of GANs using Variational Inequalities
Authors	Ian Gemp, Sridhar Mahadevan
Abstract	In optimization, the negative gradient of a function denotes the direction of steepest descent. Furthermore, traveling in any direction orthogonal to the gradient maintains the value of the function. In this work, we show that these orthogonal directions that are ignored by gradient descent can be critical in equilibrium problems. Equilibrium problems have drawn heightened attention in machine learning due to the emergence of the Generative Adversarial Network (GAN). We use the framework of Variational Inequalities to analyze popular training algorithms for a fundamental GAN variant: the Wasserstein Linear-Quadratic GAN. We show that the steepest descent direction causes divergence from the equilibrium, and convergence to the equilibrium is achieved through following a particular orthogonal direction. We call this successful technique Crossing-the-Curl, named for its mathematical derivation as well as its intuition: identify the game’s axis of rotation and move “across” space in the direction towards smaller “curling”.
Tasks
Published	2018-08-04
URL	https://arxiv.org/abs/1808.01531v3
PDF	https://arxiv.org/pdf/1808.01531v3.pdf
PWC	https://paperswithcode.com/paper/global-convergence-to-the-equilibrium-of-gans
Repo
Framework
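
The claim that following the steepest-descent direction can drive the players away from an equilibrium is easy to reproduce on a toy problem. The sketch below uses the classic bilinear game f(x, y) = x * y (x minimizes, y maximizes), which is an assumption chosen for illustration and not the Wasserstein Linear-Quadratic GAN analyzed in the paper; Crossing-the-Curl itself is not implemented here.

```python
import math

def simultaneous_gradient_step(x, y, lr):
    """One simultaneous descent (x) / ascent (y) step on f(x, y) = x * y."""
    grad_x = y   # df/dx
    grad_y = x   # df/dy
    return x - lr * grad_x, y + lr * grad_y

# The unique equilibrium is (0, 0); track the distance to it over time.
x, y = 1.0, 1.0
norms = [math.hypot(x, y)]
for _ in range(100):
    x, y = simultaneous_gradient_step(x, y, lr=0.1)
    norms.append(math.hypot(x, y))
```

Each step multiplies the squared distance to the equilibrium by (1 + lr^2), so the iterates spiral outward for any positive step size.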


Mining Social Media for Newsgathering: A Review


Title	Mining Social Media for Newsgathering: A Review
Authors	Arkaitz Zubiaga
Abstract	Social media is becoming an increasingly important data source for learning about breaking news and for following the latest developments of ongoing news. This is in part possible thanks to the existence of mobile devices, which allows anyone with access to the Internet to post updates from anywhere, leading in turn to a growing presence of citizen journalism. Consequently, social media has become a go-to resource for journalists during the process of newsgathering. Use of social media for newsgathering is however challenging, and suitable tools are needed in order to facilitate access to useful information for reporting. In this paper, we provide an overview of research in data mining and natural language processing for mining social media for newsgathering. We discuss five different areas that researchers have worked on to mitigate the challenges inherent to social media newsgathering: news discovery, curation of news, validation and verification of content, newsgathering dashboards, and other tasks. We outline the progress made so far in the field, summarise the current challenges as well as discuss future directions in the use of computational journalism to assist with social media newsgathering. This review is relevant to computer scientists researching news in social media as well as for interdisciplinary researchers interested in the intersection of computer science and journalism.
Tasks
Published	2018-04-10
URL	https://arxiv.org/abs/1804.03540v2
PDF	https://arxiv.org/pdf/1804.03540v2.pdf
PWC	https://paperswithcode.com/paper/mining-social-media-for-newsgathering
Repo
Framework

Unsupervised Neural Machine Translation Initialized by Unsupervised Statistical Machine Translation


Title	Unsupervised Neural Machine Translation Initialized by Unsupervised Statistical Machine Translation
Authors	Benjamin Marie, Atsushi Fujita
Abstract	Recent work achieved remarkable results in training neural machine translation (NMT) systems in a fully unsupervised way, with new and dedicated architectures that rely on monolingual corpora only. In this work, we propose to define unsupervised NMT (UNMT) as NMT trained with the supervision of synthetic bilingual data. Our approach straightforwardly enables the use of state-of-the-art architectures proposed for supervised NMT by replacing human-made bilingual data with synthetic bilingual data for training. We propose to initialize the training of UNMT with synthetic bilingual data generated by unsupervised statistical machine translation (USMT). The UNMT system is then incrementally improved using back-translation. Our preliminary experiments show that our approach achieves a new state-of-the-art for unsupervised machine translation on the WMT16 German–English news translation task, for both translation directions.
Tasks	Machine Translation, Unsupervised Machine Translation
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12703v1
PDF	http://arxiv.org/pdf/1810.12703v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-neural-machine-translation-1
Repo
Framework

Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data


Title	Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data
Authors	Yutaro Miyauchi, Yusuke Sugano, Yasuyuki Matsushita
Abstract	Conditional image generation is effective for diverse tasks including training data synthesis for learning-based computer vision. However, despite the recent advances in generative adversarial networks (GANs), it is still a challenging task to generate images with detailed conditioning on object shapes. Existing methods for conditional image generation use category labels and/or keypoints and give only limited control over object categories. In this work, we present SCGAN, an architecture to generate images with a desired shape specified by an input normal map. The shape-conditioned image generation task is achieved by explicitly modeling the image appearance via a latent appearance vector. The network is trained using unpaired training samples of real images and rendered normal maps. This approach enables us to generate images of arbitrary object categories with the target shape and diverse image appearances. We show the effectiveness of our method through both qualitative and quantitative evaluation on training data generation tasks.
Tasks	Conditional Image Generation, Image Generation
Published	2018-11-29
URL	http://arxiv.org/abs/1811.11991v1
PDF	http://arxiv.org/pdf/1811.11991v1.pdf
PWC	https://paperswithcode.com/paper/shape-conditioned-image-generation-by
Repo
Framework

Efficient Active Learning for Image Classification and Segmentation using a Sample Selection and Conditional Generative Adversarial Network


Title	Efficient Active Learning for Image Classification and Segmentation using a Sample Selection and Conditional Generative Adversarial Network
Authors	Dwarikanath Mahapatra, Behzad Bozorgtabar, Jean-Philippe Thiran, Mauricio Reyes
Abstract	Training robust deep learning (DL) systems for medical image classification or segmentation is challenging due to limited images covering different disease types and severity. We propose an active learning (AL) framework to select the most informative samples and add them to the training data. We use conditional generative adversarial networks (cGANs) to generate realistic chest X-ray images with different disease characteristics by conditioning their generation on a real image sample. Informative samples to add to the training set are identified using a Bayesian neural network. Experiments show our proposed AL framework is able to achieve state-of-the-art performance by using about 35% of the full dataset, thus saving significant time and effort over conventional methods.
Tasks	Active Learning, Image Classification
Published	2018-06-14
URL	https://arxiv.org/abs/1806.05473v4
PDF	https://arxiv.org/pdf/1806.05473v4.pdf
PWC	https://paperswithcode.com/paper/efficient-active-learning-for-image
Repo
Framework

Scalable Natural Gradient Langevin Dynamics in Practice


Title	Scalable Natural Gradient Langevin Dynamics in Practice
Authors	Henri Palacci, Henry Hess
Abstract	Stochastic Gradient Langevin Dynamics (SGLD) is a sampling scheme for Bayesian modeling adapted to large datasets and models. SGLD relies on the injection of Gaussian noise at each step of a Stochastic Gradient Descent (SGD) update. In this scheme, every component in the noise vector is independent and has the same scale, whereas the parameters we seek to estimate exhibit strong variations in scale and significant correlation structures, leading to poor convergence and mixing times. We compare different preconditioning approaches to the normalization of the noise vector and benchmark these approaches on the following criteria: 1) mixing times of the multivariate parameter vector, 2) regularizing effect on small datasets where it is easy to overfit, 3) covariate shift detection and 4) resistance to adversarial examples.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02855v1
PDF	http://arxiv.org/pdf/1806.02855v1.pdf
PWC	https://paperswithcode.com/paper/scalable-natural-gradient-langevin-dynamics
Repo
Framework
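
The plain SGLD update that the abstract above takes as its starting point (a stochastic gradient step plus isotropic Gaussian noise) can be sketched as below. This is a minimal, unpreconditioned example under assumed settings: a one-dimensional Gaussian mean model with unit observation variance, a N(0, 10^2) prior, and a hypothetical function name `sgld_sample_mean`. None of the preconditioning schemes compared in the paper are shown.

```python
import numpy as np

def sgld_sample_mean(data, n_steps=5000, batch_size=10, step_size=1e-3, seed=0):
    """Draw approximate posterior samples of the mean of `data` with SGLD.

    Assumed model: x_i ~ N(theta, 1), prior theta ~ N(0, 10^2).
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.choice(data, size=batch_size, replace=False)
        grad_prior = -theta / 100.0                            # gradient of the log prior
        grad_lik = (n / batch_size) * np.sum(batch - theta)    # rescaled minibatch gradient
        noise = rng.normal(0.0, np.sqrt(step_size))            # injected noise, variance = step size
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return np.array(samples)

# Synthetic data with true mean 2.0; discard the first 1000 draws as burn-in.
data = np.random.default_rng(1).normal(2.0, 1.0, size=100)
samples = sgld_sample_mean(data)
posterior_mean_estimate = samples[1000:].mean()
```

In more than one dimension, the same scalar noise scale would be applied to every coordinate, which is the limitation that motivates preconditioning.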

Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges


Title	Human-Machine Inference Networks For Smart Decision Making: Opportunities and Challenges
Authors	Aditya Vempaty, Bhavya Kailkhura, Pramod K. Varshney
Abstract	The emerging paradigm of Human-Machine Inference Networks (HuMaINs) combines complementary cognitive strengths of humans and machines in an intelligent manner to tackle various inference tasks and achieves higher performance than either humans or machines by themselves. While inference performance optimization techniques for human-only or sensor-only networks are quite mature, HuMaINs require novel signal processing and machine learning solutions. In this paper, we present an overview of the HuMaINs architecture with a focus on three main issues that include architecture design, inference algorithms including security/privacy challenges, and application areas/use cases.
Tasks	Decision Making
Published	2018-01-29
URL	http://arxiv.org/abs/1801.09626v1
PDF	http://arxiv.org/pdf/1801.09626v1.pdf
PWC	https://paperswithcode.com/paper/human-machine-inference-networks-for-smart
Repo
Framework

Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features


Title	Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features
Authors	Constantinos Loukas
Abstract	Recognizing the phases of a laparoscopic surgery (LS) operation from its video constitutes a fundamental step for efficient content representation, indexing and retrieval in surgical video databases. In the literature, most techniques focus on phase segmentation of the entire LS video using hand-crafted visual features, instrument usage signals, and recently convolutional neural networks (CNNs). In this paper we address the problem of phase recognition of short video shots (10s) of the operation, without utilizing information about the preceding/forthcoming video frames, their phase labels or the instruments used. We investigate four state-of-the-art CNN architectures (AlexNet, VGG19, GoogleNet, and ResNet101), for feature extraction via transfer learning. Visual saliency was employed for selecting the most informative region of the image as input to the CNN. Video shot representation was based on two temporal pooling mechanisms. Most importantly, we investigate the role of ‘elapsed time’ (from the beginning of the operation), and we show that inclusion of this feature can increase performance dramatically (69% vs. 75% mean accuracy). Finally, a long short-term memory (LSTM) network was trained for video shot classification based on the fusion of CNN features with ‘elapsed time’, increasing the accuracy to 86%. Our results highlight the prominent role of visual saliency, long-range temporal recursion and ‘elapsed time’ (a feature so far ignored), for surgical phase recognition.
Tasks	Transfer Learning
Published	2018-07-20
URL	http://arxiv.org/abs/1807.07853v4
PDF	http://arxiv.org/pdf/1807.07853v4.pdf
PWC	https://paperswithcode.com/paper/surgical-phase-recognition-of-short-video
Repo
Framework

Optimal Sparse Singular Value Decomposition for High-dimensional High-order Data


Title	Optimal Sparse Singular Value Decomposition for High-dimensional High-order Data
Authors	Anru Zhang, Rungang Han
Abstract	In this article, we consider the sparse tensor singular value decomposition, which aims for dimension reduction on high-dimensional high-order data with certain sparsity structure. A method named sparse tensor alternating thresholding for singular value decomposition (STAT-SVD) is proposed. The proposed procedure features a novel double projection & thresholding scheme, which provides a sharp criterion for thresholding in each iteration. Compared with the regular tensor SVD model, STAT-SVD permits more robust estimation under weaker assumptions. Both the upper and lower bounds for estimation accuracy are developed. The proposed procedure is shown to be minimax rate-optimal in a general class of situations. Simulation studies show that STAT-SVD performs well under a variety of configurations. We also illustrate the merits of the proposed procedure on a longitudinal tensor dataset on European country mortality rates.
Tasks	Dimensionality Reduction
Published	2018-09-06
URL	http://arxiv.org/abs/1809.01796v1
PDF	http://arxiv.org/pdf/1809.01796v1.pdf
PWC	https://paperswithcode.com/paper/optimal-sparse-singular-value-decomposition
Repo
Framework

Multi-Band Covariance Interpolation with Applications in Massive MIMO


Title	Multi-Band Covariance Interpolation with Applications in Massive MIMO
Authors	Saeid Haghighatshoar, Mahdi Barzegar Khalilsarai, Giuseppe Caire
Abstract	In this paper, we study the problem of multi-band (frequency-variant) covariance interpolation with a particular emphasis on massive MIMO applications. In a massive MIMO system, the communication between each base station (BS) with $M \gg 1$ antennas and each single-antenna user occurs through a collection of scatterers in the environment, where the channel vector of each user at the BS antennas consists of a weighted linear combination of the array responses of the scatterers, where each scatterer has its own angle of arrival (AoA) and complex channel gain. The array response at a given AoA depends on the wavelength of the incoming planar wave and is naturally frequency dependent. This results in a frequency-dependent distortion where the second order statistics, i.e., the covariance matrix, of the channel vectors varies with frequency. In this paper, we show that although this effect is generally negligible for a small number of antennas $M$, it results in a considerable distortion of the covariance matrix and especially its dominant signal subspace in the massive MIMO regime where $M \to \infty$, and can generally incur a serious degradation of the performance especially in frequency division duplexing (FDD) massive MIMO systems where the uplink (UL) and the downlink (DL) communication occur over different frequency bands. We propose a novel UL-DL covariance interpolation technique that is able to recover the covariance matrix in the DL from an estimate of the covariance matrix in the UL under a mild reciprocity condition on the angular power spread function (PSF) of the users. We analyze the performance of our proposed scheme mathematically and prove its robustness under a sufficiently large spatial oversampling of the array. We also propose several simple off-the-shelf algorithms for UL-DL covariance interpolation and evaluate their performance via numerical simulations.
Tasks
Published	2018-01-11
URL	http://arxiv.org/abs/1801.03714v1
PDF	http://arxiv.org/pdf/1801.03714v1.pdf
PWC	https://paperswithcode.com/paper/multi-band-covariance-interpolation-with
Repo
Framework
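
The observation that the array response at a fixed angle changes with wavelength, and that the mismatch grows with the number of antennas, can be illustrated numerically. The sketch below assumes a uniform linear array with fixed element spacing, which the abstract does not specify; the wavelengths, spacing, angle, and function names are all hypothetical values chosen for illustration, and the interpolation technique itself is not shown.

```python
import numpy as np

def array_response(num_antennas, angle_rad, wavelength, spacing):
    """Steering vector of a uniform linear array (assumed geometry)."""
    m = np.arange(num_antennas)
    phase = 2.0 * np.pi * spacing * m * np.sin(angle_rad) / wavelength
    return np.exp(1j * phase)

def band_similarity(num_antennas, angle_rad, wl_ul, wl_dl, spacing):
    """Normalized correlation between UL and DL responses at the same angle."""
    a_ul = array_response(num_antennas, angle_rad, wl_ul, spacing)
    a_dl = array_response(num_antennas, angle_rad, wl_dl, spacing)
    return abs(np.vdot(a_ul, a_dl)) / num_antennas

angle = np.deg2rad(30.0)
# Hypothetical bands: UL wavelength 1.0, DL wavelength 0.9, spacing 0.5.
sim_small = band_similarity(4, angle, 1.0, 0.9, 0.5)
sim_large = band_similarity(256, angle, 1.0, 0.9, 0.5)
```

With a handful of antennas the two responses remain nearly collinear, while with hundreds of antennas they are close to orthogonal, which matches the abstract's point that the effect matters mainly in the massive MIMO regime.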

Safe Active Feature Selection for Sparse Learning


Title	Safe Active Feature Selection for Sparse Learning
Authors	Shaogang Ren, Jianhua Z. Huang, Shuai Huang, Xiaoning Qian
Abstract	We present safe active incremental feature selection (SAIF) to scale up the computation of LASSO solutions. SAIF does not require a solution from a heavier penalty parameter as in sequential screening or updating the full model for each iteration as in dynamic screening. Different from these existing screening methods, SAIF starts from a small number of features and incrementally recruits active features and updates the significantly reduced model. Hence, it is much more computationally efficient and scalable with the number of features. More critically, SAIF has the safe guarantee as it has the convergence guarantee to the optimal solution to the original full LASSO problem. Such an incremental procedure and theoretical convergence guarantee can be extended to fused LASSO problems. Compared with state-of-the-art screening methods as well as working set and homotopy methods, which may not always guarantee the optimal solution, SAIF can achieve superior or comparable efficiency and high scalability with the safe guarantee when facing extremely high dimensional data sets. Experiments with both synthetic and real-world data sets show that SAIF can be up to 50 times faster than dynamic screening, and hundreds of times faster than computing LASSO or fused LASSO solutions without screening.
Tasks	Feature Selection, Sparse Learning
Published	2018-06-15
URL	http://arxiv.org/abs/1806.05817v2
PDF	http://arxiv.org/pdf/1806.05817v2.pdf
PWC	https://paperswithcode.com/paper/safe-active-feature-selection-for-sparse
Repo
Framework

Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance


Title	Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance
Authors	Giulia Luise, Alessandro Rudi, Massimiliano Pontil, Carlo Ciliberto
Abstract	Applications of optimal transport have recently gained remarkable attention thanks to the computational advantages of entropic regularization. However, in most situations the Sinkhorn approximation of the Wasserstein distance is replaced by a regularized version that is less accurate but easy to differentiate. In this work we characterize the differential properties of the original Sinkhorn distance, proving that it enjoys the same smoothness as its regularized version and we explicitly provide an efficient algorithm to compute its gradient. We show that this result benefits both theory and applications: on one hand, high order smoothness confers statistical guarantees to learning with Wasserstein approximations. On the other hand, the gradient formula allows us to efficiently solve learning and optimization problems in practice. Promising preliminary experiments complement our analysis.
Tasks
Published	2018-05-30
URL	http://arxiv.org/abs/1805.11897v1
PDF	http://arxiv.org/pdf/1805.11897v1.pdf
PWC	https://paperswithcode.com/paper/differential-properties-of-sinkhorn
Repo
Framework
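
For readers unfamiliar with the Sinkhorn approximation mentioned in the abstract above, the standard fixed-point iteration for entropy-regularized optimal transport can be sketched as follows. This is a generic textbook version under assumed inputs (a tiny 2x2 problem, `reg=0.5`, a fixed iteration count); it is not the gradient algorithm proposed in the paper, and the name `sinkhorn_plan` is hypothetical.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, reg=0.5, n_iters=200):
    """Approximate transport plan between histograms a and b.

    a, b: nonnegative vectors summing to 1.
    cost: matrix of pairwise transport costs, shape (len(a), len(b)).
    reg: entropic regularization strength (assumed value).
    """
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # rescale to match column marginals
        u = a / (K @ v)                  # rescale to match row marginals
    return u[:, None] * K * v[None, :]   # plan = diag(u) K diag(v)

a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
plan = sinkhorn_plan(a, b, cost)
# Transport cost under the entropic plan, without the entropy term.
transport_cost = float(np.sum(plan * cost))
```

The regularized objective would additionally include the entropy term of the plan; `transport_cost` leaves that term out.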