January 27, 2020

Paper Group ANR 1118

Dance Dance Generation: Motion Transfer for Internet Videos


Title	Dance Dance Generation: Motion Transfer for Internet Videos
Authors	Yipin Zhou, Zhaowen Wang, Chen Fang, Trung Bui, Tamara L. Berg
Abstract	This work presents computational methods for transferring body movements from one person to another with videos collected in the wild. Specifically, we train a personalized model on a single video from the Internet which can generate videos of this target person driven by the motions of other people. Our model is built on two generative networks: a human (foreground) synthesis net which generates photo-realistic imagery of the target person in a novel pose, and a fusion net which combines the generated foreground with the scene (background), adding shadows or reflections as needed to enhance realism. We validate the efficacy of our proposed models over baselines with qualitative and quantitative evaluations as well as a subjective test.
Tasks
Published	2019-03-30
URL	http://arxiv.org/abs/1904.00129v1
PDF	http://arxiv.org/pdf/1904.00129v1.pdf
PWC	https://paperswithcode.com/paper/dance-dance-generation-motion-transfer-for
Repo
Framework

Classification of Cardiac Arrhythmias from Single Lead ECG with a Convolutional Recurrent Neural Network


Title	Classification of Cardiac Arrhythmias from Single Lead ECG with a Convolutional Recurrent Neural Network
Authors	Jérôme Van Zaen, Olivier Chételat, Mathieu Lemay, Enric M. Calvo, Ricard Delgado-Gonzalo
Abstract	While most heart arrhythmias are not immediately harmful, they can lead to severe complications. In particular, atrial fibrillation, the most common arrhythmia, is characterized by fast and irregular heart beats and increases the risk of suffering a stroke. To detect such abnormal heart conditions, we propose a system composed of two main parts: a smart vest with two cooperative sensors to collect ECG data and a neural network architecture to classify heart rhythms. The smart vest uses two dry bi-electrodes to record a single lead ECG signal. The biopotential signal is then streamed via a gateway to the cloud where a neural network detects and classifies the heart arrhythmias. We selected an architecture that combines convolutional and recurrent layers. The convolutional layers extract relevant features from sliding windows of ECG and the recurrent layer aggregates them for a final softmax layer that performs the classification. Our neural network achieves an accuracy of 87.50% on the dataset of the challenge of Computing in Cardiology 2017.
Tasks
Published	2019-06-25
URL	https://arxiv.org/abs/1907.01513v1
PDF	https://arxiv.org/pdf/1907.01513v1.pdf
PWC	https://paperswithcode.com/paper/classification-of-cardiac-arrhythmias-from
Repo
Framework
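
The architecture described in the abstract above (convolutional features per sliding window, a recurrent layer that aggregates them, a final softmax) can be illustrated with a toy forward pass. This is only a sketch under stated assumptions, not the authors' network: the weights are random and untrained, the window size, step, kernel length and hidden size are hypothetical, the recurrent cell is a plain Elman update, and the four output classes are an assumed reading of the Computing in Cardiology 2017 label set.

```python
import math
import random

def sliding_windows(signal, size, step):
    """Split a 1-D signal into (possibly overlapping) windows."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]

def conv_features(window, kernels):
    """Per window: valid 1-D convolution, ReLU, then global max pooling."""
    feats = []
    for k in kernels:
        responses = [
            max(0.0, sum(w * x for w, x in zip(k, window[i:i + len(k)])))
            for i in range(len(window) - len(k) + 1)
        ]
        feats.append(max(responses))
    return feats

def rnn_step(h, x, w_in, w_rec):
    """Plain recurrent update h' = tanh(W_in x + W_rec h) (assumed cell type)."""
    return [
        math.tanh(sum(a * b for a, b in zip(w_in[j], x)) +
                  sum(a * b for a, b in zip(w_rec[j], h)))
        for j in range(len(h))
    ]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def classify(signal, n_classes=4, n_kernels=3, kernel_len=5, hidden=6, seed=0):
    """Untrained forward pass: conv features per window -> RNN -> softmax."""
    rng = random.Random(seed)
    kernels = rand_matrix(rng, n_kernels, kernel_len)
    w_in = rand_matrix(rng, hidden, n_kernels)
    w_rec = rand_matrix(rng, hidden, hidden)
    w_out = rand_matrix(rng, n_classes, hidden)
    h = [0.0] * hidden
    for window in sliding_windows(signal, size=50, step=25):
        h = rnn_step(h, conv_features(window, kernels), w_in, w_rec)
    logits = [sum(a * b for a, b in zip(row, h)) for row in w_out]
    return softmax(logits)

# Synthetic stand-in signal, not real ECG data.
ecg = [math.sin(2 * math.pi * t / 40) for t in range(300)]
probs = classify(ecg)
```

Because the weights are random, `probs` is only a valid probability vector over the assumed classes; it carries no diagnostic meaning until the parameters are trained.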

Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction


Title	Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction
Authors	Steven Hickson, Karthik Raveendran, Alireza Fathi, Kevin Murphy, Irfan Essa
Abstract	We propose 4 insights that help to significantly improve the performance of deep learning models that predict surface normals and semantic labels from a single RGB image. These insights are: (1) denoise the “ground truth” surface normals in the training set to ensure consistency with the semantic labels; (2) concurrently train on a mix of real and synthetic data, instead of pretraining on synthetic and finetuning on real; (3) jointly predict normals and semantics using a shared model, but only backpropagate errors on pixels that have valid training labels; (4) slim down the model and use grayscale instead of color inputs. Despite the simplicity of these steps, we demonstrate consistently improved results on several datasets, using a model that runs at 12 fps on a standard mobile phone.
Tasks
Published	2019-06-16
URL	https://arxiv.org/abs/1906.06792v1
PDF	https://arxiv.org/pdf/1906.06792v1.pdf
PWC	https://paperswithcode.com/paper/floors-are-flat-leveraging-semantics-for-real
Repo
Framework
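
Insight (3) in the abstract above, backpropagating errors only on pixels that have valid training labels, amounts to masking the loss. The sketch below shows one way this could look for surface normals. The cosine-based loss and the function name `masked_normal_loss` are assumptions made for illustration; the abstract does not state which loss the authors use.

```python
import math

def masked_normal_loss(pred, target, valid):
    """Mean (1 - cosine similarity) over pixels whose label is valid.

    pred, target: lists of 3-vectors, one per pixel.
    valid: list of booleans; False marks pixels without a usable label.
    """
    total, count = 0.0, 0
    for p, t, ok in zip(pred, target, valid):
        if not ok:
            continue  # skipped pixels add nothing to the loss, hence no gradient
        dot = sum(a * b for a, b in zip(p, t))
        norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in t))
        total += 1.0 - dot / norm
        count += 1
    return total / count if count else 0.0

pred = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
target = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)]  # last pixel is unlabeled
valid = [True, True, False]
loss = masked_normal_loss(pred, target, valid)
```

Averaging over the count of valid pixels, rather than over all pixels, keeps the loss scale comparable between images with many and few labeled pixels.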

Evaluating Robustness of Deep Image Super-Resolution against Adversarial Attacks


Title	Evaluating Robustness of Deep Image Super-Resolution against Adversarial Attacks
Authors	Jun-Ho Choi, Huan Zhang, Jun-Hyuk Kim, Cho-Jui Hsieh, Jong-Seok Lee
Abstract	Single-image super-resolution aims to generate a high-resolution version of a low-resolution image, which serves as an essential component in many computer vision applications. This paper investigates the robustness of deep learning-based super-resolution methods against adversarial attacks, which can significantly deteriorate the super-resolved images without noticeable distortion in the attacked low-resolution images. It is demonstrated that state-of-the-art deep super-resolution methods are highly vulnerable to adversarial attacks. Different levels of robustness of different methods are analyzed theoretically and experimentally. We also present analysis on transferability of attacks, and feasibility of targeted attacks and universal attacks.
Tasks	Image Super-Resolution, Super-Resolution
Published	2019-04-12
URL	https://arxiv.org/abs/1904.06097v2
PDF	https://arxiv.org/pdf/1904.06097v2.pdf
PWC	https://paperswithcode.com/paper/evaluating-robustness-of-deep-image-super
Repo
Framework

Expression Analysis Based on Face Regions in Read-world Conditions


Title	Expression Analysis Based on Face Regions in Read-world Conditions
Authors	Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang, Ming-Yue Niu
Abstract	Facial emotion recognition is an essential and important aspect of the field of human-machine interaction. Past research on facial emotion recognition focuses on the laboratory environment. However, it faces many challenges in real-world conditions, i.e., illumination changes, large pose variations and partial or full occlusions. Those challenges lead to different face areas with different degrees of sharpness and completeness. Inspired by this fact, we focus on the authenticity of predictions generated by different <emotion, region> pairs. For example, if only the mouth areas are available and the emotion classifier predicts happiness, then there is a question of how to judge the authenticity of predictions. This problem can be converted into the contribution of different face areas to different emotions. In this paper, we divide the whole face into six areas: nose areas, mouth areas, eyes areas, nose to mouth areas, nose to eyes areas and mouth to eyes areas. To obtain more convincing results, our experiments are conducted on three different databases: facial expression recognition + (FER+), real-world affective faces database (RAF-DB) and expression in-the-wild (ExpW) dataset. Through analysis of the classification accuracy, the confusion matrix and the class activation map (CAM), we can establish convincing results. To sum up, the contributions of this paper lie in two areas: 1) We visualize concerned areas of human faces in emotion recognition; 2) We analyze the contribution of different face areas to different emotions in real-world conditions through experimental analysis. Our findings can be combined with findings in psychology to promote the understanding of emotional expressions.
Tasks	Emotion Recognition, Facial Expression Recognition
Published	2019-10-23
URL	https://arxiv.org/abs/1911.05188v1
PDF	https://arxiv.org/pdf/1911.05188v1.pdf
PWC	https://paperswithcode.com/paper/expression-analysis-based-on-face-regions-in
Repo
Framework

Regularized Sparse Gaussian Processes


Title	Regularized Sparse Gaussian Processes
Authors	Rui Meng, Herbert Lee, Soper Braden, Priyadip Ray
Abstract	Gaussian processes are a flexible Bayesian nonparametric modelling approach that has been widely applied to learning tasks such as facial expression recognition, image reconstruction, and human pose estimation. To address the issues of poor scaling from exact inference methods, approximation methods based on sparse Gaussian processes (SGP) and variational inference (VI) are necessary for the inference on large datasets. However, one of the problems involved in SGP, especially in latent variable models, is that the distribution of the inducing inputs may fail to capture the distribution of training inputs, which may lead to inefficient inference and poor model prediction. Hence, we propose a regularization approach for sparse Gaussian processes. We also extend this regularization approach into latent sparse Gaussian processes in a unified view, considering the balance of the distribution of inducing inputs and embedding inputs. Furthermore, we justify that performing VI on a sparse latent Gaussian process with this regularization term is equivalent to performing VI on a related empirical Bayes model with a prior on the inducing inputs. Stochastic variational inference is also available for our regularization approach. Finally, the feasibility of our proposed regularization method is demonstrated on three real-world datasets.
Tasks	Facial Expression Recognition, Gaussian Processes, Image Reconstruction, Latent Variable Models, Pose Estimation
Published	2019-10-13
URL	https://arxiv.org/abs/1910.05843v1
PDF	https://arxiv.org/pdf/1910.05843v1.pdf
PWC	https://paperswithcode.com/paper/regularized-sparse-gaussian-processes
Repo
Framework

Recurrent neural network approach for cyclic job shop scheduling problem


Title	Recurrent neural network approach for cyclic job shop scheduling problem
Authors	M-Tahar Kechadi, Kok Seng Low, G. Goncalves
Abstract	While cyclic scheduling is involved in numerous real-world applications, solving the derived problem is still of exponential complexity. This paper focuses specifically on modelling the manufacturing application as a cyclic job shop problem and we have developed an efficient neural network approach to minimise the cycle time of a schedule. Our approach introduces an interesting model for a manufacturing production, and it is also very efficient, adaptive and flexible enough to work with other techniques. Experimental results validated the approach and confirmed our hypotheses about the system model and the efficiency of neural networks for such a class of problems.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09437v1
PDF	https://arxiv.org/pdf/1910.09437v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-network-approach-for-cyclic
Repo
Framework

Controlling Neural Networks via Energy Dissipation


Title	Controlling Neural Networks via Energy Dissipation
Authors	Michael Moeller, Thomas Möllenhoff, Daniel Cremers
Abstract	The last decade has shown a tremendous success in solving various computer vision problems with the help of deep learning techniques. Lately, many works have demonstrated that learning-based approaches with suitable network architectures even exhibit superior performance for the solution of (ill-posed) image reconstruction problems such as deblurring, super-resolution, or medical image reconstruction. The drawback of purely learning-based methods, however, is that they cannot provide provable guarantees for the trained network to follow a given data formation process during inference. In this work we propose energy dissipating networks that iteratively compute a descent direction with respect to a given cost function or energy at the currently estimated reconstruction. Therefore, an adaptive step size rule such as a line-search, along with a suitable number of iterations can guarantee the reconstruction to follow a given data formation model encoded in the energy to arbitrary precision, and hence control the model’s behavior even during test time. We prove that under standard assumptions, descent using the direction predicted by the network converges (linearly) to the global minimum of the energy. We illustrate the effectiveness of the proposed approach in experiments on single image super resolution and computed tomography (CT) reconstruction, and further illustrate extensions to convex feasibility problems.
Tasks	Computed Tomography (CT), Deblurring, Image Reconstruction, Image Super-Resolution, Super-Resolution
Published	2019-04-05
URL	https://arxiv.org/abs/1904.03081v2
PDF	https://arxiv.org/pdf/1904.03081v2.pdf
PWC	https://paperswithcode.com/paper/controlling-neural-networks-via-energy
Repo
Framework
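
The abstract above describes iterating along a predicted descent direction with an adaptive step size such as a line search. The loop below is a minimal sketch of that idea on a toy quadratic energy, under several assumptions: `propose` is a hypothetical stand-in for the trained network, the step size uses a standard Armijo backtracking rule, and the fallback to the negative gradient is a safeguard added here for illustration (the paper's networks are said to produce descent directions themselves).

```python
def descend(energy, grad, propose, x, iters=50, shrink=0.5, c=1e-4):
    """Iterate x <- x + t * d, choosing t by Armijo backtracking.

    propose(x, g) returns a candidate direction (stand-in for the network).
    If the candidate is not a descent direction, steepest descent is used instead.
    """
    for _ in range(iters):
        g = grad(x)
        d = propose(x, g)
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0:  # candidate does not decrease the energy to first order
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        if slope == 0:
            break  # stationary point reached
        t, e0 = 1.0, energy(x)
        # Shrink the step until the sufficient-decrease condition holds.
        while t > 1e-12 and energy([xi + t * di for xi, di in zip(x, d)]) > e0 + c * t * slope:
            t *= shrink
        x = [xi + t * di for xi, di in zip(x, d)]
    return x

b = [1.0, -2.0]  # toy "data"; the energy is minimized at x = b

def energy(x):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))

def grad(x):
    return [xi - bi for xi, bi in zip(x, b)]

def propose(x, g):
    # Hypothetical "network": a fixed rescaling of the negative gradient.
    return [-0.5 * g[0], -0.9 * g[1]]

x_star = descend(energy, grad, propose, [0.0, 0.0])
```

Whatever direction `propose` returns, the energy never increases from one iterate to the next, which is the property that lets the energy, rather than the learned component alone, control the result.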

Comment on “Blessings of Multiple Causes”


Title	Comment on “Blessings of Multiple Causes”
Authors	Elizabeth L. Ogburn, Ilya Shpitser, Eric J. Tchetgen Tchetgen
Abstract	(This comment has been updated to respond to Wang and Blei’s rejoinder [arXiv:1910.07320].) The premise of the deconfounder method proposed in “Blessings of Multiple Causes” by Wang and Blei [arXiv:1805.06826], namely that a variable that renders multiple causes conditionally independent also controls for unmeasured multi-cause confounding, is incorrect. This can be seen by noting that no fact about the observed data alone can be informative about ignorability, since ignorability is compatible with any observed data distribution. Methods to control for unmeasured confounding may be valid with additional assumptions in specific settings, but they cannot, in general, provide a checkable approach to causal inference, and they do not, in general, require weaker assumptions than the assumptions that are commonly used for causal inference. While this is outside the scope of this comment, we note that much recent work on applying ideas from latent variable modeling to causal inference problems suffers from similar issues.
Tasks	Causal Inference
Published	2019-10-11
URL	https://arxiv.org/abs/1910.05438v3
PDF	https://arxiv.org/pdf/1910.05438v3.pdf
PWC	https://paperswithcode.com/paper/comment-on-blessings-of-multiple-causes
Repo
Framework

Estimating Transfer Entropy via Copula Entropy


Title	Estimating Transfer Entropy via Copula Entropy
Authors	Ma Jian
Abstract	Causal inference is a fundamental problem in statistics and has wide applications in different fields. Transfer Entropy (TE) is an important notion defined for measuring causality, which is essentially conditional Mutual Information (MI). Copula Entropy (CE) is a theory on measurement of statistical independence and is equivalent to MI. In this paper, we prove that TE can be represented with only CE and then propose a non-parametric method for estimating TE via CE. The proposed method was applied to analyze the Beijing PM2.5 data in the experiments. Experimental results show that the proposed method can infer causality relationships from data effectively and hence help to understand the data better.
Tasks	Causal Inference
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04375v2
PDF	https://arxiv.org/pdf/1910.04375v2.pdf
PWC	https://paperswithcode.com/paper/estimating-transfer-entropy-via-copula
Repo
Framework
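
The claim in the abstract above, that TE can be written using only CE, can be checked with a short derivation. The version below is a sketch under assumptions not stated in the abstract: scalar time series $X$ and $Y$, a lag of one step, and copula entropy $H_c$ defined as the negative of (multivariate) mutual information, so that $H(\mathbf{z}) = \sum_i H(z_i) + H_c(\mathbf{z})$.

```latex
\begin{align}
TE_{X \to Y}
  &= I(Y_{t+1};\, X_t \mid Y_t) \\
  &= H(Y_{t+1}, Y_t) + H(Y_t, X_t) - H(Y_{t+1}, Y_t, X_t) - H(Y_t) \\
  &= H_c(Y_{t+1}, Y_t) + H_c(Y_t, X_t) - H_c(Y_{t+1}, Y_t, X_t)
\end{align}
```

The last step substitutes the decomposition of each joint entropy into marginal entropies plus copula entropy; the marginal terms $H(Y_{t+1})$, $H(X_t)$ and $H(Y_t)$ cancel, leaving only copula entropies.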

Optimized Partial Identification Bounds for Regression Discontinuity Designs with Manipulation


Title	Optimized Partial Identification Bounds for Regression Discontinuity Designs with Manipulation
Authors	Evan Rosenman, Karthik Rajkumar
Abstract	The regression discontinuity (RD) design is one of the most popular quasi-experimental methods for applied causal inference. In practice, the method is quite sensitive to the assumption that individuals cannot control their value of a “running variable” that determines treatment status precisely. If individuals are able to precisely manipulate their scores, then point identification is lost. We propose a procedure for obtaining partial identification bounds in the case of a discrete running variable where manipulation is present. Our method relies on two stages: first, we derive the distribution of non-manipulators under several assumptions about the data. Second, we obtain bounds on the causal effect via a sequential convex programming approach. We also propose methods for tightening the partial identification bounds using an auxiliary covariate, and derive confidence intervals via the bootstrap. We demonstrate the utility of our method on a simulated dataset.
Tasks	Causal Inference
Published	2019-10-04
URL	https://arxiv.org/abs/1910.02170v1
PDF	https://arxiv.org/pdf/1910.02170v1.pdf
PWC	https://paperswithcode.com/paper/optimized-partial-identification-bounds-for
Repo
Framework

Facial Expression Recognition Using Human to Animated-Character Expression Translation


Title	Facial Expression Recognition Using Human to Animated-Character Expression Translation
Authors	Kamran Ali, Ilkin Isler, Charles Hughes
Abstract	Facial expression recognition is a challenging task due to two major problems: the presence of inter-subject variations in facial expression recognition datasets and impure expressions posed by human subjects. In this paper we present a novel Human-to-Animation conditional Generative Adversarial Network (HA-GAN) to overcome these two problems by using many (human faces) to one (animated face) mapping. Specifically, for any given input human expression image, our HA-GAN transfers the expression information from the input image to a fixed animated identity. Stylized animated characters from the Facial Expression Research Group-Database (FERGDB) are used for the generation of fixed identity. By learning this many-to-one identity mapping function using our proposed HA-GAN, the effect of inter-subject variations can be reduced in Facial Expression Recognition (FER). We also argue that the expressions in the generated animated images are pure expressions and since FER is performed on these generated images, the performance of facial expression recognition is improved. Our initial experimental results on the state-of-the-art datasets show that facial expression recognition carried out on the generated animated images using our HA-GAN framework outperforms the baseline deep neural network and produces comparable or even better results than the state-of-the-art methods for facial expression recognition.
Tasks	Facial Expression Recognition
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05595v1
PDF	https://arxiv.org/pdf/1910.05595v1.pdf
PWC	https://paperswithcode.com/paper/facial-expression-recognition-using-human-to
Repo
Framework

CONet: A Cognitive Ocean Network


Title	CONet: A Cognitive Ocean Network
Authors	Huimin Lu, Dong Wang, Yujie Li, Jianru Li, Xin Li, Hyoungseop Kim, Seiichi Serikawa, Iztok Humar
Abstract	The scientific and technological revolution of the Internet of Things has begun in the area of oceanography. Historically, humans have observed the ocean from an external viewpoint in order to study it. In recent years, however, changes have occurred in the ocean, and laboratories have been built on the seafloor. Approximately 70.8% of the Earth’s surface is covered by oceans and rivers. The Ocean of Things is expected to be important for disaster prevention, ocean-resource exploration, and underwater environmental monitoring. Unlike traditional wireless sensor networks, the Ocean Network has its own unique features, such as low reliability and narrow bandwidth. These features will be great challenges for the Ocean Network. Furthermore, the integration of the Ocean Network with artificial intelligence has become a topic of increasing interest for oceanology researchers. The Cognitive Ocean Network (CONet) will become the mainstream of future ocean science and engineering developments. In this article, we define the CONet. The contributions of the paper are as follows: (1) a CONet architecture is proposed and described in detail; (2) important and useful demonstration applications of the CONet are proposed; and (3) future trends in CONet research are presented.
Tasks
Published	2019-01-09
URL	http://arxiv.org/abs/1901.06253v1
PDF	http://arxiv.org/pdf/1901.06253v1.pdf
PWC	https://paperswithcode.com/paper/conet-a-cognitive-ocean-network
Repo
Framework

Kernel-based Approach to Handle Mixed Data for Inferring Causal Graphs


Title	Kernel-based Approach to Handle Mixed Data for Inferring Causal Graphs
Authors	Teny Handhayani, James Cussens
Abstract	Causal learning is a beneficial approach to analyze the cause and effect relationships among variables in a dataset. A causal graph can be generated from a dataset using a particular causal algorithm, for instance, the PC algorithm or Fast Causal Inference (FCI). Generating a causal graph from a dataset that contains different data types (mixed data) is not trivial. This research offers an easy way to handle the mixed data so that it can be used to learn causal graphs using the existing application of the PC algorithm and FCI. This research proposes using kernel functions and Kernel Alignment to handle mixed data. Two main steps of this approach are computing a kernel matrix for each variable and calculating a pseudo-correlation matrix using Kernel Alignment. Kernel Alignment is used as a substitute for the correlation matrix for the conditional independence test for Gaussian data in the PC Algorithm and FCI. The advantage of this idea is that it is possible to handle any data type by using a suitable kernel function to compute a kernel matrix for an observed variable. The proposed method is successfully applied to learn a causal graph from mixed data containing categorical, binary, ordinal, and continuous variables.
Tasks	Causal Inference
Published	2019-10-07
URL	https://arxiv.org/abs/1910.03055v1
PDF	https://arxiv.org/pdf/1910.03055v1.pdf
PWC	https://paperswithcode.com/paper/kernel-based-approach-to-handle-mixed-data
Repo
Framework
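
The two steps named in the abstract above, one kernel matrix per variable followed by a pseudo-correlation matrix built from Kernel Alignment, can be sketched as follows. The choice of an RBF kernel for continuous data, a delta kernel for categorical data, the centering step, the `gamma` value and the example columns are all assumptions for illustration; the abstract does not specify them.

```python
import math

def rbf_kernel(values, gamma=1.0):
    """Kernel matrix for a continuous variable (assumed kernel choice)."""
    return [[math.exp(-gamma * (a - b) ** 2) for b in values] for a in values]

def delta_kernel(values):
    """Kernel matrix for a categorical variable: 1 if equal, else 0."""
    return [[1.0 if a == b else 0.0 for b in values] for a in values]

def center(k):
    """Center a symmetric kernel matrix: K - row means - column means + grand mean."""
    n = len(k)
    row = [sum(r) / n for r in k]
    total = sum(row) / n
    return [[k[i][j] - row[i] - row[j] + total for j in range(n)] for i in range(n)]

def frobenius(a, b):
    """Frobenius inner product of two matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def alignment(k1, k2):
    """Kernel alignment; undefined (division by zero) for a constant variable."""
    return frobenius(k1, k2) / math.sqrt(frobenius(k1, k1) * frobenius(k2, k2))

def pseudo_correlation(kernels):
    """Pairwise alignment of centered kernel matrices, one per variable."""
    centered = [center(k) for k in kernels]
    return [[alignment(a, b) for b in centered] for a in centered]

# Hypothetical mixed dataset: one continuous and one categorical column.
age = [23.0, 35.0, 47.0, 51.0, 62.0]
smoker = ["no", "no", "yes", "yes", "yes"]
R = pseudo_correlation([rbf_kernel(age, gamma=0.001), delta_kernel(smoker)])
```

The resulting matrix `R` has ones on the diagonal and is symmetric, so it has the same shape as a correlation matrix and could be passed to a correlation-based conditional independence test in place of one.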

Spike-based causal inference for weight alignment


Title	Spike-based causal inference for weight alignment
Authors	Jordan Guerguiev, Konrad P. Kording, Blake A. Richards
Abstract	In artificial neural networks trained with gradient descent, the weights used for processing stimuli are also used during backward passes to calculate gradients. For the real brain to approximate gradients, gradient information would have to be propagated separately, such that one set of synaptic weights is used for processing and another set is used for backward passes. This produces the so-called “weight transport problem” for biological models of learning, where the backward weights used to calculate gradients need to mirror the forward weights used to process stimuli. This weight transport problem has been considered so hard that popular proposals for biological learning assume that the backward weights are simply random, as in the feedback alignment algorithm. However, such random weights do not appear to work well for large networks. Here we show how the discontinuity introduced in a spiking system can lead to a solution to this problem. The resulting algorithm is a special case of an estimator used for causal inference in econometrics, regression discontinuity design. We show empirically that this algorithm rapidly makes the backward weights approximate the forward weights. As the backward weights become correct, this improves learning performance over feedback alignment on tasks such as Fashion-MNIST, SVHN, CIFAR-10 and VOC. Our results demonstrate that a simple learning rule in a spiking network can allow neurons to produce the right backward connections and thus solve the weight transport problem.
Tasks	Causal Inference
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01689v2
PDF	https://arxiv.org/pdf/1910.01689v2.pdf
PWC	https://paperswithcode.com/paper/spike-based-causal-inference-for-weight-1
Repo
Framework
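
The abstract above relates its learning rule to regression discontinuity design (RDD). For readers unfamiliar with that estimator, the sketch below shows a generic textbook form: fit a line on each side of a threshold within a bandwidth and take the difference of the two fits at the threshold. It is not the paper's spiking learning rule; the function names, the bandwidth and the noiseless toy data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def rdd_effect(running, outcome, threshold, bandwidth):
    """Jump in the outcome at the threshold, from two local linear fits."""
    below = [(r - threshold, y) for r, y in zip(running, outcome)
             if threshold - bandwidth <= r < threshold]
    above = [(r - threshold, y) for r, y in zip(running, outcome)
             if threshold <= r <= threshold + bandwidth]
    b0, _ = fit_line(*zip(*below))  # intercept = fitted value at the threshold
    a0, _ = fit_line(*zip(*above))
    return a0 - b0

# Toy data: a linear trend plus a jump of 2.0 once the running variable reaches 0.
running = [i / 20 for i in range(-20, 21)]
outcome = [0.5 * r + (2.0 if r >= 0 else 0.0) for r in running]
effect = rdd_effect(running, outcome, threshold=0.0, bandwidth=0.5)
```

In the spiking setting described in the abstract, the running variable would presumably correspond to a neuron's input drive and the threshold to its spiking threshold, but that mapping is an interpretation rather than something this sketch implements.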