October 17, 2019


Paper Group ANR 878

A Deep Convolutional Neural Network for Lung Cancer Diagnostic. Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length. A Modified Image Comparison Algorithm Using Histogram Features. Diverse and Coherent Paragraph Generation from Images. Query-Conditioned Three-Player Adversarial Network for Video Summarization. Robust Tracking via …

A Deep Convolutional Neural Network for Lung Cancer Diagnostic


Title	A Deep Convolutional Neural Network for Lung Cancer Diagnostic
Authors	Mehdi Fatan Serj, Bahram Lavi, Gabriela Hoff, Domenec Puig Valls
Abstract	In this paper, we examine the strength of deep learning techniques for diagnosing lung cancer in medical image analysis. Convolutional neural network (CNN) models have become popular in pattern recognition and computer vision research because of their promising results in generating high-level image representations. We propose a new deep learning architecture for learning high-level image representations to achieve high classification accuracy with low variance in medical image binary classification tasks. We aim to learn discriminant compact features at the beginning of our deep convolutional neural network. We evaluate our model on the Kaggle Data Science Bowl 2017 (KDSB17) data set, and compare it with some related works proposed in the Kaggle competition.
Tasks
Published	2018-04-22
URL	http://arxiv.org/abs/1804.08170v1
PDF	http://arxiv.org/pdf/1804.08170v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-convolutional-neural-network-for-lung
Repo
Framework

Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length


Title	Incremental Non-Rigid Structure-from-Motion with Unknown Focal Length
Authors	Thomas Probst, Danda Pani Paudel, Ajad Chhatkuli, Luc Van Gool
Abstract	The perspective camera and the isometric surface prior have recently gathered increased attention for Non-Rigid Structure-from-Motion (NRSfM). Despite the recent progress, several challenges remain, particularly the computational complexity and the unknown camera focal length. In this paper we present a method for incremental NRSfM with the perspective camera model and the isometric surface prior with unknown focal length. In the template-based case, we provide a method to estimate four parameters of the camera intrinsics. For the template-less scenario of NRSfM, we propose a method to upgrade reconstructions obtained for one focal length to another based on local rigidity and the so-called Maximum Depth Heuristics (MDH). On its basis we propose a method to simultaneously recover the focal length and the non-rigid shapes. We further solve the problem of incorporating a large number of points and adding more views in MDH-based NRSfM and efficiently solve them with Second-Order Cone Programming (SOCP). This does not require any shape initialization and produces results orders of magnitude faster than many methods. We provide evaluations on standard sequences with ground-truth and qualitative reconstructions on challenging YouTube videos. These evaluations show that our method performs better in both speed and accuracy than the state of the art.
Tasks
Published	2018-08-13
URL	http://arxiv.org/abs/1808.04181v1
PDF	http://arxiv.org/pdf/1808.04181v1.pdf
PWC	https://paperswithcode.com/paper/incremental-non-rigid-structure-from-motion
Repo
Framework

A Modified Image Comparison Algorithm Using Histogram Features


Title	A Modified Image Comparison Algorithm Using Histogram Features
Authors	Anas M. Al-Oraiqat, Natalya S. Kostyukova
Abstract	This article discusses the problem of color image content comparison. In particular, methods of image content comparison are analyzed, the restrictions of the color histogram are described, and a modified method of image content comparison is proposed. This method uses color histograms and considers color locations. The base and modified algorithms are tested and analyzed. The modified method shows 97% average precision on a collection of about 700 images without losing the advantages of the base method, i.e. scale and rotation invariance.
Tasks
Published	2018-04-03
URL	http://arxiv.org/abs/1804.01142v1
PDF	http://arxiv.org/pdf/1804.01142v1.pdf
PWC	https://paperswithcode.com/paper/a-modified-image-comparison-algorithm-using
Repo
Framework
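
The idea of pairing color histograms with color locations can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. As an assumption, location is encoded by splitting the image into concentric rings around its center and comparing per-ring histograms with histogram intersection. The names `color_histogram`, `ring_histograms`, and `similarity`, and the parameters `rings` and `bins`, are hypothetical.

```python
import numpy as np


def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of an (N, 3) uint8 pixel array."""
    # Quantize each channel into `bins` levels, then build one joint bin index.
    idx = (pixels.astype(np.int64) * bins) // 256
    flat = idx[:, 0] * bins * bins + idx[:, 1] * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist


def ring_histograms(image, rings=3, bins=4):
    """One histogram per concentric ring (assumed way to encode location)."""
    h, w, _ = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Distance from the image center, scaled into [0, 1).
    r = np.hypot((yy - (h - 1) / 2) / (h / 2),
                 (xx - (w - 1) / 2) / (w / 2)) / np.sqrt(2)
    ring_idx = np.minimum((r * rings).astype(int), rings - 1)
    return [color_histogram(image[ring_idx == k], bins) for k in range(rings)]


def similarity(a, b, rings=3, bins=4):
    """Mean histogram intersection over rings; 1.0 means identical histograms."""
    ha = ring_histograms(a, rings, bins)
    hb = ring_histograms(b, rings, bins)
    return float(np.mean([np.minimum(x, y).sum() for x, y in zip(ha, hb)]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    print(similarity(img, img))
    print(similarity(img, np.rot90(img)))
```

Normalizing each histogram keeps the score independent of image size, and ring membership depends only on distance from the center, so a square image rotated by 90 degrees produces the same per-ring histograms.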

Diverse and Coherent Paragraph Generation from Images


Title	Diverse and Coherent Paragraph Generation from Images
Authors	Moitreya Chatterjee, Alexander G. Schwing
Abstract	Paragraph generation from images, which has gained popularity recently, is an important task for video summarization, editing, and support of the disabled. Traditional image captioning methods fall short on this front, since they aren’t designed to generate long informative descriptions. Moreover, the vanilla approach of simply concatenating multiple short sentences, possibly synthesized from a classical image captioning system, doesn’t embrace the intricacies of paragraphs: coherent sentences, globally consistent structure, and diversity. To address those challenges, we propose to augment paragraph generation techniques with ‘coherence vectors’, ‘global topic vectors’, and modeling of the inherent ambiguity of associating paragraphs with images, via a variational auto-encoder formulation. We demonstrate the effectiveness of the developed approach on two datasets, outperforming existing state-of-the-art techniques on both.
Tasks	Image Captioning, Video Summarization
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00681v1
PDF	http://arxiv.org/pdf/1809.00681v1.pdf
PWC	https://paperswithcode.com/paper/diverse-and-coherent-paragraph-generation
Repo
Framework

Query-Conditioned Three-Player Adversarial Network for Video Summarization


Title	Query-Conditioned Three-Player Adversarial Network for Video Summarization
Authors	Yujia Zhang, Michael Kampffmeyer, Xiaodan Liang, Min Tan, Eric P. Xing
Abstract	Video summarization plays an important role in video understanding by selecting key frames/shots. Traditionally, it aims to find the most representative and diverse contents in a video as short summaries. Recently, a more generalized task, query-conditioned video summarization, has been introduced, which takes user queries into consideration to learn more user-oriented summaries. In this paper, we propose a query-conditioned three-player generative adversarial network to tackle this challenge. The generator learns the joint representation of the user query and the video content, and the discriminator takes three pairs of query-conditioned summaries as the input to discriminate the real summary from a generated and a random one. A three-player loss is introduced for joint training of the generator and the discriminator, which forces the generator to learn better summary results, and avoids the generation of random trivial summaries. Experiments on a recently proposed query-conditioned video summarization benchmark dataset show the efficiency and efficacy of our proposed method.
Tasks	Video Summarization, Video Understanding
Published	2018-07-17
URL	http://arxiv.org/abs/1807.06677v1
PDF	http://arxiv.org/pdf/1807.06677v1.pdf
PWC	https://paperswithcode.com/paper/query-conditioned-three-player-adversarial
Repo
Framework

Robust Tracking via Weighted Online Extreme Learning Machine


Title	Robust Tracking via Weighted Online Extreme Learning Machine
Authors	Jing Zhang, Huibing Wang, Yonggong Ren
Abstract	Tracking methods based on the extreme learning machine (ELM) are efficient and effective. ELM randomly generates the input weights and biases of the hidden layer and then computes the output weights by reducing the iterative solution to a system of linear equations. ELM therefore offers satisfactory classification performance and faster training than other discriminative models in tracking. However, the original ELM method often suffers from an imbalanced class distribution: few target-object samples lead to under-fitting, while many background samples lead to over-fitting. Worse still, this reduces the robustness of tracking under special conditions including occlusion, illumination, etc. To address these problems, we present a robust tracking algorithm. First, we introduce into the original ELM a local weight matrix that is created dynamically from the data distribution at the current frame, so as to balance the empirical and structural risk and fully learn the target object to enhance the classification performance. Second, we extend it to an incremental learning method to keep tracking real-time and efficient. Finally, a forgetting factor is used to strengthen robustness to changes in the class distribution over time. Meanwhile, we propose a novel optimization method to obtain the optimal sample as the target object, which avoids tracking drift resulting from noisy samples. Our tracking method can therefore fully learn both the target object and background information to enhance tracking performance. It is evaluated on 20 challenging image sequences with different attributes including illumination, occlusion, deformation, etc., and achieves better performance than several state-of-the-art methods in terms of effectiveness and robustness.
Tasks
Published	2018-07-26
URL	http://arxiv.org/abs/1807.10211v1
PDF	http://arxiv.org/pdf/1807.10211v1.pdf
PWC	https://paperswithcode.com/paper/robust-tracking-via-weighted-online-extreme
Repo
Framework
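
The ELM recipe in the abstract (random hidden layer, output weights from a linear system, per-sample weights against class imbalance) can be sketched in a few lines. This is a minimal batch version under stated assumptions, not the paper's tracker: the weights are simple inverse class frequencies rather than the paper's local weight matrix, and the incremental update and forgetting factor are omitted. The names `train_weighted_elm` and `predict`, and the parameters `hidden` and `C`, are hypothetical.

```python
import numpy as np


def train_weighted_elm(X, y, hidden=50, C=10.0, seed=0):
    """Fit a weighted ELM for binary labels y in {0, 1}."""
    rng = np.random.default_rng(seed)
    # Random, untrained input weights and biases of the hidden layer.
    W_in = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W_in + b)

    # Assumed weighting: each sample gets 1 / (size of its class), so the
    # rare target class and the large background class count equally.
    counts = {label: np.sum(y == label) for label in np.unique(y)}
    w = np.array([1.0 / counts[label] for label in y])

    # Regression targets +1 / -1; solve (I/C + H^T W H) beta = H^T W t.
    t = np.where(y == 1, 1.0, -1.0)
    A = np.eye(hidden) / C + H.T @ (w[:, None] * H)
    beta = np.linalg.solve(A, H.T @ (w * t))
    return W_in, b, beta


def predict(model, X):
    """Return 0/1 labels from the sign of the ELM output."""
    W_in, b, beta = model
    return (np.tanh(X @ W_in + b) @ beta > 0).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Imbalanced toy data: 10 "target" samples versus 200 "background" samples.
    X = np.vstack([rng.normal(3.0, 0.5, (10, 2)),
                   rng.normal(-3.0, 0.5, (200, 2))])
    y = np.array([1] * 10 + [0] * 200)
    model = train_weighted_elm(X, y)
    print((predict(model, X) == y).mean())
```

Only `beta` is learned, and it comes from one linear solve, which is why ELM training is fast compared with iterative gradient-based models.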

Enhancing Stock Movement Prediction with Adversarial Training


Title	Enhancing Stock Movement Prediction with Adversarial Training
Authors	Fuli Feng, Huimin Chen, Xiangnan He, Ji Ding, Maosong Sun, Tat-Seng Chua
Abstract	This paper contributes a new machine learning solution for stock movement prediction, which aims to predict whether the price of a stock will go up or down in the near future. The key novelty is that we propose to employ adversarial training to improve the generalization of a neural network prediction model. The rationale for adversarial training here is that the input features to stock prediction are typically based on stock price, which is essentially a stochastic variable that changes continuously with time. As such, normal training with static price-based features (e.g. the close price) can easily overfit the data, being insufficient to obtain reliable models. To address this problem, we propose to add perturbations to simulate the stochasticity of the price variable, and train the model to work well under small yet intentional perturbations. Extensive experiments on two real-world stock datasets show that our method outperforms the state-of-the-art solution with 3.11% relative improvements on average w.r.t. accuracy, validating the usefulness of adversarial training for the stock prediction task.
Tasks	Stock Prediction
Published	2018-10-13
URL	https://arxiv.org/abs/1810.09936v2
PDF	https://arxiv.org/pdf/1810.09936v2.pdf
PWC	https://paperswithcode.com/paper/enhancing-stock-movement-prediction-with
Repo
Framework
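
Training under "small yet intentional perturbations" can be illustrated on a toy model. The sketch below is an assumption-laden stand-in, not the paper's network: it uses logistic regression, perturbs the raw input features (the paper may perturb a different representation), and moves each sample a distance `eps` along the gradient of its loss. The names `adversarial_examples` and `train`, and the parameters `eps` and `beta`, are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def adversarial_examples(X, y, w, b, eps):
    """Shift each sample by eps along the gradient of its log loss w.r.t. x."""
    p = sigmoid(X @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]  # d(log loss) / dx for logistic regression
    norm = np.linalg.norm(grad_x, axis=1, keepdims=True)
    return X + eps * grad_x / np.maximum(norm, 1e-12)


def train(X, y, eps=0.1, beta=0.5, lr=0.1, epochs=200):
    """Gradient descent on: clean log loss + beta * adversarial log loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Perturbations are recomputed each epoch and treated as constants.
        X_adv = adversarial_examples(X, y, w, b, eps)
        gw = np.zeros_like(w)
        gb = 0.0
        for data, weight in ((X, 1.0), (X_adv, beta)):
            err = sigmoid(data @ w + b) - y
            gw += weight * data.T @ err / len(y)
            gb += weight * err.mean()
        w -= lr * gw
        b -= lr * gb
    return w, b


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # Synthetic two-class data standing in for "up" (1) and "down" (0) samples.
    X = np.vstack([rng.normal(2.0, 1.0, (100, 2)),
                   rng.normal(-2.0, 1.0, (100, 2))])
    y = np.array([1.0] * 100 + [0.0] * 100)
    w, b = train(X, y)
    print(((sigmoid(X @ w + b) > 0.5) == (y == 1)).mean())
```

Here `beta` trades off fitting the observed features against fitting their perturbed versions, and `eps` bounds how far a feature vector may move.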

Metrics for Explainable AI: Challenges and Prospects


Title	Metrics for Explainable AI: Challenges and Prospects
Authors	Robert R. Hoffman, Shane T. Mueller, Gary Klein, Jordan Litman
Abstract	The question addressed in this paper is: If we present to a user an AI system that explains how it works, how do we know whether the explanation works and the user has achieved a pragmatic understanding of the AI? In other words, how do we know that an explainable AI system (XAI) is any good? Our focus is on the key concepts of measurement. We discuss specific methods for evaluating: (1) the goodness of explanations, (2) whether users are satisfied by explanations, (3) how well users understand the AI systems, (4) how curiosity motivates the search for explanations, (5) whether the user’s trust and reliance on the AI are appropriate, and finally, (6) how the human-XAI work system performs. The recommendations we present derive from our integration of extensive research literatures and our own psychometric evaluations.
Tasks
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04608v2
PDF	http://arxiv.org/pdf/1812.04608v2.pdf
PWC	https://paperswithcode.com/paper/metrics-for-explainable-ai-challenges-and
Repo
Framework

What deep learning can tell us about higher cognitive functions like mindreading?


Title	What deep learning can tell us about higher cognitive functions like mindreading?
Authors	Jaan Aru, Raul Vicente
Abstract	Can deep learning (DL) guide our understanding of the computations happening in the biological brain? We will first briefly consider how DL has contributed to the research on visual object recognition. In the main part we will assess whether DL could also help us to clarify the computations underlying higher cognitive functions such as Theory of Mind. In addition, we will compare the objectives and learning signals of brains and machines, leading us to conclude that simply scaling up the current DL algorithms will most likely not lead to human-level Theory of Mind.
Tasks	Object Recognition
Published	2018-03-28
URL	https://arxiv.org/abs/1803.10470v2
PDF	https://arxiv.org/pdf/1803.10470v2.pdf
PWC	https://paperswithcode.com/paper/what-deep-learning-can-tell-us-about-higher
Repo
Framework

Generating Responses Expressing Emotion in an Open-domain Dialogue System


Title	Generating Responses Expressing Emotion in an Open-domain Dialogue System
Authors	Chenyang Huang, Osmar R. Zaïane
Abstract	Neural network-based open-ended conversational agents automatically generate responses based on predictive models learned from a large number of pairs of utterances. The generated responses are typically acceptable as sentences but are often dull, generic, and certainly devoid of any emotion. In this paper, we present neural models that learn to express a given emotion in the generated response. We propose four models and evaluate them against three baselines. An encoder-decoder framework-based model with multiple attention layers provides the best overall performance in terms of expressing the required emotion. While it does not outperform other models on all emotions, it presents promising results in most cases.
Tasks
Published	2018-11-15
URL	http://arxiv.org/abs/1811.10990v1
PDF	http://arxiv.org/pdf/1811.10990v1.pdf
PWC	https://paperswithcode.com/paper/181110990
Repo
Framework

Learning DNFs under product distributions via μ-biased quantum Fourier sampling


Title	Learning DNFs under product distributions via μ-biased quantum Fourier sampling
Authors	Varun Kanade, Andrea Rocchetto, Simone Severini
Abstract	We show that DNF formulae can be quantum PAC-learned in polynomial time under product distributions using a quantum example oracle. The best classical algorithm (without access to membership queries) runs in superpolynomial time. Our result extends the work by Bshouty and Jackson (1998) that proved that DNF formulae are efficiently learnable under the uniform distribution using a quantum example oracle. Our proof is based on a new quantum algorithm that efficiently samples the coefficients of a μ-biased Fourier transform.
Tasks
Published	2018-02-15
URL	https://arxiv.org/abs/1802.05690v3
PDF	https://arxiv.org/pdf/1802.05690v3.pdf
PWC	https://paperswithcode.com/paper/learning-dnfs-under-product-distributions-via
Repo
Framework

Multi-Channel Pyramid Person Matching Network for Person Re-Identification


Title	Multi-Channel Pyramid Person Matching Network for Person Re-Identification
Authors	Chaojie Mao, Yingming Li, Yaqing Zhang, Zhongfei Zhang, Xi Li
Abstract	In this work, we present a Multi-Channel deep convolutional Pyramid Person Matching Network (MC-PPMN) based on the combination of the semantic-components and the color-texture distributions to address the problem of person re-identification. In particular, we learn separate deep representations for semantic-components and color-texture distributions from two person images and then employ pyramid person matching network (PPMN) to obtain correspondence representations. These correspondence representations are fused to perform the re-identification task. Further, the proposed framework is optimized via a unified end-to-end deep learning scheme. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our approach against the state-of-the-art literature, especially on the rank-1 recognition rate.
Tasks	Person Re-Identification
Published	2018-03-07
URL	http://arxiv.org/abs/1803.02558v1
PDF	http://arxiv.org/pdf/1803.02558v1.pdf
PWC	https://paperswithcode.com/paper/multi-channel-pyramid-person-matching-network
Repo
Framework

Enhancing Stock Market Prediction with Extended Coupled Hidden Markov Model over Multi-Sourced Data


Title	Enhancing Stock Market Prediction with Extended Coupled Hidden Markov Model over Multi-Sourced Data
Authors	Xi Zhang, Yixuan Li, Senzhang Wang, Binxing Fang, Philip S. Yu
Abstract	Traditional stock market prediction methods commonly only utilize the historical trading data, ignoring the fact that stock market fluctuations can be impacted by various other information sources such as stock related events. Although some recent works propose event-driven prediction approaches by considering the event data, how to leverage the joint impacts of multiple data sources still remains an open research problem. In this work, we study how to explore multiple data sources to improve the performance of the stock prediction. We introduce an Extended Coupled Hidden Markov Model incorporating the news events with the historical trading data. To address the data sparsity issue of news events for each single stock, we further study the fluctuation correlations between the stocks and incorporate the correlations into the model to facilitate the prediction task. Evaluations on China A-share market data in 2016 show the superior performance of our model against previous methods.
Tasks	Stock Market Prediction, Stock Prediction
Published	2018-09-02
URL	http://arxiv.org/abs/1809.00306v1
PDF	http://arxiv.org/pdf/1809.00306v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-stock-market-prediction-with
Repo
Framework

Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks


Title	Buildings Detection in VHR SAR Images Using Fully Convolution Neural Networks
Authors	Muhammad Shahzad, Michael Maurer, Friedrich Fraundorfer, Yuanyuan Wang, Xiao Xiang Zhu
Abstract	This paper addresses the highly challenging problem of automatically detecting man-made structures, especially buildings, in very high resolution (VHR) synthetic aperture radar (SAR) images. In this context, the paper has two major contributions: Firstly, it presents a novel and generic workflow that initially classifies the spaceborne TomoSAR point clouds, generated by processing VHR SAR image stacks using advanced interferometric techniques known as SAR tomography (TomoSAR), into buildings and non-buildings with the aid of auxiliary information (i.e., either using openly available 2-D building footprints or adopting an optical image classification scheme) and later back project the extracted building points onto the SAR imaging coordinates to produce automatic large-scale benchmark labelled (buildings/non-buildings) SAR datasets. Secondly, these labelled datasets (i.e., building masks) have been utilized to construct and train the state-of-the-art deep Fully Convolution Neural Networks with an additional Conditional Random Field represented as a Recurrent Neural Network to detect building regions in a single VHR SAR image. Such a cascaded formation has been successfully employed in computer vision and remote sensing fields for optical image classification but, to our knowledge, has not been applied to SAR images. The results of the building detection are illustrated and validated over a TerraSAR-X VHR spotlight SAR image covering approximately 39 km² (almost the whole city of Berlin) with mean pixel accuracies of around 93.84%.
Tasks	Image Classification
Published	2018-08-14
URL	http://arxiv.org/abs/1808.06155v1
PDF	http://arxiv.org/pdf/1808.06155v1.pdf
PWC	https://paperswithcode.com/paper/buildings-detection-in-vhr-sar-images-using
Repo
Framework

Unsupervised Person Image Synthesis in Arbitrary Poses


Title	Unsupervised Person Image Synthesis in Arbitrary Poses
Authors	Albert Pumarola, Antonio Agudo, Alberto Sanfeliu, Francesc Moreno-Noguer
Abstract	We present a novel approach for synthesizing photo-realistic images of people in arbitrary poses using generative adversarial learning. Given an input image of a person and a desired pose represented by a 2D skeleton, our model renders the image of the same person under the new pose, synthesizing novel views of the parts visible in the input image and hallucinating those that are not seen. This problem has recently been addressed in a supervised manner, i.e., during training the ground truth images under the new poses are given to the network. We go beyond these approaches by proposing a fully unsupervised strategy. We tackle this challenging scenario by splitting the problem into two principal subtasks. First, we consider a pose conditioned bidirectional generator that maps back the initially rendered image to the original pose, hence being directly comparable to the input image without the need to resort to any training image. Second, we devise a novel loss function that incorporates content and style terms, and aims at producing images of high perceptual quality. Extensive experiments conducted on the DeepFashion dataset demonstrate that the images rendered by our model are very close in appearance to those obtained by fully supervised approaches.
Tasks	Image Generation
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10280v1
PDF	http://arxiv.org/pdf/1809.10280v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-person-image-synthesis-in
Repo
Framework