July 28, 2019

Paper Group ANR 204

Understanding the Mechanisms of Deep Transfer Learning for Medical Images. Study of Clear Sky Models for Singapore. Personalized and Occupational-aware Age Progression by Generative Adversarial Networks. A Maximum Matching Algorithm for Basis Selection in Spectral Learning. Neural Speed Reading via Skim-RNN. Discriminative Bimodal Networks for Visu …

Understanding the Mechanisms of Deep Transfer Learning for Medical Images


Title	Understanding the Mechanisms of Deep Transfer Learning for Medical Images
Authors	Hariharan Ravishankar, Prasad Sudhakar, Rahul Venkataramani, Sheshadri Thiruvenkadam, Pavan Annangi, Narayanan Babu, Vivek Vaidya
Abstract	The ability to automatically learn task-specific feature representations has led to the huge success of deep learning methods. When large training datasets are scarce, as in medical imaging problems, transfer learning has been very effective. In this paper, we systematically investigate the process of transferring a Convolutional Neural Network, trained on ImageNet images to perform image classification, to the kidney detection problem in ultrasound images. We study how the detection performance depends on the extent of transfer. We show that a transferred and tuned CNN can outperform a state-of-the-art feature-engineered pipeline, and that a hybridization of these two techniques achieves 20% higher performance. We also investigate the evolution of intermediate response images from our network. Finally, we compare these responses to state-of-the-art image processing filters in order to gain greater insight into how transfer learning is able to effectively manage widely varying imaging regimes.
Tasks	Image Classification, Transfer Learning
Published	2017-04-20
URL	http://arxiv.org/abs/1704.06040v1
PDF	http://arxiv.org/pdf/1704.06040v1.pdf
PWC	https://paperswithcode.com/paper/understanding-the-mechanisms-of-deep-transfer
Repo
Framework

Study of Clear Sky Models for Singapore


Title	Study of Clear Sky Models for Singapore
Authors	Soumyabrata Dev, Shilpa Manandhar, Yee Hui Lee, Stefan Winkler
Abstract	The estimation of total solar irradiance falling on the earth’s surface is important in the field of solar energy generation and forecasting. Several clear-sky solar radiation models have been developed over the last few decades. Most of these models are based on the empirical distribution of various geographical parameters, while a few models consider various atmospheric effects in the solar energy estimation. In this paper, we perform a comparative analysis of several popular clear-sky models in the tropical region of Singapore. This is important in countries like Singapore, where we are primarily focused on reliable and efficient solar energy generation. We analyze and compare three popular clear-sky models that are widely used in the literature. We validate our solar estimation results using actual solar irradiance measurements obtained from collocated weather stations. Finally, we identify the most reliable clear-sky model for Singapore, based on all clear-sky days in a year.
Tasks
Published	2017-08-24
URL	http://arxiv.org/abs/1708.08760v1
PDF	http://arxiv.org/pdf/1708.08760v1.pdf
PWC	https://paperswithcode.com/paper/study-of-clear-sky-models-for-singapore
Repo
Framework

Personalized and Occupational-aware Age Progression by Generative Adversarial Networks


Title	Personalized and Occupational-aware Age Progression by Generative Adversarial Networks
Authors	Siyu Zhou, Weiqiang Zhao, Jiashi Feng, Hanjiang Lai, Yan Pan, Jian Yin, Shuicheng Yan
Abstract	Face age progression, which aims to predict a person's future looks, is important for various applications and has received considerable attention. Existing methods and datasets are limited in exploring the effects of occupations, which may influence personal appearance. In this paper, we first introduce an occupational face aging dataset for studying the influence of occupations on appearance. It includes five occupations, which enables the development of new algorithms for age progression and facilitates future research. Second, we propose a new occupational-aware adversarial face aging network, which learns the human aging process under different occupations. Two factors are taken into consideration in our aging process: personality preservation and visually plausible texture change for different occupations. We propose a personalized network with a personalized loss in a deep autoencoder network for keeping personalized facial characteristics, and an occupational-aware adversarial network with an occupational-aware adversarial loss for obtaining more realistic texture changes. Experimental results demonstrate the advantages of the proposed method in comparison with other state-of-the-art age progression methods.
Tasks
Published	2017-11-26
URL	http://arxiv.org/abs/1711.09368v2
PDF	http://arxiv.org/pdf/1711.09368v2.pdf
PWC	https://paperswithcode.com/paper/personalized-and-occupational-aware-age
Repo
Framework

A Maximum Matching Algorithm for Basis Selection in Spectral Learning


Title	A Maximum Matching Algorithm for Basis Selection in Spectral Learning
Authors	Ariadna Quattoni, Xavier Carreras, Matthias Gallé
Abstract	We present a solution to scale spectral algorithms for learning sequence functions. We are interested in the case where these functions are sparse (that is, for most sequences they return 0). Spectral algorithms reduce the learning problem to the task of computing an SVD decomposition over a special type of matrix called the Hankel matrix. This matrix is designed to capture the relevant statistics of the training sequences. What is crucial is that to capture long range dependencies we must consider very large Hankel matrices. Thus the computation of the SVD becomes a critical bottleneck. Our solution finds a subset of rows and columns of the Hankel that realizes a compact and informative Hankel submatrix. The novelty lies in the way that this subset is selected: we exploit a maximal bipartite matching combinatorial algorithm to look for a sub-block with full structural rank, and show how computation of this sub-block can be further improved by exploiting the specific structure of Hankel matrices.
Tasks
Published	2017-06-09
URL	http://arxiv.org/abs/1706.02857v1
PDF	http://arxiv.org/pdf/1706.02857v1.pdf
PWC	https://paperswithcode.com/paper/a-maximum-matching-algorithm-for-basis
Repo
Framework
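
The abstract above describes selecting a Hankel sub-block with full structural rank by means of bipartite matching. As a rough illustration only (not the authors' algorithm), the sketch below builds a small count-based Hankel matrix and computes its structural rank as the size of a maximum matching between rows and columns over nonzero entries. The function names, the toy sequences, and the choice of prefixes and suffixes are all assumptions made for this example.

```python
from collections import Counter


def hankel_counts(sequences, prefixes, suffixes):
    """Build H where H[i][j] counts training sequences equal to prefixes[i] + suffixes[j]."""
    counts = Counter(sequences)  # missing keys read as 0
    return [[counts[p + s] for s in suffixes] for p in prefixes]


def structural_rank(matrix):
    """Size of a maximum row/column matching over nonzero entries (augmenting paths)."""
    n_cols = len(matrix[0]) if matrix else 0
    match_col = [-1] * n_cols  # match_col[j] is the row currently matched to column j

    def try_row(i, seen):
        # Try to match row i, re-routing earlier rows if that frees a column.
        for j in range(n_cols):
            if matrix[i][j] != 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(1 for i in range(len(matrix)) if try_row(i, [False] * n_cols))


# Hypothetical toy data: four training sequences over the alphabet {a, b}.
toy_sequences = ["ab", "ab", "ba", "a"]
toy_basis = ["", "a", "b"]
H = hankel_counts(toy_sequences, toy_basis, toy_basis)
rank = structural_rank(H)
```

Here the first and third rows have a nonzero entry only in the same column, so at most two rows can be matched to distinct columns. A row/column subset realizing such a matching is the kind of compact sub-block the abstract refers to.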

Neural Speed Reading via Skim-RNN


Title	Neural Speed Reading via Skim-RNN
Authors	Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi
Abstract	Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens. Skim-RNN gives a computational advantage over an RNN that always updates the entire hidden state. Skim-RNN uses the same input and output interfaces as a standard RNN and can be easily used instead of RNNs in existing models. In our experiments, we show that Skim-RNN can achieve significantly reduced computational cost without losing accuracy compared to standard RNNs across five different natural language tasks. In addition, we demonstrate that the trade-off between accuracy and speed of Skim-RNN can be dynamically controlled during inference time in a stable manner. Our analysis also shows that Skim-RNN running on a single CPU offers lower latency compared to standard RNNs on GPUs.
Tasks
Published	2017-11-06
URL	http://arxiv.org/abs/1711.02085v3
PDF	http://arxiv.org/pdf/1711.02085v3.pdf
PWC	https://paperswithcode.com/paper/neural-speed-reading-via-skim-rnn
Repo
Framework

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries


Title	Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Authors	Yuting Zhang, Luyao Yuan, Yijie Guo, Zhiyuan He, I-An Huang, Honglak Lee
Abstract	Associating image regions with text queries has been recently explored as a new way to bridge visual and linguistic representations. A few pioneering approaches have been proposed based on recurrent neural language models trained generatively (e.g., generating captions), but they achieve somewhat limited localization accuracy. To better address natural-language-based visual entity localization, we propose a discriminative approach. We formulate a discriminative bimodal neural network (DBNet), which can be trained by a classifier with extensive use of negative samples. Our training objective encourages better localization on single images, incorporates text phrases in a broad range, and properly pairs image regions with text phrases into positive and negative examples. Experiments on the Visual Genome dataset demonstrate that the proposed DBNet significantly outperforms previous state-of-the-art methods both for localization on single images and for detection on multiple images. We also establish an evaluation protocol for natural-language visual detection.
Tasks	Visual Localization
Published	2017-04-12
URL	http://arxiv.org/abs/1704.03944v2
PDF	http://arxiv.org/pdf/1704.03944v2.pdf
PWC	https://paperswithcode.com/paper/discriminative-bimodal-networks-for-visual
Repo
Framework

Dual Path Networks for Multi-Person Human Pose Estimation


Title	Dual Path Networks for Multi-Person Human Pose Estimation
Authors	Guanghan Ning, Zhihai He
Abstract	The task of multi-person human pose estimation in natural scenes is quite challenging. Existing methods include both top-down and bottom-up approaches. The main advantage of bottom-up methods is their excellent tradeoff between estimation accuracy and computational cost. We follow this path and aim to design smaller, faster, and more accurate neural networks for the regression of keypoints and limb association vectors. These two regression tasks are naturally dependent on each other. In this work, we propose a dual-path network specially designed for multi-person human pose estimation, and compare our performance with the OpenPose network in terms of model size, forward speed, and estimation accuracy.
Tasks	Pose Estimation
Published	2017-10-27
URL	http://arxiv.org/abs/1710.10192v1
PDF	http://arxiv.org/pdf/1710.10192v1.pdf
PWC	https://paperswithcode.com/paper/dual-path-networks-for-multi-person-human
Repo
Framework

Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox


Title	Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch-Prox
Authors	Jialei Wang, Weiran Wang, Nathan Srebro
Abstract	We present and analyze an approach for distributed stochastic optimization which is statistically optimal and achieves near-linear speedups (up to logarithmic factors). Our approach allows a communication-memory tradeoff, with either logarithmic communication but linear memory, or polynomial communication and a corresponding polynomial reduction in required memory. This communication-memory tradeoff is achieved through minibatch-prox iterations (minibatch passive-aggressive updates), where a subproblem on a minibatch is solved at each iteration. We provide a novel analysis for such a minibatch-prox procedure which achieves the statistically optimal rate regardless of minibatch size and smoothness, thus significantly improving on prior work.
Tasks	Stochastic Optimization
Published	2017-02-21
URL	http://arxiv.org/abs/1702.06269v2
PDF	http://arxiv.org/pdf/1702.06269v2.pdf
PWC	https://paperswithcode.com/paper/memory-and-communication-efficient
Repo
Framework
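
To make the "subproblem on a minibatch" in the abstract above concrete, here is a minimal single-machine sketch of one minibatch-prox step. It assumes a one-dimensional squared loss, 0.5 * (w - x)^2 per sample, chosen only because the proximal subproblem then has a closed form. The function name and the parameter `gamma` (the weight of the proximity term) are illustrative and not taken from the paper, and the distributed aspects are not modelled.

```python
def minibatch_prox_step(w_prev, batch, gamma):
    """Solve argmin_w  mean_i 0.5*(w - x_i)**2 + (gamma/2)*(w - w_prev)**2 in closed form.

    Setting the derivative to zero gives (w - mean) + gamma*(w - w_prev) = 0.
    """
    batch_mean = sum(batch) / len(batch)
    return (batch_mean + gamma * w_prev) / (1.0 + gamma)


def run_minibatch_prox(batches, gamma, w0=0.0):
    """Apply one prox step per minibatch and return the final iterate."""
    w = w0
    for batch in batches:
        w = minibatch_prox_step(w, batch, gamma)
    return w
```

With `gamma = 0` the step jumps straight to the minibatch mean. Larger values of `gamma` keep the new iterate closer to the previous one, which is the role the proximity term plays in this toy setting.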

Associations among Image Assessments as Cost Functions in Linear Decomposition: MSE, SSIM, and Correlation Coefficient


Title	Associations among Image Assessments as Cost Functions in Linear Decomposition: MSE, SSIM, and Correlation Coefficient
Authors	Jianji Wang, Nanning Zheng, Badong Chen, Jose C. Principe
Abstract	The traditional methods of image assessment, such as mean squared error (MSE), signal-to-noise ratio (SNR), and peak signal-to-noise ratio (PSNR), are all based on the absolute error of images. Pearson’s inner-product correlation coefficient (PCC) is also commonly used to measure the similarity between images. The structural similarity (SSIM) index is another important measurement, which has been shown to be more effective in the human vision system (HVS). Although there are many essential differences among these image assessments, some important associations among them as cost functions in linear decomposition are discussed in this paper. Firstly, the bases selected from a basis set for a target vector are the same in the linear decomposition schemes with the different cost functions MSE, SSIM, and PCC. Moreover, for a target vector, the ratio of the corresponding affine parameters in the MSE-based linear decomposition scheme and the SSIM-based scheme is a constant, which is just the value of PCC between the target vector and its estimated vector.
Tasks
Published	2017-08-04
URL	http://arxiv.org/abs/1708.01541v1
PDF	http://arxiv.org/pdf/1708.01541v1.pdf
PWC	https://paperswithcode.com/paper/associations-among-image-assessments-as-cost
Repo
Framework
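
For readers unfamiliar with the measures named in the abstract above, the following sketch computes MSE, PSNR, PCC, and a global (single-window) SSIM for two flattened images given as lists of pixel values. It assumes 8-bit data (peak value 255), population variances, and the commonly used SSIM constants 0.01 and 0.03. It illustrates the standard definitions only and does not reproduce the paper's linear-decomposition analysis.

```python
import math


def mse(x, y):
    """Mean squared error between two equal-length pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(x, y)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def _moments(x, y):
    """Means, population variances, and covariance of the two pixel lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, var_x, var_y, cov


def pcc(x, y):
    """Pearson correlation coefficient (undefined for constant inputs)."""
    _, _, var_x, var_y, cov = _moments(x, y)
    return cov / math.sqrt(var_x * var_y)


def ssim_global(x, y, peak=255.0):
    """SSIM computed over the whole image as a single window."""
    mx, my, var_x, var_y, cov = _moments(x, y)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    )


# Hypothetical example: y is x brightened by a constant offset of 2.
x = [10.0, 20.0, 30.0, 40.0]
y = [12.0, 22.0, 32.0, 42.0]
```

In this example the constant offset produces a nonzero MSE while leaving the PCC at 1, which shows the difference between error-based and correlation-based assessment that the abstract mentions.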

Phase Conductor on Multi-layered Attentions for Machine Comprehension


Title	Phase Conductor on Multi-layered Attentions for Machine Comprehension
Authors	Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina
Abstract	Attention models have been intensively studied to improve NLP tasks such as machine comprehension, via both question-aware passage attention models and self-matching attention models. Our research proposes phase conductor (PhaseCond) for attention models in two meaningful ways. First, PhaseCond, an architecture of multi-layered attention models, consists of multiple phases, each implementing a stack of attention layers producing passage representations and a stack of inner or outer fusion layers regulating the information flow. Second, we extend and improve the dot-product attention function for PhaseCond by simultaneously encoding multiple question and passage embedding layers from different perspectives. We demonstrate the effectiveness of our proposed model PhaseCond on the SQuAD dataset, showing that our model significantly outperforms both state-of-the-art single-layered and multiple-layered attention models. We deepen our results with new findings via both detailed qualitative analysis and visualized examples showing the dynamic changes through multi-layered attention models.
Tasks	Question Answering, Reading Comprehension
Published	2017-10-28
URL	http://arxiv.org/abs/1710.10504v2
PDF	http://arxiv.org/pdf/1710.10504v2.pdf
PWC	https://paperswithcode.com/paper/phase-conductor-on-multi-layered-attentions
Repo
Framework

The Impact of Coevolution and Abstention on the Emergence of Cooperation


Title	The Impact of Coevolution and Abstention on the Emergence of Cooperation
Authors	Marcos Cardinot, Colm O’Riordan, Josephine Griffith
Abstract	This paper explores the Coevolutionary Optional Prisoner’s Dilemma (COPD) game, which is a simple model to coevolve game strategy and link weights of agents playing the Optional Prisoner’s Dilemma game. We consider a population of agents placed in a lattice grid with boundary conditions. A number of Monte Carlo simulations are performed to investigate the impacts of the COPD game on the emergence of cooperation. Results show that the coevolutionary rules enable cooperators to survive and even dominate, with the presence of abstainers in the population playing a key role in the protection of cooperators against exploitation from defectors. We observe that in adverse conditions such as when the initial population of abstainers is too scarce/abundant, or when the temptation to defect is very high, cooperation has no chance of emerging. However, when the simple coevolutionary rules are applied, cooperators flourish.
Tasks
Published	2017-04-28
URL	http://arxiv.org/abs/1705.00094v1
PDF	http://arxiv.org/pdf/1705.00094v1.pdf
PWC	https://paperswithcode.com/paper/the-impact-of-coevolution-and-abstention-on
Repo
Framework
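
The Optional Prisoner's Dilemma mentioned in the abstract above adds an abstain move to the usual cooperate/defect choice. The sketch below shows one common way to write its payoffs and a link-weighted utility. The specific payoff values, the rule that either player abstaining gives both the loner's payoff, and the weighting of payoffs by link weights are assumptions made for illustration, not parameters confirmed by this listing.

```python
def opd_payoff(mine, theirs, temptation=1.5, loner=0.5,
               reward=1.0, punishment=0.0, sucker=0.0):
    """Payoff to the player choosing `mine` against `theirs`.

    Moves are "C" (cooperate), "D" (defect), and "A" (abstain).
    """
    if mine == "A" or theirs == "A":
        return loner  # assumed: any abstention yields the loner's payoff to both
    if mine == "C":
        return reward if theirs == "C" else sucker
    return temptation if theirs == "C" else punishment


def weighted_utility(my_move, neighbours):
    """Sum of payoffs against (move, link_weight) pairs, scaled by each link weight."""
    return sum(weight * opd_payoff(my_move, move) for move, weight in neighbours)


# Hypothetical neighbourhood on a lattice: four neighbours with unequal link weights.
neighbours = [("C", 1.0), ("D", 0.5), ("A", 1.0), ("C", 2.0)]
utility_if_cooperating = weighted_utility("C", neighbours)
```

Under these assumed values, a cooperator earns nothing from the defecting neighbour but still collects the loner's payoff from the abstainer, which hints at how abstainers can shield cooperators from exploitation.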

Hedera: Scalable Indexing and Exploring Entities in Wikipedia Revision History


Title	Hedera: Scalable Indexing and Exploring Entities in Wikipedia Revision History
Authors	Tuan Tran, Tu Ngoc Nguyen
Abstract	Much of the work in the semantic web that relies on Wikipedia as the main source of knowledge operates on static snapshots of the dataset. The full history of Wikipedia revisions, while containing much more useful information, is still difficult to access due to its exceptional volume. To enable further research on this collection, we developed a tool, named Hedera, that efficiently extracts semantic information from Wikipedia revision history datasets. Hedera exploits the Map-Reduce paradigm to achieve rapid extraction; it is able to handle one entire Wikipedia article revision history within a day on a medium-scale cluster, and supports flexible data structures for various kinds of semantic web studies.
Tasks
Published	2017-01-14
URL	http://arxiv.org/abs/1701.03937v1
PDF	http://arxiv.org/pdf/1701.03937v1.pdf
PWC	https://paperswithcode.com/paper/hedera-scalable-indexing-and-exploring
Repo
Framework

Proposal Flow: Semantic Correspondences from Object Proposals


Title	Proposal Flow: Semantic Correspondences from Object Proposals
Authors	Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce
Abstract	Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals. Unlike prevailing semantic flow approaches that operate on pixels or regularly sampled local regions, proposal flow benefits from the characteristics of modern object proposals, that exhibit high repeatability at multiple scales, and can take advantage of both local and geometric consistency constraints among proposals. We also show that the corresponding sparse proposal flow can effectively be transformed into a conventional dense flow field. We introduce two new challenging datasets that can be used to evaluate both general semantic flow techniques and region-based approaches such as proposal flow. We use these benchmarks to compare different matching algorithms, object proposals, and region features within proposal flow, to the state of the art in semantic flow. This comparison, along with experiments on standard datasets, demonstrates that proposal flow significantly outperforms existing semantic flow methods in various settings.
Tasks
Published	2017-03-21
URL	http://arxiv.org/abs/1703.07144v1
PDF	http://arxiv.org/pdf/1703.07144v1.pdf
PWC	https://paperswithcode.com/paper/proposal-flow-semantic-correspondences-from
Repo
Framework

Multi-view Graph Embedding with Hub Detection for Brain Network Analysis


Title	Multi-view Graph Embedding with Hub Detection for Brain Network Analysis
Authors	Guixiang Ma, Chun-Ta Lu, Lifang He, Philip S. Yu, Ann B. Ragin
Abstract	Multi-view graph embedding has become a widely studied problem in the area of graph learning. Most of the existing works on multi-view graph embedding aim to find a shared common node embedding across all the views of the graph by combining the different views in a specific way. Hub detection, as another essential topic in graph mining, has also drawn extensive attention in recent years, especially in the context of brain network analysis. Both graph embedding and hub detection relate to the node clustering structure of graphs. The multi-view graph embedding usually implies the node clustering structure of the graph based on the multiple views, while the hubs are the boundary-spanning nodes across different node clusters in the graph and thus may potentially influence the clustering structure of the graph. However, none of the existing works in multi-view graph embedding considers the hubs when learning the multi-view embeddings. In this paper, we propose to incorporate the hub detection task into the multi-view graph embedding framework so that the two tasks can benefit each other. Specifically, we propose an auto-weighted framework of Multi-view Graph Embedding with Hub Detection (MVGE-HD) for brain network analysis. The MVGE-HD framework learns a unified graph embedding across all the views while reducing the potential influence of the hubs on blurring the boundaries between node clusters in the graph, thus leading to a clear and discriminative node clustering structure for the graph. We apply MVGE-HD on two real multi-view brain network datasets (i.e., HIV and Bipolar). The experimental results demonstrate the superior performance of the proposed framework in brain network analysis for clinical investigation and application.
Tasks	Graph Embedding
Published	2017-09-12
URL	http://arxiv.org/abs/1709.03659v1
PDF	http://arxiv.org/pdf/1709.03659v1.pdf
PWC	https://paperswithcode.com/paper/multi-view-graph-embedding-with-hub-detection
Repo
Framework

Fastest Convergence for Q-learning


Title	Fastest Convergence for Q-learning
Authors	Adithya M. Devraj, Sean P. Meyn
Abstract	The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins’ original algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed so that its asymptotic variance is optimal. Moreover, an ODE analysis suggests that the transient behavior is a close match to a deterministic Newton-Raphson implementation. This is made possible by a two time-scale update equation for the matrix gain sequence. The analysis suggests that the approach will lead to stable and efficient computation even for non-ideal parameterized settings. Numerical experiments confirm the quick convergence, even in such non-ideal cases. A secondary goal of this paper is tutorial. The first half of the paper contains a survey on reinforcement learning algorithms, with a focus on minimum variance algorithms.
Tasks	Q-Learning
Published	2017-07-12
URL	http://arxiv.org/abs/1707.03770v2
PDF	http://arxiv.org/pdf/1707.03770v2.pdf
PWC	https://paperswithcode.com/paper/fastest-convergence-for-q-learning
Repo
Framework
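
As background for the abstract above, the sketch below shows the classic tabular update of Watkins' Q-learning, the baseline that Zap Q-learning is said to improve on. It uses a scalar step size `alpha` rather than the matrix gain described in the abstract, so it is not an implementation of Zap Q-learning. The dictionary-based Q-table, the state and action names, and the default parameter values are all illustrative assumptions.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one Watkins Q-learning step in place and return the new Q(state, action).

    q maps each state to a dict of action -> estimated value.
    """
    best_next = max(q[next_state].values())  # greedy value of the next state
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return q[state][action]


# Hypothetical two-state table.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
new_value = q_learning_update(q_table, "s0", "right", reward=1.0,
                              next_state="s1", alpha=0.5, gamma=0.9)
```

The update moves the current estimate a fraction `alpha` of the way toward the bootstrapped target `reward + gamma * max_a Q(next_state, a)`. As the abstract describes it, Zap Q-learning replaces this scalar gain with a matrix gain sequence.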