October 17, 2019


Paper Group ANR 784

Runtime Analysis of Probabilistic Crowding and Restricted Tournament Selection for Bimodal Optimisation


Title	Runtime Analysis of Probabilistic Crowding and Restricted Tournament Selection for Bimodal Optimisation
Authors	Edgar Covantes Osuna, Dirk Sudholt
Abstract	Many real optimisation problems lead to multimodal domains and so require the identification of multiple optima. Niching methods have been developed to maintain the population diversity, to investigate many peaks in parallel and to reduce the effect of genetic drift. Using rigorous runtime analysis, we analyse for the first time two well known niching methods: probabilistic crowding and restricted tournament selection (RTS). We incorporate both methods into a $(\mu+1)$ EA on the bimodal function Twomax where the goal is to find two optima at opposite ends of the search space. In probabilistic crowding, the offspring compete with their parents and the survivor is chosen proportionally to its fitness. On Twomax, probabilistic crowding fails to find any reasonable solution quality even in exponential time. In RTS the offspring compete against the closest individual amongst $w$ (window size) individuals. We prove that RTS fails if $w$ is too small, leading to exponential times with high probability. However, if $w$ is chosen large enough, it finds both optima for Twomax in time $O(\mu n \log{n})$ with high probability. Our theoretical results are accompanied by experimental studies that match the theoretical results and also shed light on parameters not covered by the theoretical results.
Tasks
Published	2018-03-26
URL	http://arxiv.org/abs/1803.09766v1
PDF	http://arxiv.org/pdf/1803.09766v1.pdf
PWC	https://paperswithcode.com/paper/runtime-analysis-of-probabilistic-crowding
Repo
Framework
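
The two replacement rules described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the mutation rate of 1/n, sampling the RTS window with replacement, and the default parameters in `run_rts` are all assumptions made for illustration.

```python
import random

def twomax(x):
    """TwoMax fitness: the larger of the number of ones and the number of zeros."""
    ones = sum(x)
    return max(ones, len(x) - ones)

def mutate(x, rng):
    """Standard bit mutation (assumed): flip each bit independently with probability 1/n."""
    n = len(x)
    return [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in x]

def hamming(x, y):
    """Number of positions in which two bit strings differ."""
    return sum(a != b for a, b in zip(x, y))

def crowding_step(pop, rng):
    """Probabilistic crowding: the offspring competes with its parent and
    survives with probability proportional to its fitness."""
    i = rng.randrange(len(pop))
    parent, child = pop[i], mutate(pop[i], rng)
    f_parent, f_child = twomax(parent), twomax(child)
    if rng.random() < f_child / (f_parent + f_child):
        pop[i] = child

def rts_step(pop, w, rng):
    """Restricted tournament selection: the offspring competes against the
    closest of w sampled individuals and replaces it if it is no worse."""
    child = mutate(rng.choice(pop), rng)
    # Assumption: the window is sampled uniformly at random with replacement.
    window = [rng.randrange(len(pop)) for _ in range(w)]
    closest = min(window, key=lambda idx: hamming(pop[idx], child))
    if twomax(child) >= twomax(pop[closest]):
        pop[closest] = child

def run_rts(n=20, mu=8, w=16, steps=2000, seed=0):
    """Run a (mu+1) EA with RTS on TwoMax; all defaults are hypothetical."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    for _ in range(steps):
        rts_step(pop, w, rng)
    return pop
```

Calling `run_rts()` returns the final population; whether it contains both the all-ones and the all-zeros string depends on `w`, which is the dependence the paper analyses.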

Categorical Mixture Models on VGGNet activations


Title	Categorical Mixture Models on VGGNet activations
Authors	Sean Billings
Abstract	In this project, I use unsupervised learning techniques to cluster a set of Yelp restaurant photos under meaningful topics. To do this, I extract layer activations from a pre-trained implementation of the popular VGGNet convolutional neural network. First, I explore using LDA with the activations of convolutional layers as features. Second, I explore using the object-recognition powers of VGGNet trained on ImageNet to extract meaningful objects from the photos, and then perform LDA to group the photos under topic-archetypes. I find that this second approach finds meaningful archetypes, which match the human intuition for photo topics such as restaurant, food, and drinks. Furthermore, these clusters align well and distinctly with the actual Yelp photo labels.
Tasks	Object Recognition
Published	2018-03-06
URL	http://arxiv.org/abs/1803.02446v1
PDF	http://arxiv.org/pdf/1803.02446v1.pdf
PWC	https://paperswithcode.com/paper/categorical-mixture-models-on-vggnet
Repo
Framework

Multi-label Object Attribute Classification using a Convolutional Neural Network


Title	Multi-label Object Attribute Classification using a Convolutional Neural Network
Authors	Soubarna Banik, Mikko Lauri, Simone Frintrop
Abstract	Objects of different classes can be described using a limited number of attributes such as color, shape, pattern, and texture. Learning to detect object attributes instead of only detecting objects can be helpful in dealing with a priori unknown objects. With this inspiration, a deep convolutional neural network for low-level object attribute classification, called the Deep Attribute Network (DAN), is proposed. Since object features are implicitly learned by object recognition networks, one such existing network is modified and fine-tuned for developing DAN. The performance of DAN is evaluated on the ImageNet Attribute and a-Pascal datasets. Experiments show that in comparison with state-of-the-art methods, the proposed model achieves better results.
Tasks	Object Recognition
Published	2018-11-10
URL	http://arxiv.org/abs/1811.04309v1
PDF	http://arxiv.org/pdf/1811.04309v1.pdf
PWC	https://paperswithcode.com/paper/multi-label-object-attribute-classification
Repo
Framework

Knowledge-based Fully Convolutional Network and Its Application in Segmentation of Lung CT Images


Title	Knowledge-based Fully Convolutional Network and Its Application in Segmentation of Lung CT Images
Authors	Tao Yu, Yu Qiao, Huan Long
Abstract	A variety of deep neural networks have been applied in medical image segmentation and achieve good performance. Unlike natural images, medical images of the same imaging modality are characterized by the same pattern, which indicates that the same normal organs or tissues are located at similar positions in the images. Thus, in this paper we try to incorporate the prior knowledge of medical images into the structure of neural networks such that the prior knowledge can be utilized for accurate segmentation. Based on this idea, we propose a novel deep network called knowledge-based fully convolutional network (KFCN) for medical image segmentation. The segmentation function and the corresponding error are analyzed. We show the existence of an asymptotically stable region for KFCN which traditional FCN doesn’t possess. Experiments validate our knowledge assumption about the incorporation of prior knowledge into the convolution kernels of KFCN and show that KFCN can achieve a reasonable segmentation and a satisfactory accuracy.
Tasks	Medical Image Segmentation, Semantic Segmentation
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08492v1
PDF	http://arxiv.org/pdf/1805.08492v1.pdf
PWC	https://paperswithcode.com/paper/knowledge-based-fully-convolutional-network
Repo
Framework

Interactive Medical Image Segmentation via Point-Based Interaction and Sequential Patch Learning


Title	Interactive Medical Image Segmentation via Point-Based Interaction and Sequential Patch Learning
Authors	Jinquan Sun, Yinghuan Shi, Yang Gao, Lei Wang, Luping Zhou, Wanqi Yang, Dinggang Shen
Abstract	Due to low tissue contrast, irregular object appearance, and unpredictable location variation, segmenting the objects from different medical imaging modalities (e.g., CT, MR) is considered an important yet challenging task. In this paper, we present a novel method for interactive medical image segmentation with the following merits. (1) Our design is fundamentally different from previous pure patch-based and image-based segmentation methods. We observe that during delineation, the physician repeatedly checks the inside-outside intensity changes to determine the boundary, which indicates that comparison in an inside-outside manner is extremely important. Thus, we innovatively model our segmentation task as learning the representation of the bi-directional sequential patches, starting from (or ending in) the given central point of the object. This can be realized by our proposed ConvRNN network embedded with a gated memory propagation unit. (2) Unlike previous interactive methods (requiring bounding box or seed points), we only ask the physician to click on the rough central point of the object before segmentation, which could simultaneously enhance the performance and reduce the segmentation time. (3) We utilize our method in a multi-level framework for better performance. We systematically evaluate our method in three different segmentation tasks including CT kidney tumor, MR prostate, and PROMISE12 challenge, showing promising results compared with state-of-the-art methods. The code is available at https://github.com/sunalbert/Sequential-patch-based-segmentation.
Tasks	Medical Image Segmentation, Semantic Segmentation
Published	2018-04-27
URL	http://arxiv.org/abs/1804.10481v2
PDF	http://arxiv.org/pdf/1804.10481v2.pdf
PWC	https://paperswithcode.com/paper/interactive-medical-image-segmentation-via
Repo
Framework

Peekaboo - Where are the Objects? Structure Adjusting Superpixels


Title	Peekaboo - Where are the Objects? Structure Adjusting Superpixels
Authors	Georg Maierhofer, Daniel Heydecker, Angelica I. Aviles-Rivero, Samar M. Alsaleh, Carola-Bibiane Schönlieb
Abstract	This paper addresses the search for a fast and meaningful image segmentation in the context of $k$-means clustering. The proposed method builds on a widely-used local version of Lloyd’s algorithm, called Simple Linear Iterative Clustering (SLIC). We propose an algorithm which extends SLIC to dynamically adjust the local search, adapting superpixel resolution dynamically to the structure present in the image, and thus provides more meaningful superpixels in the same linear runtime as standard SLIC. The proposed method is evaluated against state-of-the-art techniques, and improved boundary adherence and undersegmentation error are observed, whilst it remains among the fastest algorithms tested.
Tasks	Semantic Segmentation
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02796v2
PDF	http://arxiv.org/pdf/1802.02796v2.pdf
PWC	https://paperswithcode.com/paper/peekaboo-where-are-the-objects-structure
Repo
Framework
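
As background for the abstract above, the following Python sketch shows the local assignment step of standard SLIC, which the paper extends. It does not implement the proposed structure-adjusting method. The tuple layout `(l, a, b, x, y)` and the names `grid_interval` and `compactness` are assumptions for illustration.

```python
import math

def slic_distance(pixel, center, grid_interval, compactness):
    """Combined colour and spatial distance used by standard SLIC.

    pixel and center are (l, a, b, x, y) tuples (assumed layout);
    grid_interval is the superpixel spacing S, compactness is the weight m.
    """
    d_color = math.dist(pixel[:3], center[:3])
    d_space = math.dist(pixel[3:], center[3:])
    return math.sqrt(d_color ** 2 + (d_space / grid_interval) ** 2 * compactness ** 2)

def assign_local(pixels, centers, grid_interval, compactness):
    """Assign each pixel to its nearest center, searching only centers whose
    2S x 2S window contains the pixel. Returns -1 where no center is in range."""
    labels = []
    for p in pixels:
        best_label, best_d = -1, math.inf
        for k, c in enumerate(centers):
            # Local search: skip centers farther than S along either axis.
            if abs(p[3] - c[3]) > grid_interval or abs(p[4] - c[4]) > grid_interval:
                continue
            d = slic_distance(p, c, grid_interval, compactness)
            if d < best_d:
                best_label, best_d = k, d
        labels.append(best_label)
    return labels
```

Restricting the search to a window around each center is what makes SLIC a local version of Lloyd's algorithm: the cost per iteration depends on the number of pixels, not on the number of superpixels.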

Alquist: The Alexa Prize Socialbot


Title	Alquist: The Alexa Prize Socialbot
Authors	Jan Pichl, Petr Marek, Jakub Konrád, Martin Matulík, Hoang Long Nguyen, Jan Šedivý
Abstract	This paper describes a new open domain dialogue system Alquist developed as part of the Alexa Prize competition for the Amazon Echo line of products. The Alquist dialogue system is designed to conduct a coherent and engaging conversation on popular topics. We present a hybrid system combining several machine learning and rule based approaches. We discuss and describe the Alquist pipeline, data acquisition and processing, dialogue manager, NLG, knowledge aggregation, and hierarchy of sub-dialogs. We present some of the experimental results.
Tasks
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06705v1
PDF	http://arxiv.org/pdf/1804.06705v1.pdf
PWC	https://paperswithcode.com/paper/alquist-the-alexa-prize-socialbot
Repo
Framework

Spatio-Temporal Neural Networks for Space-Time Series Forecasting and Relations Discovery


Title	Spatio-Temporal Neural Networks for Space-Time Series Forecasting and Relations Discovery
Authors	Ali Ziat, Edouard Delasalles, Ludovic Denoyer, Patrick Gallinari
Abstract	We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypotheses about the spatial relations between the series. The model is evaluated and compared to state-of-the-art baselines on a variety of forecasting problems representative of different application areas: epidemiology, geo-spatial statistics and car-traffic prediction. Besides these evaluations, we also describe experiments showing the ability of this approach to extract relevant spatial relations.
Tasks	Epidemiology, Time Series, Time Series Forecasting, Traffic Prediction
Published	2018-04-23
URL	http://arxiv.org/abs/1804.08562v1
PDF	http://arxiv.org/pdf/1804.08562v1.pdf
PWC	https://paperswithcode.com/paper/spatio-temporal-neural-networks-for-space
Repo
Framework

Policy Gradients for Contextual Recommendations


Title	Policy Gradients for Contextual Recommendations
Authors	Feiyang Pan, Qingpeng Cai, Pingzhong Tang, Fuzhen Zhuang, Qing He
Abstract	Decision making is a challenging task in online recommender systems. The decision maker often needs to choose a contextual item at each step from a set of candidates. Contextual bandit algorithms have been successfully deployed to such applications, for the trade-off between exploration and exploitation and the state-of-the-art performance on minimizing online costs. However, the applicability of existing contextual bandit methods is limited by the over-simplified assumptions of the problem, such as assuming a simple form of the reward function or assuming a static environment where the states are not affected by previous actions. In this work, we put forward Policy Gradients for Contextual Recommendations (PGCR) to solve the problem without those unrealistic assumptions. It optimizes over a restricted class of policies where the marginal probability of choosing an item (in expectation of other items) has a simple closed form, and the gradient of the expected return over the policy in this class is in a succinct form. Moreover, PGCR leverages two useful heuristic techniques called Time-Dependent Greed and Actor-Dropout. The former ensures that PGCR is empirically greedy in the limit, and the latter addresses the trade-off between exploration and exploitation by using the policy network with Dropout as a Bayesian approximation. PGCR can solve the standard contextual bandits as well as its Markov Decision Process generalization. Therefore it can be applied to a wide range of realistic settings of recommendations, such as personalized advertising. We evaluate PGCR on toy datasets as well as a real-world dataset of personalized music recommendations. Experiments show that PGCR enables fast convergence and low regret, and outperforms both classic contextual-bandits and vanilla policy gradient methods.
Tasks	Decision Making, Multi-Armed Bandits, Policy Gradient Methods, Recommendation Systems
Published	2018-02-12
URL	http://arxiv.org/abs/1802.04162v3
PDF	http://arxiv.org/pdf/1802.04162v3.pdf
PWC	https://paperswithcode.com/paper/policy-gradients-for-contextual
Repo
Framework

Quantifying the visual concreteness of words and topics in multimodal datasets


Title	Quantifying the visual concreteness of words and topics in multimodal datasets
Authors	Jack Hessel, David Mimno, Lillian Lee
Abstract	Multimodal machine learning algorithms aim to learn visual-textual correspondences. Previous work suggests that concepts with concrete visual manifestations may be easier to learn than concepts with abstract ones. We give an algorithm for automatically computing the visual concreteness of words and topics within multimodal datasets. We apply the approach in four settings, ranging from image captions to images/text scraped from historical books. In addition to enabling explorations of concepts in multimodal datasets, our concreteness scores predict the capacity of machine learning algorithms to learn textual/visual relationships. We find that 1) concrete concepts are indeed easier to learn; 2) the large number of algorithms we consider have similar failure cases; 3) the precise positive relationship between concreteness and performance varies between datasets. We conclude with recommendations for using concreteness scores to facilitate future multimodal research.
Tasks	Image Captioning
Published	2018-04-18
URL	http://arxiv.org/abs/1804.06786v2
PDF	http://arxiv.org/pdf/1804.06786v2.pdf
PWC	https://paperswithcode.com/paper/quantifying-the-visual-concreteness-of-words
Repo
Framework

Low-Resource Text Classification using Domain-Adversarial Learning


Title	Low-Resource Text Classification using Domain-Adversarial Learning
Authors	Daniel Grießhaber, Ngoc Thang Vu, Johannes Maucher
Abstract	Deep learning techniques have recently been shown to be successful in many natural language processing tasks, forming state-of-the-art systems. They require, however, a large amount of annotated data, which is often missing. This paper explores the use of domain-adversarial learning as a regularizer to avoid overfitting when training domain invariant features for deep, complex neural networks in low-resource and zero-resource settings in new target domains or languages. In the case of new languages, we show that monolingual word-vectors can be directly used for training without pre-alignment. Their projection into a common space can be learnt ad-hoc at training time, reaching the final performance of pretrained multilingual word-vectors.
Tasks	Text Classification
Published	2018-07-13
URL	http://arxiv.org/abs/1807.05195v1
PDF	http://arxiv.org/pdf/1807.05195v1.pdf
PWC	https://paperswithcode.com/paper/low-resource-text-classification-using-domain
Repo
Framework

Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator


Title	Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
Authors	Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi
Abstract	Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model; 2) they are an “end-to-end” approach, directly optimizing the performance metric of interest; 3) they inherently allow for richly parameterized policies. A notable drawback is that even in the most basic continuous control problem (that of linear quadratic regulators), these methods must solve a non-convex optimization problem, where little is understood about their efficiency from both computational and statistical perspectives. In contrast, system identification and model based planning in optimal control theory have a much more solid theoretical footing, where much is known with regards to their computational and statistical properties. This work bridges this gap, showing that (model free) policy gradient methods globally converge to the optimal solution and are efficient (polynomially so in relevant problem dependent quantities) with regards to their sample and computational complexities.
Tasks	Continuous Control, Policy Gradient Methods
Published	2018-01-15
URL	http://arxiv.org/abs/1801.05039v3
PDF	http://arxiv.org/pdf/1801.05039v3.pdf
PWC	https://paperswithcode.com/paper/global-convergence-of-policy-gradient-methods
Repo
Framework
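
The convergence claim in the abstract can be illustrated on the simplest case. The Python sketch below is an assumption-laden toy, not the paper's algorithm: it uses a scalar system x_{t+1} = a x_t + b u_t with linear policy u_t = -k x_t, an exact (model-based) gradient rather than the model-free estimate the paper analyses, and hypothetical function names and step size. It runs gradient descent on the cost as a function of the gain k and compares the result with the gain from the Riccati recursion.

```python
def lqr_cost(k, a, b, q, r, x0=1.0):
    """Infinite-horizon cost sum_t (q x_t^2 + r u_t^2) under u_t = -k x_t."""
    c = a - b * k  # closed-loop dynamics: x_{t+1} = c x_t
    if abs(c) >= 1.0:
        return float("inf")  # unstable gain, cost diverges
    return (q + r * k * k) * x0 * x0 / (1.0 - c * c)

def lqr_cost_gradient(k, a, b, q, r, x0=1.0):
    """Exact derivative of lqr_cost with respect to k (valid for stable gains)."""
    c = a - b * k
    denom = 1.0 - c * c
    return x0 * x0 * (2.0 * r * k * denom - (q + r * k * k) * 2.0 * c * b) / denom ** 2

def policy_gradient_descent(k0, a, b, q, r, step=0.01, iters=5000):
    """Plain gradient descent on the gain, started from a stabilising k0."""
    k = k0
    for _ in range(iters):
        k -= step * lqr_cost_gradient(k, a, b, q, r)
    return k

def riccati_gain(a, b, q, r, iters=500):
    """Optimal gain from fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)
```

For example, with a = 0.9 and b = q = r = 1, `policy_gradient_descent(0.0, 0.9, 1.0, 1.0, 1.0)` and `riccati_gain(0.9, 1.0, 1.0, 1.0)` agree closely, even though the cost is not convex in k in general.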

Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction


Title	Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction
Authors	Huaxiu Yao, Fei Wu, Jintao Ke, Xianfeng Tang, Yitian Jia, Siyu Lu, Pinghua Gong, Jieping Ye, Zhenhui Li
Abstract	Taxi demand prediction is an important building block for enabling intelligent transportation systems in a smart city. An accurate prediction model can help the city pre-allocate resources to meet travel demand and reduce the number of empty taxis on streets, which waste energy and worsen traffic congestion. With the increasing popularity of taxi requesting services such as Uber and Didi Chuxing (in China), we are able to collect large-scale taxi demand data continuously. How to utilize such big data to improve the demand prediction is an interesting and critical real-world problem. Traditional demand prediction methods mostly rely on time series forecasting techniques, which fail to model the complex non-linear spatial and temporal relations. Recent advances in deep learning have shown superior performance on traditionally challenging tasks such as image classification by learning the complex features and correlations from large-scale data. This breakthrough has inspired researchers to explore deep learning techniques on traffic prediction problems. However, existing methods on traffic prediction have only considered spatial relation (e.g., using CNN) or temporal relation (e.g., using LSTM) independently. We propose a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework to model both spatial and temporal relations. Specifically, our proposed model consists of three views: temporal view (modeling correlations between future demand values with near time points via LSTM), spatial view (modeling local spatial correlation via local CNN), and semantic view (modeling correlations among regions sharing similar temporal patterns). Experiments on large-scale real taxi demand data demonstrate the effectiveness of our approach over state-of-the-art methods.
Tasks	Image Classification, Time Series, Time Series Forecasting, Traffic Prediction
Published	2018-02-23
URL	http://arxiv.org/abs/1802.08714v2
PDF	http://arxiv.org/pdf/1802.08714v2.pdf
PWC	https://paperswithcode.com/paper/deep-multi-view-spatial-temporal-network-for
Repo
Framework

Revisiting the Importance of Individual Units in CNNs via Ablation


Title	Revisiting the Importance of Individual Units in CNNs via Ablation
Authors	Bolei Zhou, Yiyou Sun, David Bau, Antonio Torralba
Abstract	We revisit the importance of the individual units in Convolutional Neural Networks (CNNs) for visual recognition. By conducting unit ablation experiments on CNNs trained on large scale image datasets, we demonstrate that, though ablating any individual unit does not hurt overall classification accuracy, it does lead to significant damage on the accuracy of specific classes. This result shows that an individual unit is specialized to encode information relevant to a subset of classes. We compute the correlation between the accuracy drop under unit ablation and various attributes of an individual unit such as class selectivity and weight L1 norm. We confirm that unit attributes such as class selectivity are a poor predictor for impact on overall accuracy as found previously in recent work (Morcos et al., 2018). However, our results show that class selectivity along with other attributes are good predictors of the importance of one unit to individual classes. We evaluate the impact of random rotation, batch normalization, and dropout to the importance of units to specific classes. Our results show that units with high selectivity play an important role in network classification power at the individual class level. Understanding and interpreting the behavior of these units is necessary and meaningful.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02891v1
PDF	http://arxiv.org/pdf/1806.02891v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-the-importance-of-individual-units
Repo
Framework

MaskRNN: Instance Level Video Object Segmentation


Title	MaskRNN: Instance Level Video Object Segmentation
Authors	Yuan-Ting Hu, Jia-Bin Huang, Alexander G. Schwing
Abstract	Instance level video object segmentation is an important technique for video editing and compression. To capture the temporal coherence, in this paper, we develop MaskRNN, a recurrent neural net approach which fuses in each frame the output of two deep nets for each object instance: a binary segmentation net providing a mask and a localization net providing a bounding box. Due to the recurrent component and the localization component, our method is able to take advantage of long-term temporal structures of the video data and to reject outliers. We validate the proposed algorithm on three challenging benchmark datasets, the DAVIS-2016 dataset, the DAVIS-2017 dataset, and the Segtrack v2 dataset, achieving state-of-the-art performance on all of them.
Tasks	Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2018-03-29
URL	http://arxiv.org/abs/1803.11187v1
PDF	http://arxiv.org/pdf/1803.11187v1.pdf
PWC	https://paperswithcode.com/paper/maskrnn-instance-level-video-object
Repo
Framework