October 19, 2019

3062 words 15 mins read

Paper Group ANR 140

Linear model predictive safety certification for learning-based control

Title Linear model predictive safety certification for learning-based control
Authors Kim P. Wabersich, Melanie N. Zeilinger
Abstract While it has been repeatedly shown that learning-based controllers can provide superior performance, they often lack safety guarantees. This paper aims at addressing this problem by introducing a model predictive safety certification (MPSC) scheme for polytopic linear systems with additive disturbances. The scheme verifies safety of a proposed learning-based input and modifies it as little as necessary in order to keep the system within a given set of constraints. Safety is thereby related to the existence of a model predictive controller (MPC) providing a feasible trajectory towards a safe target set. A robust MPC formulation accounts for the fact that the model is generally uncertain in the context of learning, which allows proving constraint satisfaction at all times under the proposed MPSC strategy. The MPSC scheme can be used to expand any potentially conservative set of safe states for learning, and we provide an iterative technique for enlarging the safe set. Finally, a practical data-based design procedure for MPSC is proposed using scenario optimization.
Tasks
Published 2018-03-22
URL http://arxiv.org/abs/1803.08552v6
PDF http://arxiv.org/pdf/1803.08552v6.pdf
PWC https://paperswithcode.com/paper/linear-model-predictive-safety-certification
Repo
Framework
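
The certification step described in the abstract can be viewed as a small online optimization: keep the learning-based input whenever a feasible backup trajectory to a safe terminal set exists starting from it, and otherwise modify it as little as possible. Below is a minimal, nominal (non-robust) sketch of such a filter using cvxpy; the model (A, B), the polytopic constraints, the horizon and the terminal set are placeholders, and the paper's robust tube formulation and disturbance handling are omitted.

```python
import cvxpy as cp
import numpy as np

def mpsc_filter(x0, u_learn, A, B, Hx, hx, Hu, hu, Hf, hf, N=10):
    """Return a certified input: u_learn if a feasible backup plan starting
    with it exists, otherwise the closest input that admits one."""
    nx, nu = A.shape[0], B.shape[1]
    x = cp.Variable((N + 1, nx))
    u = cp.Variable((N, nu))

    cons = [x[0] == x0]
    for k in range(N):
        cons += [x[k + 1] == A @ x[k] + B @ u[k],   # nominal dynamics
                 Hx @ x[k] <= hx,                    # state constraints
                 Hu @ u[k] <= hu]                    # input constraints
    cons += [Hf @ x[N] <= hf]                        # safe terminal set

    # Modify the learning-based input as little as necessary.
    prob = cp.Problem(cp.Minimize(cp.sum_squares(u[0] - u_learn)), cons)
    prob.solve()
    if u.value is None:
        # No certificate found: fall back to a previously computed safe plan.
        raise RuntimeError("MPSC problem infeasible; apply backup controller")
    return u.value[0]
```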

On the relationship between Dropout and Equiangular Tight Frames

Title On the relationship between Dropout and Equiangular Tight Frames
Authors Dor Bank, Raja Giryes
Abstract Dropout is a popular regularization technique in neural networks. Yet, the reason for its success is still not fully understood. This paper provides a new interpretation of Dropout from a frame theory perspective. By drawing a connection to recent developments in analog channel coding, we suggest that for a certain family of autoencoders with a linear encoder, the minimizer of an optimization with dropout regularization on the encoder is an equiangular tight frame (ETF). Since this optimization is non-convex, we add another regularization that promotes such structures by minimizing the cross-correlation between filters in the network. We demonstrate its applicability in convolutional and fully connected layers in both feed-forward and recurrent networks. All these results suggest that there is indeed a relationship between dropout and the ETF structure of the regularized linear operations.
Tasks
Published 2018-10-14
URL http://arxiv.org/abs/1810.06049v3
PDF http://arxiv.org/pdf/1810.06049v3.pdf
PWC https://paperswithcode.com/paper/on-the-relationship-between-dropout-and
Repo
Framework
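
The added regularizer promotes low cross-correlation between encoder filters, which is the defining property of an equiangular tight frame. A possible sketch of such a penalty for a linear (fully connected) encoder in PyTorch; the weighting coefficient and layer shapes are illustrative, not the paper's exact settings.

```python
import torch

def cross_correlation_penalty(W, eps=1e-8):
    """Penalize off-diagonal entries of the Gram matrix of row-normalized
    filters, pushing them toward an equiangular, low-coherence configuration."""
    Wn = W / (W.norm(dim=1, keepdim=True) + eps)   # unit-norm filters
    gram = Wn @ Wn.t()
    off_diag = gram - torch.diag(torch.diag(gram))
    return (off_diag ** 2).sum()

# Usage inside a training step (illustrative weighting):
# loss = reconstruction_loss + 1e-3 * cross_correlation_penalty(encoder.weight)
```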

Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata

Title Deep Neural Networks In Fully Connected CRF For Image Labeling With Social Network Metadata
Authors Chengjiang Long, Roddy Collins, Eran Swears, Anthony Hoogs
Abstract We propose a novel method for predicting image labels by fusing image content descriptors with the social media context of each image. An image uploaded to a social media site such as Flickr often has meaningful, associated information, such as comments and other images the user has uploaded, that is complementary to pixel content and helpful in predicting labels. Prediction challenges such as ImageNet~\cite{imagenet_cvpr09} and MSCOCO~\cite{LinMBHPRDZ:ECCV14} use only pixels, while other methods make predictions purely from social media context \cite{McAuleyECCV12}. Our method is based on a novel fully connected Conditional Random Field (CRF) framework, where each node is an image, and consists of two deep Convolutional Neural Networks (CNN) and one Recurrent Neural Network (RNN) that model both textual and visual node/image information. The edge weights of the CRF graph represent textual similarity and link-based metadata such as user sets and image groups. We model the CRF as an RNN for both learning and inference, and incorporate the weighted ranking loss and cross entropy loss into the CRF parameter optimization to handle the training data imbalance issue. Our proposed approach is evaluated on the MIR-9K dataset and experimentally outperforms current state-of-the-art approaches.
Tasks
Published 2018-01-27
URL http://arxiv.org/abs/1801.09108v1
PDF http://arxiv.org/pdf/1801.09108v1.pdf
PWC https://paperswithcode.com/paper/deep-neural-networks-in-fully-connected-crf
Repo
Framework
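
In the fully connected CRF above, each node is an image and the edge weights encode textual similarity and shared link-based metadata such as user sets and image groups. A hedged sketch of one plausible edge weight, combining Jaccard overlap of tags and of shared groups; the metadata schema and the weighting are assumptions for illustration, not the paper's exact formulation.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets of metadata items."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def edge_weight(img_i, img_j, alpha=0.5):
    """Combine tag similarity and shared-group similarity into one CRF edge weight.
    img_* are dicts like {'tags': [...], 'groups': [...]} (illustrative schema)."""
    return (alpha * jaccard(img_i['tags'], img_j['tags'])
            + (1 - alpha) * jaccard(img_i['groups'], img_j['groups']))
```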

Semi-supervised Fisher vector network

Title Semi-supervised Fisher vector network
Authors Petar Palasek, Ioannis Patras
Abstract In this work we explore how the architecture proposed in [8], which expresses the processing steps of the classical Fisher vector pipeline, i.e. dimensionality reduction by principal component analysis (PCA) projection, Gaussian mixture model (GMM) fitting and Fisher vector descriptor extraction, as network layers, can be modified into a hybrid network that combines the benefits of both unsupervised and supervised training methods, resulting in a model that learns a semi-supervised Fisher vector descriptor of the input data. We evaluate the proposed model on image classification and action recognition problems and show how the model’s classification performance improves as the amount of unlabeled data increases during training.
Tasks Dimensionality Reduction, Image Classification, Temporal Action Localization
Published 2018-01-13
URL http://arxiv.org/abs/1801.04438v1
PDF http://arxiv.org/pdf/1801.04438v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-fisher-vector-network
Repo
Framework
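
The network unrolls the classical pipeline (PCA projection, GMM, Fisher vector encoding) into layers. As a reference point, here is a simplified, non-network Fisher vector encoding over PCA-reduced descriptors that keeps only the gradients with respect to the GMM means; the sizes and the omission of the weight and covariance gradients are simplifications, not the paper's layer definitions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def fisher_vector_means(descriptors, pca, gmm):
    """Simplified FV: soft-assignment-weighted, variance-normalized residuals
    to each GMM mean, concatenated into one vector."""
    x = pca.transform(descriptors)                     # (n, d)
    q = gmm.predict_proba(x)                           # (n, K) responsibilities
    diffs = x[:, None, :] - gmm.means_[None, :, :]     # (n, K, d)
    diffs /= np.sqrt(gmm.covariances_)[None, :, :]     # diagonal covariances
    fv = (q[:, :, None] * diffs).sum(axis=0)           # (K, d)
    fv /= descriptors.shape[0] * np.sqrt(gmm.weights_)[:, None]
    return fv.ravel()

# Fitting the unsupervised part (illustrative sizes):
# pca = PCA(n_components=64).fit(train_descriptors)
# gmm = GaussianMixture(n_components=16, covariance_type='diag').fit(
#     pca.transform(train_descriptors))
```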

Sample Efficient Stochastic Variance-Reduced Cubic Regularization Method

Title Sample Efficient Stochastic Variance-Reduced Cubic Regularization Method
Authors Dongruo Zhou, Pan Xu, Quanquan Gu
Abstract We propose a sample efficient stochastic variance-reduced cubic regularization (Lite-SVRC) algorithm for finding the local minimum efficiently in nonconvex optimization. The proposed algorithm achieves a lower sample complexity of Hessian matrix computation than existing cubic regularization based methods. At the heart of our analysis is the choice of a constant batch size of Hessian matrix computation at each iteration and the stochastic variance reduction techniques. In detail, for a nonconvex function with $n$ component functions, Lite-SVRC converges to the local minimum within $\tilde{O}(n+n^{2/3}/\epsilon^{3/2})$ Hessian sample complexity, which is faster than all existing cubic regularization based methods. Numerical experiments with different nonconvex optimization problems conducted on real datasets validate our theoretical results.
Tasks
Published 2018-11-29
URL http://arxiv.org/abs/1811.11989v1
PDF http://arxiv.org/pdf/1811.11989v1.pdf
PWC https://paperswithcode.com/paper/sample-efficient-stochastic-variance-reduced
Repo
Framework
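
The sample efficiency comes from a variance-reduced Hessian estimator built from a constant-size minibatch at each inner iteration plus a reference Hessian at a snapshot point. A schematic numpy sketch of that estimator, assuming access to per-component Hessians; the function handles and batch size are illustrative, and the cubic-regularized subproblem solver is omitted.

```python
import numpy as np

def vr_hessian(hess_i, n, x, x_snap, H_snap, batch_size, rng=np.random):
    """Variance-reduced Hessian estimate at x:
    H(x) ~= H_snap + mean_i [ hess_i(i, x) - hess_i(i, x_snap) ]
    over a small random batch, where H_snap is the Hessian at the snapshot."""
    idx = rng.choice(n, size=batch_size, replace=False)
    correction = np.mean(
        [hess_i(i, x) - hess_i(i, x_snap) for i in idx], axis=0)
    return H_snap + correction

# hess_i(i, x) returns the (d, d) Hessian of the i-th component function at x;
# H_snap is the average of hess_i(i, x_snap) over all n components.
```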

A Simple and Effective Approach to the Story Cloze Test

Title A Simple and Effective Approach to the Story Cloze Test
Authors Siddarth Srinivasan, Richa Arora, Mark Riedl
Abstract In the Story Cloze Test, a system is presented with a 4-sentence prompt to a story, and must determine which one of two potential endings is the ‘right’ ending to the story. Previous work has shown that ignoring the training set and training a model on the validation set can achieve high accuracy on this task due to stylistic differences between the story endings in the training set and validation and test sets. Following this approach, we present a simpler fully-neural approach to the Story Cloze Test using skip-thought embeddings of the stories in a feed-forward network that achieves close to state-of-the-art performance on this task without any feature engineering. We also find that considering just the last sentence of the prompt instead of the whole prompt yields higher accuracy with our approach.
Tasks Feature Engineering
Published 2018-03-15
URL http://arxiv.org/abs/1803.05547v1
PDF http://arxiv.org/pdf/1803.05547v1.pdf
PWC https://paperswithcode.com/paper/a-simple-and-effective-approach-to-the-story
Repo
Framework
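
The model itself is a small feed-forward network over fixed skip-thought embeddings of the prompt (or just its last sentence) and a candidate ending. A minimal PyTorch sketch assuming 4800-dimensional skip-thought vectors and a single hidden layer; the exact layer sizes and the pairwise scoring are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ClozeScorer(nn.Module):
    """Scores one (context, ending) pair from concatenated skip-thought vectors."""
    def __init__(self, emb_dim=4800, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, context_emb, ending_emb):
        return self.net(torch.cat([context_emb, ending_emb], dim=-1))

# At test time, pick the ending with the higher score:
# pred = int(model(ctx, end2) > model(ctx, end1))  # 0 -> first ending, 1 -> second
```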

Efficient and Accurate Abnormality Mining from Radiology Reports with Customized False Positive Reduction

Title Efficient and Accurate Abnormality Mining from Radiology Reports with Customized False Positive Reduction
Authors Nithya Attaluri, Ahmed Nasir, Carolynne Powe, Harold Racz, Ben Covington, Li Yao, Jordan Prosky, Eric Poblenz, Tobi Olatunji, Kevin Lyman
Abstract Obtaining datasets labeled to facilitate model development is a challenge for most machine learning tasks. The difficulty is heightened for medical imaging, where data itself is limited in accessibility and labeling requires costly time and effort by trained medical specialists. Medical imaging studies, however, are often accompanied by a medical report produced by a radiologist, identifying important features on the corresponding scan for other physicians not specifically trained in radiology. We propose a methodology for approximating image-level labels for radiology studies from associated reports using a general purpose language processing tool for medical concept extraction and sentiment analysis, and simple manually crafted heuristics for false positive reduction. Using this approach, we label more than 175,000 Head CT studies for the presence of 33 features indicative of 11 clinically relevant conditions. For 27 of the 30 keywords that yielded positive results (3 had no occurrences), the lower bound of the confidence intervals created to estimate the percentage of accurately labeled reports was above 85%, with the average being above 95%. Though noisier than manual labeling, these results suggest that this method is a viable means of labeling medical images at scale.
Tasks Sentiment Analysis
Published 2018-10-01
URL http://arxiv.org/abs/1810.00967v1
PDF http://arxiv.org/pdf/1810.00967v1.pdf
PWC https://paperswithcode.com/paper/efficient-and-accurate-abnormality-mining
Repo
Framework
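
The quality numbers quoted above are lower bounds of confidence intervals on the fraction of correctly labeled reports, estimated from manually reviewed samples. A sketch of how such a lower bound can be computed with a Wilson score interval; the choice of interval is an assumption here, since the abstract does not say which one was used.

```python
from math import sqrt

def wilson_lower_bound(correct, total, z=1.96):
    """Lower bound of the Wilson score interval for a binomial proportion
    (z = 1.96 corresponds to a 95% confidence level)."""
    if total == 0:
        return 0.0
    p = correct / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom

# e.g. 96 of 100 sampled reports labeled correctly:
# wilson_lower_bound(96, 100)  ->  about 0.90
```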

Long Activity Video Understanding using Functional Object-Oriented Network

Title Long Activity Video Understanding using Functional Object-Oriented Network
Authors Ahmad Babaeian Jelodar, David Paulius, Yu Sun
Abstract Video understanding is one of the most challenging topics in computer vision. In this paper, a four-stage video understanding pipeline is presented to simultaneously recognize all atomic actions and the single on-going activity in a video. This pipeline uses objects and motions from the video and a graph-based knowledge representation network as prior reference. Two deep networks are trained to identify objects and motions in each video sequence associated with an action. Low-level image features are then used to identify objects of interest in that video sequence. Confidence scores are assigned to objects of interest based on their involvement in the action and to motion classes based on results from a deep neural network that classifies the on-going action in video into motion classes. Confidence scores are computed for each candidate functional unit associated with an action using a knowledge representation network, object confidences, and motion confidences. Each action is therefore associated with a functional unit and the sequence of actions is further evaluated to identify the single on-going activity in the video. The knowledge representation used in the pipeline is called the functional object-oriented network, which is a graph-based network useful for encoding knowledge about manipulation tasks. Experiments are performed on a dataset of cooking videos to test the proposed algorithm with action inference and activity classification. Experiments show that using the functional object-oriented network improves video understanding significantly.
Tasks Video Understanding
Published 2018-07-03
URL http://arxiv.org/abs/1807.00983v1
PDF http://arxiv.org/pdf/1807.00983v1.pdf
PWC https://paperswithcode.com/paper/long-activity-video-understanding-using
Repo
Framework
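
The pipeline assigns confidence scores to detected objects and motion classes and then scores each candidate functional unit in the knowledge graph by combining the two. A hedged sketch of one way such a combination could look; the scoring rule and field names are illustrative assumptions, not the exact formulation used in the paper.

```python
def functional_unit_score(unit, object_conf, motion_conf):
    """Score a candidate functional unit by averaging the confidences of its
    required objects and weighting by the confidence of its motion class.
    `unit` is assumed to look like {'objects': [...], 'motion': 'pour'}."""
    obj_scores = [object_conf.get(o, 0.0) for o in unit['objects']]
    obj_term = sum(obj_scores) / len(obj_scores) if obj_scores else 0.0
    return obj_term * motion_conf.get(unit['motion'], 0.0)

def best_functional_unit(candidates, object_conf, motion_conf):
    return max(candidates,
               key=lambda u: functional_unit_score(u, object_conf, motion_conf))
```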

Uncertainty in Quantum Rule-Based Systems

Title Uncertainty in Quantum Rule-Based Systems
Authors Vicente Moret-Bonillo, Isaac Fernández-Varela, Diego Alvarez-Estevez
Abstract This article deals with the problem of the uncertainty in rule-based systems (RBS), but from the perspective of quantum computing (QC). In this work we first review the characteristics of Quantum Rule-Based Systems (QRBS), a concept defined in a previous article by one of the authors of this paper, and we introduce the problem of quantum uncertainty. We assume that the subjective uncertainty that affects the facts of classical RBSs can be treated as a direct consequence of the probabilistic nature of quantum mechanics (QM), and we also assume that the uncertainty associated with a given hypothesis is a consequence of the propagation of the imprecision through the inferential circuits of RBSs. This article does not intend to contribute anything new to the QM field: it is a work of artificial intelligence (AI) that uses QC techniques to solve the problem of uncertainty in RBSs. Bearing the above arguments in mind, a quantum model is proposed. This model has been applied to a problem already defined by one of the authors of this work in a previous publication and which is briefly described in this article. Then the model is generalized, and it is thoroughly evaluated. The results obtained show that QC is a valid, effective and efficient method to deal with the inherent uncertainty of RBSs.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.02782v1
PDF http://arxiv.org/pdf/1811.02782v1.pdf
PWC https://paperswithcode.com/paper/uncertainty-in-quantum-rule-based-systems
Repo
Framework
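
To make the link between subjective certainty and quantum probability concrete, here is a hypothetical illustration (not the authors' actual circuits): a fact's certainty is encoded as the amplitude of a single qubit, and propagating it through a conjunctive rule amounts to taking the joint measurement probability.

```python
import numpy as np

def encode_certainty(c):
    """Encode certainty c in [0, 1] as a qubit state so that measuring |1>
    recovers c as a probability (a Ry rotation of angle 2*arcsin(sqrt(c)))."""
    theta = 2 * np.arcsin(np.sqrt(c))
    return np.array([np.cos(theta / 2), np.sin(theta / 2)])

# Propagating uncertainty through a conjunctive rule "IF a AND b THEN h":
# with independent evidence, the squared |11> amplitude of the product state
# gives the certainty of the hypothesis.
a, b = encode_certainty(0.9), encode_certainty(0.7)
joint = np.kron(a, b)
h_certainty = joint[3] ** 2   # ~0.63 = 0.9 * 0.7
print(h_certainty)
```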

Crowd-Machine Collaboration for Item Screening

Title Crowd-Machine Collaboration for Item Screening
Authors Evgeny Krivosheev, Bahareh Harandizadeh, Fabio Casati, Boualem Benatallah
Abstract In this paper we describe how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently, and estimate the gain over human-only or machine-only screening in terms of performance and cost.
Tasks
Published 2018-03-21
URL http://arxiv.org/abs/1803.07947v1
PDF http://arxiv.org/pdf/1803.07947v1.pdf
PWC https://paperswithcode.com/paper/crowd-machine-collaboration-for-item
Repo
Framework
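
A hedged sketch of the kind of hybrid screening policy the paper studies: let the machine classifier decide items on which it is confident and route the rest to crowd workers. The thresholds and the crowd-aggregation rule (simple majority vote) are illustrative assumptions, not the algorithms proposed in the paper.

```python
def hybrid_screen(items, machine_prob, ask_crowd, low=0.1, high=0.9, n_votes=3):
    """Return {item: keep?} decisions, routing uncertain items to the crowd."""
    decisions = {}
    for item in items:
        p = machine_prob(item)                 # P(item satisfies the predicate)
        if p >= high:
            decisions[item] = True             # machine is confident: keep
        elif p <= low:
            decisions[item] = False            # machine is confident: exclude
        else:
            votes = [ask_crowd(item) for _ in range(n_votes)]
            decisions[item] = sum(votes) > n_votes / 2   # crowd majority vote
    return decisions
```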

GoT-WAVE: Temporal network alignment using graphlet-orbit transitions

Title GoT-WAVE: Temporal network alignment using graphlet-orbit transitions
Authors David Aparício, Pedro Ribeiro, Tijana Milenković, Fernando Silva
Abstract Global pairwise network alignment (GPNA) aims to find a one-to-one node mapping between two networks that identifies conserved network regions. GPNA algorithms optimize node conservation (NC) and edge conservation (EC). NC quantifies topological similarity between nodes. Graphlet-based degree vectors (GDVs) are a state-of-the-art topological NC measure. Dynamic GDVs (DGDVs) were used as a dynamic NC measure within the first-ever algorithms for GPNA of temporal networks: DynaMAGNA++ and DynaWAVE. The latter is superior for larger networks. We recently developed a different graphlet-based measure of temporal node similarity, graphlet-orbit transitions (GoTs). Here, we use GoTs instead of DGDVs as a new dynamic NC measure within DynaWAVE, resulting in a new approach, GoT-WAVE. On synthetic networks, GoT-WAVE improves DynaWAVE’s accuracy by 25% and speed by 64%. On real networks, when optimizing only dynamic NC, each method is superior ~50% of the time. While DynaWAVE benefits more from also optimizing dynamic EC, only GoT-WAVE can support directed edges. Hence, GoT-WAVE is a promising new temporal GPNA algorithm, which efficiently optimizes dynamic NC. Future work on better incorporating dynamic EC may yield further improvements.
Tasks
Published 2018-08-24
URL http://arxiv.org/abs/1808.08195v1
PDF http://arxiv.org/pdf/1808.08195v1.pdf
PWC https://paperswithcode.com/paper/got-wave-temporal-network-alignment-using
Repo
Framework
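
GoT-WAVE swaps the DGDV-based dynamic node conservation measure inside DynaWAVE for graphlet-orbit-transition (GoT) features. A sketch of turning per-node GoT feature vectors into a pairwise node-conservation (similarity) matrix; the use of cosine similarity here is an assumption for illustration, not necessarily the exact measure used by the method.

```python
import numpy as np

def node_conservation(got_a, got_b, eps=1e-12):
    """Cosine-similarity node-conservation matrix between two temporal networks.
    got_a: (n1, f) GoT feature matrix, got_b: (n2, f) GoT feature matrix."""
    a = got_a / (np.linalg.norm(got_a, axis=1, keepdims=True) + eps)
    b = got_b / (np.linalg.norm(got_b, axis=1, keepdims=True) + eps)
    return a @ b.T        # entry (i, j): similarity of node i in A to node j in B
```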

Déjà Vu: an empirical evaluation of the memorization properties of ConvNets

Title Déjà Vu: an empirical evaluation of the memorization properties of ConvNets
Authors Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Abstract Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting. This paper considers the related question of “membership inference”, where the goal is to determine if an image was used during training. We consider it under three complementary angles. We show how to detect which dataset was used to train a model, and in particular whether some validation images were used at train time. We then analyze explicit memorization and extend classical random label experiments to the problem of learning a model that predicts if an image belongs to an arbitrary set. Finally, we propose a new approach to infer membership when a few of the top layers are not available or have been fine-tuned, and show that lower layers still carry information about the training samples. To support our findings, we conduct large-scale experiments on Imagenet and subsets of YFCC-100M with modern architectures such as VGG and Resnet.
Tasks Data Augmentation
Published 2018-09-17
URL http://arxiv.org/abs/1809.06396v1
PDF http://arxiv.org/pdf/1809.06396v1.pdf
PWC https://paperswithcode.com/paper/deja-vu-an-empirical-evaluation-of-the
Repo
Framework
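
As background for the membership-inference setting studied above, here is a common loss-thresholding baseline; this is a generic baseline, not the attack proposed in the paper. An image whose loss under the model is unusually low is guessed to have been part of the training set.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def loss_threshold_membership(model, image, label, threshold):
    """Guess 'member' if the per-example cross-entropy loss is below a threshold
    calibrated on known held-out data (a standard baseline, not the paper's method)."""
    model.eval()
    logits = model(image.unsqueeze(0))
    loss = F.cross_entropy(logits, torch.tensor([label]))
    return loss.item() < threshold
```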

Context-Aware Text-Based Binary Image Stylization and Synthesis

Title Context-Aware Text-Based Binary Image Stylization and Synthesis
Authors Shuai Yang, Jiaying Liu, Wenhan Yang, Zongming Guo
Abstract In this work, we present a new framework for the stylization of text-based binary images. First, our method stylizes stroke-based geometric shapes such as text, symbols and icons in the target binary image based on an input style image. Second, the composition of the stylized geometric shape and a background image is explored. To accomplish the task, we propose legibility-preserving structure and texture transfer algorithms, which progressively narrow the visual differences between the binary image and the style image. The stylization is then followed by a context-aware layout design algorithm, where cues for both seamlessness and aesthetics are employed to determine the optimal layout of the shape in the background. Given the layout, the binary image is seamlessly embedded into the background by texture synthesis under a context-aware boundary constraint. According to the contents of binary images, our method can be applied to many fields. We show that the proposed method is capable of addressing the unsupervised text stylization problem and is superior to state-of-the-art style transfer methods in automatic artistic typography creation. Besides, extensive experiments on various tasks, such as visual-textual presentation synthesis, icon/symbol rendering and structure-guided image inpainting, demonstrate the effectiveness of the proposed method.
Tasks Image Inpainting, Image Stylization, Style Transfer, Texture Synthesis
Published 2018-10-09
URL http://arxiv.org/abs/1810.03767v1
PDF http://arxiv.org/pdf/1810.03767v1.pdf
PWC https://paperswithcode.com/paper/context-aware-text-based-binary-image
Repo
Framework

Exploiting Spatial-Temporal Modelling and Multi-Modal Fusion for Human Action Recognition

Title Exploiting Spatial-Temporal Modelling and Multi-Modal Fusion for Human Action Recognition
Authors Dongliang He, Fu Li, Qijie Zhao, Xiang Long, Yi Fu, Shilei Wen
Abstract In this report, our approach to tackling the ActivityNet 2018 Kinetics-600 challenge is described in detail. Though spatial-temporal modelling methods, which adopt either end-to-end frameworks such as I3D \cite{i3d} or two-stage frameworks (i.e., CNN+RNN), have been proposed in existing state-of-the-art work for this task, video modelling is far from being well solved. In this challenge, we propose the spatial-temporal network (StNet) for better joint spatial-temporal modelling and comprehensive video understanding. Besides, given that multi-modal information is contained in the video source, we manage to integrate both early-fusion and late-fusion strategies for multi-modal information via our proposed improved temporal Xception network (iTXN) for video understanding. Our StNet RGB single model achieves 78.99% top-1 precision on the Kinetics-600 validation set, and that of our improved temporal Xception network, which integrates RGB, flow and audio modalities, is up to 82.35%. After model ensemble, we achieve top-1 precision as high as 85.0% on the validation set and rank No.1 among all submissions.
Tasks Temporal Action Localization, Video Understanding
Published 2018-06-27
URL http://arxiv.org/abs/1806.10319v1
PDF http://arxiv.org/pdf/1806.10319v1.pdf
PWC https://paperswithcode.com/paper/exploiting-spatial-temporal-modelling-and
Repo
Framework
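
A hedged sketch of the late-fusion part of such a multi-modal pipeline: averaging softmax scores from the RGB, flow and audio branches (and, analogously, over ensemble members). Uniform weights are an assumption for illustration; the paper's iTXN learns how to integrate the modalities.

```python
import numpy as np

def late_fusion(logits_per_modality, weights=None):
    """Fuse per-modality class logits by a weighted average of softmax scores.
    logits_per_modality: list of (num_classes,) arrays, e.g. [rgb, flow, audio]."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()
    probs = np.stack([softmax(z) for z in logits_per_modality])
    w = np.ones(len(probs)) / len(probs) if weights is None else np.asarray(weights)
    return (w[:, None] * probs).sum(axis=0)      # fused class probabilities
```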

Texture Mixing by Interpolating Deep Statistics via Gaussian Models

Title Texture Mixing by Interpolating Deep Statistics via Gaussian Models
Authors Zi-Ming Wang, Gui-Song Xia, Yi-Peng Zhang
Abstract Recently, many studies have been devoted to texture synthesis using deep neural networks, because these networks excel at handling complex patterns in images. In these models, second-order statistics, such as the Gram matrix, are used to describe textures. Despite the fact that these models have achieved promising results, the structure of their parametric space is still unclear; consequently, it is difficult to use them to mix textures. This paper addresses the texture mixing problem by using a Gaussian scheme to interpolate deep statistics computed from deep neural networks. More precisely, we first reveal that the statistics used in existing deep models can be unified using a stationary Gaussian scheme. We then present a novel algorithm to mix these statistics by interpolating between Gaussian models using optimal transport. We further apply our scheme to Neural Style Transfer, where we can create mixed styles. The experiments demonstrate that our method can achieve state-of-the-art results. Because all the computations are implemented in closed forms, our mixing algorithm adds only negligible time to the original texture synthesis procedure.
Tasks Style Transfer, Texture Synthesis
Published 2018-07-29
URL http://arxiv.org/abs/1807.11035v1
PDF http://arxiv.org/pdf/1807.11035v1.pdf
PWC https://paperswithcode.com/paper/texture-mixing-by-interpolating-deep
Repo
Framework
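
The mixing step interpolates Gaussian models of deep feature statistics along the optimal-transport (Wasserstein-2) geodesic, for which a closed form exists. A sketch of that closed form for two Gaussians N(m0, S0) and N(m1, S1); how the interpolated statistics are plugged back into the texture-synthesis procedure is omitted here.

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_interpolate(m0, S0, m1, S1, t):
    """Point at time t on the Wasserstein-2 geodesic between two Gaussians.
    Returns the interpolated mean and covariance (closed form via the OT map)."""
    S0_half = sqrtm(S0)
    S0_half_inv = np.linalg.inv(S0_half)
    # Optimal transport map between the zero-mean Gaussians: x -> T x
    T = S0_half_inv @ sqrtm(S0_half @ S1 @ S0_half) @ S0_half_inv
    M = (1 - t) * np.eye(len(m0)) + t * T
    m_t = (1 - t) * m0 + t * m1
    S_t = M @ S0 @ M.T
    return m_t, np.real(S_t)   # sqrtm can introduce tiny imaginary parts
```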