July 27, 2019

3049 words 15 mins read

Paper Group ANR 716


Consistencies and inconsistencies between model selection and link prediction in networks

Title Consistencies and inconsistencies between model selection and link prediction in networks
Authors Toni Vallès-Català, Tiago P. Peixoto, Roger Guimerà, Marta Sales-Pardo
Abstract A principled approach to understanding network structures is to formulate generative models. Given a collection of models, however, an outstanding key task is to determine which one provides a more accurate description of the network at hand, discounting statistical fluctuations. This problem can be approached using two principled criteria that at first may seem equivalent: selecting the most plausible model in terms of its posterior probability; or selecting the model with the highest predictive performance in terms of identifying missing links. Here we show that while these two approaches yield consistent results in most cases, there are also notable instances where they do not, that is, where the most plausible model is not the most predictive. We show that in the latter case the improvement of predictive performance can in fact lead to overfitting both in artificial and empirical settings. Furthermore, we show that, in general, the predictive performance is higher when we average over collections of models that are individually less plausible than when we consider only the single most plausible model.
Tasks Link Prediction, Model Selection
Published 2017-05-22
URL http://arxiv.org/abs/1705.07967v2
PDF http://arxiv.org/pdf/1705.07967v2.pdf
PWC https://paperswithcode.com/paper/consistencies-and-inconsistencies-between
Repo
Framework
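
As a rough illustration of the model-averaging result above, here is a minimal sketch that scores a candidate link under a stochastic block model and averages the score over several sampled partitions. The partition sampling itself, and the paper's actual inference machinery, are out of scope; all function names are illustrative.

```python
import numpy as np

def block_probs(A, labels):
    """Estimate block-to-block connection probabilities for one partition."""
    groups = np.unique(labels)
    k = len(groups)
    p = np.zeros((k, k))
    for a, r in enumerate(groups):
        for b, s in enumerate(groups):
            nodes_r = np.where(labels == r)[0]
            nodes_s = np.where(labels == s)[0]
            edges = A[np.ix_(nodes_r, nodes_s)].sum()
            pairs = (len(nodes_r) * (len(nodes_r) - 1) if r == s
                     else len(nodes_r) * len(nodes_s))
            p[a, b] = edges / pairs if pairs > 0 else 0.0
    return groups, p

def link_score(A, partitions, i, j):
    """Average the SBM link probability for pair (i, j) over sampled partitions."""
    scores = []
    for labels in partitions:
        groups, p = block_probs(A, labels)
        r = np.where(groups == labels[i])[0][0]
        s = np.where(groups == labels[j])[0][0]
        scores.append(p[r, s])
    return np.mean(scores)  # model averaging over individually less plausible partitions
    # using only the single most plausible partition instead mimics strict model selection
```

The final line is where the abstract's comparison lives: averaging over many sampled partitions versus trusting only the single most plausible one.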

Predicting Disease-Gene Associations using Cross-Document Graph-based Features

Title Predicting Disease-Gene Associations using Cross-Document Graph-based Features
Authors Hendrik ter Horst, Matthias Hartung, Roman Klinger, Matthias Zwick, Philipp Cimiano
Abstract In the context of personalized medicine, text mining methods pose an interesting option for identifying disease-gene associations, as they can be used to generate novel links between diseases and genes which may complement knowledge from structured databases. The most straightforward approach to extract such links from text is to rely on a simple assumption postulating an association between all genes and diseases that co-occur within the same document. However, this approach (i) tends to yield a number of spurious associations, (ii) does not capture different relevant types of associations, and (iii) is incapable of aggregating knowledge that is spread across documents. Thus, we propose an approach in which disease-gene co-occurrences and gene-gene interactions are represented in an RDF graph. A machine learning-based classifier is trained that incorporates features extracted from the graph to separate disease-gene pairs into valid disease-gene associations and spurious ones. On the manually curated Genetic Testing Registry, our approach yields a 30-point increase in F1 score over a plain co-occurrence baseline.
Tasks
Published 2017-09-26
URL http://arxiv.org/abs/1709.09239v1
PDF http://arxiv.org/pdf/1709.09239v1.pdf
PWC https://paperswithcode.com/paper/predicting-disease-gene-associations-using
Repo
Framework
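
The pipeline above lends itself to a short sketch: build a co-occurrence/interaction graph, derive per-pair graph features, and train a binary classifier. The feature set and classifier below are illustrative choices, not the paper's exact RDF-based features.

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pair_features(G, disease, gene):
    """Illustrative graph features for a candidate disease-gene pair."""
    cooc = G[disease][gene].get("weight", 1) if G.has_edge(disease, gene) else 0
    try:
        dist = nx.shortest_path_length(G, disease, gene)
    except nx.NetworkXNoPath:
        dist = -1
    common = len(list(nx.common_neighbors(G, disease, gene)))
    return [cooc, dist, common, G.degree(disease), G.degree(gene)]

# G: co-occurrence/interaction graph built from the document collection
# labelled_pairs: [(disease, gene, 0/1), ...], e.g. derived from the Genetic Testing Registry
def train(G, labelled_pairs):
    X = np.array([pair_features(G, d, g) for d, g, _ in labelled_pairs])
    y = np.array([lab for _, _, lab in labelled_pairs])
    return RandomForestClassifier(n_estimators=200).fit(X, y)
```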

Channel masking for multivariate time series shapelets

Title Channel masking for multivariate time series shapelets
Authors Dripta S. Raychaudhuri, Josif Grabocka, Lars Schmidt-Thieme
Abstract Time series shapelets are discriminative sub-sequences and their similarity to time series can be used for time series classification. Initial shapelet extraction algorithms searched for shapelets by complete enumeration of all possible data sub-sequences. Research on shapelets for univariate time series proposed a mechanism called shapelet learning, which parameterizes the shapelets and learns them jointly with a prediction model in an optimization procedure. A trivial extension of this method to multivariate time series does not yield very good results due to the presence of noisy channels, which lead to overfitting. In this paper we propose a shapelet learning scheme for multivariate time series in which we introduce channel masks to discount noisy channels and serve as implicit regularization.
Tasks Time Series, Time Series Classification
Published 2017-11-02
URL http://arxiv.org/abs/1711.00812v1
PDF http://arxiv.org/pdf/1711.00812v1.pdf
PWC https://paperswithcode.com/paper/channel-masking-for-multivariate-time-series
Repo
Framework
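
A minimal numpy sketch of the channel-masked shapelet distance the abstract describes. In the paper the shapelets and channel masks are learned jointly with the classifier; here only the masked distance used inside that objective is shown, and the softmax masking is an assumption.

```python
import numpy as np

def masked_shapelet_distance(series, shapelet, mask_logits):
    """
    series:      (C, T)  multivariate time series
    shapelet:    (C, L)  multivariate shapelet, L <= T
    mask_logits: (C,)    per-channel mask parameters (learned in practice)
    Returns the minimum masked distance over all offsets.
    """
    C, T = series.shape
    _, L = shapelet.shape
    mask = np.exp(mask_logits) / np.exp(mask_logits).sum()  # soft channel weights
    best = np.inf
    for t in range(T - L + 1):
        window = series[:, t:t + L]
        per_channel = ((window - shapelet) ** 2).mean(axis=1)  # (C,)
        best = min(best, float(mask @ per_channel))            # noisy channels get small weight
    return best
```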

Design of the Artificial: lessons from the biological roots of general intelligence

Title Design of the Artificial: lessons from the biological roots of general intelligence
Authors Nima Dehghani
Abstract Our desire and fascination with intelligent machines dates back to antiquity’s mythical automaton Talos, Aristotle’s mode of mechanical thought (syllogism) and Heron of Alexandria’s mechanical machines and automata. However, the quest for Artificial General Intelligence (AGI) is troubled with repeated failures of strategies and approaches throughout history. This decade has seen a shift in interest towards bio-inspired software and hardware, with the assumption that such mimicry entails intelligence. Though these steps are fruitful in certain directions and have advanced automation, their singular design focus renders them highly inefficient in achieving AGI. Which set of requirements has to be met in the design of AGI? What are the limits in the design of the artificial? Here, a careful examination of computation in biological systems hints that evolutionary tinkering with contextual processing of information, enabled by a hierarchical architecture, is the key to building AGI.
Tasks
Published 2017-03-07
URL http://arxiv.org/abs/1703.02245v2
PDF http://arxiv.org/pdf/1703.02245v2.pdf
PWC https://paperswithcode.com/paper/design-of-the-artificial-lessons-from-the
Repo
Framework

An ILP Solver for Multi-label MRFs with Connectivity Constraints

Title An ILP Solver for Multi-label MRFs with Connectivity Constraints
Authors Ruobing Shen, Eric Kendinibilir, Ismail Ben Ayed, Andrea Lodi, Andrea Tramontani, Gerhard Reinelt
Abstract Integer Linear Programming (ILP) formulations of Markov random field (MRF) models with global connectivity priors were investigated previously in computer vision, e.g., \cite{globalinter,globalconn}. In these works, only Linear Programming (LP) relaxations \cite{globalinter,globalconn} or simplified versions \cite{graphcutbase} of the problem were solved. This paper investigates the ILP formulation of multi-label MRFs with exact connectivity priors via a branch-and-cut method, which provably finds globally optimal solutions. The method enforces connectivity priors iteratively by a cutting plane method, and provides feasible solutions with a guarantee on sub-optimality even if it is terminated early. The proposed ILP can be applied as a post-processing method on top of any existing multi-label segmentation approach. As it provides a globally optimal solution, it can be used off-line to generate a ground-truth labeling, which serves as a quality check for any fast on-line algorithm. Furthermore, it can be used to generate ground-truth proposals for weakly supervised segmentation. We demonstrate the power and usefulness of our model by several experiments on the BSDS500 and PASCAL image datasets, as well as on medical images with trained probability maps.
Tasks
Published 2017-12-16
URL http://arxiv.org/abs/1712.06020v2
PDF http://arxiv.org/pdf/1712.06020v2.pdf
PWC https://paperswithcode.com/paper/an-ilp-solver-for-multi-label-mrfs-with
Repo
Framework
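
A schematic of the cutting-plane loop described above: solve the ILP without connectivity constraints, check each label's region for connectedness, add a cut for any violation, and re-solve. The solver calls are placeholders and the cut shown is a generic one, not the paper's exact constraint family.

```python
import numpy as np
from scipy.ndimage import label as connected_components

def segment_with_connectivity(solve_ilp, add_cut, labels, max_rounds=50):
    """
    Schematic cutting-plane loop.
    solve_ilp() -> (H, W) integer label map from the current ILP (placeholder).
    add_cut(lab, component_mask) adds a constraint forbidding this disconnected
    component for label `lab` (placeholder; the real cuts are more refined).
    """
    seg = None
    for _ in range(max_rounds):
        seg = solve_ilp()
        violated = False
        for lab in labels:
            regions, n = connected_components(seg == lab)
            if n > 1:                                   # label `lab` is split into pieces
                sizes = [(regions == k).sum() for k in range(1, n + 1)]
                smallest = 1 + int(np.argmin(sizes))
                add_cut(lab, regions == smallest)       # forbid the smallest piece
                violated = True
        if not violated:
            return seg                                  # all labels connected: done
    return seg
```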

Image retargeting via Beltrami representation

Title Image retargeting via Beltrami representation
Authors Chun Pong Lau, Chun Pang Yung, Lok Ming Lui
Abstract Image retargeting aims to resize an image to one with a prescribed aspect ratio. Simple scaling inevitably introduces unnatural geometric distortions on the important content of the image. In this paper, we propose a simple yet effective method to resize an image, which preserves the geometry of the important content, using the Beltrami representation. Our algorithm allows users to interactively label content regions as well as line structures. Image resizing can then be achieved by warping the image by an orientation-preserving bijective warping map with controlled distortion. The warping map is represented by its Beltrami representation, which captures the local geometric distortion of the map. By carefully prescribing the values of the Beltrami representation, images of different complexity can be effectively resized. Our method does not require solving any optimization problems or tuning parameters throughout the process. This results in a simple and efficient algorithm to solve the image retargeting problem. Extensive experiments have been carried out, which demonstrate the efficacy of our proposed method.
Tasks
Published 2017-10-11
URL http://arxiv.org/abs/1710.04034v1
PDF http://arxiv.org/pdf/1710.04034v1.pdf
PWC https://paperswithcode.com/paper/image-retargeting-via-beltrami-representation
Repo
Framework
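
The central object in this method is the Beltrami coefficient of the warping map; a small numpy sketch of how it can be computed for a discrete map via finite differences is below. The retargeting algorithm itself (prescribing the coefficient and reconstructing the warp) is not reproduced here.

```python
import numpy as np

def beltrami_coefficient(u, v):
    """
    u, v: (H, W) real arrays giving the warp f(x, y) = u + i*v on a grid.
    Returns mu = f_zbar / f_z; |mu| < 1 everywhere iff the map is
    orientation-preserving, which is what the method controls.
    """
    u_y, u_x = np.gradient(u)
    v_y, v_x = np.gradient(v)
    f_z    = 0.5 * ((u_x + v_y) + 1j * (v_x - u_y))
    f_zbar = 0.5 * ((u_x - v_y) + 1j * (v_x + u_y))
    return f_zbar / (f_z + 1e-12)
```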

Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks

Title Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks
Authors Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang
Abstract Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent success of convolutional neural networks (CNNs) in SE, we propose an audio-visual deep CNN (AVDCNN) SE model, which incorporates audio and visual streams into a unified network model. We also propose a multi-task learning framework for reconstructing audio and visual signals at the output layer. Precisely speaking, the proposed AVDCNN model is structured as an audio-visual encoder-decoder network, in which audio and visual data are first processed using individual CNNs, and then fused into a joint network to generate enhanced speech (the primary task) and reconstructed images (the secondary task) at the output layer. The model is trained in an end-to-end manner, and parameters are jointly learned through back-propagation. We evaluate enhanced speech using five instrumental criteria. Results show that the AVDCNN model yields notably superior performance compared with an audio-only CNN-based SE model and two conventional SE approaches, confirming the effectiveness of integrating visual information into the SE process. In addition, the AVDCNN model also outperforms an existing audio-visual SE model, confirming its capability of effectively combining audio and visual information in SE.
Tasks Multi-Task Learning, Speech Enhancement
Published 2017-03-30
URL http://arxiv.org/abs/1703.10893v6
PDF http://arxiv.org/pdf/1703.10893v6.pdf
PWC https://paperswithcode.com/paper/audio-visual-speech-enhancement-using
Repo
Framework
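
A toy PyTorch sketch of the layout the abstract describes: separate audio and visual CNN encoders, a fused joint network, and two output heads for enhanced speech and image reconstruction. All layer shapes and the loss weighting are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AVDCNNSketch(nn.Module):
    def __init__(self, n_freq=257, img_ch=1):
        super().__init__()
        # audio branch: noisy spectrogram frames (B, 1, n_freq, T)
        self.audio_enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)))
        # visual branch: mouth-region images (B, img_ch, 64, 64)
        self.video_enc = nn.Sequential(
            nn.Conv2d(img_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)))
        fused = 2 * 32 * 8 * 8
        self.joint = nn.Sequential(nn.Linear(fused, 512), nn.ReLU())
        self.speech_head = nn.Linear(512, n_freq)            # primary task: enhanced frame
        self.image_head = nn.Linear(512, img_ch * 64 * 64)   # secondary task: reconstruction

    def forward(self, noisy_spec, mouth_img):
        a = self.audio_enc(noisy_spec).flatten(1)
        v = self.video_enc(mouth_img).flatten(1)
        h = self.joint(torch.cat([a, v], dim=1))
        return self.speech_head(h), self.image_head(h)

# joint multi-task loss, reflecting the end-to-end training described in the abstract
def loss_fn(pred_speech, pred_img, clean_frame, img, alpha=0.3):
    mse = nn.functional.mse_loss
    return mse(pred_speech, clean_frame) + alpha * mse(pred_img, img.flatten(1))
```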

WeText: Scene Text Detection under Weak Supervision

Title WeText: Scene Text Detection under Weak Supervision
Authors Shangxuan Tian, Shijian Lu, Chongshou Li
Abstract The requirement for large amounts of annotated training data has become a common constraint on various deep learning systems. In this paper, we propose a weakly supervised scene text detection method (WeText) that trains robust and accurate scene text detection models by learning from unannotated or weakly annotated data. With a “light” supervised model trained on a small fully annotated dataset, we explore semi-supervised and weakly supervised learning on a large unannotated dataset and a large weakly annotated dataset, respectively. For the semi-supervised learning, the light supervised model is applied to the unannotated dataset to search for more character training samples, which are further combined with the small annotated dataset to retrain a superior character detection model. For the weakly supervised learning, the character searching is guided by high-level annotations of words/text lines that are widely available and also much easier to prepare. In addition, we design a unified scene character detector by adapting regression-based deep networks, which greatly relieves the error accumulation issue that widely exists in most traditional approaches. Extensive experiments across different unannotated and weakly annotated datasets show that the scene text detection performance can be clearly boosted under both scenarios, where the weakly supervised learning can achieve state-of-the-art performance by using only 229 fully annotated scene text images.
Tasks Scene Text Detection
Published 2017-10-13
URL http://arxiv.org/abs/1710.04826v1
PDF http://arxiv.org/pdf/1710.04826v1.pdf
PWC https://paperswithcode.com/paper/wetext-scene-text-detection-under-weak
Repo
Framework
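
The semi-supervised part of the pipeline is essentially a self-training loop; a schematic version is sketched below, with hypothetical train_char_detector/detect_chars placeholders standing in for the actual character detector.

```python
def wetext_semi_supervised(annotated, unannotated, train_char_detector, detect_chars,
                           rounds=2, conf_thresh=0.9):
    """
    annotated:   [(image, char_boxes), ...]  small fully labelled set
    unannotated: [image, ...]                large unlabelled set
    train_char_detector / detect_chars: placeholders for the actual detector.
    """
    model = train_char_detector(annotated)                 # "light" supervised model
    for _ in range(rounds):
        pseudo = []
        for img in unannotated:
            boxes = [b for b in detect_chars(model, img) if b.score >= conf_thresh]
            if boxes:                                      # keep only confident detections
                pseudo.append((img, boxes))
        model = train_char_detector(annotated + pseudo)    # retrain on the enlarged set
    return model
```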

Towards speech-to-text translation without speech recognition

Title Towards speech-to-text translation without speech recognition
Authors Sameer Bansal, Herman Kamper, Adam Lopez, Sharon Goldwater
Abstract We explore the problem of translating speech to text in low-resource scenarios where neither automatic speech recognition (ASR) nor machine translation (MT) are available, but we have training data in the form of audio paired with text translations. We present the first system for this problem applied to a realistic multi-speaker dataset, the CALLHOME Spanish-English speech translation corpus. Our approach uses unsupervised term discovery (UTD) to cluster repeated patterns in the audio, creating a pseudotext, which we pair with translations to create a parallel text and train a simple bag-of-words MT model. We identify the challenges faced by the system, finding that the difficulty of cross-speaker UTD results in low recall, but that our system is still able to correctly translate some content words in test data.
Tasks Machine Translation, Speech Recognition
Published 2017-02-13
URL http://arxiv.org/abs/1702.03856v1
PDF http://arxiv.org/pdf/1702.03856v1.pdf
PWC https://paperswithcode.com/paper/towards-speech-to-text-translation-without
Repo
Framework
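
A small sketch of the bag-of-words translation step: estimate word probabilities per pseudoterm from co-occurrence counts in the paired training data, then translate a test utterance by emitting high-probability words for its discovered pseudoterms. The UTD step producing the pseudoterms is assumed to have run already, and the thresholding rule is an illustrative choice.

```python
from collections import Counter, defaultdict

def train_bow_mt(pairs):
    """pairs: [(pseudoterms, target_words), ...] from the parallel training data."""
    cooc = defaultdict(Counter)
    term_count = Counter()
    for pseudoterms, words in pairs:
        for t in set(pseudoterms):
            term_count[t] += 1
            cooc[t].update(set(words))
    # p(word | pseudoterm) as a relative co-occurrence frequency
    return {t: {w: c / term_count[t] for w, c in ws.items()} for t, ws in cooc.items()}

def translate(model, pseudoterms, thresh=0.5):
    """Emit every target word whose probability exceeds the threshold (a bag, not a sentence)."""
    out = set()
    for t in pseudoterms:
        for w, p in model.get(t, {}).items():
            if p >= thresh:
                out.add(w)
    return out
```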

Robust Optimization for Non-Convex Objectives

Title Robust Optimization for Non-Convex Objectives
Authors Robert Chen, Brendan Lucier, Yaron Singer, Vasilis Syrgkanis
Abstract We consider robust optimization problems, where the goal is to optimize in the worst case over a class of objective functions. We develop a reduction from robust improper optimization to Bayesian optimization: given an oracle that returns $\alpha$-approximate solutions for distributions over objectives, we compute a distribution over solutions that is $\alpha$-approximate in the worst case. We show that de-randomizing this solution is NP-hard in general, but can be done for a broad class of statistical learning tasks. We apply our results to robust neural network training and submodular optimization. We evaluate our approach experimentally on corrupted character classification, and robust influence maximization in networks.
Tasks
Published 2017-07-04
URL http://arxiv.org/abs/1707.01047v1
PDF http://arxiv.org/pdf/1707.01047v1.pdf
PWC https://paperswithcode.com/paper/robust-optimization-for-non-convex-objectives
Repo
Framework
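
The reduction lends itself to a compact sketch: a multiplicative-weights adversary over the objective functions repeatedly queries the approximate oracle, and the output is the resulting collection of solutions, played uniformly at random. The oracle is a placeholder, and the learning rate and round count are illustrative.

```python
import numpy as np

def robust_optimize(losses, oracle, rounds=100, eta=0.1):
    """
    losses: list of functions f_i(x) -> loss in [0, 1]
    oracle(weights) -> a solution x that (approximately) minimizes
                       sum_i weights[i] * f_i(x)   (placeholder)
    Returns the list of solutions; playing one uniformly at random is the
    distribution over solutions that is approximately robust in the worst case.
    """
    m = len(losses)
    w = np.ones(m) / m
    solutions = []
    for _ in range(rounds):
        x = oracle(w)                       # best response to the current mix of objectives
        solutions.append(x)
        l = np.array([f(x) for f in losses])
        w = w * np.exp(eta * l)             # adversary up-weights badly-served objectives
        w /= w.sum()
    return solutions
```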

Forced to Learn: Discovering Disentangled Representations Without Exhaustive Labels

Title Forced to Learn: Discovering Disentangled Representations Without Exhaustive Labels
Authors Alexey Romanov, Anna Rumshisky
Abstract Learning a better representation with neural networks is a challenging problem, which has been tackled extensively from different perspectives in the past few years. In this work, we focus on learning a representation that could be used for a clustering task and introduce two novel loss components that substantially improve the quality of the produced clusters, are simple to apply to an arbitrary model and cost function, and do not require a complicated training procedure. We evaluate them on the two most common types of models, Recurrent Neural Networks and Convolutional Neural Networks, showing that the approach we propose consistently improves the quality of KMeans clustering in terms of the Adjusted Mutual Information score and outperforms previously proposed methods.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00574v1
PDF http://arxiv.org/pdf/1705.00574v1.pdf
PWC https://paperswithcode.com/paper/forced-to-learn-discovering-disentangled
Repo
Framework
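
The evaluation protocol described above is easy to sketch: cluster the learned representation with K-Means and score the result against the true labels with Adjusted Mutual Information. The proposed loss components themselves are not reproduced here.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_mutual_info_score

def clustering_quality(representations, true_labels, n_clusters):
    """Run K-Means on a learned representation and report the AMI score."""
    pred = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(representations)
    return adjusted_mutual_info_score(true_labels, pred)
```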

An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning

Title An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning
Authors Fan Wu, Zhongwen Xu, Yi Yang
Abstract We propose an end-to-end approach to the natural language object retrieval task, which localizes an object within an image according to a natural language description, i.e., a referring expression. Previous works divide this problem into two independent stages: first, compute region proposals from the image without exploring the language description; second, score the object proposals with regard to the referring expression and choose the top-ranked proposals. The object proposals are generated independently of the referring expression, which makes the proposal generation redundant and even irrelevant to the referred object. In this work, we train an agent with deep reinforcement learning, which learns to move and reshape a bounding box to localize the object according to the referring expression. We incorporate both spatial and temporal context information into the training procedure. By simultaneously exploiting local visual information, the spatial and temporal context, and the referring language prior, the agent selects an appropriate action to take at each time step. A special action is defined to indicate that the agent has found the referred object and to terminate the procedure. We evaluate our model on various datasets, and our algorithm significantly outperforms the compared algorithms. Notably, the accuracy improvements of our method over the recent methods GroundeR and SCRC on the ReferItGame dataset are 7.67% and 18.25%, respectively.
Tasks
Published 2017-03-22
URL http://arxiv.org/abs/1703.07579v1
PDF http://arxiv.org/pdf/1703.07579v1.pdf
PWC https://paperswithcode.com/paper/an-end-to-end-approach-to-natural-language
Repo
Framework
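
The agent's action space (moving and reshaping a box, plus a terminating "trigger" action) can be sketched directly; the action set and step size below are illustrative, not the paper's exact parameterization.

```python
# illustrative discrete action set for moving/reshaping a box [x1, y1, x2, y2]
ACTIONS = ["left", "right", "up", "down",
           "wider", "narrower", "taller", "shorter", "trigger"]

def apply_action(box, action, step=0.1):
    x1, y1, x2, y2 = box
    dx, dy = step * (x2 - x1), step * (y2 - y1)
    if action == "left":       x1, x2 = x1 - dx, x2 - dx
    elif action == "right":    x1, x2 = x1 + dx, x2 + dx
    elif action == "up":       y1, y2 = y1 - dy, y2 - dy
    elif action == "down":     y1, y2 = y1 + dy, y2 + dy
    elif action == "wider":    x1, x2 = x1 - dx, x2 + dx
    elif action == "narrower": x1, x2 = x1 + dx, x2 - dx
    elif action == "taller":   y1, y2 = y1 - dy, y2 + dy
    elif action == "shorter":  y1, y2 = y1 + dy, y2 - dy
    # "trigger" signals that the agent believes the referred object is localized
    return [x1, y1, x2, y2], action == "trigger"
```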

Function Norms and Regularization in Deep Networks

Title Function Norms and Regularization in Deep Networks
Authors Amal Rannen Triki, Maxim Berman, Matthew B. Blaschko
Abstract Deep neural networks (DNNs) have become increasingly important due to their excellent empirical performance on a wide range of problems. However, regularization is generally achieved by indirect means, largely due to the complex set of functions defined by a network and the difficulty in measuring function complexity. There exists no method in the literature for additive regularization based on a norm of the function, as is classically considered in statistical learning theory. In this work, we propose sampling-based approximations to weighted function norms as regularizers for deep neural networks. We provide, to the best of our knowledge, the first proof in the literature of the NP-hardness of computing function norms of DNNs, motivating the necessity of an approximate approach. We then derive a generalization bound for functions trained with weighted norms and prove that a natural stochastic optimization strategy minimizes the bound. Finally, we empirically validate the proposed regularization strategies for both convex function sets and DNNs on real-world classification and image segmentation tasks, demonstrating improved performance over weight decay, dropout, and batch normalization. Source code will be released at the time of publication.
Tasks Semantic Segmentation, Stochastic Optimization
Published 2017-10-18
URL http://arxiv.org/abs/1710.06703v2
PDF http://arxiv.org/pdf/1710.06703v2.pdf
PWC https://paperswithcode.com/paper/function-norms-and-regularization-in-deep
Repo
Framework
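
The regularizer amounts to adding a sampling-based estimate of a weighted L2 function norm to the task loss; a minimal PyTorch sketch is below, with the sampling distribution (here, a batch of extra inputs) and the weight lambda as assumptions.

```python
import torch

def function_norm_penalty(model, sample_inputs):
    """Monte-Carlo estimate of the squared weighted L2 function norm
    E_x ||f(x)||^2, using inputs drawn from the weighting distribution."""
    out = model(sample_inputs)
    return out.pow(2).sum(dim=1).mean()

def regularized_loss(model, x, y, sample_inputs, task_loss, lam=1e-3):
    # task loss on labelled data plus the sampled function-norm regularizer
    return task_loss(model(x), y) + lam * function_norm_penalty(model, sample_inputs)
```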

Geodesic Distance Histogram Feature for Video Segmentation

Title Geodesic Distance Histogram Feature for Video Segmentation
Authors Hieu Le, Vu Nguyen, Chen-Ping Yu, Dimitris Samaras
Abstract This paper proposes a geodesic-distance-based feature that encodes global information for improved video segmentation algorithms. The feature is a joint histogram of intensity and geodesic distances, where the geodesic distances are computed as the shortest paths between superpixels via their boundaries. We also incorporate adaptive voting weights and spatial pyramid configurations to include spatial information into the geodesic histogram feature and show that this further improves results. The feature is generic and can be used as part of various algorithms. In experiments, we test the geodesic histogram feature by incorporating it into two existing video segmentation frameworks. This leads to significantly better performance in 3D video segmentation benchmarks on two datasets.
Tasks Video Semantic Segmentation
Published 2017-03-31
URL http://arxiv.org/abs/1704.00077v1
PDF http://arxiv.org/pdf/1704.00077v1.pdf
PWC https://paperswithcode.com/paper/geodesic-distance-histogram-feature-for-video
Repo
Framework
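
A sketch of the joint intensity/geodesic-distance histogram: geodesic distances are shortest paths on a superpixel adjacency graph (computed here with scipy's dijkstra), and the feature is a 2D histogram over (intensity, distance). Bin counts and edge weights are illustrative choices.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def geodesic_histogram(intensity, edges, weights, source, n_sp,
                       intensity_bins=10, distance_bins=10):
    """
    intensity: (n_sp,) mean intensity per superpixel
    edges:     list of (i, j) adjacent superpixel pairs
    weights:   boundary-based edge costs, same length as edges
    source:    index of the superpixel the histogram is computed for
    """
    rows = [i for i, _ in edges] + [j for _, j in edges]
    cols = [j for _, j in edges] + [i for i, _ in edges]
    data = list(weights) + list(weights)
    graph = csr_matrix((data, (rows, cols)), shape=(n_sp, n_sp))
    dist = dijkstra(graph, indices=source)                 # geodesic distance to every superpixel
    dist[np.isinf(dist)] = dist[np.isfinite(dist)].max()   # guard against disconnected superpixels
    hist, _, _ = np.histogram2d(intensity, dist,
                                bins=(intensity_bins, distance_bins))
    return hist / hist.sum()                               # joint histogram feature
```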

Faster Fuzzing: Reinitialization with Deep Neural Models

Title Faster Fuzzing: Reinitialization with Deep Neural Models
Authors Nicole Nichols, Mark Raugas, Robert Jasper, Nathan Hilliard
Abstract We improve the performance of the American Fuzzy Lop (AFL) fuzz testing framework by using Generative Adversarial Network (GAN) models to reinitialize the system with novel seed files. We assess performance based on the temporal rate at which we produce novel and unseen code paths. We compare this approach to seed file generation from a random draw of bytes observed in the training seed files. The code path lengths and variations were not sufficiently diverse to fully replace AFL input generation. However, augmenting native AFL with these additional code paths demonstrated improvements over AFL alone. Specifically, experiments showed the GAN was faster and more effective than the LSTM and outperformed a random augmentation strategy, as measured by the number of unique code paths discovered. The GAN helps AFL discover 14.23% more code paths than the random strategy in the same amount of CPU time, finds 6.16% more unique code paths, and finds paths that are on average 13.84% longer. Using a GAN shows promise as a reinitialization strategy for AFL to help the fuzzer exercise deep paths in software.
Tasks
Published 2017-11-08
URL http://arxiv.org/abs/1711.02807v1
PDF http://arxiv.org/pdf/1711.02807v1.pdf
PWC https://paperswithcode.com/paper/faster-fuzzing-reinitialization-with-deep
Repo
Framework
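
The reinitialization step itself is simple to sketch: sample synthetic seed files from a trained generator and write them into AFL's input directory. The generator architecture and training are out of scope here, and the output scaling assumed below is an assumption.

```python
import os
import torch

def reinitialize_afl_seeds(generator, out_dir, n_seeds=64, latent_dim=100, seed_len=512):
    """Sample synthetic seed files from a trained generator and write them
    to AFL's input directory (generator and latent size are assumptions)."""
    os.makedirs(out_dir, exist_ok=True)
    with torch.no_grad():
        z = torch.randn(n_seeds, latent_dim)
        samples = generator(z)                                 # assumed to output values in [0, 1]
        data = (samples * 255).clamp(0, 255).to(torch.uint8)
    for i, sample in enumerate(data):
        with open(os.path.join(out_dir, f"gan_seed_{i:03d}"), "wb") as fh:
            fh.write(bytes(sample.flatten().tolist()[:seed_len]))
```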