October 20, 2019

3145 words 15 mins read

Paper Group AWR 222

Structure from Recurrent Motion: From Rigidity to Recurrency

Title Structure from Recurrent Motion: From Rigidity to Recurrency
Authors Xiu Li, Hongdong Li, Hanbyul Joo, Yebin Liu, Yaser Sheikh
Abstract This paper proposes a new method for Non-Rigid Structure-from-Motion (NRSfM) from a long monocular video sequence observing a non-rigid object performing recurrent and possibly repetitive dynamic action. Departing from the traditional idea of using a linear low-order or low-rank shape model for the task of NRSfM, our method exploits the property of shape recurrency (i.e., many deforming shapes tend to repeat themselves in time). We show that recurrency is in fact a generalized rigidity. Based on this, we reduce NRSfM problems to rigid ones provided that a certain recurrency condition is satisfied. Given such a reduction, standard rigid-SfM techniques are directly applicable (without any change) to the reconstruction of non-rigid dynamic shapes. To implement this idea as a practical approach, this paper develops efficient algorithms for automatic recurrency detection, as well as camera view clustering via a rigidity check. Experiments on both simulated sequences and real data demonstrate the effectiveness of the method. Since this paper offers a novel perspective on rethinking structure-from-motion, we hope it will inspire other new problems in the field.
Tasks
Published 2018-04-18
URL http://arxiv.org/abs/1804.06510v1
PDF http://arxiv.org/pdf/1804.06510v1.pdf
PWC https://paperswithcode.com/paper/structure-from-recurrent-motion-from-rigidity
Repo https://github.com/leehsiu/poseLabel
Framework none
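
Since the reduction described in the abstract turns a recurrent non-rigid sequence into a set of effectively rigid problems, a standard rigid-SfM step becomes directly applicable. Below is a minimal sketch of the classic Tomasi-Kanade-style rank-3 factorization that such a reduction would feed into; it is illustrative only (the measurement matrix and toy data are hypothetical), not the authors' full pipeline.

```python
import numpy as np

def rigid_sfm_factorization(W):
    """Tomasi-Kanade-style affine factorization.

    W: (2F, P) measurement matrix of P tracked points over F frames.
    Returns camera motion M (2F, 3) and 3D structure S (3, P), up to
    an affine ambiguity (the metric upgrade is omitted for brevity).
    """
    # Center each row: removes the per-frame translation.
    W_centered = W - W.mean(axis=1, keepdims=True)

    # Rank-3 truncated SVD: a rigid scene under affine projection
    # yields a measurement matrix of rank at most 3.
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    M = U[:, :3] * np.sqrt(s[:3])            # motion
    S = np.sqrt(s[:3])[:, None] * Vt[:3]     # structure
    return M, S

# Hypothetical usage: each group of frames found by recurrency detection
# would form one such "rigid" measurement matrix.
W = np.random.randn(2 * 30, 50)   # 30 frames, 50 points (toy data)
M, S = rigid_sfm_factorization(W)
print(M.shape, S.shape)           # (60, 3) (3, 50)
```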

Extracting Scientific Figures with Distantly Supervised Neural Networks

Title Extracting Scientific Figures with Distantly Supervised Neural Networks
Authors Noah Siegel, Nicholas Lourie, Russell Power, Waleed Ammar
Abstract Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. In this paper, we induce high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention. To accomplish this we leverage the auxiliary data provided in two large web collections of scientific documents (arXiv and PubMed) to locate figures and their associated captions in the rasterized PDF. We share the resulting dataset of over 5.5 million induced labels—4,000 times larger than the previous largest figure extraction dataset—with an average precision of 96.8%, to enable the development of modern data-driven methods for this task. We use this dataset to train a deep neural network for end-to-end figure detection, yielding a model that can be more easily extended to new domains compared to previous work. The model was successfully deployed in Semantic Scholar, a large-scale academic search engine, and used to extract figures in 13 million scientific documents.
Tasks
Published 2018-04-06
URL http://arxiv.org/abs/1804.02445v2
PDF http://arxiv.org/pdf/1804.02445v2.pdf
PWC https://paperswithcode.com/paper/extracting-scientific-figures-with-distantly
Repo https://github.com/allenai/deepfigures-open
Framework none

A pathway-based kernel boosting method for sample classification using genomic data

Title A pathway-based kernel boosting method for sample classification using genomic data
Authors Li Zeng, Zhaolong Yu, Hongyu Zhao
Abstract The analysis of cancer genomic data has long suffered from “the curse of dimensionality”. Sample sizes for most cancer genomic studies are a few hundred at most, while there are tens of thousands of genomic features studied. Various methods have been proposed to leverage prior biological knowledge, such as pathways, to more effectively analyze cancer genomic data. Most of the methods focus on testing the marginal significance of the associations between pathways and clinical phenotypes. They can identify relevant pathways, but do not involve predictive modeling. In this article, we propose a Pathway-based Kernel Boosting (PKB) method for integrating gene pathway information for sample classification, where we use kernel functions calculated from each pathway as base learners and learn the weights through iterative optimization of the classification loss function. We apply PKB and several competing methods to three cancer studies with pathological and clinical information, including tumor grade, stage, tumor sites, and metastasis status. Our results show that PKB outperforms other methods, and identifies pathways relevant to the outcome variables.
Tasks
Published 2018-03-11
URL http://arxiv.org/abs/1803.03910v1
PDF http://arxiv.org/pdf/1803.03910v1.pdf
PWC https://paperswithcode.com/paper/a-pathway-based-kernel-boosting-method-for
Repo https://github.com/zengliX/PKB
Framework none
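
To make the boosting idea concrete, here is a heavily simplified sketch in the spirit of PKB: at each round a kernel learner is fit on each pathway's genes against the negative gradient of the logistic loss, and the best-fitting pathway is added to the additive model. The pathway-to-column mapping and hyperparameters are illustrative, and this is not the paper's exact formulation.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def pkb_sketch(X, y, pathways, n_rounds=50, lr=0.1):
    """Toy pathway-based kernel boosting for binary labels y in {0, 1}.

    pathways: dict mapping pathway name -> list of column indices in X.
    """
    F = np.zeros(len(y))                      # additive model (log-odds)
    learners = []
    for _ in range(n_rounds):
        p = 1.0 / (1.0 + np.exp(-F))
        residual = y - p                      # negative gradient of log-loss
        best = None
        for name, idx in pathways.items():
            Xp = X[:, idx]                    # genes belonging to this pathway
            model = KernelRidge(kernel="rbf", alpha=1.0)
            model.fit(Xp, residual)
            pred = model.predict(Xp)
            sse = np.sum((residual - pred) ** 2)
            if best is None or sse < best[0]:
                best = (sse, name, model, pred)
        _, name, model, pred = best
        F += lr * pred                        # shrunken boosting step
        learners.append((name, model))
    return F, learners
```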

Billion-scale Network Embedding with Iterative Random Projection

Title Billion-scale Network Embedding with Iterative Random Projection
Authors Ziwei Zhang, Peng Cui, Haoyang Li, Xiao Wang, Wenwu Zhu
Abstract Network embedding, which learns low-dimensional vector representation for nodes in the network, has attracted considerable research attention recently. However, the existing methods are incapable of handling billion-scale networks, because they are computationally expensive and, at the same time, difficult to accelerate with distributed computing schemes. To address these problems, we propose RandNE (Iterative Random Projection Network Embedding), a novel and simple billion-scale network embedding method. Specifically, we propose a Gaussian random projection approach to map the network into a low-dimensional embedding space while preserving the high-order proximities between nodes. To reduce the time complexity, we design an iterative projection procedure to avoid the explicit calculation of the high-order proximities. Theoretical analysis shows that our method is extremely efficient, and friendly to distributed computing schemes without any communication cost in the calculation. We also design a dynamic updating procedure which can efficiently incorporate the dynamic changes of the networks without error aggregation. Extensive experimental results demonstrate the efficiency and efficacy of RandNE over state-of-the-art methods in several tasks including network reconstruction, link prediction and node classification on multiple datasets with different scales, ranging from thousands to billions of nodes and edges.
Tasks Link Prediction, Network Embedding, Node Classification
Published 2018-05-07
URL http://arxiv.org/abs/1805.02396v2
PDF http://arxiv.org/pdf/1805.02396v2.pdf
PWC https://paperswithcode.com/paper/billion-scale-network-embedding-with
Repo https://github.com/ZW-ZHANG/RandNE
Framework none
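
A minimal sketch of the iterative projection idea: a Gaussian random matrix is propagated through the sparse adjacency matrix so that the k-th order proximity is captured without ever materializing A^k. This omits the paper's orthogonalization and normalization details; the weights and toy graph are illustrative.

```python
import numpy as np
import scipy.sparse as sp

def randne(adj, dim=128, order=3, weights=(1.0, 1.0, 1.0, 1.0), seed=0):
    """Minimal RandNE-style embedding via iterative random projection.

    adj: scipy.sparse adjacency matrix (n x n).
    U_0 is a Gaussian random projection; U_k = A @ U_{k-1} captures
    k-th order proximity. The embedding is a weighted sum of the U_k.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    U = rng.normal(0.0, 1.0 / np.sqrt(dim), size=(n, dim))  # U_0
    emb = weights[0] * U
    for k in range(1, order + 1):
        U = adj @ U                       # one sparse multiply per order
        emb = emb + weights[k] * U
    return emb

# Toy usage with a random sparse graph (illustrative, not billion-scale).
A = sp.random(1000, 1000, density=0.01, format="csr")
A = A + A.T                               # symmetrize
emb = randne(A, dim=64, order=2, weights=(1.0, 1.0, 0.5))
print(emb.shape)                          # (1000, 64)
```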

XNLI: Evaluating Cross-lingual Sentence Representations

Title XNLI: Evaluating Cross-lingual Sentence Representations
Authors Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov
Abstract State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models. These models are generally trained on data in a single language (usually English), and cannot be directly used beyond that language. Since collecting data in every language is not realistic, there has been a growing interest in cross-lingual language understanding (XLU) and low-resource cross-language transfer. In this work, we construct an evaluation set for XLU by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus (MultiNLI) to 15 languages, including low-resource languages such as Swahili and Urdu. We hope that our dataset, dubbed XNLI, will catalyze research in cross-lingual sentence understanding by providing an informative standard evaluation task. In addition, we provide several baselines for multilingual sentence understanding, including two based on machine translation systems, and two that use parallel data to train aligned multilingual bag-of-words and LSTM encoders. We find that XNLI represents a practical and challenging evaluation suite, and that directly translating the test data yields the best performance among available baselines.
Tasks Cross-Lingual Natural Language Inference, Machine Translation, Natural Language Inference
Published 2018-09-13
URL http://arxiv.org/abs/1809.05053v1
PDF http://arxiv.org/pdf/1809.05053v1.pdf
PWC https://paperswithcode.com/paper/xnli-evaluating-cross-lingual-sentence
Repo https://github.com/Somefive/XNLI
Framework pytorch

Jack the Reader - A Machine Reading Framework

Title Jack the Reader - A Machine Reading Framework
Authors Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel
Abstract Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of related tasks would allow for expressive modelling, and easier model comparison and replication. To that end, we present Jack the Reader (Jack), a framework for Machine Reading that allows for quick model prototyping by component reuse, evaluation of new models on existing datasets, as well as integrating new datasets and applying them on a growing set of implemented baseline models. Jack currently supports (but is not limited to) three tasks: Question Answering, Natural Language Inference, and Link Prediction. It is developed with the aim of increasing research efficiency and code reuse.
Tasks Link Prediction, Natural Language Inference, Question Answering, Reading Comprehension
Published 2018-06-20
URL http://arxiv.org/abs/1806.08727v1
PDF http://arxiv.org/pdf/1806.08727v1.pdf
PWC https://paperswithcode.com/paper/jack-the-reader-a-machine-reading-framework
Repo https://github.com/uclmr/jack
Framework tf

Motion Planning Networks

Title Motion Planning Networks
Authors Ahmed H. Qureshi, Anthony Simeonov, Mayur J. Bency, Michael C. Yip
Abstract Fast and efficient motion planning algorithms are crucial for many state-of-the-art robotics applications such as self-driving cars. Existing motion planning methods become ineffective as their computational complexity increases exponentially with the dimensionality of the motion planning problem. To address this issue, we present Motion Planning Networks (MPNet), a novel neural-network-based planning algorithm. The proposed method encodes the given workspaces directly from a point cloud measurement and generates end-to-end collision-free paths for the given start and goal configurations. We evaluate MPNet on various 2D and 3D environments including the planning of a 7 DOF Baxter robot manipulator. The results show that MPNet is not only consistently computationally efficient in all environments but also generalizes to completely unseen environments. The results also show that the computation time of MPNet consistently remains less than 1 second in all presented experiments, which is significantly lower than existing state-of-the-art motion planning algorithms.
Tasks Motion Planning, Self-Driving Cars, Transfer Learning
Published 2018-06-14
URL http://arxiv.org/abs/1806.05767v2
PDF http://arxiv.org/pdf/1806.05767v2.pdf
PWC https://paperswithcode.com/paper/motion-planning-networks
Repo https://github.com/ahq1993/MPNet
Framework pytorch
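
A toy PyTorch sketch of the two-network structure described above: an encoder that embeds the obstacle point cloud and a planner that predicts the next configuration, rolled out greedily toward the goal. Layer sizes, the collision handling, and the re-planning logic are placeholders rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class PlannerSketch(nn.Module):
    """Toy MPNet-style planner: embed the obstacle cloud, then predict
    the next configuration given the current and goal configurations."""

    def __init__(self, cloud_dim=2800, latent_dim=28, cfg_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(cloud_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        self.planner = nn.Sequential(
            nn.Linear(latent_dim + 2 * cfg_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, cfg_dim),
        )

    def forward(self, cloud, current, goal):
        z = self.encoder(cloud)
        return self.planner(torch.cat([z, current, goal], dim=-1))

def plan(model, cloud, start, goal, max_steps=80, tol=0.05):
    """Greedy roll-out: predict the next configuration until we are close
    to the goal (collision checks and re-planning are omitted here)."""
    path, current = [start], start
    for _ in range(max_steps):
        current = model(cloud, current, goal)
        path.append(current)
        if torch.norm(current - goal) < tol:
            break
    return path
```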

Evaluating Theory of Mind in Question Answering

Title Evaluating Theory of Mind in Question Answering
Authors Aida Nematzadeh, Kaylee Burns, Erin Grant, Alison Gopnik, Thomas L. Griffiths
Abstract We propose a new dataset for evaluating question answering models with respect to their capacity to reason about beliefs. Our tasks are inspired by theory-of-mind experiments that examine whether children are able to reason about the beliefs of others, in particular when those beliefs differ from reality. We evaluate a number of recent neural models with memory augmentation. We find that all fail on our tasks, which require keeping track of inconsistent states of the world; moreover, the models’ accuracy decreases notably when random sentences are introduced to the tasks at test time.
Tasks Question Answering
Published 2018-08-28
URL http://arxiv.org/abs/1808.09352v1
PDF http://arxiv.org/pdf/1808.09352v1.pdf
PWC https://paperswithcode.com/paper/evaluating-theory-of-mind-in-question
Repo https://github.com/kayburns/tom-qa-dataset
Framework none
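
For a concrete sense of the kind of item involved, here is an illustrative false-belief (Sally-Anne style) example in the story/question/answer format such tasks follow; the wording is made up for illustration, not taken from the released data.

```python
def false_belief_item():
    """One toy false-belief question: the agent who left the room
    holds an outdated belief about where the object is."""
    story = [
        "Sally entered the kitchen.",
        "Anne entered the kitchen.",
        "Sally put the apple in the basket.",
        "Sally left the kitchen.",            # Sally no longer observes
        "Anne moved the apple to the box.",   # reality changes
    ]
    question = "Where will Sally look for the apple?"
    answer = "basket"                          # Sally's (false) belief
    return story, question, answer

story, q, a = false_belief_item()
print("\n".join(story))
print(q, "->", a)
```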

Learning towards Minimum Hyperspherical Energy

Title Learning towards Minimum Hyperspherical Energy
Authors Weiyang Liu, Rongmei Lin, Zhen Liu, Lixin Liu, Zhiding Yu, Bo Dai, Le Song
Abstract Neural networks are a powerful class of nonlinear functions that can be trained end-to-end on various applications. While the over-parametrized nature of many neural networks provides the ability to fit complex functions and the representational power to handle challenging tasks, it also leads to highly correlated neurons that can hurt generalization and incur unnecessary computation cost. As a result, how to regularize the network to avoid undesired representation redundancy becomes an important issue. To this end, we draw inspiration from a well-known problem in physics, the Thomson problem, where one seeks to find a state that distributes N electrons on a unit sphere as evenly as possible with minimum potential energy. In light of this intuition, we reduce the redundancy regularization problem to generic energy minimization, and propose a minimum hyperspherical energy (MHE) objective as generic regularization for neural networks. We also propose a few novel variants of MHE, and provide some insights from a theoretical point of view. Finally, we apply neural networks with MHE regularization to several challenging tasks. Extensive experiments demonstrate the effectiveness of our intuition by showing superior performance with MHE regularization.
Tasks
Published 2018-05-23
URL http://arxiv.org/abs/1805.09298v8
PDF http://arxiv.org/pdf/1805.09298v8.pdf
PWC https://paperswithcode.com/paper/learning-towards-minimum-hyperspherical
Repo https://github.com/wy1iu/MHE
Framework tf
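
A minimal sketch of an MHE-style regularizer for one layer, using the Euclidean (s = 1) energy on unit-normalized neuron weights; the coefficient and the exact variant are illustrative, and the paper proposes several further variants.

```python
import torch

def mhe_regularizer(weight, eps=1e-6):
    """Minimum hyperspherical energy for one layer (Euclidean, s = 1).

    weight: (num_neurons, fan_in) weight matrix. Each neuron is projected
    onto the unit sphere and the pairwise energy 1 / ||w_i - w_j|| is
    summed, so minimizing it pushes neuron directions apart.
    """
    w = torch.nn.functional.normalize(weight, dim=1)   # project to sphere
    dist = torch.cdist(w, w)                           # pairwise distances
    n = w.shape[0]
    off_diag = ~torch.eye(n, dtype=torch.bool, device=w.device)
    return (1.0 / (dist[off_diag] + eps)).sum() / (n * (n - 1))

# Hypothetical usage: add it to the task loss with a small coefficient.
layer = torch.nn.Linear(128, 64)
task_loss = torch.tensor(0.0)                          # placeholder task loss
total_loss = task_loss + 1e-2 * mhe_regularizer(layer.weight)
```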

Selective Clustering Annotated using Modes of Projections

Title Selective Clustering Annotated using Modes of Projections
Authors Evan Greene, Greg Finak, Raphael Gottardo
Abstract Selective clustering annotated using modes of projections (SCAMP) is a new clustering algorithm for data in $\mathbb{R}^p$. SCAMP is motivated from the point of view of non-parametric mixture modeling. Rather than maximizing a classification likelihood to determine cluster assignments, SCAMP casts clustering as a search and selection problem. One consequence of this problem formulation is that the number of clusters is $\textbf{not}$ a SCAMP tuning parameter. The search phase of SCAMP consists of finding sub-collections of the data matrix, called candidate clusters, that obey shape constraints along each coordinate projection. An extension of the dip test of Hartigan and Hartigan (1985) is developed to assist the search. Selection occurs by scoring each candidate cluster with a preference function that quantifies prior belief about the mixture composition. Clustering proceeds by selecting candidates to maximize their total preference score. SCAMP concludes by annotating each selected cluster with labels that describe how cluster-level statistics compare to certain dataset-level quantities. SCAMP can be run multiple times on a single data matrix. Comparison of annotations obtained across iterations provides a measure of clustering uncertainty. Simulation studies and applications to real data are considered. A C++ implementation with R interface is $\href{https://github.com/RGLab/scamp}{available\ online}$.
Tasks
Published 2018-07-26
URL http://arxiv.org/abs/1807.10328v1
PDF http://arxiv.org/pdf/1807.10328v1.pdf
PWC https://paperswithcode.com/paper/selective-clustering-annotated-using-modes-of
Repo https://github.com/RGLab/scamp
Framework none

Maximally Invariant Data Perturbation as Explanation

Title Maximally Invariant Data Perturbation as Explanation
Authors Satoshi Hara, Kouichi Ikeno, Tasuku Soma, Takanori Maehara
Abstract While several feature scoring methods are proposed to explain the output of complex machine learning models, most of them lack formal mathematical definitions. In this study, we propose a novel definition of the feature score using the maximally invariant data perturbation, which is inspired by the idea of adversarial examples. In an adversarial example, one seeks the smallest data perturbation that changes the model’s output. In our proposed approach, we consider the opposite: we seek the maximally invariant data perturbation that does not change the model’s output. In this way, we can identify important input features as the ones with small allowable data perturbations. To find the maximally invariant data perturbation, we formulate the problem as a linear program. Experiments on image classification with VGG16 show that the proposed method can identify relevant parts of images effectively.
Tasks Image Classification
Published 2018-06-19
URL http://arxiv.org/abs/1806.07004v2
PDF http://arxiv.org/pdf/1806.07004v2.pdf
PWC https://paperswithcode.com/paper/maximally-invariant-data-perturbation-as
Repo https://github.com/sato9hara/PertMap
Framework tf
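
A hedged sketch of the linear-programming view: after locally linearizing the model with its input gradient, the largest per-feature perturbation budget that keeps the output approximately unchanged can be found with an off-the-shelf LP solver. This simplification is not the paper's exact formulation; the gradient, budget bound, and tolerance are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def max_invariant_perturbation(grad, eps=0.1, r_max=1.0):
    """Toy LP in the spirit of the paper:
        maximize  sum_i r_i
        s.t.      sum_i |g_i| * r_i <= eps,   0 <= r_i <= r_max,
    where g is the gradient of the model output w.r.t. the input.
    Features that end up with a small allowable r_i are 'important'.
    """
    g = np.abs(grad).ravel()
    d = g.size
    c = -np.ones(d)                       # linprog minimizes, so negate
    res = linprog(c, A_ub=g.reshape(1, -1), b_ub=np.array([eps]),
                  bounds=[(0.0, r_max)] * d, method="highs")
    return res.x                          # allowable perturbation per feature

r = max_invariant_perturbation(np.random.randn(100), eps=0.5)
importance = 1.0 / (r + 1e-9)             # small budget -> important feature
```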

TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer

Title TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
Authors Sicong Huang, Qiyang Li, Cem Anil, Xuchan Bao, Sageev Oore, Roger B. Grosse
Abstract In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and loudness. In principle, one could apply image-based style transfer techniques to a time-frequency representation of an audio signal, but this depends on having a representation that allows independent manipulation of timbre as well as high-quality waveform generation. We introduce TimbreTron, a method for musical timbre transfer which applies “image” domain style transfer to a time-frequency representation of the audio signal, and then produces a high-quality waveform using a conditional WaveNet synthesizer. We show that the Constant Q Transform (CQT) representation is particularly well-suited to convolutional architectures due to its approximate pitch equivariance. Based on human perceptual evaluations, we confirmed that TimbreTron recognizably transferred the timbre while otherwise preserving the musical content, for both monophonic and polyphonic samples.
Tasks Style Transfer
Published 2018-11-22
URL http://arxiv.org/abs/1811.09620v2
PDF http://arxiv.org/pdf/1811.09620v2.pdf
PWC https://paperswithcode.com/paper/timbretron-a-wavenetcyclegancqtaudio-pipeline
Repo https://github.com/huangsicong/TimbreTron
Framework none
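
The sketch below covers only the representation stage of the pipeline: computing the log-magnitude CQT “image” (via librosa) on which the style-transfer network would operate. The CycleGAN translation and conditional WaveNet synthesis are left as commented placeholders since they are full models; parameter values and file names are illustrative.

```python
import numpy as np
import librosa

def to_cqt_image(path, sr=16000, n_bins=84, bins_per_octave=12):
    """Compute the log-magnitude CQT 'image' for the style-transfer stage.
    The CQT's approximate pitch equivariance is what makes it suit
    convolutional architectures."""
    y, sr = librosa.load(path, sr=sr)
    C = librosa.cqt(y, sr=sr, n_bins=n_bins, bins_per_octave=bins_per_octave)
    return np.log1p(np.abs(C))

# Pipeline placeholders (the actual models are a CycleGAN and a
# conditional WaveNet, far too large to sketch here):
# cqt_img   = to_cqt_image("piano.wav")        # hypothetical input file
# cqt_trans = cyclegan_generator(cqt_img)      # timbre translation
# waveform  = conditional_wavenet(cqt_trans)   # waveform synthesis
```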

Model-blind Video Denoising Via Frame-to-frame Training

Title Model-blind Video Denoising Via Frame-to-frame Training
Authors Thibaud Ehret, Axel Davy, Jean-Michel Morel, Gabriele Facciolo, Pablo Arias
Abstract Modeling the processing chain that has produced a video is a difficult reverse engineering task, even when the camera is available. This makes model-based video processing a still more complex task. In this paper we propose a fully blind video denoising method, with off-line and on-line versions. This is achieved by fine-tuning a pre-trained AWGN denoising network to the video with a novel frame-to-frame training strategy. Our denoiser can be used without knowledge of the origin of the video or burst, or of the post-processing steps applied after the camera sensor. The on-line process only requires a couple of frames before achieving visually pleasing results for a wide range of perturbations. It nonetheless reaches state-of-the-art performance for standard Gaussian noise, and can be used off-line with still better performance.
Tasks Denoising, Video Denoising
Published 2018-11-30
URL https://arxiv.org/abs/1811.12766v3
PDF https://arxiv.org/pdf/1811.12766v3.pdf
PWC https://paperswithcode.com/paper/model-blind-video-denoising-via-frame-to
Repo https://github.com/tehret/blind-denoising
Framework none
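
A toy sketch of the frame-to-frame idea: the noisy next frame, warped back to the current frame with precomputed optical flow, is used as the training target for fine-tuning the pretrained denoiser, so no clean ground truth is needed. Occlusion masking and the flow estimation itself are omitted, and all shapes and hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def finetune_frame_to_frame(denoiser, frames, flows, steps_per_pair=20, lr=1e-5):
    """Toy frame-to-frame fine-tuning loop.

    frames: list of noisy frames, each a (1, C, H, W) tensor.
    flows:  list of (1, H, W, 2) sampling grids in [-1, 1] that warp
            frame t+1 back onto frame t (precomputed; flow estimation
            is outside this sketch).
    """
    opt = torch.optim.Adam(denoiser.parameters(), lr=lr)
    for t in range(len(frames) - 1):
        # The warped next frame serves as a noisy (noise2noise-style) target.
        target = F.grid_sample(frames[t + 1], flows[t], align_corners=True)
        for _ in range(steps_per_pair):
            opt.zero_grad()
            loss = F.l1_loss(denoiser(frames[t]), target)
            loss.backward()
            opt.step()
        yield denoiser(frames[t]).detach()   # denoised frame after adaptation
```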

Active Anomaly Detection via Ensembles

Title Active Anomaly Detection via Ensembles
Authors Shubhomoy Das, Md Rakibul Islam, Nitthilan Kannappan Jayakodi, Janardhan Rao Doppa
Abstract In critical applications of anomaly detection including computer security and fraud prevention, the anomaly detector must be configurable by the analyst to minimize the effort on false positives. One important way to configure the anomaly detector is by providing true labels for a few instances. We study the problem of label-efficient active learning to automatically tune anomaly detection ensembles and make four main contributions. First, we present an important insight into how anomaly detector ensembles are naturally suited for active learning. This insight allows us to relate the greedy querying strategy to uncertainty sampling, with implications for label-efficiency. Second, we present a novel formalism called compact description to describe the discovered anomalies and show that it can also be employed to improve the diversity of the instances presented to the analyst without loss in the anomaly discovery rate. Third, we present a novel data drift detection algorithm that not only detects the drift robustly, but also allows us to take corrective actions to adapt the detector in a principled manner. Fourth, we present extensive experiments to evaluate our insights and algorithms in both batch and streaming settings. Our results show that in addition to discovering significantly more anomalies than state-of-the-art unsupervised baselines, our active learning algorithms under the streaming-data setup are competitive with the batch setup.
Tasks Active Learning, Anomaly Detection
Published 2018-09-17
URL http://arxiv.org/abs/1809.06477v1
PDF http://arxiv.org/pdf/1809.06477v1.pdf
PWC https://paperswithcode.com/paper/active-anomaly-detection-via-ensembles
Repo https://github.com/shubhomoydas/ad_examples
Framework tf
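
A toy sketch of the greedy query loop over an ensemble: detector scores are combined with learned weights, the top-ranked unlabeled instance is shown to the analyst, and the weights are nudged based on the feedback. The weight update here is a deliberately simple stand-in for the paper's optimization-based update.

```python
import numpy as np

def active_anomaly_loop(scores, labels, budget=20, lr=0.5):
    """Greedy active-learning loop over an ensemble score matrix.

    scores: (n_instances, n_detectors) anomaly scores from the ensemble.
    labels: oracle labels (1 = anomaly, 0 = nominal), consulted lazily.
    """
    n, m = scores.shape
    w = np.ones(m) / m                     # start with uniform weights
    queried = []
    for _ in range(budget):
        combined = scores @ w
        combined[queried] = -np.inf        # never re-query an instance
        i = int(np.argmax(combined))       # greedy: most anomalous unlabeled
        queried.append(i)
        direction = 1.0 if labels[i] == 1 else -1.0
        w = np.clip(w + lr * direction * scores[i], 1e-6, None)
        w = w / w.sum()                    # keep weights normalized
    return queried, w
```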

Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations

Title Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations
Authors Dimitri P. Bertsekas
Abstract In this paper we discuss policy iteration methods for approximate solution of a finite-state discounted Markov decision problem, with a focus on feature-based aggregation methods and their connection with deep reinforcement learning schemes. We introduce features of the states of the original problem, and we formulate a smaller “aggregate” Markov decision problem, whose states relate to the features. We discuss properties and possible implementations of this type of aggregation, including a new approach to approximate policy iteration. In this approach the policy improvement operation combines feature-based aggregation with feature construction using deep neural networks or other calculations. We argue that the cost function of a policy may be approximated much more accurately by the nonlinear function of the features provided by aggregation, than by the linear function of the features provided by neural network-based reinforcement learning, thereby potentially leading to more effective policy improvement.
Tasks
Published 2018-04-12
URL http://arxiv.org/abs/1804.04577v3
PDF http://arxiv.org/pdf/1804.04577v3.pdf
PWC https://paperswithcode.com/paper/feature-based-aggregation-and-deep
Repo https://github.com/greysun/Papers-for-deep-learning
Framework tf
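
A small sketch of the aggregation idea on a toy finite MDP: states are grouped by their feature vectors, the resulting aggregate MDP is solved exactly by value iteration, and the aggregate values are lifted back to the original states. The k-means grouping and the pooling of transitions are illustrative simplifications of the aggregation framework discussed in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def aggregate_value_iteration(P, R, phi, n_agg=10, gamma=0.95, iters=500):
    """Feature-based aggregation on a toy finite MDP.

    P:   (S, A, S) transition probabilities of the original MDP.
    R:   (S, A) expected rewards.
    phi: (S, d) feature vectors of the states (e.g. from a neural net).
    """
    S, A, _ = P.shape
    groups = KMeans(n_clusters=n_agg, n_init=10).fit_predict(phi)

    # Aggregate dynamics: average transitions/rewards of member states,
    # then pool destination probabilities into aggregate states.
    P_agg = np.zeros((n_agg, A, n_agg))
    R_agg = np.zeros((n_agg, A))
    for g in range(n_agg):
        members = np.where(groups == g)[0]
        R_agg[g] = R[members].mean(axis=0)
        for h in range(n_agg):
            P_agg[g, :, h] = P[members][:, :, groups == h].sum(axis=2).mean(axis=0)

    # Value iteration on the small aggregate problem.
    V = np.zeros(n_agg)
    for _ in range(iters):
        V = (R_agg + gamma * P_agg @ V).max(axis=1)

    return V[groups]   # approximate values of the original states
```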