October 20, 2019

2793 words 14 mins read

Paper Group AWR 227


Kymatio: Scattering Transforms in Python. An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge. Neural Baby Talk. PCL: Proposal Cluster Learning for Weakly Supervised Object Detection. Clustering via Boundary Erosion. N-ary Relation Extraction using Graph State LSTM. Datasheets for Datasets. Deep neural decoders for near term fau …

Kymatio: Scattering Transforms in Python

Title Kymatio: Scattering Transforms in Python
Authors Mathieu Andreux, Tomás Angles, Georgios Exarchakis, Roberto Leonarduzzi, Gaspar Rochette, Louis Thiry, John Zarka, Stéphane Mallat, Joakim Andén, Eugene Belilovsky, Joan Bruna, Vincent Lostanlen, Matthew J. Hirn, Edouard Oyallon, Sixin Zhang, Carmine Cella, Michael Eickenberg
Abstract The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks. All transforms may be executed on a GPU (in addition to CPU), offering a considerable speed-up over CPU implementations. The package also has a small memory footprint, resulting in efficient memory usage. The source code, documentation, and examples are available under a BSD license at https://www.kymat.io/
Tasks
Published 2018-12-28
URL https://arxiv.org/abs/1812.11214v2
PDF https://arxiv.org/pdf/1812.11214v2.pdf
PWC https://paperswithcode.com/paper/kymatio-scattering-transforms-in-python
Repo https://github.com/kymatio/kymatio
Framework pytorch
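To make the representation concrete, here is a toy first-order 1D scattering sketch in pure Python: averaged modulus of band-pass filter responses, the kind of invariant coefficients Kymatio computes (the filters and function names here are illustrative stand-ins, not Kymatio's actual wavelets or API).

```python
def convolve(x, h):
    """Circular convolution of signal x with filter h (same length as x)."""
    n = len(x)
    return [sum(x[(i - k) % n] * h[k] for k in range(len(h))) for i in range(n)]

def scattering_1d(x, filters):
    """Zeroth- and first-order scattering coefficients:
    S0 = mean(x), and S1_j = mean(|x * psi_j|) for each band-pass filter psi_j.
    Global averaging makes the output invariant to circular shifts."""
    s0 = sum(x) / len(x)
    s1 = []
    for psi in filters:
        y = convolve(x, psi)
        s1.append(sum(abs(v) for v in y) / len(y))
    return [s0] + s1

# Tiny finite-difference filters standing in for wavelets at two scales.
filters = [[1, -1], [1, 0, -1]]
signal = [0.0, 1.0, 0.0, -1.0] * 4
coeffs = scattering_1d(signal, filters)
```

Shifting the input circularly leaves `coeffs` unchanged, which is the invariance property the abstract refers to; Kymatio computes the same kind of coefficients with proper wavelet filter banks on GPU.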

An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge

Title An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge
Authors Xiao Sun, Chuankang Li, Stephen Lin
Abstract For the ECCV 2018 PoseTrack Challenge, we present a 3D human pose estimation system based mainly on the integral human pose regression method. We show a comprehensive ablation study to examine the key performance factors of the proposed system. Our system obtains 47mm MPJPE on the CHALL_H80K test dataset, placing second in the ECCV2018 3D human pose estimation challenge. Code will be released to facilitate future work.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2018-09-17
URL http://arxiv.org/abs/1809.06079v1
PDF http://arxiv.org/pdf/1809.06079v1.pdf
PWC https://paperswithcode.com/paper/an-integral-pose-regression-system-for-the
Repo https://github.com/JimmySuen/integral-human-pose
Framework pytorch
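The core of integral pose regression is replacing a hard argmax over a heatmap with a differentiable expectation (soft-argmax). A minimal 1D sketch, with a made-up heatmap for illustration:

```python
import math

def soft_argmax_1d(heatmap):
    """Integral (soft-argmax) coordinate estimate: the expected position
    under the softmax of the heatmap, which is differentiable, unlike a
    hard argmax over bins."""
    m = max(heatmap)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in heatmap]
    z = sum(exps)
    probs = [e / z for e in exps]
    return sum(i * p for i, p in enumerate(probs))

heat = [0.0, 0.1, 4.0, 0.1, 0.0]           # peak at index 2
coord = soft_argmax_1d(heat)
```

In the actual system this expectation is taken over 2D/3D joint heatmaps, so the whole pipeline can be trained with a coordinate regression loss end to end.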

Neural Baby Talk

Title Neural Baby Talk
Authors Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh
Abstract We introduce a novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image. Our approach reconciles classical slot filling approaches (that are generally better grounded in images) with modern neural captioning approaches (that are generally more natural sounding and accurate). Our approach first generates a sentence 'template' with slot locations explicitly tied to specific image regions. These slots are then filled in by visual concepts identified in the regions by object detectors. The entire architecture (sentence template generation and slot filling with object detectors) is end-to-end differentiable. We verify the effectiveness of our proposed model on different image captioning tasks. On standard image captioning and novel object captioning, our model reaches state-of-the-art on both COCO and Flickr30k datasets. We also demonstrate that our model has unique advantages when the train and test distributions of scene compositions – and hence language priors of associated captions – are different. Code has been made available at: https://github.com/jiasenlu/NeuralBabyTalk
Tasks Image Captioning, Slot Filling
Published 2018-03-27
URL http://arxiv.org/abs/1803.09845v1
PDF http://arxiv.org/pdf/1803.09845v1.pdf
PWC https://paperswithcode.com/paper/neural-baby-talk
Repo https://github.com/jiasenlu/NeuralBabyTalk
Framework pytorch
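The template-plus-slots decomposition can be illustrated with a trivial sketch (the detections, template tokens, and helper below are invented for illustration; in the paper both the template and the grounding are produced by a neural network and trained end to end):

```python
def fill_template(template, detections):
    """Replace each <slot:i> token with the category detected in region i."""
    out = []
    for tok in template:
        if tok.startswith("<slot:"):
            region = int(tok[6:-1])        # "<slot:0>" -> 0
            out.append(detections[region]["category"])
        else:
            out.append(tok)
    return " ".join(out)

# Hypothetical object-detector outputs: category + bounding box per region.
detections = [{"category": "dog", "box": (10, 20, 80, 90)},
              {"category": "frisbee", "box": (60, 15, 95, 40)}]
template = ["a", "<slot:0>", "catches", "a", "<slot:1>"]
caption = fill_template(template, detections)
```

The point of the decomposition is that the visual words in the caption are guaranteed to be grounded in detected regions rather than hallucinated from language priors.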

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

Title PCL: Proposal Cluster Learning for Weakly Supervised Object Detection
Authors Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille
Abstract Weakly Supervised Object Detection (WSOD), using only image-level annotations to train object detectors, is of growing importance in object recognition. In this paper, we propose a novel deep network for WSOD. Unlike previous networks that transfer the object detection problem to an image classification problem using Multiple Instance Learning (MIL), our strategy generates proposal clusters to learn refined instance classifiers by an iterative process. The proposals in the same cluster are spatially adjacent and associated with the same object. This prevents the network from concentrating too much on parts of objects instead of whole objects. We first show that instances can be assigned object or background labels directly based on proposal clusters for instance classifier refinement, and then show that treating each cluster as a small new bag yields fewer ambiguities than directly assigning labels. The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first stream is an MIL network and each subsequent stream refines the instance classifier under supervision from the preceding one. Experiments are conducted on the PASCAL VOC, ImageNet detection, and MS-COCO benchmarks for WSOD. Results show that our method outperforms the previous state of the art significantly.
Tasks Multiple Instance Learning, Object Detection, Object Recognition, Weakly Supervised Object Detection
Published 2018-07-09
URL http://arxiv.org/abs/1807.03342v2
PDF http://arxiv.org/pdf/1807.03342v2.pdf
PWC https://paperswithcode.com/paper/pcl-proposal-cluster-learning-for-weakly
Repo https://github.com/ppengtang/oicr
Framework pytorch
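The "spatially adjacent proposals form a cluster" idea can be sketched as connected components under an IoU threshold (a simplification of PCL's actual cluster-center-based generation; the threshold and boxes below are illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def cluster_proposals(boxes, thr=0.5):
    """Group proposals into clusters of spatially adjacent boxes (IoU >= thr)
    via connected components; returns a cluster label per proposal."""
    n, labels, cur = len(boxes), [-1] * len(boxes), 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], cur
        while stack:                      # flood-fill one component
            j = stack.pop()
            for k in range(n):
                if labels[k] == -1 and iou(boxes[j], boxes[k]) >= thr:
                    labels[k] = cur
                    stack.append(k)
        cur += 1
    return labels

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
labels = cluster_proposals(boxes)       # first two overlap, third stands alone
```

Each cluster can then be treated as a small bag for the refinement streams, which is the labeling strategy the abstract argues is less ambiguous.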

Clustering via Boundary Erosion

Title Clustering via Boundary Erosion
Authors Cheng-Hao Deng, Wan-Lei Zhao
Abstract Clustering analysis identifies samples as groups based on either their mutual closeness or homogeneity. In order to detect clusters of arbitrary shape, a novel and generic solution based on boundary erosion is proposed. The clusters are assumed to be separated by relatively sparse regions. Samples are eroded sequentially according to their dynamic boundary densities: the erosion starts from low-density regions and invades inwards until all samples have been eroded. In this manner, the boundaries between different clusters become increasingly apparent, offering a natural and powerful way to separate clusters whose boundaries are hard to draw all at once. The order in which samples are eroded induces a sequence of boundary levels, from which clusters of arbitrary shape are automatically reconstructed. As demonstrated across various clustering tasks, the method outperforms most state-of-the-art algorithms, and its performance is nearly perfect in some scenarios.
Tasks
Published 2018-04-12
URL http://arxiv.org/abs/1804.04312v2
PDF http://arxiv.org/pdf/1804.04312v2.pdf
PWC https://paperswithcode.com/paper/clustering-via-boundary-erosion
Repo https://github.com/redfoxdch/beClustering
Framework none
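A toy 1D sketch of the erosion order (using a simple inverse k-NN-distance density in place of the paper's dynamic boundary densities): sparse boundary points are eroded first, dense cluster cores last.

```python
def knn_density(points, i, k=2):
    """Inverse mean distance from point i to its k nearest neighbours
    (a crude stand-in for the paper's boundary density)."""
    d = sorted(abs(points[i] - p) for j, p in enumerate(points) if j != i)
    return 1.0 / (sum(d[:k]) / k + 1e-9)

def erosion_order(points, k=2):
    """Indices sorted from lowest to highest density: the order in which
    samples would be eroded, boundary points before cluster cores."""
    return sorted(range(len(points)), key=lambda i: knn_density(points, i, k))

# Two dense groups separated by one sparse in-between point at 5.0.
points = [0.0, 0.1, 0.2, 5.0, 9.8, 9.9, 10.0]
order = erosion_order(points)
```

The isolated point at 5.0 (index 3) has the lowest density and is eroded first, so the two groups separate early; the full method additionally records erosion levels and reconstructs clusters from them.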

N-ary Relation Extraction using Graph State LSTM

Title N-ary Relation Extraction using Graph State LSTM
Authors Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Abstract Cross-sentence $n$-ary relation extraction detects relations among $n$ entities across multiple sentences. Typical methods formulate an input as a \textit{document graph}, integrating various intra-sentential and inter-sentential dependencies. The current state-of-the-art method splits the input graph into two DAGs, adopting a DAG-structured LSTM for each. Though being able to model rich linguistic knowledge by leveraging graph edges, important information can be lost in the splitting procedure. We propose a graph-state LSTM model, which uses a parallel state to model each word, recurrently enriching state values via message passing. Compared with DAG LSTMs, our graph LSTM keeps the original graph structure, and speeds up computation by allowing more parallelization. On a standard benchmark, our model shows the best result in the literature.
Tasks Relation Extraction
Published 2018-08-28
URL http://arxiv.org/abs/1808.09101v1
PDF http://arxiv.org/pdf/1808.09101v1.pdf
PWC https://paperswithcode.com/paper/n-ary-relation-extraction-using-graph-state
Repo https://github.com/freesunshine0316/nary-grn
Framework tf
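The graph-state recurrence can be illustrated with a bare-bones synchronous message-passing step (plain averaging in place of the paper's LSTM gating; the numbers are illustrative): every word holds a state, and all states are updated in parallel from their neighbours, which is what enables the parallel speed-up over DAG LSTMs.

```python
def message_passing_step(states, edges):
    """One synchronous update over an undirected graph: each node's new state
    mixes its own state with the mean of its neighbours' states."""
    nbrs = {i: [] for i in states}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    new = {}
    for i, h in states.items():
        msg = sum(states[j] for j in nbrs[i]) / len(nbrs[i]) if nbrs[i] else 0.0
        new[i] = 0.5 * h + 0.5 * msg
    return new

states = {0: 1.0, 1: 0.0, 2: 0.0}    # only word 0 carries the signal
edges = [(0, 1), (1, 2)]             # chain: 0 - 1 - 2
after_one = message_passing_step(states, edges)
```

After one step the signal reaches word 1 but not word 2; after two steps it reaches word 2. Stacking steps thus lets information flow across the whole document graph without ever splitting it into DAGs.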

Datasheets for Datasets

Title Datasheets for Datasets
Authors Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford
Abstract The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended uses, and other information. By analogy, we propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on. Datasheets for datasets will facilitate better communication between dataset creators and dataset consumers, and encourage the machine learning community to prioritize transparency and accountability.
Tasks
Published 2018-03-23
URL https://arxiv.org/abs/1803.09010v7
PDF https://arxiv.org/pdf/1803.09010v7.pdf
PWC https://paperswithcode.com/paper/datasheets-for-datasets
Repo https://github.com/eric-erki/IdenProf
Framework tf
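The proposal is a documentation practice rather than an algorithm, but a machine-checkable datasheet can be sketched directly from the section headings named in the abstract (the field names and validation below are our own minimal rendering, not the paper's full question list):

```python
# Section headings taken from the abstract's list of what a datasheet documents.
DATASHEET_SECTIONS = ("motivation", "composition",
                      "collection_process", "recommended_uses")

def make_datasheet(**sections):
    """Build a datasheet dict, refusing to proceed if any section is missing,
    so incomplete documentation fails loudly instead of silently."""
    missing = [s for s in DATASHEET_SECTIONS if s not in sections]
    if missing:
        raise ValueError(f"datasheet incomplete, missing: {missing}")
    return {s: sections[s] for s in DATASHEET_SECTIONS}

sheet = make_datasheet(
    motivation="benchmark object detection in street scenes",
    composition="10k images, 80 categories, CC-BY licensed",
    collection_process="scraped 2017-2018, manually annotated",
    recommended_uses="research only; known geographic sampling bias",
)
```

Enforcing completeness at creation time is one way to operationalize the communication between dataset creators and consumers that the paper argues for.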

Deep neural decoders for near term fault-tolerant experiments

Title Deep neural decoders for near term fault-tolerant experiments
Authors Christopher Chamberland, Pooya Ronagh
Abstract Finding efficient decoders for quantum error correcting codes adapted to realistic experimental noise in fault-tolerant devices represents a significant challenge. In this paper we introduce several decoding algorithms complemented by deep neural decoders and apply them to analyze several fault-tolerant error correction protocols such as the surface code as well as Steane and Knill error correction. Our methods require no knowledge of the underlying noise model afflicting the quantum device, making them appealing for real-world experiments. Our analysis is based on a full circuit-level noise model. It considers both distance-three and distance-five codes, and is performed near the codes' pseudo-threshold regime. Training deep neural decoders in low noise rate regimes appears to be a challenging machine learning endeavour. We provide a detailed description of our neural network architectures and training methodology. We then discuss both the advantages and limitations of deep neural decoders. Lastly, we provide a rigorous analysis of the decoding runtime of trained deep neural decoders and compare our methods with anticipated gate times in future quantum devices. Given the broad applications of our decoding schemes, we believe that the methods presented in this paper could have practical applications for near term fault-tolerant experiments.
Tasks
Published 2018-02-18
URL http://arxiv.org/abs/1802.06441v2
PDF http://arxiv.org/pdf/1802.06441v2.pdf
PWC https://paperswithcode.com/paper/deep-neural-decoders-for-near-term-fault
Repo https://github.com/pooya-git/DeepNeuralDecoder
Framework tf
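What a decoder maps can be shown in the simplest possible setting: a lookup-table decoder for the 3-qubit bit-flip repetition code (a far smaller code than the surface, Steane, or Knill protocols studied in the paper, but the same syndrome-to-correction structure that the neural decoders learn):

```python
def syndrome(bits):
    """Parity checks Z1Z2 and Z2Z3 for a 3-bit repetition codeword."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

# Syndrome -> index of the single most likely flipped bit (None = no flip).
LOOKUP = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def decode(bits):
    """Measure the syndrome and apply the indicated single-bit correction."""
    fix = LOOKUP[syndrome(bits)]
    out = list(bits)
    if fix is not None:
        out[fix] ^= 1
    return out
```

For realistic circuit-level noise the optimal syndrome-to-correction map is no longer a small table like `LOOKUP`; the paper's contribution is training neural networks to approximate it without assuming a noise model.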

Advanced Super-Resolution using Lossless Pooling Convolutional Networks

Title Advanced Super-Resolution using Lossless Pooling Convolutional Networks
Authors Farzad Toutounchi, Ebroul Izquierdo
Abstract In this paper, we present a novel deep learning-based approach for still-image super-resolution that, unlike mainstream models, does not rely solely on the input low-resolution image for high-quality upsampling. Instead, it takes advantage of a set of artificially created auxiliary self-replicas of the input image, which are incorporated into the neural network to create an enhanced and accurate upscaling scheme. The proposed lossless pooling layers and the fusion of the input self-replicas enable the model to exploit the high correlation between multiple instances of the same content, resulting in significant improvements in super-resolution quality, as confirmed by extensive evaluations.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-12-14
URL http://arxiv.org/abs/1812.06023v1
PDF http://arxiv.org/pdf/1812.06023v1.pdf
PWC https://paperswithcode.com/paper/advanced-super-resolution-using-lossless
Repo https://github.com/gan3sh500/custom-pooling
Framework pytorch
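One common way to pool without discarding pixels, consistent with the abstract's "lossless pooling" and "self-replicas" wording, is a polyphase (space-to-depth style) split: 2x downsampling that keeps all four sampling phases as separate half-resolution replicas. This is our reading of the idea, sketched on a nested-list image, not the authors' exact layer:

```python
def lossless_pool_2x(img):
    """Split an HxW image (H, W even) into four (H/2)x(W/2) phase images.
    Unlike max/average pooling, no pixel is discarded: the four replicas
    together contain exactly the original pixels."""
    replicas = []
    for dy in (0, 1):
        for dx in (0, 1):
            replicas.append([row[dx::2] for row in img[dy::2]])
    return replicas

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
phases = lossless_pool_2x(img)
```

Each phase is a shifted subsampling of the same content, so the replicas are highly correlated, which is exactly the redundancy the network is meant to fuse and exploit.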

Learning Deep Representations with Probabilistic Knowledge Transfer

Title Learning Deep Representations with Probabilistic Knowledge Transfer
Authors Nikolaos Passalis, Anastasios Tefas
Abstract Knowledge Transfer (KT) techniques tackle the problem of transferring the knowledge from a large and complex neural network into a smaller and faster one. However, existing KT methods are tailored towards classification tasks and cannot be used efficiently for other representation learning tasks. In this paper, a novel knowledge transfer technique is proposed that is capable of training a student model to maintain the same amount of mutual information between the learned representation and a set of (possibly unknown) labels as the teacher model. Apart from outperforming existing KT techniques, the proposed method overcomes several limitations of existing methods, providing new insight into KT as well as novel KT applications, ranging from knowledge transfer from handcrafted feature extractors to cross-modal KT from the textual modality into the representation extracted from the visual modality of the data.
Tasks Representation Learning, Transfer Learning
Published 2018-03-28
URL http://arxiv.org/abs/1803.10837v3
PDF http://arxiv.org/pdf/1803.10837v3.pdf
PWC https://paperswithcode.com/paper/learning-deep-representations-with
Repo https://github.com/passalis/probabilistic_kt
Framework pytorch
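The mechanism can be sketched as matching pairwise-similarity distributions: turn each sample's similarities to the others into a probability distribution in both the teacher and student spaces, then penalize their divergence (a cosine kernel and KL divergence below; the paper studies the general kernel-based formulation, and the features here are toy data):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def similarity_distribution(feats, i):
    """Probability that sample i 'selects' each other sample, derived from
    (clipped) cosine similarities in the feature space."""
    sims = [max(cosine(feats[i], feats[j]), 0.0) + 1e-6
            for j in range(len(feats)) if j != i]
    z = sum(sims)
    return [s / z for s in sims]

def pkt_loss(teacher_feats, student_feats):
    """Sum over samples of KL(teacher distribution || student distribution)."""
    loss = 0.0
    for i in range(len(teacher_feats)):
        p = similarity_distribution(teacher_feats, i)
        q = similarity_distribution(student_feats, i)
        loss += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return loss

teacher = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
aligned = [[2.0, 0.0], [1.8, 0.2], [0.0, 2.0]]   # same geometry, different scale
loss_same = pkt_loss(teacher, aligned)
```

Because only the relative geometry is matched, the student is free to use a different dimensionality or scale than the teacher, which is what makes the approach applicable beyond classification.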

Neural Voice Cloning with a Few Samples

Title Neural Voice Cloning with a Few Samples
Authors Sercan O. Arik, Jitong Chen, Kainan Peng, Wei Ping, Yanqi Zhou
Abstract Voice cloning is a highly desired feature for personalized speech interfaces. Neural network based speech synthesis has been shown to generate high quality speech for a large number of speakers. In this paper, we introduce a neural voice cloning system that takes a few audio samples as input. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-tuning a multi-speaker generative model with a few cloning samples. Speaker encoding is based on training a separate model to directly infer a new speaker embedding from cloning audios, which is then used with a multi-speaker generative model. In terms of naturalness of the speech and its similarity to the original speaker, both approaches can achieve good performance, even with very few cloning audios. While speaker adaptation achieves better naturalness and similarity, the cloning time and required memory of the speaker encoding approach are significantly lower, making it favorable for low-resource deployment.
Tasks Speech Synthesis
Published 2018-02-14
URL http://arxiv.org/abs/1802.06006v3
PDF http://arxiv.org/pdf/1802.06006v3.pdf
PWC https://paperswithcode.com/paper/neural-voice-cloning-with-a-few-samples
Repo https://github.com/SforAiDl/Neural-Voice-Cloning-With-Few-Samples
Framework pytorch

Distribution Matching Losses Can Hallucinate Features in Medical Image Translation

Title Distribution Matching Losses Can Hallucinate Features in Medical Image Translation
Authors Joseph Paul Cohen, Margaux Luck, Sina Honari
Abstract This paper discusses how distribution matching losses, such as those used in CycleGAN, when used to synthesize medical images can lead to mis-diagnosis of medical conditions. It seems appealing to use these new image synthesis methods for translating images from a source to a target domain because they can produce high quality images and some even do not require paired data. However, the basis of how these image translation models work is through matching the translation output to the distribution of the target domain. This can cause an issue when the data provided in the target domain has an over or under representation of some classes (e.g. healthy or sick). When the output of an algorithm is a transformed image, there is no certainty that all known and unknown class labels have been preserved rather than changed. Therefore, we recommend that these translated images should not be used for direct interpretation (e.g. by doctors), because they may lead to misdiagnosis of patients based on image features hallucinated by an algorithm that matches a distribution. However, many recent papers appear to present exactly this use as the goal.
Tasks Image Generation
Published 2018-05-22
URL http://arxiv.org/abs/1805.08841v3
PDF http://arxiv.org/pdf/1805.08841v3.pdf
PWC https://paperswithcode.com/paper/distribution-matching-losses-can-hallucinate
Repo https://github.com/ieee8023/dist-bias
Framework pytorch

Accuracy-based Curriculum Learning in Deep Reinforcement Learning

Title Accuracy-based Curriculum Learning in Deep Reinforcement Learning
Authors Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer
Abstract In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning. Using a reinforcement learning agent based on the Deep Deterministic Policy Gradient algorithm and addressing the Reacher environment, we first show that an agent trained with various accuracy requirements sampled randomly learns more efficiently than when asked to be very accurate at all times. Then we show that adaptive selection of accuracy requirements, based on a local measure of competence progress, automatically generates a curriculum where difficulty progressively increases, resulting in a better learning efficiency than sampling randomly.
Tasks
Published 2018-06-25
URL http://arxiv.org/abs/1806.09614v2
PDF http://arxiv.org/pdf/1806.09614v2.pdf
PWC https://paperswithcode.com/paper/accuracy-based-curriculum-learning-in-deep
Repo https://github.com/fabian57fabian/MinGrid-Improved-RL-Methods
Framework pytorch
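The selection rule can be sketched as a bandit over accuracy requirements, choosing the one whose recent success rate is changing fastest (a simplified rendering of the paper's local competence-progress measure; the thresholds and history below are invented):

```python
def select_requirement(history, epsilons, window=2):
    """Pick the accuracy requirement (target epsilon) whose success rate
    improved the most between the last two windows of attempts, i.e. the
    one with the highest absolute learning progress."""
    def progress(eps):
        runs = history.get(eps, [])
        if len(runs) < 2 * window:
            return float("inf")          # too little data: explore it first
        recent = sum(runs[-window:]) / window
        older = sum(runs[-2 * window:-window]) / window
        return abs(recent - older)
    return max(epsilons, key=progress)

# Success (1) / failure (0) records per target accuracy epsilon.
history = {0.05: [0, 0, 0, 0],           # still too hard: no progress
           0.20: [0, 0, 1, 1],           # being learned right now
           0.50: [1, 1, 1, 1]}           # already mastered: no progress
choice = select_requirement(history, [0.05, 0.20, 0.50])
```

The agent thus spends its training time where competence is actually improving, which is what produces the automatically increasing difficulty curriculum described in the abstract.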

Neural Clustering Processes

Title Neural Clustering Processes
Authors Ari Pakman, Yueqi Wang, Catalin Mitelut, JinHyung Lee, Liam Paninski
Abstract Probabilistic clustering models (or equivalently, mixture models) are basic building blocks in countless statistical models and involve latent random variables over discrete spaces. For these models, posterior inference methods can be inaccurate and/or very slow. In this work we introduce deep network architectures trained with labeled samples from any generative model of clustered datasets. At test time, the networks generate approximate posterior samples of cluster labels for any new dataset of arbitrary size. We develop two complementary approaches to this task, requiring either O(N) or O(K) network forward passes per dataset, where N is the dataset size and K the number of clusters. Unlike previous approaches, our methods sample the labels of all the data points from a well-defined posterior, and can learn nonparametric Bayesian posteriors since they do not limit the number of mixture components. Moreover, the algorithms are easily parallelized with a GPU. As a scientific application, we present a novel approach to neural spike sorting for high-density multielectrode arrays.
Tasks Bayesian Inference
Published 2018-12-28
URL https://arxiv.org/abs/1901.00409v3
PDF https://arxiv.org/pdf/1901.00409v3.pdf
PWC https://paperswithcode.com/paper/discrete-neural-processes
Repo https://github.com/aripakman/neural_clustering_process
Framework pytorch

Optimal Piecewise Local-Linear Approximations

Title Optimal Piecewise Local-Linear Approximations
Authors Kartik Ahuja, William Zame, Mihaela van der Schaar
Abstract Existing works on “black-box” model interpretation use local-linear approximations to explain the predictions made for each data instance in terms of the importance assigned to the different features for arriving at the prediction. These works provide instancewise explanations and thus give a local view of the model. To be able to trust the model it is important to understand the global model behavior, and there are relatively few works which do the same. Piecewise local-linear models provide a natural way to extend local-linear models to explain the global behavior of the model. In this work, we provide a dynamic programming based framework to obtain piecewise approximations of the black-box model. We also provide provable guarantees on fidelity, i.e., how well the explanations reflect the black-box model. We carry out simulations on synthetic and real datasets to show the utility of the proposed approach. At the end, we show that the ideas developed for our framework can also be used to address the problem of clustering for one-dimensional data. We give a polynomial time algorithm and prove that it achieves optimal clustering.
Tasks
Published 2018-06-27
URL https://arxiv.org/abs/1806.10270v4
PDF https://arxiv.org/pdf/1806.10270v4.pdf
PWC https://paperswithcode.com/paper/piecewise-approximations-of-black-box-models
Repo https://github.com/ahujak/Piecewise-Local-Linear-Model
Framework none
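The one-dimensional clustering claim rests on a classic observation: on sorted data, optimal clusters are contiguous segments, so dynamic programming over split points finds the exact optimum in polynomial time. A minimal sketch of that DP (our own rendering of the general idea, minimizing within-cluster squared error; not the authors' exact objective or code):

```python
def segment_cost(xs, i, j):
    """Sum of squared deviations of xs[i:j] from its mean."""
    seg = xs[i:j]
    mu = sum(seg) / len(seg)
    return sum((x - mu) ** 2 for x in seg)

def optimal_1d_clustering(xs, k):
    """Best split of sorted xs into k contiguous clusters.
    cost[i][j] = min cost of covering the first i points with j clusters."""
    xs = sorted(xs)
    n = len(xs)
    INF = float("inf")
    cost = [[INF] * (k + 1) for _ in range(n + 1)]
    back = [[0] * (k + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):    # last cluster is xs[s:i]
                c = cost[s][j - 1] + segment_cost(xs, s, i)
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, s
    bounds, i = [], n                    # recover the split points
    for j in range(k, 0, -1):
        bounds.append(i)
        i = back[i][j]
    return cost[n][k], sorted(bounds)

data = [0.1, 0.0, 0.2, 10.0, 10.1, 9.9]
total, bounds = optimal_1d_clustering(data, 2)
```

Unlike k-means, this is guaranteed optimal; the cubic toy loop above can be tightened with prefix sums, which is how such algorithms reach practical polynomial runtimes.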