October 20, 2019

2793 words 14 mins read

Paper Group AWR 227


Kymatio: Scattering Transforms in Python. An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge. Neural Baby Talk. PCL: Proposal Cluster Learning for Weakly Supervised Object Detection. Clustering via Boundary Erosion. N-ary Relation Extraction using Graph State LSTM. Datasheets for Datasets. Deep neural decoders for near term fau …

Kymatio: Scattering Transforms in Python

Title Kymatio: Scattering Transforms in Python
Authors Mathieu Andreux, Tomás Angles, Georgios Exarchakis, Roberto Leonarduzzi, Gaspar Rochette, Louis Thiry, John Zarka, Stéphane Mallat, Joakim Andén, Eugene Belilovsky, Joan Bruna, Vincent Lostanlen, Matthew J. Hirn, Edouard Oyallon, Sixin Zhang, Carmine Cella, Michael Eickenberg
Abstract The wavelet scattering transform is an invariant signal representation suitable for many signal processing and machine learning applications. We present the Kymatio software package, an easy-to-use, high-performance Python implementation of the scattering transform in 1D, 2D, and 3D that is compatible with modern deep learning frameworks. All transforms may be executed on a GPU (in addition to CPU), offering a considerable speed-up over CPU implementations. The package also has a small memory footprint, resulting in efficient memory usage. The source code, documentation, and examples are available under a BSD license at https://www.kymat.io/
Tasks
Published 2018-12-28
URL https://arxiv.org/abs/1812.11214v2
PDF https://arxiv.org/pdf/1812.11214v2.pdf
PWC https://paperswithcode.com/paper/kymatio-scattering-transforms-in-python
Repo https://github.com/kymatio/kymatio
Framework pytorch
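To make the representation concrete, here is a toy first-order 1D scattering sketch in pure Python: averaged modulus of band-pass filter responses, the kind of invariant coefficients Kymatio computes (the filters and function names here are illustrative stand-ins, not Kymatio's actual wavelets or API).

```python
def convolve(x, h):
    """Circular convolution of signal x with filter h (same length as x)."""
    n = len(x)
    return [sum(x[(i - k) % n] * h[k] for k in range(len(h))) for i in range(n)]

def scattering_1d(x, filters):
    """Zeroth- and first-order scattering coefficients:
    S0 = mean(x), and S1_j = mean(|x * psi_j|) for each band-pass filter psi_j.
    Global averaging makes the output invariant to circular shifts."""
    s0 = sum(x) / len(x)
    s1 = []
    for psi in filters:
        y = convolve(x, psi)
        s1.append(sum(abs(v) for v in y) / len(y))
    return [s0] + s1

# Tiny finite-difference filters standing in for wavelets at two scales.
filters = [[1, -1], [1, 0, -1]]
signal = [0.0, 1.0, 0.0, -1.0] * 4
coeffs = scattering_1d(signal, filters)
```

Shifting the input circularly leaves `coeffs` unchanged, which is the invariance property the abstract refers to; Kymatio computes the same kind of coefficients with proper wavelet filter banks on GPU.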

An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge

Title An Integral Pose Regression System for the ECCV2018 PoseTrack Challenge
Authors Xiao Sun, Chuankang Li, Stephen Lin
Abstract For the ECCV 2018 PoseTrack Challenge, we present a 3D human pose estimation system based mainly on the integral human pose regression method. We show a comprehensive ablation study to examine the key performance factors of the proposed system. Our system obtains 47mm MPJPE on the CHALL_H80K test dataset, placing second in the ECCV2018 3D human pose estimation challenge. Code will be released to facilitate future work.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2018-09-17
URL http://arxiv.org/abs/1809.06079v1
PDF http://arxiv.org/pdf/1809.06079v1.pdf
PWC https://paperswithcode.com/paper/an-integral-pose-regression-system-for-the
Repo https://github.com/JimmySuen/integral-human-pose
Framework pytorch
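The core of integral pose regression is replacing a hard argmax over a heatmap with a differentiable expectation (soft-argmax). A minimal 1D sketch, with a made-up heatmap for illustration:

```python
import math

def soft_argmax_1d(heatmap):
    """Integral (soft-argmax) coordinate estimate: the expected position
    under the softmax of the heatmap, which is differentiable, unlike a
    hard argmax over bins."""
    m = max(heatmap)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in heatmap]
    z = sum(exps)
    probs = [e / z for e in exps]
    return sum(i * p for i, p in enumerate(probs))

heat = [0.0, 0.1, 4.0, 0.1, 0.0]           # peak at index 2
coord = soft_argmax_1d(heat)
```

In the actual system this expectation is taken over 2D/3D joint heatmaps, so the whole pipeline can be trained with a coordinate regression loss end to end.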

Neural Baby Talk

Title Neural Baby Talk
Authors Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh
Abstract We introduce a novel framework for image captioning that can produce natural language explicitly grounded in entities that object detectors find in the image. Our approach reconciles classical slot filling approaches (that are generally better grounded in images) with modern neural captioning approaches (that are generally more natural sounding and accurate). Our approach first generates a sentence 'template' with slot locations explicitly tied to specific image regions. These slots are then filled in by visual concepts identified in the regions by object detectors. The entire architecture (sentence template generation and slot filling with object detectors) is end-to-end differentiable. We verify the effectiveness of our proposed model on different image captioning tasks. On standard image captioning and novel object captioning, our model reaches state-of-the-art on both COCO and Flickr30k datasets. We also demonstrate that our model has unique advantages when the train and test distributions of scene compositions – and hence language priors of associated captions – are different. Code has been made available at: https://github.com/jiasenlu/NeuralBabyTalk
Tasks Image Captioning, Slot Filling
Published 2018-03-27
URL http://arxiv.org/abs/1803.09845v1
PDF http://arxiv.org/pdf/1803.09845v1.pdf
PWC https://paperswithcode.com/paper/neural-baby-talk
Repo https://github.com/jiasenlu/NeuralBabyTalk
Framework pytorch
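The template-plus-slots decomposition can be illustrated with a trivial sketch (the detections, template tokens, and helper below are invented for illustration; in the paper both the template and the grounding are produced by a neural network and trained end to end):

```python
def fill_template(template, detections):
    """Replace each <slot:i> token with the category detected in region i."""
    out = []
    for tok in template:
        if tok.startswith("<slot:"):
            region = int(tok[6:-1])        # "<slot:0>" -> 0
            out.append(detections[region]["category"])
        else:
            out.append(tok)
    return " ".join(out)

# Hypothetical object-detector outputs: category + bounding box per region.
detections = [{"category": "dog", "box": (10, 20, 80, 90)},
              {"category": "frisbee", "box": (60, 15, 95, 40)}]
template = ["a", "<slot:0>", "catches", "a", "<slot:1>"]
caption = fill_template(template, detections)
```

The point of the decomposition is that the visual words in the caption are guaranteed to be grounded in detected regions rather than hallucinated from language priors.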

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection

Title PCL: Proposal Cluster Learning for Weakly Supervised Object Detection
Authors Peng Tang, Xinggang Wang, Song Bai, Wei Shen, Xiang Bai, Wenyu Liu, Alan Yuille
Abstract Weakly Supervised Object Detection (WSOD), using only image-level annotations to train object detectors, is of growing importance in object recognition. In this paper, we propose a novel deep network for WSOD. Unlike previous networks that transfer the object detection problem to an image classification problem using Multiple Instance Learning (MIL), our strategy generates proposal clusters to learn refined instance classifiers by an iterative process. The proposals in the same cluster are spatially adjacent and associated with the same object. This prevents the network from concentrating too much on parts of objects instead of whole objects. We first show that instances can be assigned object or background labels directly based on proposal clusters for instance classifier refinement, and then show that treating each cluster as a small new bag yields fewer ambiguities than directly assigning labels. The iterative instance classifier refinement is implemented online using multiple streams in convolutional neural networks, where the first stream is an MIL network and each subsequent stream refines the instance classifier under supervision from the preceding one. Experiments are conducted on the PASCAL VOC, ImageNet detection, and MS-COCO benchmarks for WSOD. Results show that our method outperforms the previous state of the art significantly.
Tasks Multiple Instance Learning, Object Detection, Object Recognition, Weakly Supervised Object Detection
Published 2018-07-09
URL http://arxiv.org/abs/1807.03342v2
PDF http://arxiv.org/pdf/1807.03342v2.pdf
PWC https://paperswithcode.com/paper/pcl-proposal-cluster-learning-for-weakly
Repo https://github.com/ppengtang/oicr
Framework pytorch
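The "spatially adjacent proposals form a cluster" idea can be sketched as connected components under an IoU threshold (a simplification of PCL's actual cluster-center-based generation; the threshold and boxes below are illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def cluster_proposals(boxes, thr=0.5):
    """Group proposals into clusters of spatially adjacent boxes (IoU >= thr)
    via connected components; returns a cluster label per proposal."""
    n, labels, cur = len(boxes), [-1] * len(boxes), 0
    for i in range(n):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], cur
        while stack:                      # flood-fill one component
            j = stack.pop()
            for k in range(n):
                if labels[k] == -1 and iou(boxes[j], boxes[k]) >= thr:
                    labels[k] = cur
                    stack.append(k)
        cur += 1
    return labels

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
labels = cluster_proposals(boxes)       # first two overlap, third stands alone
```

Each cluster can then be treated as a small bag for the refinement streams, which is the labeling strategy the abstract argues is less ambiguous.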

Clustering via Boundary Erosion

Title Clustering via Boundary Erosion
Authors Cheng-Hao Deng, Wan-Lei Zhao
Abstract Clustering analysis identifies samples as groups based on either their mutual closeness or homogeneity. In order to detect clusters of arbitrary shape, a novel and generic solution based on boundary erosion is proposed. The clusters are assumed to be separated by relatively sparse regions. Samples are eroded sequentially according to their dynamic boundary densities: the erosion starts from low-density regions and invades inwards until all samples have been eroded. In this manner, the boundaries between different clusters become increasingly apparent, offering a natural and powerful way to separate clusters whose boundaries are hard to draw all at once. The order in which samples are eroded induces a sequence of boundary levels, from which clusters of arbitrary shape are automatically reconstructed. As demonstrated across various clustering tasks, the method outperforms most state-of-the-art algorithms, and its performance is nearly perfect in some scenarios.
Tasks
Published 2018-04-12
URL http://arxiv.org/abs/1804.04312v2
PDF http://arxiv.org/pdf/1804.04312v2.pdf
PWC https://paperswithcode.com/paper/clustering-via-boundary-erosion
Repo https://github.com/redfoxdch/beClustering
Framework none
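A toy 1D sketch of the erosion order (using a simple inverse k-NN-distance density in place of the paper's dynamic boundary densities): sparse boundary points are eroded first, dense cluster cores last.

```python
def knn_density(points, i, k=2):
    """Inverse mean distance from point i to its k nearest neighbours
    (a crude stand-in for the paper's boundary density)."""
    d = sorted(abs(points[i] - p) for j, p in enumerate(points) if j != i)
    return 1.0 / (sum(d[:k]) / k + 1e-9)

def erosion_order(points, k=2):
    """Indices sorted from lowest to highest density: the order in which
    samples would be eroded, boundary points before cluster cores."""
    return sorted(range(len(points)), key=lambda i: knn_density(points, i, k))

# Two dense groups separated by one sparse in-between point at 5.0.
points = [0.0, 0.1, 0.2, 5.0, 9.8, 9.9, 10.0]
order = erosion_order(points)
```

The isolated point at 5.0 (index 3) has the lowest density and is eroded first, so the two groups separate early; the full method additionally records erosion levels and reconstructs clusters from them.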

N-ary Relation Extraction using Graph State LSTM

Title N-ary Relation Extraction using Graph State LSTM
Authors Linfeng Song, Yue Zhang, Zhiguo Wang, Daniel Gildea
Abstract Cross-sentence $n$-ary relation extraction detects relations among $n$ entities across multiple sentences. Typical methods formulate an input as a \textit{document graph}, integrating various intra-sentential and inter-sentential dependencies. The current state-of-the-art method splits the input graph into two DAGs, adopting a DAG-structured LSTM for each. Though being able to model rich linguistic knowledge by leveraging graph edges, important information can be lost in the splitting procedure. We propose a graph-state LSTM model, which uses a parallel state to model each word, recurrently enriching state values via message passing. Compared with DAG LSTMs, our graph LSTM keeps the original graph structure, and speeds up computation by allowing more parallelization. On a standard benchmark, our model shows the best result in the literature.
Tasks Relation Extraction
Published 2018-08-28
URL http://arxiv.org/abs/1808.09101v1
PDF http://arxiv.org/pdf/1808.09101v1.pdf
PWC https://paperswithcode.com/paper/n-ary-relation-extraction-using-graph-state
Repo https://github.com/freesunshine0316/nary-grn
Framework tf
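The graph-state recurrence can be illustrated with a bare-bones synchronous message-passing step (plain averaging in place of the paper's LSTM gating; the numbers are illustrative): every word holds a state, and all states are updated in parallel from their neighbours, which is what enables the parallel speed-up over DAG LSTMs.

```python
def message_passing_step(states, edges):
    """One synchronous update over an undirected graph: each node's new state
    mixes its own state with the mean of its neighbours' states."""
    nbrs = {i: [] for i in states}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    new = {}
    for i, h in states.items():
        msg = sum(states[j] for j in nbrs[i]) / len(nbrs[i]) if nbrs[i] else 0.0
        new[i] = 0.5 * h + 0.5 * msg
    return new

states = {0: 1.0, 1: 0.0, 2: 0.0}    # only word 0 carries the signal
edges = [(0, 1), (1, 2)]             # chain: 0 - 1 - 2
after_one = message_passing_step(states, edges)
```

After one step the signal reaches word 1 but not word 2; after two steps it reaches word 2. Stacking steps thus lets information flow across the whole document graph without ever splitting it into DAGs.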

Datasheets for Datasets

Title Datasheets for Datasets
Authors Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, Kate Crawford
Abstract The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains. To address this gap, we propose datasheets for datasets. In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended uses, and other information. By analogy, we propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on. Datasheets for datasets will facilitate better communication between dataset creators and dataset consumers, and encourage the machine learning community to prioritize transparency and accountability.
Tasks
Published 2018-03-23
URL https://arxiv.org/abs/1803.09010v7
PDF https://arxiv.org/pdf/1803.09010v7.pdf
PWC https://paperswithcode.com/paper/datasheets-for-datasets
Repo https://github.com/eric-erki/IdenProf
Framework tf
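The proposal is a documentation practice rather than an algorithm, but a machine-checkable datasheet can be sketched directly from the section headings named in the abstract (the field names and validation below are our own minimal rendering, not the paper's full question list):

```python
# Section headings taken from the abstract's list of what a datasheet documents.
DATASHEET_SECTIONS = ("motivation", "composition",
                      "collection_process", "recommended_uses")

def make_datasheet(**sections):
    """Build a datasheet dict, refusing to proceed if any section is missing,
    so incomplete documentation fails loudly instead of silently."""
    missing = [s for s in DATASHEET_SECTIONS if s not in sections]
    if missing:
        raise ValueError(f"datasheet incomplete, missing: {missing}")
    return {s: sections[s] for s in DATASHEET_SECTIONS}

sheet = make_datasheet(
    motivation="benchmark object detection in street scenes",
    composition="10k images, 80 categories, CC-BY licensed",
    collection_process="scraped 2017-2018, manually annotated",
    recommended_uses="research only; known geographic sampling bias",
)
```

Enforcing completeness at creation time is one way to operationalize the communication between dataset creators and consumers that the paper argues for.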

Deep neural decoders for near term fault-tolerant experiments

Title Deep neural decoders for near term fault-tolerant experiments
Authors Christopher Chamberland, Pooya Ronagh
Abstract Finding efficient decoders for quantum error correcting codes adapted to realistic experimental noise in fault-tolerant devices represents a significant challenge. In this paper we introduce several decoding algorithms complemented by deep neural decoders and apply them to analyze several fault-tolerant error correction protocols such as the surface code as well as Steane and Knill error correction. Our methods require no knowledge of the underlying noise model afflicting the quantum device, making them appealing for real-world experiments. Our analysis is based on a full circuit-level noise model. It considers both distance-three and distance-five codes, and is performed near the codes' pseudo-threshold regime. Training deep neural decoders in low noise rate regimes appears to be a challenging machine learning endeavour. We provide a detailed description of our neural network architectures and training methodology. We then discuss both the advantages and limitations of deep neural decoders. Lastly, we provide a rigorous analysis of the decoding runtime of trained deep neural decoders and compare our methods with anticipated gate times in future quantum devices. Given the broad applications of our decoding schemes, we believe that the methods presented in this paper could have practical applications for near term fault-tolerant experiments.
Tasks
Published 2018-02-18
URL http://arxiv.org/abs/1802.06441v2
PDF http://arxiv.org/pdf/1802.06441v2.pdf
PWC https://paperswithcode.com/paper/deep-neural-decoders-for-near-term-fault
Repo https://github.com/pooya-git/DeepNeuralDecoder
Framework tf
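What a decoder maps can be shown in the simplest possible setting: a lookup-table decoder for the 3-qubit bit-flip repetition code (a far smaller code than the surface, Steane, or Knill protocols studied in the paper, but the same syndrome-to-correction structure that the neural decoders learn):

```python
def syndrome(bits):
    """Parity checks Z1Z2 and Z2Z3 for a 3-bit repetition codeword."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

# Syndrome -> index of the single most likely flipped bit (None = no flip).
LOOKUP = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def decode(bits):
    """Measure the syndrome and apply the indicated single-bit correction."""
    fix = LOOKUP[syndrome(bits)]
    out = list(bits)
    if fix is not None:
        out[fix] ^= 1
    return out
```

For realistic circuit-level noise the optimal syndrome-to-correction map is no longer a small table like `LOOKUP`; the paper's contribution is training neural networks to approximate it without assuming a noise model.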

Advanced Super-Resolution using Lossless Pooling Convolutional Networks

Title Advanced Super-Resolution using Lossless Pooling Convolutional Networks
Authors Farzad Toutounchi, Ebroul Izquierdo
Abstract In this paper, we present a novel deep learning-based approach for still-image super-resolution that, unlike mainstream models, does not rely solely on the input low-resolution image for high-quality upsampling. Instead, it takes advantage of a set of artificially created auxiliary self-replicas of the input image, which are incorporated into the neural network to create an enhanced and accurate upscaling scheme. The proposed lossless pooling layers and the fusion of the input self-replicas enable the model to exploit the high correlation between multiple instances of the same content, resulting in significant improvements in super-resolution quality, as confirmed by extensive evaluations.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-12-14
URL http://arxiv.org/abs/1812.06023v1
PDF http://arxiv.org/pdf/1812.06023v1.pdf
PWC https://paperswithcode.com/paper/advanced-super-resolution-using-lossless
Repo https://github.com/gan3sh500/custom-pooling
Framework pytorch
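One common way to pool without discarding pixels, consistent with the abstract's "lossless pooling" and "self-replicas" wording, is a polyphase (space-to-depth style) split: 2x downsampling that keeps all four sampling phases as separate half-resolution replicas. This is our reading of the idea, sketched on a nested-list image, not the authors' exact layer:

```python
def lossless_pool_2x(img):
    """Split an HxW image (H, W even) into four (H/2)x(W/2) phase images.
    Unlike max/average pooling, no pixel is discarded: the four replicas
    together contain exactly the original pixels."""
    replicas = []
    for dy in (0, 1):
        for dx in (0, 1):
            replicas.append([row[dx::2] for row in img[dy::2]])
    return replicas

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
phases = lossless_pool_2x(img)
```

Each phase is a shifted subsampling of the same content, so the replicas are highly correlated, which is exactly the redundancy the network is meant to fuse and exploit.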

Learning Deep Representations with Probabilistic Knowledge Transfer

Title Learning Deep Representations with Probabilistic Knowledge Transfer
Authors Nikolaos Passalis, Anastasios Tefas
Abstract Knowledge Transfer (KT) techniques tackle the problem of transferring the knowledge from a large and complex neural network into a smaller and faster one. However, existing KT methods are tailored towards classification tasks and cannot be used efficiently for other representation learning tasks. In this paper, a novel knowledge transfer technique is proposed that is capable of training a student model to maintain the same amount of mutual information between the learned representation and a set of (possibly unknown) labels as the teacher model. Apart from outperforming existing KT techniques, the proposed method overcomes several limitations of existing methods, providing new insight into KT as well as novel KT applications, ranging from knowledge transfer from handcrafted feature extractors to cross-modal KT from the textual modality into the representation extracted from the visual modality of the data.
Tasks Representation Learning, Transfer Learning
Published 2018-03-28
URL http://arxiv.org/abs/1803.10837v3
PDF http://arxiv.org/pdf/1803.10837v3.pdf
PWC https://paperswithcode.com/paper/learning-deep-representations-with
Repo https://github.com/passalis/probabilistic_kt
Framework pytorch
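The mechanism can be sketched as matching pairwise-similarity distributions: turn each sample's similarities to the others into a probability distribution in both the teacher and student spaces, then penalize their divergence (a cosine kernel and KL divergence below; the paper studies the general kernel-based formulation, and the features here are toy data):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def similarity_distribution(feats, i):
    """Probability that sample i 'selects' each other sample, derived from
    (clipped) cosine similarities in the feature space."""
    sims = [max(cosine(feats[i], feats[j]), 0.0) + 1e-6
            for j in range(len(feats)) if j != i]
    z = sum(sims)
    return [s / z for s in sims]

def pkt_loss(teacher_feats, student_feats):
    """Sum over samples of KL(teacher distribution || student distribution)."""
    loss = 0.0
    for i in range(len(teacher_feats)):
        p = similarity_distribution(teacher_feats, i)
        q = similarity_distribution(student_feats, i)
        loss += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return loss

teacher = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
aligned = [[2.0, 0.0], [1.8, 0.2], [0.0, 2.0]]   # same geometry, different scale
loss_same = pkt_loss(teacher, aligned)
```

Because only the relative geometry is matched, the student is free to use a different dimensionality or scale than the teacher, which is what makes the approach applicable beyond classification.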

Neural Voice Cloning with a Few Samples

Title Neural Voice Cloning with a Few Samples
Authors Sercan O. Arik, Jitong Chen, Kainan Peng, Wei Ping, Yanqi Zhou
Abstract Voice cloning is a highly desired feature for personalized speech interfaces. Neural network based speech synthesis has been shown to generate high quality speech for a large number of speakers. In this paper, we introduce a neural voice cloning system that takes a few audio samples as input. We study two approaches: speaker adaptation and speaker encoding. Speaker adaptation is based on fine-tuning a multi-speaker generative model with a few cloning samples. Speaker encoding is based on training a separate model to directly infer a new speaker embedding from cloning audios, which is then used with a multi-speaker generative model. In terms of naturalness of the speech and its similarity to the original speaker, both approaches can achieve good performance, even with very few cloning audios. While speaker adaptation achieves better naturalness and similarity, the cloning time and required memory of the speaker encoding approach are significantly lower, making it favorable for low-resource deployment.
Tasks Speech Synthesis
Published 2018-02-14
URL http://arxiv.org/abs/1802.06006v3
PDF http://arxiv.org/pdf/1802.06006v3.pdf
PWC https://paperswithcode.com/paper/neural-voice-cloning-with-a-few-samples
Repo https://github.com/SforAiDl/Neural-Voice-Cloning-With-Few-Samples
Framework pytorch

Distribution Matching Losses Can Hallucinate Features in Medical Image Translation

Title Distribution Matching Losses Can Hallucinate Features in Medical Image Translation
Authors Joseph Paul Cohen, Margaux Luck, Sina Honari
Abstract This paper discusses how distribution matching losses, such as those used in CycleGAN, when used to synthesize medical images can lead to mis-diagnosis of medical conditions. It seems appealing to use these new image synthesis methods for translating images from a source to a target domain because they can produce high quality images and some even do not require paired data. However, the basis of how these image translation models work is through matching the translation output to the distribution of the target domain. This can cause an issue when the data provided in the target domain has an over or under representation of some classes (e.g. healthy or sick). When the output of an algorithm is a transformed image, there is no certainty that all known and unknown class labels have been preserved rather than changed. Therefore, we recommend that these translated images should not be used for direct interpretation (e.g. by doctors), because they may lead to misdiagnosis of patients based on image features hallucinated by an algorithm that matches a distribution. However, many recent papers appear to present exactly this use as the goal.
Tasks Image Generation
Published 2018-05-22
URL http://arxiv.org/abs/1805.08841v3
PDF http://arxiv.org/pdf/1805.08841v3.pdf
PWC https://paperswithcode.com/paper/distribution-matching-losses-can-hallucinate
Repo https://github.com/ieee8023/dist-bias
Framework pytorch

Accuracy-based Curriculum Learning in Deep Reinforcement Learning

Title Accuracy-based Curriculum Learning in Deep Reinforcement Learning
Authors Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer
Abstract In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning. Using a reinforcement learning agent based on the Deep Deterministic Policy Gradient algorithm and addressing the Reacher environment, we first show that an agent trained with various accuracy requirements sampled randomly learns more efficiently than when asked to be very accurate at all times. Then we show that adaptive selection of accuracy requirements, based on a local measure of competence progress, automatically generates a curriculum where difficulty progressively increases, resulting in a better learning efficiency than sampling randomly.
Tasks
Published 2018-06-25
URL http://arxiv.org/abs/1806.09614v2
PDF http://arxiv.org/pdf/1806.09614v2.pdf
PWC https://paperswithcode.com/paper/accuracy-based-curriculum-learning-in-deep
Repo https://github.com/fabian57fabian/MinGrid-Improved-RL-Methods
Framework pytorch
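The selection rule can be sketched as a bandit over accuracy requirements, choosing the one whose recent success rate is changing fastest (a simplified rendering of the paper's local competence-progress measure; the thresholds and history below are invented):

```python
def select_requirement(history, epsilons, window=2):
    """Pick the accuracy requirement (target epsilon) whose success rate
    improved the most between the last two windows of attempts, i.e. the
    one with the highest absolute learning progress."""
    def progress(eps):
        runs = history.get(eps, [])
        if len(runs) < 2 * window:
            return float("inf")          # too little data: explore it first
        recent = sum(runs[-window:]) / window
        older = sum(runs[-2 * window:-window]) / window
        return abs(recent - older)
    return max(epsilons, key=progress)

# Success (1) / failure (0) records per target accuracy epsilon.
history = {0.05: [0, 0, 0, 0],           # still too hard: no progress
           0.20: [0, 0, 1, 1],           # being learned right now
           0.50: [1, 1, 1, 1]}           # already mastered: no progress
choice = select_requirement(history, [0.05, 0.20, 0.50])
```

The agent thus spends its training time where competence is actually improving, which is what produces the automatically increasing difficulty curriculum described in the abstract.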

Neural Clustering Processes

Title Neural Clustering Processes
Authors Ari Pakman, Yueqi Wang, Catalin Mitelut, JinHyung Lee, Liam Paninski
Abstract Probabilistic clustering models (or equivalently, mixture models) are basic building blocks in countless statistical models and involve latent random variables over discrete spaces. For these models, posterior inference methods can be inaccurate and/or very slow. In this work we introduce deep network architectures trained with labeled samples from any generative model of clustered datasets. At test time, the networks generate approximate posterior samples of cluster labels for any new dataset of arbitrary size. We develop two complementary approaches to this task, requiring either O(N) or O(K) network forward passes per dataset, where N is the dataset size and K the number of clusters. Unlike previous approaches, our methods sample the labels of all the data points from a well-defined posterior, and can learn nonparametric Bayesian posteriors since they do not limit the number of mixture components. Moreover, the algorithms are easily parallelized with a GPU. As a scientific application, we present a novel approach to neural spike sorting for high-density multielectrode arrays.
Tasks Bayesian Inference
Published 2018-12-28
URL https://arxiv.org/abs/1901.00409v3
PDF https://arxiv.org/pdf/1901.00409v3.pdf
PWC https://paperswithcode.com/paper/discrete-neural-processes
Repo https://github.com/aripakman/neural_clustering_process
Framework pytorch

Optimal Piecewise Local-Linear Approximations

Title Optimal Piecewise Local-Linear Approximations
Authors Kartik Ahuja, William Zame, Mihaela van der Schaar
Abstract Existing works on “black-box” model interpretation use local-linear approximations to explain the predictions made for each data instance in terms of the importance assigned to the different features for arriving at the prediction. These works provide instancewise explanations and thus give a local view of the model. To be able to trust the model it is important to understand the global model behavior, and there are relatively few works which do the same. Piecewise local-linear models provide a natural way to extend local-linear models to explain the global behavior of the model. In this work, we provide a dynamic programming based framework to obtain piecewise approximations of the black-box model. We also provide provable guarantees on fidelity, i.e., how well the explanations reflect the black-box model. We carry out simulations on synthetic and real datasets to show the utility of the proposed approach. At the end, we show that the ideas developed for our framework can also be used to address the problem of clustering for one-dimensional data. We give a polynomial time algorithm and prove that it achieves optimal clustering.
Tasks
Published 2018-06-27
URL https://arxiv.org/abs/1806.10270v4
PDF https://arxiv.org/pdf/1806.10270v4.pdf
PWC https://paperswithcode.com/paper/piecewise-approximations-of-black-box-models
Repo https://github.com/ahujak/Piecewise-Local-Linear-Model
Framework none
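The one-dimensional clustering claim rests on a classic observation: on sorted data, optimal clusters are contiguous segments, so dynamic programming over split points finds the exact optimum in polynomial time. A minimal sketch of that DP (our own rendering of the general idea, minimizing within-cluster squared error; not the authors' exact objective or code):

```python
def segment_cost(xs, i, j):
    """Sum of squared deviations of xs[i:j] from its mean."""
    seg = xs[i:j]
    mu = sum(seg) / len(seg)
    return sum((x - mu) ** 2 for x in seg)

def optimal_1d_clustering(xs, k):
    """Best split of sorted xs into k contiguous clusters.
    cost[i][j] = min cost of covering the first i points with j clusters."""
    xs = sorted(xs)
    n = len(xs)
    INF = float("inf")
    cost = [[INF] * (k + 1) for _ in range(n + 1)]
    back = [[0] * (k + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):    # last cluster is xs[s:i]
                c = cost[s][j - 1] + segment_cost(xs, s, i)
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, s
    bounds, i = [], n                    # recover the split points
    for j in range(k, 0, -1):
        bounds.append(i)
        i = back[i][j]
    return cost[n][k], sorted(bounds)

data = [0.1, 0.0, 0.2, 10.0, 10.1, 9.9]
total, bounds = optimal_1d_clustering(data, 2)
```

Unlike k-means, this is guaranteed optimal; the cubic toy loop above can be tightened with prefix sums, which is how such algorithms reach practical polynomial runtimes.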