October 21, 2019


Paper Group AWR 39

Temporal Recurrent Networks for Online Action Detection. INSPECTRE: Privately Estimating the Unseen. Recurrent Transformer Networks for Semantic Correspondence. Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation. Learning to Reason with Third-Order Tensor Products. Learning Qualitatively Diverse and Interpretable Rul …

Temporal Recurrent Networks for Online Action Detection


Title	Temporal Recurrent Networks for Online Action Detection
Authors	Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall
Abstract	Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed. However, important real-time applications including surveillance and driver assistance systems require identifying actions as soon as each video frame arrives, based only on current and historical observations. In this paper, we propose a novel framework, Temporal Recurrent Network (TRN), to model greater temporal context of a video frame by simultaneously performing online action detection and anticipation of the immediate future. At each moment in time, our approach makes use of both accumulated historical evidence and predicted future information to better recognize the action that is currently occurring, and integrates both of these into a unified end-to-end architecture. We evaluate our approach on two popular online action detection datasets, HDD and TVSeries, as well as another widely used dataset, THUMOS’14. The results show that TRN significantly outperforms the state-of-the-art.
Tasks	Action Detection
Published	2018-11-18
URL	http://arxiv.org/abs/1811.07391v2
PDF	http://arxiv.org/pdf/1811.07391v2.pdf
PWC	https://paperswithcode.com/paper/temporal-recurrent-networks-for-online-action
Repo	https://github.com/rajskar/CS763Project
Framework	pytorch

INSPECTRE: Privately Estimating the Unseen


Title	INSPECTRE: Privately Estimating the Unseen
Authors	Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang
Abstract	We develop differentially private methods for estimating various distributional properties. Given a sample from a discrete distribution $p$, some functional $f$, and accuracy and privacy parameters $\alpha$ and $\varepsilon$, the goal is to estimate $f(p)$ up to accuracy $\alpha$, while maintaining $\varepsilon$-differential privacy of the sample. We prove almost-tight bounds on the sample size required for this problem for several functionals of interest, including support size, support coverage, and entropy. We show that the cost of privacy is negligible in a variety of settings, both theoretically and experimentally. Our methods are based on a sensitivity analysis of several state-of-the-art methods for estimating these properties with sublinear sample complexities.
Tasks
Published	2018-02-28
URL	http://arxiv.org/abs/1803.00008v1
PDF	http://arxiv.org/pdf/1803.00008v1.pdf
PWC	https://paperswithcode.com/paper/inspectre-privately-estimating-the-unseen
Repo	https://github.com/HuanyuZhang/INSPECTRE
Framework	none
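
The abstract above builds private estimators from a sensitivity analysis of existing ones. A minimal sketch of that general recipe follows, using the Laplace mechanism on a deliberately simple statistic: the number of distinct symbols in the sample. This is not the estimator from the paper; the statistic is chosen only because its sensitivity (1) is easy to verify. The function name `private_distinct_count` is hypothetical.

```python
import random


def private_distinct_count(sample, epsilon, rng=None):
    """Release the number of distinct symbols in `sample` with epsilon-DP.

    Illustrative only: the statistic and the function name are assumptions,
    not the estimators analysed in the paper.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = len(set(sample))
    # Replacing one sample changes the distinct count by at most 1,
    # so the sensitivity of this statistic is 1.
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is a Laplace(0, scale) random variable.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    data = ["a", "b", "a", "c", "b", "d"]
    print(private_distinct_count(data, epsilon=1.0, rng=random.Random(0)))
```

A smaller `epsilon` gives stronger privacy and a noisier answer; the paper's claim is that for well-chosen estimators this extra noise is often negligible relative to the sampling error.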

Recurrent Transformer Networks for Semantic Correspondence


Title	Recurrent Transformer Networks for Semantic Correspondence
Authors	Seungryong Kim, Stephen Lin, Sangryul Jeon, Dongbo Min, Kwanghoon Sohn
Abstract	We present recurrent transformer networks (RTNs) for obtaining dense correspondences between semantically similar images. Our networks accomplish this through an iterative process of estimating spatial transformations between the input images and using these transformations to generate aligned convolutional activations. By directly estimating the transformations between an image pair, rather than employing spatial transformer networks to independently normalize each individual image, we show that greater accuracy can be achieved. This process is conducted in a recursive manner to refine both the transformation estimates and the feature representations. In addition, a technique is presented for weakly-supervised training of RTNs that is based on a proposed classification loss. With RTNs, state-of-the-art performance is attained on several benchmarks for semantic correspondence.
Tasks
Published	2018-10-29
URL	http://arxiv.org/abs/1810.12155v1
PDF	http://arxiv.org/pdf/1810.12155v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-transformer-networks-for-semantic
Repo	https://github.com/seungryong/RTNs
Framework	none

Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation


Title	Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Authors	Yikang Li, Wanli Ouyang, Bolei Zhou, Jianping Shi, Chao Zhang, Xiaogang Wang
Abstract	Generating scene graph to describe all the relations inside an image gains increasing interests these years. However, most of the previous methods use complicated structures with slow inference speed or rely on the external data, which limits the usage of the model in real-life scenarios. To improve the efficiency of scene graph generation, we propose a subgraph-based connection graph to concisely represent the scene graph during the inference. A bottom-up clustering method is first used to factorize the entire scene graph into subgraphs, where each subgraph contains several objects and a subset of their relationships. By replacing the numerous relationship representations of the scene graph with fewer subgraph and object features, the computation in the intermediate stage is significantly reduced. In addition, spatial information is maintained by the subgraph features, which is leveraged by our proposed Spatial-weighted Message Passing (SMP) structure and Spatial-sensitive Relation Inference (SRI) module to facilitate the relationship recognition. On the recent Visual Relationship Detection and Visual Genome datasets, our method outperforms the state-of-the-art method in both accuracy and speed.
Tasks	Graph Generation, Scene Graph Generation
Published	2018-06-29
URL	http://arxiv.org/abs/1806.11538v2
PDF	http://arxiv.org/pdf/1806.11538v2.pdf
PWC	https://paperswithcode.com/paper/factorizable-net-an-efficient-subgraph-based
Repo	https://github.com/yikang-li/FactorizableNet
Framework	pytorch

Learning to Reason with Third-Order Tensor Products


Title	Learning to Reason with Third-Order Tensor Products
Authors	Imanol Schlag, Jürgen Schmidhuber
Abstract	We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data. This improves symbolic interpretation and systematic generalisation. Our architecture is trained end-to-end through gradient descent on a variety of simple natural language reasoning tasks, significantly outperforming the latest state-of-the-art models in single-task and all-tasks settings. We also augment a subset of the data such that training and test data exhibit large systematic differences and show that our approach generalises better than the previous state-of-the-art.
Tasks
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12143v2
PDF	http://arxiv.org/pdf/1811.12143v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-reason-with-third-order-tensor
Repo	https://github.com/ischlag/TPR-RNN
Framework	tf
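
To make the idea of a third-order tensor product representation concrete, here is a small sketch of binding and unbinding with fixed one-hot vectors. It is a generic illustration of tensor-product binding, not the architecture from the paper (where the vectors are learned by a recurrent network); all names and the toy facts are assumptions.

```python
def outer3(a, b, c):
    """Third-order outer product: T[i][j][k] = a[i] * b[j] * c[k]."""
    return [[[x * y * z for z in c] for y in b] for x in a]


def add_tensors(t, u):
    """Element-wise sum of two third-order tensors of the same shape."""
    return [[[p + q for p, q in zip(row_t, row_u)]
             for row_t, row_u in zip(mat_t, mat_u)]
            for mat_t, mat_u in zip(t, u)]


def unbind(t, a, b):
    """Contract the first two modes of `t` with `a` and `b`."""
    depth = len(t[0][0])
    return [sum(t[i][j][k] * a[i] * b[j]
                for i in range(len(a)) for j in range(len(b)))
            for k in range(depth)]


# Hypothetical one-hot codes; orthonormal vectors make retrieval exact.
mary, john = [1, 0], [0, 1]
is_in, has = [1, 0], [0, 1]
kitchen, ball = [1, 0, 0], [0, 1, 0]

# Store two facts in one tensor: (mary, is_in, kitchen) and (john, has, ball).
memory = add_tensors(outer3(mary, is_in, kitchen), outer3(john, has, ball))

if __name__ == "__main__":
    print(unbind(memory, mary, is_in))  # recovers the code for kitchen
    print(unbind(memory, john, has))    # recovers the code for ball
```

Because the stored facts are superimposed in a single tensor and recovered by contraction, the same memory can answer queries about combinations of entities and relations, which is the kind of combinatorial structure the abstract refers to.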

Learning Qualitatively Diverse and Interpretable Rules for Classification


Title	Learning Qualitatively Diverse and Interpretable Rules for Classification
Authors	Andrew Slavin Ross, Weiwei Pan, Finale Doshi-Velez
Abstract	There has been growing interest in developing accurate models that can also be explained to humans. Unfortunately, if there exist multiple distinct but accurate models for some dataset, current machine learning methods are unlikely to find them: standard techniques will likely recover a complex model that combines them. In this work, we introduce a way to identify a maximal set of distinct but accurate models for a dataset. We demonstrate empirically that, in situations where the data supports multiple accurate classifiers, we tend to recover simpler, more interpretable classifiers rather than more complex ones.
Tasks
Published	2018-06-22
URL	http://arxiv.org/abs/1806.08716v2
PDF	http://arxiv.org/pdf/1806.08716v2.pdf
PWC	https://paperswithcode.com/paper/learning-qualitatively-diverse-and
Repo	https://github.com/dtak/local-independence-public
Framework	tf

Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations


Title	Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations
Authors	Aditi Chaudhary, Chunting Zhou, Lori Levin, Graham Neubig, David R. Mortensen, Jaime G. Carbonell
Abstract	Much work in Natural Language Processing (NLP) has been for resource-rich languages, making generalization to new, less-resourced languages challenging. We present two approaches for improving generalization to low-resourced languages by adapting continuous word representations using linguistically motivated subword units: phonemes, morphemes and graphemes. Our method requires neither parallel corpora nor bilingual dictionaries and provides a significant gain in performance over previous methods relying on these resources. We demonstrate the effectiveness of our approaches on Named Entity Recognition for four languages, namely Uyghur, Turkish, Bengali and Hindi, of which Uyghur and Bengali are low resource languages, and also perform experiments on Machine Translation. Exploiting subwords with transfer learning gives us a boost of +15.2 NER F1 for Uyghur and +9.7 F1 for Bengali. We also show improvements in the monolingual setting where we achieve (avg.) +3 F1 and (avg.) +1.35 BLEU.
Tasks	Machine Translation, Named Entity Recognition, Transfer Learning, Word Embeddings
Published	2018-08-28
URL	http://arxiv.org/abs/1808.09500v1
PDF	http://arxiv.org/pdf/1808.09500v1.pdf
PWC	https://paperswithcode.com/paper/adapting-word-embeddings-to-new-languages
Repo	https://github.com/Aditi138/Embeddings
Framework	none

End-to-end detection-segmentation network with ROI convolution


Title	End-to-end detection-segmentation network with ROI convolution
Authors	Zichen Zhang, Min Tang, Dana Cobzas, Dornoosh Zonoobi, Martin Jagersand, Jacob L. Jaremko
Abstract	We propose an end-to-end neural network that improves the segmentation accuracy of fully convolutional networks by incorporating a localization unit. This network performs object localization first, which is then used as a cue to guide the training of the segmentation network. We test the proposed method on a segmentation task of small objects on a clinical dataset of ultrasound images. We show that by jointly learning for detection and segmentation, the proposed network is able to improve the segmentation accuracy compared to only learning for segmentation. Code is publicly available at https://github.com/vincentzhang/roi-fcn.
Tasks	Object Localization
Published	2018-01-08
URL	https://arxiv.org/abs/1801.02722v2
PDF	https://arxiv.org/pdf/1801.02722v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-detection-segmentation-network
Repo	https://github.com/vincentzhang/roi-fcn
Framework	none

Meta-Learning: A Survey


Title	Meta-Learning: A Survey
Authors	Joaquin Vanschoren
Abstract	Meta-learning, or learning to learn, is the science of systematically observing how different machine learning approaches perform on a wide range of learning tasks, and then learning from this experience, or meta-data, to learn new tasks much faster than otherwise possible. Not only does this dramatically speed up and improve the design of machine learning pipelines or neural architectures, it also allows us to replace hand-engineered algorithms with novel approaches learned in a data-driven way. In this chapter, we provide an overview of the state of the art in this fascinating and continuously evolving field.
Tasks	Meta-Learning
Published	2018-10-08
URL	http://arxiv.org/abs/1810.03548v1
PDF	http://arxiv.org/pdf/1810.03548v1.pdf
PWC	https://paperswithcode.com/paper/meta-learning-a-survey
Repo	https://github.com/289371298/RLpapersnote
Framework	none

Automated proof synthesis for propositional logic with deep neural networks


Title	Automated proof synthesis for propositional logic with deep neural networks
Authors	Taro Sekiyama, Kohei Suenaga
Abstract	This work explores the application of deep learning, a machine learning technique that uses deep neural networks (DNN) in its core, to an automated theorem proving (ATP) problem. To this end, we construct a statistical model which quantifies the likelihood that a proof is indeed a correct one of a given proposition. Based on this model, we give a proof-synthesis procedure that searches for a proof in the order of the likelihood. This procedure uses an estimator of the likelihood of an inference rule being applied at each step of a proof. As an implementation of the estimator, we propose a proposition-to-proof architecture, which is a DNN tailored to the automated proof synthesis problem. To empirically demonstrate its usefulness, we apply our model to synthesize proofs of propositional logic. We train the proposition-to-proof model using a training dataset of proposition-proof pairs. The evaluation against a benchmark set shows the very high accuracy and an improvement to the recent work of neural proof synthesis.
Tasks	Automated Theorem Proving
Published	2018-05-30
URL	http://arxiv.org/abs/1805.11799v1
PDF	http://arxiv.org/pdf/1805.11799v1.pdf
PWC	https://paperswithcode.com/paper/automated-proof-synthesis-for-propositional
Repo	https://github.com/mluszczyk/deepsat
Framework	tf
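
The procedure described in the abstract, searching for a proof in order of likelihood using a per-step estimator for inference rules, can be pictured as a best-first search. The sketch below is a generic version under stated assumptions: `expand` stands in for the learned estimator, and the toy rule table and state names are hypothetical, not taken from the paper.

```python
import heapq
import math


def best_first_search(start, is_goal, expand, max_steps=10000):
    """Explore partial proofs in order of decreasing likelihood.

    `expand(state)` returns (rule_name, probability, next_state) triples and
    stands in for a learned estimator (an assumption of this sketch).
    """
    counter = 0  # tie-breaker so the heap never compares states
    frontier = [(0.0, counter, start, [])]
    for _ in range(max_steps):
        if not frontier:
            break
        cost, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, math.exp(-cost)
        for rule, prob, nxt in expand(state):
            if prob <= 0:
                continue
            counter += 1
            # Costs are negative log-probabilities, so the cheapest entry
            # on the heap is the most likely partial proof.
            heapq.heappush(
                frontier, (cost - math.log(prob), counter, nxt, path + [rule])
            )
    return None


def toy_expand(state):
    """Hypothetical rule table used only to exercise the search."""
    table = {
        "goal": [("intro", 0.7, "sub1"), ("elim", 0.3, "dead")],
        "sub1": [("axiom", 0.9, "done"), ("intro", 0.1, "dead")],
    }
    return table.get(state, [])


if __name__ == "__main__":
    print(best_first_search("goal", lambda s: s == "done", toy_expand))
```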

Skin Lesion Diagnosis using Ensembles, Unscaled Multi-Crop Evaluation and Loss Weighting


Title	Skin Lesion Diagnosis using Ensembles, Unscaled Multi-Crop Evaluation and Loss Weighting
Authors	Nils Gessert, Thilo Sentker, Frederic Madesta, Rüdiger Schmitz, Helge Kniep, Ivo Baltruschat, René Werner, Alexander Schlaefer
Abstract	In this paper we present the methods of our submission to the ISIC 2018 challenge for skin lesion diagnosis (Task 3). The dataset consists of 10000 images with seven image-level classes to be distinguished by an automated algorithm. We employ an ensemble of convolutional neural networks for this task. In particular, we fine-tune pretrained state-of-the-art deep learning models such as Densenet, SENet and ResNeXt. We identify heavy class imbalance as a key problem for this challenge and consider multiple balancing approaches such as loss weighting and balanced batch sampling. Another important feature of our pipeline is the use of a vast amount of unscaled crops for evaluation. Last, we consider meta learning approaches for the final predictions. Our team placed second at the challenge while being the best approach using only publicly available data.
Tasks	Meta-Learning
Published	2018-08-05
URL	http://arxiv.org/abs/1808.01694v1
PDF	http://arxiv.org/pdf/1808.01694v1.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-diagnosis-using-ensembles
Repo	https://github.com/ngessert/patch-lesion
Framework	pytorch
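
The abstract names loss weighting as one remedy for heavy class imbalance. A minimal sketch of one common variant, inverse-frequency class weights applied to a cross-entropy loss, is shown below. The exact weighting scheme used in the submission is not given in the abstract, so the formula and function names here are assumptions.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset yields a weight of 1.0 for every class.
    """
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one example, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


if __name__ == "__main__":
    # Hypothetical two-class dataset with a 90/10 split.
    weights = inverse_frequency_weights([90, 10])
    print(weights)
    print(weighted_cross_entropy([0.5, 0.5], 1, weights))
```

With these weights, a misclassified example of the rare class contributes more to the loss than one of the frequent class, which counteracts the tendency to ignore rare classes.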

Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation


Title	Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Authors	Hao-Wen Dong, Yi-Hsuan Yang
Abstract	It has been shown recently that deep convolutional generative adversarial networks (GANs) can learn to generate music in the form of piano-rolls, which represent music by binary-valued time-pitch matrices. However, existing models can only generate real-valued piano-rolls and require further post-processing, such as hard thresholding (HT) or Bernoulli sampling (BS), to obtain the final binary-valued results. In this paper, we study whether we can have a convolutional GAN model that directly creates binary-valued piano-rolls by using binary neurons. Specifically, we propose to append to the generator an additional refiner network, which uses binary neurons at the output layer. The whole network is trained in two stages. Firstly, the generator and the discriminator are pretrained. Then, the refiner network is trained along with the discriminator to learn to binarize the real-valued piano-rolls the pretrained generator creates. Experimental results show that using binary neurons instead of HT or BS indeed leads to better results in a number of objective measures. Moreover, deterministic binary neurons perform better than stochastic ones in both objective measures and a subjective test. The source code, training data and audio examples of the generated results can be found at https://salu133445.github.io/bmusegan/ .
Tasks	Music Generation
Published	2018-04-25
URL	http://arxiv.org/abs/1804.09399v3
PDF	http://arxiv.org/pdf/1804.09399v3.pdf
PWC	https://paperswithcode.com/paper/convolutional-generative-adversarial-networks
Repo	https://github.com/salu133445/musegan
Framework	tf
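
The distinction between deterministic and stochastic binary neurons mentioned in the abstract can be sketched in a few lines. This shows only the forward pass under common definitions (thresholding versus Bernoulli sampling of a sigmoid activation); training such units requires a gradient estimator, which is not shown, and the function names are assumptions.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def deterministic_binary_neuron(x):
    """Output 1 when the sigmoid activation exceeds 0.5, else 0."""
    return 1 if sigmoid(x) > 0.5 else 0


def stochastic_binary_neuron(x, rng):
    """Output 1 with probability sigmoid(x), else 0."""
    return 1 if rng.random() < sigmoid(x) else 0


if __name__ == "__main__":
    rng = random.Random(0)
    pre_activations = [2.0, -1.0, 0.3]  # hypothetical time-pitch entries
    print([deterministic_binary_neuron(x) for x in pre_activations])
    print([stochastic_binary_neuron(x, rng) for x in pre_activations])
```

Either unit emits a binary-valued piano-roll entry directly, so no separate hard-thresholding or Bernoulli-sampling step is needed after generation.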

IndoSum: A New Benchmark Dataset for Indonesian Text Summarization


Title	IndoSum: A New Benchmark Dataset for Indonesian Text Summarization
Authors	Kemal Kurniawan, Samuel Louvan
Abstract	Automatic text summarization is generally considered as a challenging task in the NLP community. One of the challenges is the publicly available and large dataset that is relatively rare and difficult to construct. The problem is even worse for low-resource languages such as Indonesian. In this paper, we present IndoSum, a new benchmark dataset for Indonesian text summarization. The dataset consists of news articles and manually constructed summaries. Notably, the dataset is almost 200x larger than the previous Indonesian summarization dataset of the same domain. We evaluated various extractive summarization approaches and obtained encouraging results which demonstrate the usefulness of the dataset and provide baselines for future research. The code and the dataset are available online under permissive licenses.
Tasks	Text Summarization
Published	2018-10-12
URL	http://arxiv.org/abs/1810.05334v5
PDF	http://arxiv.org/pdf/1810.05334v5.pdf
PWC	https://paperswithcode.com/paper/indosum-a-new-benchmark-dataset-for
Repo	https://github.com/kata-ai/indosum
Framework	tf

Towards Solving Text-based Games by Producing Adaptive Action Spaces


Title	Towards Solving Text-based Games by Producing Adaptive Action Spaces
Authors	Ruo Yu Tao, Marc-Alexandre Côté, Xingdi Yuan, Layla El Asri
Abstract	To solve a text-based game, an agent needs to formulate valid text commands for a given context and find the ones that lead to success. Recent attempts at solving text-based games with deep reinforcement learning have focused on the latter, i.e., learning to act optimally when valid actions are known in advance. In this work, we propose to tackle the first task and train a model that generates the set of all valid commands for a given context. We try three generative models on a dataset generated with Textworld. The best model can generate valid commands which were unseen at training and achieve high $F_1$ score on the test set.
Tasks
Published	2018-12-03
URL	http://arxiv.org/abs/1812.00855v1
PDF	http://arxiv.org/pdf/1812.00855v1.pdf
PWC	https://paperswithcode.com/paper/towards-solving-text-based-games-by-producing
Repo	https://github.com/projectzork/Readings
Framework	none
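
Since the model above is evaluated by how well its generated commands match the valid ones, a set-based $F_1$ is a natural way to picture the metric. The sketch below computes it for a single context; how the paper aggregates scores across contexts is not stated in the abstract, and the example commands are hypothetical.

```python
def command_set_f1(predicted, valid):
    """F1 between a set of generated commands and the set of valid commands."""
    predicted, valid = set(predicted), set(valid)
    true_positives = len(predicted & valid)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(valid)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = ["open door", "take key", "eat key"]
    admissible = ["open door", "take key", "go north", "look"]
    print(command_set_f1(generated, admissible))
```

Precision penalises invalid commands the model invents, while recall penalises valid commands it fails to produce, so a high score requires generating the complete set without extras.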

DLOW: Domain Flow for Adaptation and Generalization


Title	DLOW: Domain Flow for Adaptation and Generalization
Authors	Rui Gong, Wen Li, Yuhua Chen, Luc Van Gool
Abstract	In this work, we present a domain flow generation (DLOW) model to bridge two different domains by generating a continuous sequence of intermediate domains flowing from one domain to the other. The benefits of our DLOW model are two-fold. First, it is able to transfer source images into different styles in the intermediate domains. The transferred images smoothly bridge the gap between source and target domains, thus easing the domain adaptation task. Second, when multiple target domains are provided for training, our DLOW model is also able to generate new styles of images that are unseen in the training data. We implement our DLOW model based on CycleGAN. A domainness variable is introduced to guide the model to generate the desired intermediate domain images. In the inference phase, a flow of various styles of images can be obtained by varying the domainness variable. We demonstrate the effectiveness of our model for both cross-domain semantic segmentation and the style generalization tasks on benchmark datasets. Our implementation is available at https://github.com/ETHRuiGong/DLOW.
Tasks	Domain Adaptation, Semantic Segmentation, Style Generalization
Published	2018-12-13
URL	https://arxiv.org/abs/1812.05418v2
PDF	https://arxiv.org/pdf/1812.05418v2.pdf
PWC	https://paperswithcode.com/paper/dlow-domain-flow-for-adaptation-and
Repo	https://github.com/ETHRuiGong/DLOW
Framework	none