February 1, 2020

3008 words 15 mins read

Paper Group AWR 272

Graduated Optimization of Black-Box Functions. End-to-End Wireframe Parsing. Dream to Control: Learning Behaviors by Latent Imagination. Single Network Panoptic Segmentation for Street Scene Understanding. Drawing early-bird tickets: Towards more efficient training of deep networks. Fine-grained Sentiment Classification using BERT. Memory Bounded O …

Graduated Optimization of Black-Box Functions

Title Graduated Optimization of Black-Box Functions
Authors Weijia Shao, Christian Geißler, Fikret Sivrikaya
Abstract Motivated by the problem of tuning hyperparameters in machine learning, we present a new approach for gradually and adaptively optimizing an unknown function using estimated gradients. We validate the empirical performance of the proposed idea on both low and high dimensional problems. The experimental results demonstrate the advantages of our approach for tuning high dimensional hyperparameters in machine learning.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01279v1
PDF https://arxiv.org/pdf/1906.01279v1.pdf
PWC https://paperswithcode.com/paper/graduated-optimization-of-black-box-functions
Repo https://github.com/christiangeissler/gradoptbenchmark
Framework none
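
A minimal sketch of the general idea described in the abstract, not the authors' algorithm: descend gradient estimates of a smoothed version of the black-box objective while gradually shrinking the smoothing radius. All names, step sizes, and schedules below are illustrative assumptions.

```python
# Hypothetical sketch of graduated optimization of a black-box function:
# gradients of a Gaussian-smoothed objective are estimated from random
# perturbations, and the smoothing radius sigma is reduced stage by stage.
import numpy as np

rng = np.random.default_rng(0)

def estimate_gradient(f, x, sigma, n_samples=32):
    """Estimate the gradient of the sigma-smoothed version of f at x."""
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        grad += (f(x + sigma * u) - f(x - sigma * u)) / (2.0 * sigma) * u
    return grad / n_samples

def graduated_optimize(f, x0, sigma0=1.0, lr=0.1, n_stages=5, steps_per_stage=50):
    """Descend estimated gradients while gradually sharpening the objective."""
    x, sigma = np.array(x0, dtype=float), sigma0
    for _ in range(n_stages):
        for _ in range(steps_per_stage):
            x -= lr * estimate_gradient(f, x, sigma)
        sigma *= 0.5  # less smoothing in each successive stage
    return x

if __name__ == "__main__":
    # Toy multimodal objective; its smoothed version is easier to descend.
    f = lambda x: np.sum(x ** 2) + 0.5 * np.sum(np.sin(5 * x))
    print(graduated_optimize(f, x0=np.ones(10)))
```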

End-to-End Wireframe Parsing

Title End-to-End Wireframe Parsing
Authors Yichao Zhou, Haozhi Qi, Yi Ma
Abstract We present a conceptually simple yet effective algorithm to detect wireframes in a given image. Compared to the previous methods which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines. To better understand the quality of the outputs, we propose a new metric for wireframe evaluation that penalizes overlapped line segments and incorrect line connectivities. We conduct extensive experiments and show that our method significantly outperforms the previous state-of-the-art wireframe and line extraction algorithms. We hope our simple approach can serve as a baseline for future wireframe parsing studies. Code has been made publicly available at https://github.com/zhou13/lcnn.
Tasks Line Segment Detection
Published 2019-05-08
URL https://arxiv.org/abs/1905.03246v2
PDF https://arxiv.org/pdf/1905.03246v2.pdf
PWC https://paperswithcode.com/paper/190503246
Repo https://github.com/zhou13/lcnn
Framework pytorch
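
A small, hypothetical illustration of the vectorized wireframe format the abstract describes (junctions as 2D points, line segments as pairs of junction indices), with a toy overlap test standing in for the kind of duplicate segments the proposed metric penalizes. The tolerances and the heuristic itself are assumptions, not the paper's evaluation code.

```python
# Toy wireframe representation and a rough overlap check between segments.
import numpy as np

junctions = np.array([[10.0, 20.0], [100.0, 20.0], [100.0, 80.0]])  # (J, 2) in pixels
lines = [(0, 1), (1, 2)]                                            # index pairs into junctions

def segments_overlap(seg_a, seg_b, angle_tol=np.deg2rad(2), dist_tol=3.0):
    """Rough test for near-collinear, overlapping segments (toy heuristic)."""
    (a0, a1), (b0, b1) = seg_a, seg_b
    da, db = a1 - a0, b1 - b0
    ang = np.arccos(np.clip(abs(da @ db) / (np.linalg.norm(da) * np.linalg.norm(db)), 0, 1))
    if ang > angle_tol:
        return False
    n = np.array([-da[1], da[0]]) / np.linalg.norm(da)   # normal of segment a
    return max(abs((b0 - a0) @ n), abs((b1 - a0) @ n)) < dist_tol

seg = lambda i, j: (junctions[i], junctions[j])
print(segments_overlap(seg(0, 1), seg(1, 2)))  # False: perpendicular segments
```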

Dream to Control: Learning Behaviors by Latent Imagination

Title Dream to Control: Learning Behaviors by Latent Imagination
Authors Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Abstract Learned world models summarize an agent’s experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We efficiently learn behaviors by propagating analytic gradients of learned state values back through trajectories imagined in the compact state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data-efficiency, computation time, and final performance.
Tasks Continuous Control
Published 2019-12-03
URL https://arxiv.org/abs/1912.01603v3
PDF https://arxiv.org/pdf/1912.01603v3.pdf
PWC https://paperswithcode.com/paper/dream-to-control-learning-behaviors-by-latent
Repo https://github.com/danijar/dreamer
Framework tf
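
A schematic sketch of the core mechanic the abstract describes, not the actual Dreamer implementation: an actor is updated by unrolling a differentiable latent dynamics model for a few imagined steps and backpropagating the predicted rewards plus a bootstrapped value through the imagined trajectory. The tiny MLPs, horizon, and plain discounted sum (instead of Dreamer's lambda-returns and RSSM) are simplifying assumptions.

```python
import torch
import torch.nn as nn

latent_dim, action_dim, horizon = 32, 4, 15
dynamics = nn.Sequential(nn.Linear(latent_dim + action_dim, 64), nn.ELU(), nn.Linear(64, latent_dim))
reward = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, 1))
value = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, 1))
actor = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, action_dim), nn.Tanh())
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

def imagine_and_update(start_state, gamma=0.99):
    """One actor update from imagined rollouts starting at `start_state`."""
    state, ret, discount = start_state, 0.0, 1.0
    for _ in range(horizon):
        action = actor(state)
        state = dynamics(torch.cat([state, action], dim=-1))   # imagined next latent
        ret = ret + discount * reward(state).squeeze(-1)
        discount *= gamma
    ret = ret + discount * value(state).squeeze(-1)            # bootstrap with the value model
    loss = -ret.mean()                                          # maximize imagined return
    actor_opt.zero_grad()
    loss.backward()                                             # analytic gradients through imagination
    actor_opt.step()
    return loss.item()

print(imagine_and_update(torch.randn(8, latent_dim)))
```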

Single Network Panoptic Segmentation for Street Scene Understanding

Title Single Network Panoptic Segmentation for Street Scene Understanding
Authors Daan de Geus, Panagiotis Meletis, Gijs Dubbelman
Abstract In this work, we propose a single deep neural network for panoptic segmentation, for which the goal is to provide each individual pixel of an input image with a class label, as in semantic segmentation, as well as a unique identifier for specific objects in an image, following instance segmentation. Our network makes joint semantic and instance segmentation predictions and combines these to form an output in the panoptic format. This has two main benefits: firstly, the entire panoptic prediction is made in one pass, reducing the required computation time and resources; secondly, by learning the tasks jointly, information is shared between the two tasks, thereby improving performance. Our network is evaluated on two street scene datasets: Cityscapes and Mapillary Vistas. By leveraging information exchange and improving the merging heuristics, we increase the performance of the single network, and achieve a score of 23.9 on the Panoptic Quality (PQ) metric on Mapillary Vistas validation, with an input resolution of 640 x 900 pixels. On Cityscapes validation, our method achieves a PQ score of 45.9 with an input resolution of 512 x 1024 pixels. Moreover, our method decreases the prediction time by a factor of 2 with respect to separate networks.
Tasks Instance Segmentation, Panoptic Segmentation, Scene Understanding, Semantic Segmentation
Published 2019-02-07
URL http://arxiv.org/abs/1902.02678v1
PDF http://arxiv.org/pdf/1902.02678v1.pdf
PWC https://paperswithcode.com/paper/single-network-panoptic-segmentation-for
Repo https://github.com/DdeGeus/single-network-panoptic-segmentation
Framework tf
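
A hypothetical merging heuristic, loosely in the spirit of combining the two predictions into the panoptic format: instance masks, sorted by confidence, claim pixels first, and unclaimed pixels fall back to the semantic "stuff" prediction. The thresholds and rules are illustrative assumptions, not the authors' exact merging logic.

```python
import numpy as np

H, W = 4, 6
semantic = np.random.randint(0, 3, size=(H, W))     # per-pixel class ids (stuff + thing classes)
instances = [                                        # (score, class_id, binary mask)
    (0.9, 2, np.zeros((H, W), bool)),
    (0.7, 2, np.zeros((H, W), bool)),
]
instances[0][2][:2, :3] = True
instances[1][2][2:, 3:] = True

def merge_panoptic(semantic, instances, overlap_thresh=0.5):
    panoptic_class = semantic.copy()
    panoptic_id = np.zeros_like(semantic)            # 0 = stuff / no instance
    claimed = np.zeros(semantic.shape, bool)
    for inst_id, (score, cls, mask) in enumerate(
            sorted(instances, reverse=True, key=lambda t: t[0]), start=1):
        free = mask & ~claimed
        if mask.sum() == 0 or free.sum() / mask.sum() < overlap_thresh:
            continue                                 # mostly occluded by a higher-scoring instance
        panoptic_class[free], panoptic_id[free] = cls, inst_id
        claimed |= free
    return panoptic_class, panoptic_id

cls_map, id_map = merge_panoptic(semantic, instances)
print(id_map)
```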

Drawing early-bird tickets: Towards more efficient training of deep networks

Title Drawing early-bird tickets: Towards more efficient training of deep networks
Authors Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin
Abstract (Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical subnetworks) for dense, randomly initialized networks that can be trained alone to achieve comparable accuracies to the latter in a similar number of iterations. However, the identification of these winning tickets still requires the costly train-prune-retrain process, limiting their practical benefits. In this paper, we discover for the first time that the winning tickets can be identified at the very early training stage, which we term early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates. Our finding of EB tickets is consistent with recently reported observations that the key connectivity patterns of neural networks emerge early. Furthermore, we propose a mask distance metric that can be used to identify EB tickets with low computational overhead, without needing to know the true winning tickets that emerge after the full training. Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods, which are achieved by first identifying EB tickets via low-cost schemes, and then continuing to train only the EB tickets towards the target accuracy. Experiments based on various deep networks and datasets validate: 1) the existence of EB tickets, and the effectiveness of mask distance in efficiently identifying them; and 2) that the proposed efficient training via EB tickets can achieve up to 4.7x energy savings while maintaining comparable or even better accuracy, demonstrating a promising and easily adopted method for tackling cost-prohibitive deep network training. Code available at https://github.com/RICE-EIC/Early-Bird-Tickets.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.11957v3
PDF https://arxiv.org/pdf/1909.11957v3.pdf
PWC https://paperswithcode.com/paper/drawing-early-bird-tickets-towards-more
Repo https://github.com/RICE-EIC/Early-Bird-Tickets
Framework pytorch
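
A hedged sketch of the mask-distance idea: prune a fixed fraction of channels by batch-norm scale magnitude at each epoch and measure the Hamming distance between consecutive binary masks; when the distance stabilizes below a threshold, an early-bird ticket is drawn. The thresholds, single-vector pruning, and synthetic "training" dynamics are assumptions, not the paper's implementation.

```python
import numpy as np

def channel_mask(bn_scales, prune_ratio=0.5):
    """Binary keep-mask over channels, pruning the smallest |gamma| values."""
    k = int(len(bn_scales) * prune_ratio)
    threshold = np.sort(np.abs(bn_scales))[k]
    return np.abs(bn_scales) >= threshold

def mask_distance(mask_a, mask_b):
    """Normalized Hamming distance between two binary pruning masks."""
    return np.mean(mask_a != mask_b)

rng = np.random.default_rng(0)
scales = rng.random(256)                            # stand-in for batch-norm scaling factors
prev_mask, eb_epoch = None, None
for epoch in range(30):
    # Synthetic dynamics: the scales (and hence the masks) settle over time.
    scales += 0.01 * rng.standard_normal(256) * max(0.0, 1 - epoch / 10)
    mask = channel_mask(scales)
    if prev_mask is not None and mask_distance(mask, prev_mask) < 0.01:
        eb_epoch = epoch
        break
    prev_mask = mask

print("early-bird ticket drawn at epoch:", eb_epoch)
```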

Fine-grained Sentiment Classification using BERT

Title Fine-grained Sentiment Classification using BERT
Authors Manish Munikar, Sushil Shakya, Aakash Shrestha
Abstract Sentiment classification is an important process in understanding people’s perception towards a product, service, or topic. Many natural language processing models have been proposed to solve the sentiment classification problem. However, most of them have focused on binary sentiment classification. In this paper, we use a promising deep learning model called BERT to solve the fine-grained sentiment classification task. Experiments show that our model outperforms other popular models for this task without sophisticated architecture. We also demonstrate the effectiveness of transfer learning in natural language processing in the process.
Tasks Sentiment Analysis, Transfer Learning
Published 2019-10-04
URL https://arxiv.org/abs/1910.03474v1
PDF https://arxiv.org/pdf/1910.03474v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-sentiment-classification-using
Repo https://github.com/munikarmanish/bert-sentiment
Framework pytorch
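
A minimal illustration (not the authors' exact training setup) of fine-tuning BERT for 5-class fine-grained sentiment classification with the Hugging Face `transformers` library; the SST-5-style label convention and hyperparameters below are assumptions.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=5)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["An astonishing, moving film.", "Dull and far too long."]
labels = torch.tensor([4, 1])                      # 0 = very negative ... 4 = very positive

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer.zero_grad()
outputs = model(**batch, labels=labels)            # cross-entropy loss over 5 classes
outputs.loss.backward()
optimizer.step()
print(outputs.logits.argmax(dim=-1))               # predicted sentiment classes
```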

Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling

Title Memory Bounded Open-Loop Planning in Large POMDPs using Thompson Sampling
Authors Thomy Phan, Lenz Belzner, Marie Kiermeier, Markus Friedrich, Kyrill Schmid, Claudia Linnhoff-Popien
Abstract State-of-the-art approaches to partially observable planning like POMCP are based on stochastic tree search. While these approaches are computationally efficient, they may still construct search trees of considerable size, which could limit the performance due to restricted memory resources. In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory bounded approach to open-loop planning in large POMDPs, which optimizes a fixed size stack of Thompson Sampling bandits. We empirically evaluate POSTS in four large benchmark problems and compare its performance with different tree-based approaches. We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly restricted computational and memory resources.
Tasks
Published 2019-05-10
URL https://arxiv.org/abs/1905.04020v1
PDF https://arxiv.org/pdf/1905.04020v1.pdf
PWC https://paperswithcode.com/paper/memory-bounded-open-loop-planning-in-large
Repo https://github.com/thomyphan/planning
Framework none
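
A hedged sketch of open-loop planning with a fixed-size stack of Thompson Sampling bandits, one bandit per plan depth, loosely inspired by the idea above but not the authors' POSTS algorithm: a plan is sampled from per-depth Gaussian posteriors over action returns, evaluated in a black-box simulator, and the posteriors are updated with the observed return. The toy environment and the simple running-mean update are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, horizon, n_iters = 3, 4, 500

# Per-depth, per-action posterior over returns: running mean and visit count.
means = np.zeros((horizon, n_actions))
counts = np.ones((horizon, n_actions))

def simulate(plan):
    """Toy black-box environment: action 0 is best early, action 2 late."""
    return sum((1.0 if (a == 0 and d < 2) or (a == 2 and d >= 2) else 0.0)
               for d, a in enumerate(plan)) + rng.normal(0, 0.3)

for _ in range(n_iters):
    # Thompson step: sample a value for each (depth, action), pick argmax per depth.
    samples = rng.normal(means, 1.0 / np.sqrt(counts))
    plan = samples.argmax(axis=1)
    ret = simulate(plan)
    for d, a in enumerate(plan):                    # incremental posterior-mean update
        counts[d, a] += 1
        means[d, a] += (ret - means[d, a]) / counts[d, a]

print("greedy open-loop plan:", means.argmax(axis=1))  # expected: [0, 0, 2, 2]
```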

Self-Attention for Raw Optical Satellite Time Series Classification

Title Self-Attention for Raw Optical Satellite Time Series Classification
Authors Marc Rußwurm, Marco Körner
Abstract Deep learning methods have received increasing interest by the remote sensing community for multi-temporal land cover classification in recent years. Convolutional neural networks, which elementwise compare a time series with learned kernels, and recurrent neural networks, which sequentially process temporal data, have dominated the state-of-the-art in the classification of vegetation from satellite time series. Self-attention allows a neural network to selectively extract features from specific times in the input sequence, thus suppressing information that is not relevant for classification. Today, self-attention based neural networks dominate the state-of-the-art in natural language processing but are hardly explored and tested in the remote sensing context. In this work, we embed self-attention in the canon of deep learning mechanisms for satellite time series classification for vegetation modeling and crop type identification. We compare it quantitatively to convolution and recurrence, and test four models, each relying exclusively on one of these mechanisms. The models are trained to identify the type of vegetation on crop parcels using raw and preprocessed Sentinel 2 time series over one entire year. To obtain an objective measure, we find the best possible performance for each of the models by a large-scale hyperparameter search with more than 2400 validation runs. Beyond the quantitative comparison, we qualitatively analyze the models by an easy-to-implement yet effective feature importance analysis based on gradient back-propagation that exploits the differentiable nature of deep learning models. Finally, we look into the self-attention transformer model and visualize attention scores as bipartite graphs in the context of the input time series and a low-dimensional representation of internal hidden states using t-distributed stochastic neighbor embedding (t-SNE).
Tasks Feature Importance, Time Series, Time Series Classification
Published 2019-10-23
URL https://arxiv.org/abs/1910.10536v1
PDF https://arxiv.org/pdf/1910.10536v1.pdf
PWC https://paperswithcode.com/paper/self-attention-for-raw-optical-satellite-time
Repo https://github.com/marccoru/crop-type-mapping
Framework pytorch
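
A minimal sketch of the two ingredients described above under toy assumptions (random data, arbitrary shapes, standard PyTorch layers rather than the paper's models): a self-attention classifier over a multispectral time series, and a gradient-based feature importance signal taken as the input-gradient magnitude of the predicted class score per observation time.

```python
import torch
import torch.nn as nn

T, bands, n_classes, d_model = 45, 13, 10, 64      # ~45 observations, 13 spectral bands

class AttentionClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(bands, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                           # x: (batch, T, bands)
        h = self.encoder(self.embed(x))
        return self.head(h.mean(dim=1))             # pool over time, then classify

model = AttentionClassifier()
x = torch.randn(2, T, bands, requires_grad=True)
logits = model(x)
# Back-propagate the predicted class score to the raw input.
logits.gather(1, logits.argmax(dim=1, keepdim=True)).sum().backward()
importance = x.grad.abs().sum(dim=-1)               # (batch, T): per-time-step saliency
print(importance.shape)
```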

MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation

Title MASC: Multi-scale Affinity with Sparse Convolution for 3D Instance Segmentation
Authors Chen Liu, Yasutaka Furukawa
Abstract We propose a new approach for 3D instance segmentation based on sparse convolution and point affinity prediction, which indicates the likelihood of two points belonging to the same instance. The proposed network, built upon submanifold sparse convolution [3], processes a voxelized point cloud and predicts semantic scores for each occupied voxel as well as the affinity between neighboring voxels at different scales. A simple yet effective clustering algorithm segments points into instances based on the predicted affinity and the mesh topology. The semantic class for each instance is determined by the semantic prediction. Experiments show that our method outperforms the state-of-the-art instance segmentation methods by a large margin on the widely used ScanNet benchmark [2]. We share our code publicly at https://github.com/art-programmer/MASC.
Tasks 3D Instance Segmentation, Instance Segmentation, Semantic Segmentation
Published 2019-02-12
URL http://arxiv.org/abs/1902.04478v1
PDF http://arxiv.org/pdf/1902.04478v1.pdf
PWC https://paperswithcode.com/paper/masc-multi-scale-affinity-with-sparse
Repo https://github.com/art-programmer/MASC
Framework pytorch
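
A toy illustration of the affinity-based clustering step described above: neighboring occupied voxels are merged into the same instance when their predicted affinity exceeds a threshold, here via a plain union-find; the paper's actual algorithm is multi-scale and mesh-aware, and the affinities below are made up.

```python
parent = {}

def find(v):
    parent.setdefault(v, v)
    while parent[v] != v:
        parent[v] = parent[parent[v]]               # path compression
        v = parent[v]
    return v

def union(a, b):
    parent[find(a)] = find(b)

# (voxel_a, voxel_b, predicted_affinity) for pairs of neighboring voxels
affinities = [((0, 0, 0), (0, 0, 1), 0.95),
              ((0, 0, 1), (0, 1, 1), 0.90),
              ((5, 5, 5), (5, 5, 6), 0.97),
              ((0, 1, 1), (5, 5, 5), 0.10)]         # low affinity: different instances

for va, vb, aff in affinities:
    if aff > 0.5:
        union(va, vb)

voxels = sorted({v for va, vb, _ in affinities for v in (va, vb)})
labels = {v: find(v) for v in voxels}
print(labels)   # two clusters: {(0,0,*), (0,1,1)} and {(5,5,5), (5,5,6)}
```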

On the separation of shape and temporal patterns in time series -Application to signature authentication-

Title On the separation of shape and temporal patterns in time series -Application to signature authentication-
Authors Pierre-François Marteau
Abstract In this article we address the problem of separation of shape and time components in time series. The concept of shape that we tackle is termed temporally neutral to consider that it may possibly exist outside of any temporal specification, as is the case for a geometric form. We propose to exploit and adapt a probabilistic temporal alignment algorithm, initially designed to estimate the centroid of a set of time series, to build some heuristic elements of solution to this separation problem. We show on some controlled synthetic data that this algorithm meets empirically our initial objectives. We finally evaluate it on real data, in the context of some on-line handwritten signature authentication benchmarks. On the three evaluated tasks, our approach based on the separation of signature shape and associated temporal patterns is positioned slightly above the current state of the art, demonstrating the practical benefit of this separation.
Tasks Time Series
Published 2019-11-21
URL https://arxiv.org/abs/1911.09360v2
PDF https://arxiv.org/pdf/1911.09360v2.pdf
PWC https://paperswithcode.com/paper/on-the-separation-of-shape-and-temporal
Repo https://github.com/pfmarteau/ShapeTimeSeparation
Framework none

An Adversarial Approach to Private Flocking in Mobile Robot Teams

Title An Adversarial Approach to Private Flocking in Mobile Robot Teams
Authors Hehui Zheng, Jacopo Panerati, Giovanni Beltrame, Amanda Prorok
Abstract Privacy is an important facet of defence against adversaries. In this letter, we introduce the problem of private flocking. We consider a team of mobile robots flocking in the presence of an adversary, who is able to observe all robots’ trajectories, and who is interested in identifying the leader. We present a method that generates private flocking controllers that hide the identity of the leader robot. Our approach towards privacy leverages a data-driven adversarial co-optimization scheme. We design a mechanism that optimizes flocking control parameters, such that leader inference is hindered. As the flocking performance improves, we successively train an adversarial discriminator that tries to infer the identity of the leader robot. To evaluate the performance of our co-optimization scheme, we investigate different classes of reference trajectories. Although it is reasonable to assume that there is an inherent trade-off between flocking performance and privacy, our results demonstrate that we are able to achieve high flocking performance and simultaneously reduce the risk of revealing the leader.
Tasks
Published 2019-09-23
URL https://arxiv.org/abs/1909.10387v3
PDF https://arxiv.org/pdf/1909.10387v3.pdf
PWC https://paperswithcode.com/paper/190910387
Repo https://github.com/proroklab/private_flocking
Framework pytorch
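
A schematic sketch of the adversarial co-optimization loop described above, with toy stand-ins for the simulator, the flocking cost, and the adversary (here a fixed heuristic rather than a trained discriminator); none of this reflects the authors' implementation. The controller parameters are updated by a simple random search that trades off flocking quality against the adversary's leader-identification accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)
n_robots, leader = 5, 2

def simulate(params):
    """Toy trajectories (n_robots, T): a larger params[0] makes the leader stand out less."""
    base = rng.normal(0, 1, (n_robots, 50)).cumsum(axis=1)
    base[leader] += (2.0 - params[0])              # leader bias shrinks as params[0] grows
    return base

def flocking_cost(traj):
    return np.var(traj[:, -1]) * 0.1               # spread of final positions (toy cohesion cost)

def adversary_accuracy(traj_batch):
    """Heuristic adversary: guess the leader as the robot with the largest mean offset."""
    guesses = [np.abs(t.mean(axis=1)).argmax() for t in traj_batch]
    return np.mean([g == leader for g in guesses])

params = np.array([0.0])
for _ in range(20):
    # Controller step: random-search update balancing flocking cost and privacy.
    candidates = params + rng.normal(0, 0.3, size=(8, 1))
    scores = []
    for c in candidates:
        trajs = [simulate(c) for _ in range(16)]
        scores.append(np.mean([flocking_cost(t) for t in trajs]) + adversary_accuracy(trajs))
    params = candidates[int(np.argmin(scores))]

print("optimized parameter:", params,
      "leader id accuracy:", adversary_accuracy([simulate(params) for _ in range(100)]))
```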

Signed Distance-based Deep Memory Recommender

Title Signed Distance-based Deep Memory Recommender
Authors Thanh Tran, Xinyue Liu, Kyumin Lee, Xiangnan Kong
Abstract Personalized recommendation algorithms learn a user’s preference for an item by measuring a distance/similarity between them. However, some of the existing recommendation models (e.g., matrix factorization) assume a linear relationship between the user and item. This approach limits the capacity of recommender systems, since the interactions between users and items in real-world applications are much more complex than the linear relationship. To overcome this limitation, in this paper, we design and propose a deep learning framework called Signed Distance-based Deep Memory Recommender, which captures non-linear relationships between users and items explicitly and implicitly, and works well in both the general recommendation task and the shopping basket-based recommendation task. Through an extensive empirical study on six real-world datasets in the two recommendation tasks, our proposed approach achieved significant improvement over ten state-of-the-art recommendation models.
Tasks Recommendation Systems
Published 2019-05-01
URL http://arxiv.org/abs/1905.00453v1
PDF http://arxiv.org/pdf/1905.00453v1.pdf
PWC https://paperswithcode.com/paper/signed-distance-based-deep-memory-recommender
Repo https://github.com/thanhdtran/SDMR
Framework pytorch
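
A simplified sketch of distance-based recommendation in the spirit of the abstract, not the paper's SDMR architecture: users and items get non-linear embeddings, an item is scored by its negative distance to the user in the transformed space, and a BPR-style loss pushes observed items closer to the user than sampled negatives. Sizes and the single shared MLP are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_users, n_items, dim = 100, 500, 32
user_emb, item_emb = nn.Embedding(n_users, dim), nn.Embedding(n_items, dim)
mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))  # non-linear transform
params = list(user_emb.parameters()) + list(item_emb.parameters()) + list(mlp.parameters())
opt = torch.optim.Adam(params, lr=1e-3)

def score(users, items):
    """Higher is better: negative Euclidean distance in the transformed space."""
    return -torch.norm(mlp(user_emb(users)) - mlp(item_emb(items)), dim=-1)

users = torch.randint(0, n_users, (64,))
pos_items = torch.randint(0, n_items, (64,))       # observed interactions
neg_items = torch.randint(0, n_items, (64,))       # sampled negatives

loss = -F.logsigmoid(score(users, pos_items) - score(users, neg_items)).mean()
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```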

Scalable Global Optimization via Local Bayesian Optimization

Title Scalable Global Optimization via Local Bayesian Optimization
Authors David Eriksson, Michael Pearce, Jacob R Gardner, Ryan Turner, Matthias Poloczek
Abstract Bayesian optimization has recently emerged as a popular method for the sample-efficient optimization of expensive black-box functions. However, the application to high-dimensional problems with several thousand observations remains challenging, and on difficult problems Bayesian optimization is often not competitive with other paradigms. In this paper we take the view that this is due to the implicit homogeneity of the global probabilistic models and an overemphasized exploration that results from global acquisition. This motivates the design of a local probabilistic approach for global optimization of large-scale high-dimensional problems. We propose the $\texttt{TuRBO}$ algorithm that fits a collection of local models and performs a principled global allocation of samples across these models via an implicit bandit approach. A comprehensive evaluation demonstrates that $\texttt{TuRBO}$ outperforms state-of-the-art methods from machine learning and operations research on problems spanning reinforcement learning, robotics, and the natural sciences.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01739v4
PDF https://arxiv.org/pdf/1910.01739v4.pdf
PWC https://paperswithcode.com/paper/scalable-global-optimization-via-local
Repo https://github.com/uber-research/TuRBO
Framework pytorch
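
A hedged sketch of the trust-region bookkeeping behind a TuRBO-style local search: candidates are proposed inside a box around the incumbent, the box expands after consecutive successes and shrinks after consecutive failures, and the run restarts when the box collapses. The GP surrogate and bandit allocation of the real algorithm are replaced here by plain random sampling inside a single trust region; all thresholds are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sum((x - 0.7) ** 2)               # toy objective on [0, 1]^d
dim, batch = 10, 20

def turbo_like(n_steps=100):
    x_best, f_best, length = rng.random(dim), np.inf, 0.4
    successes = failures = 0
    for _ in range(n_steps):
        if length < 1e-3:                          # trust region collapsed: restart
            x_best, f_best, length = rng.random(dim), np.inf, 0.4
            successes = failures = 0
        cand = np.clip(x_best + length * (rng.random((batch, dim)) - 0.5), 0, 1)
        vals = np.array([f(c) for c in cand])
        if vals.min() < f_best:                    # success: a candidate improved the incumbent
            x_best, f_best = cand[vals.argmin()], vals.min()
            successes, failures = successes + 1, 0
        else:                                      # failure: no improvement in this batch
            successes, failures = 0, failures + 1
        if successes >= 3:
            length, successes = min(2 * length, 1.0), 0
        elif failures >= 5:
            length, failures = length / 2, 0
    return x_best, f_best

print(turbo_like()[1])
```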

Resource Efficient 3D Convolutional Neural Networks

Title Resource Efficient 3D Convolutional Neural Networks
Authors Okan Köpüklü, Neslihan Kose, Ahmet Gunduz, Gerhard Rigoll
Abstract Recently, convolutional neural networks with 3D kernels (3D CNNs) have become very popular in the computer vision community as a result of their superior ability to extract spatio-temporal features within video frames compared to 2D CNNs. Although there have been great advances recently in building resource-efficient 2D CNN architectures considering memory and power budget, there are hardly any similar resource-efficient architectures for 3D CNNs. In this paper, we have converted various well-known resource-efficient 2D CNNs to 3D CNNs and evaluated their performance on three major benchmarks in terms of classification accuracy for different complexity levels. We have experimented on (1) the Kinetics-600 dataset to inspect their capacity to learn, (2) the Jester dataset to inspect their ability to capture motion patterns, and (3) UCF-101 to inspect the applicability of transfer learning. We have evaluated the run-time performance of each model on a single Titan XP GPU and a Jetson TX2 embedded system. The results of this study show that these models can be utilized for different types of real-world applications since they provide real-time performance with considerable accuracies and memory usage. Our analysis of different complexity levels shows that resource-efficient 3D CNNs should not be designed too shallow or narrow in order to save complexity. The codes and pretrained models used in this work are publicly available.
Tasks Action Recognition In Videos, Transfer Learning
Published 2019-04-04
URL https://arxiv.org/abs/1904.02422v4
PDF https://arxiv.org/pdf/1904.02422v4.pdf
PWC https://paperswithcode.com/paper/resource-efficient-3d-convolutional-neural
Repo https://github.com/okankop/Efficient-3DCNNs
Framework pytorch
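
An illustrative 3D counterpart of a MobileNet-style depthwise-separable block, in the spirit of "converting efficient 2D CNNs to 3D" described above; the channel counts and clip shape are arbitrary and do not correspond to the paper's exact models.

```python
import torch
import torch.nn as nn

class DepthwiseSeparable3d(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)   # spatio-temporal 3x3x3
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn1, self.bn2 = nn.BatchNorm3d(in_ch), nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                          # x: (batch, channels, frames, H, W)
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

block = DepthwiseSeparable3d(16, 32, stride=2)
clip = torch.randn(2, 16, 16, 112, 112)            # 16-frame clip at 112x112
print(block(clip).shape)                           # torch.Size([2, 32, 8, 56, 56])
```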

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

Title FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction
Authors Shuyang Sun, Jiangmiao Pang, Jianping Shi, Shuai Yi, Wanli Ouyang
Abstract The basic principles in designing convolutional neural network (CNN) structures for predicting objects on different levels, e.g., image-level, region-level, and pixel-level, are diverging. Generally, network structures designed specifically for image classification are directly used as the default backbone structure for other tasks including detection and segmentation, but few backbone structures are designed with the goal of unifying the advantages of networks designed for pixel-level or region-level prediction tasks, which may require very deep features with high resolution. Towards this goal, we design a fish-like network, called FishNet. In FishNet, the information of all resolutions is preserved and refined for the final task. Besides, we observe that existing works still cannot directly propagate the gradient information from deep layers to shallow layers. Our design can better handle this problem. Extensive experiments have been conducted to demonstrate the remarkable performance of FishNet. In particular, on ImageNet-1k, the accuracy of FishNet is able to surpass the performance of DenseNet and ResNet with fewer parameters. FishNet was applied as one of the modules in the winning entry of the COCO Detection 2018 challenge. The code is available at https://github.com/kevin-ssy/FishNet.
Tasks Image Classification
Published 2019-01-11
URL http://arxiv.org/abs/1901.03495v1
PDF http://arxiv.org/pdf/1901.03495v1.pdf
PWC https://paperswithcode.com/paper/fishnet-a-versatile-backbone-for-image-region
Repo https://github.com/kevin-ssy/FishNet
Framework pytorch