July 29, 2019


Paper Group AWR 204

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents. Towards Real-Time Advancement of Underwater Visual Quality with GAN. PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning. Bidirectional deep-readout echo state networks. PWLS-ULTRA: An Efficient Clustering an …

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents


Title	Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Authors	Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth O. Stanley, Jeff Clune
Abstract	Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms that have been invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES to achieve higher performance on Atari and simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES.
Tasks	Policy Gradient Methods, Q-Learning
Published	2017-12-18
URL	http://arxiv.org/abs/1712.06560v3
PDF	http://arxiv.org/pdf/1712.06560v3.pdf
PWC	https://paperswithcode.com/paper/improving-exploration-in-evolution-strategies
Repo	https://github.com/uber-common/deep-neuroevolution
Framework	tf
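
The abstract does not spell out how novelty is scored or how it enters the ES update. The sketch below is only an illustration under stated assumptions: novelty is taken to be the mean distance from an agent's behavior vector to its k nearest neighbors in an archive (the usual novelty search definition), and that score is plugged into a generic ES-style update in place of reward. The function names, the `k` default, and the score standardization are hypothetical choices, not taken from the paper or its repository.

```python
import numpy as np

def novelty(behavior, archive, k=10):
    """Mean Euclidean distance from `behavior` to its k nearest archive entries."""
    archive = np.asarray(archive, dtype=float)
    dists = np.linalg.norm(archive - np.asarray(behavior, dtype=float), axis=1)
    k = min(k, len(dists))  # the archive may hold fewer than k entries early on
    return float(np.sort(dists)[:k].mean())

def es_update_direction(scores, noise, sigma):
    """ES-style estimate: score-weighted average of the sampled perturbations.

    scores: one scalar per perturbed agent (here: novelty instead of reward).
    noise:  array of shape (n_agents, n_params) with the perturbations used.
    sigma:  standard deviation of the perturbations.
    """
    scores = np.asarray(scores, dtype=float)
    # Assumption: standardize scores so the step size does not depend on their scale.
    weights = (scores - scores.mean()) / (scores.std() + 1e-8)
    return weights @ noise / (len(scores) * sigma)
```

A quality diversity variant could pass a blend of reward and novelty as `scores`; how that blend is weighted is left open here.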

Towards Real-Time Advancement of Underwater Visual Quality with GAN


Title	Towards Real-Time Advancement of Underwater Visual Quality with GAN
Authors	Xingyu Chen, Junzhi Yu, Shihan Kong, Zhengxing Wu, Xi Fang, Li Wen
Abstract	Low visual quality has kept underwater robotic vision from a wide range of applications. Although several algorithms have been developed, real-time and adaptive methods are still lacking for real-world tasks. In this paper, we address this difficulty with generative adversarial networks (GAN) and propose a GAN-based restoration scheme (GAN-RS). In particular, we develop a multi-branch discriminator including an adversarial branch and a critic branch for the purpose of simultaneously preserving image content and removing underwater noise. In addition to adversarial learning, a novel dark channel prior loss also encourages the generator to produce realistic images. More specifically, an underwater index is investigated to describe underwater properties, and a loss function based on the underwater index is designed to train the critic branch for underwater noise suppression. Through extensive comparisons on visual quality and feature restoration, we confirm the superiority of the proposed approach. Consequently, GAN-RS can adaptively improve underwater visual quality in real time and yields overall superior restoration performance. Finally, a real-world experiment is conducted on the seabed for grasping marine products, and the results are quite promising. The source code is publicly available at https://github.com/SeanChenxy/GAN_RS.
Tasks
Published	2017-12-03
URL	https://arxiv.org/abs/1712.00736v4
PDF	https://arxiv.org/pdf/1712.00736v4.pdf
PWC	https://paperswithcode.com/paper/towards-quality-advancement-of-underwater
Repo	https://github.com/SeanChenxy/GAN_RS
Framework	pytorch

PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning


Title	PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
Authors	Arun Mallya, Svetlana Lazebnik
Abstract	This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting. Inspired by network pruning techniques, we exploit redundancies in large deep networks to free up parameters that can then be employed to learn new tasks. By performing iterative pruning and network re-training, we are able to sequentially “pack” multiple tasks into a single network while ensuring minimal drop in performance and minimal storage overhead. Unlike prior work that uses proxy losses to maintain accuracy on older tasks, we always optimize for the task at hand. We perform extensive experiments on a variety of network architectures and large-scale datasets, and observe much better robustness against catastrophic forgetting than prior work. In particular, we are able to add three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieve accuracies close to those of separately trained networks for each task. Code available at https://github.com/arunmallya/packnet
Tasks	Network Pruning
Published	2017-11-15
URL	http://arxiv.org/abs/1711.05769v2
PDF	http://arxiv.org/pdf/1711.05769v2.pdf
PWC	https://paperswithcode.com/paper/packnet-adding-multiple-tasks-to-a-single
Repo	https://github.com/arunmallya/packnet
Framework	pytorch
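
The pruning-and-packing idea in the abstract can be sketched as mask bookkeeping. This is a minimal illustration, assuming magnitude-based pruning and an integer `owner` array that records which task each weight belongs to (0 meaning free). The function names, the `owner` encoding, and the 50% default are hypothetical and do not come from the PackNet code.

```python
import numpy as np

def prune_for_task(weights, owner, task_id, prune_fraction=0.5):
    """Free the smallest-magnitude weights currently owned by `task_id`.

    owner[i] == t means weight i is reserved for task t (t >= 1);
    owner[i] == 0 means weight i is free for a future task.
    """
    w = weights.ravel().copy()
    o = owner.ravel().copy()
    mine = np.flatnonzero(o == task_id)
    n_prune = int(len(mine) * prune_fraction)
    # This task's weight indices, sorted by ascending magnitude.
    order = mine[np.argsort(np.abs(w[mine]), kind="stable")]
    freed = order[:n_prune]
    w[freed] = 0.0   # pruned weights are zeroed ...
    o[freed] = 0     # ... and released for later tasks
    return w.reshape(weights.shape), o.reshape(owner.shape)

def assign_free_weights(owner, task_id):
    """Hand every free weight to the next task before it is trained."""
    owner = owner.copy()
    owner[owner == 0] = task_id
    return owner
```

During training of a later task, only the weights it owns would be updated, while weights owned by earlier tasks stay fixed; that training loop is omitted here.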

Bidirectional deep-readout echo state networks


Title	Bidirectional deep-readout echo state networks
Authors	Filippo Maria Bianchi, Simone Scardapane, Sigurd Løkse, Robert Jenssen
Abstract	We propose a deep architecture for the classification of multivariate time series. By means of a recurrent and untrained reservoir we generate a vectorial representation that embeds temporal relationships in the data. To improve the memorization capability, we implement a bidirectional reservoir, whose last state also captures past dependencies in the input. We apply dimensionality reduction to the final reservoir states to obtain compressed fixed-size representations of the time series. These are subsequently fed into a deep feedforward network trained to perform the final classification. We test our architecture on benchmark datasets and on a real-world use case of blood sample classification. Results show that our method performs better than a standard echo state network and, at the same time, achieves results comparable to a fully-trained recurrent network, but with faster training.
Tasks	Dimensionality Reduction, Time Series
Published	2017-11-17
URL	http://arxiv.org/abs/1711.06509v3
PDF	http://arxiv.org/pdf/1711.06509v3.pdf
PWC	https://paperswithcode.com/paper/bidirectional-deep-readout-echo-state
Repo	https://github.com/FilippoMB/Bidirectional-Deep-Echo-State-Network
Framework	tf
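
As a rough illustration of the bidirectional reservoir described in the abstract, the sketch below runs one fixed random reservoir over a series in both time directions and concatenates the two final states. The reservoir size, spectral radius, tanh nonlinearity, and the choice to reuse the same weights in both directions are assumptions made for this example; the dimensionality reduction and feedforward classifier that follow in the paper are not shown.

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Draw fixed (untrained) input and recurrent weights."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-1.0, 1.0, size=(n_units, n_inputs))
    w = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def last_state(series, w_in, w):
    """Run the reservoir over a (time, features) array and return its final state."""
    state = np.zeros(w.shape[0])
    for x_t in series:
        state = np.tanh(w_in @ x_t + w @ state)
    return state

def bidirectional_representation(series, w_in, w):
    """Concatenate the final states of the forward and time-reversed passes."""
    forward = last_state(series, w_in, w)
    backward = last_state(series[::-1], w_in, w)
    return np.concatenate([forward, backward])
```

The output length is twice the reservoir size regardless of how long the series is, which is what makes a fixed-size classifier input possible.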

PWLS-ULTRA: An Efficient Clustering and Learning-Based Approach for Low-Dose 3D CT Image Reconstruction


Title	PWLS-ULTRA: An Efficient Clustering and Learning-Based Approach for Low-Dose 3D CT Image Reconstruction
Authors	Xuehang Zheng, Saiprasad Ravishankar, Yong Long, Jeffrey A. Fessler
Abstract	The development of computed tomography (CT) image reconstruction methods that significantly reduce patient radiation exposure while maintaining high image quality is an important area of research in low-dose CT (LDCT) imaging. We propose a new penalized weighted least squares (PWLS) reconstruction method that exploits regularization based on an efficient Union of Learned TRAnsforms (PWLS-ULTRA). The union of square transforms is pre-learned from numerous image patches extracted from a dataset of CT images or volumes. The proposed PWLS-based cost function is optimized by alternating between a CT image reconstruction step, and a sparse coding and clustering step. The CT image reconstruction step is accelerated by a relaxed linearized augmented Lagrangian method with ordered-subsets that reduces the number of forward and back projections. Simulations with 2-D and 3-D axial CT scans of the extended cardiac-torso phantom and 3D helical chest and abdomen scans show that for both normal-dose and low-dose levels, the proposed method significantly improves the quality of reconstructed images compared to PWLS reconstruction with a nonadaptive edge-preserving regularizer (PWLS-EP). PWLS with regularization based on a union of learned transforms leads to better image reconstructions than using a single learned square transform. We also incorporate patch-based weights in PWLS-ULTRA that enhance image quality and help improve image resolution uniformity. The proposed approach achieves comparable or better image quality compared to learned overcomplete synthesis dictionaries, but importantly, is much faster (computationally more efficient).
Tasks	Computed Tomography (CT), Image Reconstruction
Published	2017-03-27
URL	http://arxiv.org/abs/1703.09165v3
PDF	http://arxiv.org/pdf/1703.09165v3.pdf
PWC	https://paperswithcode.com/paper/pwls-ultra-an-efficient-clustering-and
Repo	https://github.com/xuehangzheng/PWLS-ULTRA-for-Low-Dose-3D-CT-Image-Reconstruction
Framework	none
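
For orientation, a penalized weighted least squares objective with a union-of-transforms regularizer can be written as below. The notation is an assumption introduced here, not copied from the paper: $y$ is the measured sinogram, $A$ the system matrix, $W$ a diagonal statistical weighting matrix, $P_j$ the operator extracting the $j$-th patch, $\Omega_k$ the $k$-th learned square transform, $C_k$ the set of patches assigned to it, and $z_j$ the sparse code of patch $j$. The $\ell_0$ sparsity penalty with weight $\gamma^2$ is likewise an assumed form.

```latex
\hat{x} = \arg\min_{x \ge 0} \; \frac{1}{2} \| y - A x \|_{W}^{2} + \beta \, R(x),
\qquad
R(x) = \min_{\{C_k\}, \{z_j\}} \sum_{k=1}^{K} \sum_{j \in C_k}
  \left( \| \Omega_k P_j x - z_j \|_2^2 + \gamma^2 \| z_j \|_0 \right)
```

Written this way, the alternation in the abstract is visible: with $C_k$ and $z_j$ fixed the problem in $x$ is a reconstruction step, and with $x$ fixed the inner minimization is a sparse coding and clustering step.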

Synthesis of Positron Emission Tomography (PET) Images via Multi-channel Generative Adversarial Networks (GANs)


Title	Synthesis of Positron Emission Tomography (PET) Images via Multi-channel Generative Adversarial Networks (GANs)
Authors	Lei Bi, Jinman Kim, Ashnil Kumar, Dagan Feng, Michael Fulham
Abstract	Positron emission tomography (PET) image synthesis plays an important role because it can be used to augment the training data for computer-aided diagnosis systems. However, existing image synthesis methods have problems in synthesizing low-resolution PET images. To address these limitations, we propose a PET image synthesis method based on multi-channel generative adversarial networks (M-GAN). Unlike existing methods, which rely on low-level features, the proposed M-GAN is capable of representing features at a high semantic level based on the adversarial learning concept. In addition, M-GAN takes input from the annotation (label) to synthesize the high-uptake regions, e.g., tumors, and from the computed tomography (CT) images to constrain appearance consistency, and outputs the synthetic PET images directly. Our results on 50 lung cancer PET-CT studies indicate that images synthesized by our method were much closer to the real PET images than those of existing methods.
Tasks	Computed Tomography (CT), Image Generation
Published	2017-07-31
URL	http://arxiv.org/abs/1707.09747v1
PDF	http://arxiv.org/pdf/1707.09747v1.pdf
PWC	https://paperswithcode.com/paper/synthesis-of-positron-emission-tomography-pet
Repo	https://github.com/ChengBinJin/SpineC2M
Framework	tf

Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis


Title	Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis
Authors	Björn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, Michael Wojatzki
Abstract	Some users of social media are spreading racist, sexist, and otherwise hateful content. For the purpose of training a hate speech detection system, the reliability of the annotations is crucial, but there is no universally agreed-upon definition. We collected potentially hateful messages and asked two groups of internet users to determine whether they were hate speech or not, whether they should be banned or not and to rate their degree of offensiveness. One of the groups was shown a definition prior to completing the survey. We aimed to assess whether hate speech can be annotated reliably, and the extent to which existing definitions are in accordance with subjective ratings. Our results indicate that showing users a definition caused them to partially align their own opinion with the definition but did not improve reliability, which was very low overall. We conclude that the presence of hate speech should perhaps not be considered a binary yes-or-no decision, and raters need more detailed instructions for the annotation.
Tasks	Hate Speech Detection
Published	2017-01-27
URL	http://arxiv.org/abs/1701.08118v1
PDF	http://arxiv.org/pdf/1701.08118v1.pdf
PWC	https://paperswithcode.com/paper/measuring-the-reliability-of-hate-speech
Repo	https://github.com/UCSM-DUE/IWG_hatespeech_public
Framework	none

Lucid Data Dreaming for Video Object Segmentation


Title	Lucid Data Dreaming for Video Object Segmentation
Authors	Anna Khoreva, Rodrigo Benenson, Eddy Ilg, Thomas Brox, Bernt Schiele
Abstract	Convolutional networks reach top quality in pixel-level video object segmentation but require a large amount of training data (1k~100k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x~1000x less annotated data than competing methods. Our approach is suitable for both single and multiple object segmentation. Instead of using large training sets hoping to generalize across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize (“lucid dream”) plausible future video frames. In-domain per-video training data allows us to train high quality appearance- and motion-based models, as well as tune the post-processing stage. This approach reaches competitive results even when training from only a single annotated frame, without ImageNet pre-training. Our results indicate that using a larger training set is not automatically better, and that for the video object segmentation task a smaller training set that is closer to the target domain is more effective. This changes the mindset regarding how many training samples and general “objectness” knowledge are required for the video object segmentation task.
Tasks	Multiple Object Tracking, Object Tracking, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2017-03-28
URL	http://arxiv.org/abs/1703.09554v5
PDF	http://arxiv.org/pdf/1703.09554v5.pdf
PWC	https://paperswithcode.com/paper/lucid-data-dreaming-for-video-object
Repo	https://github.com/yelantingfeng/pyLucid
Framework	none

DSSD : Deconvolutional Single Shot Detector


Title	DSSD : Deconvolutional Single Shot Detector
Authors	Cheng-Yang Fu, Wei Liu, Ananth Ranga, Ambrish Tyagi, Alexander C. Berg
Abstract	The main contribution of this paper is an approach for introducing additional context into state-of-the-art general object detection. To achieve this we first combine a state-of-the-art classifier (Residual-101[14]) with a fast detection framework (SSD[18]). We then augment SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accuracy, especially for small objects, calling our resulting system DSSD for deconvolutional single shot detector. While these two contributions are easily described at a high-level, a naive implementation does not succeed. Instead we show that carefully adding additional stages of learned transformations, specifically a module for feed-forward connections in deconvolution and a new output module, enables this new approach and forms a potential way forward for further detection research. Results are shown on both PASCAL VOC and COCO detection. Our DSSD with $513 \times 513$ input achieves 81.5% mAP on VOC2007 test, 80.0% mAP on VOC2012 test, and 33.2% mAP on COCO, outperforming a state-of-the-art method R-FCN[3] on each dataset.
Tasks	Object Detection
Published	2017-01-23
URL	http://arxiv.org/abs/1701.06659v1
PDF	http://arxiv.org/pdf/1701.06659v1.pdf
PWC	https://paperswithcode.com/paper/dssd-deconvolutional-single-shot-detector
Repo	https://github.com/MTCloudVision/mxnet-dssd
Framework	mxnet

CSI: A Hybrid Deep Model for Fake News Detection


Title	CSI: A Hybrid Deep Model for Fake News Detection
Authors	Natali Ruchansky, Sungyong Seo, Yan Liu
Abstract	The topic of fake news has drawn attention both from the public and the academic communities. Such misinformation has the potential of affecting public opinion, providing an opportunity for malicious parties to manipulate the outcomes of public events such as elections. Because such high stakes are at play, automatically detecting fake news is an important, yet challenging problem that is not yet well understood. Nevertheless, there are three generally agreed upon characteristics of fake news: the text of an article, the user response it receives, and the source users promoting it. Existing work has largely focused on tailoring solutions to one particular characteristic which has limited their success and generality. In this work, we propose a model that combines all three characteristics for a more accurate and automated prediction. Specifically, we incorporate the behavior of both parties, users and articles, and the group behavior of users who propagate fake news. Motivated by the three characteristics, we propose a model called CSI which is composed of three modules: Capture, Score, and Integrate. The first module is based on the response and text; it uses a Recurrent Neural Network to capture the temporal pattern of user activity on a given article. The second module learns the source characteristic based on the behavior of users, and the two are integrated with the third module to classify an article as fake or not. Experimental analysis on real-world data demonstrates that CSI achieves higher accuracy than existing models, and extracts meaningful latent representations of both users and articles.
Tasks	Fake News Detection
Published	2017-03-20
URL	http://arxiv.org/abs/1703.06959v4
PDF	http://arxiv.org/pdf/1703.06959v4.pdf
PWC	https://paperswithcode.com/paper/csi-a-hybrid-deep-model-for-fake-news
Repo	https://github.com/soorism/CSI-Code
Framework	none

Ensemble Sales Forecasting Study in Semiconductor Industry


Title	Ensemble Sales Forecasting Study in Semiconductor Industry
Authors	Qiuping Xu, Vikas Sharma
Abstract	Sales forecasting plays a prominent role in business planning and business strategy. The value and importance of advance information is a cornerstone of planning activity, and a well-set forecast goal can guide the sales force more efficiently. In this paper, CPU sales forecasting at Intel Corporation, a multinational semiconductor company, was considered. Past sales, future bookings, exchange rates, gross domestic product (GDP) forecasts, seasonality and other indicators were innovatively incorporated into the quantitative modeling. Benefiting from recent advances in computation power and software development, millions of models built upon multiple regression, time series analysis, random forests and boosted trees were executed in parallel. The models with smaller validation errors were selected to form the ensemble model. To better capture the distinct characteristics, forecasting models were implemented at the lead-time and line-of-business level. The moving-window validation process automatically selected the models that most closely represent the current market condition. The weekly forecasting cadence allowed the model to respond effectively to market fluctuations. A generic variable importance analysis was also developed to increase model interpretability. Rather than assuming a fixed distribution, this non-parametric permutation variable importance analysis provided a general framework across methods to evaluate variable importance. This framework can be further extended to classification problems by replacing the mean absolute percentage error (MAPE) with the misclassification error. The demo code is available at https://github.com/qx0731/ensemble_forecast_methods
Tasks	Time Series, Time Series Analysis
Published	2017-04-28
URL	http://arxiv.org/abs/1705.00003v3
PDF	http://arxiv.org/pdf/1705.00003v3.pdf
PWC	https://paperswithcode.com/paper/ensemble-sales-forecasting-study-in
Repo	https://github.com/qx0731/ensemble_forecast_methods
Framework	none
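
The permutation variable importance mentioned in the abstract can be illustrated generically: shuffle one input column at a time and record how much the MAPE of a fitted model rises. This is a minimal sketch, not the authors' demo code. The `predict` callable, the number of repeats, and reporting the mean increase in MAPE are assumptions made for the example.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MAPE when each column of X is shuffled in turn.

    predict: any callable mapping an (n_samples, n_features) array to predictions,
             so the same procedure works across model families.
    """
    rng = np.random.default_rng(seed)
    baseline = mape(y, predict(X))
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(mape(y, predict(X_perm)) - baseline)
        importance[j] = np.mean(increases)
    return importance
```

For a classifier, swapping `mape` for a misclassification rate gives the extension the abstract describes.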

(Quasi)Periodicity Quantification in Video Data, Using Topology


Title	(Quasi)Periodicity Quantification in Video Data, Using Topology
Authors	Christopher J. Tralie, Jose A. Perea
Abstract	This work introduces a novel framework for quantifying the presence and strength of recurrent dynamics in video data. Specifically, we provide continuous measures of periodicity (perfect repetition) and quasiperiodicity (superposition of periodic modes with non-commensurate periods), in a way which does not require segmentation, training, object tracking or 1-dimensional surrogate signals. Our methodology operates directly on video data. The approach combines ideas from nonlinear time series analysis (delay embeddings) and computational topology (persistent homology), by translating the problem of finding recurrent dynamics in video data, into the problem of determining the circularity or toroidality of an associated geometric space. Through extensive testing, we show the robustness of our scores with respect to several noise models/levels, we show that our periodicity score is superior to other methods when compared to human-generated periodicity rankings, and furthermore, we show that our quasiperiodicity score clearly indicates the presence of biphonation in videos of vibrating vocal folds, which has never before been accomplished end to end quantitatively.
Tasks	Object Tracking, Time Series, Time Series Analysis
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08382v2
PDF	http://arxiv.org/pdf/1704.08382v2.pdf
PWC	https://paperswithcode.com/paper/quasiperiodicity-quantification-in-video-data
Repo	https://github.com/ctralie/SlidingWindowVideoTDA
Framework	none
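
The delay embedding step named in the abstract can be sketched as a sliding window over frames: each window of consecutive frames is flattened into one high-dimensional point, so a periodic video traces out a loop in that space. The function name, the integer `window` and `step` parameters, and the one-frame delay between stacked frames are assumptions for illustration. The persistent homology computation that would score the resulting point cloud is not shown.

```python
import numpy as np

def sliding_window_embedding(frames, window, step=1):
    """Map a video to a point cloud, one point per window of consecutive frames.

    frames: array of shape (n_frames, height, width) or (n_frames, n_pixels).
    window: number of consecutive frames stacked into each point.
    step:   stride between the starting frames of successive windows.
    """
    frames = np.asarray(frames, dtype=float)
    flat = frames.reshape(len(frames), -1)  # one row per frame
    starts = range(0, len(flat) - window + 1, step)
    return np.stack([flat[s:s + window].ravel() for s in starts])
```

With 10 frames of 4 by 4 pixels and `window=3`, this yields 8 points in 48 dimensions.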