October 20, 2019


Paper Group ANR 49

Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer. Learning Multi-Layer Transform Models. Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization. The UN Parallel Corpus Annotated for Translation Direction. Learning without Memorizing. Learning Sound Events From …

Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer

Title Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer
Authors Hung-Yu Chen, I-Sheng Fang, Wei-Chen Chiu
Abstract Style transfer has been widely applied to give real-world images a new artistic look. However, given a stylized image, attempts to use typical style transfer methods for de-stylization or for transferring it again into another style usually lead to artifacts or undesired results. We observe that these issues originate from the content inconsistency between the original image and its stylized output. In this paper we therefore preserve the content information of the input image during style transfer by means of steganography, proposing two approaches: a two-stage model and an end-to-end model. Extensive experiments verify the capability of our models: both generate stylized images of quality comparable to those produced by typical style transfer methods, while effectively eliminating the artifacts that arise when reconstructing the original input from a stylized image or when performing multiple style transfers in series.
Tasks Style Transfer
Published 2018-12-10
URL https://arxiv.org/abs/1812.03910v3
PDF https://arxiv.org/pdf/1812.03910v3.pdf
PWC https://paperswithcode.com/paper/self-contained-stylization-via-steganography
Repo
Framework
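
The core idea of hiding the input's content inside the stylized output can be illustrated with classical least-significant-bit (LSB) steganography. The paper learns the embedding with neural networks; the sketch below is only a minimal stand-in, and the helper names `embed_content`/`extract_content` and the bit budget `k` are illustrative assumptions.

```python
import numpy as np

def embed_content(stylized, content, k=2):
    """Hide the k most significant bits of each content pixel in the k
    least significant bits of the stylized image (both uint8 arrays)."""
    carrier = stylized & np.uint8((0xFF << k) & 0xFF)  # clear the low k bits
    payload = content >> (8 - k)                       # keep only the top k bits
    return carrier | payload

def extract_content(container, k=2):
    """Recover a coarse reconstruction of the hidden content image."""
    return np.uint8((container & ((1 << k) - 1)) << (8 - k))
```

Because only the low k bits of the carrier change, the stylized appearance is visually preserved while enough content survives for later de-stylization or re-stylization.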

Learning Multi-Layer Transform Models

Title Learning Multi-Layer Transform Models
Authors Saiprasad Ravishankar, Brendt Wohlberg
Abstract Learned data models based on sparsity are widely used in signal processing and imaging applications. A variety of methods for learning synthesis dictionaries, sparsifying transforms, etc., have been proposed in recent years, often imposing useful structures or properties on the models. In this work, we focus on sparsifying transform learning, which enjoys a number of advantages. We consider multi-layer or nested extensions of the transform model, and propose efficient learning algorithms. Numerical experiments with image data illustrate the behavior of the multi-layer transform learning algorithm and its usefulness for image denoising. Multi-layer models provide better denoising quality than single layer schemes.
Tasks Denoising, Image Denoising
Published 2018-10-19
URL http://arxiv.org/abs/1810.08323v1
PDF http://arxiv.org/pdf/1810.08323v1.pdf
PWC https://paperswithcode.com/paper/learning-multi-layer-transform-models
Repo
Framework
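
A single layer of the sparsifying transform model can be sketched as follows: apply a learned transform W to (patches of) the signal, zero out small coefficients, and invert. The multi-layer extension in the paper nests further transforms on top of the sparsified coefficients; the orthonormal-W inversion via W.T and the threshold value here are simplifying assumptions.

```python
import numpy as np

def hard_threshold(z, tau):
    """Sparsify transform coefficients by zeroing small entries."""
    return np.where(np.abs(z) >= tau, z, 0.0)

def transform_denoise(x, W, tau):
    """One layer of transform-domain denoising with an orthonormal
    sparsifying transform W: threshold W @ x, then invert with W.T."""
    return W.T @ hard_threshold(W @ x, tau)
```

With W = I this reduces to plain coefficient thresholding; a learned W concentrates the signal into few large coefficients so that thresholding removes mostly noise.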

Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization

Title Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Authors Shuming Ma, Xu Sun, Junyang Lin, Houfeng Wang
Abstract Most current abstractive text summarization models are based on the sequence-to-sequence model (Seq2Seq). The source content of social media is long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic representation. Compared with the source content, the annotated summary is short and well written; moreover, it shares the same meaning as the source content. In this work, we supervise the learning of the representation of the source content with that of the summary. In implementation, we regard a summary autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.
Tasks Abstractive Text Summarization, Text Summarization
Published 2018-05-13
URL http://arxiv.org/abs/1805.04869v1
PDF http://arxiv.org/pdf/1805.04869v1.pdf
PWC https://paperswithcode.com/paper/autoencoder-as-assistant-supervisor-improving
Repo
Framework
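
The "assistant supervisor" idea amounts to adding a consistency term to the training objective: the Seq2Seq encoder's representation of the source is pulled toward the summary autoencoder's representation of the gold summary. A minimal sketch of such a combined loss, with the weighting `lam` and the squared-distance form as assumptions:

```python
import numpy as np

def assistant_supervised_loss(loss_s2s, loss_ae, h_src, h_sum, lam=1.0):
    """Total training loss sketch: the Seq2Seq loss, the summary
    autoencoder loss, and a consistency penalty pulling the source
    representation h_src toward the summary representation h_sum."""
    consistency = float(np.sum((h_src - h_sum) ** 2))
    return loss_s2s + loss_ae + lam * consistency
```

When the two representations agree, the penalty vanishes and training reduces to the ordinary Seq2Seq plus autoencoder objectives.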

The UN Parallel Corpus Annotated for Translation Direction

Title The UN Parallel Corpus Annotated for Translation Direction
Authors Elad Tolochinsky, Ohad Mosafi, Ella Rabinovich, Shuly Wintner
Abstract This work distinguishes between translated and original text in the UN protocol corpus. By modeling the task as a classification problem, we achieve up to 95% classification accuracy. We begin by deriving a parallel corpus for different language pairs annotated for translation direction, and then classify the data using various feature extraction methods. We compare the different methods, as well as the ability to distinguish between translated and original texts, across the different languages. The annotated corpus is publicly available.
Tasks
Published 2018-05-20
URL http://arxiv.org/abs/1805.07697v1
PDF http://arxiv.org/pdf/1805.07697v1.pdf
PWC https://paperswithcode.com/paper/the-un-parallel-corpus-annotated-for
Repo
Framework
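
One classic feature family for this kind of translationese classification is the relative frequency of function words, which are largely topic-independent. The sketch below is an illustrative feature extractor only; the word list is a tiny assumed subset, not the paper's actual feature set.

```python
from collections import Counter

FUNCTION_WORDS = ["the", "of", "and", "to", "in"]  # illustrative subset

def function_word_features(tokens):
    """Relative frequencies of common function words, a standard feature
    family for distinguishing translated from original text."""
    counts = Counter(t.lower() for t in tokens)
    total = max(len(tokens), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]
```

Vectors like this (and richer variants such as character or POS n-grams) can then be fed to any standard classifier to predict translation direction.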

Learning without Memorizing

Title Learning without Memorizing
Authors Prithviraj Dhar, Rajat Vikram Singh, Kuan-Chuan Peng, Ziyan Wu, Rama Chellappa
Abstract Incremental learning (IL) is an important task aimed at increasing the capability of a trained model, in terms of the number of classes recognizable by the model. The key problem in this task is the requirement of storing data (e.g. images) associated with existing classes, while teaching the classifier to learn new classes. However, this is impractical as it increases the memory requirement at every incremental step, which makes it impossible to implement IL algorithms on edge devices with limited memory. Hence, we propose a novel approach, called `Learning without Memorizing (LwM)', to preserve the information about existing (base) classes, without storing any of their data, while making the classifier progressively learn the new classes. In LwM, we present an information-preserving penalty, the Attention Distillation Loss ($L_{AD}$), and demonstrate that penalizing changes in the classifiers' attention maps helps retain information about the base classes as new classes are added. We show that adding $L_{AD}$ to the distillation loss, an existing information-preserving loss, consistently outperforms the state of the art on the iILSVRC-small and iCIFAR-100 datasets in terms of the overall accuracy of base and incrementally learned classes.
Tasks
Published 2018-11-20
URL http://arxiv.org/abs/1811.08051v2
PDF http://arxiv.org/pdf/1811.08051v2.pdf
PWC https://paperswithcode.com/paper/learning-without-memorizing
Repo
Framework
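
The attention distillation loss $L_{AD}$ compares the attention maps of the frozen base classifier and the incrementally updated one. A common formulation, sketched here under the assumption of L2-normalized, vectorized maps with an L1 distance:

```python
import numpy as np

def attention_distillation_loss(att_base, att_new, eps=1e-8):
    """L_AD sketch: L1 distance between the L2-normalized, vectorized
    attention maps of the base and incrementally updated classifiers."""
    b = att_base.ravel() / (np.linalg.norm(att_base) + eps)
    n = att_new.ravel() / (np.linalg.norm(att_new) + eps)
    return float(np.abs(b - n).sum())
```

Because the penalty depends only on the models' attention maps, it needs no stored images from the base classes, which is the point of "learning without memorizing".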

Learning Sound Events From Webly Labeled Data

Title Learning Sound Events From Webly Labeled Data
Authors Anurag Kumar, Ankit Shah, Alex Hauptmann, Bhiksha Raj
Abstract In the last couple of years, weakly labeled learning for sound events has turned out to be an exciting approach for audio event detection. In this work, we introduce webly labeled learning for sound events in which we aim to remove human supervision altogether from the learning process. We first develop a method of obtaining labeled audio data from the web (albeit noisy), in which no manual labeling is involved. We then describe deep learning methods to efficiently learn from these webly labeled audio recordings. In our proposed system, WeblyNet, two deep neural networks co-teach each other to robustly learn from webly labeled data, leading to around 17% relative improvement over the baseline method. The method also involves transfer learning to obtain efficient representations.
Tasks Transfer Learning
Published 2018-11-25
URL http://arxiv.org/abs/1811.09967v1
PDF http://arxiv.org/pdf/1811.09967v1.pdf
PWC https://paperswithcode.com/paper/learning-sound-events-from-webly-labeled-data
Repo
Framework
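
The co-teaching idea in WeblyNet couples two networks through a joint objective: each network minimizes its own loss on the webly labeled data while an agreement term keeps their output distributions close. The sketch below uses a symmetric KL term; the exact coupling used in the paper is an assumption here, as are the helper name and the weight `beta`.

```python
import numpy as np

def co_teaching_loss(loss_a, loss_b, p_a, p_b, beta=1.0, eps=1e-12):
    """Joint objective sketch for two co-teaching networks: each network's
    own classification loss plus a symmetric-KL agreement term between
    their output distributions p_a and p_b."""
    kl_ab = np.sum(p_a * np.log((p_a + eps) / (p_b + eps)))
    kl_ba = np.sum(p_b * np.log((p_b + eps) / (p_a + eps)))
    return loss_a + loss_b + beta * float(kl_ab + kl_ba)
```

When the two networks agree, the coupling term vanishes; when they disagree on a noisy web label, each acts as a regularizer on the other.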

High Performance Visual Tracking with Circular and Structural Operators

Title High Performance Visual Tracking with Circular and Structural Operators
Authors Peng Gao, Yipeng Ma, Ke Song, Chao Li, Fei Wang, Liyi Xiao, Yan Zhang
Abstract In this paper, a novel circular and structural operator tracker (CSOT) is proposed for high-performance visual tracking. It not only possesses the powerful discriminative capability of SOSVM but also inherits the superior computational efficiency of DCF. Based on the proposed circular and structural operators, a set of primal confidence score maps can be obtained by circularly correlating feature maps with their corresponding structural correlation filters. Furthermore, an implicit interpolation is applied to convert the multi-resolution feature maps to the continuous domain so that all primal confidence score maps have the same spatial resolution. We then exploit an efficient ensemble post-processor based on relative entropy, which coalesces the primal confidence score maps into an optimal confidence score map for more accurate localization; the target is localized at the peak of this map. Besides, we introduce a collaborative optimization strategy that updates the circular and structural operators by iteratively training the structural correlation filters, which significantly reduces computational complexity and improves robustness. Experimental results demonstrate that our approach achieves state-of-the-art performance with mean AUC scores of 71.5% and 69.4% on the OTB-2013 and OTB-2015 benchmarks respectively, and obtains the third-best expected average overlap (EAO) score of 29.8% on the VOT-2017 benchmark.
Tasks Visual Tracking
Published 2018-04-23
URL http://arxiv.org/abs/1804.08208v3
PDF http://arxiv.org/pdf/1804.08208v3.pdf
PWC https://paperswithcode.com/paper/high-performance-visual-tracking-with
Repo
Framework

Rational Recurrences

Title Rational Recurrences
Authors Hao Peng, Roy Schwartz, Sam Thomson, Noah A. Smith
Abstract Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches. Recently, connections have been shown between convolutional neural networks (CNNs) and weighted finite state automata (WFSAs), leading to new interpretations and insights. In this work, we show that some recurrent neural networks also share this connection to WFSAs. We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs. We show that several recent neural models use rational recurrences. Our analysis provides a fresh view of these models and facilitates devising new neural architectures that draw inspiration from WFSAs. We present one such model, which performs better than two recent baselines on language modeling and text classification. Our results demonstrate that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.
Tasks Language Modelling, Text Classification
Published 2018-08-28
URL http://arxiv.org/abs/1808.09357v1
PDF http://arxiv.org/pdf/1808.09357v1.pdf
PWC https://paperswithcode.com/paper/rational-recurrences
Repo
Framework
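
The paper's central observation is that several gated RNN updates have the same algebraic shape as the Forward algorithm of a weighted finite state automaton. A minimal two-state WFSA makes this concrete: the accumulated score of the accepting state follows exactly the familiar gated update c_t = f_t * c_{t-1} + u_t. The automaton below is an illustrative toy, not one of the paper's specific architectures.

```python
def wfsa_forward(self_loop, emit):
    """Forward score of a minimal two-state WFSA over a sequence:
    c accumulates the scores of all paths ending in the accepting state.
    At each step, either stay there (weight f) or enter it (weight u),
    giving the gated-RNN-shaped update c = f * c + u."""
    c = 0.0
    for f, u in zip(self_loop, emit):
        c = f * c + u
    return c
```

A rational recurrence, in the paper's sense, is any hidden-state update expressible as the Forward calculation of a finite set of such WFSAs.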

End to End Video Segmentation for Driving : Lane Detection For Autonomous Car

Title End to End Video Segmentation for Driving : Lane Detection For Autonomous Car
Authors Wenhui Zhang, Tejas Mahale
Abstract Safety and the reduction of road traffic accidents remain central issues in autonomous driving. Statistics show that unintended lane departure is a leading cause of motor vehicle collisions worldwide, making lane detection one of the most promising and challenging tasks for self-driving. Today, numerous groups are combining deep learning techniques with computer vision to solve self-driving problems. In this paper, a Global Convolution Networks (GCN) model is used to address both the classification and localization issues of semantic lane segmentation. A color-based segmentation method is presented and the usability of the model is evaluated. Residual-based boundary refinement and Adam optimization are also used to achieve state-of-the-art performance. Since ordinary cars cannot afford on-board GPUs, and the training for a particular road can be shared by several cars, we propose a framework to make this work in the real world: a real-time video transfer system sends video from the car to an edge server equipped with GPUs, the model is trained there, and the trained model is sent back to the car.
Tasks Autonomous Driving, Lane Detection, Semantic Segmentation, Video Semantic Segmentation
Published 2018-12-13
URL http://arxiv.org/abs/1812.05914v1
PDF http://arxiv.org/pdf/1812.05914v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-video-segmentation-for-driving
Repo
Framework

3D Convolutional Neural Networks for Tumor Segmentation using Long-range 2D Context

Title 3D Convolutional Neural Networks for Tumor Segmentation using Long-range 2D Context
Authors Pawel Mlynarski, Hervé Delingette, Antonio Criminisi, Nicholas Ayache
Abstract We present an efficient deep learning approach for the challenging task of tumor segmentation in multisequence MR images. In recent years, Convolutional Neural Networks (CNN) have achieved state-of-the-art performances in a large variety of recognition tasks in medical imaging. Because of the considerable computational cost of CNNs, large volumes such as MRI are typically processed by subvolumes, for instance slices (axial, coronal, sagittal) or small 3D patches. In this paper we introduce a CNN-based model which efficiently combines the advantages of the short-range 3D context and the long-range 2D context. To overcome the limitations of specific choices of neural network architectures, we also propose to merge outputs of several cascaded 2D-3D models by a voxelwise voting strategy. Furthermore, we propose a network architecture in which the different MR sequences are processed by separate subnetworks in order to be more robust to the problem of missing MR sequences. Finally, a simple and efficient algorithm for training large CNN models is introduced. We evaluate our method on the public benchmark of the BRATS 2017 challenge on the task of multiclass segmentation of malignant brain tumors. Our method achieves good performances and produces accurate segmentations with median Dice scores of 0.918 (whole tumor), 0.883 (tumor core) and 0.854 (enhancing core). Our approach can be naturally applied to various tasks involving segmentation of lesions or organs.
Tasks
Published 2018-07-23
URL http://arxiv.org/abs/1807.08599v1
PDF http://arxiv.org/pdf/1807.08599v1.pdf
PWC https://paperswithcode.com/paper/3d-convolutional-neural-networks-for-tumor
Repo
Framework
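
The voxelwise voting strategy for merging the cascaded 2D-3D models can be sketched as a per-voxel majority vote over hard labels. Whether the paper votes on labels or averages class probabilities is not specified in the abstract, so this label-voting form (with ties broken toward the lowest class id) is an assumption.

```python
import numpy as np

def voxelwise_vote(predictions):
    """Merge per-voxel class labels from several segmentation models by
    majority vote; np.argmax breaks ties toward the lowest class id."""
    preds = np.stack(predictions)            # (n_models, *volume_shape)
    n_classes = int(preds.max()) + 1
    counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)
```

Averaging across architectures this way reduces the effect of any single network's idiosyncratic errors, which is the stated motivation for the ensemble.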

LDSO: Direct Sparse Odometry with Loop Closure

Title LDSO: Direct Sparse Odometry with Loop Closure
Authors Xiang Gao, Rui Wang, Nikolaus Demmel, Daniel Cremers
Abstract In this paper we present an extension of Direct Sparse Odometry (DSO) to a monocular visual SLAM system with loop closure detection and pose-graph optimization (LDSO). As a direct technique, DSO can utilize any image pixel with sufficient intensity gradient, which makes it robust even in featureless areas. LDSO retains this robustness, while at the same time ensuring repeatability of some of these points by favoring corner features in the tracking frontend. This repeatability makes it possible to reliably detect loop closure candidates with a conventional feature-based bag-of-words (BoW) approach. Loop closure candidates are verified geometrically and Sim(3) relative pose constraints are estimated by jointly minimizing 2D and 3D geometric error terms. These constraints are fused with a co-visibility graph of relative poses extracted from DSO's sliding window optimization. Our evaluation on publicly available datasets demonstrates that the modified point selection strategy retains the tracking accuracy and robustness, and that the integrated pose-graph optimization significantly reduces the accumulated rotation, translation and scale drift, resulting in overall performance comparable to state-of-the-art feature-based systems, even without global bundle adjustment.
Tasks Loop Closure Detection
Published 2018-08-03
URL http://arxiv.org/abs/1808.01111v1
PDF http://arxiv.org/pdf/1808.01111v1.pdf
PWC https://paperswithcode.com/paper/ldso-direct-sparse-odometry-with-loop-closure
Repo
Framework

Training DNNs with Hybrid Block Floating Point

Title Training DNNs with Hybrid Block Floating Point
Authors Mario Drumond, Tao Lin, Martin Jaggi, Babak Falsafi
Abstract The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point’s narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating point’s accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.
Tasks
Published 2018-04-04
URL http://arxiv.org/abs/1804.01526v4
PDF http://arxiv.org/pdf/1804.01526v4.pdf
PWC https://paperswithcode.com/paper/training-dnns-with-hybrid-block-floating
Repo
Framework
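
Block floating point stores one shared exponent per tensor block and a fixed-point mantissa per element, which is what lets HBFP run dot products on fixed-point logic. A minimal quantizer sketch (the mantissa width and exponent choice are illustrative; the paper's hardware format may differ in detail):

```python
import numpy as np

def to_bfp(block, mantissa_bits=8):
    """Quantize a tensor block to block floating point: one exponent
    shared by the whole block, a fixed-point mantissa per element.
    Returns the dequantized values so the error can be inspected."""
    exp = int(np.ceil(np.log2(np.abs(block).max() + 1e-30)))
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    lo, hi = -(2 ** (mantissa_bits - 1)), 2 ** (mantissa_bits - 1) - 1
    mantissa = np.clip(np.round(block / scale), lo, hi)
    return mantissa * scale
```

Because the exponent is shared, multiplying two BFP blocks reduces to integer mantissa arithmetic plus one exponent addition, which is where the hardware density win comes from.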

Multi-scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets

Title Multi-scale Convolution Aggregation and Stochastic Feature Reuse for DenseNets
Authors Mingjie Wang, Jun Zhou, Wendong Mao, Minglun Gong
Abstract Recently, Convolution Neural Networks (CNNs) obtained huge success in numerous vision tasks. In particular, DenseNets have demonstrated that feature reuse via dense skip connections can effectively alleviate the difficulty of training very deep networks and that reusing features generated by the initial layers in all subsequent layers has strong impact on performance. To feed even richer information into the network, a novel adaptive Multi-scale Convolution Aggregation module is presented in this paper. Composed of layers for multi-scale convolutions, trainable cross-scale aggregation, maxout, and concatenation, this module is highly non-linear and can boost the accuracy of DenseNet while using much fewer parameters. In addition, due to high model complexity, the network with extremely dense feature reuse is prone to overfitting. To address this problem, a regularization method named Stochastic Feature Reuse is also presented. Through randomly dropping a set of feature maps to be reused for each mini-batch during the training phase, this regularization method reduces training costs and prevents co-adaptation. Experimental results on CIFAR-10, CIFAR-100 and SVHN benchmarks demonstrated the effectiveness of the proposed methods.
Tasks
Published 2018-10-02
URL http://arxiv.org/abs/1810.01373v1
PDF http://arxiv.org/pdf/1810.01373v1.pdf
PWC https://paperswithcode.com/paper/multi-scale-convolution-aggregation-and
Repo
Framework
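
Stochastic Feature Reuse amounts to dropping whole reused feature maps (channels) at random for each mini-batch. The sketch below rescales survivors as in inverted dropout; whether the paper rescales, and the keep probability, are assumptions of this illustration.

```python
import numpy as np

def stochastic_feature_reuse(features, keep_prob=0.8, rng=None):
    """Randomly drop whole reused feature maps (channels) of an
    (N, C, H, W) tensor for one mini-batch, rescaling the survivors
    as in inverted dropout."""
    rng = np.random.default_rng() if rng is None else rng
    n, c, h, w = features.shape
    mask = (rng.random(c) < keep_prob).astype(features.dtype)
    return features * mask[None, :, None, None] / keep_prob
```

Dropping entire maps, rather than individual activations, directly thins the dense skip connections and so discourages co-adaptation between reused features.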

Universal Deep Neural Network Compression

Title Universal Deep Neural Network Compression
Authors Yoojin Choi, Mostafa El-Khamy, Jungwon Lee
Abstract In this paper, we investigate lossy compression of deep neural networks (DNNs) by weight quantization and lossless source coding for memory-efficient deployment. Whereas the previous work addressed non-universal scalar quantization and entropy coding of DNN weights, we for the first time introduce universal DNN compression by universal vector quantization and universal source coding. In particular, we examine universal randomized lattice quantization of DNNs, which randomizes DNN weights by uniform random dithering before lattice quantization and can perform near-optimally on any source without relying on knowledge of its probability distribution. Moreover, we present a method of fine-tuning vector quantized DNNs to recover the performance loss after quantization. Our experimental results show that the proposed universal DNN compression scheme compresses the 32-layer ResNet (trained on CIFAR-10) and the AlexNet (trained on ImageNet) with compression ratios of $47.1$ and $42.5$, respectively.
Tasks Neural Network Compression, Quantization
Published 2018-02-07
URL http://arxiv.org/abs/1802.02271v2
PDF http://arxiv.org/pdf/1802.02271v2.pdf
PWC https://paperswithcode.com/paper/universal-deep-neural-network-compression
Repo
Framework
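
The randomized (dithered) quantization at the heart of the universal scheme can be sketched in one dimension: add a uniform dither known to both encoder and decoder, round to the lattice, then subtract the dither. The scalar step size here stands in for the paper's lattice; the helper name and fixed seed are illustrative.

```python
import numpy as np

def dithered_quantize(w, step, rng=None):
    """Universal randomized quantization sketch: add a uniform dither
    u ~ U(-step/2, step/2) shared by encoder and decoder, round to the
    step-size lattice, then subtract the dither. The error is bounded by
    step/2 regardless of the weight distribution, which is what makes
    the scheme 'universal'."""
    rng = np.random.default_rng(0) if rng is None else rng
    u = rng.uniform(-step / 2, step / 2, size=np.shape(w))
    return step * np.round((np.asarray(w) + u) / step) - u
```

Since the quantization error of w + u never exceeds step/2, the reconstruction error after subtracting u obeys the same bound for any input distribution, with no distribution-specific codebook design needed.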

Deep Learning for Image Denoising: A Survey

Title Deep Learning for Image Denoising: A Survey
Authors Chunwei Tian, Yong Xu, Lunke Fei, Ke Yan
Abstract Since the advent of big data analysis and the Graphics Processing Unit (GPU), deep learning has received a great deal of attention and has been widely applied in image processing. In this paper, we aim to comprehensively review and summarize the deep learning techniques for image denoising proposed in recent years. Moreover, we systematically analyze the conventional machine learning methods for image denoising. Finally, we point out some research directions for deep learning in image denoising.
Tasks Denoising, Image Denoising
Published 2018-10-11
URL http://arxiv.org/abs/1810.05052v1
PDF http://arxiv.org/pdf/1810.05052v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-image-denoising-a-survey
Repo
Framework