Paper Group AWR 256
Decouple Learning for Parameterized Image Operators
Title | Decouple Learning for Parameterized Image Operators |
Authors | Qingnan Fan, Dongdong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen |
Abstract | Many different deep networks have been used to approximate, accelerate or improve traditional image operators, such as image smoothing, super-resolution and denoising. Among these traditional operators, many contain parameters which need to be tweaked to obtain satisfactory results, which we refer to as “parameterized image operators”. However, most existing deep networks trained for these operators are only designed for one specific parameter configuration, which does not meet the needs of real scenarios that usually require flexible parameter settings. To overcome this limitation, we propose a new decouple learning algorithm that learns from the operator parameters to dynamically adjust the weights of a deep network for image operators, denoted as the base network. The learned algorithm is formed as another network, namely the weight learning network, which can be end-to-end jointly trained with the base network. Experiments demonstrate that the proposed framework can be successfully applied to many traditional parameterized image operators. We provide further analysis to better understand the proposed framework, which may inspire more promising research in this direction. Our code and models have been released at https://github.com/fqnchina/DecoupleLearning |
Tasks | Denoising, Super-Resolution |
Published | 2018-07-21 |
URL | http://arxiv.org/abs/1807.08186v2 |
PDF | http://arxiv.org/pdf/1807.08186v2.pdf |
PWC | https://paperswithcode.com/paper/decouple-learning-for-parameterized-image |
Repo | https://github.com/fqnchina/DecoupleLearning |
Framework | pytorch |
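A hedged toy sketch (not the authors' code) of the decoupling described in the abstract: a weight-learning network maps an operator parameter, e.g. a smoothing strength, to the convolution weights of a base image-to-image network, so a single model can serve many parameter settings. Layer sizes and the single scalar parameter are illustrative assumptions.

```python
# Illustrative sketch only; sizes and the scalar-parameter input are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightLearner(nn.Module):
    """Predicts the flattened weights of one conv layer from the operator parameter."""
    def __init__(self, in_ch=16, out_ch=16, k=3):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        n_weights = out_ch * in_ch * k * k
        self.mlp = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, n_weights))

    def forward(self, param):                      # param: (1,) tensor, e.g. lambda
        return self.mlp(param).view(self.shape)

class BaseNet(nn.Module):
    """Base image-to-image network whose middle conv weights are supplied externally."""
    def __init__(self):
        super().__init__()
        self.head = nn.Conv2d(3, 16, 3, padding=1)
        self.tail = nn.Conv2d(16, 3, 3, padding=1)

    def forward(self, x, mid_weight):
        h = F.relu(self.head(x))
        h = F.relu(F.conv2d(h, mid_weight, padding=1))  # dynamically generated weights
        return self.tail(h)

weight_net, base_net = WeightLearner(), BaseNet()
lam = torch.tensor([0.1])                          # operator parameter for this sample
out = base_net(torch.randn(1, 3, 64, 64), weight_net(lam))
print(out.shape)                                   # torch.Size([1, 3, 64, 64])
```

Both networks can be trained jointly end to end, which is the decoupling the abstract refers to.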
Simultaneous Fidelity and Regularization Learning for Image Restoration
Title | Simultaneous Fidelity and Regularization Learning for Image Restoration |
Authors | Dongwei Ren, Wangmeng Zuo, David Zhang, Lei Zhang, Ming-Hsuan Yang |
Abstract | Most existing non-blind restoration methods are based on the assumption that a precise degradation model is known. As the degradation process can only be partially known or inaccurately modeled, images may not be well restored. Rain streak removal and image deconvolution with inaccurate blur kernels are two representative examples of such tasks. For rain streak removal, although an input image can be decomposed into a scene layer and a rain streak layer, there exists no explicit formulation for modeling rain streaks and their composition with the scene layer. For blind deconvolution, as estimation error in the blur kernel is usually introduced, the subsequent non-blind deconvolution process does not restore the latent image well. In this paper, we propose a principled algorithm within the maximum a posteriori framework to tackle image restoration with a partially known or inaccurate degradation model. Specifically, the residual caused by a partially known or inaccurate degradation model is spatially dependent and complexly distributed. With a training set of degraded and ground-truth image pairs, we parameterize and learn the fidelity term for a degradation model in a task-driven manner. Furthermore, the regularization term can also be learned along with the fidelity term, thereby forming a simultaneous fidelity and regularization learning model. Extensive experimental results demonstrate the effectiveness of the proposed model for image deconvolution with inaccurate blur kernels, deconvolution with multiple degradations and rain streak removal. |
Tasks | Denoising, Image Deconvolution, Image Restoration |
Published | 2018-04-12 |
URL | https://arxiv.org/abs/1804.04522v4 |
PDF | https://arxiv.org/pdf/1804.04522v4.pdf |
PWC | https://paperswithcode.com/paper/simultaneous-fidelity-and-regularization |
Repo | https://github.com/csdwren/sfarl |
Framework | none |
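The paper works within the maximum a posteriori framework, where restoration minimizes a fidelity term plus a weighted regularization term; its contribution is learning both terms from data. As a hedged, purely illustrative sketch (not the learned model), the snippet below minimizes a hand-crafted quadratic fidelity plus a smoothness regularizer by gradient descent, just to make that objective concrete.

```python
# Illustrative only: fixed, hand-crafted fidelity and regularizer, not the learned ones.
import numpy as np

def restore(y, lam=0.2, lr=0.2, iters=200):
    x = y.copy()
    for _ in range(iters):
        # gradient of 0.5 * ||x - y||^2  (fixed quadratic fidelity term)
        grad_fid = x - y
        # gradient of 0.5 * lam * ||nabla x||^2 is -lam * (discrete Laplacian of x)
        lap = (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
               np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4 * x)
        x -= lr * (grad_fid - lam * lap)
    return x

clean = np.tile(np.linspace(0, 1, 64), (64, 1))          # toy smooth image
noisy = clean + 0.1 * np.random.randn(64, 64)
denoised = restore(noisy)
print(float(np.mean((noisy - clean) ** 2)), float(np.mean((denoised - clean) ** 2)))
```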
TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents
Title | TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents |
Authors | Yuexin Ma, Xinge Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, Dinesh Manocha |
Abstract | To safely and efficiently navigate in complex urban traffic, autonomous vehicles must make responsible predictions in relation to surrounding traffic-agents (vehicles, bicycles, pedestrians, etc.). A challenging and critical task is to explore the movement patterns of different traffic-agents and predict their future trajectories accurately to help the autonomous vehicle make reasonable navigation decisions. To solve this problem, we propose a long short-term memory-based (LSTM-based) real-time traffic prediction algorithm, TrafficPredict. Our approach uses an instance layer to learn instances’ movements and interactions and a category layer to learn the similarities of instances belonging to the same type to refine the prediction. In order to evaluate its performance, we collected trajectory datasets in a large city, covering varying conditions and traffic densities. The dataset includes many challenging scenarios where vehicles, bicycles, and pedestrians move among one another. We evaluate the performance of TrafficPredict on our new dataset and highlight its higher accuracy for trajectory prediction compared with prior prediction methods. |
Tasks | Autonomous Vehicles, Traffic Prediction, Trajectory Prediction |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02146v5 |
PDF | http://arxiv.org/pdf/1811.02146v5.pdf |
PWC | https://paperswithcode.com/paper/trafficpredict-trajectory-prediction-for |
Repo | https://github.com/ApolloScapeAuto/dataset-api |
Framework | none |
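A hedged, minimal sketch of the LSTM backbone such a predictor rests on: encode an agent's observed positions, then roll a decoder forward to predict future positions. The instance-level interaction modeling and the category layer that distinguish TrafficPredict are not reproduced; dimensions and horizon lengths are illustrative.

```python
# Minimal LSTM trajectory predictor; not the TrafficPredict architecture.
import torch
import torch.nn as nn

class TrajectoryLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.encoder = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.decoder = nn.LSTMCell(2, hidden)
        self.out = nn.Linear(hidden, 2)

    def forward(self, obs, pred_len=12):
        # obs: (batch, obs_len, 2) observed (x, y) positions
        _, (h, c) = self.encoder(obs)
        h, c = h[0], c[0]
        pos, preds = obs[:, -1, :], []
        for _ in range(pred_len):               # autoregressive roll-out
            h, c = self.decoder(pos, (h, c))
            pos = pos + self.out(h)             # predict a displacement, not raw coords
            preds.append(pos)
        return torch.stack(preds, dim=1)        # (batch, pred_len, 2)

model = TrajectoryLSTM()
future = model(torch.randn(4, 8, 2))            # 8 observed steps -> 12 predicted steps
print(future.shape)                             # torch.Size([4, 12, 2])
```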
Modeling natural language emergence with integral transform theory and reinforcement learning
Title | Modeling natural language emergence with integral transform theory and reinforcement learning |
Authors | Bohdan Khomtchouk, Shyam Sudhakaran |
Abstract | Zipf’s law predicts a power-law relationship between word rank and frequency in language communication systems and has been widely reported in a variety of natural language processing applications. However, the emergence of natural language is often modeled as a function of bias between speaker and listener interests, which lacks a direct way of relating information-theoretic bias to Zipfian rank. A function of bias also serves as an unintuitive interpretation of the communicative effort exchanged between a speaker and a listener. We counter these shortcomings by proposing a novel integral transform and kernel for mapping communicative bias functions to corresponding word frequency-rank representations at any arbitrary phase transition point, resulting in a direct way to link communicative effort (modeled by speaker/listener bias) to specific vocabulary used (represented by word rank). We demonstrate the practical utility of our integral transform by showing how a change from bias to rank results in greater accuracy and performance at an image classification task for assigning word labels to images randomly subsampled from CIFAR10. We model this task as a reinforcement learning game between a speaker and listener and compare the relative impact of bias and Zipfian word rank on communicative performance (and accuracy) between the two agents. |
Tasks | Image Classification |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.01431v1 |
PDF | http://arxiv.org/pdf/1812.01431v1.pdf |
PWC | https://paperswithcode.com/paper/modeling-natural-language-emergence-with |
Repo | https://github.com/Quiltomics/NLERL |
Framework | tf |
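As a hedged aside (not the paper's integral transform), the snippet below computes the word frequency-rank representation that Zipf's law concerns and fits the power-law exponent on a toy corpus.

```python
# Toy rank-frequency computation; the corpus and fitted exponent are only indicative.
from collections import Counter
import numpy as np

text = ("the cat sat on the mat and the dog sat on the rug "
        "the cat and the dog saw the mat").split()

counts = Counter(text)
freqs = np.array(sorted(counts.values(), reverse=True), dtype=float)
ranks = np.arange(1, len(freqs) + 1, dtype=float)

# Zipf's law predicts freq ~ rank^(-alpha); estimate alpha from the log-log regression.
slope, log_c = np.polyfit(np.log(ranks), np.log(freqs), 1)
print(dict(zip(ranks.astype(int), freqs.astype(int))))
print("fitted exponent:", -slope)   # near 1 for natural text; a toy corpus only hints at it
```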
Building Efficient ConvNets using Redundant Feature Pruning
Title | Building Efficient ConvNets using Redundant Feature Pruning |
Authors | Babajide O. Ayinde, Jacek M. Zurada |
Abstract | This paper presents an efficient technique to prune deep and/or wide convolutional neural network models by eliminating redundant features (or filters). Previous studies have shown that over-sized deep neural network models tend to produce a lot of redundant features that are either shifted versions of one another or are very similar and show little or no variation, thus resulting in filtering redundancy. We propose to prune these redundant features along with their connecting feature maps according to their differentiation and based on their relative cosine distances in the feature space, thus yielding smaller network size with reduced inference costs and competitive performance. We empirically show on selected models and the CIFAR-10 dataset that inference costs can be reduced by 40% for VGG-16, 27% for ResNet-56, and 39% for ResNet-110. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07653v1 |
PDF | http://arxiv.org/pdf/1802.07653v1.pdf |
PWC | https://paperswithcode.com/paper/building-efficient-convnets-using-redundant |
Repo | https://github.com/bemova/Building-Efficient-ConvNets-using-Redundant-Feature-Pruning |
Framework | pytorch |
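A hedged sketch of the pruning criterion the abstract describes: flatten each convolution filter, compute pairwise cosine similarities, and greedily mark filters that are too similar to an already kept filter as redundant. The threshold, selection order, and the subsequent fine-tuning are illustrative assumptions, not the paper's exact recipe.

```python
# Illustrative cosine-similarity filter pruning; threshold and greedy order are assumptions.
import torch
import torch.nn.functional as F

def redundant_filters(conv_weight, threshold=0.9):
    # conv_weight: (out_channels, in_channels, k, k)
    flat = F.normalize(conv_weight.flatten(1), dim=1)    # one unit-norm row per filter
    sim = flat @ flat.t()                                # pairwise cosine similarity
    keep, drop = [], []
    for i in range(sim.size(0)):
        # greedily drop a filter if it is too similar to one we already keep
        if any(sim[i, j] > threshold for j in keep):
            drop.append(i)
        else:
            keep.append(i)
    return keep, drop

w = torch.randn(32, 16, 3, 3)
w[5] = w[3] * 1.01                                       # plant an (almost) duplicate filter
keep, drop = redundant_filters(w)
print(len(keep), "kept,", len(drop), "dropped; dropped ids:", drop)
```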
Non-Local Recurrent Network for Image Restoration
Title | Non-Local Recurrent Network for Image Restoration |
Authors | Ding Liu, Bihan Wen, Yuchen Fan, Chen Change Loy, Thomas S. Huang |
Abstract | Many classic methods have shown non-local self-similarity in natural images to be an effective prior for image restoration. However, it remains unclear and challenging to make use of this intrinsic property via deep networks. In this paper, we propose a non-local recurrent network (NLRN) as the first attempt to incorporate non-local operations into a recurrent neural network (RNN) for image restoration. The main contributions of this work are: (1) Unlike existing methods that measure self-similarity in an isolated manner, the proposed non-local module can be flexibly integrated into existing deep networks for end-to-end training to capture deep feature correlation between each location and its neighborhood. (2) We fully employ the RNN structure for its parameter efficiency and allow deep feature correlation to be propagated along adjacent recurrent states. This new design boosts robustness against inaccurate correlation estimation due to severely degraded images. (3) We show that it is essential to maintain a confined neighborhood for computing deep feature correlation given degraded images. This is in contrast to existing practice that deploys the whole image. Extensive experiments on both image denoising and super-resolution tasks are conducted. Thanks to the recurrent non-local operations and correlation propagation, the proposed NLRN achieves superior results to state-of-the-art methods with much fewer parameters. |
Tasks | Denoising, Image Denoising, Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02919v2 |
PDF | http://arxiv.org/pdf/1806.02919v2.pdf |
PWC | https://paperswithcode.com/paper/non-local-recurrent-network-for-image |
Repo | https://github.com/Ding-Liu/NLRN |
Framework | tf |
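A hedged sketch of a basic non-local block of the kind the paper builds on: each position is updated with a similarity-weighted sum over other positions. For brevity this toy version attends over the whole (small) feature map; NLRN's confined neighborhood and recurrent correlation propagation are not reproduced.

```python
# Generic non-local (embedded-similarity) block; not the NLRN module itself.
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, channels=32, inter=16):
        super().__init__()
        self.theta = nn.Conv2d(channels, inter, 1)
        self.phi = nn.Conv2d(channels, inter, 1)
        self.g = nn.Conv2d(channels, inter, 1)
        self.out = nn.Conv2d(inter, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)       # (b, hw, inter)
        k = self.phi(x).flatten(2)                         # (b, inter, hw)
        v = self.g(x).flatten(2).transpose(1, 2)           # (b, hw, inter)
        attn = torch.softmax(q @ k, dim=-1)                # (b, hw, hw) pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)                             # residual connection

block = NonLocalBlock()
print(block(torch.randn(2, 32, 24, 24)).shape)             # torch.Size([2, 32, 24, 24])
```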
AMR Parsing as Graph Prediction with Latent Alignment
Title | AMR Parsing as Graph Prediction with Latent Alignment |
Authors | Chunchuan Lyu, Ivan Titov |
Abstract | Abstract meaning representations (AMRs) are broad-coverage sentence-level semantic representations. AMRs represent sentences as rooted labeled directed acyclic graphs. AMR parsing is challenging partly due to the lack of annotated alignments between nodes in the graphs and words in the corresponding sentences. We introduce a neural parser which treats alignments as latent variables within a joint probabilistic model of concepts, relations and alignments. As exact inference requires marginalizing over alignments and is infeasible, we use the variational auto-encoding framework and a continuous relaxation of the discrete alignments. We show that joint modeling is preferable to using a pipeline of align and parse. The parser achieves the best reported results on the standard benchmark (74.4% on LDC2016E25). |
Tasks | AMR Parsing |
Published | 2018-05-14 |
URL | http://arxiv.org/abs/1805.05286v1 |
PDF | http://arxiv.org/pdf/1805.05286v1.pdf |
PWC | https://paperswithcode.com/paper/amr-parsing-as-graph-prediction-with-latent |
Repo | https://github.com/josefigueroa168/NLP-project |
Framework | none |
Getting to Know Low-light Images with The Exclusively Dark Dataset
Title | Getting to Know Low-light Images with The Exclusively Dark Dataset |
Authors | Yuen Peng Loh, Chee Seng Chan |
Abstract | Low-light is an inescapable element of our daily surroundings that greatly affects the efficiency of our vision. Research on low-light has seen steady growth, particularly in the field of image enhancement, but there is still a lack of a go-to database as a benchmark. Besides, research fields that may assist us in low-light environments, such as object detection, have glossed over this aspect even though breakthrough after breakthrough has been achieved in recent years, as is most noticeable from the lack of low-light data (less than 2% of the total images) in successful public benchmark datasets such as PASCAL VOC, ImageNet, and Microsoft COCO. Thus, we propose the Exclusively Dark dataset to alleviate this data drought, consisting exclusively of ten different types of low-light images (i.e. low, ambient, object, single, weak, strong, screen, window, shadow and twilight) captured in visible light only, with image- and object-level annotations. Moreover, we share insightful findings with regard to the effects of low-light on the object detection task by analyzing visualizations of both hand-crafted and learned features. Most importantly, we found that the effects of low-light reach far deeper into the features than can be solved by simple “illumination invariance”. It is our hope that this analysis and the Exclusively Dark dataset can encourage the growth of low-light research across different fields. The Exclusively Dark dataset with its annotations is available at https://github.com/cs-chan/Exclusively-Dark-Image-Dataset |
Tasks | Image Enhancement, Low-Light Image Enhancement, Object Detection |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11227v1 |
PDF | http://arxiv.org/pdf/1805.11227v1.pdf |
PWC | https://paperswithcode.com/paper/getting-to-know-low-light-images-with-the |
Repo | https://github.com/cs-chan/Exclusively-Dark-Image-Dataset |
Framework | none |
Shorten Spatial-spectral RNN with Parallel-GRU for Hyperspectral Image Classification
Title | Shorten Spatial-spectral RNN with Parallel-GRU for Hyperspectral Image Classification |
Authors | Haowen Luo |
Abstract | Convolutional neural networks (CNNs) have attained good performance in hyperspectral image (HSI) classification, but CNNs treat spectra as orderless vectors. Therefore, treating the spectra as sequences, recurrent neural networks (RNNs) have been applied to HSI classification, since RNNs are well suited to sequential data. However, for a long-sequence task, RNNs are difficult to train and not as effective as expected. Besides, spatial contextual features are not considered in RNNs. In this study, we propose a Shorten Spatial-spectral RNN with Parallel-GRU (St-SS-pGRU) for HSI classification. A shortened RNN is more efficient and easier to train than a band-by-band RNN. By incorporating a convolution layer, the St-SS-pGRU model considers not only spectral but also spatial features, which results in better performance. An architecture named parallel-GRU is also proposed and applied in St-SS-pGRU. With this architecture, the model achieves better performance and is more robust. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12563v1 |
PDF | http://arxiv.org/pdf/1810.12563v1.pdf |
PWC | https://paperswithcode.com/paper/shorten-spatial-spectral-rnn-with-parallel |
Repo | https://github.com/codeRimoe/DL_for_RSIs |
Framework | tf |
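A hedged sketch of the "shorten" idea: instead of feeding a hyperspectral pixel into an RNN band by band (a very long sequence), group adjacent bands into short segments and feed the much shorter segment sequence to a GRU classifier. Group size, hidden size, and class count are assumptions, and the spatial branch and parallel-GRU are omitted.

```python
# Shortened spectral-sequence GRU classifier; a simplification of the St-SS-pGRU idea.
import torch
import torch.nn as nn

class ShortenedSpectralGRU(nn.Module):
    def __init__(self, n_bands=200, group=10, hidden=64, n_classes=16):
        super().__init__()
        assert n_bands % group == 0
        self.group = group
        self.gru = nn.GRU(input_size=group, hidden_size=hidden, batch_first=True)
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, spectra):
        # spectra: (batch, n_bands) -> (batch, n_bands // group, group) short sequence
        segs = spectra.view(spectra.size(0), -1, self.group)
        _, h = self.gru(segs)
        return self.cls(h[-1])

model = ShortenedSpectralGRU()
logits = model(torch.randn(8, 200))     # 200 bands -> sequence of 20 ten-band segments
print(logits.shape)                     # torch.Size([8, 16])
```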
A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation
Title | A Unified Feature Disentangler for Multi-Domain Image Translation and Manipulation |
Authors | Alexander H. Liu, Yen-Cheng Liu, Yu-Ying Yeh, Yu-Chiang Frank Wang |
Abstract | We present a novel and unified deep learning framework which is capable of learning domain-invariant representation from data across multiple domains. Realized by adversarial training with additional ability to exploit domain-specific information, the proposed network is able to perform continuous cross-domain image translation and manipulation, and produces desirable output images accordingly. In addition, the resulting feature representation exhibits superior performance of unsupervised domain adaptation, which also verifies the effectiveness of the proposed model in learning disentangled features for describing cross-domain data. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01361v3 |
PDF | http://arxiv.org/pdf/1809.01361v3.pdf |
PWC | https://paperswithcode.com/paper/a-unified-feature-disentangler-for-multi |
Repo | https://github.com/Alexander-H-Liu/UFDN |
Framework | pytorch |
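A hedged sketch of the adversarial ingredient in the abstract: an encoder maps images from several domains to a shared feature, a domain discriminator learns to guess the domain from that feature, and the encoder is trained to confuse it, pushing the feature toward domain invariance. Network sizes and the confusion loss are illustrative; the decoder, domain codes, and the image translation parts of UFDN are omitted.

```python
# Toy domain-adversarial feature learning step; not the UFDN architecture or losses.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                        nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())        # image -> 64-d feature
domain_disc = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 2))  # 2 domains

opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(domain_disc.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 32, 32)
domain = torch.randint(0, 2, (8,))           # which domain each image came from

# 1) discriminator step: learn to predict the domain from the encoded feature
d_loss = ce(domain_disc(encoder(x).detach()), domain)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 2) encoder step: fool the discriminator (push its predictions toward uniform)
opt_e.zero_grad()
e_loss = -torch.log_softmax(domain_disc(encoder(x)), dim=1).mean()
e_loss.backward(); opt_e.step()
print(float(d_loss), float(e_loss))
```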
Content-based Video Relevance Prediction Challenge: Data, Protocol, and Baseline
Title | Content-based Video Relevance Prediction Challenge: Data, Protocol, and Baseline |
Authors | Mengyi Liu, Xiaohui Xie, Hanning Zhou |
Abstract | Video relevance prediction is one of the most important tasks for online streaming services. Given the relevance of videos and viewer feedback, the system can provide personalized recommendations, which will help the user discover more content of interest. In most online services, the computation of the video relevance table is based on users’ implicit feedback, e.g. watch and search history. However, this kind of method performs poorly for “cold-start” problems: when a new video is added to the library, the recommendation system needs to bootstrap the video relevance score with very little known user behavior. One promising approach is to analyze the video content itself, i.e. to predict video relevance from video frames, audio, subtitles and metadata. In this paper, we describe a challenge on Content-based Video Relevance Prediction (CBVRP) hosted by Hulu at the ACM Multimedia Conference 2018. In this challenge, Hulu drives the study of an open problem: exploiting content characteristics directly from the original video for video relevance prediction. We provide massive video assets and ground-truth relevance derived from our real system, to build a common platform for algorithm development and performance evaluation. |
Tasks | |
Published | 2018-06-03 |
URL | http://arxiv.org/abs/1806.00737v1 |
PDF | http://arxiv.org/pdf/1806.00737v1.pdf |
PWC | https://paperswithcode.com/paper/content-based-video-relevance-prediction |
Repo | https://github.com/cbvrp-acmmm-2018/cbvrp-acmmm-2018 |
Framework | none |
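As a hedged toy baseline in the spirit of the challenge (not the official protocol), a cold-start video with no viewing history can be scored against the catalogue by cosine similarity of content feature vectors, e.g. pooled frame features; the feature extractor itself is outside this sketch.

```python
# Toy content-based relevance ranking; the embeddings here are random stand-ins.
import numpy as np

def relevance_ranking(new_video_feat, catalogue_feats, top_k=5):
    a = new_video_feat / np.linalg.norm(new_video_feat)
    b = catalogue_feats / np.linalg.norm(catalogue_feats, axis=1, keepdims=True)
    scores = b @ a                                      # cosine similarity to every video
    order = np.argsort(-scores)[:top_k]
    return list(zip(order.tolist(), scores[order].round(3).tolist()))

rng = np.random.default_rng(0)
catalogue = rng.normal(size=(100, 512))                 # pretend content embeddings
new_video = catalogue[7] + 0.1 * rng.normal(size=512)   # a video similar to item 7
print(relevance_ranking(new_video, catalogue))          # item 7 should rank first
```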
On Self Modulation for Generative Adversarial Networks
Title | On Self Modulation for Generative Adversarial Networks |
Authors | Ting Chen, Mario Lucic, Neil Houlsby, Sylvain Gelly |
Abstract | Training Generative Adversarial Networks (GANs) is notoriously challenging. We propose and study an architectural modification, self-modulation, which improves GAN performance across different data sets, architectures, losses, regularizers, and hyperparameter settings. Intuitively, self-modulation allows the intermediate feature maps of a generator to change as a function of the input noise vector. While reminiscent of other conditioning techniques, it requires no labeled data. In a large-scale empirical study we observe a relative decrease of 5%-35% in FID. Furthermore, all else being equal, adding this modification to the generator leads to improved performance in 124/144 (86%) of the studied settings. Self-modulation is a simple architectural change that requires no additional parameter tuning, which suggests that it can be applied readily to any GAN. |
Tasks | |
Published | 2018-10-02 |
URL | http://arxiv.org/abs/1810.01365v2 |
PDF | http://arxiv.org/pdf/1810.01365v2.pdf |
PWC | https://paperswithcode.com/paper/on-self-modulation-for-generative-adversarial |
Repo | https://github.com/google/compare_gan |
Framework | tf |
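A hedged sketch of the mechanism described in the abstract: the generator's intermediate feature maps are rescaled and shifted as a function of the input noise vector z, with no labels required. The MLP sizes and placement are illustrative; the paper applies the modulation through the generator's normalization layers.

```python
# One self-modulated generator block, as a sketch; sizes and placement are assumptions.
import torch
import torch.nn as nn

class SelfModulatedBlock(nn.Module):
    def __init__(self, channels=64, z_dim=128):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels, affine=False)   # scale/shift come from z
        self.gamma = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, channels))
        self.beta = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, channels))
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, h, z):
        g = self.gamma(z).unsqueeze(-1).unsqueeze(-1)         # (batch, C, 1, 1)
        b = self.beta(z).unsqueeze(-1).unsqueeze(-1)
        return self.conv(torch.relu((1 + g) * self.norm(h) + b))

block = SelfModulatedBlock()
h, z = torch.randn(4, 64, 16, 16), torch.randn(4, 128)
print(block(h, z).shape)                                      # torch.Size([4, 64, 16, 16])
```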
VLASE: Vehicle Localization by Aggregating Semantic Edges
Title | VLASE: Vehicle Localization by Aggregating Semantic Edges |
Authors | Xin Yu, Sagar Chaturvedi, Chen Feng, Yuichi Taguchi, Teng-Yok Lee, Clinton Fernandes, Srikumar Ramalingam |
Abstract | In this paper, we propose VLASE, a framework to use semantic edge features from images to achieve on-road localization. Semantic edge features denote edge contours that separate pairs of distinct objects such as building-sky, road-sidewalk, and building-ground. While prior work has shown promising results by utilizing the boundary between prominent classes such as sky and building using skylines, we generalize this approach to consider semantic edge features that arise from 19 different classes. Our localization algorithm is simple, yet very powerful. We extract semantic edge features using the recently introduced CASENet architecture and utilize the VLAD framework to perform image retrieval. Our experiments show that we achieve improvement over some of the state-of-the-art localization algorithms such as SIFT-VLAD and its deep variant NetVLAD. We use an ablation study to assess the importance of different semantic classes and show that our unified approach achieves better performance compared to individual prominent features such as skylines. |
Tasks | Image Retrieval |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02536v1 |
PDF | http://arxiv.org/pdf/1807.02536v1.pdf |
PWC | https://paperswithcode.com/paper/vlase-vehicle-localization-by-aggregating |
Repo | https://github.com/sagachat/VLASE |
Framework | none |
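A hedged sketch of the VLAD aggregation step used for retrieval: assign each local descriptor to its nearest visual word, accumulate per-word residuals, and normalize. In VLASE the descriptors come from CASENet semantic-edge responses; here random vectors stand in for both the descriptors and the (normally k-means) vocabulary.

```python
# Classic VLAD encoding with placeholder descriptors and centroids.
import numpy as np

def vlad(descriptors, centroids):
    k, d = centroids.shape
    assign = np.argmin(((descriptors[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    v = np.zeros((k, d))
    for i in range(k):                                   # sum of residuals per visual word
        members = descriptors[assign == i]
        if len(members):
            v[i] = (members - centroids[i]).sum(axis=0)
    v = np.sign(v) * np.sqrt(np.abs(v))                  # signed square-root normalization
    return (v / (np.linalg.norm(v) + 1e-12)).ravel()     # global L2 normalization

rng = np.random.default_rng(1)
desc = rng.normal(size=(500, 16))                        # local descriptors for one image
cents = rng.normal(size=(8, 16))                         # visual vocabulary (normally k-means)
print(vlad(desc, cents).shape)                           # (128,) = 8 words x 16 dims
```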
Unsupervised Degradation Learning for Single Image Super-Resolution
Title | Unsupervised Degradation Learning for Single Image Super-Resolution |
Authors | Tianyu Zhao, Wenqi Ren, Changqing Zhang, Dongwei Ren, Qinghua Hu |
Abstract | Deep Convolutional Neural Networks (CNNs) have recently achieved significant performance on single image super-resolution (SR). However, existing CNN-based methods use artificially synthesized low-resolution (LR) and high-resolution (HR) image pairs to train networks, which cannot handle real-world cases since the degradation from HR to LR is much more complex than the manually designed one. To solve this problem, we propose a bi-cycle network guided by real-world LR images for single image super-resolution, in which bidirectional structural consistency is exploited to train both the degradation and SR reconstruction networks in an unsupervised way. Specifically, we propose a degradation network to model the real-world degradation process from HR to LR via generative adversarial networks, and these generated realistic LR images paired with real-world HR images are exploited for training the SR reconstruction network, forming the first cycle. Then in the second, reverse cycle, consistency of real-world LR images is exploited to further stabilize the training of the SR reconstruction and degradation networks. Extensive experiments on both synthetic and real-world images demonstrate that the proposed algorithm performs favorably against state-of-the-art single image SR methods. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04240v2 |
PDF | http://arxiv.org/pdf/1812.04240v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-degradation-learning-for-single |
Repo | https://github.com/jotoy/SISR |
Framework | tf |
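A hedged sketch of the first cycle described in the abstract: a degradation network maps real HR images to realistic LR images (trained adversarially against unpaired real-world LR images), and the generated (LR, HR) pairs supervise the SR network. The architectures, losses, and the reverse consistency cycle are simplified placeholders, not the paper's design.

```python
# One simplified training step for the "generate realistic LR, then supervise SR" idea.
import torch
import torch.nn as nn
import torch.nn.functional as F

deg_net = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                        nn.Conv2d(16, 3, 3, 2, 1))                    # HR -> LR (x4 down)
sr_net = nn.Sequential(nn.Conv2d(3, 16, 3, 1, 1), nn.ReLU(),
                       nn.Upsample(scale_factor=4, mode='bilinear', align_corners=False),
                       nn.Conv2d(16, 3, 3, 1, 1))                     # LR -> HR (x4 up)
disc = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))  # real/fake LR

opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(list(deg_net.parameters()) + list(sr_net.parameters()), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

hr = torch.rand(4, 3, 64, 64)            # real-world HR images
real_lr = torch.rand(4, 3, 16, 16)       # unpaired real-world LR images

# discriminator step: distinguish real LR images from generated ones
fake_lr = deg_net(hr)
d_loss = bce(disc(real_lr), torch.ones(4, 1)) + bce(disc(fake_lr.detach()), torch.zeros(4, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: make generated LR look real, and train SR on the generated (LR, HR) pairs
opt_g.zero_grad()
adv_loss = bce(disc(fake_lr), torch.ones(4, 1))
sr_loss = F.l1_loss(sr_net(fake_lr), hr)
(adv_loss + sr_loss).backward()
opt_g.step()
print(float(d_loss), float(adv_loss), float(sr_loss))
```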
SentEval: An Evaluation Toolkit for Universal Sentence Representations
Title | SentEval: An Evaluation Toolkit for Universal Sentence Representations |
Authors | Alexis Conneau, Douwe Kiela |
Abstract | We introduce SentEval, a toolkit for evaluating the quality of universal sentence representations. SentEval encompasses a variety of tasks, including binary and multi-class classification, natural language inference and sentence similarity. The set of tasks was selected based on what appears to be the community consensus regarding the appropriate evaluations for universal sentence representations. The toolkit comes with scripts to download and preprocess datasets, and an easy interface to evaluate sentence encoders. The aim is to provide a fairer, less cumbersome and more centralized way for evaluating sentence representations. |
Tasks | Natural Language Inference |
Published | 2018-03-14 |
URL | http://arxiv.org/abs/1803.05449v1 |
PDF | http://arxiv.org/pdf/1803.05449v1.pdf |
PWC | https://paperswithcode.com/paper/senteval-an-evaluation-toolkit-for-universal |
Repo | https://github.com/facebookresearch/InferSent |
Framework | pytorch |
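A hedged usage sketch based on the interface documented in the SentEval repository: the user supplies `prepare` and `batcher` callbacks, and SentEval evaluates the resulting sentence embeddings on its transfer tasks. The toy hash-based bag-of-vectors encoder and the data path below are placeholders; consult the repository for the exact, current API and parameters.

```python
# Usage sketch following the repository's documented interface; encoder and path are toys.
import zlib
import numpy as np
import senteval

def word_vec(word, dim=300):
    # deterministic pseudo-random vector per word (stand-in for real word embeddings)
    rng = np.random.default_rng(zlib.crc32(word.encode()))
    return rng.normal(size=dim)

def prepare(params, samples):
    return  # could build a vocabulary here; the toy encoder needs no preparation

def batcher(params, batch):
    # batch: list of tokenized sentences -> one embedding per sentence
    embeddings = [np.mean([word_vec(w) for w in sent], axis=0) if sent else np.zeros(300)
                  for sent in batch]
    return np.vstack(embeddings)

params = {'task_path': 'SentEval/data', 'usepytorch': True, 'kfold': 10}   # placeholder path
params['classifier'] = {'nhid': 0, 'optim': 'adam', 'batch_size': 64,
                        'tenacity': 5, 'epoch_size': 4}
se = senteval.engine.SE(params, batcher, prepare)
results = se.eval(['MR', 'CR', 'SST2'])   # a subset of the classification transfer tasks
print(results)
```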