October 20, 2019


Paper Group AWR 269


NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations


Title	NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
Authors	Marco Ciccone, Marco Gallieri, Jonathan Masci, Christian Osendorfer, Faustino Gomez
Abstract	This paper introduces Non-Autonomous Input-Output Stable Network (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable so that for every initial condition there is exactly one input-dependent equilibrium assuming tanh units, and multiple stable equilibria for ReL units. An efficient implementation that enforces the stability under derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets.
Tasks
Published	2018-04-19
URL	http://arxiv.org/abs/1804.07209v3
PDF	http://arxiv.org/pdf/1804.07209v3.pdf
PWC	https://paperswithcode.com/paper/nais-net-stable-deep-networks-from-non
Repo	https://github.com/batuhanguler/Deep-BSDE-Solver
Framework	pytorch
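
The single-equilibrium claim for tanh units can be illustrated with a scalar toy block. This is a minimal sketch, not the authors' implementation: the update rule `x + h * tanh(a * x + b * u + c)`, the function name `unroll_block`, and all parameter values are assumptions chosen for illustration. The block input `u` is fed into every unrolled stage, as the abstract describes.

```python
import math

def unroll_block(x0, u, a=-1.0, b=1.0, c=0.0, h=0.5, steps=200):
    """Unroll a scalar toy block; the block input `u` is re-injected at every stage."""
    x = x0
    for _ in range(steps):
        # The skip connection from u to each stage makes the system non-autonomous.
        x = x + h * math.tanh(a * x + b * u + c)
    return x

if __name__ == "__main__":
    # Different initial conditions, same block input.
    for x0 in (-3.0, 0.0, 3.0):
        print(x0, round(unroll_block(x0, u=0.5), 6))
```

With these assumed values (`a < 0`), every starting point settles where `a * x + b * u + c = 0`, so the final state depends on the input `u` and not on `x0`.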

Towards Optimal Power Control via Ensembling Deep Neural Networks


Title	Towards Optimal Power Control via Ensembling Deep Neural Networks
Authors	Fei Liang, Cong Shen, Wei Yu, Feng Wu
Abstract	A deep neural network (DNN) based power control method is proposed, which aims at solving the non-convex optimization problem of maximizing the sum rate of a multi-user interference channel. Towards this end, we first present PCNet, which is a multi-layer fully connected neural network that is specifically designed for the power control problem. PCNet takes the channel coefficients as input and outputs the transmit power of all users. A key challenge in training a DNN for the power control problem is the lack of ground truth, i.e., the optimal power allocation is unknown. To address this issue, PCNet leverages the unsupervised learning strategy and directly maximizes the sum rate in the training phase. Observing that a single PCNet does not globally outperform the existing solutions, we further propose ePCNet, a network ensemble with multiple PCNets trained independently. Simulation results show that for the standard symmetric multi-user Gaussian interference channel, ePCNet can outperform all state-of-the-art power control methods by 1.2%-4.6% under a variety of system configurations. Furthermore, the performance improvement of ePCNet comes with a reduced computational complexity.
Tasks
Published	2018-07-26
URL	http://arxiv.org/abs/1807.10025v2
PDF	http://arxiv.org/pdf/1807.10025v2.pdf
PWC	https://paperswithcode.com/paper/towards-optimal-power-control-via-ensembling
Repo	https://github.com/ShenGroup/PCNet-ePCNet
Framework	tf
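
The unsupervised objective described above, maximizing the sum rate directly, can be sketched as a loss function. This is a hedged illustration rather than the authors' code: the names `sum_rate`, `unsupervised_loss`, and `best_of_ensemble` are hypothetical, channel gains are taken as real magnitudes, and picking the ensemble member with the highest sum rate is an assumption about how the independently trained networks are combined.

```python
import math

def sum_rate(h, p, noise=1.0):
    """Sum rate in bits per channel use for a K-user interference channel.

    h[i][j] is the channel magnitude from transmitter j to receiver i,
    p[j] is the transmit power of user j.
    """
    total = 0.0
    for i in range(len(p)):
        signal = (h[i][i] ** 2) * p[i]
        interference = sum((h[i][j] ** 2) * p[j]
                           for j in range(len(p)) if j != i)
        total += math.log2(1.0 + signal / (noise + interference))
    return total

def unsupervised_loss(h, p, noise=1.0):
    """Loss with no ground-truth labels: minimizing it maximizes the sum rate."""
    return -sum_rate(h, p, noise)

def best_of_ensemble(h, candidates, noise=1.0):
    """Assumed ensemble rule: keep the candidate power vector with the highest sum rate."""
    return max(candidates, key=lambda p: sum_rate(h, p, noise))
```

In a real training loop, `p` would be the network output for input `h`, and the loss would be written in a framework that supports automatic differentiation.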

Rotation Equivariant CNNs for Digital Pathology


Title	Rotation Equivariant CNNs for Digital Pathology
Authors	Bastiaan S. Veeling, Jasper Linmans, Jim Winkens, Taco Cohen, Max Welling
Abstract	We propose a new model for digital pathology segmentation, based on the observation that histopathology images are inherently symmetric under rotation and reflection. Utilizing recent findings on rotation equivariant CNNs, the proposed model leverages these symmetries in a principled manner. We present a visual analysis showing improved stability on predictions, and demonstrate that exploiting rotation equivariance significantly improves tumor detection performance on a challenging lymph node metastases dataset. We further present a novel derived dataset to enable principled comparison of machine learning models, in combination with an initial benchmark. Through this dataset, the task of histopathology diagnosis becomes accessible as a challenging benchmark for fundamental machine learning research.
Tasks
Published	2018-06-08
URL	http://arxiv.org/abs/1806.03962v1
PDF	http://arxiv.org/pdf/1806.03962v1.pdf
PWC	https://paperswithcode.com/paper/rotation-equivariant-cnns-for-digital
Repo	https://github.com/eb00/pcam_analysis
Framework	tf
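
Rotation equivariance means that rotating the input and then applying the layer gives the same result as applying the layer and then rotating the output. The sketch below checks this property for 90-degree rotations. It is not the group convolution used in the paper: it uses a simpler stand-in, a filter summed over its four rotations, and the helper names `rot90`, `symmetrize`, and `correlate` are illustrative.

```python
def rot90(m):
    """Rotate a square grid 90 degrees counterclockwise."""
    return [list(row) for row in zip(*m)][::-1]

def symmetrize(kernel):
    """Sum a kernel over its four 90-degree rotations, giving a rotation-invariant kernel."""
    total = [[0] * len(kernel) for _ in kernel]
    k = kernel
    for _ in range(4):
        total = [[t + v for t, v in zip(t_row, k_row)]
                 for t_row, k_row in zip(total, k)]
        k = rot90(k)
    return total

def correlate(image, kernel):
    """Valid 2D cross-correlation of a square image with a square kernel."""
    n, s = len(image), len(kernel)
    m = n - s + 1
    return [[sum(image[i + u][j + v] * kernel[u][v]
                 for u in range(s) for v in range(s))
             for j in range(m)] for i in range(m)]

if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3], [4, 5, 6, 7]]
    kernel = symmetrize([[1, 0, 2], [0, 3, 0], [4, 0, 0]])
    # Equivariance check: rotate-then-filter equals filter-then-rotate.
    print(correlate(rot90(image), kernel) == rot90(correlate(image, kernel)))
```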

Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network


Title	Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network
Authors	Jun-Ho Choi, Jun-Hyuk Kim, Manri Cheon, Jong-Seok Lee
Abstract	Recently, several deep learning-based image super-resolution methods have been developed by stacking massive numbers of layers. However, this leads to large model sizes and high computational complexity, so some recursive parameter-sharing methods have also been proposed. Nevertheless, their designs do not properly utilize the potential of the recursive operation. In this paper, we propose a novel, lightweight, and efficient super-resolution method that maximizes the usefulness of the recursive architecture by introducing a block state-based recursive network. By utilizing the block state, the recursive part of our model can easily track the status of the current image features. We show the benefits of the proposed method in terms of model size, speed, and efficiency. In addition, we show that our method outperforms the other state-of-the-art methods.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-11-30
URL	http://arxiv.org/abs/1811.12546v1
PDF	http://arxiv.org/pdf/1811.12546v1.pdf
PWC	https://paperswithcode.com/paper/lightweight-and-efficient-image-super
Repo	https://github.com/manricheon/manricheon.github.io
Framework	tf

Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data


Title	Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data
Authors	Yanru Qu, Bohui Fang, Weinan Zhang, Ruiming Tang, Minzhe Niu, Huifeng Guo, Yong Yu, Xiuqiang He
Abstract	User response prediction is a crucial component of personalized information retrieval and filtering scenarios, such as recommender systems and web search. The data in user response prediction is mostly in a multi-field categorical format and is transformed into sparse representations via one-hot encoding. Due to the sparsity problems in representation and optimization, most research focuses on feature engineering and shallow modeling. Recently, deep neural networks have attracted research attention on this problem for their high capacity and end-to-end training scheme. In this paper, we study user response prediction in the scenario of click prediction. We first analyze a coupled gradient issue in latent vector-based models and propose kernel product to learn field-aware feature interactions. Then we discuss an insensitive gradient issue in DNN-based models and propose Product-based Neural Network (PNN), which adopts a feature extractor to explore feature interactions. Generalizing the kernel product to a net-in-net architecture, we further propose Product-network In Network (PIN), which can generalize previous models. Extensive experiments on 4 industrial datasets and 1 contest dataset demonstrate that our models consistently outperform 8 baselines on both AUC and log loss. In addition, PIN yields a large CTR improvement (34.67% relative) in an online A/B test.
Tasks	Click-Through Rate Prediction, Feature Engineering, Information Retrieval, Recommendation Systems
Published	2018-07-01
URL	http://arxiv.org/abs/1807.00311v1
PDF	http://arxiv.org/pdf/1807.00311v1.pdf
PWC	https://paperswithcode.com/paper/product-based-neural-networks-for-user-1
Repo	https://github.com/Atomu2014/product-nets
Framework	tf

Accelerator-Aware Pruning for Convolutional Neural Networks


Title	Accelerator-Aware Pruning for Convolutional Neural Networks
Authors	Hyeong-Ju Kang
Abstract	Convolutional neural networks have shown tremendous performance capabilities in computer vision tasks, but their excessive amounts of weight storage and arithmetic operations prevent them from being adopted in embedded environments. One of the solutions involves pruning, where certain unimportant weights are forced to have a value of zero. Many pruning schemes have been proposed, but these have mainly focused on the number of pruned weights. Previous pruning schemes scarcely considered ASIC or FPGA accelerator architectures. When these pruned networks are run on accelerators, the lack of consideration of the architecture causes some inefficiency problems, including internal buffer misalignments and load imbalances. This paper proposes a new pruning scheme that reflects accelerator architectures. In the proposed scheme, pruning is performed so that the same number of weights remain for each weight group corresponding to activations fetched simultaneously. In this way, the pruning scheme resolves the inefficiency problems, doubling the accelerator performance. Even with this constraint, the proposed pruning scheme reached a pruning ratio similar to that of previous unconstrained pruning schemes, not only on AlexNet and VGG16 but also on state-of-the-art very deep networks such as ResNet. Furthermore, the proposed scheme demonstrated a comparable pruning ratio on compact networks such as MobileNet and on slimmed networks that were already pruned in a channel-wise manner. In addition to improving the efficiency of previous sparse accelerators, it is also shown that the proposed pruning scheme can be used to reduce the logic complexity of sparse accelerators.
Tasks
Published	2018-04-26
URL	https://arxiv.org/abs/1804.09862v2
PDF	https://arxiv.org/pdf/1804.09862v2.pdf
PWC	https://paperswithcode.com/paper/accelerator-aware-pruning-for-convolutional
Repo	https://github.com/hjkang1976/accelerator-aware-pruning
Framework	none
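
The constraint that every weight group keeps the same number of weights can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: groups are taken as consecutive fixed-size slices of a flat weight list, selection is by magnitude, and the name `prune_groups` is hypothetical.

```python
def prune_groups(weights, group_size, keep):
    """Zero all but the `keep` largest-magnitude weights in each consecutive group."""
    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Indices of the weights that survive in this group.
        order = sorted(range(len(group)), key=lambda i: abs(group[i]), reverse=True)
        kept = set(order[:keep])
        pruned.extend(w if i in kept else 0.0 for i, w in enumerate(group))
    return pruned

if __name__ == "__main__":
    w = [0.1, -0.9, 0.5, 0.2, 0.3, 0.05, -0.4, 0.0]
    print(prune_groups(w, group_size=4, keep=2))
```

Because each group ends up with the same count of nonzero weights, an accelerator that processes one group per cycle does the same amount of work for each group, which is the load-balance property the abstract refers to.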

Beyond Gradient Descent for Regularized Segmentation Losses


Title	Beyond Gradient Descent for Regularized Segmentation Losses
Authors	Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov
Abstract	The simplicity of gradient descent (GD) made it the default method for training ever-deeper and complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves the state-of-the-art while GD performs poorly. Interestingly, GD obtains its best result for a “smoother” tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in “shallow” segmentation and their known global solvers. Our work suggests that network design/training should pay more attention to optimization methods.
Tasks
Published	2018-09-07
URL	http://arxiv.org/abs/1809.02322v2
PDF	http://arxiv.org/pdf/1809.02322v2.pdf
PWC	https://paperswithcode.com/paper/adm-for-grid-crf-loss-in-cnn-segmentation
Repo	https://github.com/dmitrii-marin/adm-seg
Framework	none

Focus Quality Assessment of High-Throughput Whole Slide Imaging in Digital Pathology


Title	Focus Quality Assessment of High-Throughput Whole Slide Imaging in Digital Pathology
Authors	Mahdi S. Hosseini, Yueyang Zhang, Lyndon Chan, Konstantinos N. Plataniotis, Jasper A. Z. Brawley-Hayes, Savvas Damaskinos
Abstract	One of the challenges facing the adoption of digital pathology workflows for clinical use is the need for automated quality control. As the scanners sometimes determine focus inaccurately, the resultant image blur deteriorates the scanned slide to the point of being unusable. Also, the scanned slide images tend to be extremely large when scanned at an image resolution of 20X or greater. Hence, for digital pathology to be clinically useful, it is necessary to use computational tools to quickly and accurately quantify the image focus quality and determine whether an image needs to be re-scanned. We propose a no-reference focus quality assessment metric specifically for digital pathology images that operates by using a sum of even-derivative filter bases to synthesize a human visual system-like kernel, which is modeled as the inverse of the lens’ point spread function. This kernel is then applied to a digital pathology image to modify high-frequency image information deteriorated by the scanner’s optics and quantify the focus quality at the patch level. We show in several experiments that our method correlates better with ground-truth $z$-level data than other methods, and is more computationally efficient. We also extend our method to generate a local slide-level focus quality heatmap, which can be used for automated slide quality control, and demonstrate the utility of our method for clinical scan quality control by comparison with subjective slide quality scores.
Tasks
Published	2018-11-14
URL	http://arxiv.org/abs/1811.06038v1
PDF	http://arxiv.org/pdf/1811.06038v1.pdf
PWC	https://paperswithcode.com/paper/focus-quality-assessment-of-high-throughput
Repo	https://github.com/mahdihosseini/FQPath
Framework	none

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms


Title	GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Authors	Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer
Abstract	In continuous action domains, standard deep reinforcement learning algorithms like DDPG suffer from inefficient exploration when facing sparse or deceptive reward problems. Conversely, evolutionary and developmental methods focusing on exploration like Novelty Search, Quality-Diversity or Goal Exploration Processes explore more robustly but are less efficient at fine-tuning policies using gradient descent. In this paper, we present the GEP-PG approach, taking the best of both worlds by sequentially combining a Goal Exploration Process and two variants of DDPG. We study the learning performance of these components and their combination on a low dimensional deceptive reward problem and on the larger Half-Cheetah benchmark. We show that DDPG fails on the former and that GEP-PG improves over the best DDPG variant in both environments. Supplementary videos and discussion can be found at http://frama.link/gep_pg, the code at http://github.com/flowersteam/geppg.
Tasks
Published	2018-02-14
URL	http://arxiv.org/abs/1802.05054v5
PDF	http://arxiv.org/pdf/1802.05054v5.pdf
PWC	https://paperswithcode.com/paper/gep-pg-decoupling-exploration-and
Repo	https://github.com/flowersteam/geppg
Framework	none

A categorisation and implementation of digital pen features for behaviour characterisation


Title	A categorisation and implementation of digital pen features for behaviour characterisation
Authors	Alexander Prange, Michael Barz, Daniel Sonntag
Abstract	In this paper we provide a categorisation and implementation of digital ink features for behaviour characterisation. Based on four feature sets taken from literature, we provide a categorisation in different classes of syntactic and semantic features. We implemented a publicly available framework to calculate these features and show its deployment in the use case of analysing cognitive assessments performed using a digital pen.
Tasks
Published	2018-10-01
URL	https://arxiv.org/abs/1810.03970v1
PDF	https://arxiv.org/pdf/1810.03970v1.pdf
PWC	https://paperswithcode.com/paper/a-categorisation-and-implementation-of
Repo	https://github.com/DFKI-Interactive-Machine-Learning/ink-features
Framework	none

Contour Parametrization via Anisotropic Mean Curvature Flows


Title	Contour Parametrization via Anisotropic Mean Curvature Flows
Authors	P. Suárez-Serrato, E. I. Velázquez Richards
Abstract	We present a new implementation of anisotropic mean curvature flow for contour recognition. Our procedure couples the mean curvature flow of planar closed smooth curves with an external field from a potential of point-wise charges. This coupling constrains the motion when the curve matches a picture placed as background. We include a stability criterion for our numerical approximation.
Tasks
Published	2018-03-10
URL	http://arxiv.org/abs/1803.03724v1
PDF	http://arxiv.org/pdf/1803.03724v1.pdf
PWC	https://paperswithcode.com/paper/contour-parametrization-via-anisotropic-mean
Repo	https://github.com/V3du4rd0/AMCF
Framework	none

Improving Retrieval-Based Question Answering with Deep Inference Models


Title	Improving Retrieval-Based Question Answering with Deep Inference Models
Authors	George-Sebastian Pirtoaca, Traian Rebedea, Stefan Ruseti
Abstract	Question answering is one of the most important and difficult applications at the border of information retrieval and natural language processing, especially for complex science questions that require some form of inference to determine the correct answer. In this paper, we present a two-step method that combines information retrieval techniques optimized for question answering with deep learning models for natural language inference in order to tackle multiple-choice question answering in the science domain. For each question-answer pair, we use standard retrieval-based models to find relevant candidate contexts and decompose the main problem into two different sub-problems. First, we assign correctness scores to each candidate answer based on the context using retrieval models from Lucene. Second, we use deep learning architectures to determine whether a candidate answer can be inferred from some well-chosen context consisting of sentences retrieved from the knowledge base. In the end, all these solvers are combined using a simple neural network to predict the correct answer. This proposed two-step model outperforms the best retrieval-based solver by over 3% in absolute accuracy.
Tasks	Information Retrieval, Natural Language Inference, Question Answering
Published	2018-12-07
URL	https://arxiv.org/abs/1812.02971v2
PDF	https://arxiv.org/pdf/1812.02971v2.pdf
PWC	https://paperswithcode.com/paper/improving-retrieval-based-question-answering
Repo	https://github.com/SebiSebi/AI2-Reasoning-Challenge-ARC
Framework	none

Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update


Title	Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Authors	Su Young Lee, Sungik Choi, Sae-Young Chung
Abstract	We propose Episodic Backward Update (EBU), a novel deep reinforcement learning algorithm with direct value propagation. In contrast to the conventional use of experience replay with uniform random sampling, our agent samples a whole episode and successively propagates the value of a state to its previous states. Our computationally efficient recursive algorithm allows sparse and delayed rewards to propagate directly through all transitions of the sampled episode. We theoretically prove the convergence of the EBU method and experimentally demonstrate its performance in both deterministic and stochastic environments. In particular, on 49 games of the Atari 2600 domain, EBU achieves the same mean and median human-normalized performance as DQN using only 5% and 10% of the samples, respectively.
Tasks
Published	2018-05-31
URL	https://arxiv.org/abs/1805.12375v3
PDF	https://arxiv.org/pdf/1805.12375v3.pdf
PWC	https://paperswithcode.com/paper/sample-efficient-deep-reinforcement-learning-2
Repo	https://github.com/suyoung-lee/Episodic-Backward-Update
Framework	none
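
The core idea, replaying a whole episode in reverse so that a late reward reaches early states in one pass, can be sketched in tabular form. This is a simplified illustration and not the paper's deep-network algorithm: the function name `backward_update`, the tabular Q-table, and the learning-rate handling are assumptions.

```python
from collections import defaultdict

def backward_update(q, episode, gamma=0.99, lr=1.0):
    """Update Q-values from the last transition of an episode to the first.

    episode: list of (state, action, reward, next_state, done) tuples in time order.
    """
    for state, action, reward, next_state, done in reversed(episode):
        if done:
            target = reward
        else:
            # next_state was already updated in this pass, so its value is fresh.
            target = reward + gamma * max(q[next_state].values(), default=0.0)
        q[state][action] += lr * (target - q[state][action])
    return q

if __name__ == "__main__":
    q = defaultdict(lambda: defaultdict(float))
    episode = [("s0", "a", 0.0, "s1", False),
               ("s1", "a", 0.0, "s2", False),
               ("s2", "a", 1.0, "end", True)]
    backward_update(q, episode, gamma=0.5)
    print(q["s0"]["a"], q["s1"]["a"], q["s2"]["a"])
```

With uniform random sampling of single transitions, the final reward would typically need many updates to reach `s0`; the backward sweep delivers it in a single pass over the episode.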

Multi-scale Location-aware Kernel Representation for Object Detection


Title	Multi-scale Location-aware Kernel Representation for Object Detection
Authors	Hao Wang, Qilong Wang, Mingqi Gao, Peihua Li, Wangmeng Zuo
Abstract	Although Faster R-CNN and its variants have shown promising performance in object detection, they only exploit simple first-order representation of object proposals for final classification and regression. Recent classification methods demonstrate that the integration of high-order statistics into deep convolutional neural networks can achieve impressive improvement, but their goal is to model whole images by discarding location information, so they cannot be directly adopted for object detection. In this paper, we make an attempt to exploit high-order statistics in object detection, aiming at generating more discriminative representations for proposals to enhance the performance of detectors. To this end, we propose a novel Multi-scale Location-aware Kernel Representation (MLKP) to capture high-order statistics of deep features in proposals. Our MLKP can be efficiently computed on a modified multi-scale feature map using a low-dimensional polynomial kernel approximation. Moreover, different from existing orderless global representations based on high-order statistics, our proposed MLKP is location retentive and sensitive, so it can be flexibly adopted for object detection. Integrated into the Faster R-CNN schema, the proposed MLKP achieves very competitive performance with state-of-the-art methods, and improves Faster R-CNN by 4.9% (mAP), 4.7% (mAP) and 5.0% (AP at IOU=[0.5:0.05:0.95]) on PASCAL VOC 2007, VOC 2012 and MS COCO benchmarks, respectively. Code is available at: https://github.com/Hwang64/MLKP.
Tasks	Object Detection
Published	2018-04-02
URL	http://arxiv.org/abs/1804.00428v1
PDF	http://arxiv.org/pdf/1804.00428v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-location-aware-kernel
Repo	https://github.com/Hwang64/MLKP
Framework	none

Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System


Title	Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System
Authors	Jiaxi Tang, Ke Wang
Abstract	We propose a novel way to train ranking models, such as recommender systems, that are both effective and efficient. Knowledge distillation (KD) was shown to be successful in image recognition to achieve both effectiveness and efficiency. We propose a KD technique for learning-to-rank problems, called ranking distillation (RD). Specifically, we train a smaller student model to learn to rank documents/items from both the training data and the supervision of a larger teacher model. The student model achieves a similar ranking performance to that of the large teacher model, but its smaller model size makes the online inference more efficient. RD is flexible because it is orthogonal to the choices of ranking models for the teacher and student. We address the challenges of RD for ranking problems. The experiments on public data sets and state-of-the-art recommendation models showed that RD achieves its design purposes: the student model learnt with RD has a model size less than half of the teacher model while achieving a ranking performance similar to the teacher model and much better than the student model learnt without RD.
Tasks	Learning-To-Rank, Recommendation Systems
Published	2018-09-19
URL	http://arxiv.org/abs/1809.07428v1
PDF	http://arxiv.org/pdf/1809.07428v1.pdf
PWC	https://paperswithcode.com/paper/ranking-distillation-learning-compact-ranking
Repo	https://github.com/graytowne/rank_distill
Framework	pytorch
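
A student objective that learns from both the training data and the teacher's ranking can be sketched as a weighted sum of two terms. This is a hedged illustration, not the authors' loss: treating the teacher's top-K items as extra positives, the log-sigmoid form, the mixing weight `alpha`, and all function names are assumptions made for the example.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def distillation_loss(student_scores, teacher_top_k, weights=None):
    """Penalize the student for scoring the teacher's top-K items low."""
    if weights is None:
        weights = [1.0] * len(teacher_top_k)
    # -log(sigmoid(s)) equals softplus(-s).
    return sum(w * softplus(-student_scores[item])
               for w, item in zip(weights, teacher_top_k))

def student_loss(ranking_loss, student_scores, teacher_top_k, alpha=0.5, weights=None):
    """Combine the loss on labeled training data with the teacher-supervised term."""
    return ranking_loss + alpha * distillation_loss(student_scores, teacher_top_k, weights)

if __name__ == "__main__":
    scores = {"a": 0.0, "b": 0.0, "c": 5.0}
    print(student_loss(1.0, scores, teacher_top_k=["a", "b"]))
```

The optional `weights` argument leaves room for weighting items by their position in the teacher's list, which is one plausible design choice rather than a confirmed detail.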