Paper Group AWR 269
NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations. Towards Optimal Power Control via Ensembling Deep Neural Networks. Rotation Equivariant CNNs for Digital Pathology. Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network. Product-based Neural Networks for User Response Prediction over Mult …
NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations
Title | NAIS-Net: Stable Deep Networks from Non-Autonomous Differential Equations |
Authors | Marco Ciccone, Marco Gallieri, Jonathan Masci, Christian Osendorfer, Faustino Gomez |
Abstract | This paper introduces Non-Autonomous Input-Output Stable Network (NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable so that for every initial condition there is exactly one input-dependent equilibrium assuming tanh units, and multiple stable equilibria for ReL units. An efficient implementation that enforces the stability under derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets. |
Tasks | |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07209v3 |
http://arxiv.org/pdf/1804.07209v3.pdf | |
PWC | https://paperswithcode.com/paper/nais-net-stable-deep-networks-from-non |
Repo | https://github.com/batuhanguler/Deep-BSDE-Solver |
Framework | pytorch |
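As a rough illustration of the non-autonomous unrolling described in the abstract, the sketch below iterates a scalar residual update in which the block input `u` is fed into every stage. The update form `x + h * tanh(a*x + b*u + c)` and all parameter values are assumptions chosen for illustration; they are not taken from the listing or from the linked repository.

```python
import math

def nais_block(u, x0, a=-1.0, b=1.0, c=0.0, h=0.5, steps=200):
    """Unroll a scalar non-autonomous block (illustrative, hypothetical parameters)."""
    x = x0
    for _ in range(steps):
        # Residual-style update: the block input u is re-injected at every
        # unrolled stage, which is what makes the system non-autonomous.
        x = x + h * math.tanh(a * x + b * u + c)
    return x

# Same input, two different initial states.
print(nais_block(u=2.0, x0=-5.0))
print(nais_block(u=2.0, x0=7.0))
```

With a negative `a` and a small step `h`, the iterates in this toy setting settle where `a*x + b*u + c = 0`, so the resting state depends on the input rather than on the initial condition.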
Towards Optimal Power Control via Ensembling Deep Neural Networks
Title | Towards Optimal Power Control via Ensembling Deep Neural Networks |
Authors | Fei Liang, Cong Shen, Wei Yu, Feng Wu |
Abstract | A deep neural network (DNN) based power control method is proposed, which aims at solving the non-convex optimization problem of maximizing the sum rate of a multi-user interference channel. Towards this end, we first present PCNet, which is a multi-layer fully connected neural network that is specifically designed for the power control problem. PCNet takes the channel coefficients as input and outputs the transmit power of all users. A key challenge in training a DNN for the power control problem is the lack of ground truth, i.e., the optimal power allocation is unknown. To address this issue, PCNet leverages the unsupervised learning strategy and directly maximizes the sum rate in the training phase. Observing that a single PCNet does not globally outperform the existing solutions, we further propose ePCNet, a network ensemble with multiple PCNets trained independently. Simulation results show that for the standard symmetric multi-user Gaussian interference channel, ePCNet can outperform all state-of-the-art power control methods by 1.2%-4.6% under a variety of system configurations. Furthermore, the performance improvement of ePCNet comes with a reduced computational complexity. |
Tasks | |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10025v2 |
http://arxiv.org/pdf/1807.10025v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-optimal-power-control-via-ensembling |
Repo | https://github.com/ShenGroup/PCNet-ePCNet |
Framework | tf |
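The unsupervised objective mentioned in the abstract can be sketched as a negative sum rate. The abstract does not state the formula, so the block below assumes the standard SINR-based rate expression for a Gaussian interference channel; the function names, the `gains` layout, and the unit noise default are hypothetical.

```python
import math

def sum_rate(gains, powers, noise=1.0):
    """Sum rate in bits/s/Hz; gains[i][j] is the power gain from transmitter j to receiver i."""
    total = 0.0
    for i, p_i in enumerate(powers):
        # Interference at receiver i from every other transmitter.
        interference = sum(gains[i][j] * p_j for j, p_j in enumerate(powers) if j != i)
        sinr = gains[i][i] * p_i / (noise + interference)
        total += math.log2(1.0 + sinr)
    return total

def unsupervised_loss(gains, powers, noise=1.0):
    # Minimizing the negative sum rate requires no ground-truth power allocation.
    return -sum_rate(gains, powers, noise)
```

In a training loop, `powers` would be the network output for the given channel coefficients, and the loss would be differentiated with respect to the network weights.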
Rotation Equivariant CNNs for Digital Pathology
Title | Rotation Equivariant CNNs for Digital Pathology |
Authors | Bastiaan S. Veeling, Jasper Linmans, Jim Winkens, Taco Cohen, Max Welling |
Abstract | We propose a new model for digital pathology segmentation, based on the observation that histopathology images are inherently symmetric under rotation and reflection. Utilizing recent findings on rotation equivariant CNNs, the proposed model leverages these symmetries in a principled manner. We present a visual analysis showing improved stability on predictions, and demonstrate that exploiting rotation equivariance significantly improves tumor detection performance on a challenging lymph node metastases dataset. We further present a novel derived dataset to enable principled comparison of machine learning models, in combination with an initial benchmark. Through this dataset, the task of histopathology diagnosis becomes accessible as a challenging benchmark for fundamental machine learning research. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03962v1 |
http://arxiv.org/pdf/1806.03962v1.pdf | |
PWC | https://paperswithcode.com/paper/rotation-equivariant-cnns-for-digital |
Repo | https://github.com/eb00/pcam_analysis |
Framework | tf |
Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network
Title | Lightweight and Efficient Image Super-Resolution with Block State-based Recursive Network |
Authors | Jun-Ho Choi, Jun-Hyuk Kim, Manri Cheon, Jong-Seok Lee |
Abstract | Recently, several deep learning-based image super-resolution methods have been developed by stacking massive numbers of layers. However, this leads to large model sizes and high computational complexities, so some recursive parameter-sharing methods have also been proposed. Nevertheless, their designs do not properly utilize the potential of the recursive operation. In this paper, we propose a novel, lightweight, and efficient super-resolution method to maximize the usefulness of the recursive architecture, by introducing a block state-based recursive network. By utilizing the block state, the recursive part of our model can easily track the status of the current image features. We show the benefits of the proposed method in terms of model size, speed, and efficiency. In addition, we show that our method outperforms the other state-of-the-art methods. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1811.12546v1 |
http://arxiv.org/pdf/1811.12546v1.pdf | |
PWC | https://paperswithcode.com/paper/lightweight-and-efficient-image-super |
Repo | https://github.com/manricheon/manricheon.github.io |
Framework | tf |
Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data
Title | Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data |
Authors | Yanru Qu, Bohui Fang, Weinan Zhang, Ruiming Tang, Minzhe Niu, Huifeng Guo, Yong Yu, Xiuqiang He |
Abstract | User response prediction is a crucial component for personalized information retrieval and filtering scenarios, such as recommender systems and web search. The data in user response prediction is mostly in a multi-field categorical format and transformed into sparse representations via one-hot encoding. Due to the sparsity problems in representation and optimization, most research focuses on feature engineering and shallow modeling. Recently, deep neural networks have attracted research attention on such a problem for their high capacity and end-to-end training scheme. In this paper, we study user response prediction in the scenario of click prediction. We first analyze a coupled gradient issue in latent vector-based models and propose kernel product to learn field-aware feature interactions. Then we discuss an insensitive gradient issue in DNN-based models and propose Product-based Neural Network (PNN) which adopts a feature extractor to explore feature interactions. Generalizing the kernel product to a net-in-net architecture, we further propose Product-network In Network (PIN) which can generalize previous models. Extensive experiments on 4 industrial datasets and 1 contest dataset demonstrate that our models consistently outperform 8 baselines on both AUC and log loss. In addition, PIN achieves a large CTR improvement (34.67% relative) in an online A/B test. |
Tasks | Click-Through Rate Prediction, Feature Engineering, Information Retrieval, Recommendation Systems |
Published | 2018-07-01 |
URL | http://arxiv.org/abs/1807.00311v1 |
http://arxiv.org/pdf/1807.00311v1.pdf | |
PWC | https://paperswithcode.com/paper/product-based-neural-networks-for-user-1 |
Repo | https://github.com/Atomu2014/product-nets |
Framework | tf |
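One plausible reading of the "kernel product" named in the abstract is a bilinear interaction between two field embeddings, with a kernel matrix specific to the field pair. The sketch below assumes that form; the function name and the example kernels are hypothetical and are not taken from the linked repository.

```python
def kernel_product(v_i, v_j, kernel):
    """Bilinear interaction v_i^T K v_j between two field embeddings (assumed form)."""
    return sum(
        v_i[a] * kernel[a][b] * v_j[b]
        for a in range(len(v_i))
        for b in range(len(v_j))
    )

# With an identity kernel the interaction reduces to a plain inner product.
identity = [[1.0, 0.0], [0.0, 1.0]]
print(kernel_product([1.0, 2.0], [3.0, 4.0], identity))
```

Under this assumption, learning a separate kernel per field pair lets the interaction depend on which fields are involved, while the embeddings themselves stay shared.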
Accelerator-Aware Pruning for Convolutional Neural Networks
Title | Accelerator-Aware Pruning for Convolutional Neural Networks |
Authors | Hyeong-Ju Kang |
Abstract | Convolutional neural networks have shown tremendous performance capabilities in computer vision tasks, but their excessive amounts of weight storage and arithmetic operations prevent them from being adopted in embedded environments. One of the solutions involves pruning, where certain unimportant weights are forced to have a value of zero. Many pruning schemes have been proposed, but these have mainly focused on the number of pruned weights. Previous pruning schemes scarcely considered ASIC or FPGA accelerator architectures. When these pruned networks are run on accelerators, the lack of consideration of the architecture causes some inefficiency problems, including internal buffer misalignments and load imbalances. This paper proposes a new pruning scheme that reflects accelerator architectures. In the proposed scheme, pruning is performed so that the same number of weights remain for each weight group corresponding to activations fetched simultaneously. In this way, the pruning scheme resolves the inefficiency problems, doubling the accelerator performance. Even with this constraint, the proposed pruning scheme reached a pruning ratio similar to that of previous unconstrained pruning schemes, not only on AlexNet and VGG16 but also on state-of-the-art very deep networks such as ResNet. Furthermore, the proposed scheme demonstrated a comparable pruning ratio on compact networks such as MobileNet and on slimmed networks that were already pruned in a channel-wise manner. In addition to improving the efficiency of previous sparse accelerators, it is also shown that the proposed pruning scheme can be used to reduce the logic complexity of sparse accelerators. |
Tasks | |
Published | 2018-04-26 |
URL | https://arxiv.org/abs/1804.09862v2 |
https://arxiv.org/pdf/1804.09862v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerator-aware-pruning-for-convolutional |
Repo | https://github.com/hjkang1976/accelerator-aware-pruning |
Framework | none |
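The constraint described in the abstract, that every weight group keeps the same number of weights, can be sketched with magnitude-based selection inside fixed-size groups. Grouping consecutive weights and ranking by absolute value are simplifying assumptions made here; the abstract only says that groups correspond to activations fetched simultaneously.

```python
def prune_groups(weights, group_size, keep):
    """Zero all but the `keep` largest-magnitude weights in each consecutive group."""
    pruned = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Indices of the `keep` largest magnitudes within this group.
        order = sorted(range(len(group)), key=lambda i: abs(group[i]), reverse=True)
        kept = set(order[:keep])
        pruned.extend(w if i in kept else 0.0 for i, w in enumerate(group))
    return pruned

weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
print(prune_groups(weights, group_size=4, keep=2))
```

Because each group ends up with exactly `keep` nonzero entries, a sparse accelerator processing one group per lane would see equal work per lane, which is the load-balance property the abstract refers to.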
Beyond Gradient Descent for Regularized Segmentation Losses
Title | Beyond Gradient Descent for Regularized Segmentation Losses |
Authors | Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov |
Abstract | The simplicity of gradient descent (GD) made it the default method for training ever-deeper and complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves the state-of-the-art while GD performs poorly. Interestingly, GD obtains its best result for a “smoother” tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in “shallow” segmentation and their known global solvers. Our work suggests that network design/training should pay more attention to optimization methods. |
Tasks | |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02322v2 |
http://arxiv.org/pdf/1809.02322v2.pdf | |
PWC | https://paperswithcode.com/paper/adm-for-grid-crf-loss-in-cnn-segmentation |
Repo | https://github.com/dmitrii-marin/adm-seg |
Framework | none |
Focus Quality Assessment of High-Throughput Whole Slide Imaging in Digital Pathology
Title | Focus Quality Assessment of High-Throughput Whole Slide Imaging in Digital Pathology |
Authors | Mahdi S. Hosseini, Yueyang Zhang, Lyndon Chan, Konstantinos N. Plataniotis, Jasper A. Z. Brawley-Hayes, Savvas Damaskinos |
Abstract | One of the challenges facing the adoption of digital pathology workflows for clinical use is the need for automated quality control. As the scanners sometimes determine focus inaccurately, the resultant image blur deteriorates the scanned slide to the point of being unusable. Also, the scanned slide images tend to be extremely large when scanned at 20X image resolution or higher. Hence, for digital pathology to be clinically useful, it is necessary to use computational tools to quickly and accurately quantify the image focus quality and determine whether an image needs to be re-scanned. We propose a no-reference focus quality assessment metric specifically for digital pathology images that operates by using a sum of even-derivative filter bases to synthesize a human visual system-like kernel, which is modeled as the inverse of the lens’ point spread function. This kernel is then applied to a digital pathology image to modify high-frequency image information deteriorated by the scanner’s optics and quantify the focus quality at the patch level. We show in several experiments that our method correlates better with ground-truth $z$-level data than other methods, and is more computationally efficient. We also extend our method to generate a local slide-level focus quality heatmap, which can be used for automated slide quality control, and demonstrate the utility of our method for clinical scan quality control by comparison with subjective slide quality scores. |
Tasks | |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.06038v1 |
http://arxiv.org/pdf/1811.06038v1.pdf | |
PWC | https://paperswithcode.com/paper/focus-quality-assessment-of-high-throughput |
Repo | https://github.com/mahdihosseini/FQPath |
Framework | none |
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Title | GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms |
Authors | Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer |
Abstract | In continuous action domains, standard deep reinforcement learning algorithms like DDPG suffer from inefficient exploration when facing sparse or deceptive reward problems. Conversely, evolutionary and developmental methods focusing on exploration like Novelty Search, Quality-Diversity or Goal Exploration Processes explore more robustly but are less efficient at fine-tuning policies using gradient descent. In this paper, we present the GEP-PG approach, taking the best of both worlds by sequentially combining a Goal Exploration Process and two variants of DDPG. We study the learning performance of these components and their combination on a low dimensional deceptive reward problem and on the larger Half-Cheetah benchmark. We show that DDPG fails on the former and that GEP-PG improves over the best DDPG variant in both environments. Supplementary videos and discussion can be found at http://frama.link/gep_pg, the code at http://github.com/flowersteam/geppg. |
Tasks | |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.05054v5 |
http://arxiv.org/pdf/1802.05054v5.pdf | |
PWC | https://paperswithcode.com/paper/gep-pg-decoupling-exploration-and |
Repo | https://github.com/flowersteam/geppg |
Framework | none |
A categorisation and implementation of digital pen features for behaviour characterisation
Title | A categorisation and implementation of digital pen features for behaviour characterisation |
Authors | Alexander Prange, Michael Barz, Daniel Sonntag |
Abstract | In this paper we provide a categorisation and implementation of digital ink features for behaviour characterisation. Based on four feature sets taken from literature, we provide a categorisation in different classes of syntactic and semantic features. We implemented a publicly available framework to calculate these features and show its deployment in the use case of analysing cognitive assessments performed using a digital pen. |
Tasks | |
Published | 2018-10-01 |
URL | https://arxiv.org/abs/1810.03970v1 |
https://arxiv.org/pdf/1810.03970v1.pdf | |
PWC | https://paperswithcode.com/paper/a-categorisation-and-implementation-of |
Repo | https://github.com/DFKI-Interactive-Machine-Learning/ink-features |
Framework | none |
Contour Parametrization via Anisotropic Mean Curvature Flows
Title | Contour Parametrization via Anisotropic Mean Curvature Flows |
Authors | P. Suárez-Serrato, E. I. Velázquez Richards |
Abstract | We present a new implementation of anisotropic mean curvature flow for contour recognition. Our procedure couples the mean curvature flow of planar closed smooth curves with an external field from a potential of point-wise charges. This coupling constrains the motion when the curve matches a picture placed as background. We include a stability criterion for our numerical approximation. |
Tasks | |
Published | 2018-03-10 |
URL | http://arxiv.org/abs/1803.03724v1 |
http://arxiv.org/pdf/1803.03724v1.pdf | |
PWC | https://paperswithcode.com/paper/contour-parametrization-via-anisotropic-mean |
Repo | https://github.com/V3du4rd0/AMCF |
Framework | none |
Improving Retrieval-Based Question Answering with Deep Inference Models
Title | Improving Retrieval-Based Question Answering with Deep Inference Models |
Authors | George-Sebastian Pirtoaca, Traian Rebedea, Stefan Ruseti |
Abstract | Question answering is one of the most important and difficult applications at the border of information retrieval and natural language processing, especially when we talk about complex science questions which require some form of inference to determine the correct answer. In this paper, we present a two-step method that combines information retrieval techniques optimized for question answering with deep learning models for natural language inference in order to tackle multi-choice question answering in the science domain. For each question-answer pair, we use standard retrieval-based models to find relevant candidate contexts and decompose the main problem into two different sub-problems. First, we assign correctness scores to each candidate answer based on the context using retrieval models from Lucene. Second, we use deep learning architectures to compute if a candidate answer can be inferred from some well-chosen context consisting of sentences retrieved from the knowledge base. In the end, all these solvers are combined using a simple neural network to predict the correct answer. This proposed two-step model outperforms the best retrieval-based solver by over 3% in absolute accuracy. |
Tasks | Information Retrieval, Natural Language Inference, Question Answering |
Published | 2018-12-07 |
URL | https://arxiv.org/abs/1812.02971v2 |
https://arxiv.org/pdf/1812.02971v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-retrieval-based-question-answering |
Repo | https://github.com/SebiSebi/AI2-Reasoning-Challenge-ARC |
Framework | none |
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Title | Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update |
Authors | Su Young Lee, Sungik Choi, Sae-Young Chung |
Abstract | We propose Episodic Backward Update (EBU) - a novel deep reinforcement learning algorithm with a direct value propagation. In contrast to the conventional use of the experience replay with uniform random sampling, our agent samples a whole episode and successively propagates the value of a state to its previous states. Our computationally efficient recursive algorithm allows sparse and delayed rewards to propagate directly through all transitions of the sampled episode. We theoretically prove the convergence of the EBU method and experimentally demonstrate its performance in both deterministic and stochastic environments. In particular, on 49 games of the Atari 2600 domain, EBU achieves the same mean and median human-normalized performance as DQN while using only 5% and 10% of the samples, respectively. |
Tasks | |
Published | 2018-05-31 |
URL | https://arxiv.org/abs/1805.12375v3 |
https://arxiv.org/pdf/1805.12375v3.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-deep-reinforcement-learning-2 |
Repo | https://github.com/suyoung-lee/Episodic-Backward-Update |
Framework | none |
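The backward ordering described in the abstract can be illustrated with a tabular Q-learning pass that walks a sampled episode from its last transition to its first. This is a simplified sketch under stated assumptions: it uses a plain dictionary as the Q-table and a standard one-step target, and it omits the details of the paper's deep-network version. All names are hypothetical.

```python
def episodic_backward_update(q, episode, gamma=0.99, lr=1.0):
    """Update q in place. q maps state -> list of action values;
    episode is a list of (state, action, reward, next_state, done) tuples."""
    for s, a, r, s_next, done in reversed(episode):
        # The target uses q[s_next], which was already refreshed earlier in
        # this backward sweep, so a late reward reaches early states in one pass.
        target = r if done else r + gamma * max(q[s_next])
        q[s][a] += lr * (target - q[s][a])
    return q

q = {s: [0.0, 0.0] for s in range(4)}
episode = [(0, 0, 0.0, 1, False), (1, 0, 0.0, 2, False), (2, 0, 1.0, 3, True)]
episodic_backward_update(q, episode, gamma=0.5)
print(q[0][0], q[1][0], q[2][0])
```

With uniform random sampling of single transitions, the same reward would typically need several updates to reach the first state; the backward sweep gets there in a single pass over the episode.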
Multi-scale Location-aware Kernel Representation for Object Detection
Title | Multi-scale Location-aware Kernel Representation for Object Detection |
Authors | Hao Wang, Qilong Wang, Mingqi Gao, Peihua Li, Wangmeng Zuo |
Abstract | Although Faster R-CNN and its variants have shown promising performance in object detection, they only exploit simple first-order representation of object proposals for final classification and regression. Recent classification methods demonstrate that the integration of high-order statistics into deep convolutional neural networks can achieve impressive improvement, but their goal is to model whole images by discarding location information so that they cannot be directly adopted to object detection. In this paper, we make an attempt to exploit high-order statistics in object detection, aiming at generating more discriminative representations for proposals to enhance the performance of detectors. To this end, we propose a novel Multi-scale Location-aware Kernel Representation (MLKP) to capture high-order statistics of deep features in proposals. Our MLKP can be efficiently computed on a modified multi-scale feature map using a low-dimensional polynomial kernel approximation. Moreover, different from existing orderless global representations based on high-order statistics, our proposed MLKP is location retentive and sensitive so that it can be flexibly adopted to object detection. Through integrating into Faster R-CNN schema, the proposed MLKP achieves very competitive performance with state-of-the-art methods, and improves Faster R-CNN by 4.9% (mAP), 4.7% (mAP) and 5.0% (AP at IOU=[0.5:0.05:0.95]) on PASCAL VOC 2007, VOC 2012 and MS COCO benchmarks, respectively. Code is available at: https://github.com/Hwang64/MLKP. |
Tasks | Object Detection |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00428v1 |
http://arxiv.org/pdf/1804.00428v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-location-aware-kernel |
Repo | https://github.com/Hwang64/MLKP |
Framework | none |
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System
Title | Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System |
Authors | Jiaxi Tang, Ke Wang |
Abstract | We propose a novel way to train ranking models, such as recommender systems, that are both effective and efficient. Knowledge distillation (KD) was shown to be successful in image recognition to achieve both effectiveness and efficiency. We propose a KD technique for learning to rank problems, called ranking distillation (RD). Specifically, we train a smaller student model to learn to rank documents/items from both the training data and the supervision of a larger teacher model. The student model achieves a similar ranking performance to that of the large teacher model, but its smaller model size makes the online inference more efficient. RD is flexible because it is orthogonal to the choices of ranking models for the teacher and student. We address the challenges of RD for ranking problems. The experiments on public data sets and state-of-the-art recommendation models showed that RD achieves its design purposes: the student model learnt with RD has a model size less than half of the teacher model while achieving a ranking performance similar to the teacher model and much better than the student model learnt without RD. |
Tasks | Learning-To-Rank, Recommendation Systems |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07428v1 |
http://arxiv.org/pdf/1809.07428v1.pdf | |
PWC | https://paperswithcode.com/paper/ranking-distillation-learning-compact-ranking |
Repo | https://github.com/graytowne/rank_distill |
Framework | pytorch |
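The idea of learning from both the training data and the teacher's supervision can be sketched as a two-term loss. The block below assumes a pointwise logistic ranking loss and treats the teacher's top-K items as extra positives with a uniform weight `alpha`; these choices, and all names, are illustrative assumptions rather than the loss used in the linked repository.

```python
import math

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    return -math.log1p(math.exp(-x)) if x >= 0 else x - math.log1p(math.exp(x))

def rd_loss(student_scores, positives, negatives, teacher_top_k, alpha=0.5):
    """Pointwise ranking loss plus a distillation term on the teacher's top-K items."""
    # Supervised part: push labeled positives up and sampled negatives down.
    ranking = -sum(log_sigmoid(student_scores[i]) for i in positives)
    ranking -= sum(log_sigmoid(-student_scores[i]) for i in negatives)
    # Distillation part: treat the teacher's top-K items as additional positives.
    distill = -sum(log_sigmoid(student_scores[i]) for i in teacher_top_k)
    return ranking + alpha * distill
```

Here `student_scores` maps an item index to the student's score for one query or user, and `teacher_top_k` would come from ranking the unlabeled items with the larger teacher model.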