October 21, 2019


Paper Group AWR 108


Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network

Title Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network
Authors Zhongliang Yang, Yongfeng Huang, Yiran Jiang, Yuxi Sun, Yu-Jin Zhan, Pengcheng Luo
Abstract Automatically extracting useful information from electronic medical records and conducting disease diagnosis is a promising task for both clinical decision support (CDS) and natural language processing (NLP). Most existing systems are based on artificially constructed knowledge bases and perform auxiliary diagnosis by rule matching. In this study, we present a clinical intelligent decision approach based on Convolutional Neural Networks (CNN), which can automatically extract high-level semantic information from electronic medical records and then perform automatic diagnosis without artificial construction of rules or knowledge bases. We use 18,590 real-world clinical electronic medical records to train and test the proposed model. Experimental results show that the proposed model achieves 98.67% accuracy and 96.02% recall, which strongly supports that using a convolutional neural network to automatically learn high-level semantic features of electronic medical records and then conduct assisted diagnosis is feasible and effective.
Tasks
Published 2018-04-23
URL http://arxiv.org/abs/1804.08261v1
PDF http://arxiv.org/pdf/1804.08261v1.pdf
PWC https://paperswithcode.com/paper/clinical-assistant-diagnosis-for-electronic
Repo https://github.com/YangzlTHU/C-EMRs
Framework none
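The core mechanism behind a CNN-based text classifier of this kind — convolutional filters sliding over token embeddings followed by max-over-time pooling — can be sketched in a few lines. This is an illustrative simplification, not the authors' exact architecture; the embedding and filter shapes are hypothetical.

```python
import numpy as np

def textcnn_features(embeddings, filters):
    """Max-over-time pooled 1-D convolution features, the core of a
    TextCNN-style classifier (a sketch, not the paper's exact model).

    embeddings: (seq_len, emb_dim) token embeddings
    filters: list of (width, emb_dim) convolution kernels
    Returns one pooled activation per filter; a linear classifier
    over these features would produce the diagnosis.
    """
    seq_len, _ = embeddings.shape
    feats = []
    for f in filters:
        width = f.shape[0]
        # valid 1-D convolution over the token axis
        acts = [float(np.sum(embeddings[t:t + width] * f))
                for t in range(seq_len - width + 1)]
        feats.append(max(acts))  # max-over-time pooling
    return feats
```

In practice the filters span several widths (e.g. 2–5 tokens) so the pooled features capture n-gram-like evidence at different scales.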

Psychophysical evaluation of individual low-level feature influences on visual attention

Title Psychophysical evaluation of individual low-level feature influences on visual attention
Authors David Berga, Xosé Ramón Fdez-Vidal, Xavier Otazu, Víctor Leborán, Xosé María Pardo
Abstract In this study we provide an analysis of eye movement behavior elicited by low-level feature distinctiveness, using a dataset of synthetically-generated image patterns. The design of the visual stimuli was inspired by those used in previous psychophysical experiments, namely in free-viewing and visual-search tasks, providing a total of 15 types of stimuli divided according to the task and feature to be analyzed. Our interest is to analyze the influence of low-level feature contrast between a salient region and the rest of the distractors, providing fixation localization characteristics and the reaction time of landing inside the salient region. Eye-tracking data was collected from 34 participants during the viewing of a dataset of 230 images. Results show that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. temporality of fixations, 4. task difficulty and 5. center bias. This experimentation proposes a new psychophysical basis for saliency model evaluation using synthetic images.
Tasks Eye Tracking
Published 2018-11-15
URL http://arxiv.org/abs/1811.06458v1
PDF http://arxiv.org/pdf/1811.06458v1.pdf
PWC https://paperswithcode.com/paper/psychophysical-evaluation-of-individual-low
Repo https://github.com/dberga/sig4vam
Framework none

Acquisition of Phrase Correspondences using Natural Deduction Proofs

Title Acquisition of Phrase Correspondences using Natural Deduction Proofs
Authors Hitomi Yanaka, Koji Mineshima, Pascual Martinez-Gomez, Daisuke Bekki
Abstract How to identify, extract, and use phrasal knowledge is a crucial problem for the task of Recognizing Textual Entailment (RTE). To solve this problem, we propose a method for detecting paraphrases via natural deduction proofs of semantic relations between sentence pairs. Our solution relies on a graph reformulation of partial variable unifications and an algorithm that induces subgraph alignments between meaning representations. Experiments show that our method can automatically detect various paraphrases that are absent from existing paraphrase databases. In addition, the detection of paraphrases using proof information improves the accuracy of RTE tasks.
Tasks Natural Language Inference
Published 2018-04-20
URL http://arxiv.org/abs/1804.07656v1
PDF http://arxiv.org/pdf/1804.07656v1.pdf
PWC https://paperswithcode.com/paper/acquisition-of-phrase-correspondences-using
Repo https://github.com/mynlp/ccg2lambda
Framework none

Constrained-CNN losses for weakly supervised segmentation

Title Constrained-CNN losses for weakly supervised segmentation
Authors Hoel Kervadec, Jose Dolz, Meng Tang, Eric Granger, Yuri Boykov, Ismail Ben Ayed
Abstract Weakly-supervised learning based on, e.g., partially labelled images or image-tags, is currently attracting significant attention in CNN segmentation as it can mitigate the need for full and laborious pixel/voxel annotations. Enforcing high-order (global) inequality constraints on the network output (for instance, to constrain the size of the target region) can leverage unlabeled data, guiding the training process with domain-specific knowledge. Inequality constraints are very flexible because they do not assume exact prior knowledge. However, constrained Lagrangian dual optimization has been largely avoided in deep networks, mainly for computational tractability reasons. To the best of our knowledge, the method of [Pathak et al., 2015] is the only prior work that addresses deep CNNs with linear constraints in weakly supervised segmentation. It uses the constraints to synthesize fully-labeled training masks (proposals) from weak labels, mimicking full supervision and facilitating dual optimization. We propose to introduce a differentiable penalty, which enforces inequality constraints directly in the loss function, avoiding expensive Lagrangian dual iterates and proposal generation. From a constrained-optimization perspective, our simple penalty-based approach is not optimal as there is no guarantee that the constraints are satisfied. However, surprisingly, it yields substantially better results than the Lagrangian-based constrained CNNs in [Pathak et al., 2015], while reducing the computational demand for training. By annotating only a small fraction of the pixels, the proposed approach can reach a level of segmentation performance that is comparable to full supervision on three separate tasks. While our experiments focused on basic linear constraints such as the target-region size and image tags, our framework can be easily extended to other non-linear constraints.
Tasks Medical Image Segmentation, Semantic Segmentation, Weakly-Supervised Semantic Segmentation
Published 2018-05-12
URL http://arxiv.org/abs/1805.04628v2
PDF http://arxiv.org/pdf/1805.04628v2.pdf
PWC https://paperswithcode.com/paper/constrained-cnn-losses-for-weakly-supervised
Repo https://github.com/meng-tang/rloss
Framework pytorch
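The penalty idea can be sketched minimally: a differentiable quadratic penalty on the soft size of the predicted region, zero whenever the size lies inside the allowed interval. The bounds and probabilities here are hypothetical; the paper applies such penalties over pixel-wise softmax outputs per image.

```python
def size_penalty(probs, lower, upper):
    """Differentiable penalty enforcing lower <= sum(probs) <= upper
    on the predicted (soft) size of the target region: zero inside
    the bounds, quadratic outside. A sketch of the penalty-based
    relaxation of an inequality constraint.
    """
    size = sum(probs)  # soft region size: sum of foreground probabilities
    if size < lower:
        return (size - lower) ** 2
    if size > upper:
        return (size - upper) ** 2
    return 0.0
```

Added to the partial cross-entropy loss, this term pushes the network toward predictions whose region size respects the prior bounds without any Lagrangian dual iterates.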

Quantifying model form uncertainty in Reynolds-averaged turbulence models with Bayesian deep neural networks

Title Quantifying model form uncertainty in Reynolds-averaged turbulence models with Bayesian deep neural networks
Authors Nicholas Geneva, Nicholas Zabaras
Abstract Data-driven methods for improving turbulence modeling in Reynolds-Averaged Navier-Stokes (RANS) simulations have gained significant interest in the computational fluid dynamics community. Modern machine learning algorithms have opened up a new area of black-box turbulence models allowing for the tuning of RANS simulations to increase their predictive accuracy. While several data-driven turbulence models have been reported, the quantification of the uncertainties introduced has mostly been neglected. Uncertainty quantification for such data-driven models is essential since their predictive capability rapidly declines as they are tested on flow physics that deviates from that in the training data. In this work, we propose a novel data-driven framework that not only improves RANS predictions but also provides probabilistic bounds for fluid quantities such as velocity and pressure. The uncertainties capture both model form uncertainty as well as epistemic uncertainty induced by the limited training data. An invariant Bayesian deep neural network is used to predict the anisotropic tensor component of the Reynolds stress. This model is trained using the Stein variational gradient descent algorithm. The computed uncertainty on the Reynolds stress is propagated to the quantities of interest by vanilla Monte Carlo simulation. Results are presented for two test cases that differ geometrically from the training flows at several different Reynolds numbers. The prediction enhancement of the data-driven model is discussed, as well as the associated probabilistic bounds for flow properties of interest. Ultimately this framework allows for a quantitative measurement of model confidence and uncertainty quantification for flows in which no high-fidelity observations or prior knowledge is available.
Tasks
Published 2018-07-08
URL http://arxiv.org/abs/1807.02901v3
PDF http://arxiv.org/pdf/1807.02901v3.pdf
PWC https://paperswithcode.com/paper/quantifying-model-form-uncertainty-in
Repo https://github.com/cics-nd/rans-uncertainty
Framework pytorch
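A single Stein variational gradient descent update for a toy 1-D problem with an RBF kernel illustrates the algorithm used to train the Bayesian network. This is a sketch under simplified assumptions; the paper applies SVGD to the full network parameters.

```python
import numpy as np

def svgd_step(particles, grad_logp, bandwidth=1.0, step=0.1):
    """One SVGD update for 1-D particles with an RBF kernel.

    particles: (n,) array of parameter samples
    grad_logp: callable x -> d/dx log p(x) of the target posterior
    Each particle moves along a kernel-weighted average of the
    log-density gradients (attraction) plus a kernel-gradient
    term (repulsion, which keeps particles spread out).
    """
    diffs = particles[:, None] - particles[None, :]       # pairwise x_i - x_j
    k = np.exp(-diffs ** 2 / (2 * bandwidth ** 2))        # RBF kernel matrix
    repulsion = (diffs / bandwidth ** 2 * k).sum(axis=1)  # sum_j d k/d x_j
    phi = (k @ grad_logp(particles) + repulsion) / len(particles)
    return particles + step * phi
```

Iterating this update drives the particle set toward samples from the posterior, which can then be propagated through the flow solver by plain Monte Carlo to obtain the probabilistic bounds described above.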

Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking

Title Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
Authors Haichuan Yang, Yuhao Zhu, Ji Liu
Abstract Deep Neural Networks (DNNs) are increasingly deployed in highly energy-constrained environments such as autonomous drones and wearable devices, while at the same time having to operate in real-time. Therefore, reducing energy consumption has become a major design consideration in DNN training. This paper proposes the first end-to-end DNN training framework that provides quantitative energy consumption guarantees via weighted sparse projection and input masking. The key idea is to formulate DNN training as an optimization problem in which the energy budget imposes a previously unconsidered optimization constraint. We integrate quantitative DNN energy estimation into the DNN training process to assist the constrained optimization. We prove that an approximate algorithm can be used to efficiently solve the optimization problem. Compared to the best prior energy-saving methods, our framework trains DNNs that achieve higher accuracies under the same or lower energy budgets. Code is publicly available.
Tasks
Published 2018-06-12
URL https://arxiv.org/abs/1806.04321v3
PDF https://arxiv.org/pdf/1806.04321v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-learning-of-energy-constrained
Repo https://github.com/hyang1990/model_based_energy_constrained_compression
Framework pytorch
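The projection step can be approximated greedily: keep the weights with the best magnitude-per-energy ratio until the energy budget is spent, and zero out the rest. This is a hypothetical simplification of the paper's weighted sparse projection, for intuition only; the per-weight costs and budget are illustrative.

```python
def energy_constrained_prune(weights, costs, budget):
    """Greedy approximation of a weighted sparse projection: retain
    weights in decreasing order of squared magnitude per unit energy
    cost, stopping once the summed cost of kept weights would exceed
    the energy budget. Pruned weights are set to zero.
    """
    order = sorted(range(len(weights)),
                   key=lambda i: weights[i] ** 2 / costs[i], reverse=True)
    kept, spent = set(), 0.0
    for i in order:
        if spent + costs[i] <= budget:
            kept.add(i)
            spent += costs[i]
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]
```

Alternating such a projection with ordinary gradient steps yields a projected-gradient-style training loop whose iterates always satisfy the energy constraint.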

Neural Joint Source-Channel Coding

Title Neural Joint Source-Channel Coding
Authors Kristy Choi, Kedar Tatwawadi, Aditya Grover, Tsachy Weissman, Stefano Ermon
Abstract For reliable transmission across a noisy communication channel, classical results from information theory show that it is asymptotically optimal to separate out the source and channel coding processes. However, this decomposition can fall short in the finite bit-length regime, as it requires non-trivial tuning of hand-crafted codes and assumes infinite computational power for decoding. In this work, we propose to jointly learn the encoding and decoding processes using a new discrete variational autoencoder model. By adding noise into the latent codes to simulate the channel during training, we learn to both compress and error-correct given a fixed bit-length and computational budget. We obtain codes that are not only competitive against several separation schemes, but also learn useful robust representations of the data for downstream tasks such as classification. Finally, inference amortization yields an extremely fast neural decoder, almost an order of magnitude faster compared to standard decoding methods based on iterative belief propagation.
Tasks
Published 2018-11-19
URL https://arxiv.org/abs/1811.07557v3
PDF https://arxiv.org/pdf/1811.07557v3.pdf
PWC https://paperswithcode.com/paper/neural-joint-source-channel-coding
Repo https://github.com/ermongroup/necst
Framework tf
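The channel-noise injection used during training can be sketched as a binary symmetric channel over the latent bits — an illustrative stand-in, since the paper's exact noise model and code lengths may differ.

```python
import random

def binary_symmetric_channel(bits, flip_prob, rng=None):
    """Simulate a noisy channel by independently flipping each latent
    bit with probability flip_prob. Training the decoder on such
    corrupted codes forces the autoencoder to learn redundancy, i.e.
    to jointly compress and error-correct.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    return [b ^ (rng.random() < flip_prob) for b in bits]
```

During training the encoder output passes through this channel before decoding, so the reconstruction loss directly penalizes codes that are fragile to bit flips.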

Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks

Title Deep Image Demosaicking using a Cascade of Convolutional Residual Denoising Networks
Authors Filippos Kokkinos, Stamatios Lefkimmiatis
Abstract Demosaicking and denoising are among the most crucial steps of modern digital camera pipelines, and their joint treatment is a highly ill-posed inverse problem where at least two-thirds of the information is missing and the rest is corrupted by noise. This poses a great challenge in obtaining meaningful reconstructions, and special care is required for the efficient treatment of the problem. While several machine learning approaches have recently been introduced to deal with joint image demosaicking-denoising, in this work we propose a novel deep learning architecture which is inspired by powerful classical image regularization methods and large-scale convex optimization techniques. Consequently, our derived network is more transparent and has a clearer interpretation compared to alternative competitive deep learning approaches. Our extensive experiments demonstrate that our network outperforms previous approaches on both noisy and noise-free data. This improvement in reconstruction quality is attributed to the principled way we design our network architecture, which also requires fewer trainable parameters than the current state-of-the-art deep network solution. Finally, we show that our network has the ability to generalize well even when it is trained on small datasets, while keeping the overall number of trainable parameters low.
Tasks Demosaicking, Denoising
Published 2018-03-14
URL http://arxiv.org/abs/1803.05215v4
PDF http://arxiv.org/pdf/1803.05215v4.pdf
PWC https://paperswithcode.com/paper/deep-image-demosaicking-using-a-cascade-of
Repo https://github.com/cig-skoltech/deep_demosaick
Framework pytorch
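The "two-thirds of the information is missing" claim follows directly from the Bayer pattern: each sensor pixel records only one of the three colour channels. A small sketch of RGGB sampling masks makes this concrete (pattern layout assumed RGGB; other layouts differ only in the offsets):

```python
import numpy as np

def bayer_masks(h, w):
    """Binary sampling masks for an RGGB Bayer pattern. Exactly one
    mask is 1 at each pixel, so 2/3 of the full RGB values are never
    observed and must be reconstructed by demosaicking.
    """
    r = np.zeros((h, w)); g = np.zeros((h, w)); b = np.zeros((h, w))
    r[0::2, 0::2] = 1  # red: even rows, even columns
    g[0::2, 1::2] = 1  # green: even rows, odd columns ...
    g[1::2, 0::2] = 1  # ... and odd rows, even columns
    b[1::2, 1::2] = 1  # blue: odd rows, odd columns
    return r, g, b
```

Multiplying a ground-truth RGB image by these masks yields the mosaicked observation that the cascade of residual denoising networks learns to invert.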

PTB-TIR: A Thermal Infrared Pedestrian Tracking Benchmark

Title PTB-TIR: A Thermal Infrared Pedestrian Tracking Benchmark
Authors Qiao Liu, Zhenyu He, Xin Li, Yuan Zheng
Abstract Thermal infrared (TIR) pedestrian tracking is an important component of numerous computer vision applications, with a major advantage: it can track pedestrians in total darkness. The ability to evaluate TIR pedestrian trackers fairly, on a benchmark dataset, is significant for the development of this field. However, no such benchmark dataset exists. In this paper, we develop a TIR pedestrian tracking dataset for TIR pedestrian tracker evaluation. The dataset includes 60 thermal sequences with manual annotations. Each sequence has nine attribute labels for attribute-based evaluation. In addition to the dataset, we carry out large-scale evaluation experiments on our benchmark using nine publicly available trackers. The experimental results help us understand the strengths and weaknesses of these trackers. In addition, in order to gain more insight into TIR pedestrian trackers, we divide their functions into three components: feature extractor, motion model, and observation model. We then conduct three comparison experiments on our benchmark dataset to validate how each component affects the tracker’s performance. The findings of these experiments provide some guidelines for future research. The dataset and evaluation toolkit can be downloaded at {https://github.com/QiaoLiuHit/PTB-TIR_Evaluation_toolkit}.
Tasks Thermal Infrared Object Tracking
Published 2018-01-18
URL https://arxiv.org/abs/1801.05944v3
PDF https://arxiv.org/pdf/1801.05944v3.pdf
PWC https://paperswithcode.com/paper/ptb-tir-a-thermal-infrared-pedestrian
Repo https://github.com/QiaoLiuHit/PTB-TIR_Evaluation_toolkit
Framework none

The Unusual Effectiveness of Averaging in GAN Training

Title The Unusual Effectiveness of Averaging in GAN Training
Authors Yasin Yazıcı, Chuan-Sheng Foo, Stefan Winkler, Kim-Hui Yap, Georgios Piliouras, Vijay Chandrasekhar
Abstract We examine two different techniques for parameter averaging in GAN training. Moving Average (MA) computes the time-average of parameters, whereas Exponential Moving Average (EMA) computes an exponentially discounted sum. Whilst MA is known to lead to convergence in bilinear settings, we provide the – to our knowledge – first theoretical arguments in support of EMA. We show that EMA converges to limit cycles around the equilibrium with vanishing amplitude as the discount parameter approaches one for simple bilinear games and also enhances the stability of general GAN training. We establish experimentally that both techniques are strikingly effective in the non-convex-concave GAN setting as well. Both improve inception and FID scores on different architectures and for different GAN objectives. We provide comprehensive experimental results across a range of datasets – mixture of Gaussians, CIFAR-10, STL-10, CelebA and ImageNet – to demonstrate its effectiveness. We achieve state-of-the-art results on CIFAR-10 and produce clean CelebA face images.\footnote{~The code is available at \url{https://github.com/yasinyazici/EMA_GAN}}
Tasks
Published 2018-06-12
URL http://arxiv.org/abs/1806.04498v2
PDF http://arxiv.org/pdf/1806.04498v2.pdf
PWC https://paperswithcode.com/paper/the-unusual-effectiveness-of-averaging-in-gan
Repo https://github.com/yasinyazici/EMA_GAN
Framework none
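The two averaging schemes the paper compares are simple to state. A minimal sketch over flat parameter lists (the beta value and shapes are illustrative):

```python
def ema_update(avg_params, params, beta=0.999):
    """Exponential moving average of generator parameters:
    avg <- beta * avg + (1 - beta) * current, i.e. an exponentially
    discounted sum of past iterates."""
    return [beta * a + (1.0 - beta) * p for a, p in zip(avg_params, params)]

def ma_update(avg_params, params, step):
    """Uniform moving average (time-average of all iterates so far):
    avg <- avg + (current - avg) / step, for step = 1, 2, 3, ..."""
    return [a + (p - a) / step for a, p in zip(avg_params, params)]
```

The averaged copy is used only for evaluation and sampling; the optimizer keeps updating the raw parameters, so averaging adds essentially no training cost.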

Learning-based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems

Title Learning-based Application-Agnostic 3D NoC Design for Heterogeneous Manycore Systems
Authors Biresh Kumar Joardar, Ryan Gary Kim, Janardhan Rao Doppa, Partha Pratim Pande, Diana Marculescu, Radu Marculescu
Abstract The rising use of deep learning and other big-data algorithms has led to an increasing demand for hardware platforms that are computationally powerful, yet energy-efficient. Due to the amount of data parallelism in these algorithms, high-performance 3D manycore platforms that incorporate both CPUs and GPUs present a promising direction. However, as systems use heterogeneity (e.g., a combination of CPUs, GPUs, and accelerators) to improve performance and efficiency, it becomes more pertinent to address the distinct and likely conflicting communication requirements (e.g., CPU memory access latency or GPU network throughput) that arise from such heterogeneity. Unfortunately, it is difficult to quickly explore the hardware design space and choose appropriate tradeoffs between these heterogeneous requirements. To address these challenges, we propose the design of a 3D Network-on-Chip (NoC) for heterogeneous manycore platforms that considers the appropriate design objectives for a 3D heterogeneous system and explores various tradeoffs using an efficient ML-based multi-objective optimization technique. The proposed design space exploration considers the various requirements of its heterogeneous components and generates a set of 3D NoC architectures that efficiently trades off these design objectives. Our findings show that by jointly considering these requirements (latency, throughput, temperature, and energy), we can achieve 9.6% better Energy-Delay Product on average at nearly iso-temperature conditions when compared to a thermally-optimized design for 3D heterogeneous NoCs. More importantly, our results suggest that our 3D NoCs optimized for a few applications can be generalized for unknown applications as well. Our results show that these generalized 3D NoCs only incur a 1.8% (36-tile system) and 1.1% (64-tile system) average performance loss compared to application-specific NoCs.
Tasks
Published 2018-10-20
URL https://arxiv.org/abs/1810.08869v2
PDF https://arxiv.org/pdf/1810.08869v2.pdf
PWC https://paperswithcode.com/paper/learning-based-application-agnostic-3d-noc
Repo https://github.com/CSU-rgkim/TC_2018_code
Framework none

Smoothed Dilated Convolutions for Improved Dense Prediction

Title Smoothed Dilated Convolutions for Improved Dense Prediction
Authors Zhengyang Wang, Shuiwang Ji
Abstract Dilated convolutions, also known as atrous convolutions, have been widely explored in deep convolutional neural networks (DCNNs) for various dense prediction tasks. However, dilated convolutions suffer from gridding artifacts, which hamper performance. In this work, we propose two simple yet effective degridding methods by studying a decomposition of dilated convolutions. Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself. In addition, we point out that the two degridding approaches are intrinsically related, and define separable and shared (SS) operations, which generalize the proposed methods. We further explore SS operations in view of operations on graphs and propose the SS output layer, which is able to smooth the entire DCNN by replacing only the output layer. We evaluate our degridding methods and the SS output layer thoroughly, and visualize the smoothing effect through effective receptive field analysis. Results show that our degridding methods yield consistent improvements on dense prediction tasks, while adding negligible amounts of extra training parameters. The SS output layer improves performance significantly and is very efficient in terms of the number of training parameters.
Tasks Audio Generation, Machine Translation, Object Detection, Semantic Segmentation
Published 2018-08-27
URL http://arxiv.org/abs/1808.08931v2
PDF http://arxiv.org/pdf/1808.08931v2.pdf
PWC https://paperswithcode.com/paper/smoothed-dilated-convolutions-for-improved
Repo https://github.com/divelab/dilated
Framework tf
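A 1-D sketch of the degridding idea: smoothing the input with a small shared averaging filter before the dilated convolution, so neighbouring outputs no longer depend on fully disjoint periodic subsets of the input. This is an illustrative simplification of the paper's separable-and-shared operations, not their exact formulation.

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """Valid 1-D dilated convolution (correlation) with dilation `rate`.
    With rate > 1, each output only sees inputs spaced `rate` apart,
    which is the source of the gridding artifacts."""
    k = len(kernel)
    span = (k - 1) * rate
    return np.array([sum(kernel[j] * x[i + j * rate] for j in range(k))
                     for i in range(len(x) - span)])

def smoothed_dilated_conv1d(x, kernel, rate, smooth=3):
    """Degridding sketch: average the input over a small shared window
    first, so each dilated tap mixes information from adjacent
    positions instead of a single periodic subset."""
    pad = smooth // 2
    xp = np.pad(x, pad, mode='edge')
    xs = np.convolve(xp, np.ones(smooth) / smooth, mode='valid')
    return dilated_conv1d(xs, kernel, rate)
```

With `smooth=1` the smoothing is a no-op and the plain dilated convolution is recovered, which makes the extra cost of degridding easy to isolate.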

Mean Field Theory of Activation Functions in Deep Neural Networks

Title Mean Field Theory of Activation Functions in Deep Neural Networks
Authors Mirco Milletarí, Thiparat Chotibut, Paolo E. Trevisanutto
Abstract We present a Statistical Mechanics (SM) model of deep neural networks, connecting the energy-based and the feed-forward network (FFN) approaches. We infer that FFNs can be understood as performing three basic steps: encoding, representation validation and propagation. From the mean-field solution of the model, we obtain a set of natural activations – such as Sigmoid, $\tanh$ and ReLU – together with the state-of-the-art Swish; this represents the expected information propagating through the network and tends to ReLU in the limit of zero noise. We study the spectrum of the Hessian on an associated classification task, showing that Swish allows for more consistent performance over a wider range of network architectures.
Tasks
Published 2018-05-22
URL https://arxiv.org/abs/1805.08786v2
PDF https://arxiv.org/pdf/1805.08786v2.pdf
PWC https://paperswithcode.com/paper/expectation-propagation-a-probabilistic-view
Repo https://github.com/WessZumino/meanfield-theory-of-activation-functions
Framework tf
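The Swish activation mentioned in the abstract, together with its ReLU limit, takes only a few lines (here `beta` plays the role of the inverse noise scale; `beta = 1` recovers standard Swish, and large `beta` approaches ReLU):

```python
import math

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x), written as
    x / (1 + exp(-beta * x)). As beta grows, the sigmoid gate
    sharpens toward a step function and swish approaches ReLU."""
    return x / (1.0 + math.exp(-beta * x))
```

Unlike ReLU, Swish is smooth and non-monotonic near zero, which the paper connects to more consistent Hessian spectra across architectures.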

Learning to Learn from Web Data through Deep Semantic Embeddings

Title Learning to Learn from Web Data through Deep Semantic Embeddings
Authors Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas
Abstract In this paper we propose to learn a multimodal image and text embedding from Web and Social Media data, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the pipeline can learn from images with associated text without supervision, and perform a thorough analysis of five different text embeddings on three different benchmarks. We show that the embeddings learnt from Web and Social Media data are competitive with supervised methods in the text-based image retrieval task, and we clearly outperform the state of the art on the MIRFlickr dataset when training on the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed of Instagram images and their associated texts, that can be used for fair comparison of image-text embeddings.
Tasks Image Retrieval
Published 2018-08-20
URL http://arxiv.org/abs/1808.06368v1
PDF http://arxiv.org/pdf/1808.06368v1.pdf
PWC https://paperswithcode.com/paper/learning-to-learn-from-web-data-through-deep
Repo https://github.com/gombru/LearnFromWebData
Framework none

Guiding Policies with Language via Meta-Learning

Title Guiding Policies with Language via Meta-Learning
Authors John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine
Abstract Behavioral skills or policies for autonomous agents are conventionally learned from reward functions, via reinforcement learning, or from demonstrations, via imitation learning. However, both modes of task specification have their disadvantages: reward functions require manual engineering, while demonstrations require a human expert to be able to actually perform the task in order to generate the demonstration. Instruction following from natural language instructions provides an appealing alternative: in the same way that we can specify goals to other humans simply by speaking or writing, we would like to be able to specify tasks for our machines. However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task. In this work, we propose an interactive formulation of the task specification problem, where iterative language corrections are provided to an autonomous agent, guiding it in acquiring the desired skill. Our proposed language-guided policy learning algorithm can integrate an instruction and a sequence of corrections to acquire new skills very quickly. In our experiments, we show that this method can enable a policy to follow instructions and corrections for simulated navigation and manipulation tasks, substantially outperforming direct, non-interactive instruction following.
Tasks Imitation Learning, Meta-Learning
Published 2018-11-19
URL http://arxiv.org/abs/1811.07882v2
PDF http://arxiv.org/pdf/1811.07882v2.pdf
PWC https://paperswithcode.com/paper/guiding-policies-with-language-via-meta
Repo https://github.com/maximecb/gym-minigrid
Framework pytorch