October 20, 2019

3973 words 19 mins read

Paper Group AWR 344

DetNet: A Backbone network for Object Detection. Deep Bayesian Self-Training. Double Path Networks for Sequence to Sequence Learning. Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering. Real-world Anomaly Detection in Surveillance Videos. BRUNO: A Deep Recurrent Model for Exchangeable Data. Generating Informative and …

DetNet: A Backbone network for Object Detection

Title DetNet: A Backbone network for Object Detection
Authors Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
Abstract Recent CNN-based object detectors, whether one-stage methods like YOLO, SSD, and RetinaNet or two-stage detectors like Faster R-CNN, R-FCN and FPN, usually try to directly fine-tune from ImageNet pre-trained models designed for image classification. There has been little work discussing the backbone feature extractor specifically designed for object detection. More importantly, there are several differences between the tasks of image classification and object detection. 1. Recent object detectors like FPN and RetinaNet usually involve extra stages, compared with the image classification task, to handle objects of various scales. 2. Object detection not only needs to recognize the category of the object instances but also spatially locate their positions. A large downsampling factor brings a large valid receptive field, which is good for image classification but compromises object localization ability. Due to this gap between image classification and object detection, we propose DetNet in this paper, a novel backbone network specifically designed for object detection. DetNet includes the extra stages needed by detection, unlike traditional backbone networks for image classification, while maintaining high spatial resolution in deeper layers. Without any bells and whistles, state-of-the-art results have been obtained for both object detection and instance segmentation on the MSCOCO benchmark based on our DetNet (4.8G FLOPs) backbone. The code will be released for reproduction.
Tasks Image Classification, Instance Segmentation, Object Detection, Semantic Segmentation
Published 2018-04-17
URL http://arxiv.org/abs/1804.06215v2
PDF http://arxiv.org/pdf/1804.06215v2.pdf
PWC https://paperswithcode.com/paper/detnet-a-backbone-network-for-object
Repo https://github.com/becauseofAI/DetNet-Keras
Framework none
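
The core design trick in DetNet is to trade stride for dilation in the extra, deeper stages, so the backbone keeps a fine feature map while still growing its receptive field. Below is a minimal PyTorch sketch of such a dilated residual bottleneck; the module name, channel sizes, and exact layout are illustrative assumptions, not the released DetNet code.

```python
import torch
import torch.nn as nn

class DilatedBottleneck(nn.Module):
    """Residual bottleneck that uses dilation instead of stride, so the
    feature map keeps its spatial resolution in the extra, deeper stages."""
    def __init__(self, channels=256, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            # 3x3 conv with dilation enlarges the receptive field
            # without downsampling.
            nn.Conv2d(channels // 4, channels // 4, 3, padding=dilation,
                      dilation=dilation, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))

x = torch.randn(1, 256, 32, 32)
block = DilatedBottleneck()
print(block(x).shape)  # torch.Size([1, 256, 32, 32]) -- resolution preserved
```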

Deep Bayesian Self-Training

Title Deep Bayesian Self-Training
Authors Fabio De Sousa Ribeiro, Francesco Caliva, Mark Swainson, Kjartan Gudmundsson, Georgios Leontidis, Stefanos Kollias
Abstract Supervised Deep Learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of annotated data is often a challenge. In most real-world problems, manual annotation is practically intractable due to time/labour constraints, so the development of automated and adaptive data annotation systems is highly sought after. In this paper, we propose both (i) a Deep Bayesian Self-Training methodology for automatic data annotation, which leverages predictive uncertainty estimates obtained via variational inference in modern Neural Network architectures, and (ii) a practical adaptation procedure for handling high label variability between different dataset distributions through clustering of Neural Network latent variable representations. An experimental study on both public and private datasets is presented, illustrating the superior performance of the proposed approach over standard Self-Training baselines and highlighting the importance of predictive uncertainty estimates in safety-critical domains.
Tasks
Published 2018-11-26
URL https://arxiv.org/abs/1812.01681v3
PDF https://arxiv.org/pdf/1812.01681v3.pdf
PWC https://paperswithcode.com/paper/deep-bayesian-self-training
Repo https://github.com/fabio-deep/Deep-Bayesian-Self-Training
Framework tf
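
One common way to realize uncertainty-aware self-training is Monte Carlo dropout: keep dropout active at inference, average several stochastic forward passes, and only pseudo-label the samples whose predictive entropy is low. The sketch below follows that recipe under stated assumptions (the toy model, the number of passes, and the 0.5 entropy threshold are all placeholders); the paper's full pipeline additionally clusters latent representations to adapt across dataset distributions.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, passes=20):
    """Average softmax over stochastic forward passes (dropout stays on)."""
    model.train()  # keeps dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(passes)])
    mean = probs.mean(dim=0)
    entropy = -(mean * mean.clamp_min(1e-12).log()).sum(dim=-1)
    return mean, entropy

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Dropout(0.5),
                      nn.Linear(64, 10))
unlabeled = torch.randn(128, 16)
mean, entropy = mc_dropout_predict(model, unlabeled)
confident = entropy < 0.5  # hypothetical threshold
pseudo_labels = mean.argmax(dim=-1)[confident]
print(f"pseudo-labeled {confident.sum().item()} of {len(unlabeled)} samples")
```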

Double Path Networks for Sequence to Sequence Learning

Title Double Path Networks for Sequence to Sequence Learning
Authors Kaitao Song, Xu Tan, Di He, Jianfeng Lu, Tao Qin, Tie-Yan Liu
Abstract Encoder-decoder based Sequence to Sequence learning (S2S) has made remarkable progress in recent years. Different network architectures have been used in the encoder/decoder; among them, Convolutional Neural Networks (CNN) and Self Attention Networks (SAN) are the prominent ones. The two architectures achieve similar performance but encode and decode context in very different ways: CNNs use convolutional layers to focus on the local connectivity of the sequence, while SANs use self-attention layers to focus on global semantics. In this work we propose Double Path Networks for Sequence to Sequence learning (DPN-S2S), which leverage the advantages of both models through double path information fusion. During the encoding step, we develop a double path architecture that maintains the information coming from the two paths with convolutional layers and self-attention layers separately. To effectively use the encoded context, we develop a cross attention module with gating and use it to automatically pick up the information needed during the decoding step. By deeply integrating the two paths with cross attention, both types of information are combined and well exploited. Experiments show that our proposed method significantly improves the performance of sequence to sequence learning over state-of-the-art systems.
Tasks
Published 2018-06-13
URL http://arxiv.org/abs/1806.04856v2
PDF http://arxiv.org/pdf/1806.04856v2.pdf
PWC https://paperswithcode.com/paper/double-path-networks-for-sequence-to-sequence
Repo https://github.com/StillKeepTry/Transformer-PyTorch
Framework pytorch
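
The abstract describes a cross-attention module with gating that picks up information from the two encoder paths during decoding. Below is a minimal PyTorch sketch of one plausible realization; using standard multi-head attention and a sigmoid gate over concatenated context vectors is an assumption, not the authors' exact module.

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Attend over two encoder memories (a CNN path and a self-attention
    path) and mix the two context vectors with a learned sigmoid gate."""
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.attn_cnn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_san = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, query, mem_cnn, mem_san):
        ctx_c, _ = self.attn_cnn(query, mem_cnn, mem_cnn)
        ctx_s, _ = self.attn_san(query, mem_san, mem_san)
        g = torch.sigmoid(self.gate(torch.cat([ctx_c, ctx_s], dim=-1)))
        return g * ctx_c + (1 - g) * ctx_s

dec = torch.randn(2, 7, 512)   # decoder states (batch, tgt_len, dim)
mc = torch.randn(2, 11, 512)   # conv-path encoder memory
ms = torch.randn(2, 11, 512)   # self-attention-path encoder memory
print(GatedCrossAttention()(dec, mc, ms).shape)  # torch.Size([2, 7, 512])
```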

Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering

Title Submodular Hypergraphs: p-Laplacians, Cheeger Inequalities and Spectral Clustering
Authors Pan Li, Olgica Milenkovic
Abstract We introduce submodular hypergraphs, a family of hypergraphs that have different submodular weights associated with different cuts of hyperedges. Submodular hypergraphs arise in clustering applications in which higher-order structures carry relevant information. For such hypergraphs, we define the notion of p-Laplacians and derive corresponding nodal domain theorems and k-way Cheeger inequalities. We conclude with the description of algorithms for computing the spectra of 1- and 2-Laplacians that constitute the basis of new spectral hypergraph clustering methods.
Tasks
Published 2018-03-10
URL http://arxiv.org/abs/1803.03833v4
PDF http://arxiv.org/pdf/1803.03833v4.pdf
PWC https://paperswithcode.com/paper/submodular-hypergraphs-p-laplacians-cheeger
Repo https://github.com/lipan00123/IPM-for-submodular-hypergraphs
Framework none
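
For intuition, the sketch below runs classical hypergraph spectral clustering with the standard (unit cut weight) normalized 2-Laplacian. The paper's contribution is precisely the generalization of this construction to submodular cut weights and to p != 2, which this simplified baseline does not implement.

```python
import numpy as np

def hypergraph_laplacian(H, w):
    """Normalized hypergraph 2-Laplacian (Zhou et al. clique-style form).
    H: |V| x |E| incidence matrix, w: hyperedge weights."""
    dv = H @ w                        # vertex degrees
    de = H.sum(axis=0)                # hyperedge sizes
    Dv_isqrt = np.diag(1.0 / np.sqrt(dv))
    A = H @ np.diag(w / de) @ H.T     # weighted clique-style expansion
    return np.eye(len(dv)) - Dv_isqrt @ A @ Dv_isqrt

# Two hyperedges covering {0,1,2} and {2,3,4}; vertex 2 bridges them.
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
L = hypergraph_laplacian(H, w=np.ones(2))
vals, vecs = np.linalg.eigh(L)
# The sign pattern of the second eigenvector gives a 2-way partition.
print(np.sign(vecs[:, 1]).astype(int))
```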

Real-world Anomaly Detection in Surveillance Videos

Title Real-world Anomaly Detection in Surveillance Videos
Authors Waqas Sultani, Chen Chen, Mubarak Shah
Abstract Surveillance videos are able to capture a variety of realistic anomalies. In this paper, we propose to learn anomalies by exploiting both normal and anomalous videos. To avoid annotating the anomalous segments or clips in training videos, which is very time consuming, we propose to learn anomalies through a deep multiple instance ranking framework by leveraging weakly labeled training videos, i.e. the training labels (anomalous or normal) are at video level instead of clip level. In our approach, we consider normal and anomalous videos as bags and video segments as instances in multiple instance learning (MIL), and automatically learn a deep anomaly ranking model that predicts high anomaly scores for anomalous video segments. Furthermore, we introduce sparsity and temporal smoothness constraints in the ranking loss function to better localize anomalies during training. We also introduce a new large-scale, first-of-its-kind dataset of 128 hours of video. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies such as fighting, road accidents, burglary, and robbery, as well as normal activities. This dataset can be used for two tasks: first, general anomaly detection, considering all anomalies in one group and all normal activities in another; second, recognizing each of the 13 anomalous activities. Our experimental results show that our MIL method achieves significant improvement in anomaly detection performance compared to the state-of-the-art approaches. We also provide the results of several recent deep learning baselines on anomalous activity recognition. The low recognition performance of these baselines reveals that our dataset is very challenging and opens more opportunities for future work. The dataset is available at: https://webpages.uncc.edu/cchen62/dataset.html
Tasks Activity Recognition, Anomaly Detection, Anomaly Detection In Surveillance Videos, Multiple Instance Learning
Published 2018-01-12
URL http://arxiv.org/abs/1801.04264v3
PDF http://arxiv.org/pdf/1801.04264v3.pdf
PWC https://paperswithcode.com/paper/real-world-anomaly-detection-in-surveillance
Repo https://github.com/WaqasSultani/AnomalyDetectionCVPR2018
Framework none
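
The ranking objective described in the abstract (the highest-scoring segment of an anomalous bag should outrank the highest-scoring segment of a normal bag, plus sparsity and temporal-smoothness terms on the anomalous bag) can be written compactly. A PyTorch sketch is below; the regularization coefficients are illustrative.

```python
import torch

def mil_ranking_loss(scores_anom, scores_norm, lam1=8e-5, lam2=8e-5):
    """MIL ranking loss sketch: hinge between the max-scoring anomalous and
    normal segments, with temporal-smoothness and sparsity terms on the
    anomalous bag (coefficients here are illustrative)."""
    hinge = torch.relu(1.0 - scores_anom.max() + scores_norm.max())
    smooth = ((scores_anom[1:] - scores_anom[:-1]) ** 2).sum()
    sparse = scores_anom.sum()
    return hinge + lam1 * smooth + lam2 * sparse

anom = torch.rand(32, requires_grad=True)  # 32 segment scores, anomalous video
norm = torch.rand(32, requires_grad=True)  # 32 segment scores, normal video
loss = mil_ranking_loss(anom, norm)
loss.backward()
print(float(loss))
```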

BRUNO: A Deep Recurrent Model for Exchangeable Data

Title BRUNO: A Deep Recurrent Model for Exchangeable Data
Authors Iryna Korshunova, Jonas Degrave, Ferenc Huszár, Yarin Gal, Arthur Gretton, Joni Dambre
Abstract We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at the heart of Bayesian inference. The model does not require variational approximations to train, and new samples can be generated conditional on previous samples, with cost linear in the size of the conditioning set. The advantages of our architecture are demonstrated on learning tasks that require generalisation from short observed sequences while modelling sequence variability, such as conditional image generation, few-shot learning, and anomaly detection.
Tasks Anomaly Detection, Bayesian Inference, Conditional Image Generation, Few-Shot Learning, Image Generation
Published 2018-02-21
URL http://arxiv.org/abs/1802.07535v3
PDF http://arxiv.org/pdf/1802.07535v3.pdf
PWC https://paperswithcode.com/paper/bruno-a-deep-recurrent-model-for-exchangeable
Repo https://github.com/IraKorshunova/bruno
Framework tf
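
The property that makes O(1)-per-step conditional sampling possible is the closed-form posterior-predictive update for exchangeable Gaussian variables (equal variance v, equal pairwise covariance c). The numpy sketch below shows that recursion in isolation; BRUNO itself applies a Student-t variant of it in the latent space of an invertible RealNVP-style network, which this sketch omits.

```python
import numpy as np

def predictive_params(xs, mu0=0.0, v=1.0, c=0.5):
    """Posterior-predictive mean/variance of x_{n+1} for exchangeable
    Gaussians: Var(x_i) = v, Cov(x_i, x_j) = c for i != j."""
    n = len(xs)
    if n == 0:
        return mu0, v
    k = c / (v + (n - 1) * c)
    mean = mu0 + n * k * (np.mean(xs) - mu0)  # shrinks toward the sample mean
    var = v - n * k * c                        # shrinks as evidence accumulates
    return mean, var

rng = np.random.default_rng(0)
xs = []
for t in range(5):
    mean, var = predictive_params(xs)
    print(f"step {t}: predictive N({mean:.3f}, {var:.3f})")
    xs.append(rng.normal(0.8, 0.3))  # observe a new exchangeable sample
```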

Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization

Title Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization
Authors Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, Bill Dolan
Abstract Responses generated by neural conversational models tend to lack informativeness and diversity. We present Adversarial Information Maximization (AIM), an adversarial learning strategy that addresses these two related but distinct problems. To foster response diversity, we leverage adversarial training that allows distributional matching of synthetic and real responses. To improve informativeness, our framework explicitly optimizes a variational lower bound on pairwise mutual information between query and response. Empirical results from automatic and human evaluations demonstrate that our methods significantly boost informativeness and diversity.
Tasks Conversational Response Generation
Published 2018-09-16
URL http://arxiv.org/abs/1809.05972v5
PDF http://arxiv.org/pdf/1809.05972v5.pdf
PWC https://paperswithcode.com/paper/generating-informative-and-diverse
Repo https://github.com/microsoft/DialoGPT
Framework pytorch
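
The informativeness term rests on a standard variational lower bound on mutual information (the Barber-Agakov bound): a backward model $q_\phi(x \mid y)$ that reconstructs the query $x$ from the response $y$ bounds $I(X;Y)$ from below,

$$
I(X;Y) = H(X) - H(X \mid Y) \;\ge\; H(X) + \mathbb{E}_{p(x,y)}\big[\log q_\phi(x \mid y)\big],
$$

with a gap of $\mathbb{E}_{y}\!\left[\mathrm{KL}\!\left(p(x \mid y)\,\|\,q_\phi(x \mid y)\right)\right] \ge 0$. Since $H(X)$ does not depend on the generator, maximizing the bound reduces to maximizing the backward log-likelihood of queries given generated responses.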

Metric Learning for Novelty and Anomaly Detection

Title Metric Learning for Novelty and Anomaly Detection
Authors Marc Masana, Idoia Ruiz, Joan Serrat, Joost van de Weijer, Antonio M. Lopez
Abstract When neural networks process images which do not resemble the distribution seen during training, so-called out-of-distribution images, they often make wrong predictions, and do so too confidently. The capability to detect out-of-distribution images is therefore crucial for many real-world applications. We divide out-of-distribution detection into novelty detection (images of classes which are not in the training set but are related to those that are) and anomaly detection (images of classes which are unrelated to the training set). By related we mean they contain the same type of objects, like digits in MNIST and SVHN. Most existing work has focused on anomaly detection, and has addressed this problem considering networks trained with the cross-entropy loss. In contrast, we propose to use metric learning, which avoids a drawback inherent to cross-entropy methods: the softmax layer forces the network to divide its prediction power over the learned classes. We perform extensive experiments and evaluate both novelty and anomaly detection, including in a relevant application such as traffic sign recognition, obtaining comparable or better results than previous works.
Tasks Anomaly Detection, Metric Learning, Out-of-Distribution Detection, Traffic Sign Recognition
Published 2018-08-16
URL http://arxiv.org/abs/1808.05492v1
PDF http://arxiv.org/pdf/1808.05492v1.pdf
PWC https://paperswithcode.com/paper/metric-learning-for-novelty-and-anomaly
Repo https://github.com/mmasana/OoD_Mining
Framework tf
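
Once an embedding is trained with a metric loss, a natural out-of-distribution score is the distance to the nearest training-class center in the learned space. The sketch below shows only that scoring step; the untrained linear embedder is a placeholder, and the metric-learning objective itself is omitted.

```python
import torch
import torch.nn.functional as F

def class_centers(embeddings, labels, num_classes):
    """Mean embedding per training class in the learned metric space."""
    return torch.stack([embeddings[labels == k].mean(dim=0)
                        for k in range(num_classes)])

def ood_score(embedding_net, x, centers):
    """Distance to the nearest class center: small for in-distribution
    inputs, large for novel/anomalous ones."""
    z = F.normalize(embedding_net(x), dim=-1)
    return torch.cdist(z, centers).min(dim=-1).values

# Toy setup: a hypothetical embedding net and random "training" data.
net = torch.nn.Sequential(torch.nn.Linear(32, 16))
train_x, train_y = torch.randn(200, 32), torch.randint(0, 5, (200,))
with torch.no_grad():
    centers = class_centers(F.normalize(net(train_x), dim=-1), train_y, 5)
    scores = ood_score(net, torch.randn(4, 32), centers)
print(scores)
```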

Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space

Title Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space
Authors Berta Bescos, José Neira, Roland Siegwart, Cesar Cadena
Abstract In this paper we present an end-to-end deep learning framework to turn images that show dynamic content, such as vehicles or pedestrians, into realistic static frames. This objective encounters two main challenges: detecting all the dynamic objects, and inpainting the static occluded background with plausible imagery. The detection challenge is addressed by a convolutional network that learns a multi-class semantic segmentation of the image. The inpainting problem is approached with a conditional generative adversarial model that, taking as input the original dynamic image and its dynamic/static binary mask, generates the final static image. These generated images can be used for applications such as augmented reality or vision-based robot localization. To validate our approach, we show both qualitative and quantitative comparisons against other state-of-the-art inpainting methods by removing the dynamic objects and hallucinating the static structure behind them. Furthermore, to demonstrate the potential of our results, we carry out pilot experiments that show the benefits of our proposal for visual place recognition.
Tasks Image Inpainting, Semantic Segmentation, Visual Place Recognition
Published 2018-09-20
URL http://arxiv.org/abs/1809.10239v2
PDF http://arxiv.org/pdf/1809.10239v2.pdf
PWC https://paperswithcode.com/paper/empty-cities-image-inpainting-for-a-dynamic
Repo https://github.com/bertabescos/EmptyCities
Framework pytorch
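
The generator's conditioning is simple to sketch: the dynamic image and its dynamic/static binary mask are concatenated channel-wise and mapped to a static frame. The toy encoder-decoder below illustrates only that interface; the actual model is a much deeper conditional GAN generator trained with adversarial and reconstruction losses.

```python
import torch
import torch.nn as nn

class MaskConditionedGenerator(nn.Module):
    """Toy encoder-decoder: image and dynamic/static binary mask are
    concatenated channel-wise, and the network predicts the static frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 1, 32, 4, stride=2, padding=1), nn.ReLU(True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

img = torch.randn(1, 3, 64, 64)                   # dynamic input frame
mask = (torch.rand(1, 1, 64, 64) > 0.8).float()   # 1 = dynamic pixel
print(MaskConditionedGenerator()(img, mask).shape)  # (1, 3, 64, 64)
```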

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Title Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge
Authors Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko, Arash Nazeri, Marc-Andre Weber, Abhishek Mahajan, Ujjwal Baid, Elizabeth Gerstner, Dongjin Kwon, Gagan Acharya, Manu Agarwal, Mahbubul Alam, Alberto Albiol, Antonio Albiol, Francisco J. Albiol, Varghese Alex, Nigel Allinson, Pedro H. A. Amorim, Abhijit Amrutkar, Ganesh Anand, Simon Andermatt, Tal Arbel, Pablo Arbelaez, Aaron Avery, Muneeza Azmat, Pranjal B., W Bai, Subhashis Banerjee, Bill Barth, Thomas Batchelder, Kayhan Batmanghelich, Enzo Battistella, Andrew Beers, Mikhail Belyaev, Martin Bendszus, Eze Benson, Jose Bernal, Halandur Nagaraja Bharath, George Biros, Sotirios Bisdas, James Brown, Mariano Cabezas, Shilei Cao, Jorge M. Cardoso, Eric N Carver, Adrià Casamitjana, Laura Silvana Castillo, Marcel Catà, Philippe Cattin, Albert Cerigues, Vinicius S. Chagas, Siddhartha Chandra, Yi-Ju Chang, Shiyu Chang, Ken Chang, Joseph Chazalon, Shengcong Chen, Wei Chen, Jefferson W Chen, Zhaolin Chen, Kun Cheng, Ahana Roy Choudhury, Roger Chylla, Albert Clérigues, Steven Colleman, Ramiro German Rodriguez Colmeiro, Marc Combalia, Anthony Costa, Xiaomeng Cui, Zhenzhen Dai, Lutao Dai, Laura Alexandra Daza, Eric Deutsch, Changxing Ding, Chao Dong, Shidu Dong, Wojciech Dudzik, Zach Eaton-Rosen, Gary Egan, Guilherme Escudero, Théo Estienne, Richard Everson, Jonathan Fabrizio, Yong Fan, Longwei Fang, Xue Feng, Enzo Ferrante, Lucas Fidon, Martin Fischer, Andrew P. French, Naomi Fridman, Huan Fu, David Fuentes, Yaozong Gao, Evan Gates, David Gering, Amir Gholami, Willi Gierke, Ben Glocker, Mingming Gong, Sandra González-Villá, T. Grosges, Yuanfang Guan, Sheng Guo, Sudeep Gupta, Woo-Sup Han, Il Song Han, Konstantin Harmuth, Huiguang He, Aura Hernández-Sabaté, Evelyn Herrmann, Naveen Himthani, Winston Hsu, Cheyu Hsu, Xiaojun Hu, Xiaobin Hu, Yan Hu, Yifan Hu, Rui Hua, Teng-Yi Huang, Weilin Huang, Sabine Van Huffel, Quan Huo, Vivek HV, Khan M. Iftekharuddin, Fabian Isensee, Mobarakol Islam, Aaron S. Jackson, Sachin R. Jambawalikar, Andrew Jesson, Weijian Jian, Peter Jin, V Jeya Maria Jose, Alain Jungo, B Kainz, Konstantinos Kamnitsas, Po-Yu Kao, Ayush Karnawat, Thomas Kellermeier, Adel Kermi, Kurt Keutzer, Mohamed Tarek Khadir, Mahendra Khened, Philipp Kickingereder, Geena Kim, Nik King, Haley Knapp, Urspeter Knecht, Lisa Kohli, Deren Kong, Xiangmao Kong, Simon Koppers, Avinash Kori, Ganapathy Krishnamurthi, Egor Krivov, Piyush Kumar, Kaisar Kushibar, Dmitrii Lachinov, Tryphon Lambrou, Joon Lee, Chengen Lee, Yuehchou Lee, M Lee, Szidonia Lefkovits, Laszlo Lefkovits, James Levitt, Tengfei Li, Hongwei Li, Wenqi Li, Hongyang Li, Xiaochuan Li, Yuexiang Li, Heng Li, Zhenye Li, Xiaoyu Li, Zeju Li, XiaoGang Li, Wenqi Li, Zheng-Shen Lin, Fengming Lin, Pietro Lio, Chang Liu, Boqiang Liu, Xiang Liu, Mingyuan Liu, Ju Liu, Luyan Liu, Xavier Llado, Marc Moreno Lopez, Pablo Ribalta Lorenzo, Zhentai Lu, Lin Luo, Zhigang Luo, Jun Ma, Kai Ma, Thomas Mackie, Anant Madabushi, Issam Mahmoudi, Klaus H. Maier-Hein, Pradipta Maji, CP Mammen, Andreas Mang, B. S. 
Manjunath, Michal Marcinkiewicz, S McDonagh, Stephen McKenna, Richard McKinley, Miriam Mehl, Sachin Mehta, Raghav Mehta, Raphael Meier, Christoph Meinel, Dorit Merhof, Craig Meyer, Robert Miller, Sushmita Mitra, Aliasgar Moiyadi, David Molina-Garcia, Miguel A. B. Monteiro, Grzegorz Mrukwa, Andriy Myronenko, Jakub Nalepa, Thuyen Ngo, Dong Nie, Holly Ning, Chen Niu, Nicholas K Nuechterlein, Eric Oermann, Arlindo Oliveira, Diego D. C. Oliveira, Arnau Oliver, Alexander F. I. Osman, Yu-Nian Ou, Sebastien Ourselin, Nikos Paragios, Moo Sung Park, Brad Paschke, J. Gregory Pauloski, Kamlesh Pawar, Nick Pawlowski, Linmin Pei, Suting Peng, Silvio M. Pereira, Julian Perez-Beteta, Victor M. Perez-Garcia, Simon Pezold, Bao Pham, Ashish Phophalia, Gemma Piella, G. N. Pillai, Marie Piraud, Maxim Pisov, Anmol Popli, Michael P. Pound, Reza Pourreza, Prateek Prasanna, Vesna Prkovska, Tony P. Pridmore, Santi Puch, Élodie Puybareau, Buyue Qian, Xu Qiao, Martin Rajchl, Swapnil Rane, Michael Rebsamen, Hongliang Ren, Xuhua Ren, Karthik Revanuru, Mina Rezaei, Oliver Rippel, Luis Carlos Rivera, Charlotte Robert, Bruce Rosen, Daniel Rueckert, Mohammed Safwan, Mostafa Salem, Joaquim Salvi, Irina Sanchez, Irina Sánchez, Heitor M. Santos, Emmett Sartor, Dawid Schellingerhout, Klaudius Scheufele, Matthew R. Scott, Artur A. Scussel, Sara Sedlar, Juan Pablo Serrano-Rubio, N. Jon Shah, Nameetha Shah, Mazhar Shaikh, B. Uma Shankar, Zeina Shboul, Haipeng Shen, Dinggang Shen, Linlin Shen, Haocheng Shen, Varun Shenoy, Feng Shi, Hyung Eun Shin, Hai Shu, Diana Sima, M Sinclair, Orjan Smedby, James M. Snyder, Mohammadreza Soltaninejad, Guidong Song, Mehul Soni, Jean Stawiaski, Shashank Subramanian, Li Sun, Roger Sun, Jiawei Sun, Kay Sun, Yu Sun, Guoxia Sun, Shuang Sun, Yannick R Suter, Laszlo Szilagyi, Sanjay Talbar, Dacheng Tao, Dacheng Tao, Zhongzhao Teng, Siddhesh Thakur, Meenakshi H Thakur, Sameer Tharakan, Pallavi Tiwari, Guillaume Tochon, Tuan Tran, Yuhsiang M. Tsai, Kuan-Lun Tseng, Tran Anh Tuan, Vadim Turlapov, Nicholas Tustison, Maria Vakalopoulou, Sergi Valverde, Rami Vanguri, Evgeny Vasiliev, Jonathan Ventura, Luis Vera, Tom Vercauteren, C. A. Verrastro, Lasitha Vidyaratne, Veronica Vilaplana, Ajeet Vivekanandan, Guotai Wang, Qian Wang, Chiatse J. Wang, Weichung Wang, Duo Wang, Ruixuan Wang, Yuanyuan Wang, Chunliang Wang, Guotai Wang, Ning Wen, Xin Wen, Leon Weninger, Wolfgang Wick, Shaocheng Wu, Qiang Wu, Yihong Wu, Yong Xia, Yanwu Xu, Xiaowen Xu, Peiyuan Xu, Tsai-Ling Yang, Xiaoping Yang, Hao-Yu Yang, Junlin Yang, Haojin Yang, Guang Yang, Hongdou Yao, Xujiong Ye, Changchang Yin, Brett Young-Moxon, Jinhua Yu, Xiangyu Yue, Songtao Zhang, Angela Zhang, Kun Zhang, Xuejie Zhang, Lichi Zhang, Xiaoyue Zhang, Yazhuo Zhang, Lei Zhang, Jianguo Zhang, Xiang Zhang, Tianhao Zhang, Sicheng Zhao, Yu Zhao, Xiaomei Zhao, Liang Zhao, Yefeng Zheng, Liming Zhong, Chenhong Zhou, Xiaobing Zhou, Fan Zhou, Hongtu Zhu, Jin Zhu, Ying Zhuge, Weiwei Zong, Jayashree Kalpathy-Cramer, Keyvan Farahani, Christos Davatzikos, Koen van Leemput, Bjoern Menze
Abstract Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
Tasks Brain Tumor Segmentation
Published 2018-11-05
URL http://arxiv.org/abs/1811.02629v3
PDF http://arxiv.org/pdf/1811.02629v3.pdf
PWC https://paperswithcode.com/paper/identifying-the-best-machine-learning
Repo https://github.com/christophbrgr/brats-orchestra
Framework none

Generative Image Inpainting with Contextual Attention

Title Generative Image Inpainting with Contextual Attention
Authors Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas S. Huang
Abstract Recent deep learning based approaches have shown promising results for the challenging task of inpainting large missing regions in an image. These methods can generate visually plausible image structures and textures, but often create distorted structures or blurry textures inconsistent with surrounding areas. This is mainly due to the ineffectiveness of convolutional neural networks in explicitly borrowing or copying information from distant spatial locations. On the other hand, traditional texture and patch synthesis approaches are particularly suitable when textures need to be borrowed from the surrounding regions. Motivated by these observations, we propose a new deep generative model-based approach which can not only synthesize novel image structures but also explicitly utilize surrounding image features as references during network training to make better predictions. The model is a feed-forward, fully convolutional neural network which can process images with multiple holes at arbitrary locations and with variable sizes during the test time. Experiments on multiple datasets including faces (CelebA, CelebA-HQ), textures (DTD) and natural images (ImageNet, Places2) demonstrate that our proposed approach generates higher-quality inpainting results than existing ones. Code, demo and models are available at: https://github.com/JiahuiYu/generative_inpainting.
Tasks Image Inpainting
Published 2018-01-24
URL http://arxiv.org/abs/1801.07892v2
PDF http://arxiv.org/pdf/1801.07892v2.pdf
PWC https://paperswithcode.com/paper/generative-image-inpainting-with-contextual
Repo https://github.com/ShnitzelKiller/generative_inpainting
Framework tf
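
The contextual attention idea, reconstructing hole features as a similarity-weighted sum of background features, can be sketched at the pixel (rather than 3x3 patch) level. The version below is a simplified stand-in for the paper's patch-based layer, which extracts background patches, matches them by cosine similarity via convolution, and reconstructs with deconvolution.

```python
import torch
import torch.nn.functional as F

def contextual_attention(features, hole_mask, temperature=10.0):
    """Simplified contextual-attention sketch: each hole (foreground)
    feature attends over background features by cosine similarity and is
    reconstructed as their softmax-weighted sum."""
    b, c, h, w = features.shape
    flat = features.view(b, c, h * w)                    # tokens
    mask = hole_mask.view(b, 1, h * w)                   # 1 = hole
    normed = F.normalize(flat, dim=1)
    sim = torch.einsum('bci,bcj->bij', normed, normed)   # cosine similarity
    sim = sim.masked_fill(mask.bool(), float('-inf'))    # keys: background only
    attn = (temperature * sim).softmax(dim=-1)
    recon = torch.einsum('bij,bcj->bci', attn, flat).view(b, c, h, w)
    return torch.where(hole_mask.bool(), recon, features)

feats = torch.randn(1, 64, 16, 16)
hole = torch.zeros(1, 1, 16, 16)
hole[..., 4:12, 4:12] = 1
print(contextual_attention(feats, hole).shape)  # torch.Size([1, 64, 16, 16])
```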

Deep Structured Energy-Based Image Inpainting

Title Deep Structured Energy-Based Image Inpainting
Authors Fazil Altinel, Mete Ozay, Takayuki Okatani
Abstract In this paper, we propose a structured image inpainting method employing an energy-based model. In order to learn the structural relationship between patterns observed in images and the missing regions of those images, we employ an energy-based structured prediction method. The structural relationship is learned by minimizing an energy function which is defined by a simple convolutional neural network. The experimental results on various benchmark datasets show that our proposed method significantly outperforms state-of-the-art methods which use Generative Adversarial Networks (GANs). We obtained 497.35 mean squared error (MSE) on the Olivetti face dataset, compared to 833.0 MSE for the state-of-the-art method. Moreover, we obtained 28.4 dB peak signal-to-noise ratio (PSNR) on the SVHN dataset and 23.53 dB on the CelebA dataset, compared to 22.3 dB and 21.3 dB for the state-of-the-art methods, respectively. The code is publicly available.
Tasks Image Inpainting, Structured Prediction
Published 2018-01-24
URL http://arxiv.org/abs/1801.07939v2
PDF http://arxiv.org/pdf/1801.07939v2.pdf
PWC https://paperswithcode.com/paper/deep-structured-energy-based-image-inpainting
Repo https://github.com/cvlab-tohoku/DSEBImageInpainting
Framework tf
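
The inference step implied by an energy-based formulation is gradient descent on the missing pixels with the energy network held fixed. A toy sketch with an untrained CNN energy is below; the paper's actual contribution lies in how such an energy network is trained with structured prediction, which this sketch does not cover.

```python
import torch
import torch.nn as nn

# A small CNN assigns a scalar energy to an image; inference fills the
# masked region by gradient descent on the missing pixels only.
energy_net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
for p in energy_net.parameters():  # energy net stays fixed during inference
    p.requires_grad_(False)

image = torch.rand(1, 1, 28, 28)
mask = torch.zeros_like(image)
mask[..., 10:18, 10:18] = 1        # 1 = missing pixel
fill = torch.zeros_like(image, requires_grad=True)
opt = torch.optim.Adam([fill], lr=0.05)

for step in range(50):
    x = image * (1 - mask) + fill * mask  # only masked pixels are free
    loss = energy_net(x).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

final = energy_net(image * (1 - mask) + fill * mask).mean().item()
print(f"final energy: {final:.4f}")
```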

LSTM Benchmarks for Deep Learning Frameworks

Title LSTM Benchmarks for Deep Learning Frameworks
Authors Stefan Braun
Abstract This study provides benchmarks for different implementations of LSTM units between the deep learning frameworks PyTorch, TensorFlow, Lasagne and Keras. The comparison includes cuDNN LSTMs, fused LSTM variants and less optimized, but more flexible LSTM implementations. The benchmarks reflect two typical scenarios for automatic speech recognition, notably continuous speech recognition and isolated digit recognition. These scenarios cover input sequences of fixed and variable length as well as the loss functions CTC and cross entropy. Additionally, a comparison between four different PyTorch versions is included. The code is available online https://github.com/stefbraun/rnn_benchmarks.
Tasks Speech Recognition
Published 2018-06-05
URL http://arxiv.org/abs/1806.01818v1
PDF http://arxiv.org/pdf/1806.01818v1.pdf
PWC https://paperswithcode.com/paper/lstm-benchmarks-for-deep-learning-frameworks
Repo https://github.com/stefbraun/rnn_benchmarks
Framework tf
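
A minimal version of such a benchmark is a warmed-up, CUDA-synchronized timing loop around the forward pass. The harness below is a simplified sketch with illustrative layer sizes; the actual repository also measures the backward pass, variable-length batches, and the CTC and cross-entropy losses.

```python
import time
import torch

def benchmark_lstm(lstm, x, warmup=3, iters=20):
    """Median wall-clock time of one forward pass, with CUDA
    synchronization when a GPU is in use."""
    for _ in range(warmup):
        lstm(x)
    times = []
    for _ in range(iters):
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        t0 = time.perf_counter()
        lstm(x)
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

x = torch.randn(100, 32, 123)   # (seq_len, batch, features), illustrative
lstm = torch.nn.LSTM(123, 320)  # cuDNN-backed when run on a GPU
print(f"median forward: {1e3 * benchmark_lstm(lstm, x):.2f} ms")
```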

Learning a Disentangled Embedding for Monocular 3D Shape Retrieval and Pose Estimation

Title Learning a Disentangled Embedding for Monocular 3D Shape Retrieval and Pose Estimation
Authors Kyaw Zaw Lin, Weipeng Xu, Qianru Sun, Christian Theobalt, Tat-Seng Chua
Abstract We propose a novel approach to jointly perform 3D shape retrieval and pose estimation from monocular images. In order to make the method robust to real-world image variations, e.g. complex textures and backgrounds, we learn an embedding space from 3D data that only includes the relevant information, namely the shape and pose. Our approach explicitly disentangles a shape vector and a pose vector, which alleviates both pose bias for 3D shape retrieval and categorical bias for pose estimation. We train a CNN to map images into this embedding space, then retrieve the closest 3D shape from the database and estimate the 6D pose of the object. Our method achieves a median error of 10.3 for pose estimation and 0.592 top-1 accuracy for category-agnostic 3D object retrieval on the Pascal3D+ dataset, outperforming the previous state-of-the-art methods on both tasks.
Tasks 3D Object Retrieval, 3D Shape Retrieval, Pose Estimation
Published 2018-12-24
URL http://arxiv.org/abs/1812.09899v2
PDF http://arxiv.org/pdf/1812.09899v2.pdf
PWC https://paperswithcode.com/paper/learning-a-disentangled-embedding-for
Repo https://github.com/zawlin/distangled_pose_shape
Framework none
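
The retrieval interface follows directly from the disentanglement: the network emits a concatenated [shape | pose] code, retrieval runs nearest-neighbor search on the shape part only, and the pose part is read out separately. The toy sketch below assumes placeholder dimensions and an untrained backbone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledEmbedder(nn.Module):
    """Toy image embedder whose output splits into a shape part and a
    pose part; sizes and backbone are illustrative placeholders."""
    def __init__(self, shape_dim=64, pose_dim=4):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 32 * 32, 128),
            nn.ReLU(), nn.Linear(128, shape_dim + pose_dim))
        self.shape_dim = shape_dim

    def forward(self, img):
        z = self.backbone(img)
        return z[:, :self.shape_dim], z[:, self.shape_dim:]  # (shape, pose)

net = DisentangledEmbedder()
db_shapes = F.normalize(torch.randn(1000, 64), dim=-1)  # stand-in 3D-shape codes
shape_vec, pose_vec = net(torch.randn(1, 3, 32, 32))
# Retrieval uses only the shape vector; the pose vector is decoded separately.
nearest = (F.normalize(shape_vec, dim=-1) @ db_shapes.T).argmax(dim=-1)
print(nearest.item(), pose_vec.shape)
```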

Unsupervised Semantic Frame Induction using Triclustering

Title Unsupervised Semantic Frame Induction using Triclustering
Authors Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann, Simone Paolo Ponzetto
Abstract We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction. We cast the frame induction problem as a triclustering problem, a generalization of clustering to triadic data. Our replicable benchmarks demonstrate that the proposed graph-based approach, Triframes, shows state-of-the-art results on a FrameNet-derived dataset and performs on par with competitive methods on a verb class clustering task.
Tasks
Published 2018-05-12
URL http://arxiv.org/abs/1805.04715v2
PDF http://arxiv.org/pdf/1805.04715v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-semantic-frame-induction-using
Repo https://github.com/uhh-lt/triframes
Framework pytorch
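
To make the triclustering framing concrete, the sketch below embeds each subject-verb-object triple as the concatenation of its word vectors and clusters the triples. This is a deliberate simplification: Triframes actually builds a k-nearest-neighbor graph over such triple embeddings and applies fuzzy graph clustering (Watset) rather than k-means, and uses real pretrained embeddings rather than the random stand-ins here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
vocab = ["company", "acquire", "firm", "student", "read", "book",
         "bank", "buy", "share", "teacher", "assign", "novel"]
emb = {w: rng.normal(size=50) for w in vocab}  # random stand-in word vectors

triples = [("company", "acquire", "firm"), ("bank", "buy", "share"),
           ("student", "read", "book"), ("teacher", "assign", "novel")]
# Each triple becomes the concatenation of its subject/verb/object vectors.
X = np.stack([np.concatenate([emb[s], emb[v], emb[o]]) for s, v, o in triples])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for triple, label in zip(triples, labels):
    print(label, triple)
```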