Paper Group AWR 86
A Survey on Dialogue Systems: Recent Advances and New Frontiers
Title | A Survey on Dialogue Systems: Recent Advances and New Frontiers |
Authors | Hongshen Chen, Xiaorui Liu, Dawei Yin, Jiliang Tang |
Abstract | Dialogue systems have attracted increasing attention. Recent advances in dialogue systems are overwhelmingly contributed by deep learning techniques, which have been employed to enhance a wide range of big data applications such as computer vision, natural language processing, and recommender systems. For dialogue systems, deep learning can leverage a massive amount of data to learn meaningful feature representations and response generation strategies, while requiring a minimum amount of hand-crafting. In this article, we give an overview of these recent advances in dialogue systems from various perspectives and discuss some possible research directions. In particular, we generally divide existing dialogue systems into task-oriented and non-task-oriented models, then detail how deep learning techniques help them with representative algorithms, and finally discuss some appealing research directions that can bring dialogue system research into a new frontier. |
Tasks | Recommendation Systems |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.01731v3 |
http://arxiv.org/pdf/1711.01731v3.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-dialogue-systems-recent-advances |
Repo | https://github.com/ocplease/DialogueSystem |
Framework | none |
Quantifying multivariate redundancy with maximum entropy decompositions of mutual information
Title | Quantifying multivariate redundancy with maximum entropy decompositions of mutual information |
Authors | Daniel Chicharro |
Abstract | Williams and Beer (2010) proposed a nonnegative mutual information decomposition, based on the construction of redundancy lattices, which allows separating the information that a set of variables contains about a target variable into nonnegative components interpretable as the unique information of some variables not provided by others as well as redundant and synergistic components. However, the definition of multivariate measures of redundancy that comply with nonnegativity and conform to certain axioms that capture conceptually desirable properties of redundancy has proven to be elusive. We here present a procedure to determine nonnegative multivariate redundancy measures, within the maximum entropy framework. In particular, we generalize existing bivariate maximum entropy measures of redundancy and unique information, defining measures of the redundant information that a group of variables has about a target, and of the unique redundant information that a group of variables has about a target that is not redundant with information from another group. The two key ingredients for this approach are: First, the identification of a type of constraints on entropy maximization that allows isolating components of redundancy and unique redundancy by mirroring them to synergy components. Second, the construction of rooted tree-based decompositions of the mutual information, which conform to the axioms of the redundancy lattice by the local implementation at each tree node of binary unfoldings of the information using hierarchically related maximum entropy constraints. Altogether, the proposed measures quantify the different multivariate redundancy contributions of a nonnegative mutual information decomposition consistent with the redundancy lattice. |
Tasks | |
Published | 2017-08-13 |
URL | http://arxiv.org/abs/1708.03845v2 |
http://arxiv.org/pdf/1708.03845v2.pdf | |
PWC | https://paperswithcode.com/paper/quantifying-multivariate-redundancy-with |
Repo | https://github.com/Abzinger/Chicharro_PID |
Framework | none |
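For orientation, the bivariate Williams and Beer (2010) decomposition that the paper above generalizes splits mutual information into four nonnegative parts. A sketch of the standard identities (notation is ours, not the paper's):

```latex
% Bivariate partial information decomposition (Williams & Beer, 2010):
% the joint information splits into redundancy, two unique terms, and synergy,
\begin{align}
I(X_1, X_2; Y) &= \mathrm{Red}(X_1, X_2; Y) + \mathrm{Unq}(X_1 \setminus X_2; Y) \nonumber \\
               &\quad + \mathrm{Unq}(X_2 \setminus X_1; Y) + \mathrm{Syn}(X_1, X_2; Y), \\
% and each single-variable information contains redundancy plus its unique part:
I(X_1; Y) &= \mathrm{Red}(X_1, X_2; Y) + \mathrm{Unq}(X_1 \setminus X_2; Y), \\
I(X_2; Y) &= \mathrm{Red}(X_1, X_2; Y) + \mathrm{Unq}(X_2 \setminus X_1; Y).
\end{align}
```

Fixing any one of the four terms (e.g., redundancy via a maximum entropy construction, as the paper does) determines the other three through these identities.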
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Title | An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis |
Authors | Yuandong Tian |
Abstract | In this paper, we explore theoretical properties of training a two-layered ReLU network $g(\mathbf{x}; \mathbf{w}) = \sum_{j=1}^K \sigma(\mathbf{w}_j^T\mathbf{x})$ with centered $d$-dimensional spherical Gaussian input $\mathbf{x}$ ($\sigma$=ReLU). We train our network with gradient descent on $\mathbf{w}$ to mimic the output of a teacher network with the same architecture and fixed parameters $\mathbf{w}^*$. We show that its population gradient has an analytical formula, leading to interesting theoretical analysis of critical points and convergence behaviors. First, we prove that critical points outside the hyperplane spanned by the teacher parameters (“out-of-plane”) are not isolated and form manifolds, and characterize in-plane critical-point-free regions for the two-ReLU case. On the other hand, convergence to $\mathbf{w}^*$ for one ReLU node is guaranteed with at least $(1-\epsilon)/2$ probability, if weights are initialized randomly with standard deviation upper-bounded by $O(\epsilon/\sqrt{d})$, consistent with empirical practice. For networks with many ReLU nodes, we prove that an infinitesimal perturbation of the weight initialization results in convergence towards $\mathbf{w}^*$ (or its permutation), a phenomenon known as spontaneous symmetry-breaking (SSB) in physics. We assume no independence of ReLU activations. Simulations verify our findings. |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00560v2 |
http://arxiv.org/pdf/1703.00560v2.pdf | |
PWC | https://paperswithcode.com/paper/an-analytical-formula-of-population-gradient |
Repo | https://github.com/yuandong-tian/ICML17_ReLU |
Framework | none |
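The student-teacher setup above is easy to reproduce numerically. A minimal sketch that estimates the population gradient by Monte Carlo rather than the paper's closed form (dimensions, sample count, and step size are illustrative assumptions):

```python
import numpy as np

# Minimal student-teacher simulation for g(x; w) = sum_j relu(w_j^T x)
# with centered spherical Gaussian input, matching the setup above. The
# population gradient is estimated by Monte Carlo; the paper instead
# derives it analytically. All sizes and the step size are assumptions.
rng = np.random.default_rng(0)
d, K, n = 10, 2, 20_000              # input dim, ReLU nodes, MC samples

w_star = rng.normal(size=(K, d))                  # fixed teacher weights
w = w_star + 1e-2 * rng.normal(size=(K, d))       # student: tiny perturbation

def net(W, X):
    return np.maximum(X @ W.T, 0.0).sum(axis=1)   # sum of ReLU units

for step in range(100):
    X = rng.normal(size=(n, d))                   # x ~ N(0, I_d)
    err = net(w, X) - net(w_star, X)              # per-sample residual
    act = (X @ w.T) > 0                           # ReLU gates, sigma'(w_j^T x)
    # grad of 0.5 * E[err^2] w.r.t. w_j is E[err * 1{w_j^T x > 0} * x]:
    grad = (err[:, None, None] * act[:, :, None] * X[:, None, :]).mean(axis=0)
    w -= 0.1 * grad

print(np.linalg.norm(w - w_star))                 # shrinks toward 0
```

Starting from a small perturbation of the teacher, the iterates converge back to $\mathbf{w}^*$, consistent with the local convergence results described in the abstract.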
Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks
Title | Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks |
Authors | Md Zahangir Alom, Paheding Sidike, Mahmudul Hasan, Tarek M. Taha, Vijayan K. Asari |
Abstract | In spite of advances in object recognition technology, Handwritten Bangla Character Recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even the best existing recognizers do not achieve satisfactory performance for practical applications related to Bangla character recognition, and they perform much worse than those developed for English alphanumeric characters. To improve the performance of HBCR, we herein present the application of state-of-the-art Deep Convolutional Neural Networks (DCNN), including VGG Network, All Convolution Network (All-Conv Net), Network in Network (NiN), Residual Network, FractalNet, and DenseNet, to HBCR. These deep learning approaches have the advantage of extracting and using feature information, improving the recognition of 2D shapes with a high degree of invariance to translation, scaling, and other distortions. We systematically evaluated the performance of the DCNN models on the publicly available Bangla handwritten character dataset CMATERdb and achieved superior recognition accuracy with the DCNN models. This improvement would help in building an automatic HBCR system for practical applications. |
Tasks | Object Recognition |
Published | 2017-12-28 |
URL | http://arxiv.org/abs/1712.09872v3 |
http://arxiv.org/pdf/1712.09872v3.pdf | |
PWC | https://paperswithcode.com/paper/handwritten-bangla-character-recognition |
Repo | https://github.com/sh21kang/OCR_V1 |
Framework | mxnet |
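As a point of reference for the architectures compared above, a minimal Keras sketch of a small DCNN character classifier (the 50-class count and 32x32 grayscale inputs are assumptions; CMATERdb loading and the paper's deeper networks are not reproduced):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# A small plain CNN baseline; the paper benchmarks much deeper
# architectures (VGG, All-Conv, NiN, ResNet, FractalNet, DenseNet).
num_classes = 50          # assumption: basic Bangla character set size
model = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", padding="same",
                  input_shape=(32, 32, 1)),        # assumed grayscale 32x32
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```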
Video Imagination from a Single Image with Transformation Generation
Title | Video Imagination from a Single Image with Transformation Generation |
Authors | Baoyang Chen, Wenmin Wang, Jinzhuo Wang, Xiongtao Chen |
Abstract | In this work, we focus on a challenging task: synthesizing multiple imaginary videos given a single image. The major problems are the high dimensionality of pixel space and the ambiguity of potential motions. To overcome these problems, we propose a new framework that produces imaginary videos by transformation generation. The generated transformations are applied to the original image in a novel volumetric merge network to reconstruct frames of the imaginary video. By sampling different latent variables, our method can output different imaginary video samples. The framework is trained in an adversarial way with unsupervised learning. For evaluation, we propose a new assessment metric $RIQA$. In experiments, we test on 3 datasets ranging from synthetic data to natural scenes. Our framework achieves promising performance in image quality assessment. Visual inspection indicates that it can successfully generate diverse five-frame videos of acceptable perceptual quality. |
Tasks | Image Quality Assessment |
Published | 2017-06-13 |
URL | http://arxiv.org/abs/1706.04124v2 |
http://arxiv.org/pdf/1706.04124v2.pdf | |
PWC | https://paperswithcode.com/paper/video-imagination-from-a-single-image-with |
Repo | https://github.com/gitpub327/VideoImagination |
Framework | tf |
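A toy illustration of the "apply K generated transformations, then merge" idea from the abstract above: each transformation warps the input image, and a per-pixel softmax over K confidence maps blends the warped copies into one predicted frame. The real framework generates transformations and merge weights with trained networks; here both are random stand-ins:

```python
import numpy as np
from scipy.ndimage import affine_transform

rng = np.random.default_rng(0)
img = rng.random((64, 64))                     # single input image
K = 4                                          # number of sampled transforms

warped = np.stack([
    affine_transform(img, np.eye(2) + 0.05 * rng.normal(size=(2, 2)),
                     offset=rng.normal(scale=1.0, size=2), order=1)
    for _ in range(K)
])                                             # (K, H, W) warped copies

logits = rng.normal(size=(K, 64, 64))          # stand-in merge-network output
weights = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
next_frame = (weights * warped).sum(axis=0)    # volumetric merge
print(next_frame.shape)                        # (64, 64)
```

Sampling different latent inputs would yield different transformation sets, hence different imagined frames from the same image.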
DeepIGeoS: A Deep Interactive Geodesic Framework for Medical Image Segmentation
Title | DeepIGeoS: A Deep Interactive Geodesic Framework for Medical Image Segmentation |
Authors | Guotai Wang, Maria A. Zuluaga, Wenqi Li, Rosalind Pratt, Premal A. Patel, Michael Aertsen, Tom Doel, Anna L. David, Jan Deprest, Sebastien Ourselin, Tom Vercauteren |
Abstract | Accurate medical image segmentation is essential for diagnosis, surgical planning and many other applications. Convolutional Neural Networks (CNNs) have become the state-of-the-art automatic segmentation methods. However, fully automatic results may still need to be refined to become accurate and robust enough for clinical use. We propose a deep learning-based interactive segmentation method to improve the results obtained by an automatic CNN and to reduce user interactions during refinement for higher accuracy. We use one CNN to obtain an initial automatic segmentation, on which user interactions are added to indicate mis-segmentations. Another CNN takes as input the user interactions with the initial segmentation and gives a refined result. We propose to combine user interactions with CNNs through geodesic distance transforms, and propose a resolution-preserving network that gives a better dense prediction. In addition, we integrate user interactions as hard constraints into a back-propagatable Conditional Random Field. We validated the proposed framework in the context of 2D placenta segmentation from fetal MRI and 3D brain tumor segmentation from FLAIR images. Experimental results show our method achieves a large improvement over automatic CNNs, and obtains comparable or even higher accuracy with fewer user interventions and less time than traditional interactive methods. |
Tasks | Brain Tumor Segmentation, Interactive Segmentation, Medical Image Segmentation, Placenta Segmentation, Semantic Segmentation |
Published | 2017-07-03 |
URL | http://arxiv.org/abs/1707.00652v3 |
http://arxiv.org/pdf/1707.00652v3.pdf | |
PWC | https://paperswithcode.com/paper/deepigeos-a-deep-interactive-geodesic |
Repo | https://github.com/taigw/geodesic_distance |
Framework | none |
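The core trick above is encoding user scribbles as geodesic distance maps that become extra CNN input channels. A plain-Dijkstra sketch of such a transform on a 2D image (the paper's own implementation uses a faster raster-scan approximation; the edge-cost weighting is an assumption):

```python
import heapq
import numpy as np

def geodesic_distance(image, seeds, lam=1.0):
    """Geodesic distance from seed pixels on a 2D image graph.

    Edge cost between 4-neighbors mixes the spatial step with the
    intensity difference, so distances grow slowly inside homogeneous
    regions and sharply across edges.
    """
    H, W = image.shape
    dist = np.full((H, W), np.inf)
    heap = [(0.0, r, c) for r, c in seeds]
    for _, r, c in heap:
        dist[r, c] = 0.0
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < H and 0 <= nc < W:
                step = 1.0 + lam * abs(image[nr, nc] - image[r, c])
                if d + step < dist[nr, nc]:
                    dist[nr, nc] = d + step
                    heapq.heappush(heap, (d + step, nr, nc))
    return dist

img = np.random.default_rng(0).random((32, 32))
fg = geodesic_distance(img, seeds=[(16, 16)])   # foreground scribble
# Stack (image, fg_distance, initial_segmentation) as refinement-CNN input.
```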
Brain Tumor Segmentation Based on Refined Fully Convolutional Neural Networks with A Hierarchical Dice Loss
Title | Brain Tumor Segmentation Based on Refined Fully Convolutional Neural Networks with A Hierarchical Dice Loss |
Authors | Jiachi Zhang, Xiaolei Shen, Tianqi Zhuo, Hong Zhou |
Abstract | As a basic task in computer vision, semantic segmentation provides fundamental information for object detection and instance segmentation, helping artificial intelligence better understand the real world. Since the proposal of the fully convolutional neural network (FCNN), it has been widely used for semantic segmentation because of its high accuracy of pixel-wise classification as well as its high precision of localization. In this paper, we apply several well-known FCNNs to brain tumor segmentation, making comparisons and adjusting network architectures to achieve better performance measured by metrics such as precision, recall, mean intersection over union (mIoU), and Dice score coefficient (DSC). The adjustments to the classic FCNN include adding more connections between convolutional layers, enlarging decoders after upsampling layers, and changing the way shallower layers’ information is reused. Besides the structural modifications, we also propose a new classifier with a hierarchical Dice loss. Inspired by the containment relationship between classes, the loss function converts multi-class classification into multiple binary classifications in order to counteract the negative effect caused by imbalanced data sets. Extensive experiments were conducted on the training and testing sets to assess our refined fully convolutional neural networks and new loss function. Competitive results show they are more effective than their predecessors. |
Tasks | Brain Tumor Segmentation, Instance Segmentation, Object Detection, Semantic Segmentation |
Published | 2017-12-25 |
URL | http://arxiv.org/abs/1712.09093v3 |
http://arxiv.org/pdf/1712.09093v3.pdf | |
PWC | https://paperswithcode.com/paper/brain-tumor-segmentation-based-on-refined |
Repo | https://github.com/milliondegree/semantic-segmentation-tensorflow |
Framework | tf |
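A sketch of the hierarchical Dice idea above: nested class regions each become one binary Dice term. The BraTS-style nesting (whole tumor contains tumor core, which contains the enhancing core) and label encoding used here are illustrative assumptions, not the paper's exact grouping:

```python
import numpy as np

def dice(pred, target, eps=1e-6):
    """Soft Dice score for one binary mask pair."""
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def hierarchical_dice_loss(probs, labels):
    # labels: integer map with 0=background, 1=edema, 2=core, 3=enhancing
    # probs: (4, H, W) softmax output of the segmentation network
    whole_p, whole_t = probs[1:].sum(0), (labels >= 1).astype(float)
    core_p, core_t = probs[2:].sum(0), (labels >= 2).astype(float)
    enh_p, enh_t = probs[3], (labels == 3).astype(float)
    scores = [dice(p, t) for p, t in ((whole_p, whole_t),
                                      (core_p, core_t), (enh_p, enh_t))]
    return 1.0 - np.mean(scores)

rng = np.random.default_rng(0)
logits = rng.random((4, 8, 8))
probs = logits / logits.sum(0)                  # stand-in network output
labels = rng.integers(0, 4, size=(8, 8))
print(hierarchical_dice_loss(probs, labels))
```

Because each binary term compares a merged region against its own background, rare inner classes are no longer swamped by the dominant background class.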
Unsupervised Learning of Disentangled Representations from Video
Title | Unsupervised Learning of Disentangled Representations from Video |
Authors | Emily Denton, Vighnesh Birodkar |
Abstract | We present a new model, DrNET, that learns disentangled image representations from video. Our approach leverages the temporal coherence of video and a novel adversarial loss to learn a representation that factorizes each frame into a stationary part and a temporally varying component. The disentangled representation can be used for a range of tasks. For example, applying a standard LSTM to the time-varying components enables prediction of future frames. We evaluate our approach on a range of synthetic and real videos, demonstrating the ability to coherently generate hundreds of steps into the future. |
Tasks | |
Published | 2017-05-31 |
URL | http://arxiv.org/abs/1705.10915v1 |
http://arxiv.org/pdf/1705.10915v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-disentangled |
Repo | https://github.com/edenton/drnet |
Framework | torch |
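A minimal sketch of the content/pose factorization described above: a content encoder should give near-identical codes for two frames of the same clip, while a pose encoder carries the time-varying part used to decode the target frame. The stand-in MLP encoders, sizes, and losses are assumptions; the paper uses conv nets plus an adversarial loss on the pose code:

```python
import torch
import torch.nn as nn

enc_c = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128))  # content
enc_p = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 16))   # pose
dec = nn.Linear(128 + 16, 64 * 64)                            # decoder

x_t = torch.rand(8, 1, 64, 64)        # frame at time t
x_tk = torch.rand(8, 1, 64, 64)       # frame at time t+k, same clip
h_c, h_p = enc_c(x_t), enc_p(x_tk)
recon = dec(torch.cat([h_c, h_p], dim=1))

loss = (nn.functional.mse_loss(recon, x_tk.flatten(1))        # reconstruction
        + nn.functional.mse_loss(enc_c(x_tk), h_c.detach()))  # content similarity
loss.backward()
```

For future-frame prediction, an LSTM would then roll the pose code forward while the content code stays fixed.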
End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design
Title | End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design |
Authors | Li Shen |
Abstract | We develop an end-to-end training algorithm for whole-image breast cancer diagnosis based on mammograms. It requires lesion annotations only at the first stage of training. After that, a whole-image classifier can be trained using only image-level labels, greatly reducing the reliance on lesion annotations. Our approach is implemented using an all convolutional design that is simple yet provides superior performance in comparison with previous methods. On DDSM, our best single model achieves a per-image AUC score of 0.88, and three-model averaging increases the score to 0.91. On INbreast, our best single model achieves a per-image AUC score of 0.96. Using DDSM as a benchmark, our models compare favorably with the current state-of-the-art. We also demonstrate that a whole-image model trained on DDSM can be easily transferred to INbreast without using its lesion annotations and with only a small amount of training data. Code availability: https://github.com/lishen/end2end-all-conv |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05775v1 |
http://arxiv.org/pdf/1711.05775v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-training-for-whole-image-breast |
Repo | https://github.com/yuyuyu123456/CBIS-DDSM |
Framework | tf |
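A sketch of the two-stage, all-convolutional idea described above: a patch classifier's convolutional backbone is reused on whole mammograms of any size, and only image-level labels drive the second stage. Layer sizes here are illustrative assumptions, not the paper's architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

backbone = models.Sequential([      # stage 1: trained on annotated patches
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
])

# Stage 2: all-conv whole-image model; no dense layers, so any input
# size works and training needs only image-level labels.
inp = tf.keras.Input(shape=(None, None, 1))
x = backbone(inp)
x = layers.Conv2D(2, 1, activation="softmax")(x)   # per-location class map
out = layers.GlobalAveragePooling2D()(x)           # image-level prediction
whole_image = tf.keras.Model(inp, out)
whole_image.compile(optimizer="adam", loss="categorical_crossentropy")
```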
Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification
Title | Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification |
Authors | Sungmin Rhee, Seokjun Seo, Sun Kim |
Abstract | Network biology has been successfully used to help reveal complex mechanisms of disease, especially cancer. On the other hand, network biology requires in-depth knowledge to construct disease-specific networks, and our current knowledge is very limited even with the recent advances in human cancer biology. Deep learning has shown great potential to address difficult situations like this. However, deep learning technologies conventionally use grid-like structured data, so the application of deep learning to the classification of human disease subtypes is yet to be explored. Recently, graph-based deep learning techniques have emerged, which presents an opportunity to leverage analyses in network biology. In this paper, we propose a hybrid model that integrates two key components: 1) a graph convolutional neural network (graph CNN) and 2) a relation network (RN). We utilize the graph CNN as a component to learn expression patterns of cooperative gene communities, and the RN as a component to learn associations between the learned patterns. The proposed model is applied to the PAM50 breast cancer subtype classification task, the standard breast cancer subtype classification of clinical utility. In experiments on both subtype classification and patient survival analysis, our proposed method achieved significantly better performance than existing methods. We believe this work is an important starting point toward realizing personalized medicine. |
Tasks | Survival Analysis |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.05859v3 |
http://arxiv.org/pdf/1711.05859v3.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-approach-of-relation-network-and |
Repo | https://github.com/LeeJunHyun/The-Databases-for-Drug-Discovery |
Framework | tf |
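A sketch of the relation-network half of the hybrid above: given per-community embeddings (assumed here to come from a graph CNN over gene expression), a shared MLP `g` scores every ordered pair and the summed relations feed a classifier `f`. Dimensions are illustrative assumptions; PAM50 defines five subtypes:

```python
import torch
import torch.nn as nn

n_communities, d, n_classes = 12, 32, 5          # PAM50 has 5 subtypes
g = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, 64))
f = nn.Sequential(nn.ReLU(), nn.Linear(64, n_classes))

obj = torch.rand(n_communities, d)               # stand-in graph-CNN output
pairs = torch.cat([                              # all ordered pairs (i, j)
    obj.unsqueeze(0).expand(n_communities, -1, -1),
    obj.unsqueeze(1).expand(-1, n_communities, -1),
], dim=-1).reshape(-1, 2 * d)
logits = f(g(pairs).sum(dim=0))                  # aggregate relations, classify
print(logits.shape)                              # (n_classes,)
```

Summing over pairs makes the relational aggregate invariant to community ordering, which is the design point of relation networks.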
Comprehensive Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems Using the R-Package flacco
Title | Comprehensive Feature-Based Landscape Analysis of Continuous and Constrained Optimization Problems Using the R-Package flacco |
Authors | Pascal Kerschke |
Abstract | Choosing the best-performing optimizer(s) out of a portfolio of optimization algorithms is usually a difficult and complex task. It gets even worse if the underlying functions are unknown, i.e., so-called black-box problems, and function evaluations are considered to be expensive. In the case of continuous single-objective optimization problems, Exploratory Landscape Analysis (ELA) - a sophisticated and effective approach for characterizing the landscapes of such problems by means of numerical values before actually performing the optimization task itself - is advantageous. Unfortunately, until now it has been quite complicated to compute multiple ELA features simultaneously, as the corresponding code has been - if at all - spread across multiple platforms or at least across several packages within these platforms. This article presents a broad summary of existing ELA approaches and introduces flacco, an R package for feature-based landscape analysis of continuous and constrained optimization problems. Although its functions neither solve the optimization problem itself nor the related "Algorithm Selection Problem (ASP)", it offers easy access to an essential ingredient of the ASP by providing a wide collection of ELA features on a single platform - even within a single package. In addition, flacco provides multiple visualization techniques, which enhance the understanding of some of these numerical features and thereby make certain landscape properties more comprehensible. On top of that, we introduce the package's built-in, as well as web-hosted and hence platform-independent, graphical user interface (GUI), which facilitates the usage of the package - especially for people who are not familiar with R - making it a very convenient toolbox when working towards algorithm selection for continuous single-objective optimization problems. |
Tasks | |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05258v1 |
http://arxiv.org/pdf/1708.05258v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-feature-based-landscape |
Repo | https://github.com/kerschke/flacco |
Framework | none |
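flacco itself is an R package; as a language-agnostic illustration of what an ELA feature captures, here is a Python sketch of one classic meta-model-style feature, the fit quality of a linear surrogate. The function names and domain bounds are our assumptions, not flacco's API:

```python
import numpy as np

rng = np.random.default_rng(0)

def ela_linear_r2(f, dim, n_samples=500, lower=-5.0, upper=5.0):
    """R^2 of a linear model fit to sampled (x, f(x)) pairs.

    High values suggest a near-linear landscape; low values a more
    rugged one. This mirrors the spirit of flacco's meta-model features.
    """
    X = rng.uniform(lower, upper, size=(n_samples, dim))
    y = np.apply_along_axis(f, 1, X)
    A = np.column_stack([X, np.ones(n_samples)])     # design matrix
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

tilted = lambda x: float(np.sum(2.0 * x) + 0.05 * np.sum(x ** 2))
rastrigin = lambda x: float(10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x)))
print(ela_linear_r2(tilted, dim=3))      # close to 1: nearly linear landscape
print(ela_linear_r2(rastrigin, dim=3))   # much lower: rugged, multimodal
```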
“Found in Translation”: Predicting Outcomes of Complex Organic Chemistry Reactions using Neural Sequence-to-Sequence Models
Title | “Found in Translation”: Predicting Outcomes of Complex Organic Chemistry Reactions using Neural Sequence-to-Sequence Models |
Authors | Philippe Schwaller, Theophile Gaudin, David Lanyi, Costas Bekas, Teodoro Laino |
Abstract | There is an intuitive analogy of an organic chemist’s understanding of a compound and a language speaker’s understanding of a word. Consequently, it is possible to introduce the basic concepts and analyze potential impacts of linguistic analysis to the world of organic chemistry. In this work, we cast the reaction prediction task as a translation problem by introducing a template-free sequence-to-sequence model, trained end-to-end and fully data-driven. We propose a novel way of tokenization, which is arbitrarily extensible with reaction information. With this approach, we demonstrate results superior to the state-of-the-art solution by a significant margin on the top-1 accuracy. Specifically, our approach achieves an accuracy of 80.1% without relying on auxiliary knowledge such as reaction templates. Also, 66.4% accuracy is reached on a larger and noisier dataset. |
Tasks | Tokenization |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04810v2 |
http://arxiv.org/pdf/1711.04810v2.pdf | |
PWC | https://paperswithcode.com/paper/found-in-translation-predicting-outcomes-of |
Repo | https://github.com/ManzoorElahi/organic-chemistry-reaction-prediction-using-NMT |
Framework | none |
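The tokenization step mentioned above is the part of this pipeline most often reused elsewhere. A regex-based SMILES tokenizer in the style popularized by this line of work, keeping multi-character tokens (bracket atoms, Br/Cl, two-digit ring closures) intact; this is the commonly circulated pattern, and the paper's exact tokenization (which it extends with reaction information) may differ:

```python
import re

SMILES_TOKEN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\."
    r"|=|#|-|\+|\\|\/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into chemically meaningful tokens."""
    tokens = SMILES_TOKEN.findall(smiles)
    assert "".join(tokens) == smiles, "untokenizable characters present"
    return tokens

print(tokenize("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
# ['C', 'C', '(', '=', 'O', ')', 'O', 'c', '1', 'c', 'c', 'c', 'c', 'c',
#  '1', 'C', '(', '=', 'O', ')', 'O']
```

The reassembly assertion is a cheap safeguard: if the regex ever drops a character, the token stream would no longer round-trip to the input.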
DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
Title | DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling |
Authors | Lachlan Tychsen-Smith, Lars Petersson |
Abstract | We define the object detection from imagery problem as estimating a very large but extremely sparse bounding box dependent probability distribution. Subsequently we identify a sparse distribution estimation scheme, Directed Sparse Sampling, and employ it in a single end-to-end CNN based detection model. This methodology extends and formalizes previous state-of-the-art detection models with an additional emphasis on high evaluation rates and reduced manual engineering. We introduce two novelties, a corner based region-of-interest estimator and a deconvolution based CNN model. The resulting model is scene adaptive, does not require manually defined reference bounding boxes and produces highly competitive results on MSCOCO, Pascal VOC 2007 and Pascal VOC 2012 with real-time evaluation rates. Further analysis suggests our model performs particularly well when fine-grained object localization is desirable. We argue that this advantage stems from the significantly larger set of available regions-of-interest relative to other methods. Source-code is available from: https://github.com/lachlants/denet |
Tasks | Object Detection, Object Localization, Real-Time Object Detection |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10295v3 |
http://arxiv.org/pdf/1703.10295v3.pdf | |
PWC | https://paperswithcode.com/paper/denet-scalable-real-time-object-detection |
Repo | https://github.com/lachlants/denet |
Framework | none |
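A sketch of the corner-based region-of-interest idea above: a CNN emits per-pixel corner probability maps, and the most probable top-left and bottom-right corners are paired into a sparse set of candidate boxes for a classifier to score. Heatmaps, thresholds, and counts here are stand-in assumptions:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
H = W = 32
tl_prob = rng.random((H, W))        # stand-in top-left corner heatmap
br_prob = rng.random((H, W))        # stand-in bottom-right corner heatmap

def top_k(prob, k=8):
    """Coordinates of the k highest-probability locations, shape (k, 2)."""
    idx = np.argsort(prob, axis=None)[-k:]
    return np.stack(np.unravel_index(idx, prob.shape), axis=1)

boxes = [
    (y0, x0, y1, x1)
    for (y0, x0), (y1, x1) in product(top_k(tl_prob), top_k(br_prob))
    if y1 > y0 and x1 > x0          # keep geometrically valid pairs only
]
print(len(boxes), "candidate RoIs from", 8 * 8, "corner pairs")
```

Because candidates come from observed corner evidence rather than a fixed anchor grid, the RoI set stays sparse yet scene adaptive, which is the abstract's central claim.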
Onsets and Frames: Dual-Objective Piano Transcription
Title | Onsets and Frames: Dual-Objective Piano Transcription |
Authors | Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck |
Abstract | We advance the state of the art in polyphonic piano music transcription by using a deep convolutional and recurrent neural network which is trained to jointly predict onsets and frames. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note to start unless the onset detector also agrees that an onset for that pitch is present in the frame. We focus on improving onsets and offsets together instead of either in isolation as we believe this correlates better with human musical perception. Our approach results in over a 100% relative improvement in note F1 score (with offsets) on the MAPS dataset. Furthermore, we extend the model to predict relative velocities of normalized audio which results in more natural-sounding transcriptions. |
Tasks | |
Published | 2017-10-30 |
URL | http://arxiv.org/abs/1710.11153v2 |
http://arxiv.org/pdf/1710.11153v2.pdf | |
PWC | https://paperswithcode.com/paper/onsets-and-frames-dual-objective-piano |
Repo | https://github.com/BShakhovsky/PolyphonicPianoTranscription |
Framework | tf |
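The inference rule described above (a frame prediction may start a note only if the onset detector agrees) is simple to state in code. A minimal decoding sketch over probability arrays; the 0.5 threshold and the (T, 88) piano-roll layout are assumptions:

```python
import numpy as np

def decode(onset_probs, frame_probs, thresh=0.5):
    """Onset-gated piano-roll decoding.

    A note may start at time t only where the onset detector fires;
    once started, it sustains while the frame detector stays above
    threshold.
    """
    onsets = onset_probs > thresh
    frames = frame_probs > thresh
    roll = np.zeros_like(frames)
    for p in range(frames.shape[1]):         # each pitch independently
        active = False
        for t in range(frames.shape[0]):
            if onsets[t, p]:
                active = True                # onset opens a note
            elif not frames[t, p]:
                active = False               # frame gap closes it
            roll[t, p] = active and frames[t, p]
    return roll

rng = np.random.default_rng(0)
roll = decode(rng.random((100, 88)), rng.random((100, 88)))
print(roll.shape)                            # (100, 88) boolean piano roll
```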
Framing U-Net via Deep Convolutional Framelets: Application to Sparse-view CT
Title | Framing U-Net via Deep Convolutional Framelets: Application to Sparse-view CT |
Authors | Yoseob Han, Jong Chul Ye |
Abstract | X-ray computed tomography (CT) using sparse projection views is a recent approach to reduce the radiation dose. However, due to the insufficient projection views, an analytic reconstruction approach using the filtered back projection (FBP) produces severe streaking artifacts. Recently, deep learning approaches using large receptive field neural networks such as U-Net have demonstrated impressive performance for sparse-view CT reconstruction. However, theoretical justification is still lacking. Inspired by the recent theory of deep convolutional framelets, the main goal of this paper is, therefore, to reveal the limitation of U-Net and propose new multi-resolution deep learning schemes. In particular, we show that the alternative U-Net variants such as the dual frame and tight frame U-Nets satisfy the so-called frame condition, which makes them better for effective recovery of high-frequency edges in sparse-view CT. Using extensive experiments with a real patient data set, we demonstrate that the new network architectures provide better reconstruction performance. |
Tasks | Computed Tomography (CT) |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08333v3 |
http://arxiv.org/pdf/1708.08333v3.pdf | |
PWC | https://paperswithcode.com/paper/framing-u-net-via-deep-convolutional |
Repo | https://github.com/hanyoseob/framing-u-net |
Framework | none |
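For reference, the frame condition the abstract appeals to is the standard one from frame theory (notation is ours): a family of analysis vectors must bound the signal energy from both sides, so no component, in particular no high-frequency edge, is lost by the encoder-decoder pair.

```latex
% Frame condition for a family \{\phi_k\} in a Hilbert space:
% the coefficient energy is bounded above and below, so x is stably
% recoverable from \langle x, \phi_k \rangle. A = B gives a tight frame.
A\,\|x\|^2 \;\le\; \sum_k \left|\langle x, \phi_k \rangle\right|^2 \;\le\; B\,\|x\|^2,
\qquad 0 < A \le B < \infty .
```

The paper's argument is that plain U-Net pooling violates this condition, while the dual frame and tight frame variants restore it.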