July 27, 2019

Paper Group ANR 600

Identifying Harm Events in Clinical Care through Medical Narratives

Title Identifying Harm Events in Clinical Care through Medical Narratives
Authors Arman Cohan, Allan Fong, Raj Ratwani, Nazli Goharian
Abstract Preventable medical errors are estimated to be among the leading causes of injury and death in the United States. To prevent such errors, healthcare systems have implemented patient safety and incident reporting systems. These systems enable clinicians to report unsafe conditions and cases where patients have been harmed due to errors in medical care. These reports are narratives in natural language, and while they provide detailed information about the situation, it is non-trivial to perform large-scale analysis for identifying common causes of errors and harm to patients. In this work, we present a method based on attentive convolutional and recurrent networks for identifying harm events in patient care and categorizing the harm based on its severity level. We demonstrate that our methods can significantly improve performance over existing methods in identifying harm in clinical care.
Tasks
Published 2017-08-15
URL http://arxiv.org/abs/1708.04681v1
PDF http://arxiv.org/pdf/1708.04681v1.pdf
PWC https://paperswithcode.com/paper/identifying-harm-events-in-clinical-care
Repo
Framework

Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

Title Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval
Authors Peng Xu, Qiyue Yin, Yongye Huang, Yi-Zhe Song, Zhanyu Ma, Liang Wang, Tao Xiang, W. Bastiaan Kleijn, Jun Guo
Abstract Sketch-based image retrieval (SBIR) is challenging due to the inherent domain gap between sketch and photo. Compared with the pixel-perfect depictions in photos, sketches are iconic renderings of the real world that are highly abstract. Therefore, matching sketch and photo directly using low-level visual cues is insufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness of cross-modal retrieval methods in SBIR, which have been applied successfully in image-text matching. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through thorough examination of the experimental results, we demonstrate that subspace learning can effectively model the sketch-photo domain gap. In addition, we draw a few key insights to drive future research.
Tasks Cross-Modal Retrieval, Image Retrieval, Sketch-Based Image Retrieval, Text Matching
Published 2017-05-28
URL http://arxiv.org/abs/1705.09888v1
PDF http://arxiv.org/pdf/1705.09888v1.pdf
PWC https://paperswithcode.com/paper/cross-modal-subspace-learning-for-fine
Repo
Framework

ICABiDAS: Intuition Centred Architecture for Big Data Analysis and Synthesis

Title ICABiDAS: Intuition Centred Architecture for Big Data Analysis and Synthesis
Authors Amit Kumar Mishra
Abstract Humans are expert at handling the amount of sensory data they deal with each moment. The human brain not only analyses these data but also starts synthesizing new information from the existing data. Current big-data systems are needed not just to analyze data but also to come up with new interpretations. We believe that the pivotal ability of the human brain which enables us to do this is what is known as “intuition”. Here, we present an intuition-based architecture for big data analysis and synthesis.
Tasks
Published 2017-06-02
URL http://arxiv.org/abs/1706.00638v1
PDF http://arxiv.org/pdf/1706.00638v1.pdf
PWC https://paperswithcode.com/paper/icabidas-intuition-centred-architecture-for
Repo
Framework

Encoding Word Confusion Networks with Recurrent Neural Networks for Dialog State Tracking

Title Encoding Word Confusion Networks with Recurrent Neural Networks for Dialog State Tracking
Authors Glorianna Jagfeld, Ngoc Thang Vu
Abstract This paper presents our novel method to encode word confusion networks, which can represent a rich hypothesis space of automatic speech recognition systems, via recurrent neural networks. We demonstrate the utility of our approach for the task of dialog state tracking in spoken dialog systems that relies on automatic speech recognition output. Encoding confusion networks outperforms encoding the best hypothesis of the automatic speech recognition in a neural system for dialog state tracking on the well-known second Dialog State Tracking Challenge dataset.
Tasks Speech Recognition
Published 2017-07-18
URL http://arxiv.org/abs/1707.05853v2
PDF http://arxiv.org/pdf/1707.05853v2.pdf
PWC https://paperswithcode.com/paper/encoding-word-confusion-networks-with
Repo
Framework
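
The abstract contrasts encoding a whole word confusion network with encoding only the best hypothesis. As a minimal sketch of the data structure only (not the authors' encoder), a confusion network can be held as a list of slots, each with alternative words and posterior probabilities. The names `cnet`, `best_hypothesis` and `num_paths` are hypothetical, and the words and probabilities are made up for illustration.

```python
from typing import List, Tuple

# A word confusion network as a sequence of slots; each slot lists
# alternative words with posterior probabilities (illustrative values).
ConfusionNetwork = List[List[Tuple[str, float]]]

cnet: ConfusionNetwork = [
    [("i", 0.9), ("a", 0.1)],
    [("want", 0.6), ("won't", 0.4)],
    [("thai", 0.55), ("tie", 0.45)],
    [("food", 1.0)],
]


def best_hypothesis(net: ConfusionNetwork) -> List[str]:
    """Keep only the most probable word of every slot."""
    return [max(slot, key=lambda wp: wp[1])[0] for slot in net]


def num_paths(net: ConfusionNetwork) -> int:
    """Count the word sequences the network can represent."""
    total = 1
    for slot in net:
        total *= len(slot)
    return total
```

Collapsing the network with `best_hypothesis` keeps one word sequence, whereas `num_paths` counts how many alternatives the full network still represents; that discarded hypothesis space is what the paper proposes to encode.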

DLPaper2Code: Auto-generation of Code from Deep Learning Research Papers

Title DLPaper2Code: Auto-generation of Code from Deep Learning Research Papers
Authors Akshay Sethi, Anush Sankaran, Naveen Panwar, Shreya Khare, Senthil Mani
Abstract With an abundance of research papers in deep learning, reproducibility or adoption of the existing works becomes a challenge. This is due to the lack of open source implementations provided by the authors. Further, re-implementing research papers in a different library is a daunting task. To address these challenges, we propose a novel extensible approach, DLPaper2Code, to extract and understand deep learning design flow diagrams and tables available in a research paper and convert them to an abstract computational graph. The extracted computational graph is then converted into execution-ready source code in both Keras and Caffe, in real time. An arXiv-like website is created where the automatically generated designs are made publicly available for 5,000 research papers. The generated designs can be rated and edited using an intuitive drag-and-drop UI framework in a crowdsourced manner. To evaluate our approach, we create a simulated dataset with over 216,000 valid design visualizations using a manually defined grammar. Experiments on the simulated dataset show that the proposed framework provides more than 93% accuracy in flow diagram content extraction.
Tasks
Published 2017-11-09
URL http://arxiv.org/abs/1711.03543v1
PDF http://arxiv.org/pdf/1711.03543v1.pdf
PWC https://paperswithcode.com/paper/dlpaper2code-auto-generation-of-code-from
Repo
Framework

Approximate message passing for nonconvex sparse regularization with stability and asymptotic analysis

Title Approximate message passing for nonconvex sparse regularization with stability and asymptotic analysis
Authors Ayaka Sakata, Yingying Xu
Abstract We analyse a linear regression problem with nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate message passing (AMP) algorithm considering nonconvex regularization, namely SCAD-AMP, and analytically show that the stability condition corresponds to the de Almeida–Thouless condition in spin glass literature. Through asymptotic analysis, we show the correspondence between the density evolution of SCAD-AMP and the replica symmetric solution. Numerical experiments confirm that for a sufficiently large system size, SCAD-AMP achieves the optimal performance predicted by the replica method. Through replica analysis, a phase transition between replica symmetric (RS) and replica symmetry breaking (RSB) region is found in the parameter space of SCAD. The appearance of the RS region for a nonconvex penalty is a significant advantage that indicates the region of smooth landscape of the optimization problem. Furthermore, we analytically show that the statistical representation performance of the SCAD penalty is better than that of L1-based methods, and the minimum representation error under RS assumption is obtained at the edge of the RS/RSB phase. The correspondence between the convergence of the existing coordinate descent algorithm and RS/RSB transition is also indicated.
Tasks
Published 2017-11-08
URL http://arxiv.org/abs/1711.02795v3
PDF http://arxiv.org/pdf/1711.02795v3.pdf
PWC https://paperswithcode.com/paper/approximate-message-passing-for-nonconvex
Repo
Framework
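
For readers unfamiliar with the SCAD penalty named in the abstract, the following is a minimal sketch of its commonly used piecewise form (linear near zero, quadratic in between, constant for large arguments). The function name `scad_penalty` and the default `a = 3.7` are assumptions taken from common usage, and the paper's own parametrization may differ.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty in its commonly used piecewise form.

    lam > 0 sets the threshold; a > 2 sets where the penalty flattens.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # L1-like region: the penalty grows linearly.
        return lam * t
    if t <= a * lam:
        # Quadratic region joining the linear and constant pieces.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no extra penalty.
    return lam * lam * (a + 1) / 2
```

The constant tail is what makes the penalty nonconvex: unlike an L1 penalty, it stops growing for large coefficients.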

Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants

Title Grab, Pay and Eat: Semantic Food Detection for Smart Restaurants
Authors Eduardo Aguilar, Beatriz Remeseiro, Marc Bolaños, Petia Radeva
Abstract The increase in people's awareness of their nutritional habits has drawn considerable attention to the field of automatic food analysis. In self-service restaurant environments, automatic food analysis is not only useful for extracting nutritional information from foods selected by customers, but is also of high interest for speeding up service by relieving the bottleneck produced at the cashiers in times of high demand. In this paper, we address the problem of automatic food tray analysis in canteen and restaurant environments, which consists of predicting multiple foods placed on a tray image. We propose a new approach for food analysis based on convolutional neural networks, which we name Semantic Food Detection, and which integrates food localization, recognition and segmentation in the same framework. We demonstrate that our method improves on state-of-the-art food detection by a considerable margin on the public dataset UNIMIB2016, achieving about 90% in terms of F-measure, and thus provides a significant technological advance towards automatic billing in restaurant environments.
Tasks
Published 2017-11-14
URL http://arxiv.org/abs/1711.05128v1
PDF http://arxiv.org/pdf/1711.05128v1.pdf
PWC https://paperswithcode.com/paper/grab-pay-and-eat-semantic-food-detection-for
Repo
Framework

Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification

Title Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification
Authors Wentao Zhu, Qi Lou, Yeeleng Scott Vang, Xiaohui Xie
Abstract Mammogram classification is directly related to computer-aided diagnosis of breast cancer. Traditional methods rely on regions of interest (ROIs), which require great effort to annotate. Inspired by the success of using deep convolutional features for natural image analysis and multi-instance learning (MIL) for labeling a set of instances/patches, we propose end-to-end trained deep multi-instance networks for mass classification based on the whole mammogram without the aforementioned ROIs. We explore three different schemes to construct deep multi-instance networks for whole mammogram classification. Experimental results on the INbreast dataset demonstrate the robustness of the proposed networks compared to previous work using segmentation and detection annotations.
Tasks Whole Mammogram Classification
Published 2017-05-23
URL http://arxiv.org/abs/1705.08550v1
PDF http://arxiv.org/pdf/1705.08550v1.pdf
PWC https://paperswithcode.com/paper/deep-multi-instance-networks-with-sparse-1
Repo
Framework

Spectral Dynamics of Learning Restricted Boltzmann Machines

Title Spectral Dynamics of Learning Restricted Boltzmann Machines
Authors Aurélien Decelle, Giancarlo Fissore, Cyril Furtlehner
Abstract The Restricted Boltzmann Machine (RBM), an important tool used in machine learning, in particular for unsupervised learning tasks, is investigated from the perspective of its spectral properties. Starting from empirical observations, we propose a generic statistical ensemble for the weight matrix of the RBM and characterize its mean evolution. This lets us show how in the linear regime, in which the RBM is found to operate at the beginning of the training, the statistical properties of the data drive the selection of the unstable modes of the weight matrix. A set of equations characterizing the non-linear regime is then derived, unveiling in some way how the selected modes interact in later stages of the learning procedure and defining a deterministic learning curve for the RBM.
Tasks
Published 2017-08-09
URL http://arxiv.org/abs/1708.02917v2
PDF http://arxiv.org/pdf/1708.02917v2.pdf
PWC https://paperswithcode.com/paper/spectral-dynamics-of-learning-restricted
Repo
Framework

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information

Title Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information
Authors Peng Xu, Fred Roosta, Michael W. Mahoney
Abstract We consider variants of trust-region and cubic regularization methods for non-convex optimization, in which the Hessian matrix is approximated. Under mild conditions on the inexact Hessian, and using approximate solutions of the corresponding sub-problems, we provide iteration complexity to achieve $\epsilon$-approximate second-order optimality, which has been shown to be tight. Our Hessian approximation conditions constitute a major relaxation over the existing ones in the literature. Consequently, we are able to show that such mild conditions allow for the construction of the approximate Hessian through various random sampling methods. In this light, we consider the canonical problem of finite-sum minimization, provide appropriate uniform and non-uniform sub-sampling strategies to construct such Hessian approximations, and obtain optimal iteration complexity for the corresponding sub-sampled trust-region and cubic regularization methods.
Tasks
Published 2017-08-23
URL https://arxiv.org/abs/1708.07164v4
PDF https://arxiv.org/pdf/1708.07164v4.pdf
PWC https://paperswithcode.com/paper/newton-type-methods-for-non-convex
Repo
Framework

Self-Supervised Relative Depth Learning for Urban Scene Understanding

Title Self-Supervised Relative Depth Learning for Urban Scene Understanding
Authors Huaizu Jiang, Erik Learned-Miller, Gustav Larsson, Michael Maire, Greg Shakhnarovich
Abstract As an agent moves through the world, the apparent motion of scene elements is (usually) inversely proportional to their depth. It is natural for a learning agent to associate image patterns with the magnitude of their displacement over time: as the agent moves, faraway mountains don’t move much; nearby trees move a lot. This natural relationship between the appearance of objects and their motion is a rich source of information about the world. In this work, we start by training a deep network, using fully automatic supervision, to predict relative scene depth from single images. The relative depth training images are automatically derived from simple videos of cars moving through a scene, using recent motion segmentation techniques, and no human-provided labels. This proxy task of predicting relative depth from a single image induces features in the network that result in large improvements in a set of downstream tasks including semantic segmentation, joint road segmentation and car detection, and monocular (absolute) depth estimation, over a network trained from scratch. The improvement on the semantic segmentation task is greater than those produced by any other automatically supervised methods. Moreover, for monocular depth estimation, our unsupervised pre-training method even outperforms supervised pre-training with ImageNet. In addition, we demonstrate benefits from learning to predict (unsupervised) relative depth in the specific videos associated with various downstream tasks. We adapt to the specific scenes in those tasks in an unsupervised manner to improve performance. In summary, for semantic segmentation, we present state-of-the-art results among methods that do not use supervised pre-training, and we even exceed the performance of supervised ImageNet pre-trained models for monocular depth estimation, achieving results that are comparable with state-of-the-art methods.
Tasks Depth Estimation, Monocular Depth Estimation, Motion Segmentation, Scene Understanding, Semantic Segmentation
Published 2017-12-13
URL http://arxiv.org/abs/1712.04850v2
PDF http://arxiv.org/pdf/1712.04850v2.pdf
PWC https://paperswithcode.com/paper/self-supervised-relative-depth-learning-for
Repo
Framework
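
The inverse relation between apparent motion and depth stated in the abstract can be illustrated with a simple pinhole-camera sketch. It assumes a static scene and a camera translating sideways, which is a simplification not taken from the paper; the function name `image_displacement` and all numbers are hypothetical.

```python
def image_displacement(focal_px: float, translation_m: float, depth_m: float) -> float:
    """Pixel shift of a static point when a pinhole camera moves sideways.

    Under this simplified model the shift is focal_px * translation_m / depth_m,
    so it is inversely proportional to depth.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * translation_m / depth_m


# Illustrative values: a nearby tree versus a faraway mountain.
tree_shift = image_displacement(focal_px=1000.0, translation_m=0.5, depth_m=5.0)
mountain_shift = image_displacement(focal_px=1000.0, translation_m=0.5, depth_m=5000.0)
```

With these made-up values the tree shifts by about 100 pixels and the mountain by about 0.1 pixel, which is the "nearby trees move a lot" cue the abstract describes.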

Clothing and People - A Social Signal Processing Perspective

Title Clothing and People - A Social Signal Processing Perspective
Authors Maedeh Aghaei, Federico Parezzan, Mariella Dimiccoli, Petia Radeva, Marco Cristani
Abstract In our society and century, clothing is no longer used only as a means for body protection. Our paper builds upon the evidence, studied within the social sciences, that clothing brings a clear communicative message in terms of social signals, influencing the impression and behaviour of others towards a person. In fact, clothing correlates with personality traits, both in terms of self-assessment and assessments that unacquainted people give to an individual. The consequences of these facts are important: the influence of clothing on the decision making of individuals has been investigated in the literature, showing that it represents a discriminative factor to differentiate among diverse groups of people. Unfortunately, this has been observed after cumbersome and expensive manual annotations, on very restricted populations, limiting the scope of the resulting claims. With this position paper, we want to sketch the main steps of the very first systematic analysis, driven by social signal processing techniques, of the relationship between clothing and social signals, both sent and perceived. Thanks to human parsing technologies, which exhibit high robustness owing to deep learning architectures, we are now able to isolate visual patterns characterising a large variety of garment types. These algorithms will be used to capture statistical relations on a large corpus of evidence to confirm the sociological findings and to go beyond the state of the art.
Tasks Decision Making, Human Parsing
Published 2017-04-07
URL http://arxiv.org/abs/1704.02231v1
PDF http://arxiv.org/pdf/1704.02231v1.pdf
PWC https://paperswithcode.com/paper/clothing-and-people-a-social-signal
Repo
Framework

Rapid Mixing Swendsen-Wang Sampler for Stochastic Partitioned Attractive Models

Title Rapid Mixing Swendsen-Wang Sampler for Stochastic Partitioned Attractive Models
Authors Sejun Park, Yunhun Jang, Andreas Galanis, Jinwoo Shin, Daniel Stefankovic, Eric Vigoda
Abstract The Gibbs sampler is a particularly popular Markov chain used for learning and inference problems in Graphical Models (GMs). These tasks are computationally intractable in general, and the Gibbs sampler often suffers from slow mixing. In this paper, we study the Swendsen-Wang dynamics, which is a more sophisticated Markov chain designed to overcome bottlenecks that impede the Gibbs sampler. We prove $O(\log n)$ mixing time for attractive binary pairwise GMs (i.e., ferromagnetic Ising models) on stochastic partitioned graphs having $n$ vertices, under some mild conditions, including low temperature regions where the Gibbs sampler provably mixes exponentially slowly. Our experiments also confirm that the Swendsen-Wang sampler significantly outperforms the Gibbs sampler when they are used for learning parameters of attractive GMs.
Tasks
Published 2017-04-06
URL http://arxiv.org/abs/1704.02232v1
PDF http://arxiv.org/pdf/1704.02232v1.pdf
PWC https://paperswithcode.com/paper/rapid-mixing-swendsen-wang-sampler-for
Repo
Framework
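
As a rough illustration of the Swendsen-Wang dynamics mentioned in the abstract, the sketch below performs one textbook update for a ferromagnetic Ising model on an arbitrary edge list. It is not the authors' code: the function name `swendsen_wang_step`, the uniform coupling, and the energy convention stated in the docstring are all assumptions.

```python
import math
import random


def swendsen_wang_step(spins, edges, beta, coupling=1.0, rng=random):
    """One Swendsen-Wang update for a ferromagnetic Ising model.

    spins: list of +1/-1 values, one per vertex.
    edges: list of (i, j) vertex index pairs.
    Assumed energy: E = -coupling * sum over edges of spins[i] * spins[j].
    """
    n = len(spins)
    parent = list(range(n))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Open a bond between equal neighbouring spins with this probability.
    p_bond = 1.0 - math.exp(-2.0 * beta * coupling)
    for i, j in edges:
        if spins[i] == spins[j] and rng.random() < p_bond:
            parent[find(i)] = find(j)

    # Give every connected cluster a fresh uniformly random spin.
    cluster_spin = {}
    new_spins = []
    for v in range(n):
        root = find(v)
        if root not in cluster_spin:
            cluster_spin[root] = rng.choice((-1, 1))
        new_spins.append(cluster_spin[root])
    return new_spins
```

Because whole clusters are resampled at once, a single step can flip many aligned spins together, whereas a Gibbs sampler changes one vertex at a time; this is the bottleneck the abstract refers to at low temperature.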

On the Futility of Learning Complex Frame-Level Language Models for Chord Recognition

Title On the Futility of Learning Complex Frame-Level Language Models for Chord Recognition
Authors Filip Korzeniowski, Gerhard Widmer
Abstract Chord recognition systems use temporal models to post-process frame-wise chord predictions from acoustic models. Traditionally, first-order models such as Hidden Markov Models were used for this task, with recent works suggesting to apply Recurrent Neural Networks instead. Due to their ability to learn longer-term dependencies, these models are supposed to learn and to apply musical knowledge, instead of just smoothing the output of the acoustic model. In this paper, we argue that learning complex temporal models at the level of audio frames is futile on principle, and that non-Markovian models do not perform better than their first-order counterparts. We support our argument through three experiments on the McGill Billboard dataset. The first two show 1) that when learning complex temporal models at the frame level, improvements in chord sequence modelling are marginal; and 2) that these improvements do not translate when applied within a full chord recognition system. The third, still rather preliminary experiment gives first indications that the use of complex sequential models for chord prediction at higher temporal levels might be more promising.
Tasks Chord Recognition
Published 2017-02-01
URL http://arxiv.org/abs/1702.00178v2
PDF http://arxiv.org/pdf/1702.00178v2.pdf
PWC https://paperswithcode.com/paper/on-the-futility-of-learning-complex-frame
Repo
Framework

Transfer Adversarial Hashing for Hamming Space Retrieval

Title Transfer Adversarial Hashing for Hamming Space Retrieval
Authors Zhangjie Cao, Mingsheng Long, Chao Huang, Jianmin Wang
Abstract Hashing is widely applied to large-scale image retrieval due to the storage and retrieval efficiency. Existing work on deep hashing assumes that the database in the target domain is identically distributed with the training set in the source domain. This paper relaxes this assumption to a transfer retrieval setting, which allows the database and the training set to come from different but relevant domains. However, the transfer retrieval setting will introduce two technical difficulties: first, the hash model trained on the source domain cannot work well on the target domain due to the large distribution gap; second, the domain gap makes it difficult to concentrate the database points to be within a small Hamming ball. As a consequence, transfer retrieval performance within Hamming Radius 2 degrades significantly in existing hashing methods. This paper presents Transfer Adversarial Hashing (TAH), a new hybrid deep architecture that incorporates a pairwise $t$-distribution cross-entropy loss to learn concentrated hash codes and an adversarial network to align the data distributions between the source and target domains. TAH can generate compact transfer hash codes for efficient image retrieval on both source and target domains. Comprehensive experiments validate that TAH yields state of the art Hamming space retrieval performance on standard datasets.
Tasks Image Retrieval
Published 2017-12-13
URL http://arxiv.org/abs/1712.04616v1
PDF http://arxiv.org/pdf/1712.04616v1.pdf
PWC https://paperswithcode.com/paper/transfer-adversarial-hashing-for-hamming
Repo
Framework
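
The abstract evaluates retrieval within Hamming radius 2. The sketch below shows what that lookup means for binary hash codes stored as integers; it says nothing about how TAH learns the codes. The function names and the 8-bit example codes are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def within_hamming_ball(query: int, database: list, radius: int = 2) -> list:
    """Indices of database codes at Hamming distance <= radius from the query."""
    return [
        index
        for index, code in enumerate(database)
        if hamming_distance(query, code) <= radius
    ]


# Illustrative 8-bit codes; real hash codes are typically longer.
database = [0b10110010, 0b10110011, 0b10110101, 0b01001101]
hits = within_hamming_ball(0b10110010, database, radius=2)
```

Only codes concentrated near the query fall inside the ball, which is why the abstract stresses learning concentrated hash codes when source and target domains differ.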