Paper Group ANR 1003
Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss. Toward Extractive Summarization of Online Forum Discussions via Hierarchical Attention Networks. Information Maximizing Exploration with a Latent Dynamics Model. Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images. Fine-tuning the Ant Colony System algorithm through Particle Swarm Optimization. Neural Task Planning with And-Or Graph Representations. Mixup-Based Acoustic Scene Classification Using Multi-Channel Convolutional Neural Network. Quantitative Susceptibility Map Reconstruction Using Annihilating Filter-based Low-Rank Hankel Matrix Approach. Alpha-rooting color image enhancement method by two-side 2-D quaternion discrete Fourier transform followed by spatial transformation. Computed Tomography Image Enhancement using 3D Convolutional Neural Network. Hybrid ASP-based Approach to Pattern Mining. CT Image Enhancement Using Stacked Generative Adversarial Networks and Transfer Learning for Lesion Segmentation Improvement. The Shape of Art History in the Eyes of the Machine. Data-driven Probabilistic Atlases Capture Whole-brain Individual Variation. Critical Percolation as a Framework to Analyze the Training of Deep Networks.
Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss
Title | Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss |
Authors | Stephen Mussmann, Percy Liang |
Abstract | Uncertainty sampling, a popular active learning algorithm, is used to reduce the amount of data required to learn a classifier, but it has been observed in practice to converge to different parameters depending on the initialization and sometimes to even better parameters than standard training on all the data. In this work, we give a theoretical explanation of this phenomenon, showing that uncertainty sampling on a convex loss can be interpreted as performing a preconditioned stochastic gradient step on a smoothed version of the population zero-one loss that converges to the population zero-one loss. Furthermore, uncertainty sampling moves in a descent direction and converges to stationary points of the smoothed population zero-one loss. Experiments on synthetic and real datasets support this connection. |
Tasks | Active Learning |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.01815v1 |
http://arxiv.org/pdf/1812.01815v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-sampling-is-preconditioned |
Repo | |
Framework | |
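The core loop the abstract describes is easy to sketch. Below is a minimal uncertainty-sampling loop in Python, assuming a scikit-learn logistic-regression learner and a binary pool; it is an illustration of the algorithm being analyzed, not the authors' code, and all names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sampling(X_pool, y_pool, n_seed=10, n_queries=100, seed=0):
    """Iteratively label the most uncertain point and refit the model.
    Assumes the random seed set happens to contain both classes."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), size=n_seed, replace=False))
    model = LogisticRegression()
    for _ in range(n_queries):
        model.fit(X_pool[labeled], y_pool[labeled])
        proba = model.predict_proba(X_pool)[:, 1]
        # Query the point closest to the decision boundary (max uncertainty),
        # skipping points that are already labeled.
        uncertainty = -np.abs(proba - 0.5)
        uncertainty[labeled] = -np.inf
        labeled.append(int(np.argmax(uncertainty)))
    return model, labeled
```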
Toward Extractive Summarization of Online Forum Discussions via Hierarchical Attention Networks
Title | Toward Extractive Summarization of Online Forum Discussions via Hierarchical Attention Networks |
Authors | Sansiri Tarnpradab, Fei Liu, Kien A. Hua |
Abstract | Forum threads are lengthy and rich in content. Concise thread summaries will benefit both newcomers seeking information and those who participate in the discussion. Few studies, however, have examined the task of forum thread summarization. In this work we make the first attempt to adapt the hierarchical attention networks for thread summarization. The model draws on the recent development of neural attention mechanisms to build sentence and thread representations and use them for summarization. Our results indicate that the proposed approach can outperform a range of competitive baselines. Further, a redundancy removal step is crucial for achieving outstanding results. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10390v2 |
http://arxiv.org/pdf/1805.10390v2.pdf | |
PWC | https://paperswithcode.com/paper/toward-extractive-summarization-of-online |
Repo | |
Framework | |
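As a toy illustration of the two-level attention pooling the abstract sketches (word vectors → sentence vectors → thread vector), here is a numpy-only sketch with random embeddings; real hierarchical attention networks use trained encoders and attention parameters, and every name here is an assumption.

```python
import numpy as np

def attend(H, w):
    """Attention-weighted average of the rows of H (illustrative)."""
    scores = np.tanh(H) @ w                       # unnormalized scores
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax weights
    return alpha @ H, alpha

d = 8  # embedding size (placeholder)
sentences = [np.random.randn(n, d) for n in (5, 7, 4)]  # stand-in word vectors
w_word, w_sent = np.random.randn(d), np.random.randn(d)

# Word-level attention pools each sentence; sentence-level attention pools the thread.
sent_vecs = np.stack([attend(S, w_word)[0] for S in sentences])
thread_vec, sent_alpha = attend(sent_vecs, w_sent)
# sent_alpha can be read as extractive importance scores per sentence.
```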
Information Maximizing Exploration with a Latent Dynamics Model
Title | Information Maximizing Exploration with a Latent Dynamics Model |
Authors | Trevor Barron, Oliver Obst, Heni Ben Amor |
Abstract | All reinforcement learning algorithms must handle the trade-off between exploration and exploitation. Many state-of-the-art deep reinforcement learning methods use noise in the action selection, such as Gaussian noise in policy gradient methods or $\epsilon$-greedy in Q-learning. While these methods are appealing due to their simplicity, they do not explore the state space in a methodical manner. We present an approach that uses a model to derive reward bonuses as a means of intrinsic motivation to improve model-free reinforcement learning. A key insight of our approach is that this dynamics model can be learned in the latent feature space of a value function, representing the dynamics of the agent and the environment. This method is both theoretically grounded and computationally advantageous, permitting the efficient use of Bayesian information-theoretic methods in high-dimensional state spaces. We evaluate our method on several continuous control tasks, focusing on improving exploration. |
Tasks | Continuous Control, Policy Gradient Methods, Q-Learning |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01238v1 |
http://arxiv.org/pdf/1804.01238v1.pdf | |
PWC | https://paperswithcode.com/paper/information-maximizing-exploration-with-a |
Repo | |
Framework | |
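The paper derives a Bayesian information-theoretic bonus in the latent space of the value function; the sketch below computes the information gain of a one-dimensional Bayesian linear dynamics model over latent features, which is one simple instantiation of that idea. The noise variance and all names are assumptions, not the paper's settings.

```python
import numpy as np

def info_gain_bonus(Sigma, phi_s, noise_var=0.1):
    """Information gain (entropy reduction, in nats) of a Bayesian linear
    latent-dynamics model after observing one transition with features phi_s."""
    # Rank-one posterior update for Bayesian linear regression.
    Sigma_post = Sigma - (Sigma @ np.outer(phi_s, phi_s) @ Sigma) / (
        noise_var + phi_s @ Sigma @ phi_s)
    # Gain = 0.5 * (log det Sigma_prior - log det Sigma_post).
    _, logdet_prior = np.linalg.slogdet(Sigma)
    _, logdet_post = np.linalg.slogdet(Sigma_post)
    return 0.5 * (logdet_prior - logdet_post), Sigma_post

# r_total = r_env + beta * info_gain_bonus(Sigma, phi(s), ...)[0]
```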
Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images
Title | Deep Adaptive Learning for Writer Identification based on Single Handwritten Word Images |
Authors | Sheng He, Lambert Schomaker |
Abstract | There are two types of information in each handwritten word image: explicit information which can be easily read or derived directly, such as lexical content or word length, and implicit attributes such as the author’s identity. Whether features learned by a neural network for one task can be used for another task remains an open question. In this paper, we present a deep adaptive learning method for writer identification based on single-word images using multi-task learning. An auxiliary task is added to the training process to enforce the emergence of reusable features. Our proposed method transfers the benefits of the learned features of a convolutional neural network from an auxiliary task such as explicit content recognition to the main task of writer identification in a single procedure. Specifically, we propose a new adaptive convolutional layer to exploit the learned deep features. A multi-task neural network with one or several adaptive convolutional layers is trained end-to-end, to exploit robust generic features for a specific main task, i.e., writer identification. Three auxiliary tasks, corresponding to three explicit attributes of handwritten word images (lexical content, word length and character attributes), are evaluated. Experimental results on two benchmark datasets show that the proposed deep adaptive learning method can improve the performance of writer identification based on single-word images, compared to non-adaptive and simple linear-adaptive approaches. |
Tasks | Multi-Task Learning |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.10954v1 |
http://arxiv.org/pdf/1809.10954v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-adaptive-learning-for-writer |
Repo | |
Framework | |
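For orientation, the generic multi-task setup the abstract builds on (shared trunk, writer-identification head plus an auxiliary head such as word length) looks like the PyTorch sketch below. It deliberately omits the paper's adaptive convolutional layer, and all layer sizes are hypothetical.

```python
import torch.nn as nn

class MultiTaskWriterNet(nn.Module):
    """Shared features feed a main head (writer ID) and an auxiliary head."""
    def __init__(self, n_writers, n_lengths):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.writer_head = nn.Linear(32, n_writers)
        self.aux_head = nn.Linear(32, n_lengths)

    def forward(self, x):
        f = self.trunk(x)
        return self.writer_head(f), self.aux_head(f)

# Joint training: loss = ce(writer_logits, y_writer) + lam * ce(aux_logits, y_length)
```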
Fine-tuning the Ant Colony System algorithm through Particle Swarm Optimization
Title | Fine-tuning the Ant Colony System algorithm through Particle Swarm Optimization |
Authors | D Gómez-Cabrero, D. N. Ranasinghe |
Abstract | Ant Colony System (ACS) is a distributed (agent-based) algorithm which has been widely studied on the Symmetric Travelling Salesman Problem (TSP). The optimum parameters for this algorithm have to be found by trial and error. We use a Particle Swarm Optimization (PSO) algorithm to optimize the ACS parameters on a designed subset of TSP instances. The first goal is to run the hybrid PSO-ACS algorithm on a single instance to find the optimum parameters and optimum solutions for that instance. The second goal is to analyze those sets of optimum parameters in relation to instance characteristics. Computational results show good-quality solutions for single instances, though with high computational times, and suggest that some parameter sets work optimally for a majority of instances. |
Tasks | |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.08353v1 |
http://arxiv.org/pdf/1803.08353v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-tuning-the-ant-colony-system-algorithm |
Repo | |
Framework | |
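Since the abstract describes PSO searching ACS's parameter space, here is a minimal PSO in numpy. In the paper's setting the objective would run ACS on a TSP instance and return the tour length; it is stubbed with a quadratic below, and the inertia/acceleration coefficients are conventional defaults, not the paper's settings.

```python
import numpy as np

def pso_tune(objective, bounds, n_particles=20, iters=50, seed=0):
    """Minimal particle swarm minimizer over box-constrained parameters
    (e.g., ACS pheromone/heuristic weights)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    x = rng.uniform(lo, hi, (n_particles, len(lo)))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.array([objective(p) for p in x])
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g

# Placeholder objective; a real run would evaluate ACS tour length instead.
best = pso_tune(lambda p: np.sum((p - 0.3) ** 2), np.array([[0.0, 1.0]] * 3))
```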
Neural Task Planning with And-Or Graph Representations
Title | Neural Task Planning with And-Or Graph Representations |
Authors | Tianshui Chen, Riquan Chen, Lin Nie, Xiaonan Luo, Xiaobai Liu, Liang Lin |
Abstract | This paper focuses on semantic task planning, i.e., predicting a sequence of actions toward accomplishing a specific task under a certain scene, which is a new problem in computer vision research. The primary challenges are how to model task-specific knowledge and how to integrate this knowledge into the learning procedure. In this work, we propose training a recurrent long short-term memory (LSTM) network to address this problem, i.e., taking a scene image (including pre-located objects) and the specified task as input and recurrently predicting action sequences. However, training such a network generally requires large numbers of annotated samples to cover the semantic space (e.g., diverse action decomposition and ordering). To overcome this issue, we introduce a knowledge and-or graph (AOG) for task description, which hierarchically represents a task as atomic actions. With this AOG representation, we can produce many valid samples (i.e., action sequences according to common sense) by training another auxiliary LSTM network with a small set of annotated samples. Furthermore, these generated samples (i.e., task-oriented action sequences) effectively facilitate training of the model for semantic task planning. In our experiments, we create a new dataset that contains diverse daily tasks and extensively evaluate the effectiveness of our approach. |
Tasks | Common Sense Reasoning |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.09284v1 |
http://arxiv.org/pdf/1808.09284v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-task-planning-with-and-or-graph |
Repo | |
Framework | |
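A tiny sketch of how an and-or graph yields many valid action sequences, as the abstract exploits for data augmentation: AND nodes execute all children in order, OR nodes choose one alternative, and leaves are atomic actions. The graph contents below are invented for illustration.

```python
import random

# Hypothetical AOG: node -> (kind, children); missing keys are leaves.
AOG = {
    "make_tea": ("AND", ["get_cup", "add_tea", "add_water"]),
    "add_water": ("OR", ["pour_from_kettle", "use_dispenser"]),
}

def sample_sequence(node):
    """Sample one valid atomic-action sequence from the AOG."""
    if node not in AOG:                    # leaf = atomic action
        return [node]
    kind, children = AOG[node]
    if kind == "AND":                      # expand every child in order
        return [a for c in children for a in sample_sequence(c)]
    return sample_sequence(random.choice(children))  # OR: pick one branch

print(sample_sequence("make_tea"))
```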
Mixup-Based Acoustic Scene Classification Using Multi-Channel Convolutional Neural Network
Title | Mixup-Based Acoustic Scene Classification Using Multi-Channel Convolutional Neural Network |
Authors | Kele Xu, Dawei Feng, Haibo Mi, Boqing Zhu, Dezhi Wang, Lilun Zhang, Hengxing Cai, Shuwen Liu |
Abstract | Audio scene classification, the problem of predicting class labels of audio scenes, has drawn much attention in recent years. However, it remains challenging in both accuracy and efficiency. Recently, Convolutional Neural Network (CNN)-based methods have achieved better performance than traditional methods. Nevertheless, a conventional single-channel CNN may fail to exploit the additional cues embedded in multi-channel recordings. In this paper, we explore the use of a multi-channel CNN for the classification task, which aims to extract features from different channels in an end-to-end manner. We evaluate it against conventional CNN and traditional Gaussian Mixture Model-based methods. Moreover, to further improve classification accuracy, this paper explores the use of the mixup method. In brief, mixup trains the neural network on linear combinations of pairs of audio scene examples and their labels. By employing mixup for data augmentation, the model provides higher prediction accuracy and robustness than previous models, while also reducing generalization error on the evaluation data. |
Tasks | Acoustic Scene Classification, Scene Classification |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07319v1 |
http://arxiv.org/pdf/1805.07319v1.pdf | |
PWC | https://paperswithcode.com/paper/mixup-based-acoustic-scene-classification |
Repo | |
Framework | |
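Mixup itself is nearly a two-liner; a sketch assuming one-hot label vectors (the Beta concentration alpha is the usual default, not a value from this paper):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Convex combination of two examples and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```

Training then proceeds on the mixed pairs instead of (or alongside) the raw examples, which is the regularization effect the abstract credits for the accuracy gain.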
Quantitative Susceptibility Map Reconstruction Using Annihilating Filter-based Low-Rank Hankel Matrix Approach
Title | Quantitative Susceptibility Map Reconstruction Using Annihilating Filter-based Low-Rank Hankel Matrix Approach |
Authors | Hyun-Seo Ahn, Sung-Hong Park, Jong Chul Ye |
Abstract | Quantitative susceptibility mapping (QSM) inevitably suffers from streaking artifacts caused by zeros on the conical surface of the dipole kernel in k-space. This work proposes a novel and accurate QSM reconstruction method based on direct k-space interpolation, avoiding both over-smoothing and streaking artifacts. Inspired by the recent theory of the annihilating filter-based low-rank Hankel matrix approach (ALOHA), the QSM reconstruction problem is formulated as a deconvolution problem under a low-rank Hankel matrix constraint in k-space. To reduce the computational complexity and memory requirement, the problem is solved as successive reconstructions of 2-D planes along the three independent axes of the 3-D phase image in the Fourier domain. Extensive experiments were performed to verify the proposed method and compare it with existing QSM reconstruction methods. The proposed ALOHA-QSM effectively reduced streaking artifacts and accurately estimated susceptibility values in deep gray matter structures compared to existing QSM methods. The suggested ALOHA-QSM algorithm solves the three-dimensional QSM dipole inversion problem without additional anatomical information or prior assumptions, and provides good image quality and quantitative accuracy. |
Tasks | |
Published | 2018-04-25 |
URL | https://arxiv.org/abs/1804.09396v2 |
https://arxiv.org/pdf/1804.09396v2.pdf | |
PWC | https://paperswithcode.com/paper/quantitative-susceptibility-map |
Repo | |
Framework | |
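A heavily simplified 1-D analogue of the structured low-rank interpolation: lift a k-space line into a Hankel matrix, truncate its rank, project back by anti-diagonal averaging, and re-impose the measured samples. The paper operates on 2-D planes of the 3-D phase data with a proper weighted formulation; the filter length, rank, and iteration count below are placeholders.

```python
import numpy as np

def low_rank_hankel_interp(kspace_row, filt_len=8, rank=4, iters=20):
    """Fill missing samples (NaNs) in one 1-D k-space line via alternating
    Hankel lift -> rank truncation -> anti-diagonal re-projection."""
    known = ~np.isnan(kspace_row)
    x = np.where(known, kspace_row, 0).astype(complex)
    n = len(x)
    for _ in range(iters):
        H = np.lib.stride_tricks.sliding_window_view(x, filt_len)  # Hankel lift
        U, s, Vh = np.linalg.svd(H, full_matrices=False)
        H = (U[:, :rank] * s[:rank]) @ Vh[:rank]                   # truncate rank
        # Project back to a sequence by averaging anti-diagonals.
        x_new, cnt = np.zeros(n, complex), np.zeros(n)
        for i in range(H.shape[0]):
            x_new[i:i + filt_len] += H[i]
            cnt[i:i + filt_len] += 1
        x = x_new / cnt
        x[known] = kspace_row[known]        # enforce data consistency
    return x
```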
Alpha-rooting color image enhancement method by two-side 2-D quaternion discrete Fourier transform followed by spatial transformation
Title | Alpha-rooting color image enhancement method by two-side 2-D quaternion discrete Fourier transform followed by spatial transformation |
Authors | Artyom M. Grigoryan, Aparna John, Sos S. Agaian |
Abstract | In this paper, a quaternion-based enhancement method is proposed in which the color in an image is treated as a single entity. This new method is referred to as alpha-rooting color image enhancement by the two-dimensional quaternion discrete Fourier transform (2-D QDFT) followed by a spatial transformation. The results of the proposed method are compared with its counterpart channel-by-channel enhancement algorithm based on the 2-D DFT. The enhancements are quantified with a measure based on visual perception, referred to as the color enhancement measure estimation (CEME). Preliminary experimental results show that the quaternion approach is an effective color image enhancement technique. |
Tasks | Image Enhancement |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07960v1 |
http://arxiv.org/pdf/1807.07960v1.pdf | |
PWC | https://paperswithcode.com/paper/alpha-rooting-color-image-enhancement-method |
Repo | |
Framework | |
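For orientation, the channel-by-channel baseline the abstract compares against is easy to sketch: alpha-rooting keeps each DFT coefficient's phase and raises its magnitude to a power alpha < 1, which relatively boosts high frequencies. The quaternion method treats the three color channels as one hypercomplex signal instead and is not shown; the clipping below assumes an 8-bit channel, and the output generally needs rescaling since the DC term shrinks.

```python
import numpy as np

def alpha_root(channel, alpha=0.92):
    """Classic per-channel alpha-rooting via the 2-D DFT (illustrative)."""
    F = np.fft.fft2(channel)
    F_enh = (np.abs(F) ** alpha) * np.exp(1j * np.angle(F))  # keep phase
    out = np.real(np.fft.ifft2(F_enh))
    return np.clip(out, 0, 255)
```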
Computed Tomography Image Enhancement using 3D Convolutional Neural Network
Title | Computed Tomography Image Enhancement using 3D Convolutional Neural Network |
Authors | Meng Li, Shiwen Shen, Wen Gao, William Hsu, Jason Cong |
Abstract | Computed tomography (CT) is increasingly being used for cancer screening, such as early detection of lung cancer. However, CT studies have varying pixel spacing due to differences in acquisition parameters. Thick-slice CTs have lower resolution, hindering tasks such as nodule characterization during computer-aided detection due to the partial volume effect. In this study, we propose a novel 3D enhancement convolutional neural network (3DECNN) to improve the spatial resolution of CT studies that were acquired at lower resolutions or larger slice thicknesses. Using a subset of the LIDC dataset consisting of 20,672 CT slices from 100 scans, we simulated lower-resolution, thick-section scans and then attempted to reconstruct the original images using our 3DECNN network. We observe a significant improvement over other state-of-the-art deep learning methods in PSNR (29.3087 dB vs. 28.8769 dB, p-value < 2.2e-16) and SSIM (0.8529 vs. 0.8449, p-value < 2.2e-16). |
Tasks | Computed Tomography (CT), Image Enhancement |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.06821v1 |
http://arxiv.org/pdf/1807.06821v1.pdf | |
PWC | https://paperswithcode.com/paper/computed-tomography-image-enhancement-using |
Repo | |
Framework | |
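The abstract reports PSNR in dB (SSIM is a separate, unitless index). For reference, a sketch of PSNR as typically computed over a reconstructed volume against the original:

```python
import numpy as np

def psnr(ref, test, data_range=None):
    """Peak signal-to-noise ratio in dB; data_range defaults to the
    reference volume's dynamic range."""
    ref, test = np.asarray(ref, float), np.asarray(test, float)
    if data_range is None:
        data_range = ref.max() - ref.min()
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)
```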
Hybrid ASP-based Approach to Pattern Mining
Title | Hybrid ASP-based Approach to Pattern Mining |
Authors | Sergey Paramonov, Daria Stepanova, Pauli Miettinen |
Abstract | Detecting small sets of relevant patterns from a given dataset is a central challenge in data mining. The relevance of a pattern is based on user-provided criteria; typically, all patterns that satisfy certain criteria are considered relevant. Rule-based languages like Answer Set Programming (ASP) seem well-suited for specifying such criteria in the form of constraints. Although progress has been made, on the one hand, on solving individual mining problems and, on the other hand, on developing generic mining systems, the existing methods focus either on scalability or on generality. In this paper we make steps towards combining local (frequency, size, cost) and global (various condensed representations like maximal, closed, skyline) constraints in a generic and efficient way. We present a hybrid approach for itemset, sequence and graph mining which exploits dedicated, highly optimized mining systems to detect frequent patterns and then filters the results using declarative ASP. To further demonstrate the generic nature of our hybrid framework we apply it to the problem of approximately tiling a database. Experiments on real-world datasets show the effectiveness of the proposed method and computational gains for itemset, sequence and graph mining, as well as approximate tiling. Under consideration in Theory and Practice of Logic Programming (TPLP). |
Tasks | |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07302v1 |
http://arxiv.org/pdf/1808.07302v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-asp-based-approach-to-pattern-mining |
Repo | |
Framework | |
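The "mine fast, then filter declaratively" split can be shown in miniature: a dedicated miner returns frequent itemsets, and a declarative condition (maximality, here) filters them. The paper expresses the filter in ASP; plain Python stands in below, with invented data.

```python
def maximal_only(frequent):
    """Keep only maximal itemsets, i.e., those with no frequent proper
    superset (a global condensed-representation constraint)."""
    fs = [frozenset(s) for s in frequent]
    return [s for s in fs if not any(s < t for t in fs)]

frequent = [{"a"}, {"b"}, {"a", "b"}, {"c"}]    # output of a fast miner
print(maximal_only(frequent))                   # keeps {'a','b'} and {'c'}
```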
CT Image Enhancement Using Stacked Generative Adversarial Networks and Transfer Learning for Lesion Segmentation Improvement
Title | CT Image Enhancement Using Stacked Generative Adversarial Networks and Transfer Learning for Lesion Segmentation Improvement |
Authors | Youbao Tang, Jinzheng Cai, Le Lu, Adam P. Harrison, Ke Yan, Jing Xiao, Lin Yang, Ronald M. Summers |
Abstract | Automated lesion segmentation from computed tomography (CT) is an important and challenging task in medical image analysis. While many advancements have been made, there is room for continued improvement. One hurdle is that CT images can exhibit high noise and low contrast, particularly at lower dosages. To address this, we focus on a preprocessing method for CT images that uses a stacked generative adversarial network (SGAN) approach. The first GAN reduces the noise in the CT image and the second GAN generates a higher-resolution image with enhanced boundaries and high contrast. To make up for the absence of high-quality CT images, we detail how to synthesize a large number of low- and high-quality natural images and use transfer learning with progressively larger amounts of CT images. We apply both the classic GrabCut method and the modern holistically nested network (HNN) to lesion segmentation, testing whether SGAN can yield improved lesion segmentation. Experimental results on the DeepLesion dataset demonstrate that the SGAN enhancements alone can push GrabCut performance above that of an HNN trained on original images. We also demonstrate that HNN + SGAN performs best against four other enhancement methods, including using only a single GAN. |
Tasks | Computed Tomography (CT), Image Enhancement, Lesion Segmentation, Transfer Learning |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.07144v1 |
http://arxiv.org/pdf/1807.07144v1.pdf | |
PWC | https://paperswithcode.com/paper/ct-image-enhancement-using-stacked-generative |
Repo | |
Framework | |
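As an illustration of the classic segmentation back-end named in the abstract: a hedged sketch of running OpenCV's GrabCut inside a lesion bounding box on an already-enhanced image. The bounding box, preprocessing, and conversion to 8-bit BGR are assumptions, and the SGAN enhancement itself is not shown.

```python
import cv2
import numpy as np

def grabcut_lesion(enhanced_bgr, rect):
    """Binary lesion mask from GrabCut; enhanced_bgr must be uint8 BGR,
    rect = (x, y, w, h) around the lesion."""
    mask = np.zeros(enhanced_bgr.shape[:2], np.uint8)
    bgd = np.zeros((1, 65), np.float64)   # internal GMM state
    fgd = np.zeros((1, 65), np.float64)
    cv2.grabCut(enhanced_bgr, mask, rect, bgd, fgd, 5, cv2.GC_INIT_WITH_RECT)
    # Definite + probable foreground -> 1, everything else -> 0.
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0)
```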
The Shape of Art History in the Eyes of the Machine
Title | The Shape of Art History in the Eyes of the Machine |
Authors | Ahmed Elgammal, Marian Mazzone, Bingchen Liu, Diana Kim, Mohamed Elhoseiny |
Abstract | How does the machine classify styles in art? And how does it relate to art historians’ methods for analyzing style? Several studies have shown the ability of the machine to learn and predict style categories, such as Renaissance, Baroque, Impressionism, etc., from images of paintings. This implies that the machine can learn an internal representation encoding discriminative features through its visual analysis. However, such a representation is not necessarily interpretable. We conducted a comprehensive study of several state-of-the-art convolutional neural networks applied to the task of style classification on 77K images of paintings, and analyzed the learned representation through correlation analysis with concepts derived from art history. Surprisingly, the networks could place the works of art in a smooth temporal arrangement mainly based on learning style labels, without any a priori knowledge of the time of creation, the historical context of styles, or relations between styles. The learned representations showed that a few underlying factors explain the visual variation of style in art. Some of these factors were found to correlate with the style patterns suggested by Heinrich Wölfflin (1864-1945). The learned representations also consistently highlighted certain artists as extremely distinctive representatives of their styles, which quantitatively confirms art historians’ observations. |
Tasks | |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07729v2 |
http://arxiv.org/pdf/1801.07729v2.pdf | |
PWC | https://paperswithcode.com/paper/the-shape-of-art-history-in-the-eyes-of-the |
Repo | |
Framework | |
Data-driven Probabilistic Atlases Capture Whole-brain Individual Variation
Title | Data-driven Probabilistic Atlases Capture Whole-brain Individual Variation |
Authors | Yuankai Huo, Katherine Swett, Susan M. Resnick, Laurie E. Cutting, Bennett A. Landman |
Abstract | Probabilistic atlases provide essential spatial contextual information for image interpretation, Bayesian modeling, and algorithmic processing. Such atlases are typically constructed by grouping subjects with similar demographic information; importantly, use of the same scanner minimizes inter-group variability. However, the generalizability and spatial specificity of such approaches are more limited than one might like. Inspired by Commowick’s “Frankenstein’s creature” paradigm, which builds a person-specific anatomical atlas, we propose a data-driven framework to build a person-specific probabilistic atlas under a large-scale data scheme. The framework clusters regions with similar features using a point distribution model to learn different anatomical phenotypes. Regional structural atlases and corresponding regional probabilistic atlases are used as indices and targets in a dictionary. By indexing the dictionary, whole-brain probabilistic atlases adapt to each new subject quickly and can be used as spatial priors for visualization and processing. The novelties of this approach are: (1) it provides a new perspective on generating person-specific whole-brain probabilistic atlases (132 regions) under a data-driven scheme across sites; (2) the framework employs a large amount of heterogeneous data (2349 images); (3) it achieves low computational cost, since only one affine registration and one Pearson correlation operation are required for a new subject. Our method matches individual regions better, with higher Dice similarity, when testing the probabilistic atlases. Importantly, the advantage of the large-scale scheme is demonstrated by the better performance of a large training set (1888 images) over a smaller one (720 images). |
Tasks | |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02300v1 |
http://arxiv.org/pdf/1806.02300v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-probabilistic-atlases-capture |
Repo | |
Framework | |
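The dictionary-indexing step can be sketched compactly: after one affine registration, the new subject's regional feature vector is matched against the stored structural indices by Pearson correlation, and the paired probabilistic atlas of the best match is returned. The data structures below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def best_atlas(region_feature, dictionary):
    """dictionary: list of (structural_index_vector, prob_atlas) pairs.
    Returns the probabilistic atlas with the highest Pearson correlation."""
    def pearson(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = [pearson(region_feature, idx) for idx, _ in dictionary]
    return dictionary[int(np.argmax(scores))][1]
```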
Critical Percolation as a Framework to Analyze the Training of Deep Networks
Title | Critical Percolation as a Framework to Analyze the Training of Deep Networks |
Authors | Zohar Ringel, Rodrigo de Bem |
Abstract | In this paper we approach two relevant deep learning topics: i) the handling of graph-structured input data and ii) a better understanding and analysis of deep networks and related learning algorithms. With this in mind we focus on the topological classification of reachability in a particular subset of planar graphs (mazes). Doing so, we are able to model the topology of data while staying in Euclidean space, thus allowing its processing with standard CNN architectures. We suggest a suitable architecture for this problem and show that it can express a perfect solution to the classification task. The shape of the cost function around this solution is also derived and, remarkably, does not depend on the size of the maze in the large-maze limit. Responsible for this behavior are rare events in the dataset which strongly regulate the shape of the cost function near this global minimum. We further identify an obstacle to learning in the form of poorly performing local minima in which the network chooses to ignore some of the inputs. We support these claims with training experiments and numerical analysis of the cost function on networks with up to $128$ layers. |
Tasks | |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.02154v1 |
http://arxiv.org/pdf/1802.02154v1.pdf | |
PWC | https://paperswithcode.com/paper/critical-percolation-as-a-framework-to |
Repo | |
Framework | |
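Ground-truth labels for the maze-reachability task are cheap to generate; below is a BFS sketch (a standard construction for illustration, not the paper's code), where the maze is a 0/1 grid with 1 = free cell.

```python
from collections import deque

def reachable(maze, start, goal):
    """True iff goal is reachable from start through free cells (4-connectivity)."""
    h, w = maze.shape
    seen, q = {start}, deque([start])
    while q:
        r, c = q.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and maze[nr, nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                q.append((nr, nc))
    return False
```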