May 6, 2019

3194 words · 15 min read

Paper Group ANR 211

Screening Rules for Convex Problems

Title Screening Rules for Convex Problems
Authors Anant Raj, Jakob Olbrich, Bernd Gärtner, Bernhard Schölkopf, Martin Jaggi
Abstract We propose a new framework for deriving screening rules for convex optimization problems. Our approach covers a large class of constrained and penalized optimization formulations, and works in two steps. First, given any approximate point, the structure of the objective function and the duality gap is used to gather information on the optimal solution. In the second step, this information is used to produce screening rules, i.e. safely identifying unimportant weight variables of the optimal solution. Our general framework leads to a large variety of useful existing as well as new screening rules for many applications. For example, we provide new screening rules for general simplex and $L_1$-constrained problems, Elastic Net, squared-loss Support Vector Machines, minimum enclosing ball, as well as structured norm regularized problems, such as group lasso.
Tasks
Published 2016-09-23
URL http://arxiv.org/abs/1609.07478v1
PDF http://arxiv.org/pdf/1609.07478v1.pdf
PWC https://paperswithcode.com/paper/screening-rules-for-convex-problems
Repo
Framework
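The two-step recipe in the abstract (use the duality gap around any approximate point, then safely discard weights) is easiest to see in its best-known special case. Below is a sketch of a gap-safe sphere rule for the plain Lasso, which is one instance of the framework rather than the paper's general construction; the function name and the specific dual rescaling are illustrative.

```python
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Gap-safe screening sketch for the Lasso
    min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.
    Returns a boolean mask: True means the weight is provably zero
    at the optimum and can be screened out."""
    residual = y - X @ w
    # Rescale the residual to obtain a dual-feasible point theta
    # (so that ||X^T theta||_inf <= 1).
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    # Primal and dual objectives give the duality gap.
    primal = 0.5 * residual @ residual + lam * np.abs(w).sum()
    dual = 0.5 * (y @ y - lam ** 2 * np.sum((y / lam - theta) ** 2))
    gap = max(primal - dual, 0.0)
    # Sphere of this radius around theta contains the dual optimum.
    radius = np.sqrt(2 * gap) / lam
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0
```

The closer `w` is to the optimum, the smaller the gap and the radius, so more variables fall strictly inside the dual constraint and can be safely removed.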

Vietnamese Named Entity Recognition using Token Regular Expressions and Bidirectional Inference

Title Vietnamese Named Entity Recognition using Token Regular Expressions and Bidirectional Inference
Authors Phuong Le-Hong
Abstract This paper describes an efficient approach to improve the accuracy of a named entity recognition system for Vietnamese. The approach combines regular expressions over tokens and a bidirectional inference method in a sequence labelling model. The proposed method achieves an overall $F_1$ score of 89.66% on a test set of an evaluation campaign, organized in late 2016 by the Vietnamese Language and Speech Processing (VLSP) community.
Tasks Named Entity Recognition
Published 2016-10-18
URL http://arxiv.org/abs/1610.05652v2
PDF http://arxiv.org/pdf/1610.05652v2.pdf
PWC https://paperswithcode.com/paper/vietnamese-named-entity-recognition-using
Repo
Framework
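To make "regular expressions over tokens" concrete, here is a small sketch of the idea: patterns are stated over whole tokens (titles, capitalized syllables) rather than over characters. The title list and the specific pattern are illustrative assumptions, not the paper's actual rule set.

```python
import re

# Hypothetical token-level rule: a run of capitalized syllables following
# a personal title is a PERSON candidate (illustrative only).
TITLES = {"Ông", "Bà", "Anh", "Chị"}  # roughly "Mr", "Mrs", ...
CAP = re.compile(r"^[A-ZĐ][a-zà-ỹ]*$")  # one capitalized Vietnamese syllable

def person_candidates(tokens):
    """Return (start, end) token spans of candidate person names."""
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i] in TITLES:
            j = i + 1
            while j < len(tokens) and CAP.match(tokens[j]):
                j += 1
            if j > i + 1:
                spans.append((i + 1, j))
            i = j
        else:
            i += 1
    return spans
```

In the paper's setup, matches like these feed features into a sequence labelling model that is then decoded with bidirectional inference, rather than being used as hard rules.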

Medical Image Synthesis with Context-Aware Generative Adversarial Networks

Title Medical Image Synthesis with Context-Aware Generative Adversarial Networks
Authors Dong Nie, Roger Trullo, Caroline Petitjean, Su Ruan, Dinggang Shen
Abstract Computed tomography (CT) is critical for various clinical applications, e.g., radiotherapy treatment planning and PET attenuation correction. However, CT exposes patients to radiation during acquisition, which may cause side effects. Compared to CT, magnetic resonance imaging (MRI) is much safer and involves no radiation. Recently, therefore, researchers have been strongly motivated to estimate a CT image from the corresponding MR image of the same subject for radiotherapy planning. In this paper, we propose a data-driven approach to this challenging problem. Specifically, we train a fully convolutional network to generate CT given an MR image. To better model the nonlinear relationship from MRI to CT and to produce more realistic images, we propose to use an adversarial training strategy and an image gradient difference loss function. We further apply an Auto-Context Model to implement a context-aware generative adversarial network. Experimental results show that our method is accurate and robust for predicting CT images from MR images, and outperforms three state-of-the-art methods under comparison.
Tasks Computed Tomography (CT), Image Generation
Published 2016-12-16
URL http://arxiv.org/abs/1612.05362v1
PDF http://arxiv.org/pdf/1612.05362v1.pdf
PWC https://paperswithcode.com/paper/medical-image-synthesis-with-context-aware
Repo
Framework
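The image gradient difference loss mentioned in the abstract penalizes mismatch between the spatial gradients of the synthesized and real CT, which encourages sharper edges than a plain intensity loss. A framework-agnostic NumPy sketch of the idea (the paper's exact formulation and norm may differ):

```python
import numpy as np

def gradient_difference_loss(pred, target):
    """Mean squared difference between absolute spatial gradients of a
    predicted and a target 2-D image (sketch of a gradient difference
    loss; in training this would be a differentiable framework op)."""
    def grads(img):
        gx = np.abs(np.diff(img, axis=0))  # vertical gradient
        gy = np.abs(np.diff(img, axis=1))  # horizontal gradient
        return gx, gy
    pgx, pgy = grads(pred)
    tgx, tgy = grads(target)
    return np.mean((pgx - tgx) ** 2) + np.mean((pgy - tgy) ** 2)
```

This term is zero when the two images share the same edge structure, even if their absolute intensities differ by a constant.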

Selecting Efficient Features via a Hyper-Heuristic Approach

Title Selecting Efficient Features via a Hyper-Heuristic Approach
Authors Mitra Montazeri, Mahdieh Soleymani Baghshah, Aliakbar Niknafs
Abstract With the emergence of huge databases and the need for efficient learning algorithms on these datasets, new problems have appeared, and methods have been proposed to solve them by selecting efficient features. Feature selection is the problem of finding, among all features, an efficient subset that can improve accuracy and reduce complexity. One way to solve this problem is to evaluate all possible feature subsets; however, this exhaustive search has high computational complexity. Many heuristic algorithms have been studied for solving this problem. Hyper-heuristics are a newer approach that can search the solution space effectively by applying local searches appropriately, where each local search is a neighborhood-searching algorithm. Since each region of the solution space can have its own characteristics, an appropriate local search should be chosen and applied to the current solution. This task is delegated to a supervisor, which chooses a local search based on the performance history of the local searches and can thereby trade off between exploitation and exploration. Since existing heuristics cannot make this trade-off appropriately, they do not search the solution space well and thus have low convergence rates. In this paper, we use, for the first time, a hyper-heuristic approach to find an efficient feature subset. In the proposed method, a genetic algorithm is used as the supervisor and 16 heuristic algorithms are used as local searches. An empirical study of the proposed method on several commonly used UCI data sets indicates that it outperforms recent feature selection methods in the literature.
Tasks Feature Selection
Published 2016-01-20
URL http://arxiv.org/abs/1601.05409v1
PDF http://arxiv.org/pdf/1601.05409v1.pdf
PWC https://paperswithcode.com/paper/selecting-efficient-features-via-a-hyper
Repo
Framework
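The supervisor/local-search split described above can be sketched in a few lines. This is a toy version only: the paper uses a genetic algorithm as the supervisor over 16 local searches, whereas here the supervisor simply favors local searches in proportion to their past successes, and `flip_one` stands in for one neighborhood move.

```python
import random

def flip_one(subset):
    # Example local search: toggle the inclusion of one random feature.
    out = subset[:]
    i = random.randrange(len(out))
    out[i] = not out[i]
    return out

def hyper_heuristic_select(evaluate, n_features, local_searches, iters=200):
    """Toy hyper-heuristic sketch: the supervisor picks a local search
    using the functional history of each search (its win count), trading
    off exploitation of good searches against exploration of others."""
    current = [random.random() < 0.5 for _ in range(n_features)]
    best_score = evaluate(current)
    wins = [1.0] * len(local_searches)  # functional history of each search
    for _ in range(iters):
        k = random.choices(range(len(local_searches)), weights=wins)[0]
        candidate = local_searches[k](current)
        score = evaluate(candidate)
        if score > best_score:           # accept only improvements
            current, best_score = candidate, score
            wins[k] += 1.0               # credit the successful local search
    return current, best_score
```

Here `evaluate` would be a wrapper around a classifier's validation accuracy on the selected feature subset.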

Using Virtual Humans to Understand Real Ones

Title Using Virtual Humans to Understand Real Ones
Authors Katie Hoemann, Behnaz Rezaei, Stacy C. Marsella, Sarah Ostadabbas
Abstract Human interactions are characterized by explicit as well as implicit channels of communication. While the explicit channel transmits overt messages, the implicit ones transmit hidden messages about the communicator (e.g., his/her intentions and attitudes). There is a growing consensus that providing a computer with the ability to manipulate implicit affective cues should allow for a more meaningful and natural way of studying particular non-verbal signals of human-human communications by human-computer interactions. In this pilot study, we created a non-dynamic human-computer interaction while manipulating three specific non-verbal channels of communication: gaze pattern, facial expression, and gesture. Participants rated the virtual agent on affective dimensional scales (pleasure, arousal, and dominance) while their physiological signal (electrodermal activity, EDA) was captured during the interaction. Assessment of the behavioral data revealed a significant and complex three-way interaction between gaze, gesture, and facial configuration on the dimension of pleasure, as well as a main effect of gesture on the dimension of dominance. These results suggest a complex relationship between different non-verbal cues and the social context in which they are interpreted. Qualifying considerations as well as possible next steps are further discussed in light of these exploratory findings.
Tasks
Published 2016-06-13
URL http://arxiv.org/abs/1606.04165v1
PDF http://arxiv.org/pdf/1606.04165v1.pdf
PWC https://paperswithcode.com/paper/using-virtual-humans-to-understand-real-ones
Repo
Framework

Deep Clustering and Conventional Networks for Music Separation: Stronger Together

Title Deep Clustering and Conventional Networks for Music Separation: Stronger Together
Authors Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani
Abstract Deep clustering is the first method to handle general audio separation scenarios with multiple sources of the same type and an arbitrary number of sources, performing impressively in speaker-independent speech separation tasks. However, little is known about its effectiveness in other challenging situations such as music source separation. Contrary to conventional networks that directly estimate the source signals, deep clustering generates an embedding for each time-frequency bin, and separates sources by clustering the bins in the embedding space. We show that deep clustering outperforms conventional networks on a singing voice separation task, in both matched and mismatched conditions, even though conventional networks have the advantage of end-to-end training for best signal approximation, presumably because its more flexible objective engenders better regularization. Since the strengths of deep clustering and conventional network architectures appear complementary, we explore combining them in a single hybrid network trained via an approach akin to multi-task learning. Remarkably, the combination significantly outperforms either of its components.
Tasks Multi-Task Learning, Music Source Separation, Speech Separation
Published 2016-11-18
URL http://arxiv.org/abs/1611.06265v2
PDF http://arxiv.org/pdf/1611.06265v2.pdf
PWC https://paperswithcode.com/paper/deep-clustering-and-conventional-networks-for
Repo
Framework
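The separation step that distinguishes deep clustering from conventional mask-estimating networks can be sketched directly: given an embedding per time-frequency bin (produced by a network assumed trained elsewhere), k-means over the embeddings yields one binary mask per source. The plain k-means below, with deterministic initialization, is an illustrative stand-in.

```python
import numpy as np

def cluster_tf_bins(embeddings, n_sources, iters=50):
    """Cluster time-frequency bin embeddings ([n_bins, d]) into
    n_sources groups and return binary masks ([n_bins, n_sources]).
    Sketch of the deep-clustering separation step; initialization is
    deterministic for simplicity."""
    idx = np.linspace(0, len(embeddings) - 1, n_sources).astype(int)
    centers = embeddings[idx].copy()
    for _ in range(iters):
        # Assign each bin to its nearest center, then recompute centers.
        d = np.linalg.norm(embeddings[:, None, :] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        for k in range(n_sources):
            if np.any(labels == k):
                centers[k] = embeddings[labels == k].mean(axis=0)
    return np.eye(n_sources)[labels]  # one-hot mask per T-F bin
```

Each mask is then applied to the mixture spectrogram to recover one source, which is exactly where the hybrid network in the paper can refine the estimate with a conventional signal-approximation head.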

On statistical learning via the lens of compression

Title On statistical learning via the lens of compression
Authors Ofir David, Shay Moran, Amir Yehudayoff
Abstract This work continues the study of the relationship between sample compression schemes and statistical learning, which has been mostly investigated within the framework of binary classification. The central theme of this work is establishing equivalences between learnability and compressibility, and utilizing these equivalences in the study of statistical learning theory. We begin with the setting of multiclass categorization (zero/one loss). We prove that in this case learnability is equivalent to compression of logarithmic sample size, and that uniform convergence implies compression of constant size. We then consider Vapnik’s general learning setting: we show that in order to extend the compressibility-learnability equivalence to this case, it is necessary to consider an approximate variant of compression. Finally, we provide some applications of the compressibility-learnability equivalences: (i) Agnostic-case learnability and realizable-case learnability are equivalent in multiclass categorization problems (in terms of sample complexity). (ii) This equivalence between agnostic-case learnability and realizable-case learnability does not hold for general learning problems: There exists a learning problem whose loss function takes just three values, under which agnostic-case and realizable-case learnability are not equivalent. (iii) Uniform convergence implies compression of constant size in multiclass categorization problems. Part of the argument includes an analysis of the uniform convergence rate in terms of the graph dimension, in which we improve upon previous bounds. (iv) A dichotomy for sample compression in multiclass categorization problems: If a non-trivial compression exists then a compression of logarithmic size exists. (v) A compactness theorem for multiclass categorization problems.
Tasks
Published 2016-10-12
URL http://arxiv.org/abs/1610.03592v2
PDF http://arxiv.org/pdf/1610.03592v2.pdf
PWC https://paperswithcode.com/paper/on-statistical-learning-via-the-lens-of
Repo
Framework
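For readers unfamiliar with the central object, a sample compression scheme (in the standard Littlestone–Warmuth sense underlying these results) can be stated as follows:

```latex
A \emph{sample compression scheme} of size $k$ for a class $\mathcal{H}$
consists of a compression map $\kappa$ and a reconstruction map $\rho$ such
that for every sample $S = \{(x_1,y_1),\dots,(x_m,y_m)\}$ realizable by
$\mathcal{H}$:
\[
  \kappa(S) \subseteq S, \qquad |\kappa(S)| \le k, \qquad
  \rho(\kappa(S))(x_i) = y_i \ \ \text{for all } i \le m .
\]
```

"Compression of logarithmic sample size" in the abstract then means $k = O(\log m)$, and the approximate variant needed for Vapnik's general setting relaxes the exact-reconstruction condition to hold up to a small loss.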

Numerical Inversion of SRNF Maps for Elastic Shape Analysis of Genus-Zero Surfaces

Title Numerical Inversion of SRNF Maps for Elastic Shape Analysis of Genus-Zero Surfaces
Authors Hamid Laga, Qian Xie, Ian H. Jermyn, Anuj Srivastava
Abstract Recent developments in elastic shape analysis (ESA) are motivated by the fact that it provides comprehensive frameworks for simultaneous registration, deformation, and comparison of shapes. These methods achieve computational efficiency using certain square-root representations that transform invariant elastic metrics into Euclidean metrics, allowing for applications of standard algorithms and statistical tools. For analyzing shapes of embeddings of $\mathbb{S}^2$ in $\mathbb{R}^3$, Jermyn et al. introduced square-root normal fields (SRNFs) that transformed an elastic metric, with desirable invariant properties, into the $\mathbb{L}^2$ metric. These SRNFs are essentially surface normals scaled by square-roots of infinitesimal area elements. A critical need in shape analysis is to invert solutions (deformations, averages, modes of variations, etc) computed in the SRNF space, back to the original surface space for visualizations and inferences. Due to the lack of theory for understanding SRNF maps and their inverses, we take a numerical approach and derive an efficient multiresolution algorithm, based on solving an optimization problem in the surface space, that estimates surfaces corresponding to given SRNFs. This solution is found effective, even for complex shapes, e.g. human bodies and animals, that undergo significant deformations including bending and stretching. Specifically, we use this inversion for computing elastic shape deformations, transferring deformations, summarizing shapes, and for finding modes of variability in a given collection, while simultaneously registering the surfaces. We demonstrate the proposed algorithms using a statistical analysis of human body shapes, classification of generic surfaces and analysis of brain structures.
Tasks
Published 2016-10-14
URL http://arxiv.org/abs/1610.04531v1
PDF http://arxiv.org/pdf/1610.04531v1.pdf
PWC https://paperswithcode.com/paper/numerical-inversion-of-srnf-maps-for-elastic
Repo
Framework
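The SRNF map referenced above, as it is usually written (following Jermyn et al.), sends a parametrized surface $f:\mathbb{S}^2\to\mathbb{R}^3$ to a scaled normal field:

```latex
Q(f)(s) \;=\; \frac{n_f(s)}{\sqrt{|n_f(s)|}}, \qquad
n_f(s) = \frac{\partial f}{\partial u}(s) \times \frac{\partial f}{\partial v}(s),
```

so $|Q(f)(s)|^2 = |n_f(s)|$ recovers the infinitesimal area element. The paper's contribution is a numerical procedure for the inverse direction: given a target $q$ in SRNF space, find a surface $f$ minimizing $\|Q(f) - q\|_{\mathbb{L}^2}^2$ by multiresolution optimization over $f$.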

Spatio-Temporal Attention Models for Grounded Video Captioning

Title Spatio-Temporal Attention Models for Grounded Video Captioning
Authors Mihai Zanfir, Elisabeta Marinoiu, Cristian Sminchisescu
Abstract Automatic video captioning is challenging due to the complex interactions in dynamic real scenes. A comprehensive system would ultimately localize and track the objects, actions and interactions present in a video and generate a description that relies on temporal localization in order to ground the visual concepts. However, most existing automatic video captioning systems map from raw video data to high level textual description, bypassing localization and recognition, thus discarding potentially valuable information for content localization and generalization. In this work we present an automatic video captioning model that combines spatio-temporal attention and image classification by means of deep neural network structures based on long short-term memory. The resulting system is demonstrated to produce state-of-the-art results in the standard YouTube captioning benchmark while also offering the advantage of localizing the visual concepts (subjects, verbs, objects), with no grounding supervision, over space and time.
Tasks Image Classification, Temporal Localization, Video Captioning
Published 2016-10-17
URL http://arxiv.org/abs/1610.04997v2
PDF http://arxiv.org/pdf/1610.04997v2.pdf
PWC https://paperswithcode.com/paper/spatio-temporal-attention-models-for-grounded
Repo
Framework
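The two-stage attention idea (attend over spatial regions within each frame, then over frames, conditioned on the decoder state) can be sketched generically. This is an illustrative parameterization, not a reproduction of the paper's exact model; `Ws` and `Wt` are assumed bilinear scoring matrices.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatio_temporal_attention(features, query, Ws, Wt):
    """features: [T, R, d] (frames x regions x dim), query: [d] decoder
    state. Returns a [d] context vector after spatial then temporal
    attention (generic sketch)."""
    spatial_scores = features @ Ws @ query                  # [T, R]
    alpha = softmax(spatial_scores, axis=1)                 # region weights
    frame_vecs = (alpha[..., None] * features).sum(axis=1)  # [T, d]
    temporal_scores = frame_vecs @ Wt @ query               # [T]
    beta = softmax(temporal_scores, axis=0)                 # frame weights
    return (beta[:, None] * frame_vecs).sum(axis=0)         # [d]
```

The weights `alpha` and `beta` are what make the grounding claim testable: they localize, per generated word, which regions and frames the model attended to, without any grounding supervision.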

Title Generation for User Generated Videos

Title Title Generation for User Generated Videos
Authors Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, Min Sun
Abstract A great video title describes the most salient event compactly and captures the viewer’s attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight-sensitive by priming them with a highlight detector. Our framework allows for jointly training a model for title generation and video highlight localization. Second, we induce high sentence diversity in video captioners, so that the generated titles are also diverse and catchy. This means that a large number of sentences might be required to learn the sentence structure of titles. Hence, we propose a novel sentence augmentation method to train a captioner with additional sentence-only examples that come without corresponding videos. We collected a large-scale Video Titles in the Wild (VTW) dataset of 18,100 automatically crawled user-generated videos and titles. On VTW, our methods consistently improve title prediction accuracy, and achieve the best performance in both automatic and human evaluation. Finally, our sentence augmentation method also outperforms the baselines on the M-VAD dataset.
Tasks Video Captioning
Published 2016-08-25
URL http://arxiv.org/abs/1608.07068v2
PDF http://arxiv.org/pdf/1608.07068v2.pdf
PWC https://paperswithcode.com/paper/title-generation-for-user-generated-videos
Repo
Framework
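The sentence-augmentation idea can be sketched at the data-pipeline level: sentence-only examples are paired with a placeholder video input so the captioner's language decoder sees many more title-like sentences. The placeholder mechanism and function below are hypothetical illustrations, not the paper's implementation.

```python
import random

def augmented_batches(video_caption_pairs, extra_sentences, dummy_video,
                      batch_size=4):
    """Mix (video, caption) pairs with sentence-only examples that get a
    placeholder video input (sketch of sentence augmentation; in
    practice the dummy input might be a zero feature vector)."""
    pool = list(video_caption_pairs)
    pool += [(dummy_video, s) for s in extra_sentences]
    random.shuffle(pool)
    for i in range(0, len(pool), batch_size):
        yield pool[i:i + batch_size]
```

During training, the loss on placeholder examples only updates the language side of the model, which is what lets titles' sentence structure be learned from text alone.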

A Comment on Argumentation

Title A Comment on Argumentation
Authors Karl Schlechta
Abstract We use the theory of defaults and their meaning of [GS16] to develop (the outline of a) new theory of argumentation.
Tasks
Published 2016-12-17
URL http://arxiv.org/abs/1612.05756v1
PDF http://arxiv.org/pdf/1612.05756v1.pdf
PWC https://paperswithcode.com/paper/a-comment-on-argumentation
Repo
Framework

Bidirectional Long-Short Term Memory for Video Description

Title Bidirectional Long-Short Term Memory for Video Description
Authors Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen
Abstract Video captioning has been attracting broad research attention in the multimedia community. However, most existing approaches either ignore temporal information among video frames or just employ local contextual temporal knowledge. In this work, we propose a novel video captioning framework, termed \emph{Bidirectional Long-Short Term Memory} (BiLSTM), which deeply captures bidirectional global temporal structure in video. Specifically, we first devise a joint visual modelling approach to encode video data by combining a forward LSTM pass, a backward LSTM pass, and visual features from Convolutional Neural Networks (CNNs). Then, we inject the derived video representation into the subsequent language model for initialization. The benefits are twofold: 1) comprehensively preserving sequential and visual information; and 2) adaptively learning dense visual features and sparse semantic representations for videos and sentences, respectively. We verify the effectiveness of our proposed video captioning framework on a commonly-used benchmark, i.e., the Microsoft Video Description (MSVD) corpus, and the experimental results demonstrate the superiority of the proposed approach compared to several state-of-the-art methods.
Tasks Language Modelling, Video Captioning, Video Description
Published 2016-06-15
URL http://arxiv.org/abs/1606.04631v1
PDF http://arxiv.org/pdf/1606.04631v1.pdf
PWC https://paperswithcode.com/paper/bidirectional-long-short-term-memory-for
Repo
Framework
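The bidirectional encoding step above can be sketched with a simple recurrence standing in for a trained LSTM cell: run one pass forward and one backward over the frame features and concatenate the two final states to initialize the language model. The tanh cell is an illustrative stand-in, not the paper's architecture.

```python
import numpy as np

def rnn_pass(frames, W, U):
    # Simple tanh recurrence standing in for an LSTM cell (sketch only).
    h = np.zeros(W.shape[0])
    for x in frames:
        h = np.tanh(W @ h + U @ x)
    return h

def bidirectional_encode(frames, Wf, Uf, Wb, Ub):
    """Encode a [T, d] frame-feature sequence with a forward and a
    backward pass, concatenating the final states (the joint video
    representation injected into the language model)."""
    return np.concatenate([rnn_pass(frames, Wf, Uf),
                           rnn_pass(frames[::-1], Wb, Ub)])
```

The concatenated state sees the whole clip from both directions, which is what the abstract means by capturing bidirectional global temporal structure rather than only local context.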

Cross-Language Domain Adaptation for Classifying Crisis-Related Short Messages

Title Cross-Language Domain Adaptation for Classifying Crisis-Related Short Messages
Authors Muhammad Imran, Prasenjit Mitra, Jaideep Srivastava
Abstract Rapid crisis response requires real-time analysis of messages. After a disaster happens, volunteers attempt to classify tweets to determine needs, e.g., supplies, infrastructure damage, etc. Given labeled data, supervised machine learning can help classify these messages. Scarcity of labeled data, however, causes poor performance in model training. Can we reuse old tweets to train classifiers? How can we choose labeled tweets for training? Specifically, we study the usefulness of labeled data from past events. Do labeled tweets in a different language help? We observe the performance of our classifiers trained using different combinations of training sets obtained from past disasters. We perform extensive experimentation on real crisis datasets and show that past labels are useful when both source and target events are of the same type (e.g., both earthquakes). For similar languages (e.g., Italian and Spanish), cross-language domain adaptation was useful; however, for different languages (e.g., Italian and English), performance decreased.
Tasks Domain Adaptation
Published 2016-02-17
URL http://arxiv.org/abs/1602.05388v2
PDF http://arxiv.org/pdf/1602.05388v2.pdf
PWC https://paperswithcode.com/paper/cross-language-domain-adaptation-for
Repo
Framework

Predicting 1p19q Chromosomal Deletion of Low-Grade Gliomas from MR Images using Deep Learning

Title Predicting 1p19q Chromosomal Deletion of Low-Grade Gliomas from MR Images using Deep Learning
Authors Zeynettin Akkus, Issa Ali, Jiri Sedlar, Timothy L. Kline, Jay P. Agrawal, Ian F. Parney, Caterina Giannini, Bradley J. Erickson
Abstract Objective: Several studies have associated codeletion of chromosome arms 1p/19q in low-grade gliomas (LGG) with positive response to treatment and longer progression-free survival. Therefore, predicting 1p/19q status is crucial for effective treatment planning of LGG. In this study, we predict the 1p/19q status from MR images using convolutional neural networks (CNN), which could be a noninvasive alternative to surgical biopsy and histopathological analysis. Method: Our method consists of three main steps: image registration, tumor segmentation, and classification of 1p/19q status using CNN. We included a total of 159 LGG patients (3 image slices each) with biopsy-proven 1p/19q status (57 nondeleted and 102 codeleted) and preoperative postcontrast-T1 (T1C) and T2 images. We divided our data into training, validation, and test sets. The training data was balanced for equal class probability and then augmented with iterations of random translational shift, rotation, and horizontal and vertical flips to increase the size of the training set. We shuffled and augmented the training data to counter overfitting in each epoch. Finally, we evaluated several configurations of a multi-scale CNN architecture until training and validation accuracies became consistent. Results: The results of the best performing configuration on the unseen test set were 93.3% (sensitivity), 82.22% (specificity), and 87.7% (accuracy). Conclusion: Multi-scale CNN with their self-learning capability provides promising results for predicting 1p/19q status noninvasively based on T1C and T2 images. Significance: Predicting 1p/19q status noninvasively from MR images would allow selecting effective treatment strategies for LGG patients without the need for surgical biopsy.
Tasks Image Registration
Published 2016-11-21
URL http://arxiv.org/abs/1611.06939v1
PDF http://arxiv.org/pdf/1611.06939v1.pdf
PWC https://paperswithcode.com/paper/predicting-1p19q-chromosomal-deletion-of-low
Repo
Framework
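The augmentation loop described in the abstract (random translational shift, rotation, and horizontal/vertical flips) can be sketched for a single 2-D slice. Rotation is restricted to right angles here for simplicity; the shift range is an assumed illustrative value.

```python
import numpy as np

def augment(slice2d, rng):
    """Apply one random combination of shift, right-angle rotation, and
    flips to a 2-D image slice (sketch of the training augmentation)."""
    out = np.roll(slice2d,
                  shift=(rng.integers(-5, 6), rng.integers(-5, 6)),
                  axis=(0, 1))                    # translational shift
    out = np.rot90(out, k=rng.integers(0, 4))     # rotation by k * 90 deg
    if rng.integers(0, 2):
        out = out[:, ::-1]                        # horizontal flip
    if rng.integers(0, 2):
        out = out[::-1, :]                        # vertical flip
    return out
```

Regenerating such variants every epoch, as the paper does, is what counters overfitting on a training set of this size.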

A Nonlinear Weighted Total Variation Image Reconstruction Algorithm for Electrical Capacitance Tomography

Title A Nonlinear Weighted Total Variation Image Reconstruction Algorithm for Electrical Capacitance Tomography
Authors Kezhi Li, Daniel Holland
Abstract A new iterative image reconstruction algorithm for electrical capacitance tomography (ECT) is proposed that is based on iterative soft thresholding of a total variation penalty and adaptive reweighted compressive sensing. This algorithm encourages sharp changes in the ECT image and overcomes the disadvantage of the $l_1$ minimization by equipping the total variation with an adaptive weighting depending on the reconstructed image. Moreover, the non-linear effect is also partially reduced due to the adoption of an updated sensitivity matrix. Simulation results show that the proposed algorithm recovers ECT images more precisely than existing state-of-the-art algorithms and therefore is suitable for the imaging of multiphase systems in industrial or medical applications.
Tasks Compressive Sensing, Image Reconstruction
Published 2016-03-02
URL http://arxiv.org/abs/1603.00816v2
PDF http://arxiv.org/pdf/1603.00816v2.pdf
PWC https://paperswithcode.com/paper/a-nonlinear-weighted-total-variation-image
Repo
Framework
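The iterative-soft-thresholding-with-adaptive-reweighting core of the algorithm can be sketched in its plain $l_1$ form. The paper applies the idea to a total-variation penalty with an updated ECT sensitivity matrix; the version below is the generic analogue, with illustrative parameter values.

```python
import numpy as np

def soft(x, t):
    # Soft-thresholding operator, the proximal map of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def reweighted_ista(S, c, lam=0.1, iters=100, eps=1e-3):
    """Iterative soft thresholding with periodic adaptive reweighting.
    S: sensitivity matrix, c: measured capacitances, returns the
    reconstructed (here l1-sparse) image vector g."""
    g = np.zeros(S.shape[1])
    step = 1.0 / np.linalg.norm(S, 2) ** 2   # gradient step size
    w = np.ones_like(g)                      # per-pixel penalty weights
    for k in range(iters):
        g = soft(g - step * S.T @ (S @ g - c), step * lam * w)
        if k % 10 == 9:
            # Reweight: small-magnitude pixels get penalized more,
            # sharpening the reconstruction (adaptive reweighted CS).
            w = 1.0 / (np.abs(g) + eps)
    return g
```

In the full algorithm the threshold acts on image gradients (total variation) rather than pixel values, which is what encourages the sharp phase boundaries that ECT images of multiphase flows contain.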