July 29, 2019

3069 words 15 mins read

Paper Group ANR 14

The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing

Title The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing
Authors Rik van Noord, Johan Bos
Abstract We evaluate a semantic parser based on a character-based sequence-to-sequence model in the context of the SemEval-2017 shared task on semantic parsing for AMRs. With data augmentation, super characters, and POS-tagging we gain major improvements in performance compared to a baseline character-level model. Although we improve on previous character-based neural semantic parsing models, the overall accuracy is still lower than a state-of-the-art AMR parser. An ensemble combining our neural semantic parser with an existing, traditional parser, yields a small gain in performance.
Tasks Data Augmentation, Semantic Parsing
Published 2017-04-07
URL http://arxiv.org/abs/1704.02156v2
PDF http://arxiv.org/pdf/1704.02156v2.pdf
PWC https://paperswithcode.com/paper/the-meaning-factory-at-semeval-2017-task-9
Repo
Framework
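
The character-level preprocessing described in the abstract can be made concrete with a short sketch: tokenize the input into single characters, but keep frequent multi-character units ("super characters") as single tokens. The unit vocabulary below is a made-up placeholder, not the vocabulary the authors induced from data.

```python
# Sketch of character-level tokenization with "super characters"; the
# SUPER_CHARS vocabulary below is a hypothetical placeholder.
SUPER_CHARS = {":ARG0", ":ARG1", "want-01"}

def to_char_sequence(sentence, super_chars=SUPER_CHARS):
    """Tokenize into characters, keeping known multi-char units whole."""
    tokens, i = [], 0
    while i < len(sentence):
        match = next((u for u in super_chars if sentence.startswith(u, i)), None)
        if match:                      # one "super character" token
            tokens.append(match)
            i += len(match)
        else:                          # ordinary character token
            tokens.append(sentence[i])
            i += 1
    return tokens

print(to_char_sequence("boy :ARG0 want-01"))
# ['b', 'o', 'y', ' ', ':ARG0', ' ', 'want-01']
```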

Concave losses for robust dictionary learning

Title Concave losses for robust dictionary learning
Authors Rafael Will M de Araujo, Roberto Hirata, Alain Rakotomamonjy
Abstract Traditional dictionary learning methods are based on a quadratic convex loss function and thus are sensitive to outliers. In this paper, we propose a generic framework for robust dictionary learning based on concave losses. We provide results on the composition of concave functions, notably regarding super-gradient computations, that are key for developing generic dictionary learning algorithms applicable to smooth and non-smooth losses. In order to improve identification of outliers, we introduce an initialization heuristic based on undercomplete dictionary learning. Experimental results using synthetic and real data demonstrate that our method better detects outliers and generates better dictionaries, outperforming state-of-the-art methods such as K-SVD and LC-KSVD.
Tasks Dictionary Learning
Published 2017-11-02
URL http://arxiv.org/abs/1711.00659v1
PDF http://arxiv.org/pdf/1711.00659v1.pdf
PWC https://paperswithcode.com/paper/concave-losses-for-robust-dictionary-learning
Repo
Framework
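
The supergradient-based reweighting at the heart of such a framework can be sketched as follows: a concave loss of the residual norm yields per-signal weights that shrink for outliers, which then drive a weighted dictionary update. This is a toy rendering under an assumed log-type loss, not the authors' algorithm; the sparse coding step reuses scikit-learn's orthogonal matching pursuit.

```python
# Toy robust dictionary learning with a concave loss via reweighting;
# the log-type loss and update scheme are illustrative assumptions.
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))                 # 200 signals as columns
X[:, :10] += 20 * rng.normal(size=(20, 10))    # 10 gross outliers
D = rng.normal(size=(20, 30))
D /= np.linalg.norm(D, axis=0)

eps = 1.0
for _ in range(10):
    A = orthogonal_mp(D, X, n_nonzero_coefs=3)     # sparse coding step
    r = np.linalg.norm(X - D @ A, axis=0)          # per-signal residual norms
    w = 1.0 / (eps + r)    # supergradient of log(eps + r): downweights outliers
    sw = np.sqrt(w)
    Xw, Aw = X * sw, A * sw                        # weighted LS dictionary update
    D = Xw @ Aw.T @ np.linalg.pinv(Aw @ Aw.T)
    D /= np.linalg.norm(D, axis=0) + 1e-12
```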

Data-Driven Dialogue Systems for Social Agents

Title Data-Driven Dialogue Systems for Social Agents
Authors Kevin K. Bowden, Shereen Oraby, Amita Misra, Jiaqi Wu, Stephanie Lukin
Abstract In order to build dialogue systems to tackle the ambitious task of holding social conversations, we argue that we need a data-driven approach that includes insight into human conversational chit-chat, and which incorporates different natural language processing modules. Our strategy is to analyze and index large corpora of social media data, including Twitter conversations, online debates, dialogues between friends, and blog posts, and then to couple this data retrieval with modules that perform tasks such as sentiment and style analysis, topic modeling, and summarization. We aim for personal assistants that can learn more nuanced human language, and to grow from task-oriented agents to more personable social bots.
Tasks
Published 2017-09-10
URL http://arxiv.org/abs/1709.03190v1
PDF http://arxiv.org/pdf/1709.03190v1.pdf
PWC https://paperswithcode.com/paper/data-driven-dialogue-systems-for-social
Repo
Framework
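
A minimal sketch of the retrieve-and-respond strategy: index the contexts of (context, response) pairs with TF-IDF and answer a new utterance with the response attached to the most similar indexed context. The tiny corpus is a placeholder, and the sentiment/style/topic modules described above are omitted.

```python
# Toy retrieval-based chit-chat: TF-IDF index over (context, response) pairs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pairs = [
    ("how was your weekend", "pretty relaxing, I mostly read"),
    ("do you like coffee", "I prefer tea, honestly"),
    ("what do you think about the debate", "both sides had good points"),
]
contexts = [c for c, _ in pairs]
vec = TfidfVectorizer().fit(contexts)
index = vec.transform(contexts)

def respond(utterance):
    sims = cosine_similarity(vec.transform([utterance]), index)[0]
    return pairs[int(sims.argmax())][1]

print(respond("do you enjoy coffee"))   # -> "I prefer tea, honestly"
```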

Model enumeration in propositional circumscription via unsatisfiable core analysis

Title Model enumeration in propositional circumscription via unsatisfiable core analysis
Authors Mario Alviano
Abstract Many practical problems are characterized by a preference relation over admissible solutions, where preferred solutions are minimal in some sense. For example, a preferred diagnosis usually comprises a minimal set of reasons that is sufficient to cause the observed anomaly. Alternatively, a minimal correction subset comprises a minimal set of reasons whose deletion is sufficient to eliminate the observed anomaly. Circumscription formalizes such preference relations by associating propositional theories with minimal models. The resulting enumeration problem is addressed here by means of a new algorithm taking advantage of unsatisfiable core analysis. Empirical evidence of the efficiency of the algorithm is given by comparing the performance of the resulting solver, CIRCUMSCRIPTINO, with HCLASP, CAMUS MCS, LBX and MCSLS on the enumeration of minimal models for problems originating from practical applications. This paper is under consideration for acceptance in TPLP.
Tasks
Published 2017-07-05
URL http://arxiv.org/abs/1707.01423v1
PDF http://arxiv.org/pdf/1707.01423v1.pdf
PWC https://paperswithcode.com/paper/model-enumeration-in-propositional
Repo
Framework
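
To make the enumeration target concrete, the sketch below brute-forces the subset-minimal models of a tiny CNF formula. The paper's contribution is doing this efficiently via unsatisfiable-core analysis inside a solver; this exhaustive version only illustrates the semantics of circumscription's preferred models.

```python
# Brute-force enumeration of subset-minimal models of a small CNF formula.
from itertools import product

# CNF over variables 1..3: (x1 or x2) and (x2 or x3); positive ints are
# positive literals, negative ints would be negations.
clauses = [[1, 2], [2, 3]]
n = 3

def satisfies(assign, clauses):
    return all(any((lit > 0) == assign[abs(lit) - 1] for lit in clause)
               for clause in clauses)

models = [a for a in product([False, True], repeat=n) if satisfies(a, clauses)]

def true_set(a):
    return {i + 1 for i, v in enumerate(a) if v}

# a model is minimal if no other model's true-set is a proper subset of its own
minimal = [m for m in models
           if not any(true_set(o) < true_set(m) for o in models)]
print([sorted(true_set(m)) for m in minimal])   # [[2], [1, 3]]
```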

Small Moving Window Calibration Models for Soft Sensing Processes with Limited History

Title Small Moving Window Calibration Models for Soft Sensing Processes with Limited History
Authors Casey Kneale, Steven D. Brown
Abstract Five simple soft sensor methodologies with two update conditions were compared on two experimentally obtained datasets and one simulated dataset. The soft sensors investigated were moving window partial least squares regression (and a recursive variant), moving window random forest regression, the mean moving window of $y$, and a novel random forest partial least squares regression ensemble (RF-PLS), all of which can be used with small sample sizes so that they can be rapidly placed online. It was found that, on two of the datasets studied, small window sizes led to the lowest prediction errors for all of the moving window methods studied. On the majority of datasets studied, the RF-PLS calibration method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and it demonstrated greater predictive stability at larger time delays than moving window PLS alone. It was found that both the random forest and RF-PLS methods most adequately modeled the datasets that did not feature purely monotonic increases in property values, but that both methods performed more poorly than moving window PLS models on one dataset with purely monotonic property values. Other data dependent findings are presented and discussed.
Tasks Calibration
Published 2017-10-31
URL http://arxiv.org/abs/1710.11595v3
PDF http://arxiv.org/pdf/1710.11595v3.pdf
PWC https://paperswithcode.com/paper/small-moving-window-calibration-models-for
Repo
Framework
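
A rough sketch of the moving-window idea with one of the five methods (moving-window PLS): refit on only the most recent window of samples and predict one step ahead. The data is synthetic with a slow drift; the RF-PLS ensemble itself is not reproduced here.

```python
# Small moving-window PLS calibration with one-step-ahead prediction.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
T, p, window = 200, 5, 20
X = rng.normal(size=(T, p))
drift = np.linspace(0, 2, T)                 # slowly drifting process
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + drift + 0.1 * rng.normal(size=T)

errors = []
for t in range(window, T - 1):
    model = PLSRegression(n_components=2)
    model.fit(X[t - window:t], y[t - window:t])   # small recent window only
    pred = model.predict(X[t + 1:t + 2]).ravel()[0]
    errors.append(pred - y[t + 1])

print("one-step-ahead RMSE:", np.sqrt(np.mean(np.square(errors))))
```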

Characteristic and Universal Tensor Product Kernels

Title Characteristic and Universal Tensor Product Kernels
Authors Zoltan Szabo, Bharath K. Sriperumbudur
Abstract Maximum mean discrepancy (MMD), also called energy distance or N-distance in statistics, and the Hilbert-Schmidt independence criterion (HSIC), known as distance covariance in statistics, are among the most popular and successful approaches for quantifying the difference and independence of random variables, respectively. Thanks to their kernel-based foundations, MMD and HSIC are applicable on a wide variety of domains. Despite their tremendous success, very little is known about when HSIC characterizes independence and when MMD with a tensor product kernel can discriminate probability distributions. In this paper, we answer these questions by studying various notions of the characteristic property of the tensor product kernel.
Tasks
Published 2017-08-28
URL http://arxiv.org/abs/1708.08157v4
PDF http://arxiv.org/pdf/1708.08157v4.pdf
PWC https://paperswithcode.com/paper/characteristic-and-universal-tensor-product
Repo
Framework
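
The HSIC quantity discussed in the abstract has a compact (biased) empirical estimator, HSIC = trace(KHLH)/n^2 with centered Gram matrices; the sketch below computes it with Gaussian kernels and fixed bandwidths for illustration.

```python
# Biased empirical HSIC estimator with Gaussian kernels.
import numpy as np

def gaussian_gram(Z, sigma=1.0):
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * Z @ Z.T
    return np.exp(-d2 / (2 * sigma**2))

def hsic(X, Y, sigma=1.0):
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K, L = gaussian_gram(X, sigma), gaussian_gram(Y, sigma)
    return np.trace(K @ H @ L @ H) / n**2

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
print(hsic(x, x**2))                        # dependent pair: noticeably > 0
print(hsic(x, rng.normal(size=(500, 1))))   # independent pair: near 0
```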

MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information

Title MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information
Authors Hao Yang, Joey Tianyi Zhou, Jianfei Cai, Yew Soon Ong
Abstract Multi-instance multi-label (MIML) learning has many interesting applications in computer vision, including multi-object recognition and automatic image tagging. In these applications, additional information such as bounding boxes, image captions and descriptions is often available during the training phase, which is referred to as privileged information (PI). However, as existing works on learning using PI only consider instance-level PI (privileged instances), they fail to make use of bag-level PI (privileged bags) available in MIML learning. Therefore, in this paper, we propose a two-stream fully convolutional network, named MIML-FCN+, unified by a novel PI loss, to solve the problem of MIML learning with privileged bags. Compared to previous works on PI, the proposed MIML-FCN+ utilizes the readily available privileged bags, instead of hard-to-obtain privileged instances, making the system more general and practical in real-world applications. As the proposed PI loss is convex and SGD-compatible and the framework itself is a fully convolutional network, MIML-FCN+ can be easily integrated with state-of-the-art deep learning networks. Moreover, the flexibility of convolutional layers allows us to exploit structured correlations among instances to facilitate more effective training and testing. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed MIML-FCN+, outperforming state-of-the-art methods in the application of multi-object recognition.
Tasks Image Captioning, Multi-Label Learning, Object Recognition
Published 2017-02-28
URL http://arxiv.org/abs/1702.08681v1
PDF http://arxiv.org/pdf/1702.08681v1.pdf
PWC https://paperswithcode.com/paper/miml-fcn-multi-instance-multi-label-learning
Repo
Framework
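
A schematic of the two-stream design in PyTorch: one stream scores a bag's instances (with max pooling for the multi-instance part), a second stream scores the bag-level privileged information, and an extra loss term couples the two. This is a generic stand-in under assumed layer sizes, not the paper's FCN architecture or its exact PI loss.

```python
# Generic two-stream MIML setup with a bag-level privileged stream.
import torch
import torch.nn as nn

class TwoStreamMIML(nn.Module):
    def __init__(self, feat_dim=128, pi_dim=300, n_labels=20):
        super().__init__()
        self.main = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                  nn.Linear(256, n_labels))
        self.pi = nn.Sequential(nn.Linear(pi_dim, 256), nn.ReLU(),
                                nn.Linear(256, n_labels))

    def forward(self, instances, pi_bag):
        # max over instances = bag-level prediction (multi-instance pooling)
        bag_scores = self.main(instances).max(dim=0).values
        pi_scores = self.pi(pi_bag)
        return bag_scores, pi_scores

model = TwoStreamMIML()
instances = torch.randn(7, 128)        # one bag with 7 instance features
pi_bag = torch.randn(300)              # e.g. an embedded caption for the bag
labels = torch.zeros(20); labels[3] = 1

bag_scores, pi_scores = model(instances, pi_bag)
bce = nn.BCEWithLogitsLoss()
# stand-in coupling term; the paper's PI loss differs
loss = bce(bag_scores, labels) + 0.1 * ((bag_scores - pi_scores) ** 2).mean()
loss.backward()
```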

Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks

Title Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks
Authors Devinder Kumar, Alexander Wong, Graham W. Taylor
Abstract In this work, we propose CLass-Enhanced Attentive Response (CLEAR): an approach to visualize and understand the decisions made by deep neural networks (DNNs) given a specific input. CLEAR facilitates the visualization of attentive regions and levels of interest of DNNs during the decision-making process. It also enables the visualization of the most dominant classes associated with these attentive regions of interest. As such, CLEAR can mitigate some of the shortcomings of heatmap-based methods associated with decision ambiguity, and allows for better insights into the decision-making process of DNNs. Quantitative and qualitative experiments across three different datasets demonstrate the efficacy of CLEAR for gaining a better understanding of the inner workings of DNNs during the decision-making process.
Tasks Decision Making
Published 2017-04-13
URL http://arxiv.org/abs/1704.04133v2
PDF http://arxiv.org/pdf/1704.04133v2.pdf
PWC https://paperswithcode.com/paper/explaining-the-unexplained-a-class-enhanced
Repo
Framework
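
The flavor of class-enhanced attentive responses can be sketched with gradients: back-propagate each class score to the input and record, per pixel, both the strongest response and which class produced it. CLEAR itself is formulated with deconvolution-based attentive responses, so this gradient version is only an approximation of the idea; the tiny network at the end is a placeholder.

```python
# Gradient-based approximation of per-pixel dominant-class visualization.
import torch

def class_enhanced_map(model, x, n_classes):
    x = x.clone().requires_grad_(True)
    responses = []
    for c in range(n_classes):
        model.zero_grad()
        if x.grad is not None:
            x.grad = None
        model(x)[0, c].backward()
        responses.append(x.grad.abs().sum(dim=1)[0])  # collapse channels
    stack = torch.stack(responses)            # (n_classes, H, W)
    attention, dominant = stack.max(dim=0)    # strength + winning class per pixel
    return attention, dominant

# placeholder classifier and input, just to make the sketch runnable
net = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1), torch.nn.ReLU(),
                          torch.nn.Flatten(), torch.nn.Linear(8 * 16 * 16, 10))
img = torch.randn(1, 3, 16, 16)
attn, cls_map = class_enhanced_map(net, img, n_classes=10)
print(attn.shape, cls_map.shape)   # torch.Size([16, 16]) twice
```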

Scalable Greedy Feature Selection via Weak Submodularity

Title Scalable Greedy Feature Selection via Weak Submodularity
Authors Rajiv Khanna, Ethan Elenberg, Alexandros G. Dimakis, Sahand Negahban, Joydeep Ghosh
Abstract Greedy algorithms are widely used for problems in machine learning such as feature selection and set function optimization. Unfortunately, for large datasets, the running time of even greedy algorithms can be quite high. This is because for each greedy step we need to refit a model or calculate a function using the previously selected choices and the new candidate. Two algorithms that are faster approximations to greedy forward selection were introduced recently ([Mirzasoleiman et al. 2013, 2015]). They achieve better performance by exploiting distributed computation and stochastic evaluation, respectively. Both algorithms have provable performance guarantees for submodular functions. In this paper we show that, contrary to previously held opinion, submodularity is not required to obtain approximation guarantees for these two algorithms. Specifically, we show that a generalized concept of weak submodularity suffices to give multiplicative approximation guarantees. Our result extends the applicability of these algorithms to a larger class of functions. Furthermore, we show that a bounded submodularity ratio can be used to provide data-dependent bounds that can sometimes be tighter even for submodular functions. We empirically validate our work by showing superior performance of fast greedy approximations versus several established baselines on artificial and real datasets.
Tasks Feature Selection
Published 2017-03-08
URL http://arxiv.org/abs/1703.02723v1
PDF http://arxiv.org/pdf/1703.02723v1.pdf
PWC https://paperswithcode.com/paper/scalable-greedy-feature-selection-via-weak
Repo
Framework
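
The stochastic-evaluation speedup mentioned above is easy to sketch: at each greedy step, score only a random subsample of the remaining candidates, using the R^2 of a refit linear model as the (weakly submodular) set function. The pool size and model choice here are illustrative assumptions.

```python
# Stochastic-greedy forward feature selection with a refit-R^2 set function.
import numpy as np
from sklearn.linear_model import LinearRegression

def stochastic_greedy(X, y, k, pool_size=20, seed=0):
    rng = np.random.default_rng(seed)
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        pool = rng.choice(remaining, size=min(pool_size, len(remaining)),
                          replace=False)
        def gain(j):
            cols = X[:, selected + [int(j)]]
            return LinearRegression().fit(cols, y).score(cols, y)
        best = int(max(pool, key=gain))
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))
y = X[:, 3] - 2 * X[:, 17] + 0.1 * rng.normal(size=300)
print(stochastic_greedy(X, y, k=2))  # may miss a feature absent from the pool
```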

Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling

Title Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling
Authors Hairong Liu, Zhenyao Zhu, Xiangang Li, Sanjeev Satheesh
Abstract Most existing sequence labelling models rely on a fixed decomposition of a target sequence into a sequence of basic units. These methods suffer from two major drawbacks: 1) the set of basic units is fixed, such as the set of words, characters or phonemes in speech recognition, and 2) the decomposition of target sequences is fixed. These drawbacks usually result in sub-optimal performance of modeling sequences. In this paper, we extend the popular CTC loss criterion to alleviate these limitations, and propose a new loss function called Gram-CTC. While preserving the advantages of CTC, Gram-CTC automatically learns the best set of basic units (grams), as well as the most suitable decomposition of target sequences. Unlike CTC, Gram-CTC allows the model to output a variable number of characters at each time step, which enables the model to capture longer-term dependencies and improves the computational efficiency. We demonstrate that the proposed Gram-CTC improves CTC in terms of both performance and efficiency on the large vocabulary speech recognition task at multiple scales of data, and that with Gram-CTC we can outperform the state-of-the-art on a standard speech benchmark.
Tasks Large Vocabulary Continuous Speech Recognition, Speech Recognition
Published 2017-03-01
URL http://arxiv.org/abs/1703.00096v2
PDF http://arxiv.org/pdf/1703.00096v2.pdf
PWC https://paperswithcode.com/paper/gram-ctc-automatic-unit-selection-and-target
Repo
Framework
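
The core object Gram-CTC marginalizes over can be shown in miniature: once multi-character grams are allowed as basic units, a target sequence admits many decompositions. The sketch enumerates them for a toy gram set; the actual loss sums over such decompositions (and their alignments) with dynamic programming.

```python
# Enumerate all decompositions of a target string into units from a gram set.
def decompositions(word, grams):
    if not word:
        return [[]]
    out = []
    for g in grams:
        if word.startswith(g):
            out.extend([g] + rest for rest in decompositions(word[len(g):], grams))
    return out

grams = {"t", "h", "e", "th", "he", "the"}
for d in decompositions("the", grams):
    print(d)
# ['t', 'h', 'e'], ['t', 'he'], ['th', 'e'], ['the']  (order may vary)
```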

Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm

Title Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm
Authors Shuvozit Ghose, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy
Abstract In this paper, a new texture descriptor named “Fractional Local Neighborhood Intensity Pattern” (FLNIP) is proposed for content-based image retrieval (CBIR). It is an extension of the Local Neighborhood Intensity Pattern (LNIP) [1]. FLNIP calculates the relative intensity difference between a particular pixel and the center pixel of a 3x3 window by considering the relationship with adjacent neighbors. In this work, the fractional change in the local neighborhood involving the adjacent neighbors is first calculated with respect to one of the eight neighbors of the center pixel of a 3x3 window. Next, the fractional change is calculated with respect to the center itself. The two values of fractional change are then compared to generate a binary bit pattern. Both sign and magnitude information are encoded in a single descriptor, as it deals with the relative change in magnitude in the adjacent neighborhood, i.e., the comparison of the fractional change. The descriptor is applied to four multi-resolution images: the raw image and three filtered images obtained by applying Gaussian filters of different standard deviations to the raw image, to signify the importance of exploring texture information at different resolutions in an image. The four sets of distances obtained between the query and the target image are then combined with a genetic algorithm based approach to improve the retrieval performance by minimizing the distance between similar-class images. The performance of the method has been tested for image retrieval on four popular databases. The precision and recall values observed on these databases have been compared with recent state-of-the-art local patterns. The proposed method has shown a significant improvement over many other existing methods.
Tasks Content-Based Image Retrieval, Image Retrieval
Published 2017-12-30
URL https://arxiv.org/abs/1801.00187v3
PDF https://arxiv.org/pdf/1801.00187v3.pdf
PWC https://paperswithcode.com/paper/fractional-local-neighborhood-intensity
Repo
Framework
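
A simplified numpy rendering of the comparison that drives the descriptor: for each of the eight neighbors of a pixel, compare a fractional change computed relative to that neighbor with one computed relative to the center, and emit one bit. The exact neighborhood terms of FLNIP are more involved, so treat this as a structural sketch only.

```python
# Simplified FLNIP-style binary code from fractional-change comparisons.
import numpy as np

OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def flnip_like(img, eps=1e-6):
    img = img.astype(float)
    code = np.zeros(img.shape, dtype=np.uint16)
    center = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(OFFSETS):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        frac_nb = (center - nb) / (nb + eps)       # change relative to neighbor
        frac_c = (center - nb) / (center + eps)    # change relative to center
        code[1:-1, 1:-1] |= (frac_nb > frac_c).astype(np.uint16) << bit
    return code

texture = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(flnip_like(texture).shape)   # per-pixel 8-bit pattern map
```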

Reducing Deep Network Complexity with Fourier Transform Methods

Title Reducing Deep Network Complexity with Fourier Transform Methods
Authors Andrew Kiruluta
Abstract We propose a novel approach that uses shallow, densely connected neural network architectures to achieve performance superior to convolutional neural network (CNN) approaches, with the added benefits of a lower computational burden and dramatically fewer training examples needed to achieve high prediction accuracy ($>98\%$). The advantages of our proposed method are demonstrated in results on benchmark datasets, which show significant performance gains over existing state-of-the-art results on MNIST, CIFAR-10 and CIFAR-100. By Fourier transforming the inputs, each point in the training sample then has a representational energy of all the weighted information from every other point. The consequence of using this input is a reduced-complexity neural network, a reduced computational load, and the lifting of the requirement for a large number of training examples to achieve high classification accuracy.
Tasks
Published 2017-12-15
URL http://arxiv.org/abs/1801.01451v2
PDF http://arxiv.org/pdf/1801.01451v2.pdf
PWC https://paperswithcode.com/paper/reducing-deep-network-complexity-with-fourier
Repo
Framework
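
The input transform described above is straightforward to try: feed the Fourier magnitudes of each image to a shallow dense network instead of raw pixels. The sketch uses scikit-learn's small digits dataset as a stand-in for the MNIST/CIFAR experiments reported in the paper.

```python
# Fourier-magnitude inputs to a shallow dense classifier.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
imgs = digits.images                                  # (n, 8, 8)
fft_feats = np.abs(np.fft.fft2(imgs)).reshape(len(imgs), -1)

Xtr, Xte, ytr, yte = train_test_split(fft_feats, digits.target, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(Xtr, ytr)
print("test accuracy:", clf.score(Xte, yte))
```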

Iteratively reweighted $\ell_1$ algorithms with extrapolation

Title Iteratively reweighted $\ell_1$ algorithms with extrapolation
Authors Peiran Yu, Ting Kei Pong
Abstract The iteratively reweighted $\ell_1$ algorithm is a popular method for solving a large class of optimization problems whose objective is the sum of a Lipschitz differentiable loss function and a possibly nonconvex sparsity-inducing regularizer. In this paper, motivated by the success of extrapolation techniques in accelerating first-order methods, we study how widely used extrapolation techniques such as those in [4,5,22,28] can be incorporated to possibly accelerate the iteratively reweighted $\ell_1$ algorithm. We consider three versions of such algorithms. For each version, we exhibit an explicitly checkable condition on the extrapolation parameters so that the sequence generated provably clusters at a stationary point of the optimization problem. We also investigate global convergence under additional Kurdyka-$\L$ojasiewicz assumptions on certain potential functions. Our numerical experiments show that our algorithms usually outperform the general iterative shrinkage and thresholding algorithm in [21] and an adaptation of the iteratively reweighted $\ell_1$ algorithm in [23, Algorithm 7] with nonmonotone line search for solving random instances of log-penalty regularized least squares problems in terms of both CPU time and solution quality.
Tasks
Published 2017-10-22
URL http://arxiv.org/abs/1710.07886v2
PDF http://arxiv.org/pdf/1710.07886v2.pdf
PWC https://paperswithcode.com/paper/iteratively-reweighted-ell_1-algorithms-with
Repo
Framework
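
In the spirit of the algorithms studied, the sketch below runs iteratively reweighted $\ell_1$ on a log-penalty regularized least squares problem with a simple momentum-style extrapolation step. The constant extrapolation parameter and step size are illustrative, not the checkable conditions derived in the paper.

```python
# Iteratively reweighted l1 with a momentum-style extrapolation step.
import numpy as np

def irl1_extrapolated(A, b, lam=0.1, eps=0.5, beta=0.5, iters=300):
    L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the loss
    x = x_prev = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x + beta * (x - x_prev)               # extrapolation
        w = lam / (np.abs(x) + eps)               # reweighting from the log penalty
        u = z - A.T @ (A @ z - b) / L             # gradient step at z
        x_prev, x = x, np.sign(u) * np.maximum(np.abs(u) - w / L, 0)  # soft-threshold
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 100))
x_true = np.zeros(100); x_true[[5, 40, 77]] = [3.0, -2.0, 1.5]
b = A @ x_true + 0.01 * rng.normal(size=60)
print(np.flatnonzero(np.abs(irl1_extrapolated(A, b)) > 0.5))  # ~ [5, 40, 77]
```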

A Procedural Texture Generation Framework Based on Semantic Descriptions

Title A Procedural Texture Generation Framework Based on Semantic Descriptions
Authors Junyu Dong, Lina Wang, Jun Liu, Xin Sun
Abstract Procedural textures are normally generated from mathematical models with parameters carefully selected by experienced users. However, for naive users, the intuitive way to obtain a desired texture is to provide semantic descriptions such as “regular,” “lacelike,” and “repetitive,” after which a procedural model with proper parameters will be automatically suggested to generate the corresponding textures. By contrast, it is less practical for users to learn mathematical models and tune parameters based on multiple examinations of large numbers of generated textures. In this study, we propose a novel framework that generates procedural textures according to user-defined semantic descriptions, and we establish a mapping between procedural models and semantic texture descriptions. First, based on a vocabulary of semantic attributes collected from psychophysical experiments, a multi-label learning method is employed to annotate a large number of textures with semantic attributes to form a semantic procedural texture dataset. Then, we derive a low-dimensional semantic space in which the semantic descriptions can be separated from one another. Finally, given a set of semantic descriptions, the diverse properties of the samples in the semantic space can lead the framework to find an appropriate generation model that uses appropriate parameters to produce a desired texture. The experimental results show that the proposed framework is effective and that the generated textures closely correlate with the input semantic descriptions.
Tasks Multi-Label Learning, Texture Synthesis
Published 2017-04-13
URL http://arxiv.org/abs/1704.04141v1
PDF http://arxiv.org/pdf/1704.04141v1.pdf
PWC https://paperswithcode.com/paper/a-procedural-texture-generation-framework
Repo
Framework
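
The model-selection step of such a framework can be caricatured in a few lines: represent each procedural generator by a vector over a semantic attribute vocabulary and pick the generator closest to the user's description. The vocabulary and the two toy generators below are made-up placeholders, not the paper's learned mapping.

```python
# Toy semantic-description-to-procedural-model selection.
import numpy as np

ATTRS = ["regular", "repetitive", "lacelike"]   # hypothetical vocabulary

def checkerboard(size=64, period=8):
    y, x = np.indices((size, size))
    return ((x // period + y // period) % 2).astype(float)

def sinusoid(size=64, freq=0.3):
    y, x = np.indices((size, size))
    return 0.5 + 0.5 * np.sin(freq * x) * np.sin(freq * y)

MODELS = {  # generator -> its semantic attribute vector
    checkerboard: np.array([1.0, 1.0, 0.0]),
    sinusoid: np.array([0.8, 1.0, 0.4]),
}

def generate(description):
    q = np.array([1.0 if a in description else 0.0 for a in ATTRS])
    gen = max(MODELS, key=lambda g: MODELS[g] @ q)   # nearest by dot product
    return gen()

tex = generate({"regular", "repetitive"})   # picks the checkerboard generator
print(tex.shape)
```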

Residual Parameter Transfer for Deep Domain Adaptation

Title Residual Parameter Transfer for Deep Domain Adaptation
Authors Artem Rozantsev, Mathieu Salzmann, Pascal Fua
Abstract The goal of Deep Domain Adaptation is to make it possible to use Deep Nets trained in one domain, where there is enough annotated training data, in another where there is little or none. Most current approaches have focused on learning feature representations that are invariant to the changes that occur when going from one domain to the other, which means using the same network parameters in both domains. While some recent algorithms explicitly model the changes by adapting the network parameters, they either severely restrict the possible domain changes or significantly increase the number of model parameters. By contrast, we introduce a network architecture that includes auxiliary residual networks, which we train to predict the parameters in the domain with little annotated data from those in the other one. This architecture enables us to flexibly preserve the similarities between domains where they exist and model the differences when necessary. We demonstrate that our approach yields higher accuracy than state-of-the-art methods without undue complexity.
Tasks Domain Adaptation
Published 2017-11-21
URL http://arxiv.org/abs/1711.07714v1
PDF http://arxiv.org/pdf/1711.07714v1.pdf
PWC https://paperswithcode.com/paper/residual-parameter-transfer-for-deep-domain
Repo
Framework
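
The parameter-transfer idea admits a condensed sketch: predict target-domain layer weights as the frozen source weights plus the output of a small auxiliary residual network, so only the auxiliary network needs training on the scarce target data. Layer shapes and the auxiliary architecture are placeholders, not the paper's configuration.

```python
# Residual parameter transfer: target weights = source weights + aux residual.
import torch
import torch.nn as nn

class ResidualTransfer(nn.Module):
    def __init__(self, w_source):
        super().__init__()
        self.w_source = w_source                 # frozen source weights
        n = w_source.numel()
        self.aux = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n))

    def target_weight(self):
        flat = self.w_source.flatten()
        return (flat + self.aux(flat)).view_as(self.w_source)

w_src = torch.randn(32, 16)                      # e.g. one linear layer's weights
transfer = ResidualTransfer(w_src)
x_target = torch.randn(4, 16)                    # a batch from the target domain
out = x_target @ transfer.target_weight().t()    # forward pass with adapted weights
print(out.shape)                                 # torch.Size([4, 32])
```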