Paper Group ANR 14
The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing. Concave losses for robust dictionary learning. Data-Driven Dialogue Systems for Social Agents. Model enumeration in propositional circumscription via unsatisfiable core analysis. Small Moving Window Calibration Models for Soft Sensing Processes with Limited History. Characteristic and Universal Tensor Product Kernels. MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information. Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks. Scalable Greedy Feature Selection via Weak Submodularity. Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling. Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm. Reducing Deep Network Complexity with Fourier Transform Methods. Iteratively reweighted $\ell_1$ algorithms with extrapolation. A Procedural Texture Generation Framework Based on Semantic Descriptions. Residual Parameter Transfer for Deep Domain Adaptation.
The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing
Title | The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing |
Authors | Rik van Noord, Johan Bos |
Abstract | We evaluate a semantic parser based on a character-based sequence-to-sequence model in the context of the SemEval-2017 shared task on semantic parsing for AMRs. With data augmentation, super characters, and POS-tagging we gain major improvements in performance compared to a baseline character-level model. Although we improve on previous character-based neural semantic parsing models, the overall accuracy is still lower than a state-of-the-art AMR parser. An ensemble combining our neural semantic parser with an existing, traditional parser, yields a small gain in performance. |
Tasks | Data Augmentation, Semantic Parsing |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02156v2 |
PDF | http://arxiv.org/pdf/1704.02156v2.pdf |
PWC | https://paperswithcode.com/paper/the-meaning-factory-at-semeval-2017-task-9 |
Repo | |
Framework | |
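The "super characters" trick lends itself to a small illustration. The sketch below is a hedged stand-in, not the authors' preprocessing: it merges frequent character n-grams into single placeholder symbols so that a character-level sequence-to-sequence model sees shorter input sequences. The n-gram length, frequency cutoff, and private-use Unicode placeholders are all illustrative assumptions.

```python
# Hypothetical "super character" preprocessing for a character-level
# seq2seq parser: frequent character n-grams become single symbols.
from collections import Counter

def build_super_chars(corpus, n=3, top_k=50):
    """Map the top_k most frequent character n-grams to single symbols."""
    counts = Counter()
    for line in corpus:
        for i in range(len(line) - n + 1):
            counts[line[i:i + n]] += 1
    # Private-use Unicode code points stand in for merged n-grams;
    # most frequent first, so greedy replacement prefers them.
    return {g: chr(0xE000 + i) for i, (g, _) in enumerate(counts.most_common(top_k))}

def apply_super_chars(line, table):
    for gram, symbol in table.items():   # insertion order = frequency order
        line = line.replace(gram, symbol)
    return line

corpus = ["The boy wants to go", "The girl wants to leave"]
table = build_super_chars(corpus)
print(len(corpus[0]), len(apply_super_chars(corpus[0], table)))  # sequence shrinks
```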
Concave losses for robust dictionary learning
Title | Concave losses for robust dictionary learning |
Authors | Rafael Will M de Araujo, Roberto Hirata, Alain Rakotomamonjy |
Abstract | Traditional dictionary learning methods are based on a quadratic convex loss function and are thus sensitive to outliers. In this paper, we propose a generic framework for robust dictionary learning based on concave losses. We provide results on the composition of concave functions, notably regarding super-gradient computations, that are key to developing generic dictionary learning algorithms applicable to smooth and non-smooth losses. In order to improve the identification of outliers, we introduce an initialization heuristic based on undercomplete dictionary learning. Experimental results using synthetic and real data demonstrate that our method detects outliers better and generates better dictionaries, outperforming state-of-the-art methods such as K-SVD and LC-KSVD. |
Tasks | Dictionary Learning |
Published | 2017-11-02 |
URL | http://arxiv.org/abs/1711.00659v1 |
PDF | http://arxiv.org/pdf/1711.00659v1.pdf |
PWC | https://paperswithcode.com/paper/concave-losses-for-robust-dictionary-learning |
Repo | |
Framework | |
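The key mechanism is a concave loss whose super-gradient yields per-sample weights inside an otherwise standard dictionary learning loop. Below is a minimal majorization-minimization sketch under stated assumptions: the loss $g(u) = \log(1+u)$, a single proximal-gradient sparse-coding step, and a regularized weighted least-squares dictionary update are illustrative choices, not the paper's algorithm.

```python
# Majorization-minimization sketch: weights from the super-gradient of a
# concave loss down-weight outlier samples in both update steps.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 200))                     # columns are signals
X[:, :5] += rng.normal(scale=10.0, size=(20, 5))   # a few gross outliers

n_atoms, lam, n_iter = 30, 0.1, 20
D = rng.normal(size=(20, n_atoms))
D /= np.linalg.norm(D, axis=0)
A = np.zeros((n_atoms, X.shape[1]))

for _ in range(n_iter):
    # Super-gradient of g(u)=log(1+u) at current squared residuals -> weights.
    res = np.sum((X - D @ A) ** 2, axis=0)
    w = 1.0 / (1.0 + res)
    # Weighted sparse coding: one proximal-gradient step per sample.
    step = 1.0 / np.linalg.norm(D.T @ D, 2)
    Z = A - step * (D.T @ ((D @ A - X) * w))
    A = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
    # Weighted dictionary update (regularized least squares), then renormalize.
    Aw = A * w
    D = (X @ Aw.T) @ np.linalg.inv(Aw @ A.T + 1e-6 * np.eye(n_atoms))
    D /= np.linalg.norm(D, axis=0) + 1e-12
```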
Data-Driven Dialogue Systems for Social Agents
Title | Data-Driven Dialogue Systems for Social Agents |
Authors | Kevin K. Bowden, Shereen Oraby, Amita Misra, Jiaqi Wu, Stephanie Lukin |
Abstract | In order to build dialogue systems that can tackle the ambitious task of holding social conversations, we argue that we need a data-driven approach that includes insight into human conversational chit-chat and that incorporates different natural language processing modules. Our strategy is to analyze and index large corpora of social media data, including Twitter conversations, online debates, dialogues between friends, and blog posts, and then to couple this data retrieval with modules that perform tasks such as sentiment and style analysis, topic modeling, and summarization. We aim for personal assistants that can learn more nuanced human language and grow from task-oriented agents into more personable social bots. |
Tasks | |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.03190v1 |
PDF | http://arxiv.org/pdf/1709.03190v1.pdf |
PWC | https://paperswithcode.com/paper/data-driven-dialogue-systems-for-social |
Repo | |
Framework | |
Model enumeration in propositional circumscription via unsatisfiable core analysis
Title | Model enumeration in propositional circumscription via unsatisfiable core analysis |
Authors | Mario Alviano |
Abstract | Many practical problems are characterized by a preference relation over admissible solutions, where preferred solutions are minimal in some sense. For example, a preferred diagnosis usually comprises a minimal set of reasons that is sufficient to cause the observed anomaly. Alternatively, a minimal correction subset comprises a minimal set of reasons whose deletion is sufficient to eliminate the observed anomaly. Circumscription formalizes such preference relations by associating propositional theories with minimal models. The resulting enumeration problem is addressed here by means of a new algorithm taking advantage of unsatisfiable core analysis. Empirical evidence of the efficiency of the algorithm is given by comparing the performance of the resulting solver, CIRCUMSCRIPTINO, with HCLASP, CAMUS MCS, LBX and MCSLS on the enumeration of minimal models for problems originating from practical applications. This paper is under consideration for acceptance in TPLP. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01423v1 |
PDF | http://arxiv.org/pdf/1707.01423v1.pdf |
PWC | https://paperswithcode.com/paper/model-enumeration-in-propositional |
Repo | |
Framework | |
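For intuition about the enumeration problem itself, the brute-force sketch below lists the subset-minimal models of a tiny propositional theory. The actual algorithm runs on top of a SAT solver with unsatisfiable-core analysis and scales far beyond this; the signed-integer clause encoding and the example theory here are illustrative choices only.

```python
# Naive reference for minimal-model enumeration: check every assignment
# of a small CNF theory, keep models with no model strictly below them.
from itertools import product

def models(clauses, n_vars):
    """Yield satisfying assignments as sets of true variables (1-indexed)."""
    for bits in product([False, True], repeat=n_vars):
        assign = {i + 1 for i, b in enumerate(bits) if b}
        if all(any((lit > 0) == (abs(lit) in assign) for lit in cl) for cl in clauses):
            yield assign

def minimal_models(clauses, n_vars):
    all_models = list(models(clauses, n_vars))
    # Subset-minimal: no other model is a proper subset.
    return [m for m in all_models if not any(other < m for other in all_models)]

# (a or b) and (not a or c): minimal models are {b} and {a, c}.
print(minimal_models([[1, 2], [-1, 3]], 3))
```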
Small Moving Window Calibration Models for Soft Sensing Processes with Limited History
Title | Small Moving Window Calibration Models for Soft Sensing Processes with Limited History |
Authors | Casey Kneale, Steven D. Brown |
Abstract | Five simple soft sensor methodologies with two update conditions were compared on two experimentally obtained datasets and one simulated dataset. The soft sensors investigated were moving window partial least squares regression (and a recursive variant), moving window random forest regression, the mean moving window of $y$, and a novel random forest partial least squares regression ensemble (RF-PLS), all of which can be used with small sample sizes so that they can be rapidly placed online. It was found that, on two of the datasets studied, small window sizes led to the lowest prediction errors for all of the moving window methods studied. On the majority of datasets studied, the RF-PLS calibration method offered the lowest one-step-ahead prediction errors compared to those of the other methods, and it demonstrated greater predictive stability at larger time delays than moving window PLS alone. It was found that both the random forest and RF-PLS methods most adequately modeled the datasets that did not feature purely monotonic increases in property values, but that both methods performed more poorly than moving window PLS models on one dataset with purely monotonic property values. Other data-dependent findings are presented and discussed. |
Tasks | Calibration |
Published | 2017-10-31 |
URL | http://arxiv.org/abs/1710.11595v3 |
PDF | http://arxiv.org/pdf/1710.11595v3.pdf |
PWC | https://paperswithcode.com/paper/small-moving-window-calibration-models-for |
Repo | |
Framework | |
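One of the compared baselines, moving-window PLS with one-step-ahead prediction, is easy to sketch. The toy version below refits a PLS model on the latest window at every time step and forecasts the next sample; the window size, component count, and drifting synthetic process are illustrative assumptions, not the paper's experimental settings.

```python
# Moving-window PLS soft sensor: refit on the latest `window` samples,
# then predict the next one.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
T, window = 300, 20
X = rng.normal(size=(T, 5))
drift = np.linspace(0.0, 2.0, T)                        # slow process drift
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + drift + 0.1 * rng.normal(size=T)

sq_errors = []
for t in range(window, T):
    model = PLSRegression(n_components=2)
    model.fit(X[t - window:t], y[t - window:t])         # small recent window only
    y_hat = float(model.predict(X[t:t + 1]).ravel()[0])  # one-step-ahead forecast
    sq_errors.append((y[t] - y_hat) ** 2)
print("one-step-ahead RMSE:", float(np.sqrt(np.mean(sq_errors))))
```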
Characteristic and Universal Tensor Product Kernels
Title | Characteristic and Universal Tensor Product Kernels |
Authors | Zoltan Szabo, Bharath K. Sriperumbudur |
Abstract | Maximum mean discrepancy (MMD), also called energy distance or N-distance in statistics, and the Hilbert-Schmidt independence criterion (HSIC), known as distance covariance in statistics, are among the most popular and successful approaches for quantifying the difference and independence of random variables, respectively. Thanks to their kernel-based foundations, MMD and HSIC are applicable to a wide variety of domains. Despite their tremendous success, little is known about when HSIC characterizes independence and when MMD with a tensor product kernel can discriminate probability distributions. In this paper, we answer these questions by studying various notions of the characteristic property of the tensor product kernel. |
Tasks | |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08157v4 |
PDF | http://arxiv.org/pdf/1708.08157v4.pdf |
PWC | https://paperswithcode.com/paper/characteristic-and-universal-tensor-product |
Repo | |
Framework | |
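A quick way to see the object under study: HSIC is the squared MMD between the joint distribution and the product of the marginals, computed with the tensor product of the two kernels. The numpy sketch below implements the standard biased HSIC estimator with Gaussian kernels; the bandwidths and toy data are arbitrary choices.

```python
# Biased V-statistic estimate of HSIC with Gaussian kernels.
import numpy as np

def gauss_kernel(Z, sigma=1.0):
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """HSIC(X, Y): squared MMD under the tensor product kernel."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    K, L = gauss_kernel(X, sigma), gauss_kernel(Y, sigma)
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
print("independent:", hsic(X, rng.normal(size=(500, 2))))  # near zero
print("dependent:  ", hsic(X, X ** 2))                      # clearly positive
```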
MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information
Title | MIML-FCN+: Multi-instance Multi-label Learning via Fully Convolutional Networks with Privileged Information |
Authors | Hao Yang, Joey Tianyi Zhou, Jianfei Cai, Yew Soon Ong |
Abstract | Multi-instance multi-label (MIML) learning has many interesting applications in computer vision, including multi-object recognition and automatic image tagging. In these applications, additional information such as bounding boxes, image captions, and descriptions is often available during the training phase; it is referred to as privileged information (PI). However, as existing works on learning using PI only consider instance-level PI (privileged instances), they fail to make use of bag-level PI (privileged bags) available in MIML learning. Therefore, in this paper, we propose a two-stream fully convolutional network, named MIML-FCN+, unified by a novel PI loss, to solve the problem of MIML learning with privileged bags. Compared to previous works on PI, the proposed MIML-FCN+ utilizes readily available privileged bags instead of hard-to-obtain privileged instances, making the system more general and practical in real-world applications. As the proposed PI loss is convex and SGD-compatible and the framework itself is a fully convolutional network, MIML-FCN+ can be easily integrated with state-of-the-art deep learning networks. Moreover, the flexibility of convolutional layers allows us to exploit structured correlations among instances to facilitate more effective training and testing. Experimental results on three benchmark datasets demonstrate the effectiveness of the proposed MIML-FCN+, which outperforms state-of-the-art methods in the application of multi-object recognition. |
Tasks | Image Captioning, Multi-Label Learning, Object Recognition |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08681v1 |
PDF | http://arxiv.org/pdf/1702.08681v1.pdf |
PWC | https://paperswithcode.com/paper/miml-fcn-multi-instance-multi-label-learning |
Repo | |
Framework | |
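To make the two-stream idea concrete, here is a heavily simplified PyTorch sketch, not the paper's architecture or its exact PI loss: a main stream scores bags of instance features, a privileged stream scores bags of privileged features available only at training time, and a coupling term ties their bag-level outputs. The dimensions, max-pooling, and coupling weight are all assumptions.

```python
# Schematic two-stream MIML training step with a privileged (train-only)
# stream; the paper's actual FCN layers and PI loss differ.
import torch
import torch.nn as nn

class BagScorer(nn.Module):
    def __init__(self, in_dim, n_labels):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_labels))
    def forward(self, bag):                      # bag: (n_instances, in_dim)
        return self.net(bag).max(dim=0).values   # max-pool instances -> bag logits

main, priv = BagScorer(128, 10), BagScorer(300, 10)
opt = torch.optim.Adam(list(main.parameters()) + list(priv.parameters()), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

bag_x = torch.randn(7, 128)      # one training bag: 7 instance feature vectors
bag_pi = torch.randn(3, 300)     # privileged bag (e.g., caption embeddings)
labels = torch.zeros(10); labels[[1, 4]] = 1.0

opt.zero_grad()
scores, pi_scores = main(bag_x), priv(bag_pi)
loss = (bce(scores, labels) + bce(pi_scores, labels)
        + 0.1 * ((scores - pi_scores.detach()) ** 2).mean())  # coupling term
loss.backward()
opt.step()                        # at test time, only `main` is used
```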
Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks
Title | Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks |
Authors | Devinder Kumar, Alexander Wong, Graham W. Taylor |
Abstract | In this work, we propose CLass-Enhanced Attentive Response (CLEAR): an approach to visualize and understand the decisions made by deep neural networks (DNNs) given a specific input. CLEAR facilitates the visualization of attentive regions and levels of interest of DNNs during the decision-making process. It also enables the visualization of the most dominant classes associated with these attentive regions of interest. As such, CLEAR can mitigate some of the shortcomings of heatmap-based methods associated with decision ambiguity, and allows for better insights into the decision-making process of DNNs. Quantitative and qualitative experiments across three different datasets demonstrate the efficacy of CLEAR for gaining a better understanding of the inner workings of DNNs during the decision-making process. |
Tasks | Decision Making |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.04133v2 |
PDF | http://arxiv.org/pdf/1704.04133v2.pdf |
PWC | https://paperswithcode.com/paper/explaining-the-unexplained-a-class-enhanced |
Repo | |
Framework | |
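CLEAR proper back-projects attentive responses through the network; as a rough, hedged stand-in that produces the same two visualizations (a per-pixel dominant-class map and its attention level), the sketch below uses plain gradient saliency per class instead. The tiny model and random input are placeholders.

```python
# Gradient-saliency stand-in for CLEAR's two maps: per-pixel dominant class
# and the attention level at that pixel.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(8 * 28 * 28, 10))
x = torch.randn(1, 1, 28, 28, requires_grad=True)

per_class = []
for c in range(10):
    model.zero_grad()
    x.grad = None                        # reset input gradient between classes
    model(x)[0, c].backward()
    per_class.append(x.grad[0, 0].abs().clone())

sal = torch.stack(per_class)             # (n_classes, H, W)
dominant_class = sal.argmax(dim=0)       # which class each pixel attends to most
attention_level = sal.max(dim=0).values  # how strongly it attends
print(dominant_class.shape, attention_level.shape)
```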
Scalable Greedy Feature Selection via Weak Submodularity
Title | Scalable Greedy Feature Selection via Weak Submodularity |
Authors | Rajiv Khanna, Ethan Elenberg, Alexandros G. Dimakis, Sahand Negahban, Joydeep Ghosh |
Abstract | Greedy algorithms are widely used for problems in machine learning such as feature selection and set function optimization. Unfortunately, for large datasets, the running time of even greedy algorithms can be quite high. This is because each greedy step requires refitting a model or calculating a function using the previously selected choices and the new candidate. Two algorithms that are faster approximations to greedy forward selection were introduced recently ([Mirzasoleiman et al. 2013, 2015]). They achieve better performance by exploiting distributed computation and stochastic evaluation, respectively. Both algorithms have provable performance guarantees for submodular functions. In this paper we show that, contrary to previously held opinion, submodularity is not required to obtain approximation guarantees for these two algorithms. Specifically, we show that a generalized concept of weak submodularity suffices to give multiplicative approximation guarantees. Our result extends the applicability of these algorithms to a larger class of functions. Furthermore, we show that a bounded submodularity ratio can be used to provide data-dependent bounds that can sometimes be tighter even for submodular functions. We empirically validate our work by showing the superior performance of fast greedy approximations over several established baselines on artificial and real datasets. |
Tasks | Feature Selection |
Published | 2017-03-08 |
URL | http://arxiv.org/abs/1703.02723v1 |
PDF | http://arxiv.org/pdf/1703.02723v1.pdf |
PWC | https://paperswithcode.com/paper/scalable-greedy-feature-selection-via-weak |
Repo | |
Framework | |
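The stochastic-greedy variant analyzed in the paper is simple to sketch: each step scores only a random subsample of the remaining features and adds the best one. Below, the set function is the R² of a least-squares fit, an example of a weakly submodular function; the subsample size and synthetic data are illustrative choices.

```python
# Stochastic-greedy forward feature selection with an R^2 set function.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = X[:, [3, 17, 41]] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.normal(size=200)

def r2(S):
    """Goodness of fit of least squares restricted to feature set S."""
    return LinearRegression().fit(X[:, S], y).score(X[:, S], y) if S else 0.0

selected, k, sample_size = [], 3, 10
for _ in range(k):
    rest = [j for j in range(50) if j not in selected]
    # Evaluate only a random subsample of candidates, not all of them.
    candidates = rng.choice(rest, size=min(sample_size, len(rest)), replace=False)
    best = max(candidates, key=lambda j: r2(selected + [j]))
    selected.append(int(best))
print("selected features:", sorted(selected))
```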
Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling
Title | Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling |
Authors | Hairong Liu, Zhenyao Zhu, Xiangang Li, Sanjeev Satheesh |
Abstract | Most existing sequence labelling models rely on a fixed decomposition of a target sequence into a sequence of basic units. These methods suffer from two major drawbacks: 1) the set of basic units is fixed, such as the set of words, characters, or phonemes in speech recognition, and 2) the decomposition of target sequences is fixed. These drawbacks usually result in sub-optimal performance when modeling sequences. In this paper, we extend the popular CTC loss criterion to alleviate these limitations, and propose a new loss function called Gram-CTC. While preserving the advantages of CTC, Gram-CTC automatically learns the best set of basic units (grams), as well as the most suitable decomposition of target sequences. Unlike CTC, Gram-CTC allows the model to output a variable number of characters at each time step, which enables the model to capture longer-term dependencies and improves computational efficiency. We demonstrate that the proposed Gram-CTC improves on CTC in terms of both performance and efficiency on a large vocabulary speech recognition task at multiple scales of data, and that with Gram-CTC we can outperform the state of the art on a standard speech benchmark. |
Tasks | Large Vocabulary Continuous Speech Recognition, Speech Recognition |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00096v2 |
PDF | http://arxiv.org/pdf/1703.00096v2.pdf |
PWC | https://paperswithcode.com/paper/gram-ctc-automatic-unit-selection-and-target |
Repo | |
Framework | |
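For intuition about what Gram-CTC marginalizes over, the snippet below enumerates every decomposition of a target string into units drawn from a gram inventory. Gram-CTC sums over these decompositions (interleaved with blank alignments) efficiently via forward-backward rather than by enumeration; the toy gram set here is made up.

```python
# All ways to decompose a target string into units from a gram set.
def decompositions(target, grams):
    if not target:
        return [[]]
    out = []
    for g in grams:
        if target.startswith(g):
            # Take gram g, then decompose the remaining suffix recursively.
            out.extend([g] + rest for rest in decompositions(target[len(g):], grams))
    return out

grams = {"t", "h", "e", "th", "he", "the"}
for d in decompositions("the", grams):
    print(d)   # ['t','h','e'], ['t','he'], ['th','e'], ['the']
```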
Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm
Title | Fractional Local Neighborhood Intensity Pattern for Image Retrieval using Genetic Algorithm |
Authors | Shuvozit Ghose, Abhirup Das, Ayan Kumar Bhunia, Partha Pratim Roy |
Abstract | In this paper, a new texture descriptor named “Fractional Local Neighborhood Intensity Pattern” (FLNIP) has been proposed for content-based image retrieval (CBIR). It is an extension of the Local Neighborhood Intensity Pattern (LNIP) [1]. FLNIP calculates the relative intensity difference between a particular pixel and the center pixel of a 3x3 window by considering the relationship with adjacent neighbors. In this work, the fractional change in the local neighborhood involving the adjacent neighbors is first calculated with respect to one of the eight neighbors of the center pixel of a 3x3 window. Next, the fractional change is calculated with respect to the center itself. The two values of fractional change are then compared to generate a binary bit pattern. Both sign and magnitude information are encoded in a single descriptor, since it deals with the relative change in magnitude in the adjacent neighborhood, i.e., the comparison of the fractional changes. The descriptor is applied to four multi-resolution images – one being the raw image and the other three being Gaussian-filtered images obtained by applying Gaussian filters of different standard deviations to the raw image, to signify the importance of exploring texture information at different resolutions in an image. The four sets of distances obtained between the query and the target image are then combined with a genetic algorithm-based approach to improve the retrieval performance by minimizing the distance between images of the same class. The performance of the method has been tested for image retrieval on four popular databases. The precision and recall values observed on these databases have been compared with recent state-of-the-art local patterns. The proposed method has shown a significant improvement over many other existing methods. |
Tasks | Content-Based Image Retrieval, Image Retrieval |
Published | 2017-12-30 |
URL | https://arxiv.org/abs/1801.00187v3 |
PDF | https://arxiv.org/pdf/1801.00187v3.pdf |
PWC | https://paperswithcode.com/paper/fractional-local-neighborhood-intensity |
Repo | |
Framework | |
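FLNIP's exact encoding compares fractional intensity changes against both a neighbor and the center pixel; as a much simpler relative of the same descriptor family, the sketch below computes the classic local binary pattern (LBP) and the histogram that retrieval systems match on. This is explicitly not FLNIP, just its best-known ancestor.

```python
# Classic 8-neighbor LBP descriptor plus histogram, the family FLNIP extends.
import numpy as np

def lbp(img):
    """8-neighbor LBP code for each interior pixel of a 2-D grayscale image."""
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Neighbor plane shifted by (dy, dx), aligned with the interior.
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.int32)
hist = np.bincount(lbp(img).ravel(), minlength=256)  # descriptor for matching
print(hist.sum())  # 62 * 62 interior pixels
```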
Reducing Deep Network Complexity with Fourier Transform Methods
Title | Reducing Deep Network Complexity with Fourier Transform Methods |
Authors | Andrew Kiruluta |
Abstract | We propose a novel approach that uses shallow, densely connected neural network architectures to achieve performance superior to convolutional neural network (CNN) approaches, with the added benefits of a lower computational burden and dramatically fewer training examples needed to achieve high prediction accuracy ($>98\%$). The advantages of our proposed method are demonstrated by results on benchmark datasets, which show significant performance gains over existing state-of-the-art results on MNIST, CIFAR-10, and CIFAR-100. By Fourier transforming the inputs, each point in the training sample carries a representational energy of all the weighted information from every other point. The consequence of using this input is a reduced-complexity neural network, a reduced computational load, and the lifting of the requirement for a large number of training examples to achieve high classification accuracy. |
Tasks | |
Published | 2017-12-15 |
URL | http://arxiv.org/abs/1801.01451v2 |
PDF | http://arxiv.org/pdf/1801.01451v2.pdf |
PWC | https://paperswithcode.com/paper/reducing-deep-network-complexity-with-fourier |
Repo | |
Framework | |
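The central recipe is easy to try end to end: Fourier-transform each image and feed the magnitudes to a small densely connected network. The sketch below does this on scikit-learn's 8x8 digits rather than MNIST, with arbitrary layer sizes, so the number it prints says nothing about the paper's reported accuracy.

```python
# FFT-magnitude features into a small dense network (sklearn digits, 8x8).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
feats = np.abs(np.fft.fft2(digits.images))           # 2-D FFT magnitude per image
feats = feats.reshape(len(feats), -1) / feats.max()  # flatten and rescale
X_tr, X_te, y_tr, y_te = train_test_split(feats, digits.target, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```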
Iteratively reweighted $\ell_1$ algorithms with extrapolation
Title | Iteratively reweighted $\ell_1$ algorithms with extrapolation |
Authors | Peiran Yu, Ting Kei Pong |
Abstract | The iteratively reweighted $\ell_1$ algorithm is a popular method for solving a large class of optimization problems whose objective is the sum of a Lipschitz differentiable loss function and a possibly nonconvex sparsity-inducing regularizer. In this paper, motivated by the success of extrapolation techniques in accelerating first-order methods, we study how widely used extrapolation techniques such as those in [4,5,22,28] can be incorporated to possibly accelerate the iteratively reweighted $\ell_1$ algorithm. We consider three versions of such algorithms. For each version, we exhibit an explicitly checkable condition on the extrapolation parameters so that the generated sequence provably clusters at a stationary point of the optimization problem. We also investigate global convergence under additional Kurdyka-Łojasiewicz assumptions on certain potential functions. Our numerical experiments show that our algorithms usually outperform the general iterative shrinkage and thresholding algorithm in [21] and an adaptation of the iteratively reweighted $\ell_1$ algorithm in [23, Algorithm 7] with nonmonotone line search for solving random instances of log-penalty regularized least squares problems, in terms of both CPU time and solution quality. |
Tasks | |
Published | 2017-10-22 |
URL | http://arxiv.org/abs/1710.07886v2 |
PDF | http://arxiv.org/pdf/1710.07886v2.pdf |
PWC | https://paperswithcode.com/paper/iteratively-reweighted-ell_1-algorithms-with |
Repo | |
Framework | |
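A minimal version of the scheme for the log-penalty least-squares problem $\min_x \tfrac{1}{2}\|Ax-b\|^2 + \lambda \sum_i \log(1+|x_i|/\epsilon)$ looks as follows: each iteration takes a proximal-gradient step at an extrapolated point, with $\ell_1$ weights refreshed from the current iterate. The fixed extrapolation parameter and problem sizes below are illustrative choices, not the paper's parameter rules.

```python
# Iteratively reweighted l1 with extrapolation for log-penalty least squares.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 300))
x_true = np.zeros(300)
x_true[rng.choice(300, 10, replace=False)] = rng.normal(size=10)
b = A @ x_true + 0.01 * rng.normal(size=100)

lam, eps, beta = 0.1, 0.1, 0.5
L = np.linalg.norm(A, 2) ** 2                 # Lipschitz constant of the gradient
x = x_prev = np.zeros(300)
for _ in range(500):
    y = x + beta * (x - x_prev)               # extrapolation step
    w = lam / (eps + np.abs(x))               # reweighted l1 weights
    z = y - A.T @ (A @ y - b) / L             # gradient step at extrapolated point
    x_prev, x = x, np.sign(z) * np.maximum(np.abs(z) - w / L, 0.0)  # weighted shrinkage
print("nonzeros recovered:", int(np.count_nonzero(np.abs(x) > 1e-6)))
```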
A Procedural Texture Generation Framework Based on Semantic Descriptions
Title | A Procedural Texture Generation Framework Based on Semantic Descriptions |
Authors | Junyu Dong, Lina Wang, Jun Liu, Xin Sun |
Abstract | Procedural textures are normally generated from mathematical models with parameters carefully selected by experienced users. However, for naive users, the intuitive way to obtain a desired texture is to provide semantic descriptions such as “regular,” “lacelike,” and “repetitive,” after which a procedural model with proper parameters is automatically suggested to generate the corresponding textures. By contrast, it is less practical for users to learn mathematical models and tune parameters based on multiple examinations of large numbers of generated textures. In this study, we propose a novel framework that generates procedural textures according to user-defined semantic descriptions, and we establish a mapping between procedural models and semantic texture descriptions. First, based on a vocabulary of semantic attributes collected from psychophysical experiments, a multi-label learning method is employed to annotate a large number of textures with semantic attributes, forming a semantic procedural texture dataset. Then, we derive a low-dimensional semantic space in which the semantic descriptions can be separated from one another. Finally, given a set of semantic descriptions, the diverse properties of the samples in the semantic space lead the framework to find an appropriate generation model that uses appropriate parameters to produce a desired texture. The experimental results show that the proposed framework is effective and that the generated textures closely correlate with the input semantic descriptions. |
Tasks | Multi-Label Learning, Texture Synthesis |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.04141v1 |
PDF | http://arxiv.org/pdf/1704.04141v1.pdf |
PWC | https://paperswithcode.com/paper/a-procedural-texture-generation-framework |
Repo | |
Framework | |
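The retrieval step of the framework can be caricatured in a few lines: annotated textures live in a low-dimensional semantic space, and a query vector of attributes is matched to its nearest annotated sample, whose procedural model and parameters are suggested. Every attribute name, generator name, and data value below is a hypothetical placeholder.

```python
# Toy semantic-space lookup: attribute query -> nearest annotated texture
# -> its procedural model and parameters.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

# Each row annotates one texture over the attribute vocabulary
# ["regular", "lacelike", "repetitive", "rough"]; generators are paired.
annotations = np.array([[1, 0, 1, 0], [0, 1, 0, 0],
                        [1, 0, 1, 1], [0, 0, 0, 1]], dtype=float)
generators = [("checker", {"period": 8}), ("lace", {"density": 0.3}),
              ("weave", {"period": 4}), ("noise", {"octaves": 5})]

space = PCA(n_components=2).fit(annotations)          # low-dim semantic space
index = NearestNeighbors(n_neighbors=1).fit(space.transform(annotations))

query = np.array([[1, 0, 1, 0]], dtype=float)         # "regular, repetitive"
_, nn = index.kneighbors(space.transform(query))
print("suggested model:", generators[nn[0][0]])
```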
Residual Parameter Transfer for Deep Domain Adaptation
Title | Residual Parameter Transfer for Deep Domain Adaptation |
Authors | Artem Rozantsev, Mathieu Salzmann, Pascal Fua |
Abstract | The goal of Deep Domain Adaptation is to make it possible to use Deep Nets trained in one domain, where there is enough annotated training data, in another where there is little or none. Most current approaches have focused on learning feature representations that are invariant to the changes that occur when going from one domain to the other, which means using the same network parameters in both domains. While some recent algorithms explicitly model the changes by adapting the network parameters, they either severely restrict the possible domain changes or significantly increase the number of model parameters. By contrast, we introduce a network architecture that includes auxiliary residual networks, which we train to predict the parameters in the domain with little annotated data from those in the other one. This architecture enables us to flexibly preserve the similarities between domains where they exist and model the differences when necessary. We demonstrate that our approach yields higher accuracy than state-of-the-art methods without undue complexity. |
Tasks | Domain Adaptation |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07714v1 |
PDF | http://arxiv.org/pdf/1711.07714v1.pdf |
PWC | https://paperswithcode.com/paper/residual-parameter-transfer-for-deep-domain |
Repo | |
Framework | |
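The core mechanism admits a schematic sketch: a small auxiliary transform maps each frozen source-domain weight matrix to a residual, and the target domain uses W_t = W_s + residual, training only the auxiliary parameters on the scarce target data. The shapes, tanh nonlinearity, and zero initialization of B (so that W_t starts at W_s) are assumptions in the spirit of the paper, not its exact parameterization.

```python
# Residual parameter transfer, schematically: target weights are predicted
# from frozen source weights plus a learned low-rank residual.
import torch
import torch.nn.functional as F

W_s = torch.randn(32, 16)                      # frozen source-domain layer weights
b_s = torch.randn(32)
A = (0.1 * torch.randn(8, 32)).requires_grad_()  # auxiliary residual transform
B = torch.zeros(32, 8, requires_grad=True)       # zero init: W_t == W_s at start

def target_layer(x):
    W_t = W_s + B @ torch.tanh(A @ W_s)        # predicted target-domain weights
    return F.linear(x, W_t, b_s)

opt = torch.optim.Adam([A, B], lr=1e-3)        # only auxiliary params are trained
x = torch.randn(4, 16)                          # a few annotated target samples
y = torch.randn(4, 32)
opt.zero_grad()
loss = F.mse_loss(target_layer(x), y)
loss.backward()
opt.step()
```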