July 26, 2019

3046 words 15 mins read

Paper Group ANR 759

Generating Different Story Tellings from Semantic Representations of Narrative. Exploring the Regularity of Sparse Structure in Convolutional Neural Networks. Conceptual Text Summarizer: A new model in continuous vector space. Relation Extraction : A Survey. Coordinate Descent with Bandit Sampling. Message Passing Stein Variational Gradient Descent …

Generating Different Story Tellings from Semantic Representations of Narrative

Title Generating Different Story Tellings from Semantic Representations of Narrative
Authors Elena Rishes, Stephanie M. Lukin, David K. Elson, Marilyn A. Walker
Abstract In order to tell stories in different voices for different audiences, interactive story systems require: (1) a semantic representation of story structure, and (2) the ability to automatically generate story and dialogue from this semantic representation using some form of Natural Language Generation (NLG). However, there has been limited research on methods for linking story structures to narrative descriptions of scenes and story events. In this paper we present an automatic method for converting from Scheherazade’s story intention graph, a semantic representation, to the input required by the Personage NLG engine. Using 36 Aesop Fables distributed in DramaBank, a collection of story encodings, we train translation rules on one story and then test these rules by generating text for the remaining 35. The results are measured in terms of the string similarity metrics Levenshtein Distance and BLEU score. The results show that we can generate the 35 stories with correct content: the test set stories on average are close to the output of the Scheherazade realizer, which was customized to this semantic representation. We provide some examples of story variations generated by Personage. In future work, we will experiment with measuring the quality of the same stories generated in different voices, and with techniques for making storytelling interactive.
Tasks Text Generation
Published 2017-08-29
URL http://arxiv.org/abs/1708.08573v1
PDF http://arxiv.org/pdf/1708.08573v1.pdf
PWC https://paperswithcode.com/paper/generating-different-story-tellings-from
Repo
Framework
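
The evaluation above relies on two surface-similarity measures, Levenshtein distance and BLEU. As a minimal, illustrative sketch (not the authors’ evaluation code), the snippet below compares an invented generated sentence against an invented reference using a hand-rolled edit distance and NLTK’s sentence-level BLEU; NLTK is assumed to be installed.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Invented reference and generated retellings of the same story event.
reference = "the crow dropped the cheese and the fox grabbed it"
generated = "the crow let the cheese fall and the fox seized it"

print("Levenshtein:", levenshtein(reference, generated))
print("BLEU:", sentence_bleu([reference.split()], generated.split(),
                             smoothing_function=SmoothingFunction().method1))
```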

Exploring the Regularity of Sparse Structure in Convolutional Neural Networks

Title Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Authors Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally
Abstract Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next-generation DNN accelerators such as TPU. The structure of sparsity, i.e., the granularity of pruning, affects the efficiency of hardware accelerator design as well as the prediction accuracy. Coarse-grained pruning creates regular sparsity patterns, making it more amenable to hardware acceleration but more challenging to maintain the same accuracy. In this paper we quantitatively measure the trade-off between sparsity regularity and prediction accuracy, providing insights into how to maintain accuracy while using a more structured sparsity pattern. Our experimental results show that coarse-grained pruning can achieve a sparsity ratio similar to unstructured pruning without loss of accuracy. Moreover, due to the index saving effect, coarse-grained pruning is able to obtain a better compression ratio than fine-grained sparsity at the same accuracy threshold. Based on the recent sparse convolutional neural network accelerator (SCNN), our experiments further demonstrate that coarse-grained sparsity saves about 2x the memory references compared to fine-grained sparsity. Since a memory reference is more than two orders of magnitude more expensive than an arithmetic operation, the regularity of sparse structure leads to more efficient hardware design.
Tasks
Published 2017-05-24
URL http://arxiv.org/abs/1705.08922v3
PDF http://arxiv.org/pdf/1705.08922v3.pdf
PWC https://paperswithcode.com/paper/exploring-the-regularity-of-sparse-structure
Repo
Framework
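
To make the fine- vs coarse-grained distinction above concrete, here is a small NumPy sketch (not the paper’s code) that prunes one weight matrix two ways at the same target sparsity: by individual weight magnitude, and by the L1 norm of contiguous 1x4 blocks, whose regular zero pattern is what makes indexing cheaper in hardware.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
sparsity = 0.75  # fraction of weights to remove

# Fine-grained: threshold individual weight magnitudes.
thresh = np.quantile(np.abs(W), sparsity)
fine = np.where(np.abs(W) > thresh, W, 0.0)

# Coarse-grained: prune contiguous 1x4 blocks ranked by block L1 norm.
blocks = np.abs(W).reshape(64, 16, 4).sum(axis=-1)    # L1 norm of each block
block_thresh = np.quantile(blocks, sparsity)
mask = np.repeat(blocks > block_thresh, 4, axis=1)    # expand block mask to weights
coarse = W * mask

print("fine-grained zeros:  ", np.mean(fine == 0))
print("coarse-grained zeros:", np.mean(coarse == 0))
```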

Conceptual Text Summarizer: A new model in continuous vector space

Title Conceptual Text Summarizer: A new model in continuous vector space
Authors Mohammad Ebrahim Khademi, Mohammad Fakhredanesh, Seyed Mojtaba Hoseini
Abstract Traditional methods of summarization are neither cost-effective nor practical today. Extractive summarization is a process that helps to extract the most important sentences from a text automatically and generates a short informative summary. In this work, we propose an unsupervised method to summarize Persian texts. This method is a novel hybrid approach that clusters the concepts of the text using deep learning and traditional statistical methods. First, we produce a word embedding based on the Hamshahri2 corpus and a dictionary of word frequencies. Then the proposed algorithm extracts the keywords of the document, clusters its concepts, and finally ranks the sentences to produce the summary. We evaluated the proposed method on the Pasokh single-document corpus using the ROUGE evaluation measure. Without using any hand-crafted features, our proposed method achieves state-of-the-art results. We compared our unsupervised method with the best supervised Persian methods and achieved an overall improvement of 7.5% in ROUGE-2 recall.
Tasks
Published 2017-10-30
URL http://arxiv.org/abs/1710.10994v3
PDF http://arxiv.org/pdf/1710.10994v3.pdf
PWC https://paperswithcode.com/paper/conceptual-text-summarizer-a-new-model-in
Repo
Framework
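
The pipeline above (embed words, cluster concepts, rank sentences) can be illustrated with a much-simplified sketch: a tiny Word2Vec model trained on a toy English document stands in for the Hamshahri2 embedding, and sentence vectors rather than word-level concepts are clustered. gensim and scikit-learn are assumed; this is an illustration of the idea, not the authors’ implementation.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

document = [
    "the economy grew faster than expected this quarter",
    "analysts credit strong exports for the growth",
    "the football team lost its third match in a row",
    "fans blamed the coach for the losing streak",
    "export demand is expected to stay strong next year",
]
tokens = [s.split() for s in document]

# Word embedding (stand-in for an embedding trained on a large corpus).
w2v = Word2Vec(sentences=tokens, vector_size=50, min_count=1, seed=0)

def sent_vec(words):
    return np.mean([w2v.wv[w] for w in words], axis=0)

X = np.array([sent_vec(s) for s in tokens])

# Cluster the sentence vectors, then keep the sentence closest to each
# cluster centre as the extractive summary.
k = 2
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
summary_ids = sorted(
    int(np.argmin(np.linalg.norm(X - c, axis=1))) for c in km.cluster_centers_
)
for i in summary_ids:
    print(document[i])
```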

Relation Extraction : A Survey

Title Relation Extraction : A Survey
Authors Sachin Pawar, Girish K. Palshikar, Pushpak Bhattacharyya
Abstract With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access and management of knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities such as persons and organizations form the most basic units of this information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of a person and an organization in a sentence may be linked through relations such as “employed at”. The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some recent trends in RE techniques and possible future research directions. This survey would be useful for three kinds of readers: i) newcomers in the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.
Tasks Information Retrieval, Open Information Extraction, Question Answering, Relation Extraction
Published 2017-12-14
URL http://arxiv.org/abs/1712.05191v1
PDF http://arxiv.org/pdf/1712.05191v1.pdf
PWC https://paperswithcode.com/paper/relation-extraction-a-survey
Repo
Framework

Coordinate Descent with Bandit Sampling

Title Coordinate Descent with Bandit Sampling
Authors Farnood Salehi, Patrick Thiran, L. Elisa Celis
Abstract Coordinate descent methods usually minimize a cost function by updating a random decision variable (corresponding to one coordinate) at a time. Ideally, we would update the decision variable that yields the largest decrease in the cost function. However, finding this coordinate would require checking all of them, which would effectively negate the improvement in computational tractability that coordinate descent is intended to afford. To address this, we propose a new adaptive method for selecting a coordinate. First, we find a lower bound on the amount the cost function decreases when a coordinate is updated. We then use a multi-armed bandit algorithm to learn which coordinates result in the largest lower bound, interleaving this learning with conventional coordinate descent updates, except that the coordinate is selected proportionally to the expected decrease. We show that our approach improves the convergence of coordinate descent methods both theoretically and experimentally.
Tasks
Published 2017-12-08
URL http://arxiv.org/abs/1712.03010v2
PDF http://arxiv.org/pdf/1712.03010v2.pdf
PWC https://paperswithcode.com/paper/coordinate-descent-with-bandit-sampling
Repo
Framework
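
A toy sketch of the selection idea above: coordinate descent on a least-squares objective where a running estimate of each coordinate’s cost decrease plays the role of the bandit reward, and coordinates are sampled proportionally to it. This is an illustrative heuristic in the spirit of the paper, not its algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def cost(x):
    return 0.5 * np.sum((A @ x - b) ** 2)

x = np.zeros(d)
gain = np.ones(d)                 # optimistic estimate of per-coordinate decrease
col_sq = np.sum(A ** 2, axis=0)   # per-coordinate curvature

for t in range(2000):
    probs = gain / gain.sum()
    j = rng.choice(d, p=probs)             # bandit-style proportional sampling
    residual = A @ x - b
    grad_j = A[:, j] @ residual
    step = grad_j / col_sq[j]              # exact minimizer along coordinate j
    decrease = 0.5 * grad_j ** 2 / col_sq[j]
    x[j] -= step
    gain[j] = 0.9 * gain[j] + 0.1 * (decrease + 1e-8)  # update the reward estimate

print("final cost:", cost(x))
```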

Message Passing Stein Variational Gradient Descent

Title Message Passing Stein Variational Gradient Descent
Authors Jingwei Zhuo, Chang Liu, Jiaxin Shi, Jun Zhu, Ning Chen, Bo Zhang
Abstract Stein variational gradient descent (SVGD) is a recently proposed particle-based Bayesian inference method, which has attracted a lot of interest due to its remarkable approximation ability and particle efficiency compared to traditional variational inference and Markov Chain Monte Carlo methods. However, we observed that particles of SVGD tend to collapse to modes of the target distribution, and this particle degeneracy phenomenon becomes more severe with higher dimensions. Our theoretical analysis finds a negative correlation between dimensionality and the repulsive force of SVGD, which is responsible for this phenomenon. We propose Message Passing SVGD (MP-SVGD) to solve this problem. By leveraging the conditional independence structure of probabilistic graphical models (PGMs), MP-SVGD converts the original high-dimensional global inference problem into a set of lower-dimensional local problems over Markov blankets. Experimental results show its advantage over SVGD in preventing the repulsive force from vanishing in high-dimensional spaces, as well as its particle efficiency and approximation flexibility over other inference methods on graphical models.
Tasks Bayesian Inference
Published 2017-11-13
URL http://arxiv.org/abs/1711.04425v3
PDF http://arxiv.org/pdf/1711.04425v3.pdf
PWC https://paperswithcode.com/paper/message-passing-stein-variational-gradient
Repo
Framework
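
To make the repulsive-force term discussed above concrete, here is a minimal NumPy sketch of vanilla SVGD with an RBF kernel on a 2-D Gaussian target. MP-SVGD would instead run updates of this form per variable over its Markov blanket; that decomposition is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
target_mean = np.array([1.0, -1.0])

def score(x):
    # Gradient of log N(target_mean, I) evaluated at each particle.
    return -(x - target_mean)

def rbf_kernel(particles, h=0.5):
    diff = particles[:, None, :] - particles[None, :, :]   # x_i - x_j
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    gradK = diff / h ** 2 * K[..., None]   # gradient of k(x_j, x_i) w.r.t. x_j
    return K, gradK

particles = rng.normal(size=(100, 2))
for step in range(1000):
    K, gradK = rbf_kernel(particles)
    drift = K @ score(particles)       # pulls particles toward high density
    repulsion = gradK.sum(axis=1)      # pushes particles away from each other
    particles += 0.05 * (drift + repulsion) / len(particles)

print("particle mean:", particles.mean(axis=0))   # should approach target_mean
```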

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

Title Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
Authors Yuchen Dai, Zheng Huang, Yuting Gao, Youxuan Xu, Kai Chen, Jie Guo, Weidong Qiu
Abstract In this paper, we introduce a novel end-to-end framework for multi-oriented scene text detection from an instance-aware semantic segmentation perspective. We present Fused Text Segmentation Networks, which combine multi-level features during feature extraction, as text instances may rely on finer feature expression than general objects. The framework detects and segments text instances jointly and simultaneously, leveraging merits from both the semantic segmentation task and the region-proposal-based object detection task. Without involving any extra pipelines, our approach surpasses the current state of the art on multi-oriented scene text detection benchmarks, ICDAR2015 Incidental Scene Text and MSRA-TD500, reaching Hmean of 84.1% and 82.0% respectively. Moreover, we report a baseline on Total-Text, which contains curved text, suggesting the effectiveness of the proposed approach.
Tasks Multi-Oriented Scene Text Detection, Object Detection, Scene Text Detection, Semantic Segmentation
Published 2017-09-11
URL http://arxiv.org/abs/1709.03272v4
PDF http://arxiv.org/pdf/1709.03272v4.pdf
PWC https://paperswithcode.com/paper/fused-text-segmentation-networks-for-multi
Repo
Framework
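
A compact PyTorch sketch of the multi-level fusion idea described above: features from several backbone stages are upsampled to a common resolution, concatenated, and decoded into a per-pixel text/non-text map. The toy backbone and layer sizes are invented for illustration; this is not the FTSN architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedTextSegSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Toy backbone producing features at 1/2, 1/4 and 1/8 resolution.
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Fusion head: concatenated multi-level features -> text / non-text logits.
        self.head = nn.Conv2d(32 + 64 + 128, 2, 1)

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        size = f1.shape[-2:]
        fused = torch.cat([
            f1,
            F.interpolate(f2, size=size, mode="bilinear", align_corners=False),
            F.interpolate(f3, size=size, mode="bilinear", align_corners=False),
        ], dim=1)
        return self.head(fused)

logits = FusedTextSegSketch()(torch.randn(1, 3, 256, 256))
print(logits.shape)   # (1, 2, 128, 128) text / non-text score map
```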

Learning event representation: As sparse as possible, but not sparser

Title Learning event representation: As sparse as possible, but not sparser
Authors Tuan Do, James Pustejovsky
Abstract Selecting an optimal event representation is essential for event classification in real world contexts. In this paper, we investigate the application of qualitative spatial reasoning (QSR) frameworks for classification of human-object interaction in three dimensional space, in comparison with the use of quantitative feature extraction approaches for the same purpose. In particular, we modify QSRLib, a library that allows computation of Qualitative Spatial Relations and Calculi, and employ it for feature extraction, before inputting features into our neural network models. Using an experimental setup involving motion captures of human-object interaction as three dimensional inputs, we observe that the use of qualitative spatial features significantly improves the performance of our machine learning algorithm against our baseline, while quantitative features of similar kinds fail to deliver similar improvement. We also observe that sequential representations of QSR features yield the best classification performance. A result of our learning method is a simple approach to the qualitative representation of 3D activities as compositions of 2D actions that can be visualized and learned using 2-dimensional QSR.
Tasks Human-Object Interaction Detection
Published 2017-10-02
URL http://arxiv.org/abs/1710.00448v1
PDF http://arxiv.org/pdf/1710.00448v1.pdf
PWC https://paperswithcode.com/paper/learning-event-representation-as-sparse-as
Repo
Framework
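
The qualitative-versus-quantitative contrast above can be illustrated with a hedged sketch that maps raw 3-D hand and object positions to a few hand-rolled symbolic predicates (left/right, front/behind, above/below, touching) before any learning. This does not use QSRLib’s actual calculi; everything here is purely illustrative.

```python
import numpy as np

def qualitative_relations(hand, obj, touch_dist=0.05):
    """Map a pair of 3-D points to a coarse, symbolic feature vector."""
    dx, dy, dz = obj - hand
    return np.array([
        np.sign(dx),    # object to the left or right of the hand
        np.sign(dy),    # in front of or behind
        np.sign(dz),    # above or below
        float(np.linalg.norm(obj - hand) < touch_dist),  # touching or not
    ])

# One synthetic capture: 50 frames of hand and object positions.
rng = np.random.default_rng(0)
hand_traj = rng.normal(size=(50, 3))
obj_traj = rng.normal(size=(50, 3))
qsr_sequence = np.stack([qualitative_relations(h, o)
                         for h, o in zip(hand_traj, obj_traj)])
print(qsr_sequence.shape)   # (50, 4) symbolic features per frame
```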

Face Identification and Clustering

Title Face Identification and Clustering
Authors Atul Dhingra
Abstract In this thesis, we study two problems based on clustering algorithms. In the first problem, we study the role of visual attributes using an agglomerative clustering algorithm to whittle down the search area where the number of classes is high to improve the performance of clustering. We observe that as we add more attributes, the clustering performance increases overall. In the second problem, we study the role of clustering in aggregating templates in a 1:N open set protocol using multi-shot video as a probe. We observe that by increasing the number of clusters, the performance increases with respect to the baseline and reaches a peak, after which increasing the number of clusters causes the performance to degrade. Experiments are conducted using recently introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition datasets.
Tasks Face Identification, Face Recognition
Published 2017-04-26
URL http://arxiv.org/abs/1704.08328v1
PDF http://arxiv.org/pdf/1704.08328v1.pdf
PWC https://paperswithcode.com/paper/face-identification-and-clustering
Repo
Framework
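
For the second problem above, template aggregation via clustering can be sketched as follows: per-frame face descriptors from a probe video (random vectors here, standing in for real face embeddings) are grouped with scikit-learn’s agglomerative clustering, and each cluster is averaged into one template. The number of clusters is the knob whose effect the thesis studies.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
frames = rng.normal(size=(40, 128))    # per-frame face descriptors (stand-in)

n_clusters = 5                          # varying this is the experiment of interest
labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(frames)

# Average the descriptors in each cluster into a single aggregated template.
templates = np.stack([frames[labels == c].mean(axis=0) for c in range(n_clusters)])
print("templates:", templates.shape)    # (5, 128), one template per cluster
```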

Learning to Pour

Title Learning to Pour
Authors Yongqiang Huang, Yu Sun
Abstract Pouring is a simple task people perform daily. It is the second most frequently executed motion in cooking scenarios, after pick-and-place. We present a pouring trajectory generation approach, which uses force feedback from the cup to determine the future velocity of pouring. The approach uses recurrent neural networks as its building blocks. We collected pouring demonstrations, which we used for training. To test our approach in simulation, we also created and trained a force estimation system. The simulated experiments show that the system is able to generalize to a single unseen element of the pouring characteristics.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09021v1
PDF http://arxiv.org/pdf/1705.09021v1.pdf
PWC https://paperswithcode.com/paper/learning-to-pour
Repo
Framework
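
A hedged PyTorch sketch of the idea above: a recurrent network reads a sequence of force readings from the cup and predicts the pouring velocity at each step. The dimensions and synthetic data are invented, and the authors’ network and training data are not reproduced here.

```python
import torch
import torch.nn as nn

class PouringRNN(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, force_seq):          # (batch, time, 1) force readings
        h, _ = self.rnn(force_seq)
        return self.out(h)                 # (batch, time, 1) predicted velocities

model = PouringRNN()
force = torch.randn(8, 100, 1)             # 8 synthetic demonstrations, 100 steps
velocity_target = torch.randn(8, 100, 1)   # synthetic velocity labels

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):
    pred = model(force)
    loss = nn.functional.mse_loss(pred, velocity_target)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final loss:", loss.item())
```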

Bringing Semantic Structures to User Intent Detection in Online Medical Queries

Title Bringing Semantic Structures to User Intent Detection in Online Medical Queries
Authors Chenwei Zhang, Nan Du, Wei Fan, Yaliang Li, Chun-Ta Lu, Philip S. Yu
Abstract The Internet has revolutionized healthcare by offering medical information ubiquitously to patients via web search. The healthcare status and complex medical information needs of patients are expressed diversely and implicitly in their medical text queries. Aiming to better capture a focused picture of users’ medical-related information search and shed insights on their healthcare information access strategies, it is challenging yet rewarding to detect structured user intentions from their diversely expressed medical text queries. We introduce a graph-based formulation to explore structured concept transitions for effective user intent detection in medical queries, where each node represents a medical concept mention and each directed edge indicates a medical concept transition. A deep model based on multi-task learning is introduced to extract structured semantic transitions from user queries, where the model extracts word-level medical concept mentions as well as sentence-level concept transitions collectively. A customized graph-based mutual transfer loss function is designed to impose explicit constraints and further exploit the contribution of mentioning a medical concept word to the implication of a semantic transition. We observe an 8% relative improvement in AUC and 23% relative reduction in coverage error by comparing the proposed model with the best baseline model for the concept transition inference task on real-world medical text queries.
Tasks Intent Detection, Multi-Task Learning
Published 2017-10-22
URL http://arxiv.org/abs/1710.08015v1
PDF http://arxiv.org/pdf/1710.08015v1.pdf
PWC https://paperswithcode.com/paper/bringing-semantic-structures-to-user-intent
Repo
Framework

Overcoming data scarcity with transfer learning

Title Overcoming data scarcity with transfer learning
Authors Maxwell L. Hutchinson, Erin Antono, Brenna M. Gibbons, Sean Paradiso, Julia Ling, Bryce Meredig
Abstract Despite increasing focus on data publication and discovery in materials science and related fields, the global view of materials data is highly sparse. This sparsity encourages training models on the union of multiple datasets, but simple unions can prove problematic as (ostensibly) equivalent properties may be measured or computed differently depending on the data source. These hidden contextual differences introduce irreducible errors into analyses, fundamentally limiting their accuracy. Transfer learning, where information from one dataset is used to inform a model on another, can be an effective tool for bridging sparse data while preserving the contextual differences in the underlying measurements. Here, we describe and compare three techniques for transfer learning: multi-task, difference, and explicit latent variable architectures. We show that difference architectures are most accurate in the multi-fidelity case of mixed DFT and experimental band gaps, while multi-task most improves classification performance of color with band gaps. For activation energies of steps in NO reduction, the explicit latent variable method is not only the most accurate, but also enjoys cancellation of errors in functions that depend on multiple tasks. These results motivate the publication of high quality materials datasets that encode transferable information, independent of industrial or academic interest in the particular labels, and encourage further development and application of transfer learning methods to materials informatics problems.
Tasks Transfer Learning
Published 2017-11-02
URL http://arxiv.org/abs/1711.05099v1
PDF http://arxiv.org/pdf/1711.05099v1.pdf
PWC https://paperswithcode.com/paper/overcoming-data-scarcity-with-transfer
Repo
Framework
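
Of the three architectures compared above, the “difference” architecture is the simplest to sketch: one model is fit to the abundant, cheap property (standing in here for DFT band gaps), and a second model is fit to the offset between the scarce property (experimental band gaps) and the first model’s prediction. The synthetic data below only illustrates the wiring, not the paper’s results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_dft = rng.normal(size=(500, 10))                          # many DFT-labelled materials
y_dft = X_dft[:, 0] * 2.0 + rng.normal(scale=0.1, size=500)

X_exp = rng.normal(size=(40, 10))                           # few experimental labels
y_exp = X_exp[:, 0] * 2.0 + 0.4 + rng.normal(scale=0.05, size=40)  # systematic offset

# Base model on the cheap property, residual model on the correction.
base = RandomForestRegressor(random_state=0).fit(X_dft, y_dft)
residual = RandomForestRegressor(random_state=0).fit(X_exp, y_exp - base.predict(X_exp))

def predict_experimental(X):
    # Final prediction = cheap-property model + learned correction.
    return base.predict(X) + residual.predict(X)

print(predict_experimental(X_exp[:3]))
```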

Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs

Title Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs
Authors Bin Hu, Peter Seiler, Laurent Lessard
Abstract We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.
Tasks
Published 2017-11-03
URL https://arxiv.org/abs/1711.00987v3
PDF https://arxiv.org/pdf/1711.00987v3.pdf
PWC https://paperswithcode.com/paper/analysis-of-approximate-stochastic-gradient
Repo
Framework
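
The trade-offs analysed above can be illustrated numerically (without the LMI machinery) by running SGD on a strongly convex quadratic whose gradients are corrupted by noise and a constant bias: smaller stepsizes average out the noise and settle closer to the optimum but converge more slowly, while the bias sets a floor on the achievable accuracy. This toy experiment is an assumption-laden illustration, not the paper’s analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
x_star = np.ones(10)

def biased_grad(x, bias=0.05, noise=0.1):
    # True gradient of 0.5*||x - x_star||^2 plus noise and a constant bias.
    return (x - x_star) + noise * rng.normal(size=x.shape) + bias

for step in (0.5, 0.1, 0.02):
    x = np.zeros(10)
    for _ in range(5000):
        x -= step * biased_grad(x)
    print(f"stepsize {step:4}: final error {np.linalg.norm(x - x_star):.3f}")
```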

Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks

Title Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks
Authors Lawrence Phillips, Nathan Hodas
Abstract Increasingly, cognitive scientists have demonstrated interest in applying tools from deep learning. One use for deep learning is in language acquisition where it is useful to know if a linguistic phenomenon can be learned through domain-general means. To assess whether unsupervised deep learning is appropriate, we first pose a smaller question: Can unsupervised neural networks apply linguistic rules productively, using them in novel situations? We draw from the literature on determiner/noun productivity by training an unsupervised autoencoder network and measuring its ability to combine nouns with determiners. Our simple autoencoder creates combinations it has not previously encountered and produces a degree of overlap matching that of adults. While this preliminary work does not provide conclusive evidence for productivity, it warrants further investigation with more complex models. Further, this work helps lay the foundations for future collaboration between the deep learning and cognitive science communities.
Tasks Language Acquisition
Published 2017-06-06
URL http://arxiv.org/abs/1706.01839v1
PDF http://arxiv.org/pdf/1706.01839v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-linguistic-productivity-of
Repo
Framework

Identifying civilians killed by police with distantly supervised entity-event extraction

Title Identifying civilians killed by police with distantly supervised entity-event extraction
Authors Katherine A. Keith, Abram Handler, Michael Pinkham, Cara Magliozzi, Joshua McDuffie, Brendan O’Connor
Abstract We propose a new, socially-impactful task for natural language processing: from a news corpus, extract names of persons who have been killed by police. We present a newly collected police fatality corpus, which we release publicly, and present a model to solve this problem that uses EM-based distant supervision with logistic regression and convolutional neural network classifiers. Our model outperforms two off-the-shelf event extractor systems, and it can suggest candidate victim names in some cases faster than one of the major manually-collected police fatality databases.
Tasks
Published 2017-07-22
URL http://arxiv.org/abs/1707.07086v1
PDF http://arxiv.org/pdf/1707.07086v1.pdf
PWC https://paperswithcode.com/paper/identifying-civilians-killed-by-police-with
Repo
Framework
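
In the spirit of the EM-based distant supervision above (but not the authors’ model), the sketch below initialises mention labels from a noisy “distant” rule, then alternates between fitting a logistic-regression classifier and re-estimating soft labels for the distantly positive mentions. The feature matrix and labels are synthetic, and the CNN variant and the real corpus are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                  # mention features (stand-in)
true_w = rng.normal(size=20)
p_true = 1 / (1 + np.exp(-X @ true_w))
# Noisy "distant" labels, e.g. any mention of a name on a fatality list.
distant = (p_true + rng.normal(scale=0.3, size=300) > 0.5).astype(int)

labels = distant.astype(float)                  # soft labels, initialised distantly
for em_iter in range(5):
    # M-step: fit the classifier to the current (rounded) soft labels.
    clf = LogisticRegression(max_iter=1000).fit(X, (labels > 0.5).astype(int))
    # E-step: re-estimate soft labels, keeping distant negatives pinned at 0.
    proba = clf.predict_proba(X)[:, 1]
    labels = np.where(distant == 1, proba, 0.0)

print("positive mentions kept:", int((labels > 0.5).sum()), "of", int(distant.sum()))
```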