July 26, 2019

3046 words 15 mins read

Paper Group ANR 759

Generating Different Story Tellings from Semantic Representations of Narrative. Exploring the Regularity of Sparse Structure in Convolutional Neural Networks. Conceptual Text Summarizer: A new model in continuous vector space. Relation Extraction : A Survey. Coordinate Descent with Bandit Sampling. Message Passing Stein Variational Gradient Descent …

Generating Different Story Tellings from Semantic Representations of Narrative

Title Generating Different Story Tellings from Semantic Representations of Narrative
Authors Elena Rishes, Stephanie M. Lukin, David K. Elson, Marilyn A. Walker
Abstract In order to tell stories in different voices for different audiences, interactive story systems require: (1) a semantic representation of story structure, and (2) the ability to automatically generate story and dialogue from this semantic representation using some form of Natural Language Generation (NLG). However, there has been limited research on methods for linking story structures to narrative descriptions of scenes and story events. In this paper we present an automatic method for converting from Scheherazade’s story intention graph, a semantic representation, to the input required by the Personage NLG engine. Using 36 Aesop Fables distributed in DramaBank, a collection of story encodings, we train translation rules on one story and then test these rules by generating text for the remaining 35. The results are measured in terms of the string similarity metrics Levenshtein Distance and BLEU score. The results show that we can generate the 35 stories with correct content: the test set stories on average are close to the output of the Scheherazade realizer, which was customized to this semantic representation. We provide some examples of story variations generated by Personage. In future work, we will experiment with measuring the quality of the same stories generated in different voices, and with techniques for making storytelling interactive.
Tasks Text Generation
Published 2017-08-29
URL http://arxiv.org/abs/1708.08573v1
PDF http://arxiv.org/pdf/1708.08573v1.pdf
PWC https://paperswithcode.com/paper/generating-different-story-tellings-from
Repo
Framework
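
The evaluation above relies on two surface-similarity measures, Levenshtein distance and BLEU. As a minimal, illustrative sketch (not the authors’ evaluation code), the snippet below compares an invented generated sentence against an invented reference using a hand-rolled edit distance and NLTK’s sentence-level BLEU; NLTK is assumed to be installed.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Invented reference and generated retellings of the same story event.
reference = "the crow dropped the cheese and the fox grabbed it"
generated = "the crow let the cheese fall and the fox seized it"

print("Levenshtein:", levenshtein(reference, generated))
print("BLEU:", sentence_bleu([reference.split()], generated.split(),
                             smoothing_function=SmoothingFunction().method1))
```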

Exploring the Regularity of Sparse Structure in Convolutional Neural Networks

Title Exploring the Regularity of Sparse Structure in Convolutional Neural Networks
Authors Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally
Abstract Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next-generation DNN accelerators such as TPU. The structure of sparsity, i.e., the granularity of pruning, affects the efficiency of hardware accelerator design as well as the prediction accuracy. Coarse-grained pruning creates regular sparsity patterns, making it more amenable to hardware acceleration but more challenging to maintain the same accuracy. In this paper we quantitatively measure the trade-off between sparsity regularity and prediction accuracy, providing insights into how to maintain accuracy while using a more structured sparsity pattern. Our experimental results show that coarse-grained pruning can achieve a sparsity ratio similar to unstructured pruning without loss of accuracy. Moreover, due to the index saving effect, coarse-grained pruning is able to obtain a better compression ratio than fine-grained sparsity at the same accuracy threshold. Based on the recent sparse convolutional neural network accelerator (SCNN), our experiments further demonstrate that coarse-grained sparsity saves about 2x the memory references compared to fine-grained sparsity. Since a memory reference is more than two orders of magnitude more expensive than an arithmetic operation, the regularity of sparse structure leads to more efficient hardware design.
Tasks
Published 2017-05-24
URL http://arxiv.org/abs/1705.08922v3
PDF http://arxiv.org/pdf/1705.08922v3.pdf
PWC https://paperswithcode.com/paper/exploring-the-regularity-of-sparse-structure
Repo
Framework
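
To make the fine- vs coarse-grained distinction above concrete, here is a small NumPy sketch (not the paper’s code) that prunes one weight matrix two ways at the same target sparsity: by individual weight magnitude, and by the L1 norm of contiguous 1x4 blocks, whose regular zero pattern is what makes indexing cheaper in hardware.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
sparsity = 0.75  # fraction of weights to remove

# Fine-grained: threshold individual weight magnitudes.
thresh = np.quantile(np.abs(W), sparsity)
fine = np.where(np.abs(W) > thresh, W, 0.0)

# Coarse-grained: prune contiguous 1x4 blocks ranked by block L1 norm.
blocks = np.abs(W).reshape(64, 16, 4).sum(axis=-1)    # L1 norm of each block
block_thresh = np.quantile(blocks, sparsity)
mask = np.repeat(blocks > block_thresh, 4, axis=1)    # expand block mask to weights
coarse = W * mask

print("fine-grained zeros:  ", np.mean(fine == 0))
print("coarse-grained zeros:", np.mean(coarse == 0))
```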

Conceptual Text Summarizer: A new model in continuous vector space

Title Conceptual Text Summarizer: A new model in continuous vector space
Authors Mohammad Ebrahim Khademi, Mohammad Fakhredanesh, Seyed Mojtaba Hoseini
Abstract Traditional methods of summarization are neither cost-effective nor practical today. Extractive summarization is a process that helps to extract the most important sentences from a text automatically and generates a short informative summary. In this work, we propose an unsupervised method to summarize Persian texts. This method is a novel hybrid approach that clusters the concepts of the text using deep learning and traditional statistical methods. First, we produce a word embedding based on the Hamshahri2 corpus and a dictionary of word frequencies. Then the proposed algorithm extracts the keywords of the document, clusters its concepts, and finally ranks the sentences to produce the summary. We evaluated the proposed method on the Pasokh single-document corpus using the ROUGE evaluation measure. Without using any hand-crafted features, our proposed method achieves state-of-the-art results. We compared our unsupervised method with the best supervised Persian methods and achieved an overall improvement of 7.5% in ROUGE-2 recall.
Tasks
Published 2017-10-30
URL http://arxiv.org/abs/1710.10994v3
PDF http://arxiv.org/pdf/1710.10994v3.pdf
PWC https://paperswithcode.com/paper/conceptual-text-summarizer-a-new-model-in
Repo
Framework
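
The pipeline above (embed words, cluster concepts, rank sentences) can be illustrated with a much-simplified sketch: a tiny Word2Vec model trained on a toy English document stands in for the Hamshahri2 embedding, and sentence vectors rather than word-level concepts are clustered. gensim and scikit-learn are assumed; this is an illustration of the idea, not the authors’ implementation.

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

document = [
    "the economy grew faster than expected this quarter",
    "analysts credit strong exports for the growth",
    "the football team lost its third match in a row",
    "fans blamed the coach for the losing streak",
    "export demand is expected to stay strong next year",
]
tokens = [s.split() for s in document]

# Word embedding (stand-in for an embedding trained on a large corpus).
w2v = Word2Vec(sentences=tokens, vector_size=50, min_count=1, seed=0)

def sent_vec(words):
    return np.mean([w2v.wv[w] for w in words], axis=0)

X = np.array([sent_vec(s) for s in tokens])

# Cluster the sentence vectors, then keep the sentence closest to each
# cluster centre as the extractive summary.
k = 2
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
summary_ids = sorted(
    int(np.argmin(np.linalg.norm(X - c, axis=1))) for c in km.cluster_centers_
)
for i in summary_ids:
    print(document[i])
```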

Relation Extraction : A Survey

Title Relation Extraction : A Survey
Authors Sachin Pawar, Girish K. Palshikar, Pushpak Bhattacharyya
Abstract With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access and management of knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities such as persons and organizations form the most basic units of this information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of a person and an organization in a sentence may be linked through relations such as “employed at”. The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some recent trends in RE techniques and possible future research directions. This survey would be useful for three kinds of readers: i) newcomers in the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.
Tasks Information Retrieval, Open Information Extraction, Question Answering, Relation Extraction
Published 2017-12-14
URL http://arxiv.org/abs/1712.05191v1
PDF http://arxiv.org/pdf/1712.05191v1.pdf
PWC https://paperswithcode.com/paper/relation-extraction-a-survey
Repo
Framework

Coordinate Descent with Bandit Sampling

Title Coordinate Descent with Bandit Sampling
Authors Farnood Salehi, Patrick Thiran, L. Elisa Celis
Abstract Coordinate descent methods usually minimize a cost function by updating a random decision variable (corresponding to one coordinate) at a time. Ideally, we would update the decision variable that yields the largest decrease in the cost function. However, finding this coordinate would require checking all of them, which would effectively negate the improvement in computational tractability that coordinate descent is intended to afford. To address this, we propose a new adaptive method for selecting a coordinate. First, we find a lower bound on the amount the cost function decreases when a coordinate is updated. We then use a multi-armed bandit algorithm to learn which coordinates result in the largest lower bound, interleaving this learning with conventional coordinate descent updates, except that the coordinate is selected proportionally to the expected decrease. We show that our approach improves the convergence of coordinate descent methods both theoretically and experimentally.
Tasks
Published 2017-12-08
URL http://arxiv.org/abs/1712.03010v2
PDF http://arxiv.org/pdf/1712.03010v2.pdf
PWC https://paperswithcode.com/paper/coordinate-descent-with-bandit-sampling
Repo
Framework
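
A toy sketch of the selection idea above: coordinate descent on a least-squares objective where a running estimate of each coordinate’s cost decrease plays the role of the bandit reward, and coordinates are sampled proportionally to it. This is an illustrative heuristic in the spirit of the paper, not its algorithm or its guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 50
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def cost(x):
    return 0.5 * np.sum((A @ x - b) ** 2)

x = np.zeros(d)
gain = np.ones(d)                 # optimistic estimate of per-coordinate decrease
col_sq = np.sum(A ** 2, axis=0)   # per-coordinate curvature

for t in range(2000):
    probs = gain / gain.sum()
    j = rng.choice(d, p=probs)             # bandit-style proportional sampling
    residual = A @ x - b
    grad_j = A[:, j] @ residual
    step = grad_j / col_sq[j]              # exact minimizer along coordinate j
    decrease = 0.5 * grad_j ** 2 / col_sq[j]
    x[j] -= step
    gain[j] = 0.9 * gain[j] + 0.1 * (decrease + 1e-8)  # update the reward estimate

print("final cost:", cost(x))
```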

Message Passing Stein Variational Gradient Descent

Title Message Passing Stein Variational Gradient Descent
Authors Jingwei Zhuo, Chang Liu, Jiaxin Shi, Jun Zhu, Ning Chen, Bo Zhang
Abstract Stein variational gradient descent (SVGD) is a recently proposed particle-based Bayesian inference method, which has attracted a lot of interest due to its remarkable approximation ability and particle efficiency compared to traditional variational inference and Markov Chain Monte Carlo methods. However, we observed that particles of SVGD tend to collapse to modes of the target distribution, and this particle degeneracy phenomenon becomes more severe with higher dimensions. Our theoretical analysis finds a negative correlation between dimensionality and the repulsive force of SVGD, which is responsible for this phenomenon. We propose Message Passing SVGD (MP-SVGD) to solve this problem. By leveraging the conditional independence structure of probabilistic graphical models (PGMs), MP-SVGD converts the original high-dimensional global inference problem into a set of lower-dimensional local problems over Markov blankets. Experimental results show its advantage over SVGD in preventing the repulsive force from vanishing in high-dimensional spaces, as well as its particle efficiency and approximation flexibility over other inference methods on graphical models.
Tasks Bayesian Inference
Published 2017-11-13
URL http://arxiv.org/abs/1711.04425v3
PDF http://arxiv.org/pdf/1711.04425v3.pdf
PWC https://paperswithcode.com/paper/message-passing-stein-variational-gradient
Repo
Framework
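
To make the repulsive-force term discussed above concrete, here is a minimal NumPy sketch of vanilla SVGD with an RBF kernel on a 2-D Gaussian target. MP-SVGD would instead run updates of this form per variable over its Markov blanket; that decomposition is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
target_mean = np.array([1.0, -1.0])

def score(x):
    # Gradient of log N(target_mean, I) evaluated at each particle.
    return -(x - target_mean)

def rbf_kernel(particles, h=0.5):
    diff = particles[:, None, :] - particles[None, :, :]   # x_i - x_j
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    gradK = diff / h ** 2 * K[..., None]   # gradient of k(x_j, x_i) w.r.t. x_j
    return K, gradK

particles = rng.normal(size=(100, 2))
for step in range(1000):
    K, gradK = rbf_kernel(particles)
    drift = K @ score(particles)       # pulls particles toward high density
    repulsion = gradK.sum(axis=1)      # pushes particles away from each other
    particles += 0.05 * (drift + repulsion) / len(particles)

print("particle mean:", particles.mean(axis=0))   # should approach target_mean
```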

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

Title Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
Authors Yuchen Dai, Zheng Huang, Yuting Gao, Youxuan Xu, Kai Chen, Jie Guo, Weidong Qiu
Abstract In this paper, we introduce a novel end-to-end framework for multi-oriented scene text detection from an instance-aware semantic segmentation perspective. We present Fused Text Segmentation Networks, which combine multi-level features during feature extraction, as text instances may rely on finer feature expression than general objects. The framework detects and segments text instances jointly and simultaneously, leveraging merits from both the semantic segmentation task and the region-proposal-based object detection task. Without involving any extra pipelines, our approach surpasses the current state of the art on multi-oriented scene text detection benchmarks, ICDAR2015 Incidental Scene Text and MSRA-TD500, reaching Hmean of 84.1% and 82.0% respectively. Moreover, we report a baseline on Total-Text, which contains curved text, suggesting the effectiveness of the proposed approach.
Tasks Multi-Oriented Scene Text Detection, Object Detection, Scene Text Detection, Semantic Segmentation
Published 2017-09-11
URL http://arxiv.org/abs/1709.03272v4
PDF http://arxiv.org/pdf/1709.03272v4.pdf
PWC https://paperswithcode.com/paper/fused-text-segmentation-networks-for-multi
Repo
Framework
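
A compact PyTorch sketch of the multi-level fusion idea described above: features from several backbone stages are upsampled to a common resolution, concatenated, and decoded into a per-pixel text/non-text map. The toy backbone and layer sizes are invented for illustration; this is not the FTSN architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusedTextSegSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Toy backbone producing features at 1/2, 1/4 and 1/8 resolution.
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Fusion head: concatenated multi-level features -> text / non-text logits.
        self.head = nn.Conv2d(32 + 64 + 128, 2, 1)

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        size = f1.shape[-2:]
        fused = torch.cat([
            f1,
            F.interpolate(f2, size=size, mode="bilinear", align_corners=False),
            F.interpolate(f3, size=size, mode="bilinear", align_corners=False),
        ], dim=1)
        return self.head(fused)

logits = FusedTextSegSketch()(torch.randn(1, 3, 256, 256))
print(logits.shape)   # (1, 2, 128, 128) text / non-text score map
```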

Learning event representation: As sparse as possible, but not sparser

Title Learning event representation: As sparse as possible, but not sparser
Authors Tuan Do, James Pustejovsky
Abstract Selecting an optimal event representation is essential for event classification in real world contexts. In this paper, we investigate the application of qualitative spatial reasoning (QSR) frameworks for classification of human-object interaction in three dimensional space, in comparison with the use of quantitative feature extraction approaches for the same purpose. In particular, we modify QSRLib, a library that allows computation of Qualitative Spatial Relations and Calculi, and employ it for feature extraction, before inputting features into our neural network models. Using an experimental setup involving motion captures of human-object interaction as three dimensional inputs, we observe that the use of qualitative spatial features significantly improves the performance of our machine learning algorithm against our baseline, while quantitative features of similar kinds fail to deliver similar improvement. We also observe that sequential representations of QSR features yield the best classification performance. A result of our learning method is a simple approach to the qualitative representation of 3D activities as compositions of 2D actions that can be visualized and learned using 2-dimensional QSR.
Tasks Human-Object Interaction Detection
Published 2017-10-02
URL http://arxiv.org/abs/1710.00448v1
PDF http://arxiv.org/pdf/1710.00448v1.pdf
PWC https://paperswithcode.com/paper/learning-event-representation-as-sparse-as
Repo
Framework
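
The qualitative-versus-quantitative contrast above can be illustrated with a hedged sketch that maps raw 3-D hand and object positions to a few hand-rolled symbolic predicates (left/right, front/behind, above/below, touching) before any learning. This does not use QSRLib’s actual calculi; everything here is purely illustrative.

```python
import numpy as np

def qualitative_relations(hand, obj, touch_dist=0.05):
    """Map a pair of 3-D points to a coarse, symbolic feature vector."""
    dx, dy, dz = obj - hand
    return np.array([
        np.sign(dx),    # object to the left or right of the hand
        np.sign(dy),    # in front of or behind
        np.sign(dz),    # above or below
        float(np.linalg.norm(obj - hand) < touch_dist),  # touching or not
    ])

# One synthetic capture: 50 frames of hand and object positions.
rng = np.random.default_rng(0)
hand_traj = rng.normal(size=(50, 3))
obj_traj = rng.normal(size=(50, 3))
qsr_sequence = np.stack([qualitative_relations(h, o)
                         for h, o in zip(hand_traj, obj_traj)])
print(qsr_sequence.shape)   # (50, 4) symbolic features per frame
```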

Face Identification and Clustering

Title Face Identification and Clustering
Authors Atul Dhingra
Abstract In this thesis, we study two problems based on clustering algorithms. In the first problem, we study the role of visual attributes using an agglomerative clustering algorithm to whittle down the search area where the number of classes is high to improve the performance of clustering. We observe that as we add more attributes, the clustering performance increases overall. In the second problem, we study the role of clustering in aggregating templates in a 1:N open set protocol using multi-shot video as a probe. We observe that by increasing the number of clusters, the performance increases with respect to the baseline and reaches a peak, after which increasing the number of clusters causes the performance to degrade. Experiments are conducted using recently introduced unconstrained IARPA Janus IJB-A, CS2, and CS3 face recognition datasets.
Tasks Face Identification, Face Recognition
Published 2017-04-26
URL http://arxiv.org/abs/1704.08328v1
PDF http://arxiv.org/pdf/1704.08328v1.pdf
PWC https://paperswithcode.com/paper/face-identification-and-clustering
Repo
Framework
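
For the second problem above, template aggregation via clustering can be sketched as follows: per-frame face descriptors from a probe video (random vectors here, standing in for real face embeddings) are grouped with scikit-learn’s agglomerative clustering, and each cluster is averaged into one template. The number of clusters is the knob whose effect the thesis studies.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
frames = rng.normal(size=(40, 128))    # per-frame face descriptors (stand-in)

n_clusters = 5                          # varying this is the experiment of interest
labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(frames)

# Average the descriptors in each cluster into a single aggregated template.
templates = np.stack([frames[labels == c].mean(axis=0) for c in range(n_clusters)])
print("templates:", templates.shape)    # (5, 128), one template per cluster
```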

Learning to Pour

Title Learning to Pour
Authors Yongqiang Huang, Yu Sun
Abstract Pouring is a simple task people perform daily. It is the second most frequently executed motion in cooking scenarios, after pick-and-place. We present a pouring trajectory generation approach, which uses force feedback from the cup to determine the future velocity of pouring. The approach uses recurrent neural networks as its building blocks. We collected pouring demonstrations, which we used for training. To test our approach in simulation, we also created and trained a force estimation system. The simulated experiments show that the system is able to generalize to a single unseen element of the pouring characteristics.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09021v1
PDF http://arxiv.org/pdf/1705.09021v1.pdf
PWC https://paperswithcode.com/paper/learning-to-pour
Repo
Framework
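
A hedged PyTorch sketch of the idea above: a recurrent network reads a sequence of force readings from the cup and predicts the pouring velocity at each step. The dimensions and synthetic data are invented, and the authors’ network and training data are not reproduced here.

```python
import torch
import torch.nn as nn

class PouringRNN(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, force_seq):          # (batch, time, 1) force readings
        h, _ = self.rnn(force_seq)
        return self.out(h)                 # (batch, time, 1) predicted velocities

model = PouringRNN()
force = torch.randn(8, 100, 1)             # 8 synthetic demonstrations, 100 steps
velocity_target = torch.randn(8, 100, 1)   # synthetic velocity labels

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(5):
    pred = model(force)
    loss = nn.functional.mse_loss(pred, velocity_target)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final loss:", loss.item())
```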

Bringing Semantic Structures to User Intent Detection in Online Medical Queries

Title Bringing Semantic Structures to User Intent Detection in Online Medical Queries
Authors Chenwei Zhang, Nan Du, Wei Fan, Yaliang Li, Chun-Ta Lu, Philip S. Yu
Abstract The Internet has revolutionized healthcare by offering medical information ubiquitously to patients via web search. The healthcare status and complex medical information needs of patients are expressed diversely and implicitly in their medical text queries. Aiming to better capture a focused picture of users’ medical-related information search and shed insights on their healthcare information access strategies, it is challenging yet rewarding to detect structured user intentions from their diversely expressed medical text queries. We introduce a graph-based formulation to explore structured concept transitions for effective user intent detection in medical queries, where each node represents a medical concept mention and each directed edge indicates a medical concept transition. A deep model based on multi-task learning is introduced to extract structured semantic transitions from user queries, where the model extracts word-level medical concept mentions as well as sentence-level concept transitions collectively. A customized graph-based mutual transfer loss function is designed to impose explicit constraints and further exploit the contribution of mentioning a medical concept word to the implication of a semantic transition. We observe an 8% relative improvement in AUC and 23% relative reduction in coverage error by comparing the proposed model with the best baseline model for the concept transition inference task on real-world medical text queries.
Tasks Intent Detection, Multi-Task Learning
Published 2017-10-22
URL http://arxiv.org/abs/1710.08015v1
PDF http://arxiv.org/pdf/1710.08015v1.pdf
PWC https://paperswithcode.com/paper/bringing-semantic-structures-to-user-intent
Repo
Framework

Overcoming data scarcity with transfer learning

Title Overcoming data scarcity with transfer learning
Authors Maxwell L. Hutchinson, Erin Antono, Brenna M. Gibbons, Sean Paradiso, Julia Ling, Bryce Meredig
Abstract Despite increasing focus on data publication and discovery in materials science and related fields, the global view of materials data is highly sparse. This sparsity encourages training models on the union of multiple datasets, but simple unions can prove problematic as (ostensibly) equivalent properties may be measured or computed differently depending on the data source. These hidden contextual differences introduce irreducible errors into analyses, fundamentally limiting their accuracy. Transfer learning, where information from one dataset is used to inform a model on another, can be an effective tool for bridging sparse data while preserving the contextual differences in the underlying measurements. Here, we describe and compare three techniques for transfer learning: multi-task, difference, and explicit latent variable architectures. We show that difference architectures are most accurate in the multi-fidelity case of mixed DFT and experimental band gaps, while multi-task most improves classification performance of color with band gaps. For activation energies of steps in NO reduction, the explicit latent variable method is not only the most accurate, but also enjoys cancellation of errors in functions that depend on multiple tasks. These results motivate the publication of high quality materials datasets that encode transferable information, independent of industrial or academic interest in the particular labels, and encourage further development and application of transfer learning methods to materials informatics problems.
Tasks Transfer Learning
Published 2017-11-02
URL http://arxiv.org/abs/1711.05099v1
PDF http://arxiv.org/pdf/1711.05099v1.pdf
PWC https://paperswithcode.com/paper/overcoming-data-scarcity-with-transfer
Repo
Framework
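
Of the three architectures compared above, the “difference” architecture is the simplest to sketch: one model is fit to the abundant, cheap property (standing in here for DFT band gaps), and a second model is fit to the offset between the scarce property (experimental band gaps) and the first model’s prediction. The synthetic data below only illustrates the wiring, not the paper’s results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_dft = rng.normal(size=(500, 10))                          # many DFT-labelled materials
y_dft = X_dft[:, 0] * 2.0 + rng.normal(scale=0.1, size=500)

X_exp = rng.normal(size=(40, 10))                           # few experimental labels
y_exp = X_exp[:, 0] * 2.0 + 0.4 + rng.normal(scale=0.05, size=40)  # systematic offset

# Base model on the cheap property, residual model on the correction.
base = RandomForestRegressor(random_state=0).fit(X_dft, y_dft)
residual = RandomForestRegressor(random_state=0).fit(X_exp, y_exp - base.predict(X_exp))

def predict_experimental(X):
    # Final prediction = cheap-property model + learned correction.
    return base.predict(X) + residual.predict(X)

print(predict_experimental(X_exp[:3]))
```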

Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs

Title Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs
Authors Bin Hu, Peter Seiler, Laurent Lessard
Abstract We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.
Tasks
Published 2017-11-03
URL https://arxiv.org/abs/1711.00987v3
PDF https://arxiv.org/pdf/1711.00987v3.pdf
PWC https://paperswithcode.com/paper/analysis-of-approximate-stochastic-gradient
Repo
Framework
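
The trade-offs analysed above can be illustrated numerically (without the LMI machinery) by running SGD on a strongly convex quadratic whose gradients are corrupted by noise and a constant bias: smaller stepsizes average out the noise and settle closer to the optimum but converge more slowly, while the bias sets a floor on the achievable accuracy. This toy experiment is an assumption-laden illustration, not the paper’s analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
x_star = np.ones(10)

def biased_grad(x, bias=0.05, noise=0.1):
    # True gradient of 0.5*||x - x_star||^2 plus noise and a constant bias.
    return (x - x_star) + noise * rng.normal(size=x.shape) + bias

for step in (0.5, 0.1, 0.02):
    x = np.zeros(10)
    for _ in range(5000):
        x -= step * biased_grad(x)
    print(f"stepsize {step:4}: final error {np.linalg.norm(x - x_star):.3f}")
```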

Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks

Title Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks
Authors Lawrence Phillips, Nathan Hodas
Abstract Increasingly, cognitive scientists have demonstrated interest in applying tools from deep learning. One use for deep learning is in language acquisition where it is useful to know if a linguistic phenomenon can be learned through domain-general means. To assess whether unsupervised deep learning is appropriate, we first pose a smaller question: Can unsupervised neural networks apply linguistic rules productively, using them in novel situations? We draw from the literature on determiner/noun productivity by training an unsupervised autoencoder network and measuring its ability to combine nouns with determiners. Our simple autoencoder creates combinations it has not previously encountered and produces a degree of overlap matching that of adults. While this preliminary work does not provide conclusive evidence for productivity, it warrants further investigation with more complex models. Further, this work helps lay the foundations for future collaboration between the deep learning and cognitive science communities.
Tasks Language Acquisition
Published 2017-06-06
URL http://arxiv.org/abs/1706.01839v1
PDF http://arxiv.org/pdf/1706.01839v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-linguistic-productivity-of
Repo
Framework

Identifying civilians killed by police with distantly supervised entity-event extraction

Title Identifying civilians killed by police with distantly supervised entity-event extraction
Authors Katherine A. Keith, Abram Handler, Michael Pinkham, Cara Magliozzi, Joshua McDuffie, Brendan O’Connor
Abstract We propose a new, socially-impactful task for natural language processing: from a news corpus, extract names of persons who have been killed by police. We present a newly collected police fatality corpus, which we release publicly, and present a model to solve this problem that uses EM-based distant supervision with logistic regression and convolutional neural network classifiers. Our model outperforms two off-the-shelf event extractor systems, and it can suggest candidate victim names in some cases faster than one of the major manually-collected police fatality databases.
Tasks
Published 2017-07-22
URL http://arxiv.org/abs/1707.07086v1
PDF http://arxiv.org/pdf/1707.07086v1.pdf
PWC https://paperswithcode.com/paper/identifying-civilians-killed-by-police-with
Repo
Framework
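
In the spirit of the EM-based distant supervision above (but not the authors’ model), the sketch below initialises mention labels from a noisy “distant” rule, then alternates between fitting a logistic-regression classifier and re-estimating soft labels for the distantly positive mentions. The feature matrix and labels are synthetic, and the CNN variant and the real corpus are omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                  # mention features (stand-in)
true_w = rng.normal(size=20)
p_true = 1 / (1 + np.exp(-X @ true_w))
# Noisy "distant" labels, e.g. any mention of a name on a fatality list.
distant = (p_true + rng.normal(scale=0.3, size=300) > 0.5).astype(int)

labels = distant.astype(float)                  # soft labels, initialised distantly
for em_iter in range(5):
    # M-step: fit the classifier to the current (rounded) soft labels.
    clf = LogisticRegression(max_iter=1000).fit(X, (labels > 0.5).astype(int))
    # E-step: re-estimate soft labels, keeping distant negatives pinned at 0.
    proba = clf.predict_proba(X)[:, 1]
    labels = np.where(distant == 1, proba, 0.0)

print("positive mentions kept:", int((labels > 0.5).sum()), "of", int(distant.sum()))
```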