January 30, 2020


Paper Group ANR 277

A Priori Estimates of the Population Risk for Residual Networks

Title A Priori Estimates of the Population Risk for Residual Networks
Authors Weinan E, Chao Ma, Qingcan Wang
Abstract Optimal a priori estimates are derived for the population risk, also known as the generalization error, of a regularized residual network model. An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term. The weighted path norm treats the skip connections and the nonlinearities differently so that paths with more nonlinearities are regularized by larger weights. The error estimates are a priori in the sense that the estimates depend only on the target function, not on the parameters obtained in the training process. The estimates are optimal, in a high dimensional setting, in the sense that both the bound for the approximation and estimation errors are comparable to the Monte Carlo error rates. A crucial step in the proof is to establish an optimal bound for the Rademacher complexity of the residual networks. Comparisons are made with existing norm-based generalization error bounds.
Tasks
Published 2019-03-06
URL https://arxiv.org/abs/1903.02154v2
PDF https://arxiv.org/pdf/1903.02154v2.pdf
PWC https://paperswithcode.com/paper/a-priori-estimates-of-the-population-risk-for
Repo
Framework
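As an illustration of the kind of regularizer the abstract describes, here is a hedged sketch of a weighted path norm; the constant c and the exact form are placeholders, not the paper's definition:

```latex
% Sketch: sum over all input-to-output paths p through the residual
% network, penalising each path by c^{n(p)}, where n(p) counts the
% nonlinearities on p (c > 1 is a generic constant; the paper fixes
% its own weighting).
\|\theta\|_{\mathrm{wp}} \;=\; \sum_{p \,\in\, \mathrm{paths}} c^{\,n(p)} \prod_{w \in p} |w|
```

Paths that ride the skip connections pass through few nonlinearities and are penalised lightly, while deeply nonlinear paths pay an exponentially larger price.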

Dissecting Deep Neural Networks

Title Dissecting Deep Neural Networks
Authors Haakon Robinson, Adil Rasheed, Omer San
Abstract In exchange for large quantities of data and processing power, deep neural networks have yielded models that provide state-of-the-art prediction capabilities in many fields. However, a lack of strong guarantees on their behaviour has raised concerns over their use in safety-critical applications. A first step towards understanding these networks is to develop alternate representations that allow for further analysis. It has been shown that neural networks with piecewise affine activation functions are themselves piecewise affine, with their domains consisting of a vast number of linear regions. So far, research on this topic has focused on counting the number of linear regions rather than obtaining explicit piecewise affine representations. This work presents a novel algorithm that can compute the piecewise affine form of any fully connected neural network with rectified linear unit activations.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.03879v2
PDF https://arxiv.org/pdf/1910.03879v2.pdf
PWC https://paperswithcode.com/paper/dissecting-deep-neural-networks
Repo
Framework
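To make the piecewise affine view concrete, here is a minimal numpy sketch (not the paper's algorithm, which enumerates all linear regions) that recovers the affine map a ReLU network computes on the single linear region containing a given input:

```python
import numpy as np

def relu_net(Ws, bs, x):
    """Evaluate a fully connected ReLU network with a linear output layer."""
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return Ws[-1] @ h + bs[-1]

def local_affine_form(Ws, bs, x):
    """Return (A, c) such that net(y) == A @ y + c for every y in the
    linear region (i.e. sharing the activation pattern) of x."""
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    h = x
    for W, b in zip(Ws[:-1], bs[:-1]):
        pre = W @ h + b
        D = np.diag((pre > 0).astype(float))   # activation pattern at x
        A, c = D @ W @ A, D @ (W @ c + b)      # compose the affine pieces
        h = np.maximum(pre, 0.0)
    return Ws[-1] @ A, Ws[-1] @ c + bs[-1]
```

Composing one such affine map per activation pattern is exactly what makes the network globally piecewise affine.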

Quality-aware skill translation models for expert finding on StackOverflow

Title Quality-aware skill translation models for expert finding on StackOverflow
Authors Arash Dargahi Nobari, Mahmood Neshati, Sajad Sotudeh Gharebagh
Abstract StackOverflow has become an emerging resource for talent recognition in recent years. While users employ technical language on StackOverflow, recruiters try to find relevant candidates for jobs using their own terminology, which implies a gap between recruiters’ and candidates’ terms. Due to this gap, state-of-the-art expert finding models cannot effectively address the expert finding problem on StackOverflow. We propose two translation models to bridge this gap: the first is a statistical method, and the second is based on a word embedding approach. Several translations are used for a given query during the scoring step, and the results of the intermediate queries are blended together to obtain the final ranking. We also propose a new approach that takes the quality of documents into account in the scoring step. We have made several observations to illustrate the effectiveness of the translation approaches as well as the quality-aware scoring approach. Our experiments indicate the following: first, while the statistical and word embedding translation approaches provide different translations for each query, both can considerably improve recall; second, the quality-aware scoring approach can improve precision remarkably; finally, our best proposed method can improve the MAP measure by up to 46% on average, in comparison with the state-of-the-art expert finding approach.
Tasks
Published 2019-07-16
URL https://arxiv.org/abs/1907.06836v1
PDF https://arxiv.org/pdf/1907.06836v1.pdf
PWC https://paperswithcode.com/paper/quality-aware-skill-translation-models-for
Repo
Framework
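A toy sketch of the translation-plus-quality idea; the helper names and the translation table are hypothetical stand-ins for the paper's statistical and embedding-based models:

```python
from collections import Counter

def expand_query(query, translations, k=2):
    """Expand each query term with its top-k translated terms.
    `translations` maps a term to (candidate, probability) pairs."""
    expanded = []
    for term in query:
        expanded.append((term, 1.0))
        expanded.extend(sorted(translations.get(term, []),
                               key=lambda tp: -tp[1])[:k])
    return expanded

def quality_aware_score(doc_terms, quality, expanded):
    """Score = quality prior x translation-weighted term overlap."""
    tf = Counter(doc_terms)
    return quality * sum(p * tf[t] for t, p in expanded)
```

With everything else equal, a document with a higher quality prior outranks a lower-quality one, which is the effect the quality-aware scoring step is after.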

Quels corpus d’entraînement pour l’expansion de requêtes par plongement de mots : application à la recherche de microblogs culturels (Which training corpora for query expansion by word embeddings: an application to cultural microblog retrieval)

Title Quels corpus d’entraînement pour l’expansion de requêtes par plongement de mots : application à la recherche de microblogs culturels (Which training corpora for query expansion by word embeddings: an application to cultural microblog retrieval)
Authors Philippe Mulhem, Lorraine Goeuriot, Massih-Reza Amini, Nayanika Dogra
Abstract We describe an experimental framework and the results obtained on microblog retrieval. We study the contribution of one popular approach, word embeddings, and investigate the impact of the training set on the learned embedding. We focus on query expansion for the retrieval of tweets on the CLEF CMC 2016 corpus. Our results show that using embeddings trained on a corpus in the same domain as the indexed documents did not necessarily lead to better retrieval results.
Tasks
Published 2019-11-17
URL https://arxiv.org/abs/1911.07317v1
PDF https://arxiv.org/pdf/1911.07317v1.pdf
PWC https://paperswithcode.com/paper/quels-corpus-dentrainement-pour-lexpansion-de
Repo
Framework
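A minimal sketch of the embedding-based query expansion studied here; the toy vocabulary and vectors stand in for embeddings trained on whichever corpus is being compared:

```python
import numpy as np

def expand_with_embeddings(query_terms, vocab, vectors, k=3):
    """Add the k nearest neighbours (cosine similarity) of each query
    term, drawn from the embedding space, to the query."""
    V = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    index = {w: i for i, w in enumerate(vocab)}
    expansion = set(query_terms)
    for term in query_terms:
        if term not in index:
            continue
        sims = V @ V[index[term]]
        for i in np.argsort(-sims)[1:k + 1]:   # skip the term itself
            expansion.add(vocab[i])
    return sorted(expansion)
```

The paper's question is precisely which training corpus should produce `vectors`: one from the same domain as the indexed tweets, or a general-purpose one.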

Short Text Classification Improved by Feature Space Extension

Title Short Text Classification Improved by Feature Space Extension
Authors Yanxuan Li
Abstract With the explosive development of the mobile Internet, short text has been applied extensively. What distinguishes classifying short texts from classifying long documents is that short texts are short and sparse, so short text classification is challenging owing to the limited semantic information available. In this paper, we propose a novel topic-based convolutional neural network (TB-CNN) based on the Latent Dirichlet Allocation (LDA) model and a convolutional neural network. Compared to traditional CNN methods, TB-CNN generates topic words with the LDA model to reduce sparseness and combines the embedding vectors of topic words and input words to extend the feature space of the short text. Validation results on the IMDB movie review dataset show the improvement and effectiveness of TB-CNN.
Tasks Text Classification
Published 2019-04-02
URL http://arxiv.org/abs/1904.01313v1
PDF http://arxiv.org/pdf/1904.01313v1.pdf
PWC https://paperswithcode.com/paper/short-text-classification-improved-by-feature
Repo
Framework
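A sketch of the feature-space extension step only, with a hypothetical `embed` lookup; fitting the LDA model and the CNN itself are omitted:

```python
import numpy as np

def topic_words(doc_topic, topic_word, vocab, n=3):
    """Pick the n top words of the document's dominant LDA topic.
    doc_topic: per-document topic distribution; topic_word: per-topic
    word distribution (rows = topics, columns = vocabulary)."""
    k = int(np.argmax(doc_topic))
    return [vocab[i] for i in np.argsort(-topic_word[k])[:n]]

def extend_feature_space(embed, tokens, doc_topic, topic_word, vocab):
    """Stack embeddings of the input tokens and the generated topic words
    into one matrix for the CNN to convolve over."""
    words = tokens + topic_words(doc_topic, topic_word, vocab)
    return np.stack([embed(w) for w in words])
```

The extra topic-word rows are what compensate for the sparsity of a very short input text.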

Alternative Blockmodelling

Title Alternative Blockmodelling
Authors Oscar Correa, Jeffrey Chan, Vinh Nguyen
Abstract Many approaches have been proposed to discover clusters within networks. The community finding field encompasses approaches that try to discover clusters whose nodes are tightly related to each other but loosely related to the nodes of other clusters. However, a community configuration is not the only possible latent structure in a graph: core-periphery and hierarchical network configurations are also valid structures to discover in a relational dataset. Moreover, a network is not completely explained by knowing only the membership of each node; a high-level view of the inter-cluster relationships is also needed. Blockmodelling techniques deal with both issues. First, blockmodelling allows finding any network configuration besides the well-known community structure. Second, a blockmodel is a summary representation of a network that captures not only the membership of nodes but also the relations between clusters. Finally, a unique summary representation of a network is unlikely: networks might hide more than one blockmodel. Therefore, our proposed problem aims to discover a secondary blockmodel representation of a network that is of good quality and dissimilar with respect to a given blockmodel. Our methodology is presented through two approaches: (a) inclusion of cannot-link constraints and (b) dissimilarity between image matrices. Both approaches are based on non-negative matrix factorisation (NMF), which fits the blockmodelling representation. The evaluation of these two approaches considers the quality and dissimilarity of the discovered alternative blockmodel, as these are the requirements of the problem.
Tasks
Published 2019-07-27
URL https://arxiv.org/abs/1908.02575v1
PDF https://arxiv.org/pdf/1908.02575v1.pdf
PWC https://paperswithcode.com/paper/alternative-blockmodelling
Repo
Framework
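A minimal sketch of the underlying NMF blockmodel fit, A ≈ C M Cᵀ with nonnegative memberships C and image matrix M, by projected gradient descent; the paper's cannot-link constraints and dissimilarity terms are omitted:

```python
import numpy as np

def blockmodel_nmf(A, k, iters=200, lr=0.01, seed=0):
    """Fit A ~ C @ M @ C.T for an n x n adjacency matrix A, with
    nonnegative membership matrix C (n x k) and image matrix M (k x k),
    by minimising 0.5 * ||C M C^T - A||_F^2 with projected gradient steps."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    C = rng.random((n, k))
    M = rng.random((k, k))
    for _ in range(iters):
        R = C @ M @ C.T - A                 # residual
        gC = R @ C @ M.T + R.T @ C @ M      # gradient w.r.t. C
        gM = C.T @ R @ C                    # gradient w.r.t. M
        C = np.maximum(C - lr * gC, 0.0)    # project back to C >= 0
        M = np.maximum(M - lr * gM, 0.0)
    return C, M
```

M is the high-level summary the abstract mentions: its entries describe how densely each pair of clusters is connected, so a community structure shows up as a (near-)diagonal M while core-periphery structure does not.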

Privacy Amplification by Mixing and Diffusion Mechanisms

Title Privacy Amplification by Mixing and Diffusion Mechanisms
Authors Borja Balle, Gilles Barthe, Marco Gaboardi, Joseph Geumlek
Abstract A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. On the applied side, we generalize the analysis of “privacy amplification by iteration” in Noisy SGD and show it admits an exponential improvement in the strongly convex case, and study a mechanism based on the Ornstein-Uhlenbeck diffusion process which contains the Gaussian mechanism with optimal post-processing on bounded inputs as a special case.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12264v2
PDF https://arxiv.org/pdf/1905.12264v2.pdf
PWC https://paperswithcode.com/paper/privacy-amplification-by-mixing-and-diffusion
Repo
Framework
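In the hockey-stick-divergence formulation of differential privacy, the post-processing property the abstract starts from can be sketched as follows (K is any Markov operator; this is a schematic statement, not the paper's theorems):

```latex
% M is (\varepsilon,\delta)-DP iff D_{e^{\varepsilon}}(M(x)\,\|\,M(x')) \le \delta
% for all neighbouring inputs x \simeq x'.  The data-processing inequality gives
D_{e^{\varepsilon}}\bigl(K M(x) \,\|\, K M(x')\bigr)
\;\le\;
D_{e^{\varepsilon}}\bigl(M(x) \,\|\, M(x')\bigr)
\;\le\; \delta .
```

Amplification results then identify conditions on K, such as uniform mixing of the induced Markov process or the existence of suitable couplings, under which the first inequality is strict with quantitative gains.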

Hyperintensional Reasoning based on Natural Language Knowledge Base

Title Hyperintensional Reasoning based on Natural Language Knowledge Base
Authors Marie Duží, Aleš Horák
Abstract The success of automated reasoning techniques over large natural-language texts heavily relies on a fine-grained analysis of natural-language assumptions. While there is common agreement that the analysis should be hyperintensional, most automatic reasoning systems are still based on an intensional logic, at best. In this paper, we introduce a system of reasoning based on a fine-grained, hyperintensional analysis. To this end we apply Tichý’s Transparent Intensional Logic (TIL) with its procedural semantics. TIL is a higher-order, hyperintensional logic of partial functions, particularly apt for fine-grained natural-language analysis. Within TIL we recognise three kinds of context, namely extensional, intensional and hyperintensional, in which a particular natural-language term, or rather its meaning, can occur. Having defined the three kinds of context and implemented an algorithm of context recognition, we are in a position to develop and implement an extensional logic of hyperintensions with an inference machine that should neither over-infer nor under-infer.
Tasks
Published 2019-06-18
URL https://arxiv.org/abs/1906.07562v1
PDF https://arxiv.org/pdf/1906.07562v1.pdf
PWC https://paperswithcode.com/paper/hyperintensional-reasoning-based-on-natural
Repo
Framework

Identifying Candidate Spaces for Advert Implantation

Title Identifying Candidate Spaces for Advert Implantation
Authors Soumyabrata Dev, Hossein Javidnia, Murhaf Hossari, Matthew Nicholson, Killian McCabe, Atul Nautiyal, Clare Conran, Jian Tang, Wei Xu, François Pitié
Abstract Virtual advertising is an important and promising feature in the area of online advertising. It involves integrating adverts onto live or recorded videos for product placements and targeted advertisements. Such integration of adverts is primarily done by video editors in the post-production stage, which is cumbersome and time-consuming. Therefore, it is important to automatically identify candidate spaces in a video frame, wherein new adverts can be implanted. The candidate space should match the scene perspective, and also have a high quality of experience according to human subjective judgment. In this paper, we propose the use of a bespoke neural net that can assist the video editors in identifying candidate spaces. We benchmark our approach against several deep-learning architectures on a large-scale image dataset of candidate spaces of outdoor scenes. Our work is the first of its kind in this area of multimedia and augmented reality applications, and achieves the best results.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03227v1
PDF https://arxiv.org/pdf/1910.03227v1.pdf
PWC https://paperswithcode.com/paper/identifying-candidate-spaces-for-advert
Repo
Framework

Multilevel Text Normalization with Sequence-to-Sequence Networks and Multisource Learning

Title Multilevel Text Normalization with Sequence-to-Sequence Networks and Multisource Learning
Authors Tatyana Ruzsics, Tanja Samardžić
Abstract We define multilevel text normalization as sequence-to-sequence processing that transforms naturally noisy text into a sequence of normalized units of meaning (morphemes) in three steps: 1) writing normalization, 2) lemmatization, 3) canonical segmentation. These steps are traditionally considered separate NLP tasks, with diverse solutions, evaluation schemes and data sources. We exploit the fact that all these tasks involve sub-word sequence-to-sequence transformation to propose a systematic solution for all of them using neural encoder-decoder technology. The specific challenge that we tackle in this paper is integrating the traditional know-how on separate tasks into the neural sequence-to-sequence framework to improve the state of the art. We address this challenge by enriching the general framework with mechanisms that allow processing the information on multiple levels of text organization (characters, morphemes, words, sentences) in combination with structural information (multilevel language model, part-of-speech) and heterogeneous sources (text, dictionaries). We show that our solution consistently improves on the current methods in all three steps. In addition, we analyze the performance of our system to show the specific contribution of the integrating components to the overall improvement.
Tasks Language Modelling, Lemmatization
Published 2019-03-27
URL http://arxiv.org/abs/1903.11340v2
PDF http://arxiv.org/pdf/1903.11340v2.pdf
PWC https://paperswithcode.com/paper/multilevel-text-normalization-with-sequence
Repo
Framework

Time-series Insights into the Process of Passing or Failing Online University Courses using Neural-Induced Interpretable Student States

Title Time-series Insights into the Process of Passing or Failing Online University Courses using Neural-Induced Interpretable Student States
Authors Byungsoo Jeon, Eyal Shafran, Luke Breitfeller, Jason Levin, Carolyn P. Rose
Abstract This paper addresses a key challenge in Educational Data Mining, namely to model student behavioral trajectories in order to provide a means for identifying students most at-risk, with the goal of providing supportive interventions. While many forms of data including clickstream data or data from sensors have been used extensively in time series models for such purposes, in this paper we explore the use of textual data, which is sometimes available in the records of students at large, online universities. We propose a time series model that constructs an evolving student state representation using both clickstream data and a signal extracted from the textual notes recorded by human mentors assigned to each student. We explore how the addition of this textual data improves both the predictive power of student states for the purpose of identifying students at risk for course failure as well as for providing interpretable insights about student course engagement processes.
Tasks Time Series
Published 2019-05-01
URL http://arxiv.org/abs/1905.00422v1
PDF http://arxiv.org/pdf/1905.00422v1.pdf
PWC https://paperswithcode.com/paper/time-series-insights-into-the-process-of
Repo
Framework

Large Scale Global Optimization by Hybrid Evolutionary Computation

Title Large Scale Global Optimization by Hybrid Evolutionary Computation
Authors Gutha Jaya Krishna, Vadlamani Ravi
Abstract In the management, business, economics, science, engineering, and research domains, Large Scale Global Optimization (LSGO) plays a predominant and vital role. Although LSGO is applied in many application domains, it is a troublesome and perverse task. The Congress on Evolutionary Computation (CEC) began an LSGO competition to elicit algorithms for a suite of standard unconstrained LSGO benchmark functions. In this paper, we propose a hybrid meta-heuristic algorithm that combines an Improved and Modified Harmony Search (IMHS) with a Modified Differential Evolution (MDE) using an alternate selection strategy. Harmony Search (HS) does the job of exploration and exploitation, while Differential Evolution perturbs the exploration of IMHS, since harmony search suffers from getting stuck in the basins of local optima. To judge the performance of the suggested algorithm, we compare it with ten excellent meta-heuristic algorithms on the fifteen LSGO benchmark functions of the CEC 2013 LSGO special session, which have 1000 continuous decision variables. The experimental results consistently show that our proposed hybrid meta-heuristic performs statistically on par with some algorithms on a few problems, while it turned out to be the best on a couple of problems.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.03799v1
PDF https://arxiv.org/pdf/1910.03799v1.pdf
PWC https://paperswithcode.com/paper/large-scale-global-optimization-by-hybrid
Repo
Framework
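An illustrative toy hybrid of harmony search with a differential-evolution perturbation, to show the mechanics the abstract describes; this is not the paper's IMHS+MDE algorithm, and all parameter values are generic defaults:

```python
import numpy as np

def hybrid_hs_de(f, dim, bounds, iters=500, hm_size=10,
                 hmcr=0.9, par=0.3, F=0.5, seed=0):
    """Minimise f over [lo, hi]^dim with harmony search, occasionally
    replacing the improvised harmony by a DE/rand/1 mutant vector."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    hm = rng.uniform(lo, hi, (hm_size, dim))     # harmony memory
    fit = np.array([f(x) for x in hm])
    for t in range(iters):
        new = rng.uniform(lo, hi, dim)           # random consideration
        for j in range(dim):
            if rng.random() < hmcr:              # memory consideration
                new[j] = hm[rng.integers(hm_size), j]
                if rng.random() < par:           # pitch adjustment
                    new[j] += rng.normal(0.0, 0.05 * (hi - lo))
        if t % 5 == 0:                           # DE perturbation step
            a, b, c = hm[rng.choice(hm_size, 3, replace=False)]
            new = a + F * (b - c)
        new = np.clip(new, lo, hi)
        fn = f(new)
        worst = int(np.argmax(fit))
        if fn < fit[worst]:                      # replace worst harmony
            hm[worst], fit[worst] = new, fn
    best = int(np.argmin(fit))
    return hm[best], fit[best]
```

The DE step plays the role the abstract assigns to it: it jolts the harmony memory out of the basin it is converging into.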

CAGNet: Content-Aware Guidance for Salient Object Detection

Title CAGNet: Content-Aware Guidance for Salient Object Detection
Authors Sina Mohammadi, Mehrdad Noori, Ali Bahri, Sina Ghofrani Majelan, Mohammad Havaei
Abstract Benefiting from Fully Convolutional Neural Networks (FCNs), saliency detection methods have achieved promising results. However, it is still challenging to learn effective features for detecting salient objects in complicated scenarios, in which i) non-salient regions may have “salient-like” appearance, and ii) salient objects may have different-looking regions. To handle these complex scenarios, we propose a Feature Guide Network which exploits the nature of low-level and high-level features to i) make foreground and background regions more distinct and suppress non-salient regions with “salient-like” appearance, and ii) assign the foreground label to different-looking salient regions. Furthermore, we utilize a Multi-scale Feature Extraction Module (MFEM) at each level of abstraction to obtain multi-scale contextual information. Finally, we design a loss function that outperforms the widely used cross-entropy loss. By adopting four different pre-trained models as the backbone, we show that our method is general with respect to the choice of backbone model. Experiments on five challenging datasets demonstrate that our method achieves state-of-the-art performance in terms of different evaluation metrics. Additionally, our approach contains fewer parameters than existing ones, does not need any post-processing, and runs at a real-time speed of 28 FPS when processing a 480 x 480 image.
Tasks Object Detection, Saliency Detection, Salient Object Detection
Published 2019-11-29
URL https://arxiv.org/abs/1911.13168v1
PDF https://arxiv.org/pdf/1911.13168v1.pdf
PWC https://paperswithcode.com/paper/cagnet-content-aware-guidance-for-salient
Repo
Framework

Learning Twitter User Sentiments on Climate Change with Limited Labeled Data

Title Learning Twitter User Sentiments on Climate Change with Limited Labeled Data
Authors Allison Koenecke, Jordi Feliu-Fabà
Abstract While it is well-documented that climate change accepters and deniers have become increasingly polarized in the United States over time, there has been no large-scale examination of whether these individuals are prone to changing their opinions as a result of natural external occurrences. On the sub-population of Twitter users, we examine whether climate change sentiment changes in response to five separate natural disasters occurring in the U.S. in 2018. We begin by showing that relevant tweets can be classified with over 75% accuracy as either accepting or denying climate change when using our methodology to compensate for limited labeled data; results are robust across several machine learning models and yield geographic-level results in line with prior research. We then apply RNNs to conduct a cohort-level analysis showing that the 2018 hurricanes yielded a statistically significant increase in average tweet sentiment affirming climate change. However, this effect does not hold for the 2018 blizzard and wildfires studied, implying that Twitter users’ opinions on climate change are fairly ingrained on this subset of natural disasters.
Tasks
Published 2019-04-15
URL http://arxiv.org/abs/1904.07342v1
PDF http://arxiv.org/pdf/1904.07342v1.pdf
PWC https://paperswithcode.com/paper/learning-twitter-user-sentiments-on-climate
Repo
Framework

Understanding complex predictive models with Ghost Variables

Title Understanding complex predictive models with Ghost Variables
Authors Pedro Delicado, Daniel Peña
Abstract We propose a procedure for assigning a relevance measure to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check its out-of-sample performance. First, the individual relevance of each variable is computed by comparing the predictions in the test set given by the model that includes all the variables with those of another model in which the variable of interest is substituted by its ghost variable, defined as the prediction of this variable from the rest of the explanatory variables. Second, we check the joint effects among the variables by using the eigenvalues of a relevance matrix, the covariance matrix of the vectors of individual effects. It is shown that in simple models, such as linear or additive models, the proposed measures are related to standard measures of significance of the variables, while in neural network models (and other algorithmic prediction models) the procedure provides information about the joint and individual effects of the variables that is not usually available by other methods. The procedure is illustrated with simulated examples and the analysis of a large real data set.
Tasks
Published 2019-12-13
URL https://arxiv.org/abs/1912.06407v2
PDF https://arxiv.org/pdf/1912.06407v2.pdf
PWC https://paperswithcode.com/paper/understanding-complex-predictive-models-with
Repo
Framework
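The first step of the procedure is easy to sketch in code. The following minimal numpy version uses a linear ghost model and measures relevance as the test-MSE increase when a column is replaced by its ghost; it covers only the individual-relevance step, not the relevance-matrix eigenanalysis:

```python
import numpy as np

def lsq_predict(X, y, Xnew):
    """Least-squares fit of y on X (with intercept), evaluated at Xnew."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.column_stack([np.ones(len(Xnew)), Xnew]) @ coef

def ghost_relevance(predict, X_train, X_test, y_test):
    """Relevance of variable j = increase in test MSE when column j is
    replaced by its ghost variable, i.e. its prediction from the
    remaining columns. `predict` is any fitted model's prediction fn."""
    base = np.mean((predict(X_test) - y_test) ** 2)
    rel = []
    for j in range(X_train.shape[1]):
        rest = [i for i in range(X_train.shape[1]) if i != j]
        Xg = X_test.copy()
        # linear ghost: regress column j on the other columns
        Xg[:, j] = lsq_predict(X_train[:, rest], X_train[:, j], X_test[:, rest])
        rel.append(np.mean((predict(Xg) - y_test) ** 2) - base)
    return np.array(rel)
```

A variable whose information is recoverable from the others has a faithful ghost, so replacing it barely moves the loss; an independently informative variable gets a large relevance score.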