Paper Group ANR 277
A Priori Estimates of the Population Risk for Residual Networks
Title | A Priori Estimates of the Population Risk for Residual Networks |
Authors | Weinan E, Chao Ma, Qingcan Wang |
Abstract | Optimal a priori estimates are derived for the population risk, also known as the generalization error, of a regularized residual network model. An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term. The weighted path norm treats the skip connections and the nonlinearities differently so that paths with more nonlinearities are regularized by larger weights. The error estimates are a priori in the sense that the estimates depend only on the target function, not on the parameters obtained in the training process. The estimates are optimal, in a high-dimensional setting, in the sense that the bounds for both the approximation and estimation errors are comparable to the Monte Carlo error rates. A crucial step in the proof is to establish an optimal bound for the Rademacher complexity of the residual networks. Comparisons are made with existing norm-based generalization error bounds. |
Tasks | |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02154v2 |
https://arxiv.org/pdf/1903.02154v2.pdf | |
PWC | https://paperswithcode.com/paper/a-priori-estimates-of-the-population-risk-for |
Repo | |
Framework | |
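The following is a minimal sketch of the weighted path norm described in the abstract above, for a residual network of the form g_0 = V x, g_l = g_{l-1} + U_l * relu(W_l g_{l-1}), f(x) = u^T g_L. It assumes the norm accumulates as || |u|^T prod_l (I + c |U_l||W_l|) |V| ||_1, where the per-nonlinearity weight c (a fixed constant in the paper) penalizes paths that cross more activations; the dimensions and the exact constant here are illustrative.

```python
import numpy as np

def weighted_path_norm(V, layers, u, c=3.0):
    """Weighted path norm of a residual network
        g_0 = V x,  g_l = g_{l-1} + U_l * relu(W_l g_{l-1}),  f = u^T g_L.
    Paths through more nonlinearities are weighted by higher powers of c.
    Computed as || |u|^T prod_l (I + c |U_l||W_l|) |V| ||_1."""
    P = np.abs(V)                                 # accumulated |path weight| matrix
    for W, U in layers:                           # layers: list of (W_l, U_l) pairs
        P = P + c * np.abs(U) @ np.abs(W) @ P     # skip path + weighted nonlinear path
    return np.sum(np.abs(u) @ P)

# toy example: 2 residual blocks, width 4, hidden width 5, input dim 3
rng = np.random.default_rng(0)
V = rng.normal(size=(4, 3))
layers = [(rng.normal(size=(5, 4)), rng.normal(size=(4, 5))) for _ in range(2)]
u = rng.normal(size=4)
print(weighted_path_norm(V, layers, u))
```

Each step of the recursion adds, on top of the identity (skip) paths already counted, the contribution of paths that take the nonlinear branch at that block, weighted by an extra factor of c.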
Dissecting Deep Neural Networks
Title | Dissecting Deep Neural Networks |
Authors | Haakon Robinson, Adil Rasheed, Omer San |
Abstract | In exchange for large quantities of data and processing power, deep neural networks have yielded models that provide state-of-the-art prediction capabilities in many fields. However, a lack of strong guarantees on their behaviour has raised concerns over their use in safety-critical applications. A first step to understanding these networks is to develop alternate representations that allow for further analysis. It has been shown that neural networks with piecewise affine activation functions are themselves piecewise affine, with their domains consisting of a vast number of linear regions. So far, the research on this topic has focused on counting the number of linear regions, rather than obtaining explicit piecewise affine representations. This work presents a novel algorithm that can compute the piecewise affine form of any fully connected neural network with rectified linear unit activations. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03879v2 |
https://arxiv.org/pdf/1910.03879v2.pdf | |
PWC | https://paperswithcode.com/paper/dissecting-deep-neural-networks |
Repo | |
Framework | |
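The paper's algorithm extracts the full piecewise affine form; a minimal sketch of the core step, computing the affine map (A, b) that a ReLU network realizes on the single linear region containing a given input, looks like this (enumerating all regions is the harder part and is not shown):

```python
import numpy as np

def local_affine_form(weights, biases, x):
    """For a fully connected ReLU network, return (A, b) such that
    f(z) = A z + b for every z in the linear region containing x.
    Composes the layer maps with the activation pattern of x frozen."""
    A = np.eye(len(x))
    b = np.zeros(len(x))
    for i, (W, c) in enumerate(zip(weights, biases)):
        A, b = W @ A, W @ b + c                    # affine layer
        if i < len(weights) - 1:                   # ReLU on all but the last layer
            mask = (A @ x + b > 0).astype(float)   # activation pattern at x
            A, b = mask[:, None] * A, mask * b
    return A, b

rng = np.random.default_rng(1)
Ws = [rng.normal(size=(6, 3)), rng.normal(size=(6, 6)), rng.normal(size=(1, 6))]
bs = [rng.normal(size=6), rng.normal(size=6), rng.normal(size=1)]
x = rng.normal(size=3)
A, b = local_affine_form(Ws, bs, x)

# sanity check: the affine form reproduces the network output at x
relu = lambda v: np.maximum(v, 0)
out = Ws[2] @ relu(Ws[1] @ relu(Ws[0] @ x + bs[0]) + bs[1]) + bs[2]
assert np.allclose(A @ x + b, out)
```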
Quality-aware skill translation models for expert finding on StackOverflow
Title | Quality-aware skill translation models for expert finding on StackOverflow |
Authors | Arash Dargahi Nobari, Mahmood Neshati, Sajad Sotudeh Gharebagh |
Abstract | StackOverflow has become an emerging resource for talent recognition in recent years. While users employ technical language on StackOverflow, recruiters search for relevant candidates using their own terminology, which creates a gap between the recruiters' and the candidates' vocabularies. Due to this gap, state-of-the-art expert finding models cannot effectively address the expert finding problem on StackOverflow. We propose two translation models to bridge this gap: the first is a statistical method, and the second is based on a word-embedding approach. Several translations of a given query are used during the scoring step, and the results of the intermediate queries are blended together to obtain the final ranking. We also propose a new approach that takes the quality of documents into account in the scoring step. We present several observations that illustrate the effectiveness of the translation approaches and of the quality-aware scoring approach. Our experiments indicate the following: first, while the statistical and word-embedding translation approaches provide different translations for each query, both can considerably improve recall; second, the quality-aware scoring approach can remarkably improve precision; finally, our best proposed method improves the MAP measure by up to 46% on average, compared with the state-of-the-art expert finding approach. |
Tasks | |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.06836v1 |
https://arxiv.org/pdf/1907.06836v1.pdf | |
PWC | https://paperswithcode.com/paper/quality-aware-skill-translation-models-for |
Repo | |
Framework | |
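As a hedged illustration of the statistical translation idea, one can estimate a translation probability p(t | s) between a recruiter's term s and candidate-profile terms t from document co-occurrence counts; the corpus, terms, and estimator below are toy assumptions, not the paper's exact model:

```python
from collections import Counter, defaultdict

# toy corpus of StackOverflow-like documents (hypothetical data)
docs = [
    "python pandas dataframe numpy".split(),
    "python numpy array vectorization".split(),
    "java spring hibernate orm".split(),
    "machine learning python sklearn".split(),
]

# p(t | s) estimated from how often term t co-occurs in a document with term s
cooc = defaultdict(Counter)
for d in docs:
    for s in set(d):
        for t in set(d):
            if t != s:
                cooc[s][t] += 1

def translate(query_term, k=3):
    """Top-k translations of a recruiter term with their probabilities."""
    c = cooc[query_term]
    total = sum(c.values())
    return [(t, n / total) for t, n in c.most_common(k)]

# a recruiter's term is expanded into candidate-profile terms
print(translate("python"))
```

In the paper, the intermediate queries produced by such translations are scored separately and their results blended, with document quality entering the scoring step.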
Quels corpus d’entraînement pour l’expansion de requêtes par plongement de mots : application à la recherche de microblogs culturels (Which training corpora for query expansion by word embeddings: an application to cultural microblog retrieval)
Title | Quels corpus d’entraînement pour l’expansion de requêtes par plongement de mots : application à la recherche de microblogs culturels |
Authors | Philippe Mulhem, Lorraine Goeuriot, Massih-Reza Amini, Nayanika Dogra |
Abstract | We describe here an experimental framework and the results obtained on microblog retrieval. We study the contribution of one popular approach, namely word embeddings, and investigate the impact of the training set on the learned embedding. We focus on query expansion for the retrieval of tweets on the CLEF CMC 2016 corpus. Our results show that using embeddings trained on a corpus in the same domain as the indexed documents did not necessarily lead to better retrieval results. |
Tasks | |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07317v1 |
https://arxiv.org/pdf/1911.07317v1.pdf | |
PWC | https://paperswithcode.com/paper/quels-corpus-dentrainement-pour-lexpansion-de |
Repo | |
Framework | |
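A minimal sketch of query expansion by word embeddings, the approach whose training corpus the paper investigates: each query term is expanded with its nearest neighbours in embedding space. The toy vocabulary and random vectors below stand in for embeddings trained on a real corpus:

```python
import numpy as np

# toy embedding table (in practice, trained with e.g. word2vec on a chosen
# corpus; the paper's question is which training corpus works best)
vocab = ["festival", "concert", "music", "cinema", "film", "exhibition"]
rng = np.random.default_rng(2)
E = rng.normal(size=(len(vocab), 50))
E /= np.linalg.norm(E, axis=1, keepdims=True)   # unit vectors for cosine sim
idx = {w: i for i, w in enumerate(vocab)}

def expand(term, k=2):
    """Return the k nearest neighbours of `term` by cosine similarity."""
    sims = E @ E[idx[term]]
    order = np.argsort(-sims)
    return [vocab[i] for i in order if vocab[i] != term][:k]

query = ["festival", "music"]
expanded = query + [t for w in query for t in expand(w)]
print(expanded)   # original terms plus their embedding neighbours
```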
Short Text Classification Improved by Feature Space Extension
Title | Short Text Classification Improved by Feature Space Extension |
Authors | Yanxuan Li |
Abstract | With the explosive development of the mobile Internet, short text has been applied extensively. The difference between classifying short texts and long documents is that short texts are brief and sparse, so short text classification is challenging owing to the limited semantic information available. In this paper, we propose a novel topic-based convolutional neural network (TB-CNN) based on the Latent Dirichlet Allocation (LDA) model and convolutional neural networks. Compared to traditional CNN methods, TB-CNN generates topic words with the LDA model to reduce sparseness and combines the embedding vectors of the topic words and the input words to extend the feature space of short texts. Validation results on the IMDB movie review dataset show the improvement and effectiveness of TB-CNN. |
Tasks | Text Classification |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01313v1 |
http://arxiv.org/pdf/1904.01313v1.pdf | |
PWC | https://paperswithcode.com/paper/short-text-classification-improved-by-feature |
Repo | |
Framework | |
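A hedged sketch of the feature-space extension step: LDA supplies topic words for each short text, and those tokens (in practice, their embedding vectors) are appended to the input before the convolutional layers. The toy corpus and the choice of sklearn's LDA are illustrative assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["great movie loved acting",
        "terrible plot boring film",
        "fantastic direction loved it",
        "boring slow terrible movie"]

vec = CountVectorizer()
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()

def topic_words(doc_row, k=3):
    """Top-k words of the document's dominant LDA topic."""
    topic = lda.transform(doc_row)[0].argmax()
    return [terms[i] for i in lda.components_[topic].argsort()[::-1][:k]]

# extend each short text with its topic words before embedding + CNN
for d, row in zip(docs, X):
    extended = d.split() + topic_words(row)
    print(extended)   # embeddings of these tokens would feed the CNN
```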
Alternative Blockmodelling
Title | Alternative Blockmodelling |
Authors | Oscar Correa, Jeffrey Chan, Vinh Nguyen |
Abstract | Many approaches have been proposed to discover clusters within networks. The community-finding field encompasses approaches that try to discover clusters whose nodes are tightly related to each other but loosely related to nodes of other clusters. However, a community configuration is not the only possible latent structure in a graph: core-periphery and hierarchical configurations are equally valid structures to discover in a relational dataset. On the other hand, a network is not completely explained by knowing only the membership of each node; a high-level view of the inter-cluster relationships is also needed. Blockmodelling techniques deal with both issues. Firstly, blockmodelling allows finding any network configuration besides the well-known community structure. Secondly, a blockmodel is a summary representation of a network that captures not only the membership of nodes but also the relations between clusters. Finally, a unique summary representation of a network is unlikely: networks might hide more than one blockmodel. Therefore, our proposed problem aims to discover a secondary blockmodel representation of a network that is of good quality and dissimilar with respect to a given blockmodel. Our methodology is presented through two approaches, (a) inclusion of cannot-link constraints and (b) dissimilarity between image matrices. Both approaches are based on non-negative matrix factorisation (NMF), which fits the blockmodelling representation. The two approaches are evaluated on the quality and the dissimilarity of the discovered alternative blockmodel, as these are the requirements of the problem. |
Tasks | |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1908.02575v1 |
https://arxiv.org/pdf/1908.02575v1.pdf | |
PWC | https://paperswithcode.com/paper/alternative-blockmodelling |
Repo | |
Framework | |
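To make the image-matrix approach concrete, here is a small sketch (not the paper's NMF formulation) of computing a blockmodel's image matrix from an adjacency matrix and a membership matrix, together with the dissimilarity between two such summaries that an alternative blockmodel would aim to keep large:

```python
import numpy as np

def image_matrix(A, C):
    """Density of edges between clusters: M[k, l] is the mean of A over
    block (k, l). A: adjacency matrix, C: one-hot membership (n x k)."""
    sizes = C.sum(axis=0)
    return (C.T @ A @ C) / np.outer(sizes, sizes)

def dissimilarity(M1, M2):
    """Frobenius distance between two image matrices; the alternative-
    blockmodelling objective rewards a large value here."""
    return np.linalg.norm(M1 - M2)

# toy graph: a community reading vs. a core-periphery reading of the same graph
A = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], float)
communities    = np.array([[1,0],[1,0],[1,0],[0,1],[0,1],[0,1]], float)
core_periphery = np.array([[0,1],[0,1],[1,0],[1,0],[0,1],[0,1]], float)
print(dissimilarity(image_matrix(A, communities), image_matrix(A, core_periphery)))
```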
Privacy Amplification by Mixing and Diffusion Mechanisms
Title | Privacy Amplification by Mixing and Diffusion Mechanisms |
Authors | Borja Balle, Gilles Barthe, Marco Gaboardi, Joseph Geumlek |
Abstract | A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. On the applied side, we generalize the analysis of “privacy amplification by iteration” in Noisy SGD and show it admits an exponential improvement in the strongly convex case, and study a mechanism based on the Ornstein-Uhlenbeck diffusion process which contains the Gaussian mechanism with optimal post-processing on bounded inputs as a special case. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12264v2 |
https://arxiv.org/pdf/1905.12264v2.pdf | |
PWC | https://paperswithcode.com/paper/privacy-amplification-by-mixing-and-diffusion |
Repo | |
Framework | |
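A simple, hedged special case of amplification by stochastic post-processing: applying a Markov operator that adds fresh independent Gaussian noise to the output of a Gaussian mechanism yields another Gaussian mechanism with larger total variance, hence a smaller epsilon for the same delta. The sketch uses the classical Gaussian-mechanism bound, not the paper's tighter mixing or coupling arguments:

```python
import numpy as np

def gaussian_eps(sensitivity, sigma, delta=1e-5):
    """Classical (epsilon, delta) bound for the Gaussian mechanism
    (valid for epsilon <= 1): eps = sensitivity * sqrt(2 ln(1.25/delta)) / sigma."""
    return sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / sigma

sigma = 4.0   # noise scale of the base Gaussian mechanism
tau = 3.0     # fresh noise added by the stochastic post-processing step

eps_before = gaussian_eps(1.0, sigma)
# adding independent N(0, tau^2) noise is a Markov-operator post-processing;
# the composite output is again Gaussian with variance sigma^2 + tau^2
eps_after = gaussian_eps(1.0, np.sqrt(sigma**2 + tau**2))
print(f"eps before: {eps_before:.3f}, after the diffusion step: {eps_after:.3f}")
```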
Hyperintensional Reasoning based on Natural Language Knowledge Base
Title | Hyperintensional Reasoning based on Natural Language Knowledge Base |
Authors | Marie Duží, Aleš Horák |
Abstract | The success of automated reasoning techniques over large natural-language texts heavily relies on a fine-grained analysis of natural language assumptions. While there is a common agreement that the analysis should be hyperintensional, most of the automatic reasoning systems are still based on an intensional logic, at best. In this paper, we introduce a system of reasoning based on a fine-grained, hyperintensional analysis. To this end we apply Tichý’s Transparent Intensional Logic (TIL) with its procedural semantics. TIL is a higher-order, hyperintensional logic of partial functions, in particular apt for a fine-grained natural-language analysis. Within TIL we recognise three kinds of context, namely extensional, intensional and hyperintensional, in which a particular natural-language term, or rather its meaning, can occur. Having defined the three kinds of context and implemented an algorithm of context recognition, we are in a position to develop and implement an extensional logic of hyperintensions with an inference machine that should neither over-infer nor under-infer. |
Tasks | |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07562v1 |
https://arxiv.org/pdf/1906.07562v1.pdf | |
PWC | https://paperswithcode.com/paper/hyperintensional-reasoning-based-on-natural |
Repo | |
Framework | |
Identifying Candidate Spaces for Advert Implantation
Title | Identifying Candidate Spaces for Advert Implantation |
Authors | Soumyabrata Dev, Hossein Javidnia, Murhaf Hossari, Matthew Nicholson, Killian McCabe, Atul Nautiyal, Clare Conran, Jian Tang, Wei Xu, François Pitié |
Abstract | Virtual advertising is an important and promising feature in the area of online advertising. It involves integrating adverts onto live or recorded videos for product placements and targeted advertisements. Such integration of adverts is primarily done by video editors in the post-production stage, which is cumbersome and time-consuming. Therefore, it is important to automatically identify candidate spaces in a video frame, wherein new adverts can be implanted. The candidate space should match the scene perspective, and also have a high quality of experience according to human subjective judgment. In this paper, we propose the use of a bespoke neural net that can assist the video editors in identifying candidate spaces. We benchmark our approach against several deep-learning architectures on a large-scale image dataset of candidate spaces of outdoor scenes. Our work is the first of its kind in this area of multimedia and augmented reality applications, and achieves the best results. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03227v1 |
https://arxiv.org/pdf/1910.03227v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-candidate-spaces-for-advert |
Repo | |
Framework | |
Multilevel Text Normalization with Sequence-to-Sequence Networks and Multisource Learning
Title | Multilevel Text Normalization with Sequence-to-Sequence Networks and Multisource Learning |
Authors | Tatyana Ruzsics, Tanja Samardžić |
Abstract | We define multilevel text normalization as sequence-to-sequence processing that transforms naturally noisy text into a sequence of normalized units of meaning (morphemes) in three steps: 1) writing normalization, 2) lemmatization, 3) canonical segmentation. These steps are traditionally considered separate NLP tasks, with diverse solutions, evaluation schemes and data sources. We exploit the fact that all these tasks involve sub-word sequence-to-sequence transformation to propose a systematic solution for all of them using neural encoder-decoder technology. The specific challenge that we tackle in this paper is integrating the traditional know-how on separate tasks into the neural sequence-to-sequence framework to improve the state of the art. We address this challenge by enriching the general framework with mechanisms that allow processing the information on multiple levels of text organization (characters, morphemes, words, sentences) in combination with structural information (multilevel language model, part-of-speech) and heterogeneous sources (text, dictionaries). We show that our solution consistently improves on the current methods in all three steps. In addition, we analyze the performance of our system to show the specific contribution of the integrating components to the overall improvement. |
Tasks | Language Modelling, Lemmatization |
Published | 2019-03-27 |
URL | http://arxiv.org/abs/1903.11340v2 |
http://arxiv.org/pdf/1903.11340v2.pdf | |
PWC | https://paperswithcode.com/paper/multilevel-text-normalization-with-sequence |
Repo | |
Framework | |
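To fix the interfaces of the three levels, here is a purely hypothetical, rule-based stand-in for the pipeline; in the paper each step is a neural encoder-decoder over character sequences, enriched with multilevel and multisource information:

```python
# Each step is, in the paper, a neural encoder-decoder; the dictionary
# lookups below are hypothetical stand-ins that only mark the interfaces.

def writing_normalization(token: str) -> str:
    # map a noisy spelling to a canonical written form
    return {"gonna": "going to"}.get(token, token)

def lemmatization(token: str) -> str:
    return {"going": "go"}.get(token, token)

def canonical_segmentation(lemma: str) -> list[str]:
    # split a word form into canonical morphemes
    return {"untied": ["un", "tie", "d"]}.get(lemma, [lemma])

def normalize(text: str) -> list[str]:
    """Noisy text -> sequence of normalized units of meaning (morphemes)."""
    morphemes = []
    for tok in text.split():
        for word in writing_normalization(tok).split():
            morphemes += canonical_segmentation(lemmatization(word))
    return morphemes

print(normalize("gonna untied"))   # ['go', 'to', 'un', 'tie', 'd']
```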
Time-series Insights into the Process of Passing or Failing Online University Courses using Neural-Induced Interpretable Student States
Title | Time-series Insights into the Process of Passing or Failing Online University Courses using Neural-Induced Interpretable Student States |
Authors | Byungsoo Jeon, Eyal Shafran, Luke Breitfeller, Jason Levin, Carolyn P. Rose |
Abstract | This paper addresses a key challenge in Educational Data Mining, namely to model student behavioral trajectories in order to provide a means for identifying students most at-risk, with the goal of providing supportive interventions. While many forms of data including clickstream data or data from sensors have been used extensively in time series models for such purposes, in this paper we explore the use of textual data, which is sometimes available in the records of students at large, online universities. We propose a time series model that constructs an evolving student state representation using both clickstream data and a signal extracted from the textual notes recorded by human mentors assigned to each student. We explore how the addition of this textual data improves both the predictive power of student states for the purpose of identifying students at risk for course failure as well as for providing interpretable insights about student course engagement processes. |
Tasks | Time Series |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00422v1 |
http://arxiv.org/pdf/1905.00422v1.pdf | |
PWC | https://paperswithcode.com/paper/time-series-insights-into-the-process-of |
Repo | |
Framework | |
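A hedged sketch of the evolving student-state model: per-week clickstream features are concatenated with a text signal extracted from mentor notes and fed to a recurrent network that predicts course failure. All dimensions and names are illustrative, not the paper's:

```python
import torch
import torch.nn as nn

class StudentStateModel(nn.Module):
    """Sketch of an evolving student-state RNN: at each week, clickstream
    features are concatenated with a text signal from mentor notes."""
    def __init__(self, click_dim=16, text_dim=32, state_dim=64):
        super().__init__()
        self.rnn = nn.LSTM(click_dim + text_dim, state_dim, batch_first=True)
        self.head = nn.Linear(state_dim, 1)        # P(course failure)

    def forward(self, clicks, notes):
        x = torch.cat([clicks, notes], dim=-1)     # (batch, weeks, features)
        states, _ = self.rnn(x)                    # evolving state sequence
        return torch.sigmoid(self.head(states[:, -1]))

model = StudentStateModel()
clicks = torch.randn(8, 10, 16)   # 8 students, 10 weeks of clickstream features
notes = torch.randn(8, 10, 32)    # e.g. averaged mentor-note embeddings per week
print(model(clicks, notes).shape)  # torch.Size([8, 1])
```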
Large Scale Global Optimization by Hybrid Evolutionary Computation
Title | Large Scale Global Optimization by Hybrid Evolutionary Computation |
Authors | Gutha Jaya Krishna, Vadlamani Ravi |
Abstract | In the management, business, economics, science, engineering, and research domains, Large Scale Global Optimization (LSGO) plays a predominant and vital role. Although LSGO is applied in many application domains, it remains a very challenging task. The Congress on Evolutionary Computation (CEC) launched an LSGO competition, together with a suite of standard unconstrained LSGO benchmark functions, to encourage the development of new algorithms. In this paper, we propose a hybrid meta-heuristic algorithm that combines an Improved and Modified Harmony Search (IMHS) with a Modified Differential Evolution (MDE) using an alternate selection strategy. Harmony Search (HS) handles exploration and exploitation, while Differential Evolution perturbs the exploration of IMHS, since harmony search tends to get stuck in the basins of local optima. To judge the performance of the suggested algorithm, we compare it with ten strong meta-heuristic algorithms on the fifteen LSGO benchmark functions of the CEC 2013 LSGO special session, which have 1000 continuous decision variables. The experimental results show that our proposed hybrid meta-heuristic performs statistically on par with some algorithms on a few problems, and turns out to be the best on a couple of problems. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03799v1 |
https://arxiv.org/pdf/1910.03799v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-global-optimization-by-hybrid |
Repo | |
Framework | |
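A minimal sketch of the hybrid idea: a harmony-search loop whose pitch adjustment is replaced by a DE/rand/1-style perturbation. The constants and operators are illustrative, not the exact IMHS/MDE variants of the paper:

```python
import numpy as np

def hybrid_hs_de(f, dim=10, hm_size=20, iters=2000, hmcr=0.9, par=0.3,
                 F=0.5, lo=-5.0, hi=5.0, seed=0):
    """Harmony search with a DE-style perturbation as the pitch adjustment."""
    rng = np.random.default_rng(seed)
    hm = rng.uniform(lo, hi, (hm_size, dim))      # harmony memory
    fit = np.array([f(x) for x in hm])
    for _ in range(iters):
        new = rng.uniform(lo, hi, dim)            # random notes by default
        use_memory = rng.random(dim) < hmcr       # memory consideration
        picks = rng.integers(hm_size, size=dim)
        new[use_memory] = hm[picks[use_memory], use_memory]
        # DE-style perturbation instead of the classic pitch adjustment
        adjust = rng.random(dim) < par
        a, b = hm[rng.integers(hm_size, size=2)]
        new[adjust] += F * (a - b)[adjust]
        new = np.clip(new, lo, hi)
        worst = fit.argmax()
        if (fn := f(new)) < fit[worst]:           # replace the worst harmony
            hm[worst], fit[worst] = new, fn
    return hm[fit.argmin()], fit.min()

x, val = hybrid_hs_de(lambda x: np.sum(x**2))     # sphere test function
print(val)
```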
CAGNet: Content-Aware Guidance for Salient Object Detection
Title | CAGNet: Content-Aware Guidance for Salient Object Detection |
Authors | Sina Mohammadi, Mehrdad Noori, Ali Bahri, Sina Ghofrani Majelan, Mohammad Havaei |
Abstract | Benefiting from Fully Convolutional Neural Networks (FCNs), saliency detection methods have achieved promising results. However, it is still challenging to learn effective features for detecting salient objects in complicated scenarios, in which i) non-salient regions may have a “salient-like” appearance, and ii) salient objects may have different-looking regions. To handle these complex scenarios, we propose a Feature Guide Network which exploits the nature of low-level and high-level features to i) make foreground and background regions more distinct and suppress non-salient regions with a “salient-like” appearance, and ii) assign the foreground label to different-looking salient regions. Furthermore, we utilize a Multi-scale Feature Extraction Module (MFEM) at each level of abstraction to obtain multi-scale contextual information. Finally, we design a loss function which outperforms the widely used cross-entropy loss. By adopting four different pre-trained models as the backbone, we show that our method is general with respect to the choice of the backbone model. Experiments on five challenging datasets demonstrate that our method achieves state-of-the-art performance on different evaluation metrics. Additionally, our approach contains fewer parameters than existing ones, does not need any post-processing, and runs at a real-time speed of 28 FPS when processing a 480 x 480 image. |
Tasks | Object Detection, Saliency Detection, Salient Object Detection |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13168v1 |
https://arxiv.org/pdf/1911.13168v1.pdf | |
PWC | https://paperswithcode.com/paper/cagnet-content-aware-guidance-for-salient |
Repo | |
Framework | |
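A hedged sketch of an MFEM-style block: parallel dilated convolutions gather context at several scales and a 1x1 convolution fuses them. Channel sizes and dilation rates are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class MultiScaleFeatureExtraction(nn.Module):
    """Parallel dilated 3x3 convolutions capture context at several scales;
    a 1x1 convolution fuses the concatenated branches."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for d in dilations)
        self.fuse = nn.Conv2d(out_ch * len(dilations), out_ch, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 64, 60, 60)   # e.g. a 480x480 image after 8x downsampling
print(MultiScaleFeatureExtraction(64, 32)(x).shape)  # torch.Size([1, 32, 60, 60])
```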
Learning Twitter User Sentiments on Climate Change with Limited Labeled Data
Title | Learning Twitter User Sentiments on Climate Change with Limited Labeled Data |
Authors | Allison Koenecke, Jordi Feliu-Fabà |
Abstract | While it is well-documented that climate change accepters and deniers have become increasingly polarized in the United States over time, there has been no large-scale examination of whether these individuals are prone to changing their opinions as a result of natural external occurrences. On the sub-population of Twitter users, we examine whether climate change sentiment changes in response to five separate natural disasters occurring in the U.S. in 2018. We begin by showing that relevant tweets can be classified with over 75% accuracy as either accepting or denying climate change when using our methodology to compensate for limited labeled data; results are robust across several machine learning models and yield geographic-level results in line with prior research. We then apply RNNs to conduct a cohort-level analysis showing that the 2018 hurricanes yielded a statistically significant increase in average tweet sentiment affirming climate change. However, this effect does not hold for the 2018 blizzard and wildfires studied, implying that Twitter users’ opinions on climate change are fairly ingrained with respect to these types of natural disasters. |
Tasks | |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07342v1 |
http://arxiv.org/pdf/1904.07342v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-twitter-user-sentiments-on-climate |
Repo | |
Framework | |
Understanding complex predictive models with Ghost Variables
Title | Understanding complex predictive models with Ghost Variables |
Authors | Pedro Delicado, Daniel Peña |
Abstract | We propose a procedure for assigning a relevance measure to each explanatory variable in a complex predictive model. We assume that we have a training set to fit the model and a test set to check its out-of-sample performance. First, the individual relevance of each variable is computed by comparing the predictions in the test set given by the model that includes all the variables with those of another model in which the variable of interest is substituted by its ghost variable, defined as the prediction of this variable from the rest of the explanatory variables. Second, we check the joint effects among the variables by using the eigenvalues of a relevance matrix, the covariance matrix of the vectors of individual effects. It is shown that in simple models, such as linear or additive models, the proposed measures are related to standard measures of variable significance, and that in neural network models (and other algorithmic prediction models) the procedure provides information about the joint and individual effects of the variables that is not usually available by other methods. The procedure is illustrated with simulated examples and the analysis of a large real dataset. |
Tasks | |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06407v2 |
https://arxiv.org/pdf/1912.06407v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-complex-predictive-models-with |
Repo | |
Framework | |
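A minimal sketch of the ghost-variable procedure: each test-set variable is replaced by its prediction from the remaining variables, and the change in the model's predictions yields an individual relevance; the per-observation effect vectors (returned below) are what the relevance matrix, whose eigenvalues capture joint effects, is built from. The linear-regression ghosts and the variance normalization are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

def ghost_relevance(model, X_tr, X_te, y_te):
    """Relevance of each variable: replace it in the test set by its 'ghost'
    (its prediction from the remaining variables) and measure the change
    in the model's test predictions."""
    base = model.predict(X_te)
    effects = np.empty_like(X_te)
    for j in range(X_te.shape[1]):
        rest = np.delete(np.arange(X_te.shape[1]), j)
        ghost = LinearRegression().fit(X_tr[:, rest], X_tr[:, j])
        X_ghost = X_te.copy()
        X_ghost[:, j] = ghost.predict(X_te[:, rest])
        effects[:, j] = model.predict(X_ghost) - base
    relevance = (effects**2).mean(axis=0) / np.var(y_te)
    return relevance, effects   # cov(effects) is the relevance matrix

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 4))
y = 2 * X[:, 0] + X[:, 1] + rng.normal(size=500)   # X2, X3 are irrelevant
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
rel, eff = ghost_relevance(model, X_tr, X_te, y_te)
print(rel.round(3))   # largest for the first two variables
```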