Paper Group ANR 705
Using a Classifier Ensemble for Proactive Quality Monitoring and Control: the impact of the choice of classifiers types, selection criterion, and fusion process. Pore detection in high-resolution fingerprint images using Deep Residual Network. $\ell_1$-regression with Heavy-tailed Distributions. Average-Case Information Complexity of Learning. Expl …
Using a Classifier Ensemble for Proactive Quality Monitoring and Control: the impact of the choice of classifiers types, selection criterion, and fusion process
Title | Using a Classifier Ensemble for Proactive Quality Monitoring and Control: the impact of the choice of classifiers types, selection criterion, and fusion process |
Authors | Philippe Thomas, Hind Bril El Haouzi, Marie-Christine Suhner, André Thomas, Emmanuel Zimmermann, Mélanie Noyel |
Abstract | Manufacturing processes nowadays face many external and internal changes (e.g., increasingly customized products, rescheduling, and process-reliability issues), which makes monitoring and quality-management activities difficult. Managers therefore need more proactive approaches to deal with this variability. In this study, we propose a proactive quality monitoring and control approach based on classifiers that predict defect occurrences and provide optimal values for factors critical to process quality. In previous work (Noyel et al. 2013), a classification approach was used to improve the quality of a lacquering process at a company plant; the results obtained were promising, but the accuracy of the classification model needs to be improved. One way to achieve this is to construct a committee of classifiers (referred to as an ensemble) that yields a better predictive model than any of its constituent models. However, selecting the best classification methods and constructing the final ensemble remain challenging. In this study, we analyze the impact of the choice of classifier types on the accuracy of the classifier ensemble, and we also explore the effects of the selection criterion and fusion process on ensemble accuracy. Several fusion scenarios were tested and compared on a real-world case. Our results show that ensemble classification increases the accuracy of the classifier models; consequently, the monitoring and control of the considered real-world case can be improved. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01684v1 |
http://arxiv.org/pdf/1804.01684v1.pdf | |
PWC | https://paperswithcode.com/paper/using-a-classifier-ensemble-for-proactive |
Repo | |
Framework | |
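The fusion step this abstract discusses can be illustrated with a minimal majority-vote ensemble. Note this is only a generic sketch: the three classifier types, the synthetic data, and the voting scheme below are hypothetical stand-ins, not the classifiers or industrial dataset used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a defect-occurrence dataset (hypothetical).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Three classifiers of different types, fused by majority vote ("hard");
# voting="soft" would average predicted probabilities instead -- two
# common fusion processes one might compare.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("mlp", MLPClassifier(max_iter=500, random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_tr, y_tr)
print(round(ensemble.score(X_te, y_te), 2))
```

Comparing this score against each constituent classifier fitted alone is one simple way to check whether the ensemble actually helps.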
Pore detection in high-resolution fingerprint images using Deep Residual Network
Title | Pore detection in high-resolution fingerprint images using Deep Residual Network |
Authors | Vijay Anand, Vivek Kanhangad |
Abstract | This letter presents a residual learning-based convolutional neural network, referred to as DeepResPore, for the detection of pores in high-resolution fingerprint images. Specifically, the proposed DeepResPore model generates a pore intensity map from the input fingerprint image. Subsequently, a local maxima filter is applied to the pore intensity map to identify the pore coordinates. The results of our experiments indicate that the proposed approach is effective in extracting pores, with a true detection rate of 94.49% on Test set I and 93.78% on Test set II of the publicly available PolyU HRF dataset. Most importantly, the proposed approach achieves state-of-the-art performance on both test sets. |
Tasks | |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.01986v2 |
http://arxiv.org/pdf/1809.01986v2.pdf | |
PWC | https://paperswithcode.com/paper/pore-detection-in-high-resolution-fingerprint |
Repo | |
Framework | |
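The second stage the abstract describes (local maxima over a pore intensity map) can be sketched with a plain maximum filter. The intensity map below is random, and the window size and confidence threshold are illustrative guesses, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import maximum_filter

rng = np.random.default_rng(0)
# Random stand-in for the pore intensity map DeepResPore would produce.
intensity = rng.random((64, 64))

# A pixel is a pore candidate if it equals the maximum of its local
# window AND exceeds a confidence threshold (both values illustrative).
local_max = maximum_filter(intensity, size=5) == intensity
candidates = local_max & (intensity > 0.9)
rows, cols = np.nonzero(candidates)
coords = list(zip(rows.tolist(), cols.tolist()))
print(len(coords))
```

On a real intensity map, the resulting coordinate list would then be matched against ground-truth pore annotations to compute the true detection rate.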
$\ell_1$-regression with Heavy-tailed Distributions
Title | $\ell_1$-regression with Heavy-tailed Distributions |
Authors | Lijun Zhang, Zhi-Hua Zhou |
Abstract | In this paper, we consider the problem of linear regression with heavy-tailed distributions. Different from previous studies that use the squared loss to measure the performance, we choose the absolute loss, which is capable of estimating the conditional median. To address the challenge that both the input and output could be heavy-tailed, we propose a truncated minimization problem, and demonstrate that it enjoys an $\widetilde{O}(\sqrt{d/n})$ excess risk, where $d$ is the dimensionality and $n$ is the number of samples. Compared with traditional work on $\ell_1$-regression, the main advantage of our result is that we achieve a high-probability risk bound without exponential moment conditions on the input and output. Furthermore, if the input is bounded, we show that the classical empirical risk minimization is competent for $\ell_1$-regression even when the output is heavy-tailed. |
Tasks | |
Published | 2018-05-02 |
URL | http://arxiv.org/abs/1805.00616v4 |
http://arxiv.org/pdf/1805.00616v4.pdf | |
PWC | https://paperswithcode.com/paper/ell_1-regression-with-heavy-tailed |
Repo | |
Framework | |
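The bounded-input case the abstract mentions, where plain empirical risk minimization with the absolute loss suffices, can be sketched with subgradient descent. This does not reproduce the paper's truncated minimization; the step sizes, iteration count, and data-generating choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.uniform(-1, 1, size=(n, d))      # bounded inputs
w_true = np.array([1.0, -2.0, 0.5])
# Heavy-tailed output noise: Student's t with 2 degrees of freedom.
y = X @ w_true + rng.standard_t(df=2, size=n)

# ERM with the absolute loss via subgradient descent; the subgradient
# of mean |y - Xw| with respect to w is -X^T sign(y - Xw) / n.
w = np.zeros(d)
for t in range(1, 2001):
    g = -X.T @ np.sign(y - X @ w) / n
    w -= 0.5 / np.sqrt(t) * g            # diminishing step sizes
print(np.round(w, 2))
```

Because the absolute loss targets the conditional median, the estimate is not dragged around by the heavy-tailed noise the way a squared-loss fit would be.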
Average-Case Information Complexity of Learning
Title | Average-Case Information Complexity of Learning |
Authors | Ido Nachum, Amir Yehudayoff |
Abstract | How many bits of information are revealed by a learning algorithm for a concept class of VC-dimension $d$? Previous works have shown that even for $d=1$ the amount of information may be unbounded (tend to $\infty$ with the universe size). Can it be that all concepts in the class require leaking a large amount of information? We show that typically concepts do not require leakage. There exists a proper learning algorithm that reveals $O(d)$ bits of information for most concepts in the class. This result is a special case of a more general phenomenon we explore. If there is a low information learner when the algorithm {\em knows} the underlying distribution on inputs, then there is a learner that reveals little information on an average concept {\em without knowing} the distribution on inputs. |
Tasks | |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.09923v1 |
http://arxiv.org/pdf/1811.09923v1.pdf | |
PWC | https://paperswithcode.com/paper/average-case-information-complexity-of |
Repo | |
Framework | |
Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks
Title | Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks |
Authors | Diego Marcheggiani, Joost Bastings, Ivan Titov |
Abstract | Semantic representations have long been argued to be potentially useful for enforcing meaning preservation and improving the generalization performance of machine translation methods. In this work, we are the first to incorporate information about the predicate-argument structure of source sentences (namely, semantic-role representations) into neural machine translation. We use Graph Convolutional Networks (GCNs) to inject a semantic bias into sentence encoders and achieve improvements in BLEU scores over the linguistic-agnostic and syntax-aware versions on the English–German language pair. |
Tasks | Machine Translation |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08313v1 |
http://arxiv.org/pdf/1804.08313v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-semantics-in-neural-machine |
Repo | |
Framework | |
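A single graph-convolutional layer of the kind used to inject predicate-argument structure can be sketched in a few lines of NumPy. The adjacency matrix, dimensions, and activation below are illustrative; this is the generic GCN propagation rule of Kipf & Welling (2017), not the authors' exact semantic-role encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, d_in, d_out = 5, 8, 8

# Token representations from a sentence encoder (random stand-ins).
H = rng.normal(size=(n_words, d_in))

# Adjacency built from semantic-role edges, plus self-loops
# (a hypothetical 5-word sentence with one predicate-argument edge).
A = np.zeros((n_words, n_words))
A[0, 2] = A[2, 0] = 1.0
A += np.eye(n_words)

# Symmetric normalization D^{-1/2} A D^{-1/2}.
deg = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(deg, deg))

W = rng.normal(size=(d_in, d_out))
H_next = np.maximum(A_hat @ H @ W, 0.0)   # one GCN layer with ReLU
print(H_next.shape)
```

Each token's new representation mixes its own features with those of its graph neighbors, which is how role information reaches the encoder states.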
A Hybrid Model for Identity Obfuscation by Face Replacement
Title | A Hybrid Model for Identity Obfuscation by Face Replacement |
Authors | Qianru Sun, Ayush Tewari, Weipeng Xu, Mario Fritz, Christian Theobalt, Bernt Schiele |
Abstract | As more and more personal photos are shared and tagged in social media, avoiding privacy risks such as unintended recognition becomes increasingly challenging. We propose a new hybrid approach to obfuscate identities in photos by head replacement. Our approach combines state-of-the-art parametric face synthesis with the latest advances in Generative Adversarial Networks (GANs) for data-driven image synthesis. On the one hand, the parametric part of our method gives us control over the facial parameters and allows for explicit manipulation of the identity. On the other hand, the data-driven aspects allow for adding fine details and overall realism, as well as seamless blending into the scene context. In our experiments, we show highly realistic output of our system that improves over the previous state of the art in obfuscation rate while preserving a higher similarity to the original image content. |
Tasks | Face Generation, Image Generation |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.04779v2 |
http://arxiv.org/pdf/1804.04779v2.pdf | |
PWC | https://paperswithcode.com/paper/a-hybrid-model-for-identity-obfuscation-by |
Repo | |
Framework | |
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Title | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge |
Authors | Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan Berant |
Abstract | When answering a question, people often draw upon their rich world knowledge in addition to the particular context. Recent work has focused primarily on answering questions given some relevant document or context, and required very little general background. To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering. To capture common sense beyond associations, we extract from ConceptNet (Speer et al., 2017) multiple target concepts that have the same semantic relation to a single source concept. Crowd-workers are asked to author multiple-choice questions that mention the source concept and discriminate in turn between each of the target concepts. This encourages workers to create questions with complex semantics that often require prior knowledge. We create 12,247 questions through this procedure and demonstrate the difficulty of our task with a large number of strong baselines. Our best baseline is based on BERT-large (Devlin et al., 2018) and obtains 56% accuracy, well below human performance, which is 89%. |
Tasks | Common Sense Reasoning, Question Answering |
Published | 2018-11-02 |
URL | http://arxiv.org/abs/1811.00937v2 |
http://arxiv.org/pdf/1811.00937v2.pdf | |
PWC | https://paperswithcode.com/paper/commonsenseqa-a-question-answering-challenge |
Repo | |
Framework | |
Pathological Evidence Exploration in Deep Retinal Image Diagnosis
Title | Pathological Evidence Exploration in Deep Retinal Image Diagnosis |
Authors | Yuhao Niu, Lin Gu, Feng Lu, Feifan Lv, Zongji Wang, Imari Sato, Zijian Zhang, Yangyan Xiao, Xunzhang Dai, Tingting Cheng |
Abstract | Though deep learning has shown successful performance in classifying the label and severity stage of certain diseases, most models give little evidence of how they make their predictions. Here, we propose to exploit the interpretability of deep learning applications in medical diagnosis. Inspired by Koch’s Postulates, a well-known strategy in medical research for identifying the properties of a pathogen, we define a pathological descriptor that can be extracted from the activated neurons of a diabetic retinopathy detector. To visualize the symptoms and features encoded in this descriptor, we propose a GAN-based method to synthesize a pathological retinal image given the descriptor and a binary vessel segmentation. Moreover, with this descriptor, we can arbitrarily manipulate the position and quantity of lesions. As verified by a panel of 5 licensed ophthalmologists, our synthesized images carry symptoms that are directly related to diabetic retinopathy diagnosis. The panel survey also shows that our generated images are both qualitatively and quantitatively superior to those of existing methods. |
Tasks | Medical Diagnosis |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02640v1 |
http://arxiv.org/pdf/1812.02640v1.pdf | |
PWC | https://paperswithcode.com/paper/pathological-evidence-exploration-in-deep |
Repo | |
Framework | |
Evaluating Word Embedding Hyper-Parameters for Similarity and Analogy Tasks
Title | Evaluating Word Embedding Hyper-Parameters for Similarity and Analogy Tasks |
Authors | Maryam Fanaeepour, Adam Makarucha, Jey Han Lau |
Abstract | The versatility of word embeddings for various applications is attracting researchers from various fields. However, the impact of hyper-parameters when training embedding models is often poorly understood. How much do hyper-parameters such as vector dimensions and corpus size affect the quality of embeddings, and how do these results translate to downstream applications? Using standard embedding evaluation metrics and datasets, we conduct a study to empirically measure the impact of these hyper-parameters. |
Tasks | Word Embeddings |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04211v1 |
http://arxiv.org/pdf/1804.04211v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-word-embedding-hyper-parameters |
Repo | |
Framework | |
Batch Normalization and the impact of batch structure on the behavior of deep convolution networks
Title | Batch Normalization and the impact of batch structure on the behavior of deep convolution networks |
Authors | Mohamed Hajaj, Duncan Gillies |
Abstract | Batch normalization was introduced in 2015 to speed up the training of deep convolution networks by normalizing the activations across the current batch to have zero mean and unit variance. The results presented here show an interesting aspect of batch normalization: controlling the shape of the training batches can influence what the network will learn. If training batches are structured as balanced batches (one image per class), and inference is also carried out on balanced test batches using the batch’s own means and variances, then the conditional results improve considerably. The network uses the strong information about easy images in a balanced batch and propagates it, through the shared means and variances, to help decide the identity of harder images in the same batch. Balancing the test batches requires the labels of the test images, which are not available in practice; however, further investigation can be done using batch structures that are less strict and might not require the test image labels. The conditional results show the error rate almost reduced to zero for nontrivial datasets with a small number of classes, such as CIFAR-10. |
Tasks | |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07590v1 |
http://arxiv.org/pdf/1802.07590v1.pdf | |
PWC | https://paperswithcode.com/paper/batch-normalization-and-the-impact-of-batch |
Repo | |
Framework | |
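The inference trick the abstract describes, normalizing a test batch with its own statistics rather than with running averages, can be sketched directly. The toy tensor below stands in for a balanced batch of per-channel activations; gamma and beta are fixed at 1 and 0 for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, channels = 10, 4           # e.g. one image per class in CIFAR-10
x = rng.normal(loc=3.0, scale=2.0, size=(batch, channels))

# Batch normalization using the batch's *own* mean and variance, which
# is what the paper applies to balanced test batches at inference time
# (standard inference would use running averages from training instead).
eps = 1e-5
mean = x.mean(axis=0)
var = x.var(axis=0)
x_norm = (x - mean) / np.sqrt(var + eps)
print(np.round(x_norm.mean(axis=0), 6))
```

Because every example in the batch shares `mean` and `var`, information from easy examples leaks into the normalized representations of hard ones, which is the effect the paper investigates.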
Cascaded Coarse-to-Fine Deep Kernel Networks for Efficient Satellite Image Change Detection
Title | Cascaded Coarse-to-Fine Deep Kernel Networks for Efficient Satellite Image Change Detection |
Authors | Hichem Sahbi |
Abstract | Deep networks are nowadays becoming popular in many computer vision and pattern recognition tasks. Among these networks, deep kernels are particularly interesting and effective; however, their computational complexity is a major issue, especially on cheap hardware resources. In this paper, we address the issue of efficient computation in deep kernel networks. We propose a novel framework that dramatically reduces the complexity of evaluating these deep kernels. Our method is based on a coarse-to-fine cascade of networks designed for efficient computation; early stages of the cascade are cheap and reject many patterns efficiently, while deep stages are more expensive and accurate. The design principle of these reduced-complexity networks is based on a variant of the cross-entropy criterion that reduces the complexity of the networks in the cascade while preserving all the positive responses of the original kernel network. Experiments conducted on the challenging and time-demanding change detection task, on very large satellite images, show that our proposed coarse-to-fine approach is effective and highly efficient. |
Tasks | |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.09119v1 |
http://arxiv.org/pdf/1812.09119v1.pdf | |
PWC | https://paperswithcode.com/paper/cascaded-coarse-to-fine-deep-kernel-networks |
Repo | |
Framework | |
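The coarse-to-fine idea (cheap early stages reject most patterns, an expensive stage handles the survivors) can be sketched generically. The stage functions and thresholds below are hypothetical scalar scores, not the paper's kernel networks; the key property mirrored here is that the cheap stage never rejects anything the accurate stage would accept.

```python
import numpy as np

rng = np.random.default_rng(0)
patterns = rng.random(1000)       # stand-ins for image patches to score

def cheap_stage(x):
    # Fast, conservative filter: rejects most patterns while keeping
    # everything the accurate stage would accept (0.8 < 0.9 below).
    return x > 0.8

def expensive_stage(x):
    # Accurate but costly decision, run only on survivors.
    return x > 0.9

survivors = patterns[cheap_stage(patterns)]
positives = survivors[expensive_stage(survivors)]
print(len(survivors), len(positives))
```

Since only a small fraction of patterns reaches the expensive stage, total cost drops sharply while the set of positives is unchanged, which is the behavior the cascade's training criterion is designed to preserve.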
Neologisms on Facebook
Title | Neologisms on Facebook |
Authors | Nikita Muravyev, Alexander Panchenko, Sergei Obiedkov |
Abstract | In this paper, we present a study of neologisms and loan words frequently occurring in Facebook user posts. We have analyzed a dataset of several million publicly available posts written during 2006-2013 by Russian-speaking Facebook users. From these, we have built a vocabulary of the most frequent lemmatized words missing from the OpenCorpora dictionary, the assumption being that many such words have entered common use only recently. This assumption is certainly not true for all the words extracted in this way; for that reason, we manually filtered the automatically obtained list in order to exclude non-Russian or incorrectly lemmatized words, as well as words recorded by other dictionaries or those occurring in texts from the Russian National Corpus. The result is a list of 168 words that can potentially be considered neologisms. We present an attempt at an etymological classification of these neologisms (unsurprisingly, most of them have recently been borrowed from English, but there are also quite a few new words composed of previously borrowed stems) and identify various derivational patterns. We also classify words into several large thematic areas, “internet”, “marketing”, and “multimedia” being among those with the largest number of words. We believe that, together with the word base collected in the process, they can serve as a starting point in further studies of neologisms and lexical processes that lead to their acceptance into the mainstream language. |
Tasks | |
Published | 2018-04-13 |
URL | http://arxiv.org/abs/1804.05831v1 |
http://arxiv.org/pdf/1804.05831v1.pdf | |
PWC | https://paperswithcode.com/paper/neologisms-on-facebook |
Repo | |
Framework | |
Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs
Title | Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs |
Authors | Francisco M. Castro, Nicolás Guil, Manuel J. Marín-Jiménez, Jesús Pérez-Serrano, Manuel Ujaldón |
Abstract | Deep Learning (DL) applications are gaining momentum in the realm of Artificial Intelligence, particularly after GPUs have demonstrated remarkable skills for accelerating their challenging computational requirements. Within this context, Convolutional Neural Network (CNN) models constitute a representative example of success on a wide set of complex applications, particularly on datasets where the target can be represented through a hierarchy of local features of increasing semantic complexity. In most real scenarios, the roadmap to improve results relies on CNN settings involving brute-force computation, and researchers have lately proven Nvidia GPUs to be one of the best hardware counterparts for acceleration. Our work complements those findings with an energy study on critical parameters for the deployment of CNNs on flagship image and video applications: object recognition and people identification by gait, respectively. We evaluate energy consumption on four different networks based on the two most popular ones (ResNet/AlexNet): ResNet (167 layers), a 2D CNN (15 layers), a CaffeNet (25 layers) and a ResNetIm (94 layers), using batch sizes of 64, 128 and 256, and then correlate those with speed-up and accuracy to determine optimal settings. Experimental results on a multi-GPU server endowed with twin Maxwell and twin Pascal Titan X GPUs demonstrate that energy correlates with performance and that Pascal may have up to 40% gains versus Maxwell. Larger batch sizes extend performance gains and energy savings, but we have to keep an eye on accuracy, which sometimes shows a preference for small batches. We expect this work to provide preliminary guidance for a wide set of CNN and DL applications in modern HPC times, where the GFLOPS/W ratio constitutes the primary goal. |
Tasks | Object Recognition |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00286v1 |
http://arxiv.org/pdf/1808.00286v1.pdf | |
PWC | https://paperswithcode.com/paper/energy-based-tuning-of-convolutional-neural |
Repo | |
Framework | |
Equivalence Between Wasserstein and Value-Aware Loss for Model-based Reinforcement Learning
Title | Equivalence Between Wasserstein and Value-Aware Loss for Model-based Reinforcement Learning |
Authors | Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman |
Abstract | Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is a simple task, learning a useful model in the approximate setting is challenging. In this context, an important question is the loss function used for model learning, as varying the loss function can have a remarkable impact on the effectiveness of planning. Recently, Farahmand et al. (2017) proposed a value-aware model learning (VAML) objective that captures the structure of the value function during model learning. Using tools from Asadi et al. (2018), we show that minimizing the VAML objective is in fact equivalent to minimizing the Wasserstein metric. This equivalence improves our understanding of value-aware models, and also creates a theoretical foundation for applications of Wasserstein in model-based reinforcement learning. |
Tasks | |
Published | 2018-06-01 |
URL | http://arxiv.org/abs/1806.01265v2 |
http://arxiv.org/pdf/1806.01265v2.pdf | |
PWC | https://paperswithcode.com/paper/equivalence-between-wasserstein-and-value |
Repo | |
Framework | |
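The two objectives whose equivalence the abstract asserts can be written side by side. The forms below follow the standard Kantorovich dual of the Wasserstein-1 metric and the general shape of Farahmand et al.'s value-aware loss; they are paraphrased for illustration, not quoted from the paper.

```latex
% Kantorovich dual form of the Wasserstein-1 metric between two
% next-state distributions p and q:
W_1(p, q) \;=\; \sup_{f :\, \mathrm{Lip}(f) \le 1}
  \Big|\, \mathbb{E}_{s' \sim p} f(s') - \mathbb{E}_{s' \sim q} f(s') \,\Big|

% VAML-style loss of a learned model \hat{p} against the true model p,
% taken over a class V of candidate value functions:
\ell(\hat{p}, p) \;=\; \sup_{v \in V}
  \Big|\, \mathbb{E}_{s' \sim p(\cdot \mid s,a)} v(s')
        - \mathbb{E}_{s' \sim \hat{p}(\cdot \mid s,a)} v(s') \,\Big|
```

Informally, when the value-function class $V$ is taken to be the 1-Lipschitz functions, the two expressions coincide term by term, which is the correspondence the paper makes precise.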
Bundled fragments of first-order modal logic: (un)decidability
Title | Bundled fragments of first-order modal logic: (un)decidability |
Authors | Anantha Padmanabha, R. Ramanujam, Yanjing Wang |
Abstract | Quantified modal logic provides a natural logical language for reasoning about modal attitudes even while retaining the richness of quantification for referring to predicates over domains. But then most fragments of the logic are undecidable, over many model classes. Over the years, only a few fragments (such as the monodic) have been shown to be decidable. In this paper, we study fragments that bundle quantifiers and modalities together, inspired by earlier work on epistemic logics of know-how/why/what. As always with quantified modal logics, it makes a significant difference whether the domain stays the same across worlds, or not. In particular, we show that the bundle $\forall \Box$ is undecidable over constant domain interpretations, even with only monadic predicates, whereas the $\exists \Box$ bundle is decidable. On the other hand, over increasing domain interpretations, we get decidability with both $\forall \Box$ and $\exists \Box$ bundles with unrestricted predicates. In these cases, we also obtain tableau-based procedures that run in PSPACE. We further show that the $\exists \Box$ bundle cannot distinguish between constant domain and increasing domain interpretations. |
Tasks | |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10508v1 |
http://arxiv.org/pdf/1803.10508v1.pdf | |
PWC | https://paperswithcode.com/paper/bundled-fragments-of-first-order-modal-logic |
Repo | |
Framework | |