January 26, 2020

Paper Group ANR 1611

C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning. Understanding Stability of Medical Concept Embeddings: Analysis and Prediction. Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation. Crowdsourcing and Validating Event-focused Emotion Corpora for German and English. Deep Learning Fundus Image Analy …

C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning


Title	C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning
Authors	Chih-Yao Chiu, Hwann-Tzong Chen, Tyng-Luh Liu
Abstract	This paper describes a channel-selection approach for simplifying deep neural networks. Specifically, we propose a new type of generic network layer, called a pruning layer, to seamlessly augment a given pre-trained model for compression. Each pruning layer, comprising $1 \times 1$ depth-wise kernels, is represented in a dual format: one is real-valued and the other is binary. The former enables a two-phase optimization process of network pruning to operate with an end-to-end differentiable network, and the latter yields the mask information for channel selection. Our method progressively performs the pruning task layer-wise, and achieves channel selection according to a sparsity criterion that favors pruning more channels. We also develop a cost-aware mechanism to prevent the compression from sacrificing the expected network performance. Our results for compressing several benchmark deep networks on image classification and semantic segmentation are comparable to those of state-of-the-art methods.
Tasks	Image Classification, Network Pruning, Semantic Segmentation
Published	2019-04-06
URL	http://arxiv.org/abs/1904.03508v1
PDF	http://arxiv.org/pdf/1904.03508v1.pdf
PWC	https://paperswithcode.com/paper/c2s2-cost-aware-channel-sparse-selection-for
Repo
Framework
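
The dual real-valued/binary representation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the thresholding rule, the function names, and the example values are assumptions made only for illustration. The sketch relies on the fact that a 1×1 depth-wise kernel amounts to one scalar multiplier per channel.

```python
# Hypothetical sketch of a pruning layer's binary side: one scalar per channel.
# The threshold-based binarization is an assumed rule, not taken from the paper.

def binarize(real_weights, threshold=0.5):
    """Derive a 0/1 channel mask from real-valued per-channel weights."""
    return [1 if w > threshold else 0 for w in real_weights]


def apply_pruning_layer(feature_maps, mask):
    """Scale every value of each channel by that channel's mask entry."""
    return [[m * v for v in channel] for channel, m in zip(feature_maps, mask)]


def kept_channels(mask):
    """Indices of channels that survive pruning."""
    return [i for i, m in enumerate(mask) if m == 1]


# Three channels, each flattened to two values (illustrative numbers).
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = binarize([0.9, 0.1, 0.7])
pruned = apply_pruning_layer(features, mask)
```

Channels whose mask entry is 0 produce all-zero outputs and could be removed from the network together with the filters that produce them.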

Understanding Stability of Medical Concept Embeddings: Analysis and Prediction


Title	Understanding Stability of Medical Concept Embeddings: Analysis and Prediction
Authors	Grace E. Lee, Aixin Sun
Abstract	In the biomedical area, medical concepts linked to external knowledge bases (e.g., UMLS) are frequently used for accurate and effective representations. Many studies develop embeddings for medical concepts on biomedical corpora and evaluate the overall quality of concept embeddings. However, the quality of individual concept embeddings has not been carefully investigated. We analyze the quality of medical concept embeddings trained with word2vec in terms of embedding stability. From the analysis, we observe that some concept embeddings are unaffected by different hyperparameter values in word2vec and retain poor stability. Moreover, when the stability of concept embeddings is analyzed in terms of frequency, many low-frequency concepts achieve stability as high as that of high-frequency concepts. The findings suggest that there are other factors influencing the stability of medical concept embeddings. In this paper, we propose a new factor, the distribution of context words, to predict the stability of medical concept embeddings. By estimating the distribution of context words using normalized entropy, we show that a skewed distribution has a moderate correlation with the stability of concept embeddings. The result demonstrates that a medical concept for which a large portion of context words is taken up by a few words is able to obtain high stability, even though its frequency is low. The clear correlation between the proposed factor and the stability of medical concept embeddings makes it possible to predict the medical concepts with low-quality embeddings even prior to training.
Tasks
Published	2019-04-21
URL	http://arxiv.org/abs/1904.09552v1
PDF	http://arxiv.org/pdf/1904.09552v1.pdf
PWC	https://paperswithcode.com/paper/understanding-stability-of-medical-concept
Repo
Framework
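
The normalized entropy mentioned in the abstract can be sketched as follows. The normalization by the logarithm of the number of distinct context words is a common convention and is assumed here; the paper may define it differently. The function name and the sample inputs are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(context_words):
    """Shannon entropy of the context-word distribution, scaled to [0, 1].

    Assumption: normalization divides by log(number of distinct context words).
    Values near 0 indicate a skewed distribution dominated by a few words.
    """
    counts = Counter(context_words)
    if len(counts) < 2:
        return 0.0  # a single distinct word carries no uncertainty
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


uniform = normalized_entropy(["a", "b", "c", "d"])   # evenly spread contexts
skewed = normalized_entropy(["a"] * 9 + ["b"])       # one dominant context word
```

Under this convention, the abstract's finding corresponds to concepts with low normalized entropy tending to have more stable embeddings, regardless of frequency.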

Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation


Title	Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation
Authors	Chenhui Chu, Raj Dabre
Abstract	In this paper, we propose two novel methods for domain adaptation for the attention-only neural machine translation (NMT) model, i.e., the Transformer. Our methods focus on training a single translation model for multiple domains by either learning domain specialized hidden state representations or predictor biases for each domain. We combine our methods with a previously proposed black-box method called mixed fine tuning, which is known to be highly effective for domain adaptation. In addition, we incorporate multilingualism into the domain adaptation framework. Experiments show that multilingual multi-domain adaptation can significantly improve both resource-poor in-domain and resource-rich out-of-domain translations, and the combination of our methods with mixed fine tuning achieves the best performance.
Tasks	Domain Adaptation, Machine Translation
Published	2019-06-19
URL	https://arxiv.org/abs/1906.07978v2
PDF	https://arxiv.org/pdf/1906.07978v2.pdf
PWC	https://paperswithcode.com/paper/multilingual-multi-domain-adaptation
Repo
Framework

Crowdsourcing and Validating Event-focused Emotion Corpora for German and English


Title	Crowdsourcing and Validating Event-focused Emotion Corpora for German and English
Authors	Enrica Troiano, Sebastian Padó, Roman Klinger
Abstract	Sentiment analysis has a range of corpora available across multiple languages. For emotion analysis, the situation is more limited, which hinders potential research on cross-lingual modeling and the development of predictive models for other languages. In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in analogy to the well-established English ISEAR emotion dataset. Motivated by Scherer’s appraisal theory, we implement a crowdsourcing experiment which consists of two steps. In step 1, participants create descriptions of emotional events for a given emotion. In step 2, five annotators assess the emotion expressed by the texts. We show that transferring an emotion classification model from the original English ISEAR to the German crowdsourced deISEAR via machine translation does not, on average, cause a performance drop.
Tasks	Emotion Classification, Emotion Recognition, Machine Translation, Sentiment Analysis
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13618v1
PDF	https://arxiv.org/pdf/1905.13618v1.pdf
PWC	https://paperswithcode.com/paper/crowdsourcing-and-validating-event-focused
Repo
Framework

Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading


Title	Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading
Authors	Jaakko Sahlsten, Joel Jaskari, Jyri Kivinen, Lauri Turunen, Esa Jaanio, Kustaa Hietala, Kimmo Kaski
Abstract	Diabetes is a globally prevalent disease that can cause visible microvascular complications such as diabetic retinopathy and macular edema in the human eye retina, the images of which are today used for manual disease screening. This labor-intensive task could greatly benefit from automatic detection using deep learning techniques. Here we present a deep learning system that identifies referable diabetic retinopathy comparably to or better than previous studies, although we use only a small fraction of images (<1/4) in training but are aided by higher image resolutions. We also provide novel results for five different screening and clinical grading systems for diabetic retinopathy and macular edema classification, including results for accurately classifying images according to clinical five-grade diabetic retinopathy and four-grade diabetic macular edema scales. These results suggest that a deep learning system could increase the cost-effectiveness of screening while attaining higher than recommended performance, and that the system could be applied in clinical examinations requiring finer grading.
Tasks
Published	2019-04-16
URL	http://arxiv.org/abs/1904.08764v1
PDF	http://arxiv.org/pdf/1904.08764v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-fundus-image-analysis-for
Repo
Framework

Human-to-AI Coach: Improving Human Inputs to AI Systems


Title	Human-to-AI Coach: Improving Human Inputs to AI Systems
Authors	Johannes Schneider
Abstract	Humans increasingly interact with artificial intelligence (AI) systems. AI systems are optimized for objectives such as minimum computation or minimum error rate in recognizing and interpreting inputs from humans. In contrast, inputs created by humans are often treated as a given. We investigate how human inputs can be altered to reduce misinterpretation by the AI system and to improve the efficiency of input generation for the human, while the altered inputs remain as similar as possible to the original inputs. These objectives result in trade-offs that are analyzed for a deep learning system classifying handwritten digits. To create examples that serve as demonstrations for humans to improve, we develop a model based on a conditional convolutional autoencoder (CCAE). Our quantitative and qualitative evaluation shows that on many occasions the generated proposals lead to lower error rates, require less effort to create and differ only modestly from the original samples.
Tasks
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03652v2
PDF	https://arxiv.org/pdf/1912.03652v2.pdf
PWC	https://paperswithcode.com/paper/ai-how-can-humans-communicate-better-with-you
Repo
Framework

RUSLAN: Russian Spoken Language Corpus for Speech Synthesis


Title	RUSLAN: Russian Spoken Language Corpus for Speech Synthesis
Authors	Lenar Gabdrakhmanov, Rustem Garaev, Evgenii Razinkov
Abstract	We present RUSLAN, a new open Russian spoken language corpus for the text-to-speech task. RUSLAN contains 22200 audio samples with text annotations (more than 31 hours of high-quality speech of one person), making it the largest annotated Russian corpus in terms of speech duration for a single speaker. We trained an end-to-end neural network for the text-to-speech task on our corpus and evaluated the quality of the synthesized speech using a Mean Opinion Score (MOS) test. Synthesized speech achieves a score of 4.05 for naturalness and 3.78 for intelligibility on a 5-point MOS scale.
Tasks	Speech Synthesis
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11645v1
PDF	https://arxiv.org/pdf/1906.11645v1.pdf
PWC	https://paperswithcode.com/paper/ruslan-russian-spoken-language-corpus-for
Repo
Framework

Image-Based Geo-Localization Using Satellite Imagery


Title	Image-Based Geo-Localization Using Satellite Imagery
Authors	Sixing Hu, Gim Hee Lee
Abstract	The problem of localization on a geo-referenced satellite map given a query ground view image is useful yet remains challenging due to the drastic change in viewpoint. To this end, in this paper we work on the extension of our earlier work on the Cross-View Matching Network (CVM-Net) for the ground-to-aerial image matching task since the traditional image descriptors fail due to the drastic viewpoint change. In particular, we show more extensive experimental results and analyses of the network architecture on our CVM-Net. Furthermore, we propose a Markov localization framework that enforces the temporal consistency between image frames to enhance the geo-localization results in the case where a video stream of ground view images is available. Experimental results show that our proposed Markov localization framework can continuously localize the vehicle within a small error on our Singapore dataset.
Tasks
Published	2019-03-01
URL	https://arxiv.org/abs/1903.00159v3
PDF	https://arxiv.org/pdf/1903.00159v3.pdf
PWC	https://paperswithcode.com/paper/image-based-geo-localization-using-satellite
Repo
Framework
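
The idea of enforcing temporal consistency with Markov localization can be illustrated with a generic discrete Bayes filter. This is a minimal sketch under stated assumptions, not the authors' framework: the one-dimensional grid of map cells, the motion model, and the likelihood values (standing in for image-matching scores) are all hypothetical.

```python
# Generic discrete Markov localization over a 1D grid of map cells.
# Motion model and likelihood values are illustrative assumptions.

def normalize(values):
    """Rescale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def predict(belief, motion):
    """Propagate the belief with a motion model {cell_offset: probability}.

    Probability mass that would leave the map is dropped before renormalizing.
    """
    moved = [0.0] * len(belief)
    for i, b in enumerate(belief):
        for offset, p in motion.items():
            j = i + offset
            if 0 <= j < len(belief):
                moved[j] += b * p
    return normalize(moved)


def update(belief, likelihood):
    """Weight each cell by how well the current frame matches it."""
    return normalize([b * l for b, l in zip(belief, likelihood)])


belief = [0.2] * 5                                   # start with no prior knowledge
belief = update(belief, [0.1, 0.1, 0.8, 0.1, 0.1])   # frame matches cell 2 best
belief = predict(belief, {1: 1.0})                   # vehicle advances one cell
best_cell = max(range(len(belief)), key=belief.__getitem__)
```

Alternating `predict` and `update` over a video stream lets a single ambiguous frame be outvoted by the history of earlier frames.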

Action Recognition from Single Timestamp Supervision in Untrimmed Videos


Title	Action Recognition from Single Timestamp Supervision in Untrimmed Videos
Authors	Davide Moltisanti, Sanja Fidler, Dima Damen
Abstract	Recognising actions in videos relies on labelled supervision during training, typically the start and end times of each action instance. This supervision is not only subjective, but also expensive to acquire. Weak video-level supervision has been successfully exploited for recognition in untrimmed videos; however, it is challenged when the number of different actions in training videos increases. We propose a method that is supervised by single timestamps located around each action instance, in untrimmed videos. We replace expensive action bounds with sampling distributions initialised from these timestamps. We then use the classifier’s response to iteratively update the sampling distributions. We demonstrate that these distributions converge to the location and extent of discriminative action segments. We evaluate our method on three datasets for fine-grained recognition, with increasing number of different actions per video, and show that single timestamps offer a reasonable compromise between recognition performance and labelling effort, performing comparably to full temporal supervision. Our update method improves top-1 test accuracy by up to 5.4% across the evaluated datasets.
Tasks	Temporal Action Localization
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04689v1
PDF	http://arxiv.org/pdf/1904.04689v1.pdf
PWC	https://paperswithcode.com/paper/action-recognition-from-single-timestamp
Repo
Framework

Recurrent Neural Networks for P300-based BCI


Title	Recurrent Neural Networks for P300-based BCI
Authors	Ori Tal, Doron Friedman
Abstract	P300-based spellers are one of the main methods for EEG-based brain-computer interfaces, and the detection of the P300 target event with high accuracy is an important prerequisite. The rapid serial visual presentation (RSVP) protocol is of high interest because it can be used by patients who have lost control over their eyes. In this study we explore the suitability of recurrent neural networks (RNNs) as a machine learning method for identifying the P300 signal in RSVP data. We systematically compare RNNs with alternative methods such as linear discriminant analysis (LDA) and convolutional neural networks (CNNs). Our results indicate that LDA performs as well as the neural network models or better on single-subject data, but a network combining CNN and RNN has advantages when transferring learning among subjects, and is significantly more resilient to temporal noise than other methods.
Tasks	EEG
Published	2019-01-30
URL	http://arxiv.org/abs/1901.10798v1
PDF	http://arxiv.org/pdf/1901.10798v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-networks-for-p300-based-bci
Repo
Framework

A few filters are enough: Convolutional Neural Network for P300 Detection


Title	A few filters are enough: Convolutional Neural Network for P300 Detection
Authors	Alicia Montserrat Alvarado-Gonzalez, Gibran Fuentes-Pineda, Jorge Cervantes-Ojeda
Abstract	In this paper, we aim to contribute to the discussion about the usefulness of deep CNNs with several filters for both within-subject and cross-subject classification in single-trial P300 detection. To that end, we present SepConv1D, a simple Convolutional Neural Network architecture consisting of a depthwise separable 1D convolutional block followed by a Sigmoid classification block. Additionally, we present a one-layer Fully-Connected Neural Network with two neurons in the hidden layer to show that complex architectures are unnecessary for the problem under analysis. We compare their performances against CNN-based state-of-the-art architectures. The experiments did not show a statistically significant difference between their AUC. Moreover, SepConv1D has by far the lowest number of parameters of all. This is important because simpler, cheaper, faster and, thus, more portable devices can be built.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.06970v1
PDF	https://arxiv.org/pdf/1909.06970v1.pdf
PWC	https://paperswithcode.com/paper/a-few-filters-are-enough-convolutional-neural
Repo
Framework
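
The parameter saving of a depthwise separable 1D convolution over a standard one can be checked with simple arithmetic. The sketch below uses the usual parameter-count formulas with a single output bias; the channel counts and kernel size are illustrative assumptions and are not taken from the paper.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameters of a standard 1D convolution."""
    return in_channels * out_channels * kernel_size + (out_channels if bias else 0)


def separable_conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Parameters of a depthwise separable 1D convolution.

    Depthwise step: one kernel per input channel.
    Pointwise step: a 1x1 convolution that mixes channels.
    """
    depthwise = in_channels * kernel_size
    pointwise = in_channels * out_channels
    return depthwise + pointwise + (out_channels if bias else 0)


# Illustrative sizes only: 6 input channels, 4 filters, kernel width 16.
standard = conv1d_params(6, 4, 16)             # 6*4*16 + 4 = 388
separable = separable_conv1d_params(6, 4, 16)  # 6*16 + 6*4 + 4 = 124
```

The gap widens as the kernel size and the number of filters grow, because the standard layer multiplies all three factors while the separable layer only adds two products.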

The Use of Machine Learning and Big Five Personality Taxonomy to Predict Construction Workers’ Safety Behaviour


Title	The Use of Machine Learning and Big Five Personality Taxonomy to Predict Construction Workers’ Safety Behaviour
Authors	Yifan Gao, Vicente A. Gonzalez, Tak Wing Yiu, Guillermo Cabrera-Guerrero
Abstract	Research has found that many occupational accidents are foreseeable, being the result of people’s unsafe behaviour from a retrospective point of view. The prediction of workers’ safety behaviour will provide prior insight into each worker’s behavioural tendency, will be useful in the design of management practices prior to the occurrence of accidents, and will contribute to the reduction of injury rates. In recent years, researchers have found that people do have stable predispositions to engage in certain safety behavioural patterns which vary among individuals as a function of personality features. In this study, an innovative forecasting model, which employs machine learning algorithms, is developed to estimate construction workers’ behavioural tendency based on the Big Five personality taxonomy. The data-driven nature of machine learning techniques enabled a reliable estimate of the personality-safety behaviour relationship, which allowed this study to provide the novel insight that nonlinearity may exist in the relationship between construction workers’ personality traits and safety behaviour. The developed model is found to have satisfactory accuracy in explaining and predicting workers’ safety behaviour. This finding provides empirical evidence to support the usefulness of personality traits as effective predictors of people’s safety behaviour at work. In addition, this study could have practical implications. The machine learning model developed, which is shown to have good prediction accuracy, can help identify vulnerable workers who are more prone to unsafe behaviours and is thereby potentially useful for decision making and safety management on construction sites.
Tasks	Decision Making
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05944v1
PDF	https://arxiv.org/pdf/1912.05944v1.pdf
PWC	https://paperswithcode.com/paper/the-use-of-machine-learning-and-big-five
Repo
Framework

Modeling Gestalt Visual Reasoning on the Raven’s Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques


Title	Modeling Gestalt Visual Reasoning on the Raven’s Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques
Authors	Tianyu Hua, Maithilee Kunda
Abstract	Psychologists recognize Raven’s Progressive Matrices as a very effective test of general human intelligence. While many computational models have been developed by the AI community to investigate different forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test performance. In this work, we investigate how Gestalt visual reasoning on the Raven’s test can be modeled using generative image inpainting techniques from computer vision. We demonstrate that a self-supervised inpainting model trained only on photorealistic images of objects achieves a score of 27/36 on the Colored Progressive Matrices, which corresponds to average performance for nine-year-old children. We also show that models trained on other datasets (faces, places, and textures) do not perform as well. Our results illustrate how learning visual regularities in real-world images can translate into successful reasoning about artificial test stimuli. On the flip side, our results also highlight the limitations of such transfer, which may explain why intelligence tests like the Raven’s are often sensitive to people’s individual sociocultural backgrounds.
Tasks	Image Inpainting, Visual Reasoning
Published	2019-11-18
URL	https://arxiv.org/abs/1911.07736v2
PDF	https://arxiv.org/pdf/1911.07736v2.pdf
PWC	https://paperswithcode.com/paper/modeling-gestalt-visual-reasoning-on-the
Repo
Framework

On the Apparent Conflict Between Individual and Group Fairness


Title	On the Apparent Conflict Between Individual and Group Fairness
Authors	Reuben Binns
Abstract	A distinction has been drawn in fair machine learning research between 'group' and 'individual' fairness measures. Many technical research papers assume that both are important, but conflicting, and propose ways to minimise the trade-offs between these measures. This paper argues that this apparent conflict is based on a misconception. It draws on theoretical discussions from within the fair machine learning research, and from political and legal philosophy, to argue that individual and group fairness are not fundamentally in conflict. First, it outlines accounts of egalitarian fairness which encompass plausible motivations for both group and individual fairness, thereby suggesting that there need be no conflict in principle. Second, it considers the concept of individual justice, from legal philosophy and jurisprudence which seems similar but actually contradicts the notion of individual fairness as proposed in the fair machine learning literature. The conclusion is that the apparent conflict between individual and group fairness is more of an artifact of the blunt application of fairness measures, rather than a matter of conflicting principles. In practice, this conflict may be resolved by a nuanced consideration of the sources of 'unfairness' in a particular deployment context, and the carefully justified application of measures to mitigate it.
Tasks
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06883v1
PDF	https://arxiv.org/pdf/1912.06883v1.pdf
PWC	https://paperswithcode.com/paper/on-the-apparent-conflict-between-individual
Repo
Framework

A Comparison of Prediction Algorithms and Nexting for Short Term Weather Forecasts


Title	A Comparison of Prediction Algorithms and Nexting for Short Term Weather Forecasts
Authors	Michael Koller, Johannes Feldmaier, Klaus Diepold
Abstract	This report first provides a brief overview of a number of supervised learning algorithms for regression tasks. Among those are neural networks, regression trees, and the recently introduced Nexting. Nexting has been presented in the context of reinforcement learning where it was used to predict a large number of signals at different timescales. In the second half of this report, we apply the algorithms to historical weather data in order to evaluate their suitability for forecasting a local weather trend. Our experiments did not identify one clearly preferable method, but rather show that choosing an appropriate algorithm depends on the available side information. For slowly varying signals and a sufficient number of training samples, Nexting achieved good results in the studied cases.
Tasks
Published	2019-03-18
URL	http://arxiv.org/abs/1903.07512v1
PDF	http://arxiv.org/pdf/1903.07512v1.pdf
PWC	https://paperswithcode.com/paper/a-comparison-of-prediction-algorithms-and
Repo
Framework
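
Predicting a signal at a chosen timescale, as Nexting does, can be sketched with temporal-difference learning. The sketch assumes the prediction target is the discounted sum of future signal values and uses plain TD(0) with linear features; the report's actual setup may differ, and the function name, step size, and toy data are hypothetical.

```python
def nexting_td0(features, signal, gamma, alpha=0.1):
    """Learn weights w so that dot(w, features[t]) approximates
    signal[t+1] + gamma * signal[t+2] + gamma**2 * signal[t+3] + ...

    gamma sets the timescale: values closer to 1 look further ahead.
    """
    weights = [0.0] * len(features[0])
    for t in range(len(signal) - 1):
        x, x_next = features[t], features[t + 1]
        value = sum(w * xi for w, xi in zip(weights, x))
        value_next = sum(w * xi for w, xi in zip(weights, x_next))
        # TD error: observed next signal plus discounted next prediction,
        # minus the current prediction.
        td_error = signal[t + 1] + gamma * value_next - value
        weights = [w + alpha * td_error * xi for w, xi in zip(weights, x)]
    return weights


# Toy check: a constant signal of 1.0 with gamma = 0.5 has the
# discounted future sum 1 / (1 - 0.5) = 2.
steps = 600
weights = nexting_td0([[1.0]] * steps, [1.0] * steps, gamma=0.5)
```

Running several learners with different `gamma` values on the same feature stream yields predictions at several timescales at once.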