Paper Group ANR 1611
C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning
Title | C2S2: Cost-aware Channel Sparse Selection for Progressive Network Pruning |
Authors | Chih-Yao Chiu, Hwann-Tzong Chen, Tyng-Luh Liu |
Abstract | This paper describes a channel-selection approach for simplifying deep neural networks. Specifically, we propose a new type of generic network layer, called a pruning layer, to seamlessly augment a given pre-trained model for compression. Each pruning layer, comprising $1 \times 1$ depth-wise kernels, is represented in a dual format: one is real-valued and the other is binary. The former enables a two-phase optimization process of network pruning to operate with an end-to-end differentiable network, and the latter yields the mask information for channel selection. Our method progressively performs the pruning task layer-wise, and achieves channel selection according to a sparsity criterion that favors pruning more channels. We also develop a cost-aware mechanism to prevent the compression from sacrificing the expected network performance. Our results for compressing several benchmark deep networks on image classification and semantic segmentation are comparable to those of state-of-the-art methods. |
Tasks | Image Classification, Network Pruning, Semantic Segmentation |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03508v1, http://arxiv.org/pdf/1904.03508v1.pdf |
PWC | https://paperswithcode.com/paper/c2s2-cost-aware-channel-sparse-selection-for |
Repo | |
Framework | |
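The dual real-valued/binary format of a pruning layer described in the abstract above can be illustrated with a minimal sketch. It assumes that the binary mask is obtained by thresholding per-channel real-valued scales at a hypothetical cutoff `threshold`; the function names are illustrative, and the paper's two-phase optimization and cost-aware mechanism are not reproduced here.

```python
def binarize(scales, threshold=0.5):
    """Turn real-valued per-channel scales into a 0/1 mask.

    The thresholding rule is an assumption made for illustration only.
    """
    return [1 if s > threshold else 0 for s in scales]


def apply_pruning_layer(feature_map, mask):
    """Apply a 1x1 depth-wise kernel: each channel is scaled by its own mask entry.

    feature_map is a list of channels, each channel a flat list of activations.
    """
    return [[v * m for v in channel] for channel, m in zip(feature_map, mask)]


def kept_channels(mask):
    """Indices of the channels that survive pruning."""
    return [i for i, m in enumerate(mask) if m == 1]


scales = [0.9, 0.1, 0.7]             # hypothetical learned real-valued scales
mask = binarize(scales)              # binary format used for channel selection
features = [[1, 2], [3, 4], [5, 6]]  # three channels, two activations each
pruned = apply_pruning_layer(features, mask)
```

Channels whose mask entry is 0 produce all-zero outputs, so they and the filters that feed them can be removed from the network without changing its result.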
Understanding Stability of Medical Concept Embeddings: Analysis and Prediction
Title | Understanding Stability of Medical Concept Embeddings: Analysis and Prediction |
Authors | Grace E. Lee, Aixin Sun |
Abstract | In the biomedical area, medical concepts linked to external knowledge bases (e.g., UMLS) are frequently used for accurate and effective representations. Many studies develop embeddings for medical concepts on biomedical corpora and evaluate the overall quality of concept embeddings. However, the quality of individual concept embeddings has not been carefully investigated. We analyze the quality of medical concept embeddings trained with word2vec in terms of embedding stability. From the analysis, we observe that some concept embeddings are unaffected by different hyperparameter values in word2vec and retain poor stability. Moreover, when the stability of concept embeddings is analyzed in terms of frequency, many low-frequency concepts achieve stability as high as that of high-frequency concepts. The findings suggest that there are other factors influencing the stability of medical concept embeddings. In this paper, we propose a new factor, the distribution of context words, to predict the stability of medical concept embeddings. By estimating the distribution of context words using normalized entropy, we show that a skewed distribution has a moderate correlation with the stability of concept embeddings. The result demonstrates that a medical concept for which a few words account for a large portion of its context words is able to obtain high stability, even though its frequency is low. The clear correlation between the proposed factor and the stability of medical concept embeddings makes it possible to predict the medical concepts with low-quality embeddings even prior to training. |
Tasks | |
Published | 2019-04-21 |
URL | http://arxiv.org/abs/1904.09552v1, http://arxiv.org/pdf/1904.09552v1.pdf |
PWC | https://paperswithcode.com/paper/understanding-stability-of-medical-concept |
Repo | |
Framework | |
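The abstract above estimates the skew of a concept's context-word distribution with normalized entropy. The following is a minimal sketch of that quantity, assuming the common definition in which Shannon entropy is divided by the logarithm of the number of distinct context words; the paper may normalize differently, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(context_words):
    """Shannon entropy of the context-word distribution, scaled to [0, 1].

    Values near 0 mean a few words dominate the contexts (skewed);
    values near 1 mean the context words are spread almost uniformly.
    """
    counts = Counter(context_words)
    if len(counts) <= 1:
        # Zero or one distinct word: no uncertainty, and log(1) = 0 would divide by zero.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


skewed = ["dose"] * 8 + ["tablet", "oral"]  # one word dominates
uniform = ["dose", "tablet", "oral"]        # evenly spread
```

Under this definition the skewed list scores lower than the uniform one, which is the direction the abstract associates with higher embedding stability.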
Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation
Title | Multilingual Multi-Domain Adaptation Approaches for Neural Machine Translation |
Authors | Chenhui Chu, Raj Dabre |
Abstract | In this paper, we propose two novel methods for domain adaptation for the attention-only neural machine translation (NMT) model, i.e., the Transformer. Our methods focus on training a single translation model for multiple domains by either learning domain specialized hidden state representations or predictor biases for each domain. We combine our methods with a previously proposed black-box method called mixed fine tuning, which is known to be highly effective for domain adaptation. In addition, we incorporate multilingualism into the domain adaptation framework. Experiments show that multilingual multi-domain adaptation can significantly improve both resource-poor in-domain and resource-rich out-of-domain translations, and the combination of our methods with mixed fine tuning achieves the best performance. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07978v2, https://arxiv.org/pdf/1906.07978v2.pdf |
PWC | https://paperswithcode.com/paper/multilingual-multi-domain-adaptation |
Repo | |
Framework | |
Crowdsourcing and Validating Event-focused Emotion Corpora for German and English
Title | Crowdsourcing and Validating Event-focused Emotion Corpora for German and English |
Authors | Enrica Troiano, Sebastian Padó, Roman Klinger |
Abstract | Sentiment analysis has a range of corpora available across multiple languages. For emotion analysis, the situation is more limited, which hinders potential research on cross-lingual modeling and the development of predictive models for other languages. In this paper, we fill this gap for German by constructing deISEAR, a corpus designed in analogy to the well-established English ISEAR emotion dataset. Motivated by Scherer’s appraisal theory, we implement a crowdsourcing experiment which consists of two steps. In step 1, participants create descriptions of emotional events for a given emotion. In step 2, five annotators assess the emotion expressed by the texts. We show that transferring an emotion classification model from the original English ISEAR to the German crowdsourced deISEAR via machine translation does not, on average, cause a performance drop. |
Tasks | Emotion Classification, Emotion Recognition, Machine Translation, Sentiment Analysis |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13618v1, https://arxiv.org/pdf/1905.13618v1.pdf |
PWC | https://paperswithcode.com/paper/crowdsourcing-and-validating-event-focused |
Repo | |
Framework | |
Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading
Title | Deep Learning Fundus Image Analysis for Diabetic Retinopathy and Macular Edema Grading |
Authors | Jaakko Sahlsten, Joel Jaskari, Jyri Kivinen, Lauri Turunen, Esa Jaanio, Kustaa Hietala, Kimmo Kaski |
Abstract | Diabetes is a globally prevalent disease that can cause visible microvascular complications such as diabetic retinopathy and macular edema in the human eye retina, the images of which are today used for manual disease screening. This labor-intensive task could greatly benefit from automatic detection using deep learning techniques. Here we present a deep learning system that identifies referable diabetic retinopathy comparably to or better than previous studies, although we use only a small fraction of the images (<1/4) in training, aided by higher image resolutions. We also provide novel results for five different screening and clinical grading systems for diabetic retinopathy and macular edema classification, including results for accurately classifying images according to clinical five-grade diabetic retinopathy and four-grade diabetic macular edema scales. These results suggest that a deep learning system could increase the cost-effectiveness of screening while attaining higher than recommended performance, and that the system could be applied in clinical examinations requiring finer grading. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.08764v1, http://arxiv.org/pdf/1904.08764v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-fundus-image-analysis-for |
Repo | |
Framework | |
Human-to-AI Coach: Improving Human Inputs to AI Systems
Title | Human-to-AI Coach: Improving Human Inputs to AI Systems |
Authors | Johannes Schneider |
Abstract | Humans increasingly interact with artificial intelligence (AI) systems. AI systems are optimized for objectives such as minimum computation or minimum error rate in recognizing and interpreting inputs from humans. In contrast, inputs created by humans are often treated as a given. We investigate how human inputs can be altered to reduce misinterpretation by the AI system and to improve the efficiency of input generation for the human, while the altered inputs remain as similar as possible to the original inputs. These objectives result in trade-offs that are analyzed for a deep learning system classifying handwritten digits. To create examples that serve as demonstrations for humans to improve, we develop a model based on a conditional convolutional autoencoder (CCAE). Our quantitative and qualitative evaluation shows that on many occasions the generated proposals lead to lower error rates, require less effort to create and differ only modestly from the original samples. |
Tasks | |
Published | 2019-12-08 |
URL | https://arxiv.org/abs/1912.03652v2, https://arxiv.org/pdf/1912.03652v2.pdf |
PWC | https://paperswithcode.com/paper/ai-how-can-humans-communicate-better-with-you |
Repo | |
Framework | |
RUSLAN: Russian Spoken Language Corpus for Speech Synthesis
Title | RUSLAN: Russian Spoken Language Corpus for Speech Synthesis |
Authors | Lenar Gabdrakhmanov, Rustem Garaev, Evgenii Razinkov |
Abstract | We present RUSLAN, a new open Russian spoken language corpus for the text-to-speech task. RUSLAN contains 22200 audio samples with text annotations, more than 31 hours of high-quality speech from one person, making it the largest annotated Russian corpus in terms of speech duration for a single speaker. We trained an end-to-end neural network for the text-to-speech task on our corpus and evaluated the quality of the synthesized speech using a Mean Opinion Score (MOS) test. Synthesized speech achieves a score of 4.05 for naturalness and 3.78 for intelligibility on a 5-point MOS scale. |
Tasks | Speech Synthesis |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11645v1, https://arxiv.org/pdf/1906.11645v1.pdf |
PWC | https://paperswithcode.com/paper/ruslan-russian-spoken-language-corpus-for |
Repo | |
Framework | |
Image-Based Geo-Localization Using Satellite Imagery
Title | Image-Based Geo-Localization Using Satellite Imagery |
Authors | Sixing Hu, Gim Hee Lee |
Abstract | The problem of localization on a geo-referenced satellite map given a query ground view image is useful yet remains challenging due to the drastic change in viewpoint. To this end, in this paper we work on the extension of our earlier work on the Cross-View Matching Network (CVM-Net) for the ground-to-aerial image matching task since the traditional image descriptors fail due to the drastic viewpoint change. In particular, we show more extensive experimental results and analyses of the network architecture on our CVM-Net. Furthermore, we propose a Markov localization framework that enforces the temporal consistency between image frames to enhance the geo-localization results in the case where a video stream of ground view images is available. Experimental results show that our proposed Markov localization framework can continuously localize the vehicle within a small error on our Singapore dataset. |
Tasks | |
Published | 2019-03-01 |
URL | https://arxiv.org/abs/1903.00159v3, https://arxiv.org/pdf/1903.00159v3.pdf |
PWC | https://paperswithcode.com/paper/image-based-geo-localization-using-satellite |
Repo | |
Framework | |
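The Markov localization idea mentioned in the abstract above, enforcing temporal consistency across video frames, can be sketched as a generic discrete Bayes filter over map cells. This is a textbook formulation rather than the authors' implementation; the three-cell map, the transition matrix, and the matching likelihoods are hypothetical stand-ins for a motion model and for ground-to-satellite matching scores.

```python
def predict(belief, transition):
    """Motion step: push the belief through the transition model.

    transition[i][j] is the probability of moving from cell i to cell j.
    """
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, likelihood):
    """Measurement step: reweight each cell by how well it matches the query image."""
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("measurement is inconsistent with every cell")
    return [u / total for u in unnormalized]


# Hypothetical example: the vehicle always moves one cell forward on a ring of three cells.
transition = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
belief = [1, 0, 0]                        # known start in cell 0
belief = predict(belief, transition)      # after one motion step
belief = update(belief, [0.2, 0.7, 0.1])  # matching scores for the new frame
```

Alternating the two steps for every frame keeps the estimate consistent over time, so a single poor image match cannot move the estimate to a distant cell that the motion model rules out.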
Action Recognition from Single Timestamp Supervision in Untrimmed Videos
Title | Action Recognition from Single Timestamp Supervision in Untrimmed Videos |
Authors | Davide Moltisanti, Sanja Fidler, Dima Damen |
Abstract | Recognising actions in videos relies on labelled supervision during training, typically the start and end times of each action instance. This supervision is not only subjective, but also expensive to acquire. Weak video-level supervision has been successfully exploited for recognition in untrimmed videos; however, it is challenged when the number of different actions in training videos increases. We propose a method that is supervised by single timestamps located around each action instance, in untrimmed videos. We replace expensive action bounds with sampling distributions initialised from these timestamps. We then use the classifier’s response to iteratively update the sampling distributions. We demonstrate that these distributions converge to the location and extent of discriminative action segments. We evaluate our method on three datasets for fine-grained recognition, with increasing number of different actions per video, and show that single timestamps offer a reasonable compromise between recognition performance and labelling effort, performing comparably to full temporal supervision. Our update method improves top-1 test accuracy by up to 5.4% across the evaluated datasets. |
Tasks | Temporal Action Localization |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04689v1, http://arxiv.org/pdf/1904.04689v1.pdf |
PWC | https://paperswithcode.com/paper/action-recognition-from-single-timestamp |
Repo | |
Framework | |
Recurrent Neural Networks for P300-based BCI
Title | Recurrent Neural Networks for P300-based BCI |
Authors | Ori Tal, Doron Friedman |
Abstract | P300-based spellers are one of the main methods for EEG-based brain-computer interfaces, and the detection of the P300 target event with high accuracy is an important prerequisite. The rapid serial visual presentation (RSVP) protocol is of high interest because it can be used by patients who have lost control over their eyes. In this study we explore the suitability of recurrent neural networks (RNNs) as a machine learning method for identifying the P300 signal in RSVP data. We systematically compare RNNs with alternative methods such as linear discriminant analysis (LDA) and convolutional neural networks (CNNs). Our results indicate that LDA performs as well as the neural network models or better on single-subject data, but a network combining CNN and RNN has advantages when transferring learning among subjects, and is significantly more resilient to temporal noise than other methods. |
Tasks | EEG |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.10798v1, http://arxiv.org/pdf/1901.10798v1.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-p300-based-bci |
Repo | |
Framework | |
A few filters are enough: Convolutional Neural Network for P300 Detection
Title | A few filters are enough: Convolutional Neural Network for P300 Detection |
Authors | Alicia Montserrat Alvarado-Gonzalez, Gibran Fuentes-Pineda, Jorge Cervantes-Ojeda |
Abstract | In this paper, we aim to contribute to the discussion about the usefulness of deep CNNs with several filters for solving both within-subject and cross-subject classification in single-trial P300 detection. To that end, we present SepConv1D, a simple Convolutional Neural Network architecture consisting of a depthwise separable 1D convolutional block followed by a Sigmoid classification block. Additionally, we present a one-layer Fully-Connected Neural Network with two neurons in the hidden layer to show that complex architectures are unnecessary for the problem under analysis. We compare their performances against CNN-based state-of-the-art architectures. The experiments did not show a statistically significant difference between their AUC. Moreover, SepConv1D has by far the lowest number of parameters of all. This is important because simpler, cheaper, faster and, thus, more portable devices can be built. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.06970v1, https://arxiv.org/pdf/1909.06970v1.pdf |
PWC | https://paperswithcode.com/paper/a-few-filters-are-enough-convolutional-neural |
Repo | |
Framework | |
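The abstract above stresses the small parameter count of a depthwise separable 1D convolution. The following sketch compares weight counts for a standard and a separable 1D convolution using the usual formulas, assuming a depth multiplier of 1 and no bias by default. The layer sizes are hypothetical and are not taken from the paper.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=False):
    """Weights of a standard 1D convolution: one kernel per (input, output) channel pair."""
    return kernel_size * in_channels * out_channels + (out_channels if bias else 0)


def separable_conv1d_params(in_channels, out_channels, kernel_size, bias=False):
    """Weights of a depthwise separable 1D convolution (depth multiplier 1 assumed).

    Depthwise step: one kernel per input channel.
    Pointwise step: a 1x1 convolution that mixes channels.
    """
    depthwise = kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise + (out_channels if bias else 0)


# Hypothetical sizes: 6 EEG channels, 4 filters, kernel length 16.
standard = conv1d_params(6, 4, 16)             # 16 * 6 * 4 = 384
separable = separable_conv1d_params(6, 4, 16)  # 16 * 6 + 6 * 4 = 120
```

The saving grows with the number of filters and the kernel length, because the separable form replaces the product of kernel size and output channels with their sum.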
The Use of Machine Learning and Big Five Personality Taxonomy to Predict Construction Workers’ Safety Behaviour
Title | The Use of Machine Learning and Big Five Personality Taxonomy to Predict Construction Workers’ Safety Behaviour |
Authors | Yifan Gao, Vicente A. Gonzalez, Tak Wing Yiu, Guillermo Cabrera-Guerrero |
Abstract | Research has found that many occupational accidents are foreseeable, being the result of people’s unsafe behaviour from a retrospective point of view. The prediction of workers’ safety behaviour will provide prior insight into each worker’s behavioural tendency, will be useful in the design of management practices before accidents occur, and will contribute to the reduction of injury rates. In recent years, researchers have found that people do have stable predispositions to engage in certain safety behavioural patterns, which vary among individuals as a function of personality features. In this study, an innovative forecasting model, which employs machine learning algorithms, is developed to estimate construction workers’ behavioural tendency based on the Big Five personality taxonomy. The data-driven nature of the machine learning technique enabled a reliable estimate of the personality-safety behaviour relationship, which allowed this study to provide the novel insight that nonlinearity may exist in the relationship between construction workers’ personality traits and safety behaviour. The developed model is found to have satisfactory accuracy in explaining and predicting workers’ safety behaviour. This finding provides empirical evidence to support the usefulness of personality traits as effective predictors of people’s safety behaviour at work. In addition, this study could have practical implications. The machine learning model developed, which is proven to have good prediction accuracy, can help identify vulnerable workers who are more prone to unsafe behaviours and is thereby potentially useful for decision making and safety management on construction sites. |
Tasks | Decision Making |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05944v1, https://arxiv.org/pdf/1912.05944v1.pdf |
PWC | https://paperswithcode.com/paper/the-use-of-machine-learning-and-big-five |
Repo | |
Framework | |
Modeling Gestalt Visual Reasoning on the Raven’s Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques
Title | Modeling Gestalt Visual Reasoning on the Raven’s Progressive Matrices Intelligence Test Using Generative Image Inpainting Techniques |
Authors | Tianyu Hua, Maithilee Kunda |
Abstract | Psychologists recognize Raven’s Progressive Matrices as a very effective test of general human intelligence. While many computational models have been developed by the AI community to investigate different forms of top-down, deliberative reasoning on the test, there has been less research on bottom-up perceptual processes, like Gestalt image completion, that are also critical in human test performance. In this work, we investigate how Gestalt visual reasoning on the Raven’s test can be modeled using generative image inpainting techniques from computer vision. We demonstrate that a self-supervised inpainting model trained only on photorealistic images of objects achieves a score of 27/36 on the Colored Progressive Matrices, which corresponds to average performance for nine-year-old children. We also show that models trained on other datasets (faces, places, and textures) do not perform as well. Our results illustrate how learning visual regularities in real-world images can translate into successful reasoning about artificial test stimuli. On the flip side, our results also highlight the limitations of such transfer, which may explain why intelligence tests like the Raven’s are often sensitive to people’s individual sociocultural backgrounds. |
Tasks | Image Inpainting, Visual Reasoning |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07736v2, https://arxiv.org/pdf/1911.07736v2.pdf |
PWC | https://paperswithcode.com/paper/modeling-gestalt-visual-reasoning-on-the |
Repo | |
Framework | |
On the Apparent Conflict Between Individual and Group Fairness
Title | On the Apparent Conflict Between Individual and Group Fairness |
Authors | Reuben Binns |
Abstract | A distinction has been drawn in fair machine learning research between 'group' and 'individual' fairness measures. Many technical research papers assume that both are important, but conflicting, and propose ways to minimise the trade-offs between these measures. This paper argues that this apparent conflict is based on a misconception. It draws on theoretical discussions from within fair machine learning research, and from political and legal philosophy, to argue that individual and group fairness are not fundamentally in conflict. First, it outlines accounts of egalitarian fairness which encompass plausible motivations for both group and individual fairness, thereby suggesting that there need be no conflict in principle. Second, it considers the concept of individual justice, from legal philosophy and jurisprudence, which seems similar to but actually contradicts the notion of individual fairness as proposed in the fair machine learning literature. The conclusion is that the apparent conflict between individual and group fairness is more an artifact of the blunt application of fairness measures than a matter of conflicting principles. In practice, this conflict may be resolved by a nuanced consideration of the sources of 'unfairness' in a particular deployment context, and the carefully justified application of measures to mitigate it. |
Tasks | |
Published | 2019-12-14 |
URL | https://arxiv.org/abs/1912.06883v1, https://arxiv.org/pdf/1912.06883v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-apparent-conflict-between-individual |
Repo | |
Framework | |
A Comparison of Prediction Algorithms and Nexting for Short Term Weather Forecasts
Title | A Comparison of Prediction Algorithms and Nexting for Short Term Weather Forecasts |
Authors | Michael Koller, Johannes Feldmaier, Klaus Diepold |
Abstract | This report first provides a brief overview of a number of supervised learning algorithms for regression tasks. Among those are neural networks, regression trees, and the recently introduced Nexting. Nexting has been presented in the context of reinforcement learning, where it was used to predict a large number of signals at different timescales. In the second half of this report, we apply the algorithms to historical weather data in order to evaluate their suitability for forecasting a local weather trend. Our experiments did not identify one clearly preferable method, but rather show that choosing an appropriate algorithm depends on the available side information. For slowly varying signals and a sufficient number of training samples, Nexting achieved good results in the studied cases. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07512v1, http://arxiv.org/pdf/1903.07512v1.pdf |
PWC | https://paperswithcode.com/paper/a-comparison-of-prediction-algorithms-and |
Repo | |
Framework | |
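Nexting, as referenced in the abstract above, predicts signals at different timescales. A minimal sketch of the quantity such a predictor is usually trained towards follows, assuming the standard discounted-sum formulation in which a discount factor `gamma` sets a timescale of roughly 1 / (1 - gamma) steps. The report's exact setup may differ, and the function names and the sample signal are hypothetical.

```python
def nexting_targets(signal, gamma):
    """Discounted sum of future signal values for every time step.

    target[t] = signal[t+1] + gamma * signal[t+2] + gamma**2 * signal[t+3] + ...
    The sum is truncated at the end of the recorded signal.
    """
    targets = [0.0] * len(signal)
    # Work backwards so each target reuses the one after it.
    for t in range(len(signal) - 2, -1, -1):
        targets[t] = signal[t + 1] + gamma * targets[t + 1]
    return targets


def timescale(gamma):
    """Approximate number of future steps that a given discount factor looks ahead."""
    return 1.0 / (1.0 - gamma)


temperature = [1, 1, 1, 1]  # hypothetical, already normalized readings
short_horizon = nexting_targets(temperature, 0.5)
```

Computing targets for several values of `gamma` on the same signal yields one prediction problem per timescale, which is how a single sensor can be forecast over both short and long horizons.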