Paper Group ANR 425
Class-Imbalanced Semi-Supervised Learning
Title | Class-Imbalanced Semi-Supervised Learning |
Authors | Minsung Hyun, Jisoo Jeong, Nojun Kwak |
Abstract | Semi-Supervised Learning (SSL) has achieved great success in overcoming the difficulties of labeling and making full use of unlabeled data. However, SSL relies on the limiting assumption that the numbers of samples in different classes are balanced, and many SSL algorithms perform worse on datasets with imbalanced class distributions. In this paper, we introduce the task of class-imbalanced semi-supervised learning (CISSL), which refers to semi-supervised learning with class-imbalanced data. In doing so, we consider class imbalance in both labeled and unlabeled sets. First, we analyze existing SSL methods in imbalanced environments and examine how the class imbalance affects SSL methods. Then we propose Suppressed Consistency Loss (SCL), a regularization method robust to class imbalance. Our method shows better performance than the conventional methods in the CISSL environment. In particular, the more severe the class imbalance and the smaller the size of the labeled data, the better our method performs. |
Tasks | |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06815v1 |
PDF | https://arxiv.org/pdf/2002.06815v1.pdf |
PWC | https://paperswithcode.com/paper/class-imbalanced-semi-supervised-learning |
Repo | |
Framework | |
Structured Consistency Loss for semi-supervised semantic segmentation
Title | Structured Consistency Loss for semi-supervised semantic segmentation |
Authors | Jongmok Kim, Jooyoung Jang, Hyunwoo Park |
Abstract | The consistency loss has played a key role in recent studies on semi-supervised learning. Yet existing studies with the consistency loss are limited to classification tasks, and existing studies on semi-supervised semantic segmentation rely on pixel-wise classification, which does not reflect the structured nature of the prediction. We propose a structured consistency loss to address this limitation. Structured consistency loss promotes consistency in inter-pixel similarity between teacher and student networks. Specifically, combining it with CutMix makes semi-supervised semantic segmentation with structured consistency loss efficient by dramatically reducing the computational burden. The superiority of the proposed method is verified on Cityscapes: the benchmark results on the validation and test data are 81.9 mIoU and 83.84 mIoU, respectively. This ranks first on the pixel-level semantic labeling task of the Cityscapes benchmark suite. To the best of our knowledge, we are the first to present the superiority of state-of-the-art semi-supervised learning in semantic segmentation. |
Tasks | Semantic Segmentation, Semi-Supervised Semantic Segmentation |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04647v1 |
PDF | https://arxiv.org/pdf/2001.04647v1.pdf |
PWC | https://paperswithcode.com/paper/structured-consistency-loss-for-semi |
Repo | |
Framework | |
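The idea of enforcing consistency in inter-pixel similarity, as described in the abstract above, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the use of cosine similarity, the mean-squared difference, and the function names `cosine_similarity_matrix` and `structured_consistency` are all assumptions made for this example. Each "pixel" is represented by a small feature vector.

```python
import math

def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between per-pixel feature vectors."""
    # A zero vector gets norm 1.0 so that the division below stays defined.
    norms = [math.sqrt(sum(v * v for v in f)) or 1.0 for f in features]
    n = len(features)
    return [
        [
            sum(a * b for a, b in zip(features[i], features[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]

def structured_consistency(student_feats, teacher_feats):
    """Mean squared difference between student and teacher similarity matrices.

    Assumption: cosine similarity and MSE are illustrative choices only.
    """
    s = cosine_similarity_matrix(student_feats)
    t = cosine_similarity_matrix(teacher_feats)
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Hypothetical features for three pixels.
student = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
teacher_scaled = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]   # same structure, different scale
teacher_collapsed = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # all pixels look alike

loss_same_structure = structured_consistency(student, teacher_scaled)
loss_different_structure = structured_consistency(student, teacher_collapsed)
```

The loss compares relations between pixels, not individual pixel outputs, so a teacher whose features differ only in scale incurs (near) zero loss, while a teacher with a different similarity structure does not. Note that the similarity matrix grows quadratically with the number of pixels, which is one reason computational cost matters for a loss of this kind.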
Online Self-Supervised Learning for Object Picking: Detecting Optimum Grasping Position using a Metric Learning Approach
Title | Online Self-Supervised Learning for Object Picking: Detecting Optimum Grasping Position using a Metric Learning Approach |
Authors | Kanata Suzuki, Yasuto Yokota, Yuzi Kanazawa, Tomoyoshi Takebayashi |
Abstract | Self-supervised learning methods are attractive candidates for automatic object picking. However, the trial samples lack the complete ground truth because the observable parts of the agent are limited. That is, the information contained in the trial samples is often insufficient to learn the specific grasping position of each object. Consequently, the training falls into a local solution, and the grasp positions learned by the robot are independent of the state of the object. In this study, the optimal grasping position of an individual object is determined from the grasping score, defined as the distance in the feature space obtained using metric learning. The closeness of the solution to the pre-designed optimal grasping position was evaluated in trials. The proposed method incorporates two types of feedback control: one feedback enlarges the grasping score when the grasping position approaches the optimum; the other reduces the negative feedback of the potential grasping positions among the grasping candidates. The proposed online self-supervised learning method employs two deep neural networks: an SSD that detects the grasping position of an object, and Siamese networks (SNs) that evaluate the trial sample using the similarity of two input data in the feature space. Our method embeds the relation of each grasping position as feature vectors by training on the trial samples and a few pre-samples indicating the optimum grasping position. By incorporating the grasping score based on the feature space of SNs into the SSD training process, the method preferentially trains the optimum grasping position. In the experiment, the proposed method achieved a higher success rate than the baseline method using simple teaching signals, and the grasping scores in the feature space of the SNs accurately represented the grasping positions of the objects. |
Tasks | Metric Learning |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03717v1 |
PDF | https://arxiv.org/pdf/2003.03717v1.pdf |
PWC | https://paperswithcode.com/paper/online-self-supervised-learning-for-object |
Repo | |
Framework | |
A Fourier Domain Feature Approach for Human Activity Recognition & Fall Detection
Title | A Fourier Domain Feature Approach for Human Activity Recognition & Fall Detection |
Authors | Asma Khatun, Sk. Golam Sarowar Hossain |
Abstract | Elderly people face a variety of problems while performing Activities of Daily Living (ADL) because of age, sensory changes, loneliness, and cognitive changes. These put ADL at risk and lead to falls. Real-life fall data are difficult to collect and are not available, so simulated falls have become ubiquitous for evaluating proposed methodologies. The literature review shows that most researchers use raw and energy features (time-domain features) of the signal data, as those are the most discriminating. However, in real-life situations the fall signal may be noisier than the current simulated data, so results obtained with raw features may change dramatically in a real-life scenario. This research uses frequency-domain Fourier coefficient features to differentiate various human activities of daily life. The feature vectors constructed from the Fast Fourier Transform are robust to noise and rotation invariant. Two supervised classifiers, kNN and SVM, are used to evaluate the method, and two standard publicly available datasets are used for benchmark analysis. In this research, the kNN classifier yields more discriminating results than the SVM classifier. Various standard measures, including Standard Accuracy (SA), Macro Average Accuracy (MAA), Sensitivity (SE), and Specificity (SP), are reported. In all cases, the proposed method outperforms energy features, while results are competitive with raw features. The proposed method also performs better than a recent deep learning approach in which data augmentation was not used. |
Tasks | Activity Recognition, Data Augmentation, Human Activity Recognition |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05209v1 |
PDF | https://arxiv.org/pdf/2003.05209v1.pdf |
PWC | https://paperswithcode.com/paper/a-fourier-domain-feature-approach-for-human |
Repo | |
Framework | |
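The pipeline outlined in the abstract above (Fourier coefficient features followed by a nearest-neighbour classifier) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' code: it uses a naive discrete Fourier transform instead of an FFT to stay dependency-free (both yield the same coefficients), a 1-nearest-neighbour rule in place of a tuned kNN, and two synthetic sinusoids labeled `"slow"` and `"fast"` as hypothetical stand-ins for sensor windows of two activities.

```python
import cmath
import math

def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients, normalised by length."""
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        mags.append(abs(coeff) / n)
    return mags

def nearest_neighbor(query, train_feats, train_labels):
    """Return the label of the training feature vector closest to the query."""
    dists = [
        math.sqrt(sum((q - f) ** 2 for q, f in zip(query, feats)))
        for feats in train_feats
    ]
    return train_labels[dists.index(min(dists))]

# Synthetic training windows (hypothetical data, 16 samples each).
n = 16
slow = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
fast = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
train_feats = [dft_magnitudes(slow, 8), dft_magnitudes(fast, 8)]
train_labels = ["slow", "fast"]

# A circularly shifted copy of the fast window is still classified correctly,
# because DFT magnitudes do not depend on where the window starts.
shifted_fast = fast[3:] + fast[:3]
predicted = nearest_neighbor(dft_magnitudes(shifted_fast, 8), train_feats, train_labels)
```

The shift property shown here is a standard property of DFT magnitudes; the abstract does not specify whether this is the invariance the authors refer to.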
An end-to-end approach for the verification problem: learning the right distance
Title | An end-to-end approach for the verification problem: learning the right distance |
Authors | Joao Monteiro, Isabela Albuquerque, Jahangir Alam, R Devon Hjelm, Tiago Falk |
Abstract | In this contribution, we augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder. Several interpretations are thus drawn for the learned distance-like model’s output. We first show it approximates a likelihood ratio which can be used for hypothesis tests, and that it further induces a large divergence across the joint distributions of pairs of examples from the same and from different classes. Evaluation is performed under the verification setting consisting of determining whether sets of examples belong to the same class, even if such classes are novel and were never presented to the model during training. Empirical evaluation shows such method defines an end-to-end approach for the verification problem, able to attain better performance than simple scorers such as those based on cosine similarity and further outperforming widely used downstream classifiers. We further observe training is much simplified under the proposed approach compared to metric learning with actual distances, requiring no complex scheme to harvest pairs of examples. |
Tasks | Metric Learning |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09469v1 |
PDF | https://arxiv.org/pdf/2002.09469v1.pdf |
PWC | https://paperswithcode.com/paper/an-end-to-end-approach-for-the-verification |
Repo | |
Framework | |
A comprehensive study on the prediction reliability of graph neural networks for virtual screening
Title | A comprehensive study on the prediction reliability of graph neural networks for virtual screening |
Authors | Soojung Yang, Kyung Hoon Lee, Seongok Ryu |
Abstract | Prediction models based on deep neural networks are increasingly gaining attention for fast and accurate virtual screening systems. For decision making in virtual screening, researchers find it useful to interpret the output of a classification system as a probability, since such an interpretation allows them to filter out more desirable compounds. However, a probabilistic interpretation cannot be correct for models that suffer from over-parameterization or inappropriate regularization, leading to unreliable prediction and decision making. In this regard, we are concerned with the reliability of neural prediction models on molecular properties, especially when models are trained with sparse data points and imbalanced distributions. This work aims to propose guidelines for training reliable models; we thus provide methodological details and ablation studies on the following training principles. We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results. Moreover, we evaluate the prediction reliability of models in a virtual screening scenario. Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate, especially in data-imbalanced situations. All experiments were performed under a single unified model implementation to alleviate external randomness in model training and to enable precise comparison of results. |
Tasks | Decision Making |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07611v1 |
PDF | https://arxiv.org/pdf/2003.07611v1.pdf |
PWC | https://paperswithcode.com/paper/a-comprehensive-study-on-the-prediction |
Repo | |
Framework | |
Coordinated Reasoning for Cross-Lingual Knowledge Graph Alignment
Title | Coordinated Reasoning for Cross-Lingual Knowledge Graph Alignment |
Authors | Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu |
Abstract | Existing entity alignment methods mainly vary on the choices of encoding the knowledge graph, but they typically use the same decoding method, which independently chooses the local optimal match for each source entity. This decoding method may not only cause the “many-to-one” problem but also neglect the coordinated nature of this task, that is, each alignment decision may highly correlate to the other decisions. In this paper, we introduce two coordinated reasoning methods, i.e., the Easy-to-Hard decoding strategy and joint entity alignment algorithm. Specifically, the Easy-to-Hard strategy first retrieves the model-confident alignments from the predicted results and then incorporates them as additional knowledge to resolve the remaining model-uncertain alignments. To achieve this, we further propose an enhanced alignment model that is built on the current state-of-the-art baseline. In addition, to address the many-to-one problem, we propose to jointly predict entity alignments so that the one-to-one constraint can be naturally incorporated into the alignment prediction. Experimental results show that our model achieves the state-of-the-art performance and our reasoning methods can also significantly improve existing baselines. |
Tasks | Entity Alignment |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08728v1 |
PDF | https://arxiv.org/pdf/2001.08728v1.pdf |
PWC | https://paperswithcode.com/paper/coordinated-reasoning-for-cross-lingual |
Repo | |
Framework | |
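The "many-to-one" problem mentioned in the abstract above, and the effect of a one-to-one constraint, can be illustrated with a small score matrix. The sketch below is not the authors' joint alignment algorithm: it assumes a hypothetical square matrix `scores` in which `scores[i][j]` is the model's confidence that source entity `i` matches target entity `j`, and it finds the best one-to-one assignment by brute force over permutations, which is only practical for very small inputs.

```python
from itertools import permutations

def independent_decode(scores):
    """Pick the locally best target for each source entity, ignoring other rows."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]

def joint_decode(scores):
    """Pick the one-to-one assignment with the highest total score (brute force)."""
    n = len(scores)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(scores[i][perm[i]] for i in range(n)),
    )
    return list(best)

# Hypothetical alignment scores for three source and three target entities.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.30, 0.20, 0.60],
]

local_matches = independent_decode(scores)  # sources 0 and 1 both claim target 0
joint_matches = joint_decode(scores)        # each target is used exactly once
```

With independent decoding, source entities 0 and 1 both select target 0. Under the one-to-one constraint, source 0 falls back to its second-best target so that source 1 can keep target 0. For realistic problem sizes, a polynomial-time assignment solver such as the Hungarian algorithm would replace the brute-force search.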
Adversarial Transferability in Wearable Sensor Systems
Title | Adversarial Transferability in Wearable Sensor Systems |
Authors | Ramesh Kumar Sah, Hassan Ghasemzadeh |
Abstract | Machine learning has increasingly become the most used approach for inference and decision making in wearable sensor systems. However, recent studies have found that machine learning systems are easily fooled by the addition of adversarial perturbation to their inputs. What is more interesting is that the adversarial examples generated for one machine learning system can also degrade the performance of another. This property of adversarial examples is called transferability. In this work, we take the first strides in studying adversarial transferability in wearable sensor systems, from the following perspectives: 1) Transferability between machine learning models, 2) Transferability across subjects, 3) Transferability across sensor locations, and 4) Transferability across datasets. With Human Activity Recognition (HAR) as an example sensor system, we found strong untargeted transferability in all cases of transferability. Specifically, gradient-based attacks were able to achieve higher misclassification rates compared to non-gradient attacks. The misclassification rate of untargeted adversarial examples ranged from 20% to 98%. For targeted transferability between machine learning models, the success rate of adversarial examples was 100% for iterative attack methods. However, the success rate for other types of targeted transferability ranged from 20% to 0%. Our findings strongly suggest that adversarial transferability has serious consequences not only in sensor systems but also across the broad spectrum of ubiquitous computing. |
Tasks | Activity Recognition, Decision Making, Human Activity Recognition |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07982v1 |
PDF | https://arxiv.org/pdf/2003.07982v1.pdf |
PWC | https://paperswithcode.com/paper/adversarial-transferability-in-wearable |
Repo | |
Framework | |
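The transferability property described in the abstract above can be illustrated with a toy example. The sketch below is not the authors' experimental setup: it assumes two hypothetical logistic-regression models (`source` and `target`, with made-up weights) and uses the fast gradient sign method (FGSM), a well-known gradient-based attack, as a stand-in. An untargeted adversarial example is crafted against the source model only and then evaluated on the target model.

```python
import math

def predict(model, x):
    """Probability of class 1 under a logistic-regression model (weights, bias)."""
    weights, bias = model
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)

def fgsm(model, x, label, epsilon):
    """Untargeted FGSM: move each input by epsilon along the sign of the loss gradient."""
    weights, _ = model
    p = predict(model, x)
    # For logistic regression with cross-entropy loss, dLoss/dx = (p - label) * w.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

# Hypothetical models and input, chosen only for illustration.
source = ([2.0, -1.0], 0.0)
target = ([1.5, -0.5], 0.1)
x, label = [0.3, 0.1], 1

# The attack only has access to the source model.
x_adv = fgsm(source, x, label, epsilon=0.4)

source_fooled = predict(source, x) > 0.5 and predict(source, x_adv) < 0.5
target_fooled = predict(target, x) > 0.5 and predict(target, x_adv) < 0.5
```

In this example both models classify the clean input as class 1 and the perturbed input as class 0, even though the perturbation was computed from the source model alone. Transfer works here because the two toy models have similar decision boundaries; it is not guaranteed in general.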
Self-Supervised Discovering of Causal Features: Towards Interpretable Reinforcement Learning
Title | Self-Supervised Discovering of Causal Features: Towards Interpretable Reinforcement Learning |
Authors | Wenjie Shi, Shiji Song, Zhuoyuan Wang, Gao Huang |
Abstract | Deep reinforcement learning (RL) has recently led to many breakthroughs on a range of complex control tasks. However, the agent’s decision-making process is generally not transparent. The lack of interpretability hinders the applicability of RL in safety-critical scenarios. In this paper, we propose a self-supervised interpretable framework, which employs a self-supervised interpretable network (SSINet) to discover and locate fine-grained causal features that constitute most of the evidence for the agent’s decisions. We verify and evaluate our method on several Atari 2600 games as well as Duckietown. The results show that our method renders causal explanations and empirical evidence about how the agent makes decisions and why the agent performs well or badly. Moreover, our method is a flexible explanatory module that can be applied to most vision-based RL agents. Overall, our method provides valuable insight into interpretable vision-based RL. |
Tasks | Atari Games, Decision Making |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07069v2 |
PDF | https://arxiv.org/pdf/2003.07069v2.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-discovering-of-causal |
Repo | |
Framework | |
Introducing the diagrammatic mode
Title | Introducing the diagrammatic mode |
Authors | Tuomo Hiippala, John A. Bateman |
Abstract | In this article, we propose a multimodal perspective to diagrammatic representations by sketching a description of what may be tentatively termed the diagrammatic mode. We consider diagrammatic representations in the light of contemporary multimodality theory and explicate what enables diagrammatic representations to integrate natural language, various forms of graphics, diagrammatic elements such as arrows, lines and other expressive resources into coherent organisations. We illustrate the proposed approach using two recent diagram corpora and show how a multimodal approach supports the empirical analysis of diagrammatic representations, especially in identifying diagrammatic constituents and describing their interrelations. |
Tasks | |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11224v1 |
PDF | https://arxiv.org/pdf/2001.11224v1.pdf |
PWC | https://paperswithcode.com/paper/introducing-the-diagrammatic-mode |
Repo | |
Framework | |
Deep learning accelerated topology optimization of linear and nonlinear structures
Title | Deep learning accelerated topology optimization of linear and nonlinear structures |
Authors | Diab W. Abueidda, Seid Koric, Nahil A. Sobh |
Abstract | The field of optimal design of linear elastic structures has seen many exciting successes that resulted in new architected materials and structural designs. With the availability of cloud computing, including high-performance computing, machine learning, and simulation, searching for optimal nonlinear structures is now within reach. In this study, we develop convolutional neural network models to predict optimized designs for a given set of boundary conditions, loads, and optimization constraints. We have considered the case of materials with a linear elastic response with and without stress constraint. Also, we have considered the case of materials with a hyperelastic response, where material and geometric nonlinearities are involved. For the nonlinear elastic case, the neo-Hookean model is utilized. For this purpose, we generate datasets composed of the optimized designs paired with the corresponding boundary conditions, loads, and constraints, using a topology optimization framework to train and validate the neural network models. The developed models are capable of accurately predicting the optimized designs without requiring an iterative scheme and with negligible computational time. The suggested pipeline can be generalized to other nonlinear mechanics scenarios and design domains. |
Tasks | |
Published | 2020-01-31 |
URL | https://arxiv.org/abs/2002.01896v3 |
PDF | https://arxiv.org/pdf/2002.01896v3.pdf |
PWC | https://paperswithcode.com/paper/machine-learning-accelerated-topology |
Repo | |
Framework | |
A Loss-Function for Causal Machine-Learning
Title | A Loss-Function for Causal Machine-Learning |
Authors | I-Sheng Yang |
Abstract | Causal machine-learning is about predicting the net-effect (true-lift) of treatments. Given the data of a treatment group and a control group, it is similar to a standard supervised-learning problem. Unfortunately, there is no similarly well-defined loss function due to the lack of point-wise true values in the data. Many advances in modern machine-learning are not directly applicable due to the absence of such loss function. We propose a novel method to define a loss function in this context, which is equal to mean-square-error (MSE) in a standard regression problem. Our loss function is universally applicable, thus providing a general standard to evaluate the quality of any model/strategy that predicts the true-lift. We demonstrate that despite its novel definition, one can still perform gradient descent directly on this loss function to find the best fit. This leads to a new way to train any parameter-based model, such as deep neural networks, to solve causal machine-learning problems without going through the meta-learner strategy. |
Tasks | |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00629v1 |
PDF | https://arxiv.org/pdf/2001.00629v1.pdf |
PWC | https://paperswithcode.com/paper/a-loss-function-for-causal-machine-learning |
Repo | |
Framework | |
Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models
Title | Exploring Gaussian mixture model framework for speaker adaptation of deep neural network acoustic models |
Authors | Natalia Tomashenko, Yuri Khokhlov, Yannick Esteve |
Abstract | In this paper we investigate the GMM-derived (GMMD) features for adaptation of deep neural network (DNN) acoustic models. The adaptation of the DNN trained on GMMD features is done through the maximum a posteriori (MAP) adaptation of the auxiliary GMM model used for GMMD feature extraction. We explore fusion of the adapted GMMD features with conventional features, such as bottleneck and MFCC features, in two different neural network architectures: DNN and time-delay neural network (TDNN). We analyze and compare different types of adaptation techniques, such as i-vectors and feature-space adaptation techniques based on maximum likelihood linear regression (fMLLR), with the proposed adaptation approach, and explore their complementarity using various types of fusion (feature level, posterior level, lattice level, and others) in order to discover the best possible way of combining them. Experimental results on the TED-LIUM corpus show that the proposed adaptation technique can be effectively integrated into DNN and TDNN setups at different levels and provides additional gain in recognition performance: up to 6% relative word error rate reduction (WERR) over the strong fMLLR speaker-adapted DNN baseline, and up to 18% relative WERR in comparison with a speaker-independent (SI) DNN baseline model trained on conventional features. For TDNN models the proposed approach achieves up to 26% relative WERR in comparison with an SI baseline, and up to 13% in comparison with the model adapted by using i-vectors. The analysis of the adapted GMMD features from various points of view demonstrates their effectiveness at different levels. |
Tasks | |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06894v1 |
PDF | https://arxiv.org/pdf/2003.06894v1.pdf |
PWC | https://paperswithcode.com/paper/exploring-gaussian-mixture-model-framework |
Repo | |
Framework | |
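The results in the abstract above are stated as relative word error rate reduction (WERR). As a reminder of how that figure relates to absolute word error rates, here is a small sketch. The function name `relative_werr` and the numbers in the example are hypothetical and are not taken from the paper.

```python
def relative_werr(baseline_wer, adapted_wer):
    """Relative WER reduction: the share of the baseline error that was removed."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer

# Hypothetical example: a baseline at 20.0% WER improved to 16.4% WER.
example = relative_werr(20.0, 16.4)  # about 0.18, i.e. an 18% relative reduction
```

A relative reduction therefore depends on the baseline: the same 3.6-point absolute gain would be a smaller relative reduction against a weaker baseline with a higher WER.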
Learning Intermediate Features of Object Affordances with a Convolutional Neural Network
Title | Learning Intermediate Features of Object Affordances with a Convolutional Neural Network |
Authors | Aria Yuan Wang, Michael J. Tarr |
Abstract | Our ability to interact with the world around us relies on being able to infer what actions objects afford – often referred to as affordances. The neural mechanisms of object-action associations are realized in the visuomotor pathway where information about both visual properties and actions is integrated into common representations. However, explicating these mechanisms is particularly challenging in the case of affordances because there is hardly any one-to-one mapping between visual features and inferred actions. To better understand the nature of affordances, we trained a deep convolutional neural network (CNN) to recognize affordances from images and to learn the underlying features or the dimensionality of affordances. Such features form an underlying compositional structure for the general representation of affordances which can then be tested against human neural data. We view this representational analysis as the first step towards a more formal account of how humans perceive and interact with the environment. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08975v1 |
PDF | https://arxiv.org/pdf/2002.08975v1.pdf |
PWC | https://paperswithcode.com/paper/learning-intermediate-features-of-object |
Repo | |
Framework | |
Gradient-based Data Augmentation for Semi-Supervised Learning
Title | Gradient-based Data Augmentation for Semi-Supervised Learning |
Authors | Hiroshi Kaizuka |
Abstract | In semi-supervised learning (SSL), a technique called consistency regularization (CR) achieves high performance. It has been proved that the diversity of data used in CR is extremely important to obtain a model with high discrimination performance by CR. We propose a new data augmentation (Gradient-based Data Augmentation (GDA)) that is deterministically calculated from the image pixel value gradient of the posterior probability distribution that is the model output. We aim to secure effective data diversity for CR by utilizing three types of GDA. On the other hand, it has been demonstrated that the mixup method for labeled data and unlabeled data is also effective in SSL. We propose an SSL method named MixGDA by combining various mixup methods and GDA. The discrimination performance achieved by MixGDA is evaluated against the 13-layer CNN that is used as standard in SSL research. As a result, for CIFAR-10 (4000 labels), MixGDA achieves the same level of performance as the best performance ever achieved. For SVHN (250 labels, 500 labels and 1000 labels) and CIFAR-100 (10000 labels), MixGDA achieves state-of-the-art performance. |
Tasks | Data Augmentation |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12824v1 |
PDF | https://arxiv.org/pdf/2003.12824v1.pdf |
PWC | https://paperswithcode.com/paper/gradient-based-data-augmentation-for-semi |
Repo | |
Framework | |
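The abstract above combines the proposed augmentation with mixup. The sketch below shows only the generic mixup operation, not GDA or the MixGDA method, whose details the abstract does not give. The function names, the toy two-element "images", and the choice of a Beta(alpha, alpha) distribution for the mixing coefficient (a common convention) are assumptions made for this example.

```python
import random

def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two inputs and of their (soft) label vectors."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y

def sample_lambda(alpha, rng=random):
    """Draw the mixing coefficient from Beta(alpha, alpha)."""
    return rng.betavariate(alpha, alpha)

# Hypothetical two-element inputs with one-hot labels, mixed with a fixed coefficient.
x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

For unlabeled samples, the label vectors passed to `mixup` would typically be the model's own predicted class probabilities, so the same function covers both labeled and unlabeled data.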