Paper Group ANR 1282
Human-like Decision Making for Autonomous Driving via Adversarial Inverse Reinforcement Learning. Technical Report on Visual Quality Assessment for Frame Interpolation. Understanding the Mechanism of Deep Learning Framework for Lesion Detection in Pathological Images with Breast Cancer. Simulation Model of Two-Robot Cooperation in Common Operating …
Human-like Decision Making for Autonomous Driving via Adversarial Inverse Reinforcement Learning
Title | Human-like Decision Making for Autonomous Driving via Adversarial Inverse Reinforcement Learning |
Authors | Pin Wang, Dapeng Liu, Jiayu Chen, Hanhan Li, Ching-Yao Chan |
Abstract | Making human-like decisions in complex driving environments is a challenging task for autonomous agents. Imitation Learning, or learning-from-demonstration, methods have shown great potential for achieving this goal. Some state-of-the-art studies apply Generative Adversarial Imitation Learning (GAIL) to learn sequential decision-making and control policies. While GAIL can directly learn a policy, it lacks the ability to recover a reward function, which is considered robust and adaptable to environmental changes. Adversarial Inverse Reinforcement Learning (AIRL) is another learning-from-demonstration method that achieves similar benefits to GAIL but also learns the reward function and the policy simultaneously. The original work on AIRL demonstrated it in single-agent environments such as maze navigation and ant running tasks in OpenAI Gym. In this paper, we augment AIRL by concatenating semantic reward terms into the learning framework to improve and stabilize its performance, and then extend it to a more practical but challenging situation, i.e., a decision-making scenario in a highly interactive driving environment. Four performance evaluation metrics are proposed, and the method is compared with several Imitation Learning based and Reinforcement Learning based methods. Simulation results show that the augmented AIRL outperforms all the other methods, and the trained vehicle agent can perform decision-making behaviors comparable to those of the experts. |
Tasks | Autonomous Driving, Decision Making, Imitation Learning |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08044v2 |
https://arxiv.org/pdf/1911.08044v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-inverse-reinforcement-learning |
Repo | |
Framework | |
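As an illustrative aside for this digest: a minimal sketch of how a discriminator-derived AIRL reward could be combined with hand-crafted semantic terms, as the abstract describes. The function names, the stand-in discriminator logit, and the driving-related terms below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def airl_reward(logit_fn, semantic_terms, weights, state, action):
    """Hedged sketch: AIRL recovers a reward from the discriminator logit f(s, a),
    since D(s, a) = sigmoid(f(s, a)) implies r(s, a) = log D - log(1 - D) = f(s, a).
    Here we add weighted semantic terms on top of that base reward."""
    base = logit_fn(state, action)
    extra = sum(w * term(state, action) for w, term in zip(weights, semantic_terms))
    return base + extra

# Toy usage with made-up driving terms (hypothetical, for illustration only).
logit = lambda s, a: -np.abs(a[0])                 # stand-in discriminator logit
no_collision = lambda s, a: -10.0 * (s[0] < 2.0)   # penalize small headway
keep_lane    = lambda s, a: -np.abs(s[1])          # penalize lateral offset
r = airl_reward(logit, [no_collision, keep_lane], [1.0, 0.5],
                state=np.array([5.0, 0.2]), action=np.array([0.1]))
print(r)
```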
Technical Report on Visual Quality Assessment for Frame Interpolation
Title | Technical Report on Visual Quality Assessment for Frame Interpolation |
Authors | Hui Men, Hanhe Lin, Vlad Hosu, Daniel Maurer, Andres Bruhn, Dietmar Saupe |
Abstract | Current benchmarks for optical flow algorithms evaluate estimation quality by comparing the predicted flow field with the ground truth, and may additionally compare interpolated frames, based on these predictions, with the correct frames from the actual image sequences. For the latter comparison, objective measures such as the mean square error are applied. However, for applications like image interpolation, the user’s expected quality of experience cannot be fully deduced from such simple quality measures. Therefore, we conducted a subjective quality assessment study via crowdsourcing for the interpolated images provided in one of the optical flow benchmarks, the Middlebury benchmark. We used paired comparisons with forced choice and reconstructed absolute quality scale values according to Thurstone’s model using the classical least squares method. The results give rise to a re-ranking of the 141 participating algorithms w.r.t. the visual quality of interpolated frames, which are mostly based on optical flow estimation. Our re-ranking result shows the necessity of visual quality assessment as another evaluation metric for optical flow and frame interpolation benchmarks. |
Tasks | Optical Flow Estimation |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05362v2 |
http://arxiv.org/pdf/1901.05362v2.pdf | |
PWC | https://paperswithcode.com/paper/technical-report-on-visual-quality-assessment |
Repo | |
Framework | |
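As an illustrative aside for this digest: a minimal sketch of reconstructing absolute quality scores from paired-comparison counts under Thurstone's Case V with least squares, which is the reconstruction approach the abstract names. The toy count matrix, the smoothing constant, and the zero-mean anchor are assumptions here, not the study's exact choices.

```python
import numpy as np
from scipy.stats import norm

def thurstone_case_v_scores(counts, eps=0.5):
    """Hedged sketch: recover scale values from a paired-comparison count matrix
    (counts[i, j] = times i was preferred over j) under Thurstone Case V."""
    counts = np.asarray(counts, dtype=float)
    n = counts.shape[0]
    totals = counts + counts.T
    p = (counts + eps) / (totals + 2 * eps)      # win proportions, smoothed away from 0/1
    z = norm.ppf(p)                              # z[i, j] is an estimate of s_i - s_j
    rows, targets = [], []
    for i in range(n):
        for j in range(n):
            if i != j and totals[i, j] > 0:
                row = np.zeros(n); row[i], row[j] = 1.0, -1.0
                rows.append(row); targets.append(z[i, j])
    rows.append(np.ones(n)); targets.append(0.0)  # anchor: mean of scores is zero
    s, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return s

# Toy example with 3 algorithms.
counts = np.array([[0, 18, 25], [12, 0, 20], [5, 10, 0]])
print(thurstone_case_v_scores(counts))
```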
Understanding the Mechanism of Deep Learning Framework for Lesion Detection in Pathological Images with Breast Cancer
Title | Understanding the Mechanism of Deep Learning Framework for Lesion Detection in Pathological Images with Breast Cancer |
Authors | Wei-Wen Hsu, Chung-Hao Chen, Chang Hoa, Yu-Ling Hou, Xiang Gao, Yun Shao, Xueli Zhang, Jingjing Wang, Tao He, Yanghong Tai |
Abstract | Computer-aided detection (CADe) systems are developed to assist pathologists in slide assessment, increasing diagnostic efficiency and reducing missed inspections. Many studies have shown that a CADe system based on deep learning outperforms one using conventional methods that rely on hand-crafted features derived from domain knowledge. However, most developers who adopt deep learning models focus directly on the efficacy of the outcomes, without providing comprehensive explanations of why their proposed frameworks work effectively. In this study, we designed four experiments to verify a sequence of concepts, showing that the deep features learned from pathological patches are interpretable through domain knowledge of pathology and enlightening for clinical diagnosis in the task of lesion detection. The experimental results show that the activation features act as morphological descriptors for specific cells or tissues, in agreement with the clinical rules used in classification. That is, the deep learning framework not only detects the distribution of tumor cells but also recognizes lymphocytes, collagen fibers, and other non-cell structural tissues. Most of the characteristics learned by the deep learning models summarize detection rules that experienced pathologists can recognize, whereas some features may not be intuitive to domain experts yet remain discriminative for machine classification. Those features are worth further study to identify reasonable correlations with pathological knowledge, from which pathologists may draw inspiration for exploring new diagnostic characteristics. |
Tasks | |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01214v1 |
http://arxiv.org/pdf/1903.01214v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-the-mechanism-of-deep-learning |
Repo | |
Framework | |
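As an illustrative aside for this digest: a minimal sketch of probing intermediate CNN activations on a patch, which is the kind of inspection the abstract's experiments rely on when treating activation features as morphological descriptors. A stock ResNet-18 and a random tensor stand in for the paper's actual network and pathology patches.

```python
import torch
import torchvision.models as models

# Hedged sketch, not the paper's pipeline: register a forward hook on a mid-level
# block and rank its channels by mean activation, as candidates to compare against
# pathological domain knowledge (tumor cells, lymphocytes, collagen fibers).
model = models.resnet18(weights=None)
model.eval()

activations = {}
def hook(name):
    def _hook(module, inputs, output):
        activations[name] = output.detach()
    return _hook

model.layer3.register_forward_hook(hook("layer3"))

patch = torch.randn(1, 3, 224, 224)          # stand-in for a pathology patch
with torch.no_grad():
    model(patch)

feats = activations["layer3"]                 # shape (1, 256, 14, 14)
channel_strength = feats.mean(dim=(0, 2, 3))  # mean activation per channel
print(channel_strength.topk(5).indices.tolist())
```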
Simulation Model of Two-Robot Cooperation in Common Operating Environment
Title | Simulation Model of Two-Robot Cooperation in Common Operating Environment |
Authors | V. Ya. Vilisov, B. Yu. Murashkin, A. I. Kulikov |
Abstract | The article considers a simulation modelling problem related to a chess game played between two three-tier manipulators. The objective of constructing the game is to develop a procedure for effective control of autonomous manipulator robots located in a common operating environment. The simulation model is a preliminary stage in building a physical complex that would enable cooperation of several manipulator robots within a common operating environment. The article also addresses issues of training and research. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08485v1 |
https://arxiv.org/pdf/1908.08485v1.pdf | |
PWC | https://paperswithcode.com/paper/simulation-model-of-two-robot-cooperation-in |
Repo | |
Framework | |
Stylized Text Generation Using Wasserstein Autoencoders with a Mixture of Gaussian Prior
Title | Stylized Text Generation Using Wasserstein Autoencoders with a Mixture of Gaussian Prior |
Authors | Amirpasha Ghabussi, Lili Mou, Olga Vechtomova |
Abstract | Wasserstein autoencoders are effective for text generation. However, they do not provide any control over the style and topic of the generated sentences if the dataset has multiple classes and includes different topics. In this work, we present a semi-supervised approach for generating stylized sentences. Our model is trained on a multi-class dataset and learns the latent representation of the sentences using a mixture-of-Gaussians prior without any adversarial losses. This allows us to generate sentences in the style of a specified class, or of multiple classes, by sampling from their corresponding prior distributions. Moreover, we can train our model on relatively small datasets and learn the latent representation of a specified class by adding external data with other styles/classes to our dataset. While a simple WAE or VAE cannot generate diverse sentences in this case, sentences generated with our approach are diverse, fluent, and preserve the style and content of the desired classes. |
Tasks | Text Generation |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03828v1 |
https://arxiv.org/pdf/1911.03828v1.pdf | |
PWC | https://paperswithcode.com/paper/stylized-text-generation-using-wasserstein |
Repo | |
Framework | |
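As an illustrative aside for this digest: a minimal sketch of class-conditional sampling from a mixture-of-Gaussians prior, the generation mechanism the abstract describes. The component means, the interpolation rule for blending classes, and the placeholder decoder are assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 16
# Hypothetical per-class prior components N(mu_c, sigma_c^2 I).
means  = {"positive": rng.normal(0.0, 1.0, latent_dim),
          "negative": rng.normal(3.0, 1.0, latent_dim)}
sigmas = {"positive": 0.5, "negative": 0.5}

def sample_latent(class_weights):
    """Sample a latent code from one class's component, or blend several
    classes by weighting samples from their components (illustrative rule)."""
    z = np.zeros(latent_dim)
    for c, w in class_weights.items():
        z += w * (means[c] + sigmas[c] * rng.normal(size=latent_dim))
    return z

# Placeholder decoder; the real model decodes z into a sentence.
decode = lambda z: f"<decoded sentence from z with norm {np.linalg.norm(z):.2f}>"

print(decode(sample_latent({"positive": 1.0})))                    # single-class style
print(decode(sample_latent({"positive": 0.5, "negative": 0.5})))   # blended style
```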
LoRMIkA: Local Rule-based Model Interpretability with k-optimal Associations
Title | LoRMIkA: Local Rule-based Model Interpretability with k-optimal Associations |
Authors | Dilini Rajapaksha, Christoph Bergmeir, Wray Buntine |
Abstract | As we rely more and more on machine learning models for real-life decision-making, being able to understand and trust the predictions becomes ever more important. Local explainer models have recently been introduced to explain the predictions of complex machine learning models at the instance level. In this paper, we propose Local Rule-based Model Interpretability with k-optimal Associations (LoRMIkA), a novel model-agnostic approach that obtains k-optimal association rules from a neighborhood of the instance to be explained. In contrast to other rule-based approaches in the literature, we argue that the most predictive rules are not necessarily the rules that provide the best explanations. Consequently, the LoRMIkA framework provides a flexible way to obtain predictive and interesting rules. It uses an efficient search algorithm guaranteed to find the k-optimal rules with respect to objectives such as strength, lift, leverage, coverage, and support. It also provides multiple rules that explain the decision, as well as counterfactual rules that indicate potential changes needed to obtain a different output for a given instance. We compare our approach to other state-of-the-art approaches in local model interpretability on three different datasets, and achieve competitive results in terms of local accuracy and interpretability. |
Tasks | Decision Making |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03840v1 |
https://arxiv.org/pdf/1908.03840v1.pdf | |
PWC | https://paperswithcode.com/paper/lormika-local-rule-based-model |
Repo | |
Framework | |
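As an illustrative aside for this digest: a minimal sketch of scoring a candidate rule on a perturbation neighborhood of the instance, using the association-rule objectives the abstract names (support, confidence/strength, lift, leverage). The neighborhood generator, the stand-in black-box classifier, and the candidate rule are illustrative assumptions, not LoRMIkA's search algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
black_box = lambda X: (X[:, 0] + X[:, 1] > 1.0).astype(int)   # stand-in classifier

instance = np.array([0.8, 0.4, 0.1])
neighborhood = instance + rng.normal(0, 0.3, size=(500, 3))    # local perturbations
labels = black_box(neighborhood)

def rule_metrics(antecedent_mask, consequent_mask):
    """Association-rule objectives computed on the neighborhood sample."""
    support = np.mean(antecedent_mask & consequent_mask)
    confidence = support / max(np.mean(antecedent_mask), 1e-12)
    lift = confidence / max(np.mean(consequent_mask), 1e-12)
    leverage = support - np.mean(antecedent_mask) * np.mean(consequent_mask)
    return dict(support=support, confidence=confidence, lift=lift, leverage=leverage)

# Candidate rule: "feature0 > 0.6 AND feature1 > 0.3  =>  class 1"
antecedent = (neighborhood[:, 0] > 0.6) & (neighborhood[:, 1] > 0.3)
consequent = labels == 1
print(rule_metrics(antecedent, consequent))
```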
Motion Reasoning for Goal-Based Imitation Learning
Title | Motion Reasoning for Goal-Based Imitation Learning |
Authors | De-An Huang, Yu-Wei Chao, Chris Paxton, Xinke Deng, Li Fei-Fei, Juan Carlos Niebles, Animesh Garg, Dieter Fox |
Abstract | We address goal-based imitation learning, where the aim is to output the symbolic goal from a third-person video demonstration. This enables the robot to plan for execution and reproduce the same goal in a completely different environment. The key challenge is that the goal of a video demonstration is often ambiguous at the level of semantic actions. The human demonstrators might unintentionally achieve certain subgoals in the demonstrations with their actions. Our main contribution is to propose a motion reasoning framework that combines task and motion planning to disambiguate the true intention of the demonstrator in the video demonstration. This allows us to robustly recognize the goals that cannot be disambiguated by previous action-based approaches. We evaluate our approach by collecting a dataset of 96 video demonstrations in a mockup kitchen environment. We show that our motion reasoning plays an important role in recognizing the actual goal of the demonstrator and improves the success rate by over 20%. We further show that by using the automatically inferred goal from the video demonstration, our robot is able to reproduce the same task in a real kitchen environment. |
Tasks | Imitation Learning, Motion Planning |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05864v1 |
https://arxiv.org/pdf/1911.05864v1.pdf | |
PWC | https://paperswithcode.com/paper/motion-reasoning-for-goal-based-imitation |
Repo | |
Framework | |
Continuous Dropout
Title | Continuous Dropout |
Authors | Xu Shen, Xinmei Tian, Tongliang Liu, Fang Xu, Dacheng Tao |
Abstract | Dropout has been proven to be an effective algorithm for training robust deep networks because of its ability to prevent overfitting by avoiding the co-adaptation of feature detectors. Current explanations of dropout include bagging, naive Bayes, regularization, and sex in evolution. According to the activation patterns of neurons in the human brain, when faced with different situations, the firing rates of neurons are random and continuous, rather than binary as in current dropout. Inspired by this phenomenon, we extend the traditional binary dropout to continuous dropout. On the one hand, continuous dropout is considerably closer to the activation characteristics of neurons in the human brain than traditional binary dropout. On the other hand, we demonstrate that continuous dropout has the property of avoiding the co-adaptation of feature detectors, which suggests that we can extract more independent feature detectors for model averaging at test time. We apply the proposed continuous dropout to a feedforward neural network and comprehensively compare it with binary dropout, adaptive dropout, and DropConnect on MNIST, CIFAR-10, SVHN, NORB, and ILSVRC-12. Thorough experiments demonstrate that our method performs better in preventing the co-adaptation of feature detectors and improves test performance. The code is available at: https://github.com/jasonustc/caffe-multigpu/tree/dropout. |
Tasks | |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12675v1 |
https://arxiv.org/pdf/1911.12675v1.pdf | |
PWC | https://paperswithcode.com/paper/continuous-dropout |
Repo | |
Framework | |
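As an illustrative aside for this digest: a minimal sketch of the core idea of continuous dropout, i.e. replacing the Bernoulli mask of standard dropout with a continuous random mask. The clipped-Gaussian choice and the rescaling rule below are illustrative assumptions rather than the paper's exact training recipe.

```python
import numpy as np

def continuous_dropout(activations, mu=0.5, sigma=0.2, rng=None):
    """Hedged sketch: multiply activations by a continuous random mask
    (a Gaussian clipped to [0, 1] here), then rescale so the expected
    activation stays roughly unchanged (E[mask] ~= mu for these parameters)."""
    rng = rng or np.random.default_rng()
    mask = np.clip(rng.normal(mu, sigma, size=activations.shape), 0.0, 1.0)
    return activations * mask / mu

# Toy usage on one hidden layer's activations.
h = np.random.default_rng(0).normal(size=(4, 8))
print(continuous_dropout(h, mu=0.5, sigma=0.2))
```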
Automatic Emotion Recognition (AER) System based on Two-Level Ensemble of Lightweight Deep CNN Models
Title | Automatic Emotion Recognition (AER) System based on Two-Level Ensemble of Lightweight Deep CNN Models |
Authors | Emad-ul-Haq Qazi, Muhammad Hussain, Hatim AboAlsamh, Ihsan Ullah |
Abstract | Emotions play a crucial role in human interaction, health care, and security investigations and monitoring. Automatic emotion recognition (AER) using electroencephalogram (EEG) signals is an effective method for decoding real emotions, independent of body gestures, but it is a challenging problem. Several automatic emotion recognition systems based on traditional hand-engineered approaches have been proposed, and their performance is very poor. Motivated by the outstanding performance of deep learning (DL) in many recognition tasks, we introduce an AER system (Deep-AER) based on EEG brain signals using DL. A DL model involves a large number of learnable parameters, and its training needs a large dataset of EEG signals, which is difficult to acquire for the AER problem. To overcome this problem, we propose a lightweight pyramidal one-dimensional convolutional neural network (LP-1D-CNN) model, which involves a small number of learnable parameters. Using LP-1D-CNN, we build a two-level ensemble model. In the first level of the ensemble, each channel is scanned incrementally by LP-1D-CNN to generate predictions, which are fused using majority vote. The second level of the ensemble combines the predictions of all channels of an EEG signal using majority vote to detect the emotion state. We validated the effectiveness and robustness of Deep-AER using DEAP, a benchmark dataset for emotion recognition research. The results indicate that the FRONT region plays a dominant role in AER, and over this region Deep-AER achieved accuracies of 98.43% and 97.65% for the two AER problems, i.e., high valence vs. low valence (HV vs. LV) and high arousal vs. low arousal (HA vs. LA), respectively. The comparison reveals that Deep-AER outperforms state-of-the-art systems by a large margin. The Deep-AER system will be helpful for monitoring in health care and security investigations. |
Tasks | EEG, Emotion Recognition |
Published | 2019-04-30 |
URL | http://arxiv.org/abs/1904.13234v1 |
http://arxiv.org/pdf/1904.13234v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-emotion-recognition-aer-system |
Repo | |
Framework | |
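As an illustrative aside for this digest: a minimal sketch of the two-level majority-vote fusion the abstract describes, with window predictions fused per channel and channel decisions fused across the EEG signal. The window classifier, window size, and synthetic EEG array are stand-ins, not the LP-1D-CNN itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def classify_window(window):
    return int(window.mean() > 0)             # placeholder binary classifier

def majority_vote(predictions):
    values, counts = np.unique(predictions, return_counts=True)
    return values[np.argmax(counts)]

def deep_aer_predict(eeg, window_size=128):
    channel_decisions = []
    for channel in eeg:                        # level 1: windows within one channel
        windows = [channel[i:i + window_size]
                   for i in range(0, len(channel) - window_size + 1, window_size)]
        channel_decisions.append(majority_vote([classify_window(w) for w in windows]))
    return majority_vote(channel_decisions)    # level 2: vote across channels

eeg = rng.normal(size=(32, 1024))              # 32 channels, 1024 samples each
print(deep_aer_predict(eeg))
```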
The iWildCam 2018 Challenge Dataset
Title | The iWildCam 2018 Challenge Dataset |
Authors | Sara Beery, Grant van Horn, Oisin Mac Aodha, Pietro Perona |
Abstract | Camera traps are a valuable tool for studying biodiversity, but research using this data is limited by the speed of human annotation. With the vast amounts of data now available, it is imperative that we develop automatic solutions for annotating camera trap data in order to allow this research to scale. A promising approach is based on deep networks trained on human-annotated images. We provide a challenge dataset to explore whether such solutions generalize to novel locations, since systems that can be trained once and then deployed to operate automatically in new locations would be most useful. |
Tasks | |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05986v2 |
http://arxiv.org/pdf/1904.05986v2.pdf | |
PWC | https://paperswithcode.com/paper/the-iwildcam-2018-challenge-dataset |
Repo | |
Framework | |
Constrained Mutual Convex Cone Method for Image Set Based Recognition
Title | Constrained Mutual Convex Cone Method for Image Set Based Recognition |
Authors | Naoya Sogi, Rui Zhu, Jing-Hao Xue, Kazuhiro Fukui |
Abstract | In this paper, we propose a method for image-set classification based on convex cone models. Image-set classification aims to classify a set of images, usually obtained from video frames or multi-view cameras, as a target object. To classify a set accurately and stably, it is essential to represent the structural information of the set accurately. There are various representative image features, such as histogram-based features, HLAC, and Convolutional Neural Network (CNN) features. Note that most of them are non-negative and can thus be effectively represented by a convex cone. This leads us to introduce the convex cone representation to image-set classification. To establish a convex-cone-based framework, we mathematically define multiple angles between two convex cones, and then define the geometric similarity between the cones using these angles. Moreover, to enhance the framework, we introduce a discriminant space that maximizes the between-class variance (gaps) and minimizes the within-class variance of the convex cones projected onto the discriminant space, similar to Fisher discriminant analysis. Finally, the classification is performed based on the similarity between the projected convex cones. The effectiveness of the proposed method is demonstrated experimentally using five databases: the CMU PIE dataset, ETH-80, the CMU Motion of Body dataset, the YouTube Celebrity dataset, and a private database of multi-view hand shapes. |
Tasks | |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1903.06549v1 |
http://arxiv.org/pdf/1903.06549v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-mutual-convex-cone-method-for |
Repo | |
Framework | |
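As an illustrative aside for this digest: a minimal sketch of an angle-based similarity between two image sets, computed here from canonical angles between ordinary subspaces. This is a deliberate simplification for illustration only; the paper constrains the representations to convex cones (exploiting the non-negativity of the features) and projects them onto a learned discriminant space, neither of which this sketch implements.

```python
import numpy as np

def subspace_similarity(X1, X2, dim=5):
    """Similarity from canonical angles between the column spans of two
    feature matrices (columns = feature vectors).  cos(angles) are the
    singular values of Q1^T Q2; similarity = mean squared cosine."""
    Q1, _ = np.linalg.qr(X1)
    Q2, _ = np.linalg.qr(X2)
    cosines = np.linalg.svd(Q1[:, :dim].T @ Q2[:, :dim], compute_uv=False)
    return float(np.mean(cosines ** 2))        # 1 = identical span, 0 = orthogonal

rng = np.random.default_rng(0)
set_a = np.abs(rng.normal(size=(64, 20)))       # non-negative, CNN/HLAC-like features
set_b = np.abs(rng.normal(size=(64, 20)))
print(subspace_similarity(set_a, set_a))        # ~1.0
print(subspace_similarity(set_a, set_b))
```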
Visual Analytics of Student Learning Behaviors on K-12 Mathematics E-learning Platforms
Title | Visual Analytics of Student Learning Behaviors on K-12 Mathematics E-learning Platforms |
Authors | Meng Xia, Huan Wei, Min Xu, Leo Yu Ho Lo, Yong Wang, Rong Zhang, Huamin Qu |
Abstract | With the increasing popularity of online learning, a surge of E-learning platforms has emerged to facilitate education opportunities for K-12 (kindergarten through 12th grade) students, and with this, a wealth of information in their learning logs is being recorded. However, it remains unclear how to make use of these detailed learning behavior data to improve the design of learning materials and gain deeper insight into students’ thinking and learning styles. In this work, we propose a visual analytics system to analyze student learning behaviors on a K-12 mathematics E-learning platform. It supports both correlation analysis between different attributes and a detailed visualization of user mouse-movement logs. Our case studies on a real dataset show that our system can better guide the design of learning resources (e.g., math questions) and facilitate quick interpretation of students’ problem-solving and learning styles. |
Tasks | |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.04749v2 |
https://arxiv.org/pdf/1909.04749v2.pdf | |
PWC | https://paperswithcode.com/paper/visual-analytics-of-student-learning |
Repo | |
Framework | |
One Size Does Not Fit All: Quantifying and Exposing the Accuracy-Latency Trade-off in Machine Learning Cloud Service APIs via Tolerance Tiers
Title | One Size Does Not Fit All: Quantifying and Exposing the Accuracy-Latency Trade-off in Machine Learning Cloud Service APIs via Tolerance Tiers |
Authors | Matthew Halpern, Behzad Boroujerdian, Todd Mummert, Evelyn Duesterwald, Vijay Janapa Reddi |
Abstract | Today’s cloud service architectures follow a “one size fits all” deployment strategy in which the same service version instantiation is provided to all end users. However, the consumer base is broad, and different applications have different accuracy and responsiveness requirements, which, as we demonstrate, renders the “one size fits all” approach inefficient in practice. We use a production-grade speech recognition engine, which serves several thousand users, and an open-source computer-vision-based system to illustrate our point. To overcome the limitations of the “one size fits all” approach, we recommend Tolerance Tiers, where each MLaaS tier exposes an accuracy/responsiveness characteristic and consumers can programmatically select a tier. We evaluate our proposal on the CPU-based automatic speech recognition (ASR) engine and on cutting-edge neural networks for image classification deployed on both CPUs and GPUs. The results show that our proposed approach provides an MLaaS cloud service architecture that can be tuned by the end API user or consumer to outperform the conventional “one size fits all” approach. |
Tasks | Image Classification, Speech Recognition |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.11307v1 |
https://arxiv.org/pdf/1906.11307v1.pdf | |
PWC | https://paperswithcode.com/paper/one-size-does-not-fit-all-quantifying-and |
Repo | |
Framework | |
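As an illustrative aside for this digest: a minimal sketch of what programmatic tier selection could look like on the client side, picking the fastest tier that satisfies an accuracy floor and a latency budget. The tier names and numbers are invented for illustration; the paper's tiers expose measured accuracy/responsiveness characteristics of actual deployed service versions.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    top1_accuracy: float    # fraction of correct predictions
    p99_latency_ms: float   # tail latency of the tier

# Hypothetical tiers; not measurements from the paper.
TIERS = [
    Tier("fast",     0.70,  40.0),
    Tier("balanced", 0.75, 120.0),
    Tier("accurate", 0.78, 400.0),
]

def select_tier(min_accuracy, latency_budget_ms):
    """Return the fastest tier that meets both tolerance requirements."""
    feasible = [t for t in TIERS
                if t.top1_accuracy >= min_accuracy and t.p99_latency_ms <= latency_budget_ms]
    if not feasible:
        raise ValueError("no tier satisfies the requested tolerance")
    return min(feasible, key=lambda t: t.p99_latency_ms)

print(select_tier(min_accuracy=0.72, latency_budget_ms=200).name)   # -> "balanced"
```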
Application of Genetic Algorithms to the Multiple Team Formation Problem
Title | Application of Genetic Algorithms to the Multiple Team Formation Problem |
Authors | Jose G. M. Esgario, Iago E. da Silva, Renato A. Krohling |
Abstract | Allocating people to multiple projects is an important issue when considering the efficiency of groups from the point of view of social interaction. In this paper, building on previous works, the Multiple Team Formation Problem (MTFP) based on sociometric techniques is formulated as an optimization problem that takes into account the social interaction among team members. To solve the resulting optimization problem, we propose a Genetic Algorithm, owing to the NP-hard nature of the problem. Social cohesion is an important factor that directly impacts the productivity of the work environment, so maintaining an appropriate level of cohesion keeps a group together, which positively impacts the results of a project. The aim of the proposal is to ensure the best possible effectiveness from the point of view of social interaction. In this way, the presented algorithm serves as a decision-making tool for managers to build teams of people across multiple projects. In order to analyze the performance of the proposed method, computational experiments on benchmarks were performed and compared with an exhaustive method. The results are promising and show that the algorithm generally obtains near-optimal results within a short computational time. |
Tasks | Decision Making |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03523v1 |
http://arxiv.org/pdf/1903.03523v1.pdf | |
PWC | https://paperswithcode.com/paper/application-of-genetic-algorithms-to-the |
Repo | |
Framework | |
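As an illustrative aside for this digest: a minimal genetic-algorithm sketch for a team-formation problem of this kind, where a chromosome assigns each person to a project and fitness sums the pairwise sociometric scores of people on the same project. The sociometric matrix, operators, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people, n_projects = 12, 3
sociometric = rng.uniform(-1, 1, size=(n_people, n_people))
sociometric = (sociometric + sociometric.T) / 2             # symmetric affinity matrix

def fitness(chrom):
    """Sum of pairwise sociometric scores within each project."""
    total = 0.0
    for p in range(n_projects):
        members = np.flatnonzero(chrom == p)
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                total += sociometric[members[i], members[j]]
    return total

def evolve(pop_size=60, generations=200, mutation_rate=0.1):
    pop = rng.integers(0, n_projects, size=(pop_size, n_people))
    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        # Binary tournament selection.
        idx = rng.integers(0, pop_size, size=(pop_size, 2))
        parents = pop[np.where(scores[idx[:, 0]] > scores[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # One-point crossover between consecutive parent pairs, then mutation.
        cut = rng.integers(1, n_people, size=pop_size)
        children = parents.copy()
        for k in range(0, pop_size - 1, 2):
            children[k, cut[k]:], children[k + 1, cut[k]:] = \
                parents[k + 1, cut[k]:].copy(), parents[k, cut[k]:].copy()
        mutate = rng.random(children.shape) < mutation_rate
        children[mutate] = rng.integers(0, n_projects, size=mutate.sum())
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)

assignment, score = evolve()
print(assignment, round(score, 3))
```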
Mixture-of-Experts Variational Autoencoder for clustering and generating from similarity-based representations
Title | Mixture-of-Experts Variational Autoencoder for clustering and generating from similarity-based representations |
Authors | Andreas Kopf, Vincent Fortuin, Vignesh Ram Somnath, Manfred Claassen |
Abstract | Clustering high-dimensional data, such as images or biological measurements, is a long-standing problem and has been studied extensively. Recently, Deep Clustering has gained popularity due to its flexibility in fitting the specific peculiarities of complex data. Here we introduce the Mixture-of-Experts Similarity Variational Autoencoder (MoE-Sim-VAE), a novel generative clustering model. The model can learn multi-modal distributions of high-dimensional data and use these to generate realistic data with high efficacy and efficiency. MoE-Sim-VAE is based on a Variational Autoencoder (VAE), where the decoder consists of a Mixture-of-Experts (MoE) architecture. This specific architecture allows various modes of the data to be learned automatically by the experts. Additionally, we encourage the lower-dimensional latent representation of our model to follow a Gaussian mixture distribution and to accurately represent the similarities between the data points. We assess the performance of our model on the MNIST benchmark dataset and on the challenging real-world task of defining cell subpopulations from mass cytometry (CyTOF) measurements on hundreds of different datasets. MoE-Sim-VAE exhibits superior clustering performance on all these tasks in comparison to the baselines and competitor methods, and we show that the MoE architecture in the decoder reduces the computational cost of sampling specific data modes with high fidelity. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.07763v2 |
https://arxiv.org/pdf/1910.07763v2.pdf | |
PWC | https://paperswithcode.com/paper/mixture-of-experts-variational-autoencoder |
Repo | |
Framework | |
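As an illustrative aside for this digest: a minimal sketch of the decoder idea only, where a gating network over the latent code chooses among expert decoders so each expert can specialize in one data mode. Shapes, the gating rule, and the linear experts are illustrative; the real MoE-Sim-VAE trains encoder, gate, and experts jointly with the similarity and Gaussian-mixture objectives described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, data_dim, n_experts = 8, 32, 4

W_gate = rng.normal(size=(n_experts, latent_dim))
experts = [(rng.normal(size=(data_dim, latent_dim)) * 0.1, rng.normal(size=data_dim) * 0.1)
           for _ in range(n_experts)]        # (weight, bias) per hypothetical expert

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_decode(z, hard=True):
    """Gate over experts; either pick the single most responsible expert
    (sampling one data mode) or return the soft mixture of expert outputs."""
    gate = softmax(W_gate @ z)
    outputs = np.stack([W @ z + b for W, b in experts])   # (n_experts, data_dim)
    if hard:
        return outputs[np.argmax(gate)]
    return gate @ outputs

z = rng.normal(size=latent_dim)              # e.g. drawn from one GMM component of the prior
print(moe_decode(z).shape)                   # (32,)
```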