Paper Group ANR 317
Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories. Automated Cardiothoracic Ratio Calculation and Cardiomegaly Detection using Deep Learning Approach. Adversarially Guided Self-Play for Adopting Social Conventions. Effects of Discretization of Decision and Objective Spaces on the Performance of Evolutionary Multiob …
Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories
Title | Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories |
Authors | Tiange Luo, Kaichun Mo, Zhiao Huang, Jiarui Xu, Siyu Hu, Liwei Wang, Hao Su |
Abstract | We address the problem of discovering 3D parts for objects in unseen categories. Learning a geometric prior over parts and transferring that prior to unseen categories poses a fundamental challenge for data-driven shape segmentation approaches. Formulating the task as a contextual bandit problem, we propose a learning-based agglomerative clustering framework that learns a grouping policy to progressively merge small part proposals into bigger ones in a bottom-up fashion. At the core of our approach is restricting the local context used to extract part-level features, which encourages generalization to unseen categories. On the large-scale fine-grained 3D part dataset PartNet, we demonstrate that our method can transfer knowledge of parts learned from 3 training categories to 21 unseen testing categories without seeing any annotated samples. Quantitative comparisons against four shape segmentation baselines show that our approach achieves state-of-the-art performance. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06478v1 |
https://arxiv.org/pdf/2002.06478v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-group-a-bottom-up-framework-for-1 |
Repo | |
Framework | |
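The grouping policy described in the abstract can be pictured as a greedy agglomeration loop over part proposals. Below is a minimal Python sketch, assuming a stand-in similarity score in place of the paper's learned policy network; `score_merge`, the pooled-feature update, and the stopping rule are illustrative assumptions, not the authors' method:

```python
import numpy as np

def score_merge(feat_a, feat_b):
    # Stand-in for the learned grouping policy: the paper trains a network
    # that scores a candidate merge from restricted local context.
    return -np.linalg.norm(feat_a - feat_b)

def agglomerate(features, n_target=2):
    """Greedy bottom-up grouping of part proposals down to n_target groups."""
    groups = [[i] for i in range(len(features))]
    feats = list(features)
    while len(groups) > n_target:
        pairs = [(i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))]
        scores = [score_merge(feats[i], feats[j]) for i, j in pairs]
        i, j = pairs[int(np.argmax(scores))]       # policy picks the best merge
        groups[i] += groups.pop(j)                 # merge proposal j into i
        feats[i] = (feats[i] + feats.pop(j)) / 2   # crude pooled feature
    return groups

parts = [np.random.randn(8) for _ in range(6)]    # hypothetical part features
print(agglomerate(parts))
```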
Automated Cardiothoracic Ratio Calculation and Cardiomegaly Detection using Deep Learning Approach
Title | Automated Cardiothoracic Ratio Calculation and Cardiomegaly Detection using Deep Learning Approach |
Authors | Isarun Chamveha, Treethep Promwiset, Trongtum Tongdee, Pairash Saiviroonporn, Warasinee Chaisangmongkon |
Abstract | We propose an algorithm for calculating the cardiothoracic ratio (CTR) from chest X-ray films. Our approach applies a deep learning model based on U-Net with a VGG16 encoder to extract lung and heart masks from chest X-ray images and calculates the CTR from the extents of the obtained masks. Human radiologists evaluated our CTR measurements, and $76.5\%$ were accepted for inclusion in medical reports without any need for adjustment. This result translates to substantial time and labor saved for radiologists using our automated tools. |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07468v1 |
https://arxiv.org/pdf/2002.07468v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-cardiothoracic-ratio-calculation |
Repo | |
Framework | |
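The CTR computation itself is simple once masks are available: the ratio of the maximal horizontal extent of the heart mask to that of the lung (thoracic) mask. A minimal sketch, assuming binary masks already produced by a segmentation model such as the U-Net described above (the toy masks are hypothetical):

```python
import numpy as np

def horizontal_extent(mask):
    """Width of the widest horizontal span covered by a binary mask."""
    cols = np.where(mask.any(axis=0))[0]
    if cols.size == 0:
        raise ValueError("empty mask")
    return cols.max() - cols.min() + 1

def cardiothoracic_ratio(heart_mask, lung_mask):
    # CTR = maximal transverse cardiac width / maximal thoracic width,
    # both read off the predicted mask extents as in the abstract.
    return horizontal_extent(heart_mask) / horizontal_extent(lung_mask)

lungs = np.zeros((10, 10), bool); lungs[2:9, 1:9] = True   # toy thorax mask
heart = np.zeros((10, 10), bool); heart[4:9, 3:7] = True   # toy heart mask
print(cardiothoracic_ratio(heart, lungs))  # 4 / 8 = 0.5
```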
Adversarially Guided Self-Play for Adopting Social Conventions
Title | Adversarially Guided Self-Play for Adopting Social Conventions |
Authors | Mycal Tucker, Yilun Zhou, Julie Shah |
Abstract | Robotic agents must adopt existing social conventions in order to be effective teammates. These social conventions, such as driving on the right or left side of the road, are arbitrary choices among optimal policies, but all agents on a successful team must use the same convention. Prior work has identified a method of combining self-play with paired input-output data gathered from existing agents in order to learn their social convention without interacting with them. We build upon this work by introducing a technique called Adversarial Self-Play (ASP) that uses adversarial training to shape the space of possible learned policies and substantially improves learning efficiency. ASP only requires the addition of unpaired data: a dataset of outputs produced by the social convention without associated inputs. Theoretical analysis reveals how ASP shapes the policy space and the circumstances (when behaviors are clustered or exhibit some other structure) under which it offers the greatest benefits. Empirical results across three domains confirm ASP’s advantages: it produces models that more closely match the desired social convention when given as few as two paired datapoints. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05994v1 |
https://arxiv.org/pdf/2001.05994v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarially-guided-self-play-for-adopting |
Repo | |
Framework | |
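One way to read ASP is as a GAN-style objective: a discriminator trained on the unpaired convention outputs constrains the policy's output distribution, while the few paired examples pin down the input-output mapping. The sketch below follows that reading; the networks, shapes, and loss weighting are hypothetical, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
disc = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))
opt_p = torch.optim.Adam(policy.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

# Hypothetical data: two paired (input, output) examples plus a batch of
# unpaired outputs sampled from the target convention.
paired_x, paired_y = torch.randn(2, 4), torch.randn(2, 2)
unpaired_y = torch.randn(64, 2)

for step in range(200):
    # Discriminator step: real convention outputs vs. the policy's outputs.
    fake_y = policy(torch.randn(64, 4)).detach()
    d_loss = bce(disc(unpaired_y), torch.ones(64, 1)) + \
             bce(disc(fake_y), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Policy step: fit the paired data, and fool the discriminator so the
    # policy's outputs stay inside the convention's output distribution.
    fake_y = policy(torch.randn(64, 4))
    p_loss = ((policy(paired_x) - paired_y) ** 2).mean() + \
             bce(disc(fake_y), torch.ones(64, 1))
    opt_p.zero_grad(); p_loss.backward(); opt_p.step()
```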
Effects of Discretization of Decision and Objective Spaces on the Performance of Evolutionary Multiobjective Optimization Algorithms
Title | Effects of Discretization of Decision and Objective Spaces on the Performance of Evolutionary Multiobjective Optimization Algorithms |
Authors | Weiyu Chen, Hisao Ishibuchi, Ke Shang |
Abstract | Recently, the discretization of decision and objective spaces has been discussed in the literature. Some studies show that decision space discretization improves the performance of evolutionary multi-objective optimization (EMO) algorithms on continuous multi-objective test problems. Others show that objective space discretization improves performance on combinatorial multi-objective problems. However, the effect of discretizing both spaces simultaneously has not been examined in the literature. In this paper, we examine the effects of decision space discretization, objective space discretization, and simultaneous discretization on the performance of NSGA-II through computational experiments on the DTLZ and WFG problems. Using various settings for the number of decision variables and the number of objectives, our experiments cover four types of problems: standard problems, large-scale problems, many-objective problems, and large-scale many-objective problems. We show that decision space discretization has a positive effect for large-scale problems and objective space discretization has a positive effect for many-objective problems. We also show that discretizing both spaces is useful for large-scale many-objective problems. |
Tasks | Multiobjective Optimization |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09917v1 |
https://arxiv.org/pdf/2003.09917v1.pdf | |
PWC | https://paperswithcode.com/paper/effects-of-discretization-of-decision-and |
Repo | |
Framework | |
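Both kinds of discretization studied here amount to snapping values to a uniform grid, applied either to decision variables before evaluation or to objective values before selection. A minimal sketch of that operation (the grid resolution `n_levels` is an illustrative knob, not a value from the paper):

```python
import numpy as np

def discretize(values, lower, upper, n_levels):
    """Snap each value to the nearest of n_levels evenly spaced grid levels."""
    step = (upper - lower) / (n_levels - 1)
    return lower + np.round((np.asarray(values) - lower) / step) * step

# Decision space discretization: round variables before evaluating objectives.
x = np.random.rand(5)
x_disc = discretize(x, 0.0, 1.0, n_levels=11)

# Objective space discretization: round objective values before ranking.
f = np.array([0.237, 0.461])
f_disc = discretize(f, 0.0, 1.0, n_levels=21)
print(x_disc, f_disc)
```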
Dynamic clustering of time series data
Title | Dynamic clustering of time series data |
Authors | Victhor S. Sartório, Thaís C. O. Fonseca |
Abstract | We propose a new method for clustering multivariate time-series data based on Dynamic Linear Models. Whereas usual time-series clustering methods obtain static membership parameters, our proposal allows each time series to change its cluster membership dynamically over time. In this context, a mixture model is assumed for the time series, and a flexible Dirichlet evolution for the mixture weights allows for smooth membership changes over time. Posterior estimates and predictions can be obtained through Gibbs sampling, but a more efficient method for obtaining point estimates is presented, based on Stochastic Expectation-Maximization and Gradient Descent. Finally, two applications illustrate the usefulness of the proposed model for both univariate and multivariate time series: World Bank indicators for the renewable energy consumption of EU nations and the famous Gapminder dataset containing life expectancy and GDP per capita for various countries. |
Tasks | Clustering Multivariate Time Series, Time Series, Time Series Clustering |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2002.01890v1 |
https://arxiv.org/pdf/2002.01890v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-clustering-of-time-series-data |
Repo | |
Framework | |
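As a rough intuition for dynamic membership, the sketch below computes per-time-step soft cluster responsibilities with smoothly evolving weights. It is only a crude stand-in for the paper's Dirichlet evolution and Gibbs/SEM inference; the fixed Gaussian clusters and the smoothing constant are hypothetical:

```python
import numpy as np

def dynamic_memberships(series, means, sds, smooth=0.9):
    """Per-time-step soft cluster memberships with smoothly evolving weights.

    A crude stand-in for a Dirichlet evolution: weights are an exponentially
    smoothed version of the responsibilities, so a series can drift between
    clusters over time instead of keeping a static membership.
    """
    k = len(means)
    weights = np.full(k, 1.0 / k)
    path = []
    for y in series:
        lik = np.exp(-0.5 * ((y - means) / sds) ** 2) / sds  # Gaussian likelihoods
        resp = weights * lik
        resp /= resp.sum()
        weights = smooth * weights + (1 - smooth) * resp     # smooth weight evolution
        path.append(resp)
    return np.array(path)

y = np.concatenate([np.random.normal(0, 1, 50), np.random.normal(4, 1, 50)])
print(dynamic_memberships(y, means=np.array([0.0, 4.0]), sds=np.array([1.0, 1.0]))[::25])
```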
Predicting Elastic Properties of Materials from Electronic Charge Density Using 3D Deep Convolutional Neural Networks
Title | Predicting Elastic Properties of Materials from Electronic Charge Density Using 3D Deep Convolutional Neural Networks |
Authors | Yong Zhao, Kunpeng Yuan, Yinqiao Liu, Steph-Yves Louis, Ming Hu, Jianjun Hu |
Abstract | Materials representation plays a key role in machine learning based prediction of materials properties and new materials discovery. Currently, both graph and 3D voxel representation methods are based on the heterogeneous elements of the crystal structures. Here, we propose to use electronic charge density (ECD) as a generic, unified 3D descriptor for materials property prediction, with the advantage of being closely related to the physical and chemical properties of materials. We develop ECD-based 3D convolutional neural networks (CNNs) for predicting elastic properties of materials, in which the CNNs learn effective hierarchical features via multiple convolution and pooling operations. Extensive benchmark experiments over 2,170 Fm-3m face-centered-cubic (FCC) materials show that our ECD-based CNNs achieve good performance for elasticity prediction. In particular, our CNN models based on the fusion of elemental Magpie features and ECD descriptors achieve the best 5-fold cross-validation performance. More importantly, we show that our ECD-based CNN models achieve significantly better extrapolation performance when evaluated on non-redundant datasets where there are few neighboring training samples around test samples. As additional validation, we evaluate the predictive performance of our models on 329 materials of space group Fm-3m by comparing to DFT-calculated values, which shows that our model predicts bulk modulus better than shear modulus. Due to the unified representation power of ECD, we expect that our ECD-based CNN approach can also be applied to predict other physical and chemical properties of crystalline materials. |
Tasks | |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.13425v1 |
https://arxiv.org/pdf/2003.13425v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-elastic-properties-of-materials |
Repo | |
Framework | |
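A minimal sketch of the kind of 3D CNN regressor involved, assuming a single-channel $32^3$ ECD voxel grid; the paper's actual architecture and the fusion with Magpie features are not reproduced here:

```python
import torch
import torch.nn as nn

class ECD3DCNN(nn.Module):
    """Toy 3D CNN over charge-density voxels regressing one elastic property."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(64, 1)  # e.g. bulk modulus

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = ECD3DCNN()
voxels = torch.randn(4, 1, 32, 32, 32)  # batch of hypothetical ECD grids
print(model(voxels).shape)              # torch.Size([4, 1])
```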
Coresets for the Nearest-Neighbor Rule
Title | Coresets for the Nearest-Neighbor Rule |
Authors | Alejandro Flores-Velazco, David M. Mount |
Abstract | The problem of nearest-neighbor condensation deals with finding a subset $R$ of a set of labeled points $P$ such that for every point $p \in P$ the nearest-neighbor of $p$ in $R$ has the same label as $p$. This is motivated by applications in classification, where the nearest-neighbor rule assigns to an unlabeled query point the label of its nearest-neighbor in the point set. In this context, condensation aims to reduce the size of the set needed to classify new points. However, finding such subsets of minimum cardinality is NP-hard, and most research has focused on practical heuristics without performance guarantees. Additionally, the use of exact nearest-neighbors is always assumed, ignoring the effect of condensation on classification accuracy when nearest-neighbors are computed approximately. In this paper, we address these shortcomings by proposing new approximation-sensitive criteria for the nearest-neighbor condensation problem, along with practical algorithms with provable performance guarantees. We characterize sufficient conditions to guarantee correct classification of unlabeled points using approximate nearest-neighbor queries on these subsets, which introduces the notion of coresets for classification with the nearest-neighbor rule. Moreover, we prove that it is NP-hard to compute subsets with these characteristics, whose cardinality approximates that of the minimum cardinality subset. Additionally, we propose new algorithms for computing such subsets, with tight approximation factors in general metrics, and improved factors for doubling metrics and $\ell_p$ metrics with $p\geq2$. Finally, we show an alternative implementation scheme that reduces the worst-case time complexity of one of these algorithms, becoming the first truly subquadratic approximation algorithm for the nearest-neighbor condensation problem. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06650v2 |
https://arxiv.org/pdf/2002.06650v2.pdf | |
PWC | https://paperswithcode.com/paper/coresets-for-the-nearest-neighbor-rule |
Repo | |
Framework | |
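For context, the classical condensation heuristic (Hart's CNN rule) keeps a point only when the subset built so far would misclassify it; the paper's contribution is approximation-sensitive variants of this problem with provable guarantees. A sketch of the classical heuristic, not the paper's algorithms:

```python
import numpy as np

def condense(points, labels):
    """Hart-style condensation: add a point to the kept subset whenever the
    current subset's nearest-neighbor rule would mislabel it."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(points)):
            sub = np.array(keep)
            d = np.linalg.norm(points[sub] - points[i], axis=1)
            if labels[sub][np.argmin(d)] != labels[i]:
                keep.append(i)
                changed = True
    return sorted(set(keep))

X = np.random.randn(200, 2)
y = (X[:, 0] > 0).astype(int)
R = condense(X, y)
print(f"kept {len(R)} of {len(X)} points")
```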
Towards Certifiable Adversarial Sample Detection
Title | Towards Certifiable Adversarial Sample Detection |
Authors | Ilia Shumailov, Yiren Zhao, Robert Mullins, Ross Anderson |
Abstract | Convolutional Neural Networks (CNNs) are deployed in more and more classification systems, but adversarial samples can be maliciously crafted to trick them, and are becoming a real threat. There have been various proposals to improve CNNs’ adversarial robustness but these all suffer performance penalties or other limitations. In this paper, we provide a new approach in the form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT). The system can provide certifiable guarantees of detection of adversarial inputs for certain $l_{\infty}$ sizes on a reasonable assumption, namely that the training data have the same distribution as the test data. We develop and evaluate several versions of CTT with a range of defense capabilities, training overheads and certifiability on adversarial samples. Against adversaries with various $l_p$ norms, CTT outperforms existing defense methods that focus purely on improving network robustness. We show that CTT has small false positive rates on clean test data, minimal compute overheads when deployed, and can support complex security policies. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08740v1 |
https://arxiv.org/pdf/2002.08740v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-certifiable-adversarial-sample |
Repo | |
Framework | |
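The detection side of a taboo-trap-style scheme can be sketched as calibrating an activation threshold on clean data and flagging inputs that push hidden activations past it. The sketch below is only that detection skeleton, with an arbitrary toy network and quantile; CTT's training procedure and certification argument are not reproduced:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))

def calibrate(model, clean_x, q=0.999):
    """Record a high quantile of hidden activations on clean training data."""
    with torch.no_grad():
        h = model[1](model[0](clean_x))
    return torch.quantile(h.flatten(), q)

def is_adversarial(model, x, tau):
    """Flag inputs whose hidden activations enter the 'taboo' region."""
    with torch.no_grad():
        h = model[1](model[0](x))
    return (h > tau).any(dim=1)

tau = calibrate(model, torch.randn(1000, 20))
print(is_adversarial(model, torch.randn(5, 20) * 10, tau))
```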
Teaching the Old Dog New Tricks: Supervised Learning with Constraints
Title | Teaching the Old Dog New Tricks: Supervised Learning with Constraints |
Authors | Fabrizio Detassis, Michele Lombardi, Michela Milano |
Abstract | Methods for taking external knowledge into account in Machine Learning models have the potential to address outstanding issues in data-driven AI methods, such as improving safety and fairness, and can simplify training in the presence of scarce data. We propose a simple but effective method for injecting constraints at training time in supervised learning, based on decomposition and bi-level optimization: a master step is in charge of enforcing the constraints, while a learner step takes care of training the model. The process leads to approximate constraint satisfaction. The method is applicable to any ML approach for which the concept of label (or target) is well defined (most regression and classification scenarios), and allows existing training algorithms to be reused without modification. We require no assumptions on the constraints, although their properties affect the shape and complexity of the master problem. Convergence guarantees are hard to provide, but we found that the approach performs well on ML tasks with fairness constraints and on classical datasets with synthetic constraints. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10766v1 |
https://arxiv.org/pdf/2002.10766v1.pdf | |
PWC | https://paperswithcode.com/paper/teaching-the-old-dog-new-tricks-supervised |
Repo | |
Framework | |
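The master/learner decomposition can be illustrated with a toy constraint. Below, the master projects the learner's predictions onto the constraint set (here non-negativity, a hypothetical stand-in for the paper's generic constraints), and the adjusted predictions become the next round's training targets:

```python
import numpy as np
from sklearn.linear_model import Ridge

X = np.random.randn(200, 5)
y = X @ np.random.randn(5) - 0.5

def master(pred):
    """Project predictions onto the constraint set (toy case: y_hat >= 0),
    staying as close as possible to the learner's output."""
    return np.maximum(pred, 0.0)

targets = y.copy()
learner = Ridge(alpha=1.0)
for it in range(5):
    learner.fit(X, targets)               # learner step: ordinary supervised training
    targets = master(learner.predict(X))  # master step: enforce the constraints

print(float((learner.predict(X) < -1e-6).mean()))  # fraction still violating
```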
Interpreting Interpretations: Organizing Attribution Methods by Criteria
Title | Interpreting Interpretations: Organizing Attribution Methods by Criteria |
Authors | Zifan Wang, Piotr Mardziel, Anupam Datta, Matt Fredrikson |
Abstract | Attribution methods, which explain the behaviour of machine learning models such as Convolutional Neural Networks (CNNs), have developed into many different forms, motivated by distinct though related criteria. Given this diversity of attribution methods, evaluation tools are needed to answer: which method is better for what purpose, and why? This paper introduces a new way to decompose the evaluation of attribution methods into two criteria: ordering and proportionality. We argue that existing evaluations follow an ordering criterion roughly corresponding to either the logical concept of necessity or sufficiency. The paper further demonstrates a notion of Proportionality for Necessity and Sufficiency, a quantitative evaluation for comparing existing attribution methods, as a refinement of the ordering criteria. Evaluating the performance of existing attribution methods at explaining CNNs for image classification, we conclude that some attribution methods are better under the necessity analysis and others under the sufficiency analysis, but no method always wins on both sides. |
Tasks | Image Classification |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.07985v1 |
https://arxiv.org/pdf/2002.07985v1.pdf | |
PWC | https://paperswithcode.com/paper/interpreting-interpretations-organizing |
Repo | |
Framework | |
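A common instantiation of an ordering criterion is a deletion-style test: ablate the most-attributed features first and watch the model score. The sketch below shows that generic necessity-flavored test, not the paper's Proportionality metric; the toy linear model and attribution are hypothetical:

```python
import numpy as np

def necessity_curve(model, x, attribution, n_steps=10, baseline=0.0):
    """Ablate the most-attributed features first and track the model score;
    a fast drop suggests the attribution ranks necessary features highly."""
    order = np.argsort(attribution)[::-1]
    scores = [model(x)]
    x = x.copy()
    for chunk in np.array_split(order, n_steps):
        x[chunk] = baseline
        scores.append(model(x))
    return np.array(scores)

w = np.array([2.0, -1.0, 0.5, 0.0])
model = lambda x: float(x @ w)             # toy linear "classifier score"
x = np.ones(4)
attr = np.abs(w)                           # hypothetical attribution = |weights|
print(necessity_curve(model, x, attr, n_steps=4))
```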
Designing Interaction for Multi-agent Cooperative System in an Office Environment
Title | Designing Interaction for Multi-agent Cooperative System in an Office Environment |
Authors | Chao Wang, Stephan Hasler, Manuel Muehlig, Frank Joublin, Antonello Ceravola, Joerg Deigmoeller, Lydia Fischer |
Abstract | Future intelligent systems will involve a wide variety of artificial agents, such as mobile robots, smart home infrastructure, and personal devices, which share data and collaborate with each other to execute certain tasks. Designing an efficient human-machine interface that supports users in expressing needs to the system, supervising the collaboration progress of different entities, and evaluating the results is challenging. This paper presents the design and implementation of the human-machine interface of the Intelligent Cyber-Physical System (ICPS), a multi-entity coordination system of robots and other smart devices in a working environment. ICPS gathers sensory data from entities, receives users’ commands, and then optimizes plans to utilize the capabilities of different entities to serve people. Using multi-modal interaction methods, e.g. graphical interfaces, speech interaction, gestures, and facial expressions, ICPS is able to receive inputs from users through different entities, keep users aware of progress, and accomplish tasks efficiently. |
Tasks | |
Published | 2020-02-15 |
URL | https://arxiv.org/abs/2002.06417v1 |
https://arxiv.org/pdf/2002.06417v1.pdf | |
PWC | https://paperswithcode.com/paper/designing-interaction-for-multi-agent |
Repo | |
Framework | |
Trustworthy AI
Title | Trustworthy AI |
Authors | Jeannette M. Wing |
Abstract | The promise of AI is huge. AI systems have already achieved good enough performance to be in our streets and in our homes. However, they can be brittle and unfair. For society to reap the benefits of AI systems, society needs to be able to trust them. Inspired by decades of progress in trustworthy computing, we suggest what trustworthy properties would be desired of AI systems. By enumerating a set of new research questions, we explore one approach, formal verification, for ensuring trust in AI. Trustworthy AI ups the ante on both trustworthy computing and formal methods. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.06276v1 |
https://arxiv.org/pdf/2002.06276v1.pdf | |
PWC | https://paperswithcode.com/paper/trustworthy-ai |
Repo | |
Framework | |
A Model to Measure the Spread Power of Rumors
Title | A Model to Measure the Spread Power of Rumors |
Authors | Zoleikha Jahanbakhsh-Nagadeh, Mohammad-Reza Feizi-Derakhshi, Majid Ramezani, Taymaz Rahkar-Farshi, Meysam Asgari-Chenaghlu, Narjes Nikzad-Khasmakhi, Ali-Reza Feizi-Derakhshi, Mehrdad Ranjbar-Khadivi, Elnaz Zafarani-Moattar, Mohammad-Ali Balafar |
Abstract | Nowadays, a significant portion of the posts interacted with daily on social media are infected by rumors. This study investigates the problem of rumor analysis from a different angle than other research. It tackles the previously unaddressed problem of calculating the Spread Power of Rumor (SPR) and seeks to model spread power as a function of multi-contextual features. For this purpose, the theory of Allport and Postman is adopted, which holds that two key factors determine the spread power of a rumor: importance and ambiguity. The proposed Rumor Spread Power Measurement Model (RSPMM) computes SPR using a textual approach, which employs contextual features to compute the spread power of rumors in two categories: False Rumor (FR) and True Rumor (TR). In total, 51 contextual features are introduced to measure SPR and their impact on classification is investigated; 42 features in two categories, “importance” (28 features) and “ambiguity” (14 features), are then selected to compute SPR. The proposed RSPMM is verified on two labelled datasets collected from Twitter and Telegram. The results show that (i) the proposed new features are effective and efficient at discriminating between FRs and TRs; (ii) the proposed RSPMM approach uses only contextual features, whereas existing techniques are based on structure and content features, yet RSPMM achieves considerably better results (F-measure = 83%); and (iii) a t-test shows that the SPR criteria can significantly distinguish between FR and TR, and can also serve as a new method to verify the veracity of rumors. |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07563v2 |
https://arxiv.org/pdf/2002.07563v2.pdf | |
PWC | https://paperswithcode.com/paper/a-model-to-measure-the-spread-power-of-rumors |
Repo | |
Framework | |
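Allport and Postman’s classic relation says spread power grows with both importance and ambiguity (roughly $R \approx i \times a$). The sketch below scores SPR as a weighted product of the two feature groups; the features, weights, and normalization are hypothetical, and the paper's 42 concrete features are not reproduced:

```python
import numpy as np

def spread_power(importance_feats, ambiguity_feats, w_i=None, w_a=None):
    """Score SPR from pre-extracted, normalized contextual features,
    combined multiplicatively in the Allport-Postman spirit."""
    i = np.asarray(importance_feats)
    a = np.asarray(ambiguity_feats)
    w_i = np.full(i.shape, 1 / i.size) if w_i is None else np.asarray(w_i)
    w_a = np.full(a.shape, 1 / a.size) if w_a is None else np.asarray(w_a)
    # Spread power grows with both importance and ambiguity.
    return float(w_i @ i) * float(w_a @ a)

print(spread_power([0.9, 0.7, 0.8], [0.6, 0.4]))  # toy feature values
```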
Knowledge distillation via adaptive instance normalization
Title | Knowledge distillation via adaptive instance normalization |
Authors | Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos |
Abstract | This paper addresses the problem of model compression via knowledge distillation. To this end, we propose a new knowledge distillation method based on transferring feature statistics, specifically the channel-wise mean and variance, from the teacher to the student. Our method goes beyond the standard way of enforcing the mean and variance of the student to be similar to those of the teacher through an $L_2$ loss, which we found to be of limited effectiveness. Specifically, we propose a new loss based on adaptive instance normalization to effectively transfer the feature statistics. The main idea is to transfer the learned statistics back to the teacher via adaptive instance normalization (conditioned on the student) and let the teacher network “evaluate” via a loss whether the statistics learned by the student are reliably transferred. We show that our distillation method outperforms other state-of-the-art distillation methods over a large set of experimental settings including different (a) network architectures, (b) teacher-student capacities, (c) datasets, and (d) domains. |
Tasks | Model Compression |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04289v1 |
https://arxiv.org/pdf/2003.04289v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-distillation-via-adaptive-instance |
Repo | |
Framework | |
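The core AdaIN operation, and one plausible reading of the proposed loss, can be sketched as follows: re-normalize teacher features to the student's channel-wise statistics and let frozen downstream teacher layers judge the result. The layer choices, shapes, and the assumption that student and teacher channels match are all illustrative:

```python
import torch

def adain(content, style_mean, style_std, eps=1e-5):
    """Adaptive instance normalization: give 'content' features the supplied
    channel-wise statistics: AdaIN(x) = s_std * (x - mu(x)) / sigma(x) + s_mean."""
    mean = content.mean(dim=(2, 3), keepdim=True)
    std = content.std(dim=(2, 3), keepdim=True) + eps
    return style_std * (content - mean) / std + style_mean

def adain_distillation_loss(f_student, f_teacher, teacher_tail):
    """Transfer the student's statistics back into the teacher's features and
    let the teacher's remaining (frozen) layers 'evaluate' them via an L2 term."""
    s_mean = f_student.mean(dim=(2, 3), keepdim=True)
    s_std = f_student.std(dim=(2, 3), keepdim=True)
    recond = adain(f_teacher, s_mean, s_std)   # teacher features, student stats
    with torch.no_grad():
        ref = teacher_tail(f_teacher)
    return torch.nn.functional.mse_loss(teacher_tail(recond), ref)

tail = torch.nn.Conv2d(16, 16, 3, padding=1)
for p in tail.parameters():                    # teacher stays frozen
    p.requires_grad_(False)
f_s = torch.randn(2, 16, 8, 8, requires_grad=True)
f_t = torch.randn(2, 16, 8, 8)
adain_distillation_loss(f_s, f_t, tail).backward()  # grads flow to the student stats
```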
Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity
Title | Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity |
Authors | Thomas Miconi, Aditya Rawal, Jeff Clune, Kenneth O. Stanley |
Abstract | The impressive lifelong learning in animal brains is primarily enabled by plastic changes in synaptic connectivity. Importantly, these changes are not passive, but are actively controlled by neuromodulation, which is itself under the control of the brain. The resulting self-modifying abilities of the brain play an important role in learning and adaptation, and are a major basis for biological reinforcement learning. Here we show for the first time that artificial neural networks with such neuromodulated plasticity can be trained with gradient descent. Extending previous work on differentiable Hebbian plasticity, we propose a differentiable formulation for the neuromodulation of plasticity. We show that neuromodulated plasticity improves the performance of neural networks on both reinforcement learning and supervised learning tasks. In one task, neuromodulated plastic LSTMs with millions of parameters outperform standard LSTMs on a benchmark language modeling task (controlling for the number of parameters). We conclude that differentiable neuromodulation of plasticity offers a powerful new framework for training neural networks. |
Tasks | Language Modelling |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10585v1 |
https://arxiv.org/pdf/2002.10585v1.pdf | |
PWC | https://paperswithcode.com/paper/backpropamine-training-self-modifying-neural-1 |
Repo | |
Framework | |
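The neuromodulated Hebbian update described in the abstract has the flavor of $\mathrm{Hebb}_{t+1} = \mathrm{Clip}(\mathrm{Hebb}_t + M(t)\, x_t y_t^\top)$, with the modulation $M(t)$ computed by the network itself. A minimal single-layer sketch, assuming a simple clipped trace; the paper's full recurrent architectures and eligibility-trace variant are not reproduced:

```python
import torch
import torch.nn as nn

class PlasticLayer(nn.Module):
    """One layer with fixed weights plus a neuromodulated Hebbian trace."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.w = nn.Parameter(0.1 * torch.randn(n_in, n_out))      # fixed weights
        self.alpha = nn.Parameter(0.1 * torch.randn(n_in, n_out))  # plasticity coefficients
        self.mod = nn.Linear(n_out, 1)  # network-computed neuromodulatory signal M(t)

    def forward(self, x, hebb):
        y = torch.tanh(x @ (self.w + self.alpha * hebb))
        m = torch.tanh(self.mod(y)).mean()  # scalar neuromodulation, itself learned
        # Neuromodulated Hebbian update: the trace changes only as much as M(t) allows.
        hebb = torch.clamp(hebb + m * torch.einsum('bi,bj->ij', x, y) / x.shape[0], -1, 1)
        return y, hebb

layer = PlasticLayer(8, 8)
hebb = torch.zeros(8, 8)
x = torch.randn(4, 8)
for _ in range(3):                # everything stays differentiable across steps
    x, hebb = layer(x, hebb)
print(hebb.norm())
```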