Paper Group ANR 174
Practical Learning of Predictive State Representations
Title | Practical Learning of Predictive State Representations |
Authors | Carlton Downey, Ahmed Hefny, Geoffrey Gordon |
Abstract | Over the past decade there has been considerable interest in spectral algorithms for learning Predictive State Representations (PSRs). Spectral algorithms have appealing theoretical guarantees; however, the resulting models do not always perform well on inference tasks in practice. One reason for this behavior is the mismatch between the intended task (accurate filtering or prediction) and the loss function being optimized by the algorithm (estimation error in model parameters). A natural idea is to improve performance by refining PSRs using an algorithm such as EM. Unfortunately, it is not obvious how to apply an EM-style algorithm in the context of PSRs, as the log-likelihood is not well defined for all PSRs. We show that it is possible to overcome this problem using ideas from Predictive State Inference Machines (PSIMs). We combine spectral algorithms for PSRs, as a consistent and efficient initialization, with PSIM-style updates to refine the resulting model parameters. By combining these two ideas we develop Inference Gradients, a simple, fast, and robust method for practical learning of PSRs. Inference Gradients performs gradient descent in the PSR parameter space to optimize an inference-based loss function, as in PSIM. Because Inference Gradients uses a spectral initialization, we get the same consistency benefits as PSRs. We show that Inference Gradients outperforms both PSRs and PSIMs on real and synthetic data sets. |
Tasks | |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04121v1 |
http://arxiv.org/pdf/1702.04121v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-learning-of-predictive-state |
Repo | |
Framework | |
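The abstract's central move, refining a consistent initial estimate by gradient descent on a prediction (inference) loss rather than a parameter loss, can be illustrated on a toy stand-in. This is not the paper's PSR algorithm: the AR(1) process, learning rate, and crude zero initialization below are illustrative assumptions.

```python
import random

random.seed(0)

# Toy stand-in for an inference-based loss: fit the coefficient `a` of an
# AR(1) process x[t+1] = a*x[t] + noise by gradient descent on the
# one-step prediction error, rather than on error in model parameters.
true_a = 0.8
xs = [1.0]
for _ in range(500):
    xs.append(true_a * xs[-1] + random.gauss(0.0, 0.1))

n = len(xs) - 1
a = 0.0  # crude initialization; the paper would use a spectral estimate here
lr = 1.0
for _ in range(500):
    grad = sum(-2 * xs[t] * (xs[t + 1] - a * xs[t]) for t in range(n)) / n
    a -= lr * grad

assert abs(a - true_a) < 0.1  # converges to the prediction-optimal coefficient
```

The gradient steps drive the one-step prediction error down, recovering the coefficient that filters best rather than the one closest in parameter space to some intermediate estimate.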
Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks
Title | Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks |
Authors | Hao Dong, Guang Yang, Fangde Liu, Yuanhan Mo, Yike Guo |
Abstract | A major challenge in brain tumor treatment planning and quantitative evaluation is determination of the tumor extent. The noninvasive magnetic resonance imaging (MRI) technique has emerged as a front-line diagnostic tool for brain tumors without ionizing radiation. Manual segmentation of brain tumor extent from 3D MRI volumes is a very time-consuming task, and performance relies heavily on the operator’s experience. In this context, a reliable fully automatic method for brain tumor segmentation is necessary for an efficient measurement of the tumor extent. In this study, we propose a fully automatic method for brain tumor segmentation, which is developed using U-Net based deep convolutional networks. Our method was evaluated on the Multimodal Brain Tumor Image Segmentation (BRATS 2015) datasets, which contain 220 high-grade and 54 low-grade brain tumor cases. Cross-validation has shown that our method can achieve promising segmentation results efficiently. |
Tasks | Brain Tumor Segmentation, Semantic Segmentation |
Published | 2017-05-10 |
URL | http://arxiv.org/abs/1705.03820v3 |
http://arxiv.org/pdf/1705.03820v3.pdf | |
PWC | https://paperswithcode.com/paper/automatic-brain-tumor-detection-and |
Repo | |
Framework | |
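Segmentation quality on BRATS is conventionally reported with the Dice coefficient, which measures overlap between the predicted and ground-truth tumor masks. A minimal sketch on flat binary masks (the paper's exact evaluation protocol is assumed, not quoted):

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat lists of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # both masks empty: count as perfect agreement
    return 2.0 * intersection / total

pred  = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
assert abs(dice_score(pred, truth) - 2 / 3) < 1e-9  # 2*2 / (3 + 3)
```

A 3D MRI volume is scored the same way after flattening the voxel masks; per-region scores (whole tumor, core, enhancing) are just Dice on different label subsets.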
Expectation maximization transfer learning and its application for bionic hand prostheses
Title | Expectation maximization transfer learning and its application for bionic hand prostheses |
Authors | Benjamin Paaßen, Alexander Schulz, Janne Hahne, Barbara Hammer |
Abstract | Machine learning models in practical settings are typically confronted with changes to the distribution of the incoming data. Such changes can severely affect the model performance, leading for example to misclassifications of data. This is particularly apparent in the domain of bionic hand prostheses, where machine learning models promise faster and more intuitive user interfaces, but are hindered by their lack of robustness to everyday disturbances, such as electrode shifts. One way to address changes in the data distribution is transfer learning, that is, to transfer the disturbed data to a space where the original model is applicable again. In this contribution, we propose a novel expectation maximization algorithm to learn linear transformations that maximize the likelihood of disturbed data after the transformation. We also show that this approach generalizes to discriminative models, in particular learning vector quantization models. In our evaluation on data from the bionic prostheses domain, we demonstrate that our approach can learn a transformation which significantly improves classification accuracy and outperforms all tested baselines when only little data or few classes are available in the target domain. |
Tasks | Quantization, Transfer Learning |
Published | 2017-11-25 |
URL | http://arxiv.org/abs/1711.09256v1 |
http://arxiv.org/pdf/1711.09256v1.pdf | |
PWC | https://paperswithcode.com/paper/expectation-maximization-transfer-learning |
Repo | |
Framework | |
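A heavily simplified, fully supervised variant of the idea: when disturbed and undisturbed samples are *paired* and the noise is isotropic Gaussian, the maximum-likelihood linear transformation reduces to a least-squares fit. The paper's EM algorithm addresses the harder unpaired case; the 1-D affine fit and the electrode-shift analogue below are illustrative assumptions.

```python
def fit_affine_1d(disturbed, original):
    """Least-squares affine map from disturbed to original 1-D signals.
    Under paired samples with isotropic Gaussian noise, this is also the
    maximum-likelihood linear transformation."""
    n = len(disturbed)
    mx = sum(disturbed) / n
    my = sum(original) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(disturbed, original))
    var = sum((x - mx) ** 2 for x in disturbed)
    a = cov / var
    b = my - a * mx
    return a, b

# Electrode-shift analogue: the disturbed signal is 2*original + 1, so the
# recovered map back to the original space should be y = 0.5*x - 0.5.
original  = [0.0, 1.0, 2.0, 3.0, 4.0]
disturbed = [2 * x + 1 for x in original]
a, b = fit_affine_1d(disturbed, original)
assert abs(a - 0.5) < 1e-9 and abs(b + 0.5) < 1e-9
```

Once such a map is learned, the disturbed data is transformed back and the original classifier is applied unchanged, which is the transfer-learning strategy the abstract describes.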
Data-driven Optimal Cost Selection for Distributionally Robust Optimization
Title | Data-driven Optimal Cost Selection for Distributionally Robust Optimization |
Authors | Jose Blanchet, Yang Kang, Fan Zhang, Karthyek Murthy |
Abstract | Recently, Blanchet, Kang, and Murthy (2016) and Blanchet and Kang (2017) showed that several machine learning algorithms, such as square-root Lasso, support vector machines, and regularized logistic regression, among many others, can be represented exactly as distributionally robust optimization (DRO) problems. The distributional uncertainty is defined as a neighborhood centered at the empirical distribution. We propose a methodology which learns such a neighborhood in a natural data-driven way. We show rigorously that our framework encompasses adaptive regularization as a particular case. Moreover, we demonstrate empirically that our proposed methodology is able to improve upon a wide range of popular machine learning estimators. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07152v3 |
http://arxiv.org/pdf/1705.07152v3.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-optimal-transport-cost-selection |
Repo | |
Framework | |
IDK Cascades: Fast Deep Learning by Learning not to Overthink
Title | IDK Cascades: Fast Deep Learning by Learning not to Overthink |
Authors | Xin Wang, Yujia Luo, Daniel Crankshaw, Alexey Tumanov, Fisher Yu, Joseph E. Gonzalez |
Abstract | Advances in deep learning have led to substantial increases in prediction accuracy but have been accompanied by increases in the cost of rendering predictions. We conjecture that for a majority of real-world inputs, the recent advances in deep learning have created models that effectively “overthink” on simple inputs. In this paper, we revisit the classic question of building model cascades that primarily leverage class asymmetry to reduce cost. We introduce the “I Don’t Know” (IDK) prediction cascades framework, a general framework to systematically compose a set of pre-trained models to accelerate inference without a loss in prediction accuracy. We propose two search-based methods for constructing cascades as well as a new cost-aware objective within this framework. The proposed IDK cascade framework can be easily adopted in existing model serving systems without additional model re-training. We evaluate the proposed techniques on a range of benchmarks to demonstrate the effectiveness of the proposed framework. |
Tasks | |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00885v4 |
http://arxiv.org/pdf/1706.00885v4.pdf | |
PWC | https://paperswithcode.com/paper/idk-cascades-fast-deep-learning-by-learning |
Repo | |
Framework | |
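The cascade mechanism the abstract describes can be sketched in a few lines: a cheap model answers when confident, and otherwise says "I don't know" and defers to a more expensive model. The models, confidence values, and threshold below are stand-ins, not the paper's learned cascades or cost-aware objective.

```python
def idk_cascade(x, models, threshold=0.9):
    """models: list of (cost, predict_fn) ordered cheap -> expensive, where
    predict_fn returns (label, confidence). Returns (label, total_cost)."""
    total_cost = 0.0
    label = None
    for cost, predict in models:
        total_cost += cost
        label, confidence = predict(x)
        if confidence >= threshold:
            break  # confident enough: stop early, don't overthink
    return label, total_cost

# Hypothetical models: the cheap one is only confident on small inputs.
cheap     = (1.0,  lambda x: (("even" if x % 2 == 0 else "odd"), 0.95 if x < 100 else 0.5))
expensive = (10.0, lambda x: (("even" if x % 2 == 0 else "odd"), 1.0))

assert idk_cascade(4,   [cheap, expensive]) == ("even", 1.0)   # cheap model suffices
assert idk_cascade(101, [cheap, expensive]) == ("odd", 11.0)   # deferred downstream
```

Because the component models are used as-is, such a cascade can wrap already-deployed models without retraining, which is the deployment advantage the abstract emphasizes.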
Plan, Attend, Generate: Planning for Sequence-to-Sequence Models
Title | Plan, Attend, Generate: Planning for Sequence-to-Sequence Models |
Authors | Francis Dutil, Caglar Gulcehre, Adam Trischler, Yoshua Bengio |
Abstract | We investigate the integration of a planning mechanism into sequence-to-sequence models using attention. We develop a model which can plan ahead when computing its alignments between input and output sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the recently proposed strategic attentive reader and writer (STRAW) model for reinforcement learning. Our proposed model is end-to-end trainable using primarily differentiable operations. We show that it outperforms a strong baseline on character-level translation tasks from WMT’15, the algorithmic task of finding Eulerian circuits of graphs, and question generation from text. Our analysis demonstrates that the model computes qualitatively intuitive alignments, converges faster than the baselines, and achieves superior performance with fewer parameters. |
Tasks | Question Generation |
Published | 2017-11-28 |
URL | http://arxiv.org/abs/1711.10462v1 |
http://arxiv.org/pdf/1711.10462v1.pdf | |
PWC | https://paperswithcode.com/paper/plan-attend-generate-planning-for-sequence-to |
Repo | |
Framework | |
Clustering For Point Pattern Data
Title | Clustering For Point Pattern Data |
Authors | Quang N. Tran, Ba-Ngu Vo, Dinh Phung, Ba-Tuong Vo |
Abstract | Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research in the clustering of point patterns - sets or multi-sets of unordered elements - that are found in numerous applications and data sources. In this paper, we propose two approaches for clustering point patterns. The first is a non-parametric method based on novel distances for sets. The second is a model-based approach, formulated via random finite set theory, and solved by the Expectation-Maximization algorithm. Numerical experiments show that the proposed methods perform well on both simulated and real data. |
Tasks | |
Published | 2017-02-08 |
URL | http://arxiv.org/abs/1702.02262v1 |
http://arxiv.org/pdf/1702.02262v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-for-point-pattern-data |
Repo | |
Framework | |
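The non-parametric approach in the abstract rests on distances between sets of unordered points. The paper proposes novel set distances; the classic Hausdorff distance shown here is an illustrative stand-in: two point patterns are close when every point of each is near some point of the other.

```python
def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (tuples of floats)."""
    def d(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def directed(X, Y):
        # largest distance from a point of X to its nearest neighbor in Y
        return max(min(d(x, y) for y in Y) for x in X)

    return max(directed(A, B), directed(B, A))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
assert hausdorff(A, B) == 3.0  # the unmatched point (0, 3) dominates
```

With any such set distance in hand, standard distance-based clustering algorithms (e.g. k-medoids or hierarchical clustering) apply directly to point-pattern data.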
How to Escape Saddle Points Efficiently
Title | How to Escape Saddle Points Efficiently |
Authors | Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan |
Abstract | This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost “dimension-free”). The convergence rate of this procedure matches the well-known convergence rate of gradient descent to first-order stationary points, up to log factors. When all saddle points are non-degenerate, all second-order stationary points are local minima, and our result thus shows that perturbed gradient descent can escape saddle points almost for free. Our results can be directly applied to many machine learning applications, including deep learning. As a particular concrete example of such an application, we show that our results can be used directly to establish sharp global convergence rates for matrix factorization. Our results rely on a novel characterization of the geometry around saddle points, which may be of independent interest to the non-convex optimization community. |
Tasks | |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00887v1 |
http://arxiv.org/pdf/1703.00887v1.pdf | |
PWC | https://paperswithcode.com/paper/how-to-escape-saddle-points-efficiently |
Repo | |
Framework | |
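The mechanism the abstract analyzes can be seen on the textbook saddle f(x, y) = x² - y², whose only stationary point (0, 0) is a strict saddle. Plain gradient descent started exactly there never moves; adding a small random perturbation whenever the gradient is tiny lets the iterate escape along the -y² direction. The step size and perturbation radius below are illustrative, not the paper's carefully tuned choices.

```python
import random

random.seed(1)

def grad(x, y):
    # gradient of f(x, y) = x^2 - y^2
    return 2 * x, -2 * y

x, y = 0.0, 0.0  # start exactly at the saddle point
eta, radius = 0.1, 1e-3
for _ in range(200):
    gx, gy = grad(x, y)
    if gx * gx + gy * gy < 1e-12:
        # gradient is (numerically) zero: possibly a saddle, so perturb
        x += random.uniform(-radius, radius)
        y += random.uniform(-radius, radius)
    else:
        x, y = x - eta * gx, y - eta * gy

assert x * x - y * y < -1e-6  # escaped: function value strictly below the saddle's
```

Along the unstable direction each step multiplies y by (1 + 2*eta), so even a tiny perturbation is amplified geometrically, which is the intuition behind the "almost for free" escape guarantee.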
Context-Aware Hierarchical Online Learning for Performance Maximization in Mobile Crowdsourcing
Title | Context-Aware Hierarchical Online Learning for Performance Maximization in Mobile Crowdsourcing |
Authors | Sabrina Klos, Cem Tekin, Mihaela van der Schaar, Anja Klein |
Abstract | In mobile crowdsourcing (MCS), mobile users accomplish outsourced human intelligence tasks. MCS requires an appropriate task assignment strategy, since different workers may have different performance in terms of acceptance rate and quality. Task assignment is challenging, since a worker’s performance (i) may fluctuate, depending on both the worker’s current personal context and the task context, and (ii) is not known a priori but has to be learned over time. Moreover, learning context-specific worker performance requires access to context information, which may not be available at a central entity due to communication overhead or privacy concerns. Additionally, evaluating worker performance might require costly quality assessments. In this paper, we propose a context-aware hierarchical online learning algorithm addressing the problem of performance maximization in MCS. In our algorithm, a local controller (LC) in the mobile device of a worker regularly observes the worker’s context, her/his decisions to accept or decline tasks, and the quality in completing tasks. Based on these observations, the LC regularly estimates the worker’s context-specific performance. The mobile crowdsourcing platform (MCSP) then selects workers based on performance estimates received from the LCs. This hierarchical approach enables the LCs to learn context-specific worker performance and the MCSP to select suitable workers. In addition, our algorithm preserves worker context locally, and it keeps the number of required quality assessments low. We prove that our algorithm converges to the optimal task assignment strategy. Moreover, the algorithm outperforms simpler task assignment strategies in experiments based on synthetic and real data. |
Tasks | |
Published | 2017-05-10 |
URL | http://arxiv.org/abs/1705.03822v2 |
http://arxiv.org/pdf/1705.03822v2.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-hierarchical-online-learning |
Repo | |
Framework | |
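The explore/exploit core of learning worker performance online can be sketched with a non-contextual epsilon-greedy bandit. The paper's algorithm additionally conditions on worker and task context and runs the estimators locally on each device; the worker names, qualities, and epsilon below are made-up illustration.

```python
import random

random.seed(0)

# Hypothetical ground-truth acceptance/quality rates, unknown to the learner.
true_quality = {"alice": 0.9, "bob": 0.6, "carol": 0.3}
estimates = {w: 0.0 for w in true_quality}
counts = {w: 0 for w in true_quality}

def select_worker(eps=0.1):
    if random.random() < eps:
        return random.choice(list(estimates))   # explore an arbitrary worker
    return max(estimates, key=estimates.get)    # exploit the best estimate

for _ in range(3000):
    w = select_worker()
    reward = 1.0 if random.random() < true_quality[w] else 0.0
    counts[w] += 1
    estimates[w] += (reward - estimates[w]) / counts[w]  # running mean update

best = max(estimates, key=estimates.get)
assert best == "alice"  # the learner identifies the best worker
```

The hierarchical twist in the paper is that these running estimates live on the workers' devices (preserving context privacy), while the platform only sees the resulting performance estimates when assigning tasks.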
A Probabilistic Framework for Location Inference from Social Media
Title | A Probabilistic Framework for Location Inference from Social Media |
Authors | Yujie Qian, Jie Tang, Zhilin Yang, Binxuan Huang, Wei Wei, Kathleen M. Carley |
Abstract | We study the extent to which we can infer users’ geographical locations from social media. Location inference from social media can benefit many applications, such as disaster management, targeted advertising, and news content tailoring. The challenges, however, lie in the limited amount of labeled data and the large scale of social networks. In this paper, we formalize the problem of inferring location from social media into a semi-supervised factor graph model (SSFGM). The model provides a probabilistic framework in which various sources of information (e.g., content and social network) can be combined together. We design a two-layer neural network to learn feature representations, and incorporate the learned latent features into SSFGM. To deal with the large-scale problem, we propose a Two-Chain Sampling (TCS) algorithm to learn SSFGM. The algorithm achieves a good trade-off between accuracy and efficiency. Experiments on Twitter and Weibo show that the proposed TCS algorithm for SSFGM can substantially improve the inference accuracy over several state-of-the-art methods. More importantly, TCS achieves over 100x speedup compared with traditional propagation-based methods (e.g., loopy belief propagation). |
Tasks | |
Published | 2017-02-23 |
URL | https://arxiv.org/abs/1702.07281v3 |
https://arxiv.org/pdf/1702.07281v3.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-framework-for-location |
Repo | |
Framework | |
Shapechanger: Environments for Transfer Learning
Title | Shapechanger: Environments for Transfer Learning |
Authors | Sébastien M. R. Arnold, Tsam Kiu Pun, Théo-Tim J. Denisart, Francisco J. Valero-Cuevas |
Abstract | We present Shapechanger, a library for transfer reinforcement learning specifically designed for robotic tasks. We consider three types of knowledge transfer—from simulation to simulation, from simulation to real, and from real to real—and a wide range of tasks with continuous states and actions. Shapechanger is under active development and open-sourced at: https://github.com/seba-1511/shapechanger/. |
Tasks | Transfer Learning, Transfer Reinforcement Learning |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05070v1 |
http://arxiv.org/pdf/1709.05070v1.pdf | |
PWC | https://paperswithcode.com/paper/shapechanger-environments-for-transfer |
Repo | |
Framework | |
Towards Interrogating Discriminative Machine Learning Models
Title | Towards Interrogating Discriminative Machine Learning Models |
Authors | Wenbo Guo, Kaixuan Zhang, Lin Lin, Sui Huang, Xinyu Xing |
Abstract | It is oftentimes impossible to understand how machine learning models reach a decision. While recent research has proposed various technical approaches to provide some clues as to how a learning model makes individual decisions, they cannot provide users with the ability to inspect a learning model as a complete entity. In this work, we propose a new technical approach that augments a Bayesian regression mixture model with multiple elastic nets. Using the enhanced mixture model, we extract explanations for a target model through global approximation. To demonstrate the utility of our approach, we evaluate it on different learning models covering the tasks of text mining and image recognition. Our results indicate that the proposed approach not only outperforms the state-of-the-art technique in explaining individual decisions but also provides users with the ability to discover the vulnerabilities of a learning model. |
Tasks | |
Published | 2017-05-23 |
URL | http://arxiv.org/abs/1705.08564v1 |
http://arxiv.org/pdf/1705.08564v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-interrogating-discriminative-machine |
Repo | |
Framework | |
Evaluating Visual Conversational Agents via Cooperative Human-AI Games
Title | Evaluating Visual Conversational Agents via Cooperative Human-AI Games |
Authors | Prithvijit Chattopadhyay, Deshraj Yadav, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, Devi Parikh |
Abstract | As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is crucial to benchmark progress in AI, not just in isolation, but also in terms of how it translates to helping humans perform certain tasks, i.e., the performance of human-AI teams. In this work, we design a cooperative game - GuessWhich - to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI. The AI, which we call ALICE, is provided an image which is unseen by the human. Following a brief description of the image, the human questions ALICE about this secret image to identify it from a fixed pool of images. We measure performance of the human-ALICE team by the number of guesses it takes the human to correctly identify the secret image after a fixed number of dialog rounds with ALICE. We compare performance of the human-ALICE teams for two versions of ALICE. Our human studies suggest a counterintuitive trend - that while AI literature shows that one version outperforms the other when paired with an AI questioner bot, we find that this improvement in AI-AI performance does not translate to improved human-AI performance. This suggests a mismatch between benchmarking of AI in isolation and in the context of human-AI teams. |
Tasks | |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05122v1 |
http://arxiv.org/pdf/1708.05122v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-visual-conversational-agents-via |
Repo | |
Framework | |
Financial Series Prediction: Comparison Between Precision of Time Series Models and Machine Learning Methods
Title | Financial Series Prediction: Comparison Between Precision of Time Series Models and Machine Learning Methods |
Authors | Xin-Yao Qian |
Abstract | Precise financial series prediction has long been a difficult problem because of the instability of, and the many noise sources within, such series. Although traditional time series models like ARIMA and GARCH have been well researched and proven effective for prediction, their performance is still far from satisfying. Machine learning, as an emerging research field in recent years, has brought about many incredible improvements in tasks such as regression and classification, and it is also promising to exploit this methodology in financial time series prediction. In this paper, the predictive precision of traditional time series models and mainstream machine learning models, including some state-of-the-art deep learning models, is compared through experiments on real historical stock index data. The results show that machine learning methods, as a modern approach, far surpass traditional models in precision. |
Tasks | Time Series |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00948v5 |
http://arxiv.org/pdf/1706.00948v5.pdf | |
PWC | https://paperswithcode.com/paper/financial-series-prediction-comparison |
Repo | |
Framework | |
Human Understandable Explanation Extraction for Black-box Classification Models Based on Matrix Factorization
Title | Human Understandable Explanation Extraction for Black-box Classification Models Based on Matrix Factorization |
Authors | Jaedeok Kim, Jingoo Seo |
Abstract | In recent years, a number of artificial intelligence services have been developed, such as defect detection systems or diagnosis systems for customer services. Unfortunately, the core of these services is a black box whose underlying decision-making logic humans cannot understand, even though inspection of that logic is crucial before launching a commercial service. Our goal in this paper is to propose an analytic method of model explanation that is applicable to general classification models. To this end, we introduce the concepts of a contribution matrix and an explanation embedding in a constraint space obtained by matrix factorization. We extract a rule-like model explanation from the contribution matrix with the help of nonnegative matrix factorization. To validate our method, we provide experimental results on open datasets as well as an industry dataset of LTE network diagnosis, and the results show that our method extracts reasonable explanations. |
Tasks | Decision Making |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06201v1 |
http://arxiv.org/pdf/1709.06201v1.pdf | |
PWC | https://paperswithcode.com/paper/human-understandable-explanation-extraction |
Repo | |
Framework | |
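The computational core the abstract names, nonnegative matrix factorization of a "contribution matrix" V into W*H so that rows of H can be read as explanatory feature patterns, can be sketched with Lee-Seung multiplicative updates. The matrix values and rank below are made up for illustration; how the contribution matrix is built and how rules are read off H follow the paper.

```python
import random

random.seed(0)

# Hypothetical contribution matrix: 4 instances x 3 features, rank 2.
V = [[1.0, 0.0, 2.0],
     [2.0, 0.0, 4.0],
     [0.0, 3.0, 0.0],
     [0.0, 1.5, 0.0]]
n, m, r = len(V), len(V[0]), 2
W = [[random.random() for _ in range(r)] for _ in range(n)]
H = [[random.random() for _ in range(m)] for _ in range(r)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def frob_err(V, WH):
    return sum((V[i][j] - WH[i][j]) ** 2 for i in range(n) for j in range(m))

err0 = frob_err(V, matmul(W, H))
for _ in range(500):  # multiplicative updates keep W and H nonnegative
    Wt = transpose(W)
    num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
    H = [[H[a][j] * num[a][j] / (den[a][j] + 1e-9) for j in range(m)] for a in range(r)]
    Ht = transpose(H)
    num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
    W = [[W[i][a] * num[i][a] / (den[i][a] + 1e-9) for a in range(r)] for i in range(n)]

assert frob_err(V, matmul(W, H)) < 0.1 * err0  # rank-2 structure captured
```

After convergence, each row of H groups features that contribute together, which is the raw material from which the paper's rule-like explanations are assembled.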