Paper Group ANR 226
On Learning High Dimensional Structured Single Index Models
Title | On Learning High Dimensional Structured Single Index Models |
Authors | Nikhil Rao, Ravi Ganti, Laura Balzano, Rebecca Willett, Robert Nowak |
Abstract | Single Index Models (SIMs) are simple yet flexible semi-parametric models for machine learning, where the response variable is modeled as a monotonic function of a linear combination of features. Estimation in this context requires learning both the feature weights and the nonlinear function that relates features to observations. While methods have been described to learn SIMs in the low dimensional regime, a method that can efficiently learn SIMs in high dimensions, and under general structural assumptions, has not been forthcoming. In this paper, we propose computationally efficient algorithms for SIM inference in high dimensions with structural constraints. Our general approach specializes to sparsity, group sparsity, and low-rank assumptions among others. Experiments show that the proposed method enjoys superior predictive performance when compared to generalized linear models, and achieves results comparable to or better than single layer feedforward neural networks with significantly less computational cost. |
Tasks | |
Published | 2016-03-13 |
URL | http://arxiv.org/abs/1603.03980v2 |
http://arxiv.org/pdf/1603.03980v2.pdf | |
PWC | https://paperswithcode.com/paper/on-learning-high-dimensional-structured |
Repo | |
Framework | |
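As an illustration of the model class described in the abstract above, the following minimal sketch shows the two ingredients of a single index model: a prediction of the form g(<w, x>) with a monotonic g, and a monotone least-squares fit via the standard Pool Adjacent Violators (PAV) procedure. This is not the authors' algorithm; the function names `sim_predict` and `pav` are hypothetical, and PAV is used here only as one common way to fit a non-decreasing function.

```python
import math


def sim_predict(w, x, g):
    """Single index model: the response is g(<w, x>) for a monotonic g."""
    index = sum(wi * xi for wi, xi in zip(w, x))
    return g(index)


def pav(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit.

    `values` are responses already sorted by the index <w, x>.
    Illustrative only; not the estimator proposed in the paper.
    """
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted
```

For example, `pav([1, 3, 2, 4])` pools the out-of-order pair (3, 2) into their mean, and `sim_predict` with `math.tanh` as the link maps a zero index to a zero response.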
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
Title | Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? |
Authors | Abhishek Das, Harsh Agrawal, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra |
Abstract | We conduct large-scale studies on ‘human attention’ in Visual Question Answering (VQA) to understand where humans choose to look to answer questions about images. We design and test multiple game-inspired novel attention-annotation interfaces that require the subject to sharpen regions of a blurred image to answer a question. Thus, we introduce the VQA-HAT (Human ATtention) dataset. We evaluate attention maps generated by state-of-the-art VQA models against human attention both qualitatively (via visualizations) and quantitatively (via rank-order correlation). Overall, our experiments show that current attention models in VQA do not seem to be looking at the same regions as humans. |
Tasks | Question Answering, Visual Question Answering |
Published | 2016-06-11 |
URL | http://arxiv.org/abs/1606.03556v2 |
http://arxiv.org/pdf/1606.03556v2.pdf | |
PWC | https://paperswithcode.com/paper/human-attention-in-visual-question-answering-1 |
Repo | |
Framework | |
The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction
Title | The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction |
Authors | Mahnoosh Kholghi, Lance De Vine, Laurianne Sitbon, Guido Zuccon, Anthony Nguyen |
Abstract | This study investigates the use of unsupervised word embeddings and sequence features for sample representation in an active learning framework built to extract clinical concepts from clinical free text. The objective is to further reduce the manual annotation effort while achieving higher effectiveness compared to a set of baseline features. Unsupervised features are derived from skip-gram word embeddings and a sequence representation approach. The comparative performance of unsupervised features and baseline hand-crafted features in an active learning framework is investigated using a wide range of selection criteria including least confidence, information diversity, information density and diversity, and domain knowledge informativeness. Two clinical datasets are used for evaluation: the i2b2/VA 2010 NLP challenge and the ShARe/CLEF 2013 eHealth Evaluation Lab. Our results demonstrate significant improvements in terms of effectiveness as well as annotation effort savings across both datasets. Using unsupervised features along with baseline features for sample representation leads to further savings of up to 9% and 10% of the token and concept annotation rates, respectively. |
Tasks | Active Learning, Word Embeddings |
Published | 2016-07-11 |
URL | http://arxiv.org/abs/1607.02810v4 |
http://arxiv.org/pdf/1607.02810v4.pdf | |
PWC | https://paperswithcode.com/paper/the-benefits-of-word-embeddings-features-for |
Repo | |
Framework | |
Five dimensions of reasoning in the wild
Title | Five dimensions of reasoning in the wild |
Authors | Don Perlis |
Abstract | Reasoning does not work well when done in isolation from its significance, both to the needs and interests of an agent and with respect to the wider world. Moreover, those issues may best be handled with a new sort of data structure that goes beyond the knowledge base and incorporates aspects of perceptual knowledge and even more, in which a kind of anticipatory action may be key. |
Tasks | |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06349v1 |
http://arxiv.org/pdf/1608.06349v1.pdf | |
PWC | https://paperswithcode.com/paper/five-dimensions-of-reasoning-in-the-wild |
Repo | |
Framework | |
Learning Compatibility Across Categories for Heterogeneous Item Recommendation
Title | Learning Compatibility Across Categories for Heterogeneous Item Recommendation |
Authors | Ruining He, Charles Packer, Julian McAuley |
Abstract | Identifying relationships between items is a key task of an online recommender system, in order to help users discover items that are functionally complementary or visually compatible. In domains like clothing recommendation, this task is particularly challenging since a successful system should be capable of handling a large corpus of items, a huge number of relationships among them, as well as the high-dimensional and semantically complicated features involved. Furthermore, the human notion of “compatibility” that such a system must capture goes beyond mere similarity: for two items to be compatible (whether jeans and a t-shirt, or a laptop and a charger), they should be similar in some ways, but systematically different in others. In this paper we propose a novel method, Monomer, to learn complicated and heterogeneous relationships between items in product recommendation settings. Recently, scalable methods have been developed that address this task by learning similarity metrics on top of the content of the products involved. Here our method relaxes the metricity assumption inherent in previous work and models multiple localized notions of ‘relatedness,’ so as to uncover ways in which related items should be systematically similar, and systematically different. Quantitatively, we show that our system achieves state-of-the-art performance on large-scale compatibility prediction tasks, especially in cases where there is substantial heterogeneity between related items. Qualitatively, we demonstrate that richer notions of compatibility can be learned that go beyond similarity, and that our model can make effective recommendations of heterogeneous content. |
Tasks | Product Recommendation, Recommendation Systems |
Published | 2016-03-31 |
URL | http://arxiv.org/abs/1603.09473v3 |
http://arxiv.org/pdf/1603.09473v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-compatibility-across-categories-for |
Repo | |
Framework | |
A Sparse Nonlinear Classifier Design Using AUC Optimization
Title | A Sparse Nonlinear Classifier Design Using AUC Optimization |
Authors | Vishal Kakkar, Shirish K. Shevade, S Sundararajan, Dinesh Garg |
Abstract | AUC (Area under the ROC curve) is an important performance measure for applications where the data is highly imbalanced. Learning to maximize AUC performance is thus an important research problem. Using a max-margin based surrogate loss function, the AUC optimization problem can be approximated as a pairwise rankSVM learning problem. Batch learning methods for solving the kernelized version of this problem suffer from poor scalability and may not result in sparse classifiers. Recent years have witnessed an increased interest in the development of online or single-pass online learning algorithms that design a classifier by maximizing the AUC performance. The AUC performance of nonlinear classifiers, designed using online methods, is not comparable with that of nonlinear classifiers designed using batch learning algorithms on many real-world datasets. Motivated by these observations, we design a scalable algorithm for maximizing AUC performance by greedily adding the required number of basis functions into the classifier model. The resulting sparse classifiers perform faster inference. Our experimental results show that the level of sparsity achievable can be an order of magnitude smaller than the Kernel RankSVM model without affecting the AUC performance much. |
Tasks | |
Published | 2016-12-27 |
URL | http://arxiv.org/abs/1612.08633v1 |
http://arxiv.org/pdf/1612.08633v1.pdf | |
PWC | https://paperswithcode.com/paper/a-sparse-nonlinear-classifier-design-using |
Repo | |
Framework | |
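The pairwise view of AUC mentioned in the abstract above can be made concrete with a small sketch: AUC is the fraction of (positive, negative) pairs that the scores order correctly, and a max-margin surrogate replaces each 0/1 pair indicator with a hinge loss on the score difference. The function names `auc` and `pairwise_hinge_loss` are hypothetical, and this is only the generic pairwise formulation, not the greedy basis-selection algorithm the paper proposes.

```python
def _split_scores(scores, labels):
    """Separate scores of positive (label 1) and negative examples."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def auc(scores, labels):
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos, neg = _split_scores(scores, labels)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Mean hinge loss over all positive/negative pairs (rankSVM-style surrogate)."""
    pos, neg = _split_scores(scores, labels)
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))
```

Both functions enumerate all pairs, so the cost grows with the product of the class sizes; this quadratic cost is one reason batch pairwise methods scale poorly.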
Self-Modification of Policy and Utility Function in Rational Agents
Title | Self-Modification of Policy and Utility Function in Rational Agents |
Authors | Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter |
Abstract | Any agent that is part of the environment it interacts with and has versatile actuators (such as arms and fingers), will in principle have the ability to self-modify – for example by changing its own source code. As we continue to create more and more intelligent agents, chances increase that they will learn about this ability. The question is: will they want to use it? For example, highly intelligent systems may find ways to change their goals to something more easily achievable, thereby ‘escaping’ the control of their designers. In an important paper, Omohundro (2008) argued that goal preservation is a fundamental drive of any intelligent system, since a goal is more likely to be achieved if future versions of the agent strive towards the same goal. In this paper, we formalise this argument in general reinforcement learning, and explore situations where it fails. Our conclusion is that the self-modification possibility is harmless if and only if the value function of the agent anticipates the consequences of self-modifications and uses the current utility function when evaluating the future. |
Tasks | |
Published | 2016-05-10 |
URL | http://arxiv.org/abs/1605.03142v1 |
http://arxiv.org/pdf/1605.03142v1.pdf | |
PWC | https://paperswithcode.com/paper/self-modification-of-policy-and-utility |
Repo | |
Framework | |
Interpretation of Prediction Models Using the Input Gradient
Title | Interpretation of Prediction Models Using the Input Gradient |
Authors | Yotam Hechtlinger |
Abstract | State of the art machine learning algorithms are highly optimized to provide the best prediction possible, naturally resulting in complex models. While these models often outperform simpler, more interpretable models by orders of magnitude, in terms of understanding the way the model functions, we are often facing a “black box”. In this paper we suggest a simple method to interpret the behavior of any predictive model, both for regression and classification. Given a particular model, the information required to interpret it can be obtained by studying the partial derivatives of the model with respect to the input. We exemplify this insight by interpreting convolutional and multi-layer neural networks in the field of natural language processing. |
Tasks | |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07634v1 |
http://arxiv.org/pdf/1611.07634v1.pdf | |
PWC | https://paperswithcode.com/paper/interpretation-of-prediction-models-using-the |
Repo | |
Framework | |
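The idea in the abstract above, studying the partial derivatives of a model with respect to its input, can be sketched for any black-box scalar model using central finite differences. This is an assumption made for illustration: the paper's models are neural networks, for which gradients would normally be computed analytically. The name `input_gradient` and the toy model are hypothetical.

```python
def input_gradient(model, x, eps=1e-5):
    """Approximate d model / d x_i at point x with central differences.

    `model` is any callable mapping a list of floats to a float.
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((model(up) - model(down)) / (2.0 * eps))
    return grad


def toy_model(x):
    """Hypothetical model: linear in x[0], quadratic in x[1]."""
    return 3.0 * x[0] + x[1] ** 2


# At x = [1, 2] the exact gradient is [3, 2 * 2] = [3, 4].
gradient_at_point = input_gradient(toy_model, [1.0, 2.0])
```

A large magnitude in one coordinate of the gradient indicates that the prediction is locally sensitive to that input feature, which is the sense in which the gradient supports interpretation.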
A Survey of Brain Inspired Technologies for Engineering
Title | A Survey of Brain Inspired Technologies for Engineering |
Authors | Jarryd Son, Amit Kumar Mishra |
Abstract | Cognitive engineering is a multi-disciplinary field and hence it is difficult to find a review article consolidating the leading developments in the field. The incredible pace at which technology is advancing pushes the boundaries of what is achievable in cognitive engineering. There are also differing approaches to cognitive engineering brought about from the multi-disciplinary nature of the field and the vastness of possible applications. Thus research communities require more frequent reviews to keep up to date with the latest trends. In this paper we shall discuss some of the approaches to cognitive engineering holistically to clarify the reasoning behind the different approaches and to highlight their strengths and weaknesses. We shall then show how developments from seemingly disjointed views could be integrated to achieve the same goal of creating cognitive machines. By reviewing the major contributions in the different fields and showing the potential for a combined approach, this work intends to assist the research community in devising more unified methods and techniques for developing cognitive machines. |
Tasks | |
Published | 2016-10-31 |
URL | http://arxiv.org/abs/1610.09882v1 |
http://arxiv.org/pdf/1610.09882v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-brain-inspired-technologies-for |
Repo | |
Framework | |
Neural shrinkage for wavelet-based SAR despeckling
Title | Neural shrinkage for wavelet-based SAR despeckling |
Authors | Mario Mastriani, Alberto E. Giraldez |
Abstract | The wavelet shrinkage denoising approach is able to maintain local regularity of a signal while suppressing noise. However, the conventional wavelet shrinkage based methods are not time-scale adaptive to track the local time-scale variation. In this paper, a new type of Neural Shrinkage (NS) is presented with a new class of shrinkage architecture for speckle reduction in Synthetic Aperture Radar (SAR) images. The numerical results indicate that the new method outperforms the standard filters, the standard wavelet shrinkage despeckling method, and previous NS. |
Tasks | Denoising |
Published | 2016-07-31 |
URL | http://arxiv.org/abs/1608.00279v1 |
http://arxiv.org/pdf/1608.00279v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-shrinkage-for-wavelet-based-sar |
Repo | |
Framework | |
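For context, the conventional wavelet shrinkage that the abstract above takes as its baseline can be sketched as: transform the signal, soft-threshold the detail coefficients, and invert the transform. The sketch below uses a one-level Haar transform on a 1-D signal with a fixed threshold; these choices, and all function names, are assumptions for illustration. It does not implement the paper's Neural Shrinkage, nor 2-D SAR image processing.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar transform of an even-length sequence."""
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(c, t):
    """Shrink coefficient c towards zero by t; small coefficients become 0."""
    if c > t:
        return c - t
    if c < -t:
        return c + t
    return 0.0


def shrink_denoise(signal, t):
    """Denoise by soft-thresholding the Haar detail coefficients."""
    approx, detail = haar_forward(signal)
    detail = [soft_threshold(d, t) for d in detail]
    return haar_inverse(approx, detail)
```

With a threshold of zero the signal is reconstructed unchanged, while a threshold larger than every detail coefficient replaces each pair of samples with its mean. The fixed, signal-independent threshold is the non-adaptive behavior the abstract identifies as a limitation.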
“Model and Run” Constraint Networks with a MILP Engine
Title | “Model and Run” Constraint Networks with a MILP Engine |
Authors | Thierry Petit |
Abstract | Constraint Programming (CP) users need significant expertise in order to model their problems appropriately, notably to select propagators and search strategies. This puts the brakes on a broader uptake of CP. In this paper, we introduce MICE, a complete Java CP modeler that can use any Mixed Integer Linear Programming (MILP) solver as a solution technique. Our aim is to provide an alternative tool for democratizing the “CP-style” modeling thanks to its simplicity of use, with reasonable solving capabilities. Our contributions include new decompositions of (reified) constraints and constraints on numerical variables. |
Tasks | |
Published | 2016-11-27 |
URL | http://arxiv.org/abs/1611.08908v1 |
http://arxiv.org/pdf/1611.08908v1.pdf | |
PWC | https://paperswithcode.com/paper/model-and-run-constraint-networks-with-a-milp |
Repo | |
Framework | |
Bounds for Vector-Valued Function Estimation
Title | Bounds for Vector-Valued Function Estimation |
Authors | Andreas Maurer, Massimiliano Pontil |
Abstract | We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multi-task learning are equally applicable to multi-category learning. |
Tasks | Multi-Task Learning |
Published | 2016-06-05 |
URL | http://arxiv.org/abs/1606.01487v1 |
http://arxiv.org/pdf/1606.01487v1.pdf | |
PWC | https://paperswithcode.com/paper/bounds-for-vector-valued-function-estimation |
Repo | |
Framework | |
Developing Quantum Annealer Driven Data Discovery
Title | Developing Quantum Annealer Driven Data Discovery |
Authors | Joseph Dulny III, Michael Kim |
Abstract | Machine learning applications are limited by computational power. In this paper, we gain novel insights into the application of quantum annealing (QA) to machine learning (ML) through experiments in natural language processing (NLP), seizure prediction, and linear separability testing. These experiments are performed on QA simulators and early-stage commercial QA hardware and compared to an unprecedented number of traditional ML techniques. We extend QBoost, an early implementation of a binary classifier that utilizes a quantum annealer, via resampling and ensembling of predicted probabilities to produce a more robust class estimator. To determine the strengths and weaknesses of this approach, resampled QBoost (RQBoost) is tested across several datasets and compared to QBoost and traditional ML. We show and explain how QBoost in combination with a commercial QA device is unable to perfectly separate binary class data which is linearly separable via logistic regression with shrinkage. We further explore the performance of RQBoost in the space of NLP and seizure prediction and find that QA-enabled ML using QBoost and RQBoost is outperformed by traditional techniques. Additionally, we provide a detailed discussion of algorithmic constraints and trade-offs imposed by the use of this QA hardware. Through these experiments, we provide unique insights into the state of quantum ML via boosting and the use of quantum annealing hardware that are valuable to institutions interested in applying QA to problems in ML and beyond. |
Tasks | Seizure prediction |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07980v1 |
http://arxiv.org/pdf/1603.07980v1.pdf | |
PWC | https://paperswithcode.com/paper/developing-quantum-annealer-driven-data |
Repo | |
Framework | |
Sliding Dictionary Based Sparse Representation For Action Recognition
Title | Sliding Dictionary Based Sparse Representation For Action Recognition |
Authors | Yashas Annadani, D L Rakshith, Soma Biswas |
Abstract | The task of action recognition has been at the forefront of research, given its applications in gaming, surveillance and health care. In this work, we propose a simple, yet very effective approach which works seamlessly for both offline and online action recognition using the skeletal joints. We construct a sliding dictionary which has the training data along with their time stamps. This is used to compute the sparse coefficients of the input action sequence, which is divided into overlapping windows, and each window gives a probability score for each action class. In addition, we compute another simple feature, which calibrates each of the action sequences to the training sequences, and models the deviation of the action from each of the training sequences. Finally, a score level fusion of the two heterogeneous but complementary features for each window is obtained and the scores for the available windows are successively combined to give the confidence scores of each action class. This way of combining the scores makes the approach suitable for scenarios where only part of the sequence is available. Extensive experimental evaluation on three publicly available datasets shows the effectiveness of the proposed approach for both offline and online action recognition tasks. |
Tasks | Temporal Action Localization |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00218v1 |
http://arxiv.org/pdf/1611.00218v1.pdf | |
PWC | https://paperswithcode.com/paper/sliding-dictionary-based-sparse |
Repo | |
Framework | |
Differentially Private Policy Evaluation
Title | Differentially Private Policy Evaluation |
Authors | Borja Balle, Maziar Gomrokchi, Doina Precup |
Abstract | We present the first differentially private algorithms for reinforcement learning, which apply to the task of evaluating a fixed policy. We establish two approaches for achieving differential privacy, provide a theoretical analysis of the privacy and utility of the two algorithms, and show promising results on simple empirical examples. |
Tasks | |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.02010v1 |
http://arxiv.org/pdf/1603.02010v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-policy-evaluation |
Repo | |
Framework | |