Paper Group ANR 507
Transfer Learning for Endoscopic Image Classification. Parameter Compression of Recurrent Neural Networks and Degradation of Short-term Memory. Using Self-Contradiction to Learn Confidence Measures in Stereo Vision. TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions. Multi-source Hierarchical Prediction Consolidation. Impro …
Transfer Learning for Endoscopic Image Classification
Title | Transfer Learning for Endoscopic Image Classification |
Authors | Shoji Sonoyama, Toru Tamaki, Tsubasa Hirakawa, Bisser Raytchev, Kazufumi Kaneda, Tetsushi Koide, Shigeto Yoshida, Hiroshi Mieno, Shinji Tanaka |
Abstract | In this paper we propose a method for transfer learning of endoscopic images. To transfer between features obtained from images taken by different (old and new) endoscopes, we extend the Max-Margin Domain Transfer (MMDT) proposed by Hoffman et al. to use L2 distance constraints as regularization, which we call Max-Margin Domain Transfer with L2 Distance Constraints (MMDTL2). Furthermore, we develop the dual formulation of the optimization problem in order to reduce the computation cost. Experimental results demonstrate that the proposed MMDTL2 outperforms MMDT on real data sets taken by different endoscopes. |
Tasks | Image Classification, Transfer Learning |
Published | 2016-08-24 |
URL | http://arxiv.org/abs/1608.06713v1 |
http://arxiv.org/pdf/1608.06713v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-for-endoscopic-image |
Repo | |
Framework | |
Parameter Compression of Recurrent Neural Networks and Degradation of Short-term Memory
Title | Parameter Compression of Recurrent Neural Networks and Degradation of Short-term Memory |
Authors | Jonathan A. Cox |
Abstract | The significant computational costs of deploying neural networks in large-scale or resource-constrained environments, such as data centers and mobile devices, have spurred interest in model compression, which can achieve a reduction in both arithmetic operations and storage memory. Several techniques have been proposed for reducing or compressing the parameters of feed-forward and convolutional neural networks, but less is understood about the effect of parameter compression on recurrent neural networks (RNNs). In particular, the extent to which the recurrent parameters can be compressed, and the impact on short-term memory performance, are not well understood. In this paper, we study the effect of complexity reduction, through singular value decomposition rank reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several tasks. We show that considerable rank reduction is possible when compressing recurrent weights, even without fine-tuning. Furthermore, we propose a perturbation model for the effect of general perturbations, such as compression, on the recurrent parameters of RNNs. The model is tested against a noiseless memorization experiment that elucidates the short-term memory performance. In this way, we demonstrate that the effect of compressing recurrent parameters depends on the degree of temporal coherence present in the data and task. This work can guide on-the-fly RNN compression for novel environments or tasks, and provides insight for applying RNN compression in low-power devices, such as hearing aids. |
Tasks | Model Compression |
Published | 2016-12-02 |
URL | http://arxiv.org/abs/1612.00891v2 |
http://arxiv.org/pdf/1612.00891v2.pdf | |
PWC | https://paperswithcode.com/paper/parameter-compression-of-recurrent-neural |
Repo | |
Framework | |
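The core compression operation described in the abstract, rank reduction of a recurrent weight matrix via singular value decomposition, can be sketched in a few lines. This is a generic illustration (random weights, arbitrary rank), not the paper's experimental setup:

```python
import numpy as np

def truncate_rank(W, r):
    """Approximate W by its best rank-r factorization via SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keeping only the r largest singular values means the compressed
    # model stores r*(m+n) parameters instead of m*n.
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)) / 8.0      # stand-in recurrent weight matrix
W_compressed = truncate_rank(W, 16)
err = np.linalg.norm(W - W_compressed) / np.linalg.norm(W)
```

In practice the factors `U[:, :r] * s[:r]` and `Vt[:r, :]` would be stored separately and applied as two smaller matrix multiplications.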
Using Self-Contradiction to Learn Confidence Measures in Stereo Vision
Title | Using Self-Contradiction to Learn Confidence Measures in Stereo Vision |
Authors | Christian Mostegel, Markus Rumpler, Friedrich Fraundorfer, Horst Bischof |
Abstract | Learned confidence measures gain increasing importance for outlier removal and quality improvement in stereo vision. However, acquiring the necessary training data is typically a tedious and time-consuming task that involves manual interaction, active sensing devices and/or synthetic scenes. To overcome this problem, we propose a new, flexible, and scalable way of generating training data that only requires a set of stereo images as input. The key idea of our approach is to use different viewpoints for reasoning about contradictions and consistencies between multiple depth maps generated with the same stereo algorithm. This enables us to generate a huge amount of training data in a fully automated manner. Among other experiments, we demonstrate the potential of our approach by boosting the performance of three learned confidence measures on the KITTI2012 dataset by simply training them on a vast amount of automatically generated training data rather than a limited amount of laser ground-truth data. |
Tasks | |
Published | 2016-04-18 |
URL | http://arxiv.org/abs/1604.05132v1 |
http://arxiv.org/pdf/1604.05132v1.pdf | |
PWC | https://paperswithcode.com/paper/using-self-contradiction-to-learn-confidence |
Repo | |
Framework | |
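The paper reasons about contradictions across many depth maps from different viewpoints; the simplest instance of that idea is a left-right disparity consistency check, sketched below. Function names and tolerances are illustrative, not the paper's algorithm:

```python
import numpy as np

def lr_consistency_mask(disp_left, disp_right, tol=1.0):
    """Flag pixels whose two disparity maps agree with each other.
    A pixel x in the left map should land at x - d in the right map,
    whose disparity should match d up to tol; disagreement is a
    self-contradiction that marks a likely outlier."""
    h, w = disp_left.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            d = disp_left[y, x]
            xr = int(round(x - d))
            if 0 <= xr < w:
                mask[y, x] = abs(disp_right[y, xr] - d) <= tol
    return mask

disp_left = np.full((2, 4), 1.0)    # toy maps that agree everywhere
disp_right = np.full((2, 4), 1.0)
mask = lr_consistency_mask(disp_left, disp_right)
```

Pixels whose match falls outside the image stay flagged as inconsistent, which is why the leftmost column of the toy example is masked out.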
TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions
Title | TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions |
Authors | Sujay Kumar Jauhar, Peter Turney, Eduard Hovy |
Abstract | We describe two new related resources that facilitate modelling of general knowledge reasoning in 4th grade science exams. The first is a collection of curated facts in the form of tables, and the second is a large set of crowd-sourced multiple-choice questions covering the facts in the tables. Through the setup of the crowd-sourced annotation task we obtain implicit alignment information between questions and tables. We envisage that the resources will be useful not only to researchers working on question answering, but also to people investigating a diverse range of other applications such as information extraction, question parsing, answer type identification, and lexical semantic modelling. |
Tasks | Question Answering |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.03960v1 |
http://arxiv.org/pdf/1602.03960v1.pdf | |
PWC | https://paperswithcode.com/paper/tabmcq-a-dataset-of-general-knowledge-tables |
Repo | |
Framework | |
Multi-source Hierarchical Prediction Consolidation
Title | Multi-source Hierarchical Prediction Consolidation |
Authors | Chenwei Zhang, Sihong Xie, Yaliang Li, Jing Gao, Wei Fan, Philip S. Yu |
Abstract | In big data applications such as healthcare data mining, due to privacy concerns, it is necessary to collect predictions from multiple information sources for the same instance, with raw features being discarded or withheld when aggregating multiple predictions. Besides, crowd-sourced labels need to be aggregated to estimate the ground truth of the data. Because of imperfect predictive models or human crowdsourcing workers, noisy and conflicting information is ubiquitous and inevitable. Although state-of-the-art aggregation methods have been proposed to handle label spaces with flat structures, as the label space becomes more and more complicated, aggregation under a hierarchical label structure becomes necessary but has been largely ignored. These label hierarchies can be quite informative, as they are usually created by domain experts to make sense of highly complex label correlations in many real-world cases such as protein functionality interactions or disease relationships. We propose a novel multi-source hierarchical prediction consolidation method that effectively exploits the complicated hierarchical label structures to resolve the noisy and conflicting information that inherently originates from multiple imperfect sources. We formulate the problem as an optimization problem with a closed-form solution. The proposed method captures the smoothness over all information sources while penalizing any consolidation result that violates the constraints derived from the label hierarchy. The hierarchical instance similarity, as well as the consolidation result, are inferred in a totally unsupervised, iterative fashion. Experimental results on both synthetic and real-world datasets show the effectiveness of the proposed method over existing alternatives. |
Tasks | |
Published | 2016-08-11 |
URL | http://arxiv.org/abs/1608.03344v1 |
http://arxiv.org/pdf/1608.03344v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-source-hierarchical-prediction |
Repo | |
Framework | |
Improving Image Captioning by Concept-based Sentence Reranking
Title | Improving Image Captioning by Concept-based Sentence Reranking |
Authors | Xirong Li, Qin Jin |
Abstract | This paper describes our winning entry in the ImageCLEF 2015 image sentence generation task. We improve Google’s CNN-LSTM model by introducing concept-based sentence reranking, a data-driven approach which exploits the large amounts of concept-level annotations on Flickr. Different from previous uses of concept detection that are tailored to specific image captioning models, the proposed approach reranks predicted sentences in terms of their matches with detected concepts, essentially treating the underlying model as a black box. This property makes the approach applicable to a number of existing solutions. We also experiment with fine-tuning the deep language model, which improves the performance further. Scoring a METEOR of 0.1875 on the ImageCLEF 2015 test set, our system outperforms the runner-up (METEOR of 0.1687) by a clear margin. |
Tasks | Image Captioning, Language Modelling |
Published | 2016-05-03 |
URL | http://arxiv.org/abs/1605.00855v1 |
http://arxiv.org/pdf/1605.00855v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-image-captioning-by-concept-based |
Repo | |
Framework | |
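Because the captioning model is treated as a black box, reranking reduces to re-scoring its candidate sentences by agreement with detected concepts. The combination rule and names below are illustrative, not the paper's exact scoring function:

```python
def rerank_by_concepts(candidates, detected_concepts, weight=0.5):
    """Blend each caption's model score with its overlap against the
    detected concept set; `candidates` is a list of (caption, score)."""
    reranked = []
    for caption, score in candidates:
        words = set(caption.lower().split())
        overlap = len(words & detected_concepts) / max(len(detected_concepts), 1)
        reranked.append((caption, (1 - weight) * score + weight * overlap))
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)

candidates = [("a man rides a horse", 0.9),
              ("a man rides a brown horse on a beach", 0.8)]
concepts = {"horse", "beach", "brown"}
best = rerank_by_concepts(candidates, concepts)[0][0]
```

Here the lower-scored candidate wins after reranking because it mentions all three detected concepts, which is the behavior the black-box reranking relies on.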
Time Resource Networks
Title | Time Resource Networks |
Authors | Szymon Sidor, Peng Yu, Cheng Fang, Brian Williams |
Abstract | The problem of scheduling under resource constraints is widely applicable. One prominent example is power management, in which we have a limited continuous supply of power but must schedule a number of power-consuming tasks. Such problems feature tightly coupled continuous resource constraints and continuous temporal constraints. We address such problems by introducing the Time Resource Network (TRN), an encoding for resource-constrained scheduling problems. The definition allows temporal specifications using a general family of representations derived from the Simple Temporal Network, including the Simple Temporal Network with Uncertainty and the probabilistic Simple Temporal Network (Fang et al. (2014)). We propose two algorithms for determining the consistency of a TRN: one based on Mixed Integer Programming and the other on Constraint Programming, which we evaluate on scheduling problems with Simple Temporal Constraints and Probabilistic Temporal Constraints. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.03203v1 |
http://arxiv.org/pdf/1602.03203v1.pdf | |
PWC | https://paperswithcode.com/paper/time-resource-networks |
Repo | |
Framework | |
Quantum Monte Carlo simulation of a particular class of non-stoquastic Hamiltonians in quantum annealing
Title | Quantum Monte Carlo simulation of a particular class of non-stoquastic Hamiltonians in quantum annealing |
Authors | Masayuki Ohzeki |
Abstract | Quantum annealing is a generic solver for optimization problems that uses fictitious quantum fluctuations. Its simulation on classical computers is often performed using quantum Monte Carlo simulation via the Suzuki–Trotter decomposition. However, the negative sign problem sometimes emerges in the simulation of quantum annealing with an elaborate driver Hamiltonian, since it belongs to a class of non-stoquastic Hamiltonians. In the present study, we propose an alternative way to avoid the negative sign problem for a particular class of non-stoquastic Hamiltonians. To check its validity, we demonstrate the method by applying it to a simple problem that includes the anti-ferromagnetic XX interaction, a typical instance of a non-stoquastic Hamiltonian. |
Tasks | |
Published | 2016-12-14 |
URL | http://arxiv.org/abs/1612.04785v1 |
http://arxiv.org/pdf/1612.04785v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-monte-carlo-simulation-of-a |
Repo | |
Framework | |
Differentially Private Variational Inference for Non-conjugate Models
Title | Differentially Private Variational Inference for Non-conjugate Models |
Authors | Joonas Jälkö, Onur Dikmen, Antti Honkela |
Abstract | Many machine learning applications are based on data collected from people, such as their tastes and behaviour as well as biological traits and genetic data. Regardless of how important the application might be, one has to make sure individuals’ identities or the privacy of the data are not compromised in the analysis. Differential privacy constitutes a powerful framework that prevents breaching of data subject privacy from the output of a computation. Differentially private versions of many important Bayesian inference methods have been proposed, but there is a lack of an efficient unified approach applicable to arbitrary models. In this contribution, we propose a differentially private variational inference method with very wide applicability. It is built on top of doubly stochastic variational inference, a recent advance which provides a variational solution to a large class of models. We add differential privacy into doubly stochastic variational inference by clipping and perturbing the gradients. The algorithm is made more efficient through privacy amplification from subsampling. We demonstrate that the method can reach an accuracy close to the non-private level under reasonably strong privacy guarantees, clearly improving over previous sampling-based alternatives, especially in the strong privacy regime. |
Tasks | Bayesian Inference |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08749v2 |
http://arxiv.org/pdf/1610.08749v2.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-variational-inference |
Repo | |
Framework | |
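The privacy mechanism named in the abstract, clipping per-example gradients and perturbing their sum with Gaussian noise, can be sketched as follows. The function name and noise scale are illustrative and not calibrated to any particular (epsilon, delta) budget:

```python
import numpy as np

def privatize_gradients(grads, clip_norm=1.0, noise_scale=1.0, rng=None):
    """Clip each per-example gradient to clip_norm in L2, sum them,
    then add Gaussian noise proportional to the clipping bound."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_scale * clip_norm, size=total.shape)
    return total + noise

grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]   # toy per-example gradients
private_sum = privatize_gradients(grads, clip_norm=1.0, noise_scale=0.5)
```

Clipping bounds each example's influence on the sum, which is what makes the added Gaussian noise sufficient for a differential privacy guarantee.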
Boost Picking: A Universal Method on Converting Supervised Classification to Semi-supervised Classification
Title | Boost Picking: A Universal Method on Converting Supervised Classification to Semi-supervised Classification |
Authors | Fuqiang Liu, Fukun Bi, Yiding Yang, Liang Chen |
Abstract | This paper proposes a universal method, Boost Picking, to train supervised classification models mainly with unlabeled data. Boost Picking adopts only two weak classifiers to estimate and correct the error. It is theoretically proved that Boost Picking can train a supervised model mainly with unlabeled data as effectively as the same model trained with 100% labeled data, provided that the recalls of the two weak classifiers are both greater than zero and the sum of their precisions is greater than one. Based on Boost Picking, we present “Test along with Training (TawT)” to improve the generalization of supervised models. Both Boost Picking and TawT are successfully tested on various small data sets. |
Tasks | |
Published | 2016-02-18 |
URL | http://arxiv.org/abs/1602.05659v3 |
http://arxiv.org/pdf/1602.05659v3.pdf | |
PWC | https://paperswithcode.com/paper/boost-picking-a-universal-method-on |
Repo | |
Framework | |
When is multitask learning effective? Semantic sequence prediction under varying data conditions
Title | When is multitask learning effective? Semantic sequence prediction under varying data conditions |
Authors | Héctor Martínez Alonso, Barbara Plank |
Abstract | Multitask learning has been applied successfully to a range of tasks, mostly morphosyntactic. However, little is known about when MTL works and whether there are data characteristics that help to determine its success. In this paper we evaluate a range of semantic sequence labeling tasks in an MTL setup. We examine different auxiliary tasks, among them a novel setup, and correlate their impact with data-dependent conditions. Our results show that MTL is not always effective: significant improvements are obtained for only 1 out of 5 tasks. When successful, auxiliary tasks with compact and more uniform label distributions are preferable. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02251v2 |
http://arxiv.org/pdf/1612.02251v2.pdf | |
PWC | https://paperswithcode.com/paper/when-is-multitask-learning-effective-semantic |
Repo | |
Framework | |
A Multivariate Hawkes Process with Gaps in Observations
Title | A Multivariate Hawkes Process with Gaps in Observations |
Authors | Triet M Le |
Abstract | Given a collection of entities (or nodes) in a network and our intermittent observations of activities from each entity, an important problem is to learn the hidden edges depicting directional relationships among these entities. Here, we study causal relationships (excitations) that are realized by a multivariate Hawkes process. The multivariate Hawkes process (MHP) and its variations (spatio-temporal point processes) have been used to study contagion in earthquakes, crimes, neural spiking activities, the stock and foreign exchange markets, etc. In this paper, we consider the multivariate Hawkes process with gaps in observations (MHPG). We propose a variational problem for detecting sparsely hidden relationships with a multivariate Hawkes process that takes into account the gaps from each entity. We bypass the problem of dealing with a large number of missing events by introducing a small number of unknown boundary conditions. In the case where our observations are sparse (e.g. from 10% to 30%), we show through numerical simulations that robust recovery with MHPG is still possible even when the observed intervals are short, provided they are chosen appropriately. The numerical results also show that knowledge of the gaps and imposing the right boundary conditions are crucial for discovering the underlying patterns and hidden relationships. |
Tasks | Point Processes |
Published | 2016-08-03 |
URL | http://arxiv.org/abs/1608.01282v3 |
http://arxiv.org/pdf/1608.01282v3.pdf | |
PWC | https://paperswithcode.com/paper/a-multivariate-hawkes-process-with-gaps-in |
Repo | |
Framework | |
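The multivariate Hawkes process at the heart of this paper is defined by a conditional intensity in which each past event excites future events across nodes. A minimal sketch with exponential kernels follows; the baseline rates, excitation matrix, and event times are made up for illustration:

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda_i(t) = mu_i + sum_j alpha[i, j] * sum_{s in events[j], s < t}
    beta * exp(-beta * (t - s)), the MHP intensity with exponential kernels."""
    lam = mu.copy()
    for j, times in enumerate(events):
        past = np.asarray([s for s in times if s < t])
        if past.size:
            kernel = beta * np.exp(-beta * (t - past))
            # Column j of alpha says how strongly node j excites each node.
            lam += alpha[:, j] * kernel.sum()
    return lam

mu = np.array([0.2, 0.1])                    # baseline rates per node
alpha = np.array([[0.0, 0.5],                # cross-excitation: node 1 -> node 0
                  [0.3, 0.0]])               #                   node 0 -> node 1
events = [[1.0, 2.5], [2.0]]                 # observed event times per node
lam = hawkes_intensity(3.0, events, mu, alpha, beta=1.0)
```

Learning the hidden edges amounts to recovering a sparse `alpha` from event data, which is what the gaps in observations complicate.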
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling
Title | Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling |
Authors | Chengtao Li, Stefanie Jegelka, Suvrit Sra |
Abstract | We study probability measures induced by set functions with constraints. Such measures arise in a variety of real-world settings, where prior knowledge, resource limitations, or other pragmatic considerations impose constraints. We consider the task of rapidly sampling from such constrained measures, and develop fast Markov chain samplers for them. Our first main result is for MCMC sampling from Strongly Rayleigh (SR) measures, for which we present sharp polynomial bounds on the mixing time. As a corollary, this result yields a fast mixing sampler for Determinantal Point Processes (DPPs), yielding (to our knowledge) the first provably fast MCMC sampler for DPPs since their inception over four decades ago. Beyond SR measures, we develop MCMC samplers for probabilistic models with hard constraints and identify sufficient conditions under which their chains mix rapidly. We illustrate our claims by empirically verifying the dependence of mixing times on the key factors governing our theoretical bounds. |
Tasks | Point Processes |
Published | 2016-08-02 |
URL | http://arxiv.org/abs/1608.01008v3 |
http://arxiv.org/pdf/1608.01008v3.pdf | |
PWC | https://paperswithcode.com/paper/fast-mixing-markov-chains-for-strongly |
Repo | |
Framework | |
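The kind of chain the paper analyzes walks over subsets, proposing to add or delete one element at a time. A minimal Metropolis sketch targeting P(S) proportional to det(L_S) for a PSD kernel L is shown below; it illustrates the chain structure only, not the paper's specific sampler or its mixing-time bounds:

```python
import numpy as np

def dpp_mcmc_sample(L, steps=2000, rng=None):
    """Add/delete Metropolis chain whose stationary distribution is
    P(S) proportional to det(L_S), i.e. an (unnormalized) DPP."""
    rng = rng or np.random.default_rng(0)
    n = L.shape[0]
    S = set()

    def weight(s):
        idx = sorted(s)
        # det of the principal submatrix indexed by s; det(L_empty) = 1.
        return 1.0 if not idx else float(np.linalg.det(L[np.ix_(idx, idx)]))

    w = weight(S)
    for _ in range(steps):
        i = int(rng.integers(n))
        T = S ^ {i}                 # toggle membership of element i
        wt = weight(T)
        if wt > 0 and rng.random() < min(1.0, wt / w):
            S, w = T, wt
    return S

L = np.array([[1.0, 0.9],
              [0.9, 1.0]])          # near-duplicate items: DPP favors diversity
sample = dpp_mcmc_sample(L, steps=500)
```

Because the off-diagonal similarity is high, det(L_{0,1}) = 1 - 0.81 is small, so the chain rarely keeps both items together, which is the repulsive behavior DPPs are used for.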
Modular Decomposition and Analysis of Registration based Trackers
Title | Modular Decomposition and Analysis of Registration based Trackers |
Authors | Abhineet Singh, Ankush Roy, Xi Zhang, Martin Jagersand |
Abstract | This paper presents a new way to study registration-based trackers by decomposing them into three constituent sub-modules: appearance model, state space model and search method. It is often the case that when a new tracker is introduced in the literature, it only contributes to one or two of these sub-modules while using existing methods for the rest. Since these are often selected arbitrarily by the authors, they may not be optimal for the new method. In such cases, our breakdown can help to experimentally find the best combination of methods for these sub-modules while also providing a framework within which the contributions of the new tracker can be clearly demarcated and thus studied better. We show how existing trackers can be broken down using the suggested methodology and compare the performance of the default configuration chosen by the authors against other possible combinations to demonstrate the new insights that can be gained by such an approach. We also present an open-source system that provides a convenient interface to plug in a new method for any sub-module and test it against all possible combinations of methods for the other two sub-modules, while also serving as a fast and efficient solution for practical tracking requirements. |
Tasks | |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.01292v2 |
http://arxiv.org/pdf/1603.01292v2.pdf | |
PWC | https://paperswithcode.com/paper/modular-decomposition-and-analysis-of |
Repo | |
Framework | |
Fast Sampling for Strongly Rayleigh Measures with Application to Determinantal Point Processes
Title | Fast Sampling for Strongly Rayleigh Measures with Application to Determinantal Point Processes |
Authors | Chengtao Li, Stefanie Jegelka, Suvrit Sra |
Abstract | In this note we consider sampling from (non-homogeneous) strongly Rayleigh probability measures. As an important corollary, we obtain a fast mixing Markov Chain sampler for Determinantal Point Processes. |
Tasks | Point Processes |
Published | 2016-07-13 |
URL | http://arxiv.org/abs/1607.03559v1 |
http://arxiv.org/pdf/1607.03559v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-sampling-for-strongly-rayleigh-measures |
Repo | |
Framework | |