Paper Group ANR 327
Exact ICL maximization in a non-stationary temporal extension of the stochastic block model for dynamic networks
Title | Exact ICL maximization in a non-stationary temporal extension of the stochastic block model for dynamic networks |
Authors | Marco Corneli, Pierre Latouche, Fabrice Rossi |
Abstract | The stochastic block model (SBM) is a flexible probabilistic tool that can be used to model interactions between clusters of nodes in a network. However, it does not account for interactions of time-varying intensity between clusters. The extension of the SBM developed in this paper addresses this shortcoming through a temporal partition: assuming interactions between nodes are recorded on fixed-length time intervals, the inference procedure associated with the proposed model clusters the nodes of the network and the time intervals simultaneously. The numbers of node clusters and of time-interval clusters, as well as the cluster memberships, are obtained by maximizing an exact integrated complete-data likelihood (ICL) with a greedy search approach. Experiments on simulated and real data are carried out to assess the proposed methodology. |
Tasks | |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02540v2 |
http://arxiv.org/pdf/1605.02540v2.pdf | |
PWC | https://paperswithcode.com/paper/exact-icl-maximization-in-a-non-stationary-1 |
Repo | |
Framework | |
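
The greedy ICL search sketched in the abstract alternates exchange moves over node and time-interval memberships, accepting a move only when the exact criterion increases. Below is a minimal sketch of that control flow, using a Poisson block log-likelihood as a stand-in for the paper's exact ICL; all names and the naive full rescoring per move are our assumptions (the paper relies on incremental updates).

```python
# Hedged sketch of a greedy exchange search for co-clustering nodes and
# time intervals. The Poisson complete-data log-likelihood of block-constant
# rates stands in for the exact ICL criterion of the paper.
import numpy as np

def block_loglik(X, z, w, K, D):
    """X: (N, N, U) interaction counts; z: node labels; w: interval labels."""
    ll = 0.0
    for k in range(K):
        for l in range(K):
            for d in range(D):
                block = X[np.ix_(z == k, z == l, w == d)]
                s, n = block.sum(), block.size
                if n and s:
                    ll += s * np.log(s / n) - s   # MLE rate = s / n
    return ll

def greedy_search(X, K=3, D=2, n_sweeps=10, seed=0):
    rng = np.random.default_rng(seed)
    N, _, U = X.shape
    z = rng.integers(K, size=N)          # node cluster memberships
    w = rng.integers(D, size=U)          # time-interval memberships
    best = block_loglik(X, z, w, K, D)
    for _ in range(n_sweeps):
        improved = False
        for labels, size in ((z, N), (w, U)):   # alternate node / time sweeps
            for i in range(size):
                old = labels[i]
                for c in range(K if labels is z else D):
                    labels[i] = c                # naive: rescore from scratch
                    score = block_loglik(X, z, w, K, D)
                    if score > best + 1e-9:
                        best, old, improved = score, c, True
                labels[i] = old                  # keep the best move found
        if not improved:
            break
    return z, w, best

z, w, ll = greedy_search(np.random.default_rng(1).poisson(1.0, (20, 20, 8)))
```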
Vicious Circle Principle and Formation of Sets in ASP Based Languages
Title | Vicious Circle Principle and Formation of Sets in ASP Based Languages |
Authors | Michael Gelfond, Yuanlin Zhang |
Abstract | The paper continues the investigation of Poincaré and Russell's Vicious Circle Principle (VCP) in the context of the design of logic programming languages with sets. We expand the previously introduced language Alog with aggregates by allowing infinite sets and several additional set-related constructs useful for knowledge representation and teaching. In addition, we propose an alternative formalization of the original VCP and incorporate it into the semantics of a new language, Slog+, which allows more liberal construction of sets and their use in programming rules. We show that, for programs without disjunction and infinite sets, the formal semantics of aggregates in Slog+ coincides with that of several other known languages. Their intuitive and formal semantics, however, are based on quite different ideas and seem to be more involved than those of Slog+. |
Tasks | |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08262v1 |
http://arxiv.org/pdf/1608.08262v1.pdf | |
PWC | https://paperswithcode.com/paper/vicious-circle-principle-and-formation-of |
Repo | |
Framework | |
Kernels for sequentially ordered data
Title | Kernels for sequentially ordered data |
Authors | Franz J Király, Harald Oberhauser |
Abstract | We present a novel framework for kernel learning with sequential data of any kind, such as time series, sequences of graphs, or strings. Our approach is based on signature features, which can be seen as an ordered variant of sample (cross-)moments; it yields a “sequentialized” version of any static kernel. The sequential kernels are efficiently computable for discrete sequences and are shown to approximate a continuous moment form in a sampling sense. A number of known kernels for sequences arise as “sequentializations” of suitable static kernels: string kernels can be obtained as a special case, and alignment kernels are closely related up to a modification that resolves their open non-definiteness issue. Our experiments indicate that our signature-based sequential kernel framework is a promising approach to learning with sequential data, such as time series, that avoids extensive manual pre-processing. |
Tasks | Time Series |
Published | 2016-01-29 |
URL | http://arxiv.org/abs/1601.08169v1 |
http://arxiv.org/pdf/1601.08169v1.pdf | |
PWC | https://paperswithcode.com/paper/kernels-for-sequentially-ordered-data |
Repo | |
Framework | |
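
As a hedged illustration of the “ordered variant of sample (cross-)moments” view, the sketch below computes level-1 and level-2 signature features from a path's increments and takes their inner product, i.e. a sequentialized linear kernel. The truncation level and names are our choices, not the paper's construction in full generality.

```python
# Truncated signature features (levels 1-2) and the induced sequential kernel
# for a linear static kernel. Level 2 collects the ordered cross-moments
# sum_{i<j} inc_i (x) inc_j via a cumulative-sum recursion.
import numpy as np

def signature_level2(path):
    """Truncated signature of a path given as a (T, d) array."""
    inc = np.diff(path, axis=0)                  # increments, shape (T-1, d)
    level1 = inc.sum(axis=0)                     # ordinary first moment
    d = inc.shape[1]
    csum = np.cumsum(inc, axis=0)
    level2 = np.zeros((d, d))
    for j in range(1, len(inc)):                 # ordered second moments
        level2 += np.outer(csum[j - 1], inc[j])
    return np.concatenate([level1, level2.ravel()])

def sequential_kernel(path_a, path_b):
    """Sequentialized linear kernel = <sig(a), sig(b)>."""
    return float(signature_level2(path_a) @ signature_level2(path_b))

rng = np.random.default_rng(0)
a = np.cumsum(rng.standard_normal((50, 2)), axis=0)  # toy 2-d time series
b = np.cumsum(rng.standard_normal((40, 2)), axis=0)  # lengths may differ
print(sequential_kernel(a, b))
```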
End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
Title | End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager |
Authors | Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng |
Abstract | Natural language understanding and dialogue policy learning are both essential in conversational systems that predict the next system action in response to the current user utterance. Conventional approaches aggregate separate models of natural language understanding (NLU) and system action prediction (SAP) into a pipeline that is sensitive to the noisy output of the error-prone NLU component. To address these issues, we propose an end-to-end deep recurrent neural network with limited contextual dialogue memory that jointly trains NLU and SAP on DSTC4 multi-domain human-human dialogues. Experiments show that our proposed model significantly outperforms the state-of-the-art pipeline models on both NLU and SAP, which indicates that the joint model mitigates the effects of noisy NLU outputs and that the NLU model can be refined by error signals backpropagated from the extra supervision of system actions. |
Tasks | |
Published | 2016-12-03 |
URL | http://arxiv.org/abs/1612.00913v2 |
http://arxiv.org/pdf/1612.00913v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-joint-learning-of-natural-language |
Repo | |
Framework | |
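
A hedged PyTorch sketch of the joint architecture described above: one shared recurrent encoder feeding two heads, a per-token NLU tagger and a multi-label system-action predictor, trained with a summed loss. Layer sizes, the pooling step, and the memory mechanism are our assumptions, not the paper's exact specification.

```python
# Minimal joint NLU + SAP model: shared BiLSTM encoder, token-level slot
# head, utterance-level action head; losses are summed so error signals
# from system actions also refine the NLU encoder.
import torch
import torch.nn as nn

class JointNluSap(nn.Module):
    def __init__(self, vocab, n_slots, n_actions, emb=100, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoder = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.slot_head = nn.Linear(2 * hid, n_slots)      # NLU: token tags
        self.action_head = nn.Linear(2 * hid, n_actions)  # SAP: multi-label

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.encoder(self.embed(tokens))
        slot_logits = self.slot_head(h)        # (batch, seq_len, n_slots)
        action_logits = self.action_head(h.mean(dim=1))   # pooled utterance
        return slot_logits, action_logits

model = JointNluSap(vocab=5000, n_slots=20, n_actions=15)
slot_loss = nn.CrossEntropyLoss()
action_loss = nn.BCEWithLogitsLoss()   # system actions can co-occur
# training step: loss = slot_loss(...) + action_loss(...); loss.backward()
```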
Diagnostic Prediction Using Discomfort Drawings with IBTM
Title | Diagnostic Prediction Using Discomfort Drawings with IBTM |
Authors | Cheng Zhang, Hedvig Kjellstrom, Carl Henrik Ek, Bo C. Bertilson |
Abstract | In this paper, we explore the possibility of applying machine learning to make diagnostic predictions from discomfort drawings. A discomfort drawing is an intuitive way for patients to express discomfort and pain-related symptoms, and such drawings have proven an effective means of collecting patient data and making diagnostic decisions in real-life practice. A dataset of real-world patient cases is collected, for which medical experts provide diagnostic labels. We then use a factorized multimodal topic model, the Inter-Battery Topic Model (IBTM), to train a system that makes diagnostic predictions given an unseen discomfort drawing. The number of output diagnostic labels is determined by mean-shift clustering on the discomfort drawing. Experimental results show reasonable predictions of diagnostic labels for unseen discomfort drawings. Additionally, we generate synthetic discomfort drawings with IBTM given a diagnostic label, which results in typical symptom presentations. These positive results indicate a significant potential for machine learning to support parts of the pain-diagnostic process and to serve as a decision-support system for physicians and other health-care personnel. |
Tasks | |
Published | 2016-07-27 |
URL | http://arxiv.org/abs/1607.08206v2 |
http://arxiv.org/pdf/1607.08206v2.pdf | |
PWC | https://paperswithcode.com/paper/diagnostic-prediction-using-discomfort-1 |
Repo | |
Framework | |
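
The label-count step can be pictured as follows: mean-shift on the marked pixel coordinates of a drawing gives the number of discomfort regions, and that count selects how many top-scoring labels to emit. The bandwidth and the mocked label scores below are our assumptions; the real system scores labels with IBTM.

```python
# Sketch of choosing how many diagnostic labels to output: cluster the marked
# pixels of a discomfort drawing with mean-shift, then emit that many of the
# highest-scoring labels from a (here mocked) topic-model posterior.
import numpy as np
from sklearn.cluster import MeanShift

def predict_labels(drawing_mask, label_scores, bandwidth=30.0):
    """drawing_mask: (H, W) boolean array; label_scores: dict label -> score."""
    coords = np.argwhere(drawing_mask)          # (row, col) of marked pixels
    ms = MeanShift(bandwidth=bandwidth).fit(coords)
    n_labels = len(ms.cluster_centers_)         # one label per discomfort region
    ranked = sorted(label_scores, key=label_scores.get, reverse=True)
    return ranked[:n_labels]

mask = np.zeros((200, 200), dtype=bool)
mask[10:30, 10:30] = True                        # two distinct marked regions
mask[150:170, 150:170] = True
print(predict_labels(mask, {"lumbago": 0.6, "sciatica": 0.3, "other": 0.1}))
```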
Distant IE by Bootstrapping Using Lists and Document Structure
Title | Distant IE by Bootstrapping Using Lists and Document Structure |
Authors | Lidong Bing, Mingyang Ling, Richard C. Wang, William W. Cohen |
Abstract | Distant labeling for information extraction (IE) suffers from noisy training data. We describe a way of reducing the noise associated with distant IE by identifying coupling constraints between potential instance labels. As one example of coupling, items in a list are likely to have the same label. A second example of coupling comes from analysis of document structure: in some corpora, sections can be identified such that items in the same section are likely to have the same label. Such sections do not exist in all corpora, but we show that augmenting a large corpus with coupling constraints from even a small, well-structured corpus can improve performance substantially, doubling F1 on one task. |
Tasks | |
Published | 2016-01-04 |
URL | http://arxiv.org/abs/1601.00620v1 |
http://arxiv.org/pdf/1601.00620v1.pdf | |
PWC | https://paperswithcode.com/paper/distant-ie-by-bootstrapping-using-lists-and |
Repo | |
Framework | |
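
A toy rendering of the list-coupling constraint from the abstract: noisy distant labels of items appearing in the same list are replaced by the list's majority label. The data layout and names are illustrative assumptions; section-based coupling works the same way with sections in place of lists.

```python
# Denoise distant labels with a coupling constraint: items in one list are
# likely to share a label, so the list's majority vote overrides outliers.
from collections import Counter

def apply_list_coupling(labels, lists):
    """labels: dict item -> noisy label; lists: iterable of item tuples."""
    smoothed = dict(labels)
    for items in lists:
        votes = Counter(labels[i] for i in items if i in labels)
        if votes:
            majority, _ = votes.most_common(1)[0]
            for i in items:
                smoothed[i] = majority          # coupled items share a label
    return smoothed

noisy = {"aspirin": "drug", "ibuprofen": "disease", "naproxen": "drug"}
print(apply_list_coupling(noisy, [("aspirin", "ibuprofen", "naproxen")]))
# -> all three coerced to 'drug', the list majority
```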
Recurrent Neural Networks to Correct Satellite Image Classification Maps
Title | Recurrent Neural Networks to Correct Satellite Image Classification Maps |
Authors | Emmanuel Maggiori, Guillaume Charpiat, Yuliya Tarabalka, Pierre Alliez |
Abstract | While initially devised for image categorization, convolutional neural networks (CNNs) are increasingly used for the pixelwise semantic labeling of images. However, the very nature of the most common CNN architectures makes them good at recognizing objects but poor at localizing them precisely. This problem is magnified in the context of aerial and satellite image labeling, where spatially precise object outlines are of paramount importance. Different iterative enhancement algorithms have been presented in the literature to progressively improve the coarse CNN outputs, seeking to sharpen object boundaries around real image edges. However, one must carefully design, choose and tune such algorithms. Instead, our goal is to learn the iterative process itself directly. To this end, we formulate a generic iterative enhancement process inspired by partial differential equations, and observe that it can be expressed as a recurrent neural network (RNN). Consequently, we train such a network from manually labeled data for our enhancement task. In a series of experiments we show that our RNN effectively learns an iterative process that significantly improves the quality of satellite image classification maps. |
Tasks | Image Categorization, Image Classification |
Published | 2016-08-11 |
URL | http://arxiv.org/abs/1608.03440v3 |
http://arxiv.org/pdf/1608.03440v3.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-to-correct |
Repo | |
Framework | |
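
A hedged sketch of the unrolled view: one shared update step (a small CNN standing in for a discretized PDE operator) is applied repeatedly to the coarse score map, conditioned on the input image. Depths, widths, and the residual form of the update are our assumptions.

```python
# Iterative enhancement as an RNN: the same learned update step is applied
# for a fixed number of iterations, progressively refining the coarse
# classification scores around image edges.
import torch
import torch.nn as nn

class MapRefinerRNN(nn.Module):
    def __init__(self, n_classes, img_channels=3, hid=32, steps=5):
        super().__init__()
        self.steps = steps
        self.step = nn.Sequential(            # shared across iterations
            nn.Conv2d(n_classes + img_channels, hid, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hid, n_classes, 3, padding=1),
        )

    def forward(self, coarse_scores, image):
        m = coarse_scores
        for _ in range(self.steps):           # recurrence over "time"
            m = m + self.step(torch.cat([m, image], dim=1))  # residual update
        return m

refiner = MapRefinerRNN(n_classes=2)
refined = refiner(torch.randn(1, 2, 64, 64), torch.randn(1, 3, 64, 64))
```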
Deep Double Sparsity Encoder: Learning to Sparsify Not Only Features But Also Parameters
Title | Deep Double Sparsity Encoder: Learning to Sparsify Not Only Features But Also Parameters |
Authors | Zhangyang Wang, Thomas S. Huang |
Abstract | This paper emphasizes the significance of jointly exploiting the problem structure and the parameter structure in the context of deep modeling. As a specific and interesting example, we describe the deep double sparsity encoder (DDSE), which is inspired by the double sparsity model for dictionary learning. DDSE simultaneously sparsifies the output features and the learned model parameters under one unified framework. In addition to its intuitive model interpretation, DDSE also possesses a compact model size and low complexity. Extensive simulations compare DDSE with several carefully designed baselines and verify the consistently superior performance of DDSE. We further apply DDSE to the novel application domain of brain encoding, with promising preliminary results. |
Tasks | Dictionary Learning |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06374v2 |
http://arxiv.org/pdf/1608.06374v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-double-sparsity-encoder-learning-to |
Repo | |
Framework | |
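
A minimal sketch of the "double sparsity" idea under our own assumptions: soft-thresholding sparsifies the features, while an L1 penalty on the weights sparsifies the parameters. This is not the paper's exact DDSE formulation, only the two-sided sparsity pattern it names.

```python
# Double sparsity in one module: Softshrink zeroes out small feature
# activations, and an L1 term on the weight matrix pushes the parameters
# themselves toward sparsity during training.
import torch
import torch.nn as nn

class DoubleSparsityEncoder(nn.Module):
    def __init__(self, in_dim, code_dim, lam=0.1):
        super().__init__()
        self.fc = nn.Linear(in_dim, code_dim, bias=False)
        self.shrink = nn.Softshrink(lam)     # feature sparsity

    def forward(self, x):
        return self.shrink(self.fc(x))

    def weight_l1(self):                     # parameter sparsity term
        return self.fc.weight.abs().sum()

enc = DoubleSparsityEncoder(in_dim=64, code_dim=128)
code = enc(torch.randn(8, 64))
task_loss = code.pow(2).mean()               # placeholder for the task loss
loss = task_loss + 1e-3 * enc.weight_l1()    # joint feature/parameter sparsity
loss.backward()
```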
Introspective Agents: Confidence Measures for General Value Functions
Title | Introspective Agents: Confidence Measures for General Value Functions |
Authors | Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski |
Abstract | Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions. While such adaptive agents may leverage engineered knowledge, they will require the capacity to construct and evaluate knowledge themselves from their own experience in a bottom-up, constructivist fashion. This position paper builds on the idea of encoding knowledge as temporally extended predictions through the use of general value functions. Prior work has focused on learning predictions about externally derived signals concerning a task or environment (e.g. battery level, joint position). Here we advocate that the agent should also predict internally generated signals regarding its own learning process - for example, an agent’s confidence in its learned predictions. Finally, we suggest how such information would be beneficial in creating an introspective agent that is able to learn to make good decisions in a complex, changing world. |
Tasks | |
Published | 2016-06-17 |
URL | http://arxiv.org/abs/1606.05593v1 |
http://arxiv.org/pdf/1606.05593v1.pdf | |
PWC | https://paperswithcode.com/paper/introspective-agents-confidence-measures-for |
Repo | |
Framework | |
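
A toy instance of the proposal, under assumptions of ours (a cyclic chain environment, tabular TD(0)): one general value function predicts an external signal, and a second, introspective one predicts the first one's squared TD error, yielding a confidence-like signal about the agent's own learning.

```python
# Tabular sketch: v predicts an external signal via TD(0); u is a second GVF
# trained on v's squared TD error, so low u[s] reads as "confident here".
import numpy as np

rng = np.random.default_rng(0)
n_states, gamma, alpha = 10, 0.9, 0.1
v = np.zeros(n_states)          # GVF: prediction of the external signal
u = np.zeros(n_states)          # introspective GVF: expected squared TD error

s = 0
for _ in range(10000):
    s2 = (s + 1) % n_states
    reward = (1.0 if s2 == 0 else 0.0) + rng.normal(0, 0.1)
    delta = reward + gamma * v[s2] - v[s]     # TD error of the outer GVF
    v[s] += alpha * delta
    u[s] += alpha * (delta ** 2 + gamma * u[s2] - u[s])  # predict own error
    s = s2
# high u[s] flags states where the agent's predictions are still unreliable
```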
How to do lexical quality estimation of a large OCRed historical Finnish newspaper collection with scarce resources
Title | How to do lexical quality estimation of a large OCRed historical Finnish newspaper collection with scarce resources |
Authors | Kimmo Kettunen |
Abstract | The National Library of Finland has digitized the historical newspapers published in Finland between 1771 and 1910. This collection contains approximately 1.95 million pages in Finnish and Swedish; the Finnish part of the collection consists of about 2.40 billion words. The National Library’s Digital Collections are offered via the digi.kansalliskirjasto.fi web service, also known as Digi. Part of the newspaper material (from 1771 to 1874) is also freely downloadable from the Language Bank of Finland provided by the FINCLARIN consortium. The collection can also be accessed through the Korp environment, which has been developed by Språkbanken at the University of Gothenburg and extended by the FINCLARIN team at the University of Helsinki to provide concordances of text resources. A Cranfield-style information retrieval test collection has also been produced from a small part of the Digi newspaper material at the University of Tampere. The quality of OCRed collections is an important topic in digital humanities, as it affects the general usability and searchability of collections. There is no single available method for assessing the quality of large collections, but different methods can be used to approximate it. This paper discusses different corpus-analysis-style methods for approximating the overall lexical quality of the Finnish part of the Digi collection. The methods include the use of parallel samples and word error rates, the use of morphological analyzers, frequency analysis of words, and comparisons to comparable edited lexical data. Our aim in the quality analysis is twofold: first, to analyze the present state of the lexical data and, second, to establish a set of assessment methods that form a compact procedure for quality assessment after, e.g., re-OCRing or post-correction of the material. In the discussion part of the paper we synthesize the results of our different analyses. |
Tasks | Information Retrieval |
Published | 2016-11-16 |
URL | https://arxiv.org/abs/1611.05239v2 |
https://arxiv.org/pdf/1611.05239v2.pdf | |
PWC | https://paperswithcode.com/paper/how-to-do-lexical-quality-estimation-of-a |
Repo | |
Framework | |
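
One of the listed approximation methods in miniature: the share of tokens recognized against a lexicon as a lexical-quality proxy for an OCRed corpus. The toy wordlist below stands in for a real Finnish morphological analyzer, an assumption on our part.

```python
# Lexical quality proxy: the fraction of corpus tokens recognized by a
# lexicon or analyzer; OCR errors produce unrecognized tokens and lower
# the rate, so the rate tracks OCR quality across corpus versions.
import re

def recognition_rate(text, lexicon):
    tokens = re.findall(r"\w+", text.lower(), flags=re.UNICODE)
    if not tokens:
        return 0.0
    known = sum(t in lexicon for t in tokens)
    return known / len(tokens)      # higher share of known tokens = better OCR

lexicon = {"sanomalehti", "on", "vanha"}     # toy stand-in for an analyzer
print(recognition_rate("Sanomalehti 0n vanha", lexicon))  # '0n' is OCR noise
```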
Resolving Distributed Knowledge
Title | Resolving Distributed Knowledge |
Authors | Thomas Ågotnes, Yì N. Wáng |
Abstract | Distributed knowledge is the sum of the knowledge in a group: what someone would know if they could discern between two possible worlds whenever any member of the group can discern between them. Sometimes distributed knowledge is referred to as the potential knowledge of a group, or the joint knowledge its members could obtain if they had unlimited means of communication. In epistemic logic, the formula D_Gφ is intended to express the fact that group G has distributed knowledge of φ, i.e., that there is enough information in the group to infer φ. But this is not the same as reasoning about what happens if the members of the group share their information. In this paper we introduce an operator R_G such that R_Gφ means that φ is true after the members of G have shared all their information with each other - after G’s distributed knowledge has been resolved. The R_G operators are called resolution operators. Semantically, an expression R_Gφ is true iff φ is true in what van Benthem [11, p. 249] calls (G’s) communication core: the model update obtained by removing, for the members of G, links to states that are not linked by all members of G. We study logics with different combinations of resolution operators and operators for common and distributed knowledge. Of particular interest is the relationship between distributed and common knowledge. The main results are sound and complete axiomatizations. |
Tasks | |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07515v1 |
http://arxiv.org/pdf/1606.07515v1.pdf | |
PWC | https://paperswithcode.com/paper/resolving-distributed-knowledge |
Repo | |
Framework | |
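
The model update is easy to state in code: resolving group G intersects the accessibility relations of G's members (the communication core), after which every member can discern whatever any member could. The data layout below is our assumption.

```python
# Kripke-model sketch of the resolution operator R_G: keep only the links
# shared by all members of G, and give that pooled relation to each member.
def resolve(relations, group):
    """relations: dict agent -> set of (world, world) pairs; group: agents."""
    core = set.intersection(*(relations[a] for a in group))
    updated = dict(relations)
    for a in group:
        updated[a] = core       # each member now has the pooled information
    return updated

R = {"a": {(1, 1), (2, 2), (1, 2), (2, 1)},   # a cannot tell world 1 from 2
     "b": {(1, 1), (2, 2)}}                   # b can
print(resolve(R, {"a", "b"})["a"])            # -> {(1, 1), (2, 2)}
```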
A Two-Stage Shape Retrieval (TSR) Method with Global and Local Features
Title | A Two-Stage Shape Retrieval (TSR) Method with Global and Local Features |
Authors | Xiaqing Pan, Sachin Chachada, C. -C. Jay Kuo |
Abstract | A robust two-stage shape retrieval (TSR) method is proposed to address the 2D shape retrieval problem. Most state-of-the-art shape retrieval methods are based on local feature matching and ranking; their retrieval performance is not robust, since they may retrieve globally dissimilar shapes at high ranks. To overcome this challenge, we decompose the decision process into two stages. In the first, irrelevant cluster filtering (ICF) stage, we consider both global and local features and use them to predict the relevance of gallery shapes with respect to the query; irrelevant shapes are removed from the candidate shape set. A local-feature-based matching and ranking (LMR) method then follows in the second stage. We apply the proposed TSR system to three datasets, MPEG-7, Kimia99 and Tari1000, and show that it outperforms all other existing methods, demonstrating the robust retrieval performance of the TSR system. |
Tasks | |
Published | 2016-03-07 |
URL | http://arxiv.org/abs/1603.01942v3 |
http://arxiv.org/pdf/1603.01942v3.pdf | |
PWC | https://paperswithcode.com/paper/a-two-stage-shape-retrieval-tsr-method-with |
Repo | |
Framework | |
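
A schematic of the two stages under assumed inputs: stage one (the ICF role) keeps only gallery shapes whose global descriptors are close to the query's, and stage two (the LMR role) ranks the survivors by a local-feature score supplied as a callable. Descriptors, the keep ratio, and the scoring interface are all our assumptions.

```python
# Two-stage retrieval skeleton: global filtering first, local ranking second,
# so globally dissimilar shapes can never reach the top ranks.
import numpy as np

def tsr_retrieve(query_g, gallery_g, local_score, keep=0.3):
    """query_g: (d,) global descriptor; gallery_g: (n, d) gallery descriptors;
    local_score(i) -> local-feature similarity of gallery item i to the query."""
    dists = np.linalg.norm(gallery_g - query_g, axis=1)
    n_keep = max(1, int(keep * len(gallery_g)))
    candidates = np.argsort(dists)[:n_keep]        # stage 1: filter (ICF)
    ranked = sorted(candidates, key=local_score, reverse=True)
    return ranked                                   # stage 2: rank (LMR)
```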
Exploring the Entire Regularization Path for the Asymmetric Cost Linear Support Vector Machine
Title | Exploring the Entire Regularization Path for the Asymmetric Cost Linear Support Vector Machine |
Authors | Daniel Wesierski |
Abstract | We propose an algorithm for exploring the entire regularization path of asymmetric-cost linear support vector machines. Empirical evidence suggests that the predictive power of support vector machines depends on the regularization parameters of the training algorithms. Algorithms exploring the entire regularization path have been proposed for single-cost support vector machines, thereby providing complete knowledge of the behavior of the trained model over the hyperparameter space. Considering the problem in a two-dimensional hyperparameter space, though, enables our algorithm to maintain greater flexibility in dealing with special cases and sheds light on problems encountered by algorithms building paths in one-dimensional spaces. We demonstrate two-dimensional regularization paths for linear support vector machines trained on synthetic and real data. |
Tasks | |
Published | 2016-10-12 |
URL | http://arxiv.org/abs/1610.03738v1 |
http://arxiv.org/pdf/1610.03738v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-the-entire-regularization-path-for |
Repo | |
Framework | |
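
The paper traces the exact path through the two-dimensional cost space; as a hedged stand-in, the sketch below scans the same (C+, C-) landscape on a grid using scikit-learn's per-class weights, which is the brute-force view of what a path algorithm computes exactly.

```python
# Grid sweep over asymmetric misclassification costs for a linear SVM.
# class_weight scales C per class, giving the (C+, C-) hyperparameter plane.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, random_state=0)
grid = np.logspace(-2, 2, 5)
for c_pos in grid:
    for c_neg in grid:                      # asymmetric costs per class
        clf = LinearSVC(C=1.0, class_weight={1: c_pos, 0: c_neg},
                        max_iter=5000)
        clf.fit(X, y)
        print(f"C+={c_pos:g} C-={c_neg:g} acc={clf.score(X, y):.2f}")
```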
Communication-Efficient Parallel Block Minimization for Kernel Machines
Title | Communication-Efficient Parallel Block Minimization for Kernel Machines |
Authors | Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon |
Abstract | Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges. In this paper, we show how to overcome the important challenge of speeding up kernel machines. In particular, we develop a parallel block minimization framework for solving kernel machines, including kernel SVM and kernel logistic regression. Our framework proceeds by dividing the problem into smaller subproblems by forming a block-diagonal approximation of the Hessian matrix. The subproblems are then solved approximately in parallel. After that, a communication-efficient line search procedure is developed to ensure a sufficient reduction of the objective function value at each iteration. We prove a global linear convergence rate for the proposed method with a wide class of subproblem solvers, and our analysis covers strongly convex and some non-strongly convex functions. We apply our algorithm to solve large-scale kernel SVM problems on distributed systems and show a significant improvement over existing parallel solvers. As an example, on the covtype dataset with half a million samples, our algorithm obtains an approximate solution with 96% accuracy in 20 seconds using 32 machines, while all other parallel kernel SVM solvers require more than 2000 seconds to achieve a solution with 95% accuracy. Moreover, our algorithm scales to very large data sets, such as the kdd algebra dataset with 8 million samples and 20 million features. |
Tasks | |
Published | 2016-08-05 |
URL | http://arxiv.org/abs/1608.02010v1 |
http://arxiv.org/pdf/1608.02010v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-parallel-block |
Repo | |
Framework | |
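
A serial miniature of the framework: the Hessian is approximated block-diagonally, each block's subproblem is solved independently (in the paper, in parallel across machines), and a backtracking line search guarantees descent. Kernel ridge regression stands in for kernel SVM/logistic regression here, an assumption that keeps the sketch short.

```python
# Block minimization with a block-diagonal Hessian approximation plus an
# Armijo backtracking line search on the global objective
# f(alpha) = 1/2 a'Ka + lam/2 a'a - y'a  (kernel ridge dual as a stand-in).
import numpy as np

def f(alpha, K, y, lam):
    return 0.5 * alpha @ (K @ alpha) + 0.5 * lam * alpha @ alpha - y @ alpha

def block_minimize(K, y, lam=0.1, n_blocks=4, iters=20):
    n = len(y)
    alpha = np.zeros(n)
    blocks = np.array_split(np.arange(n), n_blocks)
    for _ in range(iters):
        grad = K @ alpha + lam * alpha - y
        d = np.zeros(n)
        for b in blocks:                      # independent block subproblems
            H_bb = K[np.ix_(b, b)] + lam * np.eye(len(b))
            d[b] = np.linalg.solve(H_bb, -grad[b])
        step = 1.0                            # backtracking line search
        while (f(alpha + step * d, K, y, lam)
               > f(alpha, K, y, lam) + 1e-4 * step * (grad @ d)):
            step *= 0.5
        alpha += step * d
    return alpha

rng = np.random.default_rng(0)
Xtr = rng.standard_normal((100, 5))
K = np.exp(-np.linalg.norm(Xtr[:, None] - Xtr[None, :], axis=2) ** 2)  # RBF
alpha = block_minimize(K, Xtr @ rng.standard_normal(5))
```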
Structure-Aware Classification using Supervised Dictionary Learning
Title | Structure-Aware Classification using Supervised Dictionary Learning |
Authors | Yael Yankelevsky, Michael Elad |
Abstract | In this paper, we propose a supervised dictionary learning algorithm that aims to preserve the local geometry in both dimensions of the data. A graph-based regularization explicitly takes into account the local manifold structure of the data points, and a second graph regularization gives similar treatment to the feature domain and helps in learning a more robust dictionary. Both graphs can be constructed from the training data or learned and adapted along the dictionary learning process. The combination of these two terms promotes the discriminative power of the learned sparse representations and leads to improved classification accuracy. The proposed method was evaluated on several different datasets, representing both single-label and multi-label classification problems, and demonstrated better performance compared with other dictionary-based approaches. |
Tasks | Dictionary Learning, Multi-Label Classification |
Published | 2016-09-29 |
URL | http://arxiv.org/abs/1609.09199v1 |
http://arxiv.org/pdf/1609.09199v1.pdf | |
PWC | https://paperswithcode.com/paper/structure-aware-classification-using |
Repo | |
Framework | |
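
A compact alternating sketch of the two graph regularizers: a sample-graph Laplacian smooths the sparse codes, and a feature-graph Laplacian smooths the dictionary atoms. Step sizes, weights, and the plain gradient updates are our assumptions; this is not the authors' solver.

```python
# Dictionary learning with two graph terms: for X (d x n), D (d x k),
# A (k x n), minimize ||X - DA||^2 + lam*||A||_1
#   + beta*tr(A L_smp A') + gamma*tr(D' L_feat D)
# by ISTA steps on A and projected gradient steps on D.
import numpy as np

def soft(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def graph_dict_learn(X, L_smp, L_feat, k=16, lam=0.1, beta=0.05,
                     gamma=0.05, iters=50, step=1e-2, seed=0):
    d, n = X.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((d, k))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((k, n))
    for _ in range(iters):
        # sparse coding: ISTA step on the smooth part, then shrinkage
        grad_A = D.T @ (D @ A - X) + beta * A @ L_smp
        A = soft(A - step * grad_A, step * lam)
        # dictionary update with the feature-graph smoothness term
        grad_D = (D @ A - X) @ A.T + gamma * L_feat @ D
        D -= step * grad_D
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-8)  # unit-norm atoms
    return D, A
```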