July 28, 2019

Paper Group ANR 243

Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Title Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Authors Alon Brutzkus, Amir Globerson
Abstract Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
Tasks
Published 2017-02-26
URL http://arxiv.org/abs/1702.07966v1
PDF http://arxiv.org/pdf/1702.07966v1.pdf
PWC https://paperswithcode.com/paper/globally-optimal-gradient-descent-for-a
Repo
Framework
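
The setting described in the abstract above can be sketched numerically as follows. This is a minimal illustration and not the authors' code: it assumes a teacher-student setup in which labels come from a fixed filter `w_star`, and it runs full-batch gradient descent on an empirical squared loss instead of the population loss analysed in the paper. The function names (`forward`, `loss_and_grad`, `train`) and all hyperparameters are hypothetical choices made for this sketch.

```python
import numpy as np


def forward(w, X, k):
    """No-overlap ConvNet: average of ReLU(w . patch) over k disjoint patches."""
    patches = X.reshape(X.shape[0], k, -1)        # (n, k, m)
    return np.maximum(patches @ w, 0.0).mean(axis=1)


def loss_and_grad(w, X, y, k):
    """Mean squared error and its (sub)gradient with respect to the filter w."""
    n = X.shape[0]
    patches = X.reshape(n, k, -1)                 # (n, k, m)
    pre = patches @ w                             # (n, k) pre-activations
    err = np.maximum(pre, 0.0).mean(axis=1) - y   # (n,) prediction error
    active = (pre > 0).astype(float)              # ReLU derivative per patch
    grad = 2.0 * np.einsum("n,nk,nkm->m", err, active, patches) / (n * k)
    return float(np.mean(err ** 2)), grad


def train(k=4, m=5, n=2000, steps=500, lr=0.2, seed=0):
    """Run plain gradient descent; sizes and step size are assumed values."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, k * m))           # Gaussian inputs
    w_star = rng.standard_normal(m)               # assumed teacher filter
    y = forward(w_star, X, k)                     # labels from the teacher
    w = rng.standard_normal(m)                    # random initialisation
    initial_loss, _ = loss_and_grad(w, X, y, k)
    for _ in range(steps):
        _, grad = loss_and_grad(w, X, y, k)
        w = w - lr * grad
    final_loss, _ = loss_and_grad(w, X, y, k)
    return initial_loss, final_loss, w, w_star
```

Calling `train()` returns the loss before and after training together with the learned and teacher filters, so the two filters can be compared directly. The sketch only illustrates the architecture and the training loop; it does not reproduce the paper's convergence guarantee.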

Dialogue Act Segmentation for Vietnamese Human-Human Conversational Texts

Title Dialogue Act Segmentation for Vietnamese Human-Human Conversational Texts
Authors Thi Lan Ngo, Khac Linh Pham, Minh Son Cao, Son Bao Pham, Xuan Hieu Phan
Abstract Dialog act identification plays an important role in understanding conversations. It has been widely applied in many fields such as dialogue systems, automatic machine translation, and automatic speech recognition, and is especially useful in systems with human-computer natural language dialogue interfaces such as virtual assistants and chatbots. The first step in identifying a dialog act is identifying its boundary within an utterance. In this paper, we focus on segmenting the utterance according to the dialog act boundaries, i.e. functional segment identification, for Vietnamese utterances. We carefully investigate functional segment identification using two approaches: (1) a machine learning approach using maximum entropy (ME) and conditional random fields (CRFs); (2) a deep learning approach using bidirectional Long Short-Term Memory (LSTM) with a CRF layer (Bi-LSTM-CRF). Both are evaluated on two different conversational datasets: (1) Facebook messages (Message data); (2) transcriptions of phone conversations (Phone data). To the best of our knowledge, this is the first work that applies a deep learning based approach to dialog act segmentation. As the results show, the deep learning approach performs appreciably better than traditional machine learning approaches. Moreover, it is also the first study that tackles dialog act and functional segment identification for Vietnamese.
Tasks Machine Translation, Speech Recognition
Published 2017-08-16
URL http://arxiv.org/abs/1708.04765v1
PDF http://arxiv.org/pdf/1708.04765v1.pdf
PWC https://paperswithcode.com/paper/dialogue-act-segmentation-for-vietnamese
Repo
Framework

Demystifying Relational Latent Representations

Title Demystifying Relational Latent Representations
Authors Sebastijan Dumančić, Hendrik Blockeel
Abstract Latent features learned by deep learning approaches have proven to be a powerful tool for machine learning. They serve as a data abstraction that makes learning easier by capturing regularities in data explicitly. Their benefits motivated their adaptation to the relational learning context. In our previous work, we introduced an approach that learns relational latent features by means of clustering instances and their relations. The major drawback of latent representations is that they are often black-box and difficult to interpret. This work addresses these issues and shows that (1) latent features created by clustering are interpretable and capture interesting properties of data; (2) they identify local regions of instances that match well with the label, which partially explains their benefit; and (3) although the number of latent features generated by this approach is large, often many of them are highly redundant and can be removed without hurting performance much.
Tasks Relational Reasoning
Published 2017-05-16
URL http://arxiv.org/abs/1705.05785v3
PDF http://arxiv.org/pdf/1705.05785v3.pdf
PWC https://paperswithcode.com/paper/demystifying-relational-latent
Repo
Framework

A Greedy Part Assignment Algorithm for Real-time Multi-person 2D Pose Estimation

Title A Greedy Part Assignment Algorithm for Real-time Multi-person 2D Pose Estimation
Authors Srenivas Varadarajan, Parual Datta, Omesh Tickoo
Abstract Human pose-estimation in a multi-person image involves detection of various body parts and grouping them into individual person clusters. While the former task is challenging due to mutual occlusions, the combinatorial complexity of the latter task is very high. We propose a greedy part assignment algorithm that exploits the inherent structure of the human body to achieve a lower complexity than any previously published work. This is accomplished by (i) reducing the number of part-candidates using the estimated number of people in the image, (ii) doing a greedy sequential assignment of part-classes, following the kinematic chain from head to ankle, (iii) doing a greedy assignment of parts in each part-class set to person-clusters, (iv) limiting the candidate person clusters to the most proximal clusters using human anthropometric data, and (v) using only a specific subset of pre-assigned parts for establishing pairwise structural constraints. We show that these steps result in a sparse body-part relationship graph and reduce the complexity. We also propose methods for improving the accuracy of pose-estimation by (i) spawning person-clusters from any unassigned significant body part and (ii) suppressing hallucinated parts. On the MPII multi-person pose database, pose-estimation using the proposed method takes only 0.14 seconds per image. We show that our proposed algorithm, by using a large spatial and structural context, achieves state-of-the-art accuracy on both the MPII and WAF multi-person pose datasets, demonstrating the robustness of our approach.
Tasks Pose Estimation
Published 2017-08-30
URL http://arxiv.org/abs/1708.09182v1
PDF http://arxiv.org/pdf/1708.09182v1.pdf
PWC https://paperswithcode.com/paper/a-greedy-part-assignment-algorithm-for-real
Repo
Framework

Induction of Interpretable Possibilistic Logic Theories from Relational Data

Title Induction of Interpretable Possibilistic Logic Theories from Relational Data
Authors Ondrej Kuzelka, Jesse Davis, Steven Schockaert
Abstract The field of Statistical Relational Learning (SRL) is concerned with learning probabilistic models from relational data. Learned SRL models are typically represented using some kind of weighted logical formulas, which make them considerably more interpretable than those obtained by e.g. neural networks. In practice, however, these models are often still difficult to interpret correctly, as they can contain many formulas that interact in non-trivial ways and weights do not always have an intuitive meaning. To address this, we propose a new SRL method which uses possibilistic logic to encode relational models. Learned models are then essentially stratified classical theories, which explicitly encode what can be derived with a given level of certainty. Compared to Markov Logic Networks (MLNs), our method is faster and produces considerably more interpretable models.
Tasks Relational Reasoning
Published 2017-05-19
URL http://arxiv.org/abs/1705.07095v1
PDF http://arxiv.org/pdf/1705.07095v1.pdf
PWC https://paperswithcode.com/paper/induction-of-interpretable-possibilistic
Repo
Framework

Stability Enhanced Large-Margin Classifier Selection

Title Stability Enhanced Large-Margin Classifier Selection
Authors Will Wei Sun, Guang Cheng, Yufeng Liu
Abstract Stability is an important aspect of a classification procedure because unstable predictions can potentially reduce users’ trust in a classification system and also harm the reproducibility of scientific conclusions. The major goal of our work is to introduce a novel concept of classification instability, i.e., decision boundary instability (DBI), and incorporate it with the generalization error (GE) as a standard for selecting the most accurate and stable classifier. Specifically, we implement a two-stage algorithm: (i) initially select a subset of classifiers whose estimated GEs are not significantly different from the minimal estimated GE among all the candidate classifiers; (ii) the optimal classifier is chosen as the one achieving the minimal DBI among the subset selected in stage (i). This general selection principle applies to both linear and nonlinear classifiers. Large-margin classifiers are used as a prototypical example to illustrate the above idea. Our selection method is shown to be consistent in the sense that the optimal classifier simultaneously achieves the minimal GE and the minimal DBI. Various simulations and real examples further demonstrate the advantage of our method over several alternative approaches.
Tasks
Published 2017-01-20
URL http://arxiv.org/abs/1701.05672v1
PDF http://arxiv.org/pdf/1701.05672v1.pdf
PWC https://paperswithcode.com/paper/stability-enhanced-large-margin-classifier
Repo
Framework
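
The two-stage selection rule in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: the paper's stage (i) uses a statistical test of whether an estimated GE differs significantly from the minimum, whereas this sketch substitutes a fixed `tolerance` threshold. The function name `select_classifier`, the tuple layout, and the example numbers are all hypothetical.

```python
def select_classifier(candidates, tolerance):
    """Pick a classifier by generalization error first, then by stability.

    candidates: list of (name, estimated_ge, estimated_dbi) tuples.
    tolerance:  assumed stand-in for the paper's significance test; a GE
                within this distance of the minimum counts as "not
                significantly different".
    """
    if not candidates:
        raise ValueError("need at least one candidate classifier")

    # Stage (i): keep classifiers whose GE is close to the smallest GE.
    min_ge = min(ge for _, ge, _ in candidates)
    shortlist = [c for c in candidates if c[1] - min_ge <= tolerance]

    # Stage (ii): among those, choose the smallest decision boundary
    # instability (DBI).
    best = min(shortlist, key=lambda c: c[2])
    return best[0]


# Made-up numbers, for illustration only.
example = [
    ("linear_svm", 0.101, 0.30),
    ("rbf_svm", 0.100, 0.45),
    ("dwd", 0.150, 0.05),
]
chosen = select_classifier(example, tolerance=0.01)
```

With these illustrative inputs, `dwd` has the lowest DBI but is excluded in stage (i) because its GE is too far from the minimum, so the rule chooses between the two remaining candidates on stability alone.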

Inferring Networked Device Categories from Low-Level Activity Indicators

Title Inferring Networked Device Categories from Low-Level Activity Indicators
Authors Kyumars Sheykh Esmaili, Jaideep Chandrashekar, Pascal Le Guyadec
Abstract We study the problem of inferring the type of a networked device in a home network by leveraging low level traffic activity indicators seen at commodity home gateways. We analyze a dataset of detailed device network activity obtained from 240 subscriber homes of a large European ISP and extract a number of traffic and spatial fingerprints for individual devices. We develop a two level taxonomy to describe devices onto which we map individual devices using a number of heuristics. We leverage the heuristically derived labels to train classifiers that distinguish device classes based on the traffic and spatial fingerprints of a device. Our results show an accuracy level up to 91% for the coarse level category and up to 84% for the fine grained category. By incorporating information from other sources (e.g., MAC OUI), we are able to further improve accuracy to above 97% and 92%, respectively. Finally, we also extract a set of simple and human-readable rules that concisely capture the behaviour of these distinct device categories.
Tasks
Published 2017-09-01
URL http://arxiv.org/abs/1709.00348v1
PDF http://arxiv.org/pdf/1709.00348v1.pdf
PWC https://paperswithcode.com/paper/inferring-networked-device-categories-from
Repo
Framework

Geospatial Semantics

Title Geospatial Semantics
Authors Yingjie Hu
Abstract Geospatial semantics is a broad field that involves a variety of research areas. The term semantics refers to the meaning of things, and is in contrast with the term syntactics. Accordingly, studies on geospatial semantics usually focus on understanding the meaning of geographic entities as well as their counterparts in the cognitive and digital world, such as cognitive geographic concepts and digital gazetteers. Geospatial semantics can also facilitate the design of geographic information systems (GIS) by enhancing the interoperability of distributed systems and developing more intelligent interfaces for user interactions. During the past years, a lot of research has been conducted, approaching geospatial semantics from different perspectives, using a variety of methods, and targeting different problems. Meanwhile, the arrival of big geo data, especially the large amount of unstructured text data on the Web, and the fast development of natural language processing methods enable new research directions in geospatial semantics. This chapter, therefore, provides a systematic review on the existing geospatial semantic research. Six major research areas are identified and discussed, including semantic interoperability, digital gazetteers, geographic information retrieval, geospatial Semantic Web, place semantics, and cognitive geographic concepts.
Tasks Information Retrieval
Published 2017-07-12
URL http://arxiv.org/abs/1707.03550v2
PDF http://arxiv.org/pdf/1707.03550v2.pdf
PWC https://paperswithcode.com/paper/geospatial-semantics
Repo
Framework

A New Rank Constraint on Multi-view Fundamental Matrices, and its Application to Camera Location Recovery

Title A New Rank Constraint on Multi-view Fundamental Matrices, and its Application to Camera Location Recovery
Authors Soumyadip Sengupta, Tal Amir, Meirav Galun, Tom Goldstein, David W. Jacobs, Amit Singer, Ronen Basri
Abstract Accurate estimation of camera matrices is an important step in structure from motion algorithms. In this paper we introduce a novel rank constraint on collections of fundamental matrices in multi-view settings. We show that in general, with the selection of proper scale factors, a matrix formed by stacking fundamental matrices between pairs of images has rank 6. Moreover, this matrix forms the symmetric part of a rank 3 matrix whose factors relate directly to the corresponding camera matrices. We use this new characterization to produce better estimations of fundamental matrices by optimizing an L1-cost function using Iterative Re-weighted Least Squares and the Alternating Direction Method of Multipliers. We further show that this procedure can improve the recovery of camera locations, particularly in multi-view settings in which fewer images are available.
Tasks
Published 2017-02-10
URL http://arxiv.org/abs/1702.03023v1
PDF http://arxiv.org/pdf/1702.03023v1.pdf
PWC https://paperswithcode.com/paper/a-new-rank-constraint-on-multi-view
Repo
Framework
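
The rank-6 claim in the abstract above can be checked numerically with a small sketch. It assumes one particular, consistently scaled parameterisation of the pairwise fundamental matrices, F_ij = K_i^{-T} R_i^T ([t_i]_x - [t_j]_x) R_j K_j^{-1}, built from randomly generated cameras. That convention, the helper names, and the camera ranges are assumptions of this sketch and are not taken from the listing.

```python
import numpy as np


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]                  # flip a column so det(q) = +1
    return q


def stacked_fundamental_matrix(n_cameras=5, seed=0):
    """Build the 3n x 3n block matrix whose (i, j) block is F_ij."""
    rng = np.random.default_rng(seed)
    V_blocks, U_blocks = [], []
    for _ in range(n_cameras):
        f = rng.uniform(0.8, 1.2)           # assumed focal length range
        K = np.diag([f, f, 1.0])
        R = random_rotation(rng)
        t = rng.standard_normal(3)          # assumed camera location
        V_i = np.linalg.inv(K).T @ R.T      # K^{-T} R^T
        V_blocks.append(V_i)
        U_blocks.append(V_i @ skew(t))      # K^{-T} R^T [t]_x
    V = np.vstack(V_blocks)                 # (3n, 3)
    U = np.vstack(U_blocks)                 # (3n, 3)
    # Block (i, j) equals V_i ([t_i]_x - [t_j]_x) V_j^T because [t]_x is skew.
    return U @ V.T + V @ U.T


def numerical_rank(M, rel_tol=1e-9):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))


F = stacked_fundamental_matrix()
rank_of_F = numerical_rank(F)
```

Under this parameterisation the stacked matrix is `A + A.T` with `A = U @ V.T`, and `A` has rank at most 3, which matches the abstract's statement that the stacked matrix is the symmetric part of a rank-3 matrix. Fundamental matrices estimated from real images carry arbitrary scales, so the constraint holds for them only after suitable scale factors are chosen.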

Interpretable Graph-Based Semi-Supervised Learning via Flows

Title Interpretable Graph-Based Semi-Supervised Learning via Flows
Authors Raif M. Rustamov, James T. Klosowski
Abstract In this paper, we consider the interpretability of the foundational Laplacian-based semi-supervised learning approaches on graphs. We introduce a novel flow-based learning framework that subsumes the foundational approaches and additionally provides a detailed, transparent, and easily understood expression of the learning process in terms of graph flows. As a result, one can visualize and interactively explore the precise subgraph along which the information from labeled nodes flows to an unlabeled node of interest. Surprisingly, the proposed framework avoids trading accuracy for interpretability, but in fact leads to improved prediction accuracy, which is supported both by theoretical considerations and empirical results. The flow-based framework guarantees the maximum principle by construction and can handle directed graphs in an out-of-the-box manner.
Tasks
Published 2017-09-14
URL http://arxiv.org/abs/1709.04764v1
PDF http://arxiv.org/pdf/1709.04764v1.pdf
PWC https://paperswithcode.com/paper/interpretable-graph-based-semi-supervised
Repo
Framework

Reliable Decision Support using Counterfactual Models

Title Reliable Decision Support using Counterfactual Models
Authors Peter Schulam, Suchi Saria
Abstract Decision-makers are faced with the challenge of estimating what is likely to happen when they take an action. For instance, if I choose not to treat this patient, are they likely to die? Practitioners commonly use supervised learning algorithms to fit predictive models that help decision-makers reason about likely future outcomes, but we show that this approach is unreliable, and sometimes even dangerous. The key issue is that supervised learning algorithms are highly sensitive to the policy used to choose actions in the training data, which causes the model to capture relationships that do not generalize. We propose using a different learning objective that predicts counterfactuals instead of predicting outcomes under an existing action policy as in supervised learning. To support decision-making in temporal settings, we introduce the Counterfactual Gaussian Process (CGP) to predict the counterfactual future progression of continuous-time trajectories under sequences of future actions. We demonstrate the benefits of the CGP on two important decision-support tasks: risk prediction and “what if?” reasoning for individualized treatment planning.
Tasks Decision Making
Published 2017-03-30
URL http://arxiv.org/abs/1703.10651v4
PDF http://arxiv.org/pdf/1703.10651v4.pdf
PWC https://paperswithcode.com/paper/reliable-decision-support-using
Repo
Framework

Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model

Title Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model
Authors Priyadarshini Panda, Narayan Srinivasa
Abstract A fundamental challenge in machine learning today is to build a model that can learn from few examples. Here, we describe a reservoir based spiking neural model for learning to recognize actions with a limited number of labeled videos. First, we propose a novel encoding, inspired by how microsaccades influence visual perception, to extract spike information from raw video data while preserving the temporal correlation across different frames. Using this encoding, we show that the reservoir generalizes its rich dynamical activity toward signature action/movements enabling it to learn from few training examples. We evaluate our approach on the UCF-101 dataset. Our experiments demonstrate that our proposed reservoir achieves 81.3%/87% Top-1/Top-5 accuracy, respectively, on the 101-class data while requiring just 8 video examples per class for training. Our results establish a new benchmark for action recognition from limited video examples for spiking neural models while yielding competitive accuracy with respect to state-of-the-art non-spiking neural models.
Tasks Temporal Action Localization
Published 2017-10-19
URL http://arxiv.org/abs/1710.07354v1
PDF http://arxiv.org/pdf/1710.07354v1.pdf
PWC https://paperswithcode.com/paper/learning-to-recognize-actions-from-limited
Repo
Framework

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

Title Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder
Authors Caglar Gulcehre, Francis Dutil, Adam Trischler, Yoshua Bengio
Abstract We investigate the integration of a planning mechanism into an encoder-decoder architecture with an explicit alignment for character-level machine translation. We develop a model that plans ahead when it computes alignments between the source and target sequences, constructing a matrix of proposed future alignments and a commitment vector that governs whether to follow or recompute the plan. This mechanism is inspired by the strategic attentive reader and writer (STRAW) model. Our proposed model is end-to-end trainable with fully differentiable operations. We show that it outperforms a strong baseline on three character-level neural machine translation tasks on the WMT’15 corpus. Our analysis demonstrates that our model can compute qualitatively intuitive alignments and achieves superior performance with fewer parameters.
Tasks Machine Translation
Published 2017-06-13
URL http://arxiv.org/abs/1706.05087v2
PDF http://arxiv.org/pdf/1706.05087v2.pdf
PWC https://paperswithcode.com/paper/plan-attend-generate-character-level-neural-1
Repo
Framework

Joint Cuts and Matching of Partitions in One Graph

Title Joint Cuts and Matching of Partitions in One Graph
Authors Tianshu Yu, Junchi Yan, Jieyi Zhao, Baoxin Li
Abstract As two fundamental problems, graph cuts and graph matching have been investigated for decades, resulting in a vast literature on each topic. However, jointly applying and solving graph cuts and matching has received little attention. In this paper, we first formalize the problem of simultaneously cutting a graph into two partitions, i.e. graph cuts, and establishing their correspondence, i.e. graph matching. Then we develop an optimization algorithm that updates matching and cutting alternately, supported by theoretical analysis. The efficacy of our algorithm is verified on both a synthetic dataset and real-world images containing similar regions or structures.
Tasks Graph Matching
Published 2017-11-27
URL http://arxiv.org/abs/1711.09584v1
PDF http://arxiv.org/pdf/1711.09584v1.pdf
PWC https://paperswithcode.com/paper/joint-cuts-and-matching-of-partitions-in-one
Repo
Framework

Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems

Title Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems
Authors Kexin Pei, Yinzhi Cao, Junfeng Yang, Suman Jana
Abstract Due to the increasing usage of machine learning (ML) techniques in security- and safety-critical domains, such as autonomous systems and medical diagnosis, ensuring correct behavior of ML systems, especially for different corner cases, is of growing importance. In this paper, we propose a generic framework for evaluating security and robustness of ML systems using different real-world safety properties. We further design, implement and evaluate VeriVis, a scalable methodology that can verify a diverse set of safety properties for state-of-the-art computer vision systems with only blackbox access. VeriVis leverages different input space reduction techniques for efficient verification of different safety properties. VeriVis is able to find thousands of safety violations in fifteen state-of-the-art computer vision systems including ten Deep Neural Networks (DNNs) such as Inception-v3 and Nvidia’s Dave self-driving system with thousands of neurons as well as five commercial third-party vision APIs including Google vision and Clarifai for twelve different safety properties. Furthermore, VeriVis can successfully verify local safety properties, on average, for around 31.7% of the test images. VeriVis finds up to 64.8x more violations than existing gradient-based methods that, unlike VeriVis, cannot ensure non-existence of any violations. Finally, we show that retraining using the safety violations detected by VeriVis can reduce the average number of violations by up to 60.2%.
Tasks Medical Diagnosis
Published 2017-12-05
URL http://arxiv.org/abs/1712.01785v3
PDF http://arxiv.org/pdf/1712.01785v3.pdf
PWC https://paperswithcode.com/paper/towards-practical-verification-of-machine
Repo
Framework