Paper Group ANR 68
Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition
Title | Correlated and Individual Multi-Modal Deep Learning for RGB-D Object Recognition |
Authors | Ziyan Wang, Jiwen Lu, Ruogu Lin, Jianjiang Feng, Jie Zhou |
Abstract | In this paper, we propose a new correlated and individual multi-modal deep learning (CIMDL) method for RGB-D object recognition. Unlike most conventional RGB-D object recognition methods which extract features from the RGB and depth channels individually, our CIMDL jointly learns feature representations from raw RGB-D data with a pair of deep neural networks, so that the sharable and modal-specific information can be simultaneously exploited. Specifically, we construct a pair of deep convolutional neural networks (CNNs) for the RGB and depth data, and concatenate them at the top layer of the network with a loss function which learns a new feature space where both the correlated part and the individual part of the RGB-D information are well modelled. The parameters of the whole network are updated using the back-propagation criterion. Experimental results on two widely used RGB-D object image benchmark datasets clearly show that our method outperforms state-of-the-art methods. |
Tasks | Object Recognition |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01655v3 |
http://arxiv.org/pdf/1604.01655v3.pdf | |
PWC | https://paperswithcode.com/paper/correlated-and-individual-multi-modal-deep |
Repo | |
Framework | |
DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild
Title | DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild |
Authors | Rıza Alp Güler, George Trigeorgis, Epameinondas Antonakos, Patrick Snape, Stefanos Zafeiriou, Iasonas Kokkinos |
Abstract | In this paper we propose to learn a mapping from image pixels into a dense template grid through a fully convolutional network. We formulate this task as a regression problem and train our network by leveraging upon manually annotated facial landmarks “in-the-wild”. We use such landmarks to establish a dense correspondence field between a three-dimensional object template and the input image, which then serves as the ground-truth for training our regression system. We show that we can combine ideas from semantic segmentation with regression networks, yielding a highly-accurate “quantized regression” architecture. Our system, called DenseReg, allows us to estimate dense image-to-template correspondences in a fully convolutional manner. As such our network can provide useful correspondence information as a stand-alone system, while when used as an initialization for Statistical Deformable Models we obtain landmark localization results that largely outperform the current state-of-the-art on the challenging 300W benchmark. We thoroughly evaluate our method on a host of facial analysis tasks and also provide qualitative results for dense human body correspondence. We make our code available at http://alpguler.com/DenseReg.html along with supplementary materials. |
Tasks | Semantic Segmentation |
Published | 2016-12-04 |
URL | http://arxiv.org/abs/1612.01202v2 |
http://arxiv.org/pdf/1612.01202v2.pdf | |
PWC | https://paperswithcode.com/paper/densereg-fully-convolutional-dense-shape-1 |
Repo | |
Framework | |
Machine Learning Approach for Skill Evaluation in Robotic-Assisted Surgery
Title | Machine Learning Approach for Skill Evaluation in Robotic-Assisted Surgery |
Authors | Mahtab J. Fard, Sattar Ameri, Ratna B. Chinnam, Abhilash K. Pandya, Michael D. Klein, R. Darin Ellis |
Abstract | Evaluating surgeon skill has predominantly been a subjective task. The development of objective methods for surgical skill assessment is of increasing interest. Recently, with technological advances such as robotic-assisted minimally invasive surgery (RMIS), new opportunities for objective and automated assessment frameworks have arisen. In this paper, we applied machine learning methods to automatically evaluate the performance of the surgeon in RMIS. Six important movement features were used in the evaluation: completion time, path length, depth perception, speed, smoothness and curvature. Different classification methods were applied to discriminate expert and novice surgeons. We test our method on real surgical data for a suturing task and compare the classification result with the ground truth data (obtained by manual labeling). The experimental results show that the proposed framework can classify surgical skill level with a relatively high accuracy of 85.7%. This study demonstrates the ability of machine learning methods to automatically classify expert and novice surgeons using movement features for different RMIS tasks. Due to the simplicity and generalizability of the introduced classification method, it is easy to implement in existing trainers. |
Tasks | |
Published | 2016-11-16 |
URL | http://arxiv.org/abs/1611.05136v1 |
http://arxiv.org/pdf/1611.05136v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-approach-for-skill |
Repo | |
Framework | |
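The feature-then-classify pipeline the abstract describes is straightforward to sketch. The snippet below is an illustrative toy, not the authors' implementation: it computes three of the six movement features named in the abstract (completion time, path length, mean speed) from a tool-tip trajectory, and uses a simple nearest-centroid rule in place of the paper's classifiers; all names and numbers are made up for the example.

```python
import math

def movement_features(traj, dt=1.0):
    """Three of the movement features named in the abstract, computed
    from a 2-D tool trajectory sampled every dt seconds."""
    n = len(traj)
    completion_time = (n - 1) * dt
    path_length = sum(math.dist(traj[i], traj[i + 1]) for i in range(n - 1))
    mean_speed = path_length / completion_time if completion_time else 0.0
    return [completion_time, path_length, mean_speed]

def nearest_centroid(train, labels, x):
    """Toy expert-vs-novice classifier: assign x the label of the closest
    class centroid in feature space (a stand-in for the paper's methods)."""
    cents = {}
    for lab in set(labels):
        rows = [f for f, l in zip(train, labels) if l == lab]
        cents[lab] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(cents, key=lambda lab: math.dist(cents[lab], x))
```

In this toy setup, novice trajectories would show up as longer paths and times for the same task, which is exactly the separation the centroids exploit.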
Generalizing diffuse interface methods on graphs: non-smooth potentials and hypergraphs
Title | Generalizing diffuse interface methods on graphs: non-smooth potentials and hypergraphs |
Authors | Jessica Bosch, Steffen Klamt, Martin Stoll |
Abstract | Diffuse interface methods have recently been introduced for the task of semi-supervised learning. The underlying model is well-known in materials science but was extended to graphs using a Ginzburg–Landau functional and the graph Laplacian. Here we generalize the previously proposed model using a non-smooth potential function. Additionally, we show that the diffuse interface method can be used for the segmentation of data coming from hypergraphs. For this we show that the graph Laplacian in almost all cases is derived from hypergraph information. Additionally, we show that the formerly introduced hypergraph Laplacian coming from a relaxed optimization problem is well suited for use within the diffuse interface method. We present computational experiments for graph and hypergraph Laplacians. |
Tasks | |
Published | 2016-11-18 |
URL | http://arxiv.org/abs/1611.06094v1 |
http://arxiv.org/pdf/1611.06094v1.pdf | |
PWC | https://paperswithcode.com/paper/generalizing-diffuse-interface-methods-on |
Repo | |
Framework | |
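For orientation, the baseline model the paper generalizes can be sketched in a few lines: graph Allen-Cahn dynamics with the *smooth* double-well potential W(u) = (u² − 1)²/4 and a fidelity term on labeled nodes. Note this shows only the standard scheme; the paper's contributions (the non-smooth potential and the hypergraph Laplacian) are not reproduced here, and the step sizes are illustrative.

```python
def graph_allen_cahn(adj, seeds, eps=1.0, dt=0.1, fid=5.0, steps=200):
    """Explicit-Euler sketch of the graph Ginzburg-Landau/Allen-Cahn flow
      u' = -eps*L u - W'(u)/eps - fid*(u - label) on labeled nodes,
    with W'(u) = u**3 - u (smooth double well).  adj is a dense symmetric
    adjacency matrix; seeds maps labeled node index -> +/-1 label."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    u = [seeds.get(i, 0.0) for i in range(n)]
    for _ in range(steps):
        # graph Laplacian applied to u: (L u)_i = deg_i*u_i - sum_j A_ij*u_j
        Lu = [deg[i] * u[i] - sum(adj[i][j] * u[j] for j in range(n))
              for i in range(n)]
        u = [u[i] - dt * (eps * Lu[i] + (u[i] ** 3 - u[i]) / eps
                          + (fid * (u[i] - seeds[i]) if i in seeds else 0.0))
             for i in range(n)]
    return [1 if v >= 0 else -1 for v in u]  # sign gives the segmentation
```

On a graph of two cliques joined by one weak edge, seeding one node per clique is enough for the phase field to settle near ±1 on each cluster.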
Learning Syntactic Program Transformations from Examples
Title | Learning Syntactic Program Transformations from Examples |
Authors | Reudismam Rolim, Gustavo Soares, Loris D’Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, Bjoern Hartmann |
Abstract | IDEs, such as Visual Studio, automate common transformations, such as Rename and Extract Method refactorings. However, extending these catalogs of transformations is complex and time-consuming. A similar phenomenon appears in intelligent tutoring systems where instructors have to write cumbersome code transformations that describe “common faults” to fix similar student submissions to programming assignments. We present REFAZER, a technique for automatically generating program transformations. REFAZER builds on the observation that code edits performed by developers can be used as examples for learning transformations. Example edits may share the same structure but involve different variables and subexpressions, which must be generalized in a transformation at the right level of abstraction. To learn transformations, REFAZER leverages state-of-the-art programming-by-example methodology using the following key components: (a) a novel domain-specific language (DSL) for describing program transformations, (b) domain-specific deductive algorithms for synthesizing transformations in the DSL, and (c) functions for ranking the synthesized transformations. We instantiate and evaluate REFAZER in two domains. First, given examples of edits used by students to fix incorrect programming assignment submissions, we learn transformations that can fix other students’ submissions with similar faults. In our evaluation conducted on 4 programming tasks performed by 720 students, our technique helped to fix incorrect submissions for 87% of the students. In the second domain, we use repetitive edits applied by developers to the same project to synthesize a program transformation that applies these edits to other locations in the code. In our evaluation conducted on 59 scenarios of repetitive edits taken from 3 C# open-source projects, REFAZER learns the intended program transformation in 83% of the cases. |
Tasks | |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.09000v1 |
http://arxiv.org/pdf/1608.09000v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-syntactic-program-transformations |
Repo | |
Framework | |
Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection
Title | Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection |
Authors | Guillermo Garcia-Hernando, Tae-Kyun Kim |
Abstract | A human action can be seen as transitions between one’s body poses over time, where the transition depicts a temporal relation between two poses. Recognizing actions thus involves learning a classifier sensitive to these pose transitions as well as to static poses. In this paper, we introduce a novel method called transition forests, an ensemble of decision trees that learn to discriminate both static poses and transitions between pairs of independent frames. During training, node splitting is driven by alternating two criteria: the standard classification objective that maximizes the discrimination power in individual frames, and the proposed one in pairwise frame transitions. Growing the trees tends to group frames that have similar associated transitions and share the same action label, incorporating temporal information that was not available otherwise. Unlike conventional decision trees where the best split in a node is determined independently of other nodes, the transition forests try to find the best split of nodes jointly (within a layer) for incorporating distant node transitions. When inferring the class label of a new frame, it is passed down the trees and the prediction is made based on previous frame predictions and the current one in an efficient and online manner. We apply our method on varied skeleton action recognition and online detection datasets showing its suitability over several baselines and state-of-the-art approaches. |
Tasks | Action Detection, Spatio-Temporal Action Localization, Temporal Action Localization |
Published | 2016-07-10 |
URL | http://arxiv.org/abs/1607.02737v3 |
http://arxiv.org/pdf/1607.02737v3.pdf | |
PWC | https://paperswithcode.com/paper/transition-forests-learning-discriminative |
Repo | |
Framework | |
Mental Lexicon Growth Modelling Reveals the Multiplexity of the English Language
Title | Mental Lexicon Growth Modelling Reveals the Multiplexity of the English Language |
Authors | Massimo Stella, Markus Brede |
Abstract | In this work we extend previous analyses of linguistic networks by adopting a multi-layer network framework for modelling the human mental lexicon, i.e. an abstract mental repository where words and concepts are stored together with their linguistic patterns. Across a three-layer linguistic multiplex, we model English words as nodes and connect them according to (i) phonological similarities, (ii) synonym relationships and (iii) free word associations. Our main aim is to exploit this multi-layered structure to explore the influence of phonological and semantic relationships on lexicon assembly over time. We propose a model of lexicon growth which is driven by the phonological layer: words are suggested according to different orderings of insertion (e.g. shorter word length, highest frequency, semantic multiplex features) and accepted or rejected subject to constraints. We then measure times of network assembly and compare these to empirical data about the age of acquisition of words. In agreement with empirical studies in psycholinguistics, our results provide quantitative evidence for the hypothesis that word acquisition is driven by features at multiple levels of organisation within language. |
Tasks | |
Published | 2016-04-05 |
URL | http://arxiv.org/abs/1604.01243v1 |
http://arxiv.org/pdf/1604.01243v1.pdf | |
PWC | https://paperswithcode.com/paper/mental-lexicon-growth-modelling-reveals-the |
Repo | |
Framework | |
An Optimal Treatment Assignment Strategy to Evaluate Demand Response Effect
Title | An Optimal Treatment Assignment Strategy to Evaluate Demand Response Effect |
Authors | Pan Li, Baosen Zhang |
Abstract | Demand response is designed to motivate electricity customers to modify their loads at critical time periods. The accurate estimation of the impact of demand response signals on customers’ consumption is central to any successful program. In practice, learning these responses is nontrivial because operators can only send a limited number of signals. In addition, customer behavior also depends on a large number of exogenous covariates. These two features lead to a high-dimensional inference problem with a limited number of observations. In this paper, we formulate this problem by using a multivariate linear model and adopt an experimental design approach to estimate the impact of demand response signals. We show that randomized assignment, which is widely used to estimate the average treatment effect, is not efficient in reducing the variance of the estimator when a large number of covariates is present. In contrast, we present a tractable algorithm that strategically assigns demand response signals to customers. This algorithm achieves the optimal reduction in estimation variance, independent of the number of covariates. The results are validated from simulations on synthetic data. |
Tasks | |
Published | 2016-10-02 |
URL | http://arxiv.org/abs/1610.00362v2 |
http://arxiv.org/pdf/1610.00362v2.pdf | |
PWC | https://paperswithcode.com/paper/an-optimal-treatment-assignment-strategy-to |
Repo | |
Framework | |
A Bayesian Model of Multilingual Unsupervised Semantic Role Induction
Title | A Bayesian Model of Multilingual Unsupervised Semantic Role Induction |
Authors | Nikhil Garg, James Henderson |
Abstract | We propose a Bayesian model of unsupervised semantic role induction in multiple languages, and use it to explore the usefulness of parallel corpora for this task. Our joint Bayesian model consists of individual models for each language plus additional latent variables that capture alignments between roles across languages. Because it is a generative Bayesian model, we can do evaluations in a variety of scenarios just by varying the inference procedure, without changing the model, thereby comparing the scenarios directly. We compare using only monolingual data, using a parallel corpus, using a parallel corpus with annotations in the other language, and using small amounts of annotation in the target language. We find that the biggest impact of adding a parallel corpus to training is actually the increase in monolingual data, with the alignments to another language resulting in small improvements, even with labeled data for the other language. |
Tasks | |
Published | 2016-03-04 |
URL | http://arxiv.org/abs/1603.01514v1 |
http://arxiv.org/pdf/1603.01514v1.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-model-of-multilingual-unsupervised |
Repo | |
Framework | |
On the Nyström and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Data Sets
Title | On the Nyström and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Data Sets |
Authors | Darren Homrighausen, Daniel J. McDonald |
Abstract | In this paper we analyze approximate methods for undertaking a principal components analysis (PCA) on large data sets. PCA is a classical dimension reduction method that involves the projection of the data onto the subspace spanned by the leading eigenvectors of the covariance matrix. This projection can be used either for exploratory purposes or as an input for further analysis, e.g. regression. If the data have billions of entries or more, the computational and storage requirements for saving and manipulating the design matrix in fast memory are prohibitive. Recently, the Nyström and column-sampling methods have appeared in the numerical linear algebra community for the randomized approximation of the singular value decomposition of large matrices. However, their utility for statistical applications remains unclear. We compare these approximations theoretically by bounding the distance between the induced subspaces and the desired, but computationally infeasible, PCA subspace. Additionally we show empirically, through simulations and a real data example involving a corpus of emails, the trade-off of approximation accuracy and computational complexity. |
Tasks | Dimensionality Reduction |
Published | 2016-02-02 |
URL | http://arxiv.org/abs/1602.01120v1 |
http://arxiv.org/pdf/1602.01120v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-nystrom-and-column-sampling-methods |
Repo | |
Framework | |
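The Nyström idea the abstract refers to can be shown on a small example: approximate a Gram/covariance matrix as G ≈ C W⁻¹ Cᵀ from a few sampled columns C (with W the sampled-row/column intersection), then take the leading eigenvector of the approximation instead of that of the full matrix. This is a minimal sketch, not the paper's analysis: it handles only two sampled columns so W can be inverted with the 2×2 formula, whereas a real implementation would use a pseudo-inverse.

```python
def power_iter(mat, iters=200):
    """Leading eigenvector of a small symmetric matrix by power iteration."""
    n = len(mat)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def nystrom(G, idx):
    """Nyström approximation G ~ C W^{-1} C^T from two sampled columns."""
    i0, i1 = idx
    n = len(G)
    C = [[G[r][i0], G[r][i1]] for r in range(n)]
    a, b, c, d = G[i0][i0], G[i0][i1], G[i1][i0], G[i1][i1]
    det = a * d - b * c                      # W must be invertible
    Winv = [[d / det, -b / det], [-c / det, a / det]]
    CW = [[C[r][0] * Winv[0][l] + C[r][1] * Winv[1][l] for l in range(2)]
          for r in range(n)]
    return [[CW[r][0] * C[s][0] + CW[r][1] * C[s][1] for s in range(n)]
            for r in range(n)]
```

When G is (numerically) low rank and the sampled columns span its range, the Nyström reconstruction is exact, so the approximate leading principal direction lines up with the true one.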
Multi30K: Multilingual English-German Image Descriptions
Title | Multi30K: Multilingual English-German Image Descriptions |
Authors | Desmond Elliott, Stella Frank, Khalil Sima’an, Lucia Specia |
Abstract | We introduce the Multi30K dataset to stimulate multilingual multimodal research. Recent advances in image description have been demonstrated on English-language datasets almost exclusively, but image description should not be limited to English. This dataset extends the Flickr30K dataset with i) German translations created by professional translators over a subset of the English descriptions, and ii) descriptions crowdsourced independently of the original English descriptions. We outline how the data can be used for multilingual image description and multimodal machine translation, but we anticipate the data will be useful for a broader range of tasks. |
Tasks | Machine Translation, Multimodal Machine Translation |
Published | 2016-05-02 |
URL | http://arxiv.org/abs/1605.00459v1 |
http://arxiv.org/pdf/1605.00459v1.pdf | |
PWC | https://paperswithcode.com/paper/multi30k-multilingual-english-german-image |
Repo | |
Framework | |
Deep counter networks for asynchronous event-based processing
Title | Deep counter networks for asynchronous event-based processing |
Authors | Jonathan Binas, Giacomo Indiveri, Michael Pfeiffer |
Abstract | Despite their advantages in terms of computational resources, latency, and power consumption, event-based implementations of neural networks have not been able to achieve the same performance figures as their equivalent state-of-the-art deep network models. We propose counter neurons as minimal spiking neuron models which only require addition and comparison operations, thus avoiding costly multiplications. We show how inference carried out in deep counter networks converges to the same accuracy levels as are achieved with state-of-the-art conventional networks. As their event-based style of computation leads to reduced latency and sparse updates, counter networks are ideally suited for efficient compact and low-power hardware implementation. We present theory and training methods for counter networks, and demonstrate on the MNIST benchmark that counter networks converge quickly, both in terms of time and number of operations required, to state-of-the-art classification accuracy. |
Tasks | |
Published | 2016-11-02 |
URL | http://arxiv.org/abs/1611.00710v1 |
http://arxiv.org/pdf/1611.00710v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-counter-networks-for-asynchronous-event |
Repo | |
Framework | |
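The "addition and comparison only" claim is easy to illustrate. Below is a hypothetical single layer of counter neurons, not the authors' exact model: because inputs are binary events, each event just adds a weight column to integer counters (no multiplications), and a neuron emits an output event whenever its counter crosses threshold, after which the threshold is subtracted.

```python
def counter_layer(weights, thresholds, events):
    """Sketch of a layer of counter neurons.
    weights[i][j]: contribution of input i to neuron j (integers);
    thresholds[j]: firing threshold of neuron j;
    events: sequence of input indices, one spiking input per time step.
    Returns (time, neuron) pairs for emitted output events."""
    n_out = len(thresholds)
    counters = [0] * n_out
    out_events = []
    for t, i in enumerate(events):
        for j in range(n_out):
            counters[j] += weights[i][j]        # addition only
            if counters[j] >= thresholds[j]:    # comparison only
                counters[j] -= thresholds[j]
                out_events.append((t, j))
    return out_events
```

Stacking such layers (feeding output events into the next layer's event stream) gives the deep, sparsely updated networks the abstract describes.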
Compact-Table: Efficiently Filtering Table Constraints with Reversible Sparse Bit-Sets
Title | Compact-Table: Efficiently Filtering Table Constraints with Reversible Sparse Bit-Sets |
Authors | Jordan Demeulenaere, Renaud Hartert, Christophe Lecoutre, Guillaume Perez, Laurent Perron, Jean-Charles Régin, Pierre Schaus |
Abstract | In this paper, we describe Compact-Table (CT), a bitwise algorithm to enforce Generalized Arc Consistency (GAC) on table constraints. Although this algorithm is the default propagator for table constraints in or-tools and OscaR, two publicly available CP solvers, it has never been described so far. Importantly, CT has been recently improved further with the introduction of residues, resetting operations and a data structure called a reversible sparse bit-set, used to maintain tables of supports (following the idea of tabular reduction): tuples are invalidated incrementally on value removals by means of bit-set operations. The experimentation that we have conducted with OscaR shows that CT outperforms state-of-the-art algorithms STR2, STR3, GAC4R, MDD4R and AC5-TC on standard benchmarks. |
Tasks | |
Published | 2016-04-22 |
URL | http://arxiv.org/abs/1604.06641v1 |
http://arxiv.org/pdf/1604.06641v1.pdf | |
PWC | https://paperswithcode.com/paper/compact-table-efficiently-filtering-table |
Repo | |
Framework | |
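The core bit-set idea behind CT can be sketched compactly using Python's arbitrary-precision integers as bit-sets. This is a toy illustration of the support-maintenance step only: it omits residues, resetting operations, and the reversibility machinery (trailing) that make the real data structure efficient inside a solver.

```python
class CompactTableSketch:
    """Toy sketch of CT's support maintenance.
    Bit t of `valid` marks tuple t of the table as still valid;
    supports[(var, val)] marks the tuples in which var takes val.
    Removing a value invalidates its tuples with one AND-NOT, and a value
    is GAC-supported iff its mask still intersects `valid`."""

    def __init__(self, table):
        self.valid = (1 << len(table)) - 1          # all tuples valid
        self.supports = {}
        for t, tup in enumerate(table):
            for var, val in enumerate(tup):
                self.supports[(var, val)] = (
                    self.supports.get((var, val), 0) | (1 << t)
                )

    def remove_value(self, var, val):
        # invalidate every tuple where var = val (single bitwise operation)
        self.valid &= ~self.supports.get((var, val), 0)

    def has_support(self, var, val):
        return self.supports.get((var, val), 0) & self.valid != 0
```

For example, on the table {(0,0), (0,1), (1,0)} over variables x0, x1, removing value 0 from x0 leaves only tuple (1,0) valid, so value 1 for x1 loses its support.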
Learning when to trust distant supervision: An application to low-resource POS tagging using cross-lingual projection
Title | Learning when to trust distant supervision: An application to low-resource POS tagging using cross-lingual projection |
Authors | Meng Fang, Trevor Cohn |
Abstract | Cross-lingual projection of linguistic annotation suffers from many sources of bias and noise, leading to unreliable annotations that cannot be used directly. In this paper, we introduce a novel approach to sequence tagging that learns to correct the errors from cross-lingual projection using an explicit debiasing layer. This is framed as joint learning over two corpora, one tagged with gold standard and the other with projected tags. We evaluated with only 1,000 tokens tagged with gold standard tags, along with more plentiful parallel data. Our system equals or exceeds the state-of-the-art on eight simulated low-resource settings, as well as two real low-resource languages, Malagasy and Kinyarwanda. |
Tasks | |
Published | 2016-07-05 |
URL | http://arxiv.org/abs/1607.01133v1 |
http://arxiv.org/pdf/1607.01133v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-when-to-trust-distant-supervision-an |
Repo | |
Framework | |
Graph-based semi-supervised learning for relational networks
Title | Graph-based semi-supervised learning for relational networks |
Authors | Leto Peel |
Abstract | We address the problem of semi-supervised learning in relational networks, networks in which nodes are entities and links are the relationships or interactions between them. Typically this problem is confounded with the problem of graph-based semi-supervised learning (GSSL), because both problems represent the data as a graph and predict the missing class labels of nodes. However, not all graphs are created equally. In GSSL a graph is constructed, often from independent data, based on similarity. As such, edges tend to connect instances with the same class label. Relational networks, however, can be more heterogeneous and edges do not always indicate similarity. For instance, instead of links being more likely to connect nodes with the same class label, they may occur more frequently between nodes with different class labels (link-heterogeneity). Or nodes with the same class label do not necessarily have the same type of connectivity across the whole network (class-heterogeneity), e.g. in a network of sexual interactions we may observe links between opposite genders in some parts of the graph and links between the same genders in others. Performing classification in networks with different types of heterogeneity is a hard problem that is made harder still when we do not know a priori the type or level of heterogeneity. Here we present two scalable approaches for graph-based semi-supervised learning for the more general case of relational networks. We demonstrate these approaches on synthetic and real-world networks that display different link patterns within and between classes. Compared to state-of-the-art approaches, ours give better classification performance without prior knowledge of how classes interact. In particular, our two-step label propagation algorithm gives consistently good accuracy and runs on networks of over 1.6 million nodes and 30 million edges in around 12 seconds. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.05001v1 |
http://arxiv.org/pdf/1612.05001v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-based-semi-supervised-learning-for |
Repo | |
Framework | |
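For contrast with the paper's contribution, here is the plain homophily-assuming GSSL baseline the abstract argues against: iterative label propagation in which each unlabeled node averages its neighbours' scores while seed nodes stay clamped. This is a generic baseline sketch, not the paper's two-step algorithm, and it works only when links connect same-class nodes (exactly the assumption the paper relaxes).

```python
def label_propagation(adj, seeds, iters=50):
    """Homophily-based label propagation on a dense adjacency matrix.
    seeds maps labeled node index -> +/-1; unlabeled nodes repeatedly
    take the mean score of their neighbours (the harmonic solution)."""
    n = len(adj)
    score = [seeds.get(i, 0.0) for i in range(n)]
    for _ in range(iters):
        new = []
        for i in range(n):
            if i in seeds:                      # clamp labeled nodes
                new.append(seeds[i])
                continue
            nbrs = [j for j in range(n) if adj[i][j]]
            new.append(sum(score[j] for j in nbrs) / len(nbrs) if nbrs else 0.0)
        score = new
    return [1 if s >= 0 else -1 for s in score]  # sign = predicted class
```

On a network with link-heterogeneity (edges mostly *between* classes), this baseline propagates the wrong labels, which is why the paper's methods estimate how classes interact instead of assuming similarity.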