July 27, 2019

Paper Group ANR 594

On the Behavior of Convolutional Nets for Feature Extraction. Deep learning for extracting protein-protein interactions from biomedical literature. Learning Discriminative Alpha-Beta-divergence for Positive Definite Matrices (Extended Version). Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Tran …

On the Behavior of Convolutional Nets for Feature Extraction

Title On the Behavior of Convolutional Nets for Feature Extraction
Authors Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura
Abstract Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data), and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previously learnt by the CNN after processing millions of images, without requiring an expensive training phase. Contributions to this field (commonly known as feature representation transfer or transfer learning) have been purely empirical so far, extracting all CNN features from a single layer close to the output and testing their performance by feeding them to a classifier. This approach has provided consistent results, although its relevance is limited to classification tasks. In a completely different approach, in this paper we statistically measure the discriminative power of every single feature found within a deep CNN, when used for characterizing every class of 11 datasets. We seek to provide new insights into the behavior of CNN features, particularly the ones from convolutional layers, as this can be relevant for their application to knowledge representation and reasoning. Our results confirm that low and middle level features may behave differently to high level features, but only under certain conditions. We find that all CNN features can be used for knowledge representation purposes both by their presence or by their absence, doubling the information a single CNN feature may provide. We also study how much noise these features may include, and propose a thresholding approach to discard most of it. All these insights have a direct application to the generation of CNN embedding spaces.
Tasks Representation Learning, Transfer Learning
Published 2017-03-03
URL http://arxiv.org/abs/1703.01127v4
PDF http://arxiv.org/pdf/1703.01127v4.pdf
PWC https://paperswithcode.com/paper/on-the-behavior-of-convolutional-nets-for
Repo
Framework
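
The abstract above says that a CNN feature can be informative both by its presence and by its absence, and that thresholding can discard most of the noise. A minimal sketch of that idea follows. It standardizes each feature across images and maps it to 1 (present), -1 (absent) or 0 (treated as noise). The function name `discretize_features` and the threshold values `low` and `high` are illustrative assumptions, not values taken from the abstract.

```python
from statistics import mean, pstdev


def discretize_features(activations, low=-0.25, high=0.15):
    """Map each feature value to -1 (absent), 0 (noise) or 1 (present).

    activations: list of rows, one row of feature activations per image.
    low, high: hypothetical thresholds applied to the standardized value.
    """
    columns = list(zip(*activations))
    stats = [(mean(col), pstdev(col)) for col in columns]
    coded_rows = []
    for row in activations:
        coded = []
        for value, (mu, sigma) in zip(row, stats):
            # Standardize per feature; a constant feature carries no signal.
            z = (value - mu) / sigma if sigma > 0 else 0.0
            coded.append(1 if z > high else -1 if z < low else 0)
        coded_rows.append(coded)
    return coded_rows


# Three images, two features; the second feature is constant.
features = [[0.0, 5.0], [0.1, 5.0], [3.0, 5.0]]
codes = discretize_features(features)
```

In this toy input the first feature is coded as absent for the first two images and present for the third, while the constant second feature is coded 0 everywhere.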

Deep learning for extracting protein-protein interactions from biomedical literature

Title Deep learning for extracting protein-protein interactions from biomedical literature
Authors Yifan Peng, Zhiyong Lu
Abstract State-of-the-art methods for protein-protein interaction (PPI) extraction are primarily feature-based or kernel-based by leveraging lexical and syntactic information. But how to incorporate such knowledge in the recent deep learning methods remains an open question. In this paper, we propose a multichannel dependency-based convolutional neural network model (McDepCNN). It applies one channel to the embedding vector of each word in the sentence, and another channel to the embedding vector of the head of the corresponding word. Therefore, the model can use richer information obtained from different channels. Experiments on two public benchmarking datasets, AIMed and BioInfer, demonstrate that McDepCNN compares favorably to the state-of-the-art rich-feature and single-kernel based methods. In addition, McDepCNN achieves 24.4% relative improvement in F1-score over the state-of-the-art methods on cross-corpus evaluation and 12% improvement in F1-score over kernel-based methods on “difficult” instances. These results suggest that McDepCNN generalizes more easily over different corpora, and is capable of capturing long distance features in the sentences.
Tasks
Published 2017-06-05
URL http://arxiv.org/abs/1706.01556v2
PDF http://arxiv.org/pdf/1706.01556v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-extracting-protein-protein
Repo
Framework

Learning Discriminative Alpha-Beta-divergence for Positive Definite Matrices (Extended Version)

Title Learning Discriminative Alpha-Beta-divergence for Positive Definite Matrices (Extended Version)
Authors Anoop Cherian, Panagiotis Stanitsas, Mehrtash Harandi, Vassilios Morellas, Nikolaos Papanikolopoulos
Abstract Symmetric positive definite (SPD) matrices are useful for capturing second-order statistics of visual data. To compare two SPD matrices, several measures are available, such as the affine-invariant Riemannian metric, Jeffreys divergence, Jensen-Bregman logdet divergence, etc.; however, their behaviors may be application dependent, raising the need of manual selection to achieve the best possible performance. Further and as a result of their overwhelming complexity for large-scale problems, computing pairwise similarities by clever embedding of SPD matrices is often preferred to direct use of the aforementioned measures. In this paper, we propose a discriminative metric learning framework, Information Divergence and Dictionary Learning (IDDL), that not only learns application specific measures on SPD matrices automatically, but also embeds them as vectors using a learned dictionary. To learn the similarity measures (which could potentially be distinct for every dictionary atom), we use the recently introduced alpha-beta-logdet divergence, which is known to unify the measures listed above. We propose a novel IDDL objective, that learns the parameters of the divergence and the dictionary atoms jointly in a discriminative setup and is solved efficiently using Riemannian optimization. We showcase extensive experiments on eight computer vision datasets, demonstrating state-of-the-art performances.
Tasks Dictionary Learning, Metric Learning
Published 2017-08-05
URL http://arxiv.org/abs/1708.01741v1
PDF http://arxiv.org/pdf/1708.01741v1.pdf
PWC https://paperswithcode.com/paper/learning-discriminative-alpha-beta-divergence
Repo
Framework

Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation

Title Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation
Authors Jean-Benoit Delbrouck, Stéphane Dupont, Omar Seddati
Abstract In Multimodal Neural Machine Translation (MNMT), a neural model generates a translated sentence that describes an image, given the image itself and one source description in English. This is known as the multimodal image caption translation task. The images are processed with a Convolutional Neural Network (CNN) to extract visual features exploitable by the translation model. So far, the CNNs used have been pre-trained on object detection and localization tasks. We hypothesize that richer architectures, such as dense captioning models, may be more suitable for MNMT and could lead to improved translations. We extend this intuition to the word embeddings, where we compute both linguistic and visual representations for our corpus vocabulary. We combine and compare different confi
Tasks Machine Translation, Object Detection, Word Embeddings
Published 2017-07-04
URL http://arxiv.org/abs/1707.01009v5
PDF http://arxiv.org/pdf/1707.01009v5.pdf
PWC https://paperswithcode.com/paper/visually-grounded-word-embeddings-and-richer
Repo
Framework

Multi-Level and Multi-Scale Feature Aggregation Using Pre-trained Convolutional Neural Networks for Music Auto-tagging

Title Multi-Level and Multi-Scale Feature Aggregation Using Pre-trained Convolutional Neural Networks for Music Auto-tagging
Authors Jongpil Lee, Juhan Nam
Abstract Music auto-tagging is often handled in a similar manner to image classification by regarding the 2D audio spectrogram as image data. However, music auto-tagging is distinguished from image classification in that the tags are highly diverse and have different levels of abstraction. Considering this issue, we propose a convolutional neural network (CNN)-based architecture that embraces multi-level and multi-scale features. The architecture is trained in three steps. First, we conduct supervised feature learning to capture local audio features using a set of CNNs with different input sizes. Second, we extract audio features from each layer of the pre-trained convolutional networks separately and aggregate them all together given a long audio clip. Finally, we put them into fully-connected networks and make final predictions of the tags. Our experiments show that using the combination of multi-level and multi-scale features is highly effective in music auto-tagging and that the proposed method outperforms the previous state of the art on the MagnaTagATune dataset and the Million Song Dataset. We further show that the proposed architecture is useful in transfer learning.
Tasks Image Classification, Music Auto-Tagging, Transfer Learning
Published 2017-03-06
URL http://arxiv.org/abs/1703.01793v2
PDF http://arxiv.org/pdf/1703.01793v2.pdf
PWC https://paperswithcode.com/paper/multi-level-and-multi-scale-feature
Repo
Framework

Symbolic Regression Algorithms with Built-in Linear Regression

Title Symbolic Regression Algorithms with Built-in Linear Regression
Authors Jan Žegklitz, Petr Pošík
Abstract Recently, several algorithms for symbolic regression (SR) emerged which employ a form of multiple linear regression (LR) to produce generalized linear models. The use of LR allows the algorithms to create models with relatively small error right from the beginning of the search; such algorithms are thus claimed to be (sometimes by orders of magnitude) faster than SR algorithms based on vanilla genetic programming. However, a systematic comparison of these algorithms on a common set of problems is still missing. In this paper we conceptually and experimentally compare several representatives of such algorithms (GPTIPS, FFX, and EFS). They are applied as off-the-shelf, ready-to-use techniques, mostly using their default settings. The methods are compared on several synthetic and real-world SR benchmark problems. Their performance is also related to the performance of three conventional machine learning algorithms — multiple regression, random forests and support vector regression.
Tasks
Published 2017-01-13
URL http://arxiv.org/abs/1701.03641v3
PDF http://arxiv.org/pdf/1701.03641v3.pdf
PWC https://paperswithcode.com/paper/symbolic-regression-algorithms-with-built-in
Repo
Framework
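
The abstract above describes symbolic regression methods that fit the weights of candidate terms by multiple linear regression. The sketch below illustrates only that inner step, under the assumption that the candidate basis functions are already given. It is not the procedure of GPTIPS, FFX or EFS, and the helper names `solve_linear_system` and `fit_weights` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_weights(basis, xs, ys):
    """Least-squares weights for y ~ sum_j w_j * basis_j(x) (normal equations)."""
    design = [[f(x) for f in basis] for x in xs]
    k = len(basis)
    gram = [[sum(row[i] * row[j] for row in design) for j in range(k)]
            for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)


# Candidate terms: 1, x, x^2. The data follow y = 2 + 3 x^2 exactly.
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 + 3.0 * x * x for x in xs]
weights = fit_weights(basis, xs, ys)
```

Because the weights come from a direct linear solve, a well-chosen set of terms gives a low-error model immediately. This is the property the abstract credits for the early small error of such algorithms.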

Datenqualität in Regressionsproblemen

Title Datenqualität in Regressionsproblemen
Authors Wolfgang Doneit, Ralf Mikut, Markus Reischl
Abstract Regression models are increasingly built using datasets which do not follow a design of experiment. Instead, the data is e.g. gathered by an automated monitoring of a technical system. As a consequence, already the input data represents phenomena of the system and violates statistical assumptions of distributions. The input data can show correlations, clusters or other patterns. Further, the distribution of input data influences the reliability of regression models. We propose criteria to quantify typical phenomena of input data for regression and show their suitability with simulated benchmark datasets. German version: Regressionen werden zunehmend auf Datensätzen angewendet, deren Eingangsvektoren nicht durch eine statistische Versuchsplanung festgelegt wurden. Stattdessen werden die Daten beispielsweise durch die passive Beobachtung technischer Systeme gesammelt. Damit bilden bereits die Eingangsdaten Phänomene des Systems ab und widersprechen statistischen Verteilungsannahmen. Die Verteilung der Eingangsdaten hat Einfluss auf die Zuverlässigkeit eines Regressionsmodells. Wir stellen deshalb Bewertungskriterien für einige typische Phänomene in Eingangsdaten von Regressionen vor und zeigen ihre Funktionalität anhand simulierter Benchmarkdatensätze.
Tasks
Published 2017-01-16
URL http://arxiv.org/abs/1701.04342v1
PDF http://arxiv.org/pdf/1701.04342v1.pdf
PWC https://paperswithcode.com/paper/datenqualitat-in-regressionsproblemen
Repo
Framework

SADA: A General Framework to Support Robust Causation Discovery with Theoretical Guarantee

Title SADA: A General Framework to Support Robust Causation Discovery with Theoretical Guarantee
Authors Ruichu Cai, Zhenjie Zhang, Zhifeng Hao
Abstract Causation discovery without manipulation is considered a crucial problem in a variety of applications. The state-of-the-art solutions are applicable only when large numbers of samples are available or the problem domain is sufficiently small. Motivated by observations of the local sparsity properties of causal structures, we propose a general Split-and-Merge framework, named SADA, to enhance the scalability of a wide class of causation discovery algorithms. In SADA, the variables are partitioned into subsets by finding a causal cut on the sparse causal structure over the variables. By running mainstream causation discovery algorithms as basic causal solvers on the subproblems, the complete causal structure can be reconstructed by combining the partial results. SADA benefits from the recursive division technique, since each small subproblem generates a more accurate result under the same number of samples. We theoretically prove that SADA always reduces the scale of the problems without sacrificing accuracy, under the condition of local causal sparsity and reliable conditional independence tests. We also present a sufficient condition for accuracy enhancement by SADA, even when the conditional independence tests are vulnerable. Extensive experiments on both simulated and real-world datasets verify the improvements in scalability and accuracy obtained by applying SADA together with existing causation discovery algorithms.
Tasks
Published 2017-07-05
URL http://arxiv.org/abs/1707.01283v1
PDF http://arxiv.org/pdf/1707.01283v1.pdf
PWC https://paperswithcode.com/paper/sada-a-general-framework-to-support-robust
Repo
Framework

Foot anthropometry device and single object image thresholding

Title Foot anthropometry device and single object image thresholding
Authors Amir Mohammad Esmaieeli Sikaroudi, Sasan Ghaffari, Ali Yousefi, Hassan Sadeghi Naeini
Abstract This paper introduces a device, an algorithm and a graphical user interface for obtaining anthropometric measurements of the foot. The presented device makes it easier to obtain the image scale and to process the image by capturing the side of the foot and the underfoot simultaneously in one image. The introduced image processing algorithm minimizes a noise criterion, which is suitable for object detection in single-object images and outperforms well-known image thresholding methods when the lighting conditions are poor. The performance of the image-based method is compared to the manual method. On average, image-based measurements of the underfoot were 4mm smaller than the actual measures. The mean absolute error of the underfoot length was 1.6mm, whereas the length obtained from the side of the foot had a mean absolute error of 4.4mm. Furthermore, based on t-test and F-test results, no significant difference between manual and image-based anthropometry was observed. To maintain the performance of the anthropometry process in different situations, the user interface is designed to handle changes in lighting conditions and to adjust the speed of the algorithm.
Tasks Object Detection
Published 2017-07-10
URL http://arxiv.org/abs/1707.03004v1
PDF http://arxiv.org/pdf/1707.03004v1.pdf
PWC https://paperswithcode.com/paper/foot-anthropometry-device-and-single-object
Repo
Framework

Deep Frame Interpolation

Title Deep Frame Interpolation
Authors Vladislav Samsonov
Abstract This work presents a supervised learning based approach to the computer vision problem of frame interpolation. The presented technique could also be used in cartoon animation, since drawing each individual frame consumes a noticeable amount of time. Most existing solutions to this problem use unsupervised methods and focus only on real-life videos that already have a high frame rate. However, the experiments show that such methods do not work as well when the frame rate becomes low and object displacements between frames become large. This is because interpolating large-displacement motion requires knowledge of the motion structure, so simple techniques such as frame averaging start to fail. In this work, a deep convolutional neural network is used to solve the frame interpolation problem. In addition, it is shown that incorporating prior information such as optical flow improves the interpolation quality significantly.
Tasks Optical Flow Estimation
Published 2017-06-04
URL http://arxiv.org/abs/1706.01159v2
PDF http://arxiv.org/pdf/1706.01159v2.pdf
PWC https://paperswithcode.com/paper/deep-frame-interpolation
Repo
Framework
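
The abstract above notes that frame averaging starts to fail once object displacements between frames become large. The toy example below shows why, using a hypothetical one-row "video" with a single bright pixel. The helper names `average_frames` and `frame_with_dot` are illustrative only and are not taken from the paper.

```python
def average_frames(frame_a, frame_b):
    """Pixel-wise mean of two equally sized frames (lists of rows)."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def frame_with_dot(width, position):
    """A 1 x width frame containing one bright pixel at the given column."""
    return [[1.0 if x == position else 0.0 for x in range(width)]]


# The dot moves from column 1 to column 7 between the two known frames.
first = frame_with_dot(9, 1)
last = frame_with_dot(9, 7)
blended = average_frames(first, last)

# A correct in-between frame would place the dot at column 4.
true_middle = frame_with_dot(9, 4)
```

Averaging produces two half-bright ghosts at columns 1 and 7 and nothing at column 4, where the object should be. Placing the object correctly requires an estimate of its motion, which is where information such as optical flow comes in.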

McGan: Mean and Covariance Feature Matching GAN

Title McGan: Mean and Covariance Feature Matching GAN
Authors Youssef Mroueh, Tom Sercu, Vaibhava Goel
Abstract We introduce new families of Integral Probability Metrics (IPM) for training Generative Adversarial Networks (GAN). Our IPMs are based on matching statistics of distributions embedded in a finite dimensional feature space. Mean and covariance feature matching IPMs allow for stable training of GANs, which we will call McGan. McGan minimizes a meaningful loss between distributions.
Tasks
Published 2017-02-27
URL http://arxiv.org/abs/1702.08398v2
PDF http://arxiv.org/pdf/1702.08398v2.pdf
PWC https://paperswithcode.com/paper/mcgan-mean-and-covariance-feature-matching
Repo
Framework
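
The abstract above describes IPMs that match statistics of distributions embedded in a finite-dimensional feature space. As a rough illustration of the mean-matching case only, the sketch below fixes a hypothetical feature map `feature_map` (McGan learns this map during training) and computes the Euclidean distance between the mean embeddings of two sample sets. By the Cauchy-Schwarz inequality, this distance is the largest gap in expected score that a linear critic with unit Euclidean norm can achieve on those features. Covariance matching and the adversarial training loop are omitted.

```python
import math


def feature_map(x):
    """Hypothetical fixed feature map phi(x) = (x, x^2)."""
    return [x, x * x]


def mean_embedding(samples):
    """Average feature vector of a set of scalar samples."""
    feats = [feature_map(s) for s in samples]
    return [sum(col) / len(feats) for col in zip(*feats)]


def mean_matching_distance(real, fake):
    """Euclidean distance between the mean embeddings of two sample sets."""
    mu_real = mean_embedding(real)
    mu_fake = mean_embedding(fake)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_real, mu_fake)))


real = [-1.0, 0.0, 1.0]
same_mean_wider = [-2.0, 0.0, 2.0]
d_same = mean_matching_distance(real, real)
d_wider = mean_matching_distance(real, same_mean_wider)
```

The two sample sets share the same raw mean, yet the distance is nonzero because the squared feature picks up the difference in spread. This shows how the choice of feature space determines what the matched statistic can detect.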

Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos

Title Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos
Authors Rui Hou, Chen Chen, Mubarak Shah
Abstract Deep learning has been demonstrated to achieve excellent results for image classification and object detection. However, the impact of deep learning on video analysis (e.g. action detection and recognition) has been limited due to the complexity of video data and the lack of annotations. Previous convolutional neural network (CNN) based video action detection approaches usually consist of two major steps: frame-level action proposal detection and association of proposals across frames. Also, these methods employ a two-stream CNN framework to handle spatial and temporal features separately. In this paper, we propose an end-to-end deep network called Tube Convolutional Neural Network (T-CNN) for action detection in videos. The proposed architecture is a unified network that is able to recognize and localize actions based on 3D convolution features. A video is first divided into equal-length clips, and for each clip a set of tube proposals is then generated based on 3D Convolutional Network (ConvNet) features. Finally, the tube proposals of different clips are linked together employing network flow, and spatio-temporal action detection is performed using these linked video proposals. Extensive experiments on several video datasets demonstrate the superior performance of T-CNN for classifying and localizing actions in both trimmed and untrimmed videos compared to the state of the art.
Tasks Action Detection, Image Classification, Object Detection
Published 2017-03-30
URL http://arxiv.org/abs/1703.10664v3
PDF http://arxiv.org/pdf/1703.10664v3.pdf
PWC https://paperswithcode.com/paper/tube-convolutional-neural-network-t-cnn-for
Repo
Framework

Visual Cues to Improve Myoelectric Control of Upper Limb Prostheses

Title Visual Cues to Improve Myoelectric Control of Upper Limb Prostheses
Authors Andrea Gigli, Arjan Gijsberts, Valentina Gregori, Matteo Cognolato, Manfredo Atzori, Barbara Caputo
Abstract The instability of myoelectric signals over time complicates their use to control highly articulated prostheses. To address this problem, studies have tried to combine surface electromyography with modalities that are less affected by the amputation and environment, such as accelerometry or gaze information. In the latter case, the hypothesis is that a subject looks at the object he or she intends to manipulate and that knowing this object’s affordances allows to constrain the set of possible grasps. In this paper, we develop an automated way to detect stable fixations and show that gaze information is indeed helpful in predicting hand movements. In our multimodal approach, we automatically detect stable gazes and segment an object of interest around the subject’s fixation in the visual frame. The patch extracted around this object is subsequently fed through an off-the-shelf deep convolutional neural network to obtain a high level feature representation, which is then combined with traditional surface electromyography in the classification stage. Tests have been performed on a dataset acquired from five intact subjects who performed ten types of grasps on various objects as well as in a functional setting. They show that the addition of gaze information increases the classification accuracy considerably. Further analysis demonstrates that this improvement is consistent for all grasps and concentrated during the movement onset and offset.
Tasks
Published 2017-08-29
URL http://arxiv.org/abs/1709.02236v1
PDF http://arxiv.org/pdf/1709.02236v1.pdf
PWC https://paperswithcode.com/paper/visual-cues-to-improve-myoelectric-control-of
Repo
Framework

When confidence and competence collide: Effects on online decision-making discussions

Title When confidence and competence collide: Effects on online decision-making discussions
Authors Liye Fu, Lillian Lee, Cristian Danescu-Niculescu-Mizil
Abstract Group discussions are a way for individuals to exchange ideas and arguments in order to reach better decisions than they could on their own. One of the premises of productive discussions is that better solutions will prevail, and that the idea selection process is mediated by the (relative) competence of the individuals involved. However, since people may not know their actual competence on a new task, their behavior is influenced by their self-estimated competence — that is, their confidence — which can be misaligned with their actual competence. Our goal in this work is to understand the effects of confidence-competence misalignment on the dynamics and outcomes of discussions. To this end, we design a large-scale natural setting, in the form of an online team-based geography game, that allows us to disentangle confidence from competence and thus separate their effects. We find that in task-oriented discussions, the more-confident individuals have a larger impact on the group’s decisions even when these individuals are at the same level of competence as their teammates. Furthermore, this unjustified role of confidence in the decision-making process often leads teams to under-perform. We explore this phenomenon by investigating the effects of confidence on conversational dynamics.
Tasks Decision Making
Published 2017-02-24
URL http://arxiv.org/abs/1702.07717v2
PDF http://arxiv.org/pdf/1702.07717v2.pdf
PWC https://paperswithcode.com/paper/when-confidence-and-competence-collide
Repo
Framework

Sign-Constrained Regularized Loss Minimization

Title Sign-Constrained Regularized Loss Minimization
Authors Tsuyoshi Kato, Misato Kobayashi, Daisuke Sano
Abstract In practical analysis, domain knowledge about the analysis target has often been accumulated, although, typically, such knowledge is discarded in the statistical analysis stage and the statistical tool is applied as a black box. In this paper, we introduce sign constraints, which are a handy and simple representation for non-experts in generic learning problems. We have developed two new optimization algorithms for sign-constrained regularized loss minimization, called the sign-constrained Pegasos (SC-Pega) and the sign-constrained SDCA (SC-SDCA), by simply inserting a sign correction step into the original Pegasos and SDCA, respectively. We present theoretical analyses that guarantee that inserting the sign correction step does not degrade the convergence rate of either algorithm. Two applications in which sign-constrained learning is effective are presented. The first is the exploitation of prior information about the correlation between explanatory variables and a target variable. The second is the introduction of sign constraints to the SVM-Pairwise method. Experimental results demonstrate a significant improvement in generalization performance from introducing sign constraints in both applications.
Tasks
Published 2017-10-12
URL http://arxiv.org/abs/1710.04380v1
PDF http://arxiv.org/pdf/1710.04380v1.pdf
PWC https://paperswithcode.com/paper/sign-constrained-regularized-loss
Repo
Framework
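
The abstract above says that SC-Pega is obtained by inserting a sign correction step into Pegasos. The sketch below is one plausible reading of that description and not the authors' implementation. It runs a plain Pegasos-style stochastic subgradient update for a hinge-loss linear classifier and then clips each constrained weight to zero if it has the wrong sign. The clipping rule, the function name `sign_constrained_pegasos` and all parameter values are assumptions.

```python
import random


def sign_constrained_pegasos(data, signs, lam=0.1, steps=2000, seed=0):
    """Pegasos-style SGD with an assumed sign correction after every update.

    data: list of (x, y) pairs with x a list of floats and y in {-1, +1}.
    signs: per-weight constraint; +1 forces w[j] >= 0, -1 forces w[j] <= 0,
           and 0 leaves w[j] unconstrained.
    """
    rng = random.Random(seed)
    w = [0.0] * len(signs)
    for t in range(1, steps + 1):
        x, y = rng.choice(data)
        eta = 1.0 / (lam * t)
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        # Shrink step from the L2 regularizer.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Hinge-loss subgradient step when the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y * xj for wj, xj in zip(w, x)]
        # Sign correction (assumed here to be coordinate-wise clipping).
        w = [max(wj, 0.0) if s > 0 else min(wj, 0.0) if s < 0 else wj
             for wj, s in zip(w, signs)]
    return w


# Toy data; the second weight is deliberately constrained to be non-positive.
toy_data = [([1.0, 0.5], 1), ([-1.0, -0.5], -1)]
weights_sc = sign_constrained_pegasos(toy_data, signs=[1, -1])
```

Here the unconstrained solution would give the second feature a positive weight. The correction step holds that weight at zero, and the first weight carries the prediction.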