July 27, 2019

Paper Group ANR 557

Predicting Yelp Star Reviews Based on Network Structure with Deep Learning

Title Predicting Yelp Star Reviews Based on Network Structure with Deep Learning
Authors Luis Perez
Abstract In this paper, we tackle the real-world problem of predicting Yelp star-review ratings based on business features (such as images, descriptions), user features (average previous ratings), and, of particular interest, network properties (which businesses a user has rated before). We compare multiple models on different sets of features – from simple linear regression on network features only to deep learning models on network and item features. In recent years, breakthroughs in deep learning have led to increased accuracy in common supervised learning tasks, such as image classification, captioning, and language understanding. However, the idea of combining deep learning with network features and structure appears to be novel. While the problem of predicting future interactions in a network has been studied at length, these approaches have often ignored either node-specific data or global structure. We demonstrate that a mixed approach combining both node-level features and network information can effectively be used to predict Yelp-review star ratings. We evaluate on the Yelp dataset by splitting our data along the time dimension (as would naturally occur in the real world) and comparing our model against others which do not take advantage of the network structure and/or deep learning.
Tasks Image Classification
Published 2017-12-11
URL http://arxiv.org/abs/1712.04350v1
PDF http://arxiv.org/pdf/1712.04350v1.pdf
PWC https://paperswithcode.com/paper/predicting-yelp-star-reviews-based-on-network
Repo
Framework
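
The time-based evaluation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the author's code: the record fields (`user`, `business`, `stars`, `date`), the sample reviews, and the cutoff date are all assumptions made for illustration.

```python
def temporal_split(reviews, cutoff):
    """Split reviews so every training review predates every test review.

    Dates are assumed to be ISO-8601 strings (YYYY-MM-DD), which sort
    correctly under plain string comparison.
    """
    train = [r for r in reviews if r["date"] < cutoff]
    test = [r for r in reviews if r["date"] >= cutoff]
    return train, test


# Hypothetical sample records; the real Yelp dataset has more fields.
reviews = [
    {"user": "u1", "business": "b1", "stars": 4, "date": "2016-03-01"},
    {"user": "u2", "business": "b1", "stars": 2, "date": "2016-11-15"},
    {"user": "u1", "business": "b2", "stars": 5, "date": "2017-02-20"},
]

train, test = temporal_split(reviews, cutoff="2017-01-01")
```

Splitting on time rather than at random keeps future interactions out of the training set, which matches how such a predictor would be used in practice.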

Bidirectional American Sign Language to English Translation

Title Bidirectional American Sign Language to English Translation
Authors Hardie Cate, Zeshan Hussain
Abstract We outline a bidirectional translation system that converts sentences from American Sign Language (ASL) to English, and vice versa. To perform machine translation between ASL and English, we utilize a generative approach. Specifically, we employ an adjustment to the IBM word-alignment model 1 (IBM WAM1), where we define language models for English and ASL, as well as a translation model, and attempt to generate a translation that maximizes the posterior distribution defined by these models. Then, using these models, we are able to quantify the concepts of fluency and faithfulness of a translation between languages.
Tasks Machine Translation, Word Alignment
Published 2017-01-10
URL http://arxiv.org/abs/1701.02795v1
PDF http://arxiv.org/pdf/1701.02795v1.pdf
PWC https://paperswithcode.com/paper/bidirectional-american-sign-language-to
Repo
Framework
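
The generative approach in the abstract above, a language model combined with an IBM Model 1 translation model, can be sketched as a scoring function over candidate translations. This is a hedged illustration only: the unigram language model, the probability tables, the ASL gloss tokens, and the function names are all assumptions, and the authors' adjusted model may differ.

```python
import math

FLOOR = 1e-12  # assumed smoothing floor for unseen words and pairs


def ibm1_log_likelihood(source, target, t, epsilon=1.0):
    """log P(source | target) under standard IBM Model 1 with a NULL word."""
    target_with_null = ["<null>"] + list(target)
    log_p = math.log(epsilon) - len(source) * math.log(len(target_with_null))
    for s in source:
        # Each source word may align to any target word (or NULL).
        log_p += math.log(sum(t.get((s, w), FLOOR) for w in target_with_null))
    return log_p


def unigram_log_prob(sentence, lm):
    """Fluency score: log probability under a toy unigram language model."""
    return sum(math.log(lm.get(w, FLOOR)) for w in sentence)


def best_translation(source, candidates, t, lm):
    """Pick the candidate maximizing log P(e) + log P(f | e)."""
    return max(
        candidates,
        key=lambda e: unigram_log_prob(e, lm) + ibm1_log_likelihood(source, e, t),
    )


# Hypothetical translation table t(gloss | english) and language model.
t = {("STORE", "store"): 0.9, ("I", "i"): 0.9, ("GO", "go"): 0.9}
lm = {"i": 0.3, "go": 0.3, "store": 0.2, "cat": 0.1, "sleep": 0.1}

source = ["STORE", "I", "GO"]
candidates = [("i", "go", "store"), ("cat", "sleep")]
choice = best_translation(source, candidates, t, lm)
```

In this decomposition the language-model term plays the role of fluency and the translation-model term the role of faithfulness.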

A case study on using speech-to-translation alignments for language documentation

Title A case study on using speech-to-translation alignments for language documentation
Authors Antonios Anastasopoulos, David Chiang
Abstract For many low-resource or endangered languages, spoken language resources are more likely to be annotated with translations than with transcriptions. Recent work exploits such annotations to produce speech-to-translation alignments, without access to any text transcriptions. We investigate whether providing such information can aid in producing better (mismatched) crowdsourced transcriptions, which in turn could be valuable for training speech recognition systems, and show that they can indeed be beneficial through a small-scale case study as a proof-of-concept. We also present a simple phonetically aware string averaging technique that produces transcriptions of higher quality.
Tasks Speech Recognition
Published 2017-02-14
URL http://arxiv.org/abs/1702.04372v1
PDF http://arxiv.org/pdf/1702.04372v1.pdf
PWC https://paperswithcode.com/paper/a-case-study-on-using-speech-to-translation
Repo
Framework

Discriminative convolutional Fisher vector network for action recognition

Title Discriminative convolutional Fisher vector network for action recognition
Authors Petar Palasek, Ioannis Patras
Abstract In this work we propose a novel neural network architecture for the problem of human action recognition in videos. The proposed architecture expresses the processing steps of classical Fisher vector approaches, that is, dimensionality reduction by principal component analysis (PCA) projection, Gaussian mixture model (GMM) and Fisher vector descriptor extraction, as network layers. In contrast to other methods where these steps are performed consecutively and the corresponding parameters are learned in an unsupervised manner, having them defined as a single neural network allows us to refine the whole model discriminatively in an end-to-end fashion. Furthermore, we show that the proposed architecture can be used as a replacement for the fully connected layers in popular convolutional networks, achieving a comparable classification performance, or even significantly surpassing the performance of similar architectures while reducing the total number of trainable parameters by a factor of 5. We show that our method achieves significant improvements in comparison to the classical chain.
Tasks Action Recognition In Videos, Dimensionality Reduction, Temporal Action Localization
Published 2017-07-19
URL http://arxiv.org/abs/1707.06119v1
PDF http://arxiv.org/pdf/1707.06119v1.pdf
PWC https://paperswithcode.com/paper/discriminative-convolutional-fisher-vector
Repo
Framework

Neural network augmented inverse problems for PDEs

Title Neural network augmented inverse problems for PDEs
Authors Jens Berg, Kaj Nyström
Abstract In this paper we show how to augment classical methods for inverse problems with artificial neural networks. The neural network acts as a prior for the coefficient to be estimated from noisy data. Neural networks are global, smooth function approximators and as such they do not require explicit regularization of the error functional to recover smooth solutions and coefficients. We give detailed examples using the Poisson equation in 1, 2, and 3 space dimensions and show that the neural network augmentation is robust with respect to noisy and incomplete data, mesh, and geometry.
Tasks
Published 2017-12-27
URL http://arxiv.org/abs/1712.09685v2
PDF http://arxiv.org/pdf/1712.09685v2.pdf
PWC https://paperswithcode.com/paper/neural-network-augmented-inverse-problems-for
Repo
Framework

Finding phonemes: improving machine lip-reading

Title Finding phonemes: improving machine lip-reading
Authors Helen L. Bear, Richard W. Harvey, Yuxuan Lan
Abstract In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps where each has a different quantity of visemes ranging from two to 45. Viseme classes are based upon the mapping of articulated phonemes, which have been confused during phoneme recognition, into viseme groups. Using these maps, with the LiLIR dataset, we show the effect of changing the viseme map size in speaker-dependent machine lip-reading, measured by word recognition correctness and so demonstrate that word recognition with phoneme classifiers is not just possible, but often better than word recognition with viseme classifiers. Furthermore, there are intermediate units between visemes and phonemes which are better still.
Tasks
Published 2017-10-03
URL http://arxiv.org/abs/1710.01142v1
PDF http://arxiv.org/pdf/1710.01142v1.pdf
PWC https://paperswithcode.com/paper/finding-phonemes-improving-machine-lip
Repo
Framework

Indoor Sound Source Localization with Probabilistic Neural Network

Title Indoor Sound Source Localization with Probabilistic Neural Network
Authors Yingxiang Sun, Jiajia Chen, Chau Yuen, Susanto Rahardja
Abstract It is known that adverse environments such as high reverberation and low signal-to-noise ratio (SNR) pose a great challenge to indoor sound source localization. To address this challenge, in this paper, we propose a sound source localization algorithm based on a probabilistic neural network, namely the Generalized cross correlation Classification Algorithm (GCA). Experimental results for adverse environments with reverberation time T60 up to 600 ms and SNR as low as -10 dB show that the average azimuth angle error and elevation angle error of GCA are only 4.6 degrees and 3.1 degrees, respectively. Compared with three recently published algorithms, GCA significantly increases the success rate of direction of arrival estimation, with good robustness to environmental changes. These results show that the proposed GCA can localize accurately and robustly for diverse indoor applications where the site acoustic features can be studied prior to the localization stage.
Tasks Direction of Arrival Estimation
Published 2017-12-21
URL http://arxiv.org/abs/1712.07814v1
PDF http://arxiv.org/pdf/1712.07814v1.pdf
PWC https://paperswithcode.com/paper/indoor-sound-source-localization-with
Repo
Framework
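
The classifier family named in the abstract above, the probabilistic neural network, can be sketched as a Gaussian Parzen-window classifier. This is an illustration of a generic PNN only: the two-dimensional feature vectors stand in for cross-correlation features, and the class labels, the `sigma` value, and the function name are assumptions rather than details of GCA.

```python
import math


def pnn_classify(x, training, sigma=0.5):
    """Classify x with a probabilistic neural network (Parzen-window form).

    Each training sample contributes a Gaussian kernel to its class; the
    class with the highest average kernel response wins.
    """
    sums, counts = {}, {}
    for features, label in training:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, features))
        kernel = math.exp(-dist_sq / (2.0 * sigma ** 2))
        sums[label] = sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


# Hypothetical features measured at known source directions.
training = [
    ((0.0, 0.0), "left"),
    ((0.1, 0.2), "left"),
    ((1.0, 1.0), "right"),
    ((0.9, 1.1), "right"),
]

predicted = pnn_classify((0.2, 0.1), training)
```

A PNN needs labelled samples from the target room, which is consistent with the abstract's remark that site acoustic features are studied before localization.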

Faster Subgradient Methods for Functions with Hölderian Growth

Title Faster Subgradient Methods for Functions with Hölderian Growth
Authors Patrick R. Johnstone, Pierre Moulin
Abstract The purpose of this manuscript is to derive new convergence results for several subgradient methods applied to minimizing nonsmooth convex functions with Hölderian growth. The growth condition is satisfied in many applications and includes functions with quadratic growth and weakly sharp minima as special cases. To this end there are three main contributions. First, for a constant and sufficiently small stepsize, we show that the subgradient method achieves linear convergence up to a certain region including the optimal set, with error of the order of the stepsize. Second, if appropriate problem parameters are known, we derive a decaying stepsize which obtains a much faster convergence rate than is suggested by the classical $O(1/\sqrt{k})$ result for the subgradient method. Third, we develop a novel “descending stairs” stepsize which obtains this faster convergence rate and also obtains linear convergence for the special case of weakly sharp functions. We also develop an adaptive variant of the “descending stairs” stepsize which achieves the same convergence rate without requiring an error bound constant which is difficult to estimate in practice.
Tasks
Published 2017-04-01
URL http://arxiv.org/abs/1704.00196v3
PDF http://arxiv.org/pdf/1704.00196v3.pdf
PWC https://paperswithcode.com/paper/faster-subgradient-methods-for-functions-with
Repo
Framework
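
The first contribution in the abstract above, convergence with a constant stepsize up to a region whose size is of the order of the stepsize, can be seen on a toy problem. The sketch below runs the plain subgradient method on f(x) = |x|, a simple weakly sharp function. The test function, starting point, and stepsize are assumptions chosen for illustration and are not taken from the paper.

```python
def subgradient_method(subgrad, x0, stepsize, iterations):
    """Run x_{k+1} = x_k - stepsize * g_k with a constant stepsize."""
    x = x0
    history = [x]
    for _ in range(iterations):
        x = x - stepsize * subgrad(x)
        history.append(x)
    return history


def abs_subgrad(x):
    """A subgradient of f(x) = |x| (0 is a valid choice at x = 0)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


history = subgradient_method(abs_subgrad, x0=5.0, stepsize=0.3, iterations=50)

# After the initial descent the iterates stay within one stepsize of the minimizer.
tail_error = max(abs(x) for x in history[20:])
```

With this constant stepsize the iterates approach the minimizer and then oscillate inside a band no wider than the stepsize, which is why decaying or staircase stepsizes are needed to reach the optimum itself.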

Self-adaptive node-based PCA encodings

Title Self-adaptive node-based PCA encodings
Authors Leonard Johard, Victor Rivera, Manuel Mazzara, JooYoung Lee
Abstract In this paper we propose an algorithm, Simple Hebbian PCA, and prove that it is able to calculate the principal component analysis (PCA) in a distributed fashion across nodes. It simplifies existing network structures by removing intralayer weights, essentially cutting the number of weights that need to be trained in half.
Tasks
Published 2017-06-16
URL http://arxiv.org/abs/1708.04498v1
PDF http://arxiv.org/pdf/1708.04498v1.pdf
PWC https://paperswithcode.com/paper/self-adaptive-node-based-pca-encodings
Repo
Framework

Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network

Title Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network
Authors Sharath Adavanne, Archontis Politis, Tuomas Virtanen
Abstract This paper proposes a deep neural network for estimating the directions of arrival (DOA) of multiple sound sources. The proposed stacked convolutional and recurrent neural network (DOAnet) generates a spatial pseudo-spectrum (SPS) along with the DOA estimates in both azimuth and elevation. We avoid any explicit feature extraction step by using the magnitudes and phases of the spectrograms of all the channels as input to the network. The proposed DOAnet is evaluated by estimating the DOAs of multiple concurrently present sources in anechoic, matched and unmatched reverberant conditions. The results show that the proposed DOAnet is capable of estimating the number of sources and their respective DOAs with good precision and of generating an SPS with high signal-to-noise ratio.
Tasks Direction of Arrival Estimation
Published 2017-10-27
URL http://arxiv.org/abs/1710.10059v2
PDF http://arxiv.org/pdf/1710.10059v2.pdf
PWC https://paperswithcode.com/paper/direction-of-arrival-estimation-for-multiple
Repo
Framework

Auto-Encoding User Ratings via Knowledge Graphs in Recommendation Scenarios

Title Auto-Encoding User Ratings via Knowledge Graphs in Recommendation Scenarios
Authors Vito Bellini, Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio
Abstract In the last decade, driven in part by the availability of unprecedented computational power and storage capabilities in cloud environments, we have witnessed the proliferation of new algorithms, methods, and approaches in two areas of artificial intelligence: knowledge representation and machine learning. On the one hand, the generation of a high rate of structured data on the Web led to the creation and publication of so-called knowledge graphs. On the other hand, deep learning emerged as one of the most promising approaches in the generation and training of models that can be applied to a wide variety of application fields. More recently, autoencoders have proven their strength in various scenarios, playing a fundamental role in unsupervised learning. In this paper, we investigate how to exploit the semantic information encoded in a knowledge graph to build connections between units in a neural network, thus leading to a new method, SEM-AUTO, to extract and weigh semantic features that can eventually be used to build a recommender system. As adding content-based side information may mitigate the cold user problem, we tested how our approach behaves in the presence of few ratings from a user on the Movielens 1M dataset and compare results with BPRSLIM.
Tasks Knowledge Graphs, Recommendation Systems
Published 2017-06-24
URL https://arxiv.org/abs/1706.07956v2
PDF https://arxiv.org/pdf/1706.07956v2.pdf
PWC https://paperswithcode.com/paper/auto-encoding-user-ratings-via-knowledge
Repo
Framework

Neural Translation of Musical Style

Title Neural Translation of Musical Style
Authors Iman Malik, Carl Henrik Ek
Abstract Music is an expressive form of communication often used to convey emotion in scenarios where “words are not enough”. Part of this information lies in the musical composition where well-defined language exists. However, a significant amount of information is added during a performance as the musician interprets the composition. The performer injects expressiveness into the written score through variations of different musical properties such as dynamics and tempo. In this paper, we describe a model that can learn to perform sheet music. Our research concludes that the generated performances are indistinguishable from a human performance, thereby passing a test in the spirit of a “musical Turing test”.
Tasks
Published 2017-08-11
URL http://arxiv.org/abs/1708.03535v1
PDF http://arxiv.org/pdf/1708.03535v1.pdf
PWC https://paperswithcode.com/paper/neural-translation-of-musical-style
Repo
Framework

Data Readiness Levels

Title Data Readiness Levels
Authors Neil D. Lawrence
Abstract Application of models to data is fraught. Data-generating collaborators often only have a very basic understanding of the complications of collating, processing and curating data. Challenges include: poor data collection practices, missing values, inconvenient storage mechanisms, intellectual property, security and privacy. All these aspects obstruct the sharing and interconnection of data, and the eventual interpretation of data through machine learning or other approaches. In project reporting, a major challenge is in encapsulating these problems and enabling goals to be built around the processing of data. Project overruns can occur due to failure to account for the amount of time required to curate and collate. But to understand these failures we need to have a common language for assessing the readiness of a particular data set. This position paper proposes the use of data readiness levels: it gives a rough outline of three stages of data preparedness and speculates on how formalisation of these levels into a common language for data readiness could facilitate project management.
Tasks
Published 2017-05-05
URL http://arxiv.org/abs/1705.02245v1
PDF http://arxiv.org/pdf/1705.02245v1.pdf
PWC https://paperswithcode.com/paper/data-readiness-levels
Repo
Framework

Column Generation for Interaction Coverage in Combinatorial Software Testing

Title Column Generation for Interaction Coverage in Combinatorial Software Testing
Authors Serdar Kadioglu
Abstract This paper proposes a novel column generation framework for combinatorial software testing. In particular, it combines Mathematical Programming and Constraint Programming in a hybrid decomposition to generate covering arrays. The approach allows generating parameterized test cases with coverage guarantees between parameter interactions of a given application. Compared to exhaustive testing, combinatorial test case generation reduces the number of tests to run significantly. Our column generation algorithm is generic and can accommodate mixed coverage arrays over heterogeneous alphabets. The algorithm is realized in practice as a cloud service and recognized as one of the five winners of the company-wide cloud application challenge at Oracle. The service is currently helping software developers from a range of different product teams in their testing efforts while exposing declarative constraint models and hybrid optimization techniques to a broader audience.
Tasks
Published 2017-12-19
URL http://arxiv.org/abs/1712.07081v1
PDF http://arxiv.org/pdf/1712.07081v1.pdf
PWC https://paperswithcode.com/paper/column-generation-for-interaction-coverage-in
Repo
Framework
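
The coverage guarantee described in the abstract above can be made concrete with a small checker for pairwise (strength-2) interaction coverage. This sketch only verifies a given test suite; it does not implement the paper's column generation algorithm, and the function name and example parameters are assumptions.

```python
from itertools import combinations, product


def uncovered_pairs(tests, domains):
    """Return the parameter-value pairs that no test row covers.

    domains[i] lists the allowed values of parameter i; the domains may
    have different sizes (heterogeneous alphabets).
    """
    required = set()
    for i, j in combinations(range(len(domains)), 2):
        for a, b in product(domains[i], domains[j]):
            required.add((i, a, j, b))
    for row in tests:
        for i, j in combinations(range(len(row)), 2):
            required.discard((i, row[i], j, row[j]))
    return required


# Three binary parameters: exhaustive testing needs 2**3 = 8 rows,
# but these 4 rows already cover every pair of parameter values.
domains = [(0, 1), (0, 1), (0, 1)]
tests = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

missing = uncovered_pairs(tests, domains)
```

Here `missing` is empty, so four tests give full pairwise coverage where exhaustive testing would need eight; the gap widens quickly as parameters are added.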

Concept Drift Adaptation by Exploiting Historical Knowledge

Title Concept Drift Adaptation by Exploiting Historical Knowledge
Authors Yu Sun, Ke Tang, Zexuan Zhu, Xin Yao
Abstract Incremental learning with concept drift has often been tackled by ensemble methods, where models built in the past can be re-trained to attain new models for the current data. Two design questions need to be addressed in developing ensemble methods for incremental learning with concept drift, i.e., which historical (i.e., previously trained) models should be preserved and how to utilize them. A novel ensemble learning method, namely Diversity and Transfer based Ensemble Learning (DTEL), is proposed in this paper. Given newly arrived data, DTEL uses each preserved historical model as an initial model and further trains it with the new data via transfer learning. Furthermore, DTEL preserves a diverse set of historical models, rather than a set of historical models that are merely accurate in terms of classification accuracy. Empirical studies on 15 synthetic data streams and 4 real-world data streams (all with concept drifts) demonstrate that DTEL can handle concept drift more effectively than 4 other state-of-the-art methods.
Tasks Transfer Learning
Published 2017-02-12
URL http://arxiv.org/abs/1702.03500v1
PDF http://arxiv.org/pdf/1702.03500v1.pdf
PWC https://paperswithcode.com/paper/concept-drift-adaptation-by-exploiting
Repo
Framework