Paper Group ANR 557
Predicting Yelp Star Reviews Based on Network Structure with Deep Learning
Title | Predicting Yelp Star Reviews Based on Network Structure with Deep Learning |
Authors | Luis Perez |
Abstract | In this paper, we tackle the real-world problem of predicting Yelp star-review ratings based on business features (such as images and descriptions), user features (average previous ratings), and, of particular interest, network properties (which businesses a user has rated before). We compare multiple models on different sets of features – from simple linear regression on network features only to deep learning models on network and item features. In recent years, breakthroughs in deep learning have led to increased accuracy in common supervised learning tasks, such as image classification, captioning, and language understanding. However, the idea of combining deep learning with network features and structure appears to be novel. While the problem of predicting future interactions in a network has been studied at length, these approaches have often ignored either node-specific data or global structure. We demonstrate that a mixed approach combining node-level features and network information can effectively predict Yelp-review star ratings. We evaluate on the Yelp dataset by splitting our data along the time dimension (as would naturally occur in the real world) and comparing our model against others that do not take advantage of the network structure and/or deep learning. |
Tasks | Image Classification |
Published | 2017-12-11 |
URL | http://arxiv.org/abs/1712.04350v1 |
PDF | http://arxiv.org/pdf/1712.04350v1.pdf |
PWC | https://paperswithcode.com/paper/predicting-yelp-star-reviews-based-on-network |
Repo | |
Framework | |
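The abstract mentions evaluating by splitting the data along the time dimension. A minimal sketch of such a split is shown below; the record fields (`user`, `business`, `stars`, `timestamp`) and the 80/20 ratio are illustrative assumptions, not details taken from the paper.

```python
from typing import Any, Dict, List, Tuple

Review = Dict[str, Any]


def temporal_split(
    reviews: List[Review], train_fraction: float = 0.8
) -> Tuple[List[Review], List[Review]]:
    """Split reviews so that no training review is later than any test review."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    # Sort by the (hypothetical) timestamp field; ISO dates sort correctly as strings.
    ordered = sorted(reviews, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical records, deliberately out of chronological order.
reviews = [
    {"user": "u1", "business": "b1", "stars": 4, "timestamp": "2016-03-01"},
    {"user": "u2", "business": "b1", "stars": 2, "timestamp": "2015-07-15"},
    {"user": "u1", "business": "b2", "stars": 5, "timestamp": "2017-01-20"},
    {"user": "u3", "business": "b3", "stars": 3, "timestamp": "2016-11-05"},
    {"user": "u2", "business": "b3", "stars": 1, "timestamp": "2014-09-30"},
]

train, test = temporal_split(reviews)
print(len(train), len(test))
```

Splitting by time rather than at random keeps information from future reviews out of training, which mirrors how such a model would be used in practice.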
Bidirectional American Sign Language to English Translation
Title | Bidirectional American Sign Language to English Translation |
Authors | Hardie Cate, Zeshan Hussain |
Abstract | We outline a bidirectional translation system that converts sentences from American Sign Language (ASL) to English, and vice versa. To perform machine translation between ASL and English, we utilize a generative approach. Specifically, we employ an adjustment to the IBM word-alignment model 1 (IBM WAM1), where we define language models for English and ASL, as well as a translation model, and attempt to generate a translation that maximizes the posterior distribution defined by these models. Then, using these models, we are able to quantify the concepts of fluency and faithfulness of a translation between languages. |
Tasks | Machine Translation, Word Alignment |
Published | 2017-01-10 |
URL | http://arxiv.org/abs/1701.02795v1 |
PDF | http://arxiv.org/pdf/1701.02795v1.pdf |
PWC | https://paperswithcode.com/paper/bidirectional-american-sign-language-to |
Repo | |
Framework | |
A case study on using speech-to-translation alignments for language documentation
Title | A case study on using speech-to-translation alignments for language documentation |
Authors | Antonios Anastasopoulos, David Chiang |
Abstract | For many low-resource or endangered languages, spoken language resources are more likely to be annotated with translations than with transcriptions. Recent work exploits such annotations to produce speech-to-translation alignments, without access to any text transcriptions. We investigate whether providing such information can aid in producing better (mismatched) crowdsourced transcriptions, which in turn could be valuable for training speech recognition systems, and show that they can indeed be beneficial through a small-scale case study as a proof-of-concept. We also present a simple phonetically aware string averaging technique that produces transcriptions of higher quality. |
Tasks | Speech Recognition |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04372v1 |
PDF | http://arxiv.org/pdf/1702.04372v1.pdf |
PWC | https://paperswithcode.com/paper/a-case-study-on-using-speech-to-translation |
Repo | |
Framework | |
Discriminative convolutional Fisher vector network for action recognition
Title | Discriminative convolutional Fisher vector network for action recognition |
Authors | Petar Palasek, Ioannis Patras |
Abstract | In this work we propose a novel neural network architecture for the problem of human action recognition in videos. The proposed architecture expresses the processing steps of classical Fisher vector approaches, that is, dimensionality reduction by principal component analysis (PCA) projection, Gaussian mixture model (GMM) and Fisher vector descriptor extraction, as network layers. In contrast to other methods where these steps are performed consecutively and the corresponding parameters are learned in an unsupervised manner, having them defined as a single neural network allows us to refine the whole model discriminatively in an end-to-end fashion. Furthermore, we show that the proposed architecture can be used as a replacement for the fully connected layers in popular convolutional networks, achieving a comparable classification performance, or even significantly surpassing the performance of similar architectures while reducing the total number of trainable parameters by a factor of 5. We show that our method achieves significant improvements in comparison to the classical chain. |
Tasks | Action Recognition In Videos, Dimensionality Reduction, Temporal Action Localization |
Published | 2017-07-19 |
URL | http://arxiv.org/abs/1707.06119v1 |
PDF | http://arxiv.org/pdf/1707.06119v1.pdf |
PWC | https://paperswithcode.com/paper/discriminative-convolutional-fisher-vector |
Repo | |
Framework | |
Neural network augmented inverse problems for PDEs
Title | Neural network augmented inverse problems for PDEs |
Authors | Jens Berg, Kaj Nyström |
Abstract | In this paper we show how to augment classical methods for inverse problems with artificial neural networks. The neural network acts as a prior for the coefficient to be estimated from noisy data. Neural networks are global, smooth function approximators and as such they do not require explicit regularization of the error functional to recover smooth solutions and coefficients. We give detailed examples using the Poisson equation in 1, 2, and 3 space dimensions and show that the neural network augmentation is robust with respect to noisy and incomplete data, mesh, and geometry. |
Tasks | |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09685v2 |
PDF | http://arxiv.org/pdf/1712.09685v2.pdf |
PWC | https://paperswithcode.com/paper/neural-network-augmented-inverse-problems-for |
Repo | |
Framework | |
Finding phonemes: improving machine lip-reading
Title | Finding phonemes: improving machine lip-reading |
Authors | Helen L. Bear, Richard W. Harvey, Yuxuan Lan |
Abstract | In machine lip-reading there is continued debate and research around the correct classes to be used for recognition. In this paper we use a structured approach for devising speaker-dependent viseme classes, which enables the creation of a set of phoneme-to-viseme maps, each with a different number of visemes, ranging from two to 45. Viseme classes are based upon the mapping of articulated phonemes, which have been confused during phoneme recognition, into viseme groups. Using these maps with the LiLIR dataset, we show the effect of changing the viseme map size in speaker-dependent machine lip-reading, measured by word recognition correctness, and so demonstrate that word recognition with phoneme classifiers is not just possible, but often better than word recognition with viseme classifiers. Furthermore, there are intermediate units between visemes and phonemes which are better still. |
Tasks | |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01142v1 |
PDF | http://arxiv.org/pdf/1710.01142v1.pdf |
PWC | https://paperswithcode.com/paper/finding-phonemes-improving-machine-lip |
Repo | |
Framework | |
Indoor Sound Source Localization with Probabilistic Neural Network
Title | Indoor Sound Source Localization with Probabilistic Neural Network |
Authors | Yingxiang Sun, Jiajia Chen, Chau Yuen, Susanto Rahardja |
Abstract | It is known that adverse environments such as high reverberation and low signal-to-noise ratio (SNR) pose a great challenge to indoor sound source localization. To address this challenge, in this paper, we propose a sound source localization algorithm based on a probabilistic neural network, namely the Generalized cross correlation Classification Algorithm (GCA). Experimental results for adverse environments with high reverberation time T60 up to 600 ms and low SNR such as -10 dB show that the average azimuth angle error and elevation angle error of GCA are only 4.6 degrees and 3.1 degrees, respectively. Compared with three recently published algorithms, GCA significantly increases the success rate of direction of arrival estimation, with good robustness to environmental changes. These results show that the proposed GCA can localize accurately and robustly for diverse indoor applications where the site acoustic features can be studied prior to the localization stage. |
Tasks | Direction of Arrival Estimation |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.07814v1 |
PDF | http://arxiv.org/pdf/1712.07814v1.pdf |
PWC | https://paperswithcode.com/paper/indoor-sound-source-localization-with |
Repo | |
Framework | |
Faster Subgradient Methods for Functions with Hölderian Growth
Title | Faster Subgradient Methods for Functions with Hölderian Growth |
Authors | Patrick R. Johnstone, Pierre Moulin |
Abstract | The purpose of this manuscript is to derive new convergence results for several subgradient methods applied to minimizing nonsmooth convex functions with Hölderian growth. The growth condition is satisfied in many applications and includes functions with quadratic growth and weakly sharp minima as special cases. To this end there are three main contributions. First, for a constant and sufficiently small stepsize, we show that the subgradient method achieves linear convergence up to a certain region including the optimal set, with error of the order of the stepsize. Second, if appropriate problem parameters are known, we derive a decaying stepsize which obtains a much faster convergence rate than is suggested by the classical $O(1/\sqrt{k})$ result for the subgradient method. Third, we develop a novel “descending stairs” stepsize which obtains this faster convergence rate and also obtains linear convergence for the special case of weakly sharp functions. We also develop an adaptive variant of the “descending stairs” stepsize which achieves the same convergence rate without requiring an error bound constant, which is difficult to estimate in practice. |
Tasks | |
Published | 2017-04-01 |
URL | http://arxiv.org/abs/1704.00196v3 |
PDF | http://arxiv.org/pdf/1704.00196v3.pdf |
PWC | https://paperswithcode.com/paper/faster-subgradient-methods-for-functions-with |
Repo | |
Framework | |
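The first contribution in the abstract, convergence of the constant-stepsize subgradient method up to a region whose size is of the order of the stepsize, can be illustrated on a toy problem. The sketch below assumes f(x) = |x|, a one-dimensional function with a weakly sharp minimum at 0; the function, starting point, and stepsize are illustrative choices, not taken from the paper, and the code does not implement the decaying or “descending stairs” stepsizes.

```python
from typing import Tuple


def subgradient_abs(x: float) -> float:
    """Return a subgradient of f(x) = |x| (0 is a valid choice at x = 0)."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def constant_step_subgradient(
    x0: float, stepsize: float, iterations: int
) -> Tuple[float, float]:
    """Run x <- x - stepsize * g with g a subgradient of |x|.

    Returns the final iterate and the best objective value seen.
    """
    x = x0
    best = abs(x)
    for _ in range(iterations):
        x = x - stepsize * subgradient_abs(x)
        best = min(best, abs(x))
    return x, best


stepsize = 0.1
x_final, best_value = constant_step_subgradient(x0=1.05, stepsize=stepsize, iterations=100)
print(x_final, best_value)
```

With a constant stepsize the iterates approach the minimizer and then oscillate around it, never leaving a band of width one stepsize; shrinking the stepsize shrinks that band, which is the intuition behind stepsize schedules that decrease over time.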
Self-adaptive node-based PCA encodings
Title | Self-adaptive node-based PCA encodings |
Authors | Leonard Johard, Victor Rivera, Manuel Mazzara, JooYoung Lee |
Abstract | In this paper we propose an algorithm, Simple Hebbian PCA, and prove that it is able to calculate the principal component analysis (PCA) in a distributed fashion across nodes. It simplifies existing network structures by removing intralayer weights, essentially cutting the number of weights that need to be trained in half. |
Tasks | |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1708.04498v1 |
PDF | http://arxiv.org/pdf/1708.04498v1.pdf |
PWC | https://paperswithcode.com/paper/self-adaptive-node-based-pca-encodings |
Repo | |
Framework | |
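For readers unfamiliar with Hebbian approaches to PCA, the sketch below shows the classical Oja's rule for a single neuron, which converges to the leading principal direction of zero-mean data. This is a well-known baseline and is not the Simple Hebbian PCA algorithm proposed in the paper; the synthetic data, learning rate, and function name are assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


def oja_first_component(
    samples: Sequence[Tuple[float, float]], learning_rate: float = 0.01
) -> List[float]:
    """Estimate the leading principal direction of zero-mean 2-D data with Oja's rule."""
    w = [1.0, 0.0]  # arbitrary unit-length starting weights
    for x in samples:
        y = w[0] * x[0] + w[1] * x[1]  # neuron output
        # Hebbian term y * x plus a decay term -y^2 * w that keeps ||w|| near 1.
        w = [w[i] + learning_rate * y * (x[i] - y * w[i]) for i in range(2)]
    norm = math.hypot(w[0], w[1])
    return [w[0] / norm, w[1] / norm]


rng = random.Random(0)
# Synthetic zero-mean data: large variance along (1, 1), small variance along (1, -1).
samples = []
for _ in range(5000):
    a = rng.gauss(0.0, 1.0)
    b = rng.gauss(0.0, 0.1)
    samples.append((a + b, a - b))

direction = oja_first_component(samples)
# Absolute cosine between the estimate and the true leading direction (1, 1) / sqrt(2).
alignment = abs(direction[0] + direction[1]) / math.sqrt(2.0)
print(direction, alignment)
```

Each update uses only the neuron's own input and output, which is what makes Hebbian rules attractive for distributed, node-local computation.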
Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network
Title | Direction of arrival estimation for multiple sound sources using convolutional recurrent neural network |
Authors | Sharath Adavanne, Archontis Politis, Tuomas Virtanen |
Abstract | This paper proposes a deep neural network for estimating the directions of arrival (DOA) of multiple sound sources. The proposed stacked convolutional and recurrent neural network (DOAnet) generates a spatial pseudo-spectrum (SPS) along with the DOA estimates in both azimuth and elevation. We avoid any explicit feature extraction step by using the magnitudes and phases of the spectrograms of all the channels as input to the network. The proposed DOAnet is evaluated by estimating the DOAs of multiple concurrently present sources in anechoic, matched and unmatched reverberant conditions. The results show that the proposed DOAnet is capable of estimating the number of sources and their respective DOAs with good precision and of generating SPS with a high signal-to-noise ratio. |
Tasks | Direction of Arrival Estimation |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10059v2 |
PDF | http://arxiv.org/pdf/1710.10059v2.pdf |
PWC | https://paperswithcode.com/paper/direction-of-arrival-estimation-for-multiple |
Repo | |
Framework | |
Auto-Encoding User Ratings via Knowledge Graphs in Recommendation Scenarios
Title | Auto-Encoding User Ratings via Knowledge Graphs in Recommendation Scenarios |
Authors | Vito Bellini, Vito Walter Anelli, Tommaso Di Noia, Eugenio Di Sciascio |
Abstract | In the last decade, driven also by the availability of unprecedented computational power and storage capabilities in cloud environments, we have witnessed the proliferation of new algorithms, methods, and approaches in two areas of artificial intelligence: knowledge representation and machine learning. On the one side, the generation of a high rate of structured data on the Web led to the creation and publication of the so-called knowledge graphs. On the other side, deep learning emerged as one of the most promising approaches in the generation and training of models that can be applied to a wide variety of application fields. More recently, autoencoders have proven their strength in various scenarios, playing a fundamental role in unsupervised learning. In this paper, we investigate how to exploit the semantic information encoded in a knowledge graph to build connections between units in a neural network, thus leading to a new method, SEM-AUTO, to extract and weigh semantic features that can eventually be used to build a recommender system. As adding content-based side information may mitigate the cold user problem, we tested how our approach behaves in the presence of only a few ratings from a user on the Movielens 1M dataset and compared the results with BPRSLIM. |
Tasks | Knowledge Graphs, Recommendation Systems |
Published | 2017-06-24 |
URL | https://arxiv.org/abs/1706.07956v2 |
PDF | https://arxiv.org/pdf/1706.07956v2.pdf |
PWC | https://paperswithcode.com/paper/auto-encoding-user-ratings-via-knowledge |
Repo | |
Framework | |
Neural Translation of Musical Style
Title | Neural Translation of Musical Style |
Authors | Iman Malik, Carl Henrik Ek |
Abstract | Music is an expressive form of communication often used to convey emotion in scenarios where “words are not enough”. Part of this information lies in the musical composition where well-defined language exists. However, a significant amount of information is added during a performance as the musician interprets the composition. The performer injects expressiveness into the written score through variations of different musical properties such as dynamics and tempo. In this paper, we describe a model that can learn to perform sheet music. Our research concludes that the generated performances are indistinguishable from a human performance, thereby passing a test in the spirit of a “musical Turing test”. |
Tasks | |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03535v1 |
PDF | http://arxiv.org/pdf/1708.03535v1.pdf |
PWC | https://paperswithcode.com/paper/neural-translation-of-musical-style |
Repo | |
Framework | |
Data Readiness Levels
Title | Data Readiness Levels |
Authors | Neil D. Lawrence |
Abstract | Application of models to data is fraught. Data-generating collaborators often only have a very basic understanding of the complications of collating, processing and curating data. Challenges include: poor data collection practices, missing values, inconvenient storage mechanisms, intellectual property, security and privacy. All these aspects obstruct the sharing and interconnection of data, and the eventual interpretation of data through machine learning or other approaches. In project reporting, a major challenge is in encapsulating these problems and enabling goals to be built around the processing of data. Project overruns can occur due to failure to account for the amount of time required to curate and collate. But to understand these failures we need to have a common language for assessing the readiness of a particular data set. This position paper proposes the use of data readiness levels: it gives a rough outline of three stages of data preparedness and speculates on how formalisation of these levels into a common language for data readiness could facilitate project management. |
Tasks | |
Published | 2017-05-05 |
URL | http://arxiv.org/abs/1705.02245v1 |
PDF | http://arxiv.org/pdf/1705.02245v1.pdf |
PWC | https://paperswithcode.com/paper/data-readiness-levels |
Repo | |
Framework | |
Column Generation for Interaction Coverage in Combinatorial Software Testing
Title | Column Generation for Interaction Coverage in Combinatorial Software Testing |
Authors | Serdar Kadioglu |
Abstract | This paper proposes a novel column generation framework for combinatorial software testing. In particular, it combines Mathematical Programming and Constraint Programming in a hybrid decomposition to generate covering arrays. The approach allows generating parameterized test cases with coverage guarantees between parameter interactions of a given application. Compared to exhaustive testing, combinatorial test case generation reduces the number of tests to run significantly. Our column generation algorithm is generic and can accommodate mixed coverage arrays over heterogeneous alphabets. The algorithm is realized in practice as a cloud service and recognized as one of the five winners of the company-wide cloud application challenge at Oracle. The service is currently helping software developers from a range of different product teams in their testing efforts while exposing declarative constraint models and hybrid optimization techniques to a broader audience. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07081v1 |
PDF | http://arxiv.org/pdf/1712.07081v1.pdf |
PWC | https://paperswithcode.com/paper/column-generation-for-interaction-coverage-in |
Repo | |
Framework | |
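To make the notion of interaction coverage concrete, the sketch below checks whether a test suite covers every pair of parameter values (strength-2 coverage). It is a plain verification routine, not the column generation algorithm from the paper, and the parameter names and values are hypothetical.

```python
from itertools import combinations, product
from typing import Dict, List, Sequence, Tuple


def uncovered_pairs(
    parameters: Dict[str, Sequence[str]], tests: List[Dict[str, str]]
) -> List[Tuple[str, str, str, str]]:
    """Return every (param_a, value_a, param_b, value_b) pair that no test exercises."""
    missing = []
    for a, b in combinations(sorted(parameters), 2):
        # Value pairs for parameters a and b that appear together in some test.
        seen = {(t[a], t[b]) for t in tests}
        for va, vb in product(parameters[a], parameters[b]):
            if (va, vb) not in seen:
                missing.append((a, va, b, vb))
    return missing


# Hypothetical application with three two-valued parameters (8 exhaustive tests).
parameters = {
    "os": ["linux", "windows"],
    "browser": ["firefox", "chrome"],
    "db": ["mysql", "postgres"],
}

# Four tests are enough to cover all pairwise interactions.
tests = [
    {"os": "linux", "browser": "firefox", "db": "mysql"},
    {"os": "linux", "browser": "chrome", "db": "postgres"},
    {"os": "windows", "browser": "firefox", "db": "postgres"},
    {"os": "windows", "browser": "chrome", "db": "mysql"},
]

print(uncovered_pairs(parameters, tests))
```

Here four tests achieve full pairwise coverage where exhaustive testing would need eight, which is the kind of reduction the abstract refers to; generating such suites efficiently for larger, mixed alphabets is the hard part that the paper addresses.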
Concept Drift Adaptation by Exploiting Historical Knowledge
Title | Concept Drift Adaptation by Exploiting Historical Knowledge |
Authors | Yu Sun, Ke Tang, Zexuan Zhu, Xin Yao |
Abstract | Incremental learning with concept drift has often been tackled by ensemble methods, where models built in the past can be re-trained to attain new models for the current data. Two design questions need to be addressed in developing ensemble methods for incremental learning with concept drift, i.e., which historical (i.e., previously trained) models should be preserved and how to utilize them. A novel ensemble learning method, namely Diversity and Transfer based Ensemble Learning (DTEL), is proposed in this paper. Given newly arrived data, DTEL uses each preserved historical model as an initial model and further trains it with the new data via transfer learning. Furthermore, DTEL preserves a diverse set of historical models, rather than a set of historical models that are merely accurate in terms of classification accuracy. Empirical studies on 15 synthetic data streams and 4 real-world data streams (all with concept drifts) demonstrate that DTEL can handle concept drift more effectively than 4 other state-of-the-art methods. |
Tasks | Transfer Learning |
Published | 2017-02-12 |
URL | http://arxiv.org/abs/1702.03500v1 |
PDF | http://arxiv.org/pdf/1702.03500v1.pdf |
PWC | https://paperswithcode.com/paper/concept-drift-adaptation-by-exploiting |
Repo | |
Framework | |