October 18, 2019

3378 words 16 mins read

Paper Group ANR 565

Nonlinear Metric Learning through Geodesic Interpolation within Lie Groups. Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension. Mining Public Opinion about Economic Issues: Twitter and the U.S. Presidential Election. Deep Learning for Radio Resource Allocation in Multi-Cell Networks. A Fast-Converged Acousti …

Nonlinear Metric Learning through Geodesic Interpolation within Lie Groups

Title Nonlinear Metric Learning through Geodesic Interpolation within Lie Groups
Authors Zhewei Wang, Bibo Shi, Charles D. Smith, Jundong Liu
Abstract In this paper, we propose a nonlinear distance metric learning scheme based on the fusion of component linear metrics. Instead of merging displacements at each data point, our model calculates the velocities induced by the component transformations, via a geodesic interpolation on a Lie transformation group. Such velocities are later summed up to produce a global transformation that is guaranteed to be diffeomorphic. Consequently, pair-wise distances computed this way conform to a smooth and spatially varying metric, which can greatly benefit k-NN classification. Experiments on synthetic and real datasets demonstrate the effectiveness of our model.
Tasks Metric Learning
Published 2018-05-12
URL http://arxiv.org/abs/1805.04784v3
PDF http://arxiv.org/pdf/1805.04784v3.pdf
PWC https://paperswithcode.com/paper/nonlinear-metric-learning-through-geodesic
Repo
Framework
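
A minimal sketch of the Lie-algebra fusion idea described in this entry, assuming the component metrics are given as symmetric positive-definite linear maps: sum the matrix logarithms ("velocities") of the component transformations and exponentiate, so the fused map stays invertible. The weights, toy matrices, and the absence of spatially varying weighting are simplifications, not the paper's exact model.

```python
# Hedged sketch: fuse component linear transformations on a Lie group by
# summing their matrix logarithms and exponentiating the result.
import numpy as np
from scipy.linalg import logm, expm

def fuse_transformations(components, weights):
    """Blend linear maps via a weighted sum of their Lie-algebra velocities."""
    velocity = sum(w * logm(A) for A, w in zip(components, weights))
    # invertible by construction: det(expm(V)) = exp(trace(V)) > 0 for real V
    return expm(np.real_if_close(velocity))

# Two toy component metrics (stand-ins for learned linear metrics).
A1 = np.array([[2.0, 0.3], [0.3, 1.0]])
A2 = np.array([[1.0, -0.2], [-0.2, 3.0]])

T = fuse_transformations([A1, A2], weights=[0.6, 0.4])
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
dist = np.linalg.norm(T @ (x - y))   # pair-wise distance under the fused metric
print(T, dist)
```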

Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension

Title Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension
Authors Liang Wang, Sujian Li, Wei Zhao, Kewei Shen, Meng Sun, Ruoyu Jia, Jingming Liu
Abstract Cloze-style reading comprehension has been a popular task for measuring the progress of natural language understanding in recent years. In this paper, we design a novel multi-perspective framework, which can be seen as the joint training of heterogeneous experts that aggregate context information from different perspectives. Each perspective is modeled by a simple aggregation module. The outputs of multiple aggregation modules are fed into a one-timestep pointer network to get the final answer. At the same time, to tackle the problem of insufficient labeled data, we propose an efficient sampling mechanism to automatically generate more training examples by matching the distribution of candidates between labeled and unlabeled data. We conduct our experiments on a recently released cloze-test dataset CLOTH (Xie et al., 2017), which consists of nearly 100k questions designed by professional teachers. Results show that our method achieves new state-of-the-art performance over previous strong baselines.
Tasks Reading Comprehension
Published 2018-08-20
URL http://arxiv.org/abs/1808.06289v1
PDF http://arxiv.org/pdf/1808.06289v1.pdf
PWC https://paperswithcode.com/paper/multi-perspective-context-aggregation-for
Repo
Framework
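
A hedged PyTorch-style sketch of the general pattern this entry describes: several simple aggregation modules ("perspectives") over the context whose fused output scores candidates through a one-timestep pointer. The module choices, dimensions, and mean pooling are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch of multi-perspective aggregation feeding a one-step pointer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiPerspectivePointer(nn.Module):
    def __init__(self, hidden=128, n_perspectives=3):
        super().__init__()
        # each "perspective" is a simple aggregation over the context states
        self.proj = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(n_perspectives))
        self.fuse = nn.Linear(n_perspectives * hidden, hidden)

    def forward(self, context, candidates):
        # context: (batch, seq, hidden), candidates: (batch, n_cand, hidden)
        views = [p(context).mean(dim=1) for p in self.proj]       # mean pooling per perspective
        query = torch.tanh(self.fuse(torch.cat(views, dim=-1)))   # fused query, (batch, hidden)
        scores = torch.bmm(candidates, query.unsqueeze(-1)).squeeze(-1)
        return F.log_softmax(scores, dim=-1)                      # one-step pointer over candidates

model = MultiPerspectivePointer()
out = model(torch.randn(2, 50, 128), torch.randn(2, 4, 128))
print(out.shape)  # (2, 4): a distribution over the four candidates
```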

Mining Public Opinion about Economic Issues: Twitter and the U.S. Presidential Election

Title Mining Public Opinion about Economic Issues: Twitter and the U.S. Presidential Election
Authors Amir Karami, London S. Bennett, Xiaoyun He
Abstract Opinion polls have been the bridge between public opinion and politicians in elections. However, developing surveys to capture people’s feedback on economic issues is limited in scope, expensive, and time-consuming. In recent years, social media such as Twitter has enabled people to share their opinions regarding elections, providing a platform for collecting large amounts of opinion data. This paper proposes a computational public opinion mining approach to explore the discussion of economic issues in social media during an election. Current related studies use text mining methods independently for election analysis and election prediction; this research combines two text mining methods: sentiment analysis and topic modeling. The proposed approach has effectively been deployed on millions of tweets to analyze people’s economic concerns during the 2012 US presidential election.
Tasks Opinion Mining, Sentiment Analysis
Published 2018-02-06
URL http://arxiv.org/abs/1802.01786v1
PDF http://arxiv.org/pdf/1802.01786v1.pdf
PWC https://paperswithcode.com/paper/mining-public-opinion-about-economic-issues
Repo
Framework
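
A minimal sketch of the two text-mining steps this entry combines: topic modeling (here, LDA from scikit-learn) plus a simple per-tweet sentiment score. The tiny lexicon and toy tweets are placeholders, not the authors' pipeline or data.

```python
# Hedged sketch: combine topic modeling and a lexicon-style sentiment score.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "taxes are too high and jobs are scarce",
    "great news on jobs and the economy today",
    "worried about the deficit and unemployment",
]
POS, NEG = {"great", "good", "growth"}, {"worried", "scarce", "high", "unemployment"}

def sentiment(text):
    toks = text.lower().split()
    return sum(t in POS for t in toks) - sum(t in NEG for t in toks)

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(tweets)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
topics = lda.transform(X).argmax(axis=1)        # dominant economic topic per tweet

for tweet, topic in zip(tweets, topics):
    print(topic, sentiment(tweet), tweet)       # topic id, sentiment score, text
```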

Deep Learning for Radio Resource Allocation in Multi-Cell Networks

Title Deep Learning for Radio Resource Allocation in Multi-Cell Networks
Authors K. I. Ahmed, H. Tabassum, E. Hossain
Abstract Increased complexity and heterogeneity of emerging 5G and beyond-5G (B5G) wireless networks will require a paradigm shift from traditional resource allocation mechanisms. Deep learning (DL) is a powerful tool with which a multi-layer neural network can be trained to model a resource management algorithm using network data. Therefore, resource allocation decisions can be obtained without the intensive online computations that would otherwise be required to solve resource allocation problems. In this context, this article focuses on the application of DL to obtain solutions for radio resource allocation problems in multi-cell networks. Starting with a brief overview of a deep neural network (DNN) as a DL model, relevant DNN architectures, and the data training procedure, we provide an overview of existing state-of-the-art work applying DL in the context of radio resource allocation. A qualitative comparison is provided in terms of objectives, inputs/outputs, and learning and data training methods. Then, we present a supervised DL model to solve the sub-band and power allocation problem in a multi-cell network. Using data generated by a genetic algorithm, we first train the model and then test its accuracy in predicting resource allocation solutions. Simulation results show that the trained DL model is able to provide the desired optimal solution 86.3% of the time.
Tasks
Published 2018-08-02
URL http://arxiv.org/abs/1808.00667v1
PDF http://arxiv.org/pdf/1808.00667v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-radio-resource-allocation
Repo
Framework
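
A hedged sketch of the supervised setup this entry describes: a small multi-layer network trained to map channel features to a discrete sub-band decision, with labels that would come from an offline optimizer (the paper uses a genetic algorithm). The feature dimension, layer sizes, and random data below are stand-ins.

```python
# Hedged sketch: supervised DNN for a toy sub-band allocation decision.
import torch
import torch.nn as nn

n_subbands, n_features = 4, 16
X = torch.randn(1024, n_features)             # stand-in channel-state features
y = torch.randint(0, n_subbands, (1024,))     # stand-in "optimal" sub-band labels

model = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, n_subbands))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

acc = (model(X).argmax(dim=1) == y).float().mean()
print(f"training accuracy on the toy data: {acc:.2f}")
```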

A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network

Title A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network
Authors Hosung Park, Donghyun Lee, Minkyu Lim, Yoseb Kang, Juneseok Oh, Ji-Hwan Kim
Abstract In this paper, a time delay neural network (TDNN) based acoustic model is proposed to implement fast-converged acoustic modeling for Korean speech recognition. The TDNN has an advantage in fast convergence when the amount of training data is limited, due to subsampling, which excludes duplicated weights. The TDNN showed an absolute improvement of 2.12% in terms of character error rate compared to feed-forward neural network (FFNN) based modeling on Korean speech corpora. The proposed model converged 1.67 times faster than an FFNN-based model did.
Tasks Speech Recognition
Published 2018-07-11
URL http://arxiv.org/abs/1807.05855v1
PDF http://arxiv.org/pdf/1807.05855v1.pdf
PWC https://paperswithcode.com/paper/a-fast-converged-acoustic-modeling-for-korean
Repo
Framework
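
A hedged sketch of how TDNN layers are commonly realized, as dilated 1-D convolutions over acoustic frames; the dilation stands in for the subsampled temporal contexts the abstract mentions. Layer sizes, contexts, and output targets are illustrative, not the paper's configuration.

```python
# Hedged sketch of a TDNN-style stack as dilated 1-D convolutions.
import torch
import torch.nn as nn

class TinyTDNN(nn.Module):
    def __init__(self, feat_dim=40, hidden=256, n_targets=500):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(feat_dim, hidden, kernel_size=3, dilation=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, dilation=2), nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, dilation=3), nn.ReLU(),
        )
        self.out = nn.Conv1d(hidden, n_targets, kernel_size=1)

    def forward(self, frames):                # frames: (batch, feat_dim, time)
        return self.out(self.net(frames))     # per-frame acoustic-target scores

model = TinyTDNN()
logits = model(torch.randn(4, 40, 200))
print(logits.shape)
```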

Ancient Coin Classification Using Graph Transduction Games

Title Ancient Coin Classification Using Graph Transduction Games
Authors Sinem Aslan, Sebastiano Vascon, Marcello Pelillo
Abstract Recognizing the type of an ancient coin requires theoretical expertise and years of experience in the field of numismatics. Our goal in this work is to automate this time-consuming and demanding task with a visual classification framework. Specifically, we propose to model ancient coin image classification using Graph Transduction Games (GTG). GTG casts the classification problem as a non-cooperative game where the players (the coin images) decide their strategies (class labels) according to the choices made by the others, which results in a global consensus in the final labeling. Experiments are conducted on the only publicly available dataset, which is composed of 180 images of 60 types of Roman coins. We demonstrate that our approach outperforms previous work on the same dataset, with classification accuracies of 73.6% and 87.3% when there are one and two images per class in the training set, respectively.
Tasks Image Classification
Published 2018-10-02
URL http://arxiv.org/abs/1810.01091v1
PDF http://arxiv.org/pdf/1810.01091v1.pdf
PWC https://paperswithcode.com/paper/ancient-coin-classification-using-graph
Repo
Framework
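
A hedged numpy sketch of the graph-transduction-game update via replicator dynamics, which is the standard way GTG is run: each unlabeled item holds a distribution over classes that is reinforced by the similarity-weighted choices of its neighbors. The similarity matrix, labels, and sizes are synthetic.

```python
# Hedged sketch: replicator-dynamics label propagation on a similarity graph.
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3                                   # 6 images, 3 coin types
W = rng.random((n, n)); W = (W + W.T) / 2     # symmetric similarity matrix
np.fill_diagonal(W, 0)

P = np.full((n, k), 1.0 / k)                  # strategies: class distributions
P[0] = [1, 0, 0]; P[1] = [0, 1, 0]            # two labeled anchors stay fixed

for _ in range(100):
    payoff = W @ P                            # support each class receives from neighbors
    P[2:] = P[2:] * payoff[2:]                # replicator update on unlabeled nodes
    P[2:] /= P[2:].sum(axis=1, keepdims=True)

print(P.argmax(axis=1))                       # predicted coin type per image
```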

Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks

Title Unsupervised Learning of Latent Physical Properties Using Perception-Prediction Networks
Authors David Zheng, Vinson Luo, Jiajun Wu, Joshua B. Tenenbaum
Abstract We propose a framework for the completely unsupervised learning of latent object properties from their interactions: the perception-prediction network (PPN). Consisting of a perception module that extracts representations of latent object properties and a prediction module that uses those extracted properties to simulate system dynamics, the PPN can be trained in an end-to-end fashion purely from samples of object dynamics. The representations of latent object properties learned by PPNs not only are sufficient to accurately simulate the dynamics of systems comprised of previously unseen objects, but also can be translated directly into human-interpretable properties (e.g., mass, coefficient of restitution) in an entirely unsupervised manner. Crucially, PPNs also generalize to novel scenarios: their gradient-based training can be applied to many dynamical systems and their graph-based structure functions over systems comprised of different numbers of objects. Our results demonstrate the efficacy of graph-based neural architectures in object-centric inference and prediction tasks, and our model has the potential to discover relevant object properties in systems that are not yet well understood.
Tasks
Published 2018-07-24
URL http://arxiv.org/abs/1807.09244v2
PDF http://arxiv.org/pdf/1807.09244v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-latent-physical
Repo
Framework
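
A hedged sketch of the perception/prediction split this entry describes: an encoder summarizes an observed rollout into a per-object latent code, and a predictor uses the code plus the current state to forecast the next state, trained end to end. The graph-network machinery of the paper is replaced here by a GRU and an MLP for brevity; all shapes and data are placeholders.

```python
# Hedged sketch: encode object rollouts into latent codes, then predict dynamics.
import torch
import torch.nn as nn

state_dim, code_dim, T = 4, 8, 10

perception = nn.GRU(input_size=state_dim, hidden_size=code_dim, batch_first=True)
prediction = nn.Sequential(nn.Linear(state_dim + code_dim, 64), nn.ReLU(),
                           nn.Linear(64, state_dim))

rollout = torch.randn(16, T, state_dim)          # observed object trajectories
_, h = perception(rollout)                       # latent "property" code per object
code = h.squeeze(0)                              # (16, code_dim)
next_state = prediction(torch.cat([rollout[:, -1], code], dim=-1))

loss = nn.functional.mse_loss(next_state, torch.randn(16, state_dim))  # stand-in target
loss.backward()                                  # end-to-end: gradients reach the encoder
print(code.shape, next_state.shape)
```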

Advantages of versatile neural-network decoding for topological codes

Title Advantages of versatile neural-network decoding for topological codes
Authors Nishad Maskara, Aleksander Kubica, Tomas Jochym-O’Connor
Abstract Finding optimal correction of errors in generic stabilizer codes is a computationally hard problem, even for simple noise models. While this task can be simplified for codes with some structure, such as topological stabilizer codes, developing good and efficient decoders still remains a challenge. In our work, we systematically study a very versatile class of decoders based on feedforward neural networks. To demonstrate adaptability, we apply neural decoders to the triangular color and toric codes under various noise models with realistic features, such as spatially-correlated errors. We report that neural decoders provide significant improvement over leading efficient decoders in terms of the error-correction threshold. Using neural networks simplifies the process of designing well-performing decoders, and does not require prior knowledge of the underlying noise model.
Tasks
Published 2018-02-23
URL http://arxiv.org/abs/1802.08680v1
PDF http://arxiv.org/pdf/1802.08680v1.pdf
PWC https://paperswithcode.com/paper/advantages-of-versatile-neural-network
Repo
Framework
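
A hedged sketch of the core idea of a feedforward neural decoder, illustrated on a toy repetition code rather than the color or toric codes of the paper: the network learns to map a measured parity-check syndrome to the most likely error pattern. The code, noise model, and network sizes are illustrative.

```python
# Hedged sketch: feedforward neural decoder for a toy repetition code.
import torch
import torch.nn as nn

n, p = 9, 0.1                                    # 9 data bits, 10% flip probability
H = torch.zeros(n - 1, n)
for i in range(n - 1):                           # adjacent-bit parity checks
    H[i, i] = H[i, i + 1] = 1.0

errors = (torch.rand(8192, n) < p).float()
syndromes = (errors @ H.T) % 2                   # measured syndromes

decoder = nn.Sequential(nn.Linear(n - 1, 64), nn.ReLU(), nn.Linear(64, n))
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
for step in range(300):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(decoder(syndromes), errors)
    loss.backward()
    opt.step()

pred = (decoder(syndromes).sigmoid() > 0.5).float()
print("bitwise accuracy:", (pred == errors).float().mean().item())
```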

The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects

Title The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
Authors Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma
Abstract Understanding the behavior of stochastic gradient descent (SGD) in the context of deep neural networks has attracted considerable attention recently. Along this line, we study a general form of gradient-based optimization dynamics with unbiased noise, which unifies SGD and standard Langevin dynamics. By investigating this general optimization dynamics, we analyze the behavior of SGD in escaping from minima and its regularization effects. A novel indicator is derived to characterize the efficiency of escaping from minima by measuring the alignment of the noise covariance and the curvature of the loss function. Based on this indicator, two conditions are established to show which type of noise structure is superior to isotropic noise in terms of escaping efficiency. We further show that the anisotropic noise in SGD satisfies the two conditions, and thus helps to escape from sharp and poor minima effectively, towards more stable and flat minima that typically generalize well. We systematically design various experiments to verify the benefits of the anisotropic noise, compared with full gradient descent plus isotropic diffusion (i.e., Langevin dynamics).
Tasks
Published 2018-03-01
URL https://arxiv.org/abs/1803.00195v5
PDF https://arxiv.org/pdf/1803.00195v5.pdf
PWC https://paperswithcode.com/paper/the-anisotropic-noise-in-stochastic-gradient
Repo
Framework
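
A hedged restatement of the unified noisy dynamics the abstract refers to, written as a worked equation: the update uses an unbiased stochastic gradient, and the two special cases differ only in the noise covariance. The paper's exact escaping-efficiency indicator is not reproduced here.

```latex
% Sketch of the unified dynamics: unbiased noisy gradient steps.
\[
  \theta_{t+1} = \theta_t - \eta\,\hat{g}_t,
  \qquad
  \hat{g}_t = \nabla L(\theta_t) + \epsilon_t,
  \qquad
  \mathbb{E}[\epsilon_t] = 0,\;\;
  \mathrm{Cov}[\epsilon_t] = \Sigma(\theta_t).
\]
\[
  \text{SGD (anisotropic): } \Sigma(\theta) \approx \tfrac{1}{B}\,
    \mathrm{Cov}_{x}\!\bigl[\nabla \ell(\theta; x)\bigr],
  \qquad
  \text{Langevin (isotropic): } \Sigma(\theta) = \sigma^2 I .
\]
```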

Adaptive Clinical Trials: Exploiting Sequential Patient Recruitment and Allocation

Title Adaptive Clinical Trials: Exploiting Sequential Patient Recruitment and Allocation
Authors Onur Atan, William R. Zame, Mihaela van der Schaar
Abstract Randomized Controlled Trials (RCTs) are the gold standard for comparing the effectiveness of a new treatment to the current one (the control). Most RCTs allocate the patients to the treatment group and the control group by uniform randomization. We show that this procedure can be highly sub-optimal (in terms of learning) if – as is often the case – patients can be recruited in cohorts (rather than all at once), the effects on each cohort can be observed before recruiting the next cohort, and the effects are heterogeneous across identifiable subgroups of patients. We formulate the patient allocation problem as a finite-stage Markov Decision Process in which the objective is to minimize a given weighted combination of type-I and type-II errors. Because finding the exact solution to this Markov Decision Process is computationally intractable, we propose an algorithm – \textit{Knowledge Gradient for Randomized Controlled Trials} (RCT-KG) – that yields an approximate solution. We illustrate our algorithm on a synthetic dataset with Bernoulli outcomes and compare it with uniform randomization. For a given trial size, our method achieves a significant reduction in error, and to achieve a prescribed level of confidence (in identifying whether the treatment is superior to the control), our method requires many fewer patients. Our approach uses what has been learned from the effects on previous cohorts to recruit patients to subgroups and allocate patients (to treatment/control) within subgroups in a way that promotes more efficient learning.
Tasks
Published 2018-10-05
URL http://arxiv.org/abs/1810.02876v2
PDF http://arxiv.org/pdf/1810.02876v2.pdf
PWC https://paperswithcode.com/paper/adaptive-clinical-trials-exploiting
Repo
Framework
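
A hedged toy comparison of uniform versus adaptive cohort allocation with Bernoulli outcomes, in the spirit of the setting described above. The adaptive rule used here (give more of the next cohort to the arm with the wider posterior) is a generic stand-in, not the RCT-KG algorithm; outcome probabilities and cohort sizes are invented for illustration.

```python
# Hedged toy simulation: uniform vs. adaptive cohort allocation, Bernoulli arms.
import numpy as np

rng = np.random.default_rng(1)
p_treat, p_ctrl = 0.55, 0.45
cohorts, cohort_size = 10, 40

def run(adaptive):
    succ, n = np.zeros(2), np.zeros(2)
    for _ in range(cohorts):
        if adaptive and n.min() > 0:
            # Beta(1+s, 1+f) posterior variance per arm: sample more where we know less
            var = (succ + 1) * (n - succ + 1) / ((n + 2) ** 2 * (n + 3))
            share = var / var.sum()
        else:
            share = np.array([0.5, 0.5])            # uniform randomization
        alloc = np.round(share * cohort_size).astype(int)
        succ += rng.binomial(alloc, [p_treat, p_ctrl])
        n += alloc
    return succ / n                                  # estimated success rates

print("uniform :", run(False))
print("adaptive:", run(True))
```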

Machine Learning Distinguishes Neurosurgical Skill Levels in a Virtual Reality Tumor Resection Task

Title Machine Learning Distinguishes Neurosurgical Skill Levels in a Virtual Reality Tumor Resection Task
Authors Samaneh Siyar, Hamed Azarnoush, Saeid Rashidi, Alexandre Winkler-Schwartz, Vincent Bissonnette, Nirros Ponnudurai, Rolando F. Del Maestro
Abstract Background: Virtual reality simulators and machine learning have the potential to augment understanding, assessment and training of psychomotor performance in neurosurgery residents. Objective: This study outlines the first application of machine learning to distinguish “skilled” and “novice” psychomotor performance during a virtual reality neurosurgical task. Methods: Twenty-three neurosurgeons and senior neurosurgery residents comprised the “skilled” group, and 92 junior neurosurgery residents and medical students comprised the “novice” group. The task involved removing a series of virtual brain tumors without causing injury to surrounding tissue. Over 100 features were extracted and 68 selected using t-test analysis. These features were provided to 4 classifiers: K-Nearest Neighbors, Parzen Window, Support Vector Machine, and Fuzzy K-Nearest Neighbors. Equal Error Rate was used to assess classifier performance. Results: Ratios of train set size to test set size from 10% to 90% and 5 to 30 features, chosen by the forward feature selection algorithm, were employed. A working point of a 50% train-to-test set size ratio and 15 features resulted in equal error rates as low as 8.3% using the Fuzzy K-Nearest Neighbors classifier. Conclusion: Machine learning may be one component helping realign the traditional apprenticeship educational paradigm to a more objective model based on proven performance standards. Keywords: Artificial intelligence, Classifiers, Machine learning, Neurosurgery skill assessment, Surgical education, Tumor resection, Virtual reality simulation
Tasks Feature Selection
Published 2018-11-20
URL http://arxiv.org/abs/1811.08159v1
PDF http://arxiv.org/pdf/1811.08159v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-distinguishes-neurosurgical
Repo
Framework
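
A hedged scikit-learn sketch of the evaluation pipeline the abstract outlines: forward feature selection, a k-NN classifier, and an equal-error-rate readout from the ROC curve. The features and labels below are random stand-ins for the simulator metrics, and the classifier settings are assumptions.

```python
# Hedged sketch: forward feature selection + k-NN + equal error rate.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(115, 100))                    # 115 participants, 100 raw features
y = rng.integers(0, 2, size=115)                   # 1 = "skilled", 0 = "novice"

knn = KNeighborsClassifier(n_neighbors=5)
selector = SequentialFeatureSelector(knn, n_features_to_select=15, direction="forward")
X_sel = selector.fit_transform(X, y)

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.5, random_state=0)
scores = knn.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, _ = roc_curve(y_te, scores)
eer = fpr[np.argmin(np.abs(fpr - (1 - tpr)))]      # point where FPR ~= FNR
print(f"equal error rate on the toy data: {eer:.3f}")
```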

Characterizing Interconnections and Linguistic Patterns in Twitter

Title Characterizing Interconnections and Linguistic Patterns in Twitter
Authors Johnnatan Messias
Abstract Social media is considered a democratic space in which people connect and interact with each other regardless of their gender, race, or any other demographic aspect. Despite numerous efforts that explore demographic aspects in social media, it is still unclear whether social media perpetuates old inequalities from the offline world. In this dissertation, we attempt to identify the gender and race of Twitter users located in the United States using advanced image processing algorithms from Face++. We investigate how different demographic groups connect with each other and differentiate them regarding linguistic styles and also their interests. We quantify to what extent one group follows and interacts with another and the extent to which these connections and interactions translate into inequalities on Twitter. We also extract linguistic features from six categories (affective attributes, cognitive attributes, lexical density and awareness, temporal references, social and personal concerns, and interpersonal focus) in order to identify the similarities and the differences in the messages shared on Twitter. Furthermore, we extract the absolute ranking difference of top phrases between demographic groups. As a dimension of diversity, we use the topics of interest that we retrieve from each user. Our analysis shows that users identified as white and male tend to attain higher positions, in terms of the number of followers and the number of times they appear in another user’s lists, on Twitter. There are clear differences in the way of writing across different demographic groups in both the gender and race domains, as well as in the topics of interest. We hope our effort can stimulate the development of new theories of demographic information in the online space. Finally, we developed a Web-based system that leverages the demographic aspects of users to provide transparency to the Twitter trending topics system.
Tasks
Published 2018-03-30
URL http://arxiv.org/abs/1804.00084v1
PDF http://arxiv.org/pdf/1804.00084v1.pdf
PWC https://paperswithcode.com/paper/characterizing-interconnections-and
Repo
Framework

Reducing Sampling Ratios Improves Bagging in Sparse Regression

Title Reducing Sampling Ratios Improves Bagging in Sparse Regression
Authors Luoluo Liu, Sang Peter Chin, Trac D. Tran
Abstract Bagging, a powerful ensemble method from machine learning, improves the performance of unstable predictors. Although the power of Bagging has been shown mostly in classification problems, we demonstrate the success of employing Bagging in sparse regression over the baseline method (L1 minimization). The framework employs a generalized version of the original Bagging with various bootstrap ratios. The performance limits associated with different choices of bootstrap sampling ratio L/m and number of estimates K are analyzed theoretically. Simulation shows that the proposed method yields state-of-the-art recovery performance, outperforming L1 minimization and Bolasso in the challenging case of low levels of measurements. A lower L/m ratio (60% - 90%) leads to better performance, especially with a small number of measurements. With the reduced sampling rate, SNR improves over the original Bagging by up to 24%. With a properly chosen sampling ratio, a reasonably small number of estimates K = 30 gives satisfying results, even though increasing K always improves or at least maintains the performance.
Tasks
Published 2018-12-20
URL http://arxiv.org/abs/1812.08808v4
PDF http://arxiv.org/pdf/1812.08808v4.pdf
PWC https://paperswithcode.com/paper/reducing-sampling-ratios-and-increasing
Repo
Framework
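
A hedged sketch of Bagging for sparse regression with a reduced bootstrap ratio L/m: solve an L1-regularized fit on each subsample of the measurements and average the coefficient estimates. The problem sizes, noise level, and regularization strength are illustrative, not the paper's experimental setup.

```python
# Hedged sketch: bagged Lasso over bootstrap subsamples of the measurements.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_features, m, sparsity = 200, 80, 8               # underdetermined: m < n_features
beta = np.zeros(n_features)
beta[rng.choice(n_features, sparsity, replace=False)] = rng.normal(size=sparsity)
A = rng.normal(size=(m, n_features))
y = A @ beta + 0.05 * rng.normal(size=m)

K, ratio = 30, 0.7                                 # number of estimates, L/m
L = int(ratio * m)
estimates = []
for _ in range(K):
    idx = rng.choice(m, L, replace=True)           # bootstrap subsample of measurements
    estimates.append(Lasso(alpha=0.05, max_iter=5000).fit(A[idx], y[idx]).coef_)
beta_bagged = np.mean(estimates, axis=0)

print("relative recovery error:", np.linalg.norm(beta_bagged - beta) / np.linalg.norm(beta))
```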

Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture

Title Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Authors Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Georgios Tzimiropoulos, Maja Pantic
Abstract Recent works in speech recognition rely either on connectionist temporal classification (CTC) or sequence-to-sequence models for character-level recognition. CTC assumes conditional independence of individual characters, whereas attention-based models can provide nonsequential alignments. Therefore, we could use a CTC loss in combination with an attention-based model in order to force monotonic alignments and at the same time get rid of the conditional independence assumption. In this paper, we use the recently proposed hybrid CTC/attention architecture for audio-visual recognition of speech in-the-wild. To the best of our knowledge, this is the first time that such a hybrid architecture is used for audio-visual recognition of speech. We use the LRS2 database and show that the proposed audio-visual model leads to a 1.3% absolute decrease in word error rate over the audio-only model and achieves new state-of-the-art performance on the LRS2 database (7% word error rate). We also observe that the audio-visual model significantly outperforms the audio-based model (up to 32.9% absolute improvement in word error rate) for several different types of noise as the signal-to-noise ratio decreases.
Tasks Audio-Visual Speech Recognition, Speech Recognition, Visual Speech Recognition
Published 2018-09-28
URL http://arxiv.org/abs/1810.00108v1
PDF http://arxiv.org/pdf/1810.00108v1.pdf
PWC https://paperswithcode.com/paper/audio-visual-speech-recognition-with-a-hybrid
Repo
Framework
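
A hedged PyTorch sketch of the hybrid training objective this entry is built on: a weighted combination of a CTC loss over encoder frames and a cross-entropy (attention-decoder) loss over target characters. The tiny encoder stand-in, the linear "decoder", and the lambda value are placeholders, not the paper's audio-visual model.

```python
# Hedged sketch: hybrid CTC/attention objective = lam * CTC + (1 - lam) * CE.
import torch
import torch.nn as nn

vocab, T_enc, T_dec, batch = 30, 60, 12, 4
encoder_out = torch.randn(batch, T_enc, 128)                 # stand-in fused A/V encoder states

ctc_head = nn.Linear(128, vocab)
att_head = nn.Linear(128, vocab)                             # stands in for the attention decoder

log_probs = ctc_head(encoder_out).log_softmax(-1).transpose(0, 1)   # (T, N, C) for CTCLoss
targets = torch.randint(1, vocab, (batch, T_dec))                   # 0 is reserved as blank
ctc_loss = nn.CTCLoss(blank=0)(log_probs, targets,
                               input_lengths=torch.full((batch,), T_enc),
                               target_lengths=torch.full((batch,), T_dec))

dec_logits = att_head(encoder_out[:, :T_dec])                # pretend decoder states, one per target
att_loss = nn.functional.cross_entropy(dec_logits.reshape(-1, vocab), targets.reshape(-1))

lam = 0.3                                                    # CTC weight
loss = lam * ctc_loss + (1 - lam) * att_loss
loss.backward()
print(float(ctc_loss), float(att_loss), float(loss))
```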

Span error bound for weighted SVM with applications in hyperparameter selection

Title Span error bound for weighted SVM with applications in hyperparameter selection
Authors Ioannis Sarafis, Christos Diou, Anastasios Delopoulos
Abstract Weighted SVM (or fuzzy SVM) is the most widely used SVM variant, owing its effectiveness to the use of instance weights. Proper selection of the instance weights can lead to increased generalization performance. In this work, we extend the span error bound theory to weighted SVM and introduce effective hyperparameter selection methods for the weighted SVM algorithm. The significance of the presented work is that it enables the application of the span bound and span-rule with weighted SVM. The span bound is an upper bound on the leave-one-out error that can be calculated using a single trained SVM model. This is important since the leave-one-out error is an almost unbiased estimator of the test error. Similarly, the span-rule gives the actual value of the leave-one-out error. Thus, one can apply the span bound and span-rule as computationally lightweight alternatives to the leave-one-out procedure for hyperparameter selection. The main theoretical contributions are: (a) we prove the necessary and sufficient condition for the existence of the span of a support vector in weighted SVM; and (b) we prove the extension of the span bound and span-rule to weighted SVM. We experimentally evaluate the span bound and the span-rule for hyperparameter selection and compare them with other methods that are applicable to weighted SVM: $K$-fold cross-validation and the ${\xi}-{\alpha}$ bound. Experiments on 14 benchmark data sets and data sets with importance scores for the training instances show that: (a) the condition for the existence of the span in weighted SVM is satisfied almost always; (b) the span-rule is the most effective method for weighted SVM hyperparameter selection; (c) the span-rule is the best predictor of the test error in the mean square error sense; and (d) the span-rule is efficient and, for certain problems, can be calculated faster than $K$-fold cross-validation.
Tasks
Published 2018-09-17
URL http://arxiv.org/abs/1809.06124v1
PDF http://arxiv.org/pdf/1809.06124v1.pdf
PWC https://paperswithcode.com/paper/span-error-bound-for-weighted-svm-with
Repo
Framework
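
A hedged scikit-learn sketch of the K-fold baseline the paper compares against for weighted-SVM hyperparameter selection; the span-rule itself requires quantities from the trained SVM (support-vector spans) that are not reproduced here. The dataset, instance weights, and C grid are illustrative.

```python
# Hedged sketch: K-fold cross-validation for selecting C in a weighted SVM.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
w = np.random.default_rng(0).uniform(0.2, 1.0, size=len(y))   # instance importance weights

def cv_error(C, n_splits=5):
    errs = []
    for tr, te in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X):
        clf = SVC(kernel="rbf", C=C).fit(X[tr], y[tr], sample_weight=w[tr])
        errs.append(1.0 - clf.score(X[te], y[te]))
    return np.mean(errs)

grid = [0.1, 1.0, 10.0, 100.0]
best_C = min(grid, key=cv_error)
print("selected C:", best_C)
```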