Paper Group ANR 354
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Title | Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input |
Authors | Junliang Guo, Xu Tan, Di He, Tao Qin, Linli Xu, Tie-Yan Liu |
Abstract | Non-autoregressive translation (NAT) models, which remove the dependence on previous target tokens from the inputs of the decoder, achieve significant inference speedup but at the cost of inferior accuracy compared to autoregressive translation (AT) models. Previous work shows that the quality of the inputs of the decoder is important and largely impacts the model accuracy. In this paper, we propose two methods to enhance the decoder inputs so as to improve NAT models. The first one directly leverages a phrase table generated by conventional SMT approaches to translate source tokens to target tokens, which are then fed into the decoder as inputs. The second one transforms source-side word embeddings to target-side word embeddings through sentence-level alignment and word-level adversary learning, and then feeds the transformed word embeddings into the decoder as inputs. Experimental results show our method largely outperforms the NAT baseline (Gu et al., 2017) by 5.11 BLEU points on the WMT14 English-German task and 4.72 BLEU points on the WMT16 English-Romanian task. |
Tasks | Machine Translation, Word Embeddings |
Published | 2018-12-23 |
URL | http://arxiv.org/abs/1812.09664v1 |
http://arxiv.org/pdf/1812.09664v1.pdf | |
PWC | https://paperswithcode.com/paper/non-autoregressive-neural-machine-translation |
Repo | |
Framework | |
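The phrase-table idea is easy to picture in code: each source token is mapped to a likely target token with an SMT-style translation table, and the embeddings of the mapped tokens become the NAT decoder inputs. The sketch below is a minimal illustration under that reading; `build_decoder_inputs`, the table format and the toy embeddings are hypothetical, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): build NAT decoder inputs by
# mapping each source token to a likely target token via an SMT-style
# translation table, then embedding the mapped tokens.
import numpy as np

def build_decoder_inputs(src_tokens, phrase_table, tgt_embeddings, unk="<unk>"):
    """phrase_table: dict mapping a source token to its most probable
    target token (hypothetical format); tgt_embeddings: dict token -> vector."""
    mapped = [phrase_table.get(tok, unk) for tok in src_tokens]
    dim = len(next(iter(tgt_embeddings.values())))
    return np.stack([tgt_embeddings.get(t, np.zeros(dim)) for t in mapped])

# toy usage
phrase_table = {"hello": "hallo", "world": "welt"}
tgt_embeddings = {"hallo": np.ones(4), "welt": np.full(4, 2.0)}
inputs = build_decoder_inputs(["hello", "world"], phrase_table, tgt_embeddings)
print(inputs.shape)  # (2, 4) -- one target-side embedding per source position
```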
On the Estimation of Entropy in the FastICA Algorithm
Title | On the Estimation of Entropy in the FastICA Algorithm |
Authors | Elena Issoglio, Paul Smith, Jochen Voss |
Abstract | The fastICA method is a popular dimension reduction technique used to reveal patterns in data. Here we show both theoretically and in practice that the approximations used in fastICA can result in patterns not being successfully recognised. We demonstrate this problem using a two-dimensional example where a clear structure is immediately visible to the naked eye, but where the projection chosen by fastICA fails to reveal this structure. This implies that care is needed when applying fastICA. We discuss how the problem arises and how it is intrinsically connected to the approximations that form the basis of the computational efficiency of fastICA. |
Tasks | Dimensionality Reduction |
Published | 2018-05-25 |
URL | https://arxiv.org/abs/1805.10206v4 |
https://arxiv.org/pdf/1805.10206v4.pdf | |
PWC | https://paperswithcode.com/paper/on-the-estimation-of-entropy-in-the-fastica |
Repo | |
Framework | |
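FastICA does not estimate differential entropy directly; it maximises a fixed-nonlinearity negentropy approximation, typically J(s) ≈ (E[G(s)] − E[G(ν)])² with G(u) = log cosh u and ν standard normal. The sketch below contrasts that contrast function with a crude histogram entropy estimate over projection angles of a clustered 2-D toy dataset; the histogram estimator and the toy data are illustrative stand-ins, not the paper's construction.

```python
# Sketch: compare FastICA's negentropy approximation with a crude
# histogram entropy estimate over projection directions of 2-D data.
import numpy as np

rng = np.random.default_rng(0)
# toy 2-D data with clear structure along one axis (two clusters)
x = np.concatenate([rng.normal(-2, 0.3, 2000), rng.normal(2, 0.3, 2000)])
y = rng.normal(0, 1, 4000)
data = np.column_stack([x, y])
data = (data - data.mean(0)) / data.std(0)        # standardise (roughly whiten)

def negentropy_approx(s):
    """FastICA-style contrast: (E[G(s)] - E[G(nu)])^2 with G = log cosh."""
    g_gauss = np.mean(np.log(np.cosh(rng.normal(size=100000))))
    return (np.mean(np.log(np.cosh(s))) - g_gauss) ** 2

def hist_entropy(s, bins=50):
    """Crude differential-entropy estimate via a histogram (illustrative)."""
    p, _ = np.histogram(s, bins=bins, density=True)
    width = (s.max() - s.min()) / bins
    p = p[p > 0]
    return -np.sum(p * np.log(p)) * width

for deg in (0, 45, 90):
    w = np.array([np.cos(np.radians(deg)), np.sin(np.radians(deg))])
    s = data @ w
    print(deg, round(negentropy_approx(s), 4), round(hist_entropy(s), 4))
```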
Atrial scars segmentation via potential learning in the graph-cuts framework
Title | Atrial scars segmentation via potential learning in the graph-cuts framework |
Authors | Lei Li, Fuping Wu, Guang Yang, Tom Wong, Raad Mohiaddin, David Firmin, Jenny Keegan, Lingchao Xu, Xiahai Zhuang |
Abstract | Late Gadolinium Enhancement Magnetic Resonance Imaging (LGE MRI) has emerged as a routine scan for patients with atrial fibrillation (AF). However, due to the low image quality, automating the quantification and analysis of the atrial scars is challenging. In this study, we proposed a fully automated method based on the graph-cuts framework, where the potential of the graph is learned on a surface mesh of the left atrium (LA) using an equidistant projection and a Deep Neural Network (DNN). For validation, we employed 100 datasets with manual delineation. The results showed that the performance of the proposed method improved and converged with respect to the increased size of training patches, which provide important features of the structural and texture information learned by the DNN. The segmentation could be further improved when the contribution from the t-link and n-link is balanced, thanks to the inter-relationship learned by the DNN for the graph-cuts algorithm. Compared with the published methods, which mostly relied on manual delineation of the LA or LA wall, our method is fully automatic and demonstrated evidently better results with statistical significance. Finally, the accuracy of quantifying the scars assessed by the Dice score was 0.570. The results are promising and the method can be useful in the diagnosis and prognosis of AF. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09123v1 |
http://arxiv.org/pdf/1810.09123v1.pdf | |
PWC | https://paperswithcode.com/paper/atrial-scars-segmentation-via-potential |
Repo | |
Framework | |
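One way to picture the t-link/n-link balance discussed above: learned scar probabilities set the terminal capacities, a constant smoothness weight sets the neighbour edges, and a max-flow cut yields the segmentation. The sketch below assumes the PyMaxflow package and a generic 2-D probability map; it is a generic graph-cuts baseline, not the authors' mesh-based pipeline.

```python
# Sketch (assumes the PyMaxflow package): segment a probability map into
# scar / non-scar with graph cuts, where t-links come from learned
# probabilities and n-links enforce smoothness.
import numpy as np
import maxflow

def graphcut_segment(prob_scar, lam=0.5, eps=1e-6):
    """prob_scar: 2-D array of DNN scar probabilities in [0, 1].
    lam balances the n-link (smoothness) term against the t-link term."""
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(prob_scar.shape)
    g.add_grid_edges(nodes, lam)                       # n-links (4-neighbourhood)
    # t-links: negative log-likelihoods from the learned probabilities
    g.add_grid_tedges(nodes,
                      -np.log(prob_scar + eps),
                      -np.log(1.0 - prob_scar + eps))
    g.maxflow()
    # boolean mask; which side of the cut maps to "scar" follows the
    # library's source/sink convention
    return g.get_grid_segments(nodes)

# toy usage
prob = np.clip(np.random.rand(32, 32), 0.01, 0.99)
mask = graphcut_segment(prob, lam=0.3)
```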
Attention-Guided Curriculum Learning for Weakly Supervised Classification and Localization of Thoracic Diseases on Chest Radiographs
Title | Attention-Guided Curriculum Learning for Weakly Supervised Classification and Localization of Thoracic Diseases on Chest Radiographs |
Authors | Yuxing Tang, Xiaosong Wang, Adam P. Harrison, Le Lu, Jing Xiao, Ronald M. Summers |
Abstract | In this work, we address the task of joint classification and weakly supervised localization of thoracic diseases from chest radiographs, with only image-level disease labels coupled with disease severity-level (DSL) information for a subset. A convolutional neural network (CNN) based attention-guided curriculum learning (AGCL) framework is presented, which leverages the severity-level attributes mined from radiology reports. Images in order of difficulty (grouped by different severity levels) are fed to the CNN to boost the learning gradually. In addition, highly confident samples (measured by classification probabilities) and their corresponding class-conditional heatmaps (generated by the CNN) are extracted and further fed into the AGCL framework to guide the learning of more distinctive convolutional features in the next iteration. A two-path network architecture is designed to regress the heatmaps from selected seed samples in addition to the original classification task. The joint learning scheme can improve the classification and localization performance along with more seed samples for the next iteration. We demonstrate the effectiveness of this iterative refinement framework via extensive experimental evaluations on the publicly available ChestX-ray14 dataset. AGCL achieves over a 5.7% (averaged over 14 diseases) increase in classification AUC and 7%/11% increases in Recall/Precision for the localization task compared to the state of the art. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07532v1 |
http://arxiv.org/pdf/1807.07532v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-guided-curriculum-learning-for |
Repo | |
Framework | |
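The curriculum and seed-mining loop described above can be summarised as: widen the training set from easy (high-severity) to hard groups, then keep only high-confidence predictions as seeds for the next round. The sketch below is a schematic of that loop under an assumed fit/predict_proba classifier interface; it omits the heatmap-regression path of the two-path network.

```python
# Sketch of attention-guided curriculum scheduling: feed severity-ordered
# groups first, then keep only high-confidence samples as "seeds" for the
# next round.  `model` is any classifier exposing fit/predict_proba-style
# methods (hypothetical interface).
import numpy as np

def curriculum_rounds(groups, model, conf_threshold=0.9, n_rounds=3):
    """groups: list of (images, labels) ordered from easy (severe, obvious
    findings) to hard (subtle findings)."""
    seeds = None
    for r in range(n_rounds):
        # gradually widen the curriculum: round r sees the first r+1 groups
        xs = np.concatenate([g[0] for g in groups[: r + 1]])
        ys = np.concatenate([g[1] for g in groups[: r + 1]])
        model.fit(xs, ys)
        probs = model.predict_proba(xs)
        confident = probs.max(axis=1) >= conf_threshold
        # high-confidence samples (and, in the paper, their CNN heatmaps)
        # become seeds guiding the next iteration
        seeds = (xs[confident], ys[confident])
    return model, seeds
```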
Pachinko Prediction: A Bayesian method for event prediction from social media data
Title | Pachinko Prediction: A Bayesian method for event prediction from social media data |
Authors | Jonathan Tuke, Andrew Nguyen, Mehwish Nasim, Drew Mellor, Asanga Wickramasinghe, Nigel Bean, Lewis Mitchell |
Abstract | The combination of large open data sources with machine learning approaches presents a potentially powerful way to predict events such as protest or social unrest. However, accounting for uncertainty in such models, particularly when using diverse, unstructured datasets such as social media, is essential to guarantee the appropriate use of such methods. Here we develop a Bayesian method for predicting social unrest events in Australia using social media data. This method uses machine learning methods to classify individual postings to social media as being relevant, and an empirical Bayesian approach to calculate posterior event probabilities. We use the method to predict events in Australian cities over a period in 2017/18. |
Tasks | |
Published | 2018-09-22 |
URL | http://arxiv.org/abs/1809.08427v1 |
http://arxiv.org/pdf/1809.08427v1.pdf | |
PWC | https://paperswithcode.com/paper/pachinko-prediction-a-bayesian-method-for |
Repo | |
Framework | |
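The empirical-Bayes step can be illustrated with a Beta-Binomial model: a Beta prior fitted on historical city-level data is updated with today's count of relevant posts, giving a posterior probability that the underlying rate exceeds an event-associated threshold. This is an illustrative reading of the abstract, not the paper's exact specification.

```python
# Sketch: empirical-Bayes event probability from counts of social-media
# posts classified as "relevant".  A Beta prior (fit on historical data)
# is updated with today's counts; an illustrative stand-in for the
# paper's model.
from scipy import stats

def posterior_event_prob(n_relevant, n_total, prior_a, prior_b, threshold=0.02):
    """Posterior probability that the rate of relevant posts exceeds a
    threshold associated with unrest events."""
    post = stats.beta(prior_a + n_relevant, prior_b + n_total - n_relevant)
    return post.sf(threshold)          # P(rate > threshold | data)

# toy usage: 30 relevant posts out of 1500, weak historical prior
print(round(posterior_event_prob(30, 1500, prior_a=2, prior_b=200), 3))
```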
Augmented Artificial Intelligence: a Conceptual Framework
Title | Augmented Artificial Intelligence: a Conceptual Framework |
Authors | Alexander N. Gorban, Bogdan Grechuk, Ivan Y. Tyukin |
Abstract | All artificial intelligence (AI) systems make errors. These errors are unexpected and often differ from typical human mistakes ("non-human" errors). The AI errors should be corrected without damaging existing skills and, hopefully, without requiring direct human expertise. This paper presents an initial summary report of a project taking a new and systematic approach to improving the intellectual effectiveness of individual AIs by communities of AIs. We combine some ideas of learning in heterogeneous multiagent systems with new and original mathematical approaches for non-iterative corrections of errors of legacy AI systems. The mathematical foundations of AI non-destructive correction are presented and a series of new stochastic separation theorems is proven. These theorems provide a new instrument for the development, analysis, and assessment of machine learning methods and algorithms in high dimension. They demonstrate that in high dimensions, and even for exponentially large samples, linear classifiers in their classical Fisher's form are powerful enough to separate errors from correct responses with high probability and to provide an efficient solution to the non-destructive corrector problem. In particular, we prove some hypotheses formulated in our paper 'Stochastic Separation Theorems' (Neural Networks, 94, 255-259, 2017), and answer one general problem published by Donoho and Tanner in 2009. |
Tasks | |
Published | 2018-02-06 |
URL | http://arxiv.org/abs/1802.02172v3 |
http://arxiv.org/pdf/1802.02172v3.pdf | |
PWC | https://paperswithcode.com/paper/augmented-artificial-intelligence-a |
Repo | |
Framework | |
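The stochastic separation result has a simple empirical face: in high dimension, a single erroneous case can usually be split from a large sample by a Fisher-type linear functional centred at the sample mean. The sketch below demonstrates this numerically on uniform toy data; it illustrates the phenomenon, not the paper's proofs.

```python
# Sketch: in high dimension a single "error" point is typically separable
# from a large sample by a simple Fisher-type linear functional -- an
# empirical illustration of the stochastic separation idea.
import numpy as np

rng = np.random.default_rng(1)
d, n = 400, 10000                      # dimension, sample size
data = rng.uniform(-1, 1, size=(n, d)) # legacy system's "correct" responses
err = rng.uniform(-1, 1, size=d)       # one erroneous case to correct

mu = data.mean(axis=0)
w = err - mu                           # Fisher-style discriminant direction
scores = (data - mu) @ w
thresh = 0.5 * (err - mu) @ w          # halfway to the error point
print("fraction of sample on the error side:", np.mean(scores > thresh))
# typically ~0 for large d: a single linear corrector isolates the error
```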
Recurrent Transition Networks for Character Locomotion
Title | Recurrent Transition Networks for Character Locomotion |
Authors | Félix G. Harvey, Christopher Pal |
Abstract | Manually authoring transition animations for a complete locomotion system can be a tedious and time-consuming task, especially for large games that allow complex and constrained locomotion movements, where the number of transitions grows exponentially with the number of states. In this paper, we present a novel approach, based on deep recurrent neural networks, to automatically generate such transitions given a past context of a few frames and a target character state to reach. We present the Recurrent Transition Network (RTN), based on a modified version of the Long Short-Term Memory (LSTM) network, designed specifically for transition generation and trained without any gait, phase, contact or action labels. We further propose a simple yet principled way to initialize the hidden states of the LSTM layer for a given sequence, which improves the performance and generalization to new motions. We both quantitatively and qualitatively evaluate our system and show that making the network terrain-aware by adding a local terrain representation to the input yields better performance for rough-terrain navigation on long transitions. Our system produces realistic and fluid transitions that rival the quality of motion-capture-based ground-truth motions, even before applying any inverse-kinematics postprocess. Direct benefits of our approach could be to accelerate the creation of transition variations for large coverage, or even to entirely replace transition nodes in an animation graph. We further explore applications of this model in an animation super-resolution setting where we temporally decompress animations saved at 1 frame per second and show that the network is able to reconstruct motions that are hard to distinguish from uncompressed locomotion sequences. |
Tasks | Motion Capture, Super-Resolution |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02363v4 |
http://arxiv.org/pdf/1810.02363v4.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-transition-networks-for-character |
Repo | |
Framework | |
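A minimal PyTorch sketch of the transition-generation setup: an LSTM consumes the last past frame plus the target state at each step and emits the in-between frames, with hidden states initialised from the past context. The layer sizes, pose dimension and initialiser are illustrative choices, not the RTN architecture.

```python
# Sketch (PyTorch): an LSTM-based transition generator that reads past pose
# frames plus a target state and predicts the in-between frames.
import torch
import torch.nn as nn

class TransitionNet(nn.Module):
    def __init__(self, pose_dim=63, hidden=256):
        super().__init__()
        self.init_h = nn.Linear(pose_dim, hidden)   # initialise hidden state
        self.init_c = nn.Linear(pose_dim, hidden)   # from the last past frame
        self.lstm = nn.LSTM(pose_dim * 2, hidden, batch_first=True)
        self.out = nn.Linear(hidden, pose_dim)

    def forward(self, past, target, n_frames):
        # past: (B, T_past, pose_dim); target: (B, pose_dim)
        h = self.init_h(past[:, -1]).unsqueeze(0)
        c = self.init_c(past[:, -1]).unsqueeze(0)
        frame, frames = past[:, -1], []
        for _ in range(n_frames):
            inp = torch.cat([frame, target], dim=-1).unsqueeze(1)
            out, (h, c) = self.lstm(inp, (h, c))
            frame = self.out(out[:, -1])
            frames.append(frame)
        return torch.stack(frames, dim=1)           # (B, n_frames, pose_dim)

# toy usage: 10 past frames, generate a 30-frame transition
net = TransitionNet()
pred = net(torch.randn(2, 10, 63), torch.randn(2, 63), n_frames=30)
```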
Neural DrugNet
Title | Neural DrugNet |
Authors | Nishant Nikhil, Shivansh Mundra |
Abstract | In this paper, we describe the system submitted by team Light for the shared task on Social Media Mining for Health Applications. Previous work demonstrates that LSTMs have achieved remarkable performance in natural language processing tasks. We deploy an ensemble of two LSTM models. The first one is a pretrained language model appended with a classifier and takes words as input, while the second one is an LSTM model with an attention unit over it which takes character tri-grams as input. We call the ensemble of these two models Neural-DrugNet. Our system ranks 2nd in the second shared task: automatic classification of posts describing medication intake. |
Tasks | Language Modelling |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1809.01500v1 |
http://arxiv.org/pdf/1809.01500v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-drugnet |
Repo | |
Framework | |
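The ensembling itself is straightforward: average the class probabilities of the word-level model and the character-trigram model and take the argmax. The sketch below assumes both models expose a predict_proba-style method (a hypothetical interface) and an equal-weight average.

```python
# Sketch of the ensembling step: average the class probabilities of a
# word-level model and a character-trigram model.  Both models are left
# abstract; the weighting is illustrative.
import numpy as np

def char_trigrams(text):
    padded = f"  {text}  "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]

def ensemble_predict(word_model, char_model, text, w=0.5):
    p_word = word_model.predict_proba([text.split()])[0]
    p_char = char_model.predict_proba([char_trigrams(text)])[0]
    return int(np.argmax(w * p_word + (1 - w) * p_char))
```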
Role of Intonation in Scoring Spoken English
Title | Role of Intonation in Scoring Spoken English |
Authors | Amber Nigam, Arpan Saxena, Ishan Sodhi |
Abstract | In this paper, we have introduced and evaluated an intonation-based feature for scoring the English speech of non-native English speakers in the Indian context. For this, we created an automated spoken-English scoring engine to learn from the manual evaluation of spoken English. This involved using an existing Automatic Speech Recognition (ASR) engine to convert the speech to text. Thereafter, macro features like accuracy and fluency, together with prosodic features, were used to build a scoring model. In the process, we introduced SimIntonation, short for the similarity between the spoken intonation pattern and the "ideal", i.e. training, intonation pattern. Our results show that it is a highly predictive feature under a controlled environment. We also categorized inter-word pauses into 4 distinct types for a granular evaluation of pauses and their impact on speech evaluation. Moreover, we took steps to moderate test difficulty through its evaluation across parameters like difficult-word count, average sentence readability and lexical density. Our results show that macro features like accuracy and intonation, and micro features like pause topography, are strongly predictive. The scoring of spoken English is not within the purview of this paper. |
Tasks | Speech Recognition |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07688v2 |
http://arxiv.org/pdf/1808.07688v2.pdf | |
PWC | https://paperswithcode.com/paper/role-of-intonation-in-scoring-spoken-english |
Repo | |
Framework | |
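The paper does not publish a formula for SimIntonation, so the sketch below uses a plausible stand-in: length-normalise and z-score the speaker's pitch contour and the "ideal" training contour, then take their cosine similarity. Treat it as an assumption-laden illustration of the feature, not the authors' definition.

```python
# Sketch of a SimIntonation-style feature: similarity between a speaker's
# pitch contour and an "ideal" training contour.  The resample + z-score +
# cosine-similarity recipe is an illustrative assumption.
import numpy as np

def contour_similarity(pitch, ideal, n_points=100):
    def norm(c):
        c = np.interp(np.linspace(0, 1, n_points),
                      np.linspace(0, 1, len(c)), c)    # length-normalise
        return (c - c.mean()) / (c.std() + 1e-8)        # remove register/range
    a, b = norm(np.asarray(pitch, float)), norm(np.asarray(ideal, float))
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy usage: a rising contour vs. an ideal rise and an ideal fall
rise = np.linspace(120, 220, 80)
print(round(contour_similarity(rise, np.linspace(100, 200, 60)), 2))   # ~1.0
print(round(contour_similarity(rise, np.linspace(200, 100, 60)), 2))   # ~-1.0
```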
Deep Learning for Semantic Segmentation on Minimal Hardware
Title | Deep Learning for Semantic Segmentation on Minimal Hardware |
Authors | Sander G. van Dijk, Marcus M. Scheunemann |
Abstract | Deep learning has revolutionised many fields, but it is still challenging to transfer its success to small mobile robots with minimal hardware. Specifically, some work has been done to this effect in the RoboCup humanoid football domain, but results that are performant and efficient and still generally applicable outside of this domain are lacking. We propose an approach conceptually different from those taken previously. It is based on semantic segmentation and does achieve these desired properties. In detail, it is able to process full VGA images in real time on a low-power mobile processor. It can further handle multiple image dimensions without retraining, it does not require specific domain knowledge to achieve a high frame rate, and it is applicable on minimal mobile hardware. |
Tasks | Semantic Segmentation |
Published | 2018-07-15 |
URL | http://arxiv.org/abs/1807.05597v1 |
http://arxiv.org/pdf/1807.05597v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-semantic-segmentation-on |
Repo | |
Framework | |
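The ability to handle multiple image dimensions without retraining follows from using a fully convolutional network: with no fixed-size layers, the same weights run at any resolution. The tiny PyTorch sketch below shows that property; its layer sizes are illustrative and not the paper's architecture.

```python
# Sketch (PyTorch): a tiny fully convolutional segmenter.  Because every
# layer is convolutional, the same weights run on any input resolution.
import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1),
        )

    def forward(self, x):
        logits = self.net(x)
        # upsample back to the input resolution for per-pixel labels
        return nn.functional.interpolate(logits, size=x.shape[-2:],
                                         mode="bilinear", align_corners=False)

model = TinyFCN()
print(model(torch.randn(1, 3, 480, 640)).shape)  # works at VGA ...
print(model(torch.randn(1, 3, 240, 320)).shape)  # ... and at QVGA, unchanged
```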
Model Trees for Identifying Exceptional Players in the NHL Draft
Title | Model Trees for Identifying Exceptional Players in the NHL Draft |
Authors | Oliver Schulte, Yejia Liu, Chao Li |
Abstract | Drafting strong players is crucial for team success. We describe a new data-driven interpretable approach for assessing draft prospects in the National Hockey League. Successful previous approaches have built a predictive model based on player features, or derived performance predictions from the observed performance of comparable players in a cohort. This paper develops model tree learning, which incorporates the strengths of both model-based and cohort-based approaches. A model tree partitions the feature space according to the values of discrete features, or learned thresholds for continuous features. Each leaf node in the tree defines a group of players, easily described to hockey experts, with its own group regression model. Compared to a single model, the model tree forms an ensemble that increases predictive power. Compared to cohort-based approaches, the groups of comparables are discovered from the data, without requiring a similarity metric. The performance predictions of the model tree are competitive with the state-of-the-art methods, which validates our model empirically. We show in case studies that the model tree player ranking can be used to highlight strong and weak points of players. |
Tasks | |
Published | 2018-02-23 |
URL | http://arxiv.org/abs/1802.08765v1 |
http://arxiv.org/pdf/1802.08765v1.pdf | |
PWC | https://paperswithcode.com/paper/model-trees-for-identifying-exceptional |
Repo | |
Framework | |
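A one-level model tree captures the core idea: partition players on a discrete feature, then fit a separate linear regression in each leaf. The split column and feature layout below are hypothetical; real model-tree learners also search thresholds on continuous features and grow deeper trees.

```python
# Sketch of a one-level model tree: partition on a discrete feature (here a
# hypothetical "league" id in column 0), then fit one linear regression
# per leaf.
import numpy as np
from sklearn.linear_model import LinearRegression

class OneSplitModelTree:
    def __init__(self, split_col):
        self.split_col, self.leaves = split_col, {}

    def fit(self, X, y):
        groups = X[:, self.split_col]
        for g in np.unique(groups):
            idx = groups == g
            feats = np.delete(X[idx], self.split_col, axis=1).astype(float)
            self.leaves[g] = LinearRegression().fit(feats, y[idx])
        return self

    def predict(self, X):
        feats = np.delete(X, self.split_col, axis=1).astype(float)
        return np.array([self.leaves[g].predict(f[None])[0]
                         for g, f in zip(X[:, self.split_col], feats)])

# toy usage: column 0 is a hypothetical discrete league id
X = np.array([[0, 10.0, 2.0], [0, 12.0, 3.0],
              [1, 30.0, 1.0], [1, 28.0, 0.5]], dtype=object)
y = np.array([1.0, 1.4, 3.2, 2.9])
print(OneSplitModelTree(split_col=0).fit(X, y).predict(X))
```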
On the Stability and Convergence of Stochastic Gradient Descent with Momentum
Title | On the Stability and Convergence of Stochastic Gradient Descent with Momentum |
Authors | Ali Ramezani-Kebrya, Ashish Khisti, Ben Liang |
Abstract | While momentum-based methods, in conjunction with stochastic gradient descent, are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods. In practice, the momentum parameter is often chosen in a heuristic fashion with little theoretical guidance. In the first part of this paper, for the case of general loss functions, we analyze a modified momentum-based update rule, i.e., the method of early momentum, and develop an upper bound on the generalization error using the framework of algorithmic stability. Our results show that machine learning models can be trained for multiple epochs of this method while their generalization errors are bounded. We also study the convergence of the method of early momentum by establishing an upper bound on the expected norm of the gradient. In the second part of the paper, we focus on the case of strongly convex loss functions and the classical heavy-ball momentum update rule. We use the framework of algorithmic stability to provide an upper bound on the generalization error of the stochastic gradient method with momentum. We also develop an upper bound on the expected true risk, in terms of the number of training steps, the size of the training set, and the momentum parameter. Experimental evaluations verify the consistency between the numerical results and our theoretical bounds and the effectiveness of the method of early momentum for the case of non-convex loss functions. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04564v1 |
http://arxiv.org/pdf/1809.04564v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-stability-and-convergence-of |
Repo | |
Framework | |
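The classical heavy-ball rule analysed in the second part of the paper is v ← μv − α∇f(w), w ← w + v. The sketch below runs it on a toy least-squares problem and models "early momentum" by simply switching μ off after a cutoff step, which is an illustrative reading of the abstract rather than the paper's exact rule.

```python
# Sketch: heavy-ball SGD, v <- mu*v - lr*grad; w <- w + v, on a toy
# least-squares problem.  "Early momentum" is modelled by dropping mu to 0
# after a cutoff step (illustrative assumption).
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 10))
w_true = rng.normal(size=10)
b = A @ w_true + 0.1 * rng.normal(size=200)

def heavy_ball_sgd(lr=0.01, mu=0.9, steps=2000, early_cutoff=None, batch=16):
    w, v = np.zeros(10), np.zeros(10)
    for t in range(steps):
        idx = rng.integers(0, len(A), batch)
        grad = A[idx].T @ (A[idx] @ w - b[idx]) / batch   # stochastic gradient
        m = mu if (early_cutoff is None or t < early_cutoff) else 0.0
        v = m * v - lr * grad
        w = w + v
    return np.linalg.norm(w - w_true)

print(heavy_ball_sgd())                    # plain heavy ball
print(heavy_ball_sgd(early_cutoff=500))    # momentum only in the early phase
```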
Generative Adversarial Learning for Spectrum Sensing
Title | Generative Adversarial Learning for Spectrum Sensing |
Authors | Kemal Davaslioglu, Yalin E. Sagduyu |
Abstract | A novel approach of training data augmentation and domain adaptation is presented to support machine learning applications for cognitive radio. Machine learning provides effective tools to automate cognitive radio functionalities by reliably extracting and learning intrinsic spectrum dynamics. However, there are two important challenges to overcome in order to fully utilize the machine learning benefits with cognitive radios. First, machine learning requires a significant amount of truthed data to capture complex channel and emitter characteristics, and to train the underlying algorithm (e.g., a classifier). Second, the training data that has been identified for one spectrum environment cannot be used for another one (e.g., after channel and emitter conditions change). To address these challenges, a generative adversarial network (GAN) with deep learning structures is used to 1) generate additional synthetic training data to improve classifier accuracy, and 2) adapt training data to spectrum dynamics. This approach is applied to spectrum sensing by assuming only limited training data without knowledge of spectrum statistics. Machine learning classifiers are trained with limited, augmented and adapted training data to detect signals. Results show that training data augmentation increases the classifier accuracy significantly and that this increase is sustained with domain adaptation as spectrum conditions change. |
Tasks | Data Augmentation, Domain Adaptation |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00709v1 |
http://arxiv.org/pdf/1804.00709v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-learning-for-spectrum |
Repo | |
Framework | |
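Once per-class generators are trained, the augmentation step reduces to sampling synthetic feature vectors and appending them to the small labelled set before fitting the sensing classifier. The sketch below leaves the generators abstract (anything exposing a hypothetical sample(n) method, e.g. a trained conditional GAN) and is not the paper's training pipeline.

```python
# Sketch of the augmentation step: extend a small labelled training set
# with synthetic samples drawn from per-class generators before fitting
# the spectrum-sensing classifier.
import numpy as np

def augment_training_set(X, y, generators, n_synth_per_class=500):
    """generators: dict class_label -> object with a sample(n) method
    (hypothetical interface) returning an (n, n_features) array."""
    xs, ys = [X], [y]
    for label, gen in generators.items():
        synth = gen.sample(n_synth_per_class)
        xs.append(synth)
        ys.append(np.full(len(synth), label))
    return np.concatenate(xs), np.concatenate(ys)
```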
Instantiation
Title | Instantiation |
Authors | Abhijeet Gupta, Gemma Boleda, Sebastian Pado |
Abstract | In computational linguistics, a large body of work exists on distributed modeling of lexical relations, focussing largely on lexical relations such as hypernymy (scientist – person) that hold between two categories, as expressed by common nouns. In contrast, computational linguistics has paid little attention to entities denoted by proper nouns (Marie Curie, Mumbai, …). These have been investigated in detail by the Knowledge Representation and Semantic Web communities, but generally not with regard to their linguistic properties. Our paper closes this gap by investigating and modeling the lexical relation of instantiation, which holds between an entity-denoting and a category-denoting expression (Marie Curie – scientist or Mumbai – city). We present a new, principled dataset for the task of instantiation detection as well as experiments and analyses on this dataset. We obtain the following results: (a) entities belonging to one category form a region in distributional space, but the embedding for the category word is typically located outside this subspace; (b) it is easy to learn to distinguish entities from categories from distributional evidence, but due to (a), instantiation proper is much harder to learn when using common nouns as representations of categories; (c) this problem can be alleviated by using category representations based on entity rather than category word embeddings. |
Tasks | Word Embeddings |
Published | 2018-08-05 |
URL | http://arxiv.org/abs/1808.01662v1 |
http://arxiv.org/pdf/1808.01662v1.pdf | |
PWC | https://paperswithcode.com/paper/instantiation |
Repo | |
Framework | |
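Result (c) suggests scoring instantiation against the centroid of known entity embeddings of a category rather than against the category word's own vector. The sketch below implements that comparison with cosine similarity; the embeddings and the threshold are placeholder assumptions, not the paper's trained models.

```python
# Sketch of result (c): decide instantiation by comparing an entity vector
# with the centroid of known entities of the candidate category.
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def is_instance(entity_vec, known_entity_vecs, threshold=0.5):
    centroid = np.mean(known_entity_vecs, axis=0)
    return cos(entity_vec, centroid) >= threshold

# toy usage with random stand-in embeddings
rng = np.random.default_rng(0)
scientists = rng.normal(size=(50, 300)) + 1.0     # entities of one category
curie = scientists.mean(0) + 0.1 * rng.normal(size=300)
print(is_instance(curie, scientists))             # True
```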
A Pyramid CNN for Dense-Leaves Segmentation
Title | A Pyramid CNN for Dense-Leaves Segmentation |
Authors | Daniel D. Morris |
Abstract | Automatic detection and segmentation of overlapping leaves in dense foliage can be a difficult task, particularly for leaves with strong textures and high occlusions. We present Dense-Leaves, an image dataset with ground truth segmentation labels that can be used to train and quantify algorithms for leaf segmentation in the wild. We also propose a pyramid convolutional neural network with multi-scale predictions that detects and discriminates leaf boundaries from interior textures. Using these detected boundaries, closed-contour boundaries around individual leaves are estimated with a watershed-based algorithm. The result is an instance segmenter for dense leaves. Promising segmentation results for leaves in dense foliage are obtained. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01646v1 |
http://arxiv.org/pdf/1804.01646v1.pdf | |
PWC | https://paperswithcode.com/paper/a-pyramid-cnn-for-dense-leaves-segmentation |
Repo | |
Framework | |
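Once the pyramid CNN has produced a boundary-probability map, the closed-contour step can be approximated with a standard marker-based watershed. The sketch below uses scikit-image (the ≥ 0.19 API is assumed) with generic parameters as a stand-in for the paper's watershed-based algorithm.

```python
# Sketch of the post-processing step: turn a predicted boundary-probability
# map into closed leaf regions with a marker-based watershed.
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

def leaves_from_boundaries(boundary_prob, boundary_thresh=0.5, min_dist=10):
    interior = boundary_prob < boundary_thresh          # likely leaf interiors
    dist = ndimage.distance_transform_edt(interior)
    peaks = peak_local_max(dist, min_distance=min_dist, labels=interior)
    markers = np.zeros_like(dist, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # flood from the markers, climbing "uphill" on boundary probability,
    # so region borders settle on predicted leaf boundaries
    return watershed(boundary_prob, markers, mask=interior)

# toy usage on a random map (real inputs would come from the boundary CNN)
labels = leaves_from_boundaries(np.random.rand(64, 64))
```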