October 19, 2019

Paper Group ANR 113

Improving the Robustness of Speech Translation. Predicting University Students’ Academic Success and Major using Random Forests. Melodic Phrase Segmentation By Deep Neural Networks. High Speed Tracking With A Fourier Domain Kernelized Correlation Filter. Modeling Melodic Feature Dependency with Modularized Variational Auto-Encoder. Cloud Chaser: Re …

Improving the Robustness of Speech Translation

Title Improving the Robustness of Speech Translation
Authors Xiang Li, Haiyang Xue, Wei Chen, Yang Liu, Yang Feng, Qun Liu
Abstract Although neural machine translation (NMT) has achieved impressive progress recently, it is usually trained on clean parallel data and hence does not work well when the input sentence is produced by an automatic speech recognition (ASR) system, because of the many errors in the source. To solve this problem, we propose a simple but effective method to improve the robustness of NMT in the case of speech translation. We simulate the noise present in realistic ASR output and inject it into the clean parallel data so that NMT works under similar word distributions during training and testing. In addition, we incorporate the Chinese Pinyin feature, which is easy to obtain in speech translation, to further improve translation performance. Experimental results show that our method is more stable and outperforms the baseline by an average of 3.12 BLEU on multiple noisy test sets, while also achieving a generalization improvement on the WMT’17 Chinese-English test set.
Tasks Machine Translation, Speech Recognition
Published 2018-11-02
URL http://arxiv.org/abs/1811.00728v1
PDF http://arxiv.org/pdf/1811.00728v1.pdf
PWC https://paperswithcode.com/paper/improving-the-robustness-of-speech
Repo
Framework
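
The noise-injection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the confusion table, the error probabilities, and the function names `inject_asr_noise` and `augment` are all hypothetical, and English homophones stand in for the Chinese ASR errors the paper targets.

```python
import random

# Hypothetical confusion table: words an ASR system might mix up (assumption).
CONFUSIONS = {
    "their": ["there"],
    "to": ["two", "too"],
    "right": ["write"],
}


def inject_asr_noise(tokens, rng, p_sub=0.1, p_del=0.05):
    """Return a noisy copy of `tokens` with simulated substitutions and deletions."""
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < p_del:
            continue  # simulated deletion error
        if r < p_del + p_sub and tok in CONFUSIONS:
            noisy.append(rng.choice(CONFUSIONS[tok]))  # simulated substitution error
        else:
            noisy.append(tok)
    return noisy


def augment(pairs, rng, **noise_kwargs):
    """Keep every clean (source, target) pair and add a noisy-source copy of it."""
    out = []
    for src, tgt in pairs:
        out.append((src, tgt))
        out.append((inject_asr_noise(src, rng, **noise_kwargs), tgt))
    return out


rng = random.Random(0)
clean = [(["they", "went", "to", "their", "house"], ["<target", "sentence>"])]
training_pairs = augment(clean, rng)
```

Pairing each noisy source with the unchanged target is what lets the model see similar word distributions at training and test time.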

Predicting University Students’ Academic Success and Major using Random Forests

Title Predicting University Students’ Academic Success and Major using Random Forests
Authors Cédric Beaulac, Jeffrey S. Rosenthal
Abstract In this article, a large data set containing every course taken by every undergraduate student in a major university in Canada over 10 years is analysed. Modern machine learning algorithms can use large data sets to build useful tools for the data provider, in this case, the university. In this article, two classifiers are constructed using random forests. First, the first two semesters of courses completed by a student are used to predict whether they will obtain an undergraduate degree. Second, for the students who completed a program, their major is predicted, again using the first few courses they registered for. A classification tree is an intuitive and powerful classifier, and building a random forest of trees improves this classifier. Random forests also allow for reliable variable importance measurements. These measures explain which variables are useful to the classifiers and can be used to better understand what is statistically related to the students’ situation. The results are two accurate classifiers and a variable importance analysis that provides useful information to university administrations.
Tasks
Published 2018-02-09
URL http://arxiv.org/abs/1802.03418v3
PDF http://arxiv.org/pdf/1802.03418v3.pdf
PWC https://paperswithcode.com/paper/predicting-university-students-academic
Repo
Framework

Melodic Phrase Segmentation By Deep Neural Networks

Title Melodic Phrase Segmentation By Deep Neural Networks
Authors Yixing Guan, Jinyu Zhao, Yiqin Qiu, Zheng Zhang, Gus Xia
Abstract Automated melodic phrase detection and segmentation is a classical task in content-based music information retrieval and also the key to automated music structure analysis. However, traditional methods still cannot satisfy practical requirements. In this paper, we explore and adapt various neural network architectures to see if they can be generalized to work with the symbolic representation of music and produce satisfactory melodic phrase segmentation. The main issue in applying deep-learning methods to phrase detection is the sparse labeling problem of training sets. We propose two tailored label-engineering schemes with corresponding training techniques for different neural networks in order to make decisions at a sequential level. Experimental results show that the CNN-CRF architecture performs the best, offering finer segmentation and faster training, while CNN, Bi-LSTM-CNN and Bi-LSTM-CRF are acceptable alternatives.
Tasks Information Retrieval, Music Information Retrieval
Published 2018-11-14
URL http://arxiv.org/abs/1811.05688v1
PDF http://arxiv.org/pdf/1811.05688v1.pdf
PWC https://paperswithcode.com/paper/melodic-phrase-segmentation-by-deep-neural
Repo
Framework

High Speed Tracking With A Fourier Domain Kernelized Correlation Filter

Title High Speed Tracking With A Fourier Domain Kernelized Correlation Filter
Authors Mingyang Guan, Zhengguo Li, Renjie He, Changyun Wen
Abstract It is challenging to design a high speed tracking approach using the l1-norm due to its non-differentiability. In this paper, a new kernelized correlation filter is introduced by leveraging the sparsity attribute of l1-norm based regularization to design a high speed tracker. We combine the l1-norm and l2-norm based regularizations in one Huber-type loss function, and then formulate an optimization problem in the Fourier domain for fast computation, which enables the tracker to adaptively ignore the noisy features produced by occlusion and illumination variation while keeping the advantages of l2-norm based regression. This is possible because, by the Convolution Theorem, correlation in the spatial domain corresponds to an element-wise product in the Fourier domain, so the l1-norm optimization problem can be decomposed into multiple sub-optimization spaces in the Fourier domain. However, the optimized variables in the Fourier domain are complex, which makes using the l1-norm impossible if the real and imaginary parts of the variables cannot be separated. Our proposed optimization problem is formulated in such a way that the real and imaginary parts are indeed well separated. As such, it can be solved efficiently, with the optimal values obtained independently in closed form. Extensive experiments on two large benchmark datasets demonstrate that the proposed tracking algorithm significantly improves the tracking accuracy of the original kernelized correlation filter (KCF) with little sacrifice in tracking speed. Moreover, it outperforms the state-of-the-art approaches in terms of accuracy, efficiency, and robustness.
Tasks
Published 2018-11-08
URL http://arxiv.org/abs/1811.03236v3
PDF http://arxiv.org/pdf/1811.03236v3.pdf
PWC https://paperswithcode.com/paper/high-speed-tracking-with-a-fourier-domain
Repo
Framework
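
The Convolution Theorem property the abstract relies on (spatial correlation equals an element-wise product in the Fourier domain) can be checked numerically. The sketch below is only a 1-D illustration with a naive DFT, not the proposed tracker; all function names are hypothetical.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def circular_correlation(x, y):
    """Direct circular cross-correlation: c[s] = sum_t x[t] * y[(t + s) % n]."""
    n = len(x)
    return [sum(x[t] * y[(t + s) % n] for t in range(n)) for s in range(n)]


def circular_correlation_fourier(x, y):
    """Same quantity for real inputs: inverse DFT of conj(X) * Y, element-wise."""
    X, Y = dft(x), dft(y)
    product = [a.conjugate() * b for a, b in zip(X, Y)]
    return [v.real for v in idft(product)]


x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 0.5, 2.0]
direct = circular_correlation(x, y)
via_fourier = circular_correlation_fourier(x, y)
```

Because each frequency bin is handled by a single multiplication, a problem posed in the Fourier domain splits into independent per-bin subproblems, which is the decomposition the abstract describes.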

Modeling Melodic Feature Dependency with Modularized Variational Auto-Encoder

Title Modeling Melodic Feature Dependency with Modularized Variational Auto-Encoder
Authors Yu-An Wang, Yu-Kai Huang, Tzu-Chuan Lin, Shang-Yu Su, Yun-Nung Chen
Abstract Automatic melody generation has been a long-time aspiration for both AI researchers and musicians. However, learning to generate euphonious melodies has turned out to be highly challenging. This paper introduces 1) a new variant of variational autoencoder (VAE), where the model structure is designed in a modularized manner in order to model polyphonic and dynamic music with domain knowledge, and 2) a hierarchical encoding/decoding strategy, which explicitly models the dependency between melodic features. The proposed framework is capable of generating distinct melodies that sound natural, and the experiments for evaluating generated music clips show that the proposed model outperforms the baselines in human evaluation.
Tasks
Published 2018-10-31
URL http://arxiv.org/abs/1811.00162v1
PDF http://arxiv.org/pdf/1811.00162v1.pdf
PWC https://paperswithcode.com/paper/modeling-melodic-feature-dependency-with
Repo
Framework

Cloud Chaser: Real Time Deep Learning Computer Vision on Low Computing Power Devices

Title Cloud Chaser: Real Time Deep Learning Computer Vision on Low Computing Power Devices
Authors Zhengyi Luo, Austin Small, Liam Dugan, Stephen Lane
Abstract Internet of Things (IoT) devices, mobile phones, and robotic systems are often denied the power of deep learning algorithms due to their limited computing power. However, to provide time-critical services such as emergency response, home assistance, and surveillance, these devices often need real-time analysis of their camera data. This paper strives to offer a viable approach to integrating high-performance deep learning based computer vision algorithms with low-resource and low-power devices by leveraging the computing power of the cloud. By offloading the computation work to the cloud, no dedicated hardware is needed to enable deep neural networks on existing low computing power devices. A Raspberry Pi based robot, Cloud Chaser, is built to demonstrate the power of using cloud computing to perform real-time vision tasks. Furthermore, to reduce latency and improve real-time performance, compression algorithms are proposed and evaluated for streaming real-time video frames to the cloud.
Tasks
Published 2018-10-02
URL http://arxiv.org/abs/1810.01069v1
PDF http://arxiv.org/pdf/1810.01069v1.pdf
PWC https://paperswithcode.com/paper/cloud-chaser-real-time-deep-learning-computer
Repo
Framework
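
The abstract does not say which compression algorithms were evaluated. As a rough sketch of the general idea (shrink each frame before it is sent to the cloud), the example below uses generic lossless zlib compression from the Python standard library as a stand-in; the synthetic frame and the function names are assumptions.

```python
import zlib


def compress_frame(frame_bytes, level=6):
    """Compress one raw frame before sending it over the network."""
    return zlib.compress(frame_bytes, level)


def decompress_frame(payload):
    """Restore the raw frame on the receiving (cloud) side."""
    return zlib.decompress(payload)


# Synthetic 64x64 grayscale frame made of four flat vertical bands (assumption).
frame = bytes((x // 16) * 16 for _ in range(64) for x in range(64))
payload = compress_frame(frame)
restored = decompress_frame(payload)
ratio = len(payload) / len(frame)  # fraction of the original size actually sent
```

Real camera frames compress far less than this synthetic one, so a practical system would trade compression time against bytes saved per frame.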

Magnetic Resonance Fingerprinting using Recurrent Neural Networks

Title Magnetic Resonance Fingerprinting using Recurrent Neural Networks
Authors Ilkay Oksuz, Gastao Cruz, James Clough, Aurelien Bustin, Nicolo Fuin, Rene M. Botnar, Claudia Prieto, Andrew P. King, Julia A. Schnabel
Abstract Magnetic Resonance Fingerprinting (MRF) is a new approach to quantitative magnetic resonance imaging that allows simultaneous measurement of multiple tissue properties in a single, time-efficient acquisition. Standard MRF reconstructs parametric maps using dictionary matching and lacks scalability due to computational inefficiency. We propose to perform MRF map reconstruction using a recurrent neural network, which exploits the time-dependent information of the MRF signal evolution. We evaluate our method on multiparametric synthetic signals and compare it to existing MRF map reconstruction approaches, including those based on neural networks. Our method achieves state-of-the-art estimates of T1 and T2 values. In addition, the reconstruction time is significantly reduced compared to dictionary-matching based approaches.
Tasks Magnetic Resonance Fingerprinting
Published 2018-12-19
URL http://arxiv.org/abs/1812.08155v1
PDF http://arxiv.org/pdf/1812.08155v1.pdf
PWC https://paperswithcode.com/paper/magnetic-resonance-fingerprinting-using
Repo
Framework
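
For context, the dictionary-matching baseline that this abstract contrasts against can be sketched as a search for the dictionary fingerprint with the highest normalized inner product. This is a toy illustration: `toy_fingerprint` is a made-up signal model rather than a Bloch simulation, and the parameter grid is hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def toy_fingerprint(t1, t2, times):
    """Made-up signal evolution (assumption, not a Bloch simulation)."""
    return [(1 - math.exp(-t / t1)) * math.exp(-t / t2) for t in times]


def dictionary_match(signal, dictionary):
    """Return (params, score) of the entry best correlated with `signal`."""
    s = normalize(signal)
    best_params, best_score = None, -math.inf
    for params, fingerprint in dictionary:  # cost grows linearly with dictionary size
        f = normalize(fingerprint)
        score = abs(sum(a * b for a, b in zip(s, f)))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


times = list(range(10, 510, 50))
dictionary = [((t1, t2), toy_fingerprint(t1, t2, times))
              for t1 in (500, 1000, 1500) for t2 in (50, 100)]

# A measured signal that is a scaled copy of one dictionary entry.
measured = [2.5 * v for v in toy_fingerprint(1000, 100, times)]
params, score = dictionary_match(measured, dictionary)
```

The loop visits every entry for every voxel, which is the scalability limit that learned reconstruction methods aim to avoid.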

Geometry of Deep Learning for Magnetic Resonance Fingerprinting

Title Geometry of Deep Learning for Magnetic Resonance Fingerprinting
Authors Mohammad Golbabaee, Dongdong Chen, Pedro A. Gómez, Marion I. Menzel, Mike E. Davies
Abstract Current popular methods for Magnetic Resonance Fingerprint (MRF) recovery are bottlenecked by the heavy storage and computation requirements of a dictionary-matching (DM) step due to the growing size and complexity of the fingerprint dictionaries in multi-parametric quantitative MRI applications. In this paper we study a deep learning approach to address these shortcomings. Coupled with a dimensionality reduction first layer, the proposed MRF-Net is able to reconstruct quantitative maps while saving more than 60 times in the memory and computations required by a DM baseline. The fine-grid manifold enumeration, i.e. the MRF dictionary, is only used for training the network and not during image reconstruction. We show that the MRF-Net provides a piece-wise affine approximation to the Bloch response manifold projection and that rather than memorizing the dictionary, the network efficiently clusters this manifold and learns a set of hierarchical matched-filters for affine regression of the NMR characteristics in each segment.
Tasks Dimensionality Reduction, Image Reconstruction, Magnetic Resonance Fingerprinting
Published 2018-09-05
URL http://arxiv.org/abs/1809.01749v2
PDF http://arxiv.org/pdf/1809.01749v2.pdf
PWC https://paperswithcode.com/paper/geometry-of-deep-learning-for-magnetic
Repo
Framework

Magnetic Resonance Fingerprinting Reconstruction via Spatiotemporal Convolutional Neural Networks

Title Magnetic Resonance Fingerprinting Reconstruction via Spatiotemporal Convolutional Neural Networks
Authors Fabian Balsiger, Amaresha Shridhar Konar, Shivaprasad Chikop, Vimal Chandran, Olivier Scheidegger, Sairam Geethanath, Mauricio Reyes
Abstract Magnetic resonance fingerprinting (MRF) quantifies multiple nuclear magnetic resonance parameters in a single and fast acquisition. Standard MRF reconstructs parametric maps using dictionary matching, which lacks scalability due to computational inefficiency. We propose to perform MRF map reconstruction using a spatiotemporal convolutional neural network, which exploits the relationship between neighboring MRF signal evolutions to replace the dictionary matching. We evaluate our method on multiparametric brain scans and compare it to three recent MRF reconstruction approaches. Our method achieves state-of-the-art reconstruction accuracy and yields qualitatively more appealing maps compared to other reconstruction methods. In addition, the reconstruction time is significantly reduced compared to a dictionary-based approach.
Tasks Magnetic Resonance Fingerprinting
Published 2018-07-17
URL http://arxiv.org/abs/1807.06356v2
PDF http://arxiv.org/pdf/1807.06356v2.pdf
PWC https://paperswithcode.com/paper/magnetic-resonance-fingerprinting
Repo
Framework

Analysis of Triplet Motifs in Biological Signed Oriented Graphs Suggests a Relationship Between Fine Topology and Function

Title Analysis of Triplet Motifs in Biological Signed Oriented Graphs Suggests a Relationship Between Fine Topology and Function
Authors Alberto Calderone, Gianni Cesareni
Abstract Background: Networks in different domains are characterized by similar global characteristics while differing in local structures. To further extend this concept, we investigated network regularities on a fine scale in order to examine the functional impact of recurring motifs in signed oriented biological networks. In this work we generalize to signaling networks some considerations made on feedback and feed-forward loops and extend them by adding a close scrutiny of Linear Triplets, which have not yet been investigated in detail. Results: We studied the role of triplets, either open or closed (loops or linear events), by enumerating them in different biological signaling networks and by comparing their significance profiles. We compared different data sources and investigated the fine topology of protein networks representing causal relationships based on transcriptional control, phosphorylation, ubiquitination and binding. Not only were we able to generalize findings that have already been reported but we also highlighted a connection between relative motif abundance and node function. Furthermore, by analyzing Linear Triplets for the first time, we highlighted the relative importance of nodes sitting in specific positions in closed signaling triplets. Finally, we tried to apply machine learning to show that a combination of motif features can be used to derive node function. Availability: The triplets counter used for this work is available as a Cytoscape App and as a standalone command line Java application. http://apps.cytoscape.org/apps/counttriplets Keywords: Graph theory, graph analysis, graph topology, machine learning, cytoscape
Tasks
Published 2018-03-17
URL https://arxiv.org/abs/1803.06520v4
PDF https://arxiv.org/pdf/1803.06520v4.pdf
PWC https://paperswithcode.com/paper/analysis-of-triplet-motifs-in-biological
Repo
Framework
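
To make the notion of triplet enumeration in a signed oriented graph concrete, here is a small sketch. It is not the Cytoscape app mentioned in the abstract, and the motif definitions (linear, feed-forward, feedback) and the function name `count_triplets` are simplifying assumptions that may differ from the paper's.

```python
from collections import Counter
from itertools import permutations


def count_triplets(edges):
    """Count signed triplets; edges are (source, target, sign), sign in {+1, -1}."""
    sign = {(u, v): s for u, v, s in edges}
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    counts = Counter()
    for a, b, c in permutations(nodes, 3):
        if (a, b) not in sign or (b, c) not in sign:
            continue  # need the path a -> b -> c
        path = (sign[(a, b)], sign[(b, c)])
        if (a, c) in sign:
            # closed triplet: a -> b -> c plus the shortcut a -> c
            counts[("feed_forward", path + (sign[(a, c)],))] += 1
        elif (c, a) in sign:
            # closed triplet: cycle a -> b -> c -> a, counted once per cycle
            if a == min(a, b, c):
                counts[("feedback", path + (sign[(c, a)],))] += 1
        else:
            # open triplet: no edge between the two end nodes
            counts[("linear", path)] += 1
    return counts


edges = [("A", "B", 1), ("B", "C", -1), ("A", "C", 1), ("C", "D", 1)]
motifs = count_triplets(edges)
cycle = count_triplets([("X", "Y", 1), ("Y", "Z", 1), ("Z", "X", -1)])
```

Comparing such counts against those of randomized networks is the usual way to turn raw motif counts into significance profiles.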

PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

Title PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds
Authors Aseem Behl, Despoina Paschalidou, Simon Donné, Andreas Geiger
Abstract Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.
Tasks Motion Estimation, Scene Flow Estimation, Self-Driving Cars
Published 2018-06-06
URL http://arxiv.org/abs/1806.02170v3
PDF http://arxiv.org/pdf/1806.02170v3.pdf
PWC https://paperswithcode.com/paper/pointflownet-learning-representations-for
Repo
Framework

Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN

Title Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN
Authors Dror Freirich, Ron Meir, Aviv Tamar
Abstract The recently proposed distributional approach to reinforcement learning (DiRL) is centered on learning the distribution of the reward-to-go, often referred to as the value distribution. In this work, we show that the distributional Bellman equation, which drives DiRL methods, is equivalent to a generative adversarial network (GAN) model. In this formulation, DiRL can be seen as learning a deep generative model of the value distribution, driven by the discrepancy between the distribution of the current value, and the distribution of the sum of current reward and next value. We use this insight to propose a GAN-based approach to DiRL, which leverages the strengths of GANs in learning distributions of high-dimensional data. In particular, we show that our GAN approach can be used for DiRL with multivariate rewards, an important setting which cannot be tackled with prior methods. The multivariate setting also allows us to unify learning the distribution of values and state transitions, and we exploit this idea to devise a novel exploration method that is driven by the discrepancy in estimating both values and states.
Tasks
Published 2018-08-06
URL http://arxiv.org/abs/1808.01960v1
PDF http://arxiv.org/pdf/1808.01960v1.pdf
PWC https://paperswithcode.com/paper/distributional-multivariate-policy-evaluation
Repo
Framework
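
For reference, the distributional Bellman equation for policy evaluation that the abstract refers to is commonly written as below. The notation is an assumption, not taken from the paper: $Z^{\pi}$ is the random reward-to-go under policy $\pi$, $R$ the reward, $\gamma$ the discount factor, $P$ the transition kernel, and $\overset{D}{=}$ denotes equality in distribution.

```latex
Z^{\pi}(s, a) \overset{D}{=} R(s, a) + \gamma \, Z^{\pi}(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \quad A' \sim \pi(\cdot \mid S')
```

Read this way, the left-hand side is the distribution of the current value and the right-hand side is the distribution of the sum of current reward and next value, the two quantities whose discrepancy the abstract says drives learning.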

What do the US West Coast Public Libraries Post on Twitter?

Title What do the US West Coast Public Libraries Post on Twitter?
Authors Amir Karami, Matthew Collins
Abstract Twitter has provided a great opportunity for public libraries to disseminate information for a variety of purposes. Twitter data have been applied in different domains such as health, politics, and history. There are thousands of public libraries in the US, but no study has yet investigated the content of their social media posts, such as tweets, to find their interests. Moreover, traditional content analysis is not an efficient way to explore thousands of tweets. Therefore, there is a need for automatic methods to overcome the limitations of manual methods. This paper proposes a computational approach to collecting and analyzing tweets using Twitter Application Programming Interfaces (API) and investigates more than 138,000 tweets from 48 US west coast libraries using topic modeling. We found 20 topics and assigned them to five categories: public relations, book, event, training, and social good. Our results show that the US west coast libraries are more interested in using Twitter for public relations and book-related events. This research has both practical and theoretical applications for libraries as well as other organizations to explore the social media activities of their customers and themselves.
Tasks
Published 2018-08-17
URL http://arxiv.org/abs/1808.06021v2
PDF http://arxiv.org/pdf/1808.06021v2.pdf
PWC https://paperswithcode.com/paper/what-do-the-us-west-coast-public-libraries
Repo
Framework

Functional ASP with Intensional Sets: Application to Gelfond-Zhang Aggregates

Title Functional ASP with Intensional Sets: Application to Gelfond-Zhang Aggregates
Authors Pedro Cabalar, Jorge Fandinno, Luis Fariñas del Cerro, David Pearce
Abstract In this paper, we propose a variant of Answer Set Programming (ASP) with evaluable functions that extends their application to sets of objects, something that allows a fully logical treatment of aggregates. Formally, we start from the syntax of First Order Logic with equality and the semantics of Quantified Equilibrium Logic with evaluable functions (QELF). Then, we proceed to incorporate a new kind of logical term, intensional set (a construct commonly used to denote the set of objects characterised by a given formula), and to extend QELF semantics for this new type of expression. In our extended approach, intensional sets can be arbitrarily used as predicate or function arguments or even nested inside other intensional sets, just as regular first-order logical terms. As a result, aggregates can be naturally formed by the application of some evaluable function (count, sum, maximum, etc) to a set of objects expressed as an intensional set. This approach has several advantages. First, while other semantics for aggregates depend on some syntactic transformation (either via a reduct or a formula translation), the QELF interpretation treats them as regular evaluable functions, providing a compositional semantics and avoiding any kind of syntactic restriction. Second, aggregates can be explicitly defined now within the logical language by the simple addition of formulas that fix their meaning in terms of multiple applications of some (commutative and associative) binary operation. For instance, we can use recursive rules to define sum in terms of integer addition. Last, but not least, we prove that the semantics we obtain for aggregates coincides with the one defined by Gelfond and Zhang for the Alog language, when we restrict to that syntactic fragment. (Under consideration for acceptance in TPLP)
Tasks
Published 2018-05-02
URL http://arxiv.org/abs/1805.00660v1
PDF http://arxiv.org/pdf/1805.00660v1.pdf
PWC https://paperswithcode.com/paper/functional-asp-with-intensional-sets
Repo
Framework
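
The abstract's remark that an aggregate such as sum can be defined by repeated application of a commutative and associative binary operation has a loose analogue in ordinary programming. The Python sketch below is only an analogy, not ASP or QELF semantics: the set comprehension plays the role of an intensional set, and `aggregate` is a hypothetical recursive fold.

```python
def aggregate(op, identity, items):
    """Recursively fold a binary operation over a collection of objects."""
    items = list(items)
    if not items:
        return identity  # base case: the aggregate of the empty set
    head, *rest = items
    return op(head, aggregate(op, identity, rest))  # recursive case


# "Intensional set": the set of objects characterised by a formula,
# here {x | 1 <= x <= 10 and x is even}.
evens = {x for x in range(1, 11) if x % 2 == 0}

total = aggregate(lambda a, b: a + b, 0, evens)              # sum via addition
count = aggregate(lambda a, b: a + b, 0, (1 for _ in evens))  # count via addition
largest = aggregate(max, float("-inf"), evens)                # maximum via max
```

Because the operations are commutative and associative, the result does not depend on the order in which the set's elements are visited.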

A Virtual Environment with Multi-Robot Navigation, Analytics, and Decision Support for Critical Incident Investigation

Title A Virtual Environment with Multi-Robot Navigation, Analytics, and Decision Support for Critical Incident Investigation
Authors David L. Smyth, James Fennell, Sai Abinesh, Nazli B. Karimi, Frank G. Glavin, Ihsan Ullah, Brett Drury, Michael G. Madden
Abstract Accidents and attacks that involve chemical, biological, radiological/nuclear or explosive (CBRNE) substances are rare, but can be of high consequence. Since the investigation of such events is not anybody’s routine work, a range of AI techniques can reduce investigators’ cognitive load and support decision-making, including: planning the assessment of the scene; ongoing evaluation and updating of risks; control of autonomous vehicles for collecting images and sensor data; reviewing images/videos for items of interest; identification of anomalies; and retrieval of relevant documentation. Because of the rare and high-risk nature of these events, realistic simulations can support the development and evaluation of AI-based tools. We have developed realistic models of CBRNE scenarios and implemented an initial set of tools.
Tasks Autonomous Vehicles, Decision Making, Robot Navigation
Published 2018-06-12
URL http://arxiv.org/abs/1806.04497v1
PDF http://arxiv.org/pdf/1806.04497v1.pdf
PWC https://paperswithcode.com/paper/a-virtual-environment-with-multi-robot
Repo
Framework