October 16, 2019

Paper Group ANR 1024

From Audio to Semantics: Approaches to end-to-end spoken language understanding. Guess who? Multilingual approach for the automated generation of author-stylized poetry. Embedding Cardinality Constraints in Neural Link Predictors. Single Image Super-Resolution via Cascaded Multi-Scale Cross Network. Recurrent Neural Networks for Long and Short-Term …

From Audio to Semantics: Approaches to end-to-end spoken language understanding


Title	From Audio to Semantics: Approaches to end-to-end spoken language understanding
Authors	Parisa Haghani, Arun Narayanan, Michiel Bacchiani, Galen Chuang, Neeraj Gaur, Pedro Moreno, Rohit Prabhavalkar, Zhongdi Qu, Austin Waters
Abstract	Conventional spoken language understanding systems consist of two main components: an automatic speech recognition module that converts audio to a transcript, and a natural language understanding module that transforms the resulting text (or top N hypotheses) into a set of domains, intents, and arguments. These modules are typically optimized independently. In this paper, we formulate audio to semantic understanding as a sequence-to-sequence problem [1]. We propose and compare various encoder-decoder based approaches that optimize both modules jointly, in an end-to-end manner. Evaluations on a real-world task show that 1) having an intermediate text representation is crucial for the quality of the predicted semantics, especially the intent arguments and 2) jointly optimizing the full system improves overall accuracy of prediction. Compared to independently trained models, our best jointly trained model achieves similar domain and intent prediction F1 scores, but improves argument word error rate by 18% relative.
Tasks	Speech Recognition, Spoken Language Understanding
Published	2018-09-24
URL	http://arxiv.org/abs/1809.09190v1
PDF	http://arxiv.org/pdf/1809.09190v1.pdf
PWC	https://paperswithcode.com/paper/from-audio-to-semantics-approaches-to-end-to
Repo
Framework

Guess who? Multilingual approach for the automated generation of author-stylized poetry


Title	Guess who? Multilingual approach for the automated generation of author-stylized poetry
Authors	Alexey Tikhonov, Ivan P. Yamshchikov
Abstract	This paper addresses the problem of stylized text generation in a multilingual setup. A version of a language model based on a long short-term memory (LSTM) artificial neural network with extended phonetic and semantic embeddings is used for stylized poetry generation. The quality of the poems generated by the network is estimated through bilingual evaluation understudy (BLEU), a survey, and a new cross-entropy-based metric proposed for problems of this type. The experiments show that the proposed model consistently outperforms random-sample and vanilla-LSTM baselines, and that humans also tend to associate machine-generated texts with the target author.
Tasks	Language Modelling, Text Generation
Published	2018-07-17
URL	http://arxiv.org/abs/1807.07147v3
PDF	http://arxiv.org/pdf/1807.07147v3.pdf
PWC	https://paperswithcode.com/paper/guess-who-multilingual-approach-for-the
Repo
Framework

Embedding Cardinality Constraints in Neural Link Predictors


Title	Embedding Cardinality Constraints in Neural Link Predictors
Authors	Emir Muñoz, Pasquale Minervini, Matthias Nickles
Abstract	Neural link predictors learn distributed representations of entities and relations in a knowledge graph. They are remarkably powerful in the link prediction and knowledge base completion tasks, mainly due to the learned representations that capture important statistical dependencies in the data. Recent works in the area have focused on either designing new scoring functions or incorporating extra information into the learning process to improve the representations. Yet the representations are mostly learned from the observed links between entities, ignoring commonsense or schema knowledge associated with the relations in the graph. A fundamental aspect of the topology of relational data is the cardinality information, which bounds the number of predictions given for a relation between a minimum and maximum frequency. In this paper, we propose a new regularisation approach to incorporate relation cardinality constraints into any existing neural link predictor without affecting its efficiency or scalability. Our regularisation term aims to impose boundaries on the number of predictions with high probability, thus structuring the embedding space to respect commonsense cardinality assumptions, resulting in better representations. Experimental results on Freebase, WordNet and YAGO show that, given suitable prior knowledge, the proposed method positively impacts the predictive accuracy of downstream link prediction tasks.
Tasks	Knowledge Base Completion, Link Prediction
Published	2018-12-16
URL	http://arxiv.org/abs/1812.06455v1
PDF	http://arxiv.org/pdf/1812.06455v1.pdf
PWC	https://paperswithcode.com/paper/embedding-cardinality-constraints-in-neural
Repo
Framework
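
The abstract above describes a regularisation term that bounds the number of high-probability predictions for a relation, but it does not give the term's exact form. A minimal sketch, assuming a hinge penalty on a "soft count" (the sum of predicted link probabilities); the function name `cardinality_penalty` and the example scores are hypothetical and not taken from the paper:

```python
def cardinality_penalty(probabilities, lower, upper):
    """Penalise a soft prediction count that falls outside [lower, upper].

    Assumption: the soft count is the sum of predicted link probabilities
    for one (subject, relation) pair. The paper's actual term may differ.
    """
    soft_count = sum(probabilities)
    too_few = max(0.0, lower - soft_count)   # fewer links than the minimum
    too_many = max(0.0, soft_count - upper)  # more links than the maximum
    return too_few + too_many


# Hypothetical scores for three candidate objects of a relation that is
# expected to hold for at most one object.
scores = [0.9, 0.8, 0.1]
print(cardinality_penalty(scores, lower=0, upper=1))
```

In a training loop, a penalty of this kind would be added to the link predictor's usual loss with a weighting coefficient, so the scoring function itself stays unchanged.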

Single Image Super-Resolution via Cascaded Multi-Scale Cross Network


Title	Single Image Super-Resolution via Cascaded Multi-Scale Cross Network
Authors	Yanting Hu, Xinbo Gao, Jie Li, Yuanfei Huang, Hanzi Wang
Abstract	Deep convolutional neural networks have achieved significant improvements in accuracy and speed for single image super-resolution. However, as network depth grows, the information flow is weakened and training becomes increasingly difficult. Moreover, most models adopt a single-stream structure, which makes it difficult to integrate complementary contextual information from different receptive fields. To improve information flow and to capture sufficient knowledge for reconstructing the high-frequency details, we propose a cascaded multi-scale cross network (CMSC) in which a sequence of subnetworks is cascaded to infer high-resolution features in a coarse-to-fine manner. In each cascaded subnetwork, we stack multiple multi-scale cross (MSC) modules to fuse complementary multi-scale information efficiently and to improve information flow across the layers. Meanwhile, by introducing residual-feature learning in each stage, the relative information between high-resolution and low-resolution features is fully utilized to further boost reconstruction performance. We train the proposed network with cascaded supervision and then assemble the intermediate predictions of the cascade to achieve high-quality image reconstruction. Extensive quantitative and qualitative evaluations on benchmark datasets illustrate the superiority of our proposed method over state-of-the-art super-resolution methods.
Tasks	Image Reconstruction, Image Super-Resolution, Super-Resolution
Published	2018-02-24
URL	http://arxiv.org/abs/1802.08808v1
PDF	http://arxiv.org/pdf/1802.08808v1.pdf
PWC	https://paperswithcode.com/paper/single-image-super-resolution-via-cascaded
Repo
Framework

Recurrent Neural Networks for Long and Short-Term Sequential Recommendation


Title	Recurrent Neural Networks for Long and Short-Term Sequential Recommendation
Authors	Kiewan Villatel, Elena Smirnova, Jérémie Mary, Philippe Preux
Abstract	Recommender system objectives can be broadly characterized as modeling user preferences over a short- or long-term time horizon. A large body of previous research studied long-term recommendation through dimensionality reduction techniques applied to the historical user-item interactions. A recently introduced session-based recommendation setting highlighted the importance of modeling short-term user preferences. In this task, Recurrent Neural Networks (RNNs) have been shown to be successful at capturing the nuances of users’ interactions within a short time window. In this paper, we evaluate RNN-based models on both short-term and long-term recommendation tasks. Our experimental results suggest that RNNs are capable of predicting immediate as well as distant user interactions. We also find the best-performing configuration to be a stacked RNN with layer normalization and tied item embeddings.
Tasks	Dimensionality Reduction, Recommendation Systems, Session-Based Recommendations
Published	2018-07-23
URL	http://arxiv.org/abs/1807.09142v1
PDF	http://arxiv.org/pdf/1807.09142v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-neural-networks-for-long-and-short
Repo
Framework

Tangent-Normal Adversarial Regularization for Semi-supervised Learning


Title	Tangent-Normal Adversarial Regularization for Semi-supervised Learning
Authors	Bing Yu, Jingfeng Wu, Jinwen Ma, Zhanxing Zhu
Abstract	Compared with standard supervised learning, the key difficulty in semi-supervised learning is how to make full use of the unlabeled data. A recently proposed method, virtual adversarial training (VAT), performs adversarial training without label information to impose local smoothness on the classifier, which is especially beneficial to semi-supervised learning. In this work, we propose tangent-normal adversarial regularization (TNAR) as an extension of VAT that takes the data manifold into consideration. The proposed TNAR is composed of two complementary parts, the tangent adversarial regularization (TAR) and the normal adversarial regularization (NAR). In TAR, VAT is applied along the tangent space of the data manifold, aiming to enforce local invariance of the classifier on the manifold, while in NAR, VAT is performed on the normal space orthogonal to the tangent space, intending to make the classifier robust against the noise that causes the observed data to deviate from the underlying data manifold. Experiments on both artificial and practical datasets demonstrate that the proposed TAR and NAR complement each other and jointly outperform other state-of-the-art methods for semi-supervised learning.
Tasks
Published	2018-08-18
URL	http://arxiv.org/abs/1808.06088v3
PDF	http://arxiv.org/pdf/1808.06088v3.pdf
PWC	https://paperswithcode.com/paper/tangent-normal-adversarial-regularization-for
Repo
Framework

A Quasi-Newton algorithm on the orthogonal manifold for NMF with transform learning


Title	A Quasi-Newton algorithm on the orthogonal manifold for NMF with transform learning
Authors	Pierre Ablin, Dylan Fagot, Herwig Wendt, Alexandre Gramfort, Cédric Févotte
Abstract	Nonnegative matrix factorization (NMF) is a popular method for audio spectral unmixing. While NMF is traditionally applied to off-the-shelf time-frequency representations based on the short-time Fourier or Cosine transforms, the ability to learn transforms from raw data is attracting increasing attention. However, this adds a significant computational overhead. When the transform is assumed orthogonal (like the Fourier or Cosine transforms), learning it yields a non-convex optimization problem on the orthogonal matrix manifold. In this paper, we derive a quasi-Newton method on the manifold using sparse approximations of the Hessian. Experiments on synthetic and real audio data show that the proposed algorithm outperforms state-of-the-art first-order and coordinate-descent methods by orders of magnitude. A Python package for fast TL-NMF is released online at https://github.com/pierreablin/tlnmf.
Tasks
Published	2018-11-06
URL	http://arxiv.org/abs/1811.02225v1
PDF	http://arxiv.org/pdf/1811.02225v1.pdf
PWC	https://paperswithcode.com/paper/a-quasi-newton-algorithm-on-the-orthogonal
Repo
Framework

A Novel Feature Descriptor for Image Retrieval by Combining Modified Color Histogram and Diagonally Symmetric Co-occurrence Texture Pattern


Title	A Novel Feature Descriptor for Image Retrieval by Combining Modified Color Histogram and Diagonally Symmetric Co-occurrence Texture Pattern
Authors	Ayan Kumar Bhunia, Avirup Bhattacharyya, Prithaj Banerjee, Partha Pratim Roy, Subrahmanyam Murala
Abstract	In this paper, we propose a novel feature descriptor that combines color and texture information. In the color component of the descriptor, we explore the inter-channel relationship between the Hue (H) and Saturation (S) channels in the HSV color space, which has not been done before. We quantize the H channel into a number of bins and perform the voting with saturation values, and vice versa, following a principle similar to that of the HOG descriptor, where the orientation of the gradient is quantized into a certain number of bins and voting is done with the gradient magnitude. This lets us study how saturation varies with hue and how hue varies with saturation. The texture component of our descriptor considers the co-occurrence relationship between the pixels symmetric about both diagonals of a 3x3 window. Our work is inspired by the work of Dubey et al. [1]. These two components, viz. color and texture information, individually perform better than existing texture and color descriptors. Moreover, when concatenated, the proposed descriptors provide a significant improvement over existing descriptors for content-based color image retrieval. The proposed descriptor has been tested for image retrieval on five databases: two texture image databases (the MIT VisTex database and the Salzburg texture database) and three natural scene databases (Corel 1K, Corel 5K and Corel 10K). The precision and recall values obtained on these databases are compared with those of several state-of-the-art local patterns. The proposed method provided satisfactory results in the experiments.
Tasks	Image Retrieval
Published	2018-01-03
URL	http://arxiv.org/abs/1801.00879v1
PDF	http://arxiv.org/pdf/1801.00879v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-feature-descriptor-for-image
Repo
Framework
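
The hue/saturation voting described in the abstract above can be sketched as a pair of weighted histograms. This is a minimal illustration, assuming H and S values are already normalised to [0, 1]; the function names, the bin count and the plain concatenation are assumptions and may differ from the paper's actual descriptor:

```python
def weighted_histogram(bin_channel, vote_channel, num_bins):
    """Quantize bin_channel into num_bins and vote with vote_channel values."""
    hist = [0.0] * num_bins
    for bin_value, vote in zip(bin_channel, vote_channel):
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        index = min(int(bin_value * num_bins), num_bins - 1)
        hist[index] += vote
    return hist


def hs_color_descriptor(hue, saturation, num_bins=8):
    """Concatenate hue bins voted by saturation and saturation bins voted by hue."""
    hue_hist = weighted_histogram(hue, saturation, num_bins)
    sat_hist = weighted_histogram(saturation, hue, num_bins)
    return hue_hist + sat_hist


# Three hypothetical pixels given as flat per-channel lists.
hue = [0.1, 0.1, 0.9]
saturation = [0.5, 0.25, 1.0]
print(hs_color_descriptor(hue, saturation, num_bins=2))
```

The analogy with HOG is that the binned channel plays the role of gradient orientation and the voting channel plays the role of gradient magnitude.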

Deep Image Super Resolution via Natural Image Priors


Title	Deep Image Super Resolution via Natural Image Priors
Authors	Hojjat S. Mousavi, Tiantong Guo, Vishal Monga
Abstract	Single image super-resolution (SR) via deep learning has recently gained significant attention in the literature. Convolutional neural networks (CNNs) are typically trained to represent the mapping between low-resolution (LR) and high-resolution (HR) images/patches with the help of training examples. Most existing deep networks for SR produce high-quality results when training data is abundant. However, their performance degrades sharply when training data is limited. We propose to regularize deep structures with prior knowledge about the images so that they can capture more structural information from the same limited data. In particular, we tractably incorporate into the CNN framework natural image priors, which have recently proven successful in imaging and vision inverse problems. Experimental results show that the proposed deep network with natural image priors is particularly effective in training-starved regimes.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-02-08
URL	http://arxiv.org/abs/1802.02721v1
PDF	http://arxiv.org/pdf/1802.02721v1.pdf
PWC	https://paperswithcode.com/paper/deep-image-super-resolution-via-natural-image
Repo
Framework

When deep learning meets security


Title	When deep learning meets security
Authors	Majd Latah
Abstract	Deep learning is an emerging research field that has proven its effectiveness in deploying more efficient intelligent systems. Security, on the other hand, is one of the most essential issues in modern communication systems. Recently, many papers have shown that deep learning models can achieve promising results when applied to the security domain. In this work, we provide an overview of recent studies that apply deep learning techniques to the field of security.
Tasks
Published	2018-07-12
URL	http://arxiv.org/abs/1807.04739v1
PDF	http://arxiv.org/pdf/1807.04739v1.pdf
PWC	https://paperswithcode.com/paper/when-deep-learning-meets-security
Repo
Framework

Brain MRI Super Resolution Using 3D Deep Densely Connected Neural Networks


Title	Brain MRI Super Resolution Using 3D Deep Densely Connected Neural Networks
Authors	Yuhua Chen, Yibin Xie, Zhengwei Zhou, Feng Shi, Anthony G. Christodoulou, Debiao Li
Abstract	Magnetic resonance imaging (MRI) at high spatial resolution provides detailed anatomical information and is often necessary for accurate quantitative analysis. However, high spatial resolution typically comes at the expense of longer scan time, less spatial coverage, and lower signal-to-noise ratio (SNR). Single Image Super-Resolution (SISR), a technique that aims to restore high-resolution (HR) details from a single low-resolution (LR) input image, has been improved dramatically by recent breakthroughs in deep learning. In this paper, we introduce a new neural network architecture, 3D Densely Connected Super-Resolution Networks (DCSRN), to restore HR features of structural brain MR images. Through experiments on a dataset with 1,113 subjects, we demonstrate that our network outperforms bicubic interpolation as well as other deep learning methods in restoring 4x resolution-reduced images.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-01-08
URL	http://arxiv.org/abs/1801.02728v1
PDF	http://arxiv.org/pdf/1801.02728v1.pdf
PWC	https://paperswithcode.com/paper/brain-mri-super-resolution-using-3d-deep
Repo
Framework

QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks


Title	QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks
Authors	Hassan Ali, Hammad Tariq, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed, Muhammad Shafique
Abstract	Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial attacks, in which the input examples are perturbed to fool these DNNs towards confidence reduction and (targeted or random) misclassification. In this paper, we demonstrate how an efficient quantization technique can be leveraged to increase the robustness of a given DNN against adversarial attacks. We present two quantization-based defense mechanisms, namely Constant Quantization (CQ) and Variable Quantization (VQ), applied at the input to increase the robustness of DNNs. In CQ, the intensity of the input pixel is quantized according to the number of quantization levels, while in VQ, the quantization levels are recursively updated during the training phase, thereby providing a stronger defense mechanism. We apply our techniques to Convolutional Neural Networks (CNNs, a particular type of DNN that is heavily used in vision-based applications) against adversarial attacks from the open-source Cleverhans library. Our experimental results show a 1%-5% increase in the adversarial accuracy for MNIST and a 0%-2.4% increase in the adversarial accuracy for CIFAR10.
Tasks	Quantization
Published	2018-11-04
URL	http://arxiv.org/abs/1811.01437v1
PDF	http://arxiv.org/pdf/1811.01437v1.pdf
PWC	https://paperswithcode.com/paper/qusecnets-quantization-based-defense
Repo
Framework
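
To make the Constant Quantization idea above concrete, here is a minimal sketch that snaps each input intensity to the nearest of a fixed number of evenly spaced levels. The abstract does not specify the quantization function, so the uniform spacing, the [0, 1] intensity range and the name `constant_quantize` are assumptions for illustration only:

```python
def constant_quantize(pixels, levels):
    """Snap each intensity in [0, 1] to the nearest of `levels` evenly spaced values.

    Assumption: uniform levels over [0, 1]. The paper's CQ may be defined differently.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 1.0 / (levels - 1)
    quantized = []
    for p in pixels:
        clipped = min(max(p, 0.0), 1.0)  # keep the intensity inside [0, 1]
        quantized.append(round(clipped / step) * step)
    return quantized


# A small hypothetical perturbation (0.90 -> 0.95) maps to the same level,
# which is the intuition behind quantizing the input as a defense.
clean = constant_quantize([0.90, 0.10], levels=2)
perturbed = constant_quantize([0.95, 0.05], levels=2)
print(clean == perturbed)
```

Fewer levels absorb larger perturbations but also discard more of the original image detail. Variable Quantization, as described in the abstract, would learn the levels during training instead of fixing them.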

User Association and Load Balancing for Massive MIMO through Deep Learning


Title	User Association and Load Balancing for Massive MIMO through Deep Learning
Authors	Alessio Zappone, Luca Sanguinetti, Merouane Debbah
Abstract	This work investigates the use of deep learning to perform user-cell association for sum-rate maximization in Massive MIMO networks. It is shown how a deep neural network can be trained to approach the optimal association rule with much lower computational complexity, thus enabling the association rule to be updated in real time on the basis of the mobility patterns of users. In particular, the proposed neural network design requires as input only the users’ geographical positions. Numerical results show that it guarantees the same performance as traditional optimization-oriented methods.
Tasks
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06905v1
PDF	http://arxiv.org/pdf/1812.06905v1.pdf
PWC	https://paperswithcode.com/paper/user-association-and-load-balancing-for
Repo
Framework

Cross-Dataset Adaptation for Visual Question Answering


Title	Cross-Dataset Adaptation for Visual Question Answering
Authors	Wei-Lun Chao, Hexiang Hu, Fei Sha
Abstract	We investigate the problem of cross-dataset adaptation for visual question answering (Visual QA). Our goal is to train a Visual QA model on a source dataset but apply it to a different target dataset. Analogous to domain adaptation for visual recognition, this setting is appealing when the target dataset does not have a sufficient amount of labeled data to learn an “in-domain” model. The key challenge is that the two datasets are constructed differently, resulting in a cross-dataset mismatch in images, questions, or answers. We overcome this difficulty by proposing a novel domain adaptation algorithm. Our method reduces the difference in statistical distributions by transforming the feature representation of the data in the target dataset. Moreover, it maximizes the likelihood of answering questions (in the target dataset) correctly using the Visual QA model trained on the source dataset. We empirically study the effectiveness of the proposed approach in adapting among several popular Visual QA datasets. We show that the proposed method improves over baselines with no adaptation and over several other adaptation methods. We analyze, both quantitatively and qualitatively, when the adaptation is most effective.
Tasks	Domain Adaptation, Question Answering, Visual Question Answering
Published	2018-06-10
URL	http://arxiv.org/abs/1806.03726v1
PDF	http://arxiv.org/pdf/1806.03726v1.pdf
PWC	https://paperswithcode.com/paper/cross-dataset-adaptation-for-visual-question
Repo
Framework

Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams


Title	Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams
Authors	Ciprian Amariei, Paul Diac, Emanuel Onica, Valentin Roşca
Abstract	The 2018 Grand Challenge targets the problem of accurate predictions on data streams produced by automatic identification system (AIS) equipment, describing naval traffic. This paper reports the technical details of a custom solution that exposes multiple tuning parameters, making configurability one of its main strengths. Our solution employs a cell grid architecture essentially based on a sequence of hash tables, specifically built for the targeted use case. This makes it particularly effective for prediction on AIS data, achieving high accuracy and scalable performance. Moreover, the proposed architecture also accommodates an optional semi-supervised learning process in addition to the basic supervised mode.
Tasks
Published	2018-09-28
URL	http://arxiv.org/abs/1810.00090v1
PDF	http://arxiv.org/pdf/1810.00090v1.pdf
PWC	https://paperswithcode.com/paper/cell-grid-architecture-for-maritime-route
Repo
Framework