Paper Group ANR 1024
From Audio to Semantics: Approaches to end-to-end spoken language understanding
Title | From Audio to Semantics: Approaches to end-to-end spoken language understanding |
Authors | Parisa Haghani, Arun Narayanan, Michiel Bacchiani, Galen Chuang, Neeraj Gaur, Pedro Moreno, Rohit Prabhavalkar, Zhongdi Qu, Austin Waters |
Abstract | Conventional spoken language understanding systems consist of two main components: an automatic speech recognition module that converts audio to a transcript, and a natural language understanding module that transforms the resulting text (or top N hypotheses) into a set of domains, intents, and arguments. These modules are typically optimized independently. In this paper, we formulate audio to semantic understanding as a sequence-to-sequence problem [1]. We propose and compare various encoder-decoder based approaches that optimize both modules jointly, in an end-to-end manner. Evaluations on a real-world task show that 1) having an intermediate text representation is crucial for the quality of the predicted semantics, especially the intent arguments and 2) jointly optimizing the full system improves overall accuracy of prediction. Compared to independently trained models, our best jointly trained model achieves similar domain and intent prediction F1 scores, but improves argument word error rate by 18% relative. |
Tasks | Speech Recognition, Spoken Language Understanding |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.09190v1 |
http://arxiv.org/pdf/1809.09190v1.pdf | |
PWC | https://paperswithcode.com/paper/from-audio-to-semantics-approaches-to-end-to |
Repo | |
Framework | |
Guess who? Multilingual approach for the automated generation of author-stylized poetry
Title | Guess who? Multilingual approach for the automated generation of author-stylized poetry |
Authors | Alexey Tikhonov, Ivan P. Yamshchikov |
Abstract | This paper addresses the problem of stylized text generation in a multilingual setup. A version of a language model based on a long short-term memory (LSTM) artificial neural network with extended phonetic and semantic embeddings is used for stylized poetry generation. The quality of the poems generated by the network is estimated through bilingual evaluation understudy (BLEU), a survey, and a new cross-entropy-based metric suggested for problems of this type. The experiments show that the proposed model consistently outperforms random sample and vanilla-LSTM baselines; humans also tend to associate machine-generated texts with the target author. |
Tasks | Language Modelling, Text Generation |
Published | 2018-07-17 |
URL | http://arxiv.org/abs/1807.07147v3 |
http://arxiv.org/pdf/1807.07147v3.pdf | |
PWC | https://paperswithcode.com/paper/guess-who-multilingual-approach-for-the |
Repo | |
Framework | |
Embedding Cardinality Constraints in Neural Link Predictors
Title | Embedding Cardinality Constraints in Neural Link Predictors |
Authors | Emir Muñoz, Pasquale Minervini, Matthias Nickles |
Abstract | Neural link predictors learn distributed representations of entities and relations in a knowledge graph. They are remarkably powerful in the link prediction and knowledge base completion tasks, mainly due to the learned representations that capture important statistical dependencies in the data. Recent works in the area have focused on either designing new scoring functions or incorporating extra information into the learning process to improve the representations. Yet the representations are mostly learned from the observed links between entities, ignoring commonsense or schema knowledge associated with the relations in the graph. A fundamental aspect of the topology of relational data is the cardinality information, which bounds the number of predictions given for a relation between a minimum and maximum frequency. In this paper, we propose a new regularisation approach that incorporates relation cardinality constraints into any existing neural link predictor without affecting its efficiency or scalability. Our regularisation term aims to impose boundaries on the number of predictions with high probability, thus structuring the embedding space to respect commonsense cardinality assumptions, which results in better representations. Experimental results on Freebase, WordNet and YAGO show that, given suitable prior knowledge, the proposed method positively impacts the predictive accuracy of downstream link prediction tasks. |
Tasks | Knowledge Base Completion, Link Prediction |
Published | 2018-12-16 |
URL | http://arxiv.org/abs/1812.06455v1 |
http://arxiv.org/pdf/1812.06455v1.pdf | |
PWC | https://paperswithcode.com/paper/embedding-cardinality-constraints-in-neural |
Repo | |
Framework | |
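The cardinality idea in the abstract above, bounding how many predictions a relation may receive with high probability, can be illustrated with a small sketch. This is not the authors' regularisation term: the function name `cardinality_penalty`, the use of a summed probability as a "soft count", and the hinge-style penalty are all assumptions made for illustration only.

```python
def cardinality_penalty(probabilities, lower, upper):
    """Penalise a soft prediction count that falls outside [lower, upper].

    `probabilities` holds a (hypothetical) link predictor's scores in [0, 1]
    for every candidate object of one subject-relation pair. Their sum acts
    as a differentiable stand-in for the number of predicted links.
    """
    soft_count = sum(probabilities)
    below = max(0.0, lower - soft_count)   # too few confident predictions
    above = max(0.0, soft_count - upper)   # too many confident predictions
    return below + above


# A relation assumed to allow between 1 and 2 objects per subject.
print(cardinality_penalty([0.9, 0.8, 0.7], lower=1, upper=2))  # over the bound
print(cardinality_penalty([0.6, 0.5], lower=1, upper=2))       # within bounds
```

In a training loop, a term of this shape would be added to the usual link prediction loss with a weighting coefficient, so that the predictor's own scoring function stays unchanged.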
Single Image Super-Resolution via Cascaded Multi-Scale Cross Network
Title | Single Image Super-Resolution via Cascaded Multi-Scale Cross Network |
Authors | Yanting Hu, Xinbo Gao, Jie Li, Yuanfei Huang, Hanzi Wang |
Abstract | Deep convolutional neural networks have achieved significant improvements in accuracy and speed for single image super-resolution. However, as network depth grows, the information flow weakens and training becomes increasingly difficult. Moreover, most models adopt a single-stream structure, which makes it difficult to integrate complementary contextual information under different receptive fields. To improve information flow and to capture sufficient knowledge for reconstructing the high-frequency details, we propose a cascaded multi-scale cross network (CMSC) in which a sequence of subnetworks is cascaded to infer high-resolution features in a coarse-to-fine manner. In each cascaded subnetwork, we stack multiple multi-scale cross (MSC) modules to fuse complementary multi-scale information efficiently and to improve information flow across the layers. Meanwhile, by introducing residual-features learning in each stage, the relative information between high-resolution and low-resolution features is fully utilized to further boost reconstruction performance. We train the proposed network with cascaded supervision and then assemble the intermediate predictions of the cascade to achieve high-quality image reconstruction. Extensive quantitative and qualitative evaluations on benchmark datasets illustrate the superiority of our proposed method over state-of-the-art super-resolution methods. |
Tasks | Image Reconstruction, Image Super-Resolution, Super-Resolution |
Published | 2018-02-24 |
URL | http://arxiv.org/abs/1802.08808v1 |
http://arxiv.org/pdf/1802.08808v1.pdf | |
PWC | https://paperswithcode.com/paper/single-image-super-resolution-via-cascaded |
Repo | |
Framework | |
Recurrent Neural Networks for Long and Short-Term Sequential Recommendation
Title | Recurrent Neural Networks for Long and Short-Term Sequential Recommendation |
Authors | Kiewan Villatel, Elena Smirnova, Jérémie Mary, Philippe Preux |
Abstract | The objectives of recommender systems can be broadly characterized as modeling user preferences over a short- or long-term time horizon. A large body of previous research studied long-term recommendation through dimensionality reduction techniques applied to the historical user-item interactions. A recently introduced session-based recommendation setting highlighted the importance of modeling short-term user preferences. In this task, Recurrent Neural Networks (RNNs) have been shown to be successful at capturing the nuances of users' interactions within a short time window. In this paper, we evaluate RNN-based models on both short-term and long-term recommendation tasks. Our experimental results suggest that RNNs are capable of predicting immediate as well as distant user interactions. We also find the best performing configuration to be a stacked RNN with layer normalization and tied item embeddings. |
Tasks | Dimensionality Reduction, Recommendation Systems, Session-Based Recommendations |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.09142v1 |
http://arxiv.org/pdf/1807.09142v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-neural-networks-for-long-and-short |
Repo | |
Framework | |
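The "tied item embeddings" mentioned in the abstract above usually means that the same embedding table is used both to encode input items and to score candidate output items. The sketch below shows only that scoring step, under stated assumptions: `score_items`, the toy embedding table, and the hidden state values are hypothetical, and the RNN that would produce the hidden state is omitted.

```python
def score_items(hidden_state, item_embeddings):
    """Score every item as the dot product of the hidden state with its embedding.

    Because the output layer reuses `item_embeddings` (the input table),
    no separate output weight matrix is needed; the hidden state must
    therefore have the same dimension as an item embedding.
    """
    return [
        sum(h * e for h, e in zip(hidden_state, embedding))
        for embedding in item_embeddings
    ]


# Toy table with three items and two-dimensional embeddings (made-up values).
item_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden_state = [0.5, 2.0]  # stand-in for the last RNN layer's output

scores = score_items(hidden_state, item_embeddings)
best_item = max(range(len(scores)), key=scores.__getitem__)
print(scores, best_item)
```

Tying the two tables roughly halves the number of item-related parameters, which is one common motivation for the design.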
Tangent-Normal Adversarial Regularization for Semi-supervised Learning
Title | Tangent-Normal Adversarial Regularization for Semi-supervised Learning |
Authors | Bing Yu, Jingfeng Wu, Jinwen Ma, Zhanxing Zhu |
Abstract | Compared with standard supervised learning, the key difficulty in semi-supervised learning is how to make full use of the unlabeled data. A recently proposed method, virtual adversarial training (VAT), performs adversarial training without label information to impose local smoothness on the classifier, which is especially beneficial to semi-supervised learning. In this work, we propose tangent-normal adversarial regularization (TNAR) as an extension of VAT that takes the data manifold into consideration. The proposed TNAR is composed of two complementary parts, the tangent adversarial regularization (TAR) and the normal adversarial regularization (NAR). In TAR, VAT is applied along the tangent space of the data manifold, aiming to enforce local invariance of the classifier on the manifold, while in NAR, VAT is performed on the normal space orthogonal to the tangent space, intending to make the classifier robust against the noise that causes the observed data to deviate from the underlying data manifold. Experiments on both artificial and practical datasets demonstrate that our proposed TAR and NAR complement each other and jointly outperform other state-of-the-art methods for semi-supervised learning. |
Tasks | |
Published | 2018-08-18 |
URL | http://arxiv.org/abs/1808.06088v3 |
http://arxiv.org/pdf/1808.06088v3.pdf | |
PWC | https://paperswithcode.com/paper/tangent-normal-adversarial-regularization-for |
Repo | |
Framework | |
A Quasi-Newton algorithm on the orthogonal manifold for NMF with transform learning
Title | A Quasi-Newton algorithm on the orthogonal manifold for NMF with transform learning |
Authors | Pierre Ablin, Dylan Fagot, Herwig Wendt, Alexandre Gramfort, Cédric Févotte |
Abstract | Nonnegative matrix factorization (NMF) is a popular method for audio spectral unmixing. While NMF is traditionally applied to off-the-shelf time-frequency representations based on the short-time Fourier or Cosine transforms, the ability to learn transforms from raw data attracts increasing attention. However, this adds an important computational overhead. When assumed orthogonal (like the Fourier or Cosine transforms), learning the transform yields a non-convex optimization problem on the orthogonal matrix manifold. In this paper, we derive a quasi-Newton method on the manifold using sparse approximations of the Hessian. Experiments on synthetic and real audio data show that the proposed algorithm out-performs state-of-the-art first-order and coordinate-descent methods by orders of magnitude. A Python package for fast TL-NMF is released online at https://github.com/pierreablin/tlnmf. |
Tasks | |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02225v1 |
http://arxiv.org/pdf/1811.02225v1.pdf | |
PWC | https://paperswithcode.com/paper/a-quasi-newton-algorithm-on-the-orthogonal |
Repo | |
Framework | |
A Novel Feature Descriptor for Image Retrieval by Combining Modified Color Histogram and Diagonally Symmetric Co-occurrence Texture Pattern
Title | A Novel Feature Descriptor for Image Retrieval by Combining Modified Color Histogram and Diagonally Symmetric Co-occurrence Texture Pattern |
Authors | Ayan Kumar Bhunia, Avirup Bhattacharyya, Prithaj Banerjee, Partha Pratim Roy, Subrahmanyam Murala |
Abstract | In this paper, we propose a novel feature descriptor that combines color and texture information. In the color component of the descriptor, we explore the inter-channel relationship between the Hue (H) and Saturation (S) channels in the HSV color space, which has not been done before. We quantize the H channel into a number of bins and perform the voting with saturation values, and vice versa, following a principle similar to that of the HOG descriptor, where the orientation of the gradient is quantized into a certain number of bins and voting is done with the gradient magnitude. This helps us to study how saturation varies with hue and how hue varies with saturation. The texture component of our descriptor considers the co-occurrence relationship between the pixels symmetric about both diagonals of a 3x3 window. Our work is inspired by the work of Dubey et al. [1]. These two components, viz. color and texture information, individually perform better than existing texture and color descriptors. Moreover, when concatenated, the proposed descriptors provide significant improvement over existing descriptors for content-based color image retrieval. The proposed descriptor has been tested for image retrieval on five databases, including the texture image databases MIT VisTex and Salzburg, and the natural scene databases Corel 1K, Corel 5K and Corel 10K. The precision and recall values obtained on these databases are compared with those of some state-of-the-art local patterns. The proposed method provided satisfactory results in the experiments. |
Tasks | Image Retrieval |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.00879v1 |
http://arxiv.org/pdf/1801.00879v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-feature-descriptor-for-image |
Repo | |
Framework | |
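The color component described in the abstract above, quantizing one channel into bins and voting with the other channel's value, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `voted_histogram`, the assumption that both channels are scaled to [0, 1], and the equal-width binning are all choices made here.

```python
def voted_histogram(bin_channel, vote_channel, bins):
    """Build a histogram over `bin_channel` where each pixel votes with
    its `vote_channel` value instead of a count of one (HOG-style voting).

    Both channels are flat sequences of floats assumed to lie in [0, 1].
    """
    histogram = [0.0] * bins
    for bin_value, vote in zip(bin_channel, vote_channel):
        index = min(int(bin_value * bins), bins - 1)  # keep 1.0 in the last bin
        histogram[index] += vote
    return histogram


# Three made-up pixels: hue and saturation per pixel.
hue = [0.05, 0.10, 0.95]
saturation = [0.5, 0.25, 1.0]

hue_bins_voted_by_saturation = voted_histogram(hue, saturation, bins=4)
saturation_bins_voted_by_hue = voted_histogram(saturation, hue, bins=4)
print(hue_bins_voted_by_saturation)
print(saturation_bins_voted_by_hue)
```

Calling the same function with the channels swapped gives the "vice versa" histogram; concatenating the two would yield one possible color feature vector.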
Deep Image Super Resolution via Natural Image Priors
Title | Deep Image Super Resolution via Natural Image Priors |
Authors | Hojjat S. Mousavi, Tiantong Guo, Vishal Monga |
Abstract | Single image super-resolution (SR) via deep learning has recently gained significant attention in the literature. Convolutional neural networks (CNNs) are typically learned to represent the mapping between low-resolution (LR) and high-resolution (HR) images/patches with the help of training examples. Most existing deep networks for SR produce high-quality results when training data is abundant. However, their performance degrades sharply when training data is limited. We propose to regularize deep structures with prior knowledge about the images so that they can capture more structural information from the same limited data. In particular, we incorporate, in a tractable fashion within the CNN framework, natural image priors, which have recently been shown to be successful in imaging and vision inverse problems. Experimental results show that the proposed deep network with natural image priors is particularly effective in training-starved regimes. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.02721v1 |
http://arxiv.org/pdf/1802.02721v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-image-super-resolution-via-natural-image |
Repo | |
Framework | |
When deep learning meets security
Title | When deep learning meets security |
Authors | Majd Latah |
Abstract | Deep learning is an emerging research field that has proven its effectiveness in deploying more efficient intelligent systems. Security, on the other hand, is one of the most essential issues in modern communication systems. Recently, many papers have shown that deep learning models can achieve promising results when applied to the security domain. In this work, we provide an overview of recent studies that apply deep learning techniques to the field of security. |
Tasks | |
Published | 2018-07-12 |
URL | http://arxiv.org/abs/1807.04739v1 |
http://arxiv.org/pdf/1807.04739v1.pdf | |
PWC | https://paperswithcode.com/paper/when-deep-learning-meets-security |
Repo | |
Framework | |
Brain MRI Super Resolution Using 3D Deep Densely Connected Neural Networks
Title | Brain MRI Super Resolution Using 3D Deep Densely Connected Neural Networks |
Authors | Yuhua Chen, Yibin Xie, Zhengwei Zhou, Feng Shi, Anthony G. Christodoulou, Debiao Li |
Abstract | Magnetic resonance imaging (MRI) at high spatial resolution provides detailed anatomical information and is often necessary for accurate quantitative analysis. However, high spatial resolution typically comes at the expense of longer scan time, less spatial coverage, and lower signal-to-noise ratio (SNR). Single Image Super-Resolution (SISR), a technique that aims to restore high-resolution (HR) details from a single low-resolution (LR) input image, has been improved dramatically by recent breakthroughs in deep learning. In this paper, we introduce a new neural network architecture, 3D Densely Connected Super-Resolution Networks (DCSRN), to restore HR features of structural brain MR images. Through experiments on a dataset with 1,113 subjects, we demonstrate that our network outperforms bicubic interpolation as well as other deep learning methods in restoring 4x resolution-reduced images. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-01-08 |
URL | http://arxiv.org/abs/1801.02728v1 |
http://arxiv.org/pdf/1801.02728v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-mri-super-resolution-using-3d-deep |
Repo | |
Framework | |
QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks
Title | QuSecNets: Quantization-based Defense Mechanism for Securing Deep Neural Network against Adversarial Attacks |
Authors | Hassan Ali, Hammad Tariq, Muhammad Abdullah Hanif, Faiq Khalid, Semeen Rehman, Rehan Ahmed, Muhammad Shafique |
Abstract | Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial attacks, in which the input examples are perturbed to fool these DNNs towards confidence reduction and (targeted or random) misclassification. In this paper, we demonstrate how an efficient quantization technique can be leveraged to increase the robustness of a given DNN against adversarial attacks. We present two quantization-based defense mechanisms, namely Constant Quantization (CQ) and Variable Quantization (VQ), applied at the input to increase the robustness of DNNs. In CQ, the intensity of the input pixel is quantized according to the number of quantization levels, while in VQ, the quantization levels are recursively updated during the training phase, thereby providing a stronger defense mechanism. We apply our techniques to Convolutional Neural Networks (CNNs, a particular type of DNN that is heavily used in vision-based applications) against adversarial attacks from the open-source Cleverhans library. Our experimental results show a 1%-5% increase in adversarial accuracy for MNIST and a 0%-2.4% increase in adversarial accuracy for CIFAR10. |
Tasks | Quantization |
Published | 2018-11-04 |
URL | http://arxiv.org/abs/1811.01437v1 |
http://arxiv.org/pdf/1811.01437v1.pdf | |
PWC | https://paperswithcode.com/paper/qusecnets-quantization-based-defense |
Repo | |
Framework | |
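The Constant Quantization (CQ) step described in the abstract above, where each input pixel intensity is quantized according to a fixed number of levels, can be sketched as below. This is an illustrative sketch only: the function name `constant_quantize`, the [0, 1] intensity range, and the evenly spaced levels are assumptions, and the paper's exact quantization function may differ.

```python
def constant_quantize(pixels, levels):
    """Snap each intensity in [0, 1] to the nearest of `levels` evenly
    spaced values (0, 1/(levels-1), ..., 1)."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 1.0 / (levels - 1)
    quantized = []
    for pixel in pixels:
        pixel = min(1.0, max(0.0, pixel))           # clamp out-of-range inputs
        quantized.append(round(pixel / step) * step)
    return quantized


# A made-up clean input and a slightly perturbed copy of it.
clean = [0.20, 0.30, 0.80]
perturbed = [0.23, 0.33, 0.77]

# With three levels, both inputs collapse to the same quantized values,
# so this small perturbation never reaches the network.
print(constant_quantize(clean, levels=3))
print(constant_quantize(perturbed, levels=3))
```

Fewer levels absorb larger perturbations but also discard more image detail, which is the trade-off a fixed level count has to balance.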
User Association and Load Balancing for Massive MIMO through Deep Learning
Title | User Association and Load Balancing for Massive MIMO through Deep Learning |
Authors | Alessio Zappone, Luca Sanguinetti, Merouane Debbah |
Abstract | This work investigates the use of deep learning to perform user-cell association for sum-rate maximization in Massive MIMO networks. It is shown how a deep neural network can be trained to approach the optimal association rule with much lower computational complexity, thus enabling the association rule to be updated in real time on the basis of the mobility patterns of users. In particular, the proposed neural network design requires as input only the users’ geographical positions. Numerical results show that it guarantees the same performance as traditional optimization-oriented methods. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.06905v1 |
http://arxiv.org/pdf/1812.06905v1.pdf | |
PWC | https://paperswithcode.com/paper/user-association-and-load-balancing-for |
Repo | |
Framework | |
Cross-Dataset Adaptation for Visual Question Answering
Title | Cross-Dataset Adaptation for Visual Question Answering |
Authors | Wei-Lun Chao, Hexiang Hu, Fei Sha |
Abstract | We investigate the problem of cross-dataset adaptation for visual question answering (Visual QA). Our goal is to train a Visual QA model on a source dataset but apply it to a different target dataset. Analogous to domain adaptation for visual recognition, this setting is appealing when the target dataset does not have a sufficient amount of labeled data to learn an “in-domain” model. The key challenge is that the two datasets are constructed differently, resulting in a cross-dataset mismatch in images, questions, or answers. We overcome this difficulty by proposing a novel domain adaptation algorithm. Our method reduces the difference in statistical distributions by transforming the feature representation of the data in the target dataset. Moreover, it maximizes the likelihood of answering questions (in the target dataset) correctly using the Visual QA model trained on the source dataset. We empirically study the effectiveness of the proposed approach in adapting among several popular Visual QA datasets. We show that the proposed method improves over baselines with no adaptation and over several other adaptation methods. We analyze, both quantitatively and qualitatively, when the adaptation is most effective. |
Tasks | Domain Adaptation, Question Answering, Visual Question Answering |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03726v1 |
http://arxiv.org/pdf/1806.03726v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-dataset-adaptation-for-visual-question |
Repo | |
Framework | |
Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams
Title | Cell Grid Architecture for Maritime Route Prediction on AIS Data Streams |
Authors | Ciprian Amariei, Paul Diac, Emanuel Onica, Valentin Roşca |
Abstract | The 2018 Grand Challenge targets the problem of accurate predictions on data streams produced by automatic identification system (AIS) equipment, describing naval traffic. This paper reports the technical details of a custom solution that exposes multiple tuning parameters, making configurability one of its main strengths. Our solution employs a cell grid architecture essentially based on a sequence of hash tables, specifically built for the targeted use case. This makes it particularly effective in prediction on AIS data, obtaining high accuracy and scalable performance. Moreover, the proposed architecture also accommodates an optional semi-supervised learning process in addition to the basic supervised mode. |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1810.00090v1 |
http://arxiv.org/pdf/1810.00090v1.pdf | |
PWC | https://paperswithcode.com/paper/cell-grid-architecture-for-maritime-route |
Repo | |
Framework | |