January 25, 2020


Paper Group ANR 1650

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding. Towards Learning a Self-inverse Network for Bidirectional Image-to-image Translation. On Flow Profile Image for Video Representation. Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing. Emergent Coordination Through Competition. A study for …

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding


Title	Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding
Authors	Yuchen Liu, Jiajun Zhang, Hao Xiong, Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, Chengqing Zong
Abstract	Speech-to-text translation (ST), which translates source language speech into target language text, has attracted intensive attention in recent years. Compared to the traditional pipeline system, the end-to-end ST model has potential benefits of lower latency, smaller model size, and less error propagation. However, it is notoriously difficult to implement such a model without transcriptions as intermediate. Existing works generally apply multi-task learning to improve translation quality by jointly training end-to-end ST along with automatic speech recognition (ASR). However, different tasks in this method cannot utilize information from each other, which limits the improvement. Other works propose a two-stage model where the second model can use the hidden state from the first one, but its cascade manner greatly affects the efficiency of training and inference process. In this paper, we propose a novel interactive attention mechanism which enables ASR and ST to perform synchronously and interactively in a single model. Specifically, the generation of transcriptions and translations not only relies on its previous outputs but also the outputs predicted in the other task. Experiments on TED speech translation corpora have shown that our proposed model can outperform strong baselines on the quality of speech translation and achieve better speech recognition performances as well.
Tasks	Multi-Task Learning, Speech Recognition
Published	2019-12-16
URL	https://arxiv.org/abs/1912.07240v1
PDF	https://arxiv.org/pdf/1912.07240v1.pdf
PWC	https://paperswithcode.com/paper/synchronous-speech-recognition-and-speech-to
Repo
Framework

Towards Learning a Self-inverse Network for Bidirectional Image-to-image Translation


Title	Towards Learning a Self-inverse Network for Bidirectional Image-to-image Translation
Authors	Zengming Shen, Yifan Chen, S. Kevin Zhou, Bogdan Georgescu, Xuqi Liu, Thomas S. Huang
Abstract	The one-to-one mapping is necessary for many bidirectional image-to-image translation applications, such as MRI image synthesis as MRI images are unique to the patient. State-of-the-art approaches for image synthesis from domain X to domain Y learn a convolutional neural network that meticulously maps between the domains. A different network is typically implemented to map along the opposite direction, from Y to X. In this paper, we explore the possibility of only wielding one network for bi-directional image synthesis. In other words, such an autonomous learning network implements a self-inverse function. A self-inverse network shares several distinct advantages: only one network instead of two, better generalization and more restricted parameter space. Most importantly, a self-inverse function guarantees a one-to-one mapping, a property that cannot be guaranteed by earlier approaches that are not self-inverse. The experiments on three datasets show that, compared with the baseline approaches that use two separate models for the image synthesis along two directions, our self-inverse network achieves better synthesis results in terms of standard metrics. Finally, our sensitivity analysis confirms the feasibility of learning a self-inverse function for the bidirectional image translation.
Tasks	Image Generation, Image-to-Image Translation
Published	2019-09-09
URL	https://arxiv.org/abs/1909.04104v2
PDF	https://arxiv.org/pdf/1909.04104v2.pdf
PWC	https://paperswithcode.com/paper/towards-learning-a-self-inverse-network-for
Repo
Framework
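
The abstract's claim that a self-inverse function guarantees a one-to-one mapping can be checked in two lines. The sketch below assumes both image domains are embedded in a single set $S$, so that the network is a map $f : S \to S$; that embedding is an assumption here, since the abstract does not say how X and Y are combined.

```latex
\begin{align*}
&\text{Assume } f : S \to S \text{ with } f(f(x)) = x \text{ for all } x \in S. \\
&\text{Injective: } f(x_1) = f(x_2) \;\Rightarrow\; x_1 = f(f(x_1)) = f(f(x_2)) = x_2. \\
&\text{Surjective: every } y \in S \text{ satisfies } y = f(f(y)), \text{ so } y \text{ is the image of } f(y). \\
&\text{Hence } f \text{ is a bijection and } f^{-1} = f.
\end{align*}
```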

On Flow Profile Image for Video Representation


Title	On Flow Profile Image for Video Representation
Authors	Mohammadreza Babaee, David Full, Gerhard Rigoll
Abstract	Video representation is a key challenge in many computer vision applications such as video classification, video captioning, and video surveillance. In this paper, we propose a novel approach for video representation that captures meaningful information including motion and appearance from a sequence of video frames and compacts it into a single image. To this end, we compute the optical flow and use it in a least squares optimization to find a new image, the so-called Flow Profile Image (FPI). This image encodes motions as well as foreground appearance information while background information is removed. The quality of this image is validated in activity recognition experiments and the results are compared with other video representation techniques such as dynamic images [1] and eigen images [2]. The experimental results as well as visual quality confirm that FPIs can be successfully used in video processing applications.
Tasks	Activity Recognition, Optical Flow Estimation, Video Captioning, Video Classification
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04668v1
PDF	https://arxiv.org/pdf/1905.04668v1.pdf
PWC	https://paperswithcode.com/paper/on-flow-profile-image-for-video
Repo
Framework

Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing


Title	Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing
Authors	Wenlong Mou, Nhat Ho, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan
Abstract	We study the problem of sampling from the power posterior distribution in Bayesian Gaussian mixture models, a robust version of the classical posterior. This power posterior is known to be non-log-concave and multi-modal, which leads to exponential mixing times for some standard MCMC algorithms. We introduce and study the Reflected Metropolis-Hastings Random Walk (RMRW) algorithm for sampling. For symmetric two-component Gaussian mixtures, we prove that its mixing time is bounded as $d^{1.5}(d + \Vert \theta_{0} \Vert^2)^{4.5}$ as long as the sample size $n$ is of the order $d (d + \Vert \theta_{0} \Vert^2)$. Notably, this result requires no conditions on the separation of the two means. En route to proving this bound, we establish some new results of possible independent interest that allow for combining Poincaré inequalities for conditional and marginal densities.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05153v1
PDF	https://arxiv.org/pdf/1912.05153v1.pdf
PWC	https://paperswithcode.com/paper/sampling-for-bayesian-mixture-models-mcmc
Repo
Framework

Emergent Coordination Through Competition


Title	Emergent Coordination Through Competition
Authors	Siqi Liu, Guy Lever, Josh Merel, Saran Tunyasuvunakool, Nicolas Heess, Thore Graepel
Abstract	We study the emergence of cooperative behaviors in reinforcement learning agents by introducing a challenging competitive multi-agent soccer environment with continuous simulated physics. We demonstrate that decentralized, population-based training with co-play can lead to a progression in agents’ behaviors: from random, to simple ball chasing, and finally showing evidence of cooperation. Our study highlights several of the challenges encountered in large scale multi-agent training in continuous control. In particular, we demonstrate that the automatic optimization of simple shaping rewards, not themselves conducive to co-operative behavior, can lead to long-horizon team behavior. We further apply an evaluation scheme, grounded by game theoretic principles, that can assess agent performance in the absence of pre-defined evaluation tasks or human baselines.
Tasks	Continuous Control
Published	2019-02-19
URL	http://arxiv.org/abs/1902.07151v2
PDF	http://arxiv.org/pdf/1902.07151v2.pdf
PWC	https://paperswithcode.com/paper/emergent-coordination-through-competition
Repo
Framework

A study for Image compression using Re-Pair algorithm


Title	A study for Image compression using Re-Pair algorithm
Authors	Pasquale De Luca, Vincenzo Maria Russiello, Raffaele Ciro Sannino, Lorenzo Valente
Abstract	Compression is an important topic in computer science that allows us to store more data on our storage devices. There are several techniques to compress any file. This manuscript describes the most important algorithm for compressing images, JPEG, and compares it with another method in order to find good reasons not to use that method on images. For text compression, the best-known encoding technique is Huffman encoding, which is explained exhaustively. The manuscript shows how to apply a text compression method to images, in particular the method itself and the reasons for choosing one image format over another. The method studied and analyzed in this manuscript is the Re-Pair algorithm, which is intended purely for grammar-based compression. At the end, the good results of this application are shown.
Tasks	Image Compression
Published	2019-01-30
URL	http://arxiv.org/abs/1901.10744v3
PDF	http://arxiv.org/pdf/1901.10744v3.pdf
PWC	https://paperswithcode.com/paper/a-study-for-image-compression-using-re-pair
Repo
Framework
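
To make the Re-Pair idea concrete: the algorithm repeatedly replaces the most frequent adjacent pair of symbols with a fresh non-terminal and records the replacement as a grammar rule. The following is a naive, quadratic-time sketch of that loop, not the manuscript's implementation. The function names, the `("N", k)` encoding of non-terminals, and the `min_count` parameter are assumptions made for illustration. For an image, `seq` could be a flattened list of pixel values.

```python
from collections import Counter


def repair_compress(seq, min_count=2):
    """Naive Re-Pair-style compression of a sequence of hashable symbols.

    Returns (compressed_sequence, rules), where rules maps each new
    non-terminal to the pair of symbols it replaces.
    Assumes no input symbol is itself a tuple of the form ("N", k).
    """
    seq = list(seq)
    rules = {}
    while True:
        # Count adjacent pairs (overlapping occurrences are counted too,
        # which is a simplification of the original algorithm).
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        new_symbol = ("N", len(rules))  # hypothetical non-terminal encoding
        rules[new_symbol] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def repair_expand(seq, rules):
    """Invert repair_compress by expanding non-terminals depth-first."""
    out = []
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return out


data = list(b"abababab")
compressed, rules = repair_compress(data)
restored = repair_expand(compressed, rules)
```

The round trip is lossless because every rule is stored. The total cost of a real encoder would also include the size of `rules`, which this sketch does not measure.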

Low-dimensional Semantic Space: from Text to Word Embedding


Title	Low-dimensional Semantic Space: from Text to Word Embedding
Authors	Xiaolei Lu, Bin Ni
Abstract	This article focuses on the study of Word Embedding, a feature-learning technique in Natural Language Processing that maps words or phrases to low-dimensional vectors. Beginning with the linguistic theories concerning contextual similarities - “Distributional Hypothesis” and “Context of Situation”, this article introduces two ways of numerical representation of text: One-hot and Distributed Representation. In addition, this article presents statistical-based Language Models (such as Co-occurrence Matrix and Singular Value Decomposition) as well as Neural Network Language Models (NNLM, such as Continuous Bag-of-Words and Skip-Gram). This article also analyzes how Word Embedding can be applied to the study of word-sense disambiguation and diachronic linguistics.
Tasks	Word Sense Disambiguation
Published	2019-11-03
URL	https://arxiv.org/abs/1911.00845v1
PDF	https://arxiv.org/pdf/1911.00845v1.pdf
PWC	https://paperswithcode.com/paper/low-dimensional-semantic-space-from-text-to
Repo
Framework
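
The contrast between one-hot and co-occurrence representations mentioned in the abstract can be illustrated with a toy corpus. This is a minimal sketch rather than the article's method: the corpus, the window size, and the function names are assumptions, and the SVD step that would reduce the co-occurrence rows to low-dimensional vectors is left out to keep the example dependency-free.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_rows(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` positions."""
    rows = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    rows[word][tokens[j]] += 1
    return rows


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (an assumption for illustration only).
sentences = [s.split() for s in ("the cat drinks milk", "the dog drinks water")]
rows = cooccurrence_rows(sentences)

# One-hot vectors of distinct words never overlap, so their similarity is zero.
one_hot_similarity = cosine({"cat": 1}, {"dog": 1})

# Co-occurrence rows overlap when two words appear in similar contexts.
context_similarity = cosine(rows["cat"], rows["dog"])
```

Here "cat" and "dog" share the same neighbours ("the" and "drinks"), so their co-occurrence rows are nearly parallel even though their one-hot vectors are orthogonal, which is the Distributional Hypothesis in miniature.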

Improving Graph Attention Networks with Large Margin-based Constraints


Title	Improving Graph Attention Networks with Large Margin-based Constraints
Authors	Guangtao Wang, Rex Ying, Jing Huang, Jure Leskovec
Abstract	Graph Attention Networks (GATs) are the state-of-the-art neural architecture for representation learning with graphs. GATs learn attention functions that assign weights to nodes so that different nodes have different influences in the feature aggregation steps. In practice, however, induced attention functions are prone to over-fitting due to the increasing number of parameters and the lack of direct supervision on attention weights. GATs also suffer from over-smoothing at the decision boundary of nodes. Here we propose a framework to address their weaknesses via margin-based constraints on attention during training. We first theoretically demonstrate the over-smoothing behavior of GATs and then develop an approach using constraint on the attention weights according to the class boundary and feature aggregation pattern. Furthermore, to alleviate the over-fitting problem, we propose additional constraints on the graph structure. Extensive experiments and ablation studies on common benchmark datasets demonstrate the effectiveness of our method, which leads to significant improvements over the previous state-of-the-art graph attention methods on all datasets.
Tasks	Representation Learning
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11945v1
PDF	https://arxiv.org/pdf/1910.11945v1.pdf
PWC	https://paperswithcode.com/paper/improving-graph-attention-networks-with-large
Repo
Framework

Multiscale Self Attentive Convolutions for Vision and Language Modeling


Title	Multiscale Self Attentive Convolutions for Vision and Language Modeling
Authors	Oren Barkan
Abstract	Self attention mechanisms have become a key building block in many state-of-the-art language understanding models. In this paper, we show that the self attention operator can be formulated in terms of 1x1 convolution operations. Following this observation, we propose several novel operators: First, we introduce a 2D version of self attention that is applicable for 2D signals such as images. Second, we present the 1D and 2D Self Attentive Convolutions (SAC) operator that generalizes self attention beyond 1x1 convolutions to 1xm and nxm convolutions, respectively. While 1D and 2D self attention operate on individual words and pixels, SAC operates on m-grams and image patches, respectively. Third, we present a multiscale version of SAC (MSAC) which analyzes the input by employing multiple SAC operators that vary by filter size, in parallel. Finally, we explain how MSAC can be utilized for vision and language modeling, and further harness MSAC to form a cross attentive image similarity machinery.
Tasks	Language Modelling
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01521v1
PDF	https://arxiv.org/pdf/1912.01521v1.pdf
PWC	https://paperswithcode.com/paper/multiscale-self-attentive-convolutions-for
Repo
Framework
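
One way to read the abstract's observation: the query, key, and value projections of self attention are per-position linear maps, which is exactly what a convolution with kernel width 1 computes. Widening the kernel to m taps makes each projection look at an m-gram instead of a single token. The sketch below follows that reading for the 1D case only. It is an illustration based on the abstract, not the paper's SAC definition, and the function names, zero padding, and scaling factor are assumptions.

```python
import math


def conv1d(x, kernel):
    """Zero-padded 'same' 1D convolution (cross-correlation form).

    x: list of n input vectors, each of length d_in.
    kernel: list of m taps, each a d_in x d_out matrix (nested lists).
    With m == 1 this is a per-position linear projection.
    """
    n, m = len(x), len(kernel)
    d_in, d_out = len(kernel[0]), len(kernel[0][0])
    left = (m - 1) // 2
    out = []
    for t in range(n):
        y = [0.0] * d_out
        for k in range(m):
            s = t + k - left
            if 0 <= s < n:  # positions outside the sequence act as zeros
                for i in range(d_in):
                    for j in range(d_out):
                        y[j] += x[s][i] * kernel[k][i][j]
        out.append(y)
    return out


def softmax(row):
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attentive_conv(x, wq, wk, wv):
    """Scaled dot-product attention whose projections are 1D convolutions."""
    q, k, v = conv1d(x, wq), conv1d(x, wk), conv1d(x, wv)
    scale = math.sqrt(len(q[0]))
    out = []
    for qt in q:
        scores = [sum(a * b for a, b in zip(qt, ks)) / scale for ks in k]
        weights = softmax(scores)
        out.append([sum(w * vs[j] for w, vs in zip(weights, v))
                    for j in range(len(v[0]))])
    return out


identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 2.0], [3.0, 4.0]]
plain_attention = self_attentive_conv(x, [identity], [identity], [identity])
trigram_attention = self_attentive_conv(x, [identity] * 3, [identity] * 3, [identity] * 3)
```

Passing single-tap kernels reproduces ordinary self attention, while three-tap kernels let each query, key, and value summarize a window of three positions.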

Microservices based Framework to Support Interoperable IoT Applications for Enhanced Data Analytics


Title	Microservices based Framework to Support Interoperable IoT Applications for Enhanced Data Analytics
Authors	Sajjad Ali, Muhammad Aslam Jarwar, Ilyoung Chong
Abstract	Internet of things is growing with a large number of diverse objects which generate billions of data streams by sensing, actuating and communicating. Management of heterogeneous IoT objects with existing approaches and processing of myriads of data from these objects using monolithic services have become major challenges in developing effective IoT applications. The heterogeneity can be resolved by providing interoperability with semantic virtualization of objects. Moreover, monolithic services can be substituted with modular microservices. This article presents an architecture that enables the development of IoT applications using semantically interoperable microservices and virtual objects. The proposed framework supports analytic features with knowledge-driven and data-driven techniques to provision intelligent services on top of interoperable microservices in Web Objects enabled IoT environment. The knowledge-driven aspects are supported with reasoning on semantic ontology models and the data-driven aspects are realized with machine learning pipeline. The development of service functionalities is supported with microservices to enhance modularity and reusability. To evaluate the proposed framework a proof of concept implementation with a use case is discussed.
Tasks
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08713v1
PDF	https://arxiv.org/pdf/1910.08713v1.pdf
PWC	https://paperswithcode.com/paper/microservices-based-framework-to-support
Repo
Framework

Blessing of dimensionality at the edge


Title	Blessing of dimensionality at the edge
Authors	Ivan Y. Tyukin, Alexander N. Gorban, Alistair A. McEwan, Sepehr Meshkinfamfard
Abstract	In this paper we present theory and algorithms enabling classes of Artificial Intelligence (AI) systems to continuously and incrementally improve with a-priori quantifiable guarantees - or more specifically remove classification errors - over time. This is distinct from state-of-the-art machine learning, AI, and software approaches. Another feature of this approach is that, in the supervised setting, the computational complexity of training is linear in the number of training samples. At the time of classification, the computational complexity is bounded by a few inner product calculations. Moreover, the implementation is shown to be very scalable. This makes it viable for deployment in applications where computational power and memory are limited, such as embedded environments. It enables the possibility for fast on-line optimisation using improved training samples. The approach is based on the concentration of measure effects and stochastic separation theorems.
Tasks
Published	2019-09-30
URL	https://arxiv.org/abs/1910.00445v1
PDF	https://arxiv.org/pdf/1910.00445v1.pdf
PWC	https://paperswithcode.com/paper/blessing-of-dimensionality-at-the-edge
Repo
Framework

A novel dynamic asset allocation system using Feature Saliency Hidden Markov models for smart beta investing


Title	A novel dynamic asset allocation system using Feature Saliency Hidden Markov models for smart beta investing
Authors	Elizabeth Fons, Paula Dawson, Jeffrey Yau, Xiao-jun Zeng, John Keane
Abstract	The financial crisis of 2008 generated interest in more transparent, rules-based strategies for portfolio construction, with Smart beta strategies emerging as a trend among institutional investors. While they perform well in the long run, these strategies often suffer from severe short-term drawdown (peak-to-trough decline) with fluctuating performance across cycles. To address cyclicality and underperformance, we build a dynamic asset allocation system using Hidden Markov Models (HMMs). We test our system across multiple combinations of smart beta strategies and the resulting portfolios show an improvement in risk-adjusted returns, especially on more return oriented portfolios (up to 50% in excess of market annually). In addition, we propose a novel smart beta allocation system based on the Feature Saliency HMM (FSHMM) algorithm that performs feature selection simultaneously with the training of the HMM, to improve regime identification. We evaluate our systematic trading system with real life assets using MSCI indices; further, the results (up to 60% in excess of market annually) show model performance improvement with respect to portfolios built using full feature HMMs.
Tasks	Feature Selection
Published	2019-02-28
URL	http://arxiv.org/abs/1902.10849v1
PDF	http://arxiv.org/pdf/1902.10849v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-dynamic-asset-allocation-system-using
Repo
Framework
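
The drawdown that the abstract describes as a peak-to-trough decline is commonly measured as the largest fractional drop from a running maximum. A minimal sketch of that calculation follows. The function name and the sample portfolio values are hypothetical, and the code assumes all values are positive.

```python
def max_drawdown(values):
    """Largest fractional decline from a running peak in a series of
    positive portfolio values (0.25 means a 25% drop)."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                     # highest value so far
        worst = max(worst, (peak - value) / peak)   # drop relative to that peak
    return worst


# Hypothetical portfolio values: the peak is 120 and the later trough is 80,
# so the maximum drawdown is 40 / 120, about one third.
drawdown = max_drawdown([100, 120, 90, 110, 80])
```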

Communication-Censored Linearized ADMM for Decentralized Consensus Optimization


Title	Communication-Censored Linearized ADMM for Decentralized Consensus Optimization
Authors	Weiyu Li, Yaohua Liu, Zhi Tian, Qing Ling
Abstract	In this paper, we propose a communication- and computation-efficient algorithm to solve a convex consensus optimization problem defined over a decentralized network. A remarkable existing algorithm to solve this problem is the alternating direction method of multipliers (ADMM), in which at every iteration every node updates its local variable through combining neighboring variables and solving an optimization subproblem. The proposed algorithm, called COmmunication-censored Linearized ADMM (COLA), leverages a linearization technique to reduce the iteration-wise computation cost of ADMM and uses a communication-censoring strategy to alleviate the communication cost. To be specific, COLA introduces successive linearization approximations to the local cost functions such that the resultant computation is first-order and light-weight. Since the linearization technique slows down the convergence speed, COLA further adopts the communication-censoring strategy to avoid transmissions of less informative messages. A node is allowed to transmit only if the distance between the current local variable and its previously transmitted one is larger than a censoring threshold. COLA is proven to be convergent when the local cost functions have Lipschitz continuous gradients and the censoring threshold is summable. When the local cost functions are further strongly convex, we establish the linear (sublinear) convergence rate of COLA, given that the censoring threshold linearly (sublinearly) decays to 0. Numerical experiments corroborate the theoretical findings and demonstrate the satisfactory communication-computation tradeoff of COLA.
Tasks
Published	2019-09-15
URL	https://arxiv.org/abs/1909.06724v1
PDF	https://arxiv.org/pdf/1909.06724v1.pdf
PWC	https://paperswithcode.com/paper/communication-censored-linearized-admm-for
Repo
Framework
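
The censoring rule stated in the abstract (transmit only when the local variable has moved farther than a threshold from the last transmitted copy) is easy to sketch in isolation. The code below covers only that rule and a geometrically decaying threshold schedule, not the linearized ADMM update itself. The function names and the `tau0` and `rho` parameters are assumptions, and a geometric schedule is just one example of a summable threshold sequence.

```python
import math


def censoring_threshold(k, tau0=1.0, rho=0.9):
    """Hypothetical threshold schedule tau_k = tau0 * rho**k with 0 < rho < 1.

    The sequence is summable and decays geometrically to 0.
    """
    return tau0 * rho ** k


def should_transmit(current, last_sent, threshold):
    """Censoring rule: send only if the variable moved more than `threshold`
    (Euclidean distance) away from the previously transmitted value."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(current, last_sent)))
    return distance > threshold


def count_transmissions(trajectory, tau0=1.0, rho=0.9):
    """Count how many iterates of one node would actually be transmitted."""
    last_sent = trajectory[0]
    sent = 0
    for k, current in enumerate(trajectory[1:], start=1):
        if should_transmit(current, last_sent, censoring_threshold(k, tau0, rho)):
            last_sent = current
            sent += 1
    return sent


# A made-up trajectory of one node's local variable settling near [1, 1].
trajectory = [[0.0, 0.0], [0.5, 0.5], [0.9, 0.9], [0.95, 0.95], [1.0, 1.0]]
transmissions = count_transmissions(trajectory)
```

Small moves late in the trajectory are censored, which is how the scheme saves communication once neighbors already hold a nearly up-to-date copy.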


Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS


Title	Neural Architecture Refinement: A Practical Way for Avoiding Overfitting in NAS
Authors	Yang Jiang, Cong Zhao, Zeyang Dou, Lei Pang
Abstract	Neural architecture search (NAS) is proposed to automate the architecture design process and attracts overwhelming interest from both academia and industry. However, it is confronted with an overfitting issue due to the high-dimensional search space composed of the operator selection and skip connections of each layer. This paper explores the architecture overfitting issue in depth based on the reinforcement learning-based NAS framework. We show that the policy gradient method has deep correlations with cross entropy minimization. Based on this correlation, we further demonstrate that, though the reward of NAS is sparse, the policy gradient method implicitly assigns the reward to all operations and skip connections based on the sampling frequency. However, due to the inaccurate reward estimation, the curse of dimensionality, and the hierarchical structure of neural networks, reward characteristics for operators and skip connections have intrinsic differences, and the assigned rewards for the skip connections are extremely noisy and inaccurate. To alleviate this problem, we propose a neural architecture refinement approach that works with an initial state-of-the-art network structure and only refines its operators. Extensive experiments have demonstrated that the proposed method can achieve fascinating results on tasks including classification and face recognition.
Tasks	Face Recognition, Neural Architecture Search
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02341v3
PDF	https://arxiv.org/pdf/1905.02341v3.pdf
PWC	https://paperswithcode.com/paper/neural-architecture-refinement-a-practical
Repo
Framework
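
The link between the policy gradient and cross entropy minimization mentioned in the abstract can be seen from the standard REINFORCE estimator. The derivation below is the textbook identity, not necessarily the paper's own argument. The notation is an assumption: $\pi_\theta$ is the controller's distribution over architecture choices $a$, $R(a)$ is the reward, and $e_a$ is the one-hot distribution on the sampled choice.

```latex
\begin{align*}
\nabla_\theta J(\theta)
  &= \mathbb{E}_{a \sim \pi_\theta}\!\left[ R(a)\, \nabla_\theta \log \pi_\theta(a) \right], \\
H(e_a, \pi_\theta)
  &= -\sum_{a'} e_a(a') \log \pi_\theta(a') = -\log \pi_\theta(a), \\
R(a)\, \nabla_\theta \log \pi_\theta(a)
  &= -R(a)\, \nabla_\theta H(e_a, \pi_\theta).
\end{align*}
```

Each sampled update is therefore a reward-weighted cross entropy step toward the sampled choice, so choices that are sampled more often accumulate more of these updates.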

Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents


Title	Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents
Authors	Nikolaos Malandrakis, Minmin Shen, Anuj Goyal, Shuyang Gao, Abhishek Sethi, Angeliki Metallinou
Abstract	Data availability is a bottleneck during early stages of development of new capabilities for intelligent artificial agents. We investigate the use of text generation techniques to augment the training data of a popular commercial artificial agent across categories of functionality, with the goal of faster development of new functionality. We explore a variety of encoder-decoder generative models for synthetic training data generation and propose using conditional variational auto-encoders. Our approach requires only direct optimization, works well with limited data and significantly outperforms the previous controlled text generation techniques. Further, the generated data are used as additional training samples in an extrinsic intent classification task, leading to improved performance by up to 5% absolute f-score in low-resource cases, validating the usefulness of our approach.
Tasks	Data Augmentation, Intent Classification, Text Generation
Published	2019-10-04
URL	https://arxiv.org/abs/1910.03487v1
PDF	https://arxiv.org/pdf/1910.03487v1.pdf
PWC	https://paperswithcode.com/paper/controlled-text-generation-for-data
Repo
Framework