October 18, 2019


Paper Group ANR 628

Adpositional Supersenses for Mandarin Chinese. Network Decoupling: From Regular to Depthwise Separable Convolutions. Adversarial Auto-encoders for Speech Based Emotion Recognition. Generating Clues for Gender based Occupation De-biasing in Text. Robust Point Light Source Estimation Using Differentiable Rendering. Are BLEU and Meaning Representation …

Adpositional Supersenses for Mandarin Chinese


Title	Adpositional Supersenses for Mandarin Chinese
Authors	Yilun Zhu, Yang Liu, Siyao Peng, Austin Blodgett, Yushi Zhao, Nathan Schneider
Abstract	This study adapts Semantic Network of Adposition and Case Supersenses (SNACS) annotation to Mandarin Chinese and demonstrates that the same supersense categories are appropriate for Chinese adposition semantics. We annotated 15 chapters of The Little Prince, with high interannotator agreement. The parallel corpus gives insight into differences in construal between the two languages’ adpositions, namely a number of construals that are frequent in Chinese but rare or unattested in the English corpus. The annotated corpus can further support automatic disambiguation of adpositions in Chinese, and the common inventory of supersenses between the two languages can potentially serve cross-linguistic tasks such as machine translation.
Tasks	Machine Translation
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02317v1
PDF	http://arxiv.org/pdf/1812.02317v1.pdf
PWC	https://paperswithcode.com/paper/adpositional-supersenses-for-mandarin-chinese
Repo
Framework

Network Decoupling: From Regular to Depthwise Separable Convolutions


Title	Network Decoupling: From Regular to Depthwise Separable Convolutions
Authors	Jianbo Guo, Yuxi Li, Weiyao Lin, Yurong Chen, Jianguo Li
Abstract	Depthwise separable convolution has shown great efficiency in network design, but requires a time-consuming training procedure with the full training set available. This paper first analyzes the mathematical relationship between regular convolutions and depthwise separable convolutions, and proves that the former can be approximated by the latter in closed form. We show that depthwise separable convolutions are principal components of regular convolutions. We then propose network decoupling (ND), a training-free method to accelerate convolutional neural networks (CNNs) by transferring pre-trained CNN models into the MobileNet-like depthwise separable convolution structure, with a promising speedup yet negligible accuracy loss. We further verify through experiments that the proposed method is orthogonal to other training-free methods such as channel decomposition and spatial decomposition. Combining the proposed method with them brings an even larger CNN speedup. For instance, ND itself achieves about 2X speedup for the widely used VGG16, and combined with other methods, it reaches 3.7X speedup with graceful accuracy degradation. We demonstrate that ND is widely applicable to classification networks like ResNet and object detection networks like SSD300.
Tasks	Object Detection
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05517v1
PDF	http://arxiv.org/pdf/1808.05517v1.pdf
PWC	https://paperswithcode.com/paper/network-decoupling-from-regular-to-depthwise
Repo
Framework
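
The efficiency argument in the abstract above can be illustrated with a rough operation count. The following is a minimal sketch, not the authors' code: it compares the multiply-accumulate (MAC) cost of one regular convolution with one depthwise separable block (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The function names and the layer sizes in the usage note are assumptions chosen for illustration, and the ratio is a theoretical count for a single block, not the measured speedup reported in the paper.

```python
def regular_conv_macs(c_in, c_out, k, h, w):
    """MACs for a regular k x k convolution producing an h x w output map."""
    return c_in * c_out * k * k * h * w


def depthwise_separable_macs(c_in, c_out, k, h, w):
    """MACs for a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k * h * w      # one k x k filter per input channel
    pointwise = c_in * c_out * h * w      # 1 x 1 convolution mixing channels
    return depthwise + pointwise


def theoretical_speedup(c_in, c_out, k, h, w):
    """Ratio of regular to depthwise separable cost for a single layer."""
    return regular_conv_macs(c_in, c_out, k, h, w) / depthwise_separable_macs(
        c_in, c_out, k, h, w
    )


if __name__ == "__main__":
    # Hypothetical layer: 256 -> 256 channels, 3 x 3 kernel, 56 x 56 output.
    print(round(theoretical_speedup(256, 256, 3, 56, 56), 2))
```

The ratio simplifies to c_out * k^2 / (k^2 + c_out), so it is independent of the spatial size and of c_in. A method that approximates one regular convolution with several separable blocks would see a correspondingly smaller gain.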

Adversarial Auto-encoders for Speech Based Emotion Recognition


Title	Adversarial Auto-encoders for Speech Based Emotion Recognition
Authors	Saurabh Sahu, Rahul Gupta, Ganesh Sivaraman, Wael AbdAlmageed, Carol Espy-Wilson
Abstract	Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition. They map the autoencoder’s bottleneck layer output (termed code vectors) to different noise Probability Distribution Functions (PDFs), which can be further regularized to cluster based on class information. In addition, they allow the generation of synthetic samples by sampling the code vectors from the mapped PDFs. Inspired by these properties, we investigate the application of adversarial autoencoders to the domain of emotion recognition. Specifically, we conduct experiments on the following two aspects: (i) their ability to encode high dimensional feature vector representations for emotional utterances into a compressed space (with a minimal loss of emotion class discriminability in the compressed space), and (ii) their ability to regenerate synthetic samples in the original feature space, to be later used for purposes such as training emotion recognition classifiers. We demonstrate the promise of adversarial autoencoders with regard to these aspects on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus and present our analysis.
Tasks	Emotion Recognition, Face Recognition, Motion Capture
Published	2018-06-06
URL	http://arxiv.org/abs/1806.02146v1
PDF	http://arxiv.org/pdf/1806.02146v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-auto-encoders-for-speech-based
Repo
Framework

Generating Clues for Gender based Occupation De-biasing in Text


Title	Generating Clues for Gender based Occupation De-biasing in Text
Authors	Nishtha Madaan, Gautam Singh, Sameep Mehta, Aditya Chetan, Brihi Joshi
Abstract	The vast availability of text data has enabled widespread training and use of AI systems that not only learn and predict attributes from text but also generate text automatically. However, these AI models also learn gender, racial and ethnic biases present in the training data. In this paper, we present the first system that detects the possibility that a given text portrays a gender stereotype associated with an occupation. If the possibility exists, the system offers counter-evidence of the opposite gender also being associated with the same occupation in the context of a user-provided geography and timespan. The system thus enables text de-biasing by assisting a human-in-the-loop. The system can not only act as a text pre-processor before training any AI model but also help human story writers write stories free of occupation-level gender bias in the geographical and temporal context of their choice.
Tasks
Published	2018-04-11
URL	http://arxiv.org/abs/1804.03839v1
PDF	http://arxiv.org/pdf/1804.03839v1.pdf
PWC	https://paperswithcode.com/paper/generating-clues-for-gender-based-occupation
Repo
Framework

Robust Point Light Source Estimation Using Differentiable Rendering


Title	Robust Point Light Source Estimation Using Differentiable Rendering
Authors	Grégoire Nieto, Salma Jiddi, Philippe Robert
Abstract	Illumination estimation is often used in mixed reality to re-render a scene from another point of view, to change the color/texture of an object, or to insert a virtual object consistently lit into a real video or photograph. Specifically, the estimation of a point light source is required for the shadows cast by the inserted object to be consistent with the real scene. We tackle the problem of illumination retrieval given an RGBD image of the scene as an inverse problem: we aim to find the illumination that minimizes the photometric error between the rendered image and the observation. In particular we propose a novel differentiable renderer based on the Blinn-Phong model with cast shadows. We compare our differentiable renderer to state-of-the-art methods and demonstrate its robustness to an incorrect reflectance estimation.
Tasks	Outdoor Light Source Estimation
Published	2018-12-12
URL	http://arxiv.org/abs/1812.04857v1
PDF	http://arxiv.org/pdf/1812.04857v1.pdf
PWC	https://paperswithcode.com/paper/robust-point-light-source-estimation-using
Repo
Framework

Are BLEU and Meaning Representation in Opposition?


Title	Are BLEU and Meaning Representation in Opposition?
Authors	Ondřej Cífka, Ondřej Bojar
Abstract	One possible way of obtaining continuous-space sentence representations is to train neural machine translation (NMT) systems. The recent attention mechanism, however, removes the single point in the neural network from which the source sentence representation can be extracted. We propose several variations of the attentive NMT architecture that bring this meeting point back. Empirical evaluation suggests that the better the translation quality, the worse the learned sentence representations serve in a wide range of classification and similarity tasks.
Tasks	Machine Translation
Published	2018-05-16
URL	http://arxiv.org/abs/1805.06536v1
PDF	http://arxiv.org/pdf/1805.06536v1.pdf
PWC	https://paperswithcode.com/paper/are-bleu-and-meaning-representation-in
Repo
Framework

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey


Title	Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
Authors	Nguyen Cong Luong, Dinh Thai Hoang, Shimin Gong, Dusit Niyato, Ping Wang, Ying-Chang Liang, Dong In Kim
Abstract	This paper presents a comprehensive literature review on applications of deep reinforcement learning in communications and networking. Modern networks, e.g., Internet of Things (IoT) and Unmanned Aerial Vehicle (UAV) networks, are becoming more decentralized and autonomous. In such networks, network entities need to make decisions locally to maximize the network performance under uncertainty in the network environment. Reinforcement learning has been used efficiently to enable the network entities to obtain the optimal policy, e.g., decisions or actions given their states, when the state and action spaces are small. However, in complex and large-scale networks, the state and action spaces are usually large, and reinforcement learning may not be able to find the optimal policy in reasonable time. Therefore, deep reinforcement learning, a combination of reinforcement learning with deep learning, has been developed to overcome these shortcomings. In this survey, we first give a tutorial on deep reinforcement learning, from fundamental concepts to advanced models. Then, we review deep reinforcement learning approaches proposed to address emerging issues in communications and networking. The issues include dynamic network access, data rate control, wireless caching, data offloading, network security, and connectivity preservation, which are all important to next generation networks such as 5G and beyond. Furthermore, we present applications of deep reinforcement learning for traffic routing, resource sharing, and data collection. Finally, we highlight important challenges, open issues, and future research directions of applying deep reinforcement learning.
Tasks
Published	2018-10-18
URL	http://arxiv.org/abs/1810.07862v1
PDF	http://arxiv.org/pdf/1810.07862v1.pdf
PWC	https://paperswithcode.com/paper/applications-of-deep-reinforcement-learning
Repo
Framework
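
The small-state-space setting mentioned in the abstract above is the regime where tabular reinforcement learning applies. As a minimal sketch under stated assumptions, the standard tabular Q-learning update is shown below; the state and action labels in the usage note (an "idle" state and two channels) are hypothetical and not taken from the survey. Deep reinforcement learning replaces the table `q` with a neural network when the spaces are too large to enumerate.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) pairs to estimated returns; missing entries are 0.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Pick the action with the highest current Q-value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


if __name__ == "__main__":
    # Hypothetical toy problem: choose one of two channels while idle.
    q = defaultdict(float)
    actions = ["ch0", "ch1"]
    q_update(q, "idle", "ch1", 1.0, "idle", actions, alpha=0.5)
    print(greedy_action(q, "idle", actions))
```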

Exposing Deep Fakes Using Inconsistent Head Poses


Title	Exposing Deep Fakes Using Inconsistent Head Poses
Authors	Xin Yang, Yuezun Li, Siwei Lyu
Abstract	In this paper, we propose a new method to expose AI-generated fake face images or videos (commonly known as Deep Fakes). Our method is based on the observation that Deep Fakes are created by splicing a synthesized face region into the original image, which introduces errors that can be revealed when 3D head poses are estimated from the face images. We perform experiments to demonstrate this phenomenon and further develop a classification method based on this cue. Using features based on this cue, an SVM classifier is evaluated on a set of real face images and Deep Fakes.
Tasks
Published	2018-11-01
URL	http://arxiv.org/abs/1811.00661v2
PDF	http://arxiv.org/pdf/1811.00661v2.pdf
PWC	https://paperswithcode.com/paper/exposing-deep-fakes-using-inconsistent-head
Repo
Framework

Learning to Train a Binary Neural Network


Title	Learning to Train a Binary Neural Network
Authors	Joseph Bethge, Haojin Yang, Christian Bartz, Christoph Meinel
Abstract	Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Binary neural networks in particular seem to be a promising approach for devices with low computational power. However, understanding binary neural networks and training accurate models for practical applications remains a challenge. In our work, we focus on increasing our understanding of the training process and making it accessible to everyone. We publish our code and models based on BMXNet for everyone to use. Within this framework, we systematically evaluated different network architectures and hyperparameters to provide useful insights on how to train a binary neural network. Further, we present how we improved accuracy by increasing the number of connections in the network.
Tasks
Published	2018-09-27
URL	http://arxiv.org/abs/1809.10463v1
PDF	http://arxiv.org/pdf/1809.10463v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-train-a-binary-neural-network
Repo
Framework
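
The appeal of binary networks on low-power devices, noted in the abstract above, comes largely from replacing floating-point multiply-accumulates with bitwise operations. The sketch below illustrates the commonly used XNOR-and-popcount identity for the dot product of two vectors with entries in {-1, +1}. It is a plain-Python illustration under assumed helper names, not code from BMXNet or from the paper.

```python
def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +/-1 vector into an int; bit i is set when signs[i] == +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")  # positions where signs agree
    return 2 * matches - n  # agreements contribute +1, disagreements -1
```

For two binarized vectors `a` and `b` of equal length, `binary_dot(pack_bits(a), pack_bits(b), len(a))` equals `sum(x * y for x, y in zip(a, b))`.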

The morphospace of language networks


Title	The morphospace of language networks
Authors	Luís F Seoane, Ricard Solé
Abstract	Language can be described as a network of interacting objects with different qualitative properties and complexity. These networks include semantic, syntactic, or phonological levels and have been found to provide a new picture of language complexity and its evolution. A general approach considers language from an information theory perspective that incorporates a speaker, a hearer, and a noisy channel. The latter is often encoded in a matrix connecting the signals used for communication with meanings to be found in the real world. Most studies of language evolution deal in one way or another with such a theoretical contraption and explore the outcome of diverse forms of selection on the communication matrix that somewhat optimizes communication. This framework naturally introduces networks mediating the communicating agents, but no systematic analysis of the underlying landscape of possible language graphs has been developed. Here we present a detailed analysis of network properties on a generic model of a communication code, which reveals a rather complex and heterogeneous morphospace of language networks. Additionally, we use curated data of English words to locate and evaluate real languages within this language morphospace. Our findings indicate a surprisingly simple structure in human language unless particles, with the ability of naming any other concept, are introduced in the vocabulary. These results refine and, for the first time, complement with empirical data a lasting theoretical tradition around the framework of least effort language.
Tasks
Published	2018-03-05
URL	http://arxiv.org/abs/1803.01934v1
PDF	http://arxiv.org/pdf/1803.01934v1.pdf
PWC	https://paperswithcode.com/paper/the-morphospace-of-language-networks
Repo
Framework

Improving Generalization of Sequence Encoder-Decoder Networks for Inverse Imaging of Cardiac Transmembrane Potential


Title	Improving Generalization of Sequence Encoder-Decoder Networks for Inverse Imaging of Cardiac Transmembrane Potential
Authors	Sandesh Ghimire, Prashnna Kumar Gyawali, John L Sapp, Milan Horacek, Linwei Wang
Abstract	Deep learning models have shown state-of-the-art performance in many inverse reconstruction problems. However, it is not well understood what properties of the latent representation may improve the generalization ability of the network. Furthermore, limited models have been presented for inverse reconstructions over time sequences. In this paper, we study the generalization ability of a sequence encoder decoder model for solving inverse reconstructions on time sequences. Our central hypothesis is that the generalization ability of the network can be improved by 1) constrained stochasticity and 2) global aggregation of temporal information in the latent space. First, drawing from analytical learning theory, we theoretically show that a stochastic latent space will lead to an improved generalization ability. Second, we consider an LSTM encoder-decoder architecture that compresses a global latent vector from all last-layer units in the LSTM encoder. This model is compared with alternative LSTM encoder-decoder architectures, each in deterministic and stochastic versions. The results demonstrate that the generalization ability of an inverse reconstruction network can be improved by constrained stochasticity combined with global aggregation of temporal information in the latent space.
Tasks
Published	2018-10-12
URL	http://arxiv.org/abs/1810.05713v1
PDF	http://arxiv.org/pdf/1810.05713v1.pdf
PWC	https://paperswithcode.com/paper/improving-generalization-of-sequence-encoder
Repo
Framework

Multi-Modal Trajectory Prediction of Surrounding Vehicles with Maneuver based LSTMs


Title	Multi-Modal Trajectory Prediction of Surrounding Vehicles with Maneuver based LSTMs
Authors	Nachiket Deo, Mohan M. Trivedi
Abstract	To safely and efficiently navigate through complex traffic scenarios, autonomous vehicles need to have the ability to predict the future motion of surrounding vehicles. Multiple interacting agents, the multi-modal nature of driver behavior, and the inherent uncertainty involved in the task make motion prediction of surrounding vehicles a challenging problem. In this paper, we present an LSTM model for interaction aware motion prediction of surrounding vehicles on freeways. Our model assigns confidence values to maneuvers being performed by vehicles and outputs a multi-modal distribution over future motion based on them. We compare our approach with the prior art for vehicle motion prediction on the publicly available NGSIM US-101 and I-80 datasets. Our results show an improvement in terms of RMS values of prediction error. We also present an ablative analysis of the components of our proposed model and analyze the predictions made by the model in complex traffic scenarios.
Tasks	Autonomous Vehicles, motion prediction, Trajectory Prediction
Published	2018-05-15
URL	http://arxiv.org/abs/1805.05499v1
PDF	http://arxiv.org/pdf/1805.05499v1.pdf
PWC	https://paperswithcode.com/paper/multi-modal-trajectory-prediction-of
Repo
Framework

Dantzig Selector with an Approximately Optimal Denoising Matrix and its Application to Reinforcement Learning


Title	Dantzig Selector with an Approximately Optimal Denoising Matrix and its Application to Reinforcement Learning
Authors	Bo Liu, Luwan Zhang, Ji Liu
Abstract	Dantzig Selector (DS) is widely used in compressed sensing and sparse learning for feature selection and sparse signal recovery. Since the DS formulation is essentially a linear programming optimization, many existing linear programming solvers can be simply applied for scaling up. The DS formulation can be explained as a basis pursuit denoising problem, wherein the data matrix (or measurement matrix) is employed as the denoising matrix to eliminate the observation noise. However, we notice that the data matrix may not be the optimal denoising matrix, as shown by a simple counter-example. This motivates us to pursue a better denoising matrix for defining a general DS formulation. We first define the optimal denoising matrix through a minimax optimization, which turns out to be an NP-hard problem. To make the problem computationally tractable, we propose a novel algorithm, termed Optimal Denoising Dantzig Selector (ODDS), to approximately estimate the optimal denoising matrix. Empirical experiments validate the proposed method. Finally, a novel sparse reinforcement learning algorithm is formulated by extending the proposed ODDS algorithm to temporal difference learning, and empirical results show that it outperforms the conventional vanilla DS-TD algorithm.
Tasks	Denoising, Feature Selection, Sparse Learning
Published	2018-11-02
URL	http://arxiv.org/abs/1811.00958v1
PDF	http://arxiv.org/pdf/1811.00958v1.pdf
PWC	https://paperswithcode.com/paper/dantzig-selector-with-an-approximately
Repo
Framework
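
For reference, the standard Dantzig Selector that the abstract above builds on can be written as follows, together with the usual linear programming reformulation. The symbols (design matrix X, observations y, tolerance lambda, and the split of beta into nonnegative parts u and v) are conventional notation chosen here, not copied from the paper. The last line, which swaps the data matrix for a general denoising matrix M, is an assumed reading of the "general DS formulation" the abstract mentions.

```latex
% Standard Dantzig Selector
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{p}} \|\beta\|_{1}
\quad \text{subject to} \quad
\|X^{\top}(y - X\beta)\|_{\infty} \le \lambda

% Linear programming form, with \beta = u - v and u, v \ge 0
\min_{u, v \ge 0} \; \mathbf{1}^{\top}(u + v)
\quad \text{subject to} \quad
-\lambda \mathbf{1} \le X^{\top}\bigl(y - X(u - v)\bigr) \le \lambda \mathbf{1}

% Assumed general form with a denoising matrix M in place of X
\min_{\beta} \|\beta\|_{1}
\quad \text{subject to} \quad
\|M^{\top}(y - X\beta)\|_{\infty} \le \lambda
```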

Deep Neural Networks in High Frequency Trading


Title	Deep Neural Networks in High Frequency Trading
Authors	Prakhar Ganesh, Puneet Rakheja
Abstract	The ability to give precise and fast predictions of stock price movements is the key to profitability in High Frequency Trading. The main objective of this paper is to propose a novel way of modeling the high frequency trading problem with Deep Neural Networks at its heart and to argue why Deep Learning methods can have a lot of potential in the field of High Frequency Trading. The paper goes on to analyze the model’s performance based on its prediction accuracy as well as prediction speed across full-day trading simulations.
Tasks
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01506v2
PDF	http://arxiv.org/pdf/1809.01506v2.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-in-high-frequency
Repo
Framework

Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge


Title	Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge
Authors	Ziqi Zheng, Chenjie Cao, Xingwei Chen, Guoqiang Xu
Abstract	Continuous dimensional emotion, modelled by arousal and valence, can depict complex changes of emotions. In this paper, we present our work on arousal and valence prediction for the One-Minute-Gradual (OMG) Emotion Challenge. Multimodal representations are first extracted from videos using a variety of acoustic, video and textual models, and a support vector machine (SVM) is then used to fuse the multimodal signals and make final predictions. Our solution achieves Concordance Correlation Coefficient (CCC) scores of 0.397 and 0.520 on arousal and valence respectively for the validation dataset, which outperforms the baseline systems, with best CCC scores of 0.15 and 0.23 on arousal and valence, by a large margin.
Tasks	Emotion Recognition, Multimodal Emotion Recognition
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01060v1
PDF	http://arxiv.org/pdf/1805.01060v1.pdf
PWC	https://paperswithcode.com/paper/multimodal-emotion-recognition-for-one-minute
Repo
Framework
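
The CCC metric reported in the abstract above can be computed directly from its definition. The sketch below uses the standard formula, 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2), with population (divide by n) statistics. The function name and the sample values in the usage note are illustrative assumptions, not challenge data.

```python
def concordance_correlation(x, y):
    """Concordance correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if both sequences are constant and equal.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)
```

Unlike Pearson correlation, CCC penalizes a shift between predictions and labels: `concordance_correlation([1, 2, 3], [2, 3, 4])` is 4/7 rather than 1, even though the two sequences are perfectly correlated.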