Paper Group AWR 394
Multiple-image encryption and hiding with an optical diffractive neural network
Title | Multiple-image encryption and hiding with an optical diffractive neural network |
Authors | Yang Gao, Shuming Jiao, Juncheng Fang, Ting Lei, Zhenwei Xie, Xiaocong Yuan |
Abstract | A cascaded phase-only mask architecture (or an optical diffractive neural network) can be employed for different optical information processing tasks such as pattern recognition, orbital angular momentum (OAM) mode conversion, image salience detection and image encryption. However, for optical encryption and watermarking applications, such a system usually cannot process multiple pairs of input images and output images simultaneously. In our proposed scheme, multiple input images can be simultaneously fed to an optical diffractive neural network (DNN) system and each corresponding output image will be displayed in a non-overlapping sub-region in the output imaging plane. Each input image undergoes a different optical transform in an independent channel within the same system. The multiple cascaded phase masks in the system can be effectively optimized by a wavefront matching algorithm. Similar to recent optical pattern recognition and mode conversion works, the orthogonality property is employed to design a multiplexed DNN. |
Tasks | |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.07985v2 |
PDF | https://arxiv.org/pdf/1902.07985v2.pdf |
PWC | https://paperswithcode.com/paper/a-parallel-optical-image-security-system-with |
Repo | https://github.com/szgy66/code |
Framework | none |
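
Optically, the cascaded phase-mask system described above is a sequence of free-space propagations interleaved with phase-only modulations. Below is a minimal numpy sketch of that forward model using angular spectrum propagation; the wavelength, propagation distance, and pixel pitch are placeholder values, and the authors' wavefront matching optimization of the masks is not shown.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dz, dx):
    """Propagate a complex field by distance dz using the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    # evanescent components are suppressed by clipping the argument at zero
    arg = np.maximum(0.0, 1.0 / wavelength**2 - FX**2 - FY**2)
    H = np.exp(1j * 2 * np.pi * dz * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def forward_dnn(input_field, phase_masks, wavelength=532e-9, dz=0.05, dx=8e-6):
    """Forward pass of a cascaded phase-only-mask system: propagate, modulate, repeat."""
    field = input_field.astype(complex)
    for phase in phase_masks:               # each mask is a real-valued phase pattern
        field = angular_spectrum_propagate(field, wavelength, dz, dx)
        field = field * np.exp(1j * phase)  # phase-only modulation, no amplitude change
    field = angular_spectrum_propagate(field, wavelength, dz, dx)
    return np.abs(field) ** 2               # intensity at the output plane

# usage: three random 256x256 phase masks acting on a simple square input image
masks = [np.random.uniform(0, 2 * np.pi, (256, 256)) for _ in range(3)]
img = np.zeros((256, 256)); img[96:160, 96:160] = 1.0
out = forward_dnn(img, masks)
```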
Handling Syntactic Divergence in Low-resource Machine Translation
Title | Handling Syntactic Divergence in Low-resource Machine Translation |
Authors | Chunting Zhou, Xuezhe Ma, Junjie Hu, Graham Neubig |
Abstract | Despite impressive empirical successes of neural machine translation (NMT) on standard benchmarks, limited parallel data impedes the application of NMT models to many language pairs. Data augmentation methods such as back-translation make it possible to use monolingual data to help alleviate these issues, but back-translation itself fails in extreme low-resource scenarios, especially for syntactically divergent languages. In this paper, we propose a simple yet effective solution, whereby target-language sentences are re-ordered to match the order of the source and used as an additional source of training-time supervision. Experiments with simulated low-resource Japanese-to-English, and real low-resource Uyghur-to-English scenarios find significant improvements over other semi-supervised alternatives. |
Tasks | Data Augmentation, Machine Translation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.00040v1 |
PDF | https://arxiv.org/pdf/1909.00040v1.pdf |
PWC | https://paperswithcode.com/paper/handling-syntactic-divergence-in-low-resource |
Repo | https://github.com/violet-zct/pytorch-reorder-nmt |
Framework | pytorch |
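
The core data-augmentation idea, rewriting the target sentence in source-language order and using it as extra training-time supervision, can be illustrated with a small helper that permutes target tokens by their aligned source positions. This is only a sketch: the function name, alignment format, and tie-breaking convention are illustrative and not taken from the paper.

```python
def reorder_target(tgt_tokens, alignment):
    """Reorder target tokens to follow source word order, given a word alignment.

    alignment: list of (src_idx, tgt_idx) pairs; tokens aligned to earlier source
    positions come first, and unaligned tokens keep their original relative order
    (one simple convention; the paper derives its reordering differently).
    """
    first_src = {}
    for s, t in alignment:
        first_src[t] = min(s, first_src.get(t, s))
    # sort by aligned source position, falling back to the token's own position
    order = sorted(range(len(tgt_tokens)),
                   key=lambda t: (first_src.get(t, t), t))
    return [tgt_tokens[t] for t in order]

# usage: a toy English target reordered to match a verb-final source sentence
print(reorder_target(["she", "reads", "books"], [(0, 0), (2, 1), (1, 2)]))
# -> ['she', 'books', 'reads']
```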
How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions
Title | How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions |
Authors | Zewei Chu, Mingda Chen, Jing Chen, Miaosen Wang, Kevin Gimpel, Manaal Faruqui, Xiance Si |
Abstract | We present a large-scale dataset for the task of rewriting an ill-formed natural language question to a well-formed one. Our multi-domain question rewriting (MQR) dataset is constructed from human-contributed Stack Exchange question edit histories. The dataset contains 427,719 question pairs which come from 303 domains. We provide human annotations for a subset of the dataset as a quality estimate. When moving from ill-formed to well-formed questions, the question quality improves by an average of 45 points across three aspects. We train sequence-to-sequence neural models on the constructed dataset and obtain an improvement of 13.2% in BLEU-4 over baseline methods built from other data resources. We release the MQR dataset to encourage research on the problem of question rewriting. |
Tasks | |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09247v1 |
PDF | https://arxiv.org/pdf/1911.09247v1.pdf |
PWC | https://paperswithcode.com/paper/how-to-ask-better-questions-a-large-scale |
Repo | https://github.com/ZeweiChu/MQR |
Framework | none |
FewRel 2.0: Towards More Challenging Few-Shot Relation Classification
Title | FewRel 2.0: Towards More Challenging Few-Shot Relation Classification |
Authors | Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou |
Abstract | We present FewRel 2.0, a more challenging task to investigate two aspects of few-shot relation classification models: (1) Can they adapt to a new domain with only a handful of instances? (2) Can they detect none-of-the-above (NOTA) relations? To construct FewRel 2.0, we build upon the FewRel dataset (Han et al., 2018) by adding a new test set in a quite different domain, and a NOTA relation choice. With the new dataset and extensive experimental analysis, we found (1) that the state-of-the-art few-shot relation classification models struggle on these two aspects, and (2) that the commonly-used techniques for domain adaptation and NOTA detection still cannot handle the two challenges well. Our research calls for more attention and further efforts to these two real-world issues. All details and resources about the dataset and baselines are released at https://github.com/thunlp/fewrel. |
Tasks | Domain Adaptation, Few-Shot Relation Classification, Relation Classification |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07124v1 |
PDF | https://arxiv.org/pdf/1910.07124v1.pdf |
PWC | https://paperswithcode.com/paper/fewrel-20-towards-more-challenging-few-shot |
Repo | https://github.com/thunlp/fewrel |
Framework | pytorch |
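
FewRel 2.0 evaluates models on N-way K-shot episodes in which the query may belong to none of the N support relations. The sketch below shows one plausible way to sample such an episode from a relation-keyed dataset; the function name, the 50% NOTA rate, and the data layout are illustrative assumptions, not the benchmark's official sampler.

```python
import random

def sample_episode(data, n_way=5, k_shot=1, nota_rate=0.5, seed=None):
    """Sample one N-way K-shot episode with an optional none-of-the-above (NOTA) query.

    data: dict mapping relation name -> list of instances. With probability nota_rate
    the query is drawn from a relation outside the N support classes, and its gold
    answer is the extra "NOTA" class.
    """
    rng = random.Random(seed)
    relations = rng.sample(sorted(data), n_way)
    support = {r: rng.sample(data[r], k_shot) for r in relations}
    if rng.random() < nota_rate:
        outside = [r for r in data if r not in relations]
        query = rng.choice(data[rng.choice(outside)])
        label = "NOTA"
    else:
        label = rng.choice(relations)
        query = rng.choice(data[label])
    return support, query, label

# usage with a toy relation inventory
toy = {f"rel_{i}": [f"sentence_{i}_{j}" for j in range(10)] for i in range(10)}
support, query, label = sample_episode(toy, seed=0)
```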
Online Learning and Matching for Resource Allocation Problems
Title | Online Learning and Matching for Resource Allocation Problems |
Authors | Andrea Boskovic, Qinyi Chen, Dominik Kufel, Zijie Zhou |
Abstract | In order for an e-commerce platform to maximize its revenue, it must recommend customers items they are most likely to purchase. However, the company often has business constraints on these items, such as the number of each item in stock. In this work, our goal is to recommend items to users as they arrive on a webpage sequentially, in an online manner, in order to maximize reward for a company, but also satisfy budget constraints. We first approach the simpler online problem in which the customers arrive as a stationary Poisson process, and present an integrated algorithm that performs online optimization and online learning together. We then make the model more complicated but more realistic, treating the arrival processes as non-stationary Poisson processes. To deal with heterogeneous customer arrivals, we propose a time segmentation algorithm that converts a non-stationary problem into a series of stationary problems. Experiments conducted on large-scale synthetic data demonstrate the effectiveness and efficiency of our proposed approaches on solving constrained resource allocation problems. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07409v1 |
PDF | https://arxiv.org/pdf/1911.07409v1.pdf |
PWC | https://paperswithcode.com/paper/online-learning-and-matching-for-resource |
Repo | https://github.com/Dom98/integratedonlinestationaryalgorithm |
Framework | none |
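
To make the constrained allocation setting concrete, here is a small simulation of recommending items to sequentially arriving customers while respecting per-item budgets. It uses a simple inventory-balancing score, not the paper's integrated online learning and optimization algorithm or its time-segmentation step, and all names and numbers are illustrative.

```python
import numpy as np

def online_allocate(arrivals, rewards, budgets):
    """Greedy budget-aware recommendation (an inventory-balancing heuristic).

    arrivals: iterable of purchase-probability vectors, one per arriving customer.
    rewards:  per-item reward, shape (n_items,).
    budgets:  per-item stock, shape (n_items,).
    """
    remaining = np.asarray(budgets, dtype=float).copy()
    total_reward, choices = 0.0, []
    for probs in arrivals:
        # expected reward, discounted by how depleted each item's stock is
        score = np.asarray(probs) * rewards * (remaining / np.maximum(budgets, 1e-9))
        score[remaining <= 0] = -np.inf          # sold-out items are not recommendable
        j = int(np.argmax(score))
        choices.append(j)
        if remaining[j] > 0:
            remaining[j] -= 1
            total_reward += probs[j] * rewards[j]
    return choices, total_reward

# usage: 100 simulated customers, 3 items with different rewards and stock levels
rng = np.random.default_rng(0)
arrivals = rng.uniform(size=(100, 3))
choices, reward = online_allocate(arrivals, rewards=np.array([1.0, 2.0, 3.0]),
                                  budgets=np.array([40, 30, 10]))
```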
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Title | Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting |
Authors | Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister |
Abstract | Multi-horizon forecasting problems often contain a complex mix of inputs – including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically – without any prior information on how they interact with the target. While several deep learning models have been proposed for multi-step prediction, they typically comprise black-box models which do not account for the full range of inputs present in common scenarios. In this paper, we introduce the Temporal Fusion Transformer (TFT) – a novel attention-based architecture which combines high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. To learn temporal relationships at different scales, the TFT utilizes recurrent layers for local processing and interpretable self-attention layers for learning long-term dependencies. The TFT also uses specialized components for the judicious selection of relevant features and a series of gating layers to suppress unnecessary components, enabling high performance in a wide range of regimes. On a variety of real-world datasets, we demonstrate significant performance improvements over existing benchmarks, and showcase three practical interpretability use-cases of TFT. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09363v1 |
PDF | https://arxiv.org/pdf/1912.09363v1.pdf |
PWC | https://paperswithcode.com/paper/temporal-fusion-transformers-for |
Repo | https://github.com/mattsherar/Temporal_Fusion_Transform |
Framework | pytorch |
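
One of the TFT components mentioned above is the gating layer used to suppress unneeded parts of the network. Below is a hedged PyTorch sketch of a gated residual block in that spirit (ELU transform, GLU gate, residual connection, layer norm); the layer sizes and details are assumptions and may differ from the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualNetwork(nn.Module):
    """Sketch of a TFT-style gated residual block: a nonlinear transform whose output
    passes through a GLU gate before being added back to the input and layer-normalized,
    so the network can effectively switch the block off when it is not needed."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.gate = nn.Linear(d_model, 2 * d_model)   # GLU halves the width back to d_model
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.fc2(F.elu(self.fc1(x)))
        h = F.glu(self.gate(h), dim=-1)               # gating can zero the branch out
        return self.norm(x + h)                       # residual connection

# usage: gate a batch of 64 encoder states of width 32
grn = GatedResidualNetwork(d_model=32, d_hidden=64)
out = grn(torch.randn(64, 32))
```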
General $E(2)$-Equivariant Steerable CNNs
Title | General $E(2)$-Equivariant Steerable CNNs |
Authors | Maurice Weiler, Gabriele Cesa |
Abstract | The big empirical success of group equivariant networks has led in recent years to the sprouting of a great variety of equivariant network architectures. A particular focus has thereby been on rotation and reflection equivariant CNNs for planar images. Here we give a general description of $E(2)$-equivariant convolutions in the framework of Steerable CNNs. The theory of Steerable CNNs thereby yields constraints on the convolution kernels which depend on group representations describing the transformation laws of feature spaces. We show that these constraints for arbitrary group representations can be reduced to constraints under irreducible representations. A general solution of the kernel space constraint is given for arbitrary representations of the Euclidean group $E(2)$ and its subgroups. We implement a wide range of previously proposed and entirely new equivariant network architectures and extensively compare their performances. $E(2)$-steerable convolutions are further shown to yield remarkable gains on CIFAR-10, CIFAR-100 and STL-10 when used as a drop-in replacement for non-equivariant convolutions. |
Tasks | Image Classification |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08251v1 |
PDF | https://arxiv.org/pdf/1911.08251v1.pdf |
PWC | https://paperswithcode.com/paper/general-e2-equivariant-steerable-cnns-1 |
Repo | https://github.com/QUVA-Lab/e2cnn |
Framework | pytorch |
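
The released e2cnn library exposes these steerable convolutions as near drop-in replacements for ordinary conv layers. The sketch below shows typical usage as the API is commonly documented (group space, field types, R2Conv); exact names and signatures may differ across library versions, so treat it as an assumption to check against the repo.

```python
import torch
from e2cnn import gspaces, nn as enn

# Feature fields transforming under the 8-fold rotation group C8 acting on the plane.
r2_act = gspaces.Rot2dOnR2(N=8)
in_type = enn.FieldType(r2_act, 3 * [r2_act.trivial_repr])     # RGB input: 3 scalar fields
out_type = enn.FieldType(r2_act, 16 * [r2_act.regular_repr])   # 16 regular feature fields

# Drop-in replacement for nn.Conv2d: the kernel is constrained to be steerable.
conv = enn.R2Conv(in_type, out_type, kernel_size=3, padding=1)
relu = enn.ReLU(out_type)

x = enn.GeometricTensor(torch.randn(1, 3, 32, 32), in_type)
y = relu(conv(x))          # y.tensor has shape (1, 16 * 8, 32, 32)
```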
Stacking Models for Nearly Optimal Link Prediction in Complex Networks
Title | Stacking Models for Nearly Optimal Link Prediction in Complex Networks |
Authors | Amir Ghasemian, Homa Hosseinmardi, Aram Galstyan, Edoardo M. Airoldi, Aaron Clauset |
Abstract | Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speedup the collection of network data and improve the validity of network models. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 548 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity via meta-learning to construct a series of “stacked” models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are also superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state-of-the-art for link prediction comes from combining individual algorithms, which achieves nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvement of these results. |
Tasks | Link Prediction, Meta-Learning |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07578v1 |
PDF | https://arxiv.org/pdf/1909.07578v1.pdf |
PWC | https://paperswithcode.com/paper/stacking-models-for-nearly-optimal-link |
Repo | https://github.com/Aghasemian/OptimalLinkPrediction |
Framework | none |
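
The stacking idea can be illustrated on a toy graph: compute scores from a few classical link predictors for candidate node pairs, then train a meta-learner on those scores. The sketch below uses networkx predictors and a scikit-learn random forest; the specific predictors, the karate-club example, and the train/test setup are illustrative simplifications, not the 203-predictor pipeline evaluated in the paper.

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def predictor_features(G, pairs):
    """Scores from a few classical link predictors, used as meta-features for stacking."""
    feats = []
    for u, v in pairs:
        cn = len(list(nx.common_neighbors(G, u, v)))        # common-neighbor count
        aa = next(nx.adamic_adar_index(G, [(u, v)]))[2]     # Adamic-Adar score
        pa = next(nx.preferential_attachment(G, [(u, v)]))[2]
        feats.append([cn, aa, pa])
    return np.array(feats)

# Toy setup: hide some true edges, sample an equal number of non-edges,
# and train a random forest to combine the individual predictor scores.
rng = np.random.default_rng(0)
G_full = nx.karate_club_graph()
edges = list(G_full.edges())
hidden = [edges[i] for i in rng.choice(len(edges), size=15, replace=False)]
G_obs = G_full.copy()
G_obs.remove_edges_from(hidden)

non_edges = list(nx.non_edges(G_full))
negatives = [non_edges[i] for i in rng.choice(len(non_edges), size=len(hidden), replace=False)]

X = predictor_features(G_obs, hidden + negatives)
y = np.array([1] * len(hidden) + [0] * len(negatives))
stacked = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```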
RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking
Title | RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking |
Authors | Aayush K. Chaudhary, Rakshit Kothari, Manoj Acharya, Shusil Dangi, Nitinraj Nair, Reynold Bailey, Christopher Kanan, Gabriel Diaz, Jeff B. Pelz |
Abstract | Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time. Here, we present the RITnet model, which is a deep neural network that combines U-Net and DenseNet. RITnet is under 1 MB and achieves 95.3% accuracy on the 2019 OpenEDS Semantic Segmentation challenge. Using a GeForce GTX 1080 Ti, RITnet tracks at $>$ 300Hz, enabling real-time gaze tracking applications. Pre-trained models and source code are available at https://bitbucket.org/eye-ush/ritnet/. |
Tasks | Eye Tracking, Gaze Estimation, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00694v1 |
PDF | https://arxiv.org/pdf/1910.00694v1.pdf |
PWC | https://paperswithcode.com/paper/ritnet-real-time-semantic-segmentation-of-the |
Repo | https://github.com/AayushKrChaudhary/RITnet |
Framework | pytorch |
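
RITnet's compactness comes partly from DenseNet-style connectivity inside a U-Net layout. A hedged sketch of one dense block is below; the channel counts, growth rate, and depth are illustrative assumptions rather than the published architecture.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """DenseNet-style block of the kind RITnet combines with a U-Net layout: every layer
    receives the concatenation of all previous feature maps, which keeps the parameter
    count small for a given effective depth."""
    def __init__(self, in_ch: int, growth: int = 8, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)      # all feature maps are passed downstream

# usage: a 640x400 grayscale eye image, as in OpenEDS, through one dense block
block = DenseBlock(in_ch=1)
out = block(torch.randn(1, 1, 400, 640))       # -> (1, 33, 400, 640)
```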
PI-Net: A Deep Learning Approach to Extract Topological Persistence Images
Title | PI-Net: A Deep Learning Approach to Extract Topological Persistence Images |
Authors | Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga |
Abstract | Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. Key bottlenecks to their large scale adoption are computational expenditure and difficulty in incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We propose a simple convolutional neural network architecture called PI-Net that allows us to learn mappings between the input data and PIs. We design two separate architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PI-Net respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed method on two applications: human activity recognition using accelerometer sensor data and image classification. We demonstrate the ease of fusing PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net. |
Tasks | Activity Recognition, Human Activity Recognition, Image Classification, Time Series |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01769v1 |
PDF | https://arxiv.org/pdf/1906.01769v1.pdf |
PWC | https://paperswithcode.com/paper/pi-net-a-deep-learning-approach-to-extract |
Repo | https://github.com/anirudhsom/PI-Net |
Framework | tf |
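
The Signal PI-Net variant maps a multivariate time-series window straight to a persistence image with an ordinary 1-D CNN, which is what makes the approach differentiable and fast. Below is a minimal sketch of such a network; the layer sizes, the 50x50 PI resolution, and the sigmoid output are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class SignalPINet(nn.Module):
    """Sketch of a 1-D CNN that maps a multivariate time-series window directly to a
    flattened persistence image (here a 50x50 grid)."""
    def __init__(self, in_channels: int = 3, pi_size: int = 50):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                  # collapse the time axis
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, pi_size * pi_size),
            nn.Sigmoid(),                             # PI pixel intensities in [0, 1]
        )

    def forward(self, x):                             # x: (batch, channels, time)
        return self.head(self.encoder(x))

# usage: accelerometer windows of 3 channels x 200 samples -> 2500-d persistence images
model = SignalPINet()
pis = model(torch.randn(8, 3, 200))                   # (8, 2500)
```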
Training Temporal Word Embeddings with a Compass
Title | Training Temporal Word Embeddings with a Compass |
Authors | Valerio Di Carlo, Federico Bianchi, Matteo Palmonari |
Abstract | Temporal word embeddings have been proposed to support the analysis of word meaning shifts during time and to study the evolution of languages. Different approaches have been proposed to generate vector representations of words that embed their meaning during a specific time interval. However, the training process used in these approaches is complex, may be inefficient or it may require large text corpora. As a consequence, these approaches may be difficult to apply in resource-scarce domains or by scientists with limited in-depth knowledge of embedding models. In this paper, we propose a new heuristic to train temporal word embeddings based on the Word2vec model. The heuristic consists in using atemporal vectors as a reference, i.e., as a compass, when training the representations specific to a given time interval. The use of the compass simplifies the training process and makes it more efficient. Experiments conducted using state-of-the-art datasets and methodologies suggest that our approach outperforms or equals comparable approaches while being more robust in terms of the required corpus size. |
Tasks | Word Embeddings |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02376v1 |
PDF | https://arxiv.org/pdf/1906.02376v1.pdf |
PWC | https://paperswithcode.com/paper/training-temporal-word-embeddings-with-a |
Repo | https://github.com/goggoloid/diachronic-linguistic-analysis |
Framework | none |
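
The compass heuristic can be summarized as: train one atemporal set of context vectors on the whole corpus, then train each time slice's target vectors against that frozen matrix so all slices live in the same coordinate system. The sketch below implements that idea with plain skip-gram negative sampling in numpy; it is a didactic approximation, not the authors' Word2vec-based implementation, and all names and hyperparameters are illustrative.

```python
import numpy as np

def train_slice(pairs, compass_ctx, vocab_size, dim=100, lr=0.025, neg=5, epochs=5, seed=0):
    """Train slice-specific target vectors against a frozen 'compass' context matrix.

    pairs: list of (center_word_id, context_word_id) from one time slice.
    compass_ctx: (vocab_size, dim) context matrix trained once on the whole corpus
                 and kept fixed here, so slice embeddings share one coordinate system.
    """
    rng = np.random.default_rng(seed)
    W = (rng.random((vocab_size, dim)) - 0.5) / dim    # slice-specific target vectors
    for _ in range(epochs):
        for center, context in pairs:
            # one positive and `neg` randomly drawn negative context words
            targets = [context] + list(rng.integers(0, vocab_size, neg))
            labels = [1.0] + [0.0] * neg
            for t, label in zip(targets, labels):
                score = 1.0 / (1.0 + np.exp(-W[center] @ compass_ctx[t]))
                W[center] += lr * (label - score) * compass_ctx[t]  # only slice vectors move
                # compass_ctx stays frozen by design
    return W

# usage: 20-word toy vocabulary, compass matrix trained elsewhere, a handful of slice pairs
compass = np.random.default_rng(1).normal(size=(20, 100)) * 0.1
slice_vectors = train_slice([(1, 2), (2, 3), (3, 1)] * 50, compass, vocab_size=20)
```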
Eye Semantic Segmentation with a Lightweight Model
Title | Eye Semantic Segmentation with a Lightweight Model |
Authors | Van Thong Huynh, Soo-Hyung Kim, Guee-Sang Lee, Hyung-Jeong Yang |
Abstract | In this paper, we present a multi-class eye segmentation method that can run under hardware limitations for real-time inference. Our approach includes three major stages: obtaining a grayscale image from the input, segmenting three distinct eye regions with a deep network, and removing incorrect areas with heuristic filters. Our model is based on an encoder-decoder structure whose key component is the depthwise convolution operation, used to reduce the computation cost. We experiment on OpenEDS, a large-scale dataset of eye images captured by a head-mounted display with two synchronized eye-facing cameras. We achieved a mean intersection over union (mIoU) of 94.85% with a model of size 0.4 megabytes. The source code is available at https://github.com/th2l/Eye_VR_Segmentation |
Tasks | Semantic Segmentation |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01049v1 |
PDF | https://arxiv.org/pdf/1911.01049v1.pdf |
PWC | https://paperswithcode.com/paper/eye-semantic-segmentation-with-a-lightweight |
Repo | https://github.com/th2l/Eye_VR_Segmentation |
Framework | pytorch |
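
The computational saving the abstract attributes to depthwise convolutions is easy to see in code: a depthwise pass (one filter per channel) plus a 1x1 pointwise pass needs roughly an order of magnitude fewer parameters than a plain convolution of the same shape. The block below is a generic hedged sketch, not the authors' exact encoder-decoder.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution, the cost-saving
    building block referred to in the abstract."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return torch.relu(self.bn(self.pointwise(self.depthwise(x))))

# parameter comparison against a plain 3x3 convolution with the same channel counts
plain = nn.Conv2d(64, 128, 3, padding=1, bias=False)
sep = DepthwiseSeparableConv(64, 128)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(plain), count(sep))   # 73728 vs roughly 9000
```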
Image-Conditioned Graph Generation for Road Network Extraction
Title | Image-Conditioned Graph Generation for Road Network Extraction |
Authors | Davide Belli, Thomas Kipf |
Abstract | Deep generative models for graphs have shown great promise in the area of drug design, but have so far found little application beyond generating graph-structured molecules. In this work, we demonstrate a proof of concept for the challenging task of road network extraction from image data. This task can be framed as image-conditioned graph generation, for which we develop the Generative Graph Transformer (GGT), a deep autoregressive model that makes use of attention mechanisms for image conditioning and the recurrent generation of graphs. We benchmark GGT on the application of road network extraction from semantic segmentation data. For this, we introduce the Toulouse Road Network dataset, based on real-world publicly-available data. We further propose the StreetMover distance: a metric based on the Sinkhorn distance for effectively evaluating the quality of road network generation. The code and dataset are publicly available. |
Tasks | Graph Generation, Semantic Segmentation |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14388v1 |
PDF | https://arxiv.org/pdf/1910.14388v1.pdf |
PWC | https://paperswithcode.com/paper/image-conditioned-graph-generation-for-road |
Repo | https://github.com/davide-belli/generative-graph-transformer |
Framework | pytorch |
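
The StreetMover metric proposed in the paper compares road networks as point clouds via the Sinkhorn distance. Below is a hedged numpy sketch of an entropy-regularized optimal transport cost between two equally sized point clouds; the regularization value, uniform weights, and fixed iteration count are simplifications of the actual metric.

```python
import numpy as np

def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport cost between two equal-size point clouds,
    in the spirit of StreetMover (which compares point clouds sampled from the predicted
    and ground-truth road graphs). Smaller eps is sharper but needs log-domain stabilization."""
    n = len(x)
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** 2
    K = np.exp(-cost / eps)
    a = b = np.full(n, 1.0 / n)                 # uniform weights on both clouds
    u = np.ones(n)
    for _ in range(n_iters):                    # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    transport = u[:, None] * K * v[None, :]
    return float((transport * cost).sum())

# usage: point clouds standing in for samples along the edges of two road graphs
rng = np.random.default_rng(0)
pred = rng.uniform(size=(128, 2))
gt = pred + rng.normal(scale=0.02, size=(128, 2))
print(sinkhorn_distance(pred, gt))
```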
LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images
Title | LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images |
Authors | Brian H. Wang, Wei-Lun Chao, Yan Wang, Bharath Hariharan, Kilian Q. Weinberger, Mark Campbell |
Abstract | Object segmentation in three-dimensional (3-D) point clouds is a critical task for robots capable of 3-D perception. Despite the impressive performance of deep learning-based approaches on object segmentation in 2-D images, deep learning has not been applied nearly as successfully for 3-D point cloud segmentation. Deep networks generally require large amounts of labeled training data, which are readily available for 2-D images but are difficult to produce for 3-D point clouds. In this letter, we present Label Diffusion Lidar Segmentation (LDLS), a novel approach for 3-D point cloud segmentation, which leverages 2-D segmentation of an RGB image from an aligned camera to avoid the need for training on annotated 3-D data. We obtain 2-D segmentation predictions by applying Mask-RCNN to the RGB image, and then link this image to a 3-D lidar point cloud by building a graph of connections among 3-D points and 2-D pixels. This graph then directs a semi-supervised label diffusion process, where the 2-D pixels act as source nodes that diffuse object label information through the 3-D point cloud, resulting in a complete 3-D point cloud segmentation. We conduct empirical studies on the KITTI benchmark dataset and on a mobile robot, demonstrating wide applicability and superior performance of LDLS compared with the previous state of the art in 3-D point cloud segmentation, without any need for either 3-D training data or fine tuning of the 2-D image segmentation model. |
Tasks | Semantic Segmentation |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13955v1 |
PDF | https://arxiv.org/pdf/1910.13955v1.pdf |
PWC | https://paperswithcode.com/paper/ldls-3-d-object-segmentation-through-label |
Repo | https://github.com/brian-h-wang/LDLS |
Framework | none |
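
The diffusion step at the heart of LDLS is standard semi-supervised label propagation: pixel nodes labeled by Mask-RCNN act as clamped sources, and label mass flows over the pixel-to-point graph until the lidar points are labeled. A minimal dense numpy sketch is below; the affinity matrix, iteration count, and clamping scheme are illustrative, and the real method operates on sparse matrices over much larger graphs.

```python
import numpy as np

def diffuse_labels(W, seed_labels, n_classes, n_iters=50):
    """Semi-supervised label diffusion on a graph.

    W: dense nonnegative affinity matrix over all nodes (2-D pixels and 3-D points).
    seed_labels[i]: a class id for source nodes and -1 for unlabeled points.
    """
    n = W.shape[0]
    L = np.zeros((n, n_classes))
    seeds = seed_labels >= 0
    L[seeds, seed_labels[seeds]] = 1.0
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)              # row-normalized transition matrix
    for _ in range(n_iters):
        L = P @ L                               # propagate label mass along edges
        L[seeds] = 0.0
        L[seeds, seed_labels[seeds]] = 1.0      # clamp source nodes to their labels
    return L.argmax(axis=1)                     # hard label per node

# usage: a 5-node toy graph where nodes 0 and 3 are labeled pixels, the rest lidar points
W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
labels = diffuse_labels(W, np.array([1, -1, -1, 0, -1]), n_classes=2)
```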
Hierarchical Control for Bipedal Locomotion using Central Pattern Generators and Neural Networks
Title | Hierarchical Control for Bipedal Locomotion using Central Pattern Generators and Neural Networks |
Authors | Sayantan Auddy, Sven Magg, Stefan Wermter |
Abstract | The complexity of bipedal locomotion may be attributed to the difficulty in synchronizing joint movements while at the same time achieving high-level objectives such as walking in a particular direction. Artificial central pattern generators (CPGs) can produce synchronized joint movements and have been used in the past for bipedal locomotion. However, most existing CPG-based approaches do not address the problem of high-level control explicitly. We propose a novel hierarchical control mechanism for bipedal locomotion where an optimized CPG network is used for joint control and a neural network acts as a high-level controller for modulating the CPG network. By separating motion generation from motion modulation, the high-level controller does not need to control individual joints directly but instead can develop to achieve a higher goal using a low-dimensional control signal. The feasibility of the hierarchical controller is demonstrated through simulation experiments using the Neuro-Inspired Companion (NICO) robot. Experimental results demonstrate the controller’s ability to function even without the availability of an exact robot model. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00732v1 |
PDF | https://arxiv.org/pdf/1909.00732v1.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-control-for-bipedal-locomotion |
Repo | https://github.com/sayantanauddy/hierarchical_bipedal_controller |
Framework | tf |
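
To make the division of labor concrete: the CPG layer produces rhythmic joint trajectories, and the high-level controller only nudges a few scalar parameters (here a single amplitude modulation). The sketch below uses simple coupled phase oscillators; the coupling form, gains, and the single modulation signal are illustrative assumptions, not the optimized CPG network built for the NICO robot.

```python
import numpy as np

def cpg_joint_trajectories(n_joints=6, duration=5.0, dt=0.01,
                           freq=1.0, amp=0.3, phase_offsets=None, modulation=1.0):
    """Coupled-phase-oscillator CPG sketch: each joint follows a sinusoid whose phase is
    pulled toward a fixed lag behind its neighbour, and the scalar `modulation` signal
    (standing in for the high-level neural controller) scales the output amplitude."""
    if phase_offsets is None:
        phase_offsets = np.linspace(0, np.pi, n_joints)   # desired phase lags
    steps = int(duration / dt)
    phases = np.zeros(n_joints)
    angles = np.zeros((steps, n_joints))
    k = 2.0                                               # coupling strength
    for t in range(steps):
        for i in range(n_joints):
            coupling = 0.0
            if i > 0:   # pull phase toward the desired lag behind the previous joint
                lag = phase_offsets[i] - phase_offsets[i - 1]
                coupling = k * np.sin(phases[i - 1] + lag - phases[i])
            phases[i] += dt * (2 * np.pi * freq + coupling)
        angles[t] = modulation * amp * np.sin(phases)     # joint angle commands
    return angles

# usage: the "high-level controller" halves the stride by shrinking the amplitude
traj = cpg_joint_trajectories(modulation=0.5)
```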