Paper Group AWR 394
Multiple-image encryption and hiding with an optical diffractive neural network
Title | Multiple-image encryption and hiding with an optical diffractive neural network |
Authors | Yang Gao, Shuming Jiao, Juncheng Fang, Ting Lei, Zhenwei Xie, Xiaocong Yuan |
Abstract | A cascaded phase-only mask architecture (or an optical diffractive neural network) can be employed for different optical information processing tasks such as pattern recognition, orbital angular momentum (OAM) mode conversion, image salience detection and image encryption. However, for optical encryption and watermarking applications, such a system usually cannot process multiple pairs of input images and output images simultaneously. In our proposed scheme, multiple input images can be simultaneously fed to an optical diffractive neural network (DNN) system and each corresponding output image will be displayed in a non-overlapping sub-region in the output imaging plane. Each input image undergoes a different optical transform in an independent channel within the same system. The multiple cascaded phase masks in the system can be effectively optimized by a wavefront matching algorithm. Similar to recent optical pattern recognition and mode conversion works, the orthogonality property is employed to design a multiplexed DNN. |
Tasks | |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.07985v2 |
PDF | https://arxiv.org/pdf/1902.07985v2.pdf |
PWC | https://paperswithcode.com/paper/a-parallel-optical-image-security-system-with |
Repo | https://github.com/szgy66/code |
Framework | none |
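
Optically, the cascaded phase-mask system described above is a sequence of free-space propagations interleaved with phase-only modulations. Below is a minimal numpy sketch of that forward model using angular spectrum propagation; the wavelength, propagation distance, and pixel pitch are placeholder values, and the authors' wavefront matching optimization of the masks is not shown.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, dz, dx):
    """Propagate a complex field by distance dz using the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    # evanescent components are suppressed by clipping the argument at zero
    arg = np.maximum(0.0, 1.0 / wavelength**2 - FX**2 - FY**2)
    H = np.exp(1j * 2 * np.pi * dz * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(field) * H)

def forward_dnn(input_field, phase_masks, wavelength=532e-9, dz=0.05, dx=8e-6):
    """Forward pass of a cascaded phase-only-mask system: propagate, modulate, repeat."""
    field = input_field.astype(complex)
    for phase in phase_masks:               # each mask is a real-valued phase pattern
        field = angular_spectrum_propagate(field, wavelength, dz, dx)
        field = field * np.exp(1j * phase)  # phase-only modulation, no amplitude change
    field = angular_spectrum_propagate(field, wavelength, dz, dx)
    return np.abs(field) ** 2               # intensity at the output plane

# usage: three random 256x256 phase masks acting on a simple square input image
masks = [np.random.uniform(0, 2 * np.pi, (256, 256)) for _ in range(3)]
img = np.zeros((256, 256)); img[96:160, 96:160] = 1.0
out = forward_dnn(img, masks)
```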
Handling Syntactic Divergence in Low-resource Machine Translation
Title | Handling Syntactic Divergence in Low-resource Machine Translation |
Authors | Chunting Zhou, Xuezhe Ma, Junjie Hu, Graham Neubig |
Abstract | Despite impressive empirical successes of neural machine translation (NMT) on standard benchmarks, limited parallel data impedes the application of NMT models to many language pairs. Data augmentation methods such as back-translation make it possible to use monolingual data to help alleviate these issues, but back-translation itself fails in extreme low-resource scenarios, especially for syntactically divergent languages. In this paper, we propose a simple yet effective solution, whereby target-language sentences are re-ordered to match the order of the source and used as an additional source of training-time supervision. Experiments with simulated low-resource Japanese-to-English, and real low-resource Uyghur-to-English scenarios find significant improvements over other semi-supervised alternatives. |
Tasks | Data Augmentation, Machine Translation |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.00040v1 |
PDF | https://arxiv.org/pdf/1909.00040v1.pdf |
PWC | https://paperswithcode.com/paper/handling-syntactic-divergence-in-low-resource |
Repo | https://github.com/violet-zct/pytorch-reorder-nmt |
Framework | pytorch |
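
The core data-augmentation idea, rewriting the target sentence in source-language order and using it as extra training-time supervision, can be illustrated with a small helper that permutes target tokens by their aligned source positions. This is only a sketch: the function name, alignment format, and tie-breaking convention are illustrative and not taken from the paper.

```python
def reorder_target(tgt_tokens, alignment):
    """Reorder target tokens to follow source word order, given a word alignment.

    alignment: list of (src_idx, tgt_idx) pairs; tokens aligned to earlier source
    positions come first, and unaligned tokens keep their original relative order
    (one simple convention; the paper derives its reordering differently).
    """
    first_src = {}
    for s, t in alignment:
        first_src[t] = min(s, first_src.get(t, s))
    # sort by aligned source position, falling back to the token's own position
    order = sorted(range(len(tgt_tokens)),
                   key=lambda t: (first_src.get(t, t), t))
    return [tgt_tokens[t] for t in order]

# usage: a toy English target reordered to match a verb-final source sentence
print(reorder_target(["she", "reads", "books"], [(0, 0), (2, 1), (1, 2)]))
# -> ['she', 'books', 'reads']
```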
How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions
Title | How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions |
Authors | Zewei Chu, Mingda Chen, Jing Chen, Miaosen Wang, Kevin Gimpel, Manaal Faruqui, Xiance Si |
Abstract | We present a large-scale dataset for the task of rewriting an ill-formed natural language question to a well-formed one. Our multi-domain question rewriting (MQR) dataset is constructed from human-contributed Stack Exchange question edit histories. The dataset contains 427,719 question pairs which come from 303 domains. We provide human annotations for a subset of the dataset as a quality estimate. When moving from ill-formed to well-formed questions, the question quality improves by an average of 45 points across three aspects. We train sequence-to-sequence neural models on the constructed dataset and obtain an improvement of 13.2% in BLEU-4 over baseline methods built from other data resources. We release the MQR dataset to encourage research on the problem of question rewriting. |
Tasks | |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09247v1 |
PDF | https://arxiv.org/pdf/1911.09247v1.pdf |
PWC | https://paperswithcode.com/paper/how-to-ask-better-questions-a-large-scale |
Repo | https://github.com/ZeweiChu/MQR |
Framework | none |
FewRel 2.0: Towards More Challenging Few-Shot Relation Classification
Title | FewRel 2.0: Towards More Challenging Few-Shot Relation Classification |
Authors | Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou |
Abstract | We present FewRel 2.0, a more challenging task to investigate two aspects of few-shot relation classification models: (1) Can they adapt to a new domain with only a handful of instances? (2) Can they detect none-of-the-above (NOTA) relations? To construct FewRel 2.0, we build upon the FewRel dataset (Han et al., 2018) by adding a new test set in a quite different domain, and a NOTA relation choice. With the new dataset and extensive experimental analysis, we found (1) that the state-of-the-art few-shot relation classification models struggle on these two aspects, and (2) that the commonly-used techniques for domain adaptation and NOTA detection still cannot handle the two challenges well. Our research calls for more attention and further efforts to these two real-world issues. All details and resources about the dataset and baselines are released at https://github.com/thunlp/fewrel. |
Tasks | Domain Adaptation, Few-Shot Relation Classification, Relation Classification |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07124v1 |
PDF | https://arxiv.org/pdf/1910.07124v1.pdf |
PWC | https://paperswithcode.com/paper/fewrel-20-towards-more-challenging-few-shot |
Repo | https://github.com/thunlp/fewrel |
Framework | pytorch |
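
FewRel 2.0 evaluates models on N-way K-shot episodes in which the query may belong to none of the N support relations. The sketch below shows one plausible way to sample such an episode from a relation-keyed dataset; the function name, the 50% NOTA rate, and the data layout are illustrative assumptions, not the benchmark's official sampler.

```python
import random

def sample_episode(data, n_way=5, k_shot=1, nota_rate=0.5, seed=None):
    """Sample one N-way K-shot episode with an optional none-of-the-above (NOTA) query.

    data: dict mapping relation name -> list of instances. With probability nota_rate
    the query is drawn from a relation outside the N support classes, and its gold
    answer is the extra "NOTA" class.
    """
    rng = random.Random(seed)
    relations = rng.sample(sorted(data), n_way)
    support = {r: rng.sample(data[r], k_shot) for r in relations}
    if rng.random() < nota_rate:
        outside = [r for r in data if r not in relations]
        query = rng.choice(data[rng.choice(outside)])
        label = "NOTA"
    else:
        label = rng.choice(relations)
        query = rng.choice(data[label])
    return support, query, label

# usage with a toy relation inventory
toy = {f"rel_{i}": [f"sentence_{i}_{j}" for j in range(10)] for i in range(10)}
support, query, label = sample_episode(toy, seed=0)
```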
Online Learning and Matching for Resource Allocation Problems
Title | Online Learning and Matching for Resource Allocation Problems |
Authors | Andrea Boskovic, Qinyi Chen, Dominik Kufel, Zijie Zhou |
Abstract | In order for an e-commerce platform to maximize its revenue, it must recommend customers items they are most likely to purchase. However, the company often has business constraints on these items, such as the number of each item in stock. In this work, our goal is to recommend items to users as they arrive on a webpage sequentially, in an online manner, in order to maximize reward for a company, but also satisfy budget constraints. We first approach the simpler online problem in which the customers arrive as a stationary Poisson process, and present an integrated algorithm that performs online optimization and online learning together. We then make the model more complicated but more realistic, treating the arrival processes as non-stationary Poisson processes. To deal with heterogeneous customer arrivals, we propose a time segmentation algorithm that converts a non-stationary problem into a series of stationary problems. Experiments conducted on large-scale synthetic data demonstrate the effectiveness and efficiency of our proposed approaches on solving constrained resource allocation problems. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07409v1 |
PDF | https://arxiv.org/pdf/1911.07409v1.pdf |
PWC | https://paperswithcode.com/paper/online-learning-and-matching-for-resource |
Repo | https://github.com/Dom98/integratedonlinestationaryalgorithm |
Framework | none |
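
To make the constrained allocation setting concrete, here is a small simulation of recommending items to sequentially arriving customers while respecting per-item budgets. It uses a simple inventory-balancing score, not the paper's integrated online learning and optimization algorithm or its time-segmentation step, and all names and numbers are illustrative.

```python
import numpy as np

def online_allocate(arrivals, rewards, budgets):
    """Greedy budget-aware recommendation (an inventory-balancing heuristic).

    arrivals: iterable of purchase-probability vectors, one per arriving customer.
    rewards:  per-item reward, shape (n_items,).
    budgets:  per-item stock, shape (n_items,).
    """
    remaining = np.asarray(budgets, dtype=float).copy()
    total_reward, choices = 0.0, []
    for probs in arrivals:
        # expected reward, discounted by how depleted each item's stock is
        score = np.asarray(probs) * rewards * (remaining / np.maximum(budgets, 1e-9))
        score[remaining <= 0] = -np.inf          # sold-out items are not recommendable
        j = int(np.argmax(score))
        choices.append(j)
        if remaining[j] > 0:
            remaining[j] -= 1
            total_reward += probs[j] * rewards[j]
    return choices, total_reward

# usage: 100 simulated customers, 3 items with different rewards and stock levels
rng = np.random.default_rng(0)
arrivals = rng.uniform(size=(100, 3))
choices, reward = online_allocate(arrivals, rewards=np.array([1.0, 2.0, 3.0]),
                                  budgets=np.array([40, 30, 10]))
```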
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Title | Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting |
Authors | Bryan Lim, Sercan O. Arik, Nicolas Loeff, Tomas Pfister |
Abstract | Multi-horizon forecasting problems often contain a complex mix of inputs – including static (i.e. time-invariant) covariates, known future inputs, and other exogenous time series that are only observed historically – without any prior information on how they interact with the target. While several deep learning models have been proposed for multi-step prediction, they typically comprise black-box models which do not account for the full range of inputs present in common scenarios. In this paper, we introduce the Temporal Fusion Transformer (TFT) – a novel attention-based architecture which combines high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. To learn temporal relationships at different scales, the TFT utilizes recurrent layers for local processing and interpretable self-attention layers for learning long-term dependencies. The TFT also uses specialized components for the judicious selection of relevant features and a series of gating layers to suppress unnecessary components, enabling high performance in a wide range of regimes. On a variety of real-world datasets, we demonstrate significant performance improvements over existing benchmarks, and showcase three practical interpretability use-cases of TFT. |
Tasks | Time Series, Time Series Forecasting |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09363v1 |
PDF | https://arxiv.org/pdf/1912.09363v1.pdf |
PWC | https://paperswithcode.com/paper/temporal-fusion-transformers-for |
Repo | https://github.com/mattsherar/Temporal_Fusion_Transform |
Framework | pytorch |
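
One of the TFT components mentioned above is the gating layer used to suppress unneeded parts of the network. Below is a hedged PyTorch sketch of a gated residual block in that spirit (ELU transform, GLU gate, residual connection, layer norm); the layer sizes and details are assumptions and may differ from the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedResidualNetwork(nn.Module):
    """Sketch of a TFT-style gated residual block: a nonlinear transform whose output
    passes through a GLU gate before being added back to the input and layer-normalized,
    so the network can effectively switch the block off when it is not needed."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.fc1 = nn.Linear(d_model, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_model)
        self.gate = nn.Linear(d_model, 2 * d_model)   # GLU halves the width back to d_model
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.fc2(F.elu(self.fc1(x)))
        h = F.glu(self.gate(h), dim=-1)               # gating can zero the branch out
        return self.norm(x + h)                       # residual connection

# usage: gate a batch of 64 encoder states of width 32
grn = GatedResidualNetwork(d_model=32, d_hidden=64)
out = grn(torch.randn(64, 32))
```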
General $E(2)$-Equivariant Steerable CNNs
Title | General $E(2)$-Equivariant Steerable CNNs |
Authors | Maurice Weiler, Gabriele Cesa |
Abstract | The big empirical success of group equivariant networks has led in recent years to the sprouting of a great variety of equivariant network architectures. A particular focus has thereby been on rotation and reflection equivariant CNNs for planar images. Here we give a general description of $E(2)$-equivariant convolutions in the framework of Steerable CNNs. The theory of Steerable CNNs thereby yields constraints on the convolution kernels which depend on group representations describing the transformation laws of feature spaces. We show that these constraints for arbitrary group representations can be reduced to constraints under irreducible representations. A general solution of the kernel space constraint is given for arbitrary representations of the Euclidean group $E(2)$ and its subgroups. We implement a wide range of previously proposed and entirely new equivariant network architectures and extensively compare their performances. $E(2)$-steerable convolutions are further shown to yield remarkable gains on CIFAR-10, CIFAR-100 and STL-10 when used as a drop-in replacement for non-equivariant convolutions. |
Tasks | Image Classification |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08251v1 |
PDF | https://arxiv.org/pdf/1911.08251v1.pdf |
PWC | https://paperswithcode.com/paper/general-e2-equivariant-steerable-cnns-1 |
Repo | https://github.com/QUVA-Lab/e2cnn |
Framework | pytorch |
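
The released e2cnn library exposes these steerable convolutions as near drop-in replacements for ordinary conv layers. The sketch below shows typical usage as the API is commonly documented (group space, field types, R2Conv); exact names and signatures may differ across library versions, so treat it as an assumption to check against the repo.

```python
import torch
from e2cnn import gspaces, nn as enn

# Feature fields transforming under the 8-fold rotation group C8 acting on the plane.
r2_act = gspaces.Rot2dOnR2(N=8)
in_type = enn.FieldType(r2_act, 3 * [r2_act.trivial_repr])     # RGB input: 3 scalar fields
out_type = enn.FieldType(r2_act, 16 * [r2_act.regular_repr])   # 16 regular feature fields

# Drop-in replacement for nn.Conv2d: the kernel is constrained to be steerable.
conv = enn.R2Conv(in_type, out_type, kernel_size=3, padding=1)
relu = enn.ReLU(out_type)

x = enn.GeometricTensor(torch.randn(1, 3, 32, 32), in_type)
y = relu(conv(x))          # y.tensor has shape (1, 16 * 8, 32, 32)
```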
Stacking Models for Nearly Optimal Link Prediction in Complex Networks
Title | Stacking Models for Nearly Optimal Link Prediction in Complex Networks |
Authors | Amir Ghasemian, Homa Hosseinmardi, Aram Galstyan, Edoardo M. Airoldi, Aaron Clauset |
Abstract | Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speedup the collection of network data and improve the validity of network models. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 548 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity via meta-learning to construct a series of “stacked” models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are also superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state-of-the-art for link prediction comes from combining individual algorithms, which achieves nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvement of these results. |
Tasks | Link Prediction, Meta-Learning |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07578v1 |
PDF | https://arxiv.org/pdf/1909.07578v1.pdf |
PWC | https://paperswithcode.com/paper/stacking-models-for-nearly-optimal-link |
Repo | https://github.com/Aghasemian/OptimalLinkPrediction |
Framework | none |
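
The stacking idea can be illustrated on a toy graph: compute scores from a few classical link predictors for candidate node pairs, then train a meta-learner on those scores. The sketch below uses networkx predictors and a scikit-learn random forest; the specific predictors, the karate-club example, and the train/test setup are illustrative simplifications, not the 203-predictor pipeline evaluated in the paper.

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def predictor_features(G, pairs):
    """Scores from a few classical link predictors, used as meta-features for stacking."""
    feats = []
    for u, v in pairs:
        cn = len(list(nx.common_neighbors(G, u, v)))        # common-neighbor count
        aa = next(nx.adamic_adar_index(G, [(u, v)]))[2]     # Adamic-Adar score
        pa = next(nx.preferential_attachment(G, [(u, v)]))[2]
        feats.append([cn, aa, pa])
    return np.array(feats)

# Toy setup: hide some true edges, sample an equal number of non-edges,
# and train a random forest to combine the individual predictor scores.
rng = np.random.default_rng(0)
G_full = nx.karate_club_graph()
edges = list(G_full.edges())
hidden = [edges[i] for i in rng.choice(len(edges), size=15, replace=False)]
G_obs = G_full.copy()
G_obs.remove_edges_from(hidden)

non_edges = list(nx.non_edges(G_full))
negatives = [non_edges[i] for i in rng.choice(len(non_edges), size=len(hidden), replace=False)]

X = predictor_features(G_obs, hidden + negatives)
y = np.array([1] * len(hidden) + [0] * len(negatives))
stacked = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```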
RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking
Title | RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking |
Authors | Aayush K. Chaudhary, Rakshit Kothari, Manoj Acharya, Shusil Dangi, Nitinraj Nair, Reynold Bailey, Christopher Kanan, Gabriel Diaz, Jeff B. Pelz |
Abstract | Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time. Here, we present the RITnet model, which is a deep neural network that combines U-Net and DenseNet. RITnet is under 1 MB and achieves 95.3% accuracy on the 2019 OpenEDS Semantic Segmentation challenge. Using a GeForce GTX 1080 Ti, RITnet tracks at $>$ 300Hz, enabling real-time gaze tracking applications. Pre-trained models and source code are available at https://bitbucket.org/eye-ush/ritnet/. |
Tasks | Eye Tracking, Gaze Estimation, Real-Time Semantic Segmentation, Semantic Segmentation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00694v1 |
PDF | https://arxiv.org/pdf/1910.00694v1.pdf |
PWC | https://paperswithcode.com/paper/ritnet-real-time-semantic-segmentation-of-the |
Repo | https://github.com/AayushKrChaudhary/RITnet |
Framework | pytorch |
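
RITnet's compactness comes partly from DenseNet-style connectivity inside a U-Net layout. A hedged sketch of one dense block is below; the channel counts, growth rate, and depth are illustrative assumptions rather than the published architecture.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """DenseNet-style block of the kind RITnet combines with a U-Net layout: every layer
    receives the concatenation of all previous feature maps, which keeps the parameter
    count small for a given effective depth."""
    def __init__(self, in_ch: int, growth: int = 8, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)      # all feature maps are passed downstream

# usage: a 640x400 grayscale eye image, as in OpenEDS, through one dense block
block = DenseBlock(in_ch=1)
out = block(torch.randn(1, 1, 400, 640))       # -> (1, 33, 400, 640)
```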
PI-Net: A Deep Learning Approach to Extract Topological Persistence Images
Title | PI-Net: A Deep Learning Approach to Extract Topological Persistence Images |
Authors | Anirudh Som, Hongjun Choi, Karthikeyan Natesan Ramamurthy, Matthew Buman, Pavan Turaga |
Abstract | Topological features such as persistence diagrams and their functional approximations like persistence images (PIs) have been showing substantial promise for machine learning and computer vision applications. Key bottlenecks to their large scale adoption are computational expenditure and difficulty in incorporating them in a differentiable architecture. We take an important step in this paper to mitigate these bottlenecks by proposing a novel one-step approach to generate PIs directly from the input data. We propose a simple convolutional neural network architecture called PI-Net that allows us to learn mappings between the input data and PIs. We design two separate architectures, one designed to take in multi-variate time series signals as input and another that accepts multi-channel images as input. We call these networks Signal PI-Net and Image PI-Net respectively. To the best of our knowledge, we are the first to propose the use of deep learning for computing topological features directly from data. We explore the use of the proposed method on two applications: human activity recognition using accelerometer sensor data and image classification. We demonstrate the ease of fusing PIs in supervised deep learning architectures and speed up of several orders of magnitude for extracting PIs from data. Our code is available at https://github.com/anirudhsom/PI-Net. |
Tasks | Activity Recognition, Human Activity Recognition, Image Classification, Time Series |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01769v1 |
PDF | https://arxiv.org/pdf/1906.01769v1.pdf |
PWC | https://paperswithcode.com/paper/pi-net-a-deep-learning-approach-to-extract |
Repo | https://github.com/anirudhsom/PI-Net |
Framework | tf |
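
The Signal PI-Net variant maps a multivariate time-series window straight to a persistence image with an ordinary 1-D CNN, which is what makes the approach differentiable and fast. Below is a minimal sketch of such a network; the layer sizes, the 50x50 PI resolution, and the sigmoid output are assumptions for illustration, not the published architecture.

```python
import torch
import torch.nn as nn

class SignalPINet(nn.Module):
    """Sketch of a 1-D CNN that maps a multivariate time-series window directly to a
    flattened persistence image (here a 50x50 grid)."""
    def __init__(self, in_channels: int = 3, pi_size: int = 50):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                  # collapse the time axis
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64, pi_size * pi_size),
            nn.Sigmoid(),                             # PI pixel intensities in [0, 1]
        )

    def forward(self, x):                             # x: (batch, channels, time)
        return self.head(self.encoder(x))

# usage: accelerometer windows of 3 channels x 200 samples -> 2500-d persistence images
model = SignalPINet()
pis = model(torch.randn(8, 3, 200))                   # (8, 2500)
```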
Training Temporal Word Embeddings with a Compass
Title | Training Temporal Word Embeddings with a Compass |
Authors | Valerio Di Carlo, Federico Bianchi, Matteo Palmonari |
Abstract | Temporal word embeddings have been proposed to support the analysis of word meaning shifts during time and to study the evolution of languages. Different approaches have been proposed to generate vector representations of words that embed their meaning during a specific time interval. However, the training process used in these approaches is complex, may be inefficient or it may require large text corpora. As a consequence, these approaches may be difficult to apply in resource-scarce domains or by scientists with limited in-depth knowledge of embedding models. In this paper, we propose a new heuristic to train temporal word embeddings based on the Word2vec model. The heuristic consists in using atemporal vectors as a reference, i.e., as a compass, when training the representations specific to a given time interval. The use of the compass simplifies the training process and makes it more efficient. Experiments conducted using state-of-the-art datasets and methodologies suggest that our approach outperforms or equals comparable approaches while being more robust in terms of the required corpus size. |
Tasks | Word Embeddings |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02376v1 |
PDF | https://arxiv.org/pdf/1906.02376v1.pdf |
PWC | https://paperswithcode.com/paper/training-temporal-word-embeddings-with-a |
Repo | https://github.com/goggoloid/diachronic-linguistic-analysis |
Framework | none |
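
The compass heuristic can be summarized as: train one atemporal set of context vectors on the whole corpus, then train each time slice's target vectors against that frozen matrix so all slices live in the same coordinate system. The sketch below implements that idea with plain skip-gram negative sampling in numpy; it is a didactic approximation, not the authors' Word2vec-based implementation, and all names and hyperparameters are illustrative.

```python
import numpy as np

def train_slice(pairs, compass_ctx, vocab_size, dim=100, lr=0.025, neg=5, epochs=5, seed=0):
    """Train slice-specific target vectors against a frozen 'compass' context matrix.

    pairs: list of (center_word_id, context_word_id) from one time slice.
    compass_ctx: (vocab_size, dim) context matrix trained once on the whole corpus
                 and kept fixed here, so slice embeddings share one coordinate system.
    """
    rng = np.random.default_rng(seed)
    W = (rng.random((vocab_size, dim)) - 0.5) / dim    # slice-specific target vectors
    for _ in range(epochs):
        for center, context in pairs:
            # one positive and `neg` randomly drawn negative context words
            targets = [context] + list(rng.integers(0, vocab_size, neg))
            labels = [1.0] + [0.0] * neg
            for t, label in zip(targets, labels):
                score = 1.0 / (1.0 + np.exp(-W[center] @ compass_ctx[t]))
                W[center] += lr * (label - score) * compass_ctx[t]  # only slice vectors move
                # compass_ctx stays frozen by design
    return W

# usage: 20-word toy vocabulary, compass matrix trained elsewhere, a handful of slice pairs
compass = np.random.default_rng(1).normal(size=(20, 100)) * 0.1
slice_vectors = train_slice([(1, 2), (2, 3), (3, 1)] * 50, compass, vocab_size=20)
```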
Eye Semantic Segmentation with a Lightweight Model
Title | Eye Semantic Segmentation with a Lightweight Model |
Authors | Van Thong Huynh, Soo-Hyung Kim, Guee-Sang Lee, Hyung-Jeong Yang |
Abstract | In this paper, we present a multi-class eye segmentation method that can run under hardware limitations for real-time inference. Our approach includes three major stages: obtaining a grayscale image from the input, segmenting three distinct eye regions with a deep network, and removing incorrect areas with heuristic filters. Our model is based on an encoder-decoder structure whose key component is the depthwise convolution operation, used to reduce the computation cost. We experiment on OpenEDS, a large-scale dataset of eye images captured by a head-mounted display with two synchronized eye-facing cameras. We achieved a mean intersection over union (mIoU) of 94.85% with a model of size 0.4 megabytes. The source code is available at https://github.com/th2l/Eye_VR_Segmentation |
Tasks | Semantic Segmentation |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01049v1 |
PDF | https://arxiv.org/pdf/1911.01049v1.pdf |
PWC | https://paperswithcode.com/paper/eye-semantic-segmentation-with-a-lightweight |
Repo | https://github.com/th2l/Eye_VR_Segmentation |
Framework | pytorch |
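
The computational saving the abstract attributes to depthwise convolutions is easy to see in code: a depthwise pass (one filter per channel) plus a 1x1 pointwise pass needs roughly an order of magnitude fewer parameters than a plain convolution of the same shape. The block below is a generic hedged sketch, not the authors' exact encoder-decoder.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution, the cost-saving
    building block referred to in the abstract."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return torch.relu(self.bn(self.pointwise(self.depthwise(x))))

# parameter comparison against a plain 3x3 convolution with the same channel counts
plain = nn.Conv2d(64, 128, 3, padding=1, bias=False)
sep = DepthwiseSeparableConv(64, 128)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(plain), count(sep))   # 73728 vs roughly 9000
```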
Image-Conditioned Graph Generation for Road Network Extraction
Title | Image-Conditioned Graph Generation for Road Network Extraction |
Authors | Davide Belli, Thomas Kipf |
Abstract | Deep generative models for graphs have shown great promise in the area of drug design, but have so far found little application beyond generating graph-structured molecules. In this work, we demonstrate a proof of concept for the challenging task of road network extraction from image data. This task can be framed as image-conditioned graph generation, for which we develop the Generative Graph Transformer (GGT), a deep autoregressive model that makes use of attention mechanisms for image conditioning and the recurrent generation of graphs. We benchmark GGT on the application of road network extraction from semantic segmentation data. For this, we introduce the Toulouse Road Network dataset, based on real-world publicly-available data. We further propose the StreetMover distance: a metric based on the Sinkhorn distance for effectively evaluating the quality of road network generation. The code and dataset are publicly available. |
Tasks | Graph Generation, Semantic Segmentation |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14388v1 |
PDF | https://arxiv.org/pdf/1910.14388v1.pdf |
PWC | https://paperswithcode.com/paper/image-conditioned-graph-generation-for-road |
Repo | https://github.com/davide-belli/generative-graph-transformer |
Framework | pytorch |
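
The StreetMover metric proposed in the paper compares road networks as point clouds via the Sinkhorn distance. Below is a hedged numpy sketch of an entropy-regularized optimal transport cost between two equally sized point clouds; the regularization value, uniform weights, and fixed iteration count are simplifications of the actual metric.

```python
import numpy as np

def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport cost between two equal-size point clouds,
    in the spirit of StreetMover (which compares point clouds sampled from the predicted
    and ground-truth road graphs). Smaller eps is sharper but needs log-domain stabilization."""
    n = len(x)
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** 2
    K = np.exp(-cost / eps)
    a = b = np.full(n, 1.0 / n)                 # uniform weights on both clouds
    u = np.ones(n)
    for _ in range(n_iters):                    # Sinkhorn fixed-point iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    transport = u[:, None] * K * v[None, :]
    return float((transport * cost).sum())

# usage: point clouds standing in for samples along the edges of two road graphs
rng = np.random.default_rng(0)
pred = rng.uniform(size=(128, 2))
gt = pred + rng.normal(scale=0.02, size=(128, 2))
print(sinkhorn_distance(pred, gt))
```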
LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images
Title | LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images |
Authors | Brian H. Wang, Wei-Lun Chao, Yan Wang, Bharath Hariharan, Kilian Q. Weinberger, Mark Campbell |
Abstract | Object segmentation in three-dimensional (3-D) point clouds is a critical task for robots capable of 3-D perception. Despite the impressive performance of deep learning-based approaches on object segmentation in 2-D images, deep learning has not been applied nearly as successfully for 3-D point cloud segmentation. Deep networks generally require large amounts of labeled training data, which are readily available for 2-D images but are difficult to produce for 3-D point clouds. In this letter, we present Label Diffusion Lidar Segmentation (LDLS), a novel approach for 3-D point cloud segmentation, which leverages 2-D segmentation of an RGB image from an aligned camera to avoid the need for training on annotated 3-D data. We obtain 2-D segmentation predictions by applying Mask-RCNN to the RGB image, and then link this image to a 3-D lidar point cloud by building a graph of connections among 3-D points and 2-D pixels. This graph then directs a semi-supervised label diffusion process, where the 2-D pixels act as source nodes that diffuse object label information through the 3-D point cloud, resulting in a complete 3-D point cloud segmentation. We conduct empirical studies on the KITTI benchmark dataset and on a mobile robot, demonstrating wide applicability and superior performance of LDLS compared with the previous state of the art in 3-D point cloud segmentation, without any need for either 3-D training data or fine tuning of the 2-D image segmentation model. |
Tasks | Semantic Segmentation |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13955v1 |
PDF | https://arxiv.org/pdf/1910.13955v1.pdf |
PWC | https://paperswithcode.com/paper/ldls-3-d-object-segmentation-through-label |
Repo | https://github.com/brian-h-wang/LDLS |
Framework | none |
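
The diffusion step at the heart of LDLS is standard semi-supervised label propagation: pixel nodes labeled by Mask-RCNN act as clamped sources, and label mass flows over the pixel-to-point graph until the lidar points are labeled. A minimal dense numpy sketch is below; the affinity matrix, iteration count, and clamping scheme are illustrative, and the real method operates on sparse matrices over much larger graphs.

```python
import numpy as np

def diffuse_labels(W, seed_labels, n_classes, n_iters=50):
    """Semi-supervised label diffusion on a graph.

    W: dense nonnegative affinity matrix over all nodes (2-D pixels and 3-D points).
    seed_labels[i]: a class id for source nodes and -1 for unlabeled points.
    """
    n = W.shape[0]
    L = np.zeros((n, n_classes))
    seeds = seed_labels >= 0
    L[seeds, seed_labels[seeds]] = 1.0
    deg = W.sum(axis=1, keepdims=True)
    P = W / np.maximum(deg, 1e-12)              # row-normalized transition matrix
    for _ in range(n_iters):
        L = P @ L                               # propagate label mass along edges
        L[seeds] = 0.0
        L[seeds, seed_labels[seeds]] = 1.0      # clamp source nodes to their labels
    return L.argmax(axis=1)                     # hard label per node

# usage: a 5-node toy graph where nodes 0 and 3 are labeled pixels, the rest lidar points
W = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
labels = diffuse_labels(W, np.array([1, -1, -1, 0, -1]), n_classes=2)
```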
Hierarchical Control for Bipedal Locomotion using Central Pattern Generators and Neural Networks
Title | Hierarchical Control for Bipedal Locomotion using Central Pattern Generators and Neural Networks |
Authors | Sayantan Auddy, Sven Magg, Stefan Wermter |
Abstract | The complexity of bipedal locomotion may be attributed to the difficulty in synchronizing joint movements while at the same time achieving high-level objectives such as walking in a particular direction. Artificial central pattern generators (CPGs) can produce synchronized joint movements and have been used in the past for bipedal locomotion. However, most existing CPG-based approaches do not address the problem of high-level control explicitly. We propose a novel hierarchical control mechanism for bipedal locomotion where an optimized CPG network is used for joint control and a neural network acts as a high-level controller for modulating the CPG network. By separating motion generation from motion modulation, the high-level controller does not need to control individual joints directly but instead can develop to achieve a higher goal using a low-dimensional control signal. The feasibility of the hierarchical controller is demonstrated through simulation experiments using the Neuro-Inspired Companion (NICO) robot. Experimental results demonstrate the controller’s ability to function even without the availability of an exact robot model. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00732v1 |
PDF | https://arxiv.org/pdf/1909.00732v1.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-control-for-bipedal-locomotion |
Repo | https://github.com/sayantanauddy/hierarchical_bipedal_controller |
Framework | tf |
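
To make the division of labor concrete: the CPG layer produces rhythmic joint trajectories, and the high-level controller only nudges a few scalar parameters (here a single amplitude modulation). The sketch below uses simple coupled phase oscillators; the coupling form, gains, and the single modulation signal are illustrative assumptions, not the optimized CPG network built for the NICO robot.

```python
import numpy as np

def cpg_joint_trajectories(n_joints=6, duration=5.0, dt=0.01,
                           freq=1.0, amp=0.3, phase_offsets=None, modulation=1.0):
    """Coupled-phase-oscillator CPG sketch: each joint follows a sinusoid whose phase is
    pulled toward a fixed lag behind its neighbour, and the scalar `modulation` signal
    (standing in for the high-level neural controller) scales the output amplitude."""
    if phase_offsets is None:
        phase_offsets = np.linspace(0, np.pi, n_joints)   # desired phase lags
    steps = int(duration / dt)
    phases = np.zeros(n_joints)
    angles = np.zeros((steps, n_joints))
    k = 2.0                                               # coupling strength
    for t in range(steps):
        for i in range(n_joints):
            coupling = 0.0
            if i > 0:   # pull phase toward the desired lag behind the previous joint
                lag = phase_offsets[i] - phase_offsets[i - 1]
                coupling = k * np.sin(phases[i - 1] + lag - phases[i])
            phases[i] += dt * (2 * np.pi * freq + coupling)
        angles[t] = modulation * amp * np.sin(phases)     # joint angle commands
    return angles

# usage: the "high-level controller" halves the stride by shrinking the amplitude
traj = cpg_joint_trajectories(modulation=0.5)
```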