Paper Group AWR 66
Grammatical Error Correction in Low-Resource Scenarios
Title | Grammatical Error Correction in Low-Resource Scenarios |
Authors | Jakub Náplava, Milan Straka |
Abstract | Grammatical error correction in English is a long-studied problem with many existing systems and datasets. However, there has been only limited research on error correction for other languages. In this paper, we present AKCES-GEC, a new dataset for grammatical error correction in Czech. We then run experiments on Czech, German, and Russian and show that, when utilizing a synthetic parallel corpus, a Transformer neural machine translation model can reach new state-of-the-art results on these datasets. AKCES-GEC is published under the CC BY-NC-SA 4.0 license at https://hdl.handle.net/11234/1-3057 and the source code of the GEC model is available at https://github.com/ufal/low-resource-gec-wnut2019. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00353v3 |
PDF | https://arxiv.org/pdf/1910.00353v3.pdf |
PWC | https://paperswithcode.com/paper/grammatical-error-correction-in-low-resource |
Repo | https://github.com/ufal/low-resource-gec-wnut2019 |
Framework | none |
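The ingredient the abstract leans on is a synthetic parallel corpus for training the Transformer GEC model. The exact noising procedure is described in the paper and repo; the snippet below is only a minimal, hypothetical sketch of the general idea — corrupting clean sentences with random word and character edits to produce (noisy, clean) training pairs for a sequence-to-sequence model.

```python
import random

def corrupt(sentence, p_word=0.1, p_char=0.03, seed=None):
    """Hypothetical noising: randomly drop/swap words and perturb characters
    to turn a clean sentence into a synthetic 'ungrammatical' source."""
    rng = random.Random(seed)
    noisy = []
    for w in sentence.split():
        r = rng.random()
        if r < p_word / 2:
            continue                      # drop the word
        if r < p_word and noisy:
            noisy[-1], w = w, noisy[-1]   # swap with the previous word
        chars = list(w)
        for i in range(len(chars)):
            if rng.random() < p_char:
                chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
        noisy.append("".join(chars))
    return " ".join(noisy)

clean = "the quick brown fox jumps over the lazy dog"
pair = (corrupt(clean, seed=0), clean)    # (source, target) pair for seq2seq training
print(pair)
```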
An End-to-End Neighborhood-based Interaction Model for Knowledge-enhanced Recommendation
Title | An End-to-End Neighborhood-based Interaction Model for Knowledge-enhanced Recommendation |
Authors | Yanru Qu, Ting Bai, Weinan Zhang, Jianyun Nie, Jian Tang |
Abstract | This paper studies graph-based recommendation, where an interaction graph is constructed from historical records and is leveraged to alleviate data sparsity and cold-start problems. We reveal an early summarization problem in existing graph-based models and propose the Neighborhood Interaction (NI) model to capture each neighbor pair (between user-side and item-side) distinctively. The NI model is more expressive and can capture more complicated structural patterns behind user-item interactions. To further enrich node connectivity and utilize high-order structural information, we incorporate extra knowledge graphs (KGs) and adopt graph neural networks (GNNs) in NI, yielding Knowledge-enhanced Neighborhood Interaction (KNI). Compared with state-of-the-art recommendation methods, e.g., feature-based, meta-path-based, and KG-based models, KNI achieves superior performance in click-through rate prediction (1.1%-8.4% absolute AUC improvements) and outperforms them by a wide margin in top-N recommendation on 4 real-world datasets. |
Tasks | Click-Through Rate Prediction, Knowledge Graphs |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04032v2 |
PDF | https://arxiv.org/pdf/1908.04032v2.pdf |
PWC | https://paperswithcode.com/paper/an-end-to-end-neighborhood-based-interaction |
Repo | https://github.com/Atomu2014/KNI |
Framework | tf |
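The "early summarization" critique is that the user and item neighborhoods are each pooled into a single vector before they interact. Below is a hedged numpy sketch of the alternative the abstract describes — scoring every (user-neighbor, item-neighbor) pair and aggregating with attention. The actual KNI model (see the linked repo) adds knowledge-graph neighbors and GNN layers on top of this idea.

```python
import numpy as np

def neighborhood_interaction(U, V):
    """U: (m, d) embeddings of the user-side neighborhood,
    V: (n, d) embeddings of the item-side neighborhood.
    Returns an attention-weighted sum over all m*n neighbor-pair interactions."""
    scores = U @ V.T                                   # (m, n) pairwise interaction scores
    att = np.exp(scores - scores.max())
    att /= att.sum()                                   # softmax over all neighbor pairs
    pair_feats = U[:, None, :] * V[None, :, :]         # (m, n, d) element-wise pair features
    return (att[:, :, None] * pair_feats).sum(axis=(0, 1))   # (d,)

rng = np.random.default_rng(0)
user_nbrs, item_nbrs = rng.normal(size=(3, 8)), rng.normal(size=(4, 8))
print(neighborhood_interaction(user_nbrs, item_nbrs).shape)   # (8,)
```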
Click-Through Rate Prediction with the User Memory Network
Title | Click-Through Rate Prediction with the User Memory Network |
Authors | Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Li Li, Zhaojie Liu, Yanlong Du |
Abstract | Click-through rate (CTR) prediction is a critical task in online advertising systems. Models like Deep Neural Networks (DNNs) are simple but stateless: they consider each target ad independently and cannot directly exploit useful information contained in users’ historical ad impressions and clicks. In contrast, models like Recurrent Neural Networks (RNNs) are stateful but complex: they model temporal dependencies between users’ sequential behaviors and can achieve better prediction performance than DNNs, but both their offline training and online prediction are much more complex and time-consuming. In this paper, we propose the Memory Augmented DNN (MA-DNN) for practical CTR prediction services. In particular, we create two external memory vectors for each user, memorizing high-level abstractions of what a user possibly likes and dislikes. The proposed MA-DNN achieves a good compromise between DNN and RNN: it is as simple as a DNN, but has a certain ability to exploit useful information in users’ historical behaviors, as an RNN does. Both offline and online experiments demonstrate the effectiveness of MA-DNN for practical CTR prediction services. The memory component can also be added to other models (e.g., the Wide&Deep model). |
Tasks | Click-Through Rate Prediction |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04667v2 |
PDF | https://arxiv.org/pdf/1907.04667v2.pdf |
PWC | https://paperswithcode.com/paper/click-through-rate-prediction-with-the-user |
Repo | https://github.com/rener1199/deep_memory |
Framework | tf |
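The abstract's core idea is two per-user memory vectors that summarize liked and disliked ads and are fed to the DNN alongside the target-ad features. The update rule below is a hypothetical moving-average write, not necessarily the one used in the paper (which learns the memory jointly with the network).

```python
import numpy as np

class UserMemory:
    """Two external memory vectors per user: one for clicks (likes),
    one for non-clicks (dislikes)."""
    def __init__(self, dim, decay=0.9):
        self.like = np.zeros(dim)
        self.dislike = np.zeros(dim)
        self.decay = decay

    def write(self, ad_embedding, clicked):
        # hypothetical exponential-moving-average update of the memory
        if clicked:
            self.like = self.decay * self.like + (1 - self.decay) * ad_embedding
        else:
            self.dislike = self.decay * self.dislike + (1 - self.decay) * ad_embedding

    def read(self, target_ad_embedding):
        # the DNN would receive [target ad, like memory, dislike memory]
        return np.concatenate([target_ad_embedding, self.like, self.dislike])

mem = UserMemory(dim=4)
mem.write(np.ones(4), clicked=True)
print(mem.read(np.arange(4.0)).shape)   # (12,)
```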
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
Title | DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion |
Authors | Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese |
Abstract | A key technical challenge in performing 6D object pose estimation from RGB-D images is to fully leverage the two complementary data sources. Prior works either extract information from the RGB image and depth separately or use costly post-processing steps, limiting their performance in highly cluttered scenes and real-time applications. In this work, we present DenseFusion, a generic framework for estimating the 6D pose of a set of known objects from RGB-D images. DenseFusion is a heterogeneous architecture that processes the two data sources individually and uses a novel dense fusion network to extract pixel-wise dense feature embeddings, from which the pose is estimated. Furthermore, we integrate an end-to-end iterative pose refinement procedure that further improves the pose estimation while achieving near real-time inference. Our experiments show that our method outperforms state-of-the-art approaches on two datasets, YCB-Video and LineMOD. We also deploy our proposed method to a real robot to grasp and manipulate objects based on the estimated pose. |
Tasks | 6D Pose Estimation, 6D Pose Estimation using RGBD, Pose Estimation |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.04780v1 |
PDF | http://arxiv.org/pdf/1901.04780v1.pdf |
PWC | https://paperswithcode.com/paper/densefusion-6d-object-pose-estimation-by |
Repo | https://github.com/caoquan95/6D-pose-project |
Framework | pytorch |
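The abstract describes fusing per-pixel colour and geometry embeddings rather than concatenating two global vectors. Below is a minimal numpy sketch of that dense fusion step (per-pixel concatenation plus a pooled global context vector appended to every pixel), under the assumption of simple average pooling; the real network then regresses a pose per pixel and refines it iteratively.

```python
import numpy as np

def dense_fuse(rgb_feat, geo_feat):
    """rgb_feat, geo_feat: (N, d) per-pixel embeddings from the colour and
    depth/point-cloud branches for the same N pixels. Returns (N, 4d) fused features."""
    per_pixel = np.concatenate([rgb_feat, geo_feat], axis=1)       # (N, 2d) pixel-wise pairs
    global_ctx = per_pixel.mean(axis=0)                            # (2d,) simple global pooling
    global_ctx = np.tile(global_ctx, (per_pixel.shape[0], 1))      # appended to every pixel
    return np.concatenate([per_pixel, global_ctx], axis=1)         # (N, 4d)

rng = np.random.default_rng(0)
fused = dense_fuse(rng.normal(size=(500, 32)), rng.normal(size=(500, 32)))
print(fused.shape)   # (500, 128)
```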
Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations
Title | Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations |
Authors | Dmitry Kobak, George Linderman, Stefan Steinerberger, Yuval Kluger, Philipp Berens |
Abstract | T-distributed stochastic neighbour embedding (t-SNE) is a widely used data visualisation technique. It differs from its predecessor SNE by the low-dimensional similarity kernel: the Gaussian kernel was replaced by the heavy-tailed Cauchy kernel, solving the “crowding problem” of SNE. Here, we develop an efficient implementation of t-SNE for a $t$-distribution kernel with an arbitrary degree of freedom $\nu$, with $\nu\to\infty$ corresponding to SNE and $\nu=1$ corresponding to the standard t-SNE. Using theoretical analysis and toy examples, we show that $\nu<1$ can further reduce the crowding problem and reveal finer cluster structure that is invisible in standard t-SNE. We further demonstrate the striking effect of heavier-tailed kernels on large real-life data sets such as MNIST, single-cell RNA-sequencing data, and the HathiTrust library. We use domain knowledge to confirm that the revealed clusters are meaningful. Overall, we argue that modifying the tail heaviness of the t-SNE kernel can yield additional insight into the cluster structure of the data. |
Tasks | |
Published | 2019-02-15 |
URL | http://arxiv.org/abs/1902.05804v2 |
PDF | http://arxiv.org/pdf/1902.05804v2.pdf |
PWC | https://paperswithcode.com/paper/heavy-tailed-kernels-reveal-a-finer-cluster |
Repo | https://github.com/berenslab/finer-tsne |
Framework | none |
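The only moving part here is the low-dimensional similarity kernel. A standard parametrization of a Student-t kernel with $\nu$ degrees of freedom is sketched below (the paper's exact parametrization may differ slightly): $\nu\to\infty$ recovers the Gaussian kernel of SNE, $\nu=1$ recovers the Cauchy kernel of standard t-SNE, and $\nu<1$ gives the heavier tails that the abstract argues reveal finer cluster structure.

```python
import numpy as np

def t_kernel(d2, nu):
    """Unnormalized low-dimensional similarity for squared distance d2,
    using a Student-t kernel with nu degrees of freedom.
    nu -> inf approximates the Gaussian (SNE); nu = 1 is Cauchy (t-SNE)."""
    return (1.0 + d2 / nu) ** (-(nu + 1.0) / 2.0)

d2 = np.linspace(0, 25, 6)
for nu in (np.inf, 1.0, 0.5):
    k = np.exp(-d2 / 2) if np.isinf(nu) else t_kernel(d2, nu)
    print(nu, np.round(k, 4))   # heavier tails (smaller nu) decay more slowly
```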
A Novel Chaos Theory Inspired Neuronal Architecture
Title | A Novel Chaos Theory Inspired Neuronal Architecture |
Authors | Harikrishnan N B, Nithin Nagaraj |
Abstract | The practical success of widely used machine learning (ML) and deep learning (DL) algorithms in the Artificial Intelligence (AI) community owes to the availability of large datasets for training and huge computational resources. Despite the enormous practical success of AI, these algorithms are only loosely inspired by the biological brain and do not mimic any of the fundamental properties of neurons in the brain, one such property being the chaotic firing of biological neurons. This motivates us to develop a novel neuronal architecture in which the individual neurons are intrinsically chaotic in nature. By making use of the topological transitivity property of chaos, our neuronal network is able to perform classification tasks with very few training samples. For the MNIST dataset, with as little as $0.1\%$ of the total training data, our method outperforms ML and matches DL in classification accuracy for up to $7$ training samples/class. For the Iris dataset, our accuracy is comparable with ML algorithms, and even with just two training samples/class we report an accuracy as high as $95.8\%$. This work highlights the effectiveness of chaos and its properties for learning, and paves the way for chaos-inspired neuronal architectures that closely mimic the chaotic nature of neurons in the brain. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.12601v1 |
PDF | https://arxiv.org/pdf/1905.12601v1.pdf |
PWC | https://paperswithcode.com/paper/190512601 |
Repo | https://github.com/kadarakos/chaoticgls |
Framework | none |
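The abstract gives only the high-level idea (neurons that are individually chaotic, classification via topological transitivity), and the linked repo name suggests Generalized Luröth Series maps. The snippet below is merely an illustrative chaotic skew-tent map with a "firing time" feature (iterations needed for the trajectory to enter a small neighbourhood of the stimulus); it is not the paper's exact architecture.

```python
def skew_tent(x, b=0.45):
    """One iteration of a skew-tent map on [0, 1] -- a simple chaotic map."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)

def firing_time(stimulus, x0=0.23, eps=0.01, max_iter=10000):
    """Illustrative chaotic 'neuron': iterate the map from x0 until the
    trajectory lands within eps of the stimulus; topological transitivity
    makes this happen for almost every stimulus."""
    x, n = x0, 0
    while abs(x - stimulus) >= eps and n < max_iter:
        x = skew_tent(x)
        n += 1
    return n

features = [firing_time(s) for s in (0.1, 0.5, 0.9)]
print(features)   # firing times could then serve as features for a simple classifier
```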
CornerNet-Lite: Efficient Keypoint Based Object Detection
Title | CornerNet-Lite: Efficient Keypoint Based Object Detection |
Authors | Hei Law, Yun Teng, Olga Russakovsky, Jia Deng |
Abstract | Keypoint-based methods are a relatively new paradigm in object detection, eliminating the need for anchor boxes and offering a simplified detection framework. The keypoint-based CornerNet achieves state-of-the-art accuracy among single-stage detectors. However, this accuracy comes at a high processing cost. In this work, we tackle the problem of efficient keypoint-based object detection and introduce CornerNet-Lite. CornerNet-Lite is a combination of two efficient variants of CornerNet: CornerNet-Saccade, which uses an attention mechanism to eliminate the need for exhaustively processing all pixels of the image, and CornerNet-Squeeze, which introduces a new compact backbone architecture. Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency. CornerNet-Saccade is suitable for offline processing, improving the efficiency of CornerNet by 6.0x and the AP by 1.0% on COCO. CornerNet-Squeeze is suitable for real-time detection, improving both the efficiency and accuracy of the popular real-time detector YOLOv3 (34.4% AP at 34ms for CornerNet-Squeeze compared to 33.0% AP at 39ms for YOLOv3 on COCO). Together, these contributions reveal for the first time the potential of keypoint-based detection for applications requiring processing efficiency. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08900v1 |
PDF | http://arxiv.org/pdf/1904.08900v1.pdf |
PWC | https://paperswithcode.com/paper/190408900 |
Repo | https://github.com/tc-qaq/CornerNet_Lite |
Framework | none |
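CornerNet-Saccade's efficiency gain comes from predicting an attention map on a downscaled image and then running full detection only on the highest-scoring crops. The snippet below is a toy, framework-free sketch of that crop-selection step; the attention network, crop sizes, and the detector itself are left abstract and are not the paper's exact procedure.

```python
import numpy as np

def select_crops(attention_map, k=3, crop=64, scale=4):
    """attention_map: (H, W) scores predicted on a downscaled image.
    Returns the top-k crop boxes (x0, y0, x1, y1) in full-resolution coordinates."""
    H, W = attention_map.shape
    top = np.argsort(attention_map, axis=None)[::-1][:k]   # indices of the k highest scores
    boxes = []
    for idx in top:
        cy, cx = divmod(int(idx), W)
        cx, cy = cx * scale, cy * scale                    # map back to full resolution
        boxes.append((cx - crop // 2, cy - crop // 2, cx + crop // 2, cy + crop // 2))
    return boxes                                           # run the full detector only on these

att = np.random.default_rng(0).random((32, 32))
print(select_crops(att))   # a handful of crops instead of the whole image
```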
Self-Supervised 3D Keypoint Learning for Ego-motion Estimation
Title | Self-Supervised 3D Keypoint Learning for Ego-motion Estimation |
Authors | Jiexiong Tang, Rares Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien Gaidon |
Abstract | Generating reliable illumination and viewpoint invariant keypoints is critical for feature-based SLAM and SfM. State-of-the-art learning-based methods often rely on generating training samples by employing homography adaptation to create 2D synthetic views. While such approaches trivially solve data association between views, they cannot effectively learn from real illumination and non-planar 3D scenes. In this work, we propose a fully self-supervised approach towards learning depth-aware keypoints *purely* from unlabeled videos by incorporating a differentiable pose estimation module that jointly optimizes the keypoints and their depths in a Structure-from-Motion setting. We introduce 3D Multi-View Adaptation, a technique that exploits the temporal context in videos to self-supervise keypoint detection and matching in an end-to-end differentiable manner. Finally, we show how a fully self-supervised keypoint detection and description network can be trivially incorporated as a front-end into a state-of-the-art visual odometry framework that is robust and accurate. |
Tasks | Keypoint Detection, Motion Estimation, Pose Estimation, Visual Odometry |
Published | 2019-12-07 |
URL | https://arxiv.org/abs/1912.03426v1 |
PDF | https://arxiv.org/pdf/1912.03426v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-3d-keypoint-learning-for-ego |
Repo | https://github.com/TRI-ML/KP3D |
Framework | none |
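The differentiable pose module described in the abstract lifts matched 2D keypoints to 3D using predicted depth and solves for ego-motion. Below is a hedged sketch of the two standard building blocks such a pipeline typically relies on — back-projection through the pinhole model and closed-form rigid alignment (Kabsch / orthogonal Procrustes) — not the paper's exact end-to-end differentiable formulation.

```python
import numpy as np

def backproject(uv, depth, K):
    """Lift pixel coordinates uv (N, 2) with per-keypoint depth (N,) to 3D
    camera coordinates using intrinsics K (3, 3)."""
    ones = np.ones((uv.shape[0], 1))
    rays = np.linalg.inv(K) @ np.concatenate([uv, ones], axis=1).T   # (3, N)
    return (rays * depth).T                                          # (N, 3)

def kabsch(P, Q):
    """Rigid transform (R, t) minimizing ||R P + t - Q|| for matched 3D points."""
    cP, cQ = P.mean(0), Q.mean(0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])          # avoid reflections
    R = Vt.T @ S @ U.T
    return R, cQ - R @ cP

# toy checks: back-project the principal point, then recover a known motion
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
print(backproject(np.array([[320.0, 240.0]]), np.array([2.0]), K))   # [[0. 0. 2.]]
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
Q = P @ R_true.T + np.array([0.5, -0.2, 1.0])
R, t = kabsch(P, Q)
print(np.allclose(R, R_true), np.round(t, 3))
```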
RF-Net: An End-to-End Image Matching Network based on Receptive Field
Title | RF-Net: An End-to-End Image Matching Network based on Receptive Field |
Authors | Xuelun Shen, Cheng Wang, Xin Li, Zenglei Yu, Jonathan Li, Chenglu Wen, Ming Cheng, Zijian He |
Abstract | This paper proposes RF-Net, a new end-to-end trainable matching network based on receptive fields, to compute sparse correspondence between images. Building an end-to-end trainable matching framework is desirable and challenging. The very recent approach LF-Net successfully embeds the entire feature extraction pipeline into a jointly trainable pipeline and produces state-of-the-art matching results. This paper introduces two modifications to the structure of LF-Net. First, we propose to construct receptive feature maps, which lead to more effective keypoint detection. Second, we introduce a general loss function term, the neighbor mask, to facilitate training patch selection, which results in improved stability in descriptor training. We trained RF-Net on the open dataset HPatches and compared it with other methods on multiple benchmark datasets. Experiments show that RF-Net outperforms existing state-of-the-art methods. |
Tasks | Keypoint Detection |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00604v1 |
PDF | https://arxiv.org/pdf/1906.00604v1.pdf |
PWC | https://paperswithcode.com/paper/190600604 |
Repo | https://github.com/Xylon-Sean/rfnet |
Framework | pytorch |
Rapidly Adapting Moment Estimation
Title | Rapidly Adapting Moment Estimation |
Authors | Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn |
Abstract | Adaptive gradient methods such as Adam have been shown to be very effective for training deep neural networks (DNNs) by tracking the second moment of gradients to compute the individual learning rates. Unlike existing methods, we make use of the most recent first moment of gradients to compute the individual learning rates at each iteration. The motivation is that the dynamic variation of the first moment of gradients may provide useful information for setting the learning rates. We refer to the new method as rapidly adapting moment estimation (RAME). The theoretical convergence of deterministic RAME is studied using an analysis similar to the one used in [1] for Adam. Experimental results for training a number of DNNs show promising convergence speed and generalization performance for RAME compared to the stochastic heavy-ball (SHB) method, Adam, and RMSprop. |
Tasks | |
Published | 2019-02-24 |
URL | http://arxiv.org/abs/1902.09030v1 |
PDF | http://arxiv.org/pdf/1902.09030v1.pdf |
PWC | https://paperswithcode.com/paper/rapidly-adapting-moment-estimation |
Repo | https://github.com/guoqiang-zhang/RAME |
Framework | tf |
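For reference, the Adam baseline the abstract contrasts against divides each step by a running estimate of the second moment of the gradient, whereas RAME derives the per-coordinate learning rates from the most recent first moment (the exact RAME update rule is given in the paper, not reproduced here). The sketch below implements only the well-known Adam update as a point of comparison.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: first moment m, second moment v, step counter t (1-based)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# minimize f(x) = x^2 with Adam as a toy example
x, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
print(np.round(x, 4))   # close to 0
```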
GLAMpoints: Greedily Learned Accurate Match points
Title | GLAMpoints: Greedily Learned Accurate Match points |
Authors | Prune Truong, Stefanos Apostolopoulos, Agata Mosinska, Samuel Stucky, Carlos Ciller, Sandro De Zanet |
Abstract | We introduce GLAMpoints, a novel CNN-based feature point detector learned in a semi-supervised manner. Our detector extracts repeatable, stable interest points with dense coverage, specifically designed to maximize correct matching in a specific domain, in contrast to conventional techniques that optimize indirect metrics. In this paper, we apply our method to challenging retinal slit-lamp images, for which classical detectors yield unsatisfactory results due to low image quality and an insufficient number of low-level features. We show that GLAMpoints significantly outperforms classical detectors as well as state-of-the-art CNN-based methods in matching and registration quality for retinal images. Our method can also be extended to other domains, such as natural images. |
Tasks | Keypoint Detection |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06812v2 |
PDF | https://arxiv.org/pdf/1908.06812v2.pdf |
PWC | https://paperswithcode.com/paper/glampoints-greedily-learned-accurate-match |
Repo | https://github.com/DagnyT/GLAMpoints-PyTorch |
Framework | pytorch |
Expectation-Maximization Attention Networks for Semantic Segmentation
Title | Expectation-Maximization Attention Networks for Semantic Segmentation |
Authors | Xia Li, Zhisheng Zhong, Jianlong Wu, Yibo Yang, Zhouchen Lin, Hong Liu |
Abstract | The self-attention mechanism has been widely used for various tasks. It is designed to compute the representation of each position as a weighted sum of the features at all positions, and can thus capture long-range relations for computer vision tasks. However, it is computationally expensive, since the attention maps are computed with respect to all other positions. In this paper, we formulate the attention mechanism in an expectation-maximization manner and iteratively estimate a much more compact set of bases upon which the attention maps are computed. Through a weighted summation over these bases, the resulting representation is low-rank and discards noisy information from the input. The proposed Expectation-Maximization Attention (EMA) module is robust to input variance and is also memory- and computation-friendly. Moreover, we introduce bases maintenance and normalization methods to stabilize its training procedure. We conduct extensive experiments on popular semantic segmentation benchmarks including PASCAL VOC, PASCAL Context, and COCO Stuff, on which we set new records. |
Tasks | Semantic Segmentation |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13426v2 |
PDF | https://arxiv.org/pdf/1907.13426v2.pdf |
PWC | https://paperswithcode.com/paper/expectation-maximization-attention-networks |
Repo | https://github.com/XiaLiPKU/EMANet |
Framework | pytorch |
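The EMA module alternates an E-step (soft assignment of each position to a small set of bases) and an M-step (bases re-estimated as weighted means of the features), so attention is computed against K bases rather than all N positions. Below is a minimal numpy sketch of that iteration, assuming a plain softmax responsibility and mean update; the published module additionally normalizes the bases and maintains them with a moving average.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def em_attention(X, K=8, iters=3, seed=0):
    """X: (N, d) features of all positions. Returns a low-rank
    reconstruction of X built from K iteratively re-estimated bases."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]            # (K, d) initial bases
    for _ in range(iters):
        Z = softmax(X @ mu.T, axis=1)                       # E-step: (N, K) responsibilities
        mu = (Z.T @ X) / (Z.sum(axis=0)[:, None] + 1e-6)    # M-step: weighted means
    return Z @ mu                                           # (N, d) re-estimated features

X = np.random.default_rng(1).normal(size=(100, 16))
print(em_attention(X).shape)   # (100, 16); attention cost is O(N*K) instead of O(N^2)
```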
Unsupervised Data Augmentation for Consistency Training
Title | Unsupervised Data Augmentation for Consistency Training |
Authors | Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le |
Abstract | Semi-supervised learning has lately shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically that produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods, our method brings substantial improvements across six language and three vision tasks under the same consistency training framework. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error rate of 4.20, outperforming the state-of-the-art model trained on 25,000 labeled examples. On a standard semi-supervised learning benchmark, CIFAR-10, our method outperforms all previous approaches and achieves an error rate of 2.7% with only 4,000 examples, nearly matching the performance of models trained on 50,000 labeled examples. Our method also combines well with transfer learning, e.g., when fine-tuning from BERT, and yields improvements in high-data regimes such as ImageNet, both when only 10% of the data is labeled and when a full labeled set with 1.3M extra unlabeled examples is used. Code is available at https://github.com/google-research/uda. |
Tasks | Data Augmentation, Image Augmentation, Image Classification, Semi-Supervised Image Classification, Text Classification, Transfer Learning |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12848v4 |
PDF | https://arxiv.org/pdf/1904.12848v4.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-data-augmentation-1 |
Repo | https://github.com/BobaZooba/Unsupervised-Data-Augmentation |
Framework | pytorch |
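The training signal described in the abstract combines supervised cross-entropy on the few labeled examples with a consistency term on unlabeled data: the KL divergence between the model's prediction on an unlabeled example and on its strongly augmented version. A hedged numpy sketch of that combined objective, leaving the augmentation and the model abstract, is shown below.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-8):
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def uda_loss(logits_labeled, labels, logits_unlab, logits_unlab_aug, lam=1.0):
    """Supervised cross-entropy + consistency KL(pred(x) || pred(augment(x)))."""
    p_sup = softmax(logits_labeled)
    ce = -np.mean(np.log(p_sup[np.arange(len(labels)), labels] + 1e-8))
    # the real method stops gradients through the clean prediction; plain numpy here
    consistency = np.mean(kl(softmax(logits_unlab), softmax(logits_unlab_aug)))
    return ce + lam * consistency

rng = np.random.default_rng(0)
loss = uda_loss(rng.normal(size=(4, 10)), np.array([1, 3, 5, 7]),
                rng.normal(size=(32, 10)), rng.normal(size=(32, 10)))
print(round(float(loss), 3))
```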
High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction
Title | High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction |
Authors | Diego Valsesia, Enrico Magli |
Abstract | Compression of hyperspectral images onboard spacecraft is a tradeoff between the limited computational resources and the ever-growing spatial and spectral resolution of the optical instruments. As such, it requires low-complexity algorithms with good rate-distortion performance and high throughput. In recent years, the Consultative Committee for Space Data Systems (CCSDS) has focused on lossless and near-lossless compression approaches based on predictive coding, resulting in the recently published CCSDS 123.0-B-2 recommended standard. While the in-loop reconstruction of quantized prediction residuals provides excellent rate-distortion performance for the near-lossless operating mode, it significantly constrains the achievable throughput due to data dependencies. In this paper, we study the performance of a faster method based on prequantization of the image followed by a lossless predictive compressor. While this is well known to be suboptimal, powerful signal models can be exploited to reconstruct the image at the ground segment, recovering part of the lost quality. In particular, we show that convolutional neural networks can be used for this task and that they can recover the entire SNR drop incurred at a bitrate of 2 bits per pixel. |
Tasks | Image Compression |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02959v1 |
PDF | https://arxiv.org/pdf/1907.02959v1.pdf |
PWC | https://paperswithcode.com/paper/high-throughput-onboard-hyperspectral-image |
Repo | https://github.com/diegovalsesia/hyperspectral-dequantization |
Framework | pytorch |
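The onboard pipeline the abstract studies is deliberately simple: uniformly prequantize the image, then compress it losslessly with a predictive coder; the heavy lifting (CNN-based reconstruction) happens at the ground segment. The snippet below sketches only the onboard half under the assumption of a trivial previous-pixel predictor; CCSDS 123.0-B-2 uses a far more sophisticated adaptive predictor.

```python
import numpy as np

def prequantize(img, step):
    """Uniform scalar quantization: the only lossy step in the onboard chain."""
    return np.round(img / step).astype(np.int32)

def predictive_residuals(q):
    """Lossless stage (illustrative): previous-pixel prediction along each row;
    the residuals are small and highly compressible by an entropy coder."""
    pred = np.zeros_like(q)
    pred[:, 1:] = q[:, :-1]
    return q - pred

rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 64)
band = np.tile(x, (8, 1)) + rng.normal(scale=0.5, size=(8, 64))   # smooth synthetic band
q = prequantize(band, step=1.0)
res = predictive_residuals(q)
print(int(np.abs(res).max()), int(np.abs(q).max()))   # residuals span a much smaller range
# on the ground: dequantize q*step, then a CNN would restore part of the lost SNR
```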
GAP: Generalizable Approximate Graph Partitioning Framework
Title | GAP: Generalizable Approximate Graph Partitioning Framework |
Authors | Azade Nazi, Will Hang, Anna Goldie, Sujith Ravi, Azalia Mirhoseini |
Abstract | Graph partitioning is the problem of dividing the nodes of a graph into balanced partitions while minimizing the edge cut across the partitions. Due to its combinatorial nature, many approximate solutions have been developed, including variants of multi-level methods and spectral clustering. We propose GAP, a Generalizable Approximate Partitioning framework that takes a deep learning approach to graph partitioning. We define a differentiable loss function that represents the partitioning objective and use backpropagation to optimize the network parameters. Unlike baselines that redo the optimization per graph, GAP is capable of generalization, allowing us to train models that produce performant partitions at inference time, even on unseen graphs. Furthermore, because we learn the representation of the graph while jointly optimizing for the partitioning loss function, GAP can be easily tuned for a variety of graph structures. We evaluate the performance of GAP on graphs of varying sizes and structures, including graphs of widely used machine learning models (e.g., ResNet, VGG, and Inception-V3), scale-free graphs, and random graphs. We show that GAP achieves competitive partitions while being up to 100 times faster than the baseline and generalizes to unseen graphs. |
Tasks | graph partitioning |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00614v1 |
PDF | http://arxiv.org/pdf/1903.00614v1.pdf |
PWC | https://paperswithcode.com/paper/gap-generalizable-approximate-graph |
Repo | https://github.com/saurabhdash/GCN_Partitioning |
Framework | pytorch |
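The core of GAP is a differentiable surrogate for the normalized-cut objective: node-to-partition assignments come from a softmax, and the expected cut plus a balance penalty is minimized by backpropagation. The numpy sketch below evaluates one common form of such an expected normalized-cut loss for a given soft assignment; in the paper this loss is paired with a GNN that produces the assignments.

```python
import numpy as np

def expected_normalized_cut(A, Y, balance_weight=1.0):
    """A: (n, n) symmetric adjacency matrix, Y: (n, g) soft assignment
    (rows sum to 1). Returns expected normalized cut + a balance penalty."""
    d = A.sum(axis=1)                               # node degrees
    gamma = Y.T @ d                                 # (g,) expected volume of each partition
    cut = np.sum((Y.T @ A @ (1.0 - Y)).diagonal() / (gamma + 1e-9))
    n, g = Y.shape
    balance = np.sum((Y.sum(axis=0) - n / g) ** 2)  # penalize unbalanced partitions
    return cut + balance_weight * balance

# toy graph: two 3-cliques joined by a single bridge edge
A = np.zeros((6, 6))
A[:3, :3] = 1; A[3:, 3:] = 1; np.fill_diagonal(A, 0)
A[2, 3] = A[3, 2] = 1
good = np.repeat(np.eye(2), 3, axis=0)              # each clique gets its own partition
bad = np.tile([[0.5, 0.5]], (6, 1))                 # uninformative soft assignment
print(expected_normalized_cut(A, good) < expected_normalized_cut(A, bad))   # True
```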