Paper Group AWR 66
Grammatical Error Correction in Low-Resource Scenarios
Title | Grammatical Error Correction in Low-Resource Scenarios |
Authors | Jakub Náplava, Milan Straka |
Abstract | Grammatical error correction in English is a long-studied problem with many existing systems and datasets. However, there has been only limited research on error correction for other languages. In this paper, we present AKCES-GEC, a new dataset for grammatical error correction in Czech. We then run experiments on Czech, German, and Russian and show that, when utilizing a synthetic parallel corpus, a Transformer neural machine translation model can reach new state-of-the-art results on these datasets. AKCES-GEC is published under the CC BY-NC-SA 4.0 license at https://hdl.handle.net/11234/1-3057 and the source code of the GEC model is available at https://github.com/ufal/low-resource-gec-wnut2019. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00353v3 |
PDF | https://arxiv.org/pdf/1910.00353v3.pdf |
PWC | https://paperswithcode.com/paper/grammatical-error-correction-in-low-resource |
Repo | https://github.com/ufal/low-resource-gec-wnut2019 |
Framework | none |
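The ingredient the abstract leans on is a synthetic parallel corpus for training the Transformer GEC model. The exact noising procedure is described in the paper and repo; the snippet below is only a minimal, hypothetical sketch of the general idea — corrupting clean sentences with random word and character edits to produce (noisy, clean) training pairs for a sequence-to-sequence model.

```python
import random

def corrupt(sentence, p_word=0.1, p_char=0.03, seed=None):
    """Hypothetical noising: randomly drop/swap words and perturb characters
    to turn a clean sentence into a synthetic 'ungrammatical' source."""
    rng = random.Random(seed)
    noisy = []
    for w in sentence.split():
        r = rng.random()
        if r < p_word / 2:
            continue                      # drop the word
        if r < p_word and noisy:
            noisy[-1], w = w, noisy[-1]   # swap with the previous word
        chars = list(w)
        for i in range(len(chars)):
            if rng.random() < p_char:
                chars[i] = rng.choice("abcdefghijklmnopqrstuvwxyz")
        noisy.append("".join(chars))
    return " ".join(noisy)

clean = "the quick brown fox jumps over the lazy dog"
pair = (corrupt(clean, seed=0), clean)    # (source, target) pair for seq2seq training
print(pair)
```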
An End-to-End Neighborhood-based Interaction Model for Knowledge-enhanced Recommendation
Title | An End-to-End Neighborhood-based Interaction Model for Knowledge-enhanced Recommendation |
Authors | Yanru Qu, Ting Bai, Weinan Zhang, Jianyun Nie, Jian Tang |
Abstract | This paper studies graph-based recommendation, where an interaction graph is constructed from historical records and is leveraged to alleviate data sparsity and cold-start problems. We reveal an early summarization problem in existing graph-based models and propose the Neighborhood Interaction (NI) model to capture each neighbor pair (between user-side and item-side) distinctively. The NI model is more expressive and can capture more complicated structural patterns behind user-item interactions. To further enrich node connectivity and utilize high-order structural information, we incorporate extra knowledge graphs (KGs) and adopt graph neural networks (GNNs) in NI, yielding Knowledge-enhanced Neighborhood Interaction (KNI). Compared with state-of-the-art recommendation methods, e.g., feature-based, meta-path-based, and KG-based models, KNI achieves superior performance in click-through rate prediction (1.1%-8.4% absolute AUC improvements) and outperforms them by a wide margin in top-N recommendation on 4 real-world datasets. |
Tasks | Click-Through Rate Prediction, Knowledge Graphs |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.04032v2 |
PDF | https://arxiv.org/pdf/1908.04032v2.pdf |
PWC | https://paperswithcode.com/paper/an-end-to-end-neighborhood-based-interaction |
Repo | https://github.com/Atomu2014/KNI |
Framework | tf |
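The "early summarization" critique is that the user and item neighborhoods are each pooled into a single vector before they interact. Below is a hedged numpy sketch of the alternative the abstract describes — scoring every (user-neighbor, item-neighbor) pair and aggregating with attention. The actual KNI model (see the linked repo) adds knowledge-graph neighbors and GNN layers on top of this idea.

```python
import numpy as np

def neighborhood_interaction(U, V):
    """U: (m, d) embeddings of the user-side neighborhood,
    V: (n, d) embeddings of the item-side neighborhood.
    Returns an attention-weighted sum over all m*n neighbor-pair interactions."""
    scores = U @ V.T                                   # (m, n) pairwise interaction scores
    att = np.exp(scores - scores.max())
    att /= att.sum()                                   # softmax over all neighbor pairs
    pair_feats = U[:, None, :] * V[None, :, :]         # (m, n, d) element-wise pair features
    return (att[:, :, None] * pair_feats).sum(axis=(0, 1))   # (d,)

rng = np.random.default_rng(0)
user_nbrs, item_nbrs = rng.normal(size=(3, 8)), rng.normal(size=(4, 8))
print(neighborhood_interaction(user_nbrs, item_nbrs).shape)   # (8,)
```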
Click-Through Rate Prediction with the User Memory Network
Title | Click-Through Rate Prediction with the User Memory Network |
Authors | Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Li Li, Zhaojie Liu, Yanlong Du |
Abstract | Click-through rate (CTR) prediction is a critical task in online advertising systems. Models like Deep Neural Networks (DNNs) are simple but stateless: they consider each target ad independently and cannot directly exploit useful information contained in users’ historical ad impressions and clicks. In contrast, models like Recurrent Neural Networks (RNNs) are stateful but complex: they model temporal dependencies between users’ sequential behaviors and can achieve better prediction performance than DNNs, but both their offline training and online prediction are much more complex and time-consuming. In this paper, we propose the Memory Augmented DNN (MA-DNN) for practical CTR prediction services. In particular, we create two external memory vectors for each user, memorizing high-level abstractions of what a user possibly likes and dislikes. The proposed MA-DNN achieves a good compromise between DNN and RNN: it is as simple as a DNN, but has a certain ability to exploit useful information in users’ historical behaviors, as an RNN does. Both offline and online experiments demonstrate the effectiveness of MA-DNN for practical CTR prediction services. The memory component can also be added to other models (e.g., the Wide&Deep model). |
Tasks | Click-Through Rate Prediction |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04667v2 |
PDF | https://arxiv.org/pdf/1907.04667v2.pdf |
PWC | https://paperswithcode.com/paper/click-through-rate-prediction-with-the-user |
Repo | https://github.com/rener1199/deep_memory |
Framework | tf |
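The abstract's core idea is two per-user memory vectors that summarize liked and disliked ads and are fed to the DNN alongside the target-ad features. The update rule below is a hypothetical moving-average write, not necessarily the one used in the paper (which learns the memory jointly with the network).

```python
import numpy as np

class UserMemory:
    """Two external memory vectors per user: one for clicks (likes),
    one for non-clicks (dislikes)."""
    def __init__(self, dim, decay=0.9):
        self.like = np.zeros(dim)
        self.dislike = np.zeros(dim)
        self.decay = decay

    def write(self, ad_embedding, clicked):
        # hypothetical exponential-moving-average update of the memory
        if clicked:
            self.like = self.decay * self.like + (1 - self.decay) * ad_embedding
        else:
            self.dislike = self.decay * self.dislike + (1 - self.decay) * ad_embedding

    def read(self, target_ad_embedding):
        # the DNN would receive [target ad, like memory, dislike memory]
        return np.concatenate([target_ad_embedding, self.like, self.dislike])

mem = UserMemory(dim=4)
mem.write(np.ones(4), clicked=True)
print(mem.read(np.arange(4.0)).shape)   # (12,)
```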
DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion
Title | DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion |
Authors | Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martín-Martín, Cewu Lu, Li Fei-Fei, Silvio Savarese |
Abstract | A key technical challenge in performing 6D object pose estimation from RGB-D images is to fully leverage the two complementary data sources. Prior works either extract information from the RGB image and depth separately or use costly post-processing steps, limiting their performance in highly cluttered scenes and real-time applications. In this work, we present DenseFusion, a generic framework for estimating the 6D pose of a set of known objects from RGB-D images. DenseFusion is a heterogeneous architecture that processes the two data sources individually and uses a novel dense fusion network to extract pixel-wise dense feature embeddings, from which the pose is estimated. Furthermore, we integrate an end-to-end iterative pose refinement procedure that further improves the pose estimation while achieving near real-time inference. Our experiments show that our method outperforms state-of-the-art approaches on two datasets, YCB-Video and LineMOD. We also deploy our proposed method to a real robot to grasp and manipulate objects based on the estimated pose. |
Tasks | 6D Pose Estimation, 6D Pose Estimation using RGBD, Pose Estimation |
Published | 2019-01-15 |
URL | http://arxiv.org/abs/1901.04780v1 |
PDF | http://arxiv.org/pdf/1901.04780v1.pdf |
PWC | https://paperswithcode.com/paper/densefusion-6d-object-pose-estimation-by |
Repo | https://github.com/caoquan95/6D-pose-project |
Framework | pytorch |
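The abstract describes fusing per-pixel colour and geometry embeddings rather than concatenating two global vectors. Below is a minimal numpy sketch of that dense fusion step (per-pixel concatenation plus a pooled global context vector appended to every pixel), under the assumption of simple average pooling; the real network then regresses a pose per pixel and refines it iteratively.

```python
import numpy as np

def dense_fuse(rgb_feat, geo_feat):
    """rgb_feat, geo_feat: (N, d) per-pixel embeddings from the colour and
    depth/point-cloud branches for the same N pixels. Returns (N, 4d) fused features."""
    per_pixel = np.concatenate([rgb_feat, geo_feat], axis=1)       # (N, 2d) pixel-wise pairs
    global_ctx = per_pixel.mean(axis=0)                            # (2d,) simple global pooling
    global_ctx = np.tile(global_ctx, (per_pixel.shape[0], 1))      # appended to every pixel
    return np.concatenate([per_pixel, global_ctx], axis=1)         # (N, 4d)

rng = np.random.default_rng(0)
fused = dense_fuse(rng.normal(size=(500, 32)), rng.normal(size=(500, 32)))
print(fused.shape)   # (500, 128)
```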
Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations
Title | Heavy-tailed kernels reveal a finer cluster structure in t-SNE visualisations |
Authors | Dmitry Kobak, George Linderman, Stefan Steinerberger, Yuval Kluger, Philipp Berens |
Abstract | T-distributed stochastic neighbour embedding (t-SNE) is a widely used data visualisation technique. It differs from its predecessor SNE by the low-dimensional similarity kernel: the Gaussian kernel was replaced by the heavy-tailed Cauchy kernel, solving the “crowding problem” of SNE. Here, we develop an efficient implementation of t-SNE for a $t$-distribution kernel with an arbitrary degree of freedom $\nu$, with $\nu\to\infty$ corresponding to SNE and $\nu=1$ corresponding to the standard t-SNE. Using theoretical analysis and toy examples, we show that $\nu<1$ can further reduce the crowding problem and reveal finer cluster structure that is invisible in standard t-SNE. We further demonstrate the striking effect of heavier-tailed kernels on large real-life data sets such as MNIST, single-cell RNA-sequencing data, and the HathiTrust library. We use domain knowledge to confirm that the revealed clusters are meaningful. Overall, we argue that modifying the tail heaviness of the t-SNE kernel can yield additional insight into the cluster structure of the data. |
Tasks | |
Published | 2019-02-15 |
URL | http://arxiv.org/abs/1902.05804v2 |
PDF | http://arxiv.org/pdf/1902.05804v2.pdf |
PWC | https://paperswithcode.com/paper/heavy-tailed-kernels-reveal-a-finer-cluster |
Repo | https://github.com/berenslab/finer-tsne |
Framework | none |
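The only moving part here is the low-dimensional similarity kernel. A standard parametrization of a Student-t kernel with $\nu$ degrees of freedom is sketched below (the paper's exact parametrization may differ slightly): $\nu\to\infty$ recovers the Gaussian kernel of SNE, $\nu=1$ recovers the Cauchy kernel of standard t-SNE, and $\nu<1$ gives the heavier tails that the abstract argues reveal finer cluster structure.

```python
import numpy as np

def t_kernel(d2, nu):
    """Unnormalized low-dimensional similarity for squared distance d2,
    using a Student-t kernel with nu degrees of freedom.
    nu -> inf approximates the Gaussian (SNE); nu = 1 is Cauchy (t-SNE)."""
    return (1.0 + d2 / nu) ** (-(nu + 1.0) / 2.0)

d2 = np.linspace(0, 25, 6)
for nu in (np.inf, 1.0, 0.5):
    k = np.exp(-d2 / 2) if np.isinf(nu) else t_kernel(d2, nu)
    print(nu, np.round(k, 4))   # heavier tails (smaller nu) decay more slowly
```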
A Novel Chaos Theory Inspired Neuronal Architecture
Title | A Novel Chaos Theory Inspired Neuronal Architecture |
Authors | Harikrishnan N B, Nithin Nagaraj |
Abstract | The practical success of widely used machine learning (ML) and deep learning (DL) algorithms in the Artificial Intelligence (AI) community owes to the availability of large datasets for training and huge computational resources. Despite the enormous practical success of AI, these algorithms are only loosely inspired by the biological brain and do not mimic any of the fundamental properties of neurons in the brain, one such property being the chaotic firing of biological neurons. This motivates us to develop a novel neuronal architecture in which the individual neurons are intrinsically chaotic in nature. By making use of the topological transitivity property of chaos, our neuronal network is able to perform classification tasks with very few training samples. For the MNIST dataset, with as little as $0.1\%$ of the total training data, our method outperforms ML and matches DL in classification accuracy for up to $7$ training samples/class. For the Iris dataset, our accuracy is comparable with ML algorithms, and even with just two training samples/class we report an accuracy as high as $95.8\%$. This work highlights the effectiveness of chaos and its properties for learning, and paves the way for chaos-inspired neuronal architectures that closely mimic the chaotic nature of neurons in the brain. |
Tasks | |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.12601v1 |
PDF | https://arxiv.org/pdf/1905.12601v1.pdf |
PWC | https://paperswithcode.com/paper/190512601 |
Repo | https://github.com/kadarakos/chaoticgls |
Framework | none |
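The abstract gives only the high-level idea (neurons that are individually chaotic, classification via topological transitivity), and the linked repo name suggests Generalized Luröth Series maps. The snippet below is merely an illustrative chaotic skew-tent map with a "firing time" feature (iterations needed for the trajectory to enter a small neighbourhood of the stimulus); it is not the paper's exact architecture.

```python
def skew_tent(x, b=0.45):
    """One iteration of a skew-tent map on [0, 1] -- a simple chaotic map."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)

def firing_time(stimulus, x0=0.23, eps=0.01, max_iter=10000):
    """Illustrative chaotic 'neuron': iterate the map from x0 until the
    trajectory lands within eps of the stimulus; topological transitivity
    makes this happen for almost every stimulus."""
    x, n = x0, 0
    while abs(x - stimulus) >= eps and n < max_iter:
        x = skew_tent(x)
        n += 1
    return n

features = [firing_time(s) for s in (0.1, 0.5, 0.9)]
print(features)   # firing times could then serve as features for a simple classifier
```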
CornerNet-Lite: Efficient Keypoint Based Object Detection
Title | CornerNet-Lite: Efficient Keypoint Based Object Detection |
Authors | Hei Law, Yun Teng, Olga Russakovsky, Jia Deng |
Abstract | Keypoint-based methods are a relatively new paradigm in object detection, eliminating the need for anchor boxes and offering a simplified detection framework. The keypoint-based CornerNet achieves state-of-the-art accuracy among single-stage detectors. However, this accuracy comes at a high processing cost. In this work, we tackle the problem of efficient keypoint-based object detection and introduce CornerNet-Lite. CornerNet-Lite is a combination of two efficient variants of CornerNet: CornerNet-Saccade, which uses an attention mechanism to eliminate the need for exhaustively processing all pixels of the image, and CornerNet-Squeeze, which introduces a new compact backbone architecture. Together these two variants address the two critical use cases in efficient object detection: improving efficiency without sacrificing accuracy, and improving accuracy at real-time efficiency. CornerNet-Saccade is suitable for offline processing, improving the efficiency of CornerNet by 6.0x and the AP by 1.0% on COCO. CornerNet-Squeeze is suitable for real-time detection, improving both the efficiency and accuracy of the popular real-time detector YOLOv3 (34.4% AP at 34ms for CornerNet-Squeeze compared to 33.0% AP at 39ms for YOLOv3 on COCO). Together, these contributions reveal for the first time the potential of keypoint-based detection for applications requiring processing efficiency. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2019-04-18 |
URL | http://arxiv.org/abs/1904.08900v1 |
PDF | http://arxiv.org/pdf/1904.08900v1.pdf |
PWC | https://paperswithcode.com/paper/190408900 |
Repo | https://github.com/tc-qaq/CornerNet_Lite |
Framework | none |
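CornerNet-Saccade's efficiency gain comes from predicting an attention map on a downscaled image and then running full detection only on the highest-scoring crops. The snippet below is a toy, framework-free sketch of that crop-selection step; the attention network, crop sizes, and the detector itself are left abstract and are not the paper's exact procedure.

```python
import numpy as np

def select_crops(attention_map, k=3, crop=64, scale=4):
    """attention_map: (H, W) scores predicted on a downscaled image.
    Returns the top-k crop boxes (x0, y0, x1, y1) in full-resolution coordinates."""
    H, W = attention_map.shape
    top = np.argsort(attention_map, axis=None)[::-1][:k]   # indices of the k highest scores
    boxes = []
    for idx in top:
        cy, cx = divmod(int(idx), W)
        cx, cy = cx * scale, cy * scale                    # map back to full resolution
        boxes.append((cx - crop // 2, cy - crop // 2, cx + crop // 2, cy + crop // 2))
    return boxes                                           # run the full detector only on these

att = np.random.default_rng(0).random((32, 32))
print(select_crops(att))   # a handful of crops instead of the whole image
```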
Self-Supervised 3D Keypoint Learning for Ego-motion Estimation
Title | Self-Supervised 3D Keypoint Learning for Ego-motion Estimation |
Authors | Jiexiong Tang, Rares Ambrus, Vitor Guizilini, Sudeep Pillai, Hanme Kim, Adrien Gaidon |
Abstract | Generating reliable illumination and viewpoint invariant keypoints is critical for feature-based SLAM and SfM. State-of-the-art learning-based methods often rely on generating training samples by employing homography adaptation to create 2D synthetic views. While such approaches trivially solve data association between views, they cannot effectively learn from real illumination and non-planar 3D scenes. In this work, we propose a fully self-supervised approach towards learning depth-aware keypoints *purely* from unlabeled videos by incorporating a differentiable pose estimation module that jointly optimizes the keypoints and their depths in a Structure-from-Motion setting. We introduce 3D Multi-View Adaptation, a technique that exploits the temporal context in videos to self-supervise keypoint detection and matching in an end-to-end differentiable manner. Finally, we show how a fully self-supervised keypoint detection and description network can be trivially incorporated as a front-end into a state-of-the-art visual odometry framework that is robust and accurate. |
Tasks | Keypoint Detection, Motion Estimation, Pose Estimation, Visual Odometry |
Published | 2019-12-07 |
URL | https://arxiv.org/abs/1912.03426v1 |
PDF | https://arxiv.org/pdf/1912.03426v1.pdf |
PWC | https://paperswithcode.com/paper/self-supervised-3d-keypoint-learning-for-ego |
Repo | https://github.com/TRI-ML/KP3D |
Framework | none |
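The differentiable pose module described in the abstract lifts matched 2D keypoints to 3D using predicted depth and solves for ego-motion. Below is a hedged sketch of the two standard building blocks such a pipeline typically relies on — back-projection through the pinhole model and closed-form rigid alignment (Kabsch / orthogonal Procrustes) — not the paper's exact end-to-end differentiable formulation.

```python
import numpy as np

def backproject(uv, depth, K):
    """Lift pixel coordinates uv (N, 2) with per-keypoint depth (N,) to 3D
    camera coordinates using intrinsics K (3, 3)."""
    ones = np.ones((uv.shape[0], 1))
    rays = np.linalg.inv(K) @ np.concatenate([uv, ones], axis=1).T   # (3, N)
    return (rays * depth).T                                          # (N, 3)

def kabsch(P, Q):
    """Rigid transform (R, t) minimizing ||R P + t - Q|| for matched 3D points."""
    cP, cQ = P.mean(0), Q.mean(0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])          # avoid reflections
    R = Vt.T @ S @ U.T
    return R, cQ - R @ cP

# toy checks: back-project the principal point, then recover a known motion
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
print(backproject(np.array([[320.0, 240.0]]), np.array([2.0]), K))   # [[0. 0. 2.]]
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
Q = P @ R_true.T + np.array([0.5, -0.2, 1.0])
R, t = kabsch(P, Q)
print(np.allclose(R, R_true), np.round(t, 3))
```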
RF-Net: An End-to-End Image Matching Network based on Receptive Field
Title | RF-Net: An End-to-End Image Matching Network based on Receptive Field |
Authors | Xuelun Shen, Cheng Wang, Xin Li, Zenglei Yu, Jonathan Li, Chenglu Wen, Ming Cheng, Zijian He |
Abstract | This paper proposes RF-Net, a new end-to-end trainable matching network based on receptive fields, to compute sparse correspondence between images. Building an end-to-end trainable matching framework is desirable and challenging. The very recent approach LF-Net successfully embeds the entire feature extraction pipeline into a jointly trainable pipeline and produces state-of-the-art matching results. This paper introduces two modifications to the structure of LF-Net. First, we propose to construct receptive feature maps, which lead to more effective keypoint detection. Second, we introduce a general loss function term, the neighbor mask, to facilitate training patch selection, which results in improved stability in descriptor training. We trained RF-Net on the open dataset HPatches and compared it with other methods on multiple benchmark datasets. Experiments show that RF-Net outperforms existing state-of-the-art methods. |
Tasks | Keypoint Detection |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00604v1 |
PDF | https://arxiv.org/pdf/1906.00604v1.pdf |
PWC | https://paperswithcode.com/paper/190600604 |
Repo | https://github.com/Xylon-Sean/rfnet |
Framework | pytorch |
Rapidly Adapting Moment Estimation
Title | Rapidly Adapting Moment Estimation |
Authors | Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn |
Abstract | Adaptive gradient methods such as Adam have been shown to be very effective for training deep neural networks (DNNs) by tracking the second moment of gradients to compute the individual learning rates. Unlike existing methods, we make use of the most recent first moment of gradients to compute the individual learning rates at each iteration. The motivation is that the dynamic variation of the first moment of gradients may provide useful information for setting the learning rates. We refer to the new method as rapidly adapting moment estimation (RAME). The theoretical convergence of deterministic RAME is studied using an analysis similar to the one used in [1] for Adam. Experimental results for training a number of DNNs show promising convergence speed and generalization performance for RAME compared to the stochastic heavy-ball (SHB) method, Adam, and RMSprop. |
Tasks | |
Published | 2019-02-24 |
URL | http://arxiv.org/abs/1902.09030v1 |
PDF | http://arxiv.org/pdf/1902.09030v1.pdf |
PWC | https://paperswithcode.com/paper/rapidly-adapting-moment-estimation |
Repo | https://github.com/guoqiang-zhang/RAME |
Framework | tf |
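For reference, the Adam baseline the abstract contrasts against divides each step by a running estimate of the second moment of the gradient, whereas RAME derives the per-coordinate learning rates from the most recent first moment (the exact RAME update rule is given in the paper, not reproduced here). The sketch below implements only the well-known Adam update as a point of comparison.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: first moment m, second moment v, step counter t (1-based)."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# minimize f(x) = x^2 with Adam as a toy example
x, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
print(np.round(x, 4))   # close to 0
```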
GLAMpoints: Greedily Learned Accurate Match points
Title | GLAMpoints: Greedily Learned Accurate Match points |
Authors | Prune Truong, Stefanos Apostolopoulos, Agata Mosinska, Samuel Stucky, Carlos Ciller, Sandro De Zanet |
Abstract | We introduce GLAMpoints, a novel CNN-based feature point detector learned in a semi-supervised manner. Our detector extracts repeatable, stable interest points with dense coverage, specifically designed to maximize correct matching in a specific domain, in contrast to conventional techniques that optimize indirect metrics. In this paper, we apply our method to challenging retinal slit-lamp images, for which classical detectors yield unsatisfactory results due to low image quality and an insufficient number of low-level features. We show that GLAMpoints significantly outperforms classical detectors as well as state-of-the-art CNN-based methods in matching and registration quality for retinal images. Our method can also be extended to other domains, such as natural images. |
Tasks | Keypoint Detection |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06812v2 |
PDF | https://arxiv.org/pdf/1908.06812v2.pdf |
PWC | https://paperswithcode.com/paper/glampoints-greedily-learned-accurate-match |
Repo | https://github.com/DagnyT/GLAMpoints-PyTorch |
Framework | pytorch |
Expectation-Maximization Attention Networks for Semantic Segmentation
Title | Expectation-Maximization Attention Networks for Semantic Segmentation |
Authors | Xia Li, Zhisheng Zhong, Jianlong Wu, Yibo Yang, Zhouchen Lin, Hong Liu |
Abstract | The self-attention mechanism has been widely used for various tasks. It is designed to compute the representation of each position as a weighted sum of the features at all positions, and can thus capture long-range relations for computer vision tasks. However, it is computationally expensive, since the attention maps are computed with respect to all other positions. In this paper, we formulate the attention mechanism in an expectation-maximization manner and iteratively estimate a much more compact set of bases upon which the attention maps are computed. Through a weighted summation over these bases, the resulting representation is low-rank and discards noisy information from the input. The proposed Expectation-Maximization Attention (EMA) module is robust to input variance and is also memory- and computation-friendly. Moreover, we introduce bases maintenance and normalization methods to stabilize its training procedure. We conduct extensive experiments on popular semantic segmentation benchmarks including PASCAL VOC, PASCAL Context, and COCO Stuff, on which we set new records. |
Tasks | Semantic Segmentation |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1907.13426v2 |
PDF | https://arxiv.org/pdf/1907.13426v2.pdf |
PWC | https://paperswithcode.com/paper/expectation-maximization-attention-networks |
Repo | https://github.com/XiaLiPKU/EMANet |
Framework | pytorch |
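The EMA module alternates an E-step (soft assignment of each position to a small set of bases) and an M-step (bases re-estimated as weighted means of the features), so attention is computed against K bases rather than all N positions. Below is a minimal numpy sketch of that iteration, assuming a plain softmax responsibility and mean update; the published module additionally normalizes the bases and maintains them with a moving average.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def em_attention(X, K=8, iters=3, seed=0):
    """X: (N, d) features of all positions. Returns a low-rank
    reconstruction of X built from K iteratively re-estimated bases."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)]            # (K, d) initial bases
    for _ in range(iters):
        Z = softmax(X @ mu.T, axis=1)                       # E-step: (N, K) responsibilities
        mu = (Z.T @ X) / (Z.sum(axis=0)[:, None] + 1e-6)    # M-step: weighted means
    return Z @ mu                                           # (N, d) re-estimated features

X = np.random.default_rng(1).normal(size=(100, 16))
print(em_attention(X).shape)   # (100, 16); attention cost is O(N*K) instead of O(N^2)
```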
Unsupervised Data Augmentation for Consistency Training
Title | Unsupervised Data Augmentation for Consistency Training |
Authors | Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, Quoc V. Le |
Abstract | Semi-supervised learning has lately shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically that produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods, our method brings substantial improvements across six language and three vision tasks under the same consistency training framework. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error rate of 4.20, outperforming the state-of-the-art model trained on 25,000 labeled examples. On a standard semi-supervised learning benchmark, CIFAR-10, our method outperforms all previous approaches and achieves an error rate of 2.7% with only 4,000 examples, nearly matching the performance of models trained on 50,000 labeled examples. Our method also combines well with transfer learning, e.g., when fine-tuning from BERT, and yields improvements in high-data regimes such as ImageNet, both when only 10% of the data is labeled and when a full labeled set with 1.3M extra unlabeled examples is used. Code is available at https://github.com/google-research/uda. |
Tasks | Data Augmentation, Image Augmentation, Image Classification, Semi-Supervised Image Classification, Text Classification, Transfer Learning |
Published | 2019-04-29 |
URL | https://arxiv.org/abs/1904.12848v4 |
PDF | https://arxiv.org/pdf/1904.12848v4.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-data-augmentation-1 |
Repo | https://github.com/BobaZooba/Unsupervised-Data-Augmentation |
Framework | pytorch |
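The training signal described in the abstract combines supervised cross-entropy on the few labeled examples with a consistency term on unlabeled data: the KL divergence between the model's prediction on an unlabeled example and on its strongly augmented version. A hedged numpy sketch of that combined objective, leaving the augmentation and the model abstract, is shown below.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-8):
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def uda_loss(logits_labeled, labels, logits_unlab, logits_unlab_aug, lam=1.0):
    """Supervised cross-entropy + consistency KL(pred(x) || pred(augment(x)))."""
    p_sup = softmax(logits_labeled)
    ce = -np.mean(np.log(p_sup[np.arange(len(labels)), labels] + 1e-8))
    # the real method stops gradients through the clean prediction; plain numpy here
    consistency = np.mean(kl(softmax(logits_unlab), softmax(logits_unlab_aug)))
    return ce + lam * consistency

rng = np.random.default_rng(0)
loss = uda_loss(rng.normal(size=(4, 10)), np.array([1, 3, 5, 7]),
                rng.normal(size=(32, 10)), rng.normal(size=(32, 10)))
print(round(float(loss), 3))
```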
High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction
Title | High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN Reconstruction |
Authors | Diego Valsesia, Enrico Magli |
Abstract | Compression of hyperspectral images onboard spacecraft is a tradeoff between the limited computational resources and the ever-growing spatial and spectral resolution of the optical instruments. As such, it requires low-complexity algorithms with good rate-distortion performance and high throughput. In recent years, the Consultative Committee for Space Data Systems (CCSDS) has focused on lossless and near-lossless compression approaches based on predictive coding, resulting in the recently published CCSDS 123.0-B-2 recommended standard. While the in-loop reconstruction of quantized prediction residuals provides excellent rate-distortion performance for the near-lossless operating mode, it significantly constrains the achievable throughput due to data dependencies. In this paper, we study the performance of a faster method based on prequantization of the image followed by a lossless predictive compressor. While this is well known to be suboptimal, powerful signal models can be exploited to reconstruct the image at the ground segment, recovering part of the lost quality. In particular, we show that convolutional neural networks can be used for this task and that they can recover the entire SNR drop incurred at a bitrate of 2 bits per pixel. |
Tasks | Image Compression |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02959v1 |
PDF | https://arxiv.org/pdf/1907.02959v1.pdf |
PWC | https://paperswithcode.com/paper/high-throughput-onboard-hyperspectral-image |
Repo | https://github.com/diegovalsesia/hyperspectral-dequantization |
Framework | pytorch |
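The onboard pipeline the abstract studies is deliberately simple: uniformly prequantize the image, then compress it losslessly with a predictive coder; the heavy lifting (CNN-based reconstruction) happens at the ground segment. The snippet below sketches only the onboard half under the assumption of a trivial previous-pixel predictor; CCSDS 123.0-B-2 uses a far more sophisticated adaptive predictor.

```python
import numpy as np

def prequantize(img, step):
    """Uniform scalar quantization: the only lossy step in the onboard chain."""
    return np.round(img / step).astype(np.int32)

def predictive_residuals(q):
    """Lossless stage (illustrative): previous-pixel prediction along each row;
    the residuals are small and highly compressible by an entropy coder."""
    pred = np.zeros_like(q)
    pred[:, 1:] = q[:, :-1]
    return q - pred

rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 64)
band = np.tile(x, (8, 1)) + rng.normal(scale=0.5, size=(8, 64))   # smooth synthetic band
q = prequantize(band, step=1.0)
res = predictive_residuals(q)
print(int(np.abs(res).max()), int(np.abs(q).max()))   # residuals span a much smaller range
# on the ground: dequantize q*step, then a CNN would restore part of the lost SNR
```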
GAP: Generalizable Approximate Graph Partitioning Framework
Title | GAP: Generalizable Approximate Graph Partitioning Framework |
Authors | Azade Nazi, Will Hang, Anna Goldie, Sujith Ravi, Azalia Mirhoseini |
Abstract | Graph partitioning is the problem of dividing the nodes of a graph into balanced partitions while minimizing the edge cut across the partitions. Due to its combinatorial nature, many approximate solutions have been developed, including variants of multi-level methods and spectral clustering. We propose GAP, a Generalizable Approximate Partitioning framework that takes a deep learning approach to graph partitioning. We define a differentiable loss function that represents the partitioning objective and use backpropagation to optimize the network parameters. Unlike baselines that redo the optimization per graph, GAP is capable of generalization, allowing us to train models that produce performant partitions at inference time, even on unseen graphs. Furthermore, because we learn the representation of the graph while jointly optimizing for the partitioning loss function, GAP can be easily tuned for a variety of graph structures. We evaluate the performance of GAP on graphs of varying sizes and structures, including graphs of widely used machine learning models (e.g., ResNet, VGG, and Inception-V3), scale-free graphs, and random graphs. We show that GAP achieves competitive partitions while being up to 100 times faster than the baseline and generalizes to unseen graphs. |
Tasks | graph partitioning |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00614v1 |
PDF | http://arxiv.org/pdf/1903.00614v1.pdf |
PWC | https://paperswithcode.com/paper/gap-generalizable-approximate-graph |
Repo | https://github.com/saurabhdash/GCN_Partitioning |
Framework | pytorch |
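The core of GAP is a differentiable surrogate for the normalized-cut objective: node-to-partition assignments come from a softmax, and the expected cut plus a balance penalty is minimized by backpropagation. The numpy sketch below evaluates one common form of such an expected normalized-cut loss for a given soft assignment; in the paper this loss is paired with a GNN that produces the assignments.

```python
import numpy as np

def expected_normalized_cut(A, Y, balance_weight=1.0):
    """A: (n, n) symmetric adjacency matrix, Y: (n, g) soft assignment
    (rows sum to 1). Returns expected normalized cut + a balance penalty."""
    d = A.sum(axis=1)                               # node degrees
    gamma = Y.T @ d                                 # (g,) expected volume of each partition
    cut = np.sum((Y.T @ A @ (1.0 - Y)).diagonal() / (gamma + 1e-9))
    n, g = Y.shape
    balance = np.sum((Y.sum(axis=0) - n / g) ** 2)  # penalize unbalanced partitions
    return cut + balance_weight * balance

# toy graph: two 3-cliques joined by a single bridge edge
A = np.zeros((6, 6))
A[:3, :3] = 1; A[3:, 3:] = 1; np.fill_diagonal(A, 0)
A[2, 3] = A[3, 2] = 1
good = np.repeat(np.eye(2), 3, axis=0)              # each clique gets its own partition
bad = np.tile([[0.5, 0.5]], (6, 1))                 # uninformative soft assignment
print(expected_normalized_cut(A, good) < expected_normalized_cut(A, bad))   # True
```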