February 1, 2020


Paper Group AWR 132


Performance and Comparisons of STDP based and Non-STDP based Memristive Neural Networks on Hardware

Title Performance and Comparisons of STDP based and Non-STDP based Memristive Neural Networks on Hardware
Authors Zhiri Tang
Abstract With the development of research on memristors, memristive neural networks (MNNs) have recently become a hot research topic. Because memristors can mimic spike timing-dependent plasticity (STDP), research on STDP-based MNNs is growing rapidly. However, although state-of-the-art STDP-based MNNs have many applications, such as pattern recognition, the STDP mechanism brings a relatively complex hardware framework and low processing speed, which hinder MNNs’ hardware realization. This paper constructs a non-STDP based unsupervised MNN. Compared with the STDP method on two common structures, feedforward and crossbar, non-STDP based MNNs not only retain the advantages of STDP-based MNNs, including high accuracy and fast convergence in pattern recognition, but also deliver better hardware performance, requiring fewer hardware resources and achieving higher processing speed. By combining memristive characteristics with a simple mechanism, non-STDP based MNNs offer better hardware compatibility, which may provide a new viewpoint for the engineering applications of memristive neural networks.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1907.09126v4
PDF https://arxiv.org/pdf/1907.09126v4.pdf
PWC https://paperswithcode.com/paper/non-stdp-based-unsupervised-memristive-neural
Repo https://github.com/GerinTang/InnovateFPGA2018_PR039
Framework none
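
To make the STDP-versus-non-STDP comparison concrete, here is a minimal sketch of the classic pair-based STDP update rule; the constants are illustrative textbook values, not taken from the paper.

```python
import numpy as np

def stdp_dw(dt, a_plus=0.05, a_minus=0.055, tau_plus=20.0, tau_minus=20.0):
    """Weight change for a spike-time difference dt = t_post - t_pre (ms)."""
    if dt > 0:   # pre fires before post: potentiation
        return a_plus * np.exp(-dt / tau_plus)
    return -a_minus * np.exp(dt / tau_minus)  # post before pre: depression

print(stdp_dw(5.0), stdp_dw(-5.0))
```

The per-spike-pair exponential bookkeeping is what makes an STDP circuit comparatively expensive in hardware, which motivates the paper’s non-STDP alternative.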

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)

Title ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)
Authors Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin
Abstract This paper reports on the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT), which consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top performing score of each challenge is as follows: i) T1 - 82.65%, ii) T2.1 - 74.3%, iii) T2.2 - 85.32%, iv) T3.1 - 53.86%, and v) T3.2 - 54.91%. Apart from the results, this paper also details the ArT dataset, task descriptions, evaluation metrics, and the participants’ methods. The dataset, the evaluation kit, and the results are publicly available at https://rrc.cvc.uab.es/?ch=14
Tasks Scene Text Detection, Scene Text Recognition, Text Spotting
Published 2019-09-16
URL https://arxiv.org/abs/1909.07145v1
PDF https://arxiv.org/pdf/1909.07145v1.pdf
PWC https://paperswithcode.com/paper/icdar2019-robust-reading-challenge-on
Repo https://github.com/cs-chan/Total-Text-Dataset
Framework none

A Good Sample is Hard to Find: Noise Injection Sampling and Self-Training for Neural Language Generation Models

Title A Good Sample is Hard to Find: Noise Injection Sampling and Self-Training for Neural Language Generation Models
Authors Chris Kedzie, Kathleen McKeown
Abstract Deep neural networks (DNN) are quickly becoming the de facto standard modeling method for many natural language generation (NLG) tasks. In order for such models to truly be useful, they must be capable of correctly generating utterances for novel meaning representations (MRs) at test time. In practice, even sophisticated DNNs with various forms of semantic control frequently fail to generate utterances faithful to the input MR. In this paper, we propose an architecture-agnostic self-training method to sample novel MR/text utterance pairs to augment the original training data. Remarkably, after training on the augmented data, even simple encoder-decoder models with greedy decoding are capable of generating semantically correct utterances that are as good as state-of-the-art outputs in both automatic and human evaluations of quality.
Tasks Text Generation
Published 2019-11-08
URL https://arxiv.org/abs/1911.03373v1
PDF https://arxiv.org/pdf/1911.03373v1.pdf
PWC https://paperswithcode.com/paper/a-good-sample-is-hard-to-find-noise-injection
Repo https://github.com/kedz/noiseylg
Framework pytorch
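
A toy sketch of the noise-injection sampling idea: perturb a decoder state, decode greedily, and keep only candidates a filter model judges faithful to the input MR. The one-matrix “decoder” and all names are illustrative stand-ins, not the authors’ code.

```python
import torch

torch.manual_seed(0)
vocab_size, hidden = 12, 8
W = torch.randn(vocab_size, hidden)  # toy stand-in for a trained decoder

def sample_with_noise(h, sigma, steps=5):
    """Greedy-decode a short token sequence from a noise-perturbed state."""
    h = h + sigma * torch.randn_like(h)   # noise injection
    tokens = []
    for _ in range(steps):
        tok = int(torch.argmax(W @ h))
        tokens.append(tok)
        h = torch.tanh(h + 0.1 * W[tok])  # toy state update
    return tokens

h = torch.randn(hidden)
candidates = [sample_with_noise(h, sigma=0.5) for _ in range(10)]
# A semantic filter would now discard unfaithful candidates; the survivors
# become extra MR/utterance pairs for self-training.
```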

Kandinsky Patterns

Title Kandinsky Patterns
Authors Heimo Mueller, Andreas Holzinger
Abstract Kandinsky Figures and Kandinsky Patterns are mathematically describable, simple, self-contained, and hence controllable test data sets for the development, validation, and training of explainability in artificial intelligence. Whilst Kandinsky Patterns have these computationally manageable properties, they are at the same time easily distinguishable by human observers. Consequently, controlled patterns can be described by both humans and computers. We define a Kandinsky Pattern as a set of Kandinsky Figures, where for each figure an “infallible authority” defines whether the figure belongs to the Kandinsky Pattern. With this simple principle we build training and validation data sets for automatic interpretability and context learning. In this paper we describe the basic idea and some underlying principles of Kandinsky Patterns, and we provide a GitHub repository to invite the international machine learning research community to experiment with our Kandinsky Patterns, to make progress in the field of explainable AI, and to contribute to the emerging field of explainability and causability.
Tasks
Published 2019-06-03
URL https://arxiv.org/abs/1906.00657v1
PDF https://arxiv.org/pdf/1906.00657v1.pdf
PWC https://paperswithcode.com/paper/190600657
Repo https://github.com/human-centered-ai-lab/dat-kandinsky-patterns
Framework none
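
The “infallible authority” is simply a ground-truth predicate over figures, so a labeled dataset can be generated programmatically. A hedged sketch with a hypothetical membership rule:

```python
import random

SHAPES = ["circle", "square", "triangle"]
COLORS = ["red", "blue", "yellow"]

def random_figure(n_objects=4):
    """A Kandinsky Figure as a list of (shape, color, x, y) objects."""
    return [(random.choice(SHAPES), random.choice(COLORS),
             random.random(), random.random()) for _ in range(n_objects)]

def authority(figure):
    """Hypothetical 'infallible authority': membership rule chosen for
    illustration only -- the figure contains at least one red circle."""
    return any(s == "circle" and c == "red" for s, c, _, _ in figure)

dataset = [(f, authority(f)) for f in (random_figure() for _ in range(1000))]
```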

Reinforcement Learning Upside Down: Don’t Predict Rewards – Just Map Them to Actions

Title Reinforcement Learning Upside Down: Don’t Predict Rewards – Just Map Them to Actions
Authors Juergen Schmidhuber
Abstract We transform reinforcement learning (RL) into a form of supervised learning (SL) by turning traditional RL on its head, calling this Upside Down RL (UDRL). Standard RL predicts rewards, while UDRL instead uses rewards as task-defining inputs, together with representations of time horizons and other computable functions of historic and desired future data. UDRL learns to interpret these input observations as commands, mapping them to actions (or action probabilities) through SL on past (possibly accidental) experience. UDRL generalizes to achieve high rewards or other goals, through input commands such as: get lots of reward within at most so much time! A separate paper [61] on first experiments with UDRL shows that even a pilot version of UDRL can outperform traditional baseline algorithms on certain challenging RL problems. We also introduce a related simple but general approach for teaching a robot to imitate humans. First videotape humans imitating the robot’s current behaviors, then let the robot learn through SL to map the videos (as input commands) to these behaviors, then let it generalize and imitate videos of humans executing previously unknown behavior. This Imitate-Imitator concept may actually explain why biological evolution has resulted in parents who imitate the babbling of their babies.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02875v1
PDF https://arxiv.org/pdf/1912.02875v1.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-upside-down-dont
Repo https://github.com/haron1100/Upside-Down-Reinforcement-Learning
Framework pytorch
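
A minimal sketch of the UDRL behavior function: supervised learning maps an observation plus a command (desired return, remaining horizon) to the action that was actually taken in past experience. Architecture and dimensions are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

class BehaviorFunction(nn.Module):
    """Maps (observation, desired return, horizon) to action logits."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, obs, desired_return, horizon):
        cmd = torch.stack([desired_return, horizon], dim=-1)
        return self.net(torch.cat([obs, cmd], dim=-1))

model = BehaviorFunction(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy replay batch: for step t of a stored episode, the command is the
# return actually obtained from t onward and the remaining episode length.
obs = torch.randn(32, 4)
ret = torch.randn(32)                        # observed returns-to-go
hor = torch.randint(1, 100, (32,)).float()   # remaining steps
act = torch.randint(0, 2, (32,))             # actions actually taken

loss = loss_fn(model(obs, ret, hor), act)    # plain supervised learning
opt.zero_grad(); loss.backward(); opt.step()
```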

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

Title Well-Read Students Learn Better: On the Importance of Pre-training Compact Models
Authors Iulia Turc, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
Abstract Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training. Due to the cost of applying such models to down-stream tasks, several model compression techniques for pre-trained language representations have been proposed (Sun et al., 2019; Sanh, 2019). However, surprisingly, the simple baseline of just pre-training and fine-tuning compact models has been overlooked. In this paper, we first show that pre-training remains important in the context of smaller architectures, and that fine-tuning pre-trained compact models can be competitive with more elaborate methods proposed in concurrent work. Starting with pre-trained compact models, we then explore transferring task knowledge from large fine-tuned models through standard knowledge distillation. The resulting simple yet effective and general algorithm, Pre-trained Distillation, brings further improvements. Through extensive experiments, we more generally explore the interaction between pre-training and distillation under two variables that have been under-studied: model size and properties of unlabeled task data. One surprising observation is that they have a compound effect even when applied sequentially on the same data. To accelerate future research, we make our 24 pre-trained miniature BERT models publicly available.
Tasks Language Modelling, Model Compression, Sentiment Analysis
Published 2019-08-23
URL https://arxiv.org/abs/1908.08962v2
PDF https://arxiv.org/pdf/1908.08962v2.pdf
PWC https://paperswithcode.com/paper/well-read-students-learn-better-the-impact-of
Repo https://github.com/google-research/bert
Framework tf
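
The distillation step of Pre-trained Distillation is standard soft-label knowledge distillation applied on top of a pre-trained compact student. A minimal sketch of that loss; the temperature T is a common knob, not a claim about the paper’s settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 as is conventional."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

loss = distillation_loss(torch.randn(8, 5), torch.randn(8, 5))
```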

Deep Hough Voting for 3D Object Detection in Point Clouds

Title Deep Hough Voting for 3D Object Detection in Point Clouds
Authors Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas
Abstract Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures from 2D detectors, they often convert 3D point clouds to regular grids (i.e., voxel grids or bird’s-eye-view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles to construct a 3D detection pipeline for point cloud data that is as generic as possible. However, due to the sparse nature of the data – samples from 2D manifolds in 3D space – we face a major challenge when directly predicting bounding box parameters from scene points: a 3D object centroid can be far from any surface point and thus hard to regress accurately in one step. To address this challenge, we propose VoteNet, an end-to-end 3D object detection network based on a synergy of deep point set networks and Hough voting. With a simple design, compact model size, and high efficiency, our model achieves state-of-the-art 3D detection on two large datasets of real 3D scans, ScanNet and SUN RGB-D. Remarkably, VoteNet outperforms previous methods using purely geometric information, without relying on color images.
Tasks 3D Object Detection, Object Detection
Published 2019-04-21
URL https://arxiv.org/abs/1904.09664v2
PDF https://arxiv.org/pdf/1904.09664v2.pdf
PWC https://paperswithcode.com/paper/deep-hough-voting-for-3d-object-detection-in
Repo https://github.com/qq456cvb/VoteNet
Framework tf
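
The core voting step can be sketched in a few lines: each seed point regresses an offset to the centroid of the object it belongs to. The linear layer here is a toy stand-in for VoteNet’s vote layers.

```python
import torch
import torch.nn as nn

C = 16                                 # toy feature width
vote_mlp = nn.Linear(C, 3 + C)         # stand-in for VoteNet's vote MLP

seed_xyz = torch.randn(1024, 3)        # seed points sampled from the scene
seed_feat = torch.randn(1024, C)       # their learned features

out = vote_mlp(seed_feat)
votes_xyz = seed_xyz + out[:, :3]      # each seed votes: centroid = seed + offset
votes_feat = seed_feat + out[:, 3:]
# Votes are then clustered, and each cluster is aggregated into one 3D box proposal.
```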

Object landmark discovery through unsupervised adaptation

Title Object landmark discovery through unsupervised adaptation
Authors Enrique Sanchez, Georgios Tzimiropoulos
Abstract This paper proposes a method to ease the unsupervised learning of object landmark detectors. Similarly to previous methods, our approach is fully unsupervised in the sense that it does not require or make any use of annotated landmarks for the target object category. Contrary to previous works, we do however assume that a landmark detector, which has already learned a structured representation for a given object category in a fully supervised manner, is available. Under this setting, our main idea boils down to adapting the given pre-trained network to the target object categories in a fully unsupervised manner. To this end, our method uses the pre-trained network as a core that remains frozen and is not updated during training, and learns, in an unsupervised manner, only a projection matrix to perform the adaptation to the target categories. By building upon an existing structured representation learned in a supervised manner, the optimization problem solved by our method is much more constrained, with significantly fewer parameters to learn, which seems to be important for unsupervised learning. We show that our method surpasses fully unsupervised techniques trained from scratch, as well as a strong baseline based on fine-tuning, and produces state-of-the-art results on several datasets. Code can be found at https://github.com/ESanchezLozano/SAIC-Unsupervised-landmark-detection-NeurIPS2019 .
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09469v1
PDF https://arxiv.org/pdf/1910.09469v1.pdf
PWC https://paperswithcode.com/paper/object-landmark-discovery-through
Repo https://github.com/ESanchezLozano/SAIC-Unsupervised-landmark-detection-NeurIPS2019
Framework pytorch
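
A minimal sketch of the adaptation setup: the pre-trained core stays frozen and only a projection is learned. The one-layer “backbone” is a toy stand-in for the supervised landmark detector.

```python
import torch
import torch.nn as nn

feat_dim, n_landmarks = 256, 10
backbone = nn.Conv2d(3, feat_dim, 3, padding=1)   # stand-in for the frozen core
for p in backbone.parameters():
    p.requires_grad = False                        # the core is never updated

projection = nn.Conv2d(feat_dim, n_landmarks, 1)   # the only learned weights

x = torch.randn(2, 3, 64, 64)
heatmaps = projection(backbone(x))  # adapted landmark heatmaps for the new category
```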

Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild

Title Gradio: Hassle-Free Sharing and Testing of ML Models in the Wild
Authors Abubakar Abid, Ali Abdalla, Ali Abid, Dawood Khan, Abdulrahman Alfozan, James Zou
Abstract Accessibility is a major challenge of machine learning (ML). Typical ML models are built by specialists and require specialized hardware/software as well as ML experience to validate. This makes it challenging for non-technical collaborators and end users (e.g., physicians) to easily provide feedback on model development and to gain trust in ML. The accessibility challenge also makes collaboration more difficult and limits the ML researcher’s exposure to realistic data and scenarios that occur in the wild. To improve accessibility and facilitate collaboration, we developed an open-source Python package, Gradio, which allows researchers to rapidly generate a visual interface for their ML models. Gradio makes accessing any ML model as easy as sharing a URL. Our development of Gradio was informed by interviews with a number of machine learning researchers who participate in interdisciplinary collaborations. Their feedback identified that Gradio should support a variety of interfaces and frameworks, allow for easy sharing of the interface, allow for input manipulation and interactive inference by the domain expert, and allow embedding the interface in IPython notebooks. We developed these features and carried out a case study to understand Gradio’s usefulness and usability in the setting of a machine learning collaboration between a researcher and a cardiologist.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.02569v1
PDF https://arxiv.org/pdf/1906.02569v1.pdf
PWC https://paperswithcode.com/paper/gradio-hassle-free-sharing-and-testing-of-ml
Repo https://github.com/gradio-app/gradio-UI
Framework tf
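
A minimal usage example in the spirit of the paper; the interface strings follow Gradio’s documented API, while the classifier itself is a stand-in.

```python
import gradio as gr

def classify(text):
    # Stand-in for a real model call, e.g. model.predict(text).
    return {"positive": 0.7, "negative": 0.3}

# One call builds and serves a shareable web UI for the model.
gr.Interface(fn=classify, inputs="text", outputs="label").launch()
```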

Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition

Title Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition
Authors Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
Abstract Learning subtle yet discriminative features (e.g., beak and eyes for a bird) plays a significant role in fine-grained image recognition. Existing attention-based approaches localize and amplify significant parts to learn fine-grained details, but they often suffer from a limited number of parts and heavy computational cost. In this paper, we propose to learn such fine-grained features from hundreds of part proposals with a Trilinear Attention Sampling Network (TASN) in an efficient teacher-student manner. Specifically, TASN consists of 1) a trilinear attention module, which generates attention maps by modeling the inter-channel relationships, 2) an attention-based sampler, which highlights attended parts at high resolution, and 3) a feature distiller, which distills part features into a global one via weight-sharing and feature-preserving strategies. Extensive experiments verify that TASN yields the best performance under the same settings compared with the most competitive approaches on the iNaturalist-2017, CUB-Bird, and Stanford-Cars datasets.
Tasks Fine-Grained Image Classification, Fine-Grained Image Recognition
Published 2019-03-14
URL https://arxiv.org/abs/1903.06150v2
PDF https://arxiv.org/pdf/1903.06150v2.pdf
PWC https://paperswithcode.com/paper/looking-for-the-devil-in-the-details-learning
Repo https://github.com/researchmm/tasn
Framework mxnet
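
One common reading of the trilinear attention module is softmax(X X^T) X over the reshaped feature map X of shape (C, HW); the paper applies additional normalizations, so treat this as an approximate sketch.

```python
import torch

def trilinear_attention(feat):
    """feat: (C, H, W) -> (C, H, W) attention maps via inter-channel relations."""
    C, H, W = feat.shape
    X = feat.reshape(C, H * W)
    rel = torch.softmax(X @ X.T, dim=-1)   # (C, C) inter-channel relationships
    return (rel @ X).reshape(C, H, W)

maps = trilinear_attention(torch.randn(64, 14, 14))
```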

Low-Rank Tucker Approximation of a Tensor From Streaming Data

Title Low-Rank Tucker Approximation of a Tensor From Streaming Data
Authors Yiming Sun, Yang Guo, Charlene Luo, Joel Tropp, Madeleine Udell
Abstract This paper describes a new algorithm for computing a low-Tucker-rank approximation of a tensor. The method applies a randomized linear map to the tensor to obtain a sketch that captures the important directions within each mode, as well as the interactions among the modes. The sketch can be extracted from streaming or distributed data, or with a single pass over the tensor, and it uses storage proportional to the degrees of freedom in the output Tucker approximation. The algorithm does not require a second pass over the tensor, although it can exploit another view to compute a superior approximation. The paper provides a rigorous theoretical guarantee on the approximation error. Extensive numerical experiments show that the algorithm produces useful results that improve on the state of the art for streaming Tucker decomposition.
Tasks
Published 2019-04-24
URL http://arxiv.org/abs/1904.10951v1
PDF http://arxiv.org/pdf/1904.10951v1.pdf
PWC https://paperswithcode.com/paper/low-rank-tucker-approximation-of-a-tensor
Repo https://github.com/udellgroup/tensorsketch
Framework tf
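
A numpy sketch of the one-pass idea: because the sketch is linear in the tensor, each mode unfolding can be compressed with a random map as the data streams in. The dense Gaussian maps here are for clarity only; the paper uses more memory-efficient structured maps.

```python
import numpy as np

def mode_unfold(T, mode):
    """Unfold tensor T along the given mode into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def one_pass_sketch(T, ranks, seed=0):
    """Sketch every mode unfolding with a random linear map."""
    rng = np.random.default_rng(seed)
    sketches = []
    for m, r in enumerate(ranks):
        unfold = mode_unfold(T, m)                      # (n_m, prod of others)
        omega = rng.standard_normal((unfold.shape[1], 2 * r))
        sketches.append(unfold @ omega)                 # (n_m, 2r)
    return sketches

T = np.random.rand(20, 30, 40)
sketches = one_pass_sketch(T, ranks=(5, 5, 5))
# Factor estimates: orthonormal bases of each mode sketch, e.g. via QR.
Q = [np.linalg.qr(S)[0] for S in sketches]
```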

Hyperparameter-Free Out-of-Distribution Detection Using Softmax of Scaled Cosine Similarity

Title Hyperparameter-Free Out-of-Distribution Detection Using Softmax of Scaled Cosine Similarity
Authors Engkarat Techapanurak, Masanori Suganuma, Takayuki Okatani
Abstract The ability to detect out-of-distribution (OOD) samples is vital to secure the reliability of deep neural networks in real-world applications. Considering the nature of OOD samples, detection methods should not have hyperparameters that need to be tuned depending on incoming OOD samples. However, most of the recently proposed methods do not meet this requirement, leading to compromised performance in real-world applications. In this paper, we propose a simple, hyperparameter-free method based on softmax of scaled cosine similarity. It resembles the approach employed by modern metric learning methods, but it differs in details; the differences are essential to achieve high detection performance. We show through experiments that our method outperforms the existing methods on the evaluation test recently proposed by Shafaei et al., which takes the above issue of hyperparameter dependency into account. We also show that it achieves at least comparable performance to other methods on the conventional test, where their hyperparameters are chosen using explicit OOD samples. Furthermore, it is computationally more efficient than most of the previous methods, since it needs only a single forward pass.
Tasks Metric Learning, Out-of-Distribution Detection
Published 2019-05-25
URL https://arxiv.org/abs/1905.10628v3
PDF https://arxiv.org/pdf/1905.10628v3.pdf
PWC https://paperswithcode.com/paper/hyperparameter-free-out-of-distribution
Repo https://github.com/engkarat/cosine-ood-detector
Framework pytorch
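
A sketch of the final layer: logits are scaled cosine similarities between the normalized feature and per-class weights. For brevity the scale is a single learned scalar here; the paper’s exact parameterization of the scale differs, but crucially it is learned rather than hand-tuned.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledCosineSoftmax(nn.Module):
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, feat_dim))
        self.scale = nn.Parameter(torch.tensor(10.0))  # learned, never hand-tuned

    def forward(self, feat):
        cos = F.linear(F.normalize(feat), F.normalize(self.weight))
        return self.scale * cos  # logits = scale * cosine similarity

layer = ScaledCosineSoftmax(128, 10)
logits = layer(torch.randn(4, 128))
ood_score = -logits.softmax(-1).max(-1).values  # low max confidence -> likely OOD
```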

Deep Metric Learning Beyond Binary Supervision

Title Deep Metric Learning Beyond Binary Supervision
Authors Sungyeon Kim, Minkyo Seo, Ivan Laptev, Minsu Cho, Suha Kwak
Abstract Metric Learning for visual similarity has mostly adopted binary supervision indicating whether a pair of images are of the same class or not. Such a binary indicator covers only a limited subset of image relations, and is not sufficient to represent semantic similarity between images described by continuous and/or structured labels such as object poses, image captions, and scene graphs. Motivated by this, we present a novel method for deep metric learning using continuous labels. First, we propose a new triplet loss that allows distance ratios in the label space to be preserved in the learned metric space. The proposed loss thus enables our model to learn the degree of similarity rather than just the order. Furthermore, we design a triplet mining strategy adapted to metric learning with continuous labels. We address three different image retrieval tasks with continuous labels in terms of human poses, room layouts and image captions, and demonstrate the superior performance of our approach compared to previous methods.
Tasks Image Captioning, Image Retrieval, Metric Learning, Semantic Similarity, Semantic Textual Similarity
Published 2019-04-21
URL http://arxiv.org/abs/1904.09626v1
PDF http://arxiv.org/pdf/1904.09626v1.pdf
PWC https://paperswithcode.com/paper/deep-metric-learning-beyond-binary
Repo https://github.com/tjddus9597/Beyond-Binary-Supervision-CVPR19
Framework pytorch
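
A sketch of a log-ratio triplet loss that matches distance ratios in label space, in the spirit of the paper; the exact form and distance choices are the paper’s, so treat this as an approximation.

```python
import torch

def log_ratio_loss(f_a, f_i, f_j, d_label_i, d_label_j, eps=1e-8):
    """(log d(f_a,f_i)/d(f_a,f_j) - log D(y_a,y_i)/D(y_a,y_j))^2, so that
    distance ratios in the continuous label space are preserved in the
    learned embedding space."""
    d_i = (f_a - f_i).pow(2).sum(-1) + eps
    d_j = (f_a - f_j).pow(2).sum(-1) + eps
    t = torch.log(d_i / d_j) - torch.log((d_label_i + eps) / (d_label_j + eps))
    return t.pow(2).mean()

f = torch.randn(4, 128)
loss = log_ratio_loss(f, torch.randn(4, 128), torch.randn(4, 128),
                      torch.rand(4), torch.rand(4))
```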

Gliding vertex on the horizontal bounding box for multi-oriented object detection

Title Gliding vertex on the horizontal bounding box for multi-oriented object detection
Authors Yongchao Xu, Mingtao Fu, Qimeng Wang, Yukang Wang, Kai Chen, Gui-Song Xia, Xiang Bai
Abstract Object detection has recently experienced substantial progress. Yet, the widely adopted horizontal bounding box representation is not appropriate for ubiquitous oriented objects such as objects in aerial images and scene texts. In this paper, we propose a simple yet effective framework to detect multi-oriented objects. Instead of directly regressing the four vertices, we glide each vertex of the horizontal bounding box along its corresponding side to accurately describe a multi-oriented object. Specifically, we regress four length ratios characterizing the relative gliding offset on each corresponding side. This facilitates the offset learning and avoids the confusion of sequential label points for oriented objects. To further remedy the confusion for nearly horizontal objects, we also introduce an obliquity factor, based on the area ratio between the object and its horizontal bounding box, that guides the selection of horizontal or oriented detection for each object. We add these five extra target variables to the regression head of Fast R-CNN, which requires negligible extra computation time. Extensive experimental results demonstrate that, without bells and whistles, the proposed method achieves superior performance on multiple multi-oriented object detection benchmarks, including object detection in aerial images, scene text detection, and pedestrian detection in fisheye images.
Tasks Object Detection, Object Detection In Aerial Images, Pedestrian Detection, Scene Text Detection
Published 2019-11-21
URL https://arxiv.org/abs/1911.09358v1
PDF https://arxiv.org/pdf/1911.09358v1.pdf
PWC https://paperswithcode.com/paper/gliding-vertex-on-the-horizontal-bounding-box
Repo https://github.com/xuannianz/EfficientDet
Framework tf
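
The box parameterization decodes easily: each vertex of the oriented quadrilateral glides along one side of the horizontal box by a regressed length ratio. A hedged decoding sketch:

```python
import numpy as np

def decode_gliding_vertex(hbox, ratios):
    """hbox: (xmin, ymin, xmax, ymax); ratios: four gliding ratios in [0, 1]."""
    xmin, ymin, xmax, ymax = hbox
    w, h = xmax - xmin, ymax - ymin
    a1, a2, a3, a4 = ratios
    return np.array([
        [xmin + a1 * w, ymin],   # glides right along the top side
        [xmax, ymin + a2 * h],   # glides down along the right side
        [xmax - a3 * w, ymax],   # glides left along the bottom side
        [xmin, ymax - a4 * h],   # glides up along the left side
    ])

quad = decode_gliding_vertex((10, 20, 110, 60), (0.3, 0.4, 0.3, 0.4))
```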

SuSi: Supervised Self-Organizing Maps for Regression and Classification in Python

Title SuSi: Supervised Self-Organizing Maps for Regression and Classification in Python
Authors Felix M. Riese, Sina Keller
Abstract In many research fields, the sizes of the existing datasets vary widely. Hence, there is a need for machine learning techniques that are well-suited to these different datasets. One possible technique is the self-organizing map (SOM), a type of artificial neural network that is, so far, weakly represented in the field of machine learning. The SOM’s unique characteristic is the neighborhood relationship of the output neurons, which improves its ability to generalize on small datasets. SOMs are mostly applied in unsupervised learning, and few studies focus on using SOMs as a supervised learning approach. Furthermore, no appropriate SOM package is available with respect to machine learning standards in the widely used programming language Python. In this paper, we introduce the freely available Supervised Self-organizing maps (SuSi) Python package, which performs supervised regression and classification. The implementation of SuSi is described with respect to the underlying mathematics. We then present first evaluations of the SOM for regression and classification datasets from two different domains of geospatial image analysis. Despite the early stage of its development, the SuSi framework performs well and is characterized by only small performance differences between the training and test datasets. A comparison of the SuSi framework with existing Python and R packages demonstrates its importance. In future work, the SuSi framework will be extended, optimized, and upgraded, e.g., with tools to better understand and visualize the input data and to handle missing and incomplete data.
Tasks
Published 2019-03-26
URL https://arxiv.org/abs/1903.11114v3
PDF https://arxiv.org/pdf/1903.11114v3.pdf
PWC https://paperswithcode.com/paper/susi-supervised-self-organizing-maps-for
Repo https://github.com/JustGlowing/minisom
Framework none
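
A hedged usage sketch assuming the sklearn-style estimators the paper describes; the class name SOMRegressor and its constructor arguments are assumptions based on the package’s documented interface.

```python
import numpy as np
import susi  # pip install susi

X, y = np.random.rand(100, 5), np.random.rand(100)

som = susi.SOMRegressor(n_rows=10, n_columns=10)  # assumed constructor args
som.fit(X, y)                                     # sklearn-style fit/predict
print(som.predict(X[:3]))
```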