January 25, 2020

Paper Group ANR 1666

A Vietnamese information retrieval system for product-price

Title A Vietnamese information retrieval system for product-price
Authors Tien-Thanh Vu, Dat Quoc Nguyen
Abstract A price information retrieval (IR) system allows users to search for specific products and compare their prices. Building a product-price-driven IR system is a challenging and active research area. Approaches that depend entirely on product information supplied by shops through an interface are limited by their databases, while automatic systems require product names and commercial websites as input. For both paradigms, approaches to building a product-price IR system for Vietnamese are still very limited. In this paper, we introduce an automatic Vietnamese IR system for product prices that identifies and stores XPath patterns to extract product prices from commercial websites. Experiments with our system show promising results.
Tasks Information Retrieval
Published 2019-11-26
URL https://arxiv.org/abs/1911.11623v1
PDF https://arxiv.org/pdf/1911.11623v1.pdf
PWC https://paperswithcode.com/paper/a-vietnamese-information-retrieval-system-for
Repo
Framework
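
The idea of storing XPath patterns and applying them to shop pages can be illustrated with a minimal sketch. It is not the authors' implementation: the page snippet, the `PATTERN` dictionary, and the `parse_price` helper are hypothetical, and the example relies only on the limited XPath subset supported by Python's standard `xml.etree.ElementTree`, so it assumes well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product page snippet (not taken from the paper).
PAGE = """<html><body>
  <div class="product">
    <h1 class="name">Example Phone X</h1>
    <span class="price">4.990.000 VND</span>
  </div>
</body></html>"""

# Hypothetical stored pattern for one shop: one XPath expression per field.
PATTERN = {
    "name": ".//h1[@class='name']",
    "price": ".//span[@class='price']",
}


def parse_price(text):
    """Keep only the digits, assuming '.' is used as a thousands separator."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return int(digits) if digits else None


def extract(page, pattern):
    """Apply the stored XPath pattern to a page and return name and price."""
    root = ET.fromstring(page)
    name = root.find(pattern["name"])
    price = root.find(pattern["price"])
    return {
        "name": name.text.strip() if name is not None and name.text else None,
        "price": parse_price(price.text) if price is not None and price.text else None,
    }


result = extract(PAGE, PATTERN)
print(result)
```

Real commercial pages are rarely well-formed XML, so a production system would likely need a tolerant HTML parser; that choice is outside what the abstract describes.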

In-memory hyperdimensional computing

Title In-memory hyperdimensional computing
Authors Geethan Karunaratne, Manuel Le Gallo, Giovanni Cherubini, Luca Benini, Abbas Rahimi, Abu Sebastian
Abstract Hyperdimensional computing (HDC) is an emerging computing framework that takes inspiration from attributes of neuronal circuits such as hyperdimensionality, fully distributed holographic representation, and (pseudo)randomness. When employed for machine learning tasks such as learning and classification, HDC involves manipulation and comparison of large patterns within memory. Moreover, a key attribute of HDC is its robustness to the imperfections associated with the computational substrates on which it is implemented. It is therefore particularly amenable to emerging non-von Neumann paradigms such as in-memory computing, where the physical attributes of nanoscale memristive devices are exploited to perform computation in place. Here, we present a complete in-memory HDC system that achieves a near-optimum trade-off between design complexity and classification accuracy based on three prototypical HDC related learning tasks, namely, language classification, news classification, and hand gesture recognition from electromyography signals. Comparable accuracies to software implementations are demonstrated, experimentally, using 760,000 phase-change memory devices performing analog in-memory computing.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published 2019-06-04
URL https://arxiv.org/abs/1906.01548v1
PDF https://arxiv.org/pdf/1906.01548v1.pdf
PWC https://paperswithcode.com/paper/in-memory-hyperdimensional-computing
Repo
Framework
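
The "manipulation and comparison of large patterns" mentioned in the abstract can be sketched in software with generic HDC operations. This is only an illustration under stated assumptions: bipolar (+1/-1) hypervectors, binding by element-wise multiplication, bundling by majority vote, and a normalized dot product as similarity. The dimensionality `D` and all function names are illustrative choices, and nothing here models the phase-change memory hardware used in the paper.

```python
import random

D = 10_000  # hypervector dimensionality (illustrative choice)
rng = random.Random(0)  # fixed seed so the sketch is reproducible


def random_hv():
    """Draw a random bipolar hypervector."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bind(a, b):
    """Binding: element-wise multiplication; the result is dissimilar to both inputs."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling: element-wise majority vote (ties resolved to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]


def similarity(a, b):
    """Normalized dot product in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / D


a, b, c = random_hv(), random_hv(), random_hv()
prototype = bundle([a, b, c])   # stays similar to each of its members
unrelated = random_hv()

sim_member = similarity(prototype, a)
sim_unrelated = similarity(prototype, unrelated)
print(sim_member, sim_unrelated)
```

With three bundled vectors, the similarity between the prototype and a member is close to 0.5 in expectation, while unrelated random hypervectors are nearly orthogonal. This gap is what makes classification by nearest prototype tolerant of noise.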

Are skip connections necessary for biologically plausible learning rules?

Title Are skip connections necessary for biologically plausible learning rules?
Authors Daniel Jiwoong Im, Rutuja Patil, Kristin Branson
Abstract Backpropagation is the workhorse of deep learning; however, several other biologically motivated learning rules have been introduced, such as random feedback alignment and difference target propagation. None of these methods has produced performance competitive with backpropagation. In this paper, we show that biologically motivated learning rules with skip connections between intermediate layers can perform as well as backpropagation on the MNIST dataset and are robust to various sets of hyper-parameters.
Tasks
Published 2019-12-04
URL https://arxiv.org/abs/2001.01647v1
PDF https://arxiv.org/pdf/2001.01647v1.pdf
PWC https://paperswithcode.com/paper/are-skip-connections-necessary-for
Repo
Framework

Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control

Title Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control
Authors Lukas Hermann, Max Argus, Andreas Eitel, Artemij Amiranashvili, Wolfram Burgard, Thomas Brox
Abstract We propose Adaptive Curriculum Generation from Demonstrations (ACGD) for reinforcement learning in the presence of sparse rewards. Rather than designing shaped reward functions, ACGD adaptively sets the appropriate task difficulty for the learner by controlling where to sample from the demonstration trajectories and which set of simulation parameters to use. We show that training vision-based control policies in simulation while gradually increasing the difficulty of the task via ACGD improves the policy transfer to the real world. The degree of domain randomization is also gradually increased through the task difficulty. We demonstrate zero-shot transfer for two real-world manipulation tasks: pick-and-stow and block stacking. A video showing the results can be found at https://lmb.informatik.uni-freiburg.de/projects/curriculum/
Tasks
Published 2019-10-17
URL https://arxiv.org/abs/1910.07972v2
PDF https://arxiv.org/pdf/1910.07972v2.pdf
PWC https://paperswithcode.com/paper/adaptive-curriculum-generation-from
Repo
Framework

Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation

Title Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation
Authors Sungkwon Choo, Wonkyo Seo, Nam Ik Cho
Abstract This paper presents a method for automatic video object segmentation based on the fusion of motion stream, appearance stream, and instance-aware segmentation. The proposed scheme consists of a two-stream fusion network and an instance segmentation network. The two-stream fusion network in turn consists of motion and appearance stream networks, which extract long-term temporal and spatial information, respectively. Unlike existing two-stream fusion methods, the proposed fusion network blends the two streams at the original resolution to obtain accurate segmentation boundaries. We develop a recurrent bidirectional multiscale structure with skip connections for the stream fusion network to extract long-term temporal information. The multiscale structure also makes it possible to obtain original-resolution features at the end of the network. As a result of two-stream fusion, we have a pixel-level probabilistic segmentation map, which has higher values at the pixels belonging to the foreground object. By combining the foreground probability map and the objectness score of the instance segmentation mask, we finally obtain foreground segmentation results for video sequences without any user intervention, i.e., we achieve successful automatic video segmentation. The proposed structure shows state-of-the-art performance on the automatic video object segmentation task and also achieves performance close to that of semi-supervised methods.
Tasks Instance Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2019-12-03
URL https://arxiv.org/abs/1912.01373v1
PDF https://arxiv.org/pdf/1912.01373v1.pdf
PWC https://paperswithcode.com/paper/automatic-video-object-segmentation-via
Repo
Framework

Accurate and Energy-Efficient Classification with Spiking Random Neural Network: Corrected and Expanded Version

Title Accurate and Energy-Efficient Classification with Spiking Random Neural Network: Corrected and Expanded Version
Authors Khaled F. Hussain, Mohamed Yousef Bassyouni, Erol Gelenbe
Abstract Artificial Neural Network (ANN) based techniques have dominated state-of-the-art results in most problems related to computer vision, audio recognition, and natural language processing in the past few years, resulting in strong industrial adoption by all leading technology companies worldwide. One of the major obstacles that has historically delayed large-scale adoption of ANNs is the huge computational and power cost associated with training and testing (deploying) them. In the meantime, neuromorphic computing platforms have recently achieved remarkable performance running more bio-realistic Spiking Neural Networks at high throughput and very low power consumption, making them a natural alternative to ANNs. Here, we propose using the Random Neural Network (RNN), a spiking neural network with appealing theoretical and practical properties, as a general-purpose classifier that can match the classification power of ANNs on a number of tasks while enjoying all the features of a spiking neural network. This is demonstrated on a number of real-world classification datasets.
Tasks
Published 2019-06-01
URL https://arxiv.org/abs/1906.08864v1
PDF https://arxiv.org/pdf/1906.08864v1.pdf
PWC https://paperswithcode.com/paper/accurate-and-energy-efficient-classification
Repo
Framework

Tree-gated Deep Mixture-of-Experts For Pose-robust Face Alignment

Title Tree-gated Deep Mixture-of-Experts For Pose-robust Face Alignment
Authors Estephe Arnaud, Arnaud Dapogny, Kevin Bailly
Abstract Face alignment consists of aligning a shape model on a face image. It is an active domain in computer vision as it is a preprocessing step for a number of face analysis and synthesis applications. Current state-of-the-art methods already perform well on “easy” datasets, with moderate head pose variations, but may not be robust for “in-the-wild” data with poses up to 90°. In order to increase robustness to an ensemble of factors of variation (e.g. head pose or occlusions), a given layer (e.g. a regressor or an upstream CNN layer) can be replaced by a Mixture of Experts (MoE) layer that uses an ensemble of experts instead of a single one. The weights of this mixture can be learned as gating functions to jointly learn the experts and the corresponding weights. In this paper, we propose to use tree-structured gates, which allow a hierarchical weighting of the experts (Tree-MoE). We investigate the use of Tree-MoE layers in different contexts within the framework of face alignment with cascaded regression: first, to emphasize relevant, more specialized feature extractors depending on high-level semantic information such as head pose (Pose-Tree-MoE), and second, as an overall more robust regression layer. We perform extensive experiments on several challenging face alignment datasets, demonstrating that our approach outperforms the state-of-the-art methods.
Tasks Face Alignment, Robust Face Alignment
Published 2019-10-21
URL https://arxiv.org/abs/1910.09450v1
PDF https://arxiv.org/pdf/1910.09450v1.pdf
PWC https://paperswithcode.com/paper/tree-gated-deep-mixture-of-experts-for-pose
Repo
Framework
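
The hierarchical weighting of experts by tree-structured gates can be illustrated with a small sketch. It assumes soft binary gates with a sigmoid activation and a complete binary tree whose leaves are the experts; each expert's weight is the product of the gate probabilities along its root-to-leaf path. The gate logits are fixed numbers here, whereas in a learned model they would depend on the input, and the toy `experts` are hypothetical. This is not the authors' architecture.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tree_gate_weights(gate_logits, depth):
    """Return 2**depth leaf weights for a complete binary gating tree.

    gate_logits[i] is the logit of internal node i in heap order
    (root = 0, children of node i are 2*i + 1 and 2*i + 2).
    """
    weights, node_ids = [1.0], [0]
    for _ in range(depth):
        next_weights, next_ids = [], []
        for w, i in zip(weights, node_ids):
            p_right = sigmoid(gate_logits[i])
            next_weights += [w * (1.0 - p_right), w * p_right]
            next_ids += [2 * i + 1, 2 * i + 2]
        weights, node_ids = next_weights, next_ids
    return weights


# Four hypothetical experts: expert k maps x to (k + 1) * x.
experts = [lambda x, k=k: (k + 1) * x for k in range(4)]


def tree_moe(x, gate_logits, depth=2):
    """Mixture output: leaf-weighted sum of the expert predictions."""
    weights = tree_gate_weights(gate_logits, depth)
    return sum(w * f(x) for w, f in zip(weights, experts))


weights = tree_gate_weights([0.0, 0.0, 0.0], depth=2)
output = tree_moe(2.0, [0.0, 0.0, 0.0])
print(weights, output)
```

Because each gate splits its incoming weight between two children, the leaf weights always sum to one, and experts under the same subtree share the decisions made higher in the tree.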

Bi-Semantic Reconstructing Generative Network for Zero-shot Learning

Title Bi-Semantic Reconstructing Generative Network for Zero-shot Learning
Authors Shibing Xu, Zishu Gao, Guojun Xie
Abstract Many recent methods of zero-shot learning (ZSL) attempt to use a generative model to generate unseen visual samples from semantic descriptions and random noise. The ZSL problem thus becomes a traditional supervised classification problem. However, most existing methods based on generative models focus only on the quality of synthesized samples at the training stage and ignore the importance of the zero-shot recognition stage. In this paper, we consider both of these points and propose a novel approach. Specifically, we select the Generative Adversarial Network (GAN) as our generative model. In order to improve the quality of synthesized samples, considering the internal relation of the semantic description in the semantic space as well as the fact that the seen and unseen visual information belong to different domains, we propose a bi-semantic reconstructing (BSR) component which contains two different semantic reconstructing regressors to guide the training of the GAN. Since the semantic descriptions are available during the training stage, to further improve the ability of the classifier, we combine the visual samples and semantic descriptions to train a classifier. At the recognition stage, we naturally use the BSR component to transfer the visual features and semantic descriptions, and concatenate them for classification. Experimental results show that our method outperforms the state of the art on several ZSL benchmark datasets with significant improvements.
Tasks Zero-Shot Learning
Published 2019-12-09
URL https://arxiv.org/abs/1912.03877v3
PDF https://arxiv.org/pdf/1912.03877v3.pdf
PWC https://paperswithcode.com/paper/bi-semantic-reconstructing-generative-network
Repo
Framework

Building Calibrated Deep Models via Uncertainty Matching with Auxiliary Interval Predictors

Title Building Calibrated Deep Models via Uncertainty Matching with Auxiliary Interval Predictors
Authors Jayaraman J. Thiagarajan, Bindya Venkatesh, Prasanna Sattigeri, Peer-Timo Bremer
Abstract With rapid adoption of deep learning in critical applications, the question of when and how much to trust these models often arises, which drives the need to quantify the inherent uncertainties. While identifying all sources that account for the stochasticity of models is challenging, it is common to augment predictions with confidence intervals to convey the expected variations in a model’s behavior. We require prediction intervals to be well-calibrated, reflect the true uncertainties, and to be sharp. However, existing techniques for obtaining prediction intervals are known to produce unsatisfactory results in at least one of these criteria. To address this challenge, we develop a novel approach for building calibrated estimators. More specifically, we use separate models for prediction and interval estimation, and pose a bi-level optimization problem that allows the former to leverage estimates from the latter through an uncertainty matching strategy. Using experiments in regression, time-series forecasting, and object localization, we show that our approach achieves significant improvements over existing uncertainty quantification methods, both in terms of model fidelity and calibration error.
Tasks Calibration, Object Localization, Time Series, Time Series Forecasting
Published 2019-09-09
URL https://arxiv.org/abs/1909.04079v2
PDF https://arxiv.org/pdf/1909.04079v2.pdf
PWC https://paperswithcode.com/paper/building-calibrated-deep-models-via
Repo
Framework
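
The two criteria named in the abstract, calibration and sharpness, can be made concrete with a short sketch. It assumes common definitions that the abstract does not spell out: empirical coverage is the fraction of targets falling inside their intervals, calibration error is its absolute gap to a target level, and sharpness is measured by the mean interval width. The data values are made up, and the sketch does not reproduce the paper's bi-level optimization or uncertainty matching strategy.

```python
def coverage(y_true, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def calibration_error(y_true, lower, upper, target):
    """Absolute gap between empirical coverage and the target level."""
    return abs(coverage(y_true, lower, upper) - target)


def mean_width(lower, upper):
    """Average interval width; smaller means sharper intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Made-up targets and interval bounds for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
lo = [0.5, 1.5, 3.2, 3.0, 4.0]
hi = [1.5, 2.5, 4.0, 5.0, 6.0]

cov = coverage(y, lo, hi)                        # the third target lies below its interval
err = calibration_error(y, lo, hi, target=0.9)
width = mean_width(lo, hi)
print(cov, err, width)
```

The two quantities pull against each other: widening every interval raises coverage but hurts sharpness, which is why both need to be reported together.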

A neural network based on SPD manifold learning for skeleton-based hand gesture recognition

Title A neural network based on SPD manifold learning for skeleton-based hand gesture recognition
Authors Xuan Son Nguyen, Luc Brun, Olivier Lézoray, Sébastien Bougleux
Abstract This paper proposes a new neural network based on SPD manifold learning for skeleton-based hand gesture recognition. Given the stream of the hand’s joint positions, our approach combines two aggregation processes on the spatial and temporal domains, respectively. The pipeline of our network architecture consists of three main stages. The first stage is based on a convolutional layer to increase the discriminative power of learned features. The second stage relies on different architectures for spatial and temporal Gaussian aggregation of joint features. The third stage learns a final SPD matrix from skeletal data. A new type of layer is proposed for the third stage, based on a variant of stochastic gradient descent on Stiefel manifolds. The proposed network is validated on two challenging datasets and shows state-of-the-art accuracies on both datasets.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published 2019-04-29
URL http://arxiv.org/abs/1904.12970v1
PDF http://arxiv.org/pdf/1904.12970v1.pdf
PWC https://paperswithcode.com/paper/a-neural-network-based-on-spd-manifold
Repo
Framework

GestARLite: An On-Device Pointing Finger Based Gestural Interface for Smartphones and Video See-Through Head-Mounts

Title GestARLite: An On-Device Pointing Finger Based Gestural Interface for Smartphones and Video See-Through Head-Mounts
Authors Varun Jain, Gaurav Garg, Ramakrishna Perla, Ramya Hebbalaguppe
Abstract Hand gestures form an intuitive means of interaction in Mixed Reality (MR) applications. However, accurate gesture recognition can be achieved only through state-of-the-art deep learning models or with the use of expensive sensors. Despite the robustness of these deep learning models, they are generally computationally expensive, and obtaining real-time performance on-device is still a challenge. To this end, we propose a novel lightweight hand gesture recognition framework that works in First Person View for wearable devices. The models are trained on a GPU machine and ported to an Android smartphone for use with frugal wearable devices such as the Google Cardboard and VR Box. The proposed hand gesture recognition framework is driven by a cascade of state-of-the-art deep learning models: MobileNetV2 for hand localisation, followed by our custom fingertip regression architecture and a Bi-LSTM model for gesture classification. We extensively evaluate the framework on our EgoGestAR dataset. The overall framework works in real time on mobile devices and achieves a classification accuracy of 80% on the EgoGestAR video dataset with an average latency of only 0.12 s.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published 2019-04-19
URL http://arxiv.org/abs/1904.09843v1
PDF http://arxiv.org/pdf/1904.09843v1.pdf
PWC https://paperswithcode.com/paper/190409843
Repo
Framework

Developing Creative AI to Generate Sculptural Objects

Title Developing Creative AI to Generate Sculptural Objects
Authors Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos
Abstract We explore the intersection of human and machine creativity by generating sculptural objects through machine learning. This research raises questions about both the technical details of automatic art generation and the interaction between AI and people, as both artists and the audience of art. We introduce two algorithms for generating 3D point clouds and then discuss their actualization as sculpture and incorporation into a holistic art installation. Specifically, the Amalgamated DeepDream (ADD) algorithm solves the sparsity problem caused by the naive DeepDream-inspired approach and generates creative and printable point clouds. The Partitioned DeepDream (PDD) algorithm further allows us to explore more diverse 3D object creation by combining point cloud clustering algorithms and ADD.
Tasks Generating 3D Point Clouds
Published 2019-08-20
URL https://arxiv.org/abs/1908.07587v1
PDF https://arxiv.org/pdf/1908.07587v1.pdf
PWC https://paperswithcode.com/paper/190807587
Repo
Framework

Discrimination in the Age of Algorithms

Title Discrimination in the Age of Algorithms
Authors Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, Cass R. Sunstein
Abstract The law forbids discrimination. But the ambiguity of human decision-making often makes it extraordinarily hard for the legal system to know whether anyone has actually discriminated. To understand how algorithms affect discrimination, we must therefore also understand how they affect the problem of detecting discrimination. By one measure, algorithms are fundamentally opaque, not just cognitively but even mathematically. Yet for the task of proving discrimination, processes involving algorithms can provide crucial forms of transparency that are otherwise unavailable. These benefits do not happen automatically. But with appropriate requirements in place, the use of algorithms will make it possible to more easily examine and interrogate the entire decision process, thereby making it far easier to know whether discrimination has occurred. By forcing a new level of specificity, the use of algorithms also highlights, and makes transparent, central tradeoffs among competing values. Algorithms are not only a threat to be regulated; with the right safeguards in place, they have the potential to be a positive force for equity.
Tasks Decision Making
Published 2019-02-11
URL http://arxiv.org/abs/1902.03731v1
PDF http://arxiv.org/pdf/1902.03731v1.pdf
PWC https://paperswithcode.com/paper/discrimination-in-the-age-of-algorithms
Repo
Framework

Causally interpretable multi-step time series forecasting: A new machine learning approach using simulated differential equations

Title Causally interpretable multi-step time series forecasting: A new machine learning approach using simulated differential equations
Authors William Schoenberg
Abstract This work presents a new approach that generates and then analyzes a highly nonlinear complex system of differential equations to perform interpretable time series forecasting at a high level of accuracy. This approach provides insight into and understanding of the mechanisms responsible for generating past and future behavior. Core to this method is the construction of a highly nonlinear complex system of differential equations that is then analyzed to determine the origins of behavior. This paper demonstrates the technique on Mass and Senge’s two-state Inventory Workforce model (1975) and then explores its application to the real-world problem of organogenesis in mice. The organogenesis application consists of a fourteen-state system where the generated set of equations reproduces observed behavior with a high level of accuracy (r^2 = 0.880) and, when analyzed, produces an interpretable and causally plausible explanation for the observed behavior.
Tasks Time Series, Time Series Forecasting
Published 2019-08-27
URL https://arxiv.org/abs/1908.10336v1
PDF https://arxiv.org/pdf/1908.10336v1.pdf
PWC https://paperswithcode.com/paper/causally-interpretable-multi-step-time-series
Repo
Framework

Evaluating Lottery Tickets Under Distributional Shifts

Title Evaluating Lottery Tickets Under Distributional Shifts
Authors Shrey Desai, Hongyuan Zhan, Ahmed Aly
Abstract The Lottery Ticket Hypothesis suggests large, over-parameterized neural networks consist of small, sparse subnetworks that can be trained in isolation to reach a similar (or better) test accuracy. However, the initialization and generalizability of the obtained sparse subnetworks have been recently called into question. Our work focuses on evaluating the initialization of sparse subnetworks under distributional shifts. Specifically, we investigate the extent to which a sparse subnetwork obtained in a source domain can be re-trained in isolation in a dissimilar, target domain. In addition, we examine the effects of different initialization strategies at transfer-time. Our experiments show that sparse subnetworks obtained through lottery ticket training do not simply overfit to particular domains, but rather reflect an inductive bias of deep neural networks that can be exploited in multiple domains.
Tasks
Published 2019-10-28
URL https://arxiv.org/abs/1910.12708v1
PDF https://arxiv.org/pdf/1910.12708v1.pdf
PWC https://paperswithcode.com/paper/evaluating-lottery-tickets-under
Repo
Framework
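
How a sparse subnetwork is obtained and then reused can be sketched as follows. The sketch assumes one-shot magnitude pruning and resetting of the surviving weights to their initial values, which is the standard lottery ticket recipe; the abstract does not state which pruning schedule or initialization strategies the authors compare. The weight values and function names are hypothetical, and the re-training step in the target domain is omitted.

```python
def magnitude_mask(trained, sparsity):
    """Binary mask that prunes the `sparsity` fraction of smallest-magnitude weights."""
    n_prune = int(len(trained) * sparsity)
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained))]


def winning_ticket(initial, mask):
    """Reset surviving weights to their initial values; pruned weights stay at zero."""
    return [w if m else 0.0 for w, m in zip(initial, mask)]


# Hypothetical flattened weights before and after training on a source domain.
initial = [0.10, -0.20, 0.05, 0.30, -0.15, 0.25]
trained = [0.90, -0.01, 0.02, -0.70, 0.03, 0.40]

mask = magnitude_mask(trained, sparsity=0.5)
ticket = winning_ticket(initial, mask)
print(mask, ticket)
```

In the transfer setting studied in the paper, the mask found on the source domain would be kept fixed while the masked network is re-trained on the target domain, so the question becomes whether the sparsity pattern itself carries over.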