January 25, 2020

Paper Group ANR 1666

A Vietnamese information retrieval system for product-price


Title	A Vietnamese information retrieval system for product-price
Authors	Tien-Thanh Vu, Dat Quoc Nguyen
Abstract	A price information retrieval (IR) system allows users to search for specific products and compare their prices. Building a product-price-driven IR system is a challenging and active research area. Approaches that depend entirely on product information supplied by shops through an interface are limited by the coverage of the resulting database, while automatic systems require product names and commercial websites as input. For both paradigms, approaches to building a product-price IR system for Vietnamese are still very limited. In this paper, we introduce an automatic Vietnamese IR system for product-price that identifies and stores XPath patterns to extract product prices from commercial websites. Experiments with our system show promising results.
Tasks	Information Retrieval
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11623v1
PDF	https://arxiv.org/pdf/1911.11623v1.pdf
PWC	https://paperswithcode.com/paper/a-vietnamese-information-retrieval-system-for
Repo
Framework
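
The abstract above describes storing XPath patterns per website and using them to pull prices out of product pages. The following is a minimal sketch of that idea, not the authors' system: the site names, the patterns, and the page snippets are all hypothetical, and the standard-library `xml.etree.ElementTree` is used only because it supports a small XPath subset on well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site XPath patterns (illustrative only, not from the paper).
PRICE_XPATHS = {
    "shop-a.example": ".//span[@class='price']",
    "shop-b.example": ".//div[@id='product']/b",
}

def extract_price(site, page_source):
    """Return the price text located by the stored pattern, or None."""
    pattern = PRICE_XPATHS.get(site)
    if pattern is None:
        return None  # no pattern has been stored for this site
    root = ET.fromstring(page_source)  # requires well-formed markup
    node = root.find(pattern)
    if node is None or not node.text:
        return None
    return node.text.strip()

# Made-up, well-formed page snippets standing in for real product pages.
PAGE_A = "<html><body><h1>Phone X</h1><span class='price'> 4.990.000 VND </span></body></html>"
PAGE_B = "<html><body><div id='product'><b>5.190.000 VND</b></div></body></html>"

prices = {
    "shop-a.example": extract_price("shop-a.example", PAGE_A),
    "shop-b.example": extract_price("shop-b.example", PAGE_B),
}
```

Real commercial pages are rarely well-formed XML, so a practical system would likely need a tolerant HTML parser; that choice is outside what the abstract states.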

In-memory hyperdimensional computing


Title	In-memory hyperdimensional computing
Authors	Geethan Karunaratne, Manuel Le Gallo, Giovanni Cherubini, Luca Benini, Abbas Rahimi, Abu Sebastian
Abstract	Hyperdimensional computing (HDC) is an emerging computing framework that takes inspiration from attributes of neuronal circuits such as hyperdimensionality, fully distributed holographic representation, and (pseudo)randomness. When employed for machine learning tasks such as learning and classification, HDC involves manipulation and comparison of large patterns within memory. Moreover, a key attribute of HDC is its robustness to the imperfections associated with the computational substrates on which it is implemented. It is therefore particularly amenable to emerging non-von Neumann paradigms such as in-memory computing, where the physical attributes of nanoscale memristive devices are exploited to perform computation in place. Here, we present a complete in-memory HDC system that achieves a near-optimum trade-off between design complexity and classification accuracy based on three prototypical HDC related learning tasks, namely, language classification, news classification, and hand gesture recognition from electromyography signals. Comparable accuracies to software implementations are demonstrated, experimentally, using 760,000 phase-change memory devices performing analog in-memory computing.
Tasks	Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01548v1
PDF	https://arxiv.org/pdf/1906.01548v1.pdf
PWC	https://paperswithcode.com/paper/in-memory-hyperdimensional-computing
Repo
Framework
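
To make the "manipulation and comparison of large patterns" in the abstract above concrete, here is a minimal software sketch of three common hyperdimensional operations on bipolar vectors: binding, bundling, and similarity. It is an illustration under assumptions (dimension 10,000, bipolar encoding, majority bundling), not the in-memory hardware implementation the paper reports.

```python
import random

D = 10_000  # hypervector dimensionality (an assumed, typical value)
rng = random.Random(0)  # fixed seed so the sketch is reproducible

def random_hv():
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    """Bind two hypervectors by elementwise multiplication."""
    return [x * y for x, y in zip(a, b)]

def bundle(vectors):
    """Bundle by elementwise majority; use an odd count to avoid ties."""
    return [1 if sum(col) > 0 else -1 for col in zip(*vectors)]

def similarity(a, b):
    """Normalized dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / D

a, b, c = random_hv(), random_hv(), random_hv()
prototype = bundle([a, b, c])   # stays similar to each of its members
unrelated = random_hv()         # nearly orthogonal to the prototype

sim_member = similarity(prototype, a)
sim_unrelated = similarity(prototype, unrelated)
recovered = bind(bind(a, b), a)  # binding with a again undoes the binding
```

Because random high-dimensional vectors are nearly orthogonal, a bundled prototype remains clearly closer to its members than to unrelated vectors, which is one reason such representations tolerate noisy substrates.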

Are skip connections necessary for biologically plausible learning rules?


Title	Are skip connections necessary for biologically plausible learning rules?
Authors	Daniel Jiwoong Im, Rutuja Patil, Kristin Branson
Abstract	Backpropagation is the workhorse of deep learning. However, several other biologically motivated learning rules have been introduced, such as random feedback alignment and difference target propagation. None of these methods has achieved performance competitive with backpropagation. In this paper, we show that biologically motivated learning rules with skip connections between intermediate layers can perform as well as backpropagation on the MNIST dataset and are robust to various sets of hyper-parameters.
Tasks
Published	2019-12-04
URL	https://arxiv.org/abs/2001.01647v1
PDF	https://arxiv.org/pdf/2001.01647v1.pdf
PWC	https://paperswithcode.com/paper/are-skip-connections-necessary-for
Repo
Framework

Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control


Title	Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control
Authors	Lukas Hermann, Max Argus, Andreas Eitel, Artemij Amiranashvili, Wolfram Burgard, Thomas Brox
Abstract	We propose Adaptive Curriculum Generation from Demonstrations (ACGD) for reinforcement learning in the presence of sparse rewards. Rather than designing shaped reward functions, ACGD adaptively sets the appropriate task difficulty for the learner by controlling where to sample from the demonstration trajectories and which set of simulation parameters to use. We show that training vision-based control policies in simulation while gradually increasing the difficulty of the task via ACGD improves the policy transfer to the real world. The degree of domain randomization is also gradually increased through the task difficulty. We demonstrate zero-shot transfer for two real-world manipulation tasks: pick-and-stow and block stacking. A video showing the results can be found at https://lmb.informatik.uni-freiburg.de/projects/curriculum/
Tasks
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07972v2
PDF	https://arxiv.org/pdf/1910.07972v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-curriculum-generation-from
Repo
Framework

Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation


Title	Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation
Authors	Sungkwon Choo, Wonkyo Seo, Nam Ik Cho
Abstract	This paper presents a method for automatic video object segmentation based on the fusion of a motion stream, an appearance stream, and instance-aware segmentation. The proposed scheme consists of a two-stream fusion network and an instance segmentation network. The two-stream fusion network in turn consists of motion and appearance stream networks, which extract long-term temporal and spatial information, respectively. Unlike existing two-stream fusion methods, the proposed fusion network blends the two streams at the original resolution to obtain an accurate segmentation boundary. We develop a recurrent bidirectional multiscale structure with skip connections for the stream fusion network to extract long-term temporal information. The multiscale structure also makes it possible to obtain original-resolution features at the end of the network. As a result of the two-stream fusion, we have a pixel-level probabilistic segmentation map, which has higher values at the pixels belonging to the foreground object. By combining the foreground probability map with the objectness score of the instance segmentation mask, we finally obtain foreground segmentation results for video sequences without any user intervention, i.e., we achieve successful automatic video segmentation. The proposed structure shows state-of-the-art performance on the automatic video object segmentation task, and also achieves performance close to that of semi-supervised methods.
Tasks	Instance Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2019-12-03
URL	https://arxiv.org/abs/1912.01373v1
PDF	https://arxiv.org/pdf/1912.01373v1.pdf
PWC	https://paperswithcode.com/paper/automatic-video-object-segmentation-via
Repo
Framework

Accurate and Energy-Efficient Classification with Spiking Random Neural Network: Corrected and Expanded Version


Title	Accurate and Energy-Efficient Classification with Spiking Random Neural Network: Corrected and Expanded Version
Authors	Khaled F. Hussain, Mohamed Yousef Bassyouni, Erol Gelenbe
Abstract	Artificial Neural Network (ANN) based techniques have dominated state-of-the-art results in most problems related to computer vision, audio recognition, and natural language processing in the past few years, resulting in strong industrial adoption by leading technology companies worldwide. One of the major obstacles that has historically delayed large-scale adoption of ANNs is the huge computational and power cost associated with training and testing (deploying) them. In the meantime, neuromorphic computing platforms have recently achieved remarkable performance running more bio-realistic Spiking Neural Networks at high throughput and very low power consumption, making them a natural alternative to ANNs. Here, we propose using the Random Neural Network (RNN), a spiking neural network with appealing theoretical and practical properties, as a general-purpose classifier that can match the classification power of ANNs on a number of tasks while enjoying all the features of a spiking neural network. This is demonstrated on a number of real-world classification datasets.
Tasks
Published	2019-06-01
URL	https://arxiv.org/abs/1906.08864v1
PDF	https://arxiv.org/pdf/1906.08864v1.pdf
PWC	https://paperswithcode.com/paper/accurate-and-energy-efficient-classification
Repo
Framework

Tree-gated Deep Mixture-of-Experts For Pose-robust Face Alignment


Title	Tree-gated Deep Mixture-of-Experts For Pose-robust Face Alignment
Authors	Estephe Arnaud, Arnaud Dapogny, Kevin Bailly
Abstract	Face alignment consists of aligning a shape model on a face image. It is an active domain in computer vision as it is a preprocessing step for a number of face analysis and synthesis applications. Current state-of-the-art methods already perform well on “easy” datasets, with moderate head pose variations, but may not be robust for “in-the-wild” data with poses up to 90°. In order to increase robustness to an ensemble of factors of variation (e.g. head pose or occlusions), a given layer (e.g. a regressor or an upstream CNN layer) can be replaced by a Mixture of Experts (MoE) layer that uses an ensemble of experts instead of a single one. The weights of this mixture can be learned as gating functions to jointly learn the experts and the corresponding weights. In this paper, we propose to use tree-structured gates, which allow a hierarchical weighting of the experts (Tree-MoE). We investigate the use of Tree-MoE layers in different contexts within face alignment with cascaded regression: first, to emphasize relevant, more specialized feature extractors depending on high-level semantic information such as head pose (Pose-Tree-MoE), and second, as an overall more robust regression layer. We perform extensive experiments on several challenging face alignment datasets, demonstrating that our approach outperforms the state-of-the-art methods.
Tasks	Face Alignment, Robust Face Alignment
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09450v1
PDF	https://arxiv.org/pdf/1910.09450v1.pdf
PWC	https://paperswithcode.com/paper/tree-gated-deep-mixture-of-experts-for-pose
Repo
Framework
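
One way to read the "hierarchical weighting of the experts" in the abstract above is a binary tree of soft gates, where each leaf (expert) weight is the product of the gate outputs along its root-to-leaf path. The sketch below illustrates that reading only; the breadth-first layout, the sigmoid gates, and the toy experts are assumptions, not the paper's architecture.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tree_gate_weights(gate_logits):
    """Leaf weights of a complete binary tree of soft gates.

    gate_logits holds 2**depth - 1 values in breadth-first order
    (an assumed layout). Returns 2**depth weights that sum to 1.
    """
    weights = [1.0]
    node = 0
    while node < len(gate_logits):
        next_weights = []
        for w in weights:
            g = sigmoid(gate_logits[node])  # probability of going left
            next_weights.extend([w * g, w * (1.0 - g)])
            node += 1
        weights = next_weights
    return weights

def mixture_output(x, experts, gate_logits):
    """Weighted sum of expert outputs using the tree-gated weights."""
    weights = tree_gate_weights(gate_logits)
    return sum(w * expert(x) for w, expert in zip(weights, experts))

# Four hypothetical scalar experts and a depth-2 tree (three gates).
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
uniform = tree_gate_weights([0.0, 0.0, 0.0])          # every gate at 0.5
skewed = tree_gate_weights([4.0, -4.0, 0.0])          # favours the second leaf
y = mixture_output(1.0, experts, [0.0, 0.0, 0.0])     # average of 1, 2, 3, 4
```

In a trained model the gate logits would be computed from the input (for example, from features related to head pose) rather than fixed by hand as they are here.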

Bi-Semantic Reconstructing Generative Network for Zero-shot Learning


Title	Bi-Semantic Reconstructing Generative Network for Zero-shot Learning
Authors	Shibing Xu, Zishu Gao, Guojun Xie
Abstract	Many recent zero-shot learning (ZSL) methods use a generative model to generate unseen visual samples from semantic descriptions and random noise, so that the ZSL problem becomes a traditional supervised classification problem. However, most existing methods based on generative models focus only on the quality of the synthesized samples at the training stage and ignore the importance of the zero-shot recognition stage. In this paper, we consider both of these points and propose a novel approach. Specifically, we select the Generative Adversarial Network (GAN) as our generative model. To improve the quality of the synthesized samples, and considering the internal relations of the semantic descriptions in the semantic space as well as the fact that seen and unseen visual information belong to different domains, we propose a bi-semantic reconstructing (BSR) component, which contains two different semantic reconstructing regressors that guide the training of the GAN. Since the semantic descriptions are available during the training stage, we further improve the classifier by training it on the combination of visual samples and semantic descriptions. At the recognition stage, we use the BSR component to transform the visual features and semantic descriptions, and concatenate them for classification. Experimental results show that our method outperforms the state of the art on several ZSL benchmark datasets with significant improvements.
Tasks	Zero-Shot Learning
Published	2019-12-09
URL	https://arxiv.org/abs/1912.03877v3
PDF	https://arxiv.org/pdf/1912.03877v3.pdf
PWC	https://paperswithcode.com/paper/bi-semantic-reconstructing-generative-network
Repo
Framework

Building Calibrated Deep Models via Uncertainty Matching with Auxiliary Interval Predictors


Title	Building Calibrated Deep Models via Uncertainty Matching with Auxiliary Interval Predictors
Authors	Jayaraman J. Thiagarajan, Bindya Venkatesh, Prasanna Sattigeri, Peer-Timo Bremer
Abstract	With the rapid adoption of deep learning in critical applications, the question of when and how much to trust these models often arises, which drives the need to quantify the inherent uncertainties. While identifying all sources that account for the stochasticity of models is challenging, it is common to augment predictions with confidence intervals to convey the expected variations in a model’s behavior. We require prediction intervals to be well-calibrated, to reflect the true uncertainties, and to be sharp. However, existing techniques for obtaining prediction intervals are known to produce unsatisfactory results on at least one of these criteria. To address this challenge, we develop a novel approach for building calibrated estimators. More specifically, we use separate models for prediction and interval estimation, and pose a bi-level optimization problem that allows the former to leverage estimates from the latter through an uncertainty matching strategy. Using experiments in regression, time-series forecasting, and object localization, we show that our approach achieves significant improvements over existing uncertainty quantification methods, both in terms of model fidelity and calibration error.
Tasks	Calibration, Object Localization, Time Series, Time Series Forecasting
Published	2019-09-09
URL	https://arxiv.org/abs/1909.04079v2
PDF	https://arxiv.org/pdf/1909.04079v2.pdf
PWC	https://paperswithcode.com/paper/building-calibrated-deep-models-via
Repo
Framework
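
The abstract above asks prediction intervals to be both well-calibrated and sharp. A common way to check these two properties, sketched here under assumptions, is to compare the empirical coverage of the intervals with the nominal level and to measure their average width. The data and the 0.9 nominal level are made up, and these are generic diagnostics rather than the paper's exact metrics.

```python
def coverage(y_true, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)

def mean_width(lower, upper):
    """Average interval width; smaller means sharper intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

def calibration_gap(y_true, lower, upper, nominal):
    """Absolute difference between empirical and nominal coverage."""
    return abs(coverage(y_true, lower, upper) - nominal)

# Made-up targets and intervals; the third target falls outside its interval.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
lower = [0.5, 1.5, 3.2, 3.0, 4.5]
upper = [1.5, 2.5, 4.0, 5.0, 5.5]

cov = coverage(y, lower, upper)               # 4 of 5 targets covered
width = mean_width(lower, upper)
gap = calibration_gap(y, lower, upper, 0.9)   # 0.9 is an assumed nominal level
```

Coverage alone can be satisfied trivially by very wide intervals, which is why a width measure is usually reported alongside it.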

A neural network based on SPD manifold learning for skeleton-based hand gesture recognition


Title	A neural network based on SPD manifold learning for skeleton-based hand gesture recognition
Authors	Xuan Son Nguyen, Luc Brun, Olivier Lézoray, Sébastien Bougleux
Abstract	This paper proposes a new neural network based on SPD manifold learning for skeleton-based hand gesture recognition. Given the stream of the hand’s joint positions, our approach combines two aggregation processes on the spatial and temporal domains, respectively. The pipeline of our network architecture consists of three main stages. The first stage is based on a convolutional layer to increase the discriminative power of learned features. The second stage relies on different architectures for spatial and temporal Gaussian aggregation of joint features. The third stage learns a final SPD matrix from skeletal data. A new type of layer is proposed for the third stage, based on a variant of stochastic gradient descent on Stiefel manifolds. The proposed network is validated on two challenging datasets and shows state-of-the-art accuracies on both datasets.
Tasks	Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published	2019-04-29
URL	http://arxiv.org/abs/1904.12970v1
PDF	http://arxiv.org/pdf/1904.12970v1.pdf
PWC	https://paperswithcode.com/paper/a-neural-network-based-on-spd-manifold
Repo
Framework

GestARLite: An On-Device Pointing Finger Based Gestural Interface for Smartphones and Video See-Through Head-Mounts


Title	GestARLite: An On-Device Pointing Finger Based Gestural Interface for Smartphones and Video See-Through Head-Mounts
Authors	Varun Jain, Gaurav Garg, Ramakrishna Perla, Ramya Hebbalaguppe
Abstract	Hand gestures form an intuitive means of interaction in Mixed Reality (MR) applications. However, accurate gesture recognition can be achieved only through state-of-the-art deep learning models or with the use of expensive sensors. Despite the robustness of these deep learning models, they are generally computationally expensive, and obtaining real-time performance on-device is still a challenge. To this end, we propose a novel lightweight hand gesture recognition framework that works in First Person View for wearable devices. The models are trained on a GPU machine and ported to an Android smartphone for use with frugal wearable devices such as the Google Cardboard and VR Box. The proposed hand gesture recognition framework is driven by a cascade of state-of-the-art deep learning models: MobileNetV2 for hand localisation, followed by our custom fingertip regression architecture and a Bi-LSTM model for gesture classification. We extensively evaluate the framework on our EgoGestAR dataset. The overall framework works in real-time on mobile devices and achieves a classification accuracy of 80% on the EgoGestAR video dataset with an average latency of only 0.12 s.
Tasks	Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition
Published	2019-04-19
URL	http://arxiv.org/abs/1904.09843v1
PDF	http://arxiv.org/pdf/1904.09843v1.pdf
PWC	https://paperswithcode.com/paper/190409843
Repo
Framework

Developing Creative AI to Generate Sculptural Objects


Title	Developing Creative AI to Generate Sculptural Objects
Authors	Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos
Abstract	We explore the intersection of human and machine creativity by generating sculptural objects through machine learning. This research raises questions about both the technical details of automatic art generation and the interaction between AI and people, as both artists and the audience of art. We introduce two algorithms for generating 3D point clouds and then discuss their actualization as sculpture and incorporation into a holistic art installation. Specifically, the Amalgamated DeepDream (ADD) algorithm solves the sparsity problem caused by the naive DeepDream-inspired approach and generates creative and printable point clouds. The Partitioned DeepDream (PDD) algorithm further allows us to explore more diverse 3D object creation by combining point cloud clustering algorithms and ADD.
Tasks	Generating 3D Point Clouds
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07587v1
PDF	https://arxiv.org/pdf/1908.07587v1.pdf
PWC	https://paperswithcode.com/paper/190807587
Repo
Framework

Discrimination in the Age of Algorithms


Title	Discrimination in the Age of Algorithms
Authors	Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, Cass R. Sunstein
Abstract	The law forbids discrimination. But the ambiguity of human decision-making often makes it extraordinarily hard for the legal system to know whether anyone has actually discriminated. To understand how algorithms affect discrimination, we must therefore also understand how they affect the problem of detecting discrimination. By one measure, algorithms are fundamentally opaque, not just cognitively but even mathematically. Yet for the task of proving discrimination, processes involving algorithms can provide crucial forms of transparency that are otherwise unavailable. These benefits do not happen automatically. But with appropriate requirements in place, the use of algorithms will make it possible to more easily examine and interrogate the entire decision process, thereby making it far easier to know whether discrimination has occurred. By forcing a new level of specificity, the use of algorithms also highlights, and makes transparent, central tradeoffs among competing values. Algorithms are not only a threat to be regulated; with the right safeguards in place, they have the potential to be a positive force for equity.
Tasks	Decision Making
Published	2019-02-11
URL	http://arxiv.org/abs/1902.03731v1
PDF	http://arxiv.org/pdf/1902.03731v1.pdf
PWC	https://paperswithcode.com/paper/discrimination-in-the-age-of-algorithms
Repo
Framework

Causally interpretable multi-step time series forecasting: A new machine learning approach using simulated differential equations


Title	Causally interpretable multi-step time series forecasting: A new machine learning approach using simulated differential equations
Authors	William Schoenberg
Abstract	This work presents a new approach that generates and then analyzes a highly nonlinear complex system of differential equations to perform interpretable time series forecasting at a high level of accuracy. This approach provides insight and understanding into the mechanisms responsible for generating past and future behavior. Core to this method is the construction of a highly nonlinear complex system of differential equations that is then analyzed to determine the origins of behavior. This paper demonstrates the technique on Mass and Senge’s two-state Inventory Workforce model (1975) and then explores its application to the real-world problem of organogenesis in mice. The organogenesis application consists of a fourteen-state system where the generated set of equations reproduces observed behavior with a high level of accuracy (r^2 = 0.880) and, when analyzed, produces an interpretable and causally plausible explanation for the observed behavior.
Tasks	Time Series, Time Series Forecasting
Published	2019-08-27
URL	https://arxiv.org/abs/1908.10336v1
PDF	https://arxiv.org/pdf/1908.10336v1.pdf
PWC	https://paperswithcode.com/paper/causally-interpretable-multi-step-time-series
Repo
Framework

Evaluating Lottery Tickets Under Distributional Shifts


Title	Evaluating Lottery Tickets Under Distributional Shifts
Authors	Shrey Desai, Hongyuan Zhan, Ahmed Aly
Abstract	The Lottery Ticket Hypothesis suggests large, over-parameterized neural networks consist of small, sparse subnetworks that can be trained in isolation to reach a similar (or better) test accuracy. However, the initialization and generalizability of the obtained sparse subnetworks have been recently called into question. Our work focuses on evaluating the initialization of sparse subnetworks under distributional shifts. Specifically, we investigate the extent to which a sparse subnetwork obtained in a source domain can be re-trained in isolation in a dissimilar, target domain. In addition, we examine the effects of different initialization strategies at transfer-time. Our experiments show that sparse subnetworks obtained through lottery ticket training do not simply overfit to particular domains, but rather reflect an inductive bias of deep neural networks that can be exploited in multiple domains.
Tasks
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12708v1
PDF	https://arxiv.org/pdf/1910.12708v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-lottery-tickets-under
Repo
Framework
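
For readers unfamiliar with how a sparse subnetwork is usually obtained in lottery-ticket experiments, the sketch below shows the standard magnitude-pruning step: the smallest-magnitude trained weights are masked out, and the surviving weights are reset to their initial values before retraining. The weight values are made up and the flat-list representation is a simplification; how the paper transfers such a mask to a different target domain is not shown here.

```python
def magnitude_mask(trained, prune_fraction):
    """Return a 0/1 mask that drops the smallest-magnitude trained weights."""
    k = int(len(trained) * prune_fraction)
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:k])  # indices of the k smallest magnitudes
    return [0 if i in pruned else 1 for i in range(len(trained))]

def reset_to_init(initial, mask):
    """Keep the initial value where the mask is 1, zero elsewhere."""
    return [w if m else 0.0 for w, m in zip(initial, mask)]

# Made-up weights for a six-parameter toy "network".
initial = [0.10, -0.20, 0.05, 0.30, -0.15, 0.25]
trained = [0.02, -0.90, 0.01, 0.70, -0.05, 0.40]

mask = magnitude_mask(trained, 0.5)     # prune half of the weights
ticket = reset_to_init(initial, mask)   # sparse subnetwork at initialization
```

The resulting `ticket` is what would then be retrained in isolation, either on the source data or, as the abstract investigates, on a dissimilar target domain.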