October 17, 2019


Paper Group ANR 879


Fast binary embeddings, and quantized compressed sensing with structured matrices


Title	Fast binary embeddings, and quantized compressed sensing with structured matrices
Authors	Thang Huynh, Rayan Saab
Abstract	This paper deals with two related problems, namely distance-preserving binary embeddings and quantization for compressed sensing. First, we propose fast methods to replace points from a subset $\mathcal{X} \subset \mathbb{R}^n$, associated with the Euclidean metric, with points in the cube $\{\pm 1\}^m$ and we associate the cube with a pseudo-metric that approximates Euclidean distance among points in $\mathcal{X}$. Our methods rely on quantizing fast Johnson-Lindenstrauss embeddings based on bounded orthonormal systems and partial circulant ensembles, both of which admit fast transforms. Our quantization methods utilize noise-shaping, and include Sigma-Delta schemes and distributed noise-shaping schemes. The resulting approximation errors decay polynomially and exponentially fast in $m$, depending on the embedding method. This dramatically outperforms the current decay rates associated with binary embeddings and Hamming distances. Additionally, it is the first such binary embedding result that applies to fast Johnson-Lindenstrauss maps while preserving $\ell_2$ norms. Second, we again consider noise-shaping schemes, albeit this time to quantize compressed sensing measurements arising from bounded orthonormal ensembles and partial circulant matrices. We show that these methods yield a reconstruction error that again decays with the number of measurements (and bits), when using convex optimization for reconstruction. Specifically, for Sigma-Delta schemes, the error decays polynomially in the number of measurements, and it decays exponentially for distributed noise-shaping schemes based on beta encoding. These results are near optimal and the first of their kind dealing with bounded orthonormal systems.
Tasks	Quantization
Published	2018-01-26
URL	http://arxiv.org/abs/1801.08639v2
PDF	http://arxiv.org/pdf/1801.08639v2.pdf
PWC	https://paperswithcode.com/paper/fast-binary-embeddings-and-quantized
Repo
Framework
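
As a rough illustration of the noise-shaping idea mentioned in the abstract above, the sketch below implements the textbook first-order Sigma-Delta quantizer for a sequence with entries in [-1, 1]. It is not the authors' code: the function name `sigma_delta_1bit` is made up for illustration, and the paper's fast embeddings, higher-order schemes, and beta-encoding variants are not reproduced here.

```python
def sigma_delta_1bit(y):
    """First-order Sigma-Delta quantization of a sequence with |y_i| <= 1.

    Returns the +/-1 bits and the state sequence u, where
    u_i = u_{i-1} + y_i - q_i and q_i = sign(u_{i-1} + y_i).
    """
    u = 0.0                      # running state: accumulated quantization error
    bits, states = [], []
    for yi in y:
        v = u + yi               # input plus the error carried over so far
        q = 1 if v >= 0 else -1  # one-bit quantizer
        u = v - q                # new state; stays in [-1, 1] when |y_i| <= 1
        bits.append(q)
        states.append(u)
    return bits, states
```

Because the state stays bounded, the running sum of the bits tracks the running sum of the input to within 1, which is the property that noise-shaping reconstructions exploit.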

A Hybrid Recommender System for Patient-Doctor Matchmaking in Primary Care


Title	A Hybrid Recommender System for Patient-Doctor Matchmaking in Primary Care
Authors	Qiwei Han, Mengxin Ji, Inigo Martinez de Rituerto de Troya, Manas Gaur, Leid Zejnilovic
Abstract	We partner with a leading European healthcare provider and design a mechanism to match patients with family doctors in primary care. We define the matchmaking process for several distinct use cases given different levels of available information about patients. Then, we adopt a hybrid recommender system to present each patient a list of family doctor recommendations. In particular, we model patient trust of family doctors using a large-scale dataset of consultation histories, while accounting for the temporal dynamics of their relationships. Our proposed approach shows higher predictive accuracy than both a heuristic baseline and a collaborative filtering approach, and the proposed trust measure further improves model performance.
Tasks	Recommendation Systems
Published	2018-08-09
URL	http://arxiv.org/abs/1808.03265v2
PDF	http://arxiv.org/pdf/1808.03265v2.pdf
PWC	https://paperswithcode.com/paper/a-hybrid-recommender-system-for-patient
Repo
Framework

I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators


Title	I Know What You See: Power Side-Channel Attack on Convolutional Neural Network Accelerators
Authors	Lingxiao Wei, Bo Luo, Yu Li, Yannan Liu, Qiang Xu
Abstract	Deep learning has become the de-facto computational paradigm for various kinds of perception problems, including many privacy-sensitive applications such as online medical image analysis. Undoubtedly, the data privacy of these deep learning systems is a serious concern. Different from previous research focusing on exploiting privacy leakage from deep learning models, in this paper, we present the first attack on the implementation of deep learning models. To be specific, we perform the attack on an FPGA-based convolutional neural network accelerator and we manage to recover the input image from the collected power traces without knowing the detailed parameters in the neural network. For the MNIST dataset, our power side-channel attack is able to achieve up to 89% recognition accuracy.
Tasks
Published	2018-03-05
URL	https://arxiv.org/abs/1803.05847v2
PDF	https://arxiv.org/pdf/1803.05847v2.pdf
PWC	https://paperswithcode.com/paper/i-know-what-you-see-power-side-channel-attack
Repo
Framework

Object category learning and retrieval with weak supervision


Title	Object category learning and retrieval with weak supervision
Authors	Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar
Abstract	We consider the problem of retrieving objects from image data and learning to classify them into meaningful semantic categories with minimal supervision. To that end, we propose a fully differentiable unsupervised deep clustering approach to learn semantic classes in an end-to-end fashion without individual class labeling using only unlabeled object proposals. The key contributions of our work are 1) a kmeans clustering objective where the clusters are learned as parameters of the network and are represented as memory units, and 2) simultaneously building a feature representation, or embedding, while learning to cluster it. This approach shows promising results on two popular computer vision datasets: on CIFAR10 for clustering objects, and on the more complex and challenging Cityscapes dataset for semantically discovering classes which visually correspond to cars, people, and bicycles. Currently, the only supervision provided is segmentation objectness masks, but this method can be extended to use an unsupervised objectness-based object generation mechanism which will make the approach completely unsupervised.
Tasks
Published	2018-01-26
URL	http://arxiv.org/abs/1801.08985v2
PDF	http://arxiv.org/pdf/1801.08985v2.pdf
PWC	https://paperswithcode.com/paper/object-category-learning-and-retrieval-with
Repo
Framework
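
To make the first contribution in the abstract above more concrete, here is a minimal sketch of a k-means objective whose cluster centers are treated as trainable parameters and updated by gradient descent. It assumes plain Euclidean features and fixed inputs; the function names are hypothetical, and the paper's network, memory units, and jointly learned embedding are not modeled.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def kmeans_loss(points, centers):
    """Mean squared distance from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def sgd_step_on_centers(points, centers, lr=0.1):
    """One gradient-descent step on the centers for the loss above."""
    grads = [[0.0] * len(c) for c in centers]
    for p in points:
        # Hard assignment: index of the nearest center.
        k = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for d in range(len(p)):
            grads[k][d] += 2.0 * (centers[k][d] - p[d]) / len(points)
    return [[c[d] - lr * g[d] for d in range(len(c))]
            for c, g in zip(centers, grads)]
```

Repeating `sgd_step_on_centers` moves each center toward the mean of the points currently assigned to it, which is the sense in which the clusters can be learned like any other network parameter.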

End-to-End Detection and Re-identification Integrated Net for Person Search


Title	End-to-End Detection and Re-identification Integrated Net for Person Search
Authors	Zhenwei He, Lei Zhang, Wei Jia
Abstract	This paper proposes a pedestrian detection and re-identification (re-id) integration net (I-Net) in an end-to-end learning framework. The I-Net is used in real-world video surveillance scenarios, where the target person needs to be searched for in whole-scene videos while annotations of pedestrian bounding boxes are unavailable. Compared with OIM, an earlier work on joint detection and re-id, we make three distinct contributions. First, we introduce a Siamese architecture for I-Net instead of a single stream, such that a verification task can be implemented. Second, we propose a novel on-line pairing loss (OLP) and hard example priority softmax loss (HEP), such that only the hard negatives receive much attention in loss computation. Third, an on-line dictionary for negative sample storage is designed in I-Net without recording the positive samples. We show results on person search datasets: the gap between detection and re-identification is narrowed, and superior performance is achieved.
Tasks	Pedestrian Detection, Person Search
Published	2018-04-02
URL	http://arxiv.org/abs/1804.00376v1
PDF	http://arxiv.org/pdf/1804.00376v1.pdf
PWC	https://paperswithcode.com/paper/end-to-end-detection-and-re-identification
Repo
Framework

Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection


Title	Illumination-aware Faster R-CNN for Robust Multispectral Pedestrian Detection
Authors	Chengyang Li, Dan Song, Ruofeng Tong, Min Tang
Abstract	Multispectral images of color-thermal pairs have been shown to be more effective than a single color channel for pedestrian detection, especially under challenging illumination conditions. However, there is still a lack of studies on how to fuse the two modalities effectively. In this paper, we deeply compare six different convolutional network fusion architectures and analyse their adaptations, enabling a vanilla architecture to obtain detection performances comparable to the state-of-the-art results. Further, we discover that pedestrian detection confidences from color or thermal images are correlated with illumination conditions. With this in mind, we propose an Illumination-aware Faster R-CNN (IAF R-CNN). Specifically, an Illumination-aware Network is introduced to give an illumination measure of the input image. Then we adaptively merge color and thermal sub-networks via a gate function defined over the illumination value. The experimental results on the KAIST Multispectral Pedestrian Benchmark validate the effectiveness of the proposed IAF R-CNN.
Tasks	Pedestrian Detection
Published	2018-03-14
URL	http://arxiv.org/abs/1803.05347v2
PDF	http://arxiv.org/pdf/1803.05347v2.pdf
PWC	https://paperswithcode.com/paper/illumination-aware-faster-r-cnn-for-robust
Repo
Framework
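
The gated fusion described in the abstract above can be pictured with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not give the form of the gate, so a sigmoid centered at 0.5 with a hypothetical `sharpness` parameter stands in for it, the illumination value is assumed to lie in [0, 1], and the two sub-networks are reduced to scalar detection confidences.

```python
import math


def illumination_gate(illumination, sharpness=10.0):
    """Weight in (0, 1) given to the color branch; assumed sigmoid gate."""
    return 1.0 / (1.0 + math.exp(-sharpness * (illumination - 0.5)))


def fuse_confidences(color_conf, thermal_conf, illumination):
    """Blend the two branch confidences according to the illumination gate."""
    w = illumination_gate(illumination)
    return w * color_conf + (1.0 - w) * thermal_conf
```

With this choice, a bright scene (illumination near 1) relies mostly on the color confidence and a dark scene (illumination near 0) mostly on the thermal one.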

Identify Susceptible Locations in Medical Records via Adversarial Attacks on Deep Predictive Models


Title	Identify Susceptible Locations in Medical Records via Adversarial Attacks on Deep Predictive Models
Authors	Mengying Sun, Fengyi Tang, Jinfeng Yi, Fei Wang, Jiayu Zhou
Abstract	The surging availability of electronic health records (EHR) has led to increased research interest in medical predictive modeling. Recently, many deep learning based predictive models have also been developed for EHR data and have demonstrated impressive performance. However, a series of recent studies showed that these deep models are not safe: they suffer from certain vulnerabilities. In short, a well-trained deep network can be extremely sensitive to inputs with negligible changes. These inputs are referred to as adversarial examples. In the context of medical informatics, such attacks could alter the result of a high performance deep predictive model by slightly perturbing a patient’s medical records. Such instability not only reflects the weakness of deep architectures; more importantly, it offers guidance on detecting susceptible parts of the inputs. In this paper, we propose an efficient and effective framework that learns a time-preferential minimum attack targeting the LSTM model with EHR inputs, and we leverage this attack strategy to screen medical records of patients and identify susceptible events and measurements. The efficient screening procedure can help decision makers pay extra attention to the locations that can cause severe consequences if not measured correctly. We conduct extensive empirical studies on a real-world urgent care cohort and demonstrate the effectiveness of the proposed screening approach.
Tasks
Published	2018-02-13
URL	http://arxiv.org/abs/1802.04822v1
PDF	http://arxiv.org/pdf/1802.04822v1.pdf
PWC	https://paperswithcode.com/paper/identify-susceptible-locations-in-medical
Repo
Framework

The Design and Implementation of XiaoIce, an Empathetic Social Chatbot


Title	The Design and Implementation of XiaoIce, an Empathetic Social Chatbot
Authors	Li Zhou, Jianfeng Gao, Di Li, Heung-Yeung Shum
Abstract	This paper describes the development of Microsoft XiaoIce, the most popular social chatbot in the world. XiaoIce is uniquely designed as an AI companion with an emotional connection to satisfy the human need for communication, affection, and social belonging. We take into account both intelligence quotient (IQ) and emotional quotient (EQ) in system design, cast human-machine social chat as decision-making over Markov Decision Processes (MDPs), and optimize XiaoIce for long-term user engagement, measured in expected Conversation-turns Per Session (CPS). We detail the system architecture and key components including dialogue manager, core chat, skills, and an empathetic computing module. We show how XiaoIce dynamically recognizes human feelings and states, understands user intent, and responds to user needs throughout long conversations. Since her launch in 2014, XiaoIce has communicated with over 660 million active users and succeeded in establishing long-term relationships with many of them. Analysis of large scale online logs shows that XiaoIce has achieved an average CPS of 23, which is significantly higher than that of other chatbots and even human conversations.
Tasks	Chatbot, Decision Making
Published	2018-12-21
URL	https://arxiv.org/abs/1812.08989v2
PDF	https://arxiv.org/pdf/1812.08989v2.pdf
PWC	https://paperswithcode.com/paper/the-design-and-implementation-of-xiaoice-an
Repo
Framework

Is Neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain


Title	Is Neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain
Authors	Laxmi R. Iyer, Yansong Chua, Haizhou Li
Abstract	The advantage of spiking neural networks (SNNs) over their predecessors is their ability to spike, enabling them to use spike timing for coding and efficient computing. A neuromorphic dataset should allow a neuromorphic algorithm to clearly show that an SNN is able to perform better on the dataset than an ANN. We have analyzed both N-MNIST and N-Caltech101 along these lines, but focus our study on N-MNIST. First we evaluate if additional information is encoded in the time domain in a neuromorphic dataset. We show that an ANN trained with backpropagation on frame based versions of N-MNIST and N-Caltech101 images achieves 99.23% and 78.01% accuracy, respectively. These are the best classification accuracies obtained on these datasets to date. Second we present the first unsupervised SNN to be trained on N-MNIST and demonstrate results of 91.78%. We also use this SNN for further experiments on N-MNIST to show that rate based SNNs perform better, and precise spike timings are not important in N-MNIST. N-MNIST does not, therefore, highlight the unique ability of SNNs. The conclusion of this study opens an important question in neuromorphic engineering - what, then, constitutes a good neuromorphic dataset?
Tasks
Published	2018-07-03
URL	http://arxiv.org/abs/1807.01013v1
PDF	http://arxiv.org/pdf/1807.01013v1.pdf
PWC	https://paperswithcode.com/paper/is-neuromorphic-mnist-neuromorphic-analyzing
Repo
Framework

Multi-Merge Budget Maintenance for Stochastic Gradient Descent SVM Training


Title	Multi-Merge Budget Maintenance for Stochastic Gradient Descent SVM Training
Authors	Sahar Qaadan, Tobias Glasmachers
Abstract	Budgeted Stochastic Gradient Descent (BSGD) is a state-of-the-art technique for training large-scale kernelized support vector machines. The budget constraint is maintained incrementally by merging two points whenever the pre-defined budget is exceeded. The process of finding suitable merge partners is costly; it can account for up to 45% of the total training time. In this paper we investigate computationally more efficient schemes that merge more than two points at once. We obtain significant speed-ups without sacrificing accuracy.
Tasks
Published	2018-06-26
URL	http://arxiv.org/abs/1806.10179v1
PDF	http://arxiv.org/pdf/1806.10179v1.pdf
PWC	https://paperswithcode.com/paper/multi-merge-budget-maintenance-for-stochastic
Repo
Framework
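
For context on the merge step that the abstract above calls costly, the following sketch shows the standard two-point merge used in budgeted SGD with a Gaussian kernel, where the merged point lies on the line between the two originals and its position is found by golden-section search. It assumes both coefficients have the same sign and that the two points are close enough for the objective to be unimodal; the function names are hypothetical, and the multi-merge schemes proposed in the paper are not shown.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio, about 0.618


def merge_objective(h, alpha_i, alpha_j, k_ij):
    """Coefficient of the merged point z = h*x_i + (1-h)*x_j (Gaussian kernel)."""
    return alpha_i * k_ij ** ((1.0 - h) ** 2) + alpha_j * k_ij ** (h ** 2)


def best_merge_weight(alpha_i, alpha_j, k_ij, tol=1e-6):
    """Golden-section search for the h in [0, 1] maximizing merge_objective."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        a = hi - GOLDEN * (hi - lo)
        b = lo + GOLDEN * (hi - lo)
        if merge_objective(a, alpha_i, alpha_j, k_ij) < merge_objective(b, alpha_i, alpha_j, k_ij):
            lo = a   # the maximum lies to the right of a
        else:
            hi = b   # the maximum lies to the left of b
    return (lo + hi) / 2.0


def merge_points(x_i, alpha_i, x_j, alpha_j, gamma):
    """Merge two support vectors into one point z with coefficient alpha_z."""
    sq = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    k_ij = math.exp(-gamma * sq)            # Gaussian kernel value k(x_i, x_j)
    h = best_merge_weight(alpha_i, alpha_j, k_ij)
    z = [h * a + (1.0 - h) * b for a, b in zip(x_i, x_j)]
    alpha_z = merge_objective(h, alpha_i, alpha_j, k_ij)
    return z, alpha_z
```

Each such search evaluates the objective many times per candidate pair, which gives some intuition for why finding merge partners takes a noticeable share of training time.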

Open Set Chinese Character Recognition using Multi-typed Attributes


Title	Open Set Chinese Character Recognition using Multi-typed Attributes
Authors	Sheng He, Lambert Schomaker
Abstract	Recognition of off-line Chinese characters is still a challenging problem, especially in historical documents: not only is the number of classes extremely large in comparison to contemporary image retrieval methods, but new unseen classes can also be expected under open learning conditions (even for CNN). Chinese character recognition with zero or a few training samples is a difficult problem and has not been studied yet. In this paper, we propose a new Chinese character recognition method by multi-type attributes, which are based on pronunciation, structure and radicals of Chinese characters, applied to character recognition in historical books. This intermediate attribute code has a strong advantage over the common 'one-hot' class representation because it allows for understanding complex and unseen patterns symbolically using attributes. First, each character is represented by four groups of attribute types to cover a wide range of character possibilities: Pinyin label, layout structure, number of strokes, three different input methods such as Cangjie, Zhengma and Wubi, as well as a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be easily recognized by these attributes using a distance metric and a complete lexicon that is encoded in attribute space. We evaluate the proposed method on two open data sets (printed Chinese character recognition for zero-shot learning and historical characters for few-shot learning) and one closed set (handwritten Chinese characters). Experimental results show a good general classification of seen classes but also a very promising generalization ability to unseen characters.
Tasks	Few-Shot Learning, Image Retrieval, Zero-Shot Learning
Published	2018-08-27
URL	http://arxiv.org/abs/1808.08993v1
PDF	http://arxiv.org/pdf/1808.08993v1.pdf
PWC	https://paperswithcode.com/paper/open-set-chinese-character-recognition-using
Repo
Framework
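
The recognition step described in the abstract above (predicted attributes matched against a lexicon with a distance metric) can be sketched as a nearest-neighbor lookup. This is a toy under stated assumptions: the three-attribute codes, the `LEXICON` contents, and the use of Hamming distance are illustrative choices, not the paper's attribute set or metric.

```python
# Toy lexicon: character -> (pinyin, layout structure, stroke count).
# Values are illustrative only and are not taken from the paper's lexicon.
LEXICON = {
    "木": ("mu", "single", 4),
    "林": ("lin", "left-right", 8),
    "森": ("sen", "top-bottom", 12),
}


def hamming(a, b):
    """Number of attribute positions where two codes disagree."""
    return sum(x != y for x, y in zip(a, b))


def recognize(predicted_attributes, lexicon):
    """Return the lexicon character whose attribute code is nearest."""
    return min(lexicon, key=lambda ch: hamming(predicted_attributes, lexicon[ch]))
```

For example, `recognize(("lin", "left-right", 9), LEXICON)` still returns "林" even though the stroke count is off by one, which hints at why a character never seen in training can be recovered as long as it appears in the lexicon.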

FFNet: Video Fast-Forwarding via Reinforcement Learning


Title	FFNet: Video Fast-Forwarding via Reinforcement Learning
Authors	Shuyue Lan, Rameswar Panda, Qi Zhu, Amit K. Roy-Chowdhury
Abstract	For many applications with limited computation, communication, storage and energy resources, there is an imperative need of computer vision methods that could select an informative subset of the input video for efficient processing at or near real time. In the literature, there are two relevant groups of approaches: generating a trailer for a video or fast-forwarding while watching/processing the video. The first group is supported by video summarization techniques, which require processing of the entire video to select an important subset for showing to users. In the second group, current fast-forwarding methods depend on either manual control or automatic adaptation of playback speed, which often do not present an accurate representation and may still require processing of every frame. In this paper, we introduce FastForwardNet (FFNet), a reinforcement learning agent that gets inspiration from video summarization and does fast-forwarding differently. It is an online framework that automatically fast-forwards a video and presents a representative subset of frames to users on the fly. It does not require processing the entire video, but just the portion that is selected by the fast-forward agent, which makes the process very computationally efficient. The online nature of our proposed method also enables the users to begin fast-forwarding at any point of the video. Experiments on two real-world datasets demonstrate that our method can provide better representation of the input video with much less processing requirement.
Tasks	Video Summarization
Published	2018-05-08
URL	http://arxiv.org/abs/1805.02792v1
PDF	http://arxiv.org/pdf/1805.02792v1.pdf
PWC	https://paperswithcode.com/paper/ffnet-video-fast-forwarding-via-reinforcement
Repo
Framework

Adaptive Behavior Generation for Autonomous Driving using Deep Reinforcement Learning with Compact Semantic States


Title	Adaptive Behavior Generation for Autonomous Driving using Deep Reinforcement Learning with Compact Semantic States
Authors	Peter Wolf, Karl Kurzer, Tobias Wingert, Florian Kuhnt, J. Marius Zöllner
Abstract	Making the right decision in traffic is a challenging task that is highly dependent on individual preferences as well as the surrounding environment. Therefore it is hard to model solely based on expert knowledge. In this work we use Deep Reinforcement Learning to learn maneuver decisions based on a compact semantic state representation. This ensures a consistent model of the environment across scenarios as well as a behavior adaptation function, enabling on-line changes of desired behaviors without re-training. The input for the neural network is a simulated object list similar to that of Radar or Lidar sensors, superimposed by a relational semantic scene description. The state as well as the reward are extended by a behavior adaptation function and a parameterization respectively. With little expert knowledge and a set of mid-level actions, it can be seen that the agent is capable of adhering to traffic rules and learns to drive safely in a variety of situations.
Tasks	Autonomous Driving
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03214v1
PDF	http://arxiv.org/pdf/1809.03214v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-behavior-generation-for-autonomous
Repo
Framework

Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection


Title	Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection
Authors	Dayan Guan, Yanpeng Cao, Jun Liang, Yanlong Cao, Michael Ying Yang
Abstract	Multispectral pedestrian detection has received extensive attention in recent years as a promising solution to facilitate robust human target detection for around-the-clock applications (e.g. security surveillance and autonomous driving). In this paper, we demonstrate that illumination information encoded in multispectral images can be utilized to significantly boost performance of pedestrian detection. A novel illumination-aware weighting mechanism is presented to accurately depict the illumination condition of a scene. Such illumination information is incorporated into two-stream deep convolutional neural networks to learn multispectral human-related features under different illumination conditions (daytime and nighttime). Moreover, we utilize illumination information together with multispectral data to generate more accurate semantic segmentations, which are used to boost pedestrian detection accuracy. Putting all of the pieces together, we present a powerful framework for multispectral pedestrian detection based on multi-task learning of illumination-aware pedestrian detection and semantic segmentation. Our proposed method is trained end-to-end using a well-designed multi-task loss function and outperforms state-of-the-art approaches on the KAIST multispectral pedestrian dataset.
Tasks	Autonomous Driving, Multi-Task Learning, Pedestrian Detection, Semantic Segmentation
Published	2018-02-27
URL	http://arxiv.org/abs/1802.09972v1
PDF	http://arxiv.org/pdf/1802.09972v1.pdf
PWC	https://paperswithcode.com/paper/fusion-of-multispectral-data-through
Repo
Framework

ReLeQ: An Automatic Reinforcement Learning Approach for Deep Quantization of Neural Networks


Title	ReLeQ: An Automatic Reinforcement Learning Approach for Deep Quantization of Neural Networks
Authors	Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Sicun Gao, Hadi Esmaeilzadeh
Abstract	Deep Neural Networks (DNNs) typically require massive amounts of computational resources in inference tasks for computer vision applications. Quantization can significantly reduce DNN computation and storage by decreasing the bitwidth of network encodings. Recent research affirms that carefully selecting the quantization levels for each layer can preserve the accuracy while pushing the bitwidth below eight bits. However, without arduous manual effort, this deep quantization can lead to significant accuracy loss, leaving it in a position of questionable utility. As such, deep quantization opens a large hyper-parameter space (bitwidth of the layers), the exploration of which is a major challenge. We propose a systematic approach to tackle this problem, by automating the process of discovering the quantization levels through an end-to-end deep reinforcement learning framework (ReLeQ). We adapt policy optimization methods to the problem of quantization, and focus on finding the best design decisions in choosing the state and action spaces, network architecture and training framework, as well as the tuning of various hyperparameters. We show how ReLeQ can balance speed and quality, and provide an asymmetric general solution for quantization of a large variety of deep networks (AlexNet, CIFAR-10, LeNet, MobileNet-V1, ResNet-20, SVHN, and VGG-11) that virtually preserves the accuracy (<= 0.3% loss) while minimizing the computation and storage cost. With these DNNs, ReLeQ enables conventional hardware to achieve 2.2x speedup over 8-bit execution. Similarly, a custom DNN accelerator achieves 2.0x speedup and energy reduction compared to 8-bit runs. These encouraging results mark ReLeQ as the initial step towards automating the deep quantization of neural networks.
Tasks	Quantization
Published	2018-11-05
URL	https://arxiv.org/abs/1811.01704v3
PDF	https://arxiv.org/pdf/1811.01704v3.pdf
PWC	https://paperswithcode.com/paper/releq-an-automatic-reinforcement-learning
Repo
Framework