July 28, 2019

Paper Group ANR 234

Paper Group ANR 234

Novel Structured Low-rank algorithm to recover spatially smooth exponential image time series

Title Novel Structured Low-rank algorithm to recover spatially smooth exponential image time series
Authors Arvind Balachandrasekaran, Mathews Jacob
Abstract We propose a structured low-rank matrix completion algorithm to recover, from under-sampled Fourier measurements, a time series of images in which every pixel is a linear combination of exponentials. The spatial smoothness of the exponential parameters is exploited, along with the exponential structure of the time series at every pixel, to derive an annihilation relation in the $k-t$ domain. This annihilation relation translates into a structured low-rank matrix formed from the $k-t$ samples. We demonstrate the algorithm in the parameter-mapping setting and show significant improvement over state-of-the-art methods.
Tasks Low-Rank Matrix Completion, Matrix Completion, Time Series
Published 2017-03-29
URL http://arxiv.org/abs/1703.09880v1
PDF http://arxiv.org/pdf/1703.09880v1.pdf
PWC https://paperswithcode.com/paper/novel-structured-low-rank-algorithm-to
Repo
Framework
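
To make the annihilation argument concrete (notation chosen here for illustration; it is not taken from the paper), suppose the time series at pixel $\mathbf{r}$ is a sum of $L$ exponentials, $\rho(\mathbf{r},n)=\sum_{i=1}^{L}a_i(\mathbf{r})\,\beta_i(\mathbf{r})^{n}$. It is then annihilated by the length-$(L+1)$ filter $h(\mathbf{r},\cdot)$ whose $z$-transform is $\prod_{i=1}^{L}\bigl(1-\beta_i(\mathbf{r})z^{-1}\bigr)$:

$$\sum_{m=0}^{L} h(\mathbf{r},m)\,\rho(\mathbf{r},n-m)=0.$$

If the exponential parameters vary smoothly with $\mathbf{r}$, the filter is essentially band-limited in $k$-space, so taking the spatial Fourier transform turns the pointwise product into a convolution and yields an annihilation relation directly between the $k-t$ samples $\hat{\rho}$ and a compact filter $\hat{h}$. Collecting the shifts of $\hat{\rho}$ involved in that convolution into a block-Hankel matrix $\mathcal{T}(\hat{\rho})$ gives $\mathcal{T}(\hat{\rho})\,\hat{h}=0$, i.e., $\mathcal{T}(\hat{\rho})$ is low-rank; recovery can then be posed as minimizing a rank surrogate (e.g., a nuclear or Schatten-$p$ norm) of $\mathcal{T}(\hat{\rho})$ subject to consistency with the measured Fourier samples.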

Transfer learning from synthetic to real images using variational autoencoders for robotic applications

Title Transfer learning from synthetic to real images using variational autoencoders for robotic applications
Authors Tadanobu Inoue, Subhajit Chaudhury, Giovanni De Magistris, Sakyasingha Dasgupta
Abstract Robotic learning in simulation environments provides a faster, more scalable, and safer training methodology than learning directly with physical robots. Also, synthesizing images in a simulation environment for collecting large-scale image data is easy, whereas capturing camera images in the real world is time-consuming and expensive. However, learning from only synthetic images may not achieve the desired performance in real environments due to the gap between synthetic and real images. We thus propose a method that transfers the learned capability of detecting object positions from a simulation environment to the real world. Our method enables us to use only a very limited dataset of real images while leveraging a large dataset of synthetic images, using multiple variational autoencoders. It detects object positions 6 to 7 times more precisely than the baseline of directly learning from the dataset of real images. Object position estimation under varying environmental conditions is one of the underlying requirements for standard robotic manipulation tasks; we show that the proposed method meets this requirement robustly under different lighting conditions and in the presence of distractor objects. Using the detected object position, we transfer pick-and-place and reaching tasks learned in a simulation environment to an actual physical robot without re-training.
Tasks Transfer Learning
Published 2017-09-20
URL http://arxiv.org/abs/1709.06762v1
PDF http://arxiv.org/pdf/1709.06762v1.pdf
PWC https://paperswithcode.com/paper/transfer-learning-from-synthetic-to-real-1
Repo
Framework
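
As a rough sketch of the variational autoencoder building block the method relies on (a minimal PyTorch VAE; the layer sizes, and the specific way the synthetic- and real-image VAEs are coupled, are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=64 * 64, z_dim=32):   # sizes are assumptions
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, z_dim)
        self.logvar = nn.Linear(512, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 512), nn.ReLU(),
                                 nn.Linear(512, in_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = F.binary_cross_entropy(recon, x, reduction='sum')
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

# One VAE can be trained on the plentiful synthetic images and another on the
# small real dataset; position regression is then learned on the latent codes.
```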

Learning body-affordances to simplify action spaces

Title Learning body-affordances to simplify action spaces
Authors Nicholas Guttenberg, Martin Biehl, Ryota Kanai
Abstract Controlling embodied agents with many actuated degrees of freedom is a challenging task. We propose a method that can discover and interpolate between context-dependent high-level actions, or body-affordances. These provide an abstract, low-dimensional interface indexing high-dimensional and time-extended action policies. Our method is related to recent approaches in the machine learning literature but is conceptually simpler and easier to implement. More specifically, our method requires the choice of an n-dimensional target sensor space endowed with a distance metric. The method then learns an n-dimensional embedding of possibly reactive body-affordances that spreads as far as possible throughout the target sensor space.
Tasks
Published 2017-08-15
URL http://arxiv.org/abs/1708.04391v1
PDF http://arxiv.org/pdf/1708.04391v1.pdf
PWC https://paperswithcode.com/paper/learning-body-affordances-to-simplify-action
Repo
Framework
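
A toy sketch of the interface idea, in PyTorch: a policy conditioned on an n-dimensional goal vector that indexes time-extended behavior. The environment, loss, and all sizes are placeholders, not the authors' setup:

```python
import torch
import torch.nn as nn

n = 2                      # dimensionality of the target sensor space (assumed)
obs_dim, act_dim = 8, 4    # placeholder sizes

# The low-dimensional goal g indexes a high-dimensional, time-extended policy.
policy = nn.Sequential(nn.Linear(obs_dim + n, 64), nn.Tanh(),
                       nn.Linear(64, act_dim))

def act(obs, g):
    # Condition the action on the current observation and the affordance index.
    return policy(torch.cat([obs, g], dim=-1))

def affordance_loss(final_sensor, g):
    # Pull the achieved sensor reading toward the point indexed by g, using a
    # Euclidean metric on the target space (an assumption).
    return ((final_sensor - g) ** 2).sum(dim=-1).mean()

# Training (schematic): sample g uniformly over the target space, roll the
# policy out, and minimize affordance_loss on the final sensor reading;
# sampling g broadly encourages the learned affordances to spread throughout
# the target sensor space.
```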

A Paradigm Shift: Detecting Human Rights Violations Through Web Images

Title A Paradigm Shift: Detecting Human Rights Violations Through Web Images
Authors Grigorios Kalliatakis, Shoaib Ehsan, Klaus D. McDonald-Maier
Abstract The growing presence of devices carrying digital cameras, such as mobile phones and tablets, combined with ever-improving internet networks, has enabled ordinary citizens, victims of human rights abuse, and participants in armed conflicts, protests, and disaster situations to capture images and videos of specific events and share them via social media networks. This paper discusses the potential of images in the human rights context, including the opportunities and challenges they present. The study demonstrates that real-world images can contribute complementary data to operational human rights monitoring efforts when combined with novel computer vision approaches. The analysis concludes by arguing that if rights advocates are to use images effectively to detect and identify human rights violations, greater attention to gathering task-specific visual concepts from large-scale web images is required.
Tasks
Published 2017-03-30
URL http://arxiv.org/abs/1703.10501v1
PDF http://arxiv.org/pdf/1703.10501v1.pdf
PWC https://paperswithcode.com/paper/a-paradigm-shift-detecting-human-rights
Repo
Framework

Latent Embeddings for Collective Activity Recognition

Title Latent Embeddings for Collective Activity Recognition
Authors Yongyi Tang, Peizhen Zhang, Jian-Fang Hu, Wei-Shi Zheng
Abstract Rather than simply recognizing the action of each person individually, collective activity recognition aims to determine what a group of people is doing in a collective scene. Previous state-of-the-art methods use hand-crafted potentials in conventional graphical models, which can define only a limited range of relations; thus the complex structural dependencies among individuals involved in a collective scenario cannot be fully modeled. In this paper, we overcome these limitations by embedding latent variables into the feature space and learning the feature mapping functions in a deep learning framework. The embeddings of latent variables build a global relation containing person-group interactions and richer contextual information by jointly modeling a broader range of individuals. In addition, we incorporate an attention mechanism during embedding to achieve more compact representations. We evaluate our method on three collective activity datasets, one of which is a much larger dataset contributed in this work. The proposed model achieves clearly better performance than state-of-the-art methods in our experiments.
Tasks Activity Recognition
Published 2017-09-20
URL http://arxiv.org/abs/1709.06770v1
PDF http://arxiv.org/pdf/1709.06770v1.pdf
PWC https://paperswithcode.com/paper/latent-embeddings-for-collective-activity
Repo
Framework
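
A minimal sketch of attention-weighted pooling of per-person features into a group representation, in PyTorch; the dimensions and single-score attention design are assumptions, simpler than the paper's latent-embedding model:

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    def __init__(self, feat_dim=256, n_classes=5):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)       # attention over individuals
        self.cls = nn.Linear(feat_dim, n_classes)

    def forward(self, person_feats):              # (batch, n_people, feat_dim)
        w = torch.softmax(self.score(person_feats), dim=1)  # weights per person
        group = (w * person_feats).sum(dim=1)     # compact group representation
        return self.cls(group)                    # collective activity logits

# Usage: logits = AttentionPool()(torch.randn(2, 12, 256))  # 2 scenes, 12 people
```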

Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges

Title Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges
Authors Muhammad Usama, Junaid Qadir, Aunn Raza, Hunain Arif, Kok-Lim Alvin Yau, Yehia Elkhatib, Amir Hussain, Ala Al-Fuqaha
Abstract While machine learning and artificial intelligence have long been applied in networking research, the bulk of such work has focused on supervised learning. Recently there has been a rising trend of employing unsupervised machine learning on unstructured raw network data to improve network performance and provide services such as traffic engineering, anomaly detection, Internet traffic classification, and quality-of-service optimization. The interest in applying unsupervised learning techniques in networking stems from their great success in other fields such as computer vision, natural language processing, speech recognition, and optimal control (e.g., for developing autonomous self-driving cars). Unsupervised learning is attractive because it frees us from the need for labeled data and manual handcrafted feature engineering, thereby facilitating flexible, general, and automated methods of machine learning. The focus of this survey is to provide an overview of the applications of unsupervised learning in the domain of networking. We provide a comprehensive survey highlighting recent advancements in unsupervised learning techniques and describe their applications for various learning tasks in the context of networking. We also discuss future directions and open research issues, and identify potential pitfalls. While a few survey papers focusing on the applications of machine learning in networking have previously been published, a survey of similar scope and breadth is missing from the literature. Through this paper, we advance the state of knowledge by carefully synthesizing the insights from those surveys while also providing contemporary coverage of recent advances.
Tasks Anomaly Detection, Feature Engineering, Self-Driving Cars, Speech Recognition
Published 2017-09-19
URL http://arxiv.org/abs/1709.06599v1
PDF http://arxiv.org/pdf/1709.06599v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-machine-learning-for-networking
Repo
Framework

Exact MAP inference in general higher-order graphical models using linear programming

Title Exact MAP inference in general higher-order graphical models using linear programming
Authors Ikhlef Bechar
Abstract This paper is concerned with exact MAP inference in general higher-order graphical models by means of a traditional linear programming (LP) relaxation. The proof developed here is a rather simple algebraic one, made straightforward above all by the introduction of two novel algebraic tools. First, we introduce the notion of a delta-distribution, which is simply the difference of two arbitrary probability distributions and serves to remove the sign constraint inherent to a probability distribution. Second, we develop a framework for approximating general discrete functions by an orthogonal projection expressed as linear combinations of function margins with respect to a given collection of point subsets; we exploit this to model locally consistent sets of discrete functions from a global perspective. We then develop from scratch the expectation-optimization framework, which is a reformulation, on stochastic grounds, of the convex-hull approach; derive its traditional LP relaxation; and show that it solves the MAP inference problem in graphical models under rather general assumptions. Finally, we describe an algorithm that computes an exact MAP solution from a possibly fractional optimal (probability) solution of the proposed LP relaxation.
Tasks
Published 2017-09-25
URL http://arxiv.org/abs/1709.09051v1
PDF http://arxiv.org/pdf/1709.09051v1.pdf
PWC https://paperswithcode.com/paper/exact-map-inference-in-general-higher-order
Repo
Framework
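
For context, the traditional LP relaxation the abstract builds on replaces the discrete MAP problem with an optimization over locally consistent (pseudo-)marginals; the delta-distribution and margin-projection tools are the paper's additions on top of this standard construction:

$$\max_{x} \sum_{c \in \mathcal{C}} \theta_c(x_c) \;=\; \max_{\mu \in \mathcal{M}} \sum_{c \in \mathcal{C}} \sum_{x_c} \mu_c(x_c)\,\theta_c(x_c),$$

where $\mathcal{M}$ is the marginal polytope. The relaxation replaces $\mathcal{M}$ with the local polytope of nonnegative $\mu$ satisfying normalization, $\sum_{x_c}\mu_c(x_c)=1$, and marginalization consistency, $\sum_{x_{c\setminus i}}\mu_c(x_c)=\mu_i(x_i)$ for every $i \in c$. The LP optimum over the local polytope may be fractional, which is why the paper's final step of recovering an exact MAP assignment from such a solution matters.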

Temporal Pattern Mining from Evolving Networks

Title Temporal Pattern Mining from Evolving Networks
Authors Angelo Impedovo, Corrado Loglisci, Michelangelo Ceci
Abstract Evolving networks are becoming a suitable way to model many real-world complex systems, owing to their ability to represent the systems and their constituent entities, the interactions between the entities, and the time-variability of their structure and properties. Designing computational models able to analyze evolving networks is therefore relevant in many applications. The goal of this research project is to evaluate the possible contribution of temporal pattern mining techniques to the analysis of evolving networks. In particular, we aim at exploiting available snapshots to recognize valuable and potentially useful knowledge about the temporal dynamics exhibited by the network over time, without making any prior assumption about the underlying evolutionary schema. Pattern-based approaches from temporal pattern mining can be exploited to detect and characterize changes exhibited by a network over time, starting from observed snapshots.
Tasks
Published 2017-09-20
URL http://arxiv.org/abs/1709.06772v1
PDF http://arxiv.org/pdf/1709.06772v1.pdf
PWC https://paperswithcode.com/paper/temporal-pattern-mining-from-evolving
Repo
Framework
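
To illustrate the snapshot-based starting point in Python (a generic example of detecting changes between consecutive network snapshots, not the project's actual mining algorithm):

```python
# Each snapshot is the network's edge set at one time step.
snapshots = [
    {("a", "b"), ("b", "c")},               # t = 0
    {("a", "b"), ("c", "d")},               # t = 1
    {("a", "b"), ("c", "d"), ("a", "d")},   # t = 2
]

for t in range(1, len(snapshots)):
    added = snapshots[t] - snapshots[t - 1]
    removed = snapshots[t - 1] - snapshots[t]
    print(f"t={t}: +{sorted(added)} -{sorted(removed)}")

# Temporal pattern mining would then search these change logs for recurring
# patterns, e.g., edges that repeatedly appear and disappear over time.
```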

Learning to Segment Breast Biopsy Whole Slide Images

Title Learning to Segment Breast Biopsy Whole Slide Images
Authors Sachin Mehta, Ezgi Mercan, Jamen Bartlett, Donald Weaver, Joann Elmore, Linda Shapiro
Abstract We trained and applied an encoder-decoder model to semantically segment breast biopsy images into biologically meaningful tissue labels. Since conventional encoder-decoder networks cannot be applied directly to large biopsy images, and the differently sized structures in biopsies present novel challenges, we propose four modifications: (1) an input-aware encoding block to compensate for information loss, (2) a new dense connection pattern between encoder and decoder, (3) dense and sparse decoders to combine multi-level features, and (4) a multi-resolution network that fuses the results of encoder-decoders run at different resolutions. Our model outperforms a feature-based approach and conventional encoder-decoders from the literature. We use the semantic segmentations produced with our model in an automated diagnosis task and obtain higher accuracies than a baseline approach that employs an SVM for feature-based segmentation, both using the same segmentation-based diagnostic features.
Tasks
Published 2017-09-08
URL http://arxiv.org/abs/1709.02554v2
PDF http://arxiv.org/pdf/1709.02554v2.pdf
PWC https://paperswithcode.com/paper/learning-to-segment-breast-biopsy-whole-slide
Repo
Framework
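
A sketch of modification (1), the input-aware encoding block, in PyTorch: the idea of fusing a resized copy of the raw input with the current feature map to compensate for information lost to downsampling. The channel sizes and exact fusion are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InputAwareBlock(nn.Module):
    def __init__(self, in_ch, img_ch=3, out_ch=64):   # channel sizes assumed
        super().__init__()
        self.conv = nn.Conv2d(in_ch + img_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, feats, image):
        # Resize the raw image to the current feature resolution and fuse it
        # with the feature map, re-injecting information lost to pooling.
        img = F.interpolate(image, size=feats.shape[2:], mode='bilinear',
                            align_corners=False)
        return F.relu(self.conv(torch.cat([feats, img], dim=1)))
```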

Mitigating Evasion Attacks to Deep Neural Networks via Region-based Classification

Title Mitigating Evasion Attacks to Deep Neural Networks via Region-based Classification
Authors Xiaoyu Cao, Neil Zhenqiang Gong
Abstract Deep neural networks (DNNs) have transformed several artificial intelligence research areas, including computer vision, speech recognition, and natural language processing. However, recent studies have demonstrated that DNNs are vulnerable to adversarial manipulations at testing time. Specifically, suppose we have a testing example whose label can be correctly predicted by a DNN classifier. An attacker can add a small, carefully crafted noise to the testing example such that the DNN classifier predicts an incorrect label; the crafted testing example is called an adversarial example, and such attacks are called evasion attacks. Evasion attacks are one of the biggest challenges for deploying DNNs in safety- and security-critical applications such as self-driving cars. In this work, we develop new methods to defend against evasion attacks. Our key observation is that adversarial examples are close to the classification boundary. Therefore, we propose region-based classification to be robust to adversarial examples: for a benign or adversarial testing example, we ensemble information in a hypercube centered at the example to predict its label. In contrast, traditional classifiers perform point-based classification, i.e., given a testing example, the classifier predicts its label based on the testing example alone. Our evaluation results on the MNIST and CIFAR-10 datasets demonstrate that region-based classification can significantly mitigate evasion attacks without sacrificing classification accuracy on benign examples. Specifically, our region-based classification achieves the same classification accuracy on benign testing examples as point-based classification, while being significantly more robust than point-based classification to various evasion attacks.
Tasks Self-Driving Cars, Speech Recognition
Published 2017-09-17
URL https://arxiv.org/abs/1709.05583v4
PDF https://arxiv.org/pdf/1709.05583v4.pdf
PWC https://paperswithcode.com/paper/mitigating-evasion-attacks-to-deep-neural
Repo
Framework
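
A minimal sketch of region-based classification in PyTorch: classify random samples from a hypercube (an $L_\infty$ ball) centered at the input and take a majority vote. The radius and sample count are illustrative, not the paper's tuned values:

```python
import torch

def region_predict(model, x, radius=0.02, n_samples=100):
    # x: (1, C, H, W) input tensor; model maps a batch to class logits.
    noise = (torch.rand(n_samples, *x.shape[1:]) * 2 - 1) * radius
    samples = torch.clamp(x + noise, 0.0, 1.0)   # stay in the valid pixel range
    votes = model(samples).argmax(dim=1)         # predicted label per sample
    return votes.mode().values.item()            # majority vote over the region
```

The intuition matches the abstract's key observation: benign examples sit well inside their decision region, so the vote leaves them unchanged, while adversarial examples sit near the boundary, so much of the hypercube falls on the correct side.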

TimeNet: Pre-trained deep recurrent neural network for time series classification

Title TimeNet: Pre-trained deep recurrent neural network for time series classification
Authors Pankaj Malhotra, Vishnu TV, Lovekesh Vig, Puneet Agarwal, Gautam Shroff
Abstract Inspired by the tremendous success of deep convolutional neural networks as generic feature extractors for images, we propose TimeNet: a deep recurrent neural network (RNN) trained on diverse time series in an unsupervised manner, using sequence-to-sequence (seq2seq) models, to extract features from time series. Rather than relying on data from the problem domain, TimeNet attempts to generalize time series representation across domains by ingesting time series from several domains simultaneously. Once trained, TimeNet can be used as a generic off-the-shelf feature extractor for time series. The representations, or embeddings, given by a pre-trained TimeNet are found to be useful for time series classification (TSC). For several publicly available datasets from the UCR TSC Archive and industrial telematics sensor data from vehicles, we observe that a classifier learned over the TimeNet embeddings yields significantly better performance than (i) a classifier learned over the embeddings given by a domain-specific RNN, as well as (ii) a nearest-neighbor classifier based on Dynamic Time Warping.
Tasks Time Series, Time Series Classification
Published 2017-06-23
URL http://arxiv.org/abs/1706.08838v1
PDF http://arxiv.org/pdf/1706.08838v1.pdf
PWC https://paperswithcode.com/paper/timenet-pre-trained-deep-recurrent-neural
Repo
Framework
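
A rough sketch of the seq2seq autoencoder idea in PyTorch: train an encoder-decoder RNN to reconstruct the input series, then use the encoder's final hidden state as a fixed-length embedding. The GRU cells, layer sizes, and reversed-reconstruction detail are assumptions, not necessarily TimeNet's exact configuration:

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, in_dim=1, hid=60):     # sizes are assumptions
        super().__init__()
        self.encoder = nn.GRU(in_dim, hid, batch_first=True)
        self.decoder = nn.GRU(in_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, in_dim)

    def embed(self, x):                        # x: (batch, T, in_dim)
        _, h = self.encoder(x)
        return h[-1]                           # (batch, hid) fixed-length embedding

    def forward(self, x):
        _, h = self.encoder(x)
        # Decode from the encoder state, teacher-forcing with the reversed
        # input as in common seq2seq autoencoders.
        dec_in = torch.flip(x, dims=[1])
        y, _ = self.decoder(dec_in, h)
        return self.out(y)

# After unsupervised training across many domains, embed() serves as the
# off-the-shelf feature extractor for a downstream classifier.
```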

3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection

Title 3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection
Authors Christian Häne, Lionel Heng, Gim Hee Lee, Friedrich Fraundorfer, Paul Furgale, Torsten Sattler, Marc Pollefeys
Abstract Cameras are a crucial exteroceptive sensor for self-driving cars as they are low-cost and small, provide appearance information about the environment, and work in various weather conditions. They can be used for multiple purposes such as visual navigation and obstacle detection. We can use a surround multi-camera system to cover the full 360-degree field-of-view around the car. In this way, we avoid blind spots which can otherwise lead to accidents. To minimize the number of cameras needed for surround perception, we utilize fisheye cameras. Consequently, standard vision pipelines for 3D mapping, visual localization, obstacle detection, etc. need to be adapted to take full advantage of the availability of multiple cameras rather than treat each camera individually. In addition, processing of fisheye images has to be supported. In this paper, we describe the camera calibration and subsequent processing pipeline for multi-fisheye-camera systems developed as part of the V-Charge project. This project seeks to enable automated valet parking for self-driving cars. Our pipeline is able to precisely calibrate multi-camera systems, build sparse 3D maps for visual navigation, visually localize the car with respect to these maps, generate accurate dense maps, as well as detect obstacles based on real-time depth map extraction.
Tasks Calibration, Self-Driving Cars, Visual Localization, Visual Navigation
Published 2017-08-31
URL http://arxiv.org/abs/1708.09839v1
PDF http://arxiv.org/pdf/1708.09839v1.pdf
PWC https://paperswithcode.com/paper/3d-visual-perception-for-self-driving-cars
Repo
Framework
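
For a flavor of the fisheye geometry involved, here is the unified (omnidirectional) camera model often used for such lenses, sketched in Python with NumPy. This is a generic model rather than the V-Charge pipeline itself, the parameter values are made up, and a real calibration additionally estimates a lens distortion model:

```python
import numpy as np

def project_unified(X, xi, fx, fy, cx, cy):
    # 1. Project the 3D point onto the unit sphere.
    Xs = X / np.linalg.norm(X)
    # 2. Shift the projection center along the axis by the mirror
    #    parameter xi, then apply a pinhole projection with intrinsics.
    denom = Xs[2] + xi
    u = fx * Xs[0] / denom + cx
    v = fy * Xs[1] / denom + cy
    return np.array([u, v])

# Example with made-up intrinsics:
print(project_unified(np.array([0.3, -0.1, 1.0]), xi=0.9,
                      fx=400.0, fy=400.0, cx=320.0, cy=240.0))
```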

Privacy-Enabled Biometric Search

Title Privacy-Enabled Biometric Search
Authors Scott Streit, Brian Streit, Stephen Suffian
Abstract Biometrics have long held the promise of replacing passwords by establishing a non-repudiated identity and providing convenient authentication. Convenience drives consumers toward biometrics-based access management solutions. Unlike passwords, biometrics cannot be script-injected; however, biometric data is considered highly sensitive due to its personal nature and unique association with users. Biometrics also differ from passwords in that compromised passwords may be reset, while compromised biometrics offer no such relief: a compromised biometric presents unlimited risk in privacy (anyone can view the biometric) and authentication (anyone may use the biometric). Standards such as the Biometric Open Protocol Standard (BOPS) (IEEE 2410-2016) provide a detailed mechanism to authenticate biometrics based on pre-enrolled devices and a previous identity by storing the biometric in encrypted form. This paper describes a biometric-agnostic approach that addresses the privacy concerns of biometrics through an implementation of BOPS. Specifically, two novel concepts are introduced. First, a biometric is applied to a neural network to create a feature vector. This neural network alone can be used for one-to-one matching (authentication), but would require a linear-time search for the one-to-many case (identity lookup). The classifying algorithm described in this paper addresses this concern by producing normalized floating-point values for each feature vector, allowing identity lookup to occur in polynomial time and enabling search in encrypted biometric databases with speed, accuracy, and privacy.
Tasks
Published 2017-08-16
URL http://arxiv.org/abs/1708.04726v1
PDF http://arxiv.org/pdf/1708.04726v1.pdf
PWC https://paperswithcode.com/paper/privacy-enabled-biometric-search
Repo
Framework
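
A toy sketch of the one-to-many lookup idea in Python: map biometric samples to normalized feature vectors so that identity lookup becomes a nearest-neighbor search. The vectors and threshold are placeholders, and the paper's actual scheme additionally operates over encrypted data:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

enrolled = {                       # identity -> normalized feature vector
    "alice": normalize(np.array([0.9, 0.1, 0.2])),
    "bob": normalize(np.array([0.1, 0.8, 0.3])),
}

def identify(probe_vec, threshold=0.9):
    probe = normalize(probe_vec)
    # Cosine similarity reduces to a dot product on normalized vectors.
    best_id, best_vec = max(enrolled.items(), key=lambda kv: probe @ kv[1])
    return best_id if probe @ best_vec >= threshold else None

print(identify(np.array([0.85, 0.15, 0.25])))   # -> 'alice'
```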

Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training

Title Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training
Authors Yanmin Qian, Xuankai Chang, Dong Yu
Abstract Although great progress has been made in automatic speech recognition (ASR), significant performance degradation is still observed when recognizing multi-talker mixed speech. In this paper, we propose and evaluate several architectures to address this problem under the assumption that only a single channel of the mixed signal is available. Our technique extends permutation invariant training (PIT) by introducing a front-end feature separation module with a minimum mean square error (MSE) criterion and a back-end recognition module with a minimum cross-entropy (CE) criterion. More specifically, during training we compute the average MSE or CE over the whole utterance for each possible utterance-level output-target assignment, pick the one with the minimum MSE or CE, and optimize for that assignment. This strategy elegantly solves the label permutation problem observed in deep-learning-based multi-talker mixed speech separation and recognition systems. The proposed architectures are evaluated and compared on an artificially mixed AMI dataset with both two- and three-talker mixed speech. The experimental results indicate that our proposed architectures can cut the word error rate (WER) by 45.0% and 25.0% relative, against a state-of-the-art single-talker speech recognition system across all speakers when their energies are comparable, for two- and three-talker mixed speech respectively. To our knowledge, this is the first work on multi-talker mixed speech recognition for the challenging speaker-independent spontaneous large-vocabulary continuous speech task.
Tasks Speech Recognition, Speech Separation
Published 2017-07-19
URL http://arxiv.org/abs/1707.06527v1
PDF http://arxiv.org/pdf/1707.06527v1.pdf
PWC https://paperswithcode.com/paper/single-channel-multi-talker-speech
Repo
Framework
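
A minimal sketch of the utterance-level PIT criterion in PyTorch: evaluate the loss for every output-target assignment and train on the minimum, shown here with MSE as in the feature separation front-end. Tensor shapes are assumptions:

```python
import itertools
import torch
import torch.nn.functional as F

def pit_mse(outputs, targets):
    # outputs, targets: lists of (batch, T, feat) tensors, one per talker.
    n = len(outputs)
    losses = []
    for perm in itertools.permutations(range(n)):
        # Average MSE over the whole utterance for this assignment.
        losses.append(sum(F.mse_loss(outputs[i], targets[j])
                          for i, j in enumerate(perm)) / n)
    # Train on the best assignment, which resolves the label permutation
    # problem; note this enumerates n! permutations, fine for 2-3 talkers.
    return torch.stack(losses).min()
```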

Updating the silent speech challenge benchmark with deep learning

Title Updating the silent speech challenge benchmark with deep learning
Authors Yan Ji, Licheng Liu, Hongcui Wang, Zhilei Liu, Zhibin Niu, Bruce Denby
Abstract The 2010 Silent Speech Challenge benchmark is updated with new results obtained using a deep learning strategy, with the same input features and decoding strategy as in the original article. A word error rate of 6.4% is obtained, compared to the published value of 17.4%. Additional results comparing new autoencoder-based features with the original features at reduced dimensionality, as well as decoding scenarios on two different language models, are also presented. The Silent Speech Challenge archive has been updated to contain both the original and the new autoencoder features, in addition to the original raw data.
Tasks
Published 2017-09-20
URL http://arxiv.org/abs/1709.06818v1
PDF http://arxiv.org/pdf/1709.06818v1.pdf
PWC https://paperswithcode.com/paper/updating-the-silent-speech-challenge
Repo
Framework