May 7, 2019


Paper Group AWR 104


An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics


Title	An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Authors	Francesco Conti, Robert Schilling, Pasquale Davide Schiavone, Antonio Pullini, Davide Rossi, Frank Kagan Gürkaynak, Michael Muehlberghuber, Michael Gautschi, Igor Loi, Germain Haugou, Stefan Mangard, Luca Benini
Abstract	Near-sensor data analytics is a promising direction for IoT endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data is stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a System-on-Chip based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with secured remote recognition in 5.74pJ/op; and seizure detection with encrypted data collection from EEG within 12.7pJ/op.
Tasks	EEG, Face Detection, Seizure Detection
Published	2016-12-18
URL	http://arxiv.org/abs/1612.05974v3
PDF	http://arxiv.org/pdf/1612.05974v3.pdf
PWC	https://paperswithcode.com/paper/an-iot-endpoint-system-on-chip-for-secure-and
Repo	https://github.com/pulp-platform/hwpe-tb
Framework	none

Nonparametric Modeling of Dynamic Functional Connectivity in fMRI Data


Title	Nonparametric Modeling of Dynamic Functional Connectivity in fMRI Data
Authors	Søren F. V. Nielsen, Kristoffer H. Madsen, Rasmus Røge, Mikkel N. Schmidt, Morten Mørup
Abstract	Dynamic functional connectivity (FC) has in recent years become a topic of interest in the neuroimaging community. Several models and methods exist for both functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and the results point towards the conclusion that FC exhibits dynamic changes. The existing approaches to modeling dynamic connectivity have primarily been based on time-windowing the data and k-means clustering. We propose a non-parametric generative model for dynamic FC in fMRI that does not rely on specifying window lengths and number of dynamic states. Rooted in Bayesian statistical modeling, we use the predictive likelihood to investigate if the model can discriminate between a motor task and rest both within and across subjects. We further investigate what drives dynamic states using the model on the entire data collated across subjects and task/rest. We find that the number of states extracted is driven by subject variability and preprocessing differences, while the individual states are almost purely defined by either task or rest. This questions how we in general interpret dynamic FC and points to the need for more research on what drives dynamic FC.
Tasks	EEG
Published	2016-01-04
URL	http://arxiv.org/abs/1601.00496v2
PDF	http://arxiv.org/pdf/1601.00496v2.pdf
PWC	https://paperswithcode.com/paper/nonparametric-modeling-of-dynamic-functional
Repo	https://github.com/sfvnielsen/ndfc
Framework	none

Sampling Generative Networks


Title	Sampling Generative Networks
Authors	Tom White
Abstract	We introduce several techniques for sampling and visualizing the latent spaces of generative models. Replacing linear interpolation with spherical linear interpolation prevents diverging from a model’s prior distribution and produces sharper samples. J-Diagrams and MINE grids are introduced as visualizations of manifolds created by analogies and nearest neighbors. We demonstrate two new techniques for deriving attribute vectors: bias-corrected vectors with data replication and synthetic vectors with data augmentation. Binary classification using attribute vectors is presented as a technique supporting quantitative analysis of the latent space. Most techniques are intended to be independent of model type and examples are shown on both Variational Autoencoders and Generative Adversarial Networks.
Tasks	Data Augmentation
Published	2016-09-14
URL	http://arxiv.org/abs/1609.04468v3
PDF	http://arxiv.org/pdf/1609.04468v3.pdf
PWC	https://paperswithcode.com/paper/sampling-generative-networks
Repo	https://github.com/gitaar9/MLDAGAN
Framework	none
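
The abstract's contrast between linear and spherical linear interpolation can be illustrated with a minimal sketch. It uses the standard slerp formula on plain Python lists; the function names `lerp` and `slerp` and the fallback threshold are assumptions made for illustration and are not taken from the paper's code.

```python
import math


def lerp(t, a, b):
    """Linear interpolation between vectors a and b at fraction t."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp(t, a, b):
    """Spherical linear interpolation between vectors a and b at fraction t."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        # Nearly parallel (or antiparallel) vectors: fall back to lerp.
        return lerp(t, a, b)
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


a, b = [1.0, 0.0], [0.0, 1.0]
mid_linear = lerp(0.5, a, b)      # passes through the interior of the unit circle
mid_spherical = slerp(0.5, a, b)  # stays on the unit circle
```

For two unit vectors, the linear midpoint has a norm below 1 while the spherical midpoint keeps norm 1, which is one way to read the abstract's point about not diverging from the prior when latent samples concentrate near a shell.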

It’s Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation


Title	It’s Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation
Authors	Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Abstract	Eye gaze is an important non-verbal cue for human affect analysis. Recent gaze estimation work indicated that information from the full face region can benefit performance. Pushing this idea further, we propose an appearance-based method that, in contrast to a long-standing line of work in computer vision, only takes the full face image as input. Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps to flexibly suppress or enhance information in different facial regions. Through extensive evaluation, we show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation, achieving improvements of up to 14.3% on MPIIGaze and 27.7% on EYEDIAP for person-independent 3D gaze estimation. We further show that this improvement is consistent across different illumination conditions and gaze directions and particularly pronounced for the most challenging extreme head poses.
Tasks	Gaze Estimation
Published	2016-11-27
URL	http://arxiv.org/abs/1611.08860v2
PDF	http://arxiv.org/pdf/1611.08860v2.pdf
PWC	https://paperswithcode.com/paper/its-written-all-over-your-face-full-face
Repo	https://github.com/hysts/pytorch_mpiigaze_demo
Framework	none

Communication-Efficient Learning of Deep Networks from Decentralized Data


Title	Communication-Efficient Learning of Deep Networks from Decentralized Data
Authors	H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas
Abstract	Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center and training there using conventional approaches. We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating locally-computed updates. We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets. These experiments demonstrate the approach is robust to the unbalanced and non-IID data distributions that are a defining characteristic of this setting. Communication costs are the principal constraint, and we show a reduction in required communication rounds by 10-100x as compared to synchronized stochastic gradient descent.
Tasks	Speech Recognition
Published	2016-02-17
URL	http://arxiv.org/abs/1602.05629v3
PDF	http://arxiv.org/pdf/1602.05629v3.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-learning-of-deep
Repo	https://github.com/SachaIZADI/Misc-Machine-Learning
Framework	tf
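
The iterative model averaging described in the abstract can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the one-parameter model `y = w * x`, the client data, the learning rate, and the choice to weight each client by its number of examples are all assumptions made for the example.

```python
def local_sgd(w, data, lr=0.1, epochs=5):
    """Run a few epochs of SGD on one client's data for the model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # gradient of the squared error
            w -= lr * grad
    return w


def federated_round(w_global, clients, lr=0.1, epochs=5):
    """One communication round: local training, then weighted averaging."""
    total = sum(len(data) for data in clients)
    w_new = 0.0
    for data in clients:
        w_local = local_sgd(w_global, data, lr, epochs)
        # Assumption: weight each client by its share of the examples.
        w_new += len(data) / total * w_local
    return w_new


# Hypothetical unbalanced clients; every example satisfies y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (1.0, 3.0), (0.2, 0.6)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
```

Only the scalar `w` crosses the client boundary in each round; the raw `(x, y)` pairs stay with the client, which is the property the abstract emphasizes.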

Latent Predictor Networks for Code Generation


Title	Latent Predictor Networks for Code Generation
Authors	Wang Ling, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Andrew Senior, Fumin Wang, Phil Blunsom
Abstract	Many language generation tasks require the production of text conditioned on both structured and unstructured inputs. We present a novel neural network architecture which generates an output sequence conditioned on an arbitrary number of input functions. Crucially, our approach allows both the choice of conditioning context and the granularity of generation, for example characters or tokens, to be marginalised, thus permitting scalable and effective training. Using this framework, we address the problem of generating programming code from a mixed natural language and structured specification. We create two new data sets for this paradigm derived from the collectible trading card games Magic the Gathering and Hearthstone. On these, and a third preexisting corpus, we demonstrate that marginalising multiple predictors allows our model to outperform strong benchmarks.
Tasks	Card Games, Code Generation, Text Generation
Published	2016-03-22
URL	http://arxiv.org/abs/1603.06744v2
PDF	http://arxiv.org/pdf/1603.06744v2.pdf
PWC	https://paperswithcode.com/paper/latent-predictor-networks-for-code-generation
Repo	https://github.com/davidgolub/QuestionGeneration
Framework	pytorch

Sentence Pair Scoring: Towards Unified Framework for Text Comprehension


Title	Sentence Pair Scoring: Towards Unified Framework for Text Comprehension
Authors	Petr Baudiš, Jan Pichl, Tomáš Vyskočil, Jan Šedivý
Abstract	We review the task of Sentence Pair Scoring, popular in the literature in various forms - viewed as Answer Sentence Selection, Semantic Text Scoring, Next Utterance Ranking, Recognizing Textual Entailment, Paraphrasing or e.g. a component of Memory Networks. We argue that all such tasks are similar from the model perspective and propose new baselines by comparing the performance of common IR metrics and popular convolutional, recurrent and attention-based neural models across many Sentence Pair Scoring tasks and datasets. We discuss the problem of evaluating randomized models, propose a statistically grounded methodology, and attempt to improve comparisons by releasing new datasets that are much harder than some of the currently used, well-explored benchmarks. We introduce a unified open source software framework with easily pluggable models and tasks, which enables us to experiment with multi-task reusability of trained sentence models. We set a new state of the art in performance on the Ubuntu Dialogue dataset.
Tasks	Natural Language Inference, Reading Comprehension
Published	2016-03-19
URL	http://arxiv.org/abs/1603.06127v4
PDF	http://arxiv.org/pdf/1603.06127v4.pdf
PWC	https://paperswithcode.com/paper/sentence-pair-scoring-towards-unified
Repo	https://github.com/brmson/dataset-sts
Framework	none

Graph Learning from Data under Structural and Laplacian Constraints


Title	Graph Learning from Data under Structural and Laplacian Constraints
Authors	Hilmi E. Egilmez, Eduardo Pavez, Antonio Ortega
Abstract	Graphs are fundamental mathematical structures used in various fields to represent data, signals and processes. In this paper, we propose a novel framework for learning/estimating graphs from data. The proposed framework includes (i) formulation of various graph learning problems, (ii) their probabilistic interpretations and (iii) associated algorithms. Specifically, graph learning problems are posed as estimation of graph Laplacian matrices from some observed data under given structural constraints (e.g., graph connectivity and sparsity level). From a probabilistic perspective, the problems of interest correspond to maximum a posteriori (MAP) parameter estimation of Gaussian-Markov random field (GMRF) models, whose precision (inverse covariance) is a graph Laplacian matrix. For the proposed graph learning problems, specialized algorithms are developed by incorporating the graph Laplacian and structural constraints. The experimental results demonstrate that the proposed algorithms outperform the current state-of-the-art methods in terms of accuracy and computational efficiency.
Tasks
Published	2016-11-16
URL	http://arxiv.org/abs/1611.05181v3
PDF	http://arxiv.org/pdf/1611.05181v3.pdf
PWC	https://paperswithcode.com/paper/graph-learning-from-data-under-structural-and
Repo	https://github.com/STAC-USC/Graph_Learning
Framework	none
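
As background for the abstract's central object, the sketch below builds a combinatorial graph Laplacian from a weighted edge list and evaluates its quadratic form. It only illustrates what a graph Laplacian is; it is not the estimation algorithm proposed in the paper, and the helper names and the example graph are assumptions.

```python
def laplacian(n, weighted_edges):
    """Combinatorial Laplacian L = D - W of an undirected weighted graph."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        L[i][j] -= w  # off-diagonal entries are negated edge weights
        L[j][i] -= w
        L[i][i] += w  # diagonal entries accumulate node degrees
        L[j][j] += w
    return L


def quadratic_form(L, x):
    """x^T L x, equal to the sum over edges of w_ij * (x_i - x_j)^2."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# Hypothetical 3-node path graph: 0 -(1.0)- 1 -(2.0)- 2
edges = [(0, 1, 1.0), (1, 2, 2.0)]
L = laplacian(3, edges)
signal = [1.0, 2.0, 4.0]
smoothness = quadratic_form(L, signal)
```

Each row of `L` sums to zero and the off-diagonal entries are non-positive. Constraints of this kind are what separate estimating a Laplacian from estimating a general precision matrix.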

Learning Temporal Regularity in Video Sequences


Title	Learning Temporal Regularity in Video Sequences
Authors	Mahmudul Hasan, Jonghyun Choi, Jan Neumann, Amit K. Roy-Chowdhury, Larry S. Davis
Abstract	Perceiving meaningful activities in a long video sequence is a challenging problem due to the ambiguous definition of ‘meaningfulness’ as well as clutter in the scene. We approach this problem by learning a generative model for regular motion patterns, termed regularity, using multiple sources with very limited supervision. Specifically, we propose two methods that are built upon autoencoders for their ability to work with little to no supervision. We first leverage the conventional handcrafted spatio-temporal local features and learn a fully connected autoencoder on them. Second, we build a fully convolutional feed-forward autoencoder to learn both the local features and the classifiers as an end-to-end learning framework. Our model can capture the regularities from multiple datasets. We evaluate our methods in both qualitative and quantitative ways - showing the learned regularity of videos in various aspects and demonstrating competitive performance on anomaly detection datasets as an application.
Tasks	Abnormal Event Detection In Video, Anomaly Detection
Published	2016-04-15
URL	http://arxiv.org/abs/1604.04574v1
PDF	http://arxiv.org/pdf/1604.04574v1.pdf
PWC	https://paperswithcode.com/paper/learning-temporal-regularity-in-video
Repo	https://github.com/tnybny/Frame-level-anomalies-in-videos
Framework	tf

HARRISON: A Benchmark on HAshtag Recommendation for Real-world Images in Social Networks


Title	HARRISON: A Benchmark on HAshtag Recommendation for Real-world Images in Social Networks
Authors	Minseok Park, Hanxiang Li, Junmo Kim
Abstract	Simple, short, and compact hashtags cover a wide range of information on social networks. Although many works in the field of natural language processing (NLP) have demonstrated the importance of hashtag recommendation, hashtag recommendation for images has barely been studied. In this paper, we introduce the HARRISON dataset, a benchmark on hashtag recommendation for real world images in social networks. The HARRISON dataset is a realistic dataset, composed of 57,383 photos from Instagram and an average of 4.5 associated hashtags for each photo. To evaluate our dataset, we design a baseline framework consisting of a visual feature extractor based on a convolutional neural network (CNN) and a multi-label classifier based on a neural network. Based on this framework, two single feature-based models (an object-based model and a scene-based model) and an integrated model combining them are evaluated on the HARRISON dataset. Our dataset shows that the hashtag recommendation task requires a wide and contextual understanding of the situation conveyed in the image. As far as we know, this work is the first vision-only attempt at hashtag recommendation for real world images in social networks. We expect this benchmark to accelerate the advancement of hashtag recommendation.
Tasks
Published	2016-05-17
URL	http://arxiv.org/abs/1605.05054v1
PDF	http://arxiv.org/pdf/1605.05054v1.pdf
PWC	https://paperswithcode.com/paper/harrison-a-benchmark-on-hashtag
Repo	https://github.com/minstone/HARRISON-Dataset
Framework	none

Accurate and scalable social recommendation using mixed-membership stochastic block models


Title	Accurate and scalable social recommendation using mixed-membership stochastic block models
Authors	Antonia Godoy-Lorite, Roger Guimera, Cristopher Moore, Marta Sales-Pardo
Abstract	With ever-increasing amounts of online information available, modeling and predicting individual preferences (for books or articles, for example) is becoming more and more important. Good predictions enable us to improve advice to users, and obtain a better understanding of the socio-psychological processes that determine those preferences. We have developed a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of individuals’ preferences. Our approach is based on the explicit assumption that there are groups of individuals and of items, and that the preferences of an individual for an item are determined only by their group memberships. Importantly, we allow each individual and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches, such as matrix factorization, we do not assume implicitly or explicitly that individuals in each group prefer items in a single group of items. The resulting overlapping groups and the predicted preferences can be inferred with an expectation-maximization algorithm whose running time scales linearly (per iteration). Our approach enables us to predict individual preferences in large datasets, and is considerably more accurate than the current algorithms for such large datasets.
Tasks
Published	2016-04-05
URL	http://arxiv.org/abs/1604.01170v2
PDF	http://arxiv.org/pdf/1604.01170v2.pdf
PWC	https://paperswithcode.com/paper/accurate-and-scalable-social-recommendation
Repo	https://github.com/billjeffries/mixMemRec
Framework	none

PersonNet: Person Re-identification with Deep Convolutional Neural Networks


Title	PersonNet: Person Re-identification with Deep Convolutional Neural Networks
Authors	Lin Wu, Chunhua Shen, Anton van den Hengel
Abstract	In this paper, we propose a deep end-to-end neural network to simultaneously learn high-level features and a corresponding similarity metric for person re-identification. The network takes a pair of raw RGB images as input, and outputs a similarity value indicating whether the two input images depict the same person. A layer computing neighborhood range differences across the two input images is employed to capture local relationships between patches. This operation seeks a robust feature from the input images. By increasing the depth to 10 weight layers and using very small (3$\times$3) convolution filters, our architecture achieves a remarkable improvement on the prior-art configurations. Meanwhile, an adaptive Root-Mean-Square (RMSProp) gradient descent algorithm is integrated into our architecture, which is beneficial to deep nets. Our method consistently outperforms state-of-the-art on two large datasets (CUHK03 and Market-1501), and a medium-sized data set (CUHK01).
Tasks	Person Re-Identification
Published	2016-01-27
URL	http://arxiv.org/abs/1601.07255v2
PDF	http://arxiv.org/pdf/1601.07255v2.pdf
PWC	https://paperswithcode.com/paper/personnet-person-re-identification-with-deep
Repo	https://github.com/leonardovlibido/PersonRe-Identification
Framework	none
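
The RMSProp optimizer mentioned in the abstract can be sketched in a few lines. This is the commonly used form of the update on a scalar parameter, not the paper's training code; the hyperparameters and the quadratic test objective are assumptions chosen for illustration.

```python
import math


def rmsprop_minimize(grad_fn, w0, lr=0.01, decay=0.9, eps=1e-8, steps=2000):
    """Minimize a scalar function with RMSProp, given its gradient grad_fn."""
    w = w0
    avg_sq = 0.0  # running average of squared gradients
    for _ in range(steps):
        g = grad_fn(w)
        avg_sq = decay * avg_sq + (1.0 - decay) * g * g
        # Scale the step by the root of the running mean square of gradients.
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w


# Hypothetical objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_star = rmsprop_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Because each step is normalized by the recent gradient magnitude, the step size stays close to `lr` whether the raw gradient is large or small. With a constant `lr`, the iterate settles into a small neighborhood of the minimum instead of converging exactly.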

Computing Web-scale Topic Models using an Asynchronous Parameter Server


Title	Computing Web-scale Topic Models using an Asynchronous Parameter Server
Authors	Rolf Jagerman, Carsten Eickhoff, Maarten de Rijke
Abstract	Topic models such as Latent Dirichlet Allocation (LDA) have been widely used in information retrieval for tasks ranging from smoothing and feedback methods to tools for exploratory search and discovery. However, classical methods for inferring topic models do not scale up to the massive size of today’s publicly available Web-scale data sets. The state-of-the-art approaches rely on custom strategies, implementations and hardware to facilitate their asynchronous, communication-intensive workloads. We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server. Advantages of this integration include convenient usage of existing data processing pipelines and eliminating the need for disk writes as data can be kept in memory from start to finish. Our goal is not to outperform highly customized implementations, but to propose a general high-performance topic modeling framework that can easily be used in today’s data processing pipelines. We compare APS-LDA to the existing Spark LDA implementations and show that our system can, on a 480-core cluster, process up to 135 times more data and 10 times more topics without sacrificing model quality.
Tasks	Information Retrieval, Topic Models
Published	2016-05-24
URL	http://arxiv.org/abs/1605.07422v3
PDF	http://arxiv.org/pdf/1605.07422v3.pdf
PWC	https://paperswithcode.com/paper/computing-web-scale-topic-models-using-an
Repo	https://github.com/rjagerman/glint
Framework	none

Analyzing Linear Dynamical Systems: From Modeling to Coding and Learning


Title	Analyzing Linear Dynamical Systems: From Modeling to Coding and Learning
Authors	Wenbing Huang, Fuchun Sun, Lele Cao, Mehrtash Harandi
Abstract	Encoding time-series with Linear Dynamical Systems (LDSs) leads to rich models with applications ranging from dynamical texture recognition to video segmentation to name a few. In this paper, we propose to represent LDSs with infinite-dimensional subspaces and derive an analytic solution to obtain stable LDSs. We then devise efficient algorithms to perform sparse coding and dictionary learning on the space of infinite-dimensional subspaces. In particular, two solutions are developed to sparsely encode an LDS. In the first method, we map the subspaces into a Reproducing Kernel Hilbert Space (RKHS) and achieve our goal through kernel sparse coding. As for the second solution, we propose to embed the infinite-dimensional subspaces into the space of symmetric matrices and formulate the sparse coding accordingly in the induced space. For dictionary learning, we encode time-series by introducing a novel concept, namely the two-fold LDSs. We then make use of the two-fold LDSs to derive an analytical form for updating atoms of an LDS dictionary, i.e., each atom is an LDS itself. Compared to several baselines and state-of-the-art methods, the proposed methods yield higher accuracies in various classification tasks including video classification and tactile recognition.
Tasks	Dictionary Learning, Time Series, Video Classification, Video Semantic Segmentation
Published	2016-08-03
URL	http://arxiv.org/abs/1608.01059v2
PDF	http://arxiv.org/pdf/1608.01059v2.pdf
PWC	https://paperswithcode.com/paper/analyzing-linear-dynamical-systems-from
Repo	https://github.com/caolele/caolele.github.io
Framework	none

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning


Title	A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning
Authors	Martha White, Adam White
Abstract	One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms. In many large-scale applications, online computation and function approximation represent key strategies in scaling up reinforcement learning algorithms. In this setting, we have effective and reasonably well understood algorithms for adapting the learning-rate parameter online during learning. Such meta-learning approaches can improve robustness of learning and enable specialization to the current task, improving learning speed. For temporal-difference learning algorithms, which we study here, there is yet another parameter, $\lambda$, that similarly impacts learning speed and stability in practice. Unfortunately, unlike the learning-rate parameter, $\lambda$ parametrizes the objective function that temporal-difference methods optimize. Different choices of $\lambda$ produce different fixed-point solutions, and thus adapting $\lambda$ online and characterizing the optimization is substantially more complex than adapting the learning-rate parameter. There is no meta-learning method for $\lambda$ that can achieve (1) incremental updating, (2) compatibility with function approximation, and (3) stability of learning under both on- and off-policy sampling. In this paper we contribute a novel objective function for optimizing $\lambda$ as a function of state rather than time. We derive a new incremental, linear complexity $\lambda$-adaptation algorithm that does not require offline batch updating or access to a model of the world, and present a suite of experiments illustrating the practicality of our new algorithm in three different settings. Taken together, our contributions represent a concrete step towards black-box application of temporal-difference learning methods in real world problems.
Tasks	Meta-Learning
Published	2016-07-02
URL	http://arxiv.org/abs/1607.00446v2
PDF	http://arxiv.org/pdf/1607.00446v2.pdf
PWC	https://paperswithcode.com/paper/a-greedy-approach-to-adapting-the-trace
Repo	https://github.com/PwnerHarry/MTA
Framework	none
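
To make the role of the trace parameter $\lambda$ concrete, here is a minimal sketch of ordinary tabular TD($\lambda$) with accumulating eligibility traces and a fixed $\lambda$. It is the textbook baseline, not the adaptive algorithm proposed in the paper; the three-state chain, the step size, and the function name `td_lambda` are assumptions made for illustration.

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.9):
    """Tabular TD(lambda) value estimation with accumulating traces.

    Each episode is a list of (state, reward, next_state) transitions,
    with next_state set to None on the terminating transition.
    """
    v = [0.0] * n_states
    for episode in episodes:
        trace = [0.0] * n_states  # eligibility traces reset per episode
        for s, r, s_next in episode:
            bootstrap = gamma * v[s_next] if s_next is not None else 0.0
            delta = r + bootstrap - v[s]  # TD error
            trace[s] += 1.0
            for i in range(n_states):
                v[i] += alpha * delta * trace[i]
                trace[i] *= gamma * lam  # lambda sets how fast credit decays
    return v


# Hypothetical deterministic chain 0 -> 1 -> 2 -> end, reward 1 at the end.
episode = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
values = td_lambda([episode] * 300, n_states=3)
```

With `lam=0.0` only the most recently visited state is updated at each step, while values near 1 spread each TD error back over earlier states. In this sketch a single constant `lam` is used for every state.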