May 7, 2019

Paper Group ANR 98

Lazy Evaluation of Convolutional Filters


Title	Lazy Evaluation of Convolutional Filters
Authors	Sam Leroux, Steven Bohez, Cedric De Boom, Elias De Coninck, Tim Verbelen, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt
Abstract	In this paper we propose a technique which avoids the evaluation of certain convolutional filters in a deep neural network. This allows trading off the accuracy of a deep neural network against its computational and memory requirements. This is especially important on a constrained device unable to hold all the weights of the network in memory.
Tasks
Published	2016-05-27
URL	http://arxiv.org/abs/1605.08543v1
PDF	http://arxiv.org/pdf/1605.08543v1.pdf
PWC	https://paperswithcode.com/paper/lazy-evaluation-of-convolutional-filters
Repo
Framework

Stereo Video Deblurring


Title	Stereo Video Deblurring
Authors	Anita Sellent, Carsten Rother, Stefan Roth
Abstract	Videos acquired in low-light conditions often exhibit motion blur, which depends on the motion of the objects relative to the camera. This is not only visually unpleasing, but can hamper further processing. With this paper we are the first to show how the availability of stereo video can aid the challenging video deblurring task. We leverage 3D scene flow, which can be estimated robustly even under adverse conditions. We go beyond simply determining the object motion in two ways: First, we show how a piecewise rigid 3D scene flow representation allows us to induce accurate blur kernels via local homographies. Second, we exploit the estimated motion boundaries of the 3D scene flow to mitigate ringing artifacts using an iterative weighting scheme. Being aware of 3D object motion, our approach can deal robustly with an arbitrary number of independently moving objects. We demonstrate its benefit over state-of-the-art video deblurring using quantitative and qualitative experiments on rendered scenes and real videos.
Tasks	Deblurring
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08421v1
PDF	http://arxiv.org/pdf/1607.08421v1.pdf
PWC	https://paperswithcode.com/paper/stereo-video-deblurring
Repo
Framework

Neural Random Forests


Title	Neural Random Forests
Authors	Gérard Biau, Erwan Scornet, Johannes Welbl
Abstract	Given an ensemble of randomized regression trees, it is possible to restructure them as a collection of multilayered neural networks with particular connection weights. Following this principle, we reformulate the random forest method of Breiman (2001) into a neural network setting, and in turn propose two new hybrid procedures that we call neural random forests. Both predictors exploit prior knowledge of regression trees for their architecture, have fewer parameters to tune than standard networks, and fewer restrictions on the geometry of the decision boundaries than trees. Consistency results are proved, and substantial numerical evidence is provided on both synthetic and real data sets to assess the excellent performance of our methods in a large variety of prediction problems.
Tasks
Published	2016-04-25
URL	http://arxiv.org/abs/1604.07143v2
PDF	http://arxiv.org/pdf/1604.07143v2.pdf
PWC	https://paperswithcode.com/paper/neural-random-forests
Repo
Framework

Non-contact hemodynamic imaging reveals the jugular venous pulse waveform


Title	Non-contact hemodynamic imaging reveals the jugular venous pulse waveform
Authors	Robert Amelard, Richard L Hughson, Danielle K Greaves, Kaylen J Pfisterer, Jason Leung, David A Clausi, Alexander Wong
Abstract	Cardiovascular monitoring is important to prevent diseases from progressing. The jugular venous pulse (JVP) waveform offers important clinical information about cardiac health, but is not routinely examined due to its invasive catheterisation procedure. Here, we demonstrate for the first time that the JVP can be consistently observed in a non-contact manner using a novel light-based photoplethysmographic imaging system, coded hemodynamic imaging (CHI). While traditional monitoring methods measure the JVP at a single location, CHI’s wide-field imaging capabilities were able to observe the jugular venous pulse’s spatial flow profile for the first time. The important inflection points in the JVP were observed, meaning that cardiac abnormalities can be assessed through JVP distortions. CHI provides a new way to assess cardiac health through non-contact light-based JVP monitoring, and can be used in non-surgical environments for cardiac assessment.
Tasks
Published	2016-04-15
URL	http://arxiv.org/abs/1604.05213v2
PDF	http://arxiv.org/pdf/1604.05213v2.pdf
PWC	https://paperswithcode.com/paper/non-contact-hemodynamic-imaging-reveals-the
Repo
Framework

Perfect Fingerprint Orientation Fields by Locally Adaptive Global Models


Title	Perfect Fingerprint Orientation Fields by Locally Adaptive Global Models
Authors	Carsten Gottschlich, Benjamin Tams, Stephan Huckemann
Abstract	Fingerprint recognition is widely used for verification and identification in many commercial, governmental and forensic applications. The orientation field (OF) plays an important role at various processing stages in fingerprint recognition systems. OFs are used for image enhancement, fingerprint alignment, for fingerprint liveness detection, fingerprint alteration detection and fingerprint matching. In this paper, a novel approach is presented to globally model an OF combined with locally adaptive methods. We show that this model adapts perfectly to the ‘true OF’ in the limit. This perfect OF is described by a small number of parameters with straightforward geometric interpretation. Applications are manifold: Quick expert marking of very poor quality (for instance latent) OFs, high fidelity low parameter OF compression and a direct road to ground truth OFs markings for large databases, say. In this contribution we describe an algorithm to perfectly estimate OF parameters automatically or semi-automatically, depending on image quality, and we establish the main underlying claim of high fidelity low parameter OF compression.
Tasks	Image Enhancement
Published	2016-06-20
URL	http://arxiv.org/abs/1606.06007v1
PDF	http://arxiv.org/pdf/1606.06007v1.pdf
PWC	https://paperswithcode.com/paper/perfect-fingerprint-orientation-fields-by
Repo
Framework

Enhanced perceptrons using contrastive biclusters


Title	Enhanced perceptrons using contrastive biclusters
Authors	André L. V. Coelho, Fabrício O. de França
Abstract	Perceptrons are neuronal devices capable of fully discriminating linearly separable classes. Although straightforward to implement and train, their applicability is usually hindered by non-trivial requirements imposed by real-world classification problems. Therefore, several approaches, such as kernel perceptrons, have been conceived to counteract such difficulties. In this paper, we investigate an enhanced perceptron model based on the notion of contrastive biclusters. From this perspective, a good discriminative bicluster comprises a subset of data instances belonging to one class that show high coherence across a subset of features and high differentiation from nearest instances of the other class under the same features (referred to as its contrastive bicluster). Upon each local subspace associated with a pair of contrastive biclusters a perceptron is trained and the model with highest area under the receiver operating characteristic curve (AUC) value is selected as the final classifier. Experiments conducted on a range of data sets, including those related to a difficult biosignal classification problem, show that the proposed variant can be indeed very useful, prevailing in most of the cases upon standard and kernel perceptrons in terms of accuracy and AUC measures.
Tasks
Published	2016-03-22
URL	http://arxiv.org/abs/1603.06859v1
PDF	http://arxiv.org/pdf/1603.06859v1.pdf
PWC	https://paperswithcode.com/paper/enhanced-perceptrons-using-contrastive
Repo
Framework
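
The abstract above builds on the standard perceptron and selects models by AUC. As a rough illustration of those two ingredients only (not the contrastive-bicluster method itself), here is a minimal sketch of the classic perceptron update rule plus a pairwise AUC computation. The helper names (`train_perceptron`, `auc`, `make_separable`) and the toy data are assumptions made for this example.

```python
import numpy as np

def train_perceptron(X, y, max_epochs=1000):
    """Classic perceptron rule for labels in {-1, +1}; returns weights with bias last."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a constant bias feature
    w = np.zeros(Xb.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(Xb, y):
            if yi * (xi @ w) <= 0:   # misclassified (or on the boundary)
                w += yi * xi         # move the hyperplane towards the sample
                mistakes += 1
        if mistakes == 0:            # a full pass without errors: converged
            break
    return w

def scores(w, X):
    """Signed decision values for each row of X."""
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def auc(s, y):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = s[y == 1], s[y == -1]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def make_separable(n=200, margin=0.2, seed=0):
    """Toy 2-D data split by the line x0 = x1, with points near the line removed."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, size=(n, 2))
    gap = X[:, 0] - X[:, 1]
    keep = np.abs(gap) > margin
    return X[keep], np.where(gap[keep] > 0, 1, -1)

X, y = make_separable()
w = train_perceptron(X, y)
print("training AUC:", auc(scores(w, X), y))
```

Because the toy classes are linearly separable with a margin, the perceptron rule is guaranteed to stop after finitely many updates; on real data that guarantee is lost, which is the limitation the abstract refers to.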

Fast Video Classification via Adaptive Cascading of Deep Models


Title	Fast Video Classification via Adaptive Cascading of Deep Models
Authors	Haichen Shen, Seungyeop Han, Matthai Philipose, Arvind Krishnamurthy
Abstract	Recent advances have enabled “oracle” classifiers that can classify across many classes and input distributions with high accuracy without retraining. However, these classifiers are relatively heavyweight, so that applying them to classify video is costly. We show that day-to-day video exhibits highly skewed class distributions over the short term, and that these distributions can be classified by much simpler models. We formulate the problem of detecting the short-term skews online and exploiting models based on it as a new sequential decision making problem dubbed the Online Bandit Problem, and present a new algorithm to solve it. When applied to recognizing faces in TV shows and movies, we realize end-to-end classification speedups of 2.4-7.8x/2.6-11.2x (on GPU/CPU) relative to a state-of-the-art convolutional neural network, at competitive accuracy.
Tasks	Decision Making, Video Classification
Published	2016-11-20
URL	http://arxiv.org/abs/1611.06453v2
PDF	http://arxiv.org/pdf/1611.06453v2.pdf
PWC	https://paperswithcode.com/paper/fast-video-classification-via-adaptive
Repo
Framework
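
To make the idea of exploiting short-term class skew concrete, the following is a toy control-flow sketch and not the paper's Online Bandit algorithm. It assumes a hypothetical cheap "specialist" that only knows the labels seen most often in a recent window and answers `"other"` otherwise, in which case the expensive oracle is consulted. Both classifiers are stand-ins that read a `true_label` field, so only the cascading logic is being illustrated.

```python
from collections import Counter

def oracle(frame):
    """Stand-in for an expensive general-purpose classifier (hypothetical)."""
    return frame["true_label"]

def make_specialist(frequent_labels):
    """Stand-in for a cheap model that only recognises a few labels (hypothetical)."""
    known = set(frequent_labels)

    def specialist(frame):
        label = frame["true_label"]
        return label if label in known else "other"

    return specialist

def classify_stream(frames, window=20, top_k=2):
    """Use the specialist when it is confident; fall back to the oracle on 'other'."""
    recent, predictions, oracle_calls = [], [], 0
    specialist = None
    for frame in frames:
        label = specialist(frame) if specialist else "other"
        if label == "other":
            label = oracle(frame)
            oracle_calls += 1
        predictions.append(label)
        recent.append(label)
        if len(recent) >= window:
            # Re-specialise on the labels that dominate the recent window.
            counts = Counter(recent[-window:])
            specialist = make_specialist([l for l, _ in counts.most_common(top_k)])
    return predictions, oracle_calls

frames = [{"true_label": "alice"}] * 50 + [{"true_label": "bob"}] * 50
predictions, oracle_calls = classify_stream(frames)
print(f"oracle used on {oracle_calls} of {len(frames)} frames")
```

The window length and `top_k` are arbitrary choices here; deciding when to switch models is the sequential decision problem that the abstract describes.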

Deep Learning for Video Classification and Captioning


Title	Deep Learning for Video Classification and Captioning
Authors	Zuxuan Wu, Ting Yao, Yanwei Fu, Yu-Gang Jiang
Abstract	Accelerated by the tremendous increase in Internet bandwidth and storage space, video data has been generated, published and spread explosively, becoming an indispensable part of today’s big data. In this paper, we focus on reviewing two lines of research aiming to stimulate the comprehension of videos with deep learning: video classification and video captioning. While video classification concentrates on automatically labeling video clips based on their semantic contents like human actions or complex events, video captioning attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos. In addition, we also provide a review of popular benchmarks and competitions, which are critical for evaluating the technical progress of this vibrant field.
Tasks	Video Captioning, Video Classification
Published	2016-09-22
URL	http://arxiv.org/abs/1609.06782v2
PDF	http://arxiv.org/pdf/1609.06782v2.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-video-classification-and
Repo
Framework

Solving a Mixture of Many Random Linear Equations by Tensor Decomposition and Alternating Minimization


Title	Solving a Mixture of Many Random Linear Equations by Tensor Decomposition and Alternating Minimization
Authors	Xinyang Yi, Constantine Caramanis, Sujay Sanghavi
Abstract	We consider the problem of solving mixed random linear equations with $k$ components. This is the noiseless setting of mixed linear regression. The goal is to estimate multiple linear models from mixed samples in the case where the labels (which sample corresponds to which model) are not observed. We give a tractable algorithm for the mixed linear equation problem, and show that under some technical conditions, our algorithm is guaranteed to solve the problem exactly with sample complexity linear in the dimension, and polynomial in $k$, the number of components. Previous approaches have required either exponential dependence on $k$, or super-linear dependence on the dimension. The proposed algorithm is a combination of tensor decomposition and alternating minimization. Our analysis involves proving that the initialization provided by the tensor method allows alternating minimization, which is equivalent to EM in our setting, to converge to the global optimum at a linear rate.
Tasks
Published	2016-08-19
URL	http://arxiv.org/abs/1608.05749v1
PDF	http://arxiv.org/pdf/1608.05749v1.pdf
PWC	https://paperswithcode.com/paper/solving-a-mixture-of-many-random-linear
Repo
Framework
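
The alternating minimization step mentioned in the abstract can be sketched as follows: assign each sample to the model that currently explains it best, then refit each model by least squares on its assigned samples. This is a minimal sketch under stated assumptions, not the authors' implementation. In particular, the tensor-decomposition initialization is not implemented; a small perturbation of the true parameters stands in for it, and the function names and toy parameters are invented for the example.

```python
import numpy as np

def alternating_minimization(X, y, init_betas, n_iters=50):
    """Alternate between hard assignment of samples and per-component least squares."""
    betas = np.array(init_betas, dtype=float)
    labels = None
    for _ in range(n_iters):
        # Assignment step: each sample picks the component with the smallest residual.
        residuals = np.abs(y[:, None] - X @ betas.T)      # shape (n, k)
        labels = residuals.argmin(axis=1)
        # Update step: refit every component on the samples assigned to it.
        for j in range(len(betas)):
            mask = labels == j
            if mask.sum() >= X.shape[1]:                  # skip under-determined fits
                betas[j] = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    return betas, labels

def make_mixture(true_betas, n=200, seed=0):
    """Noiseless mixed linear equations: y_i = <x_i, beta_{z_i}> with hidden z_i."""
    rng = np.random.default_rng(seed)
    k, d = true_betas.shape
    X = rng.normal(size=(n, d))
    z = rng.integers(k, size=n)
    y = np.einsum("ij,ij->i", X, true_betas[z])
    return X, y, z

true_betas = np.array([[3.0, 0.0, 0.0, 0.0, 0.0],
                       [0.0, 3.0, 0.0, 0.0, 0.0]])
X, y, z = make_mixture(true_betas)
# Assumption: a perturbed copy of the truth replaces the tensor-based initialization.
init = true_betas + 0.1 * np.random.default_rng(1).normal(size=true_betas.shape)
estimate, labels = alternating_minimization(X, y, init)
print("max parameter error:", np.abs(estimate - true_betas).max())
```

With a random rather than a nearby starting point, the same loop can stall at a poor local solution, which is why the abstract stresses the quality of the initialization.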

DAiSEE: Towards User Engagement Recognition in the Wild


Title	DAiSEE: Towards User Engagement Recognition in the Wild
Authors	Abhay Gupta, Arjun D’Cunha, Kamal Awasthi, Vineeth Balasubramanian
Abstract	We introduce DAiSEE, the first multi-label video classification dataset comprising 9068 video snippets captured from 112 users for recognizing the user affective states of boredom, confusion, engagement, and frustration in the wild. The dataset has four levels of labels, namely very low, low, high, and very high, for each of the affective states, which are crowd annotated and correlated with a gold standard annotation created using a team of expert psychologists. We have also established benchmark results on this dataset using state-of-the-art video classification methods that are available today. We believe that DAiSEE will provide the research community with challenges in feature extraction, context-based inference, and development of suitable machine learning methods for related tasks, thus providing a springboard for further research. The dataset is available for download at https://iith.ac.in/~daisee-dataset
Tasks	Video Classification
Published	2016-09-07
URL	http://arxiv.org/abs/1609.01885v6
PDF	http://arxiv.org/pdf/1609.01885v6.pdf
PWC	https://paperswithcode.com/paper/daisee-towards-user-engagement-recognition-in
Repo
Framework

Distributing Knowledge into Simple Bases


Title	Distributing Knowledge into Simple Bases
Authors	Adrian Haret, Jean-Guy Mailly, Stefan Woltran
Abstract	Understanding the behavior of belief change operators for fragments of classical logic has received increasing interest over the last years. Results in this direction are mainly concerned with adapting representation theorems. However, fragment-driven belief change also leads to novel research questions. In this paper we propose the concept of belief distribution, which can be understood as the reverse task of merging. More specifically, we are interested in the following question: given an arbitrary knowledge base $K$ and some merging operator $\Delta$, can we find a profile $E$ and a constraint $\mu$, both from a given fragment of classical logic, such that $\Delta_\mu(E)$ yields a result equivalent to $K$? In other words, we are interested in seeing if $K$ can be distributed into knowledge bases of simpler structure, such that the task of merging allows for a reconstruction of the original knowledge. Our initial results show that merging based on drastic distance allows for an easy distribution of knowledge, while the power of distribution for operators based on Hamming distance relies heavily on the fragment of choice.
Tasks
Published	2016-03-31
URL	http://arxiv.org/abs/1603.09511v1
PDF	http://arxiv.org/pdf/1603.09511v1.pdf
PWC	https://paperswithcode.com/paper/distributing-knowledge-into-simple-bases
Repo
Framework
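
For readers unfamiliar with distance-based merging, the sketch below shows how the drastic and Hamming distances can give different merging results on the same profile. It assumes the common sum-aggregation form of the operator (the abstract does not state which aggregation is used) and represents each knowledge base by its set of models, written as tuples of truth values. All names are illustrative.

```python
from itertools import product

def hamming(w1, w2):
    """Number of atoms on which two interpretations differ."""
    return sum(a != b for a, b in zip(w1, w2))

def drastic(w1, w2):
    """0 if the interpretations coincide, 1 otherwise."""
    return 0 if w1 == w2 else 1

def merge(profile, constraint, distance):
    """Models of the constraint with minimal summed distance to the bases in the profile."""
    def cost(w):
        # Distance from w to a base is the distance to its closest model.
        return sum(min(distance(w, m) for m in base) for base in profile)

    best = min(cost(w) for w in constraint)
    return {w for w in constraint if cost(w) == best}

# Two atoms, a trivially true constraint, and two bases that disagree on everything.
all_worlds = set(product((0, 1), repeat=2))
profile = [{(1, 1)}, {(0, 0)}]

print("drastic:", sorted(merge(profile, all_worlds, drastic)))
print("hamming:", sorted(merge(profile, all_worlds, hamming)))
```

Under the drastic distance only the two original models survive, while under the Hamming distance every interpretation has the same total cost, so all four are returned.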

Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures


Title	Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures
Authors	Roman V. Yampolskiy, M. S. Spellchecker
Abstract	In this work, we present and analyze reported failures of artificially intelligent systems and extrapolate our analysis to future AIs. We suggest that both the frequency and the seriousness of future AI failures will steadily increase. AI Safety can be improved based on ideas developed by cybersecurity experts. For narrow AIs, safety failures are at the same moderate level of criticality as in cybersecurity; for general AI, however, failures have a fundamentally different impact. A single failure of a superintelligent system may cause a catastrophic event without a chance for recovery. The goal of cybersecurity is to reduce the number of successful attacks on the system; the goal of AI Safety is to make sure zero attacks succeed in bypassing the safety mechanisms. Unfortunately, such a level of performance is unachievable. Every security system will eventually fail; there is no such thing as a 100% secure system.
Tasks
Published	2016-10-25
URL	http://arxiv.org/abs/1610.07997v1
PDF	http://arxiv.org/pdf/1610.07997v1.pdf
PWC	https://paperswithcode.com/paper/artificial-intelligence-safety-and
Repo
Framework

Discriminatively Trained Latent Ordinal Model for Video Classification


Title	Discriminatively Trained Latent Ordinal Model for Video Classification
Authors	Karan Sikka, Gaurav Sharma
Abstract	We study the problem of video classification for facial analysis and human action recognition. We propose a novel weakly supervised learning method that models the video as a sequence of automatically mined, discriminative sub-events (e.g., onset and offset phase for “smile”, running and jumping for “highjump”). The proposed model is inspired by the recent works on Multiple Instance Learning and latent SVM/HCRF – it extends such frameworks to model the ordinal aspect in the videos, approximately. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations and on three challenging human action datasets. We also validate the method with qualitative results and show that they largely support the intuitions behind the method.
Tasks	Multiple Instance Learning, Temporal Action Localization, Video Classification
Published	2016-08-08
URL	http://arxiv.org/abs/1608.02318v2
PDF	http://arxiv.org/pdf/1608.02318v2.pdf
PWC	https://paperswithcode.com/paper/discriminatively-trained-latent-ordinal-model
Repo
Framework

On the Latent Variable Interpretation in Sum-Product Networks


Title	On the Latent Variable Interpretation in Sum-Product Networks
Authors	Robert Peharz, Robert Gens, Franz Pernkopf, Pedro Domingos
Abstract	One of the central themes in Sum-Product networks (SPNs) is the interpretation of sum nodes as marginalized latent variables (LVs). This interpretation yields an increased syntactic or semantic structure, allows the application of the EM algorithm, and makes it possible to perform MPE inference efficiently. In the literature, the LV interpretation was justified by explicitly introducing the indicator variables corresponding to the LVs’ states. However, as pointed out in this paper, this approach is in conflict with the completeness condition in SPNs and does not fully specify the probabilistic model. We propose a remedy for this problem by modifying the original approach for introducing the LVs, which we call SPN augmentation. We discuss conditional independencies in augmented SPNs, formally establish the probabilistic interpretation of the sum-weights and give an interpretation of augmented SPNs as Bayesian networks. Based on these results, we find a sound derivation of the EM algorithm for SPNs. Furthermore, the Viterbi-style algorithm for MPE proposed in the literature was never proven to be correct. We show that this is indeed a correct algorithm, when applied to selective SPNs, and in particular when applied to augmented SPNs. Our theoretical results are confirmed in experiments on synthetic data and 103 real-world datasets.
Tasks
Published	2016-01-22
URL	http://arxiv.org/abs/1601.06180v2
PDF	http://arxiv.org/pdf/1601.06180v2.pdf
PWC	https://paperswithcode.com/paper/on-the-latent-variable-interpretation-in-sum
Repo
Framework

Joint Estimation of Multiple Dependent Gaussian Graphical Models with Applications to Mouse Genomics


Title	Joint Estimation of Multiple Dependent Gaussian Graphical Models with Applications to Mouse Genomics
Authors	Yuying Xie, Yufeng Liu, William Valdar
Abstract	Gaussian graphical models are widely used to represent conditional dependence among random variables. In this paper, we propose a novel estimator for data arising from a group of Gaussian graphical models that are themselves dependent. A motivating example is that of modeling gene expression collected on multiple tissues from the same individual: here the multivariate outcome is affected by dependencies acting not only at the level of the specific tissues, but also at the level of the whole body; existing methods that assume independence among graphs are not applicable in this case. To estimate multiple dependent graphs, we decompose the problem into two graphical layers: the systemic layer, which affects all outcomes and thereby induces cross-graph dependence, and the category-specific layer, which represents graph-specific variation. We propose a graphical EM technique that estimates both layers jointly, establish estimation consistency and selection sparsistency of the proposed estimator, and confirm by simulation that the EM method is superior to a simple one-step method. We apply our technique to mouse genomics data and obtain biologically plausible results.
Tasks
Published	2016-08-30
URL	http://arxiv.org/abs/1608.08659v1
PDF	http://arxiv.org/pdf/1608.08659v1.pdf
PWC	https://paperswithcode.com/paper/joint-estimation-of-multiple-dependent
Repo
Framework
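
As background for the abstract above: in a Gaussian graphical model, a missing edge corresponds to a zero in the precision (inverse covariance) matrix, even when the two variables are marginally correlated. The sketch below illustrates this on a three-variable chain. It is a generic illustration with made-up numbers, not the paper's two-layer estimator.

```python
import numpy as np

def partial_correlations(precision):
    """Partial correlation of each pair given all other variables."""
    d = np.sqrt(np.diag(precision))
    pc = -precision / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc

# Precision matrix of the chain X0 - X1 - X2: no direct edge between X0 and X2.
precision = np.array([[ 2.0, -1.0,  0.0],
                      [-1.0,  2.0, -1.0],
                      [ 0.0, -1.0,  2.0]])
covariance = np.linalg.inv(precision)

print("marginal covariance of X0 and X2:", covariance[0, 2])
print("partial correlation of X0 and X2:", partial_correlations(precision)[0, 2])
```

X0 and X2 have non-zero covariance because both depend on X1, but their partial correlation is zero, which is what the absent edge encodes.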