October 18, 2019

3127 words 15 mins read

Paper Group ANR 556

Estimating Phenotypic Traits From UAV Based RGB Imagery

Title Estimating Phenotypic Traits From UAV Based RGB Imagery
Authors Javier Ribera, Fangning He, Yuhao Chen, Ayman F. Habib, Edward J. Delp
Abstract In many agricultural applications one wants to characterize physical properties of plants and use the measurements to predict, for example, biomass and environmental influence. This process is known as phenotyping. Traditional collection of phenotypic information is labor-intensive and time-consuming. The use of imagery is becoming popular for phenotyping. In this paper, we present methods to estimate traits of sorghum plants from RGB cameras on board an unmanned aerial vehicle (UAV). The position and orientation of the imagery, together with the coordinates of sparse points along the area of interest, are derived through a new triangulation method. A rectified orthophoto mosaic is then generated from the imagery. The number of leaves is estimated, and a model-based method that analyzes leaf morphology for leaf segmentation is proposed. We present a statistical model to find the location of each individual sorghum plant.
Tasks
Published 2018-07-02
URL http://arxiv.org/abs/1807.00498v1
PDF http://arxiv.org/pdf/1807.00498v1.pdf
PWC https://paperswithcode.com/paper/estimating-phenotypic-traits-from-uav-based
Repo
Framework

A Double-Deep Spatio-Angular Learning Framework for Light Field based Face Recognition

Title A Double-Deep Spatio-Angular Learning Framework for Light Field based Face Recognition
Authors Alireza Sepas-Moghaddam, Mohammad A. Haque, Paulo Lobato Correia, Kamal Nasrollahi, Thomas B. Moeslund, Fernando Pereira
Abstract Face recognition has attracted increasing attention due to its wide range of applications, but it is still challenging when facing large variations in the biometric data characteristics. Lenslet light field cameras have recently come into prominence to capture rich spatio-angular information, thus offering new possibilities for advanced biometric recognition systems. This paper proposes a double-deep spatio-angular learning framework for light field based face recognition, which is able to learn both texture and angular dynamics in sequence using convolutional representations; this is a novel recognition framework that has never been proposed before for either face recognition or any other visual recognition task. The proposed double-deep learning framework includes a long short-term memory (LSTM) recurrent network whose inputs are VGG-Face descriptions that are computed using a VGG-Very-Deep-16 convolutional neural network (CNN). The VGG-16 network uses different face viewpoints rendered from a full light field image, which are organised as a pseudo-video sequence. A comprehensive set of experiments has been conducted with the IST-EURECOM light field face database, for varied and challenging recognition tasks. Results show that the proposed framework achieves superior face recognition performance when compared to the state-of-the-art.
Tasks Face Recognition
Published 2018-05-25
URL http://arxiv.org/abs/1805.10078v3
PDF http://arxiv.org/pdf/1805.10078v3.pdf
PWC https://paperswithcode.com/paper/a-double-deep-spatio-angular-learning
Repo
Framework
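For illustration only, here is a minimal PyTorch sketch of the pipeline shape described in the abstract: per-viewpoint CNN descriptors fed as a pseudo-video sequence to an LSTM classifier. A generic torchvision VGG-16 stands in for the VGG-Face network, and the layer sizes, sequence length, and classifier head are assumptions rather than the paper's configuration.

```python
# Minimal sketch: CNN descriptors per light-field viewpoint -> LSTM over the
# pseudo-video sequence. A generic torchvision VGG-16 stands in for VGG-Face;
# dimensions and the classifier head are assumptions.
import torch
import torch.nn as nn
from torchvision.models import vgg16

class SpatioAngularNet(nn.Module):
    def __init__(self, num_identities, hidden=256):
        super().__init__()
        backbone = vgg16(weights=None)            # the paper uses VGG-Face weights
        self.features = backbone.features
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.fc = nn.Sequential(*list(backbone.classifier.children())[:-1])  # 4096-d descriptor
        self.lstm = nn.LSTM(input_size=4096, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_identities)

    def forward(self, views):                     # views: (B, T, 3, 224, 224)
        b, t = views.shape[:2]
        x = views.flatten(0, 1)                   # (B*T, 3, 224, 224)
        x = self.fc(self.pool(self.features(x)).flatten(1))  # per-view descriptors
        x = x.view(b, t, -1)                      # pseudo-video sequence of descriptors
        _, (h, _) = self.lstm(x)                  # angular dynamics across viewpoints
        return self.head(h[-1])                   # identity logits

logits = SpatioAngularNet(num_identities=100)(torch.randn(2, 9, 3, 224, 224))
```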

DeepFirearm: Learning Discriminative Feature Representation for Fine-grained Firearm Retrieval

Title DeepFirearm: Learning Discriminative Feature Representation for Fine-grained Firearm Retrieval
Authors Jiedong Hao, Jing Dong, Wei Wang, Tieniu Tan
Abstract There is great demand for automatically moderating shocking firearm images in social media and for identifying firearm types in forensics. Image retrieval techniques have great potential to solve these problems. To facilitate research in this area, we introduce Firearm 14k, a large dataset consisting of over 14,000 images in 167 categories. It can be used for both fine-grained recognition and retrieval of firearm images. Recent advances in image retrieval are mainly driven by fine-tuning state-of-the-art convolutional neural networks for the retrieval task. The conventional single-margin contrastive loss, known for its simplicity and good performance, has been widely used. We find that it performs poorly on the Firearm 14k dataset because: (1) the loss contributed by positive and negative image pairs is unbalanced during training, and (2) a huge domain gap exists between this dataset and ImageNet. We propose to deal with the unbalanced loss by employing a double-margin contrastive loss. We tackle the domain-gap issue with a two-stage training strategy, where we first fine-tune the network for classification and then fine-tune it for retrieval. Experimental results show that our approach outperforms the conventional single-margin approach by a large margin (up to 88.5% relative improvement) and even surpasses the strong triplet-loss-based approach.
Tasks Image Retrieval
Published 2018-06-08
URL http://arxiv.org/abs/1806.02984v2
PDF http://arxiv.org/pdf/1806.02984v2.pdf
PWC https://paperswithcode.com/paper/deepfirearm-learning-discriminative-feature
Repo
Framework
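A short PyTorch sketch of a double-margin contrastive loss in the spirit of the abstract: positive pairs are only pulled together while their distance exceeds a positive margin, and negative pairs are only pushed apart while closer than a larger negative margin. The margin values and squared-hinge form below are assumptions, not necessarily the paper's exact formulation.

```python
# Sketch of a double-margin contrastive loss with separate margins for positive
# and negative pairs. Margin values are placeholders, not the paper's settings.
import torch
import torch.nn.functional as F

def double_margin_contrastive(x1, x2, label, m_pos=0.5, m_neg=1.5):
    """label = 1 for matching (same firearm) pairs, 0 for non-matching pairs."""
    d = F.pairwise_distance(x1, x2)                           # Euclidean distance between embeddings
    pos = label * torch.clamp(d - m_pos, min=0).pow(2)        # penalize positives only if d > m_pos
    neg = (1 - label) * torch.clamp(m_neg - d, min=0).pow(2)  # penalize negatives only if d < m_neg
    return 0.5 * (pos + neg).mean()

loss = double_margin_contrastive(torch.randn(8, 128), torch.randn(8, 128),
                                 torch.randint(0, 2, (8,)).float())
```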

Artificial Neural Networks in Fluid Dynamics: A Novel Approach to the Navier-Stokes Equations

Title Artificial Neural Networks in Fluid Dynamics: A Novel Approach to the Navier-Stokes Equations
Authors Megan McCracken
Abstract Neural networks have been used to solve different types of large-data problems in many different fields. This project takes a novel approach to solving the Navier-Stokes equations for turbulence by training a neural network, using Bayesian Cluster and SOM neighbor weighting, to map ionospheric velocity fields based on 3-dimensional inputs. Parameters used in this problem included the velocity, Reynolds number, Prandtl number, and temperature. In this project, data obtained from Johns Hopkins University was used to train the neural network in MATLAB. The neural network was able to map the velocity fields to within sixty-seven percent accuracy of the validation data used. Further studies will focus on higher accuracy and on solving further non-linear differential equations using convolutional neural networks.
Tasks
Published 2018-08-19
URL http://arxiv.org/abs/1808.06604v1
PDF http://arxiv.org/pdf/1808.06604v1.pdf
PWC https://paperswithcode.com/paper/artificial-neural-networks-in-fluid-dynamics
Repo
Framework
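The abstract gives few implementation details, so purely as a generic illustration of the task shape (regressing a velocity quantity from position and flow parameters), here is a small scikit-learn MLP trained on synthetic data. It is not the author's MATLAB or SOM-weighted setup; the inputs, the toy target, and the network size are all placeholders.

```python
# Generic illustration only: a small MLP regressing velocity from position and
# flow parameters (Reynolds number, Prandtl number, temperature). This is not
# the author's MATLAB / SOM-weighted setup; the data here is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(size=(5000, 6))          # columns: x, y, z, Re, Pr, T (scaled to [0, 1])
y = np.sin(2 * np.pi * X[:, 0]) * X[:, 3] + 0.1 * rng.normal(size=5000)  # toy velocity target

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X_tr, y_tr)
print("validation R^2:", model.score(X_val, y_val))
```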

End-to-End Multimodal Speech Recognition

Title End-to-End Multimodal Speech Recognition
Authors Shruti Palaskar, Ramon Sanabria, Florian Metze
Abstract Transcription or sub-titling of open-domain videos is still a challenging domain for Automatic Speech Recognition (ASR) due to the data’s challenging acoustics, variable signal processing and the essentially unrestricted domain of the data. In previous work, we have shown that the visual channel – specifically object and scene features – can help to adapt the acoustic model (AM) and language model (LM) of a recognizer, and we are now expanding this work to end-to-end approaches. In the case of a Connectionist Temporal Classification (CTC)-based approach, we retain the separation of AM and LM, while for a sequence-to-sequence (S2S) approach, both information sources are adapted together, in a single model. This paper also analyzes the behavior of CTC and S2S models on noisy video data (How-To corpus), and compares it to results on the clean Wall Street Journal (WSJ) corpus, providing insight into the robustness of both approaches.
Tasks Language Modelling, Speech Recognition
Published 2018-04-25
URL http://arxiv.org/abs/1804.09713v1
PDF http://arxiv.org/pdf/1804.09713v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-multimodal-speech-recognition
Repo
Framework
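As a minimal sketch of one way to feed visual information into an end-to-end recognizer: tile a per-video visual feature (e.g. an object/scene descriptor) across time and concatenate it to every acoustic frame before a CTC encoder. This is an illustration only; the paper's actual adaptation schemes (AM/LM adaptation for CTC, joint adaptation for S2S) are more involved, and the dimensions below are placeholders.

```python
# Sketch: tile a per-video visual feature across time and concatenate it to each
# acoustic frame before a CTC encoder. Dimensions and the encoder are placeholders.
import torch
import torch.nn as nn

class VisuallyAdaptedCTCEncoder(nn.Module):
    def __init__(self, n_mel=40, n_visual=2048, hidden=320, n_tokens=32):
        super().__init__()
        self.rnn = nn.LSTM(n_mel + n_visual, hidden, num_layers=3,
                           bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tokens)    # CTC output layer (blank + symbols)

    def forward(self, acoustic, visual):
        # acoustic: (B, T, n_mel); visual: (B, n_visual), one vector per video
        visual = visual.unsqueeze(1).expand(-1, acoustic.size(1), -1)
        x, _ = self.rnn(torch.cat([acoustic, visual], dim=-1))
        # per-frame log-probs; transpose to (T, B, C) before passing to nn.CTCLoss
        return self.out(x).log_softmax(-1)

logp = VisuallyAdaptedCTCEncoder()(torch.randn(2, 200, 40), torch.randn(2, 2048))
```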

Incorporating Behavioral Constraints in Online AI Systems

Title Incorporating Behavioral Constraints in Online AI Systems
Authors Avinash Balakrishnan, Djallel Bouneffouf, Nicholas Mattei, Francesca Rossi
Abstract AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criteria, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints as a guide when making decisions in an online setting while still being reactive to reward feedback. To define this agent, we propose to adopt a novel extension to the classical contextual multi-armed bandit setting and we provide a new algorithm called Behavior Constrained Thompson Sampling (BCTS) that allows for online learning while obeying exogenous constraints. Our agent learns a constrained policy that implements the observed behavioral constraints demonstrated by a teacher agent, and then uses this constrained policy to guide the reward-based online exploration and exploitation. We characterize the upper bound on the expected regret of the contextual bandit algorithm that underlies our agent and provide a case study with real world data in two application domains. Our experiments show that the designed agent is able to act within the set of behavior constraints without significantly degrading its overall reward performance.
Tasks
Published 2018-09-15
URL http://arxiv.org/abs/1809.05720v1
PDF http://arxiv.org/pdf/1809.05720v1.pdf
PWC https://paperswithcode.com/paper/incorporating-behavioral-constraints-in
Repo
Framework
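A toy sketch of the general idea behind constraint-guided Thompson sampling in a bandit setting: sample per-arm reward estimates from the posterior, but restrict the argmax to arms allowed by a learned constraint policy. The Beta/Bernoulli setup and the hard-coded constraint function below are stand-ins; the paper's BCTS algorithm (contextual posteriors and constraints learned from a teacher) differs.

```python
# Toy sketch: Thompson sampling restricted to a set of allowed arms per context.
# The Beta/Bernoulli bandit and the constraint function are placeholders for the
# contextual BCTS algorithm described in the paper.
import numpy as np

rng = np.random.default_rng(0)
n_arms, horizon = 5, 1000
alpha = np.ones(n_arms)                      # Beta posterior parameters per arm
beta = np.ones(n_arms)
true_p = rng.uniform(0.2, 0.8, n_arms)       # unknown reward probabilities

def allowed_arms(context):
    """Stand-in for the constraint policy learned by observing a teacher agent."""
    return [a for a in range(n_arms) if a != context % n_arms]   # forbid one arm per context

for t in range(horizon):
    samples = rng.beta(alpha, beta)                  # posterior sample for every arm
    arm = max(allowed_arms(t), key=lambda a: samples[a])   # greedy over allowed arms only
    reward = rng.binomial(1, true_p[arm])
    alpha[arm] += reward                             # posterior update from reward feedback
    beta[arm] += 1 - reward
```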

Asplünd’s metric defined in the Logarithmic Image Processing (LIP) framework for colour and multivariate images

Title Asplünd’s metric defined in the Logarithmic Image Processing (LIP) framework for colour and multivariate images
Authors Guillaume Noyel, Michel Jourlin
Abstract Asplünd’s metric, which is useful for pattern matching, consists in a double-sided probing, i.e. the over-graph and the sub-graph of a function are probed jointly. It has previously been defined for grey-scale images using the Logarithmic Image Processing (LIP) framework. LIP is a non-linear model to perform operations between images while being consistent with the human visual system. Our contribution consists in extending Asplünd’s metric to colour and multivariate images using the LIP framework. Asplünd’s metric is insensitive to lighting variations, and we propose a colour variant which is robust to noise.
Tasks
Published 2018-03-02
URL http://arxiv.org/abs/1803.00764v1
PDF http://arxiv.org/pdf/1803.00764v1.pdf
PWC https://paperswithcode.com/paper/asplunds-metric-defined-in-the-logarithmic
Repo
Framework
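For context, the grey-scale LIP operations on images with values in $[0, M)$ are the non-linear addition and scalar multiplication below, and one common statement of the grey-scale Asplünd metric compares a function $f$ to a probe $g$ through the two extreme LIP-multiplicative factors that bracket it. This is background for the grey-scale case only (and the exact probing convention may differ from the paper); the colour and multivariate extension is the paper's contribution.

$$ f \oplus g = f + g - \frac{f\,g}{M}, \qquad \lambda \otimes f = M - M\left(1 - \frac{f}{M}\right)^{\lambda} $$

$$ d_{\mathrm{As}}(f, g) = \ln\frac{\lambda^{*}}{\mu^{*}}, \qquad \lambda^{*} = \inf\{\lambda : \lambda \otimes g \ge f\}, \quad \mu^{*} = \sup\{\mu : \mu \otimes g \le f\} $$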

Undermining User Privacy on Mobile Devices Using AI

Title Undermining User Privacy on Mobile Devices Using AI
Authors Berk Gulmezoglu, Andreas Zankl, Caner Tol, Saad Islam, Thomas Eisenbarth, Berk Sunar
Abstract Over the past years, the literature has shown that attacks exploiting the microarchitecture of modern processors pose a serious threat to the privacy of mobile phone users. This is because applications leave distinct footprints in the processor, which can be used by malware to infer user activities. In this work, we show that these inference attacks are considerably more practical when combined with advanced AI techniques. In particular, we focus on profiling the activity in the last-level cache (LLC) of ARM processors. We employ a simple Prime+Probe based monitoring technique to obtain cache traces, which we classify with Deep Learning methods including Convolutional Neural Networks. We demonstrate our approach on an off-the-shelf Android phone by launching a successful attack from an unprivileged, zero-permission app in well under a minute. The app thereby detects running applications with an accuracy of 98% and reveals opened websites and streaming videos by monitoring the LLC for at most 6 seconds. This is possible because Deep Learning compensates for measurement disturbances stemming from the inherently noisy LLC monitoring and from unfavorable cache characteristics such as random line replacement policies. In summary, our results show that, thanks to advanced AI techniques, inference attacks are becoming alarmingly easy to implement and execute in practice. This once more calls for countermeasures that confine microarchitectural leakage and protect mobile phone applications, especially those valuing the privacy of their users.
Tasks
Published 2018-11-27
URL http://arxiv.org/abs/1811.11218v1
PDF http://arxiv.org/pdf/1811.11218v1.pdf
PWC https://paperswithcode.com/paper/undermining-user-privacy-on-mobile-devices
Repo
Framework

Compassionately Conservative Balanced Cuts for Image Segmentation

Title Compassionately Conservative Balanced Cuts for Image Segmentation
Authors Nathan D. Cahill, Tyler L. Hayes, Renee T. Meinhold, John F. Hamilton
Abstract The Normalized Cut (NCut) objective function, widely used in data clustering and image segmentation, quantifies the cost of graph partitioning in a way that biases clusters or segments that are balanced towards having lower values than unbalanced partitionings. However, this bias is so strong that it avoids any singleton partitions, even when vertices are very weakly connected to the rest of the graph. Motivated by the Bühler-Hein family of balanced cut costs, we propose the family of Compassionately Conservative Balanced (CCB) Cut costs, which are indexed by a parameter that can be used to strike a compromise between the desire to avoid too many singleton partitions and the notion that all partitions should be balanced. We show that CCB-Cut minimization can be relaxed into an orthogonally constrained $\ell_{\tau}$-minimization problem that coincides with the problem of computing Piecewise Flat Embeddings (PFE) for one particular index value, and we present an algorithm for solving the relaxed problem by iteratively minimizing a sequence of reweighted Rayleigh quotients (IRRQ). Using images from the BSDS500 database, we show that image segmentation based on CCB-Cut minimization provides better accuracy with respect to ground truth and greater variability in region size than NCut-based image segmentation.
Tasks graph partitioning, Semantic Segmentation
Published 2018-03-27
URL http://arxiv.org/abs/1803.09903v1
PDF http://arxiv.org/pdf/1803.09903v1.pdf
PWC https://paperswithcode.com/paper/compassionately-conservative-balanced-cuts
Repo
Framework
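For reference, the standard two-way Normalized Cut cost that the CCB family generalizes is shown below; balanced-cut families such as the Bühler-Hein cuts and the proposed CCB cuts replace the volume normalization with other balancing terms indexed by a parameter (see the paper for the exact CCB form).

$$ \mathrm{NCut}(A, \bar{A}) = \frac{\mathrm{cut}(A, \bar{A})}{\mathrm{vol}(A)} + \frac{\mathrm{cut}(A, \bar{A})}{\mathrm{vol}(\bar{A})}, \qquad \mathrm{cut}(A, \bar{A}) = \sum_{i \in A,\ j \in \bar{A}} w_{ij}, \quad \mathrm{vol}(A) = \sum_{i \in A} \sum_{j} w_{ij} $$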

Learning-Based Compressive MRI

Title Learning-Based Compressive MRI
Authors Baran Gözcü, Rabeeh Karimi Mahabadi, Yen-Huan Li, Efe Ilıcak, Tolga Çukur, Jonathan Scarlett, Volkan Cevher
Abstract In the area of magnetic resonance imaging (MRI), an extensive range of non-linear reconstruction algorithms has been proposed that can be used with general Fourier subsampling patterns. However, the design of these subsampling patterns has typically been considered in isolation from the reconstruction rule and the anatomy under consideration. In this paper, we propose a learning-based framework for optimizing MRI subsampling patterns for a specific reconstruction rule and anatomy, considering both the noiseless and noisy settings. Our learning algorithm has access to a representative set of training signals, and searches for a sampling pattern that performs well on average for the signals in this set. We present a novel parameter-free greedy mask selection method, and show it to be effective for a variety of reconstruction rules and performance metrics. Moreover, we support our numerical findings by providing a rigorous justification of our framework via statistical learning theory.
Tasks
Published 2018-05-03
URL http://arxiv.org/abs/1805.01266v1
PDF http://arxiv.org/pdf/1805.01266v1.pdf
PWC https://paperswithcode.com/paper/learning-based-compressive-mri
Repo
Framework
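A schematic Python sketch of the greedy mask-selection idea described above: repeatedly add the candidate sampling element (e.g. a k-space phase-encode line) that most improves average reconstruction quality over the training signals. The `reconstruct` and `quality` callables are placeholders for the user's reconstruction rule and performance metric, not the paper's specific choices.

```python
# Schematic sketch of parameter-free greedy mask selection: at each step, add the
# candidate phase-encode line that gives the best average reconstruction quality
# on the training set. `reconstruct` and `quality` are user-supplied placeholders.
import numpy as np

def greedy_mask(train_images, candidate_lines, budget, reconstruct, quality):
    """candidate_lines: row indices of k-space lines; budget: number of lines to select."""
    n_rows = train_images[0].shape[0]
    mask = np.zeros(n_rows, dtype=bool)            # which phase-encode lines are sampled
    for _ in range(budget):
        best_score, best_line = -np.inf, None
        for line in candidate_lines:
            if mask[line]:
                continue
            trial = mask.copy()
            trial[line] = True                     # tentatively add this line
            # average quality of reconstructions from the subsampled measurements
            score = np.mean([quality(img, reconstruct(img, trial)) for img in train_images])
            if score > best_score:
                best_score, best_line = score, line
        mask[best_line] = True
    return mask
```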

Complexity and mission computability of adaptive computing systems

Title Complexity and mission computability of adaptive computing systems
Authors Venkat R. Dasari, Mee Seong Im, Billy Geerhart
Abstract There is a subset of computational problems that are computable in polynomial time but for which an existing algorithm may not complete due to a lack of high-performance technology on a mission field. We define a subclass of the deterministic polynomial-time complexity class, called the mission class, as many polynomial problems are not computable in mission time. By focusing on this subclass of languages in the context of successful military applications, we also discuss their computational and communication constraints. We investigate feasible (non)linear models that will minimize energy and maximize memory, efficiency, and computational power, and that provide an approximate solution, obtained within a pre-determined length of computation time using limited resources, so that an optimal solution to a language could be determined.
Tasks
Published 2018-08-29
URL http://arxiv.org/abs/1808.09586v1
PDF http://arxiv.org/pdf/1808.09586v1.pdf
PWC https://paperswithcode.com/paper/complexity-and-mission-computability-of
Repo
Framework

Densely Connected High Order Residual Network for Single Frame Image Super Resolution

Title Densely Connected High Order Residual Network for Single Frame Image Super Resolution
Authors Yiwen Huang, Ming Qin
Abstract Deep convolutional neural networks (DCNNs) have recently been widely adopted for research on super resolution; however, previous work has focused mainly on stacking as many layers as possible. In this paper, we present a new perspective on image restoration problems: the neural network model can be constructed to reflect the physical significance of the image restoration process, that is, the a priori knowledge of image restoration can be embedded directly into the structure of the model. We employ a symmetric non-linear colorspace, the sigmoidal transfer, to replace traditional transfers such as sRGB and Rec. 709, which are asymmetric non-linear colorspaces. We also propose a “reuse plus patch” method to deal with super resolution at different scaling factors. Our proposed methods and model show generally superior performance over previous work, even though our model was only roughly trained and may still underfit the training set.
Tasks Image Restoration, Image Super-Resolution, Super-Resolution
Published 2018-04-16
URL http://arxiv.org/abs/1804.05902v1
PDF http://arxiv.org/pdf/1804.05902v1.pdf
PWC https://paperswithcode.com/paper/densely-connected-high-order-residual-network
Repo
Framework
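Purely as a hypothetical illustration of what a symmetric sigmoidal transfer (replacing an asymmetric encoding such as the sRGB gamma) might look like when preparing inputs: the paper does not specify its transfer here, so the function below and its gain parameter are assumptions, shown alongside the standard sRGB encoding for contrast.

```python
# Hypothetical illustration of a symmetric sigmoidal transfer applied to linear
# RGB in [0, 1], contrasted with the asymmetric sRGB encoding it would replace.
# The exact transfer used in the paper is not reproduced here.
import numpy as np

def srgb_encode(x):
    """Standard (asymmetric) sRGB opto-electronic transfer."""
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * np.power(x, 1 / 2.4) - 0.055)

def sigmoidal_encode(x, gain=8.0):
    """Hypothetical symmetric sigmoidal transfer, centred at mid-grey and
    rescaled so that 0 maps to 0 and 1 maps to 1."""
    s = 1.0 / (1.0 + np.exp(-gain * (x - 0.5)))
    s0 = 1.0 / (1.0 + np.exp(gain * 0.5))      # value at x = 0
    s1 = 1.0 / (1.0 + np.exp(-gain * 0.5))     # value at x = 1
    return (s - s0) / (s1 - s0)

x = np.linspace(0.0, 1.0, 5)
print(srgb_encode(x), sigmoidal_encode(x))
```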

Efficient texture retrieval using multiscale local extrema descriptors and covariance embedding

Title Efficient texture retrieval using multiscale local extrema descriptors and covariance embedding
Authors Minh-Tan Pham
Abstract This paper presents an efficient method for texture retrieval using multiscale feature extraction and embedding based on local extrema keypoints. The idea is to first represent each texture image by its local maximum and local minimum pixels. The image is then divided into regular overlapping blocks, and each block is characterized by a feature vector constructed from the radiometric, geometric and structural information of its local extrema. All feature vectors are finally embedded into a covariance matrix, which is exploited for dissimilarity measurement within the retrieval task. Thanks to the method’s simplicity, a multiscale scheme can easily be implemented to improve its scale-space representation capacity. We argue that our handcrafted features are easy to implement and fast to run, yet provide very competitive performance compared to handcrafted and CNN-based learned descriptors from the literature. In particular, the proposed framework provides highly competitive retrieval rates for several texture databases, including 94.95% for MIT Vistex, 79.87% for Stex, 76.15% for Outex TC-00013 and 89.74% for USPtex.
Tasks
Published 2018-08-03
URL http://arxiv.org/abs/1808.01124v1
PDF http://arxiv.org/pdf/1808.01124v1.pdf
PWC https://paperswithcode.com/paper/efficient-texture-retrieval-using-multiscale
Repo
Framework
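A compact sketch of the overall pipeline described above: detect local maxima and minima, build simple per-block feature vectors from the extrema, and summarize each image by a covariance matrix compared with a log-Euclidean distance. The per-block features below (mean extrema intensities and extrema counts) are simplified stand-ins for the paper's radiometric/geometric/structural descriptors, and the block sizes are assumptions.

```python
# Compact sketch: local extrema detection -> per-block features -> covariance
# embedding -> log-Euclidean dissimilarity. Features are simplified stand-ins.
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter
from scipy.linalg import logm

def local_extrema(img, size=3):
    mx = img == maximum_filter(img, size)          # local maximum pixels
    mn = img == minimum_filter(img, size)          # local minimum pixels
    return mx, mn

def block_features(img, block=32, step=16):
    mx, mn = local_extrema(img)
    feats = []
    for r in range(0, img.shape[0] - block + 1, step):
        for c in range(0, img.shape[1] - block + 1, step):
            w = np.s_[r:r + block, c:c + block]
            feats.append([img[w][mx[w]].mean() if mx[w].any() else 0.0,   # radiometric (maxima)
                          img[w][mn[w]].mean() if mn[w].any() else 0.0,   # radiometric (minima)
                          mx[w].sum(), mn[w].sum()])                      # structural density
    return np.asarray(feats)

def covariance_embedding(img):
    f = block_features(img)
    return np.cov(f, rowvar=False) + 1e-6 * np.eye(f.shape[1])   # regularized covariance

def log_euclidean_dist(c1, c2):
    return np.linalg.norm(logm(c1) - logm(c2), 'fro')            # dissimilarity for retrieval
```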

Transferable Adversarial Attacks for Image and Video Object Detection

Title Transferable Adversarial Attacks for Image and Video Object Detection
Authors Xingxing Wei, Siyuan Liang, Ning Chen, Xiaochun Cao
Abstract Adversarial examples have been demonstrated to threaten many computer vision tasks, including object detection. However, existing attacking methods for object detection have two limitations: poor transferability, meaning that the generated adversarial examples have a low success rate when attacking other kinds of detection methods, and high computation cost, meaning that they need more time to generate an adversarial image and are therefore difficult to apply to video data. To address these issues, we utilize a generative mechanism to obtain the adversarial image and video. In this way, the processing time is reduced. To enhance transferability, we destroy the feature maps extracted from the feature network, which usually constitutes the basis of object detectors. The proposed method is based on the Generative Adversarial Network (GAN) framework, where we combine a high-level class loss and a low-level feature loss to jointly train the adversarial example generator. A series of experiments conducted on the PASCAL VOC and ImageNet VID datasets shows that our method can efficiently generate image and video adversarial examples and, more importantly, that these adversarial examples have better transferability, and thus are able to simultaneously attack two kinds of representative object detection models: proposal-based models like Faster-RCNN, and regression-based models like SSD.
Tasks Object Detection, Video Object Detection
Published 2018-11-30
URL https://arxiv.org/abs/1811.12641v5
PDF https://arxiv.org/pdf/1811.12641v5.pdf
PWC https://paperswithcode.com/paper/transferable-adversarial-attacks-for-image
Repo
Framework
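A minimal sketch of the kind of joint generator objective described above: a high-level class loss that drives predictions toward background combined with a low-level loss that pushes the attacked feature maps away from those of the clean image. The networks, the background-class convention, the perturbation bound, and the weighting are placeholders, not the paper's exact formulation.

```python
# Sketch of a joint objective for an adversarial-example generator: high-level
# class loss plus low-level feature loss. Networks and weights are placeholders.
import torch
import torch.nn.functional as F

def generator_objective(generator, backbone, cls_head, images, eps=8 / 255, lam=1.0):
    # bounded additive perturbation produced by the generator
    adv = torch.clamp(images + eps * torch.tanh(generator(images)), 0.0, 1.0)
    feats_adv, feats_clean = backbone(adv), backbone(images).detach()
    # high-level class loss: push class scores toward the background class (index 0 assumed)
    scores = cls_head(feats_adv)
    background = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    class_loss = F.cross_entropy(scores, background)
    # low-level feature loss: maximize distance between attacked and clean feature maps
    feature_loss = -F.mse_loss(feats_adv, feats_clean)
    return class_loss + lam * feature_loss
```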

Multiple Instance Learning for Heterogeneous Images: Training a CNN for Histopathology

Title Multiple Instance Learning for Heterogeneous Images: Training a CNN for Histopathology
Authors Heather D. Couture, J. S. Marron, Charles M. Perou, Melissa A. Troester, Marc Niethammer
Abstract Multiple instance (MI) learning with a convolutional neural network enables end-to-end training in the presence of weak image-level labels. We propose a new method for aggregating predictions from smaller regions of the image into an image-level classification by using the quantile function. The quantile function provides a more complete description of the heterogeneity within each image, improving image-level classification. We also adapt image augmentation to the MI framework by randomly selecting cropped regions on which to apply MI aggregation during each epoch of training. This provides a mechanism to study the importance of MI learning. We validate our method on five different classification tasks for breast tumor histology and provide a visualization method for interpreting local image classifications that could lead to future insights into tumor heterogeneity.
Tasks Image Augmentation, Multiple Instance Learning
Published 2018-06-13
URL http://arxiv.org/abs/1806.05083v1
PDF http://arxiv.org/pdf/1806.05083v1.pdf
PWC https://paperswithcode.com/paper/multiple-instance-learning-for-heterogeneous
Repo
Framework
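A small sketch of quantile-function aggregation for multiple-instance learning: patch-level (instance) predictions are summarized by a fixed grid of quantiles, which then feed an image-level classifier. The quantile grid and the linear head are assumptions, not the paper's exact setup.

```python
# Sketch of quantile-function aggregation for MI learning: summarize patch-level
# scores by a fixed grid of quantiles and classify the image from that summary.
# The quantile grid and linear head are assumptions.
import torch
import torch.nn as nn

class QuantileMILHead(nn.Module):
    def __init__(self, n_classes, n_quantiles=16):
        super().__init__()
        self.register_buffer("q", torch.linspace(0.0, 1.0, n_quantiles))
        self.head = nn.Linear(n_quantiles, n_classes)

    def forward(self, patch_scores):
        # patch_scores: (B, N) instance-level scores for N patches per image
        summary = torch.quantile(patch_scores, self.q, dim=1).transpose(0, 1)  # (B, n_quantiles)
        return self.head(summary)                                              # image-level logits

logits = QuantileMILHead(n_classes=5)(torch.rand(4, 300))
```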