January 29, 2020

2932 words 14 mins read

Paper Group ANR 562

Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning. Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators. Synthetic Neural Vision System Design for Motion Pattern Recognition in Dynamic Robot Scenes. On Predictive Information in RNNs. ContCap: A comprehensive framework …

Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning

Title Trident: Efficient 4PC Framework for Privacy Preserving Machine Learning
Authors Rahul Rachuri, Ajith Suresh
Abstract Machine learning has started to be deployed in fields such as healthcare and finance, which has propelled the need for and growth of privacy-preserving machine learning (PPML). We propose an actively secure four-party protocol (4PC), and a framework for PPML, showcasing its applications on four of the most widely-known machine learning algorithms – Linear Regression, Logistic Regression, Neural Networks, and Convolutional Neural Networks. Our 4PC protocol, which tolerates at most one malicious corruption, is practically efficient compared to existing works. We use the protocol to build an efficient mixed-world framework (Trident) to switch between the Arithmetic, Boolean, and Garbled worlds. Our framework operates in the offline-online paradigm over rings and is instantiated in an outsourced setting for machine learning. We also propose conversions especially relevant to privacy-preserving machine learning. The highlights of our framework include using a minimal number of expensive circuits overall compared to ABY3. This can be seen in our technique for truncation, which does not affect the online cost of multiplication and removes the need for any circuits in the offline phase. Our B2A conversion improves rounds by $\mathbf{7} \times$ and communication complexity by $\mathbf{18} \times$. In addition, all of the special conversions for machine learning, e.g. Secure Comparison, achieve constant round complexity. The practicality of our framework is argued through improved benchmarks for the aforementioned algorithms compared with ABY3. All the protocols are implemented over a 64-bit ring in both LAN and WAN settings. Our improvements go up to $\mathbf{187} \times$ for the training phase and $\mathbf{158} \times$ for the prediction phase over both LAN and WAN.
Tasks
Published 2019-12-05
URL https://arxiv.org/abs/1912.02631v1
PDF https://arxiv.org/pdf/1912.02631v1.pdf
PWC https://paperswithcode.com/paper/trident-efficient-4pc-framework-for-privacy
Repo
Framework
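
The truncation discussion above presupposes fixed-point arithmetic over the 64-bit ring: a value carries f fractional bits, a product carries 2f, and must be shifted back by f. The sketch below shows only this plaintext fixed-point arithmetic, not the Trident protocol itself, which performs the shift on secret-shared values without adding to the online cost of multiplication; the choice of 13 fractional bits is an assumption.

```python
# Minimal plaintext sketch of fixed-point arithmetic over the ring Z_{2^64}.
# This is NOT the Trident 4PC protocol; it only shows why truncation is needed
# after a multiplication (the product carries twice the fractional bits).
RING_BITS = 64
F = 13                        # fractional bits; an assumed precision choice
MASK = (1 << RING_BITS) - 1   # reduction modulo 2^64

def encode(x: float) -> int:
    """Embed a real number into the ring in fixed-point form."""
    return int(round(x * (1 << F))) & MASK

def decode(v: int) -> float:
    """Interpret a ring element as a signed fixed-point real number."""
    if v >= 1 << (RING_BITS - 1):
        v -= 1 << RING_BITS
    return v / (1 << F)

def truncate(v: int) -> int:
    """Arithmetic right-shift by F bits, dropping the extra fractional bits."""
    signed = v - (1 << RING_BITS) if v >= 1 << (RING_BITS - 1) else v
    return (signed >> F) & MASK

a, b = encode(1.5), encode(-2.25)
prod = (a * b) & MASK           # the product now carries 2*F fractional bits
print(decode(truncate(prod)))   # -3.375
```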

Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators

Title Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators
Authors Kuang-Huei Lee, Hamid Palangi, Xi Chen, Houdong Hu, Jianfeng Gao
Abstract Grounding language to visual relations is critical to various language-and-vision applications. In this work, we tackle two fundamental language-and-vision tasks: image-text matching and image captioning, and demonstrate that neural scene graph generators can learn effective visual relation features to facilitate grounding language to visual relations and subsequently improve the two end applications. By combining relation features with state-of-the-art models, our experiments show significant improvement on the standard Flickr30K and MSCOCO benchmarks. Our experimental results and analysis show that relation features improve downstream models’ capability of capturing visual relations in end vision-and-language applications. We also demonstrate that training scene graph generators on visually relevant relations is important for the effectiveness of the relation features.
Tasks Image Captioning, Text Matching
Published 2019-09-22
URL https://arxiv.org/abs/1909.09953v1
PDF https://arxiv.org/pdf/1909.09953v1.pdf
PWC https://paperswithcode.com/paper/190909953
Repo
Framework
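
As a rough illustration of what "combining relation features with the state-of-the-art models" could look like in the matching task, the sketch below pools region features and relation features from a (stand-in) scene graph generator into one image embedding and scores an image-caption pair by cosine similarity. The shapes, the mean pooling, and the concatenation-based fusion are our assumptions, not the authors' architecture.

```python
# Bare-bones sketch of late fusion of region and relation features for matching.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def image_embedding(region_feats, relation_feats):
    """region_feats: (R, d); relation_feats: (E, d) from a scene graph generator."""
    fused = np.concatenate([region_feats.mean(axis=0),
                            relation_feats.mean(axis=0)])   # simple late fusion
    return fused / (np.linalg.norm(fused) + 1e-12)

rng = np.random.default_rng(0)
img = image_embedding(rng.normal(size=(36, 256)), rng.normal(size=(20, 256)))
caption = rng.normal(size=512)              # stand-in caption embedding
caption /= np.linalg.norm(caption)
print(cosine(img, caption))                 # matching score for this pair
```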

Synthetic Neural Vision System Design for Motion Pattern Recognition in Dynamic Robot Scenes

Title Synthetic Neural Vision System Design for Motion Pattern Recognition in Dynamic Robot Scenes
Authors Qinbing Fu, Cheng Hu, Pengcheng Liu, Shigang Yue
Abstract Insects have tiny brains but complicated visual systems for motion perception. A handful of insect visual neurons have been computationally modeled and successfully applied in robotics; how different neurons collaborate on motion perception is still an open question. In this paper, we propose a novel embedded vision system in autonomous micro-robots, to recognize motion patterns in dynamic robot scenes. Here, the basic motion patterns are categorized into movements of looming (proximity), recession, translation, and other irrelevant ones. The presented system is a synthetic neural network, which comprises two complementary sub-systems with four spiking neurons – the lobula giant movement detectors (LGMD1 and LGMD2) in locusts for sensing looming and recession, and the direction selective neurons (DSN-R and DSN-L) in flies for translational motion extraction. Images are transformed to spikes via spatiotemporal computations and fed to a switch function and decision-making mechanisms, in order to invoke proper robot behaviors amongst collision avoidance, tracking, and wandering in dynamic robot scenes. Our robot experiments demonstrated two main contributions: (1) the neural vision system is effective at recognizing the basic motion patterns and invoking timely, proper robot behaviors in dynamic scenes; (2) the arena tests with multiple robots demonstrated its effectiveness in recognizing richer motion features for collision detection, a clear improvement over former studies.
Tasks Decision Making
Published 2019-04-15
URL http://arxiv.org/abs/1904.07180v1
PDF http://arxiv.org/pdf/1904.07180v1.pdf
PWC https://paperswithcode.com/paper/synthetic-neural-vision-system-design-for
Repo
Framework
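
A toy sketch of the decision-making stage described above: the four modeled neurons' responses are mapped to one of the robot behaviours named in the abstract. The thresholds and the priority order are illustrative assumptions, not the authors' switch function.

```python
# Toy behaviour switch over the four model neurons' normalized spike rates.
def select_behavior(lgmd1, lgmd2, dsn_r, dsn_l, threshold=0.5):
    """Spike rates in [0, 1] -> one of the behaviours named in the abstract."""
    if lgmd1 > threshold or lgmd2 > threshold:   # looming/recession signal
        return "collision_avoidance"
    if dsn_r > threshold or dsn_l > threshold:   # translational motion signal
        return "tracking"
    return "wandering"

print(select_behavior(0.9, 0.1, 0.2, 0.0))       # -> collision_avoidance
```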

On Predictive Information in RNNs

Title On Predictive Information in RNNs
Authors Zhe Dong, Deniz Oktay, Ben Poole, Alexander A. Alemi
Abstract Certain biological neurons demonstrate a remarkable capability to optimally compress the history of sensory inputs while being maximally informative about the future. In this work, we investigate if the same can be said of artificial neurons in recurrent neural networks (RNNs) trained with maximum likelihood. Empirically, we find that RNNs are suboptimal in the information plane. Instead of optimally compressing past information, they extract additional information that is not relevant for predicting the future. We show that constraining past information by injecting noise into the hidden state can improve RNNs in several ways: optimality in the predictive information plane, sample quality, heldout likelihood, and downstream classification performance.
Tasks
Published 2019-10-21
URL https://arxiv.org/abs/1910.09578v2
PDF https://arxiv.org/pdf/1910.09578v2.pdf
PWC https://paperswithcode.com/paper/on-predictive-information-sub-optimality-of-1
Repo
Framework
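
A minimal numpy sketch of the core intervention: injecting Gaussian noise into the recurrent state so that it cannot carry unlimited past information. The cell type, sizes, and noise scale are illustrative assumptions, not the authors' training setup.

```python
# Vanilla RNN step with Gaussian noise on the hidden state as an information
# bottleneck on the past: larger noise_std means less past information carried.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_h = 4, 16
Wx = rng.normal(scale=0.1, size=(d_h, d_in))
Wh = rng.normal(scale=0.1, size=(d_h, d_h))
b = np.zeros(d_h)

def rnn_step(h, x, noise_std=0.1):
    """One recurrent update, then additive Gaussian noise on the new state."""
    h_new = np.tanh(Wx @ x + Wh @ h + b)
    return h_new + rng.normal(scale=noise_std, size=d_h)

h = np.zeros(d_h)
for t in range(10):                    # unroll over a toy input sequence
    h = rnn_step(h, rng.normal(size=d_in))
print(h.shape)                         # (16,)
```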

ContCap: A comprehensive framework for continual image captioning

Title ContCap: A comprehensive framework for continual image captioning
Authors Giang Nguyen, Tae Joon Jun, Trung Tran, Daeyoung Kim
Abstract While cutting-edge image captioning systems increasingly describe an image coherently and accurately, recent progress in continual learning allows deep learning systems to avoid catastrophic forgetting. However, the setting in which image captioning is combined with continual learning has not been explored yet. We define the task that consolidates continual learning and image captioning as continual image captioning. In this work, we propose ContCap, a framework that continually generates captions over a series of incoming tasks, seamlessly integrating continual learning into image captioning while tackling catastrophic forgetting. After demonstrating catastrophic forgetting in image captioning, we employ freezing, knowledge distillation, and pseudo-labeling techniques to overcome the forgetting dilemma, with a simple fine-tuning scheme as the baseline. We split the MS-COCO 2014 dataset to perform experiments on incremental tasks without revisiting the data of previously provided tasks. The experiments are designed to increase the degree of catastrophic forgetting and to appraise the capacity of the approaches. Experimental results show remarkable improvements in performance on the old tasks, while performance on the new task remains almost the same as with fine-tuning. For example, pseudo-labeling increases CIDEr on the old task from 0.287 to 0.576, while BLEU1 on the new task only drops from 0.686 to 0.657.
Tasks Continual Learning, Image Captioning
Published 2019-09-19
URL https://arxiv.org/abs/1909.08745v1
PDF https://arxiv.org/pdf/1909.08745v1.pdf
PWC https://paperswithcode.com/paper/contcap-a-comprehensive-framework-for
Repo
Framework
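
A hedged sketch of two of the ingredients named in the abstract: knowledge distillation against the frozen old captioner, and pseudo-labels produced by it as extra supervision. Tensor shapes, the temperature, and how the terms are combined are illustrative assumptions, not the paper's exact configuration.

```python
# Distillation loss and pseudo-labels for a continual captioner (toy shapes).
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(new_logits, old_logits, T=2.0):
    """Cross-entropy of the new model against the old model's soft targets."""
    p_old = softmax(old_logits, T)
    log_p_new = np.log(softmax(new_logits, T) + 1e-12)
    return -(p_old * log_p_new).sum(axis=-1).mean()

def pseudo_labels(old_logits):
    """Hard captions from the frozen old model, used as extra supervision."""
    return old_logits.argmax(axis=-1)

rng = np.random.default_rng(0)
old = rng.normal(size=(2, 5, 100))      # per-token logits: (batch, time, vocab)
new = rng.normal(size=(2, 5, 100))
loss = distillation_loss(new, old)      # keeps old-task behaviour
labels = pseudo_labels(old)             # (2, 5) token ids
print(loss, labels.shape)
```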

Photon-Flooded Single-Photon 3D Cameras

Title Photon-Flooded Single-Photon 3D Cameras
Authors Anant Gupta, Atul Ingle, Andreas Velten, Mohit Gupta
Abstract Single photon avalanche diodes (SPADs) are starting to play a pivotal role in the development of photon-efficient, long-range LiDAR systems. However, due to non-linearities in their image formation model, a high photon flux (e.g., due to strong sunlight) leads to distortion of the incident temporal waveform, and potentially, large depth errors. Operating SPADs in low flux regimes can mitigate these distortions, but, often requires attenuating the signal and thus, results in low signal-to-noise ratio. In this paper, we address the following basic question: what is the optimal photon flux that a SPAD-based LiDAR should be operated in? We derive a closed form expression for the optimal flux, which is quasi-depth-invariant, and depends on the ambient light strength. The optimal flux is lower than what a SPAD typically measures in real world scenarios, but surprisingly, considerably higher than what is conventionally suggested for avoiding distortions. We propose a simple, adaptive approach for achieving the optimal flux by attenuating incident flux based on an estimate of ambient light strength. Using extensive simulations and a hardware prototype, we show that the optimal flux criterion holds for several depth estimators, under a wide range of illumination conditions.
Tasks
Published 2019-03-20
URL http://arxiv.org/abs/1903.08347v2
PDF http://arxiv.org/pdf/1903.08347v2.pdf
PWC https://paperswithcode.com/paper/photon-flooded-single-photon-3d-cameras
Repo
Framework
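
The adaptive scheme in the abstract can be pictured as: estimate the ambient photon flux, then attenuate the incident flux toward an operating point that depends on that estimate. The sketch below is only that picture; `target_flux_for` is a placeholder standing in for the paper's closed-form optimal flux, which is not reproduced here, and all numbers and the estimator are assumptions.

```python
# Placeholder sketch of flux-adaptive attenuation for a SPAD LiDAR pixel.
def estimate_ambient_flux(background_counts, n_cycles, bin_time):
    """Crude ambient flux estimate (photons/s) from off-peak SPAD counts."""
    return background_counts / (n_cycles * bin_time)

def attenuation_factor(ambient_flux, target_flux_for=lambda amb: 0.5 * amb):
    """Fraction of incident flux to let through (at most 1).
    target_flux_for is a stand-in for the paper's optimal-flux expression."""
    return min(1.0, target_flux_for(ambient_flux) / max(ambient_flux, 1e-12))

amb = estimate_ambient_flux(background_counts=5e4, n_cycles=1e5, bin_time=1e-9)
print(amb, attenuation_factor(amb))
```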

CoSegNet: Deep Co-Segmentation of 3D Shapes with Group Consistency Loss

Title CoSegNet: Deep Co-Segmentation of 3D Shapes with Group Consistency Loss
Authors Chenyang Zhu, Kai Xu, Siddhartha Chaudhuri, Li Yi, Leonidas Guibas, Hao Zhang
Abstract We introduce CoSegNet, a deep neural network architecture for co-segmentation of a set of 3D shapes represented as point clouds. CoSegNet takes as input a set of unsegmented shapes, proposes per-shape parts, and then jointly optimizes the part labelings across the set subjected to a novel group consistency loss expressed via matrix rank estimates. The proposals are refined in each iteration by an auxiliary network that acts as a weak regularizing prior, pre-trained to denoise noisy, unlabeled parts from a large collection of segmented 3D shapes, where the part compositions within the same object category can be highly inconsistent. The output is a consistent part labeling for the input set, with each shape segmented into up to K (a user-specified hyperparameter) parts. The overall pipeline is thus weakly supervised, producing consistent segmentations tailored to the test set, without consistent ground-truth segmentations. We show qualitative and quantitative results from CoSegNet and evaluate it via ablation studies and comparisons to state-of-the-art co-segmentation methods.
Tasks
Published 2019-03-25
URL http://arxiv.org/abs/1903.10297v3
PDF http://arxiv.org/pdf/1903.10297v3.pdf
PWC https://paperswithcode.com/paper/cosegnet-deep-co-segmentation-of-3d-shapes
Repo
Framework
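
The "group consistency loss expressed via matrix rank estimates" can be pictured as follows: if part labelings are consistent across the set, stacking a per-part descriptor from every shape gives a matrix whose rows span a low-dimensional space. The sketch below uses the nuclear norm as a rank surrogate over such a stack; the descriptor and the surrogate are our assumptions, and the paper's exact formulation differs.

```python
# Hedged sketch: penalize the nuclear norm (a convex rank surrogate) of the
# stacked per-part descriptors; consistent segmentations give a smaller value.
import numpy as np

def group_consistency(part_descriptors):
    """part_descriptors: (num_shapes * parts_per_shape, feature_dim)."""
    return np.linalg.svd(part_descriptors, compute_uv=False).sum()

rng = np.random.default_rng(0)
consistent = rng.normal(size=(40, 3)) @ rng.normal(size=(3, 64))  # rank-3 stack
inconsistent = rng.normal(size=(40, 64))                          # full-rank stack
# The low-rank (consistent) stack yields a much smaller penalty.
print(group_consistency(consistent), group_consistency(inconsistent))
```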

Prediction bounds for (higher order) total variation regularized least squares

Title Prediction bounds for (higher order) total variation regularized least squares
Authors Francesco Ortelli, Sara van de Geer
Abstract We establish oracle inequalities for the least squares estimator $\hat f$ with penalty on the total variation of $\hat f$ or on its higher order differences. Our main tool is an interpolating vector that leads to upper bounds for the effective sparsity. This allows one to show that the penalty on the $k^{\text{th}}$ order differences leads to an estimator $\hat f$ that can adapt to the number of jumps in the $(k-1)^{\text{th}}$ order differences. We present the details for $k=2, \ 3$ and expose a framework for deriving the result for general $k\in \mathbb{N}$.
Tasks
Published 2019-04-24
URL https://arxiv.org/abs/1904.10871v3
PDF https://arxiv.org/pdf/1904.10871v3.pdf
PWC https://paperswithcode.com/paper/prediction-bounds-for-higher-order-total
Repo
Framework
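
For concreteness, the estimator studied here is penalized least squares with an l1 penalty on the k-th order differences of the fit (k = 1 is classical total variation; k = 2 penalizes changes in slope). The sketch below only builds the objective in numpy; the regularization weight and k are assumptions, and any convex solver could minimize it.

```python
# Objective for (higher order) total variation regularized least squares.
import numpy as np

def kth_order_diff_matrix(n: int, k: int) -> np.ndarray:
    """Matrix D such that (D @ f)_i is the k-th order difference of f."""
    return np.diff(np.eye(n), n=k, axis=0)

def tv_objective(f: np.ndarray, y: np.ndarray, k: int, lam: float) -> float:
    D = kth_order_diff_matrix(len(y), k)
    return 0.5 * np.sum((y - f) ** 2) + lam * np.sum(np.abs(D @ f))

rng = np.random.default_rng(0)
n = 50
truth = np.concatenate([np.zeros(25), np.ones(25)])   # one jump -> small TV
y = truth + 0.1 * rng.normal(size=n)
print(tv_objective(truth, y, k=1, lam=1.0))           # penalty term is lam * 1
```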

SCALP: Superpixels with Contour Adherence using Linear Path

Title SCALP: Superpixels with Contour Adherence using Linear Path
Authors Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis
Abstract Superpixel decomposition methods are generally used as a pre-processing step to speed up image processing tasks. They group the pixels of an image into homogeneous regions while trying to respect existing contours. For all state-of-the-art superpixel decomposition methods, a trade-off is made between 1) computational time, 2) adherence to image contours and 3) regularity and compactness of the decomposition. In this paper, we propose a fast method to compute Superpixels with Contour Adherence using Linear Path (SCALP) in an iterative clustering framework. The distance computed when trying to associate a pixel to a superpixel during the clustering is enhanced by considering the linear path to the superpixel barycenter. The proposed framework produces regular and compact superpixels that adhere to the image contours. We provide a detailed evaluation of SCALP on the standard Berkeley Segmentation Dataset. The obtained results outperform state-of-the-art methods in terms of standard superpixel and contour detection metrics.
Tasks Contour Detection
Published 2019-03-17
URL http://arxiv.org/abs/1903.07149v1
PDF http://arxiv.org/pdf/1903.07149v1.pdf
PWC https://paperswithcode.com/paper/scalp-superpixels-with-contour-adherence
Repo
Framework
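
A sketch of the core idea, not the authors' implementation: when deciding whether a pixel joins a superpixel, SCALP adds a colour term computed along the straight line from the pixel to the superpixel barycenter, so that crossing an image contour makes the association expensive. The weights and the number of path samples below are assumptions.

```python
# SLIC-like pixel-to-superpixel distance enhanced with a linear-path colour term.
import numpy as np

def linear_path_color_distance(img, pixel, barycenter, sp_color, n_samples=10):
    """Mean colour distance to sp_color along the pixel -> barycenter segment."""
    ys = np.linspace(pixel[0], barycenter[0], n_samples).round().astype(int)
    xs = np.linspace(pixel[1], barycenter[1], n_samples).round().astype(int)
    path_colors = img[ys, xs]                       # (n_samples, channels)
    return np.linalg.norm(path_colors - sp_color, axis=1).mean()

def scalp_distance(img, pixel, barycenter, sp_color, m=10.0, S=8.0, w_path=0.5):
    color = np.linalg.norm(img[pixel] - sp_color)             # colour term
    spatial = np.linalg.norm(np.subtract(pixel, barycenter))   # spatial term
    path = linear_path_color_distance(img, pixel, barycenter, sp_color)
    return color + (m / S) * spatial + w_path * path

img = np.random.default_rng(0).random((32, 32, 3))
print(scalp_distance(img, (5, 5), (10.0, 12.0), img[10, 12]))
```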

Risk-Aware Reasoning for Autonomous Vehicles

Title Risk-Aware Reasoning for Autonomous Vehicles
Authors Majid Khonji, Jorge Dias, Lakmal Seneviratne
Abstract A significant barrier to deploying autonomous vehicles (AVs) on a massive scale is safety assurance. Several technical challenges arise due to the uncertain environment in which AVs operate such as road and weather conditions, errors in perception and sensory data, and also model inaccuracy. In this paper, we propose a system architecture for risk-aware AVs capable of reasoning about uncertainty and deliberately bounding the risk of collision below a given threshold. We discuss key challenges in the area, highlight recent research developments, and propose future research directions in three subsystems. First, a perception subsystem that detects objects within a scene while quantifying the uncertainty that arises from different sensing and communication modalities. Second, an intention recognition subsystem that predicts the driving-style and the intention of agent vehicles (and pedestrians). Third, a planning subsystem that takes into account the uncertainty, from perception and intention recognition subsystems, and propagates all the way to control policies that explicitly bound the risk of collision. We believe that such a white-box approach is crucial for future adoption of AVs on a large scale.
Tasks Autonomous Vehicles, Intent Detection
Published 2019-10-06
URL https://arxiv.org/abs/1910.02461v1
PDF https://arxiv.org/pdf/1910.02461v1.pdf
PWC https://paperswithcode.com/paper/risk-aware-reasoning-for-autonomous-vehicles
Repo
Framework
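
The planning subsystem's goal of "deliberately bounding the risk of collision below a given threshold" can be illustrated by a chance-constraint check: accept a candidate plan only if its estimated collision probability stays below a bound delta. The Monte-Carlo estimate, the noise model, and all numbers below are illustrative assumptions, not the proposed architecture.

```python
# Toy chance-constraint check for a candidate motion plan.
import numpy as np

def collision_probability(plan_xy, obstacle_xy, pos_std=0.3, radius=1.0,
                          n_samples=2000, seed=0):
    """Monte-Carlo estimate of P(any waypoint comes within radius of the obstacle),
    under assumed Gaussian perception noise on the waypoint positions."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=pos_std, size=(n_samples,) + plan_xy.shape)
    dists = np.linalg.norm(plan_xy + noise - obstacle_xy, axis=-1)  # (samples, T)
    return float((dists.min(axis=1) < radius).mean())

plan = np.stack([np.linspace(0, 10, 20), np.full(20, 2.0)], axis=1)  # straight line
p = collision_probability(plan, obstacle_xy=np.array([5.0, 0.0]))
delta = 0.01                                    # chosen risk bound
print(p, "accepted" if p <= delta else "rejected")
```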

A synthetic dataset for deep learning

Title A synthetic dataset for deep learning
Authors Xinjie Lan
Abstract In this paper, we propose a novel method for generating a synthetic dataset that obeys a Gaussian distribution. Compared to commonly used benchmark datasets with unknown distributions, the synthetic dataset has an explicit distribution, i.e., a Gaussian distribution. Meanwhile, it has the same characteristics as the benchmark dataset MNIST. As a result, we can easily apply Deep Neural Networks (DNNs) to the synthetic dataset. This synthetic dataset provides a novel experimental tool for verifying proposed theories of deep learning.
Tasks
Published 2019-06-01
URL https://arxiv.org/abs/1906.11905v1
PDF https://arxiv.org/pdf/1906.11905v1.pdf
PWC https://paperswithcode.com/paper/a-synthetic-dataset-for-deep-learning
Repo
Framework
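
A hedged sketch of generating a Gaussian-distributed, MNIST-shaped dataset: each class is drawn from an isotropic Gaussian around its own mean image, so the data distribution is known explicitly while the array shapes match MNIST. The per-class means and the variance are our own illustrative choices, not the paper's construction.

```python
# Class-conditional Gaussian data with MNIST-like shape (28x28, 10 classes).
import numpy as np

def make_gaussian_mnist(n_per_class=100, n_classes=10, shape=(28, 28), std=0.5,
                        seed=0):
    rng = np.random.default_rng(seed)
    dim = shape[0] * shape[1]
    class_means = rng.normal(size=(n_classes, dim))       # one mean per class
    xs, ys = [], []
    for c in range(n_classes):
        xs.append(class_means[c] + std * rng.normal(size=(n_per_class, dim)))
        ys.append(np.full(n_per_class, c))
    X = np.concatenate(xs).reshape(-1, *shape)
    y = np.concatenate(ys)
    return X, y

X, y = make_gaussian_mnist()
print(X.shape, y.shape)    # (1000, 28, 28) (1000,)
```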

A Dictionary-Based Generalization of Robust PCA Part II: Applications to Hyperspectral Demixing

Title A Dictionary-Based Generalization of Robust PCA Part II: Applications to Hyperspectral Demixing
Authors Sirisha Rambhatla, Xingguo Li, Jineng Ren, Jarvis Haupt
Abstract We consider the task of localizing targets of interest in a hyperspectral (HS) image based on their spectral signature(s), by posing the problem as two distinct convex demixing task(s). With applications ranging from remote sensing to surveillance, this task of target detection leverages the fact that each material/object possesses its own characteristic spectral response, depending upon its composition. However, since $\textit{signatures}$ of different materials are often correlated, matched filtering-based approaches may not apply here. To this end, we model a HS image as a superposition of a low-rank component and a dictionary sparse component, wherein the dictionary consists of the $\textit{a priori}$ known characteristic spectral responses of the target we wish to localize, and develop techniques for two different sparsity structures, resulting from different model assumptions. We also present the corresponding recovery guarantees, leveraging our recent theoretical results from a companion paper. Finally, we analyze the performance of the proposed approach via experimental evaluations on real HS datasets for a classification task, and compare its performance with related techniques.
Tasks
Published 2019-02-26
URL http://arxiv.org/abs/1902.10238v1
PDF http://arxiv.org/pdf/1902.10238v1.pdf
PWC https://paperswithcode.com/paper/a-dictionary-based-generalization-of-robust-1
Repo
Framework
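
The model in the abstract can be written as Y ≈ L + A S, with L a low-rank background, A the a priori known dictionary of target spectral signatures, and S sparse (its nonzero entries localize the targets). The sketch below only evaluates a standard convex surrogate objective for this decomposition; the paper's algorithms and guarantees are not reproduced, and the penalty weight is an assumption.

```python
# Evaluate a convex surrogate objective for the low-rank + dictionary-sparse model.
import numpy as np

def demixing_objective(L, S, Y, A, lam=0.1):
    """Nuclear norm (low rank) + l1 (sparsity) + a data-fit term."""
    nuclear = np.linalg.svd(L, compute_uv=False).sum()
    sparsity = np.abs(S).sum()
    fit = np.linalg.norm(Y - L - A @ S)
    return nuclear + lam * sparsity + fit

rng = np.random.default_rng(0)
bands, pixels, n_targets = 50, 200, 3
A = rng.normal(size=(bands, n_targets))                          # target signatures
L = rng.normal(size=(bands, 2)) @ rng.normal(size=(2, pixels))   # low-rank background
S = np.zeros((n_targets, pixels))
S[0, :5] = 1.0                                                   # sparse target presence
Y = L + A @ S
print(demixing_objective(L, S, Y, A))    # fit term is 0 at the true split
```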

Herding Effect based Attention for Personalized Time-Sync Video Recommendation

Title Herding Effect based Attention for Personalized Time-Sync Video Recommendation
Authors Wenmian Yang, Wenyuan Gao, Xiaojie Zhou, Weijia Jia, Shaohua Zhang, Yutao Luo
Abstract Time-sync comment (TSC) is a new form of user-interaction review associated with real-time video content; it captures a user’s preferences for videos and is therefore well suited as a data source for video recommendation. However, existing review-based recommendation methods ignore the context-dependent (generated by user interaction), real-time, and time-sensitive properties of TSC data. To bridge the above gaps, in this paper we use video images and users’ TSCs to design an Image-Text Fusion model with a novel Herding Effect Attention mechanism (called ITF-HEA), which can predict users’ favorite videos with model-based collaborative filtering. Specifically, in the HEA mechanism, we weight the context information based on the semantic similarities and time intervals between each TSC and its context, thereby accounting for the herding effect in the model. Experiments show that ITF-HEA outperforms the state-of-the-art baseline by 3.78% in F1-score on average.
Tasks
Published 2019-05-02
URL http://arxiv.org/abs/1905.00579v1
PDF http://arxiv.org/pdf/1905.00579v1.pdf
PWC https://paperswithcode.com/paper/herding-effect-based-attention-for
Repo
Framework
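
A simplified sketch, our own reading of the HEA description rather than the ITF-HEA code: each context comment is weighted by its semantic similarity to the query TSC, discounted by the time interval between them, so close-in-meaning and close-in-time comments dominate the attended context. The decay constant and the combination rule are assumptions.

```python
# Similarity-and-recency weighted attention over a TSC's context comments.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def herding_attention(query_emb, context_embs, time_gaps, tau=5.0):
    """query_emb: (d,); context_embs: (n, d); time_gaps: (n,) in seconds."""
    sims = context_embs @ query_emb / (
        np.linalg.norm(context_embs, axis=1) * np.linalg.norm(query_emb) + 1e-12)
    decay = np.exp(-np.asarray(time_gaps) / tau)     # older context matters less
    weights = softmax(sims * decay)
    return weights @ context_embs                    # attended context vector

rng = np.random.default_rng(0)
ctx = rng.normal(size=(4, 8))
print(herding_attention(rng.normal(size=8), ctx, time_gaps=[1, 2, 8, 20]).shape)
```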

Tiny Video Networks

Title Tiny Video Networks
Authors AJ Piergiovanni, Anelia Angelova, Michael S. Ryoo
Abstract Video understanding is a challenging problem with great impact on the abilities of autonomous agents working in the real-world. Yet, solutions so far have been computationally intensive, with the fastest algorithms running for more than half a second per video snippet on powerful GPUs. We propose a novel idea on video architecture learning - Tiny Video Networks - which automatically designs highly efficient models for video understanding. The tiny video models run with competitive performance for as low as 37 milliseconds per video on a CPU and 10 milliseconds on a standard GPU.
Tasks Video Understanding
Published 2019-10-15
URL https://arxiv.org/abs/1910.06961v1
PDF https://arxiv.org/pdf/1910.06961v1.pdf
PWC https://paperswithcode.com/paper/tiny-video-networks
Repo
Framework

Evaluation of Surrogate Models for Multi-fin Flapping Propulsion Systems

Title Evaluation of Surrogate Models for Multi-fin Flapping Propulsion Systems
Authors Kamal Viswanath, Alisha Sharma, Saketh Gabbita, Jason Geder, Ravi Ramamurti, Marius Pruessner
Abstract The aim of this study is to develop surrogate models for quick, accurate prediction of the thrust forces generated through flapping fin propulsion for given operating conditions and fin geometries. Different network architectures and configurations are explored to model the training data separately for the lead fin and rear fin of a tandem fin setup. We progressively improve the data representation of the input parameter space for model predictions. The models are tested on three unseen fin geometries and the predictions validated with computational fluid dynamics (CFD) data. Finally, the orders-of-magnitude gains in computational performance of these surrogate models over experimental and CFD runs, and the accompanying tradeoff in accuracy, are discussed in the context of this tandem fin configuration.
Tasks
Published 2019-10-31
URL https://arxiv.org/abs/1910.14194v1
PDF https://arxiv.org/pdf/1910.14194v1.pdf
PWC https://paperswithcode.com/paper/evaluation-of-surrogate-models-for-multi-fin
Repo
Framework
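
A small sketch of the surrogate-modeling idea: fit a neural network that maps fin operating and geometry parameters to a thrust value, so that evaluation is orders of magnitude cheaper than a CFD run. The input features, the network size, and the synthetic training data below are illustrative assumptions, not the authors' setup.

```python
# Toy thrust surrogate: a small MLP regressor over assumed fin parameters.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# hypothetical inputs: [flap frequency, stroke amplitude, pitch angle, chord length]
X = rng.uniform(low=[0.5, 10, 0, 0.05], high=[3.0, 60, 45, 0.20], size=(500, 4))
thrust = 0.8 * X[:, 0] ** 2 * X[:, 1] / 60 + 0.1 * rng.normal(size=500)  # toy law

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X, thrust)
print(model.predict(X[:3]))   # fast thrust predictions for three parameter sets
```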