April 2, 2020

Paper Group ANR 171

Theoretical Interpretation of Learned Step Size in Deep-Unfolded Gradient Descent. Locally Private Hypothesis Selection. Assembling Semantically-Disentangled Representations for Predictive-Generative Models via Adaptation from Synthetic Domain. Deepfakes for Medical Video De-Identification: Privacy Protection and Diagnostic Information Preservation …

Theoretical Interpretation of Learned Step Size in Deep-Unfolded Gradient Descent

Title Theoretical Interpretation of Learned Step Size in Deep-Unfolded Gradient Descent
Authors Satoshi Takabe, Tadashi Wadayama
Abstract Deep unfolding is a promising deep-learning technique in which an iterative algorithm is unrolled to a deep network architecture with trainable parameters. In the case of gradient descent algorithms, as a result of the training process, one often observes the acceleration of the convergence speed with learned non-constant step size parameters whose behavior is neither intuitive nor interpretable from conventional theory. In this paper, we provide a theoretical interpretation of the learned step size of deep-unfolded gradient descent (DUGD). We first prove that the training process of DUGD reduces not only the mean squared error loss but also the spectral radius related to the convergence rate. Next, we show that minimizing the upper bound of the spectral radius naturally leads to the Chebyshev step which is a sequence of the step size based on Chebyshev polynomials. The numerical experiments confirm that the Chebyshev steps qualitatively reproduce the learned step size parameters in DUGD, which provides a plausible interpretation of the learned parameters. Additionally, we show that the Chebyshev steps achieve the lower bound of the convergence rate for the first-order method in a specific limit without learning parameters or momentum terms.
Tasks
Published 2020-01-15
URL https://arxiv.org/abs/2001.05142v2
PDF https://arxiv.org/pdf/2001.05142v2.pdf
PWC https://paperswithcode.com/paper/theoretical-interpretation-of-learned-step
Repo
Framework
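
The idea of a step-size sequence built from Chebyshev polynomials can be illustrated with a small sketch. It assumes the commonly used form in which the steps are reciprocals of Chebyshev nodes on the eigenvalue interval [lam_min, lam_max]; the paper's exact definition may differ. The diagonal quadratic, the eigenvalues, and the function names are hypothetical choices made only for illustration.

```python
import math

def chebyshev_steps(lam_min, lam_max, T):
    """Step sizes taken as reciprocals of the T Chebyshev nodes on [lam_min, lam_max]."""
    center = (lam_max + lam_min) / 2.0
    radius = (lam_max - lam_min) / 2.0
    return [
        1.0 / (center + radius * math.cos((2 * t + 1) * math.pi / (2 * T)))
        for t in range(T)
    ]

def run_gd(eigs, x0, steps):
    """Gradient descent on f(x) = 0.5 * sum(eig_i * x_i^2); returns max |x_i| at the end."""
    x = list(x0)
    for gamma in steps:
        # The gradient of the diagonal quadratic is eig_i * x_i in each coordinate.
        x = [xi - gamma * lam * xi for xi, lam in zip(x, eigs)]
    return max(abs(xi) for xi in x)

# Hypothetical problem: eigenvalues between 1 and 10, T = 8 iterations.
eigs = [1.0, 2.5, 4.0, 7.0, 10.0]
x0 = [1.0] * len(eigs)
T = 8

cheb_err = run_gd(eigs, x0, chebyshev_steps(1.0, 10.0, T))
const_err = run_gd(eigs, x0, [2.0 / (1.0 + 10.0)] * T)  # classical constant step
print(cheb_err, const_err)
```

On this toy problem the non-constant sequence ends with a smaller worst-case error than the constant step 2 / (lam_min + lam_max), even though individual Chebyshev steps are too large to be stable if used alone.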

Locally Private Hypothesis Selection

Title Locally Private Hypothesis Selection
Authors Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, Huanyu Zhang
Abstract We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-local differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde O(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.
Tasks
Published 2020-02-21
URL https://arxiv.org/abs/2002.09465v1
PDF https://arxiv.org/pdf/2002.09465v1.pdf
PWC https://paperswithcode.com/paper/locally-private-hypothesis-selection
Repo
Framework
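
For readers unfamiliar with the local model of differential privacy, the sketch below shows binary randomized response, a basic $\varepsilon$-locally-private primitive. It is not the hypothesis selection algorithm from the paper; it only illustrates how each user perturbs their own data before it leaves their device and how the aggregator debiases the reports. The data, the seed, and the function names are assumptions made for illustration.

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the true fraction of ones from the noisy reports."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = (1 - p_truth) + mu * (2 * p_truth - 1); solve for mu.
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)

rng = random.Random(0)                    # fixed seed for reproducibility
true_bits = [1] * 6000 + [0] * 14000      # hypothetical population, 30% ones
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = debiased_mean(reports, 1.0)
print(estimate)
```

The estimate is noisier than the non-private mean would be, which gives some intuition for why sample complexity grows under local privacy constraints.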

Assembling Semantically-Disentangled Representations for Predictive-Generative Models via Adaptation from Synthetic Domain

Title Assembling Semantically-Disentangled Representations for Predictive-Generative Models via Adaptation from Synthetic Domain
Authors Burkay Donderici, Caleb New, Chenliang Xu
Abstract Deep neural networks can form high-level hierarchical representations of input data. Various researchers have demonstrated that these representations can be used to enable a variety of useful applications. However, such representations are typically based on the statistics within the data, and may not conform with the semantic representation that may be necessitated by the application. Conditional models are typically used to overcome this challenge, but they require large annotated datasets which are difficult to come by and costly to create. In this paper, we show that semantically-aligned representations can be generated instead with the help of a physics based engine. This is accomplished by creating a synthetic dataset with decoupled attributes, learning an encoder for the synthetic dataset, and augmenting prescribed attributes from the synthetic domain with attributes from the real domain. It is shown that the proposed (SYNTH-VAE-GAN) method can construct a conditional predictive-generative model of human face attributes without relying on real data labels.
Tasks
Published 2020-02-23
URL https://arxiv.org/abs/2002.09818v1
PDF https://arxiv.org/pdf/2002.09818v1.pdf
PWC https://paperswithcode.com/paper/assembling-semantically-disentangled
Repo
Framework

Deepfakes for Medical Video De-Identification: Privacy Protection and Diagnostic Information Preservation

Title Deepfakes for Medical Video De-Identification: Privacy Protection and Diagnostic Information Preservation
Authors Bingquan Zhu, Hao Fang, Yanan Sui, Luming Li
Abstract Data sharing for medical research has been difficult as open-sourcing clinical data may violate patient privacy. Traditional methods for face de-identification wipe out facial information entirely, making it impossible to analyze facial behavior. Recent advances in whole-body keypoint detection also rely on facial input to estimate body keypoints. Both facial and body keypoints are critical in some medical diagnoses, and keypoint invariance after de-identification is of great importance. Here, we propose a solution using deepfake technology, specifically face swapping. While this swapping method has been criticized for invading privacy and portrait rights, it could conversely protect privacy in medical video: patients’ faces could be swapped to a proper target face and become unrecognizable. However, to what extent face-swapping de-identification affects the automatic detection of body keypoints has remained an open question. In this study, we apply deepfake technology to Parkinson’s disease examination videos to de-identify subjects, and quantitatively show that face swapping as a de-identification approach is reliable and keeps the keypoints almost invariant, significantly better than traditional methods. This study proposes a pipeline for video de-identification and keypoint preservation, clearing up some ethical restrictions for medical data sharing. This work could make open-source high quality medical video datasets more feasible and promote future medical research that benefits our society.
Tasks Face Swapping
Published 2020-02-07
URL https://arxiv.org/abs/2003.00813v1
PDF https://arxiv.org/pdf/2003.00813v1.pdf
PWC https://paperswithcode.com/paper/deepfakes-for-medical-video-de-identification
Repo
Framework

Double/Debiased Machine Learning for Dynamic Treatment Effects

Title Double/Debiased Machine Learning for Dynamic Treatment Effects
Authors Greg Lewis, Vasilis Syrgkanis
Abstract We consider the estimation of treatment effects in settings when multiple treatments are assigned over time and treatments can have a causal effect on future outcomes. We formulate the problem as a linear state space Markov process with a high dimensional state and propose an extension of the double/debiased machine learning framework to estimate the dynamic effects of treatments. Our method allows the use of arbitrary machine learning methods to control for the high dimensional state, subject to a mean square error guarantee, while still allowing parametric estimation and construction of confidence intervals for the dynamic treatment effect parameters of interest. Our method is based on a sequential regression peeling process, which we show can be equivalently interpreted as a Neyman orthogonal moment estimator. This allows us to show root-n asymptotic normality of the estimated causal effects.
Tasks
Published 2020-02-17
URL https://arxiv.org/abs/2002.07285v1
PDF https://arxiv.org/pdf/2002.07285v1.pdf
PWC https://paperswithcode.com/paper/doubledebiased-machine-learning-for-dynamic
Repo
Framework

TF-IDFC-RF: A Novel Supervised Term Weighting Scheme

Title TF-IDFC-RF: A Novel Supervised Term Weighting Scheme
Authors Flavio Carvalho, Gustavo Paiva Guedes
Abstract Sentiment Analysis is a branch of Affective Computing usually considered a binary classification task. In this line of reasoning, Sentiment Analysis can be applied in several contexts to classify the attitude expressed in text samples, for example, movie reviews, sarcasm, among others. A common approach to represent text samples is the use of the Vector Space Model to compute numerical feature vectors consisting of the weight of terms. The most popular term weighting scheme is TF-IDF (Term Frequency - Inverse Document Frequency). It is an Unsupervised Weighting Scheme (UWS) since it does not consider the class information in the weighting of terms. Apart from that, there are Supervised Weighting Schemes (SWS), which consider the class information on term weighting calculation. Several SWS have been recently proposed, demonstrating better results than TF-IDF. In this scenario, this work presents a comparative study on different term weighting schemes and proposes a novel supervised term weighting scheme, named as TF-IDFC-RF (Term Frequency - Inverse Document Frequency in Class - Relevance Frequency). The effectiveness of TF-IDFC-RF is validated with SVM (Support Vector Machine) and NB (Naive Bayes) classifiers on four commonly used Sentiment Analysis datasets. TF-IDFC-RF outperforms all other weighting schemes and achieves F1 results of more than 99.9% on all datasets with SVM classifier.
Tasks Sentiment Analysis
Published 2020-03-12
URL https://arxiv.org/abs/2003.07193v1
PDF https://arxiv.org/pdf/2003.07193v1.pdf
PWC https://paperswithcode.com/paper/tf-idfc-rf-a-novel-supervised-term-weighting
Repo
Framework
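
The difference between unsupervised and supervised term weighting described in the abstract can be sketched on a toy corpus. The example computes the standard IDF factor (which ignores labels) and the relevance frequency (RF) factor in its commonly cited form, log2(2 + a / max(1, c)), where a and c count positive and negative documents containing the term. It does not reproduce the TF-IDFC-RF formula itself, which is not given here; the corpus and helper names are hypothetical.

```python
import math

# Hypothetical labelled corpus: (text, label) with 1 = positive, 0 = negative.
docs = [
    ("good great film", 1),
    ("great acting good plot", 1),
    ("bad film", 0),
    ("bad boring plot", 0),
]

def doc_freq(term, label=None):
    """Number of documents containing the term, optionally within one class."""
    return sum(
        1
        for text, y in docs
        if term in text.split() and (label is None or y == label)
    )

def idf(term):
    """Unsupervised factor: depends only on how many documents contain the term."""
    return math.log(len(docs) / doc_freq(term))

def rf(term):
    """Supervised factor: grows when the term is concentrated in the positive class."""
    a = doc_freq(term, label=1)
    c = doc_freq(term, label=0)
    return math.log2(2 + a / max(1, c))

for term in ("good", "film"):
    print(term, round(idf(term), 3), round(rf(term), 3))
```

Both terms appear in two of the four documents, so IDF cannot tell them apart, while RF assigns the larger weight to "good" because it occurs only in positive documents.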

A Transfer Learning Approach to Cross-Modal Object Recognition: From Visual Observation to Robotic Haptic Exploration

Title A Transfer Learning Approach to Cross-Modal Object Recognition: From Visual Observation to Robotic Haptic Exploration
Authors Pietro Falco, Shuang Lu, Ciro Natale, Salvatore Pirozzi, Dongheui Lee
Abstract In this work, we introduce the problem of cross-modal visuo-tactile object recognition with robotic active exploration. With this term, we mean that the robot observes a set of objects with visual perception and, later on, it is able to recognize such objects only with tactile exploration, without having touched any object before. Using a machine learning terminology, in our application we have a visual training set and a tactile test set, or vice versa. To tackle this problem, we propose an approach constituted by four steps: finding a visuo-tactile common representation, defining a suitable set of features, transferring the features across the domains, and classifying the objects. We show the results of our approach using a set of 15 objects, collecting 40 visual examples and five tactile examples for each object. The proposed approach achieves an accuracy of 94.7%, which is comparable with the accuracy of the monomodal case, i.e., when using visual data both as training set and test set. Moreover, it performs well compared to the human ability, which we have roughly estimated carrying out an experiment with ten participants.
Tasks Object Recognition, Transfer Learning
Published 2020-01-18
URL https://arxiv.org/abs/2001.06673v1
PDF https://arxiv.org/pdf/2001.06673v1.pdf
PWC https://paperswithcode.com/paper/a-transfer-learning-approach-to-cross-modal
Repo
Framework

Learning Transformation-Aware Embeddings for Image Forensics

Title Learning Transformation-Aware Embeddings for Image Forensics
Authors Aparna Bharati, Daniel Moreira, Patrick Flynn, Anderson Rocha, Kevin Bowyer, Walter Scheirer
Abstract A dramatic rise in the flow of manipulated image content on the Internet has led to an aggressive response from the media forensics research community. New efforts have incorporated increased usage of techniques from computer vision and machine learning to detect and profile the space of image manipulations. This paper addresses Image Provenance Analysis, which aims at discovering relationships among different manipulated image versions that share content. One of the main sub-problems for provenance analysis that has not yet been addressed directly is the edit ordering of images that share full content or are near-duplicates. The existing large networks that generate image descriptors for tasks such as object recognition may not encode the subtle differences between these image covariates. This paper introduces a novel deep learning-based approach to provide a plausible ordering to images that have been generated from a single image through transformations. Our approach learns transformation-aware descriptors using weak supervision via composited transformations and a rank-based quadruplet loss. To establish the efficacy of the proposed approach, comparisons with state-of-the-art handcrafted and deep learning-based descriptors, and image matching approaches are made. Further experimentation validates the proposed approach in the context of image provenance analysis.
Tasks Object Recognition
Published 2020-01-13
URL https://arxiv.org/abs/2001.04547v1
PDF https://arxiv.org/pdf/2001.04547v1.pdf
PWC https://paperswithcode.com/paper/learning-transformation-aware-embeddings-for
Repo
Framework

Multi-Scale Weight Sharing Network for Image Recognition

Title Multi-Scale Weight Sharing Network for Image Recognition
Authors Shubhra Aich, Ian Stavness, Yasuhiro Taniguchi, Masaki Yamazaki
Abstract In this paper, we explore the idea of weight sharing over multiple scales in convolutional networks. Inspired by traditional computer vision approaches, we share the weights of convolution kernels over different scales in the same layers of the network. Although multi-scale feature aggregation and sharing inside convolutional networks are common in practice, none of the previous works address the issue of convolutional weight sharing. We evaluate our weight sharing scheme on two heterogeneous image recognition datasets - ImageNet (object recognition) and Places365-Standard (scene classification). With approximately 25% fewer parameters, our shared-weight ResNet model provides similar performance compared to baseline ResNets. Shared-weight models are further validated via transfer learning experiments on four additional image recognition datasets - Caltech256 and Stanford 40 Actions (object-centric) and SUN397 and MIT Indoor67 (scene-centric). Experimental results demonstrate significant redundancy in the vanilla implementations of the deeper networks, and also indicate that a shift towards increasing the receptive field per parameter may improve future convolutional network architectures.
Tasks Object Recognition, Scene Classification, Transfer Learning
Published 2020-01-09
URL https://arxiv.org/abs/2001.02816v1
PDF https://arxiv.org/pdf/2001.02816v1.pdf
PWC https://paperswithcode.com/paper/multi-scale-weight-sharing-network-for-image
Repo
Framework

Adaptive fractional order graph neural network

Title Adaptive fractional order graph neural network
Authors Zijian Liu, Chunbo Luo, Shuai Li
Abstract This paper proposes the adaptive fractional order graph neural network (AFGNN), optimized by a time-varying fractional order gradient descent method, to address the local-optimum problem of classic and fractional GNNs. Such GNNs specialise in aggregating information from the feature and adjacency matrices of connected nodes and their neighbours to solve learning tasks on non-Euclidean data such as graphs. To overcome the high computational complexity of fractional order derivations, the proposed model approximately calculates the fractional order gradients. We further prove that this approximation is feasible and that the AFGNN is unbiased. Extensive experiments on benchmark citation networks and object recognition challenges confirm the performance of AFGNN. The first group of experiments shows that AFGNN outperforms the steepest gradient based method and conventional GNNs on the citation networks. The second group of experiments demonstrates that AFGNN excels at image recognition tasks where the images have a significant number of missing pixels and achieves higher accuracy than GNNs.
Tasks Object Recognition
Published 2020-01-05
URL https://arxiv.org/abs/2001.04026v1
PDF https://arxiv.org/pdf/2001.04026v1.pdf
PWC https://paperswithcode.com/paper/adaptive-fractional-order-graph-neural
Repo
Framework

Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning

Title Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning
Authors Dingzhu Wen, Mehdi Bennis, Kaibin Huang
Abstract To leverage data and computation capabilities of mobile devices, machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models, resulting in the new paradigm of edge learning. In this paper, we consider the framework of partitioned edge learning for iteratively training a large-scale model using many resource-constrained devices (called workers). To this end, in each iteration, the model is dynamically partitioned into parametric blocks, which are downloaded to worker groups for updating using data subsets. Then, the local updates are uploaded to and cascaded by the server for updating a global model. To reduce resource usage by minimizing the total learning-and-communication latency, this work focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation (for downloading and uploading). Two design approaches are adopted. First, a practical sequential approach, called partially integrated parameter-and-bandwidth allocation (PABA), yields two schemes, namely bandwidth aware parameter allocation and parameter aware bandwidth allocation. The former minimizes the load for the slowest (in computing) of worker groups, each training the same parametric block. The latter allocates the largest bandwidth to the worker being the latency bottleneck. Second, PABA are jointly optimized. Despite this being a nonconvex problem, an efficient and optimal solution algorithm is derived by intelligently nesting a bisection search and solving a convex problem. Experimental results using real data demonstrate that integrating PABA can substantially improve the performance of partitioned edge learning in terms of latency (e.g., by 46%) and accuracy (e.g., by 4%).
Tasks
Published 2020-03-10
URL https://arxiv.org/abs/2003.04544v2
PDF https://arxiv.org/pdf/2003.04544v2.pdf
PWC https://paperswithcode.com/paper/joint-parameter-and-bandwidth-allocation-for
Repo
Framework
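
The role a bisection search can play in latency-driven bandwidth allocation is sketched below on a deliberately simplified toy model, not the paper's formulation. It assumes each worker's latency is a fixed compute time plus payload size divided by its bandwidth share, and it bisects on the target latency to minimize the slowest worker's latency under a total bandwidth budget. All names and numbers are hypothetical.

```python
def min_max_latency(compute, size, total_bw, iters=100):
    """Bisect on the target latency tau; worker i needs size_i / (tau - compute_i) bandwidth."""
    n = len(compute)
    lo = max(compute)  # no allocation can beat the slowest compute time
    # An equal split of the bandwidth is always feasible, so its latency bounds tau above.
    hi = max(c + s * n / total_bw for c, s in zip(compute, size))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        needed = sum(s / (mid - c) for c, s in zip(compute, size))
        if needed <= total_bw:
            hi = mid   # target is achievable, try a smaller latency
        else:
            lo = mid   # target needs more bandwidth than available
    bandwidth = [s / (hi - c) for c, s in zip(compute, size)]
    return hi, bandwidth

# Hypothetical workers: compute times (s), payload sizes (Mb), 10 Mb/s shared budget.
compute = [1.0, 2.0, 0.5]
size = [4.0, 2.0, 6.0]
tau, bandwidth = min_max_latency(compute, size, total_bw=10.0)
print(tau, bandwidth)
```

In this toy setting the optimum equalizes latency across workers, so the worker that would otherwise be the bottleneck receives more bandwidth than an equal split would give it.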

Adversarial Attacks on Probabilistic Autoregressive Forecasting Models

Title Adversarial Attacks on Probabilistic Autoregressive Forecasting Models
Authors Raphaël Dang-Nhu, Gagandeep Singh, Pavol Bielik, Martin Vechev
Abstract We develop an effective generation of adversarial attacks on neural models that output a sequence of probability distributions rather than a sequence of single values. This setting includes the recently proposed deep probabilistic autoregressive forecasting models that estimate the probability distribution of a time series given its past and achieve state-of-the-art results in a diverse set of application domains. The key technical challenge we address is effectively differentiating through the Monte-Carlo estimation of statistics of the joint distribution of the output sequence. Additionally, we extend prior work on probabilistic forecasting to the Bayesian setting which allows conditioning on future observations, instead of only on past observations. We demonstrate that our approach can successfully generate attacks with small input perturbations in two challenging tasks where robust decision making is crucial: stock market trading and prediction of electricity consumption.
Tasks Decision Making, Time Series
Published 2020-03-08
URL https://arxiv.org/abs/2003.03778v1
PDF https://arxiv.org/pdf/2003.03778v1.pdf
PWC https://paperswithcode.com/paper/adversarial-attacks-on-probabilistic
Repo
Framework

Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism

Title Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism
Authors Hao Wang, Doyen Sahoo, Chenghao Liu, Ke Shu, Palakorn Achananuparp, Ee-peng Lim, Steven C. H. Hoi
Abstract Cross-modal food retrieval is an important task to perform analysis of food-related information, such as food images and cooking recipes. The goal is to learn an embedding of images and recipes in a common feature space, so that precise matching can be realized. Compared with existing cross-modal retrieval approaches, two major challenges in this specific problem are: 1) the large intra-class variance across cross-modal food data; and 2) the difficulties in obtaining discriminative recipe representations. To address these problems, we propose Semantic-Consistent and Attention-based Networks (SCAN), which regularize the embeddings of the two modalities by aligning output semantic probabilities. In addition, we exploit self-attention mechanism to improve the embedding of recipes. We evaluate the performance of the proposed method on the large-scale Recipe1M dataset, and the result shows that it outperforms the state-of-the-art.
Tasks Cross-Modal Retrieval
Published 2020-03-09
URL https://arxiv.org/abs/2003.03955v1
PDF https://arxiv.org/pdf/2003.03955v1.pdf
PWC https://paperswithcode.com/paper/cross-modal-food-retrieval-learning-a-joint
Repo
Framework

Deep Robust Multilevel Semantic Cross-Modal Hashing

Title Deep Robust Multilevel Semantic Cross-Modal Hashing
Authors Ge Song, Jun Zhao, Xiaoyang Tan
Abstract Hashing-based cross-modal retrieval has recently made significant progress. However, straightforwardly embedding data from different modalities into a joint Hamming space will inevitably produce false codes due to the intrinsic modality discrepancy and noise. We present a novel Robust Multilevel Semantic Hashing (RMSH) method for more accurate cross-modal retrieval. It seeks to preserve fine-grained similarity among data with rich semantics, while explicitly requiring distances between dissimilar points to be larger than a specific value for strong robustness. For this, we give an effective bound on this value based on information coding-theoretic analysis, and the above goals are embodied in a margin-adaptive triplet loss. Furthermore, we introduce pseudo-codes by fusing multiple hash codes to explore seldom-seen semantics, alleviating the sparsity problem of similarity information. Experiments on three benchmarks show the validity of the derived bounds, and our method achieves state-of-the-art performance.
Tasks Cross-Modal Retrieval
Published 2020-02-07
URL https://arxiv.org/abs/2002.02698v1
PDF https://arxiv.org/pdf/2002.02698v1.pdf
PWC https://paperswithcode.com/paper/deep-robust-multilevel-semantic-cross-modal
Repo
Framework
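
A triplet loss over hash codes, as mentioned in the abstract, can be sketched in its generic form: the loss is zero once the dissimilar item is farther from the anchor than the similar item by at least a margin. The sketch uses Hamming distance on binary codes and a margin passed in by the caller; how RMSH adapts the margin is not reproduced here, and the codes below are hypothetical.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0, hamming(anchor, positive) - hamming(anchor, negative) + margin)

# Hypothetical 4-bit codes, e.g. an image code and two text codes.
anchor = [1, 0, 1, 1]
positive = [1, 0, 0, 1]   # differs from the anchor in 1 bit
negative = [0, 1, 0, 0]   # differs from the anchor in 4 bits

print(triplet_loss(anchor, positive, negative, margin=2))  # 0: margin satisfied
print(triplet_loss(anchor, positive, negative, margin=4))  # 1: margin violated by 1 bit
```

A larger margin forces dissimilar codes farther apart, which is the kind of separation the abstract associates with robustness.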

Lane Boundary Geometry Extraction from Satellite Imagery

Title Lane Boundary Geometry Extraction from Satellite Imagery
Authors Andi Zang, Runsheng Xu, Zichen Li, David Doria
Abstract Autonomous driving is becoming more of a reality, and as a key component, high-definition (HD) maps show their value in both the marketplace and industry. Even though HD map generation from LiDAR or stereo/perspective imagery has achieved impressive success, its inherent defects cannot be ignored. In this paper, we propose a novel method for highway HD map modeling using pixel-wise segmentation on satellite imagery and formalized hypotheses linking, which is cheaper and faster than current HD map modeling approaches based on LiDAR point clouds and perspective-view imagery, making it an ideal complement to the state of the art. We also manually code/label an HD road model dataset as ground truth, aligned with the Bing tile image server, to train, test, and evaluate our methodology. This dataset will be published at the same time to contribute to research in HD map modeling from aerial imagery.
Tasks Autonomous Driving
Published 2020-02-06
URL https://arxiv.org/abs/2002.02362v1
PDF https://arxiv.org/pdf/2002.02362v1.pdf
PWC https://paperswithcode.com/paper/lane-boundary-geometry-extraction-from
Repo
Framework