April 3, 2020

Paper Group ANR 14

Large-Scale Optimal Transport via Adversarial Training with Cycle-Consistency

Title Large-Scale Optimal Transport via Adversarial Training with Cycle-Consistency
Authors Guansong Lu, Zhiming Zhou, Jian Shen, Cheng Chen, Weinan Zhang, Yong Yu
Abstract Recent advances in large-scale optimal transport have greatly extended its application scenarios in machine learning. However, existing methods either do not explicitly learn the transport map or do not support general cost functions. In this paper, we propose an end-to-end approach for large-scale optimal transport, which directly solves for the transport map and is compatible with general cost functions. It models the transport map via stochastic neural networks and enforces the constraint on the marginal distributions via adversarial training. The proposed framework can be further extended towards learning a Monge map or an optimal bijection by adopting cycle-consistency constraint(s). We verify the effectiveness of the proposed method and demonstrate its superior performance against existing methods on large-scale real-world applications, including domain adaptation, image-to-image translation, and color transfer.
Tasks Domain Adaptation, Image-to-Image Translation
Published 2020-03-14
URL https://arxiv.org/abs/2003.06635v1
PDF https://arxiv.org/pdf/2003.06635v1.pdf
PWC https://paperswithcode.com/paper/large-scale-optimal-transport-via-adversarial

FMT:Fusing Multi-task Convolutional Neural Network for Person Search

Title FMT:Fusing Multi-task Convolutional Neural Network for Person Search
Authors Sulan Zhai, Shunqiang Liu, Xiao Wang, Jin Tang
Abstract Person search aims to detect all persons in an image and identify the query persons among the detected ones, without proposals or bounding boxes, which distinguishes it from person re-identification. In this paper, we propose a fusing multi-task convolutional neural network (FMT-CNN) to tackle the correlation and heterogeneity of detection and re-identification with a single convolutional neural network. We focus on how the interplay of person detection and person re-identification affects the overall performance. We employ person labels in the region proposal network to produce features for the person re-identification and person detection networks, which can improve the accuracy of detection and re-identification simultaneously. We also use a multiple loss to train our re-identification network. Experimental results on the CUHK-SYSU Person Search dataset show that the performance of our proposed method is superior to state-of-the-art approaches in both mAP and top-1.
Tasks Human Detection, Person Re-Identification, Person Search
Published 2020-03-01
URL https://arxiv.org/abs/2003.00406v1
PDF https://arxiv.org/pdf/2003.00406v1.pdf
PWC https://paperswithcode.com/paper/fmtfusing-multi-task-convolutional-neural

Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

Title Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Authors Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu
Abstract The weakly-supervised action localization problem requires training a model to localize the action segments in a video given only video-level action labels. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). Since only the bag’s label is known, the main challenge is to identify which key instances within the bag trigger the bag’s label. Most previous models use an attention-based approach. These models use attention to generate the bag’s representation from instances and then train it via the bag’s classification. In this work, we explicitly model the key instance assignment as a hidden variable and adopt an Expectation-Maximization framework. We derive two pseudo-label generation schemes to model the E and M process and iteratively optimize the likelihood lower bound. We also show that previous attention-based models implicitly violate the MIL assumption that instances in negative bags should be uniformly negative. In comparison, our EM-MIL approach more accurately models these assumptions. Our model achieves state-of-the-art performance on two standard benchmarks, THUMOS14 and ActivityNet1.2, and shows its superiority in detecting relatively complete action boundaries in videos containing multiple actions.
Tasks Action Localization, Multiple Instance Learning, Weakly Supervised Action Localization
Published 2020-03-31
URL https://arxiv.org/abs/2004.00163v1
PDF https://arxiv.org/pdf/2004.00163v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-action-localization-with-2

Differentially Private Federated Learning for Resource-Constrained Internet of Things

Title Differentially Private Federated Learning for Resource-Constrained Internet of Things
Authors Rui Hu, Yuanxiong Guo, E. Paul Ratazzi, Yanmin Gong
Abstract With the proliferation of smart devices having built-in sensors, Internet connectivity, and programmable computation capability in the era of Internet of things (IoT), tremendous data is being generated at the network edge. Federated learning is capable of analyzing the large amount of data from a distributed set of smart devices without requiring them to upload their data to a central place. However, the commonly-used federated learning algorithm is based on stochastic gradient descent (SGD) and not suitable for resource-constrained IoT environments due to its high communication resource requirement. Moreover, the privacy of sensitive data on smart devices has become a key concern and needs to be protected rigorously. This paper proposes a novel federated learning framework called DP-PASGD for training a machine learning model efficiently from the data stored across resource-constrained smart devices in IoT while guaranteeing differential privacy. The optimal schematic design of DP-PASGD that maximizes the learning performance while satisfying the limits on resource cost and privacy loss is formulated as an optimization problem, and an approximate solution method based on the convergence analysis of DP-PASGD is developed to solve the optimization problem efficiently. Numerical results based on real-world datasets verify the effectiveness of the proposed DP-PASGD scheme.
Published 2020-03-28
URL https://arxiv.org/abs/2003.12705v1
PDF https://arxiv.org/pdf/2003.12705v1.pdf
PWC https://paperswithcode.com/paper/differentially-private-federated-learning-for

Absolute Shapley Value

Title Absolute Shapley Value
Authors Jinfei Liu
Abstract Shapley value is a concept in cooperative game theory for measuring the contribution of each participant, which was named in honor of Lloyd Shapley. Shapley value has recently been applied in data marketplaces to allocate compensation to data owners based on their contribution to the models. Shapley value is the only value division scheme used for compensation allocation that meets three desirable criteria: group rationality, fairness, and additivity. In cooperative game theory, the marginal contribution of each contributor to each coalition is a nonnegative value. However, in machine learning model training, the marginal contribution of each contributor (data tuple) to each coalition (a set of data tuples) can be a negative value, i.e., the accuracy of the model trained on a dataset with an additional data tuple can be lower than the accuracy of the model trained on the dataset alone. In this paper, we investigate the problem of how to handle the negative marginal contribution when computing Shapley value. We explore three philosophies: 1) taking the original value (Original Shapley Value); 2) taking the larger of the original value and zero (Zero Shapley Value); and 3) taking the absolute value of the original value (Absolute Shapley Value). Experiments on the Iris dataset demonstrate that the definition of Absolute Shapley Value significantly outperforms the other two definitions in terms of evaluating data importance (the contribution of each data tuple to the trained model).
Published 2020-03-23
URL https://arxiv.org/abs/2003.10076v1
PDF https://arxiv.org/pdf/2003.10076v1.pdf
PWC https://paperswithcode.com/paper/absolute-shapley-value
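
The three ways of handling negative marginal contributions described in the abstract above can be sketched on a toy example. This is a minimal illustration, not the paper's implementation: the function `shapley_values`, the `transform` parameter, and the accuracy table `ACCURACY` are all hypothetical names and numbers chosen for the example, and the exact enumeration over coalitions only scales to a handful of data tuples.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility, transform=lambda m: m):
    """Exact Shapley values; `transform` is applied to every marginal contribution."""
    players = list(players)
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                marginal = utility(coalition | {p}) - utility(coalition)
                total += weight * transform(marginal)
        values[p] = total
    return values


# Hypothetical model accuracies for every subset of three data tuples a, b, c.
# Adding tuple c to a non-empty coalition lowers the accuracy here.
ACCURACY = {
    frozenset(): 0.0,
    frozenset("a"): 0.6,
    frozenset("b"): 0.5,
    frozenset("c"): 0.2,
    frozenset("ab"): 0.8,
    frozenset("ac"): 0.5,
    frozenset("bc"): 0.4,
    frozenset("abc"): 0.7,
}


def utility(coalition):
    return ACCURACY[frozenset(coalition)]


original = shapley_values("abc", utility)                                 # keep the sign
zero = shapley_values("abc", utility, transform=lambda m: max(m, 0.0))    # clamp at zero
absolute = shapley_values("abc", utility, transform=abs)                  # absolute value
```

In this made-up table the positive and negative contributions of tuple `c` cancel under the original definition, while the zero and absolute variants assign it a positive score. Only the original definition keeps the property that the values sum to the accuracy of the full dataset.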

Learning to Compare Relation: Semantic Alignment for Few-Shot Learning

Title Learning to Compare Relation: Semantic Alignment for Few-Shot Learning
Authors Congqi Cao, Yanning Zhang
Abstract Few-shot learning is a fundamental and challenging problem since it requires recognizing novel categories from only a few examples. The objects for recognition have multiple variants and can be located anywhere in images. Directly comparing query images with example images cannot handle content misalignment. The representation and metric for comparison are critical but challenging to learn due to the scarcity and wide variation of the samples in few-shot learning. In this paper, we present a novel semantic alignment model to compare relations, which is robust to content misalignment. We propose to add two key ingredients to existing few-shot learning frameworks for better feature and metric learning ability. First, we introduce a semantic alignment loss to align the relation statistics of the features from samples that belong to the same category. Second, local and global mutual information maximization is introduced, allowing for representations that contain locally-consistent and intra-class shared information across structural locations in an image. In addition, we introduce a principled approach to weigh multiple loss functions by considering the homoscedastic uncertainty of each stream. We conduct extensive experiments on several few-shot learning datasets. Experimental results show that the proposed method is capable of comparing relations with semantic alignment strategies, and achieves state-of-the-art performance.
Tasks Few-Shot Learning, Metric Learning
Published 2020-02-29
URL https://arxiv.org/abs/2003.00210v1
PDF https://arxiv.org/pdf/2003.00210v1.pdf
PWC https://paperswithcode.com/paper/learning-to-compare-relation-semantic
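
The abstract above mentions weighing multiple loss functions by homoscedastic uncertainty but does not give a formula. The sketch below assumes one form commonly used in multi-task learning, where each loss L_i is scaled by 1/(2σ_i²) and a log σ_i term penalizes large uncertainties. The function name and the parametrization by `log_vars` (s_i = log σ_i²) are assumptions for illustration, not the authors' code. In training, the `log_vars` would be learnable parameters.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine losses as sum_i( 0.5 * exp(-s_i) * L_i + 0.5 * s_i ), with s_i = log(sigma_i^2).

    This equals sum_i( L_i / (2 * sigma_i^2) + log(sigma_i) ).
    """
    if len(losses) != len(log_vars):
        raise ValueError("need one log-variance per loss")
    return sum(0.5 * math.exp(-s) * loss + 0.5 * s
               for loss, s in zip(losses, log_vars))


# With all log-variances at zero (sigma = 1), every loss gets weight 0.5.
equal = uncertainty_weighted_loss([2.0, 4.0], [0.0, 0.0])

# A larger variance on the second stream down-weights its loss
# but pays a log(sigma) penalty.
skewed = uncertainty_weighted_loss([2.0, 4.0], [0.0, math.log(4.0)])
```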

Radioactive data: tracing through training

Title Radioactive data: tracing through training
Authors Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Abstract We want to detect whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. The mark is robust to strong variations such as different architectures or optimization methods. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Our experiments on large-scale benchmarks (ImageNet), using standard architectures (ResNet-18, VGG-16, DenseNet-121) and training procedures, show that we can detect usage of radioactive data with high confidence (p<10^-4) even when only 1% of the data used to train our model is radioactive. Our method is robust to data augmentation and the stochasticity of deep network optimization. As a result, it offers a much higher signal-to-noise ratio than data poisoning and backdoor methods.
Tasks Data Augmentation, Data Poisoning
Published 2020-02-03
URL https://arxiv.org/abs/2002.00937v1
PDF https://arxiv.org/pdf/2002.00937v1.pdf
PWC https://paperswithcode.com/paper/radioactive-data-tracing-through-training

Data augmentation with Möbius transformations

Title Data augmentation with Möbius transformations
Authors Sharon Zhou, Jiequan Zhang, Hang Jiang, Torbjörn Lundh, Andrew Y. Ng
Abstract Data augmentation has led to substantial improvements in the performance and generalization of deep models, and remains a highly adaptable method for evolving model architectures and varying amounts of data, in particular extremely scarce amounts of available training data. In this paper, we present a novel method of applying Möbius transformations to augment input images during training. Möbius transformations are bijective conformal maps that generalize image translation to operate over complex inversion in pixel space. As a result, Möbius transformations can operate on the sample level and preserve data labels. We show that the inclusion of Möbius transformations during training enables improved generalization over prior sample-level data augmentation techniques such as cutout and standard crop-and-flip transformations, most notably in low data regimes.
Tasks Data Augmentation
Published 2020-02-07
URL https://arxiv.org/abs/2002.02917v1
PDF https://arxiv.org/pdf/2002.02917v1.pdf
PWC https://paperswithcode.com/paper/data-augmentation-with-mobius-transformations
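
To make the map in the abstract above concrete, a Möbius transformation sends a complex number z to (az + b) / (cz + d) with ad - bc ≠ 0, and a pixel (x, y) can be treated as z = x + iy. The sketch below only shows the coordinate map and its inverse. The function names and the parameter values are hypothetical, and resampling an actual image (interpolation, boundary handling) as the paper would need is left out.

```python
def mobius(z, a, b, c, d):
    """Apply f(z) = (a*z + b) / (c*z + d); requires a*d - b*c != 0.

    Raises ZeroDivisionError at the pole z = -d/c.
    """
    if a * d - b * c == 0:
        raise ValueError("a*d - b*c must be non-zero")
    return (a * z + b) / (c * z + d)


def mobius_inverse(w, a, b, c, d):
    """Inverse map f^{-1}(w) = (d*w - b) / (-c*w + a)."""
    return mobius(w, d, -b, -c, a)


def warp_pixel(x, y, params):
    """Map pixel coordinates (x, y) through the transformation given by params."""
    w = mobius(complex(x, y), *params)
    return w.real, w.imag


# Pure translation is the special case a = d = 1, c = 0.
shifted = warp_pixel(3, 4, (1, 5 + 2j, 0, 1))

# A general (hypothetical) parameter choice, checked against its inverse.
params = (1 + 0.1j, 2, 0.001j, 1)
z = complex(10, 20)
round_trip = mobius_inverse(mobius(z, *params), *params)
```

The round trip recovering `z` reflects the bijectivity that the abstract relies on for label preservation.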

Keyphrase Extraction with Span-based Feature Representations

Title Keyphrase Extraction with Span-based Feature Representations
Authors Funan Mu, Zhenting Yu, LiFeng Wang, Yequan Wang, Qingyu Yin, Yibo Sun, Liqun Liu, Teng Ma, Jing Tang, Xing Zhou
Abstract Keyphrases are capable of providing semantic metadata characterizing documents and producing an overview of the content of a document. Since keyphrase extraction is able to facilitate the management, categorization, and retrieval of information, it has received much attention in recent years. There are three approaches to address keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling and (iii) generation using neural networks. The two-step ranking approach is based on feature engineering, which is labor intensive and domain dependent. Sequence labeling is not able to tackle overlapping phrases. Generation methods (i.e., sequence-to-sequence neural network models) overcome those shortcomings, so they have been widely studied and achieve state-of-the-art performance. However, generation methods cannot utilize context information effectively. In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens. In this way, our model obtains a representation for each keyphrase and further learns to capture the interaction between keyphrases in one document to get better ranking results. In addition, with the help of tokens, our model is able to extract overlapped keyphrases. Experimental results on the benchmark datasets show that our proposed model outperforms the existing methods by a large margin.
Tasks Feature Engineering
Published 2020-02-13
URL https://arxiv.org/abs/2002.05407v1
PDF https://arxiv.org/pdf/2002.05407v1.pdf
PWC https://paperswithcode.com/paper/keyphrase-extraction-with-span-based-feature

Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data

Title Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
Authors Sebastian Lunz, Yingzhen Li, Andrew Fitzgibbon, Nate Kushman
Abstract Recent work has shown the ability to learn generative models for 3D shapes from only unstructured 2D images. However, training such models requires differentiating through the rasterization step of the rendering process, therefore past work has focused on developing bespoke rendering models which smooth over this non-differentiable process in various ways. Such models are thus unable to take advantage of the photo-realistic, fully featured, industrial renderers built by the gaming and graphics industry. In this paper we introduce the first scalable training technique for 3D generative models from 2D data which utilizes an off-the-shelf non-differentiable renderer. To account for the non-differentiability, we introduce a proxy neural renderer to match the output of the non-differentiable renderer. We further propose discriminator output matching to ensure that the neural renderer learns to smooth over the rasterization appropriately. We evaluate our model on images rendered from our generated 3D shapes, and show that our model can consistently learn to generate better shapes than existing models when trained with exclusively unstructured 2D images.
Published 2020-02-28
URL https://arxiv.org/abs/2002.12674v1
PDF https://arxiv.org/pdf/2002.12674v1.pdf
PWC https://paperswithcode.com/paper/inverse-graphics-gan-learning-to-generate-3d

SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration

Title SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration
Authors Jun Shi, Jianfeng Xu, Kazuyuki Tasaka, Zhibo Chen
Abstract Accelerating the inference speed of CNNs is critical to their deployment in real-world applications. Among all the pruning approaches, those implementing a sparsity learning framework have been shown to be effective as they learn and prune the models in an end-to-end data-driven manner. However, these works impose the same sparsity regularization on all filters indiscriminately, which can hardly result in an optimal structure-sparse network. In this paper, we propose a Saliency-Adaptive Sparsity Learning (SASL) approach for further optimization. A novel and effective estimation of each filter, i.e., saliency, is designed, which is measured from two aspects: the importance for the prediction performance and the consumed computational resources. During sparsity learning, the regularization strength is adjusted according to the saliency, so our optimized format can better preserve the prediction performance while zeroing out more computation-heavy filters. The calculation of saliency introduces minimal overhead to the training process, which means our SASL is very efficient. During the pruning phase, in order to optimize the proposed data-dependent criterion, a hard sample mining strategy is utilized, which shows higher effectiveness and efficiency. Extensive experiments demonstrate the superior performance of our method. Notably, on the ILSVRC-2012 dataset, our approach can reduce the FLOPs of ResNet-50 by 49.7% with a negligible 0.39% top-1 and 0.05% top-5 accuracy degradation.
Published 2020-03-12
URL https://arxiv.org/abs/2003.05891v1
PDF https://arxiv.org/pdf/2003.05891v1.pdf
PWC https://paperswithcode.com/paper/sasl-saliency-adaptive-sparsity-learning-for

Advancing Renewable Electricity Consumption With Reinforcement Learning

Title Advancing Renewable Electricity Consumption With Reinforcement Learning
Authors Filip Tolovski
Abstract As the share of renewable energy sources in the present electric energy mix rises, their intermittence proves to be the biggest challenge to carbon free electricity generation. To address this challenge, we propose an electricity pricing agent, which sends price signals to the customers and contributes to shifting the customer demand to periods of high renewable energy generation. We propose an implementation of a pricing agent with a reinforcement learning approach where the environment is represented by the customers, the electricity generation utilities and the weather conditions.
Published 2020-03-09
URL https://arxiv.org/abs/2003.04310v1
PDF https://arxiv.org/pdf/2003.04310v1.pdf
PWC https://paperswithcode.com/paper/advancing-renewable-electricity-consumption

Dropout Prediction over Weeks in MOOCs via Interpretable Multi-Layer Representation Learning

Title Dropout Prediction over Weeks in MOOCs via Interpretable Multi-Layer Representation Learning
Authors Byungsoo Jeon, Namyong Park, Seojin Bang
Abstract Massive Open Online Courses (MOOCs) have become popular platforms for online learning. While MOOCs enable students to study at their own pace, this flexibility makes it easy for students to drop out of class. In this paper, our goal is to predict if a learner is going to drop out within the next week, given clickstream data for the current week. To this end, we present a multi-layer representation learning solution based on the branch and bound (BB) algorithm, which learns from low-level clickstreams in an unsupervised manner, produces interpretable results, and avoids manual feature engineering. In experiments on Coursera data, we show that our model learns a representation that allows a simple model to perform similarly well to more complex, task-specific models, and how the BB algorithm enables interpretable results. In our analysis of the observed limitations, we discuss promising future directions.
Tasks Feature Engineering, Representation Learning
Published 2020-02-05
URL https://arxiv.org/abs/2002.01598v1
PDF https://arxiv.org/pdf/2002.01598v1.pdf
PWC https://paperswithcode.com/paper/dropout-prediction-over-weeks-in-moocs-via

ESG investments: Filtering versus machine learning approaches

Title ESG investments: Filtering versus machine learning approaches
Authors Carmine de Franco, Christophe Geissler, Vincent Margot, Bruno Monnier
Abstract We designed a machine learning algorithm that identifies patterns between ESG profiles and financial performances for companies in a large investment universe. The algorithm consists of regularly updated sets of rules that map regions in the high-dimensional space of ESG features to excess return predictions. The final aggregated predictions are transformed into scores which allow us to design simple strategies that screen the investment universe for stocks with positive scores. By linking the ESG features with financial performances in a non-linear way, our strategy based upon our machine learning algorithm turns out to be an efficient stock picking tool, which outperforms classic strategies that screen stocks according to their ESG ratings, such as the popular best-in-class approach. Our paper brings new ideas to the growing field of financial literature that investigates the links between ESG behavior and the economy. We show that there is clearly some form of alpha in the ESG profile of a company, but that this alpha can be accessed only with powerful, non-linear techniques such as machine learning.
Published 2020-02-18
URL https://arxiv.org/abs/2002.07477v1
PDF https://arxiv.org/pdf/2002.07477v1.pdf
PWC https://paperswithcode.com/paper/esg-investments-filtering-versus-machine

SpotNet: Self-Attention Multi-Task Network for Object Detection

Title SpotNet: Self-Attention Multi-Task Network for Object Detection
Authors Hughes Perreault, Guillaume-Alexandre Bilodeau, Nicolas Saunier, Maguelonne Héritier
Abstract Humans are very good at directing their visual attention toward relevant areas when they search for different types of objects. For instance, when we search for cars, we will look at the streets, not at the top of buildings. The motivation of this paper is to train a network to do the same via a multi-task learning approach. To train visual attention, we produce foreground/background segmentation labels in a semi-supervised way, using background subtraction or optical flow. Using these labels, we train an object detection model to produce foreground/background segmentation maps as well as bounding boxes while sharing most model parameters. We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes, decreasing the signal of non-relevant areas. We show that by using this method, we obtain a significant mAP improvement on two traffic surveillance datasets, with state-of-the-art results on both UA-DETRAC and UAVDT.
Tasks Multi-Task Learning, Object Detection, Optical Flow Estimation
Published 2020-02-13
URL https://arxiv.org/abs/2002.05540v1
PDF https://arxiv.org/pdf/2002.05540v1.pdf
PWC https://paperswithcode.com/paper/spotnet-self-attention-multi-task-network-for
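
The attention step described in the abstract above, weighting the feature map by a foreground/background segmentation map, can be sketched as an element-wise product. This is only an illustration under assumptions: the function name, the nested-list layout (channels x height x width), and the toy values are hypothetical, and a real network would do this on tensors inside the model.

```python
def apply_attention(features, attention):
    """Scale every channel of `features` (C x H x W) by `attention` (H x W).

    Attention values near 0 suppress background locations; values near 1
    leave foreground locations unchanged.
    """
    return [
        [
            [value * weight for value, weight in zip(feat_row, att_row)]
            for feat_row, att_row in zip(channel, attention)
        ]
        for channel in features
    ]


# Two channels of a 2 x 2 feature map and one shared attention map.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
attention = [[1.0, 0.0], [0.5, 1.0]]

weighted = apply_attention(features, attention)
```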