April 3, 2020

3122 words 15 mins read

Paper Group ANR 14

Large-Scale Optimal Transport via Adversarial Training with Cycle-Consistency

Title Large-Scale Optimal Transport via Adversarial Training with Cycle-Consistency
Authors Guansong Lu, Zhiming Zhou, Jian Shen, Cheng Chen, Weinan Zhang, Yong Yu
Abstract Recent advances in large-scale optimal transport have greatly extended its application scenarios in machine learning. However, existing methods either do not explicitly learn the transport map or do not support general cost functions. In this paper, we propose an end-to-end approach for large-scale optimal transport which directly solves for the transport map and is compatible with general cost functions. It models the transport map via stochastic neural networks and enforces the constraint on the marginal distributions via adversarial training. The proposed framework can be further extended towards learning the Monge map or an optimal bijection by adopting cycle-consistency constraint(s). We verify the effectiveness of the proposed method and demonstrate its superior performance against existing methods on large-scale real-world applications, including domain adaptation, image-to-image translation, and color transfer.
Tasks Domain Adaptation, Image-to-Image Translation
Published 2020-03-14
URL https://arxiv.org/abs/2003.06635v1
PDF https://arxiv.org/pdf/2003.06635v1.pdf
PWC https://paperswithcode.com/paper/large-scale-optimal-transport-via-adversarial
Repo
Framework
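
To make the training recipe in the abstract above concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' code): a stochastic map T minimizes a transport cost while a discriminator enforces the marginal constraint on its outputs, and an inverse map S adds a cycle-consistency penalty. The toy Gaussian data, network sizes, squared-L2 cost, and loss weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

dim, noise_dim = 2, 8
T = nn.Sequential(nn.Linear(dim + noise_dim, 64), nn.ReLU(), nn.Linear(64, dim))  # stochastic forward map
S = nn.Sequential(nn.Linear(dim + noise_dim, 64), nn.ReLU(), nn.Linear(64, dim))  # inverse map
D = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))                # discriminator on the target side

opt_maps = torch.optim.Adam(list(T.parameters()) + list(S.parameters()), lr=1e-4)
opt_disc = torch.optim.Adam(D.parameters(), lr=1e-4)
cost = lambda a, b: ((a - b) ** 2).sum(dim=1).mean()   # illustrative squared-L2 cost c(x, y)

for step in range(1000):
    x = torch.randn(128, dim)             # toy source samples
    y = torch.randn(128, dim) + 3.0       # toy target samples
    z = torch.randn(128, noise_dim)       # noise for the stochastic map

    # Discriminator step: separate real target samples from transported ones.
    tx = T(torch.cat([x, z], dim=1))
    loss_d = nn.functional.softplus(-D(y)).mean() + nn.functional.softplus(D(tx.detach())).mean()
    opt_disc.zero_grad(); loss_d.backward(); opt_disc.step()

    # Map step: transport cost + adversarial marginal constraint + cycle-consistency.
    tx = T(torch.cat([x, z], dim=1))
    x_back = S(torch.cat([tx, z], dim=1))
    loss_map = cost(x, tx) + nn.functional.softplus(-D(tx)).mean() + 10.0 * cost(x, x_back)
    opt_maps.zero_grad(); loss_map.backward(); opt_maps.step()
```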

FMT:Fusing Multi-task Convolutional Neural Network for Person Search

Title FMT:Fusing Multi-task Convolutional Neural Network for Person Search
Authors Sulan Zhai, Shunqiang Liu, Xiao Wang, Jin Tang
Abstract Person search aims to detect all persons in an image and identify the query persons among the detected persons, without being given proposals or bounding boxes, which makes it different from person re-identification. In this paper, we propose a fusing multi-task convolutional neural network (FMT-CNN) to tackle the correlation and heterogeneity of detection and re-identification with a single convolutional neural network. We focus on how the interplay of person detection and person re-identification affects the overall performance. We employ person labels in the region proposal network to produce features for both person re-identification and person detection, which improves the accuracy of detection and re-identification simultaneously. We also train our re-identification network with multiple losses. Experimental results on the CUHK-SYSU Person Search dataset show that our proposed method is superior to state-of-the-art approaches in both mAP and top-1 accuracy.
Tasks Human Detection, Person Re-Identification, Person Search
Published 2020-03-01
URL https://arxiv.org/abs/2003.00406v1
PDF https://arxiv.org/pdf/2003.00406v1.pdf
PWC https://paperswithcode.com/paper/fmtfusing-multi-task-convolutional-neural
Repo
Framework

Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

Title Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Authors Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu
Abstract The weakly-supervised action localization problem requires training a model to localize the action segments in a video given only the video-level action label. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). Since only the bag's label is known, the main challenge is to identify which key instances within the bag trigger the bag's label. Most previous models use an attention-based approach: they use attention to generate the bag's representation from its instances and then train it via bag classification. In this work, we explicitly model the key-instance assignment as a hidden variable and adopt an Expectation-Maximization framework. We derive two pseudo-label generation schemes to model the E and M steps and iteratively optimize the likelihood lower bound. We also show that previous attention-based models implicitly violate the MIL assumption that instances in negative bags should be uniformly negative, whereas our EM-MIL approach models this assumption more accurately. Our model achieves state-of-the-art performance on two standard benchmarks, THUMOS14 and ActivityNet1.2, and is better at detecting relatively complete action boundaries in videos containing multiple actions.
Tasks Action Localization, Multiple Instance Learning, Weakly Supervised Action Localization
Published 2020-03-31
URL https://arxiv.org/abs/2004.00163v1
PDF https://arxiv.org/pdf/2004.00163v1.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-action-localization-with-2
Repo
Framework
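
The EM alternation described above can be illustrated with a small, hypothetical PyTorch sketch (not the authors' implementation): the E-step converts current segment scores into pseudo-labels for the hidden key-instance assignment, and the M-step updates the segment classifier on those pseudo-labels. The top-k rule, toy bags, and network sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

segment_clf = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(segment_clf.parameters(), lr=1e-3)
# Toy bags: each video is a bag of segment features with a single video-level label.
dataset = [(torch.randn(16, 32), 1), (torch.randn(16, 32), 0)]

def e_step(scores, bag_label, top_k=2):
    """Pseudo-label generation: in a positive bag, treat the top-k scoring
    segments as key instances; in a negative bag, every segment is negative."""
    pseudo = torch.zeros_like(scores)
    if bag_label == 1:
        pseudo[scores.topk(top_k).indices] = 1.0
    return pseudo

for epoch in range(20):
    for features, bag_label in dataset:
        scores = segment_clf(features).squeeze(-1)       # per-segment action scores
        with torch.no_grad():
            pseudo = e_step(scores, bag_label)           # E-step: key-instance assignment
        loss = nn.functional.binary_cross_entropy_with_logits(scores, pseudo)  # M-step
        opt.zero_grad(); loss.backward(); opt.step()
```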

Differentially Private Federated Learning for Resource-Constrained Internet of Things

Title Differentially Private Federated Learning for Resource-Constrained Internet of Things
Authors Rui Hu, Yuanxiong Guo, E. Paul. Ratazzi, Yanmin Gong
Abstract With the proliferation of smart devices having built-in sensors, Internet connectivity, and programmable computation capability in the era of the Internet of Things (IoT), a tremendous amount of data is being generated at the network edge. Federated learning can analyze this large amount of data from a distributed set of smart devices without requiring them to upload their data to a central place. However, the commonly used federated learning algorithm is based on stochastic gradient descent (SGD) and is not suitable for resource-constrained IoT environments due to its high communication cost. Moreover, the privacy of sensitive data on smart devices has become a key concern and needs to be protected rigorously. This paper proposes a novel federated learning framework called DP-PASGD for efficiently training a machine learning model from the data stored across resource-constrained smart devices in IoT while guaranteeing differential privacy. The optimal design of DP-PASGD that maximizes learning performance while satisfying the limits on resource cost and privacy loss is formulated as an optimization problem, and an approximate solution method based on the convergence analysis of DP-PASGD is developed to solve this optimization problem efficiently. Numerical results based on real-world datasets verify the effectiveness of the proposed DP-PASGD scheme.
Tasks
Published 2020-03-28
URL https://arxiv.org/abs/2003.12705v1
PDF https://arxiv.org/pdf/2003.12705v1.pdf
PWC https://paperswithcode.com/paper/differentially-private-federated-learning-for
Repo
Framework
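
As a rough illustration of the kind of scheme the abstract describes, here is a hypothetical NumPy sketch of differentially private SGD with periodic averaging on a toy linear-regression task: each device runs several local steps on clipped, noised gradients, and the device models are averaged periodically to limit communication. The clipping norm, noise scale, and averaging period are illustrative assumptions, not the paper's tuned design.

```python
import numpy as np

rng = np.random.default_rng(0)
num_devices, dim, local_steps, clip, sigma, lr = 4, 10, 5, 1.0, 0.8, 0.1
w = np.zeros((num_devices, dim))                                   # per-device model copies
data = [rng.normal(size=(100, dim)) for _ in range(num_devices)]   # toy local datasets
labels = [x @ np.ones(dim) + rng.normal(size=100) for x in data]   # toy regression targets

for rnd in range(50):
    for i in range(num_devices):
        for _ in range(local_steps):                               # local DP-SGD steps
            idx = rng.integers(0, 100, size=16)
            g = 2 * data[i][idx].T @ (data[i][idx] @ w[i] - labels[i][idx]) / 16
            g = g / max(1.0, np.linalg.norm(g) / clip)             # clip the gradient norm
            g += rng.normal(scale=sigma * clip, size=dim)          # Gaussian noise for DP
            w[i] -= lr * g
    w[:] = w.mean(axis=0)                                          # periodic model averaging
```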

Absolute Shapley Value

Title Absolute Shapley Value
Authors Jinfei Liu
Abstract The Shapley value, named in honor of Lloyd Shapley, is a concept in cooperative game theory for measuring the contribution of each participant. It has recently been applied in data marketplaces to allocate compensation based on each participant's contribution to the trained models. The Shapley value is the only value division scheme used for compensation allocation that meets three desirable criteria: group rationality, fairness, and additivity. In cooperative game theory, the marginal contribution of each contributor to each coalition is a nonnegative value. In machine learning model training, however, the marginal contribution of each contributor (data tuple) to each coalition (a set of data tuples) can be negative, i.e., the accuracy of a model trained on a dataset with an additional data tuple can be lower than the accuracy of the model trained on the dataset alone. In this paper, we investigate how to handle negative marginal contributions when computing the Shapley value. We explore three philosophies: 1) taking the original value (Original Shapley Value); 2) taking the larger of the original value and zero (Zero Shapley Value); and 3) taking the absolute value of the original value (Absolute Shapley Value). Experiments on the Iris dataset demonstrate that Absolute Shapley Value significantly outperforms the other two definitions in terms of evaluating data importance (the contribution of each data tuple to the trained model).
Tasks
Published 2020-03-23
URL https://arxiv.org/abs/2003.10076v1
PDF https://arxiv.org/pdf/2003.10076v1.pdf
PWC https://paperswithcode.com/paper/absolute-shapley-value
Repo
Framework
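
The three treatments of negative marginal contributions can be shown in a short, self-contained Python sketch that enumerates all permutations for a toy three-player game with an additive stand-in utility (in the paper, the utility would be the accuracy of a model trained on the coalition of data tuples).

```python
import itertools
import math

def utility(coalition):
    """Stand-in for the accuracy of a model trained on the coalition of data
    tuples; a toy additive utility so the example runs instantly."""
    return sum(value for _, value in coalition)

players = [("t1", 0.3), ("t2", -0.1), ("t3", 0.5)]    # (data tuple id, toy contribution)

def shapley(players, variant="original"):
    n = len(players)
    phi = {name: 0.0 for name, _ in players}
    for perm in itertools.permutations(players):
        prev = 0.0
        for k, (name, _) in enumerate(perm):
            current = utility(perm[: k + 1])
            marginal = current - prev                  # marginal contribution of this tuple
            prev = current
            if variant == "zero":                      # clamp negative contributions at zero
                marginal = max(marginal, 0.0)
            elif variant == "absolute":                # take the absolute value
                marginal = abs(marginal)
            phi[name] += marginal / math.factorial(n)  # average over all permutations
    return phi

for variant in ("original", "zero", "absolute"):
    print(variant, shapley(players, variant))
```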

Learning to Compare Relation: Semantic Alignment for Few-Shot Learning

Title Learning to Compare Relation: Semantic Alignment for Few-Shot Learning
Authors Congqi Cao, Yanning Zhang
Abstract Few-shot learning is a fundamental and challenging problem since it requires recognizing novel categories from only a few examples. The objects to be recognized have many variants and can be located anywhere in an image, so directly comparing query images with example images cannot handle content misalignment. The representation and metric for comparison are critical but challenging to learn due to the scarcity and wide variation of the samples in few-shot learning. In this paper, we present a novel semantic alignment model to compare relations, which is robust to content misalignment. We propose to add several key ingredients to existing few-shot learning frameworks for better feature and metric learning. First, we introduce a semantic alignment loss to align the relation statistics of features from samples that belong to the same category. Second, we introduce local and global mutual information maximization, allowing for representations that contain locally consistent and intra-class shared information across structural locations in an image. Third, we introduce a principled approach to weight multiple loss functions by considering the homoscedastic uncertainty of each stream. We conduct extensive experiments on several few-shot learning datasets. Experimental results show that the proposed method is capable of comparing relations with semantic alignment strategies and achieves state-of-the-art performance.
Tasks Few-Shot Learning, Metric Learning
Published 2020-02-29
URL https://arxiv.org/abs/2003.00210v1
PDF https://arxiv.org/pdf/2003.00210v1.pdf
PWC https://paperswithcode.com/paper/learning-to-compare-relation-semantic
Repo
Framework
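
The third ingredient, weighting loss streams by homoscedastic uncertainty, can be sketched as follows in PyTorch; this is a generic formulation in the spirit of Kendall et al., not necessarily the authors' exact parameterization, and the loss values are placeholders.

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learns a log-variance per loss stream; streams with high uncertainty are
    automatically down-weighted, and the log-variance term prevents collapse."""
    def __init__(self, num_losses):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_losses))

    def forward(self, losses):
        total = 0.0
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]   # loss / sigma^2 + log sigma^2
        return total

weighting = UncertaintyWeighting(num_losses=3)
# Placeholders for the classification, semantic-alignment, and mutual-information losses.
losses = [torch.tensor(1.2), torch.tensor(0.7), torch.tensor(0.4)]
total_loss = weighting(losses)
```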

Radioactive data: tracing through training

Title Radioactive data: tracing through training
Authors Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
Abstract We want to detect whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. The mark is robust to strong variations such as different architectures or optimization methods. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Our experiments on large-scale benchmarks (ImageNet), using standard architectures (ResNet-18, VGG-16, DenseNet-121) and training procedures, show that we can detect usage of radioactive data with high confidence (p < 10^-4) even when only 1% of the data used to train the model is radioactive. Our method is robust to data augmentation and the stochasticity of deep network optimization. As a result, it offers a much higher signal-to-noise ratio than data poisoning and backdoor methods.
Tasks Data Augmentation, data poisoning
Published 2020-02-03
URL https://arxiv.org/abs/2002.00937v1
PDF https://arxiv.org/pdf/2002.00937v1.pdf
PWC https://paperswithcode.com/paper/radioactive-data-tracing-through-training
Repo
Framework
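
As a rough, hypothetical illustration of the detection side: if training images are marked so that their features align with a secret carrier direction u, the weights of a classifier trained on them should align with u more than chance, and the null distribution of random cosines gives a p-value. The numbers below are toy values, not the authors' detector.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
d = 512
u = rng.normal(size=d)
u /= np.linalg.norm(u)                               # secret carrier direction used for marking
w = 0.1 * u + rng.normal(scale=0.05, size=d)         # toy classifier weights, slightly aligned with u
cosine = float(w @ u / np.linalg.norm(w))
# Under the null hypothesis (no radioactive data), the cosine between a fixed
# unit vector and a random direction in d dimensions is approximately Gaussian
# with standard deviation 1/sqrt(d), which yields a one-sided p-value.
p_value = 0.5 * math.erfc(cosine * math.sqrt(d / 2.0))
print(f"cosine alignment = {cosine:.3f}, p-value = {p_value:.2e}")
```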

Data augmentation with Möbius transformations

Title Data augmentation with Möbius transformations
Authors Sharon Zhou, Jiequan Zhang, Hang Jiang, Torbjörn Lundh, Andrew Y. Ng
Abstract Data augmentation has led to substantial improvements in the performance and generalization of deep models, and remains a highly adaptable method for evolving model architectures and varying amounts of data, in particular extremely scarce amounts of available training data. In this paper, we present a novel method of applying Möbius transformations to augment input images during training. Möbius transformations are bijective conformal maps that generalize image translation to operate over complex inversion in pixel space. As a result, Möbius transformations can operate on the sample level and preserve data labels. We show that the inclusion of Möbius transformations during training enables improved generalization over prior sample-level data augmentation techniques, such as cutout and standard crop-and-flip transformations, most notably in low data regimes.
Tasks Data Augmentation
Published 2020-02-07
URL https://arxiv.org/abs/2002.02917v1
PDF https://arxiv.org/pdf/2002.02917v1.pdf
PWC https://paperswithcode.com/paper/data-augmentation-with-mobius-transformations
Repo
Framework
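
A toy NumPy sketch of the augmentation idea, assuming nearest-neighbor resampling and illustrative transformation parameters (not the paper's sampling scheme): pixel coordinates are treated as complex numbers and each output pixel is pulled back through the inverse Möbius map f^{-1}(z) = (dz - b) / (-cz + a).

```python
import numpy as np

def mobius_augment(img, a=1 + 0j, b=8 + 8j, c=0.002 + 0j, d=1 + 0j):
    """Apply f(z) = (a z + b) / (c z + d) to pixel coordinates, with ad - bc != 0."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    z = xs + 1j * ys                                   # output pixel grid as complex numbers
    src = (d * z - b) / (-c * z + a)                   # inverse map: source pixel for each output pixel
    sx = np.clip(np.round(src.real).astype(int), 0, w - 1)   # nearest-neighbor lookup,
    sy = np.clip(np.round(src.imag).astype(int), 0, h - 1)   # clamped to the image for simplicity
    return img[sy, sx]

image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)   # toy image
augmented = mobius_augment(image)                                     # label is preserved
```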

Keyphrase Extraction with Span-based Feature Representations

Title Keyphrase Extraction with Span-based Feature Representations
Authors Funan Mu, Zhenting Yu, LiFeng Wang, Yequan Wang, Qingyu Yin, Yibo Sun, Liqun Liu, Teng Ma, Jing Tang, Xing Zhou
Abstract Keyphrases provide semantic metadata that characterizes documents and gives an overview of their content. Since keyphrase extraction facilitates the management, categorization, and retrieval of information, it has received much attention in recent years. There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks. The two-step ranking approach is based on feature engineering, which is labor-intensive and domain-dependent. Sequence labeling cannot handle overlapping phrases. Generation methods (i.e., sequence-to-sequence neural network models) overcome those shortcomings, so they have been widely studied and achieve state-of-the-art performance. However, generation methods cannot utilize context information effectively. In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens. In this way, our model obtains a representation for each keyphrase and further learns to capture the interaction between keyphrases in one document to obtain better ranking results. In addition, with the help of tokens, our model is able to extract overlapping keyphrases. Experimental results on benchmark datasets show that our proposed model outperforms existing methods by a large margin.
Tasks Feature Engineering
Published 2020-02-13
URL https://arxiv.org/abs/2002.05407v1
PDF https://arxiv.org/pdf/2002.05407v1.pdf
PWC https://paperswithcode.com/paper/keyphrase-extraction-with-span-based-feature
Repo
Framework
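
A small, hypothetical sketch of span-based scoring in the spirit of the abstract: every candidate span gets a representation built from its token features (here simply the concatenation of its start and end token vectors, with random features standing in for a contextual encoder) and a score; overlapping spans are scored independently, so overlapping keyphrases can both be extracted.

```python
import torch
import torch.nn as nn

hidden, num_tokens, max_span = 128, 50, 5
token_feats = torch.randn(num_tokens, hidden)   # stand-in for contextual token features (e.g., from BERT)
scorer = nn.Linear(2 * hidden, 1)               # scores a span from its start/end token vectors

spans, reps = [], []
for start in range(num_tokens):
    for end in range(start, min(start + max_span, num_tokens)):
        spans.append((start, end))
        reps.append(torch.cat([token_feats[start], token_feats[end]]))

scores = scorer(torch.stack(reps)).squeeze(-1)
top = scores.topk(5).indices
print([spans[i] for i in top])                  # top-scoring candidate keyphrase spans (may overlap)
```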

Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data

Title Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
Authors Sebastian Lunz, Yingzhen Li, Andrew Fitzgibbon, Nate Kushman
Abstract Recent work has shown the ability to learn generative models for 3D shapes from only unstructured 2D images. However, training such models requires differentiating through the rasterization step of the rendering process, therefore past work has focused on developing bespoke rendering models which smooth over this non-differentiable process in various ways. Such models are thus unable to take advantage of the photo-realistic, fully featured, industrial renderers built by the gaming and graphics industry. In this paper we introduce the first scalable training technique for 3D generative models from 2D data which utilizes an off-the-shelf non-differentiable renderer. To account for the non-differentiability, we introduce a proxy neural renderer to match the output of the non-differentiable renderer. We further propose discriminator output matching to ensure that the neural renderer learns to smooth over the rasterization appropriately. We evaluate our model on images rendered from our generated 3D shapes, and show that our model can consistently learn to generate better shapes than existing models when trained with exclusively unstructured 2D images.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12674v1
PDF https://arxiv.org/pdf/2002.12674v1.pdf
PWC https://paperswithcode.com/paper/inverse-graphics-gan-learning-to-generate-3d
Repo
Framework
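
A highly simplified, hypothetical sketch of the two matching terms the abstract mentions: a differentiable proxy renderer is trained to reproduce the output of a non-differentiable renderer (here a toy max-projection stands in for an industrial renderer), and discriminator output matching additionally asks the discriminator to respond to proxy renders as it does to the reference renders.

```python
import torch
import torch.nn as nn

def offtheshelf_render(voxels):
    """Stand-in for the non-differentiable, off-the-shelf renderer: a max
    projection of a 16x16x16 voxel grid to a 16x16 silhouette, no gradients."""
    with torch.no_grad():
        return voxels.view(-1, 16, 16, 16).amax(dim=1).reshape(-1, 256)

proxy = nn.Sequential(nn.Linear(16 ** 3, 512), nn.ReLU(), nn.Linear(512, 256))  # differentiable proxy renderer
disc = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))         # image discriminator
opt = torch.optim.Adam(proxy.parameters(), lr=1e-4)

voxels = (torch.rand(8, 16 ** 3) > 0.7).float()        # toy generated 3D shapes
target = offtheshelf_render(voxels)                     # reference render (no gradient path)
render = proxy(voxels)                                  # proxy render (differentiable)
pixel_match = nn.functional.mse_loss(render, target)                 # match the renderer's output
disc_match = nn.functional.mse_loss(disc(render), disc(target))      # discriminator output matching
loss = pixel_match + disc_match
opt.zero_grad(); loss.backward(); opt.step()
```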

SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration

Title SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration
Authors Jun Shi, Jianfeng Xu, Kazuyuki Tasaka, Zhibo Chen
Abstract Accelerating the inference speed of CNNs is critical to their deployment in real-world applications. Among all pruning approaches, those implementing a sparsity learning framework have been shown to be effective because they learn and prune the models in an end-to-end data-driven manner. However, these works impose the same sparsity regularization on all filters indiscriminately, which can hardly result in an optimal structure-sparse network. In this paper, we propose a Saliency-Adaptive Sparsity Learning (SASL) approach for further optimization. A novel and effective estimate of each filter, i.e., saliency, is designed, which is measured from two aspects: the importance for prediction performance and the consumed computational resources. During sparsity learning, the regularization strength is adjusted according to the saliency, so our optimized format can better preserve prediction performance while zeroing out more computation-heavy filters. The calculation of saliency introduces minimal overhead to the training process, which means our SASL is very efficient. During the pruning phase, in order to optimize the proposed data-dependent criterion, a hard sample mining strategy is utilized, which shows higher effectiveness and efficiency. Extensive experiments demonstrate the superior performance of our method. Notably, on the ILSVRC-2012 dataset, our approach reduces the FLOPs of ResNet-50 by 49.7% with a negligible 0.39% top-1 and 0.05% top-5 accuracy degradation.
Tasks
Published 2020-03-12
URL https://arxiv.org/abs/2003.05891v1
PDF https://arxiv.org/pdf/2003.05891v1.pdf
PWC https://paperswithcode.com/paper/sasl-saliency-adaptive-sparsity-learning-for
Repo
Framework
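
A hypothetical sketch of saliency-adaptive regularization in the spirit of the abstract (the saliency definition below is an illustrative stand-in, not the paper's): each filter's scaling factor receives an L1 penalty whose strength shrinks with the filter's saliency, where saliency rises with estimated importance and falls with computational cost, so heavy, unimportant filters are pushed toward zero.

```python
import torch

gammas = torch.randn(64, requires_grad=True)       # per-filter scaling factors (e.g., BN gammas)
importance = torch.rand(64)                         # proxy for each filter's effect on accuracy
flops = torch.rand(64) * 1e6                        # compute consumed by each filter
saliency = importance / (flops / flops.max())       # important and cheap filters -> high saliency
lam = 1e-4 * (saliency.max() / saliency)             # weaker penalty for high-saliency filters
sparsity_loss = (lam * gammas.abs()).sum()           # added to the task loss during training
sparsity_loss.backward()
```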

Advancing Renewable Electricity Consumption With Reinforcement Learning

Title Advancing Renewable Electricity Consumption With Reinforcement Learning
Authors Filip Tolovski
Abstract As the share of renewable energy sources in the present electric energy mix rises, their intermittency proves to be the biggest challenge to carbon-free electricity generation. To address this challenge, we propose an electricity pricing agent that sends price signals to customers and contributes to shifting customer demand to periods of high renewable energy generation. We propose an implementation of the pricing agent with a reinforcement learning approach in which the environment is represented by the customers, the electricity generation utilities, and the weather conditions.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.04310v1
PDF https://arxiv.org/pdf/2003.04310v1.pdf
PWC https://paperswithcode.com/paper/advancing-renewable-electricity-consumption
Repo
Framework

Dropout Prediction over Weeks in MOOCs via Interpretable Multi-Layer Representation Learning

Title Dropout Prediction over Weeks in MOOCs via Interpretable Multi-Layer Representation Learning
Authors Byungsoo Jeon, Namyong Park, Seojin Bang
Abstract Massive Open Online Courses (MOOCs) have become popular platforms for online learning. While MOOCs enable students to study at their own pace, this flexibility makes it easy for students to drop out of class. In this paper, our goal is to predict whether a learner is going to drop out within the next week, given clickstream data for the current week. To this end, we present a multi-layer representation learning solution based on a branch-and-bound (BB) algorithm, which learns from low-level clickstreams in an unsupervised manner, produces interpretable results, and avoids manual feature engineering. In experiments on Coursera data, we show that our model learns a representation that allows a simple model to perform similarly well to more complex, task-specific models, and we show how the BB algorithm enables interpretable results. In our analysis of the observed limitations, we discuss promising future directions.
Tasks Feature Engineering, Representation Learning
Published 2020-02-05
URL https://arxiv.org/abs/2002.01598v1
PDF https://arxiv.org/pdf/2002.01598v1.pdf
PWC https://paperswithcode.com/paper/dropout-prediction-over-weeks-in-moocs-via
Repo
Framework

ESG investments: Filtering versus machine learning approaches

Title ESG investments: Filtering versus machine learning approaches
Authors Carmine de Franco, Christophe Geissler, Vincent Margot, Bruno Monnier
Abstract We designed a machine learning algorithm that identifies patterns between ESG profiles and financial performance for companies in a large investment universe. The algorithm consists of regularly updated sets of rules that map regions of the high-dimensional space of ESG features to excess return predictions. The final aggregated predictions are transformed into scores, which allow us to design simple strategies that screen the investment universe for stocks with positive scores. By linking ESG features with financial performance in a non-linear way, the strategy based on our machine learning algorithm turns out to be an efficient stock-picking tool that outperforms classic strategies screening stocks according to their ESG ratings, such as the popular best-in-class approach. Our paper brings new ideas to the growing body of financial literature that investigates the links between ESG behavior and the economy. We indeed show that there is clearly some form of alpha in the ESG profile of a company, but that this alpha can be accessed only with powerful, non-linear techniques such as machine learning.
Tasks
Published 2020-02-18
URL https://arxiv.org/abs/2002.07477v1
PDF https://arxiv.org/pdf/2002.07477v1.pdf
PWC https://paperswithcode.com/paper/esg-investments-filtering-versus-machine
Repo
Framework

SpotNet: Self-Attention Multi-Task Network for Object Detection

Title SpotNet: Self-Attention Multi-Task Network for Object Detection
Authors Hughes Perreault, Guillaume-Alexandre Bilodeau, Nicolas Saunier, Maguelonne Héritier
Abstract Humans are very good at directing their visual attention toward relevant areas when they search for different types of objects. For instance, when we search for cars, we will look at the streets, not at the top of buildings. The motivation of this paper is to train a network to do the same via a multi-task learning approach. To train visual attention, we produce foreground/background segmentation labels in a semi-supervised way, using background subtraction or optical flow. Using these labels, we train an object detection model to produce foreground/background segmentation maps as well as bounding boxes while sharing most model parameters. We use those segmentation maps inside the network as a self-attention mechanism to weight the feature map used to produce the bounding boxes, decreasing the signal of non-relevant areas. We show that by using this method, we obtain a significant mAP improvement on two traffic surveillance datasets, with state-of-the-art results on both UA-DETRAC and UAVDT.
Tasks Multi-Task Learning, Object Detection, Optical Flow Estimation
Published 2020-02-13
URL https://arxiv.org/abs/2002.05540v1
PDF https://arxiv.org/pdf/2002.05540v1.pdf
PWC https://paperswithcode.com/paper/spotnet-self-attention-multi-task-network-for
Repo
Framework
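
A minimal, hypothetical PyTorch sketch of the self-attention idea in the abstract: the network predicts a foreground/background map from shared features and multiplies it into the feature map before the detection head, suppressing responses in irrelevant regions. The tiny backbone, heads, and random pseudo-masks (standing in for background-subtraction labels) are assumptions for illustration.

```python
import torch
import torch.nn as nn

backbone = nn.Conv2d(3, 64, kernel_size=3, padding=1)   # toy shared backbone
seg_head = nn.Conv2d(64, 1, kernel_size=1)              # foreground/background logits
det_head = nn.Conv2d(64, 4 + 1, kernel_size=1)          # toy box regression + objectness

images = torch.randn(2, 3, 128, 128)
features = backbone(images)
seg_map = torch.sigmoid(seg_head(features))             # attention map in [0, 1]
attended = features * seg_map                            # weight features by predicted foreground
detections = det_head(attended)                           # detection outputs from attended features

# Semi-supervised segmentation supervision: random masks stand in for labels
# produced by background subtraction or optical flow.
pseudo_mask = (torch.rand(2, 1, 128, 128) > 0.7).float()
seg_loss = nn.functional.binary_cross_entropy(seg_map, pseudo_mask)
```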