February 1, 2020

3177 words 15 mins read

Paper Group AWR 97

Integrating Semantic Knowledge to Tackle Zero-shot Text Classification. Bayesian active learning for optimization and uncertainty quantification in protein docking. Bayesian Optimization of Composite Functions. Fast Tree Variants of Gromov-Wasserstein. CANet: Cross-disease Attention Network for Joint Diabetic Retinopathy and Diabetic Macular Edema …

Integrating Semantic Knowledge to Tackle Zero-shot Text Classification

Title Integrating Semantic Knowledge to Tackle Zero-shot Text Classification
Authors Jingqing Zhang, Piyawat Lertvittayakumjorn, Yike Guo
Abstract Insufficient or even unavailable training data for emerging classes is a big challenge for many classification tasks, including text classification. Recognising text documents of classes that have never been seen in the learning stage, so-called zero-shot text classification, is therefore difficult, and only a few previous works have tackled this problem. In this paper, we propose a two-phase framework together with data augmentation and feature augmentation to solve this problem. Four kinds of semantic knowledge (word embeddings, class descriptions, class hierarchy, and a general knowledge graph) are incorporated into the proposed framework to deal with instances of unseen classes effectively. Experimental results show that each of the two phases, as well as their combination, achieves the best overall accuracy compared with baselines and recent approaches in classifying real-world texts under the zero-shot scenario.
Tasks Data Augmentation, Text Classification, Word Embeddings
Published 2019-03-29
URL http://arxiv.org/abs/1903.12626v1
PDF http://arxiv.org/pdf/1903.12626v1.pdf
PWC https://paperswithcode.com/paper/integrating-semantic-knowledge-to-tackle-zero
Repo https://github.com/kevinwong2013/COMS4995_Team_4_Zero_Shot_Classifier
Framework tf
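As a toy illustration of how one of the four knowledge sources, word embeddings, can support zero-shot classification, the sketch below assigns a document to the unseen class whose name embedding is closest in cosine similarity to the averaged document embedding. The two-dimensional embedding values are made up for illustration; the paper's framework is far richer, adding class descriptions, hierarchy, and a knowledge graph across its two phases.

```python
import math

# Toy 2-D word embeddings (hypothetical values, for illustration only).
EMB = {
    "goal": [0.9, 0.1], "match": [0.8, 0.2], "team": [0.85, 0.15],
    "election": [0.1, 0.9], "vote": [0.2, 0.8],
    "sports": [0.9, 0.1], "politics": [0.1, 0.9],
}

def embed(tokens):
    """Average the embeddings of the known tokens."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def zero_shot_classify(doc_tokens, class_names):
    """Assign the class whose name embedding is closest to the document:
    no training example of either class is required."""
    d = embed(doc_tokens)
    return max(class_names, key=lambda c: cosine(d, EMB[c]))

label = zero_shot_classify(["goal", "match", "team"], ["sports", "politics"])
```

A document about goals and teams lands on "sports" even though no labelled sports document was ever seen, which is the core zero-shot idea the semantic knowledge enables.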

Bayesian active learning for optimization and uncertainty quantification in protein docking

Title Bayesian active learning for optimization and uncertainty quantification in protein docking
Authors Yue Cao, Yang Shen
Abstract Motivation: Ab initio protein docking represents a major challenge for optimizing a noisy and costly “black box”-like function in a high-dimensional space. Despite progress in this field, no docking method is available for rigorous uncertainty quantification (UQ) of its solution quality (e.g. interface RMSD or iRMSD). Results: We introduce a novel algorithm, Bayesian Active Learning (BAL), for optimization and UQ of such black-box functions and flexible protein docking. BAL directly models the posterior distribution of the global optimum (or native structures for protein docking), with active sampling and posterior estimation iteratively feeding each other. Furthermore, we use complex normal modes to represent a homogeneous Euclidean conformation space suitable for high-dimensional optimization and construct funnel-like energy models for encounter complexes. Over a protein docking benchmark set and a CAPRI set including homology docking, we establish that BAL significantly improves on both starting points from rigid docking and refinements by particle swarm optimization, providing a top-3 near-native prediction for one third of the targets. BAL also generates tight confidence intervals, with a half range around 25% of iRMSD and a confidence level of 85%. Its estimated probability of a prediction being native or not achieves a binary classification AUROC of 0.93 and an AUPRC over 0.60 (compared to 0.14 by chance), and is also found to help rank predictions. To the best of our knowledge, this study represents the first uncertainty quantification solution for protein docking with theoretical rigor and comprehensive assessment. Source codes are available at https://github.com/Shen-Lab/BAL.
Tasks Active Learning
Published 2019-01-31
URL http://arxiv.org/abs/1902.00067v1
PDF http://arxiv.org/pdf/1902.00067v1.pdf
PWC https://paperswithcode.com/paper/bayesian-active-learning-for-optimization-and
Repo https://github.com/Shen-Lab/BAL
Framework none
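The abstract's central loop, active sampling and posterior estimation feeding each other, can be sketched very loosely as follows. This toy uses a simple exponential weighting over a finite candidate grid in place of the paper's posterior model, and a made-up noisy objective; everything here is an illustrative assumption, not the BAL algorithm itself.

```python
import math
import random

def bayesian_active_search(f, candidates, n_iter=60, seed=0):
    """Toy active-learning loop: sample where the current belief says the
    optimum may be, evaluate the noisy black box there, and let the new
    observation reshape the belief for the next round."""
    rng = random.Random(seed)
    observed = {}
    for _ in range(n_iter):
        if observed:
            best = min(observed.values())
            # Observed points are weighted by closeness to the current best;
            # unobserved points keep weight 1 (a crude uniform prior).
            weights = [math.exp(best - observed.get(x, best)) for x in candidates]
        else:
            weights = [1.0] * len(candidates)
        x = rng.choices(candidates, weights=weights, k=1)[0]
        observed[x] = f(x) + rng.gauss(0, 0.05)  # noisy black-box evaluation
    posterior_mode = min(observed, key=observed.get)
    return posterior_mode, observed

# Noisy quadratic with its minimum at x = 2.
mode, obs = bayesian_active_search(lambda x: (x - 2) ** 2,
                                   [i * 0.5 for i in range(9)])
```

The returned mode concentrates near the true minimizer as evaluations accumulate, which is the "posterior of the global optimum" viewpoint the paper formalizes rigorously.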

Bayesian Optimization of Composite Functions

Title Bayesian Optimization of Composite Functions
Authors Raul Astudillo, Peter I. Frazier
Abstract We consider optimization of composite objective functions, i.e., of the form $f(x)=g(h(x))$, where $h$ is a black-box derivative-free expensive-to-evaluate function with vector-valued outputs, and $g$ is a cheap-to-evaluate real-valued function. While these problems can be solved with standard Bayesian optimization, we propose a novel approach that exploits the composite structure of the objective function to substantially improve sampling efficiency. Our approach models $h$ using a multi-output Gaussian process and chooses where to sample using the expected improvement evaluated on the implied non-Gaussian posterior on $f$, which we call expected improvement for composite functions (\ei). Although \ei\ cannot be computed in closed form, we provide a novel stochastic gradient estimator that allows its efficient maximization. We also show that our approach is asymptotically consistent, i.e., that it recovers a globally optimal solution as sampling effort grows to infinity, generalizing previous convergence results for classical expected improvement. Numerical experiments show that our approach dramatically outperforms standard Bayesian optimization benchmarks, reducing simple regret by several orders of magnitude.
Tasks
Published 2019-06-04
URL https://arxiv.org/abs/1906.01537v1
PDF https://arxiv.org/pdf/1906.01537v1.pdf
PWC https://paperswithcode.com/paper/bayesian-optimization-of-composite-functions
Repo https://github.com/RaulAstudillo06/BOCF
Framework none
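Although the paper notes that EI for composite functions has no closed form, its Monte Carlo shape is easy to sketch: sample the posterior on the vector-valued $h(x)$, push each sample through the cheap $g$, and average the improvements. In this sketch independent normals stand in for the paper's multi-output Gaussian process posterior, and the particular $g$ and posterior parameters are illustrative assumptions.

```python
import math
import random

def ei_cf(mu, sigma, g, best_so_far, n_samples=2000, seed=0):
    """Monte Carlo expected improvement for a composite objective
    f(x) = g(h(x)): the posterior on h(x) is approximated here by
    independent normals N(mu_j, sigma_j^2), a simplification of the
    paper's multi-output GP."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        h_sample = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
        total += max(0.0, g(h_sample) - best_so_far)  # improvement if any
    return total / n_samples

# Example: maximise f(x) = -||h(x)||^2, i.e. g is the negative squared norm.
g = lambda h: -sum(v * v for v in h)
ei = ei_cf(mu=[0.1, -0.2], sigma=[0.3, 0.3], g=g, best_so_far=-0.5)
```

Because $g$ is applied inside the expectation rather than to a Gaussian surrogate of $f$, the estimate exploits the composite structure, which is exactly where the paper's sampling-efficiency gains come from (its contribution is a stochastic gradient estimator that makes maximizing this quantity efficient).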

Fast Tree Variants of Gromov-Wasserstein

Title Fast Tree Variants of Gromov-Wasserstein
Authors Tam Le, Nhat Ho, Makoto Yamada
Abstract Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. GW, however, suffers from a computational drawback, since it requires solving a complex non-convex quadratic program. In this work we consider a specific family of ground metrics, namely \textit{tree metrics}, for the space of supports of each probability measure in GW. By leveraging a tree structure, we propose to use \textit{flows} from a root to each support to represent a probability measure whose supports are in a tree metric space. We consequently propose a novel tree variant of GW, namely flow-based tree GW (\FlowTGW), obtained by matching the flows of the probability measures. We then show that \FlowTGW~shares a similar structure with a univariate optimal transport distance. Therefore, \FlowTGW~is fast to compute and can scale up to large-scale applications. To further exploit tree structures, we propose another tree variant of GW, namely depth-based tree GW (\DepthTGW), obtained by aligning the flows of the probability measures hierarchically along each depth level of the tree structures. Theoretically, we prove that both \FlowTGW~and \DepthTGW~are pseudo-distances. Moreover, we also derive tree-sliced variants, computed by averaging the corresponding tree variants of GW using random tree metrics built adaptively in the spaces of supports. Finally, we test our proposed discrepancies against other baselines on some benchmark tasks.
Tasks
Published 2019-10-10
URL https://arxiv.org/abs/1910.04462v4
PDF https://arxiv.org/pdf/1910.04462v4.pdf
PWC https://paperswithcode.com/paper/computationally-efficient-tree-variants-of
Repo https://github.com/lttam/Kmeans-FlowTreeGW-Barycenter
Framework none
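The flow representation the abstract leans on has a simple concrete form: the flow through an edge is the total mass sitting in the subtree below it. The sketch below compares two measures on a shared toy tree via a weighted L1 difference of flows, which is the closed form that makes tree distances as cheap as univariate optimal transport. Note this is the flavor of tree-Wasserstein; \FlowTGW\ itself matches flows across two *different* trees (the GW setting), which this toy does not attempt.

```python
def edge_flows(tree_parent, mass):
    """Flow through each edge (identified by its child node) = total mass
    in the subtree below it. tree_parent maps node -> parent (root -> None);
    mass maps node -> probability mass at that node."""
    flow = {v: 0.0 for v in tree_parent}
    for v, m in mass.items():
        u = v
        while tree_parent[u] is not None:  # push mass up to the root
            flow[u] += m
            u = tree_parent[u]
    return flow

def flow_distance(tree_parent, edge_weight, mass_a, mass_b):
    """Weighted L1 difference of the two flow vectors: a single linear
    pass over the edges, with no quadratic program in sight."""
    fa = edge_flows(tree_parent, mass_a)
    fb = edge_flows(tree_parent, mass_b)
    return sum(edge_weight.get(v, 0.0) * abs(fa[v] - fb[v])
               for v in tree_parent)

# Tree: root r with two children a and b, unit edge weights.
parent = {"r": None, "a": "r", "b": "r"}
w = {"a": 1.0, "b": 1.0}
d = flow_distance(parent, w, {"a": 1.0}, {"b": 1.0})
```

Moving all mass from leaf a to leaf b costs the full path length 2, matching the intuition that flows summarize where mass must travel.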

CANet: Cross-disease Attention Network for Joint Diabetic Retinopathy and Diabetic Macular Edema Grading

Title CANet: Cross-disease Attention Network for Joint Diabetic Retinopathy and Diabetic Macular Edema Grading
Authors Xiaomeng Li, Xiaowei Hu, Lequan Yu, Lei Zhu, Chi-Wing Fu, Pheng-Ann Heng
Abstract Diabetic retinopathy (DR) and diabetic macular edema (DME) are the leading causes of permanent blindness in the working-age population. Automatic grading of DR and DME helps ophthalmologists design tailored treatments for patients and is thus of vital importance in clinical practice. However, prior works either grade DR or DME and ignore the correlation between DR and its complication, i.e., DME. Moreover, location information, e.g., macula and soft/hard exudate annotations, is widely used as a prior for grading. Such annotations are costly to obtain, hence it is desirable to develop automatic grading methods with only image-level supervision. In this paper, we present a novel cross-disease attention network (CANet) to jointly grade DR and DME by exploring the internal relationship between the diseases with only image-level supervision. Our key contributions include the disease-specific attention module to selectively learn useful features for individual diseases, and the disease-dependent attention module to further capture the internal relationship between the two diseases. We integrate these two attention modules in a deep network to produce disease-specific and disease-dependent features, and to maximize the overall performance jointly for grading DR and DME. We evaluate our network on two public benchmark datasets, i.e., the ISBI 2018 IDRiD challenge dataset and the Messidor dataset. Our method achieves the best result on the ISBI 2018 IDRiD challenge dataset and outperforms other methods on the Messidor dataset. Our code is publicly available at https://github.com/xmengli999/CANet.
Tasks
Published 2019-11-04
URL https://arxiv.org/abs/1911.01376v1
PDF https://arxiv.org/pdf/1911.01376v1.pdf
PWC https://paperswithcode.com/paper/canet-cross-disease-attention-network-for
Repo https://github.com/xmengli999/CANet
Framework pytorch

Reinforcement learning with a network of spiking agents

Title Reinforcement learning with a network of spiking agents
Authors Sneha Aenugu, Abhishek Sharma, Sasikiran Yelamarthi, Hananel Hazan, Philip S. Thomas, Robert Kozma
Abstract Neuroscientific theory suggests that dopaminergic neurons broadcast global reward prediction errors to large areas of the brain influencing the synaptic plasticity of the neurons in those regions. We build on this theory to propose a multi-agent learning framework with spiking neurons in the generalized linear model (GLM) formulation as agents, to solve reinforcement learning (RL) tasks. We show that a network of GLM spiking agents connected in a hierarchical fashion, where each spiking agent modulates its firing policy based on local information and a global prediction error, can learn complex action representations to solve RL tasks. We further show how leveraging principles of modularity and population coding inspired from the brain can help reduce variance in the learning updates making it a viable optimization technique.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06489v3
PDF https://arxiv.org/pdf/1910.06489v3.pdf
PWC https://paperswithcode.com/paper/reinforcement-learning-with-spiking-coagents
Repo https://github.com/asneha213/spiking-agent-RL
Framework none
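The core mechanism here, every agent updating from the same globally broadcast prediction error times its own local gradient, can be sketched with Bernoulli "spiking" agents standing in for the paper's GLM neurons. The cooperative toy task (reward grows with the fraction of agents firing), the learning rate, and the baseline update are all illustrative assumptions, not the paper's setup.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_coagents(n_agents=5, episodes=400, lr=0.5, seed=0):
    """Toy network of Bernoulli spiking agents. Each agent fires with
    probability sigmoid(w_i); every agent then updates its own weight from
    the SAME global prediction error, times its local log-likelihood
    gradient (a REINFORCE-style coagent update)."""
    rng = random.Random(seed)
    w = [0.0] * n_agents
    baseline = 0.0
    for _ in range(episodes):
        spikes = [1 if rng.random() < sigmoid(wi) else 0 for wi in w]
        reward = sum(spikes) / n_agents       # toy task: fire together
        error = reward - baseline             # globally broadcast error
        baseline += 0.05 * error              # running reward estimate
        for i in range(n_agents):
            grad = spikes[i] - sigmoid(w[i])  # local GLM gradient
            w[i] += lr * error * grad         # local info x global error
        # (The baseline plays the variance-reduction role the paper
        #  attributes to modularity and population coding.)
    return w

weights = train_coagents()
```

No agent ever sees another agent's weights; coordination emerges purely from the shared scalar error signal, which mirrors the dopaminergic-broadcast picture in the abstract.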

Trained Rank Pruning for Efficient Deep Neural Networks

Title Trained Rank Pruning for Efficient Deep Neural Networks
Authors Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Wenrui Dai, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong
Abstract To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in parameters can ripple into a large prediction loss. It is therefore not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularization optimized by stochastic sub-gradient descent is utilized to further promote low rank in TRP. Networks trained with TRP have an inherently low-rank structure and can be approximated with negligible performance loss, thus eliminating the need for fine-tuning after low-rank approximation. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression counterparts using low-rank approximation. Our code is available at: https://github.com/yuhuixu1993/Trained-Rank-Pruning.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04576v4
PDF https://arxiv.org/pdf/1910.04576v4.pdf
PWC https://paperswithcode.com/paper/traned-rank-pruning-for-efficient-deep-neural
Repo https://github.com/yuhuixu1993/Trained-Rank-Pruning
Framework pytorch
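The approximation step TRP alternates with training can be shown in miniature. The sketch below computes a best rank-1 approximation of a weight matrix via power iteration, a pure-Python stand-in for the truncated SVD used in practice; the training step, rank selection, and nuclear-norm regularizer that complete the TRP loop are omitted.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def rank1_approx(A, iters=100):
    """Dominant singular pair of A via power iteration on A^T A,
    yielding the best rank-1 approximation u v^T of A."""
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iters):
        v = matvec(At, matvec(A, v))
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    u = matvec(A, v)  # u = A v carries the singular value, since ||v|| = 1
    return [[ui * vj for vj in v] for ui in u]

# This weight matrix is already rank-1, so truncation loses nothing:
# the idealized state TRP drives the network's layers toward.
A = [[2.0, 4.0], [1.0, 2.0]]
A1 = rank1_approx(A)
```

When a trained layer is genuinely low-rank, as TRP's regularizer encourages, this truncation costs essentially no accuracy, which is why the paper can skip post-hoc fine-tuning.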

SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble

Title SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble
Authors Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han
Abstract Corpus-based set expansion (i.e., finding the “complete” set of entities belonging to the same semantic class, based on a given corpus and a tiny set of seeds) is a critical task in knowledge discovery. It may facilitate numerous downstream applications, such as information extraction, taxonomy induction, question answering, and web search. To discover new entities in an expanded set, previous approaches either make one-time entity ranking based on distributional similarity, or resort to iterative pattern-based bootstrapping. The core challenge for these methods is how to deal with noisy context features derived from free-text corpora, which may lead to entity intrusion and semantic drifting. In this study, we propose a novel framework, SetExpan, which tackles this problem, with two techniques: (1) a context feature selection method that selects clean context features for calculating entity-entity distributional similarity, and (2) a ranking-based unsupervised ensemble method for expanding entity set based on denoised context features. Experiments on three datasets show that SetExpan is robust and outperforms previous state-of-the-art methods in terms of mean average precision.
Tasks Feature Selection, Question Answering
Published 2019-10-17
URL https://arxiv.org/abs/1910.08192v1
PDF https://arxiv.org/pdf/1910.08192v1.pdf
PWC https://paperswithcode.com/paper/setexpan-corpus-based-set-expansion-via
Repo https://github.com/mickeystroller/SetExpan
Framework none
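The rank-ensemble idea in technique (2) can be sketched directly: each ranking scores candidate entities under one subset of context features, and candidates are combined by mean reciprocal rank, which damps entities that only noisy feature subsets promote. The entities and rankings below are illustrative, and real SetExpan also iterates this with the context-feature-selection step.

```python
from collections import defaultdict

def ensemble_expand(seed_set, rankings, top_k=2):
    """Rank-ensemble step of corpus-based set expansion: sum reciprocal
    ranks of each non-seed candidate across per-feature-subset rankings,
    then return the top-scoring candidates."""
    scores = defaultdict(float)
    for ranking in rankings:
        for pos, entity in enumerate(ranking, start=1):
            if entity not in seed_set:
                scores[entity] += 1.0 / pos
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Expanding a set of US states from three noisy rankings.
seeds = {"Illinois", "Ohio"}
rankings = [
    ["Illinois", "Texas", "Iowa", "Google"],
    ["Ohio", "Iowa", "Texas", "Google"],
    ["Texas", "Iowa", "Chicago", "Google"],
]
new_entities = ensemble_expand(seeds, rankings)
```

"Google" and "Chicago" are each promoted by some rankings but never consistently, so averaging ranks filters them out: the ensemble's defense against entity intrusion and semantic drift.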

Attack-Resistant Federated Learning with Residual-based Reweighting

Title Attack-Resistant Federated Learning with Residual-based Reweighting
Authors Shuhao Fu, Chulin Xie, Bo Li, Qifeng Chen
Abstract Federated learning has a variety of applications in multiple domains by utilizing private training data stored on different devices. However, the aggregation process in federated learning is highly vulnerable to adversarial attacks, so the global model may behave abnormally under attack. To tackle this challenge, we present a novel aggregation algorithm with residual-based reweighting to defend federated learning. Our aggregation algorithm combines repeated median regression with the reweighting scheme of iteratively reweighted least squares. Our experiments show that our aggregation algorithm outperforms alternative algorithms in the presence of label-flipping, backdoor, and Gaussian noise attacks. We also provide theoretical guarantees for our aggregation algorithm.
Tasks
Published 2019-12-24
URL https://arxiv.org/abs/1912.11464v1
PDF https://arxiv.org/pdf/1912.11464v1.pdf
PWC https://paperswithcode.com/paper/attack-resistant-federated-learning-with-1
Repo https://github.com/howardmumu/Attack-Resistant-Federated-Learning
Framework pytorch
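The residual-based reweighting can be sketched for a single scalar parameter: measure each client's residual against a robust center, convert residuals into IRLS-style weights that shrink outliers, and average. The paper applies repeated median *regression* across parameters; this toy keeps only the reweighting idea, and the clip threshold is an illustrative choice.

```python
import statistics

def reweighted_aggregate(updates, clip=1.5):
    """Residual-based reweighting for one scalar parameter: standardise
    each client's residual to the median (via the median absolute
    deviation), downweight clients beyond the clip threshold, then take
    the weighted average."""
    med = statistics.median(updates)
    mad = statistics.median(abs(u - med) for u in updates) or 1.0
    weights = []
    for u in updates:
        r = abs(u - med) / mad               # standardised residual
        weights.append(1.0 if r <= clip else clip / r)
    return sum(w * u for w, u in zip(weights, updates)) / sum(weights)

# Nine honest clients near 1.0 and one attacker sending a huge update.
agg = reweighted_aggregate(
    [1.0, 0.9, 1.1, 1.0, 0.95, 1.05, 1.0, 0.9, 1.1, 50.0])
```

A plain mean of these updates would be pulled to about 5.9 by the single attacker; the reweighted aggregate stays near 1.0, illustrating the robustness the paper demonstrates against label-flipping, backdoor, and noise attacks.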

Causal Discovery with Cascade Nonlinear Additive Noise Models

Title Causal Discovery with Cascade Nonlinear Additive Noise Models
Authors Ruichu Cai, Jie Qiao, Kun Zhang, Zhenjie Zhang, Zhifeng Hao
Abstract Identification of causal direction between a causal-effect pair from observed data has recently attracted much attention. Various methods based on functional causal models have been proposed to solve this problem, by assuming the causal process satisfies some (structural) constraints and showing that the reverse direction violates such constraints. The nonlinear additive noise model has been demonstrated to be effective for this purpose, but the model class is not transitive: even if each direct causal relation follows this model, indirect causal influences, which result from omitted intermediate causal variables and are frequently encountered in practice, do not necessarily follow the model constraints; as a consequence, the nonlinear additive noise model may fail to correctly discover causal direction. In this work, we propose a cascade nonlinear additive noise model to represent such causal influences: each direct causal relation follows the nonlinear additive noise model, but we observe only the initial cause and final effect. We further propose a method to estimate the model, including the unmeasured intermediate variables, from data, under the variational auto-encoder framework. Our theoretical results show that with our model, causal direction is identifiable under suitable technical conditions on the data generation process. Simulation results illustrate the power of the proposed method in identifying indirect causal relations across various settings, and experimental results on real data suggest that the proposed model and method greatly extend the applicability of causal discovery based on functional causal models in nonlinear cases.
Tasks Causal Discovery
Published 2019-05-23
URL https://arxiv.org/abs/1905.09442v2
PDF https://arxiv.org/pdf/1905.09442v2.pdf
PWC https://paperswithcode.com/paper/causal-discovery-with-cascade-nonlinear
Repo https://github.com/DMIRLAB-Group/CANM
Framework pytorch

Multi-source weak supervision for saliency detection

Title Multi-source weak supervision for saliency detection
Authors Yu Zeng, Yunzhi Zhuge, Huchuan Lu, Lihe Zhang, Mingyang Qian, Yizhou Yu
Abstract The high cost of pixel-level annotations makes it appealing to train saliency detection models with weak supervision. However, a single weak supervision source usually does not contain enough information to train a well-performing model. To this end, we propose a unified framework to train saliency detection models with diverse weak supervision sources. In this paper, we use category labels, captions, and unlabelled data for training, yet other supervision sources can also be plugged into this flexible framework. We design a classification network (CNet) and a caption generation network (PNet), which learn to predict object categories and generate captions, respectively, while highlighting the most important regions for the corresponding tasks. An attention transfer loss is designed to transmit supervision signal between networks, such that the network designed to be trained with one supervision source can benefit from another. An attention coherence loss is defined on unlabelled data to encourage the networks to detect generally salient regions instead of task-specific regions. We use CNet and PNet to generate pixel-level pseudo labels to train a saliency prediction network (SNet). During the testing phase, we only need SNet to predict saliency maps. Experiments demonstrate that the performance of our method compares favourably with unsupervised and weakly supervised methods and even some supervised methods.
Tasks Saliency Detection, Saliency Prediction
Published 2019-04-01
URL http://arxiv.org/abs/1904.00566v1
PDF http://arxiv.org/pdf/1904.00566v1.pdf
PWC https://paperswithcode.com/paper/multi-source-weak-supervision-for-saliency
Repo https://github.com/zengxianyu/mws
Framework pytorch

Jet grooming through reinforcement learning

Title Jet grooming through reinforcement learning
Authors Stefano Carrazza, Frédéric A. Dreyer
Abstract We introduce a novel implementation of a reinforcement learning (RL) algorithm which is designed to find an optimal jet grooming strategy, a critical tool for collider experiments. The RL agent is trained with a reward function constructed to optimize the resulting jet properties, using both signal and background samples in a simultaneous multi-level training. We show that the grooming algorithm derived from the deep RL agent can match state-of-the-art techniques used at the Large Hadron Collider, resulting in improved mass resolution for boosted objects. Given a suitable reward function, the agent learns how to train a policy which optimally removes soft wide-angle radiation, allowing for a modular grooming technique that can be applied in a wide range of contexts. These results are accessible through the corresponding GroomRL framework.
Tasks
Published 2019-03-22
URL https://arxiv.org/abs/1903.09644v2
PDF https://arxiv.org/pdf/1903.09644v2.pdf
PWC https://paperswithcode.com/paper/jet-grooming-through-reinforcement-learning
Repo https://github.com/JetsGame/GroomRL
Framework tf

Learned reconstructions for practical mask-based lensless imaging

Title Learned reconstructions for practical mask-based lensless imaging
Authors Kristina Monakhova, Joshua Yurtsever, Grace Kuo, Nick Antipa, Kyrollos Yanny, Laura Waller
Abstract Mask-based lensless imagers are smaller and lighter than traditional lensed cameras. In these imagers, the sensor does not directly record an image of the scene; rather, a computational algorithm reconstructs it. Typically, mask-based lensless imagers use a model-based reconstruction approach that suffers from long compute times and a heavy reliance on both system calibration and heuristically chosen denoisers. In this work, we address these limitations using a bounded-compute, trainable neural network to reconstruct the image. We leverage our knowledge of the physical system by unrolling a traditional model-based optimization algorithm, whose parameters we optimize using experimentally gathered ground-truth data. Optionally, images produced by the unrolled network are then fed into a jointly-trained denoiser. As compared to traditional methods, our architecture achieves better perceptual image quality and runs 20x faster, enabling interactive previewing of the scene. We explore a spectrum between model-based and deep learning methods, showing the benefits of using an intermediate approach. Finally, we test our network on images taken in the wild with a prototype mask-based camera, demonstrating that our network generalizes to natural images.
Tasks Calibration
Published 2019-08-30
URL https://arxiv.org/abs/1908.11502v1
PDF https://arxiv.org/pdf/1908.11502v1.pdf
PWC https://paperswithcode.com/paper/learned-reconstructions-for-practical-mask
Repo https://github.com/Waller-Lab/LenslessLearning
Framework pytorch
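Unrolling a model-based algorithm into a network has a compact core: each "layer" is one iteration of a classical solver with its own learnable parameters. The sketch below unrolls gradient descent for a linear inverse problem y = A x, with a scalar A standing in for the lensless-imaging forward model; the step sizes are fixed placeholders, whereas the paper learns them (together with denoiser parameters) from experimentally gathered ground-truth pairs.

```python
def unrolled_reconstruction(y, forward, adjoint, step_sizes, x0=0.0):
    """Unrolled gradient descent on 0.5 * (forward(x) - y)^2: one loop
    iteration = one network layer, each with its own step size. Training
    would backpropagate through this loop to fit the step sizes."""
    x = x0
    for t in step_sizes:
        residual = forward(x) - y
        x = x - t * adjoint(residual)  # gradient step through the physics
    return x

A = 2.0  # toy forward model: measurement y = A * x
x_hat = unrolled_reconstruction(
    y=4.0,
    forward=lambda x: A * x,
    adjoint=lambda r: A * r,
    step_sizes=[0.2, 0.2, 0.2, 0.2, 0.2],  # five "layers"
)
```

Because the physics (forward and adjoint) is baked into every layer, the network needs only a handful of iterations to approach the solution x = 2, which is the bounded-compute property that gives the 20x speedup over running the model-based solver to convergence.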

MUSCO: Multi-Stage Compression of neural networks

Title MUSCO: Multi-Stage Compression of neural networks
Authors Julia Gusak, Maksym Kholiavchenko, Evgeny Ponomarev, Larisa Markeeva, Ivan Oseledets, Andrzej Cichocki
Abstract Low-rank tensor approximation is very promising for the compression of deep neural networks. We propose a new simple and efficient iterative approach, which alternates low-rank factorization with smart rank selection and fine-tuning. We demonstrate the efficiency of our method compared to non-iterative ones. Our approach improves the compression rate while maintaining accuracy for a variety of tasks.
Tasks Neural Network Compression
Published 2019-03-24
URL https://arxiv.org/abs/1903.09973v4
PDF https://arxiv.org/pdf/1903.09973v4.pdf
PWC https://paperswithcode.com/paper/one-time-is-not-enough-iterative-tensor
Repo https://github.com/juliagusak/musco
Framework pytorch

Self-supervised Domain Adaptation for Computer Vision Tasks

Title Self-supervised Domain Adaptation for Computer Vision Tasks
Authors Jiaolong Xu, Liang Xiao, Antonio M. Lopez
Abstract Recent progress in self-supervised visual representation learning has achieved remarkable success on many challenging computer vision benchmarks. However, whether these techniques can be used for domain adaptation has not been explored. In this work, we propose a generic method for self-supervised domain adaptation, using object recognition and semantic segmentation of urban scenes as use cases. Focusing on simple pretext/auxiliary tasks (e.g. image rotation prediction), we assess different learning strategies to improve domain adaptation effectiveness through self-supervision. Additionally, we propose two complementary strategies to further boost the domain adaptation accuracy on semantic segmentation within our method, consisting of prediction-layer alignment and batch normalization calibration. The experimental results show adaptation levels comparable to the most studied domain adaptation methods, thus establishing self-supervision as a new alternative for achieving domain adaptation. The code is available at https://github.com/Jiaolong/self-supervised-da.
Tasks Calibration, Domain Adaptation, Object Recognition, Representation Learning, Semantic Segmentation
Published 2019-07-25
URL https://arxiv.org/abs/1907.10915v3
PDF https://arxiv.org/pdf/1907.10915v3.pdf
PWC https://paperswithcode.com/paper/self-supervised-domain-adaptation-for
Repo https://github.com/Jiaolong/self-supervised-da
Framework pytorch
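The rotation-prediction pretext task the abstract cites needs no labels at all: rotate each image by 0, 90, 180, or 270 degrees and ask the network to predict which rotation was applied. The sketch below generates such pretext data for a tiny list-of-lists "image"; the network that consumes these pairs (and how its features transfer across domains) is the paper's contribution and is not modelled here.

```python
def rotate90(img):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def make_rotation_batch(img):
    """Self-supervised pretext data: the four rotations of one image,
    each labelled by its rotation class (0..3). The labels come for free
    from the transformation itself, so both source- and target-domain
    images can supply training signal."""
    batch = []
    current = img
    for label in range(4):
        batch.append((current, label))
        current = rotate90(current)
    return batch

batch = make_rotation_batch([[1, 2], [3, 4]])
```

Since target-domain images can be fed through this pretext task without any annotation, the auxiliary loss pulls source and target features together, which is the mechanism behind the adaptation gains reported above.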