January 28, 2020

Paper Group ANR 1055

Semi-Supervised Brain Lesion Segmentation with an Adapted Mean Teacher Model


Title	Semi-Supervised Brain Lesion Segmentation with an Adapted Mean Teacher Model
Authors	Wenhui Cui, Yanlin Liu, Yuxing Li, Menghao Guo, Yiming Li, Xiuli Li, Tianle Wang, Xiangzhu Zeng, Chuyang Ye
Abstract	Automated brain lesion segmentation provides valuable information for the analysis and intervention of patients. In particular, methods based on convolutional neural networks (CNNs) have achieved state-of-the-art segmentation performance. However, CNNs usually require a decent amount of annotated data, which may be costly and time-consuming to obtain. Since unannotated data is generally abundant, it is desirable to use unannotated data to improve the segmentation performance for CNNs when limited annotated data is available. In this work, we propose a semi-supervised learning (SSL) approach to brain lesion segmentation, where unannotated data is incorporated into the training of CNNs. We adapt the mean teacher model, which is originally developed for SSL-based image classification, for brain lesion segmentation. Assuming that the network should produce consistent outputs for similar inputs, a loss of segmentation consistency is designed and integrated into a self-ensembling framework. Specifically, we build a student model and a teacher model, which share the same CNN architecture for segmentation. The student and teacher models are updated alternately. At each step, the student model learns from the teacher model by minimizing the weighted sum of the segmentation loss computed from annotated data and the segmentation consistency loss between the teacher and student models computed from unannotated data. Then, the teacher model is updated by combining the updated student model with the historical information of teacher models using an exponential moving average strategy. For demonstration, the proposed approach was evaluated on ischemic stroke lesion segmentation, where it improves stroke lesion segmentation with the incorporation of unannotated data.
Tasks	Image Classification, Ischemic Stroke Lesion Segmentation, Lesion Segmentation
Published	2019-03-04
URL	http://arxiv.org/abs/1903.01248v1
PDF	http://arxiv.org/pdf/1903.01248v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-brain-lesion-segmentation
Repo
Framework
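
The alternating update described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: weights are plain lists of floats, the function names are hypothetical, and the mean squared difference used as the consistency term is an assumption (the abstract does not say which consistency measure is used).

```python
def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Teacher update: blend the previous teacher with the updated student.

    alpha is the exponential moving average decay (assumed value).
    """
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions
    on unannotated data (assumed form of the consistency term)."""
    return sum((s - t) ** 2
               for s, t in zip(student_probs, teacher_probs)) / len(student_probs)


def student_loss(segmentation_loss, student_probs, teacher_probs, weight):
    """Weighted sum of the supervised segmentation loss (annotated data)
    and the consistency loss (unannotated data)."""
    return segmentation_loss + weight * consistency_loss(student_probs, teacher_probs)


if __name__ == "__main__":
    teacher = ema_update([1.0, 0.0], [0.0, 1.0], alpha=0.9)
    print(teacher)
    print(student_loss(0.5, [0.2, 0.8], [0.4, 0.6], weight=2.0))
```

In a real training loop the student would take a gradient step on `student_loss` first, and `ema_update` would then be applied to every teacher parameter.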

Deep Q-Network for Angry Birds


Title	Deep Q-Network for Angry Birds
Authors	Ekaterina Nikonova, Jakub Gemrot
Abstract	Angry Birds is a popular video game in which the player is provided with a sequence of birds to shoot from a slingshot. The task of the game is to destroy all green pigs with maximum possible score. Angry Birds appears to be a difficult task to solve for artificially intelligent agents due to the sequential decision-making, non-deterministic game environment, enormous state and action spaces and requirement to differentiate between multiple birds, their abilities and optimum tapping times. We describe the application of Deep Reinforcement learning by implementing Double Dueling Deep Q-network to play Angry Birds game. One of our main goals was to build an agent that is able to compete with previous participants and humans on the first 21 levels. In order to do so, we have collected a dataset of game frames that we used to train our agent on. We present different approaches and settings for DQN agent. We evaluate our agent using results of the previous participants of AIBirds competition, results of volunteer human players and present the results of AIBirds 2018 competition.
Tasks	Decision Making
Published	2019-10-04
URL	https://arxiv.org/abs/1910.01806v2
PDF	https://arxiv.org/pdf/1910.01806v2.pdf
PWC	https://paperswithcode.com/paper/deep-q-network-for-angry-birds
Repo
Framework
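
For readers unfamiliar with the Double Dueling DQN named in the abstract, the two ingredients can be sketched in a few lines. This is a generic illustration of the standard formulations, not the authors' agent; the function names and the toy numbers are hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(reward, gamma, online_q_next, target_q_next, done):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(online_q_next)), key=lambda a: online_q_next[a])
    return reward + gamma * target_q_next[best_action]


if __name__ == "__main__":
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
    print(double_dqn_target(1.0, 0.5, [0.1, 0.9], [4.0, 2.0], done=False))
```

Separating action selection from action evaluation is what distinguishes the Double DQN target from the plain DQN target, which takes the maximum of the target network's own estimates.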

A survey on intrinsic motivation in reinforcement learning


Title	A survey on intrinsic motivation in reinforcement learning
Authors	Arthur Aubret, Laetitia Matignon, Salima Hassas
Abstract	The reinforcement learning (RL) research area is very active, with an important number of new contributions; especially considering the emergent field of deep RL (DRL). However, a number of scientific and technical challenges still need to be addressed, amongst which we can mention the ability to abstract actions or the difficulty to explore the environment which can be addressed by intrinsic motivation (IM). In this article, we provide a survey on the role of intrinsic motivation in DRL. We categorize the different kinds of intrinsic motivations and detail for each category, its advantages and limitations with respect to the mentioned challenges. Additionally, we conduct an in-depth investigation of substantial current research questions, that are currently under study or not addressed at all in the considered research area of DRL. We choose to survey these research works, from the perspective of learning how to achieve tasks. We suggest then, that solving current challenges could lead to a larger developmental architecture which may tackle most of the tasks. We describe this developmental architecture on the basis of several building blocks composed of a RL algorithm and an IM module compressing information.
Tasks
Published	2019-08-19
URL	https://arxiv.org/abs/1908.06976v2
PDF	https://arxiv.org/pdf/1908.06976v2.pdf
PWC	https://paperswithcode.com/paper/a-survey-on-intrinsic-motivation-in
Repo
Framework

Recommending investors for new startups by integrating network diffusion and investors’ domain preference


Title	Recommending investors for new startups by integrating network diffusion and investors’ domain preference
Authors	Shuqi Xu, Qianming Zhang, Linyuan Lv, Manuel Sebastian Mariani
Abstract	Over the past decade, many startups have sprung up, which create a huge demand for financial support from venture investors. However, due to the information asymmetry between investors and companies, the financing process is usually challenging and time-consuming, especially for the startups that have not yet obtained any investment. Because of this, effective data-driven techniques to automatically match startups with potentially relevant investors would be highly desirable. Here, we analyze 34,469 valid investment events collected from www.itjuzi.com and consider the cold-start problem of recommending investors for new startups. We address this problem by constructing different tripartite network representations of the data where nodes represent investors, companies, and companies’ domains. First, we find that investors have strong domain preferences when investing, which motivates us to introduce virtual links between investors and investment domains in the tripartite network construction. Our analysis of the recommendation performance of diffusion-based algorithms applied to various network representations indicates that prospective investors for new startups are effectively revealed by integrating network diffusion processes with investors’ domain preference.
Tasks
Published	2019-12-06
URL	https://arxiv.org/abs/1912.02962v2
PDF	https://arxiv.org/pdf/1912.02962v2.pdf
PWC	https://paperswithcode.com/paper/recommending-investors-for-new-startups-by
Repo
Framework

Crowd Transformer Network


Title	Crowd Transformer Network
Authors	Viresh Ranjan, Mubarak Shah, Minh Hoai Nguyen
Abstract	In this paper, we tackle the problem of Crowd Counting, and present a crowd density estimation based approach for obtaining the crowd count. Most of the existing crowd counting approaches rely on local features for estimating the crowd density map. In this work, we investigate the usefulness of combining local with non-local features for crowd counting. We use convolution layers for extracting local features, and a type of self-attention mechanism for extracting non-local features. We combine the local and the non-local features, and use it for estimating crowd density map. We conduct experiments on three publicly available Crowd Counting datasets, and achieve significant improvement over the previous approaches.
Tasks	Crowd Counting, Density Estimation
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02774v1
PDF	http://arxiv.org/pdf/1904.02774v1.pdf
PWC	https://paperswithcode.com/paper/crowd-transformer-network
Repo
Framework
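
Two ideas from the abstract lend themselves to a short sketch: non-local features obtained through self-attention, and a crowd count read off a density map. The abstract only says "a type of self-attention mechanism", so the scaled dot-product form below, the absence of learned projections, and the function names are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention without learned projections.

    features: list of n feature vectors of equal length d.
    Each output vector is a weighted average of all input vectors,
    so every position can draw on non-local information.
    """
    d = len(features[0])
    outputs = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in features]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, features))
                        for j in range(d)])
    return outputs


def crowd_count(density_map):
    """The estimated count is the sum of the density map."""
    return sum(sum(row) for row in density_map)


if __name__ == "__main__":
    print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
    print(crowd_count([[0.25, 0.25], [0.5, 1.0]]))
```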

FlowSAN: Privacy-enhancing Semi-Adversarial Networks to Confound Arbitrary Face-based Gender Classifiers


Title	FlowSAN: Privacy-enhancing Semi-Adversarial Networks to Confound Arbitrary Face-based Gender Classifiers
Authors	Vahid Mirjalili, Sebastian Raschka, Arun Ross
Abstract	Privacy concerns in the modern digital age have prompted researchers to develop techniques that allow users to selectively suppress certain information in collected data while allowing for other information to be extracted. In this regard, Semi-Adversarial Networks (SAN) have recently emerged as a method for imparting soft-biometric privacy to face images. SAN enables modifications of input face images so that the resulting face images can still be reliably used by arbitrary conventional face matchers for recognition purposes, while attribute classifiers, such as gender classifiers, are confounded. However, the generalizability of SANs across arbitrary gender classifiers has remained an open concern. In this work, we propose a new method, FlowSAN, for allowing SANs to generalize to multiple unseen gender classifiers. We propose combining a diverse set of SAN models to compensate each other’s weaknesses, thereby, forming a robust model with improved generalization capability. Extensive experiments using different unseen gender classifiers and face matchers demonstrate the efficacy of the proposed paradigm in imparting gender privacy to face images.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01388v1
PDF	https://arxiv.org/pdf/1905.01388v1.pdf
PWC	https://paperswithcode.com/paper/flowsan-privacy-enhancing-semi-adversarial
Repo
Framework

pCAMP: Performance Comparison of Machine Learning Packages on the Edges


Title	pCAMP: Performance Comparison of Machine Learning Packages on the Edges
Authors	Xingzhou Zhang, Yifan Wang, Weisong Shi
Abstract	Machine learning has changed the computing paradigm. Products today are built with machine intelligence as a central attribute, and consumers are beginning to expect near-human interaction with the appliances they use. However, much of the deep learning revolution has been limited to the cloud. Recently, several machine learning packages based on edge devices have been announced which aim to offload the computing to the edges. However, little research has been done to evaluate these packages on the edges, making it difficult for end users to select an appropriate pair of software and hardware. In this paper, we make a performance comparison of several state-of-the-art machine learning packages on the edges, including TensorFlow, Caffe2, MXNet, PyTorch, and TensorFlow Lite. We focus on evaluating the latency, memory footprint, and energy of these tools with two popular types of neural networks on different edge devices. This evaluation not only provides a reference to select appropriate combinations of hardware and software packages for end users but also points out possible future directions to optimize packages for developers.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.01878v2
PDF	https://arxiv.org/pdf/1906.01878v2.pdf
PWC	https://paperswithcode.com/paper/pcamp-performance-comparison-of-machine
Repo
Framework

Random Forest with Learned Representations for Semantic Segmentation


Title	Random Forest with Learned Representations for Semantic Segmentation
Authors	Byeongkeun Kang, Truong Q. Nguyen
Abstract	In this work, we present a random forest framework that learns the weights, shapes, and sparsities of feature representations for real-time semantic segmentation. Typical filters (kernels) have predetermined shapes and sparsities and learn only weights. A few feature extraction methods fix weights and learn only shapes and sparsities. These predetermined constraints restrict learning and extracting optimal features. To overcome this limitation, we propose an unconstrained representation that is able to extract optimal features by learning weights, shapes, and sparsities. We, then, present the random forest framework that learns the flexible filters using an iterative optimization algorithm and segments input images using the learned representations. We demonstrate the effectiveness of the proposed method using a hand segmentation dataset for hand-object interaction and using two semantic segmentation datasets. The results show that the proposed method achieves real-time semantic segmentation using limited computational and memory resources.
Tasks	Hand Segmentation, Real-Time Semantic Segmentation, Semantic Segmentation
Published	2019-01-23
URL	http://arxiv.org/abs/1901.07828v1
PDF	http://arxiv.org/pdf/1901.07828v1.pdf
PWC	https://paperswithcode.com/paper/random-forest-with-learned-representations
Repo
Framework

Fine-grained Classification of Rowing teams


Title	Fine-grained Classification of Rowing teams
Authors	M. J. A. van Wezel, L. J. Hamburger, Y. Napolean
Abstract	Fine-grained classification tasks such as identifying different breeds of dog are quite challenging, as visual differences between categories are quite small and can be easily overwhelmed by external factors such as object pose, lighting, etc. This work focuses on the specific case of classifying rowing teams from various associations. Currently, the photos are taken at rowing competitions and are manually classified by a small set of members, in what is a painstaking process. To alleviate this, deep learning models can be utilised as a faster method to classify the images. Recent studies show that localising the manually defined parts, and modelling based on these parts, improves on vanilla convolution models, so this work also investigates the detection of clothing attributes. The networks were trained and tested on a partially labelled data set mainly consisting of rowers from multiple associations. This paper resulted in the classification of up to ten rowing associations by using deep learning networks: the smaller VGG network achieved 90.1% accuracy, whereas ResNet was limited to 87.20%. Adding attention to the ResNet resulted in a drop in performance, as only 78.10% was achieved.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05393v1
PDF	https://arxiv.org/pdf/1912.05393v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-classification-of-rowing-teams
Repo
Framework

A Robust Stereo Camera Localization Method with Prior LiDAR Map Constrains


Title	A Robust Stereo Camera Localization Method with Prior LiDAR Map Constrains
Authors	Dong Han, Zuhao Zou, Lujia Wang, Cheng-Zhong Xu
Abstract	In complex environments, low-cost and robust localization is a challenging problem. For example, in a GPS-denied environment, LiDAR can provide accurate position information, but the cost is high. In general, visual SLAM based localization methods become unreliable when the sunlight changes greatly. Therefore, inexpensive and reliable methods are required. In this paper, we propose a stereo visual localization method based on the prior LiDAR map. Different from the conventional visual localization system, we design a novel visual optimization model by matching planar information between the LiDAR map and visual image. Bundle adjustment is built by using coplanarity constraints. To solve the optimization problem, we use a graph-based optimization algorithm and a local window optimization method. Finally, we estimate a full six degrees of freedom (DOF) pose without scale drift. To validate the efficiency, the proposed method has been tested on the KITTI dataset. The results show that our method is more robust and accurate than the state-of-the-art ORB-SLAM2.
Tasks	Camera Localization, Visual Localization
Published	2019-12-02
URL	https://arxiv.org/abs/1912.05023v1
PDF	https://arxiv.org/pdf/1912.05023v1.pdf
PWC	https://paperswithcode.com/paper/a-robust-stereo-camera-localization-method
Repo
Framework

ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation


Title	ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation
Authors	Shreeyak S. Sajjan, Matthew Moore, Mike Pan, Ganesh Nagaraja, Johnny Lee, Andy Zeng, Shuran Song
Abstract	Transparent objects are a common part of everyday life, yet they possess unique visual properties that make them incredibly difficult for standard 3D sensors to produce accurate depth estimates for. In many cases, they often appear as noisy or distorted approximations of the surfaces that lie behind them. To address these challenges, we present ClearGrasp – a deep learning approach for estimating accurate 3D geometry of transparent objects from a single RGB-D image for robotic manipulation. Given a single RGB-D image of transparent objects, ClearGrasp uses deep convolutional networks to infer surface normals, masks of transparent surfaces, and occlusion boundaries. It then uses these outputs to refine the initial depth estimates for all transparent surfaces in the scene. To train and test ClearGrasp, we construct a large-scale synthetic dataset of over 50,000 RGB-D images, as well as a real-world test benchmark with 286 RGB-D images of transparent objects and their ground truth geometries. The experiments demonstrate that ClearGrasp is substantially better than monocular depth estimation baselines and is capable of generalizing to real-world images and novel objects. We also demonstrate that ClearGrasp can be applied out-of-the-box to improve grasping algorithms’ performance on transparent objects. Code, data, and benchmarks will be released. Supplementary materials available on the project website: https://sites.google.com/view/cleargrasp
Tasks	Depth Estimation, Monocular Depth Estimation
Published	2019-10-06
URL	https://arxiv.org/abs/1910.02550v2
PDF	https://arxiv.org/pdf/1910.02550v2.pdf
PWC	https://paperswithcode.com/paper/cleargrasp-3d-shape-estimation-of-transparent
Repo
Framework

Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning


Title	Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
Authors	Ying Wen, Yaodong Yang, Rui Luo, Jun Wang
Abstract	Though limited in real-world decision making, most multi-agent reinforcement learning (MARL) models assume perfectly rational agents – a property hardly met due to individual’s cognitive limitation and/or the tractability of the decision problem. In this paper, we introduce generalized recursive reasoning (GR2) as a novel framework to model agents with different hierarchical levels of rationality; our framework enables agents to exhibit varying levels of “thinking” ability thereby allowing higher-level agents to best respond to various less sophisticated learners. We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. Within the GR2, we propose a practical actor-critic solver, and demonstrate its convergent property to a stationary point in two-player games through Lyapunov analysis. On the empirical side, we validate our findings on a variety of MARL benchmarks. Precisely, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements compared to state-of-the-art opponent modeling baselines on the normal-form games and the cooperative navigation benchmark.
Tasks	Decision Making, Multi-agent Reinforcement Learning
Published	2019-01-26
URL	https://arxiv.org/abs/1901.09216v2
PDF	https://arxiv.org/pdf/1901.09216v2.pdf
PWC	https://paperswithcode.com/paper/multi-agent-generalized-recursive-reasoning
Repo
Framework
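
The hierarchical thinking process on the Keynes Beauty Contest can be illustrated with classic level-k reasoning. The sketch below is not GR2 itself: it assumes the usual game in which the winning guess is closest to p times the average, a level-0 player who guesses 50, and a level-k player who best responds to a population of level-(k-1) players. The function name and default values are hypothetical.

```python
def level_k_guess(k, p=2.0 / 3.0, level0_guess=50.0):
    """Guess of a level-k player in a p-beauty contest.

    A level-k player assumes everyone else reasons at level k-1,
    so the best response is p times the level-(k-1) guess.
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess
    return guess


if __name__ == "__main__":
    for level in range(4):
        print(level, level_k_guess(level))
```

As k grows the guess shrinks towards 0, the Nash equilibrium of the game, which is why deeper reasoners pick smaller numbers.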

AdaCliP: Adaptive Clipping for Private SGD


Title	AdaCliP: Adaptive Clipping for Private SGD
Authors	Venkatadheeraj Pichapati, Ananda Theertha Suresh, Felix X. Yu, Sashank J. Reddi, Sanjiv Kumar
Abstract	Privacy preserving machine learning algorithms are crucial for learning models over user data to protect sensitive information. Motivated by this, differentially private stochastic gradient descent (SGD) algorithms for training machine learning models have been proposed. At each step, these algorithms modify the gradients and add noise proportional to the sensitivity of the modified gradients. Under this framework, we propose AdaCliP, a theoretically motivated differentially private SGD algorithm that provably adds less noise compared to the previous methods, by using coordinate-wise adaptive clipping of the gradient. We empirically demonstrate that AdaCliP reduces the amount of added noise and produces models with better accuracy.
Tasks
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07643v2
PDF	https://arxiv.org/pdf/1908.07643v2.pdf
PWC	https://paperswithcode.com/paper/190807643
Repo
Framework
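
The generic differentially private SGD step that the abstract builds on (modify the gradients, then add noise proportional to their sensitivity) can be sketched as below. This shows only the common baseline with a single global L2 clipping norm, not AdaCliP's coordinate-wise adaptive clipping; the function names and parameters are hypothetical, and no privacy accounting is performed.

```python
import math
import random


def clip_l2(gradient, clip_norm):
    """Scale the gradient down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in gradient]


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, average.

    The noise standard deviation is noise_multiplier * clip_norm, i.e.
    proportional to the sensitivity of the clipped gradient sum.
    """
    clipped = [clip_l2(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    noisy = [s + rng.gauss(0.0, noise_multiplier * clip_norm) for s in summed]
    return [x / len(clipped) for x in noisy]


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.0, 0.5]]
    print(private_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.1,
                                rng=random.Random(0)))
```

A smaller clipping norm lowers the noise that must be added but discards more of the gradient, which is the trade-off an adaptive clipping scheme tries to improve.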

Weighted Multisource Tradaboost


Title	Weighted Multisource Tradaboost
Authors	João Antunes, Alexandre Bernardino, Asim Smailagic, Daniel Siewiorek
Abstract	In this paper we propose an improved method for transfer learning that takes into account the balance between target and source data. This method builds on the state-of-the-art Multisource Tradaboost, but weighs the importance of each datapoint taking into account the amount of target and source data available. A comparative study is then presented exposing the performance of four transfer learning methods as well as the proposed Weighted Multisource Tradaboost. The experimental results show that the proposed method is able to outperform the base method as the number of target samples increases. These results are promising in the sense that source-target ratio weighing may be a path to improve current methods of transfer learning. However, against the asymptotic conjecture, all transfer learning methods tested in this work get outperformed by a no-transfer SVM for a large number of target samples.
Tasks	Transfer Learning
Published	2019-03-26
URL	http://arxiv.org/abs/1903.11158v1
PDF	http://arxiv.org/pdf/1903.11158v1.pdf
PWC	https://paperswithcode.com/paper/weighted-multisource-tradaboost
Repo
Framework

Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data


Title	Elastic-InfoGAN: Unsupervised Disentangled Representation Learning in Imbalanced Data
Authors	Utkarsh Ojha, Krishna Kumar Singh, Cho-Jui Hsieh, Yong Jae Lee
Abstract	We propose a novel unsupervised generative model, Elastic-InfoGAN, that learns to disentangle object identity from other low-level aspects in class-imbalanced datasets. We first investigate the issues surrounding the assumptions about uniformity made by InfoGAN, and demonstrate its ineffectiveness to properly disentangle object identity in imbalanced data. Our key idea is to make the discovery of the discrete latent factor of variation invariant to identity-preserving transformations in real images, and use that as the signal to learn the latent distribution’s parameters. Experiments on both artificial (MNIST) and real-world (YouTube-Faces) datasets demonstrate the effectiveness of our approach in imbalanced data by: (i) better disentanglement of object identity as a latent factor of variation; and (ii) better approximation of class imbalance in the data, as reflected in the learned parameters of the latent distribution.
Tasks	Representation Learning
Published	2019-10-01
URL	https://arxiv.org/abs/1910.01112v1
PDF	https://arxiv.org/pdf/1910.01112v1.pdf
PWC	https://paperswithcode.com/paper/elastic-infogan-unsupervised-disentangled
Repo
Framework