April 1, 2020


Paper Group ANR 418


Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations


Title	Predicting Regression Probability Distributions with Imperfect Data Through Optimal Transformations
Authors	Jerome H. Friedman
Abstract	The goal of regression analysis is to predict the value of a numeric outcome variable y given a vector of joint values of other (predictor) variables x. Usually a particular x-vector does not specify a repeatable value for y, but rather a probability distribution of possible y-values, p(y | x). This distribution has a location, scale and shape, all of which can depend on x, and are needed to infer likely values for y given x. Regression methods usually assume that training data y-values are perfect numeric realizations from some well-behaved p(y | x). Often actual training data y-values are discrete, truncated and/or arbitrarily censored. Regression procedures based on an optimal transformation strategy are presented for estimating location, scale and shape of p(y | x) as general functions of x, in the possible presence of such imperfect training data. In addition, validation diagnostics are presented to ascertain the quality of the solutions.
Tasks
Published	2020-01-27
URL	https://arxiv.org/abs/2001.10102v1
PDF	https://arxiv.org/pdf/2001.10102v1.pdf
PWC	https://paperswithcode.com/paper/predicting-regression-probability
Repo
Framework

Generating Object Stamps


Title	Generating Object Stamps
Authors	Youssef Alami Mejjati, Zejiang Shen, Michael Snower, Aaron Gokaslan, Oliver Wang, James Tompkin, Kwang In Kim
Abstract	We present an algorithm to generate diverse foreground objects and composite them into background images using a GAN architecture. Given an object class, a user-provided bounding box, and a background image, we first use a mask generator to create an object shape, and then use a texture generator to fill the mask such that the texture integrates with the background. By separating the problem of object insertion into these two stages, we show that our model allows us to improve the realism of diverse object generation that also agrees with the provided background image. Our results on the challenging COCO dataset show improved overall quality and diversity compared to state-of-the-art object insertion approaches.
Tasks
Published	2020-01-01
URL	https://arxiv.org/abs/2001.02595v2
PDF	https://arxiv.org/pdf/2001.02595v2.pdf
PWC	https://paperswithcode.com/paper/generating-object-stamps
Repo
Framework

Reinforcement Learning with Goal-Distance Gradient


Title	Reinforcement Learning with Goal-Distance Gradient
Authors	Kai Jiang, XiaoLong Qin
Abstract	Reinforcement learning usually uses feedback rewards from the environment to train agents. But rewards in real environments are sparse, and some environments provide no rewards at all. Most current methods struggle to perform well in sparse-reward or reward-free environments. Although using shaped rewards is effective when solving sparse reward tasks, it is limited to specific problems and learning is also susceptible to local optima. We propose a model-free method that does not rely on environmental rewards to solve the problem of sparse rewards in general environments. Our method uses the minimum number of transitions between states as a distance that replaces the environmental rewards, and proposes a goal-distance gradient to achieve policy improvement. We also introduce a bridge point planning method based on the characteristics of our method to improve exploration efficiency, thereby solving more complex tasks. Experiments show that our method performs better than previous work on sparse-reward and local-optimum problems in complex environments.
Tasks
Published	2020-01-01
URL	https://arxiv.org/abs/2001.00127v2
PDF	https://arxiv.org/pdf/2001.00127v2.pdf
PWC	https://paperswithcode.com/paper/reinforcement-learning-with-goal-distance
Repo
Framework
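
The distance named in the abstract, the minimum number of transitions between two states, can be illustrated on a small deterministic grid world. The sketch below is not the authors' method, which uses this distance inside a goal-distance gradient for policy improvement; it only computes the exact quantity with breadth-first search. The grid layout and the names `GRID`, `neighbors`, and `goal_distance` are hypothetical, chosen for illustration.

```python
from collections import deque

# Hypothetical 4x4 grid world: '.' is a free cell, '#' is a wall.
GRID = [
    "....",
    ".##.",
    "...#",
    "#...",
]

def neighbors(state):
    """Yield the states reachable in one transition (up, down, left, right)."""
    r, c = state
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == ".":
            yield (nr, nc)

def goal_distance(start, goal):
    """Minimum number of transitions from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        for nxt in neighbors(state):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None
```

On this grid, cells (1, 3) and (2, 2) are two steps apart in Manhattan distance, but the walls force a detour, so the transition count is larger. That gap is the reason a transition-based distance can be more informative than a geometric one.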

Applying Gene Expression Programming for Solving One-Dimensional Bin-Packing Problems


Title	Applying Gene Expression Programming for Solving One-Dimensional Bin-Packing Problems
Authors	Najla Akram Al-Saati
Abstract	This work aims to study and explore the use of Gene Expression Programming (GEP) in solving the on-line Bin-Packing problem. The main idea is to show how GEP can automatically find acceptable heuristic rules to solve the problem efficiently and economically. The one-dimensional Bin-Packing problem is considered in the course of this work with the constraint of minimizing the number of bins filled with the given pieces. Experimental data include instances of benchmark test data taken from Falkenauer (1996) for One-dimensional Bin-Packing Problems. Results show that GEP can be used as a very powerful and flexible tool for finding interesting compact rules suited for the problem. The impact of functions is also investigated to show how they can affect and influence the success rates when they appear in rules. High success rates are gained with smaller population size and fewer generations compared to previous work performed using Genetic Programming.
Tasks
Published	2020-01-13
URL	https://arxiv.org/abs/2001.09923v1
PDF	https://arxiv.org/pdf/2001.09923v1.pdf
PWC	https://paperswithcode.com/paper/applying-gene-expression-programming-for
Repo
Framework
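
For orientation, an on-line bin-packing heuristic is a rule that decides, for each arriving piece, which bin receives it. The sketch below shows the classic hand-written first-fit rule; it is not a rule evolved by GEP and is not taken from the paper. The function name `first_fit` and the integer sizes are assumptions made for illustration.

```python
def first_fit(items, capacity):
    """Pack items in arrival order: use the first bin with room, else open a new bin."""
    bins = []  # each bin is a list of the item sizes placed in it
    for size in items:
        if size > capacity:
            raise ValueError(f"item {size} exceeds bin capacity {capacity}")
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            # No existing bin had enough free space.
            bins.append([size])
    return bins
```

Calling `first_fit([4, 8, 1, 4, 2, 1], 10)` packs the six pieces into two bins. An evolved rule would replace the fixed "first bin with room" test with a learned scoring expression.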

TapLab: A Fast Framework for Semantic Video Segmentation Tapping into Compressed-Domain Knowledge


Title	TapLab: A Fast Framework for Semantic Video Segmentation Tapping into Compressed-Domain Knowledge
Authors	Junyi Feng, Songyuan Li, Xi Li, Fei Wu, Qi Tian, Ming-Hsuan Yang, Haibin Ling
Abstract	Real-time semantic video segmentation is a challenging task due to the strict requirements of inference speed. Recent approaches mainly devote great efforts to reducing the model size for high efficiency. In this paper, we rethink this problem from a different viewpoint: using knowledge contained in compressed videos. We propose a simple and effective framework, dubbed TapLab, to tap into resources from the compressed domain. Specifically, we design a fast feature warping module using motion vectors for acceleration. To reduce the noise introduced by motion vectors, we design a residual-guided correction module and a residual-guided frame selection module using residuals. Compared with the state-of-the-art fast semantic image segmentation models, our proposed TapLab significantly reduces redundant computations, running around 3 times faster with comparable accuracy for 1024x2048 video. The experimental results show that TapLab achieves 70.6% mIoU on the Cityscapes dataset at 99.8 FPS with a single GPU card. A high-speed version even reaches the speed of 160+ FPS.
Tasks	Semantic Segmentation, Video Semantic Segmentation
Published	2020-03-30
URL	https://arxiv.org/abs/2003.13260v1
PDF	https://arxiv.org/pdf/2003.13260v1.pdf
PWC	https://paperswithcode.com/paper/taplab-a-fast-framework-for-semantic-video
Repo
Framework

Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds


Title	Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds
Authors	Jiacheng Wei, Guosheng Lin, Kim-Hui Yap, Tzu-Yi Hung, Lihua Xie
Abstract	Point clouds provide intrinsic geometric information and surface context for scene understanding. Existing methods for point cloud segmentation require a large amount of fully labeled data. Using advanced depth sensors, collecting large-scale 3D datasets is no longer a cumbersome process. However, manually producing point-level labels on large-scale datasets is time- and labor-intensive. In this paper, we propose a weakly supervised approach to predict point-level results using weak labels on 3D point clouds. We introduce our multi-path region mining module to generate pseudo point-level labels from a classification network trained with weak labels. It mines the localization cues for each class from various aspects of the network feature using different attention modules. Then, we use the point-level pseudo labels to train a point cloud segmentation network in a fully supervised manner. To the best of our knowledge, this is the first method that uses cloud-level weak labels on raw 3D space to train a point cloud semantic segmentation network. In our setting, the 3D weak labels only indicate the classes that appeared in our input sample. We discuss both scene- and subcloud-level weak labels on raw 3D point cloud data and perform in-depth experiments on them. On the ScanNet dataset, our result trained with subcloud-level labels is comparable with some fully supervised methods.
Tasks	3D Semantic Segmentation, Scene Understanding, Semantic Segmentation
Published	2020-03-29
URL	https://arxiv.org/abs/2003.13035v1
PDF	https://arxiv.org/pdf/2003.13035v1.pdf
PWC	https://paperswithcode.com/paper/multi-path-region-mining-for-weakly
Repo
Framework

Pathological speech detection using x-vector embeddings


Title	Pathological speech detection using x-vector embeddings
Authors	Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, Isabel Trancoso
Abstract	The potential of speech as a non-invasive biomarker to assess a speaker’s health has been repeatedly supported by the results of multiple works, for both physical and psychological conditions. Traditional systems for speech-based disease classification have focused on carefully designed knowledge-based features. However, these features may not represent the disease’s full symptomatology, and may even overlook its more subtle manifestations. This has prompted researchers to move in the direction of general speaker representations that inherently model symptoms, such as Gaussian Supervectors, i-vectors, and x-vectors. In this work, we focus on the latter, to assess their applicability as a general feature extraction method to the detection of Parkinson’s disease (PD) and obstructive sleep apnea (OSA). We test our approach against knowledge-based features and i-vectors, and report results for two European Portuguese corpora, for OSA and PD, as well as for an additional Spanish corpus for PD. Both x-vector and i-vector models were trained with an out-of-domain European Portuguese corpus. Our results show that x-vectors are able to perform better than knowledge-based features in same-language corpora. Moreover, while x-vectors performed similarly to i-vectors in matched conditions, they significantly outperform them when domain-mismatch occurs.
Tasks
Published	2020-03-02
URL	https://arxiv.org/abs/2003.00864v2
PDF	https://arxiv.org/pdf/2003.00864v2.pdf
PWC	https://paperswithcode.com/paper/pathological-speech-detection-using-x-vector
Repo
Framework

SNIFF: Reverse Engineering of Neural Networks with Fault Attacks


Title	SNIFF: Reverse Engineering of Neural Networks with Fault Attacks
Authors	Jakub Breier, Dirmanto Jap, Xiaolu Hou, Shivam Bhasin, Yang Liu
Abstract	Neural networks have been shown to be vulnerable to fault injection attacks. These attacks change the physical behavior of the device during the computation, resulting in a change in the value that is currently being computed. They can be realized by various fault injection techniques, ranging from clock/voltage glitching to application of lasers to rowhammer. In this paper we explore the possibility of reverse engineering neural networks using fault attacks. SNIFF stands for sign bit flip fault, which enables the reverse engineering by changing the sign of intermediate values. We develop the first exact extraction method on deep-layer feature extractor networks that provably allows the recovery of the model parameters. Our experiments with the Keras library show that the precision error of the parameter recovery for the tested networks is less than $10^{-13}$ when using 64-bit floats, which improves the current state of the art by 6 orders of magnitude. Additionally, we discuss the protection techniques against fault injection attacks that can be applied to enhance the fault resistance.
Tasks
Published	2020-02-23
URL	https://arxiv.org/abs/2002.11021v1
PDF	https://arxiv.org/pdf/2002.11021v1.pdf
PWC	https://paperswithcode.com/paper/sniff-reverse-engineering-of-neural-networks
Repo
Framework

Using the Split Bregman Algorithm to Solve the Self-Repelling Snake Model


Title	Using the Split Bregman Algorithm to Solve the Self-Repelling Snake Model
Authors	Huizhu Pan, Jintao Song, Wanquan Liu, Ling Li, Guanglu Zhou, Lu Tan, Shichu Chen
Abstract	Preserving the contour topology during image segmentation is useful in many practical scenarios. By keeping the contours isomorphic, it is possible to prevent over-segmentation and under-segmentation, as well as to adhere to given topologies. The self-repelling snake model (SR) is a variational model that preserves contour topology by combining a non-local repulsion term with the geodesic active contour model (GAC). The SR is traditionally solved using the additive operator splitting (AOS) scheme. Although this solution is stable, the memory requirement grows quickly as the image size increases. In our paper, we propose an alternative solution to the SR using the Split Bregman method. Our algorithm breaks the problem down into simpler subproblems to use lower-order evolution equations and approximation schemes. The memory usage is significantly reduced as a result. Experiments show comparable performance to the original algorithm with shorter iteration times.
Tasks	Semantic Segmentation
Published	2020-03-28
URL	https://arxiv.org/abs/2003.12693v1
PDF	https://arxiv.org/pdf/2003.12693v1.pdf
PWC	https://paperswithcode.com/paper/using-the-split-bregman-algorithm-to-solve
Repo
Framework

Biased Stochastic Gradient Descent for Conditional Stochastic Optimization


Title	Biased Stochastic Gradient Descent for Conditional Stochastic Optimization
Authors	Yifan Hu, Siqi Zhang, Xin Chen, Niao He
Abstract	Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning. However, constructing unbiased gradient estimates in CSO is challenging due to the composition structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study the bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives, under smooth and non-smooth conditions. We also provide matching lower bounds of BSGD for convex CSO objectives. Extensive numerical experiments are conducted to illustrate the performance of BSGD on robust logistic regression, model-agnostic meta-learning (MAML), and instrumental variable regression (IV).
Tasks	Causal Inference, Meta-Learning, Stochastic Optimization
Published	2020-02-25
URL	https://arxiv.org/abs/2002.10790v1
PDF	https://arxiv.org/pdf/2002.10790v1.pdf
PWC	https://paperswithcode.com/paper/biased-stochastic-gradient-descent-for
Repo
Framework

General Partial Label Learning via Dual Bipartite Graph Autoencoder


Title	General Partial Label Learning via Dual Bipartite Graph Autoencoder
Authors	Brian Chen, Bo Wu, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang
Abstract	We formulate a practical yet challenging problem: General Partial Label Learning (GPLL). Compared to the traditional Partial Label Learning (PLL) problem, GPLL relaxes the supervision assumption from instance-level (a label set partially labels an instance) to group-level: 1) a label set partially labels a group of instances, where the within-group instance-label link annotations are missing, and 2) cross-group links are allowed, so instances in a group may be partially linked to the label set from another group. Such ambiguous group-level supervision is more practical in real-world scenarios as additional annotation on the instance-level is no longer required, e.g., face-naming in videos where the group consists of faces in a frame, labeled by a name set in the corresponding caption. In this paper, we propose a novel graph convolutional network (GCN) called Dual Bipartite Graph Autoencoder (DB-GAE) to tackle the label ambiguity challenge of GPLL. First, we exploit the cross-group correlations to represent the instance groups as dual bipartite graphs: within-group and cross-group, which reciprocally complement each other to resolve the linking ambiguities. Second, we design a GCN autoencoder to encode and decode them, where the decodings are considered as the refined results. It is worth noting that DB-GAE is self-supervised and transductive, as it only uses the group-level supervision without a separate offline training stage. Extensive experiments on two real-world datasets demonstrate that DB-GAE significantly outperforms the best baseline by an absolute 0.159 F1-score and 24.8% accuracy. We further offer analysis on various levels of label ambiguities.
Tasks
Published	2020-01-05
URL	https://arxiv.org/abs/2001.01290v1
PDF	https://arxiv.org/pdf/2001.01290v1.pdf
PWC	https://paperswithcode.com/paper/general-partial-label-learning-via-dual
Repo
Framework

Representing Unordered Data Using Multiset Automata and Complex Numbers


Title	Representing Unordered Data Using Multiset Automata and Complex Numbers
Authors	Justin DeBenedetto, David Chiang
Abstract	Unordered, variable-sized inputs arise in many settings across multiple fields. The ability of set- and multiset-oriented neural networks to handle this type of input has been the focus of much work in recent years. We propose to represent multisets using complex-weighted multiset automata and show how the multiset representations of certain existing neural architectures can be viewed as special cases of ours. Namely, (1) we provide a new theoretical and intuitive justification for the Transformer model’s representation of positions using sinusoidal functions, and (2) we extend the DeepSets model to use complex numbers, enabling it to outperform the existing model on an extension of one of their tasks.
Tasks
Published	2020-01-02
URL	https://arxiv.org/abs/2001.00610v1
PDF	https://arxiv.org/pdf/2001.00610v1.pdf
PWC	https://paperswithcode.com/paper/representing-unordered-data-using-multiset-1
Repo
Framework
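
As background for point (1) of the abstract, the standard sinusoidal position encoding can be read as the real and imaginary parts of unit complex numbers, so that moving one position forward is a multiplication by a fixed complex weight. The sketch below shows only this well-known identity, not the paper's multiset-automaton construction. The function names and the base of 10000 follow the usual Transformer convention and are assumptions here.

```python
import cmath

def frequencies(d_model):
    """Angular frequencies of the usual sinusoidal position encoding."""
    return [1.0 / (10000 ** (2 * k / d_model)) for k in range(d_model // 2)]

def complex_position(pos, d_model):
    """Represent a position as unit complex numbers exp(i * pos * w_k)."""
    return [cmath.exp(1j * pos * w) for w in frequencies(d_model)]

def sinusoidal_encoding(pos, d_model):
    """Interleaved (sin, cos) pairs: imaginary and real parts of complex_position."""
    out = []
    for z in complex_position(pos, d_model):
        out.extend([z.imag, z.real])
    return out
```

Because `exp(i * (pos + 1) * w) = exp(i * pos * w) * exp(i * w)`, each coordinate of `complex_position(pos + 1, d)` equals the matching coordinate of `complex_position(pos, d)` times a constant that does not depend on `pos`.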

The Costs and Benefits of Goal-Directed Attention in Deep Convolutional Neural Networks


Title	The Costs and Benefits of Goal-Directed Attention in Deep Convolutional Neural Networks
Authors	Xiaoliang Luo, Brett D. Roads, Bradley C. Love
Abstract	Attention in machine learning is largely bottom-up, whereas people also deploy top-down, goal-directed attention. Motivated by neuroscience research, we evaluated a plug-and-play, top-down attention layer that is easily added to existing deep convolutional neural networks (DCNNs). In object recognition tasks, increasing top-down attention has benefits (increasing hit rates) and costs (increasing false alarm rates). At a moderate level, attention improves sensitivity (i.e., increases $d^\prime$) at only a moderate increase in bias for tasks involving standard images, blended images, and natural adversarial images. These theoretical results suggest that top-down attention can effectively reconfigure general-purpose DCNNs to better suit the current task goal. We hope our results continue the fruitful dialog between neuroscience and machine learning.
Tasks	Object Recognition
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02342v1
PDF	https://arxiv.org/pdf/2002.02342v1.pdf
PWC	https://paperswithcode.com/paper/the-costs-and-benefits-of-goal-directed
Repo
Framework
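
The sensitivity measure $d^\prime$ mentioned in the abstract comes from signal detection theory, where it is computed from the hit rate and the false alarm rate. The sketch below uses the textbook equal-variance Gaussian formulas; the abstract does not say which bias measure the authors use, so the criterion `c` here is an assumption, as is the function name.

```python
from statistics import NormalDist

def sensitivity_and_bias(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa              # separation of signal and noise
    criterion = -0.5 * (z_hit + z_fa)   # negative values mean a liberal bias
    return d_prime, criterion
```

Raising both the hit rate and the false alarm rate together, as the abstract describes for stronger attention, pushes the criterion toward negative values, while $d^\prime$ increases only if hits grow faster than false alarms on the z-scale.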

Set-Constrained Viterbi for Set-Supervised Action Segmentation


Title	Set-Constrained Viterbi for Set-Supervised Action Segmentation
Authors	Jun Li, Sinisa Todorovic
Abstract	This paper is about weakly supervised action segmentation, where the ground truth specifies only a set of actions present in a training video, but not their true temporal ordering. Prior work typically uses a classifier that independently labels video frames for generating the pseudo ground truth, and multiple instance learning for training the classifier. We extend this framework by specifying an HMM, which accounts for co-occurrences of action classes and their temporal lengths, and by explicitly training the HMM on a Viterbi-based loss. Our first contribution is the formulation of a new set-constrained Viterbi algorithm (SCV). Given a video, the SCV generates the MAP action segmentation that satisfies the ground truth. This prediction is used as a framewise pseudo ground truth in our HMM training. Our second contribution in training is a new regularization of feature affinities between training videos that share the same action classes. Evaluation on action segmentation and alignment on the Breakfast, MPII Cooking2, Hollywood Extended datasets demonstrates our significant performance improvement for the two tasks over prior work.
Tasks	Action Segmentation, Multiple Instance Learning
Published	2020-02-27
URL	https://arxiv.org/abs/2002.11925v2
PDF	https://arxiv.org/pdf/2002.11925v2.pdf
PWC	https://paperswithcode.com/paper/set-constrained-viterbi-for-set-supervised
Repo
Framework

Multitask learning over graphs


Title	Multitask learning over graphs
Authors	Roula Nassif, Stefan Vlaski, Cedric Richard, Jie Chen, Ali H. Sayed
Abstract	The problem of simultaneously learning several related tasks has received considerable attention in several domains, especially in machine learning with the so-called multitask learning problem or learning to learn problem [1], [2]. Multitask learning is an approach to inductive transfer learning (using what is learned for one problem to assist in another problem) and helps improve generalization performance relative to learning each task separately by using the domain information contained in the training signals of related tasks as an inductive bias. Several strategies have been derived within this community under the assumption that all data are available beforehand at a fusion center. However, recent years have witnessed an increasing ability to collect data in a distributed and streaming manner. This requires the design of new strategies for jointly learning multiple tasks from streaming data over distributed (or networked) systems. This article provides an overview of multitask strategies for learning and adaptation over networks. The working hypothesis for these strategies is that agents are allowed to cooperate with each other in order to learn distinct, though related, tasks. The article shows how cooperation steers the network limiting point and how different cooperation rules promote different task relatedness models. It also explains how and when cooperation over multitask networks outperforms non-cooperative strategies.
Tasks	Transfer Learning
Published	2020-01-07
URL	https://arxiv.org/abs/2001.02112v1
PDF	https://arxiv.org/pdf/2001.02112v1.pdf
PWC	https://paperswithcode.com/paper/multitask-learning-over-graphs
Repo
Framework