October 18, 2019

Paper Group ANR 481

BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking. Anomaly Detection in the Presence of Missing Values. Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision. Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization. Integrating Reinforcement Learning to Self Tr …

BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking


Title	BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking
Authors	Hauke Jürgen Mönck, Andreas Jörg, Tobias von Falkenhausen, Julian Tanke, Benjamin Wild, David Dormagen, Jonas Piotrowski, Claudia Winklmayr, David Bierbach, Tim Landgraf
Abstract	The study of animal behavior increasingly relies on (semi-) automatic methods for the extraction of relevant behavioral features from video or picture data. To date, several specialized software products exist to detect and track animals’ positions in simple (laboratory) environments. Tracking animals in their natural environments, however, often requires substantial customization of the image processing algorithms to the problem-specific image characteristics. Here we introduce BioTracker, an open-source computer vision framework, that provides programmers with core functionalities that are essential parts of a tracking software, such as video I/O, graphics overlays and mouse and keyboard interfaces. BioTracker additionally provides a number of different tracking algorithms suitable for a variety of image recording conditions. The main feature of BioTracker is however the straightforward implementation of new problem-specific tracking modules and vision algorithms that can build upon BioTracker’s core functionalities. With this open-source framework the scientific community can accelerate their research and focus on the development of new vision algorithms.
Tasks
Published	2018-03-21
URL	http://arxiv.org/abs/1803.07985v1
PDF	http://arxiv.org/pdf/1803.07985v1.pdf
PWC	https://paperswithcode.com/paper/biotracker-an-open-source-computer-vision
Repo
Framework

Anomaly Detection in the Presence of Missing Values


Title	Anomaly Detection in the Presence of Missing Values
Authors	Thomas G. Dietterich, Tadesse Zemicheal
Abstract	Standard methods for anomaly detection assume that all features are observed at both learning time and prediction time. Such methods cannot process data containing missing values. This paper studies five strategies for handling missing values in test queries: (a) mean imputation, (b) MAP imputation, (c) reduction (reduced-dimension anomaly detectors via feature bagging), (d) marginalization (for density estimators only), and (e) proportional distribution (for tree-based methods only). Our analysis suggests that MAP imputation and proportional distribution should give better results than mean imputation, reduction, and marginalization. These hypotheses are largely confirmed by experimental studies on synthetic data and on anomaly detection benchmark data sets using the Isolation Forest (IF), LODA, and EGMM anomaly detection algorithms. However, marginalization worked surprisingly well for EGMM, and there are exceptions where reduction works well on some benchmark problems. We recommend proportional distribution for IF, MAP imputation for LODA, and marginalization for EGMM.
Tasks	Anomaly Detection, Imputation
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01605v1
PDF	http://arxiv.org/pdf/1809.01605v1.pdf
PWC	https://paperswithcode.com/paper/anomaly-detection-in-the-presence-of-missing
Repo
Framework
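
To make the difference between two of the strategies named in the abstract concrete, here is a minimal sketch comparing mean imputation with marginalization. It assumes a single diagonal Gaussian as the density estimator, which is a stand-in chosen for illustration and not the EGMM, LODA, or Isolation Forest detectors studied in the paper. The function names, parameters, and query values are all hypothetical.

```python
import math

def gaussian_log_density(x, mean, var):
    """Log-density of a diagonal Gaussian; features set to None are marginalized out."""
    total = 0.0
    for xi, mu, v in zip(x, mean, var):
        if xi is None:
            # Integrating a diagonal Gaussian over a missing feature
            # contributes a factor of 1, so the term is simply dropped.
            continue
        total += -0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
    return total

def mean_impute(x, mean):
    """Replace each missing feature with the training mean of that feature."""
    return [mu if xi is None else xi for xi, mu in zip(x, mean)]

# Hypothetical fitted parameters and a test query with one missing feature.
mean = [0.0, 10.0, -5.0]
var = [1.0, 4.0, 0.25]
query = [0.5, None, -4.0]

# Anomaly score = negative log-density (higher means more anomalous).
marginal_score = -gaussian_log_density(query, mean, var)
imputed_score = -gaussian_log_density(mean_impute(query, mean), mean, var)

print(f"marginalization score: {marginal_score:.4f}")
print(f"mean-imputation score: {imputed_score:.4f}")
```

In this toy model the two scores differ only by the normalizing constant of the imputed feature, so the scores of queries with different numbers of missing features are not on the same scale under mean imputation.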

Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision


Title	Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision
Authors	Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese
Abstract	Tool manipulation is vital for facilitating robots to complete challenging task goals. It requires reasoning about the desired effect of the task and thus properly grasping and manipulating the tool to achieve the task. Task-agnostic grasping optimizes for grasp robustness while ignoring crucial task-specific constraints. In this paper, we propose the Task-Oriented Grasping Network (TOG-Net) to jointly optimize both task-oriented grasping of a tool and the manipulation policy for that tool. The training process of the model is based on large-scale simulated self-supervision with procedurally generated tool objects. We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering. Our model achieves overall 71.1% task success rate for sweeping and 80.0% task success rate for hammering. Supplementary material is available at: bit.ly/task-oriented-grasp
Tasks
Published	2018-06-25
URL	http://arxiv.org/abs/1806.09266v1
PDF	http://arxiv.org/pdf/1806.09266v1.pdf
PWC	https://paperswithcode.com/paper/learning-task-oriented-grasping-for-tool
Repo
Framework

Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization


Title	Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
Authors	Navid Azizan, Babak Hassibi
Abstract	Stochastic descent methods (of the gradient and mirror varieties) have become increasingly popular in optimization. In fact, it is now widely recognized that the success of deep learning is not only due to the special deep architecture of the models, but also due to the behavior of the stochastic descent methods used, which play a key role in reaching “good” solutions that generalize well to unseen data. In an attempt to shed some light on why this is the case, we revisit some minimax properties of stochastic gradient descent (SGD) for the square loss of linear models—originally developed in the 1990’s—and extend them to general stochastic mirror descent (SMD) algorithms for general loss functions and nonlinear models. In particular, we show that there is a fundamental identity which holds for SMD (and SGD) under very general conditions, and which implies the minimax optimality of SMD (and SGD) for sufficiently small step size, and for a general class of loss functions and general nonlinear models. We further show that this identity can be used to naturally establish other properties of SMD (and SGD), namely convergence and implicit regularization for over-parameterized linear models (in what is now being called the “interpolating regime”), some of which have been shown in certain cases in prior literature. We also argue how this identity can be used in the so-called “highly over-parameterized” nonlinear setting (where the number of parameters far exceeds the number of data points) to provide insights into why SMD (and SGD) may have similar convergence and implicit regularization properties for deep learning.
Tasks
Published	2018-06-04
URL	http://arxiv.org/abs/1806.00952v4
PDF	http://arxiv.org/pdf/1806.00952v4.pdf
PWC	https://paperswithcode.com/paper/stochastic-gradientmirror-descent-minimax
Repo
Framework
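
The implicit regularization property mentioned in the abstract can be illustrated for the simplest case: plain SGD with the square loss on an over-parameterized linear model. The sketch below is a toy example under stated assumptions (two hypothetical data points, three parameters, zero initialization, a fixed step size); it is not the paper's analysis and does not cover general mirror descent. Started from zero, the iterates stay in the span of the data points, so SGD converges to the interpolating solution with the smallest Euclidean norm.

```python
import random

# Hypothetical over-parameterized problem: 2 data points, 3 parameters.
X = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
y = [2.0, 3.0]

def sgd_from_zero(X, y, step=0.25, iters=2000, seed=0):
    """Run SGD on the loss 0.5 * (x . w - y)^2, sampling one point per step."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for _ in range(iters):
        i = rng.randrange(len(X))
        residual = sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]
        # Gradient of the per-sample square loss is residual * x_i.
        w = [wj - step * residual * xj for wj, xj in zip(w, X[i])]
    return w

w = sgd_from_zero(X, y)

# Minimum-norm interpolating solution, computed by hand as X^T (X X^T)^{-1} y.
min_norm = [1.0 / 3.0, 4.0 / 3.0, 5.0 / 3.0]

print("SGD solution:     ", [round(v, 6) for v in w])
print("min-norm solution:", [round(v, 6) for v in min_norm])
```

Infinitely many weight vectors fit these two points exactly; the one SGD reaches is determined by the initialization and the update rule rather than by any explicit penalty term.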

Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays


Title	Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays
Authors	Sejin Park, Woochan Hwang, Kyu-Hwan Jung
Abstract	Machine learning applications in medical imaging are frequently limited by the lack of quality labeled data. In this paper, we explore the self training method, a form of semi-supervised learning, to address the labeling burden. By integrating reinforcement learning, we were able to expand the application of self training to complex segmentation networks without any further human annotation. The proposed approach, reinforced self training (ReST), fine-tunes a semantic segmentation network by introducing a policy network that learns to generate pseudolabels. We incorporate an expert demonstration network, based on inverse reinforcement learning, to enhance clinical validity and convergence of the policy network. The model was tested on a pulmonary nodule segmentation task in chest X-rays and achieved the performance of a standard U-Net while using only 50% of the labeled data, by exploiting unlabeled data. When the same amount of labeled data was used, a moderate to significant cross validation accuracy improvement was achieved depending on the absolute number of labels used.
Tasks	Semantic Segmentation
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08840v1
PDF	http://arxiv.org/pdf/1811.08840v1.pdf
PWC	https://paperswithcode.com/paper/integrating-reinforcement-learning-to-self
Repo
Framework

Learning Instance Segmentation by Interaction


Title	Learning Instance Segmentation by Interaction
Authors	Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik
Abstract	We present an approach for building an active agent that learns to segment its visual observations into individual objects by interacting with its environment in a completely self-supervised manner. The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels. The model learned from over 50K interactions generalizes to novel objects and backgrounds. To deal with the noisy training signal for segmenting objects obtained by self-supervised interactions, we propose a robust set loss. A dataset of the robot’s interactions along with a few human-labeled examples is provided as a benchmark for future research. We test the utility of the learned segmentation model by providing results on a downstream vision-based control task of rearranging multiple objects into target configurations from visual inputs alone. Videos, code, and robotic interaction dataset are available at https://pathak22.github.io/seg-by-interaction/
Tasks	Instance Segmentation, Semantic Segmentation
Published	2018-06-21
URL	http://arxiv.org/abs/1806.08354v1
PDF	http://arxiv.org/pdf/1806.08354v1.pdf
PWC	https://paperswithcode.com/paper/learning-instance-segmentation-by-interaction
Repo
Framework

Neural Architectures for Open-Type Relation Argument Extraction


Title	Neural Architectures for Open-Type Relation Argument Extraction
Authors	Benjamin Roth, Costanza Conforti, Nina Poerner, Sanjeev Karn, Hinrich Schütze
Abstract	In this work, we introduce the task of Open-Type Relation Argument Extraction (ORAE): Given a corpus, a query entity Q and a knowledge base relation (e.g., “Q authored notable work with title X”), the model has to extract an argument of non-standard entity type (entities that cannot be extracted by a standard named entity tagger, e.g. X: the title of a book or a work of art) from the corpus. A distantly supervised dataset based on WikiData relations is obtained and released to address the task. We develop and compare a wide range of neural models for this task yielding large improvements over a strong baseline obtained with a neural question answering system. The impact of different sentence encoding architectures and answer extraction methods is systematically compared. An encoder based on gated recurrent units combined with a conditional random fields tagger gives the best results.
Tasks	Question Answering
Published	2018-03-05
URL	http://arxiv.org/abs/1803.01707v2
PDF	http://arxiv.org/pdf/1803.01707v2.pdf
PWC	https://paperswithcode.com/paper/neural-architectures-for-open-type-relation
Repo
Framework

FDSNet: Finger dorsal image spoof detection network using light field camera


Title	FDSNet: Finger dorsal image spoof detection network using light field camera
Authors	Avantika Singh, Gaurav Jaswal, Aditya Nigam
Abstract	At present, spoofing attacks, in which a biometric system is presented with a fake biometric characteristic, pose a great challenge to recognition performance. Despite the availability of a broad range of presentation attack detection (PAD) or liveness detection algorithms, fingerprint sensors are vulnerable to spoofing via fake fingers. In such situations, finger dorsal images can be thought of as an alternative that can be captured without much user cooperation and are more appropriate for outdoor security applications. In this paper, we present a first feasibility study of spoofing attack scenarios on a finger dorsal authentication system, which include four types of presentation attacks: printed paper, wrapped printed paper, scan and mobile. This study also presents a CNN-based spoofing attack detection method which employs state-of-the-art deep learning techniques along with a transfer learning mechanism. We have collected 196 finger dorsal real images from 33 subjects, captured with a Lytro camera, and also created a set of 784 finger dorsal spoofing images. Extensive experiments demonstrate the superiority of the proposed approach for various spoofing attacks.
Tasks	Finger Dorsal Image Spoof Detection, Transfer Learning
Published	2018-12-18
URL	http://arxiv.org/abs/1812.07444v1
PDF	http://arxiv.org/pdf/1812.07444v1.pdf
PWC	https://paperswithcode.com/paper/fdsnet-finger-dorsal-image-spoof-detection
Repo
Framework

Contextual Speech Recognition with Difficult Negative Training Examples


Title	Contextual Speech Recognition with Difficult Negative Training Examples
Authors	Uri Alon, Golan Pundak, Tara N. Sainath
Abstract	Improving the representation of contextual information is key to unlocking the potential of end-to-end (E2E) automatic speech recognition (ASR). In this work, we present a novel and simple approach for training an ASR context mechanism with difficult negative examples. The main idea is to focus on proper nouns (e.g., unique entities such as names of people and places) in the reference transcript, and use phonetically similar phrases as negative examples, encouraging the neural model to learn more discriminative representations. We apply our approach to an end-to-end contextual ASR model that jointly learns to transcribe and select the correct context items, and show that our proposed method gives up to 53.1% relative improvement in word error rate (WER) across several benchmarks.
Tasks	Speech Recognition
Published	2018-10-29
URL	http://arxiv.org/abs/1810.12170v1
PDF	http://arxiv.org/pdf/1810.12170v1.pdf
PWC	https://paperswithcode.com/paper/contextual-speech-recognition-with-difficult
Repo
Framework
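
For readers unfamiliar with the metric, the sketch below shows how word error rate and a relative improvement in it are conventionally computed: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and the relative improvement is the WER reduction divided by the baseline WER. The transcripts are invented for illustration and are not from the paper; the function names are hypothetical.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances for an empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)

def relative_improvement(baseline_wer, new_wer):
    """Fraction of the baseline WER that was removed."""
    return (baseline_wer - new_wer) / baseline_wer

# Invented transcripts: a proper noun confused with phonetically similar words.
reference = "play songs by joan jett"
baseline_hyp = "play songs by john jet"
contextual_hyp = "play songs by joan jet"

baseline = wer(reference, baseline_hyp)
contextual = wer(reference, contextual_hyp)
print(f"baseline WER:   {baseline:.2f}")
print(f"contextual WER: {contextual:.2f}")
print(f"relative improvement: {relative_improvement(baseline, contextual):.0%}")
```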

TraMNet - Transition Matrix Network for Efficient Action Tube Proposals


Title	TraMNet - Transition Matrix Network for Efficient Action Tube Proposals
Authors	Gurkirt Singh, Suman Saha, Fabio Cuzzolin
Abstract	Current state-of-the-art methods solve spatiotemporal action localisation by extending 2D anchors to 3D-cuboid proposals on stacks of frames, to generate sets of temporally connected bounding boxes called action micro-tubes. However, they fail to consider that the underlying anchor proposal hypotheses should also move (transition) from frame to frame, as the actor or the camera does. Assuming we evaluate $n$ 2D anchors in each frame, then the number of possible transitions from each 2D anchor to the next, for a sequence of $f$ consecutive frames, is on the order of $O(n^f)$, expensive even for small values of $f$. To avoid this problem, we introduce a Transition-Matrix-based Network (TraMNet) which relies on computing transition probabilities between anchor proposals while maximising their overlap with ground truth bounding boxes across frames, and enforcing sparsity via a transition threshold. As the resulting transition matrix is sparse and stochastic, this reduces the proposal hypothesis search space from $O(n^f)$ to the cardinality of the thresholded matrix. At training time, transitions are specific to cell locations of the feature maps, so that a sparse (efficient) transition matrix is used to train the network. At test time, a denser transition matrix can be obtained either by decreasing the threshold or by adding to it all the relative transitions originating from any cell location, allowing the network to handle transitions in the test data that might not have been present in the training data, and making detection translation-invariant. Finally, we show that our network can handle sparse annotations such as those available in the DALY dataset. We report extensive experiments on the DALY, UCF101-24 and Transformed-UCF101-24 datasets to support our claims.
Tasks
Published	2018-08-01
URL	http://arxiv.org/abs/1808.00297v1
PDF	http://arxiv.org/pdf/1808.00297v1.pdf
PWC	https://paperswithcode.com/paper/tramnet-transition-matrix-network-for
Repo
Framework
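
The search-space reduction described in the abstract can be illustrated for the two-frame case. The sketch below assumes a small hand-written row-stochastic transition matrix and an arbitrary threshold; both are hypothetical, and the code shows only the counting argument, not how TraMNet estimates the matrix or builds longer micro-tubes.

```python
def thresholded_transitions(P, threshold):
    """Return the (source anchor, target anchor) pairs whose probability passes the threshold."""
    return [(i, j)
            for i, row in enumerate(P)
            for j, p in enumerate(row)
            if p >= threshold]

# Hypothetical transition probabilities between n = 3 anchors in consecutive frames.
# Each row sums to 1 (the matrix is row-stochastic).
P = [
    [0.90, 0.10, 0.00],
    [0.05, 0.80, 0.15],
    [0.00, 0.30, 0.70],
]

n, f = len(P), 2
dense = n ** f                            # all anchor pairs across two frames
kept = thresholded_transitions(P, 0.2)    # pairs surviving the threshold

print(f"dense hypotheses: {dense}")
print(f"kept hypotheses:  {len(kept)} -> {kept}")
```

Lowering the threshold keeps more pairs, which mirrors the abstract's remark that a denser matrix can be used at test time.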

Rough Concept Analysis


Title	Rough Concept Analysis
Authors	Robert E. Kent
Abstract	The theory introduced, presented and developed in this paper, is concerned with Rough Concept Analysis. This theory is a synthesis of the theory of Rough Sets pioneered by Zdzislaw Pawlak with the theory of Formal Concept Analysis pioneered by Rudolf Wille. The central notion in this paper of a rough formal concept combines in a natural fashion the notion of a rough set with the notion of a formal concept: “rough set + formal concept = rough formal concept”. A follow-up paper will provide a synthesis of the two important data modeling techniques: conceptual scaling of Formal Concept Analysis and Entity-Relationship database modeling.
Tasks
Published	2018-10-12
URL	http://arxiv.org/abs/1810.06986v1
PDF	http://arxiv.org/pdf/1810.06986v1.pdf
PWC	https://paperswithcode.com/paper/rough-concept-analysis
Repo
Framework
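
As background for the rough-set half of the synthesis, the sketch below computes the standard Pawlak lower and upper approximations of a target set with respect to a partition into indiscernibility classes. The partition and target set are invented for illustration, and the code does not implement the rough formal concepts introduced in the paper.

```python
def approximations(partition, target):
    """Pawlak lower and upper approximations of `target` under `partition`."""
    lower, upper = set(), set()
    for block in partition:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

# Hypothetical universe {1..5} split into indiscernibility classes.
partition = [{1, 2}, {3}, {4, 5}]
target = {1, 2, 3, 4}

lower, upper = approximations(partition, target)
print("lower approximation:", sorted(lower))
print("upper approximation:", sorted(upper))
print("boundary region:    ", sorted(upper - lower))
```

The target set is "rough" here because the class {4, 5} straddles its boundary: element 4 belongs to the target but cannot be told apart from element 5, which does not.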

Regularized adversarial examples for model interpretability


Title	Regularized adversarial examples for model interpretability
Authors	Yoel Shoshan, Vadim Ratner
Abstract	As machine learning algorithms continue to improve, there is an increasing need for explaining why a model produces a certain prediction for a certain input. In recent years, several methods for model interpretability have been developed, aiming to explain which regions of the model input are the main reason for the model prediction. In parallel, a significant research community effort has gone in recent years into developing adversarial example generation methods for fooling models, while not altering the true label of the input, as it would have been classified by a human annotator. In this paper, we bridge the gap between adversarial example generation and model interpretability, and introduce a modification to the adversarial example generation process which encourages better interpretability. We analyze the proposed method on a public medical imaging dataset, both quantitatively and qualitatively, and show that it significantly outperforms the leading known alternative method. Our suggested method is simple to implement, and can be easily plugged into most common adversarial example generation frameworks. Additionally, we propose an explanation quality metric, APE (“Adversarial Perturbative Explanation”), which measures how well an explanation describes model decisions.
Tasks
Published	2018-11-18
URL	http://arxiv.org/abs/1811.07311v2
PDF	http://arxiv.org/pdf/1811.07311v2.pdf
PWC	https://paperswithcode.com/paper/regularized-adversarial-examples-for-model
Repo
Framework

Deep Learning with unsupervised data labeling for weeds detection on UAV images


Title	Deep Learning with unsupervised data labeling for weeds detection on UAV images
Authors	M. Dian. Bah, Adel Hafiane, Raphael Canals
Abstract	In modern agriculture, weed control usually consists of spraying herbicides all over the agricultural field. This practice involves significant waste and cost of herbicide for farmers, as well as environmental pollution. One way to reduce the cost and environmental impact is to allocate the right doses of herbicide at the right place and at the right time (Precision Agriculture). Nowadays, the Unmanned Aerial Vehicle (UAV) is becoming an interesting acquisition system for weed localization and management due to its ability to obtain images of the entire agricultural field with a very high spatial resolution and at low cost. Despite the important advances in UAV acquisition systems, automatic weed detection remains a challenging problem because of the strong similarity between weeds and crops. Recently, the Deep Learning approach has shown impressive results in different complex classification problems. However, this approach needs a certain amount of training data, and creating large agricultural datasets with pixel-level annotations by experts is an extremely time-consuming task. In this paper, we propose a novel fully automatic learning method using Convolutional Neural Networks (CNNs) with unsupervised training dataset collection for weed detection from UAV images. The proposed method consists of three main phases. First, we automatically detect the crop lines and use them to identify the interline weeds. In the second phase, interline weeds are used to constitute the training dataset. Finally, we train CNNs on this dataset to build a model able to detect the crop and weeds in the images. The results obtained are comparable to those of traditional supervised training data labeling. The accuracy gaps are 1.5% in the spinach field and 6% in the bean field.
Tasks
Published	2018-05-31
URL	http://arxiv.org/abs/1805.12395v1
PDF	http://arxiv.org/pdf/1805.12395v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-with-unsupervised-data-labeling
Repo
Framework

Attention-based Ensemble for Deep Metric Learning


Title	Attention-based Ensemble for Deep Metric Learning
Authors	Wonsik Kim, Bhavya Goyal, Kunal Chawla, Jungmin Lee, Keunjoo Kwon
Abstract	Deep metric learning aims to learn an embedding function, modeled as a deep neural network. This embedding function usually places semantically similar images close together and dissimilar images far from each other in the learned embedding space. Recently, ensembles have been applied to deep metric learning to yield state-of-the-art results. As one important aspect of an ensemble, the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks, so that each learner can attend to different parts of the object. We also propose a divergence loss, which encourages diversity among the learners. The proposed method is applied to the standard benchmarks of deep metric learning and experimental results show that it outperforms the state-of-the-art methods by a significant margin on image retrieval tasks.
Tasks	Image Retrieval, Metric Learning
Published	2018-04-02
URL	http://arxiv.org/abs/1804.00382v2
PDF	http://arxiv.org/pdf/1804.00382v2.pdf
PWC	https://paperswithcode.com/paper/attention-based-ensemble-for-deep-metric
Repo
Framework

Vehicle Image Generation Going Well with The Surroundings


Title	Vehicle Image Generation Going Well with The Surroundings
Authors	Jeesoo Kim, Jangho Kim, Jaeyoung Yoo, Daesik Kim, Nojun Kwak
Abstract	Since generative neural networks made a breakthrough in the image generation problem, many applications have been studied, such as image restoration, style transfer and image completion. However, there has been little research on generating objects in uncontrolled real-world environments. In this paper, we propose a novel approach for vehicle image generation in real-world scenes. Using a subnetwork based on a previous work on image completion, our model forms the shape of an object. Details of objects are learned by an additional colorization and refinement subnetwork, resulting in better quality of generated objects. Unlike many other works, our method does not require any segmentation layout but still produces a plausible vehicle in the image. We evaluate our method using images from the Berkeley Deep Drive (BDD) and Cityscape datasets, which are widely used for object detection and image segmentation problems. The adequacy of the images generated by the proposed method has also been evaluated using a widely utilized object detection algorithm and the FID score.
Tasks	Colorization, Image Generation, Image Restoration, Object Detection, Semantic Segmentation, Style Transfer
Published	2018-07-09
URL	http://arxiv.org/abs/1807.02925v3
PDF	http://arxiv.org/pdf/1807.02925v3.pdf
PWC	https://paperswithcode.com/paper/vehicle-image-generation-going-well-with-the
Repo
Framework