Paper Group ANR 481
BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking
Title | BioTracker: An Open-Source Computer Vision Framework for Visual Animal Tracking |
Authors | Hauke Jürgen Mönck, Andreas Jörg, Tobias von Falkenhausen, Julian Tanke, Benjamin Wild, David Dormagen, Jonas Piotrowski, Claudia Winklmayr, David Bierbach, Tim Landgraf |
Abstract | The study of animal behavior increasingly relies on (semi-) automatic methods for the extraction of relevant behavioral features from video or picture data. To date, several specialized software products exist to detect and track animals’ positions in simple (laboratory) environments. Tracking animals in their natural environments, however, often requires substantial customization of the image processing algorithms to the problem-specific image characteristics. Here we introduce BioTracker, an open-source computer vision framework that provides programmers with core functionalities that are essential parts of tracking software, such as video I/O, graphics overlays and mouse and keyboard interfaces. BioTracker additionally provides a number of different tracking algorithms suitable for a variety of image recording conditions. The main feature of BioTracker is, however, the straightforward implementation of new problem-specific tracking modules and vision algorithms that can build upon BioTracker’s core functionalities. With this open-source framework the scientific community can accelerate their research and focus on the development of new vision algorithms. |
Tasks | |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07985v1 |
http://arxiv.org/pdf/1803.07985v1.pdf | |
PWC | https://paperswithcode.com/paper/biotracker-an-open-source-computer-vision |
Repo | |
Framework | |
Anomaly Detection in the Presence of Missing Values
Title | Anomaly Detection in the Presence of Missing Values |
Authors | Thomas G. Dietterich, Tadesse Zemicheal |
Abstract | Standard methods for anomaly detection assume that all features are observed at both learning time and prediction time. Such methods cannot process data containing missing values. This paper studies five strategies for handling missing values in test queries: (a) mean imputation, (b) MAP imputation, (c) reduction (reduced-dimension anomaly detectors via feature bagging), (d) marginalization (for density estimators only), and (e) proportional distribution (for tree-based methods only). Our analysis suggests that MAP imputation and proportional distribution should give better results than mean imputation, reduction, and marginalization. These hypotheses are largely confirmed by experimental studies on synthetic data and on anomaly detection benchmark data sets using the Isolation Forest (IF), LODA, and EGMM anomaly detection algorithms. However, marginalization worked surprisingly well for EGMM, and there are exceptions where reduction works well on some benchmark problems. We recommend proportional distribution for IF, MAP imputation for LODA, and marginalization for EGMM. |
Tasks | Anomaly Detection, Imputation |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01605v1 |
http://arxiv.org/pdf/1809.01605v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-in-the-presence-of-missing |
Repo | |
Framework | |
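Two of the strategies named in the abstract above, mean imputation and marginalization, can be contrasted in a minimal sketch. Everything here is an illustrative assumption rather than the authors' setup: a diagonal Gaussian stands in for a density estimator such as EGMM, the helper names are hypothetical, and the toy data is invented.

```python
import math

def fit_diag_gaussian(rows):
    """Per-feature mean and variance from fully observed training rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    return means, variances

def neg_log_density(x, means, variances):
    """Anomaly score: negative log density over the observed (non-None) features."""
    score = 0.0
    for xj, m, v in zip(x, means, variances):
        if xj is None:
            continue  # marginalization: the missing feature is integrated out
        score += 0.5 * math.log(2 * math.pi * v) + (xj - m) ** 2 / (2 * v)
    return score

def mean_impute(x, means):
    """Replace each missing value with the training mean of that feature."""
    return [m if xj is None else xj for xj, m in zip(x, means)]

# Toy training data (invented) and a test query whose second feature is missing.
train = [[0.0, 10.0], [1.0, 12.0], [2.0, 14.0], [1.0, 12.0]]
means, variances = fit_diag_gaussian(train)
query = [5.0, None]

score_marginal = neg_log_density(query, means, variances)
score_imputed = neg_log_density(mean_impute(query, means), means, variances)
print(score_marginal, score_imputed)
```

Under this independence assumption the two scores differ only by the normalization term of the imputed feature; with correlated features or tree-based detectors the strategies can diverge much more, which is the situation the paper studies.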
Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision
Title | Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision |
Authors | Kuan Fang, Yuke Zhu, Animesh Garg, Andrey Kurenkov, Viraj Mehta, Li Fei-Fei, Silvio Savarese |
Abstract | Tool manipulation is vital for enabling robots to complete challenging task goals. It requires reasoning about the desired effect of the task and thus properly grasping and manipulating the tool to achieve the task. Task-agnostic grasping optimizes for grasp robustness while ignoring crucial task-specific constraints. In this paper, we propose the Task-Oriented Grasping Network (TOG-Net) to jointly optimize both task-oriented grasping of a tool and the manipulation policy for that tool. The training process of the model is based on large-scale simulated self-supervision with procedurally generated tool objects. We perform both simulated and real-world experiments on two tool-based manipulation tasks: sweeping and hammering. Our model achieves an overall task success rate of 71.1% for sweeping and 80.0% for hammering. Supplementary material is available at: bit.ly/task-oriented-grasp |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09266v1 |
http://arxiv.org/pdf/1806.09266v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-task-oriented-grasping-for-tool |
Repo | |
Framework | |
Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
Title | Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization |
Authors | Navid Azizan, Babak Hassibi |
Abstract | Stochastic descent methods (of the gradient and mirror varieties) have become increasingly popular in optimization. In fact, it is now widely recognized that the success of deep learning is not only due to the special deep architecture of the models, but also due to the behavior of the stochastic descent methods used, which play a key role in reaching “good” solutions that generalize well to unseen data. In an attempt to shed some light on why this is the case, we revisit some minimax properties of stochastic gradient descent (SGD) for the square loss of linear models—originally developed in the 1990’s—and extend them to general stochastic mirror descent (SMD) algorithms for general loss functions and nonlinear models. In particular, we show that there is a fundamental identity which holds for SMD (and SGD) under very general conditions, and which implies the minimax optimality of SMD (and SGD) for sufficiently small step size, and for a general class of loss functions and general nonlinear models. We further show that this identity can be used to naturally establish other properties of SMD (and SGD), namely convergence and implicit regularization for over-parameterized linear models (in what is now being called the “interpolating regime”), some of which have been shown in certain cases in prior literature. We also argue how this identity can be used in the so-called “highly over-parameterized” nonlinear setting (where the number of parameters far exceeds the number of data points) to provide insights into why SMD (and SGD) may have similar convergence and implicit regularization properties for deep learning. |
Tasks | |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.00952v4 |
http://arxiv.org/pdf/1806.00952v4.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-gradientmirror-descent-minimax |
Repo | |
Framework | |
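The implicit regularization mentioned in the abstract above can be illustrated on the smallest over-parameterized linear model: one equation, two unknowns. The sketch below is an assumption-laden toy, not the paper's code. The function `smd_linear`, the data, and the step size are hypothetical, and the mirror map is taken to be the identity, which corresponds to the potential ψ(w) = ½‖w‖² and reduces stochastic mirror descent to plain SGD.

```python
def smd_linear(data, grad_psi, grad_psi_inv, w0, step, epochs):
    """Stochastic mirror descent on the square loss of a linear model.

    Each update is grad_psi(w_new) = grad_psi(w_old) - step * grad_loss(w_old).
    """
    w = list(w0)
    for _ in range(epochs):
        for a, y in data:
            residual = y - sum(ai * wi for ai, wi in zip(a, w))
            # The gradient of 0.5 * (y - a.w)^2 with respect to w is -residual * a.
            z = [zi + step * residual * ai for zi, ai in zip(grad_psi(w), a)]
            w = grad_psi_inv(z)
    return w

def identity(v):
    """Mirror map for psi(w) = 0.5 * ||w||^2; its inverse is also the identity."""
    return list(v)

# One data point, two parameters: infinitely many w satisfy 3*w0 + 4*w1 = 5.
data = [([3.0, 4.0], 5.0)]
w = smd_linear(data, identity, identity, [0.0, 0.0], step=0.02, epochs=60)
print(w)  # close to the minimum-norm interpolating solution [0.6, 0.8]
```

Started from zero, the iterates stay in the span of the data vector, so among all interpolating solutions the one reached is the one of minimum Euclidean norm. Swapping in a different invertible mirror map changes which interpolating solution is selected.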
Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays
Title | Integrating Reinforcement Learning to Self Training for Pulmonary Nodule Segmentation in Chest X-rays |
Authors | Sejin Park, Woochan Hwang, Kyu-Hwan Jung |
Abstract | Machine learning applications in medical imaging are frequently limited by the lack of quality labeled data. In this paper, we explore the self training method, a form of semi-supervised learning, to address the labeling burden. By integrating reinforcement learning, we were able to expand the application of self training to complex segmentation networks without any further human annotation. The proposed approach, reinforced self training (ReST), fine-tunes a semantic segmentation network by introducing a policy network that learns to generate pseudolabels. We incorporate an expert demonstration network, based on inverse reinforcement learning, to enhance clinical validity and convergence of the policy network. The model was tested on a pulmonary nodule segmentation task in chest X-rays and achieved the performance of a standard U-Net while using only 50% of the labeled data, by exploiting unlabeled data. When the same amount of labeled data was used, a moderate to significant cross validation accuracy improvement was achieved depending on the absolute number of labels used. |
Tasks | Semantic Segmentation |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08840v1 |
http://arxiv.org/pdf/1811.08840v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-reinforcement-learning-to-self |
Repo | |
Framework | |
Learning Instance Segmentation by Interaction
Title | Learning Instance Segmentation by Interaction |
Authors | Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik |
Abstract | We present an approach for building an active agent that learns to segment its visual observations into individual objects by interacting with its environment in a completely self-supervised manner. The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels. The model learned from over 50K interactions generalizes to novel objects and backgrounds. To deal with the noisy training signal for segmenting objects obtained by self-supervised interactions, we propose a robust set loss. A dataset of the robot’s interactions along with a few human-labeled examples is provided as a benchmark for future research. We test the utility of the learned segmentation model by providing results on a downstream vision-based control task of rearranging multiple objects into target configurations from visual inputs alone. Videos, code, and robotic interaction dataset are available at https://pathak22.github.io/seg-by-interaction/ |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08354v1 |
http://arxiv.org/pdf/1806.08354v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-instance-segmentation-by-interaction |
Repo | |
Framework | |
Neural Architectures for Open-Type Relation Argument Extraction
Title | Neural Architectures for Open-Type Relation Argument Extraction |
Authors | Benjamin Roth, Costanza Conforti, Nina Poerner, Sanjeev Karn, Hinrich Schütze |
Abstract | In this work, we introduce the task of Open-Type Relation Argument Extraction (ORAE): Given a corpus, a query entity Q and a knowledge base relation (e.g., “Q authored notable work with title X”), the model has to extract an argument of non-standard entity type (entities that cannot be extracted by a standard named entity tagger, e.g. X: the title of a book or a work of art) from the corpus. A distantly supervised dataset based on WikiData relations is obtained and released to address the task. We develop and compare a wide range of neural models for this task yielding large improvements over a strong baseline obtained with a neural question answering system. The impact of different sentence encoding architectures and answer extraction methods is systematically compared. An encoder based on gated recurrent units combined with a conditional random fields tagger gives the best results. |
Tasks | Question Answering |
Published | 2018-03-05 |
URL | http://arxiv.org/abs/1803.01707v2 |
http://arxiv.org/pdf/1803.01707v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-architectures-for-open-type-relation |
Repo | |
Framework | |
FDSNet: Finger dorsal image spoof detection network using light field camera
Title | FDSNet: Finger dorsal image spoof detection network using light field camera |
Authors | Avantika Singh, Gaurav Jaswal, Aditya Nigam |
Abstract | At present, spoofing attacks, in which a biometric system is presented with a fake biometric characteristic, pose a great challenge to recognition performance. Despite the availability of a broad range of presentation attack detection (PAD) or liveness detection algorithms, fingerprint sensors are vulnerable to spoofing via fake fingers. In such situations, finger dorsal images can be considered as an alternative, since they can be captured without much user cooperation and are more appropriate for outdoor security applications. In this paper, we present a first feasibility study of spoofing attack scenarios on a finger dorsal authentication system, covering four types of presentation attacks: printed paper, wrapped printed paper, scan and mobile. This study also presents a CNN-based spoofing attack detection method which employs state-of-the-art deep learning techniques along with a transfer learning mechanism. We have collected 196 finger dorsal real images from 33 subjects, captured with a Lytro camera, and also created a set of 784 finger dorsal spoofing images. Extensive experiments demonstrate the superiority of the proposed approach for various spoofing attacks. |
Tasks | Finger Dorsal Image Spoof Detection, Transfer Learning |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07444v1 |
http://arxiv.org/pdf/1812.07444v1.pdf | |
PWC | https://paperswithcode.com/paper/fdsnet-finger-dorsal-image-spoof-detection |
Repo | |
Framework | |
Contextual Speech Recognition with Difficult Negative Training Examples
Title | Contextual Speech Recognition with Difficult Negative Training Examples |
Authors | Uri Alon, Golan Pundak, Tara N. Sainath |
Abstract | Improving the representation of contextual information is key to unlocking the potential of end-to-end (E2E) automatic speech recognition (ASR). In this work, we present a novel and simple approach for training an ASR context mechanism with difficult negative examples. The main idea is to focus on proper nouns (e.g., unique entities such as names of people and places) in the reference transcript, and use phonetically similar phrases as negative examples, encouraging the neural model to learn more discriminative representations. We apply our approach to an end-to-end contextual ASR model that jointly learns to transcribe and select the correct context items, and show that our proposed method gives up to 53.1% relative improvement in word error rate (WER) across several benchmarks. |
Tasks | Speech Recognition |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12170v1 |
http://arxiv.org/pdf/1810.12170v1.pdf | |
PWC | https://paperswithcode.com/paper/contextual-speech-recognition-with-difficult |
Repo | |
Framework | |
TraMNet - Transition Matrix Network for Efficient Action Tube Proposals
Title | TraMNet - Transition Matrix Network for Efficient Action Tube Proposals |
Authors | Gurkirt Singh, Suman Saha, Fabio Cuzzolin |
Abstract | Current state-of-the-art methods solve spatiotemporal action localisation by extending 2D anchors to 3D-cuboid proposals on stacks of frames, to generate sets of temporally connected bounding boxes called action micro-tubes. However, they fail to consider that the underlying anchor proposal hypotheses should also move (transition) from frame to frame, as the actor or the camera does. Assuming we evaluate $n$ 2D anchors in each frame, then the number of possible transitions from each 2D anchor to the next, for a sequence of $f$ consecutive frames, is in the order of $O(n^f)$, expensive even for small values of $f$. To avoid this problem, we introduce a Transition-Matrix-based Network (TraMNet) which relies on computing transition probabilities between anchor proposals while maximising their overlap with ground truth bounding boxes across frames, and enforcing sparsity via a transition threshold. As the resulting transition matrix is sparse and stochastic, this reduces the proposal hypothesis search space from $O(n^f)$ to the cardinality of the thresholded matrix. At training time, transitions are specific to cell locations of the feature maps, so that a sparse (efficient) transition matrix is used to train the network. At test time, a denser transition matrix can be obtained either by decreasing the threshold or by adding to it all the relative transitions originating from any cell location, allowing the network to handle transitions in the test data that might not have been present in the training data, and making detection translation-invariant. Finally, we show that our network can handle sparse annotations such as those available in the DALY dataset. We report extensive experiments on the DALY, UCF101-24 and Transformed-UCF101-24 datasets to support our claims. |
Tasks | |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00297v1 |
http://arxiv.org/pdf/1808.00297v1.pdf | |
PWC | https://paperswithcode.com/paper/tramnet-transition-matrix-network-for |
Repo | |
Framework | |
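The reduction in search space described in the abstract above can be made concrete with a toy count. This is only a sketch under stated assumptions: the 3 x 3 row-stochastic matrix, the threshold value, and the helper `threshold_transitions` are invented for illustration and are not taken from the paper.

```python
def threshold_transitions(P, tau):
    """Keep the anchor-to-anchor transitions whose probability is at least tau."""
    return [(i, j) for i, row in enumerate(P) for j, p in enumerate(row) if p >= tau]

# Hypothetical transition probabilities between n = 3 anchors in consecutive frames.
# Each row sums to 1 (row-stochastic).
P = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.20, 0.80],
]

n, f = len(P), 2
exhaustive = n ** f                       # every anchor paired with every anchor
kept = threshold_transitions(P, tau=0.15)

print(exhaustive)  # 9
print(kept)        # [(0, 0), (1, 1), (2, 1), (2, 2)]
```

Lowering `tau` yields a denser set of transitions, which mirrors the test-time option the abstract describes of decreasing the threshold.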
Rough Concept Analysis
Title | Rough Concept Analysis |
Authors | Robert E. Kent |
Abstract | The theory introduced, presented and developed in this paper, is concerned with Rough Concept Analysis. This theory is a synthesis of the theory of Rough Sets pioneered by Zdzislaw Pawlak with the theory of Formal Concept Analysis pioneered by Rudolf Wille. The central notion in this paper of a rough formal concept combines in a natural fashion the notion of a rough set with the notion of a formal concept: “rough set + formal concept = rough formal concept”. A follow-up paper will provide a synthesis of the two important data modeling techniques: conceptual scaling of Formal Concept Analysis and Entity-Relationship database modeling. |
Tasks | |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.06986v1 |
http://arxiv.org/pdf/1810.06986v1.pdf | |
PWC | https://paperswithcode.com/paper/rough-concept-analysis |
Repo | |
Framework | |
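The two ingredients named in the abstract above can each be shown in a few lines: the lower and upper approximations of rough set theory, and the derivation operators of Formal Concept Analysis. The sketch below illustrates them separately and does not attempt the paper's synthesis; the toy context, the indiscernibility blocks, and all function names are assumptions made for illustration.

```python
# Toy formal context (invented): each object is mapped to its set of attributes.
context = {
    "duck": {"flies", "swims"},
    "swan": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}

def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    if not objs:
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objs))

def extent(attrs):
    """Objects that have every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

def lower_upper(target, blocks):
    """Rough-set approximations of target with respect to indiscernibility blocks."""
    lower = set().union(*[b for b in blocks if b <= target])  # blocks fully inside
    upper = set().union(*[b for b in blocks if b & target])   # blocks that overlap
    return lower, upper

# A formal concept is a pair (A, B) with intent(A) == B and extent(B) == A.
B = intent({"duck"})
A = extent(B)

# Objects with identical attribute sets are indiscernible and share a block.
blocks = [{"duck", "swan"}, {"eagle"}, {"trout"}]
lower, upper = lower_upper({"duck", "eagle"}, blocks)
print(A, B, lower, upper)
```

Here `{"duck", "eagle"}` is not exactly definable from the blocks, so it is bracketed by a lower approximation `{"eagle"}` and an upper approximation `{"duck", "swan", "eagle"}`.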
Regularized adversarial examples for model interpretability
Title | Regularized adversarial examples for model interpretability |
Authors | Yoel Shoshan, Vadim Ratner |
Abstract | As machine learning algorithms continue to improve, there is an increasing need for explaining why a model produces a certain prediction for a certain input. In recent years, several methods for model interpretability have been developed, aiming to explain which regions of the model input are the main reason for the model prediction. In parallel, the research community has invested significant effort in recent years in developing adversarial example generation methods for fooling models while not altering the true label of the input, as it would have been classified by a human annotator. In this paper, we bridge the gap between adversarial example generation and model interpretability, and introduce a modification to the adversarial example generation process which encourages better interpretability. We analyze the proposed method on a public medical imaging dataset, both quantitatively and qualitatively, and show that it significantly outperforms the leading known alternative method. Our suggested method is simple to implement, and can be easily plugged into most common adversarial example generation frameworks. Additionally, we propose an explanation quality metric, APE (“Adversarial Perturbative Explanation”), which measures how well an explanation describes model decisions. |
Tasks | |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07311v2 |
http://arxiv.org/pdf/1811.07311v2.pdf | |
PWC | https://paperswithcode.com/paper/regularized-adversarial-examples-for-model |
Repo | |
Framework | |
Deep Learning with unsupervised data labeling for weeds detection on UAV images
Title | Deep Learning with unsupervised data labeling for weeds detection on UAV images |
Authors | M. Dian. Bah, Adel Hafiane, Raphael Canals |
Abstract | In modern agriculture, weed control usually consists of spraying herbicides all over the agricultural field. This practice involves significant herbicide waste and cost for farmers, as well as environmental pollution. One way to reduce the cost and environmental impact is to allocate the right doses of herbicide at the right place and at the right time (Precision Agriculture). Nowadays, the Unmanned Aerial Vehicle (UAV) is becoming an interesting acquisition system for weed localization and management due to its ability to obtain images of the entire agricultural field with a very high spatial resolution and at low cost. Despite the important advances in UAV acquisition systems, automatic weed detection remains a challenging problem because of the strong similarity between weeds and crops. Recently, the Deep Learning approach has shown impressive results in different complex classification problems. However, this approach needs a certain amount of training data, and creating large agricultural datasets with pixel-level annotations by experts is an extremely time-consuming task. In this paper, we propose a novel fully automatic learning method using Convolutional Neural Networks (CNNs) with unsupervised training dataset collection for weed detection from UAV images. The proposed method consists of three main phases. First, we automatically detect the crop lines and use them to identify the interline weeds. In the second phase, interline weeds are used to constitute the training dataset. Finally, we train CNNs on this dataset to build a model able to detect the crop and weeds in the images. The results obtained are comparable to those of traditional supervised training data labeling. The accuracy gaps are 1.5% in the spinach field and 6% in the bean field. |
Tasks | |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12395v1 |
http://arxiv.org/pdf/1805.12395v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-with-unsupervised-data-labeling |
Repo | |
Framework | |
Attention-based Ensemble for Deep Metric Learning
Title | Attention-based Ensemble for Deep Metric Learning |
Authors | Wonsik Kim, Bhavya Goyal, Kunal Chawla, Jungmin Lee, Keunjoo Kwon |
Abstract | Deep metric learning aims to learn an embedding function, modeled as a deep neural network. This embedding function usually places semantically similar images close together and dissimilar images far from each other in the learned embedding space. Recently, ensembles have been applied to deep metric learning to yield state-of-the-art results. An important aspect of an ensemble is that the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks, so that each learner can attend to different parts of the object. We also propose a divergence loss, which encourages diversity among the learners. The proposed method is applied to the standard benchmarks of deep metric learning and experimental results show that it outperforms the state-of-the-art methods by a significant margin on image retrieval tasks. |
Tasks | Image Retrieval, Metric Learning |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00382v2 |
http://arxiv.org/pdf/1804.00382v2.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-ensemble-for-deep-metric |
Repo | |
Framework | |
Vehicle Image Generation Going Well with The Surroundings
Title | Vehicle Image Generation Going Well with The Surroundings |
Authors | Jeesoo Kim, Jangho Kim, Jaeyoung Yoo, Daesik Kim, Nojun Kwak |
Abstract | Since generative neural networks made a breakthrough in the image generation problem, many of their applications have been studied, such as image restoration, style transfer and image completion. However, there has been little research on generating objects in uncontrolled real-world environments. In this paper, we propose a novel approach for vehicle image generation in real-world scenes. Using a subnetwork based on prior work on image completion, our model generates the shape of an object. Details of objects are learned by an additional colorization and refinement subnetwork, resulting in better quality of the generated objects. Unlike many other works, our method does not require any segmentation layout but still produces a plausible vehicle in the image. We evaluate our method using images from the Berkeley Deep Drive (BDD) and Cityscapes datasets, which are widely used for object detection and image segmentation problems. The adequacy of the images generated by the proposed method has also been evaluated using a widely utilized object detection algorithm and the FID score. |
Tasks | Colorization, Image Generation, Image Restoration, Object Detection, Semantic Segmentation, Style Transfer |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.02925v3 |
http://arxiv.org/pdf/1807.02925v3.pdf | |
PWC | https://paperswithcode.com/paper/vehicle-image-generation-going-well-with-the |
Repo | |
Framework | |
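For readers unfamiliar with the FID score mentioned in the abstract above: assuming it refers to the Fréchet Inception Distance as commonly defined, it compares Gaussian fits to Inception-network features of real and generated images. The symbols below are not from the abstract; $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the feature mean and covariance for real and generated images respectively.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one; the score is zero when the two Gaussian fits coincide.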