Paper Group ANR 367
Knowledge distillation using unlabeled mismatched images. Learning Phrase Embeddings from Paraphrases with GRUs. Learning General Latent-Variable Graphical Models with Predictive Belief Propagation. Neuron-level Selective Context Aggregation for Scene Segmentation. Towards understanding feedback from supermassive black holes using convolutional neural networks …
Knowledge distillation using unlabeled mismatched images
Title | Knowledge distillation using unlabeled mismatched images |
Authors | Mandar Kulkarni, Kalpesh Patil, Shirish Karande |
Abstract | Current approaches for Knowledge Distillation (KD) either directly use training data or sample from the training data distribution. In this paper, we demonstrate the effectiveness of ‘mismatched’ unlabeled stimuli in performing KD for image classification networks. For illustration, we consider scenarios in which there is a complete absence of training data, or in which mismatched stimuli must be used to augment a small amount of training data. We demonstrate that stimulus complexity is a key factor in distillation performance. Our examples include the use of various datasets to stimulate MNIST and CIFAR teachers. |
Tasks | Image Classification |
Published | 2017-03-21 |
URL | http://arxiv.org/abs/1703.07131v1 |
http://arxiv.org/pdf/1703.07131v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-distillation-using-unlabeled |
Repo | |
Framework | |
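The distillation objective described in the abstract above needs no ground-truth labels, only the teacher's soft outputs on whatever stimulus images are fed in. Below is a minimal sketch of that objective in PyTorch; the temperature, optimizer setup, and choice of stimulus loader are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: distilling a teacher into a student using only
# unlabeled "mismatched" images (no ground-truth labels required).
# Temperature and training details are illustrative choices.
import torch
import torch.nn.functional as F

def distillation_step(teacher, student, optimizer, stimulus, T=4.0):
    """One KD step on a batch of unlabeled stimulus images."""
    teacher.eval()
    with torch.no_grad():
        soft_targets = F.softmax(teacher(stimulus) / T, dim=1)
    log_probs = F.log_softmax(student(stimulus) / T, dim=1)
    # KL divergence between softened teacher and student distributions;
    # the T**2 factor keeps gradient magnitudes comparable across temperatures.
    loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * T**2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```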
Learning Phrase Embeddings from Paraphrases with GRUs
Title | Learning Phrase Embeddings from Paraphrases with GRUs |
Authors | Zhihao Zhou, Lifu Huang, Heng Ji |
Abstract | Learning phrase representations has been widely explored in many Natural Language Processing (NLP) tasks (e.g., Sentiment Analysis, Machine Translation) and has shown promising improvements. Previous studies either learn non-compositional phrase representations with general word-embedding learning techniques or learn compositional phrase representations based on syntactic structures, which either require huge amounts of human annotation or cannot easily be generalized to all phrases. In this work, we propose to take advantage of a large-scale paraphrase database and present a pair-wise gated recurrent units (pairwise-GRU) framework to generate compositional phrase representations. Our framework can be re-used to generate representations for any phrase. Experimental results show that our framework achieves state-of-the-art results on several phrase similarity tasks. |
Tasks | Machine Translation, Sentiment Analysis |
Published | 2017-10-13 |
URL | http://arxiv.org/abs/1710.05094v1 |
http://arxiv.org/pdf/1710.05094v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-phrase-embeddings-from-paraphrases |
Repo | |
Framework | |
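The abstract above describes composing word embeddings into phrase embeddings with a GRU trained on paraphrase pairs. A minimal sketch of that idea follows; the shared encoder, cosine-margin loss, and negative sampling are assumptions for illustration rather than the authors' exact training objective.

```python
# Sketch of a pairwise GRU phrase encoder trained on paraphrase pairs.
# The margin loss and negative-sampling scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PhraseEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hid_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len); the final GRU state is the phrase embedding
        _, h_n = self.gru(self.embed(token_ids))
        return h_n.squeeze(0)

def pairwise_loss(encoder, phrase_a, phrase_b, phrase_neg, margin=0.4):
    """Pull paraphrases together, push a sampled negative away, in cosine space."""
    a, b, n = encoder(phrase_a), encoder(phrase_b), encoder(phrase_neg)
    pos = F.cosine_similarity(a, b)
    neg = F.cosine_similarity(a, n)
    return F.relu(margin - pos + neg).mean()
```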
Learning General Latent-Variable Graphical Models with Predictive Belief Propagation
Title | Learning General Latent-Variable Graphical Models with Predictive Belief Propagation |
Authors | Borui Wang, Geoffrey Gordon |
Abstract | Learning general latent-variable probabilistic graphical models is a key theoretical challenge in machine learning and artificial intelligence. All previous methods, including the EM algorithm and the spectral algorithms, face severe limitations that largely restrict their applicability and affect their performance. In order to overcome these limitations, in this paper we introduce a novel formulation of message-passing inference over junction trees named predictive belief propagation, and propose a new learning and inference algorithm for general latent-variable graphical models based on this formulation. Our proposed algorithm reduces the hard parameter learning problem to a sequence of supervised learning problems, and unifies the learning of different kinds of latent graphical models into a single learning framework, which is local-optima-free and statistically consistent. We then give a proof of the correctness of our algorithm and show in experiments on both synthetic and real datasets that our algorithm significantly outperforms both the EM algorithm and the spectral algorithm while also being orders of magnitude faster to compute. |
Tasks | |
Published | 2017-12-06 |
URL | https://arxiv.org/abs/1712.02046v2 |
https://arxiv.org/pdf/1712.02046v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-general-latent-variable-graphical |
Repo | |
Framework | |
Neuron-level Selective Context Aggregation for Scene Segmentation
Title | Neuron-level Selective Context Aggregation for Scene Segmentation |
Authors | Zhenhua Wang, Fanglin Gu, Dani Lischinski, Daniel Cohen-Or, Changhe Tu, Baoquan Chen |
Abstract | Contextual information provides important cues for disambiguating visually similar pixels in scene segmentation. In this paper, we introduce a neuron-level Selective Context Aggregation (SCA) module for scene segmentation, comprised of a contextual dependency predictor and a context aggregation operator. The dependency predictor is implicitly trained to infer contextual dependencies between different image regions. The context aggregation operator augments local representations with global context, which is aggregated selectively at each neuron according to its on-the-fly predicted dependencies. The proposed mechanism enables data-driven inference of contextual dependencies, and facilitates context-aware feature learning. The proposed method improves strong baselines built upon VGG16 on challenging scene segmentation datasets, which demonstrates its effectiveness in modeling context information. |
Tasks | Scene Segmentation |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08278v1 |
http://arxiv.org/pdf/1711.08278v1.pdf | |
PWC | https://paperswithcode.com/paper/neuron-level-selective-context-aggregation |
Repo | |
Framework | |
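The abstract above specifies a contextual dependency predictor plus a per-neuron aggregation operator but not their exact form. The sketch below is one plausible reading, in which predicted pairwise dependencies between spatial positions weight a global context term that is then fused with the local features; it should not be taken as the authors' exact module.

```python
# Rough sketch of neuron-level selective context aggregation: a dependency
# predictor scores how strongly each spatial position should draw on every
# other position, and global context is mixed in per neuron accordingly.
# This is an interpretation of the abstract, not the authors' exact design.
import torch
import torch.nn as nn

class SelectiveContextAggregation(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c/8)
        k = self.key(x).flatten(2)                     # (b, c/8, hw)
        # Predicted contextual dependencies between all pairs of positions
        dep = torch.softmax(q @ k, dim=-1)             # (b, hw, hw)
        v = x.flatten(2).transpose(1, 2)               # (b, hw, c)
        context = (dep @ v).transpose(1, 2).reshape(b, c, h, w)
        # Augment local representations with selectively aggregated context
        return self.fuse(torch.cat([x, context], dim=1))
```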
Towards understanding feedback from supermassive black holes using convolutional neural networks
Title | Towards understanding feedback from supermassive black holes using convolutional neural networks |
Authors | Stanislav Fort |
Abstract | Supermassive black holes at the centers of clusters of galaxies strongly interact with their host environment via AGN feedback. Key tracers of such activity are X-ray cavities – regions of lower X-ray brightness within the cluster. We present an automatic method for detecting and characterizing X-ray cavities in noisy, low-resolution X-ray images. We simulate clusters of galaxies, insert cavities into them, and produce realistic low-quality images comparable to observations at high redshifts. We then train a custom-built convolutional neural network to generate a pixel-wise analysis of the presence of cavities in a cluster. A ResNet architecture is then used to decode the radii of cavities from the pixel-wise predictions. We surpass the accuracy, stability, and speed of current visual-inspection-based methods on simulated data. |
Tasks | |
Published | 2017-12-02 |
URL | http://arxiv.org/abs/1712.00523v1 |
http://arxiv.org/pdf/1712.00523v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-understanding-feedback-from |
Repo | |
Framework | |
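The pixel-wise stage of this pipeline can be pictured as a small fully convolutional network mapping an X-ray image to a per-pixel cavity-presence map. The sketch below is deliberately generic; the paper's custom-built architecture is not reproduced here.

```python
# Generic sketch of a fully convolutional network producing a per-pixel
# cavity-presence probability map from a single-channel X-ray image.
import torch.nn as nn

cavity_net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=1),
    nn.Sigmoid(),   # per-pixel probability that a cavity covers this pixel
)
```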
A Fully Convolutional Tri-branch Network (FCTN) for Domain Adaptation
Title | A Fully Convolutional Tri-branch Network (FCTN) for Domain Adaptation |
Authors | Junting Zhang, Chen Liang, C. -C. Jay Kuo |
Abstract | A domain adaptation method for urban scene segmentation is proposed in this work. We develop a fully convolutional tri-branch network in which two branches assign pseudo labels to images in the unlabeled target domain, while the third branch is trained with supervision on the pseudo-labeled target-domain images. The re-labeling and re-training processes alternate. With this design, the tri-branch network progressively learns target-specific discriminative representations and, as a result, the cross-domain capability of the segmenter improves. We evaluate the proposed network in large-scale domain adaptation experiments using both synthetic (GTA) and real (Cityscapes) images. It is shown that our solution achieves state-of-the-art performance, outperforming previous methods by a significant margin. |
Tasks | Domain Adaptation, Scene Segmentation |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.03694v2 |
http://arxiv.org/pdf/1711.03694v2.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-convolutional-tri-branch-network-fctn |
Repo | |
Framework | |
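The alternating re-label/re-train loop from the abstract can be sketched as follows. The criterion used here, that the two labeling branches must agree with high confidence before a pixel receives a pseudo label, is an assumption for illustration; the paper's exact assignment rule may differ.

```python
# Sketch of the alternating re-label / re-train loop. The agreement-plus-
# confidence criterion for pseudo labels is an illustrative assumption.
import torch
import torch.nn.functional as F

def pseudo_label(branch1, branch2, target_images, threshold=0.9):
    """Keep pixels where both labeling branches agree confidently."""
    with torch.no_grad():
        p1 = F.softmax(branch1(target_images), dim=1)
        p2 = F.softmax(branch2(target_images), dim=1)
        conf1, lab1 = p1.max(dim=1)
        conf2, lab2 = p2.max(dim=1)
        labels = lab1.clone()
        mask = (lab1 == lab2) & (conf1 > threshold) & (conf2 > threshold)
        # Index 255 marks pixels excluded from the segmentation loss
        labels[~mask] = 255
    return labels

def retrain_step(third_branch, optimizer, target_images, pseudo_labels):
    """Train the supervised branch on the pseudo-labeled target images."""
    loss = F.cross_entropy(third_branch(target_images), pseudo_labels,
                           ignore_index=255)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```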
Autonomous and Connected Intersection Crossing Traffic Management using Discrete-Time Occupancies Trajectory
Title | Autonomous and Connected Intersection Crossing Traffic Management using Discrete-Time Occupancies Trajectory |
Authors | Qiang Lu, Kyoung-Dae Kim |
Abstract | In this paper, we address a problem of safe and efficient intersection crossing traffic management of autonomous and connected ground traffic. Toward this objective, we propose an algorithm that is called the Discrete-time occupancies trajectory based Intersection traffic Coordination Algorithm (DICA). We first prove that the basic DICA is deadlock free and also starvation free. Then, we show that the basic DICA has a computational complexity of $\mathcal{O}(n^2 L_m^3)$ where $n$ is the number of vehicles granted to cross an intersection and $L_m$ is the maximum length of intersection crossing routes. To improve the overall computational efficiency of the algorithm, the basic DICA is enhanced by several computational approaches that are proposed in this paper. The enhanced algorithm has the computational complexity of $\mathcal{O}(n^2 L_m \log_2 L_m)$. The improved computational efficiency of the enhanced algorithm is validated through simulation using an open source traffic simulator, called the Simulation of Urban MObility (SUMO). The overall throughput as well as the computational efficiency of the enhanced algorithm are also compared with those of an optimized traffic light control. |
Tasks | |
Published | 2017-05-12 |
URL | http://arxiv.org/abs/1705.05231v1 |
http://arxiv.org/pdf/1705.05231v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-and-connected-intersection |
Repo | |
Framework | |
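At the heart of DICA is a conflict test over discrete-time occupancies: a crossing request is granted only if its per-timestep cell occupancies never overlap those already reserved. A minimal sketch of that test follows; the data structures are illustrative, and the paper's deadlock- and starvation-freedom machinery sits on top of it.

```python
# Sketch of the core discrete-time occupancy check: a vehicle's planned
# crossing is a per-timestep set of occupied intersection cells, and a
# request is granted only if it never overlaps the current schedule.
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]        # (row, col) of an intersection grid cell
Trajectory = List[Set[Cell]]  # index t -> cells occupied at timestep t

def conflicts(traj: Trajectory, schedule: Dict[int, Set[Cell]]) -> bool:
    """True if the candidate trajectory overlaps the current schedule."""
    return any(traj[t] & schedule.get(t, set()) for t in range(len(traj)))

def try_grant(traj: Trajectory, schedule: Dict[int, Set[Cell]]) -> bool:
    """Admit the vehicle and reserve its occupancies if conflict-free."""
    if conflicts(traj, schedule):
        return False
    for t, cells in enumerate(traj):
        schedule.setdefault(t, set()).update(cells)
    return True
```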
UCB Exploration via Q-Ensembles
Title | UCB Exploration via Q-Ensembles |
Authors | Richard Y. Chen, Szymon Sidor, Pieter Abbeel, John Schulman |
Abstract | We show how an ensemble of $Q^*$-functions can be leveraged for more effective exploration in deep reinforcement learning. We build on well established algorithms from the bandit setting, and adapt them to the $Q$-learning setting. We propose an exploration strategy based on upper-confidence bounds (UCB). Our experiments show significant gains on the Atari benchmark. |
Tasks | Q-Learning |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01502v3 |
http://arxiv.org/pdf/1706.01502v3.pdf | |
PWC | https://paperswithcode.com/paper/ucb-exploration-via-q-ensembles |
Repo | |
Framework | |
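The exploration rule itself is compact: act greedily with respect to the ensemble mean plus a scaled ensemble standard deviation. A minimal sketch, with the bonus coefficient `lam` as a tunable assumption:

```python
# Minimal sketch of UCB action selection over an ensemble of Q-functions:
# greedy on ensemble mean plus a scaled ensemble standard deviation.
import torch

def ucb_action(q_ensemble, state, lam=1.0):
    """q_ensemble: list of networks mapping state -> (num_actions,) values."""
    with torch.no_grad():
        qs = torch.stack([q(state) for q in q_ensemble])   # (K, num_actions)
        score = qs.mean(dim=0) + lam * qs.std(dim=0)       # UCB-style bonus
    return int(score.argmax())
```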
A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects
Title | A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects |
Authors | Yuanlu Xu, Lei Qin, Xiaobai Liu, Jianwen Xie, Song-Chun Zhu |
Abstract | Tracking humans that are interacting with other subjects or the environment remains unsolved in visual tracking, because the visibility of the humans of interest in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject’s interaction with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the causal-effect relations between an object’s visibility fluent and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure that enables fast search algorithms, e.g., dynamic programming. We apply the proposed method to challenging video sequences to evaluate its capabilities of estimating visibility fluent changes of subjects and tracking subjects of interest over time. Results with comparisons demonstrate that our method outperforms alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions. |
Tasks | Visual Tracking |
Published | 2017-09-16 |
URL | http://arxiv.org/abs/1709.05437v2 |
http://arxiv.org/pdf/1709.05437v2.pdf | |
PWC | https://paperswithcode.com/paper/a-causal-and-or-graph-model-for-visibility |
Repo | |
Framework | |
A fully dense and globally consistent 3D map reconstruction approach for GI tract to enhance therapeutic relevance of the endoscopic capsule robot
Title | A fully dense and globally consistent 3D map reconstruction approach for GI tract to enhance therapeutic relevance of the endoscopic capsule robot |
Authors | Mehmet Turan, Yusuf Yigit Pilavci, Redhwan Jamiruddin, Helder Araujo, Ender Konukoglu, Metin Sitti |
Abstract | In the gastrointestinal (GI) tract endoscopy field, ingestible wireless capsule endoscopy is emerging as a novel, minimally invasive diagnostic technology for inspection of the GI tract and diagnosis of a wide range of diseases and pathologies. Since the development of this technology, medical device companies and many research groups have made substantial progress in converting passive capsule endoscopes to robotic active capsule endoscopes with most of the functionality of current active flexible endoscopes. However, robotic capsule endoscopy still has some challenges. In particular, the use of such devices to generate a precise three-dimensional (3D) mapping of the entire inner organ remains an unsolved problem. Such global 3D maps of inner organs would help doctors to detect the location and size of diseased areas more accurately and intuitively, thus permitting more reliable diagnoses. To our knowledge, this paper presents the first complete pipeline for a complete 3D visual map reconstruction of the stomach. The proposed pipeline is modular and includes a preprocessing module, an image registration module, and a final shape-from-shading-based 3D reconstruction module; the 3D map is primarily generated by a combination of image stitching and shape-from-shading techniques, and is updated in a frame-by-frame iterative fashion via capsule motion inside the stomach. A comprehensive quantitative analysis of the proposed 3D reconstruction method is performed using an esophagogastroduodenoscopy simulator, three different endoscopic cameras, and a 3D optical scanner. |
Tasks | 3D Reconstruction, Image Registration, Image Stitching |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06524v1 |
http://arxiv.org/pdf/1705.06524v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-dense-and-globally-consistent-3d-map |
Repo | |
Framework | |
Automatic Ground Truths: Projected Image Annotations for Omnidirectional Vision
Title | Automatic Ground Truths: Projected Image Annotations for Omnidirectional Vision |
Authors | Victor Stamatescu, Peter Barsznica, Manjung Kim, Kin K. Liu, Mark McKenzie, Will Meakin, Gwilyn Saunders, Sebastien C. Wong, Russell S. A. Brinkworth |
Abstract | We present a novel data set made up of omnidirectional video of multiple objects whose centroid positions are annotated automatically. Omnidirectional vision is an active field of research focused on the use of spherical imagery in video analysis and scene understanding, involving tasks such as object detection, tracking and recognition. Our goal is to provide a large and consistently annotated video data set that can be used to train and evaluate new algorithms for these tasks. Here we describe the experimental setup and software environment used to capture and map the 3D ground truth positions of multiple objects into the image. Furthermore, we estimate the expected systematic error on the mapped positions. In addition to final data products, we release publicly the software tools and raw data necessary to re-calibrate the camera and/or redo this mapping. The software also provides a simple framework for comparing the results of standard image annotation tools or visual tracking systems against our mapped ground truth annotations. |
Tasks | Object Detection, Scene Understanding, Visual Tracking |
Published | 2017-09-12 |
URL | http://arxiv.org/abs/1709.03697v1 |
http://arxiv.org/pdf/1709.03697v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-ground-truths-projected-image |
Repo | |
Framework | |
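To make the 3D-to-image mapping concrete, the sketch below projects a camera-frame point into an equirectangular omnidirectional image, a standard convention for spherical imagery. It is an illustration only; the released tools additionally handle camera calibration and the rig geometry described in the paper, and axis conventions vary between setups.

```python
# Illustration of the kind of 3D-to-image mapping the pipeline performs:
# projecting a point (already expressed in the camera frame) into an
# equirectangular omnidirectional image.
import math

def project_equirectangular(x, y, z, width, height):
    """Map a 3D camera-frame point to (u, v) pixel coordinates."""
    lon = math.atan2(x, z)                             # azimuth in [-pi, pi]
    lat = math.asin(y / math.sqrt(x*x + y*y + z*z))    # elevation in [-pi/2, pi/2]
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height                 # v measured from the top
    return u, v
```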
Modeling Human Categorization of Natural Images Using Deep Feature Representations
Title | Modeling Human Categorization of Natural Images Using Deep Feature Representations |
Authors | Ruairidh M. Battleday, Joshua C. Peterson, Thomas L. Griffiths |
Abstract | Over the last few decades, psychologists have developed sophisticated formal models of human categorization using simple artificial stimuli. In this paper, we use modern machine learning methods to extend this work into the realm of naturalistic stimuli, enabling human categorization to be studied over the complex visual domain in which it evolved and developed. We show that representations derived from a convolutional neural network can be used to model behavior over a database of >300,000 human natural image classifications, and find that a group of models based on these representations perform well, near the reliability of human judgments. Interestingly, this group includes both exemplar and prototype models, contrasting with the dominance of exemplar models in previous work. We are able to improve the performance of the remaining models by preprocessing neural network representations to more closely capture human similarity judgments. |
Tasks | |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04855v1 |
http://arxiv.org/pdf/1711.04855v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-human-categorization-of-natural |
Repo | |
Framework | |
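The exemplar and prototype model families compared in the paper have classic formulations that transfer directly to CNN feature vectors: exemplar models sum similarity to every stored category member, while prototype models use distance to the category mean. A sketch with the standard exponential similarity kernel and Luce choice rule (both conventional choices, not necessarily the paper's exact variants):

```python
# Sketch of the two classic categorization model families over CNN features.
import numpy as np

def exemplar_scores(x, exemplars_by_cat, c=1.0):
    """GCM-style: sum similarity of x to every stored exemplar per category.
    exemplars_by_cat: {category: (n_i, d) array of feature vectors}."""
    return {cat: np.exp(-c * np.linalg.norm(E - x, axis=1)).sum()
            for cat, E in exemplars_by_cat.items()}

def prototype_scores(x, exemplars_by_cat, c=1.0):
    """Prototype model: similarity to each category's mean representation."""
    return {cat: np.exp(-c * np.linalg.norm(E.mean(axis=0) - x))
            for cat, E in exemplars_by_cat.items()}

def choice_probs(scores):
    """Luce choice rule: normalize scores into response probabilities."""
    total = sum(scores.values())
    return {cat: s / total for cat, s in scores.items()}
```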
Anisotropic-Scale Junction Detection and Matching for Indoor Images
Title | Anisotropic-Scale Junction Detection and Matching for Indoor Images |
Authors | Nan Xue, Gui-Song Xia, Xiang Bai, Liangpei Zhang, Weiming Shen |
Abstract | Junctions play an important role in the characterization of local geometric structures in images, the detection of which is a longstanding and challenging task. Existing junction detectors usually focus on identifying the junction locations and the orientations of the junction branches while ignoring their scales; however, these scales also contain rich geometric information. This paper presents a novel approach to junction detection and characterization that exploits the locally anisotropic geometries of a junction and estimates the scales of these geometries using an \emph{a contrario} model. The output junctions have anisotropic scales — i.e., each branch of a junction is associated with an independent scale parameter — and are thus termed anisotropic-scale junctions (ASJs). We then apply the newly detected ASJs for the matching of indoor images, in which there may be dramatic changes in viewpoint and the detected local visual features, e.g., key-points, are usually insufficiently distinctive. We propose to use the anisotropic geometries of our junctions to improve the matching precision for indoor images. Matching results obtained on sets of indoor images demonstrate that our approach achieves state-of-the-art performance in indoor image matching. |
Tasks | |
Published | 2017-03-16 |
URL | http://arxiv.org/abs/1703.05630v2 |
http://arxiv.org/pdf/1703.05630v2.pdf | |
PWC | https://paperswithcode.com/paper/anisotropic-scale-junction-detection-and |
Repo | |
Framework | |
A deep learning-based method for relative location prediction in CT scan images
Title | A deep learning-based method for relative location prediction in CT scan images |
Authors | Jiajia Guo, Hongwei Du, Bensheng Qiu, Xiao Liang |
Abstract | Relative location prediction in computed tomography (CT) scan images is a challenging problem. In this paper, a regression model based on one-dimensional convolutional neural networks is proposed to determine the relative location of a CT scan image both robustly and precisely. A public dataset is employed to validate the performance of the proposed method using 5-fold cross-validation. Experimental results demonstrate the excellent performance of the proposed model compared with state-of-the-art techniques, achieving a median absolute error of 1.04 cm and a mean absolute error of 1.69 cm. |
Tasks | Computed Tomography (CT) |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07624v1 |
http://arxiv.org/pdf/1711.07624v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-based-method-for-relative |
Repo | |
Framework | |
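A 1D convolutional regressor of the kind described can be sketched as below, assuming each slice is summarized as a fixed-length 1D feature vector (e.g., histogram-style features); the paper's exact architecture and featurization are not reproduced here.

```python
# Sketch of a 1D-CNN regressor for relative slice location, assuming each
# CT slice arrives as a fixed-length 1D feature vector (an assumption).
import torch
import torch.nn as nn

class SliceLocationRegressor(nn.Module):
    def __init__(self, in_len=384):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(64, 1)   # relative axial position

    def forward(self, x):              # x: (batch, 1, in_len)
        return self.head(self.net(x).squeeze(-1))
```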
Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers
Title | Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers |
Authors | Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein |
Abstract | Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, querying or commenting on an earlier post, is becoming of increasing interest to researchers. While most previous work has focused on using individual tweets as classifier inputs, here we report on the performance of sequential classifiers that exploit the discourse features inherent in social media interactions or ‘conversational threads’. Testing the effectiveness of four sequential classifiers – Hawkes Processes, Linear-Chain Conditional Random Fields (Linear CRF), Tree-Structured Conditional Random Fields (Tree CRF) and Long Short-Term Memory networks (LSTM) – on eight datasets associated with breaking news stories, and looking at different types of local and contextual features, our work sheds new light on the development of accurate stance classifiers. We show that sequential classifiers that exploit discourse properties in social media conversations while using only local features outperform non-sequential classifiers. Furthermore, we show that an LSTM using a reduced set of features can outperform the other sequential classifiers; this performance is consistent across datasets and across types of stances. To conclude, our work also analyses the different features under study, identifying those that best help characterise and distinguish between stances, such as supporting tweets being more likely to be accompanied by evidence than denying tweets. We also set forth a number of directions for future research. |
Tasks | |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02223v1 |
http://arxiv.org/pdf/1712.02223v1.pdf | |
PWC | https://paperswithcode.com/paper/discourse-aware-rumour-stance-classification |
Repo | |
Framework | |
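The LSTM variant can be pictured as reading a thread's posts in reply order and emitting one stance label per post, so earlier discourse conditions later predictions. A minimal sketch, with per-post feature extraction left abstract:

```python
# Sketch of the sequential-classifier idea: an LSTM reads the posts of a
# conversational thread in order and emits a stance label (support / deny /
# query / comment) for each post. Per-post featurization is left abstract.
import torch
import torch.nn as nn

class ThreadStanceLSTM(nn.Module):
    def __init__(self, feat_dim, hid_dim=128, num_stances=4):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hid_dim, batch_first=True)
        self.classify = nn.Linear(hid_dim, num_stances)

    def forward(self, thread_feats):
        # thread_feats: (batch, num_posts, feat_dim), posts in reply order
        hidden, _ = self.lstm(thread_feats)
        return self.classify(hidden)   # per-post stance logits
```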