Paper Group ANR 22
GAR: An efficient and scalable Graph-based Activity Regularization for semi-supervised learning
Title | GAR: An efficient and scalable Graph-based Activity Regularization for semi-supervised learning |
Authors | Ozsel Kilinc, Ismail Uysal |
Abstract | In this paper, we propose a novel graph-based approach for semi-supervised learning problems, which considers an adaptive adjacency of the examples throughout the unsupervised portion of the training. Adjacency of the examples is inferred using the predictions of a neural network model which is first initialized by a supervised pretraining. These predictions are then updated according to a novel unsupervised objective which regularizes another adjacency, now linking the output nodes. Regularizing the adjacency of the output nodes, inferred from the predictions of the network, creates an easier optimization problem and ultimately turns the predictions of the network into the optimal embedding. The proposed framework thus provides an effective and scalable graph-based solution which is natural to the operational mechanism of deep neural networks. Our results show comparable performance with state-of-the-art generative approaches for semi-supervised learning on an easier-to-train, low-cost framework. |
Tasks | |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07219v2 |
http://arxiv.org/pdf/1705.07219v2.pdf | |
PWC | https://paperswithcode.com/paper/gar-an-efficient-and-scalable-graph-based |
Repo | |
Framework | |
Effective Inference for Generative Neural Parsing
Title | Effective Inference for Generative Neural Parsing |
Authors | Mitchell Stern, Daniel Fried, Dan Klein |
Abstract | Generative neural models have recently achieved state-of-the-art results for constituency parsing. However, without a feasible search procedure, their use has so far been limited to reranking the output of external parsers in which decoding is more tractable. We describe an alternative to the conventional action-level beam search used for discriminative neural models that enables us to decode directly in these generative models. We then show that by improving our basic candidate selection strategy and using a coarse pruning function, we can improve accuracy while exploring significantly less of the search space. Applied to the model of Choe and Charniak (2016), our inference procedure obtains 92.56 F1 on section 23 of the Penn Treebank, surpassing prior state-of-the-art results for single-model systems. |
Tasks | Constituency Parsing |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08976v1 |
http://arxiv.org/pdf/1707.08976v1.pdf | |
PWC | https://paperswithcode.com/paper/effective-inference-for-generative-neural |
Repo | |
Framework | |
Batch-Expansion Training: An Efficient Optimization Framework
Title | Batch-Expansion Training: An Efficient Optimization Framework |
Authors | Michał Dereziński, Dhruv Mahajan, S. Sathiya Keerthi, S. V. N. Vishwanathan, Markus Weimer |
Abstract | We propose Batch-Expansion Training (BET), a framework for running a batch optimizer on a gradually expanding dataset. As opposed to stochastic approaches, batches do not need to be resampled i.i.d. at every iteration, thus making BET more resource efficient in a distributed setting, and when disk-access is constrained. Moreover, BET can be easily paired with most batch optimizers, does not require any parameter-tuning, and compares favorably to existing stochastic and batch methods. We show that when the batch size grows exponentially with the number of outer iterations, BET achieves optimal $O(1/\epsilon)$ data-access convergence rate for strongly convex objectives. Experiments in parallel and distributed settings show that BET performs better than standard batch and stochastic approaches. |
Tasks | |
Published | 2017-04-22 |
URL | http://arxiv.org/abs/1704.06731v3 |
http://arxiv.org/pdf/1704.06731v3.pdf | |
PWC | https://paperswithcode.com/paper/batch-expansion-training-an-efficient |
Repo | |
Framework | |
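
The idea in the abstract above, running a batch optimizer on a data prefix that grows exponentially across outer iterations, can be illustrated with a minimal sketch. This is not the authors' implementation: the inner optimizer (plain gradient descent on least squares), the function name `batch_expansion_gd`, and the parameters `initial_size`, `growth` and `inner_steps` are all assumptions chosen for illustration.

```python
import numpy as np

def batch_expansion_gd(X, y, initial_size=16, growth=2, inner_steps=20):
    """Gradient descent on a growing prefix of (X, y); hypothetical BET-style loop."""
    n, d = X.shape
    w = np.zeros(d)
    size = min(initial_size, n)
    while True:
        # The batch is a prefix of the data, so nothing is resampled between stages.
        Xb, yb = X[:size], y[:size]
        # Step size 1/L, with L the largest eigenvalue of the batch Hessian.
        L = np.linalg.eigvalsh(Xb.T @ Xb / size).max()
        for _ in range(inner_steps):
            grad = Xb.T @ (Xb @ w - yb) / size
            w -= grad / L
        if size == n:
            break
        # Expand the batch geometrically (assumed growth factor of 2).
        size = min(size * growth, n)
    return w

# Noise-free synthetic regression problem, so every prefix shares the same minimizer.
rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true
w_hat = batch_expansion_gd(X, y)
```

Each stage warm-starts from the previous solution, so the early, cheap stages do most of the work before the full dataset is ever touched. The abstract states that BET can be paired with most batch optimizers, so the inner loop here is only a stand-in.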
LOOP Descriptor: Local Optimal Oriented Pattern
Title | LOOP Descriptor: Local Optimal Oriented Pattern |
Authors | Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal |
Abstract | This letter introduces the LOOP binary descriptor (local optimal oriented pattern) that encodes rotation invariance into the main formulation itself. This makes any post-processing stage for rotation invariance redundant and improves on both accuracy and time complexity. We consider fine-grained lepidoptera (moth/butterfly) species recognition as the representative problem since it involves repetition of localized patterns and textures that may be exploited for discrimination. We evaluate the performance of LOOP against its predecessors as well as a few other popular descriptors. Besides experiments on standard benchmarks, we also introduce a new small image dataset on NZ Lepidoptera. LOOP performs as well as or better than previous binary descriptors on all datasets evaluated. The new dataset and demo code of the proposed method are to be made available through the lead author’s academic webpage and GitHub. |
Tasks | |
Published | 2017-10-25 |
URL | http://arxiv.org/abs/1710.09317v4 |
http://arxiv.org/pdf/1710.09317v4.pdf | |
PWC | https://paperswithcode.com/paper/loop-descriptor-local-optimal-oriented |
Repo | |
Framework | |
Separating Reflection and Transmission Images in the Wild
Title | Separating Reflection and Transmission Images in the Wild |
Authors | Patrick Wieschollek, Orazio Gallo, Jinwei Gu, Jan Kautz |
Abstract | The reflections caused by common semi-reflectors, such as glass windows, can impact the performance of computer vision algorithms. State-of-the-art methods can remove reflections on synthetic data and in controlled scenarios. However, they are based on strong assumptions and do not generalize well to real-world images. Contrary to a common misconception, real-world images are challenging even when polarization information is used. We present a deep learning approach to separate the reflected and the transmitted components of the recorded irradiance, which explicitly uses the polarization properties of light. To train it, we introduce an accurate synthetic data generation pipeline, which simulates realistic reflections, including those generated by curved and non-ideal surfaces, non-static scenes, and high-dynamic-range scenes. |
Tasks | Synthetic Data Generation |
Published | 2017-12-06 |
URL | http://arxiv.org/abs/1712.02099v2 |
http://arxiv.org/pdf/1712.02099v2.pdf | |
PWC | https://paperswithcode.com/paper/separating-reflection-and-transmission-images |
Repo | |
Framework | |
Task-specific Word Identification from Short Texts Using a Convolutional Neural Network
Title | Task-specific Word Identification from Short Texts Using a Convolutional Neural Network |
Authors | Shuhan Yuan, Xintao Wu, Yang Xiang |
Abstract | Task-specific word identification aims to choose the task-related words that best describe a short text. Existing approaches require well-defined seed words or lexical dictionaries (e.g., WordNet), which are often unavailable for many applications such as social discrimination detection and fake review detection. However, we often have a set of labeled short texts where each short text has a task-related class label, e.g., discriminatory or non-discriminatory, specified by users or learned by classification algorithms. In this paper, we focus on identifying task-specific words and phrases from short texts by exploiting their class labels rather than using seed words or lexical dictionaries. We consider the task-specific word and phrase identification as feature learning. We train a convolutional neural network over a set of labeled texts and use score vectors to localize the task-specific words and phrases. Experimental results on sentiment word identification show that our approach significantly outperforms existing methods. We further conduct two case studies to show the effectiveness of our approach. One case study on a crawled tweets dataset demonstrates that our approach can successfully capture the discrimination-related words/phrases. The other case study on fake review detection shows that our approach can identify the fake-review words/phrases. |
Tasks | |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00884v1 |
http://arxiv.org/pdf/1706.00884v1.pdf | |
PWC | https://paperswithcode.com/paper/task-specific-word-identification-from-short |
Repo | |
Framework | |
The Complex Negotiation Dialogue Game
Title | The Complex Negotiation Dialogue Game |
Authors | Romain Laroche |
Abstract | This position paper formalises an abstract model for complex negotiation dialogue. This model is to be used for the benchmark of optimisation algorithms ranging from Reinforcement Learning to Stochastic Games, through Transfer Learning, One-Shot Learning or others. |
Tasks | One-Shot Learning, Transfer Learning |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01450v1 |
http://arxiv.org/pdf/1707.01450v1.pdf | |
PWC | https://paperswithcode.com/paper/the-complex-negotiation-dialogue-game |
Repo | |
Framework | |
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
Title | Data-efficient Deep Reinforcement Learning for Dexterous Manipulation |
Authors | Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller |
Abstract | Deep learning and reinforcement learning methods have recently been used to solve a variety of problems in continuous control domains. An obvious application of these techniques is dexterous manipulation tasks in robotics which are difficult to solve using traditional control theory or hand-engineered approaches. One example of such a task is to grasp an object and precisely stack it on another. Solving this difficult and practically relevant problem in the real world is an important long-term goal for the field of robotics. Here we take a step towards this goal by examining the problem in simulation and providing models and techniques aimed at solving it. We introduce two extensions to the Deep Deterministic Policy Gradient algorithm (DDPG), a model-free Q-learning based method, which make it significantly more data-efficient and scalable. Our results show that by making extensive use of off-policy data and replay, it is possible to find control policies that robustly grasp objects and stack them. Further, our results hint that it may soon be feasible to train successful stacking policies by collecting interactions on real robots. |
Tasks | Continuous Control, Q-Learning |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03073v1 |
http://arxiv.org/pdf/1704.03073v1.pdf | |
PWC | https://paperswithcode.com/paper/data-efficient-deep-reinforcement-learning |
Repo | |
Framework | |
Unsupervised Classification of PolSAR Data Using a Scattering Similarity Measure Derived from a Geodesic Distance
Title | Unsupervised Classification of PolSAR Data Using a Scattering Similarity Measure Derived from a Geodesic Distance |
Authors | Debanshu Ratha, Avik Bhattacharya, Alejandro C. Frery |
Abstract | In this letter, we propose a novel technique for obtaining scattering components from Polarimetric Synthetic Aperture Radar (PolSAR) data using the geodesic distance on the unit sphere. This geodesic distance is obtained between an elementary target and the observed Kennaugh matrix, and it is further utilized to compute a similarity measure between scattering mechanisms. The normalized similarity measure for each elementary target is then modulated with the total scattering power (Span). This measure is used to categorize pixels into three categories, i.e., odd-bounce, double-bounce and volume, depending on which of these scattering mechanisms dominates. Then the maximum likelihood classifier of [J.-S. Lee, M. R. Grunes, E. Pottier, and L. Ferro-Famil, Unsupervised terrain classification preserving polarimetric scattering characteristics, IEEE Trans. Geos. Rem. Sens., vol. 42, no. 4, pp. 722–731, April 2004.] based on the complex Wishart distribution is iteratively used for each category. Dominant scattering mechanisms are thus preserved in this classification scheme. We show results for L-band AIRSAR and ALOS-2 datasets acquired over San Francisco and Mumbai, respectively. The scattering mechanisms are better preserved using the proposed methodology than the unsupervised classification results using the Freeman-Durden scattering powers on an orientation angle (OA) corrected PolSAR image. Furthermore, (1) the scattering similarity is a completely non-negative quantity unlike the negative powers that might occur in double-bounce and odd-bounce scattering components under Freeman-Durden decomposition (FDD), and (2) the methodology can be extended to more canonical targets as well as for bistatic scattering. |
Tasks | |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00427v1 |
http://arxiv.org/pdf/1712.00427v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-classification-of-polsar-data |
Repo | |
Framework | |
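
The geodesic distance mentioned in the abstract above can be sketched as the angle between two matrices treated as points on the unit sphere under the Frobenius inner product. This is a hedged illustration, not the paper's code: the `2/pi` scaling to the range [0, 1], the function name `geodesic_distance`, and the Kennaugh matrices used for the trihedral (odd-bounce) and dihedral (double-bounce) targets are assumptions.

```python
import numpy as np

def geodesic_distance(K1, K2):
    """Angle between two real matrices on the unit sphere, scaled to [0, 1]."""
    inner = np.trace(K1.T @ K2)
    norm = np.sqrt(np.trace(K1.T @ K1) * np.trace(K2.T @ K2))
    # Clip to guard against rounding slightly outside [-1, 1].
    cos_angle = np.clip(inner / norm, -1.0, 1.0)
    return (2.0 / np.pi) * np.arccos(cos_angle)

# Assumed Kennaugh matrices for two elementary targets.
K_trihedral = np.diag([1.0, 1.0, 1.0, -1.0])  # odd-bounce
K_dihedral = np.diag([1.0, 1.0, -1.0, 1.0])   # double-bounce

d_same = geodesic_distance(K_trihedral, K_trihedral)
d_cross = geodesic_distance(K_trihedral, K_dihedral)
```

Under these assumptions the two elementary targets are orthogonal in the Frobenius sense, so they sit at the maximum scaled distance from each other, while any matrix is at distance zero from itself. A similarity measure could then be taken as a decreasing function of this distance. The exact form used in the paper is not given in the abstract.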
Optimized Spatial Partitioning via Minimal Swarm Intelligence
Title | Optimized Spatial Partitioning via Minimal Swarm Intelligence |
Authors | Casey Kneale, Dominic Poerio, Karl S. Booksh |
Abstract | Optimized spatial partitioning algorithms are the cornerstone of many successful experimental designs and statistical methods. Of these algorithms, the Centroidal Voronoi Tessellation (CVT) is the most widely utilized. CVT-based methods require global knowledge of spatial boundaries, do not readily allow for weighted regions, have challenging implementations, and are inefficiently extended to high-dimensional spaces. We describe two simple partitioning schemes based on nearest and next-nearest neighbor locations which easily incorporate these features at the slight expense of optimal placement. Several novel qualitative techniques which assess these partitioning schemes are also included. The feasibility of autonomous uninformed sensor networks utilizing these algorithms is considered. Some improvements in particle swarm optimizer results on multimodal test functions from partitioned initial positions in two-space are also illustrated. Pseudocode for all of the novel algorithms depicted herein is available in the supplementary information of this manuscript. |
Tasks | |
Published | 2017-01-19 |
URL | http://arxiv.org/abs/1701.05553v1 |
http://arxiv.org/pdf/1701.05553v1.pdf | |
PWC | https://paperswithcode.com/paper/optimized-spatial-partitioning-via-minimal |
Repo | |
Framework | |
Label Efficient Learning of Transferable Representations across Domains and Tasks
Title | Label Efficient Learning of Transferable Representations across Domains and Tasks |
Authors | Zelun Luo, Yuliang Zou, Judy Hoffman, Li Fei-Fei |
Abstract | We propose a framework that learns a representation transferable across different domains and tasks in a label-efficient manner. Our approach battles domain shift with a domain adversarial loss, and generalizes the embedding to novel tasks using a metric learning-based approach. Our model is simultaneously optimized on labeled source data and unlabeled or sparsely labeled data in the target domain. Our method shows compelling results on novel classes within a new domain even when only a few labeled examples per class are available, outperforming the prevalent fine-tuning approach. In addition, we demonstrate the effectiveness of our framework on the transfer learning task from image object recognition to video action recognition. |
Tasks | Metric Learning, Object Recognition, Temporal Action Localization, Transfer Learning |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1712.00123v1 |
http://arxiv.org/pdf/1712.00123v1.pdf | |
PWC | https://paperswithcode.com/paper/label-efficient-learning-of-transferable-1 |
Repo | |
Framework | |
Wikipedia Vandal Early Detection: from User Behavior to User Embedding
Title | Wikipedia Vandal Early Detection: from User Behavior to User Embedding |
Authors | Shuhan Yuan, Panpan Zheng, Xintao Wu, Yang Xiang |
Abstract | Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, we propose the use of deep learning to detect vandals based on their edit history. In particular, we develop a multi-source long short-term memory network (M-LSTM) to model user behaviors by using a variety of user edit aspects as inputs, including the history of edit reversion information, edit page titles and categories. With M-LSTM, we can encode each user into a low-dimensional real vector, called a user embedding. Meanwhile, as a sequential model, M-LSTM updates the user embedding each time the user commits a new edit. Thus, we can dynamically predict whether a user is benign or a vandal based on the up-to-date user embedding. Furthermore, those user embeddings are crucial for discovering collaborative vandals. |
Tasks | |
Published | 2017-06-03 |
URL | http://arxiv.org/abs/1706.00887v1 |
http://arxiv.org/pdf/1706.00887v1.pdf | |
PWC | https://paperswithcode.com/paper/wikipedia-vandal-early-detection-from-user |
Repo | |
Framework | |
Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text
Title | Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text |
Authors | Ayush Jaiswal, Ekraam Sabir, Wael AbdAlmageed, Premkumar Natarajan |
Abstract | Real world multimedia data is often composed of multiple modalities such as an image or a video with associated text (e.g. captions, user comments, etc.) and metadata. Such multimodal data packages are prone to manipulations, where a subset of these modalities can be altered to misrepresent or repurpose data packages, with possible malicious intent. It is, therefore, important to develop methods to assess or verify the integrity of these multimedia packages. Using computer vision and natural language processing methods to directly compare the image (or video) and the associated caption to verify the integrity of a media package is only possible for a limited set of objects and scenes. In this paper, we present a novel deep learning-based approach for assessing the semantic integrity of multimedia packages containing images and captions, using a reference set of multimedia packages. We construct a joint embedding of images and captions with deep multimodal representation learning on the reference dataset in a framework that also provides image-caption consistency scores (ICCSs). The integrity of query media packages is assessed as the inlierness of the query ICCSs with respect to the reference dataset. We present the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media packages from Flickr, which we make available to the research community. We use both the newly created dataset as well as Flickr30K and MS COCO datasets to quantitatively evaluate our proposed approach. The reference dataset does not contain unmanipulated versions of tampered query packages. Our method is able to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO, respectively, for detecting semantically incoherent media packages. |
Tasks | Representation Learning |
Published | 2017-07-06 |
URL | http://arxiv.org/abs/1707.01606v4 |
http://arxiv.org/pdf/1707.01606v4.pdf | |
PWC | https://paperswithcode.com/paper/multimedia-semantic-integrity-assessment |
Repo | |
Framework | |
Knowledge Engineering for Hybrid Deductive Databases
Title | Knowledge Engineering for Hybrid Deductive Databases |
Authors | Dietmar Seipel |
Abstract | Modern knowledge base systems frequently need to combine a collection of databases in different formats: e.g., relational databases, XML databases, rule bases, ontologies, etc. In the deductive database system DDBASE, we can manage these different formats of knowledge and reason about them. Even the file systems on different computers can be part of the knowledge base. Often, it is necessary to handle different versions of a knowledge base. For example, we might want to find the common parts of, or the differences between, two versions of a relational database. We will examine the use of abstractions of rule bases by predicate dependency and rule predicate graphs. The proof trees of derived atoms can also help to compare different versions of a rule base. Moreover, it might be possible to have derivations joining rules with other formalisms of knowledge representation. Ontologies have shown their benefits in many applications of intelligent systems, and there have been many proposals for rule languages compatible with the semantic web stack, e.g., SWRL, the semantic web rule language. Recently, ontologies have been used in hybrid systems for specifying the provenance of the different components. |
Tasks | |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00622v1 |
http://arxiv.org/pdf/1701.00622v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-engineering-for-hybrid-deductive |
Repo | |
Framework | |
Learning from lions: inferring the utility of agents from their trajectories
Title | Learning from lions: inferring the utility of agents from their trajectories |
Authors | Adam D. Cobb, Andrew Markham, Stephen J. Roberts |
Abstract | We build a model using Gaussian processes to infer a spatio-temporal vector field from observed agent trajectories. Significant landmarks or influence points in agent surroundings are jointly derived through vector calculus operations that indicate the presence of sources and sinks. We evaluate these influence points by using the Kullback-Leibler divergence between the posterior and prior Laplacian of the inferred spatio-temporal vector field. Through locating significant features that influence trajectories, our model aims to give greater insight into underlying causal utility functions that determine agent decision-making. A key feature of our model is that it infers a joint Gaussian process over the observed trajectories, the time-varying vector field of utility and canonical vector calculus operators. We apply our model to both synthetic data and lion GPS data collected at the Bubye Valley Conservancy in southern Zimbabwe. |
Tasks | Decision Making, Gaussian Processes |
Published | 2017-09-07 |
URL | http://arxiv.org/abs/1709.02357v1 |
http://arxiv.org/pdf/1709.02357v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-lions-inferring-the-utility-of |
Repo | |
Framework | |
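
As a point of reference for the Kullback-Leibler divergence mentioned in the abstract above: when the prior and posterior at a single location are both univariate Gaussians, as is the case for a linear functional of a Gaussian process such as the Laplacian, the divergence has a closed form. The symbols below (prior mean and variance $\mu_0, \sigma_0^2$, posterior mean and variance $\mu_1, \sigma_1^2$) are introduced here for illustration, and the paper's exact formulation may differ.

$$
\mathrm{KL}\big(\mathcal{N}(\mu_1, \sigma_1^2) \,\|\, \mathcal{N}(\mu_0, \sigma_0^2)\big) = \ln\frac{\sigma_0}{\sigma_1} + \frac{\sigma_1^2 + (\mu_1 - \mu_0)^2}{2\sigma_0^2} - \frac{1}{2}
$$

The divergence is zero when the posterior equals the prior and grows as the data shift the mean or shrink the variance, which is why it can serve as a score for how strongly a location is flagged by the observations.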