Paper Group ANR 135
Learning to Grasp Without Seeing. Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances. 4D Temporally Coherent Light-field Video. Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans. Story Ending Generation with Incremental Encoding and Commonsense Knowledge. Towards Inference-Oriented Reading Co …
Learning to Grasp Without Seeing
Title | Learning to Grasp Without Seeing |
Authors | Adithyavairavan Murali, Yin Li, Dhiraj Gandhi, Abhinav Gupta |
Abstract | Can a robot grasp an unknown object without seeing it? In this paper, we present a tactile-sensing based approach to this challenging problem of grasping novel objects without prior knowledge of their location or physical properties. Our key idea is to combine touch based object localization with tactile based re-grasping. To train our learning models, we created a large-scale grasping dataset, including more than 30 RGB frames and over 2.8 million tactile samples from 7800 grasp interactions of 52 objects. To learn a representation of tactile signals, we propose an unsupervised auto-encoding scheme, which shows a significant improvement of 4-9% over prior methods on a variety of tactile perception tasks. Our system consists of two steps. First, our touch localization model sequentially ‘touch-scans’ the workspace and uses a particle filter to aggregate beliefs from multiple hits of the target. It outputs an estimate of the object’s location, from which an initial grasp is established. Next, our re-grasping model learns to progressively improve grasps with tactile feedback based on the learned features. This network learns to estimate grasp stability and predict adjustment for the next grasp. Re-grasping thus is performed iteratively until our model identifies a stable grasp. Finally, we demonstrate extensive experimental results on grasping a large set of novel objects using tactile sensing alone. Furthermore, when applied on top of a vision-based policy, our re-grasping model significantly boosts the overall accuracy by 10.6%. We believe this is the first attempt at learning to grasp with only tactile sensing and without any prior object knowledge. |
Tasks | Object Localization |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.04201v1 |
http://arxiv.org/pdf/1805.04201v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-grasp-without-seeing |
Repo | |
Framework | |
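The touch-localization step in the abstract above (sequential touch-scans whose hits are aggregated by a particle filter) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2D unit workspace, the grid scan, the contact radius `RADIUS`, the sensor noise `NOISE`, the jitter, and all function names are assumptions made only for this example.

```python
import math
import random

RADIUS = 0.05  # assumed contact radius of one touch probe (workspace units)
NOISE = 0.1    # assumed probability of a false hit or a missed contact

def likelihood(particle, probe, hit):
    """P(observation | object centre at `particle`) for a single probe."""
    dist = math.hypot(particle[0] - probe[0], particle[1] - probe[1])
    p_hit = 1.0 - NOISE if dist <= RADIUS else NOISE
    return p_hit if hit else 1.0 - p_hit

def update(particles, weights, probe, hit, rng, jitter=0.005):
    """Reweight particles by one touch; resample when weights degenerate."""
    weights = [w * likelihood(p, probe, hit) for p, w in zip(particles, weights)]
    total = sum(weights)
    weights = [w / total for w in weights]
    ess = 1.0 / sum(w * w for w in weights)  # effective sample size
    if ess < len(particles) / 2:
        drawn = rng.choices(particles, weights=weights, k=len(particles))
        # Small Gaussian jitter keeps resampled copies from collapsing.
        particles = [(x + rng.gauss(0, jitter), y + rng.gauss(0, jitter))
                     for x, y in drawn]
        weights = [1.0 / len(particles)] * len(particles)
    return particles, weights

def estimate(particles, weights):
    """Weighted mean of the particle set, used as the location estimate."""
    x = sum(w * p[0] for p, w in zip(particles, weights))
    y = sum(w * p[1] for p, w in zip(particles, weights))
    return x, y

def touch_scan(true_pos, n_particles=4000, step=0.05, seed=0):
    """Probe a grid over the unit square and return the estimated location."""
    rng = random.Random(seed)
    particles = [(rng.random(), rng.random()) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    n = int(round(1 / step))
    for i in range(n + 1):
        for j in range(n + 1):
            probe = (i * step, j * step)
            # Simulated sensor: a hit whenever the probe is within RADIUS.
            hit = math.hypot(probe[0] - true_pos[0],
                             probe[1] - true_pos[1]) <= RADIUS
            particles, weights = update(particles, weights, probe, hit, rng)
    return estimate(particles, weights)
```

Calling `touch_scan((0.32, 0.68))` returns an (x, y) estimate from which an initial grasp could be planned. A real system would replace the simulated hit test with tactile sensor readings.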
Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances
Title | Generative Adversarial Nets for Information Retrieval: Fundamentals and Advances |
Authors | Weinan Zhang |
Abstract | Generative adversarial nets (GANs) have been widely studied during the recent development of deep learning and unsupervised learning. With an adversarial training mechanism, a GAN trains a generative model to fit the underlying unknown real data distribution under the guidance of a discriminative model that estimates whether a data instance is real or generated. Such a framework was originally proposed for fitting continuous data distributions such as images, so it is not straightforward to apply directly to information retrieval scenarios where the data is mostly discrete, such as IDs, text and graphs. In this tutorial, we focus on discussing the GAN techniques and the variants on discrete data fitting in various information retrieval scenarios. (i) We introduce the fundamentals of the GAN framework and its theoretic properties; (ii) we carefully study the promising solutions to extend GANs onto discrete data generation; (iii) we introduce IRGAN, the fundamental GAN framework for fitting single ID data distributions, and its direct application to information retrieval; (iv) we further discuss sequential discrete data generation tasks, e.g., text generation, and the corresponding GAN solutions; (v) we present the most recent work on graph/network data fitting with node embedding techniques by GANs. Meanwhile, we also introduce the relevant open-source platforms such as IRGAN and Texygen to help the audience conduct research experiments on GANs in information retrieval. Finally, we conclude this tutorial with a comprehensive summary and an outlook on further research directions for GANs in information retrieval. |
Tasks | Information Retrieval, Text Generation |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03577v1 |
http://arxiv.org/pdf/1806.03577v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-nets-for-information |
Repo | |
Framework | |
4D Temporally Coherent Light-field Video
Title | 4D Temporally Coherent Light-field Video |
Authors | Armin Mustafa, Marco Volino, Jean-yves Guillemaut, Adrian Hilton |
Abstract | Light-field video has recently been used in virtual and augmented reality applications to increase realism and immersion. However, existing light-field methods are generally limited to static scenes due to the requirement to acquire a dense scene representation. The large amount of data and the absence of methods to infer temporal coherence pose major challenges in storage, compression and editing compared to conventional video. In this paper, we propose the first method to extract a spatio-temporally coherent light-field video representation. A novel method to obtain Epipolar Plane Images (EPIs) from a sparse light-field camera array is proposed. EPIs are used to constrain scene flow estimation to obtain 4D temporally coherent representations of dynamic light-fields. Temporal coherence is achieved on a variety of light-field datasets. Evaluation of the proposed light-field scene flow against existing multi-view dense correspondence approaches demonstrates a significant improvement in accuracy of temporal coherence. |
Tasks | Scene Flow Estimation |
Published | 2018-04-30 |
URL | http://arxiv.org/abs/1804.11276v1 |
http://arxiv.org/pdf/1804.11276v1.pdf | |
PWC | https://paperswithcode.com/paper/4d-temporally-coherent-light-field-video |
Repo | |
Framework | |
Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans
Title | Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans |
Authors | Cheng Lin, Changjian Li, Wenping Wang |
Abstract | We present a novel approach to align partial 3D reconstructions which may not have substantial overlap. Using floorplan priors, our method jointly predicts a room layout and estimates the transformations from a set of partial 3D data. Unlike the existing methods relying on feature descriptors to establish correspondences, we exploit the 3D “box” structure of a typical room layout that meets the Manhattan World property. We first estimate a local layout for each partial scan separately and then combine these local layouts to form a globally aligned layout with loop closure. Without the requirement of feature matching, the proposed method enables some novel applications ranging from large or featureless scene reconstruction to modeling from sparse input. We validate our method quantitatively and qualitatively on real and synthetic scenes of various sizes and complexities. The evaluations and comparisons show superior effectiveness and accuracy of our method. |
Tasks | |
Published | 2018-12-17 |
URL | https://arxiv.org/abs/1812.06677v3 |
https://arxiv.org/pdf/1812.06677v3.pdf | |
PWC | https://paperswithcode.com/paper/floorplan-priors-for-joint-camera-pose-and |
Repo | |
Framework | |
Story Ending Generation with Incremental Encoding and Commonsense Knowledge
Title | Story Ending Generation with Incremental Encoding and Commonsense Knowledge |
Authors | Jian Guan, Yansen Wang, Minlie Huang |
Abstract | Generating a reasonable ending for a given story context, i.e., story ending generation, is a strong indication of story comprehension. This task requires not only understanding the context clues, which play an important role in planning the plot, but also handling implicit knowledge to make a reasonable, coherent story. In this paper, we devise a novel model for story ending generation. The model adopts an incremental encoding scheme to represent context clues that span the story context. In addition, commonsense knowledge is applied through multi-source attention to facilitate story comprehension, and thus to help generate coherent and reasonable endings. Through building context clues and using implicit knowledge, the model is able to produce reasonable story endings. Automatic and manual evaluation shows that our model can generate more reasonable story endings than state-of-the-art baselines. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10113v3 |
http://arxiv.org/pdf/1808.10113v3.pdf | |
PWC | https://paperswithcode.com/paper/story-ending-generation-with-incremental |
Repo | |
Framework | |
Towards Inference-Oriented Reading Comprehension: ParallelQA
Title | Towards Inference-Oriented Reading Comprehension: ParallelQA |
Authors | Soumya Wadhwa, Varsha Embar, Matthias Grabmair, Eric Nyberg |
Abstract | In this paper, we investigate the tendency of end-to-end neural Machine Reading Comprehension (MRC) models to match shallow patterns rather than perform inference-oriented reasoning on RC benchmarks. We aim to test the ability of these systems to answer questions which focus on referential inference. We propose ParallelQA, a strategy to formulate such questions using parallel passages. We also demonstrate that existing neural models fail to generalize well to this setting. |
Tasks | Machine Reading Comprehension, Reading Comprehension |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03830v1 |
http://arxiv.org/pdf/1805.03830v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-inference-oriented-reading |
Repo | |
Framework | |
Exchangeability and Kernel Invariance in Trained MLPs
Title | Exchangeability and Kernel Invariance in Trained MLPs |
Authors | Russell Tsuchida, Fred Roosta, Marcus Gallagher |
Abstract | In the analysis of machine learning models, it is often convenient to assume that the parameters are IID. This assumption is not satisfied when the parameters are updated through training processes such as SGD. A relaxation of the IID condition is a probabilistic symmetry known as exchangeability. We show the sense in which the weights in MLPs are exchangeable. This yields the result that in certain instances, the layer-wise kernel of fully-connected layers remains approximately constant during training. We identify a sharp change in the macroscopic behavior of networks as the covariance between weights changes from zero. |
Tasks | |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08351v2 |
http://arxiv.org/pdf/1810.08351v2.pdf | |
PWC | https://paperswithcode.com/paper/exchangeability-and-kernel-invariance-in |
Repo | |
Framework | |
Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning
Title | Asynchronous stochastic approximations with asymptotically biased errors and deep multi-agent learning |
Authors | Arunselvan Ramaswamy, Shalabh Bhatnagar, Daniel E. Quevedo |
Abstract | Asynchronous stochastic approximations (SAs) are an important class of model-free algorithms, tools and techniques that are popular in multi-agent and distributed control scenarios. To counter Bellman’s curse of dimensionality, such algorithms are coupled with function approximations. Although the learning/control problem becomes more tractable, function approximations affect stability and convergence. In this paper, we present verifiable sufficient conditions for stability and convergence of asynchronous SAs with biased approximation errors. The theory developed herein is used to analyze Policy Gradient methods and noisy Value Iteration schemes. Specifically, we analyze the asynchronous approximate counterparts of the policy gradient (A2PG) and value iteration (A2VI) schemes. It is shown that the stability of these algorithms is unaffected by biased approximation errors, provided they are asymptotically bounded. With respect to convergence (of A2VI and A2PG), a relationship between the limiting set and the approximation errors is established. Finally, experimental results are presented that support the theory. |
Tasks | Multi-agent Reinforcement Learning, Policy Gradient Methods |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07935v2 |
http://arxiv.org/pdf/1802.07935v2.pdf | |
PWC | https://paperswithcode.com/paper/asynchronous-stochastic-approximations-with |
Repo | |
Framework | |
Hunting for Tractable Languages for Judgment Aggregation
Title | Hunting for Tractable Languages for Judgment Aggregation |
Authors | Ronald de Haan |
Abstract | Judgment aggregation is a general framework for collective decision making that can be used to model many different settings. Due to its general nature, the worst case complexity of essentially all relevant problems in this framework is very high. However, these intractability results are mainly due to the fact that the language to represent the aggregation domain is overly expressive. We initiate an investigation of representation languages for judgment aggregation that strike a balance between (1) being limited enough to yield computational tractability results and (2) being expressive enough to model relevant applications. In particular, we consider the languages of Krom formulas, (definite) Horn formulas, and Boolean circuits in decomposable negation normal form (DNNF). We illustrate the use of the positive complexity results that we obtain for these languages with a concrete application: voting on how to spend a budget (i.e., participatory budgeting). |
Tasks | Decision Making |
Published | 2018-08-09 |
URL | http://arxiv.org/abs/1808.03043v1 |
http://arxiv.org/pdf/1808.03043v1.pdf | |
PWC | https://paperswithcode.com/paper/hunting-for-tractable-languages-for-judgment |
Repo | |
Framework | |
Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication
Title | Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication |
Authors | Colin de Vrieze, Shane Barratt, Daniel Tsai, Anant Sahai |
Abstract | Traditional radio systems are strictly co-designed on the lower levels of the OSI stack for compatibility and efficiency. Although this has enabled the success of radio communications, it has also introduced lengthy standardization processes and imposed static allocation of the radio spectrum. Various initiatives have been undertaken by the research community to tackle the problem of artificial spectrum scarcity by both making frequency allocation more dynamic and building flexible radios to replace the static ones. There is reason to believe that just as computer vision and control have been overhauled by the introduction of machine learning, wireless communication can also be improved by utilizing similar techniques to increase the flexibility of wireless networks. In this work, we pose the problem of discovering low-level wireless communication schemes ex-nihilo between two agents in a fully decentralized fashion as a reinforcement learning problem. Our proposed approach uses policy gradients to learn an optimal bi-directional communication scheme and shows surprisingly sophisticated and intelligent learning behavior. We present the results of extensive experiments and an analysis of the fidelity of our approach. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2018-01-14 |
URL | http://arxiv.org/abs/1801.04541v1 |
http://arxiv.org/pdf/1801.04541v1.pdf | |
PWC | https://paperswithcode.com/paper/cooperative-multi-agent-reinforcement |
Repo | |
Framework | |
Correlated Embeddings of Pairs of Dependent Random Variables
Title | Correlated Embeddings of Pairs of Dependent Random Variables |
Authors | Hsiang Hsu, Salman Salamatian, Flavio P. Calmon |
Abstract | Maximally correlated embeddings of pairs of random variables play a central role in several learning problems. For a fixed data distribution, these embeddings belong to the Hilbert space of finite-variance functions regardless of how they are derived. In this paper, we overview an information-theoretic framework for analyzing this function space, and demonstrate how a basis for this space can be approximated from data using deep neural networks. Together with experiments on different datasets, we show that this framework (i) enables classical exploratory statistical techniques such as correspondence analysis to be performed at scale with continuous variables, (ii) can be used to derive new methods for comparing black-box classification models, and (iii) underlies recent multi-view and multi-modal learning methods. |
Tasks | Dimensionality Reduction, Representation Learning |
Published | 2018-06-21 |
URL | https://arxiv.org/abs/1806.08449v2 |
https://arxiv.org/pdf/1806.08449v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-orthogonal-representations-fundamental |
Repo | |
Framework | |
Automatic Plaque Detection in IVOCT Pullbacks Using Convolutional Neural Networks
Title | Automatic Plaque Detection in IVOCT Pullbacks Using Convolutional Neural Networks |
Authors | Nils Gessert, Matthias Lutz, Markus Heyder, Sarah Latus, David M. Leistner, Youssef S. Abdelwahed, Alexander Schlaefer |
Abstract | Coronary heart disease is a common cause of death despite being preventable. To treat the underlying plaque deposits in the arterial walls, intravascular optical coherence tomography can be used by experts to detect and characterize the lesions. In clinical routine, hundreds of images are acquired for each patient which requires automatic plaque detection for fast and accurate decision support. So far, automatic approaches rely on classic machine learning methods and deep learning solutions have rarely been studied. Given the success of deep learning methods with other imaging modalities, a thorough understanding of deep learning-based plaque detection for future clinical decision support systems is required. We address this issue with a new dataset consisting of in-vivo patient images labeled by three trained experts. Using this dataset, we employ state-of-the-art deep learning models that directly learn plaque classification from the images. For improved performance, we study different transfer learning approaches. Furthermore, we investigate the use of cartesian and polar image representations and employ data augmentation techniques tailored to each representation. We fuse both representations in a multi-path architecture for more effective feature exploitation. Last, we address the challenge of plaque differentiation in addition to detection. Overall, we find that our combined model performs best with an accuracy of 91.7%, a sensitivity of 90.9% and a specificity of 92.4%. Our results indicate that building a deep learning-based clinical decision support system for plaque detection is feasible. |
Tasks | Data Augmentation, Transfer Learning |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04187v2 |
http://arxiv.org/pdf/1808.04187v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-plaque-detection-in-ivoct-pullbacks |
Repo | |
Framework | |
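The three figures reported in the abstract above (accuracy, sensitivity, specificity) follow the standard definitions for a binary detector. The sketch below shows how they are computed from confusion-matrix counts. The function name and the example counts are illustrative assumptions, not values from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: plaque images detected / missed.
    tn/fp: plaque-free images correctly passed / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # overall fraction correct
    return accuracy, sensitivity, specificity

# Made-up counts for illustration only.
acc, sens, spec = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these made-up counts the detector finds 90 of 100 plaque images and correctly passes 95 of 100 plaque-free images.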
Identifying Real Estate Opportunities using Machine Learning
Title | Identifying Real Estate Opportunities using Machine Learning |
Authors | Alejandro Baldominos, Iván Blanco, Antonio José Moreno, Rubén Iturrarte, Óscar Bernárdez, Carlos Afonso |
Abstract | The real estate market is exposed to many fluctuations in prices because of existing correlations with many variables, some of which cannot be controlled or might even be unknown. Housing prices can increase rapidly (or in some cases, also drop very fast), yet the numerous listings available online where houses are sold or rented are not likely to be updated that often. In some cases, individuals interested in selling a house (or apartment) might include it in some online listing, and forget about updating the price. In other cases, some individuals might be interested in deliberately setting a price below the market price in order to sell the home faster, for various reasons. In this paper, we aim at developing a machine learning application that identifies opportunities in the real estate market in real time, i.e., houses that are listed with a price substantially below the market price. This program can be useful for investors interested in the housing market. We have focused on a use case considering real estate assets located in the Salamanca district in Madrid (Spain) and listed in the most relevant Spanish online site for home sales and rentals. The application is formally implemented as a regression problem that tries to estimate the market price of a house given features retrieved from public online listings. For building this application, we have performed a feature engineering stage in order to discover relevant features that allow for attaining a high predictive performance. Several machine learning algorithms have been tested, including regression trees, k-nearest neighbors, support vector machines and neural networks, identifying advantages and handicaps of each of them. |
Tasks | Feature Engineering |
Published | 2018-09-13 |
URL | http://arxiv.org/abs/1809.04933v2 |
http://arxiv.org/pdf/1809.04933v2.pdf | |
PWC | https://paperswithcode.com/paper/identifying-real-estate-opportunities-using |
Repo | |
Framework | |
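The idea in the abstract above, estimating a market price by regression and flagging listings priced substantially below it, can be sketched with a k-nearest-neighbors regressor, one of the model families the abstract names. This is a toy illustration only: the single `area_m2` feature, the 20% discount threshold, the field names, and the listings themselves are assumptions, not the authors' features or data.

```python
def knn_price(listing, others, k=3):
    """Estimate a market price as the mean price of the k listings
    closest in floor area (a hypothetical, single-feature distance)."""
    nearest = sorted(others,
                     key=lambda o: abs(o["area_m2"] - listing["area_m2"]))[:k]
    return sum(o["price"] for o in nearest) / len(nearest)

def find_opportunities(listings, discount=0.2, k=3):
    """Return ids of listings priced more than `discount` below the
    leave-one-out k-NN estimate of their market price."""
    flagged = []
    for i, listing in enumerate(listings):
        others = listings[:i] + listings[i + 1:]  # exclude the listing itself
        market = knn_price(listing, others, k)
        if listing["price"] < (1 - discount) * market:
            flagged.append(listing["id"])
    return flagged

# Made-up listings; "D" is priced well below comparable flats.
listings = [
    {"id": "A", "area_m2": 80, "price": 400_000},
    {"id": "B", "area_m2": 82, "price": 410_000},
    {"id": "C", "area_m2": 78, "price": 395_000},
    {"id": "D", "area_m2": 81, "price": 280_000},
    {"id": "E", "area_m2": 120, "price": 650_000},
    {"id": "F", "area_m2": 118, "price": 640_000},
    {"id": "G", "area_m2": 122, "price": 660_000},
]
opportunities = find_opportunities(listings)  # ["D"]
```

A realistic version would use many engineered features and a held-out validation set. The leave-one-out estimate here only keeps a listing from voting on its own price.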
Train Feedfoward Neural Network with Layer-wise Adaptive Rate via Approximating Back-matching Propagation
Title | Train Feedfoward Neural Network with Layer-wise Adaptive Rate via Approximating Back-matching Propagation |
Authors | Huishuai Zhang, Wei Chen, Tie-Yan Liu |
Abstract | Stochastic gradient descent (SGD) has achieved great success in training deep neural networks, where the gradient is computed through back-propagation. However, the back-propagated values of different layers vary dramatically. This inconsistency of gradient magnitude across layers makes optimizing a deep neural network with a single learning rate problematic. We introduce back-matching propagation, which computes the backward values on the layer’s parameters and input by matching backward values on the layer’s output. This leads to solving a set of least-squares problems, which incurs a high computational cost. We then reduce back-matching propagation with approximations and propose an algorithm that turns out to be regular SGD with a layer-wise adaptive learning rate strategy. This allows an easy implementation of our algorithm in current machine learning frameworks equipped with auto-differentiation. We apply our algorithm to training modern deep neural networks and achieve favorable results over SGD. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09750v1 |
http://arxiv.org/pdf/1802.09750v1.pdf | |
PWC | https://paperswithcode.com/paper/train-feedfoward-neural-network-with-layer |
Repo | |
Framework | |
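To make "SGD with a layer-wise adaptive learning rate" concrete, here is a minimal sketch of one generic way to set a per-layer rate: scale the base rate by the ratio of weight norm to gradient norm, so every layer moves by the same relative amount. This is a swapped-in, LARS-style heuristic chosen for illustration. The abstract does not give the paper's actual scaling rule, and the function names and values below are assumptions.

```python
import math

def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))

def layerwise_sgd_step(layers, grads, base_lr=0.1, eps=1e-12):
    """One SGD step with a separate learning rate per layer.

    Each layer's rate is base_lr * ||w|| / ||g|| (an assumed heuristic),
    so the update norm is base_lr * ||w|| whatever the gradient scale.
    """
    new_layers = []
    for weights, grad in zip(layers, grads):
        lr = base_lr * l2_norm(weights) / (l2_norm(grad) + eps)
        new_layers.append([w - lr * g for w, g in zip(weights, grad)])
    return new_layers

# Two layers whose gradient magnitudes differ by four orders of magnitude.
layers = [[3.0, 4.0], [0.3, 0.4]]
grads = [[0.0, 1e-3], [10.0, 0.0]]
updated = layerwise_sgd_step(layers, grads)
```

With a single global rate, the second layer's gradient would dominate the update. With the per-layer scaling, both layers move by 10% of their own weight norm.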
Plaque Classification in Coronary Arteries from IVOCT Images Using Convolutional Neural Networks and Transfer Learning
Title | Plaque Classification in Coronary Arteries from IVOCT Images Using Convolutional Neural Networks and Transfer Learning |
Authors | Nils Gessert, Markus Heyder, Sarah Latus, Matthias Lutz, Alexander Schlaefer |
Abstract | Advanced atherosclerosis in the coronary arteries is one of the leading causes of death worldwide while being preventable and treatable. In order to image atherosclerotic lesions (plaque), intravascular optical coherence tomography (IVOCT) can be used. The technique provides high-resolution images of arterial walls, which allows for early plaque detection by experts. Due to the vast amount of IVOCT images acquired in clinical routines, automatic plaque detection has been addressed. For example, attenuation profiles in single A-Scans of IVOCT images are examined to detect plaque. We address automatic plaque classification from entire IVOCT images, the cross-sectional view of the artery, using deep feature learning. In this way, we take context between A-Scans into account and we directly learn relevant features from the image source without the need for handcrafting features. |
Tasks | Transfer Learning |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03904v1 |
http://arxiv.org/pdf/1804.03904v1.pdf | |
PWC | https://paperswithcode.com/paper/plaque-classification-in-coronary-arteries |
Repo | |
Framework | |