January 29, 2020

Paper Group ANR 593

Deep Dynamics Models for Learning Dexterous Manipulation


Title	Deep Dynamics Models for Learning Dexterous Manipulation
Authors	Anusha Nagabandi, Kurt Konolige, Sergey Levine, Vikash Kumar
Abstract	Dexterous multi-fingered hands can provide robots with the ability to flexibly perform a wide range of manipulation skills. However, many of the more complex behaviors are also notoriously difficult to control: Performing in-hand object manipulation, executing finger gaits to move objects, and exhibiting precise fine motor skills such as writing, all require finely balancing contact forces, breaking and reestablishing contacts repeatedly, and maintaining control of unactuated objects. Learning-based techniques provide the appealing possibility of acquiring these skills directly from data, but current learning approaches either require large amounts of data and produce task-specific policies, or they have not yet been shown to scale up to more complex and realistic tasks requiring fine motor skills. In this work, we demonstrate that our method of online planning with deep dynamics models (PDDM) addresses both of these limitations; we show that improvements in learned dynamics models, together with improvements in online model-predictive control, can indeed enable efficient and effective learning of flexible contact-rich dexterous manipulation skills – and that too, on a 24-DoF anthropomorphic hand in the real world, using just 4 hours of purely real-world data to learn to simultaneously coordinate multiple free-floating objects. Videos can be found at https://sites.google.com/view/pddm/
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11652v1
PDF	https://arxiv.org/pdf/1909.11652v1.pdf
PWC	https://paperswithcode.com/paper/deep-dynamics-models-for-learning-dexterous
Repo
Framework
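
The abstract above pairs a learned dynamics model with online model-predictive control. A minimal sketch of that control loop follows, using plain random-shooting MPC rather than the paper's own planner. `toy_dynamics` is a hypothetical stand-in for a learned model, and the reward, horizon, and candidate count are illustrative assumptions, not values from the paper.

```python
import random

def toy_dynamics(state, action):
    # Hypothetical stand-in for a learned model f(s, a) -> s'.
    return state + action

def reward(state, goal):
    # Illustrative reward: negative distance to the goal.
    return -abs(goal - state)

def plan_action(state, goal, dynamics, rng, horizon=5, n_candidates=200):
    """Random-shooting MPC: sample action sequences, roll each one out
    through the model, and return the first action of the best sequence."""
    best_return, best_first = float("-inf"), 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s = dynamics(s, a)
            total += reward(s, goal)
        if total > best_return:
            best_return, best_first = total, actions[0]
    return best_first

def run_episode(start, goal, steps=20, seed=0):
    # Replan at every step and execute only the first planned action.
    rng = random.Random(seed)
    state = start
    for _ in range(steps):
        action = plan_action(state, goal, toy_dynamics, rng)
        state = toy_dynamics(state, action)
    return state
```

Replanning at every step is what makes the controller "online": errors in the model only affect a short horizon before the plan is recomputed from the newly observed state.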

Survey of Computer Vision and Machine Learning in Gastrointestinal Endoscopy


Title	Survey of Computer Vision and Machine Learning in Gastrointestinal Endoscopy
Authors	Anant S. Vemuri
Abstract	This paper attempts to provide the reader a place to begin studying the application of computer vision and machine learning to gastrointestinal (GI) endoscopy. They have been classified into 18 categories. It should be noted by the reader that this is a review from pre-deep learning era. A lot of deep learning based applications have not been covered in this thesis.
Tasks
Published	2019-04-26
URL	http://arxiv.org/abs/1904.13307v1
PDF	http://arxiv.org/pdf/1904.13307v1.pdf
PWC	https://paperswithcode.com/paper/survey-of-computer-vision-and-machine
Repo
Framework

A Fast Template-based Approach to Automatically Identify Primary Text Content of a Web Page


Title	A Fast Template-based Approach to Automatically Identify Primary Text Content of a Web Page
Authors	Dat Quoc Nguyen, Dai Quoc Nguyen, Son Bao Pham, The Duy Bui
Abstract	Search engines have become an indispensable tool for browsing information on the Internet. The user, however, is often annoyed by redundant results from irrelevant Web pages. One reason is because search engines also look at non-informative blocks of Web pages such as advertisement, navigation links, etc. In this paper, we propose a fast algorithm called FastContentExtractor to automatically detect main content blocks in a Web page by improving the ContentExtractor algorithm. By automatically identifying and storing templates representing the structure of content blocks in a website, content blocks of a new Web page from the Website can be extracted quickly. The hierarchical order of the output blocks is also maintained which guarantees that the extracted content blocks are in the same order as the original ones.
Tasks
Published	2019-11-26
URL	https://arxiv.org/abs/1911.11473v1
PDF	https://arxiv.org/pdf/1911.11473v1.pdf
PWC	https://paperswithcode.com/paper/a-fast-template-based-approach-to
Repo
Framework

Identifying short-term interests from mobile app adoption pattern


Title	Identifying short-term interests from mobile app adoption pattern
Authors	Bharat Gaind, Nitish Varshney, Shubham Goel, Akash Mondal
Abstract	With the increase in an average user’s dependence on their mobile devices, the reliance on collecting their browsing history from mobile browsers has also increased. This browsing history is highly utilized in the advertising industry for providing targeted ads in the purview of inferring their short-term interests and pushing relevant ads. However, the major limitation of such an extraction from mobile browsers is that they reset when the browser is closed or when the device is shut down/restarted, thus rendering existing methods of identifying the short-term interests of mobile device users ineffective. In this paper, we propose an alternative method to identify such short-term interests by analysing their mobile app adoption (installation/uninstallation) patterns over a period of time. Such a method can be highly effective in pinpointing the user’s ephemeral inclinations like buying/renting an apartment, buying/selling a car or a sudden increased interest in shopping (possibly due to a recently received salary bonus). Subsequently, these derived interests are also used for targeted experiments. Our experiments result in up to 93.68% higher click-through rate in comparison to the ads shown without any user-interest knowledge. Also, up to 51% higher revenue in the long term is expected as a result of the application of our proposed algorithm.
Tasks
Published	2019-04-25
URL	http://arxiv.org/abs/1904.11388v1
PDF	http://arxiv.org/pdf/1904.11388v1.pdf
PWC	https://paperswithcode.com/paper/identifying-short-term-interests-from-mobile
Repo
Framework

Query Expansion for Patent Searching using Word Embedding and Professional Crowdsourcing


Title	Query Expansion for Patent Searching using Word Embedding and Professional Crowdsourcing
Authors	Arthi Krishna, Ye Jin, Christine Foster, Greg Gabel, Britt Hanley, Abdou Youssef
Abstract	The patent examination process includes a search of previous work to verify that a patent application describes a novel invention. Patent examiners primarily use keyword-based searches to uncover prior art. A critical part of keyword searching is query expansion, which is the process of including alternate terms such as synonyms and other related words, since the same concepts are often described differently in the literature. Patent terminology is often domain specific. By curating technology-specific corpora and training word embedding models based on these corpora, we are able to automatically identify the most relevant expansions of a given word or phrase. We compare the performance of several automated query expansion techniques against expert specified expansions. Furthermore, we explore a novel mechanism to extract related terms not just based on one input term but several terms in conjunction by computing their centroid and identifying the nearest neighbors to this centroid. Highly skilled patent examiners are often the best and most reliable source of identifying related terms. By designing a user interface that allows examiners to interact with the word embedding suggestions, we are able to use these interactions to power crowdsourced modes of related terms. Learning from users allows us to overcome several challenges such as identifying words that are bleeding edge and have not been published in the corpus yet. This paper studies the effectiveness of word embedding and crowdsourced models across 11 disparate technical areas.
Tasks
Published	2019-11-14
URL	https://arxiv.org/abs/1911.11069v1
PDF	https://arxiv.org/pdf/1911.11069v1.pdf
PWC	https://paperswithcode.com/paper/191111069
Repo
Framework

Uncovering Flaming Events on News Media in Social Media


Title	Uncovering Flaming Events on News Media in Social Media
Authors	Praboda Rajapaksha, Reza Farahbakhsh, Noel Crespi, Bruno Defude
Abstract	Social networking sites (SNSs) facilitate the sharing of ideas and information through different types of feedback including publishing posts, leaving comments and other types of reactions. However, some comments or feedback on SNSs are inconsiderate and offensive, and sometimes this type of feedback has a very negative effect on a target user. The phenomenon known as flaming goes hand-in-hand with this type of posting that can trigger almost instantly on SNSs. Most popular users such as celebrities, politicians and news media are the major victims of the flaming behaviors and so detecting these types of events will be useful and appreciated. Flaming events can be monitored and identified by analyzing negative comments received on a post. Thus, our main objective of this study is to identify a way to detect flaming events in SNS using a sentiment prediction method. We use a deep Neural Network (NN) model that can identify sentiments of variable length sentences and classifies the sentiment of SNSs content (both comments and posts) to discover flaming events. Our deep NN model uses Word2Vec and FastText word embedding methods as its training to explore which method is the most appropriate. The labeled dataset for training the deep NN is generated using an enhanced lexicon based approach. Our deep NN model classifies the sentiment of a sentence into five classes: Very Positive, Positive, Neutral, Negative and Very Negative. To detect flaming incidents, we focus only on the comments classified into the Negative and Very Negative classes. As a use-case, we try to explore the flaming phenomena in the news media domain and therefore we focused on news items posted by three popular news media on Facebook (BBCNews, CNN and FoxNews) to train and test the model.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07181v1
PDF	https://arxiv.org/pdf/1909.07181v1.pdf
PWC	https://paperswithcode.com/paper/uncovering-flaming-events-on-news-media-in
Repo
Framework
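
The abstract above detects flaming by looking only at comments classified as Negative or Very Negative. A minimal sketch of that final step follows, assuming the per-comment sentiment labels have already been produced by some classifier. The decision rule, the `threshold` value, and the `min_comments` cutoff are illustrative assumptions; the abstract does not state how negative comments are aggregated into a flaming decision.

```python
# The five sentiment classes named in the abstract; only two count as negative.
NEGATIVE_CLASSES = {"Negative", "Very Negative"}

def negative_share(labels):
    """Fraction of a post's comments that fall into the negative classes."""
    if not labels:
        return 0.0
    negatives = sum(1 for label in labels if label in NEGATIVE_CLASSES)
    return negatives / len(labels)

def is_flaming(labels, threshold=0.5, min_comments=5):
    # Hypothetical rule: enough comments, and at least `threshold` of them negative.
    return len(labels) >= min_comments and negative_share(labels) >= threshold
```

For example, a post whose six comments are labelled Negative, Very Negative, Neutral, Negative, Positive, and Very Negative has a negative share of 4/6 and would be flagged under these assumed defaults.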

Visual Tracking via Dynamic Memory Networks


Title	Visual Tracking via Dynamic Memory Networks
Authors	Tianyu Yang, Antoni B. Chan
Abstract	Template-matching methods for visual tracking have gained popularity recently due to their good performance and fast speed. However, they lack effective ways to adapt to changes in the target object’s appearance, making their tracking accuracy still far from state-of-the-art. In this paper, we propose a dynamic memory network to adapt the template to the target’s appearance variations during tracking. The reading and writing process of the external memory is controlled by an LSTM network with the search feature map as input. A spatial attention mechanism is applied to concentrate the LSTM input on the potential target as the location of the target is at first unknown. To prevent aggressive model adaptivity, we apply gated residual template learning to control the amount of retrieved memory that is used to combine with the initial template. In order to alleviate the drift problem, we also design a “negative” memory unit that stores templates for distractors, which are used to cancel out wrong responses from the object template. To further boost the tracking performance, an auxiliary classification loss is added after the feature extractor part. Unlike tracking-by-detection methods where the object’s information is maintained by the weight parameters of neural networks, which requires expensive online fine-tuning to be adaptable, our tracker runs completely feed-forward and adapts to the target’s appearance changes by updating the external memory. Moreover, the capacity of our model is not determined by the network size as with other trackers — the capacity can be easily enlarged as the memory requirements of a task increase, which is favorable for memorizing long-term object information. Extensive experiments on the OTB and VOT datasets demonstrate that our trackers perform favorably against state-of-the-art tracking methods while retaining real-time speed.
Tasks	Visual Tracking
Published	2019-07-12
URL	https://arxiv.org/abs/1907.07613v3
PDF	https://arxiv.org/pdf/1907.07613v3.pdf
PWC	https://paperswithcode.com/paper/visual-tracking-via-dynamic-memory-networks
Repo
Framework
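
The abstract above describes gated residual template learning as a way to control how much retrieved memory is combined with the initial template. One plausible reading, sketched below, is an element-wise update in which the initial template is kept intact and a gate in [0, 1] scales the retrieved memory before it is added. The function name and the flat-list representation are assumptions for illustration; the exact formulation is in the paper, not the abstract.

```python
def gated_residual_template(initial, retrieved, gate):
    """Combine templates as initial + gate * retrieved, element by element.

    A gate value of 0 ignores the retrieved memory for that element;
    a gate value of 1 adds it in full.
    """
    if not (len(initial) == len(retrieved) == len(gate)):
        raise ValueError("initial, retrieved and gate must have the same length")
    if any(g < 0.0 or g > 1.0 for g in gate):
        raise ValueError("gate values must lie in [0, 1]")
    return [t0 + g * tr for t0, tr, g in zip(initial, retrieved, gate)]
```

Because the initial template always passes through unchanged, a closed gate falls back to plain template matching, which is one way to limit overly aggressive adaptation.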

Explainable Product Search with a Dynamic Relation Embedding Model


Title	Explainable Product Search with a Dynamic Relation Embedding Model
Authors	Qingyao Ai, Yongfeng Zhang, Keping Bi, W. Bruce Croft
Abstract	Product search is one of the most popular methods for customers to discover products online. Most existing studies on product search focus on developing effective retrieval models that rank items by their likelihood to be purchased. They, however, ignore the problem that there is a gap between how systems and customers perceive the relevance of items. Without explanations, users may not understand why product search engines retrieve certain items for them, which consequentially leads to imperfect user experience and suboptimal system performance in practice. In this work, we tackle this problem by constructing explainable retrieval models for product search. Specifically, we propose to model the “search and purchase” behavior as a dynamic relation between users and items, and create a dynamic knowledge graph based on both the multi-relational product data and the context of the search session. Ranking is conducted based on the relationship between users and items in the latent space, and explanations are generated with logic inferences and entity soft matching on the knowledge graph. Empirical experiments show that our model, which we refer to as the Dynamic Relation Embedding Model (DREM), significantly outperforms the state-of-the-art baselines and has the ability to produce reasonable explanations for search results.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.07212v1
PDF	https://arxiv.org/pdf/1909.07212v1.pdf
PWC	https://paperswithcode.com/paper/explainable-product-search-with-a-dynamic
Repo
Framework

Deep 3D Pan via adaptive “t-shaped” convolutions with global and local adaptive dilations


Title	Deep 3D Pan via adaptive “t-shaped” convolutions with global and local adaptive dilations
Authors	Juan Luis Gonzalez Bello, Munchurl Kim
Abstract	Recent advances in deep learning have shown promising results in many low-level vision tasks. However, solving the single-image-based view synthesis is still an open problem. In particular, the generation of new images at parallel camera views given a single input image is of great interest, as it enables 3D visualization of the 2D input scenery. We propose a novel network architecture to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, or Deep 3D Pan, with “t-shaped” adaptive kernels equipped with globally and locally adaptive dilations. Our proposed network architecture, the monster-net, is devised with a novel “t-shaped” adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift into and handle local 3D geometries of the target image’s pixels for the synthesis of naturally looking 3D panned views when a 2-D input image is given. Extensive experiments were performed on the KITTI, CityScapes and our VICLAB_STEREO indoors dataset to prove the efficacy of our method. Our monster-net significantly outperforms the state-of-the-art method, SOTA, by a large margin in all metrics of RMSE, PSNR, and SSIM. Our proposed monster-net is capable of reconstructing more reliable image structures in synthesized images with coherent geometry. Moreover, the disparity information that can be extracted from the “t-shaped” kernel is much more reliable than that of the SOTA for the unsupervised monocular depth estimation task, confirming the effectiveness of our method.
Tasks	Depth Estimation, Monocular Depth Estimation
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01089v3
PDF	https://arxiv.org/pdf/1910.01089v3.pdf
PWC	https://paperswithcode.com/paper/deep-3d-pan-via-adaptive-t-shaped
Repo
Framework

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks


Title	Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Authors	Yu Bai, Jason D. Lee
Abstract	Recent theoretical work has established connections between over-parametrized neural networks and linearized models governed by the Neural Tangent Kernels (NTKs). NTK theory leads to concrete convergence and generalization results, yet the empirical performance of neural networks is observed to exceed their linearized models, suggesting insufficiency of this theory. Towards closing this gap, we investigate the training of over-parametrized neural networks that are beyond the NTK regime yet still governed by the Taylor expansion of the network. We bring forward the idea of randomizing the neural networks, which allows them to escape their NTK and couple with quadratic models. We show that the optimization landscape of randomized two-layer networks is nice and amenable to escaping-saddle algorithms. We prove concrete generalization and expressivity results on these randomized networks, which lead to sample complexity bounds (of learning certain simple functions) that match the NTK and can in addition be better by a dimension factor when mild distributional assumptions are present. We demonstrate that our randomization technique can be generalized systematically beyond the quadratic case, by using it to find networks that are coupled with higher-order terms in their Taylor series.
Tasks
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01619v2
PDF	https://arxiv.org/pdf/1910.01619v2.pdf
PWC	https://paperswithcode.com/paper/beyond-linearization-on-quadratic-and-higher
Repo
Framework

Universal Adversarial Perturbation for Text Classification


Title	Universal Adversarial Perturbation for Text Classification
Authors	Hang Gao, Tim Oates
Abstract	Given a state-of-the-art deep neural network text classifier, we show the existence of a universal and very small perturbation vector (in the embedding space) that causes natural text to be misclassified with high probability. Unlike images on which a single fixed-size adversarial perturbation can be found, text is of variable length, so we define the “universality” as “token-agnostic”, where a single perturbation is applied to each token, resulting in different perturbations of flexible sizes at the sequence level. We propose an algorithm to compute universal adversarial perturbations, and show that the state-of-the-art deep neural networks are highly vulnerable to them, even though they keep the neighborhood of tokens mostly preserved. We also show how to use these adversarial perturbations to generate adversarial text samples. The surprising existence of universal “token-agnostic” adversarial perturbations may reveal important properties of a text classifier.
Tasks	Adversarial Text, Text Classification
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04618v1
PDF	https://arxiv.org/pdf/1910.04618v1.pdf
PWC	https://paperswithcode.com/paper/universal-adversarial-perturbation-for-text
Repo
Framework
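
The abstract above defines universality as "token-agnostic": one perturbation vector is applied to every token embedding, so the sequence-level perturbation grows with the length of the text. A minimal sketch of applying such a vector is shown below; it does not show how the perturbation is computed. The function name and the list-of-lists embedding layout are assumptions for illustration.

```python
def apply_universal_perturbation(token_embeddings, perturbation):
    """Add the same perturbation vector to every token embedding.

    token_embeddings: list of per-token vectors (any sequence length).
    perturbation: a single vector with the embedding dimension.
    """
    dim = len(perturbation)
    perturbed = []
    for token in token_embeddings:
        if len(token) != dim:
            raise ValueError("embedding and perturbation dimensions differ")
        perturbed.append([x + d for x, d in zip(token, perturbation)])
    return perturbed
```

A one-token input and a three-token input both receive the same per-token shift, which is why a single fixed vector can be used on text of any length.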

Exploring Bias in GAN-based Data Augmentation for Small Samples


Title	Exploring Bias in GAN-based Data Augmentation for Small Samples
Authors	Mengxiao Hu, Jinlong Li
Abstract	For a machine learning task, a lack of sufficient samples means the trained model has low confidence in approaching the ground-truth function. Since generative adversarial networks (GANs) were proposed, there has been hope for small-sample data augmentation (DA) with realistic fake data, and many works have validated the viability of GAN-based DA. Although most of these works report that higher accuracy can be achieved using GAN-based DA, some researchers have stressed that the fake data generated by a GAN has inherent bias. In this paper, we explore when the bias is low enough that it does not hurt performance, run experiments to characterize the bias in different GAN-based DA settings, and, from the results, design a pipeline to inspect whether a specific dataset is efficiently augmentable with GAN-based DA. Finally, based on our attempts to reduce the bias, we offer some advice for mitigating bias in GAN-based DA applications.
Tasks	Data Augmentation
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08495v1
PDF	https://arxiv.org/pdf/1905.08495v1.pdf
PWC	https://paperswithcode.com/paper/exploring-bias-in-gan-based-data-augmentation
Repo
Framework

Estimate Sequences for Variance-Reduced Stochastic Composite Optimization


Title	Estimate Sequences for Variance-Reduced Stochastic Composite Optimization
Authors	Andrei Kulunchakov, Julien Mairal
Abstract	In this paper, we propose a unified view of gradient-based algorithms for stochastic convex composite optimization by extending the concept of estimate sequence introduced by Nesterov. This point of view covers the stochastic gradient descent method, variants of the approaches SAGA, SVRG, and has several advantages: (i) we provide a generic proof of convergence for the aforementioned methods; (ii) we show that this SVRG variant is adaptive to strong convexity; (iii) we naturally obtain new algorithms with the same guarantees; (iv) we derive generic strategies to make these algorithms robust to stochastic noise, which is useful when data is corrupted by small random perturbations. Finally, we show that this viewpoint is useful to obtain new accelerated algorithms in the sense of Nesterov.
Tasks
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02374v1
PDF	https://arxiv.org/pdf/1905.02374v1.pdf
PWC	https://paperswithcode.com/paper/estimate-sequences-for-variance-reduced
Repo
Framework
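
Among the methods the abstract above covers is SVRG. A minimal sketch of the standard SVRG gradient estimator follows; this is the textbook version, not the adaptive or accelerated variants the paper derives. The one-dimensional least-squares objective, the step size, and the inner-loop length of 2n are illustrative assumptions.

```python
import random

def grad_i(w, x, y):
    # Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to w.
    return x * (w * x - y)

def svrg(xs, ys, step=0.05, epochs=30, seed=0):
    """Minimise the average of the per-sample losses with plain SVRG."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    for _ in range(epochs):
        # Snapshot the iterate and compute the full gradient there once per epoch.
        w_snap = w
        full_grad = sum(grad_i(w_snap, x, y) for x, y in zip(xs, ys)) / n
        for _ in range(2 * n):
            i = rng.randrange(n)
            # Variance-reduced estimate: stochastic gradient at w, corrected by
            # the same sample's gradient at the snapshot plus the full gradient.
            g = grad_i(w, xs[i], ys[i]) - grad_i(w_snap, xs[i], ys[i]) + full_grad
            w -= step * g
    return w
```

The correction term has zero mean, so the estimate stays unbiased while its variance shrinks as the iterate and the snapshot both approach the minimiser.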

Unsupervised Text Generation from Structured Data


Title	Unsupervised Text Generation from Structured Data
Authors	Martin Schmitt, Sahand Sharifzadeh, Volker Tresp, Hinrich Schütze
Abstract	This work presents a joint solution to two challenging tasks: text generation from data and open information extraction. We propose to model both tasks as sequence-to-sequence translation problems and thus construct a joint neural model for both. Our experiments on knowledge graphs from Visual Genome, i.e., structured image analyses, show promising results compared to strong baselines. Building on recent work on unsupervised machine translation, we report the first results - to the best of our knowledge - on fully unsupervised text generation from structured data.
Tasks	Knowledge Graphs, Machine Translation, Open Information Extraction, Text Generation, Unsupervised Machine Translation
Published	2019-04-20
URL	https://arxiv.org/abs/1904.09447v2
PDF	https://arxiv.org/pdf/1904.09447v2.pdf
PWC	https://paperswithcode.com/paper/unsupervised-text-generation-from-structured
Repo
Framework

Bilingual-GAN: A Step Towards Parallel Text Generation


Title	Bilingual-GAN: A Step Towards Parallel Text Generation
Authors	Ahmad Rashid, Alan Do-Omri, Md. Akmal Haidar, Qun Liu, Mehdi Rezagholizadeh
Abstract	Latent space based GAN methods and attention based sequence to sequence models have achieved impressive results in text generation and unsupervised machine translation respectively. Leveraging the two domains, we propose an adversarial latent space based model capable of generating parallel sentences in two languages concurrently and translating bidirectionally. The bilingual generation goal is achieved by sampling from the latent space that is shared between both languages. First, two denoising autoencoders are trained, with shared encoders and back-translation to enforce a shared latent state between the two languages. The decoder is shared for the two translation directions. Next, a GAN is trained to generate synthetic “code” mimicking the languages’ shared latent space. This code is then fed into the decoder to generate text in either language. We perform our experiments on Europarl and Multi30k datasets, on the English-French language pair, and document our performance using both supervised and unsupervised machine translation.
Tasks	Denoising, Machine Translation, Text Generation, Unsupervised Machine Translation
Published	2019-04-09
URL	https://arxiv.org/abs/1904.04742v2
PDF	https://arxiv.org/pdf/1904.04742v2.pdf
PWC	https://paperswithcode.com/paper/bilingual-gan-a-step-towards-parallel-text
Repo
Framework