July 28, 2019

Paper Group ANR 350

ADMM-Net: A Deep Learning Approach for Compressive Sensing MRI. Induction of Decision Trees based on Generalized Graph Queries. Texture Synthesis with Recurrent Variational Auto-Encoder. LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions. Visuospatial Skill Learning for Robots. Interpretable Counting for Visual Question A …

ADMM-Net: A Deep Learning Approach for Compressive Sensing MRI


Title	ADMM-Net: A Deep Learning Approach for Compressive Sensing MRI
Authors	Yan Yang, Jian Sun, Huibin Li, Zongben Xu
Abstract	Compressive sensing (CS) is an effective approach for fast Magnetic Resonance Imaging (MRI). It aims at reconstructing MR images from a small number of under-sampled data in k-space, and accelerating the data acquisition in MRI. To improve the current MRI system in reconstruction accuracy and speed, in this paper, we propose two novel deep architectures, dubbed ADMM-Nets in basic and generalized versions. ADMM-Nets are defined over data flow graphs, which are derived from the iterative procedures in Alternating Direction Method of Multipliers (ADMM) algorithm for optimizing a general CS-based MRI model. They take the sampled k-space data as inputs and output reconstructed MR images. Moreover, we extend our network to cope with complex-valued MR images. In the training phase, all parameters of the nets, e.g., transforms, shrinkage functions, etc., are discriminatively trained end-to-end. In the testing phase, they have computational overhead similar to ADMM algorithm but use optimized parameters learned from the data for CS-based reconstruction task. We investigate different configurations in network structures and conduct extensive experiments on MR image reconstruction under different sampling rates. Due to the combination of the advantages in model-based approach and deep learning approach, the ADMM-Nets achieve state-of-the-art reconstruction accuracies with fast computational speed.
Tasks	Compressive Sensing, Image Reconstruction
Published	2017-05-19
URL	http://arxiv.org/abs/1705.06869v1
PDF	http://arxiv.org/pdf/1705.06869v1.pdf
PWC	https://paperswithcode.com/paper/admm-net-a-deep-learning-approach-for
Repo
Framework
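
The shrinkage step that the ADMM-Net abstract refers to can be illustrated with the classic soft-thresholding operator. This is only a sketch under assumptions: ADMM-Net learns its shrinkage functions from data, whereas the function below is the fixed textbook version, and the names `soft_threshold`, `shrink_all` and `lam` are hypothetical. Complex inputs are handled by shrinking the magnitude and keeping the phase.

```python
def soft_threshold(x: complex, lam: float) -> complex:
    """Classic soft-thresholding: shrink |x| by lam and keep the phase.

    Assumption: a fixed threshold, not the learned shrinkage of ADMM-Net.
    """
    magnitude = abs(x)
    if magnitude <= lam:
        # Coefficients at or below the threshold are set to zero.
        return 0j
    return x * (1.0 - lam / magnitude)


def shrink_all(coeffs, lam):
    """Apply soft-thresholding to every transform coefficient."""
    return [soft_threshold(c, lam) for c in coeffs]
```

Small coefficients are zeroed and large ones are pulled toward zero by `lam`, which is how sparsity is encouraged in the transform domain.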

Induction of Decision Trees based on Generalized Graph Queries


Title	Induction of Decision Trees based on Generalized Graph Queries
Authors	Pedro Almagro-Blanco, Fernando Sancho-Caparrini
Abstract	Decision tree induction algorithms are usually limited to non-relational data. Given a record, they do not take into account the attributes of other objects, even though these can provide valuable information for the learning task. In this paper we present GGQ-ID3, a multi-relational decision tree learning algorithm that uses Generalized Graph Queries (GGQ) as predicates in the decision nodes. GGQs can express complex patterns (including cycles) and can be refined step by step. They can also evaluate structures (not only single records) and perform Regular Pattern Matching. GGQs are built dynamically (pattern mining) during the GGQ-ID3 tree construction process. We show how to use GGQ-ID3 to perform multi-relational machine learning while keeping complexity under control. Finally, some real examples of automatically obtained classification trees and semantic patterns are shown.
Tasks
Published	2017-08-18
URL	http://arxiv.org/abs/1708.05563v1
PDF	http://arxiv.org/pdf/1708.05563v1.pdf
PWC	https://paperswithcode.com/paper/induction-of-decision-trees-based-on
Repo
Framework
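
GGQ-ID3 builds on ID3, which picks the predicate for each decision node by information gain. The sketch below shows that criterion with a plain boolean predicate standing in for a graph query. It is an assumption for illustration: the abstract does not give the exact scoring used, and `entropy`, `information_gain` and the example predicate are hypothetical names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(records, labels, predicate):
    """Entropy reduction obtained by splitting records on a boolean predicate.

    Assumption: `predicate` is any callable record -> bool; in GGQ-ID3 this
    role would be played by a Generalized Graph Query.
    """
    yes = [lab for rec, lab in zip(records, labels) if predicate(rec)]
    no = [lab for rec, lab in zip(records, labels) if not predicate(rec)]
    if not yes or not no:
        # A predicate that does not split the data gives no gain.
        return 0.0
    total = len(labels)
    remainder = (len(yes) / total) * entropy(yes) + (len(no) / total) * entropy(no)
    return entropy(labels) - remainder
```

A tree learner would evaluate this gain for each candidate refinement of the current query and keep the best one.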

Texture Synthesis with Recurrent Variational Auto-Encoder


Title	Texture Synthesis with Recurrent Variational Auto-Encoder
Authors	Rohan Chandra, Sachin Grover, Kyungjun Lee, Moustafa Meshry, Ahmed Taha
Abstract	We propose a recurrent variational auto-encoder for texture synthesis. A novel loss function, FLTBNK, is used for training the texture synthesizer. It is a rotation-invariant and partially color-invariant loss function. Unlike L2 loss, FLTBNK explicitly models the correlation of color intensity between pixels. Our texture synthesizer generates neighboring tiles to expand a sample texture and is evaluated using various texture patterns from the Describable Textures Dataset (DTD). We perform both quantitative and qualitative experiments with various loss functions to evaluate the performance of our proposed loss function (FLTBNK); a small human-subject study is used for the qualitative evaluation.
Tasks	Texture Synthesis
Published	2017-12-23
URL	http://arxiv.org/abs/1712.08838v1
PDF	http://arxiv.org/pdf/1712.08838v1.pdf
PWC	https://paperswithcode.com/paper/texture-synthesis-with-recurrent-variational
Repo
Framework

LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions


Title	LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions
Authors	Yu Wang, Jiayi Liu, Yuxiang Liu, Jun Hao, Yang He, Jinghe Hu, Weipeng P. Yan, Mantian Li
Abstract	We present LADDER, the first deep reinforcement learning agent that can successfully learn control policies for large-scale real-world problems directly from raw inputs composed of high-level semantic information. The agent is based on an asynchronous stochastic variant of DQN (Deep Q Network) named DASQN. The inputs of the agent are plain-text descriptions of states of a game of incomplete information, i.e. real-time large scale online auctions, and the rewards are auction profits of very large scale. We apply the agent to an essential portion of JD’s online RTB (real-time bidding) advertising business and find that it easily beats the former state-of-the-art bidding policy that had been carefully engineered and calibrated by human experts: during JD.com’s June 18th anniversary sale, the agent increased the company’s ads revenue from the portion by more than 50%, while the advertisers’ ROI (return on investment) also improved significantly.
Tasks
Published	2017-08-18
URL	http://arxiv.org/abs/1708.05565v2
PDF	http://arxiv.org/pdf/1708.05565v2.pdf
PWC	https://paperswithcode.com/paper/ladder-a-human-level-bidding-agent-for-large
Repo
Framework

Visuospatial Skill Learning for Robots


Title	Visuospatial Skill Learning for Robots
Authors	S. Reza Ahmadzadeh, Fulvio Mastrogiovanni, Petar Kormushev
Abstract	A novel skill learning approach is proposed that allows a robot to acquire human-like visuospatial skills for object manipulation tasks. Visuospatial skills are attained by observing spatial relationships among objects through demonstrations. The proposed Visuospatial Skill Learning (VSL) is a goal-based approach that focuses on achieving a desired goal configuration of objects relative to one another while maintaining the sequence of operations. VSL is capable of learning and generalizing multi-operation skills from a single demonstration, while requiring minimum prior knowledge about the objects and the environment. In contrast to many existing approaches, VSL offers simplicity, efficiency and user-friendly human-robot interaction. We also show that VSL can be easily extended towards 3D object manipulation tasks, simply by employing point cloud processing techniques. In addition, a robot learning framework, VSL-SP, is proposed by integrating VSL, Imitation Learning, and a conventional planning method. In VSL-SP, the sequence of performed actions is learned using VSL, while the sensorimotor skills are learned using a conventional trajectory-based learning approach. Such integration easily extends robot capabilities to novel situations, even by users without programming ability. In VSL-SP the internal planner of VSL is integrated with an existing action-level symbolic planner. Using the underlying constraints of the task and the extracted symbolic predicates identified by VSL, the symbolic representation of the task is updated. Therefore the planner maintains a generalized representation of each skill as a reusable action, which can be used in planning and performed independently during the learning phase. The proposed approach is validated through several real-world experiments.
Tasks	Imitation Learning
Published	2017-06-03
URL	http://arxiv.org/abs/1706.00989v1
PDF	http://arxiv.org/pdf/1706.00989v1.pdf
PWC	https://paperswithcode.com/paper/visuospatial-skill-learning-for-robots
Repo
Framework

Interpretable Counting for Visual Question Answering


Title	Interpretable Counting for Visual Question Answering
Authors	Alexander Trott, Caiming Xiong, Richard Socher
Abstract	Questions that require counting a variety of objects in images remain a major challenge in visual question answering (VQA). The most common approaches to VQA involve either classifying answers based on fixed length representations of both the image and question or summing fractional counts estimated from each section of the image. In contrast, we treat counting as a sequential decision process and force our model to make discrete choices of what to count. Specifically, the model sequentially selects from detected objects and learns interactions between objects that influence subsequent selections. A distinction of our approach is its intuitive and interpretable output, as discrete counts are automatically grounded in the image. Furthermore, our method outperforms the state of the art architecture for VQA on multiple metrics that evaluate counting.
Tasks	Question Answering, Visual Question Answering
Published	2017-12-23
URL	http://arxiv.org/abs/1712.08697v2
PDF	http://arxiv.org/pdf/1712.08697v2.pdf
PWC	https://paperswithcode.com/paper/interpretable-counting-for-visual-question
Repo
Framework

A Gap-Based Framework for Chinese Word Segmentation via Very Deep Convolutional Networks


Title	A Gap-Based Framework for Chinese Word Segmentation via Very Deep Convolutional Networks
Authors	Zhiqing Sun, Gehui Shen, Zhihong Deng
Abstract	Most previous approaches to Chinese word segmentation can be roughly classified into character-based and word-based methods. The former regards this task as a sequence-labeling problem, while the latter directly segments a character sequence into words. However, if we consider segmenting a given sentence, the most intuitive idea is to predict whether to segment at each gap between two consecutive characters, which in comparison makes previous approaches seem too complex. Therefore, in this paper, we propose a gap-based framework to implement this intuitive idea. Moreover, very deep convolutional neural networks, namely, ResNets and DenseNets, are exploited in our experiments. Results show that our approach outperforms the best character-based and word-based methods on 5 benchmarks, without any further post-processing module (e.g. Conditional Random Fields) or beam search.
Tasks	Chinese Word Segmentation
Published	2017-12-27
URL	http://arxiv.org/abs/1712.09509v1
PDF	http://arxiv.org/pdf/1712.09509v1.pdf
PWC	https://paperswithcode.com/paper/a-gap-based-framework-for-chinese-word
Repo
Framework
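
The gap-based idea in the abstract above reduces segmentation to one binary decision per gap between consecutive characters. The sketch below shows only the final decoding step, turning such decisions into words; the decisions themselves would come from the paper's ResNet or DenseNet classifier, which is not reproduced here. The function name `segment` and the example decisions are hypothetical.

```python
def segment(sentence, gap_decisions):
    """Split a sentence into words given one cut/no-cut decision per gap.

    gap_decisions[i] is truthy when a word boundary falls between
    sentence[i] and sentence[i + 1]; a sentence of n characters has n - 1 gaps.
    """
    if len(gap_decisions) != max(len(sentence) - 1, 0):
        raise ValueError("need exactly one decision per gap")
    words, start = [], 0
    for i, cut in enumerate(gap_decisions):
        if cut:
            words.append(sentence[start:i + 1])
            start = i + 1
    if start < len(sentence):
        # Whatever follows the last boundary is the final word.
        words.append(sentence[start:])
    return words


# Hypothetical decisions for a four-character sentence (three gaps).
print(segment("我爱北京", [True, True, False]))
```

Because each gap is decided independently, no sequence decoder such as a CRF or beam search is needed to produce the segmentation.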

A Supervised Learning Concept for Reducing User Interaction in Passenger Cars


Title	A Supervised Learning Concept for Reducing User Interaction in Passenger Cars
Authors	Marius Stärk, Damian Backes, Christian Kehl
Abstract	In this article, an automation system for human-machine interfaces (HMI) for setpoint adjustment using supervised learning is presented. We use HMIs of multi-modal thermal conditioning systems in passenger cars as an example of a complex setpoint selection system. The goal is the reduction of interaction complexity up to full automation. The approach is not limited to climate control applications but can be extended to other setpoint-based HMIs.
Tasks
Published	2017-11-13
URL	http://arxiv.org/abs/1711.04518v1
PDF	http://arxiv.org/pdf/1711.04518v1.pdf
PWC	https://paperswithcode.com/paper/a-supervised-learning-concept-for-reducing
Repo
Framework

Talking Drums: Generating drum grooves with neural networks


Title	Talking Drums: Generating drum grooves with neural networks
Authors	P. Hutchings
Abstract	Presented is a method of generating a full drum kit part for a provided kick-drum sequence. A sequence to sequence neural network model used in natural language translation was adopted to encode multiple musical styles and an online survey was developed to test different techniques for sampling the output of the softmax function. The strongest results were found using a sampling technique that drew from the three most probable outputs at each subdivision of the drum pattern but the consistency of output was found to be heavily dependent on style.
Tasks
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09558v1
PDF	http://arxiv.org/pdf/1706.09558v1.pdf
PWC	https://paperswithcode.com/paper/talking-drums-generating-drum-grooves-with
Repo
Framework
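
The sampling technique that performed best in the abstract above draws from the three most probable outputs at each subdivision. A minimal sketch of such top-k sampling follows. It assumes the three candidates are drawn in proportion to their softmax probabilities, which the abstract does not specify, and the names `softmax` and `sample_top_k` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(logits, k=3, rng=random):
    """Sample an output index from the k most probable outputs.

    Assumption: candidates are weighted by their softmax probability;
    uniform sampling among the top k would be another valid reading.
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the generated pattern reproducible.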

Predicting Demographics of High-Resolution Geographies with Geotagged Tweets


Title	Predicting Demographics of High-Resolution Geographies with Geotagged Tweets
Authors	Omar Montasser, Daniel Kifer
Abstract	In this paper, we consider the problem of predicting demographics of geographic units given geotagged Tweets that are composed within these units. Traditional survey methods that offer demographics estimates are usually limited in terms of geographic resolution, geographic boundaries, and time intervals. Thus, it would be highly useful to develop computational methods that can complement traditional survey methods by offering demographics estimates at finer geographic resolutions, with flexible geographic boundaries (i.e. not confined to administrative boundaries), and at different time intervals. While prior work has focused on predicting demographics and health statistics at relatively coarse geographic resolutions such as the county-level or state-level, we introduce an approach to predict demographics at finer geographic resolutions such as the blockgroup-level. For the task of predicting gender and race/ethnicity counts at the blockgroup-level, an approach adapted from prior work to our problem achieves an average correlation of 0.389 (gender) and 0.569 (race) on a held-out test dataset. Our approach outperforms this prior approach with an average correlation of 0.671 (gender) and 0.692 (race).
Tasks
Published	2017-01-22
URL	http://arxiv.org/abs/1701.06225v1
PDF	http://arxiv.org/pdf/1701.06225v1.pdf
PWC	https://paperswithcode.com/paper/predicting-demographics-of-high-resolution
Repo
Framework
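
The results in the abstract above are reported as correlations between predicted and true counts. The sketch below computes a Pearson correlation coefficient for one such pair of series. This is an assumption: the abstract does not say which correlation measure is used or how the average is taken, and the function name `pearson` is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

An average correlation would then be the mean of this value over the predicted categories, for example over each gender or race/ethnicity count.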

Future Word Contexts in Neural Network Language Models


Title	Future Word Contexts in Neural Network Language Models
Authors	Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark Gales
Abstract	Recently, bidirectional recurrent network language models (bi-RNNLMs) have been shown to outperform standard, unidirectional, recurrent neural network language models (uni-RNNLMs) on a range of speech recognition tasks. This indicates that future word context information beyond the word history can be useful. However, bi-RNNLMs pose a number of challenges as they make use of the complete previous and future word context information. This impacts both training efficiency and their use within a lattice rescoring framework. In this paper these issues are addressed by proposing a novel neural network structure, succeeding word RNNLMs (su-RNNLMs). Instead of using a recurrent unit to capture the complete future word contexts, a feedforward unit is used to model a finite number of succeeding, future, words. This model can be trained much more efficiently than bi-RNNLMs and can also be used for lattice rescoring. Experimental results on a meeting transcription task (AMI) show that the proposed model consistently outperforms uni-RNNLMs and yields only a slight degradation compared to bi-RNNLMs in N-best rescoring. Additionally, performance improvements can be obtained using lattice rescoring and subsequent confusion network decoding.
Tasks	Speech Recognition
Published	2017-08-18
URL	http://arxiv.org/abs/1708.05592v1
PDF	http://arxiv.org/pdf/1708.05592v1.pdf
PWC	https://paperswithcode.com/paper/future-word-contexts-in-neural-network
Repo
Framework
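
The difference between the model variants in the abstract above lies in which context is visible when scoring a word. The sketch below only enumerates that context for a su-RNNLM-style model: the full history plus a fixed number of succeeding words. It does not implement the network itself, and `su_contexts` and the window size `k` are hypothetical names.

```python
def su_contexts(words, k):
    """For each position, list the context a succeeding-word model would see.

    Returns (history, target, succeeding) triples, where `history` is the
    complete word history and `succeeding` holds at most k future words.
    """
    contexts = []
    for t, target in enumerate(words):
        history = words[:t]
        succeeding = words[t + 1:t + 1 + k]
        contexts.append((history, target, succeeding))
    return contexts


for history, target, succeeding in su_contexts(["we", "met", "on", "monday"], k=2):
    print(history, target, succeeding)
```

With `k = 0` the context matches a unidirectional model, while a bidirectional model would see every remaining word rather than a bounded window.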

Saliency Prediction for Mobile User Interfaces


Title	Saliency Prediction for Mobile User Interfaces
Authors	Prakhar Gupta, Shubh Gupta, Ajaykrishnan Jayagopal, Sourav Pal, Ritwik Sinha
Abstract	We introduce models for saliency prediction for mobile user interfaces. A mobile interface may include elements like buttons, text, etc. in addition to natural images which enable performing a variety of tasks. Saliency in natural images is a well-studied area. However, given the difference in what constitutes a mobile interface, and the usage context of these devices, we postulate that saliency prediction for mobile interface images requires a fresh approach. Mobile interface design involves operating on elements, the building blocks of the interface. We first collected eye-gaze data from mobile devices for a free-viewing task. Using this data, we develop a novel autoencoder-based multi-scale deep learning model that provides saliency prediction at the mobile interface element level. Compared to saliency prediction approaches developed for natural images, we show that our approach performs significantly better on a range of established metrics.
Tasks	Saliency Prediction
Published	2017-11-10
URL	http://arxiv.org/abs/1711.03726v3
PDF	http://arxiv.org/pdf/1711.03726v3.pdf
PWC	https://paperswithcode.com/paper/saliency-prediction-for-mobile-user
Repo
Framework

Recommender System for News Articles using Supervised Learning


Title	Recommender System for News Articles using Supervised Learning
Authors	Akshay Kumar Chaturvedi, Filipa Peleja, Ana Freire
Abstract	In the last decade we have observed a massive increase in information, in particular information that is shared through smartphones. Consequently, the amount of information available does not allow the average user to be aware of all of their options. In this context, recommender systems use a number of techniques to help a user find the desired product, and hence play an important role today. Recommender systems aim to identify the products that best fit user preferences. These techniques are advantageous to both users and vendors, as they enable users to rapidly find what they need and vendors to promote their products and sales. As the industry became aware of the gains that could be achieved with these algorithms, which also pose a very interesting problem for many researchers, recommender systems have been a very active area since the mid-90s. Bearing in mind that this is an ongoing problem, the present thesis examines the value of using a recommender algorithm to find users' likes by observing their domain preferences. Using a balanced probabilistic method, this thesis shows how news topics can be used to recommend news articles. We used different machine learning methods to determine the user ratings for an article. To tackle this problem, supervised learning methods such as linear regression, Naive Bayes and logistic regression are used. All the aforementioned models have a different nature, which has an impact on the solution of the given problem. Furthermore, a number of experiments are presented and discussed to identify the feature set that best fits the problem.
Tasks	Recommendation Systems
Published	2017-07-03
URL	http://arxiv.org/abs/1707.00506v1
PDF	http://arxiv.org/pdf/1707.00506v1.pdf
PWC	https://paperswithcode.com/paper/recommender-system-for-news-articles-using
Repo
Framework

Differentially Private Matrix Completion Revisited


Title	Differentially Private Matrix Completion Revisited
Authors	Prateek Jain, Om Thakkar, Abhradeep Thakurta
Abstract	We provide the first provably joint differentially private algorithm with formal utility guarantees for the problem of user-level privacy-preserving collaborative filtering. Our algorithm is based on the Frank-Wolfe method, and it consistently estimates the underlying preference matrix as long as the number of users $m$ is $\omega(n^{5/4})$, where $n$ is the number of items, and each user provides her preference for at least $\sqrt{n}$ randomly selected items. Along the way, we provide an optimal differentially private algorithm for singular vector computation, based on the celebrated Oja’s method, that provides significant savings in terms of space and time while operating on sparse matrices. We also empirically evaluate our algorithm on a suite of datasets, and show that it consistently outperforms the state-of-the-art private algorithms.
Tasks	Matrix Completion
Published	2017-12-28
URL	http://arxiv.org/abs/1712.09765v2
PDF	http://arxiv.org/pdf/1712.09765v2.pdf
PWC	https://paperswithcode.com/paper/differentially-private-matrix-completion
Repo
Framework
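
Oja's method, which the abstract above uses as the basis for its private singular vector computation, estimates the top right singular vector of a matrix by streaming over its rows. The sketch below is the plain, non-private version only: the noise that would make it differentially private is omitted, and the function name, step size and number of passes are hypothetical choices.

```python
import math


def oja_top_direction(rows, step=0.1, passes=50):
    """Estimate the top right singular vector of `rows` with Oja's update.

    Assumption: no privacy noise is added; this is not the paper's algorithm.
    """
    dim = len(rows[0])
    w = [1.0 / math.sqrt(dim)] * dim  # unit-norm starting point
    for _ in range(passes):
        for x in rows:
            # Oja update: move w toward x in proportion to the projection <x, w>.
            proj = sum(xi * wi for xi, wi in zip(x, w))
            w = [wi + step * proj * xi for wi, xi in zip(w, x)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            w = [wi / norm for wi in w]
    return w
```

Each update touches a single row, so only the current estimate `w` has to be kept in memory, which is the kind of space saving on sparse matrices that the abstract alludes to.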

Handwritten Recognition Using SVM, KNN and Neural Network


Title	Handwritten Recognition Using SVM, KNN and Neural Network
Authors	Norhidayu Abdul Hamid, Nilam Nur Amir Sjarif
Abstract	Handwritten recognition (HWR) is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices. In this paper we use three (3) classifiers to recognize handwriting: SVM, KNN and Neural Network.
Tasks
Published	2017-02-01
URL	http://arxiv.org/abs/1702.00723v1
PDF	http://arxiv.org/pdf/1702.00723v1.pdf
PWC	https://paperswithcode.com/paper/handwritten-recognition-using-svm-knn-and
Repo
Framework
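
Of the three classifiers named in the abstract above, k-nearest neighbours (KNN) is the simplest to sketch. The example below classifies a feature vector by majority vote among its closest training points. The two-dimensional toy features, the value of `k` and the name `knn_predict` are assumptions for illustration; the abstract gives no details of the paper's features or settings.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    def distance(point):
        # Euclidean distance between a training point and the query.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))

    order = sorted(range(len(train_points)), key=lambda i: distance(train_points[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters standing in for two digit classes.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["0", "0", "0", "1", "1", "1"]
print(knn_predict(points, labels, (0.2, 0.1)))
```

For real handwriting, each point would be a flattened pixel or feature vector rather than a pair of coordinates.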