October 18, 2019

Paper Group ANR 655

Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification. Human-AI Learning Performance in Multi-Armed Bandits. Learning to Sketch with Shortcut Cycle Consistency. Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing. Sim-to-Real Transfer of …

Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification

Title Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification
Authors Pradeep Rangan, Sreenivasa Rao K
Abstract Manner of articulation detection using deep neural networks requires a priori knowledge of the attribute-discriminative features or decent phoneme alignments. However, generating an appropriate phoneme alignment is complex, and its performance depends on the choice of the optimal number of senones, Gaussians, etc. In the first part of our work, we exploit manner of articulation detection using connectionist temporal classification (CTC), which doesn’t need any phoneme alignment. Later, we modify the state-of-the-art character-based posteriors generated by CTC using the manner of articulation CTC detector. Beam search decoding is performed on the modified posteriors, and its impact on open-source datasets such as AN4 and LibriSpeech is observed.
Tasks Manner Of Articulation Detection
Published 2018-11-16
URL http://arxiv.org/abs/1811.07720v1
PDF http://arxiv.org/pdf/1811.07720v1.pdf
PWC https://paperswithcode.com/paper/beam-search-decoding-using-manner-of
Repo
Framework

Human-AI Learning Performance in Multi-Armed Bandits

Title Human-AI Learning Performance in Multi-Armed Bandits
Authors Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan
Abstract People frequently face challenging decision-making problems in which outcomes are uncertain or unknown. Artificial intelligence (AI) algorithms exist that can outperform humans at learning such tasks. Thus, there is an opportunity for AI agents to assist people in learning these tasks more effectively. In this work, we use a multi-armed bandit as a controlled setting in which to explore this direction. We pair humans with a selection of agents and observe how well each human-agent team performs. We find that team performance can beat both human and agent performance in isolation. Interestingly, we also find that an agent’s performance in isolation does not necessarily correlate with the human-agent team’s performance. A drop in agent performance can lead to a disproportionately large drop in team performance, or in some settings can even improve team performance. Pairing a human with an agent that performs slightly better than them can make them perform much better, while pairing them with an agent that performs the same can make them perform much worse. Further, our results suggest that people have different exploration strategies and might perform better with agents that match their strategy. Overall, optimizing human-agent team performance requires going beyond optimizing agent performance, to understanding how the agent’s suggestions will influence human decision-making.
Tasks Decision Making, Multi-Armed Bandits
Published 2018-12-21
URL http://arxiv.org/abs/1812.09376v1
PDF http://arxiv.org/pdf/1812.09376v1.pdf
PWC https://paperswithcode.com/paper/human-ai-learning-performance-in-multi-armed
Repo
Framework
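
The bandit setting in the abstract above can be made concrete with a small simulation. The abstract does not name the agents that were paired with humans, so the epsilon-greedy agent below is only an assumed stand-in, and the function name, arm probabilities and parameters are hypothetical.

```python
import random

def run_epsilon_greedy(arm_probs, steps=2000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy agent (an assumed agent,
    not one taken from the paper). Returns (total_reward, pull_counts)."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms      # how often each arm was pulled
    means = [0.0] * n_arms     # running mean reward per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit
        reward = 1 if rng.random() < arm_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]     # incremental mean
        total += reward
    return total, counts

# Hypothetical three-armed bandit with success probabilities 0.2, 0.5, 0.8.
total, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
print(total, counts)
```

Varying `epsilon` gives agents of different standalone quality, which is the kind of knob one would need in order to study how agent performance relates to team performance.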

Learning to Sketch with Shortcut Cycle Consistency

Title Learning to Sketch with Shortcut Cycle Consistency
Authors Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales
Abstract To see is to sketch – free-hand sketching naturally builds ties between human and machine vision. In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process. This is an extremely challenging task because the photo and sketch domains differ significantly. Furthermore, human sketches exhibit various levels of sophistication and abstraction even when depicting the same object instance in a reference photo. This means that even if photo-sketch pairs are available, they only provide a weak supervision signal to learn a translation model. Compared with existing supervised approaches that solve the problem of D(E(photo)) -> sketch, where E($\cdot$) and D($\cdot$) denote encoder and decoder respectively, we take advantage of the inverse problem (e.g., D(E(sketch)) -> photo), and combine it with the unsupervised learning tasks of within-domain reconstruction, all within a multi-task learning framework. Compared with existing unsupervised approaches based on cycle consistency (i.e., D(E(D(E(photo)))) -> photo), we introduce a shortcut consistency enforced at the encoder bottleneck (e.g., D(E(photo)) -> photo) to exploit the additional self-supervision. Both qualitative and quantitative results show that the proposed model is superior to a number of state-of-the-art alternatives. We also show that the synthetic sketches can be used to train a better fine-grained sketch-based image retrieval (FG-SBIR) model, effectively alleviating the problem of sketch data scarcity.
Tasks Image Retrieval, Multi-Task Learning, Sketch-Based Image Retrieval
Published 2018-05-01
URL http://arxiv.org/abs/1805.00247v1
PDF http://arxiv.org/pdf/1805.00247v1.pdf
PWC https://paperswithcode.com/paper/learning-to-sketch-with-shortcut-cycle
Repo
Framework

Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing

Title Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing
Authors Chao Gao, Zongming Ma
Abstract This paper surveys some recent developments in fundamental limits and optimal algorithms for network analysis. We focus on minimax optimal rates in three fundamental problems of network analysis: graphon estimation, community detection, and hypothesis testing. For each problem, we review state-of-the-art results in the literature followed by general principles behind the optimal procedures that lead to minimax estimation and testing. This allows us to connect problems in network analysis to other statistical inference problems from a general perspective.
Tasks Community Detection, Graphon Estimation
Published 2018-11-14
URL http://arxiv.org/abs/1811.06055v2
PDF http://arxiv.org/pdf/1811.06055v2.pdf
PWC https://paperswithcode.com/paper/minimax-rates-in-network-analysis-graphon
Repo
Framework

Sim-to-Real Transfer of Robot Learning with Variable Length Inputs

Title Sim-to-Real Transfer of Robot Learning with Variable Length Inputs
Authors Vibhavari Dasagi, Robert Lee, Serena Mou, Jake Bruce, Niko Sünderhauf, Jürgen Leitner
Abstract Current end-to-end deep Reinforcement Learning (RL) approaches require jointly learning perception, decision-making and low-level control from very sparse reward signals and high-dimensional inputs, with little capability of incorporating prior knowledge. This results in prohibitively long training times for use on real-world robotic tasks. Existing algorithms capable of extracting task-level representations from high-dimensional inputs, e.g. object detection, often produce outputs of varying lengths, restricting their use in RL methods due to the need for neural networks to have fixed length inputs. In this work, we propose a framework that combines deep sets encoding, which allows for variable-length abstract representations, with modular RL that utilizes these representations, decoupling high-level decision making from low-level control. We successfully demonstrate our approach on the robot manipulation task of object sorting, showing that this method can learn effective policies within mere minutes of highly simplified simulation. The learned policies can be directly deployed on a robot without further training, and generalize to variations of the task unseen during training.
Tasks Decision Making, Object Detection, Transfer Learning
Published 2018-09-20
URL https://arxiv.org/abs/1809.07480v2
PDF https://arxiv.org/pdf/1809.07480v2.pdf
PWC https://paperswithcode.com/paper/zero-shot-sim-to-real-transfer-with-modular
Repo
Framework
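
The variable-length input problem described in the abstract above is commonly handled by a permutation-invariant set encoder. The sketch below shows sum pooling over per-object features, assuming each detected object is an (x, y) pair; the feature map `phi` uses fixed hypothetical weights in place of a learned network and is not the authors' architecture.

```python
import math

def phi(obj):
    """Per-object feature map. Hypothetical fixed weights standing in for a
    learned network; each object is assumed to be an (x, y) pair."""
    x, y = obj
    return [math.tanh(x + y), math.tanh(x - y), math.tanh(0.5 * x)]

def encode_set(objects, dim=3):
    """Sum-pool per-object features, so the output has length `dim`
    regardless of how many objects were detected."""
    pooled = [0.0] * dim
    for obj in objects:
        for i, value in enumerate(phi(obj)):
            pooled[i] += value
    return pooled

# Two detections and five detections both map to a length-3 vector.
print(encode_set([(0.1, 0.2), (0.5, -0.3)]))
print(encode_set([(0.1, 0.2), (0.5, -0.3), (1.0, 1.0), (-0.4, 0.0), (0.3, 0.3)]))
```

Because the pooling is a sum, reordering the detections leaves the encoding unchanged (up to floating-point rounding), and a downstream policy network can rely on a fixed input size.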

Venue Suggestion Using Social-Centric Scores

Title Venue Suggestion Using Social-Centric Scores
Authors Mohammad Aliannejadi, Fabio Crestani
Abstract User modeling is a very important task for making relevant suggestions of venues to the users. These suggestions are often based on matching the venues’ features with the users’ preferences, which can be collected from previously visited locations. In this paper, we present a set of relevance scores for making personalized suggestions of points of interest. These scores model each user by focusing on the different types of information extracted from venues that they have previously visited. In particular, we focus on scores extracted from social information available on location-based social networks. Our experiments, conducted on the dataset of the TREC Contextual Suggestion Track, show that social scores are more effective than scores based on venues’ content.
Tasks
Published 2018-03-22
URL http://arxiv.org/abs/1803.08354v2
PDF http://arxiv.org/pdf/1803.08354v2.pdf
PWC https://paperswithcode.com/paper/venue-suggestion-using-social-centric-scores
Repo
Framework

Cognitive system to achieve human-level accuracy in automated assignment of helpdesk email tickets

Title Cognitive system to achieve human-level accuracy in automated assignment of helpdesk email tickets
Authors Atri Mandal, Nikhil Malhotra, Shivali Agarwal, Anupama Ray, Giriprasad Sridhara
Abstract Ticket assignment/dispatch is a crucial part of the service delivery business with a lot of scope for automation and optimization. In this paper, we present an end-to-end automated helpdesk email ticket assignment system, which is also offered as a service. The objective of the system is to determine the nature of the problem mentioned in an incoming email ticket and then automatically dispatch it to an appropriate resolver group (or team) for resolution. The proposed system uses an ensemble classifier augmented with a configurable rule engine. While designing an accurate classifier is one of the main challenges, we also need to design a system that is robust and adaptive to changing business needs. We discuss some of the main design challenges associated with email ticket assignment automation and how we solve them. The design decisions for our system are driven by high accuracy, coverage, business continuity, scalability and optimal usage of computational resources. Our system has been deployed in production at three major service providers and is currently assigning over 40,000 emails per month, on average, with an accuracy close to 90% and covering at least 90% of email tickets. This translates to achieving human-level accuracy and results in a net saving of about 23,000 man-hours of effort per annum.
Tasks
Published 2018-08-08
URL http://arxiv.org/abs/1808.02636v2
PDF http://arxiv.org/pdf/1808.02636v2.pdf
PWC https://paperswithcode.com/paper/cognitive-system-to-achieve-human-level
Repo
Framework

Learning with Bad Training Data via Iterative Trimmed Loss Minimization

Title Learning with Bad Training Data via Iterative Trimmed Loss Minimization
Authors Yanyao Shen, Sujay Sanghavi
Abstract In this paper, we study a simple and generic framework to tackle the problem of learning model parameters when a fraction of the training samples are corrupted. We first make a simple observation: in a variety of such settings, the evolution of training accuracy (as a function of training epochs) is different for clean and bad samples. Based on this we propose to iteratively minimize the trimmed loss, by alternating between (a) selecting samples with lowest current loss, and (b) retraining a model on only these samples. We prove that this process recovers the ground truth (with linear convergence rate) in generalized linear models with standard statistical assumptions. Experimentally, we demonstrate its effectiveness in three settings: (a) deep image classifiers with errors only in labels, (b) generative adversarial networks with bad training images, and (c) deep image classifiers with adversarial (image, label) pairs (i.e., backdoor attacks). For the well-studied setting of random label noise, our algorithm achieves state-of-the-art performance without having access to any a-priori guaranteed clean samples.
Tasks
Published 2018-10-28
URL http://arxiv.org/abs/1810.11874v2
PDF http://arxiv.org/pdf/1810.11874v2.pdf
PWC https://paperswithcode.com/paper/learning-with-bad-training-data-via-iterative
Repo
Framework
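
The alternating procedure in the abstract above (select the lowest-loss samples, then retrain on only those) can be sketched on a toy problem. The example below assumes a one-dimensional least-squares model and synthetic data; the function names, the `keep_frac` value and the corruption pattern are illustrative choices, not the authors' experimental setup.

```python
import random

def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    a = sxy / sxx
    return a, my - a * mx

def iterative_trimmed_fit(points, keep_frac=0.7, rounds=10):
    """Alternate between (a) keeping the samples with the lowest current
    squared loss and (b) refitting the model on only those samples."""
    a, b = fit_line(points)                  # initial fit on all data
    k = int(keep_frac * len(points))
    for _ in range(rounds):
        ranked = sorted(points, key=lambda p: (p[1] - (a * p[0] + b)) ** 2)
        a, b = fit_line(ranked[:k])
    return a, b

# Synthetic data: 80 clean samples from y = 2x + 1 plus 20 corrupted ones.
rng = random.Random(0)
clean = [(i / 10, 2.0 * (i / 10) + 1.0 + rng.gauss(0, 0.05)) for i in range(80)]
bad = [(i / 10, -3.0 * (i / 10) + 5.0) for i in range(0, 80, 4)]
a, b = iterative_trimmed_fit(clean + bad)
print(a, b)
```

Here `keep_frac` plays the role of an assumed upper bound on the clean fraction; setting it above the true clean fraction would force some corrupted samples back into every refit.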

Learning Theory and Algorithms for Revenue Management in Sponsored Search

Title Learning Theory and Algorithms for Revenue Management in Sponsored Search
Authors Lulu Wang, Huahui Liu, Guanhao Chen, Shaola Ren, Xiaonan Meng, Yi Hu
Abstract Online advertisement is the main source of revenue for Internet business. Advertisers are typically ranked according to a score that takes into account their bids and potential click-through rates (eCTR). Generally, the likelihood that a user clicks on an ad is modeled by optimizing for the click-through rates rather than the performance of the auction in which the click-through rates will be used. This paper attempts to eliminate this disconnection by proposing loss functions for click modeling that are based on final auction performance. In this paper, we address two feasible metrics (AUC^R and SAUC) to evaluate the online RPM (revenue per mille) directly rather than the CTR. We then design an explicit ranking function by incorporating the calibration factor and price-squashed factor to maximize the revenue. Given the power of deep networks, we also explore an implicit optimal ranking function with a deep model. Lastly, various experiments with two real-world datasets are presented. In particular, our proposed methods perform better than the state-of-the-art methods with regard to the revenue of the platform.
Tasks Calibration
Published 2018-07-05
URL http://arxiv.org/abs/1807.01827v1
PDF http://arxiv.org/pdf/1807.01827v1.pdf
PWC https://paperswithcode.com/paper/learning-theory-and-algorithms-for-revenue-1
Repo
Framework

Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks

Title Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks
Authors Humphrey Sheil, Omer Rana, Ronan Reilly
Abstract We present a neural network for predicting purchasing intent in an Ecommerce setting. Our main contribution is to address the significant investment in feature engineering that is usually associated with state-of-the-art methods such as Gradient Boosted Machines. We use trainable vector spaces to model varied, semi-structured input data comprising categoricals, quantities and unique instances. Multi-layer recurrent neural networks capture both session-local and dataset-global event dependencies and relationships for user sessions of any length. An exploration of model design decisions, including parameter sharing and skip connections, further increases model accuracy. Results on benchmark datasets deliver classification accuracy within 98% of state-of-the-art on one and exceed state-of-the-art on the second, without the need for any domain / dataset-specific feature engineering on both short and long event sequences.
Tasks Feature Engineering
Published 2018-07-21
URL http://arxiv.org/abs/1807.08207v1
PDF http://arxiv.org/pdf/1807.08207v1.pdf
PWC https://paperswithcode.com/paper/predicting-purchasing-intent-automatic
Repo
Framework

Linguistic Legal Concept Extraction in Portuguese

Title Linguistic Legal Concept Extraction in Portuguese
Authors Alessandra Cid, Alexandre Rademaker, Bruno Cuconato, Valeria de Paiva
Abstract This work investigates legal concepts and their expression in Portuguese, concentrating on the “Order of Attorneys of Brazil” Bar exam. Using a corpus formed by a collection of multiple-choice questions, three norms related to the Ethics part of the OAB exam, language resources (Princeton WordNet and OpenWordNet-PT) and tools (AntConc and Freeling), we began to investigate the concepts and words missing from our repertory of concepts and words in Portuguese, the knowledge base OpenWordNet-PT. We add these concepts and words to OpenWordNet-PT and hence obtain a representation of these texts that is “contained” in the lexical knowledge base.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09379v1
PDF http://arxiv.org/pdf/1810.09379v1.pdf
PWC https://paperswithcode.com/paper/linguistic-legal-concept-extraction-in
Repo
Framework

StoryGAN: A Sequential Conditional GAN for Story Visualization

Title StoryGAN: A Sequential Conditional GAN for Story Visualization
Authors Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao
Abstract We propose a new task, called Story Visualization. Given a multi-sentence paragraph, the story is visualized by generating a sequence of images, one for each sentence. In contrast to video generation, story visualization focuses less on the continuity in generated images (frames), but more on the global consistency across dynamic scenes and characters – a challenge that has not been addressed by any single-image or video generation methods. We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework. Our model is unique in that it consists of a deep Context Encoder that dynamically tracks the story flow, and two discriminators at the story and image levels, to enhance the image quality and the consistency of the generated sequences. To evaluate the model, we modified existing datasets to create the CLEVR-SV and Pororo-SV datasets. Empirically, StoryGAN outperforms state-of-the-art models in image quality, contextual consistency metrics, and human evaluation.
Tasks Story Visualization, Video Generation
Published 2018-12-06
URL http://arxiv.org/abs/1812.02784v2
PDF http://arxiv.org/pdf/1812.02784v2.pdf
PWC https://paperswithcode.com/paper/storygan-a-sequential-conditional-gan-for
Repo
Framework

Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA

Title Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA
Authors Cheng Fu, Shilin Zhu, Hao Su, Ching-En Lee, Jishen Zhao
Abstract Binarized Neural Network (BNN) removes bitwidth redundancy in classical CNN by using a single bit (-1/+1) for network parameters and intermediate representations, which has greatly reduced the off-chip data transfer and storage overhead. However, a large amount of computation redundancy still exists in BNN inference. By analyzing local properties of images and the learned BNN kernel weights, we observe an average of $\sim$78% input similarity and $\sim$59% weight similarity among weight kernels, measured by our proposed metric in common network architectures. Thus, there does exist redundancy that can be exploited to further reduce the amount of on-chip computations. Motivated by the observation, in this paper, we propose two types of fast and energy-efficient architectures for BNN inference. We also provide analysis and insights to pick the better strategy of these two for different datasets and network models. By reusing the results from previous computation, many cycles for data buffer access and computations can be skipped. By experiments, we demonstrate that 80% of the computation and 40% of the buffer access can be skipped by exploiting BNN similarity. Thus, our design can achieve 17% reduction in total power consumption, 54% reduction in on-chip power consumption and 2.4$\times$ maximum speedup, compared to the baseline without applying our reuse technique. Our design also shows 1.9$\times$ higher area efficiency compared to state-of-the-art BNN inference design. We believe our deployment of BNN on FPGA leads to a promising future of running deep learning models on mobile devices.
Tasks
Published 2018-10-04
URL http://arxiv.org/abs/1810.02068v1
PDF http://arxiv.org/pdf/1810.02068v1.pdf
PWC https://paperswithcode.com/paper/towards-fast-and-energy-efficient-binarized
Repo
Framework
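
For readers unfamiliar with why single-bit (-1/+1) values make inference cheap, the sketch below shows the standard XNOR-plus-popcount dot product used in binarized networks generally. It is background only: the paper's similarity-based reuse architectures are not reproduced here, and the bit packing convention and helper names are assumptions.

```python
def pack_bits(values):
    """Pack a list of -1/+1 values into an int; bit i is 1 when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1,+1} vectors of length n.
    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, so the result is 2 * matches - n."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(binary_dot(pack_bits(a), pack_bits(b), len(a)))
print(sum(x * y for x, y in zip(a, b)))  # reference integer dot product
```

Both print statements give the same value, since the packed computation is an exact rewriting of the integer dot product.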

Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding

Title Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding
Authors Ari Azarafrooz, John Brock
Abstract Measuring the similarity of two files is an important task in malware analysis, with fuzzy hash functions being a popular approach. Traditional fuzzy hash functions are data agnostic: they do not learn from a particular dataset how to determine similarity; their behavior is fixed across all datasets. In this paper, we demonstrate that fuzzy hash functions can be learned in a novel minimax training framework and that these learned fuzzy hash functions outperform traditional fuzzy hash functions at the file similarity task for Portable Executable files. In our approach, hash digests can be extracted from the kernel embeddings of two kernel networks, trained in a minimax framework, where the roles of players during training (i.e., adversary versus generator) alternate along with the input data. We refer to this new minimax architecture as perturbation-consistent. The similarity score for a pair of files is the utility of the minimax game in equilibrium. Our experiments show that learned fuzzy hash functions generalize well, capable of determining that two files are similar even when one of those files was generated using insertion and deletion operations.
Tasks
Published 2018-12-17
URL http://arxiv.org/abs/1812.07071v1
PDF http://arxiv.org/pdf/1812.07071v1.pdf
PWC https://paperswithcode.com/paper/fuzzy-hashing-as-perturbation-consistent
Repo
Framework

Adding Cues to Binary Feature Descriptors for Visual Place Recognition

Title Adding Cues to Binary Feature Descriptors for Visual Place Recognition
Authors Dominik Schlegel, Giorgio Grisetti
Abstract In this paper, we propose an approach to embed continuous and selector cues in binary feature descriptors used for visual place recognition. The embedding is achieved by extending each feature descriptor with a binary string that encodes a cue and supports the Hamming distance metric. Augmenting the descriptors in such a way has the advantage of being transparent to the procedure used to compare them. We present two concrete applications of our methodology, demonstrating the two considered types of cues. In addition, we conducted a broad quantitative and comparative evaluation of these applications, covering five benchmark datasets and several state-of-the-art image retrieval approaches in combination with various binary descriptor types.
Tasks Image Retrieval, Visual Place Recognition
Published 2018-09-18
URL http://arxiv.org/abs/1809.06690v1
PDF http://arxiv.org/pdf/1809.06690v1.pdf
PWC https://paperswithcode.com/paper/adding-cues-to-binary-feature-descriptors-for
Repo
Framework
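
The idea of appending a cue as extra bits that remain compatible with Hamming distance can be illustrated with a unary (thermometer) code, in which the Hamming distance between two codes equals the difference of the quantized cue values. The abstract does not specify the encoding, so this choice, the number of levels and all names below are assumptions made for illustration.

```python
def thermometer(value, levels):
    """Unary code: the first `value` of `levels` bits are 1, so the Hamming
    distance between two codes equals the absolute difference of the values."""
    return [1] * value + [0] * (levels - value)

def augment(descriptor, cue_level, levels=8):
    """Append the cue bits to a binary descriptor (an assumed encoding)."""
    return descriptor + thermometer(cue_level, levels)

def hamming(a, b):
    """Number of positions in which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

d1 = [1, 0, 1, 1]
d2 = [1, 1, 1, 0]
# The matcher is unchanged: it still computes a plain Hamming distance,
# which now also grows with the difference between the cue levels.
print(hamming(d1, d2))
print(hamming(augment(d1, 2), augment(d2, 5)))
```

This is what makes the augmentation transparent to the comparison procedure: any matcher built on Hamming distance works on the extended descriptors without modification.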