Paper Group ANR 655
Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification. Human-AI Learning Performance in Multi-Armed Bandits. Learning to Sketch with Shortcut Cycle Consistency. Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing. Sim-to-Real Transfer of …
Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification
Title | Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification |
Authors | Pradeep Rangan, Sreenivasa Rao K |
Abstract | Manner of articulation detection using deep neural networks requires a priori knowledge of attribute-discriminative features or decent phoneme alignments. However, generating an appropriate phoneme alignment is complex, and its performance depends on the choice of the optimal number of senones, Gaussians, etc. In the first part of our work, we perform manner of articulation detection using connectionist temporal classification (CTC), which does not need any phoneme alignment. We then modify the state-of-the-art character-based posteriors generated by CTC using the manner of articulation CTC detector. Beam search decoding is performed on the modified posteriors, and its impact on open-source datasets such as AN4 and LibriSpeech is observed. |
Tasks | Manner Of Articulation Detection |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.07720v1 |
http://arxiv.org/pdf/1811.07720v1.pdf | |
PWC | https://paperswithcode.com/paper/beam-search-decoding-using-manner-of |
Repo | |
Framework | |
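The abstract's final step is beam search decoding over the modified character posteriors. As a generic illustration of beam search over per-frame label posteriors, here is a minimal sketch (it omits CTC blank handling and repeat collapsing, so it is not the paper's full decoder):

```python
import math

def beam_search(posteriors, beam_width=3):
    """Beam search over per-frame label posteriors (each row sums to 1).
    Returns the highest-scoring label sequence found under the beam."""
    beams = [((), 0.0)]  # (label sequence, log probability)
    for frame in posteriors:
        # Extend every beam with every label, scoring by accumulated log prob.
        candidates = [
            (seq + (label,), logp + math.log(p))
            for seq, logp in beams
            for label, p in enumerate(frame)
            if p > 0.0
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune to the top beam_width hypotheses
    return list(beams[0][0])
```

For two frames with posteriors favoring label 0 then label 1, the decoder returns the sequence `[0, 1]`.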
Human-AI Learning Performance in Multi-Armed Bandits
Title | Human-AI Learning Performance in Multi-Armed Bandits |
Authors | Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan |
Abstract | People frequently face challenging decision-making problems in which outcomes are uncertain or unknown. Artificial intelligence (AI) algorithms exist that can outperform humans at learning such tasks. Thus, there is an opportunity for AI agents to assist people in learning these tasks more effectively. In this work, we use a multi-armed bandit as a controlled setting in which to explore this direction. We pair humans with a selection of agents and observe how well each human-agent team performs. We find that team performance can beat both human and agent performance in isolation. Interestingly, we also find that an agent’s performance in isolation does not necessarily correlate with the human-agent team’s performance. A drop in agent performance can lead to a disproportionately large drop in team performance, or in some settings can even improve team performance. Pairing a human with an agent that performs slightly better than them can make them perform much better, while pairing them with an agent that performs the same can make them perform much worse. Further, our results suggest that people have different exploration strategies and might perform better with agents that match their strategy. Overall, optimizing human-agent team performance requires going beyond optimizing agent performance, to understanding how the agent’s suggestions will influence human decision-making. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2018-12-21 |
URL | http://arxiv.org/abs/1812.09376v1 |
http://arxiv.org/pdf/1812.09376v1.pdf | |
PWC | https://paperswithcode.com/paper/human-ai-learning-performance-in-multi-armed |
Repo | |
Framework | |
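The study pairs humans with bandit-playing agents. As a hypothetical stand-in for one such agent (the paper does not specify this algorithm), an epsilon-greedy learner on a Bernoulli multi-armed bandit can be sketched as:

```python
import random

def run_epsilon_greedy(arm_means, epsilon, steps, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli bandit.
    Returns the total reward collected over `steps` pulls."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)   # pulls per arm
    values = [0.0] * len(arm_means) # running mean reward per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_means))  # explore a random arm
        else:
            arm = max(range(len(arm_means)), key=lambda a: values[a])  # exploit
        reward = 1 if rng.random() < arm_means[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return total
```

A fixed seed makes runs reproducible, which is useful when comparing agent variants as the paper's human-agent team experiments require.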
Learning to Sketch with Shortcut Cycle Consistency
Title | Learning to Sketch with Shortcut Cycle Consistency |
Authors | Jifei Song, Kaiyue Pang, Yi-Zhe Song, Tao Xiang, Timothy Hospedales |
Abstract | To see is to sketch – free-hand sketching naturally builds ties between human and machine vision. In this paper, we present a novel approach for translating an object photo to a sketch, mimicking the human sketching process. This is an extremely challenging task because the photo and sketch domains differ significantly. Furthermore, human sketches exhibit various levels of sophistication and abstraction even when depicting the same object instance in a reference photo. This means that even if photo-sketch pairs are available, they only provide a weak supervision signal for learning a translation model. Compared with existing supervised approaches that solve the problem of D(E(photo)) -> sketch, where E($\cdot$) and D($\cdot$) denote encoder and decoder respectively, we take advantage of the inverse problem (e.g., D(E(sketch)) -> photo), and combine it with the unsupervised learning tasks of within-domain reconstruction, all within a multi-task learning framework. Compared with existing unsupervised approaches based on cycle consistency (i.e., D(E(D(E(photo)))) -> photo), we introduce a shortcut consistency enforced at the encoder bottleneck (e.g., D(E(photo)) -> photo) to exploit the additional self-supervision. Both qualitative and quantitative results show that the proposed model is superior to a number of state-of-the-art alternatives. We also show that the synthetic sketches can be used to train a better fine-grained sketch-based image retrieval (FG-SBIR) model, effectively alleviating the problem of sketch data scarcity. |
Tasks | Image Retrieval, Multi-Task Learning, Sketch-Based Image Retrieval |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00247v1 |
http://arxiv.org/pdf/1805.00247v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-sketch-with-shortcut-cycle |
Repo | |
Framework | |
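The translation, full-cycle, and shortcut objectives from the abstract can be combined as L1 reconstruction terms. The sketch below uses hypothetical names (E_p/E_s for the photo/sketch encoders, D_p/D_s for the decoders) and shows only the photo-to-sketch direction; it is an illustration of the loss structure, not the paper's training code:

```python
def l1(a, b):
    """Elementwise L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def shortcut_cycle_losses(photo, sketch, E_p, D_s, E_s, D_p):
    """Loss terms for the photo->sketch direction, following the abstract's
    notation. E_*/D_* are encoder/decoder callables (stand-ins here)."""
    translation = l1(D_s(E_p(photo)), sketch)           # D(E(photo)) -> sketch
    full_cycle = l1(D_p(E_s(D_s(E_p(photo)))), photo)   # D(E(D(E(photo)))) -> photo
    shortcut = l1(D_p(E_p(photo)), photo)               # shortcut D(E(photo)) -> photo
    return translation + full_cycle + shortcut
```

With identity encoders/decoders and matching photo/sketch vectors, all three terms vanish, which is a quick sanity check on the wiring.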
Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing
Title | Minimax Rates in Network Analysis: Graphon Estimation, Community Detection and Hypothesis Testing |
Authors | Chao Gao, Zongming Ma |
Abstract | This paper surveys some recent developments in fundamental limits and optimal algorithms for network analysis. We focus on minimax optimal rates in three fundamental problems of network analysis: graphon estimation, community detection, and hypothesis testing. For each problem, we review state-of-the-art results in the literature followed by general principles behind the optimal procedures that lead to minimax estimation and testing. This allows us to connect problems in network analysis to other statistical inference problems from a general perspective. |
Tasks | Community Detection, Graphon Estimation |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.06055v2 |
http://arxiv.org/pdf/1811.06055v2.pdf | |
PWC | https://paperswithcode.com/paper/minimax-rates-in-network-analysis-graphon |
Repo | |
Framework | |
Sim-to-Real Transfer of Robot Learning with Variable Length Inputs
Title | Sim-to-Real Transfer of Robot Learning with Variable Length Inputs |
Authors | Vibhavari Dasagi, Robert Lee, Serena Mou, Jake Bruce, Niko Sünderhauf, Jürgen Leitner |
Abstract | Current end-to-end deep Reinforcement Learning (RL) approaches require jointly learning perception, decision-making and low-level control from very sparse reward signals and high-dimensional inputs, with little capability of incorporating prior knowledge. This results in prohibitively long training times for use on real-world robotic tasks. Existing algorithms capable of extracting task-level representations from high-dimensional inputs, e.g. object detection, often produce outputs of varying lengths, restricting their use in RL methods due to the need for neural networks to have fixed length inputs. In this work, we propose a framework that combines deep sets encoding, which allows for variable-length abstract representations, with modular RL that utilizes these representations, decoupling high-level decision making from low-level control. We successfully demonstrate our approach on the robot manipulation task of object sorting, showing that this method can learn effective policies within mere minutes of training in a highly simplified simulation. The learned policies can be directly deployed on a robot without further training, and generalize to variations of the task unseen during training. |
Tasks | Decision Making, Object Detection, Transfer Learning |
Published | 2018-09-20 |
URL | https://arxiv.org/abs/1809.07480v2 |
https://arxiv.org/pdf/1809.07480v2.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-sim-to-real-transfer-with-modular |
Repo | |
Framework | |
Venue Suggestion Using Social-Centric Scores
Title | Venue Suggestion Using Social-Centric Scores |
Authors | Mohammad Aliannejadi, Fabio Crestani |
Abstract | User modeling is a very important task for making relevant suggestions of venues to users. These suggestions are often based on matching the venues’ features with the users’ preferences, which can be collected from previously visited locations. In this paper, we present a set of relevance scores for making personalized suggestions of points of interest. These scores model each user by focusing on the different types of information extracted from venues that they have previously visited. In particular, we focus on scores extracted from social information available on location-based social networks. Our experiments, conducted on the dataset of the TREC Contextual Suggestion Track, show that social scores are more effective than scores based on venues’ content. |
Tasks | |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08354v2 |
http://arxiv.org/pdf/1803.08354v2.pdf | |
PWC | https://paperswithcode.com/paper/venue-suggestion-using-social-centric-scores |
Repo | |
Framework | |
Cognitive system to achieve human-level accuracy in automated assignment of helpdesk email tickets
Title | Cognitive system to achieve human-level accuracy in automated assignment of helpdesk email tickets |
Authors | Atri Mandal, Nikhil Malhotra, Shivali Agarwal, Anupama Ray, Giriprasad Sridhara |
Abstract | Ticket assignment/dispatch is a crucial part of the service delivery business, with a lot of scope for automation and optimization. In this paper, we present an end-to-end automated helpdesk email ticket assignment system, which is also offered as a service. The objective of the system is to determine the nature of the problem mentioned in an incoming email ticket and then automatically dispatch it to an appropriate resolver group (or team) for resolution. The proposed system uses an ensemble classifier augmented with a configurable rule engine. While designing an accurate classifier is one of the main challenges, we also need to design a system that is robust and adaptive to changing business needs. We discuss some of the main design challenges associated with email ticket assignment automation and how we solve them. The design decisions for our system are driven by high accuracy, coverage, business continuity, scalability and optimal usage of computational resources. Our system has been deployed in production for three major service providers and currently assigns over 40,000 emails per month, on average, with an accuracy close to 90% while covering at least 90% of email tickets. This translates to achieving human-level accuracy and results in a net saving of about 23,000 man-hours of effort per annum. |
Tasks | |
Published | 2018-08-08 |
URL | http://arxiv.org/abs/1808.02636v2 |
http://arxiv.org/pdf/1808.02636v2.pdf | |
PWC | https://paperswithcode.com/paper/cognitive-system-to-achieve-human-level |
Repo | |
Framework | |
Learning with Bad Training Data via Iterative Trimmed Loss Minimization
Title | Learning with Bad Training Data via Iterative Trimmed Loss Minimization |
Authors | Yanyao Shen, Sujay Sanghavi |
Abstract | In this paper, we study a simple and generic framework to tackle the problem of learning model parameters when a fraction of the training samples are corrupted. We first make a simple observation: in a variety of such settings, the evolution of training accuracy (as a function of training epochs) is different for clean and bad samples. Based on this we propose to iteratively minimize the trimmed loss, by alternating between (a) selecting the samples with the lowest current loss, and (b) retraining a model on only these samples. We prove that this process recovers the ground truth (with linear convergence rate) in generalized linear models with standard statistical assumptions. Experimentally, we demonstrate its effectiveness in three settings: (a) deep image classifiers with errors only in labels, (b) generative adversarial networks with bad training images, and (c) deep image classifiers with adversarial (image, label) pairs (i.e., backdoor attacks). For the well-studied setting of random label noise, our algorithm achieves state-of-the-art performance without having access to any a priori guaranteed clean samples. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11874v2 |
http://arxiv.org/pdf/1810.11874v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-with-bad-training-data-via-iterative |
Repo | |
Framework | |
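The alternating procedure in the abstract can be sketched on a toy 1-D linear model. This is a hypothetical illustration of the idea (keep the lowest-loss fraction, refit on only those samples), not the paper's implementation:

```python
def iterative_trimmed_fit(xs, ys, keep_frac=0.7, iters=10):
    """Fit y = w*x by alternating (a) keeping the keep_frac of samples with
    the lowest squared loss under the current w, and (b) refitting w by
    least squares on only those samples."""
    # Initialize with an ordinary least-squares fit on all samples.
    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    k = max(1, int(keep_frac * len(xs)))
    for _ in range(iters):
        # (a) rank samples by current squared residual, keep the k lowest
        order = sorted(range(len(xs)), key=lambda i: (ys[i] - w * xs[i]) ** 2)
        kept = order[:k]
        # (b) refit w on the kept samples only
        num = sum(xs[i] * ys[i] for i in kept)
        den = sum(xs[i] * xs[i] for i in kept)
        w = num / den
    return w
```

On data following y = 2x with one grossly corrupted label, the corrupted sample is trimmed after the first iteration and the fit converges to the clean slope.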
Learning Theory and Algorithms for Revenue Management in Sponsored Search
Title | Learning Theory and Algorithms for Revenue Management in Sponsored Search |
Authors | Lulu Wang, Huahui Liu, Guanhao Chen, Shaola Ren, Xiaonan Meng, Yi Hu |
Abstract | Online advertisement is the main source of revenue for Internet business. Advertisers are typically ranked according to a score that takes into account their bids and potential click-through rates (eCTR). Generally, the likelihood that a user clicks on an ad is modeled by optimizing for the click-through rate rather than the performance of the auction in which the click-through rate will be used. This paper attempts to eliminate this disconnection by proposing loss functions for click modeling that are based on final auction performance. In this paper, we address two feasible metrics (AUC^R and SAUC) to evaluate the online RPM (revenue per mille) directly rather than the CTR. We then design an explicit ranking function that incorporates the calibration factor and price-squashed factor to maximize revenue. Given the power of deep networks, we also explore an implicit optimal ranking function with a deep model. Lastly, various experiments with two real-world datasets are presented. In particular, our proposed methods perform better than the state-of-the-art methods with regard to the revenue of the platform. |
Tasks | Calibration |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.01827v1 |
http://arxiv.org/pdf/1807.01827v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-theory-and-algorithms-for-revenue-1 |
Repo | |
Framework | |
Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks
Title | Predicting purchasing intent: Automatic Feature Learning using Recurrent Neural Networks |
Authors | Humphrey Sheil, Omer Rana, Ronan Reilly |
Abstract | We present a neural network for predicting purchasing intent in an ecommerce setting. Our main contribution is to address the significant investment in feature engineering that is usually associated with state-of-the-art methods such as Gradient Boosted Machines. We use trainable vector spaces to model varied, semi-structured input data comprising categoricals, quantities and unique instances. Multi-layer recurrent neural networks capture both session-local and dataset-global event dependencies and relationships for user sessions of any length. An exploration of model design decisions, including parameter sharing and skip connections, further increases model accuracy. On benchmark datasets, our model delivers classification accuracy within 98% of the state of the art on one dataset and exceeds the state of the art on the second, without any domain- or dataset-specific feature engineering, on both short and long event sequences. |
Tasks | Feature Engineering |
Published | 2018-07-21 |
URL | http://arxiv.org/abs/1807.08207v1 |
http://arxiv.org/pdf/1807.08207v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-purchasing-intent-automatic |
Repo | |
Framework | |
Linguistic Legal Concept Extraction in Portuguese
Title | Linguistic Legal Concept Extraction in Portuguese |
Authors | Alessandra Cid, Alexandre Rademaker, Bruno Cuconato, Valeria de Paiva |
Abstract | This work investigates legal concepts and their expression in Portuguese, concentrating on the “Order of Attorneys of Brazil” Bar exam. Using a corpus formed by a collection of multiple-choice questions, three norms related to the Ethics part of the OAB exam, language resources (Princeton WordNet and OpenWordNet-PT) and tools (AntConc and Freeling), we began to investigate the concepts and words missing from our repertory of concepts and words in Portuguese, the knowledge base OpenWordNet-PT. We add these concepts and words to OpenWordNet-PT and hence obtain a representation of these texts that is “contained” in the lexical knowledge base. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09379v1 |
http://arxiv.org/pdf/1810.09379v1.pdf | |
PWC | https://paperswithcode.com/paper/linguistic-legal-concept-extraction-in |
Repo | |
Framework | |
StoryGAN: A Sequential Conditional GAN for Story Visualization
Title | StoryGAN: A Sequential Conditional GAN for Story Visualization |
Authors | Yitong Li, Zhe Gan, Yelong Shen, Jingjing Liu, Yu Cheng, Yuexin Wu, Lawrence Carin, David Carlson, Jianfeng Gao |
Abstract | We propose a new task, called Story Visualization. Given a multi-sentence paragraph, the story is visualized by generating a sequence of images, one for each sentence. In contrast to video generation, story visualization focuses less on the continuity in generated images (frames) and more on the global consistency across dynamic scenes and characters – a challenge that has not been addressed by any single-image or video generation methods. We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework. Our model is unique in that it consists of a deep Context Encoder that dynamically tracks the story flow, and two discriminators at the story and image levels, to enhance the image quality and the consistency of the generated sequences. To evaluate the model, we modified existing datasets to create the CLEVR-SV and Pororo-SV datasets. Empirically, StoryGAN outperforms state-of-the-art models in image quality, contextual consistency metrics, and human evaluation. |
Tasks | Story Visualization, Video Generation |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02784v2 |
http://arxiv.org/pdf/1812.02784v2.pdf | |
PWC | https://paperswithcode.com/paper/storygan-a-sequential-conditional-gan-for |
Repo | |
Framework | |
Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA
Title | Towards Fast and Energy-Efficient Binarized Neural Network Inference on FPGA |
Authors | Cheng Fu, Shilin Zhu, Hao Su, Ching-En Lee, Jishen Zhao |
Abstract | Binarized Neural Network (BNN) removes bitwidth redundancy in classical CNN by using a single bit (-1/+1) for network parameters and intermediate representations, which greatly reduces the off-chip data transfer and storage overhead. However, a large amount of computation redundancy still exists in BNN inference. By analyzing local properties of images and the learned BNN kernel weights, we observe an average of $\sim$78% input similarity and $\sim$59% weight similarity among weight kernels, measured by our proposed metric in common network architectures. Thus there does exist redundancy that can be exploited to further reduce the amount of on-chip computations. Motivated by this observation, in this paper we propose two types of fast and energy-efficient architectures for BNN inference. We also provide analysis and insights for picking the better of the two strategies for different datasets and network models. By reusing the results from previous computation, many cycles of data buffer access and computation can be skipped. In experiments, we demonstrate that 80% of the computation and 40% of the buffer access can be skipped by exploiting BNN similarity. Thus, our design can achieve 17% reduction in total power consumption, 54% reduction in on-chip power consumption and 2.4$\times$ maximum speedup, compared to the baseline without applying our reuse technique. Our design also shows 1.9$\times$ better area efficiency compared to the state-of-the-art BNN inference design. We believe our deployment of BNN on FPGA leads to a promising future of running deep learning models on mobile devices. |
Tasks | |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02068v1 |
http://arxiv.org/pdf/1810.02068v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-fast-and-energy-efficient-binarized |
Repo | |
Framework | |
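The core reuse idea (skip recomputation when a binary input window and kernel pair repeats) can be sketched in software. This is a toy illustration of result reuse with a memo table, not the paper's hardware architecture:

```python
def xnor_popcount(x_bits, w_bits):
    """Binary dot product for -1/+1 values stored as 0/1 bits:
    +1 per matching bit, -1 per mismatch."""
    matches = sum(a == b for a, b in zip(x_bits, w_bits))
    return 2 * matches - len(x_bits)

def bnn_layer_with_reuse(input_windows, kernels):
    """Evaluate each kernel on each input window, reusing cached results
    for repeated (window, kernel) pairs. Returns the output grid and the
    number of dot products actually computed."""
    cache, outputs, computed = {}, [], 0
    for win in input_windows:
        row = []
        for kern in kernels:
            key = (win, kern)
            if key not in cache:        # only compute on a cache miss
                cache[key] = xnor_popcount(win, kern)
                computed += 1
            row.append(cache[key])
        outputs.append(row)
    return outputs, computed
```

With two identical input windows, the second window's outputs come entirely from the cache, mirroring the abstract's claim that input similarity lets computations be skipped.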
Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding
Title | Fuzzy Hashing as Perturbation-Consistent Adversarial Kernel Embedding |
Authors | Ari Azarafrooz, John Brock |
Abstract | Measuring the similarity of two files is an important task in malware analysis, with fuzzy hash functions being a popular approach. Traditional fuzzy hash functions are data agnostic: they do not learn from a particular dataset how to determine similarity; their behavior is fixed across all datasets. In this paper, we demonstrate that fuzzy hash functions can be learned in a novel minimax training framework and that these learned fuzzy hash functions outperform traditional fuzzy hash functions at the file similarity task for Portable Executable files. In our approach, hash digests can be extracted from the kernel embeddings of two kernel networks, trained in a minimax framework, where the roles of players during training (i.e., adversary versus generator) alternate along with the input data. We refer to this new minimax architecture as perturbation-consistent. The similarity score for a pair of files is the utility of the minimax game in equilibrium. Our experiments show that learned fuzzy hash functions generalize well, capable of determining that two files are similar even when one of those files was generated using insertion and deletion operations. |
Tasks | |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.07071v1 |
http://arxiv.org/pdf/1812.07071v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-hashing-as-perturbation-consistent |
Repo | |
Framework | |
Adding Cues to Binary Feature Descriptors for Visual Place Recognition
Title | Adding Cues to Binary Feature Descriptors for Visual Place Recognition |
Authors | Dominik Schlegel, Giorgio Grisetti |
Abstract | In this paper we propose an approach to embed continuous and selector cues in binary feature descriptors used for visual place recognition. The embedding is achieved by extending each feature descriptor with a binary string that encodes a cue and supports the Hamming distance metric. Augmenting the descriptors in such a way has the advantage of being transparent to the procedure used to compare them. We present two concrete applications of our methodology, demonstrating the two considered types of cues. In addition to that, we conducted on these applications a broad quantitative and comparative evaluation covering five benchmark datasets and several state-of-the-art image retrieval approaches in combination with various binary descriptor types. |
Tasks | Image Retrieval, Visual Place Recognition |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06690v1 |
http://arxiv.org/pdf/1809.06690v1.pdf | |
PWC | https://paperswithcode.com/paper/adding-cues-to-binary-feature-descriptors-for |
Repo | |
Framework | |
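The key property in the abstract is that the appended binary string encodes a cue while still supporting the Hamming distance metric. One simple encoding with this property is a unary (thermometer) code, where the Hamming distance between two encoded cues equals their level difference. The sketch below assumes this encoding for illustration; the paper's actual encodings may differ:

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit sequences."""
    return sum(x != y for x, y in zip(a, b))

def augment_descriptor(desc_bits, cue_value, cue_bits=8, cue_max=1.0):
    """Append a unary-coded continuous cue to a binary descriptor, so the
    Hamming distance between augmented descriptors grows with both the
    descriptor distance and the cue difference."""
    # Quantize the cue into cue_bits levels, clamped to [0, cue_max].
    level = round(min(max(cue_value / cue_max, 0.0), 1.0) * cue_bits)
    cue = [1] * level + [0] * (cue_bits - level)  # thermometer code
    return list(desc_bits) + cue
```

Because the augmentation is just extra bits, any matcher that compares descriptors by Hamming distance works unchanged, which is the transparency property the abstract highlights.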