July 28, 2019


Paper Group ANR 299

Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model. An Empirical Bayes Approach for High Dimensional Classification. Recovering Dense Tissue Multispectral Signal from in vivo RGB Images. Types of Cognition and its Implications for future High-Level Cognitive Machines. A clever elimination strategy for eff …

Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model


Title	Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model
Authors	Pin-Jung Chen, I-Hung Hsu, Yi-Yao Huang, Hung-Yi Lee
Abstract	We apply a sequence-to-sequence model to mitigate the impact of speech recognition errors on open-domain end-to-end dialog generation. We cast the task as a domain adaptation problem in which ASR transcriptions and original text are in two different domains. Our proposed model includes a separate encoder for each domain and makes their hidden states similar, so that the decoder predicts the same dialog text. The method shows that the sequence-to-sequence model can learn that an ASR transcription and its original text have the same meaning, and can thereby eliminate the speech recognition errors. Experimental results on the Cornell movie dialog dataset demonstrate that the domain adaptation system helps the spoken dialog system generate responses more similar to the original text answers.
Tasks	Chatbot, Domain Adaptation, Speech Recognition
Published	2017-09-22
URL	http://arxiv.org/abs/1709.07862v2
PDF	http://arxiv.org/pdf/1709.07862v2.pdf
PWC	https://paperswithcode.com/paper/mitigating-the-impact-of-speech-recognition
Repo
Framework

An Empirical Bayes Approach for High Dimensional Classification


Title	An Empirical Bayes Approach for High Dimensional Classification
Authors	Yunbo Ouyang, Feng Liang
Abstract	We propose an empirical Bayes estimator based on a Dirichlet process mixture model for estimating the sparse normalized mean difference, which can be directly applied to high dimensional linear classification. In theory, we build a bridge connecting the estimation error of the mean difference and the misclassification error, and we also provide sufficient conditions for sub-optimal and optimal classifiers. In implementation, a variational Bayes algorithm is developed to compute the posterior efficiently; it can be parallelized to deal with the ultra-high dimensional case.
Tasks
Published	2017-02-16
URL	http://arxiv.org/abs/1702.05056v1
PDF	http://arxiv.org/pdf/1702.05056v1.pdf
PWC	https://paperswithcode.com/paper/an-empirical-bayes-approach-for-high
Repo
Framework

Recovering Dense Tissue Multispectral Signal from in vivo RGB Images


Title	Recovering Dense Tissue Multispectral Signal from in vivo RGB Images
Authors	Jianyu Lin, Neil T. Clancy, Daniel S. Elson
Abstract	Hyperspectral/multispectral imaging (HSI/MSI) provides rich information for clinical applications, such as 1) narrow band imaging for vascular visualisation; 2) oxygen saturation for intraoperative perfusion monitoring and clinical decision making [1]; 3) tissue classification and identification of pathology [2]. The current systems which provide pixel-level HSI/MSI signal can be generally divided into two types: spatial scanning and spectral scanning. However, the trade-off between spatial/spectral resolution, the acquisition time, and the hardware complexity hampers implementation in real-world applications, especially intra-operatively. Acquiring high resolution images in real-time is important for HSI/MSI in intra-operative imaging, to alleviate the side effects caused by breathing, heartbeat, and other sources of motion. Therefore, we developed an algorithm to recover a pixel-level MSI stack using only the captured snapshot RGB images from a normal camera. We refer to this technique as “super-spectral-resolution”. The proposed method enables recovery of pixel-level-dense MSI signals with 24 spectral bands at ~11 frames per second (FPS) on a GPU. Multispectral data captured from porcine bowel and sheep/rabbit uteri in vivo has been used for training, and the algorithm has been validated using unseen in vivo animal experiments.
Tasks	Decision Making
Published	2017-06-20
URL	http://arxiv.org/abs/1707.03468v1
PDF	http://arxiv.org/pdf/1707.03468v1.pdf
PWC	https://paperswithcode.com/paper/recovering-dense-tissue-multispectral-signal
Repo
Framework

Types of Cognition and its Implications for future High-Level Cognitive Machines


Title	Types of Cognition and its Implications for future High-Level Cognitive Machines
Authors	Camilo Miguel Signorelli
Abstract	This work summarizes part of the current knowledge on High-level Cognitive processes and their relation to biological hardware. From this, it is possible to identify some paradoxes which could impact the development of future technologies and artificial intelligence: we may make a High-level Cognitive Machine, sacrificing the principal attribute of a machine, its accuracy.
Tasks
Published	2017-06-05
URL	http://arxiv.org/abs/1706.01443v1
PDF	http://arxiv.org/pdf/1706.01443v1.pdf
PWC	https://paperswithcode.com/paper/types-of-cognition-and-its-implications-for
Repo
Framework

A clever elimination strategy for efficient minimal solvers


Title	A clever elimination strategy for efficient minimal solvers
Authors	Zuzana Kukelova, Joe Kileel, Bernd Sturmfels, Tomas Pajdla
Abstract	We present a new insight into the systematic generation of minimal solvers in computer vision, which leads to smaller and faster solvers. Many minimal problem formulations are coupled sets of linear and polynomial equations where image measurements enter the linear equations only. We show that it is useful to solve such systems by first eliminating all the unknowns that do not appear in the linear equations and then extending solutions to the rest of unknowns. This can be generalized to fully non-linear systems by linearization via lifting. We demonstrate that this approach leads to more efficient solvers in three problems of partially calibrated relative camera pose computation with unknown focal length and/or radial distortion. Our approach also generates new interesting constraints on the fundamental matrices of partially calibrated cameras, which were not known before.
Tasks
Published	2017-03-15
URL	http://arxiv.org/abs/1703.05289v1
PDF	http://arxiv.org/pdf/1703.05289v1.pdf
PWC	https://paperswithcode.com/paper/a-clever-elimination-strategy-for-efficient
Repo
Framework

Learning Macromanagement in StarCraft from Replays using Deep Learning


Title	Learning Macromanagement in StarCraft from Replays using Deep Learning
Authors	Niels Justesen, Sebastian Risi
Abstract	The real-time strategy game StarCraft has proven to be a challenging environment for artificial intelligence techniques, and as a result, current state-of-the-art solutions consist of numerous hand-crafted modules. In this paper, we show how macromanagement decisions in StarCraft can be learned directly from game replays using deep learning. Neural networks are trained on 789,571 state-action pairs extracted from 2,005 replays of highly skilled players, achieving top-1 and top-3 error rates of 54.6% and 22.9% in predicting the next build action. By integrating the trained network into UAlbertaBot, an open source StarCraft bot, the system can significantly outperform the game’s built-in Terran bot, and play competitively against UAlbertaBot with a fixed rush strategy. To our knowledge, this is the first time macromanagement tasks are learned directly from replays in StarCraft. While the best hand-crafted strategies are still the state-of-the-art, the deep network approach is able to express a wide range of different strategies and thus improving the network’s performance further with deep reinforcement learning is an immediately promising avenue for future research. Ultimately this approach could lead to strong StarCraft bots that are less reliant on hard-coded strategies.
Tasks	Starcraft
Published	2017-07-12
URL	http://arxiv.org/abs/1707.03743v1
PDF	http://arxiv.org/pdf/1707.03743v1.pdf
PWC	https://paperswithcode.com/paper/learning-macromanagement-in-starcraft-from
Repo
Framework
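
The top-1 and top-3 error rates quoted in the abstract above can be illustrated with a small sketch. This is not the authors' evaluation code; the function name `top_k_error` and the toy scores are assumptions made for illustration. It counts an example as an error when the true next build action is not among the k highest-scoring predictions.

```python
def top_k_error(scores, targets, k):
    """Fraction of examples whose true label is not among the k highest-scoring classes.

    scores  -- list of per-class score lists, one list per example (hypothetical data)
    targets -- list of true class indices, one per example
    """
    misses = 0
    for row, target in zip(scores, targets):
        # Indices of the k classes with the highest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if target not in top_k:
            misses += 1
    return misses / len(targets)


if __name__ == "__main__":
    # Toy example with 3 possible build actions and 4 states.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
    targets = [1, 1, 0, 2]
    for k in (1, 2, 3):
        print(k, top_k_error(scores, targets, k))
```

The error can only shrink as k grows, which is why the reported top-3 error is lower than the top-1 error.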

Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation


Title	Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation
Authors	Raphael Shu, Hideki Nakayama
Abstract	Sequence generation models have long relied on the beam search algorithm to generate output sequences. However, the correctness of beam search degrades when a model is over-confident about a suboptimal prediction. In this paper, we propose to perform minimum Bayes-risk (MBR) decoding for some extra steps at a later stage. In order to speed up MBR decoding, we compute the Bayes risks on GPU in batch mode. In our experiments, we found that MBR reranking works with a large beam size. Later-stage MBR decoding is shown to outperform simple MBR reranking in machine translation tasks.
Tasks	Machine Translation
Published	2017-04-11
URL	http://arxiv.org/abs/1704.03169v2
PDF	http://arxiv.org/pdf/1704.03169v2.pdf
PWC	https://paperswithcode.com/paper/later-stage-minimum-bayes-risk-decoding-for
Repo
Framework
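
To make the idea of MBR reranking concrete, here is a minimal sketch of the simple reranking baseline mentioned in the abstract, not the later-stage method the paper proposes. It assumes a finished list of candidate translations with model log-probabilities, and it uses a unigram F1 overlap as a stand-in for a real translation metric such as BLEU; the function names and the toy candidates are hypothetical.

```python
import math
from collections import Counter


def unigram_f1(hyp, ref):
    """Token-level F1 overlap between two token lists (assumed stand-in for BLEU)."""
    if not hyp or not ref:
        return 0.0
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mbr_rerank(candidates, log_probs):
    """Return the index of the candidate with the lowest expected risk."""
    # Renormalize the model scores into a distribution over the candidate list.
    max_lp = max(log_probs)
    weights = [math.exp(lp - max_lp) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Expected risk of a hypothesis: probability-weighted loss against every candidate.
    risks = []
    for hyp in candidates:
        risk = sum(p * (1.0 - unigram_f1(hyp, other))
                   for other, p in zip(candidates, probs))
        risks.append(risk)
    return min(range(len(candidates)), key=risks.__getitem__)


if __name__ == "__main__":
    candidates = [
        "the cat sat on the mat".split(),
        "a cat sat on the mat".split(),
        "a cat sat on a mat".split(),
        "dogs run".split(),  # most probable under the model, but an outlier
    ]
    log_probs = [-1.0, -1.2, -1.3, -0.9]
    print(mbr_rerank(candidates, log_probs))
```

In this toy case the highest-probability candidate disagrees with all the others, so the expected-risk criterion prefers a candidate that is close to the consensus instead. This mirrors the motivation in the abstract: an over-confident but suboptimal prediction need not win.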

Intelligent EHRs: Predicting Procedure Codes From Diagnosis Codes


Title	Intelligent EHRs: Predicting Procedure Codes From Diagnosis Codes
Authors	Hasham Ul Haq, Rameel Ahmad, Sibt Ul Hussain
Abstract	In order to submit a claim to insurance companies, a doctor needs to code a patient encounter with both the diagnoses (ICDs) and the procedures performed (CPTs) in an Electronic Health Record (EHR). Identifying and applying the relevant procedure codes is a cumbersome and time-consuming task, as a doctor has to choose from around 13,000 procedure codes with no predefined one-to-one mapping. In this paper, we propose a state-of-the-art deep learning method for automatic and intelligent coding of procedures (CPTs) from the diagnosis codes (ICDs) entered by the doctor. Precisely, we cast the learning problem as a multi-label classification problem and use distributed representations to learn the input mapping of high-dimensional sparse ICD codes. Our final model trained on 2.3 million claims is able to outperform existing rule-based probabilistic and association-rule mining based methods and has a recall of 90@3.
Tasks	Multi-Label Classification
Published	2017-12-01
URL	http://arxiv.org/abs/1712.00481v1
PDF	http://arxiv.org/pdf/1712.00481v1.pdf
PWC	https://paperswithcode.com/paper/intelligent-ehrs-predicting-procedure-codes
Repo
Framework
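
The "recall of 90@3" figure refers to a recall-at-k style metric for multi-label prediction. The sketch below shows one common definition, averaged per claim; whether the paper uses exactly this averaging is an assumption, and the function name `recall_at_k` and the toy data are hypothetical.

```python
def recall_at_k(scores, true_labels, k):
    """Mean per-example recall of the true label set within the top-k predictions.

    scores      -- list of per-label score lists, one list per example
    true_labels -- list of non-empty sets of true label indices, one per example
    """
    total = 0.0
    for row, truth in zip(scores, true_labels):
        # The k labels the model scores highest for this example.
        top_k = set(sorted(range(len(row)), key=row.__getitem__, reverse=True)[:k])
        total += len(top_k & truth) / len(truth)
    return total / len(true_labels)


if __name__ == "__main__":
    # Two toy claims over 4 hypothetical procedure codes.
    scores = [[0.9, 0.1, 0.8, 0.3], [0.2, 0.7, 0.1, 0.6]]
    true_labels = [{0, 2}, {1, 2}]
    print(recall_at_k(scores, true_labels, 2))
```

Under this definition, a recall of 90% at k = 3 would mean that, on average, 90% of the procedure codes actually billed appear among the model's three highest-ranked suggestions.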

GAN and VAE from an Optimal Transport Point of View


Title	GAN and VAE from an Optimal Transport Point of View
Authors	Aude Genevay, Gabriel Peyré, Marco Cuturi
Abstract	This short article revisits some of the ideas introduced in arXiv:1701.07875 and arXiv:1705.07642 in a simple setup. This sheds some light on the connexions between Variational Autoencoders (VAE), Generative Adversarial Networks (GAN) and Minimum Kantorovitch Estimators (MKE).
Tasks
Published	2017-06-06
URL	http://arxiv.org/abs/1706.01807v1
PDF	http://arxiv.org/pdf/1706.01807v1.pdf
PWC	https://paperswithcode.com/paper/gan-and-vae-from-an-optimal-transport-point
Repo
Framework

Stream Reasoning in Temporal Datalog


Title	Stream Reasoning in Temporal Datalog
Authors	Alessandro Ronca, Mark Kaminski, Bernardo Cuenca Grau, Boris Motik, Ian Horrocks
Abstract	In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past. Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory. In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications).
Tasks
Published	2017-11-10
URL	http://arxiv.org/abs/1711.04013v2
PDF	http://arxiv.org/pdf/1711.04013v2.pdf
PWC	https://paperswithcode.com/paper/stream-reasoning-in-temporal-datalog
Repo
Framework

Adaptive Questionnaires for Direct Identification of Optimal Product Design


Title	Adaptive Questionnaires for Direct Identification of Optimal Product Design
Authors	Max Yi Ren, Clayton Scott
Abstract	We consider the problem of identifying the most profitable product design from a finite set of candidates under unknown consumer preference. A standard approach to this problem follows a two-step strategy: First, estimate the preference of the consumer population, represented as a point in part-worth space, using an adaptive discrete-choice questionnaire. Second, integrate the estimated part-worth vector with engineering feasibility and cost models to determine the optimal design. In this work, we (1) demonstrate that accurate preference estimation is neither necessary nor sufficient for identifying the optimal design, (2) introduce a novel adaptive questionnaire that leverages knowledge about engineering feasibility and manufacturing costs to directly determine the optimal design, and (3) interpret product design in terms of a nonlinear segmentation of part-worth space, and use this interpretation to illuminate the intrinsic difficulty of optimal design in the presence of noisy questionnaire responses. We establish the superiority of the proposed approach using a well-documented optimal product design task. This study demonstrates how the identification of optimal product design can be accelerated by integrating marketing and manufacturing knowledge into the adaptive questionnaire.
Tasks
Published	2017-01-05
URL	http://arxiv.org/abs/1701.01231v1
PDF	http://arxiv.org/pdf/1701.01231v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-questionnaires-for-direct
Repo
Framework

Deep Probabilistic Programming


Title	Deep Probabilistic Programming
Authors	Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei
Abstract	We propose Edward, a Turing-complete probabilistic programming language. Edward defines two compositional representations—random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same model using a variety of composable inference methods, ranging from point estimation to variational inference to MCMC. In addition, Edward can reuse the modeling representation as part of inference, facilitating the design of rich variational models and generative adversarial networks. For efficiency, Edward is integrated into TensorFlow, providing significant speedups over existing probabilistic systems. For example, we show on a benchmark logistic regression task that Edward is at least 35x faster than Stan and 6x faster than PyMC3. Further, Edward incurs no runtime overhead: it is as fast as handwritten TensorFlow.
Tasks	Probabilistic Programming
Published	2017-01-13
URL	http://arxiv.org/abs/1701.03757v2
PDF	http://arxiv.org/pdf/1701.03757v2.pdf
PWC	https://paperswithcode.com/paper/deep-probabilistic-programming
Repo
Framework

Learning Word-Like Units from Joint Audio-Visual Analysis


Title	Learning Word-Like Units from Joint Audio-Visual Analysis
Authors	David Harwath, James R. Glass
Abstract	Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. For example, our model is able to detect spoken instances of the word ‘lighthouse’ within an utterance and associate them with image regions containing lighthouses. We do not use any form of conventional automatic speech recognition, nor do we use any text transcriptions or conventional linguistic annotations. Our model effectively implements a form of spoken language acquisition, in which the computer learns not only to recognize word categories by sound, but also to enrich the words it learns with semantics by grounding them in images.
Tasks	Language Acquisition, Speech Recognition
Published	2017-01-25
URL	http://arxiv.org/abs/1701.07481v3
PDF	http://arxiv.org/pdf/1701.07481v3.pdf
PWC	https://paperswithcode.com/paper/learning-word-like-units-from-joint-audio
Repo
Framework

Robust Imitation of Diverse Behaviors


Title	Robust Imitation of Diverse Behaviors
Authors	Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess
Abstract	Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train. In this paper, we show how to combine the favourable aspects of these two approaches. The base of our model is a new type of variational autoencoder on demonstration trajectories that learns semantic policy embeddings. We show that these embeddings can be learned on a 9 DoF Jaco robot arm in reaching tasks, and then smoothly interpolated with a resulting smooth interpolation of reaching behavior. Leveraging these policy representations, we develop a new version of GAIL that (1) is much more robust than the purely-supervised controller, especially with few demonstrations, and (2) avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not. We demonstrate our approach on learning diverse gaits from demonstration on a 2D biped and a 62 DoF 3D humanoid in the MuJoCo physics environment.
Tasks	Imitation Learning
Published	2017-07-10
URL	http://arxiv.org/abs/1707.02747v2
PDF	http://arxiv.org/pdf/1707.02747v2.pdf
PWC	https://paperswithcode.com/paper/robust-imitation-of-diverse-behaviors
Repo
Framework

Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF


Title	Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF
Authors	Tuan Do, James Pustejovsky
Abstract	Event learning is one of the most important problems in AI. However, notwithstanding significant research efforts, it is still a very complex task, especially when the events involve the interaction of humans or agents with other objects, as it requires modeling human kinematics and object movements. This study proposes a methodology for learning complex human-object interaction (HOI) events, involving the recording, annotation and classification of event interactions. For annotation, we allow multiple interpretations of a motion capture by slicing over its temporal span; for classification, we use Long Short-Term Memory (LSTM) sequential models with a Conditional Random Field (CRF) to constrain the outputs. Using a setup involving captures of human-object interaction as three-dimensional inputs, we argue that this approach could be used for event types involving complex spatio-temporal dynamics.
Tasks	Human-Object Interaction Detection, Motion Capture
Published	2017-09-30
URL	http://arxiv.org/abs/1710.00262v1
PDF	http://arxiv.org/pdf/1710.00262v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-event-learning-of-human-object
Repo
Framework