Paper Group ANR 299
Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model
Title | Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-Sequence Model |
Authors | Pin-Jung Chen, I-Hung Hsu, Yi-Yao Huang, Hung-Yi Lee |
Abstract | We apply a sequence-to-sequence model to mitigate the impact of speech recognition errors on open-domain end-to-end dialog generation. We cast the task as a domain adaptation problem in which ASR transcriptions and the original text form two different domains. Our proposed model uses a separate encoder for each domain and encourages their hidden states to be similar, so that the decoder predicts the same dialog text from either input. The method shows that the sequence-to-sequence model can learn that an ASR transcription and its original text carry the same meaning, and thereby suppress speech recognition errors. Experimental results on the Cornell movie dialog dataset demonstrate that the domain adaptation system helps the spoken dialog system generate responses more similar to the original text answers. |
Tasks | Chatbot, Domain Adaptation, Speech Recognition |
Published | 2017-09-22 |
URL | http://arxiv.org/abs/1709.07862v2 |
http://arxiv.org/pdf/1709.07862v2.pdf | |
PWC | https://paperswithcode.com/paper/mitigating-the-impact-of-speech-recognition |
Repo | |
Framework | |
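The dual-encoder idea described in the abstract above can be illustrated with a short PyTorch sketch (a hypothetical reconstruction, not the authors' released code): two GRU encoders feed one shared decoder, and an auxiliary L2 term pulls the ASR-side hidden state towards the clean-text hidden state so the decoder behaves the same for both domains. Module names, sizes, and the loss weighting are assumptions.

```python
import torch
import torch.nn as nn

class DualEncoderSeq2Seq(nn.Module):
    """Sketch: separate encoders for clean text and ASR transcriptions,
    a shared decoder, and a penalty that aligns the two hidden states."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.enc_text = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.enc_asr = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode(self, tokens, domain):
        enc = self.enc_text if domain == "text" else self.enc_asr
        _, h = enc(self.embed(tokens))       # h: (1, B, hid_dim)
        return h

    def decode_logits(self, h, response_in):
        out, _ = self.decoder(self.embed(response_in), h)
        return self.out(out)                 # (B, T, vocab)

def training_loss(model, text_in, asr_in, response_in, response_out, alpha=1.0):
    ce = nn.CrossEntropyLoss()
    h_text = model.encode(text_in, "text")
    h_asr = model.encode(asr_in, "asr")
    # generation loss from both domains, sharing the same target response
    loss = 0.0
    for h in (h_text, h_asr):
        logits = model.decode_logits(h, response_in)
        loss = loss + ce(logits.reshape(-1, logits.size(-1)), response_out.reshape(-1))
    # domain-adaptation term (assumed form): pull the ASR encoding toward the text encoding
    loss = loss + alpha * torch.mean((h_asr - h_text.detach()) ** 2)
    return loss
```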
An Empirical Bayes Approach for High Dimensional Classification
Title | An Empirical Bayes Approach for High Dimensional Classification |
Authors | Yunbo Ouyang, Feng Liang |
Abstract | We propose an empirical Bayes estimator based on a Dirichlet process mixture model for estimating the sparse normalized mean difference, which can be applied directly to high dimensional linear classification. In theory, we build a bridge connecting the estimation error of the mean difference to the misclassification error, and provide sufficient conditions for sub-optimal and optimal classifiers. In implementation, a variational Bayes algorithm is developed to compute the posterior efficiently, and it can be parallelized to handle the ultra-high dimensional case. |
Tasks | |
Published | 2017-02-16 |
URL | http://arxiv.org/abs/1702.05056v1 |
http://arxiv.org/pdf/1702.05056v1.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-bayes-approach-for-high |
Repo | |
Framework | |
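As context for why the estimated mean difference drives classification error, recall the standard linear discriminant setting the abstract above refers to (a textbook formulation with generic symbols, not notation taken from the paper): with two Gaussian classes sharing a covariance, the Bayes rule is linear in the mean difference, so the quality of a plug-in estimate of that vector governs the excess misclassification risk, which is the bridge the paper formalizes.

```latex
% Two classes x \mid y = \pm 1 \sim N(\mu_\pm, \Sigma) with equal priors; the Bayes rule is
\hat{y}(x) = \operatorname{sign}\!\Big( \delta^\top \Sigma^{-1} \big( x - \tfrac{\mu_+ + \mu_-}{2} \big) \Big),
\qquad \delta = \mu_+ - \mu_- .
% A plug-in classifier replaces \Sigma^{-1}\delta with an estimate \hat{\beta};
% under regularity conditions its excess risk is controlled by how well
% \hat{\beta} approximates \Sigma^{-1}\delta.
```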
Recovering Dense Tissue Multispectral Signal from in vivo RGB Images
Title | Recovering Dense Tissue Multispectral Signal from in vivo RGB Images |
Authors | Jianyu Lin, Neil T. Clancy, Daniel S. Elson |
Abstract | Hyperspectral/multispectral imaging (HSI/MSI) contains rich information for clinical applications, such as 1) narrow band imaging for vascular visualisation; 2) oxygen saturation for intraoperative perfusion monitoring and clinical decision making [1]; 3) tissue classification and identification of pathology [2]. Current systems that provide a pixel-level HSI/MSI signal can generally be divided into two types: spatial scanning and spectral scanning. However, the trade-off between spatial/spectral resolution, acquisition time, and hardware complexity hampers implementation in real-world applications, especially intra-operatively. Acquiring high resolution images in real time is important for intra-operative HSI/MSI, to alleviate the side effects caused by breathing, heartbeat, and other sources of motion. Therefore, we developed an algorithm to recover a pixel-level MSI stack using only snapshot RGB images captured with a normal camera. We refer to this technique as “super-spectral-resolution”. The proposed method enables recovery of pixel-level-dense MSI signals with 24 spectral bands at ~11 frames per second (FPS) on a GPU. Multispectral data captured from porcine bowel and sheep/rabbit uteri in vivo were used for training, and the algorithm has been validated on unseen in vivo animal experiments. |
Tasks | Decision Making |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1707.03468v1 |
http://arxiv.org/pdf/1707.03468v1.pdf | |
PWC | https://paperswithcode.com/paper/recovering-dense-tissue-multispectral-signal |
Repo | |
Framework | |
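A minimal sketch of the kind of per-pixel mapping the abstract above describes, reconstructing a 24-band multispectral stack from a 3-channel RGB image; the network shape, layer sizes, and output range are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class RGBToMSI(nn.Module):
    """Per-pixel regressor from 3 RGB channels to 24 spectral bands,
    implemented with 1x1 convolutions so it runs on whole images."""

    def __init__(self, n_bands=24, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, width, kernel_size=1), nn.ReLU(),
            nn.Conv2d(width, width, kernel_size=1), nn.ReLU(),
            nn.Conv2d(width, n_bands, kernel_size=1), nn.Sigmoid(),  # assumed reflectance in [0, 1]
        )

    def forward(self, rgb):           # rgb: (B, 3, H, W)
        return self.net(rgb)          # msi: (B, 24, H, W)

# usage: train with an L1/L2 loss against co-registered multispectral ground truth
model = RGBToMSI()
rgb = torch.rand(1, 3, 256, 256)
msi = model(rgb)
print(msi.shape)  # torch.Size([1, 24, 256, 256])
```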
Types of Cognition and its Implications for future High-Level Cognitive Machines
Title | Types of Cognition and its Implications for future High-Level Cognitive Machines |
Authors | Camilo Miguel Signorelli |
Abstract | This work summarizes part of the current knowledge on high-level cognitive processes and their relation to biological hardware. On this basis, it is possible to identify some paradoxes that could impact the development of future technologies and artificial intelligence: we may be able to build a High-Level Cognitive Machine only by sacrificing the principal attribute of a machine, its accuracy. |
Tasks | |
Published | 2017-06-05 |
URL | http://arxiv.org/abs/1706.01443v1 |
http://arxiv.org/pdf/1706.01443v1.pdf | |
PWC | https://paperswithcode.com/paper/types-of-cognition-and-its-implications-for |
Repo | |
Framework | |
A clever elimination strategy for efficient minimal solvers
Title | A clever elimination strategy for efficient minimal solvers |
Authors | Zuzana Kukelova, Joe Kileel, Bernd Sturmfels, Tomas Pajdla |
Abstract | We present a new insight into the systematic generation of minimal solvers in computer vision, which leads to smaller and faster solvers. Many minimal problem formulations are coupled sets of linear and polynomial equations where image measurements enter the linear equations only. We show that it is useful to solve such systems by first eliminating all the unknowns that do not appear in the linear equations and then extending the solutions to the remaining unknowns. This can be generalized to fully non-linear systems by linearization via lifting. We demonstrate that this approach leads to more efficient solvers in three problems of partially calibrated relative camera pose computation with unknown focal length and/or radial distortion. Our approach also generates new interesting constraints on the fundamental matrices of partially calibrated cameras, which were not known before. |
Tasks | |
Published | 2017-03-15 |
URL | http://arxiv.org/abs/1703.05289v1 |
http://arxiv.org/pdf/1703.05289v1.pdf | |
PWC | https://paperswithcode.com/paper/a-clever-elimination-strategy-for-efficient |
Repo | |
Framework | |
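The elimination strategy in the abstract above can be summarised schematically (our paraphrase of the abstract with generic symbols, not the paper's notation): split the unknowns by whether they appear in the linear measurement equations, eliminate the others first, solve the smaller system, then extend the solutions.

```latex
% Unknowns x = (u, v): v appears in the linear equations, u does not;
% m denotes the image measurements.
\text{system:}\quad A(m)\, v = 0 \;\;(\text{linear in } v), \qquad P(u, v) = 0 \;\;(\text{polynomial}).
% Step 1: eliminate u from P(u, v) = 0 to obtain constraints \tilde{P}(v) = 0 in v only.
% Step 2: solve the smaller system \{ A(m)\, v = 0,\ \tilde{P}(v) = 0 \} for v.
% Step 3: extend each solution v to the eliminated unknowns u via P(u, v) = 0.
```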
Learning Macromanagement in StarCraft from Replays using Deep Learning
Title | Learning Macromanagement in StarCraft from Replays using Deep Learning |
Authors | Niels Justesen, Sebastian Risi |
Abstract | The real-time strategy game StarCraft has proven to be a challenging environment for artificial intelligence techniques, and as a result, current state-of-the-art solutions consist of numerous hand-crafted modules. In this paper, we show how macromanagement decisions in StarCraft can be learned directly from game replays using deep learning. Neural networks are trained on 789,571 state-action pairs extracted from 2,005 replays of highly skilled players, achieving top-1 and top-3 error rates of 54.6% and 22.9% in predicting the next build action. By integrating the trained network into UAlbertaBot, an open source StarCraft bot, the system can significantly outperform the game’s built-in Terran bot, and play competitively against UAlbertaBot with a fixed rush strategy. To our knowledge, this is the first time macromanagement tasks have been learned directly from replays in StarCraft. While the best hand-crafted strategies are still state-of-the-art, the deep network approach is able to express a wide range of different strategies, so improving the network’s performance further with deep reinforcement learning is a promising avenue for future research. Ultimately this approach could lead to strong StarCraft bots that are less reliant on hard-coded strategies. |
Tasks | Starcraft |
Published | 2017-07-12 |
URL | http://arxiv.org/abs/1707.03743v1 |
http://arxiv.org/pdf/1707.03743v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-macromanagement-in-starcraft-from |
Repo | |
Framework | |
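A minimal sketch of the supervised setup in the abstract above, assuming a fixed-length numeric game-state vector and a softmax over build actions; the feature dimension, network shape, and action count are placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

N_STATE_FEATURES = 128   # placeholder size of the encoded game state
N_BUILD_ACTIONS = 60     # placeholder number of build actions

# simple feed-forward classifier: game state -> distribution over the next build action
model = nn.Sequential(
    nn.Linear(N_STATE_FEATURES, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N_BUILD_ACTIONS),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(states, actions):
    """states: (B, N_STATE_FEATURES) float tensor extracted from replays;
    actions: (B,) long tensor with the next build action actually taken."""
    optimizer.zero_grad()
    loss = loss_fn(model(states), actions)
    loss.backward()
    optimizer.step()
    return loss.item()

def predict_topk(state, k=3):
    """At play time, a bot can act on the top-k predicted build actions."""
    probs = torch.softmax(model(state.unsqueeze(0)), dim=-1)
    return torch.topk(probs, k, dim=-1).indices.squeeze(0)
```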
Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation
Title | Later-stage Minimum Bayes-Risk Decoding for Neural Machine Translation |
Authors | Raphael Shu, Hideki Nakayama |
Abstract | Sequence generation models have long relied on the beam search algorithm to generate output sequences. However, the correctness of beam search degrades when a model is over-confident about a suboptimal prediction. In this paper, we propose to perform minimum Bayes-risk (MBR) decoding for some extra steps at a later stage. To speed up MBR decoding, we compute the Bayes risks on GPU in batch mode. In our experiments, we found that MBR reranking works with a large beam size. Later-stage MBR decoding is shown to outperform simple MBR reranking in machine translation tasks. |
Tasks | Machine Translation |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03169v2 |
http://arxiv.org/pdf/1704.03169v2.pdf | |
PWC | https://paperswithcode.com/paper/later-stage-minimum-bayes-risk-decoding-for |
Repo | |
Framework | |
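A small sketch of plain MBR reranking, the baseline that the later-stage method in the abstract above extends: pick the candidate with the highest expected gain (lowest expected risk) under the model's own distribution over candidates. The similarity function here is a cheap token-overlap stand-in for BLEU, and the batched GPU computation from the paper is omitted.

```python
import math

def token_overlap(hyp, ref):
    """Cheap stand-in for a sentence-level similarity such as BLEU."""
    hyp_set, ref_set = set(hyp), set(ref)
    if not hyp_set or not ref_set:
        return 0.0
    return len(hyp_set & ref_set) / len(hyp_set | ref_set)

def mbr_rerank(candidates, log_probs):
    """candidates: list of token lists from beam search;
    log_probs: their model scores. Returns the minimum Bayes-risk candidate."""
    # normalise model scores into a distribution over the candidate list
    m = max(log_probs)
    probs = [math.exp(lp - m) for lp in log_probs]
    z = sum(probs)
    probs = [p / z for p in probs]

    # expected gain of each candidate against all others, weighted by probability
    def expected_gain(hyp):
        return sum(p * token_overlap(hyp, other)
                   for p, other in zip(probs, candidates))

    return max(candidates, key=expected_gain)

beam = [["the", "cat", "sat"], ["a", "cat", "sat"], ["the", "dog", "ran"]]
scores = [-1.2, -1.3, -2.5]
print(mbr_rerank(beam, scores))
```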
Intelligent EHRs: Predicting Procedure Codes From Diagnosis Codes
Title | Intelligent EHRs: Predicting Procedure Codes From Diagnosis Codes |
Authors | Hasham Ul Haq, Rameel Ahmad, Sibt Ul Hussain |
Abstract | In order to submit a claim to insurance companies, a doctor needs to code a patient encounter with both the diagnoses (ICDs) and the procedures performed (CPTs) in an Electronic Health Record (EHR). Identifying and applying the relevant procedure codes is a cumbersome and time-consuming task, as a doctor has to choose from around 13,000 procedure codes with no predefined one-to-one mapping. In this paper, we propose a state-of-the-art deep learning method for automatic and intelligent coding of procedures (CPTs) from the diagnosis codes (ICDs) entered by the doctor. Specifically, we cast the learning problem as multi-label classification and use distributed representations to learn the input mapping of high-dimensional sparse ICD codes. Our final model, trained on 2.3 million claims, outperforms existing rule-based probabilistic and association-rule mining based methods and achieves a recall of 90@3. |
Tasks | Multi-Label Classification |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00481v1 |
http://arxiv.org/pdf/1712.00481v1.pdf | |
PWC | https://paperswithcode.com/paper/intelligent-ehrs-predicting-procedure-codes |
Repo | |
Framework | |
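A minimal sketch of casting the problem in the abstract above as multi-label classification with a learned distributed representation of sparse ICD codes; the vocabulary sizes, layer widths, and recall@3 helper are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

N_ICD, N_CPT, EMB = 15000, 13000, 128   # placeholder vocabulary sizes

class ICDToCPT(nn.Module):
    """Embed the sparse, variable-length set of ICD codes on a claim,
    pool the embeddings, and predict CPT codes with independent sigmoids."""

    def __init__(self):
        super().__init__()
        # EmbeddingBag averages the embeddings of all ICD codes on a claim
        self.icd_emb = nn.EmbeddingBag(N_ICD, EMB, mode="mean")
        self.head = nn.Sequential(nn.Linear(EMB, 256), nn.ReLU(),
                                  nn.Linear(256, N_CPT))

    def forward(self, icd_ids, offsets):
        return self.head(self.icd_emb(icd_ids, offsets))   # logits over CPT codes

model = ICDToCPT()
criterion = nn.BCEWithLogitsLoss()   # multi-label objective

def recall_at_k(logits, targets, k=3):
    """Fraction of true CPT codes that appear in the top-k predictions
    (targets is a 0/1 float matrix of shape (batch, N_CPT))."""
    topk = logits.topk(k, dim=-1).indices
    hits = torch.zeros_like(targets).scatter_(1, topk, 1.0) * targets
    return (hits.sum() / targets.sum().clamp(min=1)).item()
```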
GAN and VAE from an Optimal Transport Point of View
Title | GAN and VAE from an Optimal Transport Point of View |
Authors | Aude Genevay, Gabriel Peyré, Marco Cuturi |
Abstract | This short article revisits some of the ideas introduced in arXiv:1701.07875 and arXiv:1705.07642 in a simple setup. This sheds some light on the connections between Variational Autoencoders (VAE), Generative Adversarial Networks (GAN) and Minimum Kantorovitch Estimators (MKE). |
Tasks | |
Published | 2017-06-06 |
URL | http://arxiv.org/abs/1706.01807v1 |
http://arxiv.org/pdf/1706.01807v1.pdf | |
PWC | https://paperswithcode.com/paper/gan-and-vae-from-an-optimal-transport-point |
Repo | |
Framework | |
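For reference, the Minimum Kantorovitch Estimator mentioned in the abstract above fits a parametric generative model by minimising an optimal-transport distance to the data distribution; the formulas below are the standard definitions (primal Kantorovich problem and, for the distance cost, the Lipschitz dual used by Wasserstein GANs), not results specific to this note.

```latex
\hat{\theta} \in \arg\min_{\theta} \; W_c(\mu_\theta, \nu),
\qquad
W_c(\mu, \nu) = \min_{\pi \in \Pi(\mu, \nu)} \int c(x, y) \, \mathrm{d}\pi(x, y),
% where \mu_\theta is the model distribution (e.g. the pushforward of a latent
% prior by the generator) and \nu the data distribution. For c(x, y) = \|x - y\|,
% the Kantorovich--Rubinstein dual gives the form optimised by Wasserstein GANs:
W_1(\mu, \nu) = \sup_{\|f\|_{\mathrm{Lip}} \le 1} \; \mathbb{E}_{x \sim \mu}[f(x)] - \mathbb{E}_{y \sim \nu}[f(y)].
```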
Stream Reasoning in Temporal Datalog
Title | Stream Reasoning in Temporal Datalog |
Authors | Alessandro Ronca, Mark Kaminski, Bernardo Cuenca Grau, Boris Motik, Ian Horrocks |
Abstract | In recent years, there has been an increasing interest in extending traditional stream processing engines with logical, rule-based, reasoning capabilities. This poses significant theoretical and practical challenges since rules can derive new information and propagate it both towards past and future time points; as a result, streamed query answers can depend on data that has not yet been received, as well as on data that arrived far in the past. Stream reasoning algorithms, however, must be able to stream out query answers as soon as possible, and can only keep a limited number of previous input facts in memory. In this paper, we propose novel reasoning problems to deal with these challenges, and study their computational properties on Datalog extended with a temporal sort and the successor function (a core rule-based language for stream reasoning applications). |
Tasks | |
Published | 2017-11-10 |
URL | http://arxiv.org/abs/1711.04013v2 |
http://arxiv.org/pdf/1711.04013v2.pdf | |
PWC | https://paperswithcode.com/paper/stream-reasoning-in-temporal-datalog |
Repo | |
Framework | |
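To make the setting in the abstract above concrete, here is an illustrative rule in Datalog extended with a temporal sort and the successor function (our own toy example, not one from the paper, with t+1 written for the successor of t): a fact derived at time t+1 can depend on facts at both t and t+1, which is why streamed answers may hinge on data that has not arrived yet.

```latex
% toy rule over a temperature stream: raise an alarm at time t+1
% if a unit reads High at two consecutive time points
\mathit{Alarm}(u,\, t+1) \;\leftarrow\; \mathit{High}(u,\, t) \wedge \mathit{High}(u,\, t+1).
```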
Adaptive Questionnaires for Direct Identification of Optimal Product Design
Title | Adaptive Questionnaires for Direct Identification of Optimal Product Design |
Authors | Max Yi Ren, Clayton Scott |
Abstract | We consider the problem of identifying the most profitable product design from a finite set of candidates under unknown consumer preference. A standard approach to this problem follows a two-step strategy: First, estimate the preference of the consumer population, represented as a point in part-worth space, using an adaptive discrete-choice questionnaire. Second, integrate the estimated part-worth vector with engineering feasibility and cost models to determine the optimal design. In this work, we (1) demonstrate that accurate preference estimation is neither necessary nor sufficient for identifying the optimal design, (2) introduce a novel adaptive questionnaire that leverages knowledge about engineering feasibility and manufacturing costs to directly determine the optimal design, and (3) interpret product design in terms of a nonlinear segmentation of part-worth space, and use this interpretation to illuminate the intrinsic difficulty of optimal design in the presence of noisy questionnaire responses. We establish the superiority of the proposed approach using a well-documented optimal product design task. This study demonstrates how the identification of optimal product design can be accelerated by integrating marketing and manufacturing knowledge into the adaptive questionnaire. |
Tasks | |
Published | 2017-01-05 |
URL | http://arxiv.org/abs/1701.01231v1 |
http://arxiv.org/pdf/1701.01231v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-questionnaires-for-direct |
Repo | |
Framework | |
Deep Probabilistic Programming
Title | Deep Probabilistic Programming |
Authors | Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei |
Abstract | We propose Edward, a Turing-complete probabilistic programming language. Edward defines two compositional representations—random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same model using a variety of composable inference methods, ranging from point estimation to variational inference to MCMC. In addition, Edward can reuse the modeling representation as part of inference, facilitating the design of rich variational models and generative adversarial networks. For efficiency, Edward is integrated into TensorFlow, providing significant speedups over existing probabilistic systems. For example, we show on a benchmark logistic regression task that Edward is at least 35x faster than Stan and 6x faster than PyMC3. Further, Edward incurs no runtime overhead: it is as fast as handwritten TensorFlow. |
Tasks | Probabilistic Programming |
Published | 2017-01-13 |
URL | http://arxiv.org/abs/1701.03757v2 |
http://arxiv.org/pdf/1701.03757v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-probabilistic-programming |
Repo | |
Framework | |
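A short example of the Edward modelling style described in the abstract above, on the Bayesian logistic regression task the abstract benchmarks; this follows the Edward 1.x / TensorFlow 1.x API as shown in Edward's tutorials, and exact argument names may differ between versions.

```python
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli, Normal

# toy data for a D-dimensional logistic regression
N, D = 500, 10
X_train = np.random.randn(N, D).astype(np.float32)
true_w = np.random.randn(D).astype(np.float32)
y_train = (X_train @ true_w > 0).astype(np.int32)

# model: random variables are first-class objects
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Bernoulli(logits=ed.dot(X, w) + b)

# variational family
qw = Normal(loc=tf.Variable(tf.zeros(D)),
            scale=tf.nn.softplus(tf.Variable(tf.zeros(D))))
qb = Normal(loc=tf.Variable(tf.zeros(1)),
            scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))

# inference is compositional too: KLqp here, but MCMC variants plug in the same way
inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_iter=1000)
```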
Learning Word-Like Units from Joint Audio-Visual Analysis
Title | Learning Word-Like Units from Joint Audio-Visual Analysis |
Authors | David Harwath, James R. Glass |
Abstract | Given a collection of images and spoken audio captions, we present a method for discovering word-like acoustic units in the continuous speech signal and grounding them to semantically relevant image regions. For example, our model is able to detect spoken instances of the word ‘lighthouse’ within an utterance and associate them with image regions containing lighthouses. We do not use any form of conventional automatic speech recognition, nor do we use any text transcriptions or conventional linguistic annotations. Our model effectively implements a form of spoken language acquisition, in which the computer learns not only to recognize word categories by sound, but also to enrich the words it learns with semantics by grounding them in images. |
Tasks | Language Acquisition, Speech Recognition |
Published | 2017-01-25 |
URL | http://arxiv.org/abs/1701.07481v3 |
http://arxiv.org/pdf/1701.07481v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-word-like-units-from-joint-audio |
Repo | |
Framework | |
Robust Imitation of Diverse Behaviors
Title | Robust Imitation of Diverse Behaviors |
Authors | Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess |
Abstract | Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train. In this paper, we show how to combine the favourable aspects of these two approaches. The base of our model is a new type of variational autoencoder on demonstration trajectories that learns semantic policy embeddings. We show that these embeddings can be learned on a 9 DoF Jaco robot arm in reaching tasks, and then smoothly interpolated, producing a correspondingly smooth interpolation of reaching behavior. Leveraging these policy representations, we develop a new version of GAIL that (1) is much more robust than the purely-supervised controller, especially with few demonstrations, and (2) avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not. We demonstrate our approach on learning diverse gaits from demonstration on a 2D biped and a 62 DoF 3D humanoid in the MuJoCo physics environment. |
Tasks | Imitation Learning |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02747v2 |
http://arxiv.org/pdf/1707.02747v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-imitation-of-diverse-behaviors |
Repo | |
Framework | |
Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF
Title | Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF |
Authors | Tuan Do, James Pustejovsky |
Abstract | Event learning is one of the most important problems in AI. However, notwithstanding significant research efforts, it is still a very complex task, especially when the events involve the interaction of humans or agents with other objects, as it requires modeling human kinematics and object movements. This study proposes a methodology for learning complex human-object interaction (HOI) events, involving the recording, annotation and classification of event interactions. For annotation, we allow multiple interpretations of a motion capture by slicing over its temporal span; for classification, we use Long Short-Term Memory (LSTM) sequential models with a Conditional Random Field (CRF) to constrain the outputs. Using a setup involving captures of human-object interaction as three-dimensional inputs, we argue that this approach could be used for event types involving complex spatio-temporal dynamics. |
Tasks | Human-Object Interaction Detection, Motion Capture |
Published | 2017-09-30 |
URL | http://arxiv.org/abs/1710.00262v1 |
http://arxiv.org/pdf/1710.00262v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-event-learning-of-human-object |
Repo | |
Framework | |
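A compact sketch of the LSTM + CRF combination named in the abstract above, using the third-party `pytorch-crf` package for the CRF layer; the input is a generic per-frame feature sequence standing in for the 3D motion-capture features, and all dimensions are placeholders.

```python
import torch
import torch.nn as nn
from torchcrf import CRF   # pip install pytorch-crf

class LSTMCRFEventTagger(nn.Module):
    """Bi-LSTM over per-frame features, with a CRF enforcing
    valid transitions between output event labels."""

    def __init__(self, feat_dim=30, hid_dim=64, num_tags=8):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hid_dim, batch_first=True, bidirectional=True)
        self.emit = nn.Linear(2 * hid_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, frames, tags, mask=None):
        emissions = self.emit(self.lstm(frames)[0])
        return -self.crf(emissions, tags, mask=mask)   # negative log-likelihood

    def decode(self, frames, mask=None):
        emissions = self.emit(self.lstm(frames)[0])
        return self.crf.decode(emissions, mask=mask)   # best tag sequence per example

model = LSTMCRFEventTagger()
frames = torch.randn(2, 50, 30)                 # (batch, time, features)
tags = torch.randint(0, 8, (2, 50))
print(model.loss(frames, tags).item())
print(model.decode(frames)[0][:10])
```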