January 28, 2020

Paper Group ANR 993

Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks. Emotion Classification in Response to Tactile Enhanced Multimedia using Frequency Domain Features of Brain Signals. Class-dependent Compression of Deep Neural Networks. Simultaneous Feature Aggregating and Hashing for Compact Binary Code Learning. Radiological imag …

Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks


Title	Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks
Authors	Igor Shalyminov, Sungjin Lee, Arash Eshghi, Oliver Lemon
Abstract	Goal-oriented dialogue systems are now being widely adopted in industry where it is of key importance to maintain a rapid prototyping cycle for new products and domains. Data-driven dialogue system development has to be adapted to meet this requirement — therefore, reducing the amount of data and annotations necessary for training such systems is a central research problem. In this paper, we present the Dialogue Knowledge Transfer Network (DiKTNet), a state-of-the-art approach to goal-oriented dialogue generation which only uses a few example dialogues (i.e. few-shot learning), none of which has to be annotated. We achieve this by performing a 2-stage training. Firstly, we perform unsupervised dialogue representation pre-training on a large source of goal-oriented dialogues in multiple domains, the MetaLWOz corpus. Secondly, at the transfer stage, we train DiKTNet using this representation together with 2 other textual knowledge sources with different levels of generality: ELMo encoder and the main dataset’s source domains. Our main dataset is the Stanford Multi-Domain dialogue corpus. We evaluate our model on it in terms of BLEU and Entity F1 scores, and show that our approach significantly and consistently improves upon a series of baseline models as well as over the previous state-of-the-art dialogue generation model, ZSDG. The improvement upon the latter — up to 10% in Entity F1 and the average of 3% in BLEU score — is achieved using only the equivalent of 10% of ZSDG’s in-domain training data.
Tasks	Dialogue Generation, Few-Shot Learning, Goal-Oriented Dialogue Systems, Transfer Learning
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01302v1
PDF	https://arxiv.org/pdf/1910.01302v1.pdf
PWC	https://paperswithcode.com/paper/data-efficient-goal-oriented-conversation
Repo
Framework

Emotion Classification in Response to Tactile Enhanced Multimedia using Frequency Domain Features of Brain Signals


Title	Emotion Classification in Response to Tactile Enhanced Multimedia using Frequency Domain Features of Brain Signals
Authors	Aasim Raheel, Muhammad Majid, Syed Muhammad Anwar, Ulas Bagci
Abstract	Tactile enhanced multimedia is generated by synchronizing traditional multimedia clips with an electric heater and a fan to produce hot and cold air effects. The objective is to give viewers a more realistic and immersive feel of the multimedia content. The response to this enhanced multimedia content (mulsemedia) is evaluated in terms of appreciation/emotion using human brain signals. We observe and record electroencephalography (EEG) data using a commercially available four-channel MUSE headband. A total of 21 participants voluntarily took part in this study for EEG recordings. We extract frequency domain features from five different bands of each EEG channel. Four emotions, namely happy, relaxed, sad, and angry, are classified using a support vector machine in response to the tactile enhanced multimedia. An increased accuracy of 76.19% is achieved, compared to 63.41% using time domain features. Our results show that the selected frequency domain features could be better suited for emotion classification in mulsemedia studies.
Tasks	EEG, Emotion Classification
Published	2019-05-13
URL	https://arxiv.org/abs/1905.10423v1
PDF	https://arxiv.org/pdf/1905.10423v1.pdf
PWC	https://paperswithcode.com/paper/190510423
Repo
Framework
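
As a rough illustration of what "frequency domain features from five different bands" can look like, the sketch below sums discrete Fourier transform power per band for one channel. The band edges, the sampling rate, and the synthetic test signal are assumptions made for the example; the abstract does not specify them, and this is not the authors' feature pipeline.

```python
import cmath
import math

# Assumed band edges in Hz; the abstract only says "five different bands".
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(signal, fs):
    """Return summed DFT power per band for one channel sampled at fs Hz."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    # A real signal only needs the non-negative frequencies up to Nyquist.
    for k in range(n // 2 + 1):
        freq = k * fs / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        power = abs(coeff) ** 2 / n
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += power
    return powers


fs = 128  # hypothetical sampling rate in Hz
# Two seconds of a synthetic 10 Hz sinusoid standing in for an EEG channel.
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]
features = band_powers(signal, fs)
dominant = max(features, key=features.get)
print(dominant)
```

Per-band values like these, computed for each channel, would form the feature vector passed to a classifier such as a support vector machine. The naive transform here is quadratic in the signal length, so a fast Fourier transform would be the practical choice on real recordings.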

Class-dependent Compression of Deep Neural Networks


Title	Class-dependent Compression of Deep Neural Networks
Authors	Rahim Entezari, Olga Saukh
Abstract	Today’s deep neural networks require substantial computation resources for their training, storage, and inference, which limits their effective use on resource-constrained devices. Many recent research activities explore different options for compressing and optimizing deep models. On the one hand, in many real-world applications, we face the data imbalance challenge, i.e. when the number of labeled instances of one class considerably outweighs the number of labeled instances of the other class. On the other hand, applications may pose a class imbalance problem, i.e. higher number of false positives produced when training a model and optimizing its performance may be tolerable, yet the number of false negatives must stay low. The problem originates from the fact that some classes are more important for the application than others, e.g. detection problems in medical and surveillance domains. Motivated by the success of the lottery ticket hypothesis, in this paper we propose an iterative deep model compression technique, which keeps the number of false negatives of the compressed model close to the one of the original model at the price of increasing the number of false positives if necessary. Our experimental evaluation using two benchmark data sets shows that the resulting compressed sub-networks 1) achieve up to 35% lower number of false negatives than the compressed model without class optimization, 2) provide an overall higher AUC-ROC measure, and 3) use up to 99% fewer parameters compared to the original network.
Tasks	Model Compression, Network Pruning
Published	2019-09-23
URL	https://arxiv.org/abs/1909.10364v2
PDF	https://arxiv.org/pdf/1909.10364v2.pdf
PWC	https://paperswithcode.com/paper/190910364
Repo
Framework

Simultaneous Feature Aggregating and Hashing for Compact Binary Code Learning


Title	Simultaneous Feature Aggregating and Hashing for Compact Binary Code Learning
Authors	Thanh-Toan Do, Khoa Le, Tuan Hoang, Huu Le, Tam V. Nguyen, Ngai-Man Cheung
Abstract	Representing images by compact hash codes is an attractive approach for large-scale content-based image retrieval. In most state-of-the-art hashing-based image retrieval systems, for each image, local descriptors are first aggregated as a global representation vector. This global vector is then subjected to a hashing function to generate a binary hash code. In previous works, the aggregating and the hashing processes are designed independently. Hence these frameworks may generate suboptimal hash codes. In this paper, we first propose a novel unsupervised hashing framework in which feature aggregating and hashing are designed simultaneously and optimized jointly. Specifically, our joint optimization generates aggregated representations that can be better reconstructed by some binary codes. This leads to more discriminative binary hash codes and improved retrieval accuracy. In addition, the proposed method is flexible. It can be extended for supervised hashing. When the data label is available, the framework can be adapted to learn binary codes which minimize the reconstruction loss w.r.t. label vectors. Furthermore, we also propose a fast version of the state-of-the-art hashing method Binary Autoencoder to be used in our proposed frameworks. Extensive experiments on benchmark datasets under various settings show that the proposed methods outperform state-of-the-art unsupervised and supervised hashing methods.
Tasks	Content-Based Image Retrieval, Image Retrieval
Published	2019-04-24
URL	http://arxiv.org/abs/1904.11820v1
PDF	http://arxiv.org/pdf/1904.11820v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-feature-aggregating-and-hashing-2
Repo
Framework
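
For orientation, the conventional two-step pipeline that the abstract contrasts itself with (aggregate local descriptors first, then hash the global vector independently) can be sketched as follows. Mean pooling, sign-of-projection hashing, and all the numbers are assumptions chosen for illustration; the paper's contribution is to optimize the two steps jointly, which this sketch does not do.

```python
def aggregate(descriptors):
    """Mean-pool a list of local descriptors into one global vector."""
    n = len(descriptors)
    return [sum(col) / n for col in zip(*descriptors)]


def hash_code(vector, projections):
    """Emit one bit per projection row: 1 if the dot product is positive."""
    return [
        1 if sum(p * x for p, x in zip(row, vector)) > 0 else 0
        for row in projections
    ]


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical fixed projections producing 3-bit codes from 2-D vectors.
projections = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]

image_a = [[1.0, 2.0], [3.0, 0.0]]      # two local descriptors
image_b = [[-1.0, -2.0], [-3.0, 0.0]]
query = [[2.0, 1.0]]

code_a = hash_code(aggregate(image_a), projections)
code_b = hash_code(aggregate(image_b), projections)
code_q = hash_code(aggregate(query), projections)

# Retrieval ranks database images by Hamming distance to the query code.
print(hamming(code_q, code_a), hamming(code_q, code_b))
```

Because the aggregation here knows nothing about the hashing step, the pooled vectors are not chosen to be easy to binarize, which is the source of suboptimality the abstract points to.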

Radiological images and machine learning: trends, perspectives, and prospects


Title	Radiological images and machine learning: trends, perspectives, and prospects
Authors	Zhenwei Zhang, Ervin Sejdic
Abstract	The application of machine learning to radiological images is an increasingly active research area that is expected to grow in the next five to ten years. Recent advances in machine learning have the potential to recognize and classify complex patterns from different radiological imaging modalities such as x-rays, computed tomography, magnetic resonance imaging and positron emission tomography imaging. In many applications, machine learning based systems have shown comparable performance to human decision-making. The applications of machine learning are the key ingredients of future clinical decision making and monitoring systems. This review covers the fundamental concepts behind various machine learning techniques and their applications in several radiological imaging areas, such as medical image segmentation, brain function studies and neurological disease diagnosis, as well as computer-aided systems, image registration, and content-based image retrieval systems. We also briefly discuss current challenges and future directions regarding the application of machine learning in radiological imaging. By giving insight on how to take advantage of machine learning powered applications, we expect that clinicians can prevent and diagnose diseases more accurately and efficiently.
Tasks	Content-Based Image Retrieval, Decision Making, Image Registration, Image Retrieval, Medical Image Segmentation, Semantic Segmentation
Published	2019-03-27
URL	http://arxiv.org/abs/1903.11726v1
PDF	http://arxiv.org/pdf/1903.11726v1.pdf
PWC	https://paperswithcode.com/paper/radiological-images-and-machine-learning
Repo
Framework

Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion


Title	Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion
Authors	Tao Hu, Zhizhong Han, Abhinav Shrivastava, Matthias Zwicker
Abstract	We propose a novel approach for 3D shape completion by synthesizing multi-view depth maps. While previous work for shape completion relies on volumetric representations, meshes, or point clouds, we propose to use multi-view depth maps from a set of fixed viewing angles as our shape representation. This allows us to be free of the limitations of memory for volumetric representations and point clouds by casting shape completion into an image-to-image translation problem. Specifically, we render depth maps of the incomplete shape from a fixed set of viewpoints, and perform depth map completion in each view. Unlike an image-to-image translation network that completes each view separately, our novel network, multi-view completion net (MVCN), leverages information from all views of a 3D shape to help the completion of each single view. This enables MVCN to leverage more information from different depth views to achieve high accuracy in single depth view completion and keep the consistency among the completed depth images in different views. Benefiting from the multi-view representation and the novel network structure, MVCN significantly improves the accuracy of 3D shape completion in large-scale benchmarks compared to the state of the art.
Tasks	Image-to-Image Translation
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08366v4
PDF	https://arxiv.org/pdf/1904.08366v4.pdf
PWC	https://paperswithcode.com/paper/render4completion-synthesizing-multi-view
Repo
Framework

An Analysis of the Expressiveness of Deep Neural Network Architectures Based on Their Lipschitz Constants


Title	An Analysis of the Expressiveness of Deep Neural Network Architectures Based on Their Lipschitz Constants
Authors	SiQi Zhou, Angela P. Schoellig
Abstract	Deep neural networks (DNNs) have emerged as a popular mathematical tool for function approximation due to their capability of modelling highly nonlinear functions. Their applications range from image classification and natural language processing to learning-based control. Despite their empirical successes, there is still a lack of theoretical understanding of the representative power of such deep architectures. In this work, we provide a theoretical analysis of the expressiveness of fully-connected, feedforward DNNs with 1-Lipschitz activation functions. In particular, we characterize the expressiveness of a DNN by its Lipschitz constant. By leveraging random matrix theory, we show that, given sufficiently large and randomly distributed weights, the expected upper and lower bounds of the Lipschitz constant of a DNN and hence their expressiveness increase exponentially with depth and polynomially with width, which gives rise to the benefit of the depth of DNN architectures for efficient function approximation. This observation is consistent with established results based on alternative expressiveness measures of DNNs. In contrast to most of the existing work, our analysis based on the Lipschitz properties of DNNs is applicable to a wider range of activation nonlinearities and potentially allows us to make sensible comparisons between the complexity of a DNN and the function to be approximated by the DNN. We consider this work to be a step towards understanding the expressive power of DNNs and towards designing appropriate deep architectures for practical applications such as system control.
Tasks	Image Classification
Published	2019-12-24
URL	https://arxiv.org/abs/1912.11511v1
PDF	https://arxiv.org/pdf/1912.11511v1.pdf
PWC	https://paperswithcode.com/paper/an-analysis-of-the-expressiveness-of-deep
Repo
Framework
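
To make the role of depth concrete: for a fully-connected network with 1-Lipschitz activations, a standard upper bound on the Lipschitz constant (with respect to the Euclidean norm) is the product of the spectral norms of the layer weight matrices, so each added layer multiplies the bound. The sketch below computes that generic bound with a plain power iteration. It is not the paper's random-matrix analysis of expected bounds, and the helper names and toy weights are hypothetical.

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def spectral_norm(w, iters=200):
    """Estimate the largest singular value of w by power iteration on w^T w.

    The all-ones start vector is an arbitrary choice; the estimate is only
    reliable if it is not orthogonal to the top right singular vector.
    """
    wt = transpose(w)
    v = [1.0] * len(w[0])
    for _ in range(iters):
        v = matvec(wt, matvec(w, v))
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    wv = matvec(w, v)
    return math.sqrt(sum(x * x for x in wv))


def lipschitz_upper_bound(weights):
    """Product of per-layer spectral norms; biases do not affect the bound."""
    bound = 1.0
    for w in weights:
        bound *= spectral_norm(w)
    return bound


# Toy two-layer network with known spectral norms 2 and 3.
layers = [
    [[2.0, 0.0], [0.0, 1.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
bound = lipschitz_upper_bound(layers)
print(bound)
```

If every layer had spectral norm s greater than 1, a depth-d network would have bound s to the power d, which is the kind of exponential dependence on depth the abstract refers to.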

Learning Hash Function through Codewords


Title	Learning Hash Function through Codewords
Authors	Yinjie Huang, Michael Georgiopoulos, Georgios C. Anagnostopoulos
Abstract	In this paper, we propose a novel hash learning approach that has the following main distinguishing features, when compared to past frameworks. First, the codewords are utilized in the Hamming space as ancillary techniques to accomplish its hash learning task. These codewords, which are inferred from the data, attempt to capture grouping aspects of the data’s hash codes. Furthermore, the proposed framework is capable of addressing supervised, unsupervised and even semi-supervised hash learning scenarios. Additionally, the framework adopts a regularization term over the codewords, which automatically chooses the codewords for the problem. To efficiently solve the problem, a Block Coordinate Descent algorithm is presented in the paper. We also show that one step of the algorithm can be cast into several Support Vector Machine problems, which enables our algorithm to utilize efficient software packages. For the regularization term, a closed form solution of the proximal operator is provided in the paper. A series of comparative experiments focused on content-based image retrieval highlights its performance advantages.
Tasks	Content-Based Image Retrieval, Image Retrieval
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08639v1
PDF	http://arxiv.org/pdf/1902.08639v1.pdf
PWC	https://paperswithcode.com/paper/learning-hash-function-through-codewords
Repo
Framework

A Strong Baseline for Domain Adaptation and Generalization in Medical Imaging


Title	A Strong Baseline for Domain Adaptation and Generalization in Medical Imaging
Authors	Li Yao, Jordan Prosky, Ben Covington, Kevin Lyman
Abstract	This work provides a strong baseline for the problem of multi-source multi-target domain adaptation and generalization in medical imaging. Using a diverse collection of ten chest X-ray datasets, we empirically demonstrate the benefits of training medical imaging deep learning models on varied patient populations for generalization to out-of-sample domains.
Tasks	Domain Adaptation
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01638v1
PDF	http://arxiv.org/pdf/1904.01638v1.pdf
PWC	https://paperswithcode.com/paper/a-strong-baseline-for-domain-adaptation-and
Repo
Framework

Confident Head Circumference Measurement from Ultrasound with Real-time Feedback for Sonographers


Title	Confident Head Circumference Measurement from Ultrasound with Real-time Feedback for Sonographers
Authors	Samuel Budd, Matthew Sinclair, Bishesh Khanal, Jacqueline Matthew, David Lloyd, Alberto Gomez, Nicolas Toussaint, Emma Robinson, Bernhard Kainz
Abstract	Manual estimation of fetal Head Circumference (HC) from Ultrasound (US) is a key biometric for monitoring the healthy development of fetuses. Unfortunately, such measurements are subject to large inter-observer variability, resulting in low early-detection rates of fetal abnormalities. To address this issue, we propose a novel probabilistic Deep Learning approach for real-time automated estimation of fetal HC. This system feeds back statistics on measurement robustness to inform users how confident a deep neural network is in evaluating suitable views acquired during free-hand ultrasound examination. In real-time scenarios, this approach may be exploited to guide operators to scan planes that are as close as possible to the underlying distribution of training images, for the purpose of improving inter-operator consistency. We train on free-hand ultrasound data from over 2000 subjects (2848 training/540 test) and show that our method is able to predict HC measurements within 1.81$\pm$1.65mm deviation from the ground truth, with 50% of the test images fully contained within the predicted confidence margins, and an average of 1.82$\pm$1.78mm deviation from the margin for the remaining cases that are not fully contained.
Tasks
Published	2019-08-07
URL	https://arxiv.org/abs/1908.02582v1
PDF	https://arxiv.org/pdf/1908.02582v1.pdf
PWC	https://paperswithcode.com/paper/confident-head-circumference-measurement-from
Repo
Framework

Neural Heterogeneous Scheduler


Title	Neural Heterogeneous Scheduler
Authors	Tegg Taekyong Sung, Valliappa Chockalingam, Alex Yahja, Bo Ryu
Abstract	Access to parallel and distributed computation has enabled researchers and developers to improve algorithms and performance in many applications. Recent research has focused on next generation special purpose systems with multiple kinds of coprocessors, known as heterogeneous system-on-chips (SoC). In this paper, we introduce a method to intelligently schedule–and learn to schedule–a stream of tasks to available processing elements in such a system. We use deep reinforcement learning enabling complex sequential decision making and empirically show that our reinforcement learning system provides for a viable, better alternative to conventional scheduling heuristics with respect to minimizing execution time.
Tasks	Decision Making
Published	2019-06-09
URL	https://arxiv.org/abs/1906.03724v1
PDF	https://arxiv.org/pdf/1906.03724v1.pdf
PWC	https://paperswithcode.com/paper/neural-heterogeneous-scheduler
Repo
Framework

The Semantic Web Rule Language Expressiveness Extensions-A Survey


Title	The Semantic Web Rule Language Expressiveness Extensions-A Survey
Authors	Abba Lawan, Abdur Rakib
Abstract	The Semantic Web Rule Language (SWRL) is a direct extension of OWL 2 DL with a subset of RuleML, and it is designed to be the rule language of the Semantic Web. This paper explores the state-of-the-art of SWRL’s expressiveness extensions proposed over time. As a motivation, the effectiveness of the SWRL/OWL combination in modeling domain facts is discussed while some of the common expressive limitations of the combination are also highlighted. The paper then classifies and presents the relevant language extensions of the SWRL and their added expressive powers to the original SWRL definition. Furthermore, it provides a comparative analysis of the syntax and semantics of the proposed extensions. In conclusion, the decidability requirement and usability of each expressiveness extension are evaluated towards an efficient inclusion into the OWL ontologies.
Tasks
Published	2019-03-27
URL	http://arxiv.org/abs/1903.11723v1
PDF	http://arxiv.org/pdf/1903.11723v1.pdf
PWC	https://paperswithcode.com/paper/the-semantic-web-rule-language-expressiveness
Repo
Framework

Neural Network-Based Modeling of Phonetic Durations


Title	Neural Network-Based Modeling of Phonetic Durations
Authors	Xizi Wei, Melvyn Hunt, Adrian Skilling
Abstract	A deep neural network (DNN)-based model has been developed to predict non-parametric distributions of durations of phonemes in specified phonetic contexts and used to explore which factors influence durations most. Major factors in US English are pre-pausal lengthening, lexical stress, and speaking rate. The model can be used to check that text-to-speech (TTS) training speech follows the script and words are pronounced as expected. Duration prediction is poorer with training speech for automatic speech recognition (ASR) because the training corpus typically consists of single utterances from many speakers and is often noisy or casually spoken. Low probability durations in ASR training material nevertheless mostly correspond to non-standard speech, with some having disfluencies. Children’s speech is disproportionately present in these utterances, since children show much more variation in timing.
Tasks	Speech Recognition
Published	2019-09-06
URL	https://arxiv.org/abs/1909.03030v1
PDF	https://arxiv.org/pdf/1909.03030v1.pdf
PWC	https://paperswithcode.com/paper/neural-network-based-modeling-of-phonetic
Repo
Framework

On the Effectiveness of Low-Rank Matrix Factorization for LSTM Model Compression


Title	On the Effectiveness of Low-Rank Matrix Factorization for LSTM Model Compression
Authors	Genta Indra Winata, Andrea Madotto, Jamin Shin, Elham J. Barezi, Pascale Fung
Abstract	Despite their ubiquity in NLP tasks, Long Short-Term Memory (LSTM) networks suffer from computational inefficiencies caused by inherent unparallelizable recurrences, a problem that worsens as LSTMs require more parameters for larger memory capacity. In this paper, we propose to apply low-rank matrix factorization (MF) algorithms to different recurrences in LSTMs, and explore the effectiveness on different NLP tasks and model components. We discover that additive recurrence is more important than multiplicative recurrence, and explain this by identifying meaningful correlations between matrix norms and compression performance. We compare our approach across two settings: 1) compressing core LSTM recurrences in language models, 2) compressing biLSTM layers of ELMo evaluated in three downstream NLP tasks.
Tasks	Model Compression
Published	2019-08-27
URL	https://arxiv.org/abs/1908.09982v1
PDF	https://arxiv.org/pdf/1908.09982v1.pdf
PWC	https://paperswithcode.com/paper/on-the-effectiveness-of-low-rank-matrix
Repo
Framework
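
The parameter saving behind low-rank factorization is simple arithmetic: replacing a dense m x n matrix W with a product U V, where U is m x r and V is r x n, stores r(m + n) values instead of mn. The sketch below works through one case. The matrix sizes and the rank are hypothetical and are not taken from the paper.

```python
def dense_params(m, n):
    """Parameter count of a dense m x n weight matrix."""
    return m * n


def low_rank_params(m, n, r):
    """Parameter count after replacing W (m x n) with U (m x r) times V (r x n)."""
    return r * (m + n)


# Hypothetical sizes: with hidden size 512, the four stacked LSTM gates give a
# 2048 x 512 hidden-to-hidden matrix; rank 64 is an arbitrary choice.
m, n, r = 2048, 512, 64
ratio = dense_params(m, n) / low_rank_params(m, n, r)
print(f"dense={dense_params(m, n)} factored={low_rank_params(m, n, r)} ratio={ratio:.1f}")
```

The factorization only saves parameters when r is below mn / (m + n), which is 409.6 for these sizes. How much accuracy survives at a given rank is the empirical question the paper studies.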

Bayesian automated posterior repartitioning for nested sampling


Title	Bayesian automated posterior repartitioning for nested sampling
Authors	Xi Chen, Farhan Feroz, Michael Hobson
Abstract	Priors in Bayesian analyses often encode informative domain knowledge that can be useful in making the inference process more efficient. Occasionally, however, priors may be unrepresentative of the parameter values for a given dataset, which can result in inefficient parameter space exploration, or even incorrect inferences, particularly for nested sampling (NS) algorithms. Simply broadening the prior in such cases may be inappropriate or impossible in some applications. Hence a previous solution to this problem, known as posterior repartitioning (PR), redefines the prior and likelihood while keeping their product fixed, so that the posterior inferences and evidence estimates remain unchanged, but the efficiency of the NS process is significantly increased. In its most practical form, PR raises the prior to some power $\beta$, which is introduced as an auxiliary variable that must be determined on a case-by-case basis, usually by lowering $\beta$ from unity according to some pre-defined ‘annealing schedule’ until the resulting inferences converge to a consistent solution. We present here an alternative Bayesian ‘automated PR’ method, in which $\beta$ is instead treated as a hyperparameter that is inferred from the data alongside the original parameters of the problem, and then marginalised over to obtain the final inference. We show through numerical examples that this approach provides a robust and efficient ‘hands-off’ solution to addressing the issue of unrepresentative priors in Bayesian inference using NS. Moreover, we show that for problems with representative priors the method has a negligible computational overhead relative to standard nested sampling, which suggests that it should be used as a matter of course in all NS analyses.
Tasks	Bayesian Inference
Published	2019-08-13
URL	https://arxiv.org/abs/1908.04655v1
PDF	https://arxiv.org/pdf/1908.04655v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-automated-posterior-repartitioning
Repo
Framework
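
The power form of posterior repartitioning described in the abstract can be written out as below. The abstract states only that the prior is raised to a power $\beta$ and that the product of prior and likelihood is kept fixed; the explicit normalising factor $Z_\pi(\beta)$ and the symbols $\tilde{\pi}$ and $\tilde{\mathcal{L}}$ are notation assumed here to make that statement concrete.

```latex
% Modified prior: original prior raised to the power beta, renormalised.
\tilde{\pi}(\theta) = \frac{\pi(\theta)^{\beta}}{Z_\pi(\beta)},
\qquad
Z_\pi(\beta) = \int \pi(\theta)^{\beta} \, d\theta

% Modified likelihood: absorbs the remaining prior mass.
\tilde{\mathcal{L}}(\theta) = \mathcal{L}(\theta) \, \pi(\theta)^{1-\beta} \, Z_\pi(\beta)

% The product, and hence the posterior and evidence, is unchanged.
\tilde{\mathcal{L}}(\theta) \, \tilde{\pi}(\theta)
  = \mathcal{L}(\theta) \, \pi(\theta)
```

Setting $\beta = 1$ recovers the original prior and likelihood, while lowering $\beta$ towards zero flattens the modified prior, which is what lets the sampler explore regions the original prior made unlikely.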