Paper Group ANR 207
On denoising autoencoders trained to minimise binary cross-entropy
Title | On denoising autoencoders trained to minimise binary cross-entropy |
Authors | Antonia Creswell, Kai Arulkumaran, Anil A. Bharath |
Abstract | Denoising autoencoders (DAEs) are powerful deep learning models used for feature extraction, data generation and network pre-training. DAEs consist of an encoder and decoder which may be trained simultaneously to minimise a loss function between an input and the reconstruction of a corrupted version of that input. Two loss functions are commonly used for training autoencoders: the mean-squared error (MSE) and the binary cross-entropy (BCE). When training autoencoders on image data a natural choice of loss function is BCE, since pixel values may be normalised to take values in [0,1] and the decoder model may be designed to generate samples that take values in (0,1). We show theoretically that DAEs trained to minimise BCE may be used to take gradient steps in the data space towards regions of high probability under the data-generating distribution. Previously this had only been shown for DAEs trained using MSE. As a consequence of the theory, iterative application of a trained DAE moves a data sample from regions of low probability to regions of higher probability under the data-generating distribution. Firstly, we validate the theory by showing that novel data samples, consistent with the training data, may be synthesised when the initial data samples are random noise. Secondly, we motivate the theory by showing that initial data samples synthesised via other methods may be improved via iterative application of a trained DAE to those initial samples. |
Tasks | Denoising |
Published | 2017-08-28 |
URL | http://arxiv.org/abs/1708.08487v2 |
PDF | http://arxiv.org/pdf/1708.08487v2.pdf |
PWC | https://paperswithcode.com/paper/on-denoising-autoencoders-trained-to-minimise |
Repo | |
Framework | |
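
A minimal PyTorch sketch of the BCE training objective and the iterative sampling loop described in the abstract above. The architecture, input dimension, corruption noise, and iteration count are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

# A minimal denoising autoencoder; the sigmoid output keeps reconstructions
# in (0, 1), matching the BCE setup the abstract describes.
class DAE(nn.Module):
    def __init__(self, dim=784, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

dae = DAE()
opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
bce = nn.BCELoss()

# One BCE training step: reconstruct the clean input from a corrupted copy.
x = torch.rand(64, 784)                               # stand-in batch in [0, 1]
x_corrupt = (x + 0.3 * torch.randn_like(x)).clamp(0, 1)
loss = bce(dae(x_corrupt), x)
opt.zero_grad(); loss.backward(); opt.step()

# Iterative application: per the theory, repeated passes move a sample from
# low- to higher-probability regions of the data-generating distribution.
sample = torch.rand(1, 784)                           # start from random noise
with torch.no_grad():
    for _ in range(50):
        sample = dae(sample)
```
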
Relaxing Exclusive Control in Boolean Games
Title | Relaxing Exclusive Control in Boolean Games |
Authors | Francesco Belardinelli, Umberto Grandi, Andreas Herzig, Dominique Longin, Emiliano Lorini, Arianna Novaro, Laurent Perrussel |
Abstract | In the typical framework for boolean games (BG) each player can change the truth value of some propositional atoms, while attempting to make her goal true. In standard BG goals are propositional formulas, whereas in iterated BG goals are formulas of Linear Temporal Logic. Both notions of BG are characterised by the fact that agents have exclusive control over their set of atoms, meaning that no two agents can control the same atom. In the present contribution we drop the exclusivity assumption and explore structures where an atom can be controlled by multiple agents. We introduce Concurrent Game Structures with Shared Propositional Control (CGS-SPC) and show that they account for several classes of repeated games, including iterated boolean games, influence games, and aggregation games. Our main result shows that, as far as verification is concerned, CGS-SPC can be reduced to concurrent game structures with exclusive control. This result provides a polynomial reduction for the model checking problem of specifications in Alternating-time Temporal Logic on CGS-SPC. |
Tasks | |
Published | 2017-07-27 |
URL | http://arxiv.org/abs/1707.08736v1 |
PDF | http://arxiv.org/pdf/1707.08736v1.pdf |
PWC | https://paperswithcode.com/paper/relaxing-exclusive-control-in-boolean-games |
Repo | |
Framework | |
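
A toy sketch of one round under shared propositional control: several agents may control the same atom, and an aggregation function resolves their conflicting choices. Majority voting is just one illustrative aggregation, and all agent/atom names are hypothetical; the CGS-SPC framework itself is more general.

```python
control = {            # atom -> agents that control it (not exclusive)
    "p": ["ann", "bob", "eve"],
    "q": ["bob"],
}
choices = {            # each agent's chosen value for each atom it controls
    "ann": {"p": True},
    "bob": {"p": False, "q": True},
    "eve": {"p": True},
}
goals = {              # propositional goals evaluated on the resolved valuation
    "ann": lambda v: v["p"] and v["q"],
    "bob": lambda v: not v["p"],
}

def resolve(control, choices):
    """Aggregate the controllers' votes on each atom by strict majority."""
    valuation = {}
    for atom, agents in control.items():
        votes = [choices[a][atom] for a in agents]
        valuation[atom] = sum(votes) > len(votes) / 2
    return valuation

valuation = resolve(control, choices)
satisfied = {agent: goal(valuation) for agent, goal in goals.items()}
print(valuation)   # {'p': True, 'q': True}
print(satisfied)   # {'ann': True, 'bob': False}
```
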
Learning to Compute Word Embeddings On the Fly
Title | Learning to Compute Word Embeddings On the Fly |
Authors | Dzmitry Bahdanau, Tom Bosc, Stanisław Jastrzębski, Edward Grefenstette, Pascal Vincent, Yoshua Bengio |
Abstract | Words in natural language follow a Zipfian distribution whereby some words are frequent but most are rare. Learning representations for words in the “long tail” of this distribution requires enormous amounts of data. Representations of rare words trained directly on end tasks are usually poor, requiring us to pre-train embeddings on external data, or treat all rare words as out-of-vocabulary words with a unique representation. We provide a method for predicting embeddings of rare words on the fly from small amounts of auxiliary data with a network trained end-to-end for the downstream task. We show that this improves results against baselines where embeddings are trained on the end task for reading comprehension, recognizing textual entailment and language modeling. |
Tasks | Language Modelling, Natural Language Inference, Question Answering, Reading Comprehension, Word Embeddings |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00286v3 |
PDF | http://arxiv.org/pdf/1706.00286v3.pdf |
PWC | https://paperswithcode.com/paper/learning-to-compute-word-embeddings-on-the |
Repo | |
Framework | |
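
A sketch of the on-the-fly idea: predict a rare word's embedding from auxiliary text (here, a dictionary definition) rather than using a single shared OOV vector. The mean-pool-then-project encoder, dimensions, and all indices are illustrative assumptions; the network is meant to be trained end-to-end with the downstream task.

```python
import torch
import torch.nn as nn

EMB = 100
vocab_emb = nn.Embedding(10_000, EMB)       # embeddings for frequent words
definition_proj = nn.Linear(EMB, EMB)       # trained end-to-end with the task

def embed_on_the_fly(definition_ids: torch.Tensor) -> torch.Tensor:
    """definition_ids: (def_len,) indices of the words in the definition."""
    def_vectors = vocab_emb(definition_ids)          # (def_len, EMB)
    return definition_proj(def_vectors.mean(dim=0))  # pooled, then projected

# Usage: the rare word "zyzzyva" is out of vocabulary, but its definition
# ("a tropical weevil") contains frequent words we do have embeddings for.
definition_ids = torch.tensor([57, 1031, 2988])      # placeholder indices
rare_word_vec = embed_on_the_fly(definition_ids)     # (EMB,)
```
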
Is China Entering WTO or shijie maoyi zuzhi–a Corpus Study of English Acronyms in Chinese Newspapers
Title | Is China Entering WTO or shijie maoyi zuzhi–a Corpus Study of English Acronyms in Chinese Newspapers |
Authors | Hai Hu |
Abstract | This is one of the first studies to quantitatively examine the usage of English acronyms (e.g. WTO) in Chinese texts. Using newspaper corpora, I try to answer 1) for all instances of a concept that has an English acronym (e.g. World Trade Organization), what percentage is expressed in the English acronym (WTO), and what percentage in its Chinese translation (shijie maoyi zuzhi), and 2) what factors are at play in language users' choice between the English and Chinese forms? Results show that the percentage of English acronym use (PercentOfEn) varies widely across concepts, ranging from 2% to 98%. Linear models show that PercentOfEn for individual concepts can be predicted by language economy (how long the Chinese translation is), concept frequency, and whether the first appearance of the concept in Chinese newspapers is the English acronym or its Chinese translation (all p < .05). |
Tasks | |
Published | 2017-11-18 |
URL | http://arxiv.org/abs/1711.06895v1 |
PDF | http://arxiv.org/pdf/1711.06895v1.pdf |
PWC | https://paperswithcode.com/paper/is-china-entering-wto-or-shijie-maoyi-zuzhi-a |
Repo | |
Framework | |
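
An illustrative fit of the kind of linear model the abstract describes, with PercentOfEn regressed on language economy, concept frequency, and the form of first appearance. All numbers below are synthetic stand-ins, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 40
chinese_len = rng.integers(2, 10, n)     # characters in the Chinese translation
log_freq = rng.normal(3.0, 1.0, n)       # log concept frequency in the corpus
first_is_en = rng.integers(0, 2, n)      # 1 if the acronym appeared first

X = np.column_stack([chinese_len, log_freq, first_is_en])
percent_en = (8 * chinese_len + 20 * first_is_en
              + rng.normal(0, 5, n)).clip(0, 100)   # synthetic response

model = LinearRegression().fit(X, percent_en)
print(model.coef_, model.intercept_)     # per-predictor effects on PercentOfEn
```
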
Feasibility of Corneal Imaging for Handheld Augmented Reality
Title | Feasibility of Corneal Imaging for Handheld Augmented Reality |
Authors | Daniel Schneider, Jens Grubert |
Abstract | Smartphones are a popular device class for mobile Augmented Reality but suffer from a limited input space. Around-device interaction techniques aim at extending this input space using various sensing modalities. In this paper we present our work towards extending the input area of mobile devices using front-facing device-centered cameras that capture reflections in the cornea. As current generation mobile devices lack high resolution front-facing cameras, we study the feasibility of around-device interaction using corneal reflective imaging based on a high resolution camera. We present a workflow, a technical prototype and a feasibility evaluation. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.00965v1 |
PDF | http://arxiv.org/pdf/1709.00965v1.pdf |
PWC | https://paperswithcode.com/paper/feasibility-of-corneal-imaging-for-handheld |
Repo | |
Framework | |
Distribution of Gaussian Process Arc Lengths
Title | Distribution of Gaussian Process Arc Lengths |
Authors | Justin D. Bewsher, Alessandra Tosi, Michael A. Osborne, Stephen J. Roberts |
Abstract | We present the first treatment of the arc length of the Gaussian Process (GP) with more than a single output dimension. GPs are commonly used for tasks such as trajectory modelling, where path length is a crucial quantity of interest. Previously, only paths in one dimension have been considered, with no theoretical consideration of higher dimensional problems. We fill the gap in the existing literature by deriving the moments of the arc length for a stationary GP with multiple output dimensions. A new method is used to derive the mean of a one-dimensional GP over a finite interval, by considering the distribution of the arc length integrand. This technique is used to derive an approximate distribution over the arc length of a vector valued GP in $\mathbb{R}^n$ by moment matching the distribution. Numerical simulations confirm our theoretical derivations. |
Tasks | |
Published | 2017-03-23 |
URL | http://arxiv.org/abs/1703.08031v1 |
PDF | http://arxiv.org/pdf/1703.08031v1.pdf |
PWC | https://paperswithcode.com/paper/distribution-of-gaussian-process-arc-lengths |
Repo | |
Framework | |
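
A Monte Carlo check in the spirit of the paper: draw vector-valued GP paths on a fine grid, approximate each path's arc length by summed segment norms, and inspect the empirical moments. The RBF kernel, grid, and output dimension are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.1 ** 2)  # RBF kernel
L = np.linalg.cholesky(K + 1e-8 * np.eye(len(t)))             # jitter for PSD

def arc_length(path):
    """Discrete arc length: sum of Euclidean segment lengths along the path."""
    return np.linalg.norm(np.diff(path, axis=0), axis=1).sum()

n_dims, n_paths = 3, 2000
lengths = np.array([
    arc_length(L @ rng.standard_normal((len(t), n_dims)))
    for _ in range(n_paths)
])
print(lengths.mean(), lengths.var())   # empirical arc-length moments
```
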
A Dual Encoder Sequence to Sequence Model for Open-Domain Dialogue Modeling
Title | A Dual Encoder Sequence to Sequence Model for Open-Domain Dialogue Modeling |
Authors | Sharath T. S., Shubhangi Tandon, Ryan Bauer |
Abstract | Ever since the successful application of sequence to sequence learning for neural machine translation systems, interest has surged in its applicability towards language generation in other problem domains. Recent work has investigated the use of these neural architectures towards modeling open-domain conversational dialogue, where it has been found that although these models are capable of learning a good distributional language model, dialogue coherence is still of concern. Unlike translation, conversation is much more of a one-to-many mapping from utterance to response, and it is even more pressing that the model be aware of the preceding flow of conversation. In this paper we propose to tackle this problem by introducing previous conversational context in the form of latent representations of dialogue acts over time. We inject these latent context representations into a sequence to sequence neural network through a second encoder over dialogue acts, to enhance the quality and the coherence of the conversations generated. The main task of this research work is to show that adding latent variables that capture discourse relations does indeed result in more coherent responses when compared to conventional sequence to sequence models. |
Tasks | Language Modelling, Machine Translation, Text Generation |
Published | 2017-10-28 |
URL | http://arxiv.org/abs/1710.10520v1 |
PDF | http://arxiv.org/pdf/1710.10520v1.pdf |
PWC | https://paperswithcode.com/paper/a-dual-encoder-sequence-to-sequence-model-for |
Repo | |
Framework | |
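
A sketch of the dual-encoder idea: a second encoder summarises the preceding dialogue acts, and its state is fused with the utterance encoding to condition the decoder. The sizes, the GRU choice, and fusion by concatenation plus projection are illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB, N_ACTS, HID = 8000, 20, 256

utt_emb = nn.Embedding(VOCAB, HID)
act_emb = nn.Embedding(N_ACTS, HID)
utt_encoder = nn.GRU(HID, HID, batch_first=True)
act_encoder = nn.GRU(HID, HID, batch_first=True)
fuse = nn.Linear(2 * HID, HID)
decoder = nn.GRU(HID, HID, batch_first=True)
out = nn.Linear(HID, VOCAB)

utterance = torch.randint(0, VOCAB, (1, 12))    # token ids of the last utterance
act_history = torch.randint(0, N_ACTS, (1, 5))  # dialogue acts over time

_, h_utt = utt_encoder(utt_emb(utterance))
_, h_act = act_encoder(act_emb(act_history))
h0 = torch.tanh(fuse(torch.cat([h_utt, h_act], dim=-1)))  # fused initial state

response_in = torch.randint(0, VOCAB, (1, 10))  # teacher-forced response tokens
dec_states, _ = decoder(utt_emb(response_in), h0)
logits = out(dec_states)                        # (1, 10, VOCAB)
```
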
Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
Title | Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking |
Authors | Yi Tay, Anh Tuan Luu, Siu Cheung Hui |
Abstract | This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (*Latent Relational Metric Learning*), is a novel metric learning approach for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations that describe each user-item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learning approaches. This enables not only better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. In order to do so, we employ an augmented memory module and learn to attend over these memory blocks to construct latent relations. The memory-based attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models by 6%-7.5% in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also demonstrate evidence that our proposed model is able to infer and encode explicit sentiment, temporal and attribute information despite being only trained on implicit feedback. As such, this ascertains the ability of LRML to uncover hidden relational structure within implicit datasets. |
Tasks | Collaborative Ranking, Metric Learning, Recommendation Systems |
Published | 2017-07-17 |
URL | http://arxiv.org/abs/1707.05176v3 |
PDF | http://arxiv.org/pdf/1707.05176v3.pdf |
PWC | https://paperswithcode.com/paper/latent-relational-metric-learning-via-memory |
Repo | |
Framework | |
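
A sketch of LRML-style scoring: the user-item interaction attends over a set of memory vectors to build a latent relation r, and the score is the negative distance ||p + r - q|| between user p and item q in the metric space. The memory size, dimensions, and Hadamard-product signature are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM, N_MEM = 64, 10
user_emb = nn.Embedding(1000, DIM)
item_emb = nn.Embedding(5000, DIM)
keys = nn.Parameter(torch.randn(N_MEM, DIM))      # addresses the memory
memories = nn.Parameter(torch.randn(N_MEM, DIM))  # relation building blocks

def score(user_ids, item_ids):
    p, q = user_emb(user_ids), item_emb(item_ids)
    joint = p * q                                  # interaction signature
    attn = F.softmax(joint @ keys.t(), dim=-1)     # (batch, N_MEM)
    r = attn @ memories                            # latent relation vector
    return -torch.norm(p + r - q, dim=-1)          # higher = better match

s = score(torch.tensor([3]), torch.tensor([42]))
```
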
Unsupervised Learning of Semantic Audio Representations
Title | Unsupervised Learning of Semantic Audio Representations |
Authors | Aren Jansen, Manoj Plakal, Ratheet Pandya, Daniel P. W. Ellis, Shawn Hershey, Jiayang Liu, R. Channing Moore, Rif A. Saurous |
Abstract | Even in the absence of any explicit semantic annotation, vast collections of audio recordings provide valuable information for learning the categorical structure of sounds. We consider several class-agnostic semantic constraints that apply to unlabeled nonspeech audio: (i) noise and translations in time do not change the underlying sound category, (ii) a mixture of two sound events inherits the categories of the constituents, and (iii) the categories of events in close temporal proximity are likely to be the same or related. Without labels to ground them, these constraints are incompatible with classification loss functions. However, they may still be leveraged to identify geometric inequalities needed for triplet loss-based training of convolutional neural networks. The result is low-dimensional embeddings of the input spectrograms that recover 41% and 84% of the performance of their fully-supervised counterparts when applied to downstream query-by-example sound retrieval and sound event classification tasks, respectively. Moreover, in limited-supervision settings, our unsupervised embeddings double the state-of-the-art classification performance. |
Tasks | |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.02209v1 |
PDF | http://arxiv.org/pdf/1711.02209v1.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-semantic-audio |
Repo | |
Framework | |
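
A sketch of how the abstract's label-free constraints become triplets: the positive is a perturbed, mixed, or temporally adjacent clip, and the negative is a random other clip, trained with a standard margin-based triplet loss. The spectrogram shapes, perturbations, and margin are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
clips = rng.random((100, 96, 64))  # stand-in log-mel spectrogram clips

def make_triplets(clips):
    triplets = []
    for i, anchor in enumerate(clips[:-1]):
        noisy = anchor + 0.1 * rng.standard_normal(anchor.shape)  # (i) noise
        mixed = 0.5 * (anchor + clips[i + 1])                     # (ii) mixture
        neighbor = clips[i + 1]                                   # (iii) proximity
        negative = clips[rng.integers(len(clips))]
        for positive in (noisy, mixed, neighbor):
            triplets.append((anchor, positive, negative))
    return triplets

def triplet_loss(f_a, f_p, f_n, margin=1.0):
    """Margin-based triplet loss on embedded (anchor, positive, negative)."""
    d_pos = np.sum((f_a - f_p) ** 2)
    d_neg = np.sum((f_a - f_n) ** 2)
    return max(0.0, d_pos - d_neg + margin)
```
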
MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network
Title | MDNet: A Semantically and Visually Interpretable Medical Image Diagnosis Network |
Authors | Zizhao Zhang, Yuanpu Xie, Fuyong Xing, Mason McGough, Lin Yang |
Abstract | The inability to interpret model predictions in semantically and visually meaningful ways is a well-known shortcoming of most existing computer-aided diagnosis methods. In this paper, we propose MDNet to establish a direct multimodal mapping between medical images and diagnostic reports that can read images, generate diagnostic reports, retrieve images by symptom descriptions, and visualize attention, to provide justifications of the network diagnosis process. MDNet includes an image model and a language model. The image model is proposed to enhance multi-scale feature ensembles and utilization efficiency. The language model, integrated with our improved attention mechanism, aims to read and explore discriminative image feature descriptions from reports to learn a direct mapping from sentence words to image pixels. The overall network is trained end-to-end using our developed optimization strategy. On a dataset of pathology bladder cancer images and their diagnostic reports (BCIDR), we conduct extensive experiments to demonstrate that MDNet outperforms comparative baselines. The proposed image model also obtains state-of-the-art performance on two CIFAR datasets. |
Tasks | Language Modelling |
Published | 2017-07-08 |
URL | http://arxiv.org/abs/1707.02485v1 |
PDF | http://arxiv.org/pdf/1707.02485v1.pdf |
PWC | https://paperswithcode.com/paper/mdnet-a-semantically-and-visually |
Repo | |
Framework | |
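
A sketch of the kind of word-to-pixel attention the abstract describes: each language-model word state attends over spatial CNN features, so report words can be grounded in image regions. The feature shapes and single-layer dot-product attention are illustrative stand-ins, not MDNet's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT, HID = 512, 256
img_feats = torch.randn(1, FEAT, 7, 7)            # CNN feature map (B, C, H, W)
word_state = torch.randn(1, HID)                  # current language-model state

proj_img = nn.Linear(FEAT, HID)
flat = img_feats.flatten(2).transpose(1, 2)       # (B, 49, FEAT) spatial cells
keys = proj_img(flat)                             # (B, 49, HID)
attn = F.softmax((keys @ word_state.unsqueeze(-1)).squeeze(-1), dim=-1)
context = (attn.unsqueeze(-1) * flat).sum(dim=1)  # attended image feature
# `attn` reshaped to 7x7 visualises which regions justify the current word.
```
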
Performance Guaranteed Network Acceleration via High-Order Residual Quantization
Title | Performance Guaranteed Network Acceleration via High-Order Residual Quantization |
Authors | Zefan Li, Bingbing Ni, Wenjun Zhang, Xiaokang Yang, Wen Gao |
Abstract | Input binarization has been shown to be an effective way to accelerate networks. However, previous binarization schemes could be regarded as simple pixel-wise thresholding operations (i.e., order-one approximations) and suffer a large accuracy loss. In this paper, we propose a high-order binarization scheme, which achieves a more accurate approximation while still retaining the advantages of binary operations. In particular, the proposed scheme recursively performs residual quantization and yields a series of binary input images with decreasing magnitude scales. Accordingly, we propose high-order binary filtering and gradient propagation operations for both forward and backward computations. Theoretical analysis establishes the approximation-error guarantee of the proposed method. Extensive experimental results demonstrate that the proposed scheme retains recognition accuracy while delivering substantial acceleration. |
Tasks | Quantization |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08687v1 |
PDF | http://arxiv.org/pdf/1708.08687v1.pdf |
PWC | https://paperswithcode.com/paper/performance-guaranteed-network-acceleration |
Repo | |
Framework | |
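
A sketch of recursive residual binarization as the abstract describes it: each stage approximates the current residual by a scale times a sign tensor, yielding binary inputs with decreasing magnitude scales. Two stages are shown; the order is a parameter.

```python
import numpy as np

def horq(x, order=2):
    """Return [(alpha_1, b_1), ...] with x ~= sum_i alpha_i * b_i."""
    terms, residual = [], x.astype(np.float64)
    for _ in range(order):
        alpha = np.abs(residual).mean()   # optimal scale for a sign tensor
        b = np.sign(residual)
        terms.append((alpha, b))
        residual = residual - alpha * b   # pass the residual to the next stage
    return terms

x = np.random.default_rng(0).standard_normal((4, 4))
terms = horq(x, order=2)
approx = sum(a * b for a, b in terms)
print(np.abs(x - approx).mean())          # error shrinks as the order grows
```
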
Credal Networks under Epistemic Irrelevance
Title | Credal Networks under Epistemic Irrelevance |
Authors | Jasper De Bock |
Abstract | A credal network under epistemic irrelevance is a generalised type of Bayesian network that relaxes its two main building blocks. On the one hand, the local probabilities are allowed to be partially specified. On the other hand, the assessments of independence do not have to hold exactly. Conceptually, these two features turn credal networks under epistemic irrelevance into a powerful alternative to Bayesian networks, offering a more flexible approach to graph-based multivariate uncertainty modelling. However, in practice, they have long been perceived as very hard to work with, both theoretically and computationally. The aim of this paper is to demonstrate that this perception is no longer justified. We provide a general introduction to credal networks under epistemic irrelevance, give an overview of the state of the art, and present several new theoretical results. Most importantly, we explain how these results can be combined to allow for the design of recursive inference methods. We provide numerous concrete examples of how this can be achieved, and use these to demonstrate that computing with credal networks under epistemic irrelevance is most definitely feasible, and in some cases even highly efficient. We also discuss several philosophical aspects, including the lack of symmetry, how to deal with probability zero, the interpretation of lower expectations, the axiomatic status of graphoid properties, and the difference between updating and conditioning. |
Tasks | |
Published | 2017-01-27 |
URL | http://arxiv.org/abs/1701.08661v2 |
PDF | http://arxiv.org/pdf/1701.08661v2.pdf |
PWC | https://paperswithcode.com/paper/credal-networks-under-epistemic-irrelevance |
Repo | |
Framework | |
Subspace Clustering via Optimal Direction Search
Title | Subspace Clustering via Optimal Direction Search |
Authors | Mostafa Rahmani, George Atia |
Abstract | This letter presents a new spectral-clustering-based approach to the subspace clustering problem. Underpinning the proposed method is a convex program for optimal direction search, which for each data point d finds an optimal direction in the span of the data that has minimum projection on the other data points and non-vanishing projection on d. The obtained directions are subsequently leveraged to identify a neighborhood set for each data point. An alternating direction method of multipliers framework is provided to efficiently solve for the optimal directions. The proposed method is shown to notably outperform the existing subspace clustering methods, particularly for unwieldy scenarios involving high levels of noise and close subspaces, and yields the state-of-the-art results for the problem of face clustering using subspace segmentation. |
Tasks | |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03860v4 |
PDF | http://arxiv.org/pdf/1706.03860v4.pdf |
PWC | https://paperswithcode.com/paper/subspace-clustering-via-optimal-direction |
Repo | |
Framework | |
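
A sketch of the per-point convex direction search the abstract describes: find a direction with unit projection on the chosen point d and minimal projection on the remaining data. The l1 objective, the synthetic data, and solving with cvxpy are illustrative modelling choices; the paper develops an ADMM solver for its program.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 100))    # columns are data points
i = 0
d = D[:, i]
others = np.delete(D, i, axis=1)

a = cp.Variable(20)
problem = cp.Problem(cp.Minimize(cp.norm1(others.T @ a)),
                     [cp.sum(cp.multiply(a, d)) == 1])
problem.solve()

direction = a.value
affinity = np.abs(direction @ D)      # large entries mark d's neighborhood set
```
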
A Sequential Thinning Algorithm For Multi-Dimensional Binary Patterns
Title | A Sequential Thinning Algorithm For Multi-Dimensional Binary Patterns |
Authors | Himanshu Jain, Archana Praveen Kumar |
Abstract | Thinning is the removal of contour pixels/points of connected components in an image to produce their skeleton while retaining connectivity and structural properties. The output requirements of a thinning procedure often vary with application. This paper proposes a sequential algorithm for thinning multi-dimensional binary patterns that is easy to understand and easy to modify for a given application. The algorithm was tested on 2D and 3D patterns and showed very good results. Moreover, comparisons were made with two of the state-of-the-art methods for 2D patterns. The results obtained confirm the validity of the procedure. |
Tasks | |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.03025v2 |
PDF | http://arxiv.org/pdf/1710.03025v2.pdf |
PWC | https://paperswithcode.com/paper/a-sequential-thinning-algorithm-for-multi |
Repo | |
Framework | |
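
The paper's own multi-dimensional procedure is not reproduced here; as a concrete point of reference, below is the classic Zhang-Suen sequential thinning pass for 2D binary images, one of the standard baselines such algorithms are compared against.

```python
import numpy as np

def zhang_suen(img):
    """Two-subiteration Zhang-Suen thinning of a 0/1 image (2D only)."""
    img = img.copy().astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for y in range(1, img.shape[0] - 1):
                for x in range(1, img.shape[1] - 1):
                    if not img[y, x]:
                        continue
                    # 8-neighbourhood, clockwise from the north neighbour.
                    p = [img[y-1,x], img[y-1,x+1], img[y,x+1], img[y+1,x+1],
                         img[y+1,x], img[y+1,x-1], img[y,x-1], img[y-1,x-1]]
                    b = sum(p)                      # non-zero neighbour count
                    a = sum(p[k] == 0 and p[(k+1) % 8] == 1 for k in range(8))
                    if step == 0:
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_delete.append((y, x))
            for y, x in to_delete:
                img[y, x] = 0
            changed = changed or bool(to_delete)
    return img

img = np.zeros((12, 12), dtype=np.uint8)
img[3:9, 3:9] = 1                        # a filled square thins to a skeleton
skeleton = zhang_suen(img)
```
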
Towards Zero-Shot Frame Semantic Parsing for Domain Scaling
Title | Towards Zero-Shot Frame Semantic Parsing for Domain Scaling |
Authors | Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck |
Abstract | State-of-the-art slot filling models for goal-oriented human/machine conversational language understanding systems rely on deep learning methods. While multi-task training of such models alleviates the need for large in-domain annotated datasets, bootstrapping a semantic parsing model for a new domain using only the semantic frame, such as the back-end API or knowledge graph schema, is still one of the holy grail tasks of language understanding for dialogue systems. This paper proposes a deep learning based approach that can utilize only the slot description in context without the need for any labeled or unlabeled in-domain examples, to quickly bootstrap a new domain. The main idea of this paper is to leverage the encoding of the slot names and descriptions within a multi-task deep learned slot filling model, to implicitly align slots across domains. The proposed approach is promising for solving the domain scaling problem and eliminating the need for any manually annotated data or explicit schema alignment. Furthermore, our experiments on multiple domains show that this approach results in significantly better slot-filling performance when compared to using only in-domain data, especially in the low data regime. |
Tasks | Semantic Parsing, Slot Filling |
Published | 2017-07-07 |
URL | http://arxiv.org/abs/1707.02363v1 |
PDF | http://arxiv.org/pdf/1707.02363v1.pdf |
PWC | https://paperswithcode.com/paper/towards-zero-shot-frame-semantic-parsing-for |
Repo | |
Framework | |
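
A sketch of the zero-shot idea: slots are represented by encoding their natural-language descriptions, so an unseen domain's slots can be scored without in-domain training data. The encoders, the dot-product scorer, and the naive per-token assignment (with no "O" tag) are illustrative stand-ins for the paper's multi-task tagger.

```python
import torch
import torch.nn as nn

VOCAB, HID = 5000, 128
word_emb = nn.Embedding(VOCAB, HID)
utterance_enc = nn.GRU(HID, HID, batch_first=True)
description_enc = nn.GRU(HID, HID, batch_first=True)

utterance = torch.randint(0, VOCAB, (1, 8))    # e.g. "book a flight to boston"
slot_descriptions = [torch.randint(0, VOCAB, (1, 4)) for _ in range(3)]
# e.g. descriptions like "city of arrival", encoded into one vector per slot

token_states, _ = utterance_enc(word_emb(utterance))         # (1, 8, HID)
slot_vecs = torch.cat([description_enc(word_emb(d))[1][0]
                       for d in slot_descriptions])           # (3, HID)
scores = token_states @ slot_vecs.t()          # (1, 8, 3) token-slot scores
tags = scores.argmax(dim=-1)                   # naive per-token slot assignment
```
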