February 2, 2020

Paper Group AWR 74

Using a KG-Copy Network for Non-Goal Oriented Dialogues

Title Using a KG-Copy Network for Non-Goal Oriented Dialogues
Authors Debanjan Chaudhuri, Md Rashad Al Hasan Rony, Simon Jordan, Jens Lehmann
Abstract Non-goal oriented, generative dialogue systems lack the ability to generate answers with grounded facts. A knowledge graph can be considered an abstraction of the real world consisting of well-grounded facts. This paper addresses the problem of generating well-grounded responses by integrating knowledge graphs into the dialogue system's response generation process, in an end-to-end manner. A dataset for non-goal oriented dialogues is proposed in this paper in the domain of soccer, conversing on different clubs and national teams along with a knowledge graph for each of these teams. A novel neural network architecture is also proposed as a baseline on this dataset, which can integrate knowledge graphs into the response generation process, producing well-articulated, knowledge-grounded responses. Empirical evidence suggests that the proposed model performs better than other state-of-the-art models for knowledge graph integrated dialogue systems.
Tasks Knowledge Graphs
Published 2019-10-17
URL https://arxiv.org/abs/1910.07834v1
PDF https://arxiv.org/pdf/1910.07834v1.pdf
PWC https://paperswithcode.com/paper/using-a-kg-copy-network-for-non-goal-oriented
Repo https://github.com/SmartDataAnalytics/KG-Copy_Network
Framework pytorch

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

Title Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
Authors Asma Ghandeharioun, Judy Hanwen Shen, Natasha Jaques, Craig Ferguson, Noah Jones, Agata Lapedriza, Rosalind Picard
Abstract Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of static conversation, do not capture conversation quality in a realistic interactive context. In this paper, we investigate interactive human evaluation and provide evidence for its necessity; we then introduce a novel, model-agnostic, and dataset-agnostic method to approximate it. In particular, we propose a self-play scenario where the dialog system talks to itself and we calculate a combination of proxies such as sentiment and semantic coherence on the conversation trajectory. We show that this metric is capable of capturing the human-rated quality of a dialog model better than any automated metric known to date, achieving a significant Pearson correlation (r>.7, p<.05). To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and human evaluation of static conversations, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level. Finally, we open-source the interactive evaluation platform we built and the dataset we collected to allow researchers to efficiently deploy and evaluate dialog models.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09308v2
PDF https://arxiv.org/pdf/1906.09308v2.pdf
PWC https://paperswithcode.com/paper/approximating-interactive-human-evaluation
Repo https://github.com/asmadotgh/neural_chat_web
Framework none

Dilated deeply supervised networks for hippocampus segmentation in MRI

Title Dilated deeply supervised networks for hippocampus segmentation in MRI
Authors Lukas Folle, Sulaiman Vesal, Nishant Ravikumar, Andreas Maier
Abstract Tissue loss in the hippocampi has been heavily correlated with the progression of Alzheimer’s Disease (AD). The shape and structure of the hippocampus are important factors in terms of early AD diagnosis and prognosis by clinicians. However, manual segmentation of such subcortical structures in MR studies is a challenging and subjective task. In this paper, we investigate variants of the well-known 3D U-Net, a type of convolutional neural network (CNN) for semantic segmentation tasks. We propose an alternative form of the 3D U-Net, which uses dilated convolutions and deep supervision to incorporate multi-scale information into the model. The proposed method is evaluated on the task of hippocampus head and body segmentation in an MRI dataset, provided as part of the MICCAI 2018 segmentation decathlon challenge. The experimental results show that our approach outperforms other conventional methods in terms of different segmentation accuracy metrics.
Tasks Accuracy Metrics, Semantic Segmentation
Published 2019-03-20
URL http://arxiv.org/abs/1903.09097v1
PDF http://arxiv.org/pdf/1903.09097v1.pdf
PWC https://paperswithcode.com/paper/dilated-deeply-supervised-networks-for
Repo https://github.com/satyakees/FaultNet
Framework pytorch
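
The abstract above says dilated convolutions bring multi-scale information into the 3D U-Net. As a rough illustration of why, the sketch below computes the receptive field (along one axis) of a stack of stride-1 convolutions. The function names and the example layer stacks are hypothetical and are not taken from the paper or its repository; only the standard formula `dilation * (kernel_size - 1) + 1` for the effective kernel size is used.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span (in input positions) covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers):
    """Receptive field along one axis for stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs (illustrative values only).
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel size - 1).
        rf += effective_kernel_size(kernel_size, dilation) - 1
    return rf


if __name__ == "__main__":
    plain = [(3, 1), (3, 1), (3, 1)]    # three ordinary 3-wide convolutions
    dilated = [(3, 1), (3, 2), (3, 4)]  # same depth, growing dilation
    print(receptive_field(plain), receptive_field(dilated))
```

With the same number of layers and weights, the dilated stack sees a wider context per output voxel, which is the usual motivation for dilation in segmentation networks.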

Density Matching for Bilingual Word Embedding

Title Density Matching for Bilingual Word Embedding
Authors Chunting Zhou, Xuezhe Ma, Di Wang, Graham Neubig
Abstract Recent approaches to cross-lingual word embedding have generally been based on linear transformations between the sets of embedding vectors in the two languages. In this paper, we propose an approach that instead expresses the two monolingual embedding spaces as probability densities defined by a Gaussian mixture model, and matches the two densities using a method called normalizing flow. The method requires no explicit supervision, and can be learned with only a seed dictionary of words that have identical strings. We argue that this formulation has several intuitively attractive properties, particularly with respect to improving robustness and generalization to mappings between difficult language pairs or word pairs. On a benchmark data set of bilingual lexicon induction and cross-lingual word similarity, our approach can achieve competitive or superior performance compared to state-of-the-art published results, with particularly strong results being found on etymologically distant and/or morphologically rich languages.
Tasks Word Embeddings
Published 2019-04-04
URL http://arxiv.org/abs/1904.02343v3
PDF http://arxiv.org/pdf/1904.02343v3.pdf
PWC https://paperswithcode.com/paper/density-matching-for-bilingual-word-embedding
Repo https://github.com/violet-zct/DeMa-BWE
Framework pytorch

An Attention Mechanism for Musical Instrument Recognition

Title An Attention Mechanism for Musical Instrument Recognition
Authors Siddharth Gururani, Mohit Sharma, Alexander Lerch
Abstract While the automatic recognition of musical instruments has seen significant progress, the task is still considered hard for music featuring multiple instruments as opposed to single instrument recordings. Datasets for polyphonic instrument recognition can be categorized into roughly two categories. Some, such as MedleyDB, have strong per-frame instrument activity annotations but are usually small in size. Other, larger datasets such as OpenMIC only have weak labels, i.e., instrument presence or absence is annotated only for long snippets of a song. We explore an attention mechanism for handling weakly labeled data for multi-label instrument recognition. Attention has been found to perform well for other tasks with weakly labeled data. We compare the proposed attention model to multiple models which include a baseline binary relevance random forest, recurrent neural network, and fully connected neural networks. Our results show that incorporating attention leads to an overall improvement in classification accuracy metrics across all 20 instruments in the OpenMIC dataset. We find that attention enables models to focus on (or ‘attend to’) specific time segments in the audio relevant to each instrument label leading to interpretable results.
Tasks Accuracy Metrics
Published 2019-07-09
URL https://arxiv.org/abs/1907.04294v1
PDF https://arxiv.org/pdf/1907.04294v1.pdf
PWC https://paperswithcode.com/paper/an-attention-mechanism-for-musical-instrument
Repo https://github.com/SiddGururani/AttentionMIC
Framework pytorch

Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss

Title Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss
Authors Hyunsu Kim, Ho Young Jhoo, Eunhyeok Park, Sungjoo Yoo
Abstract Line art colorization is expensive and challenging to automate. We propose Tag2Pix, a GAN approach to line art colorization that takes a grayscale line art and color tag information as input and produces a quality colored image. First, we present the Tag2Pix line art colorization dataset. A generator network is proposed which consists of convolutional layers to transform the input line art, a pre-trained semantic extraction network, and an encoder for input color information. The discriminator is based on an auxiliary classifier GAN to classify the tag information as well as genuineness. In addition, we propose a novel network structure called SECat, which makes the generator properly colorize even small features such as eyes, and also suggest a novel two-step training method where the generator and discriminator first learn the notion of object and shape and then, based on the learned notion, learn colorization, such as where and how to place which color. We present both quantitative and qualitative evaluations which prove the effectiveness of the proposed method.
Tasks Colorization, Line Art Colorization
Published 2019-08-16
URL https://arxiv.org/abs/1908.05840v1
PDF https://arxiv.org/pdf/1908.05840v1.pdf
PWC https://paperswithcode.com/paper/tag2pix-line-art-colorization-using-text-tag
Repo https://github.com/blandocs/Tag2Pix
Framework pytorch

Deep Image Blending

Title Deep Image Blending
Authors Lingzhi Zhang, Tarmily Wen, Jianbo Shi
Abstract Image composition is an important operation to create visual content. Among image composition tasks, image blending aims to seamlessly blend an object from a source image onto a target image with light mask adjustment. A popular approach is Poisson image blending, which enforces the gradient domain smoothness in the composite image. However, this approach only considers the boundary pixels of the target image, and thus cannot adapt to the texture of the target image. In addition, the colors of the target image often seep through the original source object too much, causing a significant loss of content of the source object. We propose a Poisson blending loss that achieves the same purpose as Poisson image blending. In addition, we jointly optimize the proposed Poisson blending loss as well as the style and content loss computed from a deep network, and reconstruct the blending region by iteratively updating the pixels using the L-BFGS solver. In the blended image, we not only smooth out the gradient domain of the blending boundary but also add consistent texture into the blending region. User studies show that our method outperforms strong baselines as well as state-of-the-art approaches when placing objects onto both paintings and real-world images.
Tasks
Published 2019-10-25
URL https://arxiv.org/abs/1910.11495v1
PDF https://arxiv.org/pdf/1910.11495v1.pdf
PWC https://paperswithcode.com/paper/deep-image-blending
Repo https://github.com/owenzlz/Deep_Image_Blending
Framework pytorch

Coloring With Limited Data: Few-Shot Colorization via Memory-Augmented Networks

Title Coloring With Limited Data: Few-Shot Colorization via Memory-Augmented Networks
Authors Seungjoo Yoo, Hyojin Bahng, Sunghyo Chung, Junsoo Lee, Jaehyuk Chang, Jaegul Choo
Abstract Despite recent advancements, deep learning-based automatic colorization models are still limited when it comes to few-shot learning. Existing models require a significant amount of training data. To tackle this issue, we present a novel memory-augmented colorization model MemoPainter that can produce high-quality colorization with limited data. In particular, our model is able to capture rare instances and successfully colorize them. We also propose a novel threshold triplet loss that enables unsupervised training of memory networks without the need for class labels. Experiments show that our model has superior quality in both few-shot and one-shot colorization tasks.
Tasks Colorization, Few-Shot Learning
Published 2019-06-09
URL https://arxiv.org/abs/1906.11888v1
PDF https://arxiv.org/pdf/1906.11888v1.pdf
PWC https://paperswithcode.com/paper/coloring-with-limited-data-few-shot-1
Repo https://github.com/dongheehand/MemoPainter-PyTorch
Framework pytorch

Interactive Fiction Games: A Colossal Adventure

Title Interactive Fiction Games: A Colossal Adventure
Authors Matthew Hausknecht, Prithviraj Ammanabrolu, Marc-Alexandre Côté, Xingdi Yuan
Abstract A hallmark of human intelligence is the ability to understand and communicate with language. Interactive Fiction games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. We argue that IF games are an excellent testbed for studying language-based autonomous agents. In particular, IF games combine challenges of combinatorial action spaces, language understanding, and commonsense reasoning. To facilitate rapid development of language-based agents, we introduce Jericho, a learning environment for man-made IF games and conduct a comprehensive study of text-agents across a rich set of games, highlighting directions in which agents can improve.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05398v3
PDF https://arxiv.org/pdf/1909.05398v3.pdf
PWC https://paperswithcode.com/paper/interactive-fiction-games-a-colossal
Repo https://github.com/Microsoft/jericho
Framework none

Grouped sparse projection

Title Grouped sparse projection
Authors Nicolas Gillis, Riyasat Ohib, Sergey Plis, Vamsi Potluru
Abstract As evident from deep learning, very large models bring improvements in training dynamics and representation power. Yet, smaller models have benefits of energy efficiency and interpretability. To get the benefits from both ends of the spectrum we often encourage sparsity in the model. Unfortunately, most existing approaches do not have a controllable way to request a desired value of sparsity in an interpretable parameter. In this paper, we design a new sparse projection method for a set of vectors in order to achieve a desired average level of sparsity which is measured using the ratio of the $\ell_1$ and $\ell_2$ norms. Most existing methods project each vector individually trying to achieve a target sparsity, hence the user has to choose a sparsity level for each vector (e.g., impose that all vectors have the same sparsity). Instead, we project all vectors together to achieve an average target sparsity, where the sparsity levels of the vectors are automatically tuned. We also propose a generalization of this projection using a new notion of weighted sparsity measured using the ratio of a weighted $\ell_1$ and the $\ell_2$ norms. These projections can be used in particular to sparsify the columns of a matrix, which we use to compute sparse nonnegative matrix factorization and to learn sparse deep networks.
Tasks
Published 2019-12-09
URL https://arxiv.org/abs/1912.03896v2
PDF https://arxiv.org/pdf/1912.03896v2.pdf
PWC https://paperswithcode.com/paper/grouped-sparse-projection
Repo https://github.com/riohib/GSP
Framework pytorch
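
The Grouped sparse projection abstract measures sparsity through the ratio of the $\ell_1$ and $\ell_2$ norms. One common normalization of that ratio is Hoyer's measure, which maps a vector of length $n$ to a value between 0 (all entries equal in magnitude) and 1 (a single nonzero entry). The sketch below assumes that normalization and takes a plain unweighted mean over the vectors; the paper's exact definition of average sparsity may differ, and the function names are hypothetical.

```python
import math


def hoyer_sparsity(vec):
    """Normalized l1/l2 sparsity: 0 for a flat vector, 1 for a one-hot vector.

    Assumed form: (sqrt(n) - ||x||_1 / ||x||_2) / (sqrt(n) - 1).
    """
    n = len(vec)
    l1 = sum(abs(v) for v in vec)
    l2 = math.sqrt(sum(v * v for v in vec))
    if n < 2 or l2 == 0.0:
        raise ValueError("need a nonzero vector with at least two entries")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


def average_sparsity(vectors):
    """Unweighted mean sparsity over a group of vectors (an assumption here)."""
    return sum(hoyer_sparsity(v) for v in vectors) / len(vectors)


if __name__ == "__main__":
    columns = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
    print([hoyer_sparsity(c) for c in columns], average_sparsity(columns))
```

A grouped projection only constrains the average, so one column may end up much sparser than another as long as the mean hits the target.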

A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

Title A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
Authors Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
Abstract Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, reconstruction error). We present the Visual Task Adaptation Benchmark (VTAB), which defines good representations as those that adapt to diverse, unseen tasks with few examples. With VTAB, we conduct a large-scale study of many popular publicly-available representation learning algorithms. We carefully control confounders such as architecture and tuning budget. We address questions like: How effective are ImageNet representations beyond standard natural datasets? How do representations trained via generative and discriminative models compare? To what extent can self-supervision replace labels? And, how close are we to general visual representations?
Tasks Image Classification, Representation Learning
Published 2019-10-01
URL https://arxiv.org/abs/1910.04867v2
PDF https://arxiv.org/pdf/1910.04867v2.pdf
PWC https://paperswithcode.com/paper/the-visual-task-adaptation-benchmark
Repo https://github.com/google-research/task_adaptation
Framework tf

Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output

Title Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output
Authors Vahdat Abdelzad, Krzysztof Czarnecki, Rick Salay, Taylor Denounden, Sachin Vernekar, Buu Phan
Abstract Deep neural networks achieve superior performance in challenging tasks such as image classification. However, deep classifiers tend to incorrectly classify out-of-distribution (OOD) inputs, which are inputs that do not belong to the classifier training distribution. Several approaches have been proposed to detect OOD inputs, but the detection task is still an ongoing challenge. In this paper, we propose a new OOD detection approach that can be easily applied to an existing classifier and does not need to have access to OOD samples. The detector is a one-class classifier trained on the output of an early layer of the original classifier fed with its original training set. We apply our approach to several low- and high-dimensional datasets and compare it to the state-of-the-art detection approaches. Our approach achieves substantially better results over multiple metrics.
Tasks Image Classification, One-class classifier
Published 2019-10-23
URL https://arxiv.org/abs/1910.10307v1
PDF https://arxiv.org/pdf/1910.10307v1.pdf
PWC https://paperswithcode.com/paper/detecting-out-of-distribution-inputs-in-deep
Repo https://github.com/gietema/ood-early-layer-detection
Framework pytorch
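
The abstract above describes a detector that is a one-class classifier fitted on early-layer outputs of the original training set, with no OOD samples needed. The sketch below illustrates only that workflow. It swaps in a deliberately simple stand-in for the one-class classifier (distance to the feature centroid with a quantile threshold), and the feature vectors are hypothetical placeholders for early-layer activations. None of the names come from the paper or its repository.

```python
import math


def fit_centroid_detector(features, quantile=0.95):
    """Fit a toy one-class detector on in-distribution feature vectors.

    Stand-in for a real one-class classifier: stores the centroid and a
    distance threshold taken at the given quantile of training distances.
    """
    dim = len(features[0])
    centroid = [sum(f[d] for f in features) / len(features) for d in range(dim)]
    dists = sorted(math.dist(f, centroid) for f in features)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def is_ood(feature, centroid, threshold):
    """Flag a feature vector as out-of-distribution if it lies beyond the threshold."""
    return math.dist(feature, centroid) > threshold


if __name__ == "__main__":
    # Hypothetical early-layer features of training inputs.
    train_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    centroid, threshold = fit_centroid_detector(train_features)
    print(is_ood((0.5, 0.5), centroid, threshold))
    print(is_ood((10.0, 10.0), centroid, threshold))
```

The point carried over from the abstract is that the detector is fitted only on features of in-distribution data and can be attached to an existing classifier without retraining it.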

Neural Extractive Text Summarization with Syntactic Compression

Title Neural Extractive Text Summarization with Syntactic Compression
Authors Jiacheng Xu, Greg Durrett
Abstract Recent neural network approaches to summarization are largely either selection-based extraction or generation-based abstraction. In this work, we present a neural model for single-document summarization based on joint extraction and syntactic compression. Our model chooses sentences from the document, identifies possible compressions based on constituency parses, and scores those compressions with a neural model to produce the final summary. For learning, we construct oracle extractive-compressive summaries, then learn both of our components jointly with this supervision. Experimental results on the CNN/Daily Mail and New York Times datasets show that our model achieves strong performance (comparable to state-of-the-art systems) as evaluated by ROUGE. Moreover, our approach outperforms an off-the-shelf compression module, and human and manual evaluation shows that our model’s output generally remains grammatical.
Tasks Document Summarization, Text Summarization
Published 2019-02-03
URL https://arxiv.org/abs/1902.00863v2
PDF https://arxiv.org/pdf/1902.00863v2.pdf
PWC https://paperswithcode.com/paper/neural-extractive-text-summarization-with
Repo https://github.com/jiacheng-xu/neu-compression-sum
Framework none

An adaptive nearest neighbor rule for classification

Title An adaptive nearest neighbor rule for classification
Authors Akshay Balsubramani, Sanjoy Dasgupta, Yoav Freund, Shay Moran
Abstract We introduce a variant of the $k$-nearest neighbor classifier in which $k$ is chosen adaptively for each query, rather than supplied as a parameter. The choice of $k$ depends on properties of each neighborhood, and therefore may significantly vary between different points. (For example, the algorithm will use larger $k$ for predicting the labels of points in noisy regions.) We provide theory and experiments that demonstrate that the algorithm performs comparably to, and sometimes better than, $k$-NN with an optimal choice of $k$. In particular, we derive bounds on the convergence rates of our classifier that depend on a local quantity we call the ‘advantage’ which is significantly weaker than the Lipschitz conditions used in previous convergence rate proofs. These generalization bounds hinge on a variant of the seminal Uniform Convergence Theorem due to Vapnik and Chervonenkis; this variant concerns conditional probabilities and may be of independent interest.
Tasks
Published 2019-05-29
URL https://arxiv.org/abs/1905.12717v1
PDF https://arxiv.org/pdf/1905.12717v1.pdf
PWC https://paperswithcode.com/paper/an-adaptive-nearest-neighbor-rule-for
Repo https://github.com/b-akshay/aknn-classifier
Framework none
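
The adaptive nearest neighbor abstract says $k$ is chosen per query and grows in noisy regions. A simplified sketch of that idea for labels in {-1, +1}: grow the neighborhood one point at a time and stop as soon as the label average is confidently away from zero. The stopping threshold `c / sqrt(k)` and the function name are assumptions made for illustration, not the paper's exact rule or constants.

```python
import math


def adaptive_knn_predict(train, query, c=1.0):
    """Predict a +/-1 label, choosing k adaptively for this query.

    train: list of (point, label) pairs, point a tuple of floats, label in {-1, +1}.
    Returns (prediction, k_used); prediction is 0 if no k was confident enough.
    """
    ordered = sorted(train, key=lambda pair: math.dist(pair[0], query))
    total = 0
    for k, (_, label) in enumerate(ordered, start=1):
        total += label
        # Stop when the majority margin exceeds the (assumed) confidence bound.
        if abs(total) / k > c / math.sqrt(k):
            return (1 if total > 0 else -1), k
    return 0, len(ordered)


if __name__ == "__main__":
    clean = [((0.0,), 1), ((0.1,), 1), ((0.2,), 1), ((5.0,), -1)]
    noisy = [((0.1,), 1), ((0.2,), -1), ((0.3,), 1), ((0.4,), 1), ((0.5,), 1)]
    print(adaptive_knn_predict(clean, (0.0,)))
    print(adaptive_knn_predict(noisy, (0.0,)))
```

In the clean neighborhood the rule stops after two agreeing neighbors, while the conflicting label in the noisy one forces it to look at more points before committing.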

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Title Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Authors Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu
Abstract Deep Neural Networks (DNNs) are powerful but computationally expensive and memory intensive, which impedes their practical usage on resource-constrained front-end devices. DNN pruning is an approach for deep model compression, which aims at eliminating some parameters with tolerable performance degradation. In this paper, we propose a novel momentum-SGD-based optimization method to reduce the network complexity by on-the-fly pruning. Concretely, given a global compression ratio, we categorize all the parameters into two parts at each training iteration which are updated using different rules. In this way, we gradually zero out the redundant parameters, as we update them using only the ordinary weight decay but no gradients derived from the objective function. As a departure from prior methods that require heavy human effort to tune the layer-wise sparsity ratios, prune by solving complicated non-differentiable problems or finetune the model after pruning, our method is characterized by 1) global compression that automatically finds the appropriate per-layer sparsity ratios; 2) end-to-end training; 3) no need for a time-consuming re-training process after pruning; and 4) superior capability to find better winning tickets which have won the initialization lottery.
Tasks Model Compression
Published 2019-09-27
URL https://arxiv.org/abs/1909.12778v3
PDF https://arxiv.org/pdf/1909.12778v3.pdf
PWC https://paperswithcode.com/paper/global-sparse-momentum-sgd-for-pruning-very
Repo https://github.com/DingXiaoH/GSM-SGD
Framework tf
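
The Global Sparse Momentum SGD abstract describes splitting the parameters into two groups at every iteration: one group receives gradients plus weight decay, the other only weight decay, so it shrinks toward zero. A minimal sketch of one such step on a flat parameter list follows. The ranking score `|w * g|`, the exact momentum form, and the name `gsm_step` are assumptions for illustration; the abstract does not specify them.

```python
def gsm_step(weights, grads, momentum, keep, lr=0.1, beta=0.9, weight_decay=1e-4):
    """One illustrative update; modifies weights and momentum in place.

    keep: number of parameters (globally, across all layers) that still
    receive their gradient this step. Returns the set of kept indices.
    """
    # Assumed importance score: magnitude of weight times gradient.
    scores = [abs(w * g) for w, g in zip(weights, grads)]
    order = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    active = set(order[:keep])
    for i in range(len(weights)):
        # Inactive parameters see only weight decay, no objective gradient.
        masked_grad = grads[i] if i in active else 0.0
        momentum[i] = beta * momentum[i] + weight_decay * weights[i] + masked_grad
        weights[i] -= lr * momentum[i]
    return active


if __name__ == "__main__":
    w = [1.0, 0.01, -2.0]
    g = [0.5, 0.5, 0.1]
    m = [0.0, 0.0, 0.0]
    print(gsm_step(w, g, m, keep=1), w)
```

Because the selection is made over all parameters at once, the per-layer sparsity is not set by hand; it falls out of which layers hold the highest-scoring parameters.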