February 2, 2020

Paper Group AWR 74

Using a KG-Copy Network for Non-Goal Oriented Dialogues


Title	Using a KG-Copy Network for Non-Goal Oriented Dialogues
Authors	Debanjan Chaudhuri, Md Rashad Al Hasan Rony, Simon Jordan, Jens Lehmann
Abstract	Non-goal oriented, generative dialogue systems lack the ability to generate answers with grounded facts. A knowledge graph can be considered an abstraction of the real world consisting of well-grounded facts. This paper addresses the problem of generating well-grounded responses by integrating knowledge graphs into the dialogue system's response generation process, in an end-to-end manner. A dataset for non-goal oriented dialogues is proposed in this paper in the domain of soccer, conversing on different clubs and national teams along with a knowledge graph for each of these teams. A novel neural network architecture is also proposed as a baseline on this dataset, which can integrate knowledge graphs into the response generation process, producing well-articulated, knowledge-grounded responses. Empirical evidence suggests that the proposed model performs better than other state-of-the-art models for knowledge graph integrated dialogue systems.
Tasks	Knowledge Graphs
Published	2019-10-17
URL	https://arxiv.org/abs/1910.07834v1
PDF	https://arxiv.org/pdf/1910.07834v1.pdf
PWC	https://paperswithcode.com/paper/using-a-kg-copy-network-for-non-goal-oriented
Repo	https://github.com/SmartDataAnalytics/KG-Copy_Network
Framework	pytorch

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems


Title	Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems
Authors	Asma Ghandeharioun, Judy Hanwen Shen, Natasha Jaques, Craig Ferguson, Noah Jones, Agata Lapedriza, Rosalind Picard
Abstract	Building an open-domain conversational agent is a challenging problem. Current evaluation methods, mostly post-hoc judgments of static conversation, do not capture conversation quality in a realistic interactive context. In this paper, we investigate interactive human evaluation and provide evidence for its necessity; we then introduce a novel, model-agnostic, and dataset-agnostic method to approximate it. In particular, we propose a self-play scenario where the dialog system talks to itself and we calculate a combination of proxies such as sentiment and semantic coherence on the conversation trajectory. We show that this metric is capable of capturing the human-rated quality of a dialog model better than any automated metric known to date, achieving a significant Pearson correlation (r>.7, p<.05). To investigate the strengths of this novel metric and interactive evaluation in comparison to state-of-the-art metrics and human evaluation of static conversations, we perform extended experiments with a set of models, including several that make novel improvements to recent hierarchical dialog generation architectures through sentiment and semantic knowledge distillation on the utterance level. Finally, we open-source the interactive evaluation platform we built and the dataset we collected to allow researchers to efficiently deploy and evaluate dialog models.
Tasks
Published	2019-06-21
URL	https://arxiv.org/abs/1906.09308v2
PDF	https://arxiv.org/pdf/1906.09308v2.pdf
PWC	https://paperswithcode.com/paper/approximating-interactive-human-evaluation
Repo	https://github.com/asmadotgh/neural_chat_web
Framework	none

Dilated deeply supervised networks for hippocampus segmentation in MRI


Title	Dilated deeply supervised networks for hippocampus segmentation in MRI
Authors	Lukas Folle, Sulaiman Vesal, Nishant Ravikumar, Andreas Maier
Abstract	Tissue loss in the hippocampi has been heavily correlated with the progression of Alzheimer’s Disease (AD). The shape and structure of the hippocampus are important factors in terms of early AD diagnosis and prognosis by clinicians. However, manual segmentation of such subcortical structures in MR studies is a challenging and subjective task. In this paper, we investigate variants of the well-known 3D U-Net, a type of convolutional neural network (CNN) for semantic segmentation tasks. We propose an alternative form of the 3D U-Net, which uses dilated convolutions and deep supervision to incorporate multi-scale information into the model. The proposed method is evaluated on the task of hippocampus head and body segmentation in an MRI dataset, provided as part of the MICCAI 2018 segmentation decathlon challenge. The experimental results show that our approach outperforms other conventional methods in terms of different segmentation accuracy metrics.
Tasks	Accuracy Metrics, Semantic Segmentation
Published	2019-03-20
URL	http://arxiv.org/abs/1903.09097v1
PDF	http://arxiv.org/pdf/1903.09097v1.pdf
PWC	https://paperswithcode.com/paper/dilated-deeply-supervised-networks-for
Repo	https://github.com/satyakees/FaultNet
Framework	pytorch
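
The abstract above relies on dilated convolutions to bring multi-scale context into the model. As a rough illustration only, the sketch below shows the idea in one dimension with plain Python lists; the paper works with a 3D U-Net, and the function name `dilated_conv1d` and its arguments are hypothetical, not taken from the authors' code.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D convolution with gaps of `dilation` between taps.

    A simplified 1D stand-in for the 3D dilated convolutions described in
    the abstract; all names here are illustrative assumptions.
    """
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        # Each kernel tap reads an input sample `dilation` steps apart.
        out.append(sum(k * signal[start + j * dilation]
                       for j, k in enumerate(kernel)))
    return out
```

With a 3-tap kernel, `dilation=1` covers 3 neighbouring samples while `dilation=2` covers a window of 5 samples using the same number of weights, which is how dilation widens the receptive field without adding parameters.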

Density Matching for Bilingual Word Embedding


Title	Density Matching for Bilingual Word Embedding
Authors	Chunting Zhou, Xuezhe Ma, Di Wang, Graham Neubig
Abstract	Recent approaches to cross-lingual word embedding have generally been based on linear transformations between the sets of embedding vectors in the two languages. In this paper, we propose an approach that instead expresses the two monolingual embedding spaces as probability densities defined by a Gaussian mixture model, and matches the two densities using a method called normalizing flow. The method requires no explicit supervision, and can be learned with only a seed dictionary of words that have identical strings. We argue that this formulation has several intuitively attractive properties, particularly with respect to improving robustness and generalization to mappings between difficult language pairs or word pairs. On a benchmark data set of bilingual lexicon induction and cross-lingual word similarity, our approach can achieve competitive or superior performance compared to state-of-the-art published results, with particularly strong results being found on etymologically distant and/or morphologically rich languages.
Tasks	Word Embeddings
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02343v3
PDF	http://arxiv.org/pdf/1904.02343v3.pdf
PWC	https://paperswithcode.com/paper/density-matching-for-bilingual-word-embedding
Repo	https://github.com/violet-zct/DeMa-BWE
Framework	pytorch

An Attention Mechanism for Musical Instrument Recognition


Title	An Attention Mechanism for Musical Instrument Recognition
Authors	Siddharth Gururani, Mohit Sharma, Alexander Lerch
Abstract	While the automatic recognition of musical instruments has seen significant progress, the task is still considered hard for music featuring multiple instruments as opposed to single instrument recordings. Datasets for polyphonic instrument recognition can be categorized into roughly two categories. Some, such as MedleyDB, have strong per-frame instrument activity annotations but are usually small in size. Other, larger datasets such as OpenMIC only have weak labels, i.e., instrument presence or absence is annotated only for long snippets of a song. We explore an attention mechanism for handling weakly labeled data for multi-label instrument recognition. Attention has been found to perform well for other tasks with weakly labeled data. We compare the proposed attention model to multiple models which include a baseline binary relevance random forest, recurrent neural network, and fully connected neural networks. Our results show that incorporating attention leads to an overall improvement in classification accuracy metrics across all 20 instruments in the OpenMIC dataset. We find that attention enables models to focus on (or 'attend to') specific time segments in the audio relevant to each instrument label leading to interpretable results.
Tasks	Accuracy Metrics
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04294v1
PDF	https://arxiv.org/pdf/1907.04294v1.pdf
PWC	https://paperswithcode.com/paper/an-attention-mechanism-for-musical-instrument
Repo	https://github.com/SiddGururani/AttentionMIC
Framework	pytorch
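
The abstract above describes attention that lets a model weight specific time segments of a weakly labeled clip. A minimal sketch of that pooling step for a single instrument label follows, assuming some upstream model has already produced per-segment probabilities and attention scores; `attention_pool` and both of its inputs are hypothetical names and are not taken from the AttentionMIC repository.

```python
import math

def attention_pool(segment_probs, attention_logits):
    """Combine per-segment probabilities into one clip-level probability.

    segment_probs: presence probability of one instrument per time segment.
    attention_logits: unnormalized attention score per time segment.
    Both inputs are assumed outputs of a hypothetical upstream model.
    """
    # Softmax over time so the attention weights sum to 1
    # (subtracting the max keeps exp() numerically stable).
    m = max(attention_logits)
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The clip-level prediction is the attention-weighted average.
    clip_prob = sum(w * p for w, p in zip(weights, segment_probs))
    return clip_prob, weights
```

The returned weights are what makes the result inspectable: a large weight on one segment indicates that the clip-level decision was driven mostly by that part of the audio.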

Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss


Title	Tag2Pix: Line Art Colorization Using Text Tag With SECat and Changing Loss
Authors	Hyunsu Kim, Ho Young Jhoo, Eunhyeok Park, Sungjoo Yoo
Abstract	Line art colorization is expensive and challenging to automate. A GAN approach to line art colorization, called Tag2Pix, is proposed; it takes a grayscale line art image and color tag information as input and produces a quality colored image. First, we present the Tag2Pix line art colorization dataset. A generator network is proposed which consists of convolutional layers to transform the input line art, a pre-trained semantic extraction network, and an encoder for input color information. The discriminator is based on an auxiliary classifier GAN to classify the tag information as well as genuineness. In addition, we propose a novel network structure called SECat, which makes the generator properly colorize even small features such as eyes, and also suggest a novel two-step training method where the generator and discriminator first learn the notion of object and shape and then, based on the learned notion, learn colorization, such as where and how to place which color. We present both quantitative and qualitative evaluations which prove the effectiveness of the proposed method.
Tasks	Colorization, Line Art Colorization
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05840v1
PDF	https://arxiv.org/pdf/1908.05840v1.pdf
PWC	https://paperswithcode.com/paper/tag2pix-line-art-colorization-using-text-tag
Repo	https://github.com/blandocs/Tag2Pix
Framework	pytorch

Deep Image Blending


Title	Deep Image Blending
Authors	Lingzhi Zhang, Tarmily Wen, Jianbo Shi
Abstract	Image composition is an important operation to create visual content. Among image composition tasks, image blending aims to seamlessly blend an object from a source image onto a target image with slight mask adjustment. A popular approach is Poisson image blending, which enforces the gradient domain smoothness in the composite image. However, this approach only considers the boundary pixels of the target image, and thus cannot adapt to the texture of the target image. In addition, the colors of the target image often seep through the original source object too much, causing a significant loss of content of the source object. We propose a Poisson blending loss that achieves the same purpose as Poisson image blending. In addition, we jointly optimize the proposed Poisson blending loss as well as the style and content loss computed from a deep network, and reconstruct the blending region by iteratively updating the pixels using the L-BFGS solver. In the blended image, we not only smooth out the gradient domain of the blending boundary but also add consistent texture into the blending region. User studies show that our method outperforms strong baselines as well as state-of-the-art approaches when placing objects onto both paintings and real-world images.
Tasks
Published	2019-10-25
URL	https://arxiv.org/abs/1910.11495v1
PDF	https://arxiv.org/pdf/1910.11495v1.pdf
PWC	https://paperswithcode.com/paper/deep-image-blending
Repo	https://github.com/owenzlz/Deep_Image_Blending
Framework	pytorch

Coloring With Limited Data: Few-Shot Colorization via Memory-Augmented Networks


Title	Coloring With Limited Data: Few-Shot Colorization via Memory-Augmented Networks
Authors	Seungjoo Yoo, Hyojin Bahng, Sunghyo Chung, Junsoo Lee, Jaehyuk Chang, Jaegul Choo
Abstract	Despite recent advancements in deep learning-based automatic colorization, existing models are still limited when it comes to few-shot learning because they require a significant amount of training data. To tackle this issue, we present a novel memory-augmented colorization model MemoPainter that can produce high-quality colorization with limited data. In particular, our model is able to capture rare instances and successfully colorize them. We also propose a novel threshold triplet loss that enables unsupervised training of memory networks without the need for class labels. Experiments show that our model has superior quality in both few-shot and one-shot colorization tasks.
Tasks	Colorization, Few-Shot Learning
Published	2019-06-09
URL	https://arxiv.org/abs/1906.11888v1
PDF	https://arxiv.org/pdf/1906.11888v1.pdf
PWC	https://paperswithcode.com/paper/coloring-with-limited-data-few-shot-1
Repo	https://github.com/dongheehand/MemoPainter-PyTorch
Framework	pytorch

Interactive Fiction Games: A Colossal Adventure


Title	Interactive Fiction Games: A Colossal Adventure
Authors	Matthew Hausknecht, Prithviraj Ammanabrolu, Marc-Alexandre Côté, Xingdi Yuan
Abstract	A hallmark of human intelligence is the ability to understand and communicate with language. Interactive Fiction games are fully text-based simulation environments where a player issues text commands to effect change in the environment and progress through the story. We argue that IF games are an excellent testbed for studying language-based autonomous agents. In particular, IF games combine challenges of combinatorial action spaces, language understanding, and commonsense reasoning. To facilitate rapid development of language-based agents, we introduce Jericho, a learning environment for man-made IF games and conduct a comprehensive study of text-agents across a rich set of games, highlighting directions in which agents can improve.
Tasks
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05398v3
PDF	https://arxiv.org/pdf/1909.05398v3.pdf
PWC	https://paperswithcode.com/paper/interactive-fiction-games-a-colossal
Repo	https://github.com/Microsoft/jericho
Framework	none

Grouped sparse projection


Title	Grouped sparse projection
Authors	Nicolas Gillis, Riyasat Ohib, Sergey Plis, Vamsi Potluru
Abstract	As evident from deep learning, very large models bring improvements in training dynamics and representation power. Yet, smaller models have benefits of energy efficiency and interpretability. To get the benefits from both ends of the spectrum we often encourage sparsity in the model. Unfortunately, most existing approaches do not have a controllable way to request a desired value of sparsity in an interpretable parameter. In this paper, we design a new sparse projection method for a set of vectors in order to achieve a desired average level of sparsity which is measured using the ratio of the $\ell_1$ and $\ell_2$ norms. Most existing methods project each vector individually trying to achieve a target sparsity, hence the user has to choose a sparsity level for each vector (e.g., impose that all vectors have the same sparsity). Instead, we project all vectors together to achieve an average target sparsity, where the sparsity levels of the vectors are automatically tuned. We also propose a generalization of this projection using a new notion of weighted sparsity measured using the ratio of a weighted $\ell_1$ and the $\ell_2$ norms. These projections can be used in particular to sparsify the columns of a matrix, which we use to compute sparse nonnegative matrix factorization and to learn sparse deep networks.
Tasks
Published	2019-12-09
URL	https://arxiv.org/abs/1912.03896v2
PDF	https://arxiv.org/pdf/1912.03896v2.pdf
PWC	https://paperswithcode.com/paper/grouped-sparse-projection
Repo	https://github.com/riohib/GSP
Framework	pytorch
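
The abstract above measures sparsity through the ratio of the $\ell_1$ and $\ell_2$ norms. One common normalization of that ratio is Hoyer's measure, which maps a vector to 0 when all entries have equal magnitude and to 1 when only one entry is nonzero. The sketch below computes it, along with a plain mean over several vectors; using exactly this normalization and an unweighted mean is an assumption made for illustration, and the function names are hypothetical rather than taken from the GSP repository.

```python
import math

def hoyer_sparsity(x):
    """Sparsity in [0, 1] from the ratio of the l1 and l2 norms of x.

    Assumed normalization: (sqrt(n) - l1/l2) / (sqrt(n) - 1).
    """
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    if n < 2 or l2 == 0.0:
        raise ValueError("need a nonzero vector with at least 2 entries")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

def average_sparsity(vectors):
    """Unweighted mean sparsity over a set of vectors (illustrative only)."""
    return sum(hoyer_sparsity(v) for v in vectors) / len(vectors)
```

A grouped projection as described in the abstract would constrain a quantity like `average_sparsity` rather than each per-vector value, so individual vectors are free to end up more or less sparse than the target.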

A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark


Title	A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
Authors	Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby
Abstract	Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, reconstruction error). We present the Visual Task Adaptation Benchmark (VTAB), which defines good representations as those that adapt to diverse, unseen tasks with few examples. With VTAB, we conduct a large-scale study of many popular publicly-available representation learning algorithms. We carefully control confounders such as architecture and tuning budget. We address questions like: How effective are ImageNet representations beyond standard natural datasets? How do representations trained via generative and discriminative models compare? To what extent can self-supervision replace labels? And, how close are we to general visual representations?
Tasks	Image Classification, Representation Learning
Published	2019-10-01
URL	https://arxiv.org/abs/1910.04867v2
PDF	https://arxiv.org/pdf/1910.04867v2.pdf
PWC	https://paperswithcode.com/paper/the-visual-task-adaptation-benchmark
Repo	https://github.com/google-research/task_adaptation
Framework	tf

Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output


Title	Detecting Out-of-Distribution Inputs in Deep Neural Networks Using an Early-Layer Output
Authors	Vahdat Abdelzad, Krzysztof Czarnecki, Rick Salay, Taylor Denounden, Sachin Vernekar, Buu Phan
Abstract	Deep neural networks achieve superior performance in challenging tasks such as image classification. However, deep classifiers tend to incorrectly classify out-of-distribution (OOD) inputs, which are inputs that do not belong to the classifier training distribution. Several approaches have been proposed to detect OOD inputs, but the detection task is still an ongoing challenge. In this paper, we propose a new OOD detection approach that can be easily applied to an existing classifier and does not need to have access to OOD samples. The detector is a one-class classifier trained on the output of an early layer of the original classifier fed with its original training set. We apply our approach to several low- and high-dimensional datasets and compare it to the state-of-the-art detection approaches. Our approach achieves substantially better results over multiple metrics.
Tasks	Image Classification, One-class classifier
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10307v1
PDF	https://arxiv.org/pdf/1910.10307v1.pdf
PWC	https://paperswithcode.com/paper/detecting-out-of-distribution-inputs-in-deep
Repo	https://github.com/gietema/ood-early-layer-detection
Framework	pytorch

Neural Extractive Text Summarization with Syntactic Compression


Title	Neural Extractive Text Summarization with Syntactic Compression
Authors	Jiacheng Xu, Greg Durrett
Abstract	Recent neural network approaches to summarization are largely either selection-based extraction or generation-based abstraction. In this work, we present a neural model for single-document summarization based on joint extraction and syntactic compression. Our model chooses sentences from the document, identifies possible compressions based on constituency parses, and scores those compressions with a neural model to produce the final summary. For learning, we construct oracle extractive-compressive summaries, then learn both of our components jointly with this supervision. Experimental results on the CNN/Daily Mail and New York Times datasets show that our model achieves strong performance (comparable to state-of-the-art systems) as evaluated by ROUGE. Moreover, our approach outperforms an off-the-shelf compression module, and human and manual evaluation shows that our model’s output generally remains grammatical.
Tasks	Document Summarization, Text Summarization
Published	2019-02-03
URL	https://arxiv.org/abs/1902.00863v2
PDF	https://arxiv.org/pdf/1902.00863v2.pdf
PWC	https://paperswithcode.com/paper/neural-extractive-text-summarization-with
Repo	https://github.com/jiacheng-xu/neu-compression-sum
Framework	none

An adaptive nearest neighbor rule for classification


Title	An adaptive nearest neighbor rule for classification
Authors	Akshay Balsubramani, Sanjoy Dasgupta, Yoav Freund, Shay Moran
Abstract	We introduce a variant of the $k$-nearest neighbor classifier in which $k$ is chosen adaptively for each query, rather than supplied as a parameter. The choice of $k$ depends on properties of each neighborhood, and therefore may significantly vary between different points. (For example, the algorithm will use larger $k$ for predicting the labels of points in noisy regions.) We provide theory and experiments that demonstrate that the algorithm performs comparably to, and sometimes better than, $k$-NN with an optimal choice of $k$. In particular, we derive bounds on the convergence rates of our classifier that depend on a local quantity we call the 'advantage' which is significantly weaker than the Lipschitz conditions used in previous convergence rate proofs. These generalization bounds hinge on a variant of the seminal Uniform Convergence Theorem due to Vapnik and Chervonenkis; this variant concerns conditional probabilities and may be of independent interest.
Tasks
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12717v1
PDF	https://arxiv.org/pdf/1905.12717v1.pdf
PWC	https://paperswithcode.com/paper/an-adaptive-nearest-neighbor-rule-for
Repo	https://github.com/b-akshay/aknn-classifier
Framework	none
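
The abstract above describes choosing $k$ per query so that noisier neighborhoods use more neighbors. The sketch below is one simplified way to realize that behaviour for binary labels in {-1, +1}: grow $k$ until the average label of the $k$ nearest points is confidently away from zero. The stopping threshold `confidence / sqrt(k)` and every name in the code are assumptions for illustration; the exact rule and constants in the paper may differ.

```python
import math

def adaptive_knn_predict(train, query, confidence=1.5):
    """Predict a +1/-1 label, picking k adaptively for this query.

    train: list of (point, label) pairs, point a tuple of floats,
           label either -1 or +1.
    confidence: hypothetical constant scaling the stopping threshold.
    Returns (label, k); label is None if no k is confident enough.
    """
    # Visit training points from nearest to farthest.
    ordered = sorted(train, key=lambda pl: math.dist(pl[0], query))
    label_sum = 0
    for k, (_, label) in enumerate(ordered, start=1):
        label_sum += label
        # Stop at the smallest k whose mean label is clearly biased.
        if abs(label_sum) / k > confidence / math.sqrt(k):
            return (1 if label_sum > 0 else -1), k
    return None, len(ordered)
```

In a clean region the first few neighbors agree and the loop stops early; where labels are mixed, the mean stays near zero and the loop keeps adding neighbors, which matches the behaviour the abstract describes for noisy regions.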

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks


Title	Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Authors	Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu
Abstract	Deep neural networks (DNNs) are powerful but computationally expensive and memory intensive, which impedes their practical usage on resource-constrained front-end devices. DNN pruning is an approach for deep model compression, which aims at eliminating some parameters with tolerable performance degradation. In this paper, we propose a novel momentum-SGD-based optimization method to reduce the network complexity by on-the-fly pruning. Concretely, given a global compression ratio, we categorize all the parameters into two parts at each training iteration which are updated using different rules. In this way, we gradually zero out the redundant parameters, as we update them using only the ordinary weight decay but no gradients derived from the objective function. As a departure from prior methods that require heavy human effort to tune the layer-wise sparsity ratios, prune by solving complicated non-differentiable problems or finetune the model after pruning, our method is characterized by 1) global compression that automatically finds the appropriate per-layer sparsity ratios; 2) end-to-end training; 3) no need for a time-consuming re-training process after pruning; and 4) superior capability to find better winning tickets which have won the initialization lottery.
Tasks	Model Compression
Published	2019-09-27
URL	https://arxiv.org/abs/1909.12778v3
PDF	https://arxiv.org/pdf/1909.12778v3.pdf
PWC	https://paperswithcode.com/paper/global-sparse-momentum-sgd-for-pruning-very
Repo	https://github.com/DingXiaoH/GSM-SGD
Framework	tf
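
The abstract above splits parameters into two groups at every iteration: one group receives the objective gradient, the other only weight decay, so it shrinks toward zero. The sketch below shows a single such update on flat Python lists. The abstract does not say how parameters are ranked, so the saliency score `|w * g|` used here is an assumption, as are the function name `gsm_step` and its arguments; none of this is taken from the GSM-SGD repository.

```python
def gsm_step(weights, grads, velocity, keep_count,
             lr=0.1, momentum=0.9, weight_decay=1e-4):
    """One momentum-SGD step with a global top-`keep_count` gradient mask.

    keep_count: how many parameters (network-wide) keep their gradient;
    it stands in for the global compression ratio in the abstract.
    """
    # Assumed saliency score per parameter: |weight * gradient|.
    scores = [abs(w * g) for w, g in zip(weights, grads)]
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    active = set(ranked[:keep_count])

    new_weights, new_velocity = [], []
    for i, (w, g, v) in enumerate(zip(weights, grads, velocity)):
        # Inactive parameters see weight decay only, no objective gradient.
        masked_grad = g if i in active else 0.0
        v = momentum * v + weight_decay * w + masked_grad
        new_weights.append(w - lr * v)
        new_velocity.append(v)
    return new_weights, new_velocity
```

Because the ranking is computed over all parameters at once rather than per layer, the number of surviving weights in each layer falls out of the selection instead of being set by hand.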