Paper Group NANR 1
Contents: Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers · A Situated Dialogue System for Learning Structural Concepts in Blocks World · Learning Longer-term Dependencies in RNNs with Auxiliary Losses · Do Deep Reinforcement Learning Algorithms really Learn to Navigate? · SystemT: Declarative Text Understanding for Enterprise · Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs · Dynamic Feature Selection with Attention in Incremental Parsing · Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model · Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks · “Learning-Compression” Algorithms for Neural Net Pruning · Complementary Strategies for Low Resourced Morphological Modeling · Task-Aware Image Downscaling · Feature Quantization for Defending Against Distortion of Images · Deep Marching Cubes: Learning Explicit Surface Representations · PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
Title | Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers |
Authors | Adrien Barbaresi |
Abstract | The present contribution revolves around efficient approaches to language classification which have been field-tested in the Vardial evaluation campaign. The methods used in several language identification tasks comprising different language types are presented and their results are discussed, giving insights on real-world application of regularization, linear classifiers and corresponding linguistic features. The use of a specially adapted Ridge classifier proved useful in two of the three tasks. The overall approach (XAC) slightly outperformed most of the other systems on the DFS task (Dutch and Flemish) and on the ILI task (Indo-Aryan languages), while its comparative performance was poorer on the GDI task (Swiss German dialects). |
Tasks | Language Identification, Text Categorization |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3918/ |
PWC | https://paperswithcode.com/paper/computationally-efficient-discrimination |
Repo | |
Framework | |
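The approach above pairs large character n-gram feature vectors with a regularized linear model. A minimal scikit-learn sketch of that recipe, assuming TF-IDF character n-grams and a default regularization strength rather than the exact XAC configuration:

```python
# Sketch of dialect/variety discrimination with wide char n-gram features and
# a Ridge (L2-regularized linear) classifier. Features and alpha are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-ins for variety-labelled training sentences (hypothetical examples).
texts = [
    "dat is een mooie dag vandaag",
    "da's ne schone dag vandaag",
]
labels = ["DUT", "BEL"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 4), sublinear_tf=True),
    RidgeClassifier(alpha=1.0),  # regularized linear classifier
)
clf.fit(texts, labels)
print(clf.predict(["dit is een schone dag"]))
```

Ridge training reduces to a regularized least-squares problem, which stays cheap even when the character n-gram feature matrix is very wide — the computational-efficiency point the title makes.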
A Situated Dialogue System for Learning Structural Concepts in Blocks World
Title | A Situated Dialogue System for Learning Structural Concepts in Blocks World |
Authors | Ian Perera, James Allen, Choh Man Teng, Lucian Galescu |
Abstract | We present a modular, end-to-end dialogue system for a situated agent to address a multimodal, natural language dialogue task in which the agent learns complex representations of block structure classes through assertions, demonstrations, and questioning. The concept to learn is provided to the user through a set of positive and negative visual examples, from which the user determines the underlying constraints to be provided to the system in natural language. The system in turn asks questions about demonstrated examples and simulates new examples to check its knowledge and verify the user's description is complete. We find that this task is non-trivial for users and generates natural language that is varied yet understood by our deep language understanding architecture. |
Tasks | Transfer Learning |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-5010/ |
PWC | https://paperswithcode.com/paper/a-situated-dialogue-system-for-learning |
Repo | |
Framework | |
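To make the learning task above concrete, here is a toy sketch (not the authors' system) of the underlying loop: a structural concept is a set of constraints over block structures, and a learned constraint is checked against positive and negative demonstrations. The `Structure` class and the staircase constraint are hypothetical examples:

```python
# Toy model of concept verification in a blocks world: a concept is a
# constraint over structures, tested against demonstrated examples.
from dataclasses import dataclass

@dataclass
class Structure:
    heights: list  # column heights of a block structure, e.g. [1, 2, 3]

def is_staircase(s: Structure) -> bool:
    """Hypothetical learned constraint: each column one block taller."""
    return all(b - a == 1 for a, b in zip(s.heights, s.heights[1:]))

positives = [Structure([1, 2, 3]), Structure([2, 3, 4])]
negatives = [Structure([1, 3, 2])]

# Verify the constraint separates the demonstrations, as the system's
# questioning/simulation step would before accepting the description.
assert all(is_staircase(s) for s in positives)
assert not any(is_staircase(s) for s in negatives)
print("constraint consistent with all demonstrations")
```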
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Title | Learning Longer-term Dependencies in RNNs with Auxiliary Losses |
Authors | Trieu Trinh, Andrew Dai, Thang Luong, Quoc Le |
Abstract | Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long-term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method on a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16000, and a real document classification benchmark. Our results highlight the good performance and resource efficiency of this approach over competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as in extreme cases where there is little to no backpropagation. |
Tasks | Document Classification, Image Classification |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2127 |
PDF | http://proceedings.mlr.press/v80/trinh18a/trinh18a.pdf |
PWC | https://paperswithcode.com/paper/learning-longer-term-dependencies-in-rnns-1 |
Repo | |
Framework | |
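A minimal PyTorch sketch of the core idea above, assuming next-step prediction as the unsupervised auxiliary objective; the paper's anchor sampling and truncation scheme are simplified away, and all dimensions are illustrative:

```python
# Recurrent classifier with an unsupervised auxiliary head: total loss is
# classification loss + aux_weight * next-token prediction loss.
import torch
import torch.nn as nn

class AuxLossRNN(nn.Module):
    def __init__(self, vocab=256, emb=64, hid=128, n_classes=10, aux_weight=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.LSTM(emb, hid, batch_first=True)
        self.cls_head = nn.Linear(hid, n_classes)  # supervised objective
        self.aux_head = nn.Linear(hid, vocab)      # unsupervised prediction
        self.aux_weight = aux_weight
        self.ce = nn.CrossEntropyLoss()

    def forward(self, x, y):
        h, _ = self.rnn(self.embed(x))             # (B, T, hid)
        cls_loss = self.ce(self.cls_head(h[:, -1]), y)
        # Auxiliary loss: predict token t+1 from the hidden state at t.
        aux_logits = self.aux_head(h[:, :-1])
        aux_loss = self.ce(aux_logits.reshape(-1, aux_logits.size(-1)),
                           x[:, 1:].reshape(-1))
        return cls_loss + self.aux_weight * aux_loss

model = AuxLossRNN()
x = torch.randint(0, 256, (4, 32))   # batch of toy "pixel" sequences
y = torch.randint(0, 10, (4,))
loss = model(x, y)
loss.backward()
print(float(loss))
```

The auxiliary gradient gives the recurrent weights a training signal at every time step, which is why it helps even when backpropagation through the main objective is heavily truncated.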
Do Deep Reinforcement Learning Algorithms really Learn to Navigate?
Title | Do Deep Reinforcement Learning Algorithms really Learn to Navigate? |
Authors | Shurjo Banerjee, Vikas Dhiman, Brent Griffin, Jason J. Corso |
Abstract | Deep reinforcement learning (DRL) algorithms have demonstrated progress in learning to find a goal in challenging environments. As the title of the paper by Mirowski et al. (2016) suggests, one might assume that DRL-based algorithms are able to “learn to navigate” and are thus ready to replace classical mapping and path-planning algorithms, at least in simulated environments. Yet, from experiments and analysis in this earlier work, it is not clear what strategies are used by these algorithms in navigating the mazes and finding the goal. In this paper, we pose and study this underlying question: are DRL algorithms doing some form of mapping and/or path-planning? Our experiments show that the algorithms are not memorizing the maps of mazes at the testing stage but, rather, at the training stage. Hence, the DRL algorithms fall short of qualifying as mapping or path-planning algorithms with any reasonable definition of mapping. We extend the experiments in Mirowski et al. (2016) by separating the set of training and testing maps and by a more ablative coverage of the space of experiments. Our systematic experiments show that the NavA3C-D1-D2-L algorithm, when trained and tested on the same maps, is able to choose the shorter paths to the goal. However, when tested on unseen maps the algorithm utilizes a wall-following strategy to find the goal without doing any mapping or path planning. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BkiIkBJ0b |
PDF | https://openreview.net/pdf?id=BkiIkBJ0b |
PWC | https://paperswithcode.com/paper/do-deep-reinforcement-learning-algorithms |
Repo | |
Framework | |
SystemT: Declarative Text Understanding for Enterprise
Title | SystemT: Declarative Text Understanding for Enterprise |
Authors | Laura Chiticariu, Marina Danilevsky, Yunyao Li, Frederick Reiss, Huaiyu Zhu |
Abstract | The rise of enterprise applications over unstructured and semi-structured documents poses new challenges to text understanding systems across multiple dimensions. We present SystemT, a declarative text understanding system that addresses these challenges and has been deployed in a wide range of enterprise applications. We highlight the design considerations and decisions behind SystemT in addressing the needs of the enterprise setting. We also summarize the impact of SystemT on business and education. |
Tasks | Document Classification, Entity Extraction, Relation Extraction, Semantic Parsing, Sentiment Analysis, Tokenization |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3010/ |
PWC | https://paperswithcode.com/paper/systemt-declarative-text-understanding-for |
Repo | |
Framework | |
Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs
Title | Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs |
Authors | Sima Sharifirad, Borna Jafarpour, Stan Matwin |
Abstract | Text classification models have been heavily utilized for a slew of interesting natural language processing problems. Like any other machine learning model, these classifiers are very dependent on the size and quality of the training dataset. Insufficient and imbalanced datasets will lead to poor performance. An interesting solution to poor datasets is to take advantage of world knowledge, in the form of knowledge graphs, to improve the training data. In this paper, we use ConceptNet and Wikidata to improve sexist tweet classification by two methods: (1) text augmentation and (2) text generation. In our text generation approach, we generate new tweets by replacing words using data acquired from ConceptNet relations in order to increase the size of our training set; this method is very helpful with frustratingly small datasets, preserves the labels, and increases diversity. In our text augmentation approach, the number of tweets in each class remains the same, but each tweet is augmented (by concatenation) with words extracted from its ConceptNet relations and with their descriptions extracted from Wikidata, so the range of each tweet increases. Our experiments show that our approach improves sexist tweet classification significantly across all of our machine learning models. Our approach can be readily applied to any other small dataset and text classification problem, such as hate speech or abusive language detection, using any machine learning model. |
Tasks | Dialogue Generation, Knowledge Graphs, Machine Translation, Text Augmentation, Text Classification, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5114/ |
PWC | https://paperswithcode.com/paper/boosting-text-classification-performance-on |
Repo | |
Framework | |
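A toy sketch of the text-generation step described above: new, label-preserving tweets are produced by substituting words with related terms. The paper draws the substitutions from ConceptNet relations; the `related` table below is a hypothetical stand-in for that lookup:

```python
# Label-preserving data generation by word replacement: each token may be
# swapped for a related term, multiplying a small training set.
import itertools

related = {  # hypothetical stand-in for ConceptNet-derived replacements
    "women": ["females", "ladies"],
    "drive": ["operate", "steer"],
}

def generate_variants(tweet: str, max_variants: int = 4):
    tokens = tweet.split()
    options = [[t] + related.get(t.lower(), []) for t in tokens]
    variants = (" ".join(c) for c in itertools.product(*options))
    next(variants)  # the first combination is the unchanged original; skip it
    return list(itertools.islice(variants, max_variants))

print(generate_variants("women can not drive"))
```

Because only individual words change, each variant plausibly keeps the original class label, which is what makes the trick safe for very small datasets.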
Dynamic Feature Selection with Attention in Incremental Parsing
Title | Dynamic Feature Selection with Attention in Incremental Parsing |
Authors | Ryosuke Kohita, Hiroshi Noji, Yuji Matsumoto |
Abstract | One main challenge for incremental transition-based parsers, when future inputs are invisible, is to extract good features from a limited local context. In this work, we present a simple technique to maximally utilize the local features with an attention mechanism, which works as context-dependent dynamic feature selection. Our model learns, for example, which tokens a parser should focus on to decide the next action. Our multilingual experiment shows its effectiveness across many languages. We also present an experiment with an augmented test dataset and demonstrate that it helps to understand the model's behavior on locally ambiguous points. |
Tasks | Dependency Parsing, Dialogue Generation, Feature Selection, Image Captioning, Machine Translation, Text Summarization |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1067/ |
PWC | https://paperswithcode.com/paper/dynamic-feature-selection-with-attention-in |
Repo | |
Framework | |
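A minimal sketch of attention as dynamic feature selection, under simplifying assumptions (a single scoring layer and a fixed feature count, not the authors' exact architecture): the parser scores each local feature embedding, normalizes the scores into a distribution, and predicts the next transition from the weighted sum:

```python
# Attention over local context features: the softmax weights act as a
# context-dependent, differentiable feature-selection mechanism.
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    def __init__(self, n_feats=8, dim=64, n_actions=3):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # relevance score per feature
        self.out = nn.Linear(dim, n_actions)

    def forward(self, feats):            # feats: (B, n_feats, dim)
        weights = torch.softmax(self.score(feats).squeeze(-1), dim=-1)
        pooled = (weights.unsqueeze(-1) * feats).sum(dim=1)  # (B, dim)
        return self.out(pooled), weights  # action scores + selection weights

model = FeatureAttention()
feats = torch.randn(2, 8, 64)  # embeddings of 8 local context tokens
logits, weights = model(feats)
print(weights[0])              # which tokens the parser focuses on
```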
Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model
Title | Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model |
Authors | Vered Shwartz, Chris Waterson |
Abstract | Automatic interpretation of the relation between the constituents of a noun compound, e.g. olive oil (source) and baby oil (purpose) is an important task for many NLP applications. Recent approaches are typically based on either noun-compound representations or paraphrases. While the former has initially shown promising results, recent work suggests that the success stems from memorizing single prototypical words for each relation. We explore a neural paraphrasing approach that demonstrates superior performance when such memorization is not possible. |
Tasks | Relation Classification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2035/ |
PWC | https://paperswithcode.com/paper/olive-oil-is-made-of-olives-baby-oil-is-made-1 |
Repo | |
Framework | |
Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks
Title | Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks |
Authors | Jingjing Wang, Jie Li, Shoushan Li, Yangyang Kang, Min Zhang, Luo Si, Guodong Zhou |
Abstract | Aspect sentiment classification, a challenging task in sentiment analysis, has been attracting more and more attention in recent years. In this paper, we highlight the need for incorporating the importance degrees of both words and clauses inside a sentence and propose a hierarchical network with both word-level and clause-level attentions to aspect sentiment classification. Specifically, we first adopt sentence-level discourse segmentation to segment a sentence into several clauses. Then, we leverage multiple Bi-directional LSTM layers to encode all clauses and propose a word-level attention layer to capture the importance degrees of words in each clause. Third and finally, we leverage another Bi-directional LSTM layer to encode the output from the former layers and propose a clause-level attention layer to capture the importance degrees of all the clauses inside a sentence. Experimental results on the laptop and restaurant datasets from SemEval-2015 demonstrate the effectiveness of our proposed approach to aspect sentiment classification. |
Tasks | Sentiment Analysis |
Published | 2018-06-17 |
URL | https://www.ijcai.org/proceedings/2018/617 |
DOI | https://doi.org/10.24963/ijcai.2018/617 |
PWC | https://paperswithcode.com/paper/aspect-sentiment-classification-with-both |
Repo | |
Framework | |
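A compact PyTorch sketch of the hierarchy the abstract describes: a word-level BiLSTM with attention encodes each clause, and a clause-level BiLSTM with attention combines the clause vectors into a sentence representation. Dimensions, clause segmentation, and the handling of the aspect term are simplifying assumptions:

```python
# Hierarchical word-level + clause-level attention for sentence classification.
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)
    def forward(self, h):                         # h: (B, T, dim)
        a = torch.softmax(self.score(h).squeeze(-1), dim=-1)
        return (a.unsqueeze(-1) * h).sum(dim=1)   # attention-weighted sum

class HierAttClassifier(nn.Module):
    def __init__(self, emb=100, hid=64, n_classes=3):
        super().__init__()
        self.word_rnn = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.word_att = Attention(2 * hid)
        self.clause_rnn = nn.LSTM(2 * hid, hid, batch_first=True, bidirectional=True)
        self.clause_att = Attention(2 * hid)
        self.out = nn.Linear(2 * hid, n_classes)

    def forward(self, clauses):                   # (B, n_clauses, n_words, emb)
        B, C, W, E = clauses.shape
        h, _ = self.word_rnn(clauses.view(B * C, W, E))
        clause_vecs = self.word_att(h).view(B, C, -1)   # one vector per clause
        h2, _ = self.clause_rnn(clause_vecs)
        return self.out(self.clause_att(h2))      # sentiment logits

model = HierAttClassifier()
x = torch.randn(2, 3, 10, 100)  # 2 sentences, 3 clauses, 10 words each
print(model(x).shape)           # torch.Size([2, 3])
```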
“Learning-Compression” Algorithms for Neural Net Pruning
Title | “Learning-Compression” Algorithms for Neural Net Pruning |
Authors | Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev |
Abstract | Pruning a neural net consists of removing weights without degrading its performance. This is an old problem of renewed interest because of the need to compress ever larger nets so they can run in mobile devices. Pruning has been traditionally done by ranking or penalizing weights according to some criterion (such as magnitude), removing low-ranked weights and retraining the remaining ones. We formulate pruning as an optimization problem of finding the weights that minimize the loss while satisfying a pruning cost condition. We give a generic algorithm to solve this which alternates “learning” steps that optimize a regularized, data-dependent loss and “compression” steps that mark weights for pruning in a data-independent way. Magnitude thresholding arises naturally in the compression step, but unlike existing magnitude pruning approaches, our algorithm explores subsets of weights rather than committing irrevocably to a specific subset from the beginning. It is also able to learn automatically the best number of weights to prune in each layer of the net without incurring an exponentially costly model selection. Using a single pruning-level user parameter, we achieve state-of-the-art pruning in nets of various sizes. |
Tasks | Model Selection |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Carreira-Perpinan_Learning-Compression_Algorithms_for_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Carreira-Perpinan_Learning-Compression_Algorithms_for_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/alearning-compressiona-algorithms-for-neural |
Repo | |
Framework | |
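A toy sketch of the alternating scheme on a linear least-squares model, assuming an L0 pruning constraint (keep the kappa largest-magnitude weights), in which case the compression step reduces to magnitude thresholding; the mu schedule and step sizes are illustrative:

```python
# Learning-Compression style pruning: alternate a data-dependent L step on
# loss(w) + mu/2 * ||w - theta||^2 with a data-independent C step that sets
# theta to the best kappa-sparse approximation of w (magnitude thresholding).
import torch

torch.manual_seed(0)
X, y = torch.randn(128, 20), torch.randn(128, 1)
w = torch.randn(20, 1, requires_grad=True)
theta = torch.zeros_like(w)   # current pruned point
mu, kappa = 0.01, 5           # coupling weight, number of weights kept

for step in range(200):
    # L step: optimize the loss while being pulled toward the pruned point.
    loss = ((X @ w - y) ** 2).mean() + mu / 2 * ((w - theta) ** 2).sum()
    loss.backward()
    with torch.no_grad():
        w -= 0.05 * w.grad
        w.grad.zero_()
    # C step: re-select which weights to keep from the *current* w, so the
    # pruned subset can change instead of being fixed from the start.
    if step % 10 == 9:
        with torch.no_grad():
            theta = torch.zeros_like(w)
            idx = w.abs().view(-1).topk(kappa).indices
            theta.view(-1)[idx] = w.view(-1)[idx]
        mu *= 1.2  # gradually tighten the coupling

print("nonzero weights kept:", int((theta != 0).sum()))
```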
Complementary Strategies for Low Resourced Morphological Modeling
Title | Complementary Strategies for Low Resourced Morphological Modeling |
Authors | Alexander Erdmann, Nizar Habash |
Abstract | Morphologically rich languages are challenging for natural language processing tasks due to data sparsity. This can be addressed either by introducing out-of-context morphological knowledge, or by developing machine learning architectures that specifically target data sparsity and/or morphological information. We find these approaches to complement each other in a morphological paradigm modeling task in Modern Standard Arabic, which, in addition to being morphologically complex, features ubiquitous ambiguity, exacerbating sparsity with noise. Given a small number of out-of-context rules describing closed class morphology, we combine them with word embeddings leveraging subword strings and noise reduction techniques. The combination outperforms both approaches individually by about 20% absolute. While morphological resources already exist for Modern Standard Arabic, our results inform how comparable resources might be constructed for non-standard dialects or any morphologically rich, low resourced language, given scarcity of time and funding. |
Tasks | Morphological Analysis, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5806/ |
PWC | https://paperswithcode.com/paper/complementary-strategies-for-low-resourced |
Repo | |
Framework | |
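One ingredient named above, word embeddings that leverage subword strings, can be sketched with gensim's FastText: unseen inflections receive vectors through shared character n-grams, which is what makes such embeddings suit morphologically rich languages. The corpus and hyperparameters below are toy assumptions, and the paper's rule-based component and noise reduction are not shown:

```python
# Subword-aware embeddings: character n-grams let related surface forms
# share parameters, easing data sparsity for rich morphology.
from gensim.models import FastText

corpus = [["كتب", "كاتب", "مكتوب"], ["درس", "دارس", "مدروس"]]  # toy MSA forms
model = FastText(corpus, vector_size=32, min_count=1, min_n=2, max_n=4, epochs=50)

# Even an out-of-vocabulary inflection gets a vector via its subword n-grams.
print(model.wv["كاتبون"][:4])
```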
Task-Aware Image Downscaling
Title | Task-Aware Image Downscaling |
Authors | Heewon Kim, Myungsub Choi, Bee Lim, Kyoung Mu Lee |
Abstract | Image downscaling is one of the most classical problems in computer vision that aims to preserve the visual appearance of the original image when it is resized to a smaller scale. Upscaling a small image back to its original size is a difficult, ill-posed problem due to information loss that arises in the downscaling process. In this paper, we present a novel operation called task-aware image downscaling to support an upscaling task. We propose an auto-encoder-based framework that enables joint learning of the downscaling network and the upscaling network to maximize the restoration performance. Our framework is efficient, and it can be generalized to handle an arbitrary image resizing operation. Experimental results show that our task-aware downscaled images greatly improve the super-resolution performance of the previous state of the art. In addition, realistic images can be recovered by recursively applying our scaling model up to an extreme scaling factor of x128. We validate our model’s generalization capability by applying it to the task of image colorization. |
Tasks | Colorization, Super-Resolution |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Heewon_Kim_Task-Aware_Image_Downscaling_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Heewon_Kim_Task-Aware_Image_Downscaling_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/task-aware-image-downscaling |
Repo | |
Framework | |
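A minimal PyTorch sketch of the joint objective: a learned downscaler and upscaler trained end-to-end so that the low-resolution image is shaped for later restoration. Both networks here are shallow stand-ins for the paper's architecture, and the additional constraints it places on the downscaled image are omitted:

```python
# Task-aware downscaling as an auto-encoder: gradients from the restoration
# loss flow through the upscaler *into* the downscaler.
import torch
import torch.nn as nn

downscale = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, stride=2, padding=1))
upscale = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Upsample(scale_factor=2, mode="bilinear",
                                    align_corners=False),
                        nn.Conv2d(16, 3, 3, padding=1))

opt = torch.optim.Adam(list(downscale.parameters()) + list(upscale.parameters()),
                       lr=1e-3)
hr = torch.rand(4, 3, 64, 64)    # toy high-resolution batch
lr_img = downscale(hr)           # task-aware low-resolution image (32x32)
restored = upscale(lr_img)
loss = nn.functional.l1_loss(restored, hr)  # restoration objective
loss.backward()
opt.step()
print(lr_img.shape, restored.shape)
```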
Feature Quantization for Defending Against Distortion of Images
Title | Feature Quantization for Defending Against Distortion of Images |
Authors | Zhun Sun, Mete Ozay, Yan Zhang, Xing Liu, Takayuki Okatani |
Abstract | In this work, we address the problem of improving the robustness of convolutional neural networks (CNNs) to image distortion. We argue that higher-moment statistics of feature distributions can be shifted by image distortion, and that this shift degrades performance and cannot be reduced by ordinary normalization methods, as observed in our experimental analyses. In order to mitigate this effect, we propose an approach based on feature quantization. To be specific, we propose to employ three different types of additional non-linearity in CNNs: i) a floor function with scalable resolution, ii) a power function with learnable exponents, and iii) a power function with data-dependent exponents. In the experiments, we observe that CNNs which employ the proposed methods obtain better generalization performance and robustness under various distortion types on large-scale benchmark datasets. For instance, a ResNet-50 model equipped with the proposed method (+HPOW) obtains 6.95%, 5.26% and 5.61% better accuracy on the ILSVRC-12 classification tasks using images distorted with motion blur, salt-and-pepper noise, and mixed distortions, respectively. |
Tasks | Quantization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Sun_Feature_Quantization_for_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Sun_Feature_Quantization_for_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/feature-quantization-for-defending-against |
Repo | |
Framework | |
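A sketch of the first proposed non-linearity, a floor function with scalable resolution: quantizing feature values discards the small distortion-induced shifts in activation statistics. The resolution value and the straight-through gradient used to keep training differentiable are illustrative assumptions:

```python
# Feature quantization via a floor function with scalable resolution.
import torch

class FloorQuant(torch.nn.Module):
    def __init__(self, resolution: float = 4.0):
        super().__init__()
        self.r = resolution

    def forward(self, x):
        q = torch.floor(x * self.r) / self.r  # snap features to a grid
        # Straight-through estimator (assumption): quantized values forward,
        # identity gradient backward, since floor has zero gradient a.e.
        return x + (q - x).detach()

feat = torch.randn(1, 8, 4, 4, requires_grad=True)
out = FloorQuant()(feat)
out.sum().backward()
print(out.unique().numel(), feat.grad.abs().sum().item())
```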
Deep Marching Cubes: Learning Explicit Surface Representations
Title | Deep Marching Cubes: Learning Explicit Surface Representations |
Authors | Yiyi Liao, Simon Donné, Andreas Geiger |
Abstract | Existing learning based solutions to 3D surface prediction cannot be trained end-to-end as they operate on intermediate representations (e.g., TSDF) from which 3D surface meshes must be extracted in a post-processing step (e.g., via the marching cubes algorithm). In this paper, we investigate the problem of end-to-end 3D surface prediction. We first demonstrate that the marching cubes algorithm is not differentiable and propose an alternative differentiable formulation which we insert as a final layer into a 3D convolutional neural network. We further propose a set of loss functions which allow for training our model with sparse point supervision. Our experiments demonstrate that the model allows for predicting sub-voxel accurate 3D shapes of arbitrary topology. Additionally, it learns to complete shapes and to separate an object’s inside from its outside even in the presence of sparse and incomplete ground truth. We investigate the benefits of our approach on the task of inferring shapes from 3D point clouds. Our model is flexible and can be combined with a variety of shape encoder and shape inference techniques. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Liao_Deep_Marching_Cubes_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Liao_Deep_Marching_Cubes_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/deep-marching-cubes-learning-explicit-surface |
Repo | |
Framework | |
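The non-differentiability point can be illustrated with a toy relaxation; this shows the motivation only, not the paper's actual layer. Hard thresholding of a predicted implicit field, as in classic marching cubes, has zero gradient almost everywhere, while a soft occupancy lets a surface loss reach the predicting network:

```python
# Why end-to-end surface prediction needs a differentiable formulation:
# hard thresholding blocks gradients, a soft relaxation does not.
import torch

sdf = torch.randn(8, 8, 8, requires_grad=True)  # predicted implicit field

hard = (sdf > 0).float()         # marching-cubes-style decision: no gradient
soft = torch.sigmoid(sdf / 0.1)  # differentiable occupancy (temperature 0.1)

target = torch.ones_like(soft)   # toy occupancy supervision
loss = ((soft - target) ** 2).mean()
loss.backward()
print(hard.requires_grad, sdf.grad.abs().mean().item())
```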
PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Title | PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles |
Authors | Daniel Ferrés, Horacio Saggion, Francesco Ronzano, Àlex Bravo |
Abstract | |
Tasks | Named Entity Recognition, Optical Character Recognition, Relation Extraction |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1298/ |
PWC | https://paperswithcode.com/paper/pdfdigest-an-adaptable-layout-aware-pdf-to |
Repo | |
Framework | |