Paper Group NANR 1
Contents: Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers · A Situated Dialogue System for Learning Structural Concepts in Blocks World · Learning Longer-term Dependencies in RNNs with Auxiliary Losses · Do Deep Reinforcement Learning Algorithms really Learn to Navigate? · SystemT: Declarative Text Understanding for Enterprise · Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs · Dynamic Feature Selection with Attention in Incremental Parsing · Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model · Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks · “Learning-Compression” Algorithms for Neural Net Pruning · Complementary Strategies for Low Resourced Morphological Modeling · Task-Aware Image Downscaling · Feature Quantization for Defending Against Distortion of Images · Deep Marching Cubes: Learning Explicit Surface Representations · PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers
Title | Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers |
Authors | Adrien Barbaresi |
Abstract | The present contribution revolves around efficient approaches to language classification which have been field-tested in the Vardial evaluation campaign. The methods used in several language identification tasks comprising different language types are presented and their results are discussed, giving insights on real-world application of regularization, linear classifiers and corresponding linguistic features. The use of a specially adapted Ridge classifier proved useful in two of the three tasks. The overall approach (XAC) slightly outperformed most of the other systems on the DFS task (Dutch and Flemish) and on the ILI task (Indo-Aryan languages), while its comparative performance was poorer on the GDI task (Swiss German dialects). |
Tasks | Language Identification, Text Categorization |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-3918/ |
PWC | https://paperswithcode.com/paper/computationally-efficient-discrimination |
Repo | |
Framework | |
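The approach above pairs large character n-gram feature vectors with a regularized linear model. A minimal scikit-learn sketch of that recipe, assuming TF-IDF character n-grams and a default regularization strength rather than the exact XAC configuration:

```python
# Sketch of dialect/variety discrimination with wide char n-gram features and
# a Ridge (L2-regularized linear) classifier. Features and alpha are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline

# Toy stand-ins for variety-labelled training sentences (hypothetical examples).
texts = [
    "dat is een mooie dag vandaag",
    "da's ne schone dag vandaag",
]
labels = ["DUT", "BEL"]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 4), sublinear_tf=True),
    RidgeClassifier(alpha=1.0),  # regularized linear classifier
)
clf.fit(texts, labels)
print(clf.predict(["dit is een schone dag"]))
```

Ridge training reduces to a regularized least-squares problem, which stays cheap even when the character n-gram feature matrix is very wide — the computational-efficiency point the title makes.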
A Situated Dialogue System for Learning Structural Concepts in Blocks World
Title | A Situated Dialogue System for Learning Structural Concepts in Blocks World |
Authors | Ian Perera, James Allen, Choh Man Teng, Lucian Galescu |
Abstract | We present a modular, end-to-end dialogue system for a situated agent to address a multimodal, natural language dialogue task in which the agent learns complex representations of block structure classes through assertions, demonstrations, and questioning. The concept to learn is provided to the user through a set of positive and negative visual examples, from which the user determines the underlying constraints to be provided to the system in natural language. The system in turn asks questions about demonstrated examples and simulates new examples to check its knowledge and verify the user's description is complete. We find that this task is non-trivial for users and generates natural language that is varied yet understood by our deep language understanding architecture. |
Tasks | Transfer Learning |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-5010/ |
PWC | https://paperswithcode.com/paper/a-situated-dialogue-system-for-learning |
Repo | |
Framework | |
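To make the learning task above concrete, here is a toy sketch (not the authors' system) of the underlying loop: a structural concept is a set of constraints over block structures, and a learned constraint is checked against positive and negative demonstrations. The `Structure` class and the staircase constraint are hypothetical examples:

```python
# Toy model of concept verification in a blocks world: a concept is a
# constraint over structures, tested against demonstrated examples.
from dataclasses import dataclass

@dataclass
class Structure:
    heights: list  # column heights of a block structure, e.g. [1, 2, 3]

def is_staircase(s: Structure) -> bool:
    """Hypothetical learned constraint: each column one block taller."""
    return all(b - a == 1 for a, b in zip(s.heights, s.heights[1:]))

positives = [Structure([1, 2, 3]), Structure([2, 3, 4])]
negatives = [Structure([1, 3, 2])]

# Verify the constraint separates the demonstrations, as the system's
# questioning/simulation step would before accepting the description.
assert all(is_staircase(s) for s in positives)
assert not any(is_staircase(s) for s in negatives)
print("constraint consistent with all demonstrations")
```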
Learning Longer-term Dependencies in RNNs with Auxiliary Losses
Title | Learning Longer-term Dependencies in RNNs with Auxiliary Losses |
Authors | Trieu Trinh, Andrew Dai, Thang Luong, Quoc Le |
Abstract | Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long-term dependencies in RNNs by adding an unsupervised auxiliary loss to the original objective. This auxiliary loss forces RNNs to either reconstruct previous events or predict next events in a sequence, making truncated backpropagation feasible for long sequences and also improving full BPTT. We evaluate our method on a variety of settings, including pixel-by-pixel image classification with sequence lengths up to 16000, and a real document classification benchmark. Our results highlight the good performance and resource efficiency of this approach over competitive baselines, including other recurrent models and a comparably sized Transformer. Further analyses reveal beneficial effects of the auxiliary loss on optimization and regularization, as well as in extreme cases where there is little to no backpropagation. |
Tasks | Document Classification, Image Classification |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2127 |
PDF | http://proceedings.mlr.press/v80/trinh18a/trinh18a.pdf |
PWC | https://paperswithcode.com/paper/learning-longer-term-dependencies-in-rnns-1 |
Repo | |
Framework | |
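A minimal PyTorch sketch of the core idea above, assuming next-step prediction as the unsupervised auxiliary objective; the paper's anchor sampling and truncation scheme are simplified away, and all dimensions are illustrative:

```python
# Recurrent classifier with an unsupervised auxiliary head: total loss is
# classification loss + aux_weight * next-token prediction loss.
import torch
import torch.nn as nn

class AuxLossRNN(nn.Module):
    def __init__(self, vocab=256, emb=64, hid=128, n_classes=10, aux_weight=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.LSTM(emb, hid, batch_first=True)
        self.cls_head = nn.Linear(hid, n_classes)  # supervised objective
        self.aux_head = nn.Linear(hid, vocab)      # unsupervised prediction
        self.aux_weight = aux_weight
        self.ce = nn.CrossEntropyLoss()

    def forward(self, x, y):
        h, _ = self.rnn(self.embed(x))             # (B, T, hid)
        cls_loss = self.ce(self.cls_head(h[:, -1]), y)
        # Auxiliary loss: predict token t+1 from the hidden state at t.
        aux_logits = self.aux_head(h[:, :-1])
        aux_loss = self.ce(aux_logits.reshape(-1, aux_logits.size(-1)),
                           x[:, 1:].reshape(-1))
        return cls_loss + self.aux_weight * aux_loss

model = AuxLossRNN()
x = torch.randint(0, 256, (4, 32))   # batch of toy "pixel" sequences
y = torch.randint(0, 10, (4,))
loss = model(x, y)
loss.backward()
print(float(loss))
```

The auxiliary gradient gives the recurrent weights a training signal at every time step, which is why it helps even when backpropagation through the main objective is heavily truncated.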
Do Deep Reinforcement Learning Algorithms really Learn to Navigate?
Title | Do Deep Reinforcement Learning Algorithms really Learn to Navigate? |
Authors | Shurjo Banerjee, Vikas Dhiman, Brent Griffin, Jason J. Corso |
Abstract | Deep reinforcement learning (DRL) algorithms have demonstrated progress in learning to find a goal in challenging environments. As the title of the paper by Mirowski et al. (2016) suggests, one might assume that DRL-based algorithms are able to “learn to navigate” and are thus ready to replace classical mapping and path-planning algorithms, at least in simulated environments. Yet, from experiments and analysis in this earlier work, it is not clear what strategies are used by these algorithms in navigating the mazes and finding the goal. In this paper, we pose and study this underlying question: are DRL algorithms doing some form of mapping and/or path-planning? Our experiments show that the algorithms are not memorizing the maps of mazes at the testing stage but, rather, at the training stage. Hence, the DRL algorithms fall short of qualifying as mapping or path-planning algorithms with any reasonable definition of mapping. We extend the experiments in Mirowski et al. (2016) by separating the set of training and testing maps and by a more ablative coverage of the space of experiments. Our systematic experiments show that the NavA3C-D1-D2-L algorithm, when trained and tested on the same maps, is able to choose the shorter paths to the goal. However, when tested on unseen maps the algorithm utilizes a wall-following strategy to find the goal without doing any mapping or path planning. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BkiIkBJ0b |
PDF | https://openreview.net/pdf?id=BkiIkBJ0b |
PWC | https://paperswithcode.com/paper/do-deep-reinforcement-learning-algorithms |
Repo | |
Framework | |
SystemT: Declarative Text Understanding for Enterprise
Title | SystemT: Declarative Text Understanding for Enterprise |
Authors | Laura Chiticariu, Marina Danilevsky, Yunyao Li, Frederick Reiss, Huaiyu Zhu |
Abstract | The rise of enterprise applications over unstructured and semi-structured documents poses new challenges to text understanding systems across multiple dimensions. We present SystemT, a declarative text understanding system that addresses these challenges and has been deployed in a wide range of enterprise applications. We highlight the design considerations and decisions behind SystemT in addressing the needs of the enterprise setting. We also summarize the impact of SystemT on business and education. |
Tasks | Document Classification, Entity Extraction, Relation Extraction, Semantic Parsing, Sentiment Analysis, Tokenization |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3010/ |
PWC | https://paperswithcode.com/paper/systemt-declarative-text-understanding-for |
Repo | |
Framework | |
Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs
Title | Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs |
Authors | Sima Sharifirad, Borna Jafarpour, Stan Matwin |
Abstract | Text classification models have been heavily utilized for a slew of interesting natural language processing problems. Like any other machine learning model, these classifiers are very dependent on the size and quality of the training dataset. Insufficient and imbalanced datasets will lead to poor performance. An interesting solution to poor datasets is to take advantage of world knowledge, in the form of knowledge graphs, to improve the training data. In this paper, we use ConceptNet and Wikidata to improve sexist tweet classification by two methods: (1) text augmentation and (2) text generation. In our text generation approach, we generate new tweets by replacing words using data acquired from ConceptNet relations in order to increase the size of our training set; this method is very helpful with frustratingly small datasets, preserves the labels, and increases diversity. In our text augmentation approach, the number of tweets in each class remains the same, but each tweet is augmented (by concatenation) with words extracted from its ConceptNet relations and with their descriptions extracted from Wikidata, so the range of each tweet increases. Our experiments show that our approach improves sexist tweet classification significantly across all of our machine learning models. Our approach can be readily applied to any other small dataset and text classification problem, such as hate speech or abusive language detection, using any machine learning model. |
Tasks | Dialogue Generation, Knowledge Graphs, Machine Translation, Text Augmentation, Text Classification, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5114/ |
PWC | https://paperswithcode.com/paper/boosting-text-classification-performance-on |
Repo | |
Framework | |
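A toy sketch of the text-generation step described above: new, label-preserving tweets are produced by substituting words with related terms. The paper draws the substitutions from ConceptNet relations; the `related` table below is a hypothetical stand-in for that lookup:

```python
# Label-preserving data generation by word replacement: each token may be
# swapped for a related term, multiplying a small training set.
import itertools

related = {  # hypothetical stand-in for ConceptNet-derived replacements
    "women": ["females", "ladies"],
    "drive": ["operate", "steer"],
}

def generate_variants(tweet: str, max_variants: int = 4):
    tokens = tweet.split()
    options = [[t] + related.get(t.lower(), []) for t in tokens]
    variants = (" ".join(c) for c in itertools.product(*options))
    next(variants)  # the first combination is the unchanged original; skip it
    return list(itertools.islice(variants, max_variants))

print(generate_variants("women can not drive"))
```

Because only individual words change, each variant plausibly keeps the original class label, which is what makes the trick safe for very small datasets.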
Dynamic Feature Selection with Attention in Incremental Parsing
Title | Dynamic Feature Selection with Attention in Incremental Parsing |
Authors | Ryosuke Kohita, Hiroshi Noji, Yuji Matsumoto |
Abstract | One main challenge for incremental transition-based parsers, when future inputs are invisible, is to extract good features from a limited local context. In this work, we present a simple technique to maximally utilize the local features with an attention mechanism, which works as context-dependent dynamic feature selection. Our model learns, for example, which tokens a parser should focus on to decide the next action. Our multilingual experiment shows its effectiveness across many languages. We also present an experiment with an augmented test dataset and demonstrate that it helps to understand the model's behavior on locally ambiguous points. |
Tasks | Dependency Parsing, Dialogue Generation, Feature Selection, Image Captioning, Machine Translation, Text Summarization |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1067/ |
PWC | https://paperswithcode.com/paper/dynamic-feature-selection-with-attention-in |
Repo | |
Framework | |
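A minimal sketch of attention as dynamic feature selection, under simplifying assumptions (a single scoring layer and a fixed feature count, not the authors' exact architecture): the parser scores each local feature embedding, normalizes the scores into a distribution, and predicts the next transition from the weighted sum:

```python
# Attention over local context features: the softmax weights act as a
# context-dependent, differentiable feature-selection mechanism.
import torch
import torch.nn as nn

class FeatureAttention(nn.Module):
    def __init__(self, n_feats=8, dim=64, n_actions=3):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # relevance score per feature
        self.out = nn.Linear(dim, n_actions)

    def forward(self, feats):            # feats: (B, n_feats, dim)
        weights = torch.softmax(self.score(feats).squeeze(-1), dim=-1)
        pooled = (weights.unsqueeze(-1) * feats).sum(dim=1)  # (B, dim)
        return self.out(pooled), weights  # action scores + selection weights

model = FeatureAttention()
feats = torch.randn(2, 8, 64)  # embeddings of 8 local context tokens
logits, weights = model(feats)
print(weights[0])              # which tokens the parser focuses on
```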
Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model
Title | Olive Oil is Made *of* Olives, Baby Oil is Made *for* Babies: Interpreting Noun Compounds Using Paraphrases in a Neural Model |
Authors | Vered Shwartz, Chris Waterson |
Abstract | Automatic interpretation of the relation between the constituents of a noun compound, e.g. olive oil (source) and baby oil (purpose) is an important task for many NLP applications. Recent approaches are typically based on either noun-compound representations or paraphrases. While the former has initially shown promising results, recent work suggests that the success stems from memorizing single prototypical words for each relation. We explore a neural paraphrasing approach that demonstrates superior performance when such memorization is not possible. |
Tasks | Relation Classification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2035/ |
PWC | https://paperswithcode.com/paper/olive-oil-is-made-of-olives-baby-oil-is-made-1 |
Repo | |
Framework | |
Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks
Title | Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks |
Authors | Jingjing Wang, Jie Li, Shoushan Li, Yangyang Kang, Min Zhang, Luo Si, Guodong Zhou |
Abstract | Aspect sentiment classification, a challenging task in sentiment analysis, has been attracting more and more attention in recent years. In this paper, we highlight the need for incorporating the importance degrees of both words and clauses inside a sentence and propose a hierarchical network with both word-level and clause-level attentions to aspect sentiment classification. Specifically, we first adopt sentence-level discourse segmentation to segment a sentence into several clauses. Then, we leverage multiple Bi-directional LSTM layers to encode all clauses and propose a word-level attention layer to capture the importance degrees of words in each clause. Third and finally, we leverage another Bi-directional LSTM layer to encode the output from the former layers and propose a clause-level attention layer to capture the importance degrees of all the clauses inside a sentence. Experimental results on the laptop and restaurant datasets from SemEval-2015 demonstrate the effectiveness of our proposed approach to aspect sentiment classification. |
Tasks | Sentiment Analysis |
Published | 2018-06-17 |
URL | https://www.ijcai.org/proceedings/2018/617 |
DOI | https://doi.org/10.24963/ijcai.2018/617 |
PWC | https://paperswithcode.com/paper/aspect-sentiment-classification-with-both |
Repo | |
Framework | |
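A compact PyTorch sketch of the hierarchy the abstract describes: a word-level BiLSTM with attention encodes each clause, and a clause-level BiLSTM with attention combines the clause vectors into a sentence representation. Dimensions, clause segmentation, and the handling of the aspect term are simplifying assumptions:

```python
# Hierarchical word-level + clause-level attention for sentence classification.
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)
    def forward(self, h):                         # h: (B, T, dim)
        a = torch.softmax(self.score(h).squeeze(-1), dim=-1)
        return (a.unsqueeze(-1) * h).sum(dim=1)   # attention-weighted sum

class HierAttClassifier(nn.Module):
    def __init__(self, emb=100, hid=64, n_classes=3):
        super().__init__()
        self.word_rnn = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.word_att = Attention(2 * hid)
        self.clause_rnn = nn.LSTM(2 * hid, hid, batch_first=True, bidirectional=True)
        self.clause_att = Attention(2 * hid)
        self.out = nn.Linear(2 * hid, n_classes)

    def forward(self, clauses):                   # (B, n_clauses, n_words, emb)
        B, C, W, E = clauses.shape
        h, _ = self.word_rnn(clauses.view(B * C, W, E))
        clause_vecs = self.word_att(h).view(B, C, -1)   # one vector per clause
        h2, _ = self.clause_rnn(clause_vecs)
        return self.out(self.clause_att(h2))      # sentiment logits

model = HierAttClassifier()
x = torch.randn(2, 3, 10, 100)  # 2 sentences, 3 clauses, 10 words each
print(model(x).shape)           # torch.Size([2, 3])
```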
“Learning-Compression” Algorithms for Neural Net Pruning
Title | “Learning-Compression” Algorithms for Neural Net Pruning |
Authors | Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev |
Abstract | Pruning a neural net consists of removing weights without degrading its performance. This is an old problem of renewed interest because of the need to compress ever larger nets so they can run in mobile devices. Pruning has been traditionally done by ranking or penalizing weights according to some criterion (such as magnitude), removing low-ranked weights and retraining the remaining ones. We formulate pruning as an optimization problem of finding the weights that minimize the loss while satisfying a pruning cost condition. We give a generic algorithm to solve this which alternates “learning” steps that optimize a regularized, data-dependent loss and “compression” steps that mark weights for pruning in a data-independent way. Magnitude thresholding arises naturally in the compression step, but unlike existing magnitude pruning approaches, our algorithm explores subsets of weights rather than committing irrevocably to a specific subset from the beginning. It is also able to learn automatically the best number of weights to prune in each layer of the net without incurring an exponentially costly model selection. Using a single pruning-level user parameter, we achieve state-of-the-art pruning in nets of various sizes. |
Tasks | Model Selection |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Carreira-Perpinan_Learning-Compression_Algorithms_for_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Carreira-Perpinan_Learning-Compression_Algorithms_for_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/alearning-compressiona-algorithms-for-neural |
Repo | |
Framework | |
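A toy sketch of the alternating scheme on a linear least-squares model, assuming an L0 pruning constraint (keep the kappa largest-magnitude weights), in which case the compression step reduces to magnitude thresholding; the mu schedule and step sizes are illustrative:

```python
# Learning-Compression style pruning: alternate a data-dependent L step on
# loss(w) + mu/2 * ||w - theta||^2 with a data-independent C step that sets
# theta to the best kappa-sparse approximation of w (magnitude thresholding).
import torch

torch.manual_seed(0)
X, y = torch.randn(128, 20), torch.randn(128, 1)
w = torch.randn(20, 1, requires_grad=True)
theta = torch.zeros_like(w)   # current pruned point
mu, kappa = 0.01, 5           # coupling weight, number of weights kept

for step in range(200):
    # L step: optimize the loss while being pulled toward the pruned point.
    loss = ((X @ w - y) ** 2).mean() + mu / 2 * ((w - theta) ** 2).sum()
    loss.backward()
    with torch.no_grad():
        w -= 0.05 * w.grad
        w.grad.zero_()
    # C step: re-select which weights to keep from the *current* w, so the
    # pruned subset can change instead of being fixed from the start.
    if step % 10 == 9:
        with torch.no_grad():
            theta = torch.zeros_like(w)
            idx = w.abs().view(-1).topk(kappa).indices
            theta.view(-1)[idx] = w.view(-1)[idx]
        mu *= 1.2  # gradually tighten the coupling

print("nonzero weights kept:", int((theta != 0).sum()))
```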
Complementary Strategies for Low Resourced Morphological Modeling
Title | Complementary Strategies for Low Resourced Morphological Modeling |
Authors | Alexander Erdmann, Nizar Habash |
Abstract | Morphologically rich languages are challenging for natural language processing tasks due to data sparsity. This can be addressed either by introducing out-of-context morphological knowledge, or by developing machine learning architectures that specifically target data sparsity and/or morphological information. We find these approaches to complement each other in a morphological paradigm modeling task in Modern Standard Arabic, which, in addition to being morphologically complex, features ubiquitous ambiguity, exacerbating sparsity with noise. Given a small number of out-of-context rules describing closed class morphology, we combine them with word embeddings leveraging subword strings and noise reduction techniques. The combination outperforms both approaches individually by about 20% absolute. While morphological resources already exist for Modern Standard Arabic, our results inform how comparable resources might be constructed for non-standard dialects or any morphologically rich, low resourced language, given scarcity of time and funding. |
Tasks | Morphological Analysis, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5806/ |
PWC | https://paperswithcode.com/paper/complementary-strategies-for-low-resourced |
Repo | |
Framework | |
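One ingredient named above, word embeddings that leverage subword strings, can be sketched with gensim's FastText: unseen inflections receive vectors through shared character n-grams, which is what makes such embeddings suit morphologically rich languages. The corpus and hyperparameters below are toy assumptions, and the paper's rule-based component and noise reduction are not shown:

```python
# Subword-aware embeddings: character n-grams let related surface forms
# share parameters, easing data sparsity for rich morphology.
from gensim.models import FastText

corpus = [["كتب", "كاتب", "مكتوب"], ["درس", "دارس", "مدروس"]]  # toy MSA forms
model = FastText(corpus, vector_size=32, min_count=1, min_n=2, max_n=4, epochs=50)

# Even an out-of-vocabulary inflection gets a vector via its subword n-grams.
print(model.wv["كاتبون"][:4])
```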
Task-Aware Image Downscaling
Title | Task-Aware Image Downscaling |
Authors | Heewon Kim, Myungsub Choi, Bee Lim, Kyoung Mu Lee |
Abstract | Image downscaling is one of the most classical problems in computer vision that aims to preserve the visual appearance of the original image when it is resized to a smaller scale. Upscaling a small image back to its original size is a difficult, ill-posed problem due to information loss that arises in the downscaling process. In this paper, we present a novel operation called task-aware image downscaling to support an upscaling task. We propose an auto-encoder-based framework that enables joint learning of the downscaling network and the upscaling network to maximize the restoration performance. Our framework is efficient, and it can be generalized to handle an arbitrary image resizing operation. Experimental results show that our task-aware downscaled images greatly improve the super-resolution performance of the previous state of the art. In addition, realistic images can be recovered by recursively applying our scaling model up to an extreme scaling factor of x128. We validate our model’s generalization capability by applying it to the task of image colorization. |
Tasks | Colorization, Super-Resolution |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Heewon_Kim_Task-Aware_Image_Downscaling_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Heewon_Kim_Task-Aware_Image_Downscaling_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/task-aware-image-downscaling |
Repo | |
Framework | |
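A minimal PyTorch sketch of the joint objective: a learned downscaler and upscaler trained end-to-end so that the low-resolution image is shaped for later restoration. Both networks here are shallow stand-ins for the paper's architecture, and the additional constraints it places on the downscaled image are omitted:

```python
# Task-aware downscaling as an auto-encoder: gradients from the restoration
# loss flow through the upscaler *into* the downscaler.
import torch
import torch.nn as nn

downscale = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 3, 3, stride=2, padding=1))
upscale = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Upsample(scale_factor=2, mode="bilinear",
                                    align_corners=False),
                        nn.Conv2d(16, 3, 3, padding=1))

opt = torch.optim.Adam(list(downscale.parameters()) + list(upscale.parameters()),
                       lr=1e-3)
hr = torch.rand(4, 3, 64, 64)    # toy high-resolution batch
lr_img = downscale(hr)           # task-aware low-resolution image (32x32)
restored = upscale(lr_img)
loss = nn.functional.l1_loss(restored, hr)  # restoration objective
loss.backward()
opt.step()
print(lr_img.shape, restored.shape)
```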
Feature Quantization for Defending Against Distortion of Images
Title | Feature Quantization for Defending Against Distortion of Images |
Authors | Zhun Sun, Mete Ozay, Yan Zhang, Xing Liu, Takayuki Okatani |
Abstract | In this work, we address the problem of improving the robustness of convolutional neural networks (CNNs) to image distortion. We argue that higher-moment statistics of feature distributions can be shifted by image distortion, and that this shift degrades performance and cannot be reduced by ordinary normalization methods, as observed in our experimental analyses. In order to mitigate this effect, we propose an approach based on feature quantization. To be specific, we propose to employ three different types of additional non-linearity in CNNs: i) a floor function with scalable resolution, ii) a power function with learnable exponents, and iii) a power function with data-dependent exponents. In the experiments, we observe that CNNs which employ the proposed methods obtain better generalization performance and robustness under various distortion types on large-scale benchmark datasets. For instance, a ResNet-50 model equipped with the proposed method (+HPOW) obtains 6.95%, 5.26% and 5.61% better accuracy on the ILSVRC-12 classification tasks using images distorted with motion blur, salt-and-pepper noise, and mixed distortions, respectively. |
Tasks | Quantization |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Sun_Feature_Quantization_for_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Sun_Feature_Quantization_for_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/feature-quantization-for-defending-against |
Repo | |
Framework | |
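A sketch of the first proposed non-linearity, a floor function with scalable resolution: quantizing feature values discards the small distortion-induced shifts in activation statistics. The resolution value and the straight-through gradient used to keep training differentiable are illustrative assumptions:

```python
# Feature quantization via a floor function with scalable resolution.
import torch

class FloorQuant(torch.nn.Module):
    def __init__(self, resolution: float = 4.0):
        super().__init__()
        self.r = resolution

    def forward(self, x):
        q = torch.floor(x * self.r) / self.r  # snap features to a grid
        # Straight-through estimator (assumption): quantized values forward,
        # identity gradient backward, since floor has zero gradient a.e.
        return x + (q - x).detach()

feat = torch.randn(1, 8, 4, 4, requires_grad=True)
out = FloorQuant()(feat)
out.sum().backward()
print(out.unique().numel(), feat.grad.abs().sum().item())
```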
Deep Marching Cubes: Learning Explicit Surface Representations
Title | Deep Marching Cubes: Learning Explicit Surface Representations |
Authors | Yiyi Liao, Simon Donné, Andreas Geiger |
Abstract | Existing learning based solutions to 3D surface prediction cannot be trained end-to-end as they operate on intermediate representations (e.g., TSDF) from which 3D surface meshes must be extracted in a post-processing step (e.g., via the marching cubes algorithm). In this paper, we investigate the problem of end-to-end 3D surface prediction. We first demonstrate that the marching cubes algorithm is not differentiable and propose an alternative differentiable formulation which we insert as a final layer into a 3D convolutional neural network. We further propose a set of loss functions which allow for training our model with sparse point supervision. Our experiments demonstrate that the model allows for predicting sub-voxel accurate 3D shapes of arbitrary topology. Additionally, it learns to complete shapes and to separate an object’s inside from its outside even in the presence of sparse and incomplete ground truth. We investigate the benefits of our approach on the task of inferring shapes from 3D point clouds. Our model is flexible and can be combined with a variety of shape encoder and shape inference techniques. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Liao_Deep_Marching_Cubes_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Liao_Deep_Marching_Cubes_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/deep-marching-cubes-learning-explicit-surface |
Repo | |
Framework | |
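The non-differentiability point can be illustrated with a toy relaxation; this shows the motivation only, not the paper's actual layer. Hard thresholding of a predicted implicit field, as in classic marching cubes, has zero gradient almost everywhere, while a soft occupancy lets a surface loss reach the predicting network:

```python
# Why end-to-end surface prediction needs a differentiable formulation:
# hard thresholding blocks gradients, a soft relaxation does not.
import torch

sdf = torch.randn(8, 8, 8, requires_grad=True)  # predicted implicit field

hard = (sdf > 0).float()         # marching-cubes-style decision: no gradient
soft = torch.sigmoid(sdf / 0.1)  # differentiable occupancy (temperature 0.1)

target = torch.ones_like(soft)   # toy occupancy supervision
loss = ((soft - target) ** 2).mean()
loss.backward()
print(hard.requires_grad, sdf.grad.abs().mean().item())
```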
PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles
Title | PDFdigest: an Adaptable Layout-Aware PDF-to-XML Textual Content Extractor for Scientific Articles |
Authors | Daniel Ferrés, Horacio Saggion, Francesco Ronzano, Àlex Bravo |
Abstract | |
Tasks | Named Entity Recognition, Optical Character Recognition, Relation Extraction |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1298/ |
PWC | https://paperswithcode.com/paper/pdfdigest-an-adaptable-layout-aware-pdf-to |
Repo | |
Framework | |