Paper Group NANR 94
The German Reference Corpus DeReKo: New Developments – New Opportunities
Title | The German Reference Corpus DeReKo: New Developments – New Opportunities |
Authors | Marc Kupietz, Harald Lüngen, Paweł Kamocki, Andreas Witt |
Abstract | |
Tasks | Word Embeddings |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1689/ |
https://www.aclweb.org/anthology/L18-1689 | |
PWC | https://paperswithcode.com/paper/the-german-reference-corpus-dereko-new |
Repo | |
Framework | |
LSH-SAMPLING BREAKS THE COMPUTATIONAL CHICKEN-AND-EGG LOOP IN ADAPTIVE STOCHASTIC GRADIENT ESTIMATION
Title | LSH-SAMPLING BREAKS THE COMPUTATIONAL CHICKEN-AND-EGG LOOP IN ADAPTIVE STOCHASTIC GRADIENT ESTIMATION |
Authors | Beidi Chen, Yingchen Xu, Anshumali Shrivastava |
Abstract | Stochastic Gradient Descent or SGD is the most popular optimization algorithm for large-scale problems. SGD estimates the gradient by uniform sampling with sample size one. There have been several other works that suggest faster epoch wise convergence by using weighted non-uniform sampling for better gradient estimates. Unfortunately, the per-iteration cost of maintaining this adaptive distribution for gradient estimation is more than calculating the full gradient. As a result, the false impression of faster convergence in iterations leads to slower convergence in time, which we call a chicken-and-egg loop. In this paper, we break this barrier by providing the first demonstration of a sampling scheme, which leads to superior gradient estimation, while keeping the sampling cost per iteration similar to that of the uniform sampling. Such an algorithm is possible due to the sampling view of Locality Sensitive Hashing (LSH), which came to light recently. As a consequence of superior and fast estimation, we reduce the running time of all existing gradient descent algorithms. We demonstrate the benefits of our proposal on both SGD and AdaGrad. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SyVOjfbRb |
https://openreview.net/pdf?id=SyVOjfbRb | |
PWC | https://paperswithcode.com/paper/lsh-sampling-breaks-the-computational-chicken |
Repo | |
Framework | |
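The weighted non-uniform sampling mentioned in the abstract above can be illustrated with a plain importance-sampling estimator. The following is a minimal sketch, not the paper's LSH-based scheme: it only shows why dividing a sampled gradient by N times its sampling probability keeps the estimate unbiased. The function names, the scalar gradients, and the choice of probabilities proportional to |g_i| are all assumptions made for illustration.

```python
import random

def full_gradient(grads):
    """Average of the per-example gradients (scalars here for simplicity)."""
    return sum(grads) / len(grads)

def sampled_gradient(grads, probs, rng):
    """Draw one example i with probability probs[i] and reweight its gradient.

    Dividing by (N * probs[i]) keeps the estimate unbiased for any
    strictly positive sampling distribution.
    """
    n = len(grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return grads[i] / (n * probs[i])

def expected_estimate(grads, probs):
    """Exact expectation of sampled_gradient, computed by enumeration."""
    n = len(grads)
    return sum(p * g / (n * p) for g, p in zip(grads, probs))

# Hypothetical per-example gradients; sampling is proportional to |g_i|.
grads = [0.5, -2.0, 4.0, 1.5]
total = sum(abs(g) for g in grads)
probs = [abs(g) / total for g in grads]
```

Computing probabilities proportional to |g_i| as done here already requires every per-example gradient, which is the per-iteration cost the abstract describes as the chicken-and-egg loop.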
Football and Beer - a Social Media Analysis on Twitter in Context of the FIFA Football World Cup 2018
Title | Football and Beer - a Social Media Analysis on Twitter in Context of the FIFA Football World Cup 2018 |
Authors | Roland Roller, Philippe Thomas, Sven Schmeier |
Abstract | In many societies alcohol is a legal and common recreational substance and socially accepted. Alcohol consumption often comes along with social events as it helps people to increase their sociability and to overcome their inhibitions. On the other hand we know that increased alcohol consumption can lead to serious health issues, such as cancer, cardiovascular diseases and diseases of the digestive system, to mention a few. This work examines alcohol consumption during the FIFA Football World Cup 2018, particularly the usage of alcohol related information on Twitter. For this we analyse the tweeting behaviour and show that the tournament strongly increases the interest in beer. Furthermore we show that countries who had to leave the tournament at early stage might have done something good to their fans as the interest in beer decreased again. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5901/ |
https://www.aclweb.org/anthology/W18-5901 | |
PWC | https://paperswithcode.com/paper/football-and-beer-a-social-media-analysis-on |
Repo | |
Framework | |
Large Scale Multi-Domain Multi-Task Learning with MultiModel
Title | Large Scale Multi-Domain Multi-Task Learning with MultiModel |
Authors | Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit |
Abstract | Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition corpus, and an English parsing task. Our model architecture incorporates building blocks from multiple domains. It contains convolutional layers, an attention mechanism, and sparsely-gated layers. Each of these computational blocks is crucial for a subset of the tasks we train on. Interestingly, even if a block is not crucial for a task, we observe that adding it never hurts performance and in most cases improves it on all tasks. We also show that tasks with less data benefit largely from joint training with other tasks, while performance on large tasks degrades only slightly if at all. |
Tasks | Image Captioning, Image Classification, Multi-Task Learning, Speech Recognition |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HyKZyYlRZ |
https://openreview.net/pdf?id=HyKZyYlRZ | |
PWC | https://paperswithcode.com/paper/large-scale-multi-domain-multi-task-learning |
Repo | |
Framework | |
Deep Pivot-Based Modeling for Cross-language Cross-domain Transfer with Minimal Guidance
Title | Deep Pivot-Based Modeling for Cross-language Cross-domain Transfer with Minimal Guidance |
Authors | Yftah Ziser, Roi Reichart |
Abstract | While cross-domain and cross-language transfer have long been prominent topics in NLP research, their combination has hardly been explored. In this work we consider this problem, and propose a framework that builds on pivot-based learning, structure-aware Deep Neural Networks (particularly LSTMs and CNNs) and bilingual word embeddings, with the goal of training a model on labeled data from one (language, domain) pair so that it can be effectively applied to another (language, domain) pair. We consider two setups, differing with respect to the unlabeled data available for model training. In the full setup the model has access to unlabeled data from both pairs, while in the lazy setup, which is more realistic for truly resource-poor languages, unlabeled data is available for both domains but only for the source language. We design our model for the lazy setup so that for a given target domain, it can train once on the source language and then be applied to any target language without re-training. In experiments with nine English-German and nine English-French domain pairs our best model substantially outperforms previous models even when it is trained in the lazy setup and previous models are trained in the full setup. |
Tasks | Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1022/ |
https://www.aclweb.org/anthology/D18-1022 | |
PWC | https://paperswithcode.com/paper/deep-pivot-based-modeling-for-cross-language |
Repo | |
Framework | |
Thumbs Up and Down: Sentiment Analysis of Medical Online Forums
Title | Thumbs Up and Down: Sentiment Analysis of Medical Online Forums |
Authors | Victoria Bobicev, Marina Sokolova |
Abstract | In the current study, we apply multi-class and multi-label sentence classification to sentiment analysis of online medical forums. We aim to identify major health issues discussed in online social media and the types of sentiments those issues evoke. We use ontology of personal health information for Information Extraction and apply Machine Learning methods in automated recognition of the expressed sentiments. |
Tasks | Sentence Classification, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5906/ |
https://www.aclweb.org/anthology/W18-5906 | |
PWC | https://paperswithcode.com/paper/thumbs-up-and-down-sentiment-analysis-of |
Repo | |
Framework | |
Efficient Stochastic Gradient Hard Thresholding
Title | Efficient Stochastic Gradient Hard Thresholding |
Authors | Pan Zhou, Xiaotong Yuan, Jiashi Feng |
Abstract | Stochastic gradient hard thresholding methods have recently been shown to work favorably in solving large-scale empirical risk minimization problems under sparsity or rank constraint. Despite the improved iteration complexity over full gradient methods, the gradient evaluation and hard thresholding complexity of the existing stochastic algorithms usually scales linearly with data size, which could still be expensive when data is huge and the hard thresholding step could be as expensive as singular value decomposition in rank-constrained problems. To address these deficiencies, we propose an efficient hybrid stochastic gradient hard thresholding (HSG-HT) method that can be provably shown to have sample-size-independent gradient evaluation and hard thresholding complexity bounds. Specifically, we prove that the stochastic gradient evaluation complexity of HSG-HT scales linearly with inverse of sub-optimality and its hard thresholding complexity scales logarithmically. By applying the heavy ball acceleration technique, we further propose an accelerated variant of HSG-HT which can be shown to have improved factor dependence on restricted condition number. Numerical results confirm our theoretical affirmation and demonstrate the computational efficiency of the proposed methods. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7469-efficient-stochastic-gradient-hard-thresholding |
http://papers.nips.cc/paper/7469-efficient-stochastic-gradient-hard-thresholding.pdf | |
PWC | https://paperswithcode.com/paper/efficient-stochastic-gradient-hard |
Repo | |
Framework | |
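For readers unfamiliar with the hard thresholding step that the abstract above refers to, the sketch below shows the generic sparsity-constrained version: a gradient step followed by keeping only the k largest-magnitude coordinates. It is not the HSG-HT method itself, whose hybrid gradient estimator is not described in this listing, and the function names are hypothetical. In the rank-constrained setting the analogous projection would be a truncated singular value decomposition, which is not shown.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    if k <= 0:
        return [0.0] * len(x)
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]

def sght_step(w, grad, step_size, k):
    """One gradient step followed by projection onto k-sparse vectors."""
    moved = [wi - step_size * gi for wi, gi in zip(w, grad)]
    return hard_threshold(moved, k)
```

In a stochastic variant, `grad` would be a mini-batch gradient estimate rather than the full gradient.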
Re-Weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation
Title | Re-Weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation |
Authors | Qingchao Chen, Yang Liu, Zhaowen Wang, Ian Wassell, Kevin Chetty |
Abstract | Unsupervised Domain Adaptation (UDA) aims to transfer domain knowledge from existing well-defined tasks to new ones where labels are unavailable. In the real-world applications, as the domain (task) discrepancies are usually uncontrollable, it is significantly motivated to match the feature distributions even if the domain discrepancies are disparate. Additionally, as no label is available in the target domain, how to successfully adapt the classifier from the source to the target domain still remains an open question. In this paper, we propose the Re-weighted Adversarial Adaptation Network (RAAN) to reduce the feature distribution divergence and adapt the classifier when domain discrepancies are disparate. Specifically, to alleviate the need of common supports in matching the feature distribution, we choose to minimize optimal transport (OT) based Earth-Mover (EM) distance and reformulate it to a minimax objective function. Utilizing this, RAAN can be trained in an end-to-end and adversarial manner. To further adapt the classifier, we propose to match the label distribution and embed it into the adversarial training. Finally, after extensive evaluation of our method using UDA datasets of varying difficulty, RAAN achieved the state-of-the-art results and outperformed other methods by a large margin when the domain shifts are disparate. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_Re-Weighted_Adversarial_Adaptation_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_Re-Weighted_Adversarial_Adaptation_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/re-weighted-adversarial-adaptation-network |
Repo | |
Framework | |
A Framework for Developing and Evaluating Word Embeddings of Drug-named Entity
Title | A Framework for Developing and Evaluating Word Embeddings of Drug-named Entity |
Authors | Mengnan Zhao, Aaron J. Masino, Christopher C. Yang |
Abstract | We investigate the quality of task specific word embeddings created with relatively small, targeted corpora. We present a comprehensive evaluation framework including both intrinsic and extrinsic evaluation that can be expanded to named entities beyond drug name. Intrinsic evaluation results tell that drug name embeddings created with a domain specific document corpus outperformed the previously published versions that derived from a very large general text corpus. Extrinsic evaluation uses word embedding for the task of drug name recognition with Bi-LSTM model and the results demonstrate the advantage of using domain-specific word embeddings as the only input feature for drug name recognition with F1-score achieving 0.91. This work suggests that it may be advantageous to derive domain specific embeddings for certain tasks even when the domain specific corpus is of limited size. |
Tasks | Named Entity Recognition, Outlier Detection, Question Answering, Relation Extraction, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2319/ |
https://www.aclweb.org/anthology/W18-2319 | |
PWC | https://paperswithcode.com/paper/a-framework-for-developing-and-evaluating |
Repo | |
Framework | |
Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech
Title | Building Open Javanese and Sundanese Corpora for Multilingual Text-to-Speech |
Authors | Jaka Aris Eko Wibawa, Supheakmungkol Sarin, Chenfang Li, Knot Pipatsrisawat, Keshan Sodimana, Oddur Kjartansson, Alexander Gutkin, Martin Jansche, Linne Ha |
Abstract | |
Tasks | Speech Recognition, Speech Synthesis |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1255/ |
https://www.aclweb.org/anthology/L18-1255 | |
PWC | https://paperswithcode.com/paper/building-open-javanese-and-sundanese-corpora |
Repo | |
Framework | |
Spline Filters For End-to-End Deep Learning
Title | Spline Filters For End-to-End Deep Learning |
Authors | Randall Balestriero, Romain Cosentino, Herve Glotin, Richard Baraniuk |
Abstract | We propose to tackle the problem of end-to-end learning for raw waveform signals by introducing learnable continuous time-frequency atoms. The derivation of these filters is achieved by defining a functional space with a given smoothness order and boundary conditions. From this space, we derive the parametric analytical filters. Their differentiability property allows gradient-based optimization. As such, one can utilize any Deep Neural Network (DNN) with these filters. This enables us to tackle in a front-end fashion a large scale bird detection task based on the freefield1010 dataset known to contain key challenges, such as the dimensionality of the inputs data ($>100,000$) and the presence of additional noises: multiple sources and soundscapes. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2291 |
http://proceedings.mlr.press/v80/balestriero18a/balestriero18a.pdf | |
PWC | https://paperswithcode.com/paper/spline-filters-for-end-to-end-deep-learning |
Repo | |
Framework | |
Baseline wander and power line interference removal from ECG signals using eigenvalue decomposition
Title | Baseline wander and power line interference removal from ECG signals using eigenvalue decomposition |
Authors | Rishi Raj Sharma, Ram Bilas Pachori |
Abstract | In this paper, a novel method is proposed for baseline wander (BW) and power line interference (PLI) removal from electrocardiogram (ECG) signals. The proposed methodology is based on the eigenvalue decomposition of the Hankel matrix. It has been observed that the end-point eigenvalues of the Hankel matrix formed using noisy ECG signals are correlated with BW and PLI components. We have proposed a methodology to remove BW and PLI noise by eliminating eigenvalues corresponding to noisy components. The proposed concept uses one-step process for removing both BW and PLI noise simultaneously. The proposed method has been compared with other existing methods using performance measure parameters namely output signal to noise ratio (SNRout), and percent root mean square difference (PRD). Simulation results validate the better performance of the proposed method than compared methods at different noise levels. The proposed method is suitable for preprocessing of ECG signals. |
Tasks | |
Published | 2018-06-01 |
URL | https://doi.org/10.1016/j.bspc.2018.05.002 |
https://doi.org/10.1016/j.bspc.2018.05.002 | |
PWC | https://paperswithcode.com/paper/baseline-wander-and-power-line-interference |
Repo | |
Framework | |
Neural Sparse Topical Coding
Title | Neural Sparse Topical Coding |
Authors | Min Peng, Qianqian Xie, Yanchun Zhang, Hua Wang, Xiuzhen Zhang, Jimin Huang, Gang Tian |
Abstract | Topic models with sparsity enhancement have been proven to be effective at learning discriminative and coherent latent topics of short texts, which is critical to many scientific and engineering applications. However, the extensions of these models require carefully tailored graphical models and re-deduced inference algorithms, limiting their variations and applications. We propose a novel sparsity-enhanced topic model, Neural Sparse Topical Coding (NSTC), based on a sparsity-enhanced topic model called Sparse Topical Coding (STC). It focuses on replacing the complex inference process with back propagation, which makes extensions of the model easy to explore. Moreover, the external semantic information of words in word embeddings is incorporated to improve the representation of short texts. To illustrate the flexibility offered by the neural network based framework, we present three extensions based on NSTC without re-deduced inference algorithms. Experiments on Web Snippet and 20Newsgroups datasets demonstrate that our models outperform existing methods. |
Tasks | Language Modelling, Topic Models, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1217/ |
https://www.aclweb.org/anthology/P18-1217 | |
PWC | https://paperswithcode.com/paper/neural-sparse-topical-coding |
Repo | |
Framework | |
Towards Safe Deep Learning: Unsupervised Defense Against Generic Adversarial Attacks
Title | Towards Safe Deep Learning: Unsupervised Defense Against Generic Adversarial Attacks |
Authors | Bita Darvish Rouhani, Mohammad Samragh, Tara Javidi, Farinaz Koushanfar |
Abstract | Recent advances in adversarial Deep Learning (DL) have opened up a new and largely unexplored surface for malicious attacks jeopardizing the integrity of autonomous DL systems. We introduce a novel automated countermeasure called Parallel Checkpointing Learners (PCL) to thwart the potential adversarial attacks and significantly improve the reliability (safety) of a victim DL model. The proposed PCL methodology is unsupervised, meaning that no adversarial sample is leveraged to build/train parallel checkpointing learners. We formalize the goal of preventing adversarial attacks as an optimization problem to minimize the rarely observed regions in the latent feature space spanned by a DL network. To solve the aforementioned minimization problem, a set of complementary but disjoint checkpointing modules are trained and leveraged to validate the victim model execution in parallel. Each checkpointing learner explicitly characterizes the geometry of the input data and the corresponding high-level data abstractions within a particular DL layer. As such, the adversary is required to simultaneously deceive all the defender modules in order to succeed. We extensively evaluate the performance of the PCL methodology against the state-of-the-art attack scenarios, including Fast-Gradient-Sign (FGS), Jacobian Saliency Map Attack (JSMA), Deepfool, and Carlini&WagnerL2 algorithm. Extensive proof-of-concept evaluations for analyzing various data collections including MNIST, CIFAR10, and ImageNet corroborate the effectiveness of our proposed defense mechanism against adversarial samples. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HyI6s40a- |
https://openreview.net/pdf?id=HyI6s40a- | |
PWC | https://paperswithcode.com/paper/towards-safe-deep-learning-unsupervised |
Repo | |
Framework | |
The Generalization Error of Dictionary Learning with Moreau Envelopes
Title | The Generalization Error of Dictionary Learning with Moreau Envelopes |
Authors | Alexandros Georgogiannis |
Abstract | This is a theoretical study on the sample complexity of dictionary learning with a general type of reconstruction loss. The goal is to estimate a $m \times d$ matrix $D$ of unit-norm columns when the only available information is a set of training samples. Points $x$ in $\mathbb{R}^m$ are subsequently approximated by the linear combination $Da$ after solving the problem $\min_{a \in \mathbb{R}^d} \Phi(x - Da) + g(a)$; function $g:\mathbb{R}^d \to [0,+\infty)$ is either an indicator function or a sparsity promoting regularizer. Here is considered the case where $\Phi(x) = \inf_{z \in \mathbb{R}^m} \{ \|x-z\|_2^2 + h(\|z\|_2) \}$ and $h$ is an even and univariate function on the real line. Connections are drawn between $\Phi$ and the Moreau envelope of $h$. A new sample complexity result concerning the $k$-sparse dictionary problem removes the spurious condition on the coherence of $D$ appearing in previous works. Finally, comments are made on the approximation error of certain families of losses. The derived generalization bounds are of order $\mathcal{O}(\sqrt{\log n /n})$ and valid without any further restrictions on the set of dictionaries with unit-norm columns. |
Tasks | Dictionary Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1931 |
http://proceedings.mlr.press/v80/georgogiannis18a/georgogiannis18a.pdf | |
PWC | https://paperswithcode.com/paper/the-generalization-error-of-dictionary |
Repo | |
Framework | |
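The link between the loss in the abstract above and a Moreau envelope can be checked numerically in the simplest case. The sketch below assumes the scalar setting (m = 1) and the illustrative choice h(t) = |t|, neither of which is taken from the paper. Under those assumptions, the infimum of (x - z)^2 + |z| over z has a known Huber-type closed form, and a brute-force grid search reproduces it. Function names and grid parameters are hypothetical.

```python
def phi_numeric(x, h, lo=-3.0, hi=3.0, steps=6000):
    """Approximate inf_z { (x - z)^2 + h(|z|) } on a uniform grid (scalar case)."""
    best = float("inf")
    for j in range(steps + 1):
        z = lo + (hi - lo) * j / steps
        best = min(best, (x - z) ** 2 + h(abs(z)))
    return best

def phi_closed_form_abs(x):
    """Closed form for h(t) = |t|: quadratic near zero, linear in the tails."""
    return x * x if abs(x) <= 0.5 else abs(x) - 0.25
```

For example, `phi_numeric(1.5, abs)` should agree with `phi_closed_form_abs(1.5)` up to the grid resolution, provided the minimizer lies inside the search interval.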