Paper Group NANR 237
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management. Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval. Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank. ShakeDrop regularization. LSH Softmax: Sub-Linear Learning and …
European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management
Title | European Language Resource Coordination: Collecting Language Resources for Public Sector Multilingual Information Management |
Authors | Andrea Lösch, Valérie Mapelli, Stelios Piperidis, Andrejs Vasiļjevs, Lilli Smal, Thierry Declerck, Eileen Schnur, Khalid Choukri, Josef van Genabith |
Abstract | |
Tasks | Machine Translation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1213/ |
PWC | https://paperswithcode.com/paper/european-language-resource-coordination |
Repo | |
Framework | |
Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval
Title | Attention-aware Deep Adversarial Hashing for Cross-Modal Retrieval |
Authors | Xi Zhang, Hanjiang Lai, Jiashi Feng |
Abstract | Due to the rapid growth of multi-modal data, hashing methods for cross-modal retrieval have received considerable attention. However, finding content similarities between different modalities of data is still challenging due to an existing heterogeneity gap. To further address this problem, we propose an adversarial hashing network with an attention mechanism to enhance the measurement of content similarities by selectively focusing on the informative parts of multi-modal data. The proposed new deep adversarial network consists of three building blocks: 1) the feature learning module to obtain the feature representations; 2) the attention module to generate an attention mask, which is used to divide the feature representations into the attended and unattended feature representations; and 3) the hashing module to learn hash functions that preserve the similarities between different modalities. In our framework, the attention and hashing modules are trained in an adversarial way: the attention module attempts to make the hashing module unable to preserve the similarities of multi-modal data w.r.t. the unattended feature representations, while the hashing module aims to preserve the similarities of multi-modal data w.r.t. the attended and unattended feature representations. Extensive evaluations on several benchmark datasets demonstrate that the proposed method brings substantial improvements over other state-of-the-art cross-modal hashing methods. |
Tasks | Cross-Modal Retrieval |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Xi_Zhang_Attention-aware_Deep_Adversarial_ECCV_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_ECCV_2018/papers/Xi_Zhang_Attention-aware_Deep_Adversarial_ECCV_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/attention-aware-deep-adversarial-hashing-for |
Repo | |
Framework | |
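The attended/unattended split and the two-player objective described in the abstract can be sketched compactly. Below is a minimal single-modality PyTorch toy; the `AttentionMask`/`HashHead` names, shapes, and loss comments are my own assumptions, not the authors' architecture, which uses per-modality encoders and a similarity-preserving hashing loss.

```python
import torch
import torch.nn as nn

class AttentionMask(nn.Module):
    """Generates a soft mask that splits features into attended/unattended parts."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, dim)

    def forward(self, feat):
        mask = torch.sigmoid(self.score(feat))      # soft attention mask in (0, 1)
        return mask * feat, (1.0 - mask) * feat     # attended, unattended

class HashHead(nn.Module):
    """Maps features to relaxed binary codes via tanh."""
    def __init__(self, dim, bits):
        super().__init__()
        self.proj = nn.Linear(dim, bits)

    def forward(self, feat):
        return torch.tanh(self.proj(feat))

feat = torch.randn(8, 128)                          # stand-in for one modality's features
attn, hasher = AttentionMask(128), HashHead(128, 32)
attended, unattended = attn(feat)
codes_att, codes_unatt = hasher(attended), hasher(unattended)
# Schematically, the hashing module is optimized so that pairwise similarities
# are preserved by BOTH code sets, while the attention module is optimized so
# that similarity preservation FAILS on codes_unatt -- the adversarial game.
```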
Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank
Title | Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank |
Authors | Deniz Zeyrek, Amália Mendes, Murathan Kurfalı |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1301/ |
PWC | https://paperswithcode.com/paper/multilingual-extension-of-pdtb-style |
Repo | |
Framework | |
ShakeDrop regularization
Title | ShakeDrop regularization |
Authors | Yoshihiro Yamada, Masakazu Iwamura, Koichi Kise |
Abstract | This paper proposes a powerful regularization method named *ShakeDrop regularization*. ShakeDrop is inspired by Shake-Shake regularization, which decreases error rates by disturbing learning. While Shake-Shake can be applied only to ResNeXt, which has multiple branches, ShakeDrop can be applied not only to ResNeXt but also to ResNet, Wide ResNet and PyramidNet in a memory-efficient way. An important and interesting feature of ShakeDrop is that it strongly disturbs learning by multiplying the output of a convolutional layer by even a negative factor in the forward training pass. The effectiveness of ShakeDrop is confirmed by experiments on the CIFAR-10/100 and Tiny ImageNet datasets. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=S1NHaMW0b |
PDF | https://openreview.net/pdf?id=S1NHaMW0b |
PWC | https://paperswithcode.com/paper/shakedrop-regularization |
Repo | |
Framework | |
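The core operation is simple enough to sketch. Below is a minimal PyTorch rendering under my own simplifying assumptions: a single scalar gate per batch, and the same random factor reused in the backward pass, whereas the paper draws an independent factor β for the gradient.

```python
import torch
import torch.nn as nn

class ShakeDrop(nn.Module):
    """Minimal ShakeDrop sketch (a simplified reading of the paper, not the
    authors' code). Training: y = x + (b + alpha - b*alpha) * F(x), with
    b ~ Bernoulli(p_keep) and alpha ~ U[-1, 1], so the residual branch can be
    scaled by a negative factor. Inference uses the expected gate, p_keep."""

    def __init__(self, p_keep=0.5):
        super().__init__()
        self.p_keep = p_keep

    def forward(self, x, branch):
        if self.training:
            b = torch.bernoulli(torch.tensor(self.p_keep, device=x.device))
            alpha = torch.empty((), device=x.device).uniform_(-1.0, 1.0)
            gate = b + alpha - b * alpha      # 1 if b == 1, else alpha in [-1, 1]
            return x + gate * branch          # branch may be multiplied negatively
        return x + self.p_keep * branch       # E[gate] = p_keep since E[alpha] = 0

# Usage inside a residual block: y = shake_drop(x, conv_branch(x))
```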
LSH Softmax: Sub-Linear Learning and Inference of the Softmax Layer in Deep Architectures
Title | LSH Softmax: Sub-Linear Learning and Inference of the Softmax Layer in Deep Architectures |
Authors | Daniel Levy, Danlu Chen, Stefano Ermon |
Abstract | Log-linear models are widely used in machine learning, and in particular are ubiquitous in deep learning architectures in the form of the softmax. While exact inference and learning of these models require linear time, they can be done approximately in sub-linear time with strong concentration guarantees. In this work, we present LSH Softmax, a method to perform sub-linear learning and inference of the softmax layer in the deep learning setting. Our method relies on the popular Locality-Sensitive Hashing to build a well-concentrated gradient estimator, using nearest neighbors and uniform samples. We also present an inference scheme in sub-linear time for LSH Softmax using the Gumbel distribution. On language modeling, we show that Recurrent Neural Networks trained with LSH Softmax perform on par with computing the exact softmax while requiring sub-linear computations. |
Tasks | Language Modelling |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SJ3dBGZ0Z |
PDF | https://openreview.net/pdf?id=SJ3dBGZ0Z |
PWC | https://paperswithcode.com/paper/lsh-softmax-sub-linear-learning-and-inference |
Repo | |
Framework | |
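The estimator admits a compact toy version. The NumPy sketch below is my own simplification: an exact top-k search stands in for the LSH lookup, so this demo is O(V) and illustrative only, not sub-linear; the head classes are summed exactly and the tail mass is estimated from importance-weighted uniform samples.

```python
import numpy as np

def lsh_softmax_estimate(W, h, target, n_neighbors=20, n_uniform=20, rng=None):
    """Toy estimate of the softmax probability of `target` (my sketch, not the
    authors' code). Z = sum_j exp(w_j . h) is approximated from the target's
    nearest neighbors (most of the mass) plus uniform samples (the tail)."""
    rng = np.random.default_rng() if rng is None else rng
    V = W.shape[0]
    scores = W @ h                                        # logits w_j . h; LSH avoids this
    head = np.argpartition(-scores, n_neighbors)[:n_neighbors]
    head = np.union1d(head, [target])                     # always include the target
    tail = np.setdiff1d(np.arange(V), head)
    sampled = rng.choice(tail, size=n_uniform, replace=False)
    # unbiased tail estimate: sample mean times the number of tail classes
    Z_hat = np.exp(scores[head]).sum() + np.exp(scores[sampled]).mean() * tail.size
    return np.exp(scores[target]) / Z_hat

# Example: vocabulary of 50k classes, 128-dim hidden state
rng = np.random.default_rng(0)
W, h = rng.normal(size=(50_000, 128)) / 12.0, rng.normal(size=128)
print(lsh_softmax_estimate(W, h, target=42, rng=rng))
```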
Combining Model-based and Model-free RL via Multi-step Control Variates
Title | Combining Model-based and Model-free RL via Multi-step Control Variates |
Authors | Tong Che, Yuchen Lu, George Tucker, Surya Bhupatiraju, Shane Gu, Sergey Levine, Yoshua Bengio |
Abstract | Model-free deep reinforcement learning algorithms are able to successfully solve a wide range of continuous control tasks, but typically require many on-policy samples to achieve good performance. Model-based RL algorithms, on the other hand, are sample-efficient, but learning accurate global models of complex dynamic environments has turned out to be tricky in practice, which leads to unsatisfactory performance of the learned policies. In this work, we combine the sample-efficiency of model-based algorithms with the accuracy of model-free algorithms. We leverage multi-step neural-network-based predictive models by embedding real trajectories into imaginary rollouts of the model, and use the imaginary cumulative rewards as control variates for model-free algorithms. In this way, we achieve the strengths of both sides and derive an estimator which is not only sample-efficient, but also unbiased and of very low variance. We present our evaluation on the MuJoCo and OpenAI Gym benchmarks. |
Tasks | Continuous Control |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HkPCrEZ0Z |
PDF | https://openreview.net/pdf?id=HkPCrEZ0Z |
PWC | https://paperswithcode.com/paper/combining-model-based-and-model-free-rl-via |
Repo | |
Framework | |
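The variance-reduction mechanism at the heart of this can be demonstrated with a toy NumPy example. This is my own construction, not the paper's estimator (which operates on full trajectory rollouts): a correlated "imaginary" return m with known mean is subtracted from the true return r, leaving the estimator unbiased with much lower variance.

```python
import numpy as np

# Toy control variate: m correlates with r and E[m] is known, so
# r - (m - E[m]) still estimates E[r] but with the shared noise cancelled.
rng = np.random.default_rng(0)
r = rng.normal(loc=1.0, scale=1.0, size=100_000)      # "model-free" MC returns
m = r + rng.normal(loc=0.0, scale=0.3, size=r.size)   # correlated "model" rollouts
cv = r - (m - 1.0)                                     # E[m] = 1.0 by construction
print(f"plain variance {r.var():.3f} vs control-variate variance {cv.var():.3f}")
```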
Natural Language Inference with External Knowledge
Title | Natural Language Inference with External Knowledge |
Authors | Qian Chen, Xiaodan Zhu, Zhen-Hua Ling, Diana Inkpen |
Abstract | Modeling informal inference in natural language is very challenging. With the recent availability of large annotated data, it has become feasible to train complex models such as neural networks to perform natural language inference (NLI), which have achieved state-of-the-art performance. Although relatively large annotated datasets exist, can machines learn all the knowledge needed to perform NLI from the data? If not, how can NLI models benefit from external knowledge, and how can such models be built to leverage it? In this paper, we aim to answer these questions by enriching the state-of-the-art neural natural language inference models with external knowledge. We demonstrate that the proposed models with external knowledge further improve the state of the art on the Stanford Natural Language Inference (SNLI) dataset. |
Tasks | Natural Language Inference |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Sy3XxCx0Z |
PDF | https://openreview.net/pdf?id=Sy3XxCx0Z |
PWC | https://paperswithcode.com/paper/natural-language-inference-with-external |
Repo | |
Framework | |
Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
Title | Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go? |
Authors | Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, Li Fei-Fei |
Abstract | One of the most widely used optimization methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD). However, a key issue that arises here is that of delayed gradients: when a “worker” node asynchronously contributes a gradient update to the “master”, the global model parameter may have changed, rendering this information stale. In massively parallel computing grids, these delays can quickly add up if the computational throughput of a node is saturated, so the convergence of DASGD is uncertain under these conditions. Nevertheless, by using a judiciously chosen quasilinear step-size sequence, we show that it is possible to amortize these delays and achieve global convergence with probability 1, even when the delays grow at a polynomial rate. In this way, our results help reaffirm the successful application of DASGD to large-scale optimization problems. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2293 |
PDF | http://proceedings.mlr.press/v80/zhou18b/zhou18b.pdf |
PWC | https://paperswithcode.com/paper/distributed-asynchronous-optimization-with |
Repo | |
Framework | |
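A schematic form of the delayed update being analyzed (a standard formalization of stale gradients, not copied from the paper):

$$x_{n+1} = x_n - \alpha_n \,\nabla F\big(x_{s(n)};\,\omega_n\big), \qquad d_n = n - s(n),$$

where $d_n$ is the staleness of the gradient contributed at step $n$. The paper's result is that convergence with probability 1 survives even $d_n$ growing at a polynomial rate, provided the step-size sequence $\alpha_n$ is chosen to decay suitably.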
Domain-Sensitive Temporal Tagging By Jannik Strötgen, Michael Gertz
Title | Domain-Sensitive Temporal Tagging By Jannik Strötgen, Michael Gertz |
Authors | Ruihong Huang |
Abstract | |
Tasks | Information Retrieval, Named Entity Recognition, Question Answering |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/J18-2006/ |
PWC | https://paperswithcode.com/paper/domain-sensitive-temporal-tagging-by-jannik |
Repo | |
Framework | |
Low-Rank Riemannian Optimization on Positive Semidefinite Stochastic Matrices with Applications to Graph Clustering
Title | Low-Rank Riemannian Optimization on Positive Semidefinite Stochastic Matrices with Applications to Graph Clustering |
Authors | Ahmed Douik, Babak Hassibi |
Abstract | This paper develops a Riemannian optimization framework for solving optimization problems on the set of symmetric positive semidefinite stochastic matrices. The paper first reformulates the problem by factorizing the optimization variable as $\mathbf{X}=\mathbf{Y}\mathbf{Y}^T$ and deriving conditions on $p$, i.e., the number of columns of $\mathbf{Y}$, under which the factorization yields a satisfactory solution. The reparameterization of the problem allows its formulation as an optimization over either an embedded or quotient Riemannian manifold whose geometries are investigated. In particular, the paper explicitly derives the tangent space, Riemannian gradients and retraction operator that allow the design of efficient optimization methods on the proposed manifolds. The numerical results reveal that, when the optimal solution has a known low-rank, the resulting algorithms present a clear complexity advantage when compared with state-of-the-art Euclidean and Riemannian approaches for graph clustering applications. |
Tasks | Graph Clustering |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2061 |
PDF | http://proceedings.mlr.press/v80/douik18a/douik18a.pdf |
PWC | https://paperswithcode.com/paper/low-rank-riemannian-optimization-on-positive |
Repo | |
Framework | |
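In symbols, the reformulation described in the abstract is (my transcription):

$$\min_{\mathbf{Y}\in\mathbb{R}^{n\times p}} \; f(\mathbf{Y}\mathbf{Y}^T) \quad \text{subject to} \quad \mathbf{Y}\mathbf{Y}^T\mathbf{1} = \mathbf{1}, \quad \mathbf{Y}\mathbf{Y}^T \geq 0,$$

where symmetry and positive semidefiniteness of $\mathbf{X}=\mathbf{Y}\mathbf{Y}^T$ hold automatically by construction, so only the stochasticity constraints (rows summing to one, nonnegative entries) remain; these carve out the manifold whose tangent space, gradients, and retraction the paper derives.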
Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense
Title | Constructing High Quality Sense-specific Corpus and Word Embedding via Unsupervised Elimination of Pseudo Multi-sense |
Authors | Haoyue Shi, Xihao Wang, Yuqi Sun, Junfeng Hu |
Abstract | |
Tasks | Word Embeddings |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1154/ |
PWC | https://paperswithcode.com/paper/constructing-high-quality-sense-specific |
Repo | |
Framework | |
Toward Cross-Domain Engagement Analysis in Medical Notes
Title | Toward Cross-Domain Engagement Analysis in Medical Notes |
Authors | Sara Rosenthal, Adam Faulkner |
Abstract | We present a novel annotation task evaluating a patient's engagement with their health care regimen. The concept of engagement supplements the traditional concept of adherence with a focus on the patient's affect, lifestyle choices, and health goal status. We describe an engagement annotation task across two patient note domains: traditional clinical notes and a novel domain, care manager notes, where we find engagement to be more common. The annotation task resulted in a kappa of .53, suggesting strong annotator intuitions regarding engagement-bearing language. In addition, we report the results of a series of preliminary engagement classification experiments using domain adaptation. |
Tasks | Domain Adaptation |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2325/ |
PWC | https://paperswithcode.com/paper/toward-cross-domain-engagement-analysis-in |
Repo | |
Framework | |
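Since the headline number here is an agreement statistic, it may help to recall the general form of kappa (a standard formula, not taken from the paper, which does not specify the variant used):

$$\kappa = \frac{p_o - p_e}{1 - p_e},$$

where $p_o$ is the observed inter-annotator agreement and $p_e$ the agreement expected by chance; the reported $\kappa = .53$ therefore indicates agreement roughly halfway between chance and perfect.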
Pulmonary Artery–Vein Classification in CT Images Using Deep Learning
Title | Pulmonary Artery–Vein Classification in CT Images Using Deep Learning |
Authors | Pietro Nardelli, Daniel Jimenez-Carretero, David Bermejo-Pelaez, George R. Washko, Farbod N. Rahaghi, Maria J. Ledesma-Carbayo, Raúl San José Estépar |
Abstract | Recent studies show that pulmonary vascular diseases may specifically affect arteries or veins through different physiologic mechanisms. To detect changes in the two vascular trees, physicians manually analyze the chest computed tomography (CT) image of the patients in search of abnormalities. This process is time consuming, difficult to standardize, and thus not feasible for large clinical studies or useful in real-world clinical decision making. Therefore, automatic separation of arteries and veins in CT images is becoming of great interest, as it may help physicians to accurately diagnose pathological conditions. In this paper, we present a novel, fully automatic approach to classify vessels from chest CT images into arteries and veins. The algorithm follows three main steps: first, a scale-space particles segmentation to isolate vessels; then a 3-D convolutional neural network (CNN) to obtain a first classification of vessels; finally, graph-cut optimization to refine the results. To justify the usage of the proposed CNN architecture, we compared different 2-D and 3-D CNNs that may use local information from bronchus- and vessel-enhanced images provided to the network with different strategies. We also compared the proposed CNN approach with a random forests (RFs) classifier. The methodology was trained and evaluated on the superior and inferior lobes of the right lung of 18 clinical cases with noncontrast chest CT scans, in comparison with manual classification. The proposed algorithm achieves an overall accuracy of 94%, which is higher than the accuracy obtained using other CNN architectures and RF. Our method was also validated with contrast-enhanced CT scans of patients with chronic thromboembolic pulmonary hypertension to demonstrate that our model generalizes well to contrast-enhanced modalities. The proposed method outperforms state-of-the-art methods, paving the way for future use of 3-D CNN for artery/vein classification in CT images. |
Tasks | 3D Medical Imaging Segmentation, Computed Tomography (CT), Decision Making, Medical Image Segmentation, Pulmonary Artery–Vein Classification, Pulmonary Vessel Segmentation |
Published | 2018-05-04 |
URL | https://doi.org/10.1109/TMI.2018.2833385 |
PDF | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6214740/pdf/nihms982442.pdf |
PWC | https://paperswithcode.com/paper/pulmonary-arteryvein-classification-in-ct |
Repo | |
Framework | |
Self-Training for Jointly Learning to Ask and Answer Questions
Title | Self-Training for Jointly Learning to Ask and Answer Questions |
Authors | Mrinmaya Sachan, Eric Xing |
Abstract | Building curious machines that can answer as well as ask questions is an important challenge for AI. The two tasks of question answering and question generation are usually tackled separately in the NLP literature. At the same time, both require significant amounts of supervised data which is hard to obtain in many domains. To alleviate these issues, we propose a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question answer pairs for learning. We evaluate our approach on four benchmark datasets: SQuAD, MS MARCO, WikiQA and TrecQA, and show significant improvements over a number of established baselines on both question answering and question generation tasks. We also achieve new state-of-the-art results on two competitive answer sentence selection tasks: WikiQA and TrecQA. |
Tasks | Data Augmentation, Question Answering, Question Generation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1058/ |
PWC | https://paperswithcode.com/paper/self-training-for-jointly-learning-to-ask-and |
Repo | |
Framework | |
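The outer loop is easy to picture. Here is a deliberately tiny, runnable Python toy of the ask-filter-retrain cycle; every component (the keyword "QA model", the template "QG model", and the confidence rule) is a hypothetical stand-in for the paper's neural models and selection criteria.

```python
def train_qa(pairs):
    """Hypothetical stand-in QA model: answers by word-overlap lookup."""
    memory = dict(pairs)
    def answer(question):
        q_words = set(question.split())
        best = max(memory, key=lambda q: len(set(q.split()) & q_words))
        confidence = len(set(best.split()) & q_words) / len(q_words)
        return memory[best], confidence
    return answer

def generate_question(sentence):
    """Hypothetical stand-in QG model: a fixed template."""
    return f"Which fact states that {sentence}?"

labeled = [("Which fact states that the sky is blue?", "the sky is blue")]
unlabeled = ["water boils at 100 C", "grass is green"]

qa = train_qa(labeled)
for _ in range(2):                       # self-training rounds
    pseudo = []
    for sent in unlabeled:
        q = generate_question(sent)      # "ask"
        a, conf = qa(q)                  # "answer"
        if conf >= 0.5:                  # keep only confident pseudo-labels
            pseudo.append((q, a))
    qa = train_qa(labeled + pseudo)      # retrain on labeled + pseudo-labeled
```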
Connecting Optimization and Regularization Paths
Title | Connecting Optimization and Regularization Paths |
Authors | Arun Suggala, Adarsh Prasad, Pradeep K. Ravikumar |
Abstract | We study the implicit regularization properties of optimization techniques by explicitly connecting their optimization paths to the regularization paths of "corresponding" regularized problems. This surprising connection shows that iterates of optimization techniques such as gradient descent and mirror descent are *pointwise* close to solutions of appropriately regularized objectives. While such a tight connection between optimization and regularization is of independent intellectual interest, it also has important implications for machine learning: we can port results from regularized estimators to optimization, and vice versa. We investigate one key consequence, which borrows from the well-studied analysis of regularized estimators, to obtain tight excess risk bounds for the iterates generated by optimization techniques. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8260-connecting-optimization-and-regularization-paths |
PDF | http://papers.nips.cc/paper/8260-connecting-optimization-and-regularization-paths.pdf |
PWC | https://paperswithcode.com/paper/connecting-optimization-and-regularization |
Repo | |
Framework | |
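Schematically, the correspondence pairs the gradient-descent path with a regularization path (my paraphrase; the exact constants and conditions are in the paper):

$$\theta_{k+1} = \theta_k - \eta\,\nabla f(\theta_k), \qquad \widehat{\theta}(\lambda) = \operatorname*{arg\,min}_{\theta} \; f(\theta) + \frac{\lambda}{2}\,\lVert \theta - \theta_0 \rVert_2^2,$$

with the iterate $\theta_k$ shown to be pointwise close to $\widehat{\theta}(\lambda_k)$ for $\lambda_k$ on the order of $1/(\eta k)$, so running the optimizer longer corresponds to weakening the regularizer.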