Paper Group NANR 143
Alibaba at IJCNLP-2017 Task 1: Embedding Grammatical Features into LSTMs for Chinese Grammatical Error Diagnosis Task
Title | Alibaba at IJCNLP-2017 Task 1: Embedding Grammatical Features into LSTMs for Chinese Grammatical Error Diagnosis Task |
Authors | Yi Yang, Pengjun Xie, Jun Tao, Guangwei Xu, Linlin Li, Luo Si |
Abstract | This paper introduces the Alibaba NLP team's system for IJCNLP 2017 Shared Task 1: Chinese Grammatical Error Diagnosis (CGED). The task is to diagnose four types of grammatical errors: redundant words (R), missing words (M), bad word selection (S), and disordered words (W). We treat the task as a sequence tagging problem and design handcrafted features to solve it. Our system is mainly based on the LSTM-CRF model, and three ensemble strategies are applied to improve performance. Our system achieves the highest F1 scores at both the identification level and the position level. At the position level, which is the most difficult, we perform best on all metrics. |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4006/ |
PWC | https://paperswithcode.com/paper/alibaba-at-ijcnlp-2017-task-1-embedding |
Repo | |
Framework | |
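The entry above casts CGED as sequence tagging over the R/M/S/W error types. As a minimal, hypothetical sketch of that framing (PyTorch; not the authors' code), the snippet below wires up a BiLSTM tagger with a per-token softmax head. The CRF layer, handcrafted grammatical features, and ensembling from the paper are omitted, and every dimension, as well as the assumed BIO tag set, is an illustrative choice.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Minimal BiLSTM sequence tagger. In the paper's LSTM-CRF, a CRF
    transition layer would replace this independent per-token softmax."""
    def __init__(self, vocab_size, emb_dim=128, hidden=256, num_tags=9):
        super().__init__()
        # num_tags=9 assumes a BIO scheme over R/M/S/W errors plus O
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.proj = nn.Linear(2 * hidden, num_tags)

    def forward(self, token_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2*hidden)
        return self.proj(h)                      # per-token tag scores

model = BiLSTMTagger(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 20)))  # toy batch of 2 sentences
print(logits.shape)  # torch.Size([2, 20, 9])
```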
Communication-Efficient Distributed Learning of Discrete Distributions
Title | Communication-Efficient Distributed Learning of Discrete Distributions |
Authors | Ilias Diakonikolas, Elena Grigorescu, Jerry Li, Abhiram Natarajan, Krzysztof Onak, Ludwig Schmidt |
Abstract | We initiate a systematic investigation of distribution learning (density estimation) when the data is distributed across multiple servers. The servers must communicate with a referee and the goal is to estimate the underlying distribution with as few bits of communication as possible. We focus on non-parametric density estimation of discrete distributions with respect to the l1 and l2 norms. We provide the first non-trivial upper and lower bounds on the communication complexity of this basic estimation task in various settings of interest. Specifically, our results include the following: 1. When the unknown discrete distribution is unstructured and each server has only one sample, we show that any blackboard protocol (i.e., any protocol in which servers interact arbitrarily using public messages) that learns the distribution must essentially communicate the entire sample. 2. For the case of structured distributions, such as k-histograms and monotone distributions, we design distributed learning algorithms that achieve significantly better communication guarantees than the naive ones, and obtain tight upper and lower bounds in several regimes. Our distributed learning algorithms run in near-linear time and are robust to model misspecification. Our results provide insights on the interplay between structure and communication efficiency for a range of fundamental distribution estimation tasks. |
Tasks | Density Estimation |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7218-communication-efficient-distributed-learning-of-discrete-distributions |
PDF | http://papers.nips.cc/paper/7218-communication-efficient-distributed-learning-of-discrete-distributions.pdf |
PWC | https://paperswithcode.com/paper/communication-efficient-distributed-learning |
Repo | |
Framework | |
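To make the communication setting concrete, here is a toy simulation (our own, not from the paper) of the naive blackboard protocol that the abstract's Result 1 says is essentially unavoidable for unstructured distributions: every server transmits its single sample verbatim and the referee outputs the empirical distribution. The domain size, server count, and Dirichlet prior are arbitrary choices.

```python
import numpy as np

def naive_protocol(samples, domain_size):
    """Referee's estimate after every server sends its one sample
    (about log2(domain_size) bits per server)."""
    counts = np.bincount(samples, minlength=domain_size)
    return counts / counts.sum()

rng = np.random.default_rng(0)
k, n = 100, 10_000                         # domain size, number of servers
true_p = rng.dirichlet(np.ones(k))         # stand-in unknown distribution
samples = rng.choice(k, size=n, p=true_p)  # one draw per server
est = naive_protocol(samples, k)
print("l1 error:", np.abs(est - true_p).sum())
print("bits communicated ~", n * int(np.ceil(np.log2(k))))
```

The paper's contribution is showing when structure (k-histograms, monotonicity) lets the servers beat this bit count.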
Weighted-Entropy-Based Quantization for Deep Neural Networks
Title | Weighted-Entropy-Based Quantization for Deep Neural Networks |
Authors | Eunhyeok Park, Junwhan Ahn, Sungjoo Yoo |
Abstract | Quantization is considered one of the most effective methods to optimize the inference cost of neural network models for their deployment to mobile and embedded systems, which have tight resource constraints. In such approaches, it is critical to provide low-cost quantization under a tight accuracy loss constraint (e.g., 1%). In this paper, we propose a novel method for quantizing weights and activations based on the concept of weighted entropy. Unlike recent work on binary-weight neural networks, our approach is multi-bit quantization, in which weights and activations can be quantized by any number of bits depending on the target accuracy. This facilitates much more flexible exploitation of the accuracy-performance trade-off provided by different levels of quantization. Moreover, our scheme provides an automated quantization flow based on conventional training algorithms, which greatly reduces the design-time effort to quantize the network. According to our extensive evaluations based on practical neural network models for image classification (AlexNet, GoogLeNet and ResNet-50/101), object detection (R-FCN with 50-layer ResNet), and language modeling (an LSTM network), our method achieves significant reductions in both the model size and the amount of computation with minimal accuracy loss. Also, compared to existing quantization schemes, ours provides higher accuracy with a similar resource constraint and requires much lower design effort. |
Tasks | Image Classification, Language Modelling, Object Detection, Quantization |
Published | 2017-07-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2017/html/Park_Weighted-Entropy-Based_Quantization_for_CVPR_2017_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2017/papers/Park_Weighted-Entropy-Based_Quantization_for_CVPR_2017_paper.pdf |
PWC | https://paperswithcode.com/paper/weighted-entropy-based-quantization-for-deep |
Repo | |
Framework | |
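As a very loose illustration of multi-bit, importance-aware quantization (a stand-in heuristic, not the paper's weighted-entropy algorithm), the sketch below scores weights by squared magnitude, splits them into 2^bits groups of roughly equal total importance, and snaps each group to its mean:

```python
import numpy as np

def quantize_weights(w, bits=3):
    """Toy multi-bit quantizer. Squared magnitude stands in for the
    paper's weighted-entropy criterion; groups of roughly equal total
    importance are each collapsed to their mean value."""
    levels = 2 ** bits
    order = np.argsort(np.abs(w))          # weights sorted by magnitude
    cum = np.cumsum(w[order] ** 2)
    cum /= cum[-1]                         # cumulative importance in (0, 1]
    group = np.minimum((cum * levels).astype(int), levels - 1)
    wq = np.empty_like(w)
    for g in range(levels):
        idx = order[group == g]
        wq[idx] = w[idx].mean() if idx.size else 0.0
    return wq

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)
wq = quantize_weights(w, bits=3)
print("levels used:", np.unique(wq).size, " mse:", np.mean((w - wq) ** 2))
```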
How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?
Title | How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices? |
Authors | Andreas Loukas |
Abstract | How many samples are sufficient to guarantee that the eigenvectors of the sample covariance matrix are close to those of the actual covariance matrix? For a wide family of distributions, including distributions with finite second moment and sub-gaussian distributions supported in a centered Euclidean ball, we prove that the inner product between eigenvectors of the sample and actual covariance matrices decreases proportionally to the respective eigenvalue distance and the number of samples. Our findings imply non-asymptotic concentration bounds for eigenvectors and eigenvalues and carry strong consequences for the non-asymptotic analysis of PCA and its applications. For instance, they provide conditions for separating components estimated from $O(1)$ samples and show that even few samples can be sufficient to perform dimensionality reduction, especially for low-rank covariances. |
Tasks | Dimensionality Reduction |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=489 |
PDF | http://proceedings.mlr.press/v70/loukas17a/loukas17a.pdf |
PWC | https://paperswithcode.com/paper/how-close-are-the-eigenvectors-of-the-sample |
Repo | |
Framework | |
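A quick empirical check of the stated phenomenon (our own toy experiment, not the paper's): draw n Gaussian samples from a known covariance with a well-separated spectrum, and watch the eigenvector alignment improve with n.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 20
eigvals = np.arange(d, 0, -1, dtype=float)    # well-separated spectrum
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # actual eigenvectors
Sigma = Q @ np.diag(eigvals) @ Q.T

for n in (50, 500, 5000):
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    S = X.T @ X / n                            # sample covariance
    _, V = np.linalg.eigh(S)
    V = V[:, ::-1]                             # descending eigenvalue order
    align = np.abs(np.sum(V * Q, axis=0))      # |<v_i, q_i>| per component
    print(f"n={n:5d}  top-3 eigenvector alignment:", np.round(align[:3], 3))
```

Consistent with the abstract, alignment is strongest for eigenvectors whose eigenvalues sit far from the rest of the spectrum, and it improves as n grows.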
Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM
Title | Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM |
Authors | Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen |
Abstract | Selecting appropriate words to compose a sentence is one common problem faced by non-native Chinese learners. In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. By combining CWINDOW word embedding features and POS information, the best bidirectional LSTM model achieves an accuracy of 0.5138 and an MRR of 0.6789 on the HSK dataset. For 80.79% of the test data, the model ranks the ground truth within the top two at the position level. |
Tasks | Grammatical Error Detection |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2064/ |
PWC | https://paperswithcode.com/paper/detection-of-chinese-word-usage-errors-for |
Repo | |
Framework | |
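The accuracy, MRR, and top-2 numbers above are standard ranking metrics over candidate error positions. A hypothetical helper (names and toy data invented here) showing how such figures are computed:

```python
def mrr_and_topk(ranked_predictions, gold, k=2):
    """ranked_predictions: per-sentence candidate error positions,
    best first; gold: the true position for each sentence."""
    rr, hits = 0.0, 0
    for ranked, g in zip(ranked_predictions, gold):
        if g in ranked:
            rr += 1.0 / (ranked.index(g) + 1)  # reciprocal rank
        hits += g in ranked[:k]                # top-k hit
    n = len(gold)
    return rr / n, hits / n

preds = [[3, 1, 7], [0, 2, 5], [4, 6, 1]]
gold = [3, 2, 1]
mrr, top2 = mrr_and_topk(preds, gold)
print(f"MRR={mrr:.4f}  top-2 accuracy={top2:.4f}")  # 0.6111, 0.6667
```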
Multi-Channel Lexicon Integrated CNN-BiLSTM Models for Sentiment Analysis
Title | Multi-Channel Lexicon Integrated CNN-BiLSTM Models for Sentiment Analysis |
Authors | Joosung Yoon, Hyeoncheol Kim |
Abstract | |
Tasks | Opinion Mining, Sentiment Analysis |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1023/ |
PWC | https://paperswithcode.com/paper/multi-channel-lexicon-integrated-cnn-bilstm |
Repo | |
Framework | |
Multi-Domain Aspect Extraction Using Support Vector Machines
Title | Multi-Domain Aspect Extraction Using Support Vector Machines |
Authors | Nadheesh Jihan, Yasas Senarath, Dulanjaya Tennekoon, Mithila Wickramarathne, Surangika Ranathunga |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Aspect Extraction, Sentiment Analysis |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1029/ |
PWC | https://paperswithcode.com/paper/multi-domain-aspect-extraction-using-support |
Repo | |
Framework | |
以軟體為基礎建構語音增強系統使用者介面 (Development of a software-based User-Interface of Speech Enhancement System) [In Chinese]
Title | 以軟體為基礎建構語音增強系統使用者介面 (Development of a software-based User-Interface of Speech Enhancement System) [In Chinese] |
Authors | Tao-Wei Wang, Yu Tsao, Ying-Hui Lai, Hsiang-Ping Hsu, Chia-Lung Wu |
Abstract | |
Tasks | Speech Enhancement |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1030/ |
PWC | https://paperswithcode.com/paper/aeecoaocaoeae3aa14c3ca12c-eae-development-of |
Repo | |
Framework | |
序列標記與配對方法用於語音辨識錯誤偵測及修正 (On the Use of Sequence Labeling and Matching Methods for ASR Error Detection and Correction) [In Chinese]
Title | 序列標記與配對方法用於語音辨識錯誤偵測及修正 (On the Use of Sequence Labeling and Matching Methods for ASR Error Detection and Correction) [In Chinese] |
Authors | Chia-Hua Wu, Chun-I Tsai, Hsiao-Tsung Hung, Yu-Chen Kao, Berlin Chen |
Abstract | |
Tasks | Speech Recognition |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1032/ |
PWC | https://paperswithcode.com/paper/aoa-e-eea13c-14eae3e34-ee-eaa-aa-on-the-use |
Repo | |
Framework | |
Nonlinear random matrix theory for deep learning
Title | Nonlinear random matrix theory for deep learning |
Authors | Jeffrey Pennington, Pratik Worah |
Abstract | Neural network configurations with random weights play an important role in the analysis of deep learning. They define the initial loss landscape and are closely related to kernel and random feature methods. Despite the fact that these networks are built out of random matrices, the vast and powerful machinery of random matrix theory has so far found limited success in studying them. A main obstacle in this direction is that neural networks are nonlinear, which prevents the straightforward utilization of many of the existing mathematical results. In this work, we open the door for direct applications of random matrix theory to deep learning by demonstrating that the pointwise nonlinearities typically applied in neural networks can be incorporated into a standard method of proof in random matrix theory known as the moments method. The test case for our study is the Gram matrix $Y^TY$, $Y=f(WX)$, where $W$ is a random weight matrix, $X$ is a random data matrix, and $f$ is a pointwise nonlinear activation function. We derive an explicit representation for the trace of the resolvent of this matrix, which defines its limiting spectral distribution. We apply these results to the computation of the asymptotic performance of single-layer random feature methods on a memorization task and to the analysis of the eigenvalues of the data covariance matrix as it propagates through a neural network. As a byproduct of our analysis, we identify an intriguing new class of activation functions with favorable properties. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6857-nonlinear-random-matrix-theory-for-deep-learning |
PDF | http://papers.nips.cc/paper/6857-nonlinear-random-matrix-theory-for-deep-learning.pdf |
PWC | https://paperswithcode.com/paper/nonlinear-random-matrix-theory-for-deep |
Repo | |
Framework | |
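A small numerical illustration of the object the abstract studies, the Gram matrix $Y^TY$ with $Y=f(WX)$ (our own simulation; the 1/n scalings and the tanh nonlinearity are assumed choices, not the paper's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)
n0, n1, m = 500, 500, 1000                   # data dim, width, sample count
W = rng.normal(0, 1 / np.sqrt(n0), (n1, n0)) # random weight matrix
X = rng.normal(0, 1, (n0, m))                # random data matrix
Y = np.tanh(W @ X)                           # pointwise nonlinearity f
M = Y.T @ Y / n1                             # Gram matrix from the abstract
eigs = np.linalg.eigvalsh(M)
print("spectral range: [%.3f, %.3f]" % (eigs.min(), eigs.max()))
# normalized trace of the resolvent (M - zI)^{-1}, evaluated at z = -1
print("resolvent trace at z=-1: %.3f" % np.mean(1.0 / (eigs + 1.0)))
```

The paper derives this resolvent trace, and hence the limiting spectral distribution, analytically via the moments method.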
Medication and Adverse Event Extraction from Noisy Text
Title | Medication and Adverse Event Extraction from Noisy Text |
Authors | Xiang Dai, Sarvnaz Karimi, Cecile Paris |
Abstract | |
Tasks | Named Entity Recognition |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/U17-1009/ |
PWC | https://paperswithcode.com/paper/medication-and-adverse-event-extraction-from |
Repo | |
Framework | |
Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition
Title | Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition |
Authors | Leon Derczynski, Eric Nichols, Marieke van Erp, Nut Limsopatham |
Abstract | This shared task focuses on identifying unusual, previously-unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks (like event clustering and summarization), but recall on them is a real problem in noisy text - even among annotators. This drop tends to be due to novel entities and surface forms. Take for example the tweet "so.. kktny in 30 mins?!" - even human experts find the entity 'kktny' hard to detect and resolve. The goal of this task is to provide a definition of emerging and of rare entities, and based on that, also datasets for detecting these entities. The task as described in this paper evaluated the ability of participating entries to detect and classify novel and emerging named entities in noisy text. |
Tasks | Named Entity Recognition |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4418/ |
PWC | https://paperswithcode.com/paper/results-of-the-wnut2017-shared-task-on-novel |
Repo | |
Framework | |
Learning Fine-Grained Expressions to Solve Math Word Problems
Title | Learning Fine-Grained Expressions to Solve Math Word Problems |
Authors | Danqing Huang, Shuming Shi, Chin-Yew Lin, Jian Yin |
Abstract | This paper presents a novel template-based method to solve math word problems. This method learns the mappings between math concept phrases in math word problems and their math expressions from training data. For each equation template, we automatically construct a rich template sketch by aggregating information from various problems with the same template. Our approach is implemented in a two-stage system. It first retrieves a few relevant equation system templates and aligns numbers in math word problems to those templates for candidate equation generation. It then does a fine-grained inference to obtain the final answer. Experiment results show that our method achieves an accuracy of 28.4% on the linear Dolphin18K benchmark, which is 10% (54% relative) higher than previous state-of-the-art systems while achieving an accuracy increase of 12% (59% relative) on the TS6 benchmark subset. |
Tasks | Math Word Problem Solving |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1084/ |
PWC | https://paperswithcode.com/paper/learning-fine-grained-expressions-to-solve |
Repo | |
Framework | |
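To ground the two-stage idea (template retrieval, then number-to-slot alignment), here is a deliberately tiny, hypothetical pipeline; the cue words, templates, and align-by-order rule are all invented for illustration and bear no relation to the paper's learned mappings:

```python
import re

# Hypothetical equation templates; slots are filled with the numbers
# extracted from the problem text, in order of appearance.
TEMPLATES = {
    "sum":  lambda n1, n2: n1 + n2,
    "diff": lambda n1, n2: n1 - n2,
}
CUES = {"altogether": "sum", "in total": "sum", "left": "diff"}

def solve(problem):
    """Toy two-stage solver: retrieve a template from lexical cues,
    align numbers to its slots, then evaluate."""
    nums = [float(x) for x in re.findall(r"\d+(?:\.\d+)?", problem)]
    text = problem.lower()
    for cue, name in CUES.items():
        if cue in text and len(nums) >= 2:
            return TEMPLATES[name](nums[0], nums[1])
    return None  # no template retrieved

print(solve("Tom has 3 apples and Ann has 5. How many altogether?"))  # 8.0
```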
Deep Neural Solver for Math Word Problems
Title | Deep Neural Solver for Math Word Problems |
Authors | Yan Wang, Xiaojiang Liu, Shuming Shi |
Abstract | This paper presents a deep neural solver to automatically solve math word problems. In contrast to previous statistical learning approaches, we directly translate math word problems to equation templates using a recurrent neural network (RNN) model, without sophisticated feature engineering. We further design a hybrid model that combines the RNN model and a similarity-based retrieval model to achieve additional performance improvement. Experiments conducted on a large dataset show that the RNN model and the hybrid model significantly outperform state-of-the-art statistical learning methods for math word problem solving. |
Tasks | Feature Engineering, Machine Translation, Math Word Problem Solving, Semantic Parsing |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1088/ |
PWC | https://paperswithcode.com/paper/deep-neural-solver-for-math-word-problems |
Repo | |
Framework | |
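The hybrid design above combines the RNN with a similarity-based retrieval model. A minimal sketch of just that retrieval component (the corpus, threshold, and string-overlap similarity are invented stand-ins; the paper's retrieval model is more involved):

```python
from difflib import SequenceMatcher

# Tiny stand-in corpus of already-solved problems with their templates.
SOLVED = [
    ("A pen costs 2 dollars. How much do 5 pens cost?", "n1 * n2"),
    ("Lily had 9 candies and ate 4. How many are left?", "n1 - n2"),
]

def retrieve_template(problem, threshold=0.4):
    """Return the template of the most similar solved problem, or None
    if nothing clears the threshold (the RNN's output is used instead)."""
    best, best_sim = None, threshold
    for text, template in SOLVED:
        sim = SequenceMatcher(None, problem.lower(), text.lower()).ratio()
        if sim > best_sim:
            best, best_sim = template, sim
    return best

print(retrieve_template("A book costs 3 dollars. How much do 7 books cost?"))
# -> 'n1 * n2'
```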
The BreakingNews Dataset
Title | The BreakingNews Dataset |
Authors | Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk |
Abstract | We present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (e.g., GPS coordinates and popularity metrics). The tenuous connection between images and text in news data makes it a suitable testbed for taking work at the intersection of Computer Vision and Natural Language Processing to the next step, and we hope this dataset will help spur progress in the field. |
Tasks | Image Captioning, Sentiment Analysis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-2005/ |
PWC | https://paperswithcode.com/paper/the-breakingnews-dataset |
Repo | |
Framework | |