Paper Group ANR 756
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
Title | Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task |
Authors | Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield |
Abstract | Previously, neural methods in grammatical error correction (GEC) did not reach state-of-the-art results compared to phrase-based statistical machine translation (SMT) baselines. We demonstrate parallels between neural GEC and low-resource neural MT and successfully adapt several methods from low-resource MT to neural GEC. We further establish guidelines for trustable results in neural GEC and propose a set of model-independent methods for neural GEC that can be easily applied in most GEC settings. Proposed methods include adding source-side noise, domain-adaptation techniques, a GEC-specific training-objective, transfer learning with monolingual data, and ensembling of independently trained GEC models and language models. The combined effects of these methods result in better than state-of-the-art neural GEC models that outperform previously best neural GEC systems by more than 10% M$^2$ on the CoNLL-2014 benchmark and 5.9% on the JFLEG test set. Non-neural state-of-the-art systems are outperformed by more than 2% on the CoNLL-2014 benchmark and by 4% on JFLEG. |
Tasks | Domain Adaptation, Grammatical Error Correction, Machine Translation, Transfer Learning |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05940v1 |
http://arxiv.org/pdf/1804.05940v1.pdf | |
PWC | https://paperswithcode.com/paper/approaching-neural-grammatical-error |
Repo | |
Framework | |
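The abstract above lists "adding source-side noise" among the proposed methods but does not say what form the noise takes. As a minimal sketch, assuming the noise is independent random dropout of source tokens (the function name `drop_source_tokens` and the rate `p_drop` are hypothetical, not taken from the paper):

```python
import random


def drop_source_tokens(tokens, p_drop=0.1, rng=None):
    """Return a copy of `tokens` with each token dropped with probability p_drop.

    Assumption: token dropout is only one possible kind of source-side noise;
    the abstract does not state which kind the authors use.
    """
    rng = rng or random.Random()
    kept = [tok for tok in tokens if rng.random() >= p_drop]
    # Never hand the model an empty source sentence: fall back to the input.
    return kept if kept else list(tokens)


source = "She go to school yesterday".split()
noisy = drop_source_tokens(source, p_drop=0.2, rng=random.Random(0))
```

Only the source side is perturbed here; the target (corrected) sentence would be left untouched, so the model sees slightly different inputs for the same reference.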
Fast Distributed Deep Learning via Worker-adaptive Batch Sizing
Title | Fast Distributed Deep Learning via Worker-adaptive Batch Sizing |
Authors | Chen Chen, Qizhen Weng, Wei Wang, Baochun Li, Bo Li |
Abstract | Deep neural network models are usually trained in cluster environments, where the model parameters are iteratively refined by multiple worker machines in parallel. One key challenge in this regard is the presence of stragglers, which significantly degrades the learning performance. In this paper, we propose to eliminate stragglers by adapting each worker’s training load to its processing capability; that is, slower workers receive a smaller batch of data to process. Following this idea, we develop a new synchronization scheme called LB-BSP (Load-balanced BSP). It works by coordinately setting the batch size of each worker so that they can finish batch processing at around the same time. A prerequisite for deciding the workers’ batch sizes is to know their processing speeds before each iteration starts. For the best prediction accuracy, we adopt NARX, an extended recurrent neural network that accounts for both the historical speeds and the driving factors such as CPU and memory in prediction. We have implemented LB-BSP for both TensorFlow and MXNet. EC2 experiments against popular benchmarks show that LB-BSP can effectively accelerate the training of deep models, with up to 2x speedup. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02508v1 |
http://arxiv.org/pdf/1806.02508v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-distributed-deep-learning-via-worker |
Repo | |
Framework | |
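The core idea in the abstract above, giving slower workers smaller batches so that all workers finish at about the same time, can be sketched as a proportional split. This is an illustration only: it assumes the per-worker speeds (samples per second) are already known, whereas the paper predicts them with NARX, and it assumes a fixed global batch size. The function name `balanced_batch_sizes` is hypothetical.

```python
def balanced_batch_sizes(speeds, global_batch):
    """Split `global_batch` samples across workers in proportion to their speeds.

    Assumption: `speeds` holds predicted samples/second per worker. The speed
    predictor itself (NARX in the paper) is not reproduced here.
    """
    total = sum(speeds)
    raw = [global_batch * s / total for s in speeds]
    sizes = [int(r) for r in raw]
    # Hand out the samples lost to truncation, largest fractional part first.
    leftover = global_batch - sum(sizes)
    order = sorted(range(len(speeds)), key=lambda i: raw[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


speeds = [100.0, 50.0, 50.0]          # one fast worker, two slower ones
sizes = balanced_batch_sizes(speeds, 256)
times = [b / s for b, s in zip(sizes, speeds)]  # per-worker batch time
```

With these inputs the fast worker receives half of the global batch and every worker needs the same time per iteration, which is the straggler-free situation the abstract describes.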
Sentence Boundary Detection for French with Subword-Level Information Vectors and Convolutional Neural Networks
Title | Sentence Boundary Detection for French with Subword-Level Information Vectors and Convolutional Neural Networks |
Authors | Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno |
Abstract | In this work we tackle the problem of sentence boundary detection applied to French as a binary classification task (“sentence boundary” or “not sentence boundary”). We combine convolutional neural networks with subword-level information vectors, which are word embedding representations learned from Wikipedia that take advantage of the words' morphology, so each word is represented as a bag of its character n-grams. We use a large written dataset (French Gigaword) instead of standard-size transcriptions to train and evaluate the proposed architectures, with the intention of later using the trained models on real-life ASR transcriptions. Three different architectures are tested, showing similar results; overall accuracy for all models exceeds 0.96. All three models have good F1 scores, reaching values over 0.97 for the “not sentence boundary” class. However, the “sentence boundary” class shows lower scores, with F1 decreasing to 0.778 for one of the models. Using subword-level information vectors seems to be very effective, leading us to conclude that the morphology of words encoded in the embedding representations behaves like pixels in an image, making the use of convolutional neural network architectures feasible. |
Tasks | Boundary Detection |
Published | 2018-02-13 |
URL | http://arxiv.org/abs/1802.04559v1 |
http://arxiv.org/pdf/1802.04559v1.pdf | |
PWC | https://paperswithcode.com/paper/sentence-boundary-detection-for-french-with |
Repo | |
Framework | |
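The abstract above represents each word as a bag of its character n-grams. A minimal sketch of that decomposition follows; the boundary markers `<` and `>` and the n-gram range are assumptions borrowed from common subword-embedding practice, not details stated in the abstract, and `char_ngrams` is a hypothetical helper name.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the set of character n-grams of `word`, n_min <= n <= n_max.

    Assumption: the word is wrapped in '<' and '>' so that prefixes and
    suffixes yield distinct n-grams; the abstract does not specify this.
    """
    marked = f"<{word}>"
    grams = set()
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.add(marked[i:i + n])
    return grams


grams = char_ngrams("chat", n_min=3, n_max=4)
```

In a subword embedding model, the word vector would then typically be built from the vectors of these n-grams, which is how morphology ends up encoded in the representation.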
Piecewise Flat Embedding for Image Segmentation
Title | Piecewise Flat Embedding for Image Segmentation |
Authors | Chaowei Fang, Zicheng Liao, Yizhou Yu |
Abstract | We introduce a new multi-dimensional nonlinear embedding – Piecewise Flat Embedding (PFE) – for image segmentation. Based on the theory of sparse signal recovery, piecewise flat embedding with diverse channels attempts to recover a piecewise constant image representation with sparse region boundaries and sparse cluster value scattering. The resultant piecewise flat embedding exhibits interesting properties such as suppressing slowly varying signals, and offers an image representation with higher region identifiability which is desirable for image segmentation or high-level semantic analysis tasks. We formulate our embedding as a variant of the Laplacian Eigenmap embedding with an $L_{1,p} (0<p\leq1)$ regularization term to promote sparse solutions. First, we devise a two-stage numerical algorithm based on Bregman iterations to compute $L_{1,1}$-regularized piecewise flat embeddings. We further generalize this algorithm through iterative reweighting to solve the general $L_{1,p}$-regularized problem. To demonstrate its efficacy, we integrate PFE into two existing image segmentation frameworks, segmentation based on clustering and hierarchical segmentation based on contour detection. Experiments on four major benchmark datasets, BSDS500, MSRC, Stanford Background Dataset, and PASCAL Context, show that segmentation algorithms incorporating our embedding achieve significantly improved results. |
Tasks | Contour Detection, Semantic Segmentation |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.03248v5 |
http://arxiv.org/pdf/1802.03248v5.pdf | |
PWC | https://paperswithcode.com/paper/piecewise-flat-embedding-for-image |
Repo | |
Framework | |
Robust Bayesian Cluster Enumeration
Title | Robust Bayesian Cluster Enumeration |
Authors | Freweyni K. Teklehaymanot, Michael Muma, Abdelhak M. Zoubir |
Abstract | A major challenge in cluster analysis is that the number of data clusters is mostly unknown and it must be estimated prior to clustering the observed data. In real-world applications, the observed data is often subject to heavy tailed noise and outliers which obscure the true underlying structure of the data. Consequently, estimating the number of clusters becomes challenging. To this end, we derive a robust cluster enumeration criterion by formulating the problem of estimating the number of clusters as maximization of the posterior probability of multivariate $t_\nu$ candidate models. We utilize Bayes’ theorem and asymptotic approximations to come up with a robust criterion that possesses a closed-form expression. Further, we refine the derivation and provide a robust cluster enumeration criterion for the finite sample regime. The robust criteria require an estimate of cluster parameters for each candidate model as an input. Hence, we propose a two-step cluster enumeration algorithm that uses the expectation maximization algorithm to partition the data and estimate cluster parameters prior to the calculation of one of the robust criteria. The performance of the proposed algorithm is tested and compared to existing cluster enumeration methods using numerical and real data experiments. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12337v1 |
http://arxiv.org/pdf/1811.12337v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-bayesian-cluster-enumeration |
Repo | |
Framework | |
Evaluating Word Embeddings in Multi-label Classification Using Fine-grained Name Typing
Title | Evaluating Word Embeddings in Multi-label Classification Using Fine-grained Name Typing |
Authors | Yadollah Yaghoobzadeh, Katharina Kann, Hinrich Schütze |
Abstract | Embedding models typically associate each word with a single real-valued vector, representing its different properties. Evaluation methods, therefore, need to analyze the accuracy and completeness of these properties in embeddings. This requires fine-grained analysis of embedding subspaces. Multi-label classification is an appropriate way to do so. We propose a new evaluation method for word embeddings based on multi-label classification given a word embedding. The task we use is fine-grained name typing: given a large corpus, find all types that a name can refer to based on the name embedding. Given the scale of entities in knowledge bases, we can build datasets for this task that are complementary to the current embedding evaluation datasets in that they are very large, contain fine-grained classes, and allow the direct evaluation of embeddings without confounding factors like sentence context. |
Tasks | Multi-Label Classification, Word Embeddings |
Published | 2018-07-18 |
URL | http://arxiv.org/abs/1807.07186v1 |
http://arxiv.org/pdf/1807.07186v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-word-embeddings-in-multi-label |
Repo | |
Framework | |
Residual Networks: Lyapunov Stability and Convex Decomposition
Title | Residual Networks: Lyapunov Stability and Convex Decomposition |
Authors | Kamil Nar, Shankar Sastry |
Abstract | While training error of most deep neural networks degrades as the depth of the network increases, residual networks appear to be an exception. We show that the main reason for this is the Lyapunov stability of the gradient descent algorithm: for an arbitrarily chosen step size, the equilibria of the gradient descent are most likely to remain stable for the parametrization of residual networks. We then present an architecture with a pair of residual networks to approximate a large class of functions by decomposing them into a convex and a concave part. Some parameters of this model are shown to change little during training, and this imperfect optimization prevents overfitting the data and leads to solutions with small Lipschitz constants, while providing clues about the generalization of other deep networks. |
Tasks | |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08203v1 |
http://arxiv.org/pdf/1803.08203v1.pdf | |
PWC | https://paperswithcode.com/paper/residual-networks-lyapunov-stability-and |
Repo | |
Framework | |
Experimentally detecting a quantum change point via Bayesian inference
Title | Experimentally detecting a quantum change point via Bayesian inference |
Authors | Shang Yu, Chang-Jiang Huang, Jian-Shun Tang, Zhih-Ahn Jia, Yi-Tao Wang, Zhi-Jin Ke, Wei Liu, Xiao Liu, Zong-Quan Zhou, Ze-Di Cheng, Jin-Shi Xu, Yu-Chun Wu, Yuan-Yuan Zhao, Guo-Yong Xiang, Chuan-Feng Li, Guang-Can Guo, Gael Sentís, Ramon Muñoz-Tapia |
Abstract | Detecting a change point is a crucial task in statistics that has been recently extended to the quantum realm. A source state generator that emits a series of single photons in a default state suffers an alteration at some point and starts to emit photons in a mutated state. The problem consists in identifying the point where the change took place. In this work, we consider a learning agent that applies Bayesian inference on experimental data to solve this problem. This learning machine adjusts the measurement over each photon according to the past experimental results and finds the change position in an online fashion. Our results show that the local-detection success probability can be largely improved by using such a machine learning technique. This protocol provides a tool for improvement in many applications where a sequence of identical quantum states is required. |
Tasks | Bayesian Inference |
Published | 2018-01-23 |
URL | http://arxiv.org/abs/1801.07508v1 |
http://arxiv.org/pdf/1801.07508v1.pdf | |
PWC | https://paperswithcode.com/paper/experimentally-detecting-a-quantum-change |
Repo | |
Framework | |
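To make the Bayesian step in the abstract above concrete, here is a purely classical analogue: a posterior over the change position given binary measurement outcomes. It is not the quantum protocol from the paper. The adaptive choice of measurement for each photon is not modeled, and the outcome probabilities `p_before` and `p_after`, the uniform prior, and the function name are all assumptions made for illustration.

```python
def change_point_posterior(outcomes, p_before, p_after):
    """Posterior over the index k at which the source changes (uniform prior).

    outcomes[i] is 0 or 1. Items with index < k have P(1) = p_before and
    items with index >= k have P(1) = p_after. k == len(outcomes) means
    that no change occurred within the observed sequence.
    """
    n = len(outcomes)
    likelihoods = []
    for k in range(n + 1):
        like = 1.0
        for i, x in enumerate(outcomes):
            p = p_before if i < k else p_after
            like *= p if x == 1 else 1.0 - p
        likelihoods.append(like)
    total = sum(likelihoods)
    return [like / total for like in likelihoods]


outcomes = [0, 0, 0, 0, 1, 1, 1, 1]
posterior = change_point_posterior(outcomes, p_before=0.1, p_after=0.9)
best_k = max(range(len(posterior)), key=posterior.__getitem__)
```

An online variant would recompute this posterior after every outcome and, in the quantum setting described in the abstract, use it to choose the next measurement.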
Online Mutual Foreground Segmentation for Multispectral Stereo Videos
Title | Online Mutual Foreground Segmentation for Multispectral Stereo Videos |
Authors | Pierre-Luc St-Charles, Guillaume-Alexandre Bilodeau, Robert Bergevin |
Abstract | The segmentation of video sequences into foreground and background regions is a low-level process commonly used in video content analysis and smart surveillance applications. Using a multispectral camera setup can improve this process by providing more diverse data to help identify objects despite adverse imaging conditions. The registration of several data sources is however not trivial if the appearance of objects produced by each sensor differs substantially. This problem is further complicated when parallax effects cannot be ignored when using close-range stereo pairs. In this work, we present a new method to simultaneously tackle multispectral segmentation and stereo registration. Using an iterative procedure, we estimate the labeling result for one problem using the provisional result of the other. Our approach is based on the alternating minimization of two energy functions that are linked through the use of dynamic priors. We rely on the integration of shape and appearance cues to find proper multispectral correspondences, and to properly segment objects in low contrast regions. We also formulate our model as a frame processing pipeline using higher order terms to improve the temporal coherence of our results. Our method is evaluated under different configurations on multiple multispectral datasets, and our implementation is available online. |
Tasks | |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02851v2 |
http://arxiv.org/pdf/1809.02851v2.pdf | |
PWC | https://paperswithcode.com/paper/online-mutual-foreground-segmentation-for |
Repo | |
Framework | |
Predicting computational reproducibility of data analysis pipelines in large population studies using collaborative filtering
Title | Predicting computational reproducibility of data analysis pipelines in large population studies using collaborative filtering |
Authors | Soudabeh Barghi, Lalet Scaria, Ali Salari, Tristan Glatard |
Abstract | Evaluating the computational reproducibility of data analysis pipelines has become a critical issue. It is, however, a cumbersome process for analyses that involve data from large populations of subjects, due to their computational and storage requirements. We present a method to predict the computational reproducibility of data analysis pipelines in large population studies. We formulate the problem as a collaborative filtering process, with constraints on the construction of the training set. We propose 6 different strategies to build the training set, which we evaluate on 2 datasets: a synthetic one modeling a population with a growing number of subject types, and a real one obtained with neuroinformatics pipelines. Results show that one sampling method, “Random File Numbers (Uniform)”, is able to predict computational reproducibility with good accuracy. We also analyze the relevance of including file and subject biases in the collaborative filtering model. We conclude that the proposed method is able to speed up reproducibility evaluations substantially, with a reduced accuracy loss. |
Tasks | |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.10139v1 |
http://arxiv.org/pdf/1809.10139v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-computational-reproducibility-of |
Repo | |
Framework | |
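The abstract above mentions a collaborative filtering model with file and subject biases. A minimal sketch of such a model is matrix factorization with bias terms trained by stochastic gradient descent. Everything concrete below is an assumption made for illustration: the toy data, the hyperparameters, the squared-error loss, and the name `train_biased_mf`. The paper's sampling strategies for building the training set are not reproduced.

```python
import random


def train_biased_mf(entries, n_rows, n_cols, k=2, lr=0.05, reg=0.02,
                    epochs=200, seed=0):
    """Fit r[u][i] ~ mu + b_row[u] + b_col[i] + P[u] . Q[i] by SGD.

    entries: list of (row, col, value) observations, e.g. (subject, file, 0/1).
    Returns a predict(row, col) function.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in entries) / len(entries)
    b_row = [0.0] * n_rows
    b_col = [0.0] * n_cols
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_cols)]

    def predict(u, i):
        return mu + b_row[u] + b_col[i] + sum(a * b for a, b in zip(P[u], Q[i]))

    for _ in range(epochs):
        for u, i, r in entries:
            err = r - predict(u, i)
            # Regularized gradient steps on biases, then on latent factors.
            b_row[u] += lr * (err - reg * b_row[u])
            b_col[i] += lr * (err - reg * b_col[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return predict


# Toy data: 1 = file reproduced for that subject, 0 = it differed.
observed = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 2, 1), (2, 0, 0), (2, 1, 0)]
predict = train_biased_mf(observed, n_rows=3, n_cols=3)
train_mse = sum((r - predict(u, i)) ** 2 for u, i, r in observed) / len(observed)
```

Unobserved (subject, file) pairs can then be scored with `predict(u, i)`, which is the sense in which reproducibility is predicted rather than measured.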
Dihedral angle prediction using generative adversarial networks
Title | Dihedral angle prediction using generative adversarial networks |
Authors | Hyeongki Kim |
Abstract | Several dihedral angle prediction methods have been developed for protein structure prediction and other applications. However, the distribution of predicted angles may not be similar to that of real angles. To address this, we employed generative adversarial networks (GANs). Generative adversarial networks are composed of two adversarially trained networks: a discriminator and a generator. The discriminator distinguishes samples from a dataset from generated samples, while the generator generates realistic samples. Although the discriminator of a GAN is trained to estimate density, the GAN model is intractable. On the other hand, noise-contrastive estimation (NCE) was introduced to estimate the normalization constant of an unnormalized statistical model and thus the density function. In this thesis, we introduce noise-contrastive estimation generative adversarial networks (NCE-GAN), which enable explicit density estimation of a GAN model, and we propose a new loss for the generator. We also propose residue-wise variants of auxiliary classifier GAN (AC-GAN) and Semi-supervised GAN to handle sequence information in a window. In our experiments, the conditional generative adversarial network (C-GAN), AC-GAN and Semi-supervised GAN were compared, and experiments under improved conditions were also investigated. We identified a phenomenon of AC-GAN in which the distribution of its predicted angles is composed of unusual clusters. The distribution of the predicted angles of Semi-supervised GAN was most similar to the Ramachandran plot. We found that adding the output of the NCE as an additional input of the discriminator helps to stabilize the training of the GANs and to capture the detailed structures. Adding a regression loss and using angles predicted by a regression-loss-only model could improve the conditional generation performance of the C-GAN and AC-GAN. |
Tasks | Density Estimation |
Published | 2018-03-29 |
URL | http://arxiv.org/abs/1803.10996v1 |
http://arxiv.org/pdf/1803.10996v1.pdf | |
PWC | https://paperswithcode.com/paper/dihedral-angle-prediction-using-generative |
Repo | |
Framework | |
The Medico-Task 2018: Disease Detection in the Gastrointestinal Tract using Global Features and Deep Learning
Title | The Medico-Task 2018: Disease Detection in the Gastrointestinal Tract using Global Features and Deep Learning |
Authors | Vajira Thambawita, Debesh Jha, Michael Riegler, Pål Halvorsen, Hugo Lewi Hammer, Håvard D. Johansen, Dag Johansen |
Abstract | In this paper, we present our approach for the 2018 Medico Task classifying diseases in the gastrointestinal tract. We have proposed a system based on global features and deep neural networks. The best approach combines two neural networks, and the reproducible experimental results signify the efficiency of the proposed model with an accuracy rate of 95.80%, a precision of 95.87%, and an F1-score of 95.80%. |
Tasks | |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1810.13278v1 |
http://arxiv.org/pdf/1810.13278v1.pdf | |
PWC | https://paperswithcode.com/paper/the-medico-task-2018-disease-detection-in-the |
Repo | |
Framework | |
Pareto Optimization for Subset Selection with Dynamic Cost Constraints
Title | Pareto Optimization for Subset Selection with Dynamic Cost Constraints |
Authors | Vahid Roostapour, Aneta Neumann, Frank Neumann, Tobias Friedrich |
Abstract | In this paper, we consider the subset selection problem for a function $f$ with a constraint bound $B$ which changes over time. We point out that adaptive variants of greedy approaches commonly used in the area of submodular optimization are not able to maintain their approximation quality. Investigating the recently introduced POMC Pareto optimization approach, we show that this algorithm efficiently computes a $\phi= (\alpha_f/2)(1-\frac{1}{e^{\alpha_f}})$-approximation, where $\alpha_f$ is the submodularity ratio of $f$, for each possible constraint bound $b \leq B$. Furthermore, we show that POMC is able to adapt its set of solutions quickly in the case that $B$ increases. Our experimental investigations for influence maximization in social networks show the advantage of POMC over generalized greedy algorithms. |
Tasks | |
Published | 2018-11-14 |
URL | http://arxiv.org/abs/1811.07806v1 |
http://arxiv.org/pdf/1811.07806v1.pdf | |
PWC | https://paperswithcode.com/paper/pareto-optimization-for-subset-selection-with |
Repo | |
Framework | |
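The approximation guarantee quoted in the abstract above, $\phi = (\alpha_f/2)(1 - e^{-\alpha_f})$, is easy to evaluate numerically. The short sketch below does only that; the function name is hypothetical, and the case $\alpha_f = 1$ is used as the example because the submodularity ratio equals 1 for monotone submodular functions.

```python
import math


def pomc_approximation_ratio(alpha_f):
    """Evaluate phi = (alpha_f / 2) * (1 - exp(-alpha_f)) from the abstract."""
    return (alpha_f / 2.0) * (1.0 - math.exp(-alpha_f))


ratio_submodular = pomc_approximation_ratio(1.0)   # alpha_f = 1
ratio_half = pomc_approximation_ratio(0.5)         # a weaker submodularity ratio
```

For $\alpha_f = 1$ the guarantee is $(1/2)(1 - 1/e)$, roughly 0.316, and it shrinks as the submodularity ratio decreases.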
ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities
Title | ADBSCAN: Adaptive Density-Based Spatial Clustering of Applications with Noise for Identifying Clusters with Varying Densities |
Authors | Mohammad Mahmudur Rahman Khan, Md. Abu Bakr Siddique, Rezoana Bente Arif, Mahjabin Rahman Oishe |
Abstract | Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm that performs well on datasets where clusters have a constant density of data points. One of the significant attributes of this algorithm is noise cancellation. However, DBSCAN performs worse on clusters with different densities. Therefore, in this paper, an adaptive DBSCAN is proposed that works well for identifying clusters with varying densities. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06189v3 |
http://arxiv.org/pdf/1809.06189v3.pdf | |
PWC | https://paperswithcode.com/paper/adbscan-adaptive-density-based-spatial |
Repo | |
Framework | |
Nonparametric Risk Assessment and Density Estimation for Persistence Landscapes
Title | Nonparametric Risk Assessment and Density Estimation for Persistence Landscapes |
Authors | Soroush Pakniat, Farzad Eskandari |
Abstract | This paper presents approximate confidence intervals for each function of parameters in a Banach space based on a bootstrap algorithm. We apply a kernel density approach to estimate the persistence landscape. In addition, we evaluate the quality of the distribution function estimator of random variables using the integrated mean square error (IMSE). The results of simulation studies show a significant improvement achieved by our approach compared to the standard version of the confidence interval algorithm. In the next step, we provide several algorithms to solve our model. Finally, real data analysis shows the accuracy of our method compared to that of previous works for computing the confidence interval. |
Tasks | Density Estimation |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03677v1 |
http://arxiv.org/pdf/1803.03677v1.pdf | |
PWC | https://paperswithcode.com/paper/nonparametric-risk-assessment-and-density |
Repo | |
Framework | |
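The abstract above builds confidence intervals with a bootstrap algorithm. As a much simpler illustration of the same resampling idea, the sketch below computes a percentile bootstrap interval for a scalar mean. It is not the paper's procedure, which works with functions in a Banach space and persistence landscapes; the sample values, the percentile method, and the name `bootstrap_mean_ci` are assumptions made for illustration.

```python
import random


def bootstrap_mean_ci(sample, level=0.95, n_boot=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


sample = [2.1, 2.5, 1.9, 2.8, 2.2, 2.4, 2.0, 2.6]
ci_low, ci_high = bootstrap_mean_ci(sample)
sample_mean = sum(sample) / len(sample)
```

For a function-valued statistic, the same loop would be applied to whole functions and the interval replaced by a confidence band, which is closer to the setting of the abstract.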