Paper Group ANR 45
Impact of Coreference Resolution on Slot Filling
Title | Impact of Coreference Resolution on Slot Filling |
Authors | Heike Adel, Hinrich Schütze |
Abstract | In this paper, we demonstrate the importance of coreference resolution for natural language processing on the example of the TAC Slot Filling shared task. We illustrate the strengths and weaknesses of automatic coreference resolution systems and provide experimental results to show that they improve performance in the slot filling end-to-end setting. Finally, we publish KBPchains, a resource containing automatically extracted coreference chains from the TAC source corpus in order to support other researchers working on this topic. |
Tasks | Coreference Resolution, Slot Filling |
Published | 2017-10-26 |
URL | http://arxiv.org/abs/1710.09753v1 |
PDF | http://arxiv.org/pdf/1710.09753v1.pdf |
PWC | https://paperswithcode.com/paper/impact-of-coreference-resolution-on-slot |
Repo | |
Framework | |
Deep Learning for Photoacoustic Tomography from Sparse Data
Title | Deep Learning for Photoacoustic Tomography from Sparse Data |
Authors | Stephan Antholzer, Markus Haltmeier, Johannes Schwab |
Abstract | The development of fast and accurate image reconstruction algorithms is a central aspect of computed tomography. In this paper, we investigate this issue for the sparse data problem in photoacoustic tomography (PAT). We develop a direct and highly efficient reconstruction algorithm based on deep learning. In our approach, image reconstruction is performed with a deep convolutional neural network (CNN), whose weights are adjusted prior to the actual image reconstruction based on a set of training data. The proposed reconstruction approach can be interpreted as a network that uses the PAT filtered backprojection algorithm for the first layer, followed by the U-net architecture for the remaining layers. Actual image reconstruction with deep learning consists of a single evaluation of the trained CNN, which does not require the time-consuming solution of the forward and adjoint problems. At the same time, our numerical results demonstrate that the proposed deep learning approach reconstructs images with a quality comparable to state-of-the-art iterative approaches for PAT from sparse data. |
Tasks | Image Reconstruction |
Published | 2017-04-15 |
URL | http://arxiv.org/abs/1704.04587v3 |
PDF | http://arxiv.org/pdf/1704.04587v3.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-photoacoustic-tomography |
Repo | |
Framework | |
On modeling vagueness and uncertainty in data-to-text systems through fuzzy sets
Title | On modeling vagueness and uncertainty in data-to-text systems through fuzzy sets |
Authors | A. Ramos-Soto, M. Pereira-Fariña |
Abstract | Managing vagueness and uncertainty is among the challenges that remain unresolved in systems that generate texts from non-linguistic data, known as data-to-text systems. In the last decade, work in fuzzy linguistic summarization and description of data has raised interest in using fuzzy sets to model and manage the imprecision of human language in data-to-text systems. However, despite some research in this direction, there has been no clear discussion or justification of how fuzzy sets can contribute to data-to-text for modeling vagueness and uncertainty in words and expressions. This paper intends to bridge this gap by answering the following questions: What does vagueness mean in fuzzy set theory? What does vagueness mean in data-to-text contexts? In what ways can fuzzy set theory contribute to improving data-to-text systems? What are the challenges that researchers from both disciplines need to address for a successful integration of fuzzy sets into data-to-text systems? In what cases should the use of fuzzy sets be avoided in data-to-text (D2T)? To this end, we review and discuss the state of the art of vagueness modeling in natural language generation and data-to-text, describe potential and actual usages of fuzzy sets in data-to-text contexts, and provide some additional insights about the engineering of data-to-text systems that make use of fuzzy set-based techniques. |
Tasks | Text Generation |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10093v1 |
PDF | http://arxiv.org/pdf/1710.10093v1.pdf |
PWC | https://paperswithcode.com/paper/on-modeling-vagueness-and-uncertainty-in-data |
Repo | |
Framework | |
A Convex Parametrization of a New Class of Universal Kernel Functions for use in Kernel Learning
Title | A Convex Parametrization of a New Class of Universal Kernel Functions for use in Kernel Learning |
Authors | Brendon K. Colbert, Matthew M. Peet |
Abstract | We propose a new class of universal kernel functions which admit a linear parametrization using positive semidefinite matrices. These kernels are generalizations of the Sobolev kernel and are defined by piecewise-polynomial functions. The class of kernels is termed “tessellated” as the resulting discriminant is defined piecewise with hyper-rectangular domains whose corners are determined by the training data. The kernels have scalable complexity, but each instance is universal in the sense that its hypothesis space is dense in $L_2$. Using numerical testing, we show that for the soft margin SVM, this class can eliminate the need for Gaussian kernels. Furthermore, we demonstrate that when the ratio of the number of training data to features is high, this method will significantly outperform other kernel learning algorithms. Finally, to reduce the complexity associated with SDP-based kernel learning methods, we use a randomized basis for the positive matrices to integrate with existing multiple kernel learning algorithms such as SimpleMKL. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05477v1 |
PDF | http://arxiv.org/pdf/1711.05477v1.pdf |
PWC | https://paperswithcode.com/paper/a-convex-parametrization-of-a-new-class-of |
Repo | |
Framework | |
Learning Rotation for Kernel Correlation Filter
Title | Learning Rotation for Kernel Correlation Filter |
Authors | Abdullah Hamdi, Bernard Ghanem |
Abstract | Kernel Correlation Filters (KCF) have shown very promising results for visual tracking in terms of speed and accuracy on several benchmarks. However, they suffer from problems that affect performance, such as occlusion, rotation, and scale change. This paper tackles the problem of rotation by reformulating the optimization problem for learning the correlation filter. The modification (RKCF) learns a rotation filter that uses the circulant structure of HOG features to estimate rotation from one frame to the next and enhance the detection of KCF. As a result, it gains a boost in overall accuracy on many of the OTB50 dataset videos with minimal additional computation. |
Tasks | Visual Tracking |
Published | 2017-08-11 |
URL | http://arxiv.org/abs/1708.03698v1 |
PDF | http://arxiv.org/pdf/1708.03698v1.pdf |
PWC | https://paperswithcode.com/paper/learning-rotation-for-kernel-correlation |
Repo | |
Framework | |
CryptoDL: Deep Neural Networks over Encrypted Data
Title | CryptoDL: Deep Neural Networks over Encrypted Data |
Authors | Ehsan Hesamifard, Hassan Takabi, Mehdi Ghasemi |
Abstract | Machine learning algorithms based on deep neural networks have achieved remarkable results and are being extensively used in different domains. However, these algorithms require access to raw data, which is often privacy sensitive. To address this issue, we develop new techniques to provide solutions for running deep neural networks over encrypted data. In this paper, we develop new techniques to adopt deep neural networks within the practical limitations of current homomorphic encryption schemes. More specifically, we focus on classification with the well-known convolutional neural networks (CNNs). First, we design methods for approximating the activation functions commonly used in CNNs (i.e., ReLU, Sigmoid, and Tanh) with low-degree polynomials, which is essential for efficient homomorphic encryption schemes. Then, we train convolutional neural networks with the approximation polynomials instead of the original activation functions and analyze the performance of the models. Finally, we implement convolutional neural networks over encrypted data and measure the performance of the models. Our experimental results validate the soundness of our approach with several convolutional neural networks with varying numbers of layers and structures. When applied to the MNIST optical character recognition tasks, our approach achieves 99.52% accuracy, which significantly outperforms the state-of-the-art solutions and is very close to the accuracy of the best non-private version, 99.77%. Also, it can make close to 164000 predictions per hour. We also applied our approach to CIFAR-10, which is much more complex than MNIST, and were able to achieve 91.5% accuracy with approximation polynomials used as activation functions. These results show that CryptoDL provides efficient, accurate and scalable privacy-preserving predictions. |
Tasks | Optical Character Recognition |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.05189v1 |
PDF | http://arxiv.org/pdf/1711.05189v1.pdf |
PWC | https://paperswithcode.com/paper/cryptodl-deep-neural-networks-over-encrypted |
Repo | |
Framework | |
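The activation-approximation step in the CryptoDL abstract can be illustrated with a minimal sketch. It fits a low-degree polynomial to the sigmoid by ordinary least squares, so that the activation needs only additions and multiplications. The degree (3), the interval [-4, 4], the sampling grid, and all function names are assumptions chosen for illustration; the abstract does not state which fitting method or degree the authors use.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_polynomial(f, degree, lo, hi, n_samples=201):
    """Least-squares polynomial fit of f on [lo, hi] (hypothetical helper).

    Returns coefficients lowest order first: c[0] + c[1]*x + ...
    """
    xs = [lo + (hi - lo) * i / (n_samples - 1) for i in range(n_samples)]
    ys = [f(x) for x in xs]
    m = degree + 1
    # Normal equations A c = b for the monomial basis.
    A = [[sum(x ** (j + k) for x in xs) for k in range(m)] for j in range(m)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= factor * A[col][k]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        tail = sum(A[r][k] * coeffs[k] for k in range(r + 1, m))
        coeffs[r] = (b[r] - tail) / A[r][r]
    return coeffs


def eval_poly(coeffs, x):
    """Horner evaluation: uses only additions and multiplications."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


# Assumed setting: degree-3 fit on [-4, 4].
coeffs = fit_polynomial(sigmoid, degree=3, lo=-4.0, hi=4.0)
max_err = max(
    abs(eval_poly(coeffs, i / 10) - sigmoid(i / 10)) for i in range(-40, 41)
)
print("coefficients:", coeffs)
print("max abs error on [-4, 4]:", max_err)
```

Outside the fitted interval the polynomial diverges from the sigmoid, which is one reason the abstract describes retraining the networks with the polynomial activations rather than swapping them in after training.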
Development & Implementation of the Trigger for a Short-baseline Reactor Antineutrino Experiment (SoLid)
Title | Development & Implementation of the Trigger for a Short-baseline Reactor Antineutrino Experiment (SoLid) |
Authors | Lukas On Arnold |
Abstract | SoLid, located at SCK-CEN in Mol, Belgium, is a reactor antineutrino experiment at a very short baseline of 5.5-10 m, aiming at the search for sterile neutrinos and at a high-precision measurement of the neutrino energy spectrum of Uranium-235. It takes a novel approach, using Lithium-6 sheets and PVT cubes as scintillators for tagging the Inverse Beta-Decay products (neutron and positron). Being located overground and close to the BR2 research reactor, the experiment faces a large amount of background. Efficient real-time background and noise rejection is essential in order to increase the signal-to-background ratio for precise oscillation measurement and to decrease data production to a rate which can be handled by the online software. Therefore, a reliable distinction between the neutrons and background signals is crucial. This can be performed online with a dedicated firmware trigger. A peak counting algorithm and an algorithm measuring time over threshold have been identified as performing well both in terms of efficiency and fake rate, and have been implemented on an FPGA. After introducing the experimental and theoretical background of neutrino oscillation physics, as well as SoLid’s detector technology, read-out system and trigger scheme, the thesis presents the design of the firmware neutron trigger implemented by applying machine learning methods. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01394v1 |
PDF | http://arxiv.org/pdf/1707.01394v1.pdf |
PWC | https://paperswithcode.com/paper/development-implementation-of-the-trigger-for |
Repo | |
Framework | |
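The two trigger quantities named in the SoLid abstract, peak counting and time over threshold, can be sketched in software as follows. This is only an illustration on a list of digitized samples: the function names, the threshold value, and the toy waveforms are assumptions, and the actual firmware logic on the FPGA is not described in the abstract.

```python
def count_peaks(samples, threshold):
    """Count local maxima that lie strictly above the threshold."""
    peaks = 0
    for i in range(1, len(samples) - 1):
        is_local_max = samples[i - 1] < samples[i] >= samples[i + 1]
        if samples[i] > threshold and is_local_max:
            peaks += 1
    return peaks


def time_over_threshold(samples, threshold):
    """Number of samples strictly above the threshold (time in sample units)."""
    return sum(1 for s in samples if s > threshold)


# Hypothetical waveforms: a long, multi-peaked signal and a single short pulse.
long_signal = [0, 1, 5, 2, 6, 3, 7, 2, 4, 1, 0]
short_pulse = [0, 0, 9, 1, 0, 0]
THRESHOLD = 3  # assumed ADC threshold, for illustration only

print(count_peaks(long_signal, THRESHOLD), time_over_threshold(long_signal, THRESHOLD))
print(count_peaks(short_pulse, THRESHOLD), time_over_threshold(short_pulse, THRESHOLD))
```

Both quantities need only a comparison and a counter per sample, which is consistent with the abstract's point that they suit an online firmware trigger. How the experiment cuts on them is not stated in the abstract.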
Improve Lexicon-based Word Embeddings By Word Sense Disambiguation
Title | Improve Lexicon-based Word Embeddings By Word Sense Disambiguation |
Authors | Yuanzhi Ke, Masafumi Hagiwara |
Abstract | There have been some works that learn a lexicon together with the corpus to improve the word embeddings. However, they either model the lexicon separately but update the neural networks for both the corpus and the lexicon by the same likelihood, or minimize the distance between all of the synonym pairs in the lexicon. Such methods do not consider the relatedness and difference of the corpus and the lexicon, and may not be optimal. In this paper, we propose a novel method that considers the relatedness and difference of the corpus and the lexicon. It trains word embeddings by learning from the corpus to predict a word and its corresponding synonym under the context at the same time. For polysemous words, we use a word sense disambiguation filter to eliminate the synonyms that have different meanings for the context. To evaluate the proposed method, we compare the performance of the word embeddings trained by our proposed model, the control groups without the filter or the lexicon, and the prior works on word similarity tasks and a text classification task. The experimental results show that the proposed model provides better embeddings for polysemous words and improves the performance for text classification. |
Tasks | Text Classification, Word Embeddings, Word Sense Disambiguation |
Published | 2017-07-24 |
URL | http://arxiv.org/abs/1707.07628v1 |
PDF | http://arxiv.org/pdf/1707.07628v1.pdf |
PWC | https://paperswithcode.com/paper/improve-lexicon-based-word-embeddings-by-word |
Repo | |
Framework | |
An Improved Training Procedure for Neural Autoregressive Data Completion
Title | An Improved Training Procedure for Neural Autoregressive Data Completion |
Authors | Maxime Voisin, Daniel Ritchie |
Abstract | Neural autoregressive models are explicit density estimators that achieve state-of-the-art likelihoods for generative modeling. The D-dimensional data distribution is factorized into an autoregressive product of one-dimensional conditional distributions according to the chain rule. Data completion is a more involved task than data generation: the model must infer missing variables for any partially observed input vector. Previous work introduced an order-agnostic training procedure for data completion with autoregressive models. Missing variables in any partially observed input vector can be imputed efficiently by choosing an ordering where observed dimensions precede unobserved ones and by computing the autoregressive product in this order. In this paper, we provide evidence that the order-agnostic (OA) training procedure is suboptimal for data completion. We propose an alternative procedure (OA++) that reaches better performance in fewer computations. It can handle all data completion queries while training fewer one-dimensional conditional distributions than the OA procedure. In addition, these one-dimensional conditional distributions are trained proportionally to their expected usage at inference time, reducing overfitting. Finally, our OA++ procedure can exploit prior knowledge about the distribution of inference completion queries, as opposed to OA. We support these claims with quantitative experiments on standard datasets used to evaluate autoregressive generative models. |
Tasks | |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08598v1 |
PDF | http://arxiv.org/pdf/1711.08598v1.pdf |
PWC | https://paperswithcode.com/paper/an-improved-training-procedure-for-neural |
Repo | |
Framework | |
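The factorization and the completion ordering described in the abstract above can be written out explicitly. The notation is introduced here for illustration and is not taken from the paper: $o = (o_1, \dots, o_D)$ is an ordering of the $D$ dimensions, and $k$ is the number of observed dimensions.

For any ordering $o$, the chain rule gives

$$p(x) = \prod_{d=1}^{D} p\left(x_{o_d} \mid x_{o_1}, \dots, x_{o_{d-1}}\right).$$

If the ordering places the $k$ observed dimensions first, the conditional distribution of the missing dimensions is the tail of the same product:

$$p\left(x_{o_{k+1}}, \dots, x_{o_D} \mid x_{o_1}, \dots, x_{o_k}\right) = \prod_{d=k+1}^{D} p\left(x_{o_d} \mid x_{o_1}, \dots, x_{o_{d-1}}\right).$$

Only the last $D - k$ one-dimensional conditionals are evaluated for a given query, which is why the choice of which conditionals to train, and how often, matters for completion.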
Prediction-Constrained Topic Models for Antidepressant Recommendation
Title | Prediction-Constrained Topic Models for Antidepressant Recommendation |
Authors | Michael C. Hughes, Gabriel Hope, Leah Weiner, Thomas H. McCoy, Roy H. Perlis, Erik B. Sudderth, Finale Doshi-Velez |
Abstract | Supervisory signals can help topic models discover low-dimensional data representations that are more interpretable for clinical tasks. We propose a framework for training supervised latent Dirichlet allocation that balances two goals: faithful generative explanations of high-dimensional data and accurate prediction of associated class labels. Existing approaches fail to balance these goals by not properly handling a fundamental asymmetry: the intended task is always predicting labels from data, not data from labels. Our new prediction-constrained objective trains models that predict labels from heldout data well while also producing good generative likelihoods and interpretable topic-word parameters. In a case study on predicting depression medications from electronic health records, we demonstrate improved recommendations compared to previous supervised topic models and high-dimensional logistic regression from words alone. |
Tasks | Topic Models |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00499v1 |
PDF | http://arxiv.org/pdf/1712.00499v1.pdf |
PWC | https://paperswithcode.com/paper/prediction-constrained-topic-models-for |
Repo | |
Framework | |
Cross-label Suppression: A Discriminative and Fast Dictionary Learning with Group Regularization
Title | Cross-label Suppression: A Discriminative and Fast Dictionary Learning with Group Regularization |
Authors | Xiudong Wang, Yuantao Gu |
Abstract | This paper addresses image classification through learning a compact and discriminative dictionary efficiently. Given a structured dictionary with each atom (a column in the dictionary matrix) related to some label, we propose a cross-label suppression constraint to enlarge the difference among representations for different classes. Meanwhile, we introduce group regularization to enforce representations to preserve label properties of original samples, meaning the representations for the same class are encouraged to be similar. With cross-label suppression, we do not resort to the frequently used $\ell_0$-norm or $\ell_1$-norm for coding, and obtain computational efficiency without losing the discriminative power for categorization. Moreover, two simple classification schemes are also developed to take full advantage of the learnt dictionary. Extensive experiments on six data sets including face recognition, object categorization, scene classification, texture recognition and sport action categorization are conducted, and the results show that the proposed approach can outperform many recently presented dictionary algorithms on both recognition accuracy and computational efficiency. |
Tasks | Dictionary Learning, Face Recognition, Image Classification, Scene Classification |
Published | 2017-05-08 |
URL | http://arxiv.org/abs/1705.02928v1 |
PDF | http://arxiv.org/pdf/1705.02928v1.pdf |
PWC | https://paperswithcode.com/paper/cross-label-suppression-a-discriminative-and |
Repo | |
Framework | |
An influence-based fast preceding questionnaire model for elderly assessments
Title | An influence-based fast preceding questionnaire model for elderly assessments |
Authors | Tong Mo, Rong Zhang, Weiping Li, Jingbo Zhang, Zhonghai Wu, Wei Tan |
Abstract | To improve the efficiency of elderly assessments, an influence-based fast preceding questionnaire model (FPQM) is proposed. Compared with traditional assessments, the FPQM optimizes questionnaires by reordering their attributes. The values of low-ranking attributes can be predicted by the values of the high-ranking attributes. Therefore, the number of attributes can be reduced without redesigning the questionnaires. A new function for calculating the influence of the attributes is proposed based on probability theory. Reordering and reducing algorithms are given based on the attributes’ influences. The model is verified through a practical application. The practice in an elderly-care company shows that the FPQM can reduce the number of attributes by 90.56% with a prediction accuracy of 98.39%. Compared with other methods, such as the Expert Knowledge, Rough Set and C4.5 methods, the FPQM achieves the best performance. In addition, the FPQM can also be applied to other questionnaires. |
Tasks | |
Published | 2017-11-22 |
URL | http://arxiv.org/abs/1711.08228v1 |
PDF | http://arxiv.org/pdf/1711.08228v1.pdf |
PWC | https://paperswithcode.com/paper/an-influence-based-fast-preceding |
Repo | |
Framework | |
Active Orthogonal Matching Pursuit for Sparse Subspace Clustering
Title | Active Orthogonal Matching Pursuit for Sparse Subspace Clustering |
Authors | Yanxi Chen, Gen Li, Yuantao Gu |
Abstract | Sparse Subspace Clustering (SSC) is a state-of-the-art method for clustering high-dimensional data points lying in a union of low-dimensional subspaces. However, while $\ell_1$ optimization-based SSC algorithms suffer from high computational complexity, other variants of SSC, such as Orthogonal Matching Pursuit-based SSC (OMP-SSC), lose clustering accuracy in pursuit of improving time efficiency. In this letter, we propose a novel Active OMP-SSC, which improves clustering accuracy of OMP-SSC by adaptively updating data points and randomly dropping data points in the OMP process, while still enjoying the low computational complexity of greedy pursuit algorithms. We provide heuristic analysis of our approach, and explain how these two active steps achieve a better tradeoff between connectivity and separation. Numerical results on both synthetic data and real-world data validate our analyses and show the advantages of the proposed active algorithm. |
Tasks | |
Published | 2017-08-16 |
URL | http://arxiv.org/abs/1708.04764v1 |
PDF | http://arxiv.org/pdf/1708.04764v1.pdf |
PWC | https://paperswithcode.com/paper/active-orthogonal-matching-pursuit-for-sparse |
Repo | |
Framework | |
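For context on the OMP step that OMP-SSC builds on, here is a minimal sketch of plain Orthogonal Matching Pursuit, which expresses one data point as a sparse combination of the other points. It is the standard greedy algorithm, not the Active OMP-SSC variant proposed in the paper (the adaptive updating and random dropping steps are omitted). The helper names and the toy data are assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(A, b):
    """Solve a small square system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def omp(atoms, target, max_atoms, tol=1e-9):
    """Plain OMP. Returns {atom_index: coefficient} for the chosen atoms."""
    support, coeffs = [], []
    residual = target[:]
    for _ in range(max_atoms):
        if dot(residual, residual) <= tol:
            break
        candidates = [j for j in range(len(atoms)) if j not in support]
        if not candidates:
            break
        # Greedy step: pick the atom most correlated with the residual.
        best = max(candidates, key=lambda j: abs(dot(atoms[j], residual)))
        support.append(best)
        # Orthogonal step: refit all chosen atoms by least squares
        # (normal equations on the Gram matrix of the support).
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], target) for i in support]
        coeffs = solve(gram, rhs)
        approx = [
            sum(c * atoms[j][d] for c, j in zip(coeffs, support))
            for d in range(len(target))
        ]
        residual = [t - a for t, a in zip(target, approx)]
    return dict(zip(support, coeffs))


# Toy data: the target lies in the plane spanned by atoms 0 and 1;
# atom 2 belongs to a different subspace and should not be selected.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [0.6, 0.8, 0.0]
representation = omp(atoms, target, max_atoms=2)
print(representation)
```

In subspace clustering, the nonzero coefficients define edges of an affinity graph between points. The abstract's tradeoff between connectivity and separation refers to how many such edges exist within a subspace versus across subspaces.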
3D Morphology Prediction of Progressive Spinal Deformities from Probabilistic Modeling of Discriminant Manifolds
Title | 3D Morphology Prediction of Progressive Spinal Deformities from Probabilistic Modeling of Discriminant Manifolds |
Authors | Samuel Kadoury, William Mandel, Marjolaine Roy-Beaudry, Marie-Lyne Nault, Stefan Parent |
Abstract | We introduce a novel approach for predicting the progression of adolescent idiopathic scoliosis from 3D spine models reconstructed from biplanar X-ray images. Recent progress in machine learning has improved classification and prognosis rates, but existing methods lack a probabilistic framework to measure uncertainty in the data. We propose a discriminative probabilistic manifold embedding where locally linear mappings transform data points from high-dimensional space to corresponding low-dimensional coordinates. A discriminant adjacency matrix is constructed to maximize the separation between progressive and non-progressive groups of patients diagnosed with scoliosis, while minimizing the distance in latent variables belonging to the same class. To predict the evolution of deformation, a baseline reconstruction is projected onto the manifold, from which a spatiotemporal regression model is built from parallel transport curves inferred from neighboring exemplars. The rate of progression is modulated by the spine flexibility and curve magnitude of the 3D spine deformation. The method was tested on 745 reconstructions from 133 subjects using longitudinal 3D reconstructions of the spine, with results demonstrating that the discriminatory framework can distinguish between progressive and non-progressive scoliotic patients with a classification rate of 81% and prediction differences of 2.1$^{\circ}$ in main curve angulation, outperforming other manifold learning methods. Our method achieved a higher prediction accuracy and improved the modeling of spatiotemporal morphological changes in highly deformed spines compared to other learning methods. |
Tasks | |
Published | 2017-01-17 |
URL | http://arxiv.org/abs/1701.04869v2 |
PDF | http://arxiv.org/pdf/1701.04869v2.pdf |
PWC | https://paperswithcode.com/paper/3d-morphology-prediction-of-progressive |
Repo | |
Framework | |
StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
Title | StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection |
Authors | Sanghyun Woo, Soonmin Hwang, In So Kweon |
Abstract | One-stage object detectors such as SSD or YOLO have already shown promising accuracy with a small memory footprint and fast speed. However, it is widely recognized that one-stage detectors have difficulty detecting small objects, while they are competitive with two-stage methods on large objects. In this paper, we investigate how to alleviate this problem starting from the SSD framework. Due to their pyramidal design, the lower layer that is responsible for small objects lacks strong semantics (e.g., contextual information). We address this problem by introducing a feature combining module that spreads out the strong semantics in a top-down manner. Our final model, the StairNet detector, unifies the multi-scale representations and semantic distribution effectively. Experiments on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets demonstrate that StairNet significantly mitigates the weakness of SSD and outperforms the other state-of-the-art one-stage detectors. |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.05788v1 |
PDF | http://arxiv.org/pdf/1709.05788v1.pdf |
PWC | https://paperswithcode.com/paper/stairnet-top-down-semantic-aggregation-for |
Repo | |
Framework | |