Paper Group ANR 164
AttnConvnet at SemEval-2018 Task 1: Attention-based Convolutional Neural Networks for Multi-label Emotion Classification
Title | AttnConvnet at SemEval-2018 Task 1: Attention-based Convolutional Neural Networks for Multi-label Emotion Classification |
Authors | Yanghoon Kim, Hwanhee Lee, Kyomin Jung |
Abstract | In this paper, we propose an attention-based classifier that predicts multiple emotions of a given sentence. Our model imitates the human two-step procedure of sentence understanding and can effectively represent and classify sentences. With emoji-to-meaning preprocessing and extra lexicon utilization, we further improve the model performance. We train and evaluate our model with data provided by SemEval-2018 task 1-5, each sentence of which has several labels among 11 given sentiments. Our model ranks 5th in English and 1st in Spanish. |
Tasks | Emotion Classification |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.00831v2 |
http://arxiv.org/pdf/1804.00831v2.pdf | |
PWC | https://paperswithcode.com/paper/attnconvnet-at-semeval-2018-task-1-attention |
Repo | |
Framework | |
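The multi-label setting in the abstract above (several of 11 emotions per sentence) can be illustrated with a minimal sketch. It assumes the classifier emits one independent probability per emotion and that a fixed 0.5 threshold turns those probabilities into a label set. The label names, the threshold, and the helpers `predict_labels` and `jaccard` are illustrative assumptions, not details taken from the paper. Jaccard similarity is shown as one common way to score a predicted label set against a gold one.

```python
# Illustrative label inventory (assumed; the abstract only says "11 given sentiments").
EMOTIONS = ["anger", "anticipation", "disgust", "fear", "joy", "love",
            "optimism", "pessimism", "sadness", "surprise", "trust"]


def predict_labels(probs, threshold=0.5):
    """Keep every emotion whose probability reaches the (assumed) threshold."""
    return {emotion for emotion, p in zip(EMOTIONS, probs) if p >= threshold}


def jaccard(predicted, gold):
    """Size of the intersection divided by size of the union of two label sets."""
    if not predicted and not gold:
        return 1.0  # two empty sets are treated as a perfect match
    return len(predicted & gold) / len(predicted | gold)


# Hypothetical per-emotion probabilities for one sentence.
probs = [0.8, 0.1, 0.7, 0.2, 0.05, 0.0, 0.1, 0.3, 0.6, 0.1, 0.0]
predicted = predict_labels(probs)
gold = {"anger", "disgust"}
score = jaccard(predicted, gold)
```

Here the sketch predicts three labels, two of which match the gold set, so the score is two thirds.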
Wikibook-Bot - Automatic Generation of a Wikipedia Book
Title | Wikibook-Bot - Automatic Generation of a Wikipedia Book |
Authors | Shahar Admati, Lior Rokach, Bracha Shapira |
Abstract | A Wikipedia book (known as Wikibook) is a collection of Wikipedia articles on a particular theme that is organized as a book. We propose Wikibook-Bot, a machine-learning based technique for automatically generating high quality Wikibooks based on a concept provided by the user. In order to create the Wikibook we apply machine learning algorithms to the different steps of the proposed technique. First, we need to decide whether an article belongs to a specific Wikibook - a classification task. Then, we need to divide the chosen articles into chapters - a clustering task - and finally, we deal with the ordering task which includes two subtasks: order articles within each chapter and order the chapters themselves. We propose a set of structural, text-based and unique Wikipedia features, and we show that by using these features, a machine learning classifier can successfully address the above challenges. The predictive performance of the proposed method is evaluated by comparing the auto-generated books to 407 existing Wikibooks which were manually generated by humans. For all the tasks we were able to obtain high and statistically significant results when comparing the Wikibook-bot books to books that were manually generated by Wikipedia contributors. |
Tasks | |
Published | 2018-12-28 |
URL | http://arxiv.org/abs/1812.10937v1 |
http://arxiv.org/pdf/1812.10937v1.pdf | |
PWC | https://paperswithcode.com/paper/wikibook-bot-automatic-generation-of-a |
Repo | |
Framework | |
Network Learning with Local Propagation
Title | Network Learning with Local Propagation |
Authors | Dimche Kostadinov, Behrooz Razeghi, Sohrab Ferdowsi, Slava Voloshynovskiy |
Abstract | This paper presents locally decoupled network parameter learning with local propagation. Three elements are taken into account: (i) sets of nonlinear transforms that describe the representations at all nodes, (ii) a local objective at each node related to the corresponding local representation goal, and (iii) a local propagation model that relates the nonlinear error vectors at each node with the goal error vectors from the directly connected nodes. The modeling concepts (i), (ii) and (iii) offer several advantages, including (a) a unified learning principle for any network that is represented as a graph, (b) understanding and interpretation of the local and the global learning dynamics, (c) decoupled and parallel parameter learning, (d) a possibility for learning in infinitely long, multi-path and multi-goal networks. Numerical experiments validate the potential of the learning principle. The preliminary results show advantages in comparison to the state-of-the-art methods, w.r.t. the learning time and the network size while having comparable recognition accuracy. |
Tasks | |
Published | 2018-05-20 |
URL | http://arxiv.org/abs/1805.07802v1 |
http://arxiv.org/pdf/1805.07802v1.pdf | |
PWC | https://paperswithcode.com/paper/network-learning-with-local-propagation |
Repo | |
Framework | |
Solving Sinhala Language Arithmetic Problems using Neural Networks
Title | Solving Sinhala Language Arithmetic Problems using Neural Networks |
Authors | W. M. T Chathurika, K. C. E De Silva, A. M. Raddella, E. M. R. S. Ekanayake, A. Nugaliyadde, Y. Mallawarachchi |
Abstract | A methodology is presented to solve arithmetic problems in the Sinhala language using a neural network. The system comprises (a) keyword identification, (b) question identification, and (c) mathematical operation identification, which are combined using a neural network. Naive Bayes classification is used to identify keywords, and a Conditional Random Field is used to identify the question and the operation which should be performed on the identified keywords to achieve the expected result. “One vs. all Classification” is done using a neural network for sentences. All functions are combined through the neural network which builds an equation to solve the problem. The paper compares each methodology in ARIS and Mahoshadha to the method presented in the paper. Mahoshadha2 learns to solve arithmetic problems with an accuracy of 76%. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.04557v1 |
http://arxiv.org/pdf/1809.04557v1.pdf | |
PWC | https://paperswithcode.com/paper/solving-sinhala-language-arithmetic-problems |
Repo | |
Framework | |
Probabilistic Sparse Subspace Clustering Using Delayed Association
Title | Probabilistic Sparse Subspace Clustering Using Delayed Association |
Authors | Maryam Jaberi, Marianna Pensky, Hassan Foroosh |
Abstract | Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into two separate stages of finding the similarity matrix and finding clusters. Similar to some recent works, we integrate these two steps using a joint optimization approach. We make the following contributions: (i) we estimate the reliability of the cluster assignment for each point before assigning a point to a subspace. We group the data points into two groups of “certain” and “uncertain”, with the assignment of the latter group delayed until their subspace association certainty improves. (ii) We demonstrate that delayed association is better suited for clustering subspaces that have ambiguities, i.e. when subspaces intersect or data are contaminated with outliers/noise. (iii) We demonstrate experimentally that such delayed probabilistic association leads to a more accurate self-representation and final clusters. The proposed method has higher accuracy both for points that exclusively lie in one subspace, and those that are on the intersection of subspaces. (iv) We show that delayed association leads to a huge reduction of computational cost, since it allows for incremental spectral clustering. |
Tasks | |
Published | 2018-08-28 |
URL | http://arxiv.org/abs/1808.09574v1 |
http://arxiv.org/pdf/1808.09574v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-sparse-subspace-clustering |
Repo | |
Framework | |
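The "certain" versus "uncertain" grouping in the abstract above can be sketched as follows. This is a minimal illustration that assumes each point already has a vector of subspace-assignment probabilities and that a fixed cut-off decides certainty. The helper `split_by_certainty` and the 0.8 threshold are hypothetical; the abstract does not say how the paper estimates reliability.

```python
def split_by_certainty(assignment_probs, threshold=0.8):
    """Assign points whose best subspace is confident enough; delay the rest.

    assignment_probs: one list of per-subspace probabilities per point.
    Returns (certain, uncertain): a dict point_index -> subspace_index,
    and a list of point indices whose assignment is postponed.
    """
    certain, uncertain = {}, []
    for idx, probs in enumerate(assignment_probs):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            certain[idx] = best      # confident: associate now
        else:
            uncertain.append(idx)    # ambiguous: revisit in a later round
    return certain, uncertain


# Hypothetical probabilities for three points and two subspaces.
probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
certain, uncertain = split_by_certainty(probs)
```

In this toy input the middle point sits near both subspaces, so its association is delayed while the other two are assigned immediately.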
Optimal-margin evolutionary classifier
Title | Optimal-margin evolutionary classifier |
Authors | Mohammad Reza Bonyadi, David C. Reutens |
Abstract | We introduce a novel approach for discriminative classification using evolutionary algorithms. We first propose an algorithm to optimize the total loss value using a modified 0-1 loss function in a one-dimensional space for classification. We then extend this algorithm for multi-dimensional classification using an evolutionary algorithm. The proposed evolutionary algorithm aims to find a hyperplane which best classifies instances while minimizing the classification risk. We test particle swarm optimization, evolutionary strategy, and covariance matrix adaptation evolutionary strategy for optimization purposes. Finally, we compare our results with well-established and state-of-the-art classification algorithms, for both binary and multi-class classification, on 19 benchmark classification problems, with and without noise and outliers. Results show that the performance of the proposed algorithm is significantly (t-test) better than all other methods in almost all problems tested. We also show that the proposed algorithm is significantly more robust against noise and outliers compared to other methods. The running time of the algorithm is within a reasonable range for the solution of real-world classification problems. |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.09891v1 |
http://arxiv.org/pdf/1804.09891v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-margin-evolutionary-classifier |
Repo | |
Framework | |
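The one-dimensional step mentioned in the abstract above can be illustrated with an exhaustive threshold search. The sketch minimizes the plain 0-1 loss (the count of misclassified points), not the paper's modified 0-1 loss, whose definition the abstract does not give. The function name `best_threshold` and the candidate-threshold scheme are assumptions made for illustration.

```python
def best_threshold(xs, ys):
    """Find a 1-D decision threshold minimizing the plain 0-1 loss.

    xs: list of floats; ys: list of labels in {-1, +1}.
    Returns (threshold, sign, errors), where the prediction is
    `sign` for x > threshold and `-sign` otherwise.
    """
    pts = sorted(set(xs))
    # Candidates: one below all points, midpoints between neighbors, one above all.
    candidates = ([pts[0] - 1.0]
                  + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
                  + [pts[-1] + 1.0])
    best = None
    for t in candidates:
        for sign in (1, -1):
            errors = sum(1 for x, y in zip(xs, ys)
                         if (sign if x > t else -sign) != y)
            if best is None or errors < best[2]:
                best = (t, sign, errors)
    return best


# Toy separable data: negatives on the left, positives on the right.
threshold, sign, errors = best_threshold([0.0, 1.0, 2.0, 3.0], [-1, -1, 1, 1])
```

Because the 0-1 loss only changes when the threshold crosses a data point, checking one candidate per gap is enough in one dimension; the abstract's extension to many dimensions replaces this enumeration with an evolutionary search over hyperplanes.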
A Framework for Approval-based Budgeting Methods
Title | A Framework for Approval-based Budgeting Methods |
Authors | Piotr Faliszewski, Nimrod Talmon |
Abstract | We define and study a general framework for approval-based budgeting methods and compare certain methods within this framework by their axiomatic and computational properties. Furthermore, we visualize their behavior on certain Euclidean distributions and analyze them experimentally. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04382v1 |
http://arxiv.org/pdf/1809.04382v1.pdf | |
PWC | https://paperswithcode.com/paper/a-framework-for-approval-based-budgeting |
Repo | |
Framework | |
Eval all, trust a few, do wrong to none: Comparing sentence generation models
Title | Eval all, trust a few, do wrong to none: Comparing sentence generation models |
Authors | Ondřej Cífka, Aliaksei Severyn, Enrique Alfonseca, Katja Filippova |
Abstract | In this paper, we study recent neural generative models for text generation related to variational autoencoders. Previous works have employed various techniques to control the prior distribution of the latent codes in these models, which is important for sampling performance, but little attention has been paid to reconstruction error. In our study, we follow a rigorous evaluation protocol using a large set of previously used and novel automatic and human evaluation metrics, applied to both generated samples and reconstructions. We hope that it will become the new evaluation standard when comparing neural generative models for text. |
Tasks | Text Generation |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07972v2 |
http://arxiv.org/pdf/1804.07972v2.pdf | |
PWC | https://paperswithcode.com/paper/eval-all-trust-a-few-do-wrong-to-none |
Repo | |
Framework | |
Prognosticating Autism Spectrum Disorder Using Artificial Neural Network: Levenberg-Marquardt Algorithm
Title | Prognosticating Autism Spectrum Disorder Using Artificial Neural Network: Levenberg-Marquardt Algorithm |
Authors | Avishek Choudhury, Christopher Greene |
Abstract | Autism spectrum condition (ASC) or autism spectrum disorder (ASD) is primarily identified with the help of behavioral indications encompassing social, sensory and motor characteristics. Although categorized, recurring motor actions are measured during diagnosis, quantifiable measures that ascertain kinematic physiognomies in the movement configurations of autistic persons are not adequately studied, hindering the advances in understanding the etiology of motor mutilation. Subject aspects such as behavioral characteristics that influence ASD need further exploration. Presently, limited autism datasets concomitant with screening ASD are available, and a majority of them are genetic. Hence, in this study, we used a dataset related to autism screening enveloping ten behavioral and ten personal attributes that have been effective in diagnosing ASD cases from controls in behavior science. ASD diagnosis is time exhaustive and uneconomical. The burgeoning ASD cases worldwide mandate a need for a fast and economical screening tool. Our study aimed to implement an artificial neural network with the Levenberg-Marquardt algorithm to detect ASD and examine its predictive accuracy, and subsequently to develop a clinical decision support system for early ASD identification. |
Tasks | |
Published | 2018-12-12 |
URL | https://arxiv.org/abs/1812.07716v2 |
https://arxiv.org/pdf/1812.07716v2.pdf | |
PWC | https://paperswithcode.com/paper/181207716 |
Repo | |
Framework | |
3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning
Title | 3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning |
Authors | Hyeontaek Lim, David G. Andersen, Michael Kaminsky |
Abstract | The performance and efficiency of distributed machine learning (ML) depend significantly on how long it takes for nodes to exchange state changes. Overly-aggressive attempts to reduce communication often sacrifice final model accuracy and necessitate additional ML techniques to compensate for this loss, limiting their generality. Some attempts to reduce communication incur high computation overhead, which makes their performance benefits visible only over slow networks. We present 3LC, a lossy compression scheme for state change traffic that strikes a balance between multiple goals: traffic reduction, accuracy, computation overhead, and generality. It combines three new techniques (3-value quantization with sparsity multiplication, quartic encoding, and zero-run encoding) to leverage strengths of quantization and sparsification techniques and avoid their drawbacks. It achieves a data compression ratio of up to 39–107X, almost the same test accuracy of trained models, and high compression speed. Distributed ML frameworks can employ 3LC without modifications to existing ML algorithms. Our experiments show that 3LC reduces wall-clock training time of ResNet-110–based image classifiers for CIFAR-10 on a 10-GPU cluster by up to 16–23X compared to TensorFlow’s baseline design. |
Tasks | Quantization |
Published | 2018-02-21 |
URL | http://arxiv.org/abs/1802.07389v1 |
http://arxiv.org/pdf/1802.07389v1.pdf | |
PWC | https://paperswithcode.com/paper/3lc-lightweight-and-effective-traffic |
Repo | |
Framework | |
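The abstract above names 3-value quantization and quartic encoding without defining them, so the following is only a sketch under stated assumptions: values are scaled by their maximum magnitude times a sparsity multiplier and rounded to {-1, 0, 1}, and five such values are packed into one byte as a five-digit base-3 number (3^5 = 243 fits in a byte). The function names, the padding rule, and the omission of zero-run encoding are choices made for illustration, not the authors' implementation.

```python
def quantize_3value(values, sparsity_mult=1.0):
    """Map floats to {-1, 0, 1} plus a single scale factor (assumed scheme).

    A multiplier above 1 enlarges the scale, so more values round to 0.
    """
    scale = max(abs(v) for v in values) * sparsity_mult
    if scale == 0.0:
        return [0] * len(values), 0.0
    return [round(v / scale) for v in values], scale


def pack_base3(q):
    """Pack each group of five 3-valued entries into one byte (0..242)."""
    padded = q + [0] * (-len(q) % 5)  # pad with zeros to a multiple of five
    out = bytearray()
    for i in range(0, len(padded), 5):
        byte = 0
        for t in padded[i:i + 5]:
            byte = byte * 3 + (t + 1)  # shift {-1, 0, 1} to digits {0, 1, 2}
        out.append(byte)
    return bytes(out)


def unpack_base3(data, n):
    """Invert pack_base3 and drop the padding, returning n entries."""
    q = []
    for byte in data:
        digits = []
        for _ in range(5):
            digits.append(byte % 3 - 1)
            byte //= 3
        q.extend(reversed(digits))
    return q[:n]


# Hypothetical state-change values for six parameters.
q, scale = quantize_3value([0.9, -0.05, 0.1, -1.0, 0.6, 0.2])
packed = pack_base3(q)
restored = unpack_base3(packed, len(q))
```

Under these assumptions six 32-bit floats shrink to two bytes plus one scale factor. A group of five zeros always packs to the same byte value, which is the kind of repetition a run-length step such as the zero-run encoding named in the abstract could exploit.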
Optimizing the Union of Intersections LASSO ($UoI_{LASSO}$) and Vector Autoregressive ($UoI_{VAR}$) Algorithms for Improved Statistical Estimation at Scale
Title | Optimizing the Union of Intersections LASSO ($UoI_{LASSO}$) and Vector Autoregressive ($UoI_{VAR}$) Algorithms for Improved Statistical Estimation at Scale |
Authors | Mahesh Balasubramanian, Trevor Ruiz, Brandon Cook, Sharmodeep Bhattacharyya, Prabhat, Aviral Shrivastava, Kristofer Bouchard |
Abstract | The analysis of scientific data of increasing size and complexity requires statistical machine learning methods that are both interpretable and predictive. Union of Intersections (UoI), a recently developed framework, is a two-step approach that separates model selection and model estimation. A linear regression algorithm based on UoI, $UoI_{LASSO}$, simultaneously achieves feature selection with low false positives and low false negatives as well as low bias and low variance estimates. Together, these qualities make the results both predictive and interpretable. In this paper, we optimize the $UoI_{LASSO}$ algorithm for single-node execution on NERSC’s Cori Knights Landing, a Xeon Phi based supercomputer. We then scale $UoI_{LASSO}$ to execute on 68 to 278,528 cores across a range of dataset sizes, demonstrating the weak and strong scaling of the implementation. We also implement a variant of $UoI_{LASSO}$, $UoI_{VAR}$ for vector autoregressive models, to analyze high dimensional time-series data. We perform single node optimization and multi-node scaling experiments for $UoI_{VAR}$ to demonstrate the effectiveness of the algorithm for weak and strong scaling. Our implementations enable us to estimate the largest VAR model (1000 nodes) we are aware of, and apply it to large neurophysiology data (192 nodes). |
Tasks | Feature Selection, Model Selection, Time Series |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.06992v1 |
http://arxiv.org/pdf/1808.06992v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-the-union-of-intersections-lasso |
Repo | |
Framework | |
ReHAR: Robust and Efficient Human Activity Recognition
Title | ReHAR: Robust and Efficient Human Activity Recognition |
Authors | Xin Li, Mooi Choo Chuah |
Abstract | Designing a scheme that can achieve a good performance in predicting single person activities and group activities is a challenging task. In this paper, we propose a novel robust and efficient human activity recognition scheme called ReHAR, which can be used to handle single person activities and group activities prediction. First, we generate an optical flow image for each video frame. Then, both video frames and their corresponding optical flow images are fed into a Single Frame Representation Model to generate representations. Finally, an LSTM is used to predict the final activities based on the generated representations. The whole model is trained end-to-end to allow meaningful representations to be generated for the final activity recognition. We evaluate ReHAR using two well-known datasets: the NCAA Basketball Dataset and the UCFSports Action Dataset. The experimental results show that the proposed ReHAR achieves a higher activity recognition accuracy with an order of magnitude shorter computation time compared to the state-of-the-art methods. |
Tasks | Activity Recognition, Human Activity Recognition, Optical Flow Estimation |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09745v1 |
http://arxiv.org/pdf/1802.09745v1.pdf | |
PWC | https://paperswithcode.com/paper/rehar-robust-and-efficient-human-activity |
Repo | |
Framework | |
DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks
Title | DeepPhys: Video-Based Physiological Measurement Using Convolutional Attention Networks |
Authors | Weixuan Chen, Daniel McDuff |
Abstract | Non-contact video-based physiological measurement has many applications in health care and human-computer interaction. Practical applications require measurements to be accurate even in the presence of large head rotations. We propose the first end-to-end system for video-based measurement of heart and breathing rate using a deep convolutional network. The system features a new motion representation based on a skin reflection model and a new attention mechanism using appearance information to guide motion estimation, both of which enable robust measurement under heterogeneous lighting and major motions. Our approach significantly outperforms all current state-of-the-art methods on both RGB and infrared video datasets. Furthermore, it allows spatial-temporal distributions of physiological signals to be visualized via the attention mechanism. |
Tasks | Motion Estimation |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07888v2 |
http://arxiv.org/pdf/1805.07888v2.pdf | |
PWC | https://paperswithcode.com/paper/deepphys-video-based-physiological |
Repo | |
Framework | |
Deep Canonically Correlated LSTMs
Title | Deep Canonically Correlated LSTMs |
Authors | Neil Mallinar, Corbin Rosset |
Abstract | We examine Deep Canonically Correlated LSTMs as a way to learn nonlinear transformations of variable length sequences and embed them into a correlated, fixed dimensional space. We use LSTMs to transform multi-view time-series data non-linearly while learning temporal relationships within the data. We then perform correlation analysis on the outputs of these neural networks to find a correlated subspace through which we get our final representation via projection. This work follows from previous work done on Deep Canonical Correlation (DCCA), in which deep feed-forward neural networks were used to learn nonlinear transformations of data while maximizing correlation. |
Tasks | Time Series |
Published | 2018-01-16 |
URL | http://arxiv.org/abs/1801.05407v1 |
http://arxiv.org/pdf/1801.05407v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-canonically-correlated-lstms |
Repo | |
Framework | |
Entity Tracking Improves Cloze-style Reading Comprehension
Title | Entity Tracking Improves Cloze-style Reading Comprehension |
Authors | Luong Hoang, Sam Wiseman, Alexander M. Rush |
Abstract | Reading comprehension tasks test the ability of models to process long-term context and remember salient information. Recent work has shown that relatively simple neural methods such as the Attention Sum-Reader can perform well on these tasks; however, these systems still significantly trail human performance. Analysis suggests that many of the remaining hard instances are related to the inability to track entity-references throughout documents. This work focuses on these hard entity tracking cases with two extensions: (1) additional entity features, and (2) training with a multi-task tracking objective. We show that these simple modifications improve performance both independently and in combination, and we outperform the previous state of the art on the LAMBADA dataset, particularly on difficult entity examples. |
Tasks | Reading Comprehension |
Published | 2018-10-05 |
URL | http://arxiv.org/abs/1810.02891v1 |
http://arxiv.org/pdf/1810.02891v1.pdf | |
PWC | https://paperswithcode.com/paper/entity-tracking-improves-cloze-style-reading |
Repo | |
Framework | |
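The Attention Sum-Reader mentioned in the abstract above scores a candidate answer by summing the attention it receives at every position where it occurs in the document. The sketch below shows only that aggregation step, assuming the attention weights have already been computed by some encoder; the function name `attention_sum` and the toy inputs are hypothetical, and the paper's entity features and multi-task tracking objective are not modeled.

```python
from collections import defaultdict


def attention_sum(tokens, attention):
    """Aggregate per-position attention into per-token scores.

    tokens: document tokens; attention: one weight per token position.
    Returns (best_token, scores), where scores maps each distinct token
    to the total attention over all of its occurrences.
    """
    scores = defaultdict(float)
    for token, weight in zip(tokens, attention):
        scores[token] += weight
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Toy document where "mary" appears twice with moderate attention each time.
tokens = ["mary", "went", "john", "mary"]
attention = [0.30, 0.10, 0.35, 0.25]
answer, scores = attention_sum(tokens, attention)
```

In this example "john" has the single largest weight, but "mary" wins after summation, which illustrates why repeated mentions of an entity matter for this family of readers.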