Paper Group ANR 407
Learning Kolmogorov Models for Binary Random Variables
Title | Learning Kolmogorov Models for Binary Random Variables |
Authors | Hadi Ghauch, Mikael Skoglund, Hossein Shokri-Ghadikolaei, Carlo Fischione, Ali H. Sayed |
Abstract | We summarize our recent findings, where we proposed a framework for learning a Kolmogorov model for a collection of binary random variables. More specifically, we derive conditions that link outcomes of specific random variables, and extract valuable relations from the data. We also propose an algorithm for computing the model and show its first-order optimality, despite the combinatorial nature of the learning problem. We apply the proposed algorithm to recommendation systems, although it is applicable to other scenarios. We believe that the work is a significant step toward interpretable machine learning. |
Tasks | Interpretable Machine Learning, Recommendation Systems |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02322v1 |
PDF | http://arxiv.org/pdf/1806.02322v1.pdf |
PWC | https://paperswithcode.com/paper/learning-kolmogorov-models-for-binary-random |
Repo | |
Framework | |
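To make the abstract concrete, here is a toy NumPy sketch of the structure we understand the paper to use: each binary variable X_i is tied to a binary indicator vector ψ_i over a finite sample space carrying a probability vector θ, so that P(X_i = 1) = θᵀψ_i, and support inclusion between indicator vectors yields logical implications between outcomes. This is a hedged illustration, not the paper's code; all names and sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 6                                  # dimension of the latent sample space
theta = rng.dirichlet(np.ones(D))      # probability measure on the sample space
psi = rng.integers(0, 2, size=(4, D))  # one binary indicator vector per variable

p = psi @ theta                        # P(X_i = 1) for each variable
print("marginals:", p)

# The kind of relation the abstract's "conditions that link outcomes" suggests:
# if support(psi_i) is contained in support(psi_j), then X_i = 1 implies X_j = 1.
for i in range(len(psi)):
    for j in range(len(psi)):
        if i != j and np.all(psi[i] <= psi[j]):
            print(f"X_{i} = 1 implies X_{j} = 1")
```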
A review of possible effects of cognitive biases on interpretation of rule-based machine learning models
Title | A review of possible effects of cognitive biases on interpretation of rule-based machine learning models |
Authors | Tomáš Kliegr, Štěpán Bahník, Johannes Fürnkranz |
Abstract | While the interpretability of machine learning models is often equated with their mere syntactic comprehensibility, we think that interpretability goes beyond that, and that human interpretability should also be investigated from the point of view of cognitive science. In particular, the goal of this paper is to discuss to what extent cognitive biases may affect human understanding of interpretable machine learning models, in particular of logical rules discovered from data. Twenty cognitive biases are covered, as are possible debiasing techniques that can be adopted by designers of machine learning algorithms and software. Our review transfers results obtained in cognitive psychology to the domain of machine learning, aiming to bridge the current gap between these two areas. It needs to be followed by empirical studies specifically focused on the machine learning domain. |
Tasks | Interpretable Machine Learning |
Published | 2018-04-09 |
URL | https://arxiv.org/abs/1804.02969v4 |
PDF | https://arxiv.org/pdf/1804.02969v4.pdf |
PWC | https://paperswithcode.com/paper/a-review-of-possible-effects-of-cognitive |
Repo | |
Framework | |
Numeral Understanding in Financial Tweets for Fine-grained Crowd-based Forecasting
Title | Numeral Understanding in Financial Tweets for Fine-grained Crowd-based Forecasting |
Authors | Chung-Chi Chen, Hen-Hsen Huang, Yow-Ting Shiue, Hsin-Hsi Chen |
Abstract | Numerals carry much of the information in financial documents and are crucial for financial decision making. They play different roles in financial analysis processes. This paper is aimed at understanding the meanings of numerals in financial tweets for fine-grained crowd-based forecasting. We propose a taxonomy that classifies the numerals in financial tweets into 7 categories, and further extend some of these categories into several subcategories. Neural network-based models with word- and character-level encoders are proposed for 7-way and 17-way classification. We perform a backtest to confirm the effectiveness of the numeric opinions made by the crowd. This work is the first attempt to understand numerals in financial social media data, and we provide the first comparison of the fine-grained opinions of individual investors and analysts based on their forecast prices. The numeral corpus used in our experiments, called FinNum 1.0, is available for research purposes. |
Tasks | Decision Making |
Published | 2018-09-14 |
URL | http://arxiv.org/abs/1809.05356v2 |
PDF | http://arxiv.org/pdf/1809.05356v2.pdf |
PWC | https://paperswithcode.com/paper/numeral-understanding-in-financial-tweets-for |
Repo | |
Framework | |
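The abstract describes neural models with word- and character-level encoders for 7-way (and 17-way) numeral classification. Below is a minimal PyTorch sketch of that shape, a plausible reading rather than the authors' architecture; vocabulary sizes, dimensions, and the choice of GRU encoders are assumptions.

```python
import torch
import torch.nn as nn

class NumeralClassifier(nn.Module):
    """Sketch: word-level encoder over the tweet plus character-level encoder
    over the target numeral, concatenated for 7-way classification."""
    def __init__(self, vocab=10000, chars=100, dim=64, n_classes=7):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, dim)
        self.char_emb = nn.Embedding(chars, dim)
        self.word_enc = nn.GRU(dim, dim, batch_first=True)
        self.char_enc = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(2 * dim, n_classes)

    def forward(self, word_ids, char_ids):
        _, hw = self.word_enc(self.word_emb(word_ids))  # tweet tokens
        _, hc = self.char_enc(self.char_emb(char_ids))  # characters of the numeral
        return self.out(torch.cat([hw[-1], hc[-1]], dim=-1))

model = NumeralClassifier()
logits = model(torch.randint(0, 10000, (2, 20)), torch.randint(0, 100, (2, 8)))
print(logits.shape)  # torch.Size([2, 7])
```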
Who framed Roger Reindeer? De-censorship of Facebook posts by snippet classification
Title | Who framed Roger Reindeer? De-censorship of Facebook posts by snippet classification |
Authors | Fabio Del Vigna, Marinella Petrocchi, Alessandro Tommasi, Cesare Zavattari, Maurizio Tesconi |
Abstract | This paper considers online news censorship, concentrating on the censorship of identities. Obfuscating identities may occur for disparate reasons, from military to judicial ones. In the majority of cases, this happens to protect individuals from being identified and persecuted by hostile people. However, since the collaborative web is characterised by a redundancy of information, it is not unusual for the same fact to be reported by multiple sources, which may not apply the same censorship policies. Also, the proven aptitude of social network users to disclose personal information means that comments on a news item can reveal the data withheld in the item itself. This gives us a means to figure out who the subject of the censored news is. We propose an adaptation of a text analysis approach to unveil censored identities. The approach is tested on a synthesised scenario, which nonetheless resembles a real use case. Leveraging a text analysis based on a context classifier trained over snippets from posts and comments of Facebook pages, we achieve promising results. Despite the quite constrained settings in which we operate, such as considering only snippets of very short length, our system successfully detects the censored name, choosing among 10 different candidate names, in more than 50% of the investigated cases. This outperforms two reference baselines. The findings reported in this paper, besides being supported by a thorough experimental methodology and interesting in their own right, pave the way for further investigation into the insidious issue of censorship on the web. |
Tasks | |
Published | 2018-04-10 |
URL | http://arxiv.org/abs/1804.03433v1 |
PDF | http://arxiv.org/pdf/1804.03433v1.pdf |
PWC | https://paperswithcode.com/paper/who-framed-roger-reindeer-de-censorship-of |
Repo | |
Framework | |
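A hedged sketch of the snippet-classification idea: train a text classifier on short snippets labeled with the identity they refer to, then score a censored post's snippets and aggregate over the candidate names. The toy data, TF-IDF features, and logistic regression below are illustrative stand-ins for the paper's context classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: snippets from posts/comments, each labeled with
# the identity it refers to (the real system is trained on Facebook pages).
snippets = ["the mayor said", "mayor denies charges", "the singer's new album",
            "concert by the singer", "mayor's press office", "album tops charts"]
labels   = ["mayor", "mayor", "singer", "singer", "mayor", "singer"]

vec = TfidfVectorizer(ngram_range=(1, 2))
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(snippets), labels)

# A censored post split into short snippets: aggregate per-snippet scores and
# pick the most likely candidate identity.
censored = ["XXX denies charges", "press office of XXX"]
proba = clf.predict_proba(vec.transform(censored)).sum(axis=0)
print(dict(zip(clf.classes_, proba)))
```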
A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks
Title | A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks |
Authors | Yuzhe Ma, Ran Chen, Wei Li, Fanhua Shang, Wenjian Yu, Minsik Cho, Bei Yu |
Abstract | Deep neural networks (DNNs) have achieved significant success in a variety of real-world applications, e.g., image classification. However, the sheer number of parameters restricts the efficiency of these networks, due to large model size and intensive computation. To address this issue, various approximation techniques have been investigated that seek a lightweight network with little performance degradation in exchange for a smaller model size or faster inference. Both low-rankness and sparsity are appealing properties for network approximation. In this paper we propose a unified framework to compress convolutional neural networks (CNNs) by combining these two properties, while taking the nonlinear activation into consideration. Each layer in the network is approximated by the sum of a structured sparse component and a low-rank component, which is formulated as an optimization problem. Then, an extended version of the alternating direction method of multipliers (ADMM) with guaranteed convergence is presented to solve the relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet and GoogLeNet with large image classification datasets. The results outperform previous work in terms of accuracy degradation, compression rate, and speedup ratio. The proposed method remarkably compresses the model (with up to a 4.9x reduction in parameters) at the cost of little or no loss in accuracy. |
Tasks | Image Classification, Model Compression |
Published | 2018-07-26 |
URL | https://arxiv.org/abs/1807.10119v3 |
PDF | https://arxiv.org/pdf/1807.10119v3.pdf |
PWC | https://paperswithcode.com/paper/a-unified-approximation-framework-for-deep |
Repo | |
Framework | |
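The core decomposition, a layer's weight matrix approximated as a structured-sparse component plus a low-rank component, can be illustrated with a plain alternating heuristic. Note this is not the paper's ADMM formulation (which also accounts for the nonlinear activation and structured sparsity patterns); it is only a sketch of the sparse-plus-low-rank idea on a toy matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))  # a layer's weight matrix (toy)

rank, keep = 8, 0.05               # target rank and kept-entry fraction
S = np.zeros_like(W)
for _ in range(20):
    # Low-rank component: truncated SVD of the residual.
    U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    # Sparse component: keep the largest-magnitude residual entries.
    R = W - L
    thresh = np.quantile(np.abs(R), 1 - keep)
    S = np.where(np.abs(R) >= thresh, R, 0.0)

print("relative error:", np.linalg.norm(W - L - S) / np.linalg.norm(W))
```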
Unsupervised Learning by Competing Hidden Units
Title | Unsupervised Learning by Competing Hidden Units |
Authors | Dmitry Krotov, John Hopfield |
Abstract | It is widely believed that the backpropagation algorithm is essential for learning good feature detectors in early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of that neural network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility, and which is motivated by Hebb’s idea that the change of a synapse’s strength should be local, i.e., it should depend only on the activities of the pre- and post-synaptic neurons. We design a learning algorithm that utilizes global inhibition in the hidden layer, and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can be used to train higher-layer weights in the usual supervised way, so that the performance of the full network is comparable to the performance of standard feedforward networks trained end-to-end with a backpropagation algorithm. |
Tasks | |
Published | 2018-06-26 |
URL | https://arxiv.org/abs/1806.10181v2 |
PDF | https://arxiv.org/pdf/1806.10181v2.pdf |
PWC | https://paperswithcode.com/paper/unsupervised-learning-by-competing-hidden |
Repo | |
Framework | |
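A minimal NumPy sketch of the competition-based local rule, following the update published with the paper in its simplest (p = 2) form: the hidden unit with the strongest input current gets a Hebbian update, the k-th strongest gets an anti-Hebbian one, and all other units stay put. Random vectors stand in for training data here, and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden, n_in = 100, 784
W = 0.01 * rng.standard_normal((n_hidden, n_in))

eps, delta, k = 0.02, 0.4, 2      # learning rate, inhibition strength, rank
for step in range(1000):
    v = rng.random(n_in)          # stand-in for one training example
    I = W @ v                     # input currents to the hidden units
    order = np.argsort(I)
    g = np.zeros(n_hidden)
    g[order[-1]] = 1.0            # strongest unit: Hebbian update
    g[order[-k]] = -delta         # k-th strongest: anti-Hebbian (inhibition)
    # Local plasticity: push the active weight vectors toward v while the
    # decay term keeps their norms bounded.
    dW = g[:, None] * v[None, :] - (g * I)[:, None] * W
    W += eps * dW / np.abs(dW).max()
```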
Sparse Winograd Convolutional neural networks on small-scale systolic arrays
Title | Sparse Winograd Convolutional neural networks on small-scale systolic arrays |
Authors | Feng Shi, Haochen Li, Yuhe Gao, Benjamin Kuschner, Song-Chun Zhu |
Abstract | The reconfigurability, energy efficiency, and massive parallelism of FPGAs make them one of the best choices for implementing efficient deep learning accelerators. However, state-of-the-art implementations seldom consider the balance between the high throughput of the compute fabric and the ability of the memory subsystem to support it. In this paper, we implement an accelerator on an FPGA by combining sparse Winograd convolution, clusters of small-scale systolic arrays, and a tailored memory layout design. We also provide an analytical model of the general Winograd convolution algorithm as a design reference. Experimental results on VGG16 show that it achieves very high computational resource utilization, 20x to 30x energy efficiency gains, and more than a 5x speedup compared with the dense implementation. |
Tasks | |
Published | 2018-10-03 |
URL | http://arxiv.org/abs/1810.01973v1 |
PDF | http://arxiv.org/pdf/1810.01973v1.pdf |
PWC | https://paperswithcode.com/paper/sparse-winograd-convolutional-neural-networks |
Repo | |
Framework | |
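For reference, the Winograd minimal-filtering algorithm the accelerator builds on, in its smallest 1-D form F(2, 3): two convolution outputs from four multiplications instead of six, via fixed input, filter, and output transforms. Sparse Winograd variants prune the transformed filter so that zeros can be skipped in the elementwise product.

```python
import numpy as np

# Winograd F(2, 3): 2 outputs of a 1-D convolution with a 3-tap filter.
Bt = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)   # input transform
G  = np.array([[1.0, 0.0, 0.0],
               [0.5, 0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0, 0.0, 1.0]])               # filter transform
At = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)   # output transform

d = np.array([1.0, 2.0, 3.0, 4.0])  # input tile
g = np.array([0.5, 0.0, -0.5])      # filter

y = At @ ((G @ g) * (Bt @ d))       # 4 multiplications in the Winograd domain
print(y)                            # matches direct valid convolution:
print(np.convolve(d, g[::-1], mode="valid"))
```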
The Well Tempered Lasso
Title | The Well Tempered Lasso |
Authors | Yuanzhi Li, Yoram Singer |
Abstract | We study the complexity of the entire regularization path for least squares regression with 1-norm penalty, known as the Lasso. Every regression parameter in the Lasso changes linearly as a function of the regularization value. The number of changes is regarded as the Lasso’s complexity. Experimental results using exact path following exhibit polynomial complexity of the Lasso in the problem size. Alas, the path complexity of the Lasso on artificially designed regression problems is exponential. We use smoothed analysis as a mechanism for bridging the gap between worst case settings and the de facto low complexity. Our analysis assumes that the observed data has a tiny amount of intrinsic noise. We then prove that the Lasso’s complexity is polynomial in the problem size. While building upon the seminal work of Spielman and Teng on smoothed complexity, our analysis is morally different as it is divorced from specific path following algorithms. We verify the validity of our analysis in experiments with both worst case settings and real datasets. The empirical results we obtain closely match our analysis. |
Tasks | |
Published | 2018-06-08 |
URL | http://arxiv.org/abs/1806.03190v1 |
PDF | http://arxiv.org/pdf/1806.03190v1.pdf |
PWC | https://paperswithcode.com/paper/the-well-tempered-lasso |
Repo | |
Framework | |
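The "path complexity" studied here is directly observable: the Lasso's regularization path is piecewise linear, and each kink marks a change in the active set. A short scikit-learn sketch of counting those breakpoints on a lightly noised (i.e., smoothed) random instance:

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
n, d = 100, 20
X = rng.standard_normal((n, d))
# The smoothed setting of the paper: observations carry a bit of intrinsic noise.
y = X @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

# Exact piecewise-linear regularization path; the number of segments is the
# complexity measure discussed in the abstract.
alphas, active, coefs = lars_path(X, y, method="lasso")
print("path breakpoints:", len(alphas))
```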
Finding the needle in high-dimensional haystack: A tutorial on canonical correlation analysis
Title | Finding the needle in high-dimensional haystack: A tutorial on canonical correlation analysis |
Authors | Hao-Ting Wang, Jonathan Smallwood, Janaina Mourao-Miranda, Cedric Huchuan Xia, Theodore D. Satterthwaite, Danielle S. Bassett, Danilo Bzdok |
Abstract | Since the beginning of the 21st century, the size, breadth, and granularity of data in biology and medicine have grown rapidly. In neuroscience, for example, studies with thousands of subjects are becoming more common, providing extensive phenotyping at the behavioral, neural, and genomic levels with hundreds of variables. The complexity of such big-data repositories offers new opportunities and poses new challenges for investigating brain, cognition, and disease. Canonical correlation analysis (CCA) is a prototypical family of methods for wrestling with and harvesting insight from such rich datasets. This doubly-multivariate tool can simultaneously consider two variable sets from different modalities to uncover essential hidden associations. Our primer discusses the rationale, promises, and pitfalls of CCA in biomedicine. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02598v1 |
PDF | http://arxiv.org/pdf/1812.02598v1.pdf |
PWC | https://paperswithcode.com/paper/finding-the-needle-in-high-dimensional |
Repo | |
Framework | |
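A minimal scikit-learn sketch of the doubly-multivariate setup the tutorial covers: two variable sets that are noisy views of a shared latent signal, with CCA recovering the hidden association. The data is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)   # shared latent signal

# Two "modalities" (think brain measures vs. behavioural measures), each a
# noisy linear view of the same latent variable.
X = np.outer(z, rng.standard_normal(10)) + rng.standard_normal((n, 10))
Y = np.outer(z, rng.standard_normal(5)) + rng.standard_normal((n, 5))

cca = CCA(n_components=2).fit(X, Y)
U, V = cca.transform(X, Y)
# The first pair of canonical variates recovers the hidden association.
print(np.corrcoef(U[:, 0], V[:, 0])[0, 1])
```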
AFRA: Argumentation framework with recursive attacks
Title | AFRA: Argumentation framework with recursive attacks |
Authors | Pietro Baroni, Federico Cerutti, Massimiliano Giacomin, Giovanni Guida |
Abstract | The issue of representing attacks on attacks in argumentation is receiving increasing attention as a useful conceptual modelling tool in several contexts. In this paper we present AFRA, a formalism encompassing unlimited recursive attacks within argumentation frameworks. AFRA satisfies the basic requirements of definition simplicity and rigorous compatibility with Dung’s theory of argumentation. This paper provides a complete development of the AFRA formalism, complemented by illustrative examples and a detailed comparison with other recursive attack formalizations. |
Tasks | |
Published | 2018-10-11 |
URL | http://arxiv.org/abs/1810.04886v1 |
PDF | http://arxiv.org/pdf/1810.04886v1.pdf |
PWC | https://paperswithcode.com/paper/afra-argumentation-framework-with-recursive |
Repo | |
Framework | |
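AFRA's central notion, attacks whose targets may be arguments or other attacks to unlimited depth, fits a small recursive data type. A hedged Python sketch (names are illustrative, not the paper's notation):

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Argument:
    name: str

@dataclass(frozen=True)
class Attack:
    source: Argument
    target: Union["Argument", "Attack"]   # an attack may target another attack

a, b, c = Argument("a"), Argument("b"), Argument("c")
alpha = Attack(a, b)      # a attacks b
beta = Attack(c, alpha)   # c attacks the attack (a, b): a recursive attack
gamma = Attack(b, beta)   # ...which can itself be attacked, to unlimited depth
print(gamma)
```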
CliNER 2.0: Accessible and Accurate Clinical Concept Extraction
Title | CliNER 2.0: Accessible and Accurate Clinical Concept Extraction |
Authors | Willie Boag, Elena Sergeeva, Saurabh Kulshreshtha, Peter Szolovits, Anna Rumshisky, Tristan Naumann |
Abstract | Clinical notes often describe important aspects of a patient’s stay and are therefore critical to medical research. Clinical concept extraction (CCE) of named entities, such as problems, tests, and treatments, aids in forming an understanding of notes and provides a foundation for many downstream clinical decision-making tasks. Historically, this task has been posed as a standard named entity recognition (NER) sequence tagging problem, and solved with feature-based methods using hand-engineered domain knowledge. Recent advances, however, have demonstrated the efficacy of LSTM-based models for NER tasks, including CCE. This work presents CliNER 2.0, a simple-to-install, open-source tool for extracting concepts from clinical text. CliNER 2.0 uses a word- and character-level LSTM model, and achieves state-of-the-art performance. For ease of use, the tool also includes pre-trained models available for public use. |
Tasks | Clinical Concept Extraction, Decision Making, Named Entity Recognition |
Published | 2018-03-06 |
URL | http://arxiv.org/abs/1803.02245v1 |
PDF | http://arxiv.org/pdf/1803.02245v1.pdf |
PWC | https://paperswithcode.com/paper/cliner-20-accessible-and-accurate-clinical |
Repo | |
Framework | |
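A compact PyTorch sketch of the kind of sequence tagger described: a bidirectional LSTM emitting BIO tags for the i2b2 concept types (problem, test, treatment). CliNER 2.0 additionally uses a character-level LSTM; the sizes and single-layer setup here are assumptions, not the tool's actual configuration.

```python
import torch
import torch.nn as nn

TAGS = ["O", "B-problem", "I-problem", "B-test", "I-test",
        "B-treatment", "I-treatment"]

class Tagger(nn.Module):
    def __init__(self, vocab=5000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * dim, len(TAGS))

    def forward(self, tokens):
        h, _ = self.lstm(self.emb(tokens))
        return self.out(h)   # one tag distribution per token

tagger = Tagger()
scores = tagger(torch.randint(0, 5000, (1, 12)))   # a 12-token note fragment
print([TAGS[i] for i in scores.argmax(-1)[0].tolist()])
```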
Interactive Deep Colorization With Simultaneous Global and Local Inputs
Title | Interactive Deep Colorization With Simultaneous Global and Local Inputs |
Authors | Yi Xiao, Peiyao Zhou, Yan Zheng |
Abstract | Colorization methods using deep neural networks have become a recent trend. However, most of them do not allow user inputs, or allow only limited user inputs (only global inputs or only local inputs), to control the output colorized images. The likely reason is that it is difficult to differentiate the influence of different kinds of user inputs in network training. To solve this problem, we present a novel deep colorization method that allows simultaneous global and local inputs to better control the output colorized images. The key step is to design an appropriate loss function that can differentiate the influence of input data, global inputs, and local inputs. With this design, our method accepts no inputs, global inputs only, local inputs only, or both global and local inputs, which previous deep colorization methods do not support. In addition, we propose a global color-theme recommendation system to help users determine global inputs. Experimental results show that our method better controls the colorized images and generates state-of-the-art results. |
Tasks | Colorization |
Published | 2018-01-27 |
URL | http://arxiv.org/abs/1801.09083v1 |
PDF | http://arxiv.org/pdf/1801.09083v1.pdf |
PWC | https://paperswithcode.com/paper/interactive-deep-colorization-with |
Repo | |
Framework | |
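The abstract's key step is a loss that differentiates input data, global inputs, and local inputs, but its exact form is not given there. One common recipe that yields the described behavior (accepting no hints, global hints, local hints, or both) is to randomly drop each hint type during training; the sketch below shows that idea with hypothetical names (`net`, `gray`, `ab_target`), and should not be read as the paper's actual loss.

```python
import torch

def training_step(net, gray, global_hint, local_hint, ab_target):
    """One hedged training step: hint dropout so a single network learns to
    colorize with any combination of global/local user inputs."""
    if torch.rand(1).item() < 0.5:
        global_hint = torch.zeros_like(global_hint)  # simulate "no global input"
    if torch.rand(1).item() < 0.5:
        local_hint = torch.zeros_like(local_hint)    # simulate "no local input"
    pred = net(gray, global_hint, local_hint)        # predicted color channels
    return torch.nn.functional.l1_loss(pred, ab_target)
```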
Statistical Model Compression for Small-Footprint Natural Language Understanding
Title | Statistical Model Compression for Small-Footprint Natural Language Understanding |
Authors | Grant P. Strimel, Kanthashree Mysore Sathyendra, Stanislav Peshterliev |
Abstract | In this paper we investigate statistical model compression applied to natural language understanding (NLU) models. Small-footprint NLU models are important for enabling offline systems on hardware-restricted devices, and for decreasing on-demand model loading latency in cloud-based systems. To compress NLU models, we present two main techniques, parameter quantization and perfect feature hashing. These techniques are complementary to existing model pruning strategies such as L1 regularization. We performed experiments on a large-scale NLU system. The results show that our approach achieves a 14-fold reduction in memory usage compared to the original models, with minimal impact on predictive performance. |
Tasks | Model Compression, Quantization |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07520v1 |
PDF | http://arxiv.org/pdf/1807.07520v1.pdf |
PWC | https://paperswithcode.com/paper/statistical-model-compression-for-small |
Repo | |
Framework | |
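Both techniques are easy to sketch. Linear 8-bit parameter quantization stores one byte per weight plus a scale and offset; feature hashing maps feature strings into a fixed index space so no feature dictionary has to ship with the model. The paper uses perfect hashing, which additionally avoids collisions among known features; the plain (collision-prone) variant below is only illustrative.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)

# 8-bit linear quantization: one uint8 per weight plus (scale, offset),
# roughly a 4x saving over float32.
lo, hi = w.min(), w.max()
scale = (hi - lo) / 255.0
q = np.round((w - lo) / scale).astype(np.uint8)
w_hat = q * scale + lo
print("max quantization error:", np.abs(w - w_hat).max())

# Feature hashing: map sparse n-gram features to a fixed-size index space.
def feature_index(feature: str, n_buckets: int = 2**18) -> int:
    return zlib.crc32(feature.encode()) % n_buckets

print(feature_index("play|music"))
```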
A Quantization-Friendly Separable Convolution for MobileNets
Title | A Quantization-Friendly Separable Convolution for MobileNets |
Authors | Tao Sheng, Chen Feng, Shaojie Zhuo, Xiaopeng Zhang, Liang Shen, Mickey Aleksic |
Abstract | As deep learning (DL) is rapidly pushed to edge computing, researchers have invented various ways to make inference computation more efficient on mobile/IoT devices, such as network pruning and parameter compression. Quantization, one of the key approaches, can effectively offload the GPU and makes it possible to deploy DL on a fixed-point pipeline. Unfortunately, not all existing network designs are friendly to quantization. For example, while the popular lightweight MobileNetV1 successfully reduces parameter size and computation latency with separable convolution, our experiments show that its quantized model has a large accuracy gap relative to its floating-point model. To resolve this, we analyzed the root cause of the quantization loss and propose a quantization-friendly separable convolution architecture. Evaluated on the image classification task with the ImageNet2012 dataset, our modified MobileNetV1 model achieves a top-1 accuracy of 68.03% with 8-bit inference, nearly closing the gap to the floating-point pipeline. |
Tasks | Image Classification, Network Pruning, Quantization |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08607v3 |
PDF | http://arxiv.org/pdf/1803.08607v3.pdf |
PWC | https://paperswithcode.com/paper/a-quantization-friendly-separable-convolution |
Repo | |
Framework | |
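A toy illustration of why an architecture can be quantization-unfriendly: per-tensor 8-bit min/max quantization spends its 256 levels on the full dynamic range, so a few outlier weights (as can arise, e.g., when batch norm is folded into a depthwise convolution) inflate the error for all weights. This is an illustrative reading of the root-cause analysis, not the paper's experiment.

```python
import numpy as np

def quantize_error(w):
    """Mean error of per-tensor 8-bit linear quantization (min/max calibration)."""
    scale = (w.max() - w.min()) / 255.0
    q = np.round((w - w.min()) / scale)
    return np.abs(w - (q * scale + w.min())).mean()

rng = np.random.default_rng(0)
well_behaved = rng.standard_normal(1000)
# A couple of outliers blow up the dynamic range and hence the error for
# every other weight in the tensor.
with_outliers = np.concatenate([well_behaved, [50.0, -50.0]])
print(quantize_error(well_behaved), quantize_error(with_outliers))
```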
Significance-based Estimation-of-Distribution Algorithms
Title | Significance-based Estimation-of-Distribution Algorithms |
Authors | Benjamin Doerr, Martin Krejca |
Abstract | Estimation-of-distribution algorithms (EDAs) are randomized search heuristics that maintain a probabilistic model of the solution space. This model is updated from iteration to iteration, based on the quality of the solutions sampled according to the model. As previous works show, this short-term perspective can lead to erratic updates of the model, in particular, to bit frequencies approaching a random boundary value. Such frequencies take a long time to move back to the middle range, leading to significant performance losses. In order to overcome this problem, we propose a new EDA based on the classic compact genetic algorithm (cGA) that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significance-based compact genetic algorithm (sig-cGA) optimizes the commonly regarded benchmark functions OneMax, LeadingOnes, and BinVal all in $O(n\log n)$ time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed scGA – an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model – we prove that it optimizes OneMax only in a time exponential in the hypothetical population size $1/\rho$. Similarly, we show that the convex search algorithm cannot optimize OneMax in polynomial time. |
Tasks | |
Published | 2018-07-10 |
URL | http://arxiv.org/abs/1807.03495v2 |
PDF | http://arxiv.org/pdf/1807.03495v2.pdf |
PWC | https://paperswithcode.com/paper/significance-based-estimation-of-distribution |
Repo | |
Framework | |
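For orientation, the classic compact genetic algorithm that sig-cGA extends: a frequency vector is sampled twice per iteration, and frequencies move by 1/K toward the better sample's bits, with borders keeping them away from 0 and 1. sig-cGA differs by keeping a history and moving a frequency only on statistically significant evidence; the sketch below is the plain cGA on OneMax.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                      # problem size
K = 12 * n                  # hypothetical population size (update step 1/K)
p = np.full(n, 0.5)         # bit frequencies: the probabilistic model

onemax = lambda x: x.sum()
for gen in range(100_000):
    x = (rng.random(n) < p).astype(int)   # sample two individuals
    y = (rng.random(n) < p).astype(int)
    if onemax(y) > onemax(x):
        x, y = y, x                       # x is now the winner
    p += (x - y) / K                      # shift frequencies toward the winner
    p = p.clip(1 / n, 1 - 1 / n)          # borders keep frequencies off 0 and 1
    if onemax(x) == n:
        print("OneMax solved at iteration", gen)
        break
```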