Paper Group ANR 273
Sparse principal component regression for generalized linear models
Title | Sparse principal component regression for generalized linear models |
Authors | Shuichi Kawano, Hironori Fujisawa, Toyoyuki Takada, Toshihiko Shiroishi |
Abstract | Principal component regression (PCR) is a widely used two-stage procedure: principal component analysis (PCA), followed by regression in which the selected principal components are regarded as new explanatory variables in the model. Note that PCA is based only on the explanatory variables, so the principal components are not selected using information on the response variable. In this paper, we propose a one-stage procedure for PCR in the framework of generalized linear models. The basic loss function is a combination of the regression loss and the PCA loss. An estimate of the regression parameter is obtained as the minimizer of the basic loss function with a sparse penalty. We call the proposed method sparse principal component regression for generalized linear models (SPCR-glm). By taking the two loss functions into consideration simultaneously, SPCR-glm obtains sparse principal component loadings that are related to the response variable. A combination of loss functions may cause a parameter identification problem, but this potential problem is avoided by virtue of the sparse penalty; thus, the sparse penalty plays two roles in this method. The parameter estimation procedure uses various update algorithms together with the coordinate descent algorithm. We apply SPCR-glm to two real datasets: doctor visits data and mouse consomic strain data. SPCR-glm provides more easily interpretable principal component (PC) scores and clearer classification on PC plots than the usual PCA. |
Tasks | |
Published | 2016-09-28 |
URL | http://arxiv.org/abs/1609.08886v3 |
http://arxiv.org/pdf/1609.08886v3.pdf | |
PWC | https://paperswithcode.com/paper/sparse-principal-component-regression-for |
Repo | |
Framework | |
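The combined objective described in the abstract (regression loss plus PCA reconstruction loss plus a sparse penalty) can be sketched for the Gaussian case as a toy evaluation. This is a minimal illustration of such a combined loss, not the authors' exact SPCR-glm formulation; the weights `w` and `lam` are illustrative:

```python
def spcr_objective(X, y, V, beta, w=0.1, lam=0.05):
    """Toy version of an SPCR-style combined objective (a sketch, not the
    paper's exact formulation): squared-error regression loss on PC scores,
    plus a PCA reconstruction loss, plus an L1 penalty on the loadings V."""
    n, p = len(X), len(X[0])
    k = len(V[0])            # number of components; V is p x k
    reg = pca = 0.0
    for i in range(n):
        # PC scores z = x V, used as the regressors
        z = [sum(X[i][j] * V[j][c] for j in range(p)) for c in range(k)]
        pred = sum(b * zc for b, zc in zip(beta, z))
        reg += (y[i] - pred) ** 2
        # reconstruction x_hat = z V^T
        for j in range(p):
            xhat = sum(z[c] * V[j][c] for c in range(k))
            pca += (X[i][j] - xhat) ** 2
    l1 = sum(abs(v) for row in V for v in row)
    return reg / n + w * pca / n + lam * l1
```

Minimizing this jointly over `V` and `beta` (e.g. by coordinate descent, as the paper does) is what couples the loadings to the response, in contrast to the usual two-stage PCR.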
Can Turing machine be curious about its Turing test results? Three informal lectures on physics of intelligence
Title | Can Turing machine be curious about its Turing test results? Three informal lectures on physics of intelligence |
Authors | Alex Ushveridze |
Abstract | What is the nature of curiosity? Is there any scientific way to understand the origin of this mysterious force that drives the behavior of even the stupidest naturally intelligent systems and is completely absent in their smartest artificial analogs? Can we build AI systems that could be curious about something, systems that would have an intrinsic motivation to learn? Is such a motivation quantifiable? Is it implementable? I will discuss this problem from the standpoint of physics. The relationship between physics and intelligence is a consequence of the fact that correctly predicted information is nothing but an energy resource, and the process of thinking can be viewed as a process of accumulating and spending this resource through the acts of perception and, respectively, decision making. The natural motivation of any autonomous system to keep this accumulation/spending balance as high as possible allows one to treat the problem of describing the dynamics of thinking processes as a resource optimization problem. Here I will propose and discuss a simple theoretical model of such an autonomous system which I call the Autonomous Turing Machine (ATM). The potential attractiveness of ATM lies in the fact that it is the model of a self-propelled AI for which the only available energy resource is the information itself. For ATM, the problem of optimal thinking, learning, and decision-making becomes conceptually simple and mathematically well tractable. This circumstance makes the ATM an ideal playground for studying the dynamics of intelligent behavior and allows one to quantify many seemingly unquantifiable features of genuine intelligence. |
Tasks | Decision Making |
Published | 2016-06-27 |
URL | http://arxiv.org/abs/1606.08109v1 |
http://arxiv.org/pdf/1606.08109v1.pdf | |
PWC | https://paperswithcode.com/paper/can-turing-machine-be-curious-about-its |
Repo | |
Framework | |
Classifying discourse in a CSCL platform to evaluate correlations with Teacher Participation and Progress
Title | Classifying discourse in a CSCL platform to evaluate correlations with Teacher Participation and Progress |
Authors | Eliana Scheihing, Matthieu Vernier, Javiera Born, Julio Guerra, Luis Carcamo |
Abstract | In computer-supported learning, monitoring and engaging a group of learners is a complex task for teachers, especially when learners are working collaboratively: Are my students motivated? What kind of progress are they making? Should I intervene? Are my communication and didactic design adapted to my students? Our hypothesis is that the analysis of natural language interactions between students, and between students and teachers, provides very valuable information and could be used to produce qualitative indicators to support teachers’ decisions. We develop an automatic approach in three steps: (1) explore the discursive functions of messages in a CSCL platform, (2) classify the messages automatically, and (3) evaluate correlations between discursive attitudes and other variables linked to the learning activity. Results tend to show that some types of discourse correlate with a notion of progress on the learning activities, and highlight the importance of emotive participation from the teacher. |
Tasks | |
Published | 2016-05-24 |
URL | http://arxiv.org/abs/1605.07268v1 |
http://arxiv.org/pdf/1605.07268v1.pdf | |
PWC | https://paperswithcode.com/paper/classifying-discourse-in-a-cscl-platform-to |
Repo | |
Framework | |
Deep Extreme Feature Extraction: New MVA Method for Searching Particles in High Energy Physics
Title | Deep Extreme Feature Extraction: New MVA Method for Searching Particles in High Energy Physics |
Authors | Chao Ma, Tianchenghou, Bin Lan, Jinhui Xu, Zhenhua Zhang |
Abstract | In this paper, we present Deep Extreme Feature Extraction (DEFE), a new ensemble MVA method for searching the $\tau^{+}\tau^{-}$ channel of Higgs bosons in high energy physics. DEFE can be viewed as a deep ensemble learning scheme that trains a strongly diverse set of neural feature learners without explicitly encouraging diversity or penalizing correlations. This is achieved by adopting an implicit neural controller (not involved in feedforward computation) that directly controls and distributes the gradient flows from the higher-level deep prediction network. Such a model-independent controller ensures that every local feature learned is used in the feature-to-output mapping stage, avoiding a blind averaging of features. DEFE makes the ensemble ‘deep’ in the sense that it allows deep post-processing of these features, learning to select and abstract the ensemble of neural feature learners. With this model, selection regions rich in the signal process can be obtained by training on a miniature set of collision events. Compared with a classic deep neural network, DEFE shows state-of-the-art performance: the error rate decreases by about 37%, the accuracy breaks through 90% for the first time, and the discovery significance reaches 6.0 $\sigma$. Experimental results show that DEFE is able to train an ensemble of discriminative feature learners that boosts the performance of the final prediction. |
Tasks | |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07454v1 |
http://arxiv.org/pdf/1603.07454v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-extreme-feature-extraction-new-mva |
Repo | |
Framework | |
Distributed Cell Association for Energy Harvesting IoT Devices in Dense Small Cell Networks: A Mean-Field Multi-Armed Bandit Approach
Title | Distributed Cell Association for Energy Harvesting IoT Devices in Dense Small Cell Networks: A Mean-Field Multi-Armed Bandit Approach |
Authors | Setareh Maghsudi, Ekram Hossain |
Abstract | The emerging Internet of Things (IoT)-driven ultra-dense small cell networks (UD-SCNs) will need to combat a variety of challenges. On one hand, the massive number of devices sharing the limited wireless resources will render centralized control mechanisms infeasible due to the excessive cost of information acquisition and computation. On the other hand, to reduce energy consumption from the fixed power grid and/or batteries, many IoT devices may need to depend on energy harvested from the ambient environment (e.g., from RF transmissions or environmental sources). However, due to the opportunistic nature of energy harvesting, this will introduce uncertainty into the network operation. In this article, we study the distributed cell association problem for energy harvesting IoT devices in UD-SCNs. After reviewing the state-of-the-art research on the cell association problem in small cell networks, we outline the major challenges for distributed cell association in IoT-driven UD-SCNs, where the IoT devices will need to perform cell association in a distributed manner in the presence of uncertainty (e.g., limited knowledge of the channel/network) and limited computational capabilities. To this end, we propose an approach based on mean-field multi-armed bandit games to solve the uplink cell association problem for energy harvesting IoT devices in a UD-SCN. This approach is particularly suitable for analyzing large multi-agent systems under uncertainty and lack of information. We provide some theoretical results as well as preliminary performance evaluation results for the proposed approach. |
Tasks | |
Published | 2016-04-30 |
URL | http://arxiv.org/abs/1605.00057v1 |
http://arxiv.org/pdf/1605.00057v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-cell-association-for-energy |
Repo | |
Framework | |
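As a single-agent simplification of the mean-field bandit game above, a plain UCB1 learner already captures the exploration/exploitation trade-off a device faces when choosing among small cells under uncertainty. This is a sketch, not the paper's mean-field formulation; the reward functions are hypothetical throughput draws supplied by the caller:

```python
import math
import random

def ucb1_associate(cell_reward_fns, rounds=2000, seed=0):
    """UCB1 sketch: a single energy-harvesting device picks a small cell (an
    arm) each round, balancing exploration of unknown cells against
    exploitation of the best observed one. A single-agent stand-in for the
    paper's mean-field multi-armed bandit game."""
    rng = random.Random(seed)
    k = len(cell_reward_fns)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1                      # play each cell once first
        else:
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        reward = cell_reward_fns[arm](rng)   # hypothetical throughput draw
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

With two cells yielding Bernoulli rewards of mean 0.9 and 0.1, the device concentrates its associations on the better cell after a short exploration phase.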
Local Similarity-Aware Deep Feature Embedding
Title | Local Similarity-Aware Deep Feature Embedding |
Authors | Chen Huang, Chen Change Loy, Xiaoou Tang |
Abstract | Existing deep embedding methods in vision tasks are capable of learning a compact Euclidean space from images, where Euclidean distances correspond to a similarity metric. To make learning more effective and efficient, hard sample mining is usually employed, with samples identified through computing the Euclidean feature distance. However, the global Euclidean distance cannot faithfully characterize the true feature similarity in a complex visual feature space, where the intraclass distance in a high-density region may be larger than the interclass distance in low-density regions. In this paper, we introduce a Position-Dependent Deep Metric (PDDM) unit, which is capable of learning a similarity metric adaptive to the local feature structure. The metric can be used to select genuinely hard samples in a local neighborhood to guide the deep embedding learning in an online and robust manner. The new layer is appealing in that it is pluggable into any convolutional network and is trained end-to-end. Our local similarity-aware feature embedding not only demonstrates faster convergence and boosted performance on two complex image retrieval datasets; its large-margin nature also leads to superior generalization results under the large and open-set scenarios of transfer learning and zero-shot learning on the ImageNet 2010 and ImageNet-10K datasets. |
Tasks | Image Retrieval, Transfer Learning, Zero-Shot Learning |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08904v1 |
http://arxiv.org/pdf/1610.08904v1.pdf | |
PWC | https://paperswithcode.com/paper/local-similarity-aware-deep-feature-embedding |
Repo | |
Framework | |
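The local hard-sample idea can be illustrated with a plain Euclidean stand-in for the learned PDDM similarity: restrict mining to the anchor's local neighbourhood and pick the nearest sample with a different label. This is only a sketch of the mining step, not the learned position-dependent metric itself:

```python
import math

def hardest_negative(anchor, anchor_label, candidates, labels, k=3):
    """Sketch of local hard-negative mining: consider only the anchor's k
    nearest neighbours (its local neighbourhood) and return the index of the
    closest one with a different label, or None if the neighbourhood is pure.
    A crude Euclidean stand-in for the learned PDDM similarity."""
    dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    order = sorted(range(len(candidates)),
                   key=lambda i: dist(anchor, candidates[i]))
    for i in order[:k]:
        if labels[i] != anchor_label:
            return i
    return None
```

The point of the local restriction is that a sample far away in global Euclidean distance may still be an easy negative, while a nearby sample of another class is genuinely hard.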
Robust Low-Complexity Randomized Methods for Locating Outliers in Large Matrices
Title | Robust Low-Complexity Randomized Methods for Locating Outliers in Large Matrices |
Authors | Xingguo Li, Jarvis Haupt |
Abstract | This paper examines the problem of locating outlier columns in a large, otherwise low-rank matrix, in settings where the data are noisy, or where the overall matrix has missing elements. We propose a randomized two-step inference framework, and establish sufficient conditions on the required sample complexities under which these methods succeed (with high probability) in accurately locating the outliers for each task. Comprehensive numerical experimental results are provided to verify the theoretical bounds and demonstrate the computational efficiency of the proposed algorithm. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02334v1 |
http://arxiv.org/pdf/1612.02334v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-low-complexity-randomized-methods-for |
Repo | |
Framework | |
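A simplified rank-1 version of the two-step idea can be sketched: estimate the dominant column direction from a random column subset, then flag columns with a large residual after projecting onto that direction. This is an illustrative stand-in for the general framework, not the paper's algorithm or its sample-complexity guarantees:

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def locate_outlier_columns(M, sample_frac=0.5, thresh=0.5, seed=0):
    """Sketch of a randomized two-step outlier-column locator (rank-1 toy):
    (1) estimate the dominant column direction via power iteration on a
    random column subset; (2) flag columns whose normalized residual after
    projecting onto that direction exceeds thresh."""
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    cols = [[M[i][j] for i in range(n)] for j in range(m)]
    sample = [cols[j] for j in range(m) if rng.random() < sample_frac]
    v = [1.0] * n
    for _ in range(50):                      # power iteration on sum c c^T
        w = [0.0] * n
        for c in sample:
            s = dot(c, v)
            for i in range(n):
                w[i] += s * c[i]
        norm = sum(x * x for x in w) ** 0.5 or 1.0
        v = [x / norm for x in w]
    outliers = []
    for j, c in enumerate(cols):
        proj = dot(c, v)
        resid = sum((c[i] - proj * v[i]) ** 2 for i in range(n)) ** 0.5
        cnorm = sum(x * x for x in c) ** 0.5 or 1.0
        if resid / cnorm > thresh:
            outliers.append(j)
    return outliers
```

The random subsampling in step (1) is what keeps the cost low on large matrices; the paper's analysis quantifies how small the sample can be while still locating the outliers with high probability.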
Spotting Rumors via Novelty Detection
Title | Spotting Rumors via Novelty Detection |
Authors | Yumeng Qin, Dominik Wurzer, Victor Lavrenko, Cunchen Tang |
Abstract | Rumour detection is hard because the most accurate systems operate retrospectively, only recognizing rumours once they have collected repeated signals. By then the rumours might have already spread and caused harm. We introduce a new category of features based on novelty, tailored to detect rumours early on. To compensate for the absence of repeated signals, we make use of news wire as an additional data source. Unconfirmed (novel) information with respect to the news articles is considered an indication of rumours. Additionally, we introduce pseudo feedback, which assumes that documents similar to previous rumours are more likely to also be rumours. Comparison with other real-time approaches shows that novelty-based features in conjunction with pseudo feedback perform significantly better when detecting rumours instantly after their publication. |
Tasks | Rumour Detection |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06322v1 |
http://arxiv.org/pdf/1611.06322v1.pdf | |
PWC | https://paperswithcode.com/paper/spotting-rumors-via-novelty-detection |
Repo | |
Framework | |
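The novelty feature can be illustrated with a bag-of-words stand-in: score a tweet by its dissimilarity to the most similar news article. The paper's features are richer; this only shows the shape of the computation:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def novelty_score(tweet, news_articles):
    """Sketch of a novelty feature for early rumour detection: a tweet that
    is dissimilar to every confirmed news article counts as more novel, and
    hence more rumour-like. Plain bag-of-words cosine; illustrative only."""
    t = Counter(tweet.lower().split())
    if not news_articles:
        return 1.0
    best = max(cosine(t, Counter(doc.lower().split())) for doc in news_articles)
    return 1.0 - best
```

A pseudo-feedback term in the same spirit would add the tweet's similarity to previously detected rumours as a second score.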
Large-Scale Video Search with Efficient Temporal Voting Structure
Title | Large-Scale Video Search with Efficient Temporal Voting Structure |
Authors | Ersin Esen, Savas Ozkan, Ilkay Atil |
Abstract | In this work, we propose a fast content-based video querying system for large-scale video search. The proposed system is distinguished from similar works by two major contributions. The first contribution is the joint use of a repeated-content representation and efficient hashing mechanisms. The repeated-content representation is built on a simple yet robust feature based on the edge energy of frames. Each representation is converted into a hash code with the Hamming Embedding method for further queries. The second contribution is a novel queue-based voting scheme that leads to modest memory requirements with gradual memory allocation capability, in contrast to complete brute-force temporal voting schemes. This aspect enables us to make queries on large video databases conveniently, even on commodity computers with limited memory capacity. Our results show that the system can respond to video queries on a large video database with fast query times, a high recall rate, and very low memory and disk requirements. |
Tasks | |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07160v1 |
http://arxiv.org/pdf/1607.07160v1.pdf | |
PWC | https://paperswithcode.com/paper/large-scale-video-search-with-efficient |
Repo | |
Framework | |
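The query side of the hashing idea can be sketched as follows. Thresholding each frame's edge-energy value into one bit is an illustrative simplification of the Hamming Embedding step, not the paper's exact encoding:

```python
def hash_signature(energies, threshold=None):
    """Sketch of a binary video signature: each frame's edge-energy value
    becomes one bit by thresholding (upper-median by default). Illustrative
    stand-in for the paper's Hamming Embedding codes."""
    if threshold is None:
        s = sorted(energies)
        threshold = s[len(s) // 2]
    return tuple(1 if e >= threshold else 0 for e in energies)

def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def query(db, sig, max_dist=2):
    """Return ids of stored signatures within max_dist of the query."""
    return [vid for vid, s in db.items() if hamming(s, sig) <= max_dist]
```

Because signatures are short bit strings, matching is cheap in both time and memory, which is what makes queries feasible on commodity hardware.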
Word Embeddings and Their Use In Sentence Classification Tasks
Title | Word Embeddings and Their Use In Sentence Classification Tasks |
Authors | Amit Mandelbaum, Adi Shalev |
Abstract | This paper has two parts. In the first part we discuss word embeddings: the need for them, some of the methods to create them, and some of their interesting properties. We also compare them to image embeddings and see how word embeddings and image embeddings can be combined to perform different tasks. In the second part we implement a convolutional neural network trained on top of pre-trained word vectors. The network is used for several sentence-level classification tasks, and achieves state-of-the-art (or comparable) results, demonstrating the great power of pre-trained word embeddings over random ones. |
Tasks | Sentence Classification, Word Embeddings |
Published | 2016-10-26 |
URL | http://arxiv.org/abs/1610.08229v1 |
http://arxiv.org/pdf/1610.08229v1.pdf | |
PWC | https://paperswithcode.com/paper/word-embeddings-and-their-use-in-sentence |
Repo | |
Framework | |
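As a contrast to the paper's CNN, the simplest way to use word vectors for sentence-level tasks is to average them into one sentence vector. The embeddings below are toy hand-made examples, not real pre-trained vectors:

```python
def sentence_vector(sentence, embeddings, dim):
    """Average the word vectors of a sentence into a single dim-dimensional
    vector (a much simpler baseline than the paper's CNN over pre-trained
    vectors). Out-of-vocabulary words are skipped; an empty sentence maps
    to the zero vector."""
    words = [w for w in sentence.lower().split() if w in embeddings]
    if not words:
        return [0.0] * dim
    return [sum(embeddings[w][i] for w in words) / len(words)
            for i in range(dim)]
```

Feeding such averaged vectors to a linear classifier is a common baseline; the CNN instead keeps the word order and learns filters over windows of consecutive word vectors.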
Markov Chain methods for the bipartite Boolean quadratic programming problem
Title | Markov Chain methods for the bipartite Boolean quadratic programming problem |
Authors | Daniel Karapetyan, Abraham P. Punnen, Andrew J. Parkes |
Abstract | We study the Bipartite Boolean Quadratic Programming Problem (BBQP) which is an extension of the well known Boolean Quadratic Programming Problem (BQP). Applications of the BBQP include mining discrete patterns from binary data, approximating matrices by rank-one binary matrices, computing the cut-norm of a matrix, and solving optimisation problems such as maximum weight biclique, bipartite maximum weight cut, maximum weight induced sub-graph of a bipartite graph, etc. For the BBQP, we first present several algorithmic components, specifically, hill climbers and mutations, and then show how to combine them in a high-performance metaheuristic. Instead of hand-tuning a standard metaheuristic to test the efficiency of the hybrid of the components, we chose to use an automated generation of a multi-component metaheuristic to save human time, and also improve objectivity in the analysis and comparisons of components. For this we designed a new metaheuristic schema which we call Conditional Markov Chain Search (CMCS). We show that CMCS is flexible enough to model several standard metaheuristics; this flexibility is controlled by multiple numeric parameters, and so is convenient for automated generation. We study the configurations revealed by our approach and show that the best of them outperforms the previous state-of-the-art BBQP algorithm by several orders of magnitude. In our experiments we use benchmark instances introduced in the preliminary version of this paper and described here, which have already become the de facto standard in the BBQP literature. |
Tasks | |
Published | 2016-05-06 |
URL | http://arxiv.org/abs/1605.02038v2 |
http://arxiv.org/pdf/1605.02038v2.pdf | |
PWC | https://paperswithcode.com/paper/markov-chain-methods-for-the-bipartite |
Repo | |
Framework | |
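The CMCS idea, choosing the next algorithmic component from a transition matrix conditioned on whether the previous component improved the solution, can be sketched as follows. The components and transition matrices in the usage test are illustrative stand-ins, not the tuned BBQP configuration from the paper:

```python
import random

def cmcs(initial, objective, components, trans_improved, trans_unimproved,
         iters=200, seed=0):
    """Sketch of Conditional Markov Chain Search (minimization): run a
    component, then pick the next component index from the row of either
    trans_improved or trans_unimproved, depending on whether the objective
    just improved. Components take (solution, rng) and return a solution."""
    rng = random.Random(seed)
    best_sol, best = initial, objective(initial)
    cur, cur_val = initial, best
    idx = 0
    for _ in range(iters):
        new = components[idx](cur, rng)
        new_val = objective(new)
        improved = new_val < cur_val
        cur, cur_val = new, new_val
        if cur_val < best:
            best_sol, best = cur, cur_val
        probs = (trans_improved if improved else trans_unimproved)[idx]
        idx = rng.choices(range(len(components)), weights=probs)[0]
    return best_sol, best
```

The flexibility the abstract mentions comes from the numeric transition matrices: setting them appropriately recovers standard schemes such as "hill-climb until stuck, then mutate".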
Deleting and Testing Forbidden Patterns in Multi-Dimensional Arrays
Title | Deleting and Testing Forbidden Patterns in Multi-Dimensional Arrays |
Authors | Omri Ben-Eliezer, Simon Korman, Daniel Reichman |
Abstract | Understanding the local behaviour of structured multi-dimensional data is a fundamental problem in various areas of computer science. As the amount of data is often huge, it is desirable to obtain sublinear time algorithms, and specifically property testers, to understand local properties of the data. We focus on the natural local problem of testing pattern freeness: given a large $d$-dimensional array $A$ and a fixed $d$-dimensional pattern $P$ over a finite alphabet, we say that $A$ is $P$-free if it does not contain a copy of the forbidden pattern $P$ as a consecutive subarray. The distance of $A$ to $P$-freeness is the fraction of entries of $A$ that need to be modified to make it $P$-free. For any $\epsilon \in [0,1]$ and any large enough pattern $P$ over any alphabet, other than a very small set of exceptional patterns, we design a tolerant tester that distinguishes between the case that the distance is at least $\epsilon$ and the case that it is at most $a_d \epsilon$, with query complexity and running time $c_d \epsilon^{-1}$, where $a_d < 1$ and $c_d$ depend only on $d$. To analyze the testers we establish several combinatorial results, including the following $d$-dimensional modification lemma, which might be of independent interest: for any large enough pattern $P$ over any alphabet (excluding a small set of exceptional patterns for the binary case), and any array $A$ containing a copy of $P$, one can delete this copy by modifying one of its locations without creating new $P$-copies in $A$. Our results address an open question of Fischer and Newman, who asked whether there exist efficient testers for properties related to tight substructures in multi-dimensional structured data. They serve as a first step towards a general understanding of local properties of multi-dimensional arrays, as any such property can be characterized by a fixed family of forbidden patterns. |
Tasks | |
Published | 2016-07-13 |
URL | http://arxiv.org/abs/1607.03961v3 |
http://arxiv.org/pdf/1607.03961v3.pdf | |
PWC | https://paperswithcode.com/paper/deleting-and-testing-forbidden-patterns-in |
Repo | |
Framework | |
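The sampling idea behind such testers can be illustrated in the 1-D case: estimate the density of forbidden-pattern copies from a few random windows. This is only the naive sampling step, not the paper's tolerant tester or its modification-lemma analysis:

```python
import random

def contains_at(A, P, i):
    """True if pattern P occurs in array A as a consecutive block at i."""
    return A[i:i + len(P)] == P

def estimate_pattern_density(A, P, samples=500, seed=0):
    """Sketch of the sampling idea behind pattern-freeness testing (1-D):
    sample random windows and return the fraction that are a copy of P.
    A P-free array yields 0; an array far from P-free yields a noticeably
    positive estimate with high probability."""
    rng = random.Random(seed)
    positions = len(A) - len(P) + 1
    hits = sum(contains_at(A, P, rng.randrange(positions))
               for _ in range(samples))
    return hits / samples
```

The tester's query complexity is independent of the array size precisely because only a bounded number of random windows is inspected.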
Determining the Veracity of Rumours on Twitter
Title | Determining the Veracity of Rumours on Twitter |
Authors | Georgios Giasemidis, Colin Singleton, Ioannis Agrafiotis, Jason R. C. Nurse, Alan Pilgrim, Chris Willis, Danica Vukadinovic Greetham |
Abstract | While social networks can provide an ideal platform for up-to-date information from individuals across the world, they have also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense of social media data, and specifically, seek to build an autonomous message-classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users’ past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures including the authors’ profiles and past behaviour, the social network connections (graphs), and the content of the tweets themselves. We ran modern machine-learning classifiers over those measures to produce trustworthiness scores at various time windows from the outbreak of the rumour. Such time windows were key, as they allowed useful insight into the progression of the rumours. From our findings, we identified that our model was significantly more accurate than similar studies in the literature. We also identified critical attributes of the data that give rise to the trustworthiness scores assigned. Finally, we developed a software demonstration that provides a visual user interface to allow the user to examine the analysis. |
Tasks | Rumour Detection |
Published | 2016-11-19 |
URL | http://arxiv.org/abs/1611.06314v1 |
http://arxiv.org/pdf/1611.06314v1.pdf | |
PWC | https://paperswithcode.com/paper/determining-the-veracity-of-rumours-on |
Repo | |
Framework | |
Comparison among dimensionality reduction techniques based on Random Projection for cancer classification
Title | Comparison among dimensionality reduction techniques based on Random Projection for cancer classification |
Authors | Haozhe Xie, Jie Li, Qiaosheng Zhang, Yadong Wang |
Abstract | The Random Projection (RP) technique has been widely applied in many scenarios because it can reduce high-dimensional features into a low-dimensional space within a short time and meet the need for real-time analysis of massive data. With the rapid growth of big genomics data, there is an urgent need for dimensionality reduction. However, the classification performance of RP alone is usually low. We attempt to improve the classification accuracy of RP by combining it with other dimensionality reduction methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Feature Selection (FS). We compared the classification accuracy and running time of the different combination methods on three microarray datasets and a simulation dataset. Experimental results show a remarkable improvement of 14.77% in classification accuracy for FS followed by RP compared to RP alone on the BC-TCGA dataset. LDA followed by RP also helps RP to yield a more discriminative subspace, with an increase of 13.65% in classification accuracy on the same dataset. FS followed by RP outperforms the other combination methods in classification accuracy on most of the datasets. |
Tasks | Dimensionality Reduction, Feature Selection |
Published | 2016-08-25 |
URL | http://arxiv.org/abs/1608.07019v5 |
http://arxiv.org/pdf/1608.07019v5.pdf | |
PWC | https://paperswithcode.com/paper/comparison-among-dimensionality-reduction |
Repo | |
Framework | |
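The RP building block that the paper combines with PCA, LDA, and FS can be sketched in a few lines. The `1/sqrt(k)` scaling below is the standard Johnson-Lindenstrauss choice for a Gaussian projection matrix; this is a generic sketch, not the paper's experimental pipeline:

```python
import math
import random

def random_projection(X, k, seed=0):
    """Sketch of Gaussian random projection: map p-dimensional rows of X
    down to k dimensions with a random matrix R whose entries are
    N(0, 1/k). Pairwise distances are approximately preserved
    (Johnson-Lindenstrauss), which is why RP is so fast to apply."""
    rng = random.Random(seed)
    p = len(X[0])
    R = [[rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(k)]
         for _ in range(p)]
    return [[sum(x[j] * R[j][c] for j in range(p)) for c in range(k)]
            for x in X]
```

The combinations studied in the paper simply apply a supervised or variance-based reduction (FS, LDA, PCA) before or after this projection step.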
Surprisal-Driven Zoneout
Title | Surprisal-Driven Zoneout |
Authors | Kamil Rocki, Tomasz Kornuta, Tegan Maharaj |
Abstract | We propose a novel regularization method for recurrent neural networks called surprisal-driven zoneout. In this method, states zone out (maintain their previous value rather than updating) when the surprisal (the discrepancy between the last state’s prediction and the target) is small. Regularization is thus adaptive and input-driven on a per-neuron basis. We demonstrate the effectiveness of this idea by achieving a state-of-the-art 1.31 bits per character on the Hutter Prize Wikipedia dataset, significantly reducing the gap to the best known highly-engineered compression methods. |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07675v6 |
http://arxiv.org/pdf/1610.07675v6.pdf | |
PWC | https://paperswithcode.com/paper/surprisal-driven-zoneout |
Repo | |
Framework | |
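The zoneout rule can be sketched per neuron: keep the previous hidden state when surprisal is low, update to the freshly computed candidate otherwise. The hard threshold `tau` below is an illustrative simplification of the paper's mechanism:

```python
def surprisal_zoneout_step(h_prev, h_candidate, surprisal, tau=0.5):
    """Sketch of a surprisal-driven zoneout update: a unit zones out
    (keeps its previous state) when its surprisal is below tau, and takes
    the candidate state otherwise. The actual method is driven by the
    prediction-target discrepancy rather than a fixed threshold."""
    return [hc if s >= tau else hp
            for hp, hc, s in zip(h_prev, h_candidate, surprisal)]
```

Because the gate depends on the input-driven surprisal rather than on random dropout masks, the regularization adapts per neuron and per time step.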