Paper Group ANR 631
The IQ of Artificial Intelligence. Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering. Improv Chat: Second Response Generation for Chatbot. Deep Transfer Learning for EEG-based Brain Computer Interface. Learning with tree-based tensor …
The IQ of Artificial Intelligence
Title | The IQ of Artificial Intelligence |
Authors | Dimiter Dobrev |
Abstract | All it takes to identify the computer programs that are Artificial Intelligence is to give them a test and award the AI label to those that pass it. Let us say that the scores they earn on the test will be called IQ. We cannot pinpoint a minimum IQ threshold that a program has to meet in order to be AI; however, we will choose a certain value. Thus, our definition of AI will be any program whose IQ is above the chosen value. While this idea has already been implemented in [3], here we revisit the construct in order to introduce certain improvements. |
Tasks | |
Published | 2018-06-13 |
URL | http://arxiv.org/abs/1806.04915v1 |
PDF | http://arxiv.org/pdf/1806.04915v1.pdf |
PWC | https://paperswithcode.com/paper/the-iq-of-artificial-intelligence |
Repo | |
Framework | |
Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic
Title | Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic |
Authors | Benjamin J. Radford, Bartley D. Richardson, Shawn E. Davis |
Abstract | We evaluate methods for applying unsupervised anomaly detection to cybersecurity applications on computer network traffic data, or flow. We borrow from the natural language processing literature and conceptualize flow as a sort of “language” spoken between machines. Five sequence aggregation rules are evaluated for their efficacy in flagging multiple attack types in a labeled flow dataset, CICIDS2017. For sequence modeling, we rely on long short-term memory (LSTM) recurrent neural networks (RNN). Additionally, a simple frequency-based model is described and its performance with respect to attack detection is compared to the LSTM models. We conclude that the frequency-based model tends to perform as well as or better than the LSTM models for the tasks at hand, with a few notable exceptions. |
Tasks | Anomaly Detection, Unsupervised Anomaly Detection |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03735v2 |
PDF | http://arxiv.org/pdf/1805.03735v2.pdf |
PWC | https://paperswithcode.com/paper/sequence-aggregation-rules-for-anomaly |
Repo | |
Framework | |
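The abstract contrasts LSTM sequence models with a simple frequency-based model. The paper's exact model is not reproduced here; the sketch below is a minimal frequency baseline under the assumption that flow records have already been aggregated per host and discretized into tokens (the token format and smoothing constant are illustrative only). Rare tokens push a sequence's anomaly score up.

```python
from collections import Counter
import math

def train_frequency_model(token_sequences):
    """Count token frequencies over benign training flows."""
    counts = Counter(tok for seq in token_sequences for tok in seq)
    return counts, sum(counts.values())

def anomaly_score(seq, counts, total, alpha=1.0):
    """Average negative log-probability of the sequence's tokens
    under a unigram frequency model with add-alpha smoothing."""
    vocab = len(counts) + 1
    score = 0.0
    for tok in seq:
        p = (counts.get(tok, 0) + alpha) / (total + alpha * vocab)
        score += -math.log(p)
    return score / max(len(seq), 1)

# Toy usage: flows aggregated per source IP into token sequences.
benign = [["tcp:443:S", "tcp:443:M", "tcp:80:S"], ["udp:53:S", "tcp:443:M"]]
counts, total = train_frequency_model(benign)
print(anomaly_score(["tcp:443:M", "udp:53:S"], counts, total))    # familiar tokens: low score
print(anomaly_score(["tcp:6667:L", "tcp:6667:L"], counts, total)) # unseen tokens: high score
```

Sequences whose average score exceeds a threshold chosen on held-out benign data would be flagged; the choice of aggregation rule determines what a "sequence" is.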
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Title | Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering |
Authors | Todor Mihaylov, Peter Clark, Tushar Khot, Ashish Sabharwal |
Abstract | We present a new kind of question answering dataset, OpenBookQA, modeled after open book exams for assessing human understanding of a subject. The open book that comes with our questions is a set of 1329 elementary level science facts. Roughly 6000 questions probe an understanding of these facts and their application to novel situations. This requires combining an open book fact (e.g., metals conduct electricity) with broad common knowledge (e.g., a suit of armor is made of metal) obtained from other sources. While existing QA datasets over documents or knowledge bases, being generally self-contained, focus on linguistic understanding, OpenBookQA probes a deeper understanding of both the topic—in the context of common knowledge—and the language it is expressed in. Human performance on OpenBookQA is close to 92%, but many state-of-the-art pre-trained QA methods perform surprisingly poorly, worse than several simple neural baselines we develop. Our oracle experiments designed to circumvent the knowledge retrieval bottleneck demonstrate the value of both the open book and additional facts. We leave it as a challenge to solve the retrieval problem in this multi-hop setting and to close the large gap to human performance. |
Tasks | Question Answering |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02789v1 |
PDF | http://arxiv.org/pdf/1809.02789v1.pdf |
PWC | https://paperswithcode.com/paper/can-a-suit-of-armor-conduct-electricity-a-new |
Repo | |
Framework | |
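As a concrete illustration of the open-book setup (not one of the paper's baselines), the sketch below retrieves the best-matching fact by word overlap and then scores each answer choice against the question plus that fact. The facts, question, and choices are made-up stand-ins for OpenBookQA items.

```python
import re

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def best_fact(question, facts):
    """Retrieve the open-book fact with the largest word overlap with the question."""
    q = tokens(question)
    return max(facts, key=lambda f: len(q & tokens(f)))

def score_choice(question, choice, fact):
    """Overlap of the choice with the question plus the retrieved fact (a crude combination step)."""
    support = tokens(question) | tokens(fact)
    return len(tokens(choice) & support)

facts = ["metals conduct electricity", "plants need sunlight to grow"]
question = "Can a suit of armor conduct electricity?"
choices = ["yes, because metals conduct electricity", "no, because cloth blocks current"]
fact = best_fact(question, facts)
print(max(choices, key=lambda c: score_choice(question, c, fact)))
```

Such an overlap scorer has no access to the common knowledge that armor is made of metal, which is exactly the multi-hop retrieval gap the abstract describes.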
Improv Chat: Second Response Generation for Chatbot
Title | Improv Chat: Second Response Generation for Chatbot |
Authors | Furu Wei |
Abstract | Existing research on response generation for chatbots focuses on First Response Generation, which aims to teach the chatbot to say the first response (e.g. a sentence) appropriate to the conversation context (e.g. the user’s query). In this paper, we introduce a new task, Second Response Generation, termed Improv Chat, which aims to teach the chatbot to say a second response after saying the first response with respect to the conversation context, so as to lighten the burden on the user to keep the conversation going. Specifically, we propose a general learning-based framework and develop a retrieval-based system which can generate second responses with the user’s query and the chatbot’s first response as input. We present the approach to building the conversation corpus for Improv Chat from public forums and social networks, as well as the neural network based models for response matching and ranking. We include preliminary experiments and results in this paper. This work could be further advanced with better deep matching models for retrieval-based systems or generative models for generation-based systems, as well as extensive evaluations in real-life applications. |
Tasks | Chatbot |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03900v1 |
PDF | http://arxiv.org/pdf/1805.03900v1.pdf |
PWC | https://paperswithcode.com/paper/improv-chat-second-response-generation-for |
Repo | |
Framework | |
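A minimal retrieval sketch of the second-response idea, assuming a corpus of (query, first response, second response) triples: the stored triple whose (query, first-response) context is most similar to the live context supplies the second response. Plain cosine similarity over bags of words stands in for the neural matching and ranking models the abstract mentions; the corpus entries are invented.

```python
import math
from collections import Counter

def bow(text):
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def second_response(query, first_response, corpus):
    """Rank stored triples by similarity of their (query + first response) context."""
    context = bow(query + " " + first_response)
    best = max(corpus, key=lambda t: cosine(context, bow(t[0] + " " + t[1])))
    return best[2]

corpus = [
    ("any plans tonight", "I might watch a movie", "Any genre you would recommend?"),
    ("how is the weather", "Quite sunny today", "Perfect day for a walk, right?"),
]
print(second_response("any plan for tonight", "maybe a movie", corpus))
```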
Deep Transfer Learning for EEG-based Brain Computer Interface
Title | Deep Transfer Learning for EEG-based Brain Computer Interface |
Authors | Chuanqi Tan, Fuchun Sun, Wenchang Zhang |
Abstract | The electroencephalography classifier is the most important component of brain-computer interface based systems. There are two major problems hindering its improvement. First, traditional methods do not fully exploit multimodal information. Second, large-scale annotated EEG datasets are almost impossible to acquire because biological data acquisition is challenging and quality annotation is costly. Herein, we propose a novel deep transfer learning approach to solve these two problems. First, we model cognitive events based on EEG data by characterizing the data using EEG optical flow, which is designed to preserve multimodal EEG information in a uniform representation. Second, we design a deep transfer learning framework that is suitable for transferring knowledge by joint training and contains an adversarial network and a special loss function. The experiments demonstrate that our approach, when applied to EEG classification tasks, has many advantages, such as robustness and accuracy. |
Tasks | EEG, Optical Flow Estimation, Transfer Learning |
Published | 2018-08-06 |
URL | http://arxiv.org/abs/1808.01752v1 |
PDF | http://arxiv.org/pdf/1808.01752v1.pdf |
PWC | https://paperswithcode.com/paper/deep-transfer-learning-for-eeg-based-brain |
Repo | |
Framework | |
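The paper's exact architecture and loss are not reproduced here; the sketch below only shows the general shape of adversarial joint training for transfer, in the style of a DANN gradient-reversal discriminator on top of a shared feature extractor, which is one standard way to realise the "adversarial network plus special loss" the abstract mentions. Layer sizes, the flattened input standing in for EEG optical flow, and the loss weighting are placeholders.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lambd * grad_out, None

features = nn.Sequential(nn.Flatten(), nn.Linear(2 * 32 * 32, 128), nn.ReLU())  # placeholder for a CNN over EEG optical flow
label_head = nn.Linear(128, 4)    # e.g. 4 cognitive-event classes
domain_head = nn.Linear(128, 2)   # source dataset vs. target dataset
params = list(features.parameters()) + list(label_head.parameters()) + list(domain_head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

def train_step(x_src, y_src, x_tgt, lambd=0.1):
    """Joint step: classify source labels while confusing the domain classifier."""
    f_src, f_tgt = features(x_src), features(x_tgt)
    cls_loss = ce(label_head(f_src), y_src)
    dom_in = torch.cat([GradReverse.apply(f_src, lambd), GradReverse.apply(f_tgt, lambd)])
    dom_lbl = torch.cat([torch.zeros(len(x_src), dtype=torch.long), torch.ones(len(x_tgt), dtype=torch.long)])
    dom_loss = ce(domain_head(dom_in), dom_lbl)
    loss = cls_loss + dom_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random tensors standing in for EEG-optical-flow inputs.
x_s, y_s = torch.randn(8, 2, 32, 32), torch.randint(0, 4, (8,))
x_t = torch.randn(8, 2, 32, 32)
print(train_step(x_s, y_s, x_t))
```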
Learning with tree-based tensor formats
Title | Learning with tree-based tensor formats |
Authors | Erwan Grelier, Anthony Nouy, Mathilde Chevreuil |
Abstract | This paper is concerned with the approximation of high-dimensional functions in a statistical learning setting, by empirical risk minimization over model classes of functions in tree-based tensor format. These are particular classes of rank-structured functions that can be seen as deep neural networks with a sparse architecture related to the tree and multilinear activation functions. For learning in a given model class, we exploit the fact that tree-based tensor formats are multilinear models and recast the problem of risk minimization over a nonlinear set into a succession of learning problems with linear models. Suitable changes of representation yield numerically stable learning problems and make it possible to exploit sparsity. For high-dimensional problems or when only a small data set is available, the selection of a good model class is a critical issue. For a given tree, the selection of the tuple of tree-based ranks that minimizes the risk is a combinatorial problem. Here, we propose a rank adaptation strategy which in practice provides good convergence of the risk as a function of the model class complexity. Finding a good tree is also a combinatorial problem, which can be related to the choice of a particular sparse architecture for deep neural networks. Here, we propose a stochastic algorithm for minimizing the complexity of the representation of a given function over a class of trees with a given arity, allowing changes in the topology of the tree. This tree optimization algorithm is then included in a learning scheme that successively adapts the tree and the corresponding tree-based ranks. Contrary to classical learning algorithms for nonlinear model classes, the proposed algorithms are numerically stable, reliable, and require only a low level of expertise from the user. |
Tasks | |
Published | 2018-11-11 |
URL | http://arxiv.org/abs/1811.04455v2 |
PDF | http://arxiv.org/pdf/1811.04455v2.pdf |
PWC | https://paperswithcode.com/paper/learning-with-tree-based-tensor-formats |
Repo | |
Framework | |
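Tree-based tensor formats are multilinear in each parameter block, so risk minimization can be cycled through a sequence of linear least-squares problems. The sketch below illustrates that alternating-linearisation principle on the simplest possible case, a rank-1 bilinear model $f(x,y) \approx (a^\top \phi(x))(b^\top \psi(y))$, with alternating least squares in NumPy; the feature maps and target function are made up, and this is an illustration of the idea, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature maps on each variable (polynomial bases, purely illustrative).
def phi(x): return np.stack([np.ones_like(x), x, x**2], axis=1)
def psi(y): return np.stack([np.ones_like(y), y, y**2], axis=1)

# Training data for f(x, y) = (1 + 2x)(3 - y^2) plus a little noise.
x, y = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)
z = (1 + 2 * x) * (3 - y**2) + 0.01 * rng.normal(size=200)

a, b = rng.normal(size=3), rng.normal(size=3)
for _ in range(50):
    # With b fixed, the model is linear in a: an ordinary least-squares problem.
    A = phi(x) * (psi(y) @ b)[:, None]
    a, *_ = np.linalg.lstsq(A, z, rcond=None)
    # With a fixed, the model is linear in b.
    B = psi(y) * (phi(x) @ a)[:, None]
    b, *_ = np.linalg.lstsq(B, z, rcond=None)

pred = (phi(x) @ a) * (psi(y) @ b)
print("training RMSE:", np.sqrt(np.mean((pred - z) ** 2)))
```

In the tree-based setting the same trick is applied to each tensor in the tree in turn, with the other tensors held fixed.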
Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning
Title | Transferrable Feature and Projection Learning with Class Hierarchy for Zero-Shot Learning |
Authors | Aoxue Li, Zhiwu Lu, Jiechao Guan, Tao Xiang, Liwei Wang, Ji-Rong Wen |
Abstract | Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to unseen ones so that the latter can be recognised without any training samples. This is made possible by learning a projection function between a feature space and a semantic space (e.g. attribute space). When the seen and unseen classes are considered as two domains, a large domain gap often exists, which challenges ZSL. Inspired by the fact that an unseen class is not exactly 'unseen' if it belongs to the same superclass as a seen class, we propose a novel inductive ZSL model that leverages superclasses as the bridge between seen and unseen classes to narrow the domain gap. Specifically, we first build a class hierarchy of multiple superclass layers and a single class layer, where the superclasses are automatically generated by data-driven clustering over the semantic representations of all seen and unseen class names. We then exploit the superclasses from the class hierarchy to tackle the domain gap challenge in two aspects: deep feature learning and projection function learning. First, to narrow the domain gap in the feature space, we integrate a recurrent neural network (RNN) defined with the superclasses into a convolutional neural network (CNN), in order to enforce the superclass hierarchy. Second, to further learn a transferrable projection function for ZSL, a novel projection function learning method is proposed by exploiting the superclasses to align the two domains. Importantly, our transferrable feature and projection learning methods can be easily extended to a closely related task – few-shot learning (FSL). Extensive experiments show that the proposed model significantly outperforms the state-of-the-art alternatives in both ZSL and FSL tasks. |
Tasks | Few-Shot Learning, Zero-Shot Learning |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08329v1 |
PDF | http://arxiv.org/pdf/1810.08329v1.pdf |
PWC | https://paperswithcode.com/paper/transferrable-feature-and-projection-learning |
Repo | |
Framework | |
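One concrete reading of the superclass construction: cluster the semantic embeddings of all class names (seen and unseen together) to obtain superclasses, then at test time pick the superclass first and search for the nearest class only inside it. The sketch below uses a tiny hand-rolled k-means; the embeddings, the two-stage inference rule, and the class names are illustrative assumptions, not the paper's model.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means on class embeddings; returns centroids and assignments."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = X[assign == j].mean(axis=0)
    return centroids, assign

# Toy semantic embeddings (rows = classes, seen and unseen mixed).
class_emb = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
class_names = ["horse", "zebra", "sparrow", "finch"]   # "zebra" and "finch" are unseen

super_centroids, super_of_class = kmeans(class_emb, k=2)

def predict(img_semantic):
    """Two-stage inference: pick the superclass first, then the nearest class inside it."""
    sc = np.argmin(((super_centroids - img_semantic) ** 2).sum(-1))
    members = np.where(super_of_class == sc)[0]
    best = members[np.argmin(((class_emb[members] - img_semantic) ** 2).sum(-1))]
    return class_names[best]

print(predict(np.array([0.82, 0.18])))   # lands in the equine superclass, nearest class "zebra"
```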
A New Algorithmic Decision for Categorical Syllogisms via Caroll’s Diagrams
Title | A New Algorithmic Decision for Categorical Syllogisms via Caroll’s Diagrams |
Authors | Arif Gursoy, Ibrahim Senturk, Tahsin Oner |
Abstract | In this paper, we deal with a calculus system SLCD (Syllogistic Logic with Carroll Diagrams), which gives a formal approach to logical reasoning with diagrams for representing the fundamental Aristotelian categorical propositions, and we show that they are closed under the syllogistic criterion of inference, namely the deletion of the middle term. The system is implemented so that the formalism comprises both bilateral and trilateral diagrammatic representations simultaneously, together with a naive algorithmic nature, and no specific knowledge or special ability is needed to understand or use it. Consequently, we give an effective algorithm, based on SLCD, that determines whether a syllogistic reasoning is valid. |
Tasks | |
Published | 2018-02-08 |
URL | http://arxiv.org/abs/1802.04127v3 |
PDF | http://arxiv.org/pdf/1802.04127v3.pdf |
PWC | https://paperswithcode.com/paper/a-new-algorithmic-decision-for-categorical |
Repo | |
Framework | |
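SLCD itself is diagrammatic; as an executable stand-in, the validity of a categorical syllogism can be checked semantically by brute force: enumerate every assignment of a small universe to the three terms and look for a counterexample in which both premises hold but the conclusion fails. This is a model-checking sketch, not the paper's diagram calculus; the statement encoding is an assumption of this sketch.

```python
from itertools import product

def holds(stmt, S, P):
    kind = stmt[0]
    if kind == "all":      return S <= P            # All S are P
    if kind == "no":       return not (S & P)       # No S are P
    if kind == "some":     return bool(S & P)       # Some S are P
    if kind == "some-not": return bool(S - P)       # Some S are not P

def valid(premise1, premise2, conclusion, universe=range(3)):
    """Search every assignment of the universe to the terms M, P, S for a counterexample.
    Three elements suffice: at most three existential witnesses are ever needed."""
    for bits in product(range(8), repeat=len(universe)):
        M = {i for i, b in zip(universe, bits) if b & 1}
        P = {i for i, b in zip(universe, bits) if b & 2}
        S = {i for i, b in zip(universe, bits) if b & 4}
        terms = {"M": M, "P": P, "S": S}
        def ev(st): return holds(st, terms[st[1]], terms[st[2]])
        if ev(premise1) and ev(premise2) and not ev(conclusion):
            return False
    return True

# Barbara: All M are P, All S are M, therefore All S are P  -> valid
print(valid(("all", "M", "P"), ("all", "S", "M"), ("all", "S", "P")))
# Undistributed middle: All P are M, All S are M, therefore All S are P  -> invalid
print(valid(("all", "P", "M"), ("all", "S", "M"), ("all", "S", "P")))
```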
Sequence Training of DNN Acoustic Models With Natural Gradient
Title | Sequence Training of DNN Acoustic Models With Natural Gradient |
Authors | Adnan Haider, Philip C. Woodland |
Abstract | Deep Neural Network (DNN) acoustic models often use discriminative sequence training that optimises an objective function that better approximates the word error rate (WER) than frame-based training. Sequence training is normally implemented using Stochastic Gradient Descent (SGD) or Hessian Free (HF) training. This paper proposes an alternative batch style optimisation framework that employs a Natural Gradient (NG) approach to traverse through the parameter space. By correcting the gradient according to the local curvature of the KL-divergence, the NG optimisation process converges more quickly than HF. Furthermore, the proposed NG approach can be applied to any sequence discriminative training criterion. The efficacy of the NG method is shown using experiments on a Multi-Genre Broadcast (MGB) transcription task that demonstrates both the computational efficiency and the accuracy of the resulting DNN models. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02204v1 |
PDF | http://arxiv.org/pdf/1804.02204v1.pdf |
PWC | https://paperswithcode.com/paper/sequence-training-of-dnn-acoustic-models-with |
Repo | |
Framework | |
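At its core, the natural-gradient update preconditions the ordinary gradient with the inverse Fisher information, i.e. the local curvature of the KL divergence: $\theta \leftarrow \theta - \eta F^{-1} g$. The NumPy sketch below shows that update for a small logistic-regression model, with the Fisher matrix estimated empirically and damped for stability; it illustrates the update rule only, not the paper's lattice-based sequence-training criterion or batch optimisation framework.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)
theta = np.zeros(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

for step in range(20):
    p = sigmoid(X @ theta)
    grad = X.T @ (p - y) / len(y)                    # gradient of the cross-entropy loss
    # Empirical Fisher: average outer product of per-sample gradients.
    per_sample = X * (p - y)[:, None]
    fisher = per_sample.T @ per_sample / len(y) + 1e-3 * np.eye(3)   # damping
    theta -= 1.0 * np.linalg.solve(fisher, grad)     # natural-gradient step
print("accuracy:", np.mean((sigmoid(X @ theta) > 0.5) == y.astype(bool)))
```

Because the step is rescaled by the curvature, the same learning rate works across parameters with very different sensitivities, which is the property the NG approach exploits.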
Independence of Sources in Social Networks
Title | Independence of Sources in Social Networks |
Authors | Manel Chehibi, Mouna Chebbah, Arnaud Martin |
Abstract | Online social networks are studied more and more. The links between users of a social network are important and have to be well qualified in order, for example, to detect communities and find influencers. In this paper, we present an approach based on the theory of belief functions to estimate the degrees of cognitive independence between users in a social network. We experiment with the proposed method on a large amount of data gathered from the Twitter social network. |
Tasks | |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.09959v1 |
PDF | http://arxiv.org/pdf/1806.09959v1.pdf |
PWC | https://paperswithcode.com/paper/independence-of-sources-in-social-networks |
Repo | |
Framework | |
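The theory of belief functions underlying the paper combines evidence from sources with Dempster's rule; a small sketch of that rule (mass functions over subsets of a frame, conjunctive combination with conflict renormalisation) is given below. The independence-degree estimation itself is specific to the paper and is not reproduced; the frame and masses are invented for illustration.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule of combination for two mass functions.
    Masses are dicts mapping frozensets (focal elements) to weights summing to 1."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources are incompatible")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Two sources expressing beliefs about whether a user is an "influencer".
frame = frozenset({"influencer", "ordinary"})
m_a = {frozenset({"influencer"}): 0.6, frame: 0.4}          # 0.4 left uncommitted
m_b = {frozenset({"influencer"}): 0.5, frozenset({"ordinary"}): 0.2, frame: 0.3}
print(dempster(m_a, m_b))
```

Combining dependent sources as if they were independent over-counts shared evidence, which is why estimating the degree of independence between users matters before fusing their messages.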
Gradient descent aligns the layers of deep linear networks
Title | Gradient descent aligns the layers of deep linear networks |
Authors | Ziwei Ji, Matus Telgarsky |
Abstract | This paper establishes risk convergence and asymptotic weight matrix alignment — a form of implicit regularization — of gradient flow and gradient descent when applied to deep linear networks on linearly separable data. In more detail, for gradient flow applied to strictly decreasing loss functions (with similar results for gradient descent with particular decreasing step sizes): (i) the risk converges to 0; (ii) the normalized i-th weight matrix asymptotically equals its rank-1 approximation $u_iv_i^{\top}$; (iii) these rank-1 matrices are aligned across layers, meaning $v_{i+1}^{\top}u_i\to1$. In the case of the logistic loss (binary cross entropy), more can be said: the linear function induced by the network — the product of its weight matrices — converges to the same direction as the maximum margin solution. This last property was identified in prior work, but only under assumptions on gradient descent which here are implied by the alignment phenomenon. |
Tasks | |
Published | 2018-10-04 |
URL | http://arxiv.org/abs/1810.02032v2 |
PDF | http://arxiv.org/pdf/1810.02032v2.pdf |
PWC | https://paperswithcode.com/paper/gradient-descent-aligns-the-layers-of-deep |
Repo | |
Framework | |
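The alignment claim (iii) is easy to probe numerically: train a deep linear network with gradient descent on separable data under the logistic loss and track $v_{i+1}^{\top}u_i$ for adjacent layers, where $u_i, v_i$ are the top singular vectors of $W_i$ (up to the usual sign ambiguity). The NumPy sketch below does that for a three-layer network; dimensions, initialisation scale, step size, and iteration count are arbitrary choices, and the alignment is only expected to emerge gradually as training proceeds.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n = 5, 4, 100
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)                               # linearly separable labels in {-1, +1}

W1 = 0.1 * rng.normal(size=(h, d))
W2 = 0.1 * rng.normal(size=(h, h))
W3 = 0.1 * rng.normal(size=(1, h))

lr = 0.1
for step in range(20000):
    P = W3 @ W2 @ W1                                  # end-to-end linear map, shape (1, d)
    margins = y * (X @ P.ravel())
    s = -y / (1.0 + np.exp(np.clip(margins, -30, 30)))  # d(logistic loss)/d(margin)
    dP = (s[:, None] * X).mean(axis=0)[None, :]       # gradient w.r.t. the product P
    g1 = (W3 @ W2).T @ dP                             # chain rule through the factorisation
    g2 = W3.T @ dP @ W1.T
    g3 = dP @ (W2 @ W1).T
    W1 -= lr * g1; W2 -= lr * g2; W3 -= lr * g3

def top_vectors(W):
    U, _, Vt = np.linalg.svd(W)
    return U[:, 0], Vt[0]

u1, _ = top_vectors(W1)
u2, v2 = top_vectors(W2)
_, v3 = top_vectors(W3)
print("|v2.u1| =", abs(v2 @ u1), "  |v3.u2| =", abs(v3 @ u2))  # expected to drift toward 1
```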
An optimized system to solve text-based CAPTCHA
Title | An optimized system to solve text-based CAPTCHA |
Authors | Ye Wang, Mi Lu |
Abstract | CAPTCHA (Completely Automated Public Turing test to Tell Computers and Humans Apart) can be used to protect data from auto bots. Countless kinds of CAPTCHAs have been designed, and the text-based scheme is the most frequently used because it is the most convenient and user-friendly [bursztein2011text]. Currently, different types of CAPTCHAs require their own segmentation step to isolate single characters, because there are numerous different ways to segment them. Our goal is to defeat the CAPTCHA, so the CAPTCHAs first need to be split character by character. There is no single segmentation algorithm that obtains the divided characters in all kinds of examples, which means that we have to treat the segmentation of each type individually. In this paper, we build a whole system to defeat CAPTCHAs and achieve state-of-the-art performance. In detail, we present a self-adaptive algorithm to segment different kinds of characters optimally, and then utilize both existing methods and our own convolutional neural network as an extra classifier. Results are provided showing how well our system works towards defeating these CAPTCHAs. |
Tasks | |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.07202v1 |
PDF | http://arxiv.org/pdf/1806.07202v1.pdf |
PWC | https://paperswithcode.com/paper/an-optimized-system-to-solve-text-based |
Repo | |
Framework | |
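The self-adaptive segmentation itself is not spelled out in the abstract; a common starting point is vertical-projection segmentation, which sums the ink in each pixel column of the binarised image and cuts at empty valleys between characters. The NumPy sketch below shows that baseline only; the width threshold is arbitrary, and real CAPTCHAs with touching or overlapping glyphs need the adaptive refinements the paper is about.

```python
import numpy as np

def segment_columns(binary_img, min_width=3):
    """Split a binarised CAPTCHA (1 = ink, 0 = background) into character slices
    by cutting at runs of empty pixel columns."""
    ink_per_col = binary_img.sum(axis=0)
    is_gap = ink_per_col == 0
    segments, start = [], None
    for col, gap in enumerate(is_gap):
        if not gap and start is None:
            start = col
        elif gap and start is not None:
            if col - start >= min_width:
                segments.append(binary_img[:, start:col])
            start = None
    if start is not None and binary_img.shape[1] - start >= min_width:
        segments.append(binary_img[:, start:])
    return segments

# Toy image: three "characters" of ink separated by blank columns.
img = np.zeros((10, 24), dtype=int)
img[2:8, 1:6] = 1
img[2:8, 9:14] = 1
img[2:8, 17:22] = 1
print([seg.shape for seg in segment_columns(img)])   # three slices of width 5
```

Each slice would then be passed to a character classifier, such as the CNN the abstract describes.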
Linear Algebra and Duality of Neural Networks
Title | Linear Algebra and Duality of Neural Networks |
Authors | Galin Georgiev |
Abstract | Bases, mappings, projections and metrics natural for neural network training are introduced, and a graph-theoretical interpretation is offered. Non-Gaussianity emerges naturally, even in relatively simple datasets. Training statistics, hierarchies and energies are analyzed from a physics point of view. A duality between observables (for example, pixels) and observations is established. The relationship between exact and numerical solutions is studied. Physics and financial mathematics interpretations of a key problem are offered. Examples support all new concepts. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04711v2 |
PDF | http://arxiv.org/pdf/1809.04711v2.pdf |
PWC | https://paperswithcode.com/paper/linear-algebra-and-duality-of-neural-networks |
Repo | |
Framework | |
Enhancing clinical MRI Perfusion maps with data-driven maps of complementary nature for lesion outcome prediction
Title | Enhancing clinical MRI Perfusion maps with data-driven maps of complementary nature for lesion outcome prediction |
Authors | Adriano Pinto, Sergio Pereira, Raphael Meier, Victor Alves, Roland Wiest, Carlos A. Silva, Mauricio Reyes |
Abstract | Stroke is the second most common cause of death in developed countries, where rapid clinical intervention can have a major impact on a patient’s life. To perform the revascularization procedure, the decision making of physicians considers its risks and benefits based on multi-modal MRI and clinical experience. Therefore, automatic prediction of the ischemic stroke lesion outcome has the potential to assist the physician towards a better stroke assessment and information about tissue outcome. Typically, automatic methods consider the information of the standard kinetic models of diffusion and perfusion MRI (e.g. Tmax, TTP, MTT, rCBF, rCBV) to perform lesion outcome prediction. In this work, we propose a deep learning method to fuse this information with an automated data selection of the raw 4D PWI image information, followed by a data-driven deep-learning modeling of the underlying blood flow hemodynamics. We demonstrate the ability of the proposed approach to improve prediction of tissue at risk before therapy, as compared to only using the standard clinical perfusion maps, hence suggesting the potential benefits of the proposed data-driven raw perfusion data modelling approach. |
Tasks | Decision Making |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04413v1 |
PDF | http://arxiv.org/pdf/1806.04413v1.pdf |
PWC | https://paperswithcode.com/paper/enhancing-clinical-mri-perfusion-maps-with |
Repo | |
Framework | |
Deep learning based automatic segmentation of lumbosacral nerves on non-contrast CT for radiographic evaluation: a pilot study
Title | Deep learning based automatic segmentation of lumbosacral nerves on non-contrast CT for radiographic evaluation: a pilot study |
Authors | Guoxin Fan, Huaqing Liu, Zhenhua Wu, Yufeng Li, Chaobo Feng, Dongdong Wang, Jie Luo, Xiaofei Guan, William M. Wells III, Shisheng He |
Abstract | Background and objective: Combined evaluation of lumbosacral structures (e.g. nerves, bone) on multimodal radiographic images is routinely conducted prior to spinal surgery and interventional procedures. Generally, magnetic resonance imaging is conducted to differentiate nerves, while computed tomography (CT) is used to observe bony structures. The aim of this study is to investigate the feasibility of automatically segmenting lumbosacral structures (e.g. nerves & bone) on non-contrast CT with deep learning. Methods: A total of 50 cases with spinal CT were manually labeled for lumbosacral nerves and bone with Slicer 4.8. The training:validation:testing ratio is 32:8:10. A 3D-Unet is adopted to build the model SPINECT for automatically segmenting lumbosacral structures. Pixel accuracy, IoU, and Dice score are used to assess the segmentation performance. Results: The testing results reveal successful segmentation of lumbosacral bone and nerve on CT. The average pixel accuracy is 0.940 for bone and 0.918 for nerve. The average IoU is 0.897 for bone and 0.827 for nerve. The Dice score is 0.945 for bone and 0.905 for nerve. Conclusions: This pilot study indicated that automatic segmentation of lumbosacral structures (nerves and bone) on non-contrast CT is feasible and may have utility for planning and navigating spinal interventions and surgery. |
Tasks | Computed Tomography (CT) |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11843v1 |
PDF | http://arxiv.org/pdf/1811.11843v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-based-automatic-segmentation-of |
Repo | |
Framework | |
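The three reported metrics (pixel accuracy, IoU, Dice) are standard and easy to state precisely; the NumPy sketch below computes them for a single binary mask, which is enough to reproduce numbers like those quoted when bone and nerve are evaluated as separate channels. The toy volumes are placeholders, not CT data.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel accuracy, IoU and Dice for binary masks (1 = structure, 0 = background)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    pixel_acc = (pred == gt).mean()
    iou = tp / union if union else 1.0
    dice = 2 * tp / (pred.sum() + gt.sum()) if (pred.sum() + gt.sum()) else 1.0
    return pixel_acc, iou, dice

# Toy 3D volumes standing in for CT label maps (e.g. the nerve channel).
gt = np.zeros((4, 8, 8), dtype=int); gt[:, 2:6, 2:6] = 1
pred = np.zeros_like(gt);            pred[:, 2:6, 3:7] = 1
print("pixel acc %.3f, IoU %.3f, Dice %.3f" % segmentation_metrics(pred, gt))
```

Dice weights the intersection twice relative to the mask sizes, so it is always at least as large as IoU, which matches the pattern of the figures reported above.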