Paper Group ANR 1116
An Unsupervised Model with Attention Autoencoders for Question Retrieval
Title | An Unsupervised Model with Attention Autoencoders for Question Retrieval |
Authors | Minghua Zhang, Yunfang Wu |
Abstract | Question retrieval is a crucial subtask for community question answering. Previous research focuses on supervised models, which depend heavily on training data and manual feature engineering. In this paper, we propose a novel unsupervised framework, namely the reduced attentive matching network (RAMN), to compute semantic matching between two questions. Our RAMN integrates the deep semantic representations, the shallow lexical mismatching information and the initial rank produced by an external search engine. For the first time, we propose attention autoencoders to generate semantic representations of questions. In addition, we employ lexical mismatching to capture surface matching between two questions, which is derived from the importance of each word in a question. We conduct experiments on the open CQA datasets of SemEval-2016 and SemEval-2017. The experimental results show that our unsupervised model obtains performance comparable to the state-of-the-art supervised methods in SemEval-2016 Task 3, and outperforms the best system in SemEval-2017 Task 3 by a wide margin. |
Tasks | Community Question Answering, Feature Engineering, Question Answering |
Published | 2018-03-09 |
URL | http://arxiv.org/abs/1803.03476v1 |
http://arxiv.org/pdf/1803.03476v1.pdf | |
PWC | https://paperswithcode.com/paper/an-unsupervised-model-with-attention |
Repo | |
Framework | |
Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing
Title | Reinforcement Learning for Adaptive Caching with Dynamic Storage Pricing |
Authors | Alireza Sadeghi, Fatemeh Sheikholeslami, Antonio G. Marques, Georgios B. Giannakis |
Abstract | Small base stations (SBs) of fifth-generation (5G) cellular networks are envisioned to have storage devices to locally serve requests for reusable and popular contents by caching them at the edge of the network, close to the end users. The ultimate goal is to shift part of the predictable load on the back-haul links from on-peak to off-peak periods, contributing to a better overall network performance and service experience. To equip the SBs with efficient fetch-cache decision-making schemes operating in dynamic settings, this paper introduces simple but flexible generic time-varying fetching and caching costs, which are then used to formulate a constrained minimization of the aggregate cost across files and time. Since caching decisions per time slot influence the content availability in future slots, the novel formulation for optimal fetch-cache decisions falls into the class of dynamic programming. Under this generic formulation, first by considering stationary distributions for the costs and file popularities, an efficient reinforcement learning-based solver known as the value iteration algorithm can be used to solve the emerging optimization problem. Later, it is shown that practical limitations on cache capacity can be handled using a particular instance of the generic dynamic pricing formulation. Under this setting, to provide a light-weight online solver for the corresponding optimization, the well-known reinforcement learning algorithm, Q-learning, is employed to find optimal fetch-cache decisions. Numerical tests corroborating the merits of the proposed approach wrap up the paper. |
Tasks | Decision Making, Q-Learning |
Published | 2018-12-17 |
URL | http://arxiv.org/abs/1812.08593v2 |
http://arxiv.org/pdf/1812.08593v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-adaptive-caching |
Repo | |
Framework | |
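The Q-learning step mentioned in the abstract above can be illustrated with a small tabular sketch. This is not the authors' formulation: the single-file setting, the cost values, the assumption that the file is requested in every slot, and all names (`FETCH_COST`, `CACHE_COST`, `step`, `q_learning`) are hypothetical choices made only to show how a cost-minimizing Q-learning update drives a fetch-cache decision.

```python
import random

# Hypothetical toy setting (not from the paper): one file, requested every slot.
FETCH_COST = 5.0   # assumed cost of fetching over the back-haul when not cached
CACHE_COST = 1.0   # assumed cost of keeping the file cached for the next slot
GAMMA = 0.9        # discount factor
ALPHA = 0.1        # learning rate


def step(state, action):
    """state: 1 if the file is cached now; action: 1 to cache it for the next slot."""
    cost = FETCH_COST * (1 - state) + CACHE_COST * action
    return cost, action  # the next state equals the caching decision


def q_learning(num_steps=5000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action], estimates of discounted cost
    state = 0
    for _ in range(num_steps):
        # Epsilon-greedy choice; "greedy" means lowest estimated cost.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = min((0, 1), key=lambda a: q[state][a])
        cost, next_state = step(state, action)
        # Q-learning update for a cost-minimization problem.
        target = cost + GAMMA * min(q[next_state])
        q[state][action] += ALPHA * (target - q[state][action])
        state = next_state
    return q


q_table = q_learning()
policy = [min((0, 1), key=lambda a: q_table[s][a]) for s in (0, 1)]
print("greedy caching decision per state:", policy)
```

Under these assumed costs, keeping the file cached is cheaper than refetching it in every slot, so the learned greedy policy should prefer caching in both states. Time-varying prices and multiple files, as described in the abstract, would enlarge the state space but leave the update rule unchanged.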
Building Jiminy Cricket: An Architecture for Moral Agreements Among Stakeholders
Title | Building Jiminy Cricket: An Architecture for Moral Agreements Among Stakeholders |
Authors | Beishui Liao, Marija Slavkovik, Leendert van der Torre |
Abstract | An autonomous system is constructed by a manufacturer, operates in a society subject to norms and laws, and is interacting with end-users. We address the challenge of how the moral values and views of all stakeholders can be integrated and reflected in the moral behaviour of the autonomous system. We propose an artificial moral agent architecture that uses techniques from normative systems and formal argumentation to reach moral agreements among stakeholders. We show how our architecture can be used not only for ethical practical reasoning and collaborative decision-making, but also for the explanation of such moral behavior. |
Tasks | Decision Making |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04741v2 |
http://arxiv.org/pdf/1812.04741v2.pdf | |
PWC | https://paperswithcode.com/paper/building-jiminy-cricket-an-architecture-for |
Repo | |
Framework | |
Semi-supervised Skin Lesion Segmentation via Transformation Consistent Self-ensembling Model
Title | Semi-supervised Skin Lesion Segmentation via Transformation Consistent Self-ensembling Model |
Authors | Xiaomeng Li, Lequan Yu, Hao Chen, Chi-Wing Fu, Pheng-Ann Heng |
Abstract | Automatic skin lesion segmentation on dermoscopic images is an essential component in computer-aided diagnosis of melanoma. Recently, many fully supervised deep learning based methods have been proposed for automatic skin lesion segmentation. However, these approaches require massive pixel-wise annotation from experienced dermatologists, which is very costly and time-consuming. In this paper, we present a novel semi-supervised method for skin lesion segmentation by leveraging both labeled and unlabeled data. The network is optimized by the weighted combination of a common supervised loss for labeled inputs only and a regularization loss for both labeled and unlabeled data. To utilize the unlabeled data, our method encourages consistent predictions of the network-in-training for the same input under different regularizations. Aiming at the semi-supervised segmentation problem, we enhance the effect of regularization for pixel-level predictions by introducing a transformation-consistent scheme, including rotation and flipping, in our self-ensembling model. With only 300 labeled training samples, our method sets a new record on the benchmark of the International Skin Imaging Collaboration (ISIC) 2017 skin lesion segmentation challenge. Such a result clearly surpasses fully supervised state-of-the-art methods that are trained with 2000 labeled samples. |
Tasks | Lesion Segmentation |
Published | 2018-08-12 |
URL | http://arxiv.org/abs/1808.03887v1 |
http://arxiv.org/pdf/1808.03887v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-skin-lesion-segmentation-via |
Repo | |
Framework | |
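The transformation-consistent idea in the abstract above (predictions should agree whether rotation and flipping are applied before or after the network) can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the paper's model: `toy_model` is a hypothetical stand-in for a segmentation network (one 3x3 filter followed by a sigmoid), and the mean-squared difference is an assumed choice of consistency measure.

```python
import numpy as np


def toy_model(image, weights):
    """Hypothetical stand-in for a segmentation network: 3x3 filter + sigmoid."""
    height, width = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((height, width), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += weights[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return 1.0 / (1.0 + np.exp(-out))


def transform(array, k, flip):
    """Rotate by k * 90 degrees, then optionally flip left-right."""
    rotated = np.rot90(array, k)
    return np.fliplr(rotated) if flip else rotated


def consistency_loss(image, weights, k, flip):
    """Mean-squared gap between transform(model(x)) and model(transform(x))."""
    pred_then_transform = transform(toy_model(image, weights), k, flip)
    transform_then_pred = toy_model(transform(image, k, flip), weights)
    return float(np.mean((pred_then_transform - transform_then_pred) ** 2))


rng = np.random.default_rng(0)
image = rng.random((8, 8))
symmetric = np.full((3, 3), 1.0 / 9.0)   # filter unchanged by rotation and flipping
asymmetric = rng.normal(size=(3, 3))     # generic filter, not equivariant

loss_sym = consistency_loss(image, symmetric, k=1, flip=True)
loss_asym = consistency_loss(image, asymmetric, k=1, flip=True)
print(loss_sym, loss_asym)
```

A filter that is itself symmetric gives a loss of essentially zero, while a generic filter does not. Used as a regularizer, this kind of term needs no labels, which is why it can be evaluated on unlabeled images.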
Learning how to be robust: Deep polynomial regression
Title | Learning how to be robust: Deep polynomial regression |
Authors | Juan-Manuel Perez-Rua, Tomas Crivelli, Patrick Bouthemy, Patrick Perez |
Abstract | Polynomial regression is a recurrent problem with a large number of applications. In computer vision it often appears in motion analysis. Whatever the application, standard methods for regression of polynomial models tend to deliver biased results when the input data is heavily contaminated by outliers. Moreover, the problem is even harder when outliers have strong structure. Departing from problem-tailored heuristics for robust estimation of parametric models, we explore deep convolutional neural networks. Our work aims to find a generic approach for training deep regression models without the explicit need for supervised annotation. We bypass the need for a tailored loss function on the regression parameters by attaching to our model a differentiable hard-wired decoder corresponding to the polynomial operation at hand. We demonstrate the value of our findings by comparing with standard robust regression methods. Furthermore, we demonstrate how to use such models for a real computer vision problem, i.e., video stabilization. The qualitative and quantitative experiments show that neural networks are able to learn robustness for general polynomial regression, with results that clearly surpass the scores of traditional robust estimation methods. |
Tasks | |
Published | 2018-04-17 |
URL | http://arxiv.org/abs/1804.06504v2 |
http://arxiv.org/pdf/1804.06504v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-how-to-be-robust-deep-polynomial |
Repo | |
Framework | |
Deep Pepper: Expert Iteration based Chess agent in the Reinforcement Learning Setting
Title | Deep Pepper: Expert Iteration based Chess agent in the Reinforcement Learning Setting |
Authors | Sai Krishna G. V., Kyle Goyette, Ahmad Chamseddine, Breandan Considine |
Abstract | An almost-perfect chess playing agent has been a long standing challenge in the field of Artificial Intelligence. Some of the recent advances demonstrate we are approaching that goal. In this project, we provide methods for faster training of self-play style algorithms, mathematical details of the algorithm used, various potential future directions, and discuss most of the relevant work in the area of computer chess. Deep Pepper uses embedded knowledge to accelerate the training of the chess engine over a “tabula rasa” system such as Alpha Zero. We also release our code to promote further research. |
Tasks | |
Published | 2018-06-02 |
URL | http://arxiv.org/abs/1806.00683v2 |
http://arxiv.org/pdf/1806.00683v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-pepper-expert-iteration-based-chess |
Repo | |
Framework | |
Towards Unsupervised Single-Channel Blind Source Separation using Adversarial Pair Unmix-and-Remix
Title | Towards Unsupervised Single-Channel Blind Source Separation using Adversarial Pair Unmix-and-Remix |
Authors | Yedid Hoshen |
Abstract | Blind single-channel source separation is a long standing signal processing challenge. Many methods were proposed to solve this task utilizing multiple signal priors such as low rank, sparsity, temporal continuity etc. The recent advance of generative adversarial models presented new opportunities in signal regression tasks. The power of adversarial training however has not yet been realized for blind source separation tasks. In this work, we propose a novel method for blind source separation (BSS) using adversarial methods. We rely on the independence of sources for creating adversarial constraints on pairs of approximately separated sources, which ensure good separation. Experiments are carried out on image sources validating the good performance of our approach, and presenting our method as a promising approach for solving BSS for general signals. |
Tasks | |
Published | 2018-12-14 |
URL | https://arxiv.org/abs/1812.07504v2 |
https://arxiv.org/pdf/1812.07504v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-unsupervised-single-channel-blind |
Repo | |
Framework | |
A New Backpropagation Algorithm without Gradient Descent
Title | A New Backpropagation Algorithm without Gradient Descent |
Authors | Varun Ranganathan, S. Natarajan |
Abstract | The backpropagation algorithm, which was originally introduced in the 1970s, is the workhorse of learning in neural networks. It makes use of the well-known machine learning algorithm Gradient Descent, a first-order iterative optimization algorithm for finding the minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point. In this paper, we develop an alternative to backpropagation that does not use Gradient Descent; instead, we devise a new algorithm to find the error in the weights and biases of an artificial neuron using the Moore-Penrose pseudo-inverse. Numerical studies and experiments performed on various datasets are used to verify the working of this alternative algorithm. |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1802.00027v1 |
http://arxiv.org/pdf/1802.00027v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-backpropagation-algorithm-without |
Repo | |
Framework | |
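As background for the abstract above, the following sketch shows how the Moore-Penrose pseudo-inverse can recover the weights and bias of a single linear neuron in one step, without gradient descent. It is not the algorithm proposed in the paper, whose details are not given here; the synthetic data and the names `true_weights` and `true_bias` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, noise-free data for one linear neuron (assumed for illustration).
inputs = rng.normal(size=(50, 3))
true_weights = np.array([2.0, -1.0, 0.5])
true_bias = 0.3
targets = inputs @ true_weights + true_bias

# Append a column of ones so the bias is estimated together with the weights.
design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])

# The pseudo-inverse gives the least-squares solution of design @ solution = targets.
solution = np.linalg.pinv(design) @ targets
weights, bias = solution[:3], solution[3]
print(weights, bias)
```

Because the data are noise-free and the design matrix has full column rank, the estimated parameters match the generating ones up to floating-point error. With a nonlinear activation or several layers, a closed-form solve of this kind no longer applies directly.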
A Jointly Learned Context-Aware Place of Interest Embedding for Trip Recommendations
Title | A Jointly Learned Context-Aware Place of Interest Embedding for Trip Recommendations |
Authors | Jiayuan He, Jianzhong Qi, Kotagiri Ramamohanarao |
Abstract | Trip recommendation is an important location-based service that helps relieve users of the time and effort of trip planning. It aims to recommend a sequence of places of interest (POIs) for a user to visit that maximizes the user’s satisfaction. When adding a POI to a recommended trip, it is essential to understand the context of the recommendation, including the POI popularity, other POIs co-occurring in the trip, and the preferences of the user. These contextual factors are learned separately in existing studies, while in reality they jointly affect a user’s choice of a POI to visit. In this study, we propose a POI embedding model to jointly learn the impact of these contextual factors. We call the learned POI embedding a context-aware POI embedding. To showcase the effectiveness of this embedding, we apply it to generate trip recommendations given a user and a time budget. We propose two trip recommendation algorithms based on our context-aware POI embedding. The first algorithm finds the exact optimal trip by transforming and solving the trip recommendation problem as an integer linear programming problem. To achieve high computational efficiency, the second algorithm finds a heuristically optimal trip based on adaptive large neighborhood search. We perform extensive experiments on real datasets. The results show that our proposed algorithms consistently outperform state-of-the-art algorithms in trip recommendation quality, with an advantage of up to 43% in F1-score. |
Tasks | |
Published | 2018-08-24 |
URL | http://arxiv.org/abs/1808.08023v1 |
http://arxiv.org/pdf/1808.08023v1.pdf | |
PWC | https://paperswithcode.com/paper/a-jointly-learned-context-aware-place-of |
Repo | |
Framework | |
Learning a face space for experiments on human identity
Title | Learning a face space for experiments on human identity |
Authors | Jordan W. Suchow, Joshua C. Peterson, Thomas L. Griffiths |
Abstract | Generative models of human identity and appearance have broad applicability to behavioral science and technology, but the exquisite sensitivity of human face perception means that their utility hinges on the alignment of the model’s representation to human psychological representations and the photorealism of the generated images. Meeting these requirements is an exacting task, and existing models of human identity and appearance are often unworkably abstract, artificial, uncanny, or biased. Here, we use a variational autoencoder with an autoregressive decoder to learn a face space from a uniquely diverse dataset of portraits that control much of the variation irrelevant to human identity and appearance. Our method generates photorealistic portraits of fictive identities with a smooth, navigable latent space. We validate our model’s alignment with human sensitivities by introducing a psychophysical Turing test for images, which humans mostly fail. Lastly, we demonstrate an initial application of our model to the problem of fast search in mental space to obtain detailed “police sketches” in a small number of trials. |
Tasks | |
Published | 2018-05-19 |
URL | http://arxiv.org/abs/1805.07653v1 |
http://arxiv.org/pdf/1805.07653v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-face-space-for-experiments-on-1 |
Repo | |
Framework | |
The Hidden Vulnerability of Distributed Learning in Byzantium
Title | The Hidden Vulnerability of Distributed Learning in Byzantium |
Authors | El Mahdi El Mhamdi, Rachid Guerraoui, Sébastien Rouault |
Abstract | While machine learning is going through an era of celebrated success, concerns have been raised about the vulnerability of its backbone: stochastic gradient descent (SGD). Recent approaches have been proposed to ensure the robustness of distributed SGD against adversarial (Byzantine) workers sending poisoned gradients during the training phase. Some of these approaches have been proven Byzantine-resilient: they ensure the convergence of SGD despite the presence of a minority of adversarial workers. We show in this paper that convergence is not enough. In high dimension $d \gg 1$, an adversary can build on the loss function’s non-convexity to make SGD converge to ineffective models. More precisely, we bring to light that existing Byzantine-resilient schemes leave a margin of poisoning of $\Omega\left(f(d)\right)$, where $f(d)$ increases at least like $\sqrt{d}$. Based on this leeway, we build a simple attack, and experimentally show its strong to utmost effectiveness on CIFAR-10 and MNIST. We introduce Bulyan, and prove it significantly reduces the attacker's leeway to a narrow $O\left(\frac{1}{\sqrt{d}}\right)$ bound. We empirically show that Bulyan does not suffer the fragility of existing aggregation rules and, at a reasonable cost in terms of required batch size, achieves convergence as if only non-Byzantine gradients had been used to update the model. |
Tasks | |
Published | 2018-02-22 |
URL | http://arxiv.org/abs/1802.07927v2 |
http://arxiv.org/pdf/1802.07927v2.pdf | |
PWC | https://paperswithcode.com/paper/the-hidden-vulnerability-of-distributed |
Repo | |
Framework | |
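To make the setting in the abstract above concrete, the sketch below contrasts plain averaging of worker gradients with a coordinate-wise trimmed mean when a minority of workers send poisoned values. This is a generic robust aggregation rule chosen for illustration; it is not Bulyan and not the attack described in the paper. The worker counts, noise level and the name `trimmed_mean` are assumptions.

```python
import numpy as np


def trimmed_mean(gradients, num_trim):
    """Coordinate-wise mean after dropping the num_trim lowest and highest values.

    gradients: array of shape (num_workers, dimension).
    """
    num_workers = gradients.shape[0]
    if 2 * num_trim >= num_workers:
        raise ValueError("num_trim too large for the number of workers")
    sorted_grads = np.sort(gradients, axis=0)          # sort each coordinate separately
    kept = sorted_grads[num_trim:num_workers - num_trim]
    return kept.mean(axis=0)


rng = np.random.default_rng(0)
honest = np.ones((8, 4)) + 0.01 * rng.normal(size=(8, 4))  # 8 honest workers
byzantine = np.full((2, 4), 1e6)                            # 2 poisoned gradients
grads = np.vstack([honest, byzantine])

plain = grads.mean(axis=0)
robust = trimmed_mean(grads, num_trim=2)
print("plain mean:  ", plain)
print("trimmed mean:", robust)
```

Plain averaging is dominated by the two extreme submissions, while the trimmed mean stays close to the honest gradients. The abstract's point is that rules of this general kind can still leave room for subtler poisoning in high dimension, which this toy example does not exercise.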
Computational Red Teaming in a Sudoku Solving Context: Neural Network Based Skill Representation and Acquisition
Title | Computational Red Teaming in a Sudoku Solving Context: Neural Network Based Skill Representation and Acquisition |
Authors | George Leu, Hussein Abbass |
Abstract | In this paper we provide an insight into skill representation, which is seen as an essential part of the skill assessment stage in the Computational Red Teaming process. Skill representation is demonstrated in the context of the Sudoku puzzle, for which the real human skills used in Sudoku solving, along with their acquisition, are represented computationally in a cognitively plausible manner, by using feed-forward neural networks with back-propagation and supervised learning. The neural network based skills are then coupled with a hard-coded constraint propagation computational Sudoku solver, in which the solving sequence is kept hard-coded and the skills are represented through neural networks. The paper demonstrates that the modified solver can achieve different levels of proficiency, depending on the number of skills acquired through the neural networks. Results are encouraging for developing more complex skill and skill acquisition models usable in general frameworks related to the skill assessment aspect of Computational Red Teaming. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09660v1 |
http://arxiv.org/pdf/1802.09660v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-red-teaming-in-a-sudoku-solving |
Repo | |
Framework | |
Edge-Based Blur Kernel Estimation Using Sparse Representation and Self-Similarity
Title | Edge-Based Blur Kernel Estimation Using Sparse Representation and Self-Similarity |
Authors | Jing Yu, Zhenchun Chang, Chuangbai Xiao |
Abstract | Blind image deconvolution is the problem of recovering the latent image from a single observed blurry image when the blur kernel is unknown. In this paper, we propose an edge-based blur kernel estimation method for blind motion deconvolution. In our previous work, we incorporated both sparse representation and self-similarity of image patches as priors into our blind deconvolution model to regularize the recovery of the latent image. Since almost any natural image has properties of sparsity and multi-scale self-similarity, we construct a sparsity regularizer and a cross-scale non-local regularizer based on our patch priors. It has been observed that our regularizers often favor sharp images over blurry ones only for image patches of the salient edges, and thus we define an edge mask to locate the salient edges to which we want to apply our regularizers. Experimental results on both simulated and real blurry images demonstrate that our method outperforms existing state-of-the-art blind deblurring methods even when handling very large blurs, thanks to the use of the edge mask. |
Tasks | Deblurring, Image Deconvolution |
Published | 2018-11-17 |
URL | http://arxiv.org/abs/1811.07161v1 |
http://arxiv.org/pdf/1811.07161v1.pdf | |
PWC | https://paperswithcode.com/paper/edge-based-blur-kernel-estimation-using |
Repo | |
Framework | |
Detection of Alzheimers Disease from MRI using Convolutional Neural Network with Tensorflow
Title | Detection of Alzheimers Disease from MRI using Convolutional Neural Network with Tensorflow |
Authors | Gururaj Awate, Sunil Bangare, G Pradeepini, S Patil |
Abstract | Nowadays, due to tremendous improvements in high performance computing, it has become easier to train neural networks. We intend to take advantage of this situation and apply this technology to solving real-world problems. There is a need for automatic diagnosis of certain diseases from medical images, which could help doctors and radiologists decide on further action towards treating the illness. We chose Alzheimer's disease for this purpose. Alzheimer's disease is the leading cause of dementia and memory loss. It is caused by atrophy of certain brain regions and by brain cell death. MRI scans reveal this information, but the atrophied regions differ from person to person, which makes the diagnosis trickier and often leads to misdiagnosis by doctors and radiologists. The dataset used for this project is provided by OASIS; it contains over 400 subjects, 100 of whom have mild to severe dementia, and is supplemented by MMSE and CDR standards of diagnosis in the same context. Convolutional Neural Networks (CNNs) are a hybrid of kernel convolutions and neural networks. Kernel convolution is a technique that uses filters to recognize and segment images based on features. Neural networks consist of neurons, loosely based on the neurons of the human brain; each neuron represents a single classifier, and the neurons are interconnected by weights, have different biases and are activated by activation functions. By using Convolutional Neural Networks, the problem can be solved with a minimal error rate. The technologies we intend to use are libraries such as CUDA and cuDNN, which exploit the GPU and its many cores for parallel computing to train models with high performance. |
Tasks | |
Published | 2018-06-26 |
URL | https://arxiv.org/abs/1806.10170v2 |
https://arxiv.org/pdf/1806.10170v2.pdf | |
PWC | https://paperswithcode.com/paper/detection-of-alzheimers-disease-from-mri-1 |
Repo | |
Framework | |
Applications of Artificial Intelligence to Network Security
Title | Applications of Artificial Intelligence to Network Security |
Authors | Alberto Perez Veiga |
Abstract | Attacks on networks are becoming more complex and sophisticated every day. Beyond the so-called script kiddies and hacking newbies, there is a myriad of professional attackers seeking to make serious profits by infiltrating corporate networks. Hostile governments, big corporations and mafias are constantly increasing their resources and skills in cybercrime in order to spy, steal or cause damage more effectively. Traditional approaches to network security seem to be hitting their limits, and the need for a smarter approach to threat detection is being recognized. This paper provides an introduction to the need for the evolution of cyber security techniques and to how Artificial Intelligence could be applied to help solve some of the problems. It also provides a high-level overview of some state-of-the-art AI network security techniques, and finishes by analysing the foreseeable future of the application of AI to network security. |
Tasks | |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.09992v1 |
http://arxiv.org/pdf/1803.09992v1.pdf | |
PWC | https://paperswithcode.com/paper/applications-of-artificial-intelligence-to |
Repo | |
Framework | |