Paper Group ANR 68
Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach. Forecasting Corn Yield with Machine Learning Ensembles. Learning from Noisy Similar and Dissimilar Data. SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration. Optimal Pricing of Internet of Things: A Machine Learning Approach. Netw …
Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
Title | Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach |
Authors | Carlos Fernandez, Foster Provost, Xintian Han |
Abstract | Lack of understanding of the decisions made by model-based AI systems is an important barrier for their adoption. We examine counterfactual explanations as an alternative for explaining AI decisions. The counterfactual approach defines an explanation as a set of the system’s data inputs that causally drives the decision (meaning that removing them changes the decision) and is irreducible (meaning that removing any subset of the inputs in the explanation does not change the decision). We generalize previous work on counterfactual explanations, resulting in a framework that (a) is model-agnostic, (b) can address features with arbitrary data types, (c) can explain decisions made by complex AI systems that incorporate multiple models, and (d) is scalable to large numbers of features. We also propose a heuristic procedure to find the most useful explanations depending on the context. We contrast counterfactual explanations with another alternative: methods that explain model predictions by weighting features according to their importance (e.g., SHAP, LIME). This paper presents two fundamental reasons why explaining model predictions is not the same as explaining the decisions made using those predictions, suggesting we should carefully consider whether importance-weight explanations are well-suited to explain decisions made by AI systems. Specifically, we show that (1) features that have a large importance weight for a model prediction may not actually affect the corresponding decision, and (2) importance weights are insufficient to communicate whether and how features influence system decisions. We demonstrate this with several examples, including three detailed case studies that compare the counterfactual approach with SHAP to illustrate various conditions under which counterfactual explanations explain data-driven decisions better than feature importance weights. |
Tasks | Feature Importance |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07417v2 |
https://arxiv.org/pdf/2001.07417v2.pdf | |
PWC | https://paperswithcode.com/paper/explaining-data-driven-decisions-made-by-ai |
Repo | |
Framework | |
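The abstract's notion of a causal, irreducible explanation can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the linear model, the baseline-replacement notion of "removing" a feature, and the greedy search are all assumptions made here for concreteness.

```python
# Toy sketch of a counterfactual explanation (illustrative names and model).
# "Removing" a feature means replacing it with a baseline value; a set of
# inputs explains the decision if removing all of them flips it (causal)
# and removing any proper subset does not (irreducible).

def decision(x, weights, threshold=1.0):
    return sum(w * v for w, v in zip(weights, x)) > threshold

def mask(x, baseline, removed):
    return [baseline[i] if i in removed else v for i, v in enumerate(x)]

def counterfactual_explanation(x, baseline, weights):
    removed = set(range(len(x)))          # removing everything flips the decision
    for i in range(len(x)):               # greedily add features back
        trial = removed - {i}
        if not decision(mask(x, baseline, trial), weights):
            removed = trial               # still flipped without i: i is not needed
    return sorted(removed)

weights = [2.0, 0.1, 1.5]
x, baseline = [1.0, 1.0, 0.0], [0.0, 0.0, 0.0]
explanation = counterfactual_explanation(x, baseline, weights)
```

Here only feature 0 actually drives the positive decision, even though features 1 and 2 also carry nonzero weights, which mirrors the paper's point that importance weights and decision-flipping features can differ.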
Forecasting Corn Yield with Machine Learning Ensembles
Title | Forecasting Corn Yield with Machine Learning Ensembles |
Authors | Mohsen Shahhosseini, Guiping Hu, Sotirios V. Archontoulis |
Abstract | The emergence of new technologies to synthesize and analyze big data with high-performance computing has increased our capacity to predict crop yields more accurately. Recent research has shown that machine learning (ML) can provide reasonable predictions faster and with higher flexibility than simulation crop modeling. The earlier the prediction during the growing season the better, but this has not been thoroughly investigated, as previous studies considered all data available to predict yields. This paper provides a machine learning based framework to forecast corn yields in three US Corn Belt states (Illinois, Indiana, and Iowa) considering complete and partial in-season weather knowledge. Several ensemble models are designed using a blocked sequential procedure to generate out-of-bag predictions. The forecasts are made at the county level and aggregated to the agricultural-district and state levels. Results show that ensemble models based on a weighted average of the base learners outperform individual models. Specifically, the proposed ensemble model achieved the best prediction accuracy (RRMSE of 7.8%) and the lowest mean bias error (-6.06 bu/acre) compared to the other developed models. Comparing our forecasts with the literature demonstrates the superiority of the proposed ensemble model. Results from the scenario of having partial in-season weather knowledge reveal that decent yield forecasts can be made as early as June 1st. To find the marginal effect of each input feature on the forecasts made by the proposed ensemble model, a methodology is suggested that forms the basis for computing feature importance for the ensemble model. The findings suggest that weather features corresponding to weeks 18-24 (May 1st to June 1st) are the most important input features. |
Tasks | Feature Importance |
Published | 2020-01-18 |
URL | https://arxiv.org/abs/2001.09055v1 |
https://arxiv.org/pdf/2001.09055v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-corn-yield-with-machine-learning |
Repo | |
Framework | |
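The weighted-average ensemble described above can be sketched minimally. The inverse-RMSE weighting and toy data here are illustrative assumptions; the paper derives its weights from a blocked sequential out-of-bag procedure rather than this simple rule.

```python
import numpy as np

# Minimal sketch of a weighted-average ensemble: base learners are weighted
# inversely to their out-of-bag RMSE (illustrative weighting, toy data).

def inverse_error_weights(oob_preds, y):
    rmse = np.sqrt(((oob_preds - y[:, None]) ** 2).mean(axis=0))
    w = 1.0 / rmse
    return w / w.sum()                      # normalize to a convex combination

def ensemble_predict(preds, weights):
    return preds @ weights

y = np.array([10.0, 12.0, 11.0, 13.0])       # toy county-level yields
oob = np.column_stack([y + 0.5, y - 2.0])    # learner 1 is more accurate
w = inverse_error_weights(oob, y)
```

With these toy errors the accurate learner gets weight 0.8 and, because the two biases have opposite signs, the weighted average happens to cancel them exactly.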
Learning from Noisy Similar and Dissimilar Data
Title | Learning from Noisy Similar and Dissimilar Data |
Authors | Soham Dan, Han Bao, Masashi Sugiyama |
Abstract | With the widespread use of machine learning for classification, it becomes increasingly important to be able to use weaker kinds of supervision for tasks in which it is hard to obtain standard labeled data. One such kind of supervision is provided pairwise—in the form of Similar (S) pairs (if two examples belong to the same class) and Dissimilar (D) pairs (if two examples belong to different classes). This kind of supervision is realistic in privacy-sensitive domains. Although this problem has been looked at recently, it is unclear how to learn from such supervision under label noise, which is very common when the supervision is crowd-sourced. In this paper, we close this gap and demonstrate how to learn a classifier from noisy S and D labeled data. We perform a detailed investigation of this problem under two realistic noise models and propose two algorithms to learn from noisy S-D data. We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data. Finally, we perform experiments on synthetic and real world datasets and show our noise-informed algorithms outperform noise-blind baselines in learning from noisy pairwise data. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00995v1 |
https://arxiv.org/pdf/2002.00995v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-noisy-similar-and-dissimilar |
Repo | |
Framework | |
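As a flavor of what "noise-informed" learning means, the classic unbiased loss-correction trick for symmetrically flipped labels with known flip rate rho can be written in two lines. This is a general illustration, not the paper's actual estimators for noisy S-D data, which are more involved.

```python
import math

# Unbiased noise-corrected logistic loss for labels flipped symmetrically
# with known rate rho < 0.5: in expectation over the flips, the corrected
# loss on noisy labels equals the clean loss on true labels.

def logistic_loss(t):
    return math.log1p(math.exp(-t))

def corrected_loss(t, rho):
    return ((1 - rho) * logistic_loss(t) - rho * logistic_loss(-t)) / (1 - 2 * rho)
```

The unbiasedness can be checked directly: mixing the corrected loss over a flip with probability rho recovers the clean loss.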
SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration
Title | SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration |
Authors | Nuno A. C. Henriques, Helder Coelho, Leonel Garcia-Marques |
Abstract | Humans are affective and cognitive beings who rely on memories for their individual and social identities. Human dyadic bonds also require some common ground, such as empathetic behaviour, for better interaction. In this sense, research studies involving human-agent interaction should draw on affect, cognition, and memory integration. The developed artificial agent system (SensAI+Expanse) includes machine learning algorithms, heuristics, and memory as cognition aids towards emotional valence prediction for the interacting human. Further, an adaptive empathy score is always present in order to engage the human in a recognisable interaction outcome. […] The agent is resilient in collecting data and adapts its cognitive processes to each human individual in a best-effort attempt at properly contextualised prediction. The current study makes use of this adaptive process, together with individual prediction models whose learning algorithm and evaluation metric were selected in a previous research study. The accomplished solution combines highly performant prediction, efficient energy use, and feature-importance explanation of the predicted probabilities. Results of the present study show evidence of significant differences in emotional valence behaviour between some age ranges and gender combinations. Therefore, this work contributes an artificially intelligent agent able to assist in cognitive science studies of affective disturbances by predicting human emotional valence contextualised in space and time. Moreover, it contributes learning processes and heuristics fit to the task, including economy of cognition and memory to cope with the environment. Finally, these contributions include age and gender neutrality in predicting emotional valence states in context, with very good performance for each individual. |
Tasks | Feature Importance |
Published | 2020-01-03 |
URL | https://arxiv.org/abs/2001.09746v3 |
https://arxiv.org/pdf/2001.09746v3.pdf | |
PWC | https://paperswithcode.com/paper/sensaiexpanse-emotional-valence-prediction |
Repo | |
Framework | |
Optimal Pricing of Internet of Things: A Machine Learning Approach
Title | Optimal Pricing of Internet of Things: A Machine Learning Approach |
Authors | Mohammad Abu Alsheikh, Dinh Thai Hoang, Dusit Niyato, Derek Leong, Ping Wang, Zhu Han |
Abstract | Internet of things (IoT) produces massive data from devices embedded with sensors. The IoT data allows creating profitable services using machine learning. However, previous research does not address the problem of optimal pricing and bundling of machine learning-based IoT services. In this paper, we define the data value and service quality from a machine learning perspective. We present an IoT market model which consists of data vendors selling data to service providers, and service providers offering IoT services to customers. Then, we introduce optimal pricing schemes for the standalone and bundled selling of IoT services. In standalone service sales, the service provider optimizes the size of bought data and service subscription fee to maximize its profit. For service bundles, the subscription fee and data sizes of the grouped IoT services are optimized to maximize the total profit of cooperative service providers. We show that bundling IoT services maximizes the profit of service providers compared to the standalone selling. For profit sharing of bundled services, we apply the concepts of core and Shapley solutions from cooperative game theory as efficient and fair allocations of payoffs among the cooperative service providers in the bundling coalition. |
Tasks | |
Published | 2020-02-14 |
URL | https://arxiv.org/abs/2002.05929v1 |
https://arxiv.org/pdf/2002.05929v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-pricing-of-internet-of-things-a |
Repo | |
Framework | |
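The standalone pricing problem described above can be sketched as a tiny optimization. All functional forms below (diminishing-returns quality, exponentially decaying demand, linear data cost) are illustrative assumptions, not the paper's market model, and the grid search stands in for the paper's analytical optimization.

```python
import math

# Toy standalone-service pricing: choose bought data size n and subscription
# fee p to maximize the provider's profit (illustrative functional forms).

def profit(p, n, data_price=0.1, market_size=100.0):
    quality = 1.0 - math.exp(-0.01 * n)            # diminishing returns in data size
    demand = market_size * quality * math.exp(-p)  # demand falls as the fee rises
    return demand * p - data_price * n             # revenue minus data cost

# brute-force grid search over fee and data size
best = max((profit(p / 10, n), p / 10, n)
           for p in range(1, 51) for n in range(0, 501, 10))
```

With these forms the revenue term factors as (quality-scaled constant) x p·e^-p, so the optimal fee is p = 1 regardless of the data size, while the optimal n balances marginal quality gains against the data price.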
Network of Steel: Neural Font Style Transfer from Heavy Metal to Corporate Logos
Title | Network of Steel: Neural Font Style Transfer from Heavy Metal to Corporate Logos |
Authors | Aram Ter-Sarkisov |
Abstract | We introduce a method for transferring style from the logos of heavy metal bands onto corporate logos using a VGG16 network. We establish the contribution of different layers and loss coefficients to the learning of style, minimization of artefacts and maintenance of readability of corporate logos. We find layers and loss coefficients that produce a good tradeoff between heavy metal style and corporate logo readability. This is the first step both towards sparse font style transfer and corporate logo decoration using generative networks. Heavy metal and corporate logos are very different artistically, in the way they emphasize emotions and readability, therefore training a model to fuse the two is an interesting problem. |
Tasks | Font Style Transfer, Style Transfer |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.03659v1 |
https://arxiv.org/pdf/2001.03659v1.pdf | |
PWC | https://paperswithcode.com/paper/network-of-steel-neural-font-style-transfer |
Repo | |
Framework | |
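Style in this kind of transfer is conventionally measured through Gram matrices of VGG16 feature maps. A minimal numpy sketch of that style loss follows; the feature shapes are illustrative, and the per-layer loss coefficients the paper tunes are omitted.

```python
import numpy as np

# Gram-matrix style loss over a convolutional feature map, the standard
# building block of neural style transfer (illustrative shapes).

def gram_matrix(features):
    c, h, w = features.shape            # (channels, height, width)
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)        # channel-by-channel correlations

def style_loss(feat_generated, feat_style):
    diff = gram_matrix(feat_generated) - gram_matrix(feat_style)
    return float((diff ** 2).mean())
```

The Gram matrix discards spatial layout and keeps only channel correlations, which is why it captures "style" (texture, stroke character) rather than the letterforms themselves.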
Bootstrapping a DQN Replay Memory with Synthetic Experiences
Title | Bootstrapping a DQN Replay Memory with Synthetic Experiences |
Authors | Wenzel Baron Pilar von Pilchau, Anthony Stein, Jörg Hähner |
Abstract | An important component of many Deep Reinforcement Learning algorithms is the Experience Replay, which serves as a storage mechanism, or memory, for past experiences. These experiences are used for training and help the agent stably find the perfect trajectory through the problem space. The classic Experience Replay, however, only makes use of the experiences actually collected, even though the stored samples bear great potential in the form of extractable knowledge about the problem. We present an algorithm that creates synthetic experiences in a nondeterministic discrete environment to assist the learner. The Interpolated Experience Replay is evaluated on the FrozenLake environment, and we show that it can help the agent learn faster and even better than the classic version. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01370v1 |
https://arxiv.org/pdf/2002.01370v1.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-a-dqn-replay-memory-with |
Repo | |
Framework | |
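The synthetic-experience idea can be sketched for a nondeterministic discrete environment: build an interpolated transition for a (state, action) pair by averaging the rewards seen so far and sampling a previously observed successor state. Class and method names here are illustrative, not the paper's code.

```python
import random
from collections import defaultdict

# Sketch of a replay buffer that can synthesize experiences by
# interpolating over stored outcomes for the same (state, action) pair.

class InterpolatedReplay:
    def __init__(self):
        self.buffer = []                    # real experiences
        self.outcomes = defaultdict(list)   # (s, a) -> [(reward, next_state)]

    def store(self, s, a, r, s2):
        self.buffer.append((s, a, r, s2))
        self.outcomes[(s, a)].append((r, s2))

    def synthesize(self, s, a, rng=random):
        seen = self.outcomes[(s, a)]
        avg_r = sum(r for r, _ in seen) / len(seen)   # interpolated reward
        s2 = rng.choice([s2 for _, s2 in seen])       # plausible successor
        return (s, a, avg_r, s2)
```

In a slippery environment like FrozenLake, the averaged reward smooths over stochastic outcomes that a single real sample cannot represent.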
Coded Federated Learning
Title | Coded Federated Learning |
Authors | Sagar Dhakal, Saurav Prakash, Yair Yona, Shilpa Talwar, Nageen Himayat |
Abstract | Federated learning is a method of training a global model from decentralized data distributed across client devices. Here, model parameters are computed locally by each client device and exchanged with a central server, which aggregates the local models for a global view, without requiring sharing of training data. The convergence performance of federated learning is severely impacted in heterogeneous computing platforms such as those at the wireless edge, where straggling computations and communication links can significantly limit timely model parameter updates. This paper develops a novel coded computing technique for federated learning to mitigate the impact of stragglers. In the proposed Coded Federated Learning (CFL) scheme, each client device privately generates parity training data and shares it with the central server only once at the start of the training phase. The central server can then preemptively perform redundant gradient computations on the composite parity data to compensate for the erased or delayed parameter updates. Our results show that CFL allows the global model to converge nearly four times faster when compared to an uncoded approach. |
Tasks | |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09574v1 |
https://arxiv.org/pdf/2002.09574v1.pdf | |
PWC | https://paperswithcode.com/paper/coded-federated-learning |
Repo | |
Framework | |
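A minimal sketch of the parity-data idea: instead of sharing raw examples X, a client shares random linear combinations G @ X (and G @ y), which the server can use for redundant gradient computations when clients straggle. The random Gaussian generator matrix here is an illustrative choice; the paper's encoding and its privacy analysis are more involved.

```python
import numpy as np

# Client-side parity-data generation: share linear combinations of the
# local training set rather than the raw examples (illustrative encoding).

def make_parity(X, y, n_parity, seed=0):
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n_parity, X.shape[0]))   # random generator matrix
    return G @ X, G @ y

X = np.arange(12, dtype=float).reshape(4, 3)   # 4 local examples, 3 features
y = np.array([1.0, 0.0, 1.0, 0.0])
Xp, yp = make_parity(X, y, n_parity=2)
```

Because gradients of linear models are themselves linear in the data, gradients computed on such parity rows can substitute for missing client updates.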
Time evolution of the characteristic and probability density function of diffusion processes via neural networks
Title | Time evolution of the characteristic and probability density function of diffusion processes via neural networks |
Authors | Wayne Isaac Tan Uy, Mircea Grigoriu |
Abstract | We investigate physics-informed neural network-based solutions of the PDE satisfied by the probability density function (pdf) of the state of a dynamical system subject to random forcing. Two alternatives for the PDE are considered: the Fokker-Planck equation and a PDE for the characteristic function (chf) of the state, both of which provide the same probabilistic information. Solving these PDEs using the finite element method is infeasible when the dimension of the state is larger than 3. We examine analytically and numerically the advantages and disadvantages of solving one PDE over the other. It is also demonstrated how prior information about the dynamical system can be exploited to design and simplify the neural network architecture. Numerical examples show that: 1) the neural network solution can approximate the target solution even for partial integro-differential equations and systems of PDEs, 2) solving either PDE using neural networks yields similar pdfs of the state, and 3) the solution to the PDE can be used to study the behavior of the state for different types of random forcing. |
Tasks | |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05437v1 |
https://arxiv.org/pdf/2001.05437v1.pdf | |
PWC | https://paperswithcode.com/paper/time-evolution-of-the-characteristic-and |
Repo | |
Framework | |
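The physics-informed loss penalizes the PDE residual at collocation points. As a concrete instance (not from the paper): for an Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW, the stationary Fokker-Planck equation is 0 = theta d(x p)/dx + (sigma^2/2) d^2 p/dx^2, whose exact solution is a centered Gaussian with variance sigma^2/(2 theta). The sketch checks that residual with finite differences; a real PINN would use automatic differentiation on the network output instead.

```python
import math

# Fokker-Planck residual for a stationary Ornstein-Uhlenbeck process,
# evaluated on the exact Gaussian solution via central finite differences.

theta, sigma, h = 1.0, 1.0, 1e-3

def p(x):  # stationary pdf: N(0, sigma^2 / (2 * theta))
    var = sigma ** 2 / (2 * theta)
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

def residual(x):
    d_xp = ((x + h) * p(x + h) - (x - h) * p(x - h)) / (2 * h)   # d(x p)/dx
    d2_p = (p(x + h) - 2 * p(x) + p(x - h)) / h ** 2             # d^2 p/dx^2
    return theta * d_xp + 0.5 * sigma ** 2 * d2_p

# mean squared residual over collocation points in [-2, 2]
loss = sum(residual(x / 10) ** 2 for x in range(-20, 21)) / 41
```

A PINN minimizes exactly this kind of loss over its trainable pdf approximation; on the true solution the residual vanishes up to discretization error.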
Self learning robot using real-time neural networks
Title | Self learning robot using real-time neural networks |
Authors | Chirag Gupta, Chikita Nangia, Chetan Kumar |
Abstract | With the advancements in high-volume, low-precision computational technology and applied research on cognitive, artificially intelligent heuristic systems, machine learning solutions through neural networks with real-time learning have seen immense interest in the research community as well as industry. This paper covers the research, development and experimental analysis of a neural network implemented on a robot with an arm, which evolves to learn to walk in a straight line or as required. The neural network learns using the algorithms of Gradient Descent and Backpropagation. Both the implementation and training of the neural network are done locally on the robot, on a Raspberry Pi 3, so that its learning process is completely independent. The neural network is first tested on a custom simulator developed in MATLAB and then implemented on the Raspberry Pi. Data at each generation of the evolving network are stored, and both mathematical and graphical analysis is performed on the data. The impact of factors like the learning rate and error tolerance on the learning process and final output is analyzed. |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.02103v1 |
https://arxiv.org/pdf/2001.02103v1.pdf | |
PWC | https://paperswithcode.com/paper/self-learning-robot-using-real-time-neural |
Repo | |
Framework | |
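The training loop the abstract describes, gradient descent with backpropagation, fits in a few lines of numpy. The XOR data, layer sizes, and learning rate below are illustrative stand-ins, not the robot's actual task.

```python
import numpy as np

# A minimal one-hidden-layer network trained with gradient descent and
# backpropagation (illustrative data and hyperparameters).

X = np.array([[0.0, 0.0, 1.0],   # inputs with a constant bias column
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])   # XOR target

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))
W2 = rng.standard_normal((4, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse():
    return float(((sigmoid(sigmoid(X @ W1) @ W2) - y) ** 2).mean())

loss_before = mse()
lr = 0.5
for _ in range(5000):
    h = sigmoid(X @ W1)                   # forward pass
    out = sigmoid(h @ W2)
    d_out = (out - y) * out * (1 - out)   # error signal at the output
    d_h = (d_out @ W2.T) * h * (1 - h)    # backpropagated to the hidden layer
    W2 -= lr * (h.T @ d_out)              # gradient descent updates
    W1 -= lr * (X.T @ d_h)

loss_after = mse()
```

The two quantities the paper studies, learning rate and error tolerance, correspond here to `lr` and to whatever threshold on `loss_after` one uses to stop training.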
Design of Capacity-Approaching Low-Density Parity-Check Codes using Recurrent Neural Networks
Title | Design of Capacity-Approaching Low-Density Parity-Check Codes using Recurrent Neural Networks |
Authors | Eleni Nisioti, Nikolaos Thomos |
Abstract | In this paper, we model Density Evolution (DE) using Recurrent Neural Networks (RNNs) with the aim of designing capacity-approaching Irregular Low-Density Parity-Check (LDPC) codes for binary erasure channels. In particular, we present a method for determining the coefficients of the degree distributions, characterizing the structure of an LDPC code. We refer to our RNN architecture as Neural Density Evolution (NDE) and determine the weights of the RNN that correspond to optimal designs by minimizing a loss function that enforces the properties of asymptotically optimal design, as well as the desired structural characteristics of the code. This renders the LDPC design process highly configurable, as constraints can be added to meet applications’ requirements by means of modifying the loss function. In order to train the RNN, we generate data corresponding to the expected channel noise. We analyze the complexity and optimality of NDE theoretically, and compare it with traditional design methods that employ differential evolution. Simulations illustrate that NDE improves upon differential evolution both in terms of asymptotic performance and complexity. Although we focus on asymptotic settings, we evaluate designs found by NDE for finite codeword lengths and observe that performance remains satisfactory across a variety of channels. |
Tasks | |
Published | 2020-01-05 |
URL | https://arxiv.org/abs/2001.01249v1 |
https://arxiv.org/pdf/2001.01249v1.pdf | |
PWC | https://paperswithcode.com/paper/design-of-capacity-approaching-low-density |
Repo | |
Framework | |
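The recursion that the NDE architecture unrolls is standard density evolution for the binary erasure channel: x_{l+1} = eps * lambda(1 - rho(1 - x_l)), where lambda and rho are the edge-perspective degree distribution polynomials. Shown here for the regular (3,6) ensemble (lambda(x) = x^2, rho(x) = x^5) as a concrete instance; the paper optimizes irregular distributions.

```python
# Density evolution on the binary erasure channel for a regular (3,6)
# LDPC ensemble: iterate the erasure probability passed along edges.

def de_erasure(eps, iters=200):
    x = eps
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** 5) ** 2   # eps * lambda(1 - rho(1 - x))
    return x
```

Below the ensemble's threshold (about 0.4294 for the (3,6) code) the erasure probability is driven to zero, while above it decoding stalls at a nonzero fixed point; the design problem is to shape lambda and rho so this threshold approaches channel capacity.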
Differential Privacy at Risk: Bridging Randomness and Privacy Budget
Title | Differential Privacy at Risk: Bridging Randomness and Privacy Budget |
Authors | Ashish Dandekar, Debabrota Basu, Stephane Bressan |
Abstract | The calibration of noise for a privacy-preserving mechanism depends on the sensitivity of the query and the prescribed privacy level. A data steward must make the non-trivial choice of a privacy level that balances the requirements of users and the monetary constraints of the business entity. We analyse the roles of the sources of randomness, namely the explicit randomness induced by the noise distribution and the implicit randomness induced by the data-generation distribution, that are involved in the design of a privacy-preserving mechanism. This finer analysis enables us to provide stronger privacy guarantees with quantifiable risks. Thus, we propose privacy at risk, a probabilistic calibration of privacy-preserving mechanisms. We provide a composition theorem that leverages privacy at risk. We instantiate the probabilistic calibration for the Laplace mechanism by providing analytical results. We also propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR-compliant business entity. The convexity of the proposed cost model leads to a unique fine-tuning of the privacy level that minimises the compensation budget. We show its effectiveness by illustrating a realistic scenario that avoids overestimation of the compensation budget by using privacy at risk for the Laplace mechanism. We quantitatively show that composition using the cost-optimal privacy at risk provides a stronger privacy guarantee than the classical advanced composition. |
Tasks | Calibration |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00973v1 |
https://arxiv.org/pdf/2003.00973v1.pdf | |
PWC | https://paperswithcode.com/paper/differential-privacy-at-risk-bridging |
Repo | |
Framework | |
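The Laplace mechanism the paper's probabilistic calibration analyses is simple to state: to release a query answer with L1-sensitivity `sensitivity` at privacy level epsilon, add Laplace noise with scale sensitivity / epsilon. The inverse-CDF sampler below is one standard way to draw that noise.

```python
import math
import random

# The Laplace mechanism for epsilon-differential privacy: noise scale is
# sensitivity / epsilon, sampled here by inverting the Laplace CDF.

def sample_laplace(scale, rng):
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    return true_answer + sample_laplace(sensitivity / epsilon, rng)

rng = random.Random(0)
noisy = [laplace_mechanism(100.0, 1.0, 1.0, rng) for _ in range(10000)]
```

The explicit randomness the paper refers to is exactly this noise distribution; its scale, and hence the mechanism's utility, is what the choice of privacy level controls.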
“Love is as Complex as Math”: Metaphor Generation System for Social Chatbot
Title | “Love is as Complex as Math”: Metaphor Generation System for Social Chatbot |
Authors | Danning Zheng, Ruihua Song, Tianran Hu, Hao Fu, Jin Zhou |
Abstract | With the wide adoption of intelligent chatbots in daily life, user demands for such systems have evolved from basic task-solving conversations to more casual and friend-like communication. To meet these needs and build an emotional bond with users, it is essential for social chatbots to incorporate more human-like and advanced linguistic features. In this paper, we investigate the use of metaphor, a rhetorical device commonly used by humans, in social chatbots. Our work first designs a metaphor generation framework, which generates topic-aware and novel figurative sentences. By embedding the framework into a chatbot system, we then enable the chatbot to communicate with users using figurative language. Human annotators validated the novelty and appropriateness of the generated metaphors. More importantly, we evaluate the effects of employing metaphors in human-chatbot conversations. Experiments indicate that our system effectively arouses user interest in communicating with our chatbot, resulting in significantly longer human-chatbot conversations. |
Tasks | Chatbot |
Published | 2020-01-03 |
URL | https://arxiv.org/abs/2001.00733v1 |
https://arxiv.org/pdf/2001.00733v1.pdf | |
PWC | https://paperswithcode.com/paper/love-is-as-complex-as-math-metaphor |
Repo | |
Framework | |
Multi-layer Representation Fusion for Neural Machine Translation
Title | Multi-layer Representation Fusion for Neural Machine Translation |
Authors | Qiang Wang, Fuxue Li, Tong Xiao, Yanyang Li, Yinqiao Li, Jingbo Zhu |
Abstract | Neural machine translation systems require a number of stacked layers for deep models. But the prediction depends on the sentence representation of the top-most layer, with no access to low-level representations. This makes it more difficult to train the model and poses a risk of losing information relevant to prediction. In this paper, we propose a multi-layer representation fusion (MLRF) approach to fusing stacked layers. In particular, we design three fusion functions to learn a better representation from the stack. Experimental results show that our approach yields improvements of 0.92 and 0.56 BLEU points over the strong Transformer baseline on the IWSLT German-English and NIST Chinese-English MT tasks respectively. The result is a new state of the art in German-English translation. |
Tasks | Machine Translation |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06714v1 |
https://arxiv.org/pdf/2002.06714v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-layer-representation-fusion-for-neural-2 |
Repo | |
Framework | |
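One of the simplest possible fusion functions over stacked layer outputs is a learned softmax-weighted sum of all layers' representations, rather than reading only the top layer. The sketch below is illustrative; the paper designs three specific fusion functions, which this does not reproduce.

```python
import numpy as np

# Softmax-weighted fusion of stacked layer representations
# (illustrative shapes: num_layers x seq_len x d_model).

def fuse_layers(layer_outputs, logits):
    w = np.exp(logits - logits.max())
    w = w / w.sum()                          # softmax over layers
    return np.tensordot(w, layer_outputs, axes=1)

layers = np.stack([np.full((5, 8), float(i)) for i in range(3)])
fused = fuse_layers(layers, np.zeros(3))     # equal weights: mean over layers
```

Because the weights are learned, the model can recover the vanilla top-layer behavior (by putting all mass on the last layer) or mix in low-level representations when that helps.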
Spatial-Adaptive Network for Single Image Denoising
Title | Spatial-Adaptive Network for Single Image Denoising |
Authors | Meng Chang, Qi Li, Huajun Feng, Zhihai Xu |
Abstract | Previous works have shown that convolutional neural networks can achieve good performance in image denoising tasks. However, limited by the locally rigid convolutional operation, these methods lead to oversmoothing artifacts. A deeper network structure could alleviate these problems, but at the cost of additional computational overhead. In this paper, we propose a novel spatial-adaptive denoising network (SADNet) for efficient single-image blind noise removal. To adapt to changes in spatial textures and edges, we design a residual spatial-adaptive block. Deformable convolution is introduced to sample the spatially correlated features for weighting. An encoder-decoder structure with a context block is introduced to capture multiscale information. With noise removed from coarse to fine, a high-quality noise-free image can be obtained. We apply our method to both synthetic and real noisy image datasets. The experimental results demonstrate that our method surpasses the state-of-the-art denoising methods both quantitatively and visually. |
Tasks | Denoising, Image Denoising |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10291v1 |
https://arxiv.org/pdf/2001.10291v1.pdf | |
PWC | https://paperswithcode.com/paper/spatial-adaptive-network-for-single-image |
Repo | |
Framework | |