Paper Group ANR 355
Learning to Rank from Samples of Variable Quality
Title | Learning to Rank from Samples of Variable Quality |
Authors | Mostafa Dehghani, Jaap Kamps |
Abstract | Training deep neural networks requires many training samples, but in practice training labels are expensive to obtain and may be of varying quality, as some may come from trusted expert labelers while others come from heuristics or other sources of weak supervision such as crowd-sourcing. This creates a fundamental quality-versus-quantity trade-off in the learning process. Do we learn from the small amount of high-quality data or the potentially large amount of weakly-labeled data? We argue that if the learner could somehow know and take the label quality into account when learning the data representation, we could get the best of both worlds. To this end, we introduce “fidelity-weighted learning” (FWL), a semi-supervised student-teacher approach for training deep neural networks using weakly-labeled data. FWL modulates the parameter updates to a student network (trained on the task we care about) on a per-sample basis, according to the posterior confidence of its label quality estimated by a teacher (who has access to the high-quality labels). Both student and teacher are learned from the data. We evaluate FWL on document ranking, where we outperform state-of-the-art alternative semi-supervised methods. |
Tasks | Document Ranking, Learning-To-Rank |
Published | 2018-06-21 |
URL | http://arxiv.org/abs/1806.08694v1 |
http://arxiv.org/pdf/1806.08694v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-rank-from-samples-of-variable |
Repo | |
Framework | |
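The per-sample modulation FWL describes can be sketched in a few lines: the student's gradient update is simply scaled by the teacher's confidence in each label. The toy linear student, data, and learning rate below are illustrative assumptions, not the paper's implementation.

```python
def fidelity_weighted_step(w, samples, lr=0.1):
    """One SGD step on squared loss, scaled per sample by confidence c."""
    for x, y, c in samples:           # (feature, weak label, teacher confidence)
        pred = w * x
        grad = 2 * (pred - y) * x     # gradient of (w*x - y)^2 w.r.t. w
        w -= lr * c * grad            # low-fidelity labels move the student less
    return w

# A trusted sample (c=1.0) and a noisy one the teacher distrusts (c=0.1):
w = fidelity_weighted_step(0.0, [(1.0, 2.0, 1.0), (1.0, -5.0, 0.1)])
```

The noisy label pulls the student only a tenth as hard as the trusted one, which is the quality-versus-quantity compromise the abstract argues for.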
HAR-Net:Fusing Deep Representation and Hand-crafted Features for Human Activity Recognition
Title | HAR-Net:Fusing Deep Representation and Hand-crafted Features for Human Activity Recognition |
Authors | Mingtao Dong, Jindong Han |
Abstract | Wearable computing and context awareness have recently become focal points of artificial intelligence research. One of the most appealing, as well as challenging, applications is Human Activity Recognition (HAR) on smartphones. Conventional HAR based on Support Vector Machines relies on subjective, manually extracted features. This approach is time- and energy-consuming, and its predictions suffer from the partial human view of which features should be extracted. With the rise of deep learning, artificial intelligence has been progressing toward maturity. This paper proposes HAR-Net, a new approach combining deep learning with traditional feature engineering to address HAR. The study used data collected by the gyroscopes and acceleration sensors of Android smartphones. The raw sensor data is fed into HAR-Net, which fuses hand-crafted features with high-level features extracted by a convolutional network to make predictions. The proposed method achieves accuracy 0.9% higher than the original MC-SVM approach. Experimental results on the UCI dataset demonstrate that fusing the two kinds of features can make up for the respective shortcomings of traditional feature engineering and deep learning techniques. |
Tasks | Activity Recognition, Feature Engineering, Human Activity Recognition |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10929v1 |
http://arxiv.org/pdf/1810.10929v1.pdf | |
PWC | https://paperswithcode.com/paper/har-netfusing-deep-representation-and-hand |
Repo | |
Framework | |
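The fusion HAR-Net performs amounts to concatenating hand-crafted window statistics with CNN-extracted features before classification. A minimal sketch, where the helper names, the chosen statistics, and the stand-in learned features are all our assumptions:

```python
import statistics

def hand_crafted_features(window):
    """Classic time-domain statistics used in SVM-based HAR pipelines."""
    return [statistics.mean(window), statistics.pstdev(window),
            max(window), min(window)]

def fuse(hand_crafted, learned):
    """The fusion step reduces to concatenating both feature vectors."""
    return hand_crafted + learned

window = [0.1, 0.5, 0.4, 0.2]          # one window of accelerometer readings
learned = [0.9, 0.3]                   # stand-in for CNN activations
features = fuse(hand_crafted_features(window), learned)
```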
Synthetic Dynamic PMU Data Generation: A Generative Adversarial Network Approach
Title | Synthetic Dynamic PMU Data Generation: A Generative Adversarial Network Approach |
Authors | Xiangtian Zheng, Bin Wang, Le Xie |
Abstract | This paper concerns the production of synthetic phasor measurement unit (PMU) data for research and education purposes. Because real PMU data are confidential and real power-system infrastructure information is not publicly accessible, the lack of credible, realistic data is a growing concern. Instead of constructing synthetic power grids and then producing synthetic PMU measurements through time-domain simulation, we propose a model-free approach that generates synthetic PMU data directly. We train a generative adversarial network (GAN) on real PMU data, which can then generate synthetic PMU data capturing the system's dynamic behaviors. To validate that sequential generation by the GAN mimics PMU data, we theoretically analyze the GAN's capacity to learn system dynamics. Further, by evaluating the synthetic PMU data with a proposed quantitative method, we verify the GAN's potential to synthesize realistic samples, while recognizing that the GAN model in this paper still has room to improve. Moreover, this is the first time such a generative model has been applied to synthesize PMU data. |
Tasks | |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03203v1 |
http://arxiv.org/pdf/1812.03203v1.pdf | |
PWC | https://paperswithcode.com/paper/synthetic-dynamic-pmu-data-generation-a |
Repo | |
Framework | |
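The paper's quantitative evaluation method is not detailed in the abstract. As one illustration of distribution-level validation of synthetic signals, low-order moments of a real and a synthetic trace can be compared; all names, signals, and thresholds below are our assumptions, not the paper's method:

```python
import math, random

def moment_distance(real, synthetic):
    """Absolute gaps in mean and standard deviation between two signals."""
    def stats(xs):
        m = sum(xs) / len(xs)
        return m, math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))
    (m1, s1), (m2, s2) = stats(real), stats(synthetic)
    return abs(m1 - m2), abs(s1 - s2)

random.seed(0)
real = [60.0 + 0.01 * math.sin(0.1 * t) for t in range(200)]  # toy frequency trace
synthetic = [x + random.gauss(0, 0.001) for x in real]        # near-perfect generator
dm, ds = moment_distance(real, synthetic)                     # both gaps are tiny
```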
The What, the Why, and the How of Artificial Explanations in Automated Decision-Making
Title | The What, the Why, and the How of Artificial Explanations in Automated Decision-Making |
Authors | Tarek R. Besold, Sara L. Uckelman |
Abstract | The increasing incorporation of Artificial Intelligence in the form of automated systems into decision-making procedures highlights not only the importance of decision theory for automated systems but also the need for these decision procedures to be explainable to the people involved in them. Traditional realist accounts of explanation, wherein explanation is a relation that holds (or does not hold) eternally between an explanans and an explanandum, are not adequate to account for the notion of explanation required for artificial decision procedures. We offer an alternative account of explanation as used in the context of automated decision-making that makes explanation an epistemic phenomenon, and one that is dependent on context. This account better captures the way that we talk about, and use, explanations and derived concepts, such as 'explanatory power', and also allows us to differentiate between reasons or causes on the one hand, which need not have an epistemic aspect, and explanations on the other, which do. Against this theoretical backdrop, we then review existing approaches to explanation in Artificial Intelligence and Machine Learning, and suggest desiderata that truly explainable decision systems should fulfill. |
Tasks | Decision Making |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.07074v1 |
http://arxiv.org/pdf/1808.07074v1.pdf | |
PWC | https://paperswithcode.com/paper/the-what-the-why-and-the-how-of-artificial |
Repo | |
Framework | |
End-to-end Deep Learning of Optical Fiber Communications
Title | End-to-end Deep Learning of Optical Fiber Communications |
Authors | Boris Karanov, Mathieu Chagnon, Félix Thouin, Tobias A. Eriksson, Henning Bülow, Domaniç Lavery, Polina Bayvel, Laurent Schmalen |
Abstract | In this paper, we implement an optical fiber communication system as an end-to-end deep neural network, including the complete chain of transmitter, channel model, and receiver. This approach enables the optimization of the transceiver in a single end-to-end process. We illustrate the benefits of this method by applying it to intensity modulation/direct detection (IM/DD) systems and show that we can achieve bit error rates below the 6.7% hard-decision forward error correction (HD-FEC) threshold. We model all componentry of the transmitter and receiver, as well as the fiber channel, and apply deep learning to find transmitter and receiver configurations minimizing the symbol error rate. We propose and verify in simulations a training method that yields robust and flexible transceivers that allow, without reconfiguration, reliable transmission over a large range of link dispersions. The results from end-to-end deep learning are successfully verified for the first time in an experiment. In particular, we achieve information rates of 42 Gb/s below the HD-FEC threshold at distances beyond 40 km. We find that our results outperform conventional IM/DD solutions based on 2- and 4-level pulse amplitude modulation (PAM2/PAM4) with feedforward equalization (FFE) at the receiver. Our study is the first step towards end-to-end deep learning-based optimization of optical fiber communication systems. |
Tasks | |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04097v3 |
http://arxiv.org/pdf/1804.04097v3.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-deep-learning-of-optical-fiber |
Repo | |
Framework | |
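For context on the HD-FEC threshold cited above: a pre-FEC bit error rate of up to 6.7% is still correctable by the hard-decision code. A hedged sketch, estimating the raw BER of threshold-detected PAM2 over an additive-Gaussian channel; the noise level and sample count are illustrative, not the paper's experimental setup:

```python
import random

def pam2_ber(n=100_000, noise=0.3, seed=1):
    """Estimate raw BER of threshold-detected PAM2 over a Gaussian channel."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n):
        bit = rng.randint(0, 1)
        rx = (1.0 if bit else -1.0) + rng.gauss(0.0, noise)  # noisy received sample
        errors += (rx > 0) != bool(bit)                      # sign detector
    return errors / n

ber = pam2_ber()   # at this SNR the raw BER sits well below 6.7e-2
```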
Linguistic data mining with complex networks: a stylometric-oriented approach
Title | Linguistic data mining with complex networks: a stylometric-oriented approach |
Authors | Tomasz Stanisz, Jarosław Kwapień, Stanisław Drożdż |
Abstract | By representing a text as a set of words and their co-occurrences, one obtains a word-adjacency network, a reduced representation of a given language sample. In this paper, the possibility of using this network representation to extract information about the individual language styles of literary texts is studied. By determining selected quantitative characteristics of the networks and applying machine learning algorithms, it is possible to distinguish between texts by different authors. Within the studied set of English and Polish texts, properly rescaled weighted clustering coefficients and weighted degrees of only a few nodes in the word-adjacency networks are sufficient to achieve authorship attribution accuracy above 90%. A correspondence between text authorship and word-adjacency network structure can therefore be found. The network representation makes it possible to distinguish individual language styles by comparing the way authors use particular words and punctuation marks. The presented approach can be viewed as a generalization of authorship attribution methods based on simple lexical features. Additionally, other network parameters are studied, both local and global, for both unweighted and weighted networks. Their potential to capture the diversity of writing styles is discussed, and some differences between the languages are observed. |
Tasks | |
Published | 2018-08-16 |
URL | http://arxiv.org/abs/1808.05439v2 |
http://arxiv.org/pdf/1808.05439v2.pdf | |
PWC | https://paperswithcode.com/paper/linguistic-data-mining-with-complex-networks |
Repo | |
Framework | |
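The word-adjacency construction is simple enough to sketch: nodes are words, edge weights count co-occurrences of consecutive tokens, and the weighted degree of a node is one of the stylometric features discussed. A minimal illustration; the tokenization and example sentence are ours:

```python
from collections import defaultdict

def word_adjacency(tokens):
    """Edge weights count co-occurrences of consecutive words (undirected)."""
    weights = defaultdict(int)
    for a, b in zip(tokens, tokens[1:]):
        weights[frozenset((a, b))] += 1
    return weights

def weighted_degree(weights, word):
    """Sum of edge weights incident to a node, a basic stylometric feature."""
    return sum(w for pair, w in weights.items() if word in pair)

tokens = "the cat saw the dog and the dog saw the cat".split()
net = word_adjacency(tokens)
deg_the = weighted_degree(net, "the")
```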
Towards whole-body CT Bone Segmentation
Title | Towards whole-body CT Bone Segmentation |
Authors | André Klein, Jan Warszawski, Jens Hillengaß, Klaus H. Maier-Hein |
Abstract | Bone segmentation from CT images is a task that has been worked on for decades. It is an important ingredient for several diagnostic and treatment-planning approaches and is relevant to various diseases. As high-quality manual and semi-automatic bone segmentation is very time-consuming, a reliable and fully automatic approach would be of great interest in many scenarios. In this publication, we propose a UNet-inspired architecture to address the task using deep learning. We evaluated the approach on whole-body CT scans of patients suffering from multiple myeloma. As the disease decomposes the bone, an accurate segmentation is of utmost importance for the evaluation of bone density, disease staging, and localization of focal lesions. The method was evaluated on an in-house dataset of 6000 2D image slices taken from 15 whole-body CT scans, achieving a Dice score of 0.96 and an IoU of 0.94. |
Tasks | Medical Image Segmentation |
Published | 2018-04-03 |
URL | http://arxiv.org/abs/1804.00908v1 |
http://arxiv.org/pdf/1804.00908v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-whole-body-ct-bone-segmentation |
Repo | |
Framework | |
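The two reported metrics are standard overlap measures; for a single pair of binary masks they are linked by IoU = Dice / (2 - Dice). A minimal sketch of both computations on flattened masks (the tiny example masks are ours):

```python
def dice_iou(pred, truth):
    """Overlap metrics for two flattened binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    dice = 2 * inter / (sum(pred) + sum(truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return dice, inter / union

pred  = [1, 1, 1, 0, 0, 1]
truth = [1, 1, 0, 0, 1, 1]
dice, iou = dice_iou(pred, truth)      # note dice / (2 - dice) equals iou
```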
Sequential Gating Ensemble Network for Noise Robust Multi-Scale Face Restoration
Title | Sequential Gating Ensemble Network for Noise Robust Multi-Scale Face Restoration |
Authors | Zhibo Chen, Jianxin Lin, Tiankuang Zhou, Feng Wu |
Abstract | Face restoration from low resolution and noise is important for applications of face analysis and recognition. However, most existing face restoration models omit the multi-scale issues of the face restoration problem, which remain not well solved. In this paper, we propose a Sequential Gating Ensemble Network (SGEN) for multi-scale, noise-robust face restoration. To endow the network with multi-scale representation ability, we first apply the principle of ensemble learning to the design of the SGEN architecture. SGEN aggregates multi-level base-encoders and base-decoders into the network, which equips it with multiple scales of receptive field. Instead of combining these base-en/decoders directly with non-sequential operations, SGEN treats base-en/decoders from different levels as sequential data. Specifically, visualization shows that SGEN learns to sequentially extract high-level information from base-encoders in a bottom-up manner and restore low-level information from base-decoders in a top-down manner. In addition, we propose a Sequential Gating Unit (SGU) to realize this bottom-up and top-down combination and selection of information. The SGU sequentially takes information from two different levels as inputs and decides the output based on one active input. Experimental results on a benchmark dataset demonstrate that SGEN is more effective at multi-scale human face restoration, with more image detail and less noise, than state-of-the-art image restoration models. When further trained with an adversarial scheme, SGEN also produces more visually pleasing results than other models under subjective evaluation. |
Tasks | Image Restoration |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.11834v1 |
http://arxiv.org/pdf/1812.11834v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-gating-ensemble-network-for-noise |
Repo | |
Framework | |
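The abstract's description of the SGU ("takes information from two different levels as inputs and decides the output based on one active input") suggests a gate computed from the active input alone. The scalar sigmoid parameterization below is our assumption for illustration, not the paper's actual unit:

```python
import math

def sgu(active, passive, w=1.0, b=0.0):
    """Mix two levels of information; the gate depends only on the active input."""
    g = 1.0 / (1.0 + math.exp(-(w * active + b)))   # sigmoid gate in (0, 1)
    return g * active + (1.0 - g) * passive

out = sgu(active=2.0, passive=-1.0)   # a confident active input dominates the mix
```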
Universal Successor Representations for Transfer Reinforcement Learning
Title | Universal Successor Representations for Transfer Reinforcement Learning |
Authors | Chen Ma, Junfeng Wen, Yoshua Bengio |
Abstract | The objective of transfer reinforcement learning is to generalize from a set of previous tasks to unseen new tasks. In this work, we focus on the transfer scenario where the dynamics among tasks are the same, but their goals differ. Although general value function (Sutton et al., 2011) has been shown to be useful for knowledge transfer, learning a universal value function can be challenging in practice. To attack this, we propose (1) to use universal successor representations (USR) to represent the transferable knowledge and (2) a USR approximator (USRA) that can be trained by interacting with the environment. Our experiments show that USR can be effectively applied to new tasks, and the agent initialized by the trained USRA can achieve the goal considerably faster than random initialization. |
Tasks | Transfer Learning, Transfer Reinforcement Learning |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.03758v1 |
http://arxiv.org/pdf/1804.03758v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-successor-representations-for |
Repo | |
Framework | |
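The successor representation underlying USR can be illustrated in tabular form: psi(s) accumulates expected discounted future state occupancies via the TD rule psi(s) <- phi(s) + gamma * psi(s'), and task rewards then enter only through a linear readout, which is what makes the representation transferable across goals. A toy two-state chain; the paper's USR/USRA use function approximation, which this sketch omits:

```python
gamma = 0.9
phi = [[1.0, 0.0], [0.0, 1.0]]        # one-hot state features
psi = [[0.0, 0.0], [0.0, 0.0]]        # successor features, one row per state

def td_update(s, s_next, lr=0.5):
    """psi(s) <- psi(s) + lr * (phi(s) + gamma * psi(s') - psi(s))."""
    for i in range(2):
        target = phi[s][i] + gamma * psi[s_next][i]
        psi[s][i] += lr * (target - psi[s][i])

for _ in range(200):                  # deterministic chain: 0 -> 1 -> 1 -> ...
    td_update(0, 1)
    td_update(1, 1)

# Fixed points: psi[1] -> [0, 1/(1-gamma)] = [0, 10] and psi[0] -> [1, 9].
# For any reward weights w, the value estimate is V(s) = dot(psi[s], w).
```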
Human-in-the-Loop Interpretability Prior
Title | Human-in-the-Loop Interpretability Prior |
Authors | Isaac Lage, Andrew Slavin Ross, Been Kim, Samuel J. Gershman, Finale Doshi-Velez |
Abstract | We often desire our models to be interpretable as well as accurate. Prior work on optimizing models for interpretability has relied on easy-to-quantify proxies for interpretability, such as sparsity or the number of operations required. In this work, we optimize for interpretability by directly including humans in the optimization loop. We develop an algorithm that minimizes the number of user studies to find models that are both predictive and interpretable and demonstrate our approach on several data sets. Our human subjects results show trends towards different proxy notions of interpretability on different datasets, which suggests that different proxies are preferred on different tasks. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11571v2 |
http://arxiv.org/pdf/1805.11571v2.pdf | |
PWC | https://paperswithcode.com/paper/human-in-the-loop-interpretability-prior |
Repo | |
Framework | |
A DCA-Like Algorithm and its Accelerated Version with Application in Data Visualization
Title | A DCA-Like Algorithm and its Accelerated Version with Application in Data Visualization |
Authors | Hoai An Le Thi, Hoai Minh Le, Duy Nhat Phan, Bach Tran |
Abstract | In this paper, we present two variants of DCA (Difference of Convex functions Algorithm) to solve the constrained minimization of the sum of a differentiable function and composite functions, with the aim of increasing the convergence speed of DCA. In the first variant, DCA-Like, we introduce a new technique to iteratively modify the decomposition of the objective function. This successive decomposition can lead to a better majorization and consequently a better convergence speed than basic DCA. We then incorporate Nesterov's acceleration technique into DCA-Like to give rise to the second variant, named Accelerated DCA-Like. The convergence properties and the convergence rate under the Kurdyka-Łojasiewicz assumption are rigorously studied for both variants. As an application, we investigate our algorithms on t-distributed stochastic neighbor embedding. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithms. |
Tasks | |
Published | 2018-06-25 |
URL | http://arxiv.org/abs/1806.09620v1 |
http://arxiv.org/pdf/1806.09620v1.pdf | |
PWC | https://paperswithcode.com/paper/a-dca-like-algorithm-and-its-accelerated |
Repo | |
Framework | |
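The basic DCA iteration behind both variants linearizes the concave part: for f = g - h with g, h convex, the next iterate solves the convex subproblem min_x g(x) - h'(x_k) * x. A toy one-dimensional illustration; this DC decomposition and problem are ours, not the paper's t-SNE application:

```python
def dca(x, iters=100):
    """DCA for f(x) = x**4 - x**2 with g = x**4, h = x**2, starting from x > 0."""
    for _ in range(iters):
        grad_h = 2.0 * x                        # linearize the concave part -h at x_k
        x = (grad_h / 4.0) ** (1.0 / 3.0)       # closed-form argmin of g(x) - grad_h*x
    return x

x_star = dca(1.0)    # converges to the critical point x = 1 / sqrt(2)
```

Each subproblem upper-bounds f (the linearization majorizes -h), which is the majorization the abstract says DCA-Like improves by re-choosing the decomposition at every step.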
The implicit fairness criterion of unconstrained learning
Title | The implicit fairness criterion of unconstrained learning |
Authors | Lydia T. Liu, Max Simchowitz, Moritz Hardt |
Abstract | We clarify what fairness guarantees we can and cannot expect to follow from unconstrained machine learning. Specifically, we characterize when unconstrained learning on its own implies group calibration, that is, the outcome variable is conditionally independent of group membership given the score. We show that under reasonable conditions, the deviation from satisfying group calibration is upper bounded by the excess risk of the learned score relative to the Bayes optimal score function. A lower bound confirms the optimality of our upper bound. Moreover, we prove that as the excess risk of the learned score decreases, the score strongly violates separation and independence, two other standard fairness criteria. Our results show that group calibration is the fairness criterion that unconstrained learning implicitly favors. On the one hand, this means that calibration is often satisfied on its own without the need for active intervention, albeit at the cost of violating other criteria that are at odds with calibration. On the other hand, it suggests that we should be satisfied with calibration as a fairness criterion only if we are at ease with the use of unconstrained machine learning in a given application. |
Tasks | Calibration |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.10013v2 |
http://arxiv.org/pdf/1808.10013v2.pdf | |
PWC | https://paperswithcode.com/paper/the-implicit-fairness-criterion-of |
Repo | |
Framework | |
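Group calibration as defined above can be checked empirically by binning scores and comparing the mean outcome per bin across groups. A small illustration on synthetic records; the function name, binning scheme, and data are assumptions for the sketch:

```python
from collections import defaultdict

def calibration_by_group(records, bins=2):
    """records: (score in [0,1], outcome in {0,1}, group).
    Returns mean outcome per (score bin, group)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for score, outcome, group in records:
        b = min(int(score * bins), bins - 1)
        sums[(b, group)] += outcome
        counts[(b, group)] += 1
    return {k: sums[k] / counts[k] for k in sums}

data = [(0.9, 1, "A"), (0.9, 1, "B"), (0.8, 1, "A"), (0.8, 0, "B"),
        (0.2, 0, "A"), (0.1, 0, "B"), (0.3, 0, "A"), (0.2, 1, "B")]
rates = calibration_by_group(data)    # equal rates across groups per bin => calibrated
```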
Neural Multi-Task Learning for Citation Function and Provenance
Title | Neural Multi-Task Learning for Citation Function and Provenance |
Authors | Xuan Su, Animesh Prasad, Min-Yen Kan, Kazunari Sugiyama |
Abstract | Citation function and provenance are two cornerstone tasks in citation analysis. Given a citation, the former task determines its rhetorical role, while the latter locates the text in the cited paper that contains the relevant cited information. We hypothesize that these two tasks are synergistically related, and build a model that validates this claim. For both tasks, we show that a single-layer convolutional neural network (CNN) outperforms existing state-of-the-art baselines. More importantly, we show that the two tasks are indeed synergistic: by jointly training both tasks in a multi-task learning setup, we demonstrate additional performance gains. Altogether, our models improve on the current state of the art by up to 2%, with statistical significance, for both the citation function and provenance prediction tasks. |
Tasks | Multi-Task Learning |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07351v2 |
http://arxiv.org/pdf/1811.07351v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-multi-task-learning-for-citation |
Repo | |
Framework | |
Deep Reinforcement Learning and the Deadly Triad
Title | Deep Reinforcement Learning and the Deadly Triad |
Authors | Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil |
Abstract | We know from reinforcement learning theory that temporal difference learning can fail in certain cases. Sutton and Barto (2018) identify a deadly triad of function approximation, bootstrapping, and off-policy learning. When these three properties are combined, learning can diverge, with the value estimates becoming unbounded. However, several algorithms successfully combine these three properties, which indicates that there is at least a partial gap in our understanding. In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models (deep Q-networks trained with experience replay), analysing how the components of this system play a role in the emergence of the deadly triad and in the agent's performance. |
Tasks | |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02648v1 |
http://arxiv.org/pdf/1812.02648v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-and-the-deadly |
Repo | |
Framework | |
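A classic two-state construction makes the divergence concrete: with linear values V(s1) = w and V(s2) = 2w, zero reward, and TD updates performed only at s1 (an off-policy state distribution), the update multiplies w by a factor greater than one whenever gamma > 0.5. This standard textbook example is illustrative of the triad; it is not the paper's DQN analysis.

```python
def off_policy_td(w, gamma=0.99, lr=0.1, steps=50):
    """TD(0) with linear values V(s1)=w, V(s2)=2w, updating only at s1."""
    history = [w]
    for _ in range(steps):
        td_error = 0.0 + gamma * 2.0 * w - w   # bootstrapped target minus V(s1)
        w += lr * td_error                     # feature of s1 is 1
        history.append(w)
    return history

hist = off_policy_td(1.0)   # the true value is 0, yet |w| grows every step
```

Removing any one leg of the triad (exact tabular values, Monte Carlo targets, or on-policy sampling of both states) restores convergence in this example.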
The Hierarchical Adaptive Forgetting Variational Filter
Title | The Hierarchical Adaptive Forgetting Variational Filter |
Authors | Vincent Moens |
Abstract | A common problem in Machine Learning and statistics consists in detecting whether the current sample in a stream of data belongs to the same distribution as previous ones, is an isolated outlier or inaugurates a new distribution of data. We present a hierarchical Bayesian algorithm that aims at learning a time-specific approximate posterior distribution of the parameters describing the distribution of the data observed. We derive the update equations of the variational parameters of the approximate posterior at each time step for models from the exponential family, and show that these updates find interesting correspondents in Reinforcement Learning (RL). In this perspective, our model can be seen as a hierarchical RL algorithm that learns a posterior distribution according to a certain stability confidence that is, in turn, learned according to its own stability confidence. Finally, we show some applications of our generic model, first in a RL context, next with an adaptive Bayesian Autoregressive model, and finally in the context of Stochastic Gradient Descent optimization. |
Tasks | |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05703v1 |
http://arxiv.org/pdf/1805.05703v1.pdf | |
PWC | https://paperswithcode.com/paper/the-hierarchical-adaptive-forgetting |
Repo | |
Framework | |
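One simple instance of the forgetting idea, far short of the paper's hierarchical, learned stability confidence, is a Beta-Bernoulli filter whose counts decay by a fixed factor before each update, so the posterior tracks a drifting distribution. The fixed forgetting factor `lam` and the data stream below are illustrative assumptions:

```python
def forgetful_beta(observations, lam=0.9, a=1.0, b=1.0):
    """Beta-Bernoulli posterior whose counts decay before each update."""
    for x in observations:             # x in {0, 1}
        a, b = lam * a, lam * b        # discard a fraction of old evidence
        a, b = a + x, b + (1 - x)      # standard conjugate update
    return a / (a + b)                 # posterior mean of the coin's bias

# The stream drifts from mostly heads to mostly tails; the estimate follows.
est = forgetful_beta([1] * 20 + [0] * 20)
```

The paper's hierarchical scheme can be read as replacing the fixed `lam` with a quantity that is itself inferred, according to a stability confidence learned at the level above.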