Paper Group ANR 247
Logistic regression models for aggregated data. Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks. Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning. Analysis of Deep Neural Networks with Quasi-optimal polynomial approximation rates. Optimal checkpointing for …
Logistic regression models for aggregated data
Title | Logistic regression models for aggregated data |
Authors | Tom Whitaker, Boris Beranger, Scott A. Sisson |
Abstract | Logistic regression models are a popular and effective method to predict the probability of categorical response data. However, inference for these models can become computationally prohibitive for large datasets. Here we adapt ideas from symbolic data analysis to summarise the collection of predictor variables into histogram form, and perform inference on this summary dataset. We develop ideas based on composite likelihoods to derive an efficient one-versus-rest approximate composite likelihood model for histogram-based random variables, constructed from low-dimensional marginal histograms obtained from the full histogram. We demonstrate that this procedure can achieve classification rates comparable to the standard full data multinomial analysis and to state-of-the-art subsampling algorithms for logistic regression, but at a substantially lower computational cost. Performance is explored through simulated examples, and analyses of large supersymmetry and satellite crop classification datasets. |
Tasks | Crop Classification |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.03805v1 |
https://arxiv.org/pdf/1912.03805v1.pdf | |
PWC | https://paperswithcode.com/paper/logistic-regression-models-for-aggregated |
Repo | |
Framework | |
Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks
Title | Quantifying Intrinsic Uncertainty in Classification via Deep Dirichlet Mixture Networks |
Authors | Qingyang Wu, He Li, Lexin Li, Zhou Yu |
Abstract | With the widespread success of deep neural networks in science and technology, it is becoming increasingly important to quantify the uncertainty of the predictions produced by deep learning. In this paper, we introduce a new method that attaches an explicit uncertainty statement to the probabilities of classification using deep neural networks. Precisely, we view the classification probabilities as sampled from an unknown distribution, and we propose to learn this distribution through a Dirichlet mixture that is flexible enough to approximate any continuous distribution on the simplex. We then construct credible intervals from the learned distribution to assess the uncertainty of the classification probabilities. Our approach is easy to implement, computationally efficient, and can be coupled with any deep neural network architecture. Our method leverages the crucial observation that, in many classification applications such as medical diagnosis, more than one class label is available for each observational unit. We demonstrate the usefulness of our approach through simulations and a real data example. |
Tasks | Medical Diagnosis |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04450v2 |
https://arxiv.org/pdf/1906.04450v2.pdf | |
PWC | https://paperswithcode.com/paper/quantifying-intrinsic-uncertainty-in |
Repo | |
Framework | |
Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning
Title | Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning |
Authors | Martin J. Wainwright |
Abstract | Motivated by the study of $Q$-learning algorithms in reinforcement learning, we study a class of stochastic approximation procedures based on operators that satisfy monotonicity and quasi-contractivity conditions with respect to an underlying cone. We prove a general sandwich relation on the iterate error at each time, and use it to derive non-asymptotic bounds on the error in terms of a cone-induced gauge norm. These results are derived within a deterministic framework, requiring no assumptions on the noise. We illustrate these general bounds in application to synchronous $Q$-learning for discounted Markov decision processes with discrete state-action spaces, in particular by deriving non-asymptotic bounds on the $\ell_\infty$-norm for a range of stepsizes. These results are the sharpest known to date, and we show via simulation that the dependence of our bounds cannot be improved in a worst-case sense. These results show that relative to a model-based $Q$-iteration, the $\ell_\infty$-based sample complexity of $Q$-learning is suboptimal in terms of the discount factor $\gamma$. |
Tasks | Q-Learning |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06265v2 |
https://arxiv.org/pdf/1905.06265v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-approximation-with-cone |
Repo | |
Framework | |
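The synchronous $Q$-learning setting described in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the toy MDP layout (`P[s][a]` as a next-state distribution, `R[s][a]` as a reward), the function names, and the rescaled linear stepsize $\lambda_k = 1/(1 + (1-\gamma)k)$ are all assumptions chosen for illustration. The sketch compares the stochastic iterates against model-based $Q$-iteration in the $\ell_\infty$-norm.

```python
import random


def q_iteration(P, R, gamma, iters=500):
    """Model-based Q-iteration: apply the exact Bellman operator repeatedly."""
    n_s, n_a = len(R), len(R[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        Q = [[R[s][a] + gamma * sum(P[s][a][t] * max(Q[t]) for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
    return Q


def synchronous_q_learning(P, R, gamma, steps, rng):
    """Synchronous Q-learning: every (state, action) pair is updated at each
    step from one sampled next state. The stepsize schedule is an assumption."""
    n_s, n_a = len(R), len(R[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for k in range(1, steps + 1):
        lam = 1.0 / (1.0 + (1.0 - gamma) * k)  # assumed rescaled linear stepsize
        new_Q = [row[:] for row in Q]
        for s in range(n_s):
            for a in range(n_a):
                # Draw one next state from the generative model.
                t = rng.choices(range(n_s), weights=P[s][a])[0]
                target = R[s][a] + gamma * max(Q[t])
                new_Q[s][a] = (1.0 - lam) * Q[s][a] + lam * target
        Q = new_Q
    return Q


def linf_error(Q1, Q2):
    """Entrywise maximum absolute difference between two Q-tables."""
    return max(abs(x - y) for r1, r2 in zip(Q1, Q2) for x, y in zip(r1, r2))
```

Calling `linf_error(synchronous_q_learning(P, R, gamma, steps, random.Random(0)), q_iteration(P, R, gamma))` on a small MDP gives the kind of $\ell_\infty$ error that the bounds in the abstract are about.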
Analysis of Deep Neural Networks with Quasi-optimal polynomial approximation rates
Title | Analysis of Deep Neural Networks with Quasi-optimal polynomial approximation rates |
Authors | Joseph Daws, Clayton Webster |
Abstract | We show the existence of a deep neural network capable of approximating a wide class of high-dimensional approximations. The construction of the proposed neural network is based on a quasi-optimal polynomial approximation. We show that this network achieves an error rate that is sub-exponential in the number of polynomial functions, $M$, used in the polynomial approximation. The complexity of the network which achieves this sub-exponential rate is shown to be algebraic in $M$. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02302v1 |
https://arxiv.org/pdf/1912.02302v1.pdf | |
PWC | https://paperswithcode.com/paper/analysis-of-deep-neural-networks-with-quasi |
Repo | |
Framework | |
Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory
Title | Optimal checkpointing for heterogeneous chains: how to train deep neural networks with limited memory |
Authors | Julien Herrmann, Olivier Beaumont, Lionel Eyraud-Dubois, Julien Hermann, Alexis Joly, Alena Shilova |
Abstract | This paper introduces a new activation checkpointing method which significantly decreases memory usage when training Deep Neural Networks with the back-propagation algorithm. Similarly to checkpointing techniques coming from the literature on Automatic Differentiation, it consists in dynamically selecting the forward activations that are saved during the training phase, and then automatically recomputing missing activations from those previously recorded. We propose an original computation model that combines two types of activation savings: either only storing the layer inputs, or recording the complete history of operations that produced the outputs (this uses more memory, but requires fewer recomputations in the backward phase), and we provide an algorithm to compute the optimal computation sequence for this model. This paper also describes a PyTorch implementation that processes the entire chain, dealing with any sequential DNN whose internal layers may be arbitrarily complex and automatically executing it according to the optimal checkpointing strategy computed given a memory limit. Through extensive experiments, we show that our implementation consistently outperforms existing checkpointing approaches for a large class of networks, image sizes and batch sizes. |
Tasks | |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.13214v1 |
https://arxiv.org/pdf/1911.13214v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-checkpointing-for-heterogeneous |
Repo | |
Framework | |
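The trade-off between stored activations and recomputation described in the abstract above can be sketched in plain Python. This is a deliberately simplified illustration, not the paper's algorithm or its PyTorch implementation: it uses scalar layers, saves a layer input every `every` layers (a fixed periodic strategy rather than the optimal one), and the names `grad_full` and `grad_checkpointed` are hypothetical.

```python
import math


def grad_full(layers, x):
    """Baseline: store every activation, then backpropagate through the chain."""
    acts = [x]
    for f, _ in layers:
        acts.append(f(acts[-1]))
    g = 1.0
    for (_, df), a in zip(reversed(layers), reversed(acts[:-1])):
        g *= df(a)  # chain rule, from the last layer to the first
    return acts[-1], g


def grad_checkpointed(layers, x, every):
    """Save only the input of every `every`-th layer; recompute the rest of
    each segment during the backward pass."""
    saved = {}
    a = x
    for i, (f, _) in enumerate(layers):
        if i % every == 0:
            saved[i] = a  # checkpoint: input of layer i
        a = f(a)
    out = a

    g = 1.0
    end = len(layers)
    for start in sorted(saved, reverse=True):
        # Recompute the inputs of layers start .. end-1 from the checkpoint.
        inputs = [saved[start]]
        for f, _ in layers[start:end - 1]:
            inputs.append(f(inputs[-1]))
        # Backpropagate through this segment.
        for i in range(end - 1, start - 1, -1):
            g *= layers[i][1](inputs[i - start])
        end = start
    return out, g, len(saved)
```

Both functions return the same output and gradient; the checkpointed version keeps roughly `len(layers) / every` activations between the forward and backward phases, at the cost of one extra partial forward pass per segment.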
Resolving Conflicts in Clinical Guidelines using Argumentation
Title | Resolving Conflicts in Clinical Guidelines using Argumentation |
Authors | Kristijonas Čyras, Tiago Oliveira |
Abstract | Automatically reasoning with conflicting generic clinical guidelines is a burning issue in patient-centric medical reasoning where patient-specific conditions and goals need to be taken into account. It is even more challenging in the presence of preferences such as patient’s wishes and clinician’s priorities over goals. We advance a structured argumentation formalism for reasoning with conflicting clinical guidelines, patient-specific information and preferences. Our formalism integrates assumption-based reasoning and goal-driven selection among reasoning outcomes. Specifically, we assume applicability of guideline recommendations concerning the generic goal of patient well-being, resolve conflicts among recommendations using patient’s conditions and preferences, and then consider prioritised patient-centered goals to yield non-conflicting, goal-maximising and preference-respecting recommendations. We rely on the state-of-the-art Transition-based Medical Recommendation model for representing guideline recommendations and augment it with context given by the patient’s conditions, goals, as well as preferences over recommendations and goals. We establish desirable properties of our approach in terms of sensitivity to recommendation conflicts and patient context. |
Tasks | |
Published | 2019-02-20 |
URL | http://arxiv.org/abs/1902.07526v1 |
http://arxiv.org/pdf/1902.07526v1.pdf | |
PWC | https://paperswithcode.com/paper/resolving-conflicts-in-clinical-guidelines |
Repo | |
Framework | |
Double-Coupling Learning for Multi-Task Data Stream Classification
Title | Double-Coupling Learning for Multi-Task Data Stream Classification |
Authors | Yingzhong Shi, Zhaohong Deng, Haoran Chen, Kup-Sze Choi, Shitong Wang |
Abstract | Data stream classification methods demonstrate promising performance on a single data stream by exploring the cohesion in the data stream. However, multiple data streams that involve several correlated data streams are common in many practical scenarios, which can be viewed as multi-task data streams. Instead of handling them separately, it is beneficial to consider the correlations among the multi-task data streams for data stream modeling tasks. In this regard, a novel classification method called double-coupling support vector machines (DC-SVM) is proposed for classifying them simultaneously. DC-SVM considers the external correlations between multiple data streams, while handling the internal relationship within the individual data stream. Experimental results on artificial and real-world multi-task data streams demonstrate that the proposed method outperforms traditional data stream classification methods. |
Tasks | |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.06021v1 |
https://arxiv.org/pdf/1908.06021v1.pdf | |
PWC | https://paperswithcode.com/paper/double-coupling-learning-for-multi-task-data |
Repo | |
Framework | |
node2bits: Compact Time- and Attribute-aware Node Representations for User Stitching
Title | node2bits: Compact Time- and Attribute-aware Node Representations for User Stitching |
Authors | Di Jin, Mark Heimann, Ryan Rossi, Danai Koutra |
Abstract | Identity stitching, the task of identifying and matching various online references (e.g., sessions over different devices and timespans) to the same user in real-world web services, is crucial for personalization and recommendations. However, traditional user stitching approaches, such as grouping or blocking, require quadratic pairwise comparisons between a massive number of user activities, thus posing both computational and storage challenges. Recent works, which are often application-specific, heuristically seek to reduce the number of comparisons, but they suffer from low precision and recall. To solve the problem in an application-independent way, we take a heterogeneous network-based approach in which users (nodes) interact with content (e.g., sessions, websites), and may have attributes (e.g., location). We propose node2bits, an efficient framework that represents multi-dimensional features of node contexts with binary hashcodes. node2bits leverages feature-based temporal walks to encapsulate short- and long-term interactions between nodes in heterogeneous web networks, and adopts SimHash to obtain compact, binary representations and avoid the quadratic complexity for similarity search. Extensive experiments on large-scale real networks show that node2bits outperforms traditional techniques and existing works that generate real-valued embeddings by up to 5.16% in F1 score on user stitching, while taking only up to 1.56% as much storage. |
Tasks | |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08572v2 |
https://arxiv.org/pdf/1904.08572v2.pdf | |
PWC | https://paperswithcode.com/paper/node2bits-compact-time-and-attribute-aware |
Repo | |
Framework | |
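The abstract above names SimHash as the step that turns node features into compact binary codes. The sketch below shows the generic random-hyperplane form of SimHash only; it does not reproduce node2bits' temporal walks or feature construction, and the function names and the Gaussian hyperplanes are assumptions made for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def simhash(vec, planes):
    """Map a real-valued vector to an integer whose bits record on which side
    of each hyperplane the vector lies."""
    code = 0
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")
```

Vectors separated by a small angle agree on most bits, so candidates can be grouped by code (or by code prefix) and compared only within a group, rather than across all pairs.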
Learning Motion Priors for Efficient Video Object Detection
Title | Learning Motion Priors for Efficient Video Object Detection |
Authors | Zhengkai Jiang, Yu Liu, Ceyuan Yang, Jihao Liu, Qian Zhang, Shiming Xiang, Chunhong Pan |
Abstract | Convolutional neural networks have achieved great progress on the image object detection task. However, it is not trivial to transfer existing image object detection methods to the video domain since most of them are designed specifically for the image domain. Directly applying an image detector cannot achieve optimal results because of the lack of temporal information, which is vital for the video domain. Recently, image-level flow warping has been proposed to propagate features across different frames, aiming at achieving a better trade-off between accuracy and efficiency. However, the gap between image-level optical flow and high-level features can hinder the spatial propagation accuracy. To achieve a better trade-off between accuracy and efficiency, in this paper, we propose to learn motion priors for efficient video object detection. We first initialize some motion priors for each location and then use them to propagate features across frames. At the same time, motion priors are updated adaptively to find better spatial correspondences. Without bells and whistles, the proposed framework achieves state-of-the-art performance on the ImageNet VID dataset with real-time speed. |
Tasks | Object Detection, Optical Flow Estimation, Video Object Detection |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05253v1 |
https://arxiv.org/pdf/1911.05253v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-motion-priors-for-efficient-video |
Repo | |
Framework | |
Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning
Title | Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning |
Authors | Xinyu You, Xuanjie Li, Yuedong Xu, Hui Feng, Jin Zhao, Huaicheng Yan |
Abstract | Packet routing is one of the fundamental problems in computer networks in which a router determines the next-hop of each packet in the queue to get it as quickly as possible to its destination. Reinforcement learning (RL) has been introduced to design autonomous packet routing policies with local information of stochastic packet arrival and service. However, the curse of dimensionality of RL prohibits the more comprehensive representation of dynamic network states, thus limiting its potential benefit. In this paper, we propose a novel packet routing framework based on multi-agent deep reinforcement learning (DRL) in which each router possesses an independent LSTM recurrent neural network for training and decision making in a fully distributed environment. The LSTM recurrent neural network extracts routing features from rich information regarding backlogged packets and past actions, and effectively approximates the value function of Q-learning. We further allow each router to communicate periodically with direct neighbors so that a broader view of network state can be incorporated. Experimental results show that our multi-agent DRL policy can strike the delicate balance between congestion-aware and shortest routes, and significantly reduce the packet delivery time in general network topologies compared with its counterparts. |
Tasks | Decision Making, Q-Learning |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03494v2 |
https://arxiv.org/pdf/1905.03494v2.pdf | |
PWC | https://paperswithcode.com/paper/190503494 |
Repo | |
Framework | |
Autoencoder-Based Error Correction Coding for One-Bit Quantization
Title | Autoencoder-Based Error Correction Coding for One-Bit Quantization |
Authors | Eren Balevi, Jeffrey G. Andrews |
Abstract | This paper proposes a novel deep learning-based error correction coding scheme for AWGN channels under the constraint of one-bit quantization in the receivers. Specifically, it is first shown that the optimum error correction code that minimizes the probability of bit error can be obtained by perfectly training a special autoencoder, in which “perfectly” refers to converging to the global minimum. However, perfect training is not possible in most cases. To approach the performance of a perfectly trained autoencoder with suboptimal training, we propose utilizing turbo codes as an implicit regularization, i.e., using a concatenation of a turbo code and an autoencoder. It is empirically shown that this design gives nearly the same performance as the hypothetically perfectly trained autoencoder, and we also provide a theoretical proof of why that is so. The proposed coding method is as bandwidth efficient as the integrated (outer) turbo code, since the autoencoder exploits the excess bandwidth from pulse shaping and packs signals more intelligently thanks to sparsity in neural networks. Our results show that the proposed coding scheme at finite block lengths outperforms conventional turbo codes even for QPSK modulation. Furthermore, the proposed coding method can make one-bit quantization operational even for 16-QAM. |
Tasks | Quantization |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.12120v1 |
https://arxiv.org/pdf/1909.12120v1.pdf | |
PWC | https://paperswithcode.com/paper/autoencoder-based-error-correction-coding-for |
Repo | |
Framework | |
Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task
Title | Transfer Learning from Transformers to Fake News Challenge Stance Detection (FNC-1) Task |
Authors | Valeriya Slovikovskaya |
Abstract | In this paper, we report improved results on the Fake News Challenge Stage 1 (FNC-1) stance detection task. This gain in performance is due to the generalization power of large language models based on the Transformer architecture, invented, trained and publicly released over the last two years. Specifically, (1) we improved the best-performing FNC-1 model by adding BERT sentence embeddings of input sequences as a model feature, and (2) we fine-tuned BERT, XLNet, and RoBERTa transformers on the extended FNC-1 dataset and obtained state-of-the-art results on the FNC-1 task. |
Tasks | Sentence Embedding, Stance Detection, Transfer Learning |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1910.14353v1 |
https://arxiv.org/pdf/1910.14353v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-from-transformers-to-fake |
Repo | |
Framework | |
Global Adversarial Attacks for Assessing Deep Learning Robustness
Title | Global Adversarial Attacks for Assessing Deep Learning Robustness |
Authors | Hanbin Hu, Mit Shah, Jianhua Z. Huang, Peng Li |
Abstract | It has been shown that deep neural networks (DNNs) may be vulnerable to adversarial attacks, raising concerns about their robustness, particularly for safety-critical applications. Recognizing the local nature and limitations of existing adversarial attacks, we present a new type of global adversarial attack for assessing global DNN robustness. More specifically, we propose a novel concept of global adversarial example pairs, in which the two examples of each pair are close to each other but have different class labels predicted by the DNN. We further propose two families of global attack methods and show that our methods are able to generate diverse and intriguing adversarial example pairs at locations far from the training or testing data. Moreover, we demonstrate that DNNs hardened using strong projected gradient descent (PGD) based (local) adversarial training are vulnerable to the proposed global adversarial example pairs, suggesting that global robustness must be considered while training robust deep learning networks. |
Tasks | |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.07920v1 |
https://arxiv.org/pdf/1906.07920v1.pdf | |
PWC | https://paperswithcode.com/paper/global-adversarial-attacks-for-assessing-deep |
Repo | |
Framework | |
Real-time data-driven detection of the rock type alteration during a directional drilling
Title | Real-time data-driven detection of the rock type alteration during a directional drilling |
Authors | Evgenya Romanenkova, Alexey Zaytsev, Nikita Klyuchnikov, Arseniy Gruzdev, Ksenia Antipova, Leyla Ismailova, Evgeny Burnaev, Artyom Semenikhin, Vitaliy Koryabkin, Igor Simon, Dmitry Koroteev |
Abstract | During directional drilling, a bit may sometimes enter a nonproductive rock layer because of the gap of about 20 m between the bit and the high-fidelity rock type sensors. The only way to detect lithotype changes in time is to use Measurements While Drilling (MWD) data. However, there are no general mathematical modeling approaches that both reconstruct the rock type well from MWD data and account for the specifics of the oil and gas industry. In this article, we present a data-driven procedure that utilizes MWD data for quick detection of changes in rock type. We propose an approach that combines traditional machine learning, based on the solution of the rock type classification problem, with change detection procedures rarely used before in the oil and gas industry. The data come from a newly developed oilfield in the north of western Siberia. The results suggest that we can detect a significant part of the changes in rock type, reducing the change detection delay from $20$ to $1.8$ meters and the number of false-positive alarms from $43$ to $6$ per well. |
Tasks | |
Published | 2019-03-27 |
URL | https://arxiv.org/abs/1903.11436v2 |
https://arxiv.org/pdf/1903.11436v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-data-driven-detection-of-the-rock |
Repo | |
Framework | |
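The abstract above combines a rock type classifier with a change detection procedure but does not say which procedure is used. As one possible illustration, the sketch below applies a one-sided CUSUM statistic to a stream of classifier scores. CUSUM itself, the function name `cusum_alarm`, and the `drift` and `threshold` values are assumptions for this example, not details taken from the paper.

```python
def cusum_alarm(scores, drift=0.5, threshold=3.0):
    """Return the index of the first alarm raised by a one-sided CUSUM
    statistic, or None if the statistic never exceeds the threshold.

    scores: per-measurement classifier outputs, e.g. the estimated
            probability of the nonproductive rock type (hypothetical input).
    drift:  amount subtracted at every step so that the statistic stays
            near zero before a change.
    """
    stat = 0.0
    for i, s in enumerate(scores):
        # Accumulate evidence of a shift; reset to zero when it goes negative.
        stat = max(0.0, stat + s - drift)
        if stat > threshold:
            return i
    return None
```

With a stream that stays at 0.0 for 50 measurements and then jumps to 1.0, the statistic grows by 0.5 per step after the change and the first alarm is raised at index 56, a delay of six measurements. Raising `threshold` lowers the false alarm rate at the cost of a longer detection delay.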
Cooperative Embeddings for Instance, Attribute and Category Retrieval
Title | Cooperative Embeddings for Instance, Attribute and Category Retrieval |
Authors | William Thong, Cees G. M. Snoek, Arnold W. M. Smeulders |
Abstract | The goal of this paper is to retrieve an image based on instance, attribute and category similarity notions. Different from existing works, which usually address only one of these entities in isolation, we introduce a cooperative embedding to integrate them while preserving their specific level of semantic representation. An algebraic structure defines a superspace filled with instances. Attributes are axis-aligned to form subspaces, while categories influence the arrangement of similar instances. These relationships enable them to cooperate for their mutual benefit in image retrieval. We derive a proxy-based softmax embedding loss to learn all similarity measures simultaneously in both the superspace and the subspaces. We evaluate our model on datasets from two different domains. Experiments on image retrieval tasks show the benefits of the cooperative embeddings for modeling multiple image similarities, and for discovering the style evolution of instances between and within categories. |
Tasks | Image Retrieval |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01421v1 |
http://arxiv.org/pdf/1904.01421v1.pdf | |
PWC | https://paperswithcode.com/paper/cooperative-embeddings-for-instance-attribute |
Repo | |
Framework | |