Paper Group ANR 1065
Transfer-Learning Oriented Class Imbalance Learning for Cross-Project Defect Prediction
Title | Transfer-Learning Oriented Class Imbalance Learning for Cross-Project Defect Prediction |
Authors | Haonan Tong, Bin Liu, Shihai Wang, Qiuying Li |
Abstract | Cross-project defect prediction (CPDP) aims to predict defects in projects that lack training data by using prediction models trained on historical defect data from other projects. However, because of the distribution differences between datasets from different projects, building high-quality CPDP models remains a challenge. The class-imbalanced nature of software defect datasets further increases the difficulty. In this paper, we propose a transfer-learning oriented minority over-sampling technique (TOMO) based feature weighting transfer naive Bayes (FWTNB) approach (TOMOFWTNB) for CPDP that considers both the class-imbalance and feature-importance problems. Unlike traditional over-sampling techniques, TOMO not only balances the data but also reduces the distribution difference. FWTNB is then used to further increase the similarity of the two distributions. Experiments are performed on 11 public defect datasets. The experimental results show that (1) TOMO improves the average G-Measure by 23.7%$\sim$41.8% and the average MCC by 54.2%$\sim$77.8%; (2) the feature weighting (FW) strategy improves the average G-Measure by 11% and the average MCC by 29.2%; (3) TOMOFWTNB improves the average G-Measure by at least 27.8% and the average MCC by at least 71.5% compared with existing state-of-the-art CPDP approaches. It can be concluded that (1) TOMO is very effective for addressing the class-imbalance problem in the CPDP scenario; (2) our FW strategy is helpful for CPDP; (3) TOMOFWTNB outperforms previous state-of-the-art CPDP approaches. |
Tasks | Feature Importance, Transfer Learning |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08429v1 |
http://arxiv.org/pdf/1901.08429v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-oriented-class-imbalance |
Repo | |
Framework | |
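The abstract above reports results as G-Measure and MCC. As a minimal sketch of how those two scores can be computed from a confusion matrix: the MCC formula is the standard one, while the G-Measure here is the harmonic mean of the probability of detection (pd) and one minus the probability of false alarm (pf), a definition common in defect-prediction work. That the paper uses exactly this G-Measure definition is an assumption, and the function name is hypothetical.

```python
import math


def g_measure_and_mcc(tp, fp, tn, fn):
    """Return (G-Measure, MCC) from confusion-matrix counts.

    Assumption: G-Measure = harmonic mean of pd and (1 - pf).
    """
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # probability of detection (recall)
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # probability of false alarm
    denom = pd + (1.0 - pf)
    g = 2.0 * pd * (1.0 - pf) / denom if denom else 0.0

    # Matthews correlation coefficient (standard definition).
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return g, mcc


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(g_measure_and_mcc(tp=40, fp=10, tn=90, fn=10))
```

Both scores take the minority (defective) class into account, which is why they are more informative than plain accuracy on imbalanced defect data.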
Argument Component Classification for Classroom Discussions
Title | Argument Component Classification for Classroom Discussions |
Authors | Luca Lugini, Diane Litman |
Abstract | This paper focuses on argument component classification for transcribed spoken classroom discussions, with the goal of automatically classifying student utterances into claims, evidence, and warrants. We show that an existing method for argument component classification developed for another educationally-oriented domain performs poorly on our dataset. We then show that feature sets from prior work on argument mining for student essays and online dialogues can be used to improve performance considerably. We also provide a comparison between convolutional neural networks and recurrent neural networks when trained under different conditions to classify argument components in classroom discussions. While neural network models are not always able to outperform a logistic regression model, we were able to gain some useful insights: convolutional networks are more robust than recurrent networks both at the character and at the word level, and specificity information can help boost performance in multi-task training. |
Tasks | Argument Mining |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.03022v1 |
https://arxiv.org/pdf/1909.03022v1.pdf | |
PWC | https://paperswithcode.com/paper/argument-component-classification-for-1 |
Repo | |
Framework | |
Deep Learning for Fine-Grained Image Analysis: A Survey
Title | Deep Learning for Fine-Grained Image Analysis: A Survey |
Authors | Xiu-Shen Wei, Jianxin Wu, Quan Cui |
Abstract | Computer vision (CV) is the process of using machines to understand and analyze imagery, which is an integral branch of artificial intelligence. Among various research areas of CV, fine-grained image analysis (FGIA) is a longstanding and fundamental problem, and has become ubiquitous in diverse real-world applications. The task of FGIA targets analyzing visual objects from subordinate categories, e.g., species of birds or models of cars. The small inter-class variations and the large intra-class variations caused by the fine-grained nature make it a challenging problem. With the boom in deep learning, recent years have witnessed remarkable progress in FGIA using deep learning techniques. In this paper, we aim to give a survey on recent advances of deep learning based FGIA techniques in a systematic way. Specifically, we organize the existing studies of FGIA techniques into three major categories: fine-grained image recognition, fine-grained image retrieval and fine-grained image generation. In addition, we also cover some other important issues of FGIA, such as publicly available benchmark datasets and related domain-specific applications. Finally, we conclude this survey by highlighting several directions and open problems that need to be further explored by the community in the future. |
Tasks | Fine-Grained Image Recognition, Image Generation, Image Retrieval |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03069v1 |
https://arxiv.org/pdf/1907.03069v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-for-fine-grained-image-analysis |
Repo | |
Framework | |
Stock Price Forecasting and Hypothesis Testing Using Neural Networks
Title | Stock Price Forecasting and Hypothesis Testing Using Neural Networks |
Authors | Kerda Varaku |
Abstract | In this work we use Recurrent Neural Networks and Multilayer Perceptrons to predict NYSE, NASDAQ and AMEX stock prices from historical data. We experiment with different architectures and compare data normalization techniques. Then, we leverage those findings to question the efficient-market hypothesis through a formal statistical test. |
Tasks | |
Published | 2019-08-28 |
URL | https://arxiv.org/abs/1908.11212v1 |
https://arxiv.org/pdf/1908.11212v1.pdf | |
PWC | https://paperswithcode.com/paper/stock-price-forecasting-and-hypothesis |
Repo | |
Framework | |
Equipping Experts/Bandits with Long-term Memory
Title | Equipping Experts/Bandits with Long-term Memory |
Authors | Kai Zheng, Haipeng Luo, Ilias Diakonikolas, Liwei Wang |
Abstract | We propose the first reduction-based approach to obtaining long-term memory guarantees for online learning in the sense of Bousquet and Warmuth, 2002, by reducing the problem to achieving typical switching regret. Specifically, for the classical expert problem with $K$ actions and $T$ rounds, using our framework we develop various algorithms with a regret bound of order $\mathcal{O}(\sqrt{T(S\ln T + n \ln K)})$ compared to any sequence of experts with $S-1$ switches among $n \leq \min\{S, K\}$ distinct experts. In addition, by plugging specific adaptive algorithms into our framework we also achieve the best of both stochastic and adversarial environments simultaneously. This resolves an open problem of Warmuth and Koolen, 2014. Furthermore, we extend our results to the sparse multi-armed bandit setting and show both negative and positive results for long-term memory guarantees. As a side result, our lower bound also implies that sparse losses do not help improve the worst-case regret for contextual bandits, a sharp contrast with the non-contextual case. |
Tasks | Multi-Armed Bandits |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12950v2 |
https://arxiv.org/pdf/1905.12950v2.pdf | |
PWC | https://paperswithcode.com/paper/equipping-expertsbandits-with-long-term |
Repo | |
Framework | |
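The abstract above reduces long-term memory guarantees to "typical switching regret". For orientation, here is a minimal sketch of a classical switching-regret algorithm for the expert problem, Fixed Share, which mixes a little uniform weight back in after each exponential-weights update. This is not the reduction proposed in the paper; the function name, the parameters `eta` and `alpha`, and the toy loss sequence are illustrative assumptions.

```python
import math


def fixed_share(losses, eta=0.5, alpha=0.05):
    """Run Fixed Share on per-round loss vectors with entries in [0, 1].

    Returns the learner's total expected loss. With alpha = 0 this is
    plain exponential weights (Hedge).
    """
    k = len(losses[0])
    w = [1.0 / k] * k
    total = 0.0
    for loss in losses:
        # Expected loss of playing an expert drawn from w.
        total += sum(wi * li for wi, li in zip(w, loss))
        # Exponential-weights update.
        v = [wi * math.exp(-eta * li) for wi, li in zip(w, loss)]
        z = sum(v)
        # Share step: keep every expert's weight bounded away from zero,
        # so the learner can recover quickly after a switch.
        w = [(1.0 - alpha) * vi / z + alpha / k for vi in v]
    return total


if __name__ == "__main__":
    # Expert 0 is best for 50 rounds, then expert 1 is best for 50 rounds.
    toy = [[0.0, 1.0]] * 50 + [[1.0, 0.0]] * 50
    print(fixed_share(toy), fixed_share(toy, alpha=0.0))
```

On the toy sequence the best fixed expert suffers loss 50, and plain Hedge ends up close to that, while the share step lets the learner track the switch.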
Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?
Title | Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost? |
Authors | Debabrota Basu, Christos Dimitrakakis, Aristide Tossou |
Abstract | We introduce a number of privacy definitions for the multi-armed bandit problem, based on differential privacy. We relate them through a unifying graphical model representation and connect them to existing definitions. We then derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We show that for all of them, the learner’s regret is increased by a multiplicative factor dependent on the privacy level $\epsilon$, but that the dependency is weaker when we do not require local differential privacy for the rewards. |
Tasks | Multi-Armed Bandits |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12298v1 |
https://arxiv.org/pdf/1905.12298v1.pdf | |
PWC | https://paperswithcode.com/paper/differential-privacy-for-multi-armed-bandits |
Repo | |
Framework | |
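The abstract above contrasts privacy definitions with and without local differential privacy for the rewards. As a minimal sketch of the local variant, each reward in $[0, 1]$ can be perturbed with Laplace noise of scale $1/\epsilon$ before the learner sees it (the standard Laplace mechanism for a sensitivity-1 value). This is a generic illustration, not one of the paper's algorithms; the function name is hypothetical.

```python
import random


def privatize_reward(reward, epsilon, rng=random):
    """Return reward + Laplace(0, 1/epsilon) noise.

    Assumes the reward lies in [0, 1], so its sensitivity is 1.
    """
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with scale `scale`.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return reward + noise


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = [privatize_reward(0.7, epsilon=1.0, rng=rng) for _ in range(20000)]
    # The noise is zero-mean, so the average stays near the true reward,
    # but each individual observation is much noisier for small epsilon.
    print(sum(noisy) / len(noisy))
```

The noise variance is $2/\epsilon^2$, so a smaller $\epsilon$ (stronger privacy) means more pulls are needed to separate the arms, which is the intuition behind a regret cost that depends on $\epsilon$.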
Top-k Combinatorial Bandits with Full-Bandit Feedback
Title | Top-k Combinatorial Bandits with Full-Bandit Feedback |
Authors | Idan Rejwan, Yishay Mansour |
Abstract | Top-k Combinatorial Bandits generalize multi-armed bandits, where at each round any subset of $k$ out of $n$ arms may be chosen and the sum of the rewards is gained. We address the full-bandit feedback, in which the agent observes only the sum of rewards, in contrast to the semi-bandit feedback, in which the agent also observes the individual arms’ rewards. We present the Combinatorial Successive Accepts and Rejects (CSAR) algorithm, which generalizes SAR (Bubeck et al., 2013) for top-k combinatorial bandits. Our main contribution is an efficient sampling scheme that uses Hadamard matrices in order to accurately estimate the individual arms’ expected rewards. We discuss two variants of the algorithm: the first minimizes the sample complexity and the second minimizes the regret. We also prove a lower bound on sample complexity, which is tight for $k=O(1)$. Finally, we run experiments and show that our algorithm outperforms other methods. |
Tasks | Multi-Armed Bandits |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12624v2 |
https://arxiv.org/pdf/1905.12624v2.pdf | |
PWC | https://paperswithcode.com/paper/combinatorial-bandits-with-full-bandit |
Repo | |
Framework | |
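The abstract above mentions estimating individual arm means from sum-only feedback with Hadamard matrices. The following is a simplified sketch of that idea, not the CSAR sampling scheme itself: it ignores the constraint that every pull contains exactly $k$ arms and assumes the number of arms is a power of two. For each Hadamard row, the sum over the $+1$ arms minus the sum over the $-1$ arms gives one linear measurement, and orthogonality of the rows inverts the system. All names are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def estimate_arm_means(pull_subset, n):
    """Recover per-arm means from sum-only feedback.

    pull_subset(arms) returns the (possibly noisy) sum of rewards of `arms`.
    """
    h = hadamard(n)
    z = []
    for row in h:
        plus = [j for j, s in enumerate(row) if s == 1]
        minus = [j for j, s in enumerate(row) if s == -1]
        # One measurement of the inner product <row, mu>.
        z.append(pull_subset(plus) - (pull_subset(minus) if minus else 0.0))
    # H^T H = n I, so mu = H^T z / n.
    return [sum(h[i][j] * z[i] for i in range(n)) / n for j in range(n)]


if __name__ == "__main__":
    true_means = [0.1, 0.5, 0.9, 0.3]  # illustrative values
    noiseless = lambda arms: sum(true_means[j] for j in arms)
    print(estimate_arm_means(noiseless, 4))
```

With noisy feedback each measurement would be averaged over repeated pulls. Because every arm appears in every measurement, the noise in each estimate is averaged across all rows instead of being paid arm by arm.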
Modeling the Uncertainty in Electronic Health Records: a Bayesian Deep Learning Approach
Title | Modeling the Uncertainty in Electronic Health Records: a Bayesian Deep Learning Approach |
Authors | Riyi Qiu, Yugang Jia, Mirsad Hadzikadic, Michael Dulin, Xi Niu, Xin Wang |
Abstract | Deep learning models have exhibited superior performance in predictive tasks with the explosively increasing Electronic Health Records (EHR). However, due to the lack of transparency, behaviors of deep learning models are difficult to interpret. Without trustworthiness, deep learning models will not be able to assist in the real-world decision-making process of healthcare issues. We propose a deep learning model based on Bayesian Neural Networks (BNN) to predict uncertainty induced by data noise. The uncertainty is introduced to provide model predictions with an extra level of confidence. Our experiments verify that instances with high uncertainty are harmful to model performance. Moreover, by investigating the distributions of model prediction and uncertainty, we show that it is possible to identify a group of patients for timely intervention, such that decreasing data noise will benefit more on the prediction accuracy for these patients. |
Tasks | Decision Making |
Published | 2019-07-14 |
URL | https://arxiv.org/abs/1907.06162v1 |
https://arxiv.org/pdf/1907.06162v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-the-uncertainty-in-electronic-health |
Repo | |
Framework | |
SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition
Title | SCSampler: Sampling Salient Clips from Video for Efficient Action Recognition |
Authors | Bruno Korbar, Du Tran, Lorenzo Torresani |
Abstract | While many action recognition datasets consist of collections of brief, trimmed videos each containing a relevant action, videos in the real world (e.g., on YouTube) exhibit very different properties: they are often several minutes long, where brief relevant clips are often interleaved with segments of extended duration containing little change. Densely applying an action recognition system to every temporal clip within such videos is prohibitively expensive. Furthermore, as we show in our experiments, this results in suboptimal recognition accuracy as informative predictions from relevant clips are outnumbered by meaningless classification outputs over long uninformative sections of the video. In this paper we introduce a lightweight “clip-sampling” model that can efficiently identify the most salient temporal clips within a long video. We demonstrate that the computational cost of action recognition on untrimmed videos can be dramatically reduced by invoking recognition only on these most salient clips. Furthermore, we show that this yields significant gains in recognition accuracy compared to analysis of all clips or randomly/uniformly selected clips. On Sports1M, our clip sampling scheme elevates the accuracy of an already state-of-the-art action classifier by 7% and reduces its computational cost by more than 15 times. |
Tasks | Action Recognition In Videos, Temporal Action Localization |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.04289v2 |
https://arxiv.org/pdf/1904.04289v2.pdf | |
PWC | https://paperswithcode.com/paper/scsampler-sampling-salient-clips-from-video |
Repo | |
Framework | |
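The abstract above describes running the expensive recognizer only on the most salient clips. A minimal sketch of that inference pattern, assuming the saliency scores come from some lightweight sampler and that clip predictions are combined by simple averaging (the aggregation rule is an assumption, as are all names and numbers):

```python
def classify_video(saliency, clip_probs, k):
    """Average class probabilities over the k most salient clips.

    saliency:   one score per clip (higher means more salient).
    clip_probs: per-clip class-probability lists from the action classifier.
    Returns (predicted_class, indices_of_selected_clips).
    """
    top = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)[:k]
    n_classes = len(clip_probs[0])
    avg = [sum(clip_probs[i][c] for i in top) / len(top) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), top


if __name__ == "__main__":
    # Hypothetical video with four clips and two classes. Clips 1 and 3 show
    # the action (class 1); clips 0 and 2 are uninformative background.
    saliency = [0.1, 0.9, 0.2, 0.8]
    clip_probs = [[0.9, 0.1], [0.2, 0.8], [0.9, 0.1], [0.3, 0.7]]
    print(classify_video(saliency, clip_probs, k=2))  # salient clips only
    print(classify_video(saliency, clip_probs, k=4))  # dense evaluation
```

In this toy case dense averaging is dominated by the background clips and picks class 0, while restricting to the two salient clips picks class 1 and calls the classifier half as often.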
Decay-Function-Free Time-Aware Attention to Context and Speaker Indicator for Spoken Language Understanding
Title | Decay-Function-Free Time-Aware Attention to Context and Speaker Indicator for Spoken Language Understanding |
Authors | Jonggu Kim, Jong-Hyeok Lee |
Abstract | To capture salient contextual information for spoken language understanding (SLU) of a dialogue, we propose time-aware models that automatically learn the latent time-decay function of the history without a manual time-decay function. We also propose a method to identify and label the current speaker to improve the SLU accuracy. In experiments on the benchmark dataset used in Dialog State Tracking Challenge 4, the proposed models achieved significantly higher F1 scores than the state-of-the-art contextual models. Finally, we analyze the effectiveness of the introduced models in detail. The analysis demonstrates that each of the proposed methods was individually effective in improving SLU accuracy. |
Tasks | Spoken Language Understanding |
Published | 2019-03-20 |
URL | https://arxiv.org/abs/1903.08450v3 |
https://arxiv.org/pdf/1903.08450v3.pdf | |
PWC | https://paperswithcode.com/paper/decay-function-free-time-aware-attention-to |
Repo | |
Framework | |
Reference Product Search
Title | Reference Product Search |
Authors | Chu Wang, Lei Tang, Shujun Bian, Da Zhang, Zuohua Zhang, Yongning Wu |
Abstract | For a product of interest, we propose a search method to surface a set of reference products. The reference products can be used as candidates to support downstream modeling tasks and business applications. The search method consists of product representation learning and fingerprint-type vector searching. The product catalog information is transformed into a high-quality embedding of low dimensions via a novel attention auto-encoder neural network, and the embedding is further coupled with a binary encoding vector for fast retrieval. We conduct extensive experiments to evaluate the proposed method, and compare it with peer services to demonstrate its advantage in terms of search return rate and precision. |
Tasks | Representation Learning |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05985v1 |
http://arxiv.org/pdf/1904.05985v1.pdf | |
PWC | https://paperswithcode.com/paper/reference-product-search |
Repo | |
Framework | |
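The abstract above couples a dense embedding with a binary encoding vector for fast retrieval. One common way to use such a pair, sketched here under stated assumptions, is to shortlist candidates by Hamming distance on the binary codes and then rerank the shortlist by cosine similarity on the embeddings. The sign-based binarization, the two-stage search, and every name below are illustrative assumptions, not the paper's method.

```python
import math


def to_code(vec):
    """Pack the sign bits of an embedding into an int (hypothetical binarization)."""
    code = 0
    for x in vec:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def search(query, catalog, shortlist=2, top=1):
    """catalog maps product name -> embedding. Shortlist by Hamming, rerank by cosine."""
    q_code = to_code(query)
    by_hamming = sorted(catalog, key=lambda name: hamming(q_code, to_code(catalog[name])))
    candidates = by_hamming[:shortlist]
    return sorted(candidates, key=lambda name: cosine(query, catalog[name]), reverse=True)[:top]


if __name__ == "__main__":
    catalog = {  # made-up embeddings
        "a": [1.0, 0.5, -0.2, 0.3],
        "b": [-1.0, 0.4, 0.9, -0.3],
        "c": [0.9, 0.6, -0.1, 0.2],
    }
    print(search([1.0, 0.6, -0.1, 0.25], catalog))
```

In a real system the codes would be precomputed and indexed, so the Hamming stage reduces to cheap XOR and popcount operations and only the shortlist pays for floating-point similarity.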
Quaternion Convolutional Neural Networks
Title | Quaternion Convolutional Neural Networks |
Authors | Xuanyu Zhu, Yi Xu, Hongteng Xu, Changjian Chen |
Abstract | Neural networks in the real domain have been studied for a long time and have achieved promising results in many vision tasks in recent years. However, the extensions of the neural network models in other number fields and their potential applications are not fully investigated yet. Focusing on color images, which can be naturally represented as quaternion matrices, we propose a quaternion convolutional neural network (QCNN) model to obtain more representative features. In particular, we redesign the basic modules like convolution layer and fully-connected layer in the quaternion domain, which can be used to establish fully-quaternion convolutional neural networks. Moreover, these modules are compatible with almost all deep learning techniques and can be plugged into traditional CNNs easily. We test our QCNN models in both color image classification and denoising tasks. Experimental results show that they outperform the real-valued CNNs with the same structures. |
Tasks | Denoising, Image Classification |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.00658v1 |
http://arxiv.org/pdf/1903.00658v1.pdf | |
PWC | https://paperswithcode.com/paper/quaternion-convolutional-neural-networks |
Repo | |
Framework | |
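The abstract above represents color images as quaternion matrices. A minimal sketch of the two building blocks this relies on: encoding an RGB pixel as a pure quaternion $0 + r\,i + g\,j + b\,k$, and the Hamilton product, which is the multiplication any quaternion-domain layer is built from. How the paper's convolution and fully-connected layers combine these is not stated in the abstract, so nothing below should be read as the QCNN layer definition; the function names are hypothetical.

```python
def hamilton(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + b i + c j + d k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def pixel_to_quaternion(r, g, b):
    """Encode an RGB pixel as a pure quaternion (zero real part)."""
    return (0.0, r, g, b)


if __name__ == "__main__":
    i, j = (0, 1, 0, 0), (0, 0, 1, 0)
    print(hamilton(i, j))  # i * j
    print(hamilton(j, i))  # j * i, which differs: the product is not commutative
    print(hamilton((0.5, 0.5, 0.5, 0.5), pixel_to_quaternion(0.2, 0.4, 0.6)))
```

Because one quaternion weight mixes all three color channels in a single structured multiplication, the channels are treated as one entity instead of three independent real-valued feature maps.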
Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning
Title | Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning |
Authors | Lucas Lehnert, Michael L. Littman |
Abstract | A key question in reinforcement learning is how an intelligent agent can generalize knowledge across different inputs. By generalizing across different inputs, information learned for one input can be immediately reused for improving predictions for another input. Reusing information allows an agent to compute an optimal decision-making strategy using less data. State representation is a key element of the generalization process, compressing a high-dimensional input space into a low-dimensional latent state space. This article analyzes properties of different latent state spaces, leading to new connections between model-based and model-free reinforcement learning. Successor features, which predict frequencies of future observations, form a link between model-based and model-free learning: Learning to predict future expected reward outcomes, a key characteristic of model-based agents, is equivalent to learning successor features. Learning successor features is a form of temporal difference learning and is equivalent to learning to predict a single policy’s utility, which is a characteristic of model-free agents. Drawing on the connection between model-based reinforcement learning and successor features, we demonstrate that representations that are predictive of future reward outcomes generalize across variations in both transitions and rewards. This result extends previous work on successor features, which is constrained to fixed transitions and assumes re-learning of the transferred state representation. |
Tasks | Decision Making |
Published | 2019-01-31 |
URL | https://arxiv.org/abs/1901.11437v2 |
https://arxiv.org/pdf/1901.11437v2.pdf | |
PWC | https://paperswithcode.com/paper/successor-features-support-model-based-and |
Repo | |
Framework | |
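The abstract above builds on successor features under fixed transitions. As a minimal tabular sketch of that baseline setting (not the paper's analysis): for a fixed policy with state-transition matrix $P$, the successor representation $\psi$ satisfies $\psi = I + \gamma P \psi$, and the value function for any reward vector $r$ is $V = \psi r$. The fixed-point iteration and the toy chain are illustrative assumptions.

```python
def successor_representation(P, gamma, iters=500):
    """Iterate psi <- I + gamma * P @ psi for a fixed-policy transition matrix P."""
    n = len(P)
    psi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iters):
        psi = [
            [
                (1.0 if i == j else 0.0)
                + gamma * sum(P[i][k] * psi[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return psi


def values(psi, r):
    """State values V = psi @ r for a reward vector r."""
    return [sum(p * rj for p, rj in zip(row, r)) for row in psi]


if __name__ == "__main__":
    # Toy chain: state 0 moves to state 1, and state 1 is absorbing.
    P = [[0.0, 1.0], [0.0, 1.0]]
    psi = successor_representation(P, gamma=0.5)
    print(values(psi, [0.0, 1.0]))  # reward only in state 1
    print(values(psi, [1.0, 0.0]))  # new rewards, same psi, no re-learning
```

Each row of $\psi$ holds expected discounted visit counts, so swapping the reward vector re-evaluates the policy with a single matrix-vector product. Changing the transitions invalidates $\psi$, which is the limitation of fixed-transition successor features that the abstract refers to.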
A High-Throughput Solver for Marginalized Graph Kernels on GPU
Title | A High-Throughput Solver for Marginalized Graph Kernels on GPU |
Authors | Yu-Hang Tang, Oguz Selvitopi, Doru Popovici, Aydın Buluç |
Abstract | We present the design and optimization of a linear solver on General Purpose GPUs for the efficient and high-throughput evaluation of the marginalized graph kernel between pairs of labeled graphs. The solver implements a preconditioned conjugate gradient (PCG) method to compute the solution to a generalized Laplacian equation associated with the tensor product of two graphs. To cope with the gap between the instruction throughput and the memory bandwidth of current generation GPUs, our solver forms the tensor product linear system on-the-fly without storing it in memory when performing matrix-vector dot product operations in PCG. Such on-the-fly computation is accomplished by using threads in a warp to cooperatively stream the adjacency and edge label matrices of individual graphs by small square matrix blocks called tiles, which are then staged in registers and the shared memory for later reuse. Warps across a thread block can further share tiles via the shared memory to increase data reuse. We exploit the sparsity of the graphs hierarchically by storing only non-empty tiles using a coordinate format and nonzero elements within each tile using bitmaps. Besides, we propose a new partition-based reordering algorithm for aggregating nonzero elements of the graphs into fewer but denser tiles to improve the efficiency of the sparse format. We carry out extensive theoretical analyses on the graph tensor product primitives for tiles of various density and evaluate their performance on synthetic and real-world datasets. Our solver delivers three to four orders of magnitude speedup over existing CPU-based solvers such as GraKeL and GraphKernels. The capability of the solver enables kernel-based learning tasks at unprecedented scales. |
Tasks | |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.06310v4 |
https://arxiv.org/pdf/1910.06310v4.pdf | |
PWC | https://paperswithcode.com/paper/a-high-throughput-solver-for-marginalized |
Repo | |
Framework | |
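The abstract above describes forming the tensor product linear system on the fly during matrix-vector products. The underlying algebraic idea can be sketched with the plain Kronecker identity $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(A X B^{\top})$ (row-major vec), which applies the product operator using only the two small factors. This CPU sketch omits the paper's GPU tiling, its sparse formats, and the edge-label kernels in the generalized Laplacian; all names are hypothetical.

```python
def matmul(X, Y):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def kron_matvec(A, B, x):
    """Compute kron(A, B) @ x without forming kron(A, B).

    A is n x n, B is m x m, and x has length n * m in row-major order.
    """
    n, m = len(A), len(B)
    X = [x[i * m:(i + 1) * m] for i in range(n)]  # reshape x to n x m
    Y = matmul(matmul(A, X), transpose(B))       # A X B^T
    return [v for row in Y for v in row]


def kron(A, B):
    """Explicit Kronecker product, used here only to check the identity."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[0, 1], [1, 0]]
    x = [1, 2, 3, 4]
    explicit = [sum(k * xi for k, xi in zip(row, x)) for row in kron(A, B)]
    print(kron_matvec(A, B, x), explicit)
```

For two graphs with $n$ and $m$ nodes the explicit product matrix has $(nm)^2$ entries, while the factored form only touches the $n \times n$ and $m \times m$ factors. That reduction in memory traffic is what makes a matrix-free conjugate gradient solver attractive on bandwidth-limited hardware.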
Accurate Esophageal Gross Tumor Volume Segmentation in PET/CT using Two-Stream Chained 3D Deep Network Fusion
Title | Accurate Esophageal Gross Tumor Volume Segmentation in PET/CT using Two-Stream Chained 3D Deep Network Fusion |
Authors | Dakai Jin, Dazhou Guo, Tsung-Ying Ho, Adam P. Harrison, Jing Xiao, Chen-kan Tseng, Le Lu |
Abstract | Gross tumor volume (GTV) segmentation is a critical step in esophageal cancer radiotherapy treatment planning. Inconsistencies across oncologists and prohibitive labor costs motivate automated approaches for this task. However, leading approaches are only applied to radiotherapy computed tomography (RTCT) images taken prior to treatment. This limits the performance as RTCT suffers from low contrast between the esophagus, tumor, and surrounding tissues. In this paper, we aim to exploit both RTCT and positron emission tomography (PET) imaging modalities to facilitate more accurate GTV segmentation. By utilizing PET, we emulate medical professionals who frequently delineate GTV boundaries through observation of the RTCT images obtained after prescribing radiotherapy and PET/CT images acquired earlier for cancer staging. To take advantage of both modalities, we present a two-stream chained segmentation approach that effectively fuses the CT and PET modalities via early and late 3D deep-network-based fusion. Furthermore, to effect the fusion and segmentation we propose a simple yet effective progressive semantically nested network (PSNN) model that outperforms more complicated models. Extensive 5-fold cross-validation on 110 esophageal cancer patients, the largest analysis to date, demonstrates that both the proposed two-stream chained segmentation pipeline and the PSNN model can significantly improve the quantitative performance over the previous state-of-the-art work by 11% in absolute Dice score (DSC) (from 0.654 to 0.764) while reducing the Hausdorff distance from 129 mm to 47 mm. |
Tasks | |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01524v2 |
https://arxiv.org/pdf/1909.01524v2.pdf | |
PWC | https://paperswithcode.com/paper/accurate-esophageal-gross-tumor-volume |
Repo | |
Framework | |
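The abstract above reports its main result as an absolute gain in Dice score (DSC). A minimal sketch of how the Dice similarity coefficient is computed for two binary masks, using the standard definition $2|A \cap B| / (|A| + |B|)$. The flat-list mask representation and the function name are illustrative assumptions; real segmentation masks would be 3D volumes.

```python
def dice(pred, truth):
    """Dice similarity coefficient of two binary masks (flat sequences of 0/1)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 2.0 * inter / size if size else 1.0


if __name__ == "__main__":
    print(dice([1, 1, 0, 0], [1, 0, 1, 0]))  # one shared voxel out of two each
    # "11% absolute" in the abstract is the plain difference of the two scores.
    print(round(0.764 - 0.654, 3))
```

Dice measures volume overlap, while the Hausdorff distance reported alongside it measures the worst-case boundary error, so the two metrics capture different failure modes.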