Paper Group ANR 796
Point Linking Network for Object Detection
Title | Point Linking Network for Object Detection |
Authors | Xinggang Wang, Kaibing Chen, Zilong Huang, Cong Yao, Wenyu Liu |
Abstract | Object detection is a core problem in computer vision. With the development of deep ConvNets, the performance of object detectors has been dramatically improved. Deep-ConvNet-based object detectors mainly focus on regressing the coordinates of the bounding box, e.g., Faster R-CNN, YOLO and SSD. Different from these methods, which consider the bounding box as a whole, we propose a novel object bounding box representation using points and links, implemented with deep ConvNets and termed Point Linking Network (PLN). Specifically, we regress the corner/center points of the bounding box and their links using a fully convolutional network; then we map the corner points and their links back to multiple bounding boxes; finally, an object detection result is obtained by fusing the multiple bounding boxes. PLN is naturally robust to object occlusion and flexible to object scale variation and aspect ratio variation. In the experiments, PLN with the Inception-v2 model achieves state-of-the-art single-model and single-scale results on the PASCAL VOC 2007, the PASCAL VOC 2012 and the COCO detection benchmarks without bells and whistles. The source code will be released. |
Tasks | Object Detection |
Published | 2017-06-12 |
URL | http://arxiv.org/abs/1706.03646v2 |
http://arxiv.org/pdf/1706.03646v2.pdf | |
PWC | https://paperswithcode.com/paper/point-linking-network-for-object-detection |
Repo | |
Framework | |
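The point-and-link representation described in the abstract can be illustrated with a minimal sketch. The coordinates and the fusion rule below are hypothetical stand-ins; PLN itself predicts point and link probabilities densely on a convolutional grid.

```python
# Sketch: recover a bounding box from one corner point and its linked center
# point. Coordinates are hypothetical; PLN predicts these densely on a grid.

def box_from_corner_and_center(corner, center):
    """Given a corner (x1, y1) and the box center (cx, cy), the opposite
    corner is the reflection of the corner about the center."""
    x1, y1 = corner
    cx, cy = center
    x2, y2 = 2 * cx - x1, 2 * cy - y1
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

def fuse_boxes(boxes):
    """Fuse several candidate boxes (e.g. from different corner/center pairs)
    by averaging coordinates -- a simple stand-in for PLN's fusion step."""
    n = len(boxes)
    return tuple(sum(b[i] for b in boxes) / n for i in range(4))

box_a = box_from_corner_and_center((10, 20), (30, 45))  # top-left + center
box_b = box_from_corner_and_center((50, 70), (30, 45))  # bottom-right + center
fused = fuse_boxes([box_a, box_b])
print(fused)  # (10.0, 20.0, 50.0, 70.0)
```

Because every corner/center pair independently determines the same box, occluding one corner still leaves enough points to recover it, which is the robustness property the abstract claims.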
Joint Transmission Map Estimation and Dehazing using Deep Networks
Title | Joint Transmission Map Estimation and Dehazing using Deep Networks |
Authors | He Zhang, Vishwanath Sindagi, Vishal M. Patel |
Abstract | Single image haze removal is an extremely challenging problem due to its inherent ill-posed nature. Several prior-based and learning-based methods have been proposed in the literature to solve this problem and they have achieved superior results. However, most of the existing methods assume a constant atmospheric light model and tend to follow a two-step procedure: prior-based estimation of the transmission map, followed by calculation of the dehazed image using the closed-form solution. In this paper, we relax the constant atmospheric light assumption and propose a novel unified single image dehazing network that jointly estimates the transmission map and performs dehazing. In other words, our new approach provides an end-to-end learning framework, where the inherent transmission map and dehazed result are learned directly from the loss function. Extensive experiments on synthetic and real datasets with challenging hazy images demonstrate that the proposed method achieves significant improvements over the state-of-the-art methods. |
Tasks | Image Dehazing, Single Image Dehazing, Single Image Haze Removal |
Published | 2017-08-02 |
URL | http://arxiv.org/abs/1708.00581v2 |
http://arxiv.org/pdf/1708.00581v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-transmission-map-estimation-and |
Repo | |
Framework | |
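The two-step procedure the abstract contrasts with rests on the standard atmospheric scattering model, I(x) = J(x)t(x) + A(1 - t(x)): given an estimated transmission map t and atmospheric light A, the dehazed image follows in closed form. A minimal numpy sketch with synthetic values (this is the classical baseline, not the paper's joint network):

```python
import numpy as np

def dehaze_closed_form(I, t, A, t_min=0.1):
    """Recover scene radiance J from the atmospheric scattering model
    I = J * t + A * (1 - t), i.e. J = (I - A) / t + A.
    t is clamped below by t_min to avoid amplifying noise in dense haze."""
    t = np.maximum(t, t_min)
    return (I - A) / t[..., None] + A

# Synthetic check: haze a known image, then undo it.
rng = np.random.default_rng(0)
J_true = rng.uniform(0.0, 1.0, size=(4, 4, 3))   # clean image
t_map = rng.uniform(0.3, 0.9, size=(4, 4))       # transmission map
A = 0.8                                           # constant atmospheric light
I_hazy = J_true * t_map[..., None] + A * (1.0 - t_map[..., None])
J_rec = dehaze_closed_form(I_hazy, t_map, A)
print(np.allclose(J_rec, J_true))  # True
```

The paper's contribution is precisely to avoid relying on this closed form with a fixed A, learning the transmission map and the dehazed output jointly instead.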
Tech Report: A Fast Multiscale Spatial Regularization for Sparse Hyperspectral Unmixing
Title | Tech Report: A Fast Multiscale Spatial Regularization for Sparse Hyperspectral Unmixing |
Authors | Ricardo Augusto Borsoi, Tales Imbiriba, José Carlos Moreira Bermudez, Cédric Richard |
Abstract | Sparse hyperspectral unmixing from large spectral libraries has been considered to circumvent limitations of endmember extraction algorithms in many applications. This strategy often leads to ill-posed inverse problems, which can benefit from spatial regularization strategies. While existing spatial regularization methods improve the problem conditioning and promote piecewise smooth solutions, they lead to large nonsmooth optimization problems. Thus, efficiently introducing spatial context in the unmixing problem remains a challenge, and a necessity for many real world applications. In this paper, a novel multiscale spatial regularization approach for sparse unmixing is proposed. The method uses a signal-adaptive spatial multiscale decomposition based on superpixels to decompose the unmixing problem into two simpler problems, one in the approximation domain and another in the original domain. Simulation results using both synthetic and real data indicate that the proposed method can outperform state-of-the-art Total Variation-based algorithms with a computation time comparable to that of their unregularized counterparts. |
Tasks | Hyperspectral Unmixing |
Published | 2017-12-05 |
URL | http://arxiv.org/abs/1712.01770v3 |
http://arxiv.org/pdf/1712.01770v3.pdf | |
PWC | https://paperswithcode.com/paper/tech-report-a-fast-multiscale-spatial |
Repo | |
Framework | |
On conditional parity as a notion of non-discrimination in machine learning
Title | On conditional parity as a notion of non-discrimination in machine learning |
Authors | Ya’acov Ritov, Yuekai Sun, Ruofei Zhao |
Abstract | We identify conditional parity as a general notion of non-discrimination in machine learning. In fact, several recently proposed notions of non-discrimination, including a few counterfactual notions, are instances of conditional parity. We show that conditional parity is amenable to statistical analysis by studying randomization as a general mechanism for achieving conditional parity and a kernel-based test of conditional parity. |
Tasks | |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08519v1 |
http://arxiv.org/pdf/1706.08519v1.pdf | |
PWC | https://paperswithcode.com/paper/on-conditional-parity-as-a-notion-of-non |
Repo | |
Framework | |
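Conditional parity requires the prediction to be distributed independently of the sensitive attribute within each stratum of the conditioning variable z. A minimal empirical check on hypothetical data (the paper develops a proper kernel-based test; this sketch only compares stratum-wise group means):

```python
from collections import defaultdict

def conditional_parity_gaps(records):
    """For each level of z, compare mean predictions across groups a.
    records: iterable of (a, z, yhat). Returns {z: max gap between group means}."""
    sums = defaultdict(lambda: [0.0, 0])   # (z, a) -> [sum of yhat, count]
    for a, z, yhat in records:
        s = sums[(z, a)]
        s[0] += yhat
        s[1] += 1
    by_z = defaultdict(list)
    for (z, a), (total, n) in sums.items():
        by_z[z].append(total / n)
    return {z: max(means) - min(means) for z, means in by_z.items()}

# Parity holds within each stratum here, even though overall rates may differ.
data = [("m", "low", 1), ("f", "low", 1), ("m", "low", 0), ("f", "low", 0),
        ("m", "high", 1), ("f", "high", 1)]
print(conditional_parity_gaps(data))  # {'low': 0.0, 'high': 0.0}
```

A gap of zero in every stratum is the empirical analogue of conditional parity; nonzero gaps flag strata where the prediction depends on the group.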
Uncertainty measurement with belief entropy on interference effect in Quantum-Like Bayesian Networks
Title | Uncertainty measurement with belief entropy on interference effect in Quantum-Like Bayesian Networks |
Authors | Zhiming Huang, Lin Yang, Wen Jiang |
Abstract | Social dilemmas have been regarded as the essence of evolutionary game theory, in which the prisoner’s dilemma game is the most famous metaphor for the problem of cooperation. Recent findings revealed that people’s behavior violates the Sure Thing Principle in such games. Classic probability methodologies have difficulty explaining the underlying mechanisms of this behavior. In this paper, a novel quantum-like Bayesian network is proposed to accommodate the paradoxical phenomenon. This network can take interference into consideration, which is likely to be an efficient way to describe the underlying mechanism. With the assistance of a belief entropy, known as Deng entropy, the paper proposes a Belief Distance to render the model practical. Tested with empirical data, the proposed model is shown to be effective and predictive. |
Tasks | |
Published | 2017-09-08 |
URL | http://arxiv.org/abs/1709.02844v1 |
http://arxiv.org/pdf/1709.02844v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-measurement-with-belief-entropy |
Repo | |
Framework | |
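Deng entropy, mentioned in the abstract, extends Shannon entropy to basic probability assignments over sets of hypotheses: E_d(m) = -Σ_A m(A) log2(m(A) / (2^|A| - 1)), where |A| is the cardinality of the focal element A. A small sketch with a hypothetical mass function:

```python
import math

def deng_entropy(mass):
    """Deng (belief) entropy of a basic probability assignment.
    mass: dict mapping a frozenset of hypotheses A to its mass m(A).
    E_d = -sum_A m(A) * log2( m(A) / (2^|A| - 1) )."""
    total = 0.0
    for A, m in mass.items():
        if m > 0:
            total -= m * math.log2(m / (2 ** len(A) - 1))
    return total

# For singleton focal elements, Deng entropy reduces to Shannon entropy.
singletons = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}
print(deng_entropy(singletons))  # 1.0

# Mass on a composite set contributes extra non-specificity.
mixed = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.5}
print(deng_entropy(mixed) > deng_entropy(singletons))  # True
```

The extra term for composite sets is what lets the entropy quantify uncertainty about *which* hypothesis holds, not just the spread of the masses.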
A Spectral Method for Activity Shaping in Continuous-Time Information Cascades
Title | A Spectral Method for Activity Shaping in Continuous-Time Information Cascades |
Authors | Kevin Scaman, Argyris Kalogeratos, Luca Corinzia, Nicolas Vayatis |
Abstract | The Information Cascades Model captures dynamical properties of user activity in a social network. In this work, we develop a novel framework for activity shaping under the Continuous-Time Information Cascades Model, which allows the administrator to take local control actions by allocating targeted resources that can alter the spread of the process. Our framework employs the optimization of the spectral radius of the Hazard matrix, a quantity that has been shown to drive the maximum influence in a network, while enjoying a simple convex relaxation when used to minimize the influence of the cascade. In addition, use-cases such as quarantine and node immunization are discussed to highlight the generality of the proposed activity shaping framework. Finally, we present the NetShape influence minimization method, which compares favorably to baseline and state-of-the-art approaches through simulations on real social networks. |
Tasks | |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05231v1 |
http://arxiv.org/pdf/1709.05231v1.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-method-for-activity-shaping-in |
Repo | |
Framework | |
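The controlled quantity is the spectral radius of the Hazard matrix. A minimal numpy sketch showing how a quarantine-style action (here, damping one node's hazard rates, a hypothetical control; the paper optimizes resource allocation properly) lowers it:

```python
import numpy as np

def spectral_radius(H):
    """Largest absolute eigenvalue of the (nonnegative) hazard matrix H."""
    return max(abs(np.linalg.eigvals(H)))

# Hypothetical 3-node hazard matrix of pairwise transmission rates.
H = np.array([[0.0, 0.8, 0.2],
              [0.8, 0.0, 0.5],
              [0.2, 0.5, 0.0]])

rho_before = spectral_radius(H)

# Quarantine-style action: damp node 0's incoming/outgoing rates by 80%.
H_controlled = H.copy()
H_controlled[0, :] *= 0.2
H_controlled[:, 0] *= 0.2
rho_after = spectral_radius(H_controlled)

print(rho_after < rho_before)  # True
```

Since the hazard matrix is entrywise nonnegative, reducing any rates can only decrease the Perron eigenvalue, which is why shrinking the spectral radius is a sound proxy for shrinking the cascade's maximum influence.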
Classification of Radiology Reports Using Neural Attention Models
Title | Classification of Radiology Reports Using Neural Attention Models |
Authors | Bonggun Shin, Falgun H. Chokshi, Timothy Lee, Jinho D. Choi |
Abstract | The electronic health record (EHR) contains a large amount of multi-dimensional and unstructured clinical data of significant operational and research value. Distinguished from previous studies, our approach embraces a double-annotated dataset and strays away from obscure “black-box” models to comprehensive deep learning models. In this paper, we present a novel neural attention mechanism that classifies clinically important findings. Specifically, convolutional neural networks (CNN) with attention analysis are used to classify radiology head computed tomography reports based on five categories that radiologists would account for in assessing acute and communicable findings in daily practice. The experiments show that our CNN attention models outperform non-neural models, especially when trained on a larger dataset. Our attention analysis demonstrates the intuition behind the classifier’s decision by generating a heatmap that highlights attended terms used by the CNN model; this is valuable when potential downstream medical decisions are to be performed by human experts or the classifier information is to be used in cohort construction such as for epidemiological studies. |
Tasks | |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06828v1 |
http://arxiv.org/pdf/1708.06828v1.pdf | |
PWC | https://paperswithcode.com/paper/classification-of-radiology-reports-using |
Repo | |
Framework | |
Supervised and Unsupervised Transfer Learning for Question Answering
Title | Supervised and Unsupervised Transfer Learning for Question Answering |
Authors | Yu-An Chung, Hung-Yi Lee, James Glass |
Abstract | Although transfer learning has been shown to be successful for tasks like object and speech recognition, its applicability to question answering (QA) has yet to be well-studied. In this paper, we conduct extensive experiments to investigate the transferability of knowledge learned from a source QA dataset to a target dataset using two QA models. The performance of both models on a TOEFL listening comprehension test (Tseng et al., 2016) and MCTest (Richardson et al., 2013) is significantly improved via a simple transfer learning technique from MovieQA (Tapaswi et al., 2016). In particular, one of the models achieves the state-of-the-art on all target datasets; for the TOEFL listening comprehension test, it outperforms the previous best model by 7%. Finally, we show that transfer learning is helpful even in unsupervised scenarios when correct answers for target QA dataset examples are not available. |
Tasks | Question Answering, Speech Recognition, Transfer Learning |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.05345v3 |
http://arxiv.org/pdf/1711.05345v3.pdf | |
PWC | https://paperswithcode.com/paper/supervised-and-unsupervised-transfer-learning |
Repo | |
Framework | |
Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology
Title | Towards the Augmented Pathologist: Challenges of Explainable-AI in Digital Pathology |
Authors | Andreas Holzinger, Bernd Malle, Peter Kieseberg, Peter M. Roth, Heimo Müller, Robert Reihs, Kurt Zatloukal |
Abstract | Digital pathology is not only one of the most promising fields of diagnostic medicine, but at the same time a hot topic for fundamental research. Digital pathology is not just the transfer of histopathological slides into digital representations. The combination of different data sources (images, patient records, and *omics data) together with current advances in artificial intelligence/machine learning makes novel information accessible and quantifiable to a human expert; this information is not yet available or exploited in current medical settings. The grand goal is to reach a level of usable intelligence to understand the data in the context of an application task, thereby making machine decisions transparent, interpretable and explainable. The foundation of such an “augmented pathologist” needs an integrated approach: While machine learning algorithms require many thousands of training examples, a human expert is often confronted with only a few data points. Interestingly, humans can learn from such few examples and are able to instantly interpret complex patterns. Consequently, the grand goal is to combine the possibilities of artificial intelligence with human intelligence and to find a well-suited balance between them to enable what neither of them could do on their own. This can raise the quality of education, diagnosis, prognosis and prediction of cancer and other diseases. In this paper we describe some (incomplete) research issues which we believe should be addressed in an integrated and concerted effort for paving the way towards the augmented pathologist. |
Tasks | |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06657v1 |
http://arxiv.org/pdf/1712.06657v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-the-augmented-pathologist-challenges |
Repo | |
Framework | |
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning
Title | The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning |
Authors | Siyuan Ma, Raef Bassily, Mikhail Belkin |
Abstract | In this paper we aim to formally explain the phenomenon of fast convergence of SGD observed in modern machine learning. The key observation is that most modern learning architectures are over-parametrized and are trained to interpolate the data by driving the empirical loss (classification and regression) close to zero. While it is still unclear why these interpolated solutions perform well on test data, we show that these regimes allow for fast convergence of SGD, comparable in number of iterations to full gradient descent. For convex loss functions we obtain an exponential convergence bound for {\it mini-batch} SGD parallel to that for full gradient descent. We show that there is a critical batch size $m^*$ such that: (a) SGD iteration with mini-batch size $m\leq m^*$ is nearly equivalent to $m$ iterations of mini-batch size $1$ (\emph{linear scaling regime}). (b) SGD iteration with mini-batch $m> m^*$ is nearly equivalent to a full gradient descent iteration (\emph{saturation regime}). Moreover, for the quadratic loss, we derive explicit expressions for the optimal mini-batch and step size and explicitly characterize the two regimes above. The critical mini-batch size can be viewed as the limit for effective mini-batch parallelization. It is also nearly independent of the data size, implying $O(n)$ acceleration over GD per unit of computation. We give experimental evidence on real data which closely follows our theoretical analyses. Finally, we show how our results fit in the recent developments in training deep neural networks and discuss connections to adaptive rates for SGD and variance reduction. |
Tasks | |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06559v3 |
http://arxiv.org/pdf/1712.06559v3.pdf | |
PWC | https://paperswithcode.com/paper/the-power-of-interpolation-understanding-the |
Repo | |
Framework | |
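A tiny numerical illustration of the interpolation regime the paper analyzes: an over-parametrized linear model (more parameters than samples) can fit arbitrary targets exactly, and plain SGD with batch size 1 drives the empirical loss to near zero. This is an illustrative sketch only, not the paper's derivation of the critical batch size m*; the dimensions and step size are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20, 50                        # n samples, d parameters: over-parametrized
X = rng.normal(size=(n, d))
y = rng.normal(size=n)               # even arbitrary targets are interpolable

w = np.zeros(d)
lr = 0.01
for epoch in range(500):
    for i in rng.permutation(n):     # mini-batch size m = 1
        g = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i . w - y_i)^2
        w -= lr * g

loss = 0.5 * np.mean((X @ w - y) ** 2)
print(loss < 1e-4)  # True: SGD interpolates all n samples
```

In this regime every per-sample gradient vanishes at the interpolating solution, which is the mechanism behind the exponential convergence of unregularized SGD that the paper formalizes.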
Automated Curriculum Learning for Neural Networks
Title | Automated Curriculum Learning for Neural Networks |
Authors | Alex Graves, Marc G. Bellemare, Jacob Menick, Remi Munos, Koray Kavukcuoglu |
Abstract | We introduce a method for automatically selecting the path, or syllabus, that a neural network follows through a curriculum so as to maximise learning efficiency. A measure of the amount that the network learns from each data sample is provided as a reward signal to a nonstationary multi-armed bandit algorithm, which then determines a stochastic syllabus. We consider a range of signals derived from two distinct indicators of learning progress: rate of increase in prediction accuracy, and rate of increase in network complexity. Experimental results for LSTM networks on three curricula demonstrate that our approach can significantly accelerate learning, in some cases halving the time required to attain a satisfactory performance level. |
Tasks | |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03003v1 |
http://arxiv.org/pdf/1704.03003v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-curriculum-learning-for-neural |
Repo | |
Framework | |
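The syllabus-selection loop described above can be sketched as a nonstationary bandit: each arm is a curriculum task, the reward is a learning-progress signal, and an Exp3-style update shifts probability toward tasks yielding the most progress. The code below is a generic stand-in for the paper's exact algorithm, with a hypothetical progress function:

```python
import math
import random

def exp3_syllabus(progress_fn, n_tasks, steps, eta=0.1, eps=0.05):
    """Exp3-style task selection. progress_fn(task) returns a reward in [0, 1]
    (e.g. a rescaled increase in prediction accuracy). Returns pull counts."""
    w = [0.0] * n_tasks                  # log-weights per task
    counts = [0] * n_tasks
    for _ in range(steps):
        m = max(w)
        exp_w = [math.exp(x - m) for x in w]
        Z = sum(exp_w)
        probs = [(1 - eps) * e / Z + eps / n_tasks for e in exp_w]
        task = random.choices(range(n_tasks), weights=probs)[0]
        r = progress_fn(task)
        w[task] += eta * r / probs[task]  # importance-weighted reward update
        counts[task] += 1
    return counts

random.seed(0)
# Hypothetical signal: task 2 currently yields the most learning progress.
progress = lambda t: {0: 0.1, 1: 0.3, 2: 0.9}[t]
counts = exp3_syllabus(progress, n_tasks=3, steps=500)
print(counts.index(max(counts)))  # the highest-progress task is pulled most
```

The epsilon floor keeps every task sampled occasionally, which matters because learning progress is nonstationary: a task that is exhausted now may become informative again later.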
Multiple testing for outlier detection in functional data
Title | Multiple testing for outlier detection in functional data |
Authors | Clémentine Barreyre, Béatrice Laurent, Jean-Michel Loubes, Bertrand Cabon, Loïc Boussouf |
Abstract | We propose a novel procedure for outlier detection in functional data, in a semi-supervised framework. As the data is functional, we consider the coefficients obtained after projecting the observations onto orthonormal bases (wavelet, PCA). A multiple testing procedure based on the two-sample test is defined in order to highlight the levels of the coefficients on which the outliers appear as significantly different to the normal data. The selected coefficients are then called features for the outlier detection, on which we compute the Local Outlier Factor to highlight the outliers. This procedure to select the features is applied on simulated data that mimic the behaviour of space telemetries, and compared with existing dimension reduction techniques. |
Tasks | Dimensionality Reduction, Outlier Detection |
Published | 2017-12-13 |
URL | http://arxiv.org/abs/1712.04775v1 |
http://arxiv.org/pdf/1712.04775v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-testing-for-outlier-detection-in |
Repo | |
Framework | |
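A minimal numpy sketch of the pipeline described above: project the curves onto a PCA basis, keep the coefficients as features, and score outliers in coefficient space. The distance-to-median score below is a simple stand-in for the paper's two-sample feature selection and Local Outlier Factor; the curves are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 100)

# 30 normal curves plus one outlier with an extra high-frequency component.
normal = np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=(30, 100))
outlier = np.sin(2 * np.pi * t) + 0.8 * np.sin(20 * np.pi * t)
curves = np.vstack([normal, outlier[None, :]])

# Project onto the leading PCA basis of the sample (the "features").
centered = curves - curves.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
coeffs = centered @ Vt[:5].T            # coefficients on 5 leading components

# Simple outlier score: distance to the coefficient-space median.
score = np.linalg.norm(coeffs - np.median(coeffs, axis=0), axis=1)
print(int(np.argmax(score)))  # index of the flagged curve (here, the last one)
```

Working in coefficient space is what makes the problem tractable: the anomaly is invisible in any single time sample but concentrates its energy on a few basis coefficients.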
Exact Mean Computation in Dynamic Time Warping Spaces
Title | Exact Mean Computation in Dynamic Time Warping Spaces |
Authors | Markus Brill, Till Fluschnik, Vincent Froese, Brijnesh Jain, Rolf Niedermeier, David Schultz |
Abstract | Dynamic time warping constitutes a major tool for analyzing time series. In particular, computing a mean series of a given sample of series in dynamic time warping spaces (by minimizing the Fréchet function) is a challenging computational problem, so far solved by several heuristic and inexact strategies. We spot some inaccuracies in the literature on exact mean computation in dynamic time warping spaces. Our contributions comprise an exact dynamic program computing a mean (useful for benchmarking and evaluating known heuristics). Based on this dynamic program, we empirically study properties like uniqueness and length of a mean. Moreover, experimental evaluations reveal substantial deficits of state-of-the-art heuristics in terms of their output quality. We also give an exact polynomial-time algorithm for the special case of binary time series. |
Tasks | Time Series |
Published | 2017-10-24 |
URL | http://arxiv.org/abs/1710.08937v3 |
http://arxiv.org/pdf/1710.08937v3.pdf | |
PWC | https://paperswithcode.com/paper/exact-mean-computation-in-dynamic-time |
Repo | |
Framework | |
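The dynamic-time-warping distance underlying the Fréchet function is itself computed by a textbook dynamic program; the paper's exact mean algorithm builds a much larger DP in the same spirit. A minimal sketch of the distance computation:

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic program for the DTW distance
    between two univariate series, with squared local cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j],      # stretch a
                                 D[i][j - 1],      # stretch b
                                 D[i - 1][j - 1])  # match and advance both
    return D[n][m]

print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: warping absorbs the repeat
print(dtw_distance([1, 2, 3], [1, 2, 4]))     # 1.0
```

A mean series minimizes the Fréchet function, i.e. the sum of such distances to all sample series, which is why exact mean computation is so much harder than a single pairwise alignment.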
On Polynomial Time Methods for Exact Low Rank Tensor Completion
Title | On Polynomial Time Methods for Exact Low Rank Tensor Completion |
Authors | Dong Xia, Ming Yuan |
Abstract | In this paper, we investigate the sample size requirement for exact recovery of a high order tensor of low rank from a subset of its entries. We show that a gradient descent algorithm with initial value obtained from a spectral method can, in particular, reconstruct a ${d\times d\times d}$ tensor of multilinear ranks $(r,r,r)$ with high probability from as few as $O(r^{7/2}d^{3/2}\log^{7/2}d+r^7d\log^6d)$ entries. In the case when the ranks $r=O(1)$, our sample size requirement matches those for nuclear norm minimization (Yuan and Zhang, 2016a), or alternating least squares assuming orthogonal decomposability (Jain and Oh, 2014). Unlike these earlier approaches, however, our method is efficient to compute, easy to implement, and does not impose extra structures on the tensor. Numerical results are presented to further demonstrate the merits of the proposed approach. |
Tasks | |
Published | 2017-02-22 |
URL | http://arxiv.org/abs/1702.06980v1 |
http://arxiv.org/pdf/1702.06980v1.pdf | |
PWC | https://paperswithcode.com/paper/on-polynomial-time-methods-for-exact-low-rank |
Repo | |
Framework | |
Synthesizing Novel Pairs of Image and Text
Title | Synthesizing Novel Pairs of Image and Text |
Authors | Jason Xie, Tingwen Bao |
Abstract | Generating novel pairs of image and text is a problem that combines computer vision and natural language processing. In this paper, we present strategies for generating novel image and caption pairs based on existing captioning datasets. The model takes advantage of recent advances in generative adversarial networks and sequence-to-sequence modeling. We make generalizations to generate paired samples from multiple domains. Furthermore, we study cycles, generating from image to text and then back to image (and vice versa), as well as their connection with autoencoders. |
Tasks | |
Published | 2017-12-18 |
URL | http://arxiv.org/abs/1712.06682v1 |
http://arxiv.org/pdf/1712.06682v1.pdf | |
PWC | https://paperswithcode.com/paper/synthesizing-novel-pairs-of-image-and-text |
Repo | |
Framework | |