Paper Group ANR 1385
Expanding Label Sets for Graph Convolutional Networks
Title | Expanding Label Sets for Graph Convolutional Networks |
Authors | Mustafa Coskun, Burcu Bakir Gungor, Mehmet Koyuturk |
Abstract | In recent years, Graph Convolutional Networks (GCNs) and their variants have been widely utilized in learning tasks that involve graphs, including recommendation systems and node classification, among many others. In the node classification problem, the input is a graph in which the edges represent the association between pairs of nodes, multi-dimensional feature vectors are associated with the nodes, and some of the nodes in the graph have known labels. The objective is to predict the labels of the unlabeled nodes using the node features in conjunction with the graph topology. While GCNs have been successfully applied to this problem, the caveats that they inherit from traditional deep learning models pose significant challenges to their broad utilization in node classification. One such caveat is that training a GCN requires a large number of labeled training instances, which is often not available in realistic settings. To relax this requirement, state-of-the-art methods leverage network diffusion-based approaches to propagate labels across the network before training GCNs. However, these approaches ignore the tendency of network diffusion methods to conflate proximity with centrality, which causes labels to propagate to the nodes that are well connected in the graph. To address this problem, here we present an alternate approach to extrapolating node labels in GCNs in three steps: (i) clustering the network to identify communities, (ii) using network diffusion algorithms to quantify the proximity of each node to the communities, thereby obtaining a low-dimensional topological profile for each node, and (iii) comparing these topological profiles to identify the nodes that are most similar to the labeled nodes. |
Tasks | Node Classification, Recommendation Systems |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.09575v1 |
https://arxiv.org/pdf/1912.09575v1.pdf | |
PWC | https://paperswithcode.com/paper/expanding-label-sets-for-graph-convolutional |
Repo | |
Framework | |
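The three-step procedure in the abstract lends itself to a compact sketch. The following is a minimal Python illustration, assuming networkx and scikit-learn; the function `expand_labels`, the choice of greedy modularity communities, personalized PageRank as the diffusion, and all parameters are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of the three-step label-expansion idea described above:
# (i) cluster the graph, (ii) run one diffusion per community to get a
# low-dimensional topological profile per node, (iii) label the unlabeled
# nodes whose profiles are most similar to those of labeled nodes.
import networkx as nx
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def expand_labels(G, labels, n_new=10, alpha=0.85):
    nodes = list(G.nodes())
    # (i) community detection
    communities = nx.algorithms.community.greedy_modularity_communities(G)
    # (ii) personalized-PageRank diffusion from each community -> topological profile
    profiles = np.zeros((len(nodes), len(communities)))
    for j, com in enumerate(communities):
        personalization = {n: (1.0 if n in com else 0.0) for n in nodes}
        pr = nx.pagerank(G, alpha=alpha, personalization=personalization)
        profiles[:, j] = [pr[n] for n in nodes]
    # (iii) assign each unlabeled node the label of its most similar labeled node
    labeled_idx = [i for i, n in enumerate(nodes) if n in labels]
    unlabeled_idx = [i for i, n in enumerate(nodes) if n not in labels]
    sims = cosine_similarity(profiles[unlabeled_idx], profiles[labeled_idx])
    best = sims.max(axis=1)
    new_labels = {}
    for i in np.argsort(-best)[:n_new]:          # keep only the most confident matches
        u = nodes[unlabeled_idx[i]]
        v = nodes[labeled_idx[int(sims[i].argmax())]]
        new_labels[u] = labels[v]
    return new_labels
```

The expanded label set returned here would then be merged into the training labels before fitting the GCN.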
Predicting Economic Development using Geolocated Wikipedia Articles
Title | Predicting Economic Development using Geolocated Wikipedia Articles |
Authors | Evan Sheehan, Chenlin Meng, Matthew Tan, Burak Uzkent, Neal Jean, David Lobell, Marshall Burke, Stefano Ermon |
Abstract | Progress on the UN Sustainable Development Goals (SDGs) is hampered by a persistent lack of data regarding key social, environmental, and economic indicators, particularly in developing countries. For example, data on poverty — the first of seventeen SDGs — is both spatially sparse and infrequently collected in Sub-Saharan Africa due to the high cost of surveys. Here we propose a novel method for estimating socioeconomic indicators using open-source, geolocated textual information from Wikipedia articles. We demonstrate that modern NLP techniques can be used to predict community-level asset wealth and education outcomes using nearby geolocated Wikipedia articles. When paired with nightlights satellite imagery, our method outperforms all previously published benchmarks for this prediction task, indicating the potential of Wikipedia to inform both research in the social sciences and future policy decisions. |
Tasks | |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01627v2 |
https://arxiv.org/pdf/1905.01627v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-economic-development-using |
Repo | |
Framework | |
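As a loose, hedged illustration of the prediction setup (not the paper's NLP models, which use learned embeddings together with nightlights imagery), the sketch below averages TF-IDF vectors of the k nearest geolocated articles to each survey cluster and fits a ridge regression; the function names and the data-loading step are assumptions.

```python
# Illustrative stand-in only: predict a community-level wealth index from the text
# of nearby geolocated Wikipedia articles using TF-IDF features and ridge regression.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.neighbors import BallTree

def build_features(article_latlon, article_texts, cluster_latlon, k=5):
    vec = TfidfVectorizer(max_features=5000, stop_words="english")
    X_art = vec.fit_transform(article_texts).toarray()
    # haversine BallTree expects [lat, lon] in radians
    tree = BallTree(np.radians(article_latlon), metric="haversine")
    _, idx = tree.query(np.radians(cluster_latlon), k=k)
    # average the TF-IDF vectors of the k nearest articles for each survey cluster
    return np.stack([X_art[i].mean(axis=0) for i in idx])

# X = build_features(article_latlon, article_texts, cluster_latlon)
# model = Ridge(alpha=1.0).fit(X, wealth_index)   # wealth_index from household surveys
```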
Deformable Non-local Network for Video Super-Resolution
Title | Deformable Non-local Network for Video Super-Resolution |
Authors | Hua Wang, Dewei Su, Chuangchuang Liu, Longcun Jin, Xianfang Sun, Xinyi Peng |
Abstract | The video super-resolution (VSR) task aims to restore a high-resolution (HR) video frame by using its corresponding low-resolution (LR) frame and multiple neighboring frames. At present, many deep learning-based VSR methods rely on optical flow to perform frame alignment, so the final recovery results are greatly affected by the accuracy of the optical flow, which can never be estimated perfectly and always contains some error. In this paper, we propose a novel deformable non-local network (DNLN), a non-optical-flow-based method. Specifically, we apply deformable convolution and improve its ability to perform adaptive alignment at the feature level. Furthermore, we utilize a non-local structure to capture the global correlation between the reference frame and the aligned neighboring frames, and simultaneously enhance desired fine details in the aligned frames. To reconstruct the final high-quality HR video frames, we use residual-in-residual dense blocks to take full advantage of the hierarchical features. Experimental results on benchmark datasets demonstrate that the proposed DNLN achieves state-of-the-art performance on the VSR task. |
Tasks | Optical Flow Estimation, Super-Resolution, Video Super-Resolution |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10692v2 |
https://arxiv.org/pdf/1909.10692v2.pdf | |
PWC | https://paperswithcode.com/paper/deformable-non-local-network-for-video-super |
Repo | |
Framework | |
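Two of the building blocks named in the abstract, deformable feature alignment and a non-local block, can be sketched in PyTorch as below. Channel counts, the offset predictor, and the wiring are illustrative guesses rather than the exact DNLN architecture.

```python
# Hedged PyTorch sketch of deformable feature alignment and a non-local
# (self-attention) block; not the paper's exact DNLN design.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformAlign(nn.Module):
    def __init__(self, c=64, k=3):
        super().__init__()
        # offsets are predicted from the concatenated neighbor/reference features
        self.offset_conv = nn.Conv2d(2 * c, 2 * k * k, kernel_size=3, padding=1)
        self.deform = DeformConv2d(c, c, kernel_size=k, padding=k // 2)

    def forward(self, neighbor_feat, ref_feat):
        offsets = self.offset_conv(torch.cat([neighbor_feat, ref_feat], dim=1))
        return self.deform(neighbor_feat, offsets)

class NonLocalBlock(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.theta = nn.Conv2d(c, c // 2, 1)
        self.phi = nn.Conv2d(c, c // 2, 1)
        self.g = nn.Conv2d(c, c // 2, 1)
        self.out = nn.Conv2d(c // 2, c, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, c/2)
        k = self.phi(x).flatten(2)                     # (b, c/2, hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c/2)
        attn = torch.softmax(q @ k, dim=-1)            # global pairwise correlation
        y = (attn @ v).transpose(1, 2).reshape(b, c // 2, h, w)
        return x + self.out(y)                         # residual connection
```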
Revisiting Fine-tuning for Few-shot Learning
Title | Revisiting Fine-tuning for Few-shot Learning |
Authors | Akihiro Nakamura, Tatsuya Harada |
Abstract | Few-shot learning is the process of learning novel classes using only a few examples and it remains a challenging task in machine learning. Many sophisticated few-shot learning algorithms have been proposed based on the notion that networks can easily overfit to novel examples if they are simply fine-tuned using only a few examples. In this study, we show that in the commonly used low-resolution mini-ImageNet dataset, the fine-tuning method achieves higher accuracy than common few-shot learning algorithms in the 1-shot task and nearly the same accuracy as that of the state-of-the-art algorithm in the 5-shot task. We then evaluate our method with more practical tasks, namely the high-resolution single-domain and cross-domain tasks. With both tasks, we show that our method achieves higher accuracy than common few-shot learning algorithms. We further analyze the experimental results and show that: 1) the retraining process can be stabilized by employing a low learning rate, 2) using adaptive gradient optimizers during fine-tuning can increase test accuracy, and 3) test accuracy can be improved by updating the entire network when a large domain-shift exists between base and novel classes. |
Tasks | Few-Shot Learning |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00216v2 |
https://arxiv.org/pdf/1910.00216v2.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-fine-tuning-for-few-shot-learning |
Repo | |
Framework | |
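The baseline the paper revisits is plain fine-tuning on the few support examples. A minimal PyTorch sketch follows; the backbone, learning rate, optimizer, and step count are illustrative choices reflecting the abstract's findings (low learning rate, adaptive optimizer, optionally updating the whole network), not the paper's exact protocol.

```python
# Minimal fine-tuning sketch: pretrained backbone, new head for the novel classes,
# low learning rate with an adaptive optimizer. Hyperparameters are illustrative.
import torch
import torch.nn as nn
from torchvision import models

def finetune_on_support(support_x, support_y, n_novel_classes,
                        update_whole_network=True, lr=1e-4, steps=100):
    model = models.resnet18(weights="IMAGENET1K_V1")
    model.fc = nn.Linear(model.fc.in_features, n_novel_classes)
    # update either the whole network (useful under large domain shift) or only the head
    params = model.parameters() if update_whole_network else model.fc.parameters()
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(support_x), support_y)
        loss.backward()
        opt.step()
    return model
```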
High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks
Title | High Accuracy Tumor Diagnoses and Benchmarking of Hematoxylin and Eosin Stained Prostate Core Biopsy Images Generated by Explainable Deep Neural Networks |
Authors | Aman Rana, Alarice Lowe, Marie Lithgow, Katharine Horback, Tyler Janovitz, Annacarolina Da Silva, Harrison Tsai, Vignesh Shanmugam, Hyung-Jin Yoon, Pratik Shah |
Abstract | Histopathological diagnosis of tumors in tissue biopsies after Hematoxylin and Eosin (H&E) staining is the gold standard for oncology care. H&E staining is slow and uses dyes, reagents and precious tissue samples that cannot be reused. Thousands of native nonstained RGB Whole Slide Image (RWSI) patches of prostate core tissue biopsies were registered with their H&E stained versions. Conditional Generative Adversarial Neural Networks (cGANs) that automate conversion of native nonstained RWSI to computational H&E stained images were then trained. High similarities between computational and H&E dye stained images were measured, with Structural Similarity Index (SSIM) 0.902, Pearson's Correlation Coefficient (CC) 0.962 and Peak Signal to Noise Ratio (PSNR) 22.821 dB. A second cGAN performed accurate computational destaining of H&E dye stained images back to their native nonstained form, with SSIM 0.9, CC 0.963 and PSNR 25.646 dB. A single-blind study found more than 95% pixel-by-pixel overlap between prostate tumor annotations provided by five board-certified MD pathologists on computationally stained images and those on their H&E dye stained counterparts. We report the first visualization and explanation of neural network kernel activation maps during H&E staining and destaining of RGB images by cGANs. High similarities between kernel activation maps of computational and H&E stained images (Mean-Squared Errors <0.0005) provide additional mathematical and mechanistic validation of the staining system. Our neural network framework is thus automated and explainable, performs high-precision H&E staining and destaining of low-cost native RGB images, and is computer vision and physician authenticated for rapid and accurate tumor diagnoses. |
Tasks | |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.01593v1 |
https://arxiv.org/pdf/1908.01593v1.pdf | |
PWC | https://paperswithcode.com/paper/high-accuracy-tumor-diagnoses-and |
Repo | |
Framework | |
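Computational staining and destaining are image-to-image translation problems, for which a pix2pix-style conditional GAN objective (adversarial plus L1) is a common choice. The sketch below shows such a training step in PyTorch; the loss weight, network interfaces, and schedule are assumptions and not the paper's exact configuration.

```python
# Hedged sketch of a pix2pix-style conditional GAN training step for
# image-to-image translation (e.g., nonstained RGB -> computational H&E).
import torch
import torch.nn as nn

adv_loss = nn.BCEWithLogitsLoss()
l1_loss = nn.L1Loss()
LAMBDA_L1 = 100.0  # common pix2pix default, assumed here

def train_step(G, D, opt_G, opt_D, unstained, stained):
    # discriminator: real pair (unstained, stained) vs fake pair (unstained, G(unstained))
    fake = G(unstained)
    d_real = D(torch.cat([unstained, stained], dim=1))
    d_fake = D(torch.cat([unstained, fake.detach()], dim=1))
    loss_D = adv_loss(d_real, torch.ones_like(d_real)) + \
             adv_loss(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # generator: fool D while staying close to the dye-stained target in L1
    d_fake = D(torch.cat([unstained, fake], dim=1))
    loss_G = adv_loss(d_fake, torch.ones_like(d_fake)) + LAMBDA_L1 * l1_loss(fake, stained)
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```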
TMLab: Generative Enhanced Model (GEM) for adversarial attacks
Title | TMLab: Generative Enhanced Model (GEM) for adversarial attacks |
Authors | Piotr Niewinski, Maria Pszona, Maria Janicka |
Abstract | We present our Generative Enhanced Model (GEM), which we used to create the samples awarded first prize in the FEVER 2.0 Breakers Task. GEM is an extended language model built upon the GPT-2 architecture. The addition of a novel target-vocabulary input alongside the existing context input enables controlled text generation. The training procedure resulted in a model that inherits the knowledge of pretrained GPT-2 and is therefore able to generate natural-sounding English sentences in the task domain with some additional control. As a result, GEM generated malicious claims that mixed facts from various articles, making their truthfulness difficult to classify. |
Tasks | Language Modelling, Text Generation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00337v1 |
https://arxiv.org/pdf/1910.00337v1.pdf | |
PWC | https://paperswithcode.com/paper/tmlab-generative-enhanced-model-gem-for |
Repo | |
Framework | |
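GEM itself adds a target-vocabulary input to the GPT-2 architecture, which is not reproduced here. As a loose illustration of vocabulary-conditioned generation only, the sketch below biases a stock GPT-2's logits toward a target vocabulary at decoding time using Hugging Face transformers; the bias value and sampling loop are assumptions.

```python
# Loose illustration only: bias a stock GPT-2's next-token logits toward a target
# vocabulary during sampling. This is not the GEM architecture, just the general
# flavor of vocabulary-controlled generation.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def generate_with_vocab_bias(prompt, target_words, bias=4.0, max_new_tokens=30):
    ids = tok(prompt, return_tensors="pt").input_ids
    target_ids = {i for w in target_words for i in tok.encode(" " + w)}
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        logits[list(target_ids)] += bias          # push generation toward the vocabulary
        next_id = torch.multinomial(torch.softmax(logits, dim=-1), 1)
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return tok.decode(ids[0], skip_special_tokens=True)

# print(generate_with_vocab_bias("The president announced that", ["economy", "taxes"]))
```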
Convex Optimisation for Inverse Kinematics
Title | Convex Optimisation for Inverse Kinematics |
Authors | Tarun Yenamandra, Florian Bernard, Jiayi Wang, Franziska Mueller, Christian Theobalt |
Abstract | We consider the problem of inverse kinematics (IK), where one wants to find the parameters of a given kinematic skeleton that best explain a set of observed 3D joint locations. The kinematic skeleton has a tree structure, where each node is a joint that has an associated geometric transformation that is propagated to all its child nodes. The IK problem has various applications in vision and graphics, for example for tracking or reconstructing articulated objects, such as human hands or bodies. Most commonly, the IK problem is tackled using local optimisation methods. A major downside of these approaches is that, due to the non-convex nature of the problem, such methods are prone to converge to unwanted local optima and therefore require a good initialisation. In this paper we propose a convex optimisation approach for the IK problem based on semidefinite programming, which admits a polynomial-time algorithm that globally solves (a relaxation of) the IK problem. Experimentally, we demonstrate that the proposed method significantly outperforms local optimisation methods using different real-world skeletons. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11016v1 |
https://arxiv.org/pdf/1910.11016v1.pdf | |
PWC | https://paperswithcode.com/paper/convex-optimisation-for-inverse-kinematics |
Repo | |
Framework | |
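The paper's contribution is an SDP formulation of IK; its exact construction is not reproduced here. The toy below shows only the generic lift-and-relax pattern behind such approaches (a nonconvex QCQP lifted to a matrix variable with the rank-1 requirement relaxed to a PSD constraint), using cvxpy, to make the flavor of the method concrete.

```python
# Generic lift-and-relax sketch: a nonconvex least-squares problem with x_i^2 = 1
# constraints is lifted to M = [[X, x], [x^T, 1]], X standing in for x x^T, and
# rank(M) = 1 is relaxed to M being positive semidefinite.
import cvxpy as cp
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((6, n))
y = rng.standard_normal(6)

M = cp.Variable((n + 1, n + 1), PSD=True)
X, x = M[:n, :n], M[:n, n]
objective = cp.Minimize(cp.trace(A.T @ A @ X) - 2 * (A.T @ y) @ x + y @ y)
constraints = [M[n, n] == 1, cp.diag(X) == 1]   # relaxed versions of x_i^2 = 1
cp.Problem(objective, constraints).solve()

print("relaxation optimum:", objective.value)
print("rounded candidate solution:", np.sign(x.value))
```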
Asymmetrical Hierarchical Networks with Attentive Interactions for Interpretable Review-Based Recommendation
Title | Asymmetrical Hierarchical Networks with Attentive Interactions for Interpretable Review-Based Recommendation |
Authors | Xin Dong, Jingchao Ni, Wei Cheng, Zhengzhang Chen, Bo Zong, Dongjin Song, Yanchi Liu, Haifeng Chen, Gerard de Melo |
Abstract | Recently, recommender systems have been able to emit substantially improved recommendations by leveraging user-provided reviews. Existing methods typically merge all reviews of a given user or item into a long document, and then process user and item documents in the same manner. In practice, however, these two sets of reviews are notably different: users’ reviews reflect a variety of items that they have bought and are hence very heterogeneous in their topics, while an item’s reviews pertain only to that single item and are thus topically homogeneous. In this work, we develop a novel neural network model that properly accounts for this important difference by means of asymmetric attentive modules. The user module learns to attend to only those signals that are relevant with respect to the target item, whereas the item module learns to extract the most salient contents with regard to properties of the item. Our multi-hierarchical paradigm accounts for the fact that neither are all reviews equally useful, nor are all sentences within each review equally pertinent. Extensive experimental results on a variety of real datasets demonstrate the effectiveness of our method. |
Tasks | Recommendation Systems |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/2001.04346v1 |
https://arxiv.org/pdf/2001.04346v1.pdf | |
PWC | https://paperswithcode.com/paper/asymmetrical-hierarchical-networks-with |
Repo | |
Framework | |
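One ingredient of the model, attention over a user's review sentences conditioned on the target item (so the user module attends only to item-relevant signals), can be sketched as follows in PyTorch. Dimensions and the additive scoring function are illustrative; the full asymmetric hierarchical model is not shown.

```python
# Hedged sketch of item-conditioned attention over a user's review-sentence
# embeddings; the full asymmetric hierarchical architecture is not reproduced.
import torch
import torch.nn as nn

class ItemConditionedAttention(nn.Module):
    def __init__(self, d_sent=128, d_item=64, d_attn=64):
        super().__init__()
        self.w_s = nn.Linear(d_sent, d_attn)
        self.w_i = nn.Linear(d_item, d_attn)
        self.v = nn.Linear(d_attn, 1)

    def forward(self, sent_emb, item_emb):
        # sent_emb: (batch, n_sentences, d_sent), item_emb: (batch, d_item)
        scores = self.v(torch.tanh(self.w_s(sent_emb) + self.w_i(item_emb).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)           # attention over sentences
        return (alpha * sent_emb).sum(dim=1)           # item-aware user representation
```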
TextSR: Content-Aware Text Super-Resolution Guided by Recognition
Title | TextSR: Content-Aware Text Super-Resolution Guided by Recognition |
Authors | Wenjia Wang, Enze Xie, Peize Sun, Wenhai Wang, Lixun Tian, Chunhua Shen, Ping Luo |
Abstract | Scene text recognition has witnessed rapid development with the advance of convolutional neural networks. Nonetheless, most previous methods may not work well when recognizing low-resolution text, which is often seen in natural scene images. An intuitive solution is to introduce super-resolution techniques as pre-processing. However, conventional super-resolution methods in the literature mainly focus on reconstructing the detailed texture of natural images, which typically does not work well for text because of its unique characteristics. To tackle these problems, in this work we propose a content-aware text super-resolution network to generate the information desired for text recognition. In particular, we design an end-to-end network that can perform super-resolution and text recognition simultaneously. Different from previous super-resolution methods, we use the text recognition loss as a Text Perceptual Loss to guide the training of the super-resolution network, so that it pays more attention to the text content rather than the irrelevant background area. Extensive experiments on several challenging benchmarks demonstrate the effectiveness of our proposed method in restoring a sharp high-resolution image from a small blurred one, and show that it clearly boosts the performance of the text recognizer. To our knowledge, this is the first work focusing on text super-resolution. Code will be released at https://github.com/xieenze/TextSR. |
Tasks | Scene Text Recognition, Super-Resolution |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07113v4 |
https://arxiv.org/pdf/1909.07113v4.pdf | |
PWC | https://paperswithcode.com/paper/textsr-content-aware-text-super-resolution |
Repo | |
Framework | |
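The joint objective described in the abstract pairs a pixel-level SR loss with a recognition loss on the super-resolved image. The sketch below is a generic stand-in for that idea using an L1 pixel loss plus a CTC recognition loss; the recognizer interface, shapes, and loss weight are assumptions rather than the paper's Text Perceptual Loss implementation.

```python
# Hedged sketch of a joint SR + recognition objective: pixel loss on the
# super-resolved image plus a CTC loss from a text recognizer applied to it.
import torch
import torch.nn as nn

pixel_loss = nn.L1Loss()
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
LAMBDA_REC = 1.0  # assumed weight

def joint_loss(sr_net, recognizer, lr_img, hr_img, targets, target_lengths):
    sr_img = sr_net(lr_img)
    loss_pix = pixel_loss(sr_img, hr_img)
    # recognizer output assumed to be (T, batch, n_classes) scores over characters
    log_probs = recognizer(sr_img).log_softmax(dim=-1)
    input_lengths = torch.full((sr_img.size(0),), log_probs.size(0), dtype=torch.long)
    loss_rec = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    return loss_pix + LAMBDA_REC * loss_rec
```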
Error Analysis and Correction for Weighted A*'s Suboptimality (Extended Version)
Title | Error Analysis and Correction for Weighted A*'s Suboptimality (Extended Version) |
Authors | Robert C. Holte, Ruben Majadas, Alberto Pozanco, Daniel Borrajo |
Abstract | Weighted A* (wA*) is a widely used algorithm for rapidly, but suboptimally, solving planning and search problems. The cost of the solution it produces is guaranteed to be at most W times the optimal solution cost, where W is the weight wA* uses in prioritizing open nodes. W is therefore a suboptimality bound for the solution produced by wA*. There is broad consensus that this bound is not very accurate, that the actual suboptimality of wA*'s solution is often much less than W times optimal. However, there is very little published evidence supporting that view, and no existing explanation of why W is a poor bound. This paper fills in these gaps in the literature. We begin with a large-scale experiment demonstrating that, across a wide variety of domains and heuristics for those domains, W is indeed very often far from the true suboptimality of wA*'s solution. We then analytically identify the potential sources of error. Finally, we present a practical method for correcting for two of these sources of error and experimentally show that the correction frequently eliminates much of the error. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11346v2 |
https://arxiv.org/pdf/1905.11346v2.pdf | |
PWC | https://paperswithcode.com/paper/error-analysis-and-correction-for-weighted-as |
Repo | |
Framework | |
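The bound discussed above comes from wA*'s priority function f(n) = g(n) + W·h(n), which guarantees a solution cost of at most W times optimal for an admissible heuristic. A minimal sketch, with the graph and heuristic interface assumed for illustration:

```python
# Minimal weighted A*: open nodes are prioritized by f(n) = g(n) + W * h(n).
# neighbors(n) yields (successor, edge_cost); h(n) is an admissible heuristic.
import heapq
import itertools

def weighted_astar(start, goal, neighbors, h, W=2.0):
    tie = itertools.count()                      # tie-breaker so the heap never compares nodes
    open_heap = [(W * h(start), next(tie), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_heap:
        f, _, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g                       # cost is at most W times optimal
        for succ, cost in neighbors(node):
            g2 = g + cost
            if g2 < best_g.get(succ, float("inf")):
                best_g[succ] = g2
                heapq.heappush(open_heap, (g2 + W * h(succ), next(tie), g2, succ, path + [succ]))
    return None, float("inf")
```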
ReD-CaNe: A Systematic Methodology for Resilience Analysis and Design of Capsule Networks under Approximations
Title | ReD-CaNe: A Systematic Methodology for Resilience Analysis and Design of Capsule Networks under Approximations |
Authors | Alberto Marchisio, Vojtech Mrazek, Muhammad Abdullah Hanif, Muhammad Shafique |
Abstract | Recent advances in Capsule Networks (CapsNets) have shown their superior learning capability compared to traditional Convolutional Neural Networks (CNNs). However, the extremely high complexity of CapsNets limits their fast deployment in real-world applications. Moreover, while the resilience of CNNs has been extensively investigated to enable their energy-efficient implementations, the analysis of CapsNets’ resilience is a largely unexplored area that can provide a strong foundation for investigating techniques to overcome the CapsNets’ complexity challenge. Following the trend of Approximate Computing to enable energy-efficient designs, we perform an extensive resilience analysis of CapsNet inference subjected to approximation errors. Our methodology models the errors arising from approximate components (like multipliers) and analyzes their impact on the classification accuracy of CapsNets. This enables the selection of approximate components based on the resilience of each operation of the CapsNet inference. We modify the TensorFlow framework to simulate the injection of approximation noise (based on the models of the approximate components) at different computational operations of the CapsNet inference. Our results show that CapsNets are more resilient to errors injected in the computations that occur during dynamic routing (the softmax and the update of the coefficients) than in other stages such as convolutions and activation functions. Our analysis is extremely useful for designing efficient CapsNet hardware accelerators with approximate components. To the best of our knowledge, this is the first proof-of-concept for employing approximations on specialized CapsNet hardware. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00700v1 |
https://arxiv.org/pdf/1912.00700v1.pdf | |
PWC | https://paperswithcode.com/paper/red-cane-a-systematic-methodology-for |
Repo | |
Framework | |
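The style of experiment described, injecting noise that models an approximate component into one operation and measuring the accuracy impact, can be illustrated with a small NumPy sketch. The Gaussian noise model and the choice of operation (pre-softmax logits) are placeholders, not the paper's error models or CapsNet operations.

```python
# Illustration of the error-injection idea: perturb one operation with noise that
# stands in for an approximate multiplier, and compare accuracy with and without it.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def accuracy_with_noise(logits, labels, noise_std=0.0, seed=0):
    rng = np.random.default_rng(seed)
    noisy = logits + rng.normal(0.0, noise_std, size=logits.shape)
    return (softmax(noisy).argmax(axis=-1) == labels).mean()

# Example: sweep the injected-noise magnitude for a batch of (logits, labels)
# for std in [0.0, 0.1, 0.5, 1.0]:
#     print(std, accuracy_with_noise(logits, labels, noise_std=std))
```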
Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning
Title | Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning |
Authors | Caleb Chuck, Supawit Chockchowwat, Scott Niekum |
Abstract | Deep reinforcement learning (DRL) is capable of learning high-performing policies on a variety of complex high-dimensional tasks, ranging from video games to robotic manipulation. However, standard DRL methods often suffer from poor sample efficiency, partially because they aim to be entirely problem-agnostic. In this work, we introduce a novel approach to exploration and hierarchical skill learning that derives its sample efficiency from intuitive assumptions it makes about the behavior of objects both in the physical world and simulations which mimic physics. Specifically, we propose the Hypothesis Proposal and Evaluation (HyPE) algorithm, which discovers objects from raw pixel data, generates hypotheses about the controllability of observed changes in object state, and learns a hierarchy of skills to test these hypotheses. We demonstrate that HyPE can dramatically improve the sample efficiency of policy learning in two different domains: a simulated robotic block-pushing domain, and a popular benchmark task: Breakout. In these domains, HyPE learns high-scoring policies an order of magnitude faster than several state-of-the-art reinforcement learning methods. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1906.01408v3 |
https://arxiv.org/pdf/1906.01408v3.pdf | |
PWC | https://paperswithcode.com/paper/190601408 |
Repo | |
Framework | |
An Exploratory Analysis of Biased Learners in Soft-Sensing Frames
Title | An Exploratory Analysis of Biased Learners in Soft-Sensing Frames |
Authors | Aysun Urhan, Burak Alakent |
Abstract | Data-driven soft sensor design has recently gained immense popularity due to advances in sensory devices and a growing interest in data mining. While partial least squares (PLS) is traditionally used in the process literature for designing soft sensors, the statistical literature has focused on sparse learners, such as Lasso and the relevance vector machine (RVM), to solve the high-dimensional data problem. In the current study, the predictive performances of three regression techniques, PLS, Lasso and RVM, were assessed and compared under various offline and online soft-sensing scenarios applied to datasets from five real industrial plants and a simulated process. In offline learning, predictions of RVM and Lasso were found to be superior to those of PLS when a large number of time-lagged predictors were used. Online prediction results gave a slightly more complicated picture. It was found that the minimum prediction error achieved by PLS under a moving window (MW) or just-in-time learning scheme was decreased by up to ~5-10% using Lasso or RVM. However, when a small MW size was used, or the optimum number of PLS components was as low as ~1, the prediction performance of PLS surpassed that of RVM, which was found to yield occasionally unstable predictions. PLS and Lasso models constructed via online parameter tuning generally did not yield better predictions than those constructed via offline tuning. We present evidence to suggest that retaining a large portion of the available process measurement data in the predictor matrix, instead of preselecting variables, would be more advantageous for sparse learners in increasing prediction accuracy. As a result, Lasso is recommended as a better substitute for PLS in soft sensors, while the performance of RVM should be validated before online application. |
Tasks | |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10753v1 |
http://arxiv.org/pdf/1904.10753v1.pdf | |
PWC | https://paperswithcode.com/paper/an-exploratory-analysis-of-biased-learners-in |
Repo | |
Framework | |
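A minimal sketch of the soft-sensing setup discussed above: build a time-lagged predictor matrix from the process measurements and retrain a Lasso model on a moving window, predicting one step ahead. The lag count and window size are illustrative choices, not the study's settings.

```python
# Moving-window Lasso soft sensor on time-lagged process measurements.
import numpy as np
from sklearn.linear_model import LassoCV

def make_lagged(X, n_lags=5):
    # row for time t stacks the measurements X[t-n_lags+1 .. t]
    parts = [X[lag:len(X) - n_lags + 1 + lag] for lag in range(n_lags)]
    return np.hstack(parts)

def moving_window_predictions(X, y, n_lags=5, window=200):
    Xl = make_lagged(X, n_lags)
    yl = y[n_lags - 1:]                 # align targets with the lagged rows
    preds = []
    for t in range(window, len(yl) - 1):
        model = LassoCV(cv=5).fit(Xl[t - window:t], yl[t - window:t])
        preds.append(model.predict(Xl[t + 1:t + 2])[0])
    return np.array(preds)
```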
An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions
Title | An Investigation of End-to-End Multichannel Speech Recognition for Reverberant and Mismatch Conditions |
Authors | Aswin Shanmugam Subramanian, Xiaofei Wang, Shinji Watanabe, Toru Taniguchi, Dung Tran, Yuya Fujita |
Abstract | Sequence-to-sequence (S2S) modeling is becoming a popular paradigm for automatic speech recognition (ASR) because of its ability to jointly optimize all the conventional ASR components in an end-to-end (E2E) fashion. This report investigates extending E2E ASR from standard close-talk to far-field applications by encompassing the entire multichannel speech enhancement and ASR components within the S2S model. There have been previous studies on jointly optimizing neural beamforming alongside E2E ASR for denoising. It is clear from both recent challenge outcomes and successful products that far-field systems would be incomplete without solving both denoising and dereverberation simultaneously. This report uses a recently developed architecture for far-field ASR that composes neural extensions of dereverberation and beamforming modules with the S2S ASR module as a single differentiable neural network, while clearly defining the role of each subnetwork. The original implementation of this architecture was successfully applied to a noisy speech recognition task (CHiME-4), whereas here we apply it to noisy reverberant tasks (DIRHA and REVERB). Our investigation shows that the method achieves better performance than conventional pipeline methods on the DIRHA English dataset and comparable performance on the REVERB dataset. It also has the additional advantages of being neither iterative nor requiring parallel noisy and clean speech data. |
Tasks | Denoising, Noisy Speech Recognition, Speech Enhancement, Speech Recognition |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09049v3 |
http://arxiv.org/pdf/1904.09049v3.pdf | |
PWC | https://paperswithcode.com/paper/dry-focus-and-transcribe-end-to-end |
Repo | |
Framework | |
Differentially Private Regression and Classification with Sparse Gaussian Processes
Title | Differentially Private Regression and Classification with Sparse Gaussian Processes |
Authors | Michael Thomas Smith, Mauricio A. Alvarez, Neil D. Lawrence |
Abstract | A continuing challenge for machine learning is providing methods to perform computation on data while ensuring the data remains private. In this paper we build on the provable privacy guarantees of differential privacy which has been combined with Gaussian processes through the previously published cloaking method. In this paper we solve several shortcomings of this method, starting with the problem of predictions in regions with low data density. We experiment with the use of inducing points to provide a sparse approximation and show that these can provide robust differential privacy in outlier areas and at higher dimensions. We then look at classification, and modify the Laplace approximation approach to provide differentially private predictions. We then combine this with the sparse approximation and demonstrate the capability to perform classification in high dimensions. We finally explore the issue of hyperparameter selection and develop a method for their private selection. This paper and associated libraries provide a robust toolkit for combining differential privacy and GPs in a practical manner. |
Tasks | Gaussian Processes |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09147v1 |
https://arxiv.org/pdf/1909.09147v1.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-regression-and |
Repo | |
Framework | |