Paper Group ANR 517
Reconstructing the Noise Manifold for Image Denoising
Title | Reconstructing the Noise Manifold for Image Denoising |
Authors | Ioannis Marras, Grigorios G. Chrysos, Ioannis Alexiou, Gregory Slabaugh, Stefanos Zafeiriou |
Abstract | Deep Convolutional Neural Networks (CNNs) have been successfully used in many low-level vision problems like image denoising. Although conditional image generation techniques have led to large improvements in this task, there has been little effort to provide conditional generative adversarial networks (cGAN)[42] with an explicit way of understanding the image noise for object-independent denoising that is reliable in real-world applications. Leveraging structures in the target space is unstable due to the complexity of patterns in natural scenes, so unnatural artifacts or over-smoothed image areas cannot be avoided. To fill this gap, in this work we introduce the idea of a cGAN which explicitly leverages structure in the image noise space. By directly learning a low-dimensional manifold of the image noise, the generator promotes removing from the noisy image only the information that spans this manifold. This idea brings many advantages, and the module can be appended to the end of any denoiser to significantly improve its performance. Based on our experiments, our model substantially outperforms existing state-of-the-art architectures, resulting in denoised images with less oversmoothing and better detail. |
Tasks | Conditional Image Generation, Denoising, Image Denoising, Image Generation, Super-Resolution |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04147v2 |
https://arxiv.org/pdf/2002.04147v2.pdf | |
PWC | https://paperswithcode.com/paper/reconstructing-the-noise-manifold-for-image |
Repo | |
Framework | |
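The core idea above — estimate the noise component and subtract only that from the input — can be sketched as residual denoising. This is a toy stand-in, not the paper's cGAN: `toy_noise_predictor` is a hypothetical substitute for the learned generator, "predicting" noise as the deviation from a local smooth estimate.

```python
import numpy as np

def denoise_with_noise_model(noisy, predict_noise):
    """Remove only the content the noise model explains: the model
    estimates the noise component, which is subtracted from the input."""
    return noisy - predict_noise(noisy)

# Hypothetical stand-in for a learned generator: assume zero-mean noise
# and "predict" it as the deviation from a 3-tap moving average.
def toy_noise_predictor(img):
    smooth = np.convolve(img, np.ones(3) / 3, mode="same")
    return img - smooth

rng = np.random.default_rng(4)
clean = np.sin(np.linspace(0, 3, 64))          # slowly varying signal
noisy = clean + 0.05 * rng.normal(size=64)     # additive noise
denoised = denoise_with_noise_model(noisy, toy_noise_predictor)
```

Because only the predicted noise is removed, the signal content outside the noise model is left untouched — the property the abstract attributes to learning the noise manifold.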
Utilizing Deep Learning to Identify Drug Use on Twitter Data
Title | Utilizing Deep Learning to Identify Drug Use on Twitter Data |
Authors | Joseph Tassone, Peizhi Yan, Mackenzie Simpson, Chetan Mendhe, Vijay Mago, Salimur Choudhury |
Abstract | The collection and examination of social media has become a useful mechanism for studying the mental activity and behavior tendencies of users. Through the analysis of collected Twitter data, models were developed for classifying drug-related tweets. Using topic-pertaining keywords, such as slang and methods of drug consumption, a set of tweets was generated. Potential candidates were then preprocessed, resulting in a dataset of 3,696,150 rows. The classification power of multiple methods was compared, including support vector machines (SVM), XGBoost, and convolutional neural network (CNN) based classifiers. Rather than simple feature or attribute analysis, a deep learning approach was implemented to screen and analyze the tweets' semantic meaning. The two CNN-based classifiers gave the best results when compared against the other methodologies. The first was trained with 2,661 manually labeled samples, while the other included synthetically generated tweets, culminating in 12,142 samples. The accuracy scores were 76.35% and 82.31%, with AUCs of 0.90 and 0.91. Additionally, association rule mining showed that commonly mentioned drugs had a level of correspondence with frequently used illicit substances, indicating the practical usefulness of the system. Lastly, the synthetically augmented set yielded higher scores, improving the classification capability and demonstrating the worth of this methodology. |
Tasks | |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.11522v1 |
https://arxiv.org/pdf/2003.11522v1.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-deep-learning-to-identify-drug-use |
Repo | |
Framework | |
Learning a distance function with a Siamese network to localize anomalies in videos
Title | Learning a distance function with a Siamese network to localize anomalies in videos |
Authors | Bharathkumar Ramachandra, Michael J. Jones, Ranga Raju Vatsavai |
Abstract | This work introduces a new approach to localize anomalies in surveillance video. The main novelty is the idea of using a Siamese convolutional neural network (CNN) to learn a distance function between a pair of video patches (spatio-temporal regions of video). The learned distance function, which is not specific to the target video, is used to measure the distance between each video patch in the testing video and the video patches found in normal training video. If a testing video patch is not similar to any normal video patch then it must be anomalous. We compare our approach to previously published algorithms using 4 evaluation measures and 3 challenging target benchmark datasets. Experiments show that our approach either surpasses or performs comparably to current state-of-the-art methods. |
Tasks | |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.09189v1 |
https://arxiv.org/pdf/2001.09189v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-distance-function-with-a-siamese |
Repo | |
Framework | |
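The test-time procedure described in the abstract above — score each test patch by its distance to the closest normal training patch — can be sketched as a nearest-neighbor search. Plain Euclidean distance stands in for the Siamese-learned metric here; patches are flattened feature vectors and the threshold is illustrative.

```python
import numpy as np

def anomaly_scores(test_patches, normal_patches):
    """Score each test patch by its distance to the closest normal patch.

    The paper learns this distance with a Siamese CNN; plain Euclidean
    distance is a stand-in for the learned metric.
    """
    # Pairwise squared distances, shape (n_test, n_normal).
    diffs = test_patches[:, None, :] - normal_patches[None, :, :]
    d2 = np.sum(diffs ** 2, axis=-1)
    # A patch far from every normal exemplar is likely anomalous.
    return np.sqrt(d2.min(axis=1))

normal = np.array([[0.0, 0.0], [1.0, 1.0]])   # "normal" exemplars
test = np.array([[0.1, 0.0], [5.0, 5.0]])      # one normal-ish, one far
scores = anomaly_scores(test, normal)
is_anomalous = scores > 1.0                     # illustrative threshold
```

The learned metric matters precisely because raw pixel distances conflate appearance changes with true anomalies; the sketch only shows the search structure around it.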
TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images
Title | TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images |
Authors | Shubham Paliwal, Vishwanath D, Rohit Rahul, Monika Sharma, Lovekesh Vig |
Abstract | With the widespread use of mobile phones and scanners to photograph and upload documents, the need for extracting the information trapped in unstructured document images such as retail receipts, insurance claim forms and financial invoices is becoming more acute. A major hurdle to this objective is that these images often contain information in the form of tables, and extracting data from tabular sub-images presents a unique set of challenges. This includes accurate detection of the tabular region within an image, and subsequently detecting and extracting information from the rows and columns of the detected table. While some progress has been made in table detection, extracting the table contents is still a challenge since this involves more fine-grained table structure (rows & columns) recognition. Prior approaches have attempted to solve the table detection and structure recognition problems independently using two separate models. In this paper, we propose TableNet: a novel end-to-end deep learning model for both table detection and structure recognition. The model exploits the interdependence between the twin tasks of table detection and table structure recognition to segment out the table and column regions. This is followed by semantic rule-based row extraction from the identified tabular sub-regions. The proposed model and extraction approach were evaluated on the publicly available ICDAR 2013 and Marmot Table datasets, obtaining state-of-the-art results. Additionally, we demonstrate that feeding additional semantic features further improves model performance and that the model exhibits transfer learning across datasets. Another contribution of this paper is to provide additional table structure annotations for the Marmot data, which currently only has annotations for table detection. |
Tasks | Table Detection, Transfer Learning |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01469v1 |
https://arxiv.org/pdf/2001.01469v1.pdf | |
PWC | https://paperswithcode.com/paper/tablenet-deep-learning-model-for-end-to-end |
Repo | |
Framework | |
Towards an Efficient and General Framework of Robust Training for Graph Neural Networks
Title | Towards an Efficient and General Framework of Robust Training for Graph Neural Networks |
Authors | Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, Xue Lin |
Abstract | Graph Neural Networks (GNNs) have made significant advances on several fundamental inference tasks. As a result, there is a surge of interest in using these models for making potentially important decisions in high-regret applications. However, despite GNNs' impressive performance, it has been observed that carefully crafted perturbations on graph structures (or node attributes) lead them to make wrong predictions. The presence of these adversarial examples raises serious security concerns. Most existing robust GNN design/training methods are only applicable to white-box settings, where model parameters are known and gradient-based methods can be used by performing convex relaxation of the discrete graph domain. More importantly, these methods are not efficient and scalable, which makes them infeasible for time-sensitive tasks and massive graph datasets. To overcome these limitations, we propose a general framework which leverages greedy search algorithms and zeroth-order methods to obtain robust GNNs in a generic and efficient manner. On several applications, we show that the proposed techniques are significantly less computationally expensive and, in some cases, more robust than the state-of-the-art methods, making them suitable for large-scale problems that were out of the reach of traditional robust training methods. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10947v1 |
https://arxiv.org/pdf/2002.10947v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-an-efficient-and-general-framework-of |
Repo | |
Framework | |
HarDNN: Feature Map Vulnerability Evaluation in CNNs
Title | HarDNN: Feature Map Vulnerability Evaluation in CNNs |
Authors | Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler |
Abstract | As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors. Transient hardware errors may percolate undesirable state during execution, resulting in software-manifested errors which can adversely affect high-level decision making. This paper presents HarDNN, a software-directed approach to identify vulnerable computations during a CNN inference and selectively protect them based on their propensity towards corrupting the inference output in the presence of a hardware error. We show that HarDNN can accurately estimate relative vulnerability of a feature map (fmap) in CNNs using a statistical error injection campaign, and explore heuristics for fast vulnerability assessment. Based on these results, we analyze the tradeoff between error coverage and computational overhead that the system designers can use to employ selective protection. Results show that the improvement in resilience for the added computation is superlinear with HarDNN. For example, HarDNN improves SqueezeNet’s resilience by 10x with just 30% additional computations. |
Tasks | Decision Making |
Published | 2020-02-22 |
URL | https://arxiv.org/abs/2002.09786v2 |
https://arxiv.org/pdf/2002.09786v2.pdf | |
PWC | https://paperswithcode.com/paper/hardnn-feature-map-vulnerability-evaluation |
Repo | |
Framework | |
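The statistical error-injection campaign described above can be illustrated on a toy network: perturb one "feature map" activation at a time and count how often the prediction flips. Everything here is a hypothetical stand-in — a two-layer dense network instead of a CNN, a Gaussian perturbation instead of a hardware bit-flip model — to show the shape of the vulnerability estimate, not HarDNN itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network standing in for a CNN; W1/W2 are made-up weights.
W1 = rng.normal(size=(8, 4))   # input -> 8 "feature map" activations
W2 = rng.normal(size=(3, 8))   # activations -> 3 class scores

def forward(x, inject_fmap=None, err=0.0):
    h = np.maximum(W1 @ x, 0.0)          # ReLU feature maps
    if inject_fmap is not None:
        h[inject_fmap] += err            # crude transient-error model
    return int(np.argmax(W2 @ h))

def fmap_vulnerability(x, fmap, n_trials=200):
    """Fraction of random injections into `fmap` that flip the prediction."""
    clean = forward(x)
    errs = rng.normal(scale=10.0, size=n_trials)   # error-magnitude model
    flips = sum(forward(x, fmap, e) != clean for e in errs)
    return flips / n_trials

x = rng.normal(size=4)
vuln = [fmap_vulnerability(x, f) for f in range(8)]
# Selective protection: duplicate/check the most vulnerable fmaps first.
ranking = np.argsort(vuln)[::-1]
```

Ranking feature maps by this flip rate is what lets the coverage-vs-overhead tradeoff in the abstract be superlinear: protecting a few high-rank maps covers most output corruptions.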
Medical image reconstruction with image-adaptive priors learned by use of generative adversarial networks
Title | Medical image reconstruction with image-adaptive priors learned by use of generative adversarial networks |
Authors | Sayantan Bhadra, Weimin Zhou, Mark A. Anastasio |
Abstract | Medical image reconstruction is typically an ill-posed inverse problem. In order to address such ill-posed problems, the prior distribution of the sought after object property is usually incorporated by means of some sparsity-promoting regularization. Recently, prior distributions for images estimated using generative adversarial networks (GANs) have shown great promise in regularizing some of these image reconstruction problems. In this work, we apply an image-adaptive GAN-based reconstruction method (IAGAN) to reconstruct high fidelity images from incomplete medical imaging data. It is observed that the IAGAN method can potentially recover fine structures in the object that are relevant for medical diagnosis but may be oversmoothed in reconstructions with traditional sparsity-promoting regularization. |
Tasks | Image Reconstruction, Medical Diagnosis |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.10830v1 |
https://arxiv.org/pdf/2001.10830v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-image-reconstruction-with-image |
Repo | |
Framework | |
Q-Learning in enormous action spaces via amortized approximate maximization
Title | Q-Learning in enormous action spaces via amortized approximate maximization |
Authors | Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih |
Abstract | Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions. Motivated by techniques from amortized inference, we replace the expensive maximization over all actions with a maximization over a small subset of possible actions sampled from a learned proposal distribution. The resulting approach, which we dub Amortized Q-learning (AQL), is able to handle discrete, continuous, or hybrid action spaces while maintaining the benefits of Q-learning. Our experiments on continuous control tasks with up to 21 dimensional actions show that AQL outperforms D3PG (Barth-Maron et al, 2018) and QT-Opt (Kalashnikov et al, 2018). Experiments on structured discrete action spaces demonstrate that AQL can efficiently learn good policies in spaces with thousands of discrete actions. |
Tasks | Continuous Control, Q-Learning |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08116v1 |
https://arxiv.org/pdf/2001.08116v1.pdf | |
PWC | https://paperswithcode.com/paper/q-learning-in-enormous-action-spaces-via |
Repo | |
Framework | |
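The amortized maximization at the heart of AQL — replace the expensive argmax over all actions with an argmax over a small sampled subset — can be sketched as follows. Both `q_value` and `proposal_sample` are hypothetical stand-ins for the learned Q-network and proposal distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for learned components: a quadratic Q and a Gaussian proposal.
def q_value(state, action):
    return -np.sum((action - state) ** 2)   # peaks where action == state

def proposal_sample(state, n):
    # A trained proposal would concentrate near high-value actions;
    # here we just sample around the state with exploration noise.
    return state + rng.normal(scale=0.5, size=(n, state.shape[0]))

def amortized_argmax(state, n_samples=64):
    """Approximate argmax_a Q(s, a) over a sampled subset of actions."""
    candidates = proposal_sample(state, n_samples)
    values = np.array([q_value(state, a) for a in candidates])
    return candidates[np.argmax(values)]

state = np.array([0.3, -0.7])
best = amortized_argmax(state)
```

Because the maximization touches only `n_samples` candidates, the same routine works whether the action space is continuous, discrete, or hybrid — the property the abstract highlights.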
Diseño e implementación de una meta-heurística multi-poblacional de optimización combinatoria enfocada a la resolución de problemas de asignación de rutas a vehículos (Design and implementation of a multi-population combinatorial-optimization meta-heuristic for solving vehicle routing problems)
Title | Diseño e implementación de una meta-heurística multi-poblacional de optimización combinatoria enfocada a la resolución de problemas de asignación de rutas a vehículos |
Authors | Eneko Osaba |
Abstract | Transportation is an essential area in today's society, both for the business sector and for citizens. There are different kinds of transportation systems, each with its own characteristics. In the same way, various areas of knowledge can deal efficiently with transport planning. The majority of problems related to transport and logistics have common characteristics, so they can be modeled as optimization problems and seen as special cases of other generic problems. These problems fit into the combinatorial optimization field, and many of them are exceptionally complex. A great number of meta-heuristics can be found in the literature, each with its advantages and disadvantages. Due to the high complexity of combinatorial optimization problems, there is no technique able to solve all of them optimally. This fact makes the fields of combinatorial optimization and vehicle routing problems hot topics of research. This doctoral thesis focuses its efforts on developing a new meta-heuristic to solve different kinds of vehicle routing problems. The presented technique offers added value compared to existing methods, both in performance and in conceptual originality. With the aim of validating the proposed model, the results obtained by the developed meta-heuristic have been compared with those obtained by four other algorithms of similar philosophy. Four well-known routing problems have been used in this experimentation, as well as two classical combinatorial optimization problems. In addition to comparisons based on parameters such as the mean or the standard deviation, two different statistical tests have been carried out. Thanks to these tests, it can be affirmed that the proposed meta-heuristic is competitive in terms of performance and conceptual originality. |
Tasks | Combinatorial Optimization |
Published | 2020-03-24 |
URL | https://arxiv.org/abs/2003.11393v1 |
https://arxiv.org/pdf/2003.11393v1.pdf | |
PWC | https://paperswithcode.com/paper/diseno-e-implementacion-de-una-meta |
Repo | |
Framework | |
Imputation for High-Dimensional Linear Regression
Title | Imputation for High-Dimensional Linear Regression |
Authors | Kabir Aladin Chandrasekher, Ahmed El Alaoui, Andrea Montanari |
Abstract | We study high-dimensional regression with missing entries in the covariates. A common strategy in practice is to impute the missing entries with an appropriate substitute and then implement a standard statistical procedure acting as if the covariates were fully observed. Recent literature on this subject proposes instead to design a specific, often complicated or non-convex, algorithm tailored to the case of missing covariates. We investigate a simpler approach where we fill in the missing entries with their conditional mean given the observed covariates. We show that this imputation scheme, coupled with standard off-the-shelf procedures such as the LASSO and square-root LASSO, retains the minimax estimation rate in the random-design setting where the covariates are i.i.d. sub-Gaussian. We further show that the square-root LASSO remains pivotal in this setting. It is often the case that the conditional expectation cannot be computed exactly and must be approximated from data. We study two cases where the covariates either follow an autoregressive (AR) process, or are jointly Gaussian with sparse precision matrix. We propose tractable estimators for the conditional expectation, then perform linear regression via the LASSO, and show similar estimation rates in both cases. We complement our theoretical results with simulations on synthetic and semi-synthetic examples, illustrating not only the sharpness of our bounds, but also the broader utility of this strategy beyond our theoretical assumptions. |
Tasks | Imputation |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.09180v1 |
https://arxiv.org/pdf/2001.09180v1.pdf | |
PWC | https://paperswithcode.com/paper/imputation-for-high-dimensional-linear |
Repo | |
Framework | |
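The impute-then-LASSO pipeline above can be sketched end to end. The sketch simplifies the paper's scheme: with independent covariates the conditional mean reduces to the column mean, and a small cyclic coordinate-descent LASSO stands in for an off-the-shelf solver. Data, coefficients, and the penalty `lam` are all illustrative.

```python
import numpy as np

def impute_mean(X):
    """Fill NaNs with the column mean — the conditional mean when
    covariates are independent (a simplification of the paper's scheme)."""
    col_mean = np.nanmean(X, axis=0)
    X = X.copy()
    idx = np.where(np.isnan(X))
    X[idx] = np.take(col_mean, idx[1])
    return X

def lasso_cd(X, y, lam, n_iter=200):
    """Plain LASSO via cyclic coordinate descent with soft-thresholding."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
beta_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0])        # sparse truth
y = X @ beta_true + 0.1 * rng.normal(size=200)
X_miss = X.copy()
X_miss[rng.random(X.shape) < 0.1] = np.nan              # 10% missing
beta_hat = lasso_cd(impute_mean(X_miss), y, lam=5.0)
```

Even with 10% of entries imputed, the support of the sparse coefficient vector is recovered; mean imputation mildly attenuates the nonzero coefficients, which is the kind of effect the paper's rates quantify.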
MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning
Title | MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning |
Authors | Simon Vandenhende, Stamatios Georgoulis, Luc Van Gool |
Abstract | In this paper, we argue for the importance of considering task interactions at multiple scales when distilling task information in a multi-task learning setup. In contrast to common belief, we show that tasks with high affinity at a certain scale are not guaranteed to retain this behaviour at other scales, and vice versa. We propose a novel architecture, namely MTI-Net, that builds upon this finding in three ways. First, it explicitly models task interactions at every scale via a multi-scale multi-modal distillation unit. Second, it propagates distilled task information from lower to higher scales via a feature propagation module. Third, it aggregates the refined task features from all scales via a feature aggregation unit to produce the final per-task predictions. Extensive experiments on two multi-task dense labeling datasets show that, unlike prior work, our multi-task model delivers on the full potential of multi-task learning, that is, smaller memory footprint, reduced number of calculations, and better performance w.r.t. single-task learning. |
Tasks | Multi-Task Learning |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06902v3 |
https://arxiv.org/pdf/2001.06902v3.pdf | |
PWC | https://paperswithcode.com/paper/mti-net-multi-scale-task-interaction-networks |
Repo | |
Framework | |
Moniqua: Modulo Quantized Communication in Decentralized SGD
Title | Moniqua: Modulo Quantized Communication in Decentralized SGD |
Authors | Yucheng Lu, Christopher De Sa |
Abstract | Running Stochastic Gradient Descent (SGD) in a decentralized fashion has shown promising results. In this paper we propose Moniqua, a technique that allows decentralized SGD to use quantized communication. We prove in theory that Moniqua communicates a provably bounded number of bits per iteration, while converging at the same asymptotic rate as the original algorithm does with full-precision communication. Moniqua improves upon prior works in that it (1) requires zero additional memory, (2) works with 1-bit quantization, and (3) is applicable to a variety of decentralized algorithms. We demonstrate empirically that Moniqua converges faster with respect to wall clock time than other quantized decentralized algorithms. We also show that Moniqua is robust to very low bit-budgets, allowing 1-bit-per-parameter communication without compromising validation accuracy when training ResNet20 and ResNet110 on CIFAR10. |
Tasks | Quantization |
Published | 2020-02-26 |
URL | https://arxiv.org/abs/2002.11787v1 |
https://arxiv.org/pdf/2002.11787v1.pdf | |
PWC | https://paperswithcode.com/paper/moniqua-modulo-quantized-communication-in-1 |
Repo | |
Framework | |
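The modulo idea that gives Moniqua its name can be shown in isolation: if the receiver holds a local estimate within B/2 of the true value, the sender needs to transmit only the residue modulo B, and the receiver recovers the value exactly. This sketch omits Moniqua's quantization and the decentralized SGD loop entirely — it only demonstrates why the communicated range can be bounded; `B` and the staleness bound are illustrative.

```python
import numpy as np

def encode(x, B):
    """Transmit only the residue of x modulo B (a bounded range,
    hence fewer bits than x itself would need)."""
    return np.mod(x, B)

def decode(residue, local_estimate, B):
    """Recover x from its residue, given a local estimate known to be
    within B/2 of the true value (the bounded-staleness assumption)."""
    diff = np.mod(residue - local_estimate, B)
    diff = np.where(diff > B / 2, diff - B, diff)  # center into (-B/2, B/2]
    return local_estimate + diff

x = np.array([3.7, -1.2, 10.4])                 # true values
estimate = x + np.array([0.3, -0.4, 0.2])       # stale copy, error < B/2
B = 2.0
recovered = decode(encode(x, B), estimate, B)   # equals x exactly
```

Because only residues in [0, B) cross the wire, the bit cost per entry depends on B and the quantization step, not on the magnitude of the parameters — the "provably bounded number of bits per iteration" in the abstract.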
EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning
Title | EventSR: From Asynchronous Events to Image Reconstruction, Restoration, and Super-Resolution via End-to-End Adversarial Learning |
Authors | Lin Wang, Tae-Kyun Kim, Kuk-Jin Yoon |
Abstract | Event cameras sense intensity changes and have many advantages over conventional cameras. To take advantage of event cameras, some methods have been proposed to reconstruct intensity images from event streams. However, the outputs are still low resolution (LR), noisy, and unrealistic. The low-quality outputs limit broader applications of event cameras, where high spatial resolution (HR) is needed as well as high temporal resolution, dynamic range, and freedom from motion blur. We consider the problem of reconstructing and super-resolving intensity images from LR events when no ground truth (GT) HR images and down-sampling kernels are available. To tackle these challenges, we propose EventSR, a novel end-to-end pipeline that reconstructs LR images from event streams, enhances the image quality, and upsamples the enhanced images. In the absence of real GT images, our method is primarily unsupervised, deploying adversarial learning. To train EventSR, we create an open dataset including both real-world and simulated scenes. The use of both datasets boosts the network performance, and the network architectures and various loss functions in each phase help improve the image quality. The whole pipeline is trained in three phases. While each phase is mainly dedicated to one of the three tasks, the networks in earlier phases are fine-tuned by the respective loss functions in an end-to-end manner. Experimental results show that EventSR reconstructs high-quality SR images from events for both simulated and real-world data. |
Tasks | Image Reconstruction, Super-Resolution |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07640v1 |
https://arxiv.org/pdf/2003.07640v1.pdf | |
PWC | https://paperswithcode.com/paper/eventsr-from-asynchronous-events-to-image |
Repo | |
Framework | |
Pyramidal Edge-maps based Guided Thermal Super-resolution
Title | Pyramidal Edge-maps based Guided Thermal Super-resolution |
Authors | Honey Gupta, Kaushik Mitra |
Abstract | Thermal imaging is a robust sensing technique but its consumer applicability is limited by the high cost of thermal sensors. Nevertheless, low-resolution thermal cameras are relatively affordable and are also usually accompanied by a high-resolution visible-range camera. This visible-range image can be used as a guide to reconstruct a high-resolution thermal image using guided super-resolution (GSR) techniques. However, the difference in wavelength range of the input images makes this task challenging. Improper processing can introduce artifacts such as blur and ghosting, mainly due to texture and content mismatch. To this end, we propose a novel algorithm for guided super-resolution that explicitly tackles the issue of texture mismatch caused by multimodality. We propose a two-stage network that combines information from a low-resolution thermal and a high-resolution visible image with the help of multi-level edge extraction and integration. The first stage of our network extracts edge-maps from the visual image at different pyramidal levels, and the second stage integrates these edge-maps into our proposed super-resolution network at appropriate layers. Extraction and integration of edges belonging to different scales simplifies the task of GSR as it provides texture-to-object-level information in a progressive manner. Using multi-level edges also allows us to adjust the contribution of the visual image directly at test time, and thus provides controllability. We perform multiple experiments and show that our method performs better than existing state-of-the-art guided super-resolution methods both quantitatively and qualitatively. |
Tasks | Super-Resolution |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.06216v1 |
https://arxiv.org/pdf/2003.06216v1.pdf | |
PWC | https://paperswithcode.com/paper/pyramidal-edge-maps-based-guided-thermal |
Repo | |
Framework | |
Texture Classification using Block Intensity and Gradient Difference (BIGD) Descriptor
Title | Texture Classification using Block Intensity and Gradient Difference (BIGD) Descriptor |
Authors | Yuting Hu, Zhen Wang, Ghassan AlRegib |
Abstract | In this paper, we present an efficient and distinctive local descriptor, namely block intensity and gradient difference (BIGD). In an image patch, we randomly sample multi-scale block pairs and utilize the intensity and gradient differences of pairwise blocks to construct the local BIGD descriptor. The random sampling strategy and the multi-scale framework help BIGD descriptors capture the distinctive patterns of patches at different orientations and spatial granularity levels. We use vectors of locally aggregated descriptors (VLAD) or the improved Fisher vector (IFV) to encode local BIGD descriptors into a full image descriptor, which is then fed into a linear support vector machine (SVM) classifier for texture classification. We compare the proposed descriptor with typical and state-of-the-art ones by evaluating their classification performance on five public texture data sets: Brodatz, CUReT, KTH-TIPS, KTH-TIPS-2a, and KTH-TIPS-2b. Experimental results show that the proposed BIGD descriptor, with its stronger discriminative power, yields 0.12% to 6.43% higher classification accuracy than the state-of-the-art texture descriptor, dense microblock difference (DMD). |
Tasks | Texture Classification |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01154v1 |
https://arxiv.org/pdf/2002.01154v1.pdf | |
PWC | https://paperswithcode.com/paper/texture-classification-using-block-intensity |
Repo | |
Framework | |
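The sampling scheme described in the abstract above can be sketched for the intensity channel alone: draw random block pairs inside a patch and record the difference of their mean intensities. The gradient-difference channels and the multi-scale block sizes the paper also uses are omitted for brevity; block size and pair count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def bigd_descriptor(patch, n_pairs=32, block=3):
    """Intensity part of a BIGD-style descriptor: differences of mean
    intensity between randomly sampled block pairs within the patch."""
    h, w = patch.shape
    feats = []
    for _ in range(n_pairs):
        # Two random top-left corners, each a (row, col) pair.
        (y1, x1), (y2, x2) = rng.integers(
            0, [h - block, w - block], size=(2, 2))
        b1 = patch[y1:y1 + block, x1:x1 + block].mean()
        b2 = patch[y2:y2 + block, x2:x2 + block].mean()
        feats.append(b1 - b2)
    return np.array(feats)

patch = rng.random((16, 16))          # toy grayscale patch in [0, 1)
desc = bigd_descriptor(patch)
```

In the full method these per-patch vectors are aggregated over the image with VLAD or IFV before the linear SVM; the sketch stops at the local descriptor.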