Paper Group ANR 1130
Connecting Visual Experiences using Max-flow Network with Application to Visual Localization
Title | Connecting Visual Experiences using Max-flow Network with Application to Visual Localization |
Authors | A. H. Abdul Hafez, Nakul Agarwal, C. V. Jawahar |
Abstract | We are motivated by the fact that multiple representations of the environment are required to stand for the changes in appearance with time and for changes that appear in a cyclic manner. These changes are, for example, from day to night time, and from day to day across seasons. In such situations, the robot visits the same routes multiple times and collects different appearances of it. Multiple visual experiences are commonly used in robotic vision applications like visual localization, mapping, place recognition, and autonomous navigation. The novelty in this paper is an algorithm that connects multiple visual experiences via aligning multiple image sequences. This problem is solved by finding the maximum flow in a directed graph flow-network, whose vertices represent the matches between frames in the test and reference sequences. Edges of the graph represent the cost of these matches. The problem of finding the best match is reduced to finding the minimum-cut surface, which is solved as a maximum flow network problem. Application to visual localization is considered in this paper to show the effectiveness of the proposed multiple image sequence alignment method, without losing its generality. Experimental evaluations show that the precision of sequence matching is improved by considering multiple visual sequences for the same route, and that the method performs favorably against state-of-the-art single representation methods like SeqSLAM and ABLE-M. |
Tasks | Autonomous Navigation, Visual Localization |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00208v1 |
http://arxiv.org/pdf/1808.00208v1.pdf | |
PWC | https://paperswithcode.com/paper/connecting-visual-experiences-using-max-flow |
Repo | |
Framework | |
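The max-flow/min-cut reduction mentioned in the abstract above can be illustrated with a generic solver. The following is a minimal sketch of the Edmonds-Karp algorithm on a toy graph; the node names, capacities, and the `max_flow` helper are assumptions made for illustration and do not reproduce the paper's graph of frame matches.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest residual paths until none remain."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path from source to sink.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: flow equals the min cut

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Toy network (hypothetical capacities, not taken from the paper).
toy_graph = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
toy_flow = max_flow(toy_graph, "s", "t")
```

By the max-flow min-cut theorem, the returned value equals the capacity of the cheapest cut separating the source from the sink, which is the quantity the abstract reduces sequence matching to.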
Abstraction Learning
Title | Abstraction Learning |
Authors | Fei Deng, Jinsheng Ren, Feng Chen |
Abstract | There has been a gap between artificial intelligence and human intelligence. In this paper, we identify three key elements forming human intelligence, and suggest that abstraction learning combines these elements and is thus a way to bridge the gap. Prior research in artificial intelligence either specifies abstraction by human experts, or takes abstraction as a qualitative explanation for the model. This paper aims to learn abstraction directly. We tackle three main challenges: representation, objective function, and learning algorithm. Specifically, we propose a partition structure that contains pre-allocated abstraction neurons; we formulate abstraction learning as a constrained optimization problem, which integrates abstraction properties; we develop a network evolution algorithm to solve this problem. This complete framework is named ONE (Optimization via Network Evolution). In our experiments on MNIST, ONE shows elementary human-like intelligence, including low energy consumption, knowledge sharing, and lifelong learning. |
Tasks | |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03956v1 |
http://arxiv.org/pdf/1809.03956v1.pdf | |
PWC | https://paperswithcode.com/paper/abstraction-learning |
Repo | |
Framework | |
HashTran-DNN: A Framework for Enhancing Robustness of Deep Neural Networks against Adversarial Malware Samples
Title | HashTran-DNN: A Framework for Enhancing Robustness of Deep Neural Networks against Adversarial Malware Samples |
Authors | Deqiang Li, Ramesh Baral, Tao Li, Han Wang, Qianmu Li, Shouhuai Xu |
Abstract | Adversarial machine learning in the context of image processing and related applications has received a large amount of attention. However, adversarial machine learning, especially adversarial deep learning, in the context of malware detection has received much less attention despite its apparent importance. In this paper, we present a framework for enhancing the robustness of Deep Neural Networks (DNNs) against adversarial malware samples, dubbed Hashing Transformation Deep Neural Networks (HashTran-DNN). The core idea is to use hash functions with a certain locality-preserving property to transform samples to enhance the robustness of DNNs in malware classification. The framework further uses a Denoising Auto-Encoder (DAE) regularizer to reconstruct the hash representations of samples, making the resulting DNN classifiers capable of attaining the locality information in the latent space. We experiment with two concrete instantiations of the HashTran-DNN framework to classify Android malware. Experimental results show that four known attacks can render standard DNNs useless in classifying Android malware, that known defenses can at most defend three of the four attacks, and that HashTran-DNN can effectively defend against all of the four attacks. |
Tasks | Denoising, Malware Classification, Malware Detection |
Published | 2018-09-18 |
URL | http://arxiv.org/abs/1809.06498v1 |
http://arxiv.org/pdf/1809.06498v1.pdf | |
PWC | https://paperswithcode.com/paper/hashtran-dnn-a-framework-for-enhancing |
Repo | |
Framework | |
A Map Equation with Metadata: Varying the Role of Attributes in Community Detection
Title | A Map Equation with Metadata: Varying the Role of Attributes in Community Detection |
Authors | Scott Emmons, Peter J. Mucha |
Abstract | Much of the community detection literature studies structural communities, communities defined solely by the connectivity patterns of the network. Often, networks contain additional metadata which can inform community detection such as the grade and gender of students in a high school social network. In this work, we introduce a tuning parameter to the content map equation that allows users of the Infomap community detection algorithm to control the metadata’s relative importance for identifying network structure. On synthetic networks, we show that our algorithm can overcome the structural detectability limit when the metadata is well-aligned with community structure. On real-world networks, we show how our algorithm can achieve greater mutual information with the metadata at a cost in the traditional map equation. Our tuning parameter, like the focusing knob of a microscope, allows users to “zoom in” and “zoom out” on communities with varying levels of focus on the metadata. |
Tasks | Community Detection |
Published | 2018-10-24 |
URL | https://arxiv.org/abs/1810.10433v2 |
https://arxiv.org/pdf/1810.10433v2.pdf | |
PWC | https://paperswithcode.com/paper/a-map-equation-with-metadata-varying-the-role |
Repo | |
Framework | |
An $O(N)$ Sorting Algorithm: Machine Learning Sort
Title | An $O(N)$ Sorting Algorithm: Machine Learning Sort |
Authors | Hanqing Zhao, Yuehan Luo |
Abstract | We propose an $O(N\cdot M)$ sorting algorithm based on a machine learning method, which shows great potential for sorting big data. This sorting algorithm can be applied to parallel sorting and is suitable for GPU or TPU acceleration. Furthermore, we discuss the application of this algorithm to sparse hash tables. |
Tasks | |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04272v2 |
http://arxiv.org/pdf/1805.04272v2.pdf | |
PWC | https://paperswithcode.com/paper/an-on-sorting-algorithm-machine-learning-sort |
Repo | |
Framework | |
Integrating Stance Detection and Fact Checking in a Unified Corpus
Title | Integrating Stance Detection and Fact Checking in a Unified Corpus |
Authors | Ramy Baly, Mitra Mohtarami, James Glass, Lluis Marquez, Alessandro Moschitti, Preslav Nakov |
Abstract | A reasonable approach for fact checking a claim involves retrieving potentially relevant documents from different sources (e.g., news websites, social media, etc.), determining the stance of each document with respect to the claim, and finally making a prediction about the claim’s factuality by aggregating the strength of the stances, while taking the reliability of the source into account. Moreover, a fact checking system should be able to explain its decision by providing relevant extracts (rationales) from the documents. Yet, this setup is not directly supported by existing datasets, which treat fact checking, document retrieval, source credibility, stance detection and rationale extraction as independent tasks. In this paper, we support the interdependencies between these tasks as annotations in the same corpus. We implement this setup on an Arabic fact checking corpus, the first of its kind. |
Tasks | Stance Detection |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.08012v1 |
http://arxiv.org/pdf/1804.08012v1.pdf | |
PWC | https://paperswithcode.com/paper/integrating-stance-detection-and-fact |
Repo | |
Framework | |
SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers
Title | SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers |
Authors | Farhad Nooralahzadeh, Lilja Øvrelid, Jan Tore Lønning |
Abstract | This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers. First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and performs relation classification with differing objectives for each subtask of the shared task. This approach achieved overall F1 scores of 76.7 and 83.2 for relation classification on clean and noisy data, respectively. Furthermore, for combined relation extraction and classification on clean data, it obtained F1 scores of 37.4 and 33.6 for each phase. Our system ranks 3rd in all three sub-tasks of the shared task. |
Tasks | Relation Classification, Relation Extraction |
Published | 2018-04-24 |
URL | http://arxiv.org/abs/1804.08887v1 |
http://arxiv.org/pdf/1804.08887v1.pdf | |
PWC | https://paperswithcode.com/paper/sirius-ltg-uio-at-semeval-2018-task-7 |
Repo | |
Framework | |
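The first step described in the abstract above, extracting the shortest dependency path between two entities, can be sketched as a breadth-first search over an undirected view of the dependency tree. This is a minimal sketch: the `(head, dependent)` edge list and the example sentence are hypothetical, hand-written inputs rather than the output of a real parser, and the function name is an assumption.

```python
from collections import deque


def shortest_dependency_path(edges, start, end):
    """Return the token path from start to end, or None if unreachable."""
    # Treat each dependency arc as an undirected edge.
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)

    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical parse of "The parser uses a neural model".
toy_edges = [
    ("uses", "parser"),
    ("parser", "The"),
    ("uses", "model"),
    ("model", "a"),
    ("model", "neural"),
]
toy_path = shortest_dependency_path(toy_edges, "parser", "model")
```

In a pipeline like the one the abstract describes, the tokens on this path would then be mapped to embeddings and passed to the CNN classifier.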
Optimizing speed/accuracy trade-off for person re-identification via knowledge distillation
Title | Optimizing speed/accuracy trade-off for person re-identification via knowledge distillation |
Authors | Idoia Ruiz, Bogdan Raducanu, Rakesh Mehta, Jaume Amores |
Abstract | Finding a person across a camera network plays an important role in video surveillance. For a real-world person re-identification application, in order to guarantee an optimal time response, it is crucial to find the balance between accuracy and speed. We analyse this trade-off, comparing a classical method that comprises hand-crafted feature description and metric learning, in particular LOMO and XQDA, to deep learning based techniques using image classification networks, ResNet and MobileNets. Additionally, we propose and analyse network distillation as a learning strategy to reduce the computational cost of the deep learning approach at test time. We evaluate both methods on the Market-1501 and DukeMTMC-reID large-scale datasets, showing that distillation helps reduce the computational cost at inference time while even increasing the accuracy. |
Tasks | Image Classification, Metric Learning, Person Re-Identification |
Published | 2018-12-07 |
URL | https://arxiv.org/abs/1812.02937v2 |
https://arxiv.org/pdf/1812.02937v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-speedaccuracy-trade-off-for-person |
Repo | |
Framework | |
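The abstract above does not state the exact distillation objective. As a hedged illustration, the sketch below uses the widely used temperature-softened formulation, in which a small student network is trained to match a larger teacher's softened class probabilities while also fitting the ground-truth label. The function names and the `temperature` and `alpha` values are assumptions, not settings from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft (teacher) term and a hard (label) term."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Cross-entropy between softened teacher and student distributions.
    soft_term = -sum(t * math.log(s)
                     for t, s in zip(soft_teacher, soft_student))
    # Standard cross-entropy against the ground-truth class index.
    hard_term = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft_term + (1 - alpha) * hard_term


teacher = [5.0, 1.0, 0.0]
loss_matching = distillation_loss([5.0, 1.0, 0.0], teacher, label=0)
loss_mismatched = distillation_loss([0.0, 1.0, 5.0], teacher, label=0)
```

A student whose logits agree with the teacher receives a lower loss than one that disagrees, which is the signal that lets a compact network approach the accuracy of a larger one.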
Signal Processing and Piecewise Convex Estimation
Title | Signal Processing and Piecewise Convex Estimation |
Authors | Kurt Riedel |
Abstract | Many problems in signal processing reduce to nonparametric function estimation. We propose a new methodology, piecewise convex fitting (PCF), and give a two-stage adaptive estimate. In the first stage, the number and location of the change points are estimated using strong smoothing. In the second stage, a constrained smoothing spline fit is performed with the smoothing level chosen to minimize the MSE. The imposed constraint is that a single change point occurs in a region about each empirical change point of the first-stage estimate. This constraint is equivalent to requiring that the third derivative of the second-stage estimate has a single sign in a small neighborhood about each first-stage change point. We sketch how PCF may be applied to signal recovery, instantaneous frequency estimation, surface reconstruction, image segmentation, spectral estimation and multivariate adaptive regression. |
Tasks | Semantic Segmentation |
Published | 2018-03-14 |
URL | https://arxiv.org/abs/1803.05130v1 |
https://arxiv.org/pdf/1803.05130v1.pdf | |
PWC | https://paperswithcode.com/paper/signal-processing-and-piecewise-convex |
Repo | |
Framework | |
Scaling Speech Enhancement in Unseen Environments with Noise Embeddings
Title | Scaling Speech Enhancement in Unseen Environments with Noise Embeddings |
Authors | Gil Keren, Jing Han, Björn Schuller |
Abstract | We address the problem of speech enhancement generalisation to unseen environments by performing two manipulations. First, we embed an additional recording from the environment alone, and use this embedding to alter activations in the main enhancement subnetwork. Second, we scale the number of noise environments present at training time to 16,784 different environments. Experimental results show that both manipulations reduce word error rates of a pretrained speech recognition system and improve enhancement quality according to a number of performance measures. Specifically, our best model reduces the word error rate from 34.04% on noisy speech to 15.46% on the enhanced speech. Enhanced audio samples can be found at https://speechenhancement.page.link/samples. |
Tasks | Speech Enhancement, Speech Recognition |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.12757v1 |
http://arxiv.org/pdf/1810.12757v1.pdf | |
PWC | https://paperswithcode.com/paper/scaling-speech-enhancement-in-unseen |
Repo | |
Framework | |
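For scale, the word error rates quoted in the abstract above can be turned into absolute and relative reductions. The two input figures come from the abstract; the derived values are simple arithmetic on them.

```latex
\begin{align*}
\Delta_{\text{abs}} &= 34.04\% - 15.46\% = 18.58 \text{ percentage points} \\
\Delta_{\text{rel}} &= \frac{34.04 - 15.46}{34.04} = \frac{18.58}{34.04} \approx 0.546 \quad (\text{about } 54.6\%)
\end{align*}
```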
Enhanced CNN for image denoising
Title | Enhanced CNN for image denoising |
Authors | Chunwei Tian, Yong Xu, Lunke Fei, Junqian Wang, Jie Wen, Nan Luo |
Abstract | Owing to the flexible architectures of deep convolutional neural networks (CNNs), CNNs are successfully used for image denoising. However, they suffer from the following drawbacks: (i) deep network architectures are very difficult to train; (ii) deeper networks face the challenge of performance saturation. In this study, the authors propose a novel method called enhanced convolutional neural denoising network (ECNDNet). Specifically, they use residual learning and batch normalisation techniques to address the problem of training difficulties and accelerate the convergence of the network. In addition, dilated convolutions are used in the proposed network to enlarge the context information and reduce the computational cost. Extensive experiments demonstrate that the ECNDNet outperforms the state-of-the-art methods for image denoising. |
Tasks | Denoising, Image Denoising |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11834v4 |
http://arxiv.org/pdf/1810.11834v4.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-cnn-for-image-denoising |
Repo | |
Framework | |
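The role of dilated convolutions mentioned in the abstract above can be shown in one dimension: the same number of kernel weights covers a wider span of the input as the dilation grows. This is a minimal pure-Python sketch with a hypothetical `dilated_conv1d` helper and toy inputs; it is not the ECNDNet architecture.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D convolution with gaps between kernel taps."""
    # Input span covered by one output value: d * (k - 1) + 1.
    span = dilation * (len(kernel) - 1) + 1
    return [
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


toy_signal = [1, 2, 3, 4, 5, 6, 7]
toy_kernel = [1, 1, 1]
dense_out = dilated_conv1d(toy_signal, toy_kernel, dilation=1)
dilated_out = dilated_conv1d(toy_signal, toy_kernel, dilation=2)
```

With `dilation=2` each output value depends on five consecutive input positions instead of three, while the kernel still holds only three weights, which is how context grows without adding parameters.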
Efficient Super Resolution Using Binarized Neural Network
Title | Efficient Super Resolution Using Binarized Neural Network |
Authors | Yinglan Ma, Hongyu Xiong, Zhe Hu, Lizhuang Ma |
Abstract | Deep convolutional neural networks (DCNNs) have recently demonstrated high-quality results in single-image super-resolution (SR). DCNNs often suffer from over-parametrization and large amounts of redundancy, which results in inefficient inference and high memory usage, preventing massive applications on mobile devices. As a way to significantly reduce model size and computation time, binarized neural networks have only been shown to excel on semantic-level tasks such as image classification and recognition. However, little effort in network quantization has been spent on image enhancement tasks like SR, as network quantization is usually assumed to sacrifice pixel-level accuracy. In this work, we explore a network-binarization approach for SR tasks without sacrificing much reconstruction accuracy. To achieve this, we binarize the convolutional filters in only residual blocks, and adopt a learnable weight for each binary filter. We evaluate this idea on several state-of-the-art DCNN-based architectures, and show that binarized SR networks achieve qualitative and quantitative results comparable to their real-weight counterparts. Moreover, the proposed binarization strategy could help reduce model size by 80% when applied to SRResNet, and could potentially speed up inference by 5 times. |
Tasks | Image Classification, Image Enhancement, Image Super-Resolution, Quantization, Super-Resolution |
Published | 2018-12-16 |
URL | http://arxiv.org/abs/1812.06378v1 |
http://arxiv.org/pdf/1812.06378v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-super-resolution-using-binarized |
Repo | |
Framework | |
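The idea in the abstract above of pairing each binary filter with a learnable weight can be sketched as approximating a real-valued filter by a scalar times its sign pattern. In this minimal sketch the scalar `alpha` is initialised to the mean absolute weight, which is an assumption borrowed from common binarization practice, not a detail stated in the abstract; in training it would be updated by gradient descent. The function names are hypothetical.

```python
def binarize_filter(weights):
    """Split a 2-D filter into a scalar scale and a +1/-1 sign pattern."""
    flat = [w for row in weights for w in row]
    # Assumed initial value for the per-filter learnable weight.
    alpha = sum(abs(w) for w in flat) / len(flat)
    signs = [[1 if w >= 0 else -1 for w in row] for row in weights]
    return alpha, signs


def reconstruct(alpha, signs):
    """Approximate the original filter as alpha * sign(W)."""
    return [[alpha * s for s in row] for row in signs]


toy_filter = [[0.5, -1.5], [2.0, -1.0]]
toy_alpha, toy_signs = binarize_filter(toy_filter)
toy_approx = reconstruct(toy_alpha, toy_signs)
```

Storing one bit per weight plus one scalar per filter, instead of one 32-bit float per weight, is where the model-size reduction comes from.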
Graph Multiview Canonical Correlation Analysis
Title | Graph Multiview Canonical Correlation Analysis |
Authors | Jia Chen, Gang Wang, Georgios B. Giannakis |
Abstract | Multiview canonical correlation analysis (MCCA) seeks latent low-dimensional representations encountered with multiview data of shared entities (a.k.a. common sources). However, existing MCCA approaches do not exploit the geometry of the common sources, which may be available a priori, or can be constructed using certain domain knowledge. This prior information about the common sources can be encoded by a graph, and be invoked as a regularizer to enrich the maximum variance MCCA framework. In this context, the present paper’s novel graph-regularized (G) MCCA approach minimizes the distance between the wanted canonical variables and the common low-dimensional representations, while accounting for graph-induced knowledge of the common sources. Relying on a function capturing the extent to which low-dimensional representations of the multiple views are similar, a generalization bound of GMCCA is established based on Rademacher’s complexity. Tailored for setups where the number of data pairs is smaller than the data vector dimensions, a graph-regularized dual MCCA approach is also developed. To further deal with nonlinearities present in the data, graph-regularized kernel MCCA variants are put forward too. Interestingly, solutions of the graph-regularized linear, dual, and kernel MCCA are all provided in terms of generalized eigenvalue decomposition. Several corroborating numerical tests using real datasets are provided to showcase the merits of the graph-regularized MCCA variants relative to several competing alternatives including MCCA, Laplacian-regularized MCCA, and (graph-regularized) PCA. |
Tasks | |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12345v2 |
http://arxiv.org/pdf/1811.12345v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-multiview-canonical-correlation |
Repo | |
Framework | |
Efficient Super Resolution For Large-Scale Images Using Attentional GAN
Title | Efficient Super Resolution For Large-Scale Images Using Attentional GAN |
Authors | Harsh Nilesh Pathak, Xinxin Li, Shervin Minaee, Brooke Cowan |
Abstract | Single Image Super Resolution (SISR) is a well-researched problem with broad commercial relevance. However, most of the SISR literature focuses on small-size images under 500px, whereas business needs can mandate the generation of very high resolution images. At Expedia Group, we were tasked with generating images of at least 2000px for display on the website, four times greater than the sizes typically reported in the literature. This requirement poses a challenge that state-of-the-art models, validated on small images, have not been proven to handle. In this paper, we investigate solutions to the problem of generating high-quality images for large-scale super resolution in a commercial setting. We find that training a generative adversarial network (GAN) with attention from scratch using a large-scale lodging image data set generates images with high PSNR and SSIM scores. We describe a novel attentional SISR model for large-scale images, A-SRGAN, that uses a Flexible Self Attention layer to enable processing of large-scale images. We also describe a distributed algorithm which speeds up training by around a factor of five. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-12-12 |
URL | http://arxiv.org/abs/1812.04821v4 |
http://arxiv.org/pdf/1812.04821v4.pdf | |
PWC | https://paperswithcode.com/paper/efficient-super-resolution-for-large-scale |
Repo | |
Framework | |
Binary Document Image Super Resolution for Improved Readability and OCR Performance
Title | Binary Document Image Super Resolution for Improved Readability and OCR Performance |
Authors | Ram Krishna Pandey, K Vignesh, A G Ramakrishnan, Chandrahasa B |
Abstract | There is a need for information retrieval from large collections of low-resolution (LR) binary document images, which can be found in digital libraries across the world, where the high-resolution (HR) counterpart is not available. This gives rise to the problem of binary document image super-resolution (BDISR). The objective of this paper is to address the interesting and challenging problem of super resolution of binary Tamil document images for improved readability and better optical character recognition (OCR). We propose multiple deep neural network architectures to address this problem and analyze their performance. The proposed models are all single image super-resolution techniques, which learn a generalized spatial correspondence between the LR and HR binary document images. We employ convolutional layers for feature extraction followed by transposed convolution and sub-pixel convolution layers for upscaling the features. Since the outputs of the neural networks are gray scale, we utilize the advantage of power law transformation as a post-processing technique to improve the character level pixel connectivity. The performance of our models is evaluated by comparing the OCR accuracies and the mean opinion scores given by human evaluators on LR images and the corresponding model-generated HR images. |
Tasks | Image Super-Resolution, Information Retrieval, Optical Character Recognition, Super-Resolution |
Published | 2018-12-06 |
URL | http://arxiv.org/abs/1812.02475v1 |
http://arxiv.org/pdf/1812.02475v1.pdf | |
PWC | https://paperswithcode.com/paper/binary-document-image-super-resolution-for |
Repo | |
Framework | |
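The power-law (gamma) post-processing step mentioned in the abstract above maps each gray-scale value r to c * r^gamma on a normalised scale. The sketch below applies it to a small image stored as nested lists; the `gamma` value, the 8-bit range, and the function name are assumptions for illustration, since the abstract does not give the parameters used.

```python
def power_law_transform(pixels, gamma, max_value=255):
    """Apply s = max_value * (r / max_value) ** gamma to every pixel."""
    return [
        [round(max_value * (p / max_value) ** gamma) for p in row]
        for row in pixels
    ]


# Toy gray-scale row (hypothetical values).
toy_image = [[0, 128, 255]]
darkened = power_law_transform(toy_image, gamma=2.0)
```

A gamma above 1 pushes mid-gray values toward black while leaving pure black and pure white unchanged, so faint gray strokes produced by a network become darker before the image is thresholded back to binary.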