October 16, 2019

Paper Group ANR 1130

Connecting Visual Experiences using Max-flow Network with Application to Visual Localization


Title	Connecting Visual Experiences using Max-flow Network with Application to Visual Localization
Authors	A. H. Abdul Hafez, Nakul Agarwal, C. V. Jawahar
Abstract	We are motivated by the fact that multiple representations of the environment are required to account for changes in appearance over time and for changes that recur cyclically. These changes are, for example, from day to night time, and from day to day across seasons. In such situations, the robot visits the same routes multiple times and collects different appearances of them. Multiple visual experiences are commonly used in robotic vision applications such as visual localization, mapping, place recognition, and autonomous navigation. The novelty in this paper is an algorithm that connects multiple visual experiences by aligning multiple image sequences. This problem is solved by finding the maximum flow in a directed graph flow-network, whose vertices represent the matches between frames in the test and reference sequences. Edges of the graph represent the cost of these matches. The problem of finding the best match is reduced to finding the minimum-cut surface, which is solved as a maximum flow network problem. Application to visual localization is considered in this paper to show the effectiveness of the proposed multiple image sequence alignment method, without loss of generality. Experimental evaluations show that the precision of sequence matching is improved by considering multiple visual sequences for the same route, and that the method performs favorably against state-of-the-art single representation methods like SeqSLAM and ABLE-M.
Tasks	Autonomous Navigation, Visual Localization
Published	2018-08-01
URL	http://arxiv.org/abs/1808.00208v1
PDF	http://arxiv.org/pdf/1808.00208v1.pdf
PWC	https://paperswithcode.com/paper/connecting-visual-experiences-using-max-flow
Repo
Framework
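
The abstract above reduces sequence alignment to a maximum-flow / minimum-cut problem. As a rough illustration of that building block only (not the authors' graph construction), here is a minimal Edmonds-Karp sketch on a toy network. The node names `s`, `a`, `b`, `t` and the capacities are hypothetical and were chosen for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Residual graph: copy the capacities and add zero-capacity reverse edges.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: the flow is maximal

        # Bottleneck capacity along the path found by the search.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path, updating reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy network; by the max-flow/min-cut theorem the result
# also equals the capacity of the cheapest cut separating s from t.
toy = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
print(max_flow(toy, "s", "t"))  # 5
```

In the paper's setting the vertices would stand for frame matches and the edge weights for matching costs; how those are defined is described in the paper itself, not here.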

Abstraction Learning


Title	Abstraction Learning
Authors	Fei Deng, Jinsheng Ren, Feng Chen
Abstract	There has been a gap between artificial intelligence and human intelligence. In this paper, we identify three key elements forming human intelligence, and suggest that abstraction learning combines these elements and is thus a way to bridge the gap. Prior research in artificial intelligence either relies on human experts to specify abstraction, or treats abstraction as a qualitative explanation for the model. This paper aims to learn abstraction directly. We tackle three main challenges: representation, objective function, and learning algorithm. Specifically, we propose a partition structure that contains pre-allocated abstraction neurons; we formulate abstraction learning as a constrained optimization problem, which integrates abstraction properties; we develop a network evolution algorithm to solve this problem. This complete framework is named ONE (Optimization via Network Evolution). In our experiments on MNIST, ONE shows elementary human-like intelligence, including low energy consumption, knowledge sharing, and lifelong learning.
Tasks
Published	2018-09-11
URL	http://arxiv.org/abs/1809.03956v1
PDF	http://arxiv.org/pdf/1809.03956v1.pdf
PWC	https://paperswithcode.com/paper/abstraction-learning
Repo
Framework

HashTran-DNN: A Framework for Enhancing Robustness of Deep Neural Networks against Adversarial Malware Samples


Title	HashTran-DNN: A Framework for Enhancing Robustness of Deep Neural Networks against Adversarial Malware Samples
Authors	Deqiang Li, Ramesh Baral, Tao Li, Han Wang, Qianmu Li, Shouhuai Xu
Abstract	Adversarial machine learning in the context of image processing and related applications has received a large amount of attention. However, adversarial machine learning, especially adversarial deep learning, in the context of malware detection has received much less attention despite its apparent importance. In this paper, we present a framework for enhancing the robustness of Deep Neural Networks (DNNs) against adversarial malware samples, dubbed Hashing Transformation Deep Neural Networks (HashTran-DNN). The core idea is to use hash functions with a certain locality-preserving property to transform samples to enhance the robustness of DNNs in malware classification. The framework further uses a Denoising Auto-Encoder (DAE) regularizer to reconstruct the hash representations of samples, making the resulting DNN classifiers capable of attaining the locality information in the latent space. We experiment with two concrete instantiations of the HashTran-DNN framework to classify Android malware. Experimental results show that four known attacks can render standard DNNs useless in classifying Android malware, that known defenses can defend against at most three of the four attacks, and that HashTran-DNN can effectively defend against all four attacks.
Tasks	Denoising, Malware Classification, Malware Detection
Published	2018-09-18
URL	http://arxiv.org/abs/1809.06498v1
PDF	http://arxiv.org/pdf/1809.06498v1.pdf
PWC	https://paperswithcode.com/paper/hashtran-dnn-a-framework-for-enhancing
Repo
Framework
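
The abstract does not say which locality-preserving hash functions are used. As a hedged illustration of the general idea, the sketch below uses bit sampling, a classic locality-sensitive family for binary feature vectors under Hamming distance. The function names and the example vectors are hypothetical.

```python
import random


def make_bit_sampler(num_features, num_bits, seed=0):
    """Return a hash function that keeps a fixed random subset of positions."""
    rng = random.Random(seed)
    # Sampling without replacement, so every kept position is distinct.
    positions = rng.sample(range(num_features), num_bits)

    def hash_vector(bits):
        return tuple(bits[i] for i in positions)

    return hash_vector


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical binary feature vectors (for example, permission flags).
original = [1, 0, 1, 1, 0, 0, 1, 0]
perturbed = [1, 0, 1, 0, 0, 0, 1, 0]  # one flipped feature

hash_fn = make_bit_sampler(num_features=8, num_bits=4)
# The hashed distance can never exceed the input distance, so a small
# perturbation of the input stays small (or vanishes) after hashing.
print(hamming(original, perturbed), hamming(hash_fn(original), hash_fn(perturbed)))
```

Because the sampled positions are distinct, the Hamming distance between two hashes is bounded by the distance between the inputs, which is the sense in which this particular family preserves locality.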

A Map Equation with Metadata: Varying the Role of Attributes in Community Detection


Title	A Map Equation with Metadata: Varying the Role of Attributes in Community Detection
Authors	Scott Emmons, Peter J. Mucha
Abstract	Much of the community detection literature studies structural communities, communities defined solely by the connectivity patterns of the network. Often, networks contain additional metadata which can inform community detection such as the grade and gender of students in a high school social network. In this work, we introduce a tuning parameter to the content map equation that allows users of the Infomap community detection algorithm to control the metadata’s relative importance for identifying network structure. On synthetic networks, we show that our algorithm can overcome the structural detectability limit when the metadata is well-aligned with community structure. On real-world networks, we show how our algorithm can achieve greater mutual information with the metadata at a cost in the traditional map equation. Our tuning parameter, like the focusing knob of a microscope, allows users to “zoom in” and “zoom out” on communities with varying levels of focus on the metadata.
Tasks	Community Detection
Published	2018-10-24
URL	https://arxiv.org/abs/1810.10433v2
PDF	https://arxiv.org/pdf/1810.10433v2.pdf
PWC	https://paperswithcode.com/paper/a-map-equation-with-metadata-varying-the-role
Repo
Framework

An $O(N)$ Sorting Algorithm: Machine Learning Sort


Title	An $O(N)$ Sorting Algorithm: Machine Learning Sort
Authors	Hanqing Zhao, Yuehan Luo
Abstract	We propose an $O(N\cdot M)$ sorting algorithm based on a machine learning method, which shows great potential for sorting big data. This sorting algorithm can be applied to parallel sorting and is suitable for GPU or TPU acceleration. Furthermore, we discuss the application of this algorithm to sparse hash tables.
Tasks
Published	2018-05-11
URL	http://arxiv.org/abs/1805.04272v2
PDF	http://arxiv.org/pdf/1805.04272v2.pdf
PWC	https://paperswithcode.com/paper/an-on-sorting-algorithm-machine-learning-sort
Repo
Framework
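
The abstract gives no algorithmic detail, so the following is only a sketch of the general idea of sorting by predicted position, not the authors' method. It assumes the "model" is simply an empirical CDF estimated from a random sample, where the paper trains a machine learning model; `distribution_sort` and its parameters are hypothetical names.

```python
import bisect
import random


def distribution_sort(values, sample_size=32, seed=0):
    """Sort a list by predicting each value's rank from a sampled CDF."""
    if len(values) < 2:
        return list(values)
    rng = random.Random(seed)
    # Stand-in for a learned model: a sorted random sample of the data.
    sample = sorted(rng.sample(values, min(sample_size, len(values))))

    num_buckets = len(values)
    buckets = [[] for _ in range(num_buckets)]
    for v in values:
        # Estimated CDF of v: fraction of the sample that is <= v.
        cdf = bisect.bisect_right(sample, v) / len(sample)
        index = min(int(cdf * num_buckets), num_buckets - 1)
        buckets[index].append(v)

    # The estimate is monotone in v, so buckets are already in order;
    # only the contents of each bucket still need sorting.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result


data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(1000)]
print(distribution_sort(data) == sorted(data))  # True
```

The output is correct for any input because the predicted bucket index never decreases as the value grows; the quality of the estimate affects only how evenly the buckets fill, and therefore the running time.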

Integrating Stance Detection and Fact Checking in a Unified Corpus


Title	Integrating Stance Detection and Fact Checking in a Unified Corpus
Authors	Ramy Baly, Mitra Mohtarami, James Glass, Lluis Marquez, Alessandro Moschitti, Preslav Nakov
Abstract	A reasonable approach for fact checking a claim involves retrieving potentially relevant documents from different sources (e.g., news websites, social media, etc.), determining the stance of each document with respect to the claim, and finally making a prediction about the claim’s factuality by aggregating the strength of the stances, while taking the reliability of the source into account. Moreover, a fact checking system should be able to explain its decision by providing relevant extracts (rationales) from the documents. Yet, this setup is not directly supported by existing datasets, which treat fact checking, document retrieval, source credibility, stance detection and rationale extraction as independent tasks. In this paper, we support the interdependencies between these tasks as annotations in the same corpus. We implement this setup on an Arabic fact checking corpus, the first of its kind.
Tasks	Stance Detection
Published	2018-04-21
URL	http://arxiv.org/abs/1804.08012v1
PDF	http://arxiv.org/pdf/1804.08012v1.pdf
PWC	https://paperswithcode.com/paper/integrating-stance-detection-and-fact
Repo
Framework

SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers


Title	SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers
Authors	Farhad Nooralahzadeh, Lilja Øvrelid, Jan Tore Lønning
Abstract	This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers. First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and performs relation classification with differing objectives for each subtask of the shared task. This approach achieved overall F1 scores of 76.7 and 83.2 for relation classification on clean and noisy data, respectively. Furthermore, for combined relation extraction and classification on clean data, it obtained F1 scores of 37.4 and 33.6 for each phase. Our system ranks 3rd in all three sub-tasks of the shared task.
Tasks	Relation Classification, Relation Extraction
Published	2018-04-24
URL	http://arxiv.org/abs/1804.08887v1
PDF	http://arxiv.org/pdf/1804.08887v1.pdf
PWC	https://paperswithcode.com/paper/sirius-ltg-uio-at-semeval-2018-task-7
Repo
Framework
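
The first step in the abstract above is extracting the shortest dependency path between two entities. A minimal sketch of that step, assuming the parse is already available as a list of (head, dependent) pairs, is a breadth-first search over the undirected tree. The example parse below is hand-written and hypothetical, not the output of any particular parser.

```python
from collections import deque


def shortest_dependency_path(edges, start, end):
    """Breadth-first search over an undirected view of the dependency tree."""
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, []).append(dependent)
        neighbours.setdefault(dependent, []).append(head)

    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk back through the parents to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # the two tokens are not connected


# Hypothetical parse of "The system uses a convolutional network".
parse = [
    ("uses", "system"),
    ("system", "The"),
    ("uses", "network"),
    ("network", "a"),
    ("network", "convolutional"),
]
print(shortest_dependency_path(parse, "system", "network"))
# ['system', 'uses', 'network']
```

In the described system, the tokens on this path would then be mapped to embeddings and fed to the CNN; that part is not sketched here.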

Optimizing speed/accuracy trade-off for person re-identification via knowledge distillation


Title	Optimizing speed/accuracy trade-off for person re-identification via knowledge distillation
Authors	Idoia Ruiz, Bogdan Raducanu, Rakesh Mehta, Jaume Amores
Abstract	Finding a person across a camera network plays an important role in video surveillance. For a real-world person re-identification application, in order to guarantee an optimal time response, it is crucial to find the balance between accuracy and speed. We analyse this trade-off, comparing a classical method that comprises hand-crafted feature description and metric learning (in particular, LOMO and XQDA) to deep learning based techniques that use the image classification networks ResNet and MobileNets. Additionally, we propose and analyse network distillation as a learning strategy to reduce the computational cost of the deep learning approach at test time. We evaluate both methods on the Market-1501 and DukeMTMC-reID large-scale datasets, showing that distillation helps reduce the computational cost at inference time while even increasing accuracy.
Tasks	Image Classification, Metric Learning, Person Re-Identification
Published	2018-12-07
URL	https://arxiv.org/abs/1812.02937v2
PDF	https://arxiv.org/pdf/1812.02937v2.pdf
PWC	https://paperswithcode.com/paper/optimizing-speedaccuracy-trade-off-for-person
Repo
Framework
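
The abstract names network distillation without giving the loss. A common formulation, assumed here rather than taken from the paper, trains the student to match the teacher's temperature-softened class probabilities. The sketch below computes that term for a single example; the logits and the temperature are hypothetical values.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Assumes moderate logits, so no probability underflows to zero.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [6.0, 2.0, -1.0]   # hypothetical logits from a large network
student = [3.0, 2.5, 0.0]    # hypothetical logits from a small network
print(distillation_loss(student, teacher))
print(distillation_loss(teacher, teacher))  # 0.0
```

A higher temperature flattens both distributions, so the student also learns from the relative scores the teacher assigns to the wrong classes.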

Signal Processing and Piecewise Convex Estimation


Title	Signal Processing and Piecewise Convex Estimation
Authors	Kurt Riedel
Abstract	Many problems in signal processing reduce to nonparametric function estimation. We propose a new methodology, piecewise convex fitting (PCF), and give a two-stage adaptive estimate. In the first stage, the number and location of the change points are estimated using strong smoothing. In the second stage, a constrained smoothing spline fit is performed with the smoothing level chosen to minimize the MSE. The imposed constraint is that a single change point occurs in a region about each empirical change point of the first-stage estimate. This constraint is equivalent to requiring that the third derivative of the second-stage estimate has a single sign in a small neighborhood about each first-stage change point. We sketch how PCF may be applied to signal recovery, instantaneous frequency estimation, surface reconstruction, image segmentation, spectral estimation and multivariate adaptive regression.
Tasks	Semantic Segmentation
Published	2018-03-14
URL	https://arxiv.org/abs/1803.05130v1
PDF	https://arxiv.org/pdf/1803.05130v1.pdf
PWC	https://paperswithcode.com/paper/signal-processing-and-piecewise-convex
Repo
Framework

Scaling Speech Enhancement in Unseen Environments with Noise Embeddings


Title	Scaling Speech Enhancement in Unseen Environments with Noise Embeddings
Authors	Gil Keren, Jing Han, Björn Schuller
Abstract	We address the problem of speech enhancement generalisation to unseen environments by performing two manipulations. First, we embed an additional recording from the environment alone, and use this embedding to alter activations in the main enhancement subnetwork. Second, we scale the number of noise environments present at training time to 16,784 different environments. Experimental results show that both manipulations reduce word error rates of a pretrained speech recognition system and improve enhancement quality according to a number of performance measures. Specifically, our best model reduces the word error rate from 34.04% on noisy speech to 15.46% on the enhanced speech. Enhanced audio samples can be found at https://speechenhancement.page.link/samples.
Tasks	Speech Enhancement, Speech Recognition
Published	2018-10-26
URL	http://arxiv.org/abs/1810.12757v1
PDF	http://arxiv.org/pdf/1810.12757v1.pdf
PWC	https://paperswithcode.com/paper/scaling-speech-enhancement-in-unseen
Repo
Framework

Enhanced CNN for image denoising


Title	Enhanced CNN for image denoising
Authors	Chunwei Tian, Yong Xu, Lunke Fei, Junqian Wang, Jie Wen, Nan Luo
Abstract	Owing to the flexible architectures of deep convolutional neural networks (CNNs), CNNs are successfully used for image denoising. However, they suffer from the following drawbacks: (i) deep network architectures are very difficult to train, and (ii) deeper networks face the challenge of performance saturation. In this study, the authors propose a novel method called enhanced convolutional neural denoising network (ECNDNet). Specifically, they use residual learning and batch normalisation techniques to address the problem of training difficulties and accelerate the convergence of the network. In addition, dilated convolutions are used in the proposed network to enlarge the context information and reduce the computational cost. Extensive experiments demonstrate that the ECNDNet outperforms the state-of-the-art methods for image denoising.
Tasks	Denoising, Image Denoising
Published	2018-10-28
URL	http://arxiv.org/abs/1810.11834v4
PDF	http://arxiv.org/pdf/1810.11834v4.pdf
PWC	https://paperswithcode.com/paper/enhanced-cnn-for-image-denoising
Repo
Framework
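
The abstract credits dilated convolutions with enlarging the context seen by each output. The one-dimensional sketch below shows why: with kernel size k and dilation d, one output spans d·(k − 1) + 1 inputs while still using only k weights. It is a plain-Python illustration, not the ECNDNet layer, and the signal and kernel are made up.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D cross-correlation with `dilation` steps between taps."""
    # Number of input samples covered by a single output value.
    span = dilation * (len(kernel) - 1) + 1
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

print(dilated_conv1d(signal, kernel, dilation=1))  # [6, 9, 12, 15, 18]
print(dilated_conv1d(signal, kernel, dilation=2))  # [9, 12, 15]
```

With dilation 2 each output already covers five input samples, which a standard three-tap kernel would need two stacked layers to reach.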

Efficient Super Resolution Using Binarized Neural Network


Title	Efficient Super Resolution Using Binarized Neural Network
Authors	Yinglan Ma, Hongyu Xiong, Zhe Hu, Lizhuang Ma
Abstract	Deep convolutional neural networks (DCNNs) have recently demonstrated high-quality results in single-image super-resolution (SR). DCNNs often suffer from over-parametrization and large amounts of redundancy, which results in inefficient inference and high memory usage, preventing massive applications on mobile devices. As a way to significantly reduce model size and computation time, binarized neural networks have only been shown to excel on semantic-level tasks such as image classification and recognition. However, little effort has been spent on network quantization for image enhancement tasks like SR, as network quantization is usually assumed to sacrifice pixel-level accuracy. In this work, we explore a network-binarization approach for SR tasks without sacrificing much reconstruction accuracy. To achieve this, we binarize the convolutional filters in only residual blocks, and adopt a learnable weight for each binary filter. We evaluate this idea on several state-of-the-art DCNN-based architectures, and show that binarized SR networks achieve qualitative and quantitative results comparable to those of their real-weight counterparts. Moreover, the proposed binarization strategy could help reduce model size by 80% when applied to SRResNet, and could potentially speed up inference by 5 times.
Tasks	Image Classification, Image Enhancement, Image Super-Resolution, Quantization, Super-Resolution
Published	2018-12-16
URL	http://arxiv.org/abs/1812.06378v1
PDF	http://arxiv.org/pdf/1812.06378v1.pdf
PWC	https://paperswithcode.com/paper/efficient-super-resolution-using-binarized
Repo
Framework
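
The abstract describes binarizing each convolutional filter and attaching one learnable weight to it. As a rough sketch of that representation, the code below stores a filter as a sign pattern plus a single scale. Setting the scale to the mean absolute weight is an assumption made here to obtain a concrete starting value; in the paper the per-filter weight is learned during training.

```python
def binarize_filter(weights):
    """Approximate a real-valued filter as scale * sign(weights)."""
    signs = [1 if w >= 0 else -1 for w in weights]
    # Assumed starting value for the per-filter scale (mean absolute weight).
    scale = sum(abs(w) for w in weights) / len(weights)
    return scale, signs


def reconstruct(scale, signs):
    """Expand the compact form back into real-valued weights."""
    return [scale * s for s in signs]


# Hypothetical flattened filter weights.
weights = [0.5, -1.5, 1.0, -1.0]
scale, signs = binarize_filter(weights)
print(scale, signs)               # 1.0 [1, -1, 1, -1]
print(reconstruct(scale, signs))  # [1.0, -1.0, 1.0, -1.0]
```

Each weight then needs one bit plus a shared scale per filter, which is where the reduction in model size comes from.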

Graph Multiview Canonical Correlation Analysis


Title	Graph Multiview Canonical Correlation Analysis
Authors	Jia Chen, Gang Wang, Georgios B. Giannakis
Abstract	Multiview canonical correlation analysis (MCCA) seeks latent low-dimensional representations encountered with multiview data of shared entities (a.k.a. common sources). However, existing MCCA approaches do not exploit the geometry of the common sources, which may be available a priori, or can be constructed using certain domain knowledge. This prior information about the common sources can be encoded by a graph, and be invoked as a regularizer to enrich the maximum variance MCCA framework. In this context, the present paper’s novel graph-regularized (G) MCCA approach minimizes the distance between the wanted canonical variables and the common low-dimensional representations, while accounting for graph-induced knowledge of the common sources. Relying on a function capturing the extent to which the low-dimensional representations of the multiple views are similar, a generalization bound of GMCCA is established based on Rademacher’s complexity. Tailored for setups where the number of data pairs is smaller than the data vector dimensions, a graph-regularized dual MCCA approach is also developed. To further deal with nonlinearities present in the data, graph-regularized kernel MCCA variants are put forward too. Interestingly, solutions of the graph-regularized linear, dual, and kernel MCCA are all provided in terms of generalized eigenvalue decomposition. Several corroborating numerical tests using real datasets are provided to showcase the merits of the graph-regularized MCCA variants relative to several competing alternatives including MCCA, Laplacian-regularized MCCA, and (graph-regularized) PCA.
Tasks
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12345v2
PDF	http://arxiv.org/pdf/1811.12345v2.pdf
PWC	https://paperswithcode.com/paper/graph-multiview-canonical-correlation
Repo
Framework

Efficient Super Resolution For Large-Scale Images Using Attentional GAN


Title	Efficient Super Resolution For Large-Scale Images Using Attentional GAN
Authors	Harsh Nilesh Pathak, Xinxin Li, Shervin Minaee, Brooke Cowan
Abstract	Single Image Super Resolution (SISR) is a well-researched problem with broad commercial relevance. However, most of the SISR literature focuses on small-size images under 500px, whereas business needs can mandate the generation of very high resolution images. At Expedia Group, we were tasked with generating images of at least 2000px for display on the website, four times greater than the sizes typically reported in the literature. This requirement poses a challenge that state-of-the-art models, validated on small images, have not been proven to handle. In this paper, we investigate solutions to the problem of generating high-quality images for large-scale super resolution in a commercial setting. We find that training a generative adversarial network (GAN) with attention from scratch using a large-scale lodging image data set generates images with high PSNR and SSIM scores. We describe a novel attentional SISR model for large-scale images, A-SRGAN, that uses a Flexible Self Attention layer to enable processing of large-scale images. We also describe a distributed algorithm which speeds up training by around a factor of five.
Tasks	Image Super-Resolution, Super-Resolution
Published	2018-12-12
URL	http://arxiv.org/abs/1812.04821v4
PDF	http://arxiv.org/pdf/1812.04821v4.pdf
PWC	https://paperswithcode.com/paper/efficient-super-resolution-for-large-scale
Repo
Framework

Binary Document Image Super Resolution for Improved Readability and OCR Performance


Title	Binary Document Image Super Resolution for Improved Readability and OCR Performance
Authors	Ram Krishna Pandey, K Vignesh, A G Ramakrishnan, Chandrahasa B
Abstract	There is a need for information retrieval from large collections of low-resolution (LR) binary document images, which can be found in digital libraries across the world, where the high-resolution (HR) counterpart is not available. This gives rise to the problem of binary document image super-resolution (BDISR). The objective of this paper is to address the interesting and challenging problem of super resolution of binary Tamil document images for improved readability and better optical character recognition (OCR). We propose multiple deep neural network architectures to address this problem and analyze their performance. The proposed models are all single image super-resolution techniques, which learn a generalized spatial correspondence between the LR and HR binary document images. We employ convolutional layers for feature extraction followed by transposed convolution and sub-pixel convolution layers for upscaling the features. Since the outputs of the neural networks are gray scale, we utilize the advantage of power law transformation as a post-processing technique to improve the character level pixel connectivity. The performance of our models is evaluated by comparing the OCR accuracies and the mean opinion scores given by human evaluators on LR images and the corresponding model-generated HR images.
Tasks	Image Super-Resolution, Information Retrieval, Optical Character Recognition, Super-Resolution
Published	2018-12-06
URL	http://arxiv.org/abs/1812.02475v1
PDF	http://arxiv.org/pdf/1812.02475v1.pdf
PWC	https://paperswithcode.com/paper/binary-document-image-super-resolution-for
Repo
Framework
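
The post-processing step mentioned in the abstract, power-law transformation, maps each grey level r to s = c · r^γ. A minimal sketch on intensities normalised to [0, 1] follows. The values of `gamma` and `c` and the sample row are hypothetical, and which direction improves stroke connectivity depends on whether text is stored as dark or bright pixels, which the abstract does not state.

```python
def power_law(pixels, gamma, c=1.0):
    """Apply s = c * r**gamma to intensities normalised to [0, 1]."""
    # Results are clipped at 1.0 so that c > 1 cannot leave the valid range.
    return [[min(1.0, c * (r ** gamma)) for r in row] for row in pixels]


# Hypothetical one-row grey-scale output of a network.
image = [[0.0, 0.25, 1.0]]

print(power_law(image, gamma=0.5))  # [[0.0, 0.5, 1.0]]
print(power_law(image, gamma=2.0))  # [[0.0, 0.0625, 1.0]]
```

A γ below 1 lifts mid-grey values towards white and a γ above 1 pushes them towards black, while pure black and pure white are left unchanged when c = 1.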