April 2, 2020

3448 words 17 mins read

Paper Group ANR 215

Deep Quaternion Features for Privacy Protection. EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement. Discernible Compressed Images via Deep Perception Consistency. Deep Domain Adaptive Object Detection: a Survey. Local Facial Attribute Transfer through Inpainting. A hybrid optimization procedure for solving a ti …

Deep Quaternion Features for Privacy Protection

Title Deep Quaternion Features for Privacy Protection
Authors Hao Zhang, Yiting Chen, Liyao Xiang, Haotian Ma, Jie Shi, Quanshi Zhang
Abstract We propose a method to revise neural networks to construct quaternion-valued neural networks (QNNs), in order to prevent intermediate-layer features from leaking input information. The QNN uses quaternion-valued features, where each element is a quaternion. The QNN hides input information in a random phase of the quaternion-valued features. Even if attackers obtain network parameters and intermediate-layer features, they cannot extract input information without knowing the target phase. In this way, the QNN can effectively protect input privacy. Besides, the output accuracy of QNNs degrades only mildly compared to traditional neural networks, and the computational cost is much lower than that of other privacy-preserving methods.
Tasks
Published 2020-03-18
URL https://arxiv.org/abs/2003.08365v1
PDF https://arxiv.org/pdf/2003.08365v1.pdf
PWC https://paperswithcode.com/paper/deep-quaternion-features-for-privacy
Repo
Framework
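
The phase-hiding idea above can be sketched with plain quaternion rotations: a feature is rotated by a secret random phase, and only someone who knows that phase can rotate it back. This is a toy illustration of the principle, not the authors' network; the feature values, axis, and phase below are made up.

```python
import numpy as np

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) arrays."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def rotate(feature, theta, axis):
    """Rotate a 3-d feature by angle theta about a unit axis via quaternions."""
    axis = axis / np.linalg.norm(axis)
    q = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * axis])
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    v = np.concatenate([[0.0], feature])
    return quat_mul(quat_mul(q, v), q_conj)[1:]

feature = np.array([1.0, 2.0, 3.0])       # a toy intermediate-layer feature
theta = 1.234                             # the secret random phase
axis = np.array([0.0, 0.0, 1.0])
hidden = rotate(feature, theta, axis)     # what an attacker would observe
recovered = rotate(hidden, -theta, axis)  # recovery requires knowing theta
```

Without `theta`, the hidden feature is just some rotation of the original, and any of a continuum of inputs could have produced it.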

EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement

Title EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement
Authors Linpu Fang, Hang Xu, Zhili Liu, Sarah Parisot, Zhenguo Li
Abstract Object detectors trained on fully-annotated data currently yield state-of-the-art performance but require expensive manual annotations. On the other hand, weakly-supervised detectors have much lower performance and cannot be used reliably in a realistic setting. In this paper, we study the hybrid-supervised object detection problem, aiming to train a high-quality detector with only a limited amount of fully-annotated data while fully exploiting cheap data with image-level labels. State-of-the-art methods typically propose an iterative approach, alternating between generating pseudo-labels and updating a detector. This paradigm requires careful manual hyper-parameter tuning for mining good pseudo-labels at each round and is quite time-consuming. To address these issues, we present EHSOD, an end-to-end hybrid-supervised object detection system which can be trained in one shot on both fully and weakly-annotated data. Specifically, based on a two-stage detector, we propose two modules to fully utilize the information from both kinds of labels: 1) the CAM-RPN module aims at finding foreground proposals guided by a class activation heat-map; 2) the hybrid-supervised cascade module further refines the bounding-box position and classification with the help of an auxiliary head compatible with image-level data. Extensive experiments demonstrate the effectiveness of the proposed method: it achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data, e.g. 37.5% mAP on COCO. We will release the code and the trained models.
Tasks Object Detection
Published 2020-02-18
URL https://arxiv.org/abs/2002.07421v1
PDF https://arxiv.org/pdf/2002.07421v1.pdf
PWC https://paperswithcode.com/paper/ehsod-cam-guided-end-to-end-hybrid-supervised
Repo
Framework
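
The class activation heat-map that guides the CAM-RPN module can be computed, in its classic form, by weighting the last convolutional feature maps with the final-layer weights of the target class. A minimal sketch with random placeholder features (not EHSOD's actual backbone):

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM: weight the (C, H, W) conv feature maps by the target class's
    final-layer weights (C,), keep positive evidence, normalize to [0, 1]."""
    cam = np.tensordot(class_weights, feature_maps, axes=1)  # -> (H, W)
    cam = np.maximum(cam, 0.0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

rng = np.random.default_rng(0)
feats = rng.random((8, 4, 4))     # placeholder backbone features
weights = rng.random(8)           # placeholder classifier weights for one class
cam = class_activation_map(feats, weights)
```

High-valued cells of `cam` indicate likely foreground regions, which is what lets image-level labels guide proposal generation.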

Discernible Compressed Images via Deep Perception Consistency

Title Discernible Compressed Images via Deep Perception Consistency
Authors Zhaohui Yang, Yunhe Wang, Chao Xu, Chang Xu
Abstract Image compression, as one of the fundamental low-level image processing tasks, is essential to computer vision. Conventional image compression methods tend to obtain compressed images by minimizing their appearance discrepancy with the corresponding original images, but pay little attention to their efficacy in downstream perception tasks, e.g., image recognition and object detection. In contrast, this paper aims to produce compressed images by pursuing both appearance and perception consistency. Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images. In addition, the maximum mean discrepancy (MMD) is employed to minimize the difference between feature distributions. The resulting compression network can generate images with high image quality and preserve consistent perception in the feature domain, so that these images can be well recognized by pre-trained machine learning models. Experiments on benchmarks demonstrate the superiority of the proposed algorithm over comparison methods.
Tasks Image Compression, Object Detection
Published 2020-02-17
URL https://arxiv.org/abs/2002.06810v1
PDF https://arxiv.org/pdf/2002.06810v1.pdf
PWC https://paperswithcode.com/paper/discernible-compressed-images-via-deep
Repo
Framework
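
The MMD term used to align feature distributions has a simple closed-form estimate with an RBF kernel. A minimal sketch, with `sigma` as an assumed bandwidth and random vectors standing in for CNN features:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """RBF kernel matrix between row-vector samples X (n, d) and Y (m, d)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of the squared MMD between two feature samples."""
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2.0 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
feats_orig = rng.normal(size=(32, 2))   # stand-in for original-image features
feats_same = feats_orig.copy()          # identical feature distribution
feats_shift = feats_orig + 5.0          # clearly different distribution
```

`mmd2` is (near) zero for matching distributions and grows as they separate, which is what makes it usable as a perception-consistency loss.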

Deep Domain Adaptive Object Detection: a Survey

Title Deep Domain Adaptive Object Detection: a Survey
Authors Wanyi Li, Fuyu Li, Yongkang Luo, Peng Wang
Abstract Deep learning (DL) based object detection has achieved great progress. These methods typically assume that a large amount of labeled training data is available, and that training and test data are drawn from an identical distribution. However, these two assumptions do not always hold in practice. Deep domain adaptive object detection (DDAOD) has emerged as a new learning paradigm to address the above-mentioned challenges. This paper aims to review the state-of-the-art progress on deep domain adaptive object detection approaches. Firstly, we briefly introduce the basic concepts of deep domain adaptation. Secondly, the deep domain adaptive detectors are classified into four categories, and detailed descriptions of representative methods in each category are provided. Finally, insights into future research trends are presented.
Tasks Domain Adaptation, Object Detection
Published 2020-02-17
URL https://arxiv.org/abs/2002.06797v1
PDF https://arxiv.org/pdf/2002.06797v1.pdf
PWC https://paperswithcode.com/paper/deep-domain-adaptive-object-detection-a
Repo
Framework

Local Facial Attribute Transfer through Inpainting

Title Local Facial Attribute Transfer through Inpainting
Authors Ricard Durall, Franz-Josef Pfreundt, Janis Keuper
Abstract The term attribute transfer refers to the task of altering images in such a way that the semantic interpretation of a given input image is shifted towards an intended direction, which is quantified by semantic attributes. Prominent example applications are photorealistic changes of facial features and expressions, like changing the hair color, adding a smile, enlarging the nose, or altering the entire context of a scene, like transforming a summer landscape into a winter panorama. Recent advances in attribute transfer are mostly based on generative deep neural networks, using various techniques to manipulate images in the latent space of the generator. In this paper, we present a novel method for the common sub-task of local attribute transfer, where only parts of a face have to be altered in order to achieve semantic changes (e.g. removing a mustache). In contrast to previous methods, where such local changes have been implemented by generating new (global) images, we propose to formulate local attribute transfer as an inpainting problem. Removing and regenerating only parts of images, our Attribute Transfer Inpainting Generative Adversarial Network (ATI-GAN) is able to utilize local context information, resulting in visually sound results.
Tasks
Published 2020-02-07
URL https://arxiv.org/abs/2002.03040v1
PDF https://arxiv.org/pdf/2002.03040v1.pdf
PWC https://paperswithcode.com/paper/local-facial-attribute-transfer-through
Repo
Framework
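
Formulating local attribute transfer as inpainting boils down to regenerating only the masked region and keeping the rest of the image untouched. The compositing step can be sketched as follows (toy arrays stand in for real images and generator output):

```python
import numpy as np

def composite(original, generated, mask):
    """Keep the original outside the mask; take the generator's output inside
    the masked attribute region (e.g. where a mustache is removed)."""
    return mask * generated + (1.0 - mask) * original

original = np.zeros((4, 4))   # toy stand-in for the input face
generated = np.ones((4, 4))   # toy stand-in for the inpainted content
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0          # the local region to regenerate
out = composite(original, generated, mask)
```

Because everything outside the mask is copied verbatim, the surrounding context is preserved by construction rather than re-synthesized.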

A hybrid optimization procedure for solving a tire curing scheduling problem

Title A hybrid optimization procedure for solving a tire curing scheduling problem
Authors Joaquín Velázquez, Héctor Cancela, Pedro Piñeyro
Abstract This paper addresses a lot-sizing and scheduling problem variant arising from the study of the curing process of a tire factory. The aim is to find the minimum makespan needed for producing enough tires to meet the demand requirements on time, considering the availability and compatibility of the different resources involved. To solve this problem, we suggest a hybrid approach that consists of first applying a heuristic to obtain an estimated value of the makespan and then solving a mathematical model to determine the minimum value. We note that the size of the model (number of variables and constraints) depends significantly on the estimated makespan. Extensive numerical experiments over different instances based on real data are presented to evaluate the effectiveness of the proposed hybrid procedure. The results show that the hybrid approach is able to achieve the optimal makespan for many of the instances, even large ones, since the estimate provided by the heuristic significantly reduces the size of the mathematical model.
Tasks
Published 2020-03-29
URL https://arxiv.org/abs/2004.00425v1
PDF https://arxiv.org/pdf/2004.00425v1.pdf
PWC https://paperswithcode.com/paper/a-hybrid-optimization-procedure-for-solving-a
Repo
Framework
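
The role of the heuristic, producing a makespan estimate that bounds the time horizon (and hence the size) of the exact model, can be illustrated with a generic longest-processing-time-first rule. This is a textbook sketch; the paper's actual heuristic also handles resource availability and compatibility, which are omitted here:

```python
import heapq

def lpt_makespan(job_times, n_machines):
    """Longest-processing-time-first: assign each job (longest first) to the
    least-loaded machine; returns an upper bound on the optimal makespan."""
    loads = [0.0] * n_machines
    heapq.heapify(loads)
    for t in sorted(job_times, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + t)
    return max(loads)

estimate = lpt_makespan([3, 3, 2, 2, 2], 2)  # toy curing times, 2 presses
```

Any feasible schedule's makespan upper-bounds the optimum, so the exact model only needs time-indexed variables up to `estimate`.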

High Temporal Resolution Rainfall Runoff Modelling Using Long-Short-Term-Memory (LSTM) Networks

Title High Temporal Resolution Rainfall Runoff Modelling Using Long-Short-Term-Memory (LSTM) Networks
Authors Wei Li, Amin Kiaghadi, Clint N. Dawson
Abstract Accurate and efficient models for rainfall runoff (RR) simulations are crucial for flood risk management. Most rainfall models in use today are process-driven; i.e., they solve either simplified empirical formulas or some variation of the St. Venant (shallow water) equations. With the development of machine-learning techniques, we may now be able to emulate rainfall models using, for example, neural networks. In this study, a data-driven RR model using a sequence-to-sequence Long Short-Term Memory (LSTM) network was constructed. The model was tested for a watershed in Houston, TX, known for severe flood events. The LSTM network's capability in learning long-term dependencies between the input and output of the network allowed modeling RR with high resolution in time (15 minutes). Using 10 years of precipitation data from 153 rainfall gages and river channel discharge data (more than 5.3 million data points), several numerical tests were designed to evaluate the developed model's performance in predicting river discharge. The model results were also compared with the output of a process-driven model, the Gridded Surface Subsurface Hydrologic Analysis (GSSHA). Moreover, the physical consistency of the LSTM model was explored. The results showed that the LSTM model was able to efficiently predict discharge and achieve good model performance. When compared to GSSHA, the data-driven model was more efficient and robust in terms of prediction and calibration. Interestingly, the performance of the LSTM model improved (test Nash-Sutcliffe model efficiency from 0.666 to 0.942) when a subset of rainfall gages, selected based on model performance, was used as input instead of all rainfall gages.
Tasks Calibration
Published 2020-02-07
URL https://arxiv.org/abs/2002.02568v1
PDF https://arxiv.org/pdf/2002.02568v1.pdf
PWC https://paperswithcode.com/paper/high-temporal-resolution-rainfall-runoff
Repo
Framework
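
The sequence model at the core of this approach is a standard LSTM. A minimal NumPy forward pass over a short input sequence; the weights here are random placeholders (a trained model would learn `W`, `U`, `b` from precipitation/discharge data):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. x: input (d,), h/c: hidden and cell states (n,);
    W (4n, d), U (4n, n), b (4n,) stack the input/forget/cell/output gates."""
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = 1.0 / (1.0 + np.exp(-z[:n]))       # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))    # forget gate
    g = np.tanh(z[2*n:3*n])                # candidate cell update
    o = 1.0 / (1.0 + np.exp(-z[3*n:]))     # output gate
    c_new = f * c + i * g                  # cell state carries long memory
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
d, n = 3, 5                                # e.g. rain gage inputs -> state
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h = c = np.zeros(n)
for x in rng.normal(size=(10, d)):         # ten 15-minute time steps
    h, c = lstm_step(x, h, c, W, U, b)
```

The additive cell-state update is what lets the network remember rainfall from many steps earlier, which matters at 15-minute resolution.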

The Case for Bayesian Deep Learning

Title The Case for Bayesian Deep Learning
Authors Andrew Gordon Wilson
Abstract The key distinguishing property of a Bayesian approach is marginalization instead of optimization, not the prior or Bayes' rule. Bayesian inference is especially compelling for deep neural networks. (1) Neural networks are typically underspecified by the data and can represent many different but high-performing models corresponding to different settings of parameters, which is exactly when marginalization makes the biggest difference for both calibration and accuracy. (2) Deep ensembles have been mistaken for competing approaches to Bayesian methods, but can be seen as approximate Bayesian marginalization. (3) The structure of neural networks gives rise to a structured prior in function space, which reflects the inductive biases of neural networks that help them generalize. (4) The observed correlation between parameters in flat regions of the loss and a diversity of solutions that provide good generalization is further conducive to Bayesian marginalization, as flat regions occupy a large volume in a high-dimensional space, and each different solution will make a good contribution to a Bayesian model average. (5) Recent practical advances for Bayesian deep learning provide improvements in accuracy and calibration compared to standard training, while retaining scalability.
Tasks Bayesian Inference, Calibration
Published 2020-01-29
URL https://arxiv.org/abs/2001.10995v1
PDF https://arxiv.org/pdf/2001.10995v1.pdf
PWC https://paperswithcode.com/paper/the-case-for-bayesian-deep-learning
Repo
Framework
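
Point (2), deep ensembles as approximate Bayesian marginalization, can be made concrete: the Bayesian predictive distribution p(y|x, D) = E over p(w|D) of p(y|x, w) is approximated by averaging the predictive distributions of the ensemble members. A minimal sketch:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def bayesian_model_average(logits_per_model):
    """Approximate p(y|x, D) by averaging the predictive distributions
    of the ensemble members (each a sample from the posterior over weights)."""
    return softmax(np.asarray(logits_per_model, dtype=float)).mean(axis=0)

# two confident but disagreeing members -> an appropriately uncertain average
avg = bayesian_model_average([[4.0, 0.0], [0.0, 4.0]])
```

Each member alone is overconfident; the average is close to uniform, which is exactly the calibration benefit marginalization provides when the data underspecifies the model.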

Real-time calibration of coherent-state receivers: learning by trial and error

Title Real-time calibration of coherent-state receivers: learning by trial and error
Authors M. Bilkis, M. Rosati, R. Morral Yepes, J. Calsamiglia
Abstract The optimal discrimination of coherent states of light with current technology is a key problem in classical and quantum communication, whose solution would enable the realization of efficient receivers for long-distance communications in free-space and optical fiber channels. In this article, we show that reinforcement learning (RL) protocols allow an agent to learn near-optimal coherent-state receivers made of passive linear optics, photodetectors and classical adaptive control. Each agent is trained and tested in real time over several runs of independent discrimination experiments and has no knowledge about the energy of the states nor the receiver setup nor the quantum-mechanical laws governing the experiments. Based exclusively on the observed photodetector outcomes, the agent adaptively chooses among a set of ~3×10^3 possible receiver setups, and obtains a reward at the end of each experiment if its guess is correct. At variance with previous applications of RL in quantum physics, the information gathered in each run is intrinsically stochastic and thus insufficient to evaluate exactly the performance of the chosen receiver. Nevertheless, we present families of agents that: (i) discover a receiver beating the best Gaussian receiver after ~3×10^2 experiments; (ii) surpass the cumulative reward of the best Gaussian receiver after ~10^3 experiments; (iii) simultaneously discover a near-optimal receiver and attain its cumulative reward after ~10^5 experiments. Our results show that RL techniques are suitable for on-line control of quantum receivers and can be employed for long-distance communications over potentially unknown channels.
Tasks Calibration
Published 2020-01-28
URL https://arxiv.org/abs/2001.10283v1
PDF https://arxiv.org/pdf/2001.10283v1.pdf
PWC https://paperswithcode.com/paper/real-time-calibration-of-coherent-state
Repo
Framework
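
The learning setting described, choosing among discrete receiver setups with only a stochastic 0/1 reward per experiment, is a multi-armed bandit. A minimal epsilon-greedy sketch with three hypothetical setups and made-up success probabilities (not the paper's ~3×10^3 receiver configurations or its actual agents):

```python
import random

def run_bandit(success_prob, n_trials=20000, eps=0.1, seed=0):
    """Epsilon-greedy agent choosing among receiver setups whose success
    probabilities are unknown; each trial yields a stochastic 0/1 reward."""
    rng = random.Random(seed)
    n = len(success_prob)
    counts, values = [0] * n, [0.0] * n
    for _ in range(n_trials):
        if rng.random() < eps:                     # explore a random setup
            a = rng.randrange(n)
        else:                                      # exploit current estimate
            a = max(range(n), key=lambda k: values[k])
        r = 1.0 if rng.random() < success_prob[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]   # running-mean reward estimate
    return max(range(n), key=lambda k: values[k])

best = run_bandit([0.2, 0.5, 0.9])  # three hypothetical receiver setups
```

The agent never evaluates a setup exactly, mirroring the paper's point that single-run information is intrinsically stochastic; it only accumulates noisy reward estimates.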

StarNet: towards weakly supervised few-shot detection and explainable few-shot classification

Title StarNet: towards weakly supervised few-shot detection and explainable few-shot classification
Authors Leonid Karlinsky, Joseph Shtok, Amit Alfassy, Moshe Lichtenstein, Sivan Harary, Eli Schwartz, Sivan Doveh, Prasanna Sattigeri, Rogerio Feris, Alexander Bronstein, Raja Giryes
Abstract In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by StarNet's ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks, attaining significant state-of-the-art improvements on CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.
Tasks Few-Shot Learning, Few-Shot Object Detection, Object Detection
Published 2020-03-15
URL https://arxiv.org/abs/2003.06798v1
PDF https://arxiv.org/pdf/2003.06798v1.pdf
PWC https://paperswithcode.com/paper/starnet-towards-weakly-supervised-few-shot
Repo
Framework
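
The core matching operation, scoring how well each query location is matched by some support location, can be sketched with cosine similarities between flattened feature maps. This uses random placeholder features; StarNet's actual star model aggregates matches into larger regions:

```python
import numpy as np

def match_heatmap(query_feats, support_feats):
    """For every query location, the cosine similarity to its best-matching
    support location; rows are flattened (H*W, d) feature maps."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    return (q @ s.T).max(axis=1)   # best support match per query location

rng = np.random.default_rng(1)
support = rng.normal(size=(6, 4))  # placeholder support-image features
query = rng.normal(size=(8, 4))    # placeholder query-image features
query[3] = support[0]              # plant one perfectly matching region
heat = match_heatmap(query, support)
```

High-scoring locations form the heatmap that both explains the prediction and localizes the novel-category instance.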

BIHL: A Fast and High Performance Object Proposals based on Binarized HL Frequency

Title BIHL: A Fast and High Performance Object Proposals based on Binarized HL Frequency
Authors Jiang Chao, Liang Huawei, Wang Zhiling
Abstract In recent years, using object proposals as a preprocessing step for target detection to improve computational efficiency has become an effective approach. A good object proposal method should have a high object detection recall rate and low computational cost, as well as good localization accuracy and repeatability. However, it is difficult for current advanced algorithms to achieve a good balance among these criteria. It is therefore especially important to ensure that the recall rate and localization quality are not degraded while accelerating proposal generation. For this problem, we propose a class-independent object proposal algorithm, BIHL. It combines the advantages of window scoring and superpixel merging. First, a binarized horizontal high-frequency component feature and a linear classifier are used to learn and generate a set of candidate boxes with objectness scores. Then, the candidate boxes are merged based on the principle of location and score proximity. Unlike superpixel merging algorithms, our method avoids pixel-level operations, saving substantial computation without losing performance. Experimental results on the VOC2007 dataset and the VOC2007 synthetic interference dataset containing 297,120 test images show that, when including difficult-to-identify objects with an IoU threshold of 0.5 and 10,000 budget proposals, our method achieves a 99.3% detection recall and a mean average best overlap of 81.1%. The average processing time of our method over all test set images is 0.0015 seconds, which is nearly 3 times faster than the current fastest method. In repeatability testing, our method achieves the highest average repeatability among methods that are robust to various disturbances, and its average repeatability is 10% higher than RPN. The code will be published at https://github.com/JiangChao2009/BIHL
Tasks Object Detection
Published 2020-03-13
URL https://arxiv.org/abs/2003.06124v1
PDF https://arxiv.org/pdf/2003.06124v1.pdf
PWC https://paperswithcode.com/paper/bihla-fast-and-high-performance-object
Repo
Framework
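
The merging step, keeping the best-scoring box of each group of location-proximate candidates, resembles greedy non-maximum suppression over IoU. A minimal sketch with hand-made boxes and scores (the threshold is an assumption, not the paper's setting):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter)

def merge_proposals(boxes, scores, iou_thr=0.5):
    """Greedy merge: keep the highest-scoring box of each overlapping group."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in kept):
            kept.append(i)
    return [boxes[i] for i in kept]

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
merged = merge_proposals(boxes, scores)
```

Because the merge operates on boxes rather than pixels, its cost is independent of image resolution, which is the efficiency argument made above.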

Using context to make gas classifiers robust to sensor drift

Title Using context to make gas classifiers robust to sensor drift
Authors J. Warner, A. Devaraj, R. Miikkulainen
Abstract The interaction of a gas particle with a metal-oxide based gas sensor changes the sensor irreversibly. The compounded changes, referred to as sensor drift, are unstable, but adaptive algorithms can sustain the accuracy of odor sensor systems. Here we focus on extending the lifetime of sensor systems without additional data acquisition by transferring knowledge from one time window to a subsequent one after drift has occurred. To support generalization across sensor states, we introduce a context-based neural network model which forms a latent representation of the sensor state. We tested our models on classifying samples taken from unseen subsequent time windows and found favorable accuracy compared to drift-naive and ensemble methods on a gas sensor array drift dataset. By reducing the effect that sensor drift has on classification accuracy, context-based models may extend the effective lifetime of gas identification systems in practical settings.
Tasks
Published 2020-03-16
URL https://arxiv.org/abs/2003.07292v1
PDF https://arxiv.org/pdf/2003.07292v1.pdf
PWC https://paperswithcode.com/paper/using-context-to-make-gas-classifiers-robust
Repo
Framework
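
One simple way a context vector can modulate a classifier, so that a single model serves many drift states, is multiplicative gating of the features. This is a sketch of the general idea with random placeholder weights and shapes, not the paper's specific architecture:

```python
import numpy as np

def context_gated_logits(x, context, Wc, Wx):
    """A context vector (an inferred sensor-state representation) gates the
    features before classification, letting one classifier cover many
    drift states."""
    gate = 1.0 / (1.0 + np.exp(-(Wc @ context)))  # sigmoid gate per feature
    return Wx @ (x * gate)

rng = np.random.default_rng(0)
x = rng.normal(size=16)        # sensor-array features for one sample
context = rng.normal(size=4)   # latent sensor-state vector
Wc = rng.normal(size=(16, 4))  # placeholder context-to-gate weights
Wx = rng.normal(size=(3, 16))  # placeholder classifier, three gas classes
logits = context_gated_logits(x, context, Wc, Wx)
```

Only the context pathway needs to track the drifting sensor state; the classification weights can stay fixed across time windows.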

Deep Hough Transform for Semantic Line Detection

Title Deep Hough Transform for Semantic Line Detection
Authors Qi Han, Kai Zhao, Jun Xu, Ming-Ming Cheng
Abstract In this paper, we put forward a simple yet effective method to detect meaningful straight lines, a.k.a. semantic lines, in given scenes. Prior methods treat line detection as a special case of object detection, while neglecting the inherent characteristics of lines, leading to less efficient and suboptimal results. We propose a one-shot end-to-end framework by incorporating the classical Hough transform into deeply learned representations. By parameterizing lines with slopes and biases, we perform the Hough transform to translate deep representations into the parametric space and then directly detect lines in the parametric space. More concretely, we aggregate features along candidate lines on the feature map plane and then assign the aggregated features to corresponding locations in the parametric domain. Consequently, the problem of detecting semantic lines in the spatial domain is transformed into spotting individual points in the parametric domain, making the post-processing steps, i.e., non-maximal suppression, more efficient. Furthermore, our method makes it easy to extract contextual line features, which are critical to accurate line detection. Experimental results on a public dataset demonstrate the advantages of our method over state-of-the-art methods.
Tasks Object Detection
Published 2020-03-10
URL https://arxiv.org/abs/2003.04676v1
PDF https://arxiv.org/pdf/2003.04676v1.pdf
PWC https://paperswithcode.com/paper/deep-hough-transform-for-semantic-line
Repo
Framework
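
The classical Hough transform this method builds on votes each point into a (theta, rho) accumulator; the strongest cell identifies a line, turning line detection into point detection in the parametric domain. A minimal sketch on raw points (the paper aggregates deep features along candidate lines instead of accumulating point votes; the grid sizes below are arbitrary):

```python
import numpy as np

def hough_line(points, n_theta=180, n_rho=100, rho_max=200.0):
    """Vote each point into a (theta, rho) accumulator for lines
    rho = x*cos(theta) + y*sin(theta); the strongest cell is the line."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        ok = (idx >= 0) & (idx < n_rho)
        acc[np.arange(n_theta)[ok], idx[ok]] += 1
    t, r = np.unravel_index(acc.argmax(), acc.shape)
    return thetas[t], 2 * rho_max * r / (n_rho - 1) - rho_max

# points sampled from the vertical line x = 50
theta, rho = hough_line([(50.0, float(y)) for y in range(0, 100, 5)])
```

Since each line collapses to one accumulator cell, non-maximal suppression only has to compare isolated peaks rather than overlapping spatial detections.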

Clickbait Detection using Multiple Categorization Techniques

Title Clickbait Detection using Multiple Categorization Techniques
Authors Abinash Pujahari, Dilip Singh Sisodia
Abstract Clickbaits are online articles with deliberately misleading titles designed to lure more and more readers into opening the intended web page. Clickbaits are used to tempt visitors to click on a particular link, either to monetize the landing page or to spread false news for sensationalization. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Automatic detection of clickbait headlines from news headlines has been a challenging issue for the machine learning community. Many methods have been proposed for preventing clickbait articles in the recent past. However, the recent techniques available for detecting clickbaits are not very robust. This paper proposes a hybrid categorization technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure, and clustering. During preliminary categorization, the headlines are separated using eleven features. After that, the headlines are recategorized using sentence formality and syntactic similarity measures. In the last phase, the headlines are again recategorized by applying clustering using word-vector similarity based on the t-distributed Stochastic Neighbor Embedding (t-SNE) approach. After categorization of these headlines, machine learning models are applied to the dataset to evaluate machine learning algorithms. The experimental results indicate that the proposed hybrid model is more robust, reliable, and efficient than any individual categorization technique on the real-world dataset we used.
Tasks Clickbait Detection
Published 2020-03-29
URL https://arxiv.org/abs/2003.12961v1
PDF https://arxiv.org/pdf/2003.12961v1.pdf
PWC https://paperswithcode.com/paper/clickbait-detection-using-multiple
Repo
Framework
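
The first phase, separating headlines by surface features, can be illustrated with a toy extractor. The features below are illustrative stand-ins, not the paper's actual eleven features:

```python
import re

# second-person and deictic words commonly over-represented in clickbait
CLICKBAIT_PRONOUNS = {"you", "your", "this", "these"}

def headline_features(headline):
    """Extract a small dict of surface features from one headline."""
    words = re.findall(r"[a-z']+", headline.lower())
    return {
        "n_words": len(words),
        "has_number": int(bool(re.search(r"\d", headline))),
        "has_question": int("?" in headline),
        "pronoun_ratio": sum(w in CLICKBAIT_PRONOUNS for w in words)
                         / max(len(words), 1),
    }

f = headline_features("You Won't Believe These 7 Tricks!")
```

Feature vectors like this feed the preliminary categorization; the later phases then refine the split with formality, syntactic similarity, and clustering.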

Procedural Reading Comprehension with Attribute-Aware Context Flow

Title Procedural Reading Comprehension with Attribute-Aware Context Flow
Authors Aida Amini, Antoine Bosselut, Bhavana Dalvi Mishra, Yejin Choi, Hannaneh Hajishirzi
Abstract Procedural texts often describe processes (e.g., photosynthesis and cooking) that happen over entities (e.g., light, food). In this paper, we introduce an algorithm for procedural reading comprehension by translating the text into a general formalism that represents processes as a sequence of transitions over entity attributes (e.g., location, temperature). Leveraging pre-trained language models, our model obtains entity-aware and attribute-aware representations of the text by jointly predicting entity attributes and their transitions. Our model dynamically obtains contextual encodings of the procedural text, exploiting information encoded about previous and current states to predict the transition of a certain attribute, which can be identified as a span of text or from a pre-defined set of classes. Moreover, our model achieves state-of-the-art results on two procedural reading comprehension datasets, namely ProPara and npn-cooking.
Tasks Reading Comprehension
Published 2020-03-31
URL https://arxiv.org/abs/2003.13878v1
PDF https://arxiv.org/pdf/2003.13878v1.pdf
PWC https://paperswithcode.com/paper/procedural-reading-comprehension-with
Repo
Framework
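
The formalism of processes as transitions over entity attributes can be illustrated by replaying predicted transitions into a state table. A minimal sketch (the example transitions are made up, not drawn from ProPara):

```python
def track_attributes(transitions):
    """Replay predicted (entity, attribute, value) transitions, keeping the
    current value of every entity attribute."""
    state = {}
    for entity, attribute, value in transitions:
        state.setdefault(entity, {})[attribute] = value
    return state

state = track_attributes([
    ("water", "location", "pot"),      # water is put in the pot
    ("water", "temperature", "hot"),   # it is heated
    ("water", "location", "air"),      # it evaporates
])
```

Reading comprehension questions about the process ("where is the water at the end?") then reduce to lookups in the final state table.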