January 31, 2020

Paper Group ANR 107

Understanding the Behaviour of the Empirical Cross-Entropy Beyond the Training Distribution


Title	Understanding the Behaviour of the Empirical Cross-Entropy Beyond the Training Distribution
Authors	Matias Vera, Pablo Piantanida, Leonardo Rey Vega
Abstract	Machine learning theory has mostly focused on generalization to samples from the same distribution as the training data. However, a better understanding of generalization beyond the training distribution, where the observed distribution changes, is also fundamentally important to achieve a more powerful form of generalization. In this paper, we attempt to study through the lens of information measures how a particular architecture behaves when the true probability law of the samples is potentially different at training and testing times. Our main result is that the testing gap between the empirical cross-entropy and its statistical expectation (measured with respect to the testing probability law) can be bounded with high probability by the mutual information between the input testing samples and the corresponding representations, generated by the encoder obtained at training time. These theoretical results are supported by numerical simulations showing that this mutual information is representative of the testing gap, qualitatively capturing its dynamics with respect to the hyperparameters of the network.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11972v1
PDF	https://arxiv.org/pdf/1905.11972v1.pdf
PWC	https://paperswithcode.com/paper/understanding-the-behaviour-of-the-empirical
Repo
Framework

StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible


Title	StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible
Authors	Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham
Abstract	Blind people frequently encounter inaccessible dynamic touchscreens in their everyday lives that are difficult, frustrating, and often impossible to use independently. Touchscreens are often the only way to control everything from coffee machines and payment terminals, to subway ticket machines and in-flight entertainment systems. Interacting with dynamic touchscreens is difficult non-visually because the visual user interfaces change, interactions often occur over multiple different screens, and it is easy to accidentally trigger interface actions while exploring the screen. To solve these problems, we introduce StateLens - a three-part reverse engineering solution that makes existing dynamic touchscreens accessible. First, StateLens reverse engineers the underlying state diagrams of existing interfaces using point-of-view videos found online or taken by users using a hybrid crowd-computer vision pipeline. Second, using the state diagrams, StateLens automatically generates conversational agents to guide blind users through specifying the tasks that the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface. Finally, a set of 3D-printed accessories enable blind people to explore capacitive touchscreens without the risk of triggering accidental touches on the interface. Our technical evaluation shows that StateLens can accurately reconstruct interfaces from stationary, hand-held, and web videos; and, a user study of the complete system demonstrates that StateLens successfully enables blind users to access otherwise inaccessible dynamic touchscreens.
Tasks
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07144v1
PDF	https://arxiv.org/pdf/1908.07144v1.pdf
PWC	https://paperswithcode.com/paper/statelens-a-reverse-engineering-solution-for
Repo
Framework

A deep learning approach to solar-irradiance forecasting in sky-videos


Title	A deep learning approach to solar-irradiance forecasting in sky-videos
Authors	Talha A. Siddiqui, Samarth Bharadwaj, Shivkumar Kalyanaraman
Abstract	Ahead-of-time forecasting of incident solar irradiance on a panel is indicative of expected energy yield and is essential for efficient grid distribution and planning. Traditionally, these forecasts are based on meteorological physics models whose parameters are tuned by coarse-grained radiometric tiles sensed from geo-satellites. This research presents a novel application of a deep neural network approach to observe and estimate short-term weather effects from videos. Specifically, we use time-lapsed videos (sky-videos) obtained from upward-facing wide-lensed cameras (sky-cameras) to directly estimate and forecast solar irradiance. We introduce and present results on two large publicly available datasets obtained from weather stations in two regions of North America using relatively inexpensive optical hardware. These datasets contain over a million images that span 1 and 12 years, respectively, the largest such collection to our knowledge. Compared to satellite-based approaches, the proposed deep learning approach significantly reduces the normalized mean-absolute-percentage error for both nowcasting, i.e. prediction of the solar irradiance at the instant the frame is captured, and forecasting, i.e. ahead-of-time irradiance prediction for a duration of up to 4 hours.
Tasks
Published	2019-01-15
URL	http://arxiv.org/abs/1901.04881v1
PDF	http://arxiv.org/pdf/1901.04881v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-learning-approach-to-solar-irradiance
Repo
Framework

Learning From Brains How to Regularize Machines


Title	Learning From Brains How to Regularize Machines
Authors	Zhe Li, Wieland Brendel, Edgar Y. Walker, Erick Cobos, Taliah Muhammad, Jacob Reimer, Matthias Bethge, Fabian H. Sinz, Xaq Pitkow, Andreas S. Tolias
Abstract	Despite impressive performance on numerous visual tasks, Convolutional Neural Networks (CNNs) — unlike brains — are often highly sensitive to small perturbations of their input, e.g. adversarial noise leading to erroneous decisions. We propose to regularize CNNs using large-scale neuroscience data to learn more robust neural features in terms of representational similarity. We presented natural images to mice and measured the responses of thousands of neurons from cortical visual areas. Next, we denoised the notoriously variable neural activity using strong predictive models trained on this large corpus of responses from the mouse visual system, and calculated the representational similarity for millions of pairs of images from the model’s predictions. We then used the neural representation similarity to regularize CNNs trained on image classification by penalizing intermediate representations that deviated from neural ones. This preserved performance of baseline models when classifying images under standard benchmarks, while maintaining substantially higher performance compared to baseline or control models when classifying noisy images. Moreover, the models regularized with cortical representations also improved model robustness in terms of adversarial attacks. This demonstrates that regularizing with neural data can be an effective tool to create an inductive bias towards more robust inference.
Tasks	Image Classification
Published	2019-11-11
URL	https://arxiv.org/abs/1911.05072v1
PDF	https://arxiv.org/pdf/1911.05072v1.pdf
PWC	https://paperswithcode.com/paper/learning-from-brains-how-to-regularize-1
Repo
Framework
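
The abstract above describes penalizing intermediate CNN representations whose pairwise image similarity deviates from the similarity measured in neural data. A minimal sketch of such a penalty follows. The use of cosine similarity, the mean-squared gap, and the function names are assumptions made for illustration; the abstract does not specify the authors' exact formulation.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between rows of an (n_images, n_features) array."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T


def similarity_penalty(cnn_features, neural_similarity):
    """Mean squared gap between CNN and neural similarity over distinct image pairs.

    Hypothetical regularizer: `neural_similarity` stands in for the
    (n_images, n_images) similarity matrix derived from neural responses.
    """
    cnn_similarity = cosine_similarity_matrix(cnn_features)
    n_images = cnn_similarity.shape[0]
    off_diagonal = ~np.eye(n_images, dtype=bool)  # ignore self-similarity
    gap = cnn_similarity[off_diagonal] - np.asarray(neural_similarity)[off_diagonal]
    return float(np.mean(gap ** 2))
```

In training, a term like this would be added to the classification loss with a weighting coefficient, so that the network is pulled toward the neural similarity structure without being forced to match it exactly.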

Building Damage Detection in Satellite Imagery Using Convolutional Neural Networks


Title	Building Damage Detection in Satellite Imagery Using Convolutional Neural Networks
Authors	Joseph Z. Xu, Wenhan Lu, Zebo Li, Pranav Khaitan, Valeriya Zaytseva
Abstract	In all types of disasters, from earthquakes to armed conflicts, aid workers need accurate and timely data such as damage to buildings and population displacement to mount an effective response. Remote sensing provides this data at an unprecedented scale, but extracting operationalizable information from satellite images is slow and labor-intensive. In this work, we use machine learning to automate the detection of building damage in satellite imagery. We compare the performance of four different convolutional neural network models in detecting damaged buildings in the 2010 Haiti earthquake. We also quantify how well the models will generalize to future disasters by training and testing models on different disaster events.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06444v1
PDF	https://arxiv.org/pdf/1910.06444v1.pdf
PWC	https://paperswithcode.com/paper/building-damage-detection-in-satellite
Repo
Framework

Stabilizing Deep Reinforcement Learning with Conservative Updates


Title	Stabilizing Deep Reinforcement Learning with Conservative Updates
Authors	Chen Tessler, Nadav Merlis, Shie Mannor
Abstract	In recent years, advances in deep learning have enabled the application of reinforcement learning algorithms in complex domains. However, they lack the theoretical guarantees which are present in the tabular setting and suffer from many stability and reproducibility problems (Henderson et al., 2018). In this work, we suggest a simple approach for improving stability and providing probabilistic performance improvement in off-policy actor-critic deep reinforcement learning regimes. Experiments on continuous action spaces, in the MuJoCo control suite, show that our proposed method reduces the variance of the process and improves the overall performance.
Tasks
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01062v2
PDF	https://arxiv.org/pdf/1910.01062v2.pdf
PWC	https://paperswithcode.com/paper/stabilizing-off-policy-reinforcement-learning
Repo
Framework

Robust Wireless Fingerprinting via Complex-Valued Neural Networks


Title	Robust Wireless Fingerprinting via Complex-Valued Neural Networks
Authors	Soorya Gopalakrishnan, Metehan Cekic, Upamanyu Madhow
Abstract	A “wireless fingerprint” which exploits hardware imperfections unique to each device is a potentially powerful tool for wireless security. Such a fingerprint should be able to distinguish between devices sending the same message, and should be robust against standard spoofing techniques. Since the information in wireless signals resides in complex baseband, in this paper, we explore the use of neural networks with complex-valued weights to learn fingerprints using supervised learning. We demonstrate that, while there are potential benefits to using sections of the signal beyond just the preamble to learn fingerprints, the network cheats when it can, using information such as transmitter ID (which can be easily spoofed) to artificially inflate performance. We also show that noise augmentation by inserting additional white Gaussian noise can lead to significant performance gains, which indicates that this counter-intuitive strategy helps in learning more robust fingerprints. We provide results for two different wireless protocols, WiFi and ADS-B, demonstrating the effectiveness of the proposed method.
Tasks
Published	2019-05-19
URL	https://arxiv.org/abs/1905.09388v2
PDF	https://arxiv.org/pdf/1905.09388v2.pdf
PWC	https://paperswithcode.com/paper/190509388
Repo
Framework
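
The noise augmentation mentioned in the abstract above, adding white Gaussian noise to complex baseband samples, can be sketched as follows. This is only an illustration under stated assumptions: the function name, the SNR parameterization, and the choice to measure signal power empirically are not taken from the paper.

```python
import numpy as np


def add_awgn(signal, snr_db, rng):
    """Return `signal` plus complex white Gaussian noise at the requested SNR (dB).

    Hypothetical helper: signal power is estimated from the samples, and the
    noise power is split equally between the real and imaginary parts.
    """
    signal = np.asarray(signal, dtype=np.complex128)
    signal_power = np.mean(np.abs(signal) ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(noise_power / 2.0)  # per-component standard deviation
    noise = scale * (
        rng.standard_normal(signal.shape) + 1j * rng.standard_normal(signal.shape)
    )
    return signal + noise


# Usage: draw a fresh noise realization for each training example.
rng = np.random.default_rng(0)
clean = np.exp(1j * np.linspace(0.0, 20.0, 100_000))  # unit-power test tone
noisy = add_awgn(clean, snr_db=10.0, rng=rng)
```

Drawing new noise every time an example is seen keeps the network from memorizing one particular noise realization.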

3D Kidneys and Kidney Tumor Semantic Segmentation using Boundary-Aware Networks


Title	3D Kidneys and Kidney Tumor Semantic Segmentation using Boundary-Aware Networks
Authors	Andriy Myronenko, Ali Hatamizadeh
Abstract	Automated segmentation of kidneys and kidney tumors is an important step in quantifying the tumor’s morphometrical details to monitor the progression of the disease and accurately compare decisions regarding the kidney tumor treatment. Manual delineation techniques are often tedious, error-prone and require expert knowledge to create unambiguous segmentations of kidneys and kidney tumors. In this work, we propose end-to-end boundary-aware fully convolutional neural networks (CNNs) for reliable kidney and kidney tumor semantic segmentation from arterial phase abdominal 3D CT scans. We propose a segmentation network consisting of an encoder-decoder architecture that specifically accounts for organ and tumor edge information by devising a dedicated boundary branch supervised by edge-aware loss terms. We have evaluated our model on the 2019 MICCAI KiTS Kidney Tumor Segmentation Challenge dataset, and our method has achieved Dice scores of 0.9742 and 0.8103 for kidney and tumor, respectively, and an overall composite Dice score of 0.8923.
Tasks	Semantic Segmentation
Published	2019-09-14
URL	https://arxiv.org/abs/1909.06684v1
PDF	https://arxiv.org/pdf/1909.06684v1.pdf
PWC	https://paperswithcode.com/paper/3d-kidneys-and-kidney-tumor-semantic
Repo
Framework
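
For readers unfamiliar with the metric quoted in the abstract above, the Dice score of two binary masks is twice the size of their overlap divided by the sum of their sizes. A minimal sketch follows; the function name and the convention of returning 1.0 for two empty masks are assumptions, not details from the paper.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denominator = pred.sum() + target.sum()
    if denominator == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, target).sum() / denominator)


# Toy 2D example; the same function applies unchanged to 3D CT volumes.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_score(pred, target)  # overlap of 2 voxels, 3 voxels in each mask
```

The reported composite score of 0.8923 appears consistent with the arithmetic mean of the kidney and tumor scores, (0.9742 + 0.8103) / 2 = 0.89225.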

When Does Self-supervision Improve Few-shot Learning?


Title	When Does Self-supervision Improve Few-shot Learning?
Authors	Jong-Chyi Su, Subhransu Maji, Bharath Hariharan
Abstract	We present a technique to improve the generalization of deep representations learned on small labeled datasets by introducing self-supervised tasks as auxiliary loss functions. Although recent research has shown benefits of self-supervised learning (SSL) on large unlabeled datasets, its utility on small datasets is unknown. We find that SSL reduces the relative error rate of few-shot meta-learners by 4%-27%, even when the datasets are small and only images within the datasets are used. The improvements are greater when the training set is smaller or the task is more challenging. Though the benefits of SSL may increase with larger training sets, we observe that SSL can have a negative impact on performance when there is a domain shift between the distribution of images used for meta-learning and SSL. Based on this analysis, we present a technique that automatically selects images for SSL from a large, generic pool of unlabeled images for a given dataset using a domain classifier, which provides further improvements. We present results using several meta-learners and self-supervised tasks across datasets with varying degrees of domain shift and label sizes to characterize the effectiveness of SSL for few-shot learning.
Tasks	Few-Shot Learning, Meta-Learning
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03560v1
PDF	https://arxiv.org/pdf/1910.03560v1.pdf
PWC	https://paperswithcode.com/paper/when-does-self-supervision-improve-few-shot
Repo
Framework

Gated Group Self-Attention for Answer Selection


Title	Gated Group Self-Attention for Answer Selection
Authors	Dong Xu, Jianhui Ji, Haikuan Huang, Hongbo Deng, Wu-Jun Li
Abstract	Answer selection (answer ranking) is one of the key steps in many kinds of question answering (QA) applications, where deep models have achieved state-of-the-art performance. Among these deep models, recurrent neural network (RNN) based models are the most popular, typically with better performance than convolutional neural network (CNN) based models. Nevertheless, it is difficult for RNN based models to capture long-range dependencies among words in the sentences of questions and answers. In this paper, we propose a new deep model, called gated group self-attention (GGSA), for answer selection. GGSA is inspired by global self-attention, which was originally proposed for machine translation and has not been explored in answer selection. GGSA tackles a problem of global self-attention: local and global information cannot be well distinguished. Furthermore, an interaction mechanism between questions and answers is also proposed to enhance GGSA by a residual structure. Experimental results on two popular QA datasets show that GGSA can outperform existing answer selection models to achieve state-of-the-art performance. Furthermore, GGSA can also achieve higher accuracy than global self-attention for the answer selection task, with a lower computation cost.
Tasks	Answer Selection, Machine Translation, Question Answering
Published	2019-05-26
URL	https://arxiv.org/abs/1905.10720v1
PDF	https://arxiv.org/pdf/1905.10720v1.pdf
PWC	https://paperswithcode.com/paper/gated-group-self-attention-for-answer
Repo
Framework

Towards Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-based Convolutional Encoders


Title	Towards Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-based Convolutional Encoders
Authors	Matteo Manica, Ali Oskooei, Jannis Born, Vigneshwari Subramanian, Julio Sáez-Rodríguez, María Rodríguez Martínez
Abstract	In line with recent advances in neural drug design and sensitivity prediction, we propose a novel architecture for interpretable prediction of anticancer compound sensitivity using a multimodal attention-based convolutional encoder. Our model is based on the three key pillars of drug sensitivity: compounds’ structure in the form of a SMILES sequence, gene expression profiles of tumors and prior knowledge on intracellular interactions from protein-protein interaction networks. We demonstrate that our multiscale convolutional attention-based (MCA) encoder significantly outperforms a baseline model trained on Morgan fingerprints, a selection of encoders based on SMILES as well as previously reported state of the art for multimodal drug sensitivity prediction (R² = 0.86 and RMSE = 0.89). Moreover, the explainability of our approach is demonstrated by a thorough analysis of the attention weights. We show that the attended genes significantly enrich apoptotic processes and that the drug attention is strongly correlated with a standard chemical structure similarity index. Finally, we report a case study of two receptor tyrosine kinase (RTK) inhibitors acting on a leukemia cell line, showcasing the ability of the model to focus on informative genes and submolecular regions of the two compounds. The demonstrated generalizability and interpretability of our model testify to its potential for in-silico prediction of anticancer compound efficacy on unseen cancer cells, positioning it as a valid solution for the development of personalized therapies as well as for the evaluation of candidate compounds in de novo drug design.
Tasks
Published	2019-04-25
URL	https://arxiv.org/abs/1904.11223v3
PDF	https://arxiv.org/pdf/1904.11223v3.pdf
PWC	https://paperswithcode.com/paper/towards-explainable-anticancer-compound
Repo
Framework

Automatic Whole-body Bone Age Assessment Using Deep Hierarchical Features


Title	Automatic Whole-body Bone Age Assessment Using Deep Hierarchical Features
Authors	Hai-Duong Nguyen, Soo-Hyung Kim
Abstract	Bone age assessment provides evidence for analyzing children's growth status and rejuvenation involving chronological and biological ages. All previous works consider left-hand X-ray images of children. In this paper, we carry out a study on estimating human age using whole-body bone CT images and a novel convolutional neural network. Our model with additional connections shows an effective way to generate a massive number of vital features while reducing the influence of overfitting on small training data in the medical image analysis research area. A dataset and a comparison with common deep architectures will be provided for future research in this field.
Tasks
Published	2019-01-29
URL	http://arxiv.org/abs/1901.10237v1
PDF	http://arxiv.org/pdf/1901.10237v1.pdf
PWC	https://paperswithcode.com/paper/automatic-whole-body-bone-age-assessment
Repo
Framework

Matrix Completion via Nonconvex Regularization: Convergence of the Proximal Gradient Algorithm


Title	Matrix Completion via Nonconvex Regularization: Convergence of the Proximal Gradient Algorithm
Authors	Fei Wen, Rendong Ying, Peilin Liu, Trieu-Kien Truong
Abstract	Matrix completion has attracted much interest in the past decade in machine learning and computer vision. For low-rank promotion in matrix completion, the nuclear norm penalty is convenient due to its convexity but has a bias problem. Recently, various algorithms using nonconvex penalties have been proposed, among which the proximal gradient descent (PGD) algorithm is one of the most efficient and effective. For the nonconvex PGD algorithm, whether it converges to a local minimizer and its convergence rate are still unclear. This work provides a nontrivial analysis of the PGD algorithm in the nonconvex case. Besides the convergence to a stationary point for a generalized nonconvex penalty, we provide a deeper analysis of a popular and important class of nonconvex penalties which have discontinuous thresholding functions. For such penalties, we establish the finite rank convergence, convergence to a restricted strictly local minimizer and an eventually linear convergence rate of the PGD algorithm. Meanwhile, convergence to a local minimizer has been proved for the hard-thresholding penalty. Our result is the first to show that nonconvex regularized matrix completion only has restricted strictly local minimizers, and that the PGD algorithm can converge to such minimizers at an eventually linear rate under certain conditions. Illustration of the PGD algorithm via experiments has also been provided. Code is available at https://github.com/FWen/nmc.
Tasks	Matrix Completion
Published	2019-03-02
URL	http://arxiv.org/abs/1903.00702v1
PDF	http://arxiv.org/pdf/1903.00702v1.pdf
PWC	https://paperswithcode.com/paper/matrix-completion-via-nonconvex
Repo
Framework
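
To make the proximal gradient iteration discussed in the abstract above concrete, here is a minimal sketch for one nonconvex penalty with a discontinuous thresholding function: the rank penalty, whose proximal operator hard-thresholds the singular values. The objective, the unit step size, and all function names are assumptions for illustration; this is not the authors' released code.

```python
import numpy as np


def hard_threshold_singular_values(matrix, lam):
    """Proximal operator of lam * rank(.) with unit step size.

    Singular values at or below sqrt(2 * lam) are set to zero; the rest are kept.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_kept = np.where(s > np.sqrt(2.0 * lam), s, 0.0)
    return (u * s_kept) @ vt  # scale the columns of u, then recombine


def pgd_matrix_completion(observed, mask, lam, n_iters=100):
    """PGD on 0.5 * ||mask * (X - observed)||_F^2 + lam * rank(X).

    `mask` is a 0/1 array marking the observed entries (hypothetical interface).
    """
    x = np.zeros_like(observed, dtype=float)
    for _ in range(n_iters):
        gradient = mask * (x - observed)  # gradient of the data-fit term
        # Unit step is admissible because this gradient is 1-Lipschitz.
        x = hard_threshold_singular_values(x - gradient, lam)
    return x
```

With every entry observed, the iteration reduces to a single hard-thresholding of the data matrix; with missing entries, each step refills the unobserved positions from the current low-rank estimate.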

A Nonlinear Model for Time Synchronization


Title	A Nonlinear Model for Time Synchronization
Authors	Frank Wang, Danjue Li
Abstract	Current algorithms are based on a linear model; for example, the Precision Time Protocol (PTP) requires frequent synchronization in order to handle the effects of clock frequency drift. This paper introduces a nonlinear approach to clock time synchronization. This approach can accurately model the frequency shift. Therefore, the required time interval between clock synchronizations can be longer. Meanwhile, it also offers better performance and relaxes the synchronization process. The idea of the nonlinear algorithm and some numerical examples will be presented in this paper in detail.
Tasks
Published	2019-03-01
URL	http://arxiv.org/abs/1903.00545v1
PDF	http://arxiv.org/pdf/1903.00545v1.pdf
PWC	https://paperswithcode.com/paper/a-nonlinear-model-for-time-synchronization
Repo
Framework
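
The abstract above does not state the form of the nonlinear model, so the following is only a sketch of the general idea under a loudly stated assumption: the clock offset is modeled as a quadratic in time, which corresponds to a frequency error that drifts at a constant rate. The function names and the synthetic numbers are hypothetical.

```python
import numpy as np


def fit_offset_model(times, offsets, degree):
    """Least-squares polynomial fit of clock offset against time.

    degree=1 is a linear model (constant frequency error); degree=2 is the
    assumed nonlinear model (frequency error drifting at a constant rate).
    """
    return np.polyfit(times, offsets, degree)


def predict_offset(coefficients, t):
    """Evaluate a fitted offset model at time t."""
    return float(np.polyval(coefficients, t))


# Synthetic, noise-free offsets with a small quadratic drift term.
times = np.arange(0.0, 10.0)
offsets = 1e-3 + 2e-5 * times + 5e-7 * times ** 2

linear_model = fit_offset_model(times, offsets, degree=1)
quadratic_model = fit_offset_model(times, offsets, degree=2)

# Extrapolate well past the last synchronization point.
t_future = 60.0
true_offset = 1e-3 + 2e-5 * t_future + 5e-7 * t_future ** 2
linear_error = abs(predict_offset(linear_model, t_future) - true_offset)
quadratic_error = abs(predict_offset(quadratic_model, t_future) - true_offset)
```

On data like this, the model that accounts for drift stays accurate much longer after the last measurement, which is the sense in which the interval between synchronizations can be lengthened.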

MV-C3D: A Spatial Correlated Multi-View 3D Convolutional Neural Networks


Title	MV-C3D: A Spatial Correlated Multi-View 3D Convolutional Neural Networks
Authors	Qi Xuan, Fuxian Li, Yi Liu, Yun Xiang
Abstract	With the development of deep neural networks, 3D object recognition is becoming increasingly popular in the computer vision community. Many multi-view based methods have been proposed to improve the category recognition accuracy. These approaches mainly rely on multi-view images which are rendered around the whole circumference. In real-world applications, however, 3D objects are mostly observed from partial viewpoints over a smaller range. Therefore, we propose a multi-view based 3D convolutional neural network, which takes only a subset of contiguous multi-view images as input and can still maintain high accuracy. Moreover, our model takes these view images as a joint variable to better learn spatially correlated features using 3D convolution and 3D max-pooling layers. Experimental results on the ModelNet10 and ModelNet40 datasets show that our MV-C3D technique can achieve outstanding performance with multi-view images which are captured from partial angles over a smaller range. The results on the 3D rotated real-image dataset MIRO further demonstrate that MV-C3D is more adaptable in real-world scenarios. The classification accuracy can be further improved as the number of view images increases.
Tasks	3D Object Recognition, Object Recognition
Published	2019-06-15
URL	https://arxiv.org/abs/1906.06538v1
PDF	https://arxiv.org/pdf/1906.06538v1.pdf
PWC	https://paperswithcode.com/paper/mv-c3d-a-spatial-correlated-multi-view-3d
Repo
Framework