October 17, 2019


Paper Group ANR 709

Discriminative Cross-View Binary Representation Learning. Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding. Interval-based Prediction Uncertainty Bound Computation in Learning with Missing Values. A Scalable Framework for Trajectory Prediction. PCA-Based Missing Information Imputation for Real-Time Crash Likelihood …

Discriminative Cross-View Binary Representation Learning


Title	Discriminative Cross-View Binary Representation Learning
Authors	Liu Liu, Hairong Qi
Abstract	Learning compact representation is vital and challenging for large scale multimedia data. Cross-view/cross-modal hashing for effective binary representation learning has received significant attention with exponentially growing availability of multimedia content. Most existing cross-view hashing algorithms emphasize the similarities in individual views, which are then connected via cross-view similarities. In this work, we focus on the exploitation of the discriminative information from different views, and propose an end-to-end method to learn semantic-preserving and discriminative binary representation, dubbed Discriminative Cross-View Hashing (DCVH), in light of learning multitasking binary representation for various tasks including cross-view retrieval, image-to-image retrieval, and image annotation/tagging. The proposed DCVH has the following key components. First, it uses convolutional neural network (CNN) based nonlinear hashing functions and multilabel classification for both images and texts simultaneously. Such hashing functions achieve effective continuous relaxation during training without explicit quantization loss by using Direct Binary Embedding (DBE) layers. Second, we propose an effective view alignment via Hamming distance minimization, which is efficiently accomplished by bit-wise XOR operation. Extensive experiments on two image-text benchmark datasets demonstrate that DCVH outperforms state-of-the-art cross-view hashing algorithms as well as single-view image hashing algorithms. In addition, DCVH can provide competitive performance for image annotation/tagging.
Tasks	Image Retrieval, Quantization, Representation Learning
Published	2018-04-04
URL	http://arxiv.org/abs/1804.01233v1
PDF	http://arxiv.org/pdf/1804.01233v1.pdf
PWC	https://paperswithcode.com/paper/discriminative-cross-view-binary
Repo
Framework
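
The abstract above notes that Hamming distance between binary codes can be computed with a bit-wise XOR. A minimal sketch of that operation follows, assuming each code is packed into a Python integer; the helper names `pack_bits` and `hamming_distance` and the 8-bit example codes are illustrative only and are not taken from DCVH.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into one integer, most significant bit first."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions where two packed binary codes differ."""
    # XOR leaves a 1 exactly where the two codes disagree.
    return bin(a ^ b).count("1")


# Hypothetical 8-bit codes for an image and a text description.
image_code = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
text_code = pack_bits([1, 0, 0, 1, 0, 1, 1, 0])

print(hamming_distance(image_code, text_code))  # 2
```

Packing the code into an integer means one XOR compares every bit at once, which is why Hamming-space retrieval is cheap compared with distances over real-valued features.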

Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding


Title	Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Authors	Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, Dacheng Tao
Abstract	Visual grounding aims to localize an object in an image referred to by a textual query phrase. Various visual grounding approaches have been proposed, and the problem can be modularized into a general framework: proposal generation, multi-modal feature representation, and proposal ranking. Of these three modules, most existing approaches focus on the latter two, with the importance of proposal generation generally neglected. In this paper, we rethink the problem of what properties make a good proposal generator. We introduce diversity and discrimination simultaneously when generating proposals, and in doing so propose the Diversified and Discriminative Proposal Networks (DDPN) model. Based on the proposals generated by DDPN, we propose a high performance baseline model for visual grounding and evaluate it on four benchmark datasets. Experimental results demonstrate that our model delivers significant improvements on all the tested datasets (e.g., 18.8% improvement on ReferItGame and 8.2% improvement on Flickr30k Entities over the existing state of the art, respectively).
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03508v1
PDF	http://arxiv.org/pdf/1805.03508v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-diversified-and-discriminative
Repo
Framework

Interval-based Prediction Uncertainty Bound Computation in Learning with Missing Values


Title	Interval-based Prediction Uncertainty Bound Computation in Learning with Missing Values
Authors	Hiroyuki Hanada, Toshiyuki Takada, Jun Sakuma, Ichiro Takeuchi
Abstract	The problem of machine learning with missing values is common in many areas. A simple approach is to first construct a dataset without missing values simply by discarding instances with missing entries or by imputing a fixed value for each missing entry, and then train a prediction model with the new dataset. A drawback of this naive approach is that the uncertainty in the missing entries is not properly incorporated in the prediction. In order to evaluate prediction uncertainty, the multiple imputation (MI) approach has been studied, but the performance of MI is sensitive to the choice of the probabilistic model of the true values in the missing entries, and the computational cost of MI is high because multiple models must be trained. In this paper, we propose an alternative approach called the Interval-based Prediction Uncertainty Bounding (IPUB) method. The IPUB method represents the uncertainties due to missing entries as intervals, and efficiently computes the lower and upper bounds of the prediction results when all possible training sets constructed by imputing arbitrary values in the intervals are considered. The IPUB method can be applied to a wide class of convex learning algorithms including penalized least-squares regression, support vector machine (SVM), and logistic regression. We demonstrate the advantages of the IPUB method by comparing it with an existing method in numerical experiments with benchmark datasets.
Tasks	Imputation
Published	2018-03-01
URL	http://arxiv.org/abs/1803.00218v1
PDF	http://arxiv.org/pdf/1803.00218v1.pdf
PWC	https://paperswithcode.com/paper/interval-based-prediction-uncertainty-bound
Repo
Framework

A Scalable Framework for Trajectory Prediction


Title	A Scalable Framework for Trajectory Prediction
Authors	Punit Rathore, Dheeraj Kumar, Sutharshan Rajasegarar, Marimuthu Palaniswami, James C. Bezdek
Abstract	Trajectory prediction (TP) is of great importance for a wide range of location-based applications in intelligent transport systems such as location-based advertising, route planning, traffic management, and early warning systems. In the last few years, the widespread use of GPS navigation systems and wireless communication technology enabled vehicles has resulted in huge volumes of trajectory data. The task of utilizing this data employing spatio-temporal techniques for trajectory prediction in an efficient and accurate manner is an ongoing research problem. Existing TP approaches are limited to short-term predictions. Moreover, they cannot handle a large volume of trajectory data for long-term prediction. To address these limitations, we propose a scalable clustering and Markov chain based hybrid framework, called Traj-clusiVAT-based TP, for both short-term and long-term trajectory prediction, which can handle a large number of overlapping trajectories in a dense road network. Traj-clusiVAT can also determine the number of clusters, which represent different movement behaviours in input trajectory data. In our experiments, we compare our proposed approach with a mixed Markov model (MMM)-based scheme, and a trajectory clustering, NETSCAN-based TP method for both short- and long-term trajectory predictions. We performed our experiments on two real, vehicle trajectory datasets, including a large-scale trajectory dataset consisting of 3.28 million trajectories obtained from 15,061 taxis in Singapore over a period of one month. Experimental results on two real trajectory datasets show that our proposed approach outperforms the existing approaches in terms of both short- and long-term prediction performances, based on prediction accuracy and distance error (in km).
Tasks	Trajectory Prediction
Published	2018-06-10
URL	http://arxiv.org/abs/1806.03582v3
PDF	http://arxiv.org/pdf/1806.03582v3.pdf
PWC	https://paperswithcode.com/paper/a-scalable-framework-for-trajectory
Repo
Framework
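
The abstract above describes a hybrid of clustering and Markov chains. As a rough illustration of the Markov-chain half only, the sketch below fits a first-order chain to sequences of road-segment IDs and rolls it forward greedily. It is not the Traj-clusiVAT method: the clustering step is omitted, and the function names and toy trajectories are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count how often each location is immediately followed by each other location."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, location):
    """Return the most frequent successor of a location, or None if it was never left."""
    if location not in counts:
        return None
    return counts[location].most_common(1)[0][0]


def predict_path(counts, start, steps):
    """Greedy multi-step prediction: repeatedly follow the most frequent transition."""
    path = []
    current = start
    for _ in range(steps):
        current = predict_next(counts, current)
        if current is None:
            break
        path.append(current)
    return path


# Hypothetical trajectories over road segments labelled A to E.
trajectories = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "B", "C"],
    ["B", "C", "E"],
]
counts = fit_transitions(trajectories)

print(predict_next(counts, "B"))     # C
print(predict_path(counts, "A", 5))  # ['B', 'C', 'E']
```

A single chain fitted to all trajectories blurs different movement behaviours together, which is one reason to cluster trajectories first and fit a chain per cluster, as the abstract's hybrid design suggests.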

PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data


Title	PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data
Authors	Jintao Ke, Shuaichao Zhang, Hai Yang, Xiqun Chen
Abstract	Real-time crash likelihood prediction has been an important research topic. Various classifiers, such as support vector machine (SVM) and tree-based boosting algorithms, have been proposed in traffic safety studies. However, few studies focus on missing-data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to breakdown of sensors or external interference. Besides, classifying imbalanced data is also a difficult problem in real-time crash likelihood prediction, since it is hard to distinguish crash-prone cases from non-crash cases which compose the majority of the observed samples. In this paper, principal component analysis (PCA) based approaches, including LS-PCA, PPCA, and VBPCA, are employed for imputing missing values, while two kinds of solutions are developed to address the imbalanced-data problem. The results show that PPCA and VBPCA not only outperform LS-PCA and other imputation methods (including mean imputation and k-means clustering imputation), in terms of the root mean square error (RMSE), but also help the classifiers achieve better predictive performance. The two solutions, i.e., cost-sensitive learning and synthetic minority oversampling technique (SMOTE), help improve the sensitivity by adjusting the classifiers to pay more attention to the minority class.
Tasks	Imputation
Published	2018-02-11
URL	http://arxiv.org/abs/1802.03699v1
PDF	http://arxiv.org/pdf/1802.03699v1.pdf
PWC	https://paperswithcode.com/paper/pca-based-missing-information-imputation-for
Repo
Framework

Learning Monocular 3D Human Pose Estimation from Multi-view Images


Title	Learning Monocular 3D Human Pose Estimation from Multi-view Images
Authors	Helge Rhodin, Jörg Spörri, Isinsu Katircioglu, Victor Constantin, Frédéric Meyer, Erich Müller, Mathieu Salzmann, Pascal Fua
Abstract	Accurate 3D human pose estimation from single images is possible with sophisticated deep-net architectures that have been trained on very large datasets. However, this still leaves open the problem of capturing motions for which no such database exists. Manual annotation is tedious, slow, and error-prone. In this paper, we propose to replace most of the annotations by the use of multiple views, at training time only. Specifically, we train the system to predict the same pose in all views. Such a consistency constraint is necessary but not sufficient to predict accurate poses. We therefore complement it with a supervised loss aiming to predict the correct pose in a small set of labeled images, and with a regularization term that penalizes drift from initial predictions. Furthermore, we propose a method to estimate camera pose jointly with human pose, which lets us utilize multi-view footage where calibration is difficult, e.g., for pan-tilt or moving handheld cameras. We demonstrate the effectiveness of our approach on established benchmarks, as well as on a new Ski dataset with rotating cameras and expert ski motion, for which annotations are truly hard to obtain.
Tasks	3D Human Pose Estimation, Calibration, Pose Estimation
Published	2018-03-13
URL	http://arxiv.org/abs/1803.04775v2
PDF	http://arxiv.org/pdf/1803.04775v2.pdf
PWC	https://paperswithcode.com/paper/learning-monocular-3d-human-pose-estimation
Repo
Framework

RF-PUF: Enhancing IoT Security through Authentication of Wireless Nodes using In-situ Machine Learning


Title	RF-PUF: Enhancing IoT Security through Authentication of Wireless Nodes using In-situ Machine Learning
Authors	Baibhab Chatterjee, Debayan Das, Shovan Maity, Shreyas Sen
Abstract	Traditional authentication in radio-frequency (RF) systems enables secure data communication within a network through techniques such as digital signatures and hash-based message authentication codes (HMAC), which suffer from key recovery attacks. State-of-the-art IoT networks such as Nest also use Open Authentication (OAuth 2.0) protocols that are vulnerable to cross-site request forgery (CSRF), which shows that these techniques may not prevent an adversary from copying or modeling the secret IDs or encryption keys using invasive, side channel, learning or software attacks. Physical unclonable functions (PUF), on the other hand, can exploit manufacturing process variations to uniquely identify silicon chips which makes a PUF-based system extremely robust and secure at low cost, as it is practically impossible to replicate the same silicon characteristics across dies. Taking inspiration from human communication, which utilizes inherent variations in the voice signatures to identify a certain speaker, we present RF-PUF: a deep neural network-based framework that allows real-time authentication of wireless nodes, using the effects of inherent process variation on RF properties of the wireless transmitters (Tx), detected through in-situ machine learning at the receiver (Rx) end. The proposed method utilizes the already-existing asymmetric RF communication framework and does not require any additional circuitry for PUF generation or feature extraction. Simulation results involving the process variations in a standard 65 nm technology node, and features such as LO offset and I-Q imbalance detected with a neural network having 50 neurons in the hidden layer indicate that the framework can distinguish up to 4800 transmitters with an accuracy of 99.9% (~ 99% for 10,000 transmitters) under varying channel conditions, and without the need for traditional preambles.
Tasks
Published	2018-05-03
URL	http://arxiv.org/abs/1805.01374v3
PDF	http://arxiv.org/pdf/1805.01374v3.pdf
PWC	https://paperswithcode.com/paper/rf-puf-enhancing-iot-security-through
Repo
Framework

Operations Guided Neural Networks for High Fidelity Data-To-Text Generation


Title	Operations Guided Neural Networks for High Fidelity Data-To-Text Generation
Authors	Feng Nie, Jinpeng Wang, Jin-Ge Yao, Rong Pan, Chin-Yew Lin
Abstract	Recent neural models for data-to-text generation are mostly based on data-driven end-to-end training over encoder-decoder networks. Even though the generated texts are mostly fluent and informative, they often generate descriptions that are not consistent with the input structured data. This is a critical issue especially in domains that require inference or calculations over raw data. In this paper, we attempt to improve the fidelity of neural data-to-text generation by utilizing pre-executed symbolic operations. We propose a framework called Operation-guided Attention-based sequence-to-sequence network (OpAtt), with a specifically designed gating mechanism as well as a quantization module for operation results to utilize information from pre-executed operations. Experiments on two sports datasets show our proposed method clearly improves the fidelity of the generated texts to the input structured data.
Tasks	Data-to-Text Generation, Quantization, Text Generation
Published	2018-09-08
URL	http://arxiv.org/abs/1809.02735v1
PDF	http://arxiv.org/pdf/1809.02735v1.pdf
PWC	https://paperswithcode.com/paper/operations-guided-neural-networks-for-high
Repo
Framework

Toward a Thinking Microscope: Deep Learning in Optical Microscopy and Image Reconstruction


Title	Toward a Thinking Microscope: Deep Learning in Optical Microscopy and Image Reconstruction
Authors	Yair Rivenson, Aydogan Ozcan
Abstract	We discuss recently emerging applications of state-of-the-art deep learning methods to optical microscopy and microscopic image reconstruction, which enable new transformations among different modes and modalities of microscopic imaging, driven entirely by image data. We believe that deep learning will fundamentally change both the hardware and image reconstruction methods used in optical microscopy in a holistic manner.
Tasks	Image Reconstruction
Published	2018-05-23
URL	http://arxiv.org/abs/1805.08970v1
PDF	http://arxiv.org/pdf/1805.08970v1.pdf
PWC	https://paperswithcode.com/paper/toward-a-thinking-microscope-deep-learning-in
Repo
Framework

RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data


Title	RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data
Authors	Yanbo Xu, Siddharth Biswal, Shriprasad R Deshpande, Kevin O Maher, Jimeng Sun
Abstract	With the improvement of medical data capturing, vast amounts of continuous patient monitoring data, e.g., electrocardiogram (ECG), real-time vital signs and medications, have become available for clinical decision support at intensive care units (ICUs). However, it becomes increasingly challenging to model such data, due to high density of the monitoring data, heterogeneous data types and the requirement for interpretable models. Integration of these high-density monitoring data with the discrete clinical events (including diagnosis, medications, labs) is challenging but potentially rewarding since richness and granularity in such multimodal data increase the possibilities for accurate detection of complex problems and predicting outcomes (e.g., length of stay and mortality). We propose Recurrent Attentive and Intensive Model (RAIM) for jointly analyzing continuous monitoring data and discrete clinical events. RAIM introduces an efficient attention mechanism for continuous monitoring data (e.g., ECG), which is guided by discrete clinical events (e.g., medication usage). We apply RAIM in predicting physiological decompensation and length of stay for critically ill patients in the ICU. With evaluations on MIMIC-III Waveform Database Matched Subset, we obtain an AUC-ROC score of 90.18% for predicting decompensation and an accuracy of 86.82% for forecasting length of stay with our final model, which outperforms our six baseline models.
Tasks
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08820v1
PDF	http://arxiv.org/pdf/1807.08820v1.pdf
PWC	https://paperswithcode.com/paper/raim-recurrent-attentive-and-intensive-model
Repo
Framework

Microscope 2.0: An Augmented Reality Microscope with Real-time Artificial Intelligence Integration


Title	Microscope 2.0: An Augmented Reality Microscope with Real-time Artificial Intelligence Integration
Authors	Po-Hsuan Cameron Chen, Krishna Gadepalli, Robert MacDonald, Yun Liu, Kunal Nagpal, Timo Kohlberger, Jeffrey Dean, Greg S. Corrado, Jason D. Hipp, Martin C. Stumpe
Abstract	The brightfield microscope is instrumental in the visual examination of both biological and physical samples at sub-millimeter scales. One key clinical application has been in cancer histopathology, where the microscopic assessment of the tissue samples is used for the diagnosis and staging of cancer and thus guides clinical therapy. However, the interpretation of these samples is inherently subjective, resulting in significant diagnostic variability. Moreover, in many regions of the world, access to pathologists is severely limited due to lack of trained personnel. In this regard, Artificial Intelligence (AI) based tools promise to improve the access and quality of healthcare. However, despite significant advances in AI research, integration of these tools into real-world cancer diagnosis workflows remains challenging because of the costs of image digitization and difficulties in deploying AI solutions. Here we propose a cost-effective solution to the integration of AI: the Augmented Reality Microscope (ARM). The ARM overlays AI-based information onto the current view of the sample through the optical pathway in real-time, enabling seamless integration of AI into the regular microscopy workflow. We demonstrate the utility of ARM in the detection of lymph node metastases in breast cancer and the identification of prostate cancer with a latency that supports real-time workflows. We anticipate that ARM will remove barriers towards the use of AI in microscopic analysis and thus improve the accuracy and efficiency of cancer diagnosis. This approach is applicable to other microscopy tasks and AI algorithms in the life sciences and beyond.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1812.00825v2
PDF	http://arxiv.org/pdf/1812.00825v2.pdf
PWC	https://paperswithcode.com/paper/microscope-20-an-augmented-reality-microscope
Repo
Framework

Convolutional Neural Network Architectures for Signals Supported on Graphs


Title	Convolutional Neural Network Architectures for Signals Supported on Graphs
Authors	Fernando Gama, Antonio G. Marques, Geert Leus, Alejandro Ribeiro
Abstract	Two architectures that generalize convolutional neural networks (CNNs) for the processing of signals supported on graphs are introduced. We start with the selection graph neural network (GNN), which replaces linear time invariant filters with linear shift invariant graph filters to generate convolutional features and reinterprets pooling as a possibly nonlinear subsampling stage where nearby nodes pool their information in a set of preselected sample nodes. A key component of the architecture is to remember the position of sampled nodes to permit computation of convolutional features at deeper layers. The second architecture, dubbed aggregation GNN, diffuses the signal through the graph and stores the sequence of diffused components observed by a designated node. This procedure effectively aggregates all components into a stream of information having temporal structure to which the convolution and pooling stages of regular CNNs can be applied. A multinode version of aggregation GNNs is further introduced for operation in large scale graphs. An important property of selection and aggregation GNNs is that they reduce to conventional CNNs when particularized to time signals reinterpreted as graph signals in a circulant graph. Comparative numerical analyses are performed in a source localization application over synthetic and real-world networks. Performance is also evaluated for an authorship attribution problem and text category classification. Multinode aggregation GNNs are consistently the best performing GNN architecture.
Tasks
Published	2018-05-01
URL	http://arxiv.org/abs/1805.00165v2
PDF	http://arxiv.org/pdf/1805.00165v2.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-network-architectures
Repo
Framework
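
The abstract above states that the graph architectures reduce to conventional CNNs when a time signal is reinterpreted as a graph signal on a circulant graph. The sketch below checks the filtering part of that claim on a directed cycle: a polynomial graph filter y = sum_k h_k S^k x, with S the cycle's shift matrix, gives the same output as a circular convolution with taps h. The function names, the taps and the signal are illustrative assumptions; pooling and nonlinearities are not modelled.

```python
def cycle_shift_matrix(n):
    """Shift operator of a directed cycle on n nodes: (S x)[i] = x[i - 1]."""
    return [[1 if j == (i - 1) % n else 0 for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def graph_filter(shift, taps, signal):
    """Linear shift-invariant graph filter: y = sum_k taps[k] * S^k x."""
    output = [0.0] * len(signal)
    shifted = list(signal)  # holds S^k x, starting with k = 0
    for tap in taps:
        output = [o + tap * s for o, s in zip(output, shifted)]
        shifted = matvec(shift, shifted)
    return output


def circular_convolution(taps, signal):
    """Ordinary circular convolution: y[i] = sum_k taps[k] * x[(i - k) mod n]."""
    n = len(signal)
    return [sum(tap * signal[(i - k) % n] for k, tap in enumerate(taps))
            for i in range(n)]


signal = [1.0, 2.0, 3.0, 4.0]
taps = [1.0, 0.5]
shift = cycle_shift_matrix(len(signal))

print(graph_filter(shift, taps, signal))   # [3.0, 2.5, 4.0, 5.5]
print(circular_convolution(taps, signal))  # [3.0, 2.5, 4.0, 5.5]
```

Replacing `shift` with the adjacency matrix of an arbitrary graph leaves `graph_filter` unchanged, which is the sense in which the graph filter generalizes the time-domain one.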

Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs


Title	Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs
Authors	Chuanhao Zhuge, Xinheng Liu, Xiaofan Zhang, Sudeep Gummadi, Jinjun Xiong, Deming Chen
Abstract	Deep Convolutional Neural Networks have become a Swiss Army knife in solving critical artificial intelligence tasks. However, deploying deep CNN models for latency-critical tasks remains challenging because of the complex nature of CNNs. Recently, the FPGA has become a favorable device to accelerate deep CNNs thanks to its high parallel processing capability and energy efficiency. In this work, we explore different fast convolution algorithms including Winograd and Fast Fourier Transform (FFT), and find an optimal strategy to apply them together on different types of convolutions. We also propose an optimization scheme to exploit parallelism on novel CNN architectures such as Inception modules in GoogLeNet. We implement a configurable IP-based face recognition acceleration system based on FaceNet using High-Level Synthesis. Our implementation on a Xilinx Ultrascale device achieves 3.75x latency speedup compared to a high-end NVIDIA GPU and surpasses previous FPGA results significantly.
Tasks	Face Recognition
Published	2018-03-23
URL	http://arxiv.org/abs/1803.09004v1
PDF	http://arxiv.org/pdf/1803.09004v1.pdf
PWC	https://paperswithcode.com/paper/face-recognition-with-hybrid-efficient
Repo
Framework

q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm


Title	q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm
Authors	Alishba Sadiq, Muhammad Usman, Shujaat Khan, Imran Naseem, Muhammad Moinuddin, Ubaid M. Al-Saggaf
Abstract	Channel estimation is an essential part of modern communication systems as it enhances the overall performance of the system. In the recent past, a variety of adaptive learning methods have been designed to enhance the robustness and convergence speed of the learning process. However, the need for an optimal technique is still there. Herein, for non-Gaussian noisy environments, we propose a new class of stochastic gradient algorithm for channel identification. The proposed $q$-least mean fourth ($q$-LMF) is an extension of the least mean fourth (LMF) algorithm and is based on the $q$-derivative of $q$-calculus, also known as the Jackson derivative. The proposed algorithm utilizes a novel concept of error-correlation energy and normalization of signal to ensure a high convergence rate, better stability and low steady-state error. Contrary to the conventional LMF, the proposed method has more freedom for large step-sizes. Extensive experiments show significant gain in the performance of the proposed $q$-LMF algorithm in comparison to the contemporary techniques.
Tasks
Published	2018-12-04
URL	http://arxiv.org/abs/1812.02588v2
PDF	http://arxiv.org/pdf/1812.02588v2.pdf
PWC	https://paperswithcode.com/paper/q-lmf-quantum-calculus-based-least-mean
Repo
Framework
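
The abstract above builds on the Jackson derivative from q-calculus. The sketch below implements only the standard definition, D_q f(x) = (f(qx) - f(x)) / ((q - 1) x) for x != 0 and q != 1, and applies it to a quartic cost of the kind that LMF minimizes. It is not the q-LMF update rule; the function name and the test values are assumptions made for illustration.

```python
def jackson_derivative(f, x, q):
    """Jackson q-derivative: (f(q*x) - f(x)) / ((q - 1) * x), for x != 0 and q != 1."""
    if x == 0 or q == 1:
        raise ValueError("the q-derivative needs x != 0 and q != 1")
    return (f(q * x) - f(x)) / ((q - 1) * x)


def quartic(e):
    """Fourth-power cost of an error value e."""
    return e ** 4


# For f(e) = e^4 the q-derivative equals (q^3 + q^2 + q + 1) * e^3.
print(jackson_derivative(quartic, 2.0, 2.0))  # 120.0

# As q approaches 1 the q-derivative approaches the ordinary derivative 4 * e^3 = 32.
print(jackson_derivative(quartic, 2.0, 1.0001))
```

With q > 1 the quotient is taken over a finite step from x to qx instead of an infinitesimal one, so for the quartic above it returns a larger magnitude than the ordinary derivative. This may be related to the step-size freedom the abstract mentions, though the abstract itself does not spell out the mechanism.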

Improving Abstraction in Text Summarization


Title	Improving Abstraction in Text Summarization
Authors	Wojciech Kryściński, Romain Paulus, Caiming Xiong, Richard Socher
Abstract	Abstractive text summarization aims to shorten long text documents into a human readable form that contains the most important facts from the original document. However, the level of actual abstraction as measured by novel phrases that do not appear in the source document remains low in existing approaches. We propose two techniques to improve the level of abstraction of generated summaries. First, we decompose the decoder into a contextual network that retrieves relevant parts of the source document, and a pretrained language model that incorporates prior knowledge about language generation. Second, we propose a novelty metric that is optimized directly through policy learning to encourage the generation of novel phrases. Our model achieves results comparable to state-of-the-art models, as determined by ROUGE scores and human evaluations, while achieving a significantly higher level of abstraction as measured by n-gram overlap with the source document.
Tasks	Abstractive Text Summarization, Language Modelling, Text Generation, Text Summarization
Published	2018-08-23
URL	http://arxiv.org/abs/1808.07913v1
PDF	http://arxiv.org/pdf/1808.07913v1.pdf
PWC	https://paperswithcode.com/paper/improving-abstraction-in-text-summarization
Repo
Framework
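
The abstract above measures abstraction through n-gram overlap with the source document. A minimal sketch of one such measurement follows: the fraction of a summary's n-grams that never occur in the source. The whitespace tokenization, the function names and the example sentences are assumptions for illustration, and this is not necessarily the exact novelty metric the authors optimize.

```python
def ngrams(tokens, n):
    """All contiguous n-token windows of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_fraction(source, summary, n):
    """Share of the summary's n-grams that do not appear anywhere in the source."""
    source_ngrams = set(ngrams(source.split(), n))
    summary_ngrams = ngrams(summary.split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens: nothing to count
    novel = sum(1 for gram in summary_ngrams if gram not in source_ngrams)
    return novel / len(summary_ngrams)


source = "the cat sat on the mat"
summary = "the cat rested on the mat"

print(novel_ngram_fraction(source, summary, 2))  # 0.4
```

A purely extractive summary scores 0.0 at every n, so higher values indicate more rephrasing. A high score alone does not guarantee a faithful summary, which fits the abstract's choice to report novelty alongside ROUGE and human evaluation.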