Paper Group ANR 709
Discriminative Cross-View Binary Representation Learning
Title | Discriminative Cross-View Binary Representation Learning |
Authors | Liu Liu, Hairong Qi |
Abstract | Learning compact representation is vital and challenging for large scale multimedia data. Cross-view/cross-modal hashing for effective binary representation learning has received significant attention with exponentially growing availability of multimedia content. Most existing cross-view hashing algorithms emphasize the similarities in individual views, which are then connected via cross-view similarities. In this work, we focus on the exploitation of the discriminative information from different views, and propose an end-to-end method to learn semantic-preserving and discriminative binary representation, dubbed Discriminative Cross-View Hashing (DCVH), in light of learning multitasking binary representation for various tasks including cross-view retrieval, image-to-image retrieval, and image annotation/tagging. The proposed DCVH has the following key components. First, it uses convolutional neural network (CNN) based nonlinear hashing functions and multilabel classification for both images and texts simultaneously. Such hashing functions achieve effective continuous relaxation during training without explicit quantization loss by using Direct Binary Embedding (DBE) layers. Second, we propose an effective view alignment via Hamming distance minimization, which is efficiently accomplished by bit-wise XOR operation. Extensive experiments on two image-text benchmark datasets demonstrate that DCVH outperforms state-of-the-art cross-view hashing algorithms as well as single-view image hashing algorithms. In addition, DCVH can provide competitive performance for image annotation/tagging. |
Tasks | Image Retrieval, Quantization, Representation Learning |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01233v1 |
http://arxiv.org/pdf/1804.01233v1.pdf | |
PWC | https://paperswithcode.com/paper/discriminative-cross-view-binary |
Repo | |
Framework | |
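The abstract notes that view alignment reduces to Hamming distance minimization, computed with a bit-wise XOR. The following is a minimal sketch of that distance computation only, assuming binary codes are packed into Python integers; the function names and the toy codes are hypothetical and are not taken from the paper.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bits that differ between two binary codes packed into ints."""
    # XOR sets a bit to 1 exactly where the two codes disagree.
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query: int, database: list[int]) -> list[int]:
    """Return database indices ordered from nearest to farthest code."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Toy 8-bit codes (illustrative values only).
image_code = 0b10110010
text_codes = [0b10110011, 0b01001101, 0b10110010]
ranking = rank_by_hamming(image_code, text_codes)
```

Because the distance is one XOR followed by a bit count, cross-view retrieval over compact codes stays cheap even for large databases.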
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Title | Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding |
Authors | Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, Dacheng Tao |
Abstract | Visual grounding aims to localize an object in an image referred to by a textual query phrase. Various visual grounding approaches have been proposed, and the problem can be modularized into a general framework: proposal generation, multi-modal feature representation, and proposal ranking. Of these three modules, most existing approaches focus on the latter two, with the importance of proposal generation generally neglected. In this paper, we rethink the problem of what properties make a good proposal generator. We introduce diversity and discrimination simultaneously when generating proposals, and in doing so propose the Diversified and Discriminative Proposal Networks (DDPN) model. Based on the proposals generated by DDPN, we propose a high performance baseline model for visual grounding and evaluate it on four benchmark datasets. Experimental results demonstrate that our model delivers significant improvements on all the tested datasets (e.g., 18.8% improvement on ReferItGame and 8.2% improvement on Flickr30k Entities over the existing state of the art, respectively). |
Tasks | |
Published | 2018-05-09 |
URL | http://arxiv.org/abs/1805.03508v1 |
http://arxiv.org/pdf/1805.03508v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-diversified-and-discriminative |
Repo | |
Framework | |
Interval-based Prediction Uncertainty Bound Computation in Learning with Missing Values
Title | Interval-based Prediction Uncertainty Bound Computation in Learning with Missing Values |
Authors | Hiroyuki Hanada, Toshiyuki Takada, Jun Sakuma, Ichiro Takeuchi |
Abstract | The problem of machine learning with missing values is common in many areas. A simple approach is to first construct a dataset without missing values simply by discarding instances with missing entries or by imputing a fixed value for each missing entry, and then train a prediction model with the new dataset. A drawback of this naive approach is that the uncertainty in the missing entries is not properly incorporated in the prediction. In order to evaluate prediction uncertainty, the multiple imputation (MI) approach has been studied, but the performance of MI is sensitive to the choice of the probabilistic model of the true values in the missing entries, and the computational cost of MI is high because multiple models must be trained. In this paper, we propose an alternative approach called the Interval-based Prediction Uncertainty Bounding (IPUB) method. The IPUB method represents the uncertainties due to missing entries as intervals, and efficiently computes the lower and upper bounds of the prediction results when all possible training sets constructed by imputing arbitrary values in the intervals are considered. The IPUB method can be applied to a wide class of convex learning algorithms including penalized least-squares regression, support vector machine (SVM), and logistic regression. We demonstrate the advantages of the IPUB method by comparing it with an existing method in numerical experiments on benchmark datasets. |
Tasks | Imputation |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00218v1 |
http://arxiv.org/pdf/1803.00218v1.pdf | |
PWC | https://paperswithcode.com/paper/interval-based-prediction-uncertainty-bound |
Repo | |
Framework | |
A Scalable Framework for Trajectory Prediction
Title | A Scalable Framework for Trajectory Prediction |
Authors | Punit Rathore, Dheeraj Kumar, Sutharshan Rajasegarar, Marimuthu Palaniswami, James C. Bezdek |
Abstract | Trajectory prediction (TP) is of great importance for a wide range of location-based applications in intelligent transport systems such as location-based advertising, route planning, traffic management, and early warning systems. In the last few years, the widespread use of GPS navigation systems and wireless communication technology enabled vehicles has resulted in huge volumes of trajectory data. The task of utilizing this data employing spatio-temporal techniques for trajectory prediction in an efficient and accurate manner is an ongoing research problem. Existing TP approaches are limited to short-term predictions. Moreover, they cannot handle a large volume of trajectory data for long-term prediction. To address these limitations, we propose a scalable clustering and Markov chain based hybrid framework, called Traj-clusiVAT-based TP, for both short-term and long-term trajectory prediction, which can handle a large number of overlapping trajectories in a dense road network. Traj-clusiVAT can also determine the number of clusters, which represent different movement behaviours in input trajectory data. In our experiments, we compare our proposed approach with a mixed Markov model (MMM)-based scheme, and a trajectory clustering, NETSCAN-based TP method for both short- and long-term trajectory predictions. We performed our experiments on two real, vehicle trajectory datasets, including a large-scale trajectory dataset consisting of 3.28 million trajectories obtained from 15,061 taxis in Singapore over a period of one month. Experimental results on two real trajectory datasets show that our proposed approach outperforms the existing approaches in terms of both short- and long-term prediction performances, based on prediction accuracy and distance error (in km). |
Tasks | Trajectory Prediction |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03582v3 |
http://arxiv.org/pdf/1806.03582v3.pdf | |
PWC | https://paperswithcode.com/paper/a-scalable-framework-for-trajectory |
Repo | |
Framework | |
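The framework above combines clustering with a Markov chain. The sketch below illustrates only the Markov chain half: a first-order model fitted on sequences of road-segment IDs and rolled forward greedily. The clustering stage (Traj-clusiVAT) is not reproduced here, and the segment labels, function names, and greedy decoding are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count first-order transitions between consecutive road segments."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for current, nxt in zip(traj, traj[1:]):
            counts[current][nxt] += 1
    return counts


def predict_path(counts, start, steps):
    """Follow the most frequent transition for up to `steps` steps."""
    path, current = [], start
    for _ in range(steps):
        if current not in counts:
            break  # no outgoing transition was observed for this segment
        current = counts[current].most_common(1)[0][0]
        path.append(current)
    return path


# Hypothetical trajectories over segment IDs "A".."E".
history = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E", "B", "C"]]
model = fit_transitions(history)
short_term = predict_path(model, "A", 1)
long_term = predict_path(model, "A", 5)
```

Fitting one such chain per cluster of similar trajectories, rather than one chain for all data, is one plausible reading of how a clustering stage and a Markov stage could be combined.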
PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data
Title | PCA-Based Missing Information Imputation for Real-Time Crash Likelihood Prediction Under Imbalanced Data |
Authors | Jintao Ke, Shuaichao Zhang, Hai Yang, Xiqun Chen |
Abstract | Real-time crash likelihood prediction has been an important research topic. Various classifiers, such as support vector machine (SVM) and tree-based boosting algorithms, have been proposed in traffic safety studies. However, few studies focus on missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to breakdown of sensors or external interference. Besides, classifying imbalanced data is also a difficult problem in real-time crash likelihood prediction, since it is hard to distinguish crash-prone cases from non-crash cases, which compose the majority of the observed samples. In this paper, principal component analysis (PCA) based approaches, including LS-PCA, PPCA, and VBPCA, are employed for imputing missing values, while two kinds of solutions are developed to address the problem of imbalanced data. The results show that PPCA and VBPCA not only outperform LS-PCA and other imputation methods (including mean imputation and k-means clustering imputation) in terms of the root mean square error (RMSE), but also help the classifiers achieve better predictive performance. The two solutions, i.e., cost-sensitive learning and synthetic minority oversampling technique (SMOTE), help improve the sensitivity by adjusting the classifiers to pay more attention to the minority class. |
Tasks | Imputation |
Published | 2018-02-11 |
URL | http://arxiv.org/abs/1802.03699v1 |
http://arxiv.org/pdf/1802.03699v1.pdf | |
PWC | https://paperswithcode.com/paper/pca-based-missing-information-imputation-for |
Repo | |
Framework | |
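SMOTE, one of the two imbalance remedies named above, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The sketch below shows that single interpolation step in plain Python; the function name, the toy feature vectors, and the choice of `k` are illustrative assumptions, and it needs at least two minority points.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point between a minority sample and a near neighbour."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # k nearest minority neighbours of `base` by squared Euclidean distance.
    neighbours = sorted(
        (p for p in minority if p is not base),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
    )[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(base, neighbour))


# Hypothetical crash-prone samples with two features each.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 1.8)]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Every synthetic point lies on a segment between two real minority samples, so oversampling does not move the minority class outside its observed feature range.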
Learning Monocular 3D Human Pose Estimation from Multi-view Images
Title | Learning Monocular 3D Human Pose Estimation from Multi-view Images |
Authors | Helge Rhodin, Jörg Spörri, Isinsu Katircioglu, Victor Constantin, Frédéric Meyer, Erich Müller, Mathieu Salzmann, Pascal Fua |
Abstract | Accurate 3D human pose estimation from single images is possible with sophisticated deep-net architectures that have been trained on very large datasets. However, this still leaves open the problem of capturing motions for which no such database exists. Manual annotation is tedious, slow, and error-prone. In this paper, we propose to replace most of the annotations by the use of multiple views, at training time only. Specifically, we train the system to predict the same pose in all views. Such a consistency constraint is necessary but not sufficient to predict accurate poses. We therefore complement it with a supervised loss aiming to predict the correct pose in a small set of labeled images, and with a regularization term that penalizes drift from initial predictions. Furthermore, we propose a method to estimate camera pose jointly with human pose, which lets us utilize multi-view footage where calibration is difficult, e.g., for pan-tilt or moving handheld cameras. We demonstrate the effectiveness of our approach on established benchmarks, as well as on a new Ski dataset with rotating cameras and expert ski motion, for which annotations are truly hard to obtain. |
Tasks | 3D Human Pose Estimation, Calibration, Pose Estimation |
Published | 2018-03-13 |
URL | http://arxiv.org/abs/1803.04775v2 |
http://arxiv.org/pdf/1803.04775v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-monocular-3d-human-pose-estimation |
Repo | |
Framework | |
RF-PUF: Enhancing IoT Security through Authentication of Wireless Nodes using In-situ Machine Learning
Title | RF-PUF: Enhancing IoT Security through Authentication of Wireless Nodes using In-situ Machine Learning |
Authors | Baibhab Chatterjee, Debayan Das, Shovan Maity, Shreyas Sen |
Abstract | Traditional authentication in radio-frequency (RF) systems enables secure data communication within a network through techniques such as digital signatures and hash-based message authentication codes (HMAC), which suffer from key recovery attacks. State-of-the-art IoT networks such as Nest also use Open Authentication (OAuth 2.0) protocols that are vulnerable to cross-site request forgery (CSRF), which shows that these techniques may not prevent an adversary from copying or modeling the secret IDs or encryption keys using invasive, side channel, learning or software attacks. Physical unclonable functions (PUF), on the other hand, can exploit manufacturing process variations to uniquely identify silicon chips, which makes a PUF-based system extremely robust and secure at low cost, as it is practically impossible to replicate the same silicon characteristics across dies. Taking inspiration from human communication, which utilizes inherent variations in the voice signatures to identify a certain speaker, we present RF-PUF: a deep neural network-based framework that allows real-time authentication of wireless nodes, using the effects of inherent process variation on RF properties of the wireless transmitters (Tx), detected through in-situ machine learning at the receiver (Rx) end. The proposed method utilizes the already-existing asymmetric RF communication framework and does not require any additional circuitry for PUF generation or feature extraction. Simulation results involving the process variations in a standard 65 nm technology node, and features such as LO offset and I-Q imbalance detected with a neural network having 50 neurons in the hidden layer, indicate that the framework can distinguish up to 4800 transmitters with an accuracy of 99.9% (~99% for 10,000 transmitters) under varying channel conditions, and without the need for traditional preambles. |
Tasks | |
Published | 2018-05-03 |
URL | http://arxiv.org/abs/1805.01374v3 |
http://arxiv.org/pdf/1805.01374v3.pdf | |
PWC | https://paperswithcode.com/paper/rf-puf-enhancing-iot-security-through |
Repo | |
Framework | |
Operations Guided Neural Networks for High Fidelity Data-To-Text Generation
Title | Operations Guided Neural Networks for High Fidelity Data-To-Text Generation |
Authors | Feng Nie, Jinpeng Wang, Jin-Ge Yao, Rong Pan, Chin-Yew Lin |
Abstract | Recent neural models for data-to-text generation are mostly based on data-driven end-to-end training over encoder-decoder networks. Even though the generated texts are mostly fluent and informative, they often generate descriptions that are not consistent with the input structured data. This is a critical issue especially in domains that require inference or calculations over raw data. In this paper, we attempt to improve the fidelity of neural data-to-text generation by utilizing pre-executed symbolic operations. We propose a framework called Operation-guided Attention-based sequence-to-sequence network (OpAtt), with a specifically designed gating mechanism as well as a quantization module for operation results to utilize information from pre-executed operations. Experiments on two sports datasets show our proposed method clearly improves the fidelity of the generated texts to the input structured data. |
Tasks | Data-to-Text Generation, Quantization, Text Generation |
Published | 2018-09-08 |
URL | http://arxiv.org/abs/1809.02735v1 |
http://arxiv.org/pdf/1809.02735v1.pdf | |
PWC | https://paperswithcode.com/paper/operations-guided-neural-networks-for-high |
Repo | |
Framework | |
Toward a Thinking Microscope: Deep Learning in Optical Microscopy and Image Reconstruction
Title | Toward a Thinking Microscope: Deep Learning in Optical Microscopy and Image Reconstruction |
Authors | Yair Rivenson, Aydogan Ozcan |
Abstract | We discuss recently emerging applications of state-of-the-art deep learning methods to optical microscopy and microscopic image reconstruction, which enable new transformations among different modes and modalities of microscopic imaging, driven entirely by image data. We believe that deep learning will fundamentally change both the hardware and image reconstruction methods used in optical microscopy in a holistic manner. |
Tasks | Image Reconstruction |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08970v1 |
http://arxiv.org/pdf/1805.08970v1.pdf | |
PWC | https://paperswithcode.com/paper/toward-a-thinking-microscope-deep-learning-in |
Repo | |
Framework | |
RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data
Title | RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data |
Authors | Yanbo Xu, Siddharth Biswal, Shriprasad R Deshpande, Kevin O Maher, Jimeng Sun |
Abstract | With the improvement of medical data capturing, vast amounts of continuous patient monitoring data, e.g., electrocardiogram (ECG), real-time vital signs and medications, become available for clinical decision support at intensive care units (ICUs). However, it becomes increasingly challenging to model such data, due to the high density of the monitoring data, heterogeneous data types and the requirement for interpretable models. Integration of these high-density monitoring data with the discrete clinical events (including diagnosis, medications, labs) is challenging but potentially rewarding, since richness and granularity in such multimodal data increase the possibilities for accurate detection of complex problems and predicting outcomes (e.g., length of stay and mortality). We propose the Recurrent Attentive and Intensive Model (RAIM) for jointly analyzing continuous monitoring data and discrete clinical events. RAIM introduces an efficient attention mechanism for continuous monitoring data (e.g., ECG), which is guided by discrete clinical events (e.g., medication usage). We apply RAIM to predicting physiological decompensation and length of stay in critically ill patients in the ICU. With evaluations on the MIMIC-III Waveform Database Matched Subset, we obtain an AUC-ROC score of 90.18% for predicting decompensation and an accuracy of 86.82% for forecasting length of stay with our final model, which outperforms our six baseline models. |
Tasks | |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.08820v1 |
http://arxiv.org/pdf/1807.08820v1.pdf | |
PWC | https://paperswithcode.com/paper/raim-recurrent-attentive-and-intensive-model |
Repo | |
Framework | |
Microscope 2.0: An Augmented Reality Microscope with Real-time Artificial Intelligence Integration
Title | Microscope 2.0: An Augmented Reality Microscope with Real-time Artificial Intelligence Integration |
Authors | Po-Hsuan Cameron Chen, Krishna Gadepalli, Robert MacDonald, Yun Liu, Kunal Nagpal, Timo Kohlberger, Jeffrey Dean, Greg S. Corrado, Jason D. Hipp, Martin C. Stumpe |
Abstract | The brightfield microscope is instrumental in the visual examination of both biological and physical samples at sub-millimeter scales. One key clinical application has been in cancer histopathology, where the microscopic assessment of the tissue samples is used for the diagnosis and staging of cancer and thus guides clinical therapy. However, the interpretation of these samples is inherently subjective, resulting in significant diagnostic variability. Moreover, in many regions of the world, access to pathologists is severely limited due to lack of trained personnel. In this regard, Artificial Intelligence (AI) based tools promise to improve the access and quality of healthcare. However, despite significant advances in AI research, integration of these tools into real-world cancer diagnosis workflows remains challenging because of the costs of image digitization and difficulties in deploying AI solutions. Here we propose a cost-effective solution to the integration of AI: the Augmented Reality Microscope (ARM). The ARM overlays AI-based information onto the current view of the sample through the optical pathway in real-time, enabling seamless integration of AI into the regular microscopy workflow. We demonstrate the utility of ARM in the detection of lymph node metastases in breast cancer and the identification of prostate cancer with a latency that supports real-time workflows. We anticipate that ARM will remove barriers towards the use of AI in microscopic analysis and thus improve the accuracy and efficiency of cancer diagnosis. This approach is applicable to other microscopy tasks and AI algorithms in the life sciences and beyond. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1812.00825v2 |
http://arxiv.org/pdf/1812.00825v2.pdf | |
PWC | https://paperswithcode.com/paper/microscope-20-an-augmented-reality-microscope |
Repo | |
Framework | |
Convolutional Neural Network Architectures for Signals Supported on Graphs
Title | Convolutional Neural Network Architectures for Signals Supported on Graphs |
Authors | Fernando Gama, Antonio G. Marques, Geert Leus, Alejandro Ribeiro |
Abstract | Two architectures that generalize convolutional neural networks (CNNs) for the processing of signals supported on graphs are introduced. We start with the selection graph neural network (GNN), which replaces linear time invariant filters with linear shift invariant graph filters to generate convolutional features and reinterprets pooling as a possibly nonlinear subsampling stage where nearby nodes pool their information in a set of preselected sample nodes. A key component of the architecture is to remember the position of sampled nodes to permit computation of convolutional features at deeper layers. The second architecture, dubbed aggregation GNN, diffuses the signal through the graph and stores the sequence of diffused components observed by a designated node. This procedure effectively aggregates all components into a stream of information having temporal structure to which the convolution and pooling stages of regular CNNs can be applied. A multinode version of aggregation GNNs is further introduced for operation in large scale graphs. An important property of selection and aggregation GNNs is that they reduce to conventional CNNs when particularized to time signals reinterpreted as graph signals in a circulant graph. Comparative numerical analyses are performed in a source localization application over synthetic and real-world networks. Performance is also evaluated for an authorship attribution problem and text category classification. Multinode aggregation GNNs are consistently the best performing GNN architecture. |
Tasks | |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00165v2 |
http://arxiv.org/pdf/1805.00165v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-architectures |
Repo | |
Framework | |
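The linear shift-invariant graph filter mentioned in the abstract is commonly written as a polynomial in a graph shift operator S, i.e. y = sum_k h_k S^k x. The sketch below applies such a filter in plain Python and uses a directed cycle as S, the circulant case in which the abstract says the architectures reduce to conventional CNNs. The variable names, filter taps, and the choice of adjacency matrix as shift operator are assumptions for illustration.

```python
def mat_vec(S, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(s_ij * x_j for s_ij, x_j in zip(row, x)) for row in S]


def graph_filter(S, x, taps):
    """Compute y = sum_k taps[k] * S^k x without forming powers of S."""
    y = [0.0] * len(x)
    shifted = list(x)  # holds S^k x, starting at k = 0
    for h in taps:
        y = [y_i + h * s_i for y_i, s_i in zip(y, shifted)]
        shifted = mat_vec(S, shifted)
    return y


# Directed cycle on 4 nodes: node i receives the value of node i - 1.
n = 4
S = [[1.0 if j == (i - 1) % n else 0.0 for j in range(n)] for i in range(n)]
x = [1.0, 2.0, 3.0, 4.0]
taps = [0.5, 0.25]
y = graph_filter(S, x, taps)
```

On this cycle graph each output is 0.5 * x[i] + 0.25 * x[i - 1] with wrap-around, which is an ordinary circular convolution; on a general graph the same code mixes each node with its neighbours instead.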
Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs
Title | Face Recognition with Hybrid Efficient Convolution Algorithms on FPGAs |
Authors | Chuanhao Zhuge, Xinheng Liu, Xiaofan Zhang, Sudeep Gummadi, Jinjun Xiong, Deming Chen |
Abstract | Deep Convolutional Neural Networks have become a Swiss Army knife for solving critical artificial intelligence tasks. However, deploying deep CNN models for latency-critical tasks remains challenging because of the complex nature of CNNs. Recently, the FPGA has become a favorable device to accelerate deep CNNs thanks to its high parallel processing capability and energy efficiency. In this work, we explore different fast convolution algorithms including Winograd and Fast Fourier Transform (FFT), and find an optimal strategy to apply them together on different types of convolutions. We also propose an optimization scheme to exploit parallelism on novel CNN architectures such as Inception modules in GoogLeNet. We implement a configurable IP-based face recognition acceleration system based on FaceNet using High-Level Synthesis. Our implementation on a Xilinx Ultrascale device achieves 3.75x latency speedup compared to a high-end NVIDIA GPU and surpasses previous FPGA results significantly. |
Tasks | Face Recognition |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.09004v1 |
http://arxiv.org/pdf/1803.09004v1.pdf | |
PWC | https://paperswithcode.com/paper/face-recognition-with-hybrid-efficient |
Repo | |
Framework | |
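Winograd convolution, one of the fast algorithms named above, trades multiplications for additions. The sketch below shows the standard one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter with four multiplications instead of six. It is a software illustration only and says nothing about the paper's FPGA design; the function names and test values are hypothetical.

```python
def direct_conv_1d(d, g):
    """Reference: two outputs of a 3-tap filter slid over four inputs."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using four multiplications."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # The filter-side sums can be precomputed once per filter.
    m1 = (d0 - d2) * g0
    m2 = (d1 + d2) * (g0 + g1 + g2) / 2
    m3 = (d2 - d1) * (g0 - g1 + g2) / 2
    m4 = (d1 - d3) * g2
    return [m1 + m2 + m3, m2 - m3 - m4]


inputs = [1, 2, 3, 4]
kernel = [2, 1, 3]
fast = winograd_f23(inputs, kernel)
slow = direct_conv_1d(inputs, kernel)
```

Both functions return the same pair of outputs; the saving comes from sharing the products `m2` and `m3` between them.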
q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm
Title | q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm |
Authors | Alishba Sadiq, Muhammad Usman, Shujaat Khan, Imran Naseem, Muhammad Moinuddin, Ubaid M. Al-Saggaf |
Abstract | Channel estimation is an essential part of modern communication systems as it enhances the overall performance of the system. In the recent past, a variety of adaptive learning methods have been designed to enhance the robustness and convergence speed of the learning process. However, the need for an optimal technique is still there. Herein, for non-Gaussian noisy environments, we propose a new class of stochastic gradient algorithms for channel identification. The proposed $q$-least mean fourth ($q$-LMF) is an extension of the least mean fourth (LMF) algorithm and is based on $q$-calculus, whose derivative is also known as the Jackson derivative. The proposed algorithm utilizes a novel concept of error-correlation energy and normalization of the signal to ensure a high convergence rate, better stability and low steady-state error. Contrary to the conventional LMF, the proposed method has more freedom for large step-sizes. Extensive experiments show significant gain in the performance of the proposed $q$-LMF algorithm in comparison to the contemporary techniques. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.02588v2 |
http://arxiv.org/pdf/1812.02588v2.pdf | |
PWC | https://paperswithcode.com/paper/q-lmf-quantum-calculus-based-least-mean |
Repo | |
Framework | |
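The Jackson derivative that the abstract refers to is defined as D_q f(x) = (f(qx) - f(x)) / ((q - 1) x) for q != 1 and x != 0, and it tends to the ordinary derivative as q approaches 1. The sketch below evaluates it on the fourth-power error cost that LMF minimizes. It does not implement the q-LMF weight update itself, whose details are not given in the abstract; the function name and test values are assumptions.

```python
def jackson_derivative(f, x, q):
    """Jackson q-derivative: (f(q*x) - f(x)) / ((q - 1) * x)."""
    if q == 1 or x == 0:
        raise ValueError("the q-derivative needs q != 1 and x != 0")
    return (f(q * x) - f(x)) / ((q - 1) * x)


def fourth_power(e):
    """Instantaneous LMF-style cost: the error raised to the fourth power."""
    return e ** 4


# For f(e) = e^4 the q-derivative equals e^3 * (1 + q + q^2 + q^3).
steep = jackson_derivative(fourth_power, 2.0, 2.0)
near_classical = jackson_derivative(fourth_power, 2.0, 1.0001)
```

With q above 1 the q-derivative of this cost exceeds the classical gradient 4e^3, which gives some intuition for why a q-based update can take larger effective steps.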
Improving Abstraction in Text Summarization
Title | Improving Abstraction in Text Summarization |
Authors | Wojciech Kryściński, Romain Paulus, Caiming Xiong, Richard Socher |
Abstract | Abstractive text summarization aims to shorten long text documents into a human readable form that contains the most important facts from the original document. However, the level of actual abstraction as measured by novel phrases that do not appear in the source document remains low in existing approaches. We propose two techniques to improve the level of abstraction of generated summaries. First, we decompose the decoder into a contextual network that retrieves relevant parts of the source document, and a pretrained language model that incorporates prior knowledge about language generation. Second, we propose a novelty metric that is optimized directly through policy learning to encourage the generation of novel phrases. Our model achieves results comparable to state-of-the-art models, as determined by ROUGE scores and human evaluations, while achieving a significantly higher level of abstraction as measured by n-gram overlap with the source document. |
Tasks | Abstractive Text Summarization, Language Modelling, Text Generation, Text Summarization |
Published | 2018-08-23 |
URL | http://arxiv.org/abs/1808.07913v1 |
http://arxiv.org/pdf/1808.07913v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-abstraction-in-text-summarization |
Repo | |
Framework | |
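Abstraction is measured above through n-gram overlap with the source document. A simple way to illustrate the idea is the fraction of unique summary n-grams that never occur in the source. The sketch below computes that fraction with whitespace tokenization; it is not the paper's exact novelty metric or reward, and the function names and example sentences are hypothetical.

```python
def ngrams(tokens, n):
    """Return the set of unique n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def novel_ngram_fraction(summary, source, n=2):
    """Share of unique summary n-grams that do not appear in the source."""
    summary_grams = ngrams(summary.split(), n)
    if not summary_grams:
        return 0.0
    source_grams = ngrams(source.split(), n)
    return len(summary_grams - source_grams) / len(summary_grams)


source_text = "the cat sat on the mat"
summary_text = "the cat lay on the mat"
novelty = novel_ngram_fraction(summary_text, source_text, n=2)
```

A purely extractive summary scores 0.0 under this measure, while a summary that shares no n-grams with the source scores 1.0.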