Paper Group ANR 487
Design of Image Matched Non-Separable Wavelet using Convolutional Neural Network. Learning Fully Convolutional Networks for Iterative Non-blind Deconvolution. Black-box optimization with a politician. Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs. Unified Statistical Theory of Spectral Graph Analysis. …
Design of Image Matched Non-Separable Wavelet using Convolutional Neural Network
Title | Design of Image Matched Non-Separable Wavelet using Convolutional Neural Network |
Authors | Naushad Ansari, Anubha Gupta, Rahul Duggal |
Abstract | Image-matched nonseparable wavelets can find potential use in many applications including image classification, segmentation, compressive sensing, etc. This paper proposes a novel design methodology that utilizes a convolutional neural network (CNN) to design a two-channel non-separable wavelet matched to a given image. The design is proposed on the quincunx lattice. The loss function of the convolutional neural network is set up as the total squared error between the given input image to the CNN and the reconstructed image at the output of the CNN, leading to perfect reconstruction at the end of training. Simulation results are shown on some standard images. |
Tasks | Compressive Sensing, Image Classification |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.04966v1 |
http://arxiv.org/pdf/1612.04966v1.pdf | |
PWC | https://paperswithcode.com/paper/design-of-image-matched-non-separable-wavelet |
Repo | |
Framework | |
Learning Fully Convolutional Networks for Iterative Non-blind Deconvolution
Title | Learning Fully Convolutional Networks for Iterative Non-blind Deconvolution |
Authors | Jiawei Zhang, Jinshan Pan, Wei-Sheng Lai, Rynson Lau, Ming-Hsuan Yang |
Abstract | In this paper, we propose a fully convolutional network for iterative non-blind deconvolution. We decompose the non-blind deconvolution problem into image denoising and image deconvolution. We train an FCNN to remove noise in the gradient domain and use the learned gradients to guide the image deconvolution step. In contrast to the existing deep neural network based methods, we iteratively deconvolve the blurred images in a multi-stage framework. The proposed method is able to learn an adaptive image prior, which keeps both local (details) and global (structures) information. Both quantitative and qualitative evaluations on benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art algorithms in terms of quality and speed. |
Tasks | Denoising, Image Deconvolution, Image Denoising |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06495v1 |
http://arxiv.org/pdf/1611.06495v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-fully-convolutional-networks-for |
Repo | |
Framework | |
Black-box optimization with a politician
Title | Black-box optimization with a politician |
Authors | Sébastien Bubeck, Yin-Tat Lee |
Abstract | We propose a new framework for black-box convex optimization which is well-suited for situations where gradient computations are expensive. We derive a new method for this framework which leverages several concepts from convex optimization, from standard first-order methods (e.g. gradient descent or quasi-Newton methods) to analytical centers (i.e. minimizers of self-concordant barriers). We demonstrate empirically that our new technique compares favorably with state-of-the-art algorithms (such as BFGS). |
Tasks | |
Published | 2016-02-15 |
URL | http://arxiv.org/abs/1602.04847v1 |
http://arxiv.org/pdf/1602.04847v1.pdf | |
PWC | https://paperswithcode.com/paper/black-box-optimization-with-a-politician |
Repo | |
Framework | |
Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs
Title | Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs |
Authors | Xinyu Wu, Vishal Saxena |
Abstract | Large-scale integration of emerging nanoscale non-volatile memory devices, e.g. resistive random-access memory (RRAM), can enable a new generation of neuromorphic computers that can solve a wide range of machine learning problems. Such hybrid CMOS-RRAM neuromorphic architectures will result in several orders of magnitude reduction in energy consumption at a very small form factor, and herald autonomous learning machines capable of self-adapting to their environment. However, progress in this area has been impeded by the realization that the actual memory devices fall well short of their expected behavior. In this work, we discuss the challenges associated with these memory devices and their use in neuromorphic computing circuits, and propose pathways to overcome these limitations by introducing ‘dendritic learning’. |
Tasks | |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01491v2 |
http://arxiv.org/pdf/1612.01491v2.pdf | |
PWC | https://paperswithcode.com/paper/enabling-bio-plausible-multi-level-stdp-using |
Repo | |
Framework | |
Unified Statistical Theory of Spectral Graph Analysis
Title | Unified Statistical Theory of Spectral Graph Analysis |
Authors | Subhadeep Mukhopadhyay |
Abstract | The goal of this paper is to show that there exists a simple, yet universal statistical logic of spectral graph analysis by recasting it into a nonparametric function estimation problem. The prescribed viewpoint appears to be good enough to accommodate most of the existing spectral graph techniques as a consequence of just one single formalism and algorithm. |
Tasks | |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03861v4 |
http://arxiv.org/pdf/1602.03861v4.pdf | |
PWC | https://paperswithcode.com/paper/unified-statistical-theory-of-spectral-graph |
Repo | |
Framework | |
Recognizing Car Fluents from Video
Title | Recognizing Car Fluents from Video |
Authors | Bo Li, Tianfu Wu, Caiming Xiong, Song-Chun Zhu |
Abstract | Physical fluents, a term originally used by Newton [40], refer to time-varying object states in dynamic scenes. In this paper, we are interested in inferring the fluents of vehicles from video. For example, a door (hood, trunk) is opened or closed through various actions, or a light is blinking to signal a turn. Recognizing these fluents has broad applications, yet has received scant attention in the computer vision literature. Car fluent recognition entails a unified framework for car detection, car part localization and part status recognition, which is made difficult by large structural and appearance variations, low resolutions and occlusions. This paper learns a spatial-temporal And-Or hierarchical model to represent car fluents. The learning of this model is formulated under the latent structural SVM framework. Since no related dataset is publicly available, we collect and annotate a car fluent dataset consisting of car videos with diverse fluents. In experiments, the proposed method outperforms several highly related baseline methods in terms of car fluent recognition and car part localization. |
Tasks | |
Published | 2016-03-26 |
URL | http://arxiv.org/abs/1603.08067v1 |
http://arxiv.org/pdf/1603.08067v1.pdf | |
PWC | https://paperswithcode.com/paper/recognizing-car-fluents-from-video |
Repo | |
Framework | |
Animation and Chirplet-Based Development of a PIR Sensor Array for Intruder Classification in an Outdoor Environment
Title | Animation and Chirplet-Based Development of a PIR Sensor Array for Intruder Classification in an Outdoor Environment |
Authors | Raviteja Upadrashta, Tarun Choubisa, A. Praneeth, Tony G., Aswath V. S., P. Vijay Kumar, Sripad Kowshik, Hari Prasad Gokul R, T. V. Prabhakar |
Abstract | This paper presents the development of a passive infra-red sensor tower platform along with a classification algorithm to distinguish between human intrusion, animal intrusion and clutter arising from wind-blown vegetative movement in an outdoor environment. The research was aimed at exploring the potential use of wireless sensor networks as an early-warning system to help mitigate human-wildlife conflicts occurring at the edge of a forest. There are three important features to the development. Firstly, the sensor platform employs multiple sensors arranged in the form of a two-dimensional array to give it a key spatial-resolution capability that aids in classification. Secondly, given the challenges of collecting data involving animal intrusion, an Animation-based Simulation tool for Passive Infra-Red sEnsor (ASPIRE) was developed that simulates signals corresponding to human and animal intrusion and some limited models of vegetative clutter. This sped up the process of algorithm development by allowing us to test different hypotheses in a time-efficient manner. Finally, a chirplet-based model for the intruder signal was developed that significantly helped boost classification accuracy despite drawing data from a smaller number of sensors. An SVM-based classifier was used which made use of chirplet, energy and signal cross-correlation-based features. The average accuracy obtained for intruder detection and classification on real-world and simulated data sets was in excess of 97%. |
Tasks | |
Published | 2016-04-13 |
URL | http://arxiv.org/abs/1604.03829v1 |
http://arxiv.org/pdf/1604.03829v1.pdf | |
PWC | https://paperswithcode.com/paper/animation-and-chirplet-based-development-of-a |
Repo | |
Framework | |
Real-Time RGB-D based Template Matching Pedestrian Detection
Title | Real-Time RGB-D based Template Matching Pedestrian Detection |
Authors | Omid Hosseini Jafari, Michael Ying Yang |
Abstract | Pedestrian detection is one of the most popular topics in computer vision and robotics. Considering challenging issues in multiple pedestrian detection, we present a real-time depth-based template matching people detector. In this paper, we propose different approaches for training the depth-based template. We train multiple templates to handle issues arising from the various upper-body orientations of pedestrians and the different levels of detail in the depth map of pedestrians at various distances from the camera. We also take into account the degree of reliability of different regions of the sliding window by proposing the weighted template approach. Furthermore, we combine the depth detector with an appearance-based detector as a verifier to take advantage of appearance cues in dealing with the limitations of depth data. We evaluate our method on the challenging ETH dataset sequence. We show that our method outperforms the state-of-the-art approaches. |
Tasks | Pedestrian Detection |
Published | 2016-10-03 |
URL | http://arxiv.org/abs/1610.00748v1 |
http://arxiv.org/pdf/1610.00748v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-rgb-d-based-template-matching |
Repo | |
Framework | |
Multi-modal image retrieval with random walk on multi-layer graphs
Title | Multi-modal image retrieval with random walk on multi-layer graphs |
Authors | Renata Khasanova, Xiaowen Dong, Pascal Frossard |
Abstract | The analysis of large collections of image data is still a challenging problem due to the difficulty of capturing the true concepts in visual data. The similarity between images could be computed using different and possibly multimodal features such as color or edge information or even text labels. This motivates the design of image analysis solutions that are able to effectively integrate the multi-view information provided by different feature sets. We therefore propose a new image retrieval solution that is able to sort images through a random walk on a multi-layer graph, where each layer corresponds to a different type of information about the image data. We study in depth the design of the image graph and propose in particular an effective method to select the edge weights for the multi-layer graph, such that the image ranking scores are optimised. We then provide extensive experiments in different real-world photo collections, which confirm the high performance of our new image retrieval algorithm that generally surpasses state-of-the-art solutions due to a more meaningful image similarity computation. |
Tasks | Image Retrieval |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03406v1 |
http://arxiv.org/pdf/1607.03406v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-modal-image-retrieval-with-random-walk |
Repo | |
Framework | |
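The retrieval idea in the abstract above (rank images by a random walk over a graph whose layers encode different feature types) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the layers are merged by a fixed convex combination, the walk is a random walk with restart, and the function name `random_walk_scores`, the toy similarity matrices, and the `restart` value are hypothetical. The paper's method for selecting edge weights is not reproduced.

```python
import numpy as np

def random_walk_scores(layers, weights, query, restart=0.15, iters=100):
    """Score nodes by a random walk with restart on a weighted sum of layers.

    layers  : list of (n, n) non-negative similarity matrices, one per feature type
    weights : one non-negative weight per layer (assumed given, not learned here)
    query   : index of the query image; the walk restarts at this node
    """
    # Merge the layers into a single weighted adjacency matrix.
    A = sum(w * L for w, L in zip(weights, layers))
    # Row-normalize to get transition probabilities (assumes no all-zero rows).
    P = A / A.sum(axis=1, keepdims=True)
    e = np.zeros(A.shape[0])
    e[query] = 1.0
    p = e.copy()
    for _ in range(iters):
        # With probability `restart` jump back to the query, otherwise follow an edge.
        p = restart * e + (1.0 - restart) * (P.T @ p)
    return p

# Toy example: images 0 and 1 are similar in both layers, as are images 2 and 3.
color = np.array([[0.0, 1.0, 0.2, 0.1],
                  [1.0, 0.0, 0.2, 0.1],
                  [0.2, 0.2, 0.0, 1.0],
                  [0.1, 0.1, 1.0, 0.0]])
text = np.array([[0.0, 0.8, 0.1, 0.1],
                 [0.8, 0.0, 0.1, 0.1],
                 [0.1, 0.1, 0.0, 0.9],
                 [0.1, 0.1, 0.9, 0.0]])

scores = random_walk_scores([color, text], [0.5, 0.5], query=0)
ranking = np.argsort(-scores)  # indices sorted from highest to lowest score
```

In this toy graph, image 1 should score above images 2 and 3 for query 0, since it is strongly linked to the query in both layers.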
2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search
Title | 2-Bit Random Projections, NonLinear Estimators, and Approximate Near Neighbor Search |
Authors | Ping Li, Michael Mitzenmacher, Anshumali Shrivastava |
Abstract | The method of random projections has become a standard tool for machine learning, data mining, and search with massive data at Web scale. The effective use of random projections requires efficient coding schemes for quantizing (real-valued) projected data into integers. In this paper, we focus on a simple 2-bit coding scheme. In particular, we develop accurate nonlinear estimators of data similarity based on the 2-bit strategy. This work will have important practical applications. For example, in the task of near neighbor search, a crucial step (often called re-ranking) is to compute or estimate data similarities once a set of candidate data points have been identified by hash table techniques. This re-ranking step can take advantage of the proposed coding scheme and estimator. As a related task, in this paper, we also study a simple uniform quantization scheme for the purpose of building hash tables with projected data. Our analysis shows that typically only a small number of bits are needed. For example, when the target similarity level is high, 2 or 3 bits might be sufficient. When the target similarity level is not so high, it is preferable to use only 1 or 2 bits. Therefore, a 2-bit scheme appears to be overall a good choice for the task of sublinear time approximate near neighbor search via hash tables. Combining these results, we conclude that 2-bit random projections should be recommended for approximate near neighbor search and similarity estimation. Extensive experimental results are provided. |
Tasks | Quantization |
Published | 2016-02-21 |
URL | http://arxiv.org/abs/1602.06577v1 |
http://arxiv.org/pdf/1602.06577v1.pdf | |
PWC | https://paperswithcode.com/paper/2-bit-random-projections-nonlinear-estimators |
Repo | |
Framework | |
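As a rough illustration of the 2-bit coding idea described in the abstract above, the sketch below projects vectors with a Gaussian random matrix and quantizes each projected value into one of four regions. The threshold `w = 0.75`, the helper names, and the use of a plain code-collision rate as a similarity proxy are assumptions for illustration only; the paper's nonlinear estimators are not reproduced here.

```python
import numpy as np

def two_bit_codes(X, R, w=0.75):
    """Project the rows of X with R, then quantize each projection to 2 bits.

    The boundaries -w, 0, w split the real line into four regions,
    which np.digitize maps to the codes 0, 1, 2, 3.
    """
    Z = X @ R
    return np.digitize(Z, [-w, 0.0, w]).astype(np.uint8)

def collision_rate(codes_a, codes_b):
    """Fraction of projections on which two code vectors agree."""
    return float(np.mean(codes_a == codes_b))

def with_cosine(u, rho, rng):
    """Return a unit vector whose cosine similarity with unit vector u is rho."""
    v = rng.standard_normal(u.shape)
    v -= (v @ u) * u              # remove the component along u
    v /= np.linalg.norm(v)
    return rho * u + np.sqrt(1.0 - rho ** 2) * v

rng = np.random.default_rng(0)
d, k = 64, 2048                    # data dimension, number of projections
R = rng.standard_normal((d, k))    # Gaussian random projection matrix

u = rng.standard_normal(d)
u /= np.linalg.norm(u)
near = with_cosine(u, 0.9, rng)    # highly similar to u
far = with_cosine(u, 0.1, rng)     # nearly orthogonal to u

codes = two_bit_codes(np.stack([u, near, far]), R)
rate_near = collision_rate(codes[0], codes[1])
rate_far = collision_rate(codes[0], codes[2])
```

The more similar pair should agree on a larger fraction of codes, which is the kind of signal a re-ranking step could exploit when comparing candidates.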
Rank Aggregation for Course Sequence Discovery
Title | Rank Aggregation for Course Sequence Discovery |
Authors | Mihai Cucuringu, Charlie Marshak, Dillon Montag, Puck Rombach |
Abstract | In this work, we adapt the rank aggregation framework for the discovery of optimal course sequences at the university level. Each student provides a partial ranking of the courses taken throughout his or her undergraduate career. We compute pairwise rank comparisons between courses based on the order students typically take them, aggregate the results over the entire student population, and then obtain a proxy for the rank offset between pairs of courses. We extract a global ranking of the courses via several state-of-the-art algorithms for ranking with pairwise noisy information, including SerialRank, Rank Centrality, and the recent SyncRank based on the group synchronization problem. We test this application of rank aggregation on 15 years of student data from the Department of Mathematics at the University of California, Los Angeles (UCLA). Furthermore, we experiment with the above approach on different subsets of the student population conditioned on final GPA, and highlight several differences in the obtained rankings that uncover hidden pre-requisites in the Mathematics curriculum. |
Tasks | |
Published | 2016-03-03 |
URL | http://arxiv.org/abs/1603.02695v1 |
http://arxiv.org/pdf/1603.02695v1.pdf | |
PWC | https://paperswithcode.com/paper/rank-aggregation-for-course-sequence |
Repo | |
Framework | |
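The pipeline in the abstract above (pairwise comparisons per student, aggregation over the population, then a global ranking) can be sketched in a few lines. The course names and sequences below are made up, and the final step uses a plain row-sum score as a simple stand-in; it is not SerialRank, Rank Centrality, or SyncRank.

```python
from itertools import combinations

def pairwise_offsets(sequences):
    """Aggregate pairwise order comparisons over all students.

    C[i][j] counts how often course i was taken before course j,
    minus how often j was taken before i. Each sequence is assumed
    to list a student's courses in the order taken, without repeats.
    """
    courses = sorted({c for seq in sequences for c in seq})
    index = {c: i for i, c in enumerate(courses)}
    n = len(courses)
    C = [[0] * n for _ in range(n)]
    for seq in sequences:
        for a, b in combinations(seq, 2):  # a appears before b in seq
            C[index[a]][index[b]] += 1
            C[index[b]][index[a]] -= 1
    return courses, C

def rank_by_row_sum(courses, C):
    """Order courses from 'typically earliest' to 'typically latest'."""
    scores = {c: sum(C[i]) for i, c in enumerate(courses)}
    # Higher score means the course precedes others more often; ties break by name.
    return sorted(courses, key=lambda c: (-scores[c], c))

# Hypothetical student records, each a partial ranking of courses.
sequences = [
    ["calculus", "linear_algebra", "analysis"],
    ["calculus", "analysis"],
    ["linear_algebra", "calculus", "analysis"],
    ["calculus", "linear_algebra"],
]

courses, C = pairwise_offsets(sequences)
ordering = rank_by_row_sum(courses, C)
# ordering == ["calculus", "linear_algebra", "analysis"]
```

Running the same aggregation on a subset of students (for example, filtered by final GPA) only requires passing a different list of sequences.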
Multiple Regularizations Deep Learning for Paddy Growth Stages Classification from LANDSAT-8
Title | Multiple Regularizations Deep Learning for Paddy Growth Stages Classification from LANDSAT-8 |
Authors | Ines Heidieni Ikasari, Vina Ayumi, Mohamad Ivan Fanany, Sidik Mulyono |
Abstract | This study uses remote sensing technology that can provide information about the condition of the earth’s surface quickly and spatially. The study area was in Karawang District, lying in the northern part of West Java, Indonesia. We address paddy growth stage classification using LANDSAT-8 image data obtained from multi-sensor remote sensing images taken from October 2015 to August 2016. This study pursues a fast and accurate classification of paddy growth stages by employing multiple regularizations learning on some deep learning methods such as DNN (Deep Neural Networks) and 1-D CNN (1-D Convolutional Neural Networks). The regularizations used are Fast Dropout, Dropout, and Batch Normalization. To evaluate the effectiveness, we also compared our method with other machine learning methods such as Logistic Regression, SVM, Random Forest, and XGBoost. The data used are seven bands of LANDSAT-8 spectral data samples that correspond to paddy growth stages data obtained from the i-Sky (eye in the sky) Innovation system. The growth stages are determined based on the paddy crop phenology profile from a time series of LANDSAT-8 images. The classification results show that MLP using the multiple regularizations Dropout and Batch Normalization achieves the highest accuracy for this dataset. |
Tasks | Time Series |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.01795v1 |
http://arxiv.org/pdf/1610.01795v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-regularizations-deep-learning-for |
Repo | |
Framework | |
Application of Advanced Record Linkage Techniques for Complex Population Reconstruction
Title | Application of Advanced Record Linkage Techniques for Complex Population Reconstruction |
Authors | Peter Christen |
Abstract | Record linkage is the process of identifying records that refer to the same entities from several databases. This process is challenging because commonly no unique entity identifiers are available. Linkage therefore has to rely on partially identifying attributes, such as names and addresses of people. Recent years have seen the development of novel techniques for linking data from diverse application areas, where a major focus has been on linking complex data that contain records about different types of entities. Advanced approaches that exploit both the similarities between record attributes as well as the relationships between entities to identify clusters of matching records have been developed. In this application paper we study the novel problem where rather than different types of entities we have databases where the same entity can have different roles, and where these roles change over time. We specifically develop novel techniques for linking historical birth, death, marriage and census records with the aim to reconstruct the population covered by these records over a period of several decades. Our experimental evaluation on real Scottish data shows that even with advanced linkage techniques that consider group, relationship, and temporal aspects it is challenging to achieve high quality linkage from such complex data. |
Tasks | |
Published | 2016-12-13 |
URL | http://arxiv.org/abs/1612.04286v1 |
http://arxiv.org/pdf/1612.04286v1.pdf | |
PWC | https://paperswithcode.com/paper/application-of-advanced-record-linkage |
Repo | |
Framework | |
Deep Joint Face Hallucination and Recognition
Title | Deep Joint Face Hallucination and Recognition |
Authors | Junyu Wu, Shengyong Ding, Wei Xu, Hongyang Chao |
Abstract | Deep models have achieved impressive performance for face hallucination tasks. However, we observe that directly feeding the hallucinated facial images into recognition models can even degrade the recognition performance despite the much better visualization quality. In this paper, we address this problem by jointly learning a deep model for two tasks, i.e. face hallucination and recognition. In particular, we design an end-to-end deep convolution network with a hallucination sub-network followed in cascade by a recognition sub-network. The recognition sub-network is responsible for producing discriminative feature representations, taking as inputs the hallucinated images generated by the hallucination sub-network. During training, we feed LR facial images into the network and optimize the parameters by minimizing two loss items, i.e. 1) face hallucination loss measured by the pixel-wise difference between the ground truth HR images and network-generated images; and 2) verification loss which is measured by the classification error and intra-class distance. We extensively evaluate our method on LFW and YTF datasets. The experimental results show that our method can achieve recognition accuracy 97.95% on 4x down-sampled LFW testing set, outperforming the accuracy 96.35% of conventional face recognition model. And on the more challenging YTF dataset, we achieve recognition accuracy 90.65%, a margin over the recognition accuracy 89.45% obtained by conventional face recognition model on the 4x down-sampled version. |
Tasks | Face Hallucination, Face Recognition |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08091v1 |
http://arxiv.org/pdf/1611.08091v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-joint-face-hallucination-and-recognition |
Repo | |
Framework | |
Rank Ordered Autoencoders
Title | Rank Ordered Autoencoders |
Authors | Paul Bertens |
Abstract | A new method for the unsupervised learning of sparse representations using autoencoders is proposed and implemented by ordering the output of the hidden units by their activation value and progressively reconstructing the input in this order. This can be done efficiently in parallel with the use of cumulative sums and sorting, only slightly increasing the computational costs. Minimizing the difference of this progressive reconstruction with respect to the input can be seen as minimizing the number of active output units required for the reconstruction of the input. The model thus learns to reconstruct optimally using the least number of active output units. This leads to high sparsity without the need for extra hyperparameters; the amount of sparsity is instead implicitly learned by minimizing this progressive reconstruction error. Results of the trained model are given for patches of the CIFAR10 dataset, showing rapid convergence of features and extremely sparse output activations while maintaining a minimal reconstruction error and showing extreme robustness to overfitting. Additionally, the reconstruction as a function of the number of active units is presented, which shows that the autoencoder learns a rank order code over the input where the highest ranked units correspond to the highest decrease in reconstruction error. |
Tasks | |
Published | 2016-05-05 |
URL | http://arxiv.org/abs/1605.01749v1 |
http://arxiv.org/pdf/1605.01749v1.pdf | |
PWC | https://paperswithcode.com/paper/rank-ordered-autoencoders |
Repo | |
Framework | |
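The progressive reconstruction described in the abstract above (sort hidden units by activation, then rebuild the input cumulatively in that order) can be sketched as a forward pass. This is a minimal sketch under stated assumptions: a ReLU encoder without bias, a linear decoder, a squared-error loss averaged over all prefixes, and random untrained weights. The names `W_enc`, `W_dec`, and `progressive_reconstruction` are hypothetical, and no training loop is shown.

```python
import numpy as np

def progressive_reconstruction(x, W_enc, W_dec):
    """Forward pass with rank-ordered, cumulative reconstruction.

    x     : input vector, shape (d,)
    W_enc : encoder weights, shape (m, d)
    W_dec : decoder weights, shape (m, d); row i is the output of hidden unit i
    """
    h = np.maximum(W_enc @ x, 0.0)                 # hidden activations, shape (m,)
    order = np.argsort(-h)                         # most active unit first
    contributions = h[order, None] * W_dec[order]  # each unit's decoded contribution
    # partial[k] is the reconstruction using only the k+1 most active units.
    partial = np.cumsum(contributions, axis=0)
    errors = np.sum((partial - x) ** 2, axis=1)    # one error per prefix length
    return h, order, partial, errors

rng = np.random.default_rng(0)
d, m = 16, 8                                       # input size, number of hidden units
x = rng.standard_normal(d)
W_enc = 0.1 * rng.standard_normal((m, d))
W_dec = 0.1 * rng.standard_normal((m, d))

h, order, partial, errors = progressive_reconstruction(x, W_enc, W_dec)
loss = errors.mean()                               # progressive reconstruction error
```

Because every prefix contributes to the loss, a trained model is pushed to explain as much of the input as possible with its most active units, which matches the sparsity argument in the abstract. The sort and the cumulative sum are the only additions over an ordinary autoencoder forward pass.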