Paper Group ANR 36
Stacked Approximated Regression Machine: A Simple Deep Learning Approach
Title | Stacked Approximated Regression Machine: A Simple Deep Learning Approach |
Authors | Zhangyang Wang, Shiyu Chang, Qing Ling, Shuai Huang, Xia Hu, Honghui Shi, Thomas S. Huang |
Abstract | With the agreement of my coauthors, I, Zhangyang Wang, would like to withdraw the manuscript “Stacked Approximated Regression Machine: A Simple Deep Learning Approach”. Some experimental procedures were not included in the manuscript, which makes some of the important claims not meaningful. In the relevant research, I was solely responsible for carrying out the experiments; the other coauthors joined in the discussions leading to the main algorithm. Please see the updated text for more details. |
Tasks | |
Published | 2016-08-14 |
URL | http://arxiv.org/abs/1608.04062v2 |
PDF | http://arxiv.org/pdf/1608.04062v2.pdf |
PWC | https://paperswithcode.com/paper/stacked-approximated-regression-machine-a |
Repo | |
Framework | |
4D Crop Monitoring: Spatio-Temporal Reconstruction for Agriculture
Title | 4D Crop Monitoring: Spatio-Temporal Reconstruction for Agriculture |
Authors | Jing Dong, John Gary Burnham, Byron Boots, Glen C. Rains, Frank Dellaert |
Abstract | Autonomous crop monitoring at high spatial and temporal resolution is a critical problem in precision agriculture. While Structure from Motion and Multi-View Stereo algorithms can finely reconstruct the 3D structure of a field with low-cost image sensors, these algorithms fail to capture the dynamic nature of continuously growing crops. In this paper we propose a 4D reconstruction approach to crop monitoring, which employs a spatio-temporal model of dynamic scenes that is useful for precision agriculture applications. Additionally, we provide a robust data association algorithm to address the problem of large appearance changes due to scenes being viewed from different angles at different points in time, which is critical to achieving 4D reconstruction. Finally, we collected a high quality dataset with ground truth statistics to evaluate the performance of our method. We demonstrate that our 4D reconstruction approach provides models that are qualitatively correct with respect to visual appearance and quantitatively accurate when measured against the ground truth geometric properties of the monitored crops. |
Tasks | |
Published | 2016-10-08 |
URL | http://arxiv.org/abs/1610.02482v1 |
PDF | http://arxiv.org/pdf/1610.02482v1.pdf |
PWC | https://paperswithcode.com/paper/4d-crop-monitoring-spatio-temporal |
Repo | |
Framework | |
Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data
Title | Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data |
Authors | Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta |
Abstract | Conceived in the early 1990s, Experience Replay (ER) has been shown to be a successful mechanism to allow online learning algorithms to reuse past experiences. Traditionally, ER can be applied to all machine learning paradigms (i.e., unsupervised, supervised, and reinforcement learning). Recently, ER has contributed to improving the performance of deep reinforcement learning. Yet, its application to many practical settings is still limited by the memory requirements of ER, necessary to explicitly store previous observations. To remedy this issue, we explore a novel approach, Online Contrastive Divergence with Generative Replay (OCD_GR), which uses the generative capability of Restricted Boltzmann Machines (RBMs) instead of recorded past experiences. The RBM is trained online, and does not require the system to store any of the observed data points. We compare OCD_GR to ER on 9 real-world datasets, considering a worst-case scenario (data points arriving in sorted order) as well as a more realistic one (sequential random-order data points). Our results show that in 64.28% of the cases OCD_GR outperforms ER and in the remaining 35.72% it has an almost equal performance, while having a considerably reduced space complexity (i.e., memory usage) at a comparable time complexity. |
Tasks | |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05555v1 |
PDF | http://arxiv.org/pdf/1610.05555v1.pdf |
PWC | https://paperswithcode.com/paper/online-contrastive-divergence-with-generative |
Repo | |
Framework | |
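The abstract's central idea — replacing a stored replay buffer with pseudo-samples drawn from an online-trained RBM — can be sketched in a few lines. The RBM below is a generic CD-1 implementation, not the authors' code; the 8-dimensional binary stream, layer sizes, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary RBM trained with one-step contrastive divergence (CD-1)."""
    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.lr = lr

    def _h_given_v(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def _v_given_h(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_update(self, v0):
        # positive phase: hidden activations driven by the data
        ph0 = self._h_given_v(v0)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        # negative phase: one Gibbs step back through the model
        pv1 = self._v_given_h(h0)
        ph1 = self._h_given_v(pv1)
        self.W += self.lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))
        self.b_v += self.lr * (v0 - pv1)
        self.b_h += self.lr * (ph0 - ph1)

    def generate(self, n_steps=20):
        # generative replay: sample a pseudo-experience by Gibbs sampling,
        # so no past observation needs to be stored
        v = (rng.random(self.b_v.shape) < 0.5).astype(float)
        for _ in range(n_steps):
            h = (rng.random(self.b_h.shape) < self._h_given_v(v)).astype(float)
            v = (rng.random(self.b_v.shape) < self._v_given_h(h)).astype(float)
        return v

rbm = RBM(n_visible=8, n_hidden=4)
for _ in range(200):
    x = (rng.random(8) < 0.5).astype(float)  # stand-in for a streamed observation
    rbm.cd1_update(x)        # learn from the real sample
    replay = rbm.generate()  # generated sample replaces stored experience
    rbm.cd1_update(replay)   # rehearse on the replayed experience
```

The space saving is the point: the learner keeps only the RBM parameters, never the observed data points.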
Combining Texture and Shape Cues for Object Recognition With Minimal Supervision
Title | Combining Texture and Shape Cues for Object Recognition With Minimal Supervision |
Authors | Xingchao Peng, Kate Saenko |
Abstract | We present a novel approach to object classification and detection which requires minimal supervision and which combines visual texture cues and shape information learned from freely available unlabeled web search results. The explosion of visual data on the web can potentially make visual examples of almost any object easily accessible via web search. Previous unsupervised methods have utilized either large scale sources of texture cues from the web, or shape information from data such as crowdsourced CAD models. We propose a two-stream deep learning framework that combines these cues, with one stream learning visual texture cues from image search data, and the other stream learning rich shape information from 3D CAD models. To perform classification or detection for a novel image, the predictions of the two streams are combined using a late fusion scheme. We present experiments and visualizations for both tasks on the standard benchmark PASCAL VOC 2007 to demonstrate that texture and shape provide complementary information in our model. Our method outperforms previous web image based models, 3D CAD model based approaches, and weakly supervised models. |
Tasks | Image Retrieval, Object Classification, Object Recognition |
Published | 2016-09-14 |
URL | http://arxiv.org/abs/1609.04356v1 |
PDF | http://arxiv.org/pdf/1609.04356v1.pdf |
PWC | https://paperswithcode.com/paper/combining-texture-and-shape-cues-for-object |
Repo | |
Framework | |
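The late fusion scheme mentioned in the abstract amounts to combining the class posteriors of the two streams after each has made its prediction. The sketch below assumes softmax outputs and a convex mixing weight `alpha` — a hypothetical parameter, since the abstract does not specify the fusion weights.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(texture_logits, shape_logits, alpha=0.5):
    """Convex combination of per-stream class posteriors."""
    return alpha * softmax(texture_logits) + (1 - alpha) * softmax(shape_logits)

# one image, three classes: texture stream favours class 0, shape stream class 1
fused = late_fusion(np.array([[2.0, 0.5, -1.0]]),
                    np.array([[0.1, 3.0, 0.2]]))
```

Because the streams are trained on disjoint cue types (web images vs. rendered CAD models), averaging their posteriors lets a confident stream override an uncertain one.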
Multipartite Ranking-Selection of Low-Dimensional Instances by Supervised Projection to High-Dimensional Space
Title | Multipartite Ranking-Selection of Low-Dimensional Instances by Supervised Projection to High-Dimensional Space |
Authors | Arash Shahriari |
Abstract | Pruning of redundant or irrelevant instances of data is a key to every successful solution for pattern recognition. In this paper, we present a novel ranking-selection framework for low-dimensional but highly correlated instances. Instead of working in the low-dimensional instance space, we learn a supervised projection to a high-dimensional space spanned by the number of classes in the dataset under study. Imposing higher distinctions by exposing the notion of labels to the instances allows us to deploy one-versus-all ranking for each individual class and to select quality instances via adaptive thresholding of the overall scores. To prove the efficiency of our paradigm, we employ it for texture understanding, which is a hard recognition challenge due to the high similarity of texture pixels and the low dimensionality of their color features. Our experiments show considerable improvements in recognition performance over other local descriptors on several publicly available datasets. |
Tasks | |
Published | 2016-06-24 |
URL | http://arxiv.org/abs/1606.07575v1 |
PDF | http://arxiv.org/pdf/1606.07575v1.pdf |
PWC | https://paperswithcode.com/paper/multipartite-ranking-selection-of-low |
Repo | |
Framework | |
pg-Causality: Identifying Spatiotemporal Causal Pathways for Air Pollutants with Urban Big Data
Title | pg-Causality: Identifying Spatiotemporal Causal Pathways for Air Pollutants with Urban Big Data |
Authors | Julie Yixuan Zhu, Chao Zhang, Huichu Zhang, Shi Zhi, Victor O. K. Li, Jiawei Han, Yu Zheng |
Abstract | Many countries are suffering from severe air pollution. Understanding how different air pollutants accumulate and propagate is critical to making relevant public policies. In this paper, we use urban big data (air quality data and meteorological data) to identify the spatiotemporal (ST) causal pathways for air pollutants. This problem is challenging because: (1) there are numerous noisy and low-pollution periods in the raw air quality data, which may lead to unreliable causality analysis, (2) for large-scale data in the ST space, the computational complexity of constructing a causal structure is very high, and (3) the ST causal pathways are complex due to the interactions of multiple pollutants and the influence of environmental factors. Therefore, we present p-Causality, a novel pattern-aided causality analysis approach that combines the strengths of pattern mining and Bayesian learning to efficiently and faithfully identify the ST causal pathways. First, pattern mining helps suppress the noise by capturing frequent evolving patterns (FEPs) of each monitoring sensor, and greatly reduces the complexity by selecting the pattern-matched sensors as “causers”. Then, Bayesian learning carefully encodes the local and ST causal relations with a Gaussian Bayesian network (GBN)-based graphical model, which also integrates environmental influences to minimize biases in the final results. We evaluate our approach with three real-world data sets containing 982 air quality sensors, in three regions of China from 01-Jun-2013 to 19-Dec-2015. Results show that our approach outperforms the traditional causal structure learning methods in time efficiency, inference accuracy and interpretability. |
Tasks | |
Published | 2016-10-22 |
URL | http://arxiv.org/abs/1610.07045v3 |
PDF | http://arxiv.org/pdf/1610.07045v3.pdf |
PWC | https://paperswithcode.com/paper/pg-causality-identifying-spatiotemporal |
Repo | |
Framework | |
Highly accurate gaze estimation using a consumer RGB-depth sensor
Title | Highly accurate gaze estimation using a consumer RGB-depth sensor |
Authors | Reza Shoja Ghiass, Ognjen Arandjelovic |
Abstract | Determining the direction in which a person is looking is an important problem in a wide range of HCI applications. In this paper we describe a highly accurate algorithm that performs gaze estimation using an affordable and widely available device such as Kinect. The method we propose starts by performing accurate head pose estimation achieved by fitting a person specific morphable model of the face to depth data. The ordinarily competing requirements of high accuracy and high speed are met concurrently by formulating the fitting objective function as a combination of terms which excel either in accurate or fast fitting, and then by adaptively adjusting their relative contributions throughout fitting. Following pose estimation, pose normalization is done by re-rendering the fitted model as a frontal face. Finally gaze estimates are obtained through regression from the appearance of the eyes in synthetic, normalized images. Using EYEDIAP, the standard public dataset for the evaluation of gaze estimation algorithms from RGB-D data, we demonstrate that our method greatly outperforms the state of the art. |
Tasks | Gaze Estimation, Head Pose Estimation, Pose Estimation |
Published | 2016-04-05 |
URL | http://arxiv.org/abs/1604.01420v1 |
PDF | http://arxiv.org/pdf/1604.01420v1.pdf |
PWC | https://paperswithcode.com/paper/highly-accurate-gaze-estimation-using-a |
Repo | |
Framework | |
Detection of Cooperative Interactions in Logistic Regression Models
Title | Detection of Cooperative Interactions in Logistic Regression Models |
Authors | Easton Li Xu, Xiaoning Qian, Tie Liu, Shuguang Cui |
Abstract | An important problem in the field of bioinformatics is to identify interactive effects among profiled variables for outcome prediction. In this paper, a logistic regression model with pairwise interactions among a set of binary covariates is considered. Modeling the structure of the interactions by a graph, our goal is to recover the interaction graph from independently identically distributed (i.i.d.) samples of the covariates and the outcome. When viewed as a feature selection problem, a simple quantity called influence is proposed as a measure of the marginal effects of the interaction terms on the outcome. For the case when the underlying interaction graph is known to be acyclic, it is shown that a simple algorithm that is based on a maximum-weight spanning tree with respect to the plug-in estimates of the influences not only has strong theoretical performance guarantees, but can also outperform generic feature selection algorithms for recovering the interaction graph from i.i.d. samples of the covariates and the outcome. Our results can also be extended to the model that includes both individual effects and pairwise interactions via the help of an auxiliary covariate. |
Tasks | Feature Selection |
Published | 2016-02-12 |
URL | http://arxiv.org/abs/1602.03963v2 |
PDF | http://arxiv.org/pdf/1602.03963v2.pdf |
PWC | https://paperswithcode.com/paper/detection-of-cooperative-interactions-in |
Repo | |
Framework | |
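The maximum-weight spanning tree step from the abstract can be illustrated with Kruskal's algorithm run over plug-in influence estimates. The edge weights below are made-up numbers standing in for estimated influences on a 4-covariate model; the recovery guarantee for acyclic interaction graphs is the paper's theoretical result and is not demonstrated here.

```python
def max_weight_spanning_tree(n_nodes, weighted_edges):
    """Kruskal's algorithm on descending weights: greedily keep the heaviest
    edge that does not close a cycle, using union-find for cycle checks."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(weighted_edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # joining two components cannot create a cycle
            parent[ru] = rv
            tree.append((u, v, weight))
    return tree

# hypothetical plug-in influence estimates for candidate interaction edges
influences = [(0.9, 0, 1), (0.2, 0, 2), (0.7, 1, 2), (0.5, 2, 3), (0.1, 1, 3)]
tree = max_weight_spanning_tree(4, influences)
```

The appeal of this reduction is that spanning-tree construction is near-linear in the number of candidate edges, so the whole pipeline stays cheap compared to generic feature selection.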
Kernel Risk-Sensitive Loss: Definition, Properties and Application to Robust Adaptive Filtering
Title | Kernel Risk-Sensitive Loss: Definition, Properties and Application to Robust Adaptive Filtering |
Authors | Badong Chen, Lei Xing, Bin Xu, Haiquan Zhao, Nanning Zheng, Jose C. Principe |
Abstract | Nonlinear similarity measures defined in kernel space, such as correntropy, can extract higher-order statistics of data and offer potentially significant performance improvement over their linear counterparts especially in non-Gaussian signal processing and machine learning. In this work, we propose a new similarity measure in kernel space, called the kernel risk-sensitive loss (KRSL), and provide some important properties. We apply the KRSL to adaptive filtering and investigate the robustness, and then develop the MKRSL algorithm and analyze the mean square convergence performance. Compared with correntropy, the KRSL can offer a more efficient performance surface, thereby enabling a gradient based method to achieve faster convergence speed and higher accuracy while still maintaining the robustness to outliers. Theoretical analysis results and superior performance of the new algorithm are confirmed by simulation. |
Tasks | |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00441v1 |
PDF | http://arxiv.org/pdf/1608.00441v1.pdf |
PWC | https://paperswithcode.com/paper/kernel-risk-sensitive-loss-definition |
Repo | |
Framework | |
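A minimal empirical version of the loss can make the robustness claim concrete. This sketch assumes the definition takes the form KRSL = (1/λ) E[exp(λ(1 − κ_σ(e)))] with a Gaussian kernel κ_σ evaluated at the error e — our reading of the abstract; σ and λ values here are illustrative.

```python
import numpy as np

def krsl(errors, sigma=1.0, lam=0.5):
    """Empirical kernel risk-sensitive loss over a sample of errors.

    kappa is the Gaussian kernel at each error. The exponential re-weighting
    makes the surface steep near zero error, yet each term is capped at
    exp(lam) / lam as |e| -> infinity (kappa -> 0), which bounds the
    influence of outliers."""
    kappa = np.exp(-np.asarray(errors, float) ** 2 / (2.0 * sigma ** 2))
    return float(np.mean(np.exp(lam * (1.0 - kappa))) / lam)
```

A quick check of the two properties the abstract highlights: the loss grows with error magnitude (efficient surface near the optimum) but saturates for huge errors (robustness to outliers).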
Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach
Title | Nonlinear variable selection with continuous outcome: a nonparametric incremental forward stagewise approach |
Authors | Tianwei Yu |
Abstract | We present a method of variable selection for the sparse generalized additive model. The method doesn’t assume any specific functional form, and can select from a large number of candidates. It takes the form of incremental forward stagewise regression. Given no functional form is assumed, we devised an approach termed roughening to adjust the residuals in the iterations. In simulations, we show the new method is competitive against popular machine learning approaches. We also demonstrate its performance using some real datasets. The method is available as a part of the nlnet package on CRAN https://cran.r-project.org/package=nlnet. |
Tasks | |
Published | 2016-01-20 |
URL | http://arxiv.org/abs/1601.05285v4 |
PDF | http://arxiv.org/pdf/1601.05285v4.pdf |
PWC | https://paperswithcode.com/paper/nonlinear-variable-selection-with-continuous |
Repo | |
Framework | |
Introspective Perception: Learning to Predict Failures in Vision Systems
Title | Introspective Perception: Learning to Predict Failures in Vision Systems |
Authors | Shreyansh Daftry, Sam Zeng, J. Andrew Bagnell, Martial Hebert |
Abstract | As robots aspire to long-term autonomous operation in complex dynamic environments, the ability to reliably take mission-critical decisions in ambiguous situations becomes critical. This motivates the need to build systems that have the situational awareness to assess how qualified they are at that moment to make a decision. We call this self-evaluating capability introspection. In this paper, we take a small step in this direction and propose a generic framework for introspective behavior in perception systems. Our goal is to learn a model that reliably predicts failures in a given system, with respect to a task, directly from input sensor data. We present this in the context of vision-based autonomous MAV flight in outdoor natural environments, and show that it effectively handles uncertain situations. |
Tasks | |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08665v1 |
PDF | http://arxiv.org/pdf/1607.08665v1.pdf |
PWC | https://paperswithcode.com/paper/introspective-perception-learning-to-predict |
Repo | |
Framework | |
FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only
Title | FPGA Based Implementation of Deep Neural Networks Using On-chip Memory Only |
Authors | Jinhwan Park, Wonyong Sung |
Abstract | Deep neural networks (DNNs) demand a very large amount of computation and weight storage, and thus efficient implementation using special purpose hardware is highly desired. In this work, we have developed an FPGA based fixed-point DNN system that uses only on-chip memory, avoiding accesses to external DRAM. The execution time and energy consumption of the developed system are compared with those of a GPU based implementation. Since the memory capacity of an FPGA is limited, only 3-bit weights are used for this implementation, and training based fixed-point weight optimization is employed. The implementation using a Xilinx XC7Z045 is tested on the MNIST handwritten digit recognition benchmark and a phoneme recognition task on the TIMIT corpus. The obtained speed is about one quarter that of a GPU based implementation and much better than that of a PC based one. The power consumption is less than 5 W at full-speed operation, resulting in much higher efficiency compared to GPU based systems. |
Tasks | Handwritten Digit Recognition |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01616v2 |
PDF | http://arxiv.org/pdf/1602.01616v2.pdf |
PWC | https://paperswithcode.com/paper/fpga-based-implementation-of-deep-neural |
Repo | |
Framework | |
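The 3-bit weight representation can be mimicked with a generic symmetric uniform quantizer. This is a stand-in sketch only: the paper's training-based fixed-point weight optimization retrains the network under the quantization constraint, which is not reproduced here, and the step-size rule below is one common choice rather than the paper's.

```python
import numpy as np

def uniform_quantize(weights, n_bits=3):
    """Symmetric uniform fixed-point quantizer.

    There are 2**(n_bits - 1) - 1 magnitude levels on each side of zero
    (3 for 3-bit weights), with the step size chosen so the largest weight
    magnitude maps to the top level."""
    n_levels = 2 ** (n_bits - 1) - 1
    max_abs = float(np.max(np.abs(weights)))
    delta = max_abs / n_levels if max_abs > 0 else 1.0
    codes = np.clip(np.round(weights / delta), -n_levels, n_levels)
    return codes * delta, codes.astype(int)

w = np.array([-0.9, -0.31, 0.02, 0.28, 0.6, 0.95])
w_q, codes = uniform_quantize(w)
```

Storing only the integer codes plus one shared step size is what keeps the whole network inside on-chip memory.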
Faster Kernels for Graphs with Continuous Attributes via Hashing
Title | Faster Kernels for Graphs with Continuous Attributes via Hashing |
Authors | Christopher Morris, Nils M. Kriege, Kristian Kersting, Petra Mutzel |
Abstract | While state-of-the-art kernels for graphs with discrete labels scale well to graphs with thousands of nodes, the few existing kernels for graphs with continuous attributes, unfortunately, do not scale well. To overcome this limitation, we present hash graph kernels, a general framework to derive kernels for graphs with continuous attributes from discrete ones. The idea is to iteratively turn continuous attributes into discrete labels using randomized hash functions. We illustrate hash graph kernels for the Weisfeiler-Lehman subtree kernel and for the shortest-path kernel. The resulting novel graph kernels are shown to be, both, able to handle graphs with continuous attributes and scalable to large graphs and data sets. This is supported by our theoretical analysis and demonstrated by an extensive experimental evaluation. |
Tasks | |
Published | 2016-10-01 |
URL | http://arxiv.org/abs/1610.00064v1 |
PDF | http://arxiv.org/pdf/1610.00064v1.pdf |
PWC | https://paperswithcode.com/paper/faster-kernels-for-graphs-with-continuous |
Repo | |
Framework | |
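The two-step idea — randomized hashing of continuous attributes into discrete labels, then an ordinary discrete kernel — can be sketched as below. The random-projection hash is one LSH-style choice of discretizer (the framework draws many such hashes and combines the resulting kernels); the example graph and bucket width `r` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

def lsh_discretize(attrs, r=0.5):
    """One randomized hash draw: project continuous node attributes onto a
    random direction and bucket the result into integer labels."""
    w = rng.normal(size=attrs.shape[1])
    b = rng.uniform(0.0, r)
    return np.floor((attrs @ w + b) / r).astype(int)

def wl_iteration(labels, adjacency):
    """One Weisfeiler-Lehman refinement on the now-discrete labels: a node's
    new label hashes its own label together with its sorted neighbour labels."""
    return [hash((int(labels[i]), tuple(sorted(int(labels[j]) for j in nbrs))))
            for i, nbrs in enumerate(adjacency)]

# toy path graph 0 - 1 - 2 with 2-dimensional continuous node attributes
attrs = np.array([[0.1, 0.2], [0.12, 0.19], [3.0, -1.0]])
adjacency = [[1], [0, 2], [1]]
labels = lsh_discretize(attrs)
refined = wl_iteration(labels, adjacency)
```

Once attributes are integers, any off-the-shelf discrete kernel (WL subtree, shortest-path) applies unchanged, which is where the scalability comes from.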
The encoding of proprioceptive inputs in the brain: knowns and unknowns from a robotic perspective
Title | The encoding of proprioceptive inputs in the brain: knowns and unknowns from a robotic perspective |
Authors | Matej Hoffmann, Nada Bednarova |
Abstract | Somatosensory inputs can be grossly divided into tactile (or cutaneous) and proprioceptive – the former conveying information about skin stimulation, the latter about limb position and movement. The principal proprioceptors are constituted by muscle spindles, which deliver information about muscle length and speed. In primates, this information is relayed to the primary somatosensory cortex and eventually the posterior parietal cortex, where integrated information about body posture (postural schema) is presumably available. However, coming from robotics and seeking a biologically motivated model that could be used in a humanoid robot, we faced a number of difficulties. First, it is not clear what neurons in the ascending pathway and primary somatosensory cortex code. To an engineer, joint angles would seem the most useful variables. However, the lengths of individual muscles have nonlinear relationships with the angles at joints. Kim et al. (Neuron, 2015) found different types of proprioceptive neurons in the primary somatosensory cortex – sensitive to movement of single or multiple joints or to static postures. Second, there are indications that the somatotopic arrangement (“the homunculus”) of these brain areas is to a significant extent learned. However, the mechanisms behind this developmental process are unclear. We will report first results from modeling of this process using data obtained from body babbling in the iCub humanoid robot and feeding them into a Self-Organizing Map (SOM). Our results reveal that the SOM algorithm is only suited to develop receptive fields of the posture-selective type. Furthermore, the SOM algorithm has intrinsic difficulties when combined with population code on its input and in particular with nonlinear tuning curves (sigmoids or Gaussians). |
Tasks | |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05944v1 |
PDF | http://arxiv.org/pdf/1607.05944v1.pdf |
PWC | https://paperswithcode.com/paper/the-encoding-of-proprioceptive-inputs-in-the |
Repo | |
Framework | |
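The SOM pipeline the abstract describes — feeding babbling data into a self-organizing map — can be sketched with a textbook SOM. Everything below is generic: the grid size, decay schedules, and the random stand-in for iCub proprioceptive data are illustrative, not the authors' setup.

```python
import numpy as np

def train_som(data, grid=(6, 6), n_iter=400, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal self-organizing map: for each sample, find the best-matching
    unit (BMU) and pull it and its grid neighbours toward the input, with a
    learning rate and neighbourhood radius that decay over iterations."""
    rng = np.random.default_rng(seed)
    h, w = grid
    units = rng.random((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w),
                                  indexing="ij"), axis=-1)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        bmu = np.unravel_index(
            np.argmin(np.linalg.norm(units - x, axis=-1)), (h, w))
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 0.3
        dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
        gauss = np.exp(-dist2 / (2.0 * sigma ** 2))[..., None]
        units += lr * gauss * (x - units)
    return units

# stand-in for proprioceptive population-code data from body babbling
data = np.random.default_rng(1).random((200, 4))
som = train_som(data)
```

The units' receptive fields end up tuned to frequently visited regions of the input space, which matches the paper's observation that the SOM naturally develops posture-selective (static) fields.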
Noise Models in Feature-based Stereo Visual Odometry
Title | Noise Models in Feature-based Stereo Visual Odometry |
Authors | Pablo F. Alcantarilla, Oliver J. Woodford |
Abstract | Feature-based visual structure and motion reconstruction pipelines, common in visual odometry and large-scale reconstruction from photos, use the location of corresponding features in different images to determine the 3D structure of the scene, as well as the camera parameters associated with each image. The noise model, which defines the likelihood of the location of each feature in each image, is a key factor in the accuracy of such pipelines, alongside optimization strategy. Many different noise models have been proposed in the literature; in this paper we investigate the performance of several. We evaluate these models specifically w.r.t. stereo visual odometry, as this task is both simple (camera intrinsics are constant and known; geometry can be initialized reliably) and has datasets with ground truth readily available (KITTI Odometry and New Tsukuba Stereo Dataset). Our evaluation shows that noise models which are more adaptable to the varying nature of noise generally perform better. |
Tasks | Visual Odometry |
Published | 2016-07-01 |
URL | http://arxiv.org/abs/1607.00273v1 |
PDF | http://arxiv.org/pdf/1607.00273v1.pdf |
PWC | https://paperswithcode.com/paper/noise-models-in-feature-based-stereo-visual |
Repo | |
Framework | |
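The role of the noise model can be seen by comparing the per-residual costs implied by two likelihoods. Gaussian and Cauchy are two standard choices in such pipelines (the paper evaluates several models; these two and the σ value here are illustrative, not its specific list).

```python
import numpy as np

def gaussian_cost(r, sigma=1.0):
    """Negative log-likelihood (up to constants) of a feature-location
    residual under a Gaussian noise model: quadratic in the residual."""
    return 0.5 * (np.asarray(r, float) / sigma) ** 2

def cauchy_cost(r, sigma=1.0):
    """Heavier-tailed Cauchy model: grows only logarithmically, so a grossly
    mislocalized feature match cannot dominate the pose optimization."""
    return np.log1p(0.5 * (np.asarray(r, float) / sigma) ** 2)

residuals = np.array([0.1, 0.5, 8.0])  # pixels; 8.0 mimics an outlier match
```

Near zero the two costs agree, so inlier behavior is unchanged; the models differ only in how much a large residual is allowed to pull on the estimate, which is exactly the adaptability the evaluation rewards.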