Paper Group ANR 164
DDD17: End-To-End DAVIS Driving Dataset. Solving Almost all Systems of Random Quadratic Equations. Embedded-Graph Theory. Registration and Fusion of Multi-Spectral Images Using a Novel Edge Descriptor. Fashion Forward: Forecasting Visual Style in Fashion. Enhanced Local Binary Patterns for Automatic Face Recognition. Using $k$-way Co-occurrences fo …
DDD17: End-To-End DAVIS Driving Dataset
Title | DDD17: End-To-End DAVIS Driving Dataset |
Authors | Jonathan Binas, Daniel Neil, Shih-Chii Liu, Tobi Delbruck |
Abstract | Event cameras, such as dynamic vision sensors (DVS), and dynamic and active-pixel vision sensors (DAVIS) can supplement other autonomous driving sensors by providing a concurrent stream of standard active pixel sensor (APS) images and DVS temporal contrast events. The APS stream is a sequence of standard grayscale global-shutter image sensor frames. The DVS events represent brightness changes occurring at a particular moment, with a jitter of about a millisecond under most lighting conditions. They have a dynamic range of >120 dB and effective frame rates >1 kHz at data rates comparable to 30 fps (frames/second) image sensors. To overcome some of the limitations of current image acquisition technology, we investigate in this work the use of the combined DVS and APS streams in end-to-end driving applications. The dataset DDD17 accompanying this paper is the first open dataset of annotated DAVIS driving recordings. DDD17 has over 12 h of a 346x260 pixel DAVIS sensor recording highway and city driving in daytime, evening, night, dry and wet weather conditions, along with vehicle speed, GPS position, driver steering, throttle, and brake captured from the car’s on-board diagnostics interface. As an example application, we performed a preliminary end-to-end learning study of using a convolutional neural network that is trained to predict the instantaneous steering angle from DVS and APS visual data. |
Tasks | Autonomous Driving |
Published | 2017-11-04 |
URL | http://arxiv.org/abs/1711.01458v1 |
PDF | http://arxiv.org/pdf/1711.01458v1.pdf |
PWC | https://paperswithcode.com/paper/ddd17-end-to-end-davis-driving-dataset |
Repo | |
Framework | |
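The DVS stream described in the DDD17 abstract is a sequence of per-pixel brightness-change events rather than frames. A convolutional network usually needs a dense image, so one common approach is to accumulate events over a short time window. The sketch below shows that idea only; the `Event` record, the microsecond timestamps, and the 1 ms window are assumptions for illustration, not the preprocessing used by the dataset authors.

```python
from collections import namedtuple

# Hypothetical event record: pixel coordinates, timestamp in microseconds,
# and polarity (+1 for a brightness increase, -1 for a decrease).
Event = namedtuple("Event", ["x", "y", "t_us", "polarity"])

WIDTH, HEIGHT = 346, 260  # DAVIS sensor resolution stated in the abstract


def accumulate_events(events, t_start_us, t_end_us, width=WIDTH, height=HEIGHT):
    """Sum event polarities per pixel over the window [t_start_us, t_end_us)."""
    frame = [[0] * width for _ in range(height)]
    for ev in events:
        in_window = t_start_us <= ev.t_us < t_end_us
        in_bounds = 0 <= ev.x < width and 0 <= ev.y < height
        if in_window and in_bounds:
            frame[ev.y][ev.x] += ev.polarity
    return frame


events = [
    Event(10, 20, 100, +1),
    Event(10, 20, 900, +1),
    Event(300, 5, 500, -1),
    Event(10, 20, 1500, -1),  # falls outside the 1 ms window used below
]
frame = accumulate_events(events, 0, 1000)
```

The resulting `frame` has the same shape as an APS image, so the two streams can be stacked as input channels if desired.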
Solving Almost all Systems of Random Quadratic Equations
Title | Solving Almost all Systems of Random Quadratic Equations |
Authors | Gang Wang, Georgios B. Giannakis, Yousef Saad, Jie Chen |
Abstract | This paper deals with finding an $n$-dimensional solution $x$ to a system of quadratic equations of the form $y_i=\langle a_i,x\rangle^2$ for $1\le i \le m$, which is also known as phase retrieval and is NP-hard in general. We put forth a novel procedure for minimizing the amplitude-based least-squares empirical loss, that starts with a weighted maximal correlation initialization obtainable with a few power or Lanczos iterations, followed by successive refinements based upon a sequence of iteratively reweighted (generalized) gradient iterations. The two (both the initialization and gradient flow) stages distinguish themselves from prior contributions by the inclusion of a fresh (re)weighting regularization technique. The overall algorithm is conceptually simple, numerically scalable, and easy-to-implement. For certain random measurement models, the novel procedure is shown capable of finding the true solution $x$ in time proportional to reading the data $\{(a_i;y_i)\}_{1\le i \le m}$. This holds with high probability and without extra assumption on the signal $x$ to be recovered, provided that the number $m$ of equations is some constant $c>0$ times the number $n$ of unknowns in the signal vector, namely, $m>cn$. Empirically, the upshots of this contribution are: i) (almost) $100\%$ perfect signal recovery in the high-dimensional (e.g., $n\ge 2{,}000$) regime given only an information-theoretic limit number of noiseless equations, namely, $m=2n-1$ in the real-valued Gaussian case; and, ii) (nearly) optimal statistical accuracy in the presence of additive noise of bounded support. Finally, substantial numerical tests using both synthetic data and real images corroborate markedly improved signal recovery performance and computational efficiency of our novel procedure relative to state-of-the-art approaches. |
Tasks | |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10407v1 |
PDF | http://arxiv.org/pdf/1705.10407v1.pdf |
PWC | https://paperswithcode.com/paper/solving-almost-all-systems-of-random |
Repo | |
Framework | |
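To make the "amplitude-based least-squares empirical loss" in the abstract above concrete, it can be written in its usual form for real-valued measurements. The iterate $z$, the step size $\mu$, and the weights $w_i^{t}$ below are placeholder symbols; the abstract does not give the specific reweighting rule, so it is left unspecified here.

$$
\min_{z\in\mathbb{R}^n}\ \ell(z) = \frac{1}{2m}\sum_{i=1}^{m}\left(\sqrt{y_i}-|\langle a_i,z\rangle|\right)^2
$$

A reweighted (generalized) gradient step on this loss then takes the form

$$
z^{t+1} = z^{t} - \frac{\mu}{m}\sum_{i=1}^{m} w_i^{t}\left(\langle a_i,z^{t}\rangle - \sqrt{y_i}\,\operatorname{sign}\!\left(\langle a_i,z^{t}\rangle\right)\right)a_i ,
$$

which reduces to plain (generalized) gradient descent on $\ell$ when every $w_i^{t}=1$.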
Embedded-Graph Theory
Title | Embedded-Graph Theory |
Authors | Atsushi Yokoyama |
Abstract | In this paper, we propose a new type of graph, denoted as “embedded-graph”, and its theory, which employs a distributed representation to describe the relations on the graph edges. Embedded-graphs can express linguistic and complicated relations, which cannot be expressed by the existing edge-graphs or weighted-graphs. We introduce the mathematical definition of embedded-graph, translation, edge distance, and graph similarity. We can transform an embedded-graph into a weighted-graph and a weighted-graph into an edge-graph by the translation method and by threshold calculation, respectively. The edge distance of an embedded-graph is a distance based on the components of a target vector, and it is calculated through cosine similarity with the target vector. The graph similarity is obtained considering the relations with linguistic complexity. In addition, we provide some examples and data structures for embedded-graphs in this paper. |
Tasks | Graph Similarity |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04710v1 |
PDF | http://arxiv.org/pdf/1709.04710v1.pdf |
PWC | https://paperswithcode.com/paper/embedded-graph-theory |
Repo | |
Framework | |
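The chain described in the Embedded-Graph Theory abstract (embedded-graph to weighted-graph to edge-graph) can be sketched as follows. This is a minimal illustration under assumptions: edges are stored in a dictionary keyed by node pairs, the "translation" is taken to be cosine similarity against a target vector, and the function names are hypothetical rather than the paper's own.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def to_weighted_graph(embedded_edges, target):
    """Replace each edge vector by its cosine similarity with the target."""
    return {edge: cosine_similarity(vec, target)
            for edge, vec in embedded_edges.items()}


def to_edge_graph(weighted_edges, threshold):
    """Keep only the edges whose weight reaches the threshold."""
    return {edge for edge, weight in weighted_edges.items() if weight >= threshold}


# Each edge carries a small relation vector (a distributed representation).
embedded_edges = {
    ("alice", "bob"): [1.0, 0.0],
    ("bob", "carol"): [0.0, 1.0],
    ("alice", "carol"): [1.0, 1.0],
}
target = [1.0, 0.0]  # the relation we are asking about

weighted = to_weighted_graph(embedded_edges, target)
edges = to_edge_graph(weighted, threshold=0.5)
```

Changing `target` yields a different weighted-graph from the same embedded-graph, which is the sense in which edge vectors can encode more than a single scalar weight.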
Registration and Fusion of Multi-Spectral Images Using a Novel Edge Descriptor
Title | Registration and Fusion of Multi-Spectral Images Using a Novel Edge Descriptor |
Authors | Nati Ofir, Shai Silberstein, Dani Rozenbaum, Yosi Keller, Sharon Duvdevani Bar |
Abstract | In this paper we introduce a fully end-to-end approach for multi-spectral image registration and fusion. Our fusion method combines images from different spectral channels into a single fused image, using different approaches for low- and high-frequency signals. A prerequisite of fusion is a stage of geometric alignment between the spectral bands, commonly referred to as registration. Unfortunately, common methods for image registration of a single spectral channel do not yield reasonable results on images from different modalities. To that end, we introduce a new algorithm for multi-spectral image registration, based on a novel edge descriptor of feature points. Our method achieves an alignment accurate enough to allow us to further fuse the images. As our experiments show, we produce high-quality multi-spectral image registration and fusion under many challenging scenarios. |
Tasks | Image Registration |
Published | 2017-11-05 |
URL | http://arxiv.org/abs/1711.01543v5 |
PDF | http://arxiv.org/pdf/1711.01543v5.pdf |
PWC | https://paperswithcode.com/paper/registration-and-fusion-of-multi-spectral |
Repo | |
Framework | |
Fashion Forward: Forecasting Visual Style in Fashion
Title | Fashion Forward: Forecasting Visual Style in Fashion |
Authors | Ziad Al-Halah, Rainer Stiefelhagen, Kristen Grauman |
Abstract | What is the future of fashion? Tackling this question from a data-driven vision perspective, we propose to forecast visual style trends before they occur. We introduce the first approach to predict the future popularity of styles discovered from fashion images in an unsupervised manner. Using these styles as a basis, we train a forecasting model to represent their trends over time. The resulting model can hypothesize new mixtures of styles that will become popular in the future, discover style dynamics (trendy vs. classic), and name the key visual attributes that will dominate tomorrow’s fashion. We demonstrate our idea applied to three datasets encapsulating 80,000 fashion products sold across six years on Amazon. Results indicate that fashion forecasting benefits greatly from visual analysis, much more than textual or meta-data cues surrounding products. |
Tasks | |
Published | 2017-05-18 |
URL | http://arxiv.org/abs/1705.06394v2 |
PDF | http://arxiv.org/pdf/1705.06394v2.pdf |
PWC | https://paperswithcode.com/paper/fashion-forward-forecasting-visual-style-in |
Repo | |
Framework | |
Enhanced Local Binary Patterns for Automatic Face Recognition
Title | Enhanced Local Binary Patterns for Automatic Face Recognition |
Authors | Pavel Král, Ladislav Lenc, Antonín Vrba |
Abstract | This paper presents a novel automatic face recognition approach based on local binary patterns (LBP). The LBP descriptor considers a local neighbourhood of a pixel to compute the feature vector values. It is not very robust to image noise, variances, and different illumination conditions. We address these issues by proposing a novel descriptor which considers more pixels and different neighbourhoods to compute the feature vector values. The proposed method is evaluated on two benchmark corpora, namely the UFI and FERET face datasets. We experimentally show that our approach outperforms state-of-the-art methods and is particularly efficient in real conditions where the above-mentioned issues are evident. We further show that the proposed method handles the one-training-sample issue well and is also robust to image resolution. |
Tasks | Face Recognition |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03349v2 |
PDF | http://arxiv.org/pdf/1702.03349v2.pdf |
PWC | https://paperswithcode.com/paper/enhanced-local-binary-patterns-for-automatic |
Repo | |
Framework | |
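For readers unfamiliar with the baseline that the face recognition abstract above builds on, the sketch below computes the basic 8-neighbour local binary pattern code for one pixel and a histogram of codes over an image. It illustrates the standard LBP operator only, not the enhanced descriptor proposed in the paper; the clockwise bit ordering is one of several conventions and is an assumption here.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbour LBP code of the pixel at (row, col)."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


image = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 8, 3],
]
code = lbp_code(image, 1, 1)  # neighbours 9, 7 and 8 exceed the centre value 6
hist = lbp_histogram(image)
```

Because each code depends on only eight comparisons against a single centre pixel, a small amount of noise can flip bits, which is the sensitivity the abstract refers to.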
Using $k$-way Co-occurrences for Learning Word Embeddings
Title | Using $k$-way Co-occurrences for Learning Word Embeddings |
Authors | Danushka Bollegala, Yuichi Yoshida, Ken-ichi Kawarabayashi |
Abstract | Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, numerous prior works on word embedding learning have used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and co-occurring in the same context. We extend the notion of co-occurrences to cover $k(\geq 2)$-way co-occurrences among a set of $k$ words. Specifically, we prove a theoretical relationship between the joint probability of $k(\geq 2)$ words, and the sum of $\ell_2$ norms of their embeddings. Next, we propose a learning objective motivated by our theoretical result that utilises $k$-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and despite data sparsity, for some smaller $k$ values, $k$-way embeddings perform comparably to or better than $2$-way embeddings in a range of tasks. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2017-09-05 |
URL | http://arxiv.org/abs/1709.01199v1 |
PDF | http://arxiv.org/pdf/1709.01199v1.pdf |
PWC | https://paperswithcode.com/paper/using-k-way-co-occurrences-for-learning-word |
Repo | |
Framework | |
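The notion of a $k$-way co-occurrence from the abstract above can be illustrated by counting sets of $k$ distinct words that share a context window. This is a sketch under assumptions: the sliding-window definition of "context", the function name, and the convention of counting every overlapping window are choices made here, not details taken from the paper.

```python
from collections import Counter
from itertools import combinations


def kway_cooccurrences(tokens, k, window):
    """Count unordered sets of k distinct words sharing a sliding window."""
    counts = Counter()
    n_windows = max(1, len(tokens) - window + 1)
    for start in range(n_windows):
        # Sort the distinct words so each set has one canonical tuple form.
        context = sorted(set(tokens[start:start + window]))
        for combo in combinations(context, k):
            counts[combo] += 1
    return counts


tokens = "the cat sat on the mat".split()
counts = kway_cooccurrences(tokens, k=3, window=4)
```

With `k=2` the same function yields ordinary pairwise counts. The number of candidate sets per window grows combinatorially with `k`, which gives some intuition for the data sparsity the abstract mentions.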
Estimation of classrooms occupancy using a multi-layer perceptron
Title | Estimation of classrooms occupancy using a multi-layer perceptron |
Authors | Eugénio Rodrigues, Luísa Dias Pereira, Adélio Rodrigues Gaspar, Álvaro Gomes, Manuel Carlos Gameiro da Silva |
Abstract | This paper presents a multi-layer perceptron model for estimating the number of occupants in classrooms from sensed indoor environmental data: relative humidity, air temperature, and carbon dioxide concentration. The modelling datasets were collected from two classrooms in the Secondary School of Pombal, Portugal. The number of occupants and occupation periods were obtained from class attendance reports. However, post-class occupancy was unknown, and the developed model is used to reconstruct classroom occupancy by filling in the unreported periods. Different model structures and combinations of environmental variables were tested. The most accurate model used a 10-variable input vector built from five averaged time intervals of relative humidity and carbon dioxide concentration. The model achieved a mean square error of 1.99, a coefficient of determination of 0.96 (p-value < 0.001), and a mean absolute error of 1 occupant. These results show promising estimation capabilities in uncertain indoor environment conditions. |
Tasks | |
Published | 2017-02-07 |
URL | http://arxiv.org/abs/1702.02125v1 |
PDF | http://arxiv.org/pdf/1702.02125v1.pdf |
PWC | https://paperswithcode.com/paper/estimation-of-classrooms-occupancy-using-a |
Repo | |
Framework | |
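The 10-variable input described in the classroom occupancy abstract above can be pictured as five interval averages for each of two sensor signals. The sketch below builds such a vector; the equal-length intervals, the sample values, and the function names are assumptions for illustration, since the abstract does not state how the intervals were defined.

```python
def interval_means(samples, n_intervals):
    """Split samples into n_intervals equal chunks and average each chunk."""
    if len(samples) % n_intervals != 0:
        raise ValueError("sample count must be divisible by n_intervals")
    size = len(samples) // n_intervals
    return [sum(samples[i * size:(i + 1) * size]) / size
            for i in range(n_intervals)]


def build_input_vector(humidity, co2, n_intervals=5):
    """Concatenate interval means of relative humidity and CO2 readings."""
    return interval_means(humidity, n_intervals) + interval_means(co2, n_intervals)


# Made-up readings: relative humidity in percent, CO2 concentration in ppm.
humidity = [40, 42, 44, 46, 48, 50, 52, 54, 56, 58]
co2 = [400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300]

features = build_input_vector(humidity, co2)
```

A vector like `features` would be the input to the perceptron, whose single output is the estimated number of occupants.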
A Hybrid Architecture for Multi-Party Conversational Systems
Title | A Hybrid Architecture for Multi-Party Conversational Systems |
Authors | Maira Gatti de Bayser, Paulo Cavalin, Renan Souza, Alan Braz, Heloisa Candello, Claudio Pinhanez, Jean-Pierre Briot |
Abstract | Multi-party Conversational Systems are systems with natural language interaction between one or more people or systems. From the moment that an utterance is sent to a group to the moment that a member replies to it in the group, several activities must be carried out by the system: utterance understanding, information search, and reasoning, among others. In this paper we present the challenges of designing and building multi-party conversational systems, the state of the art, our proposed hybrid architecture using both rules and machine learning, and some insights after implementing and evaluating one in the finance domain. |
Tasks | |
Published | 2017-05-03 |
URL | http://arxiv.org/abs/1705.01214v2 |
PDF | http://arxiv.org/pdf/1705.01214v2.pdf |
PWC | https://paperswithcode.com/paper/a-hybrid-architecture-for-multi-party |
Repo | |
Framework | |
From Plants to Landmarks: Time-invariant Plant Localization that uses Deep Pose Regression in Agricultural Fields
Title | From Plants to Landmarks: Time-invariant Plant Localization that uses Deep Pose Regression in Agricultural Fields |
Authors | Florian Kraemer, Alexander Schaefer, Andreas Eitel, Johan Vertens, Wolfram Burgard |
Abstract | Agricultural robots are expected to increase yields in a sustainable way and automate precision tasks, such as weeding and plant monitoring. At the same time, they move in a continuously changing, semi-structured field environment, in which features can hardly be found and reproduced at a later time. Challenges for Lidar and visual detection systems stem from the fact that plants can be very small, overlapping and have a steadily changing appearance. Therefore, a popular way to localize vehicles with high accuracy is based on expensive global navigation satellite systems and not on natural landmarks. The contribution of this work is a novel image-based plant localization technique that uses the time-invariant stem emerging point as a reference. Our approach is based on a fully convolutional neural network that learns landmark localization from RGB and NIR image input in an end-to-end manner. The network performs pose regression to generate a plant location likelihood map. Our approach allows us to cope with visual variances of plants both for different species and different growth stages. We achieve high localization accuracies as shown in detailed evaluations of a sugar beet cultivation phase. In experiments with our BoniRob we demonstrate that detections can be robustly reproduced with centimeter accuracy. |
Tasks | |
Published | 2017-09-14 |
URL | http://arxiv.org/abs/1709.04751v1 |
PDF | http://arxiv.org/pdf/1709.04751v1.pdf |
PWC | https://paperswithcode.com/paper/from-plants-to-landmarks-time-invariant-plant |
Repo | |
Framework | |
Ranking and Selection as Stochastic Control
Title | Ranking and Selection as Stochastic Control |
Authors | Yijie Peng, Edwin K. P. Chong, Chun-Hung Chen, Michael C. Fu |
Abstract | Under a Bayesian framework, we formulate the fully sequential sampling and selection decision in statistical ranking and selection as a stochastic control problem, and derive the associated Bellman equation. Using value function approximation, we derive an approximately optimal allocation policy. We show that this policy is not only computationally efficient but also possesses both one-step-ahead and asymptotic optimality for independent normal sampling distributions. Moreover, the proposed allocation policy is easily generalizable in the approximate dynamic programming paradigm. |
Tasks | |
Published | 2017-10-07 |
URL | http://arxiv.org/abs/1710.02619v1 |
PDF | http://arxiv.org/pdf/1710.02619v1.pdf |
PWC | https://paperswithcode.com/paper/ranking-and-selection-as-stochastic-control |
Repo | |
Framework | |
Multitask Learning with CTC and Segmental CRF for Speech Recognition
Title | Multitask Learning with CTC and Segmental CRF for Speech Recognition |
Authors | Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith |
Abstract | Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models. Both models define a transcription probability by marginalizing decisions about latent segmentation alternatives to derive a sequence probability: the former uses a globally normalized joint model of segment labels and durations, and the latter classifies each frame as either an output symbol or a “continuation” of the previous label. In this paper, we train a recognition model by optimizing an interpolation between the SCRF and CTC losses, where the same recurrent neural network (RNN) encoder is used for feature extraction for both outputs. We find that this multitask objective improves recognition accuracy when decoding with either the SCRF or CTC models. Additionally, we show that CTC can also be used to pretrain the RNN encoder, which improves the convergence rate when learning the joint model. |
Tasks | Speech Recognition |
Published | 2017-02-21 |
URL | http://arxiv.org/abs/1702.06378v4 |
PDF | http://arxiv.org/pdf/1702.06378v4.pdf |
PWC | https://paperswithcode.com/paper/multitask-learning-with-ctc-and-segmental-crf |
Repo | |
Framework | |
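The "interpolation between the SCRF and CTC losses" mentioned in the multitask abstract above can be written as a single objective over the shared encoder parameters $\theta$. The symbol $\lambda$ and the choice of which term it multiplies are notation assumed here; the abstract does not give the exact form or the value used.

$$
\mathcal{L}(\theta) = \lambda\,\mathcal{L}_{\mathrm{CTC}}(\theta) + (1-\lambda)\,\mathcal{L}_{\mathrm{SCRF}}(\theta), \qquad 0 \le \lambda \le 1
$$

Setting $\lambda=1$ recovers pure CTC training and $\lambda=0$ recovers pure SCRF training, so the two single-task models are the endpoints of the same family.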
Competing Bandits: Learning under Competition
Title | Competing Bandits: Learning under Competition |
Authors | Yishay Mansour, Aleksandrs Slivkins, Zhiwei Steven Wu |
Abstract | Most modern systems strive to learn from interactions with users, and many engage in exploration: making potentially suboptimal choices for the sake of acquiring new information. We initiate a study of the interplay between exploration and competition: how such systems balance the exploration for learning and the competition for users. Here the users play three distinct roles: they are customers that generate revenue, they are sources of data for learning, and they are self-interested agents which choose among the competing systems. In our model, we consider competition between two multi-armed bandit algorithms faced with the same bandit instance. Users arrive one by one and choose between the two algorithms, so that each algorithm makes progress if and only if it is chosen. We ask whether and to what extent competition incentivizes the adoption of better bandit algorithms. We investigate this issue for several models of user response, as we vary the degree of rationality and competitiveness in the model. Our findings are closely related to the “competition vs. innovation” relationship, a well-studied theme in economics. |
Tasks | |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08533v2 |
PDF | http://arxiv.org/pdf/1702.08533v2.pdf |
PWC | https://paperswithcode.com/paper/competing-bandits-learning-under-competition |
Repo | |
Framework | |
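The setting in the Competing Bandits abstract above, where two algorithms face the same bandit instance and each learns only from the users who choose it, can be simulated in a few lines. This is a rough sketch under stated assumptions: users here pick an algorithm uniformly at random, which is only one possible user-response model, and the epsilon-greedy learner, Bernoulli rewards, and all names are illustrative choices rather than the paper's.

```python
import random


class EpsilonGreedy:
    """Simple bandit learner; epsilon=0.0 gives a purely greedy algorithm."""

    def __init__(self, n_arms, epsilon, rng):
        self.epsilon = epsilon
        self.rng = rng
        self.counts = [0] * n_arms
        self.sums = [0.0] * n_arms

    def select_arm(self):
        untried = [arm for arm, count in enumerate(self.counts) if count == 0]
        if untried:
            return untried[0]  # try every arm once before exploiting
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        means = [s / c for s, c in zip(self.sums, self.counts)]
        return means.index(max(means))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward


def simulate(arm_means, algorithms, n_users, rng):
    """Each user picks one algorithm; only that algorithm pulls and learns."""
    users_served = [0] * len(algorithms)
    for _ in range(n_users):
        chosen = rng.randrange(len(algorithms))  # assumed user-response model
        algo = algorithms[chosen]
        arm = algo.select_arm()
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        algo.update(arm, reward)
        users_served[chosen] += 1
    return users_served


rng = random.Random(0)
arm_means = [0.2, 0.5, 0.8]  # the shared bandit instance
algorithms = [EpsilonGreedy(3, 0.0, rng), EpsilonGreedy(3, 0.1, rng)]
users_served = simulate(arm_means, algorithms, 1000, rng)
```

Replacing the uniform choice in `simulate` with a rule that favours the algorithm with the better track record would model more rational users, which is the kind of variation the abstract describes.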
Learning RBM with a DC programming Approach
Title | Learning RBM with a DC programming Approach |
Authors | Vidyadhar Upadhya, P. S. Sastry |
Abstract | By exploiting the property that the RBM log-likelihood function is the difference of convex functions, we formulate a stochastic variant of the difference of convex functions (DC) programming to minimize the negative log-likelihood. Interestingly, the traditional contrastive divergence algorithm is a special case of the above formulation and the hyperparameters of the two algorithms can be chosen such that the amount of computation per mini-batch is identical. We show that for a given computational budget the proposed algorithm almost always reaches a higher log-likelihood more rapidly, compared to the standard contrastive divergence algorithm. Further, we modify this algorithm to use the centered gradients and show that it is more efficient and effective compared to the standard centered gradient algorithm on benchmark datasets. |
Tasks | |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07149v2 |
PDF | http://arxiv.org/pdf/1709.07149v2.pdf |
PWC | https://paperswithcode.com/paper/learning-rbm-with-a-dc-programming-approach |
Repo | |
Framework | |
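The difference-of-convex structure that the RBM abstract above relies on can be made explicit. For an RBM with visible units $v$, hidden units $h$, and an energy $E(v,h;\theta)$ that is linear in the parameters $\theta$, the negative log-likelihood of one sample splits into two log-sum-exp terms. The symbols $f$, $g$, and $\theta^{(t)}$ are notation assumed here.

$$
-\log p(v;\theta) \;=\; \underbrace{\log \sum_{v',h} e^{-E(v',h;\theta)}}_{f(\theta)} \;-\; \underbrace{\log \sum_{h} e^{-E(v,h;\theta)}}_{g(\theta)}
$$

Each term is a log-sum-exp of functions linear in $\theta$ and is therefore convex, so the objective is $f-g$ with both $f$ and $g$ convex. A generic (deterministic) DC programming step linearizes $g$ at the current iterate and minimizes the resulting convex surrogate:

$$
\theta^{(t+1)} \in \arg\min_{\theta}\; f(\theta) - \theta^{\top}\nabla g\!\left(\theta^{(t)}\right)
$$

The abstract describes a stochastic variant of this scheme; how the inner minimization is approximated is not stated there and is not reproduced here.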
Training Deep Networks to be Spatially Sensitive
Title | Training Deep Networks to be Spatially Sensitive |
Authors | Nicholas Kolkin, Gregory Shakhnarovich, Eli Shechtman |
Abstract | In many computer vision tasks, for example saliency prediction or semantic segmentation, the desired output is a foreground map that predicts pixels where some criterion is satisfied. Despite the inherently spatial nature of this task, commonly used learning objectives do not incorporate the spatial relationships between misclassified pixels and the underlying ground truth. The Weighted F-measure, a recently proposed evaluation metric, does reweight errors spatially, and has been shown to closely correlate with human evaluation of quality, and stably rank predictions with respect to noisy ground truths (such as a sloppy human annotator might generate). However, it suffers from computational complexity which makes it intractable as an optimization objective for gradient descent, which must be evaluated thousands or millions of times while learning a model’s parameters. We propose a differentiable and efficient approximation of this metric. By incorporating spatial information into the objective we can use a simpler model than competing methods without sacrificing accuracy, resulting in faster inference speeds and alleviating the need for pre/post-processing. We match (or improve) performance on several tasks compared to prior state of the art by traditional metrics, and in many cases significantly improve performance by the weighted F-measure. |
Tasks | Saliency Prediction, Semantic Segmentation |
Published | 2017-08-07 |
URL | http://arxiv.org/abs/1708.02212v1 |
PDF | http://arxiv.org/pdf/1708.02212v1.pdf |
PWC | https://paperswithcode.com/paper/training-deep-networks-to-be-spatially |
Repo | |
Framework | |
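As background for the spatial-sensitivity abstract above, the sketch below computes a plain "soft" F-measure, in which predicted foreground probabilities replace hard 0/1 decisions so that the score varies smoothly with the predictions. It is deliberately not the Weighted F-measure and not the approximation proposed in the paper, neither of which is specified in the abstract; the function name, the flattened list representation, and the `eps` stabilizer are assumptions.

```python
def soft_f_measure(pred, target, beta=1.0, eps=1e-8):
    """F-beta score computed from soft predictions in [0, 1].

    pred   -- predicted foreground probabilities, one per pixel
    target -- ground-truth labels (0 or 1), one per pixel
    """
    true_positive = sum(p * t for p, t in zip(pred, target))
    precision = true_positive / (sum(pred) + eps)
    recall = true_positive / (sum(target) + eps)
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall + eps)


target = [1, 0, 1, 0]
perfect = soft_f_measure([1.0, 0.0, 1.0, 0.0], target)
inverted = soft_f_measure([0.0, 1.0, 0.0, 1.0], target)
uncertain = soft_f_measure([0.5, 0.5, 0.5, 0.5], target)
```

This score treats every pixel independently of its location, so two errors count the same whether they sit next to the true foreground or far from it. That location-blindness is the limitation the abstract attributes to commonly used objectives.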