Paper Group ANR 58
Illumination-invariant image mosaic calculation based on logarithmic search
Title | Illumination-invariant image mosaic calculation based on logarithmic search |
Authors | Wolfgang Konen |
Abstract | This technical report describes an improved image mosaicking algorithm. It is based on Jain’s logarithmic search algorithm [Jain 1981], which is coupled to the method of Kourogi (1999) for matching images in a video sequence. Logarithmic search is more invariant to illumination changes than the original optical-flow-based method of Kourogi. |
Tasks | Optical Flow Estimation |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06433v1 |
http://arxiv.org/pdf/1603.06433v1.pdf | |
PWC | https://paperswithcode.com/paper/illumination-invariant-image-mosaic |
Repo | |
Framework | |
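The abstract above builds on logarithmic search for matching image blocks. The following is a minimal sketch of a 2D logarithmic block search, not the report's implementation: the function names, the mean-absolute-difference cost, the block size, and the initial step are all assumptions chosen for illustration. The search probes a plus-shaped pattern around the current best displacement, halves the step when the centre wins, and finishes with a 3x3 refinement at step 1.

```python
import numpy as np

def block_cost(ref, cur, top, left, dy, dx, size):
    """Mean absolute difference between a block in `cur` and the
    displaced block in `ref` (assumed cost; infinite if out of bounds)."""
    h, w = ref.shape
    y, x = top + dy, left + dx
    if y < 0 or x < 0 or y + size > h or x + size > w:
        return float("inf")
    a = cur[top:top + size, left:left + size].astype(np.float64)
    b = ref[y:y + size, x:x + size].astype(np.float64)
    return float(np.mean(np.abs(a - b)))

def log_search(ref, cur, top, left, size=16, step=4):
    """Return ((dy, dx), cost) for the block at (top, left) in `cur`."""
    best = (0, 0)
    best_cost = block_cost(ref, cur, top, left, 0, 0, size)
    while True:
        if step == 1:
            # Final refinement: all 8 neighbours plus the centre.
            offsets = [(oy, ox) for oy in (-1, 0, 1) for ox in (-1, 0, 1)]
        else:
            # Coarse stage: centre plus four points at distance `step`.
            offsets = [(0, 0), (-step, 0), (step, 0), (0, -step), (0, step)]
        cand, cand_cost = best, best_cost
        for oy, ox in offsets:
            d = (best[0] + oy, best[1] + ox)
            c = block_cost(ref, cur, top, left, d[0], d[1], size)
            if c < cand_cost:
                cand, cand_cost = d, c
        if step == 1:
            return cand, cand_cost
        if cand == best:
            step //= 2  # centre is best: shrink the search pattern
        best, best_cost = cand, cand_cost
```

Because each step probes only a handful of positions, the number of cost evaluations grows roughly with the logarithm of the search range rather than with its area. The sketch assumes the matching cost decreases towards the true displacement; on textures where that does not hold, the search can settle on a local minimum.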
Context Matters: Refining Object Detection in Video with Recurrent Neural Networks
Title | Context Matters: Refining Object Detection in Video with Recurrent Neural Networks |
Authors | Subarna Tripathi, Zachary C. Lipton, Serge Belongie, Truong Nguyen |
Abstract | Given the vast amounts of video available online, and recent breakthroughs in object detection with static images, object detection in video offers a promising new frontier. However, motion blur and compression artifacts cause substantial frame-level variability, even in videos that appear smooth to the eye. Additionally, video datasets tend to have sparsely annotated frames. We present a new framework for improving object detection in videos that captures temporal context and encourages consistency of predictions. First, we train a pseudo-labeler, that is, a domain-adapted convolutional neural network for object detection. The pseudo-labeler is first trained individually on the subset of labeled frames, and then subsequently applied to all frames. Then we train a recurrent neural network that takes as input sequences of pseudo-labeled frames and optimizes an objective that encourages both accuracy on the target frame and consistency across consecutive frames. The approach incorporates strong supervision of target frames, weak-supervision on context frames, and regularization via a smoothness penalty. Our approach achieves mean Average Precision (mAP) of 68.73, an improvement of 7.1 over the strongest image-based baselines for the Youtube-Video Objects dataset. Our experiments demonstrate that neighboring frames can provide valuable information, even absent labels. |
Tasks | Object Detection |
Published | 2016-07-15 |
URL | http://arxiv.org/abs/1607.04648v2 |
http://arxiv.org/pdf/1607.04648v2.pdf | |
PWC | https://paperswithcode.com/paper/context-matters-refining-object-detection-in |
Repo | |
Framework | |
Support Driven Wavelet Frame-based Image Deblurring
Title | Support Driven Wavelet Frame-based Image Deblurring |
Authors | Liangtian He, Yilun Wang, Zhaoyin Xiang |
Abstract | The wavelet frame systems have been playing an active role in image restoration and many other image processing fields over the past decades, owing to the good capability of sparsely approximating piece-wise smooth functions such as images. In this paper, we propose a novel wavelet frame based sparse recovery model called Support Driven Sparse Regularization (SDSR) for image deblurring, where the partial support information of frame coefficients is attained via a self-learning strategy and exploited via the proposed truncated $\ell_0$ regularization. Moreover, the state-of-the-art image restoration methods can be naturally incorporated into our proposed wavelet frame based sparse recovery framework. In particular, in order to achieve reliable support estimation of the frame coefficients, we make use of the state-of-the-art image restoration result such as that from the IDD-BM3D method as the initial reference image for support estimation. Our extensive experimental results have shown convincing improvements over existing state-of-the-art deblurring methods. |
Tasks | Deblurring, Image Restoration |
Published | 2016-03-26 |
URL | http://arxiv.org/abs/1603.08108v1 |
http://arxiv.org/pdf/1603.08108v1.pdf | |
PWC | https://paperswithcode.com/paper/support-driven-wavelet-frame-based-image |
Repo | |
Framework | |
Searching Scenes by Abstracting Things
Title | Searching Scenes by Abstracting Things |
Authors | Svetlana Kordumova, Jan C. van Gemert, Cees G. M. Snoek, Arnold W. M. Smeulders |
Abstract | In this paper we propose to represent a scene as an abstraction of ‘things’. We start from ‘things’ as generated by modern object proposals, and we investigate their immediately observable properties: position, size, aspect ratio and color, and those only. Where the recent successes and excitement of the field lie in object identification, we represent the scene composition independent of object identities. We make three contributions in this work. First, we study simple observable properties of ‘things’, and call it things syntax. Second, we propose translating the things syntax into linguistic abstract statements and study their descriptive effect to retrieve scenes. Third, we propose querying of scenes with abstract block illustrations and study their effectiveness to discriminate among different types of scenes. The benefit of abstract statements and block illustrations is that we generate them directly from the images, without any learning beforehand as in the standard attribute learning. Surprisingly, we show that even though we use the simplest of features from ‘things’ layout and no learning at all, we can still retrieve scenes reasonably well. |
Tasks | |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.01801v1 |
http://arxiv.org/pdf/1610.01801v1.pdf | |
PWC | https://paperswithcode.com/paper/searching-scenes-by-abstracting-things |
Repo | |
Framework | |
Character-Level Question Answering with Attention
Title | Character-Level Question Answering with Attention |
Authors | David Golub, Xiaodong He |
Abstract | We show that a character-level encoder-decoder framework can be successfully applied to question answering with a structured knowledge base. We use our model for single-relation question answering and demonstrate the effectiveness of our approach on the SimpleQuestions dataset (Bordes et al., 2015), where we improve state-of-the-art accuracy from 63.9% to 70.9%, without use of ensembles. Importantly, our character-level model has 16x fewer parameters than an equivalent word-level model, can be learned with significantly less data compared to previous work, which relies on data augmentation, and is robust to new entities in testing. |
Tasks | Data Augmentation, Question Answering |
Published | 2016-04-04 |
URL | http://arxiv.org/abs/1604.00727v4 |
http://arxiv.org/pdf/1604.00727v4.pdf | |
PWC | https://paperswithcode.com/paper/character-level-question-answering-with |
Repo | |
Framework | |
Deep Image Set Hashing
Title | Deep Image Set Hashing |
Authors | Jie Feng, Svebor Karaman, I-Hong Jhuo, Shih-Fu Chang |
Abstract | In applications involving matching of image sets, the information from multiple images must be effectively exploited to represent each set. State-of-the-art methods use probabilistic distribution or subspace to model a set and use specific distance measure to compare two sets. These methods are slow to compute and not compact to use in a large scale scenario. Learning-based hashing is often used in large scale image retrieval as they provide a compact representation of each sample and the Hamming distance can be used to efficiently compare two samples. However, most hashing methods encode each image separately and discard knowledge that multiple images in the same set represent the same object or person. We investigate the set hashing problem by combining both set representation and hashing in a single deep neural network. An image set is first passed to a CNN module to extract image features, then these features are aggregated using two types of set feature to capture both set specific and database-wide distribution information. The computed set feature is then fed into a multilayer perceptron to learn a compact binary embedding. Triplet loss is used to train the network by forming set similarity relations using class labels. We extensively evaluate our approach on datasets used for image matching and show highly competitive performance compared to state-of-the-art methods. |
Tasks | Image Retrieval |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05381v2 |
http://arxiv.org/pdf/1606.05381v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-image-set-hashing |
Repo | |
Framework | |
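The abstract above relies on the fact that compact binary codes can be compared efficiently with the Hamming distance. As a minimal sketch of that comparison step only (the function name and the toy 4-bit codes are hypothetical, and nothing here reflects the paper's network or training), the distance from one query code to every database code can be computed as:

```python
import numpy as np

def hamming_distances(query, database):
    """Number of differing bits between `query` (shape [bits]) and
    each row of `database` (shape [n, bits])."""
    query = np.asarray(query, dtype=bool)
    database = np.asarray(database, dtype=bool)
    return np.count_nonzero(database != query, axis=1)

# Toy example with made-up 4-bit codes.
codes = [[1, 0, 1, 1],
         [0, 0, 1, 0],
         [0, 1, 0, 0]]
print(hamming_distances([1, 0, 1, 1], codes))  # -> [0 2 4]
```

Ranking database entries by this distance gives a simple retrieval order; in practice codes are often bit-packed so that the comparison reduces to XOR and popcount operations.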
Kernel-based methods for bandit convex optimization
Title | Kernel-based methods for bandit convex optimization |
Authors | Sébastien Bubeck, Ronen Eldan, Yin Tat Lee |
Abstract | We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$-regret for this problem. To do so we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves $\tilde{O}(n^{9.5} \sqrt{T})$-regret, and we show that a simple variant of this algorithm can be run in $\mathrm{poly}(n \log(T))$-time per step at the cost of an additional $\mathrm{poly}(n) T^{o(1)}$ factor in the regret. These results improve upon the $\tilde{O}(n^{11} \sqrt{T})$-regret and $\exp(\mathrm{poly}(T))$-time result of the first two authors, and the $\log(T)^{\mathrm{poly}(n)} \sqrt{T}$-regret and $\log(T)^{\mathrm{poly}(n)}$-time result of Hazan and Li. Furthermore we conjecture that another variant of the algorithm could achieve $\tilde{O}(n^{1.5} \sqrt{T})$-regret, and moreover that this regret is unimprovable (the current best lower bound being $\Omega(n \sqrt{T})$ and it is achieved with linear functions). For the simpler situation of zeroth order stochastic convex optimization this corresponds to the conjecture that the optimal query complexity is of order $n^3 / \epsilon^2$. |
Tasks | |
Published | 2016-07-11 |
URL | http://arxiv.org/abs/1607.03084v1 |
http://arxiv.org/pdf/1607.03084v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-based-methods-for-bandit-convex |
Repo | |
Framework | |
A Formal Calculus for International Relations Computation and Evaluation
Title | A Formal Calculus for International Relations Computation and Evaluation |
Authors | Mohd Anuar Mat Isa, Ramlan Mahmod, Nur Izura Udzir, Jamalul-lail Ab Manan, Audun Jøsang, Ali Dehghan Tanha |
Abstract | This publication presents a relation computation, or calculus, for international relations using mathematical modeling. It examines trust in international relations and its calculus, which is related to Bayesian inference, Dempster-Shafer theory and subjective logic. In our review of the literature, we found no work discussing a calculus method for international relations. To bridge this research gap, we propose a relation algebra method for international relations computation. The proposed method allows computation of relations that were previously subjective and incomputable. We also present three international relations as case studies to demonstrate the proposed method in a real-world scenario. The method delivers relation computations that support decision makers in a government, such as a foreign ministry, defense ministry, or presidential or prime minister's office. The Department of Defense (DoD) may use our method to determine whether a nation can be identified as friendly, neutral or hostile. |
Tasks | Bayesian Inference |
Published | 2016-04-02 |
URL | http://arxiv.org/abs/1606.02239v1 |
http://arxiv.org/pdf/1606.02239v1.pdf | |
PWC | https://paperswithcode.com/paper/a-formal-calculus-for-international-relations |
Repo | |
Framework | |
Options Discovery with Budgeted Reinforcement Learning
Title | Options Discovery with Budgeted Reinforcement Learning |
Authors | Aurélia Léon, Ludovic Denoyer |
Abstract | We consider the problem of learning hierarchical policies for Reinforcement Learning able to discover options, an option corresponding to a sub-policy over a set of primitive actions. Different models have been proposed during the last decade that usually rely on a predefined set of options. We specifically address the problem of automatically discovering options in decision processes. We describe a new learning model called Budgeted Option Neural Network (BONN) able to discover options based on a budgeted learning objective. The BONN model is evaluated on different classical RL problems, demonstrating interesting quantitative and qualitative results. |
Tasks | |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06824v3 |
http://arxiv.org/pdf/1611.06824v3.pdf | |
PWC | https://paperswithcode.com/paper/options-discovery-with-budgeted-reinforcement |
Repo | |
Framework | |
FALCON: Feature Driven Selective Classification for Energy-Efficient Image Recognition
Title | FALCON: Feature Driven Selective Classification for Energy-Efficient Image Recognition |
Authors | Priyadarshini Panda, Aayush Ankit, Parami Wijesinghe, Kaushik Roy |
Abstract | Machine-learning algorithms have shown outstanding image recognition or classification performance for computer vision applications. However, the compute and energy requirement for implementing such classifier models for large-scale problems is quite high. In this paper, we propose Feature Driven Selective Classification (FALCON) inspired by the biological visual attention mechanism in the brain to optimize the energy-efficiency of machine-learning classifiers. We use the consensus in the characteristic features (color/texture) across images in a dataset to decompose the original classification problem and construct a tree of classifiers (nodes) with a generic-to-specific transition in the classification hierarchy. The initial nodes of the tree separate the instances based on feature information and selectively enable the latter nodes to perform object specific classification. The proposed methodology allows selective activation of only those branches and nodes of the classification tree that are relevant to the input while keeping the remaining nodes idle. Additionally, we propose a programmable and scalable Neuromorphic Engine (NeuE) that utilizes arrays of specialized neural computational elements to execute the FALCON based classifier models for diverse datasets. The structure of FALCON facilitates the reuse of nodes while scaling up from small classification problems to larger ones thus allowing us to construct classifier implementations that are significantly more efficient. We evaluate our approach for a 12-object classification task on the Caltech101 dataset and 10-object task on CIFAR-10 dataset by constructing FALCON models on the NeuE platform in 45nm technology. Our results demonstrate significant improvement in energy-efficiency and training time for minimal loss in output quality. |
Tasks | Object Classification |
Published | 2016-09-12 |
URL | http://arxiv.org/abs/1609.03396v2 |
http://arxiv.org/pdf/1609.03396v2.pdf | |
PWC | https://paperswithcode.com/paper/falcon-feature-driven-selective |
Repo | |
Framework | |
Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding
Title | Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding |
Authors | Jaeyong Sung, Seok Hyun Jin, Ian Lenz, Ashutosh Saxena |
Abstract | There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and learn to transfer manipulation strategy across different objects by embedding point-cloud, natural language, and manipulation trajectory data into a shared embedding space using a deep neural network. In order to learn semantically meaningful spaces throughout our network, we introduce a method for pre-training its lower layers for multimodal feature embedding and a method for fine-tuning this embedding space using a loss-based margin. In order to collect a large number of manipulation demonstrations for different objects, we develop a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects and appliances with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot with our model can even prepare a cup of a latte with appliances it has never seen before. |
Tasks | Structured Prediction |
Published | 2016-01-12 |
URL | http://arxiv.org/abs/1601.02705v1 |
http://arxiv.org/pdf/1601.02705v1.pdf | |
PWC | https://paperswithcode.com/paper/robobarista-learning-to-manipulate-novel |
Repo | |
Framework | |
SLA Violation Prediction In Cloud Computing: A Machine Learning Perspective
Title | SLA Violation Prediction In Cloud Computing: A Machine Learning Perspective |
Authors | Reyhane Askari Hemmat, Abdelhakim Hafid |
Abstract | Service level agreement (SLA) is an essential part of cloud systems to ensure maximum availability of services for customers. With a violation of SLA, the provider has to pay penalties. In this paper, we explore two machine learning models: Naive Bayes and Random Forest Classifiers to predict SLA violations. Since SLA violations are a rare event in the real world (~0.2 %), the classification task becomes more challenging. In order to overcome these challenges, we use several re-sampling methods. We find that random forests with SMOTE-ENN re-sampling have the best performance among other methods with the accuracy of 99.88 % and F_1 score of 0.9980. |
Tasks | |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10338v1 |
http://arxiv.org/pdf/1611.10338v1.pdf | |
PWC | https://paperswithcode.com/paper/sla-violation-prediction-in-cloud-computing-a |
Repo | |
Framework | |
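The abstract above notes that SLA violations are rare (about 0.2 %), which makes classification harder. A small worked example can show why accuracy alone is a weak measure in that setting. The sample size of 100,000 and the always-negative classifier below are hypothetical; only the 0.2 % rate comes from the abstract.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Accuracy and F1 score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    return accuracy, 2 * precision * recall / (precision + recall)

# Hypothetical data set: 100,000 samples, 0.2 % of them violations.
# A classifier that always predicts "no violation" misses all 200.
acc, f1 = accuracy_and_f1(tp=0, fp=0, fn=200, tn=99_800)
print(acc, f1)  # -> 0.998 0.0
```

The trivial classifier reaches 99.8 % accuracy while detecting no violations at all, which is why a score such as F1 and re-sampling of the minority class are commonly used for problems with this kind of imbalance.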
Efficient Distributed Semi-Supervised Learning using Stochastic Regularization over Affinity Graphs
Title | Efficient Distributed Semi-Supervised Learning using Stochastic Regularization over Affinity Graphs |
Authors | Sunil Thulasidasan, Jeffrey Bilmes, Garrett Kenyon |
Abstract | We describe a computationally efficient, stochastic graph-regularization technique that can be utilized for the semi-supervised training of deep neural networks in a parallel or distributed setting. We utilize a technique, first described in [13], for the construction of mini-batches for stochastic gradient descent (SGD) based on synthesized partitions of an affinity graph that are consistent with the graph structure, but also preserve enough stochasticity for convergence of SGD to good local minima. We show how our technique allows a graph-based semi-supervised loss function to be decomposed into a sum over objectives, facilitating data parallelism for scalable training of machine learning models. Empirical results indicate that our method significantly improves classification accuracy compared to the fully-supervised case when the fraction of labeled data is low, and in the parallel case, achieves significant speed-up in terms of wall-clock time to convergence. We show the results for both sequential and distributed-memory semi-supervised DNN training on a speech corpus. |
Tasks | |
Published | 2016-12-15 |
URL | http://arxiv.org/abs/1612.04898v2 |
http://arxiv.org/pdf/1612.04898v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-distributed-semi-supervised |
Repo | |
Framework | |
PMI Matrix Approximations with Applications to Neural Language Modeling
Title | PMI Matrix Approximations with Applications to Neural Language Modeling |
Authors | Oren Melamud, Ido Dagan, Jacob Goldberger |
Abstract | The negative sampling (NEG) objective function, used in word2vec, is a simplification of the Noise Contrastive Estimation (NCE) method. NEG was found to be highly effective in learning continuous word representations. However, unlike NCE, it was considered inapplicable for the purpose of learning the parameters of a language model. In this study, we refute this assertion by providing a principled derivation for NEG-based language modeling, founded on a novel analysis of a low-dimensional approximation of the matrix of pointwise mutual information between the contexts and the predicted words. The obtained language modeling is closely related to NCE language models but is based on a simplified objective function. We thus provide a unified formulation for two main language processing tasks, namely word embedding and language modeling, based on the NEG objective function. Experimental results on two popular language modeling benchmarks show comparable perplexity results, with a small advantage to NEG over NCE. |
Tasks | Language Modelling |
Published | 2016-09-05 |
URL | http://arxiv.org/abs/1609.01235v1 |
http://arxiv.org/pdf/1609.01235v1.pdf | |
PWC | https://paperswithcode.com/paper/pmi-matrix-approximations-with-applications |
Repo | |
Framework | |
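The abstract above centres on the matrix of pointwise mutual information (PMI) between contexts and predicted words. As a minimal sketch of how such a matrix can be computed from raw co-occurrence counts, using the standard definition PMI(c, w) = log(p(c, w) / (p(c) p(w))): the function name and the row/column layout are assumptions, and the paper's low-dimensional approximation is not reproduced here.

```python
import numpy as np

def pmi_matrix(counts):
    """PMI from a co-occurrence count matrix (rows: contexts,
    columns: words). Assumes every row and column has at least one
    count; cells with zero counts come out as -inf."""
    counts = np.asarray(counts, dtype=np.float64)
    p_joint = counts / counts.sum()
    p_ctx = p_joint.sum(axis=1, keepdims=True)   # marginal over words
    p_word = p_joint.sum(axis=0, keepdims=True)  # marginal over contexts
    with np.errstate(divide="ignore"):
        return np.log(p_joint / (p_ctx * p_word))
```

When contexts and words co-occur exactly as often as independence would predict, every entry is 0; positive entries mark pairs that co-occur more often than chance.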
Enhancing Observability in Distribution Grids using Smart Meter Data
Title | Enhancing Observability in Distribution Grids using Smart Meter Data |
Authors | Siddharth Bhela, Vassilis Kekatos, Sriharsha Veeramachaneni |
Abstract | Due to limited metering infrastructure, distribution grids are currently challenged by observability issues. On the other hand, smart meter data, including local voltage magnitudes and power injections, are communicated to the utility operator from grid buses with renewable generation and demand-response programs. This work employs grid data from metered buses towards inferring the underlying grid state. To this end, a coupled formulation of the power flow problem (CPF) is put forth. Exploiting the high variability of injections at metered buses, the controllability of solar inverters, and the relative time-invariance of conventional loads, the idea is to solve the non-linear power flow equations jointly over consecutive time instants. An intuitive and easily verifiable rule pertaining to the locations of metered and non-metered buses on the physical grid is shown to be a necessary and sufficient criterion for local observability in radial networks. To account for noisy smart meter readings, a coupled power system state estimation (CPSSE) problem is further developed. Both CPF and CPSSE tasks are tackled via augmented semi-definite program relaxations. The observability criterion along with the CPF and CPSSE solvers are numerically corroborated using synthetic and actual solar generation and load data on the IEEE 34-bus benchmark feeder. |
Tasks | |
Published | 2016-12-20 |
URL | http://arxiv.org/abs/1612.06669v1 |
http://arxiv.org/pdf/1612.06669v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-observability-in-distribution-grids |
Repo | |
Framework | |