Paper Group ANR 375
Robust Object Tracking Based on Self-adaptive Search Area
Title | Robust Object Tracking Based on Self-adaptive Search Area |
Authors | Taihang Dong, Sheng Zhong |
Abstract | Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer from boundary effects, which result in unstable performance in challenging situations involving fast motion. In this paper, we propose a novel method to mitigate this side effect in DCF based trackers. We change the search area according to the predicted target motion. When the object moves fast, a broad search area alleviates boundary effects and preserves the chance of locating the object. When the object moves slowly, a narrow search area suppresses the influence of useless background information and improves computational efficiency, attaining real-time performance. This strategy substantially mitigates boundary effects in situations involving fast motion and motion blur, and it can be applied to almost all DCF based trackers. Experiments on the OTB benchmark show that the proposed framework improves performance over the baseline trackers. |
Tasks | Object Tracking |
Published | 2017-11-21 |
URL | https://arxiv.org/abs/1711.07835v2 |
PDF | https://arxiv.org/pdf/1711.07835v2.pdf |
PWC | https://paperswithcode.com/paper/robust-object-tracking-based-on-self-adaptive |
Repo | |
Framework | |
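The speed-dependent search-area strategy the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the constant-velocity predictor, the speed thresholds, and the scale factors are all invented for the example.

```python
import numpy as np

def adaptive_search_area(prev_positions, base_size,
                         slow_thresh=2.0, fast_thresh=8.0):
    """Choose a search-window scale from the target's recent motion.

    prev_positions: array-like of shape (k, 2), recent target centers.
    base_size: nominal search-window side length.
    Returns (predicted_center, search_size).
    """
    pos = np.asarray(prev_positions, dtype=float)
    velocity = pos[-1] - pos[-2]          # simple constant-velocity prediction
    speed = np.linalg.norm(velocity)
    predicted_center = pos[-1] + velocity
    if speed < slow_thresh:               # slow target: narrow area, less background
        scale = 1.5
    elif speed < fast_thresh:             # moderate motion: default padding
        scale = 2.5
    else:                                 # fast motion: widen to fight boundary effects
        scale = 4.0
    return predicted_center, base_size * scale
```

The DCF correlation response would then be computed over the returned window, so a fast-moving target stays inside the search region while a slow one is not swamped by background.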
Multi-View Task-Driven Recognition in Visual Sensor Networks
Title | Multi-View Task-Driven Recognition in Visual Sensor Networks |
Authors | Ali Taalimi, Alireza Rahimpour, Liu Liu, Hairong Qi |
Abstract | Nowadays, distributed smart cameras are deployed for a wide range of tasks in several application scenarios, from object recognition and image retrieval to forensic applications. Due to the limited bandwidth of distributed systems, efficient coding of local visual features has been an active topic of research. In this paper, we propose a novel approach to obtaining a compact representation of high-dimensional visual data using sensor fusion techniques. We cast the problem of visual analysis in resource-limited scenarios as multi-view representation learning, and we show that the key to finding a properly compressed representation is to exploit the relative positions of the cameras as a norm-based regularization within the sparse coding signal representation. Learning the representation of each camera is treated as an individual task, and multi-task learning with joint sparsity across all nodes is employed. The proposed representation learning scheme is referred to as multi-view task-driven learning for visual sensor networks (MT-VSN). We demonstrate that MT-VSN outperforms the state of the art in various surveillance recognition tasks. |
Tasks | Image Retrieval, Multi-Task Learning, Object Recognition, Representation Learning, Sensor Fusion |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10715v2 |
PDF | http://arxiv.org/pdf/1705.10715v2.pdf |
PWC | https://paperswithcode.com/paper/multi-view-task-driven-recognition-in-visual |
Repo | |
Framework | |
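The joint-sparsity idea, one coding task per camera with an l2,1-style penalty coupling the atom supports across views, can be sketched with a simple proximal-gradient loop. This is an assumed formulation for illustration only; the paper's MT-VSN objective additionally encodes relative camera positions, which this toy omits.

```python
import numpy as np

def joint_sparse_codes(signals, dicts, lam=0.1, step=0.01, iters=200):
    """Proximal-gradient sketch of multi-view coding with an l2,1 penalty.

    signals: list of (d,) arrays, one per camera view.
    dicts:   list of (d, k) dictionaries, one per view.
    Returns A of shape (k, n_views); the l2,1 penalty drives whole rows
    (atoms) to zero, so the views share a sparse support.
    """
    k, V = dicts[0].shape[1], len(signals)
    A = np.zeros((k, V))
    for _ in range(iters):
        # gradient step on the per-view least-squares terms
        for v in range(V):
            grad = dicts[v].T @ (dicts[v] @ A[:, v] - signals[v])
            A[:, v] -= step * grad
        # row-wise group soft-thresholding: the prox of step * lam * ||A||_{2,1}
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        shrink = np.maximum(1 - step * lam / np.maximum(norms, 1e-12), 0.0)
        A *= shrink
    return A
```

With a large enough penalty every row is zeroed out, which is the degenerate extreme of the joint-support behavior.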
Maximum Likelihood Estimation based on Random Subspace EDA: Application to Extrasolar Planet Detection
Title | Maximum Likelihood Estimation based on Random Subspace EDA: Application to Extrasolar Planet Detection |
Authors | Bin Liu, Ke-Jia Chen |
Abstract | This paper addresses maximum likelihood (ML) estimation based model fitting in the context of extrasolar planet detection. The problem is characterized by the following properties: 1) the candidate models under consideration are highly nonlinear; 2) the likelihood surface has a huge number of peaks; 3) the parameter space ranges in size from a few to dozens of dimensions. These properties make the ML search very challenging, as no analytical or gradient-based solution is available to explore the parameter space. A population-based search method, the estimation of distribution algorithm (EDA), is adopted to explore the model parameter space starting from a batch of random locations. EDAs are distinguished by their ability to reveal and exploit problem structure, a property desirable for characterizing the detections. However, it is well recognized that EDAs do not scale to large problems, as they consist of iterative random sampling and model fitting procedures and thus suffer from the well-known curse of dimensionality. We propose a novel mechanism that performs the EDA operations of sampling and model fitting in interacting random subspaces spanned by correlated variables, in the hope of alleviating the curse of dimensionality. The effectiveness of the proposed algorithm is verified via both benchmark numerical studies and real data analysis. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05761v2 |
PDF | http://arxiv.org/pdf/1704.05761v2.pdf |
PWC | https://paperswithcode.com/paper/maximum-likelihood-estimation-based-on-random |
Repo | |
Framework | |
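A toy version of an EDA that samples and fits its model only in random low-dimensional subspaces might look like this. It is a sketch under assumed design choices (diagonal Gaussian model, fixed subspace size, uniform subspace selection), not the authors' algorithm, which picks subspaces spanned by correlated variables.

```python
import numpy as np

def subspace_eda(objective, dim, pop_size=60, elite_frac=0.3,
                 subspace_dim=3, iters=150, bounds=(-5.0, 5.0), rng=None):
    """Minimize `objective` with a Gaussian EDA restricted, at each
    generation, to a random low-dimensional subspace of the search space."""
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    n_elite = int(pop_size * elite_frac)
    for _ in range(iters):
        fitness = np.apply_along_axis(objective, 1, pop)
        elite = pop[np.argsort(fitness)[:n_elite]]
        # pick a random subspace; model fitting and sampling happen only there
        dims = rng.choice(dim, size=min(subspace_dim, dim), replace=False)
        mu = elite[:, dims].mean(axis=0)
        sigma = elite[:, dims].std(axis=0) + 1e-6
        # rebuild the population from elites, resampling only the chosen dims
        pop = np.repeat(elite, pop_size // n_elite + 1, axis=0)[:pop_size].copy()
        pop[:, dims] = rng.normal(mu, sigma, size=(pop_size, len(dims)))
        pop = np.clip(pop, lo, hi)
        pop[0] = elite[0]                 # elitism: keep the incumbent best
    fitness = np.apply_along_axis(objective, 1, pop)
    return pop[np.argmin(fitness)]
```

Because the Gaussian is fit and sampled in only a few dimensions per generation, the model-fitting cost stays low even when `dim` grows, which is the mechanism the abstract proposes against the curse of dimensionality.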
It Takes Two to Tango: Towards Theory of AI’s Mind
Title | It Takes Two to Tango: Towards Theory of AI’s Mind |
Authors | Arjun Chandrasekaran, Deshraj Yadav, Prithvijit Chattopadhyay, Viraj Prabhu, Devi Parikh |
Abstract | Theory of Mind is the ability to attribute mental states (beliefs, intents, knowledge, perspectives, etc.) to others and to recognize that these mental states may differ from one’s own. Theory of Mind is critical to effective communication and to teams demonstrating higher collective performance. To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. Traditionally, there has been much emphasis on research to make AI more accurate, and (to a lesser extent) on having it better understand human intentions, tendencies, beliefs, and contexts. The latter involves making AI more human-like and having it develop a theory of our minds. In this work, we argue that for human-AI teams to be effective, humans must also develop a theory of AI’s mind (ToAIM) - get to know its strengths, weaknesses, beliefs, and quirks. We instantiate these ideas within the domain of Visual Question Answering (VQA). We find that using just a few examples (50), lay people can be trained to better predict the responses and impending failures of a complex VQA model. We further evaluate the role existing explanation (or interpretability) modalities play in helping humans build ToAIM. Explainable AI has received considerable scientific and popular attention in recent times. Surprisingly, we find that having access to the model’s internal states - its confidence in its top-k predictions, or the explicit and implicit attention maps highlighting the regions in the image (and words in the question) the model attends to while answering a question about an image - does not help people better predict its behavior. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00717v2 |
PDF | http://arxiv.org/pdf/1704.00717v2.pdf |
PWC | https://paperswithcode.com/paper/it-takes-two-to-tango-towards-theory-of-ais |
Repo | |
Framework | |
Submodular Mini-Batch Training in Generative Moment Matching Networks
Title | Submodular Mini-Batch Training in Generative Moment Matching Networks |
Authors | Jun Qi |
Abstract | This article was withdrawn because (1) it was uploaded without the co-authors’ knowledge or consent, and (2) there are allegations of plagiarism. |
Tasks | |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05721v3 |
PDF | http://arxiv.org/pdf/1707.05721v3.pdf |
PWC | https://paperswithcode.com/paper/submodular-mini-batch-training-in-generative |
Repo | |
Framework | |
On the Usability of Probably Approximately Correct Implication Bases
Title | On the Usability of Probably Approximately Correct Implication Bases |
Authors | Daniel Borchmann, Tom Hanika, Sergei Obiedkov |
Abstract | We revisit the notion of probably approximately correct implication bases from the literature and present a first formulation in the language of formal concept analysis, with the goal of investigating whether such bases represent a suitable substitute for exact implication bases in practical use cases. To this end, we quantitatively examine the behavior of probably approximately correct implication bases on artificial and real-world data sets and compare their precision and recall with respect to their corresponding exact implication bases. Using a small example, we also provide qualitative insight that implications from probably approximately correct bases can still represent meaningful knowledge from a given data set. |
Tasks | |
Published | 2017-01-04 |
URL | http://arxiv.org/abs/1701.00877v2 |
PDF | http://arxiv.org/pdf/1701.00877v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-usability-of-probably-approximately |
Repo | |
Framework | |
Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation
Title | Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation |
Authors | Parag Shridhar Chandakkar, Baoxin Li |
Abstract | Training deep CNNs to capture localized image artifacts on a relatively small dataset is a challenging task. With enough images at hand, one can hope that a deep CNN characterizes localized artifacts over the entire data and their effect on the output. On smaller datasets, however, deep CNNs may overfit while shallow ones struggle to capture local artifacts. Thus, some small-data image applications first train their framework on a collection of patches (instead of entire images) to better learn the representation of localized artifacts, then obtain the output by averaging the patch-level results. Such an approach ignores the spatial correlation among patches and how various patch locations affect the output. It also fails when only a few patches contribute to the image label. To handle these scenarios, we develop the notion of hyper-image representations. Our CNN has two stages. The first stage is trained on patches. The second stage uses the last-layer representation developed in the first stage to form a hyper-image, on which it is trained. We show that this approach develops a better mapping between the image and its output. We analyze additional properties of our approach and show its effectiveness on one synthetic and two real-world vision tasks - no-reference image quality estimation and image tampering detection - where it improves over existing strong baselines. |
Tasks | Image Quality Estimation |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.04945v1 |
PDF | http://arxiv.org/pdf/1711.04945v1.pdf |
PWC | https://paperswithcode.com/paper/capturing-localized-image-artifacts-through-a |
Repo | |
Framework | |
Akid: A Library for Neural Network Research and Production from a Dataism Approach
Title | Akid: A Library for Neural Network Research and Production from a Dataism Approach |
Authors | Shuai Li |
Abstract | Neural networks are a revolutionary but immature technique that is fast evolving and relies heavily on data. To benefit from the newest developments and newly available data, we want the gap between research and production to be as small as possible. On the other hand, unlike traditional machine learning models, a neural network is not just another statistical model, but a model of the natural processing engine, the brain. In this work, we describe a neural network library named akid. Building on the abstraction of signals as tensors provided by TensorFlow, it offers a higher level of abstraction for entities in nature (abstracted as blocks), reflecting the dataism observation that all entities in nature process inputs and emit outputs in some way. It comprises a full software stack: abstractions that let researchers focus on research instead of implementation, while the resulting program can be put into production seamlessly in a distributed environment. At the top of the stack, it provides out-of-the-box tools for neural network applications. Below that, akid offers a programming paradigm that lets users easily build customized models. The distributed computing stack handles concurrency and communication, so models can be trained or deployed on a single GPU, multiple GPUs, or a distributed environment without changing how a model is specified in the programming paradigm stack. Lastly, the distributed deployment stack controls how the distributed computation is deployed, decoupling the research prototyping environment from the actual production environment and allowing computing resources to be allocated dynamically, so that development (Dev) and operations (Ops) can be separated. Please refer to http://akid.readthedocs.io/en/latest/ for documentation. |
Tasks | |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00609v1 |
PDF | http://arxiv.org/pdf/1701.00609v1.pdf |
PWC | https://paperswithcode.com/paper/akid-a-library-for-neural-network-research |
Repo | |
Framework | |
On reducing the communication cost of the diffusion LMS algorithm
Title | On reducing the communication cost of the diffusion LMS algorithm |
Authors | Ibrahim El Khalil Harrane, Rémi Flamary, Cédric Richard |
Abstract | The rise of digital and mobile communications has made the world more connected and networked, resulting in an unprecedented volume of data flowing between sources, data centers, and processes. While these data may be processed in a centralized manner, distributed strategies such as diffusion are often more suitable, as they are scalable and can handle large amounts of data by distributing tasks over networked agents. Although it is relatively simple to implement diffusion strategies over a cluster, deploying them in an ad-hoc network with a limited energy budget for communication is challenging. In this paper, we introduce a diffusion LMS strategy that significantly reduces communication costs without compromising performance. We then analyze the proposed algorithm in the mean and mean-square sense, conduct numerical experiments to confirm the theoretical findings, and finally perform large-scale simulations to test the algorithm's efficiency in a scenario where energy is limited. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11423v2 |
PDF | http://arxiv.org/pdf/1711.11423v2.pdf |
PWC | https://paperswithcode.com/paper/on-reducing-the-communication-cost-of-the |
Repo | |
Framework | |
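The standard adapt-then-combine (ATC) diffusion LMS recursion that the paper builds on can be sketched as follows. The combination matrix and step size are illustrative; the paper's actual contribution, reducing what is communicated between agents, is not shown here.

```python
import numpy as np

def atc_diffusion_lms(X, d, C, mu=0.05, iters=500):
    """Adapt-then-combine diffusion LMS over a network of agents.

    X: (N, T, M) regressors for N agents over T time instants.
    d: (N, T) desired signals.
    C: (N, N) row-stochastic combination matrix; C[k, l] weighs
       neighbor l's intermediate estimate at agent k.
    Returns the (N, M) per-agent weight estimates.
    """
    N, T, M = X.shape
    W = np.zeros((N, M))
    for t in range(iters):
        i = t % T                         # cycle through the data stream
        # adapt: local LMS step at every agent
        psi = np.empty_like(W)
        for k in range(N):
            err = d[k, i] - X[k, i] @ W[k]
            psi[k] = W[k] + mu * err * X[k, i]
        # combine: weighted average of the neighbors' intermediate estimates
        W = C @ psi
    return W
```

The combine step is exactly where communication cost arises (each agent must receive its neighbors' `psi` vectors), which is what the proposed strategy trims.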
Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning
Title | Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning |
Authors | Shiyue Zhang, Pengtao Xie, Dong Wang, Eric P. Xing |
Abstract | A primary goal of computational phenotype research is to conduct medical diagnosis. In hospitals, physicians rely on massive clinical data to make diagnosis decisions, among which laboratory tests are one of the most important resources. However, the longitudinal and incomplete nature of laboratory test data poses a significant challenge to its interpretation and usage, which may lead to harmful decisions by both human physicians and automatic diagnosis systems. In this work, we take advantage of deep generative models to deal with complex laboratory tests. Specifically, we propose an end-to-end architecture that combines a deep generative variational recurrent neural network (VRNN), which learns robust and generalizable features, with a discriminative neural network (NN) that learns diagnosis decision making; the two models are trained jointly. Our experiments are conducted on a dataset of 46,252 patients, and the 50 most frequent tests are used to predict the 50 most common diagnoses. The results show that our model, VRNN+NN, significantly (p<0.001) outperforms other baseline models. Moreover, we demonstrate that the representations learned by joint training are more informative than those learned by purely generative models. Finally, we find that our model offers surprisingly good imputation of missing values. |
Tasks | Decision Making, Imputation, Medical Diagnosis |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04329v2 |
PDF | http://arxiv.org/pdf/1711.04329v2.pdf |
PWC | https://paperswithcode.com/paper/medical-diagnosis-from-laboratory-tests-by |
Repo | |
Framework | |
Feature Engineering for Predictive Modeling using Reinforcement Learning
Title | Feature Engineering for Predictive Modeling using Reinforcement Learning |
Authors | Udayan Khurana, Horst Samulowitz, Deepak Turaga |
Abstract | Feature engineering is a crucial step in predictive modeling. It involves transforming a given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no well-defined basis for performing effective feature engineering; it involves domain knowledge, intuition, and most of all, a lengthy process of trial and error. The human attention involved in overseeing this process significantly influences the cost of model generation. We present a new framework to automate feature engineering, based on performance-driven exploration of a transformation graph that systematically and compactly enumerates the space of available options. A highly efficient exploration strategy is derived through reinforcement learning on past examples. |
Tasks | Automated Feature Engineering, Efficient Exploration, Feature Engineering |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07150v1 |
PDF | http://arxiv.org/pdf/1709.07150v1.pdf |
PWC | https://paperswithcode.com/paper/feature-engineering-for-predictive-modeling |
Repo | |
Framework | |
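The transformation-graph exploration can be caricatured with an epsilon-greedy search: each node is a derived feature set, each edge applies one transformation, and a node's "performance" is a quick model fit. This toy uses a fixed train/holdout ridge fit as the score and a hand-rolled epsilon-greedy rule; it is not the paper's learned RL policy, and all names here are invented for the example.

```python
import numpy as np

def explore_transformations(X, y, transforms, budget=20, eps=0.3, rng=None):
    """Epsilon-greedy walk over a transformation graph.

    transforms: dict name -> elementwise function applied to every column;
    applying one appends the transformed columns to the node's features.
    Returns (best_score, best_path)."""
    rng = np.random.default_rng(rng)

    def score(feats):
        # ridge regression on a fixed 70/30 split as the node's performance
        n = len(y); cut = int(0.7 * n)
        A, b = feats[:cut], y[:cut]
        w = np.linalg.solve(A.T @ A + 1e-3 * np.eye(A.shape[1]), A.T @ b)
        resid = y[cut:] - feats[cut:] @ w
        return 1.0 - resid.var() / y[cut:].var()

    frontier = [(score(X), X, [])]            # nodes: (score, features, path)
    best_score, best_path = frontier[0][0], []
    for _ in range(budget):
        if rng.random() < eps:                # explore: a random known node
            s, feats, path = frontier[rng.integers(len(frontier))]
        else:                                 # exploit: the best node so far
            s, feats, path = max(frontier, key=lambda n: n[0])
        if feats.shape[1] >= 16:              # cap node width to keep the demo small
            continue
        name = list(transforms)[rng.integers(len(transforms))]
        new_feats = np.hstack([feats, transforms[name](feats)])
        new_score = score(new_feats)
        frontier.append((new_score, new_feats, path + [name]))
        if new_score > best_score:
            best_score, best_path = new_score, path + [name]
    return best_score, best_path
```

The paper replaces the epsilon-greedy rule with a policy learned by reinforcement learning over many past datasets, so exploration spends its budget on transformations that historically paid off.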
Unconstrained Scene Text and Video Text Recognition for Arabic Script
Title | Unconstrained Scene Text and Video Text Recognition for Arabic Script |
Authors | Mohit Jain, Minesh Mathew, C. V. Jawahar |
Abstract | Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets - ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesising millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation builds on the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach: the network transcribes a sequence of convolutional features from the input image to a sequence of target labels. This removes the need to segment the input image into constituent characters/glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results. |
Tasks | Scene Text Recognition |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02396v1 |
PDF | http://arxiv.org/pdf/1711.02396v1.pdf |
PWC | https://paperswithcode.com/paper/unconstrained-scene-text-and-video-text |
Repo | |
Framework | |
Sketching for Kronecker Product Regression and P-splines
Title | Sketching for Kronecker Product Regression and P-splines |
Authors | Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff |
Abstract | TensorSketch is an oblivious linear sketch introduced by Pagh ’13 and later used by Pham and Pagh ’13 in the context of SVMs for polynomial kernels. Avron, Nguyen, and Woodruff ’14 showed that TensorSketch provides a subspace embedding, and it can therefore be used for canonical correlation analysis, low-rank approximation, and principal component regression for the polynomial kernel. We take TensorSketch outside the context of polynomial kernels and show its utility in applications where the underlying design matrix is a Kronecker product of smaller matrices. This allows us to solve Kronecker product regression and non-negative Kronecker product regression, as well as regularized spline regression. Our main technical result extends TensorSketch to other norms: TensorSketch only provides input-sparsity time for Kronecker product regression with respect to the $2$-norm, and we show how to solve Kronecker product regression with respect to the $1$-norm, as well as more general $p$-norms, in time sublinear in the time required to compute the Kronecker product. |
Tasks | |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09473v1 |
PDF | http://arxiv.org/pdf/1712.09473v1.pdf |
PWC | https://paperswithcode.com/paper/sketching-for-kronecker-product-regression |
Repo | |
Framework | |
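The Kronecker structure that the paper exploits already shows up in the exact solve: since (A kron B)^+ = A^+ kron B^+, least squares with design matrix A kron B never requires forming the product explicitly. The sketch below shows only this structured baseline, not TensorSketch itself (which further reaches input-sparsity and sublinear time).

```python
import numpy as np

def kron_lstsq(A, B, b):
    """Solve min ||(A kron B) x - b||_2 without forming the Kronecker product.

    Uses the identity (A kron B) vec(X) = vec(B X A^T) with column-major vec,
    so the least-squares solution is X = B^+ M (A^+)^T where b = vec(M).
    """
    m_a, n_a = A.shape
    m_b, n_b = B.shape
    # unvec b (column-major) into the (m_b, m_a) matrix M
    M = b.reshape(m_a, m_b).T
    X = np.linalg.pinv(B) @ M @ np.linalg.pinv(A).T
    # re-vec the (n_b, n_a) solution column-major
    return X.T.reshape(n_a * n_b)
```

For A of size p x q and B of size r x s, this costs roughly the price of two small pseudo-inverses plus matrix products, instead of a least-squares solve on a pr x qs matrix.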
A Heuristic Search Algorithm Using the Stability of Learning Algorithms in Certain Scenarios as the Fitness Function: An Artificial General Intelligence Engineering Approach
Title | A Heuristic Search Algorithm Using the Stability of Learning Algorithms in Certain Scenarios as the Fitness Function: An Artificial General Intelligence Engineering Approach |
Authors | Zengkun Li |
Abstract | This paper presents a non-manual engineering method that uses a heuristic search algorithm to look for candidate agents in a solution space formed by artificial intelligence agents modeled on bionics. Compared with the artificial design methods represented by meta-learning and the bionics methods represented by neural architecture chips, this approach is more feasible for realizing artificial general intelligence, and it interacts much better with cognitive neuroscience. At the same time, the engineering method rests on the theoretical hypothesis that the final learning algorithm is stable in certain scenarios and generalizes across various scenarios. The paper discusses this theory preliminarily and proposes a possible connection between it and fixed-point theorems in mathematics. Limited by the author’s knowledge, this connection is offered only as a conjecture. |
Tasks | Meta-Learning |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.03043v3 |
PDF | http://arxiv.org/pdf/1712.03043v3.pdf |
PWC | https://paperswithcode.com/paper/a-heuristic-search-algorithm-using-the |
Repo | |
Framework | |
Network Representation Learning: A Survey
Title | Network Representation Learning: A Survey |
Authors | Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang |
Abstract | With the widespread use of information technologies, information networks are becoming increasingly popular for capturing complex relationships across various disciplines, such as social networks, citation networks, telecommunication networks, and biological networks. Analyzing these networks sheds light on different aspects of social life, such as the structure of societies, information diffusion, and communication patterns. In reality, however, the large scale of information networks often makes network analytic tasks computationally expensive or intractable. Network representation learning has recently been proposed as a new learning paradigm that embeds network vertices into a low-dimensional vector space while preserving network topology, vertex content, and other side information. This allows the original network to be handled easily in the new vector space for further analysis. In this survey, we perform a comprehensive review of the current literature on network representation learning in the data mining and machine learning fields. We propose new taxonomies to categorize and summarize state-of-the-art network representation learning techniques according to their underlying learning mechanisms, the network information they aim to preserve, and their algorithmic designs and methodologies. We summarize the evaluation protocols used for validating network representation learning, including published benchmark datasets, evaluation methods, and open-source algorithms. We also perform empirical studies to compare the performance of representative algorithms on common datasets and analyze their computational complexity. Finally, we suggest promising research directions to facilitate future study. |
Tasks | Representation Learning |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1801.05852v3 |
PDF | http://arxiv.org/pdf/1801.05852v3.pdf |
PWC | https://paperswithcode.com/paper/network-representation-learning-a-survey |
Repo | |
Framework | |
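As a flavor of the matrix-factorization family of techniques such a survey covers, here is a minimal network embedding via truncated SVD of the adjacency matrix. This is an illustrative baseline only, not any specific surveyed algorithm.

```python
import numpy as np

def svd_embedding(adj, dim):
    """Embed graph vertices into R^dim via truncated SVD of the adjacency
    matrix, a simple topology-preserving representation learning baseline."""
    U, s, _ = np.linalg.svd(adj, full_matrices=False)
    # scale the leading singular directions by the square roots of
    # their singular values so dot products approximate edge weights
    return U[:, :dim] * np.sqrt(s[:dim])
```

Vertices with identical neighborhoods receive identical embeddings, which is the structural-equivalence property many factorization-based methods aim to preserve.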