Paper Group ANR 375
Robust Object Tracking Based on Self-adaptive Search Area
Title | Robust Object Tracking Based on Self-adaptive Search Area |
Authors | Taihang Dong, Sheng Zhong |
Abstract | Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer from boundary effects, which result in unstable performance in challenging situations involving fast motion. In this paper, we propose a novel method to mitigate this side effect in DCF based trackers. We change the search area according to the predicted target motion. When the object moves fast, a broad search area alleviates boundary effects and preserves the chance of locating the object. When the object moves slowly, a narrow search area suppresses the influence of useless background information and improves computational efficiency, attaining real-time performance. This strategy substantially mitigates boundary effects in situations involving fast motion and motion blur, and it can be applied to almost all DCF based trackers. Experiments on the OTB benchmark show that the proposed framework improves performance over the baseline trackers. |
Tasks | Object Tracking |
Published | 2017-11-21 |
URL | https://arxiv.org/abs/1711.07835v2 |
PDF | https://arxiv.org/pdf/1711.07835v2.pdf |
PWC | https://paperswithcode.com/paper/robust-object-tracking-based-on-self-adaptive |
Repo | |
Framework | |
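The speed-dependent search-area strategy the abstract describes can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the constant-velocity predictor, the speed thresholds, and the scale factors are all invented for the example.

```python
import numpy as np

def adaptive_search_area(prev_positions, base_size,
                         slow_thresh=2.0, fast_thresh=8.0):
    """Choose a search-window scale from the target's recent motion.

    prev_positions: array-like of shape (k, 2), recent target centers.
    base_size: nominal search-window side length.
    Returns (predicted_center, search_size).
    """
    pos = np.asarray(prev_positions, dtype=float)
    velocity = pos[-1] - pos[-2]          # simple constant-velocity prediction
    speed = np.linalg.norm(velocity)
    predicted_center = pos[-1] + velocity
    if speed < slow_thresh:               # slow target: narrow area, less background
        scale = 1.5
    elif speed < fast_thresh:             # moderate motion: default padding
        scale = 2.5
    else:                                 # fast motion: widen to fight boundary effects
        scale = 4.0
    return predicted_center, base_size * scale
```

The DCF correlation response would then be computed over the returned window, so a fast-moving target stays inside the search region while a slow one is not swamped by background.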
Multi-View Task-Driven Recognition in Visual Sensor Networks
Title | Multi-View Task-Driven Recognition in Visual Sensor Networks |
Authors | Ali Taalimi, Alireza Rahimpour, Liu Liu, Hairong Qi |
Abstract | Nowadays, distributed smart cameras are deployed for a wide range of tasks in several application scenarios, from object recognition and image retrieval to forensic applications. Due to the limited bandwidth of distributed systems, efficient coding of local visual features has been an active topic of research. In this paper, we propose a novel approach to obtaining a compact representation of high-dimensional visual data using sensor fusion techniques. We cast the problem of visual analysis in resource-limited scenarios as multi-view representation learning, and we show that the key to finding a properly compressed representation is to exploit the relative positions of the cameras as a norm-based regularization within the sparse coding signal representation. Learning the representation of each camera is treated as an individual task, and multi-task learning with joint sparsity across all nodes is employed. The proposed representation learning scheme is referred to as multi-view task-driven learning for visual sensor networks (MT-VSN). We demonstrate that MT-VSN outperforms the state of the art in various surveillance recognition tasks. |
Tasks | Image Retrieval, Multi-Task Learning, Object Recognition, Representation Learning, Sensor Fusion |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10715v2 |
PDF | http://arxiv.org/pdf/1705.10715v2.pdf |
PWC | https://paperswithcode.com/paper/multi-view-task-driven-recognition-in-visual |
Repo | |
Framework | |
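The joint-sparsity idea, one coding task per camera with an l2,1-style penalty coupling the atom supports across views, can be sketched with a simple proximal-gradient loop. This is an assumed formulation for illustration only; the paper's MT-VSN objective additionally encodes relative camera positions, which this toy omits.

```python
import numpy as np

def joint_sparse_codes(signals, dicts, lam=0.1, step=0.01, iters=200):
    """Proximal-gradient sketch of multi-view coding with an l2,1 penalty.

    signals: list of (d,) arrays, one per camera view.
    dicts:   list of (d, k) dictionaries, one per view.
    Returns A of shape (k, n_views); the l2,1 penalty drives whole rows
    (atoms) to zero, so the views share a sparse support.
    """
    k, V = dicts[0].shape[1], len(signals)
    A = np.zeros((k, V))
    for _ in range(iters):
        # gradient step on the per-view least-squares terms
        for v in range(V):
            grad = dicts[v].T @ (dicts[v] @ A[:, v] - signals[v])
            A[:, v] -= step * grad
        # row-wise group soft-thresholding: the prox of step * lam * ||A||_{2,1}
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        shrink = np.maximum(1 - step * lam / np.maximum(norms, 1e-12), 0.0)
        A *= shrink
    return A
```

With a large enough penalty every row is zeroed out, which is the degenerate extreme of the joint-support behavior.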
Maximum Likelihood Estimation based on Random Subspace EDA: Application to Extrasolar Planet Detection
Title | Maximum Likelihood Estimation based on Random Subspace EDA: Application to Extrasolar Planet Detection |
Authors | Bin Liu, Ke-Jia Chen |
Abstract | This paper addresses maximum likelihood (ML) estimation based model fitting in the context of extrasolar planet detection. The problem is characterized by the following properties: 1) the candidate models under consideration are highly nonlinear; 2) the likelihood surface has a huge number of peaks; 3) the parameter space ranges in size from a few to dozens of dimensions. These properties make the ML search very challenging, as no analytical or gradient-based solution is available to explore the parameter space. A population-based search method, the estimation of distribution algorithm (EDA), is adopted to explore the model parameter space starting from a batch of random locations. EDAs are distinguished by their ability to reveal and exploit problem structure, a property desirable for characterizing the detections. However, it is well recognized that EDAs do not scale to large problems, as they consist of iterative random sampling and model fitting procedures and thus suffer from the well-known curse of dimensionality. We propose a novel mechanism that performs the EDA operations of sampling and model fitting in interacting random subspaces spanned by correlated variables, in the hope of alleviating the curse of dimensionality. The effectiveness of the proposed algorithm is verified via both benchmark numerical studies and real data analysis. |
Tasks | |
Published | 2017-04-18 |
URL | http://arxiv.org/abs/1704.05761v2 |
PDF | http://arxiv.org/pdf/1704.05761v2.pdf |
PWC | https://paperswithcode.com/paper/maximum-likelihood-estimation-based-on-random |
Repo | |
Framework | |
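A toy version of an EDA that samples and fits its model only in random low-dimensional subspaces might look like this. It is a sketch under assumed design choices (diagonal Gaussian model, fixed subspace size, uniform subspace selection), not the authors' algorithm, which picks subspaces spanned by correlated variables.

```python
import numpy as np

def subspace_eda(objective, dim, pop_size=60, elite_frac=0.3,
                 subspace_dim=3, iters=150, bounds=(-5.0, 5.0), rng=None):
    """Minimize `objective` with a Gaussian EDA restricted, at each
    generation, to a random low-dimensional subspace of the search space."""
    rng = np.random.default_rng(rng)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    n_elite = int(pop_size * elite_frac)
    for _ in range(iters):
        fitness = np.apply_along_axis(objective, 1, pop)
        elite = pop[np.argsort(fitness)[:n_elite]]
        # pick a random subspace; model fitting and sampling happen only there
        dims = rng.choice(dim, size=min(subspace_dim, dim), replace=False)
        mu = elite[:, dims].mean(axis=0)
        sigma = elite[:, dims].std(axis=0) + 1e-6
        # rebuild the population from elites, resampling only the chosen dims
        pop = np.repeat(elite, pop_size // n_elite + 1, axis=0)[:pop_size].copy()
        pop[:, dims] = rng.normal(mu, sigma, size=(pop_size, len(dims)))
        pop = np.clip(pop, lo, hi)
        pop[0] = elite[0]                 # elitism: keep the incumbent best
    fitness = np.apply_along_axis(objective, 1, pop)
    return pop[np.argmin(fitness)]
```

Because the Gaussian is fit and sampled in only a few dimensions per generation, the model-fitting cost stays low even when `dim` grows, which is the mechanism the abstract proposes against the curse of dimensionality.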
It Takes Two to Tango: Towards Theory of AI’s Mind
Title | It Takes Two to Tango: Towards Theory of AI’s Mind |
Authors | Arjun Chandrasekaran, Deshraj Yadav, Prithvijit Chattopadhyay, Viraj Prabhu, Devi Parikh |
Abstract | Theory of Mind is the ability to attribute mental states (beliefs, intents, knowledge, perspectives, etc.) to others and to recognize that these mental states may differ from one’s own. Theory of Mind is critical to effective communication and to teams demonstrating higher collective performance. To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. Traditionally, there has been much emphasis on research to make AI more accurate, and (to a lesser extent) on having it better understand human intentions, tendencies, beliefs, and contexts. The latter involves making AI more human-like and having it develop a theory of our minds. In this work, we argue that for human-AI teams to be effective, humans must also develop a theory of AI’s mind (ToAIM) - get to know its strengths, weaknesses, beliefs, and quirks. We instantiate these ideas within the domain of Visual Question Answering (VQA). We find that using just a few examples (50), lay people can be trained to better predict the responses and impending failures of a complex VQA model. We further evaluate the role existing explanation (or interpretability) modalities play in helping humans build ToAIM. Explainable AI has received considerable scientific and popular attention in recent times. Surprisingly, we find that having access to the model’s internal states - its confidence in its top-k predictions, or the explicit and implicit attention maps highlighting the regions in the image (and words in the question) the model attends to while answering a question about an image - does not help people better predict its behavior. |
Tasks | Question Answering, Visual Question Answering |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00717v2 |
PDF | http://arxiv.org/pdf/1704.00717v2.pdf |
PWC | https://paperswithcode.com/paper/it-takes-two-to-tango-towards-theory-of-ais |
Repo | |
Framework | |
Submodular Mini-Batch Training in Generative Moment Matching Networks
Title | Submodular Mini-Batch Training in Generative Moment Matching Networks |
Authors | Jun Qi |
Abstract | This article was withdrawn because (1) it was uploaded without the co-authors’ knowledge or consent, and (2) there are allegations of plagiarism. |
Tasks | |
Published | 2017-07-18 |
URL | http://arxiv.org/abs/1707.05721v3 |
PDF | http://arxiv.org/pdf/1707.05721v3.pdf |
PWC | https://paperswithcode.com/paper/submodular-mini-batch-training-in-generative |
Repo | |
Framework | |
On the Usability of Probably Approximately Correct Implication Bases
Title | On the Usability of Probably Approximately Correct Implication Bases |
Authors | Daniel Borchmann, Tom Hanika, Sergei Obiedkov |
Abstract | We revisit the notion of probably approximately correct implication bases from the literature and present a first formulation in the language of formal concept analysis, with the goal of investigating whether such bases represent a suitable substitute for exact implication bases in practical use cases. To this end, we quantitatively examine the behavior of probably approximately correct implication bases on artificial and real-world data sets and compare their precision and recall with respect to their corresponding exact implication bases. Using a small example, we also provide qualitative insight that implications from probably approximately correct bases can still represent meaningful knowledge from a given data set. |
Tasks | |
Published | 2017-01-04 |
URL | http://arxiv.org/abs/1701.00877v2 |
PDF | http://arxiv.org/pdf/1701.00877v2.pdf |
PWC | https://paperswithcode.com/paper/on-the-usability-of-probably-approximately |
Repo | |
Framework | |
Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation
Title | Capturing Localized Image Artifacts through a CNN-based Hyper-image Representation |
Authors | Parag Shridhar Chandakkar, Baoxin Li |
Abstract | Training deep CNNs to capture localized image artifacts on a relatively small dataset is a challenging task. With enough images at hand, one can hope that a deep CNN characterizes localized artifacts over the entire data and their effect on the output. On smaller datasets, however, deep CNNs may overfit while shallow ones struggle to capture local artifacts. Thus, some small-data image applications first train their framework on a collection of patches (instead of entire images) to better learn the representation of localized artifacts, then obtain the output by averaging the patch-level results. Such an approach ignores the spatial correlation among patches and how various patch locations affect the output. It also fails when only a few patches contribute to the image label. To handle these scenarios, we develop the notion of hyper-image representations. Our CNN has two stages. The first stage is trained on patches. The second stage uses the last-layer representation developed in the first stage to form a hyper-image, on which it is trained. We show that this approach develops a better mapping between the image and its output. We analyze additional properties of our approach and show its effectiveness on one synthetic and two real-world vision tasks - no-reference image quality estimation and image tampering detection - where it improves over existing strong baselines. |
Tasks | Image Quality Estimation |
Published | 2017-11-14 |
URL | http://arxiv.org/abs/1711.04945v1 |
PDF | http://arxiv.org/pdf/1711.04945v1.pdf |
PWC | https://paperswithcode.com/paper/capturing-localized-image-artifacts-through-a |
Repo | |
Framework | |
Akid: A Library for Neural Network Research and Production from a Dataism Approach
Title | Akid: A Library for Neural Network Research and Production from a Dataism Approach |
Authors | Shuai Li |
Abstract | Neural networks are a revolutionary but immature technique that is fast evolving and relies heavily on data. To benefit from the newest developments and newly available data, we want the gap between research and production to be as small as possible. On the other hand, unlike traditional machine learning models, a neural network is not just another statistical model, but a model of the natural processing engine, the brain. In this work, we describe a neural network library named akid. Building on the abstraction of signals as tensors provided by TensorFlow, it offers a higher level of abstraction for entities in nature (abstracted as blocks), reflecting the dataism observation that all entities in nature process inputs and emit outputs in some way. It comprises a full software stack: abstractions that let researchers focus on research instead of implementation, while the resulting program can be put into production seamlessly in a distributed environment. At the top of the stack, it provides out-of-the-box tools for neural network applications. Below that, akid offers a programming paradigm that lets users easily build customized models. The distributed computing stack handles concurrency and communication, so models can be trained or deployed on a single GPU, multiple GPUs, or a distributed environment without changing how a model is specified in the programming paradigm stack. Lastly, the distributed deployment stack controls how the distributed computation is deployed, decoupling the research prototyping environment from the actual production environment and allowing computing resources to be allocated dynamically, so that development (Dev) and operations (Ops) can be separated. Please refer to http://akid.readthedocs.io/en/latest/ for documentation. |
Tasks | |
Published | 2017-01-03 |
URL | http://arxiv.org/abs/1701.00609v1 |
PDF | http://arxiv.org/pdf/1701.00609v1.pdf |
PWC | https://paperswithcode.com/paper/akid-a-library-for-neural-network-research |
Repo | |
Framework | |
On reducing the communication cost of the diffusion LMS algorithm
Title | On reducing the communication cost of the diffusion LMS algorithm |
Authors | Ibrahim El Khalil Harrane, Rémi Flamary, Cédric Richard |
Abstract | The rise of digital and mobile communications has made the world more connected and networked, resulting in an unprecedented volume of data flowing between sources, data centers, and processes. While these data may be processed in a centralized manner, distributed strategies such as diffusion are often more suitable, as they are scalable and can handle large amounts of data by distributing tasks over networked agents. Although it is relatively simple to implement diffusion strategies over a cluster, deploying them in an ad-hoc network with a limited energy budget for communication is challenging. In this paper, we introduce a diffusion LMS strategy that significantly reduces communication costs without compromising performance. We then analyze the proposed algorithm in the mean and mean-square sense, conduct numerical experiments to confirm the theoretical findings, and finally perform large-scale simulations to test the algorithm's efficiency in a scenario where energy is limited. |
Tasks | |
Published | 2017-11-30 |
URL | http://arxiv.org/abs/1711.11423v2 |
PDF | http://arxiv.org/pdf/1711.11423v2.pdf |
PWC | https://paperswithcode.com/paper/on-reducing-the-communication-cost-of-the |
Repo | |
Framework | |
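The standard adapt-then-combine (ATC) diffusion LMS recursion that the paper builds on can be sketched as follows. The combination matrix and step size are illustrative; the paper's actual contribution, reducing what is communicated between agents, is not shown here.

```python
import numpy as np

def atc_diffusion_lms(X, d, C, mu=0.05, iters=500):
    """Adapt-then-combine diffusion LMS over a network of agents.

    X: (N, T, M) regressors for N agents over T time instants.
    d: (N, T) desired signals.
    C: (N, N) row-stochastic combination matrix; C[k, l] weighs
       neighbor l's intermediate estimate at agent k.
    Returns the (N, M) per-agent weight estimates.
    """
    N, T, M = X.shape
    W = np.zeros((N, M))
    for t in range(iters):
        i = t % T                         # cycle through the data stream
        # adapt: local LMS step at every agent
        psi = np.empty_like(W)
        for k in range(N):
            err = d[k, i] - X[k, i] @ W[k]
            psi[k] = W[k] + mu * err * X[k, i]
        # combine: weighted average of the neighbors' intermediate estimates
        W = C @ psi
    return W
```

The combine step is exactly where communication cost arises (each agent must receive its neighbors' `psi` vectors), which is what the proposed strategy trims.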
Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning
Title | Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning |
Authors | Shiyue Zhang, Pengtao Xie, Dong Wang, Eric P. Xing |
Abstract | A primary goal of computational phenotype research is to conduct medical diagnosis. In hospitals, physicians rely on massive clinical data to make diagnosis decisions, among which laboratory tests are one of the most important resources. However, the longitudinal and incomplete nature of laboratory test data poses a significant challenge to its interpretation and usage, which may lead to harmful decisions by both human physicians and automatic diagnosis systems. In this work, we take advantage of deep generative models to deal with complex laboratory tests. Specifically, we propose an end-to-end architecture that combines a deep generative variational recurrent neural network (VRNN), which learns robust and generalizable features, with a discriminative neural network (NN) that learns diagnosis decision making; the two models are trained jointly. Our experiments are conducted on a dataset of 46,252 patients, and the 50 most frequent tests are used to predict the 50 most common diagnoses. The results show that our model, VRNN+NN, significantly (p<0.001) outperforms other baseline models. Moreover, we demonstrate that the representations learned by joint training are more informative than those learned by purely generative models. Finally, we find that our model offers surprisingly good imputation of missing values. |
Tasks | Decision Making, Imputation, Medical Diagnosis |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04329v2 |
PDF | http://arxiv.org/pdf/1711.04329v2.pdf |
PWC | https://paperswithcode.com/paper/medical-diagnosis-from-laboratory-tests-by |
Repo | |
Framework | |
Feature Engineering for Predictive Modeling using Reinforcement Learning
Title | Feature Engineering for Predictive Modeling using Reinforcement Learning |
Authors | Udayan Khurana, Horst Samulowitz, Deepak Turaga |
Abstract | Feature engineering is a crucial step in predictive modeling. It involves transforming a given feature space, typically using mathematical functions, with the objective of reducing the modeling error for a given target. However, there is no well-defined basis for performing effective feature engineering; it involves domain knowledge, intuition, and most of all, a lengthy process of trial and error. The human attention involved in overseeing this process significantly influences the cost of model generation. We present a new framework to automate feature engineering, based on performance-driven exploration of a transformation graph that systematically and compactly enumerates the space of available options. A highly efficient exploration strategy is derived through reinforcement learning on past examples. |
Tasks | Automated Feature Engineering, Efficient Exploration, Feature Engineering |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07150v1 |
PDF | http://arxiv.org/pdf/1709.07150v1.pdf |
PWC | https://paperswithcode.com/paper/feature-engineering-for-predictive-modeling |
Repo | |
Framework | |
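The transformation-graph exploration can be caricatured with an epsilon-greedy search: each node is a derived feature set, each edge applies one transformation, and a node's "performance" is a quick model fit. This toy uses a fixed train/holdout ridge fit as the score and a hand-rolled epsilon-greedy rule; it is not the paper's learned RL policy, and all names here are invented for the example.

```python
import numpy as np

def explore_transformations(X, y, transforms, budget=20, eps=0.3, rng=None):
    """Epsilon-greedy walk over a transformation graph.

    transforms: dict name -> elementwise function applied to every column;
    applying one appends the transformed columns to the node's features.
    Returns (best_score, best_path)."""
    rng = np.random.default_rng(rng)

    def score(feats):
        # ridge regression on a fixed 70/30 split as the node's performance
        n = len(y); cut = int(0.7 * n)
        A, b = feats[:cut], y[:cut]
        w = np.linalg.solve(A.T @ A + 1e-3 * np.eye(A.shape[1]), A.T @ b)
        resid = y[cut:] - feats[cut:] @ w
        return 1.0 - resid.var() / y[cut:].var()

    frontier = [(score(X), X, [])]            # nodes: (score, features, path)
    best_score, best_path = frontier[0][0], []
    for _ in range(budget):
        if rng.random() < eps:                # explore: a random known node
            s, feats, path = frontier[rng.integers(len(frontier))]
        else:                                 # exploit: the best node so far
            s, feats, path = max(frontier, key=lambda n: n[0])
        if feats.shape[1] >= 16:              # cap node width to keep the demo small
            continue
        name = list(transforms)[rng.integers(len(transforms))]
        new_feats = np.hstack([feats, transforms[name](feats)])
        new_score = score(new_feats)
        frontier.append((new_score, new_feats, path + [name]))
        if new_score > best_score:
            best_score, best_path = new_score, path + [name]
    return best_score, best_path
```

The paper replaces the epsilon-greedy rule with a policy learned by reinforcement learning over many past datasets, so exploration spends its budget on transformations that historically paid off.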
Unconstrained Scene Text and Video Text Recognition for Arabic Script
Title | Unconstrained Scene Text and Video Text Recognition for Arabic Script |
Authors | Mohit Jain, Minesh Mathew, C. V. Jawahar |
Abstract | Building robust recognizers for Arabic has always been challenging. We demonstrate the effectiveness of an end-to-end trainable CNN-RNN hybrid architecture in recognizing Arabic text in videos and natural scenes. We outperform the previous state of the art on two publicly available video text datasets - ALIF and ACTIV. For the scene text recognition task, we introduce a new Arabic scene text dataset and establish baseline results. For scripts like Arabic, a major challenge in developing robust recognizers is the lack of large quantities of annotated data. We overcome this by synthesising millions of Arabic text images from a large vocabulary of Arabic words and phrases. Our implementation builds on the model introduced in [37], which has proven quite effective for English scene text recognition. The model follows a segmentation-free, sequence-to-sequence transcription approach: the network transcribes a sequence of convolutional features from the input image to a sequence of target labels. This removes the need to segment the input image into constituent characters/glyphs, which is often difficult for Arabic script. Further, the ability of RNNs to model contextual dependencies yields superior recognition results. |
Tasks | Scene Text Recognition |
Published | 2017-11-07 |
URL | http://arxiv.org/abs/1711.02396v1 |
PDF | http://arxiv.org/pdf/1711.02396v1.pdf |
PWC | https://paperswithcode.com/paper/unconstrained-scene-text-and-video-text |
Repo | |
Framework | |
Sketching for Kronecker Product Regression and P-splines
Title | Sketching for Kronecker Product Regression and P-splines |
Authors | Huaian Diao, Zhao Song, Wen Sun, David P. Woodruff |
Abstract | TensorSketch is an oblivious linear sketch introduced by Pagh ’13 and later used by Pham and Pagh ’13 in the context of SVMs for polynomial kernels. Avron, Nguyen, and Woodruff ’14 showed that TensorSketch provides a subspace embedding, and it can therefore be used for canonical correlation analysis, low-rank approximation, and principal component regression for the polynomial kernel. We take TensorSketch outside the context of polynomial kernels and show its utility in applications where the underlying design matrix is a Kronecker product of smaller matrices. This allows us to solve Kronecker product regression and non-negative Kronecker product regression, as well as regularized spline regression. Our main technical result extends TensorSketch to other norms: TensorSketch only provides input-sparsity time for Kronecker product regression with respect to the $2$-norm, and we show how to solve Kronecker product regression with respect to the $1$-norm, as well as more general $p$-norms, in time sublinear in the time required to compute the Kronecker product. |
Tasks | |
Published | 2017-12-27 |
URL | http://arxiv.org/abs/1712.09473v1 |
PDF | http://arxiv.org/pdf/1712.09473v1.pdf |
PWC | https://paperswithcode.com/paper/sketching-for-kronecker-product-regression |
Repo | |
Framework | |
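The Kronecker structure that the paper exploits already shows up in the exact solve: since (A kron B)^+ = A^+ kron B^+, least squares with design matrix A kron B never requires forming the product explicitly. The sketch below shows only this structured baseline, not TensorSketch itself (which further reaches input-sparsity and sublinear time).

```python
import numpy as np

def kron_lstsq(A, B, b):
    """Solve min ||(A kron B) x - b||_2 without forming the Kronecker product.

    Uses the identity (A kron B) vec(X) = vec(B X A^T) with column-major vec,
    so the least-squares solution is X = B^+ M (A^+)^T where b = vec(M).
    """
    m_a, n_a = A.shape
    m_b, n_b = B.shape
    # unvec b (column-major) into the (m_b, m_a) matrix M
    M = b.reshape(m_a, m_b).T
    X = np.linalg.pinv(B) @ M @ np.linalg.pinv(A).T
    # re-vec the (n_b, n_a) solution column-major
    return X.T.reshape(n_a * n_b)
```

For A of size p x q and B of size r x s, this costs roughly the price of two small pseudo-inverses plus matrix products, instead of a least-squares solve on a pr x qs matrix.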
A Heuristic Search Algorithm Using the Stability of Learning Algorithms in Certain Scenarios as the Fitness Function: An Artificial General Intelligence Engineering Approach
Title | A Heuristic Search Algorithm Using the Stability of Learning Algorithms in Certain Scenarios as the Fitness Function: An Artificial General Intelligence Engineering Approach |
Authors | Zengkun Li |
Abstract | This paper presents a non-manual engineering method that uses a heuristic search algorithm to look for candidate agents in a solution space formed by artificial intelligence agents modeled on bionics. Compared with the artificial design methods represented by meta-learning and the bionics methods represented by neural architecture chips, this approach is more feasible for realizing artificial general intelligence, and it interacts much better with cognitive neuroscience. At the same time, the engineering method rests on the theoretical hypothesis that the final learning algorithm is stable in certain scenarios and generalizes across various scenarios. The paper discusses this theory preliminarily and proposes a possible connection between it and fixed-point theorems in mathematics. Limited by the author’s knowledge, this connection is offered only as a conjecture. |
Tasks | Meta-Learning |
Published | 2017-12-08 |
URL | http://arxiv.org/abs/1712.03043v3 |
PDF | http://arxiv.org/pdf/1712.03043v3.pdf |
PWC | https://paperswithcode.com/paper/a-heuristic-search-algorithm-using-the |
Repo | |
Framework | |
Network Representation Learning: A Survey
Title | Network Representation Learning: A Survey |
Authors | Daokun Zhang, Jie Yin, Xingquan Zhu, Chengqi Zhang |
Abstract | With the widespread use of information technologies, information networks are becoming increasingly popular for capturing complex relationships across various disciplines, such as social networks, citation networks, telecommunication networks, and biological networks. Analyzing these networks sheds light on different aspects of social life, such as the structure of societies, information diffusion, and communication patterns. In reality, however, the large scale of information networks often makes network analytic tasks computationally expensive or intractable. Network representation learning has recently been proposed as a new learning paradigm that embeds network vertices into a low-dimensional vector space while preserving network topology, vertex content, and other side information. This allows the original network to be handled easily in the new vector space for further analysis. In this survey, we perform a comprehensive review of the current literature on network representation learning in the data mining and machine learning fields. We propose new taxonomies to categorize and summarize state-of-the-art network representation learning techniques according to their underlying learning mechanisms, the network information they aim to preserve, and their algorithmic designs and methodologies. We summarize the evaluation protocols used for validating network representation learning, including published benchmark datasets, evaluation methods, and open-source algorithms. We also perform empirical studies to compare the performance of representative algorithms on common datasets and analyze their computational complexity. Finally, we suggest promising research directions to facilitate future study. |
Tasks | Representation Learning |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1801.05852v3 |
PDF | http://arxiv.org/pdf/1801.05852v3.pdf |
PWC | https://paperswithcode.com/paper/network-representation-learning-a-survey |
Repo | |
Framework | |
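As a flavor of the matrix-factorization family of techniques such a survey covers, here is a minimal network embedding via truncated SVD of the adjacency matrix. This is an illustrative baseline only, not any specific surveyed algorithm.

```python
import numpy as np

def svd_embedding(adj, dim):
    """Embed graph vertices into R^dim via truncated SVD of the adjacency
    matrix, a simple topology-preserving representation learning baseline."""
    U, s, _ = np.linalg.svd(adj, full_matrices=False)
    # scale the leading singular directions by the square roots of
    # their singular values so dot products approximate edge weights
    return U[:, :dim] * np.sqrt(s[:dim])
```

Vertices with identical neighborhoods receive identical embeddings, which is the structural-equivalence property many factorization-based methods aim to preserve.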