May 5, 2019


Paper Group ANR 447

Constraint-Based Clustering Selection. Estimating 3D Trajectories from 2D Projections via Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machines. Classifying Patents Based on their Semantic Content. Dialog state tracking, a machine reading approach using Memory Network. A Unified Gender-Aware Age Estimation. Improved Dense Trajecto …

Constraint-Based Clustering Selection


Title	Constraint-Based Clustering Selection
Authors	Toon Van Craenendonck, Hendrik Blockeel
Abstract	Semi-supervised clustering methods incorporate a limited amount of supervision into the clustering process. Typically, this supervision is provided by the user in the form of pairwise constraints. Existing methods use such constraints in one of the following ways: they adapt their clustering procedure, their similarity metric, or both. All of these approaches operate within the scope of individual clustering algorithms. In contrast, we propose to use constraints to choose between clusterings generated by very different unsupervised clustering algorithms, run with different parameter settings. We empirically show that this simple approach often outperforms existing semi-supervised clustering methods.
Tasks
Published	2016-09-23
URL	http://arxiv.org/abs/1609.07272v1
PDF	http://arxiv.org/pdf/1609.07272v1.pdf
PWC	https://paperswithcode.com/paper/constraint-based-clustering-selection
Repo
Framework
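
The selection idea in the abstract can be illustrated with a minimal sketch. It assumes the candidate clusterings have already been produced by some unsupervised algorithms and that a clustering is scored simply by the number of pairwise constraints it satisfies. The function names, the candidate names, and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]


def satisfied_constraints(labels: Sequence[int],
                          must_link: List[Pair],
                          cannot_link: List[Pair]) -> int:
    """Count the pairwise constraints that a clustering respects."""
    ml = sum(1 for i, j in must_link if labels[i] == labels[j])
    cl = sum(1 for i, j in cannot_link if labels[i] != labels[j])
    return ml + cl


def select_clustering(candidates: Dict[str, Sequence[int]],
                      must_link: List[Pair],
                      cannot_link: List[Pair]) -> str:
    """Return the name of the candidate that satisfies the most constraints."""
    return max(
        candidates,
        key=lambda name: satisfied_constraints(
            candidates[name], must_link, cannot_link),
    )


# Hypothetical label vectors, standing in for the output of different
# algorithms and parameter settings on six data points.
candidates = {
    "kmeans_k2": [0, 0, 0, 1, 1, 1],
    "kmeans_k3": [0, 0, 1, 1, 2, 2],
    "density_based": [0, 0, 0, 0, 0, 1],
}
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 5), (3, 4)]

best = select_clustering(candidates, must_link, cannot_link)
print(best)  # kmeans_k3 (satisfies all four constraints)
```

The clustering algorithms themselves stay untouched in this scheme: the constraints are used only to rank finished clusterings, which is what separates it from methods that modify the clustering procedure or the similarity metric.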

Estimating 3D Trajectories from 2D Projections via Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machines


Title	Estimating 3D Trajectories from 2D Projections via Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machines
Authors	Decebal Constantin Mocanu, Haitham Bou Ammar, Luis Puig, Eric Eaton, Antonio Liotta
Abstract	Estimation, recognition, and near-future prediction of 3D trajectories based on their two-dimensional projections available from one camera source is an exceptionally difficult problem due to uncertainty in the trajectories and environment, high dimensionality of the specific trajectory states, lack of enough labeled data, and so on. In this article, we propose a solution to this problem based on a novel deep learning model dubbed Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machine (DFFW-CRBM). Our method improves state-of-the-art deep learning techniques for high-dimensional time-series modeling by introducing a novel tensor factorization capable of driving fourth-order Boltzmann machines to considerably lower energy levels, at no computational cost. DFFW-CRBMs are capable of accurately estimating, recognizing, and performing near-future prediction of three-dimensional trajectories from their 2D projections while requiring a limited amount of labeled data. We evaluate our method on both simulated and real-world data, showing its effectiveness in predicting and classifying complex ball trajectories and human activities.
Tasks	Future prediction, Time Series
Published	2016-04-20
URL	http://arxiv.org/abs/1604.05865v2
PDF	http://arxiv.org/pdf/1604.05865v2.pdf
PWC	https://paperswithcode.com/paper/estimating-3d-trajectories-from-2d
Repo
Framework

Classifying Patents Based on their Semantic Content


Title	Classifying Patents Based on their Semantic Content
Authors	Antonin Bergeaud, Yoann Potiron, Juste Raimbault
Abstract	In this paper, we extend some usual techniques of classification resulting from a large-scale data-mining and network approach. This new technology, which in particular is designed to be suitable for big data, is used to construct an open consolidated database from raw data on 4 million patents taken from the US patent office from 1976 onward. To build the pattern network, not only do we look at each patent title, but we also examine their full abstract and extract the relevant keywords accordingly. We refer to this classification as the semantic approach, in contrast with the more common technological approach, which consists in taking the topology when considering US Patent office technological classes. Moreover, we document that the two approaches have highly different topological measures, and we find strong statistical evidence that they feature a different model. This suggests that our method is a useful tool to extract endogenous information.
Tasks
Published	2016-12-27
URL	http://arxiv.org/abs/1612.08504v1
PDF	http://arxiv.org/pdf/1612.08504v1.pdf
PWC	https://paperswithcode.com/paper/classifying-patents-based-on-their-semantic
Repo
Framework

Dialog state tracking, a machine reading approach using Memory Network


Title	Dialog state tracking, a machine reading approach using Memory Network
Authors	Julien Perez, Fei Liu
Abstract	In an end-to-end dialog system, the aim of dialog state tracking is to accurately estimate a compact representation of the current dialog status from a sequence of noisy observations produced by the speech recognition and the natural language understanding modules. This paper introduces a novel method of dialog state tracking based on the general paradigm of machine reading and proposes to solve it using an End-to-End Memory Network, MemN2N, a memory-enhanced neural network architecture. We evaluate the proposed approach on the second Dialog State Tracking Challenge (DSTC-2) dataset. The corpus has been converted for the occasion in order to frame the hidden state variable inference as a question-answering task based on a sequence of utterances extracted from a dialog. We show that the proposed tracker gives encouraging results. Then, we propose to extend the DSTC-2 dataset with specific reasoning capability requirements such as counting, list maintenance, yes-no question answering, and indefinite knowledge management. Finally, we present encouraging results using our proposed MemN2N-based tracking model.
Tasks	Question Answering, Reading Comprehension, Speech Recognition
Published	2016-06-13
URL	http://arxiv.org/abs/1606.04052v5
PDF	http://arxiv.org/pdf/1606.04052v5.pdf
PWC	https://paperswithcode.com/paper/dialog-state-tracking-a-machine-reading
Repo
Framework
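
The question-answering framing in the abstract relies on a memory read: a query about a slot attends over stored utterances. Below is a minimal sketch of a single memory hop in the style of an End-to-End Memory Network. It uses hand-picked toy vectors instead of learned embeddings, and all names and values are illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def memory_hop(query: Vector,
               memory_keys: List[Vector],
               memory_values: List[Vector]) -> Tuple[List[float], Vector]:
    """One hop: attend over memories, then add the read-out to the query."""
    # Attention weight of each stored utterance given the query.
    weights = softmax([dot(query, key) for key in memory_keys])
    dim = len(memory_values[0])
    # Weighted sum of the value (output) embeddings.
    read = [sum(w * v[d] for w, v in zip(weights, memory_values))
            for d in range(dim)]
    # Updated controller state: query plus memory read-out.
    return weights, [q + r for q, r in zip(query, read)]


# Toy 2-d embeddings (assumed, not learned): three dialog utterances
# and one question about a slot value.
keys = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [1.0, 0.0]

weights, state = memory_hop(question, keys, values)
most_relevant = max(range(len(weights)), key=weights.__getitem__)
print(most_relevant)  # index of the utterance the question attends to most
```

In a trained tracker the embeddings and a final answer layer would be learned from data, and several hops may be stacked. The sketch only shows how a question selects the relevant utterances.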

A Unified Gender-Aware Age Estimation


Title	A Unified Gender-Aware Age Estimation
Authors	Qing Tian, Songcan Chen, Xiaoyang Tan
Abstract	Human age estimation has attracted increasing research attention due to its wide applicability in areas such as security monitoring and advertisement recommendation. Although a variety of methods have been proposed, most of them focus only on the age-specific facial appearance. However, biological research has shown that not only gender but also the aging difference between males and females inevitably affects age estimation. To our knowledge, only two methods so far have considered the gender factor. The first is a sequential method that first classifies the gender and then performs age estimation separately for the classified males and females. Although it improves age estimation performance by accounting for the semantic difference between genders, an accumulation of estimation errors is unavoidable. To overcome the drawbacks of the sequential strategy, the second method regresses the age together with the gender by concatenating their labels as a two-dimensional output using Partial Least Squares (PLS). Although it improves age estimation performance, such a concatenation is not only likely to confuse the semantics of gender and age, but also ignores the aging discrepancy between males and females. To overcome these shortcomings, in this paper we propose a unified framework to perform gender-aware age estimation. The proposed method considers and utilizes not only the semantic relationship between gender and age, but also the aging discrepancy between males and females. Finally, experimental results demonstrate not only the superiority of our method in performance, but also its good interpretability in revealing the aging discrepancy.
Tasks	Age Estimation
Published	2016-09-13
URL	http://arxiv.org/abs/1609.03815v1
PDF	http://arxiv.org/pdf/1609.03815v1.pdf
PWC	https://paperswithcode.com/paper/a-unified-gender-aware-age-estimation
Repo
Framework

Improved Dense Trajectory with Cross Streams


Title	Improved Dense Trajectory with Cross Streams
Authors	Katsunori Ohnishi, Masatoshi Hidaka, Tatsuya Harada
Abstract	Improved dense trajectories (iDT) have shown great performance in action recognition, and their combination with the two-stream approach has achieved state-of-the-art performance. It is, however, difficult for iDT to completely remove background trajectories from video with camera shaking. Trajectories in less discriminative regions should be given modest weights in order to create more discriminative local descriptors for action recognition. In addition, the two-stream approach, which learns appearance and motion information separately, cannot focus on motion in important regions when extracting features from spatial convolutional layers of the appearance network, and vice versa. In order to address the above-mentioned problems, we propose a new local descriptor that pools a new convolutional layer obtained from crossing two networks along iDT. This new descriptor is calculated by applying discriminative weights learned from one network to a convolutional layer of the other network. Our method has achieved state-of-the-art performance on ordinary action recognition datasets: 92.3% on UCF101 and 66.2% on HMDB51.
Tasks	Temporal Action Localization
Published	2016-04-29
URL	http://arxiv.org/abs/1604.08826v1
PDF	http://arxiv.org/pdf/1604.08826v1.pdf
PWC	https://paperswithcode.com/paper/improved-dense-trajectory-with-cross-streams
Repo
Framework

Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $ \ell^1 $ and $ \ell^0 $ Controls


Title	Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $ \ell^1 $ and $ \ell^0 $ Controls
Authors	Jason M. Klusowski, Andrew R. Barron
Abstract	We establish $ L^{\infty} $ and $ L^2 $ error bounds for functions of many variables that are approximated by linear combinations of ReLU (rectified linear unit) and squared ReLU ridge functions with $ \ell^1 $ and $ \ell^0 $ controls on their inner and outer parameters. With the squared ReLU ridge function, we show that the $ L^2 $ approximation error is inversely proportional to the inner layer $ \ell^0 $ sparsity and it need only be sublinear in the outer layer $ \ell^0 $ sparsity. Our constructions are obtained using a variant of the Jones-Barron probabilistic method, which can be interpreted as either stratified sampling with proportionate allocation or two-stage cluster sampling. We also provide companion error lower bounds that reveal near optimality of our constructions. Despite the sparsity assumptions, we showcase the richness and flexibility of these ridge combinations by defining a large family of functions, in terms of certain spectral conditions, that are particularly well approximated by them.
Tasks
Published	2016-07-26
URL	http://arxiv.org/abs/1607.07819v3
PDF	http://arxiv.org/pdf/1607.07819v3.pdf
PWC	https://paperswithcode.com/paper/approximation-by-combinations-of-relu-and
Repo
Framework
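
To make the objects in the abstract concrete, here is a small sketch of a linear combination of ReLU or squared ReLU ridge functions, f(x) = sum_k c_k * max(a_k . x + b_k, 0)^s with s = 1 or 2, together with the l0 and l1 quantities one might control. The parameter names and the specific form with a bias term are assumptions made for illustration; the paper's exact parameterization and normalization may differ.

```python
from typing import List, Sequence


def ridge_combination(x: Sequence[float],
                      inner_weights: List[Sequence[float]],
                      biases: Sequence[float],
                      outer_weights: Sequence[float],
                      power: int = 1) -> float:
    """Evaluate sum_k c_k * relu(a_k . x + b_k) ** power (power is 1 or 2)."""
    total = 0.0
    for a, b, c in zip(inner_weights, biases, outer_weights):
        z = sum(ai * xi for ai, xi in zip(a, x)) + b
        total += c * max(z, 0.0) ** power
    return total


def l0(v: Sequence[float]) -> int:
    """Number of nonzero entries (sparsity)."""
    return sum(1 for t in v if t != 0)


def l1(v: Sequence[float]) -> float:
    """Sum of absolute values."""
    return sum(abs(t) for t in v)


# Hypothetical two-term combination on three input variables.
inner = [[1.0, 0.0, -1.0], [0.0, 2.0, 0.0]]
bias = [0.0, -1.0]
outer = [0.5, -1.0]
x = [2.0, 1.0, 0.5]

relu_value = ridge_combination(x, inner, bias, outer, power=1)
squared_value = ridge_combination(x, inner, bias, outer, power=2)
inner_sparsity = [l0(a) for a in inner]  # nonzeros per inner vector
outer_l1 = l1(outer)                     # l1 norm of the outer coefficients
```

Here the first ridge function depends on only two of the three inputs and the second on one, which is the kind of inner-layer sparsity the abstract refers to.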

A practical local tomography reconstruction algorithm based on known subregion


Title	A practical local tomography reconstruction algorithm based on known subregion
Authors	Pierre Paleo, Michel Desvignes, Alessandro Mirone
Abstract	We propose a new method to reconstruct data acquired in a local tomography setup. This method uses an initial reconstruction and refines it by correcting the low frequency artifacts known as the cupping effect. A basis of Gaussian functions is used to correct the initial reconstruction. The coefficients of this basis are iteratively optimized under the constraint of a known subregion. Using a coarse basis reduces the degrees of freedom of the problem while actually correcting the cupping effect. Simulations show that the known region constraint yields an unbiased reconstruction, in accordance with uniqueness theorems stated in local tomography.
Tasks
Published	2016-06-15
URL	http://arxiv.org/abs/1606.04940v1
PDF	http://arxiv.org/pdf/1606.04940v1.pdf
PWC	https://paperswithcode.com/paper/a-practical-local-tomography-reconstruction
Repo
Framework

Sparse Diffusion Steepest-Descent for One Bit Compressed Sensing in Wireless Sensor Networks


Title	Sparse Diffusion Steepest-Descent for One Bit Compressed Sensing in Wireless Sensor Networks
Authors	Hadi Zayyani, Mehdi Korki, Farrokh Marvasti
Abstract	This letter proposes a sparse diffusion steepest-descent algorithm for one-bit compressed sensing in wireless sensor networks. The approach exploits the diffusion strategy from distributed learning in the one-bit compressed sensing framework. To estimate a common sparse vector cooperatively from only the sign of measurements, steepest-descent is used to minimize suitable global and local convex cost functions. A diffusion strategy is suggested for distributed learning of the sparse vector. Simulation results show the effectiveness of the proposed distributed algorithm compared to the state-of-the-art non-distributed algorithms in the one-bit compressed sensing framework.
Tasks
Published	2016-01-03
URL	http://arxiv.org/abs/1601.00350v1
PDF	http://arxiv.org/pdf/1601.00350v1.pdf
PWC	https://paperswithcode.com/paper/sparse-diffusion-steepest-descent-for-one-bit
Repo
Framework

IEDC: An Integrated Approach for Overlapping and Non-overlapping Community Detection


Title	IEDC: An Integrated Approach for Overlapping and Non-overlapping Community Detection
Authors	Mahdi Hajiabadi, Hadi Zare, Hossein Bobarshad
Abstract	Community detection is a task of fundamental importance in social network analysis that can be used in a variety of knowledge-based domains. While there exist many works on community detection based on connectivity structures, they are limited to considering either overlapping or non-overlapping communities. In this work, we propose a novel approach for general community detection through an integrated framework to extract the overlapping and non-overlapping community structures without assuming prior structural connectivity on networks. Our general framework is based on a primary node-based criterion which consists of the internal association degree along with the external association degree. The proposed method is evaluated through extensive simulation experiments and several benchmark real network datasets. The experimental results show that the proposed method outperforms the earlier state-of-the-art algorithms based on the well-known evaluation criteria.
Tasks	Community Detection
Published	2016-12-14
URL	http://arxiv.org/abs/1612.04679v2
PDF	http://arxiv.org/pdf/1612.04679v2.pdf
PWC	https://paperswithcode.com/paper/iedc-an-integrated-approach-for-overlapping
Repo
Framework

AIDE: Fast and Communication Efficient Distributed Optimization


Title	AIDE: Fast and Communication Efficient Distributed Optimization
Authors	Sashank J. Reddi, Jakub Konečný, Peter Richtárik, Barnabás Póczós, Alex Smola
Abstract	In this paper, we present two new communication-efficient methods for distributed minimization of an average of functions. The first algorithm is an inexact variant of the DANE algorithm that allows any local algorithm to return an approximate solution to a local subproblem. We show that such a strategy does not affect the theoretical guarantees of DANE significantly. In fact, our approach can be viewed as a robustification strategy since the method is substantially better behaved than DANE on data partitions arising in practice. It is well known that the DANE algorithm does not match the communication complexity lower bounds. To bridge this gap, we propose an accelerated variant of the first method, called AIDE, that not only matches the communication lower bounds but can also be implemented using a purely first-order oracle. Our empirical results show that AIDE is superior to other communication-efficient algorithms in settings that naturally arise in machine learning applications.
Tasks	Distributed Optimization
Published	2016-08-24
URL	http://arxiv.org/abs/1608.06879v1
PDF	http://arxiv.org/pdf/1608.06879v1.pdf
PWC	https://paperswithcode.com/paper/aide-fast-and-communication-efficient
Repo
Framework

Distributed Optimization for Client-Server Architecture with Negative Gradient Weights


Title	Distributed Optimization for Client-Server Architecture with Negative Gradient Weights
Authors	Shripad Gade, Nitin H. Vaidya
Abstract	The availability of both massive datasets and computing resources has made machine learning and predictive analytics extremely pervasive. In this work we present a synchronous algorithm and architecture for distributed optimization motivated by privacy requirements posed by applications in machine learning. We present an algorithm for the recently proposed multi-parameter-server architecture. We consider a group of parameter servers that learn a model based on randomized gradients received from clients. Clients are computational entities with private datasets (inducing a private objective function) that evaluate and upload randomized gradients to the parameter servers. The parameter servers perform model updates based on received gradients and share the model parameters with other servers. We prove that the proposed algorithm can optimize the overall objective function for a very general architecture involving $C$ clients connected to $S$ parameter servers in an arbitrary time-varying topology and the parameter servers forming a connected network.
Tasks	Distributed Optimization
Published	2016-08-12
URL	http://arxiv.org/abs/1608.03866v2
PDF	http://arxiv.org/pdf/1608.03866v2.pdf
PWC	https://paperswithcode.com/paper/distributed-optimization-for-client-server
Repo
Framework

GaDei: On Scale-up Training As A Service For Deep Learning


Title	GaDei: On Scale-up Training As A Service For Deep Learning
Authors	Wei Zhang, Minwei Feng, Yunhui Zheng, Yufei Ren, Yandong Wang, Ji Liu, Peng Liu, Bing Xiang, Li Zhang, Bowen Zhou, Fei Wang
Abstract	Deep learning (DL) training-as-a-service (TaaS) is an important emerging industrial workload. The unique challenge of TaaS is that it must satisfy a wide range of customers who lack the experience and resources to tune DL hyper-parameters, and meticulous tuning for each user’s dataset is prohibitively expensive. Therefore, TaaS hyper-parameters must be fixed with values that are applicable to all users. IBM Watson Natural Language Classifier (NLC) service, the most popular IBM cognitive service used by thousands of enterprise-level clients around the globe, is a typical TaaS service. By evaluating the NLC workloads, we show that only the conservative hyper-parameter setup (e.g., small mini-batch size and small learning rate) can guarantee acceptable model accuracy for a wide range of customers. We further justify theoretically why such a setup guarantees better model convergence in general. Unfortunately, the small mini-batch size causes a high volume of communication traffic in a parameter-server based system. We characterize the high communication bandwidth requirement of TaaS using representative industrial deep learning workloads and demonstrate that none of the state-of-the-art scale-up or scale-out solutions can satisfy such a requirement. We then present GaDei, an optimized shared-memory based scale-up parameter server design. We prove that the designed protocol is deadlock-free and that it processes each gradient exactly once. Our implementation is evaluated on both commercial and public benchmarks. It significantly outperforms the state-of-the-art parameter-server based implementation while maintaining the required accuracy, and it comes close to the best possible runtime performance, constrained only by hardware limitations. Furthermore, to the best of our knowledge, GaDei is the only scale-up DL system that provides fault-tolerance.
Tasks
Published	2016-11-18
URL	http://arxiv.org/abs/1611.06213v2
PDF	http://arxiv.org/pdf/1611.06213v2.pdf
PWC	https://paperswithcode.com/paper/gadei-on-scale-up-training-as-a-service-for
Repo
Framework

Delta Epsilon Alpha Star: A PAC-Admissible Search Algorithm


Title	Delta Epsilon Alpha Star: A PAC-Admissible Search Algorithm
Authors	David Cox
Abstract	Delta Epsilon Alpha Star is a minimal coverage, real-time robotic search algorithm that yields a moderately aggressive search path with minimal backtracking. Search performance is bounded by placing a combinatorial bound, epsilon and delta, on the maximum deviation from the theoretical shortest path and the probability at which further deviations can occur. Additionally, we formally define the notion of PAC-admissibility, a relaxed admissibility criterion for algorithms, and show that PAC-admissible algorithms are better suited to robotic search situations than epsilon-admissible or strict algorithms.
Tasks
Published	2016-08-08
URL	http://arxiv.org/abs/1608.02287v1
PDF	http://arxiv.org/pdf/1608.02287v1.pdf
PWC	https://paperswithcode.com/paper/delta-epsilon-alpha-star-a-pac-admissible
Repo
Framework

Investigating the effects Diversity Mechanisms have on Evolutionary Algorithms in Dynamic Environments


Title	Investigating the effects Diversity Mechanisms have on Evolutionary Algorithms in Dynamic Environments
Authors	Matthew Hughes
Abstract	Evolutionary algorithms have been successfully applied to a variety of optimisation problems in stationary environments. However, many real-world optimisation problems are set in dynamic environments where the success criteria shift regularly. Population diversity affects algorithmic performance, particularly on multiobjective and dynamic problems. Diversity mechanisms are methods of altering evolutionary algorithms in a way that promotes the maintenance of population diversity. This project intends to measure and compare the effect that a variety of diversity mechanisms have on the performance of an evolutionary algorithm when facing an assortment of dynamic problems.
Tasks
Published	2016-10-09
URL	http://arxiv.org/abs/1610.02732v1
PDF	http://arxiv.org/pdf/1610.02732v1.pdf
PWC	https://paperswithcode.com/paper/investigating-the-effects-diversity
Repo
Framework
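
As a rough illustration of what a diversity mechanism can look like, the sketch below measures population diversity as the mean pairwise Hamming distance between bitstring individuals and applies a random-immigrants step that replaces the least fit individuals with random ones. Both the diversity measure and the choice of random immigrants are assumptions picked for illustration; the abstract does not say which mechanisms or measures the project uses.

```python
import random
from itertools import combinations
from typing import Callable, List

Individual = List[int]


def hamming(a: Individual, b: Individual) -> int:
    """Number of positions at which two bitstrings differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_diversity(population: List[Individual]) -> float:
    """Average Hamming distance over all unordered pairs of individuals."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def random_immigrants(population: List[Individual],
                      fitness: Callable[[Individual], float],
                      fraction: float,
                      rng: random.Random) -> List[Individual]:
    """Replace the least fit `fraction` of the population with random bitstrings."""
    n_replace = int(len(population) * fraction)
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: len(population) - n_replace]
    length = len(population[0])
    immigrants = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(n_replace)]
    return survivors + immigrants


# A fully converged population has zero diversity under this measure.
converged = [[1, 1, 1, 1] for _ in range(4)]
before = mean_pairwise_diversity(converged)

# Toy fitness (number of ones); half of the population is replaced.
refreshed = random_immigrants(converged, fitness=sum, fraction=0.5,
                              rng=random.Random(0))
after = mean_pairwise_diversity(refreshed)
```

Tracking a measure like this across generations, with and without the mechanism, is one simple way to compare how well different mechanisms maintain diversity after the environment changes.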