May 6, 2019

Paper Group ANR 187

Characterizing Driving Styles with Deep Learning


Title	Characterizing Driving Styles with Deep Learning
Authors	Weishan Dong, Jian Li, Renjie Yao, Changsheng Li, Ting Yuan, Lanjun Wang
Abstract	Characterizing driving styles of human drivers using vehicle sensor data, e.g., GPS, is an interesting research problem and an important real-world requirement from automotive industries. A good representation of driving features can be highly valuable for autonomous driving, auto insurance, and many other application scenarios. However, traditional methods mainly rely on handcrafted features, which limits the performance that machine learning algorithms can achieve. In this paper, we propose a novel deep learning solution to this problem, which could be the first attempt to extend deep learning to driving behavior analysis based on GPS data. The proposed approach can effectively extract high-level and interpretable features describing complex driving patterns. It also requires significantly less human experience and work. The power of the learned driving style representations is validated through the driver identification problem using a large real dataset.
Tasks	Autonomous Driving
Published	2016-07-13
URL	http://arxiv.org/abs/1607.03611v2
PDF	http://arxiv.org/pdf/1607.03611v2.pdf
PWC	https://paperswithcode.com/paper/characterizing-driving-styles-with-deep
Repo
Framework

Towards a Theoretical Analysis of PCA for Heteroscedastic Data


Title	Towards a Theoretical Analysis of PCA for Heteroscedastic Data
Authors	David Hong, Laura Balzano, Jeffrey A. Fessler
Abstract	Principal Component Analysis (PCA) is a method for estimating a subspace given noisy samples. It is useful in a variety of problems ranging from dimensionality reduction to anomaly detection and the visualization of high dimensional data. PCA performs well in the presence of moderate noise and even with missing data, but is also sensitive to outliers. PCA is also known to have a phase transition when noise is independent and identically distributed; recovery of the subspace sharply declines at a threshold noise variance. Effective use of PCA requires a rigorous understanding of these behaviors. This paper provides a step towards an analysis of PCA for samples with heteroscedastic noise, that is, samples that have non-uniform noise variances and so are no longer identically distributed. In particular, we provide a simple asymptotic prediction of the recovery of a one-dimensional subspace from noisy heteroscedastic samples. The prediction enables: a) easy and efficient calculation of the asymptotic performance, and b) qualitative reasoning to understand how PCA is impacted by heteroscedasticity (such as outliers).
Tasks	Anomaly Detection, Dimensionality Reduction
Published	2016-10-12
URL	http://arxiv.org/abs/1610.03595v1
PDF	http://arxiv.org/pdf/1610.03595v1.pdf
PWC	https://paperswithcode.com/paper/towards-a-theoretical-analysis-of-pca-for
Repo
Framework
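
The setting in the abstract above (recovering a one-dimensional subspace from samples whose noise variance differs from sample to sample) can be simulated in a few lines. The following is a minimal sketch, not the paper's asymptotic prediction: the function names, the choice of true direction, the sample sizes, and the use of power iteration are all assumptions made for illustration.

```python
import math
import random

def leading_component(samples, iters=200):
    """Estimate the top principal direction by power iteration."""
    d = len(samples[0])
    # Unnormalized second-moment matrix (the simulated data have zero mean).
    moment = [[sum(x[i] * x[j] for x in samples) for j in range(d)]
              for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(moment[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def recovery(noise_stds, n=400, d=20, seed=0):
    """Squared inner product between the true and the estimated direction.

    noise_stds is cycled over the samples, so [0.2, 1.4] gives half the
    samples low noise and half high noise (an assumed toy setup).
    """
    rng = random.Random(seed)
    u = [1.0] + [0.0] * (d - 1)  # assumed true one-dimensional subspace
    samples = []
    for k in range(n):
        sigma = noise_stds[k % len(noise_stds)]
        coeff = rng.gauss(0.0, 1.0)
        samples.append([coeff * u[i] + rng.gauss(0.0, sigma) for i in range(d)])
    v = leading_component(samples)
    return sum(a * b for a, b in zip(u, v)) ** 2

if __name__ == "__main__":
    # Same average noise variance (1.0), distributed uniformly vs. unevenly.
    print("homoscedastic:  ", recovery([1.0]))
    print("heteroscedastic:", recovery([0.2, 1.4]))
```

A recovery value near 1 means the estimated direction aligns with the true one. The numbers printed here are single empirical draws and should not be read as the paper's asymptotic result.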

Improving Agreement and Disagreement Identification in Online Discussions with A Socially-Tuned Sentiment Lexicon


Title	Improving Agreement and Disagreement Identification in Online Discussions with A Socially-Tuned Sentiment Lexicon
Authors	Lu Wang, Claire Cardie
Abstract	We study the problem of agreement and disagreement detection in online discussions. An isotonic Conditional Random Fields (isotonic CRF) based sequential model is proposed to make predictions on sentence- or segment-level. We automatically construct a socially-tuned lexicon that is bootstrapped from existing general-purpose sentiment lexicons to further improve the performance. We evaluate our agreement and disagreement tagging model on two disparate online discussion corpora – Wikipedia Talk pages and online debates. Our model is shown to outperform the state-of-the-art approaches in both datasets. For example, the isotonic CRF model achieves F1 scores of 0.74 and 0.67 for agreement and disagreement detection, when a linear chain CRF obtains 0.58 and 0.56 for the discussions on Wikipedia Talk pages.
Tasks
Published	2016-06-17
URL	http://arxiv.org/abs/1606.05706v1
PDF	http://arxiv.org/pdf/1606.05706v1.pdf
PWC	https://paperswithcode.com/paper/improving-agreement-and-disagreement
Repo
Framework

Automatic Composition and Optimization of Multicomponent Predictive Systems With an Extended Auto-WEKA


Title	Automatic Composition and Optimization of Multicomponent Predictive Systems With an Extended Auto-WEKA
Authors	Manuel Martin Salvador, Marcin Budka, Bogdan Gabrys
Abstract	Composition and parameterization of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps are a challenging task. Auto-WEKA is a tool to automate the combined algorithm selection and hyperparameter (CASH) optimization problem. In this paper, we extend the CASH problem and Auto-WEKA to support the MCPS, including preprocessing steps for both classification and regression tasks. We define the optimization problem in which the search space consists of suitably parameterized Petri nets forming the sought MCPS solutions. In the experimental analysis, we focus on examining the impact of considerably extending the search space (from approximately 22,000 to 812 billion possible combinations of methods and categorical hyperparameters). In a range of extensive experiments, three different optimization strategies are used to automatically compose MCPSs for 21 publicly available data sets. The diversity of the composed MCPSs found is an indication that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. We also present the results on seven data sets from real chemical production processes. Our findings can have a major impact on the development of high-quality predictive models as well as their maintenance and scalability aspects needed in modern applications and deployment scenarios.
Tasks
Published	2016-12-28
URL	http://arxiv.org/abs/1612.08789v2
PDF	http://arxiv.org/pdf/1612.08789v2.pdf
PWC	https://paperswithcode.com/paper/automatic-composition-and-optimization-of
Repo
Framework

Neural Networks Classifier for Data Selection in Statistical Machine Translation


Title	Neural Networks Classifier for Data Selection in Statistical Machine Translation
Authors	Álvaro Peris, Mara Chinea-Rios, Francisco Casacuberta
Abstract	We address the data selection problem in statistical machine translation (SMT) as a classification task. The new data selection method is based on a neural network classifier. We present a new method description and empirical results proving that our data selection method provides better translation quality, compared to a state-of-the-art method (i.e., Cross entropy). Moreover, the empirical results reported are coherent across different language pairs.
Tasks	Machine Translation
Published	2016-12-16
URL	http://arxiv.org/abs/1612.05555v2
PDF	http://arxiv.org/pdf/1612.05555v2.pdf
PWC	https://paperswithcode.com/paper/neural-networks-classifier-for-data-selection
Repo
Framework

Question Generation from a Knowledge Base with Web Exploration


Title	Question Generation from a Knowledge Base with Web Exploration
Authors	Linfeng Song, Lin Zhao
Abstract	Question generation from a knowledge base (KB) is the task of generating questions related to the domain of the input KB. We propose a system for generating fluent and natural questions from a KB, which significantly reduces the human effort by leveraging massive web resources. In more detail, a seed question set is first generated by applying a small number of hand-crafted templates to the input KB; more questions are then retrieved by iteratively submitting already obtained questions as search queries to a standard search engine; finally, questions are selected by estimating their fluency and domain relevance. Evaluated on 500 randomly selected triples from Freebase, questions generated by our system are judged by human graders to be more fluent than those of Serban et al. (2016).
Tasks	Question Generation
Published	2016-10-12
URL	http://arxiv.org/abs/1610.03807v2
PDF	http://arxiv.org/pdf/1610.03807v2.pdf
PWC	https://paperswithcode.com/paper/question-generation-from-a-knowledge-base
Repo
Framework

Training Bit Fully Convolutional Network for Fast Semantic Segmentation


Title	Training Bit Fully Convolutional Network for Fast Semantic Segmentation
Authors	He Wen, Shuchang Zhou, Zhe Liang, Yuxiang Zhang, Dieqiao Feng, Xinyu Zhou, Cong Yao
Abstract	Fully convolutional neural networks give accurate, per-pixel prediction for input images and have applications like semantic segmentation. However, a typical FCN usually requires a large amount of floating point computation and run-time memory, which effectively limits its usability. We propose a method to train a Bit Fully Convolutional Network (BFCN), a fully convolutional neural network that has low bit-width weights and activations. Because most of its computation-intensive convolutions are accomplished between low bit-width numbers, a BFCN can be accelerated by an efficient bit-convolution implementation. On CPU, the dot product operation between two bit vectors can be reduced to bitwise operations and popcounts, which can offer much higher throughput than 32-bit multiplications and additions. To validate the effectiveness of BFCN, we conduct experiments on the PASCAL VOC 2012 semantic segmentation task and Cityscapes. Our BFCN with 1-bit weights and 2-bit activations, which runs 7.8x faster on CPU or requires less than 1% of the resources on FPGA, can achieve performance comparable to its 32-bit counterpart.
Tasks	Semantic Segmentation
Published	2016-12-01
URL	http://arxiv.org/abs/1612.00212v1
PDF	http://arxiv.org/pdf/1612.00212v1.pdf
PWC	https://paperswithcode.com/paper/training-bit-fully-convolutional-network-for
Repo
Framework
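
The abstract's remark that a dot product between bit vectors reduces to bitwise operations and popcounts can be illustrated as follows. This is a minimal sketch under assumptions: values are encoded as unsigned {0, 1} bits (the paper's actual quantization scheme is not given in the abstract), and the helper names `pack_bits`, `popcount`, `bit_dot`, and `multibit_dot` are hypothetical.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into one integer, first element in the lowest bit."""
    word = 0
    for position, bit in enumerate(bits):
        word |= (bit & 1) << position
    return word

def popcount(word):
    """Number of set bits in a non-negative integer."""
    return bin(word).count("1")

def bit_dot(a_bits, b_bits):
    """Dot product of two {0, 1} vectors: AND the packed words, then count ones."""
    return popcount(pack_bits(a_bits) & pack_bits(b_bits))

def multibit_dot(weights, activations, act_bits=2):
    """1-bit weights times unsigned act_bits-bit activations.

    Each activation is split into bit planes; plane p contributes
    popcount(weights AND plane) scaled by 2**p.
    """
    w = pack_bits(weights)
    total = 0
    for plane in range(act_bits):
        plane_word = pack_bits([(a >> plane) & 1 for a in activations])
        total += popcount(w & plane_word) << plane
    return total

if __name__ == "__main__":
    print(bit_dot([1, 0, 1, 1], [1, 1, 0, 1]))        # 2
    print(multibit_dot([1, 0, 1, 1], [3, 2, 0, 1]))   # 4
```

In a real implementation the packing would be done once per tensor and the AND/popcount steps would map to native machine instructions; the Python version only shows the arithmetic identity.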

Adaptive Submodular Ranking and Routing


Title	Adaptive Submodular Ranking and Routing
Authors	Fatemeh Navidi, Prabhanjan Kambadur, Viswanath Nagarajan
Abstract	We study a general stochastic ranking problem where an algorithm needs to adaptively select a sequence of elements so as to “cover” a random scenario (drawn from a known distribution) at minimum expected cost. The coverage of each scenario is captured by an individual submodular function, where the scenario is said to be covered when its function value goes above a given threshold. We obtain a logarithmic factor approximation algorithm for this adaptive ranking problem, which is the best possible (unless P=NP). This problem unifies and generalizes many previously studied problems with applications in search ranking and active learning. The approximation ratio of our algorithm either matches or improves the best result known in each of these special cases. Furthermore, we extend our results to an adaptive vehicle routing problem, where costs are determined by an underlying metric. This routing problem is a significant generalization of the previously-studied adaptive traveling salesman and traveling repairman problems. Our approximation ratio nearly matches the best bound known for these special cases. Finally, we present experimental results for some applications of adaptive ranking.
Tasks	Active Learning
Published	2016-06-05
URL	http://arxiv.org/abs/1606.01530v2
PDF	http://arxiv.org/pdf/1606.01530v2.pdf
PWC	https://paperswithcode.com/paper/adaptive-submodular-ranking-and-routing
Repo
Framework

Mining Software Components from Object-Oriented APIs


Title	Mining Software Components from Object-Oriented APIs
Authors	Anas Shatnawi, Abdelhak Seriai, Houari Sahraoui, Zakarea Al-Shara
Abstract	Object-oriented Application Programming Interfaces (APIs) support software reuse by providing pre-implemented functionalities. Due to the huge number of included classes, reusing and understanding large APIs is a complex task. By contrast, software components are generally accepted to be more reusable and understandable entities than object-oriented ones. Thus, in this paper, we propose an approach for reengineering object-oriented APIs into component-based ones. We mine components as groups of classes based on how frequently they are used together and their ability to form a quality-centric component. To validate our approach, we experimented on 100 Java applications that used Android APIs.
Tasks
Published	2016-06-02
URL	http://arxiv.org/abs/1606.00561v1
PDF	http://arxiv.org/pdf/1606.00561v1.pdf
PWC	https://paperswithcode.com/paper/mining-software-components-from-object
Repo
Framework

Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-off


Title	Runtime Configurable Deep Neural Networks for Energy-Accuracy Trade-off
Authors	Hokchhay Tann, Soheil Hashemi, R. Iris Bahar, Sherief Reda
Abstract	We present a novel dynamic configuration technique for deep neural networks that permits step-wise energy-accuracy trade-offs during runtime. Our configuration technique adjusts the number of channels in the network dynamically depending on response time, power, and accuracy targets. To enable this dynamic configuration technique, we co-design a new training algorithm, where the network is incrementally trained such that the weights in channels trained in earlier steps are fixed. Our technique provides the flexibility of multiple networks while storing and utilizing one set of weights. We evaluate our techniques using both an ASIC-based hardware accelerator as well as a low-power embedded GPGPU and show that our approach leads to only a small or negligible loss in the final network accuracy. We analyze the performance of our proposed methodology using three well-known networks for MNIST, CIFAR-10, and SVHN datasets, and we show that we are able to achieve up to 95% energy reduction with less than 1% accuracy loss across the three benchmarks. In addition, compared to prior work on dynamic network reconfiguration, we show that our approach leads to approximately 50% savings in storage requirements, while achieving similar accuracy.
Tasks
Published	2016-07-19
URL	http://arxiv.org/abs/1607.05418v2
PDF	http://arxiv.org/pdf/1607.05418v2.pdf
PWC	https://paperswithcode.com/paper/runtime-configurable-deep-neural-networks-for
Repo
Framework

Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)


Title	Towards Analytics Aware Ontology Based Access to Static and Streaming Data (Extended Version)
Authors	Evgeny Kharlamov, Yannis Kotidis, Theofilos Mailis, Christian Neuenstadt, Charalampos Nikolaou, Özgür Özcep, Christoforos Svingos, Dmitriy Zheleznyakov, Sebastian Brandt, Ian Horrocks, Yannis Ioannidis, Steffen Lamparter, Ralf Möller
Abstract	Real-time analytics that requires integration and aggregation of heterogeneous and distributed streaming and static data is a typical task in many industrial scenarios, such as diagnostics of turbines at Siemens. The ontology-based data access (OBDA) approach has great potential to facilitate such tasks; however, it has a number of limitations in dealing with analytics that restrict its use in important industrial applications. Based on our experience with Siemens, we argue that in order to overcome those limitations OBDA should be extended to become analytics, source, and cost aware. In this work we propose such an extension. In particular, we propose an ontology, mapping, and query language for OBDA in which aggregate and other analytical functions are first-class citizens. Moreover, we develop query optimisation techniques that allow analytical tasks over static and streaming data to be processed efficiently. We implement our approach in a system and evaluate our system with Siemens turbine data.
Tasks
Published	2016-07-18
URL	http://arxiv.org/abs/1607.05351v2
PDF	http://arxiv.org/pdf/1607.05351v2.pdf
PWC	https://paperswithcode.com/paper/towards-analytics-aware-ontology-based-access
Repo
Framework

Evolving Shepherding Behavior with Genetic Programming Algorithms


Title	Evolving Shepherding Behavior with Genetic Programming Algorithms
Authors	Joshua Brulé, Kevin Engel, Nick Fung, Isaac Julien
Abstract	We apply genetic programming techniques to the 'shepherding' problem, in which a group of one type of animal (sheep dogs) attempts to control the movements of a second group of animals (sheep) obeying flocking behavior. Our genetic programming algorithm evolves an expression tree that governs the movements of each dog. The operands of the tree are hand-selected features of the simulation environment that may allow the dogs to herd the sheep effectively. The algorithm uses tournament-style selection, crossover reproduction, and a point mutation. We find that the evolved solutions generalize well and outperform a (naive) human-designed algorithm.
Tasks
Published	2016-03-19
URL	http://arxiv.org/abs/1603.06141v1
PDF	http://arxiv.org/pdf/1603.06141v1.pdf
PWC	https://paperswithcode.com/paper/evolving-shepherding-behavior-with-genetic
Repo
Framework
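
The three operators named in the abstract above (tournament selection, crossover, and point mutation on expression trees) can be sketched generically. This is not the authors' implementation: the operator set, the feature names `dist_to_flock` and `dist_to_goal`, the tuple encoding of trees, and the 0.3 stopping probability are all assumptions chosen for illustration.

```python
import random

# Hypothetical operator set and feature names, for illustration only.
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
FEATURES = ["dist_to_flock", "dist_to_goal"]

def evaluate(tree, features):
    """A tree is either a feature name or a tuple (operator, left, right)."""
    if isinstance(tree, str):
        return features[tree]
    op, left, right = tree
    return OPERATORS[op](evaluate(left, features), evaluate(right, features))

def subtrees(tree):
    """Yield every subtree, including the tree itself."""
    yield tree
    if not isinstance(tree, str):
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])

def point_mutation(tree, rng):
    """Relabel one randomly chosen node while keeping the tree shape."""
    if isinstance(tree, str):
        return rng.choice(FEATURES)
    op, left, right = tree
    choice = rng.randrange(3)
    if choice == 0:
        return (rng.choice(sorted(OPERATORS)), left, right)
    if choice == 1:
        return (op, point_mutation(left, rng), right)
    return (op, left, point_mutation(right, rng))

def graft(tree, donor, rng):
    """Replace a randomly reached position in tree with the donor subtree."""
    if isinstance(tree, str) or rng.random() < 0.3:
        return donor
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, graft(left, donor, rng), right)
    return (op, left, graft(right, donor, rng))

def crossover(parent_a, parent_b, rng):
    """Child is parent_a with one position replaced by a subtree of parent_b."""
    donor = rng.choice(list(subtrees(parent_b)))
    return graft(parent_a, donor, rng)

def tournament(population, fitness, rng, k=3):
    """Pick k individuals at random and return the fittest of them."""
    return max(rng.sample(population, k), key=fitness)
```

In a full system the fitness function would score each tree by running the herding simulation; here it is left as a caller-supplied function because the abstract does not describe it.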

NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization


Title	NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
Authors	Davood Hajinezhad, Mingyi Hong, Tuo Zhao, Zhaoran Wang
Abstract	We study a stochastic and distributed algorithm for nonconvex problems whose objective consists of a sum of $N$ nonconvex $L_i/N$-smooth functions, plus a nonsmooth regularizer. The proposed NonconvEx primal-dual SpliTTing (NESTT) algorithm splits the problem into $N$ subproblems, and utilizes an augmented Lagrangian based primal-dual scheme to solve it in a distributed and stochastic manner. With a special non-uniform sampling, a version of NESTT achieves an $\epsilon$-stationary solution using $\mathcal{O}((\sum_{i=1}^N\sqrt{L_i/N})^2/\epsilon)$ gradient evaluations, which can be up to $\mathcal{O}(N)$ times better than the (proximal) gradient descent methods. It also achieves a Q-linear convergence rate for nonconvex $\ell_1$ penalized quadratic problems with polyhedral constraints. Further, we reveal a fundamental connection between primal-dual based methods and a few primal-only methods such as IAG/SAG/SAGA.
Tasks	Stochastic Optimization
Published	2016-05-25
URL	http://arxiv.org/abs/1605.07747v2
PDF	http://arxiv.org/pdf/1605.07747v2.pdf
PWC	https://paperswithcode.com/paper/nestt-a-nonconvex-primal-dual-splitting
Repo
Framework

Probabilistic Relational Model Benchmark Generation


Title	Probabilistic Relational Model Benchmark Generation
Authors	Mouna Ben Ishak, Rajani Chulyadyo, Philippe Leray
Abstract	The validation of any database mining methodology goes through an evaluation process in which the availability of benchmarks is essential. In this paper, we aim to randomly generate relational database benchmarks that allow probabilistic dependencies among the attributes to be checked. We are particularly interested in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs) to a relational data mining context and enable effective and robust reasoning over relational data. Even though a panoply of works have focused, separately, on the generation of random Bayesian networks and relational databases, no work has been identified for PRMs on that track. This paper provides an algorithmic approach for generating random PRMs from scratch to fill this gap. The proposed method makes it possible to generate PRMs as well as synthetic relational data from a randomly generated relational schema and a random set of probabilistic dependencies. This can be of interest not only for machine learning researchers to evaluate their proposals in a common framework, but also for database designers to evaluate the effectiveness of the components of a database management system.
Tasks
Published	2016-03-02
URL	http://arxiv.org/abs/1603.00709v1
PDF	http://arxiv.org/pdf/1603.00709v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-relational-model-benchmark
Repo
Framework

Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images


Title	Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images
Authors	Stephen Lombardi, Ko Nishino
Abstract	Recovering the radiometric properties of a scene (i.e., the reflectance, illumination, and geometry) is a long-sought ability of computer vision that can provide invaluable information for a wide range of applications. Deciphering the radiometric ingredients from the appearance of a real-world scene, as opposed to a single isolated object, is particularly challenging as it generally consists of various objects with different material compositions exhibiting complex reflectance and light interactions that are also part of the illumination. We introduce the first method for radiometric scene decomposition that handles those intricacies. We use RGB-D images to bootstrap geometry recovery and simultaneously recover the complex reflectance and natural illumination while refining the noisy initial geometry and segmenting the scene into different material regions. Most important, we handle real-world scenes consisting of multiple objects of unknown materials, which necessitates the modeling of spatially-varying complex reflectance, natural illumination, texture, interreflection and shadows. We systematically evaluate the effectiveness of our method on synthetic scenes and demonstrate its application to real-world scenes. The results show that rich radiometric information can be recovered from RGB-D images and demonstrate a new role RGB-D sensors can play for general scene understanding tasks.
Tasks	Scene Understanding
Published	2016-04-05
URL	http://arxiv.org/abs/1604.01354v1
PDF	http://arxiv.org/pdf/1604.01354v1.pdf
PWC	https://paperswithcode.com/paper/radiometric-scene-decomposition-scene
Repo
Framework