Paper Group ANR 689
Learning from Multi-View Multi-Way Data via Structural Factorization Machines
Title | Learning from Multi-View Multi-Way Data via Structural Factorization Machines |
Authors | Chun-Ta Lu, Lifang He, Hao Ding, Bokai Cao, Philip S. Yu |
Abstract | Real-world relations among entities can often be observed and determined by different perspectives/views. For example, the decision made by a user on whether to adopt an item relies on multiple aspects such as the contextual information of the decision, the item’s attributes, the user’s profile and the reviews given by other users. Different views may exhibit multi-way interactions among entities and provide complementary information. In this paper, we introduce a multi-tensor-based approach that can preserve the underlying structure of multi-view data in a generic predictive model. Specifically, we propose structural factorization machines (SFMs) that learn the common latent spaces shared by multi-view tensors and automatically adjust the importance of each view in the predictive model. Furthermore, the complexity of SFMs is linear in the number of parameters, which makes SFMs suitable for large-scale problems. Extensive experiments on real-world datasets demonstrate that the proposed SFMs outperform several state-of-the-art methods in terms of prediction accuracy and computational cost. |
Tasks | |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.03037v2 |
http://arxiv.org/pdf/1704.03037v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-multi-view-multi-way-data-via |
Repo | |
Framework | |
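The SFM model itself is not spelled out in the abstract, so as background the sketch below implements the classical second-order factorization machine that SFMs generalize to multi-view tensors, using the usual O(k·d) reformulation of the pairwise term; all variable names and sizes are illustrative.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction (Rendle-style baseline).

    x  : (n_features,) feature vector
    w0 : global bias
    w  : (n_features,) linear weights
    V  : (n_features, k) latent factors
    Uses the O(k * n_features) identity for the pairwise interaction term.
    """
    linear = w0 + w @ x
    xv = x @ V                                        # (k,)
    pairwise = 0.5 * np.sum(xv ** 2 - (x ** 2) @ (V ** 2))
    return linear + pairwise

rng = np.random.default_rng(0)
n_features, k = 20, 4
x = rng.random(n_features)
w0, w = 0.1, rng.normal(size=n_features)
V = 0.01 * rng.normal(size=(n_features, k))
print(fm_predict(x, w0, w, V))
```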
Byzantine-Tolerant Machine Learning
Title | Byzantine-Tolerant Machine Learning |
Authors | Peva Blanchard, El Mahdi El Mhamdi, Rachid Guerraoui, Julien Stainer |
Abstract | The growth of data, the need for scalability and the complexity of models used in modern machine learning call for distributed implementations. Yet, as of today, distributed machine learning frameworks have largely ignored the possibility of arbitrary (i.e., Byzantine) failures. In this paper, we study the robustness to Byzantine failures at the fundamental level of stochastic gradient descent (SGD), the heart of most machine learning algorithms. Assuming a set of $n$ workers, up to $f$ of them being Byzantine, we ask how robust SGD can be, without limiting either the dimension or the size of the parameter space. We first show that no gradient descent update rule based on a linear combination of the vectors proposed by the workers (i.e., current approaches) tolerates a single Byzantine failure. We then formulate a resilience property of the update rule capturing the basic requirements to guarantee convergence despite $f$ Byzantine workers. We finally propose Krum, an update rule that satisfies the aforementioned resilience property. For a $d$-dimensional learning problem, the time complexity of Krum is $O(n^2 \cdot (d + \log n))$. |
Tasks | |
Published | 2017-03-08 |
URL | http://arxiv.org/abs/1703.02757v1 |
http://arxiv.org/pdf/1703.02757v1.pdf | |
PWC | https://paperswithcode.com/paper/byzantine-tolerant-machine-learning |
Repo | |
Framework | |
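The Krum rule is concrete enough to sketch directly: each worker's proposed gradient is scored by the sum of squared distances to its n − f − 2 nearest peers, and the gradient with the lowest score is applied. The toy data below is illustrative only.

```python
import numpy as np

def krum(gradients, f):
    """Krum aggregation (Blanchard et al., 2017).

    gradients : list of n worker gradients, each a 1-D numpy array
    f         : assumed number of Byzantine workers (requires n > 2f + 2)
    Returns the gradient whose sum of squared distances to its
    n - f - 2 closest peers is smallest.
    """
    G = np.stack(gradients)                                # (n, d)
    n = len(G)
    assert n > 2 * f + 2, "Krum needs n > 2f + 2"
    sq = ((G[:, None, :] - G[None, :, :]) ** 2).sum(-1)    # pairwise squared distances
    scores = []
    for i in range(n):
        d = np.delete(sq[i], i)                            # distances to the other workers
        scores.append(np.sort(d)[: n - f - 2].sum())
    return G[int(np.argmin(scores))]

# toy usage: 5 honest workers plus 1 Byzantine outlier, f = 1
rng = np.random.default_rng(0)
grads = [rng.normal(0.0, 0.1, size=10) for _ in range(5)] + [np.full(10, 100.0)]
print(krum(grads, f=1)[:3])   # close to the honest gradients, ignoring the outlier
```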
A Fully Convolutional Network for Semantic Labeling of 3D Point Clouds
Title | A Fully Convolutional Network for Semantic Labeling of 3D Point Clouds |
Authors | Mohammed Yousefhussien, David J. Kelbe, Emmett J. Ientilucci, Carl Salvaggio |
Abstract | When classifying point clouds, a large amount of time is devoted to the process of engineering a reliable set of features which are then passed to a classifier of choice. Generally, such features - usually derived from the 3D-covariance matrix - are computed using the surrounding neighborhood of points. While these features capture local information, the process is usually time-consuming, and requires the application at multiple scales combined with contextual methods in order to adequately describe the diversity of objects within a scene. In this paper we present a 1D-fully convolutional network that consumes terrain-normalized points directly with the corresponding spectral data, if available, to generate point-wise labeling while implicitly learning contextual features in an end-to-end fashion. Our method uses only the 3D-coordinates and three corresponding spectral features for each point. Spectral features may either be extracted from 2D-georeferenced images, as shown here for Light Detection and Ranging (LiDAR) point clouds, or extracted directly for passive-derived point clouds, i.e., from multiple-view imagery. We train our network by splitting the data into square regions, and use a pooling layer that respects the permutation-invariance of the input points. Evaluated using the ISPRS 3D Semantic Labeling Contest, our method scored second place with an overall accuracy of 81.6%. We ranked third place with a mean F1-score of 63.32%, surpassing the F1-score of the method with highest accuracy by 1.69%. In addition to labeling 3D-point clouds, we also show that our method can be easily extended to 2D-semantic segmentation tasks, with promising initial results. |
Tasks | Semantic Segmentation |
Published | 2017-10-03 |
URL | http://arxiv.org/abs/1710.01408v1 |
http://arxiv.org/pdf/1710.01408v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-convolutional-network-for-semantic |
Repo | |
Framework | |
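A minimal sketch of the idea of a 1D fully convolutional point-labeling network: per-point features (xyz plus three spectral bands) pass through 1×1 convolutions, a permutation-invariant global pooling supplies context, and point-wise class logits come out. Layer widths, the pooling choice, and the 9-class output are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class PointFCN(nn.Module):
    """Sketch of a 1D fully convolutional point-labeling network (sizes assumed)."""
    def __init__(self, in_channels=6, n_classes=9):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=1), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=1), nn.ReLU(),
        )
        # permutation-invariant context: global max over all points
        self.head = nn.Sequential(
            nn.Conv1d(128 + 128, 128, kernel_size=1), nn.ReLU(),
            nn.Conv1d(128, n_classes, kernel_size=1),
        )

    def forward(self, x):                                # x: (B, 6, N) = xyz + 3 spectral bands
        feat = self.local(x)                             # (B, 128, N)
        ctx = feat.max(dim=2, keepdim=True).values       # (B, 128, 1) pooled context
        ctx = ctx.expand(-1, -1, x.shape[2])             # broadcast context to every point
        return self.head(torch.cat([feat, ctx], dim=1))  # (B, n_classes, N) point-wise logits

logits = PointFCN()(torch.randn(2, 6, 1024))
print(logits.shape)   # torch.Size([2, 9, 1024])
```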
Cost-Optimal Learning of Causal Graphs
Title | Cost-Optimal Learning of Causal Graphs |
Authors | Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath |
Abstract | We consider the problem of learning a causal graph over a set of variables with interventions. We study the cost-optimal causal graph learning problem: For a given skeleton (undirected version of the causal graph), design the set of interventions with minimum total cost, that can uniquely identify any causal graph with the given skeleton. We show that this problem is solvable in polynomial time. Later, we consider the case when the number of interventions is limited. For this case, we provide polynomial time algorithms when the skeleton is a tree or a clique tree. For a general chordal skeleton, we develop an efficient greedy algorithm, which can be improved when the causal graph skeleton is an interval graph. |
Tasks | |
Published | 2017-03-08 |
URL | http://arxiv.org/abs/1703.02645v1 |
http://arxiv.org/pdf/1703.02645v1.pdf | |
PWC | https://paperswithcode.com/paper/cost-optimal-learning-of-causal-graphs |
Repo | |
Framework | |
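The paper's cost-optimal constructions are not reproduced here; the toy greedy below only illustrates the underlying primitive: a single-node intervention reveals the orientation of every skeleton edge incident to the intervened vertex, so selecting interventions resembles a covering problem. Meek-rule propagation and intervention costs are omitted.

```python
import networkx as nx

def greedy_interventions(skeleton):
    """Toy greedy (NOT the paper's cost-optimal algorithm): repeatedly
    intervene on the vertex incident to the most still-unoriented edges.
    A single-node intervention orients every edge touching that vertex;
    the remaining edges would be handled by Meek rules, omitted here."""
    unoriented = set(frozenset(e) for e in skeleton.edges())
    interventions = []
    while unoriented:
        v = max(skeleton.nodes(),
                key=lambda u: sum(u in e for e in unoriented))
        interventions.append(v)
        unoriented = {e for e in unoriented if v not in e}
    return interventions

# small chordal toy skeleton
G = nx.Graph([(0, 1), (1, 2), (2, 3), (1, 3), (3, 4)])
print(greedy_interventions(G))
```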
Scene Text Eraser
Title | Scene Text Eraser |
Authors | Toshiki Nakamura, Anna Zhu, Keiji Yanai, Seiichi Uchida |
Abstract | The character information in natural scene images contains various personal information, such as telephone numbers, home addresses, etc. There is a high risk of leaking this information if such images are published. In this paper, we propose a scene text erasing method to properly hide the information via an inpainting convolutional neural network (CNN) model. The input is a scene text image, and the output is expected to be a text-erased image with all the character regions filled with the colors of the surrounding background pixels. This work is accomplished by a CNN model built from convolution and deconvolution layers with interconnections. The training samples and the corresponding inpainted images serve as teaching signals for training. To evaluate the text erasing performance, the output images are processed by a novel scene text detection method. Subsequently, the same text detection measurement is applied to the images of the benchmark dataset ICDAR2013. Compared with direct text detection, the scene text erasing process yields a drastic decrease in precision, recall and f-score, which demonstrates the effectiveness of the proposed method for erasing text in natural scene images. |
Tasks | Scene Text Detection |
Published | 2017-05-08 |
URL | http://arxiv.org/abs/1705.02772v1 |
http://arxiv.org/pdf/1705.02772v1.pdf | |
PWC | https://paperswithcode.com/paper/scene-text-eraser |
Repo | |
Framework | |
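A minimal conv-deconv sketch of the inpainting-style eraser: an encoder downsamples the scene-text image, a decoder upsamples it back, and a skip ("interconnection") link carries low-level detail. Depths, channel counts, and the sigmoid output are assumptions; training would regress the output against text-removed ground-truth images.

```python
import torch
import torch.nn as nn

class TextEraser(nn.Module):
    """Minimal conv-deconv inpainting sketch with one skip link (sizes assumed)."""
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1)  # 64 = 32 decoded + 32 skip

    def forward(self, x):                 # x: (B, 3, H, W) scene-text image
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        d2 = self.dec2(e2)
        out = self.dec1(torch.cat([d2, e1], dim=1))   # interconnection from encoder
        return torch.sigmoid(out)         # text-erased image in [0, 1]

erased = TextEraser()(torch.rand(1, 3, 64, 64))
print(erased.shape)                       # torch.Size([1, 3, 64, 64])
```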
Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks
Title | Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks |
Authors | Michele Alberti, Mathias Seuret, Vinaychandran Pondenkandath, Rolf Ingold, Marcus Liwicki |
Abstract | In this paper, we present a novel approach to perform layer-wise weight initialization of deep neural networks using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with: random values, greedy layer-wise pre-training (usually as Deep Belief Network or as auto-encoder) or by re-using the layers from another network (transfer learning). Hence, many training epochs are needed before meaningful weights are learned, or a rather similar dataset is required to seed the fine-tuning in transfer learning. In this paper, we describe how to turn an LDA into either a neural layer or a classification layer. We analyze the initialization technique on historical documents. First, we show that an LDA-based initialization is quick and leads to a very stable initialization. Furthermore, for the task of layout analysis at pixel level, we investigate the effectiveness of LDA-based initialization and show that it outperforms state-of-the-art random weight initialization methods. |
Tasks | Semantic Segmentation, Transfer Learning |
Published | 2017-10-19 |
URL | http://arxiv.org/abs/1710.07363v1 |
http://arxiv.org/pdf/1710.07363v1.pdf | |
PWC | https://paperswithcode.com/paper/historical-document-image-segmentation-with |
Repo | |
Framework | |
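A hedged sketch of the core trick: fit an LDA on (a subsample of) the training data and reuse its projection as the initial weights of a linear layer, instead of random initialization. The bias handling and the sklearn shrinkage settings are assumptions, not the authors' exact recipe.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy data standing in for pixel-level layout-analysis features (4 classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 4, size=500)

lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto").fit(X, y)
W_init = lda.scalings_[:, :3]                  # top discriminant directions as layer weights
b_init = -X.mean(axis=0) @ W_init              # center the projected data (assumed choice)

hidden = np.maximum(X @ W_init + b_init, 0.0)  # ReLU layer seeded by LDA instead of random init
print(hidden.shape)                            # (500, 3)
```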
$\left( \beta, \varpi \right)$-stability for cross-validation and the choice of the number of folds
Title | $\left( \beta, \varpi \right)$-stability for cross-validation and the choice of the number of folds |
Authors | Ning Xu, Jian Hong, Timothy C. G. Fisher |
Abstract | In this paper, we introduce a new concept of stability for cross-validation, called the $\left( \beta, \varpi \right)$-stability, and use it as a new perspective to build the general theory for cross-validation. The $\left( \beta, \varpi \right)$-stability mathematically connects the generalization ability and the stability of the cross-validated model via the Rademacher complexity. Our result reveals mathematically the effect of cross-validation from two sides: on one hand, cross-validation picks the model with the best empirical generalization ability by validating all the alternatives on test sets; on the other hand, cross-validation may compromise the stability of the model selection by causing subsampling error. Moreover, the difference between training and test errors in the $q$-th round, sometimes referred to as the generalization error, might be autocorrelated in $q$. Guided by the ideas above, the $\left( \beta, \varpi \right)$-stability helps us derive a new class of Rademacher bounds, referred to as the one-round/convoluted Rademacher bounds, for the stability of cross-validation in both the i.i.d.\ and non-i.i.d.\ cases. For both light-tail and heavy-tail losses, the new bounds quantify the stability of the one-round/average test error of the cross-validated model in terms of its one-round/average training error, the sample size $n$, the number of folds $K$, the tail property of the loss (encoded as Orlicz-$\Psi_\nu$ norms) and the Rademacher complexity of the model class $\Lambda$. The new class of bounds not only quantitatively reveals the stability of the generalization ability of the cross-validated model, it also shows empirically the optimal choice for the number of folds $K$, at which the upper bound of the one-round/average test error is lowest, or, to put it another way, where the test error is most stable. |
Tasks | Model Selection |
Published | 2017-05-20 |
URL | http://arxiv.org/abs/1705.07349v5 |
http://arxiv.org/pdf/1705.07349v5.pdf | |
PWC | https://paperswithcode.com/paper/left-varpi-right-stability-for-cross |
Repo | |
Framework | |
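As an empirical companion to the bounds (not a reproduction of them), the snippet below sweeps the number of folds K and reports the mean and spread of the per-fold test error, which is the quantity the one-round/average bounds control; the dataset and model are placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Sweep the number of folds K and inspect the stability of the fold test errors.
X, y = make_regression(n_samples=300, n_features=20, noise=5.0, random_state=0)
for K in (2, 5, 10, 20):
    scores = -cross_val_score(Ridge(alpha=1.0), X, y, cv=K,
                              scoring="neg_mean_squared_error")
    print(f"K={K:>2}  mean test MSE={scores.mean():8.2f}  std={scores.std():7.2f}")
```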
A Web-Based Tool for Analysing Normative Documents in English
Title | A Web-Based Tool for Analysing Normative Documents in English |
Authors | John J. Camilleri, Mohammad Reza Haghshenas, Gerardo Schneider |
Abstract | Our goal is to use formal methods to analyse normative documents written in English, such as privacy policies and service-level agreements. This requires the combination of a number of different elements, including information extraction from natural language, formal languages for model representation, and an interface for property specification and verification. We have worked on a collection of components for this task: a natural language extraction tool, a suitable formalism for representing such documents, an interface for building models in this formalism, and methods for answering queries asked of a given model. In this work, each of these concerns is brought together in a web-based tool, providing a single interface for analysing normative texts in English. Through the use of a running example, we describe each component and demonstrate the workflow established by our tool. |
Tasks | |
Published | 2017-07-13 |
URL | http://arxiv.org/abs/1707.03997v1 |
http://arxiv.org/pdf/1707.03997v1.pdf | |
PWC | https://paperswithcode.com/paper/a-web-based-tool-for-analysing-normative |
Repo | |
Framework | |
3D Reconstruction with Low Resolution, Small Baseline and High Radial Distortion Stereo Images
Title | 3D Reconstruction with Low Resolution, Small Baseline and High Radial Distortion Stereo Images |
Authors | Tiago Dias, Helder Araujo, Pedro Miraldo |
Abstract | In this paper we analyze and compare approaches for 3D reconstruction from low-resolution (250x250), high radial distortion stereo images, which are acquired with a small baseline (approximately 1 mm). These images are acquired with the NanEye Stereo system manufactured by CMOSIS/AWAIBA. These stereo cameras also have small apertures, which means that high levels of illumination are required. The goal was to develop an approach yielding accurate reconstructions, with a low computational cost, i.e., avoiding non-linear numerical optimization algorithms. In particular we focused on the analysis and comparison of radial distortion models. To perform the analysis and comparison, we defined a baseline method based on available software and methods, such as the Bouguet toolbox [2] or the Computer Vision Toolbox from Matlab. The approaches tested were based on the use of the polynomial model of radial distortion, and on the application of the division model. The issue of the center of distortion was also addressed within the framework of the application of the division model. We concluded that the division model with a single radial distortion parameter has limitations. |
Tasks | 3D Reconstruction |
Published | 2017-09-19 |
URL | http://arxiv.org/abs/1709.06451v1 |
http://arxiv.org/pdf/1709.06451v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-reconstruction-with-low-resolution-small |
Repo | |
Framework | |
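The two distortion families compared in the paper are easy to state in code. The sketch below applies the single-parameter division model (distorted → undistorted) and a two-parameter polynomial model (ideal → distorted), both measured from an explicit distortion center; the coefficient values are illustrative only.

```python
import numpy as np

def undistort_division(points_d, k, center):
    """Single-parameter division model: maps distorted pixel coordinates to
    undistorted ones, x_u = c + (x_d - c) / (1 + k * r_d^2)."""
    d = points_d - center
    r2 = (d ** 2).sum(axis=1, keepdims=True)
    return center + d / (1.0 + k * r2)

def distort_polynomial(points_u, k1, k2, center):
    """Two-parameter polynomial model: maps ideal coordinates to distorted ones,
    x_d = c + (x_u - c) * (1 + k1 * r_u^2 + k2 * r_u^4)."""
    d = points_u - center
    r2 = (d ** 2).sum(axis=1, keepdims=True)
    return center + d * (1.0 + k1 * r2 + k2 * r2 ** 2)

pts = np.array([[10.0, 20.0], [120.0, 200.0], [240.0, 240.0]])   # pixels in a 250x250 image
c = np.array([125.0, 125.0])                                     # assumed distortion center
print(undistort_division(pts, k=-2e-6, center=c))
print(distort_polynomial(pts, k1=1e-6, k2=1e-12, center=c))
```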
Simultaneous Super-Resolution and Cross-Modality Synthesis of 3D Medical Images using Weakly-Supervised Joint Convolutional Sparse Coding
Title | Simultaneous Super-Resolution and Cross-Modality Synthesis of 3D Medical Images using Weakly-Supervised Joint Convolutional Sparse Coding |
Authors | Yawen Huang, Ling Shao, Alejandro F. Frangi |
Abstract | Magnetic Resonance Imaging (MRI) offers high-resolution \emph{in vivo} imaging and rich functional and anatomical multimodality tissue contrast. In practice, however, there are challenges associated with considerations of scanning costs, patient comfort, and scanning time that constrain how much data can be acquired in clinical or research studies. In this paper, we explore the possibility of generating high-resolution and multimodal images from low-resolution single-modality imagery. We propose the weakly-supervised joint convolutional sparse coding to simultaneously solve the problems of super-resolution (SR) and cross-modality image synthesis. The learning process requires only a few registered multimodal image pairs as the training set. Additionally, the quality of the joint dictionary learning can be improved using a larger set of unpaired images. To combine unpaired data from different image resolutions/modalities, a hetero-domain image alignment term is proposed. Local image neighborhoods are naturally preserved by operating on the whole image domain (as opposed to image patches) and using joint convolutional sparse coding. The paired images are enhanced in the joint learning process with unpaired data and an additional maximum mean discrepancy term, which minimizes the dissimilarity between their feature distributions. Experiments show that the proposed method outperforms state-of-the-art techniques on both SR reconstruction and simultaneous SR and cross-modality synthesis. |
Tasks | Dictionary Learning, Image Generation, Super-Resolution |
Published | 2017-05-07 |
URL | http://arxiv.org/abs/1705.02596v1 |
http://arxiv.org/pdf/1705.02596v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-super-resolution-and-cross |
Repo | |
Framework | |
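One ingredient that is self-contained enough to sketch is the maximum mean discrepancy term used to align the feature distributions of unpaired data; below is a standard (biased) RBF-kernel MMD² estimate on toy features, with the bandwidth as a placeholder.

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel.
    The paper's alignment term is of this family; sigma is a placeholder."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
feat_lowres  = rng.normal(0.0, 1.0, size=(100, 16))   # features of unpaired modality A
feat_highres = rng.normal(0.3, 1.0, size=(120, 16))   # features of unpaired modality B
print(rbf_mmd2(feat_lowres, feat_highres))
```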
Intelligent Parameter Tuning in Optimization-based Iterative CT Reconstruction via Deep Reinforcement Learning
Title | Intelligent Parameter Tuning in Optimization-based Iterative CT Reconstruction via Deep Reinforcement Learning |
Authors | Chenyang Shen, Yesenia Gonzalez, Liyuan Chen, Steve B. Jiang, Xun Jia |
Abstract | A number of image-processing problems can be formulated as optimization problems. The objective function typically contains several terms specifically designed for different purposes. Parameters in front of these terms are used to control the relative weights among them. It is of critical importance to tune these parameters, as the quality of the solution depends on their values. Parameter tuning is a relatively straightforward task for a human, as one can intelligently determine the direction of parameter adjustment based on the solution quality. Yet manual parameter tuning is not only tedious in many cases, but becomes impractical when a number of parameters exist in a problem. Aiming at solving this problem, this paper proposes an approach that employs deep reinforcement learning to train a system that can automatically adjust parameters in a human-like manner. We demonstrate our idea in an example problem of optimization-based iterative CT reconstruction with a pixel-wise total-variation regularization term. We set up a parameter tuning policy network (PTPN), which maps a CT image patch to an output that specifies the direction and amplitude by which the parameter at the patch center is adjusted. We train the PTPN via an end-to-end reinforcement learning procedure. We demonstrate that under the guidance of the trained PTPN for parameter tuning at each pixel, reconstructed CT images attain quality similar to or better than those reconstructed with manually tuned parameters. |
Tasks | |
Published | 2017-11-01 |
URL | http://arxiv.org/abs/1711.00414v1 |
http://arxiv.org/pdf/1711.00414v1.pdf | |
PWC | https://paperswithcode.com/paper/intelligent-parameter-tuning-in-optimization |
Repo | |
Framework | |
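A hedged sketch of a parameter-tuning policy network: a small CNN maps the patch around a pixel to Q-values over a discrete set of adjustments of the local regularization weight. The patch size, network depth, and the three-action set are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class PTPN(nn.Module):
    """Sketch of a parameter-tuning policy network (architecture assumed):
    maps a CT image patch to Q-values over (increase, keep, decrease) actions
    for the regularization weight at the patch center."""
    def __init__(self, patch=33, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 5), nn.ReLU(),
            nn.Conv2d(16, 32, 5), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (patch - 8) ** 2, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, patch):          # patch: (B, 1, 33, 33) around the pixel of interest
        return self.net(patch)         # Q-values per action

q = PTPN()(torch.randn(4, 1, 33, 33))
action = q.argmax(dim=1)               # greedy adjustment per patch
print(q.shape, action)
```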
Non-Depth-First Search against Independent Distributions on an AND-OR Tree
Title | Non-Depth-First Search against Independent Distributions on an AND-OR Tree |
Authors | Toshio Suzuki |
Abstract | Suzuki and Niida (Ann. Pure. Appl. Logic, 2015) showed the following results on independent distributions (IDs) on an AND-OR tree, where they took only depth-first algorithms into consideration. (1) Among IDs such that the probability of the root having value 0 is fixed as a given r such that 0 < r < 1, if d is a maximizer of the cost of the best algorithm then d is an independent and identical distribution (IID). (2) Among all IDs, if d is a maximizer of the cost of the best algorithm then d is an IID. In the case where non-depth-first algorithms are taken into consideration, the counterparts of (1) and (2) are left open in the above work. Peng et al. (Inform. Process. Lett., 2017) extended (1) and (2) to multi-branching trees, where in (2) they put an additional hypothesis on IDs that the probability of the root having value 0 is neither 0 nor 1. We give positive answers to the two questions of Suzuki-Niida. A key to the proof is that if an ID d achieves the equilibrium among IDs then we can choose an algorithm of the best cost against d from among depth-first algorithms. In addition, we extend the result of Peng et al. to the case where non-depth-first algorithms are taken into consideration. |
Tasks | |
Published | 2017-09-21 |
URL | http://arxiv.org/abs/1709.07358v1 |
http://arxiv.org/pdf/1709.07358v1.pdf | |
PWC | https://paperswithcode.com/paper/non-depth-first-search-against-independent |
Repo | |
Framework | |
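For intuition about the cost being optimized, the sketch below computes the expected number of leaf probes made by a fixed left-to-right depth-first algorithm on a complete binary AND-OR tree under an IID distribution in which every leaf is 0 with probability r; it illustrates the setting, not a construction from the paper.

```python
def zero_prob(r, depth, is_and):
    """Probability that a node at the given depth evaluates to 0 under an IID
    leaf distribution where each leaf is 0 with probability r."""
    if depth == 0:
        return r
    p_child = zero_prob(r, depth - 1, not is_and)
    if is_and:                      # AND is 0 unless both children are 1
        return 1.0 - (1.0 - p_child) ** 2
    return p_child ** 2             # OR is 0 only if both children are 0

def expected_cost(r, depth, is_and):
    """Expected leaf probes of left-to-right depth-first evaluation with short-circuiting."""
    if depth == 0:
        return 1.0
    c = expected_cost(r, depth - 1, not is_and)
    p0 = zero_prob(r, depth - 1, not is_and)
    # the second child is evaluated only if the first does not short-circuit
    p_continue = (1.0 - p0) if is_and else p0
    return c + p_continue * c

print(expected_cost(r=0.5, depth=4, is_and=True))
```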
Tensor Decompositions for Modeling Inverse Dynamics
Title | Tensor Decompositions for Modeling Inverse Dynamics |
Authors | Stephan Baier, Volker Tresp |
Abstract | Modeling inverse dynamics is crucial for accurate feedforward robot control. The model computes the necessary joint torques to perform a desired movement. The highly non-linear inverse function of the dynamical system can be approximated using regression techniques. We propose as regression method a tensor decomposition model that exploits the inherent three-way interaction of positions × velocities × accelerations. Most work in tensor factorization has addressed the decomposition of dense tensors. In this paper, we build upon the decomposition of sparse tensors, with only small amounts of nonzero entries. The decomposition of sparse tensors has successfully been used in relational learning, e.g., the modeling of large knowledge graphs. Recently, the approach has been extended to multi-class classification with discrete input variables. Representing the data in high-dimensional sparse tensors enables the approximation of complex, highly non-linear functions. In this paper we show how the decomposition of sparse tensors can be applied to regression problems. Furthermore, we extend the method to continuous inputs by learning a mapping from the continuous inputs to the latent representations of the tensor decomposition, using basis functions. We evaluate our proposed model on a dataset with trajectories from a seven-degrees-of-freedom SARCOS robot arm. Our experimental results show superior performance of the proposed functional tensor model compared to challenging state-of-the-art methods. |
Tasks | Knowledge Graphs, Relational Reasoning |
Published | 2017-11-13 |
URL | http://arxiv.org/abs/1711.04683v1 |
http://arxiv.org/pdf/1711.04683v1.pdf | |
PWC | https://paperswithcode.com/paper/tensor-decompositions-for-modeling-inverse |
Repo | |
Framework | |
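A hedged forward-pass sketch of the functional tensor idea: position, velocity, and acceleration are each mapped through fixed basis functions into latent factors, which are combined by a CP-style three-way product to predict torque. The sizes, the RBF basis, and the single-joint simplification are assumptions.

```python
import numpy as np

def rbf_features(x, centers, width=0.5):
    """Map a continuous joint variable onto fixed RBF basis functions."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

rng = np.random.default_rng(0)
n_basis, rank = 10, 5
centers = np.linspace(-1.0, 1.0, n_basis)
A, B, C = (rng.normal(scale=0.1, size=(n_basis, rank)) for _ in range(3))   # factor matrices
w = rng.normal(size=rank)                                                   # output weights

q, qd, qdd = np.array([0.2]), np.array([-0.4]), np.array([0.8])   # one joint, one sample
u = rbf_features(q, centers) @ A        # (1, rank) latent factors for position
v = rbf_features(qd, centers) @ B       # ... for velocity
z = rbf_features(qdd, centers) @ C      # ... for acceleration
torque = (u * v * z) @ w                # CP-style three-way interaction
print(torque)
```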
Optimizing Gross Merchandise Volume via DNN-MAB Dynamic Ranking Paradigm
Title | Optimizing Gross Merchandise Volume via DNN-MAB Dynamic Ranking Paradigm |
Authors | Yan Yan, Wentao Guo, Meng Zhao, Jinghe Hu, Weipeng P. Yan |
Abstract | With the transition from people’s traditional `brick-and-mortar’ shopping to online mobile shopping patterns in the web 2.0 $\mathit{era}$, the recommender system plays a critical role in E-Commerce and E-Retail. This is especially true when designing such a system for more than $\mathbf{236~million}$ daily active users. The ranking strategy, the key module of the recommender system, needs to be precise, accurate, and responsive for estimating customers’ intents. We propose a dynamic ranking paradigm, named DNN-MAB, that is composed of a pairwise deep neural network (DNN) $\mathit{pre}$-ranker connected to a revised multi-armed bandit (MAB) dynamic $\mathit{post}$-ranker. By taking into account explicit and implicit user feedback such as impressions, clicks, and conversions, DNN-MAB is able to adjust the DNN $\mathit{pre}$-ranking scores to help customers locate the items they are most interested in, so that they can convert quickly and frequently. To the best of our knowledge, frameworks like DNN-MAB have not been discussed in the previous literature for either E-Commerce or machine learning audiences. In practice, DNN-MAB has been deployed to production and it easily outperforms other state-of-the-art models by significantly lifting gross merchandise volume (GMV), the objective metric at JD. |
Tasks | Recommendation Systems |
Published | 2017-08-14 |
URL | http://arxiv.org/abs/1708.03993v1 |
http://arxiv.org/pdf/1708.03993v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-gross-merchandise-volume-via-dnn |
Repo | |
Framework | |
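The paper's revised MAB post-ranker is not specified in the abstract, so the sketch below uses a generic Thompson-sampling post-ranker whose sampled click-through estimates are blended additively with the DNN pre-ranking scores; the blending rule and the Beta-posterior bookkeeping are assumptions.

```python
import numpy as np

class ThompsonPostRanker:
    """Generic Thompson-sampling post-ranker layered on DNN pre-ranking scores
    (the additive blend and Beta posteriors are assumptions, not the paper's MAB)."""
    def __init__(self, n_items, blend=0.5, seed=0):
        self.a = np.ones(n_items)       # Beta successes (clicks / conversions)
        self.b = np.ones(n_items)       # Beta failures (impressions without click)
        self.blend = blend
        self.rng = np.random.default_rng(seed)

    def rank(self, dnn_scores):
        sampled_ctr = self.rng.beta(self.a, self.b)
        final = self.blend * dnn_scores + (1 - self.blend) * sampled_ctr
        return np.argsort(-final)       # best item first

    def update(self, item, clicked):
        self.a[item] += clicked
        self.b[item] += 1 - clicked

ranker = ThompsonPostRanker(n_items=5)
order = ranker.rank(dnn_scores=np.array([0.9, 0.2, 0.5, 0.7, 0.1]))
ranker.update(item=order[0], clicked=1)   # feed back an observed click
print(order)
```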
Influence Function and Robust Variant of Kernel Canonical Correlation Analysis
Title | Influence Function and Robust Variant of Kernel Canonical Correlation Analysis |
Authors | Md. Ashad Alam, Kenji Fukumizu, Yu-Ping Wang |
Abstract | Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both kernel CO and kernel CCO are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In addition, while the influence function (IF) of an estimator can characterize its robustness, asymptotic properties and standard error, the IF of a standard kernel canonical correlation analysis (standard kernel CCA) has not been derived yet. To fill this gap, we first propose a robust kernel covariance operator (robust kernel CO) and a robust kernel cross-covariance operator (robust kernel CCO) based on a generalized loss function instead of the quadratic loss function. Second, we derive the IF for robust kernel CCO and standard kernel CCA. Using the IF of the standard kernel CCA, we can detect influential observations from two sets of data. Finally, we propose a method based on the robust kernel CO and the robust kernel CCO, called {\bf robust kernel CCA}, which is less sensitive to noise than the standard kernel CCA. The introduced principles can also be applied to many other kernel methods involving kernel CO or kernel CCO. Our experiments on synthesized data and imaging genetics analysis demonstrate that the proposed IF of standard kernel CCA can identify outliers. It is also seen that the proposed robust kernel CCA method performs better for ideal and contaminated data than the standard kernel CCA. |
Tasks | |
Published | 2017-05-09 |
URL | http://arxiv.org/abs/1705.04194v1 |
http://arxiv.org/pdf/1705.04194v1.pdf | |
PWC | https://paperswithcode.com/paper/influence-function-and-robust-variant-of |
Repo | |
Framework | |
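As background for the robust variant, the sketch below solves standard regularized kernel CCA as a symmetric generalized eigenproblem on centered Gram matrices and returns the first canonical correlation; the regularization style and the toy RBF kernel are one common choice, not the paper's exact formulation.

```python
import numpy as np
from scipy.linalg import eigh

def center_gram(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kernel_cca(Kx, Ky, reg=1e-3):
    """Standard (non-robust) regularized kernel CCA: first canonical correlation."""
    n = Kx.shape[0]
    Kx, Ky = center_gram(Kx), center_gram(Ky)
    A = np.block([[np.zeros((n, n)), Kx @ Ky],
                  [Ky @ Kx, np.zeros((n, n))]])
    B = np.block([[Kx @ Kx + reg * np.eye(n), np.zeros((n, n))],
                  [np.zeros((n, n)), Ky @ Ky + reg * np.eye(n)]])
    vals = eigh(A, B, eigvals_only=True)          # symmetric generalized eigenproblem
    return float(np.clip(vals[-1], 0.0, 1.0))

rng = np.random.default_rng(0)
z = rng.normal(size=(60, 1))                      # shared latent signal
X = np.hstack([z, rng.normal(size=(60, 2))])
Y = np.hstack([np.sin(z), rng.normal(size=(60, 2))])
rbf = lambda A: np.exp(-((A[:, None, :] - A[None, :, :]) ** 2).sum(-1))
print(kernel_cca(rbf(X), rbf(Y)))
```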