Paper Group ANR 1248
Personalised novel and explainable matrix factorisation
Title | Personalised novel and explainable matrix factorisation |
Authors | Ludovik Coba, Panagiotis Symeonidis, Markus Zanker |
Abstract | Recommendation systems personalise suggestions to individuals to help them in their decision making and exploration tasks. In the ideal case, these recommendations, besides being accurate, should also be novel and explainable. However, up to now most platforms fail to provide both: novel recommendations that advance users’ exploration, and explanations that make the reasoning behind a recommendation more transparent. For instance, a well-known recommendation algorithm such as matrix factorisation (MF) optimises only the accuracy criterion, while disregarding other quality criteria such as the explainability or novelty of recommended items. In this paper we propose a new model, denoted NEMF, that allows trading off MF performance with respect to the criteria of novelty and explainability, while only minimally compromising on accuracy. In addition, we introduce a new explainability metric based on nDCG, which distinguishes more explainable items from less explainable ones. An initial user study indicates how users perceive the different attributes of these “user” style explanations, and our extensive experimental results demonstrate that we attain high accuracy while also recommending novel and explainable items. |
Tasks | Decision Making, Recommendation Systems |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11000v1 |
https://arxiv.org/pdf/1907.11000v1.pdf | |
PWC | https://paperswithcode.com/paper/personalised-novel-and-explainable-matrix |
Repo | |
Framework | |
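As a rough illustration of the trade-off the abstract describes, the sketch below adds a hypothetical explainability-weighted term to a plain stochastic-gradient MF update. The loss shape, the per-item scores `e`, and all hyperparameters are assumptions for illustration, not the authors' actual NEMF objective.

```python
import random

random.seed(0)
R = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0, (2, 1): 1.0}  # observed ratings
e = [0.9, 0.2]  # assumed per-item explainability scores in [0, 1]
k, lr, lam, beta = 2, 0.05, 0.1, 0.02  # latent dim, step, L2, trade-off weight
P = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(3)]  # users
Q = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(2)]  # items

def predict(u, i):
    return sum(P[u][f] * Q[i][f] for f in range(k))

def squared_error():
    return sum((r - predict(u, i)) ** 2 for (u, i), r in R.items())

initial_error = squared_error()
for _ in range(300):
    for (u, i), r in R.items():
        err = r - predict(u, i)
        for f in range(k):
            pu, qi = P[u][f], Q[i][f]
            # accuracy gradient + L2 shrinkage + a crude push that raises
            # predicted scores of more explainable items (hypothetical term)
            P[u][f] += lr * (err * qi - lam * pu + beta * e[i] * qi)
            Q[i][f] += lr * (err * pu - lam * qi + beta * e[i] * pu)
```

With `beta = 0`, this reduces to ordinary L2-regularised MF; the extra term only tilts predictions toward items with a higher explainability score.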
C-SALT: Mining Class-Specific ALTerations in Boolean Matrix Factorization
Title | C-SALT: Mining Class-Specific ALTerations in Boolean Matrix Factorization |
Authors | Sibylle Hess, Katharina Morik |
Abstract | Given labeled data represented by a binary matrix, we consider the task of deriving a Boolean matrix factorization that identifies commonalities and specifications among the classes. While existing works focus on rank-one factorizations that are either specific or common to the classes, we derive class-specific alterations from common factorizations as well. This broadens the applicability of our new method to datasets whose class-dependencies have a more complex structure. On the basis of synthetic and real-world datasets, we show on the one hand that our method is able to filter structure that corresponds to our model assumption, and on the other hand that our model assumption is justified in real-world applications. Our method is parameter-free. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.09907v1 |
https://arxiv.org/pdf/1906.09907v1.pdf | |
PWC | https://paperswithcode.com/paper/c-salt-mining-class-specific-alterations-in |
Repo | |
Framework | |
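The Boolean factorization the abstract refers to uses OR/AND algebra rather than ordinary sums. A minimal version of the Boolean matrix product, under which any such factorization must reconstruct the data (C-SALT's class-specific alterations would additionally modify the factors per class, which is not reproduced here):

```python
def bool_product(U, V):
    """Boolean matrix product: (U . V)[i][j] = OR over k of (U[i][k] AND V[k][j])."""
    n, r, m = len(U), len(V), len(V[0])
    return [[int(any(U[i][k] and V[k][j] for k in range(r))) for j in range(m)]
            for i in range(n)]

# Two rank-one binary patterns shared across classes (toy example):
U = [[1, 0], [1, 1], [0, 1]]   # which rows use which pattern
V = [[1, 1, 0], [0, 1, 1]]     # the patterns themselves
X = bool_product(U, V)         # Boolean reconstruction of the data matrix
```

Note that overlapping patterns do not add up as in ordinary matrix products: the middle row is `[1, 1, 1]`, with the shared column saturating at 1.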
Uncalibrated Deflectometry with a Mobile Device on Extended Specular Surfaces
Title | Uncalibrated Deflectometry with a Mobile Device on Extended Specular Surfaces |
Authors | Florian Willomitzer, Chia-Kai Yeh, Vikas Gupta, William Spies, Florian Schiffers, Marc Walton, Oliver Cossairt |
Abstract | We introduce a system and methods for the three-dimensional measurement of extended specular surfaces with high surface normal variations. Our system consists only of a mobile handheld device and exploits the screen and front camera for Deflectometry-based surface measurements. We demonstrate high quality measurements without the need for an offline calibration procedure. In addition, we develop a multi-view technique to compensate for the small screen of a mobile device so that large surfaces can be densely reconstructed in their entirety. This work is a first step towards developing a self-calibrating Deflectometry procedure capable of taking 3D surface measurements of specular objects in the wild and accessible to users with little to no technical imaging experience. |
Tasks | Calibration |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10700v1 |
https://arxiv.org/pdf/1907.10700v1.pdf | |
PWC | https://paperswithcode.com/paper/uncalibrated-deflectometry-with-a-mobile |
Repo | |
Framework | |
Feature Fusion for Robust Patch Matching With Compact Binary Descriptors
Title | Feature Fusion for Robust Patch Matching With Compact Binary Descriptors |
Authors | Andrea Migliorati, Attilio Fiandrotti, Gianluca Francini, Skjalg Lepsoy, Riccardo Leonardi |
Abstract | This work addresses the problem of learning compact yet discriminative patch descriptors within a deep learning framework. We observe that features extracted by convolutional layers in the pixel domain are largely complementary to features extracted in a transformed domain. We propose a convolutional network framework for learning binary patch descriptors where pixel domain features are fused with features extracted from the transformed domain. In our framework, while convolutional and transformed features are distinctly extracted, they are fused and provided to a single classifier which thus jointly operates on convolutional and transformed features. We experiment with matching patches from three different datasets, showing that our feature fusion approach outperforms multiple state-of-the-art approaches in terms of accuracy, rate, and complexity. |
Tasks | |
Published | 2019-01-11 |
URL | http://arxiv.org/abs/1901.03547v1 |
http://arxiv.org/pdf/1901.03547v1.pdf | |
PWC | https://paperswithcode.com/paper/feature-fusion-for-robust-patch-matching-with |
Repo | |
Framework | |
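A toy stand-in for the fusion idea: concatenate pixel-domain values with transform-domain coefficients (here a 1-D DCT-II) and binarize, loosely mirroring "fuse, then binarize". The learned convolutional fusion network itself is not reproduced; the thresholding rule below is an assumption.

```python
import math

def dct2(x):
    """Type-II DCT of a 1-D signal: the 'transformed domain' features."""
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n in range(N)) for k in range(N)]

def fused_binary_descriptor(pixels):
    """Concatenate pixel-domain and DCT-domain features, then binarize by
    thresholding at the mean (a crude, hypothetical binarisation rule)."""
    feats = list(pixels) + dct2(pixels)
    mean = sum(feats) / len(feats)
    return [1 if f > mean else 0 for f in feats]
```

For an input of length N the descriptor has 2N bits, half from each domain, which is the complementarity the abstract argues for.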
Modeling Sensorimotor Coordination as Multi-Agent Reinforcement Learning with Differentiable Communication
Title | Modeling Sensorimotor Coordination as Multi-Agent Reinforcement Learning with Differentiable Communication |
Authors | Bowen Jing, William Yin |
Abstract | Multi-agent reinforcement learning has shown promise on a variety of cooperative tasks as a consequence of recent developments in differentiable inter-agent communication. However, most architectures are restricted to pools of homogeneous agents, which limits their applicability. Here we propose a modular framework for learning complex tasks in which a traditional monolithic agent is framed as a collection of cooperating heterogeneous agents. We apply this approach to model sensorimotor coordination in the neocortex as a multi-agent reinforcement learning problem. Our results demonstrate proof-of-concept of the proposed architecture and open new avenues for learning complex tasks and for understanding functional localization in the brain and future intelligent systems. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05815v1 |
https://arxiv.org/pdf/1909.05815v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-sensorimotor-coordination-as-multi |
Repo | |
Framework | |
Digging Deeper into Egocentric Gaze Prediction
Title | Digging Deeper into Egocentric Gaze Prediction |
Authors | Hamed R. Tavakoli, Esa Rahtu, Juho Kannala, Ali Borji |
Abstract | This paper digs deeper into factors that influence egocentric gaze. Instead of training deep models for this purpose in a blind manner, we propose to inspect factors that contribute to gaze guidance during daily tasks. Bottom-up saliency and optical flow are assessed versus strong spatial prior baselines. Task-specific cues such as vanishing point, manipulation point, and hand regions are analyzed as representatives of top-down information. We also look into the contribution of these factors by investigating a simple recurrent neural model for egocentric gaze prediction. First, deep features are extracted for all input video frames. Then, a gated recurrent unit is employed to integrate information over time and to predict the next fixation. We also propose an integrated model that combines the recurrent model with several top-down and bottom-up cues. Extensive experiments over multiple datasets reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up saliency models perform poorly in predicting gaze and underperform spatial biases, (3) deep features perform better compared to traditional features, (4) as opposed to hand regions, the manipulation point is a strong influential cue for gaze prediction, (5) combining the proposed recurrent model with bottom-up cues, vanishing points and, in particular, manipulation point results in the best gaze prediction accuracy over egocentric videos, (6) the knowledge transfer works best for cases where the tasks or sequences are similar, and (7) task and activity recognition can benefit from gaze prediction. Our findings suggest that (1) there should be more emphasis on hand-object interaction and (2) the egocentric vision community should consider larger datasets including diverse stimuli and more subjects. |
Tasks | Activity Recognition, Gaze Prediction, Optical Flow Estimation, Transfer Learning |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06090v1 |
http://arxiv.org/pdf/1904.06090v1.pdf | |
PWC | https://paperswithcode.com/paper/digging-deeper-into-egocentric-gaze |
Repo | |
Framework | |
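The recurrent component mentioned in the abstract is a standard gated recurrent unit. A single-scalar GRU step, just to make the gating mechanics concrete (the paper's actual model operates on deep feature vectors; the scalar weights here are illustrative only):

```python
import math

def gru_cell(x, h, W):
    """One GRU step on scalars: update gate z, reset gate r, candidate state.
    W maps each gate name to an (input, hidden, bias) weight triple."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    z = sig(W["z"][0] * x + W["z"][1] * h + W["z"][2])        # update gate
    r = sig(W["r"][0] * x + W["r"][1] * h + W["r"][2])        # reset gate
    h_tilde = math.tanh(W["h"][0] * x + W["h"][1] * (r * h) + W["h"][2])
    return (1 - z) * h + z * h_tilde                          # blended state
```

With all-zero weights both gates sit at 0.5 and the candidate is 0, so each step simply halves the hidden state; nonzero weights let the unit decide, per frame, how much past gaze context to keep.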
Correlated Logistic Model With Elastic Net Regularization for Multilabel Image Classification
Title | Correlated Logistic Model With Elastic Net Regularization for Multilabel Image Classification |
Authors | Qiang Li, Bo Xie, Jane You, Wei Bian, Dacheng Tao |
Abstract | In this paper, we present the correlated logistic (CorrLog) model for multilabel image classification. CorrLog extends the conventional logistic regression model to multilabel cases by explicitly modeling the pairwise correlation between labels. In addition, we propose to learn the model parameters of CorrLog with elastic net regularization, which helps exploit sparsity in feature selection and label correlations and thus further boosts the performance of multilabel classification. CorrLog can be efficiently learned, though approximately, by regularized maximum pseudo-likelihood estimation, and it enjoys a satisfying generalization bound that is independent of the number of labels. CorrLog performs competitively for multilabel image classification on the benchmark data sets MULAN scene, MIT outdoor scene, PASCAL VOC 2007, and PASCAL VOC 2012, compared with state-of-the-art multilabel classification algorithms. |
Tasks | Feature Selection, Image Classification |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08098v1 |
http://arxiv.org/pdf/1904.08098v1.pdf | |
PWC | https://paperswithcode.com/paper/correlated-logistic-model-with-elastic-net |
Repo | |
Framework | |
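The elastic net regularizer named in the abstract is simply an L1 term plus an L2 term on the model weights; a one-function version:

```python
def elastic_net_penalty(w, l1, l2):
    """Elastic net: l1 * ||w||_1 + l2 * ||w||_2^2, combining the sparsity of
    the L1 term with the shrinkage/grouping effect of the L2 term."""
    return l1 * sum(abs(x) for x in w) + l2 * sum(x * x for x in w)
```

In CorrLog this penalty is added to the (pseudo-)likelihood objective over both the per-label weights and the pairwise correlation parameters, which is what yields sparse feature selection and sparse label correlations simultaneously.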
Evaluation of Multi-Slice Inputs to Convolutional Neural Networks for Medical Image Segmentation
Title | Evaluation of Multi-Slice Inputs to Convolutional Neural Networks for Medical Image Segmentation |
Authors | Minh H. Vu, Guus Grimbergen, Tufve Nyholm, Tommy Löfstedt |
Abstract | When using Convolutional Neural Networks (CNNs) for segmentation of organs and lesions in medical images, the conventional approach is to work with inputs and outputs either as single slices (2D) or whole volumes (3D). One common alternative, in this study denoted as pseudo-3D, is to use a stack of adjacent slices as input and produce a prediction for at least the central slice. This approach gives the network the possibility to capture 3D spatial information, with only a minor additional computational cost. In this study, we systematically evaluate the segmentation performance and computational costs of this pseudo-3D approach as a function of the number of input slices, and compare the results to conventional end-to-end 2D and 3D CNNs. The standard pseudo-3D method regards the neighboring slices as multiple input image channels. We additionally evaluate a simple approach where the input stack is a volumetric input that is repeatedly convolved in 3D to obtain a 2D feature map. This 2D map is in turn fed into a standard 2D network. We conducted experiments using two different CNN backbone architectures and on five diverse data sets covering different anatomical regions, imaging modalities, and segmentation tasks. We found that while both pseudo-3D methods can process a large number of slices at once and still be computationally much more efficient than fully 3D CNNs, a significant improvement over a regular 2D CNN was only observed for one of the five data sets. An analysis of the structural properties of the segmentation masks revealed no relation between those properties and the segmentation performance with respect to the number of input slices. The conclusion is therefore that in the general case, multi-slice inputs appear to not significantly improve segmentation results over using 2D or 3D CNNs. |
Tasks | Medical Image Segmentation, Semantic Segmentation |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09287v2 |
https://arxiv.org/pdf/1912.09287v2.pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-multi-slice-inputs-to |
Repo | |
Framework | |
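The standard pseudo-3D input described above just treats 2k+1 neighboring slices as image channels. A minimal slice-stacking helper (handling the volume boundary by clamping is an assumption here; zero padding is an equally common choice):

```python
def pseudo3d_stack(volume, center, k):
    """Return slices [center-k, ..., center+k] as the channel stack used to
    predict the segmentation of slice `center`, clamping out-of-range
    indices at the volume boundary."""
    n = len(volume)
    return [volume[min(max(center + d, 0), n - 1)] for d in range(-k, k + 1)]

# Toy "volume" of five tiny slices (real slices would be 2D arrays):
vol = [[0, 0], [1, 1], [2, 2], [3, 3], [4, 4]]
```

A 2D network then sees 2k+1 input channels per position, which is how the approach captures some through-plane context at near-2D cost.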
Do Subsampled Newton Methods Work for High-Dimensional Data?
Title | Do Subsampled Newton Methods Work for High-Dimensional Data? |
Authors | Xiang Li, Shusen Wang, Zhihua Zhang |
Abstract | Subsampled Newton methods approximate Hessian matrices through subsampling techniques, alleviating the cost of forming Hessian matrices while still using sufficient curvature information. However, previous results require $\Omega (d)$ samples to approximate Hessians, where $d$ is the dimension of data points, making them less practically feasible for high-dimensional data. The situation deteriorates when $d$ is comparable to the number of data points $n$, since the whole dataset must then be taken into account, rendering subsampling useless. This paper theoretically justifies the effectiveness of subsampled Newton methods on high-dimensional data. Specifically, we prove that only $\widetilde{\Theta}(d^\gamma_{\rm eff})$ samples are needed in the approximation of Hessian matrices, where $d^\gamma_{\rm eff}$ is the $\gamma$-ridge leverage and can be much smaller than $d$ as long as $n\gamma \gg 1$. Additionally, we extend this result so that subsampled Newton methods can work for high-dimensional data on both distributed optimization problems and non-smooth regularized problems. |
Tasks | Distributed Optimization |
Published | 2019-02-13 |
URL | https://arxiv.org/abs/1902.04952v2 |
https://arxiv.org/pdf/1902.04952v2.pdf | |
PWC | https://paperswithcode.com/paper/do-subsampled-newton-methods-work-for-high |
Repo | |
Framework | |
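To make the mechanics concrete, here is a toy damped Newton step for 2-D logistic regression in which the gradient uses all points but the Hessian is estimated from a random subsample. The paper's contribution is the sample-size bound, not these mechanics; the data, ridge term, and damping factor below are illustrative assumptions.

```python
import math
import random

random.seed(1)
X = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.5), (1.0, 1.5), (1.0, 2.5), (1.0, 3.0)]
y = [0, 0, 1, 0, 1, 1]  # deliberately non-separable toy labels
w = [0.0, 0.0]

def sigmoid(z):  # numerically safe logistic function
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def loss(w):
    eps = 1e-12
    return -sum(t * math.log(sigmoid(w[0] * a + w[1] * b) + eps) +
                (1 - t) * math.log(1 - sigmoid(w[0] * a + w[1] * b) + eps)
                for (a, b), t in zip(X, y))

def gradient(w):  # full gradient over all data points
    g = [0.0, 0.0]
    for (a, b), t in zip(X, y):
        p = sigmoid(w[0] * a + w[1] * b)
        g[0] += (p - t) * a
        g[1] += (p - t) * b
    return g

def subsampled_hessian(w, idx):
    h = [[1e-6, 0.0], [0.0, 1e-6]]  # tiny ridge keeps the 2x2 invertible
    scale = len(X) / len(idx)       # rescale so the estimate is unbiased
    for i in idx:
        a, b = X[i]
        p = sigmoid(w[0] * a + w[1] * b)
        c = scale * p * (1 - p)
        h[0][0] += c * a * a; h[0][1] += c * a * b
        h[1][0] += c * b * a; h[1][1] += c * b * b
    return h

loss0 = loss(w)
for _ in range(8):
    g = gradient(w)
    h = subsampled_hessian(w, random.sample(range(len(X)), 3))  # half the data
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    # damped Newton step w <- w - 0.3 * H^{-1} g (closed-form 2x2 inverse)
    w[0] -= 0.3 * (h[1][1] * g[0] - h[0][1] * g[1]) / det
    w[1] -= 0.3 * (-h[1][0] * g[0] + h[0][0] * g[1]) / det
```

Forming the subsampled Hessian costs a fraction of the full one per step; the paper's point is that, measured by the ridge leverage, surprisingly few samples suffice even when $d$ is large.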
99% of Distributed Optimization is a Waste of Time: The Issue and How to Fix it
Title | 99% of Distributed Optimization is a Waste of Time: The Issue and How to Fix it |
Authors | Konstantin Mishchenko, Filip Hanzely, Peter Richtárik |
Abstract | Many popular distributed optimization methods for training machine learning models fit the following template: a local gradient estimate is computed independently by each worker, then communicated to a master, which subsequently performs averaging. The average is broadcast back to the workers, which use it to perform a gradient-type step to update the local version of the model. It is also well known that many such methods, including SGD, SAGA, and accelerated SGD for over-parameterized models, do not scale well with the number of parallel workers. In this paper we observe that the above template is fundamentally inefficient in that too much data is unnecessarily communicated by the workers, which slows down the overall system. We propose a fix based on a new update-sparsification method we develop in this work, which we suggest be used on top of existing methods. Namely, we develop a new variant of parallel block coordinate descent based on independent sparsification of the local gradient estimates before communication. We demonstrate that with only $m/n$ blocks sent by each of $n$ workers, where $m$ is the total number of parameter blocks, the theoretical iteration complexity of the underlying distributed methods is essentially unaffected. As an illustration, this means that when $n=100$ parallel workers are used, the communication of $99\%$ blocks is redundant, and hence a waste of time. Our theoretical claims are supported through extensive numerical experiments which demonstrate an almost perfect match with our theory on a number of synthetic and real datasets. |
Tasks | Distributed Optimization |
Published | 2019-01-27 |
URL | https://arxiv.org/abs/1901.09437v2 |
https://arxiv.org/pdf/1901.09437v2.pdf | |
PWC | https://paperswithcode.com/paper/99-of-parallel-optimization-is-inevitably-a |
Repo | |
Framework | |
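A minimal sketch of the sparsified-communication template: each of `n` workers sends only `m/n` randomly chosen parameter blocks. Scaling each sent block by `n` makes the estimator unbiased, and dividing by the `n` workers cancels that factor exactly (the toy gradients and uniform block sampling are illustrative assumptions, not the paper's full scheme).

```python
import random

random.seed(0)
m, n = 8, 4  # parameter blocks, workers
grads = [[float(w + 1)] * m for w in range(n)]  # toy per-worker gradients

def sparsified_average(grads, m, n):
    """Each worker sends only m // n randomly chosen blocks, scaled by n for
    unbiasedness; averaging over the n workers cancels the scale, hence the
    bare g[b] below."""
    avg = [0.0] * m
    for g in grads:
        for b in random.sample(range(m), m // n):
            avg[b] += g[b]  # = (n * g[b]) / n
    return avg
```

Any single call returns a noisy average, but in expectation each block equals the true average of the worker gradients, which is what preserves the iteration complexity.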
Feature Selection for Data Integration with Mixed Multi-view Data
Title | Feature Selection for Data Integration with Mixed Multi-view Data |
Authors | Yulia Baker, Tiffany M. Tang, Genevera I. Allen |
Abstract | Data integration methods that analyze multiple sources of data simultaneously can often provide more holistic insights than can separate inquiries of each data source. Motivated by the advantages of data integration in the era of “big data”, we investigate feature selection for high-dimensional multi-view data with mixed data types (e.g. continuous, binary, count-valued). This heterogeneity of multi-view data poses numerous challenges for existing feature selection methods. However, after critically examining these issues through empirical and theoretically-guided lenses, we develop a practical solution, the Block Randomized Adaptive Iterative Lasso (B-RAIL), which combines the strengths of the randomized Lasso, adaptive weighting schemes, and stability selection. B-RAIL serves as a versatile data integration method for sparse regression and graph selection, and we demonstrate the effectiveness of B-RAIL through extensive simulations and a case study to infer the ovarian cancer gene regulatory network. In this case study, B-RAIL successfully identifies well-known biomarkers associated with ovarian cancer and hints at novel candidates for future ovarian cancer research. |
Tasks | Feature Selection |
Published | 2019-03-27 |
URL | https://arxiv.org/abs/1903.11232v2 |
https://arxiv.org/pdf/1903.11232v2.pdf | |
PWC | https://paperswithcode.com/paper/feature-selection-for-data-integration-with |
Repo | |
Framework | |
Simplifying Random Forests: On the Trade-off between Interpretability and Accuracy
Title | Simplifying Random Forests: On the Trade-off between Interpretability and Accuracy |
Authors | Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz |
Abstract | We analyze the trade-off between model complexity and accuracy for random forests by breaking the trees up into individual classification rules and selecting a subset of them. We show experimentally that a few rules are already sufficient to achieve an acceptable accuracy close to that of the original model. Moreover, our results indicate that in many cases, this can lead to simpler models that clearly outperform the original ones. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04393v1 |
https://arxiv.org/pdf/1911.04393v1.pdf | |
PWC | https://paperswithcode.com/paper/simplifying-random-forests-on-the-trade-off |
Repo | |
Framework | |
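The "breaking trees up into rules" step is just enumerating root-to-leaf paths. A toy version using nested tuples as a stand-in for real tree objects (the selection step that picks the best subset of rules is not reproduced):

```python
def tree_to_rules(node, conditions=()):
    """Flatten a decision tree into one rule per root-to-leaf path.
    A tree is either a leaf label or a tuple (feature, threshold, left, right);
    each rule is (tuple of conditions, predicted label)."""
    if not isinstance(node, tuple):          # leaf reached: emit the rule
        return [(conditions, node)]
    feat, thr, left, right = node
    return (tree_to_rules(left, conditions + ((feat, "<=", thr),)) +
            tree_to_rules(right, conditions + ((feat, ">", thr),)))

# x0 <= 2.5 splits first; the left branch splits again on x1.
tree = (0, 2.5, (1, 1.0, "A", "B"), "C")
rules = tree_to_rules(tree)
```

Running this over every tree in a forest yields the rule pool from which a small, interpretable subset can be selected.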
GeoCapsNet: Aerial to Ground view Image Geo-localization using Capsule Network
Title | GeoCapsNet: Aerial to Ground view Image Geo-localization using Capsule Network |
Authors | Bin Sun, Chen Chen, Yingying Zhu, Jianmin Jiang |
Abstract | The task of cross-view image geo-localization aims to determine the geo-location (GPS coordinates) of a query ground-view image by matching it with the GPS-tagged aerial (satellite) images in a reference dataset. Due to the dramatic changes of viewpoint, matching the cross-view images is challenging. In this paper, we propose GeoCapsNet, a capsule network based approach for ground-to-aerial image geo-localization. The network first extracts features from both ground-view and aerial images via standard convolution layers, and the capsule layers further encode the features to model the spatial feature hierarchies and enhance the representation power. Moreover, we introduce a simple and effective weighted soft-margin triplet loss with online batch hard sample mining, which can greatly improve image retrieval accuracy. Experimental results show that our GeoCapsNet significantly outperforms the state-of-the-art approaches on two benchmark datasets. |
Tasks | Image Retrieval |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06281v1 |
http://arxiv.org/pdf/1904.06281v1.pdf | |
PWC | https://paperswithcode.com/paper/geocapsnet-aerial-to-ground-view-image-geo |
Repo | |
Framework | |
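A sketch of the loss components named in the abstract: the weighted soft-margin triplet loss in the form commonly used for cross-view retrieval, plus the hard-mining rule of taking the closest negative in the batch. The exact form is an assumption here, not taken from the paper.

```python
import math

def weighted_soft_margin_triplet(d_pos, d_neg, alpha=10.0):
    """log(1 + e^{alpha * (d_pos - d_neg)}): pushes the matching aerial image
    closer to the query than non-matching ones; alpha sharpens the margin."""
    return math.log(1.0 + math.exp(alpha * (d_pos - d_neg)))

def hardest_negative(d_negs):
    """Online hard sample mining: the non-matching image at the smallest
    distance is the hardest negative in the batch."""
    return min(d_negs)
```

The loss is strictly positive, shrinks as the negative moves farther away than the positive, and grows steeply when the ordering is violated, which is what drives retrieval accuracy.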
Trajectory Normalized Gradients for Distributed Optimization
Title | Trajectory Normalized Gradients for Distributed Optimization |
Authors | Jianqiao Wangni, Ke Li, Jianbo Shi, Jitendra Malik |
Abstract | Recently, researchers have proposed various low-precision gradient compression schemes for efficient communication in large-scale distributed optimization. Building on this work, we try to reduce the communication complexity from a new direction. We pursue an ideal bijective mapping between two spaces of gradient distributions, so that the mapped gradient carries greater information entropy after compression. In our setting, all servers share a reference gradient in advance, and they communicate via normalized gradients, which are the difference or quotient between the current gradients and the reference. To obtain a reference vector that yields a stronger signal-to-noise ratio, in each iteration we dynamically extract and fuse information from the past trajectory in hindsight, and search for an optimal reference for compression. We call this scheme trajectory-based normalized gradients (TNG). It bridges research from different communities, such as coding, optimization, systems, and learning. It is easy to implement and can be combined universally with existing algorithms. Our experiments on benchmark hard non-convex functions and convex problems such as logistic regression demonstrate that TNG is more efficient at compressing communication for distributed optimization of general functions. |
Tasks | Distributed Optimization |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08227v1 |
http://arxiv.org/pdf/1901.08227v1.pdf | |
PWC | https://paperswithcode.com/paper/trajectory-normalized-gradients-for |
Repo | |
Framework | |
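The core communication idea can be sketched in two lines: transmit only the residual to a shared reference gradient, and add it back on the receiving side. With a well-chosen reference the residual is small and low-entropy, so it compresses well; TNG's actual trajectory-based reference search and the quantisation stage are not reproduced here.

```python
def encode(grad, ref):
    """Residual to the shared reference: the 'subtraction' variant of the
    normalized gradient (a quotient variant would divide instead)."""
    return [g - r for g, r in zip(grad, ref)]

def decode(residual, ref):
    """Reconstruct the original gradient from residual + shared reference."""
    return [d + r for d, r in zip(residual, ref)]
```

Since both sides hold the same reference, encoding is lossless before quantisation: decode(encode(g, ref), ref) recovers g.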
Successive Over Relaxation Q-Learning
Title | Successive Over Relaxation Q-Learning |
Authors | Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar |
Abstract | In a discounted reward Markov Decision Process (MDP), the objective is to find the optimal value function, i.e., the value function corresponding to an optimal policy. This problem reduces to solving a functional equation known as the Bellman equation, and a fixed point iteration scheme known as value iteration is utilized to obtain the solution. In the literature, a successive over-relaxation based value iteration scheme has been proposed to speed up the computation of the optimal value function. The speed-up is achieved by constructing a modified Bellman equation that ensures faster convergence to the optimal value function. However, in many practical applications the model information is not known, and we resort to Reinforcement Learning (RL) algorithms to obtain the optimal policy and value function. One such popular algorithm is Q-learning. In this paper, we propose Successive Over-Relaxation (SOR) Q-learning. We first derive a modified fixed point iteration for SOR Q-values and utilize stochastic approximation to derive a learning algorithm that computes the optimal value function and an optimal policy. We then prove the almost sure convergence of SOR Q-learning to the SOR Q-values. Finally, through numerical experiments, we show that SOR Q-learning is faster than the standard Q-learning algorithm. |
Tasks | Q-Learning |
Published | 2019-03-09 |
URL | https://arxiv.org/abs/1903.03812v3 |
https://arxiv.org/pdf/1903.03812v3.pdf | |
PWC | https://paperswithcode.com/paper/successive-over-relaxation-q-learning |
Repo | |
Framework | |
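A sketch of an over-relaxed tabular Q-learning update on a one-state MDP (both actions deterministically stay in state 0; action 1 pays reward 1). The relaxation form below, blending the bootstrapped target with max over Q(s, .), follows the SOR idea; treat it as an assumed variant rather than the paper's exact operator. Setting the relaxation factor to 1 recovers standard Q-learning.

```python
gamma, alpha, w_sor = 0.5, 0.5, 1.2  # discount, step size, relaxation factor
Q = [[0.0, 0.0]]                     # Q[state][action]

for _ in range(200):
    for a in (0, 1):
        r = 1.0 if a == 1 else 0.0
        s, s_next = 0, 0  # deterministic self-transition
        # over-relaxed target: w * (r + gamma * max Q(s')) + (1 - w) * max Q(s)
        target = w_sor * (r + gamma * max(Q[s_next])) + (1 - w_sor) * max(Q[s])
        Q[s][a] += alpha * (target - Q[s][a])
```

On this toy MDP the greedy action's value converges to the same fixed point as standard Q-learning, $1 / (1 - \gamma) = 2$, while the over-relaxation changes the per-step contraction and hence the convergence speed.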