Paper Group ANR 543
Studying the brain from adolescence to adulthood through sparse multi-view matrix factorisations
Title | Studying the brain from adolescence to adulthood through sparse multi-view matrix factorisations |
Authors | Zi Wang, Vyacheslav Karolis, Chiara Nosarti, Giovanni Montana |
Abstract | Men and women differ in specific cognitive abilities and in the expression of several neuropsychiatric conditions. Such findings could be attributed to sex hormones, brain differences, as well as a number of environmental variables. Existing research on identifying sex-related differences in brain structure has predominantly used cross-sectional studies to investigate, for instance, differences in average gray matter volumes (GMVs). In this article we explore the potential of a recently proposed multi-view matrix factorisation (MVMF) methodology to study structural brain changes in men and women that occur from adolescence to adulthood. MVMF is a multivariate variance decomposition technique that extends principal component analysis to “multi-view” datasets, i.e. where multiple and related groups of observations are available. In this application, each view represents a different age group. MVMF identifies latent factors explaining shared and age-specific contributions to the observed overall variability in GMVs over time. These latent factors can be used to produce low-dimensional visualisations of the data that emphasise age-specific effects once the shared effects have been accounted for. The analysis of two datasets consisting of individuals born prematurely as well as healthy controls provides evidence to suggest that the separation between males and females becomes larger as the brain transitions from adolescence to adulthood. We report on specific brain regions associated with these variance effects. |
Tasks | |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02560v1 |
PDF | http://arxiv.org/pdf/1605.02560v1.pdf |
PWC | https://paperswithcode.com/paper/studying-the-brain-from-adolescence-to |
Repo | |
Framework | |
Branching Gaussian Processes with Applications to Spatiotemporal Reconstruction of 3D Trees
Title | Branching Gaussian Processes with Applications to Spatiotemporal Reconstruction of 3D Trees |
Authors | Kyle Simek, Ravishankar Palanivelu, Kobus Barnard |
Abstract | We propose a robust method for estimating dynamic 3D curvilinear branching structure from monocular images. While 3D reconstruction from images has been widely studied, estimating thin structure has received less attention. This problem becomes more challenging in the presence of camera error, scene motion, and a constraint that curves are attached in a branching structure. We propose a new general-purpose prior, a branching Gaussian process (BGP), that models spatial smoothness and temporal dynamics of curves while enforcing attachment between them. We apply this prior to fit 3D trees directly to image data, using an efficient scheme for approximate inference based on expectation propagation. The BGP prior’s Gaussian form allows us to approximately marginalize over 3D trees with a given model structure, enabling principled comparison between tree models with varying complexity. We test our approach on a novel multi-view dataset depicting plants with known 3D structures and topologies undergoing small nonrigid motion. Our method outperforms a state-of-the-art 3D reconstruction method designed for non-moving thin structure. We evaluate under several common measures, and we propose a new measure for reconstructions of branching multi-part 3D scenes under motion. |
Tasks | 3D Reconstruction, Gaussian Processes |
Published | 2016-08-14 |
URL | http://arxiv.org/abs/1608.04045v1 |
PDF | http://arxiv.org/pdf/1608.04045v1.pdf |
PWC | https://paperswithcode.com/paper/branching-gaussian-processes-with |
Repo | |
Framework | |
SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques
Title | SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques |
Authors | Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky |
Abstract | We present SEBOOST, a technique for boosting the performance of existing stochastic optimization methods. SEBOOST applies a secondary optimization process in the subspace spanned by the last steps and descent directions. The method was inspired by the SESOP optimization method for large-scale problems, and has been adapted for the stochastic learning framework. It can be applied on top of any existing optimization method with no need to tweak the internal algorithm. We show that the method is able to boost the performance of different algorithms, and make them more robust to changes in their hyper-parameters. As the boosting steps of SEBOOST are applied between large sets of descent steps, the additional subspace optimization hardly increases the overall computational burden. We introduce two hyper-parameters that control the balance between the baseline method and the secondary optimization process. The method was evaluated on several deep learning tasks, demonstrating promising results. |
Tasks | Stochastic Optimization |
Published | 2016-09-02 |
URL | http://arxiv.org/abs/1609.00629v1 |
PDF | http://arxiv.org/pdf/1609.00629v1.pdf |
PWC | https://paperswithcode.com/paper/seboost-boosting-stochastic-learning-using |
Repo | |
Framework | |
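The SEBOOST abstract describes running a block of ordinary descent steps and then applying a secondary optimization in a subspace built from recent steps. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes a deterministic quadratic objective, plain gradient descent as the baseline, and a one-dimensional subspace spanned by the net displacement of the last block. All function names and the parameters `lr`, `inner_steps`, and `rounds` are hypothetical.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def loss(A, b, x):
    # Quadratic objective f(x) = 0.5 * x^T A x - b^T x (assumed for illustration).
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)

def grad(A, b, x):
    return [ax - bi for ax, bi in zip(matvec(A, x), b)]

def baseline_steps(A, b, x, lr, n_steps):
    # The untouched baseline optimizer: plain gradient descent here.
    for _ in range(n_steps):
        g = grad(A, b, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

def boost_step(A, b, x_anchor, x):
    # Secondary optimization restricted to the line through x along
    # d = x - x_anchor. For a quadratic, the best step length is closed-form.
    d = [xi - ai for xi, ai in zip(x, x_anchor)]
    curvature = dot(d, matvec(A, d))
    if curvature <= 1e-12:
        return x
    alpha = -dot(grad(A, b, x), d) / curvature
    return [xi + alpha * di for xi, di in zip(x, d)]

def run(A, b, x0, lr=0.01, inner_steps=20, rounds=5, boost=True):
    x = list(x0)
    for _ in range(rounds):
        anchor = x
        x = baseline_steps(A, b, x, lr, inner_steps)
        if boost:
            x = boost_step(A, b, anchor, x)
    return x
```

Because the boost step is an exact line search that includes a zero step as a candidate, it cannot increase the objective in this quadratic setting. A richer subspace (several past steps and gradients) would need a small inner solver instead of the closed-form step.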
Cross-modal Supervision for Learning Active Speaker Detection in Video
Title | Cross-modal Supervision for Learning Active Speaker Detection in Video |
Authors | Punarjay Chakravarty, Tinne Tuytelaars |
Abstract | In this paper, we show how to use audio to supervise the learning of active speaker detection in video. Voice Activity Detection (VAD) guides the learning of the vision-based classifier in a weakly supervised manner. The classifier uses spatio-temporal features to encode upper body motion - facial expressions and gesticulations associated with speaking. We further improve a generic model for active speaker detection by learning person-specific models. Finally, we demonstrate the online adaptation of generic models learnt on one dataset to previously unseen people in a new dataset, again using audio (VAD) for weak supervision. The use of temporal continuity overcomes the lack of clean training data. We are the first to present an active speaker detection system that learns on one audio-visual dataset and automatically adapts to speakers in a new dataset. This work can be seen as an example of how the availability of multi-modal data allows us to learn a model without the need for supervision, by transferring knowledge from one modality to another. |
Tasks | Action Detection, Activity Detection |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08907v1 |
PDF | http://arxiv.org/pdf/1603.08907v1.pdf |
PWC | https://paperswithcode.com/paper/cross-modal-supervision-for-learning-active |
Repo | |
Framework | |
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving
Title | Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving |
Authors | Shai Shalev-Shwartz, Shaked Shammah, Amnon Shashua |
Abstract | Autonomous driving is a multi-agent setting where the host vehicle must apply sophisticated negotiation skills with other road users when overtaking, giving way, merging, taking left and right turns and while pushing ahead in unstructured urban roadways. Since there are many possible scenarios, manually tackling all possible cases will likely yield an overly simplistic policy. Moreover, one must balance the unexpected behavior of other drivers/pedestrians against the need not to be too defensive, so that normal traffic flow is maintained. In this paper we apply deep reinforcement learning to the problem of forming long term driving strategies. We note that there are two major challenges that make autonomous driving different from other robotic tasks. First is the necessity of ensuring functional safety - something that machine learning has difficulty with given that performance is optimized at the level of an expectation over many instances. Second, the Markov Decision Process model often used in robotics is problematic in our case because of unpredictable behavior of other agents in this multi-agent scenario. We make three contributions in our work. First, we show how policy gradient iterations can be used without Markovian assumptions. Second, we decompose the problem into a composition of a Policy for Desires (which is to be learned) and trajectory planning with hard constraints (which is not learned). The goal of Desires is to enable comfort of driving, while hard constraints guarantee the safety of driving. Third, we introduce a hierarchical temporal abstraction we call an “Option Graph” with a gating mechanism that significantly reduces the effective horizon and thereby reduces the variance of the gradient estimation even further. |
Tasks | Autonomous Driving, Multi-agent Reinforcement Learning |
Published | 2016-10-11 |
URL | http://arxiv.org/abs/1610.03295v1 |
PDF | http://arxiv.org/pdf/1610.03295v1.pdf |
PWC | https://paperswithcode.com/paper/safe-multi-agent-reinforcement-learning-for |
Repo | |
Framework | |
Multi-Relational Learning at Scale with ADMM
Title | Multi-Relational Learning at Scale with ADMM |
Authors | Lucas Drumond, Ernesto Diaz-Aviles, Lars Schmidt-Thieme |
Abstract | Learning from multi-relational data that contains noise, ambiguities, or duplicate entities is essential to a wide range of applications such as statistical inference based on Web Linked Data, recommender systems, computational biology, and natural language processing. These tasks usually require working with very large and complex datasets - e.g., the Web graph - however, current approaches to multi-relational learning are not practical for such scenarios due to their high computational complexity and poor scalability on large data. In this paper, we propose a novel and scalable approach for multi-relational factorization based on consensus optimization. Our model, called ConsMRF, is based on the Alternating Direction Method of Multipliers (ADMM) framework, which enables us to optimize each target relation using a smaller set of parameters than the state-of-the-art competitors in this task. Due to ADMM’s nature, ConsMRF can be easily parallelized, which makes it suitable for large multi-relational data. Experiments on large Web datasets - derived from DBpedia, Wikipedia and YAGO - show the efficiency and performance improvement of ConsMRF over strong competitors. In addition, ConsMRF’s near-linear scalability indicates great potential to tackle Web-scale problem sizes. |
Tasks | Recommendation Systems, Relational Reasoning |
Published | 2016-04-03 |
URL | http://arxiv.org/abs/1604.00647v1 |
PDF | http://arxiv.org/pdf/1604.00647v1.pdf |
PWC | https://paperswithcode.com/paper/multi-relational-learning-at-scale-with-admm |
Repo | |
Framework | |
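The abstract above builds on consensus optimization with ADMM, where each subproblem keeps a local copy of the shared variable and the copies are driven to agree. The sketch below is a generic scalar consensus ADMM loop, not ConsMRF itself: it assumes each local objective is the toy quadratic 0.5 * (x - a_i)^2, so the consensus value should approach the mean of the targets. The function name and the parameters `rho` and `iterations` are hypothetical.

```python
def consensus_admm(targets, rho=1.0, iterations=100):
    """Minimize sum_i 0.5 * (x - a_i)^2 by consensus ADMM (scaled dual form)."""
    n = len(targets)
    x = [0.0] * n   # local copies, one per subproblem
    u = [0.0] * n   # scaled dual variables
    z = 0.0         # shared consensus variable
    for _ in range(iterations):
        # Local updates are independent of each other, so they could run in parallel.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus update: average of local copies plus scaled duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual update: accumulate each copy's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z
```

The independence of the local updates is the property the abstract refers to when it says the approach can be easily parallelized.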
Predictive Analytics Using Smartphone Sensors for Depressive Episodes
Title | Predictive Analytics Using Smartphone Sensors for Depressive Episodes |
Authors | Taeheon Jeong, Diego Klabjan, Justin Starren |
Abstract | The behaviors of patients with depression are usually difficult to predict because the patients demonstrate the symptoms of a depressive episode without a warning at unexpected times. The goal of this research is to build algorithms that detect signals of such unusual moments so that doctors can be proactive in approaching already diagnosed patients before they fall into depression. Each patient is equipped with a smartphone with the capability to track its sensors. We first find the home location of a patient, which is then augmented with other sensor data to identify sleep patterns and select communication patterns. The algorithms require two to three weeks of training data to build standard patterns, which are considered normal behaviors; the methods then identify any anomalies in day-to-day data readings of sensors. Four smartphone sensors (the accelerometer, the gyroscope, the location probe and the communication log probe) are used for anomaly detection in sleeping and communication patterns. |
Tasks | Anomaly Detection |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07692v1 |
PDF | http://arxiv.org/pdf/1603.07692v1.pdf |
PWC | https://paperswithcode.com/paper/predictive-analytics-using-smartphone-sensors |
Repo | |
Framework | |
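The abstract above describes learning a "normal" pattern from two to three weeks of data and then flagging day-to-day deviations. The abstract does not state which detector is used, so the sketch below substitutes a simple z-score rule as an assumption. The sleep-hour values, the function names, and the `threshold` parameter are all hypothetical.

```python
from statistics import mean, stdev

def fit_baseline(training_values):
    # Summarize the training period (e.g. nightly sleep hours) as mean and spread.
    return mean(training_values), stdev(training_values)

def find_anomalies(baseline, new_values, threshold=3.0):
    # Return the indices of readings that lie more than `threshold`
    # standard deviations away from the training mean.
    mu, sigma = baseline
    if sigma == 0:
        return [i for i, v in enumerate(new_values) if v != mu]
    return [i for i, v in enumerate(new_values) if abs(v - mu) / sigma > threshold]

# Two weeks of made-up nightly sleep durations, in hours.
training = [7.0, 7.5, 6.5, 7.0, 8.0, 7.5, 7.0, 6.5, 7.0, 7.5, 8.0, 7.0, 6.5, 7.5]
baseline = fit_baseline(training)
flagged = find_anomalies(baseline, [7.0, 3.0, 7.5, 11.5])
```

With this made-up data, the second and fourth new readings (3.0 and 11.5 hours) fall far outside the training spread and are the ones flagged.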
Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples’ Therapy
Title | Sparsely Connected and Disjointly Trained Deep Neural Networks for Low Resource Behavioral Annotation: Acoustic Classification in Couples’ Therapy |
Authors | Haoqi Li, Brian Baucom, Panayiotis Georgiou |
Abstract | Observational studies are based on accurate assessment of human state. A behavior recognition system that models interlocutors’ state in real-time can significantly aid the mental health domain. However, behavior recognition from speech remains a challenging task since it is difficult to find generalizable and representative features because of noisy and high-dimensional data, especially when data is limited and annotated coarsely and subjectively. Deep Neural Networks (DNN) have shown promise in a wide range of machine learning tasks, but for Behavioral Signal Processing (BSP) tasks their application has been constrained due to the limited quantity of data. We propose a Sparsely-Connected and Disjointly-Trained DNN (SD-DNN) framework to deal with limited data. First, we break the acoustic feature set into subsets and train multiple distinct classifiers. Then, the hidden layers of these classifiers become parts of a deeper network that integrates all feature streams. The overall system allows for full connectivity while limiting the number of parameters trained at any time, and makes convergence possible even with limited data. We present results on multiple behavior codes in the couples’ therapy domain and demonstrate the benefits in behavior classification accuracy. We also show the viability of this system towards live behavior annotations. |
Tasks | |
Published | 2016-06-14 |
URL | http://arxiv.org/abs/1606.04518v1 |
PDF | http://arxiv.org/pdf/1606.04518v1.pdf |
PWC | https://paperswithcode.com/paper/sparsely-connected-and-disjointly-trained |
Repo | |
Framework | |
Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation
Title | Parsimonious Mixed-Effects HodgeRank for Crowdsourced Preference Aggregation |
Authors | Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Yuan Yao |
Abstract | In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or utility function which generates their comparison behaviors in experiments. However, in reality annotators are subject to variations due to multi-criteria, abnormal, or a mixture of such behaviors. In this paper, we propose a parsimonious mixed-effects model based on HodgeRank, which takes into account both the fixed effect that the majority of annotators follow a common linear utility model, and the random effect that a small subset of annotators might deviate significantly from the common model and exhibit strongly personalized preferences. HodgeRank has been successfully applied to subjective quality evaluation of multimedia and resolves pairwise crowdsourced ranking data into a global consensus ranking and cyclic conflicts of interests. As an extension, our proposed methodology further explores the conflicts of interests through the random effect in annotator specific variations. The key algorithm in this paper establishes a dynamic path from the common utility to individual variations, with different levels of parsimony or sparsity on personalization, based on newly developed Linearized Bregman Algorithms with Inverse Scale Space method. Finally, the validity of the methodology is supported by experiments with both simulated examples and three real-world crowdsourcing datasets, which show that our proposed method exhibits better performance (i.e. smaller test error) compared with HodgeRank due to its parsimonious property. |
Tasks | |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03401v1 |
PDF | http://arxiv.org/pdf/1607.03401v1.pdf |
PWC | https://paperswithcode.com/paper/parsimonious-mixed-effects-hodgerank-for |
Repo | |
Framework | |
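The abstract above notes that HodgeRank resolves pairwise comparisons into a global consensus ranking. The sketch below covers only that baseline step, as a least-squares fit of item scores to pairwise margins solved by gradient descent; it does not include the paper's mixed-effects extension or the Linearized Bregman path. The function name, the `(i, j, margin)` tuple layout, and the step size `lr` are hypothetical, and the step size may need to be reduced for larger comparison graphs.

```python
def least_squares_scores(n_items, comparisons, lr=0.1, iterations=500):
    """Fit scores s so that s[i] - s[j] approximates each observed margin.

    comparisons: list of (i, j, margin), read as "item i beats item j by margin".
    Scores are only identified up to an additive constant; starting from zero
    keeps their mean at zero throughout.
    """
    scores = [0.0] * n_items
    for _ in range(iterations):
        grad = [0.0] * n_items
        for i, j, margin in comparisons:
            residual = scores[i] - scores[j] - margin
            grad[i] += 2.0 * residual
            grad[j] -= 2.0 * residual
        scores = [s - lr * g for s, g in zip(scores, grad)]
    return scores

# Three items with perfectly consistent margins: 0 > 1 > 2.
scores = least_squares_scores(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)])
ranking = sorted(range(3), key=lambda k: -scores[k])
```

When the margins are inconsistent, the part of the data the scores cannot explain is the residual that the abstract refers to as cyclic conflicts.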
Boosting Image Captioning with Attributes
Title | Boosting Image Captioning with Attributes |
Authors | Ting Yao, Yingwei Pan, Yehao Li, Zhaofan Qiu, Tao Mei |
Abstract | Automatically describing an image in natural language has been an emerging challenge in both computer vision and natural language processing. In this paper, we present Long Short-Term Memory with Attributes (LSTM-A) - a novel architecture that integrates attributes into the successful Convolutional Neural Networks (CNNs) plus Recurrent Neural Networks (RNNs) image captioning framework, by training them in an end-to-end manner. To incorporate attributes, we construct variants of architectures by feeding image representations and attributes into RNNs in different ways to explore the mutual but also fuzzy relationship between them. Extensive experiments are conducted on the COCO image captioning dataset and our framework achieves superior results when compared to state-of-the-art deep models. Most remarkably, we obtain METEOR/CIDEr-D of 25.2%/98.6% on testing data of widely used and publicly available splits in (Karpathy & Fei-Fei, 2015) when extracting image representations by GoogleNet and achieve, to date, top-1 performance on the COCO captioning Leaderboard. |
Tasks | Image Captioning |
Published | 2016-11-05 |
URL | http://arxiv.org/abs/1611.01646v1 |
PDF | http://arxiv.org/pdf/1611.01646v1.pdf |
PWC | https://paperswithcode.com/paper/boosting-image-captioning-with-attributes |
Repo | |
Framework | |
Dynamic Question Ordering in Online Surveys
Title | Dynamic Question Ordering in Online Surveys |
Authors | Kirstin Early, Jennifer Mankoff, Stephen E. Fienberg |
Abstract | Online surveys have the potential to support adaptive questions, where later questions depend on earlier responses. Past work has taken a rule-based approach, uniformly across all respondents. We envision a richer interpretation of adaptive questions, which we call dynamic question ordering (DQO), where question order is personalized. Such an approach could increase engagement, and therefore response rate, as well as imputation quality. We present a DQO framework to improve survey completion and imputation. In the general survey-taking setting, we want to maximize survey completion, and so we focus on ordering questions to engage the respondent and collect hopefully all information, or at least the information that most characterizes the respondent, for accurate imputations. In another scenario, our goal is to provide a personalized prediction. Since it is possible to give reasonable predictions with only a subset of questions, we are not concerned with motivating users to answer all questions. Instead, we want to order questions to get information that reduces prediction uncertainty, while not being too burdensome. We illustrate this framework with an example of providing energy estimates to prospective tenants. We also discuss DQO for national surveys and consider connections between our statistics-based question-ordering approach and cognitive survey methodology. |
Tasks | Imputation |
Published | 2016-07-14 |
URL | http://arxiv.org/abs/1607.04209v1 |
PDF | http://arxiv.org/pdf/1607.04209v1.pdf |
PWC | https://paperswithcode.com/paper/dynamic-question-ordering-in-online-surveys |
Repo | |
Framework | |
Discriminative Sparse Neighbor Approximation for Imbalanced Learning
Title | Discriminative Sparse Neighbor Approximation for Imbalanced Learning |
Authors | Chen Huang, Chen Change Loy, Xiaoou Tang |
Abstract | Data imbalance is common in many vision tasks where one or more classes are rare. Without addressing this issue, conventional methods tend to be biased toward the majority class with poor predictive accuracy for the minority class. These methods further deteriorate on small, imbalanced data that has a large degree of class overlap. In this study, we propose a novel discriminative sparse neighbor approximation (DSNA) method to ameliorate the effect of class-imbalance during prediction. Specifically, given a test sample, we first traverse it through a cost-sensitive decision forest to collect a good subset of training examples in its local neighborhood. Then we generate from this subset several class-discriminating but overlapping clusters and model each as an affine subspace. From these subspaces, the proposed DSNA iteratively seeks an optimal approximation of the test sample and outputs an unbiased prediction. We show that our method not only effectively mitigates the imbalance issue, but also allows the prediction to extrapolate to unseen data. The latter capability is crucial for achieving accurate prediction on small datasets with limited samples. The proposed imbalanced learning method can be applied to both classification and regression tasks at a wide range of imbalance levels. It significantly outperforms the state-of-the-art methods that do not possess an imbalance handling mechanism, and is found to perform comparably or even better than recent deep learning methods by using hand-crafted features only. |
Tasks | |
Published | 2016-02-03 |
URL | http://arxiv.org/abs/1602.01197v1 |
PDF | http://arxiv.org/pdf/1602.01197v1.pdf |
PWC | https://paperswithcode.com/paper/discriminative-sparse-neighbor-approximation |
Repo | |
Framework | |
Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming
Title | Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming |
Authors | Yangyang Xu |
Abstract | Motivated by big data applications, first-order methods have been extremely popular in recent years. However, naive gradient methods generally converge slowly. Hence, much effort has been made to accelerate various first-order methods. This paper proposes two accelerated methods for solving structured linearly constrained convex programming, for which we assume a composite convex objective. The first method is the accelerated linearized augmented Lagrangian method (LALM). At each update to the primal variable, it allows linearization of the differentiable function and also the augmented term, and thus it enables easy subproblems. Assuming merely weak convexity, we show that LALM achieves $O(1/t)$ convergence if parameters are kept fixed during all the iterations and can be accelerated to $O(1/t^2)$ if the parameters are adapted, where $t$ is the number of total iterations. The second method is the accelerated linearized alternating direction method of multipliers (LADMM). In addition to the composite convexity, it further assumes a two-block structure on the objective. Different from classic ADMM, our method allows linearization of the objective and also the augmented term to make the update simple. Assuming strong convexity on one block variable, we show that LADMM also enjoys $O(1/t^2)$ convergence with adaptive parameters. This result is a significant improvement over that in [Goldstein et al., SIIMS’14], which requires strong convexity on both block variables and no linearization of the objective or augmented term. Numerical experiments are performed on quadratic programming, image denoising, and support vector machine. The proposed accelerated methods are compared to nonaccelerated ones and also existing accelerated methods. The results demonstrate the validity of acceleration and superior performance of the proposed methods over existing ones. |
Tasks | Denoising, Image Denoising |
Published | 2016-06-29 |
URL | http://arxiv.org/abs/1606.09155v1 |
PDF | http://arxiv.org/pdf/1606.09155v1.pdf |
PWC | https://paperswithcode.com/paper/accelerated-first-order-primal-dual-proximal |
Repo | |
Framework | |
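The abstract above describes a primal update that linearizes both the smooth objective and the augmented term, so each subproblem reduces to a simple step. The sketch below shows the fixed-parameter (non-accelerated) form of that idea on a toy problem, and is not the paper's algorithm: it assumes the objective 0.5 * ||x - c||^2 with a single linear constraint a.x = b, and a proximal weight `eta` chosen as the gradient Lipschitz constant plus `beta * ||a||^2`. The function name and parameters are hypothetical.

```python
def linearized_alm(c, a, b, beta=1.0, iterations=200):
    """Minimize 0.5 * ||x - c||^2 subject to a.x = b with a linearized ALM."""
    # Proximal weight: Lipschitz constant of the gradient (1 here) + beta * ||a||^2.
    eta = 1.0 + beta * sum(ai * ai for ai in a)
    x = [0.0] * len(c)
    lam = 0.0
    for _ in range(iterations):
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        # Gradient of the augmented Lagrangian at the current point.
        multiplier = lam + beta * residual
        x = [xi - ((xi - ci) + ai * multiplier) / eta
             for xi, ci, ai in zip(x, c, a)]
        # Dual ascent on the constraint violation at the new point.
        lam += beta * (sum(ai * xi for ai, xi in zip(a, x)) - b)
    return x, lam

# Project c = (3, 1) onto the line x1 + x2 = 2.
x, lam = linearized_alm([3.0, 1.0], [1.0, 1.0], 2.0)
```

For this toy problem the iterates approach the projection (2, 0) with multiplier 1. The accelerated variants in the abstract adapt the parameters across iterations rather than keeping them fixed as done here.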
Cyberbullying Identification Using Participant-Vocabulary Consistency
Title | Cyberbullying Identification Using Participant-Vocabulary Consistency |
Authors | Elaheh Raisi, Bert Huang |
Abstract | With the rise of social media, people can now form relationships and communities easily regardless of location, race, ethnicity, or gender. However, the power of social media simultaneously enables harmful online behavior such as harassment and bullying. Cyberbullying is a serious social problem, making it an important topic in social network analysis. Machine learning methods can potentially help provide better understanding of this phenomenon, but they must address several key challenges: the rapidly changing vocabulary involved in cyberbullying, the role of social network structure, and the scale of the data. In this study, we propose a model that simultaneously discovers instigators and victims of bullying as well as new bullying vocabulary by starting with a corpus of social interactions and a seed dictionary of bullying indicators. We formulate an objective function based on participant-vocabulary consistency. We evaluate this approach on Twitter and Ask.fm data sets and show that the proposed method can detect new bullying vocabulary as well as victims and bullies. |
Tasks | |
Published | 2016-06-26 |
URL | http://arxiv.org/abs/1606.08084v1 |
PDF | http://arxiv.org/pdf/1606.08084v1.pdf |
PWC | https://paperswithcode.com/paper/cyberbullying-identification-using |
Repo | |
Framework | |
11 x 11 Domineering is Solved: The first player wins
Title | 11 x 11 Domineering is Solved: The first player wins |
Authors | Jos W. H. M. Uiterwijk |
Abstract | We have developed a program called MUDoS (Maastricht University Domineering Solver) that solves Domineering positions in a very efficient way. This enables positions solved so far (up to the 10 x 10 board) to be solved much more quickly (measured in number of investigated nodes). More importantly, it enables the solution of the 11 x 11 Domineering board, a board up till now far out of reach of previous Domineering solvers. The solution needed the investigation of 259,689,994,008 nodes, using almost half a year of computation time on a single simple desktop computer. The results show that under optimal play the first player wins the 11 x 11 Domineering game, irrespective of whether Vertical or Horizontal starts the game. In addition, several other boards hitherto unsolved were solved. Using the convention that Vertical starts, the 8 x 15, 11 x 9, 12 x 8, 12 x 15, 14 x 8, and 17 x 6 boards are all won by Vertical, whereas the 6 x 17, 8 x 12, 9 x 11, and 11 x 10 boards are all won by Horizontal. |
Tasks | Recommendation Systems |
Published | 2016-02-17 |
URL | http://arxiv.org/abs/1602.05404v1 |
PDF | http://arxiv.org/pdf/1602.05404v1.pdf |
PWC | https://paperswithcode.com/paper/11-x-11-domineering-is-solved-the-first |
Repo | |
Framework | |
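To make concrete what "solving" a Domineering board means, the sketch below is a brute-force memoized game-tree search for very small boards. It is not MUDoS and uses none of its techniques; it only assumes the standard rules, where Vertical places dominoes covering two vertically adjacent cells, Horizontal covers two horizontally adjacent cells, and the player who cannot move loses. The function name and the bitmask board encoding are hypothetical choices.

```python
from functools import lru_cache

def mover_wins(rows, cols, vertical_to_move=True):
    """Return True if the player to move wins the empty rows x cols board."""

    def moves(occupied, vertical):
        # Yield every position reachable by placing one domino.
        dr, dc = (1, 0) if vertical else (0, 1)
        for r in range(rows - dr):
            for c in range(cols - dc):
                first = 1 << (r * cols + c)
                second = 1 << ((r + dr) * cols + (c + dc))
                if not occupied & (first | second):
                    yield occupied | first | second

    @lru_cache(maxsize=None)
    def wins(occupied, vertical):
        # The mover wins if some move leaves the opponent in a losing position.
        # With no legal moves, any() is False, so the mover loses.
        return any(not wins(nxt, not vertical) for nxt in moves(occupied, vertical))

    return wins(0, vertical_to_move)
```

This exhaustive search grows far too quickly to reach boards anywhere near 11 x 11, which is why the node count and running time quoted in the abstract are notable.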