July 27, 2019


Paper Group ANR 506

Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions. Rank-to-engage: New Listwise Approaches to Maximize Engagement. Improving Language Modelling with Noise-contrastive estimation. MMGAN: Manifold Matching Generative Adversarial Network. DeepNorm-A Deep Learning Approach to Text Normalization. Multimodal MRI brai …

Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions


Title	Distributed-Representation Based Hybrid Recommender System with Short Item Descriptions
Authors	Junhua He, Hankz Hankui Zhuo, Jarvan Law
Abstract	Collaborative filtering (CF) aims to build a model from users’ past behaviors and/or similar decisions made by other users, and to use that model to recommend items to users. Despite the success of previous collaborative filtering approaches, they all assume that sufficient rating scores are available for building high-quality recommendation models. In real-world applications, however, it is often difficult to collect sufficient rating scores, especially when new items are introduced into the system, which makes the recommendation task challenging. We find that there are often “short” texts describing features of items, based on which we can approximate the similarity of items and make recommendations together with rating scores. In this paper we “borrow” the idea of vector representation of words to capture the information of short texts and embed it into a matrix factorization framework. We empirically show that our approach is effective by comparing it with state-of-the-art approaches.
Tasks	Recommendation Systems
Published	2017-03-15
URL	http://arxiv.org/abs/1703.04854v1
PDF	http://arxiv.org/pdf/1703.04854v1.pdf
PWC	https://paperswithcode.com/paper/distributed-representation-based-hybrid
Repo
Framework

Rank-to-engage: New Listwise Approaches to Maximize Engagement


Title	Rank-to-engage: New Listwise Approaches to Maximize Engagement
Authors	Swayambhoo Jain, Akshay Soni, Nikolay Laptev, Yashar Mehdad
Abstract	For many internet businesses, presenting a given list of items in an order that maximizes a certain metric of interest (e.g., click-through rate, average engagement time, etc.) is crucial. We approach this task from a learning-to-rank perspective, which reveals a new problem setup. In the traditional learning-to-rank literature, it is implicitly assumed that during training data generation one has access to the best or desired order for the given list of items. In this work, we consider a problem setup where we do not observe the desired ranking. We present two novel solutions: the first extends an existing listwise learning-to-rank technique, listwise maximum likelihood estimation (ListMLE), while the second is a generic machine-learning-based framework that tackles the problem in its entire generality. We discuss several challenges associated with this generic framework, and propose a simple item-payoff and positional-gain model that addresses these challenges. We provide training algorithms and inference procedures, and demonstrate the effectiveness of the two approaches over traditional ListMLE on synthetic data as well as in a real-life setting of ranking news articles for increased dwell time.
Tasks	Learning-To-Rank
Published	2017-02-24
URL	http://arxiv.org/abs/1702.07798v1
PDF	http://arxiv.org/pdf/1702.07798v1.pdf
PWC	https://paperswithcode.com/paper/rank-to-engage-new-listwise-approaches-to
Repo
Framework
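
For orientation, the baseline that the abstract above compares against, traditional ListMLE, can be sketched as the negative log-likelihood of an observed ranking under a Plackett-Luce model. This is a minimal sketch of the standard loss only, not the paper's extension for unobserved rankings; the function name and toy scores are hypothetical.

```python
import math


def listmle_loss(scores, ranking):
    """Negative Plackett-Luce log-likelihood of `ranking` given `scores`.

    scores[i] is the model score of item i (hypothetical values here);
    ranking lists item indices from best to worst.
    """
    ordered = [scores[i] for i in ranking]
    loss = 0.0
    for k in range(len(ordered)):
        tail = ordered[k:]
        # Log-sum-exp over the items not yet placed, shifted for stability.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        # Item k is chosen first among the remaining items.
        loss += log_norm - ordered[k]
    return loss


# A ranking that agrees with the scores costs less than its reverse.
agree = listmle_loss([2.0, 1.0, 0.0], [0, 1, 2])
reverse = listmle_loss([2.0, 1.0, 0.0], [2, 1, 0])
```

Training with this loss requires the desired order as a label, which is exactly the assumption the paper relaxes.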

Improving Language Modelling with Noise-contrastive estimation


Title	Improving Language Modelling with Noise-contrastive estimation
Authors	Farhana Ferdousi Liza, Marek Grzes
Abstract	Neural language models do not scale well when the vocabulary is large. Noise-contrastive estimation (NCE) is a sampling-based method that allows for fast learning with large vocabularies. Although NCE has shown promising performance in neural machine translation, it was considered to be an unsuccessful approach for language modelling. A sufficient investigation of the hyperparameters in the NCE-based neural language models was also missing. In this paper, we showed that NCE can be a successful approach in neural language modelling when the hyperparameters of a neural network are tuned appropriately. We introduced the ‘search-then-converge’ learning rate schedule for NCE and designed a heuristic that specifies how to use this schedule. The impact of the other important hyperparameters, such as the dropout rate and the weight initialisation range, was also demonstrated. We showed that appropriate tuning of NCE-based neural language models outperforms the state-of-the-art single-model methods on a popular benchmark.
Tasks	Language Modelling, Machine Translation
Published	2017-09-22
URL	http://arxiv.org/abs/1709.07758v1
PDF	http://arxiv.org/pdf/1709.07758v1.pdf
PWC	https://paperswithcode.com/paper/improving-language-modelling-with-noise
Repo
Framework
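
The abstract above does not spell out the schedule. As a rough illustration, the classic search-then-converge form keeps the learning rate near its initial value early on and decays it roughly as 1/t later. The sketch below assumes that classic form; the parameter names and values are hypothetical, and the paper's exact variant and heuristic may differ.

```python
def search_then_converge(eta0, tau, t):
    """Classic search-then-converge learning rate (assumed form).

    eta0: initial learning rate; tau: step at which the rate has halved;
    t: current step or epoch. For t << tau the rate stays close to eta0
    (the "search" phase); for t >> tau it decays like eta0 * tau / t
    (the "converge" phase).
    """
    return eta0 / (1.0 + t / tau)


# Learning rates over the first few epochs for hypothetical settings.
rates = [search_then_converge(1.0, 10.0, t) for t in range(5)]
```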

MMGAN: Manifold Matching Generative Adversarial Network


Title	MMGAN: Manifold Matching Generative Adversarial Network
Authors	Noseong Park, Ankesh Anand, Joel Ruben Antony Moniz, Kookjin Lee, Tanmoy Chakraborty, Jaegul Choo, Hongkyu Park, Youngmin Kim
Abstract	It is well-known that GANs are difficult to train, and several different techniques have been proposed in order to stabilize their training. In this paper, we propose a novel training method called manifold-matching, and a new GAN model called manifold-matching GAN (MMGAN). MMGAN finds two manifolds representing the vector representations of real and fake images. If these two manifolds match, it means that real and fake images are statistically identical. To assist the manifold-matching task, we also use i) kernel tricks to find better manifold structures, ii) moving-averaged manifolds across mini-batches, and iii) a regularizer based on correlation matrix to suppress mode collapse. We conduct in-depth experiments with three image datasets and compare with several state-of-the-art GAN models. 32.4% of images generated by the proposed MMGAN are recognized as fake images during our user study (16% enhancement compared to other state-of-the-art model). MMGAN achieved an unsupervised inception score of 7.8 for CIFAR-10.
Tasks
Published	2017-07-26
URL	http://arxiv.org/abs/1707.08273v4
PDF	http://arxiv.org/pdf/1707.08273v4.pdf
PWC	https://paperswithcode.com/paper/mmgan-manifold-matching-generative
Repo
Framework

DeepNorm-A Deep Learning Approach to Text Normalization


Title	DeepNorm-A Deep Learning Approach to Text Normalization
Authors	Maryam Zare, Shaurya Rohatgi
Abstract	This paper presents a simple yet sophisticated approach to the challenge posed by Sproat and Jaitly (2016): given a large corpus of written text aligned to its normalized spoken form, train an RNN to learn the correct normalization function. Normalizing a token seems straightforward without its context, but once the context of the token is taken into account, normalization becomes tricky for some classes. We present a novel approach in which the prediction of our classification algorithm is used by our sequence-to-sequence model to predict the normalized text of the input token. Our approach takes very little time to learn and performs well, unlike what has been reported by Google (5 days on their GPU cluster). We have achieved an accuracy of 97.62, which is impressive given the resources we use. Our approach uses the best of both worlds: gradient boosting, the state of the art in most classification tasks, and sequence-to-sequence learning, the state of the art in machine translation. We present our experiments and report results with various parameter settings.
Tasks	Machine Translation
Published	2017-12-17
URL	http://arxiv.org/abs/1712.06994v1
PDF	http://arxiv.org/pdf/1712.06994v1.pdf
PWC	https://paperswithcode.com/paper/deepnorm-a-deep-learning-approach-to-text
Repo
Framework

Multimodal MRI brain tumor segmentation using random forests with features learned from fully convolutional neural network


Title	Multimodal MRI brain tumor segmentation using random forests with features learned from fully convolutional neural network
Authors	Mohammadreza Soltaninejad, Lei Zhang, Tryphon Lambrou, Nigel Allinson, Xujiong Ye
Abstract	In this paper, we propose a novel learning-based method for automated segmentation of brain tumors in multimodal MRI images. Machine-learned features from a fully convolutional neural network (FCN) and hand-designed texton features are used to classify the MRI image voxels. The score map with pixel-wise predictions, which is learned from a multimodal MRI training dataset using the FCN, is used as a feature map. The learned features are then applied to random forests to classify each MRI image voxel into normal brain tissues and different parts of the tumor. The method was evaluated on the BRATS 2013 challenge dataset. The results show that applying the random forest classifier to multimodal MRI images, using machine-learned features based on the FCN and hand-designed features based on textons, provides promising segmentations. The Dice overlap measure for automatic brain tumor segmentation against ground truth is 0.88, 0.80 and 0.73 for complete tumor, core and enhancing tumor, respectively.
Tasks	Brain Tumor Segmentation
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08134v1
PDF	http://arxiv.org/pdf/1704.08134v1.pdf
PWC	https://paperswithcode.com/paper/multimodal-mri-brain-tumor-segmentation-using
Repo
Framework

Group-driven Reinforcement Learning for Personalized mHealth Intervention


Title	Group-driven Reinforcement Learning for Personalized mHealth Intervention
Authors	Feiyun Zhu, Jun Guo, Zheng Xu, Peng Liao, Junzhou Huang
Abstract	Due to the popularity of smartphones and wearable devices, mobile health (mHealth) technologies promise to bring positive and wide-ranging impacts on people’s health. State-of-the-art decision-making methods for mHealth rely on idealized assumptions: they assume that users are either completely homogeneous or completely heterogeneous. In reality, however, a user might be similar to some, but not all, users. In this paper, we propose a novel group-driven reinforcement learning method for mHealth. We aim to understand how to share information among similar users to better convert the limited user information into sharper learned RL policies. Specifically, we employ the K-means clustering method to group users based on the similarity of their trajectory information and learn a shared RL policy for each group. Extensive experimental results show that our method achieves clear gains over state-of-the-art RL methods for mHealth.
Tasks	Decision Making
Published	2017-08-14
URL	http://arxiv.org/abs/1708.04001v1
PDF	http://arxiv.org/pdf/1708.04001v1.pdf
PWC	https://paperswithcode.com/paper/group-driven-reinforcement-learning-for
Repo
Framework
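
The grouping step described in the abstract above can be illustrated with a plain K-means (Lloyd's algorithm) pass over per-user feature vectors. This is only a sketch of the clustering stage under stated assumptions: the two-dimensional "trajectory features" are made-up values, centroids are initialized from the first k users for determinism, and the per-group RL policy learning is not shown.

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm; returns (assignments, centroids).

    Centroids start at the first k points, an assumption made here so the
    result is deterministic; random restarts are more common in practice.
    """
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each user joins the nearest centroid.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


# Hypothetical per-user trajectory summaries (two features per user).
users = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
groups, centers = kmeans(users, k=2)
```

Each resulting group would then share one policy, trained on the pooled trajectories of its members.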

Theoretical properties of the global optimizer of two layer neural network


Title	Theoretical properties of the global optimizer of two layer neural network
Authors	Digvijay Boob, Guanghui Lan
Abstract	In this paper, we study the problem of optimizing a two-layer artificial neural network that best fits a training dataset. We look at this problem in the setting where the number of parameters is greater than the number of sampled points. We show that for a wide class of differentiable activation functions (this class involves “almost” all functions which are not piecewise linear), first-order optimal solutions satisfy global optimality provided the hidden layer is non-singular. Our results extend easily from hidden layers given by a square matrix to those given by a flat matrix. The results are applicable even if the network has more than one hidden layer, provided all hidden layers satisfy non-singularity, all activations are from the given “good” class of differentiable functions, and optimization is only with respect to the last hidden layer. We also study the smoothness properties of the objective function and show that it is actually Lipschitz smooth, i.e., its gradients do not change sharply. We use the smoothness properties to guarantee asymptotic convergence of O(1/number of iterations) to a first-order optimal solution. We also show that our algorithm will maintain non-singularity of the hidden layer for any finite number of iterations.
Tasks
Published	2017-10-30
URL	http://arxiv.org/abs/1710.11241v1
PDF	http://arxiv.org/pdf/1710.11241v1.pdf
PWC	https://paperswithcode.com/paper/theoretical-properties-of-the-global
Repo
Framework

Linguistic Features of Genre and Method Variation in Translation: A Computational Perspective


Title	Linguistic Features of Genre and Method Variation in Translation: A Computational Perspective
Authors	Ekaterina Lapshinova-Koltunski, Marcos Zampieri
Abstract	In this paper we describe the use of text classification methods to investigate genre and method variation in an English - German translation corpus. For this purpose we use linguistically motivated features representing texts using a combination of part-of-speech tags arranged in bigrams, trigrams, and 4-grams. The classification method used in this paper is a Bayesian classifier with Laplace smoothing. We use the output of the classifiers to carry out an extensive feature analysis on the main difference between genres and methods of translation.
Tasks	Text Classification
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04359v1
PDF	http://arxiv.org/pdf/1709.04359v1.pdf
PWC	https://paperswithcode.com/paper/linguistic-features-of-genre-and-method
Repo
Framework
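
The pipeline in the abstract above, part-of-speech n-gram features fed to a Bayesian classifier with Laplace smoothing, can be sketched in a few lines. This is a minimal multinomial naive Bayes under stated assumptions: the tag sequences and the labels "fiction" and "manual" are invented for illustration, and the paper's actual tag set, classes, and classifier details are not given here.

```python
import math
from collections import Counter, defaultdict


def pos_ngrams(tags, n_values=(2, 3, 4)):
    """Turn a POS tag sequence into bigram, trigram, and 4-gram features."""
    feats = []
    for n in n_values:
        for i in range(len(tags) - n + 1):
            feats.append(" ".join(tags[i:i + n]))
    return feats


class LaplaceNaiveBayes:
    """Multinomial naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def log_posterior(self, feats, label):
        total = sum(self.feature_counts[label].values())
        denom = total + self.alpha * len(self.vocab)
        logp = math.log(self.class_counts[label] / sum(self.class_counts.values()))
        for f in feats:
            if f in self.vocab:  # features never seen in training are skipped
                logp += math.log((self.feature_counts[label][f] + self.alpha) / denom)
        return logp

    def predict(self, feats):
        return max(self.class_counts, key=lambda c: self.log_posterior(feats, c))


# Toy training data with hypothetical genre labels.
doc_a = pos_ngrams(["DET", "NOUN", "VERB", "DET", "NOUN"])
doc_b = pos_ngrams(["NOUN", "NOUN", "NOUN", "VERB", "ADJ"])
model = LaplaceNaiveBayes().fit([doc_a, doc_b], ["fiction", "manual"])
```

The smoothed per-class feature probabilities are also what a feature analysis like the one the abstract mentions would inspect.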

Pose Invariant Embedding for Deep Person Re-identification


Title	Pose Invariant Embedding for Deep Person Re-identification
Authors	Liang Zheng, Yujia Huang, Huchuan Lu, Yi Yang
Abstract	Pedestrian misalignment, which mainly arises from detector errors and pose variations, is a critical problem for a robust person re-identification (re-ID) system. With bad alignment, the background noise will significantly compromise the feature learning and matching process. To address this problem, this paper introduces the pose invariant embedding (PIE) as a pedestrian descriptor. First, in order to align pedestrians to a standard pose, the PoseBox structure is introduced, which is generated through pose estimation followed by affine transformations. Second, to reduce the impact of pose estimation errors and information loss during PoseBox construction, we design a PoseBox fusion (PBF) CNN architecture that takes the original image, the PoseBox, and the pose estimation confidence as input. The proposed PIE descriptor is thus defined as the fully connected layer of the PBF network for the retrieval task. Experiments are conducted on the Market-1501, CUHK03, and VIPeR datasets. We show that PoseBox alone yields decent re-ID accuracy and that when integrated in the PBF network, the learned PIE descriptor produces competitive performance compared with the state-of-the-art approaches.
Tasks	Person Re-Identification, Pose Estimation
Published	2017-01-26
URL	http://arxiv.org/abs/1701.07732v1
PDF	http://arxiv.org/pdf/1701.07732v1.pdf
PWC	https://paperswithcode.com/paper/pose-invariant-embedding-for-deep-person-re
Repo
Framework

High-Dimensional Dependency Structure Learning for Physical Processes


Title	High-Dimensional Dependency Structure Learning for Physical Processes
Authors	Jamal Golmohammadi, Imme Ebert-Uphoff, Sijie He, Yi Deng, Arindam Banerjee
Abstract	In this paper, we consider the use of structure learning methods for probabilistic graphical models to identify statistical dependencies in high-dimensional physical processes. Such processes are often synthetically characterized using PDEs (partial differential equations) and are observed in a variety of natural phenomena, including geoscience data capturing atmospheric and hydrological phenomena. Classical structure learning approaches such as the PC algorithm and variants are challenging to apply due to their high computational and sample requirements. Modern approaches, often based on sparse regression and variants, do come with finite sample guarantees, but are usually highly sensitive to the choice of hyper-parameters, e.g., parameter $\lambda$ for sparsity inducing constraint or regularization. In this paper, we present ACLIME-ADMM, an efficient two-step algorithm for adaptive structure learning, which estimates an edge specific parameter $\lambda_{ij}$ in the first step, and uses these parameters to learn the structure in the second step. Both steps of our algorithm use (inexact) ADMM to solve suitable linear programs, and all iterations can be done in closed form in an efficient block parallel manner. We compare ACLIME-ADMM with baselines on both synthetic data simulated by partial differential equations (PDEs) that model advection-diffusion processes, and real data (50 years) of daily global geopotential heights to study information flow in the atmosphere. ACLIME-ADMM is shown to be efficient, stable, and competitive, usually better than the baselines especially on difficult problems. On real data, ACLIME-ADMM recovers the underlying structure of global atmospheric circulation, including switches in wind directions at the equator and tropics entirely from the data.
Tasks
Published	2017-09-12
URL	http://arxiv.org/abs/1709.03891v1
PDF	http://arxiv.org/pdf/1709.03891v1.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-dependency-structure
Repo
Framework

Towards Semantic Modeling of Contradictions and Disagreements: A Case Study of Medical Guidelines


Title	Towards Semantic Modeling of Contradictions and Disagreements: A Case Study of Medical Guidelines
Authors	Wlodek Zadrozny, Hossein Hematialam, Luciana Garbayo
Abstract	We introduce a formal distinction between contradictions and disagreements in natural language texts, motivated by the need to formally reason about contradictory medical guidelines. This is a novel and potentially very useful distinction that has not been discussed so far in NLP and logic. We also describe an NLP system capable of automatically finding contradictory medical guidelines; the system uses a combination of text analysis and information retrieval modules. We also report positive evaluation results on a small corpus of contradictory medical recommendations.
Tasks	Information Retrieval
Published	2017-08-02
URL	http://arxiv.org/abs/1708.00850v1
PDF	http://arxiv.org/pdf/1708.00850v1.pdf
PWC	https://paperswithcode.com/paper/towards-semantic-modeling-of-contradictions
Repo
Framework

A correlation game for unsupervised learning yields computational interpretations of Hebbian excitation, anti-Hebbian inhibition, and synapse elimination


Title	A correlation game for unsupervised learning yields computational interpretations of Hebbian excitation, anti-Hebbian inhibition, and synapse elimination
Authors	H. Sebastian Seung, Jonathan Zung
Abstract	Much has been learned about plasticity of biological synapses from empirical studies. Hebbian plasticity is driven by correlated activity of presynaptic and postsynaptic neurons. Synapses that converge onto the same neuron often behave as if they compete for a fixed resource; some survive the competition while others are eliminated. To provide computational interpretations of these aspects of synaptic plasticity, we formulate unsupervised learning as a zero-sum game between Hebbian excitation and anti-Hebbian inhibition in a neural network model. The game formalizes the intuition that Hebbian excitation tries to maximize correlations of neurons with their inputs, while anti-Hebbian inhibition tries to decorrelate neurons from each other. We further include a model of synaptic competition, which enables a neuron to eliminate all connections except those from its most strongly correlated inputs. Through empirical studies, we show that this facilitates the learning of sensory features that resemble parts of objects.
Tasks
Published	2017-04-03
URL	http://arxiv.org/abs/1704.00646v1
PDF	http://arxiv.org/pdf/1704.00646v1.pdf
PWC	https://paperswithcode.com/paper/a-correlation-game-for-unsupervised-learning
Repo
Framework

Optimal algorithms for smooth and strongly convex distributed optimization in networks


Title	Optimal algorithms for smooth and strongly convex distributed optimization in networks
Authors	Kevin Scaman, Francis Bach, Sébastien Bubeck, Yin Tat Lee, Laurent Massoulié
Abstract	In this paper, we determine the optimal convergence rates for strongly convex and smooth distributed optimization in two settings: centralized and decentralized communications over a network. For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov’s accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize, $\Delta$ is the diameter of the network, and $\tau$ (resp. $1$) is the time needed to communicate values between two neighbors (resp. perform local computations). For decentralized algorithms based on gossip, we provide the first optimal algorithm, called the multi-step dual accelerated (MSDA) method, that achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_l}(1+\frac{\tau}{\sqrt{\gamma}})\ln(1/\varepsilon))$, where $\kappa_l$ is the condition number of the local functions and $\gamma$ is the (normalized) eigengap of the gossip matrix used for communication between nodes. We then verify the efficiency of MSDA against state-of-the-art methods for two problems: least-squares regression and classification by logistic regression.
Tasks	Distributed Optimization
Published	2017-02-28
URL	http://arxiv.org/abs/1702.08704v2
PDF	http://arxiv.org/pdf/1702.08704v2.pdf
PWC	https://paperswithcode.com/paper/optimal-algorithms-for-smooth-and-strongly
Repo
Framework
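
The centralized result in the abstract above builds on Nesterov's accelerated gradient descent. A single-machine sketch of the constant-momentum variant for smooth, strongly convex functions shows where the square root of the condition number comes from. This is not the distributed algorithm or MSDA from the paper; the quadratic test function and all names are hypothetical.

```python
import math


def nesterov_agd(grad, x0, L, mu, iters):
    """Nesterov's accelerated gradient for an L-smooth, mu-strongly convex f.

    Uses step size 1/L and constant momentum
    (sqrt(kappa) - 1) / (sqrt(kappa) + 1), where kappa = L / mu.
    """
    kappa = L / mu
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    x = list(x0)
    y = list(x0)
    for _ in range(iters):
        g = grad(y)
        # Gradient step taken from the extrapolated point y.
        x_next = [yi - gi / L for yi, gi in zip(y, g)]
        # Momentum: extrapolate along the last displacement.
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]
        x = x_next
    return x


def quad_grad(v):
    """Gradient of f(v) = 0.5 * (1 * v[0]**2 + 100 * v[1]**2), so kappa = 100."""
    return [1.0 * v[0], 100.0 * v[1]]


x_final = nesterov_agd(quad_grad, [1.0, 1.0], L=100.0, mu=1.0, iters=200)
```

On this example the error contracts by roughly a factor of 1 - 1/sqrt(kappa) per iteration, which matches the sqrt(kappa) * ln(1/epsilon) dependence in the stated rates.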

New region force for variational models in image segmentation and high dimensional data clustering


Title	New region force for variational models in image segmentation and high dimensional data clustering
Authors	Ke Wei, Ke Yin, Xue-Cheng Tai, Tony F. Chan
Abstract	We propose an effective framework for multi-phase image segmentation and semi-supervised data clustering by introducing a novel region force term into the Potts model. Assume the probability that a pixel or a data point belongs to each class is known a priori. We show that the corresponding indicator function obeys the Bernoulli distribution and the new region force function can be computed as the negative log-likelihood function under the Bernoulli distribution. We solve the Potts model by the primal-dual hybrid gradient method and the augmented Lagrangian method, which are based on two different dual problems of the same primal problem. Empirical evaluations of the Potts model with the new region force function on benchmark problems show that it is competitive with existing variational methods in both image segmentation and semi-supervised data clustering.
Tasks	Semantic Segmentation
Published	2017-04-26
URL	http://arxiv.org/abs/1704.08218v1
PDF	http://arxiv.org/pdf/1704.08218v1.pdf
PWC	https://paperswithcode.com/paper/new-region-force-for-variational-models-in
Repo
Framework