July 28, 2019


Paper Group ANR 294

Slim Embedding Layers for Recurrent Neural Language Models. SAGA: A Submodular Greedy Algorithm For Group Recommendation. T-SKIRT: Online Estimation of Student Proficiency in an Adaptive Learning System. Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels. Spectral-graph Based Classifications: Linear Regression for Classifi …

Slim Embedding Layers for Recurrent Neural Language Models


Title	Slim Embedding Layers for Recurrent Neural Language Models
Authors	Zhongliang Li, Raymond Kulhanek, Shaojun Wang, Yunxin Zhao, Shuang Wu
Abstract	Recurrent neural language models are the state-of-the-art models for language modeling. When the vocabulary size is large, the space taken to store the model parameters becomes the bottleneck for the use of recurrent neural language models. In this paper, we introduce a simple space compression method that randomly shares the structured parameters at both the input and output embedding layers of the recurrent neural language models to significantly reduce the size of model parameters, but still compactly represent the original input and output embedding layers. The method is easy to implement and tune. Experiments on several data sets show that the new method can get similar perplexity and BLEU score results while only using a very tiny fraction of parameters.
Tasks	Language Modelling
Published	2017-11-27
URL	http://arxiv.org/abs/1711.09873v2
PDF	http://arxiv.org/pdf/1711.09873v2.pdf
PWC	https://paperswithcode.com/paper/slim-embedding-layers-for-recurrent-neural
Repo
Framework
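
The parameter-sharing idea in the abstract above can be illustrated with a small sketch. It rests on one assumed reading of the method: each word embedding is a concatenation of a few sub-vectors, and each sub-vector is drawn through a fixed random index from a small shared pool. The function names, sizes, and initialisation below are hypothetical and are not taken from the paper.

```python
import random


def make_slim_embedding(vocab_size, num_parts, part_dim, pool_size, seed=0):
    """Build a shared sub-vector pool and a fixed random word-to-pool assignment."""
    rng = random.Random(seed)
    # Shared pool: the only real-valued parameters (pool_size * part_dim floats).
    pool = [[rng.gauss(0.0, 0.1) for _ in range(part_dim)]
            for _ in range(pool_size)]
    # Fixed random assignment: for each word, one pool index per sub-vector slot.
    assignment = [[rng.randrange(pool_size) for _ in range(num_parts)]
                  for _ in range(vocab_size)]
    return pool, assignment


def embed(word_id, pool, assignment):
    """Concatenate the shared sub-vectors assigned to word_id."""
    vector = []
    for pool_index in assignment[word_id]:
        vector.extend(pool[pool_index])
    return vector


# Hypothetical sizes chosen only for illustration.
pool, assignment = make_slim_embedding(
    vocab_size=10000, num_parts=8, part_dim=16, pool_size=500)
vector = embed(42, pool, assignment)

full_params = 10000 * 8 * 16  # floats in an ordinary embedding table
slim_params = 500 * 16        # floats in the shared pool
```

In this sketch the float count drops from 1,280,000 to 8,000, although the integer assignment table still has to be stored or regenerated from the seed.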

SAGA: A Submodular Greedy Algorithm For Group Recommendation


Title	SAGA: A Submodular Greedy Algorithm For Group Recommendation
Authors	Shameem A Puthiya Parambath, Nishant Vijayakumar, Sanjay Chawla
Abstract	In this paper, we propose a unified framework and an algorithm for the problem of group recommendation where a fixed number of items or alternatives can be recommended to a group of users. The problem of group recommendation arises naturally in many real world contexts, and is closely related to the budgeted social choice problem studied in economics. We frame the group recommendation problem as choosing a subgraph with the largest group consensus score in a completely connected graph defined over the item affinity matrix. We propose a fast greedy algorithm with strong theoretical guarantees, and show that the proposed algorithm compares favorably to the state-of-the-art group recommendation algorithms according to commonly used relevance and coverage performance measures on a benchmark dataset.
Tasks
Published	2017-12-25
URL	http://arxiv.org/abs/1712.09123v1
PDF	http://arxiv.org/pdf/1712.09123v1.pdf
PWC	https://paperswithcode.com/paper/saga-a-submodular-greedy-algorithm-for-group
Repo
Framework
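
To make the greedy selection step in the abstract above concrete, here is a generic sketch of greedy maximisation of a monotone submodular objective under a cardinality budget. The objective used here (each user contributes their best rating among the chosen items) is a stand-in chosen for illustration; it is not the group consensus score defined in the paper, and the rating matrix is invented.

```python
def consensus(ratings, chosen):
    """Illustrative objective: sum over users of their best rating among chosen items."""
    if not chosen:
        return 0.0
    return float(sum(max(user[item] for item in chosen) for user in ratings))


def greedy_select(ratings, k):
    """Pick up to k items, each time adding the item with the largest marginal gain."""
    num_items = len(ratings[0])
    chosen = []
    for _ in range(min(k, num_items)):
        current = consensus(ratings, chosen)
        best_item, best_gain = None, float("-inf")
        for item in range(num_items):
            if item in chosen:
                continue
            gain = consensus(ratings, chosen + [item]) - current
            if gain > best_gain:
                best_item, best_gain = item, gain
        chosen.append(best_item)
    return chosen


# Hypothetical non-negative ratings: rows are users, columns are items.
ratings = [
    [5, 1, 0, 2],
    [4, 0, 1, 2],
    [0, 1, 5, 2],
]
picked = greedy_select(ratings, 2)
print(picked)  # [0, 2]
```

Item 0 is chosen first because it serves the first two users best, and item 2 is added next because it gives the largest gain for the remaining user.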

T-SKIRT: Online Estimation of Student Proficiency in an Adaptive Learning System


Title	T-SKIRT: Online Estimation of Student Proficiency in an Adaptive Learning System
Authors	Chaitanya Ekanadham, Yan Karklin
Abstract	We develop T-SKIRT: a temporal, structured-knowledge, IRT-based method for predicting student responses online. By explicitly accounting for student learning and employing a structured, multidimensional representation of student proficiencies, the model outperforms standard IRT-based methods on an online response prediction task when applied to real responses collected from students interacting with diverse pools of educational content.
Tasks
Published	2017-02-14
URL	http://arxiv.org/abs/1702.04282v1
PDF	http://arxiv.org/pdf/1702.04282v1.pdf
PWC	https://paperswithcode.com/paper/t-skirt-online-estimation-of-student
Repo
Framework

Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels


Title	Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels
Authors	Kai Zhong, Zhao Song, Inderjit S. Dhillon
Abstract	In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels. We show that when the inputs follow a Gaussian distribution and the sample size is sufficiently large, the squared loss of such CNNs is locally strongly convex in a basin of attraction near the global optima for most popular activation functions, like ReLU, Leaky ReLU, Squared ReLU, Sigmoid and Tanh. The required sample complexity is proportional to the dimension of the input and polynomial in the number of kernels and a condition number of the parameters. We also show that tensor methods are able to initialize the parameters to the locally strongly convex region. Hence, for most smooth activations, gradient descent following tensor initialization is guaranteed to converge to the global optimum with time that is linear in input dimension, logarithmic in precision and polynomial in other factors. To the best of our knowledge, this is the first work that provides recovery guarantees for CNNs with multiple kernels under polynomial sample and computational complexities.
Tasks
Published	2017-11-08
URL	http://arxiv.org/abs/1711.03440v1
PDF	http://arxiv.org/pdf/1711.03440v1.pdf
PWC	https://paperswithcode.com/paper/learning-non-overlapping-convolutional-neural
Repo
Framework

Spectral-graph Based Classifications: Linear Regression for Classification and Normalized Radial Basis Function Network


Title	Spectral-graph Based Classifications: Linear Regression for Classification and Normalized Radial Basis Function Network
Authors	Zhenfang Hu, Gang Pan, Zhaohui Wu
Abstract	Spectral graph theory has been widely applied in unsupervised and semi-supervised learning. In this paper, we find for the first time, to our knowledge, that it also plays a concrete role in supervised classification. It turns out that two classifiers are inherently related to the theory: linear regression for classification (LRC) and normalized radial basis function network (nRBFN), corresponding to linear and nonlinear kernels respectively. The spectral graph theory provides us with a new insight into a fundamental aspect of classification: the tradeoff between fitting error and overfitting risk. With the theory, ideal working conditions for LRC and nRBFN are presented, which ensure not only zero fitting error but also low overfitting risk. For quantitative analysis, two concepts, the fitting error and the spectral risk (indicating overfitting), have been defined. Their bounds for nRBFN and LRC are derived. A special result shows that the spectral risk of nRBFN is lower bounded by the number of classes and upper bounded by the size of the radial basis. When the conditions are not met exactly, the classifiers will pursue the minimum fitting error, running into the risk of overfitting. It turns out that $\ell_2$-norm regularization can be applied to control overfitting. Its effect is explored under the spectral context. It is found that the two terms in the $\ell_2$-regularized objective are in one-to-one correspondence with the fitting error and the spectral risk, revealing a tradeoff between the two quantities. Concerning practical performance, we devise a basis selection strategy to address the main problem hindering the applications of (n)RBFN. With the strategy, nRBFN is easy to implement yet flexible. Experiments on 14 benchmark data sets show the performance of nRBFN is comparable to that of SVM, whereas the parameter tuning of nRBFN is much easier, leading to a reduction of model selection time.
Tasks	Model Selection
Published	2017-05-19
URL	http://arxiv.org/abs/1705.06922v2
PDF	http://arxiv.org/pdf/1705.06922v2.pdf
PWC	https://paperswithcode.com/paper/spectral-graph-based-classifications-linear
Repo
Framework

3D Shape Retrieval via Irrelevance Filtering and Similarity Ranking (IF/SR)


Title	3D Shape Retrieval via Irrelevance Filtering and Similarity Ranking (IF/SR)
Authors	Xiaqing Pan, Yueru Chen, C. -C. Jay Kuo
Abstract	A novel solution for the content-based 3D shape retrieval problem using an unsupervised clustering approach, which does not need any label information of 3D shapes, is presented in this work. The proposed shape retrieval system consists of two modules in cascade: the irrelevance filtering (IF) module and the similarity ranking (SR) module. The IF module attempts to cluster gallery shapes that are similar to each other by examining global and local features simultaneously. However, shapes that are close in the local feature space can be distant in the global feature space, and vice versa. To resolve this issue, we propose a joint cost function that strikes a balance between two distances. Irrelevant samples that are close in the local feature space but distant in the global feature space can be removed in this stage. The remaining gallery samples are ranked in the SR module using the local feature. The superior performance of the proposed IF/SR method is demonstrated by extensive experiments conducted on the popular SHREC12 dataset.
Tasks	3D Shape Retrieval
Published	2017-01-30
URL	http://arxiv.org/abs/1701.08869v1
PDF	http://arxiv.org/pdf/1701.08869v1.pdf
PWC	https://paperswithcode.com/paper/3d-shape-retrieval-via-irrelevance-filtering
Repo
Framework

Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks


Title	Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks
Authors	Jae Shin Yoon, Francois Rameau, Junsik Kim, Seokju Lee, Seunghak Shin, In So Kweon
Abstract	We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNN). Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity between two object units. The proposed network represents a target object using features from different depth layers in order to take advantage of both the spatial details and the category-level semantic information. Furthermore, we propose a feature compression technique that drastically reduces the memory requirements while maintaining the capability of feature representation. Two-stage training (pre-training and fine-tuning) allows our network to handle any target object regardless of its category (even if the object’s type does not belong to the pre-training data) or of variations in its appearance through a video sequence. Experiments on large datasets demonstrate the effectiveness of our model, compared with related methods, in terms of accuracy, speed, and stability. Finally, we introduce the transferability of our network to different domains, such as the infrared data domain.
Tasks	Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published	2017-08-17
URL	http://arxiv.org/abs/1708.05137v1
PDF	http://arxiv.org/pdf/1708.05137v1.pdf
PWC	https://paperswithcode.com/paper/pixel-level-matching-for-video-object
Repo
Framework

Cascaded Scene Flow Prediction using Semantic Segmentation


Title	Cascaded Scene Flow Prediction using Semantic Segmentation
Authors	Zhile Ren, Deqing Sun, Jan Kautz, Erik B. Sudderth
Abstract	Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance.
Tasks	Autonomous Driving, Optical Flow Estimation, Semantic Segmentation
Published	2017-07-26
URL	http://arxiv.org/abs/1707.08313v2
PDF	http://arxiv.org/pdf/1707.08313v2.pdf
PWC	https://paperswithcode.com/paper/cascaded-scene-flow-prediction-using-semantic
Repo
Framework

Salient Object Detection with Semantic Priors


Title	Salient Object Detection with Semantic Priors
Authors	Tam V. Nguyen, Luoqi Liu
Abstract	Salient object detection has increasingly become a popular topic in cognitive and computational sciences, including computer vision and artificial intelligence research. In this paper, we propose integrating semantic priors into the salient object detection process. Our algorithm consists of three basic steps. Firstly, the explicit saliency map is obtained based on the semantic segmentation refined by the explicit saliency priors learned from the data. Next, the implicit saliency map is computed based on a trained model which maps the implicit saliency priors embedded into regional features with the saliency values. Finally, the explicit semantic map and the implicit map are adaptively fused to form a pixel-accurate saliency map which uniformly covers the objects of interest. We further evaluate the proposed framework on two challenging datasets, namely, ECSSD and HKUIS. The extensive experimental results demonstrate that our method outperforms other state-of-the-art methods.
Tasks	Object Detection, Salient Object Detection, Semantic Segmentation
Published	2017-05-23
URL	http://arxiv.org/abs/1705.08207v1
PDF	http://arxiv.org/pdf/1705.08207v1.pdf
PWC	https://paperswithcode.com/paper/salient-object-detection-with-semantic-priors
Repo
Framework

Online and Stable Learning of Analysis Operators


Title	Online and Stable Learning of Analysis Operators
Authors	Michael Sandbichler, Karin Schnass
Abstract	In this paper four iterative algorithms for learning analysis operators are presented. They are built upon the same optimisation principle underlying both Analysis K-SVD and Analysis SimCO. The Forward and Sequential Analysis Operator Learning (FAOL and SAOL) algorithms are based on projected gradient descent with optimally chosen step size. The Implicit AOL (IAOL) algorithm is inspired by the implicit Euler scheme for solving ordinary differential equations and does not require choosing a step size. The fourth algorithm, Singular Value AOL (SVAOL), uses a strategy similar to that of Analysis K-SVD while avoiding its high computational cost. All algorithms are proven to decrease or preserve the target function in each step and a characterisation of their stationary points is provided. Further, they are tested on synthetic and image data, compared to Analysis SimCO and found to give better recovery rates and faster decay of the objective function respectively. In a final denoising experiment the presented algorithms are again shown to perform similarly to or better than the state-of-the-art algorithm ASimCO.
Tasks	Denoising
Published	2017-04-01
URL	http://arxiv.org/abs/1704.00227v2
PDF	http://arxiv.org/pdf/1704.00227v2.pdf
PWC	https://paperswithcode.com/paper/online-and-stable-learning-of-analysis
Repo
Framework

Pure Rough Mereology and Counting


Title	Pure Rough Mereology and Counting
Authors	A. Mani
Abstract	The study of mereology (parts and wholes) in the context of formal approaches to vagueness can be approached in a number of ways. In the context of rough sets, mereological concepts with a set-theoretic or valuation based ontology acquire complex and diverse behavior. In this research a general rough set framework called granular operator spaces is extended and the nature of parthood in it is explored from a minimally intrusive point of view. This is used to develop counting strategies that help in classifying the framework. The developed methodologies would be useful for drawing involved conclusions about the nature of data (and validity of assumptions about it) from antichains derived from context. The problem addressed is also about whether counting procedures help in confirming that the approximations involved in formation of data are indeed rough approximations.
Tasks
Published	2017-01-28
URL	http://arxiv.org/abs/1701.08301v1
PDF	http://arxiv.org/pdf/1701.08301v1.pdf
PWC	https://paperswithcode.com/paper/pure-rough-mereology-and-counting
Repo
Framework

Spatial Random Sampling: A Structure-Preserving Data Sketching Tool


Title	Spatial Random Sampling: A Structure-Preserving Data Sketching Tool
Authors	Mostafa Rahmani, George Atia
Abstract	Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data and may not sample sufficiently from less-populated data clusters. Also, adaptive sampling can often provide accurate low rank approximations, yet may fall short of producing descriptive data sketches, especially when the cluster centers are linearly dependent. Motivated by that, this paper introduces a novel randomized column sampling tool dubbed Spatial Random Sampling (SRS), in which data points are sampled based on their proximity to randomly sampled points on the unit sphere. The most compelling feature of SRS is that the corresponding probability of sampling from a given data cluster is proportional to the surface area the cluster occupies on the unit sphere, independently from the size of the cluster population. Although it is fully randomized, SRS is shown to provide descriptive and balanced data representations. The proposed idea addresses a pressing need in data science and holds potential to inspire many novel approaches for analysis of big data.
Tasks
Published	2017-05-09
URL	http://arxiv.org/abs/1705.03566v2
PDF	http://arxiv.org/pdf/1705.03566v2.pdf
PWC	https://paperswithcode.com/paper/spatial-random-sampling-a-structure
Repo
Framework
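
The sampling rule described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: data points are projected onto the unit sphere, random directions are drawn by normalising Gaussian vectors, and "proximity" is taken to mean the largest inner product with the random direction. The function names are hypothetical, all points are assumed to be non-zero, and the paper's exact rule may differ.

```python
import math
import random


def normalize(vector):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def spatial_random_sampling(points, num_samples, seed=0):
    """Return indices of points closest to random directions on the unit sphere."""
    rng = random.Random(seed)
    dim = len(points[0])
    unit_points = [normalize(p) for p in points]
    selected = []
    for _ in range(num_samples):
        # A normalised Gaussian vector is uniformly distributed on the sphere.
        direction = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        # On the unit sphere, nearest point == largest inner product.
        best = max(
            range(len(unit_points)),
            key=lambda i: sum(a * b for a, b in zip(unit_points[i], direction)),
        )
        selected.append(best)
    return selected


# Hypothetical data: a crowded cluster near (1, 0) and a sparse one near (0, 1).
data_rng = random.Random(1)
crowded = [[1.0 + data_rng.gauss(0, 0.05), data_rng.gauss(0, 0.05)] for _ in range(50)]
sparse = [[data_rng.gauss(0, 0.05), 1.0 + data_rng.gauss(0, 0.05)] for _ in range(2)]
points = crowded + sparse
indices = spatial_random_sampling(points, num_samples=10)
```

Because selection depends on where a cluster sits on the sphere rather than on how many points it holds, the sparse cluster is not crowded out simply by being small.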

Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues


Title	Optical Mapping Near-eye Three-dimensional Display with Correct Focus Cues
Authors	Wei Cui, Liang Gao
Abstract	We present an optical mapping near-eye (OMNI) three-dimensional display method for wearable devices. By dividing a display screen into different sub-panels and optically mapping them to various depths, we create a multiplane volumetric image with correct focus cues for depth perception. The resultant system can drive the eye’s accommodation to the distance that is consistent with binocular stereopsis, thereby alleviating the vergence-accommodation conflict, the primary cause for eye fatigue and discomfort. Compared with the previous methods, the OMNI display offers prominent advantages in adaptability, image dynamic range, and refresh rate.
Tasks
Published	2017-05-24
URL	http://arxiv.org/abs/1707.03685v1
PDF	http://arxiv.org/pdf/1707.03685v1.pdf
PWC	https://paperswithcode.com/paper/optical-mapping-near-eye-three-dimensional
Repo
Framework

Learning From Noisy Large-Scale Datasets With Minimal Supervision


Title	Learning From Noisy Large-Scale Datasets With Minimal Supervision
Authors	Andreas Veit, Neil Alldrin, Gal Chechik, Ivan Krasin, Abhinav Gupta, Serge Belongie
Abstract	We present an approach to effectively use millions of images with noisy annotations in conjunction with a small subset of cleanly-annotated images to learn powerful image representations. One common approach to combine clean and noisy data is to first pre-train a network using the large noisy dataset and then fine-tune with the clean dataset. We show this approach does not fully leverage the information contained in the clean set. Thus, we demonstrate how to use the clean annotations to reduce the noise in the large dataset before fine-tuning the network using both the clean set and the full set with reduced noise. The approach comprises a multi-task network that jointly learns to clean noisy annotations and to accurately classify images. We evaluate our approach on the recently released Open Images dataset, containing ~9 million images, multiple annotations per image and over 6000 unique classes. For the small clean set of annotations we use a quarter of the validation set with ~40k images. Our results demonstrate that the proposed approach clearly outperforms direct fine-tuning across all major categories of classes in the Open Images dataset. Further, our approach is particularly effective for a large number of classes with a wide range of noise in annotations (20-80% false positive annotations).
Tasks
Published	2017-01-06
URL	http://arxiv.org/abs/1701.01619v2
PDF	http://arxiv.org/pdf/1701.01619v2.pdf
PWC	https://paperswithcode.com/paper/learning-from-noisy-large-scale-datasets-with
Repo
Framework

Fast Amortized Inference and Learning in Log-linear Models with Randomly Perturbed Nearest Neighbor Search


Title	Fast Amortized Inference and Learning in Log-linear Models with Randomly Perturbed Nearest Neighbor Search
Authors	Stephen Mussmann, Daniel Levy, Stefano Ermon
Abstract	Inference in log-linear models scales linearly in the size of the output space in the worst case. This is often a bottleneck in natural language processing and computer vision tasks when the output space is feasibly enumerable but very large. We propose a method to perform inference in log-linear models with sublinear amortized cost. Our idea hinges on using Gumbel random variable perturbations and a pre-computed Maximum Inner Product Search data structure to access the most-likely elements in sublinear amortized time. Our method yields provable runtime and accuracy guarantees. Further, we present empirical experiments on ImageNet and Word Embeddings showing significant speedups for sampling, inference, and learning in log-linear models.
Tasks	Word Embeddings
Published	2017-07-11
URL	http://arxiv.org/abs/1707.03372v1
PDF	http://arxiv.org/pdf/1707.03372v1.pdf
PWC	https://paperswithcode.com/paper/fast-amortized-inference-and-learning-in-log
Repo
Framework
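
The Gumbel perturbation mentioned in the abstract above builds on the standard Gumbel-max trick: adding independent Gumbel(0, 1) noise to each unnormalised log-score and taking the argmax yields an exact sample from the corresponding softmax distribution. The sketch below shows only that trick, with an exhaustive argmax standing in for the Maximum Inner Product Search structure, so it is linear-time and does not reproduce the paper's sublinear method. The scores and function names are hypothetical.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one Gumbel(0, 1) sample via inverse transform sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(scores, rng):
    """Sample index i with probability softmax(scores)[i] by perturbing and maximising."""
    perturbed = [s + gumbel_noise(rng) for s in scores]
    return max(range(len(scores)), key=perturbed.__getitem__)


def softmax(scores):
    """Reference probabilities, computed stably by subtracting the max score."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
scores = [2.0, 1.0, 0.0]  # hypothetical unnormalised log-scores
num_draws = 20000
counts = [0] * len(scores)
for _ in range(num_draws):
    counts[gumbel_max_sample(scores, rng)] += 1

frequencies = [c / num_draws for c in counts]
probabilities = softmax(scores)
```

With enough draws the empirical frequencies should approach the softmax probabilities, which is a quick sanity check for the perturbation step.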