July 27, 2019

Paper Group ANR 662

First and Second Order Methods for Online Convolutional Dictionary Learning. Assessment Formats and Student Learning Performance: What is the Relation?. Relative Camera Pose Estimation Using Convolutional Neural Networks. The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems. Structural Attention Neural N …

First and Second Order Methods for Online Convolutional Dictionary Learning


Title	First and Second Order Methods for Online Convolutional Dictionary Learning
Authors	Jialin Liu, Cristina Garcia-Cardona, Brendt Wohlberg, Wotao Yin
Abstract	Convolutional sparse representations are a form of sparse representation with a structured, translation invariant dictionary. Most convolutional dictionary learning algorithms to date operate in batch mode, requiring simultaneous access to all training images during the learning process, which results in very high memory usage and severely limits the training data that can be used. Very recently, however, a number of authors have considered the design of online convolutional dictionary learning algorithms that offer far better scaling of memory and computational cost with training set size than batch methods. This paper extends our prior work, improving a number of aspects of our previous algorithm; proposing an entirely new one, with better performance, and that supports the inclusion of a spatial mask for learning from incomplete data; and providing a rigorous theoretical analysis of these methods.
Tasks	Dictionary Learning
Published	2017-08-31
URL	http://arxiv.org/abs/1709.00106v3
PDF	http://arxiv.org/pdf/1709.00106v3.pdf
PWC	https://paperswithcode.com/paper/first-and-second-order-methods-for-online
Repo
Framework
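
The synthesis model behind a convolutional sparse representation can be illustrated with a minimal 1-D sketch: a signal is approximated as a sum of dictionary filters, each convolved with its own sparse coefficient map. The function names (`convolve_full`, `synthesize`) and the toy filters are illustrative assumptions, not taken from the paper, and the sketch covers only the representation, not the online learning algorithms the paper proposes.

```python
def convolve_full(kernel, coefficients):
    """Full 1-D convolution of a filter with a coefficient map."""
    out = [0.0] * (len(kernel) + len(coefficients) - 1)
    for i, c in enumerate(coefficients):
        if c == 0.0:
            continue  # sparse maps: most entries contribute nothing
        for j, k in enumerate(kernel):
            out[i + j] += c * k
    return out


def synthesize(dictionary, coefficient_maps):
    """Sum over filters of (filter convolved with its coefficient map).

    Assumes all filters share one length and all maps share one length.
    """
    pieces = [convolve_full(d, x) for d, x in zip(dictionary, coefficient_maps)]
    return [sum(values) for values in zip(*pieces)]


# Hypothetical toy dictionary with two filters of length 3.
dictionary = [[1.0, 2.0, 1.0], [1.0, -1.0, 0.0]]
# One nonzero coefficient per map: filter 0 placed at position 1,
# filter 1 placed at position 3 with weight 2.
coefficient_maps = [[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0, 0.0]]

signal = synthesize(dictionary, coefficient_maps)
```

Moving a nonzero coefficient to a different position moves the corresponding copy of the filter in the output, which is the sense in which the dictionary is translation invariant.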

Assessment Formats and Student Learning Performance: What is the Relation?


Title	Assessment Formats and Student Learning Performance: What is the Relation?
Authors	Khondkar Islam, Pouyan Ahmadi, Salman Yousaf
Abstract	Although compelling assessments have been examined in recent years, more studies are required to yield a better understanding of the several ways in which assessment techniques significantly affect the student learning process. Most of the educational research in this area does not consider demographic data, differing methodologies, or a notable sample size. To address these drawbacks, the objective of our study is to analyse student learning outcomes of multiple assessment formats for a web-facilitated in-class section with an asynchronous online class of a core data communications course in the Undergraduate IT program of the Information Sciences and Technology (IST) Department at George Mason University (GMU). In this study, students were evaluated based on course assessments such as home and lab assignments, skill-based assessments, and traditional midterm and final exams across all four sections of the course. All sections have equivalent content, assessments, and teaching methodologies. Student demographics such as exam type and location preferences are considered in our study to determine whether they have any impact on their learning approach. A large amount of data from the learning management system (LMS), Blackboard (BB) Learn, had to be examined to compare the results of several assessment outcomes for all students within their respective section and amongst students of other sections. To investigate the effect of dissimilar assessment formats on student performance, we had to correlate individual question formats with the overall course grade. The results show that collective assessment formats allow students to be effective in demonstrating their knowledge.
Tasks
Published	2017-11-15
URL	http://arxiv.org/abs/1711.10396v1
PDF	http://arxiv.org/pdf/1711.10396v1.pdf
PWC	https://paperswithcode.com/paper/assessment-formats-and-student-learning
Repo
Framework

Relative Camera Pose Estimation Using Convolutional Neural Networks


Title	Relative Camera Pose Estimation Using Convolutional Neural Networks
Authors	Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu
Abstract	This paper presents a convolutional neural network based approach for estimating the relative pose between two cameras. The proposed network takes RGB images from both cameras as input and directly produces the relative rotation and translation as output. The system is trained in an end-to-end manner utilising transfer learning from a large scale classification dataset. The introduced approach is compared with widely used local feature based methods (SURF, ORB) and the results indicate a clear improvement over the baseline. In addition, a variant of the proposed architecture containing a spatial pyramid pooling (SPP) layer is evaluated and shown to further improve the performance.
Tasks	Pose Estimation, Transfer Learning
Published	2017-02-05
URL	http://arxiv.org/abs/1702.01381v3
PDF	http://arxiv.org/pdf/1702.01381v3.pdf
PWC	https://paperswithcode.com/paper/relative-camera-pose-estimation-using
Repo
Framework

The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems


Title	The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
Authors	Vivak Patel
Abstract	In several experimental reports on nonconvex optimization problems in machine learning, stochastic gradient descent (SGD) was observed to prefer minimizers with flat basins in comparison to more deterministic methods, yet there is very little rigorous understanding of this phenomenon. In fact, the lack of such work has led to an unverified, but widely-accepted stochastic mechanism describing why SGD prefers flatter minimizers to sharper minimizers. However, as we demonstrate, the stochastic mechanism fails to explain this phenomenon. Here, we propose an alternative deterministic mechanism that can accurately explain why SGD prefers flatter minimizers to sharper minimizers. We derive this mechanism based on a detailed analysis of a generic stochastic quadratic problem, which generalizes known results for classical gradient descent. Finally, we verify the predictions of our deterministic mechanism on two nonconvex problems.
Tasks
Published	2017-09-14
URL	http://arxiv.org/abs/1709.04718v2
PDF	http://arxiv.org/pdf/1709.04718v2.pdf
PWC	https://paperswithcode.com/paper/the-impact-of-local-geometry-and-batch-size
Repo
Framework
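
The abstract says its analysis generalizes known results for classical gradient descent. One such classical result is that, on a quadratic with curvature c, gradient descent with step size s contracts only when s < 2/c. The sketch below illustrates that deterministic fact on a 1-D quadratic; it is not the paper's stochastic analysis, and the function name and parameter values are illustrative assumptions.

```python
def gradient_descent_quadratic(curvature, step_size, x0=1.0, num_steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2; return final |x|."""
    x = x0
    for _ in range(num_steps):
        grad = curvature * x      # f'(x) = curvature * x
        x = x - step_size * grad  # equivalent to x *= (1 - step_size * curvature)
    return abs(x)


step_size = 0.5

# Flat basin: |1 - 0.5 * 1.0| = 0.5 < 1, so the iterates shrink toward 0.
flat_final = gradient_descent_quadratic(curvature=1.0, step_size=step_size)

# Sharp basin: |1 - 0.5 * 5.0| = 1.5 > 1, so the iterates grow and escape.
sharp_final = gradient_descent_quadratic(curvature=5.0, step_size=step_size)
```

With the same step size, the iterate settles in the low-curvature basin and is pushed out of the high-curvature one, which gives some intuition for a deterministic preference for flatter minimizers.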

Structural Attention Neural Networks for improved sentiment analysis


Title	Structural Attention Neural Networks for improved sentiment analysis
Authors	Filippos Kokkinos, Alexandros Potamianos
Abstract	We introduce a tree-structured attention neural network for sentences and small phrases and apply it to the problem of sentiment classification. Our model expands the current recursive models by incorporating structural information around a node of a syntactic tree using both bottom-up and top-down information propagation. Also, the model utilizes structural attention to identify the most salient representations during the construction of the syntactic tree. To our knowledge, the proposed models achieve state of the art performance on the Stanford Sentiment Treebank dataset.
Tasks	Sentiment Analysis
Published	2017-01-07
URL	http://arxiv.org/abs/1701.01811v1
PDF	http://arxiv.org/pdf/1701.01811v1.pdf
PWC	https://paperswithcode.com/paper/structural-attention-neural-networks-for
Repo
Framework

Traffic Surveillance Camera Calibration by 3D Model Bounding Box Alignment for Accurate Vehicle Speed Measurement


Title	Traffic Surveillance Camera Calibration by 3D Model Bounding Box Alignment for Accurate Vehicle Speed Measurement
Authors	Jakub Sochor, Roman Juránek, Adam Herout
Abstract	In this paper, we focus on fully automatic traffic surveillance camera calibration, which we use for speed measurement of passing vehicles. We improve over a recent state-of-the-art camera calibration method for traffic surveillance based on two detected vanishing points. More importantly, we propose a novel automatic scene scale inference method. The method is based on matching bounding boxes of rendered 3D models of vehicles with detected bounding boxes in the image. The proposed method can be used from arbitrary viewpoints, since it has no constraints on camera placement. We evaluate our method on the recent comprehensive dataset for speed measurement BrnoCompSpeed. Experiments show that our automatic camera calibration method by detection of two vanishing points reduces error by 50% (mean distance ratio error reduced from 0.18 to 0.09) compared to the previous state-of-the-art method. We also show that our scene scale inference method is more precise, outperforming both the state-of-the-art automatic calibration method for speed measurement (error reduction by 86% – 7.98 km/h to 1.10 km/h) and manual calibration (error reduction by 19% – 1.35 km/h to 1.10 km/h). We also present qualitative results of the proposed automatic camera calibration method on video sequences obtained from real surveillance cameras in various places, and under different lighting conditions (night, dawn, day).
Tasks	Calibration
Published	2017-02-21
URL	http://arxiv.org/abs/1702.06451v2
PDF	http://arxiv.org/pdf/1702.06451v2.pdf
PWC	https://paperswithcode.com/paper/traffic-surveillance-camera-calibration-by-3d
Repo
Framework
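
The percentages quoted in the abstract follow from the usual relative-reduction formula, (before - after) / before. The short check below recomputes them from the error values given in the abstract; the function name and dictionary keys are illustrative.

```python
def relative_reduction(before, after):
    """Relative error reduction in percent: 100 * (before - after) / before."""
    return 100.0 * (before - after) / before


reductions = {
    # Mean distance ratio error of the vanishing-point calibration.
    "calibration": relative_reduction(0.18, 0.09),
    # Speed error in km/h versus the previous automatic method.
    "speed_vs_automatic": relative_reduction(7.98, 1.10),
    # Speed error in km/h versus manual calibration.
    "speed_vs_manual": relative_reduction(1.35, 1.10),
}

# Rounded to whole percent, as reported in the abstract.
rounded = {name: round(value) for name, value in reductions.items()}
```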

Variational autoencoders for tissue heterogeneity exploration from (almost) no preprocessed mass spectrometry imaging data


Title	Variational autoencoders for tissue heterogeneity exploration from (almost) no preprocessed mass spectrometry imaging data
Authors	Paolo Inglese, James L. Alexander, Anna Mroz, Zoltan Takats, Robert Glen
Abstract	The paper presents the application of Variational Autoencoders (VAE) for data dimensionality reduction and explorative analysis of mass spectrometry imaging data (MSI). The results confirm that VAEs are capable of detecting the patterns associated with the different tissue sub-types with better performance than standard approaches.
Tasks	Dimensionality Reduction
Published	2017-08-23
URL	http://arxiv.org/abs/1708.07012v2
PDF	http://arxiv.org/pdf/1708.07012v2.pdf
PWC	https://paperswithcode.com/paper/variational-autoencoders-for-tissue
Repo
Framework

ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information


Title	ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information
Authors	Rodney LaLonde, Dong Zhang, Mubarak Shah
Abstract	Object detection in wide area motion imagery (WAMI) has drawn the attention of the computer vision research community for a number of years. WAMI poses a number of unique challenges including extremely small object sizes, both sparse and densely-packed objects, and extremely large search spaces (large video frames). Nearly all state-of-the-art methods in WAMI object detection report that appearance-based classifiers fail on this challenging data and instead rely almost entirely on motion information in the form of background subtraction or frame-differencing. In this work, we experimentally verify the failure of appearance-based classifiers in WAMI, such as Faster R-CNN and a heatmap-based fully convolutional neural network (CNN), and propose a novel two-stage spatio-temporal CNN which effectively and efficiently combines both appearance and motion information to significantly surpass the state-of-the-art in WAMI object detection. To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI). These ROOBI can contain from one to clusters of several hundred objects due to the large video frame size and varying object density in WAMI. The second stage (FoveaNet) then estimates the centroid location of all objects in that given ROOBI simultaneously via heatmap estimation. The proposed method exceeds state-of-the-art results on the WPAFB 2009 dataset by 5-16% for moving objects and nearly 50% for stopped objects, as well as being the first proposed method in wide area motion imagery to detect completely stationary objects.
Tasks	Object Detection
Published	2017-04-10
URL	http://arxiv.org/abs/1704.02694v2
PDF	http://arxiv.org/pdf/1704.02694v2.pdf
PWC	https://paperswithcode.com/paper/clusternet-detecting-small-objects-in-large
Repo
Framework

Multi-Branch Fully Convolutional Network for Face Detection


Title	Multi-Branch Fully Convolutional Network for Face Detection
Authors	Yancheng Bai, Bernard Ghanem
Abstract	Face detection is a fundamental problem in computer vision. It is still a challenging task in unconstrained conditions due to significant variations in scale, pose, expressions, and occlusion. In this paper, we propose a multi-branch fully convolutional network (MB-FCN) for face detection, which considers both efficiency and effectiveness in the design process. Our MB-FCN detector can deal with faces at all scale ranges with only a single pass through the backbone network. As such, our MB-FCN model saves computation and thus is more efficient, compared to previous methods that make multiple passes. For each branch, the specific skip connections of the convolutional feature maps at different layers are exploited to represent faces in specific scale ranges. Specifically, small faces can be represented with both shallow fine-grained and deep powerful coarse features. With this representation, superior improvement in performance is registered for the task of detecting small faces. We test our MB-FCN detector on two public face detection benchmarks, including FDDB and WIDER FACE. Extensive experiments show that our detector outperforms state-of-the-art methods on all these datasets in general and by a substantial margin on the most challenging among them (e.g. WIDER FACE Hard subset). Also, MB-FCN runs at 15 FPS on a GPU for images of size 640 x 480 with no assumption on the minimum detectable face size.
Tasks	Face Detection
Published	2017-07-20
URL	http://arxiv.org/abs/1707.06330v1
PDF	http://arxiv.org/pdf/1707.06330v1.pdf
PWC	https://paperswithcode.com/paper/multi-branch-fully-convolutional-network-for
Repo
Framework

A Minimax Algorithm Better Than Alpha-beta?: No and Yes


Title	A Minimax Algorithm Better Than Alpha-beta?: No and Yes
Authors	Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
Abstract	This paper has three main contributions to our understanding of fixed-depth minimax search: (A) A new formulation for Stockman’s SSS* algorithm, based on Alpha-Beta, is presented. It solves all the perceived drawbacks of SSS*, finally transforming it into a practical algorithm. In effect, we show that SSS* = alpha-beta + transposition tables. The crucial step is the realization that transposition tables contain so-called solution trees, structures that are used in best-first search algorithms like SSS*. Having created a practical version, we present performance measurements with tournament game-playing programs for three different minimax games, yielding results that contradict a number of publications. (B) Based on the insights gained in our attempts at understanding SSS*, we present a framework that facilitates the construction of several best-first fixed-depth game-tree search algorithms, known and new. The framework is based on depth-first null-window Alpha-Beta search, enhanced with storage to allow for the refining of previous search results. It focuses attention on the essential differences between algorithms. (C) We present a new instance of the framework, MTD(f). It is well-suited for use with iterative deepening, and performs better than algorithms that are currently used in most state-of-the-art game-playing programs. We provide experimental evidence to explain why MTD(f) performs better than the other fixed-depth minimax algorithms.
Tasks
Published	2017-02-11
URL	http://arxiv.org/abs/1702.03401v1
PDF	http://arxiv.org/pdf/1702.03401v1.pdf
PWC	https://paperswithcode.com/paper/a-minimax-algorithm-better-than-alpha-beta-no
Repo
Framework
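
The framework the abstract describes (repeated null-window Alpha-Beta calls, with storage so that earlier results can be refined) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the game tree is a hypothetical nested list with integer leaf values, the root is a maximizing node, the dictionary `table` keyed by the path to each node stands in for a transposition table, and the depth limit is implicit in the tree.

```python
import math


def alpha_beta_with_memory(node, alpha, beta, maximizing, table, path=()):
    """Fail-soft alpha-beta that caches (lower, upper) value bounds per node."""
    lower, upper = table.get(path, (-math.inf, math.inf))
    if lower >= beta:
        return lower  # cached lower bound already causes a cutoff
    if upper <= alpha:
        return upper  # cached upper bound already fails low
    alpha, beta = max(alpha, lower), min(beta, upper)

    if isinstance(node, int):  # leaf: static evaluation
        value = node
    elif maximizing:
        value, a = -math.inf, alpha
        for i, child in enumerate(node):
            if value >= beta:
                break
            value = max(value, alpha_beta_with_memory(
                child, a, beta, False, table, path + (i,)))
            a = max(a, value)
    else:
        value, b = math.inf, beta
        for i, child in enumerate(node):
            if value <= alpha:
                break
            value = min(value, alpha_beta_with_memory(
                child, alpha, b, True, table, path + (i,)))
            b = min(b, value)

    if value <= alpha:
        table[path] = (lower, value)   # fail low: value is an upper bound
    elif value >= beta:
        table[path] = (value, upper)   # fail high: value is a lower bound
    else:
        table[path] = (value, value)   # exact value
    return value


def mtdf(root, first_guess):
    """Converge on the minimax value with a sequence of null-window searches."""
    table = {}
    g, lower, upper = first_guess, -math.inf, math.inf
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alpha_beta_with_memory(root, beta - 1, beta, True, table)
        if g < beta:
            upper = g
        else:
            lower = g
    return g


# Hypothetical two-ply tree: the root maximizes over three minimizing nodes.
tree = [[3, 5], [2, 9], [0, 7]]
value = mtdf(tree, first_guess=0)
```

The null window `(beta - 1, beta)` relies on integer evaluations. In an iterative-deepening setting, the value from the previous iteration would be a natural choice for `first_guess`.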

High-Resolution Multispectral Dataset for Semantic Segmentation


Title	High-Resolution Multispectral Dataset for Semantic Segmentation
Authors	Ronald Kemker, Carl Salvaggio, Christopher Kanan
Abstract	Unmanned aircraft have decreased the cost required to collect remote sensing imagery, which has enabled researchers to collect high-spatial resolution data from multiple sensor modalities more frequently and easily. The increase in data will push the need for semantic segmentation frameworks that are able to classify non-RGB imagery, but this type of algorithmic development requires an increase in publicly available benchmark datasets with class labels. In this paper, we introduce a high-resolution multispectral dataset with image labels. This new benchmark dataset has been pre-split into training/testing folds in order to standardize evaluation and continue to push state-of-the-art classification frameworks for non-RGB imagery.
Tasks	Semantic Segmentation
Published	2017-03-06
URL	http://arxiv.org/abs/1703.01918v1
PDF	http://arxiv.org/pdf/1703.01918v1.pdf
PWC	https://paperswithcode.com/paper/high-resolution-multispectral-dataset-for
Repo
Framework

Machine Translation Approaches and Survey for Indian Languages


Title	Machine Translation Approaches and Survey for Indian Languages
Authors	Nadeem Jadoon Khan, Waqas Anwar, Nadir Durrani
Abstract	In this study, we present an analysis regarding the performance of the state-of-the-art Phrase-based Statistical Machine Translation (SMT) on multiple Indian languages. We report baseline systems on several language pairs. The motivation of this study is to promote the development of SMT and linguistic resources for these language pairs, as the current state-of-the-art is quite bleak due to sparse data resources. The success of an SMT system is contingent on the availability of a large parallel corpus. Such data is necessary to reliably estimate translation probabilities. We report the performance of baseline systems translating from Indian languages (Bengali, Gujarati, Hindi, Malayalam, Punjabi, Tamil, Telugu and Urdu) into English with, on average, 10% accurate results for all the language pairs.
Tasks	Machine Translation
Published	2017-01-16
URL	http://arxiv.org/abs/1701.04290v1
PDF	http://arxiv.org/pdf/1701.04290v1.pdf
PWC	https://paperswithcode.com/paper/machine-translation-approaches-and-survey-for
Repo
Framework

Sunrise or Sunset: Selective Comparison Learning for Subtle Attribute Recognition


Title	Sunrise or Sunset: Selective Comparison Learning for Subtle Attribute Recognition
Authors	Hong-Yu Zhou, Bin-Bin Gao, Jianxin Wu
Abstract	The difficulty of image recognition has gradually increased from general category recognition to fine-grained recognition and to the recognition of some subtle attributes such as temperature and geolocation. In this paper, we try to focus on the classification between sunrise and sunset and hope to give a hint about how to tell the difference in subtle attributes. Sunrise vs. sunset is a difficult recognition task, which is challenging even for humans. Towards understanding this new problem, we first collect a new dataset made up of over one hundred webcams from different places. Since existing algorithmic methods have poor accuracy, we propose a new pairwise learning strategy to learn features from selective pairs of images. Experiments show that our approach surpasses baseline methods by a large margin and achieves better results even compared with humans. We also apply our approach to existing subtle attribute recognition problems, such as temperature estimation, and achieve state-of-the-art results.
Tasks
Published	2017-07-20
URL	http://arxiv.org/abs/1707.06335v1
PDF	http://arxiv.org/pdf/1707.06335v1.pdf
PWC	https://paperswithcode.com/paper/sunrise-or-sunset-selective-comparison
Repo
Framework

Frequentist Consistency of Variational Bayes


Title	Frequentist Consistency of Variational Bayes
Authors	Yixin Wang, David M. Blei
Abstract	A key challenge for modern Bayesian statistics is how to perform scalable inference of posterior distributions. To address this challenge, variational Bayes (VB) methods have emerged as a popular alternative to the classical Markov chain Monte Carlo (MCMC) methods. VB methods tend to be faster while achieving comparable predictive performance. However, there are few theoretical results around VB. In this paper, we establish frequentist consistency and asymptotic normality of VB methods. Specifically, we connect VB methods to point estimates based on variational approximations, called frequentist variational approximations, and we use the connection to prove a variational Bernstein-von Mises theorem. The theorem leverages the theoretical characterizations of frequentist variational approximations to understand asymptotic properties of VB. In summary, we prove that (1) the VB posterior converges to the Kullback-Leibler (KL) minimizer of a normal distribution, centered at the truth and (2) the corresponding variational expectation of the parameter is consistent and asymptotically normal. As applications of the theorem, we derive asymptotic properties of VB posteriors in Bayesian mixture models, Bayesian generalized linear mixed models, and Bayesian stochastic block models. We conduct a simulation study to illustrate these theoretical results.
Tasks
Published	2017-05-09
URL	http://arxiv.org/abs/1705.03439v2
PDF	http://arxiv.org/pdf/1705.03439v2.pdf
PWC	https://paperswithcode.com/paper/frequentist-consistency-of-variational-bayes
Repo
Framework

tHoops: A Multi-Aspect Analytical Framework Spatio-Temporal Basketball Data


Title	tHoops: A Multi-Aspect Analytical Framework Spatio-Temporal Basketball Data
Authors	Evangelos Papalexakis, Konstantinos Pelechrinis
Abstract	During the past few years, advancements in sports information systems and technology have allowed us to collect detailed spatio-temporal data capturing various aspects of basketball. For example, shot charts, that is, maps capturing locations of (made or missed) shots, and spatio-temporal trajectories for all the players on the court can capture information about the offensive and defensive tendencies and schemes of a team. Characterization of these processes is important for player and team comparisons, pre-game scouting, game preparation etc. Playing tendencies among teams have traditionally been compared in a heuristic manner. Recently, automated ways for similar comparisons have appeared in the sports analytics literature. However, these approaches are almost exclusively focused on the spatial distribution of the underlying actions (usually shots taken), ignoring a multitude of other parameters that can affect the action studied. In this work, we propose a framework based on tensor decomposition for obtaining a set of prototype spatio-temporal patterns based on the core spatio-temporal information and contextual meta-data. The core of our framework is a 3D tensor X, whose dimensions represent the entity under consideration (team, player, possession etc.), the location on the court and time. We make use of the PARAFAC decomposition and we decompose the tensor into several interpretable patterns that can be thought of as prototype patterns of the process examined (e.g., shot selection, offensive schemes etc.). We also introduce an approach for choosing the number of components to be considered. Using the tensor components, we can then express every entity as a weighted combination of these components. The framework introduced in this paper can have further applications in the work-flow of the basketball operations of a franchise, which we also briefly discuss.
Tasks
Published	2017-12-04
URL	http://arxiv.org/abs/1712.01199v5
PDF	http://arxiv.org/pdf/1712.01199v5.pdf
PWC	https://paperswithcode.com/paper/thoops-a-multi-aspect-analytical-framework
Repo
Framework
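
The PARAFAC model mentioned in the abstract writes a 3-way tensor as a sum of rank-one components, X[i][j][k] ≈ Σ_r A[i][r] · B[j][r] · C[k][r]. The sketch below only evaluates that model for given factor matrices, to show how one entity's slice is a weighted combination of (location × time) prototype patterns; it does not fit the factors, and the names and toy numbers are illustrative assumptions rather than anything from the paper.

```python
def cp_reconstruct(entity_factors, location_factors, time_factors):
    """Build X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    rank = len(entity_factors[0])
    return [
        [
            [sum(a[r] * b[r] * c[r] for r in range(rank)) for c in time_factors]
            for b in location_factors
        ]
        for a in entity_factors
    ]


def prototype_pattern(location_factors, time_factors, r):
    """Component r as a (location x time) pattern: B[j][r] * C[k][r]."""
    return [[b[r] * c[r] for c in time_factors] for b in location_factors]


# Hypothetical rank-2 factors: 2 entities, 2 court locations, 2 time bins.
A = [[1.0, 0.0], [0.5, 2.0]]   # entity weights on each component
B = [[1.0, 0.0], [0.0, 1.0]]   # location loadings
C = [[1.0, 1.0], [2.0, 0.0]]   # time loadings

X = cp_reconstruct(A, B, C)
patterns = [prototype_pattern(B, C, r) for r in range(2)]

# Entity 1 expressed as a weighted combination of the prototype patterns,
# using its row of A as the weights.
entity_1 = [
    [sum(A[1][r] * patterns[r][j][k] for r in range(2)) for k in range(2)]
    for j in range(2)
]
```

Here the row `A[1]` plays the role of the per-entity weights, so two entities can be compared directly through their rows of `A`.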