Paper Group ANR 236
Learnable Visual Markers
Title | Learnable Visual Markers |
Authors | Oleg Grinchuk, Vadim Lebedev, Victor Lempitsky |
Abstract | We propose a new approach to designing visual markers (analogous to QR-codes, markers for augmented reality, and robotic fiducial tags) based on the advances in deep generative networks. In our approach, the markers are obtained as color images synthesized by a deep network from input bit strings, whereas another deep network is trained to recover the bit strings back from the photos of these markers. The two networks are trained simultaneously in a joint backpropagation process that takes characteristic photometric and geometric distortions associated with marker fabrication and marker scanning into account. Additionally, a stylization loss based on statistics of activations in a pretrained classification network can be inserted into the learning in order to shift the marker appearance towards some texture prototype. In the experiments, we demonstrate that the markers obtained using our approach are capable of retaining bit strings that are long enough to be practical. The ability to automatically adapt markers according to the usage scenario and the desired capacity as well as the ability to combine information encoding with artistic stylization are the unique properties of our approach. As a byproduct, our approach provides an insight on the structure of patterns that are most suitable for recognition by ConvNets and on their ability to distinguish composite patterns. |
Tasks | |
Published | 2016-10-28 |
URL | http://arxiv.org/abs/1610.09237v1 |
http://arxiv.org/pdf/1610.09237v1.pdf | |
PWC | https://paperswithcode.com/paper/learnable-visual-markers |
Repo | |
Framework | |
Symmetric Non-Rigid Structure from Motion for Category-Specific Object Structure Estimation
Title | Symmetric Non-Rigid Structure from Motion for Category-Specific Object Structure Estimation |
Authors | Yuan Gao, Alan Yuille |
Abstract | Many objects, especially those made by humans, are symmetric, e.g. cars and aeroplanes. This paper addresses the estimation of 3D structures of symmetric objects from multiple images of the same object category, e.g. different cars, seen from various viewpoints. We assume that the deformation between different instances from the same object category is non-rigid and symmetric. In this paper, we extend two leading non-rigid structure from motion (SfM) algorithms to exploit symmetry constraints. We model both methods as energy minimization, in which we also recover the missing observations caused by occlusions. In particular, we show that by rotating the coordinate system, the energy can be decoupled into two independent terms, which still exploit symmetry, so that matrix factorization can be applied separately to each of them for initialization. The results on the Pascal3D+ dataset show that our methods significantly improve performance over baseline methods. |
Tasks | |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.06988v1 |
http://arxiv.org/pdf/1609.06988v1.pdf | |
PWC | https://paperswithcode.com/paper/symmetric-non-rigid-structure-from-motion-for |
Repo | |
Framework | |
Multi-task CNN Model for Attribute Prediction
Title | Multi-task CNN Model for Attribute Prediction |
Authors | Abrar H. Abdulnabi, Gang Wang, Jiwen Lu, Kui Jia |
Abstract | This paper proposes a joint multi-task learning algorithm to better predict attributes in images using deep convolutional neural networks (CNN). We consider learning binary semantic attributes through a multi-task CNN model, where each CNN will predict one binary attribute. The multi-task learning allows CNN models to simultaneously share visual knowledge among different attribute categories. Each CNN will generate attribute-specific feature representations, and then we apply multi-task learning on the features to predict their attributes. In our multi-task framework, we propose a method to decompose the overall model’s parameters into a latent task matrix and combination matrix. Furthermore, under-sampled classifiers can leverage shared statistics from other classifiers to improve their performance. Natural grouping of attributes is applied such that attributes in the same group are encouraged to share more knowledge. Meanwhile, attributes in different groups will generally compete with each other, and consequently share less knowledge. We show the effectiveness of our method on two popular attribute datasets. |
Tasks | Multi-Task Learning |
Published | 2016-01-04 |
URL | http://arxiv.org/abs/1601.00400v1 |
http://arxiv.org/pdf/1601.00400v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-cnn-model-for-attribute-prediction |
Repo | |
Framework | |
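The parameter decomposition mentioned in the Multi-task CNN abstract above can be illustrated with a minimal sketch. It assumes the overall weight matrix W (one column per attribute) is the product of a latent task matrix L and a combination matrix S, so that each attribute classifier is a linear combination of shared latent tasks. The function names, shapes, and toy numbers are illustrative assumptions, not taken from the paper.

```python
def compose_classifiers(latent, combo):
    """Return W = L * S as a d x T list of lists.

    latent: d x k latent task matrix L (assumed shape).
    combo:  k x T combination matrix S (assumed shape), one column per attribute.
    """
    d, k, t = len(latent), len(combo), len(combo[0])
    return [[sum(latent[i][j] * combo[j][c] for j in range(k)) for c in range(t)]
            for i in range(d)]


def predict_attribute(features, weights, task):
    """Binary prediction for one attribute: sign of the linear score."""
    score = sum(f * row[task] for f, row in zip(features, weights))
    return score > 0


# Toy example: 3 feature dimensions, 2 latent tasks, 3 attributes.
latent = [[1, 0], [0, 1], [1, 1]]
combo = [[1, 0, 1], [0, 1, 1]]
weights = compose_classifiers(latent, combo)
```

In this toy setup the third attribute draws on both latent tasks, which is one way to picture how an under-sampled classifier could borrow statistics from the others.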
RISAS: A Novel Rotation, Illumination, Scale Invariant Appearance and Shape Feature
Title | RISAS: A Novel Rotation, Illumination, Scale Invariant Appearance and Shape Feature |
Authors | Kanzhi Wu, Xiaoyang Li, Ravindra Ranasinghe, Gamini Dissanayake, Yong Liu |
Abstract | This paper presents a novel appearance and shape feature, RISAS, which is robust to viewpoint, illumination, scale and rotation variations. RISAS consists of a keypoint detector and a feature descriptor, both of which utilise texture and geometric information present in the appearance and shape channels. A novel response function based on the surface normals is used in combination with the Harris corner detector for selecting keypoints in the scene. A strategy that uses the depth information for scale estimation and background elimination is proposed to select the neighbourhood around the keypoints in order to build precise invariant descriptors. The proposed descriptor relies on the ordering of both grayscale intensity and shape information in the neighbourhood. Comprehensive experiments that confirm the effectiveness of the proposed RGB-D feature when compared with CSHOT and LOIND are presented. Furthermore, we highlight the utility of incorporating texture and shape information in the design of both the detector and the descriptor by demonstrating the enhanced performance of CSHOT and LOIND when combined with the RISAS detector. |
Tasks | |
Published | 2016-03-14 |
URL | http://arxiv.org/abs/1603.04134v2 |
http://arxiv.org/pdf/1603.04134v2.pdf | |
PWC | https://paperswithcode.com/paper/risas-a-novel-rotation-illumination-scale |
Repo | |
Framework | |
Estimating Depth from Monocular Images as Classification Using Deep Fully Convolutional Residual Networks
Title | Estimating Depth from Monocular Images as Classification Using Deep Fully Convolutional Residual Networks |
Authors | Yuanzhouhan Cao, Zifeng Wu, Chunhua Shen |
Abstract | Depth estimation from single monocular images is a key component of scene understanding and has benefited largely from deep convolutional neural networks (CNN) recently. In this article, we take advantage of the recent deep residual networks and propose a simple yet effective approach to this problem. We formulate depth estimation as a pixel-wise classification task. Specifically, we first discretize the continuous depth values into multiple bins and label the bins according to their depth range. Then we train fully convolutional deep residual networks to predict the depth label of each pixel. Performing discrete depth label classification instead of continuous depth value regression allows us to predict a confidence in the form of probability distribution. We further apply fully-connected conditional random fields (CRF) as a post processing step to enforce local smoothness interactions, which improves the results. We evaluate our approach on both indoor and outdoor datasets and achieve state-of-the-art performance. |
Tasks | Depth Estimation, Scene Understanding |
Published | 2016-05-08 |
URL | http://arxiv.org/abs/1605.02305v3 |
http://arxiv.org/pdf/1605.02305v3.pdf | |
PWC | https://paperswithcode.com/paper/estimating-depth-from-monocular-images-as |
Repo | |
Framework | |
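The depth-as-classification formulation in the abstract above can be sketched as follows. This is a minimal illustration that assumes uniform-width bins over a fixed depth range; the abstract does not say how the bins are spaced, and the function names and the softmax decoding step are assumptions made for this example only.

```python
import math


def depth_to_label(depth, d_min, d_max, num_bins):
    """Map a continuous depth to a bin index in [0, num_bins - 1]."""
    depth = min(max(depth, d_min), d_max)  # clamp to the valid range
    width = (d_max - d_min) / num_bins
    return min(int((depth - d_min) / width), num_bins - 1)


def bin_center(label, d_min, d_max, num_bins):
    """Representative depth value for a bin label."""
    width = (d_max - d_min) / num_bins
    return d_min + (label + 0.5) * width


def softmax(scores):
    """Numerically stable softmax over per-bin scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_pixel(scores, d_min, d_max):
    """Turn per-bin scores for one pixel into (depth estimate, confidence)."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return bin_center(label, d_min, d_max, len(probs)), probs[label]
```

Because the output is a distribution over bins rather than a single regressed value, the probability of the chosen bin can serve as a per-pixel confidence, which is the property the abstract highlights.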
Multi-Object Tracking and Identification over Sets
Title | Multi-Object Tracking and Identification over Sets |
Authors | Aijun Bai |
Abstract | The ability of an autonomous agent or robot to track and identify potentially multiple objects in a dynamic environment is essential for many applications, such as automated surveillance, traffic monitoring, human-robot interaction, etc. The main challenge is the noisy and incomplete perception, including inevitable false negative and false positive errors from a low-level detector. In this paper, we propose a novel multi-object tracking and identification over sets approach to address this challenge. We define joint states and observations both as finite sets, and develop motion and observation functions accordingly. The object identification problem is then formulated and solved by using expectation-maximization methods. The set formulation enables us to avoid directly performing observation-to-object association. We empirically confirm that the overall algorithm outperforms the state of the art on a popular PETS dataset. |
Tasks | Multi-Object Tracking, Object Tracking |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07960v1 |
http://arxiv.org/pdf/1605.07960v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-object-tracking-and-identification-over |
Repo | |
Framework | |
Generalization Error Bounds for Optimization Algorithms via Stability
Title | Generalization Error Bounds for Optimization Algorithms via Stability |
Authors | Qi Meng, Yue Wang, Wei Chen, Taifeng Wang, Zhi-Ming Ma, Tie-Yan Liu |
Abstract | Many machine learning tasks can be formulated as Regularized Empirical Risk Minimization (R-ERM), and solved by optimization algorithms such as gradient descent (GD), stochastic gradient descent (SGD), and stochastic variance reduction (SVRG). Conventional analysis of these optimization algorithms focuses on their convergence rates during the training process; however, people in the machine learning community may care more about the generalization performance of the learned model on unseen test data. In this paper, we investigate this issue using stability as a tool. In particular, we decompose the generalization error for R-ERM, and derive its upper bound for both convex and non-convex cases. In convex cases, we prove that the generalization error can be bounded by the convergence rate of the optimization algorithm and the stability of the R-ERM process, both in expectation (in the order of $\mathcal{O}((1/n)+\mathbb{E}\rho(T))$, where $\rho(T)$ is the convergence error and $T$ is the number of iterations) and in high probability (in the order of $\mathcal{O}\left(\frac{\log(1/\delta)}{\sqrt{n}}+\rho(T)\right)$ with probability $1-\delta$). For non-convex cases, we can also obtain a similar expected generalization error bound. Our theorems indicate that 1) along with the training process, the generalization error will decrease for all the optimization algorithms under our investigation; 2) comparatively speaking, SVRG has better generalization ability than GD and SGD. We have conducted experiments on both convex and non-convex problems, and the experimental results verify our theoretical findings. |
Tasks | |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08397v1 |
http://arxiv.org/pdf/1609.08397v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-error-bounds-for-optimization |
Repo | |
Framework | |
Pareto Optimality and Strategy Proofness in Group Argument Evaluation (Extended Version)
Title | Pareto Optimality and Strategy Proofness in Group Argument Evaluation (Extended Version) |
Authors | Edmond Awad, Martin Caminada, Gabriella Pigozzi, Mikołaj Podlaszewski, Iyad Rahwan |
Abstract | An inconsistent knowledge base can be abstracted as a set of arguments and a defeat relation among them. There can be more than one consistent way to evaluate such an argumentation graph. Collective argument evaluation is the problem of aggregating the opinions of multiple agents on how a given set of arguments should be evaluated. It is crucial to ensure not only that the outcome is logically consistent, but also that it satisfies measures of social optimality and immunity to strategic manipulation. This is because agents have their individual preferences about what the outcome ought to be. In the current paper, we analyze three previously introduced argument-based aggregation operators with respect to Pareto optimality and strategy proofness under different general classes of agent preferences. We highlight fundamental trade-offs between strategic manipulability and social optimality on one hand, and classical logical criteria on the other. Our results motivate further investigation into the relationship between social choice and argumentation theory. The results are also relevant for choosing an appropriate aggregation operator given the criteria that are considered more important, as well as the nature of agents’ preferences. |
Tasks | |
Published | 2016-04-03 |
URL | http://arxiv.org/abs/1604.00693v2 |
http://arxiv.org/pdf/1604.00693v2.pdf | |
PWC | https://paperswithcode.com/paper/pareto-optimality-and-strategy-proofness-in |
Repo | |
Framework | |
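The Pareto optimality criterion referred to in the abstract above can be made concrete with a generic sketch. It assumes each agent's preference over outcomes is summarised by a numeric utility, which is a simplification chosen for illustration; the paper itself works with classes of agent preferences over argument evaluations, and the names below are hypothetical.

```python
def dominates(a, b):
    """True if utility profile a Pareto-dominates profile b.

    a is at least as good as b for every agent and strictly better for one.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_optimal(profiles):
    """Return the outcomes that no other outcome Pareto-dominates.

    profiles: dict mapping an outcome name to a tuple of per-agent utilities.
    """
    return [outcome for outcome, utils in profiles.items()
            if not any(dominates(other, utils) for other in profiles.values())]


# Toy example with two agents and three candidate collective outcomes.
profiles = {"A": (3, 1), "B": (1, 3), "C": (1, 1)}
optimal = pareto_optimal(profiles)
```

Here outcome C is dominated by both A and B, while A and B are each Pareto optimal because improving one agent's utility would lower the other's.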
All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles
Title | All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles |
Authors | Peter Radecki, Mark Campbell, Kevin Matzen |
Abstract | A novel probabilistic perception algorithm is presented as a real-time joint solution to data association, object tracking, and object classification for an autonomous ground vehicle in all-weather conditions. The presented algorithm extends a Rao-Blackwellized Particle Filter originally built with a particle filter for data association and a Kalman filter for multi-object tracking (Miller et al. 2011a) to now also include multiple model tracking for classification. Additionally a state-of-the-art vision detection algorithm that includes heading information for autonomous ground vehicle (AGV) applications was implemented. Cornell’s AGV from the DARPA Urban Challenge was upgraded and used to experimentally examine if and how state-of-the-art vision algorithms can complement or replace lidar and radar sensors. Sensor and algorithm performance in adverse weather and lighting conditions is tested. Experimental evaluation demonstrates robust all-weather data association, tracking, and classification where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm. |
Tasks | Multi-Object Tracking, Object Classification, Object Tracking |
Published | 2016-05-07 |
URL | http://arxiv.org/abs/1605.02196v1 |
http://arxiv.org/pdf/1605.02196v1.pdf | |
PWC | https://paperswithcode.com/paper/all-weather-perception-joint-data-association |
Repo | |
Framework | |
SlangSD: Building and Using a Sentiment Dictionary of Slang Words for Short-Text Sentiment Classification
Title | SlangSD: Building and Using a Sentiment Dictionary of Slang Words for Short-Text Sentiment Classification |
Authors | Liang Wu, Fred Morstatter, Huan Liu |
Abstract | Sentiment in social media is increasingly considered as an important resource for customer segmentation, market understanding, and tackling other socio-economic issues. However, sentiment in social media is difficult to measure since user-generated content is usually short and informal. Although many traditional sentiment analysis methods have been proposed, identifying slang sentiment words remains untackled. One of the reasons is that slang sentiment words are not available in existing dictionaries or sentiment lexicons. To this end, we propose to build the first sentiment dictionary of slang words to aid sentiment analysis of social media content. It is laborious and time-consuming to collect and label the sentiment polarity of a comprehensive list of slang words. We present an approach to leverage web resources to construct an extensive Slang Sentiment word Dictionary (SlangSD) that is easy to maintain and extend. SlangSD is publicly available for research purposes. We empirically show the advantages of using SlangSD, the newly-built slang sentiment word dictionary for sentiment classification, and provide examples demonstrating its ease of use with an existing sentiment system. |
Tasks | Sentiment Analysis |
Published | 2016-08-17 |
URL | http://arxiv.org/abs/1608.05129v1 |
http://arxiv.org/pdf/1608.05129v1.pdf | |
PWC | https://paperswithcode.com/paper/slangsd-building-and-using-a-sentiment |
Repo | |
Framework | |
Real-Time Visual Tracking: Promoting the Robustness of Correlation Filter Learning
Title | Real-Time Visual Tracking: Promoting the Robustness of Correlation Filter Learning |
Authors | Yao Sui, Ziming Zhang, Guanghui Wang, Yafei Tang, Li Zhang |
Abstract | Correlation-filter-based tracking has received a lot of attention and achieved great success in real-time tracking; however, the loss function in the current correlation filtering paradigm cannot respond reliably to the appearance changes caused by occlusion and illumination variations. This study intends to promote the robustness of correlation filter learning. By exploiting the anisotropy of the filter response, three sparsity-related loss functions are proposed to alleviate the overfitting issue of previous methods and improve the overall tracking performance. As a result, three real-time trackers are implemented. Extensive experiments in various challenging situations demonstrate that the robustness of the learned correlation filter has been greatly improved via the designed loss functions. In addition, the study reveals, from an experimental perspective, how different loss functions essentially influence the tracking performance. An important conclusion is that the sensitivity of the peak values of the filter in successive frames is consistent with the tracking performance. This is a useful reference criterion in designing a robust correlation filter for visual tracking. |
Tasks | Real-Time Visual Tracking, Visual Tracking |
Published | 2016-08-29 |
URL | http://arxiv.org/abs/1608.08173v2 |
http://arxiv.org/pdf/1608.08173v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-visual-tracking-promoting-the |
Repo | |
Framework | |
3DFS: Deformable Dense Depth Fusion and Segmentation for Object Reconstruction from a Handheld Camera
Title | 3DFS: Deformable Dense Depth Fusion and Segmentation for Object Reconstruction from a Handheld Camera |
Authors | Tanmay Gupta, Daeyun Shin, Naren Sivagnanadasan, Derek Hoiem |
Abstract | We propose an approach for 3D reconstruction and segmentation of a single object placed on a flat surface from an input video. Our approach is to perform dense depth map estimation for multiple views using a proposed objective function that preserves detail. The resulting depth maps are then fused using a proposed implicit surface function that is robust to estimation error, producing a smooth surface reconstruction of the entire scene. Finally, the object is segmented from the remaining scene using a proposed 2D-3D segmentation that incorporates image and depth cues with priors and regularization over the 3D volume and 2D segmentations. We evaluate 3D reconstructions qualitatively on our Object-Videos dataset, comparing to fusion, multiview stereo, and segmentation baselines. We also quantitatively evaluate the dense depth estimation using the RGBD Scenes V2 dataset [Henry et al. 2013] and the segmentation using keyframe annotations of the Object-Videos dataset. |
Tasks | 3D Reconstruction, Depth Estimation, Object Reconstruction |
Published | 2016-06-15 |
URL | http://arxiv.org/abs/1606.05002v2 |
http://arxiv.org/pdf/1606.05002v2.pdf | |
PWC | https://paperswithcode.com/paper/3dfs-deformable-dense-depth-fusion-and |
Repo | |
Framework | |
Adaptive Least Mean Squares Estimation of Graph Signals
Title | Adaptive Least Mean Squares Estimation of Graph Signals |
Authors | Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti |
Abstract | The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices. A detailed mean square analysis provides the performance of the proposed method, and leads to several insights for designing useful sampling strategies for graph signals. Numerical results validate our theoretical findings, and illustrate the performance of the proposed method. Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment. |
Tasks | |
Published | 2016-02-18 |
URL | http://arxiv.org/abs/1602.05703v3 |
http://arxiv.org/pdf/1602.05703v3.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-least-mean-squares-estimation-of |
Repo | |
Framework | |
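The adaptive estimation idea in the graph-signal abstract above can be sketched with a small LMS recursion. The sketch assumes an update of the form x[n+1] = x[n] + mu * B * D * (y[n] - x[n]), where B projects onto the band-limited subspace and D keeps only the sampled vertices; this form, the function names, the 4-vertex cycle graph, and the step size are assumptions for illustration and are not stated in the abstract.

```python
import math


def band_projector(basis):
    """Projector B = U U^T built from orthonormal basis vectors (rows of `basis`)."""
    n = len(basis[0])
    return [[sum(u[i] * u[j] for u in basis) for j in range(n)] for i in range(n)]


def graph_lms(observations, basis, sampled, mu):
    """Run the assumed LMS recursion and return the final signal estimate.

    observations: iterable of observed signals y[n] (only sampled vertices are used).
    sampled: set of vertex indices that are actually observed.
    """
    n = len(basis[0])
    proj = band_projector(basis)
    x = [0.0] * n
    for y in observations:
        # Innovation restricted to the sampled vertices (the role of D).
        innovation = [y[i] - x[i] if i in sampled else 0.0 for i in range(n)]
        # Project the masked innovation back onto the band-limited subspace.
        correction = [sum(p * e for p, e in zip(row, innovation)) for row in proj]
        x = [xi + mu * ci for xi, ci in zip(x, correction)]
    return x


# Toy setup: the three lowest-frequency Laplacian eigenvectors of a 4-vertex cycle.
s = 1.0 / math.sqrt(2.0)
basis = [[0.5, 0.5, 0.5, 0.5], [s, 0.0, -s, 0.0], [0.0, s, 0.0, -s]]
truth = [2.0 * a + b - c for a, b, c in zip(*basis)]  # a band-limited signal
estimate = graph_lms([truth] * 200, basis, sampled={0, 1, 2}, mu=0.5)
```

In this noiseless toy case vertex 3 is never observed, yet the estimate still converges to the true value there, because the band-limited assumption ties the unobserved vertex to the sampled ones.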
Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems
Title | Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems |
Authors | Maria-Florina Balcan, Vaishnavh Nagarajan, Ellen Vitercik, Colin White |
Abstract | Max-cut, clustering, and many other partitioning problems that are of significant importance to machine learning and other scientific fields are NP-hard, a reality that has motivated researchers to develop a wealth of approximation algorithms and heuristics. Although the best algorithm to use typically depends on the specific application domain, a worst-case analysis is often used to compare algorithms. This may be misleading if worst-case instances occur infrequently, and thus there is a demand for optimization methods which return the algorithm configuration best suited for the given application’s typical inputs. We address this problem for clustering, max-cut, and other partitioning problems, such as integer quadratic programming, by designing computationally efficient and sample efficient learning algorithms which receive samples from an application-specific distribution over problem instances and learn a partitioning algorithm with high expected performance. Our algorithms learn over common integer quadratic programming and clustering algorithm families: SDP rounding algorithms and agglomerative clustering algorithms with dynamic programming. For our sample complexity analysis, we provide tight bounds on the pseudo-dimension of these algorithm classes, and show that, surprisingly, even for classes of algorithms parameterized by a single parameter, the pseudo-dimension is superconstant. In this way, our work both contributes to the foundations of algorithm configuration and pushes the boundaries of learning theory, since the algorithm classes we analyze consist of multi-stage optimization procedures and are significantly more complex than classes typically studied in learning theory. |
Tasks | |
Published | 2016-11-14 |
URL | http://arxiv.org/abs/1611.04535v4 |
http://arxiv.org/pdf/1611.04535v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-theoretic-foundations-of-algorithm |
Repo | |
Framework | |
Identifying Dogmatism in Social Media: Signals and Models
Title | Identifying Dogmatism in Social Media: Signals and Models |
Authors | Ethan Fast, Eric Horvitz |
Abstract | We explore linguistic and behavioral features of dogmatism in social media and construct statistical models that can identify dogmatic comments. Our model is based on a corpus of Reddit posts, collected across a diverse set of conversational topics and annotated via paid crowdsourcing. We operationalize key aspects of dogmatism described by existing psychology theories (such as over-confidence), finding they have predictive power. We also find evidence for new signals of dogmatism, such as the tendency of dogmatic posts to refrain from signaling cognitive processes. When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves. |
Tasks | |
Published | 2016-09-01 |
URL | http://arxiv.org/abs/1609.00425v1 |
http://arxiv.org/pdf/1609.00425v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-dogmatism-in-social-media-signals |
Repo | |
Framework | |