Paper Group ANR 44
Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance. Locally Imposing Function for Generalized Constraint Neural Networks - A Study on Equality Constraints. Fleet Size and Mix Split-Delivery Vehicle Routing. Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition. Seq-NMS for Video Object Detection …
Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance
Title | Subjects and Their Objects: Localizing Interactees for a Person-Centric View of Importance |
Authors | Chao-Yeh Chen, Kristen Grauman |
Abstract | Understanding images with people often entails understanding their \emph{interactions} with other objects or people. As such, given a novel image, a vision system ought to infer which other objects/people play an important role in a given person’s activity. However, existing methods are limited to learning action-specific interactions (e.g., how the pose of a tennis player relates to the position of his racquet when serving the ball) for improved recognition, making them unequipped to reason about novel interactions with actions or objects unobserved in the training data. We propose to predict the “interactee” in novel images—that is, to localize the \emph{object} of a person’s action. Given an arbitrary image with a detected person, the goal is to produce a saliency map indicating the most likely positions and scales where that person’s interactee would be found. To that end, we explore ways to learn the generic, action-independent connections between (a) representations of a person’s pose, gaze, and scene cues and (b) the interactee object’s position and scale. We provide results on a newly collected UT Interactee dataset spanning more than 10,000 images from SUN, PASCAL, and COCO. We show that the proposed interaction-informed saliency metric has practical utility for four tasks: contextual object detection, image retargeting, predicting object importance, and data-driven natural language scene description. All four scenarios reveal the value in linking the subject to its object in order to understand the story of an image. |
Tasks | Object Detection |
Published | 2016-04-17 |
URL | http://arxiv.org/abs/1604.04842v1 |
http://arxiv.org/pdf/1604.04842v1.pdf | |
PWC | https://paperswithcode.com/paper/subjects-and-their-objects-localizing |
Repo | |
Framework | |
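The abstract above describes predicting a saliency map over likely interactee positions and scales from person cues. Below is a hand-crafted toy stand-in, not the paper's learned pose/gaze/scene model: it simply places a Gaussian ahead of the detected person along an assumed gaze direction, with spread proportional to person size. The inputs `person_box` and `gaze_angle_rad` and all constants are illustrative assumptions.

```python
# Toy interactee-saliency map: a Gaussian placed ahead of the person along the gaze
# direction, with spread tied to person size. This is a hand-crafted illustration of
# the *output format* only; the paper learns this mapping from pose/gaze/scene cues.
import numpy as np

def interactee_saliency(h, w, person_box, gaze_angle_rad, offset=1.5):
    x1, y1, x2, y2 = person_box
    pw, ph = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    # Hypothesised interactee centre: offset from the person along the gaze direction.
    ix = cx + offset * pw * np.cos(gaze_angle_rad)
    iy = cy + offset * ph * np.sin(gaze_angle_rad)
    sigma = 0.75 * max(pw, ph)                      # spread proportional to person size
    ys, xs = np.mgrid[0:h, 0:w]
    sal = np.exp(-((xs - ix) ** 2 + (ys - iy) ** 2) / (2.0 * sigma ** 2))
    return sal / sal.sum()                          # normalise to a probability map

heatmap = interactee_saliency(480, 640, person_box=(100, 120, 180, 360),
                              gaze_angle_rad=0.0)   # person looking to the right
print(heatmap.shape, heatmap.max())
```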
Locally Imposing Function for Generalized Constraint Neural Networks - A Study on Equality Constraints
Title | Locally Imposing Function for Generalized Constraint Neural Networks - A Study on Equality Constraints |
Authors | Linlin Cao, Ran He, Bao-Gang Hu |
Abstract | This work is a further study on the Generalized Constraint Neural Network (GCNN) model [1], [2]. Two challenges are encountered in the study: how to embed any type of prior information and how to select its imposing scheme. The work focuses on the second challenge and studies a new constraint-imposing scheme for equality constraints. A new method called the locally imposing function (LIF) is proposed to provide a local correction to the GCNN prediction function, and therefore falls within the Locally Imposing Scheme (LIS). In comparison, the conventional Lagrange multiplier method is considered a Globally Imposing Scheme (GIS) because its added constraint term exhibits a global impact on the objective function. Two advantages are gained from LIS over GIS. First, LIS enables constraints to fire locally and explicitly on the prediction function, only in the domain where they are needed. Second, constraints can be implemented directly within a network setting. We attempt to interpret several constraint methods graphically from the viewpoint of the locality principle. Numerical examples confirm the advantages of the proposed method. In solving boundary value problems with Dirichlet and Neumann constraints, the GCNN model with LIF is able to achieve exact satisfaction of the constraints. |
Tasks | |
Published | 2016-04-18 |
URL | http://arxiv.org/abs/1604.05198v1 |
http://arxiv.org/pdf/1604.05198v1.pdf | |
PWC | https://paperswithcode.com/paper/locally-imposing-function-for-generalized |
Repo | |
Framework | |
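To make the "local correction" idea concrete, here is a minimal sketch of one plausible form of a locally imposing function for a Dirichlet equality constraint u(x0) = g0: the raw prediction is corrected by a compactly supported bump so that the constraint holds exactly at x0 and the prediction is untouched elsewhere. The bump shape, width, and the stand-in "network" are assumptions; the paper's exact construction may differ.

```python
import numpy as np

def bump(x, x0, width):
    """Locally supported weight: 1 at x0, smoothly decaying to 0 for |x - x0| >= width."""
    d = np.clip(np.abs(x - x0) / width, 0.0, 0.999)
    w = np.exp(1.0 - 1.0 / (1.0 - d ** 2))
    return np.where(np.abs(x - x0) < width, w, 0.0)

def lif_correct(f, x, x0, g0, width=0.2):
    """Locally corrected prediction: equals the raw output away from x0,
    and exactly g0 at the constrained point x0."""
    return f(x) + bump(x, x0, width) * (g0 - f(x0))

# Stand-in for a trained GCNN output, plus a Dirichlet constraint u(0) = 1.
f = lambda x: np.sin(3 * x)
x = np.linspace(-1, 1, 201)
u = lif_correct(f, x, x0=0.0, g0=1.0)
print(u[100])   # exactly 1.0 at x = 0; far from x = 0, u coincides with f
```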
Fleet Size and Mix Split-Delivery Vehicle Routing
Title | Fleet Size and Mix Split-Delivery Vehicle Routing |
Authors | Arthur Mahéo, Tommaso Urli, Philip Kilby |
Abstract | In the classic Vehicle Routing Problem (VRP) a fleet of vehicles has to visit a set of customers while minimising the operations’ costs. We study a rich variant of the VRP featuring split deliveries, a heterogeneous fleet, and vehicle-commodity incompatibility constraints. Our goal is twofold: define the cheapest routing and the most adequate fleet. To do so, we split the problem into two interdependent components: a fleet design component and a routing component. First, we define two Mixed Integer Programming (MIP) formulations for each component. Then we discuss several improvements in the form of valid cuts and symmetry breaking constraints. The main contribution of this paper is a comparison of the four resulting models for this Rich VRP. We highlight their strengths and weaknesses with extensive experiments. Finally, we explore a lightweight integration with Constraint Programming (CP). We use a fast CP model to obtain good solutions, which we then use to warm-start our models. |
Tasks | |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01691v1 |
http://arxiv.org/pdf/1612.01691v1.pdf | |
PWC | https://paperswithcode.com/paper/fleet-size-and-mix-split-delivery-vehicle |
Repo | |
Framework | |
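For readers unfamiliar with the MIP style of such models, here is a drastically simplified fleet-design toy written with PuLP (the choice of PuLP and all data are assumptions): choose how many vehicles of each type to acquire so total capacity covers total demand at minimum fixed cost. The paper's formulations additionally handle routing, split deliveries, and vehicle-commodity incompatibilities, none of which appear here.

```python
# Drastically simplified fleet-design MIP (illustrative only).
# Requires: pip install pulp
from pulp import LpProblem, LpMinimize, LpVariable, LpInteger, lpSum, LpStatus

vehicle_types = {            # illustrative data: capacity and fixed cost per type
    "small":  {"cap": 50,  "cost": 100},
    "medium": {"cap": 120, "cost": 210},
    "large":  {"cap": 300, "cost": 480},
}
total_demand = 730

prob = LpProblem("fleet_design_toy", LpMinimize)
n = {t: LpVariable(f"n_{t}", lowBound=0, cat=LpInteger) for t in vehicle_types}

prob += lpSum(vehicle_types[t]["cost"] * n[t] for t in vehicle_types)              # min fixed cost
prob += lpSum(vehicle_types[t]["cap"] * n[t] for t in vehicle_types) >= total_demand  # cover demand

prob.solve()
print(LpStatus[prob.status], {t: int(n[t].value()) for t in vehicle_types})
```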
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition
Title | Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition |
Authors | Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang |
Abstract | 3D action recognition - the analysis of human actions based on 3D skeleton data - has recently become popular due to its succinctness, robustness, and view-invariant representation. Recent attempts at this problem suggested developing RNN-based learning methods to model the contextual dependency in the temporal domain. In this paper, we extend this idea to spatio-temporal domains to analyze the hidden sources of action-related information within the input data over both domains concurrently. Inspired by the graphical structure of the human skeleton, we further propose a more powerful tree-structure based traversal method. To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within the LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell. Our method achieves state-of-the-art performance on 4 challenging benchmark datasets for 3D human action analysis. |
Tasks | 3D Human Action Recognition, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2016-07-24 |
URL | http://arxiv.org/abs/1607.07043v1 |
http://arxiv.org/pdf/1607.07043v1.pdf | |
PWC | https://paperswithcode.com/paper/spatio-temporal-lstm-with-trust-gates-for-3d |
Repo | |
Framework | |
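A simplified reading of the trust-gate idea, sketched in PyTorch: the cell predicts the current input from its previous hidden state, and the mismatch between prediction and observation down-weights how strongly the (possibly noisy) input writes into the memory. The exact gating form and names below are assumptions; the paper's full spatio-temporal recurrence over joints and time, and its tree-structured traversal, are not reproduced.

```python
import torch
import torch.nn as nn

class TrustGateLSTMCell(nn.Module):
    """LSTM cell with a simplified 'trust gate': unreliable inputs (e.g. noisy 3D
    joint coordinates) contribute less to the memory update."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        self.predict_x = nn.Linear(hidden_size, input_size)   # x_t predicted from h_{t-1}
        self.lam = nn.Parameter(torch.tensor(1.0))            # sharpness of the trust gate

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=-1)).chunk(4, dim=-1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        # Trust in the observation: high when x_t matches what the cell expected.
        err = ((x - self.predict_x(h)) ** 2).mean(dim=-1, keepdim=True)
        trust = torch.exp(-torch.relu(self.lam) * err)         # in (0, 1]
        c = f * c + trust * i * g                               # noisy inputs write less
        h = o * torch.tanh(c)
        return h, (h, c)

cell = TrustGateLSTMCell(input_size=75, hidden_size=128)        # e.g. 25 joints x 3 coords
h = c = torch.zeros(4, 128)
for t in range(10):                                             # toy sequence, batch of 4
    out, (h, c) = cell(torch.randn(4, 75), (h, c))
print(out.shape)   # torch.Size([4, 128])
```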
Seq-NMS for Video Object Detection
Title | Seq-NMS for Video Object Detection |
Authors | Wei Han, Pooya Khorrami, Tom Le Paine, Prajit Ramachandran, Mohammad Babaeizadeh, Honghui Shi, Jianan Li, Shuicheng Yan, Thomas S. Huang |
Abstract | Video object detection is challenging because objects that are easily detected in one frame may be difficult to detect in another frame within the same clip. Recently, there have been major advances for doing object detection in a single image. These methods typically contain three phases: (i) object proposal generation (ii) object classification and (iii) post-processing. We propose a modification of the post-processing phase that uses high-scoring object detections from nearby frames to boost scores of weaker detections within the same clip. We show that our method obtains superior results to state-of-the-art single image object detection techniques. Our method placed 3rd in the video object detection (VID) task of the ImageNet Large Scale Visual Recognition Challenge 2015 (ILSVRC2015). |
Tasks | Object Classification, Object Detection, Object Proposal Generation, Object Recognition, Video Object Detection |
Published | 2016-02-26 |
URL | http://arxiv.org/abs/1602.08465v3 |
http://arxiv.org/pdf/1602.08465v3.pdf | |
PWC | https://paperswithcode.com/paper/seq-nms-for-video-object-detection |
Repo | |
Framework | |
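A compact NumPy sketch of the rescoring idea described above: link detections in adjacent frames when their IoU is high, find the highest-scoring linked sequence by dynamic programming, and boost its members' scores (here with the sequence average). The full method also suppresses overlapping boxes and iterates; the IoU threshold and the rescoring choice are assumptions.

```python
import numpy as np

def iou(a, b):
    """IoU between one box a=[x1,y1,x2,y2] and an array of boxes b (N, 4)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def best_sequence(frames, iou_thr=0.5):
    """frames: list over time of arrays (N_t, 5) with rows [x1,y1,x2,y2,score].
    Returns (frame, detection) indices of the maximum cumulative-score linked sequence."""
    best = [f[:, 4].copy() for f in frames]                 # best sequence score ending here
    back = [np.full(len(f), -1, dtype=int) for f in frames]
    for t in range(1, len(frames)):
        for i, det in enumerate(frames[t]):
            if len(frames[t - 1]) == 0:
                continue
            ious = iou(det[:4], frames[t - 1][:, :4])
            ious[ious < iou_thr] = -np.inf
            j = int(np.argmax(best[t - 1] + ious))
            if np.isfinite(ious[j]):
                best[t][i] = best[t - 1][j] + det[4]
                back[t][i] = j
    t = int(np.argmax([b.max() if len(b) else -np.inf for b in best]))
    i, seq = int(np.argmax(best[t])), []
    while i != -1:                                          # trace the sequence backwards
        seq.append((t, i))
        i, t = back[t][i], t - 1
    return seq[::-1]

def seq_rescore(frames, seq):
    """Boost: rescore every member with the sequence's average score."""
    avg = np.mean([frames[t][i, 4] for t, i in seq])
    for t, i in seq:
        frames[t][i, 4] = avg
```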
Big Batch SGD: Automated Inference using Adaptive Batch Sizes
Title | Big Batch SGD: Automated Inference using Adaptive Batch Sizes |
Authors | Soham De, Abhay Yadav, David Jacobs, Tom Goldstein |
Abstract | Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients make it difficult to use them for adaptive stepsize selection and automatic stopping. We propose alternative “big batch” SGD schemes that adaptively grow the batch size over time to maintain a nearly constant signal-to-noise ratio in the gradient approximation. The resulting methods have similar convergence rates to classical SGD, and do not require convexity of the objective. The high-fidelity gradients enable automated learning rate selection and do not require stepsize decay. Big batch methods are thus easily automated and can run with little or no oversight. |
Tasks | |
Published | 2016-10-18 |
URL | http://arxiv.org/abs/1610.05792v4 |
http://arxiv.org/pdf/1610.05792v4.pdf | |
PWC | https://paperswithcode.com/paper/big-batch-sgd-automated-inference-using |
Repo | |
Framework | |
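A sketch of the batch-growing rule suggested by the abstract, under the assumption that "nearly constant signal-to-noise ratio" is enforced by growing the batch whenever the estimated variance of the mean gradient dominates its squared norm. The paper's precise test, step-size rule, and constants may differ.

```python
import numpy as np

def big_batch_sgd(grad_fn, x0, n_data, batch0=32, lr=0.05, growth=1.5, iters=200, seed=0):
    """grad_fn(x, idx) -> array (B, d) of per-example gradients for examples idx.
    Grows the batch so the gradient estimate keeps a roughly constant signal-to-noise ratio."""
    rng = np.random.default_rng(seed)
    x, B = np.asarray(x0, dtype=float), batch0
    for _ in range(iters):
        idx = rng.choice(n_data, size=min(B, n_data), replace=False)
        G = grad_fn(x, idx)                        # (B, d) per-example gradients
        g = G.mean(axis=0)
        noise = G.var(axis=0).sum() / len(idx)     # variance of the *mean* gradient
        if np.dot(g, g) <= noise:                  # signal buried in noise -> bigger batch
            B = int(min(n_data, np.ceil(growth * B)))
        x = x - lr * g
    return x, B

# Toy least-squares problem: f(x) = mean_i (a_i^T x - b_i)^2 / 2.
rng = np.random.default_rng(1)
A = rng.normal(size=(5000, 10)); x_true = rng.normal(size=10)
b = A @ x_true + 0.1 * rng.normal(size=5000)
grad = lambda x, idx: (A[idx] @ x - b[idx])[:, None] * A[idx]
x_hat, final_B = big_batch_sgd(grad, np.zeros(10), n_data=5000)
print(np.linalg.norm(x_hat - x_true), final_B)
```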
Generating Discriminative Object Proposals via Submodular Ranking
Title | Generating Discriminative Object Proposals via Submodular Ranking |
Authors | Yangmuzi Zhang, Zhuolin Jiang, Xi Chen, Larry S. Davis |
Abstract | A multi-scale greedy-based object proposal generation approach is presented. Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation. We first identify the representative and diverse exemplar clusters within each scale by using a diversity ranking algorithm. Object proposals are obtained by selecting a subset from the multi-scale segment pool via maximizing a submodular objective function, which consists of a weighted coverage term, a single-scale diversity term and a multi-scale reward term. The weighted coverage term forces the selected set of object proposals to be representative and compact; the single-scale diversity term encourages choosing segments from different exemplar clusters so that they will cover as many object patterns as possible; the multi-scale reward term encourages the selected proposals to be discriminative and selected from multiple layers generated by the hierarchical image segmentation. The experimental results on the Berkeley Segmentation Dataset and PASCAL VOC2012 segmentation dataset demonstrate the accuracy and efficiency of our object proposal model. Additionally, we validate our object proposals in simultaneous segmentation and detection and outperform the state of the art. |
Tasks | Object Proposal Generation, Semantic Segmentation |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03585v1 |
http://arxiv.org/pdf/1602.03585v1.pdf | |
PWC | https://paperswithcode.com/paper/generating-discriminative-object-proposals |
Repo | |
Framework | |
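A generic greedy-maximization sketch in the spirit of the objective above, with simplifications: the weighted coverage term is replaced by a facility-location surrogate, the single-scale diversity term by a concave function of per-cluster pick counts, and the multi-scale reward term is omitted. Segment features, cluster labels, and weights are illustrative assumptions.

```python
import numpy as np

def greedy_submodular_select(features, cluster_ids, k, alpha=1.0, beta=1.0):
    """Greedily pick k segments from a pool to maximize
       alpha * coverage (facility location)  +  beta * diversity (sqrt of picks per cluster)."""
    sim = features @ features.T                      # similarity between segments
    n = len(features)
    chosen, cover, counts = [], np.zeros(n), {}
    for _ in range(k):
        best_gain, best_j = -np.inf, -1
        for j in range(n):
            if j in chosen:
                continue
            gain_cov = np.maximum(cover, sim[j]).sum() - cover.sum()
            c = cluster_ids[j]
            gain_div = np.sqrt(counts.get(c, 0) + 1) - np.sqrt(counts.get(c, 0))
            gain = alpha * gain_cov + beta * gain_div
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        cover = np.maximum(cover, sim[best_j])
        counts[cluster_ids[best_j]] = counts.get(cluster_ids[best_j], 0) + 1
    return chosen

# Toy pool: 100 segment descriptors in 16-D, grouped into 5 exemplar clusters.
rng = np.random.default_rng(0)
F = rng.normal(size=(100, 16)); F /= np.linalg.norm(F, axis=1, keepdims=True)
clusters = rng.integers(0, 5, size=100)
print(greedy_submodular_select(F, clusters, k=10))
```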
Predicate Gradual Logic and Linguistics
Title | Predicate Gradual Logic and Linguistics |
Authors | Ryuta Arisaka |
Abstract | There are several major proposals for treating donkey anaphora, such as discourse representation theory and the like, or E-type theories and the like. Each of them works well for the set of specific examples used to demonstrate the validity of its approach. As I show in this paper, however, they do not generalise well and fail to account for essentially the same problem that they remedy when it manifests in other examples. I propose another logical approach. I develop a logic that extends a recent propositional gradual logic, and show that it can treat donkey anaphora generally. I also identify and address a problem concerning the modern convention on existential import. Furthermore, I show that Aristotle’s syllogisms and conversion are realisable in this logic. |
Tasks | |
Published | 2016-03-17 |
URL | http://arxiv.org/abs/1603.05570v1 |
http://arxiv.org/pdf/1603.05570v1.pdf | |
PWC | https://paperswithcode.com/paper/predicate-gradual-logic-and-linguistics |
Repo | |
Framework | |
Efficient Distributed Learning with Sparsity
Title | Efficient Distributed Learning with Sparsity |
Authors | Jialei Wang, Mladen Kolar, Nathan Srebro, Tong Zhang |
Abstract | We propose a novel, efficient approach for distributed sparse learning in high dimensions, where observations are randomly partitioned across machines. Computationally, at each round our method only requires the master machine to solve a shifted $\ell_1$-regularized M-estimation problem, while the other workers compute the gradient. With respect to communication, the proposed approach provably matches the estimation error bound of centralized methods within a constant number of communication rounds (ignoring logarithmic factors). We conduct extensive experiments on both simulated and real-world datasets, and demonstrate encouraging performance on high-dimensional regression and classification tasks. |
Tasks | Sparse Learning |
Published | 2016-05-25 |
URL | http://arxiv.org/abs/1605.07991v1 |
http://arxiv.org/pdf/1605.07991v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-distributed-learning-with-sparsity |
Repo | |
Framework | |
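A rough sketch of the round structure described above, with heavy caveats: workers send their local gradients at the current iterate; the master approximately solves a "shifted" $\ell_1$-regularized problem that mixes its own local loss with the aggregated gradient, here via a few ISTA steps on least squares. The exact shifted objective, regularization, and constants in the paper may differ; treat this as an assumed instantiation.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def shifted_lasso_round(w, X1, y1, global_grad, lam, ista_iters=200):
    """Master step (sketch): approximately minimise
       f_1(v) + (global_grad - grad f_1(w))^T v + lam * ||v||_1
    with ISTA, where f_1 is the master's local least-squares loss."""
    n1 = len(y1)
    step = 1.0 / (np.linalg.norm(X1, 2) ** 2 / n1)          # 1 / Lipschitz constant of grad f_1
    shift = global_grad - X1.T @ (X1 @ w - y1) / n1          # gradient difference at current w
    v = w.copy()
    for _ in range(ista_iters):
        grad_local = X1.T @ (X1 @ v - y1) / n1
        v = soft_threshold(v - step * (grad_local + shift), step * lam)
    return v

# Toy distributed sparse regression: data split across 4 "machines".
rng = np.random.default_rng(0)
d, k, n = 200, 10, 4000
w_true = np.zeros(d); w_true[:k] = rng.normal(size=k)
X = rng.normal(size=(n, d)); y = X @ w_true + 0.1 * rng.normal(size=n)
parts = np.array_split(np.arange(n), 4)
w = np.zeros(d)
for _ in range(3):                                           # a few communication rounds
    grads = [X[p].T @ (X[p] @ w - y[p]) / len(p) for p in parts]        # workers
    w = shifted_lasso_round(w, X[parts[0]], y[parts[0]],
                            np.mean(grads, axis=0), lam=0.05)           # master
print(np.linalg.norm(w - w_true))
```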
A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data
Title | A Theoretical Analysis of Noisy Sparse Subspace Clustering on Dimensionality-Reduced Data |
Authors | Yining Wang, Yu-Xiang Wang, Aarti Singh |
Abstract | Subspace clustering is the problem of partitioning unlabeled data points into a number of clusters so that data points within one cluster lie approximately on a low-dimensional linear subspace. In many practical scenarios, the dimensionality of the data points to be clustered is compressed due to constraints of measurement, computation or privacy. In this paper, we study the theoretical properties of a popular subspace clustering algorithm named sparse subspace clustering (SSC) and establish formal success conditions of SSC on dimensionality-reduced data. Our analysis applies to the most general fully deterministic model where both underlying subspaces and data points within each subspace are deterministically positioned, and also a wide range of dimensionality reduction techniques (e.g., Gaussian random projection, uniform subsampling, sketching) that fall into a subspace embedding framework (Meng & Mahoney, 2013; Avron et al., 2014). Finally, we apply our analysis to a differentially private SSC algorithm and establish both privacy and utility guarantees of the proposed method. |
Tasks | Dimensionality Reduction |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07650v1 |
http://arxiv.org/pdf/1610.07650v1.pdf | |
PWC | https://paperswithcode.com/paper/a-theoretical-analysis-of-noisy-sparse |
Repo | |
Framework | |
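A minimal sketch of the pipeline the analysis covers: compress the data with a Gaussian random projection, run noisy SSC (one Lasso self-expression per point), symmetrise the coefficients into an affinity, and spectrally cluster. The Lasso weight, projected dimension, and toy data are assumptions; the differentially private variant is not shown.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def compressed_ssc(X, n_clusters, p, lam=0.05, seed=0):
    """X: (d, N) with data points as columns. p: projected dimension (p < d)."""
    rng = np.random.default_rng(seed)
    Phi = rng.normal(size=(p, X.shape[0])) / np.sqrt(p)      # Gaussian random projection
    Y = Phi @ X                                              # (p, N) compressed data
    N = Y.shape[1]
    C = np.zeros((N, N))
    for i in range(N):                                       # sparse self-expression per point
        idx = np.arange(N) != i
        lasso = Lasso(alpha=lam, max_iter=5000)
        lasso.fit(Y[:, idx], Y[:, i])
        C[idx, i] = lasso.coef_
    W = np.abs(C) + np.abs(C).T                              # symmetric affinity matrix
    return SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                              random_state=seed).fit_predict(W)

# Toy data: two 3-dimensional subspaces in R^50, compressed down to R^15.
rng = np.random.default_rng(1)
d, per = 50, 40
U1, U2 = rng.normal(size=(d, 3)), rng.normal(size=(d, 3))
X = np.hstack([U1 @ rng.normal(size=(3, per)), U2 @ rng.normal(size=(3, per))])
X += 0.01 * rng.normal(size=X.shape)
print(compressed_ssc(X, n_clusters=2, p=15))
```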
Unsupervised Learning in Neuromemristive Systems
Title | Unsupervised Learning in Neuromemristive Systems |
Authors | Cory Merkel, Dhireesha Kudithipudi |
Abstract | Neuromemristive systems (NMSs) currently represent the most promising platform to achieve energy efficient neuro-inspired computation. However, since the research field is less than a decade old, there are still countless algorithms and design paradigms to be explored within these systems. One particular domain that remains to be fully investigated within NMSs is unsupervised learning. In this work, we explore the design of an NMS for unsupervised clustering, which is a critical element of several machine learning algorithms. Using a simple memristor crossbar architecture and learning rule, we are able to achieve performance which is on par with MATLAB’s k-means clustering. |
Tasks | |
Published | 2016-01-27 |
URL | http://arxiv.org/abs/1601.07482v1 |
http://arxiv.org/pdf/1601.07482v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-in-neuromemristive |
Repo | |
Framework | |
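A behavioural sketch of the crossbar clustering idea: rows of a conductance-like matrix act as cluster prototypes, the best-matching row "wins" each input, and only the winner is nudged toward the input, an online, winner-take-all, k-means-like rule. Device-level details (conductance ranges, write nonlinearity, the specific crossbar learning rule of the paper) are ignored; this is an assumed software analogue.

```python
import numpy as np

def crossbar_cluster(X, n_clusters, lr=0.05, epochs=20, seed=0):
    """X: (N, d) inputs scaled to [0, 1]. Returns a prototype matrix G of shape (n_clusters, d)."""
    rng = np.random.default_rng(seed)
    G = rng.uniform(0.0, 1.0, size=(n_clusters, X.shape[1]))   # rows = crossbar prototypes
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            # Winner selection via Euclidean distance for clarity; a crossbar would
            # approximate the match with analog dot products on suitably normalised inputs.
            w = int(np.argmin(((G - x) ** 2).sum(axis=1)))
            G[w] += lr * (x - G[w])                # nudge the winning row toward the input
    return G

# Sanity check on two blobs in [0, 1]^2: rows should land near the blob centres.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal([0.2, 0.2], 0.05, size=(100, 2)),
               rng.normal([0.8, 0.8], 0.05, size=(100, 2))]).clip(0, 1)
print(crossbar_cluster(X, n_clusters=2))
```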
Classification-based Financial Markets Prediction using Deep Neural Networks
Title | Classification-based Financial Markets Prediction using Deep Neural Networks |
Authors | Matthew Dixon, Diego Klabjan, Jin Hoon Bang |
Abstract | Deep neural networks (DNNs) are powerful types of artificial neural networks (ANNs) that use several hidden layers. They have recently gained considerable attention in the speech transcription and image recognition community (Krizhevsky et al., 2012) for their superior predictive properties, including robustness to overfitting. However, their application to algorithmic trading has not been previously researched, partly because of their computational complexity. This paper describes the application of DNNs to predicting financial market movement directions. In particular, we describe the configuration and training approach and then demonstrate their application to backtesting a simple trading strategy over 43 different commodity and FX futures mid-prices at 5-minute intervals. All results in this paper are generated using a C++ implementation on the Intel Xeon Phi co-processor, which is 11.4x faster than the serial version, and a Python strategy backtesting environment, both of which are available as open source code written by the authors. |
Tasks | |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08604v2 |
http://arxiv.org/pdf/1603.08604v2.pdf | |
PWC | https://paperswithcode.com/paper/classification-based-financial-markets |
Repo | |
Framework | |
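A purely illustrative Python stand-in for the classification-plus-backtest workflow (the paper's implementation is C++ on Xeon Phi with real futures data): lagged returns as features, an MLP classifier predicting the sign of the next 5-minute return, and a naive frictionless backtest that goes long or short according to the prediction. All data and hyperparameters below are synthetic assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for 5-minute mid-price returns of one instrument.
rng = np.random.default_rng(0)
r = 0.0005 * np.sin(np.arange(4000) / 7.0) + 0.002 * rng.normal(size=4000)

lags = 20
X = np.stack([r[i - lags:i] for i in range(lags, len(r))])   # features: past 20 returns
y = (r[lags:] > 0).astype(int)                               # label: direction of next return

split = 3000
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X[:split], y[:split])

pred = clf.predict(X[split:])
position = 2 * pred - 1                                      # +1 long / -1 short
pnl = position * r[lags + split:]                            # naive, frictionless backtest
print("hit rate:", (pred == y[split:]).mean(), "cum. return:", pnl.sum())
```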
A nonparametric sequential test for online randomized experiments
Title | A nonparametric sequential test for online randomized experiments |
Authors | Vineet Abhishek, Shie Mannor |
Abstract | We propose a nonparametric sequential test that aims to address two practical problems pertinent to online randomized experiments: (i) how to do a hypothesis test for complex metrics; (ii) how to prevent type $1$ error inflation under continuous monitoring. The proposed test does not require knowledge of the underlying probability distribution generating the data. We use the bootstrap to estimate the likelihood for blocks of data followed by mixture sequential probability ratio test. We validate this procedure on data from a major online e-commerce website. We show that the proposed test controls type $1$ error at any time, has good power, is robust to misspecification in the distribution generating the data, and allows quick inference in online randomized experiments. |
Tasks | |
Published | 2016-10-08 |
URL | http://arxiv.org/abs/1610.02490v4 |
http://arxiv.org/pdf/1610.02490v4.pdf | |
PWC | https://paperswithcode.com/paper/a-nonparametric-sequential-test-for-online |
Repo | |
Framework | |
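The paper combines bootstrap-estimated block likelihoods with a mixture sequential probability ratio test (mSPRT); as a simplified stand-in, here is the generic normal-mixture mSPRT on a stream of treatment-minus-control differences, which yields an always-valid p-value under continuous monitoring. The Gaussian assumption and the mixing variance `tau2` are simplifications, not the paper's bootstrap procedure.

```python
import numpy as np

def msprt_pvalues(diffs, sigma2, tau2=1.0):
    """Mixture SPRT for the mean of i.i.d. differences, H0: mean = 0,
    with a N(0, tau2) mixing distribution over the effect size.
    Returns the always-valid p-value after each observation."""
    n = np.arange(1, len(diffs) + 1)
    S = np.cumsum(diffs)
    lam = np.sqrt(sigma2 / (sigma2 + n * tau2)) * \
          np.exp(tau2 * S ** 2 / (2 * sigma2 * (sigma2 + n * tau2)))   # mixture likelihood ratio
    return np.minimum.accumulate(np.minimum(1.0, 1.0 / lam))           # running always-valid p

# Toy A/B stream with a small true lift, monitored continuously at alpha = 0.05.
rng = np.random.default_rng(0)
diffs = rng.normal(loc=0.1, scale=1.0, size=2000)
p = msprt_pvalues(diffs, sigma2=1.0, tau2=0.5)
first = int(np.argmax(p <= 0.05)) if (p <= 0.05).any() else None
print("first rejection at observation:", first)
```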
Scaling up Echo-State Networks with multiple light scattering
Title | Scaling up Echo-State Networks with multiple light scattering |
Authors | Jonathan Dong, Sylvain Gigan, Florent Krzakala, Gilles Wainrib |
Abstract | Echo-State Networks and Reservoir Computing have been studied for more than a decade. They provide a simpler yet powerful alternative to Recurrent Neural Networks: every internal weight is fixed and only the last linear layer is trained. They involve many multiplications by dense random matrices, so very large networks are difficult to obtain, as the complexity scales quadratically in both time and memory. Here, we present a novel optical implementation of Echo-State Networks using light-scattering media and a Digital Micromirror Device. As a proof of concept, binary networks have been successfully trained to predict the chaotic Mackey-Glass time series. This new method is fast, power efficient and easily scalable to very large networks. |
Tasks | Time Series |
Published | 2016-09-15 |
URL | http://arxiv.org/abs/1609.05204v3 |
http://arxiv.org/pdf/1609.05204v3.pdf | |
PWC | https://paperswithcode.com/paper/scaling-up-echo-state-networks-with-multiple |
Repo | |
Framework | |
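For reference, a minimal digital echo-state network: fixed random input and reservoir weights, tanh state update, and a linear readout trained by ridge regression for one-step prediction of a Mackey-Glass-like series. The spectral radius, reservoir size, and the coarse series discretisation are assumptions; the paper's contribution, replacing the dense random matrix multiplications with multiple light scattering, is not reproduced here.

```python
import numpy as np

def esn_fit_predict(u, n_res=300, rho=0.9, ridge=1e-6, washout=100, seed=0):
    """Train an echo-state network to predict u[t+1] from u[t]; every internal
    weight is fixed and random, only the linear readout is learned."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=n_res)
    W = rng.normal(size=(n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))          # rescale to spectral radius rho
    x, states = np.zeros(n_res), []
    for t in range(len(u) - 1):
        x = np.tanh(W_in * u[t] + W @ x)
        states.append(x.copy())
    X = np.array(states[washout:])                           # discard the transient
    y = u[washout + 1:]
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
    return X @ W_out, y                                      # one-step predictions, targets

# Mackey-Glass-like stand-in: a coarsely discretised delayed feedback series.
T = 2000
u = np.zeros(T); u[:18] = 1.2
for t in range(17, T - 1):
    u[t + 1] = u[t] + 0.2 * u[t - 17] / (1 + u[t - 17] ** 10) - 0.1 * u[t]
pred, target = esn_fit_predict(u)
print("one-step NRMSE:", np.sqrt(np.mean((pred - target) ** 2)) / np.std(target))
```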
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion
Title | GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion |
Authors | Ankit Gandhi, Arjun Sharma, Arijit Biswas, Om Deshmukh |
Abstract | Data generated from real-world events are usually temporal and contain multimodal information such as audio, visual, depth, sensor, etc., which must be intelligently combined for classification tasks. In this paper, we propose a novel generalized deep neural network architecture where temporal streams from multiple modalities are combined. There are M+1 components in the proposed network, where M is the number of modalities. The first component is a novel temporally hybrid Recurrent Neural Network (RNN) that exploits the complementary nature of the multimodal temporal information by allowing the network to learn both modality-specific temporal dynamics as well as the dynamics in a multimodal feature space. M additional components are added to the network to extract discriminative but non-temporal cues from each modality. Finally, the predictions from all of these components are linearly combined using a set of automatically learned weights. We perform exhaustive experiments on three different datasets spanning four modalities. The proposed network achieves relative improvements of 3.5%, 5.7% and 2% over the best performing temporal multimodal baseline on the UCF-101, CCV and Multimodal Gesture datasets respectively. |
Tasks | |
Published | 2016-09-17 |
URL | http://arxiv.org/abs/1609.05281v1 |
http://arxiv.org/pdf/1609.05281v1.pdf | |
PWC | https://paperswithcode.com/paper/gethr-net-a-generalized-temporally-hybrid |
Repo | |
Framework | |
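A condensed PyTorch sketch of the fusion scheme as described in the abstract: modality-specific recurrences feeding a shared multimodal recurrence (a simplification of the temporally hybrid RNN), M non-temporal per-modality heads, and a learned softmax-weighted combination of all M+1 component predictions. Layer sizes and the exact hybrid wiring are assumptions.

```python
import torch
import torch.nn as nn

class GeThRNetSketch(nn.Module):
    """M+1 components: one temporal component over all modalities plus one
    non-temporal MLP per modality; predictions fused with learned weights."""
    def __init__(self, modality_dims, hidden=64, n_classes=10):
        super().__init__()
        self.mod_rnns = nn.ModuleList([nn.LSTM(d, hidden, batch_first=True)
                                       for d in modality_dims])
        # Shared recurrence over the concatenated modality-specific states
        # (a simplification of the paper's temporally hybrid RNN).
        self.fusion_rnn = nn.LSTM(hidden * len(modality_dims), hidden, batch_first=True)
        self.temporal_head = nn.Linear(hidden, n_classes)
        self.static_heads = nn.ModuleList([
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))
            for d in modality_dims])
        self.mix = nn.Parameter(torch.zeros(len(modality_dims) + 1))   # learned fusion weights

    def forward(self, streams):
        # streams: list of M tensors, each of shape (batch, time, dim_m)
        states = [rnn(x)[0] for rnn, x in zip(self.mod_rnns, streams)]
        fused, _ = self.fusion_rnn(torch.cat(states, dim=-1))
        preds = [self.temporal_head(fused[:, -1])]                     # temporal component
        preds += [head(x.mean(dim=1)) for head, x in zip(self.static_heads, streams)]
        w = torch.softmax(self.mix, dim=0)
        return sum(wi * p for wi, p in zip(w, preds))

model = GeThRNetSketch(modality_dims=[128, 40], n_classes=10)   # e.g. visual + audio features
logits = model([torch.randn(2, 16, 128), torch.randn(2, 16, 40)])
print(logits.shape)   # torch.Size([2, 10])
```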