Paper Group ANR 482
Language classification from bilingual word embedding graphs
Title | Language classification from bilingual word embedding graphs |
Authors | Steffen Eger, Armin Hoenen, Alexander Mehler |
Abstract | We study the role of the second language in bilingual word embeddings in monolingual semantic evaluation tasks. We find strongly and weakly positive correlations between down-stream task performance and second language similarity to the target language. Additionally, we show how bilingual word embeddings can be employed for the task of semantic language classification and that joint semantic spaces vary in meaningful ways across second languages. Our results support the hypothesis that semantic language similarity is influenced by both structural similarity as well as geography/contact. |
Tasks | Word Embeddings |
Published | 2016-07-18 |
URL | http://arxiv.org/abs/1607.05014v2 |
http://arxiv.org/pdf/1607.05014v2.pdf | |
PWC | https://paperswithcode.com/paper/language-classification-from-bilingual-word |
Repo | |
Framework | |
Contextual Decision Processes with Low Bellman Rank are PAC-Learnable
Title | Contextual Decision Processes with Low Bellman Rank are PAC-Learnable |
Authors | Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire |
Abstract | This paper studies systematic exploration for reinforcement learning with rich observations and function approximation. We introduce a new model called contextual decision processes, that unifies and generalizes most prior settings. Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings. Our second contribution is a new reinforcement learning algorithm that engages in systematic exploration to learn contextual decision processes with low Bellman rank. Our algorithm provably learns near-optimal behavior with a number of samples that is polynomial in all relevant parameters but independent of the number of unique observations. The approach uses Bellman error minimization with optimistic exploration and provides new insights into efficient exploration for reinforcement learning with function approximation. |
Tasks | Efficient Exploration |
Published | 2016-10-29 |
URL | http://arxiv.org/abs/1610.09512v2 |
http://arxiv.org/pdf/1610.09512v2.pdf | |
PWC | https://paperswithcode.com/paper/contextual-decision-processes-with-low |
Repo | |
Framework | |
Stochastic Contextual Bandits with Known Reward Functions
Title | Stochastic Contextual Bandits with Known Reward Functions |
Authors | Pranav Sakulkar, Bhaskar Krishnamachari |
Abstract | Many sequential decision-making problems in communication networks can be modeled as contextual bandit problems, which are natural extensions of the well-known multi-armed bandit problem. In contextual bandit problems, at each time, an agent observes some side information or context, pulls one arm and receives the reward for that arm. We consider a stochastic formulation where the context-reward tuples are independently drawn from an unknown distribution in each trial. Motivated by networking applications, we analyze a setting where the reward is a known non-linear function of the context and the chosen arm’s current state. We first consider the case of discrete and finite context-spaces and propose DCB($\epsilon$), an algorithm that we prove, through a careful analysis, yields regret (cumulative reward gap compared to a distribution-aware genie) scaling logarithmically in time and linearly in the number of arms that are not optimal for any context, improving over existing algorithms where the regret scales linearly in the total number of arms. We then study continuous context-spaces with Lipschitz reward functions and propose CCB($\epsilon, \delta$), an algorithm that uses DCB($\epsilon$) as a subroutine. CCB($\epsilon, \delta$) reveals a novel regret-storage trade-off that is parametrized by $\delta$. Tuning $\delta$ to the time horizon allows us to obtain sub-linear regret bounds, while requiring sub-linear storage. By exploiting joint learning for all contexts we get regret bounds for CCB($\epsilon, \delta$) that are unachievable by any existing contextual bandit algorithm for continuous context-spaces. We also show similar performance bounds for the unknown horizon case. |
Tasks | Decision Making, Multi-Armed Bandits |
Published | 2016-04-30 |
URL | http://arxiv.org/abs/1605.00176v2 |
http://arxiv.org/pdf/1605.00176v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-contextual-bandits-with-known |
Repo | |
Framework | |
Swapout: Learning an ensemble of deep architectures
Title | Swapout: Learning an ensemble of deep architectures |
Authors | Saurabh Singh, Derek Hoiem, David Forsyth |
Abstract | We describe Swapout, a new stochastic training method that outperforms ResNets of identical network structure, yielding impressive results on CIFAR-10 and CIFAR-100. Swapout samples from a rich set of architectures including dropout, stochastic depth and residual architectures as special cases. When viewed as a regularization method, swapout not only inhibits co-adaptation of units in a layer, similar to dropout, but also across network layers. We conjecture that swapout achieves strong regularization by implicitly tying the parameters across layers. When viewed as an ensemble training method, it samples a much richer set of architectures than existing methods such as dropout or stochastic depth. We propose a parameterization that reveals connections to existing architectures and suggests a much richer set of architectures to be explored. We show that our formulation suggests an efficient training method and validate our conclusions on CIFAR-10 and CIFAR-100, matching state-of-the-art accuracy. Remarkably, our 32 layer wider model performs similarly to a 1001 layer ResNet model. |
Tasks | |
Published | 2016-05-20 |
URL | http://arxiv.org/abs/1605.06465v1 |
http://arxiv.org/pdf/1605.06465v1.pdf | |
PWC | https://paperswithcode.com/paper/swapout-learning-an-ensemble-of-deep |
Repo | |
Framework | |
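The claim that swapout contains dropout, stochastic depth and residual architectures as special cases can be read through a per-unit mixing rule. The sketch below assumes each unit outputs `t1 * x + t2 * F(x)` with independent Bernoulli masks `t1` and `t2`. That formula is not stated in the abstract above, and the function name `swapout`, the probabilities `p1`, `p2` and the toy transform standing in for `F` are illustrative assumptions, not the authors' code.

```python
import random

def swapout(x, fx, p1, p2, rng):
    """Mix input x and transformed fx unit by unit.

    Each unit i outputs t1 * x[i] + t2 * fx[i], where t1 ~ Bernoulli(p1)
    and t2 ~ Bernoulli(p2) are drawn independently (assumed form).
    """
    out = []
    for xi, fi in zip(x, fx):
        t1 = 1.0 if rng.random() < p1 else 0.0
        t2 = 1.0 if rng.random() < p2 else 0.0
        out.append(t1 * xi + t2 * fi)
    return out

x = [1.0, 2.0, 3.0]
fx = [0.5 * v for v in x]  # toy stand-in for a layer output F(x)
rng = random.Random(0)

residual = swapout(x, fx, 1.0, 1.0, rng)     # every unit is x + F(x)
feedforward = swapout(x, fx, 0.0, 1.0, rng)  # every unit is F(x)
skipped = swapout(x, fx, 1.0, 0.0, rng)      # every unit is x (layer skipped)
sampled = swapout(x, fx, 0.5, 0.5, rng)      # each unit is 0, x, F(x) or x + F(x)
```

With both probabilities fixed at 0 or 1 the rule collapses to a deterministic architecture. Intermediate values sample a different mixture for every unit, which is one way to picture the "richer set of architectures" the abstract mentions.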
Fast Object Localization Using a CNN Feature Map Based Multi-Scale Search
Title | Fast Object Localization Using a CNN Feature Map Based Multi-Scale Search |
Authors | Hyungtae Lee, Heesung Kwon, Archith J. Bency, William D. Nothwang |
Abstract | Object localization is an important task in computer vision but requires a large amount of computational power due mainly to an exhaustive multiscale search on the input image. In this paper, we describe a near real-time multiscale search on a deep CNN feature map that does not use region proposals. The proposed approach effectively exploits local semantic information preserved in the feature map of the outermost convolutional layer. A multi-scale search is performed on the feature map by processing all the sub-regions of different sizes using separate expert units of fully connected layers. Each expert unit receives as input local semantic features only from the corresponding sub-regions of a specific geometric shape. Therefore, it contains more nearly optimal parameters tailored to the corresponding shape. This multi-scale and multi-aspect ratio scanning strategy can effectively localize a potential object of an arbitrary size. The proposed approach is fast and able to localize objects of interest with a frame rate of 4 fps while providing improved detection performance over the state-of-the-art on the PASCAL VOC 12 and MSCOCO data sets. |
Tasks | Object Localization |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03517v1 |
http://arxiv.org/pdf/1604.03517v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-object-localization-using-a-cnn-feature |
Repo | |
Framework | |
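The scanning strategy described in the abstract above, where sub-regions of each shape are routed to their own expert unit, can be pictured as enumerating windows on the feature-map grid and grouping them by shape. The following is a minimal sketch under assumed values: the 4x4 map size, the three window shapes and the function name `windows_by_shape` are hypothetical, and the expert units themselves are not modelled.

```python
def windows_by_shape(map_h, map_w, shapes):
    """Group the top-left corners of all sub-windows by window shape.

    Each (h, w) shape would feed a separate expert unit; here we only
    enumerate the positions that unit would have to score.
    """
    grouped = {}
    for h, w in shapes:
        grouped[(h, w)] = [
            (r, c)
            for r in range(map_h - h + 1)
            for c in range(map_w - w + 1)
        ]
    return grouped

# Hypothetical 4x4 feature map scanned with three window shapes.
grouped = windows_by_shape(4, 4, [(2, 2), (2, 3), (4, 4)])
for shape, corners in grouped.items():
    print(shape, len(corners))
```

Because the search runs on the small feature map rather than on image pixels, the number of windows per shape stays modest, which is consistent with the near real-time claim.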
Cruciform: Solving Crosswords with Natural Language Processing
Title | Cruciform: Solving Crosswords with Natural Language Processing |
Authors | Dragomir Radev, Rui Zhang, Steve Wilson, Derek Van Assche, Henrique Spyra Gubert, Alisa Krivokapic, MeiXing Dong, Chongruo Wu, Spruce Bondera, Luke Brandl, Jeremy Dohmann |
Abstract | Crossword puzzles are popular word games that require not only a large vocabulary, but also a broad knowledge of topics. Answering each clue is a natural language task on its own as many clues contain nuances, puns, or counter-intuitive word definitions. Additionally, it can be extremely difficult to ascertain definitive answers without the constraints of the crossword grid itself. This task is challenging for both humans and computers. We describe here a new crossword solving system, Cruciform. We employ a group of natural language components, each of which returns a list of candidate words with scores when given a clue. These lists are used in conjunction with the fill intersections in the puzzle grid to formulate a constraint satisfaction problem, in a manner similar to the one used in the Dr. Fill system. We describe the results of several of our experiments with the system. |
Tasks | |
Published | 2016-11-08 |
URL | http://arxiv.org/abs/1611.02360v2 |
http://arxiv.org/pdf/1611.02360v2.pdf | |
PWC | https://paperswithcode.com/paper/cruciform-solving-crosswords-with-natural |
Repo | |
Framework | |
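To illustrate how scored candidate lists and grid intersections combine into a constraint satisfaction problem, here is a brute-force sketch. It is not the Cruciform or Dr. Fill solver: the slot names, candidate words, integer scores and the `solve` function are all hypothetical, and a real system would need a far more efficient search.

```python
from itertools import product

# Hypothetical candidate lists: (word, score) pairs for each slot.
candidates = {
    "1-Across": [("cat", 6), ("cot", 3), ("dog", 1)],
    "1-Down": [("oak", 7), ("ash", 2), ("cab", 1)],
}
# Each crossing (slot_a, index_a, slot_b, index_b) must share one letter.
crossings = [("1-Across", 1, "1-Down", 0)]

def solve(candidates, crossings):
    """Return the highest-scoring fill that satisfies every crossing."""
    slots = list(candidates)
    best, best_score = None, float("-inf")
    for combo in product(*(candidates[s] for s in slots)):
        fill = {s: word for s, (word, _) in zip(slots, combo)}
        if all(fill[a][i] == fill[b][j] for a, i, b, j in crossings):
            score = sum(sc for _, sc in combo)
            if score > best_score:
                best, best_score = fill, score
    return best, best_score

best, best_score = solve(candidates, crossings)
print(best, best_score)
```

In this toy grid the top-ranked across answer, "cat", is rejected because no strong down candidate crosses it; the grid constraint selects "cot" with "oak" instead. This mirrors the abstract's point that clues are hard to settle without the grid.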
The Role of Word Length in Semantic Topology
Title | The Role of Word Length in Semantic Topology |
Authors | Francesco Fumarola |
Abstract | A topological argument is presented concerning the structure of semantic space, based on the negative correlation between polysemy and word length. The resulting graph structure is applied to the modeling of free-recall experiments, resulting in predictions on the comparative values of recall probabilities. Associative recall is found to favor longer words whereas sequential recall is found to favor shorter words. Data from the PEERS experiments of Lohnas et al. (2015) and Healey and Kahana (2016) confirm both predictions, with correlation coefficients $r_{seq}= -0.17$ and $r_{ass}= +0.17$. The argument is then applied to predicting global properties of list recall, which leads to a novel explanation for the word-length effect based on the optimization of retrieval strategies. |
Tasks | |
Published | 2016-11-15 |
URL | http://arxiv.org/abs/1611.04842v1 |
http://arxiv.org/pdf/1611.04842v1.pdf | |
PWC | https://paperswithcode.com/paper/the-role-of-word-length-in-semantic-topology |
Repo | |
Framework | |
SDCA without Duality, Regularization, and Individual Convexity
Title | SDCA without Duality, Regularization, and Individual Convexity |
Authors | Shai Shalev-Shwartz |
Abstract | Stochastic Dual Coordinate Ascent is a popular method for solving regularized loss minimization for the case of convex losses. We describe variants of SDCA that do not require explicit regularization and do not rely on duality. We prove linear convergence rates even if individual loss functions are non-convex, as long as the expected loss is strongly convex. |
Tasks | |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01582v2 |
http://arxiv.org/pdf/1602.01582v2.pdf | |
PWC | https://paperswithcode.com/paper/sdca-without-duality-regularization-and |
Repo | |
Framework | |
Dynamic Probabilistic Network Based Human Action Recognition
Title | Dynamic Probabilistic Network Based Human Action Recognition |
Authors | Anne Veenendaal, Eddie Jones, Zhao Gang, Elliot Daly, Sumalini Vartak, Rahul Patwardhan |
Abstract | This paper examines the use of dynamic probabilistic networks (DPN) for human action recognition. The actions of lifting objects, walking in the room, sitting in the room and a neutral standing pose were used for testing the classification. The research used the dynamic interrelation between various regions of interest (ROI) on the human body (face, body, arms, legs) and the time-series-based events related to these ROIs. These dynamic links are then used to recognize the human behavioral aspects in the scene. First, a model is developed to identify the human activities in an indoor scene; this model depends on the key features and interlinks between the various dynamic events using DPNs. The sub-ROIs are classified with DPN to associate the combined interlink with a specific human activity. The recognition accuracy in indoor scenes (controlled lighting conditions) is compared with that under outdoor lighting conditions. The accuracy in outdoor scenes was lower than in the controlled environment. |
Tasks | Temporal Action Localization, Time Series |
Published | 2016-07-26 |
URL | http://arxiv.org/abs/1610.06395v1 |
http://arxiv.org/pdf/1610.06395v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-probabilistic-network-based-human |
Repo | |
Framework | |
ROAM: a Rich Object Appearance Model with Application to Rotoscoping
Title | ROAM: a Rich Object Appearance Model with Application to Rotoscoping |
Authors | Ondrej Miksik, Juan-Manuel Pérez-Rúa, Philip H. S. Torr, Patrick Pérez |
Abstract | Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help for this task, professional rotoscoping tools rely on parametric curves that offer the artists a much better interactive control on the definition, editing and manipulation of the segments of interest. Sticking to this prevalent rotoscoping paradigm, we propose a novel framework to capture and track the visual aspect of an arbitrary object in a scene, given a first closed outline of this object. This model combines a collection of local foreground/background appearance models spread along the outline, a global appearance model of the enclosed object and a set of distinctive foreground landmarks. The structure of this rich appearance model allows simple initialization, efficient iterative optimization with exact minimization at each step, and on-line adaptation in videos. We demonstrate qualitatively and quantitatively the merit of this framework through comparisons with tools based on either dynamic segmentation with a closed curve or pixel-wise binary labelling. |
Tasks | |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01495v1 |
http://arxiv.org/pdf/1612.01495v1.pdf | |
PWC | https://paperswithcode.com/paper/roam-a-rich-object-appearance-model-with |
Repo | |
Framework | |
Learning Features by Watching Objects Move
Title | Learning Features by Watching Objects Move |
Authors | Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan |
Abstract | This paper presents a novel yet intuitive approach to unsupervised feature learning. Inspired by the human visual system, we explore whether low-level motion-based grouping cues can be used to learn an effective visual representation. Specifically, we use unsupervised motion-based segmentation on videos to obtain segments, which we use as ‘pseudo ground truth’ to train a convolutional network to segment objects from a single frame. Given the extensive evidence that motion plays a key role in the development of the human visual system, we hope that this straightforward approach to unsupervised learning will be more effective than cleverly designed ‘pretext’ tasks studied in the literature. Indeed, our extensive experiments show that this is the case. When used for transfer learning on object detection, our representation significantly outperforms previous unsupervised approaches across multiple settings, especially when training data for the target task is scarce. |
Tasks | Object Detection, Transfer Learning |
Published | 2016-12-19 |
URL | http://arxiv.org/abs/1612.06370v2 |
http://arxiv.org/pdf/1612.06370v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-features-by-watching-objects-move |
Repo | |
Framework | |
Hybrid Ant Colony Optimization in solving Multi-Skill Resource-Constrained Project Scheduling Problem
Title | Hybrid Ant Colony Optimization in solving Multi-Skill Resource-Constrained Project Scheduling Problem |
Authors | Paweł B. Myszkowski, Marek E. Skowroński, Łukasz P. Olech, Krzysztof Oślizło |
Abstract | This paper presents a Hybrid Ant Colony Optimization (HAntCO) approach to solving the Multi–Skill Resource Constrained Project Scheduling Problem (MS–RCPSP). We propose a hybrid approach that links classical heuristic priority rules for project scheduling with Ant Colony Optimization (ACO). Furthermore, a novel approach for updating the pheromone value is proposed, based on both the best and worst solutions stored by ants. The objective of this paper is to investigate the usability and robustness of ACO and its hybrids with priority rules in solving MS–RCPSP. Experiments were performed using artificially created dataset instances based on real–world ones. We have published those instances so that they can be used as a benchmark. The presented results show that the ACO–based hybrid method is an efficient approach. The more directed search process of the hybrids makes this approach more stable and gives mostly better results than classical ACO. |
Tasks | |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08538v2 |
http://arxiv.org/pdf/1603.08538v2.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-ant-colony-optimization-in-solving |
Repo | |
Framework | |
Accelerating Eulerian Fluid Simulation With Convolutional Networks
Title | Accelerating Eulerian Fluid Simulation With Convolutional Networks |
Authors | Jonathan Tompson, Kristofer Schlachter, Pablo Sprechmann, Ken Perlin |
Abstract | Efficient simulation of the Navier-Stokes equations for fluid flow is a long standing problem in applied mathematics, for which state-of-the-art methods require large compute resources. In this work, we propose a data-driven approach that leverages the approximation power of deep-learning with the precision of standard solvers to obtain fast and highly realistic simulations. Our method solves the incompressible Euler equations using the standard operator splitting method, in which a large sparse linear system with many free parameters must be solved. We use a Convolutional Network with a highly tailored architecture, trained using a novel unsupervised learning framework to solve the linear system. We present real-time 2D and 3D simulations that outperform recently proposed data-driven methods; the obtained results are realistic and show good generalization properties. |
Tasks | |
Published | 2016-07-13 |
URL | http://arxiv.org/abs/1607.03597v6 |
http://arxiv.org/pdf/1607.03597v6.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-eulerian-fluid-simulation-with |
Repo | |
Framework | |
Exponential Concentration of a Density Functional Estimator
Title | Exponential Concentration of a Density Functional Estimator |
Authors | Shashank Singh, Barnabás Póczos |
Abstract | We analyze a plug-in estimator for a large class of integral functionals of one or more continuous probability densities. This class includes important families of entropy, divergence, mutual information, and their conditional versions. For densities on the $d$-dimensional unit cube $[0,1]^d$ that lie in a $\beta$-Hölder smoothness class, we prove our estimator converges at the rate $O \left( n^{-\frac{\beta}{\beta + d}} \right)$. Furthermore, we prove the estimator is exponentially concentrated about its mean, whereas most previous related results have proven only expected error bounds on estimators. |
Tasks | |
Published | 2016-03-28 |
URL | http://arxiv.org/abs/1603.08584v1 |
http://arxiv.org/pdf/1603.08584v1.pdf | |
PWC | https://paperswithcode.com/paper/exponential-concentration-of-a-density |
Repo | |
Framework | |
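The stated rate $n^{-\beta/(\beta+d)}$ degrades with the dimension $d$, which a short calculation makes concrete. The sketch below only evaluates the exponent from the abstract and ignores the constant hidden in the $O(\cdot)$ bound; the function names and the example values $\beta = 2$, $d \in \{2, 6\}$ are illustrative assumptions.

```python
def rate_exponent(beta, d):
    """Exponent beta / (beta + d) from the stated rate n^(-beta/(beta+d))."""
    return beta / (beta + d)

def sample_factor(beta, d, shrink=2.0):
    """Factor by which n must grow so that n^(-exponent) shrinks by `shrink`."""
    return shrink ** (1.0 / rate_exponent(beta, d))

# Hypothetical smoothness beta = 2 in dimensions 2 and 6.
for d in (2, 6):
    print(d, rate_exponent(2.0, d), sample_factor(2.0, d))
```

For $\beta = 2$, halving the bound takes 4 times as many samples when $d = 2$ but 16 times as many when $d = 6$, since the exponent drops from 1/2 to 1/4.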
A short proof that $O_2$ is an MCFL
Title | A short proof that $O_2$ is an MCFL |
Authors | Mark-Jan Nederhof |
Abstract | We present a new proof that $O_2$ is a multiple context-free language. It contrasts with a recent proof by Salvati (2015) in its avoidance of concepts that seem specific to two-dimensional geometry, such as the complex exponential function. Our simple proof creates realistic prospects of widening the results to higher dimensions. This finding is of central importance to the relation between extreme free word order and classes of grammars used to describe the syntax of natural language. |
Tasks | |
Published | 2016-03-11 |
URL | http://arxiv.org/abs/1603.03610v1 |
http://arxiv.org/pdf/1603.03610v1.pdf | |
PWC | https://paperswithcode.com/paper/a-short-proof-that-o_2-is-an-mcfl |
Repo | |
Framework | |
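For readers unfamiliar with $O_2$, a membership check helps fix the object under discussion. The sketch assumes the usual definition, which the abstract does not restate: words over the four symbols $a, \bar{a}, b, \bar{b}$ containing as many $a$ as $\bar{a}$ and as many $b$ as $\bar{b}$, in any order. Encoding $\bar{a}$ as `A` and $\bar{b}$ as `B` is a choice made here for illustration.

```python
from collections import Counter

def in_o2(word):
    """Check membership in O_2 under the assumed definition.

    'A' stands for a-bar and 'B' for b-bar; any other symbol is rejected.
    """
    if set(word) - set("aAbB"):
        return False
    counts = Counter(word)
    return counts["a"] == counts["A"] and counts["b"] == counts["B"]

print(in_o2("abAB"))  # counts balance in both dimensions
print(in_o2("aabA"))  # two a but one a-bar, and an unmatched b
```

Deciding membership is easy with counters; the result in the paper concerns something different, namely that a multiple context-free grammar can generate exactly this language.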