Paper Group ANR 262
Causally Correct Partial Models for Reinforcement Learning. aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption. On the Effectiveness of Richardson Extrapolation in Machine Learning. Understanding Crowd Behaviors in a Social Event by Passive WiFi Sensing and Data Mining. Policy Teaching via Environment Poisoning: Training-t …
Causally Correct Partial Models for Reinforcement Learning
Title | Causally Correct Partial Models for Reinforcement Learning |
Authors | Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing |
Abstract | In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent’s next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don’t model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02836v1 |
https://arxiv.org/pdf/2002.02836v1.pdf | |
PWC | https://paperswithcode.com/paper/causally-correct-partial-models-for-1 |
Repo | |
Framework | |
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
Title | aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption |
Authors | Chiranjib Sur |
Abstract | Region visual features enhance the generative capability of the machines based on features, however they lack proper interaction attentional perceptions and thus ends up with biased or uncorrelated sentences or pieces of misinformation. In this work, we propose Attribute Interaction-Tensor Product Representation (aiTPR) which is a convenient way of gathering more information through orthogonal combination and learning the interactions as physical entities (tensors) and improving the captions. Compared to previous works, where features are added up to undefined feature spaces, TPR helps in maintaining sanity in combinations and orthogonality helps in defining familiar spaces. We have introduced a new concept layer that defines the objects and also their interactions that can play a crucial role in determination of different descriptions. The interaction portions have contributed heavily for better caption quality and has out-performed different previous works on this domain and MSCOCO dataset. We introduced, for the first time, the notion of combining regional image features and abstracted interaction likelihood embedding for image captioning. |
Tasks | Image Captioning |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09545v1 |
https://arxiv.org/pdf/2001.09545v1.pdf | |
PWC | https://paperswithcode.com/paper/aitpr-attribute-interaction-tensor-product |
Repo | |
Framework | |
On the Effectiveness of Richardson Extrapolation in Machine Learning
Title | On the Effectiveness of Richardson Extrapolation in Machine Learning |
Authors | Francis Bach |
Abstract | Richardson extrapolation is a classical technique from numerical analysis that can improve the approximation error of an estimation method by combining linearly several estimates obtained from different values of one of its hyperparameters, without the need to know in details the inner structure of the original estimation method. The main goal of this paper is to study when Richardson extrapolation can be used within machine learning, beyond the existing applications to step-size adaptations in stochastic gradient descent. We identify two situations where Richardson interpolation can be useful: (1) when the hyperparameter is the number of iterations of an existing iterative optimization algorithm, with applications to averaged gradient descent and Frank-Wolfe algorithms (where we obtain asymptotically rates of $O(1/k^2)$ on polytopes, where $k$ is the number of iterations), and (2) when it is a regularization parameter, with applications to Nesterov smoothing techniques for minimizing non-smooth functions (where we obtain asymptotically rates close to $O(1/k^2)$ for non-smooth functions), and ridge regression. In all these cases, we show that extrapolation techniques come with no significant loss in performance, but with sometimes strong gains, and we provide theoretical justifications based on asymptotic developments for such gains, as well as empirical illustrations on classical problems from machine learning. |
Tasks | |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02835v1 |
https://arxiv.org/pdf/2002.02835v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-effectiveness-of-richardson |
Repo | |
Framework | |
Understanding Crowd Behaviors in a Social Event by Passive WiFi Sensing and Data Mining
Title | Understanding Crowd Behaviors in a Social Event by Passive WiFi Sensing and Data Mining |
Authors | Yuren Zhou, Billy Pik Lik Lau, Zann Koh, Chau Yuen, Benny Kai Kiat Ng |
Abstract | Understanding crowd behaviors in a large social event is crucial for event management. Passive WiFi sensing, by collecting WiFi probe requests sent from mobile devices, provides a better way to monitor crowds compared with people counters and cameras in terms of free interference, larger coverage, lower cost, and more information on people’s movement. In existing studies, however, not enough attention has been paid to the thorough analysis and mining of collected data. Especially, the power of machine learning has not been fully exploited. In this paper, therefore, we propose a comprehensive data analysis framework to fully analyze the collected probe requests to extract three types of patterns related to crowd behaviors in a large social event, with the help of statistics, visualization, and unsupervised machine learning. First, trajectories of the mobile devices are extracted from probe requests and analyzed to reveal the spatial patterns of the crowds’ movement. Hierarchical agglomerative clustering is adopted to find the interconnections between different locations. Next, k-means and k-shape clustering algorithms are applied to extract temporal visiting patterns of the crowds by days and locations, respectively. Finally, by combining with time, trajectories are transformed into spatiotemporal patterns, which reveal how trajectory duration changes over the length and how the overall trends of crowd movement change over time. The proposed data analysis framework is fully demonstrated using real-world data collected in a large social event. Results show that one can extract comprehensive patterns from data collected by a network of passive WiFi sensors. |
Tasks | |
Published | 2020-02-05 |
URL | https://arxiv.org/abs/2002.04401v1 |
https://arxiv.org/pdf/2002.04401v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-crowd-behaviors-in-a-social |
Repo | |
Framework | |
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning
Title | Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning |
Authors | Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla |
Abstract | We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings. The attacker can manipulate the rewards or the transition dynamics in the learning environment at training-time and is interested in doing so in a stealthy manner. We propose an optimization framework for finding an \emph{optimal stealthy attack} for different measures of attack cost. We provide sufficient technical conditions under which the attack is feasible and provide lower/upper bounds on the attack cost. We instantiate our attacks in two settings: (i) an \emph{offline} setting where the agent is doing planning in the poisoned environment, and (ii) an \emph{online} setting where the agent is learning a policy using a regret-minimization framework with poisoned feedback. Our results show that the attacker can easily succeed in teaching any target policy to the victim under mild conditions and highlight a significant security threat to reinforcement learning agents in practice. |
Tasks | |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12909v1 |
https://arxiv.org/pdf/2003.12909v1.pdf | |
PWC | https://paperswithcode.com/paper/policy-teaching-via-environment-poisoning |
Repo | |
Framework | |
Development of modeling and control strategies for an approximated Gaussian process
Title | Development of modeling and control strategies for an approximated Gaussian process |
Authors | Shisheng Cui, Chia-Jung Chang |
Abstract | The Gaussian process (GP) model, which has been extensively applied as priors of functions, has demonstrated excellent performance. The specification of a large number of parameters affects the computational efficiency and the feasibility of implementation of a control strategy. We propose a linear model to approximate GPs; this model expands the GP model by a series of basis functions. Several examples and simulation studies are presented to demonstrate the advantages of the proposed method. A control strategy is provided with the proposed linear model. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05105v1 |
https://arxiv.org/pdf/2002.05105v1.pdf | |
PWC | https://paperswithcode.com/paper/development-of-modeling-and-control |
Repo | |
Framework | |
ArcText: A Unified Text Approach to Describing Convolutional Neural Network Architectures
Title | ArcText: A Unified Text Approach to Describing Convolutional Neural Network Architectures |
Authors | Yanan Sun, Ziyao Ren |
Abstract | Convolutional Neural Networks (CNNs) have demonstrated their promising performance in the field of computer vision. The superiority of CNNs mainly relies on their architectures that are often manually crafted with extensive human expertise. Data mining on existing CNN can discover useful patterns and fundamental sub-comments from their architectures, providing researchers with strong prior knowledge to design proper CNN architectures when they have no expertise in CNNs. There have been various state-of-the-art data mining algorithms at hand, while there is only rare work that has been done for the mining. One of the main reasons is the barrier between CNN architectures and data mining algorithms. Specifically, the current CNN architecture descriptions cannot be exactly vectorized as input to data mining algorithms. In this paper, we propose a unified approach, named ArcText, to describing CNN architectures based on text. Particularly, four different units and an ordering method have been elaborately designed in ArcText, to uniquely describe the same architecture with sufficient information. Also, the resulted description can be exactly converted back to the corresponding CNN architecture. ArcText bridges the gap between CNN architectures and data mining researchers, and has the potentiality to be utilized to wider scenarios. |
Tasks | |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.10233v3 |
https://arxiv.org/pdf/2002.10233v3.pdf | |
PWC | https://paperswithcode.com/paper/arctext-an-unified-text-approach-to |
Repo | |
Framework | |
In Defense of Grid Features for Visual Question Answering
Title | In Defense of Grid Features for Visual Question Answering |
Authors | Huaizu Jiang, Ishan Misra, Marcus Rohrbach, Erik Learned-Miller, Xinlei Chen |
Abstract | Popularized as ‘bottom-up’ attention, bounding box (or region) based visual features have recently surpassed vanilla grid-based convolutional features as the de facto standard for vision and language tasks like visual question answering (VQA). However, it is not clear whether the advantages of regions (e.g. better localization) are the key reasons for the success of bottom-up attention. In this paper, we revisit grid features for VQA and find they can work surprisingly well-running more than an order of magnitude faster with the same accuracy. Through extensive experiments, we verify that this observation holds true across different VQA models, datasets, and generalizes well to other tasks like image captioning. As grid features make the model design and training process much simpler, this enables us to train them end-to-end and also use a more flexible network design. We learn VQA models end-to-end, from pixels directly to answers, and show that strong performance is achievable without using any region annotations in pre-training. We hope our findings help further improve the scientific understanding and the practical application of VQA. Code and features will be made available. |
Tasks | Image Captioning, Question Answering, Visual Question Answering |
Published | 2020-01-10 |
URL | https://arxiv.org/abs/2001.03615v1 |
https://arxiv.org/pdf/2001.03615v1.pdf | |
PWC | https://paperswithcode.com/paper/in-defense-of-grid-features-for-visual |
Repo | |
Framework | |
Segmentation and Optimal Region Selection of Physiological Signals using Deep Neural Networks and Combinatorial Optimization
Title | Segmentation and Optimal Region Selection of Physiological Signals using Deep Neural Networks and Combinatorial Optimization |
Authors | Jorge Oliveira, Margarida Carvalho, Diogo Marcelo Nogueira, Miguel Coimbra |
Abstract | Physiological signals, such as the electrocardiogram and the phonocardiogram are very often corrupted by noisy sources. Usually, artificial intelligent algorithms analyze the signal regardless of its quality. On the other hand, physicians use a completely orthogonal strategy. They do not assess the entire recording, instead they search for a segment where the fundamental and abnormal waves are easily detected, and only then a prognostic is attempted. Inspired by this fact, a new algorithm that automatically selects an optimal segment for a post-processing stage, according to a criteria defined by the user is proposed. In the process, a Neural Network is used to compute the output state probability distribution for each sample. Using the aforementioned quantities, a graph is designed, whereas state transition constraints are physically imposed into the graph and a set of constraints are used to retrieve a subset of the recording that maximizes the likelihood function, proposed by the user. The developed framework is tested and validated in two applications. In both cases, the system performance is boosted significantly, e.g in heart sound segmentation, sensitivity increases 2.4% when compared to the standard approaches in the literature. |
Tasks | Combinatorial Optimization |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07981v1 |
https://arxiv.org/pdf/2003.07981v1.pdf | |
PWC | https://paperswithcode.com/paper/segmentation-and-optimal-region-selection-of |
Repo | |
Framework | |
Reinforcement Learning for Combinatorial Optimization: A Survey
Title | Reinforcement Learning for Combinatorial Optimization: A Survey |
Authors | Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, Evgeny Burnaev |
Abstract | Combinatorial optimization (CO) is the workhorse of numerous important applications in operations research, engineering and other fields and, thus, has been attracting enormous attention from the research community for over a century. Many efficient solutions to common problems involve using hand-crafted heuristics to sequentially construct a solution. Therefore, it is intriguing to see how a combinatorial optimization problem can be formulated as a sequential decision making process and whether efficient heuristics can be implicitly learned by a reinforcement learning agent to find a solution. This survey explores the synergy between CO and reinforcement learning (RL) framework, which can become a promising direction for solving combinatorial problems. |
Tasks | Combinatorial Optimization, Decision Making |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.03600v1 |
https://arxiv.org/pdf/2003.03600v1.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-combinatorial |
Repo | |
Framework | |
Sparse Optimization for Green Edge AI Inference
Title | Sparse Optimization for Green Edge AI Inference |
Authors | Xiangyu Yang, Sheng Hua, Yuanming Shi, Hao Wang, Jun Zhang, Khaled B. Letaief |
Abstract | With the rapid upsurge of deep learning tasks at the network edge, effective edge artificial intelligence (AI) inference becomes critical to provide low-latency intelligent services for mobile users via leveraging the edge computing capability. In such scenarios, energy efficiency becomes a primary concern. In this paper, we present a joint inference task selection and downlink beamforming strategy to achieve energy-efficient edge AI inference through minimizing the overall power consumption consisting of both computation and transmission power consumption, yielding a mixed combinatorial optimization problem. By exploiting the inherent connections between the set of task selection and group sparsity structural transmit beamforming vector, we reformulate the optimization as a group sparse beamforming problem. To solve this challenging problem, we propose a log-sum function based three-stage approach. By adopting the log-sum function to enhance the group sparsity, a proximal iteratively reweighted algorithm is developed. Furthermore, we establish the global convergence analysis and provide the ergodic worst-case convergence rate for this algorithm. Simulation results will demonstrate the effectiveness of the proposed approach for improving energy efficiency in edge AI inference systems. |
Tasks | Combinatorial Optimization |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10080v2 |
https://arxiv.org/pdf/2002.10080v2.pdf | |
PWC | https://paperswithcode.com/paper/sparse-optimization-for-green-edge-ai |
Repo | |
Framework | |
Incremental Sampling Without Replacement for Sequence Models
Title | Incremental Sampling Without Replacement for Sequence Models |
Authors | Kensen Shi, David Bieber, Charles Sutton |
Abstract | Sampling is a fundamental technique, and sampling without replacement is often desirable when duplicate samples are not beneficial. Within machine learning, sampling is useful for generating diverse outputs from a trained model. We present an elegant procedure for sampling without replacement from a broad class of randomized programs, including generative neural models that construct outputs sequentially. Our procedure is efficient even for exponentially-large output spaces. Unlike prior work, our approach is incremental, i.e., samples can be drawn one at a time, allowing for increased flexibility. We also present a new estimator for computing expectations from samples drawn without replacement. We show that incremental sampling without replacement is applicable to many domains, e.g., program synthesis and combinatorial optimization. |
Tasks | Combinatorial Optimization, Program Synthesis |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09067v1 |
https://arxiv.org/pdf/2002.09067v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-sampling-without-replacement-for |
Repo | |
Framework | |
Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm
Title | Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm |
Authors | Dimitri Bertsekas |
Abstract | We consider an extension of the rollout algorithm that applies to constrained deterministic dynamic programming, including challenging combinatorial optimization problems. The algorithm relies on a suboptimal policy, called base heuristic. Under suitable assumptions, we show that if the base heuristic produces a feasible solution, the rollout algorithm has a cost improvement property: it produces a feasible solution, whose cost is no worse than the base heuristic’s cost. We then focus on multiagent problems, where the control at each stage consists of multiple components (one per agent), which are coupled either through the cost function or the constraints or both. We show that the cost improvement property is maintained with an alternative implementation that has greatly reduced computational requirements, and makes possible the use of rollout in problems with many agents. We demonstrate this alternative algorithm by applying it to layered graph problems that involve both a spatial and a temporal structure. We consider in some detail a prominent example of such problems: multidimensional assignment, where we use the auction algorithm for 2-dimensional assignment as a base heuristic. This auction algorithm is particularly well-suited for our context, because through the use of prices, it can advantageously use the solution of an assignment problem as a starting point for solving other related assignment problems, and this can greatly speed up the execution of the rollout algorithm. |
Tasks | Combinatorial Optimization |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07407v1 |
https://arxiv.org/pdf/2002.07407v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-multiagent-rollout-and |
Repo | |
Framework | |
Evolutionary Bi-objective Optimization for the Dynamic Chance-Constrained Knapsack Problem Based on Tail Bound Objectives
Title | Evolutionary Bi-objective Optimization for the Dynamic Chance-Constrained Knapsack Problem Based on Tail Bound Objectives |
Authors | Hirad Assimi, Oscar Harper, Yue Xie, Aneta Neumann, Frank Neumann |
Abstract | Real-world combinatorial optimization problems are often stochastic and dynamic. Therefore, it is essential to make optimal and reliable decisions with a holistic approach. In this paper, we consider the dynamic chance-constrained knapsack problem where the weight of each item is stochastic, the capacity constraint changes dynamically over time, and the objective is to maximize the total profit subject to the probability that total weight exceeds the capacity. We make use of prominent tail inequalities such as Chebyshev’s inequality, and Chernoff bound to approximate the probabilistic constraint. Our key contribution is to introduce an additional objective which estimates the minimal capacity bound for a given stochastic solution that still meets the chance constraint. This objective helps to cater for dynamic changes to the stochastic problem. We apply single- and multi-objective evolutionary algorithms to the problem and show how bi-objective optimization can help to deal with dynamic chance-constrained problems. |
Tasks | Combinatorial Optimization |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06766v1 |
https://arxiv.org/pdf/2002.06766v1.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-bi-objective-optimization-for |
Repo | |
Framework | |
Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization
Title | Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization |
Authors | Dmitrii Beloborodov, A. E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky |
Abstract | Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization. However, these algorithms may require careful hyperparameter tuning for each problem instance. We use a reinforcement learning agent in conjunction with a quantum-inspired algorithm to solve the Ising energy minimization problem, which is equivalent to the Maximum Cut problem. The agent controls the algorithm by tuning one of its parameters with the goal of improving recently seen solutions. We propose a new Rescaled Ranked Reward (R3) method that enables stable single-player version of self-play training that helps the agent to escape local optima. The training on any problem instance can be accelerated by applying transfer learning from an agent trained on randomly generated problems. Our approach allows sampling high-quality solutions to the Ising problem with high probability and outperforms both baseline heuristics and a black-box hyperparameter optimization approach. |
Tasks | Combinatorial Optimization, Hyperparameter Optimization, Transfer Learning |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04676v2 |
https://arxiv.org/pdf/2002.04676v2.pdf | |
PWC | https://paperswithcode.com/paper/reinforcement-learning-enhanced-quantum |
Repo | |
Framework | |