January 30, 2020

3483 words 17 mins read

Paper Group ANR 228

Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts. Unsupervised Representation Learning by Discovering Reliable Image Relations. Leveraging Knowledge Bases in LSTMs for Improving Machine Reading. Face Behavior à la carte: Expressions, Affect and Action Units in a Single Network. eSports Pro-Players Behavior During the Game Event …

Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts

Title Thinking Outside the Box: Generation of Unconstrained 3D Room Layouts
Authors Henry Howard-Jenkins, Shuda Li, Victor Prisacariu
Abstract We propose a method for room layout estimation that does not rely on the typical box approximation or Manhattan world assumption. Instead, we reformulate the geometry inference problem as an instance detection task, which we solve by directly regressing 3D planes using an R-CNN. We then use a variant of probabilistic clustering to combine the 3D planes regressed at each frame in a video sequence, with their respective camera poses, into a single global 3D room layout estimate. Finally, we showcase results which make no assumptions about perpendicular alignment, and so can deal effectively with walls in any alignment.
Tasks Room Layout Estimation
Published 2019-05-08
URL https://arxiv.org/abs/1905.03105v1
PDF https://arxiv.org/pdf/1905.03105v1.pdf
PWC https://paperswithcode.com/paper/thinking-outside-the-box-generation-of
Repo
Framework
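
As a rough illustration of the fusion step described in the abstract above, the sketch below greedily merges per-frame plane detections after transforming them into world coordinates using the camera poses. The thresholds, the greedy merge rule, and the plane parameterisation (unit normal n with offset d, n·x = d) are illustrative assumptions; the paper uses a variant of probabilistic clustering rather than this simplistic rule.

```python
# Hedged sketch: greedily merge per-frame plane detections, expressed as
# (unit normal n, offset d) with n.x = d, after mapping them into world
# coordinates via camera-to-world poses (R, t). Thresholds and the greedy
# rule are assumptions, not the paper's probabilistic clustering.
import numpy as np

def plane_to_world(n_cam, d_cam, R, t):
    """Transform plane n.x = d from camera to world frame (x_world = R @ x_cam + t)."""
    n_world = R @ n_cam
    d_world = d_cam + n_world @ t
    return n_world, d_world

def fuse_planes(detections, angle_thresh_deg=10.0, offset_thresh=0.1):
    """detections: iterable of (n_cam, d_cam, R, t) tuples over all frames."""
    clusters = []  # each entry: [normal_sum, offset_sum, count]
    for n_c, d_c, R, t in detections:
        n_w, d_w = plane_to_world(np.asarray(n_c, float), float(d_c), R, t)
        n_w /= np.linalg.norm(n_w)
        for c in clusters:
            mean_n = c[0] / np.linalg.norm(c[0])
            angle = np.degrees(np.arccos(np.clip(n_w @ mean_n, -1.0, 1.0)))
            if angle < angle_thresh_deg and abs(d_w - c[1] / c[2]) < offset_thresh:
                c[0] = c[0] + n_w; c[1] += d_w; c[2] += 1
                break
        else:
            clusters.append([n_w, d_w, 1])
    # return one (normal, offset) estimate per merged wall/plane
    return [(c[0] / np.linalg.norm(c[0]), c[1] / c[2]) for c in clusters]
```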

Unsupervised Representation Learning by Discovering Reliable Image Relations

Title Unsupervised Representation Learning by Discovering Reliable Image Relations
Authors Timo Milbich, Omair Ghori, Ferran Diego, Björn Ommer
Abstract Learning robust representations that allow us to reliably establish relations between images is of paramount importance for virtually all of computer vision. Annotating the quadratic number of pairwise relations between training images is simply not feasible, while unsupervised inference is prone to noise, thus leaving the vast majority of these relations unreliable. To nevertheless find those relations which can be reliably utilized for learning, we follow a divide-and-conquer strategy: we find reliable similarities by extracting compact groups of images, and reliable dissimilarities by partitioning these groups into subsets, converting the complicated overall problem into a few reliable local subproblems. For each of the subsets we obtain a representation by learning a mapping to a target feature space so that their reliable relations are kept. Transitivity relations between the subsets are then exploited to consolidate the local solutions into a concerted global representation. While iterating between grouping, partitioning, and learning, we can successively use more and more reliable relations which, in turn, improves our image representation. In experiments, our approach shows state-of-the-art performance on unsupervised classification on ImageNet with 46.0% and competes favorably on different transfer learning tasks on PASCAL VOC.
Tasks Representation Learning, Transfer Learning, Unsupervised Representation Learning
Published 2019-11-18
URL https://arxiv.org/abs/1911.07808v1
PDF https://arxiv.org/pdf/1911.07808v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-representation-learning-by-5
Repo
Framework
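
A toy sketch of the divide-and-conquer intuition, under assumptions: compact groups found by clustering yield reliable positive (similar) pairs, while members of other groups serve as reliable negatives. The group count, core size, and use of k-means are illustrative choices, not the authors' grouping and partitioning procedure.

```python
# Illustrative only: derive "reliable" similarity/dissimilarity pairs from
# compact group cores. Hyperparameters are placeholders.
import numpy as np
from sklearn.cluster import KMeans

def reliable_relations(features, n_groups=50, core_size=5):
    """features: (n_images, dim) array of image descriptors."""
    labels = KMeans(n_clusters=n_groups, n_init=10).fit_predict(features)
    positives, negatives = [], []
    for g in range(n_groups):
        idx = np.where(labels == g)[0]
        if len(idx) < 2:
            continue
        # compact core: members closest to the group mean are reliably similar
        sub = features[idx]
        order = np.argsort(np.linalg.norm(sub - sub.mean(axis=0), axis=1))
        core = idx[order[:core_size]]
        positives += [(a, b) for i, a in enumerate(core) for b in core[i + 1:]]
        # reliable dissimilarities: pair core members with samples from other groups
        other = np.where(labels != g)[0]
        negatives += [(core[0], o) for o in other[:core_size]]
    return positives, negatives
```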

Leveraging Knowledge Bases in LSTMs for Improving Machine Reading

Title Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
Authors Bishan Yang, Tom Mitchell
Abstract This paper focuses on how to take advantage of external knowledge bases (KBs) to improve recurrent neural networks for machine reading. Traditional methods that exploit knowledge from KBs encode knowledge as discrete indicator features. Not only do these features generalize poorly, but they require task-specific feature engineering to achieve good performance. We propose KBLSTM, a novel neural model that leverages continuous representations of KBs to enhance the learning of recurrent neural networks for machine reading. To effectively integrate background knowledge with information from the currently processed text, our model employs an attention mechanism with a sentinel to adaptively decide whether to attend to background knowledge and which information from KBs is useful. Experimental results show that our model achieves accuracies that surpass the previous state-of-the-art results for both entity extraction and event extraction on the widely used ACE2005 dataset.
Tasks Entity Extraction, Feature Engineering, Reading Comprehension
Published 2019-02-25
URL http://arxiv.org/abs/1902.09091v1
PDF http://arxiv.org/pdf/1902.09091v1.pdf
PWC https://paperswithcode.com/paper/leveraging-knowledge-bases-in-lstms-for
Repo
Framework
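
The sentinel mechanism can be pictured as attention over candidate KB concept embeddings plus one extra learned vector that lets the model attend to nothing. The sketch below is a minimal, hedged reading of that idea in PyTorch; the dimensions, dot-product scoring, and the way the KB context is added back to the LSTM state are simplifying assumptions, not the paper's exact equations.

```python
# Minimal sketch of attention with a "sentinel" over KB concept embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentinelKBAttention(nn.Module):
    def __init__(self, hidden_dim, kb_dim):
        super().__init__()
        self.proj = nn.Linear(kb_dim, hidden_dim, bias=False)
        self.sentinel = nn.Parameter(torch.zeros(hidden_dim))  # "attend to nothing"

    def forward(self, h_t, kb_embeds):
        """h_t: (hidden_dim,) LSTM state; kb_embeds: (n_concepts, kb_dim)."""
        candidates = self.proj(kb_embeds)                       # (n, hidden)
        keys = torch.cat([candidates, self.sentinel[None]], 0)  # add sentinel key
        weights = F.softmax(keys @ h_t, dim=0)                  # dot-product scoring
        kb_context = (weights[:-1, None] * candidates).sum(0)   # sentinel mass drops the KB
        return h_t + kb_context                                 # knowledge-augmented state

h = torch.randn(128)
kb = torch.randn(7, 50)                                         # 7 candidate KB concepts
out = SentinelKBAttention(128, 50)(h, kb)
```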

Face Behavior à la carte: Expressions, Affect and Action Units in a Single Network

Title Face Behavior à la carte: Expressions, Affect and Action Units in a Single Network
Authors Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou
Abstract Automatic facial behavior analysis has a long history of studies at the intersection of computer vision, physiology and psychology. However, it is only recently, with the collection of large-scale datasets and powerful machine learning methods such as deep neural networks, that automatic facial behavior analysis started to thrive. Three of its iconic tasks are automatic recognition of basic expressions (e.g. happy, sad, surprised), estimation of continuous emotions (e.g., valence and arousal), and detection of facial action units (activations of e.g. upper/inner eyebrows, nose wrinkles). Up until now, these tasks have mostly been studied independently, each collecting a dataset for the task at hand. We present the first and largest study of all facial behaviour tasks learned jointly in a single multi-task, multi-domain and multi-label network, which we call FaceBehaviorNet. For this, we utilize all publicly available datasets in the community (around 5M images) that study facial behaviour tasks in-the-wild. We demonstrate that jointly training a single end-to-end network for all tasks consistently outperforms training each of the single-task networks. Furthermore, we propose two simple strategies for coupling the tasks during training, co-annotation and distribution matching, and show the advantages of this approach. Finally, we show that FaceBehaviorNet has learned features that encapsulate all aspects of facial behaviour and can be successfully applied to tasks beyond the ones it was trained on (compound emotion recognition), in a zero- and few-shot learning setting.
Tasks Emotion Recognition, Few-Shot Learning
Published 2019-10-15
URL https://arxiv.org/abs/1910.11111v1
PDF https://arxiv.org/pdf/1910.11111v1.pdf
PWC https://paperswithcode.com/paper/face-behavior-a-la-carte-expressions-affect
Repo
Framework
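
A hedged sketch of the multi-task setup: one shared backbone with three heads (expressions, valence/arousal, action units) and a loss that is masked per sample according to which annotations its source dataset provides. The backbone, head sizes and loss weighting below are placeholders, and the co-annotation and distribution-matching strategies are not shown.

```python
# Hedged multi-task sketch with per-task loss masking across domains.
import torch
import torch.nn as nn

class FaceBehaviorNetSketch(nn.Module):
    def __init__(self, feat_dim=512, n_aus=17):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(),
                                      nn.Linear(3 * 112 * 112, feat_dim), nn.ReLU())
        self.expr_head = nn.Linear(feat_dim, 7)      # 7 basic expressions
        self.va_head = nn.Linear(feat_dim, 2)        # valence, arousal in [-1, 1]
        self.au_head = nn.Linear(feat_dim, n_aus)    # action-unit logits

    def forward(self, x):
        f = self.backbone(x)
        return self.expr_head(f), torch.tanh(self.va_head(f)), self.au_head(f)

def multitask_loss(outputs, targets, masks):
    """masks[k] is 1 where the sample's source dataset annotates task k, else 0."""
    expr_logits, va, au_logits = outputs
    l_expr = nn.functional.cross_entropy(expr_logits, targets["expr"], reduction="none")
    l_va = ((va - targets["va"]) ** 2).mean(dim=1)
    l_au = nn.functional.binary_cross_entropy_with_logits(
        au_logits, targets["au"], reduction="none").mean(dim=1)
    return (masks["expr"] * l_expr + masks["va"] * l_va + masks["au"] * l_au).mean()
```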

eSports Pro-Players Behavior During the Game Events: Statistical Analysis of Data Obtained Using the Smart Chair

Title eSports Pro-Players Behavior During the Game Events: Statistical Analysis of Data Obtained Using the Smart Chair
Authors Anton Smerdov, Evgeny Burnaev, Andrey Somov
Abstract Today’s competition between professional eSports teams is so strong that in-depth analysis of players’ performance is crucial for creating a powerful team. There are two main approaches to such an assessment: obtaining features and metrics directly from the in-game data, or collecting detailed information about the player, including data on his/her physical training. While the correlation between a player’s skill and in-game data has already been covered in many papers, there are very few works analyzing an eSports athlete’s skill through his/her physical behavior. We propose a smart chair platform that collects data on the person’s behavior in the chair using an integrated accelerometer, gyroscope and magnetometer. We extract the important game events to determine the players’ physical reactions to them. The obtained data are used to train machine learning models that distinguish between low-skilled and high-skilled players. We identify the key features during the game and discuss the results.
Tasks
Published 2019-08-18
URL https://arxiv.org/abs/1908.06402v1
PDF https://arxiv.org/pdf/1908.06402v1.pdf
PWC https://paperswithcode.com/paper/esports-pro-players-behavior-during-the-game
Repo
Framework
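
The pipeline can be approximated as: cut fixed windows of chair IMU data around game-event timestamps, summarise each window with simple movement statistics, and train a classifier to separate high- from low-skilled players. The window length, feature set and choice of a random forest below are assumptions, not the paper's exact setup.

```python
# Illustrative sketch: event-windowed IMU features + a standard classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def event_features(imu, event_times, rate_hz=100, window_s=5.0):
    """imu: (T, 9) accel+gyro+magnetometer samples; event_times: seconds into the session."""
    half = int(rate_hz * window_s / 2)
    feats = []
    for t in event_times:
        c = int(t * rate_hz)
        w = imu[max(0, c - half): c + half]
        # mean, variability, and jerkiness of chair movement around the event
        feats.append(np.concatenate([w.mean(0), w.std(0),
                                     np.abs(np.diff(w, axis=0)).mean(0)]))
    return np.vstack(feats)

# toy usage: rows are event windows, y = 1 for high-skilled players (made-up data)
rng = np.random.default_rng(0)
X = event_features(rng.normal(size=(60_000, 9)), event_times=[30, 120, 300])
y = np.array([0, 1, 1])
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
```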

Generalizing from a few environments in safety-critical reinforcement learning

Title Generalizing from a few environments in safety-critical reinforcement learning
Authors Zachary Kenton, Angelos Filos, Owain Evans, Yarin Gal
Abstract Before deploying autonomous agents in the real world, we need to be confident they will perform safely in novel situations. Ideally, we would expose agents to a very wide range of situations during training, allowing them to learn about every possible danger, but this is often impractical. This paper investigates safety and generalization from a limited number of training environments in deep reinforcement learning (RL). We find RL algorithms can fail dangerously on unseen test environments even when performing perfectly on training environments. Firstly, in a gridworld setting, we show that catastrophes can be significantly reduced with simple modifications, including ensemble model averaging and the use of a blocking classifier. In the more challenging CoinRun environment we find similar methods do not significantly reduce catastrophes. However, we do find that the uncertainty information from the ensemble is useful for predicting whether a catastrophe will occur within a few steps and hence whether human intervention should be requested.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01475v1
PDF https://arxiv.org/pdf/1907.01475v1.pdf
PWC https://paperswithcode.com/paper/generalizing-from-a-few-environments-in
Repo
Framework
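
A minimal sketch of two ingredients mentioned above: ensemble averaging for action selection, and ensemble disagreement as a signal for requesting human intervention. The tabular Q-ensemble and the disagreement threshold are illustrative assumptions; the paper works with deep RL agents and additionally uses a blocking classifier.

```python
# Hedged sketch: act on the ensemble mean, flag high-uncertainty states for a human.
import numpy as np

def act_with_uncertainty(q_ensemble, state_idx, disagreement_thresh=0.5):
    """q_ensemble: (n_models, n_states, n_actions) table of Q-values."""
    qs = q_ensemble[:, state_idx, :]              # (n_models, n_actions)
    mean_q = qs.mean(axis=0)
    action = int(mean_q.argmax())                 # ensemble-averaged greedy action
    disagreement = qs[:, action].std()            # proxy for epistemic uncertainty
    request_human = disagreement > disagreement_thresh
    return action, request_human

rng = np.random.default_rng(1)
q_ens = rng.normal(size=(5, 10, 4))               # 5 models, 10 states, 4 actions
print(act_with_uncertainty(q_ens, state_idx=3))
```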

Model-Based Reinforcement Learning for Whole-Chain Recommendations

Title Model-Based Reinforcement Learning for Whole-Chain Recommendations
Authors Xiangyu Zhao, Long Xia, Dawei Yin, Jiliang Tang
Abstract With the recent prevalence of Reinforcement Learning (RL), there has been tremendous interest in developing RL-based recommender systems. In practical recommendation sessions, users sequentially access multiple scenarios, such as the entrance pages and the item detail pages, and each scenario has its own recommendation strategy. However, the majority of existing RL-based recommender systems focus on optimizing one strategy for all scenarios or separately optimizing each strategy, which can lead to sub-optimal overall performance. In this paper, we study the recommendation problem with multiple (consecutive) scenarios, i.e., whole-chain recommendations. We propose a multi-agent reinforcement learning based approach (DeepChain), which can capture the sequential correlation among different scenarios and jointly optimize multiple recommendation strategies. To be specific, all recommender agents share the same memory of users’ historical behaviors, and they work collaboratively to maximize the overall reward of a session. Note that jointly optimizing multiple recommendation strategies with the existing model-free RL model \cite{feng2018learning} faces two challenges: (i) it requires huge amounts of user behavior data, and (ii) the distribution of rewards (users’ feedback) is extremely unbalanced. In this paper, we introduce model-based reinforcement learning techniques to reduce the training data requirement and execute more accurate strategy updates. The experimental results based on a real e-commerce platform demonstrate the effectiveness of the proposed framework.
Tasks Multi-agent Reinforcement Learning, Recommendation Systems
Published 2019-02-11
URL https://arxiv.org/abs/1902.03987v2
PDF https://arxiv.org/pdf/1902.03987v2.pdf
PWC https://paperswithcode.com/paper/model-based-reinforcement-learning-for-whole
Repo
Framework
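
The whole-chain idea of agents sharing one memory of the user's history can be pictured as scenario-specific policy heads on top of a shared sequence encoder, as in the sketch below. This is only an architectural analogy under assumptions (GRU encoder, two scenarios, logits over an item catalogue); DeepChain's model-based training and reward design are not reproduced.

```python
# Hedged sketch: two scenario agents sharing one encoder of user history.
import torch
import torch.nn as nn

class SharedHistoryAgents(nn.Module):
    def __init__(self, item_dim=32, hidden=64, n_items=1000):
        super().__init__()
        self.embed = nn.Embedding(n_items, item_dim)
        self.history_enc = nn.GRU(item_dim, hidden, batch_first=True)  # shared memory
        self.entrance_head = nn.Linear(hidden, n_items)   # entrance-page policy logits
        self.detail_head = nn.Linear(hidden, n_items)     # item-detail-page policy logits

    def forward(self, history_ids, scenario):
        _, h = self.history_enc(self.embed(history_ids))
        h = h.squeeze(0)                                   # (batch, hidden)
        head = self.entrance_head if scenario == "entrance" else self.detail_head
        return head(h)

agents = SharedHistoryAgents()
logits = agents(torch.randint(0, 1000, (4, 20)), scenario="entrance")  # 4 sessions, 20 past items
```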

Adversarially Learning a Local Anatomical Prior: Vertebrae Labelling with 2D reformations

Title Adversarially Learning a Local Anatomical Prior: Vertebrae Labelling with 2D reformations
Authors Anjany Sekuboyina, Markus Rempfler, Alexander Valentinitsch, Jan S. Kirschke, Bjoern H. Menze
Abstract Robust localisation and identification of vertebrae, jointly termed vertebrae labelling, in computed tomography (CT) images is an essential component of automated spine analysis. Current approaches for this task mostly work with 3D scans and comprise a sequence of multiple networks. In contrast, our approach relies only on 2D reformations, enabling us to design an end-to-end trainable, standalone network. Our contributions include: (1) inspired by the workflow of human experts, a novel butterfly-shaped network architecture (termed Btrfly net) that efficiently combines information across sufficiently informative sagittal and coronal reformations; (2) two adversarial training regimes that encode an anatomical prior of the spine’s shape into the Btrfly net, each enforcing the prior in a distinct manner. We evaluate our approach on a public benchmarking dataset of 302 CT scans, achieving performance comparable to state-of-the-art methods (identification rate of >88%) without any post-processing stages. Addressing its translation to clinical settings, we introduce an in-house dataset of 65 CT scans with higher data variability, and discuss refinements that render our approach robust to such scenarios.
Tasks Computed Tomography (CT)
Published 2019-02-06
URL http://arxiv.org/abs/1902.02205v2
PDF http://arxiv.org/pdf/1902.02205v2.pdf
PWC https://paperswithcode.com/paper/adversarially-learning-a-local-anatomical
Repo
Framework
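
A rough, hedged sketch of the butterfly shape: two 2D encoder arms for the sagittal and coronal reformations, a shared bottleneck where their features are fused, and two decoder arms producing per-view vertebra heatmaps. Channel counts and depths are placeholders, and the adversarial prior-encoding regimes are omitted.

```python
# Toy butterfly-shaped network: two encoder arms, fused bottleneck, two decoder arms.
import torch
import torch.nn as nn

def arm(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(),
                         nn.MaxPool2d(2))

class BtrflySketch(nn.Module):
    def __init__(self, n_vertebrae=24):
        super().__init__()
        self.enc_sag, self.enc_cor = arm(1, 16), arm(1, 16)
        self.fuse = nn.Conv2d(32, 32, 1)                      # joint bottleneck
        self.dec_sag = nn.ConvTranspose2d(32, n_vertebrae, 2, stride=2)
        self.dec_cor = nn.ConvTranspose2d(32, n_vertebrae, 2, stride=2)

    def forward(self, sag, cor):
        z = torch.relu(self.fuse(torch.cat([self.enc_sag(sag), self.enc_cor(cor)], dim=1)))
        return self.dec_sag(z), self.dec_cor(z)               # per-view vertebra heatmaps

net = BtrflySketch()
heat_sag, heat_cor = net(torch.randn(1, 1, 128, 64), torch.randn(1, 1, 128, 64))
```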

DLTPy: Deep Learning Type Inference of Python Function Signatures using Natural Language Context

Title DLTPy: Deep Learning Type Inference of Python Function Signatures using Natural Language Context
Authors Casper Boone, Niels de Bruin, Arjan Langerak, Fabian Stelmach
Abstract Due to the rise of machine learning, Python is an increasingly popular programming language. Python, however, is dynamically typed. Dynamic typing has been shown to have drawbacks as a project grows, even though it improves developer productivity. To have the benefits of static typing combined with high developer productivity, types need to be inferred. In this paper, we present DLTPy: a deep learning type inference solution for predicting the types in function signatures based on the natural language context (identifier names, comments and return expressions) of a function. We found that DLTPy is effective, with a top-3 F1-score of 91.6%. This means that in most cases the correct type is within the top-3 predictions. We conclude that the natural language contained in comments and return expressions is beneficial for predicting types more accurately. DLTPy does not significantly outperform or underperform the previous work NL2Type for JavaScript, but it does show that similar prediction is possible for Python.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00680v1
PDF https://arxiv.org/pdf/1912.00680v1.pdf
PWC https://paperswithcode.com/paper/dltpy-deep-learning-type-inference-of-python
Repo
Framework
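
To make the task concrete, the sketch below treats return-type prediction as text classification over a function's natural-language context and evaluates with a top-3 criterion, mirroring the top-3 metric above. It is a bag-of-words stand-in under assumptions; DLTPy itself uses word embeddings and a deep model, and the toy contexts below are invented.

```python
# Illustrative baseline: type prediction as text classification with top-3 output.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

contexts = [
    "count items return len items",        # -> int
    "get name return self name string",    # -> str
    "is valid return flag boolean",        # -> bool
    "total price return sum prices",       # -> float
]
types = ["int", "str", "bool", "float"]

vec = TfidfVectorizer()
X = vec.fit_transform(contexts)
clf = LogisticRegression(max_iter=1000).fit(X, types)

probs = clf.predict_proba(vec.transform(["return number of users"]))[0]
top3 = [clf.classes_[i] for i in np.argsort(probs)[::-1][:3]]
print(top3)   # a prediction counts as correct if the true type appears in the top 3
```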

Evaluating Artificial Systems for Pairwise Ranking Tasks Sensitive to Individual Differences

Title Evaluating Artificial Systems for Pairwise Ranking Tasks Sensitive to Individual Differences
Authors Xing Liu, Takayuki Okatani
Abstract Owing to the advancement of deep learning, artificial systems now rival humans in several pattern recognition tasks, such as visual recognition of object categories. However, this is only the case for tasks in which correct answers exist independently of human perception. There is another type of task, in which the target of prediction is human perception itself and individual differences are common. For such tasks there are no longer single “correct” answers to predict, which makes evaluation of artificial systems difficult. In this paper, focusing on pairwise ranking tasks sensitive to individual differences, we propose an evaluation method. Given a ranking result for multiple item pairs generated by an artificial system, our method quantifies the probability that the same ranking result would be generated by humans, and judges whether it is distinguishable from human-generated results. We introduce a probabilistic model of human ranking behavior and present an efficient computation method for the judgment. To estimate model parameters accurately from small samples, we present a method that uses confidence scores given by annotators when ranking each item pair. Taking as an example the task of ranking image pairs according to material attributes of objects, we demonstrate how the proposed method works.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.13560v1
PDF https://arxiv.org/pdf/1905.13560v1.pdf
PWC https://paperswithcode.com/paper/evaluating-artificial-systems-for-pairwise
Repo
Framework
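
One way to picture the judgment step: if pairs are treated as independent and p_i is the estimated probability that a human orders pair i the same way as the system, the log-probability of a human reproducing the system's ranking is a sum of log terms, which can then be compared against scores of human-generated rankings. The independence assumption and toy numbers below are illustrative; the paper's probabilistic model and its use of annotator confidence scores are richer.

```python
# Hedged sketch: log-probability that a human reproduces the system's pairwise ranking.
import numpy as np

def log_prob_human_agrees(system_choices, human_agree_probs):
    """system_choices: 1 if the system picked the majority-human order for a pair, else 0."""
    p = np.where(system_choices == 1, human_agree_probs, 1.0 - human_agree_probs)
    return np.log(np.clip(p, 1e-12, 1.0)).sum()

# toy example: per-pair agreement probabilities estimated from annotator votes
agree_p = np.array([0.9, 0.7, 0.55, 0.8])
system = np.array([1, 1, 0, 1])
print(log_prob_human_agrees(system, agree_p))
```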

PharML.Bind: Pharmacologic Machine Learning for Protein-Ligand Interactions

Title PharML.Bind: Pharmacologic Machine Learning for Protein-Ligand Interactions
Authors Aaron D. Vose, Jacob Balma, Damon Farnsworth, Kaylie Anderson, Yuri K. Peterson
Abstract Is it feasible to create an analysis paradigm that can analyze and then accurately and quickly predict known drugs from experimental data? PharML.Bind is a machine learning toolkit which is able to accomplish this feat. Utilizing deep neural networks and big data, PharML.Bind correlates experimentally-derived drug affinities and protein-ligand X-ray structures to create novel predictions. The utility of PharML.Bind is in its application as a rapid, accurate, and robust prediction platform for discovery and personalized medicine. This paper demonstrates that graph neural networks (GNNs) can be trained to screen hundreds of thousands of compounds against thousands of targets in minutes, a vastly shorter time than previous approaches. This manuscript presents results from training and testing using the entirety of BindingDB after cleaning; this includes a test set with 19,708 X-ray structures and 247,633 drugs, leading to 2,708,151 unique protein-ligand pairings. PharML.Bind achieves a prodigious 98.3% accuracy on this test set in under 25 minutes. PharML.Bind is premised on the following key principles: 1) speed and a high enrichment factor per unit compute time, provided by high-quality training data combined with a novel GNN architecture and use of high-performance computing resources, 2) the ability to generalize to proteins and drugs outside of the training set, including those with unknown active sites, through the use of an active-site-agnostic GNN mapping, and 3) the ability to be easily integrated as a component of increasingly-complex prediction and analysis pipelines. PharML.Bind represents a timely and practical approach to leverage the power of machine learning to efficiently analyze and predict drug action on any practical scale and will provide utility in a variety of discovery and medical applications.
Tasks
Published 2019-10-23
URL https://arxiv.org/abs/1911.06105v1
PDF https://arxiv.org/pdf/1911.06105v1.pdf
PWC https://paperswithcode.com/paper/pharmlbind-pharmacologic-machine-learning-for
Repo
Framework
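
As a toy analogy only (not PharML.Bind's architecture), the sketch below runs one round of message passing over a ligand graph given as an adjacency matrix, pools it into a fixed-size vector, concatenates a protein descriptor, and scores binding. Feature sizes and the single-layer GNN are assumptions.

```python
# Toy, hedged sketch of GNN-based protein-ligand binding prediction.
import torch
import torch.nn as nn

class BindSketch(nn.Module):
    def __init__(self, atom_dim=16, protein_dim=64, hidden=32):
        super().__init__()
        self.msg = nn.Linear(atom_dim, hidden)
        self.upd = nn.Linear(atom_dim + hidden, hidden)
        self.out = nn.Linear(hidden + protein_dim, 1)

    def forward(self, atom_feats, adj, protein_vec):
        messages = adj @ torch.relu(self.msg(atom_feats))      # aggregate neighbor messages
        nodes = torch.relu(self.upd(torch.cat([atom_feats, messages], dim=-1)))
        ligand_vec = nodes.mean(dim=0)                         # graph-level pooling
        return torch.sigmoid(self.out(torch.cat([ligand_vec, protein_vec])))

model = BindSketch()
score = model(torch.randn(9, 16), torch.eye(9), torch.randn(64))  # 9-atom toy ligand
```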

Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

Title Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression
Authors Puning Zhao, Lifeng Lai
Abstract The k Nearest Neighbor (kNN) method is a simple and popular statistical method for classification and regression. For both classification and regression problems, existing works have shown that, if the distribution of the feature vector has bounded support and the probability density function is bounded away from zero on its support, the convergence rate of the standard kNN method, in which k is the same for all test samples, is minimax optimal. In contrast, if the distribution has unbounded support, we show that there is a gap between the convergence rate achieved by the standard kNN method and the minimax bound. To close this gap, we propose an adaptive kNN method, in which a different k is selected for each sample. Our selection rule does not require precise knowledge of the underlying distribution of features. The proposed method significantly outperforms the standard one. We characterize the convergence rate of the proposed adaptive method and show that it matches the minimax lower bound.
Tasks
Published 2019-10-22
URL https://arxiv.org/abs/1910.10513v1
PDF https://arxiv.org/pdf/1910.10513v1.pdf
PWC https://paperswithcode.com/paper/minimax-rate-optimal-adaptive-nearest
Repo
Framework
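
A hedged illustration of the adaptive idea: use a larger k where the query lies in a dense region of the training data and a smaller k in the tails, here with the number of training points inside a fixed radius serving as a density proxy. The radius, proportionality constant and clipping range are illustrative; the paper's selection rule and its minimax analysis differ.

```python
# Sketch of adaptive-k nearest-neighbor regression with a density-driven k.
import numpy as np

def adaptive_knn_predict(X_train, y_train, x_query, radius=1.0, c=0.5, k_min=1, k_max=50):
    dists = np.linalg.norm(X_train - x_query, axis=1)
    local_count = int((dists < radius).sum())         # density proxy around the query
    k = int(np.clip(c * local_count, k_min, k_max))   # larger k in dense regions
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean(), k

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = X.sum(axis=1) + 0.1 * rng.normal(size=500)
pred, k_used = adaptive_knn_predict(X, y, np.array([0.2, -0.1]))
```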

Machine Learning Approach for Air Shower Recognition in EUSO-SPB Data

Title Machine Learning Approach for Air Shower Recognition in EUSO-SPB Data
Authors Michal Vrábel, Ján Genči, Pavol Bobik, Francesca Bisconti
Abstract The main goal of the Extreme Universe Space Observatory on a Super Pressure Balloon (EUSO-SPB1) was to observe, from above, extensive air showers caused by ultra-high-energy cosmic rays. EUSO-SPB1 uses a fluorescence detector that observes the atmosphere in a nadir observation mode from a near-space altitude. During the 12-day flight, an onboard first-level trigger detected more than 175,000 candidate events. This paper presents an approach to recognizing air showers in this dataset. The approach uses a feature extraction method to create a simpler representation of an event, and then applies established machine learning techniques to classify the data into at least two classes: shower and noise. The machine learning models are trained on a set of air shower simulations superimposed on the background observed during the flight, together with a set of events from the flight. We present the efficiency of the method on datasets of simulated events. The flight data events are also used in unsupervised learning methods to identify groups of events with similar features. The presented methods allow us to shorten the candidate event list and, thanks to the groups of similar events identified by the unsupervised methods, make the classification of the triggered events simpler.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03680v1
PDF https://arxiv.org/pdf/1909.03680v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-approach-for-air-shower
Repo
Framework
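
The pipeline sketched below is an illustrative reading of the abstract: summarise each triggered event with a few hand-crafted features, train a supervised classifier on labelled (simulated) events, and cluster the flight events to surface groups of similar candidates. The specific features, classifier and cluster count are assumptions, and the data below are toy stand-ins.

```python
# Illustrative sketch: event summary features + supervised and unsupervised steps.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.cluster import KMeans

def event_features(frames):
    """frames: (n_frames, H, W) photon counts for one triggered event."""
    per_frame_max = frames.max(axis=(1, 2))
    return np.array([frames.sum(), per_frame_max.max(),
                     per_frame_max.argmax(), frames.std()])

rng = np.random.default_rng(0)
sim_events = [rng.poisson(1.0, (48, 36, 36)) for _ in range(200)]   # toy stand-ins
labels = rng.integers(0, 2, 200)                                    # 1 = shower, 0 = noise
X = np.vstack([event_features(e) for e in sim_events])

clf = GradientBoostingClassifier().fit(X, labels)                   # shower vs. noise
groups = KMeans(n_clusters=5, n_init=10).fit_predict(X)             # groups of similar events
```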

Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning

Title Enabling Robots to Understand Incomplete Natural Language Instructions Using Commonsense Reasoning
Authors Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz
Abstract Enabling robots to understand instructions provided via spoken natural language would facilitate interaction between robots and people in a variety of settings in homes and workplaces. However, natural language instructions are often missing information that would be obvious to a human based on environmental context and common sense, and hence does not need to be explicitly stated. In this paper, we introduce Language-Model-based Commonsense Reasoning (LMCR), a new method which enables a robot to listen to a natural language instruction from a human, observe the environment around it, and automatically fill in information missing from the instruction using environmental context and a new commonsense reasoning approach. Our approach first converts an instruction provided as unconstrained natural language into a form that a robot can understand by parsing it into verb frames. Our approach then fills in missing information in the instruction by observing objects in its vicinity and leveraging commonsense reasoning. To learn commonsense reasoning automatically, our approach distills knowledge from large unstructured textual corpora by training a language model. Our results show the feasibility of a robot learning commonsense knowledge automatically from web-based textual corpora, and the power of learned commonsense reasoning models in enabling a robot to autonomously perform tasks based on incomplete natural language instructions.
Tasks Common Sense Reasoning, Language Modelling
Published 2019-04-29
URL http://arxiv.org/abs/1904.12907v1
PDF http://arxiv.org/pdf/1904.12907v1.pdf
PWC https://paperswithcode.com/paper/enabling-robots-to-understand-incomplete
Repo
Framework
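
The missing-argument idea can be illustrated with a toy stand-in for the trained language model: rank the objects the robot observes as fillers for the incomplete verb frame using statistics gathered from text. Here a simple sentence co-occurrence count plays the role of the language model, and the corpus and frame are invented; LMCR's parsing and scoring are more involved.

```python
# Toy sketch: fill a missing verb-frame argument using corpus co-occurrence counts.
from collections import Counter

corpus = [
    "pour the milk into the cup",
    "pour the water into the sink",
    "put the book on the shelf",
]
cooccur = Counter()
for sent in corpus:
    words = set(sent.split())
    cooccur.update((a, b) for a in words for b in words if a != b)

frame = {"verb": "pour", "object": "milk", "target": None}   # "pour the milk" — target missing
observed = ["cup", "shelf", "sink"]                          # objects the robot sees nearby
best = max(observed, key=lambda o: cooccur[(frame["object"], o)])
print(best)   # -> "cup", the most plausible target given the toy corpus
```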

Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks

Title Exploring the Origins and Prevalence of Texture Bias in Convolutional Neural Networks
Authors Katherine L. Hermann, Simon Kornblith
Abstract Recent work has indicated that, unlike humans, ImageNet-trained CNNs tend to classify images by texture rather than shape. How pervasive is this bias, and where does it come from? We find that, when trained on datasets of images with conflicting shape and texture, the inductive bias of CNNs often favors shape; in general, models learn shape at least as easily as texture. Moreover, although ImageNet training leads to classifier weights that classify ambiguous images according to texture, shape is decodable from the hidden representations of ImageNet networks. Turning to the question of the origin of texture bias, we identify consistent effects of task, architecture, preprocessing, and hyperparameters. Different self-supervised training objectives and different architectures have significant and largely independent effects on the shape bias of the learned representations. Among modern ImageNet architectures, we find that shape bias is positively correlated with ImageNet accuracy. Random-crop data augmentation encourages reliance on texture: Models trained without crops have lower accuracy but higher shape bias. Finally, hyperparameter combinations that yield similar accuracy are associated with vastly different levels of shape bias. Our results suggest general strategies to reduce texture bias in neural networks.
Tasks Data Augmentation
Published 2019-11-20
URL https://arxiv.org/abs/1911.09071v1
PDF https://arxiv.org/pdf/1911.09071v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-origins-and-prevalence-of
Repo
Framework
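
Shape bias in this line of work is typically measured on cue-conflict images whose shape comes from one class and texture from another: among the conflict images the model assigns to either cue's class, the shape-bias score is the fraction assigned to the shape class. The sketch below computes that score from precomputed predictions; the toy labels are made up for illustration.

```python
# Sketch of the standard shape-bias measurement on cue-conflict images.
import numpy as np

def shape_bias(predictions, shape_labels, texture_labels):
    """All arguments are integer class arrays over a set of cue-conflict images."""
    preds = np.asarray(predictions)
    shape_hits = preds == np.asarray(shape_labels)
    texture_hits = preds == np.asarray(texture_labels)
    decided = shape_hits | texture_hits          # ignore answers matching neither cue
    return shape_hits[decided].sum() / max(decided.sum(), 1)

# toy numbers: 6 conflict images, 3 shape decisions, 2 texture decisions, 1 neither
print(shape_bias([0, 1, 2, 5, 9, 3], [0, 1, 2, 3, 4, 8], [7, 6, 5, 5, 9, 9]))  # -> 0.6
```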