Paper Group AWR 122
Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors
Title | Prior Convictions: Black-Box Adversarial Attacks with Bandits and Priors |
Authors | Andrew Ilyas, Logan Engstrom, Aleksander Madry |
Abstract | We study the problem of generating adversarial examples in a black-box setting in which only loss-oracle access to a model is available. We introduce a framework that conceptually unifies much of the existing work on black-box attacks, and we demonstrate that the current state-of-the-art methods are optimal in a natural sense. Despite this optimality, we show how to improve black-box attacks by bringing a new element into the problem: gradient priors. We give a bandit optimization-based algorithm that allows us to seamlessly integrate any such priors, and we explicitly identify and incorporate two examples. The resulting methods use two to four times fewer queries and fail two to five times less often than the current state-of-the-art. |
Tasks | |
Published | 2018-07-20 |
URL | http://arxiv.org/abs/1807.07978v3 |
PDF | http://arxiv.org/pdf/1807.07978v3.pdf |
PWC | https://paperswithcode.com/paper/prior-convictions-black-box-adversarial |
Repo | https://github.com/mllab-adv-attack/lazy-attack |
Framework | tf |
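
The bandit framing above reduces to a short loop: antithetic loss queries around a running gradient prior refine the prior, and the image takes a signed step along it. Below is a minimal numpy sketch of that loop under an l-infinity threat model; `loss_fn`, the step sizes, and the [0, 1] pixel range are placeholder assumptions, not the paper's exact algorithm or hyperparameters.

```python
import numpy as np

def bandit_attack_step(loss_fn, x, v, fd_eta=0.1, image_lr=0.01,
                       prior_lr=0.1, exploration=0.01):
    """One bandit-style black-box gradient-estimation step.

    loss_fn: loss oracle mapping an image to a scalar loss.
    v: running gradient prior, same shape as x.
    Antithetic queries around the prior give a one-dimensional
    finite-difference signal that updates the prior itself; the
    image then ascends the loss along the sign of the prior.
    """
    u = np.random.randn(*x.shape)          # exploration direction
    u /= np.linalg.norm(u)
    l_plus = loss_fn(x + fd_eta * (v + exploration * u))
    l_minus = loss_fn(x + fd_eta * (v - exploration * u))
    v = v + prior_lr * (l_plus - l_minus) / (2 * exploration) * u
    x_adv = np.clip(x + image_lr * np.sign(v), 0.0, 1.0)  # assumes [0, 1] pixels
    return x_adv, v
```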
Benchmarking Keyword Spotting Efficiency on Neuromorphic Hardware
Title | Benchmarking Keyword Spotting Efficiency on Neuromorphic Hardware |
Authors | Peter Blouw, Xuan Choo, Eric Hunsberger, Chris Eliasmith |
Abstract | Using Intel’s Loihi neuromorphic research chip and ABR’s Nengo Deep Learning toolkit, we analyze the inference speed, dynamic power consumption, and energy cost per inference of a two-layer neural network keyword spotter trained to recognize a single phrase. We perform comparative analyses of this keyword spotter running on more conventional hardware devices including a CPU, a GPU, Nvidia’s Jetson TX1, and the Movidius Neural Compute Stick. Our results indicate that for this inference application, Loihi outperforms all of these alternatives on an energy cost per inference basis while maintaining equivalent inference accuracy. Furthermore, an analysis of tradeoffs between network size, inference speed, and energy cost indicates that Loihi’s comparative advantage over other low-power computing devices improves for larger networks. |
Tasks | Keyword Spotting |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01739v2 |
PDF | http://arxiv.org/pdf/1812.01739v2.pdf |
PWC | https://paperswithcode.com/paper/benchmarking-keyword-spotting-efficiency-on |
Repo | https://github.com/abr/power_benchmarks |
Framework | tf |
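
The headline metric here, dynamic energy cost per inference, is simple arithmetic: subtract idle power from total power, multiply by the wall-clock time of a timed batch, and divide by the number of inferences. A small sketch with hypothetical numbers (not the paper's measurements):

```python
def energy_per_inference(idle_power_w, total_power_w,
                         batch_runtime_s, n_inferences):
    """Dynamic energy cost per inference, in joules.

    Subtracting the idle baseline isolates the power actually spent
    on computation before spreading it across the batch.
    """
    dynamic_power_w = total_power_w - idle_power_w
    return dynamic_power_w * batch_runtime_s / n_inferences

# Hypothetical example values, not measurements from the paper:
print(energy_per_inference(idle_power_w=0.30, total_power_w=0.41,
                           batch_runtime_s=10.0, n_inferences=100))
```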
Applying Deep Learning To Airbnb Search
Title | Applying Deep Learning To Airbnb Search |
Authors | Malay Haldar, Mustafa Abdool, Prashant Ramanathan, Tao Xu, Shulin Yang, Huizhong Duan, Qing Zhang, Nick Barrow-Williams, Bradley C. Turnbull, Brendan M. Collins, Thomas Legrand |
Abstract | The application to search ranking is one of the biggest machine learning success stories at Airbnb. Much of the initial gains were driven by a gradient boosted decision tree model. The gains, however, plateaued over time. This paper discusses the work done in applying neural networks in an attempt to break out of that plateau. We present our perspective not with the intention of pushing the frontier of new modeling techniques. Instead, ours is a story of the elements we found useful in applying neural networks to a real life product. Deep learning was steep learning for us. To other teams embarking on similar journeys, we hope an account of our struggles and triumphs will provide some useful pointers. Bon voyage! |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09591v2 |
PDF | http://arxiv.org/pdf/1810.09591v2.pdf |
PWC | https://paperswithcode.com/paper/applying-deep-learning-to-airbnb-search |
Repo | https://github.com/SachaIZADI/Misc-Machine-Learning |
Framework | tf |
Fine-grained Activity Recognition in Baseball Videos
Title | Fine-grained Activity Recognition in Baseball Videos |
Authors | AJ Piergiovanni, Michael S. Ryoo |
Abstract | In this paper, we introduce a challenging new dataset, MLB-YouTube, designed for fine-grained activity detection. The dataset contains two settings: segmented video classification as well as activity detection in continuous videos. We experimentally compare various recognition approaches capturing temporal structure in activity videos, by classifying segmented videos and extending those approaches to continuous videos. We also compare models on the extremely difficult task of predicting pitch speed and pitch type from broadcast baseball videos. We find that learning temporal structure is valuable for fine-grained activity recognition. |
Tasks | Action Detection, Activity Detection, Activity Recognition, Video Classification |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.03247v1 |
PDF | http://arxiv.org/pdf/1804.03247v1.pdf |
PWC | https://paperswithcode.com/paper/fine-grained-activity-recognition-in-baseball |
Repo | https://github.com/piergiaj/mlb-youtube |
Framework | pytorch |
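
For the segmented-classification setting, the simplest temporal baselines the paper compares against pool per-frame CNN features over time before a linear classifier. A minimal numpy sketch of that baseline (the feature extractor and the learned classifier `w`, `b` are assumed given):

```python
import numpy as np

def classify_segment(frame_features, w, b, pooling="max"):
    """Classify a segmented clip from per-frame CNN features.

    frame_features: (T, D) array, one feature vector per frame.
    Max or mean pooling collapses time; the structure-aware models
    in the paper replace this pooling with learned temporal layers.
    """
    pooled = frame_features.max(axis=0) if pooling == "max" \
        else frame_features.mean(axis=0)
    return int(np.argmax(pooled @ w + b))
```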
Commonsense for Generative Multi-Hop Question Answering Tasks
Title | Commonsense for Generative Multi-Hop Question Answering Tasks |
Authors | Lisa Bauer, Yicheng Wang, Mohit Bansal |
Abstract | Reading comprehension QA tasks have seen a recent surge in popularity, yet most works have focused on fact-finding extractive QA. We instead focus on a more challenging multi-hop generative task (NarrativeQA), which requires the model to reason, gather, and synthesize disjoint pieces of information within the context to generate an answer. This type of multi-step reasoning also often requires understanding implicit relations, which humans resolve via external, background commonsense knowledge. We first present a strong generative baseline that uses a multi-attention mechanism to perform multiple hops of reasoning and a pointer-generator decoder to synthesize the answer. This model performs substantially better than previous generative models, and is competitive with current state-of-the-art span prediction models. We next introduce a novel system for selecting grounded multi-hop relational commonsense information from ConceptNet via a pointwise mutual information and term-frequency based scoring function. Finally, we effectively use this extracted commonsense information to fill in gaps of reasoning between context hops, using a selectively-gated attention mechanism. This boosts the model’s performance significantly (also verified via human evaluation), establishing a new state-of-the-art for the task. We also show promising initial results of the generalizability of our background knowledge enhancements by demonstrating some improvement on QAngaroo-WikiHop, another multi-hop reasoning dataset. |
Tasks | Question Answering, Reading Comprehension |
Published | 2018-09-17 |
URL | https://arxiv.org/abs/1809.06309v3 |
PDF | https://arxiv.org/pdf/1809.06309v3.pdf |
PWC | https://paperswithcode.com/paper/commonsense-for-generative-multi-hop-question |
Repo | https://github.com/yicheng-w/CommonSenseMultiHopQA |
Framework | tf |
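
The commonsense selection step scores candidate ConceptNet concepts against the context with a pointwise mutual information and term-frequency based function. The sketch below is one hedged reading of such a score; `unigram`, `cooccur`, and `n_docs` describe an assumed background corpus rather than the paper's exact statistics:

```python
import math
from collections import Counter

def score_concept(concept, context_tokens, unigram, cooccur, n_docs):
    """Score a candidate ConceptNet concept against the QA context.

    Term frequency rewards concepts that appear in the context;
    pointwise mutual information rewards concepts that co-occur
    with context words more often than chance. unigram maps
    word -> count, cooccur maps (word, word) -> count.
    """
    tf = Counter(context_tokens)[concept] / max(len(context_tokens), 1)
    pmis = []
    for w in set(context_tokens):
        p_joint = cooccur.get((concept, w), 0) / n_docs
        p_c = unigram.get(concept, 0) / n_docs
        p_w = unigram.get(w, 0) / n_docs
        if p_joint > 0 and p_c > 0 and p_w > 0:
            pmis.append(math.log(p_joint / (p_c * p_w)))
    return tf + (sum(pmis) / len(pmis) if pmis else 0.0)
```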
Visual Reasoning by Progressive Module Networks
Title | Visual Reasoning by Progressive Module Networks |
Authors | Seung Wook Kim, Makarand Tapaswi, Sanja Fidler |
Abstract | Humans learn to solve tasks of increasing complexity by building on top of previously acquired knowledge. Typically, there exists a natural progression in the tasks that we learn - most do not require completely independent solutions, but can be broken down into simpler subtasks. We propose to represent a solver for each task as a neural module that calls existing modules (solvers for simpler tasks) in a functional program-like manner. Lower modules are a black box to the calling module, and communicate only via a query and an output. Thus, a module for a new task learns to query existing modules and composes their outputs in order to produce its own output. Our model effectively combines previous skill-sets, does not suffer from forgetting, and is fully differentiable. We test our model in learning a set of visual reasoning tasks, and demonstrate improved performances in all tasks by learning progressively. By evaluating the reasoning process using human judges, we show that our model is more interpretable than an attention-based baseline. |
Tasks | Visual Reasoning |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02453v2 |
PDF | http://arxiv.org/pdf/1806.02453v2.pdf |
PWC | https://paperswithcode.com/paper/visual-reasoning-by-progressive-module |
Repo | https://github.com/seung-kim/pmn_demo |
Framework | pytorch |
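
The module-calling pattern described above, parents reaching children only through (query, output) pairs, can be captured in a few lines. This is an interface sketch with the learned query and composition functions stubbed out as placeholders:

```python
class Module:
    """A task solver that can call lower-level solvers.

    Child modules are black boxes reached only through
    (query, output) pairs; the parent forms queries and composes
    the answers. The learned parts are stubbed with placeholders.
    """
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def make_query(self, inputs, child):
        return inputs                     # placeholder for a learned query

    def compose(self, inputs, outputs):
        return {self.name: outputs}       # placeholder for learned composition

    def solve(self, inputs):
        outputs = [child.solve(self.make_query(inputs, child))
                   for child in self.children]
        return self.compose(inputs, outputs)
```

For example, `Module("vqa", [Module("count"), Module("attribute")]).solve(x)` routes `x` through both child solvers before composing their answers.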
Inferring transportation modes from GPS trajectories using a convolutional neural network
Title | Inferring transportation modes from GPS trajectories using a convolutional neural network |
Authors | Sina Dabiri, Kevin Heaslip |
Abstract | Identifying the distribution of users’ transportation modes is an essential part of travel demand analysis and transportation planning. With the advent of ubiquitous GPS-enabled devices (e.g., smartphones), a cost-effective approach for inferring commuters’ mobility modes is to leverage their GPS trajectories. A majority of studies have proposed mode inference models based on hand-crafted features and traditional machine learning algorithms. However, manual features have major drawbacks, including vulnerability to traffic and environmental conditions and the human bias involved in crafting effective features. One way to overcome these issues is to use Convolutional Neural Network (CNN) schemes that are capable of automatically deriving high-level features from the raw input. Accordingly, in this paper, we take advantage of CNN architectures to predict travel modes from raw GPS trajectories alone, where the modes are labeled as walk, bike, bus, driving, and train. Our key contribution is designing the layout of the CNN’s input in such a way that it is not only compatible with CNN schemes but also represents fundamental motion characteristics of a moving object, including speed, acceleration, jerk, and bearing rate. Furthermore, we improve the quality of the GPS logs through several data preprocessing steps. Using this clean input, a variety of CNN configurations are evaluated to find the best architecture; an ensemble of the best CNN configuration achieves the highest accuracy of 84.8%. We contrast our methodology with traditional machine learning algorithms as well as the seminal and most related studies to demonstrate the superiority of our framework. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.02386v1 |
PDF | http://arxiv.org/pdf/1804.02386v1.pdf |
PWC | https://paperswithcode.com/paper/inferring-transportation-modes-from-gps |
Repo | https://github.com/PatrickMotylinski/LBCPI-project |
Framework | none |
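
The paper's input layer stacks four per-point motion channels computed from the raw trajectory. A self-contained sketch of those channels from latitude/longitude/timestamp lists, using haversine distance and forward differences (the paper's exact preprocessing and interpolation are not reproduced here):

```python
import math

def motion_channels(lats, lons, times):
    """Speed, acceleration, jerk, and bearing rate from a GPS trace."""
    R = 6371000.0  # earth radius in metres

    def haversine(lat1, lon1, lat2, lon2):
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = p2 - p1, math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * R * math.asin(math.sqrt(a))

    def bearing(lat1, lon1, lat2, lon2):
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dl = math.radians(lon2 - lon1)
        y = math.sin(dl) * math.cos(p2)
        x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
        return math.degrees(math.atan2(y, x))

    dts = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    speed = [haversine(lats[i], lons[i], lats[i + 1], lons[i + 1]) / dts[i]
             for i in range(len(dts))]
    # Higher-order channels are forward differences of the one below.
    accel = [(s2 - s1) / dt for s1, s2, dt in zip(speed, speed[1:], dts[1:])]
    jerk = [(a2 - a1) / dt for a1, a2, dt in zip(accel, accel[1:], dts[2:])]
    bearings = [bearing(lats[i], lons[i], lats[i + 1], lons[i + 1])
                for i in range(len(dts))]
    brate = [abs(b2 - b1) / dt for b1, b2, dt in zip(bearings, bearings[1:], dts[1:])]
    return speed, accel, jerk, brate
```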
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
Title | LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image |
Authors | Chuhang Zou, Alex Colburn, Qi Shan, Derek Hoiem |
Abstract | We propose an algorithm to predict room layout from a single image that generalizes across panoramas and perspective images, cuboid layouts and more general layouts (e.g. L-shape room). Our method operates directly on the panoramic image, rather than decomposing into perspective images as do recent works. Our network architecture is similar to that of RoomNet, but we show improvements due to aligning the image based on vanishing points, predicting multiple layout elements (corners, boundaries, size and translation), and fitting a constrained Manhattan layout to the resulting predictions. Our method compares well in speed and accuracy to other existing work on panoramas, achieves among the best accuracy for perspective images, and can handle both cuboid-shaped and more general Manhattan layouts. |
Tasks | 3D Room Layouts From A Single RGB Panorama |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08999v1 |
PDF | http://arxiv.org/pdf/1803.08999v1.pdf |
PWC | https://paperswithcode.com/paper/layoutnet-reconstructing-the-3d-room-layout |
Repo | https://github.com/zouchuhang/LayoutNet |
Framework | pytorch |
STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework
Title | STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework |
Authors | Mingbo Ma, Liang Huang, Hao Xiong, Renjie Zheng, Kaibo Liu, Baigong Zheng, Chuanqiang Zhang, Zhongjun He, Hairong Liu, Xing Li, Hua Wu, Haifeng Wang |
Abstract | Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we propose a novel prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very simple yet surprisingly effective wait-k policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on 4 directions: zh<->en and de<->en. |
Tasks | |
Published | 2018-10-19 |
URL | https://arxiv.org/abs/1810.08398v5 |
PDF | https://arxiv.org/pdf/1810.08398v5.pdf |
PWC | https://paperswithcode.com/paper/stacl-simultaneous-translation-with |
Repo | https://github.com/SimulTrans-demo/STACL |
Framework | none |
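
The wait-k policy itself is a few lines of control flow around the underlying model: wait for k source words, then emit one target word per incoming source word. A minimal sketch, where `translate_step(src_prefix, tgt_prefix)` is a stand-in for the trained prefix-to-prefix model:

```python
def wait_k_decode(source_words, translate_step, k, max_extra=50):
    """Wait-k simultaneous decoding loop (prefix-to-prefix).

    After an initial wait of k source words, one target word is
    emitted per incoming source word, so the translation always
    trails the source by k words; the tail is flushed at the end.
    """
    src, tgt = [], []
    for word in source_words:
        src.append(word)
        if len(src) >= k:
            tgt.append(translate_step(src, tgt))
    # Source exhausted: finish the remaining target words.
    while (not tgt or tgt[-1] != "</s>") and len(tgt) < len(src) + max_extra:
        tgt.append(translate_step(src, tgt))
    return tgt
```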
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
Title | Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs |
Authors | Xiaolong Wang, Yufei Ye, Abhinav Gupta |
Abstract | We consider the problem of zero-shot recognition: learning a visual classifier for a category with zero training examples, using only the word embedding of the category and its relationship to other categories for which visual data are provided. The key to dealing with the unfamiliar or novel category is to transfer knowledge obtained from familiar classes to describe the unfamiliar class. In this paper, we build upon the recently introduced Graph Convolutional Network (GCN) and propose an approach that uses both semantic embeddings and the categorical relationships to predict the classifiers. Given a learned knowledge graph (KG), our approach takes as input semantic embeddings for each node (representing a visual category). After a series of graph convolutions, we predict the visual classifier for each category. During training, the visual classifiers for a few categories are given to learn the GCN parameters. At test time, these filters are used to predict the visual classifiers of unseen categories. We show that our approach is robust to noise in the KG. More importantly, our approach provides significant improvement in performance compared to the current state-of-the-art results (from 2-3% on some metrics to a whopping 20% on a few). |
Tasks | Knowledge Graphs, Zero-Shot Learning |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.08035v2 |
PDF | http://arxiv.org/pdf/1803.08035v2.pdf |
PWC | https://paperswithcode.com/paper/zero-shot-recognition-via-semantic-embeddings |
Repo | https://github.com/JudyYe/zero-shot-gcn |
Framework | tf |
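
The core computation is a stack of standard graph-convolution layers over the knowledge graph, with word embeddings as input node features and known classifier weights as regression targets. A forward-pass sketch of one Kipf-and-Welling-style layer in numpy (the paper's exact depth, normalization, and nonlinearity may differ):

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    return d_inv_sqrt @ A @ d_inv_sqrt

def gcn_layer(A_hat, H, W, activate=True):
    """One graph-convolution layer: H' = ReLU(A_hat @ H @ W).

    A_hat: normalized KG adjacency; H: node features (word
    embeddings at the input); W: learned layer weights. Stacking
    such layers and regressing the final node features onto known
    classifier weights is the essence of the approach.
    """
    H = A_hat @ H @ W
    return np.maximum(H, 0.0) if activate else H
```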
News Session-Based Recommendations using Deep Neural Networks
Title | News Session-Based Recommendations using Deep Neural Networks |
Authors | Gabriel de Souza P. Moreira, Felipe Ferreira, Adilson Marques da Cunha |
Abstract | News recommender systems aim to personalize users’ experiences and help them discover relevant articles in a large and dynamic search space. The news domain is therefore a challenging scenario for recommendation, due to sparse user profiles, a fast-growing number of items, accelerated decay of item value, and dynamic shifts in user preferences. Promising results have recently been achieved by applying Deep Learning techniques to Recommender Systems, especially for item feature extraction and for session-based recommendations with Recurrent Neural Networks. In this paper, we propose an instantiation of CHAMELEON, a Deep Learning meta-architecture for news recommender systems. The architecture is composed of two modules: the first learns representations of news articles from their text and metadata, and the second provides session-based recommendations using Recurrent Neural Networks. The recommendation task addressed in this work is next-item prediction for user sessions: “what is the next most likely article a user might read in a session?” Session context is leveraged by the architecture to provide additional information in this extreme cold-start scenario of news recommendation, and user behavior and item features are merged in a hybrid recommendation approach. As a complementary contribution, we propose a temporal offline evaluation method for a more realistic evaluation of the task, considering dynamic factors that affect global readership interests, such as popularity, recency, and seasonality. In experiments with an extensive number of session-based recommendation methods, the proposed instantiation of the CHAMELEON meta-architecture obtained a significant relative improvement in top-n accuracy and ranking metrics (10% on Hit Rate and 13% on MRR) over the best benchmark methods. |
Tasks | Recommendation Systems, Session-Based Recommendations |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1808.00076v3 |
PDF | http://arxiv.org/pdf/1808.00076v3.pdf |
PWC | https://paperswithcode.com/paper/news-session-based-recommendations-using-deep |
Repo | https://github.com/gabrielspmoreira/chameleon_recsys |
Framework | tf |
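
The session-based half of the architecture scores candidate articles by matching an RNN-encoded session state against article content embeddings. A hedged sketch of that next-item step, with `rnn_step` and `item_embs` standing in for the trained recurrent cell and the learned article representations:

```python
def recommend_next(session_items, item_embs, rnn_step, top_n=5):
    """Rank candidate articles for the next click in a session.

    item_embs: dict mapping article id -> numpy embedding vector.
    rnn_step(state, emb) -> new state is a stand-in for the trained
    recurrent cell (it should accept state=None at session start).
    Candidates are scored by dot product between the session state
    and their content embeddings.
    """
    state = None
    for item in session_items:
        state = rnn_step(state, item_embs[item])
    scores = {item: float(state @ emb)
              for item, emb in item_embs.items()
              if item not in session_items}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```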
Stochastic algorithms with descent guarantees for ICA
Title | Stochastic algorithms with descent guarantees for ICA |
Authors | Pierre Ablin, Alexandre Gramfort, Jean-François Cardoso, Francis Bach |
Abstract | Independent component analysis (ICA) is a widespread data exploration technique, where observed signals are modeled as linear mixtures of independent components. From a machine learning point of view, it amounts to a matrix factorization problem with a statistical independence criterion. Infomax is one of the most used ICA algorithms. It is based on a loss function which is a non-convex log-likelihood. We develop a new majorization-minimization framework adapted to this loss function. We derive an online algorithm for the streaming setting, and an incremental algorithm for the finite sum setting, with the following benefits. First, unlike most algorithms found in the literature, the proposed methods do not rely on any critical hyper-parameter like a step size, nor do they require a line-search technique. Second, the algorithm for the finite sum setting, although stochastic, guarantees a decrease of the loss function at each iteration. Experiments demonstrate progress on the state-of-the-art for large scale datasets, without the necessity for any manual parameter tuning. |
Tasks | |
Published | 2018-05-25 |
URL | https://arxiv.org/abs/1805.10054v2 |
PDF | https://arxiv.org/pdf/1805.10054v2.pdf |
PWC | https://paperswithcode.com/paper/em-algorithms-for-ica |
Repo | https://github.com/pierreablin/mmica |
Framework | none |
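
The loss in question is the Infomax negative log-likelihood, which the paper's majorization-minimization algorithms decrease without any step-size tuning. A numpy sketch of that loss under the standard log-cosh source density (the optimization itself is not reproduced here):

```python
import numpy as np

def infomax_loss(W, X):
    """Infomax negative log-likelihood for ICA (up to constants).

    X: (p, n) whitened observations; W: (p, p) unmixing matrix.
    log cosh is computed via logaddexp for numerical stability.
    """
    n = X.shape[1]
    Y = W @ X
    _, logdet = np.linalg.slogdet(W)
    logcosh = np.logaddexp(Y, -Y) - np.log(2.0)
    return -logdet + logcosh.sum() / n
```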
On Extended Long Short-term Memory and Dependent Bidirectional Recurrent Neural Network
Title | On Extended Long Short-term Memory and Dependent Bidirectional Recurrent Neural Network |
Authors | Yuanhang Su, C. -C. Jay Kuo |
Abstract | In this work, we first analyze the memory behavior in three recurrent neural networks (RNN) cells; namely, the simple RNN (SRN), the long short-term memory (LSTM) and the gated recurrent unit (GRU), where the memory is defined as a function that maps previous elements in a sequence to the current output. Our study shows that all three of them suffer rapid memory decay. Then, to alleviate this effect, we introduce trainable scaling factors that act like an attention mechanism to adjust memory decay adaptively. The new design is called the extended LSTM (ELSTM). Finally, to design a system that is robust to previous erroneous predictions, we propose a dependent bidirectional recurrent neural network (DBRNN). Extensive experiments are conducted on different language tasks to demonstrate the superiority of the proposed ELSTM and DBRNN solutions. The ELSTM has achieved up to 30% increase in the labeled attachment score (LAS) as compared to LSTM and GRU in the dependency parsing (DP) task. Our models also outperform other state-of-the-art models such as bi-attention and convolutional sequence to sequence (convseq2seq) by close to 10% in the LAS. The code is released as open source (https://github.com/yuanhangsu/ELSTM-DBRNN). |
Tasks | Dependency Parsing |
Published | 2018-02-27 |
URL | https://arxiv.org/abs/1803.01686v5 |
PDF | https://arxiv.org/pdf/1803.01686v5.pdf |
PWC | https://paperswithcode.com/paper/on-extended-long-short-term-memory-and |
Repo | https://github.com/yuanhangsu/ELSTM-DBRNN |
Framework | tf |
GADGET SVM: A Gossip-bAseD sub-GradiEnT Solver for Linear SVMs
Title | GADGET SVM: A Gossip-bAseD sub-GradiEnT Solver for Linear SVMs |
Authors | Haimonti Dutta, Nitin Nataraj |
Abstract | In the era of big data, an important weapon in a machine learning researcher’s arsenal is a scalable Support Vector Machine (SVM) algorithm. SVMs are extensively used for solving classification problems. Traditional algorithms for learning SVMs often scale super linearly with training set size which becomes infeasible very quickly for large data sets. In recent years, scalable algorithms have been designed which study the primal or dual formulations of the problem. This often suggests a way to decompose the problem and facilitate development of distributed algorithms. In this paper, we present a distributed algorithm for learning linear Support Vector Machines in the primal form for binary classification called Gossip-bAseD sub-GradiEnT (GADGET) SVM. The algorithm is designed such that it can be executed locally on nodes of a distributed system. Each node processes its local homogeneously partitioned data and learns a primal SVM model. It then gossips with random neighbors about the classifier learnt and uses this information to update the model. Extensive theoretical and empirical results suggest that this anytime algorithm has performance comparable to its centralized and online counterparts. |
Tasks | |
Published | 2018-12-05 |
URL | http://arxiv.org/abs/1812.02261v1 |
PDF | http://arxiv.org/pdf/1812.02261v1.pdf |
PWC | https://paperswithcode.com/paper/gadget-svm-a-gossip-based-sub-gradient-solver |
Repo | https://github.com/nitinnat/GADGET |
Framework | none |
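
Each node alternates a local subgradient step on the hinge loss with a gossip averaging step against a random neighbor. A minimal numpy sketch of one such round (the step sizes and the neighbor-selection protocol are placeholder assumptions):

```python
import numpy as np

def gadget_round(w, X, y, neighbor_w, lam=0.01, lr=0.1):
    """One local round of a gossip-style primal SVM solver.

    X: (n, d) local data partition, y: (n,) labels in {-1, +1}.
    Take a subgradient step on the regularized hinge loss over the
    local data, then average the model with the one received from
    a randomly chosen neighbor (the gossip step).
    """
    margins = y * (X @ w)
    active = margins < 1                       # margin violators
    subgrad = lam * w
    if active.any():
        subgrad = subgrad - (y[active, None] * X[active]).mean(axis=0)
    w = w - lr * subgrad
    return 0.5 * (w + neighbor_w)
```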
Lightweight Probabilistic Deep Networks
Title | Lightweight Probabilistic Deep Networks |
Authors | Jochen Gast, Stefan Roth |
Abstract | Even though probabilistic treatments of neural networks have a long history, they have not found widespread use in practice. Sampling approaches are often too slow already for simple networks. The size of the inputs and the depth of typical CNN architectures in computer vision only compound this problem. Uncertainty in neural networks has thus been largely ignored in practice, despite the fact that it may provide important information about the reliability of predictions and the inner workings of the network. In this paper, we introduce two lightweight approaches to making supervised learning with probabilistic deep networks practical: First, we suggest probabilistic output layers for classification and regression that require only minimal changes to existing networks. Second, we employ assumed density filtering and show that activation uncertainties can be propagated in a practical fashion through the entire network, again with minor changes. Both probabilistic networks retain the predictive power of the deterministic counterpart, but yield uncertainties that correlate well with the empirical error induced by their predictions. Moreover, the robustness to adversarial examples is significantly increased. |
Tasks | |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11327v1 |
PDF | http://arxiv.org/pdf/1805.11327v1.pdf |
PWC | https://paperswithcode.com/paper/lightweight-probabilistic-deep-networks |
Repo | https://github.com/mattiasegu/uncertainty_estimation_deep_learning |
Framework | pytorch |
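
The assumed density filtering piece propagates a Gaussian mean and variance through each layer; for ReLU this has a closed form via the standard normal pdf and cdf. A sketch of that moment-matching step for a single activation (the paper's layer-by-layer wiring is not reproduced here):

```python
import math

def relu_adf(mean, var, eps=1e-12):
    """Moment-match a Gaussian activation through ReLU.

    For z ~ N(mean, var), returns the mean and variance of
    max(z, 0); assumed density filtering continues the forward
    pass with the resulting Gaussian.
    """
    std = math.sqrt(max(var, eps))
    t = mean / std
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    new_mean = mean * cdf + std * pdf
    new_var = (mean ** 2 + var) * cdf + mean * std * pdf - new_mean ** 2
    return new_mean, max(new_var, 0.0)
```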