Paper Group NANR 102
Domain Adaptation for Deep Reinforcement Learning in Visually Distinct Games
Title | Domain Adaptation for Deep Reinforcement Learning in Visually Distinct Games |
Authors | Dino S. Ratcliffe, Luca Citi, Sam Devlin, Udo Kruschwitz |
Abstract | Many deep reinforcement learning approaches use graphical state representations, which means that visually distinct games sharing the same underlying structure cannot effectively share knowledge. This paper outlines a new approach for learning underlying game state embeddings irrespective of the visual rendering of the game state. We utilise approaches from multi-task learning and domain adaptation in order to place visually distinct game states on a shared embedding manifold. We present our results in the context of deep reinforcement learning agents. |
Tasks | Domain Adaptation, Multi-Task Learning |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=BJB7fkWR- |
https://openreview.net/pdf?id=BJB7fkWR- | |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-deep-reinforcement |
Repo | |
Framework | |
ScholarGraph:a Chinese Knowledge Graph of Chinese Scholars
Title | ScholarGraph:a Chinese Knowledge Graph of Chinese Scholars |
Authors | Shuo Wang, Zehui Hao, Xiaofeng Meng, Qiuyue Wang |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1092/ |
https://www.aclweb.org/anthology/L18-1092 | |
PWC | https://paperswithcode.com/paper/scholargrapha-chinese-knowledge-graph-of |
Repo | |
Framework | |
A Painless Attention Mechanism for Convolutional Neural Networks
Title | A Painless Attention Mechanism for Convolutional Neural Networks |
Authors | Pau Rodríguez, Guillem Cucurull, Jordi Gonzàlez, Josep M. Gonfaus, Xavier Roca |
Abstract | We propose a novel attention mechanism to enhance Convolutional Neural Networks for fine-grained recognition. The proposed mechanism reuses CNN feature activations to find the most informative parts of the image at different depths with the help of gating mechanisms and without part annotations. Thus, it can be used to augment any layer of a CNN to extract low- and high-level local information to be more discriminative. Unlike other approaches, the mechanism we propose needs only a single pass through the input and can be trained end-to-end through SGD. As a consequence, the proposed mechanism is modular, architecture-independent, easy to implement, and faster than iterative approaches. Experiments show that, when augmented with our approach, Wide Residual Networks systematically achieve superior performance on each of five different fine-grained recognition datasets: the Adience age and gender recognition benchmark, Caltech-UCSD Birds-200-2011, Stanford Dogs, Stanford Cars, and UEC Food-100, obtaining competitive and state-of-the-art scores. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rJe7FW-Cb |
https://openreview.net/pdf?id=rJe7FW-Cb | |
PWC | https://paperswithcode.com/paper/a-painless-attention-mechanism-for |
Repo | |
Framework | |
Recursive Binary Neural Network Learning Model with 2-bit/weight Storage Requirement
Title | Recursive Binary Neural Network Learning Model with 2-bit/weight Storage Requirement |
Authors | Tianchan Guan, Xiaoyang Zeng, Mingoo Seok |
Abstract | This paper presents a storage-efficient learning model titled Recursive Binary Neural Networks for embedded and mobile devices having a limited amount of on-chip data storage, such as hundreds of kilobytes. The main idea of the proposed model is to recursively recycle data storage of weights (parameters) during training. This enables a device with a given storage constraint to train and instantiate a neural network classifier with a larger number of weights on a chip, achieving better classification accuracy. Such efficient use of on-chip storage reduces off-chip storage accesses, improving energy-efficiency and speed of training. We verified the proposed training model with deep and convolutional neural network classifiers on the MNIST and voice activity detection benchmarks. For the deep neural network, our model achieves a data storage requirement as low as 2 bits/weight, whereas the conventional binary neural network learning models require data storage of 8 to 32 bits/weight. With the same amount of data storage, our model can train a bigger network having more weights, achieving 1% less test error than the conventional binary neural network learning model. To achieve a similar classification error, the conventional binary neural network model requires 4× more data storage for weights than our proposed model. For the convolutional neural network classifier, the proposed model achieves 2.4% less test error for the same on-chip storage, or 6× storage savings at similar accuracy. |
Tasks | Action Detection, Activity Detection |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkONG0xAW |
https://openreview.net/pdf?id=rkONG0xAW | |
PWC | https://paperswithcode.com/paper/recursive-binary-neural-network-learning |
Repo | |
Framework | |
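The bits-per-weight figures in the abstract above can be turned into a quick capacity estimate. The sketch below is only illustrative arithmetic: the 256 KiB budget is an assumed value (the abstract says only "hundreds of kilobytes"), and `weights_that_fit` is a hypothetical helper, not part of the authors' model.

```python
def weights_that_fit(budget_bytes: int, bits_per_weight: int) -> int:
    """Number of weights that fit in a storage budget at a given precision."""
    return (budget_bytes * 8) // bits_per_weight


# Assumed on-chip budget; the abstract does not give an exact figure.
BUDGET_BYTES = 256 * 1024

# 2 bits/weight is the proposed model's best case; 8 and 32 bits/weight
# bracket the range the abstract quotes for conventional models.
for bits in (2, 8, 32):
    count = weights_that_fit(BUDGET_BYTES, bits)
    print(f"{bits:>2} bits/weight -> {count:,} weights")
```

Under this assumed budget, 2 bits/weight leaves room for four times as many weights as 8 bits/weight and sixteen times as many as 32 bits/weight.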
PunFields at SemEval-2018 Task 3: Detecting Irony by Tools of Humor Analysis
Title | PunFields at SemEval-2018 Task 3: Detecting Irony by Tools of Humor Analysis |
Authors | Elena Mikhalkova, Yuri Karyakin, Alexander Voronov, Dmitry Grigoriev, Artem Leoznov |
Abstract | The paper describes our search for a universal algorithm of detecting intentional lexical ambiguity in different forms of creative language. At SemEval-2018 Task 3, we used PunFields, the system of automatic analysis of English puns that we introduced at SemEval-2017, to detect irony in tweets. Preliminary tests showed that it can reach the score of F1=0.596. However, at the competition, its result was F1=0.549. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1088/ |
https://www.aclweb.org/anthology/S18-1088 | |
PWC | https://paperswithcode.com/paper/punfields-at-semeval-2018-task-3-detecting |
Repo | |
Framework | |
Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design
Title | Limits of Estimating Heterogeneous Treatment Effects: Guidelines for Practical Algorithm Design |
Authors | Ahmed Alaa, Mihaela van der Schaar |
Abstract | Estimating heterogeneous treatment effects from observational data is a central problem in many domains. Because counterfactual data is inaccessible, the problem differs fundamentally from supervised learning, and entails a more complex set of modeling choices. Despite a variety of recently proposed algorithmic solutions, a principled guideline for building estimators of treatment effects using machine learning algorithms is still lacking. In this paper, we provide such a guideline by characterizing the fundamental limits of estimating heterogeneous treatment effects, and establishing conditions under which these limits can be achieved. Our analysis reveals that the relative importance of the different aspects of observational data varies with the sample size. For instance, we show that selection bias matters only in small-sample regimes, whereas with a large sample size, the way an algorithm models the control and treated outcomes is what bottlenecks its performance. Guided by our analysis, we build a practical algorithm for estimating treatment effects using a non-stationary Gaussian process with doubly-robust hyperparameters. Using a standard semi-synthetic simulation setup, we show that our algorithm outperforms the state-of-the-art, and that the behavior of existing algorithms conforms with our analysis. |
Tasks | Gaussian Processes |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2226 |
http://proceedings.mlr.press/v80/alaa18a/alaa18a.pdf | |
PWC | https://paperswithcode.com/paper/limits-of-estimating-heterogeneous-treatment |
Repo | |
Framework | |
Second Language Acquisition Modeling
Title | Second Language Acquisition Modeling |
Authors | Burr Settles, Chris Brust, Erin Gustafson, Masato Hagiwara, Nitin Madnani |
Abstract | We present the task of second language acquisition (SLA) modeling. Given a history of errors made by learners of a second language, the task is to predict errors that they are likely to make at arbitrary points in the future. We describe a large corpus of more than 7M words produced by more than 6k learners of English, Spanish, and French using Duolingo, a popular online language-learning app. Then we report on the results of a shared task challenge aimed at studying the SLA task via this corpus, which attracted 15 teams and synthesized work from various fields including cognitive science, linguistics, and machine learning. |
Tasks | Language Acquisition |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0506/ |
https://www.aclweb.org/anthology/W18-0506 | |
PWC | https://paperswithcode.com/paper/second-language-acquisition-modeling |
Repo | |
Framework | |
Knowledge Representation and Extraction at Scale
Title | Knowledge Representation and Extraction at Scale |
Authors | Christos Christodoulopoulos |
Abstract | These days, most general knowledge question-answering systems rely on large-scale knowledge bases comprising billions of facts about millions of entities. Having a structured source of semantic knowledge means that we can answer questions involving single static facts (e.g. "Who was the 8th president of the US?") or dynamically generated ones (e.g. "How old is Donald Trump?"). More importantly, we can answer questions involving multiple inference steps ("Is the queen older than the president of the US?"). In this talk, I'm going to be discussing some of the unique challenges that are involved with building and maintaining a consistent knowledge base for Alexa, extending it with new facts and using it to serve answers in multiple languages. I will focus on three recent projects from our group. First, a way of measuring the completeness of a knowledge base that is based on usage patterns. The usage of the KB is defined in terms of the relation distribution of entities seen in question-answer logs. Instead of directly estimating the relation distribution of individual entities, it is generalized to the "class signature" of each entity. For example, users ask for baseball players' height, age, and batting average, so a knowledge base is complete (with respect to baseball players) if every entity has facts for those three relations. Second, an investigation into fact extraction from unstructured text. I will present a method for creating distant (weak) supervision labels for training a large-scale relation extraction system. I will also discuss the effectiveness of neural network approaches by decoupling the model architecture from the feature design of a state-of-the-art neural network system. Surprisingly, a much simpler classifier trained on similar features performs on par with the highly complex neural network system (at a 75x reduction in training time), suggesting that the features are a bigger contributor to the final performance. Finally, I will present the Fact Extraction and VERification (FEVER) dataset and challenge. The dataset comprises more than 185,000 human-generated claims extracted from Wikipedia pages. False claims were generated by mutating true claims in a variety of ways, some of which were meaning-altering. During the verification step, annotators were required to label a claim for its validity and also supply full-sentence textual evidence from (potentially multiple) Wikipedia articles for the label. With FEVER, we aim to help create a new generation of transparent and interpretable knowledge extraction systems. |
Tasks | Language Acquisition, Question Answering, Relation Extraction, Semantic Role Labeling |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4007/ |
https://www.aclweb.org/anthology/W18-4007 | |
PWC | https://paperswithcode.com/paper/knowledge-representation-and-extraction-at |
Repo | |
Framework | |
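The completeness criterion in the abstract above (a class is complete when every entity has facts for all relations in its class signature) can be sketched in a few lines. This is a minimal illustration rather than the speaker's system: the function name, the fractional score, and the sample entities are all assumptions, and only the three-relation baseball signature comes from the abstract.

```python
from typing import Dict, Set


def class_completeness(entities: Dict[str, Set[str]], signature: Set[str]) -> float:
    """Fraction of entities that have a fact for every relation in the signature."""
    if not entities:
        return 1.0  # an empty class is vacuously complete (a choice made for this sketch)
    complete = sum(1 for relations in entities.values() if signature <= relations)
    return complete / len(entities)


# Class signature taken from the abstract's baseball-player example.
signature = {"height", "age", "batting_average"}

# Made-up entities and relations, for illustration only.
players = {
    "player_a": {"height", "age", "batting_average", "team"},
    "player_b": {"height", "age"},
}

print(class_completeness(players, signature))  # 0.5
```

A score of 1.0 corresponds to the abstract's notion of a complete knowledge base for that class; reporting a fraction rather than a yes/no answer is an extension added here to make partial coverage visible.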
RiskFinder: A Sentence-level Risk Detector for Financial Reports
Title | RiskFinder: A Sentence-level Risk Detector for Financial Reports |
Authors | Yu-Wen Liu, Liang-Chih Liu, Chuan-Ju Wang, Ming-Feng Tsai |
Abstract | This paper presents a web-based information system, RiskFinder, for facilitating the analyses of soft and hard information in financial reports. In particular, the system broadens the analyses from the word level to sentence level, which makes the system useful for practitioner communities and unprecedented among financial academics. The proposed system has four main components: 1) a Form 10-K risk-sentiment dataset, consisting of a set of risk-labeled financial sentences and pre-trained sentence embeddings; 2) metadata, including basic information on each company that published the Form 10-K financial report as well as several relevant financial measures; 3) an interface that highlights risk-related sentences in the financial reports based on the latest sentence embedding techniques; 4) a visualization of financial time-series data for a corresponding company. This paper also conducts some case studies to showcase that the system can be of great help in capturing valuable insight within large amounts of textual information. The system is now available online at https://cfda.csie.org/RiskFinder/. |
Tasks | Sentence Embedding, Sentence Embeddings, Sentiment Analysis, Time Series |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-5017/ |
https://www.aclweb.org/anthology/N18-5017 | |
PWC | https://paperswithcode.com/paper/riskfinder-a-sentence-level-risk-detector-for |
Repo | |
Framework | |
Proceedings of the Second ACL Workshop on Ethics in Natural Language Processing
Title | Proceedings of the Second ACL Workshop on Ethics in Natural Language Processing |
Authors | |
Abstract | |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-0800/ |
https://www.aclweb.org/anthology/W18-0800 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-second-acl-workshop-on |
Repo | |
Framework | |
Coded Two-Bucket Cameras for Computer Vision
Title | Coded Two-Bucket Cameras for Computer Vision |
Authors | Mian Wei, Navid Sarhangnejad, Zhengfan Xia, Nikita Gusev, Nikola Katic, Roman Genov, Kiriakos N. Kutulakos |
Abstract | We introduce coded two-bucket (C2B) imaging, a new operating principle for computational sensors with applications in active 3D shape estimation and coded-exposure imaging. A C2B sensor modulates the light arriving at each pixel by controlling which of the pixel’s two “buckets” should integrate it. C2B sensors output two images per video frame—one per bucket—and allow rapid, fully-programmable, per-pixel control of the active bucket. Using these properties as a starting point, we (1) develop an image formation model for these sensors, (2) couple them with programmable light sources to acquire illumination mosaics, i.e., images of a scene under many different illumination conditions whose pixels have been multiplexed onto the sensor plane and acquired in one shot, and (3) show how to process illumination mosaics to acquire time-varying depth or normal maps of dynamic scenes at the sensor’s native resolution. We present the first experimental demonstration of these capabilities, using a fully functional C2B camera prototype. Key to this prototype is a C2B sensor that was designed by us, fabricated in a standard CMOS imaging technology, and demonstrated for the first time in this paper. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Mian_Wei_Coded_Two-Bucket_Cameras_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Mian_Wei_Coded_Two-Bucket_Cameras_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/coded-two-bucket-cameras-for-computer-vision |
Repo | |
Framework | |
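The per-pixel bucket control described in the abstract above can be illustrated with a toy model of a single pixel. This is a sketch under stated assumptions, not the authors' image formation model: it assumes the exposure is divided into discrete sub-intervals, that a binary code selects the active bucket in each one, and that `integrate_two_buckets` is a hypothetical helper name.

```python
from typing import List, Tuple


def integrate_two_buckets(light: List[float], code: List[int]) -> Tuple[float, float]:
    """Accumulate one pixel's light over sub-intervals into two buckets.

    code[t] == 1 routes the light arriving in sub-interval t to bucket 1;
    code[t] == 0 routes it to bucket 0.
    """
    if len(light) != len(code):
        raise ValueError("light and code must have the same length")
    bucket0 = sum(l for l, c in zip(light, code) if c == 0)
    bucket1 = sum(l for l, c in zip(light, code) if c == 1)
    return bucket0, bucket1


# Made-up light levels and code sequence for one pixel over one video frame.
light = [1.0, 2.0, 3.0, 4.0]
code = [1, 0, 0, 1]

b0, b1 = integrate_two_buckets(light, code)
print(b0, b1)  # 5.0 5.0
```

In this toy model the two bucket values always sum to the total light received, which is one way to read the abstract's statement that the sensor outputs two images per frame, one per bucket.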
Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking
Title | Learning to Disentangle Interleaved Conversational Threads with a Siamese Hierarchical Network and Similarity Ranking |
Authors | Jyun-Yu Jiang, Francine Chen, Yan-Ying Chen, Wei Wang |
Abstract | An enormous amount of conversation occurs online every day, such as on chat platforms where multiple conversations may take place concurrently. Interleaved conversations lead to difficulties in not only following discussions but also retrieving relevant information from simultaneous messages. Conversation disentanglement aims to separate intermingled messages into detached conversations. In this paper, we propose to leverage representation learning for conversation disentanglement. A Siamese hierarchical convolutional neural network (SHCNN), which integrates local and more global representations of a message, is first presented to estimate the conversation-level similarity between closely posted messages. With the estimated similarity scores, our algorithm for conversation identification by similarity ranking (CISIR) then derives conversations based on high-confidence message pairs and pairwise redundancy. Experiments were conducted with four publicly available datasets of conversations from Reddit and IRC channels. The experimental results show that our approach significantly outperforms comparative baselines in both pairwise similarity estimation and conversation disentanglement. |
Tasks | Representation Learning |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1164/ |
https://www.aclweb.org/anthology/N18-1164 | |
PWC | https://paperswithcode.com/paper/learning-to-disentangle-interleaved |
Repo | |
Framework | |
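The idea in the abstract above of deriving conversations from high-confidence message pairs can be sketched with a simple grouping step. The code below is not CISIR: it swaps in plain union-find over thresholded pairs and ignores the pairwise-redundancy component. The similarity scores, the 0.9 threshold, and the function name are hypothetical, standing in for the output of a model such as SHCNN.

```python
from typing import Dict, List, Tuple


def group_messages(
    n_messages: int,
    pair_scores: Dict[Tuple[int, int], float],
    threshold: float = 0.9,
) -> List[List[int]]:
    """Merge messages into conversations when a pair's score reaches the threshold."""
    parent = list(range(n_messages))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for i in range(n_messages):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up similarity scores between closely posted messages.
scores = {(0, 1): 0.95, (1, 2): 0.20, (2, 3): 0.97, (0, 3): 0.10}
print(group_messages(4, scores))  # [[0, 1], [2, 3]]
```

A single threshold makes the grouping transitive, so one spurious high score can merge two conversations; the redundancy check mentioned in the abstract would be one way to guard against that.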
Robust Physical-World Attacks on Deep Learning Visual Classification
Title | Robust Physical-World Attacks on Deep Learning Visual Classification |
Authors | Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno, Dawn Song |
Abstract | Recent studies show that state-of-the-art deep neural networks (DNNs) are vulnerable to adversarial examples, resulting from small-magnitude perturbations added to the input. Given that emerging physical systems are using DNNs in safety-critical situations, adversarial examples could mislead these systems and cause dangerous situations. Therefore, understanding adversarial examples in the physical world is an important step towards developing resilient learning algorithms. We propose a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions. Using the real-world case of road sign classification, we show that adversarial examples generated using RP2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including viewpoints. Due to the current lack of a standardized testing method, we propose a two-stage evaluation methodology for robust physical adversarial examples consisting of lab and field tests. Using this methodology, we evaluate the efficacy of physical adversarial manipulations on real objects. With a perturbation in the form of only black and white stickers, we attack a real stop sign, causing targeted misclassification in 100% of the images obtained in lab settings, and in 84.8% of the captured video frames obtained on a moving vehicle (field test) for the target classifier. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Eykholt_Robust_Physical-World_Attacks_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Eykholt_Robust_Physical-World_Attacks_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/robust-physical-world-attacks-on-deep-1 |
Repo | |
Framework | |
Bias-Variance Decomposition for Boltzmann Machines
Title | Bias-Variance Decomposition for Boltzmann Machines |
Authors | Mahito Sugiyama, Koji Tsuda, Hiroyuki Nakahara |
Abstract | We achieve bias-variance decomposition for Boltzmann machines using an information geometric formulation. Our decomposition reveals an interesting phenomenon: the variance does not necessarily increase when more parameters are included in Boltzmann machines, while the bias always decreases. Our result gives theoretical evidence for the generalization ability of deep learning architectures, because it provides the possibility of increasing the representation power while avoiding variance inflation. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=rkMt1bWAZ |
https://openreview.net/pdf?id=rkMt1bWAZ | |
PWC | https://paperswithcode.com/paper/bias-variance-decomposition-for-boltzmann |
Repo | |
Framework | |
AirNet: a machine learning dataset for air quality forecasting
Title | AirNet: a machine learning dataset for air quality forecasting |
Authors | Songgang Zhao, Xingyuan Yuan, Da Xiao, Jianyuan Zhang, Zhouyuan Li |
Abstract | In the past decade, many urban areas in China have suffered from serious air pollution problems, making air quality forecasting a hot topic. Conventional approaches rely on numerical methods to estimate the pollutant concentration and require a lot of computing power. To solve this problem, we applied widely used deep learning methods. Deep learning requires large-scale datasets to train an effective model. In this paper, we introduce a new dataset, called AirNet, containing the 0.25 degree resolution grid map of mainland China, with more than two years of continued air quality measurement and meteorological data. We published this dataset as an open resource for machine learning research and set up a baseline of a 5-day air pollution forecast. The results of experiments demonstrated that this dataset could facilitate the development of new algorithms for air quality forecasting. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=SkymMAxAb |
https://openreview.net/pdf?id=SkymMAxAb | |
PWC | https://paperswithcode.com/paper/airnet-a-machine-learning-dataset-for-air |
Repo | |
Framework | |