Paper Group ANR 1605
Misspelling Oblivious Word Embeddings. Interaction-Transformation Evolutionary Algorithm for Symbolic Regression. Robust Dynamic Assortment Optimization in the Presence of Outlier Customers. Aggregating Votes with Local Differential Privacy: Usefulness, Soundness vs. Indistinguishability. Autonomous Haiku Generation. Learning Geo-Temporal Image Fea …
Misspelling Oblivious Word Embeddings
Title | Misspelling Oblivious Word Embeddings |
Authors | Bora Edizel, Aleksandra Piktus, Piotr Bojanowski, Rui Ferreira, Edouard Grave, Fabrizio Silvestri |
Abstract | In this paper we present a method to learn word embeddings that are resilient to misspellings. Existing word embeddings have limited applicability to malformed texts, which contain a non-negligible amount of out-of-vocabulary words. We propose a method combining FastText with subwords and a supervised task of learning misspelling patterns. In our method, misspellings of each word are embedded close to their correct variants. We train these embeddings on a new dataset we are releasing publicly. Finally, we experimentally show the advantages of this approach on both intrinsic and extrinsic NLP tasks using public test sets. |
Tasks | Word Embeddings |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09755v1 |
https://arxiv.org/pdf/1905.09755v1.pdf | |
PWC | https://paperswithcode.com/paper/misspelling-oblivious-word-embeddings |
Repo | |
Framework | |
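A minimal sketch of the subword intuition behind misspelling resilience (illustrative only, not the paper's released model or training loss): FastText-style embeddings sum character n-gram vectors, so a misspelling sharing most n-grams with its correct variant already lands nearby; the paper's supervised misspelling task tightens this further. Here plain n-gram overlap stands in for embedding similarity.

```python
def char_ngrams(word, n_min=3, n_max=5):
    # FastText-style character n-grams with boundary markers.
    w = f"<{word}>"
    return {w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)}

def overlap(a, b):
    # Jaccard overlap of n-gram sets: a cheap proxy for the similarity
    # of two subword-sum embeddings.
    return len(a & b) / len(a | b)

sim_typo = overlap(char_ngrams("misspelling"), char_ngrams("mispelling"))
sim_unrelated = overlap(char_ngrams("misspelling"), char_ngrams("embedding"))
```

A typo shares the bulk of its n-grams with the intended word, while an unrelated word shares only incidental suffixes, which is why subword models degrade gracefully on malformed text.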
Interaction-Transformation Evolutionary Algorithm for Symbolic Regression
Title | Interaction-Transformation Evolutionary Algorithm for Symbolic Regression |
Authors | Fabricio Olivetti de Franca, Guilherme Seidyo Imai Aldeia |
Abstract | Interaction-Transformation (IT) is a new representation for Symbolic Regression that restricts the search space to simpler, but still expressive, function forms. This representation has the advantage of creating a smoother search space, unlike the space generated by Expression Trees, the representation commonly used in Genetic Programming. This paper introduces an Evolutionary Algorithm capable of evolving a population of IT expressions supported only by the mutation operator. The results show that this representation is capable of finding better approximations to real-world data sets than traditional approaches and a state-of-the-art Genetic Programming algorithm. |
Tasks | |
Published | 2019-02-11 |
URL | https://arxiv.org/abs/1902.03983v3 |
https://arxiv.org/pdf/1902.03983v3.pdf | |
PWC | https://paperswithcode.com/paper/interaction-transformation-evolutionary |
Repo | |
Framework | |
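A toy sketch of the IT representation and its mutation-only search (the term structure follows the abstract; the transform set and mutation details are illustrative assumptions, not the paper's exact operator):

```python
import math
import random

# An IT expression is a list of (weight, transform, exponents) terms; each
# term evaluates to weight * transform(x0**e0 * x1**e1 * ...).
TRANSFORMS = {"id": lambda z: z, "sin": math.sin, "cos": math.cos}

def evaluate(expr, x):
    total = 0.0
    for w, t, exps in expr:
        interaction = 1.0
        for xj, ej in zip(x, exps):
            interaction *= xj ** ej
        total += w * TRANSFORMS[t](interaction)
    return total

def mutate(expr, n_vars, rng):
    # Mutation-only variation: nudge one exponent of one random term.
    child = [(w, t, list(e)) for w, t, e in expr]
    term = rng.randrange(len(child))
    child[term][2][rng.randrange(n_vars)] += rng.choice([-1, 1])
    return child

expr = [(2.0, "id", [1, 1]), (0.5, "cos", [2, 0])]  # 2*x0*x1 + 0.5*cos(x0^2)
y = evaluate(expr, [1.0, 3.0])
child = mutate(expr, n_vars=2, rng=random.Random(0))
```

Constraining every individual to this weighted-sum-of-transformed-interactions form is what keeps the search space smoother than free-form expression trees.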
Robust Dynamic Assortment Optimization in the Presence of Outlier Customers
Title | Robust Dynamic Assortment Optimization in the Presence of Outlier Customers |
Authors | Xi Chen, Akshay Krishnamurthy, Yining Wang |
Abstract | We consider the dynamic assortment optimization problem under the multinomial logit model (MNL) with unknown utility parameters. The main question investigated in this paper is model mis-specification under the $\varepsilon$-contamination model, which is a fundamental model in robust statistics and machine learning. In particular, throughout a selling horizon of length $T$, we assume that customers make purchases according to a well-specified underlying multinomial logit choice model in a $(1-\varepsilon)$-fraction of the time periods, and make arbitrary purchasing decisions instead in the remaining $\varepsilon$-fraction of the time periods. In this model, we develop a new robust online assortment optimization policy via an active elimination strategy. We establish both upper and lower bounds on the regret, and show that our policy is optimal up to logarithmic factors in $T$ when the assortment capacity is constant. Furthermore, we develop a fully adaptive policy that does not require any prior knowledge of the contamination parameter $\varepsilon$. Our simulation study shows that our policy outperforms the existing policies based on upper confidence bounds (UCB) and Thompson sampling. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04183v1 |
https://arxiv.org/pdf/1910.04183v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-dynamic-assortment-optimization-in-the |
Repo | |
Framework | |
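The $\varepsilon$-contamination setup can be simulated in a few lines (a sketch of the demand model only, not the paper's elimination policy; the "always grab the first offered item" adversary is an arbitrary illustrative choice):

```python
import math
import random

def mnl_choice_probs(assortment, utilities):
    # Multinomial logit: P(buy i) ∝ exp(u_i), with a no-purchase
    # option whose utility is normalized to 0 (the "1.0" in the denominator).
    expu = {i: math.exp(utilities[i]) for i in assortment}
    denom = 1.0 + sum(expu.values())
    return {i: expu[i] / denom for i in assortment}

def contaminated_purchase(assortment, utilities, eps, rng):
    # With prob. 1-eps the customer follows the MNL model; with prob. eps
    # an outlier customer acts arbitrarily (here: takes the first item).
    if rng.random() < eps:
        return assortment[0]
    probs = mnl_choice_probs(assortment, utilities)
    r, acc = rng.random(), 0.0
    for i, p in probs.items():
        acc += p
        if r < acc:
            return i
    return None  # no purchase

rng = random.Random(1)
utils = {0: 0.5, 1: 1.0, 2: -0.2}
sales = [contaminated_purchase([0, 1, 2], utils, eps=0.1, rng=rng)
         for _ in range(1000)]
```

Even this small contamination rate systematically inflates the apparent demand for whichever item the outliers pick, which is exactly what breaks UCB- and Thompson-style estimators.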
Aggregating Votes with Local Differential Privacy: Usefulness, Soundness vs. Indistinguishability
Title | Aggregating Votes with Local Differential Privacy: Usefulness, Soundness vs. Indistinguishability |
Authors | Shaowei Wang, Jiachun Du, Wei Yang, Xinrong Diao, Zichun Liu, Yiwen Nie, Liusheng Huang, Hongli Xu |
Abstract | Voting plays a central role in bringing crowd wisdom to collective decision making, while data privacy is a common ethical and legal issue in eliciting preferences from individuals. This work studies the problem of aggregating individuals' voting data under the local differential privacy setting, where usefulness and soundness of the aggregated scores are of major concern. One naive approach to the problem is adding Laplace random noise; however, this makes aggregated scores extremely fragile to new types of strategic behaviors tailored to the local privacy setting: the data amplification attack and the view disguise attack. In the data amplification attack, an attacker's manipulation power is amplified by the privacy-preserving procedure when contributing a fraudulent vote. The view disguise attack happens when an attacker disguises malicious data as valid private views to manipulate the voting result. In this work, after theoretically quantifying the estimation error bound and the manipulation risk bound of the Laplace mechanism, we propose two mechanisms that improve usefulness and soundness simultaneously: the weighted sampling mechanism and the additive mechanism. The former interprets the score vector as probabilistic data. Compared to the Laplace mechanism for the Borda voting rule with $d$ candidates, it reduces the mean squared error bound by half and lowers the maximum magnitude risk bound from $+\infty$ to $O(\frac{d^3}{n\epsilon})$. The latter randomly outputs a subset of candidates according to their total scores. Its mean squared error bound is improved from $O(\frac{d^5}{n\epsilon^2})$ to $O(\frac{d^4}{n\epsilon^2})$, and its maximum magnitude risk bound is reduced to $O(\frac{d^2}{n\epsilon})$. Experimental results validate that our proposed approaches reduce estimation error by $50\%$ on average and are more robust to adversarial attacks. |
Tasks | Decision Making |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.04920v1 |
https://arxiv.org/pdf/1908.04920v1.pdf | |
PWC | https://paperswithcode.com/paper/aggregating-votes-with-local-differential |
Repo | |
Framework | |
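The naive Laplace baseline the abstract criticizes can be sketched directly (an illustration, not the paper's proposed mechanisms; the sensitivity bound $d^2/2$ is the worst-case $L_1$ distance between two Borda score vectors, treated here as an assumption of the sketch):

```python
import math
import random

def borda_scores(ranking, d):
    # One voter's Borda vector: rank 0 earns d-1 points, ..., last earns 0.
    s = [0] * d
    for pos, cand in enumerate(ranking):
        s[cand] = d - 1 - pos
    return s

def laplace(scale, rng):
    # Inverse-CDF sample of Laplace(0, scale), stdlib only.
    u = rng.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def ldp_report(ranking, d, eps, rng):
    # Naive Laplace mechanism: perturb every coordinate with noise scaled
    # to the L1 sensitivity of a Borda vector (at most d*d/2).
    sens = d * d / 2
    return [s + laplace(sens / eps, rng) for s in borda_scores(ranking, d)]

rng = random.Random(0)
d, eps, n = 4, 1.0, 2000
# Every voter ranks 0 > 1 > 2 > 3; the server averages the noisy reports.
reports = [ldp_report([0, 1, 2, 3], d, eps, rng) for _ in range(n)]
avg = [sum(r[c] for r in reports) / n for c in range(d)]
```

The aggregate recovers the Borda ordering, but note how large the per-report noise is: a single attacker who skips the noising step can shift a candidate's total by an unbounded amount, which is the data amplification attack in miniature.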
Autonomous Haiku Generation
Title | Autonomous Haiku Generation |
Authors | Rui Aguiar, Kevin Liao |
Abstract | Artificial intelligence is an excellent tool for improving efficiency and lowering cost in many quantitative real-world applications, but what if the task is not easily defined? What if the task is generating creativity? Poetry is a creative endeavor that is highly difficult to both grasp and achieve with any level of competence. As Rita Dove, a famous American poet and author, states, “Poetry is language at its most distilled and most powerful.” Taking Dove's quote as inspiration, our task was to generate high-quality haikus using artificial intelligence and deep learning. |
Tasks | |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08733v1 |
https://arxiv.org/pdf/1906.08733v1.pdf | |
PWC | https://paperswithcode.com/paper/autonomous-haiku-generation |
Repo | |
Framework | |
Learning Geo-Temporal Image Features
Title | Learning Geo-Temporal Image Features |
Authors | Menghua Zhai, Tawfiq Salem, Connor Greenwell, Scott Workman, Robert Pless, Nathan Jacobs |
Abstract | We propose to implicitly learn to extract geo-temporal image features, which are mid-level features related to when and where an image was captured, by explicitly optimizing for a set of location and time estimation tasks. To train our method, we take advantage of a large image dataset, captured by outdoor webcams and cell phones. The only form of supervision we provide are the known capture time and location of each image. We find that our approach learns features that are related to natural appearance changes in outdoor scenes. Additionally, we demonstrate the application of these geo-temporal features to time and location estimation. |
Tasks | |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07499v1 |
https://arxiv.org/pdf/1909.07499v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-geo-temporal-image-features |
Repo | |
Framework | |
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Title | StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding |
Authors | Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Jiangnan Xia, Liwei Peng, Luo Si |
Abstract | Recently, the pre-trained language model, BERT (and its robustly optimized version RoBERTa), has attracted a lot of attention in natural language understanding (NLU), and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering. Inspired by the linearization exploration work of Elman [8], we extend BERT to a new model, StructBERT, by incorporating language structures into pre-training. Specifically, we pre-train StructBERT with two auxiliary tasks that make the most of the sequential order of words and sentences, leveraging language structures at the word and sentence levels, respectively. As a result, the new model is adapted to the different levels of language understanding required by downstream tasks. StructBERT with structural pre-training gives surprisingly good empirical results on a variety of downstream tasks, including pushing the state-of-the-art on the GLUE benchmark to 89.0 (outperforming all published models), the F1 score on SQuAD v1.1 question answering to 93.0, and the accuracy on SNLI to 91.7. |
Tasks | Language Modelling, Natural Language Inference, Question Answering, Semantic Textual Similarity, Sentiment Analysis |
Published | 2019-08-13 |
URL | https://arxiv.org/abs/1908.04577v3 |
https://arxiv.org/pdf/1908.04577v3.pdf | |
PWC | https://paperswithcode.com/paper/structbert-incorporating-language-structures |
Repo | |
Framework | |
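The two auxiliary objectives amount to constructing corrupted training examples with recoverable structure. A sketch of how such examples might be built (span size, label set, and function names are illustrative assumptions; the actual StructBERT pre-training pipeline operates on subword tokens inside the masked-LM setup):

```python
import random

def word_objective(tokens, rng, k=3):
    # Word structural objective (sketch): shuffle one random k-gram; the
    # model must reconstruct the original order of those k tokens.
    i = rng.randrange(len(tokens) - k + 1)
    span = tokens[i:i + k]
    shuffled = span[:]
    rng.shuffle(shuffled)
    corrupted = tokens[:i] + shuffled + tokens[i + k:]
    return corrupted, (i, span)  # model input, reconstruction target

def sentence_objective(doc, idx, rng):
    # Sentence structural objective (sketch): pair sentence idx with its
    # next sentence, its previous sentence, or a random other one; the
    # model classifies which relation holds (3-way).
    label = rng.choice(["next", "prev", "random"])
    if label == "next":
        other = doc[idx + 1]
    elif label == "prev":
        other = doc[idx - 1]
    else:
        other = rng.choice(doc[:idx] + doc[idx + 2:])
    return (doc[idx], other), label

rng = random.Random(0)
tokens = "the quick brown fox jumps over the lazy dog".split()
corrupted, (i, target) = word_objective(tokens, rng)
doc = ["s0", "s1", "s2", "s3", "s4"]
pair, label = sentence_objective(doc, 2, rng)
```

Both objectives are self-supervised: the labels fall out of the original word and sentence order, so no annotation is needed beyond raw text.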
Incorporating Query Term Independence Assumption for Efficient Retrieval and Ranking using Deep Neural Networks
Title | Incorporating Query Term Independence Assumption for Efficient Retrieval and Ranking using Deep Neural Networks |
Authors | Bhaskar Mitra, Corby Rosset, David Hawking, Nick Craswell, Fernando Diaz, Emine Yilmaz |
Abstract | Classical information retrieval (IR) methods, such as query likelihood and BM25, score documents independently w.r.t. each query term, and then accumulate the scores. Assuming query term independence allows precomputing term-document scores using these models—which can be combined with specialized data structures, such as inverted index, for efficient retrieval. Deep neural IR models, in contrast, compare the whole query to the document and are, therefore, typically employed only for late stage re-ranking. We incorporate query term independence assumption into three state-of-the-art neural IR models: BERT, Duet, and CKNRM—and evaluate their performance on a passage ranking task. Surprisingly, we observe no significant loss in result quality for Duet and CKNRM—and a small degradation in the case of BERT. However, by operating on each query term independently, these otherwise computationally intensive models become amenable to offline precomputation—dramatically reducing the cost of query evaluations employing state-of-the-art neural ranking models. This strategy makes it practical to use deep models for retrieval from large collections—and not restrict their usage to late stage re-ranking. |
Tasks | Information Retrieval |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03693v1 |
https://arxiv.org/pdf/1907.03693v1.pdf | |
PWC | https://paperswithcode.com/paper/incorporating-query-term-independence |
Repo | |
Framework | |
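The core trick is mechanical: once a model scores each query term against a document independently, all (term, document) scores can be computed offline and query evaluation reduces to table lookups plus a sum. A sketch with a trivial stand-in scorer (term frequency here is an illustrative placeholder for the per-term neural score):

```python
def precompute(score_fn, vocabulary, docs):
    # Offline: score every (term, doc) pair independently, as an
    # inverted-index-style table.
    return {(t, d): score_fn(t, text)
            for t in vocabulary for d, text in docs.items()}

def retrieve(query_terms, table, docs):
    # Online: accumulate precomputed per-term scores; no model inference.
    totals = {d: sum(table.get((t, d), 0.0) for t in query_terms)
              for d in docs}
    return max(totals, key=totals.get)

# Trivial stand-in for a neural per-term scorer: term frequency.
tf = lambda term, text: text.split().count(term)

docs = {"d1": "deep neural ranking models",
        "d2": "inverted index retrieval with bm25"}
vocab = {"neural", "ranking", "retrieval", "index"}
table = precompute(tf, vocab, docs)
best = retrieve(["neural", "ranking"], table, docs)
```

The expensive model runs only in `precompute`; at query time the cost is independent of model size, which is what makes full-collection retrieval with deep models feasible.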
Language2Pose: Natural Language Grounded Pose Forecasting
Title | Language2Pose: Natural Language Grounded Pose Forecasting |
Authors | Chaitanya Ahuja, Louis-Philippe Morency |
Abstract | Generating animations from natural language sentences finds applications in a number of domains, such as movie script visualization, virtual human animation and robot motion planning. These sentences can describe different kinds of actions, the speeds and directions of those actions, and possibly a target destination. The core modeling challenge in this language-to-pose application is how to map linguistic concepts to motion animations. In this paper, we address this multimodal problem by introducing a neural architecture called Joint Language-to-Pose (or JL2P), which learns a joint embedding of language and pose. This joint embedding space is learned end-to-end using a curriculum learning approach which emphasizes shorter and easier sequences before moving to longer and harder ones. We evaluate our proposed model on a publicly available corpus of 3D pose data and human-annotated sentences. Both objective metrics and human judgment confirm that our proposed approach generates animations that are more accurate and deemed visually more representative by humans than those of other data-driven approaches. |
Tasks | Motion Planning |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01108v2 |
https://arxiv.org/pdf/1907.01108v2.pdf | |
PWC | https://paperswithcode.com/paper/language2pose-natural-language-grounded-pose |
Repo | |
Framework | |
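The length-based curriculum can be sketched as a staged data schedule (an illustrative reading of "shorter and easier sequences first"; the number of stages and the cutoff rule are assumptions, not the paper's exact schedule):

```python
import math

def curriculum_schedule(sequences, n_stages=3):
    # Curriculum over sequence length: stage s trains on the shortest
    # fraction s/n_stages of the data, admitting longer sequences
    # progressively.
    ordered = sorted(sequences, key=len)
    for s in range(1, n_stages + 1):
        cutoff = math.ceil(s * len(ordered) / n_stages)
        yield ordered[:cutoff]

# Pose sequences of varying length (stand-ins: lists of frames).
seqs = [[1] * n for n in (12, 3, 30, 7, 18, 5)]
stages = list(curriculum_schedule(seqs))
```

Each stage is a superset of the previous one, so the model never unlearns the short sequences while the long, harder ones are phased in.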
Universal EEG Encoder for Learning Diverse Intelligent Tasks
Title | Universal EEG Encoder for Learning Diverse Intelligent Tasks |
Authors | Baani Leen Kaur Jolly, Palash Aggrawal, Surabhi S Nath, Viresh Gupta, Manraj Singh Grover, Rajiv Ratn Shah |
Abstract | Brain Computer Interfaces (BCI) have become very popular, with Electroencephalography (EEG) being one of the most commonly used signal acquisition techniques. A major challenge in BCI studies is the individualistic analysis required for each task. Thus, task-specific feature extraction and classification are performed, which fail to generalize to other tasks with similar time-series EEG input data. To this end, we design a GRU-based universal deep encoding architecture to extract meaningful features from publicly available datasets for five diverse EEG-based classification tasks. Our network can generate task- and format-independent data representations and outperforms the state-of-the-art EEGNet architecture on most experiments. We also compare our results with CNN-based and autoencoder networks, in turn performing local, spatial, temporal and unsupervised analyses of the data. |
Tasks | EEG, Time Series |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.12152v1 |
https://arxiv.org/pdf/1911.12152v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-eeg-encoder-for-learning-diverse |
Repo | |
Framework | |
AlteregoNets: a way to human augmentation
Title | AlteregoNets: a way to human augmentation |
Authors | Dr. David Kupeev |
Abstract | A person-dependent network, called an AlterEgo net, is proposed for development. The networks are created per person. A network receives object descriptions as input and outputs a simulation of the person's internal representation of those objects. The network generates a textual stream resembling the narrative stream of consciousness, depicting the multitudinous thoughts and feelings related to a perceived object. In this way, the object is described not by a ‘static’ set of its properties, as in a dictionary, but by the stream of words and word combinations referring to the object. The network simulates a person's dialogue with a representation of the object. It is based on an introduced algorithmic scheme in which perception is modeled by two interacting iterative cycles, resembling, respectively, the forward and backward propagation performed when training convolutional neural networks. The ‘forward’ iterations generate a stream representing the ‘internal world’ of a human. The ‘backward’ iterations generate a stream representing an internal representation of the object. People perceive the world differently; tuning AlterEgo nets to a specific person or group of persons will allow simulation of their thoughts and feelings. These nets are thus potentially a new human-augmentation technology for various applications. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.09786v1 |
http://arxiv.org/pdf/1901.09786v1.pdf | |
PWC | https://paperswithcode.com/paper/alteregonets-a-way-to-human-augmentation |
Repo | |
Framework | |
Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy
Title | Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy |
Authors | Adam Sealfon, Jonathan Ullman |
Abstract | We give a simple, computationally efficient, and node-differentially-private algorithm for estimating the parameter of an Erdos-Renyi graph—that is, estimating p in a G(n,p)—with near-optimal accuracy. Our algorithm nearly matches the information-theoretically optimal exponential-time algorithm for the same problem due to Borgs et al. (FOCS 2018). More generally, we give an optimal, computationally efficient, private algorithm for estimating the edge-density of any graph whose degree distribution is concentrated on a small interval. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10477v1 |
https://arxiv.org/pdf/1905.10477v1.pdf | |
PWC | https://paperswithcode.com/paper/efficiently-estimating-erdos-renyi-graphs |
Repo | |
Framework | |
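For intuition, here is the simplest node-private baseline for $G(n,p)$ density estimation (a sketch only, not the paper's algorithm, which achieves better accuracy via degree concentration; the key fact used is that rewiring one node changes the edge count by at most $n-1$, so the empirical density has node sensitivity $2/n$):

```python
import math
import random

def laplace(scale, rng):
    # Inverse-CDF sample of Laplace(0, scale), stdlib only.
    u = rng.random() - 0.5
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def private_edge_density(n, edges, eps, rng):
    # Node-DP baseline: empirical density p_hat = |E| / C(n,2), noised
    # with node sensitivity (n-1)/C(n,2) = 2/n.
    p_hat = edges / (n * (n - 1) / 2)
    return p_hat + laplace((2 / n) / eps, rng)

# Sample G(n, p), then estimate p privately.
rng = random.Random(0)
n, p = 400, 0.3
edges = sum(1 for i in range(n) for j in range(i + 1, n)
            if rng.random() < p)
estimate = private_edge_density(n, edges, eps=1.0, rng=rng)
```

The added noise has standard deviation on the order of $1/(n\epsilon)$, already below the sampling error of $\hat{p}$ for moderate $n$; the paper's contribution is matching the far sharper optimal rate efficiently.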
Adaptive Step Sizes in Variance Reduction via Regularization
Title | Adaptive Step Sizes in Variance Reduction via Regularization |
Authors | Bingcong Li, Georgios B. Giannakis |
Abstract | The main goal of this work is equipping convex and nonconvex problems with the Barzilai-Borwein (BB) step size. Although BB step sizes are adaptive, they can fail when the objective function is not strongly convex. To overcome this challenge, the key idea here is to bridge (non)convex problems and strongly convex ones via regularization. The proposed regularization schemes are \textit{simple} yet effective. Wedding the BB step size with a variance reduction method, known as SARAH, offers a free lunch compared with vanilla SARAH on convex problems. The convergence of BB step sizes on nonconvex problems is also established, and the complexity is no worse than that of other adaptive step sizes such as AdaGrad. As a byproduct, our regularized SARAH methods for convex functions ensure that the complexity to find $\mathbb{E}[\|\nabla f(\mathbf{x})\|^2]\leq \epsilon$ is ${\cal O}\big((n+\frac{1}{\sqrt{\epsilon}})\ln{\frac{1}{\epsilon}}\big)$, improving the $\epsilon$ dependence over existing results. Numerical tests further validate the merits of the proposed approaches. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06532v1 |
https://arxiv.org/pdf/1910.06532v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-step-sizes-in-variance-reduction-via |
Repo | |
Framework | |
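The BB1 step size itself is a two-line formula: $\eta_k = \langle s_k, s_k\rangle / \langle s_k, y_k\rangle$ with $s_k = x_k - x_{k-1}$ and $y_k = \nabla f(x_k) - \nabla f(x_{k-1})$. A sketch of plain BB gradient descent (deterministic, not the paper's SARAH variant; the `mu` knob mimics the regularization idea of bridging to a strongly convex problem and is an illustrative assumption):

```python
def bb_gradient_descent(grad, x0, mu=0.0, iters=50, eta0=0.1):
    # BB1 step: eta_k = <s,s> / <s,y>, s = x_k - x_{k-1},
    # y = g_k - g_{k-1}, on the mu-regularized objective f + (mu/2)|x|^2.
    g = lambda x: [gi + mu * xi for gi, xi in zip(grad(x), x)]
    x_prev, g_prev = x0, g(x0)
    x = [xi - eta0 * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iters):
        gx = g(x)
        s = [a - b for a, b in zip(x, x_prev)]
        y = [a - b for a, b in zip(gx, g_prev)]
        sy = sum(a * b for a, b in zip(s, y))
        eta = sum(a * a for a in s) / sy if sy > 1e-12 else eta0
        x_prev, g_prev = x, gx
        x = [xi - eta * gi for xi, gi in zip(x, gx)]
    return x

# Strongly convex test problem: f(x) = 0.5*(x0^2 + 10*x1^2), minimum at 0.
quad_grad = lambda x: [x[0], 10.0 * x[1]]
xstar = bb_gradient_descent(quad_grad, [5.0, -2.0])
```

On this strongly convex quadratic the curvature term $\langle s, y\rangle$ is always positive and BB converges quickly; without strong convexity that inner product can vanish or go negative, which is the failure mode the regularization is designed to remove.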
Understanding Childhood Vulnerability in The City of Surrey
Title | Understanding Childhood Vulnerability in The City of Surrey |
Authors | Cody Griffith, Varoon Mathur, Catherine Lin, Kevin Zhu |
Abstract | Understanding the community conditions that best support universal access and improved childhood outcomes ultimately allows better decision-making in the areas of planning and investment across the early stages of childhood development. Here we describe two different data-driven approaches to visualizing the lived experiences of children throughout the City of Surrey, combining data derived from both public and private sources. In one approach, we find specifically that the Early Development Instrument, which measures childhood vulnerabilities across varying domains, can be used to cluster neighborhoods, and that census variables can help explain similarities between neighborhoods within these clusters. In our second approach, we use program registration data from the City of Surrey's Community and Recreation Services Division. We also find a critical age of entry and exit for each program related to early childhood development and beyond, and find that certain neighborhoods and recreational programs have larger retention rates than others. This report details the journey of using data to tell the story of these neighborhoods, and provides a lens through which community initiatives can be strategically crafted. |
Tasks | Decision Making |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.09639v1 |
http://arxiv.org/pdf/1903.09639v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-childhood-vulnerability-in-the |
Repo | |
Framework | |
NISER: Normalized Item and Session Representations to Handle Popularity Bias
Title | NISER: Normalized Item and Session Representations to Handle Popularity Bias |
Authors | Priyanka Gupta, Diksha Garg, Pankaj Malhotra, Lovekesh Vig, Gautam Shroff |
Abstract | The goal of session-based recommendation (SR) models is to utilize the information from past actions (e.g. item/product clicks) in a session to recommend items that a user is likely to click next. Recently it has been shown that the sequence of item interactions in a session can be modeled as graph-structured data to better account for complex item transitions. Graph neural networks (GNNs) can learn useful representations for such session-graphs, and have been shown to improve over sequential models such as recurrent neural networks [14]. However, we note that these GNN-based recommendation models suffer from popularity bias: the models are biased towards recommending popular items, and fail to recommend relevant long-tail items (less popular or less frequent items). Therefore, these models perform poorly for the less popular new items arriving daily in a practical online setting. We demonstrate that this issue is, in part, related to the magnitude or norm of the learned item and session-graph representations (embedding vectors). We propose a training procedure that mitigates this issue by using normalized representations. The models using normalized item and session-graph representations perform significantly better: (i) for the less popular long-tail items in the offline setting, and (ii) for the less popular newly introduced items in the online setting. Furthermore, our approach significantly improves upon existing state-of-the-art on three benchmark datasets. |
Tasks | Session-Based Recommendations |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04276v3 |
https://arxiv.org/pdf/1909.04276v3.pdf | |
PWC | https://paperswithcode.com/paper/niser-normalized-item-and-session |
Repo | |
Framework | |
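The normalization idea is compact enough to show directly (a sketch of scoring only, not the GNN session model; the embedding values and the temperature default are illustrative assumptions):

```python
import math

def normalize(v):
    # L2-normalize so a score no longer grows with embedding magnitude.
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def normalized_scores(session, item_embs, sigma=16.0):
    # Cosine-style relevance: normalized session repr dotted with each
    # normalized item embedding, scaled by temperature sigma pre-softmax.
    s = normalize(session)
    return {i: sigma * sum(a * b for a, b in zip(s, normalize(e)))
            for i, e in item_embs.items()}

# Popular items tend to learn large-norm embeddings and win on magnitude
# alone; normalization lets direction (relevance) decide instead.
items = {"popular": [8.0, 0.4], "long_tail": [0.3, 0.5]}
session = [0.2, 1.0]
raw = {i: sum(a * b for a, b in zip(session, e)) for i, e in items.items()}
norm = normalized_scores(session, items)
```

Here the raw dot product ranks the large-norm popular item first even though the session points toward the long-tail item's direction; after normalization the relevant long-tail item wins, which is the popularity-bias mitigation in miniature.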