January 29, 2020

Paper Group ANR 688

A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization. Learning neutrino effects in Cosmology with Convolutional Neural Networks. Asymptotics of MAP Inference in Deep Networks. A Random Interaction Forest for Prioritizing Predictive Biomarkers. Who, Where, and What to Wear? Extracting Fashion …

A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization


Title	A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization
Authors	Tae Jong Choi, Julian Togelius, Yun-Gyung Cheong
Abstract	A new variant of stochastic opposition-based learning (OBL) is proposed in this paper. OBL is a relatively new machine learning concept, which consists of simultaneously calculating an original solution and its opposite to accelerate the convergence of soft computing algorithms. Recently, a new opposition-based differential evolution (ODE) variant called BetaCODE was proposed as a combination of differential evolution and a new stochastic OBL variant called BetaCOBL. BetaCOBL is capable of flexibly adjusting the probability density functions used to calculate opposite solutions, generating more diverse opposite solutions, and preventing the waste of fitness evaluations. While it has shown outstanding performance compared to several state-of-the-art OBL variants, BetaCOBL is difficult to apply to more complex problems because of its high computational cost. In addition, because it assumes that the decision variables are independent, it is limited in its search for good opposite solutions on inseparable problems. In this paper, we propose an improved stochastic OBL variant that mitigates all the limitations of BetaCOBL. The proposed algorithm, called iBetaCOBL, reduces the computational cost from $O(NP^{2} \cdot D)$ to $O(NP \cdot D)$ ($NP$ and $D$ stand for population size and dimension, respectively) using a linear time diversity measure. In addition, iBetaCOBL preserves the strongly dependent decision variables that are adjacent to each other using the multiple exponential crossover. The results of the performance evaluations on a set of 58 test functions show that iBetaCODE finds more accurate solutions than ten state-of-the-art ODE variants including BetaCODE. Additionally, we applied iBetaCOBL to two state-of-the-art DE variants, and as in the previous results, iBetaCOBL-based variants exhibit significantly improved performance.
Tasks
Published	2019-08-09
URL	https://arxiv.org/abs/1908.08011v1
PDF	https://arxiv.org/pdf/1908.08011v1.pdf
PWC	https://paperswithcode.com/paper/190808011
Repo
Framework
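
The abstract describes OBL as evaluating a solution together with its opposite. As a minimal sketch of that basic idea only, the snippet below uses the classical deterministic opposite point, `lower + upper - x`, rather than the stochastic BetaCOBL or iBetaCOBL schemes, which are not reproduced here. The function names, the `sphere` objective and the seed are assumptions made for illustration.

```python
import random


def opposite(x, lower, upper):
    """Classical opposite point: reflect each coordinate within its bounds."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def sphere(x):
    """Toy objective to minimize (illustrative, not one of the paper's test functions)."""
    return sum(xi * xi for xi in x)


def obl_init(pop_size, lower, upper, fitness, seed=0):
    """Opposition-based initialization: keep the best pop_size candidates
    out of a random population and its opposites."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(pop_size)]
    candidates = pop + [opposite(x, lower, upper) for x in pop]
    candidates.sort(key=fitness)  # ascending: best (lowest) fitness first
    return candidates[:pop_size]
```

Note that this sketch spends one extra fitness evaluation per individual; avoiding wasted evaluations is one of the points the abstract attributes to BetaCOBL.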

Learning neutrino effects in Cosmology with Convolutional Neural Networks


Title	Learning neutrino effects in Cosmology with Convolutional Neural Networks
Authors	Elena Giusarma, Mauricio Reyes Hurtado, Francisco Villaescusa-Navarro, Siyu He, Shirley Ho, ChangHoon Hahn
Abstract	Measuring the sum of the three active neutrino masses, $M_\nu$, is one of the most important challenges in modern cosmology. Massive neutrinos imprint characteristic signatures on several cosmological observables in particular on the large-scale structure of the Universe. In order to maximize the information that can be retrieved from galaxy surveys, accurate theoretical predictions in the non-linear regime are needed. Currently, one way to achieve those predictions is by running cosmological numerical simulations. Unfortunately, producing those simulations requires high computational resources – seven hundred CPU hours for each neutrino mass case. In this work, we propose a new method, based on a deep learning network (U-Net), to quickly generate simulations with massive neutrinos from standard $\Lambda$CDM simulations without neutrinos. We computed multiple relevant statistical measures of deep-learning generated simulations, and conclude that our method accurately reproduces the 3-dimensional spatial distribution of matter down to non-linear scales: $k < 0.7$ h/Mpc. Finally, our method allows us to generate massive neutrino simulations 10,000 times faster than the traditional methods.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04255v1
PDF	https://arxiv.org/pdf/1910.04255v1.pdf
PWC	https://paperswithcode.com/paper/learning-neutrino-effects-in-cosmology-with
Repo
Framework

Asymptotics of MAP Inference in Deep Networks


Title	Asymptotics of MAP Inference in Deep Networks
Authors	Parthe Pandit, Mojtaba Sahraee, Sundeep Rangan, Alyson K. Fletcher
Abstract	Deep generative priors are a powerful tool for reconstruction problems with complex data such as images and text. Inverse problems using such models require solving an inference problem of estimating the input and hidden units of the multi-layer network from its output. Maximum a posteriori (MAP) estimation is a widely-used inference method as it is straightforward to implement, and has been successful in practice. However, rigorous analysis of MAP inference in multi-layer networks is difficult. This work considers a recently-developed method, multi-layer vector approximate message passing (ML-VAMP), to study MAP inference in deep networks. It is shown that the mean squared error of the ML-VAMP estimate can be exactly and rigorously characterized in a certain high-dimensional random limit. The proposed method thus provides a tractable method for MAP inference with exact performance guarantees.
Tasks
Published	2019-03-01
URL	http://arxiv.org/abs/1903.01293v1
PDF	http://arxiv.org/pdf/1903.01293v1.pdf
PWC	https://paperswithcode.com/paper/asymptotics-of-map-inference-in-deep-networks
Repo
Framework

A Random Interaction Forest for Prioritizing Predictive Biomarkers


Title	A Random Interaction Forest for Prioritizing Predictive Biomarkers
Authors	Zhen Zeng, Yuefeng Lu, Judong Shen, Wei Zheng, Peter Shaw, Mary Beth Dorr
Abstract	Precision medicine has recently become a focus of medical research, as its implementation brings value to all stakeholders in the healthcare system. Various statistical methodologies have been developed to tackle problems in different aspects of this field, e.g., assessing treatment heterogeneity, identifying patient subgroups, or building treatment decision models. However, there is a lack of new tools devoted to selecting and prioritizing predictive biomarkers. We propose a novel tree-based ensemble method, random interaction forest (RIF), to generate predictive importance scores and prioritize candidate biomarkers for constructing refined treatment decision models. RIF was evaluated by comparing with the conventional random forest and univariable regression methods and showed favorable properties under various simulation scenarios. We applied the proposed RIF method to a biomarker dataset from two phase III clinical trials of bezlotoxumab on $\textit{Clostridium difficile}$ infection recurrence and obtained biologically meaningful results.
Tasks
Published	2019-10-04
URL	https://arxiv.org/abs/1910.01786v1
PDF	https://arxiv.org/pdf/1910.01786v1.pdf
PWC	https://paperswithcode.com/paper/a-random-interaction-forest-for-prioritizing
Repo
Framework

Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media


Title	Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media
Authors	Yunshan Ma, Xun Yang, Lizi Liao, Yixin Cao, Tat-Seng Chua
Abstract	Fashion knowledge helps people to dress properly and addresses not only physiological needs of users, but also the demands of social activities and conventions. It usually involves three mutually related aspects: occasion, person and clothing. However, few works focus on extracting such knowledge, which would greatly benefit many downstream applications, such as fashion recommendation. In this paper, we propose a novel method to automatically harvest fashion knowledge from social media. We unify three tasks of occasion, person and clothing discovery from multiple modalities of images, texts and metadata. For person detection and analysis, we use off-the-shelf tools due to their flexibility and satisfactory performance. For clothing recognition and occasion prediction, we unify the two tasks by using a contextualized fashion concept learning module, which captures the dependencies and correlations among different fashion concepts. To alleviate the heavy burden of human annotations, we introduce a weak label modeling module which can effectively exploit machine-labeled data, a complement to clean data. In experiments, we contribute a benchmark dataset and conduct extensive experiments from both quantitative and qualitative perspectives. The results demonstrate the effectiveness of our model in fashion concept prediction, and the usefulness of extracted knowledge with comprehensive analysis.
Tasks	Human Detection
Published	2019-08-12
URL	https://arxiv.org/abs/1908.08985v1
PDF	https://arxiv.org/pdf/1908.08985v1.pdf
PWC	https://paperswithcode.com/paper/who-where-and-what-to-wear-extracting-fashion
Repo
Framework

Transfer of Adversarial Robustness Between Perturbation Types


Title	Transfer of Adversarial Robustness Between Perturbation Types
Authors	Daniel Kang, Yi Sun, Tom Brown, Dan Hendrycks, Jacob Steinhardt
Abstract	We study the transfer of adversarial robustness of deep neural networks between different perturbation types. While most work on adversarial examples has focused on $L_\infty$ and $L_2$-bounded perturbations, these do not capture all types of perturbations available to an adversary. The present work evaluates 32 attacks of 5 different types against models adversarially trained on a 100-class subset of ImageNet. Our empirical results suggest that evaluating on a wide range of perturbation sizes is necessary to understand whether adversarial robustness transfers between perturbation types. We further demonstrate that robustness against one perturbation type may not always imply and may sometimes hurt robustness against other perturbation types. In light of these results, we recommend evaluation of adversarial defenses take place on a diverse range of perturbation types and sizes.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.01034v1
PDF	https://arxiv.org/pdf/1905.01034v1.pdf
PWC	https://paperswithcode.com/paper/transfer-of-adversarial-robustness-between
Repo
Framework
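
To make concrete what an $L_\infty$-bounded or $L_2$-bounded perturbation means, here is a small sketch of projecting a perturbation vector onto each norm ball. It is an illustration of the two constraints only, not one of the attacks evaluated in the paper; the function names and the flat-list representation of a perturbation are assumptions.

```python
import math


def project_linf(delta, eps):
    """Clip every coordinate to [-eps, eps], so that max |delta_i| <= eps."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps):
    """Rescale delta onto the L2 ball of radius eps if it lies outside it."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(delta)  # already inside the ball: unchanged copy
    scale = eps / norm
    return [d * scale for d in delta]
```

The two projections behave differently: the $L_\infty$ version limits each coordinate independently, while the $L_2$ version limits the total energy and keeps the direction of the perturbation.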

LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models


Title	LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
Authors	Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan
Abstract	In recent years, deep learning based machine lipreading has gained prominence. To this end, several architectures such as LipNet, LCANet and others have been proposed which perform extremely well compared to traditional lipreading DNN-HMM hybrid systems trained on DCT features. In this work, we propose a simpler architecture of 3D-2D-CNN-BLSTM network with a bottleneck layer. We also present analysis of two different approaches for lipreading on this architecture. In the first approach, 3D-2D-CNN-BLSTM network is trained with CTC loss on characters (ch-CTC). Then BLSTM-HMM model is trained on bottleneck lip features (extracted from 3D-2D-CNN-BLSTM ch-CTC network) in a traditional ASR training pipeline. In the second approach, same 3D-2D-CNN-BLSTM network is trained with CTC loss on word labels (w-CTC). The first approach shows that bottleneck features perform better compared to DCT features. Using the second approach on Grid corpus’ seen speaker test set, we report 1.3% WER - a 55% improvement relative to LCANet. On unseen speaker test set we report 8.6% WER which is 24.5% improvement relative to LipNet. We also verify the method on a second dataset of 81 speakers which we collected. Finally, we also discuss the effect of feature duplication on BLSTM-HMM model performance.
Tasks	Lipreading
Published	2019-06-25
URL	https://arxiv.org/abs/1906.12170v1
PDF	https://arxiv.org/pdf/1906.12170v1.pdf
PWC	https://paperswithcode.com/paper/lipreading-with-3d-2d-cnn-blstm-hmm-and-word
Repo
Framework
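
The results above are reported as word error rate (WER). As a reminder of how that metric is usually defined, the sketch below computes the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. This is the standard definition, not code from the paper, and the function name and whitespace tokenization are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For example, one wrong word in a six-word reference sentence gives a WER of 1/6 for that sentence; corpus-level figures are typically obtained by summing errors and reference lengths over all test utterances.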

Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019


Title	Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019
Authors	Usama Yaseen, Pankaj Gupta, Hinrich Schütze
Abstract	Named Entity Recognition (NER) and Relation Extraction (RE) are essential tools in distilling knowledge from biomedical literature. This paper presents our findings from participating in BioNLP Shared Tasks 2019. We addressed Named Entity Recognition including nested entities extraction, Entity Normalization and Relation Extraction. Our proposed approach of Named Entities can be generalized to different languages and we have shown its effectiveness for English and Spanish text. We investigated linguistic features, hybrid loss including ranking and Conditional Random Fields (CRF), multi-task objective and token-level ensembling strategy to improve NER. We employed dictionary based fuzzy and semantic search to perform Entity Normalization. Finally, our RE system employed Support Vector Machine (SVM) with linguistic features. Our NER submission (team:MIC-CIS) ranked first in BB-2019 norm+NER task with standard error rate (SER) of 0.7159 and showed competitive performance on PharmaCo NER task with F1-score of 0.8662. Our RE system ranked first in the SeeDev-binary Relation Extraction Task with F1-score of 0.3738.
Tasks	Named Entity Recognition, Nested Named Entity Recognition, Relation Extraction
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03385v1
PDF	https://arxiv.org/pdf/1910.03385v1.pdf
PWC	https://paperswithcode.com/paper/linguistically-informed-relation-extraction
Repo
Framework

A Robust and Precise ConvNet for small non-coding RNA classification (RPC-snRC)


Title	A Robust and Precise ConvNet for small non-coding RNA classification (RPC-snRC)
Authors	Muhammad Nabeel Asim, Muhammad Imran Malik, Andreas Dengel, Sheraz Ahmed
Abstract	Functional or non-coding RNAs are attracting more attention as they are now potentially considered valuable resources in the development of new drugs intended to cure several human diseases. The identification of drugs targeting the regulatory circuits of functional RNAs depends on knowing their family, a task which is known as RNA sequence classification. State-of-the-art small non-coding RNA classification methodologies take secondary structural features as input. However, in such classification, feature extraction approaches only take global characteristics into account and completely overlook the correlated effect of local structures. Furthermore, secondary-structure-based approaches incorporate a high-dimensional feature space, which proves computationally expensive. This paper proposes a novel Robust and Precise ConvNet (RPC-snRC) methodology which classifies small non-coding RNA sequences into their relevant families by utilizing the primary sequence of RNAs. The RPC-snRC methodology learns a hierarchical representation of features by utilizing positioning and occurrence information of nucleotides. To avoid exploding and vanishing gradient problems, we use an approach similar to DenseNet in which the gradient can flow straight from subsequent layers to previous layers. In order to assess the effectiveness of deeper architectures for small non-coding RNA classification, we also adapted two ResNet architectures having different numbers of layers. Experimental results on a benchmark small non-coding RNA dataset show that our proposed methodology not only outperforms existing small non-coding RNA classification approaches with a significant performance margin of 10% but also outshines the adapted ResNet architectures.
Tasks
Published	2019-12-23
URL	https://arxiv.org/abs/1912.11356v1
PDF	https://arxiv.org/pdf/1912.11356v1.pdf
PWC	https://paperswithcode.com/paper/a-robust-and-precise-convnet-for-small-non
Repo
Framework

Revenue Maximization of Airbnb Marketplace using Search Results


Title	Revenue Maximization of Airbnb Marketplace using Search Results
Authors	Jiawei Wen, Hossein Vahabi, Mihajlo Grbovic
Abstract	Correctly pricing products or services in an online marketplace is a challenging problem and one of the critical factors for the success of the business. When users are looking to buy an item they typically search for it. Query relevance models are used at this stage to retrieve and rank the items on the search page from most relevant to least relevant. The presented items are naturally “competing” against each other for user purchases. We provide a practical two-stage model to price this set of retrieved items for which distributions of their values are learned. The initial output of the pricing strategy is a price vector for the top displayed items in one search event. We later aggregate these results over searches to provide the supplier with the optimal price for each item. We applied our solution to large-scale search data obtained from the Airbnb Experiences marketplace. Offline evaluation results show that our strategy improves upon baseline pricing strategies on key metrics by at least +20% in terms of booking regret and +55% in terms of revenue potential.
Tasks
Published	2019-11-14
URL	https://arxiv.org/abs/1911.05887v2
PDF	https://arxiv.org/pdf/1911.05887v2.pdf
PWC	https://paperswithcode.com/paper/revenue-maximization-of-airbnb-marketplace
Repo
Framework

Probabilistic Similarity Networks


Title	Probabilistic Similarity Networks
Authors	David Heckerman
Abstract	Normative expert systems have not become commonplace because they have been difficult to build and use. Over the past decade, however, researchers have developed the influence diagram, a graphical representation of a decision maker’s beliefs, alternatives, and preferences that serves as the knowledge base of a normative expert system. Most people who have seen the representation find it intuitive and easy to use. Consequently, the influence diagram has overcome significantly the barriers to constructing normative expert systems. Nevertheless, building influence diagrams is not practical for extremely large and complex domains. In this book, I address the difficulties associated with the construction of the probabilistic portion of an influence diagram, called a knowledge map, belief network, or Bayesian network. I introduce two representations that facilitate the generation of large knowledge maps. In particular, I introduce the similarity network, a tool for building the network structure of a knowledge map, and the partition, a tool for assessing the probabilities associated with a knowledge map. I then use these representations to build Pathfinder, a large normative expert system for the diagnosis of lymph-node diseases (the domain contains over 60 diseases and over 100 disease findings). In an early version of the system, I encoded the knowledge of the expert using an erroneous assumption that all disease findings were independent, given each disease. When the expert and I attempted to build a more accurate knowledge map for the domain that would capture the dependencies among the disease findings, we failed. Using a similarity network, however, we built the knowledge-map structure for the entire domain in approximately 40 hours. Furthermore, the partition representation reduced the number of probability assessments required by the expert from 75,000 to 14,000.
Tasks
Published	2019-11-06
URL	https://arxiv.org/abs/1911.06263v1
PDF	https://arxiv.org/pdf/1911.06263v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-similarity-networks
Repo
Framework

The fastest $\ell_{1,\infty}$ prox in the west


Title	The fastest $\ell_{1,\infty}$ prox in the west
Authors	Benjamín Béjar, Ivan Dokmanić, René Vidal
Abstract	Proximal operators are of particular interest in optimization problems dealing with non-smooth objectives because in many practical cases they lead to optimization algorithms whose updates can be computed in closed form or very efficiently. A well-known example is the proximal operator of the vector $\ell_1$ norm, which is given by the soft-thresholding operator. In this paper we study the proximal operator of the mixed $\ell_{1,\infty}$ matrix norm and show that it can be computed in closed form by applying the well-known soft-thresholding operator to each column of the matrix. However, unlike the vector $\ell_1$ norm case where the threshold is constant, in the mixed $\ell_{1,\infty}$ norm case each column of the matrix might require a different threshold and all thresholds depend on the given matrix. We propose a general iterative algorithm for computing these thresholds, as well as two efficient implementations that further exploit easy to compute lower bounds for the mixed norm of the optimal solution. Experiments on large-scale synthetic and real data indicate that the proposed methods can be orders of magnitude faster than state-of-the-art methods.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.03749v1
PDF	https://arxiv.org/pdf/1910.03749v1.pdf
PWC	https://paperswithcode.com/paper/the-fastest-ell_1infty-prox-in-the-west
Repo
Framework
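
The abstract refers to the soft-thresholding operator as the proximal operator of the vector $\ell_1$ norm, and states that the mixed-norm prox applies it column by column with a different threshold per column. A minimal sketch of both pieces follows. The per-column thresholds are taken as given inputs here: computing them is the contribution of the paper and is not reproduced, and the function names and list-of-rows matrix layout are assumptions.

```python
def soft_threshold(x, lam):
    """Prox of lam * ||.||_1: shrink each entry toward zero by lam."""
    out = []
    for v in x:
        mag = max(abs(v) - lam, 0.0)  # entries with |v| <= lam become zero
        out.append(mag if v >= 0 else -mag)
    return out


def columnwise_soft_threshold(matrix, thresholds):
    """Soft-threshold column j of a row-major matrix with thresholds[j].

    The thresholds are assumed to be supplied by the caller.
    """
    return [[soft_threshold([v], thresholds[j])[0] for j, v in enumerate(row)]
            for row in matrix]
```

With a single shared threshold the second function reduces to ordinary entrywise soft-thresholding; the difficulty described in the abstract lies in finding the column-specific thresholds, which depend on the whole matrix.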

Cache-Friendly Search Trees; or, In Which Everything Beats std::set


Title	Cache-Friendly Search Trees; or, In Which Everything Beats std::set
Authors	Jeffrey Barratt, Brian Zhang
Abstract	While a lot of work in theoretical computer science has gone into optimizing the runtime and space usage of data structures, such work very often neglects a very important component of modern computers: the cache. In doing so, very often, data structures are developed that achieve theoretically-good runtimes but are slow in practice due to a large number of cache misses. In 1999, Frigo et al. introduced the notion of a cache-oblivious algorithm: an algorithm that uses the cache to its advantage, regardless of the size or structure of said cache. Since then, various authors have designed cache-oblivious algorithms and data structures for problems from matrix multiplication to array sorting. We focus in this work on cache-oblivious search trees; i.e. implementing an ordered dictionary in a cache-friendly manner. We will start by presenting an overview of cache-oblivious data structures, especially cache-oblivious search trees. We then give practical results using these cache-oblivious structures on modern-day machinery, comparing them to the standard std::set and other cache-friendly dictionaries such as B-trees.
Tasks
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01631v1
PDF	https://arxiv.org/pdf/1907.01631v1.pdf
PWC	https://paperswithcode.com/paper/cache-friendly-search-trees-or-in-which
Repo
Framework

Understanding Chat Messages for Sticker Recommendation in Messaging Apps


Title	Understanding Chat Messages for Sticker Recommendation in Messaging Apps
Authors	Abhishek Laddha, Mohamed Hanoosh, Debdoot Mukherjee, Parth Patwa, Ankur Narang
Abstract	Stickers are popularly used in messaging apps such as Hike to visually express a nuanced range of thoughts and utterances to convey exaggerated emotions. However, discovering the right sticker from a large and ever expanding pool of stickers while chatting can be cumbersome. In this paper, we describe a system for recommending stickers in real time as the user is typing based on the context of the conversation. We decompose the sticker recommendation (SR) problem into two steps. First, we predict the message that the user is likely to send in the chat. Second, we substitute the predicted message with an appropriate sticker. The majority of Hike’s messages are in the form of text which is transliterated from users’ native language to the Roman script. This leads to numerous orthographic variations of the same message and makes accurate message prediction challenging. To address this issue, we learn dense representations of chat messages employing a character-level convolution network in an unsupervised manner. We use them to cluster the messages that have the same meaning. In the subsequent steps, we predict the message cluster instead of the message. Our approach does not depend on human labelled data (except for validation), leading to a fully automatic updating and tuning pipeline for the underlying models. We also propose a novel hybrid message prediction model, which can run with low latency on low-end phones that have severe computational limitations. Our described system has been deployed for more than 6 months and is being used by millions of users along with hundreds of thousands of expressive stickers.
Tasks
Published	2019-02-07
URL	https://arxiv.org/abs/1902.02704v4
PDF	https://arxiv.org/pdf/1902.02704v4.pdf
PWC	https://paperswithcode.com/paper/understanding-chat-messages-for-sticker
Repo
Framework

Revisit Policy Optimization in Matrix Form


Title	Revisit Policy Optimization in Matrix Form
Authors	Sitao Luan, Xiao-Wen Chang, Doina Precup
Abstract	In tabular case, when the reward and environment dynamics are known, policy evaluation can be written as $\bm{V}_{\bm{\pi}} = (I - \gamma P_{\bm{\pi}})^{-1} \bm{r}_{\bm{\pi}}$, where $P_{\bm{\pi}}$ is the state transition matrix given policy ${\bm{\pi}}$ and $\bm{r}_{\bm{\pi}}$ is the reward signal given ${\bm{\pi}}$. What annoys us is that $P_{\bm{\pi}}$ and $\bm{r}_{\bm{\pi}}$ are both mixed with ${\bm{\pi}}$, which means every time when we update ${\bm{\pi}}$, they will change together. In this paper, we leverage the notation from \cite{wang2007dual} to disentangle ${\bm{\pi}}$ and environment dynamics which makes optimization over policy more straightforward. We show that policy gradient theorem \cite{sutton2018reinforcement} and TRPO \cite{schulman2015trust} can be put into a more general framework and such notation has good potential to be extended to model-based reinforcement learning.
Tasks
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09186v1
PDF	https://arxiv.org/pdf/1909.09186v1.pdf
PWC	https://paperswithcode.com/paper/revisit-policy-optimization-in-matrix-form
Repo
Framework
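
The closed-form policy evaluation in the abstract, the value vector equal to $(I - \gamma P)^{-1} r$ for the policy-induced transition matrix and reward vector, can be checked numerically on a tiny example. The sketch below solves the linear system with a small Gauss-Jordan routine in plain Python. The two-state transition matrix, rewards and discount factor are made-up illustrative values, and the dual notation the paper builds on is not reproduced.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes A is square and nonsingular.
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def policy_evaluation(P_pi, r_pi, gamma):
    """Return V solving (I - gamma * P_pi) V = r_pi."""
    n = len(r_pi)
    A = [[(1.0 if i == j else 0.0) - gamma * P_pi[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, r_pi)


# Hypothetical two-state example: rows of P_pi sum to 1.
P_pi = [[0.9, 0.1], [0.2, 0.8]]
r_pi = [1.0, 0.0]
V = policy_evaluation(P_pi, r_pi, gamma=0.5)
```

The result satisfies the Bellman equation, each state's value equals its reward plus the discounted expected value of the next state, which is a convenient sanity check. Because both the matrix and the reward vector depend on the policy, they must be rebuilt whenever the policy changes, which is the coupling the abstract sets out to remove.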