Paper Group ANR 688
A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization
Title | A Fast and Efficient Stochastic Opposition-Based Learning for Differential Evolution in Numerical Optimization |
Authors | Tae Jong Choi, Julian Togelius, Yun-Gyung Cheong |
Abstract | A new variant of stochastic opposition-based learning (OBL) is proposed in this paper. OBL is a relatively new machine learning concept that consists of simultaneously calculating an original solution and its opposite to accelerate the convergence of soft computing algorithms. Recently, a new opposition-based differential evolution (ODE) variant called BetaCODE was proposed as a combination of differential evolution and a new stochastic OBL variant called BetaCOBL. BetaCOBL can flexibly adjust the probability density functions used to calculate opposite solutions, generate more diverse opposite solutions, and prevent the waste of fitness evaluations. While it has shown outstanding performance compared to several state-of-the-art OBL variants, BetaCOBL struggles with more complex problems because of its high computational cost. Moreover, because it assumes that the decision variables are independent, its ability to find good opposite solutions on non-separable problems is limited. In this paper, we propose an improved stochastic OBL variant that mitigates all these limitations. The proposed algorithm, called iBetaCOBL, reduces the computational cost from $O(NP^{2} \cdot D)$ to $O(NP \cdot D)$ ($NP$ and $D$ stand for population size and dimension, respectively) using a linear-time diversity measure. In addition, iBetaCOBL preserves strongly dependent decision variables that are adjacent to each other using a multiple exponential crossover. The results of performance evaluations on a set of 58 test functions show that iBetaCODE finds more accurate solutions than ten state-of-the-art ODE variants, including BetaCODE. Additionally, we applied iBetaCOBL to two state-of-the-art DE variants, and as in the previous results, the iBetaCOBL-based variants exhibit significantly improved performance. |
Tasks | |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.08011v1 |
https://arxiv.org/pdf/1908.08011v1.pdf | |
PWC | https://paperswithcode.com/paper/190808011 |
Repo | |
Framework | |
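As a concrete illustration of the opposition operator discussed in the abstract above, the classic OBL step computes the opposite of a candidate $x$ in a box $[low, high]$ as $low + high - x$ and keeps whichever point has the better fitness. This is only the basic OBL operator, not the beta-distribution-based BetaCOBL/iBetaCOBL variants the paper proposes; the sphere fitness function and the bounds below are illustrative choices, not from the paper.

```python
# Classic opposition-based learning (OBL) step -- a minimal sketch of the
# general idea behind the stochastic OBL variants discussed above.

def sphere(x):
    """Simple separable test function: f(x) = sum of x_i^2 (lower is better)."""
    return sum(v * v for v in x)

def obl_step(x, low, high, fitness=sphere):
    """Compute the opposite point and return whichever candidate is fitter."""
    opposite = [lo + hi - v for v, lo, hi in zip(x, low, high)]
    return x if fitness(x) <= fitness(opposite) else opposite

x = [2.5, 2.5]
better = obl_step(x, low=[-1.0, -1.0], high=[3.0, 3.0])
# better == [-0.5, -0.5]: the opposite point is much closer to the optimum
```

In an ODE algorithm this comparison is applied population-wide, which is where the paper's linear-time diversity measure cuts the cost from $O(NP^{2} \cdot D)$ to $O(NP \cdot D)$.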
Learning neutrino effects in Cosmology with Convolutional Neural Networks
Title | Learning neutrino effects in Cosmology with Convolutional Neural Networks |
Authors | Elena Giusarma, Mauricio Reyes Hurtado, Francisco Villaescusa-Navarro, Siyu He, Shirley Ho, ChangHoon Hahn |
Abstract | Measuring the sum of the three active neutrino masses, $M_\nu$, is one of the most important challenges in modern cosmology. Massive neutrinos imprint characteristic signatures on several cosmological observables, in particular on the large-scale structure of the Universe. In order to maximize the information that can be retrieved from galaxy surveys, accurate theoretical predictions in the non-linear regime are needed. Currently, one way to achieve those predictions is by running cosmological numerical simulations. Unfortunately, producing those simulations requires substantial computational resources: seven hundred CPU hours for each neutrino mass case. In this work, we propose a new method, based on a deep learning network (U-Net), to quickly generate simulations with massive neutrinos from standard $\Lambda$CDM simulations without neutrinos. We computed multiple relevant statistical measures of the deep-learning-generated simulations and conclude that our method accurately reproduces the 3-dimensional spatial distribution of matter down to non-linear scales: $k < 0.7$ h/Mpc. Finally, our method allows us to generate massive neutrino simulations 10,000 times faster than the traditional methods. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04255v1 |
https://arxiv.org/pdf/1910.04255v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-neutrino-effects-in-cosmology-with |
Repo | |
Framework | |
Asymptotics of MAP Inference in Deep Networks
Title | Asymptotics of MAP Inference in Deep Networks |
Authors | Parthe Pandit, Mojtaba Sahraee, Sundeep Rangan, Alyson K. Fletcher |
Abstract | Deep generative priors are a powerful tool for reconstruction problems with complex data such as images and text. Inverse problems using such models require solving an inference problem of estimating the input and hidden units of the multi-layer network from its output. Maximum a posteriori (MAP) estimation is a widely used inference method, as it is straightforward to implement and has been successful in practice. However, rigorous analysis of MAP inference in multi-layer networks is difficult. This work considers a recently developed method, multi-layer vector approximate message passing (ML-VAMP), to study MAP inference in deep networks. It is shown that the mean squared error of the ML-VAMP estimate can be exactly and rigorously characterized in a certain high-dimensional random limit. The approach thus provides a tractable method for MAP inference with exact performance guarantees. |
Tasks | |
Published | 2019-03-01 |
URL | http://arxiv.org/abs/1903.01293v1 |
http://arxiv.org/pdf/1903.01293v1.pdf | |
PWC | https://paperswithcode.com/paper/asymptotics-of-map-inference-in-deep-networks |
Repo | |
Framework | |
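To make the MAP objective in the abstract above concrete, consider the simplest possible inverse problem: recovering a scalar input $z$ from an observation $y = az + \text{noise}$ under a Gaussian prior. MAP estimation then minimizes a quadratic, which a few gradient steps solve. This one-dimensional toy is only an illustration of what "MAP inference" means here; it does not implement ML-VAMP or a deep network, and all numbers are made up.

```python
# Toy MAP inference for y = a*z + noise with a Gaussian prior on z.
# MAP minimizes (y - a*z)^2 / (2*sigma2) + z^2 / (2*tau2).

def map_estimate(y, a, sigma2, tau2, steps=500, lr=0.1):
    """Gradient descent on the negative log-posterior (quadratic, so convex)."""
    z = 0.0
    for _ in range(steps):
        grad = -a * (y - a * z) / sigma2 + z / tau2
        z -= lr * grad
    return z

y, a, sigma2, tau2 = 2.0, 1.0, 0.5, 1.0
z_map = map_estimate(y, a, sigma2, tau2)
# For this quadratic problem the closed form is z* = a*y*tau2 / (a^2*tau2 + sigma2),
# and the iterate converges to it.
z_star = a * y * tau2 / (a * a * tau2 + sigma2)
```

The paper's point is that for multi-layer networks no such closed form exists, yet ML-VAMP's error can still be characterized exactly in a high-dimensional limit.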
A Random Interaction Forest for Prioritizing Predictive Biomarkers
Title | A Random Interaction Forest for Prioritizing Predictive Biomarkers |
Authors | Zhen Zeng, Yuefeng Lu, Judong Shen, Wei Zheng, Peter Shaw, Mary Beth Dorr |
Abstract | Precision medicine has recently become a focus of medical research, as its implementation brings value to all stakeholders in the healthcare system. Various statistical methodologies have been developed to tackle problems in different aspects of this field, e.g., assessing treatment heterogeneity, identifying patient subgroups, or building treatment decision models. However, there is a lack of new tools devoted to selecting and prioritizing predictive biomarkers. We propose a novel tree-based ensemble method, the random interaction forest (RIF), to generate predictive importance scores and prioritize candidate biomarkers for constructing refined treatment decision models. RIF was evaluated by comparison with conventional random forest and univariable regression methods and showed favorable properties under various simulation scenarios. We applied the proposed RIF method to a biomarker dataset from two phase III clinical trials of bezlotoxumab on $\textit{Clostridium difficile}$ infection recurrence and obtained biologically meaningful results. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01786v1 |
https://arxiv.org/pdf/1910.01786v1.pdf | |
PWC | https://paperswithcode.com/paper/a-random-interaction-forest-for-prioritizing |
Repo | |
Framework | |
Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media
Title | Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media |
Authors | Yunshan Ma, Xun Yang, Lizi Liao, Yixin Cao, Tat-Seng Chua |
Abstract | Fashion knowledge helps people dress properly and addresses not only the physiological needs of users but also the demands of social activities and conventions. It usually involves three mutually related aspects: occasion, person, and clothing. However, few works focus on extracting such knowledge, which would greatly benefit many downstream applications, such as fashion recommendation. In this paper, we propose a novel method to automatically harvest fashion knowledge from social media. We unify the three tasks of occasion, person, and clothing discovery from multiple modalities of images, texts, and metadata. For person detection and analysis, we use off-the-shelf tools due to their flexibility and satisfactory performance. For clothing recognition and occasion prediction, we unify the two tasks using a contextualized fashion concept learning module, which captures the dependencies and correlations among different fashion concepts. To alleviate the heavy burden of human annotation, we introduce a weak label modeling module that can effectively exploit machine-labeled data, a complement to clean data. In experiments, we contribute a benchmark dataset and conduct extensive experiments from both quantitative and qualitative perspectives. The results demonstrate the effectiveness of our model in fashion concept prediction, and the usefulness of the extracted knowledge with comprehensive analysis. |
Tasks | Human Detection |
Published | 2019-08-12 |
URL | https://arxiv.org/abs/1908.08985v1 |
https://arxiv.org/pdf/1908.08985v1.pdf | |
PWC | https://paperswithcode.com/paper/who-where-and-what-to-wear-extracting-fashion |
Repo | |
Framework | |
Transfer of Adversarial Robustness Between Perturbation Types
Title | Transfer of Adversarial Robustness Between Perturbation Types |
Authors | Daniel Kang, Yi Sun, Tom Brown, Dan Hendrycks, Jacob Steinhardt |
Abstract | We study the transfer of adversarial robustness of deep neural networks between different perturbation types. While most work on adversarial examples has focused on $L_\infty$ and $L_2$-bounded perturbations, these do not capture all types of perturbations available to an adversary. The present work evaluates 32 attacks of 5 different types against models adversarially trained on a 100-class subset of ImageNet. Our empirical results suggest that evaluating on a wide range of perturbation sizes is necessary to understand whether adversarial robustness transfers between perturbation types. We further demonstrate that robustness against one perturbation type may not always imply and may sometimes hurt robustness against other perturbation types. In light of these results, we recommend evaluation of adversarial defenses take place on a diverse range of perturbation types and sizes. |
Tasks | |
Published | 2019-05-03 |
URL | https://arxiv.org/abs/1905.01034v1 |
https://arxiv.org/pdf/1905.01034v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-of-adversarial-robustness-between |
Repo | |
Framework | |
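The $L_\infty$-bounded perturbations studied in the abstract above are classically generated by the fast gradient sign method (FGSM): step each input coordinate by $\epsilon$ in the direction of the loss gradient's sign. The sketch below applies FGSM to a toy logistic-regression model; the paper attacks ImageNet classifiers with a broader attack suite, so the weights and inputs here are purely illustrative.

```python
import math

# FGSM: the canonical L-infinity-bounded attack, on a logistic-regression toy.

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fgsm(x, y, w, eps):
    """Perturb x by eps * sign(grad_x of the logistic loss)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    grad = [(p - y) * wi for wi in w]          # d(loss)/dx for logistic loss
    sign = [1.0 if g > 0 else -1.0 if g < 0 else 0.0 for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]

w = [2.0, -1.0]
x, y = [1.0, 1.0], 1.0      # correctly classified: w.x = 1 > 0
x_adv = fgsm(x, y, w, eps=0.6)
# w . x_adv = -0.8 < 0: the epsilon-ball perturbation flips the prediction
```

Training against one such perturbation type is precisely what the paper shows may fail to confer, or may even hurt, robustness against others.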
LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
Title | LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models |
Authors | Dilip Kumar Margam, Rohith Aralikatti, Tanay Sharma, Abhinav Thanda, Pujitha A K, Sharad Roy, Shankar M Venkatesan |
Abstract | In recent years, deep learning based machine lipreading has gained prominence. To this end, several architectures such as LipNet, LCANet and others have been proposed which perform extremely well compared to traditional lipreading DNN-HMM hybrid systems trained on DCT features. In this work, we propose a simpler architecture of a 3D-2D-CNN-BLSTM network with a bottleneck layer. We also present an analysis of two different approaches for lipreading on this architecture. In the first approach, the 3D-2D-CNN-BLSTM network is trained with CTC loss on characters (ch-CTC). Then a BLSTM-HMM model is trained on the bottleneck lip features (extracted from the 3D-2D-CNN-BLSTM ch-CTC network) in a traditional ASR training pipeline. In the second approach, the same 3D-2D-CNN-BLSTM network is trained with CTC loss on word labels (w-CTC). The first approach shows that bottleneck features perform better than DCT features. Using the second approach on the Grid corpus’ seen-speaker test set, we report 1.3% WER, a 55% improvement relative to LCANet. On the unseen-speaker test set we report 8.6% WER, which is a 24.5% improvement relative to LipNet. We also verify the method on a second dataset of 81 speakers which we collected. Finally, we also discuss the effect of feature duplication on BLSTM-HMM model performance. |
Tasks | Lipreading |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.12170v1 |
https://arxiv.org/pdf/1906.12170v1.pdf | |
PWC | https://paperswithcode.com/paper/lipreading-with-3d-2d-cnn-blstm-hmm-and-word |
Repo | |
Framework | |
Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019
Title | Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019 |
Authors | Usama Yaseen, Pankaj Gupta, Hinrich Schütze |
Abstract | Named Entity Recognition (NER) and Relation Extraction (RE) are essential tools in distilling knowledge from biomedical literature. This paper presents our findings from participating in the BioNLP Shared Tasks 2019. We addressed Named Entity Recognition including nested entity extraction, Entity Normalization and Relation Extraction. Our proposed approach to named entity recognition can be generalized to different languages, and we have shown its effectiveness for English and Spanish text. We investigated linguistic features, a hybrid loss including ranking and Conditional Random Fields (CRF), a multi-task objective and a token-level ensembling strategy to improve NER. We employed dictionary-based fuzzy and semantic search to perform Entity Normalization. Finally, our RE system employed a Support Vector Machine (SVM) with linguistic features. Our NER submission (team: MIC-CIS) ranked first in the BB-2019 norm+NER task with a standard error rate (SER) of 0.7159 and showed competitive performance on the PharmaCoNER task with an F1-score of 0.8662. Our RE system ranked first in the SeeDev-binary Relation Extraction Task with an F1-score of 0.3738. |
Tasks | Named Entity Recognition, Nested Named Entity Recognition, Relation Extraction |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03385v1 |
https://arxiv.org/pdf/1910.03385v1.pdf | |
PWC | https://paperswithcode.com/paper/linguistically-informed-relation-extraction |
Repo | |
Framework | |
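The CRF layer mentioned in the abstract above is decoded at inference time with the Viterbi algorithm: find the tag sequence maximizing the sum of emission and transition scores. The sketch below runs Viterbi on a tiny two-tag example; the tags and scores are invented for illustration and are not from the paper's models.

```python
# Viterbi decoding for a linear-chain CRF (max-sum over tag sequences).

def viterbi(emissions, transitions, tags):
    """emissions: list of {tag: score} per token; transitions: {(a, b): score}."""
    best = {t: (emissions[0][t], None) for t in tags}  # (path score, backpointer)
    history = []
    for em in emissions[1:]:
        new_best = {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p][0] + transitions[(p, t)])
            new_best[t] = (best[prev][0] + transitions[(prev, t)] + em[t], prev)
        history.append(best)
        best = new_best
    # Backtrack from the highest-scoring final tag.
    last = max(tags, key=lambda t: best[t][0])
    path, node = [last], best
    for step in reversed(history):
        last = node[last][1]
        path.append(last)
        node = step
    return list(reversed(path))

tags = ["O", "ENT"]
emissions = [{"O": 1.0, "ENT": 0.2},
             {"O": 0.1, "ENT": 1.5},
             {"O": 0.9, "ENT": 0.3}]
transitions = {("O", "O"): 0.5, ("O", "ENT"): 0.0,
               ("ENT", "O"): 0.0, ("ENT", "ENT"): 0.4}
path = viterbi(emissions, transitions, tags)
# path == ["O", "ENT", "O"]: the middle token is tagged as an entity
```

Transition scores are what let a CRF express constraints a per-token classifier cannot, such as discouraging implausible tag bigrams.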
A Robust and Precise ConvNet for small non-coding RNA classification (RPC-snRC)
Title | A Robust and Precise ConvNet for small non-coding RNA classification (RPC-snRC) |
Authors | Muhammad Nabeel Asim, Muhammad Imran Malik, Andreas Dengel, Sheraz Ahmed |
Abstract | Functional or non-coding RNAs are attracting more attention as they are now potentially considered valuable resources in the development of new drugs intended to cure several human diseases. The identification of drugs targeting the regulatory circuits of functional RNAs depends on knowing their families, a task known as RNA sequence classification. State-of-the-art small non-coding RNA classification methodologies take secondary structural features as input. However, the feature extraction approaches used in such classification only take global characteristics into account and completely overlook the correlative effect of local structures. Furthermore, secondary-structure-based approaches incorporate a high-dimensional feature space, which proves computationally expensive. This paper proposes a novel Robust and Precise ConvNet (RPC-snRC) methodology which classifies small non-coding RNA sequences into their relevant families by utilizing the primary sequence of RNAs. The RPC-snRC methodology learns a hierarchical representation of features by utilizing positioning and occurrence information of nucleotides. To avoid exploding and vanishing gradient problems, we use an approach similar to DenseNet, in which gradients can flow straight from subsequent layers to previous layers. In order to assess the effectiveness of deeper architectures for small non-coding RNA classification, we also adapted two ResNet architectures with different numbers of layers. Experimental results on a benchmark small non-coding RNA dataset show that our proposed methodology not only outperforms existing small non-coding RNA classification approaches by a significant performance margin of 10% but also outshines the adapted ResNet architectures. |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.11356v1 |
https://arxiv.org/pdf/1912.11356v1.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-and-precise-convnet-for-small-non |
Repo | |
Framework | |
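Feeding a primary RNA sequence into a ConvNet, as the entry above describes, requires turning nucleotides into numbers; one-hot encoding is the standard baseline for this. Note the paper derives its own positioning/occurrence features, so the encoding below is only the conventional starting point, not the paper's exact input representation.

```python
# One-hot encoding of an RNA primary sequence for ConvNet input.

NUCLEOTIDES = "ACGU"

def one_hot(seq):
    """Map each nucleotide to a length-4 indicator vector (A, C, G, U)."""
    index = {n: i for i, n in enumerate(NUCLEOTIDES)}
    rows = []
    for nt in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        row[index[nt]] = 1   # unknown symbols would raise KeyError here
        rows.append(row)
    return rows

encoded = one_hot("AUGC")
# encoded == [[1,0,0,0], [0,0,0,1], [0,0,1,0], [0,1,0,0]]
```

A 1-D convolution over such rows sees local nucleotide context, which is exactly the "co-relative effect of local structures" the abstract argues secondary-structure features miss.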
Revenue Maximization of Airbnb Marketplace using Search Results
Title | Revenue Maximization of Airbnb Marketplace using Search Results |
Authors | Jiawei Wen, Hossein Vahabi, Mihajlo Grbovic |
Abstract | Correctly pricing products or services in an online marketplace is a challenging problem and one of the critical factors for the success of the business. When users are looking to buy an item, they typically search for it. Query relevance models are used at this stage to retrieve and rank the items on the search page from most relevant to least relevant. The presented items are naturally “competing” against each other for user purchases. We provide a practical two-stage model to price this set of retrieved items, for which distributions of their values are learned. The initial output of the pricing strategy is a price vector for the top displayed items in one search event. We later aggregate these results over searches to provide the supplier with the optimal price for each item. We applied our solution to large-scale search data obtained from the Airbnb Experiences marketplace. Offline evaluation results show that our strategy improves upon baseline pricing strategies on key metrics by at least +20% in terms of booking regret and +55% in terms of revenue potential. |
Tasks | |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.05887v2 |
https://arxiv.org/pdf/1911.05887v2.pdf | |
PWC | https://paperswithcode.com/paper/revenue-maximization-of-airbnb-marketplace |
Repo | |
Framework | |
Probabilistic Similarity Networks
Title | Probabilistic Similarity Networks |
Authors | David Heckerman |
Abstract | Normative expert systems have not become commonplace because they have been difficult to build and use. Over the past decade, however, researchers have developed the influence diagram, a graphical representation of a decision maker’s beliefs, alternatives, and preferences that serves as the knowledge base of a normative expert system. Most people who have seen the representation find it intuitive and easy to use. Consequently, the influence diagram has overcome significantly the barriers to constructing normative expert systems. Nevertheless, building influence diagrams is not practical for extremely large and complex domains. In this book, I address the difficulties associated with the construction of the probabilistic portion of an influence diagram, called a knowledge map, belief network, or Bayesian network. I introduce two representations that facilitate the generation of large knowledge maps. In particular, I introduce the similarity network, a tool for building the network structure of a knowledge map, and the partition, a tool for assessing the probabilities associated with a knowledge map. I then use these representations to build Pathfinder, a large normative expert system for the diagnosis of lymph-node diseases (the domain contains over 60 diseases and over 100 disease findings). In an early version of the system, I encoded the knowledge of the expert using an erroneous assumption that all disease findings were independent, given each disease. When the expert and I attempted to build a more accurate knowledge map for the domain that would capture the dependencies among the disease findings, we failed. Using a similarity network, however, we built the knowledge-map structure for the entire domain in approximately 40 hours. Furthermore, the partition representation reduced the number of probability assessments required by the expert from 75,000 to 14,000. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.06263v1 |
https://arxiv.org/pdf/1911.06263v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-similarity-networks |
Repo | |
Framework | |
The fastest $\ell_{1,\infty}$ prox in the west
Title | The fastest $\ell_{1,\infty}$ prox in the west |
Authors | Benjamín Béjar, Ivan Dokmanić, René Vidal |
Abstract | Proximal operators are of particular interest in optimization problems dealing with non-smooth objectives because in many practical cases they lead to optimization algorithms whose updates can be computed in closed form or very efficiently. A well-known example is the proximal operator of the vector $\ell_1$ norm, which is given by the soft-thresholding operator. In this paper we study the proximal operator of the mixed $\ell_{1,\infty}$ matrix norm and show that it can be computed in closed form by applying the well-known soft-thresholding operator to each column of the matrix. However, unlike the vector $\ell_1$ norm case where the threshold is constant, in the mixed $\ell_{1,\infty}$ norm case each column of the matrix might require a different threshold and all thresholds depend on the given matrix. We propose a general iterative algorithm for computing these thresholds, as well as two efficient implementations that further exploit easy to compute lower bounds for the mixed norm of the optimal solution. Experiments on large-scale synthetic and real data indicate that the proposed methods can be orders of magnitude faster than state-of-the-art methods. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.03749v1 |
https://arxiv.org/pdf/1910.03749v1.pdf | |
PWC | https://paperswithcode.com/paper/the-fastest-ell_1infty-prox-in-the-west |
Repo | |
Framework | |
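The building block of the abstract above is the soft-thresholding operator, the proximal operator of the vector $\ell_1$ norm: shrink every entry toward zero by a threshold $\tau$. The paper's contribution is computing a *different*, data-dependent threshold per matrix column for the $\ell_{1,\infty}$ case; that threshold search is not reproduced here, only the scalar shrinkage it builds on.

```python
# Soft-thresholding: prox of tau * ||.||_1, applied entrywise.

def soft_threshold(x, tau):
    """Shrink each entry of x toward 0 by tau; entries below tau become 0."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v >= 0 else -1.0) for v in x]

shrunk = soft_threshold([3.0, -0.5, 1.5], tau=1.0)
# shrunk == [2.0, 0.0, 0.5]: small entries are zeroed, large ones shrunk by tau
```

Sparsity-inducing behavior follows directly: any entry with magnitude below $\tau$ maps exactly to zero, which is why $\ell_1$-regularized solutions are sparse.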
Cache-Friendly Search Trees; or, In Which Everything Beats std::set
Title | Cache-Friendly Search Trees; or, In Which Everything Beats std::set |
Authors | Jeffrey Barratt, Brian Zhang |
Abstract | While a lot of work in theoretical computer science has gone into optimizing the runtime and space usage of data structures, such work very often neglects an important component of modern computers: the cache. In doing so, data structures are often developed that achieve theoretically good runtimes but are slow in practice due to a large number of cache misses. In 1999, Frigo et al. introduced the notion of a cache-oblivious algorithm: an algorithm that uses the cache to its advantage, regardless of the size or structure of said cache. Since then, various authors have designed cache-oblivious algorithms and data structures for problems from matrix multiplication to array sorting. We focus in this work on cache-oblivious search trees, i.e., implementing an ordered dictionary in a cache-friendly manner. We start by presenting an overview of cache-oblivious data structures, especially cache-oblivious search trees. We then give practical results using these cache-oblivious structures on modern-day machinery, comparing them to the standard std::set and other cache-friendly dictionaries such as B-trees. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01631v1 |
https://arxiv.org/pdf/1907.01631v1.pdf | |
PWC | https://paperswithcode.com/paper/cache-friendly-search-trees-or-in-which |
Repo | |
Framework | |
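One classic cache-friendly alternative to a pointer-based tree like std::set is an implicit binary search tree stored in an array in BFS (Eytzinger) order: node $k$'s children live at $2k$ and $2k+1$, so the hot nodes near the root share a few cache lines. The sketch below shows the layout and lookup in Python for clarity; it is one representative of the family of layouts the paper benchmarks, not necessarily the paper's exact structures.

```python
# Eytzinger (BFS) layout: an implicit binary search tree in a flat array.

def eytzinger(sorted_vals):
    """Rearrange a sorted list into BFS order, 1-indexed (slot 0 unused)."""
    out = [None] * (len(sorted_vals) + 1)
    it = iter(sorted_vals)
    def fill(k):
        if k < len(out):
            fill(2 * k)          # in-order traversal of the implicit tree
            out[k] = next(it)    # assigns sorted values in BST order
            fill(2 * k + 1)
    fill(1)
    return out

def contains(tree, x):
    k = 1
    while k < len(tree):
        if tree[k] == x:
            return True
        k = 2 * k + (x > tree[k])  # descend left/right via boolean arithmetic
    return False

tree = eytzinger([1, 2, 3, 4, 5, 6, 7])
# tree == [None, 4, 2, 6, 1, 3, 5, 7]: root 4, then its children 2 and 6, etc.
```

Because each level is contiguous, a lookup's first few probes hit the same cache lines on every query, which is the effect the paper measures against std::set.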
Understanding Chat Messages for Sticker Recommendation in Messaging Apps
Title | Understanding Chat Messages for Sticker Recommendation in Messaging Apps |
Authors | Abhishek Laddha, Mohamed Hanoosh, Debdoot Mukherjee, Parth Patwa, Ankur Narang |
Abstract | Stickers are popularly used in messaging apps such as Hike to visually express a nuanced range of thoughts and utterances and to convey exaggerated emotions. However, discovering the right sticker from a large and ever-expanding pool of stickers while chatting can be cumbersome. In this paper, we describe a system for recommending stickers in real time as the user is typing, based on the context of the conversation. We decompose the sticker recommendation (SR) problem into two steps. First, we predict the message that the user is likely to send in the chat. Second, we substitute the predicted message with an appropriate sticker. The majority of Hike’s messages are in the form of text transliterated from users’ native language to the Roman script. This leads to numerous orthographic variations of the same message and makes accurate message prediction challenging. To address this issue, we learn dense representations of chat messages using a character-level convolutional network in an unsupervised manner. We use them to cluster messages that have the same meaning. In the subsequent steps, we predict the message cluster instead of the message. Our approach does not depend on human-labelled data (except for validation), leading to a fully automatic updating and tuning pipeline for the underlying models. We also propose a novel hybrid message prediction model, which can run with low latency on low-end phones that have severe computational limitations. The described system has been deployed for more than 6 months and is being used by millions of users along with hundreds of thousands of expressive stickers. |
Tasks | |
Published | 2019-02-07 |
URL | https://arxiv.org/abs/1902.02704v4 |
https://arxiv.org/pdf/1902.02704v4.pdf | |
PWC | https://paperswithcode.com/paper/understanding-chat-messages-for-sticker |
Repo | |
Framework | |
Revisit Policy Optimization in Matrix Form
Title | Revisit Policy Optimization in Matrix Form |
Authors | Sitao Luan, Xiao-Wen Chang, Doina Precup |
Abstract | In the tabular case, when the reward and environment dynamics are known, policy evaluation can be written as $\bm{V}_{\bm{\pi}} = (I - \gamma P_{\bm{\pi}})^{-1} \bm{r}_{\bm{\pi}}$, where $P_{\bm{\pi}}$ is the state transition matrix given policy ${\bm{\pi}}$ and $\bm{r}_{\bm{\pi}}$ is the reward signal given ${\bm{\pi}}$. The difficulty is that $P_{\bm{\pi}}$ and $\bm{r}_{\bm{\pi}}$ are both mixed with ${\bm{\pi}}$, which means that every time we update ${\bm{\pi}}$, they change together. In this paper, we leverage the notation from \cite{wang2007dual} to disentangle ${\bm{\pi}}$ from the environment dynamics, which makes optimization over the policy more straightforward. We show that the policy gradient theorem \cite{sutton2018reinforcement} and TRPO \cite{schulman2015trust} can be put into a more general framework, and such notation has good potential to be extended to model-based reinforcement learning. |
Tasks | |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09186v1 |
https://arxiv.org/pdf/1909.09186v1.pdf | |
PWC | https://paperswithcode.com/paper/revisit-policy-optimization-in-matrix-form |
Repo | |
Framework | |
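The closed form $\bm{V}_{\bm{\pi}} = (I - \gamma P_{\bm{\pi}})^{-1} \bm{r}_{\bm{\pi}}$ quoted in the abstract above can be checked numerically: iterating the Bellman backup $V \leftarrow r + \gamma P V$ converges to the same fixed point, since the backup is a $\gamma$-contraction. The two-state MDP below uses made-up transition probabilities and rewards purely for illustration.

```python
# Policy evaluation for a tiny 2-state MDP under a fixed policy pi.
# Iterating V <- r + gamma * P V converges to V = (I - gamma*P)^(-1) r.

gamma = 0.9
P = [[0.8, 0.2],        # P[s][s']: transition probabilities under pi
     [0.3, 0.7]]
r = [1.0, 0.0]          # expected one-step reward per state

V = [0.0, 0.0]
for _ in range(1000):
    V = [r[s] + gamma * sum(P[s][t] * V[t] for t in range(2)) for s in range(2)]
# V now matches the direct solve of (I - gamma*P) V = r
```

The paper's complaint is precisely that both $P_{\bm{\pi}}$ and $\bm{r}_{\bm{\pi}}$ in this formula change whenever ${\bm{\pi}}$ does, which its matrix notation is designed to untangle.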