Paper Group AWR 241
- Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops
- AttaCut: A Fast and Accurate Neural Thai Word Segmenter
- Robust Learning with the Hilbert-Schmidt Independence Criterion
- A coupled autoencoder approach for multi-modal analysis of cell types
- Improving Face Anti-Spoofing by 3D Virtual Synthesis
- Using Similarity Measures to Select Pretraining Data for NER
- Positional Encoding to Control Output Sequence Length
- Data-Efficient Image Recognition with Contrastive Predictive Coding
- ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs
- Prediction Focused Topic Models via Feature Selection
- LassoNet: Neural networks with Feature Sparsity
- A Feature Selection Based on Perturbation Theory
- AutoSF: Searching Scoring Functions for Knowledge Graph Embedding
- INN: Inflated Neural Networks for IPMN Diagnosis
- Discourse-Based Evaluation of Language Understanding
Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops
Title | Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops |
Authors | Yan Ru Pei, Haik Manukian, Massimiliano Di Ventra |
Abstract | Many optimization problems can be cast into the maximum satisfiability (MAX-SAT) form, and many solvers have been developed for tackling such problems. To evaluate a MAX-SAT solver, it is convenient to generate hard MAX-SAT instances with known solutions. Here, we propose a method of generating weighted MAX-2-SAT instances inspired by the frustrated-loop algorithm used by the quantum annealing community. We extend the algorithm to instances with general bipartite couplings, where the associated optimization problem is the minimization of the restricted Boltzmann machine (RBM) energy over the nodal values, which is useful for effectively pre-training the RBM. The hardness of the generated instances can be tuned through a central parameter known as the frustration index. Two versions of the algorithm are presented: the random- and structured-loop algorithms. For the random-loop algorithm, we provide a thorough theoretical and empirical analysis of its mathematical properties from the perspective of frustration, and empirically observe a double phase transition in the hardness scaling driven by the frustration index. For the structured-loop algorithm, we show that it offers an improvement in hardness over the random-loop algorithm in the regime of high loop density, with the variation in hardness tunable through the concentration of frustrated weights. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05334v2 |
https://arxiv.org/pdf/1905.05334v2.pdf | |
PWC | https://paperswithcode.com/paper/generating-weighted-max-2-sat-instances-of |
Repo | https://github.com/PeaBrane/Loop-Algorithm |
Framework | none |
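
Below is a minimal, self-contained sketch of the frustrated-loop idea on a bipartite (RBM-like) graph: each planted loop gets couplings that are satisfied by an all-ones configuration except for exactly one frustrated edge. All sizes and the loop construction here are illustrative; the paper's actual algorithm, frustration-index tuning, and conversion to weighted MAX-2-SAT clauses are not reproduced.

```python
import numpy as np

def frustrated_loop_instance(n_vis, n_hid, n_loops, loop_len=8, rng=None):
    """Plant bipartite couplings J so the all-ones state is a ground state
    of E(v, h) = -sum_ij J_ij v_i h_j. Each loop contributes loop_len - 1
    couplings satisfied by the planted state and exactly one frustrated one."""
    rng = np.random.default_rng(rng)
    J = np.zeros((n_vis, n_hid))
    for _ in range(n_loops):
        # closed walk alternating between visible and hidden nodes
        vs = rng.choice(n_vis, size=loop_len // 2, replace=False)
        hs = rng.choice(n_hid, size=loop_len // 2, replace=False)
        edges = []
        for k in range(len(vs)):
            edges.append((vs[k], hs[k]))
            edges.append((vs[(k + 1) % len(vs)], hs[k]))
        frustrated = rng.integers(len(edges))
        for e, (i, j) in enumerate(edges):
            # +1 couplings are satisfied by v = h = +1; the single -1 is not
            J[i, j] += 1.0 if e != frustrated else -1.0
    return J

J = frustrated_loop_instance(32, 32, n_loops=64, rng=0)
v = h = np.ones(32)
print("planted-state energy:", -v @ J @ h)  # -(loop_len - 2) per non-overlapping loop
```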
AttaCut: A Fast and Accurate Neural Thai Word Segmenter
Title | AttaCut: A Fast and Accurate Neural Thai Word Segmenter |
Authors | Pattarawat Chormai, Ponrawee Prasertsom, Attapol Rutherford |
Abstract | Word segmentation is a fundamental pre-processing step for Thai Natural Language Processing. The current off-the-shelf solutions are not benchmarked consistently, so it is difficult to compare their trade-offs. We conducted a speed and accuracy comparison of the popular systems on three different domains and found that the state-of-the-art deep learning system is slow and moreover does not use sub-word structures to guide the model. Here, we propose a fast and accurate neural Thai Word Segmenter that uses dilated CNN filters to capture the environment of each character and uses syllable embeddings as features. Our system runs at least 5.6x faster and outperforms the previous state-of-the-art system on some domains. In addition, we develop the first ML-based Thai orthographical syllable segmenter, which yields syllable embeddings to be used as features by the word segmenter. |
Tasks | Tokenization |
Published | 2019-11-16 |
URL | https://arxiv.org/abs/1911.07056v1 |
https://arxiv.org/pdf/1911.07056v1.pdf | |
PWC | https://paperswithcode.com/paper/attacut-a-fast-and-accurate-neural-thai-word |
Repo | https://github.com/PyThaiNLP/attacut |
Framework | pytorch |
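
As a rough illustration of the described architecture, the sketch below stacks dilated 1-D convolutions over concatenated character and syllable embeddings and predicts a word-boundary logit per character. Vocabulary sizes, embedding dimensions, and dilation rates are placeholders, not AttaCut's actual hyperparameters.

```python
import torch
import torch.nn as nn

class DilatedSegmenter(nn.Module):
    """Per-character word-boundary classifier: dilated 1-D convolutions
    over concatenated character and syllable embeddings."""
    def __init__(self, n_chars=180, n_syls=350, dim=32, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, dim)
        self.syl_emb = nn.Embedding(n_syls, dim)
        layers, ch = [], 2 * dim
        for d in dilations:  # padding = dilation keeps the sequence length fixed
            layers += [nn.Conv1d(ch, ch, kernel_size=3, dilation=d, padding=d), nn.ReLU()]
        self.convs = nn.Sequential(*layers)
        self.head = nn.Conv1d(ch, 1, kernel_size=1)  # one boundary logit per character

    def forward(self, chars, syls):
        # chars, syls: (batch, seq_len) index tensors; syllable id repeated per char
        x = torch.cat([self.char_emb(chars), self.syl_emb(syls)], dim=-1)
        return self.head(self.convs(x.transpose(1, 2))).squeeze(1)

model = DilatedSegmenter()
logits = model(torch.randint(0, 180, (2, 40)), torch.randint(0, 350, (2, 40)))
print(logits.shape)  # torch.Size([2, 40])
```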
Robust Learning with the Hilbert-Schmidt Independence Criterion
Title | Robust Learning with the Hilbert-Schmidt Independence Criterion |
Authors | Daniel Greenfeld, Uri Shalit |
Abstract | We investigate the use of a non-parametric independence measure, the Hilbert-Schmidt Independence Criterion (HSIC), as a loss-function for learning robust regression and classification models. This loss-function encourages learning models where the distribution of the residuals between the label and the model prediction is statistically independent of the distribution of the instances themselves. This loss-function was first proposed by Mooij et al. (2009) in the context of learning causal graphs. We adapt it to the task of learning for unsupervised covariate shift: learning on a source domain without access to any instances or labels from the unknown target domain, but with the assumption that $p(y \mid x)$ (the conditional probability of labels given instances) remains the same in the target domain. We show that the proposed loss is expected to give rise to models that generalize well on a class of target domains characterised by the complexity of their description within a reproducing kernel Hilbert space. Experiments on unsupervised covariate shift tasks demonstrate that models learned with the proposed loss-function outperform models learned with standard loss functions, achieving state-of-the-art results on a challenging cell-microscopy unsupervised covariate shift task. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00270v3 |
https://arxiv.org/pdf/1910.00270v3.pdf | |
PWC | https://paperswithcode.com/paper/robust-learning-with-the-hilbert-schmidt |
Repo | https://github.com/danielgreenfeld3/XIC |
Framework | pytorch |
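
A minimal PyTorch sketch of the loss described above: the biased empirical HSIC between inputs and residuals, computed with Gaussian kernels as tr(K_x H K_r H)/(n-1)^2. Kernel-bandwidth selection and any unbiased-estimator details are omitted.

```python
import torch

def rbf_kernel(x, sigma=1.0):
    # x: (n, d) -> (n, n) Gaussian Gram matrix
    return torch.exp(-torch.cdist(x, x) ** 2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: tr(Kx H Ky H) / (n - 1)^2."""
    n = x.shape[0]
    H = torch.eye(n) - torch.ones(n, n) / n  # centering matrix
    return torch.trace(rbf_kernel(x, sigma) @ H @ rbf_kernel(y, sigma) @ H) / (n - 1) ** 2

# penalize statistical dependence between inputs and residuals
x = torch.randn(64, 10)
y = torch.randn(64, 1)
y_hat = torch.randn(64, 1, requires_grad=True)  # stands in for a model's output
loss = hsic(x, y - y_hat)
loss.backward()
```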
A coupled autoencoder approach for multi-modal analysis of cell types
Title | A coupled autoencoder approach for multi-modal analysis of cell types |
Authors | Rohan Gala, Nathan Gouwens, Zizhen Yao, Agata Budzillo, Osnat Penn, Bosiljka Tasic, Gabe Murphy, Hongkui Zeng, Uygar Sümbül |
Abstract | Recent developments in high-throughput profiling of individual neurons have spurred data-driven exploration of the idea that there exist natural groupings of neurons referred to as cell types. The promise of this idea is that the immense complexity of brain circuits can be reduced, and effectively studied, by means of interactions between cell types. While clustering of neuron populations based on a particular data modality can be used to define cell types, such definitions are often inconsistent across different characterization modalities. We pose this issue of cross-modal alignment as an optimization problem and develop an approach based on coupled training of autoencoders as a framework for such analyses. We apply this framework to a Patch-seq dataset consisting of transcriptomic and electrophysiological profiles for the same set of neurons to study consistency of representations across modalities, and evaluate cross-modal data prediction ability. We explore the problem where only a subset of neurons is characterized with more than one modality, and demonstrate that representations learned by coupled autoencoders can be used to identify types sampled only by a single modality. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.05663v1 |
https://arxiv.org/pdf/1911.05663v1.pdf | |
PWC | https://paperswithcode.com/paper/a-coupled-autoencoder-approach-for-multi-1 |
Repo | https://github.com/AllenInstitute/coupledAE |
Framework | tf |
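
The following sketch shows the basic coupling mechanism under simple assumptions: two modality-specific autoencoders (transcriptomic and electrophysiological, with placeholder dimensions) trained with reconstruction losses plus a penalty pulling paired latent codes together. The paper's normalization of the coupling term and its handling of unpaired cells are not shown.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, hidden=64):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(), nn.Linear(hidden, d_out))

class CoupledAE(nn.Module):
    def __init__(self, d_t=1000, d_e=50, d_z=3):
        super().__init__()
        self.enc_t, self.dec_t = mlp(d_t, d_z), mlp(d_z, d_t)  # transcriptomic arm
        self.enc_e, self.dec_e = mlp(d_e, d_z), mlp(d_z, d_e)  # electrophysiology arm

    def forward(self, x_t, x_e, lam=1.0):
        z_t, z_e = self.enc_t(x_t), self.enc_e(x_e)
        recon = ((self.dec_t(z_t) - x_t) ** 2).mean() + ((self.dec_e(z_e) - x_e) ** 2).mean()
        coupling = ((z_t - z_e) ** 2).mean()  # align latents of paired cells
        return recon + lam * coupling

model = CoupledAE()
loss = model(torch.randn(8, 1000), torch.randn(8, 50))
loss.backward()
```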
Improving Face Anti-Spoofing by 3D Virtual Synthesis
Title | Improving Face Anti-Spoofing by 3D Virtual Synthesis |
Authors | Jianzhu Guo, Xiangyu Zhu, Jinchuan Xiao, Zhen Lei, Genxun Wan, Stan Z. Li |
Abstract | Face anti-spoofing is crucial for the security of face recognition systems. Learning-based methods, especially deep-learning-based ones, need large-scale training samples to reduce overfitting. However, acquiring spoof data is very expensive, since live faces must be re-printed and re-captured from many views. In this paper, we present a method to synthesize virtual spoof data in 3D space to alleviate this problem. Specifically, we consider a printed photo as a flat surface and mesh it into a 3D object, which is then randomly bent and rotated in 3D space. Afterward, the transformed 3D photo is rendered through perspective projection as a virtual sample. The synthetic virtual samples can significantly boost anti-spoofing performance when combined with a proposed data balancing strategy. Our promising results open up new possibilities for advancing face anti-spoofing using cheap and large-scale synthetic data. |
Tasks | Face Anti-Spoofing, Face Recognition |
Published | 2019-01-02 |
URL | http://arxiv.org/abs/1901.00488v2 |
http://arxiv.org/pdf/1901.00488v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-face-anti-spoofing-by-3d-virtual |
Repo | https://github.com/cleardusk/3DDFA |
Framework | pytorch |
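
A toy NumPy version of the synthesis pipeline described above: mesh the photo pixels into a 3-D grid, bend the sheet, rotate it, and project the vertices with a pinhole camera; sampling the source photo at the returned coordinates renders the warped spoof image. The paper uses a proper triangular mesh and renderer, so treat this purely as a schematic.

```python
import numpy as np

def bend_and_project(h=224, w=224, bend=0.2, yaw=0.3, f=500.0, z0=800.0):
    """Return (u, v) source coordinates for rendering a bent, rotated photo
    through a pinhole camera; sample the photo at (u, v) to get the spoof."""
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    x, y = xs - w / 2, ys - h / 2
    z = bend * x ** 2 / w                        # parabolic bend approximating a cylinder
    c, s = np.cos(yaw), np.sin(yaw)              # rotate the sheet about the vertical axis
    xr, zr = c * x + s * z, -s * x + c * z + z0  # z0 pushes it in front of the camera
    u = f * xr / zr + w / 2                      # perspective projection
    v = f * y / zr + h / 2
    return u, v

u, v = bend_and_project()
print(u.shape, v.shape)  # (224, 224) (224, 224)
```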
Using Similarity Measures to Select Pretraining Data for NER
Title | Using Similarity Measures to Select Pretraining Data for NER |
Authors | Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris |
Abstract | Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks. However, the measure and impact of similarity between pretraining data and target task data are left to intuition. We propose three cost-effective measures to quantify different aspects of similarity between source pretraining and target task data. We demonstrate that these measures are good predictors of the usefulness of pretrained models for Named Entity Recognition (NER) over 30 data pairs. Results also suggest that pretrained LMs are more effective and more predictable than pretrained word vectors, but pretrained word vectors are better when pretraining data is dissimilar. |
Tasks | Named Entity Recognition |
Published | 2019-04-01 |
URL | https://arxiv.org/abs/1904.00585v2 |
https://arxiv.org/pdf/1904.00585v2.pdf | |
PWC | https://paperswithcode.com/paper/using-similarity-measures-to-select |
Repo | https://github.com/daixiangau/naacl2019-select-pretraining-data-for-ner |
Framework | pytorch |
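
The paper's three specific measures are not spelled out in the abstract, so the sketch below shows two generic lexical proxies in the same spirit: top-k vocabulary overlap and the target out-of-vocabulary rate between a source pretraining corpus and target task data.

```python
from collections import Counter

def vocab_overlap(source_tokens, target_tokens, top_k=10000):
    """Jaccard overlap between the top-k vocabularies of two corpora."""
    src = {w for w, _ in Counter(source_tokens).most_common(top_k)}
    tgt = {w for w, _ in Counter(target_tokens).most_common(top_k)}
    return len(src & tgt) / len(src | tgt)

def target_oov_rate(source_tokens, target_tokens):
    """Fraction of target tokens never seen in the source corpus."""
    src = set(source_tokens)
    return sum(t not in src for t in target_tokens) / len(target_tokens)

src = "the patient was given aspirin for acute chest pain".split()
tgt = "the defendant was charged with securities fraud".split()
print(vocab_overlap(src, tgt), target_oov_rate(src, tgt))
```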
Positional Encoding to Control Output Sequence Length
Title | Positional Encoding to Control Output Sequence Length |
Authors | Sho Takase, Naoaki Okazaki |
Abstract | Neural encoder-decoder models have been successful in natural language generation tasks. However, real applications of abstractive summarization must consider the additional constraint that a generated summary should not exceed a desired length. In this paper, we propose a simple but effective extension of sinusoidal positional encoding (Vaswani et al., 2017) that enables a neural encoder-decoder model to preserve the length constraint. Unlike previous studies that learn embeddings representing each length, the proposed method can generate a text of any length even if the target length is not present in the training data. The experimental results show that the proposed method can not only control the generation length but also improve the ROUGE scores. |
Tasks | Abstractive Text Summarization, Text Generation, Text Summarization |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07418v1 |
http://arxiv.org/pdf/1904.07418v1.pdf | |
PWC | https://paperswithcode.com/paper/positional-encoding-to-control-output |
Repo | https://github.com/takase/control-length |
Framework | pytorch |
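
A sketch of one plausible reading of the method: a sinusoidal encoding indexed by the number of tokens remaining until the desired length, so the decoder can see how much budget is left at each position. The exact indexing and the variants evaluated in the paper may differ.

```python
import numpy as np

def remaining_length_pe(desired_len, d_model=512):
    """Sinusoidal encoding indexed by tokens remaining until the target
    length; added to decoder inputs in place of the usual positional encoding."""
    remaining = desired_len - np.arange(desired_len)  # L, L-1, ..., 1
    div = np.exp(-np.log(10000.0) * np.arange(0, d_model, 2) / d_model)
    pe = np.zeros((desired_len, d_model))
    pe[:, 0::2] = np.sin(remaining[:, None] * div)
    pe[:, 1::2] = np.cos(remaining[:, None] * div)
    return pe

print(remaining_length_pe(30).shape)  # (30, 512)
```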
Data-Efficient Image Recognition with Contrastive Predictive Coding
Title | Data-Efficient Image Recognition with Contrastive Predictive Coding |
Authors | Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord |
Abstract | Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with machine perception remains an open challenge. We hypothesize that data-efficient recognition is enabled by representations which make the variability in natural signals more predictable. We therefore revisit and improve Contrastive Predictive Coding, an unsupervised objective for learning such representations. This new implementation produces features which support state-of-the-art linear classification accuracy on the ImageNet dataset. When used as input for non-linear classification with deep neural networks, this representation allows us to use 2-5x fewer labels than classifiers trained directly on image pixels. Finally, this unsupervised representation substantially improves transfer learning to object detection on PASCAL VOC-2007, surpassing fully supervised pre-trained ImageNet classifiers. |
Tasks | Object Detection, Semi-Supervised Image Classification, Transfer Learning |
Published | 2019-05-22 |
URL | https://arxiv.org/abs/1905.09272v2 |
https://arxiv.org/pdf/1905.09272v2.pdf | |
PWC | https://paperswithcode.com/paper/data-efficient-image-recognition-with |
Repo | https://github.com/Philip-Bachman/amdim-public |
Framework | pytorch |
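
The contrastive objective at the core of CPC is the InfoNCE loss; the sketch below isolates it, with each context vector asked to identify its own target embedding among all targets in the batch. The paper's patch extraction, context network, and other implementation improvements are not shown.

```python
import torch
import torch.nn.functional as F

def info_nce(context, targets, temperature=0.1):
    """Each context vector must pick out its own target embedding
    (the diagonal) among all targets in the batch."""
    context = F.normalize(context, dim=-1)
    targets = F.normalize(targets, dim=-1)
    logits = context @ targets.t() / temperature  # (n, n) similarities
    labels = torch.arange(context.shape[0])       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

c = torch.randn(32, 128, requires_grad=True)  # e.g. predictions from a context network
t = torch.randn(32, 128)                      # e.g. encodings of neighboring patches
loss = info_nce(c, t)
loss.backward()
```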
ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs
Title | ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs |
Authors | Guangyan Zhang, Ziru Liu, Jichen Dai, Zilan Yu, Shuai Liu, Wen Zhang |
Abstract | Motivation: Since long non-coding RNAs (lncRNAs) are involved in a wide range of functions in cellular and developmental processes, an increasing number of methods have been proposed for distinguishing lncRNAs from coding RNAs. However, most of the existing methods are designed for lncRNAs in animal systems, and only a few focus on plant lncRNA identification. Different from lncRNAs in animal systems, plant lncRNAs have distinct characteristics. It is therefore desirable to develop a computational method for accurate and robust identification of plant lncRNAs. Results: Herein, we present ItLnc-BXE, a plant lncRNA identification method that utilizes multiple features and an ensemble learning strategy. First, a diversity of lncRNA features is collected and filtered by feature selection to represent RNA transcripts. Then, several base learners are trained and further combined into a single meta-learner by ensemble learning, and thus an ItLnc-BXE model is constructed. ItLnc-BXE models are evaluated on datasets of six plant species; the results show that ItLnc-BXE outperforms other state-of-the-art plant lncRNA identification methods, achieving better and more robust performance (AUC > 95.91%). We also perform experiments on cross-species lncRNA identification; the results indicate that dicot- and monocot-based models can be used to accurately identify lncRNAs in lower plant species, such as mosses and algae. Availability: source codes are available at https://github.com/BioMedicalBigDataMiningLab/ItLnc-BXE. Contact: zhangwen@mail.hzau.edu.cn (or) zhangwen@whu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. |
Tasks | Feature Selection |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00185v2 |
https://arxiv.org/pdf/1911.00185v2.pdf | |
PWC | https://paperswithcode.com/paper/ptlnc-bxe-prediction-of-plant-lncrnas-using-a |
Repo | https://github.com/BioMedicalBigDataMiningLab/ItLnc-BXE |
Framework | none |
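
A schematic of the bagging-plus-stacking design using scikit-learn, with decision trees standing in for the paper's XGBoost base learners and random features standing in for its filtered, sequence-derived features (ORF, k-mer, and similar descriptors):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# random stand-ins for the paper's filtered transcript features
X, y = make_classification(n_samples=500, n_features=40, random_state=0)

base_learners = [
    (f"bag{i}", BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=i))
    for i in range(3)
]
# the meta-learner combines base-learner predictions into the final model
model = StackingClassifier(estimators=base_learners, final_estimator=LogisticRegression())
model.fit(X, y)
print(model.score(X, y))
```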
Prediction Focused Topic Models via Feature Selection
Title | Prediction Focused Topic Models via Feature Selection |
Authors | Jason Ren, Russell Kunes, Finale Doshi-Velez |
Abstract | Supervised topic models are often sought to balance prediction quality and interpretability. However, when models are (inevitably) misspecified, standard approaches rarely deliver on both. We introduce a novel approach, the prediction-focused topic model, that uses the supervisory signal to retain only vocabulary terms that improve, or at least do not hinder, prediction performance. By removing terms with irrelevant signal, the topic model is able to learn task-relevant, coherent topics. We demonstrate on several data sets that compared to existing approaches, prediction-focused topic models learn much more coherent topics while maintaining competitive predictions. |
Tasks | Feature Selection, Topic Models |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05495v2 |
https://arxiv.org/pdf/1910.05495v2.pdf | |
PWC | https://paperswithcode.com/paper/prediction-focused-topic-models-via-vocab |
Repo | https://github.com/jasonren12/PredictionFocusedTopicModel |
Framework | pytorch |
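
The paper learns vocabulary relevance jointly inside the topic model; the two-stage sketch below is only a crude proxy for that idea, scoring terms by mutual information with the label and fitting LDA on the surviving vocabulary.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
counts = rng.poisson(0.5, size=(200, 300))  # toy document-term matrix
y = rng.integers(0, 2, size=200)            # supervisory signal

# keep only vocabulary terms whose counts carry signal about the label
mi = mutual_info_classif(counts, y, discrete_features=True, random_state=0)
keep = np.argsort(mi)[-100:]

lda = LatentDirichletAllocation(n_components=10, random_state=0)
theta = lda.fit_transform(counts[:, keep])  # topics over the pruned vocabulary
print(theta.shape)  # (200, 10)
```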
LassoNet: Neural networks with Feature Sparsity
Title | LassoNet: Neural networks with Feature Sparsity |
Authors | Ismael Lemhadri, Feng Ruan, Robert Tibshirani |
Abstract | We introduce LassoNet, a neural network model with global feature selection. The model uses a residual connection to learn a subset of the most informative input features. Specifically, the model honors a hierarchy restriction that an input neuron only be included if its linear variable is important. This produces a path of feature-sparse models in close analogy with the lasso for linear regression, while effectively capturing complex nonlinear dependencies in the data. Using a single residual block, our iterative algorithm yields an efficient proximal map which accurately selects the most salient features. On systematic experiments, LassoNet achieves competitive performance using a much smaller number of input features. LassoNet can be implemented by adding just a few lines of code to a standard neural network. |
Tasks | Feature Selection |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12207v5 |
https://arxiv.org/pdf/1907.12207v5.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-network-with-feature-sparsity |
Repo | https://github.com/ilemhadri/lassoNet |
Framework | none |
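
A simplified sketch of the architecture and constraint: a linear skip path plus one residual nonlinear block, with a naive stand-in for LassoNet's hierarchical proximal map (soft-threshold the skip weights, then cap each first-layer column by M times its skip coefficient, so a zeroed coefficient removes the feature entirely). The real proximal operator solves a small per-feature optimization and is not reproduced here.

```python
import torch
import torch.nn as nn

class LassoNetSketch(nn.Module):
    """Linear skip path plus one residual nonlinear block."""
    def __init__(self, d_in, d_hidden=32):
        super().__init__()
        self.skip = nn.Linear(d_in, 1, bias=False)  # theta drives feature selection
        self.w1 = nn.Linear(d_in, d_hidden)
        self.w2 = nn.Linear(d_hidden, 1)

    def forward(self, x):
        return self.skip(x) + self.w2(torch.relu(self.w1(x)))

def naive_hier_prox(model, lam=0.1, M=10.0, lr=0.01):
    """Crude stand-in for the hierarchical proximal map: soft-threshold theta,
    then cap |W1[:, j]| by M * |theta_j| so a zeroed theta_j removes feature j."""
    with torch.no_grad():
        theta = model.skip.weight  # shape (1, d_in)
        theta.copy_(torch.sign(theta) * torch.clamp(theta.abs() - lr * lam, min=0.0))
        cap = (M * theta.abs()).expand_as(model.w1.weight)
        w = model.w1.weight
        w.copy_(torch.maximum(torch.minimum(w, cap), -cap))

net = LassoNetSketch(20)
loss = ((net(torch.randn(8, 20)) - torch.randn(8, 1)) ** 2).mean()
loss.backward()
naive_hier_prox(net)  # applied after each gradient step
```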
A Feature Selection Based on Perturbation Theory
Title | A Feature Selection Based on Perturbation Theory |
Authors | Javad Rahimipour Anaraki, Hamid Usefi |
Abstract | Consider a supervised dataset $D=[A\mid \textbf{b}]$, where $\textbf{b}$ is the outcome column, rows of $D$ correspond to observations, and columns of $A$ are the features of the dataset. A central problem in machine learning and pattern recognition is to select the most important features from $D$ to be able to predict the outcome. In this paper, we provide a new feature selection method in which we use perturbation theory to detect correlations between features. We solve $AX=\textbf{b}$ using the method of least squares and the singular value decomposition of $A$. In practical applications, such as in bioinformatics, the number of rows of $A$ (observations) is much smaller than the number of columns of $A$ (features). So we are dealing with singular matrices with large condition numbers. Although it is known that the solutions of least-squares problems in the singular case are very sensitive to perturbations in $A$, our novel approach in this paper is to prove that the correlations between features can be detected by applying perturbations to $A$. The effectiveness of our method is verified by performing a series of comparisons with conventional and novel feature selection methods in the literature. It is demonstrated that in most situations, our method selects considerably fewer features while attaining or exceeding the accuracy of the other methods. |
Tasks | Feature Selection |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09938v1 |
http://arxiv.org/pdf/1902.09938v1.pdf | |
PWC | https://paperswithcode.com/paper/a-feature-selection-based-on-perturbation |
Repo | https://github.com/jracp/PerturbationFeatureSelection |
Framework | none |
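
A NumPy schematic of the perturbation idea: solve the least-squares problem for the original and randomly perturbed $A$, and flag features whose coefficients swing most as correlated with others. The paper's SVD-based formulation and precise detection criterion are not reproduced.

```python
import numpy as np

def perturbation_scores(A, b, eps=1e-3, n_trials=20, rng=None):
    """Average coefficient swing between least-squares solutions of the
    original and randomly perturbed systems; large swings flag features
    that are correlated with others."""
    rng = np.random.default_rng(rng)
    x0, *_ = np.linalg.lstsq(A, b, rcond=None)
    swing = np.zeros(A.shape[1])
    for _ in range(n_trials):
        x, *_ = np.linalg.lstsq(A + eps * rng.standard_normal(A.shape), b, rcond=None)
        swing += np.abs(x - x0)
    return swing / n_trials

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))                  # far fewer observations than features
A[:, 1] = A[:, 0] + 1e-6 * rng.standard_normal(30)  # plant a near-duplicate feature
b = A[:, 0] + 0.1 * rng.standard_normal(30)
print(perturbation_scores(A, b, rng=1)[:4])         # columns 0 and 1 are expected to swing most
```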
AutoSF: Searching Scoring Functions for Knowledge Graph Embedding
Title | AutoSF: Searching Scoring Functions for Knowledge Graph Embedding |
Authors | Yongqi Zhang, Quanming Yao, Wenyuan Dai, Lei Chen |
Abstract | Scoring functions (SFs), which measure the plausibility of triplets in a knowledge graph (KG), have become the crux of KG embedding. Many SFs, designed to capture different kinds of relations in KGs, have been proposed by humans in recent years. However, as relations can exhibit complex patterns that are hard to infer before training, none of them can consistently perform better than the others on existing benchmark data sets. In this paper, inspired by the recent success of automated machine learning (AutoML), we propose to automatically design SFs (AutoSF) for distinct KGs using AutoML techniques. However, it is non-trivial to exploit domain-specific information here to make AutoSF efficient and effective. We first identify a unified representation over popularly used SFs, which helps to set up a search space for AutoSF. Then, we propose a greedy algorithm to search this space efficiently. The algorithm is further sped up by a filter and a predictor, which avoid repeatedly training SFs with the same expressive ability and help remove bad candidates before model training. Finally, we perform extensive experiments on benchmark data sets. Results on link prediction and triplet classification show that the SFs found by AutoSF are KG-dependent, new to the literature, and outperform the state-of-the-art SFs designed by humans. |
Tasks | AutoML, Graph Embedding, Knowledge Graph Embedding, Link Prediction |
Published | 2019-04-26 |
URL | https://arxiv.org/abs/1904.11682v3 |
https://arxiv.org/pdf/1904.11682v3.pdf | |
PWC | https://paperswithcode.com/paper/autokge-searching-scoring-functions-for |
Repo | https://github.com/yzhangee/AutoSF |
Framework | pytorch |
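
A sketch of the unified blockwise representation the search operates over, under the assumption of K = 4 embedding blocks: an integer matrix decides which relation block, with which sign, couples each head block with each tail block; DistMult, for example, corresponds to the purely diagonal structure. The greedy search, filter, and predictor are omitted.

```python
import torch

def autosf_score(h, r, t, g):
    """Blockwise bilinear scoring function: embeddings are split into K
    blocks, and integer g[i, j] selects which relation block (and sign)
    couples head block i with tail block j (0 means no interaction)."""
    K = g.shape[0]
    hb, rb, tb = h.chunk(K, -1), r.chunk(K, -1), t.chunk(K, -1)
    score = 0.0
    for i in range(K):
        for j in range(K):
            s = int(g[i, j])
            if s != 0:
                sign = 1.0 if s > 0 else -1.0
                score = score + sign * (hb[i] * rb[abs(s) - 1] * tb[j]).sum(-1)
    return score

# DistMult is the purely diagonal structure in this representation
g = torch.diag(torch.tensor([1, 2, 3, 4]))
h, r, t = (torch.randn(5, 64) for _ in range(3))
print(autosf_score(h, r, t, g).shape)  # torch.Size([5])
```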
INN: Inflated Neural Networks for IPMN Diagnosis
Title | INN: Inflated Neural Networks for IPMN Diagnosis |
Authors | Rodney LaLonde, Irene Tanner, Katerina Nikiforaki, Georgios Z. Papadakis, Pujan Kandel, Candice W. Bolan, Michael B. Wallace, Ulas Bagci |
Abstract | Intraductal papillary mucinous neoplasm (IPMN) is a precursor to pancreatic ductal adenocarcinoma. While over half of patients are diagnosed with pancreatic cancer at a distant stage, patients who are diagnosed early enjoy a much higher 5-year survival rate of 34%, compared to 3% for those diagnosed at a distant stage; hence, early diagnosis is key. Unique challenges in the medical imaging domain, such as extremely limited annotated data sets and typically large 3D volumetric data, have made it difficult for deep learning to secure a strong foothold. In this work, we construct two novel “inflated” deep network architectures, $\textit{InceptINN}$ and $\textit{DenseINN}$, for the task of diagnosing IPMN from multisequence (T1 and T2) MRI. These networks inflate their 2D layers to 3D and bootstrap weights from their 2D counterparts (Inceptionv3 and DenseNet121, respectively) trained on ImageNet to the new 3D kernels. We also extend the inflation process by further expanding the pre-trained kernels to handle any number of input modalities and different fusion strategies. This is one of the first studies to train an end-to-end deep network on multisequence MRI for IPMN diagnosis, and it shows that our proposed novel inflated network architectures are able to handle the extremely limited training data (139 MRI scans), while providing an absolute improvement of 8.76% in accuracy for diagnosing IPMN over the current state-of-the-art. Code is publicly available at https://github.com/lalonderodney/INN-Inflated-Neural-Nets. |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00437v1 |
https://arxiv.org/pdf/1907.00437v1.pdf | |
PWC | https://paperswithcode.com/paper/inn-inflated-neural-networks-for-ipmn |
Repo | https://github.com/lalonderodney/INN-Inflated-Neural-Nets |
Framework | tf |
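
The core inflation step can be sketched directly: tile a pretrained 2-D kernel along the new depth axis and rescale so a depth-constant input produces the same activations as the 2-D network (the classic I3D-style bootstrap). The paper's extension to arbitrary numbers of input modalities and fusion strategies is not shown.

```python
import torch
import torch.nn as nn

def inflate_conv(conv2d, depth=3):
    """Bootstrap a Conv3d from a pretrained Conv2d by tiling the 2-D kernel
    along the depth axis and dividing by depth to preserve activations."""
    k_h, k_w = conv2d.kernel_size
    conv3d = nn.Conv3d(conv2d.in_channels, conv2d.out_channels,
                       kernel_size=(depth, k_h, k_w),
                       stride=(1, *conv2d.stride),
                       padding=(depth // 2, *conv2d.padding),
                       bias=conv2d.bias is not None)
    with torch.no_grad():
        w = conv2d.weight.unsqueeze(2).repeat(1, 1, depth, 1, 1) / depth
        conv3d.weight.copy_(w)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d

c2 = nn.Conv2d(3, 16, kernel_size=3, padding=1)   # pretend this is pretrained
c3 = inflate_conv(c2)
print(c3(torch.randn(1, 3, 8, 32, 32)).shape)     # torch.Size([1, 16, 8, 32, 32])
```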
Discourse-Based Evaluation of Language Understanding
Title | Discourse-Based Evaluation of Language Understanding |
Authors | Damien Sileo, Tim Van-de-Cruys, Camille Pradel, Philippe Muller |
Abstract | We introduce DiscEval, a compilation of 11 evaluation datasets with a focus on discourse, which can be used for the evaluation of English Natural Language Understanding when considering meaning as use. We make the case that evaluation with discourse tasks is overlooked and that Natural Language Inference (NLI) pretraining may not lead to the learning of truly universal representations. DiscEval can also be used as supplementary training data for multi-task learning-based systems, and is publicly available, alongside the code for gathering and preprocessing the datasets. |
Tasks | Multi-Task Learning, Natural Language Inference |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08672v1 |
https://arxiv.org/pdf/1907.08672v1.pdf | |
PWC | https://paperswithcode.com/paper/discourse-based-evaluation-of-language |
Repo | https://github.com/synapse-developpement/DiscEval |
Framework | none |