February 1, 2020

3068 words 15 mins read

Paper Group AWR 241


Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops. AttaCut: A Fast and Accurate Neural Thai Word Segmenter. Robust Learning with the Hilbert-Schmidt Independence Criterion. A coupled autoencoder approach for multi-modal analysis of cell types. Improving Face Anti-Spoofing by 3D Virtual Synthesis. Using Similarity Measures to Select Pretraining Data for NER …

Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops

Title Generating Weighted MAX-2-SAT Instances of Tunable Difficulty with Frustrated Loops
Authors Yan Ru Pei, Haik Manukian, Massimiliano Di Ventra
Abstract Many optimization problems can be cast into the maximum satisfiability (MAX-SAT) form, and many solvers have been developed for tackling such problems. To evaluate a MAX-SAT solver, it is convenient to generate hard MAX-SAT instances with known solutions. Here, we propose a method of generating weighted MAX-2-SAT instances inspired by the frustrated-loop algorithm used by the quantum annealing community. We extend the algorithm to instances with general bipartite couplings, with the associated optimization problem being the minimization of the restricted Boltzmann machine (RBM) energy over the nodal values, which is useful for effectively pre-training the RBM. The hardness of the generated instances can be tuned through a central parameter known as the frustration index. Two versions of the algorithm are presented: the random- and structured-loop algorithms. For the random-loop algorithm, we provide a thorough theoretical and empirical analysis of its mathematical properties from the perspective of frustration, and empirically observe a double phase transition in the hardness scaling driven by the frustration index. For the structured-loop algorithm, we show that it offers an improvement in hardness over the random-loop algorithm in the regime of high loop density, with the variation of hardness tunable through the concentration of frustrated weights.
Tasks
Published 2019-05-14
URL https://arxiv.org/abs/1905.05334v2
PDF https://arxiv.org/pdf/1905.05334v2.pdf
PWC https://paperswithcode.com/paper/generating-weighted-max-2-sat-instances-of
Repo https://github.com/PeaBrane/Loop-Algorithm
Framework none
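
To make the loop construction concrete, here is a minimal sketch of a planted frustrated-loop generator in Ising form (each coupling converts to a pair of weighted 2-clauses). The loop length, loop count, and single-frustrated-edge weight scheme are illustrative assumptions; the paper's frustration-index tuning and the bipartite RBM extension are omitted, and the linked repo holds the reference implementation.

```python
import numpy as np

def frustrated_loop_instance(n_vars=100, n_loops=150, loop_len=6, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=n_vars)          # planted ground state
    J = {}                                        # (i, j) -> coupling weight
    for _ in range(n_loops):
        loop = rng.choice(n_vars, size=loop_len, replace=False)
        edges = [tuple(sorted((loop[k], loop[(k + 1) % loop_len])))
                 for k in range(loop_len)]
        frustrated = rng.integers(loop_len)       # one edge opposes the rest
        for k, (i, j) in enumerate(edges):
            sign = 1.0 if k == frustrated else -1.0
            J[(i, j)] = J.get((i, j), 0.0) + sign * s[i] * s[j]
    return s, J

def energy(s, J):                                 # E = sum_ij J_ij s_i s_j
    return sum(w * s[i] * s[j] for (i, j), w in J.items())

s, J = frustrated_loop_instance()
print(energy(s, J))   # low energy at the planted state, by construction
```

Each loop contributes -(loop_len - 2) to the planted state's energy, so the planted assignment stays a ground state of every loop even though no single assignment can satisfy all of a loop's couplings.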

AttaCut: A Fast and Accurate Neural Thai Word Segmenter

Title AttaCut: A Fast and Accurate Neural Thai Word Segmenter
Authors Pattarawat Chormai, Ponrawee Prasertsom, Attapol Rutherford
Abstract Word segmentation is a fundamental pre-processing step for Thai Natural Language Processing. The current off-the-shelf solutions are not benchmarked consistently, so it is difficult to compare their trade-offs. We conducted a speed and accuracy comparison of the popular systems on three different domains and found that the state-of-the-art deep learning system is slow and moreover does not use sub-word structures to guide the model. Here, we propose a fast and accurate neural Thai Word Segmenter that uses dilated CNN filters to capture the environment of each character and uses syllable embeddings as features. Our system runs at least 5.6x faster and outperforms the previous state-of-the-art system on some domains. In addition, we develop the first ML-based Thai orthographical syllable segmenter, which yields syllable embeddings to be used as features by the word segmenter.
Tasks Tokenization
Published 2019-11-16
URL https://arxiv.org/abs/1911.07056v1
PDF https://arxiv.org/pdf/1911.07056v1.pdf
PWC https://paperswithcode.com/paper/attacut-a-fast-and-accurate-neural-thai-word
Repo https://github.com/PyThaiNLP/attacut
Framework pytorch
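
For intuition about the architecture, here is a hedged PyTorch sketch of a dilated-CNN character tagger in the spirit of AttaCut; all sizes are made up, and the syllable-embedding features the paper concatenates are omitted.

```python
import torch
import torch.nn as nn

class DilatedSegmenter(nn.Module):
    """Per-character word-boundary tagger: stacked dilated convolutions give
    each character a wide receptive field at low cost (illustrative sizes)."""
    def __init__(self, n_chars=180, emb=32, hidden=64, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)
        layers, in_ch = [], emb
        for d in dilations:
            layers += [nn.Conv1d(in_ch, hidden, kernel_size=3,
                                 padding=d, dilation=d), nn.ReLU()]
            in_ch = hidden
        self.convs = nn.Sequential(*layers)
        self.head = nn.Conv1d(hidden, 1, kernel_size=1)   # boundary logit

    def forward(self, char_ids):                   # (batch, seq_len)
        x = self.emb(char_ids).transpose(1, 2)     # (batch, emb, seq_len)
        return self.head(self.convs(x)).squeeze(1) # (batch, seq_len)

logits = DilatedSegmenter()(torch.randint(0, 180, (2, 40)))
```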

Robust Learning with the Hilbert-Schmidt Independence Criterion

Title Robust Learning with the Hilbert-Schmidt Independence Criterion
Authors Daniel Greenfeld, Uri Shalit
Abstract We investigate the use of a non-parametric independence measure, the Hilbert-Schmidt Independence Criterion (HSIC), as a loss-function for learning robust regression and classification models. This loss-function encourages learning models where the distribution of the residuals between the label and the model prediction is statistically independent of the distribution of the instances themselves. This loss-function was first proposed by Mooij et al. (2009) in the context of learning causal graphs. We adapt it to the task of learning for unsupervised covariate shift: learning on a source domain without access to any instances or labels from the unknown target domain, but with the assumption that $p(y|x)$ (the conditional probability of labels given instances) remains the same in the target domain. We show that the proposed loss is expected to give rise to models that generalize well on a class of target domains characterised by the complexity of their description within a reproducing kernel Hilbert space. Experiments on unsupervised covariate shift tasks demonstrate that models learned with the proposed loss-function outperform models learned with standard loss functions, achieving state-of-the-art results on a challenging cell-microscopy unsupervised covariate shift task.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.00270v3
PDF https://arxiv.org/pdf/1910.00270v3.pdf
PWC https://paperswithcode.com/paper/robust-learning-with-the-hilbert-schmidt
Repo https://github.com/danielgreenfeld3/XIC
Framework pytorch
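
The HSIC loss itself is compact. Below is a sketch of the standard biased estimator, HSIC = tr(KHLH)/(n-1)^2 with Gaussian kernels, applied in the paper's spirit to penalize dependence between inputs and residuals; the kernel widths are untuned assumptions.

```python
import torch

def gaussian_gram(x, sigma=1.0):
    # x: (n, d) -> RBF Gram matrix from pairwise squared distances
    sq = torch.cdist(x, x) ** 2
    return torch.exp(-sq / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    n = x.shape[0]
    K, L = gaussian_gram(x, sigma), gaussian_gram(y, sigma)
    H = torch.eye(n) - torch.ones(n, n) / n       # centering matrix
    return torch.trace(K @ H @ L @ H) / (n - 1) ** 2

# Robust-regression flavour: make residuals independent of the inputs.
x = torch.randn(64, 10)
residuals = torch.randn(64, 1)     # stand-in for (y - model(x)).reshape(-1, 1)
loss = hsic(x, residuals)
```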

A coupled autoencoder approach for multi-modal analysis of cell types

Title A coupled autoencoder approach for multi-modal analysis of cell types
Authors Rohan Gala, Nathan Gouwens, Zizhen Yao, Agata Budzillo, Osnat Penn, Bosiljka Tasic, Gabe Murphy, Hongkui Zeng, Uygar Sümbül
Abstract Recent developments in high-throughput profiling of individual neurons have spurred data-driven exploration of the idea that there exist natural groupings of neurons referred to as cell types. The promise of this idea is that the immense complexity of brain circuits can be reduced, and effectively studied by means of interactions between cell types. While clustering of neuron populations based on a particular data modality can be used to define cell types, such definitions are often inconsistent across different characterization modalities. We pose this issue of cross-modal alignment as an optimization problem and develop an approach based on coupled training of autoencoders as a framework for such analyses. We apply this framework to a Patch-seq dataset consisting of transcriptomic and electrophysiological profiles for the same set of neurons to study consistency of representations across modalities, and evaluate cross-modal data prediction ability. We explore the problem where only a subset of neurons is characterized with more than one modality, and demonstrate that representations learned by coupled autoencoders can be used to identify types sampled only by a single modality.
Tasks
Published 2019-11-06
URL https://arxiv.org/abs/1911.05663v1
PDF https://arxiv.org/pdf/1911.05663v1.pdf
PWC https://paperswithcode.com/paper/a-coupled-autoencoder-approach-for-multi-1
Repo https://github.com/AllenInstitute/coupledAE
Framework tf
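
A hedged sketch of the coupling idea: two modality-specific autoencoders whose latent codes are pulled together for cells profiled in both modalities. The layer sizes and the plain MSE coupling term are illustrative; the paper's actual coupling penalty and normalization differ.

```python
import torch
import torch.nn as nn

def mlp(sizes):
    layers = []
    for a, b in zip(sizes, sizes[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    return nn.Sequential(*layers[:-1])   # no activation on the last layer

class CoupledAE(nn.Module):
    """One autoencoder per modality (transcriptomic, electrophysiological);
    paired cells receive an extra loss that aligns their latent codes."""
    def __init__(self, d_t=1000, d_e=100, z=3):
        super().__init__()
        self.enc_t, self.dec_t = mlp([d_t, 64, z]), mlp([z, 64, d_t])
        self.enc_e, self.dec_e = mlp([d_e, 64, z]), mlp([z, 64, d_e])

    def forward(self, x_t, x_e):          # x_t, x_e: the same cells, paired
        z_t, z_e = self.enc_t(x_t), self.enc_e(x_e)
        recon = ((self.dec_t(z_t) - x_t) ** 2).mean() + \
                ((self.dec_e(z_e) - x_e) ** 2).mean()
        coupling = ((z_t - z_e) ** 2).mean()   # cross-modal alignment
        return recon + coupling

loss = CoupledAE()(torch.randn(16, 1000), torch.randn(16, 100))
```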

Improving Face Anti-Spoofing by 3D Virtual Synthesis

Title Improving Face Anti-Spoofing by 3D Virtual Synthesis
Authors Jianzhu Guo, Xiangyu Zhu, Jinchuan Xiao, Zhen Lei, Genxun Wan, Stan Z. Li
Abstract Face anti-spoofing is crucial for the security of face recognition systems. Learning-based methods, especially deep-learning-based ones, need large-scale training samples to reduce overfitting. However, acquiring spoof data is very expensive, since live faces must be re-printed and re-captured in many views. In this paper, we present a method to synthesize virtual spoof data in 3D space to alleviate this problem. Specifically, we consider a printed photo as a flat surface and mesh it into a 3D object, which is then randomly bent and rotated in 3D space. Afterward, the transformed 3D photo is rendered through perspective projection as a virtual sample. The synthetic virtual samples can significantly boost the anti-spoofing performance when combined with a proposed data balancing strategy. Our promising results open up new possibilities for advancing face anti-spoofing using cheap and large-scale synthetic data.
Tasks Face Anti-Spoofing, Face Recognition
Published 2019-01-02
URL http://arxiv.org/abs/1901.00488v2
PDF http://arxiv.org/pdf/1901.00488v2.pdf
PWC https://paperswithcode.com/paper/improving-face-anti-spoofing-by-3d-virtual
Repo https://github.com/cleardusk/3DDFA
Framework pytorch
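
The synthesis pipeline is plain 3D geometry. Here is a NumPy sketch that meshes a photo plane, bends and rotates it, and perspective-projects the vertices; the bend model and all parameters are assumptions, not the paper's exact recipe.

```python
import numpy as np

def synthesize_spoof_vertices(w=64, h=64, bend=0.2, yaw=0.3, f=2.0):
    """Mesh a flat photo into a vertex grid, bend and rotate it in 3D, then
    perspective-project back to the image plane (illustrative parameters)."""
    xs, ys = np.meshgrid(np.linspace(-1, 1, w), np.linspace(-1, 1, h))
    zs = bend * np.cos(np.pi * xs / 2)            # cylindrical-style bending
    verts = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    c, s = np.cos(yaw), np.sin(yaw)               # rotate about the y-axis
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    verts = verts @ R.T
    verts[:, 2] += f + 1.0                        # push in front of the camera
    uv = f * verts[:, :2] / verts[:, 2:3]         # perspective projection
    return uv.reshape(h, w, 2)    # sample the photo texture at these points

warp = synthesize_spoof_vertices()
```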

Using Similarity Measures to Select Pretraining Data for NER

Title Using Similarity Measures to Select Pretraining Data for NER
Authors Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris
Abstract Word vectors and Language Models (LMs) pretrained on a large amount of unlabelled data can dramatically improve various Natural Language Processing (NLP) tasks. However, the measure and impact of similarity between pretraining data and target task data are left to intuition. We propose three cost-effective measures to quantify different aspects of similarity between source pretraining and target task data. We demonstrate that these measures are good predictors of the usefulness of pretrained models for Named Entity Recognition (NER) over 30 data pairs. Results also suggest that pretrained LMs are more effective and more predictable than pretrained word vectors, but pretrained word vectors are better when pretraining data is dissimilar.
Tasks Named Entity Recognition
Published 2019-04-01
URL https://arxiv.org/abs/1904.00585v2
PDF https://arxiv.org/pdf/1904.00585v2.pdf
PWC https://paperswithcode.com/paper/using-similarity-measures-to-select
Repo https://github.com/daixiangau/naacl2019-select-pretraining-data-for-ner
Framework pytorch
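
As a flavor of such measures, here is a sketch of two generic corpus-similarity scores, vocabulary overlap and unigram cross-entropy; these are in the same spirit as, but not identical to, the paper's three measures.

```python
from collections import Counter
import math

def vocab_overlap(source_tokens, target_tokens):
    """Jaccard similarity between the two corpora's vocabularies."""
    s, t = set(source_tokens), set(target_tokens)
    return len(s & t) / len(s | t)

def unigram_cross_entropy(source_tokens, target_tokens, alpha=0.5):
    """How surprised a source-trained unigram LM is by the target corpus
    (lower = more similar); add-alpha smoothing is an arbitrary choice."""
    counts = Counter(source_tokens)
    total = sum(counts.values()) + alpha * (len(counts) + 1)
    def p(w):
        return (counts.get(w, 0) + alpha) / total
    return -sum(math.log(p(w)) for w in target_tokens) / len(target_tokens)

src = "the cat sat on the mat".split()
tgt = "the dog sat".split()
print(vocab_overlap(src, tgt), unigram_cross_entropy(src, tgt))
```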

Positional Encoding to Control Output Sequence Length

Title Positional Encoding to Control Output Sequence Length
Authors Sho Takase, Naoaki Okazaki
Abstract Neural encoder-decoder models have been successful in natural language generation tasks. However, real applications of abstractive summarization must consider the additional constraint that a generated summary should not exceed a desired length. In this paper, we propose a simple but effective extension of sinusoidal positional encoding (Vaswani et al., 2017) to enable a neural encoder-decoder model to preserve the length constraint. Unlike previous studies that learn embeddings representing each length, the proposed method can generate a text of any length, even if the target length is not present in the training data. The experimental results show that the proposed method can not only control the generation length but also improve the ROUGE scores.
Tasks Abstractive Text Summarization, Text Generation, Text Summarization
Published 2019-04-16
URL http://arxiv.org/abs/1904.07418v1
PDF http://arxiv.org/pdf/1904.07418v1.pdf
PWC https://paperswithcode.com/paper/positional-encoding-to-control-output
Repo https://github.com/takase/control-length
Framework pytorch
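
A sketch of the countdown idea: drive the sinusoidal encoding by the remaining length (target length minus position) so the decoder can track how many tokens it has left. This follows my reading of the paper's length-difference variant; the constants follow Vaswani et al. (2017).

```python
import torch

def length_difference_encoding(target_len, d_model=512):
    """Sinusoidal encoding over the remaining length rather than the absolute
    position; the same target length yields the same 'countdown' signal even
    for lengths never seen in training."""
    pos = torch.arange(target_len, dtype=torch.float32)
    remaining = (target_len - pos).unsqueeze(1)          # (len, 1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)
    div = torch.pow(10000.0, i / d_model)                # (d_model/2,)
    pe = torch.zeros(target_len, d_model)
    pe[:, 0::2] = torch.sin(remaining / div)
    pe[:, 1::2] = torch.cos(remaining / div)
    return pe   # add to the decoder input embeddings

pe = length_difference_encoding(30)   # ask for a 30-token summary
```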

Data-Efficient Image Recognition with Contrastive Predictive Coding

Title Data-Efficient Image Recognition with Contrastive Predictive Coding
Authors Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord
Abstract Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with machine perception remains an open challenge. We hypothesize that data-efficient recognition is enabled by representations which make the variability in natural signals more predictable. We therefore revisit and improve Contrastive Predictive Coding, an unsupervised objective for learning such representations. This new implementation produces features which support state-of-the-art linear classification accuracy on the ImageNet dataset. When used as input for non-linear classification with deep neural networks, this representation allows us to use 2-5x fewer labels than classifiers trained directly on image pixels. Finally, this unsupervised representation substantially improves transfer learning to object detection on PASCAL VOC-2007, surpassing fully supervised pre-trained ImageNet classifiers.
Tasks Object Detection, Semi-Supervised Image Classification, Transfer Learning
Published 2019-05-22
URL https://arxiv.org/abs/1905.09272v2
PDF https://arxiv.org/pdf/1905.09272v2.pdf
PWC https://paperswithcode.com/paper/data-efficient-image-recognition-with
Repo https://github.com/Philip-Bachman/amdim-public
Framework pytorch
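
The contrastive objective at the heart of CPC is InfoNCE. Below is a minimal batched sketch; the encoder and context networks are elided, and the temperature is an assumption.

```python
import torch
import torch.nn.functional as F

def info_nce(context, targets, temperature=0.1):
    """Each context vector must pick out its own target representation among
    all targets in the batch; positives sit on the diagonal of the score
    matrix (shapes and temperature are illustrative)."""
    c = F.normalize(context, dim=-1)     # (batch, dim)
    t = F.normalize(targets, dim=-1)     # (batch, dim)
    logits = c @ t.T / temperature       # (batch, batch) similarity scores
    labels = torch.arange(c.shape[0])    # index of each row's positive
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```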

ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs

Title ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs
Authors Guangyan Zhang, Ziru Liu, Jichen Dai, Zilan Yu, Shuai Liu, Wen Zhang
Abstract Motivation: Since long non-coding RNAs (lncRNAs) are involved in a wide range of functions in cellular and developmental processes, an increasing number of methods have been proposed for distinguishing lncRNAs from coding RNAs. However, most of the existing methods are designed for lncRNAs in animal systems, and only a few focus on plant lncRNA identification. Unlike lncRNAs in animal systems, plant lncRNAs have distinct characteristics, so it is desirable to develop a computational method for accurate and robust identification of plant lncRNAs. Results: Herein, we present a plant lncRNA identification method, ItLnc-BXE, which utilizes multiple features and an ensemble learning strategy. First, a diverse set of lncRNA features is collected and filtered by feature selection to represent RNA transcripts. Then, several base learners are trained and further combined into a single meta-learner by ensemble learning, yielding the ItLnc-BXE model. ItLnc-BXE models are evaluated on datasets of six plant species; the results show that ItLnc-BXE outperforms other state-of-the-art plant lncRNA identification methods, achieving better and more robust performance (AUC > 95.91%). We also perform experiments on cross-species lncRNA identification, and the results indicate that dicot-based and monocot-based models can be used to accurately identify lncRNAs in lower plant species, such as mosses and algae. Availability: source code is available at https://github.com/BioMedicalBigDataMiningLab/ItLnc-BXE. Contact: zhangwen@mail.hzau.edu.cn (or) zhangwen@whu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
Tasks Feature Selection
Published 2019-11-01
URL https://arxiv.org/abs/1911.00185v2
PDF https://arxiv.org/pdf/1911.00185v2.pdf
PWC https://paperswithcode.com/paper/ptlnc-bxe-prediction-of-plant-lncrnas-using-a
Repo https://github.com/BioMedicalBigDataMiningLab/ItLnc-BXE
Framework none
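
A hedged sketch of the Bagging-XGBoost-ensemble idea using scikit-learn's stacking API. This assumes the xgboost package is installed; the transcript feature extraction (ORF, k-mer, and related features) is omitted, and X, y are toy stand-ins.

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier   # assumes xgboost is installed

# Several subsampled XGBoost base learners, diversified by random seed.
base_learners = [
    (f"xgb_{i}", XGBClassifier(n_estimators=200, subsample=0.8,
                               colsample_bytree=0.8, random_state=i))
    for i in range(5)
]
# A single meta-learner combines out-of-fold base-learner predictions.
model = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5)

X = np.random.rand(200, 50)               # toy transcript feature matrix
y = np.random.randint(0, 2, 200)          # toy lncRNA / coding labels
model.fit(X, y)
print(model.predict_proba(X[:3]))
```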

Prediction Focused Topic Models via Feature Selection

Title Prediction Focused Topic Models via Feature Selection
Authors Jason Ren, Russell Kunes, Finale Doshi-Velez
Abstract Supervised topic models are often sought to balance prediction quality and interpretability. However, when models are (inevitably) misspecified, standard approaches rarely deliver on both. We introduce a novel approach, the prediction-focused topic model, that uses the supervisory signal to retain only vocabulary terms that improve, or at least do not hinder, prediction performance. By removing terms with irrelevant signal, the topic model is able to learn task-relevant, coherent topics. We demonstrate on several data sets that compared to existing approaches, prediction-focused topic models learn much more coherent topics while maintaining competitive predictions.
Tasks Feature Selection, Topic Models
Published 2019-10-12
URL https://arxiv.org/abs/1910.05495v2
PDF https://arxiv.org/pdf/1910.05495v2.pdf
PWC https://paperswithcode.com/paper/prediction-focused-topic-models-via-vocab
Repo https://github.com/jasonren12/PredictionFocusedTopicModel
Framework pytorch
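
A crude proxy for the idea, offered only as illustration: score vocabulary terms by predictive usefulness and fit the topic model on the surviving terms. The paper performs this selection jointly inside the model rather than as a pre-filter, so treat this as motivation, not their method.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.decomposition import LatentDirichletAllocation

X = np.random.poisson(0.3, size=(300, 500))   # toy document-term counts
y = np.random.randint(0, 2, 300)              # toy supervisory signal

# Keep only terms that carry signal about the prediction target ...
scores = mutual_info_classif(X, y, discrete_features=True)
keep = scores >= np.quantile(scores, 0.5)     # drop the least predictive half

# ... then learn topics over the pruned vocabulary.
lda = LatentDirichletAllocation(n_components=10).fit(X[:, keep])
```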

LassoNet: Neural networks with Feature Sparsity

Title LassoNet: Neural networks with Feature Sparsity
Authors Ismael Lemhadri, Feng Ruan, Robert Tibshirani
Abstract We introduce LassoNet, a neural network model with global feature selection. The model uses a residual connection to learn a subset of the most informative input features. Specifically, the model honors a hierarchy restriction that an input neuron only be included if its linear variable is important. This produces a path of feature-sparse models in close analogy with the lasso for linear regression, while effectively capturing complex nonlinear dependencies in the data. Using a single residual block, our iterative algorithm yields an efficient proximal map which accurately selects the most salient features. On systematic experiments, LassoNet achieves competitive performance using a much smaller number of input features. LassoNet can be implemented by adding just a few lines of code to a standard neural network.
Tasks Feature Selection
Published 2019-07-29
URL https://arxiv.org/abs/1907.12207v5
PDF https://arxiv.org/pdf/1907.12207v5.pdf
PWC https://paperswithcode.com/paper/a-neural-network-with-feature-sparsity
Repo https://github.com/ilemhadri/lassoNet
Framework none
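
A hedged PyTorch sketch of the architecture and constraint: a linear skip path theta plus a small network, with each feature's first-layer weights bounded by M|theta_j| so a feature enters the network only if its linear term survives. The projection below is a simplified stand-in for the paper's hierarchical proximal operator, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class LassoNetSketch(nn.Module):
    """Residual model y = theta^T x + g(x); sparsity in theta drives global
    feature selection (illustrative sizes)."""
    def __init__(self, d=20, hidden=32):
        super().__init__()
        self.skip = nn.Linear(d, 1, bias=False)           # theta
        self.net = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))
    def forward(self, x):
        return self.skip(x) + self.net(x)

@torch.no_grad()
def project_hierarchy(model, lam, M=10.0, lr=0.1):
    """Simplified stand-in for the hierarchical proximal step: soft-threshold
    theta, then clamp each feature's first-layer weights by M * |theta_j|,
    so features with theta_j = 0 are cut off from the network entirely."""
    theta = model.skip.weight                 # (1, d)
    theta.copy_(theta.sign() * (theta.abs() - lr * lam).clamp(min=0.0))
    W1 = model.net[0].weight                  # (hidden, d)
    bound = M * theta.abs()                   # (1, d), broadcasts over rows
    W1.copy_(torch.maximum(torch.minimum(W1, bound), -bound))

model = LassoNetSketch()
((model(torch.randn(64, 20)) - torch.randn(64, 1)) ** 2).mean().backward()
project_hierarchy(model, lam=0.1)   # run after each optimizer step
```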

A Feature Selection Based on Perturbation Theory

Title A Feature Selection Based on Perturbation Theory
Authors Javad Rahimipour Anaraki, Hamid Usefi
Abstract Consider a supervised dataset $D=[A\mid \textbf{b}]$, where $\textbf{b}$ is the outcome column, rows of $D$ correspond to observations, and columns of $A$ are the features of the dataset. A central problem in machine learning and pattern recognition is to select the most important features from $D$ so as to be able to predict the outcome. In this paper, we provide a new feature selection method in which we use perturbation theory to detect correlations between features. We solve $AX=\textbf{b}$ using the method of least squares and the singular value decomposition of $A$. In practical applications, such as in bioinformatics, the number of rows of $A$ (observations) is much smaller than the number of columns of $A$ (features), so we are dealing with singular matrices with large condition numbers. Although it is known that solutions of least-squares problems in the singular case are very sensitive to perturbations in $A$, our novel approach in this paper is to prove that the correlations between features can be detected by applying perturbations to $A$. The effectiveness of our method is verified by performing a series of comparisons with conventional and novel feature selection methods in the literature. It is demonstrated that in most situations, our method selects considerably fewer features while attaining or exceeding the accuracy of the other methods.
Tasks Feature Selection
Published 2019-02-26
URL http://arxiv.org/abs/1902.09938v1
PDF http://arxiv.org/pdf/1902.09938v1.pdf
PWC https://paperswithcode.com/paper/a-feature-selection-based-on-perturbation
Repo https://github.com/jracp/PerturbationFeatureSelection
Framework none
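
A NumPy sketch of the mechanics: solve the least-squares problem via the SVD-based pseudoinverse, perturb A, and rank features by how much their coefficients move. The perturbation scale, trial count, and selection rule are illustrative, not the paper's exact recipe.

```python
import numpy as np

def perturbation_sensitivity(A, b, eps=1e-3, trials=20, seed=0):
    """Measure how much each least-squares coefficient moves under small
    random perturbations of A; unstable coefficients signal correlated,
    redundant features."""
    rng = np.random.default_rng(seed)
    x0 = np.linalg.pinv(A) @ b                    # minimum-norm LS solution
    deltas = []
    for _ in range(trials):
        E = eps * rng.standard_normal(A.shape)
        deltas.append(np.linalg.pinv(A + E) @ b - x0)
    return np.abs(np.stack(deltas)).mean(axis=0)  # per-feature instability

A = np.random.rand(30, 100)            # n << p, as in bioinformatics
b = np.random.rand(30)
scores = perturbation_sensitivity(A, b)
selected = np.argsort(scores)[:10]     # keep the 10 most stable features
```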

AutoSF: Searching Scoring Functions for Knowledge Graph Embedding

Title AutoSF: Searching Scoring Functions for Knowledge Graph Embedding
Authors Yongqi Zhang, Quanming Yao, Wenyuan Dai, Lei Chen
Abstract Scoring functions (SFs), which measure the plausibility of triplets in a knowledge graph (KG), have become the crux of KG embedding. Many SFs, designed to capture different kinds of relations in KGs, have been proposed in recent years. However, as relations can exhibit complex patterns that are hard to infer before training, none of them consistently performs better than the others on existing benchmark data sets. In this paper, inspired by the recent success of automated machine learning (AutoML), we propose to automatically design SFs (AutoSF) for distinct KGs using AutoML techniques. However, it is non-trivial to exploit domain-specific information here to make AutoSF efficient and effective. We first identify a unified representation over popularly used SFs, which helps to set up a search space for AutoSF. Then, we propose a greedy algorithm to search this space efficiently. The algorithm is further sped up by a filter and a predictor, which avoid repeatedly training SFs with the same expressive ability and help remove bad candidates during the search, before model training. Finally, we perform extensive experiments on benchmark data sets. Results on link prediction and triplet classification show that the SFs found by AutoSF are KG-dependent, new to the literature, and outperform state-of-the-art SFs designed by humans.
Tasks AutoML, Graph Embedding, Knowledge Graph Embedding, Link Prediction
Published 2019-04-26
URL https://arxiv.org/abs/1904.11682v3
PDF https://arxiv.org/pdf/1904.11682v3.pdf
PWC https://paperswithcode.com/paper/autokge-searching-scoring-functions-for
Repo https://github.com/yzhangee/AutoSF
Framework pytorch
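
A skeleton of the greedy search loop with its filter and predictor, with all KG-specific pieces stubbed out. The 4x4 integer encoding of candidate SFs loosely mirrors the paper's unified bilinear view; mutate, canonical, and the toy evaluator are illustrative stand-ins, not the paper's exact operators.

```python
import random

def canonical(sf):
    # Filter: map structurally equivalent candidates to one representative
    # (the real filter also accounts for permutation/sign symmetries).
    return tuple(tuple(row) for row in sf)

def mutate(sf):
    sf = [list(row) for row in sf]
    i, j = random.randrange(4), random.randrange(4)
    sf[i][j] = random.choice([-4, -3, -2, -1, 0, 1, 2, 3, 4])
    return sf

def greedy_search(seed_sf, train_and_eval, predictor, rounds=10, pool=64, top=8):
    best_sf, best_score, seen = seed_sf, train_and_eval(seed_sf), set()
    for _ in range(rounds):
        cands = [mutate(best_sf) for _ in range(pool)]
        cands = [c for c in cands if canonical(c) not in seen]
        seen.update(canonical(c) for c in cands)
        cands.sort(key=predictor, reverse=True)   # cheap surrogate ranking
        for c in cands[:top]:                     # fully train only a few
            score = train_and_eval(c)
            if score > best_score:
                best_sf, best_score = c, score
    return best_sf, best_score

random.seed(0)
seed = [[1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 4]]  # DistMult-like
fake_eval = lambda sf: random.random()   # stand-in for link-prediction MRR
best, score = greedy_search(seed, fake_eval, predictor=fake_eval)
```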

INN: Inflated Neural Networks for IPMN Diagnosis

Title INN: Inflated Neural Networks for IPMN Diagnosis
Authors Rodney LaLonde, Irene Tanner, Katerina Nikiforaki, Georgios Z. Papadakis, Pujan Kandel, Candice W. Bolan, Michael B. Wallace, Ulas Bagci
Abstract Intraductal papillary mucinous neoplasm (IPMN) is a precursor to pancreatic ductal adenocarcinoma. While over half of patients are diagnosed with pancreatic cancer at a distant stage, patients who are diagnosed early enjoy a much higher 5-year survival rate of 34%, compared to 3% for those diagnosed at a distant stage; hence, early diagnosis is key. Unique challenges in the medical imaging domain, such as extremely limited annotated data sets and typically large 3D volumetric data, have made it difficult for deep learning to secure a strong foothold. In this work, we construct two novel “inflated” deep network architectures, InceptINN and DenseINN, for the task of diagnosing IPMN from multisequence (T1 and T2) MRI. These networks inflate their 2D layers to 3D and bootstrap weights from their 2D counterparts (Inceptionv3 and DenseNet121, respectively) trained on ImageNet to the new 3D kernels. We also extend the inflation process by further expanding the pre-trained kernels to handle any number of input modalities and different fusion strategies. This is one of the first studies to train an end-to-end deep network on multisequence MRI for IPMN diagnosis, and it shows that our proposed inflated network architectures are able to handle the extremely limited training data (139 MRI scans) while providing an absolute improvement of 8.76% in accuracy for diagnosing IPMN over the current state of the art. Code is publicly available at https://github.com/lalonderodney/INN-Inflated-Neural-Nets.
Tasks
Published 2019-06-30
URL https://arxiv.org/abs/1907.00437v1
PDF https://arxiv.org/pdf/1907.00437v1.pdf
PWC https://paperswithcode.com/paper/inn-inflated-neural-networks-for-ipmn
Repo https://github.com/lalonderodney/INN-Inflated-Neural-Nets
Framework tf
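
The inflation step itself is small. Here is a PyTorch sketch of the basic I3D-style bootstrap (tile the 2D kernel along depth and divide by the depth); the paper's multi-modal kernel expansion and fusion strategies are omitted, and the linked repo is TensorFlow-based.

```python
import torch
import torch.nn as nn

def inflate_conv(conv2d, depth=3):
    """Inflate a pre-trained 2D conv into 3D: tile the kernel along the new
    depth axis and divide by the depth, so activations keep the same scale
    on inputs of repeated identical slices."""
    c_out, c_in, kh, kw = conv2d.weight.shape
    conv3d = nn.Conv3d(c_in, c_out, (depth, kh, kw),
                       stride=(1,) + conv2d.stride,
                       padding=(depth // 2,) + conv2d.padding)
    with torch.no_grad():
        w = conv2d.weight.unsqueeze(2).repeat(1, 1, depth, 1, 1) / depth
        conv3d.weight.copy_(w)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d

conv3d = inflate_conv(nn.Conv2d(3, 16, 3, padding=1))
out = conv3d(torch.randn(1, 3, 8, 32, 32))   # (batch, C, D, H, W)
```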

Discourse-Based Evaluation of Language Understanding

Title Discourse-Based Evaluation of Language Understanding
Authors Damien Sileo, Tim Van-de-Cruys, Camille Pradel, Philippe Muller
Abstract We introduce DiscEval, a compilation of 11 evaluation datasets with a focus on discourse, that can be used to evaluate English Natural Language Understanding when considering meaning as use. We make the case that evaluation with discourse tasks is overlooked and that Natural Language Inference (NLI) pretraining may not lead to the learning of truly universal representations. DiscEval can also be used as supplementary training data for multi-task learning-based systems, and it is publicly available, alongside the code for gathering and preprocessing the datasets.
Tasks Multi-Task Learning, Natural Language Inference
Published 2019-07-19
URL https://arxiv.org/abs/1907.08672v1
PDF https://arxiv.org/pdf/1907.08672v1.pdf
PWC https://paperswithcode.com/paper/discourse-based-evaluation-of-language
Repo https://github.com/synapse-developpement/DiscEval
Framework none