Paper Group ANR 412
Using Dimensionality Reduction to Optimize t-SNE. Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs. Uncertainty-based graph convolutional networks for organ segmentation refinement. Word embeddings for idiolect identification. CN-Probase: A Data-driven Approach for Large-scale Chinese Taxonomy Construction …
Using Dimensionality Reduction to Optimize t-SNE
Title | Using Dimensionality Reduction to Optimize t-SNE |
Authors | Rikhav Shah, Sandeep Silwal |
Abstract | t-SNE is a popular tool for embedding multi-dimensional datasets into two or three dimensions. However, it has a large computational cost, especially when the input data has many dimensions. Many use t-SNE to embed the output of a neural network, which is generally of much lower dimension than the original data. This limits the use of t-SNE in unsupervised scenarios. We propose using *random* projections to embed high dimensional datasets into relatively few dimensions, and then using t-SNE to obtain a two dimensional embedding. We show that random projections preserve the desirable clustering achieved by t-SNE, while dramatically reducing the runtime of finding the embedding. |
Tasks | Dimensionality Reduction |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.01098v1 |
PDF | https://arxiv.org/pdf/1912.01098v1.pdf |
PWC | https://paperswithcode.com/paper/using-dimensionality-reduction-to-optimize-t |
Repo | |
Framework | |
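A minimal sketch of the paper's pipeline using scikit-learn: a cheap Gaussian random projection to a modest number of dimensions, then t-SNE down to 2D. The target dimension (50) and the t-SNE settings are illustrative choices, not the authors' configuration.

```python
# Minimal sketch: random projection to a few dimensions, then t-SNE to 2D.
# The target dimension (50) and t-SNE settings are illustrative, not the paper's.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 784))          # stand-in for a high-dimensional dataset

# Step 1: cheap Johnson-Lindenstrauss-style random projection.
X_low = GaussianRandomProjection(n_components=50, random_state=0).fit_transform(X)

# Step 2: t-SNE on the reduced data; far faster than running it on 784 dimensions.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_low)
print(emb.shape)                           # (2000, 2)
```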
Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs
Title | Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs |
Authors | Partha Maji, Andrew Mundy, Ganesh Dasika, Jesse Beu, Matthew Mattina, Robert Mullins |
Abstract | The Winograd or Cook-Toom class of algorithms helps to reduce the overall compute complexity of many modern deep convolutional neural networks (CNNs). Although there has been a lot of research done on model and algorithmic optimization of CNNs, little attention has been paid to the efficient implementation of these algorithms on embedded CPUs, which usually have very limited memory and a low power budget. This paper aims to fill this gap and focuses on the efficient implementation of Winograd or Cook-Toom based convolution on modern Arm Cortex-A CPUs, widely used in mobile devices today. Specifically, we demonstrate a reduction in inference latency by using a set of optimization strategies that improve the utilization of computational resources, and by effectively leveraging the ARMv8-A NEON SIMD instruction set. We evaluated our proposed region-wise multi-channel implementations on an Arm Cortex-A73 platform using several representative CNNs. The results show significant performance improvements on full networks, up to 60%, over existing im2row/im2col based optimization techniques. |
Tasks | |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01521v1 |
PDF | http://arxiv.org/pdf/1903.01521v1.pdf |
PWC | https://paperswithcode.com/paper/efficient-winograd-or-cook-toom-convolution |
Repo | |
Framework | |
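The core of the Winograd/Cook-Toom approach is computing small output tiles with fewer multiplications. Below is a minimal NumPy sketch of the 1-D F(2,3) transform (two outputs of a 3-tap filter), which uses 4 elementwise multiplications instead of 6; it illustrates the algorithm only, not the paper's region-wise NEON kernels.

```python
# Minimal 1-D Winograd F(2,3) sketch: 2 outputs of a 3-tap filter from a
# 4-element input tile using 4 elementwise multiplications instead of 6.
# Illustrates the algorithm only -- not the paper's Arm NEON implementation.
import numpy as np

BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])        # input tile
g = np.array([0.5, -1.0, 2.0])            # filter taps

y = AT @ ((G @ g) * (BT @ d))             # transform, multiply, inverse-transform
ref = np.array([d[0:3] @ g, d[1:4] @ g])  # direct sliding dot products
assert np.allclose(y, ref)
print(y)
```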
Uncertainty-based graph convolutional networks for organ segmentation refinement
Title | Uncertainty-based graph convolutional networks for organ segmentation refinement |
Authors | Roger D. Soberanis-Mukul, Nassir Navab, Shadi Albarqouni |
Abstract | Organ segmentation is an important pre-processing step in many computer assisted intervention and diagnosis methods. In recent years, CNNs have dominated the state of the art in this task. Organ segmentation scenarios present a challenging environment for these methods due to high variability in shape and similarity with background. This leads to the generation of false negative and false positive regions in the output segmentation. In this context, the uncertainty analysis of the model can provide us with useful information about potentially misclassified elements. In this work we propose a method based on uncertainty analysis and graph convolutional networks as a post-processing step for segmentation. For this, we employ the uncertainty levels of the CNN to formulate a semi-supervised graph learning problem that is solved by training a GCN on the low uncertainty elements. Finally, we evaluate the full graph on the trained GCN to get the refined segmentation. We test our framework in refining the output of pancreas and spleen segmentation models. We show that the framework can increase the average Dice score by 1% and 2%, respectively, for these problems. Finally, we discuss the results and current limitations of the model that lead to future work in this research direction. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02191v2 |
PDF | https://arxiv.org/pdf/1906.02191v2.pdf |
PWC | https://paperswithcode.com/paper/an-uncertainty-driven-gcn-refinement-strategy |
Repo | |
Framework | |
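A minimal sketch of the uncertainty-driven split the framework builds on: repeated stochastic (MC-dropout-style) predictions give a per-element uncertainty estimate, low-uncertainty elements become the labelled seed set for the GCN, and high-uncertainty elements are the nodes to relabel. The GCN training itself is elided, the probabilities are random placeholders, and the 0.2 threshold is an illustrative choice.

```python
# Minimal sketch of the uncertainty-based node split (the GCN itself is elided).
# `mc_probs` stands in for T stochastic (MC-dropout) forward passes of the CNN;
# the threshold tau is illustrative.
import numpy as np

rng = np.random.default_rng(0)
T, n_voxels = 20, 1000
mc_probs = rng.uniform(size=(T, n_voxels))     # placeholder foreground probabilities

mean_p = mc_probs.mean(axis=0)
uncertainty = mc_probs.std(axis=0)             # simple predictive-variance proxy

tau = 0.2                                      # illustrative threshold
low_unc = uncertainty < tau                    # trusted elements -> GCN training labels
high_unc = ~low_unc                            # elements the GCN should relabel

seed_labels = (mean_p[low_unc] > 0.5).astype(int)
print(f"{low_unc.sum()} trusted nodes, {high_unc.sum()} nodes to refine")
```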
Word embeddings for idiolect identification
Title | Word embeddings for idiolect identification |
Authors | Konstantinos Perifanos, Eirini Florou, Dionysis Goutsos |
Abstract | The term idiolect refers to the unique and distinctive use of language of an individual and it is the theoretical foundation of Authorship Attribution. In this paper we are focusing on learning distributed representations (embeddings) of social media users that reflect their writing style. These representations can be considered as stylistic fingerprints of the authors. We are exploring the performance of the two main flavours of distributed representations, namely embeddings produced by Neural Probabilistic Language models (such as word2vec) and matrix factorization (such as GloVe). |
Tasks | Word Embeddings |
Published | 2019-02-10 |
URL | http://arxiv.org/abs/1902.03658v1 |
PDF | http://arxiv.org/pdf/1902.03658v1.pdf |
PWC | https://paperswithcode.com/paper/word-embeddings-for-idiolect-identification |
Repo | |
Framework | |
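A minimal sketch of one way to build such a "stylistic fingerprint": train word2vec on users' posts and average each author's word vectors. The gensim 4.x API is assumed, and the averaging step is an illustrative choice rather than the paper's exact author-representation method.

```python
# Minimal sketch: word2vec-based author "fingerprints" as averaged word vectors.
# Assumes gensim >= 4.0; the averaging scheme is illustrative, not the paper's method.
import numpy as np
from gensim.models import Word2Vec

posts = {                                   # toy per-author corpora
    "user_a": [["honestly", "this", "is", "great"], ["great", "stuff", "honestly"]],
    "user_b": [["the", "results", "indicate", "otherwise"], ["results", "are", "mixed"]],
}
sentences = [s for docs in posts.values() for s in docs]
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, seed=0)

def fingerprint(author):
    vecs = [model.wv[w] for doc in posts[author] for w in doc]
    return np.mean(vecs, axis=0)            # author embedding = mean word vector

a, b = fingerprint("user_a"), fingerprint("user_b")
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cosine similarity between authors: {cos:.3f}")
```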
CN-Probase: A Data-driven Approach for Large-scale Chinese Taxonomy Construction
Title | CN-Probase: A Data-driven Approach for Large-scale Chinese Taxonomy Construction |
Authors | Jindong Chen, Ao Wang, Jiangjie Chen, Yanghua Xiao, Zhendong Chu, Jingping Liu, Jiaqing Liang, Wei Wang |
Abstract | Taxonomies play an important role in machine intelligence. However, most well-known taxonomies are in English, and non-English taxonomies, especially Chinese ones, are still very rare. In this paper, we focus on automatic Chinese taxonomy construction and propose an effective generation and verification framework to build a large-scale and high-quality Chinese taxonomy. In the generation module, we extract isA relations from multiple Chinese encyclopedia sources, which ensures coverage. To further improve the precision of the taxonomy, we apply three heuristic approaches in the verification module. As a result, we construct CN-Probase, the largest Chinese taxonomy, with a high precision of about 95%. Our taxonomy has been deployed on Aliyun, with over 82 million API calls in six months. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10326v1 |
PDF | http://arxiv.org/pdf/1902.10326v1.pdf |
PWC | https://paperswithcode.com/paper/cn-probase-a-data-driven-approach-for-large |
Repo | |
Framework | |
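A minimal sketch of the generate-then-verify pattern: harvest candidate isA pairs from encyclopedia-style category fields, then keep only candidates that pass a simple check. The toy data, the head-word pattern, and the cross-source agreement threshold are hypothetical illustrations, not CN-Probase's actual extraction rules or heuristics.

```python
# Minimal generate-then-verify sketch for isA extraction.
# The category data, pattern, and agreement threshold are hypothetical --
# they illustrate the framework, not CN-Probase's actual heuristics.
from collections import Counter

# Candidate generation: entity -> category strings from several encyclopedia sources.
sources = [
    {"Beijing": ["capital city", "city in China"], "Python": ["programming language"]},
    {"Beijing": ["city in China"], "Python": ["programming language", "snake"]},
    {"Beijing": ["city in China"]},
]
candidates = Counter()
for src in sources:
    for entity, cats in src.items():
        for cat in cats:
            hypernym = cat.split(" in ")[0].strip()     # crude head-word pattern
            candidates[(entity, hypernym)] += 1

# Verification: keep isA pairs supported by at least two independent mentions.
MIN_SUPPORT = 2
taxonomy = [pair for pair, n in candidates.items() if n >= MIN_SUPPORT]
print(taxonomy)   # [('Beijing', 'city'), ('Python', 'programming language')]
```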
Interactive Refinement of Cross-Lingual Word Embeddings
Title | Interactive Refinement of Cross-Lingual Word Embeddings |
Authors | Michelle Yuan, Mozhi Zhang, Benjamin Van Durme, Leah Findlater, Jordan Boyd-Graber |
Abstract | Cross-lingual word embeddings transfer knowledge between languages: models trained for a high-resource language can be used in a low-resource language. These embeddings are usually trained on general-purpose corpora but used for a domain-specific task. We introduce CLIME, an interactive system that allows a user to quickly adapt cross-lingual word embeddings for a given classification problem. First, words in the vocabulary are ranked by their salience to the downstream task. Then, salient keywords are displayed on an interface. Users mark the similarity between each keyword and its nearest neighbors in the embedding space. Finally, CLIME updates the embeddings using the annotations. We experiment with CLIME on a cross-lingual text classification benchmark for four low-resource languages: Ilocano, Sinhalese, Tigrinya, and Uyghur. Embeddings refined by CLIME capture more nuanced word semantics and have higher test accuracy than the original embeddings. CLIME also improves test accuracy faster than an active learning baseline, and a simple combination of CLIME with active learning has the highest test accuracy. |
Tasks | Active Learning, Text Classification, Word Embeddings |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03070v1 |
PDF | https://arxiv.org/pdf/1911.03070v1.pdf |
PWC | https://paperswithcode.com/paper/interactive-refinement-of-cross-lingual-word |
Repo | |
Framework | |
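A minimal sketch of the loop's final step: given a user's similar/dissimilar judgments between a keyword and its neighbors, nudge the embeddings together or apart. The salience ranking (absolute classifier weight) and the pull/push update rule are illustrative stand-ins for CLIME's actual procedure.

```python
# Minimal sketch of embedding refinement from user feedback.
# Salience scores and the pull/push update are illustrative, not CLIME's exact rules.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["good", "fine", "bad", "terrible", "table"]
E = {w: rng.normal(size=8) for w in vocab}          # toy cross-lingual embeddings
clf_weight = {"good": 2.1, "bad": -1.9, "fine": 1.0, "terrible": -1.7, "table": 0.1}

# 1) Rank words by salience to the downstream classifier (here: |weight|).
salient = sorted(vocab, key=lambda w: -abs(clf_weight[w]))[:2]

# 2) User marks each keyword's neighbors as similar (+1) or dissimilar (-1).
feedback = {("good", "fine"): +1, ("good", "terrible"): -1, ("bad", "terrible"): +1}

# 3) Update: pull similar pairs together, push dissimilar pairs apart.
lr = 0.1
for (w, v), label in feedback.items():
    diff = E[v] - E[w]
    E[w] += lr * label * diff
    E[v] -= lr * label * diff

print("most salient keywords:", salient)
```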
Scalable K-Medoids via True Error Bound and Familywise Bandits
Title | Scalable K-Medoids via True Error Bound and Familywise Bandits |
Authors | Aravindakshan Babu, Saurabh Agarwal, Sudarshan Babu, Hariharan Chandrasekaran |
Abstract | K-Medoids (KM) is a standard clustering method, used extensively on semi-metric data. Error analyses of KM have traditionally used an in-sample notion of error, which can be far from the true error and suffer from a generalization gap. We formalize the true K-Medoids error based on the underlying data distribution. We decompose the true error into fundamental statistical problems of minimum estimation (ME) and minimum mean estimation (MME). We provide a convergence result for MME, showing that the MME error decreases no slower than $\Theta(\frac{1}{n^{\frac{2}{3}}})$, where $n$ is a measure of sample size. Inspired by this bound, we propose a computationally efficient, distributed KM algorithm named MCPAM. MCPAM has expected runtime $\mathcal{O}(km)$, where $k$ is the number of medoids and $m$ is the number of samples. MCPAM provides massive computational savings for a small tradeoff in accuracy. We verify the quality and scaling properties of MCPAM on various datasets and achieve the hitherto unachieved feat of calculating the KM of 1 billion points on semi-metric spaces. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.10979v2 |
PDF | https://arxiv.org/pdf/1905.10979v2.pdf |
PWC | https://paperswithcode.com/paper/scalable-k-medoids-via-true-error-bound-and |
Repo | |
Framework | |
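A minimal sketch of the idea behind sampling-based medoid updates: estimate each candidate's average distance on a random subsample instead of the full dataset, keeping each sweep cheap. This is a plain subsampled k-medoids loop for illustration, not the MCPAM algorithm or its familywise-bandit error control.

```python
# Minimal subsampled k-medoids sketch (illustration only -- not MCPAM itself,
# and without its bandit-style error control).
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, size=(200, 2)) for c in ((0, 0), (5, 5), (0, 5))])
k, sample_size, n_iters = 3, 50, 5

medoids = X[rng.choice(len(X), k, replace=False)]
for _ in range(n_iters):
    # Assign every point to its nearest current medoid.
    d = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    # Re-pick each medoid using cost estimated on a random subsample of its cluster.
    for j in range(k):
        members = X[assign == j]
        sub = members[rng.choice(len(members), min(sample_size, len(members)), replace=False)]
        costs = np.linalg.norm(members[:, None, :] - sub[None, :, :], axis=2).mean(axis=1)
        medoids[j] = members[costs.argmin()]

print(np.round(medoids, 2))
```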
Coupling material and mechanical design processes via computer model calibration
Title | Coupling material and mechanical design processes via computer model calibration |
Authors | Carl Ehrett, D. Andrew Brown, Evan Chodora, Christopher Kitchens, Sez Atamturktur |
Abstract | Computer model calibration typically operates by choosing parameter values in a computer model so that the model output faithfully predicts reality. By using performance targets in place of observed data, we show that calibration techniques can be repurposed to wed engineering and material design, two processes that are traditionally carried out separately. This allows materials to be designed with specific engineering targets in mind while quantifying the associated sources of uncertainty. We demonstrate our proposed approach by “calibrating” material design settings to performance targets for a wind turbine blade. |
Tasks | Calibration |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09553v2 |
PDF | https://arxiv.org/pdf/1907.09553v2.pdf |
PWC | https://paperswithcode.com/paper/coupling-material-and-mechanical-design |
Repo | |
Framework | |
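A minimal sketch of calibration-to-targets: treat the computer model as a black box and search for design settings whose simulated outputs match the performance targets, here via least squares with scipy. The toy simulator and targets are hypothetical; the paper's approach is Bayesian and quantifies the associated uncertainty, which this sketch omits.

```python
# Minimal "calibration to performance targets" sketch: least-squares search for
# design settings. Toy model and targets are hypothetical; the paper's method is
# Bayesian and also quantifies uncertainty, which is omitted here.
import numpy as np
from scipy.optimize import minimize

def simulator(theta):
    """Stand-in computer model: design settings -> (cost, deflection, mass)."""
    stiffness, density = theta
    return np.array([2.0 * stiffness + 0.5 * density,   # cost
                     10.0 / stiffness,                   # tip deflection
                     3.0 * density])                     # mass

targets = np.array([8.0, 2.5, 4.5])                      # desired performance

def discrepancy(theta):
    return np.sum((simulator(theta) - targets) ** 2)

res = minimize(discrepancy, x0=np.array([1.0, 1.0]), method="Nelder-Mead")
print("calibrated design settings:", np.round(res.x, 3))
print("achieved performance:", np.round(simulator(res.x), 3))
```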
State2vec: Off-Policy Successor Features Approximators
Title | State2vec: Off-Policy Successor Features Approximators |
Authors | Sephora Madjiheurem, Laura Toni |
Abstract | A major challenge in reinforcement learning (RL) is the design of agents that are able to generalize across tasks that share common dynamics. A viable solution is meta-reinforcement learning, which identifies common structures among past tasks to be then generalized to new tasks (meta-test). In meta-training, the RL agent learns state representations that encode prior information from a set of tasks, used to generalize the value function approximation. This has been proposed in the literature as successor representation approximators. While promising, these methods do not generalize well across optimal policies, leading to sample inefficiency during meta-test phases. In this paper, we propose state2vec, an efficient and low-complexity framework for learning successor features which (i) generalize across policies and (ii) ensure sample efficiency during meta-test. We extend the well-known node2vec framework to learn state embeddings that account for the discounted future state transitions in RL. The proposed off-policy state2vec captures the geometry of the underlying state space, making good basis functions for linear value function approximation. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10277v1 |
PDF | https://arxiv.org/pdf/1910.10277v1.pdf |
PWC | https://paperswithcode.com/paper/state2vec-off-policy-successor-features |
Repo | |
Framework | |
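A minimal sketch of the successor-representation machinery the paper builds on: a TD update that learns expected discounted future state occupancies, which then serve as basis features for linear value approximation. This illustrates plain tabular successor features, not the state2vec/node2vec-style embedding objective itself.

```python
# Minimal tabular successor-representation sketch (TD learning of discounted
# future state occupancies). Illustrates the SR idea only, not state2vec's
# node2vec-style embedding objective.
import numpy as np

n_states, gamma, alpha, steps = 5, 0.9, 0.1, 20000
rng = np.random.default_rng(0)
psi = np.zeros((n_states, n_states))          # psi[s] ~ E[sum_t gamma^t 1{s_t}]

s = 0
for _ in range(steps):
    s_next = (s + rng.choice([1, -1])) % n_states    # random walk on a ring
    onehot = np.eye(n_states)[s]
    psi[s] += alpha * (onehot + gamma * psi[s_next] - psi[s])
    s = s_next

# With successor features, the value function is linear in the per-state rewards.
reward = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
values = psi @ reward
print(np.round(values, 2))
```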
Seeing Things in Random-Dot Videos
Title | Seeing Things in Random-Dot Videos |
Authors | Thomas Dagès, Michael Lindenbaum, Alfred M. Bruckstein |
Abstract | Humans possess an intricate and powerful visual system in order to perceive and understand the environing world. Human perception can effortlessly detect and correctly group features in visual data and can even interpret random-dot videos induced by imaging natural dynamic scenes with highly noisy sensors such as ultrasound imaging. Remarkably, this happens even if perception completely fails when the same information is presented frame by frame rather than in a video sequence. We study this property of surprising dynamic perception with the first goal of proposing a new detection and spatio-temporal grouping algorithm for such signals when, per frame, the information on objects is both random and sparse and embedded in random noise. The algorithm is based on the succession of temporal integration and spatial statistical tests of unlikeliness, the a contrario framework. The algorithm not only manages to handle such signals but the striking similarity in its performance to the perception by human observers, as witnessed by a series of psychophysical experiments on image and video data, leads us to see in it a simple computational Gestalt model of human perception with only two parameters: the time integration and the visual angle for candidate shapes to be detected. |
Tasks | |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12195v2 |
PDF | https://arxiv.org/pdf/1907.12195v2.pdf |
PWC | https://paperswithcode.com/paper/seeing-things-in-random-dot-videos |
Repo | |
Framework | |
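A minimal sketch of the two ingredients named in the abstract: temporal integration of sparse noisy frames, followed by an a-contrario-style test that flags regions whose accumulated dot count is binomially unlikely under the noise-only hypothesis. Frame size, window length, rates, and the significance threshold are all illustrative, not the paper's settings.

```python
# Minimal sketch: temporal integration + an a-contrario-style unlikeliness test.
# All sizes, rates, and the threshold are illustrative, not the paper's settings.
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
H = W = 64
T = 30                                   # frames to integrate
noise_rate = 0.02                        # per-pixel dot probability per frame

video = rng.random((T, H, W)) < noise_rate
video[:, 20:28, 20:28] |= rng.random((T, 8, 8)) < 0.10   # faint object region

acc = video.sum(axis=0)                  # temporal integration (dot counts per pixel)

# Test each 8x8 block: how unlikely is its total count under pure noise?
block = 8
n_tests = (H // block) * (W // block)
for i in range(0, H, block):
    for j in range(0, W, block):
        count = acc[i:i+block, j:j+block].sum()
        n_trials = T * block * block
        p_tail = binom.sf(count - 1, n_trials, noise_rate)   # P(X >= count)
        if n_tests * p_tail < 1e-3:      # a-contrario style: expected false alarms
            print(f"detection at block ({i},{j}), NFA ~ {n_tests * p_tail:.2e}")
```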
Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar
Title | Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar |
Authors | Iddo Drori, Yamuna Krishnamurthy, Raoni Lourenco, Remi Rampin, Kyunghyun Cho, Claudio Silva, Juliana Freire |
Abstract | Automatic machine learning is an important problem in the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks. In the spirit of reproducible research we make our data, models, and code publicly available. |
Tasks | AutoML |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10345v1 |
PDF | https://arxiv.org/pdf/1905.10345v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-machine-learning-by-pipeline |
Repo | |
Framework | |
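A minimal sketch of grammar-constrained pipeline search: a tiny production grammar over preprocessing and estimator choices, random sampling of pipelines from it, and cross-validated scoring with scikit-learn. The toy grammar and the random sampler stand in for AlphaD3M's learned, self-play-driven policy.

```python
# Minimal grammar-constrained pipeline search sketch. The tiny grammar and the
# random sampler stand in for AlphaD3M's learned self-play policy.
import random
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

GRAMMAR = {                               # PIPELINE -> SCALER? REDUCER? CLASSIFIER
    "SCALER": [StandardScaler, MinMaxScaler, None],
    "REDUCER": [lambda: PCA(n_components=2), None],
    "CLASSIFIER": [lambda: LogisticRegression(max_iter=500), DecisionTreeClassifier],
}

def sample_pipeline(rng):
    steps = []
    for name in ("SCALER", "REDUCER", "CLASSIFIER"):
        choice = rng.choice(GRAMMAR[name])
        if choice is not None:
            steps.append((name.lower(), choice()))
    return Pipeline(steps)

X, y = load_iris(return_X_y=True)
rng = random.Random(0)
best = max((sample_pipeline(rng) for _ in range(10)),
           key=lambda p: cross_val_score(p, X, y, cv=3).mean())
print(best)
```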
Prediction and outlier detection in classification problems
Title | Prediction and outlier detection in classification problems |
Authors | Leying Guan, Rob Tibshirani |
Abstract | We consider the multi-class classification problem when the training data and the out-of-sample test data may have different distributions and propose a method called BCOPS (balanced and conformal optimized prediction sets). BCOPS constructs a prediction set $C(x)$ as a subset of class labels, possibly empty. It tries to optimize the out-of-sample performance, aiming to include the correct class as often as possible, but also detecting outliers $x$, for which the method returns no prediction (corresponding to $C(x)$ equal to the empty set). The proposed method combines supervised-learning algorithms with the method of conformal prediction to minimize a misclassification loss averaged over the out-of-sample distribution. The constructed prediction sets have a finite-sample coverage guarantee without distributional assumptions. We also propose a method to estimate the outlier detection rate of a given method. We prove asymptotic consistency and optimality of our proposals under suitable assumptions and illustrate our methods on real data examples. |
Tasks | Outlier Detection |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04396v3 |
PDF | https://arxiv.org/pdf/1905.04396v3.pdf |
PWC | https://paperswithcode.com/paper/prediction-and-outlier-detection-a |
Repo | |
Framework | |
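A minimal sketch of the conformal building block: split-conformal, class-conditional p-values computed from a held-out calibration set, with label $k$ included in $C(x)$ when its p-value exceeds alpha; a point that conforms to no class gets the empty set. This is plain class-conditional split conformal for illustration, not BCOPS's balanced optimization of the out-of-sample loss.

```python
# Minimal class-conditional split-conformal sketch: label k goes into C(x) if its
# conformal p-value exceeds alpha; a point matching no class gets the empty set.
# Illustrates the conformal building block only, not BCOPS's balanced objective.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_classes=3, n_informative=6, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.2, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
cal_scores = clf.predict_proba(X_cal)                 # conformity score: P(class | x)

def prediction_set(x, alpha=0.1):
    probs = clf.predict_proba(x.reshape(1, -1))[0]
    C = []
    for k in range(len(probs)):
        ref = cal_scores[y_cal == k, k]               # calibration scores of class k
        p_value = (np.sum(ref <= probs[k]) + 1) / (len(ref) + 1)
        if p_value > alpha:
            C.append(k)
    return C                                          # [] signals a likely outlier

for x in X_te[:5]:
    print(prediction_set(x))
```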
The Planted Matching Problem: Phase Transitions and Exact Results
Title | The Planted Matching Problem: Phase Transitions and Exact Results |
Authors | Mehrdad Moharrami, Cristopher Moore, Jiaming Xu |
Abstract | We study the problem of recovering a planted matching in randomly weighted complete bipartite graphs $K_{n,n}$. For some unknown perfect matching $M^*$, the weight of an edge is drawn from one distribution $P$ if $e \in M^*$ and another distribution $Q$ if $e \notin M^*$. Our goal is to infer $M^*$, exactly or approximately, from the edge weights. In this paper we take $P=\exp(\lambda)$ and $Q=\exp(1/n)$, in which case the maximum-likelihood estimator of $M^*$ is the minimum-weight matching $M_{\min}$. We obtain precise results on the overlap between $M^*$ and $M_{\min}$, i.e., the fraction of edges they have in common. For $\lambda \ge 4$ we have almost-perfect recovery, with overlap $1-o(1)$ with high probability. For $\lambda < 4$ the expected overlap is an explicit function $\alpha(\lambda) < 1$: we compute it by generalizing Aldous’ celebrated proof of Mézard and Parisi’s $\zeta(2)$ conjecture for the un-planted model, using local weak convergence to relate $K_{n,n}$ to a type of weighted infinite tree, and then deriving a system of differential equations from a message-passing algorithm on this tree. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08880v1 |
PDF | https://arxiv.org/pdf/1912.08880v1.pdf |
PWC | https://paperswithcode.com/paper/the-planted-matching-problem-phase |
Repo | |
Framework | |
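A minimal simulation of the planted model: plant the identity matching with Exp(λ) weights, give all other edges Exp(1/n) weights, recover the minimum-weight matching with the Hungarian algorithm, and measure the overlap. The values of λ and n are illustrative; note that numpy parameterizes the exponential by its scale, i.e. the inverse rate.

```python
# Minimal simulation of the planted matching model: planted edges ~ Exp(lambda),
# other edges ~ Exp(1/n); recover the min-weight matching and measure overlap.
# (numpy's exponential takes the *scale* = 1/rate.)
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n, lam = 500, 4.0

W = rng.exponential(scale=n, size=(n, n))            # Q = Exp(1/n): scale = n
W[np.arange(n), np.arange(n)] = rng.exponential(scale=1.0 / lam, size=n)  # P = Exp(lambda)

rows, cols = linear_sum_assignment(W)                # minimum-weight perfect matching
overlap = np.mean(cols == np.arange(n))              # fraction of planted edges recovered
print(f"lambda = {lam}: overlap = {overlap:.3f}")    # close to 1 for lambda >= 4
```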
Appearance and Shape from Water Reflection
Title | Appearance and Shape from Water Reflection |
Authors | Ryo Kawahara, Meng-Yu Jennifer Kuo, Shohei Nobuhara, Ko Nishino |
Abstract | This paper introduces single-image geometric and appearance reconstruction from water reflection photography, i.e., images capturing direct and water-reflected real-world scenes. Water reflection offers an additional viewpoint to the direct sight, collectively forming a stereo pair. The water-reflected scene, however, includes internally scattered and reflected environmental illumination in addition to the scene radiance, which precludes direct stereo matching. We derive a principled iterative method that disentangles this scene radiometry and geometry for reconstructing 3D scene structure as well as its high-dynamic range appearance. In the presence of waves, we simultaneously recover the wave geometry as surface normal perturbations of the water surface. Most important, we show that the water reflection enables calibration of the camera. In other words, for the first time, we show that capturing a direct and water-reflected scene in a single exposure forms a self-calibrating HDR catadioptric stereo camera. We demonstrate our method on a number of images taken in the wild. The results demonstrate a new means for leveraging this accidental catadioptric camera. |
Tasks | 3D Scene Reconstruction, Calibration, Stereo Matching, Stereo Matching Hand |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10284v2 |
PDF | https://arxiv.org/pdf/1906.10284v2.pdf |
PWC | https://paperswithcode.com/paper/shape-from-water-reflection |
Repo | |
Framework | |
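A minimal sketch of the stereo geometry behind water-reflection imaging: the reflected view is equivalent to a virtual camera mirrored about the water plane, so a point seen in both views can be triangulated from the two rays. The camera position and scene point are hypothetical, and calibration, radiometric disentangling, and waves, which are the paper's actual contributions, are omitted.

```python
# Minimal water-reflection stereo sketch: the reflection acts as a virtual camera
# mirrored about the water plane (z = 0), giving a second ray for triangulation.
# Calibration, radiometry, and waves -- the paper's contributions -- are omitted.
import numpy as np

def mirror_about_water(p):
    """Reflect a 3D point about the water plane z = 0."""
    return p * np.array([1.0, 1.0, -1.0])

C = np.array([0.0, 0.0, 2.0])          # real camera, 2 m above the water
C_virt = mirror_about_water(C)         # virtual camera implied by the reflection
P_true = np.array([3.0, 1.0, 1.5])     # scene point to recover (hypothetical)

d_direct = (P_true - C) / np.linalg.norm(P_true - C)             # ray, real camera
d_reflect = (P_true - C_virt) / np.linalg.norm(P_true - C_virt)  # ray, virtual camera

# Midpoint triangulation: closest approach of the two rays.
A = np.stack([d_direct, -d_reflect], axis=1)                     # 3x2 system
t = np.linalg.lstsq(A, C_virt - C, rcond=None)[0]
P_est = 0.5 * ((C + t[0] * d_direct) + (C_virt + t[1] * d_reflect))
print(np.round(P_est, 3))              # ~ [3. 1. 1.5]
```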
Smart Households Demand Response Management with Micro Grid
Title | Smart Households Demand Response Management with Micro Grid |
Authors | Hossein Mohammadi Rouzbahani, Abolfazl Rahimnezhad, Hadis Karimipour |
Abstract | Nowadays the emerging smart grid technology opens up the possibility of two-way communication between customers and energy utilities. Demand Response Management (DRM) offers the promise of saving money for commercial customers and households while helping utilities operate more efficiently. In this paper, an Incentive-based Demand Response Optimization (IDRO) model is proposed to efficiently schedule household appliances for minimum usage during peak hours. The proposed method is a multi-objective optimization technique based on a Nonlinear Auto-Regressive Neural Network (NAR-NN) which considers energy provided by the utility and a rooftop-installed photovoltaic (PV) system. The proposed method is tested and verified using 300 case studies (households). Data analysis for a period of one year shows a noticeable improvement in power factor and customers' bills. |
Tasks | |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03641v1 |
PDF | https://arxiv.org/pdf/1907.03641v1.pdf |
PWC | https://paperswithcode.com/paper/smart-households-demand-response-management |
Repo | |
Framework | |
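A minimal sketch of the scheduling idea in the abstract: shift deferrable appliance runs out of peak-price hours, preferring hours with rooftop-PV output, to lower the household bill. The tariff, PV profile, and appliances are hypothetical, and the NAR-NN forecasting and multi-objective components are omitted.

```python
# Minimal demand-response sketch: move deferrable appliance runs out of peak-price
# hours, preferring hours with rooftop-PV output. Tariff, PV profile, and appliances
# are hypothetical; the paper's NAR-NN forecasting component is omitted.
import numpy as np

hours = np.arange(24)
price = np.where((hours >= 17) & (hours <= 21), 0.30, 0.12)    # $/kWh, peak 17-21h
pv = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None) * 1.5  # kW rooftop PV output

appliances = {"dishwasher": 1.2, "washing_machine": 0.9, "ev_charger": 3.3}  # kWh, one-hour run

schedule = {}
for name, energy in appliances.items():
    net_cost = price * np.clip(energy - pv, 0, None)   # grid-energy cost per start hour
    schedule[name] = int(net_cost.argmin())            # run at the cheapest hour

baseline = sum(e * price[19] for e in appliances.values())        # naive: run at 19h
optimized = sum(price[h] * max(appliances[n] - pv[h], 0) for n, h in schedule.items())
print(schedule, f"bill: {baseline:.2f} -> {optimized:.2f} $")
```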