Paper Group ANR 1119
Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search
Title | Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search |
Authors | Stephan S. Lorenzen, Ninh Pham |
Abstract | Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This paper extends top-k MIPS with a budgeted setting that asks for the best approximate top-k MIPS given a limit of B computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve it. Though the design of these sampling schemes naturally supports budgeted top-k MIPS, they suffer from the linear cost of scanning all data points to retrieve top-k results and from performance degradation when handling negative inputs. This paper makes two main contributions. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive with (and often superior to) diamond on approximating top-k MIPS regarding both efficiency and accuracy. Second, we propose a series of algorithmic engineering techniques to deploy wedge sampling on budgeted top-k MIPS. Our novel deterministic wedge-based algorithm runs significantly faster than the state-of-the-art methods for budgeted and exact top-k MIPS while maintaining a top-5 precision of at least 80% on standard recommender system data sets. |
Tasks | Recommendation Systems |
Published | 2019-08-23 |
URL | https://arxiv.org/abs/1908.08656v1 |
https://arxiv.org/pdf/1908.08656v1.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-wedge-sampling-for-budgeted |
Repo | |
Framework | |
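The abstract above names wedge sampling without showing how it works. The following is a minimal sketch of basic randomized wedge sampling for approximate top-k MIPS, assuming non-negative data and query entries. It is not the authors' deterministic variant, and the function name `wedge_sample_mips` and its parameters are hypothetical. Each sample picks a dimension j with probability proportional to q[j] times the column sum of X, then a point i with probability proportional to X[i][j], so a point's expected count is proportional to its inner product with q. The most frequently sampled points are then re-ranked by exact inner product.

```python
import random
from collections import Counter


def wedge_sample_mips(X, q, num_samples, num_candidates, k, seed=None):
    """Approximate top-k MIPS by basic wedge sampling (illustrative sketch).

    X is a list of n non-negative vectors of length d, q a non-negative query.
    num_samples and num_candidates together play the role of a budget
    (hypothetical parameters, not taken from the paper).
    """
    rng = random.Random(seed)
    n, d = len(X), len(q)

    # Column sums c_j = sum_i X[i][j]; dimension j is drawn with weight q_j * c_j.
    col_sums = [sum(X[i][j] for i in range(n)) for j in range(d)]
    dim_weights = [q[j] * col_sums[j] for j in range(d)]
    if sum(dim_weights) <= 0:
        return []

    counts = Counter()
    dims = rng.choices(range(d), weights=dim_weights, k=num_samples)
    for j, s in Counter(dims).items():
        # Within dimension j, point i is drawn with weight X[i][j].
        column = [X[i][j] for i in range(n)]
        counts.update(rng.choices(range(n), weights=column, k=s))

    # Keep the most frequently sampled points, then re-rank them exactly.
    candidates = [i for i, _ in counts.most_common(num_candidates)]

    def dot(i):
        return sum(X[i][j] * q[j] for j in range(d))

    return sorted(candidates, key=dot, reverse=True)[:k]
```

Only points that were sampled at least once can be returned, so a very small `num_samples` may yield fewer than k results. The exact re-ranking step costs about `num_candidates * d` operations, which is the part of the work that does not depend on scanning all n points.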
Causal inference with Bayes rule
Title | Causal inference with Bayes rule |
Authors | Finnian Lattimore, David Rohde |
Abstract | The concept of causality has a controversial history. The question of whether it is possible to represent and address causal problems with probability theory, or if fundamentally new mathematics such as the do-calculus is required, has been hotly debated. In this paper we demonstrate that, while it is critical to explicitly model our assumptions on the impact of intervening in a system, provided we do so, estimating causal effects can be done entirely within the standard Bayesian paradigm. The invariance assumptions underlying causal graphical models can be encoded in ordinary probabilistic graphical models, allowing causal estimation with Bayesian statistics, equivalent to the do-calculus. |
Tasks | Causal Inference |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01510v2 |
https://arxiv.org/pdf/1910.01510v2.pdf | |
PWC | https://paperswithcode.com/paper/causal-inference-with-bayes-rule |
Repo | |
Framework | |
Exploring Semi-supervised Variational Autoencoders for Biomedical Relation Extraction
Title | Exploring Semi-supervised Variational Autoencoders for Biomedical Relation Extraction |
Authors | Yijia Zhang, Zhiyong Lu |
Abstract | The biomedical literature provides a rich source of knowledge such as protein-protein interactions (PPIs), drug-drug interactions (DDIs) and chemical-protein interactions (CPIs). Biomedical relation extraction aims to automatically extract biomedical relations from biomedical text to support biomedical research. State-of-the-art methods for biomedical relation extraction are primarily based on supervised machine learning and therefore depend on sufficient labeled data. However, creating large sets of training data is prohibitively expensive and labor-intensive, especially in biomedicine, where domain knowledge is required. In contrast, there is a large amount of unlabeled biomedical text available in PubMed. Hence, computational methods capable of employing unlabeled data to reduce the burden of manual annotation are of particular interest in biomedical relation extraction. We present a novel semi-supervised approach based on a variational autoencoder (VAE) for biomedical relation extraction. Our model consists of three parts: a classifier, an encoder and a decoder. The classifier is implemented using multi-layer convolutional neural networks (CNNs), and the encoder and decoder are implemented using bidirectional long short-term memory networks (Bi-LSTMs) and CNNs, respectively. The semi-supervised mechanism allows our model to learn features from both the labeled and unlabeled data. We evaluate our method on multiple public PPI, DDI and CPI corpora. Experimental results show that our method effectively exploits the unlabeled data to improve the performance and reduce the dependence on labeled data. To the best of our knowledge, this is the first semi-supervised VAE-based method for (biomedical) relation extraction. Our results suggest that exploiting such unlabeled data can greatly improve performance on various biomedical relation extraction tasks. |
Tasks | Relation Extraction |
Published | 2019-01-18 |
URL | http://arxiv.org/abs/1901.06103v1 |
http://arxiv.org/pdf/1901.06103v1.pdf | |
PWC | https://paperswithcode.com/paper/exploring-semi-supervised-variational |
Repo | |
Framework | |
Fast Bayesian Uncertainty Estimation of Batch Normalized Single Image Super-Resolution Network
Title | Fast Bayesian Uncertainty Estimation of Batch Normalized Single Image Super-Resolution Network |
Authors | Aupendu Kar, Prabir Kumar Biswas |
Abstract | In recent years, deep convolutional neural networks (CNNs) have achieved unprecedented success in the image super-resolution (SR) task. However, because of the black-box nature of neural networks and their lack of transparency, it is hard to trust the outcome. In this regard, we introduce a Bayesian approach for uncertainty estimation in super-resolution networks. We generate Monte Carlo (MC) samples from a posterior distribution by using batch mean and variance as stochastic parameters in the batch-normalization layer during test time. Those MC samples not only reconstruct the image from its low-resolution counterpart but also provide a confidence map of the reconstruction, which can be valuable in practical use. We also introduce a faster approach for estimating the uncertainty, which can be useful for real-time applications. We validate our results using standard datasets for performance analysis and also on different domain-specific super-resolution tasks. We also estimate uncertainty quality using standard statistical metrics and provide a qualitative evaluation of uncertainty for SR applications. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-03-22 |
URL | http://arxiv.org/abs/1903.09410v1 |
http://arxiv.org/pdf/1903.09410v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-bayesian-uncertainty-estimation-of-batch |
Repo | |
Framework | |
Automated detection of celiac disease on duodenal biopsy slides: a deep learning approach
Title | Automated detection of celiac disease on duodenal biopsy slides: a deep learning approach |
Authors | Jason W. Wei, Jerry W. Wei, Christopher R. Jackson, Bing Ren, Arief A. Suriawinata, Saeed Hassanpour |
Abstract | Celiac disease prevalence and diagnosis have increased substantially in recent years. The current gold standard for celiac disease confirmation is visual examination of duodenal mucosal biopsies. An accurate computer-aided biopsy analysis system using deep learning can help pathologists diagnose celiac disease more efficiently. In this study, we trained a deep learning model to detect celiac disease on duodenal biopsy images. Our model uses a state-of-the-art residual convolutional neural network to evaluate patches of duodenal tissue and then aggregates those predictions for whole-slide classification. We tested the model on an independent set of 212 images and evaluated its classification results against reference standards established by pathologists. Our model identified celiac disease, normal tissue, and nonspecific duodenitis with accuracies of 95.3%, 91.0%, and 89.2%, respectively. The area under the receiver operating characteristic curve was greater than 0.95 for all classes. We have developed an automated biopsy analysis system that achieves high performance in detecting celiac disease on biopsy slides. Our system can highlight areas of interest and provide preliminary classification of duodenal biopsies prior to review by pathologists. This technology has great potential for improving the accuracy and efficiency of celiac disease diagnosis. |
Tasks | |
Published | 2019-01-31 |
URL | http://arxiv.org/abs/1901.11447v1 |
http://arxiv.org/pdf/1901.11447v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-detection-of-celiac-disease-on |
Repo | |
Framework | |
SWNet: Small-World Neural Networks and Rapid Convergence
Title | SWNet: Small-World Neural Networks and Rapid Convergence |
Authors | Mojan Javaheripi, Bita Darvish Rouhani, Farinaz Koushanfar |
Abstract | Training large and highly accurate deep learning (DL) models is computationally costly. This cost is in great part due to the excessive number of trained parameters, which are well known to be redundant and compressible for the execution phase. This paper proposes a novel transformation which changes the topology of the DL architecture such that it reaches an optimal cross-layer connectivity. This transformation leverages our important observation that for a set level of accuracy, convergence is fastest when network topology reaches the boundary of a Small-World Network. Small-world graphs are known to possess a specific connectivity structure that enables enhanced signal propagation among nodes. Our small-world models, called SWNets, provide several intriguing benefits: they facilitate data (gradient) flow within the network, enable feature-map reuse by adding long-range connections and accommodate various network architectures/datasets. Compared to densely connected networks (e.g., DenseNets), SWNets require substantially fewer training parameters while maintaining a similar level of classification accuracy. We evaluate our networks on various DL model architectures and image classification datasets, namely, CIFAR10, CIFAR100, and ILSVRC (ImageNet). Our experiments demonstrate an average of ~2.1x improvement in convergence speed to the desired accuracy. |
Tasks | Image Classification |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04862v1 |
http://arxiv.org/pdf/1904.04862v1.pdf | |
PWC | https://paperswithcode.com/paper/swnet-small-world-neural-networks-and-rapid |
Repo | |
Framework | |
A Survey of Behavior Learning Applications in Robotics – State of the Art and Perspectives
Title | A Survey of Behavior Learning Applications in Robotics – State of the Art and Perspectives |
Authors | Alexander Fabisch, Christoph Petzoldt, Marc Otto, Frank Kirchner |
Abstract | Recent success of machine learning in many domains has been overwhelming, which often leads to false expectations regarding the capabilities of behavior learning in robotics. In this survey, we analyze the current state of machine learning for robotic behaviors. We will give a broad overview of behaviors that have been learned and used on real robots. Our focus is on kinematically or sensorially complex robots. That includes humanoid robots or parts of humanoid robots, for example, legged robots or robotic arms. We will classify presented behaviors according to various categories and we will draw conclusions about what can be learned and what should be learned. Furthermore, we will give an outlook on problems that are challenging today but might be solved by machine learning in the future and argue that classical robotics and other approaches from artificial intelligence should be integrated more with machine learning to form complete, autonomous systems. |
Tasks | Legged Robots |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.01868v1 |
https://arxiv.org/pdf/1906.01868v1.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-of-behavior-learning-applications-in |
Repo | |
Framework | |
Proximal Splitting Networks for Image Restoration
Title | Proximal Splitting Networks for Image Restoration |
Authors | Raied Aljadaany, Dipan K. Pal, Marios Savvides |
Abstract | Image restoration problems are typically ill-posed, requiring the design of suitable priors. These priors are typically hand-designed and are fully instantiated throughout the process. In this paper, we introduce a novel framework for handling inverse problems related to image restoration based on elements from the half quadratic splitting method and proximal operators. Modeling the proximal operator as a convolutional network, we define an implicit prior on the image space as a function class during training. This is in contrast to the common practice in the literature of keeping the prior fixed and fully instantiated even during training stages. Further, we allow this proximal operator to be tuned differently for each iteration, which greatly increases modeling capacity and allows us to reduce the number of iterations by an order of magnitude as compared to other approaches. Our final network is an end-to-end one whose run time matches the previous fastest algorithms while outperforming them in recovery fidelity on two image restoration tasks. Indeed, we find our approach achieves state-of-the-art results on benchmarks in image denoising and image super resolution while recovering more complex and finer details. |
Tasks | Denoising, Image Denoising, Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2019-03-17 |
URL | http://arxiv.org/abs/1903.07154v1 |
http://arxiv.org/pdf/1903.07154v1.pdf | |
PWC | https://paperswithcode.com/paper/proximal-splitting-networks-for-image |
Repo | |
Framework | |
Crowd Video Captioning
Title | Crowd Video Captioning |
Authors | Liqi Yan, Mingjian Zhu, Changbin Yu |
Abstract | Describing a video automatically with natural language is a challenging task in the area of computer vision. In most cases, the on-site situation of great events is reported in the news, but the situation of the off-site spectators at the entrances and exits, which also arouses people’s interest, is neglected. Since deploying reporters at the entrances and exits costs a lot of manpower, automatically describing the behavior of a crowd of off-site spectators is significant and remains an open problem. To tackle this problem, we propose a new task called crowd video captioning (CVC), which aims to describe the crowd of spectators. We also provide baseline methods for this task and evaluate them on the dataset WorldExpo’10. Our experimental results show that captioning models have a fairly deep understanding of the crowd in video and perform satisfactorily in the CVC task. |
Tasks | Video Captioning |
Published | 2019-11-13 |
URL | https://arxiv.org/abs/1911.05449v1 |
https://arxiv.org/pdf/1911.05449v1.pdf | |
PWC | https://paperswithcode.com/paper/crowd-video-captioning |
Repo | |
Framework | |
Multilingual Universal Sentence Encoder for Semantic Retrieval
Title | Multilingual Universal Sentence Encoder for Semantic Retrieval |
Authors | Yinfei Yang, Daniel Cer, Amin Ahmad, Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil |
Abstract | We introduce two pre-trained retrieval-focused multilingual sentence encoding models, respectively based on the Transformer and CNN model architectures. The models embed text from 16 languages into a single semantic space using a multi-task trained dual-encoder that learns tied representations using translation-based bridge tasks (Chidambaram et al., 2018). The models provide performance that is competitive with the state-of-the-art on: semantic retrieval (SR), translation pair bitext retrieval (BR) and retrieval question answering (ReQA). On English transfer learning tasks, our sentence-level embeddings approach, and in some cases exceed, the performance of monolingual, English-only sentence embedding models. Our models are made available for download on TensorFlow Hub. |
Tasks | Question Answering, Sentence Embedding, Transfer Learning |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04307v1 |
https://arxiv.org/pdf/1907.04307v1.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-universal-sentence-encoder-for |
Repo | |
Framework | |
JarKA: Modeling Attribute Interactions for Cross-lingual Knowledge Alignment
Title | JarKA: Modeling Attribute Interactions for Cross-lingual Knowledge Alignment |
Authors | Bo Chen, Jing Zhang, Xiaobin Tang, Hong Chen, Cuiping Li |
Abstract | Cross-lingual knowledge alignment is the cornerstone in building a comprehensive knowledge graph (KG), which can benefit various knowledge-driven applications. As the structures of KGs are usually sparse, attributes of entities may play an important role in aligning the entities. However, the heterogeneity of the attributes across KGs prevents accurate embedding and comparison of entities. To deal with the issue, we propose to model the interactions between attributes, instead of globally embedding an entity with all the attributes. We further propose a joint framework to merge the alignments inferred from the attributes and the structures. Experimental results show that the proposed model outperforms the state-of-the-art baselines by up to 38.48% HitRatio@1. The results also demonstrate that our model can infer the alignments between attributes, relationships and values, in addition to entities. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13105v2 |
https://arxiv.org/pdf/1910.13105v2.pdf | |
PWC | https://paperswithcode.com/paper/rakaco-training-of-relationships-and |
Repo | |
Framework | |
SCANN: Synthesis of Compact and Accurate Neural Networks
Title | SCANN: Synthesis of Compact and Accurate Neural Networks |
Authors | Shayan Hassantabar, Zeyu Wang, Niraj K. Jha |
Abstract | Artificial neural networks (ANNs) have become the driving force behind recent artificial intelligence (AI) research. An important problem with implementing a neural network is the design of its architecture. Typically, such an architecture is obtained manually by exploring its hyperparameter space and kept fixed during training. This approach is both time-consuming and inefficient. Furthermore, modern neural networks often contain millions of parameters, whereas many applications require small inference models. Also, while ANNs have found great success in big-data applications, there is also significant interest in using ANNs for medium- and small-data applications that can be run on energy-constrained edge devices. To address these challenges, we propose a neural network synthesis methodology (SCANN) that can generate very compact neural networks without loss in accuracy for small and medium-size datasets. We also use dimensionality reduction methods to reduce the feature size of the datasets, so as to alleviate the curse of dimensionality. Our final synthesis methodology consists of three steps: dataset dimensionality reduction, neural network compression in each layer, and neural network compression with SCANN. We evaluate SCANN on the medium-size MNIST dataset by comparing our synthesized neural networks to the well-known LeNet-5 baseline. Without any loss in accuracy, SCANN generates a $46.3\times$ smaller network than the LeNet-5 Caffe model. We also evaluate the efficiency of using dimensionality reduction alongside SCANN on nine small to medium-size datasets. Using this methodology enables us to reduce the number of connections in the network by up to $5078.7\times$ (geometric mean: $82.1\times$), with little to no drop in accuracy. We also show that our synthesis methodology yields neural networks that are much better at navigating the accuracy vs. energy efficiency space. |
Tasks | Dimensionality Reduction, Neural Network Compression |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09090v1 |
http://arxiv.org/pdf/1904.09090v1.pdf | |
PWC | https://paperswithcode.com/paper/scann-synthesis-of-compact-and-accurate |
Repo | |
Framework | |
Robust Online Video Super-Resolution Using an Efficient Alternating Projections Scheme
Title | Robust Online Video Super-Resolution Using an Efficient Alternating Projections Scheme |
Authors | Ricardo Augusto Borsoi |
Abstract | Video super-resolution reconstruction (SRR) algorithms attempt to reconstruct high-resolution (HR) video sequences from low-resolution observations. Although recent progress in video SRR has significantly improved the quality of the reconstructed HR sequences, it remains challenging to design SRR algorithms that achieve good quality and robustness at a small computational complexity and are thus suitable for online applications. In this paper, we propose a new adaptive video SRR algorithm that achieves state-of-the-art performance at a very small computational cost. Using a nonlinear cost function constructed considering characteristics of typical innovation outliers in natural image sequences and an edge-preserving regularization strategy, we achieve state-of-the-art reconstructed image quality and robustness. This cost function is optimized using a specific alternating projections strategy over non-convex sets that is able to converge in very few iterations. An accurate and very efficient approximation for the projection operations is also obtained using tools from multidimensional multirate signal processing. This solves the slow convergence issue of stochastic gradient-based methods while keeping a small computational complexity. Simulation results with both synthetic and real image sequences show that the performance of the proposed algorithm is similar to or better than that of state-of-the-art SRR algorithms, while requiring only a small fraction of their computational cost. |
Tasks | Super-Resolution, Video Super-Resolution |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.00073v2 |
https://arxiv.org/pdf/1909.00073v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-online-video-super-resolution-using-an |
Repo | |
Framework | |
Doubly robust off-policy evaluation with shrinkage
Title | Doubly robust off-policy evaluation with shrinkage |
Authors | Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, Miroslav Dudík |
Abstract | We design a new family of estimators for off-policy evaluation in contextual bandits. Our estimators are based on the asymptotically optimal approach of doubly robust estimation, but they shrink importance weights to obtain a better bias-variance tradeoff in finite samples. Our approach adapts importance weights to the quality of a reward predictor, interpolating between doubly robust estimation and direct modeling. When the reward predictor is poor, we recover previously studied weight clipping, but when the reward predictor is good, we obtain a new form of shrinkage. To navigate between these regimes and tune the shrinkage coefficient, we design a model selection procedure, which we prove is never worse than the doubly robust estimator. Extensive experiments on bandit benchmark problems show that our estimators are highly adaptive and typically outperform state-of-the-art methods. |
Tasks | Model Selection, Multi-Armed Bandits |
Published | 2019-07-22 |
URL | https://arxiv.org/abs/1907.09623v1 |
https://arxiv.org/pdf/1907.09623v1.pdf | |
PWC | https://paperswithcode.com/paper/doubly-robust-off-policy-evaluation-with |
Repo | |
Framework | |
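The abstract above describes shrinking importance weights inside a doubly robust estimator. The following is a minimal sketch of that idea using weight clipping only, which the abstract names as the previously studied case. The new shrinkage form and the model selection procedure from the paper are not reproduced here. The function name `dr_clipped`, the tuple layout of the logged data, and the callables `target_policy` and `reward_model` are hypothetical. With `clip = 0` the estimate reduces to direct modeling, and with `clip = float("inf")` it is the standard doubly robust estimate.

```python
def dr_clipped(logged, target_policy, reward_model, actions, clip):
    """Doubly robust off-policy value estimate with clipped importance weights.

    logged        : list of (context, action, reward, logging_prob) tuples
    target_policy : target_policy(action, context) -> probability under the new policy
    reward_model  : reward_model(context, action) -> predicted reward
    actions       : iterable of all available actions
    clip          : upper bound on the importance weight (assumed tuning parameter)
    """
    total = 0.0
    for context, action, reward, logging_prob in logged:
        # Direct-modeling term: predicted reward averaged under the target policy.
        direct = sum(
            target_policy(b, context) * reward_model(context, b) for b in actions
        )
        # Importance weight of the logged action, clipped from above.
        weight = min(target_policy(action, context) / logging_prob, clip)
        # Correction term: weighted residual of the reward predictor.
        total += direct + weight * (reward - reward_model(context, action))
    return total / len(logged)
```

A small `clip` leans on the reward predictor and lowers variance at the price of bias. A large `clip` leans on the logged rewards and does the opposite.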
Artificial Intelligence in Intelligent Tutoring Robots: A Systematic Review and Design Guidelines
Title | Artificial Intelligence in Intelligent Tutoring Robots: A Systematic Review and Design Guidelines |
Authors | Jinyu Yang, Bo Zhang |
Abstract | This study provides a systematic review of recent advances in designing the intelligent tutoring robot (ITR), and summarises the status quo of applying artificial intelligence (AI) techniques. We first analyse the environment of the ITR and propose a relationship model for describing interactions of the ITR with the students, the social milieu and the curriculum. Then, we transform the relationship model into the perception-planning-action model to explore which AI techniques are suitable for the ITR. This article provides insights on promoting the human-robot teaching-learning process and AI-assisted educational techniques, illustrating the design guidelines and future research perspectives in intelligent tutoring robots. |
Tasks | |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1903.03414v1 |
http://arxiv.org/pdf/1903.03414v1.pdf | |
PWC | https://paperswithcode.com/paper/artificial-intelligence-in-intelligent |
Repo | |
Framework | |