Paper Group ANR 84
Model identification for ARMA time series through convolutional neural networks
Title | Model identification for ARMA time series through convolutional neural networks |
Authors | Wai Hoh Tang, Adrian Röllin |
Abstract | In this paper, we use convolutional neural networks to address the problem of model identification for autoregressive moving average time series models. We compare the performance of several neural network architectures, trained on simulated time series, with likelihood-based methods, in particular the Akaike and Bayesian information criteria. We find that our neural networks can significantly outperform these likelihood-based methods in terms of accuracy and, by orders of magnitude, in terms of speed. |
Tasks | Time Series |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04299v1 |
http://arxiv.org/pdf/1804.04299v1.pdf | |
PWC | https://paperswithcode.com/paper/model-identification-for-arma-time-series |
Repo | |
Framework | |
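The experiment the abstract describes is easy to prototype. Below is a minimal sketch, assuming statsmodels and PyTorch: ARMA series are simulated for a handful of candidate orders and a small 1-D CNN is trained to classify the (p, q) order. The candidate orders, coefficient ranges and architecture are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: CNN-based ARMA order identification on simulated series.
import numpy as np
import torch
import torch.nn as nn
from statsmodels.tsa.arima_process import arma_generate_sample

ORDERS = [(1, 0), (0, 1), (1, 1), (2, 1)]   # assumed candidate (p, q) classes
N, T = 500, 200                             # series per class, series length

def simulate(p, q, n, t):
    out = []
    for _ in range(n):
        # Small random coefficients keep the AR part stationary.
        ar = np.r_[1, -0.4 * np.random.uniform(-1, 1, p)]
        ma = np.r_[1, 0.4 * np.random.uniform(-1, 1, q)]
        out.append(arma_generate_sample(ar, ma, t))
    return np.stack(out)

X = np.concatenate([simulate(p, q, N, T) for p, q in ORDERS])
y = np.repeat(np.arange(len(ORDERS)), N)

model = nn.Sequential(                      # small 1-D CNN order classifier
    nn.Conv1d(1, 16, 7), nn.ReLU(), nn.AdaptiveAvgPool1d(8),
    nn.Flatten(), nn.Linear(16 * 8, len(ORDERS)))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xb = torch.tensor(X, dtype=torch.float32).unsqueeze(1)
yb = torch.tensor(y)
for _ in range(20):                         # a few full-batch epochs
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
```

The likelihood-based baseline would fit each candidate order by maximum likelihood and pick the order minimizing AIC or BIC; the CNN replaces that per-series model fitting with a single forward pass, which is where the speed advantage comes from.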
LNEMLC: Label Network Embeddings for Multi-Label Classification
Title | LNEMLC: Label Network Embeddings for Multi-Label Classification |
Authors | Piotr Szymański, Tomasz Kajdanowicz, Nitesh Chawla |
Abstract | Multi-label classification aims to classify instances with discrete non-exclusive labels. Most approaches to multi-label classification focus on effective adaptation or transformation of existing binary and multi-class learning approaches, but fail to model the joint probability of labels or do not preserve generalization abilities for unseen label combinations. To address these issues we propose a new multi-label classification scheme, LNEMLC - Label Network Embedding for Multi-Label Classification, which embeds the label network and uses it to extend the input space in learning and inference of any base multi-label classifier. The approach allows capturing the labels' joint probability at low computational complexity, providing results comparable to the best methods reported in the literature. We demonstrate that the method yields statistically significant improvements over a simple kNN baseline classifier. We also provide hints for selecting a robust configuration that works satisfactorily across data domains. |
Tasks | Multi-Label Classification, Network Embedding |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.02956v2 |
http://arxiv.org/pdf/1812.02956v2.pdf | |
PWC | https://paperswithcode.com/paper/lnemlc-label-network-embeddings-for-multi |
Repo | |
Framework | |
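A minimal sketch of the scheme's moving parts, assuming scikit-learn: build a label co-occurrence network, embed its nodes (truncated SVD stands in for the label network embedding, where the paper uses methods such as node2vec), learn to predict the embedding from the inputs, and hand the extended input space to any base multi-label classifier.

```python
# Hedged sketch of the LNEMLC pipeline on synthetic data.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.multioutput import MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=500, n_classes=10, n_labels=3)

A = Y.T @ Y                                        # label co-occurrence network
E = TruncatedSVD(n_components=4).fit_transform(A)  # embed the label network

Z = Y @ E / np.maximum(Y.sum(1, keepdims=True), 1) # mean embedding per label set
reg = Ridge().fit(X, Z)                            # infer embeddings from inputs
Xe = np.hstack([X, reg.predict(X)])                # extended input space

base = MultiOutputClassifier(LogisticRegression(max_iter=1000))
base.fit(Xe, Y)                                    # any base classifier works here
```

At inference time the regressor supplies the embedding features, so unseen label combinations still map to a sensible point in the embedding space.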
Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound
Title | Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound |
Authors | Hadi Kazemi, Sobhan Soleymani, Fariborz Taherkhani, Seyed Mehdi Iranmanesh, Nasser M. Nasrabadi |
Abstract | Unsupervised image-to-image translation is a class of computer vision problems which aims at modeling the conditional distribution of images in the target domain, given a set of unpaired images in the source and target domains. An image in the source domain might have multiple representations in the target domain. Therefore, ambiguity arises in modeling the conditional distribution, especially when the images in the source and target domains come from different modalities. Current approaches mostly rely on simplifying assumptions to map both domains into a shared latent space. Consequently, they are only able to model the domain-invariant information between the two modalities. These approaches usually fail to model domain-specific information which has no representation in the target domain. In this work, we propose an unsupervised image-to-image translation framework which maximizes a domain-specific variational information bound and learns the target domain-invariant representation of the two domains. The proposed framework makes it possible to map a single source image into multiple images in the target domain, utilizing several target domain-specific codes sampled randomly from the prior distribution, or extracted from reference images. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.11979v1 |
http://arxiv.org/pdf/1811.11979v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-image-to-image-translation-using |
Repo | |
Framework | |
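To make the one-to-many mapping concrete, here is a toy PyTorch sketch under assumed shapes: a content encoder extracts a domain-invariant code from the source image, and a decoder combines it with target domain-specific codes sampled from the prior, producing a different translation per sampled code. The modules are stand-ins, not the paper's architecture.

```python
# Hedged sketch: one source image, several target-domain translations.
import torch
import torch.nn as nn

content_enc = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(64 + 8, 32, 4, 2, 1), nn.ReLU(),
                        nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())

x = torch.randn(1, 3, 64, 64)                 # a source-domain image
c = content_enc(x)                            # domain-invariant content code
outputs = []
for _ in range(3):                            # several domain-specific codes
    s = torch.randn(1, 8, 1, 1)               # sampled from the prior
    s = s.expand(-1, -1, c.shape[2], c.shape[3])
    outputs.append(decoder(torch.cat([c, s], dim=1)))
```

Swapping the sampled code for one produced by a reference-image encoder gives the reference-guided variant mentioned at the end of the abstract.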
Projection-Free Bandit Convex Optimization
Title | Projection-Free Bandit Convex Optimization |
Authors | Lin Chen, Mingrui Zhang, Amin Karbasi |
Abstract | In this paper, we propose the first computationally efficient projection-free algorithm for bandit convex optimization (BCO). We show that our algorithm achieves a sublinear regret of $O(nT^{4/5})$ (where $T$ is the horizon and $n$ is the dimension) for any bounded convex function with uniformly bounded gradients. We also evaluate the performance of our algorithm against baselines on both synthetic and real data sets for quadratic programming, portfolio selection and matrix completion problems. |
Tasks | Matrix Completion |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07474v2 |
http://arxiv.org/pdf/1805.07474v2.pdf | |
PWC | https://paperswithcode.com/paper/projection-free-bandit-convex-optimization |
Repo | |
Framework | |
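The two ingredients such an algorithm needs can be sketched in a few lines of numpy: a one-point gradient estimate built from the single observed function value, and a Frank-Wolfe style update that calls a linear-minimization oracle instead of computing a projection. The sketch below (over an $\ell_2$ ball, with assumed step sizes) illustrates the ingredients only, not the paper's exact algorithm or its $O(nT^{4/5})$ guarantee.

```python
# Hedged sketch: projection-free bandit convex optimization ingredients.
import numpy as np

n, T, delta, radius = 5, 1000, 0.1, 1.0
f = lambda v: np.sum((v - 0.3) ** 2)          # loss seen only via bandit feedback
x = np.zeros(n)

for t in range(1, T + 1):
    u = np.random.randn(n)
    u /= np.linalg.norm(u)                    # random unit direction
    g = (n / delta) * f(x + delta * u) * u    # one-point gradient estimate
    v = -radius * g / (np.linalg.norm(g) + 1e-12)  # linear minimizer over the ball
    x += (1.0 / t) * (v - x)                  # Frank-Wolfe step, no projection
```

Because each update is a convex combination of feasible points, the iterate stays feasible without ever projecting, which is what makes this approach cheap on sets where projection is expensive, such as the matrix completion experiments.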
A Recurrent Convolutional Neural Network Approach for Sensorless Force Estimation in Robotic Surgery
Title | A Recurrent Convolutional Neural Network Approach for Sensorless Force Estimation in Robotic Surgery |
Authors | Arturo Marban, Vignesh Srinivasan, Wojciech Samek, Josep Fernández, Alicia Casals |
Abstract | Providing force feedback as relevant information in current Robot-Assisted Minimally Invasive Surgery systems constitutes a technological challenge due to the constraints imposed by the surgical environment. In this context, Sensorless Force Estimation techniques represent a potential solution, making it possible to sense the interaction forces between the surgical instruments and soft tissues. Specifically, if visual feedback is available for observing the deformation of soft tissues, this feedback can be used to estimate the forces applied to these tissues. To this end, a force estimation model based on Convolutional Neural Networks and Long Short-Term Memory networks is proposed in this work. This model is designed to process both the spatiotemporal information present in video sequences and the temporal structure of tool data (the surgical tool-tip trajectory and its grasping status). A series of analyses are carried out to reveal the advantages of the proposal and the challenges that remain for real applications. This research work focuses on two surgical task scenarios, referred to as pushing and pulling tissue. For these two scenarios, different input data modalities and their effect on the force estimation quality are investigated. These input data modalities are tool data, video sequences, and a combination of both. The results suggest that the force estimation quality is better when both the tool data and video sequences are processed by the neural network model. Moreover, this study reveals the need for a loss function designed to promote the modeling of both smooth and sharp details found in force signals. Finally, the results show that modeling forces due to pulling tasks is more challenging than for the simpler pushing actions. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08545v1 |
http://arxiv.org/pdf/1805.08545v1.pdf | |
PWC | https://paperswithcode.com/paper/a-recurrent-convolutional-neural-network |
Repo | |
Framework | |
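A minimal sketch of such a model, with assumed tensor shapes and layer sizes: a CNN encodes each video frame, an LSTM fuses the frame features with the per-time-step tool data, and a linear head regresses the force signal.

```python
# Hedged sketch: CNN + LSTM force estimation from video and tool data.
import torch
import torch.nn as nn

class ForceNet(nn.Module):
    def __init__(self, tool_dim=4, feat=32):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(3, 8, 5, 2), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(2), nn.Flatten(),
                                 nn.Linear(8 * 4, feat))
        self.lstm = nn.LSTM(feat + tool_dim, 64, batch_first=True)
        self.head = nn.Linear(64, 3)             # 3-axis force estimate

    def forward(self, video, tool):
        # video: (B, T, 3, H, W); tool: (B, T, tool_dim), e.g. tip pose + grasp
        B, T = video.shape[:2]
        z = self.cnn(video.flatten(0, 1)).view(B, T, -1)
        h, _ = self.lstm(torch.cat([z, tool], dim=-1))
        return self.head(h)                      # force at every time step

out = ForceNet()(torch.randn(2, 10, 3, 64, 64), torch.randn(2, 10, 4))
```

The loss-function finding in the abstract suggests training this with an objective that balances smooth trends against sharp force transients, for instance an L2 term combined with a derivative-sensitive term.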
Compressive Sensing with Low Precision Data Representation: Theory and Applications
Title | Compressive Sensing with Low Precision Data Representation: Theory and Applications |
Authors | Nezihe Merve Gürel, Kaan Kara, Alen Stojanov, Tyler Smith, Dan Alistarh, Markus Püschel, Ce Zhang |
Abstract | Modern scientific instruments produce vast amounts of data, which can overwhelm the processing ability of computer systems. Lossy compression of data is an intriguing solution, but comes with its own dangers, such as potential signal loss and the need for careful parameter optimization. In this work, we focus on a setting where this problem is especially acute (compressive sensing frameworks for radio astronomy) and ask: Can the precision of the data representation be lowered for all inputs, with both recovery guarantees and practical performance? Our first contribution is a theoretical analysis of the Iterative Hard Thresholding (IHT) algorithm when all input data, that is, the measurement matrix and the observation, are quantized aggressively to as little as 2 bits per value. Under reasonable constraints, we show that there exists a variant of low-precision IHT that can still provide recovery guarantees. The second contribution is an analysis of our general quantized framework tailored to radio astronomy, showing that its conditions are satisfied in this case. We evaluate our approach using CPU and FPGA implementations, and show that it can achieve up to 9.19x speedup with negligible loss of recovery quality on real telescope data. |
Tasks | Compressive Sensing |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04907v2 |
http://arxiv.org/pdf/1802.04907v2.pdf | |
PWC | https://paperswithcode.com/paper/compressive-sensing-with-low-precision-data |
Repo | |
Framework | |
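The algorithm at the center of the analysis is compact enough to sketch in plain numpy. Below, both the measurement matrix and the observation pass through a crude uniform quantizer before IHT runs; the quantizer and step size are illustrative assumptions, not the paper's low-precision scheme.

```python
# Hedged sketch: iterative hard thresholding on aggressively quantized inputs.
import numpy as np

def quantize(M, bits=2):
    # Crude uniform symmetric quantizer over M's dynamic range.
    scale = np.max(np.abs(M))
    levels = 2 ** (bits - 1)
    return np.round(M / scale * levels) / levels * scale

def iht(A, y, k, iters=200):
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # conservative step size
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)       # gradient step
        x[np.argsort(np.abs(x))[:-k]] = 0.0    # keep only the k largest entries
    return x

m, n, k = 80, 200, 5
A = np.random.randn(m, n) / np.sqrt(m)
x_true = np.zeros(n); x_true[:k] = np.random.randn(k)
y = A @ x_true
x_hat = iht(quantize(A), quantize(y), k)       # recovery from low-precision data
```

Intuitively, the quantization error acts like bounded noise in the recovery analysis, which is how a variant of this loop can retain guarantees even at 2 bits per value.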
Building a Lemmatizer and a Spell-checker for Sorani Kurdish
Title | Building a Lemmatizer and a Spell-checker for Sorani Kurdish |
Authors | Shahin Salavati, Sina Ahmadi |
Abstract | This paper presents a lemmatization and a word-level error correction system for Sorani Kurdish. We propose a hybrid approach based on morphological rules and an n-gram language model. We have called our lemmatization and error correction systems Peyv and Rênûs respectively, which are, to the best of our knowledge, the first such tools presented for Sorani Kurdish. The Peyv lemmatizer has shown 86.7% accuracy. As for Rênûs, we have obtained 96.4% accuracy using a lexicon, while without a lexicon the correction system has 87% accuracy. As two fundamental text processing tools, they can pave the way for further research on natural language processing applications for Sorani Kurdish. |
Tasks | Language Modelling, Lemmatization |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10763v1 |
http://arxiv.org/pdf/1809.10763v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-lemmatizer-and-a-spell-checker-for |
Repo | |
Framework | |
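The hybrid recipe (lexicon filtering plus n-gram ranking) can be illustrated with a toy English example; the actual Rênûs system relies on Sorani Kurdish morphological rules and corpora, which this sketch does not reproduce.

```python
# Hedged toy sketch of lexicon-filtered, n-gram-ranked spelling correction.
from itertools import product

LEXICON = {"kurdish", "kurd", "cornish"}                 # toy lexicon
BIGRAMS = {("the", "kurdish"): 0.02, ("the", "cornish"): 0.001}  # toy LM

def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    # All candidates within one deletion, substitution or insertion.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    replaces = {a + c + b[1:] for (a, b), c in product(splits, alphabet) if b}
    inserts = {a + c + b for (a, b), c in product(splits, alphabet)}
    return deletes | replaces | inserts

def correct(prev_word, word):
    candidates = (edits1(word) & LEXICON) or {word}      # lexicon filter
    return max(candidates, key=lambda w: BIGRAMS.get((prev_word, w), 1e-9))

print(correct("the", "kurdich"))   # -> "kurdish"
```

With no lexicon available, candidate filtering has to fall back on the language model alone, which is consistent with the accuracy gap the abstract reports (96.4% with a lexicon versus 87% without).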
The Temple University Hospital Seizure Detection Corpus
Title | The Temple University Hospital Seizure Detection Corpus |
Authors | Vinit Shah, Eva von Weltin, Silvia Lopez, James Riley McHugh, Lily Veloso, Meysam Golmohammadi, Iyad Obeid, Joseph Picone |
Abstract | We introduce the TUH EEG Seizure Corpus (TUSZ), which is the largest open source corpus of its type, and represents an accurate characterization of clinical conditions. In this paper, we describe the techniques used to develop TUSZ, evaluate their effectiveness, and present some descriptive statistics on the resulting corpus. |
Tasks | EEG, Seizure Detection |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.08085v1 |
http://arxiv.org/pdf/1801.08085v1.pdf | |
PWC | https://paperswithcode.com/paper/the-temple-university-hospital-seizure |
Repo | |
Framework | |
Working Principles of Binary Differential Evolution
Title | Working Principles of Binary Differential Evolution |
Authors | Benjamin Doerr, Weijie Zheng |
Abstract | We conduct a first fundamental analysis of the working principles of binary differential evolution (BDE), an optimization heuristic for binary decision variables that was derived by Gong and Tuson (2007) from the very successful classic differential evolution (DE) for continuous optimization. We show that, unlike most other optimization paradigms, it is stable in the sense that neutral bit values are sampled with probability close to $1/2$ for a long time. This is generally a desirable property; however, it makes it harder to find the optima of decision variables with small influence on the objective function. This can result in an optimization time exponential in the dimension when optimizing simple symmetric functions like OneMax. On the positive side, BDE quickly detects and optimizes the most important decision variables. For example, dominant bits converge to the optimal value in time logarithmic in the population size. This enables BDE to optimize the most important bits very fast. Overall, our results indicate that BDE is an interesting optimization paradigm with characteristics significantly different from classic evolutionary algorithms or estimation-of-distribution algorithms (EDAs). On the technical side, we observe that the strong stochastic dependencies in the random experiment describing a run of BDE prevent us from proving all desired results with the mathematical rigor that was successfully used in the analysis of other evolutionary algorithms. Inspired by mean-field approaches in statistical physics, we propose a more independent variant of BDE, show experimentally its similarity to BDE, and prove some statements rigorously only for the independent variant. Such a semi-rigorous approach might be interesting for other problems in evolutionary computation where purely mathematical methods have failed so far. |
Tasks | |
Published | 2018-12-09 |
URL | http://arxiv.org/abs/1812.03513v1 |
http://arxiv.org/pdf/1812.03513v1.pdf | |
PWC | https://paperswithcode.com/paper/working-principles-of-binary-differential |
Repo | |
Framework | |
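To make the analyzed heuristic concrete, here is a numpy sketch of BDE on OneMax in the usual Gong-Tuson style: the binary "difference" of two parents flips the bits of a third parent where they disagree (with probability F), followed by binomial crossover and greedy selection. The parameter choices are illustrative assumptions.

```python
# Hedged sketch: binary differential evolution (BDE) on OneMax.
import numpy as np

def onemax(x):
    return int(x.sum())

def bde(n=50, pop=40, F=0.5, CR=0.3, gens=200, seed=0):
    rng = np.random.default_rng(seed)
    P = rng.integers(0, 2, size=(pop, n))
    for _ in range(gens):
        for i in range(pop):
            a, b, c = P[rng.choice(pop, 3, replace=False)]
            # Binary "difference": flip a's bits where b and c disagree, w.p. F.
            mutant = np.where((b != c) & (rng.random(n) < F), 1 - a, a)
            cross = rng.random(n) < CR
            cross[rng.integers(n)] = True      # inherit at least one mutant bit
            trial = np.where(cross, mutant, P[i])
            if onemax(trial) >= onemax(P[i]):  # greedy selection
                P[i] = trial
    return max(P, key=onemax)

best = bde()
```

Adding a neutral bit (one that never affects fitness) and tracking its frequency of ones illustrates the stability property from the abstract: the frequency stays near 1/2 for a long time instead of drifting to a boundary.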
Instantly Deployable Expert Knowledge - Networks of Knowledge Engines
Title | Instantly Deployable Expert Knowledge - Networks of Knowledge Engines |
Authors | Bernhard Bergmair, Thomas Buchegger, Johann Hoffelner, Gerald Schatz, Siegfried Silber, Johannes Klinglmayr |
Abstract | Knowledge and information are becoming the primary resources of the emerging information society. To exploit the potential of available expert knowledge, comprehension and application skills (i.e. expert competences) are necessary. The ability to acquire these skills is limited for any individual human. Consequently, the capacity to solve problems based on human knowledge in a manual (i.e. mental) way is strongly limited. We envision a new systemic approach to enable scalable knowledge deployment without expert competences. Eventually, the system is meant to instantly deploy humanity's total knowledge in full depth for every individual challenge. To this end, we propose a socio-technical framework that transforms expert knowledge into a solution creation system. Knowledge is represented by automated algorithms (knowledge engines). Executable compositions of knowledge engines (networks of knowledge engines) generate the requested individual information at runtime. We outline how these knowledge representations could raise legal, ethical and social challenges and nurture new business and remuneration models for knowledge. We identify major technological and economic concepts that are already pushing the boundaries of knowledge utilisation, e.g. in artificial intelligence, knowledge bases, ontologies, advanced search tools, automation of knowledge work, and the API economy. We indicate impacts on society, economy and labour. Existing developments are linked, including a specific use case in engineering design. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02964v1 |
http://arxiv.org/pdf/1811.02964v1.pdf | |
PWC | https://paperswithcode.com/paper/instantly-deployable-expert-knowledge |
Repo | |
Framework | |
Optical Illusions Images Dataset
Title | Optical Illusions Images Dataset |
Authors | Robert Max Williams, Roman V. Yampolskiy |
Abstract | Human vision is capable of performing many tasks it was not optimized for during its long evolution. Reading text and identifying artificial objects such as road signs are both tasks that mammalian brains never encountered in the wild but are very easy for us to perform. However, humans have discovered many very specific tricks that cause us to misjudge the color, size, alignment and movement of what we are looking at. A better understanding of these phenomena could reveal insights into how human perception achieves these feats. In this paper we present a dataset of 6725 illusion images gathered from two websites, and a smaller dataset of 500 hand-picked images. We discuss the process of collecting this data, models trained on it, and the work that needs to be done to make it of value to computer vision researchers. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00415v2 |
http://arxiv.org/pdf/1810.00415v2.pdf | |
PWC | https://paperswithcode.com/paper/optical-illusions-images-dataset |
Repo | |
Framework | |
Structure-Aware Shape Synthesis
Title | Structure-Aware Shape Synthesis |
Authors | Elena Balashova, Vivek Singh, Jiangping Wang, Brian Teixeira, Terrence Chen, Thomas Funkhouser |
Abstract | We propose a new procedure to guide training of a data-driven shape generative model using a structure-aware loss function. Complex 3D shapes can often be summarized by a coarsely defined structure which is consistent and robust across a variety of observations. However, existing synthesis techniques do not account for structure during training, and thus often generate implausible and structurally unrealistic shapes. During training, we enforce structural constraints in order to promote consistency and structure across the entire manifold. We propose a novel methodology for training 3D generative models that incorporates structural information into an end-to-end training pipeline. |
Tasks | |
Published | 2018-08-04 |
URL | http://arxiv.org/abs/1808.01427v1 |
http://arxiv.org/pdf/1808.01427v1.pdf | |
PWC | https://paperswithcode.com/paper/structure-aware-shape-synthesis |
Repo | |
Framework | |
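The abstract's idea of penalizing structural violations during training can be sketched as an extra loss term. In this hedged toy version, a coarse occupancy grid over voxel blocks stands in for the structure representation; the paper's actual structure encoding is not reproduced here.

```python
# Hedged sketch: augmenting a reconstruction loss with a structure term.
import torch
import torch.nn.functional as F

def structure_summary(vox, parts=4):
    # Coarse structure proxy: average occupancy over a parts^3 block grid.
    return F.adaptive_avg_pool3d(vox, parts).flatten(1)

def structure_aware_loss(pred, target, lam=0.5):
    recon = F.binary_cross_entropy(pred, target)          # usual synthesis loss
    struct = F.mse_loss(structure_summary(pred),
                        structure_summary(target))        # structural constraint
    return recon + lam * struct

pred = torch.rand(2, 1, 32, 32, 32, requires_grad=True)  # generated occupancy
target = (torch.rand(2, 1, 32, 32, 32) > 0.5).float()
structure_aware_loss(pred, target).backward()
```

The weight lam trades off raw reconstruction fidelity against structural plausibility; in an end-to-end pipeline the gradient of the structure term flows back into the generative model.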
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection
Title | Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection |
Authors | Sung-Feng Huang, Yi-Chen Chen, Hung-yi Lee, Lin-shan Lee |
Abstract | Embedding audio signal segments into vectors with fixed dimensionality is attractive because all subsequent processing becomes easier and more efficient, for example modeling, classifying or indexing. Audio Word2Vec, proposed previously, was shown to be able to represent audio segments for spoken words as such vectors, carrying information about the phonetic structures of the signal segments. However, each linguistic unit (word, syllable, phoneme in text form) corresponds to an unlimited number of audio segments, with vector representations inevitably spread over the embedding space, which causes some confusion. It is therefore desirable to better cluster the audio embeddings such that those corresponding to the same linguistic unit are more compactly distributed. In this paper, inspired by Siamese networks, we propose approaches to achieve this goal. They include identifying positive and negative pairs from unlabeled data for Siamese-style training, disentangling acoustic factors such as speaker characteristics from the audio embedding, handling unbalanced data distributions, and having the embedding processes learn from the adjacency relationships among data points. All of this can be done in an unsupervised way. Improved performance was obtained in preliminary experiments on the LibriSpeech data set, including clustering characteristics analysis and applications to spoken term detection. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02775v1 |
http://arxiv.org/pdf/1811.02775v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-audio-embeddings-by-adjacency-based |
Repo | |
Framework | |
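A minimal sketch of the Siamese-style objective, with an assumed GRU encoder over MFCC-like features: embeddings of positive pairs are pulled together and negative pairs pushed at least a margin apart, which compacts the embeddings of a given linguistic unit. The pair mining, speaker disentanglement and unbalanced-data handling described in the abstract are omitted.

```python
# Hedged sketch: contrastive training of fixed-size audio segment embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.GRU(input_size=13, hidden_size=64, batch_first=True)

def embed(x):                                  # x: (B, T, 13), e.g. MFCC frames
    _, h = encoder(x)
    return h.squeeze(0)                        # (B, 64) fixed-size embedding

def contrastive(z1, z2, same, margin=1.0):
    d = F.pairwise_distance(z1, z2)
    return torch.mean(same * d ** 2 +
                      (1 - same) * torch.clamp(margin - d, min=0) ** 2)

x1, x2 = torch.randn(8, 50, 13), torch.randn(8, 50, 13)
same = torch.randint(0, 2, (8,)).float()       # 1 = assumed same linguistic unit
loss = contrastive(embed(x1), embed(x2), same)
loss.backward()
```

In the unsupervised setting the `same` labels are not given; the paper obtains positive and negative pairs from the data itself (e.g. via adjacency relationships), which is the part this sketch assumes away.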
Analysing object detectors from the perspective of co-occurring object categories
Title | Analysing object detectors from the perspective of co-occurring object categories |
Authors | Csaba Nemes, Sandor Jordan |
Abstract | The accuracy of the state-of-the-art Faster R-CNN and YOLO object detectors is evaluated and compared on a specially masked MS COCO dataset to measure how much their predictions rely on contextual information encoded at the object category level. Category-level representation of context is motivated by the fact that it could be an adequate way to transfer knowledge between visual and non-visual domains. According to our measurements, current detectors usually do not build a strong dependency on contextual information at the category level; however, when they do, they do it in a similar way, suggesting that the contextual dependence of object categories is an independent property that is relevant for transfer. |
Tasks | |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.08132v1 |
http://arxiv.org/pdf/1809.08132v1.pdf | |
PWC | https://paperswithcode.com/paper/analysing-object-detectors-from-the |
Repo | |
Framework | |
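The category-level context signal being probed can be computed directly from detection annotations. The sketch below uses a hypothetical annotation layout (not pycocotools) to count how often categories co-occur in the same image and to turn the counts into conditional co-occurrence rates.

```python
# Hedged sketch: category co-occurrence statistics from toy annotations.
import numpy as np

# image_id -> category ids present in that image (hypothetical layout)
annotations = {0: [1, 3], 1: [1, 1, 2], 2: [3], 3: [1, 2, 3]}
n_cats = 4

C = np.zeros((n_cats, n_cats), dtype=int)      # joint presence counts
img_count = np.zeros(n_cats, dtype=int)        # images containing category i
for cats in annotations.values():
    present = sorted(set(cats))
    for i in present:
        img_count[i] += 1
        for j in present:
            if i != j:
                C[i, j] += 1

# P[i, j] approximates P(category j present | category i present).
P = C / np.maximum(img_count[:, None], 1)
```

Masking out co-occurring objects and re-running a detector, as the study does, then measures how much of its accuracy rides on this kind of statistic rather than on the object's own appearance.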
Markerless Inside-Out Tracking for Interventional Applications
Title | Markerless Inside-Out Tracking for Interventional Applications |
Authors | Benjamin Busam, Patrick Ruhkamp, Salvatore Virga, Beatrice Lentes, Julia Rackerseder, Nassir Navab, Christoph Hennersperger |
Abstract | Tracking the rotation and translation of medical instruments plays a substantial role in many modern interventions. Traditional external optical tracking systems are often subject to line-of-sight issues, in particular when the region of interest is difficult to access or the procedure allows only for limited rigid-body markers. The introduction of inside-out tracking systems aims to overcome these issues. We propose a markerless tracking system based on visual SLAM to enable tracking of instruments in an interventional scenario. To achieve this goal, we mount a miniature multi-modal (monocular, stereo, active depth) vision system on the object of interest and relocalize its pose within an adaptive map of the operating room. We compare state-of-the-art algorithmic pipelines and apply the idea to transrectal 3D ultrasound (TRUS) compounding of the prostate. The obtained volumes are compared to reconstructions using a commercial optical tracking system as well as a robotic manipulator. Feature-based binocular SLAM is identified as the most promising method and is tested extensively in a challenging clinical environment, under severe occlusion and for the use case of prostate US biopsies. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01708v3 |
http://arxiv.org/pdf/1804.01708v3.pdf | |
PWC | https://paperswithcode.com/paper/markerless-inside-out-tracking-for |
Repo | |
Framework | |