Paper Group ANR 84
Model identification for ARMA time series through convolutional neural networks
Title | Model identification for ARMA time series through convolutional neural networks |
Authors | Wai Hoh Tang, Adrian Röllin |
Abstract | In this paper, we use convolutional neural networks to address the problem of model identification for autoregressive moving average time series models. We compare the performance of several neural network architectures, trained on simulated time series, with likelihood-based methods, in particular the Akaike and Bayesian information criteria. We find that our neural networks can significantly outperform these likelihood-based methods in terms of accuracy and, by orders of magnitude, in terms of speed. |
Tasks | Time Series |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04299v1 |
http://arxiv.org/pdf/1804.04299v1.pdf | |
PWC | https://paperswithcode.com/paper/model-identification-for-arma-time-series |
Repo | |
Framework | |
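The experiment the abstract describes is easy to prototype. Below is a minimal sketch, assuming statsmodels and PyTorch: ARMA series are simulated for a handful of candidate orders and a small 1-D CNN is trained to classify the (p, q) order. The candidate orders, coefficient ranges and architecture are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: CNN-based ARMA order identification on simulated series.
import numpy as np
import torch
import torch.nn as nn
from statsmodels.tsa.arima_process import arma_generate_sample

ORDERS = [(1, 0), (0, 1), (1, 1), (2, 1)]   # assumed candidate (p, q) classes
N, T = 500, 200                             # series per class, series length

def simulate(p, q, n, t):
    out = []
    for _ in range(n):
        # Small random coefficients keep the AR part stationary.
        ar = np.r_[1, -0.4 * np.random.uniform(-1, 1, p)]
        ma = np.r_[1, 0.4 * np.random.uniform(-1, 1, q)]
        out.append(arma_generate_sample(ar, ma, t))
    return np.stack(out)

X = np.concatenate([simulate(p, q, N, T) for p, q in ORDERS])
y = np.repeat(np.arange(len(ORDERS)), N)

model = nn.Sequential(                      # small 1-D CNN order classifier
    nn.Conv1d(1, 16, 7), nn.ReLU(), nn.AdaptiveAvgPool1d(8),
    nn.Flatten(), nn.Linear(16 * 8, len(ORDERS)))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xb = torch.tensor(X, dtype=torch.float32).unsqueeze(1)
yb = torch.tensor(y)
for _ in range(20):                         # a few full-batch epochs
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
```

The likelihood-based baseline would fit each candidate order by maximum likelihood and pick the order minimizing AIC or BIC; the CNN replaces that per-series model fitting with a single forward pass, which is where the speed advantage comes from.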
LNEMLC: Label Network Embeddings for Multi-Label Classification
Title | LNEMLC: Label Network Embeddings for Multi-Label Classification |
Authors | Piotr Szymański, Tomasz Kajdanowicz, Nitesh Chawla |
Abstract | Multi-label classification aims to classify instances with discrete non-exclusive labels. Most approaches to multi-label classification focus on effective adaptation or transformation of existing binary and multi-class learning approaches, but fail to model the joint probability of labels or do not preserve generalization abilities for unseen label combinations. To address these issues we propose a new multi-label classification scheme, LNEMLC - Label Network Embedding for Multi-Label Classification, which embeds the label network and uses it to extend the input space in learning and inference of any base multi-label classifier. The approach allows capturing the labels' joint probability at low computational complexity, providing results comparable to the best methods reported in the literature. We demonstrate that the method yields statistically significant improvements over a simple kNN baseline classifier. We also provide hints for selecting a robust configuration that works satisfactorily across data domains. |
Tasks | Multi-Label Classification, Network Embedding |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.02956v2 |
http://arxiv.org/pdf/1812.02956v2.pdf | |
PWC | https://paperswithcode.com/paper/lnemlc-label-network-embeddings-for-multi |
Repo | |
Framework | |
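A minimal sketch of the scheme's moving parts, assuming scikit-learn: build a label co-occurrence network, embed its nodes (truncated SVD stands in for the label network embedding, where the paper uses methods such as node2vec), learn to predict the embedding from the inputs, and hand the extended input space to any base multi-label classifier.

```python
# Hedged sketch of the LNEMLC pipeline on synthetic data.
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.multioutput import MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=500, n_classes=10, n_labels=3)

A = Y.T @ Y                                        # label co-occurrence network
E = TruncatedSVD(n_components=4).fit_transform(A)  # embed the label network

Z = Y @ E / np.maximum(Y.sum(1, keepdims=True), 1) # mean embedding per label set
reg = Ridge().fit(X, Z)                            # infer embeddings from inputs
Xe = np.hstack([X, reg.predict(X)])                # extended input space

base = MultiOutputClassifier(LogisticRegression(max_iter=1000))
base.fit(Xe, Y)                                    # any base classifier works here
```

At inference time the regressor supplies the embedding features, so unseen label combinations still map to a sensible point in the embedding space.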
Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound
Title | Unsupervised Image-to-Image Translation Using Domain-Specific Variational Information Bound |
Authors | Hadi Kazemi, Sobhan Soleymani, Fariborz Taherkhani, Seyed Mehdi Iranmanesh, Nasser M. Nasrabadi |
Abstract | Unsupervised image-to-image translation is a class of computer vision problems which aims at modeling the conditional distribution of images in the target domain, given a set of unpaired images in the source and target domains. An image in the source domain might have multiple representations in the target domain. Therefore, ambiguity arises in modeling the conditional distribution, especially when the images in the source and target domains come from different modalities. Current approaches mostly rely on simplifying assumptions to map both domains into a shared latent space. Consequently, they are only able to model the domain-invariant information between the two modalities. These approaches usually fail to model domain-specific information which has no representation in the target domain. In this work, we propose an unsupervised image-to-image translation framework which maximizes a domain-specific variational information bound and learns the target domain-invariant representation of the two domains. The proposed framework makes it possible to map a single source image into multiple images in the target domain, utilizing several target domain-specific codes sampled randomly from the prior distribution, or extracted from reference images. |
Tasks | Image-to-Image Translation, Unsupervised Image-To-Image Translation |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.11979v1 |
http://arxiv.org/pdf/1811.11979v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-image-to-image-translation-using |
Repo | |
Framework | |
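To make the one-to-many mapping concrete, here is a toy PyTorch sketch under assumed shapes: a content encoder extracts a domain-invariant code from the source image, and a decoder combines it with target domain-specific codes sampled from the prior, producing a different translation per sampled code. The modules are stand-ins, not the paper's architecture.

```python
# Hedged sketch: one source image, several target-domain translations.
import torch
import torch.nn as nn

content_enc = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())
decoder = nn.Sequential(nn.ConvTranspose2d(64 + 8, 32, 4, 2, 1), nn.ReLU(),
                        nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh())

x = torch.randn(1, 3, 64, 64)                 # a source-domain image
c = content_enc(x)                            # domain-invariant content code
outputs = []
for _ in range(3):                            # several domain-specific codes
    s = torch.randn(1, 8, 1, 1)               # sampled from the prior
    s = s.expand(-1, -1, c.shape[2], c.shape[3])
    outputs.append(decoder(torch.cat([c, s], dim=1)))
```

Swapping the sampled code for one produced by a reference-image encoder gives the reference-guided variant mentioned at the end of the abstract.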
Projection-Free Bandit Convex Optimization
Title | Projection-Free Bandit Convex Optimization |
Authors | Lin Chen, Mingrui Zhang, Amin Karbasi |
Abstract | In this paper, we propose the first computationally efficient projection-free algorithm for bandit convex optimization (BCO). We show that our algorithm achieves a sublinear regret of $O(nT^{4/5})$ (where $T$ is the horizon and $n$ is the dimension) for any bounded convex function with uniformly bounded gradients. We also evaluate the performance of our algorithm against baselines on both synthetic and real data sets for quadratic programming, portfolio selection and matrix completion problems. |
Tasks | Matrix Completion |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07474v2 |
http://arxiv.org/pdf/1805.07474v2.pdf | |
PWC | https://paperswithcode.com/paper/projection-free-bandit-convex-optimization |
Repo | |
Framework | |
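The two ingredients such an algorithm needs can be sketched in a few lines of numpy: a one-point gradient estimate built from the single observed function value, and a Frank-Wolfe style update that calls a linear-minimization oracle instead of computing a projection. The sketch below (over an $\ell_2$ ball, with assumed step sizes) illustrates the ingredients only, not the paper's exact algorithm or its $O(nT^{4/5})$ guarantee.

```python
# Hedged sketch: projection-free bandit convex optimization ingredients.
import numpy as np

n, T, delta, radius = 5, 1000, 0.1, 1.0
f = lambda v: np.sum((v - 0.3) ** 2)          # loss seen only via bandit feedback
x = np.zeros(n)

for t in range(1, T + 1):
    u = np.random.randn(n)
    u /= np.linalg.norm(u)                    # random unit direction
    g = (n / delta) * f(x + delta * u) * u    # one-point gradient estimate
    v = -radius * g / (np.linalg.norm(g) + 1e-12)  # linear minimizer over the ball
    x += (1.0 / t) * (v - x)                  # Frank-Wolfe step, no projection
```

Because each update is a convex combination of feasible points, the iterate stays feasible without ever projecting, which is what makes this approach cheap on sets where projection is expensive, such as the matrix completion experiments.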
A Recurrent Convolutional Neural Network Approach for Sensorless Force Estimation in Robotic Surgery
Title | A Recurrent Convolutional Neural Network Approach for Sensorless Force Estimation in Robotic Surgery |
Authors | Arturo Marban, Vignesh Srinivasan, Wojciech Samek, Josep Fernández, Alicia Casals |
Abstract | Providing force feedback as relevant information in current Robot-Assisted Minimally Invasive Surgery systems constitutes a technological challenge due to the constraints imposed by the surgical environment. In this context, Sensorless Force Estimation techniques represent a potential solution, making it possible to sense the interaction forces between the surgical instruments and soft tissues. Specifically, if visual feedback is available for observing the deformation of soft tissues, this feedback can be used to estimate the forces applied to these tissues. To this end, a force estimation model based on Convolutional Neural Networks and Long Short-Term Memory networks is proposed in this work. This model is designed to process both the spatiotemporal information present in video sequences and the temporal structure of tool data (the surgical tool-tip trajectory and its grasping status). A series of analyses are carried out to reveal the advantages of the proposal and the challenges that remain for real applications. This research work focuses on two surgical task scenarios, referred to as pushing and pulling tissue. For these two scenarios, different input data modalities and their effect on the force estimation quality are investigated. These input data modalities are tool data, video sequences, and a combination of both. The results suggest that the force estimation quality is better when both the tool data and video sequences are processed by the neural network model. Moreover, this study reveals the need for a loss function designed to promote the modeling of both smooth and sharp details found in force signals. Finally, the results show that modeling forces due to pulling tasks is more challenging than for the simpler pushing actions. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08545v1 |
http://arxiv.org/pdf/1805.08545v1.pdf | |
PWC | https://paperswithcode.com/paper/a-recurrent-convolutional-neural-network |
Repo | |
Framework | |
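A minimal sketch of such a model, with assumed tensor shapes and layer sizes: a CNN encodes each video frame, an LSTM fuses the frame features with the per-time-step tool data, and a linear head regresses the force signal.

```python
# Hedged sketch: CNN + LSTM force estimation from video and tool data.
import torch
import torch.nn as nn

class ForceNet(nn.Module):
    def __init__(self, tool_dim=4, feat=32):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv2d(3, 8, 5, 2), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(2), nn.Flatten(),
                                 nn.Linear(8 * 4, feat))
        self.lstm = nn.LSTM(feat + tool_dim, 64, batch_first=True)
        self.head = nn.Linear(64, 3)             # 3-axis force estimate

    def forward(self, video, tool):
        # video: (B, T, 3, H, W); tool: (B, T, tool_dim), e.g. tip pose + grasp
        B, T = video.shape[:2]
        z = self.cnn(video.flatten(0, 1)).view(B, T, -1)
        h, _ = self.lstm(torch.cat([z, tool], dim=-1))
        return self.head(h)                      # force at every time step

out = ForceNet()(torch.randn(2, 10, 3, 64, 64), torch.randn(2, 10, 4))
```

The loss-function finding in the abstract suggests training this with an objective that balances smooth trends against sharp force transients, for instance an L2 term combined with a derivative-sensitive term.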
Compressive Sensing with Low Precision Data Representation: Theory and Applications
Title | Compressive Sensing with Low Precision Data Representation: Theory and Applications |
Authors | Nezihe Merve Gürel, Kaan Kara, Alen Stojanov, Tyler Smith, Dan Alistarh, Markus Püschel, Ce Zhang |
Abstract | Modern scientific instruments produce vast amounts of data, which can overwhelm the processing ability of computer systems. Lossy compression of data is an intriguing solution, but comes with its own dangers, such as potential signal loss and the need for careful parameter optimization. In this work, we focus on a setting where this problem is especially acute (compressive sensing frameworks for radio astronomy) and ask: Can the precision of the data representation be lowered for all inputs, with both recovery guarantees and practical performance? Our first contribution is a theoretical analysis of the Iterative Hard Thresholding (IHT) algorithm when all input data, that is, the measurement matrix and the observation, are quantized aggressively to as little as 2 bits per value. Under reasonable constraints, we show that there exists a variant of low-precision IHT that can still provide recovery guarantees. The second contribution is an analysis of our general quantized framework tailored to radio astronomy, showing that its conditions are satisfied in this case. We evaluate our approach using CPU and FPGA implementations, and show that it can achieve up to 9.19x speedup with negligible loss of recovery quality on real telescope data. |
Tasks | Compressive Sensing |
Published | 2018-02-14 |
URL | http://arxiv.org/abs/1802.04907v2 |
http://arxiv.org/pdf/1802.04907v2.pdf | |
PWC | https://paperswithcode.com/paper/compressive-sensing-with-low-precision-data |
Repo | |
Framework | |
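The algorithm at the center of the analysis is compact enough to sketch in plain numpy. Below, both the measurement matrix and the observation pass through a crude uniform quantizer before IHT runs; the quantizer and step size are illustrative assumptions, not the paper's low-precision scheme.

```python
# Hedged sketch: iterative hard thresholding on aggressively quantized inputs.
import numpy as np

def quantize(M, bits=2):
    # Crude uniform symmetric quantizer over M's dynamic range.
    scale = np.max(np.abs(M))
    levels = 2 ** (bits - 1)
    return np.round(M / scale * levels) / levels * scale

def iht(A, y, k, iters=200):
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # conservative step size
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x + step * A.T @ (y - A @ x)       # gradient step
        x[np.argsort(np.abs(x))[:-k]] = 0.0    # keep only the k largest entries
    return x

m, n, k = 80, 200, 5
A = np.random.randn(m, n) / np.sqrt(m)
x_true = np.zeros(n); x_true[:k] = np.random.randn(k)
y = A @ x_true
x_hat = iht(quantize(A), quantize(y), k)       # recovery from low-precision data
```

Intuitively, the quantization error acts like bounded noise in the recovery analysis, which is how a variant of this loop can retain guarantees even at 2 bits per value.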
Building a Lemmatizer and a Spell-checker for Sorani Kurdish
Title | Building a Lemmatizer and a Spell-checker for Sorani Kurdish |
Authors | Shahin Salavati, Sina Ahmadi |
Abstract | This paper presents a lemmatization and a word-level error correction system for Sorani Kurdish. We propose a hybrid approach based on morphological rules and an n-gram language model. We have called our lemmatization and error correction systems Peyv and Rênûs respectively, which are, to the best of our knowledge, the first such tools presented for Sorani Kurdish. The Peyv lemmatizer has shown 86.7% accuracy. As for Rênûs, we have obtained 96.4% accuracy using a lexicon, while without a lexicon the correction system has 87% accuracy. As two fundamental text processing tools, they can pave the way for further research on natural language processing applications for Sorani Kurdish. |
Tasks | Language Modelling, Lemmatization |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10763v1 |
http://arxiv.org/pdf/1809.10763v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-lemmatizer-and-a-spell-checker-for |
Repo | |
Framework | |
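The hybrid recipe (lexicon filtering plus n-gram ranking) can be illustrated with a toy English example; the actual Rênûs system relies on Sorani Kurdish morphological rules and corpora, which this sketch does not reproduce.

```python
# Hedged toy sketch of lexicon-filtered, n-gram-ranked spelling correction.
from itertools import product

LEXICON = {"kurdish", "kurd", "cornish"}                 # toy lexicon
BIGRAMS = {("the", "kurdish"): 0.02, ("the", "cornish"): 0.001}  # toy LM

def edits1(word, alphabet="abcdefghijklmnopqrstuvwxyz"):
    # All candidates within one deletion, substitution or insertion.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    replaces = {a + c + b[1:] for (a, b), c in product(splits, alphabet) if b}
    inserts = {a + c + b for (a, b), c in product(splits, alphabet)}
    return deletes | replaces | inserts

def correct(prev_word, word):
    candidates = (edits1(word) & LEXICON) or {word}      # lexicon filter
    return max(candidates, key=lambda w: BIGRAMS.get((prev_word, w), 1e-9))

print(correct("the", "kurdich"))   # -> "kurdish"
```

With no lexicon available, candidate filtering has to fall back on the language model alone, which is consistent with the accuracy gap the abstract reports (96.4% with a lexicon versus 87% without).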
The Temple University Hospital Seizure Detection Corpus
Title | The Temple University Hospital Seizure Detection Corpus |
Authors | Vinit Shah, Eva von Weltin, Silvia Lopez, James Riley McHugh, Lily Veloso, Meysam Golmohammadi, Iyad Obeid, Joseph Picone |
Abstract | We introduce the TUH EEG Seizure Corpus (TUSZ), which is the largest open source corpus of its type, and represents an accurate characterization of clinical conditions. In this paper, we describe the techniques used to develop TUSZ, evaluate their effectiveness, and present some descriptive statistics on the resulting corpus. |
Tasks | EEG, Seizure Detection |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.08085v1 |
http://arxiv.org/pdf/1801.08085v1.pdf | |
PWC | https://paperswithcode.com/paper/the-temple-university-hospital-seizure |
Repo | |
Framework | |
Working Principles of Binary Differential Evolution
Title | Working Principles of Binary Differential Evolution |
Authors | Benjamin Doerr, Weijie Zheng |
Abstract | We conduct a first fundamental analysis of the working principles of binary differential evolution (BDE), an optimization heuristic for binary decision variables that was derived by Gong and Tuson (2007) from the very successful classic differential evolution (DE) for continuous optimization. We show that, unlike most other optimization paradigms, it is stable in the sense that neutral bit values are sampled with probability close to $1/2$ for a long time. This is generally a desirable property; however, it makes it harder to find the optima of decision variables with small influence on the objective function. This can result in an optimization time exponential in the dimension when optimizing simple symmetric functions like OneMax. On the positive side, BDE quickly detects and optimizes the most important decision variables. For example, dominant bits converge to the optimal value in time logarithmic in the population size. This enables BDE to optimize the most important bits very fast. Overall, our results indicate that BDE is an interesting optimization paradigm with characteristics significantly different from classic evolutionary algorithms or estimation-of-distribution algorithms (EDAs). On the technical side, we observe that the strong stochastic dependencies in the random experiment describing a run of BDE prevent us from proving all desired results with the mathematical rigor that was successfully used in the analysis of other evolutionary algorithms. Inspired by mean-field approaches in statistical physics, we propose a more independent variant of BDE, show experimentally its similarity to BDE, and prove some statements rigorously only for the independent variant. Such a semi-rigorous approach might be interesting for other problems in evolutionary computation where purely mathematical methods have failed so far. |
Tasks | |
Published | 2018-12-09 |
URL | http://arxiv.org/abs/1812.03513v1 |
http://arxiv.org/pdf/1812.03513v1.pdf | |
PWC | https://paperswithcode.com/paper/working-principles-of-binary-differential |
Repo | |
Framework | |
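To make the analyzed heuristic concrete, here is a numpy sketch of BDE on OneMax in the usual Gong-Tuson style: the binary "difference" of two parents flips the bits of a third parent where they disagree (with probability F), followed by binomial crossover and greedy selection. The parameter choices are illustrative assumptions.

```python
# Hedged sketch: binary differential evolution (BDE) on OneMax.
import numpy as np

def onemax(x):
    return int(x.sum())

def bde(n=50, pop=40, F=0.5, CR=0.3, gens=200, seed=0):
    rng = np.random.default_rng(seed)
    P = rng.integers(0, 2, size=(pop, n))
    for _ in range(gens):
        for i in range(pop):
            a, b, c = P[rng.choice(pop, 3, replace=False)]
            # Binary "difference": flip a's bits where b and c disagree, w.p. F.
            mutant = np.where((b != c) & (rng.random(n) < F), 1 - a, a)
            cross = rng.random(n) < CR
            cross[rng.integers(n)] = True      # inherit at least one mutant bit
            trial = np.where(cross, mutant, P[i])
            if onemax(trial) >= onemax(P[i]):  # greedy selection
                P[i] = trial
    return max(P, key=onemax)

best = bde()
```

Adding a neutral bit (one that never affects fitness) and tracking its frequency of ones illustrates the stability property from the abstract: the frequency stays near 1/2 for a long time instead of drifting to a boundary.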
Instantly Deployable Expert Knowledge - Networks of Knowledge Engines
Title | Instantly Deployable Expert Knowledge - Networks of Knowledge Engines |
Authors | Bernhard Bergmair, Thomas Buchegger, Johann Hoffelner, Gerald Schatz, Siegfried Silber, Johannes Klinglmayr |
Abstract | Knowledge and information are becoming the primary resources of the emerging information society. To exploit the potential of available expert knowledge, comprehension and application skills (i.e. expert competences) are necessary. The ability to acquire these skills is limited for any individual human. Consequently, the capacity to solve problems based on human knowledge in a manual (i.e. mental) way is strongly limited. We envision a new systemic approach to enable scalable knowledge deployment without expert competences. Eventually, the system is meant to instantly deploy humanity's total knowledge in full depth for every individual challenge. To this end, we propose a socio-technical framework that transforms expert knowledge into a solution creation system. Knowledge is represented by automated algorithms (knowledge engines). Executable compositions of knowledge engines (networks of knowledge engines) generate the requested individual information at runtime. We outline how these knowledge representations could raise legal, ethical and social challenges and nurture new business and remuneration models for knowledge. We identify major technological and economic concepts that are already pushing the boundaries of knowledge utilisation, e.g. in artificial intelligence, knowledge bases, ontologies, advanced search tools, automation of knowledge work, and the API economy. We indicate impacts on society, economy and labour. Existing developments are linked, including a specific use case in engineering design. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02964v1 |
http://arxiv.org/pdf/1811.02964v1.pdf | |
PWC | https://paperswithcode.com/paper/instantly-deployable-expert-knowledge |
Repo | |
Framework | |
Optical Illusions Images Dataset
Title | Optical Illusions Images Dataset |
Authors | Robert Max Williams, Roman V. Yampolskiy |
Abstract | Human vision is capable of performing many tasks it was not optimized for during its long evolution. Reading text and identifying artificial objects such as road signs are both tasks that mammalian brains never encountered in the wild but are very easy for us to perform. However, humans have discovered many very specific tricks that cause us to misjudge the color, size, alignment and movement of what we are looking at. A better understanding of these phenomena could reveal insights into how human perception achieves these feats. In this paper we present a dataset of 6725 illusion images gathered from two websites, and a smaller dataset of 500 hand-picked images. We discuss the process of collecting this data, models trained on it, and the work that needs to be done to make it of value to computer vision researchers. |
Tasks | |
Published | 2018-09-30 |
URL | http://arxiv.org/abs/1810.00415v2 |
http://arxiv.org/pdf/1810.00415v2.pdf | |
PWC | https://paperswithcode.com/paper/optical-illusions-images-dataset |
Repo | |
Framework | |
Structure-Aware Shape Synthesis
Title | Structure-Aware Shape Synthesis |
Authors | Elena Balashova, Vivek Singh, Jiangping Wang, Brian Teixeira, Terrence Chen, Thomas Funkhouser |
Abstract | We propose a new procedure to guide training of a data-driven shape generative model using a structure-aware loss function. Complex 3D shapes can often be summarized by a coarsely defined structure which is consistent and robust across a variety of observations. However, existing synthesis techniques do not account for structure during training, and thus often generate implausible and structurally unrealistic shapes. During training, we enforce structural constraints in order to promote consistency and structure across the entire manifold. We propose a novel methodology for training 3D generative models that incorporates structural information into an end-to-end training pipeline. |
Tasks | |
Published | 2018-08-04 |
URL | http://arxiv.org/abs/1808.01427v1 |
http://arxiv.org/pdf/1808.01427v1.pdf | |
PWC | https://paperswithcode.com/paper/structure-aware-shape-synthesis |
Repo | |
Framework | |
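The abstract's idea of penalizing structural violations during training can be sketched as an extra loss term. In this hedged toy version, a coarse occupancy grid over voxel blocks stands in for the structure representation; the paper's actual structure encoding is not reproduced here.

```python
# Hedged sketch: augmenting a reconstruction loss with a structure term.
import torch
import torch.nn.functional as F

def structure_summary(vox, parts=4):
    # Coarse structure proxy: average occupancy over a parts^3 block grid.
    return F.adaptive_avg_pool3d(vox, parts).flatten(1)

def structure_aware_loss(pred, target, lam=0.5):
    recon = F.binary_cross_entropy(pred, target)          # usual synthesis loss
    struct = F.mse_loss(structure_summary(pred),
                        structure_summary(target))        # structural constraint
    return recon + lam * struct

pred = torch.rand(2, 1, 32, 32, 32, requires_grad=True)  # generated occupancy
target = (torch.rand(2, 1, 32, 32, 32) > 0.5).float()
structure_aware_loss(pred, target).backward()
```

The weight lam trades off raw reconstruction fidelity against structural plausibility; in an end-to-end pipeline the gradient of the structure term flows back into the generative model.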
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection
Title | Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection |
Authors | Sung-Feng Huang, Yi-Chen Chen, Hung-yi Lee, Lin-shan Lee |
Abstract | Embedding audio signal segments into vectors with fixed dimensionality is attractive because all subsequent processing becomes easier and more efficient, for example modeling, classifying or indexing. Audio Word2Vec, proposed previously, was shown to be able to represent audio segments for spoken words as such vectors, carrying information about the phonetic structures of the signal segments. However, each linguistic unit (word, syllable, phoneme in text form) corresponds to an unlimited number of audio segments, with vector representations inevitably spread over the embedding space, which causes some confusion. It is therefore desirable to better cluster the audio embeddings such that those corresponding to the same linguistic unit are more compactly distributed. In this paper, inspired by Siamese networks, we propose approaches to achieve this goal. They include identifying positive and negative pairs from unlabeled data for Siamese-style training, disentangling acoustic factors such as speaker characteristics from the audio embedding, handling unbalanced data distributions, and having the embedding processes learn from the adjacency relationships among data points. All of this can be done in an unsupervised way. Improved performance was obtained in preliminary experiments on the LibriSpeech data set, including clustering characteristics analysis and applications to spoken term detection. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02775v1 |
http://arxiv.org/pdf/1811.02775v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-audio-embeddings-by-adjacency-based |
Repo | |
Framework | |
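A minimal sketch of the Siamese-style objective, with an assumed GRU encoder over MFCC-like features: embeddings of positive pairs are pulled together and negative pairs pushed at least a margin apart, which compacts the embeddings of a given linguistic unit. The pair mining, speaker disentanglement and unbalanced-data handling described in the abstract are omitted.

```python
# Hedged sketch: contrastive training of fixed-size audio segment embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.GRU(input_size=13, hidden_size=64, batch_first=True)

def embed(x):                                  # x: (B, T, 13), e.g. MFCC frames
    _, h = encoder(x)
    return h.squeeze(0)                        # (B, 64) fixed-size embedding

def contrastive(z1, z2, same, margin=1.0):
    d = F.pairwise_distance(z1, z2)
    return torch.mean(same * d ** 2 +
                      (1 - same) * torch.clamp(margin - d, min=0) ** 2)

x1, x2 = torch.randn(8, 50, 13), torch.randn(8, 50, 13)
same = torch.randint(0, 2, (8,)).float()       # 1 = assumed same linguistic unit
loss = contrastive(embed(x1), embed(x2), same)
loss.backward()
```

In the unsupervised setting the `same` labels are not given; the paper obtains positive and negative pairs from the data itself (e.g. via adjacency relationships), which is the part this sketch assumes away.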
Analysing object detectors from the perspective of co-occurring object categories
Title | Analysing object detectors from the perspective of co-occurring object categories |
Authors | Csaba Nemes, Sandor Jordan |
Abstract | The accuracy of the state-of-the-art Faster R-CNN and YOLO object detectors is evaluated and compared on a specially masked MS COCO dataset to measure how much their predictions rely on contextual information encoded at the object category level. Category-level representation of context is motivated by the fact that it could be an adequate way to transfer knowledge between visual and non-visual domains. According to our measurements, current detectors usually do not build a strong dependency on contextual information at the category level; however, when they do, they do it in a similar way, suggesting that the contextual dependence of object categories is an independent property that is relevant for transfer. |
Tasks | |
Published | 2018-09-21 |
URL | http://arxiv.org/abs/1809.08132v1 |
http://arxiv.org/pdf/1809.08132v1.pdf | |
PWC | https://paperswithcode.com/paper/analysing-object-detectors-from-the |
Repo | |
Framework | |
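The category-level context signal being probed can be computed directly from detection annotations. The sketch below uses a hypothetical annotation layout (not pycocotools) to count how often categories co-occur in the same image and to turn the counts into conditional co-occurrence rates.

```python
# Hedged sketch: category co-occurrence statistics from toy annotations.
import numpy as np

# image_id -> category ids present in that image (hypothetical layout)
annotations = {0: [1, 3], 1: [1, 1, 2], 2: [3], 3: [1, 2, 3]}
n_cats = 4

C = np.zeros((n_cats, n_cats), dtype=int)      # joint presence counts
img_count = np.zeros(n_cats, dtype=int)        # images containing category i
for cats in annotations.values():
    present = sorted(set(cats))
    for i in present:
        img_count[i] += 1
        for j in present:
            if i != j:
                C[i, j] += 1

# P[i, j] approximates P(category j present | category i present).
P = C / np.maximum(img_count[:, None], 1)
```

Masking out co-occurring objects and re-running a detector, as the study does, then measures how much of its accuracy rides on this kind of statistic rather than on the object's own appearance.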
Markerless Inside-Out Tracking for Interventional Applications
Title | Markerless Inside-Out Tracking for Interventional Applications |
Authors | Benjamin Busam, Patrick Ruhkamp, Salvatore Virga, Beatrice Lentes, Julia Rackerseder, Nassir Navab, Christoph Hennersperger |
Abstract | Tracking the rotation and translation of medical instruments plays a substantial role in many modern interventions. Traditional external optical tracking systems are often subject to line-of-sight issues, in particular when the region of interest is difficult to access or the procedure allows only for limited rigid-body markers. The introduction of inside-out tracking systems aims to overcome these issues. We propose a markerless tracking system based on visual SLAM to enable tracking of instruments in an interventional scenario. To achieve this goal, we mount a miniature multi-modal (monocular, stereo, active depth) vision system on the object of interest and relocalize its pose within an adaptive map of the operating room. We compare state-of-the-art algorithmic pipelines and apply the idea to transrectal 3D ultrasound (TRUS) compounding of the prostate. The obtained volumes are compared to reconstructions using a commercial optical tracking system as well as a robotic manipulator. Feature-based binocular SLAM is identified as the most promising method and is tested extensively in a challenging clinical environment, under severe occlusion and for the use case of prostate US biopsies. |
Tasks | |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.01708v3 |
http://arxiv.org/pdf/1804.01708v3.pdf | |
PWC | https://paperswithcode.com/paper/markerless-inside-out-tracking-for |
Repo | |
Framework | |