Paper Group ANR 782
CNN inference acceleration using dictionary of centroids
Title | CNN inference acceleration using dictionary of centroids |
Authors | D. Babin, I. Mazurenko, D. Parkhomenko, A. Voloshko |
Abstract | It is well known that multiplication operations in convolutional layers of common CNNs consume a lot of time during the inference stage. In this article we present a flexible method to decrease both the computational complexity of convolutional layers at inference time and the amount of space needed to store them. The method is based on centroid filter quantization and outperforms approaches based on tensor decomposition by a large margin. We performed a comparative analysis of the proposed method and a series of CP tensor decompositions on the ImageNet benchmark and found that our method provides an almost 2.9 times better computational gain. Despite its simplicity, the method cannot be applied directly at the inference stage in modern frameworks, but it could be useful in cases where the calculation flow can be changed, e.g. for CNN-chip designers. |
Tasks | Quantization |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08612v1 |
http://arxiv.org/pdf/1810.08612v1.pdf | |
PWC | https://paperswithcode.com/paper/cnn-inference-acceleration-using-dictionary |
Repo | |
Framework | |
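The centroid idea above can be made concrete with a small sketch: cluster a filter's weights into a dictionary of K centroids, accumulate inputs per centroid using additions only, and finish with K multiplications instead of one per weight. This is a minimal illustration under assumed shapes and an assumed K, not the paper's implementation.

```python
# Minimal sketch (not the paper's exact algorithm): quantize a filter's weights
# to a small dictionary of centroids, then replace per-weight multiplications
# with one multiplication per centroid.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3, 64)).ravel()          # a single conv filter, flattened
x = rng.normal(size=w.shape)                     # the input patch it is applied to

K = 16                                           # dictionary size (assumed hyperparameter)
km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(w.reshape(-1, 1))
centroids = km.cluster_centers_.ravel()          # the dictionary
labels = km.labels_                              # per-weight centroid index

# Exact dot product: len(w) multiplications.
ref = float(x @ w)

# Quantized dot product: accumulate inputs per centroid (additions only),
# then do K multiplications with the dictionary entries.
sums = np.zeros(K)
np.add.at(sums, labels, x)
approx = float(sums @ centroids)

print(ref, approx)                               # approx ≈ ref, with K ≪ len(w) multiplications
```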
Confidence Modeling for Neural Semantic Parsing
Title | Confidence Modeling for Neural Semantic Parsing |
Authors | Li Dong, Chris Quirk, Mirella Lapata |
Abstract | In this work we focus on confidence modeling for neural semantic parsers which are built upon sequence-to-sequence models. We outline three major causes of uncertainty, and design various metrics to quantify these factors. These metrics are then used to estimate confidence scores that indicate whether model predictions are likely to be correct. Beyond confidence estimation, we identify which parts of the input contribute to uncertain predictions allowing users to interpret their model, and verify or refine its input. Experimental results show that our confidence model significantly outperforms a widely used method that relies on posterior probability, and improves the quality of interpretation compared to simply relying on attention scores. |
Tasks | Semantic Parsing |
Published | 2018-05-11 |
URL | http://arxiv.org/abs/1805.04604v1 |
http://arxiv.org/pdf/1805.04604v1.pdf | |
PWC | https://paperswithcode.com/paper/confidence-modeling-for-neural-semantic |
Repo | |
Framework | |
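One generic uncertainty signal for models of this kind is to perturb the network (e.g. by keeping dropout active) and measure how much its predictions move. The sketch below shows such a proxy in a hedged, generic form; `model` is a hypothetical seq2seq scorer, and the paper's actual metrics and their combination are not reproduced here.

```python
# A hedged sketch of one uncertainty proxy (dropout perturbation): run the model
# several times with dropout active and measure how much the score of the
# predicted parse varies. `model` and `inputs` are hypothetical placeholders.
import torch

def dropout_confidence(model, inputs, n_samples=10):
    model.train()                      # keep dropout layers active
    with torch.no_grad():
        scores = torch.stack([model(inputs) for _ in range(n_samples)])
    model.eval()
    # Lower variance across perturbed passes -> higher confidence.
    return 1.0 / (1.0 + scores.var(dim=0).mean().item())
```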
Chord Recognition in Symbolic Music: A Segmental CRF Model, Segment-Level Features, and Comparative Evaluations on Classical and Popular Music
Title | Chord Recognition in Symbolic Music: A Segmental CRF Model, Segment-Level Features, and Comparative Evaluations on Classical and Popular Music |
Authors | Kristen Masada, Razvan Bunescu |
Abstract | We present a new approach to harmonic analysis that is trained to segment music into a sequence of chord spans tagged with chord labels. Formulated as a semi-Markov Conditional Random Field (semi-CRF), this joint segmentation and labeling approach enables the use of a rich set of segment-level features, such as segment purity and chord coverage, that capture the extent to which the events in an entire segment of music are compatible with a candidate chord label. The new chord recognition model is evaluated extensively on three corpora of classical music and a newly created corpus of rock music. Experimental results show that the semi-CRF model performs substantially better than previous approaches when trained on a sufficient number of labeled examples and remains competitive when the amount of training data is limited. |
Tasks | Chord Recognition |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.10002v2 |
http://arxiv.org/pdf/1810.10002v2.pdf | |
PWC | https://paperswithcode.com/paper/chord-recognition-in-symbolic-music-a |
Repo | |
Framework | |
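The joint segmentation-and-labeling search behind a semi-CRF can be illustrated with a segmental Viterbi dynamic program that chooses both segment boundaries and chord labels; the `score` function below is a placeholder for segment-level features such as purity and coverage, not the paper's trained scoring model.

```python
# A minimal semi-Markov (segmental) Viterbi sketch: jointly choose segment
# boundaries and chord labels by maximizing a sum of segment-level scores.
def segmental_viterbi(events, labels, score, max_len=8):
    n = len(events)
    best = [0.0] + [float("-inf")] * n          # best[i]: best score for events[:i]
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            for y in labels:
                s = best[i] + score(events, i, j, y)
                if s > best[j]:
                    best[j], back[j] = s, (i, y)
    # Recover the labeled segmentation.
    segs, j = [], n
    while j > 0:
        i, y = back[j]
        segs.append((i, j, y))
        j = i
    return list(reversed(segs))
```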
Using AI to Design Stone Jewelry
Title | Using AI to Design Stone Jewelry |
Authors | Khyatti Gupta, Sonam Damani, Kedhar Nath Narahari |
Abstract | Jewelry has been an integral part of human culture for ages. One of the most popular styles of jewelry is created by putting together precious and semi-precious stones in diverse patterns. While technology is finding its way into the production process of such jewelry, designing it remains a time-consuming and involved task. In this paper, we propose a unique approach using optimization methods coupled with machine learning techniques to generate novel stone jewelry designs at scale. Our evaluation shows that designs generated by our approach are highly likeable and visually appealing. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08759v1 |
http://arxiv.org/pdf/1811.08759v1.pdf | |
PWC | https://paperswithcode.com/paper/using-ai-to-design-stone-jewelry |
Repo | |
Framework | |
Unsupervised Learning of Style-sensitive Word Vectors
Title | Unsupervised Learning of Style-sensitive Word Vectors |
Authors | Reina Akama, Kento Watanabe, Sho Yokoi, Sosuke Kobayashi, Kentaro Inui |
Abstract | This paper presents the first study aimed at capturing stylistic similarity between words in an unsupervised manner. We propose extending the continuous bag of words (CBOW) model (Mikolov et al., 2013) to learn style-sensitive word vectors using a wider context window under the assumption that the style of all the words in an utterance is consistent. In addition, we introduce a novel task to predict lexical stylistic similarity and to create a benchmark dataset for this task. Our experiment with this dataset supports our assumption and demonstrates that the proposed extensions contribute to the acquisition of style-sensitive word embeddings. |
Tasks | Word Embeddings |
Published | 2018-05-15 |
URL | http://arxiv.org/abs/1805.05581v1 |
http://arxiv.org/pdf/1805.05581v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-style-sensitive-word |
Repo | |
Framework | |
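The wider-context-window assumption can be probed with off-the-shelf CBOW training; the sketch below only contrasts a narrow and a wide window and does not implement the paper's split style/syntax vectors. The corpus path and the probe word are assumptions.

```python
# A rough illustration of the wider-context intuition: train plain CBOW with a
# narrow window and with an utterance-wide window and compare nearest neighbours.
from gensim.models import Word2Vec

sentences = [line.split() for line in open("corpus.txt", encoding="utf-8")]  # assumed corpus file

narrow = Word2Vec(sentences, vector_size=100, window=2, sg=0, min_count=5)
wide = Word2Vec(sentences, vector_size=100, window=20, sg=0, min_count=5)    # ~utterance-level context

word = "hello"  # any word present in the corpus
print(narrow.wv.most_similar(word, topn=5))   # tends to reflect syntax/topic
print(wide.wv.most_similar(word, topn=5))     # wider context mixes in more stylistic signal
```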
Indexing Execution Patterns in Workflow Provenance Graphs through Generalized Trie Structures
Title | Indexing Execution Patterns in Workflow Provenance Graphs through Generalized Trie Structures |
Authors | Esteban García-Cuesta, José M. Gómez-Pérez |
Abstract | Over the last years, scientific workflows have become mature enough to be used in a production style. However, despite the increasing maturity, there is still a shortage of tools for searching, adapting, and reusing workflows that hinders a more generalized adoption by the scientific communities. Indeed, due to the limited availability of machine-readable scientific metadata and the heterogeneity of workflow specification formats and representations, new ways to leverage alternative sources of information that complement existing approaches are needed. In this paper we address such limitations by applying statistically enriched generalized trie structures to exploit workflow execution provenance information in order to assist the analysis, indexing and search of scientific workflows. Our method bridges the gap between the description of what a workflow is supposed to do according to its specification and related metadata and what it actually does as recorded in its provenance execution trace. In doing so, we also prove that the proposed method outperforms SPARQL 1.1 Property Paths for querying provenance graphs. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07346v1 |
http://arxiv.org/pdf/1807.07346v1.pdf | |
PWC | https://paperswithcode.com/paper/indexing-execution-patterns-in-workflow |
Repo | |
Framework | |
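The indexing idea can be illustrated with a plain counted trie over execution traces; the paper's generalized, statistically enriched tries carry richer information, so treat this as a minimal data-structure sketch.

```python
# A minimal trie sketch for indexing execution traces as sequences of module
# names, with per-node counts so frequent execution patterns can be looked up.
from collections import defaultdict

class TrieNode:
    def __init__(self):
        self.children = defaultdict(TrieNode)
        self.count = 0

class ExecutionTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, trace):                 # trace: list of step/module names
        node = self.root
        for step in trace:
            node = node.children[step]
            node.count += 1

    def pattern_count(self, pattern):        # how many indexed traces start with `pattern`
        node = self.root
        for step in pattern:
            if step not in node.children:
                return 0
            node = node.children[step]
        return node.count

index = ExecutionTrie()
index.insert(["fetch", "clean", "align", "plot"])
index.insert(["fetch", "clean", "cluster"])
print(index.pattern_count(["fetch", "clean"]))   # -> 2
```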
Design Identification of Curve Patterns on Cultural Heritage Objects: Combining Template Matching and CNN-based Re-Ranking
Title | Design Identification of Curve Patterns on Cultural Heritage Objects: Combining Template Matching and CNN-based Re-Ranking |
Authors | Jun Zhou, Yuhang Lu, Kang Zheng, Karen Smith, Colin Wilder, Song Wang |
Abstract | The surfaces of many cultural heritage objects were embellished with various patterns, especially curve patterns. In practice, most of the unearthed cultural heritage objects are highly fragmented, e.g., sherds of potteries or vessels, and each of them only shows a very small portion of the underlying full design, with noise and deformations. The goal of this paper is to address the challenging problem of automatically identifying the underlying full design of curve patterns from such a sherd. Specifically, we formulate this problem as template matching: curve structure segmented from the sherd is matched to each location with each possible orientation of each known full design. In this paper, we propose a new two-stage matching algorithm, with a different matching cost in each stage. In Stage 1, we use a traditional template matching, which is highly computationally efficient, over the whole search space and identify a small set of candidate matchings. In Stage 2, we derive a new matching cost by training a dual-source Convolutional Neural Network (CNN) and apply it to re-rank the candidate matchings identified in Stage 1. We collect 600 pottery sherds with 98 full designs from the Woodland Period in Southeastern North America for experiments and the performance of the proposed algorithm is very competitive. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06862v1 |
http://arxiv.org/pdf/1805.06862v1.pdf | |
PWC | https://paperswithcode.com/paper/design-identification-of-curve-patterns-on |
Repo | |
Framework | |
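A hedged sketch of the two-stage pipeline: fast normalized cross-correlation to shortlist candidate locations in a full design, then re-ranking with a learned score. Rotation search is omitted and `cnn_score` is a placeholder for the paper's dual-source CNN, which is not reproduced here.

```python
# Stage 1: cheap template matching over the full design to get candidates.
# Stage 2: re-rank the shortlist with a learned matching cost (placeholder).
import cv2
import numpy as np

def stage1_candidates(sherd_curves, design_image, top_k=20):
    # Both inputs: single-channel float32/uint8 images; design_image >= sherd_curves.
    resp = cv2.matchTemplate(design_image, sherd_curves, cv2.TM_CCOEFF_NORMED)
    flat = np.argsort(resp.ravel())[::-1][:top_k]
    return [(np.unravel_index(i, resp.shape), float(resp.ravel()[i])) for i in flat]

def rerank(candidates, cnn_score):
    # cnn_score(location) -> learned matching score; here any callable will do.
    return sorted(candidates, key=lambda c: cnn_score(c[0]), reverse=True)
```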
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting
Title | Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting |
Authors | Katharina Kann, Hinrich Schütze |
Abstract | Neural state-of-the-art sequence-to-sequence (seq2seq) models often do not perform well for small training sets. We address paradigm completion, the morphological task of, given a partial paradigm, generating all missing forms. We propose two new methods for the minimal-resource setting: (i) Paradigm transduction: Since we assume that only a few paradigms are available for training, neural seq2seq models are able to capture relationships between paradigm cells, but are tied to the idiosyncrasies of the training set. Paradigm transduction mitigates this problem by exploiting the input subset of inflected forms at test time. (ii) Source selection with high precision (SHIP): Multi-source models which learn to automatically select one or multiple sources to predict a target inflection do not perform well in the minimal-resource setting. SHIP is an alternative to identify a reliable source if training data is limited. On a 52-language benchmark dataset, we outperform the previous state of the art by up to 9.71% absolute accuracy. |
Tasks | |
Published | 2018-09-24 |
URL | https://arxiv.org/abs/1809.08733v2 |
https://arxiv.org/pdf/1809.08733v2.pdf | |
PWC | https://paperswithcode.com/paper/neural-transductive-learning-and-beyond |
Repo | |
Framework | |
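As a toy illustration of the source-selection idea (not the paper's SHIP criterion), one can score each candidate source cell by how precisely a given model predicts target forms from it on the few available training paradigms, and keep only the most reliable source; `predict` below is an assumed black-box inflector.

```python
# Toy source selection: pick the paradigm cell that most reliably predicts the
# target cell on the available training paradigms (dicts mapping cell -> form).
def select_source(train_paradigms, source_cells, target_cell, predict):
    # predict(source_form, source_cell, target_cell) -> predicted target form
    best_cell, best_acc = None, -1.0
    for cell in source_cells:
        pairs = [(p[cell], p[target_cell]) for p in train_paradigms
                 if cell in p and target_cell in p]
        if not pairs:
            continue
        acc = sum(predict(src, cell, target_cell) == tgt for src, tgt in pairs) / len(pairs)
        if acc > best_acc:
            best_cell, best_acc = cell, acc
    return best_cell
```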
Domain-Invariant Projection Learning for Zero-Shot Recognition
Title | Domain-Invariant Projection Learning for Zero-Shot Recognition |
Authors | An Zhao, Mingyu Ding, Jiechao Guan, Zhiwu Lu, Tao Xiang, Ji-Rong Wen |
Abstract | Zero-shot learning (ZSL) aims to recognize unseen object classes without any training samples, which can be regarded as a form of transfer learning from seen classes to unseen ones. This is made possible by learning a projection between a feature space and a semantic space (e.g. attribute space). Key to ZSL is thus to learn a projection function that is robust against the often large domain gap between the seen and unseen classes. In this paper, we propose a novel ZSL model termed domain-invariant projection learning (DIPL). Our model has two novel components: (1) A domain-invariant feature self-reconstruction task is introduced to the seen/unseen class data, resulting in a simple linear formulation that casts ZSL into a min-min optimization problem. Solving the problem is non-trivial, and a novel iterative algorithm is formulated as the solver, with a rigorous theoretical analysis of the algorithm provided. (2) To further align the two domains via the learned projection, shared semantic structure among seen and unseen classes is explored via forming superclasses in the semantic space. Extensive experiments show that our model outperforms the state-of-the-art alternatives by significant margins. |
Tasks | Transfer Learning, Zero-Shot Learning |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08326v1 |
http://arxiv.org/pdf/1810.08326v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-invariant-projection-learning-for-zero |
Repo | |
Framework | |
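The projection-with-self-reconstruction idea can be sketched with the well-known semantic-autoencoder-style closed form (a Sylvester equation); the paper's DIPL formulation, its min-min solver, and the superclass alignment are different and are not reproduced here.

```python
# A hedged sketch: learn a linear projection W from visual features to class
# semantics with a self-reconstruction penalty, then classify by nearest
# attribute vector. This is the SAE-style closed form, used only to illustrate
# the projection + reconstruction idea.
import numpy as np
from scipy.linalg import solve_sylvester

def fit_projection(X, S, lam=0.2):
    # X: d x N visual features, S: k x N class attribute vectors.
    A = S @ S.T
    B = lam * (X @ X.T)
    Q = (1.0 + lam) * (S @ X.T)
    return solve_sylvester(A, B, Q)        # W: k x d, maps features to semantics

def predict_class(W, x, class_attributes):
    # Nearest unseen-class attribute vector (cosine similarity) in projected space.
    s = W @ x
    sims = class_attributes @ s / (np.linalg.norm(class_attributes, axis=1) * np.linalg.norm(s) + 1e-12)
    return int(np.argmax(sims))
```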
Performance Analysis and Robustification of Single-query 6-DoF Camera Pose Estimation
Title | Performance Analysis and Robustification of Single-query 6-DoF Camera Pose Estimation |
Authors | Junsheng Fu, Said Pertuz, Jiri Matas, Joni-Kristian Kämäräinen |
Abstract | We consider single-query 6-DoF camera pose estimation with reference images and a point cloud, i.e. the problem of estimating the position and orientation of a camera by using reference images and a point cloud. In this work, we perform a systematic comparison of three state-of-the-art strategies for 6-DoF camera pose estimation, i.e. feature-based, photometric-based and mutual-information-based approaches. The performance of the studied methods is evaluated on two standard datasets in terms of success rate, translation error and max orientation error. Building on the results analysis, we propose a hybrid approach that combines feature-based and mutual-information-based pose estimation methods since they provide complementary properties for pose estimation. Experiments show that (1) in cases with large environmental variance, the hybrid approach outperforms feature-based and mutual-information-based approaches by an average of 25.1% and 5.8% in terms of success rate, respectively; (2) in cases where query and reference images are captured at similar imaging conditions, the hybrid approach performs similarly to the feature-based approach, but outperforms both photometric-based and mutual-information-based approaches with a clear margin; (3) the feature-based approach is consistently more accurate than mutual-information-based and photometric-based approaches when at least 4 consistent matching points are found between the query and reference images. |
Tasks | Pose Estimation |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05848v1 |
http://arxiv.org/pdf/1808.05848v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-analysis-and-robustification-of |
Repo | |
Framework | |
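A minimal sketch of the feature-based branch only: given 2D-3D correspondences from descriptor matching against the reference model, PnP with RANSAC recovers the 6-DoF pose. The mutual-information component and the hybrid switching logic from the paper are not shown.

```python
# Recover a 6-DoF pose from assumed 2D-3D correspondences with PnP + RANSAC.
import cv2
import numpy as np

def estimate_pose(points_3d, points_2d, K, dist=None):
    # points_3d: Nx3 model points, points_2d: Nx2 query keypoints, K: 3x3 intrinsics.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float32), points_2d.astype(np.float32),
        K.astype(np.float32), dist, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)               # rotation matrix and translation vector
    return R, tvec, inliers
```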
Accurate Spectral Super-resolution from Single RGB Image Using Multi-scale CNN
Title | Accurate Spectral Super-resolution from Single RGB Image Using Multi-scale CNN |
Authors | Yiqi Yan, Lei Zhang, Jun Li, Wei Wei, Yanning Zhang |
Abstract | Different from traditional hyperspectral super-resolution approaches that focus on improving the spatial resolution, spectral super-resolution aims at producing a high-resolution hyperspectral image from the RGB observation with super-resolution in spectral domain. However, it is challenging to accurately reconstruct a high-dimensional continuous spectrum from three discrete intensity values at each pixel, since too much information is lost during the procedure where the latent hyperspectral image is downsampled (e.g., with x10 scaling factor) in spectral domain to produce an RGB observation. To address this problem, we present a multi-scale deep convolutional neural network (CNN) to explicitly map the input RGB image into a hyperspectral image. Through symmetrically downsampling and upsampling the intermediate feature maps in a cascading paradigm, the local and non-local image information can be jointly encoded for spectral representation, ultimately improving the spectral reconstruction accuracy. Extensive experiments on a large hyperspectral dataset demonstrate the effectiveness of the proposed method. |
Tasks | Super-Resolution |
Published | 2018-06-10 |
URL | http://arxiv.org/abs/1806.03575v3 |
http://arxiv.org/pdf/1806.03575v3.pdf | |
PWC | https://paperswithcode.com/paper/accurate-spectral-super-resolution-from |
Repo | |
Framework | |
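A much-simplified encoder-decoder sketch of the RGB-to-hyperspectral mapping (3 input channels, 31 assumed output bands); the layer sizes and the single down/up path are illustrative and do not match the paper's multi-scale architecture.

```python
# Toy RGB -> hyperspectral network: downsample once, upsample once, emit `bands`
# spectral channels per pixel. Purely illustrative shapes.
import torch
import torch.nn as nn

class SpectralSRNet(nn.Module):
    def __init__(self, bands=31):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())   # encode at half resolution
        self.up = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, bands, 3, padding=1))                      # decode to spectral bands

    def forward(self, rgb):
        return self.up(self.down(rgb))

hsi = SpectralSRNet()(torch.randn(1, 3, 64, 64))
print(hsi.shape)   # torch.Size([1, 31, 64, 64])
```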
Online Learning of Quantum States
Title | Online Learning of Quantum States |
Authors | Scott Aaronson, Xinyi Chen, Elad Hazan, Satyen Kale, Ashwin Nayak |
Abstract | Suppose we have many copies of an unknown $n$-qubit state $\rho$. We measure some copies of $\rho$ using a known two-outcome measurement $E_{1}$, then other copies using a measurement $E_{2}$, and so on. At each stage $t$, we generate a current hypothesis $\sigma_{t}$ about the state $\rho$, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that $\lvert\operatorname{Tr}(E_{i} \sigma_{t}) - \operatorname{Tr}(E_{i}\rho)\rvert$, the error in our prediction for the next measurement, is at least $\varepsilon$ at most $\operatorname{O}\!\left(n / \varepsilon^2 \right)$ times. Even in the “non-realizable” setting—where there could be arbitrary noise in the measurement outcomes—we show how to output hypothesis states that do significantly worse than the best possible states at most $\operatorname{O}\!\left(\sqrt{Tn}\right)$ times on the first $T$ measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states, to the online and regret-minimization settings. We give three different ways to prove our results—using convex optimization, quantum postselection, and sequential fat-shattering dimension—which have different advantages in terms of parameters and portability. |
Tasks | |
Published | 2018-02-25 |
URL | https://arxiv.org/abs/1802.09025v3 |
https://arxiv.org/pdf/1802.09025v3.pdf | |
PWC | https://paperswithcode.com/paper/online-learning-of-quantum-states |
Repo | |
Framework | |
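For reference, the two guarantees stated in the abstract can be written out as bounds on the number of "bad" rounds; this is only a rendering of the stated results, not an additional claim.

```latex
% Realizable setting: rounds whose prediction error is at least \varepsilon.
\[
  \#\bigl\{\, t : \bigl|\operatorname{Tr}(E_{t}\sigma_{t}) - \operatorname{Tr}(E_{t}\rho)\bigr| \ge \varepsilon \,\bigr\}
  \;=\; O\!\bigl(n/\varepsilon^{2}\bigr).
\]

% Non-realizable setting: rounds among the first T on which the hypothesis
% does significantly worse than the best possible states.
\[
  \#\bigl\{\, t \le T : \text{hypothesis significantly worse than best} \,\bigr\}
  \;=\; O\!\bigl(\sqrt{Tn}\,\bigr).
\]
```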
Real time Traffic Flow Parameters Prediction with Basic Safety Messages at Low Penetration of Connected Vehicles
Title | Real time Traffic Flow Parameters Prediction with Basic Safety Messages at Low Penetration of Connected Vehicles |
Authors | Mizanur Rahman, Mashrur Chowdhury, Jerome McClendon |
Abstract | The expected low market penetration of connected vehicles (CVs) in the near future could be a constraint in estimating traffic flow parameters, such as the average travel speed of a roadway segment and the average space headway between vehicles, from CV-broadcasted data. These traffic flow parameters, when estimated from a low penetration of connected vehicles, become noisy compared to 100 percent penetration of CVs, and such noise reduces the real time prediction accuracy of a machine learning model, such as the accuracy of a long short term memory (LSTM) model in terms of predicting traffic flow parameters. Accurate prediction of these parameters is important for future traffic condition assessment. To improve the prediction accuracy using noisy traffic flow parameters, which is constrained by limited CV market penetration and limited CV data, we developed a real time traffic data prediction model that combines LSTM with a Kalman-filter-based Rauch-Tung-Striebel (RTS) noise reduction model. We conducted a case study using the Enhanced Next Generation Simulation (NGSIM) dataset, which contains vehicle trajectory data for every one tenth of a second, to evaluate the performance of this prediction model. Compared to a baseline LSTM model, for only 5 percent penetration of CVs, the analyses revealed that the combined LSTM and RTS model reduced the mean absolute percentage error (MAPE) from 19 percent to 5 percent for speed prediction and from 27 percent to 9 percent for space-headway prediction. A statistical significance test with a 95 percent confidence interval confirmed no significant difference in predicted average speed and average space headway using this LSTM and RTS combination with only a 5 percent CV penetration rate. |
Tasks | |
Published | 2018-11-08 |
URL | https://arxiv.org/abs/1811.03562v2 |
https://arxiv.org/pdf/1811.03562v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-traffic-data-prediction-with-basic |
Repo | |
Framework | |
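The noise-reduction step can be illustrated with a compact 1-D Kalman filter followed by a Rauch-Tung-Striebel backward pass over a noisy speed series; the random-walk model and noise settings below are illustrative, and the LSTM itself is not shown.

```python
# A compact 1-D Kalman + RTS smoother sketch for denoising a noisy CV-derived
# speed series before feeding it to an LSTM.
import numpy as np

def rts_smooth(z, q=0.05, r=1.0):
    n = len(z)
    xf, pf, xp, pp = np.zeros(n), np.zeros(n), np.zeros(n), np.zeros(n)
    x, p = z[0], 1.0
    for t in range(n):                      # forward Kalman filter (random-walk model)
        xp[t], pp[t] = x, p + q
        k = pp[t] / (pp[t] + r)
        x = xp[t] + k * (z[t] - xp[t])
        p = (1.0 - k) * pp[t]
        xf[t], pf[t] = x, p
    xs = xf.copy()
    for t in range(n - 2, -1, -1):          # backward RTS smoothing pass
        c = pf[t] / pp[t + 1]
        xs[t] = xf[t] + c * (xs[t + 1] - xp[t + 1])
    return xs

noisy_speed = 30 + np.cumsum(np.random.randn(200) * 0.1) + np.random.randn(200)
smoothed = rts_smooth(noisy_speed)          # the smoothed series is what the LSTM would consume
```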
Feature Transfer Learning for Deep Face Recognition with Under-Represented Data
Title | Feature Transfer Learning for Deep Face Recognition with Under-Represented Data |
Authors | Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, Manmohan Chandraker |
Abstract | Despite the large volume of face recognition datasets, a significant portion of subjects have insufficient samples and are thus under-represented. Ignoring such a significant portion results in insufficient training data. Training with under-represented data leads to biased classifiers in conventionally-trained deep networks. In this paper, we propose a center-based feature transfer framework to augment the feature space of under-represented subjects from the regular subjects that have sufficiently diverse samples. A Gaussian prior on the variance is assumed across all subjects, and the variance from regular subjects is transferred to the under-represented ones. This encourages the under-represented distribution to be closer to the regular distribution. Further, an alternating training regimen is proposed to simultaneously achieve less biased classifiers and a more discriminative feature representation. We conduct an ablative study to mimic under-represented datasets by varying the portion of under-represented classes on the MS-Celeb-1M dataset. Advantageous results on LFW, IJB-A and MS-Celeb-1M demonstrate the effectiveness of our feature transfer and training strategy, compared to both general baselines and state-of-the-art methods. Moreover, our feature transfer successfully presents smooth visual interpolation, which conducts disentanglement to preserve the identity of a class while augmenting its feature space with non-identity variations such as pose and lighting. |
Tasks | Face Recognition, Transfer Learning |
Published | 2018-03-23 |
URL | https://arxiv.org/abs/1803.09014v2 |
https://arxiv.org/pdf/1803.09014v2.pdf | |
PWC | https://paperswithcode.com/paper/feature-transfer-learning-for-deep-face |
Repo | |
Framework | |
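A simplified sketch of the variance-transfer intuition: recentre the intra-class variation of a data-rich class onto an under-represented class's feature centre to synthesize extra samples. The Gaussian-prior modeling and the alternating training regimen from the paper are not reproduced.

```python
# Augment a rare class in feature space by borrowing the residual (intra-class)
# variation of a regular class; purely illustrative shapes and data.
import numpy as np

def transfer_features(regular_feats, rare_feats, n_new=50, rng=None):
    rng = rng or np.random.default_rng(0)
    residuals = regular_feats - regular_feats.mean(axis=0)   # rich intra-class variation
    rare_center = rare_feats.mean(axis=0)
    picks = rng.choice(len(residuals), size=n_new, replace=True)
    return rare_center + residuals[picks]                    # synthetic samples for the rare class

regular = np.random.randn(500, 256) * 1.5 + 3.0
rare = np.random.randn(5, 256) + 1.0
augmented = np.vstack([rare, transfer_features(regular, rare)])
```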
The Institutional Approach
Title | The Institutional Approach |
Authors | Robert E. Kent |
Abstract | This chapter discusses the institutional approach for organizing and maintaining ontologies. The theory of institutions was named and initially developed by Joseph Goguen and Rod Burstall. This theory, a metatheory based on category theory, regards ontologies as logical theories or local logics. The theory of institutions uses the category-theoretic ideas of fibrations and indexed categories to develop logical theories. Institutions unite the lattice approach of Formal Concept Analysis of Ganter and Wille with the distributed logic of Information Flow of Barwise and Seligman. The institutional approach incorporates locally the lattice of theories idea of Sowa from the theory of knowledge representation. The Information Flow Framework, which was initiated within the IEEE Standard Upper Ontology project, uses the institutional approach in its applied aspect for the comparison, semantic integration and maintenance of ontologies. This chapter explains the central ideas of the institutional approach to ontologies in a careful and detailed manner. |
Tasks | |
Published | 2018-10-17 |
URL | http://arxiv.org/abs/1810.08074v1 |
http://arxiv.org/pdf/1810.08074v1.pdf | |
PWC | https://paperswithcode.com/paper/the-institutional-approach |
Repo | |
Framework | |