Paper Group ANR 736
Multimodal Image Captioning for Marketing Analysis. Learning audio and image representations with bio-inspired trainable feature extractors. Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes. Data Strategies for Fleetwide Predictive Maintenance. High-dimensional Varying Index Coefficient Models via Stein’s Identit …
Multimodal Image Captioning for Marketing Analysis
Title | Multimodal Image Captioning for Marketing Analysis |
Authors | Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner |
Abstract | Automatically captioning images with natural language sentences is an important research topic. State of the art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not target specific objects of interest or emotional relationships between these objects in the image. However, marketing companies need descriptions of these important attributes of a given scene. In our case, objects of interest are consumer goods, which are usually identifiable by a product logo and are associated with certain brands. From a marketing point of view, it is desirable to also evaluate the emotional context of a trademarked product, i.e., whether it appears in a positive or a negative connotation. We address the problem of finding brands in images and deriving corresponding captions by introducing a modified image captioning network. We also add a third output modality, which simultaneously produces real-valued image ratings. Our network is trained using a classification-aware loss function in order to stimulate the generation of sentences with an emphasis on words identifying the brand of a product. We evaluate our model on a dataset of images depicting interactions between humans and branded products. The introduced network improves mean class accuracy by 24.5 percent. Thanks to adding the third output modality, it also considerably improves the quality of generated captions for images depicting branded products. |
Tasks | Image Captioning |
Published | 2018-02-06 |
URL | https://arxiv.org/abs/1802.01958v2 |
https://arxiv.org/pdf/1802.01958v2.pdf | |
PWC | https://paperswithcode.com/paper/multimodal-image-captioning-for-marketing |
Repo | |
Framework | |
Learning audio and image representations with bio-inspired trainable feature extractors
Title | Learning audio and image representations with bio-inspired trainable feature extractors |
Authors | Nicola Strisciuglio |
Abstract | Recent advancements in pattern recognition and signal processing concern the automatic learning of data representations from labeled training samples. Typical approaches are based on deep learning and convolutional neural networks, which require large amounts of labeled training samples. In this work, we propose novel feature extractors that can be used to learn the representation of single prototype samples in an automatic configuration process. We employ the proposed feature extractors in applications of audio and image processing, and show their effectiveness on benchmark data sets. |
Tasks | |
Published | 2018-01-02 |
URL | http://arxiv.org/abs/1801.00688v1 |
http://arxiv.org/pdf/1801.00688v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-audio-and-image-representations-with |
Repo | |
Framework | |
Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes
Title | Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes |
Authors | Seyed Mostafa Kia, Andre Marquand |
Abstract | Normative modeling has recently been proposed as an alternative for the case-control approach in modeling heterogeneity within clinical cohorts. Normative modeling is based on single-output Gaussian process regression that provides coherent estimates of uncertainty required by the method but does not consider spatial covariance structure. Here, we introduce a scalable multi-task Gaussian process regression (S-MTGPR) approach to address this problem. To this end, we exploit a combination of a low-rank approximation of the spatial covariance matrix with algebraic properties of Kronecker product in order to reduce the computational complexity of Gaussian process regression in high-dimensional output spaces. On a public fMRI dataset, we show that S-MTGPR: 1) leads to substantial computational improvements that allow us to estimate normative models for high-dimensional fMRI data whilst accounting for spatial structure in data; 2) by modeling both spatial and across-sample variances, it provides higher sensitivity in novelty detection scenarios. |
Tasks | Gaussian Processes |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01047v2 |
http://arxiv.org/pdf/1806.01047v2.pdf | |
PWC | https://paperswithcode.com/paper/normative-modeling-of-neuroimaging-data-using |
Repo | |
Framework | |
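The abstract above relies on "algebraic properties of Kronecker product" to avoid working with a large covariance matrix directly. As a hedged illustration, the sketch below checks one standard property of that kind, the identity (A ⊗ B) vec(X) = vec(B X Aᵀ), on small integer matrices. Which identities the paper actually uses is not stated in the abstract; the matrices and helper names here are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(A, B):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def vec(X):
    """Stack the columns of X into one flat list (column-major order)."""
    return [x for col in zip(*X) for x in col]

# Hypothetical small matrices, chosen only to make the check exact.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
X = [[1, 2], [3, 4]]

# Direct route: build the 4x4 Kronecker matrix, then multiply by vec(X).
K = kron(A, B)
direct = [sum(k * v for k, v in zip(row, vec(X))) for row in K]

# Structured route: two small products, no Kronecker matrix is formed.
structured = vec(matmul(matmul(B, X), transpose(A)))
```

Both routes give the same vector, but the structured one never materializes the Kronecker matrix, which is where the savings come from when the factors are large.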
Data Strategies for Fleetwide Predictive Maintenance
Title | Data Strategies for Fleetwide Predictive Maintenance |
Authors | David Noever |
Abstract | For predictive maintenance, we examine one of the largest public datasets for machine failures derived along with their corresponding precursors as error rates, historical part replacements, and sensor inputs. To simplify the time and accuracy comparison between 27 different algorithms, we treat the imbalance between normal and failing states with nominal under-sampling. We identify 3 promising regression and discriminant algorithms with both higher accuracy (96%) and twenty-fold faster execution times than previous work. Because predictive maintenance success hinges on input features prior to prediction, we provide a methodology to rank-order feature importance and show that for this dataset, error counts prove more predictive than scheduled maintenance might imply solely based on more traditional factors such as machine age or last replacement times. |
Tasks | Feature Importance |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04446v1 |
http://arxiv.org/pdf/1812.04446v1.pdf | |
PWC | https://paperswithcode.com/paper/data-strategies-for-fleetwide-predictive |
Repo | |
Framework | |
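The abstract above handles the imbalance between normal and failing states with under-sampling. A minimal sketch of random under-sampling follows, assuming the simplest variant in which every class is cut down to the size of the smallest one. The abstract does not specify the exact scheme, and the toy records and the `undersample` helper are hypothetical.

```python
import random

def undersample(records, label_of, seed=0):
    """Randomly drop records until every class has as many records
    as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for rec in records:
        by_class.setdefault(label_of(rec), []).append(rec)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        # Sample without replacement so no record is duplicated.
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: 95 "normal" readings and 5 "failure" readings.
toy = [("normal", i) for i in range(95)] + [("failure", i) for i in range(5)]
balanced = undersample(toy, label_of=lambda rec: rec[0])
counts = {label: sum(1 for rec in balanced if rec[0] == label)
          for label in ("normal", "failure")}
```

Under-sampling discards data, so it trades some information about the majority class for a balanced and smaller training set.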
High-dimensional Varying Index Coefficient Models via Stein’s Identity
Title | High-dimensional Varying Index Coefficient Models via Stein’s Identity |
Authors | Sen Na, Zhuoran Yang, Zhaoran Wang, Mladen Kolar |
Abstract | We study the parameter estimation problem for a varying index coefficient model in high dimensions. Unlike most existing works that iteratively estimate the parameters and link functions, we propose computationally efficient estimators, based on the generalized Stein’s identity, for the high-dimensional parameters without estimating the link functions. We consider two different setups where we either estimate each sparse parameter vector individually or estimate the parameters simultaneously as a sparse or low-rank matrix. For all these cases, our estimators are shown to achieve optimal statistical rates of convergence (up to logarithmic terms in the low-rank setting). Moreover, throughout our analysis, we only require the covariate to satisfy certain moment conditions, which is significantly weaker than the Gaussian or elliptically symmetric assumptions that are commonly made in the existing literature. Finally, we conduct extensive numerical experiments to corroborate the theoretical results. |
Tasks | |
Published | 2018-10-16 |
URL | https://arxiv.org/abs/1810.07128v4 |
https://arxiv.org/pdf/1810.07128v4.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-varying-index-coefficient |
Repo | |
Framework | |
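For readers unfamiliar with the tool named in the abstract above, the following restates Stein's identity in its first-order form and shows why it allows estimating a direction without knowing the link function. The single-index model in the last line is a simpler illustrative case, not the varying index coefficient model of the paper, and the symbols f, g, S, and β are notation assumed here.

```latex
% First-order Stein's identity for a standard Gaussian covariate,
% valid for differentiable f with suitable integrability:
x \sim \mathcal{N}(0, I_d): \qquad
\mathbb{E}\big[ f(x)\, x \big] = \mathbb{E}\big[ \nabla f(x) \big].

% Generalized form for a covariate with density p and score
% S(x) = -\nabla \log p(x):
\mathbb{E}\big[ f(x)\, S(x) \big] = \mathbb{E}\big[ \nabla f(x) \big].

% Illustration: single-index model y = g(\langle x, \beta \rangle) + \epsilon,
% with \mathbb{E}[\epsilon \mid x] = 0. Applying the identity to
% f(x) = g(\langle x, \beta \rangle) gives
\mathbb{E}\big[ y\, S(x) \big]
  = \mathbb{E}\big[ g'(\langle x, \beta \rangle) \big]\, \beta .
```

The left-hand side of the last line can be estimated from data alone, and it is proportional to β whatever g is, which is the sense in which the link function need not be estimated.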
Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal Adenocarcinoma
Title | Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal Adenocarcinoma |
Authors | Zhuotun Zhu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille |
Abstract | We propose an intuitive approach for detecting pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, by checking abdominal CT scans. Our idea is named multi-scale segmentation-for-classification, which classifies volumes by checking whether a sufficient number of voxels are segmented as tumor, by which we can provide radiologists with tumor locations. In order to deal with tumors with different scales, we train and test our volumetric segmentation networks with multi-scale inputs in a coarse-to-fine flowchart. A post-processing module is used to filter out outliers and reduce false alarms. We collect a new dataset containing 439 CT scans, in which 136 cases were diagnosed with PDAC and 303 cases are normal, which is the largest set for PDAC tumors to the best of our knowledge. To offer the best trade-off between sensitivity and specificity, our proposed framework reports a sensitivity of 94.1% at a specificity of 98.5%, which demonstrates the potential to make a clinical impact. |
Tasks | |
Published | 2018-07-09 |
URL | https://arxiv.org/abs/1807.02941v2 |
https://arxiv.org/pdf/1807.02941v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-scale-coarse-to-fine-segmentation-for |
Repo | |
Framework | |
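The segmentation-for-classification rule in the abstract above (flag a volume when enough voxels are segmented as tumor, and report where they are) can be sketched as follows. The mask, the `classify_volume` helper, and the threshold value are hypothetical; the abstract does not give the threshold or describe the post-processing module, so neither is modeled here.

```python
def classify_volume(mask, min_tumor_voxels):
    """Flag a scan as suspicious when enough voxels are segmented as tumor.

    mask is a 3D nested list of 0/1 voxel labels (1 = tumor).
    Returns the decision and the (z, y, x) coordinates of tumor voxels.
    """
    tumor_voxels = [(z, y, x)
                    for z, plane in enumerate(mask)
                    for y, row in enumerate(plane)
                    for x, label in enumerate(row)
                    if label == 1]
    return len(tumor_voxels) >= min_tumor_voxels, tumor_voxels

# Hypothetical 2x2x3 segmentation mask containing three tumor voxels.
mask = [[[0, 1, 1], [0, 0, 0]],
        [[0, 1, 0], [0, 0, 0]]]
is_positive, locations = classify_volume(mask, min_tumor_voxels=2)
```

Raising `min_tumor_voxels` makes the rule more conservative, which is one way such a pipeline can trade sensitivity for specificity.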
An end-to-end approach for speeding up neural network inference
Title | An end-to-end approach for speeding up neural network inference |
Authors | Charles Herrmann, Richard Strong Bowen, Ramin Zabih |
Abstract | Important applications such as mobile computing require reducing the computational costs of neural network inference. Ideally, applications would specify their preferred tradeoff between accuracy and speed, and the network would optimize this end-to-end, using classification error to remove parts of the network. Increasing speed can be done either during training (e.g., pruning filters) or during inference (e.g., conditionally executing a subset of the layers). We propose a single end-to-end framework that can improve inference efficiency in both settings. We introduce a batch activation loss and use Gumbel reparameterization to learn network structure. We train end-to-end against batch activation loss combined with classification loss, and the same technique supports pruning as well as conditional computation. We obtain promising experimental results for ImageNet classification with ResNet (45-52% less computation) and MobileNetV2 (19-37% less computation). |
Tasks | |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04180v3 |
http://arxiv.org/pdf/1812.04180v3.pdf | |
PWC | https://paperswithcode.com/paper/deep-networks-with-probabilistic-gates |
Repo | |
Framework | |
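The abstract above uses Gumbel reparameterization to learn which parts of a network to execute. The sketch below shows only the sampling idea behind it, the Gumbel-max trick for a binary gate, and checks that the gate opens with the softmax probability of its logits. It does not reproduce the paper's differentiable relaxation or its batch activation loss; the logits and the `sample_gate` helper are hypothetical.

```python
import math
import random

def sample_gate(logit_on, logit_off, rng):
    """Gumbel-max trick: add Gumbel(0, 1) noise to each logit, take the argmax.

    Returns 1 (execute the layer) or 0 (skip it).
    """
    def gumbel():
        # -log(-log(U)) with U uniform on (0, 1) is Gumbel(0, 1) distributed.
        u = rng.random()
        while u == 0.0:  # guard against log(0)
            u = rng.random()
        return -math.log(-math.log(u))
    return 1 if logit_on + gumbel() > logit_off + gumbel() else 0

rng = random.Random(0)
draws = [sample_gate(1.0, 0.0, rng) for _ in range(20000)]
empirical = sum(draws) / len(draws)

# Softmax probability of the "on" logit, which the samples should approach.
expected = 1.0 / (1.0 + math.exp(-1.0))
```

Because the randomness is isolated in the added noise, the logits can be treated as ordinary parameters, which is what makes gates of this kind trainable once the argmax is relaxed.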
Logographic Subword Model for Neural Machine Translation
Title | Logographic Subword Model for Neural Machine Translation |
Authors | Yihao Fang, Rong Zheng, Xiaodan Zhu |
Abstract | A novel logographic subword model is proposed to reinterpret logograms as abstract subwords for neural machine translation. Our approach drastically reduces the size of an artificial neural network, while maintaining BLEU scores comparable to those attained with the baseline RNN and CNN seq2seq models. The smaller model size also leads to shorter training and inference time. Experiments demonstrate that in the tasks of English-Chinese/Chinese-English translation, the reduction of those aspects can be from 11% to as high as 77%. Compared to previous subword models, abstract subwords can be applied to various logographic languages. Considering most of the logographic languages are ancient and very low resource languages, these advantages are very desirable for archaeological computational linguistic applications such as a resource-limited offline hand-held Demotic-English translator. |
Tasks | Machine Translation |
Published | 2018-09-07 |
URL | http://arxiv.org/abs/1809.02592v1 |
http://arxiv.org/pdf/1809.02592v1.pdf | |
PWC | https://paperswithcode.com/paper/logographic-subword-model-for-neural-machine |
Repo | |
Framework | |
Multi-task Learning over Graph Structures
Title | Multi-task Learning over Graph Structures |
Authors | Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung |
Abstract | We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labeling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines but also learn interpretable and transferable patterns across tasks. |
Tasks | Multi-Task Learning, Text Classification, Transfer Learning |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10211v1 |
http://arxiv.org/pdf/1811.10211v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-over-graph-structures |
Repo | |
Framework | |
Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results
Title | Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results |
Authors | Vishwa S. Parekh, Katarzyna J. Macura, Susan Harvey, Ihab Kamel, Riham EI-Khouli, David A. Bluemke, Michael A. Jacobs |
Abstract | A new paradigm is beginning to emerge in Radiology with the advent of increased computational capabilities and algorithms. This has led to the ability of real time learning by computer systems of different lesion types to help the radiologist in defining disease. For example, using a deep learning network, we developed and tested a multiparametric deep learning (MPDL) network for segmentation and classification using multiparametric magnetic resonance imaging (mpMRI) radiological images. The MPDL network was constructed from stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL consisted of cross-validation, sensitivity, and specificity. Dice similarity between MPDL and post-DCE lesions was evaluated. We demonstrate high sensitivity and specificity for differentiation of malignant from benign lesions of 90% and 85% respectively with an AUC of 0.93. The Integrated MPDL method accurately segmented and classified different breast tissue from multiparametric breast MRI using deep learning tissue signatures. |
Tasks | |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.08200v1 |
http://arxiv.org/pdf/1802.08200v1.pdf | |
PWC | https://paperswithcode.com/paper/multiparametric-deep-learning-tissue |
Repo | |
Framework | |
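The evaluation metrics named in the abstract above (Dice similarity, sensitivity, specificity) have standard definitions, sketched below on hypothetical toy labels. The inputs are not taken from the paper, and the helper names are illustrative.

```python
def dice(pred, truth):
    """Dice similarity of two binary masks given as flat 0/1 lists."""
    overlap = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match.
    return 2.0 * overlap / total if total else 1.0

def sensitivity_specificity(pred, truth):
    """Return (sensitivity, specificity) for binary labels.

    Assumes at least one positive and one negative case in truth.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical segmentation masks (flattened) for the Dice score.
mask_score = dice([1, 1, 0, 0], [1, 0, 1, 0])

# Hypothetical lesion labels: 1 = malignant, 0 = benign.
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
actual = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(predicted, actual)
```

Dice scores the overlap of a segmentation with a reference mask, while sensitivity and specificity score the per-lesion malignant/benign decision, so the two answer different questions about the same model.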
Modularity in biological evolution and evolutionary computation
Title | Modularity in biological evolution and evolutionary computation |
Authors | Anton Eremeev, Alexander Spirov |
Abstract | One of the main properties of biological systems is modularity, which manifests itself at all levels of their organization, starting with the level of molecular genetics, ending with the level of whole organisms and their communities. In a simplified form, these basic principles were transferred from the genetics of populations to the field of evolutionary computations, in order to solve applied optimization problems. Over almost half a century of development in this field of computer science, considerable practical experience has been gained and interesting theoretical results have been obtained. In this survey, the phenomena and patterns associated with modularity in genetics and evolutionary computations are compared. An analysis of similarities and differences in the results obtained in these areas is carried out from the modularity view point. The possibilities for knowledge transfer between the areas are discussed. |
Tasks | Transfer Learning |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07511v1 |
http://arxiv.org/pdf/1811.07511v1.pdf | |
PWC | https://paperswithcode.com/paper/modularity-in-biological-evolution-and |
Repo | |
Framework | |
Discovering Blind Spots in Reinforcement Learning
Title | Discovering Blind Spots in Reinforcement Learning |
Authors | Ramya Ramakrishnan, Ece Kamar, Debadeepta Dey, Julie Shah, Eric Horvitz |
Abstract | Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots. |
Tasks | Calibration |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.08966v1 |
http://arxiv.org/pdf/1805.08966v1.pdf | |
PWC | https://paperswithcode.com/paper/discovering-blind-spots-in-reinforcement |
Repo | |
Framework | |
Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering
Title | Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering |
Authors | Konstantin Makarychev, Yury Makarychev, Ilya Razenshteyn |
Abstract | Consider an instance of Euclidean $k$-means or $k$-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of $(1+\varepsilon)$ under a projection onto a random $O(\log(k / \varepsilon) / \varepsilon^2)$-dimensional subspace. Further, the cost of every clustering is preserved within $(1+\varepsilon)$. More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean $k$-clustering with the distances raised to the $p$-th power for any constant $p$. For $k$-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for $k$-medians, it answers a question raised by Kannan. |
Tasks | Dimensionality Reduction |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03195v1 |
http://arxiv.org/pdf/1811.03195v1.pdf | |
PWC | https://paperswithcode.com/paper/performance-of-johnson-lindenstrauss |
Repo | |
Framework | |
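The claim in the abstract above, that the cost of every clustering is approximately preserved under a random low-dimensional projection, can be tried out numerically. The sketch below uses a scaled Gaussian matrix as the dimension reduction map, which is a common Johnson-Lindenstrauss construction assumed here. The data, the fixed partition, and the target dimension of 64 are hypothetical and are not derived from the bound in the paper.

```python
import math
import random

def kmeans_cost(points, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for members in clusters:
        dim = len(points[members[0]])
        centroid = [sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)]
        cost += sum((points[i][d] - centroid[d]) ** 2
                    for i in members for d in range(dim))
    return cost

def random_projection(points, target_dim, rng):
    """Map points to target_dim dimensions with a scaled Gaussian matrix."""
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    G = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
         for _ in range(target_dim)]
    return [[sum(g * x for g, x in zip(row, p)) for row in G] for p in points]

rng = random.Random(0)
# Hypothetical data: 60 points in 200 dimensions.
points = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(60)]
# An arbitrary fixed partition into k = 3 clusters (not an optimized one).
clusters = [list(range(0, 20)), list(range(20, 40)), list(range(40, 60))]

projected = random_projection(points, target_dim=64, rng=rng)
ratio = kmeans_cost(projected, clusters) / kmeans_cost(points, clusters)
```

A `ratio` close to 1 means the projected cost tracks the original cost for this partition; the theorem is stronger, since it covers all partitions simultaneously with high probability.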
Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning
Title | Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning |
Authors | Yawei Luo, Tao Guan, Junqing Yu, Ping Liu, Yi Yang |
Abstract | The graph convolutional network (GCN) provides a powerful means for graph-based semi-supervised tasks. However, as a localized first-order approximation of spectral graph convolution, the classic GCN cannot take full advantage of unlabeled data, especially when the unlabeled node is far from labeled ones. To capitalize on the information from unlabeled nodes to boost the training for GCN, we propose a novel framework named Self-Ensembling GCN (SEGCN), which marries GCN with Mean Teacher - another powerful model in semi-supervised learning. SEGCN contains a student model and a teacher model. As a student, it not only learns to correctly classify the labeled nodes, but also tries to be consistent with the teacher on unlabeled nodes in more challenging situations, such as a high dropout rate and graph collapse. As a teacher, it averages the student model weights and generates more accurate predictions to lead the student. In such a mutual-promoting process, both labeled and unlabeled samples can be fully utilized for backpropagating effective gradients to train GCN. In three article classification tasks, i.e. Citeseer, Cora and Pubmed, we validate that the proposed method matches the state of the art in classification accuracy. |
Tasks | |
Published | 2018-09-26 |
URL | http://arxiv.org/abs/1809.09925v1 |
http://arxiv.org/pdf/1809.09925v1.pdf | |
PWC | https://paperswithcode.com/paper/every-node-counts-self-ensembling-graph |
Repo | |
Framework | |
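The teacher/student mechanism in the abstract above follows Mean Teacher, where the teacher's weights are an exponential moving average of the student's and the student is penalized for disagreeing with the teacher on unlabeled nodes. A minimal sketch of those two pieces follows. The decay value, the flat weight lists, and the use of a mean squared error as the consistency term are assumptions for illustration, not details confirmed by the abstract.

```python
def update_teacher(teacher, student, decay):
    """Exponential moving average of student weights (Mean Teacher style)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions.

    MSE is assumed here; the abstract does not name the consistency measure.
    """
    return sum((s - t) ** 2
               for s, t in zip(student_probs, teacher_probs)) / len(student_probs)

# Hypothetical two-parameter models.
teacher = [0.0, 0.0]
student = [1.0, 2.0]
teacher = update_teacher(teacher, student, decay=0.5)

# Hypothetical class probabilities for one unlabeled node.
loss = consistency_loss([1.0, 0.0], [0.5, 0.5])
```

In practice the decay is usually set close to 1 so the teacher changes slowly; 0.5 is used here only to keep the arithmetic easy to follow.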
Rapid Prediction of Electron-Ionization Mass Spectrometry using Neural Networks
Title | Rapid Prediction of Electron-Ionization Mass Spectrometry using Neural Networks |
Authors | Jennifer N. Wei, David Belanger, Ryan P. Adams, D. Sculley |
Abstract | When confronted with a substance of unknown identity, researchers often perform mass spectrometry on the sample and compare the observed spectrum to a library of previously-collected spectra to identify the molecule. While popular, this approach will fail to identify molecules that are not in the existing library. In response, we propose to improve the library’s coverage by augmenting it with synthetic spectra that are predicted using machine learning. We contribute a lightweight neural network model that quickly predicts mass spectra for small molecules. Achieving high accuracy predictions requires a novel neural network architecture that is designed to capture typical fragmentation patterns from electron ionization. We analyze the effects of our modeling innovations on library matching performance and compare our models to prior machine learning-based work on spectrum prediction. |
Tasks | |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1811.08545v2 |
http://arxiv.org/pdf/1811.08545v2.pdf | |
PWC | https://paperswithcode.com/paper/predicting-electron-ionization-mass |
Repo | |
Framework | |
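The library-matching workflow in the abstract above (compare an observed spectrum against stored spectra, with the library augmented by predicted spectra) can be sketched as a nearest-neighbor lookup. Cosine similarity over binned intensity vectors is assumed as the comparison measure, since the abstract does not name one. The molecule names and intensity values are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_match(query, library):
    """Return the name of the library spectrum most similar to the query."""
    return max(library, key=lambda name: cosine_similarity(query, library[name]))

# Hypothetical measured library: intensities in four mass-to-charge bins.
library = {
    "molecule_a": [0.0, 10.0, 0.0, 90.0],
    "molecule_b": [50.0, 0.0, 50.0, 0.0],
}
# Augment the library with a model-predicted spectrum (values made up here).
library["molecule_c_predicted"] = [80.0, 20.0, 0.0, 0.0]

query = [75.0, 25.0, 0.0, 5.0]
match = best_match(query, library)
```

Without the predicted entry, the lookup would return the closest measured spectrum even though it is a poor match, which is the coverage failure the abstract describes.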