October 17, 2019


Paper Group ANR 736


Multimodal Image Captioning for Marketing Analysis


Title	Multimodal Image Captioning for Marketing Analysis
Authors	Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner
Abstract	Automatically captioning images with natural language sentences is an important research topic. State-of-the-art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not target specific objects of interest or emotional relationships between these objects in the image. However, marketing companies need descriptions of exactly these attributes of a given scene. In our case, objects of interest are consumer goods, which are usually identifiable by a product logo and are associated with certain brands. From a marketing point of view, it is desirable to also evaluate the emotional context of a trademarked product, i.e., whether it appears with a positive or a negative connotation. We address the problem of finding brands in images and deriving corresponding captions by introducing a modified image captioning network. We also add a third output modality, which simultaneously produces real-valued image ratings. Our network is trained using a classification-aware loss function in order to stimulate the generation of sentences with an emphasis on words identifying the brand of a product. We evaluate our model on a dataset of images depicting interactions between humans and branded products. The introduced network improves mean class accuracy by 24.5 percent. Thanks to adding the third output modality, it also considerably improves the quality of generated captions for images depicting branded products.
Tasks	Image Captioning
Published	2018-02-06
URL	https://arxiv.org/abs/1802.01958v2
PDF	https://arxiv.org/pdf/1802.01958v2.pdf
PWC	https://paperswithcode.com/paper/multimodal-image-captioning-for-marketing
Repo
Framework

Learning audio and image representations with bio-inspired trainable feature extractors


Title	Learning audio and image representations with bio-inspired trainable feature extractors
Authors	Nicola Strisciuglio
Abstract	Recent advancements in pattern recognition and signal processing concern the automatic learning of data representations from labeled training samples. Typical approaches are based on deep learning and convolutional neural networks, which require large amounts of labeled training samples. In this work, we propose novel feature extractors that can be used to learn the representation of single prototype samples in an automatic configuration process. We employ the proposed feature extractors in applications of audio and image processing, and show their effectiveness on benchmark data sets.
Tasks
Published	2018-01-02
URL	http://arxiv.org/abs/1801.00688v1
PDF	http://arxiv.org/pdf/1801.00688v1.pdf
PWC	https://paperswithcode.com/paper/learning-audio-and-image-representations-with
Repo
Framework

Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes


Title	Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes
Authors	Seyed Mostafa Kia, Andre Marquand
Abstract	Normative modeling has recently been proposed as an alternative to the case-control approach in modeling heterogeneity within clinical cohorts. Normative modeling is based on single-output Gaussian process regression that provides coherent estimates of uncertainty required by the method but does not consider spatial covariance structure. Here, we introduce a scalable multi-task Gaussian process regression (S-MTGPR) approach to address this problem. To this end, we exploit a combination of a low-rank approximation of the spatial covariance matrix with algebraic properties of the Kronecker product in order to reduce the computational complexity of Gaussian process regression in high-dimensional output spaces. On a public fMRI dataset, we show that S-MTGPR: 1) leads to substantial computational improvements that allow us to estimate normative models for high-dimensional fMRI data whilst accounting for spatial structure in data; 2) by modeling both spatial and across-sample variances, provides higher sensitivity in novelty detection scenarios.
Tasks	Gaussian Processes
Published	2018-06-04
URL	http://arxiv.org/abs/1806.01047v2
PDF	http://arxiv.org/pdf/1806.01047v2.pdf
PWC	https://paperswithcode.com/paper/normative-modeling-of-neuroimaging-data-using
Repo
Framework

Data Strategies for Fleetwide Predictive Maintenance


Title	Data Strategies for Fleetwide Predictive Maintenance
Authors	David Noever
Abstract	For predictive maintenance, we examine one of the largest public datasets for machine failures derived along with their corresponding precursors as error rates, historical part replacements, and sensor inputs. To simplify the time and accuracy comparison between 27 different algorithms, we treat the imbalance between normal and failing states with nominal under-sampling. We identify 3 promising regression and discriminant algorithms with both higher accuracy (96%) and twenty-fold faster execution times than previous work. Because predictive maintenance success hinges on input features prior to prediction, we provide a methodology to rank-order feature importance and show that for this dataset, error counts prove more predictive than scheduled maintenance might imply solely based on more traditional factors such as machine age or last replacement times.
Tasks	Feature Importance
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04446v1
PDF	http://arxiv.org/pdf/1812.04446v1.pdf
PWC	https://paperswithcode.com/paper/data-strategies-for-fleetwide-predictive
Repo
Framework
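
The abstract says the class imbalance between normal and failing states is handled by under-sampling, without giving the procedure. The following is a minimal sketch of plain random under-sampling, assuming integer class labels (0 = normal, 1 = failing); the helper name `undersample` and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        # Sample without replacement so no row is duplicated.
        for row in rng.sample(members, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    return balanced


# Toy data (made up): 95 "normal" records and 5 "failing" records.
rows = list(range(100))
labels = [0] * 95 + [1] * 5
balanced = undersample(rows, labels)
print(len(balanced), "rows after balancing")
```

Under-sampling discards majority-class data, which keeps training fast across many algorithms at the cost of ignoring part of the normal-state records.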

High-dimensional Varying Index Coefficient Models via Stein’s Identity


Title	High-dimensional Varying Index Coefficient Models via Stein’s Identity
Authors	Sen Na, Zhuoran Yang, Zhaoran Wang, Mladen Kolar
Abstract	We study the parameter estimation problem for a varying index coefficient model in high dimensions. Unlike most existing works that iteratively estimate the parameters and link functions, based on the generalized Stein’s identity, we propose computationally efficient estimators for the high-dimensional parameters without estimating the link functions. We consider two different setups where we either estimate each sparse parameter vector individually or estimate the parameters simultaneously as a sparse or low-rank matrix. For all these cases, our estimators are shown to achieve optimal statistical rates of convergence (up to logarithmic terms in the low-rank setting). Moreover, throughout our analysis, we only require the covariate to satisfy certain moment conditions, which is significantly weaker than the Gaussian or elliptically symmetric assumptions that are commonly made in the existing literature. Finally, we conduct extensive numerical experiments to corroborate the theoretical results.
Tasks
Published	2018-10-16
URL	https://arxiv.org/abs/1810.07128v4
PDF	https://arxiv.org/pdf/1810.07128v4.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-varying-index-coefficient
Repo
Framework

Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal Adenocarcinoma


Title	Multi-Scale Coarse-to-Fine Segmentation for Screening Pancreatic Ductal Adenocarcinoma
Authors	Zhuotun Zhu, Yingda Xia, Lingxi Xie, Elliot K. Fishman, Alan L. Yuille
Abstract	We propose an intuitive approach to detecting pancreatic ductal adenocarcinoma (PDAC), the most common type of pancreatic cancer, by checking abdominal CT scans. Our idea is named multi-scale segmentation-for-classification, which classifies volumes by checking whether a sufficient number of voxels are segmented as tumor, and which thereby provides radiologists with tumor locations. In order to deal with tumors of different scales, we train and test our volumetric segmentation networks with multi-scale inputs in a coarse-to-fine flowchart. A post-processing module is used to filter out outliers and reduce false alarms. We collect a new dataset containing 439 CT scans, in which 136 cases were diagnosed with PDAC and 303 cases are normal, which is, to the best of our knowledge, the largest set for PDAC tumors. To offer the best trade-off between sensitivity and specificity, our proposed framework reports a sensitivity of 94.1% at a specificity of 98.5%, which demonstrates the potential to make a clinical impact.
Tasks
Published	2018-07-09
URL	https://arxiv.org/abs/1807.02941v2
PDF	https://arxiv.org/pdf/1807.02941v2.pdf
PWC	https://paperswithcode.com/paper/multi-scale-coarse-to-fine-segmentation-for
Repo
Framework
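
The segmentation-for-classification rule described in the abstract (call a volume positive when enough voxels are segmented as tumor, and report where they are) can be illustrated with a short sketch. The function name `classify_volume`, the threshold value, and the toy mask are assumptions for illustration only; the paper's networks and post-processing module are not reproduced here.

```python
def classify_volume(mask, min_tumor_voxels):
    """Return (is_positive, tumor_locations) for a 3D binary mask.

    mask is a nested list indexed as mask[z][y][x], with 1 marking a
    voxel segmented as tumor. min_tumor_voxels is an assumed, tunable
    threshold, not a value from the paper.
    """
    locations = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value == 1
    ]
    return len(locations) >= min_tumor_voxels, locations


# Toy 2x2x3 segmentation output with three tumor voxels (made-up values).
mask = [
    [[0, 1, 0], [0, 1, 0]],
    [[0, 0, 0], [0, 1, 0]],
]
is_positive, locations = classify_volume(mask, min_tumor_voxels=3)
print(is_positive, locations)
```

Raising the threshold trades sensitivity for specificity, which is the trade-off the abstract reports on.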

An end-to-end approach for speeding up neural network inference


Title	An end-to-end approach for speeding up neural network inference
Authors	Charles Herrmann, Richard Strong Bowen, Ramin Zabih
Abstract	Important applications such as mobile computing require reducing the computational costs of neural network inference. Ideally, applications would specify their preferred tradeoff between accuracy and speed, and the network would optimize this end-to-end, using classification error to remove parts of the network \cite{lecun1990optimal,mozer1989skeletonization,BMVC2016_104}. Increasing speed can be done either during training – e.g., pruning filters \cite{li2016pruning} – or during inference – e.g., conditionally executing a subset of the layers \cite{aig}. We propose a single end-to-end framework that can improve inference efficiency in both settings. We introduce a batch activation loss and use Gumbel reparameterization to learn network structure \cite{aig,jang2016categorical}. We train end-to-end against batch activation loss combined with classification loss, and the same technique supports pruning as well as conditional computation. We obtain promising experimental results for ImageNet classification with ResNet \cite{he2016resnet} (45-52% less computation) and MobileNetV2 \cite{sandler2018mobilenetv2} (19-37% less computation).
Tasks
Published	2018-12-11
URL	http://arxiv.org/abs/1812.04180v3
PDF	http://arxiv.org/pdf/1812.04180v3.pdf
PWC	https://paperswithcode.com/paper/deep-networks-with-probabilistic-gates
Repo
Framework
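
The abstract mentions Gumbel reparameterization for learning which parts of the network to execute. As a rough illustration of the general Gumbel-softmax idea (not the paper's implementation, and without its batch activation loss), the sketch below draws a relaxed two-way gate, "execute" versus "skip", from a pair of logits. The function names, logits, and temperature are all hypothetical.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one standard Gumbel sample as -log(-log(u)), u ~ Uniform(0, 1)."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # avoid log(0)
    return -math.log(-math.log(u))


def relaxed_gate(logit_on, logit_off, temperature, rng):
    """Gumbel-softmax sample of a two-way gate.

    Returns the soft weight of the "execute" option, a value in [0, 1].
    Lower temperatures push the value towards a hard 0/1 decision.
    """
    a = (logit_on + gumbel_noise(rng)) / temperature
    b = (logit_off + gumbel_noise(rng)) / temperature
    m = max(a, b)  # subtract the max for numerical stability
    exp_a, exp_b = math.exp(a - m), math.exp(b - m)
    return exp_a / (exp_a + exp_b)


rng = random.Random(0)
gates = [relaxed_gate(2.0, 0.0, temperature=0.5, rng=rng) for _ in range(8)]
print([round(g, 3) for g in gates])
```

Because the noise is added to the logits and the softmax is differentiable, a gate of this kind can in principle be trained with ordinary backpropagation in an autodiff framework; this plain-Python version only shows the sampling step.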

Logographic Subword Model for Neural Machine Translation


Title	Logographic Subword Model for Neural Machine Translation
Authors	Yihao Fang, Rong Zheng, Xiaodan Zhu
Abstract	A novel logographic subword model is proposed to reinterpret logograms as abstract subwords for neural machine translation. Our approach drastically reduces the size of an artificial neural network, while maintaining BLEU scores comparable to those attained with the baseline RNN and CNN seq2seq models. The smaller model size also leads to shorter training and inference time. Experiments demonstrate that in the tasks of English-Chinese/Chinese-English translation, the reduction of those aspects can be from 11% to as high as 77%. Compared to previous subword models, abstract subwords can be applied to various logographic languages. Considering that most logographic languages are ancient and very low-resource languages, these advantages are very desirable for archaeological computational linguistic applications such as a resource-limited offline hand-held Demotic-English translator.
Tasks	Machine Translation
Published	2018-09-07
URL	http://arxiv.org/abs/1809.02592v1
PDF	http://arxiv.org/pdf/1809.02592v1.pdf
PWC	https://paperswithcode.com/paper/logographic-subword-model-for-neural-machine
Repo
Framework

Multi-task Learning over Graph Structures


Title	Multi-task Learning over Graph Structures
Authors	Pengfei Liu, Jie Fu, Yue Dong, Xipeng Qiu, Jackie Chi Kit Cheung
Abstract	We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labeling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines but also learn interpretable and transferable patterns across tasks.
Tasks	Multi-Task Learning, Text Classification, Transfer Learning
Published	2018-11-26
URL	http://arxiv.org/abs/1811.10211v1
PDF	http://arxiv.org/pdf/1811.10211v1.pdf
PWC	https://paperswithcode.com/paper/multi-task-learning-over-graph-structures
Repo
Framework

Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results


Title	Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results
Authors	Vishwa S. Parekh, Katarzyna J. Macura, Susan Harvey, Ihab Kamel, Riham El-Khouli, David A. Bluemke, Michael A. Jacobs
Abstract	A new paradigm is beginning to emerge in Radiology with the advent of increased computational capabilities and algorithms. This has enabled computer systems to learn different lesion types in real time, helping the radiologist define disease. For example, we developed and tested a multiparametric deep learning (MPDL) network for segmentation and classification using multiparametric magnetic resonance imaging (mpMRI) radiological images. The MPDL network was constructed from stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL consisted of cross-validation, sensitivity, and specificity. Dice similarity between MPDL and post-DCE lesions was evaluated. We demonstrate high sensitivity and specificity for differentiation of malignant from benign lesions of 90% and 85% respectively with an AUC of 0.93. The Integrated MPDL method accurately segmented and classified different breast tissues from multiparametric breast MRI using deep learning tissue signatures.
Tasks
Published	2018-02-10
URL	http://arxiv.org/abs/1802.08200v1
PDF	http://arxiv.org/pdf/1802.08200v1.pdf
PWC	https://paperswithcode.com/paper/multiparametric-deep-learning-tissue
Repo
Framework

Modularity in biological evolution and evolutionary computation


Title	Modularity in biological evolution and evolutionary computation
Authors	Anton Eremeev, Alexander Spirov
Abstract	One of the main properties of biological systems is modularity, which manifests itself at all levels of their organization, from the level of molecular genetics to the level of whole organisms and their communities. In a simplified form, these basic principles were transferred from the genetics of populations to the field of evolutionary computations, in order to solve applied optimization problems. Over almost half a century of development in this field of computer science, considerable practical experience has been gained and interesting theoretical results have been obtained. In this survey, the phenomena and patterns associated with modularity in genetics and evolutionary computations are compared. An analysis of similarities and differences in the results obtained in these areas is carried out from the modularity viewpoint. The possibilities for knowledge transfer between the areas are discussed.
Tasks	Transfer Learning
Published	2018-11-19
URL	http://arxiv.org/abs/1811.07511v1
PDF	http://arxiv.org/pdf/1811.07511v1.pdf
PWC	https://paperswithcode.com/paper/modularity-in-biological-evolution-and
Repo
Framework

Discovering Blind Spots in Reinforcement Learning


Title	Discovering Blind Spots in Reinforcement Learning
Authors	Ramya Ramakrishnan, Ece Kamar, Debadeepta Dey, Julie Shah, Eric Horvitz
Abstract	Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots.
Tasks	Calibration
Published	2018-05-23
URL	http://arxiv.org/abs/1805.08966v1
PDF	http://arxiv.org/pdf/1805.08966v1.pdf
PWC	https://paperswithcode.com/paper/discovering-blind-spots-in-reinforcement
Repo
Framework

Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering


Title	Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering
Authors	Konstantin Makarychev, Yury Makarychev, Ilya Razenshteyn
Abstract	Consider an instance of Euclidean $k$-means or $k$-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of $(1+\varepsilon)$ under a projection onto a random $O(\log(k / \varepsilon) / \varepsilon^2)$-dimensional subspace. Further, the cost of every clustering is preserved within $(1+\varepsilon)$. More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean $k$-clustering with the distances raised to the $p$-th power for any constant $p$. For $k$-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for $k$-medians, it answers a question raised by Kannan.
Tasks	Dimensionality Reduction
Published	2018-11-08
URL	http://arxiv.org/abs/1811.03195v1
PDF	http://arxiv.org/pdf/1811.03195v1.pdf
PWC	https://paperswithcode.com/paper/performance-of-johnson-lindenstrauss
Repo
Framework
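
To make the statement about cost preservation concrete, the sketch below measures the k-means cost of one fixed clustering before and after a random linear map. It uses an i.i.d. Gaussian matrix as the dimension reduction map (the abstract notes the result extends beyond subspace projections to maps with sub-Gaussian tails). The dimensions, the synthetic data, and the function names are illustrative assumptions, and the target dimension is not derived from the paper's bound.

```python
import math
import random


def kmeans_cost(points, assignment):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, cluster in zip(points, assignment):
        clusters.setdefault(cluster, []).append(point)
    cost = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        cost += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return cost


def gaussian_projection(points, target_dim, rng):
    """Map points to target_dim dimensions with an i.i.d. Gaussian
    matrix whose entries are scaled by 1/sqrt(target_dim)."""
    dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(target_dim)]
              for _ in range(dim)]
    return [[sum(p[i] * matrix[i][j] for i in range(dim))
             for j in range(target_dim)] for p in points]


rng = random.Random(0)
dim, target_dim = 200, 50
centers = [[0.0] * dim, [3.0] * dim]
assignment = [i % 2 for i in range(40)]  # fixed two-cluster labelling
points = [[centers[c][j] + rng.gauss(0.0, 1.0) for j in range(dim)]
          for c in assignment]

projected = gaussian_projection(points, target_dim, rng)
ratio = kmeans_cost(projected, assignment) / kmeans_cost(points, assignment)
print(f"cost ratio after projection: {ratio:.3f}")
```

A ratio close to 1 is what the theorem leads one to expect; this single run on synthetic data is only a sanity check, not evidence for the bound.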

Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning


Title	Every Node Counts: Self-Ensembling Graph Convolutional Networks for Semi-Supervised Learning
Authors	Yawei Luo, Tao Guan, Junqing Yu, Ping Liu, Yi Yang
Abstract	The graph convolutional network (GCN) provides a powerful means for graph-based semi-supervised tasks. However, as a localized first-order approximation of spectral graph convolution, the classic GCN cannot take full advantage of unlabeled data, especially when the unlabeled node is far from labeled ones. To capitalize on the information from unlabeled nodes to boost the training for GCN, we propose a novel framework named Self-Ensembling GCN (SEGCN), which marries GCN with Mean Teacher - another powerful model in semi-supervised learning. SEGCN contains a student model and a teacher model. As a student, it not only learns to correctly classify the labeled nodes, but also tries to be consistent with the teacher on unlabeled nodes in more challenging situations, such as a high dropout rate and graph collapse. As a teacher, it averages the student model weights and generates more accurate predictions to lead the student. In such a mutual-promoting process, both labeled and unlabeled samples can be fully utilized for backpropagating effective gradients to train GCN. In three article classification tasks, i.e. Citeseer, Cora and Pubmed, we validate that the proposed method matches the state of the art in classification accuracy.
Tasks
Published	2018-09-26
URL	http://arxiv.org/abs/1809.09925v1
PDF	http://arxiv.org/pdf/1809.09925v1.pdf
PWC	https://paperswithcode.com/paper/every-node-counts-self-ensembling-graph
Repo
Framework
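
The teacher/student interaction described in the abstract follows the Mean Teacher pattern: the teacher's weights are an exponential moving average of the student's, and the student is penalised for disagreeing with the teacher on unlabeled inputs. The sketch below shows only those two ingredients on flat lists of numbers. The function names, the decay value, and the squared-error consistency term are illustrative assumptions, not details confirmed by the abstract.

```python
def ema_update(teacher_weights, student_weights, decay=0.99):
    """Move each teacher weight towards the matching student weight.

    teacher <- decay * teacher + (1 - decay) * student
    """
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_weights, student_weights)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two predicted class distributions."""
    return sum((s - t) ** 2
               for s, t in zip(student_probs, teacher_probs)) / len(student_probs)


# Toy values (made up) standing in for model weights and node predictions.
teacher = [0.0, 1.0]
student = [1.0, 1.0]
teacher = ema_update(teacher, student, decay=0.9)
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])
print(teacher, loss)
```

In a real training loop the update would run after every student optimisation step, and the consistency term would be added to the supervised loss on labeled nodes.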

Rapid Prediction of Electron-Ionization Mass Spectrometry using Neural Networks


Title	Rapid Prediction of Electron-Ionization Mass Spectrometry using Neural Networks
Authors	Jennifer N. Wei, David Belanger, Ryan P. Adams, D. Sculley
Abstract	When confronted with a substance of unknown identity, researchers often perform mass spectrometry on the sample and compare the observed spectrum to a library of previously-collected spectra to identify the molecule. While popular, this approach will fail to identify molecules that are not in the existing library. In response, we propose to improve the library’s coverage by augmenting it with synthetic spectra that are predicted using machine learning. We contribute a lightweight neural network model that quickly predicts mass spectra for small molecules. Achieving high accuracy predictions requires a novel neural network architecture that is designed to capture typical fragmentation patterns from electron ionization. We analyze the effects of our modeling innovations on library matching performance and compare our models to prior machine learning-based work on spectrum prediction.
Tasks
Published	2018-11-21
URL	http://arxiv.org/abs/1811.08545v2
PDF	http://arxiv.org/pdf/1811.08545v2.pdf
PWC	https://paperswithcode.com/paper/predicting-electron-ionization-mass
Repo
Framework