Paper Group ANR 1140
A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data. Data-Efficient Graph Embedding Learning for PCB Component Detection. A two-stage 3D Unet framework for multi-class segmentation on full resolution image. Towards Modeling the Interaction of Spatial-Associative Neural Network Representations for Multisensory Perception …
A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data
Title | A Multi-task Learning Approach for Improving Product Title Compression with User Search Log Data |
Authors | Jingang Wang, Junfeng Tian, Long Qiu, Sheng Li, Jun Lang, Luo Si, Man Lan |
Abstract | Obtaining effective compression of lengthy product titles is a challenging and practical research problem for E-commerce. It is particularly important as more and more users browse mobile E-commerce apps, while merchants make the original product titles redundant and lengthy for Search Engine Optimization. Traditional text summarization approaches often incur substantial preprocessing costs and do not capture the important issue of conversion rate in E-commerce. This paper proposes a novel multi-task learning approach for improving product title compression with user search log data. In particular, a pointer network-based sequence-to-sequence approach with an attention mechanism is used for title compression as an extractive method, and an attentive encoder-decoder approach is used for generating user search queries. The encoding parameters (i.e., the semantic embeddings of original titles) are shared between the two tasks, and the attention distributions are jointly optimized. An extensive set of experiments with both human-annotated data and online deployment demonstrates the advantage of the proposed research in both compression quality and online business value. |
Tasks | Multi-Task Learning, Text Summarization |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01725v1 |
http://arxiv.org/pdf/1801.01725v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-learning-approach-for-improving |
Repo | |
Framework | |
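At its core, the extractive compression step amounts to scoring title tokens with an attention distribution and keeping the highest-scoring tokens in their original order. A minimal sketch, in which made-up tokens and raw scores stand in for a trained pointer network's attention output (none of the names below come from the paper):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def compress_title(tokens, scores, k):
    """Keep the k highest-attention tokens, preserving their original order.

    `scores` stands in for the attention distribution a trained pointer
    network would produce over the title tokens."""
    attn = softmax(np.asarray(scores, dtype=float))
    keep = sorted(np.argsort(-attn)[:k])          # top-k indices, original order
    return [tokens[i] for i in keep]

title = ["2018", "new", "women", "leather", "handbag", "casual", "tote", "bag"]
scores = [0.1, 0.2, 1.5, 2.0, 3.0, 0.3, 1.8, 2.5]
short = compress_title(title, scores, k=4)
```

The multi-task aspect of the paper lies in how these scores are learned (jointly with query generation), which this sketch does not attempt to reproduce.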
Data-Efficient Graph Embedding Learning for PCB Component Detection
Title | Data-Efficient Graph Embedding Learning for PCB Component Detection |
Authors | Chia-Wen Kuo, Jacob Ashmore, David Huggins, Zsolt Kira |
Abstract | This paper presents a challenging computer vision task, namely the detection of generic components on a PCB, and a novel set of deep-learning methods that are able to jointly leverage the appearance of individual components and the propagation of information across the structure of the board to accurately detect and identify various types of components on a PCB. Due to the expense of manual data labeling, a highly unbalanced distribution of component types, and significant domain shift across boards, most earlier attempts based on traditional image processing techniques fail to generalize well to PCB images of varying quality, lighting conditions, etc. Newer object detection pipelines such as Faster R-CNN, on the other hand, require a large amount of labeled data, do not deal with domain shift, and do not leverage structure. To address these issues, we propose a three-stage pipeline in which a class-agnostic region proposal network is followed by a low-shot similarity prediction classifier. In order to exploit the data dependency within a PCB, we design a novel Graph Network block to refine the component features conditioned on each PCB. To the best of our knowledge, this is one of the earliest attempts to train a deep learning based model for such tasks, and we demonstrate improvements over recent graph networks for this task. We also provide in-depth analysis and discussion for this challenging task, pointing to future research. |
Tasks | Graph Embedding, Object Detection |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06994v2 |
http://arxiv.org/pdf/1811.06994v2.pdf | |
PWC | https://paperswithcode.com/paper/data-efficient-graph-embedding-learning-for |
Repo | |
Framework | |
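The Graph Network block described above boils down to neighbourhood aggregation over the board's component graph. A hedged numpy sketch of one message-passing step; the adjacency matrix, features, and weight matrix below are toy stand-ins, not the paper's architecture:

```python
import numpy as np

def refine_features(H, A, W):
    """One message-passing step: average neighbour (and self) features,
    then apply a linear map and a ReLU. A is the adjacency matrix of the
    components on one board; H holds per-component appearance features."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    H_agg = (A_hat / deg) @ H                    # degree-normalised aggregation
    return np.maximum(0.0, H_agg @ W)

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # 3 components
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
W = np.eye(2)                                         # identity for illustration
H_refined = refine_features(H, A, W)
```

After one step each component's features mix with its board neighbours', which is the mechanism that lets context disambiguate visually similar components.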
A two-stage 3D Unet framework for multi-class segmentation on full resolution image
Title | A two-stage 3D Unet framework for multi-class segmentation on full resolution image |
Authors | Chengjia Wang, Tom MacGillivray, Gillian Macnaught, Guang Yang, David Newby |
Abstract | Deep convolutional neural networks (CNNs) have been intensively used for multi-class segmentation of data from different modalities and achieved state-of-the-art performances. However, a common problem when dealing with large, high-resolution 3D data is that the volumes input into the deep CNNs have to be either cropped or downsampled due to the limited memory capacity of computing devices. These operations lead to loss of resolution and increased class imbalance in the input data batches, which can degrade the performance of segmentation algorithms. Inspired by the architectures of the image super-resolution CNN (SRCNN) and the self-normalization network (SNN), we developed a two-stage modified Unet framework that simultaneously learns to detect a ROI within the full volume and to classify voxels without losing the original resolution. Experiments on a variety of multi-modal volumes demonstrated that, when trained with simply weighted Dice coefficients and our customized learning procedure, this framework achieves better segmentation performance than state-of-the-art deep CNNs with advanced similarity metrics. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04341v1 |
http://arxiv.org/pdf/1804.04341v1.pdf | |
PWC | https://paperswithcode.com/paper/a-two-stage-3d-unet-framework-for-multi-class |
Repo | |
Framework | |
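The weighted Dice objective mentioned in the abstract can be illustrated with a small numpy function. The per-class weights and the one-hot layout below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def weighted_dice(pred, target, weights, eps=1e-6):
    """Class-weighted soft Dice over one-hot volumes.

    pred, target: (num_classes, num_voxels) arrays; weights: per-class
    weights (e.g. inverse class frequency) to counter class imbalance."""
    inter = (pred * target).sum(axis=1)
    denom = pred.sum(axis=1) + target.sum(axis=1)
    dice_per_class = (2.0 * inter + eps) / (denom + eps)
    w = np.asarray(weights, dtype=float)
    return float((w * dice_per_class).sum() / w.sum())

# two classes, four voxels; a perfect prediction scores ~1.0
target = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)
perfect = weighted_dice(target, target, weights=[1.0, 3.0])
```

Up-weighting the rare class makes its overlap dominate the score, which is the usual rationale for weighting Dice under class imbalance.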
Towards Modeling the Interaction of Spatial-Associative Neural Network Representations for Multisensory Perception
Title | Towards Modeling the Interaction of Spatial-Associative Neural Network Representations for Multisensory Perception |
Authors | German I. Parisi, Jonathan Tong, Pablo Barros, Brigitte Röder, Stefan Wermter |
Abstract | Our daily perceptual experience is driven by different neural mechanisms that yield multisensory interaction as the interplay between exogenous stimuli and endogenous expectations. While the interaction of multisensory cues according to their spatiotemporal properties and the formation of multisensory feature-based representations have been widely studied, the interaction of spatial-associative neural representations has received considerably less attention. In this paper, we propose a neural network architecture that models the interaction of spatial-associative representations to perform causal inference of audiovisual stimuli. We investigate the spatial alignment of exogenous audiovisual stimuli modulated by associative congruence. In the spatial layer, topographically arranged networks account for the interaction of audiovisual input in terms of population codes. In the associative layer, congruent audiovisual representations are obtained via the experience-driven development of feature-based associations. Levels of congruency are obtained as a by-product of the neurodynamics of self-organizing networks, where the amount of neural activation triggered by the input can be expressed via a nonlinear distance function. Our novel proposal is that activity-driven levels of congruency can be used as top-down modulatory projections to spatially distributed representations of sensory input, e.g. semantically related audiovisual pairs will yield a higher level of integration than unrelated pairs. Furthermore, levels of neural response in unimodal layers may be seen as sensory reliability for the dynamic weighting of crossmodal cues. We describe a series of planned experiments to validate our model in the tasks of multisensory interaction on the basis of semantic congruence and unimodal cue reliability. |
Tasks | Causal Inference |
Published | 2018-07-13 |
URL | http://arxiv.org/abs/1807.05222v1 |
http://arxiv.org/pdf/1807.05222v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-modeling-the-interaction-of-spatial |
Repo | |
Framework | |
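Two quantitative ingredients of the model, Gaussian population codes in the spatial layer and congruency as a nonlinear function of distance to a best-matching prototype, can be sketched in a few lines. The positions, prototypes, and the exponential distance function below are illustrative choices, not the paper's trained self-organizing networks:

```python
import numpy as np

def population_code(stimulus_pos, pref_pos, sigma=1.0):
    """Gaussian population code: each neuron's response falls off with the
    distance between the stimulus and the neuron's preferred position."""
    return np.exp(-((stimulus_pos - pref_pos) ** 2) / (2.0 * sigma ** 2))

def congruency(x, prototypes):
    """Activity-driven congruency: a nonlinear (here exponential) function
    of the distance to the best-matching prototype."""
    d = np.min(np.linalg.norm(prototypes - x, axis=1))
    return float(np.exp(-d))

pref = np.linspace(-5, 5, 11)                # preferred positions of 11 neurons
resp = population_code(0.4, pref)            # response to a stimulus at 0.4

protos = np.array([[0.0, 0.0], [1.0, 1.0]])  # learned audiovisual associations
c_related = congruency(np.array([0.1, 0.0]), protos)
c_unrelated = congruency(np.array([3.0, 3.0]), protos)
```

Semantically related inputs land near a prototype and yield a higher congruency level, which is what the model feeds back as a top-down modulatory signal.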
Cause-Effect Deep Information Bottleneck For Systematically Missing Covariates
Title | Cause-Effect Deep Information Bottleneck For Systematically Missing Covariates |
Authors | Sonali Parbhoo, Mario Wieser, Aleksander Wieczorek, Volker Roth |
Abstract | Estimating the causal effects of an intervention from high-dimensional observational data is difficult due to the presence of confounding. The task is often complicated by the fact that we may have a systematic missingness in our data at test time. Our approach uses the information bottleneck to perform a low-dimensional compression of covariates by explicitly considering the relevance of information. Based on the sufficiently reduced covariate, we transfer the relevant information to cases where data is missing at test time, allowing us to reliably and accurately estimate the effects of an intervention, even where data is incomplete. Our results on causal inference benchmarks and a real application for treating sepsis show that our method achieves state-of-the-art performance, without sacrificing interpretability. |
Tasks | Causal Inference |
Published | 2018-07-06 |
URL | https://arxiv.org/abs/1807.02326v3 |
https://arxiv.org/pdf/1807.02326v3.pdf | |
PWC | https://paperswithcode.com/paper/cause-effect-deep-information-bottleneck-for |
Repo | |
Framework | |
Learning Better Internal Structure of Words for Sequence Labeling
Title | Learning Better Internal Structure of Words for Sequence Labeling |
Authors | Yingwei Xin, Ethan Hart, Vibhuti Mahajan, Jean-David Ruvini |
Abstract | Character-based neural models have recently proven very useful for many NLP tasks. However, there is a gap of sophistication between methods for learning representations of sentences and words. While most character models for learning representations of sentences are deep and complex, models for learning representations of words are shallow and simple. Also, in spite of considerable research on learning character embeddings, it is still not clear which kind of architecture is the best for capturing character-to-word representations. To address these questions, we first investigate the gaps between methods for learning word and sentence representations. We conduct detailed experiments and comparisons of different state-of-the-art convolutional models, and also investigate the advantages and disadvantages of their constituents. Furthermore, we propose IntNet, a funnel-shaped wide convolutional neural architecture with no down-sampling for learning representations of the internal structure of words by composing their characters from limited, supervised training corpora. We evaluate our proposed model on six sequence labeling datasets, including named entity recognition, part-of-speech tagging, and syntactic chunking. Our in-depth analysis shows that IntNet significantly outperforms other character embedding models and obtains new state-of-the-art performance without relying on any external knowledge or resources. |
Tasks | Chunking, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2018-10-29 |
URL | http://arxiv.org/abs/1810.12443v1 |
http://arxiv.org/pdf/1810.12443v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-better-internal-structure-of-words |
Repo | |
Framework | |
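IntNet's central idea, composing a word vector from its characters with wide convolutions and no down-sampling, can be caricatured in a few lines of numpy. Random embeddings and filters stand in for learned parameters, and the real model stacks many more filters in a funnel shape:

```python
import numpy as np

def char_conv_word(chars_emb, kernels):
    """Compose a word vector from character embeddings with 1-D
    convolutions (stride 1, no down-sampling) and max-over-time pooling.

    chars_emb: (word_len, emb_dim); kernels: list of (width, emb_dim) filters."""
    L, d = chars_emb.shape
    feats = []
    for K in kernels:
        w = K.shape[0]
        # valid convolution along the character axis
        acts = [np.sum(chars_emb[i:i + w] * K) for i in range(L - w + 1)]
        feats.append(max(acts))                  # max-over-time pooling
    return np.array(feats)

rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))                    # a 6-character word
kernels = [rng.normal(size=(w, 4)) for w in (2, 3, 4)]
word_vec = char_conv_word(emb, kernels)          # one feature per filter
```

Filters of several widths capture character n-grams of different lengths; the pooled features form the word representation fed to the sequence labeler.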
Cell Selection with Deep Reinforcement Learning in Sparse Mobile Crowdsensing
Title | Cell Selection with Deep Reinforcement Learning in Sparse Mobile Crowdsensing |
Authors | Leye Wang, Wenbin Liu, Daqing Zhang, Yasha Wang, En Wang, Yongjian Yang |
Abstract | Sparse Mobile CrowdSensing (MCS) is a novel MCS paradigm in which data inference is incorporated into the MCS process to reduce sensing costs while guaranteeing data quality. Since the sensed data from different cells (sub-areas) of the target sensing area will probably lead to diverse levels of inference data quality, cell selection (i.e., choosing the cells of the target area in which to collect sensed data from participants) is a critical issue that impacts the total amount of data that needs to be collected (i.e., data collection costs) to ensure a certain level of quality. To address this issue, this paper proposes a Deep Reinforcement learning based Cell selection mechanism for Sparse MCS, called DR-Cell. First, we properly model the key concepts in reinforcement learning, including state, action, and reward, and then propose to use a deep recurrent Q-network to learn the Q-function that helps decide which cell is a better choice under a certain state during cell selection. Furthermore, we leverage transfer learning techniques to reduce the amount of data required for training the Q-function when multiple correlated MCS tasks need to be conducted in the same target area. Experiments on various real-life sensing datasets verify the effectiveness of DR-Cell over state-of-the-art cell selection mechanisms in Sparse MCS, reducing the number of sensed cells by up to 15% with the same data inference quality guarantee. |
Tasks | Transfer Learning |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07047v2 |
http://arxiv.org/pdf/1804.07047v2.pdf | |
PWC | https://paperswithcode.com/paper/cell-selection-with-deep-reinforcement |
Repo | |
Framework | |
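The Q-learning backbone of DR-Cell, before the deep recurrent network is added, is the standard tabular update. A minimal sketch with toy states and rewards (the state and action encodings below are invented for illustration):

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s, a) toward the bootstrapped
    target r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (target - Q[state, action])
    return Q

Q = np.zeros((3, 4))        # 3 sensing states, 4 candidate cells
Q = q_update(Q, state=0, action=2, reward=1.0, next_state=1)
best_cell = int(Q[0].argmax())   # greedy cell choice in state 0
```

In the paper the table is replaced by a deep recurrent Q-network, and the reward reflects inference quality gained per cell sensed.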
Tuning metaheuristics by sequential optimization of regression models
Title | Tuning metaheuristics by sequential optimization of regression models |
Authors | Áthila R. Trindade, Felipe Campelo |
Abstract | Tuning parameters is an important step for the application of metaheuristics to problem classes of interest. In this work we present a tuning framework based on the sequential optimization of perturbed regression models. Besides providing algorithm configurations with good expected performance, the proposed methodology can also provide insights on the relevance of each parameter and their interactions, as well as models of expected algorithm performance for a given problem class, conditional on the parameter values. A test case is presented for the tuning of six parameters of a decomposition-based multiobjective optimization algorithm, in which an instantiation of the proposed framework is compared against the results obtained by the most recent version of the Iterated Racing (Irace) procedure. The results suggest that the proposed approach returns solutions that are as good as those of Irace in terms of mean performance, with the advantage of providing more information on the relevance and effect of each parameter on the expected performance of the algorithm. |
Tasks | Multiobjective Optimization |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03646v2 |
http://arxiv.org/pdf/1809.03646v2.pdf | |
PWC | https://paperswithcode.com/paper/tuning-metaheuristics-by-sequential |
Repo | |
Framework | |
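The sequential-optimization loop (fit a regression model to observed parameter/performance pairs, evaluate where the model predicts the best performance, refit) can be sketched as follows. The quadratic surrogate and toy objective are illustrative; the paper uses perturbed regression models and a richer experimental design:

```python
import numpy as np

def tune(evaluate, candidates, n_init=4, n_iter=6, seed=0):
    """Sequential model-based tuning sketch over a 1-D parameter grid:
    fit a quadratic regression to the observations, then evaluate the
    candidate the model predicts to be best, and refit."""
    rng = np.random.default_rng(seed)
    tried = list(rng.choice(len(candidates), size=n_init, replace=False))
    ys = [evaluate(candidates[i]) for i in tried]
    for _ in range(n_iter):
        X = np.array([[1.0, candidates[i], candidates[i] ** 2] for i in tried])
        beta, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)
        preds = np.array([[1.0, c, c ** 2] for c in candidates]) @ beta
        nxt = int(np.argmin(preds))              # model's predicted optimum
        if nxt not in tried:
            tried.append(nxt)
            ys.append(evaluate(candidates[nxt]))
    return candidates[tried[int(np.argmin(ys))]]

candidates = np.linspace(0.0, 2.0, 21)
best = tune(lambda p: (p - 1.3) ** 2, candidates)   # toy performance function
```

Because the fitted model is explicit, its coefficients directly expose the effect of the parameter on expected performance, which is the interpretability advantage the abstract claims.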
Non-convex non-local flows for saliency detection
Title | Non-convex non-local flows for saliency detection |
Authors | Iván Ramírez, Gonzalo Galiano, Emanuele Schiavi |
Abstract | We propose and numerically solve a new variational model for automatic saliency detection in digital images. Using a non-local framework we consider a family of edge preserving functions combined with a new quadratic saliency detection term. Such term defines a constrained bilateral obstacle problem for image classification driven by p-Laplacian operators, including the so-called hyper-Laplacian case (0 < p < 1). The related non-convex non-local reactive flows are then considered and applied for glioblastoma segmentation in magnetic resonance fluid-attenuated inversion recovery (MRI-Flair) images. A fast convolutional-kernel-based approximate solution is computed. The numerical experiments show how the non-convexity related to the hyper-Laplacian operators provides monotonically better results in terms of the standard metrics. |
Tasks | Image Classification, Saliency Detection |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09408v1 |
http://arxiv.org/pdf/1805.09408v1.pdf | |
PWC | https://paperswithcode.com/paper/non-convex-non-local-flows-for-saliency |
Repo | |
Framework | |
Improved Complexities of Conditional Gradient-Type Methods with Applications to Robust Matrix Recovery Problems
Title | Improved Complexities of Conditional Gradient-Type Methods with Applications to Robust Matrix Recovery Problems |
Authors | Dan Garber, Shoham Sabach, Atara Kaplan |
Abstract | Motivated by robust matrix recovery problems such as Robust Principal Component Analysis, we consider a general optimization problem of minimizing a smooth and strongly convex loss function applied to the sum of two blocks of variables, where each block of variables is constrained or regularized individually. We study a Conditional Gradient-Type method which is able to leverage the special structure of the problem to obtain faster convergence rates than those attainable via standard methods, under a variety of assumptions. In particular, our method is appealing for matrix problems in which one of the blocks corresponds to a low-rank matrix since it avoids prohibitive full-rank singular value decompositions required by most standard methods. While our initial motivation comes from problems which originated in statistics, our analysis does not impose any statistical assumptions on the data. |
Tasks | |
Published | 2018-02-15 |
URL | https://arxiv.org/abs/1802.05581v3 |
https://arxiv.org/pdf/1802.05581v3.pdf | |
PWC | https://paperswithcode.com/paper/fast-generalized-conditional-gradient-method |
Repo | |
Framework | |
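The key computational advantage of conditional gradient methods, replacing projections (and full SVDs in the matrix case) with a linear minimisation oracle, is easiest to see for the l1 ball, whose oracle returns a signed vertex. A hedged sketch of the classical Frank-Wolfe iteration on a toy quadratic, not the paper's two-block method:

```python
import numpy as np

def frank_wolfe_l1(grad_f, radius, dim, n_iter=200):
    """Conditional gradient over the l1 ball: each step solves a linear
    minimisation oracle, which for the l1 ball is a single signed vertex,
    so no projection step is ever needed."""
    x = np.zeros(dim)
    for t in range(n_iter):
        g = grad_f(x)
        i = int(np.argmax(np.abs(g)))
        s = np.zeros(dim)
        s[i] = -radius * np.sign(g[i])           # LMO vertex
        gamma = 2.0 / (t + 2.0)                  # standard step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# minimise f(x) = 0.5 * ||x - b||^2 subject to ||x||_1 <= 1
b = np.array([2.0, 0.0, 0.0])
x_star = frank_wolfe_l1(lambda x: x - b, radius=1.0, dim=3)
```

For a nuclear-norm ball the analogous oracle needs only the top singular pair rather than a full SVD, which is the cost saving the abstract highlights for low-rank matrix blocks.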
The Data Science of Hollywood: Using Emotional Arcs of Movies to Drive Business Model Innovation in Entertainment Industries
Title | The Data Science of Hollywood: Using Emotional Arcs of Movies to Drive Business Model Innovation in Entertainment Industries |
Authors | Marco Del Vecchio, Alexander Kharlamov, Glenn Parry, Ganna Pogrebna |
Abstract | Much of business literature addresses the issues of consumer-centric design: how can businesses design customized services and products which accurately reflect consumer preferences? This paper uses data science natural language processing methodology to explore whether and to what extent emotions shape consumer preferences for media and entertainment content. Using a unique filtered dataset of 6,174 movie scripts, we generate a mapping of screen content to capture the emotional trajectory of each motion picture. We then combine the obtained mappings into clusters which represent groupings of consumer emotional journeys. These clusters are used to predict overall success parameters of the movies including box office revenues, viewer satisfaction levels (captured by IMDb ratings), awards, as well as the number of viewers’ and critics’ reviews. We find that, like books, all movie stories are dominated by six basic shapes. The highest box-office revenues are associated with the Man in a Hole shape, which is characterized by an emotional fall followed by an emotional rise. This shape results in financially successful movies irrespective of genre and production budget. Yet, Man in a Hole succeeds not because it produces most “liked” movies but because it generates most “talked about” movies. Interestingly, a carefully chosen combination of production budget and genre may produce a financially successful movie with any emotional shape. Implications of this analysis for generating on-demand content and for driving business model innovation in entertainment industries are discussed. |
Tasks | |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02221v2 |
http://arxiv.org/pdf/1807.02221v2.pdf | |
PWC | https://paperswithcode.com/paper/the-data-science-of-hollywood-using-emotional |
Repo | |
Framework | |
Predicting the Semantic Textual Similarity with Siamese CNN and LSTM
Title | Predicting the Semantic Textual Similarity with Siamese CNN and LSTM |
Authors | Elvys Linhares Pontes, Stéphane Huet, Andréa Carneiro Linhares, Juan-Manuel Torres-Moreno |
Abstract | Semantic Textual Similarity (STS) is the basis of many applications in Natural Language Processing (NLP). Our system combines convolution and recurrent neural networks to measure the semantic similarity of sentences. It uses a convolution network to take account of the local context of words and an LSTM to consider the global context of sentences. This combination of networks helps to preserve the relevant information of sentences and improves the calculation of the similarity between sentences. Our model has achieved good results and is competitive with the best state-of-the-art systems. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-10-24 |
URL | http://arxiv.org/abs/1810.10641v1 |
http://arxiv.org/pdf/1810.10641v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-the-semantic-textual-similarity |
Repo | |
Framework | |
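A much-simplified baseline for the same task replaces the CNN+LSTM encoder with an average of word vectors and compares sentences by cosine similarity. The embeddings below are toy values, not trained vectors, and this is not the paper's model:

```python
import numpy as np

def sentence_vec(tokens, emb):
    """Average the word vectors of a sentence (a crude stand-in for the
    CNN+LSTM encoder in the paper)."""
    return np.mean([emb[t] for t in tokens], axis=0)

def cosine_sim(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

emb = {"a": np.array([1.0, 0.0]), "cat": np.array([0.0, 1.0]),
       "dog": np.array([0.2, 0.9]), "car": np.array([1.0, -1.0])}
s1 = sentence_vec(["a", "cat"], emb)
s2 = sentence_vec(["a", "dog"], emb)
s3 = sentence_vec(["a", "car"], emb)
sim_related = cosine_sim(s1, s2)      # semantically close pair
sim_unrelated = cosine_sim(s1, s3)    # semantically distant pair
```

The paper's contribution is the encoder: convolutions for local word context and an LSTM for global sentence context, trained so that this final similarity is better calibrated than the averaging baseline.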
Fully Understanding the Hashing Trick
Title | Fully Understanding the Hashing Trick |
Authors | Casper Benjamin Freksen, Lior Kamma, Kasper Green Larsen |
Abstract | Feature hashing, also known as {\em the hashing trick}, introduced by Weinberger et al. (2009), is one of the key techniques used in scaling up machine learning algorithms. Loosely speaking, feature hashing uses a random sparse projection matrix $A : \mathbb{R}^n \to \mathbb{R}^m$ (where $m \ll n$) in order to reduce the dimension of the data from $n$ to $m$ while approximately preserving the Euclidean norm. Every column of $A$ contains exactly one non-zero entry, equal to either $-1$ or $1$. Weinberger et al. showed tail bounds on $\|Ax\|_2^2$. Specifically, they showed that for every $\varepsilon, \delta$, if $\|x\|_{\infty} / \|x\|_2$ is sufficiently small, and $m$ is sufficiently large, then $$\Pr[\, \big|\|Ax\|_2^2 - \|x\|_2^2\big| < \varepsilon \|x\|_2^2 \,] \ge 1 - \delta\,.$$ These bounds were later extended by Dasgupta et al. (2010) and most recently refined by Dahlgaard et al. (2017); however, the true nature of the performance of this key technique, and specifically the correct tradeoff between the pivotal parameters $\|x\|_{\infty} / \|x\|_2$, $m$, $\varepsilon$, $\delta$, remained an open question. We settle this question by giving tight asymptotic bounds on the exact tradeoff between the central parameters, thus providing a complete understanding of the performance of feature hashing. We complement the asymptotic bound with empirical data, which shows that the constants “hiding” in the asymptotic notation are, in fact, very close to $1$, thus further illustrating the tightness of the presented bounds in practice. |
Tasks | |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08539v1 |
http://arxiv.org/pdf/1805.08539v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-understanding-the-hashing-trick |
Repo | |
Framework | |
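The scheme itself is easy to state in code: each coordinate of $x$ is sent to one random bucket with a random sign, which preserves the squared Euclidean norm in expectation. A small numpy sketch; the bucket count, input size, and seeds are arbitrary:

```python
import numpy as np

def feature_hash(x, m, seed=0):
    """Feature hashing: map x in R^n to R^m with a random sparse sign
    matrix. Each input coordinate lands in exactly one bucket, multiplied
    by a random +/-1 sign."""
    rng = np.random.default_rng(seed)
    n = len(x)
    bucket = rng.integers(0, m, size=n)          # h: [n] -> [m]
    sign = rng.choice([-1.0, 1.0], size=n)       # sigma: [n] -> {-1, +1}
    y = np.zeros(m)
    for i in range(n):
        y[bucket[i]] += sign[i] * x[i]
    return y

rng = np.random.default_rng(1)
x = rng.normal(size=1000)                        # dense x => small ||x||_inf/||x||_2
y = feature_hash(x, m=300)
ratio = np.linalg.norm(y) ** 2 / np.linalg.norm(x) ** 2   # should be close to 1
```

The abstract's condition on $\|x\|_\infty / \|x\|_2$ captures exactly why the dense vector above is a favourable input: no single coordinate dominates, so hash collisions cancel out on average.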
WISER: A Semantic Approach for Expert Finding in Academia based on Entity Linking
Title | WISER: A Semantic Approach for Expert Finding in Academia based on Entity Linking |
Authors | Paolo Cifariello, Paolo Ferragina, Marco Ponza |
Abstract | We present WISER, a new semantic search engine for expert finding in academia. Our system is unsupervised and it jointly combines classical language modeling techniques, based on text evidences, with the Wikipedia Knowledge Graph, via entity linking. WISER indexes each academic author through a novel profiling technique which models her expertise with a small, labeled and weighted graph drawn from Wikipedia. Nodes in this graph are the Wikipedia entities mentioned in the author’s publications, whereas the weighted edges express the semantic relatedness among these entities computed via textual and graph-based relatedness functions. Every node is also labeled with a relevance score which models the pertinence of the corresponding entity to the author’s expertise, and is computed by means of a proper random-walk calculation over that graph; and with a latent vector representation which is learned via entity and other kinds of structural embeddings derived from Wikipedia. At query time, experts are retrieved by combining classic document-centric approaches, which exploit the occurrences of query terms in the author’s documents, with a novel set of profile-centric scoring strategies, which compute the semantic relatedness between the author’s expertise and the query topic via the above graph-based profiles. The effectiveness of our system is established over a large-scale experimental test on a standard dataset for this task. We show that WISER achieves better performance than all the other competitors, thus proving the effectiveness of modelling an author’s profile via our “semantic” graph of entities. Finally, we comment on the use of WISER for indexing and profiling the whole research community within the University of Pisa, and its application to technology transfer in our University. |
Tasks | Entity Linking, Language Modelling |
Published | 2018-05-10 |
URL | https://arxiv.org/abs/1805.03947v2 |
https://arxiv.org/pdf/1805.03947v2.pdf | |
PWC | https://paperswithcode.com/paper/wiser-a-semantic-approach-for-expert-finding |
Repo | |
Framework | |
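The random-walk relevance scores over an author's entity graph can be illustrated with a PageRank-style power iteration on a toy relatedness graph. The graph and damping factor below are illustrative, not WISER's actual profiles or parameters:

```python
import numpy as np

def random_walk_scores(A, damping=0.85, n_iter=100):
    """Relevance of each entity via a PageRank-style random walk over a
    weighted entity-relatedness graph (rows/columns = entities)."""
    n = A.shape[0]
    P = A / A.sum(axis=1, keepdims=True)         # row-stochastic transitions
    r = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        r = (1 - damping) / n + damping * (P.T @ r)
    return r / r.sum()

# toy relatedness graph: entity 0 is strongly related to both 1 and 2
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.1],
              [1.0, 0.1, 0.0]])
scores = random_walk_scores(A)
central = int(scores.argmax())
```

Entities that are well connected to the rest of the author's graph accumulate random-walk mass, so they end up ranked as most pertinent to the author's expertise.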
3D Human Action Recognition with Siamese-LSTM Based Deep Metric Learning
Title | 3D Human Action Recognition with Siamese-LSTM Based Deep Metric Learning |
Authors | Seyma Yucer, Yusuf Sinan Akgul |
Abstract | This paper proposes a new 3D Human Action Recognition system as a two-phase system: (1) a Deep Metric Learning Module which learns a similarity metric between two 3D joint sequences using Siamese-LSTM networks; (2) a Multiclass Classification Module that uses the output of the first module to produce the final recognition output. This model has several advantages: the first module is trained with a larger set of data because it uses many combinations of sequence pairs. Our deep metric learning module can also be trained independently of the datasets, which makes our system modular and generalizable. We tested the proposed system on standard and newly introduced datasets, and the initial results are promising. We will continue developing this system by adding more sophisticated LSTM blocks and by cross-training between different datasets. |
Tasks | 3D Human Action Recognition, Metric Learning, Temporal Action Localization |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02131v1 |
http://arxiv.org/pdf/1807.02131v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-human-action-recognition-with-siamese-lstm |
Repo | |
Framework | |
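Siamese metric-learning modules of this kind are commonly trained with a contrastive loss on pairwise distances; a minimal numpy version follows. The margin and the toy distances are illustrative, and the paper's exact loss may differ:

```python
import numpy as np

def contrastive_loss(d, same, margin=1.0):
    """Contrastive loss on pairwise distances d: pull genuine pairs
    together (same=1) and push impostor pairs at least `margin` apart
    (same=0)."""
    d = np.asarray(d, dtype=float)
    same = np.asarray(same, dtype=float)
    pos = same * d ** 2                                   # genuine pairs
    neg = (1.0 - same) * np.maximum(0.0, margin - d) ** 2 # impostor pairs
    return float(np.mean(0.5 * (pos + neg)))

# distances between Siamese embeddings for two pairs of joint sequences
loss_good = contrastive_loss([0.1, 1.5], [1, 0])  # close genuine, far impostor
loss_bad = contrastive_loss([1.5, 0.1], [1, 0])   # the reverse
```

Training on pairs is what gives the module its data advantage: the number of usable pair combinations grows quadratically in the number of labeled sequences.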