April 2, 2020

3056 words 15 mins read

Paper Group ANR 159


Adversarial Multimodal Representation Learning for Click-Through Rate Prediction. Quantifying the relationship between student enrollment patterns and student performance. Demographic Bias in Biometrics: A Survey on an Emerging Challenge. A Time Series Approach To Player Churn and Conversion in Videogames. RN-VID: A Feature Fusion Architecture for …

Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Title Adversarial Multimodal Representation Learning for Click-Through Rate Prediction
Authors Xiang Li, Chao Wang, Jiwei Tan, Xiaoyi Zeng, Dan Ou, Bo Zheng
Abstract For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although numerous CTR prediction models have been proposed, learning good representations of items from multimodal features remains under-investigated, considering that an item in E-commerce usually contains multiple heterogeneous modalities. Previous works either concatenate the multiple modality features, which is equivalent to giving a fixed importance weight to each modality, or learn dynamic weights of different modalities for different items through techniques such as the attention mechanism. However, there usually exists common redundant information across multiple modalities, and dynamic weights computed from this redundant information may not correctly reflect the different importance of each modality. To address this, we explore the complementarity and redundancy of modalities by treating modality-specific and modality-invariant features differently. We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task. A multimodal attention network first calculates the weights of multiple modalities for each item according to its modality-specific features. Then a multimodal adversarial network learns modality-invariant representations, where a double-discriminator strategy is introduced. Finally, we obtain the multimodal item representations by combining both modality-specific and modality-invariant representations. We conduct extensive experiments on both public and industrial datasets, and the proposed method consistently achieves remarkable improvements over state-of-the-art methods. Moreover, the approach has been deployed in an operational E-commerce system, and online A/B testing further demonstrates its effectiveness.
Tasks Click-Through Rate Prediction, Representation Learning
Published 2020-03-07
URL https://arxiv.org/abs/2003.07162v1
PDF https://arxiv.org/pdf/2003.07162v1.pdf
PWC https://paperswithcode.com/paper/adversarial-multimodal-representation
Repo
Framework
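
The MARN entry above combines modality-specific attention with an adversarial network that learns modality-invariant features. The following is a minimal, hedged PyTorch sketch of that general idea, not the authors' implementation: it substitutes a single gradient-reversal discriminator for the paper's double-discriminator strategy, and all layer sizes, names, and the additive combination at the end are illustrative assumptions.

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        # Reverses gradients so the shared encoders drift toward modality-invariant features.
        @staticmethod
        def forward(ctx, x):
            return x.clone()
        @staticmethod
        def backward(ctx, grad):
            return -grad

    class MultimodalItemEncoder(nn.Module):
        def __init__(self, dims, hidden=64):
            super().__init__()
            self.specific = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
            self.shared = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
            self.attn = nn.Linear(hidden, 1)            # one attention score per modality
            self.disc = nn.Linear(hidden, len(dims))    # guesses which modality a feature came from

        def forward(self, feats):                       # feats: list of [B, d_m] tensors
            spec = [torch.relu(f(x)) for f, x in zip(self.specific, feats)]
            inv = [torch.relu(f(x)) for f, x in zip(self.shared, feats)]
            # Attention weights are computed from modality-specific features only.
            w = torch.softmax(torch.cat([self.attn(s) for s in spec], dim=1), dim=1)
            spec_repr = sum(w[:, m:m + 1] * spec[m] for m in range(len(feats)))
            inv_repr = torch.stack(inv, dim=1).mean(dim=1)
            disc_logits = [self.disc(GradReverse.apply(h)) for h in inv]
            return spec_repr + inv_repr, disc_logits

    enc = MultimodalItemEncoder(dims=[16, 32, 8])       # e.g. image, text, statistics features
    item_repr, logits = enc([torch.randn(4, 16), torch.randn(4, 32), torch.randn(4, 8)])
    print(item_repr.shape)                              # torch.Size([4, 64])

Training such a sketch would add a cross-entropy loss on disc_logits (pushing the shared encoders toward features the discriminator cannot attribute to any single modality) alongside the downstream CTR loss.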

Quantifying the relationship between student enrollment patterns and student performance

Title Quantifying the relationship between student enrollment patterns and student performance
Authors Shahab Boumi, Adan Vela, Jacquelyn Chini
Abstract College students enroll each semester with either part-time or full-time status. While most students maintain a constant enrollment status throughout their education, some frequently change between full-time and part-time status from one semester to the next. The goal of this research is to exploit these historical patterns to estimate and categorize students' strategies into three groups (part-time, full-time, and mixed), investigate the educational features of each group, and compare their performance. Enrollment strategy refers to a student's mindset regarding their enrollment plan and can, in part, be captured from the student's historical enrollment status. Data are collected from the University of Central Florida from 2008 to 2017, and a Hidden Markov Model is applied to identify the different types of student strategy. Results show that students with a Mixed Enrollment Strategy (MES) have features (e.g., time to graduation, graduation ratio, and halt-enrollment ratio) and performance (e.g., cumulative GPA) that fall between those of students with a Full-time Enrollment Strategy (FES) and students with a Part-time Enrollment Strategy (PES).
Tasks
Published 2020-03-22
URL https://arxiv.org/abs/2003.10874v1
PDF https://arxiv.org/pdf/2003.10874v1.pdf
PWC https://paperswithcode.com/paper/quantifying-the-relationship-between-student
Repo
Framework
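
The enrollment-strategy paper above decodes semester-by-semester full-time/part-time sequences with a Hidden Markov Model. Below is a hedged, self-contained Python sketch of that idea using hand-picked (hypothetical) transition and emission probabilities rather than parameters fit to the UCF data.

    import numpy as np

    states = ["FT-leaning", "PT-leaning"]          # hidden enrollment mindsets
    obs_symbols = {"F": 0, "P": 1}                 # observed status per semester
    start = np.array([0.6, 0.4])
    trans = np.array([[0.9, 0.1],                  # strategies tend to persist
                      [0.1, 0.9]])
    emit = np.array([[0.85, 0.15],                 # FT-leaning mostly enrolls full time
                     [0.20, 0.80]])

    def viterbi(seq):
        obs = [obs_symbols[s] for s in seq]
        T, K = len(obs), len(states)
        logp = np.log(start) + np.log(emit[:, obs[0]])
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            scores = logp[:, None] + np.log(trans) + np.log(emit[:, obs[t]])[None, :]
            back[t] = scores.argmax(axis=0)
            logp = scores.max(axis=0)
        path = [int(logp.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(back[t, path[-1]])
        return [states[k] for k in reversed(path)]

    decoded = viterbi("FFFPPFFPPP")                # one student's semester history
    strategy = "MES" if len(set(decoded)) > 1 else ("FES" if decoded[0] == "FT-leaning" else "PES")
    print(decoded, strategy)

A history decoded into a single hidden state maps to FES or PES, while switching between hidden states over the history suggests a mixed strategy (MES).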

Demographic Bias in Biometrics: A Survey on an Emerging Challenge

Title Demographic Bias in Biometrics: A Survey on an Emerging Challenge
Authors P. Drozdowski, C. Rathgeb, A. Dantcheva, N. Damer, C. Busch
Abstract Systems incorporating biometric technologies have become ubiquitous in personal, commercial, and governmental identity management applications. Both cooperative (e.g. access control) and non-cooperative (e.g. surveillance and forensics) systems have benefited from biometrics. Such systems rely on the uniqueness of certain biological or behavioural characteristics of human beings, which enable individuals to be reliably recognised using automated algorithms. Recently, however, there has been a wave of public and academic concerns regarding the existence of systemic bias in automated decision systems (including biometrics). Most prominently, face recognition algorithms have often been labelled as “racist” or “biased” by the media, non-governmental organisations, and researchers alike. The main contributions of this article are: (1) an overview of the topic of algorithmic bias in the context of biometrics, (2) a comprehensive survey of the existing literature on biometric bias estimation and mitigation, (3) a discussion of the pertinent technical and social matters, and (4) an outline of the remaining challenges and future work items, both from technological and social points of view.
Tasks Face Recognition
Published 2020-03-05
URL https://arxiv.org/abs/2003.02488v1
PDF https://arxiv.org/pdf/2003.02488v1.pdf
PWC https://paperswithcode.com/paper/demographic-bias-in-biometrics-a-survey-on-an
Repo
Framework

A Time Series Approach To Player Churn and Conversion in Videogames

Title A Time Series Approach To Player Churn and Conversion in Videogames
Authors Ana Fernández del Río, Anna Guitart, África Periáñez
Abstract Players of a free-to-play game are divided into three main groups: non-paying active users, paying active users, and inactive users. A state-space time series approach is then used to model the daily conversion rates between the different groups, i.e., the probability of transitioning from one group to another. This allows not only for predictions of how these rates will evolve, but also for a deeper understanding of the impact of in-game planning and calendar effects. It is also used in this work to detect marketing and promotion campaigns about which no information is available. In particular, two different state-space formulations are considered and compared: an Autoregressive Integrated Moving Average process and an Unobserved Components approach, in both cases with a linear regression on explanatory variables. Both yield very similar estimates of the covariate parameters, producing forecasts with similar performance for most transition rates. While the Unobserved Components approach is more robust and needs less human intervention with regard to model definition, it produces significantly worse forecasts for the non-paying user abandonment probability. More critically, it also fails to detect a plausible marketing and promotion campaign scenario.
Tasks Time Series
Published 2020-03-13
URL https://arxiv.org/abs/2003.10287v1
PDF https://arxiv.org/pdf/2003.10287v1.pdf
PWC https://paperswithcode.com/paper/a-time-series-approach-to-player-churn-and
Repo
Framework
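
The churn paper above compares an ARIMA-with-regression formulation against an unobserved-components model, both in state-space form with explanatory variables. The sketch below shows, under stated assumptions, how the two could be set up with statsmodels on a synthetic daily conversion-rate series; the covariate, the series, and the model orders are illustrative, not the paper's.

    import numpy as np
    from statsmodels.tsa.statespace.sarimax import SARIMAX
    from statsmodels.tsa.statespace.structural import UnobservedComponents

    rng = np.random.default_rng(0)
    n = 200
    promo = (rng.random(n) < 0.1).astype(float)                       # hypothetical campaign flag
    rate = 0.02 + 0.01 * promo + np.cumsum(rng.normal(0, 1e-4, n))    # daily conversion rate

    arimax = SARIMAX(rate, exog=promo, order=(1, 1, 1)).fit(disp=False)
    uc = UnobservedComponents(rate, level="local linear trend", exog=promo).fit(disp=False)

    future_promo = np.zeros((14, 1))                                  # two-week forecast, no campaigns
    print(arimax.forecast(14, exog=future_promo)[:3])
    print(uc.forecast(14, exog=future_promo)[:3])

Both models estimate a coefficient for the campaign covariate, which is how an unannounced promotion can show up as an otherwise unexplained shift in the fitted conversion rate.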

RN-VID: A Feature Fusion Architecture for Video Object Detection

Title RN-VID: A Feature Fusion Architecture for Video Object Detection
Authors Hughes Perreault, Maguelonne Héritier, Pierre Gravel, Guillaume-Alexandre Bilodeau, Nicolas Saunier
Abstract Consecutive frames in a video are highly redundant. Therefore, to perform the task of video object detection, executing single-frame detectors on every frame without reusing any information is quite wasteful. It is with this idea in mind that we propose RN-VID, a novel approach to video object detection. Our contributions are twofold. First, we propose a new architecture that allows the use of information from nearby frames to enhance feature maps. Second, we propose a novel module to merge feature maps of the same dimensions using re-ordering of channels and 1×1 convolutions. We then demonstrate that RN-VID achieves better mAP than corresponding single-frame detectors with little additional cost during inference.
Tasks Object Detection, Video Object Detection
Published 2020-03-24
URL https://arxiv.org/abs/2003.10898v1
PDF https://arxiv.org/pdf/2003.10898v1.pdf
PWC https://paperswithcode.com/paper/rn-vid-a-feature-fusion-architecture-for
Repo
Framework
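
RN-VID's second contribution is a module that merges same-sized feature maps from nearby frames via channel re-ordering and 1×1 convolutions. The PyTorch block below is a hedged approximation of that fusion step; the module name, sizes, and exact re-ordering are assumptions rather than the released architecture.

    import torch
    import torch.nn as nn

    class FrameFusion(nn.Module):
        def __init__(self, channels, n_frames):
            super().__init__()
            self.n_frames = n_frames
            self.mix = nn.Conv2d(channels * n_frames, channels, kernel_size=1)

        def forward(self, maps):                  # list of [B, C, H, W], one per frame
            x = torch.stack(maps, dim=2)          # [B, C, T, H, W]
            b, c, t, h, w = x.shape
            x = x.reshape(b, c * t, h, w)         # re-order: c0f0, c0f1, ..., c1f0, c1f1, ...
            return self.mix(x)                    # 1x1 conv mixes back to [B, C, H, W]

    fuse = FrameFusion(channels=256, n_frames=3)
    out = fuse([torch.randn(1, 256, 32, 32) for _ in range(3)])
    print(out.shape)                              # torch.Size([1, 256, 32, 32])

The reshape interleaves the frames of each channel before the 1×1 convolution reduces the stack back to the original channel count, so the fused map can drop into the detector wherever the single-frame map would have gone.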

Automatic Inference of High-Level Network Intents by Mining Forwarding Patterns

Title Automatic Inference of High-Level Network Intents by Mining Forwarding Patterns
Authors Ali Kheradmand
Abstract There is a semantic gap between the high-level intents of network operators and the low-level configurations that achieve those intents. Previous works have tried to bridge the gap using verification or synthesis techniques, both of which require formal specifications of the intended behavior that are rarely available or even known in the real world. This paper discusses an alternative approach to bridging the gap, namely inferring the high-level intents from the low-level network behavior. Specifically, we provide Anime, a framework and a tool that, given a set of observed forwarding behaviors, automatically infers a set of possible intents that best describe all observations. Our results show that Anime can infer high-quality intents from low-level forwarding behavior with acceptable performance.
Tasks
Published 2020-02-06
URL https://arxiv.org/abs/2002.02423v2
PDF https://arxiv.org/pdf/2002.02423v2.pdf
PWC https://paperswithcode.com/paper/automatic-inference-of-high-level-network
Repo
Framework

Compressing Language Models using Doped Kronecker Products

Title Compressing Language Models using Doped Kronecker Products
Authors Urmish Thakker, Paul Whatmough, Matthew Mattina, Jesse Beu
Abstract Kronecker Products (KP) have been used to compress IoT RNN applications by 15-38x compression factors, achieving better results than traditional compression methods. However, when KP is applied to large Natural Language Processing tasks, it leads to significant accuracy loss (approximately 26%). This paper proposes a way to recover the accuracy otherwise lost when applying KP to large NLP tasks, by allowing additional degrees of freedom in the KP matrix. More formally, we propose doping, a process of adding an extremely sparse overlay matrix on top of the pre-defined KP structure. We call this compression method doped Kronecker product compression. To train these models, we present a new solution to the phenomenon of co-matrix adaptation (CMA), which uses a new regularization scheme called co-matrix dropout regularization (CMR). We present experimental results that demonstrate compression of a large language model with LSTM layers of size 25 MB by 25x, with a 1.4% loss in perplexity score. At 25x compression, an equivalent pruned network leads to a 7.9% loss in perplexity score, while HMD and LMF lead to 15% and 27% loss in perplexity score, respectively.
Tasks Language Modelling
Published 2020-01-24
URL https://arxiv.org/abs/2001.08896v3
PDF https://arxiv.org/pdf/2001.08896v3.pdf
PWC https://paperswithcode.com/paper/compressing-language-models-using-doped
Repo
Framework
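
The doping idea above can be stated compactly: approximate a large weight matrix as a Kronecker product of two small factors plus an extremely sparse overlay that restores a few degrees of freedom. A small NumPy sketch, with made-up sizes and a 1% overlay density chosen purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(8, 8))
    B = rng.normal(size=(32, 32))
    W_kp = np.kron(A, B)                        # dense 256x256 matrix from two small factors

    # Doping: a sparse overlay adds free parameters where the KP structure is too rigid.
    density = 0.01
    mask = rng.random(W_kp.shape) < density
    S = np.zeros_like(W_kp)
    S[mask] = rng.normal(size=mask.sum())
    W = W_kp + S

    dense_params = W.size
    doped_kp_params = A.size + B.size + mask.sum()
    print(f"compression ≈ {dense_params / doped_kp_params:.1f}x")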

PL${}_{1}$P – Point-line Minimal Problems under Partial Visibility in Three Views

Title PL${}_{1}$P – Point-line Minimal Problems under Partial Visibility in Three Views
Authors Timothy Duff, Kathlén Kohn, Anton Leykin, Tomas Pajdla
Abstract We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such minimal problems; however, we show that they can be reduced to 140616 equivalence classes by removing superfluous features and relabeling the cameras. We also introduce camera-minimal problems, which are practical for designing minimal solvers, and show how to pick a simplest camera-minimal problem for each minimal problem. This simplification results in 74575 equivalence classes. Only 76 of these were known; the rest are new. In order to identify problems that have potential for practical solving of image matching and 3D reconstruction, we present several smaller natural subfamilies of camera-minimal problems as well as compute solution counts for all camera-minimal problems which have fewer than 300 solutions for generic data.
Tasks 3D Reconstruction
Published 2020-03-10
URL https://arxiv.org/abs/2003.05015v1
PDF https://arxiv.org/pdf/2003.05015v1.pdf
PWC https://paperswithcode.com/paper/pl_1p-point-line-minimal-problems-under
Repo
Framework

Scaling Laws for Neural Language Models

Title Scaling Laws for Neural Language Models
Authors Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
Abstract We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within a wide range. Simple equations govern the dependence of overfitting on model/dataset size and the dependence of training speed on model size. These relationships allow us to determine the optimal allocation of a fixed compute budget. Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
Tasks Language Modelling
Published 2020-01-23
URL https://arxiv.org/abs/2001.08361v1
PDF https://arxiv.org/pdf/2001.08361v1.pdf
PWC https://paperswithcode.com/paper/scaling-laws-for-neural-language-models
Repo
Framework
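
The central empirical finding of the scaling-laws paper above is that test loss follows simple power laws in the non-embedding parameter count $N$, the dataset size $D$ (in tokens), and the training compute $C$, roughly of the form

$$
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C},
$$

with small fitted exponents (on the order of 0.05–0.1) and constants $N_c$, $D_c$, $C_c$ determined empirically; the paper also fits a combined $L(N, D)$ form that captures the overfitting trade-off between model and dataset size.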

Fine-Tuning a Transformer-Based Language Model to Avoid Generating Non-Normative Text

Title Fine-Tuning a Transformer-Based Language Model to Avoid Generating Non-Normative Text
Authors Xiangyu Peng, Siyan Li, Spencer Frazier, Mark Riedl
Abstract Large-scale, transformer-based language models such as GPT-2 are pretrained on diverse corpora scraped from the internet. Consequently, they are prone to generating content that one might find inappropriate or non-normative (i.e. in violation of social norms). In this paper, we describe a technique for fine-tuning GPT-2 such that the amount of non-normative content generated is significantly reduced. A model capable of classifying normative behavior is used to produce an additional reward signal; a policy gradient reinforcement learning technique uses that reward to fine-tune the language model weights. Applying this fine-tuning technique with 24,000 sentences from a science fiction plot summary dataset halves the percentage of generated text containing non-normative behavior, from 35.1% to 15.7%.
Tasks Language Modelling
Published 2020-01-23
URL https://arxiv.org/abs/2001.08764v1
PDF https://arxiv.org/pdf/2001.08764v1.pdf
PWC https://paperswithcode.com/paper/fine-tuning-a-transformer-based-language
Repo
Framework
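
The fine-tuning loop described above (classifier reward plus policy gradient) can be sketched roughly as follows. This is not the authors' code: the reward function here is a toy placeholder for their trained normative classifier, and a plain REINFORCE-style update stands in for whatever policy-gradient variant the paper actually uses.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    lm = GPT2LMHeadModel.from_pretrained("gpt2")
    opt = torch.optim.Adam(lm.parameters(), lr=1e-5)

    def normative_reward(text):
        # Placeholder for the paper's normative-behavior classifier: +1 normative, -1 otherwise.
        return 1.0 if "helped" in text else -1.0

    prompt = tok("The stranger approached and", return_tensors="pt")
    sample = lm.generate(**prompt, max_new_tokens=20, do_sample=True,
                         pad_token_id=tok.eos_token_id)
    text = tok.decode(sample[0], skip_special_tokens=True)

    # Negative log-likelihood of the sampled continuation under the current policy.
    labels = sample.clone()
    labels[:, : prompt["input_ids"].shape[1]] = -100      # do not score the prompt tokens
    nll = lm(sample, labels=labels).loss

    # REINFORCE-style step: raise likelihood of normative samples, lower it otherwise.
    loss = normative_reward(text) * nll
    opt.zero_grad()
    loss.backward()
    opt.step()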

Integrating Boundary Assembling into a DNN Framework for Named Entity Recognition in Chinese Social Media Text

Title Integrating Boundary Assembling into a DNN Framework for Named Entity Recognition in Chinese Social Media Text
Authors Zhaoheng Gong, Ping Chen, Jiang Zhou
Abstract Named entity recognition is a challenging task in Natural Language Processing, especially for informal and noisy social media text. Chinese word boundaries are also entity boundaries; therefore, named entity recognition for Chinese text can benefit from the word boundary detection output by Chinese word segmentation. Yet Chinese word segmentation poses its own difficulty because it is influenced by several factors, e.g., the segmentation criteria and the employed algorithm. Handled improperly, it may cause cascading failures that degrade the quality of the subsequent named entity recognition. In this paper, we integrate a boundary assembling method with the state-of-the-art deep neural network model, and incorporate the updated word boundary information into a conditional random field model for named entity recognition. Our method shows a 2% absolute improvement over previous state-of-the-art results.
Tasks Boundary Detection, Chinese Word Segmentation, Named Entity Recognition
Published 2020-02-27
URL https://arxiv.org/abs/2002.11910v1
PDF https://arxiv.org/pdf/2002.11910v1.pdf
PWC https://paperswithcode.com/paper/integrating-boundary-assembling-into-a-dnn
Repo
Framework

Vocoder-free End-to-End Voice Conversion with Transformer Network

Title Vocoder-free End-to-End Voice Conversion with Transformer Network
Authors June-Woo Kim, Ho-Young Jung, Minho Lee
Abstract Mel-frequency filter bank (MFB) based approaches have an advantage over raw-spectrum approaches for learning speech, since MFB features have a smaller size. However, speech generators built on MFB features require an additional vocoder, which incurs a huge computational expense during training. Such additional pre/post-processing (MFB extraction and vocoding) is not essential for converting real human speech into other voices: it is possible to use only the raw spectrum, along with the phase, to generate different styles of voices with clear pronunciation. In this regard, we propose a fast and effective approach to convert realistic voices using the raw spectrum in a parallel manner. Our transformer-based model architecture, which has no CNN or RNN layers, learns quickly and avoids the sequential-computation limitation of conventional RNNs. In this paper, we introduce a vocoder-free end-to-end voice conversion method using a transformer network. The presented conversion model can also be used in speaker adaptation for speech recognition. Our approach can convert the source voice to a target voice without using MFB features or a vocoder, and we can obtain an adapted MFB for speech recognition by multiplying the converted magnitude with the phase. We perform our voice conversion experiments on the TIDIGITS dataset, evaluating naturalness, similarity, and clarity with mean opinion scores.
Tasks Speech Recognition, Voice Conversion
Published 2020-02-05
URL https://arxiv.org/abs/2002.03808v1
PDF https://arxiv.org/pdf/2002.03808v1.pdf
PWC https://paperswithcode.com/paper/vocoder-free-end-to-end-voice-conversion-with
Repo
Framework
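
The vocoder-free pipeline above works directly on the magnitude spectrum and reuses the phase for reconstruction. Below is a hedged PyTorch sketch of that data flow, with an untrained transformer encoder standing in for the paper's conversion model; the window, FFT sizes, and layer settings are assumptions.

    import torch
    import torch.nn as nn

    n_fft, hop = 512, 128
    window = torch.hann_window(n_fft)
    wave = torch.randn(1, 16000)                       # stand-in for a source utterance

    spec = torch.stft(wave, n_fft, hop_length=hop, window=window, return_complex=True)
    mag, phase = spec.abs(), spec.angle()              # [1, F, T] with F = n_fft // 2 + 1

    layer = nn.TransformerEncoderLayer(d_model=n_fft // 2 + 1, nhead=1, batch_first=True)
    converter = nn.TransformerEncoder(layer, num_layers=2)
    converted_mag = converter(mag.transpose(1, 2)).transpose(1, 2)   # spectrum frames as tokens

    # Recombine the converted magnitude with the source phase; no MFB, no vocoder.
    # (clamp keeps magnitudes non-negative for this untrained sketch)
    recon = torch.istft(torch.polar(converted_mag.clamp(min=0.0), phase), n_fft,
                        hop_length=hop, window=window)
    print(recon.shape)                                 # roughly [1, 16000]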

Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

Title Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Authors Asaf Cassel, Alon Cohen, Tomer Koren
Abstract We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix $A$ is unknown, and when only the state-action transition matrix $B$ is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08095v1
PDF https://arxiv.org/pdf/2002.08095v1.pdf
PWC https://paperswithcode.com/paper/logarithmic-regret-for-learning-linear
Repo
Framework
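
For context, the online LQR setting behind the entry above has linear dynamics with quadratic cost, and regret compares the learner's cumulative cost to that of the optimal policy computed with full knowledge of the system; a standard formulation is

$$
x_{t+1} = A x_t + B u_t + w_t, \qquad
J_T = \sum_{t=1}^{T} \left( x_t^\top Q\, x_t + u_t^\top R\, u_t \right), \qquad
\mathrm{Regret}(T) = J_T - T \cdot J^{*},
$$

where $w_t$ is noise and $J^{*}$ denotes the optimal steady-state average cost. The paper shows this regret can grow only polylogarithmically in $T$ when either $A$ or $B$ is known (the latter under a non-degeneracy condition), while square-root regret is unavoidable otherwise.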

Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities

Title Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities
Authors Zhongruo Wang, Krishnakumar Balasubramanian, Shiqian Ma, Meisam Razaviyayn
Abstract In this paper, we study zeroth-order algorithms for minimax optimization problems that are nonconvex in one variable and strongly concave in the other variable. Such minimax optimization problems have attracted significant attention lately due to their applications in modern machine learning tasks. We first design and analyze the Zeroth-Order Gradient Descent Ascent (ZO-GDA) algorithm, and provide improved results compared to existing works in terms of oracle complexity. Next, we propose the Zeroth-Order Gradient Descent Multi-Step Ascent (ZO-GDMSA) algorithm, which significantly improves the oracle complexity of ZO-GDA. We also provide stochastic versions of ZO-GDA and ZO-GDMSA to handle stochastic nonconvex minimax problems, and provide oracle complexity results.
Tasks
Published 2020-01-22
URL https://arxiv.org/abs/2001.07819v1
PDF https://arxiv.org/pdf/2001.07819v1.pdf
PWC https://paperswithcode.com/paper/zeroth-order-algorithms-for-nonconvex-minimax
Repo
Framework
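
The core primitive in this family of methods is a gradient estimate built only from function evaluations, plugged into simultaneous descent on x and ascent on y. The toy sketch below uses a standard two-point random-direction estimator on a made-up nonconvex-strongly-concave objective; it illustrates the mechanism, not the paper's exact algorithms or step sizes.

    import numpy as np

    rng = np.random.default_rng(0)

    def f(x, y):                       # nonconvex in x, strongly concave in y
        return np.sin(x @ x) + x @ y - 0.5 * y @ y

    def zo_grad(g, z, mu=1e-3):        # two-point random-direction gradient estimator
        u = rng.normal(size=z.shape)
        return (g(z + mu * u) - g(z - mu * u)) / (2 * mu) * u

    x, y = rng.normal(size=5), rng.normal(size=5)
    eta_x, eta_y = 1e-2, 5e-2
    for _ in range(2000):
        gx = zo_grad(lambda z: f(z, y), x)     # descent direction for x
        gy = zo_grad(lambda z: f(x, z), y)     # ascent direction for y
        x -= eta_x * gx
        y += eta_y * gy
    print(f(x, y))

The multi-step variant (ZO-GDMSA) would run several such ascent updates on y per descent update on x, which is where the improved oracle complexity comes from.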

A new paradigm for accelerating clinical data science at Stanford Medicine

Title A new paradigm for accelerating clinical data science at Stanford Medicine
Authors Somalee Datta, Jose Posada, Garrick Olson, Wencheng Li, Ciaran O’Reilly, Deepa Balraj, Joseph Mesterhazy, Joseph Pallas, Priyamvada Desai, Nigam Shah
Abstract Stanford Medicine is building a new data platform for our academic research community to do better clinical data science. Hospitals have a large amount of patient data, and researchers have demonstrated the ability to reuse that data and AI approaches to derive novel insights, support patient care, and improve care quality. However, the traditional data warehouse and Honest Broker approaches currently in use are not scalable. We are establishing a new secure Big Data platform that aims to reduce the time to access and analyze data. In this platform, data is anonymized to preserve patient data privacy and made available preparatory to Institutional Review Board (IRB) submission. Furthermore, the data is standardized such that analysis done at Stanford can be replicated elsewhere using the same analytical code and clinical concepts. Finally, the analytics data warehouse integrates with a secure data science computational facility to support large-scale data analytics. The ecosystem is designed to bring the modern data science community to highly sensitive clinical data in a secure and collaborative big data analytics environment, with the goal of enabling bigger, better, and faster science.
Tasks
Published 2020-03-17
URL https://arxiv.org/abs/2003.10534v1
PDF https://arxiv.org/pdf/2003.10534v1.pdf
PWC https://paperswithcode.com/paper/a-new-paradigm-for-accelerating-clinical-data
Repo
Framework