Paper Group ANR 1729
Online Pricing with Offline Data: Phase Transition and Inverse Square Law
Title | Online Pricing with Offline Data: Phase Transition and Inverse Square Law |
Authors | Jinzhi Bu, David Simchi-Levi, Yunzong Xu |
Abstract | This paper investigates the impact of pre-existing offline data on online learning, in the context of dynamic pricing. We study a single-product dynamic pricing problem over a selling horizon of $T$ periods. The demand in each period is determined by the price of the product according to a linear demand model with unknown parameters. We assume that before the start of the selling horizon, the seller already has some pre-existing offline data. The offline data set contains $n$ samples, each of which is an input-output pair consisting of a historical price and an associated demand observation. The seller wants to utilize both the pre-existing offline data and the sequential online data to minimize the regret of the online learning process. We characterize the joint effect of the size, location and dispersion of the offline data on the optimal regret of the online learning process. Specifically, the size, location and dispersion of the offline data are measured by the number of historical samples $n$, the absolute difference between the average historical price and the optimal price $\delta$, and the standard deviation of the historical prices $\sigma$, respectively. We show that the optimal regret is $\widetilde \Theta\left(\sqrt{T}\wedge \frac{T}{(n\wedge T)\delta^2+n\sigma^2}\right)$, and design a learning algorithm based on the “optimism in the face of uncertainty” principle, whose regret is optimal up to a logarithmic factor. Our results reveal surprising transformations of the optimal regret rate with respect to the size of the offline data, which we refer to as phase transitions. In addition, our results demonstrate that the location and dispersion of the offline data also have an intrinsic effect on the optimal regret, and we quantify this effect via the inverse-square law. |
Tasks | |
Published | 2019-10-19 |
URL | https://arxiv.org/abs/1910.08693v4 |
https://arxiv.org/pdf/1910.08693v4.pdf | |
PWC | https://paperswithcode.com/paper/online-pricing-with-offline-data-phase |
Repo | |
Framework | |
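The regret rate stated in the abstract can be evaluated numerically to see the phase transitions and the inverse-square dependence on $\delta$. The following is a minimal sketch, not the authors' code: the function name `regret_rate` is hypothetical, constants and logarithmic factors hidden by $\widetilde\Theta$ are ignored, and the handling of a zero denominator (no informative offline data) is an assumption made here.

```python
import math


def regret_rate(T, n, delta, sigma):
    """Order of the optimal regret from the abstract,
    sqrt(T) ∧ T / ((n ∧ T) * delta^2 + n * sigma^2),
    ignoring constants and logarithmic factors.

    T: online horizon, n: number of offline samples,
    delta: |average historical price - optimal price|,
    sigma: standard deviation of historical prices.
    """
    denom = min(n, T) * delta ** 2 + n * sigma ** 2
    if denom == 0:
        # Assumption: with no informative offline data the bound
        # reduces to the purely online rate sqrt(T).
        return math.sqrt(T)
    return min(math.sqrt(T), T / denom)


if __name__ == "__main__":
    T = 10_000
    # Sweep the offline sample size to watch the rate change regime.
    for n in (0, 100, 1_000, 10_000, 100_000):
        print(n, regret_rate(T, n, delta=1.0, sigma=0.0))
```

With $\sigma = 0$ and $n \ge T$, the second term becomes $1/\delta^2$, so doubling $\delta$ divides the rate by four; this is the inverse-square behavior the abstract refers to.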
Uncovering Probabilistic Implications in Typological Knowledge Bases
Title | Uncovering Probabilistic Implications in Typological Knowledge Bases |
Authors | Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein |
Abstract | The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have post-positions. Uncovering such implications typically amounts to time-consuming manual processing by trained and experienced linguists, which potentially leaves key linguistic universals unexplored. In this paper, we present a computational model which successfully identifies known universals, including Greenberg universals, but also uncovers new ones, worthy of further linguistic investigation. Our approach outperforms baselines previously used for this problem, as well as a strong baseline from knowledge base population. |
Tasks | Knowledge Base Population |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1906.07389v1 |
https://arxiv.org/pdf/1906.07389v1.pdf | |
PWC | https://paperswithcode.com/paper/uncovering-probabilistic-implications-in |
Repo | |
Framework | |
Filter Early, Match Late: Improving Network-Based Visual Place Recognition
Title | Filter Early, Match Late: Improving Network-Based Visual Place Recognition |
Authors | Stephen Hausler, Adam Jacobson, Michael Milford |
Abstract | CNNs have excelled at performing place recognition over time, particularly when the neural network is optimized for localization in the current environmental conditions. In this paper we investigate the concept of feature map filtering, where, rather than using all the activations within a convolutional tensor, only the most useful activations are used. Since specific feature maps encode different visual features, the objective is to remove feature maps that detract from the ability to recognize a location across appearance changes. Our key innovation is to filter the feature maps in an early convolutional layer, but then continue to run the network and extract a feature vector using a later layer in the same network. By filtering early visual features and extracting a feature vector from a higher, more viewpoint invariant later layer, we demonstrate improved condition and viewpoint invariance. Our approach requires image pairs for training from the deployment environment, but we show that state-of-the-art performance can regularly be achieved with as little as a single training image pair. An exhaustive experimental analysis is performed to determine the full scope of causality between early layer filtering and late layer extraction. For validity, we use three datasets: Oxford RobotCar, Nordland, and Gardens Point, achieving overall superior performance to NetVLAD. The work provides a number of new avenues for exploring CNN optimizations, without full re-training. |
Tasks | Visual Place Recognition |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.12176v1 |
https://arxiv.org/pdf/1906.12176v1.pdf | |
PWC | https://paperswithcode.com/paper/filter-early-match-late-improving-network |
Repo | |
Framework | |
A New Perspective on Machine Learning: How to do Perfect Supervised Learning
Title | A New Perspective on Machine Learning: How to do Perfect Supervised Learning |
Authors | Hui Jiang |
Abstract | In this work, we introduce the concept of bandlimiting into the theory of machine learning because all physical processes are bandlimited by nature, including real-world machine learning tasks. After the bandlimiting constraint is taken into account, our theoretical analysis has shown that all practical machine learning tasks are asymptotically solvable in a perfect sense. Furthermore, the key towards this solvability almost solely relies on two factors: i) a sufficiently large amount of training samples beyond a threshold determined by a difficulty measurement of the underlying task; ii) a sufficiently complex and bandlimited model. Moreover, for some special cases, we have derived new error bounds for perfect learning, which can quantify the difficulty of learning. These generalization bounds are not only asymptotically convergent but also irrelevant to model complexity. Our new results on generalization have provided a new perspective to explain the recent successes of large-scale supervised learning using complex models like neural networks. |
Tasks | |
Published | 2019-01-07 |
URL | http://arxiv.org/abs/1901.02046v3 |
http://arxiv.org/pdf/1901.02046v3.pdf | |
PWC | https://paperswithcode.com/paper/a-new-perspective-on-machine-learning-how-to |
Repo | |
Framework | |
PairNorm: Tackling Oversmoothing in GNNs
Title | PairNorm: Tackling Oversmoothing in GNNs |
Authors | Lingxiao Zhao, Leman Akoglu |
Abstract | The performance of graph neural nets (GNNs) is known to gradually decrease with increasing number of layers. This decay is partly attributed to oversmoothing, where repeated graph convolutions eventually make node embeddings indistinguishable. We take a closer look at two different interpretations, aiming to quantify oversmoothing. Our main contribution is PairNorm, a novel normalization layer that is based on a careful analysis of the graph convolution operator, which prevents all node embeddings from becoming too similar. What is more, PairNorm is fast, easy to implement without any change to network architecture nor any additional parameters, and is broadly applicable to any GNN. Experiments on real-world graphs demonstrate that PairNorm makes deeper GCN, GAT, and SGC models more robust against oversmoothing, and significantly boosts performance for a new problem setting that benefits from deeper GNNs. Code is available at https://github.com/LingxiaoShawn/PairNorm. |
Tasks | |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12223v2 |
https://arxiv.org/pdf/1909.12223v2.pdf | |
PWC | https://paperswithcode.com/paper/pairnorm-tackling-oversmoothing-in-gnns |
Repo | |
Framework | |
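The abstract describes PairNorm as a parameter-free normalization layer but does not give its formula. The sketch below assumes the center-and-rescale form (subtract the mean node embedding, then rescale so the mean squared row norm equals a chosen constant); the function name `pair_norm`, the `scale` and `eps` parameters, and the pure-Python list representation are illustrative choices made here. The repository linked in the abstract remains the authoritative implementation.

```python
import math


def pair_norm(X, scale=1.0, eps=1e-12):
    """Center node embeddings and rescale them (assumed PairNorm form).

    X is a list of n rows (one per node), each a list of d floats.
    Step 1: subtract the column-wise mean so embeddings are centered.
    Step 2: divide by the root of the mean squared row norm, so the
            total pairwise distance stays roughly constant across layers.
    """
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in X]
    mean_sq_norm = sum(v * v for row in centered for v in row) / n
    factor = scale / math.sqrt(mean_sq_norm + eps)  # eps guards against 0
    return [[v * factor for v in row] for row in centered]


if __name__ == "__main__":
    embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for row in pair_norm(embeddings):
        print(row)
```

Because the operation has no learnable parameters, it can be applied after each graph convolution without changing the architecture, which matches the abstract's claim.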
Stopping Active Learning based on Predicted Change of F Measure for Text Classification
Title | Stopping Active Learning based on Predicted Change of F Measure for Text Classification |
Authors | Michael Altschuler, Michael Bloodgood |
Abstract | During active learning, an effective stopping method allows users to limit the number of annotations, which is cost effective. In this paper, a new stopping method called Predicted Change of F Measure will be introduced that attempts to provide the users an estimate of how much performance of the model is changing at each iteration. This stopping method can be applied with any base learner. This method is useful for reducing the data annotation bottleneck encountered when building text classification systems. |
Tasks | Active Learning, Text Classification |
Published | 2019-01-26 |
URL | http://arxiv.org/abs/1901.09118v2 |
http://arxiv.org/pdf/1901.09118v2.pdf | |
PWC | https://paperswithcode.com/paper/stopping-active-learning-based-on-predicted |
Repo | |
Framework | |
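The abstract says the method estimates how much model performance changes at each iteration, without stating the estimator. One plausible reading, assumed here and not confirmed by the abstract, is to compare the predictions of consecutive models on a fixed unlabeled set and report one minus their F-measure agreement. The names `predicted_change_of_f` and `should_stop`, and the `threshold` and `window` defaults, are hypothetical.

```python
def predicted_change_of_f(prev_preds, curr_preds, positive=1):
    """Estimate the change in F measure between two successive models.

    a: examples both models label positive
    b: positive for the previous model only
    c: positive for the current model only
    The estimate is 1 - 2a / (2a + b + c), i.e. one minus the F1
    agreement of the two prediction lists (an assumed formulation).
    """
    a = b = c = 0
    for p, q in zip(prev_preds, curr_preds):
        if p == positive and q == positive:
            a += 1
        elif p == positive:
            b += 1
        elif q == positive:
            c += 1
    total = 2 * a + b + c
    if total == 0:
        return 0.0  # neither model predicts any positives: no change seen
    return 1.0 - 2.0 * a / total


def should_stop(changes, threshold=0.005, window=3):
    """Stop once the last `window` estimated changes are all small.

    `threshold` and `window` are illustrative values, not from the paper.
    """
    recent = changes[-window:]
    return len(recent) == window and all(x <= threshold for x in recent)
```

Since the estimate needs only predictions, it works with any base learner, consistent with the abstract.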
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency
Title | Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency |
Authors | Matt Whitehill, Shuang Ma, Daniel McDuff, Yale Song |
Abstract | Current multi-reference style transfer models for Text-to-Speech (TTS) perform sub-optimally on disjoint datasets, where one dataset contains only a single style class for one of the style dimensions. These models generally fail to produce style transfer for the dimension that is underrepresented in the dataset. In this paper, we propose an adversarial cycle consistency training scheme with paired and unpaired triplets to ensure the use of information from all style dimensions. During training, we incorporate unpaired triplets with randomly selected reference audio samples and encourage the synthesized speech to preserve the appropriate styles using adversarial cycle consistency. We use this method to transfer emotion from a dataset containing four emotions to a dataset with only a single emotion. This results in a 78% improvement in style transfer (based on emotion classification) with minimal reduction in fidelity and naturalness. In subjective evaluations our method was consistently rated as closer to the reference style than the baseline. Synthesized speech samples are available at: https://sites.google.com/view/adv-cycle-consistent-tts |
Tasks | Emotion Classification, Style Transfer |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11958v1 |
https://arxiv.org/pdf/1910.11958v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-reference-neural-tts-stylization-with |
Repo | |
Framework | |
Defending against Machine Learning based Inference Attacks via Adversarial Examples: Opportunities and Challenges
Title | Defending against Machine Learning based Inference Attacks via Adversarial Examples: Opportunities and Challenges |
Authors | Jinyuan Jia, Neil Zhenqiang Gong |
Abstract | As machine learning (ML) becomes more and more powerful and easily accessible, attackers increasingly leverage ML to perform automated large-scale inference attacks in various domains. In such an ML-equipped inference attack, an attacker has access to some data (called public data) of an individual, a software, or a system; and the attacker uses an ML classifier to automatically infer their private data. Inference attacks pose severe privacy and security threats to individuals and systems. Inference attacks are successful because private data are statistically correlated with public data, and ML classifiers can capture such statistical correlations. In this chapter, we discuss the opportunities and challenges of defending against ML-equipped inference attacks via adversarial examples. Our key observation is that attackers rely on ML classifiers in inference attacks. The adversarial machine learning community has demonstrated that ML classifiers have various vulnerabilities. Therefore, we can turn the vulnerabilities of ML into defenses against inference attacks. For example, ML classifiers are vulnerable to adversarial examples, which add carefully crafted noise to normal examples such that an ML classifier makes predictions for the examples as we desire. To defend against inference attacks, we can add carefully crafted noise into the public data to turn them into adversarial examples, such that attackers’ classifiers make incorrect predictions for the private data. However, existing methods to construct adversarial examples are insufficient because they did not consider the unique challenges and requirements for the crafted noise at defending against inference attacks. In this chapter, we take defending against inference attacks in online social networks as an example to illustrate the opportunities and challenges. |
Tasks | Inference Attack |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.08526v2 |
https://arxiv.org/pdf/1909.08526v2.pdf | |
PWC | https://paperswithcode.com/paper/defending-against-machine-learning-based |
Repo | |
Framework | |
ET-USB: Transformer-Based Sequential Behavior Modeling for Inbound Customer Service
Title | ET-USB: Transformer-Based Sequential Behavior Modeling for Inbound Customer Service |
Authors | Ta-Chun Su, Guan-Ying Chen |
Abstract | Deep learning models with attention mechanisms have achieved exceptional results for many tasks, including language tasks and recommendation systems. Whereas previous studies have emphasized allocation of phone agents, we focused on inbound call prediction for customer service. A common method of analyzing user history behaviors is to extract all types of aggregated features over time, but that method may fail to detect users’ behavioral sequences. Therefore, we created a new approach, ET-USB, that incorporates users’ sequential and nonsequential features; we apply the powerful Transformer encoder, a self-attention network model, to capture the information underlying user behavior sequences. ET-USB is helpful in various business scenarios at Cathay Financial Holdings. We conducted experiments to test the proposed network structure’s ability to process various dimensions of behavior data; the results suggest that ET-USB delivers results superior to those delivered by other deep-learning models. |
Tasks | Recommendation Systems |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.10852v2 |
https://arxiv.org/pdf/1912.10852v2.pdf | |
PWC | https://paperswithcode.com/paper/et-usb-transformer-based-sequential-behavior |
Repo | |
Framework | |
GAN-Leaks: A Taxonomy of Membership Inference Attacks against GANs
Title | GAN-Leaks: A Taxonomy of Membership Inference Attacks against GANs |
Authors | Dingfan Chen, Ning Yu, Yang Zhang, Mario Fritz |
Abstract | In recent years, the success of deep learning has carried over from discriminative models to generative models. In particular, generative adversarial networks (GANs) have facilitated a new level of performance ranging from media manipulation to dataset re-generation. Despite the success, the potential risks of privacy breach stemming from GANs are less well explored. In this paper, we focus on membership inference attacks against GANs, which have the potential to reveal information about victim models’ training data. Specifically, we present the first taxonomy of membership inference attacks, which encompasses not only existing attacks but also our novel ones. We also propose the first generic attack model that can be instantiated in various settings according to the adversary’s knowledge about the victim model. We complement our systematic analysis of attack vectors with a comprehensive experimental study that investigates the effectiveness of these attacks w.r.t. model type, training configurations, and attack type across three diverse application scenarios: images, medical data, and location data. We show consistent effectiveness in all the setups, which bridges the assumption gap and performance gap in previous studies with a complete spectrum of performance across settings. We conclude by reminding users to think carefully before publicizing any part of their models. |
Tasks | Inference Attack |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03935v1 |
https://arxiv.org/pdf/1909.03935v1.pdf | |
PWC | https://paperswithcode.com/paper/gan-leaks-a-taxonomy-of-membership-inference |
Repo | |
Framework | |
Relation Extraction Datasets in the Digital Humanities Domain and their Evaluation with Word Embeddings
Title | Relation Extraction Datasets in the Digital Humanities Domain and their Evaluation with Word Embeddings |
Authors | Gerhard Wohlgenannt, Ekaterina Chernyak, Dmitry Ilvovsky, Ariadna Barinova, Dmitry Mouromtsev |
Abstract | In this research, we manually create high-quality datasets in the digital humanities domain for the evaluation of language models, specifically word embedding models. The first step comprises the creation of unigram and n-gram datasets for two fantasy novel book series for two task types each, analogy and doesn’t-match. This is followed by the training of models on the two book series with various popular word embedding model types such as word2vec, GloVe, fastText, or LexVec. Finally, we evaluate the suitability of word embedding models for such specific relation extraction tasks in a situation of comparably small corpus sizes. In the evaluations, we also investigate and analyze particular aspects such as the impact of corpus term frequencies and task difficulty on accuracy. The datasets, and the underlying system and word embedding models are available on github and can be easily extended with new datasets and tasks, be used to reproduce the presented results, or be transferred to other domains. |
Tasks | Relation Extraction, Word Embeddings |
Published | 2019-03-04 |
URL | http://arxiv.org/abs/1903.01284v1 |
http://arxiv.org/pdf/1903.01284v1.pdf | |
PWC | https://paperswithcode.com/paper/relation-extraction-datasets-in-the-digital |
Repo | |
Framework | |
Funnel Transform for Straight Line Detection
Title | Funnel Transform for Straight Line Detection |
Authors | QianRu Wei, DaZheng Feng, WeiXing Zheng |
Abstract | Most of the classical approaches to straight line detection only deal with a binary edge image and need to use 2D interpolation operation. This paper proposes a new transform method figuratively named as funnel transform which can efficiently and rapidly detect straight lines. The funnel transform consists of three 1D Fourier transforms and one nonlinear variable-metric transform (NVMT). It only needs to exploit 1D interpolation operation for achieving its NVMT, and can directly handle grayscale images by using its high-pass filter property, which significantly improves the performance of the closely-related approaches. Based on the slope-intercept line equation, the funnel transform can more uniformly turn the straight lines formed by ridge-typical and step-typical edges into the local maximum points (peaks). The parameters of each line can be uniquely extracted from its corresponding peak coordinates. Additionally, each peak can be theoretically specified by a 2D delta function, which makes the peaks and lines more easily identified and detected, respectively. Theoretical analysis and experimental results demonstrate that the funnel transform has advantages including smaller computational complexity, lower hardware cost, higher detection probability, greater location precision, better parallelization properties, stronger anti-occlusion and noise robustness. |
Tasks | |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09409v1 |
http://arxiv.org/pdf/1904.09409v1.pdf | |
PWC | https://paperswithcode.com/paper/190409409 |
Repo | |
Framework | |
Computing L1 Straight-Line Fits to Data (Part 1)
Title | Computing L1 Straight-Line Fits to Data (Part 1) |
Authors | Ian Barrodale |
Abstract | The initial remarks in this technical report are primarily for those not familiar with the properties of L1 approximation, but the remainder of the report should also interest readers who are already acquainted with the inner workings of L1 algorithms. |
Tasks | |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/2001.00813v1 |
https://arxiv.org/pdf/2001.00813v1.pdf | |
PWC | https://paperswithcode.com/paper/computing-l1-straight-line-fits-to-data-part |
Repo | |
Framework | |
CODAH: An Adversarially Authored Question-Answer Dataset for Common Sense
Title | CODAH: An Adversarially Authored Question-Answer Dataset for Common Sense |
Authors | Michael Chen, Mike D’Arcy, Alisa Liu, Jared Fernandez, Doug Downey |
Abstract | Commonsense reasoning is a critical AI capability, but it is difficult to construct challenging datasets that test common sense. Recent neural question answering systems, based on large pre-trained models of language, have already achieved near-human-level performance on commonsense knowledge benchmarks. These systems do not possess human-level common sense, but are able to exploit limitations of the datasets to achieve human-level scores. We introduce the CODAH dataset, an adversarially-constructed evaluation dataset for testing common sense. CODAH forms a challenging extension to the recently-proposed SWAG dataset, which tests commonsense knowledge using sentence-completion questions that describe situations observed in video. To produce a more difficult dataset, we introduce a novel procedure for question acquisition in which workers author questions designed to target weaknesses of state-of-the-art neural question answering systems. Workers are rewarded for submissions that models fail to answer correctly both before and after fine-tuning (in cross-validation). We create 2.8k questions via this procedure and evaluate the performance of multiple state-of-the-art question answering systems on our dataset. We observe a significant gap between human performance, which is 95.3%, and the performance of the best baseline accuracy of 67.5% by the BERT-Large model. |
Tasks | Common Sense Reasoning, Question Answering |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.04365v4 |
https://arxiv.org/pdf/1904.04365v4.pdf | |
PWC | https://paperswithcode.com/paper/aqua-an-adversarially-authored-question |
Repo | |
Framework | |
Robust BGA Void Detection Using Multi Directional Scan Algorithms
Title | Robust BGA Void Detection Using Multi Directional Scan Algorithms |
Authors | Vikas Ahuja, Vijay Kumar Neeluru |
Abstract | The lifetime of electronic circuit boards is affected by voids present in solder balls. Inspecting solder balls by detecting and measuring voids is important for addressing board yield issues in electronic circuits. In general, the inspection is carried out manually, based on 2D or 3D X-ray images. For high-quality inspection, it is difficult to detect and measure voids accurately and with high repeatability through manual inspection, and the process is time consuming. To meet the need for high-quality, fast inspection, various approaches have been proposed for void detection. However, they lack robustness in dealing with challenges such as vias, reflections from the plating or vias, inconsistent lighting, noise, void-like artefacts, various void shapes, low-resolution images and scalability to various devices. Robust BGA void detection becomes quite a difficult problem, especially if the image size is very small (say, around 40x40) and the contrast between the void and the BGA background is low (say, around 7 intensity levels on a scale of 255). In this work, we propose a novel approach for void detection based on multi-directional scanning. The proposed approach is able to segment voids in low-resolution images and can be easily scaled to various electronic manufacturing products. |
Tasks | |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00211v1 |
https://arxiv.org/pdf/1909.00211v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-bga-void-detection-using-multi |
Repo | |
Framework | |