Paper Group ANR 477
A Hybrid Semantic Parsing Approach for Tabular Data Analysis
Title | A Hybrid Semantic Parsing Approach for Tabular Data Analysis |
Authors | Yan Gao, Jian-Guang Lou, Dongmei Zhang |
Abstract | This paper presents a novel approach to translating natural language questions into SQL queries for given tables, meeting three requirements of a real-world data analysis application: cross-domain operation, multilingualism, and quick-start capability. Our proposed approach consists of: (1) a novel data abstraction step before the parser that makes parsing table-agnostic; (2) a set of semantic rules for parsing abstracted data-analysis questions into intermediate logic forms as tree derivations, reducing the search space; (3) a neural model serving as a local scoring function on a span-based semantic parser for structured optimization and efficient inference. Experiments show that our approach outperforms state-of-the-art algorithms on the large open benchmark dataset WikiSQL. We also achieve promising results on a small dataset of more complex queries in both English and Chinese, demonstrating our language-expansion and quick-start abilities. |
Tasks | Semantic Parsing |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10363v2 |
PDF | https://arxiv.org/pdf/1910.10363v2.pdf |
PWC | https://paperswithcode.com/paper/annaparser-semantic-parsing-for-tabular-data |
Repo | |
Framework | |
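Editor's sketch: the data abstraction step above is the most mechanical part of the pipeline, so here is a minimal, hypothetical illustration of it. Table-specific mentions in the question are replaced with placeholder symbols, so the downstream parser never sees raw table content. The function name, placeholder scheme, and exact-token matching are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of table-agnostic data abstraction (not the paper's code).
def abstract_question(question, columns, cells):
    """Replace column/value mentions with COLi / VALj placeholders."""
    mapping = {}                                   # placeholder -> surface form
    col_lookup = {c.lower(): c for c in columns}
    cell_lookup = {v.lower(): v for v in cells}
    out = []
    for tok in question.lower().split():
        if tok in col_lookup:
            sym = "COL%d" % sum(k.startswith("COL") for k in mapping)
            mapping[sym] = col_lookup[tok]
            out.append(sym)
        elif tok in cell_lookup:
            sym = "VAL%d" % sum(k.startswith("VAL") for k in mapping)
            mapping[sym] = cell_lookup[tok]
            out.append(sym)
        else:
            out.append(tok)
    return " ".join(out), mapping

q, m = abstract_question("show sales where region is east",
                         columns=["sales", "region"], cells=["east"])
print(q)  # show COL0 where COL1 is VAL0
print(m)  # {'COL0': 'sales', 'COL1': 'region', 'VAL0': 'east'}
```

The parser can then apply domain-independent semantic rules over COL/VAL tokens, which is what makes the approach table-agnostic and quick to start on new tables.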
Robust Coreset Construction for Distributed Machine Learning
Title | Robust Coreset Construction for Distributed Machine Learning |
Authors | Hanlin Lu, Ming-Ju Li, Ting He, Shiqiang Wang, Vijaykrishnan Narayanan, Kevin S Chan |
Abstract | Motivated by the need to solve machine learning problems over distributed datasets, we explore the use of coresets to reduce communication overhead. A coreset is a summary of the original dataset in the form of a small weighted set in the same sample space. Compared to other data summaries, a coreset has the advantage that it can be used as a proxy for the original dataset, potentially for different applications. However, existing coreset construction algorithms are each tailor-made for a specific machine learning problem; thus, to solve different machine learning problems, one has to collect coresets of different types, defeating the purpose of saving communication overhead. We resolve this dilemma by developing coreset construction algorithms based on k-means/median clustering that give a provably good approximation for a broad range of machine learning problems with sufficiently continuous cost functions. Through evaluations on diverse datasets and machine learning problems, we verify the robust performance of the proposed algorithms. |
Tasks | |
Published | 2019-04-11 |
URL | https://arxiv.org/abs/1904.05961v2 |
PDF | https://arxiv.org/pdf/1904.05961v2.pdf |
PWC | https://paperswithcode.com/paper/robust-coreset-construction-for-distributed |
Repo | |
Framework | |
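Editor's sketch: the core idea of a clustering-based coreset is easy to show. Assuming the construction reduces, at least conceptually, to "cluster, then keep weighted centers", a toy version runs k-means and emits each center weighted by its cluster size; the weighted set then stands in for the full dataset in downstream cost evaluations. The paper's actual algorithms carry approximation guarantees that this naive version does not.

```python
import numpy as np

def kmeans_coreset(X, k, iters=50, seed=0):
    """Toy clustering-based coreset: k weighted points standing in for X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)            # nearest-center assignment
        for j in range(k):
            members = X[assign == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    weights = np.bincount(assign, minlength=k)   # cluster sizes as weights
    return centers, weights

X = np.random.default_rng(1).normal(size=(1000, 5))
C, w = kmeans_coreset(X, k=20)
print(C.shape, int(w.sum()))  # (20, 5) 1000 -- weights preserve total mass
```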
Interaction Embeddings for Prediction and Explanation in Knowledge Graphs
Title | Interaction Embeddings for Prediction and Explanation in Knowledge Graphs |
Authors | Wen Zhang, Bibek Paudel, Wei Zhang, Abraham Bernstein, Huajun Chen |
Abstract | Knowledge graph embedding aims to learn distributed representations for entities and relations, and is proven to be effective in many applications. Crossover interactions — bi-directional effects between entities and relations — help select related information when predicting a new triple, but have not been formally discussed before. In this paper, we propose CrossE, a novel knowledge graph embedding method that explicitly simulates crossover interactions. It not only learns one general embedding for each entity and relation, as most previous methods do, but also generates multiple triple-specific embeddings for both of them, named interaction embeddings. We evaluate embeddings on typical link prediction tasks and find that CrossE achieves state-of-the-art results on complex and more challenging datasets. Furthermore, we evaluate embeddings from a new perspective — giving explanations for predicted triples, which is important for real applications. In this work, an explanation for a triple is regarded as a reliable closed path between the head and the tail entity. Compared to other baselines, we show experimentally that CrossE, benefiting from interaction embeddings, is more capable of generating reliable explanations to support its predictions. |
Tasks | Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Link Prediction |
Published | 2019-03-12 |
URL | http://arxiv.org/abs/1903.04750v1 |
PDF | http://arxiv.org/pdf/1903.04750v1.pdf |
PWC | https://paperswithcode.com/paper/interaction-embeddings-for-prediction-and |
Repo | |
Framework | |
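Editor's sketch of the crossover-interaction scoring: to my reading, CrossE scores a triple (h, r, t) roughly as sigmoid(tanh(c_r ∘ h + c_r ∘ h ∘ r + b) · t), where c_r is a relation-specific interaction vector and ∘ is the elementwise product; treat the exact form as an assumption and consult the paper for the definitive model.

```python
import numpy as np

def crosse_score(h, r, t, c_r, b):
    """Illustrative crossover-interaction triple score.

    h, r, t : head/relation/tail embeddings, shape (d,)
    c_r     : relation-specific interaction vector, shape (d,)
    b       : global bias, shape (d,)
    """
    h_i = c_r * h                 # head embedding specialized by the relation
    r_i = h_i * r                 # relation embedding specialized by the head
    x = np.tanh(h_i + r_i + b)
    return 1.0 / (1.0 + np.exp(-x @ t))   # sigmoid similarity to the tail

d = 8
rng = np.random.default_rng(0)
h, r, t, c_r, b = (rng.normal(size=d) for _ in range(5))
print(crosse_score(h, r, t, c_r, b))
```

The "interaction embeddings" of the title are exactly the triple-specific h_i and r_i above: the same entity and relation get different effective embeddings in different triples.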
Hybrid Compositional Reasoning for Reactive Synthesis from Finite-Horizon Specifications
Title | Hybrid Compositional Reasoning for Reactive Synthesis from Finite-Horizon Specifications |
Authors | Suguman Bansal, Yong Li, Lucas M. Tabajara, Moshe Y. Vardi |
Abstract | LTLf synthesis is the automated construction of a reactive system from a high-level description, expressed in LTLf, of its finite-horizon behavior. So far, the conversion of LTLf formulas to deterministic finite-state automata (DFAs) has been identified as the primary bottleneck to the scalability of synthesis. Recent investigations have also shown that the size of the DFA state space plays a critical role in synthesis as well. Therefore, effective resolution of the bottleneck requires the conversion to be time- and memory-performant and to prevent state-space explosion. Current conversion approaches, however, which are based either on explicit-state or symbolic-state representations, fail to address these necessities adequately at scale: explicit-state approaches generate minimal DFAs but are slow due to expensive DFA minimization, while symbolic-state representations can be succinct but, lacking DFA minimization, generate such large state spaces that even their symbolic representations cannot compensate for the blow-up. This work proposes a hybrid representation approach for the conversion. Our approach utilizes both explicit and symbolic representations of the state space, and effectively leverages their complementary strengths. In doing so, we offer an LTLf-to-DFA conversion technique that addresses all three necessities, hence resolving the bottleneck. A comprehensive empirical evaluation on conversion and synthesis benchmarks supports the merits of our hybrid approach. |
Tasks | |
Published | 2019-11-19 |
URL | https://arxiv.org/abs/1911.08145v3 |
PDF | https://arxiv.org/pdf/1911.08145v3.pdf |
PWC | https://paperswithcode.com/paper/hybrid-compositional-reasoning-for-reactive |
Repo | |
Framework | |
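Editor's sketch: to make the explicit-versus-symbolic distinction concrete, below are the two encodings of the (already minimal) two-state DFA for the LTLf formula "F a" (eventually a) over finite traces. The hybrid approach in the paper combines the two representations at scale; this toy only shows what each representation looks like.

```python
# Toy illustration of explicit vs. symbolic DFA state representation for the
# LTLf formula "F a" (eventually a) over finite traces. The paper's hybrid
# approach is far more sophisticated; this only contrasts the two encodings.

# Explicit: enumerate states and transitions outright.
EXPLICIT = {
    (0, False): 0, (0, True): 1,   # state 0: "a not yet seen"
    (1, False): 1, (1, True): 1,   # state 1: "a has been seen" (accepting)
}

# Symbolic: the state is one boolean, the transition is a boolean function.
def symbolic_next(seen_a: bool, a: bool) -> bool:
    return seen_a or a

def accepts(trace):
    s_explicit, s_symbolic = 0, False
    for a in trace:
        s_explicit = EXPLICIT[(s_explicit, a)]
        s_symbolic = symbolic_next(s_symbolic, a)
    assert bool(s_explicit) == s_symbolic   # the two encodings agree
    return s_symbolic

print(accepts([False, True, False]))  # True
print(accepts([False, False]))        # False
```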
Deepening Hidden Representations from Pre-trained Language Models for Natural Language Understanding
Title | Deepening Hidden Representations from Pre-trained Language Models for Natural Language Understanding |
Authors | Junjie Yang, Hai Zhao |
Abstract | Transformer-based pre-trained language models have proven effective for learning contextualized language representations. However, current approaches take advantage only of the output of the encoder's final layer when fine-tuning on downstream tasks. We argue that taking only a single layer's output restricts the power of the pre-trained representation. Thus we deepen the representation learned by the model by fusing hidden representations via an explicit HIdden Representation Extractor (HIRE), which automatically absorbs complementary representation with respect to the output of the final layer. Utilizing RoBERTa as the backbone encoder, our proposed improvement over the pre-trained models is shown to be effective on multiple natural language understanding tasks and helps our model rival the state-of-the-art models on the GLUE benchmark. |
Tasks | |
Published | 2019-11-05 |
URL | https://arxiv.org/abs/1911.01940v1 |
PDF | https://arxiv.org/pdf/1911.01940v1.pdf |
PWC | https://paperswithcode.com/paper/deepening-hidden-representations-from-pre |
Repo | |
Framework | |
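Editor's sketch: HIRE's extractor has its own architecture, but the underlying move, fusing all hidden layers instead of using only the last one, can be shown with a simple learned layer weighting (an ELMo-style fusion). The module below is an illustrative stand-in, not the paper's HIRE.

```python
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    """Illustrative fusion of all hidden layers (not HIRE's exact extractor).

    Learns a softmax-normalized scalar weight per layer and returns the
    weighted sum concatenated with the final layer's output.
    """
    def __init__(self, num_layers: int):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):  # list of (batch, seq, dim) tensors
        w = torch.softmax(self.layer_logits, dim=0)
        fused = sum(wi * h for wi, h in zip(w, hidden_states))
        return torch.cat([hidden_states[-1], fused], dim=-1)

layers = [torch.randn(2, 5, 16) for _ in range(13)]  # e.g. embeddings + 12 layers
out = LayerFusion(num_layers=13)(layers)
print(out.shape)  # torch.Size([2, 5, 32])
```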
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF from a Single Image
Title | Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF from a Single Image |
Authors | Zhengqin Li, Mohammad Shafiei, Ravi Ramamoorthi, Kalyan Sunkavalli, Manmohan Chandraker |
Abstract | We propose a deep inverse rendering framework for indoor scenes. From a single RGB image of an arbitrary indoor scene, we create a complete scene reconstruction, estimating shape, spatially-varying lighting, and spatially-varying, non-Lambertian surface reflectance. To train this network, we augment the SUNCG indoor scene dataset with real-world materials and render them with a fast, high-quality, physically-based GPU renderer to create a large-scale, photorealistic indoor dataset. Our inverse rendering network incorporates physical insights – including a spatially-varying spherical Gaussian lighting representation, a differentiable rendering layer to model scene appearance, a cascade structure to iteratively refine the predictions and a bilateral solver for refinement – allowing us to jointly reason about shape, lighting, and reflectance. Experiments show that our framework outperforms previous methods for estimating individual scene components, which also enables various novel applications for augmented reality, such as photorealistic object insertion and material editing. Code and data will be made publicly available. |
Tasks | |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02722v1 |
PDF | https://arxiv.org/pdf/1905.02722v1.pdf |
PWC | https://paperswithcode.com/paper/inverse-rendering-for-complex-indoor-scenes |
Repo | |
Framework | |
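Editor's sketch: of the physical insights listed, the spatially-varying spherical Gaussian lighting representation is the most self-contained. Each spatial location stores a small mixture of spherical Gaussian lobes, and radiance in direction w is L(w) = sum_k mu_k * exp(lambda_k * (w . xi_k - 1)). The evaluation below is standard spherical Gaussian math, not the paper's network.

```python
import numpy as np

def sg_radiance(dirs, centers, lambdas, mus):
    """Evaluate a spherical-Gaussian lighting mixture at unit directions.

    dirs: (n,3) unit query directions; centers: (k,3) unit lobe axes xi_k;
    lambdas: (k,) lobe sharpness; mus: (k,3) RGB lobe amplitudes.
    """
    cos = dirs @ centers.T                       # (n, k) cosines w . xi_k
    g = np.exp(lambdas[None, :] * (cos - 1.0))   # (n, k) lobe falloffs
    return g @ mus                               # (n, 3) RGB radiance

rng = np.random.default_rng(0)
xi = rng.normal(size=(4, 3)); xi /= np.linalg.norm(xi, axis=1, keepdims=True)
w = rng.normal(size=(5, 3)); w /= np.linalg.norm(w, axis=1, keepdims=True)
L = sg_radiance(w, xi, lambdas=np.full(4, 10.0),
                mus=np.abs(rng.normal(size=(4, 3))))
print(L.shape)  # (5, 3): RGB radiance per query direction
```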
Generating Labels for Regression of Subjective Constructs using Triplet Embeddings
Title | Generating Labels for Regression of Subjective Constructs using Triplet Embeddings |
Authors | Karel Mundnich, Brandon M. Booth, Benjamin Girault, Shrikanth Narayanan |
Abstract | Human annotations serve an important role in computational models where the target constructs under study are hidden, such as dimensions of affect. This is especially relevant in machine learning, where subjective labels derived from related observable signals (e.g., audio, video, text) are needed to support model training and testing. Current research trends focus on correcting artifacts and biases introduced by annotators during the annotation process while fusing them into a single annotation. In this work, we propose a novel annotation approach using triplet embeddings. By lifting the absolute annotation process to relative annotations where the annotator compares individual target constructs in triplets, we leverage the accuracy of comparisons over absolute ratings by human annotators. We then build a 1-dimensional embedding in Euclidean space that is indexed in time and serves as a label for regression. In this setting, the annotation fusion occurs naturally as a union of sets of sampled triplet comparisons among different annotators. We show that by using our proposed sampling method to find an embedding, we are able to accurately represent synthetic hidden constructs in time under noisy sampling conditions. We further validate this approach using human annotations collected from Mechanical Turk and show that we can recover the underlying structure of the hidden construct up to bias and scaling factors. |
Tasks | |
Published | 2019-04-02 |
URL | https://arxiv.org/abs/1904.01643v2 |
PDF | https://arxiv.org/pdf/1904.01643v2.pdf |
PWC | https://paperswithcode.com/paper/generating-labels-for-regression-of |
Repo | |
Framework | |
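Editor's sketch: the heart of the method, building a 1-D embedding from triplet comparisons, can be illustrated with a plain hinge loss on squared distances and gradient descent. The loss form, margin, and learning rate are illustrative choices; the paper's sampling and embedding machinery is more careful.

```python
import numpy as np

def fit_1d_embedding(n, triplets, lr=0.05, margin=0.1, epochs=200, seed=0):
    """Fit x in R^n from triplets (i, j, k): 'i is closer to j than to k'."""
    rng = np.random.default_rng(seed)
    x = rng.normal(scale=0.1, size=n)
    for _ in range(epochs):
        for i, j, k in triplets:
            # Hinge on squared distances: want |xi-xj|^2 + margin < |xi-xk|^2.
            if (x[i] - x[j]) ** 2 + margin > (x[i] - x[k]) ** 2:
                gi = 2 * ((x[i] - x[j]) - (x[i] - x[k]))
                gj = -2 * (x[i] - x[j])
                gk = 2 * (x[i] - x[k])
                x[i] -= lr * gi; x[j] -= lr * gj; x[k] -= lr * gk
    return x

truth = np.linspace(0.0, 1.0, 8)          # hidden 1-D construct
trips = [(i, j, k) for i in range(8) for j in range(8) for k in range(8)
         if len({i, j, k}) == 3
         and abs(truth[i] - truth[j]) < abs(truth[i] - truth[k])]
x = fit_1d_embedding(8, trips)
print(np.round(x, 2))  # should recover the ordering up to sign/offset/scale,
                       # cf. the abstract's "bias and scaling factors"
```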
Reducing state updates via Gaussian-gated LSTMs
Title | Reducing state updates via Gaussian-gated LSTMs |
Authors | Matthew Thornton, Jithendar Anumula, Shih-Chii Liu |
Abstract | Recurrent neural networks can be difficult to train on long sequence data due to the well-known vanishing gradient problem. Some architectures incorporate methods to reduce RNN state updates, thereby allowing the network to preserve memory over long temporal intervals. To address these problems of convergence, this paper proposes a timing-gated LSTM RNN model, called the Gaussian-gated LSTM (g-LSTM). The time gate controls when a neuron can be updated during training, enabling longer memory persistence and better error-gradient flow. This model captures long temporal dependencies better than an LSTM, and the time gate parameters can be learned even from non-optimal initialization values. Because the time gate limits the updates of the neuron state, the number of computations needed for the network update is also reduced. By adding a computational budget term to the training loss, we can obtain a network which further reduces the number of computations by at least 10x. Finally, by employing a temporal curriculum learning schedule for the g-LSTM, we can reduce the convergence time of the equivalent LSTM network on long sequences. |
Tasks | |
Published | 2019-01-22 |
URL | http://arxiv.org/abs/1901.07334v1 |
PDF | http://arxiv.org/pdf/1901.07334v1.pdf |
PWC | https://paperswithcode.com/paper/reducing-state-updates-via-gaussian-gated |
Repo | |
Framework | |
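Editor's sketch: the essential mechanism is a time gate with a Gaussian openness profile: a neuron's state moves only when its gate is open, and stays (leakily) frozen otherwise. The parameterization below (per-neuron mu and sigma, convex-combination update) is a plausible assumption rather than the paper's exact equations.

```python
import numpy as np

def gaussian_gate(t, mu, sigma):
    """Gaussian openness of the time gate at time t (per-neuron mu, sigma)."""
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2)

def gated_state_update(c_prev, c_candidate, t, mu, sigma):
    """Leaky update: the state only moves when the gate is open."""
    k = gaussian_gate(t, mu, sigma)
    return k * c_candidate + (1.0 - k) * c_prev

c = np.zeros(4)
mu = np.array([5.0, 10.0, 15.0, 20.0])   # per-neuron gate centers
sigma = np.full(4, 2.0)
for t in range(25):
    c = gated_state_update(c, np.ones(4), t, mu, sigma)
print(np.round(c, 2))  # each neuron absorbed the candidate mostly in the
                       # few steps near its own gate center mu
```

Because the gate is near zero most of the time, most state updates can be skipped, which is where the reduction in computation comes from.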
Exact and Consistent Interpretation of Piecewise Linear Models Hidden behind APIs: A Closed Form Solution
Title | Exact and Consistent Interpretation of Piecewise Linear Models Hidden behind APIs: A Closed Form Solution |
Authors | Zicun Cong, Lingyang Chu, Lanjun Wang, Xia Hu, Jian Pei |
Abstract | More and more AI services are provided through APIs on the cloud, where predictive models are hidden behind APIs. To build trust with users and reduce potential application risk, it is important to interpret how such predictive models hidden behind APIs make their decisions. The biggest challenge of interpreting such predictions is that no access to model parameters or training data is available. Existing works interpret the predictions of a model hidden behind an API by heuristically probing the response of the API with perturbed input instances. However, these methods do not provide any guarantee on the exactness and consistency of their interpretations. In this paper, we propose an elegant closed-form solution named OpenAPI to compute exact and consistent interpretations for the family of Piecewise Linear Models (PLM), which includes many popular classification models. The major idea is to first construct a set of overdetermined linear equation systems from a small set of perturbed instances and the predictions made by the model on those instances. Then, we solve the equation systems to identify the decision features that are responsible for the prediction on an input instance. Our extensive experiments clearly demonstrate the exactness and consistency of our method. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.06857v1 |
PDF | https://arxiv.org/pdf/1906.06857v1.pdf |
PWC | https://paperswithcode.com/paper/exact-and-consistent-interpretation-of |
Repo | |
Framework | |
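Editor's sketch: the closed-form idea is concrete enough to demo end-to-end. Inside one linear region, a piecewise-linear model is exactly f(x) = Wx + b, so probing the API with small perturbations around x0 and solving an overdetermined least-squares system recovers the local decision weights exactly (up to floating point). The probe count and perturbation scale below are illustrative; the paper specifies how to construct the systems properly.

```python
import numpy as np

def probe_local_linear(api, x0, eps=1e-3, seed=0):
    """Recover the local linear form f(x) = W x + b of a piecewise-linear
    model around x0 by probing it and solving a least-squares system."""
    rng = np.random.default_rng(seed)
    d = x0.size
    X = x0 + eps * rng.normal(size=(4 * d, d))   # overdetermined probe set
    Y = np.stack([api(x) for x in X])            # black-box API responses
    A = np.hstack([X, np.ones((len(X), 1))])     # design matrix [x, 1]
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef[:-1].T, coef[-1]                 # W, b

# A hidden "API": a tiny piecewise-linear model (ReLU net, fixed weights).
W1 = np.array([[1.0, -2.0], [0.5, 1.0]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[1.0, 1.0]])
api = lambda x: W2 @ np.maximum(W1 @ x + b1, 0.0)

x0 = np.array([1.0, 1.0])
W, b = probe_local_linear(api, x0)
print(W, b)                               # local weights, up to float error
print(np.allclose(W @ x0 + b, api(x0)))   # True
```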
Synthesizing Credit Card Transactions
Title | Synthesizing Credit Card Transactions |
Authors | Erik R. Altman |
Abstract | Two elements have been essential to AI's recent boom: (1) deep neural nets and the theory and practice behind them; and (2) cloud computing with its abundant labeled data and large computing resources. Abundant labeled data is available for key domains such as images, speech, natural language processing, and recommendation engines. However, there are many other domains where such data is not available, or access to it is highly restricted for privacy reasons, as with health and financial data. Even when abundant data is available, it is often not labeled, and doing such labeling is labor-intensive and non-scalable. As a result, to the best of our knowledge, key domains still lack labeled data or have at most toy data, or synthetic-data generators must have access to real data to mimic. This paper outlines work to generate realistic synthetic data for an important domain: credit card transactions. Some challenges: there are many patterns and correlations in real purchases; there are millions of merchants and innumerable locations; those merchants offer a wide variety of goods. Who shops where and when? How much do people pay? What is a realistic fraudulent transaction? We use a mixture of technical approaches and domain knowledge, including the mechanics of credit card processing and a broad set of consumer domains: electronics, clothing, hair styling, etc. Connecting everything is a virtual world. This paper outlines some of our key techniques and provides evidence that the generated data is indeed realistic. Beyond the scope of this paper: (1) use of our data to develop and train models to predict fraud; (2) coupling models and the synthetic dataset to assess performance in designing accelerators such as GPUs and TPUs. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.03033v1 |
PDF | https://arxiv.org/pdf/1910.03033v1.pdf |
PWC | https://paperswithcode.com/paper/synthesizing-credit-card-transactions |
Repo | |
Framework | |
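Editor's sketch: a deliberately naive transaction generator, just to fix ideas about what "synthetic credit card data" contains. The paper's generator models the who/where/when/how-much correlations through a virtual world; this sketch samples fields independently, which is precisely what realistic synthesis must avoid. All names and parameters are made up.

```python
import random, datetime

RNG = random.Random(7)
MERCHANTS = {"electronics": ["TechWorld", "GizmoHub"],
             "clothing": ["ThreadBare", "FitFirst"],
             "food": ["BeanScene", "NoodleBar"]}

def synth_transaction(card_id):
    """One toy transaction; real generators model the fields jointly."""
    domain = RNG.choice(list(MERCHANTS))
    ts = datetime.datetime(2019, 1, 1) + datetime.timedelta(
        minutes=RNG.randrange(365 * 24 * 60))
    return {
        "card": card_id,
        "merchant": RNG.choice(MERCHANTS[domain]),
        "domain": domain,
        "amount": round(RNG.lognormvariate(3.0, 1.0), 2),  # skewed amounts
        "time": ts.isoformat(),
        "fraud": RNG.random() < 0.002,   # rare-positive label
    }

print([synth_transaction(c) for c in range(2)])
```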
Incorporating Unlabeled Data into Distributionally Robust Learning
Title | Incorporating Unlabeled Data into Distributionally Robust Learning |
Authors | Charlie Frogner, Sebastian Claici, Edward Chien, Justin Solomon |
Abstract | We study a robust alternative to empirical risk minimization called distributionally robust learning (DRL), in which one learns to perform against an adversary who can choose the data distribution from a specified set of distributions. We illustrate a problem with current DRL formulations, which rely on an overly broad definition of allowed distributions for the adversary, leading to learned classifiers that are unable to predict with any confidence. We propose a solution that incorporates unlabeled data into the DRL problem to further constrain the adversary. We show that this new formulation is tractable for stochastic gradient-based optimization and yields a computable guarantee on the future performance of the learned classifier, analogous to – but tighter than – guarantees from conventional DRL. We examine the performance of this new formulation on 14 real datasets and find that it often yields effective classifiers with nontrivial performance guarantees in situations where conventional DRL produces neither. Inspired by these results, we extend our DRL formulation to active learning with a novel, distributionally-robust version of the standard model-change heuristic. Our active learning algorithm often achieves superior learning performance to the original heuristic on real datasets. |
Tasks | Active Learning |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07729v2 |
PDF | https://arxiv.org/pdf/1912.07729v2.pdf |
PWC | https://paperswithcode.com/paper/incorporating-unlabeled-data-into |
Repo | |
Framework | |
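Editor's sketch: to make "performing against an adversary who chooses the distribution" concrete, here is the standard dual of a KL-ball distributionally robust objective, evaluated by grid search over the dual temperature. This is generic DRL/DRO, not the paper's formulation; their contribution is constraining the adversary further using unlabeled data, which this sketch does not do.

```python
import numpy as np

def kl_dro_loss(losses, rho=0.1):
    """Worst case of E_Q[loss] over {Q : KL(Q || P_n) <= rho}, via the dual
    sup_Q E_Q[l] = min_{tau>0} tau * log E_P[exp(l / tau)] + tau * rho."""
    m = losses.max()                     # shift for numerical stability
    taus = np.logspace(-2, 2, 200)
    vals = [m + t * np.log(np.mean(np.exp((losses - m) / t))) + t * rho
            for t in taus]
    return float(np.min(vals))

losses = np.random.default_rng(0).exponential(size=100)
print(round(float(np.mean(losses)), 3), round(kl_dro_loss(losses), 3))
# The robust value upper-bounds the empirical mean, as expected.
```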
CORAL8: Concurrent Object Regression for Area Localization in Medical Image Panels
Title | CORAL8: Concurrent Object Regression for Area Localization in Medical Image Panels |
Authors | Sam Maksoud, Arnold Wiliem, Kun Zhao, Teng Zhang, Lin Wu, Brian C. Lovell |
Abstract | This work tackles the problem of generating a medical report for multi-image panels. We apply our solution to the Renal Direct Immunofluorescence (RDIF) assay, which requires a pathologist to generate a report based on observations across eight different whole slide images (WSIs) in concert with existing clinical features. To this end, we propose a novel attention-based multi-modal generative recurrent neural network (RNN) architecture capable of dynamically sampling image data concurrently across the RDIF panel. The proposed methodology incorporates text from the clinical notes of the requesting physician to regulate the output of the network so that it aligns with the overall clinical context. In addition, we found it important to regularize the attention weights for the word generation process, because the system can otherwise ignore the attention mechanism by assigning equal weights to all members. Thus, we propose two regularizations which force the system to utilize the attention mechanism. Experiments on our novel collection of RDIF WSIs, provided by a large clinical laboratory, demonstrate that our framework offers significant improvements over existing methods. |
Tasks | |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09676v1 |
PDF | https://arxiv.org/pdf/1906.09676v1.pdf |
PWC | https://paperswithcode.com/paper/coral8-concurrent-object-regression-for-area |
Repo | |
Framework | |
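Editor's sketch: the abstract notes that a generator can "ignore" attention by making it uniform. One generic countermeasure, shown here, is adding an entropy penalty over the attention weights to the training loss; the paper proposes two specific regularizers of its own, so treat this purely as an illustration of the failure mode and the kind of fix.

```python
import torch

def attention_entropy_penalty(attn, eps=1e-8):
    """Penalize near-uniform attention so the model cannot ignore it.

    attn: (batch, steps, regions) softmax weights. Uniform attention has
    maximal entropy, so adding mean entropy to the loss pushes the weights
    toward peaked, informative distributions.
    """
    ent = -(attn * (attn + eps).log()).sum(dim=-1)   # (batch, steps)
    return ent.mean()

attn = torch.softmax(torch.randn(4, 7, 36), dim=-1)
print(attention_entropy_penalty(attn))   # add to the generation loss
```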
Disentanglement based Active Learning
Title | Disentanglement based Active Learning |
Authors | Silpa V S, Adarsh K, Sumitra S, Raju K George |
Abstract | We propose Disentanglement based Active Learning (DAL), a new active learning technique based on query synthesis which leverages the concept of disentanglement. Instead of requesting labels from the human oracle, our method automatically labels the majority of the datapoints, drastically reducing the human labelling budget in active learning. The proposed method uses Information Maximizing Generative Adversarial Nets (InfoGAN) to achieve this: the active learner provides feedback on InfoGAN's generations, based on which a decision is made about which datapoints to query. Results on two benchmark datasets demonstrate that DAL is able to achieve nearly fully supervised accuracy with a considerably smaller labelling budget compared to existing active learning approaches. |
Tasks | Active Learning |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.07018v1 |
PDF | https://arxiv.org/pdf/1912.07018v1.pdf |
PWC | https://paperswithcode.com/paper/disentanglement-based-active-learning |
Repo | |
Framework | |
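Editor's sketch: the key budget-saving move is routing: confidently recognizable generated points are labeled automatically, and only ambiguous ones go to the oracle. The rule below thresholds the posterior over InfoGAN's categorical latent code; the threshold and the max-probability criterion are assumptions, not the paper's decision procedure.

```python
import numpy as np

def route_datapoints(q_probs, threshold=0.9):
    """Split generated points into auto-labeled vs. oracle-queried.

    q_probs: (n, classes) posterior over InfoGAN's categorical latent code
    for each generated sample. Confident points are labeled automatically;
    ambiguous ones are sent to the human oracle.
    """
    conf = q_probs.max(axis=1)
    auto = np.where(conf >= threshold)[0]    # keep Q-network's argmax label
    query = np.where(conf < threshold)[0]    # ask the oracle
    return auto, q_probs.argmax(axis=1)[auto], query

rng = np.random.default_rng(0)
logits = rng.normal(scale=3.0, size=(10, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
auto_idx, auto_labels, query_idx = route_datapoints(probs)
print(len(auto_idx), "auto-labeled;", len(query_idx), "sent to oracle")
```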
Parting with Illusions about Deep Active Learning
Title | Parting with Illusions about Deep Active Learning |
Authors | Sudhanshu Mittal, Maxim Tatarchenko, Özgün Çiçek, Thomas Brox |
Abstract | Active learning aims to reduce the high labeling cost involved in training machine learning models on large datasets by efficiently labeling only the most informative samples. Recently, deep active learning has shown success on various tasks. However, the conventional evaluation scheme used for deep active learning is below par: current methods disregard apparent parallel work in closely related fields. Active learning methods are quite sensitive to changes in the training procedure, such as data augmentation; they improve by a large margin when integrated with semi-supervised learning, but barely perform better than the random baseline. We re-implement various recent active learning approaches for image classification and evaluate them under more realistic settings. We further validate our findings for semantic segmentation. Based on our observations, we realistically assess the current state of the field and propose a more suitable evaluation protocol. |
Tasks | Active Learning, Data Augmentation, Image Classification, Semantic Segmentation |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.05361v1 |
PDF | https://arxiv.org/pdf/1912.05361v1.pdf |
PWC | https://paperswithcode.com/paper/parting-with-illusions-about-deep-active |
Repo | |
Framework | |
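Editor's sketch: the protocol the paper argues for boils down to evaluating every selection strategy inside one fixed training pipeline and always against the random baseline. Below is a minimal harness with a toy nearest-centroid model so it runs as-is; everything here is an illustrative stand-in for the paper's actual protocol.

```python
import numpy as np

def evaluate_strategy(pick, X, y, train_fn, score_fn,
                      seed_size=20, batch=20, rounds=5, seed=0):
    """Label-efficiency curve for one acquisition function `pick`,
    holding the whole training pipeline fixed across strategies."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X), seed_size, replace=False))
    curve = []
    for _ in range(rounds):
        model = train_fn(X[labeled], y[labeled])
        curve.append(score_fn(model))
        pool = np.setdiff1d(np.arange(len(X)), labeled)
        labeled += list(pick(model, X, pool, batch, rng))
    return curve

def random_pick(model, X, pool, batch, rng):     # the baseline to beat
    return rng.choice(pool, batch, replace=False)

# Toy demo: nearest-centroid classifier on two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (500, 2)), rng.normal(2, 1, (500, 2))])
y = np.array([0] * 500 + [1] * 500)
train_fn = lambda Xl, yl: {c: Xl[yl == c].mean(0) for c in np.unique(yl)}
score_fn = lambda m: np.mean([min(m, key=lambda c: np.linalg.norm(x - m[c]))
                              for x in X] == y)
print(evaluate_strategy(random_pick, X, y, train_fn, score_fn))
```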
GhostLink: Latent Network Inference for Influence-aware Recommendation
Title | GhostLink: Latent Network Inference for Influence-aware Recommendation |
Authors | Subhabrata Mukherjee, Stephan Guennemann |
Abstract | Social influence plays a vital role in shaping a user's behavior in online communities dealing with items of fine taste like movies, food, and beer. For online recommendation, this implies that users' preferences and ratings are influenced by other individuals. Given only time-stamped reviews of users, can we find out who influences whom, and the characteristics of the underlying influence network? Can we use this network to improve recommendation? While prior works in social-aware recommendation have leveraged social interaction by considering the observed social network of users, many communities like Amazon, Beeradvocate, and Ratebeer do not have explicit user-user links. Therefore, we propose GhostLink, an unsupervised probabilistic graphical model, to automatically learn the latent influence network underlying a review community, given only the temporal traces (timestamps) of users' posts and their content. Based on extensive experiments with four real-world datasets with 13 million reviews, we show that GhostLink improves item recommendation by around 23% over state-of-the-art methods that do not consider this influence. As additional use cases, we show that GhostLink can be used to differentiate between users' latent preferences and influenced ones, as well as to detect influential users based on the learned influence graph. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.05955v1 |
PDF | https://arxiv.org/pdf/1905.05955v1.pdf |
PWC | https://paperswithcode.com/paper/ghostlink-latent-network-inference-for |
Repo | |
Framework | |
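Editor's sketch: GhostLink's input is nothing but timestamped reviews, so the temporal signal it exploits can be shown with a toy precedence count: how often does user v review an item shortly before user u does? GhostLink itself is a probabilistic graphical model with far more structure; the window size and counting rule here are arbitrary illustrations.

```python
from collections import defaultdict
from itertools import combinations

def influence_counts(reviews, window=30):
    """Toy precedence statistic over (user, item, day) review tuples:
    count how often u1 reviewed the same item before u2 within `window`
    days, i.e. the raw temporal signal a latent influence model draws on."""
    by_item = defaultdict(list)
    for user, item, day in reviews:
        by_item[item].append((day, user))
    counts = defaultdict(int)
    for entries in by_item.values():
        entries.sort()                      # chronological order per item
        for (d1, u1), (d2, u2) in combinations(entries, 2):
            if u1 != u2 and d2 - d1 <= window:
                counts[(u1, u2)] += 1       # u1 may have influenced u2
    return dict(counts)

reviews = [("ann", "ipa", 1), ("bob", "ipa", 3), ("ann", "stout", 10),
           ("bob", "stout", 12), ("cat", "ipa", 200)]
print(influence_counts(reviews))  # {('ann', 'bob'): 2}
```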