Paper Group ANR 395
Natural- to formal-language generation using Tensor Product Representations. Sketch-based Randomized Algorithms for Dynamic Graph Regression. On Connections between Constrained Optimization and Reinforcement Learning. Tracking Discrete and Continuous Entity State for Process Understanding. Efficient Retrieval of Logos Using Rough Set Reducts. Schemaless Queries over Document Tables with Dependencies. Electron Neutrino Energy Reconstruction in NOvA Using CNN Particle IDs. Complex Deep Learning Models for Denoising of Human Heart ECG signals. Learning Discriminative Features Via Weights-biased Softmax Loss. Figurative Usage Detection of Symptom Words to Improve Personal Health Mention Detection. Combining Crowd and Machines for Multi-predicate Item Screening. Table understanding in structured documents. A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound. Text mining policy: Classifying forest and landscape restoration policy agenda with neural information retrieval. Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation.
Natural- to formal-language generation using Tensor Product Representations
Title | Natural- to formal-language generation using Tensor Product Representations |
Authors | Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao |
Abstract | Generating formal language represented by relational tuples, such as Lisp programs or mathematical expressions, from a natural-language input is an extremely challenging task because it requires explicitly capturing discrete symbolic structural information from the input to generate the output. Most state-of-the-art neural sequence models do not explicitly capture such structural information, and thus do not perform well on these tasks. In this paper, we propose a new encoder-decoder model based on Tensor Product Representations (TPRs) for Natural- to Formal-language generation, called TP-N2F. The encoder of TP-N2F employs TPR ‘binding’ to encode natural-language symbolic structure in vector space and the decoder uses TPR ‘unbinding’ to generate a sequence of relational tuples, each consisting of a relation (or operation) and a number of arguments, in symbolic space. TP-N2F considerably outperforms LSTM-based Seq2Seq models, setting a new state of the art on two benchmarks: the MathQA dataset for math problem solving, and the AlgoLisp dataset for program synthesis. Ablation studies show that the improvements are mainly attributable to the use of TPRs in both the encoder and decoder to explicitly capture relational structure information for symbolic reasoning. |
Tasks | Program Synthesis, Text Generation |
Published | 2019-10-05 |
URL | https://arxiv.org/abs/1910.02339v2 |
https://arxiv.org/pdf/1910.02339v2.pdf | |
PWC | https://paperswithcode.com/paper/natural-to-formal-language-generation-using-1 |
Repo | |
Framework | |
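For readers unfamiliar with TPRs, the binding/unbinding operations at the heart of TP-N2F reduce to outer products and inner-product contractions. Below is a minimal numpy sketch with toy dimensions and random orthonormal role vectors — an illustration of the algebra, not the paper's trained encoder/decoder:

```python
import numpy as np

rng = np.random.default_rng(0)
d_f, n_roles = 8, 4                        # filler dim, number of role slots

# Orthonormal role vectors make unbinding exact: rows of R satisfy R @ R.T == I.
R = np.linalg.qr(rng.normal(size=(n_roles, n_roles)))[0]
fillers = rng.normal(size=(n_roles, d_f))  # one filler vector per role

# Binding: superpose the outer products filler_i (x) role_i into one tensor.
T = sum(np.outer(fillers[i], R[i]) for i in range(n_roles))

# Unbinding: contract with the (dual) role vector to recover each filler.
recovered = np.stack([T @ R[i] for i in range(n_roles)])
assert np.allclose(recovered, fillers)
```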
Sketch-based Randomized Algorithms for Dynamic Graph Regression
Title | Sketch-based Randomized Algorithms for Dynamic Graph Regression |
Authors | Mostafa Haghir Chehreghani |
Abstract | A well-known problem in data science and machine learning is {\em linear regression}, which has recently been extended to dynamic graphs. Existing exact algorithms for updating the solution of the dynamic graph regression problem require at least linear time (in $n$, the size of the graph). However, this time complexity might be intractable in practice. In the current paper, we utilize the {\em subsampled randomized Hadamard transform} and \textsf{CountSketch} to propose the first randomized algorithms. Suppose that we are given an $n\times m$ matrix embedding $M$ of the graph, where $m \ll n$. Let $r$ be the number of samples required for a guaranteed approximation error, which is a sublinear function of $n$. Our first algorithm reduces the time complexity of pre-processing to $O(n(m + 1) + 2n(m + 1) \log_2(r + 1) + rm^2)$. Then, after an edge insertion or an edge deletion, it updates the approximate solution in $O(rm)$ time. Our second algorithm reduces the time complexity of pre-processing to $O \left( nnz(M) + m^3 \epsilon^{-2} \log^7(m/\epsilon) \right)$, where $nnz(M)$ is the number of nonzero elements of $M$. Then, after an edge insertion, an edge deletion, a node insertion, or a node deletion, it updates the approximate solution in $O(qm)$ time, with $q=O\left(\frac{m^2}{\epsilon^2} \log^6(m/\epsilon) \right)$. Finally, we show that under some assumptions, if $\ln n < \epsilon^{-1}$ our first algorithm outperforms our second algorithm, and if $\ln n \geq \epsilon^{-1}$ our second algorithm outperforms our first. |
Tasks | Graph Regression |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11963v2 |
https://arxiv.org/pdf/1905.11963v2.pdf | |
PWC | https://paperswithcode.com/paper/sketch-based-randomized-algorithms-for |
Repo | |
Framework | |
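The sketch-and-solve idea behind the CountSketch-based algorithm can be illustrated in a few lines: apply a CountSketch to the regression problem and solve the smaller system. A toy static-case sketch in numpy (the paper's dynamic update rules after edge/node changes are not reproduced here, and the sketch size is illustrative):

```python
import numpy as np

def countsketch(A, r, seed=0):
    """Apply an r x n CountSketch to the rows of A without materializing S."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    rows = rng.integers(0, r, size=n)        # hash each row to a bucket
    signs = rng.choice([-1.0, 1.0], size=n)  # random sign per row
    SA = np.zeros((r, A.shape[1]))
    np.add.at(SA, rows, signs[:, None] * A)
    return SA

rng = np.random.default_rng(1)
n, m, r = 20000, 10, 400
M = rng.normal(size=(n, m))                  # graph embedding matrix
y = M @ rng.normal(size=m) + 0.01 * rng.normal(size=n)

# Solve min ||S(Mx - y)|| instead of min ||Mx - y|| (same sketch for M and y).
Sb = countsketch(np.hstack([M, y[:, None]]), r)
x_sk, *_ = np.linalg.lstsq(Sb[:, :m], Sb[:, m], rcond=None)
x_ex, *_ = np.linalg.lstsq(M, y, rcond=None)
print(np.linalg.norm(x_sk - x_ex))           # small approximation error
```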
On Connections between Constrained Optimization and Reinforcement Learning
Title | On Connections between Constrained Optimization and Reinforcement Learning |
Authors | Nino Vieillard, Olivier Pietquin, Matthieu Geist |
Abstract | Dynamic Programming (DP) provides standard algorithms to solve Markov Decision Processes. However, these algorithms generally do not optimize a scalar objective function. In this paper, we draw connections between DP and (constrained) convex optimization. Specifically, we show clear links in the algorithmic structure between three DP schemes and optimization algorithms. We link Conservative Policy Iteration to Frank-Wolfe, Mirror-Descent Modified Policy Iteration to Mirror Descent, and Politex (Policy Iteration Using Expert Prediction) to Dual Averaging. These abstract DP schemes are representative of a number of (deep) Reinforcement Learning (RL) algorithms. By highlighting these connections (most of which have been noticed earlier, but in a scattered way), we would like to encourage further studies linking RL and convex optimization, which could lead to the design of new, more efficient, and better-understood RL algorithms. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08476v2 |
https://arxiv.org/pdf/1910.08476v2.pdf | |
PWC | https://paperswithcode.com/paper/on-connections-between-constrained |
Repo | |
Framework | |
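The mirror-descent side of these connections has a familiar computational form: a multiplicative update followed by normalization back onto the simplex. A one-state toy sketch with hypothetical action values — illustrative of the update shape only, not one of the paper's DP schemes verbatim:

```python
import numpy as np

# One-state problem: maximize <pi, q> over the simplex via mirror descent
# with the KL divergence as Bregman term (i.e. exponentiated gradient).
q = np.array([1.0, 0.5, -0.2, 0.8])   # hypothetical action values
pi = np.full(4, 0.25)                 # start from the uniform policy
eta = 0.5                             # stepsize

for _ in range(50):
    pi = pi * np.exp(eta * q)         # multiplicative (mirror) update
    pi /= pi.sum()                    # renormalize onto the simplex

print(pi.round(3))  # mass concentrates on the best action (index 0)
```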
Tracking Discrete and Continuous Entity State for Process Understanding
Title | Tracking Discrete and Continuous Entity State for Process Understanding |
Authors | Aditya Gupta, Greg Durrett |
Abstract | Procedural text, which describes entities and their interactions as they undergo some process, depicts entities in a uniquely nuanced way. First, each entity may have some observable discrete attributes, such as its state or location; modeling these involves imposing global structure and enforcing consistency. Second, an entity may have properties which are not made explicit but can be effectively induced and tracked by neural networks. In this paper, we propose a structured neural architecture that reflects this dual nature of entity evolution. The model tracks each entity recurrently, updating its hidden continuous representation at each step to contain relevant state information. The global discrete state structure is explicitly modeled with a neural CRF over the changing hidden representation of the entity. This CRF can explicitly capture constraints on entity states over time, enforcing that, for example, an entity cannot move to a location after it is destroyed. We evaluate the performance of our proposed model on QA tasks over process paragraphs in the ProPara dataset and find that our model achieves state-of-the-art results. |
Tasks | |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03518v1 |
http://arxiv.org/pdf/1904.03518v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-discrete-and-continuous-entity-state |
Repo | |
Framework | |
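The kind of hard constraint the neural CRF enforces — e.g. no transitions out of a "destroyed" state — can be expressed as forbidden transition scores in Viterbi decoding. A toy numpy sketch with hypothetical emission scores (the paper's CRF operates over learned entity representations; only the constraint mechanics are shown):

```python
import numpy as np

states = ["exists", "moved", "destroyed"]
NEG = -1e9
# Transition scores; once destroyed, an entity stays destroyed.
trans = np.zeros((3, 3))
trans[2, :2] = NEG                     # destroyed -> exists/moved forbidden

# Hypothetical per-step emission scores from the entity's hidden state.
emit = np.array([[2.0, 0.1, 0.3],
                 [0.2, 1.5, 0.4],
                 [0.1, 0.2, 2.5],
                 [0.3, 1.8, 0.2]])     # step 4 "wants" moved, but...

T = len(emit)
score = emit[0].copy()
back = np.zeros((T, 3), dtype=int)
for t in range(1, T):                  # standard Viterbi recursion
    cand = score[:, None] + trans + emit[t]
    back[t] = cand.argmax(axis=0)
    score = cand.max(axis=0)

path = [int(score.argmax())]
for t in range(T - 1, 0, -1):
    path.append(int(back[t][path[-1]]))
print([states[s] for s in reversed(path)])  # never leaves "destroyed"
```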
Efficient Retrieval of Logos Using Rough Set Reducts
Title | Efficient Retrieval of Logos Using Rough Set Reducts |
Authors | Ushasi Chaudhuri, Partha Bhowmick, Jayanta Mukhopadhyay |
Abstract | Searching for similar logos in the registered logo database is a very important and tedious task at the trademark office. Speed and accuracy are two aspects that one must attend to while developing a system for retrieval of logos. In this paper, we propose a rough-set based method to quantify the structural information in a logo image that can be used to efficiently index an image. A logo is split into a number of polygons, and for each polygon, we compute the tight upper and lower approximations based on the principles of rough sets. This representation is used to form feature vectors for retrieval of logos. Experimentation on a standard dataset shows the usefulness of the proposed technique. It is computationally efficient and also provides retrieval results with high accuracy. |
Tasks | |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05008v1 |
http://arxiv.org/pdf/1904.05008v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-retrieval-of-logos-using-rough-set |
Repo | |
Framework | |
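The core rough-set construction — lower and upper approximations of a shape over a partition of the plane — is easy to sketch on a pixel grid. A toy version using square blocks as granules (the paper works with polygons, so this is a simplification):

```python
import numpy as np

# Binary mask of a shape (here: a filled disc) on a 32x32 grid.
yy, xx = np.mgrid[:32, :32]
shape = (xx - 16) ** 2 + (yy - 16) ** 2 <= 100

# Partition the grid into 4x4 blocks (the "granules" of the rough set).
blocks = shape.reshape(8, 4, 8, 4).swapaxes(1, 2).reshape(8, 8, 16)

lower = blocks.all(axis=-1)   # blocks entirely inside the shape
upper = blocks.any(axis=-1)   # blocks touching the shape at all
boundary = upper & ~lower     # the rough-set boundary region

# Areas of the approximations could serve as simple retrieval features.
print(lower.sum(), upper.sum(), boundary.sum())
```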
Schemaless Queries over Document Tables with Dependencies
Title | Schemaless Queries over Document Tables with Dependencies |
Authors | Mustafa Canim, Cristina Cornelio, Arun Iyengar, Ryan Musa, Mariano Rodriguez Muro |
Abstract | Unstructured enterprise data such as reports, manuals and guidelines often contain tables. The traditional way of integrating data from these tables is through a two-step process of table detection/extraction and mapping the table layouts to an appropriate schema. This can be an expensive process. In this paper we show that by using semantic technologies (RDF/SPARQL and database dependencies) paired with a simple but powerful way to transform tables with non-relational layouts, it is possible to offer query answering services over these tables with minimal manual work or domain-specific mappings. Our method enables users to exploit data in tables embedded in documents with little effort, not only for simple retrieval queries, but also for structured queries that require joining multiple interrelated tables. |
Tasks | Table Detection |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09356v1 |
https://arxiv.org/pdf/1911.09356v1.pdf | |
PWC | https://paperswithcode.com/paper/schemaless-queries-over-document-tables-with |
Repo | |
Framework | |
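The RDF/SPARQL half of the approach can be sketched with rdflib: emit one triple per table cell keyed by row id and column header, then query without declaring a schema up front. The vocabulary and data below are toy assumptions, not the paper's mapping or dependency machinery:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")
g = Graph()

# A tiny extracted table as (row id, column header, cell value) triples.
rows = [("r1", "part", "valve"), ("r1", "pressure", "80"),
        ("r2", "part", "pump"),  ("r2", "pressure", "120")]
for rid, col, val in rows:
    g.add((EX[rid], EX[col], Literal(val)))

# Query by column header, with no fixed schema declared up front.
q = """SELECT ?row ?p WHERE {
         ?row <http://example.org/part> ?part ;
              <http://example.org/pressure> ?p .
         FILTER (?part = "pump") }"""
for row, p in g.query(q):
    print(row, p)   # -> http://example.org/r2 120
```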
Electron Neutrino Energy Reconstruction in NOvA Using CNN Particle IDs
Title | Electron Neutrino Energy Reconstruction in NOvA Using CNN Particle IDs |
Authors | Shiqi Yu |
Abstract | NOvA is a long-baseline neutrino oscillation experiment. It is optimized to measure $\nu_e$ appearance and $\nu_{\mu}$ disappearance at the Far Detector in the $\nu_{\mu}$ beam produced by the NuMI facility at Fermilab. NOvA uses a convolutional neural network (CNN) to identify neutrino events in two functionally identical liquid scintillator detectors. A different network, called prong-CNN, has been used to classify reconstructed particles in each event as either lepton or hadron. Within each event, hits are clustered into prongs to reconstruct final-state particles, and these prongs form the input to the prong-CNN classifier. Classified particle energies are then used as input to an electron neutrino energy estimator. Improving the resolution and systematic robustness of NOvA’s energy estimator will improve the sensitivity of the oscillation-parameter measurements. This paper describes the method used to identify particles with prong-CNN and the subsequent approach to estimating $\nu_e$ energy for signal events. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06953v2 |
https://arxiv.org/pdf/1910.06953v2.pdf | |
PWC | https://paperswithcode.com/paper/electron-neutrino-energy-reconstruction-in |
Repo | |
Framework | |
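The final step — turning prong-CNN-classified electromagnetic and hadronic energy sums into a neutrino energy estimate — amounts to fitting a calibration function. A least-squares toy with an assumed quadratic form and made-up numbers (not NOvA's tuned estimator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Hypothetical per-event sums of prong energies classified by prong-CNN.
e_em = rng.uniform(0.5, 3.0, n)        # electromagnetic (lepton) part, GeV
e_had = rng.uniform(0.1, 2.0, n)       # hadronic part, GeV
e_true = 1.05 * e_em + 1.20 * e_had + 0.02 * e_had**2  # toy truth

# Fit E_nue = a*E_em + b*E_had + c*E_had^2 by least squares.
A = np.column_stack([e_em, e_had, e_had**2])
coef, *_ = np.linalg.lstsq(A, e_true, rcond=None)
print(coef.round(3))                    # recovers [1.05, 1.2, 0.02]
```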
Complex Deep Learning Models for Denoising of Human Heart ECG signals
Title | Complex Deep Learning Models for Denoising of Human Heart ECG signals |
Authors | Corneliu Arsene |
Abstract | Effective and powerful methods for denoising real electrocardiogram (ECG) signals are important for wearable sensors and devices. Deep Learning (DL) models have been used extensively in image processing and other domains with great success, but only very recently have they been used to process ECG signals. This paper presents several DL models, namely Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Restricted Boltzmann Machines (RBMs), together with more conventional filtering methods (low-pass filtering, high-pass filtering, Notch filtering) and the standard wavelet-based technique for denoising ECG signals. These methods are trained, tested and evaluated on different synthetic and real ECG datasets taken from the MIT PhysioNet database and under different simulation conditions (i.e. various lengths of the ECG signals, single or multiple records). The results show that the CNN model is a performant model that can be used for off-line ECG denoising applications, where it is satisfactory to train on a clean part of an ECG signal from an ECG record and then test on the same ECG signal with a high level of noise added to it. However, for real-time or near-real-time applications this task becomes more cumbersome, as the clean part of an ECG signal is likely to be very limited in size. Therefore the solution put forth in this work is to train a CNN model on one-second noisy artificial multiple-heartbeat ECG data (i.e. ECG at effort), generated in a first instance from a few sequences of real heartbeat ECG data (i.e. ECG at rest). Afterwards the trained CNN model can be used in real-life situations to denoise the ECG signal. |
Tasks | Denoising, ECG Denoising, Electrocardiography (ECG) |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10417v2 |
https://arxiv.org/pdf/1908.10417v2.pdf | |
PWC | https://paperswithcode.com/paper/complex-deep-learning-models-for-denoising-of |
Repo | |
Framework | |
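A small 1-D convolutional denoiser of the kind the paper evaluates can be set up in a few lines of PyTorch. The architecture, window length, and the toy sinusoid standing in for an ECG below are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ECGDenoiser(nn.Module):
    """A small 1-D convolutional denoiser trained on (noisy, clean) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=9, padding=4),
        )

    def forward(self, x):              # x: (batch, 1, samples)
        return self.net(x)

model = ECGDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

clean = torch.sin(torch.linspace(0, 20, 360)).repeat(8, 1, 1)  # toy "ECG"
noisy = clean + 0.3 * torch.randn_like(clean)
for _ in range(100):                   # toy training loop on (noisy, clean) pairs
    opt.zero_grad()
    loss = loss_fn(model(noisy), clean)
    loss.backward()
    opt.step()
```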
Learning Discriminative Features Via Weights-biased Softmax Loss
Title | Learning Discriminative Features Via Weights-biased Softmax Loss |
Authors | XiaoBin Li, WeiQiang Wang |
Abstract | Loss functions play a key role in training superior deep neural networks. In convolutional neural networks (CNNs), the popular cross-entropy loss together with softmax does not explicitly guarantee minimization of intra-class variance or maximization of inter-class variance. Earlier studies offer no theoretical analysis or experiments explicitly indicating how to choose the number of units in the fully connected (FC) layer. To help CNNs learn features faster and make them more discriminative, this paper makes two contributions. First, we determine the minimum number of units in the FC layer by rigorous theoretical analysis and extensive experiments, which reduces CNNs’ parameter memory and training time. Second, we propose a negative-focused weights-biased softmax (W-Softmax) loss to help CNNs learn more discriminative features. The proposed W-Softmax loss not only theoretically formulates intra-class compactness and inter-class separability, but also avoids overfitting by enlarging decision margins. Moreover, the size of the decision margins can be flexibly controlled by adjusting a hyperparameter $\alpha$. Extensive experimental results on several benchmark datasets show the superiority of W-Softmax in image classification tasks. |
Tasks | Image Classification |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11138v1 |
http://arxiv.org/pdf/1904.11138v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-discriminative-features-via-weights |
Repo | |
Framework | |
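The abstract does not spell out the exact W-Softmax formula, so the sketch below implements a generic margin-enlarging softmax in the same spirit: non-target logits are inflated by a hyperparameter alpha, and alpha = 1 recovers the plain softmax cross-entropy loss. This is a stand-in, not the paper's formulation:

```python
import torch
import torch.nn.functional as F

def margin_softmax_loss(logits, target, alpha=1.2):
    """Cross entropy with non-target logits scaled by alpha.

    Inflating the negative-class logits raises the bar the target class
    must clear, widening the decision margin; alpha=1 recovers the plain
    softmax loss. (A generic margin loss in the spirit of W-Softmax.)
    """
    one_hot = F.one_hot(target, logits.size(1)).bool()
    biased = torch.where(one_hot, logits, alpha * logits)
    return F.cross_entropy(biased, target)

logits = torch.randn(4, 10, requires_grad=True)
target = torch.tensor([0, 3, 7, 7])
loss = margin_softmax_loss(logits, target)
loss.backward()
```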
Figurative Usage Detection of Symptom Words to Improve Personal Health Mention Detection
Title | Figurative Usage Detection of Symptom Words to Improve Personal Health Mention Detection |
Authors | Adith Iyer, Aditya Joshi, Sarvnaz Karimi, Ross Sparks, Cecile Paris |
Abstract | Personal health mention detection deals with predicting whether or not a given sentence is a report of a health condition. Past work reports errors in this prediction when symptom words, i.e. names of symptoms of interest, are used in a figurative sense. Therefore, we combine state-of-the-art figurative usage detection with CNN-based personal health mention detection. To do so, we present two methods: a pipeline-based approach and a feature augmentation-based approach. The introduction of figurative usage detection results in an average improvement of 2.21% in the F-score of personal health mention detection in the case of the feature augmentation-based approach. This paper demonstrates the promise of using figurative usage detection to improve personal health mention detection. |
Tasks | |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05466v2 |
https://arxiv.org/pdf/1906.05466v2.pdf | |
PWC | https://paperswithcode.com/paper/figurative-usage-detection-of-symptom-words |
Repo | |
Framework | |
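The pipeline variant is straightforward to express in code: gate the health-mention classifier on a figurative-usage score. The classifier interfaces and keyword stand-ins below are hypothetical, shown only to make the control flow concrete:

```python
def health_mention_pipeline(sentence, figurative_clf, health_clf):
    """Pipeline variant: discard figurative uses before health classification.

    figurative_clf and health_clf are assumed to be pretrained scorers
    returning probabilities (hypothetical interfaces).
    """
    if figurative_clf(sentence) > 0.5:      # symptom word used figuratively
        return 0                            # -> not a personal health mention
    return int(health_clf(sentence) > 0.5)  # otherwise defer to the CNN

# Toy stand-ins: score by keyword presence (real models would be trained).
fig = lambda s: 0.9 if "dying to" in s else 0.1
health = lambda s: 0.8 if "fever" in s else 0.2
print(health_mention_pipeline("I'm dying to see them!", fig, health))  # 0
print(health_mention_pipeline("I have a fever today.", fig, health))   # 1
```

The feature-augmentation variant would instead append the figurative score to the CNN's input features rather than hard-filtering.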
Combining Crowd and Machines for Multi-predicate Item Screening
Title | Combining Crowd and Machines for Multi-predicate Item Screening |
Authors | Evgeny Krivosheev, Fabio Casati, Marcos Baez, Boualem Benatallah |
Abstract | This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently, and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can manage the cost-accuracy trade-off by progressively determining whether we should spend budget to obtain test data (to assess the accuracy of the given classifiers), to train an ensemble of classifiers, or to leverage the existing machine classifiers with the crowd, and in the latter case how to efficiently combine them based on their estimated characteristics to obtain the classification. We demonstrate that the techniques we propose obtain significant cost/accuracy improvements with respect to the leading classification algorithms. |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.00714v1 |
http://arxiv.org/pdf/1904.00714v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-crowd-and-machines-for-multi |
Repo | |
Framework | |
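A simplified stand-in for the hybrid combination step is naive-Bayes aggregation of yes/no votes weighted by each voter's (estimated) accuracy; the paper's budget-allocation machinery is not reproduced here:

```python
import numpy as np

def screen_item(votes, accuracies, prior=0.5):
    """Posterior probability that an item satisfies a predicate, given
    binary votes from classifiers/workers with known (estimated)
    accuracies. Naive-Bayes aggregation over independent voters.
    """
    p_in, p_out = prior, 1 - prior
    for v, a in zip(votes, accuracies):
        p_in *= a if v else (1 - a)
        p_out *= (1 - a) if v else a
    return p_in / (p_in + p_out)

# Two machine classifiers and one crowd worker vote on one predicate.
print(screen_item(votes=[1, 1, 0], accuracies=[0.7, 0.9, 0.6]))  # ~0.93
```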
Table understanding in structured documents
Title | Table understanding in structured documents |
Authors | Martin Holeček, Antonín Hoskovec, Petr Baudiš, Pavel Klinger |
Abstract | Table detection and extraction has been studied in the context of documents like reports, where tables are clearly outlined and stand out from the document structure visually. We study this topic in the rather more challenging domain of layout-heavy business documents, particularly invoices. Invoices present the novel challenges of tables often lacking outlines - either in the form of borders or surrounding text flow - with ragged columns and widely varying data content. We also show that we can extract specific information from structurally different tables or table-like structures with one model. We present a comprehensive representation of a page using a graph over word boxes, positional embeddings, and trainable textual features, and rephrase table detection as a text box labeling problem. We work on our newly presented dataset of pro forma invoices, invoices, and debit note documents using this representation and propose multiple baselines to solve this labeling problem. We then propose a novel neural network model that achieves strong, practical results on the presented dataset and analyze the model performance and the effects of graph convolutions and self-attention in detail. |
Tasks | Table Detection |
Published | 2019-03-22 |
URL | https://arxiv.org/abs/1904.12577v2 |
https://arxiv.org/pdf/1904.12577v2.pdf | |
PWC | https://paperswithcode.com/paper/190412577 |
Repo | |
Framework | |
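The page representation — a graph over word boxes with positional features — can be sketched with a k-nearest-neighbor graph over box centers. Everything below (the boxes, k, the averaging step) is an illustrative simplification of the paper's model:

```python
import numpy as np

# Toy word boxes on an invoice page: (x_min, y_min, x_max, y_max).
boxes = np.array([[10, 10, 60, 22], [70, 10, 120, 22],
                  [10, 30, 60, 42], [70, 30, 130, 42]], float)
centers = np.column_stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                           (boxes[:, 1] + boxes[:, 3]) / 2])

# k-nearest-neighbor edges over box centers (a simple stand-in for the
# paper's page graph), plus normalized positions as positional features.
d = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
np.fill_diagonal(d, np.inf)
knn = d.argsort(axis=1)[:, :2]             # 2 nearest neighbors per box
pos_feat = centers / centers.max(axis=0)   # crude positional embedding

# One graph-convolution-style step: average each node with its neighbors.
h = pos_feat.copy()
h = (h + h[knn].mean(axis=1)) / 2
print(h.round(2))
```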
A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound
Title | A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound |
Authors | Gal Dalal, Balazs Szorenyi, Gugan Thoppe |
Abstract | Policy evaluation in reinforcement learning is often conducted using two-timescale stochastic approximation, which results in various gradient temporal difference methods such as GTD(0), GTD2, and TDC. Here, we provide convergence rate bounds for this suite of algorithms. Algorithms such as these have two iterates, $\theta_n$ and $w_n,$ which are updated using two distinct stepsize sequences, $\alpha_n$ and $\beta_n,$ respectively. Assuming $\alpha_n = n^{-\alpha}$ and $\beta_n = n^{-\beta}$ with $1 > \alpha > \beta > 0,$ we show that, with high probability, the two iterates converge to their respective solutions $\theta^*$ and $w^*$ at rates given by $\|\theta_n - \theta^*\| = \tilde{O}(n^{-\alpha/2})$ and $\|w_n - w^*\| = \tilde{O}(n^{-\beta/2})$; here, $\tilde{O}$ hides logarithmic terms. Via comparable lower bounds, we show that these bounds are, in fact, tight. To the best of our knowledge, ours is the first finite-time analysis which achieves these rates. While it was known that the two timescale components decouple asymptotically, our results depict this phenomenon more explicitly by showing that it in fact happens from some finite time onwards. Lastly, compared to existing works, our result applies to a broader family of stepsizes, including non-square-summable ones. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09157v2 |
https://arxiv.org/pdf/1911.09157v2.pdf | |
PWC | https://paperswithcode.com/paper/a-tale-of-two-timescale-reinforcement |
Repo | |
Framework | |
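The two-timescale structure analyzed here is concrete in, e.g., the TDC updates: two coupled iterates driven by stepsizes $n^{-\alpha}$ and $n^{-\beta}$. A toy tabular run on a hypothetical two-state chain (the chain, rewards, and exponents are made up for illustration):

```python
import numpy as np

# TDC with the paper's stepsizes a_n = n^-alpha, b_n = n^-beta,
# 1 > alpha > beta > 0, on a toy two-state Markov chain.
rng = np.random.default_rng(0)
phi = np.eye(2)                               # tabular features per state
P = np.array([[0.9, 0.1], [0.2, 0.8]])        # transition matrix
r, gamma = np.array([1.0, 0.0]), 0.9
alpha, beta = 0.8, 0.5

theta, w = np.zeros(2), np.zeros(2)
s = 0
for n in range(1, 200001):
    s2 = rng.choice(2, p=P[s])
    delta = r[s] + gamma * phi[s2] @ theta - phi[s] @ theta   # TD error
    a_n, b_n = n ** -alpha, n ** -beta
    theta += a_n * (delta * phi[s] - gamma * phi[s2] * (phi[s] @ w))
    w += b_n * (delta - phi[s] @ w) * phi[s]
    s = s2

print(theta)   # approximates the value function in feature space
```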
Text mining policy: Classifying forest and landscape restoration policy agenda with neural information retrieval
Title | Text mining policy: Classifying forest and landscape restoration policy agenda with neural information retrieval |
Authors | John Brandt |
Abstract | Dozens of countries have committed to restoring the ecological functionality of 350 million hectares of land by 2030. In order to achieve such wide-scale implementation of restoration, the values and priorities of multi-sectoral stakeholders must be aligned and integrated with national-level commitments and other development agenda. Although misalignment across scales of policy and between stakeholders is a well-known barrier to implementing restoration, fast-paced policy making in multi-stakeholder environments complicates the monitoring and analysis of governance and policy. In this work, we assess the potential of machine learning to identify restoration policy agendas across diverse policy documents. An unsupervised neural information retrieval architecture is introduced that leverages transfer learning and word embeddings to create high-dimensional representations of paragraphs. Policy agenda labels are recast as information retrieval queries in order to classify policies with a cosine-similarity threshold between paragraph and query embeddings. This approach achieves a 0.83 F1-score measured across 14 policy agendas in 31 policy documents in Malawi, Kenya, and Rwanda, indicating that automated text mining can provide reliable, generalizable, and efficient analyses of restoration policy. |
Tasks | Information Retrieval, Transfer Learning, Word Embeddings |
Published | 2019-08-07 |
URL | https://arxiv.org/abs/1908.02425v1 |
https://arxiv.org/pdf/1908.02425v1.pdf | |
PWC | https://paperswithcode.com/paper/text-mining-policy-classifying-forest-and |
Repo | |
Framework | |
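The retrieval-as-classification step reduces to thresholded cosine similarity between paragraph and query embeddings. A numpy sketch with random stand-ins for the embeddings (the 0.35 threshold and dimensions are illustrative assumptions):

```python
import numpy as np

def classify_paragraphs(par_embs, query_embs, threshold=0.35):
    """Label paragraphs with every policy agenda whose query embedding
    exceeds a cosine-similarity threshold (threshold value illustrative)."""
    p = par_embs / np.linalg.norm(par_embs, axis=1, keepdims=True)
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    sims = p @ q.T                       # (n_paragraphs, n_agendas)
    return sims >= threshold

rng = np.random.default_rng(0)
paragraphs = rng.normal(size=(5, 300))   # stand-ins for paragraph embeddings
agendas = rng.normal(size=(3, 300))      # stand-ins for agenda query embeddings
print(classify_paragraphs(paragraphs, agendas))
```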
Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation
Title | Pattern-Affinitive Propagation across Depth, Surface Normal and Semantic Segmentation |
Authors | Zhenyu Zhang, Zhen Cui, Chunyan Xu, Yan Yan, Nicu Sebe, Jian Yang |
Abstract | In this paper, we propose a novel Pattern-Affinitive Propagation (PAP) framework to jointly predict depth, surface normal and semantic segmentation. The motivation behind it comes from the statistical observation that pattern-affinitive pairs recur frequently across different tasks as well as within a task. Thus, we can conduct two types of propagations, cross-task propagation and task-specific propagation, to adaptively diffuse those similar patterns. The former integrates cross-task affinity patterns to adapt to each task therein through the calculation of non-local relationships. The latter then performs an iterative diffusion in the feature space so that the cross-task affinity patterns can be spread widely within the task. Accordingly, the learning of each task can be regularized and boosted by the complementary task-level affinities. Extensive experiments demonstrate the effectiveness and the superiority of our method on the joint three tasks. Meanwhile, we achieve state-of-the-art or competitive results on the three related datasets, NYUD-v2, SUN-RGBD and KITTI. |
Tasks | Monocular Depth Estimation, Semantic Segmentation |
Published | 2019-06-08 |
URL | https://arxiv.org/abs/1906.03525v1 |
https://arxiv.org/pdf/1906.03525v1.pdf | |
PWC | https://paperswithcode.com/paper/pattern-affinitive-propagation-across-depth-1 |
Repo | |
Framework | |
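The task-specific propagation step — build an affinity matrix from pairwise feature similarity, then diffuse features through it — is compact to sketch. The row-softmax affinity and residual mixing below are toy choices, not the paper's exact operators:

```python
import numpy as np

def affinity_diffusion(feats, steps=3):
    """Build a (non-local) affinity matrix from pairwise feature similarity,
    then iteratively diffuse features with it — a toy rendering of the PAP
    idea on flattened spatial features."""
    for _ in range(steps):
        sim = feats @ feats.T                       # pairwise similarities
        A = np.exp(sim - sim.max(axis=1, keepdims=True))
        A /= A.sum(axis=1, keepdims=True)           # row-softmax affinity
        feats = 0.5 * feats + 0.5 * (A @ feats)     # residual diffusion
    return feats

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))      # 16 spatial positions, 8-dim features
print(affinity_diffusion(x).shape)
```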