January 30, 2020

3657 words 18 mins read

Paper Group ANR 443

Dependency-Aware Named Entity Recognition with Relative and Global Attentions. Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision. An Improved Tobit Kalman Filter with Adaptive Censoring Limits. Taking Care of The Discretization Problem:A Black-Box Adversarial Image Attack in Discrete Integer Domain. Br …

Dependency-Aware Named Entity Recognition with Relative and Global Attentions

Title Dependency-Aware Named Entity Recognition with Relative and Global Attentions
Authors Gustavo Aguilar, Thamar Solorio
Abstract Named entity recognition is one of the core tasks in NLP. Although many improvements have been made on this task in recent years, state-of-the-art systems do not explicitly take into account the recursive nature of language. Instead of only treating the text as a plain sequence of words, we incorporate a linguistically-inspired way to recognize entities based on syntax and tree structures. Our model exploits syntactic relationships among words using a Tree-LSTM guided by dependency trees. We then enhance these features by applying relative and global attention mechanisms. On the one hand, the relative attention detects the most informative words in the sentence with respect to the word being evaluated. On the other hand, the global attention spots the most relevant words in the sequence. Lastly, we linearly project the weighted vectors into the tagging space so that a conditional random field classifier predicts the entity labels. Our findings show that the model detects words that disclose the entity types based on their syntactic roles in a sentence (e.g., verbs such as speak and write are attended when the entity type is PERSON, whereas meet and travel strongly relate to LOCATION). We confirm our findings and establish a new state of the art on two datasets.
Tasks Named Entity Recognition
Published 2019-09-11
URL https://arxiv.org/abs/1909.05166v1
PDF https://arxiv.org/pdf/1909.05166v1.pdf
PWC https://paperswithcode.com/paper/dependency-aware-named-entity-recognition
Repo
Framework
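
The two attention mechanisms described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the scoring functions and dimensions are assumptions, and random vectors stand in for the Tree-LSTM states.

```python
# Hypothetical sketch of relative vs. global attention over word features.
# Not the authors' code: scoring functions and dimensions are assumptions,
# and random vectors stand in for the Tree-LSTM outputs.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
n_words, d = 6, 8
H = rng.normal(size=(n_words, d))      # per-word features (stand-in for Tree-LSTM states)

# Relative attention: weights computed with respect to the word being evaluated.
i = 2                                   # index of the word being tagged
rel_weights = softmax(H @ H[i])         # dot-product relevance to word i
rel_context = rel_weights @ H           # context vector specific to word i

# Global attention: a single (assumed learned) query scores every word in the sentence.
q = rng.normal(size=d)
glob_weights = softmax(H @ q)
glob_context = glob_weights @ H         # one context vector shared by the sentence

print(rel_weights.round(2), glob_weights.round(2))
```

In the full model, the weighted vectors would then be linearly projected into the tagging space and decoded with a CRF, as the abstract describes.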

Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision

Title Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision
Authors Hao Lang, Wen Wang
Abstract We propose a learning approach for turn-level spoken language understanding, which allows a user to speak one or more utterances compositionally in a turn to complete a task (e.g., voice ordering). A typical pipelined approach for these understanding tasks requires non-trivial annotation effort to develop its multiple components, and the pipeline is difficult to port to a new domain or to scale up. To address these problems, we propose an end-to-end statistical model with weak supervision. We employ randomized beam search with memory augmentation (RBSMA) to solve complicated problems for which long promising trajectories are usually difficult to explore. Furthermore, considering the diversity of problem complexity, we explore automated curriculum learning (CL) for weak supervision to accelerate exploration and learning. We evaluate the proposed approach on real-world user logs of a commercial voice ordering system. Results demonstrate that when trained on a small number of end-to-end annotated sessions collected at low cost, our model performs comparably to the deployed pipelined system, reducing development labor by over an order of magnitude. The RBSMA algorithm improves test set accuracy by 7.8% relative compared to standard beam search, and automated CL leads to better generalization and further improves test set accuracy by 5% relative.
Tasks Spoken Language Understanding
Published 2019-06-10
URL https://arxiv.org/abs/1906.04291v1
PDF https://arxiv.org/pdf/1906.04291v1.pdf
PWC https://paperswithcode.com/paper/automated-curriculum-learning-for-turn-level
Repo
Framework

An Improved Tobit Kalman Filter with Adaptive Censoring Limits

Title An Improved Tobit Kalman Filter with Adaptive Censoring Limits
Authors Kostas Loumponias, Nicholas Vretos, George Tsaklidis, Petros Daras
Abstract This paper deals with the Tobit Kalman filtering (TKF) process when the measurements are correlated and censored. The case of interval censoring, i.e., measurements that belong to an interval with given censoring limits, is considered. Two improvements of the standard TKF process are proposed in order to estimate the hidden state vectors. Firstly, the exact covariance matrix of the censored measurements is calculated by taking into account the censoring limits. Secondly, the probability that a latent (normally distributed) measurement lies inside or outside the uncensored region is calculated by taking into account the Kalman residual. The designed algorithm is tested using both synthetic and real data sets. The real data set includes human skeleton joint coordinates captured by the Microsoft Kinect II sensor. In order to cope with real-life situations that cause problems in human skeleton tracking, such as (self-)occlusions and closely interacting persons, adaptive censoring limits are used in the proposed TKF process. Experiments show that the proposed method outperforms other filtering processes in minimizing the overall Root Mean Square Error (RMSE) on both synthetic and real data sets.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.06190v1
PDF https://arxiv.org/pdf/1911.06190v1.pdf
PWC https://paperswithcode.com/paper/an-improved-tobit-kalman-filter-with-adaptive
Repo
Framework
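
The second improvement, computing the probability that a latent measurement falls inside or outside the uncensored region, reduces in the scalar case to evaluating a Gaussian over the censoring interval. The sketch below shows only that computation under assumed notation; it is not the authors' filter.

```python
# Illustrative scalar sketch: probability that a latent Gaussian measurement
# falls inside the uncensored interval [lower, upper]. Notation and values
# are assumptions, not taken from the paper.
from scipy.stats import norm

def uncensored_probability(y_pred, resid_var, lower, upper):
    """P(lower < y < upper) for a latent measurement y ~ N(y_pred, resid_var)."""
    sd = resid_var ** 0.5
    return norm.cdf(upper, y_pred, sd) - norm.cdf(lower, y_pred, sd)

# Example: predicted measurement 0.4, residual variance 0.09, limits [0, 1].
p_in = uncensored_probability(0.4, 0.09, 0.0, 1.0)
print(f"P(uncensored) = {p_in:.3f}, P(censored) = {1.0 - p_in:.3f}")
```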

Taking Care of The Discretization Problem: A Black-Box Adversarial Image Attack in Discrete Integer Domain

Title Taking Care of The Discretization Problem: A Black-Box Adversarial Image Attack in Discrete Integer Domain
Authors Yuchao Duan, Zhe Zhao, Lei Bu, Fu Song
Abstract Numerous methods for crafting adversarial examples with high success rates have been proposed recently. Since most existing machine-learning-based classifiers first normalize images into a continuous real-vector domain, attacks often craft adversarial examples in that domain. However, “adversarial” examples may become benign after being denormalized back into the discrete integer domain, a phenomenon known as the discretization problem. This problem has been mentioned in prior work but has received relatively little attention. In this work, we first conduct a comprehensive study of existing methods and tools for crafting adversarial images: we theoretically analyze 34 representative methods and empirically study 20 representative open-source tools. Our study reveals that the discretization problem is far more serious than originally thought, suggesting that it should be taken into account when crafting adversarial examples and measuring attack success rates. As a first step towards addressing this problem in the black-box scenario, we propose a black-box method that reduces adversarial example search to a derivative-free optimization problem, crafting adversarial images by derivative-free search directly in the discrete integer domain. Experimental results show that our method is comparable to recent white-box methods (e.g., FGSM, BIM and C&W) and achieves a significantly higher success rate, in terms of adversarial examples in the discrete integer domain, than recent black-box methods (e.g., ZOO, NES-PGD and Bandits). Moreover, our method can handle non-differentiable models and successfully breaks the winner of the NIPS 2017 defense competition with a 95% success rate. Our results suggest that discrete optimization algorithms open up a promising area of research into effective black-box attacks.
Tasks
Published 2019-05-19
URL https://arxiv.org/abs/1905.07672v4
PDF https://arxiv.org/pdf/1905.07672v4.pdf
PWC https://paperswithcode.com/paper/things-you-may-not-know-about-adversarial
Repo
Framework
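
As a toy illustration of searching directly in the discrete integer domain, the sketch below perturbs integer pixel values with a plain derivative-free random search. The classifier, the query budget, and the search strategy are placeholders here, not the paper's optimizer.

```python
# Toy derivative-free search for an adversarial image in the discrete integer
# domain (pixel values 0..255). `predict` is a stand-in black-box classifier;
# this is a plain random search, not the paper's method.
import numpy as np

def predict(img):
    # Hypothetical black box: returns a scalar score for the true class.
    return float(img.mean()) / 255.0

def integer_domain_attack(img, eps=8, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    best, best_score = img.copy(), predict(img)
    for _ in range(iters):
        # Propose an integer perturbation within the L_inf ball of radius eps.
        delta = rng.integers(-eps, eps + 1, size=img.shape)
        cand = np.clip(img.astype(int) + delta, 0, 255).astype(np.uint8)
        score = predict(cand)            # one black-box query
        if score < best_score:           # lower true-class score = stronger attack
            best, best_score = cand, score
    return best, best_score

x = np.full((8, 8), 128, dtype=np.uint8)   # dummy 8x8 grayscale image
adv, s = integer_domain_attack(x)
print("true-class score dropped to", round(s, 3))
```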

Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities

Title Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
Authors Octavian-Eugen Ganea, Sylvain Gelly, Gary Bécigneul, Aliaksei Severyn
Abstract The Softmax function on top of a final linear layer is the de facto method to output probability distributions in neural networks. In many applications such as language models or text generation, this model has to produce distributions over large output vocabularies. Recently, this has been shown to have limited representational capacity due to its connection with the rank bottleneck in matrix factorization. However, little is known about the limitations of Linear-Softmax for quantities of practical interest such as cross entropy or mode estimation, a direction that we explore here. As an efficient and effective solution to alleviate this issue, we propose to learn parametric monotonic functions on top of the logits. We theoretically investigate the rank increasing capabilities of such monotonic functions. Empirically, our method improves in two different quality metrics over the traditional Linear-Softmax layer in synthetic and real language model experiments, adding little time or memory overhead, while being comparable to the more computationally expensive mixture of Softmaxes.
Tasks Language Modelling, Text Generation
Published 2019-02-21
URL https://arxiv.org/abs/1902.08077v2
PDF https://arxiv.org/pdf/1902.08077v2.pdf
PWC https://paperswithcode.com/paper/breaking-the-softmax-bottleneck-via-learnable
Repo
Framework
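
One way to read "learnable monotonic pointwise non-linearities on top of the logits" is a per-logit monotone function with positivity-constrained parameters. The sketch below uses a small monotone combination of tanh units applied elementwise before the softmax; this parameterization is an assumption, not necessarily the paper's.

```python
# Sketch: a monotone increasing pointwise function applied to each logit before
# softmax. Monotonicity is enforced by softplus-constrained (non-negative) weights.
# The exact parameterization in the paper may differ; this is an illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicPointwise(nn.Module):
    def __init__(self, hidden=8):
        super().__init__()
        self.a = nn.Parameter(torch.randn(hidden))   # slope parameters
        self.b = nn.Parameter(torch.randn(hidden))   # biases
        self.c = nn.Parameter(torch.randn(hidden))   # output weights

    def forward(self, z):
        # f(z) = sum_k softplus(c_k) * tanh(softplus(a_k) * z + b_k)  -- monotone in z
        z = z.unsqueeze(-1)
        h = torch.tanh(F.softplus(self.a) * z + self.b)
        return (F.softplus(self.c) * h).sum(-1)

vocab, d = 1000, 64
logits = nn.Linear(d, vocab)(torch.randn(2, d))      # standard linear-softmax logits
probs = torch.softmax(MonotonicPointwise()(logits), dim=-1)
print(probs.shape)  # torch.Size([2, 1000])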

Generative Multi-Functional Meta-Atom and Metasurface Design Networks

Title Generative Multi-Functional Meta-Atom and Metasurface Design Networks
Authors Sensong An, Bowen Zheng, Hong Tang, Mikhail Y. Shalaginov, Li Zhou, Hang Li, Tian Gu, Juejun Hu, Clayton Fowler, Hualiang Zhang
Abstract Metasurfaces are being widely investigated and adopted for their promising performances in manipulating optical wavefronts and their potential for integrating multi-functionalities into one flat optical device. A key challenge in metasurface design is the non-intuitive design process that produces models and patterns from specific design requirements (commonly electromagnetic responses). A complete exploration of all design spaces can produce optimal designs but is unrealistic considering the massive amount of computation power required to explore large parameter spaces. Meanwhile, machine learning techniques, especially generative adversarial networks, have proven to be an effective solution to non-intuitive design tasks. In this paper, we present a novel conditional generative network that can generate meta-atom/metasurface designs based on different performance requirements. Compared to conventional trial-and-error or iterative optimization design methods, this new methodology is capable of producing on-demand freeform designs on a one-time calculation basis. More importantly, an increased complexity of design goals doesn’t introduce further complexity into the network structure or the training process, which makes this approach suitable for multi-functional device designs. Compared to previous deep learning-based metasurface approaches, our network structure is extremely robust to train and converge, and is readily expanded to many multi-functional metasurface devices, including metasurface filters, lenses and holograms.
Tasks
Published 2019-08-13
URL https://arxiv.org/abs/1908.04851v1
PDF https://arxiv.org/pdf/1908.04851v1.pdf
PWC https://paperswithcode.com/paper/generative-multi-functional-meta-atom-and
Repo
Framework
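
The core idea, a generator conditioned on the desired electromagnetic response that outputs a freeform pattern, can be sketched generically. The conditioning format, pattern resolution, and architecture below are placeholders, not the paper's network or training setup.

```python
# Generic conditional-generator sketch: map (noise, desired EM response) to a
# meta-atom occupancy pattern. Resolution, response length, and architecture
# are placeholders; GAN losses and simulator-generated data are omitted.
import torch
import torch.nn as nn

class ConditionalPatternGenerator(nn.Module):
    def __init__(self, noise_dim=32, cond_dim=16, size=64):
        super().__init__()
        self.size = size
        self.net = nn.Sequential(
            nn.Linear(noise_dim + cond_dim, 256), nn.ReLU(),
            nn.Linear(256, size * size), nn.Sigmoid(),   # pixel-wise material occupancy
        )

    def forward(self, z, cond):
        out = self.net(torch.cat([z, cond], dim=-1))
        return out.view(-1, self.size, self.size)

gen = ConditionalPatternGenerator()
z = torch.randn(2, 32)
target_response = torch.rand(2, 16)          # e.g. desired amplitude/phase samples
patterns = gen(z, target_response)
print(patterns.shape)                        # torch.Size([2, 64, 64])
```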

Mining Insights from Weakly-Structured Event Data

Title Mining Insights from Weakly-Structured Event Data
Authors Niek Tax
Abstract This thesis focuses on process mining on event data for which a normative process specification is absent and, as a result, the event data is less structured. The thesis puts special emphasis on one application domain that fits this description: the analysis of smart home data where sequences of daily activities are recorded. We propose a set of techniques to analyze such data, which fall into two categories. The first category focuses on preprocessing event logs so that process discovery techniques can extract insights from unstructured event data. In this category we have developed the following techniques: (1) an unsupervised approach to refine event labels based on the time at which the event took place, allowing, for example, recorded eating events to be distinguished into breakfast, lunch, and dinner; (2) an approach to detect and filter from event logs so-called chaotic activities, i.e., activities that cause process discovery methods to overgeneralize; and (3) a supervised approach to abstract low-level events into more high-level events, where we show that there exist situations in which process discovery approaches overgeneralize on the low-level event data but are able to find precise models on the high-level event data. The second category focuses on mining local process models, i.e., collections of process model patterns that each describe some frequent behavior, in contrast to the single global process model obtained with existing process discovery techniques. Several techniques are introduced in the area of local process model mining, including a basic method, fast but approximate heuristic methods, and constraint-based techniques.
Tasks
Published 2019-09-03
URL https://arxiv.org/abs/1909.01421v1
PDF https://arxiv.org/pdf/1909.01421v1.pdf
PWC https://paperswithcode.com/paper/mining-insights-from-weakly-structured-event
Repo
Framework
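
The first preprocessing technique mentioned above, refining event labels by the time at which events occur, can be illustrated by clustering event timestamps. The choice of k-means over hour-of-day below is an assumption for illustration, not necessarily the thesis' method.

```python
# Hedged illustration of time-based label refinement: split "eating" events into
# sub-labels by clustering the hour of day. The clustering choice (k-means, k=3)
# and the toy timestamps are assumptions.
import numpy as np
from sklearn.cluster import KMeans

hours = np.array([7.5, 8.0, 12.5, 13.0, 19.0, 19.5, 7.75, 12.75]).reshape(-1, 1)
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(hours)

refined = [f"eating_{c}" for c in clusters]   # e.g. breakfast / lunch / dinner groups
print(list(zip(hours.ravel(), refined)))
```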

Learning about an exponential amount of conditional distributions

Title Learning about an exponential amount of conditional distributions
Authors Mohamed Ishmael Belghazi, Maxime Oquab, Yann LeCun, David Lopez-Paz
Abstract We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$. The NC is a function $NC(x \cdot a, a, r)$ that leverages adversarial training to match each conditional distribution $P(X_r \mid X_a = x_a)$. After training, the NC generalizes to sample from conditional distributions never seen, including the joint distribution. The NC is also able to auto-encode examples, providing data representations useful for downstream classification tasks. In sum, the NC integrates different self-supervised tasks (each being the estimation of a conditional distribution) and levels of supervision (partially observed data) seamlessly into a single learning experience.
Tasks
Published 2019-02-22
URL http://arxiv.org/abs/1902.08401v1
PDF http://arxiv.org/pdf/1902.08401v1.pdf
PWC https://paperswithcode.com/paper/learning-about-an-exponential-amount-of
Repo
Framework
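
The input convention in $NC(x \cdot a, a, r)$, an availability mask $a$ and a requested mask $r$, can be sketched as follows. The network is a placeholder MLP and the mask-sampling scheme is an assumption; the adversarial training loop is omitted.

```python
# Sketch of the Neural Conditioner's input convention: observed entries are
# passed as x*a together with the availability mask a and a requested mask r.
# The network below is a placeholder MLP, not the paper's architecture.
import torch
import torch.nn as nn

d = 8
nc = nn.Sequential(nn.Linear(3 * d, 64), nn.ReLU(), nn.Linear(64, d))

x = torch.randn(4, d)                          # a batch of data vectors
a = torch.bernoulli(torch.full((4, d), 0.5))   # available (conditioning) coordinates
r = 1 - a                                      # request the complementary coordinates

fake = nc(torch.cat([x * a, a, r], dim=-1))    # proposed values for the requested part
# During training, an adversarial critic would compare r*fake against r*x.
print(fake.shape)
```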

Functional Segmentation through Dynamic Mode Decomposition: Automatic Quantification of Kidney Function in DCE-MRI Images

Title Functional Segmentation through Dynamic Mode Decomposition: Automatic Quantification of Kidney Function in DCE-MRI Images
Authors Santosh Tirunagari, Norman Poh, Kevin Wells, Miroslaw Bober, Isky Gorden, David Windridge
Abstract Quantification of kidney function in Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) requires careful segmentation of the renal region of interest (ROI). Traditionally, human experts are required to manually delineate the kidney ROI across multiple images in the dynamic sequence. This approach is costly, time-consuming and labour intensive; it limits patient throughput and is one of the factors limiting the wider adoption of DCE-MRI in clinical practice. To address this issue, we present the first use of Dynamic Mode Decomposition (DMD) as a basis for automatic segmentation of a dynamic sequence, in this case kidney ROIs in DCE-MRI. DMD combined with thresholding and connected component analysis is first validated on synthetically generated data with known ground truth, and then applied to ten healthy volunteers’ DCE-MRI datasets. We find that the segmentation result obtained from our proposed DMD framework is comparable to that of expert observers and significantly better than that of an a priori bounding-box segmentation. Our result gives a mean Jaccard coefficient of 0.87, compared to mean scores of 0.85, 0.88 and 0.87 produced from three independent manual annotations. This represents the first use of DMD as a robust, automatic, data-driven segmentation approach that requires no human intervention, and it is a viable and efficient alternative to current manual methods of isolating kidney function in DCE-MRI.
Tasks
Published 2019-05-24
URL https://arxiv.org/abs/1905.10218v1
PDF https://arxiv.org/pdf/1905.10218v1.pdf
PWC https://paperswithcode.com/paper/functional-segmentation-through-dynamic-mode
Repo
Framework
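
Dynamic Mode Decomposition itself is a standard algorithm; a minimal exact-DMD version over a sequence of vectorized frames is sketched below. The thresholding and connected-component post-processing, and all DCE-MRI specifics, are omitted, so this is not the full pipeline.

```python
# Minimal exact DMD over a sequence of vectorized image frames (columns of X).
# Textbook DMD only: the paper's thresholding and connected-component steps
# are omitted, and the frames here are random stand-ins.
import numpy as np

def dmd_modes(frames, rank=5):
    X = np.stack([f.ravel() for f in frames], axis=1)    # pixels x time
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.T @ X2 @ Vh.T @ np.diag(1.0 / s)          # reduced linear operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = X2 @ Vh.T @ np.diag(1.0 / s) @ W              # spatial DMD modes
    return eigvals, modes

rng = np.random.default_rng(0)
frames = [rng.random((16, 16)) for _ in range(20)]        # stand-in dynamic sequence
eigvals, modes = dmd_modes(frames, rank=3)
print(eigvals.shape, modes.shape)                         # (3,), (256, 3)
```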

QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks

Title QFlow: A Reinforcement Learning Approach to High QoE Video Streaming over Wireless Networks
Authors Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, Srinivas Shakkottai, Dileep Kalathil, Ricky K. P. Mok, Amogh Dhamdhere
Abstract Wireless Internet access has brought legions of heterogeneous applications all sharing the same resources. However, current wireless edge networks that cater to worst or average case performance lack the agility to best serve these diverse sessions. Simultaneously, software reconfigurable infrastructure has become increasingly mainstream to the point that dynamic per packet and per flow decisions are possible at multiple layers of the communications stack. Exploiting such reconfigurability requires the design of a system that can enable a configuration, measure the impact on the application performance (Quality of Experience), and adaptively select a new configuration. Effectively, this feedback loop is a Markov Decision Process whose parameters are unknown. The goal of this work is to design, develop and demonstrate QFlow that instantiates this feedback loop as an application of reinforcement learning (RL). Our context is that of reconfigurable (priority) queueing, and we use the popular application of video streaming as our use case. We develop both model-free and model-based RL approaches that are tailored to the problem of determining which clients should be assigned to which queue at each decision period. Through experimental validation, we show how the RL-based control policies on QFlow are able to schedule the right clients for prioritization in a high-load scenario to outperform the status quo, as well as the best known solutions with over 25% improvement in QoE, and a perfect QoE score of 5 over 85% of the time.
Tasks
Published 2019-01-04
URL http://arxiv.org/abs/1901.00959v2
PDF http://arxiv.org/pdf/1901.00959v2.pdf
PWC https://paperswithcode.com/paper/qflow-a-reinforcement-learning-approach-to
Repo
Framework
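
The model-free branch can be illustrated with plain tabular Q-learning over a toy state/action space (assigning a client to the priority or default queue each decision period). The states, rewards, and dynamics below are placeholders; QFlow's actual state features and QoE-based reward are far richer.

```python
# Toy tabular Q-learning for queue assignment. States, rewards, and dynamics
# are placeholders, not QFlow's; this only illustrates the RL feedback loop.
import numpy as np

n_states, n_actions = 4, 2          # e.g. coarse load levels x {default, priority}
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(state, action):
    # Placeholder dynamics: prioritizing under high load yields higher "QoE" reward.
    reward = 1.0 if (state >= 2 and action == 1) else 0.5
    return int(rng.integers(n_states)), reward

s = 0
for _ in range(5000):
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s2, r = step(s, a)
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])   # Q-learning update
    s = s2

print(Q.round(2))
```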

Non-target Structural Displacement Measurement Using Reference Frame Based Deepflow

Title Non-target Structural Displacement Measurement Using Reference Frame Based Deepflow
Authors Jongbin Won, Jong-Woong Park, Do-Soo Moon
Abstract Structural displacement is crucial for structural health monitoring, although it is very challenging to measure in field conditions. Most existing displacement measurement methods are costly, labor intensive, and insufficiently accurate for measuring small dynamic displacements. Computer vision (CV) based methods incorporate optical devices with advanced image processing algorithms to accurately, cost-effectively, and remotely measure structural displacement with easy installation. However, non-target based CV methods are still limited by insufficient feature points, incorrect feature point detection, occlusion, and drift induced by tracking error accumulation. This paper presents a reference frame based Deepflow algorithm integrated with masking and signal filtering for non-target based displacement measurements. The proposed method allows the user to select points of interest for images with a low gradient for displacement tracking and directly calculate displacement without drift accumulated by measurement error. The proposed method is experimentally validated on a cantilevered beam under ambient and occluded test conditions. The accuracy of the proposed method is compared with that of a reference laser displacement sensor for validation. The significant advantage of the proposed method is its flexibility in extracting structural displacement in any region on structures that do not have distinct natural features.
Tasks
Published 2019-03-21
URL http://arxiv.org/abs/1903.08831v1
PDF http://arxiv.org/pdf/1903.08831v1.pdf
PWC https://paperswithcode.com/paper/non-target-structural-displacement
Repo
Framework
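
Dense optical flow against a fixed reference frame is the core measurement step. The sketch below uses OpenCV's Farneback flow as a stand-in for DeepFlow (which lives in opencv-contrib); the synthetic frames, the point of interest, and the pixel-to-millimetre scale are hypothetical.

```python
# Displacement of a user-selected point relative to a fixed reference frame
# using dense optical flow. Farneback flow is used here as a stand-in for
# DeepFlow; the ROI point and calibration factor are hypothetical.
import numpy as np
import cv2

reference = np.random.randint(0, 255, (120, 160), dtype=np.uint8)  # reference frame
current = np.roll(reference, shift=2, axis=0)                       # frame shifted 2 px down

flow = cv2.calcOpticalFlowFarneback(reference, current, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
y, x = 60, 80                     # user-selected point of interest (hypothetical)
px_per_mm = 4.0                   # hypothetical calibration factor
dx_mm = flow[y, x, 0] / px_per_mm
dy_mm = flow[y, x, 1] / px_per_mm
print(f"displacement at ROI: dx={dx_mm:.2f} mm, dy={dy_mm:.2f} mm")
```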

Behavior Sequence Transformer for E-commerce Recommendation in Alibaba

Title Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
Authors Qiwei Chen, Huan Zhao, Wei Li, Pipei Huang, Wenwu Ou
Abstract Deep learning based methods have been widely used in industrial recommendation systems (RSs). Previous works adopt an Embedding&MLP paradigm: raw features are embedded into low-dimensional vectors, which are then fed into an MLP for the final recommendations. However, most of these works simply concatenate different features, ignoring the sequential nature of users’ behaviors. In this paper, we propose to use the powerful Transformer model to capture the sequential signals underlying users’ behavior sequences for recommendation in Alibaba. Experimental results demonstrate the superiority of the proposed model, which has been deployed online at Taobao and obtains significant improvements in online Click-Through Rate (CTR) compared to two baselines.
Tasks Recommendation Systems
Published 2019-05-15
URL https://arxiv.org/abs/1905.06874v1
PDF https://arxiv.org/pdf/1905.06874v1.pdf
PWC https://paperswithcode.com/paper/behavior-sequence-transformer-for-e-commerce
Repo
Framework
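
A minimal version of the idea, embedding a user's behavior sequence, passing it through a Transformer encoder layer, and feeding the pooled output to an MLP for CTR prediction, is sketched below. Layer sizes, the pooling choice, and the absence of the side-feature branch are simplifying assumptions.

```python
# Minimal sketch of a behavior-sequence CTR model: item-ID embeddings ->
# Transformer encoder layer -> mean pooling -> MLP -> click probability.
# Dimensions and pooling are assumptions; the production feature branches
# are omitted.
import torch
import torch.nn as nn

class BSTSketch(nn.Module):
    def __init__(self, n_items=10000, d=32):
        super().__init__()
        self.emb = nn.Embedding(n_items, d)
        enc_layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
        self.mlp = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, item_seq):
        h = self.encoder(self.emb(item_seq))            # (batch, seq_len, d)
        return torch.sigmoid(self.mlp(h.mean(dim=1)))   # pooled -> click probability

model = BSTSketch()
clicks = model(torch.randint(0, 10000, (8, 20)))  # batch of 8 sequences, length 20
print(clicks.shape)  # torch.Size([8, 1])
```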

Transfer Learning from Audio-Visual Grounding to Speech Recognition

Title Transfer Learning from Audio-Visual Grounding to Speech Recognition
Authors Wei-Ning Hsu, David Harwath, James Glass
Abstract Transfer learning aims to reduce the amount of data required to excel at a new task by re-using knowledge acquired from learning other related tasks. This paper proposes a novel transfer learning scenario, which distills robust phonetic features from grounding models that are trained to tell whether a pair of image and speech are semantically correlated, without using any textual transcripts. As the semantics of speech are largely determined by its lexical content, grounding models learn to preserve phonetic information while disregarding uncorrelated factors such as speaker and channel. To study the properties of features distilled from different layers, we use them separately as input to train multiple speech recognition models. Empirical results demonstrate that layers closer to the input retain more phonetic information, while later layers exhibit greater invariance to domain shift. Moreover, while most previous studies include speech recognition training data when training the feature extractor, our grounding models are not trained on any of those data, indicating more universal applicability to new domains.
Tasks Speech Recognition, Transfer Learning
Published 2019-07-09
URL https://arxiv.org/abs/1907.04355v1
PDF https://arxiv.org/pdf/1907.04355v1.pdf
PWC https://paperswithcode.com/paper/transfer-learning-from-audio-visual-grounding
Repo
Framework

A Study of Context Dependencies in Multi-page Product Search

Title A Study of Context Dependencies in Multi-page Product Search
Authors Keping Bi, Choon Hui Teo, Yesh Dattatreya, Vijai Mohan, W. Bruce Croft
Abstract In product search, users tend to browse results on multiple search result pages (SERPs) (e.g., for queries on clothing and shoes) before deciding which item to purchase. Users’ clicks can be considered implicit feedback that indicates their preferences and can be used to re-rank subsequent SERPs. Relevance feedback (RF) techniques are usually employed to deal with such scenarios. However, these methods are designed for document retrieval, where relevance is the most important criterion. In contrast, product search engines need to retrieve items that are not only relevant but also satisfactory in terms of customers’ preferences. Personalization based on users’ purchase history has been shown to be effective in product search. However, it captures users’ long-term interest, which does not always align with their short-term interest, and it does not benefit customers with little or no purchase history. In this paper, we study RF techniques based on both long-term and short-term context dependencies in multi-page product search. We also propose an end-to-end context-aware embedding model that can capture both types of context. Our experimental results show that short-term context leads to much better performance than long-term or no context. Moreover, our proposed model is more effective than state-of-the-art word-based RF models.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.04031v2
PDF https://arxiv.org/pdf/1909.04031v2.pdf
PWC https://paperswithcode.com/paper/a-study-of-context-dependencies-in-multi-page
Repo
Framework
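
A simple short-term-context baseline, Rocchio-style re-ranking of the next page using embeddings of items clicked on earlier pages, illustrates the kind of relevance feedback being studied. The embedding source and the weights below are assumptions; this is a generic baseline, not the paper's context-aware embedding model.

```python
# Rocchio-style short-term feedback: nudge the query vector toward items the
# user clicked on earlier SERPs, then re-rank the next page by cosine score.
# Embeddings and weights are hypothetical; this is a baseline sketch only.
import numpy as np

rng = np.random.default_rng(0)
d = 16
query = rng.normal(size=d)
candidates = rng.normal(size=(10, d))        # items for the next result page
clicked = rng.normal(size=(3, d))            # items clicked on earlier pages

alpha, beta = 1.0, 0.75                      # hypothetical Rocchio weights
updated = alpha * query + beta * clicked.mean(axis=0)

def cosine(a, B):
    return (B @ a) / (np.linalg.norm(B, axis=1) * np.linalg.norm(a) + 1e-9)

order = np.argsort(-cosine(updated, candidates))
print("re-ranked item indices:", order)
```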

Iterative Refinement for $\ell_p$-norm Regression

Title Iterative Refinement for $\ell_p$-norm Regression
Authors Deeksha Adil, Rasmus Kyng, Richard Peng, Sushant Sachdeva
Abstract We give improved algorithms for the $\ell_{p}$-regression problem, $\min_{x} \|x\|_{p}$ such that $Ax = b$, for all $p \in (1,2) \cup (2,\infty)$. Our algorithms obtain a high-accuracy solution in $\tilde{O}_{p}(m^{\frac{p-2}{3p-2}}) \le \tilde{O}_{p}(m^{\frac{1}{3}})$ iterations, where each iteration requires solving an $m \times m$ linear system, $m$ being the dimension of the ambient space. By maintaining an approximate inverse of the linear systems that we solve in each iteration, we give algorithms for solving $\ell_{p}$-regression to $1/\text{poly}(n)$ accuracy that run in time $\tilde{O}_p(m^{\max\{\omega, 7/3\}})$, where $\omega$ is the matrix multiplication constant. For the current best value of $\omega > 2.37$, we can thus solve $\ell_{p}$-regression as fast as $\ell_{2}$-regression, for all constant $p$ bounded away from $1$. Our algorithms can be combined with fast graph Laplacian linear equation solvers to give minimum $\ell_{p}$-norm flow / voltage solutions to $1/\text{poly}(n)$ accuracy on an undirected graph with $m$ edges in $\tilde{O}_{p}(m^{1 + \frac{p-2}{3p-2}}) \le \tilde{O}_{p}(m^{\frac{4}{3}})$ time. For sparse graphs and for matrices with similar dimensions, our iteration counts and running times improve on the $p$-norm regression algorithm of [Bubeck-Cohen-Lee-Li, STOC'18] and on general-purpose convex optimization algorithms. At the core of our algorithms is an iterative refinement scheme for $\ell_{p}$-norms, using the smoothed $\ell_{p}$-norms introduced in the work of Bubeck et al. Given an initial solution, we construct a problem that seeks to minimize a quadratically-smoothed $\ell_{p}$ norm over a subspace, such that a crude solution to this problem allows us to improve the initial solution by a constant factor, leading to algorithms with fast convergence.
Tasks
Published 2019-01-21
URL http://arxiv.org/abs/1901.06764v1
PDF http://arxiv.org/pdf/1901.06764v1.pdf
PWC https://paperswithcode.com/paper/iterative-refinement-for-ell_p-norm
Repo
Framework
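
For contrast with the iterative-refinement scheme, a classical baseline for the same objective is iteratively reweighted least squares (IRLS), which repeatedly solves a weighted $\ell_2$ problem subject to $Ax = b$. The sketch below is this textbook baseline, not the authors' algorithm, and it carries none of their convergence guarantees.

```python
# Classical IRLS baseline for  min ||x||_p  s.t.  Ax = b  (p > 1). Each step
# solves a weighted least-squares problem in closed form. This is the textbook
# method, not the paper's iterative-refinement algorithm.
import numpy as np

def irls_lp(A, b, p=1.5, iters=50, eps=1e-8):
    x = np.linalg.lstsq(A, b, rcond=None)[0]          # start from the min-l2 solution
    for _ in range(iters):
        w = np.maximum(np.abs(x), eps) ** (p - 2)     # weights |x_i|^(p-2), floored
        Winv = 1.0 / w
        # argmin x^T W x  s.t.  Ax = b   =>   x = W^{-1} A^T (A W^{-1} A^T)^{-1} b
        AWA = (A * Winv) @ A.T
        x = Winv * (A.T @ np.linalg.solve(AWA, b))
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 20))
b = rng.normal(size=5)
x = irls_lp(A, b, p=1.5)
print(np.linalg.norm(A @ x - b), np.sum(np.abs(x) ** 1.5) ** (1 / 1.5))
```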