Paper Group ANR 29
Automatic Argument Quality Assessment – New Datasets and Methods. Gaussian Sketching yields a J-L Lemma in RKHS. The relationship between Biological and Artificial Intelligence. A Comprehensive Review of Shepherding as a Bio-inspired Swarm-Robotics Guidance Approach. Deep Learning-based Limited Feedback Designs for MIMO Systems. Functional advanta …
Automatic Argument Quality Assessment – New Datasets and Methods
Title | Automatic Argument Quality Assessment – New Datasets and Methods |
Authors | Assaf Toledo, Shai Gretz, Edo Cohen-Karlik, Roni Friedman, Elad Venezian, Dan Lahav, Michal Jacovi, Ranit Aharonov, Noam Slonim |
Abstract | We explore the task of automatic assessment of argument quality. To that end, we actively collected 6.3k arguments, more than a factor of five compared to previously examined data. Each argument was explicitly and carefully annotated for its quality. In addition, 14k pairs of arguments were annotated independently, identifying the higher quality argument in each pair. In spite of the inherent subjective nature of the task, both annotation schemes led to surprisingly consistent results. We release the labeled datasets to the community. Furthermore, we suggest neural methods based on a recently released language model, for argument ranking as well as for argument-pair classification. In the former task, our results are comparable to state-of-the-art; in the latter task our results significantly outperform earlier methods. |
Tasks | Language Modelling |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.01007v1 |
https://arxiv.org/pdf/1909.01007v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-argument-quality-assessment-new |
Repo | |
Framework | |
Gaussian Sketching yields a J-L Lemma in RKHS
Title | Gaussian Sketching yields a J-L Lemma in RKHS |
Authors | Samory Kpotufe, Bharath K. Sriperumbudur |
Abstract | The main contribution of the paper is to show that Gaussian sketching of a kernel-Gram matrix $\boldsymbol K$ yields an operator whose counterpart in an RKHS $\mathcal H$ is a random projection operator, in the spirit of the Johnson-Lindenstrauss (J-L) lemma. To be precise, given a random matrix $Z$ with i.i.d. Gaussian entries, we show that a sketch $Z\boldsymbol{K}$ corresponds to a particular random operator in (infinite-dimensional) Hilbert space $\mathcal H$ that maps functions $f \in \mathcal H$ to a low-dimensional space $\mathbb R^d$, while preserving a weighted RKHS inner-product of the form $\langle f, g \rangle_{\Sigma} \doteq \langle f, \Sigma^3 g \rangle_{\mathcal H}$, where $\Sigma$ is the covariance operator induced by the data distribution. In particular, under similar assumptions as in kernel PCA (KPCA), or kernel $k$-means (K-$k$-means), well-separated subsets of feature-space $\{K(\cdot, x): x \in \mathcal X\}$ remain well-separated after such operation, which suggests similar benefits as in KPCA and/or K-$k$-means, albeit at the much cheaper cost of a random projection. In particular, our convergence rates suggest that, given a large dataset $\{X_i\}_{i=1}^N$ of size $N$, we can build the Gram matrix $\boldsymbol K$ on a much smaller subsample of size $n\ll N$, so that the sketch $Z\boldsymbol K$ is very cheap to obtain and subsequently apply as a projection operator on the original data $\{X_i\}_{i=1}^N$. We verify these insights empirically on synthetic data, and on real-world clustering applications. |
Tasks | |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05818v2 |
https://arxiv.org/pdf/1908.05818v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-sketching-yields-kernel-jl |
Repo | |
Framework | |
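As a rough illustration of the subsample-then-sketch idea in the abstract above, the following Python sketch draws a Gaussian matrix $Z$ over a small subsample and uses it to map any point into $\mathbb R^d$. The RBF kernel, the `1/sqrt(d)` scaling, and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
import math
import random


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel K(x, y); the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def make_sketch(n, d, seed=0):
    """Draw a d x n matrix Z with i.i.d. standard Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)]


def project(x, subsample, Z, gamma=1.0):
    """Map x to R^d as Z k_x / sqrt(d), where k_x[i] = K(subsample[i], x).

    Applied to the subsample points themselves, this yields the columns
    of the sketch Z K (up to the assumed scaling).
    """
    k_x = [rbf_kernel(s, x, gamma) for s in subsample]
    scale = 1.0 / math.sqrt(len(Z))
    return [scale * sum(z * k for z, k in zip(row, k_x)) for row in Z]
```

Under these assumptions, only the $n$ subsample points are needed to build $Z$ and the kernel vector, so each of the $N$ original points costs $O(nd)$ kernel-weighted operations to project.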
The relationship between Biological and Artificial Intelligence
Title | The relationship between Biological and Artificial Intelligence |
Authors | George Cevora |
Abstract | Intelligence can be defined as a predominantly human ability to accomplish tasks that are generally hard for computers and animals. Artificial Intelligence [AI] is a field attempting to accomplish such tasks with computers. AI is becoming increasingly widespread, as are claims of its relationship with Biological Intelligence. Often these claims are made to imply higher chances of a given technology succeeding, working on the assumption that AI systems which mimic the mechanisms of Biological Intelligence should be more successful. In this article I will discuss the similarities and differences between AI and the extent of our knowledge about the mechanisms of intelligence in biology, especially within humans. I will also explore the validity of the assumption that biomimicry in AI systems aids their advancement, and I will argue that existing similarity to biological systems in the way Artificial Neural Networks [ANNs] tackle tasks is due to design decisions, rather than inherent similarity of underlying mechanisms. This article is aimed at people who understand the basics of AI (especially ANNs), and would like to be better able to evaluate the often wild claims about the value of biomimicry in AI. |
Tasks | |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00547v1 |
http://arxiv.org/pdf/1905.00547v1.pdf | |
PWC | https://paperswithcode.com/paper/the-relationship-between-biological-and |
Repo | |
Framework | |
A Comprehensive Review of Shepherding as a Bio-inspired Swarm-Robotics Guidance Approach
Title | A Comprehensive Review of Shepherding as a Bio-inspired Swarm-Robotics Guidance Approach |
Authors | Nathan K Long, Karl Sammut, Daniel Sgarioto, Matthew Garratt, Hussein Abbass |
Abstract | The simultaneous control of multiple coordinated robotic agents represents an elaborate problem. If solved, however, the interaction between the agents can lead to solutions to sophisticated problems. The concept of swarming, inspired by nature, can be described as the emergence of complex system-level behaviors from the interactions of relatively elementary agents. Due to the effectiveness of solutions found in nature, bio-inspired swarming-based control techniques are receiving a lot of attention in robotics. One method, known as swarm shepherding, is founded on the sheep herding behavior exhibited by sheepdogs, where a swarm of relatively simple agents is governed by a shepherd (or shepherds) which is responsible for high-level guidance and planning. Many studies have been conducted on shepherding as a control technique, ranging from the replication of sheep herding via simulation, to the control of uninhabited vehicles and robots for a variety of applications. We present a comprehensive review of the literature on swarm shepherding to reveal the advantages and potential of the approach to be applied to a plethora of robotic systems in the future. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.07796v2 |
https://arxiv.org/pdf/1912.07796v2.pdf | |
PWC | https://paperswithcode.com/paper/a-comprehensive-review-of-shepherding-as-a |
Repo | |
Framework | |
Deep Learning-based Limited Feedback Designs for MIMO Systems
Title | Deep Learning-based Limited Feedback Designs for MIMO Systems |
Authors | Jeonghyeon Jang, Hoon Lee, Sangwon Hwang, Haibao Ren, Inkyu Lee |
Abstract | We study deep learning (DL) based limited feedback methods for multi-antenna systems. Deep neural networks (DNNs) are introduced to replace an end-to-end limited feedback procedure including the pilot-aided channel training process, channel codebook design, and beamforming vector selection. The DNNs are trained to yield binary feedback information as well as an efficient beamforming vector which maximizes the effective channel gain. Compared to conventional limited feedback schemes, the proposed DL method shows a 1 dB symbol error rate (SER) gain with reduced computational complexity. |
Tasks | |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09043v1 |
https://arxiv.org/pdf/1912.09043v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-limited-feedback-designs |
Repo | |
Framework | |
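For context, the conventional codebook-based limited feedback step that the abstract above proposes to replace can be sketched as follows. This is a generic textbook-style illustration, not the authors' DNN: the receiver picks the codeword maximizing the effective channel gain $|h^H w|^2$ and feeds back its index as bits. The function names and the toy codebook are hypothetical.

```python
import math


def effective_gain(h, w):
    """Effective channel gain |h^H w|^2 for channel h and beamformer w."""
    inner = sum(hc.conjugate() * wc for hc, wc in zip(h, w))
    return abs(inner) ** 2


def select_codeword(h, codebook):
    """Return (index, gain) of the codeword with the largest gain."""
    gains = [effective_gain(h, w) for w in codebook]
    best = max(range(len(codebook)), key=lambda i: gains[i])
    return best, gains[best]


def index_to_bits(index, num_bits):
    """Binary feedback message (most significant bit first)."""
    return [(index >> b) & 1 for b in reversed(range(num_bits))]


# Toy 2-antenna example with a 2-bit (4-entry) codebook of unit vectors.
s = 1.0 / math.sqrt(2.0)
codebook = [[0j, 1 + 0j], [1 + 0j, 0j], [s + 0j, s + 0j], [s + 0j, -s + 0j]]
h = [1 + 0j, 0j]
best_index, best_gain = select_codeword(h, codebook)
feedback_bits = index_to_bits(best_index, num_bits=2)
```

The exhaustive search over the codebook is the part whose cost grows with the number of feedback bits, which is one place where a learned mapping could plausibly reduce computation.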
Functional advantages of an adaptive Theory of Mind for robotics: a review of current architectures
Title | Functional advantages of an adaptive Theory of Mind for robotics: a review of current architectures |
Authors | Francesca Bianco, Dimitri Ognibene |
Abstract | Great advancements have been achieved in the field of robotics; however, major challenges remain, including building robots with an adaptive Theory of Mind (ToM). In the present paper, seven current robotic architectures for human-robot interaction are described, as well as four main functional advantages of equipping robots with an adaptive ToM. The aim of the present paper is to determine in which way and how often ToM features are integrated in the architectures analyzed, and whether they provide robots with the associated functional advantages. Our assessment shows that different methods are used to implement ToM features in robotic architectures. Furthermore, while a ToM for false-belief understanding and tracking is often built into social robotic architectures, a ToM for proactivity, active perception and learning is less common. Nonetheless, progress towards better adaptive ToM features in robots is warranted to provide them with full access to the advantages of having a ToM resembling that of humans. |
Tasks | |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00193v1 |
https://arxiv.org/pdf/1909.00193v1.pdf | |
PWC | https://paperswithcode.com/paper/functional-advantages-of-an-adaptive-theory |
Repo | |
Framework | |
Automated Segmentation of Knee MRI Using Hierarchical Classifiers and Just Enough Interaction Based Learning: Data from Osteoarthritis Initiative
Title | Automated Segmentation of Knee MRI Using Hierarchical Classifiers and Just Enough Interaction Based Learning: Data from Osteoarthritis Initiative |
Authors | Satyananda Kashyap, Ipek Oguz, Honghai Zhang, Milan Sonka |
Abstract | We present a fully automated learning-based approach for segmenting knee cartilage in the presence of osteoarthritis (OA). The algorithm employs a hierarchical set of two random forest classifiers. The first is a neighborhood approximation forest, the output probability map of which is utilized as a feature set for the second random forest (RF) classifier. The output probabilities of the hierarchical approach are used as cost functions in a Layered Optimal Graph Segmentation of Multiple Objects and Surfaces (LOGISMOS). In this work, we highlight a novel post-processing interaction called just-enough interaction (JEI) which enables quick and accurate generation of a large set of training examples. Disjoint sets of 15 and 13 subjects were used for training, and testing was performed on another disjoint set of 53 knee datasets. All images were acquired using a double echo steady state (DESS) MRI sequence and are from the osteoarthritis initiative (OAI) database. Segmentation performance using the learning-based cost function showed a significant reduction in segmentation errors ($p< 0.05$) in comparison with conventional gradient-based cost functions. |
Tasks | |
Published | 2019-03-10 |
URL | http://arxiv.org/abs/1903.03929v1 |
http://arxiv.org/pdf/1903.03929v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-segmentation-of-knee-mri-using |
Repo | |
Framework | |
Distributed estimation of principal support vector machines for sufficient dimension reduction
Title | Distributed estimation of principal support vector machines for sufficient dimension reduction |
Authors | Jun Jin, Chao Ying, Zhou Yu |
Abstract | The principal support vector machines method (Li et al., 2011) is a powerful tool for sufficient dimension reduction that replaces original predictors with their low-dimensional linear combinations without loss of information. However, the computational burden of the principal support vector machines method constrains its use for massive data. To address this issue, in this paper we propose two distributed estimation algorithms for fast implementation when the sample size is large. Both distributed sufficient dimension reduction estimators enjoy the same statistical efficiency as merging all the data together, which provides rigorous statistical guarantees for their application to large scale datasets. The two distributed algorithms are further adapted to principal weighted support vector machines (Shin et al., 2016) for sufficient dimension reduction in binary classification. The statistical accuracy and computational complexity of our proposed methods are examined through comprehensive simulation studies and a real data application with more than 600000 samples. |
Tasks | Dimensionality Reduction |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12732v1 |
https://arxiv.org/pdf/1911.12732v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-estimation-of-principal-support |
Repo | |
Framework | |
Hybrid Stochastic Gradient Descent Algorithms for Stochastic Nonconvex Optimization
Title | Hybrid Stochastic Gradient Descent Algorithms for Stochastic Nonconvex Optimization |
Authors | Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen |
Abstract | We introduce a hybrid stochastic estimator to design stochastic gradient algorithms for solving stochastic optimization problems. Such a hybrid estimator is a convex combination of two existing biased and unbiased estimators and leads to a useful property of its variance. We limit our consideration to a hybrid SARAH-SGD for nonconvex expectation problems. However, our idea can be extended to handle a broader class of estimators in both convex and nonconvex settings. We propose a new single-loop stochastic gradient descent algorithm that can achieve an $O(\max\{\sigma^3\varepsilon^{-1},\sigma\varepsilon^{-3}\})$ complexity bound to obtain an $\varepsilon$-stationary point under smoothness and $\sigma^2$-bounded variance assumptions. This complexity is better than $O(\sigma^2\varepsilon^{-4})$ often obtained in state-of-the-art SGDs when $\sigma < O(\varepsilon^{-3})$. We also consider different extensions of our method, including constant and adaptive step-size with single-loop, double-loop, and mini-batch variants. We compare our algorithms with existing methods on several datasets using two nonconvex models. |
Tasks | Stochastic Optimization |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.05920v1 |
https://arxiv.org/pdf/1905.05920v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-stochastic-gradient-descent-algorithms |
Repo | |
Framework | |
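The "convex combination of a biased and an unbiased estimator" described in the abstract above can be sketched roughly as follows, assuming the biased part is a SARAH-style recursive estimator and the unbiased part is a plain SGD gradient. The use of two independent samples (`xi`, `zeta`), the toy quadratic objective, and every name and default value here are assumptions for illustration, not the authors' algorithm or settings.

```python
import random


def hybrid_estimate(v_prev, grad_new_xi, grad_old_xi, grad_new_zeta, beta):
    """Convex combination of a SARAH-style and an SGD-style estimator.

    sarah_part is the (biased) recursive estimator; grad_new_zeta is an
    (unbiased) stochastic gradient at the current point.
    """
    sarah_part = v_prev + grad_new_xi - grad_old_xi
    return beta * sarah_part + (1.0 - beta) * grad_new_zeta


def run_hybrid_sgd(x0=5.0, beta=0.9, step=0.1, iters=200, noise=0.1, seed=0):
    """Single-loop descent on the toy objective E[0.5 * (x - sample)^2]."""
    rng = random.Random(seed)

    def grad(x, sample):
        return x - sample  # gradient of 0.5 * (x - sample)^2

    x_prev = x0
    v = grad(x_prev, rng.gauss(0.0, noise))
    x = x_prev - step * v
    for _ in range(iters):
        xi = rng.gauss(0.0, noise)    # sample shared by both SARAH terms
        zeta = rng.gauss(0.0, noise)  # independent sample for the SGD term
        v = hybrid_estimate(v, grad(x, xi), grad(x_prev, xi), grad(x, zeta), beta)
        x_prev, x = x, x - step * v
    return x
```

Setting `beta=0.0` recovers plain SGD and `beta=1.0` recovers the pure recursive estimator, which is one way to see the estimator as an interpolation between the two.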
Logical Interpretations of Autoencoders
Title | Logical Interpretations of Autoencoders |
Authors | Anton Fuxjaeger, Vaishak Belle |
Abstract | The unification of low-level perception and high-level reasoning is a long-standing problem in artificial intelligence, which has the potential to not only bring the areas of logic and learning closer together but also demonstrate how abstract concepts might emerge from sensory data. Precisely because deep learning methods dominate perception-based learning, including vision, speech, and linguistic grammar, there is a fast-growing literature on how to integrate symbolic reasoning and deep learning. Broadly, efforts seem to fall into three camps: those focused on defining a logic whose formulas capture deep learning, ones that integrate symbolic constraints in deep learning, and others that allow neural computations and symbolic reasoning to co-exist separately, to enjoy the strengths of both worlds. In this paper, we identify another dimension to this inquiry: what do the hidden layers really capture, and how can we reason about that logically? In particular, we consider autoencoders that are widely used for dimensionality reduction and inject a symbolic generative framework onto the feature layer. This allows us, among other things, to generate example images for a class to get a sense of what was learned. Moreover, the modular structure of the proposed model makes it possible to learn relations over multiple images at a time, as well as handle noisy labels. Our empirical evaluations show the promise of this inquiry. |
Tasks | Dimensionality Reduction |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11629v1 |
https://arxiv.org/pdf/1911.11629v1.pdf | |
PWC | https://paperswithcode.com/paper/logical-interpretations-of-autoencoders |
Repo | |
Framework | |
Classification from Triplet Comparison Data
Title | Classification from Triplet Comparison Data |
Authors | Zhenghang Cui, Nontawat Charoenphakdee, Issei Sato, Masashi Sugiyama |
Abstract | Learning from triplet comparison data has been extensively studied in the context of metric learning, where we want to learn a distance metric between two instances, and ordinal embedding, where we want to learn an embedding in a Euclidean space of the given instances that preserves the comparison order as well as possible. Unlike fully-labeled data, triplet comparison data can be collected in a more accurate and human-friendly way. Although learning from triplet comparison data has been considered in many applications, an important fundamental question of whether we can learn a classifier only from triplet comparison data has remained unanswered. In this paper, we give a positive answer to this important question by proposing an unbiased estimator for the classification risk under the empirical risk minimization framework. Since the proposed method is based on the empirical risk minimization framework, it inherently has the advantage that any surrogate loss function and any model, including neural networks, can be easily applied. Furthermore, we theoretically establish an estimation error bound for the proposed empirical risk minimizer. Finally, we provide experimental results to show that our method empirically works well and outperforms various baseline methods. |
Tasks | Metric Learning |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10225v2 |
https://arxiv.org/pdf/1907.10225v2.pdf | |
PWC | https://paperswithcode.com/paper/classification-from-triplet-comparison-data |
Repo | |
Framework | |
Mitigating Uncertainty in Document Classification
Title | Mitigating Uncertainty in Document Classification |
Authors | Xuchao Zhang, Fanglan Chen, Chang-Tien Lu, Naren Ramakrishnan |
Abstract | The uncertainty measurement of classifiers’ predictions is especially important in applications such as medical diagnoses that need to ensure limited human resources can focus on the most uncertain predictions returned by machine learning models. However, few existing uncertainty models attempt to improve overall prediction accuracy where human resources are involved in the text classification task. In this paper, we propose a novel neural-network-based model that applies a new dropout-entropy method for uncertainty measurement. We also design a metric learning method on feature representations, which can boost the performance of dropout-based uncertainty methods with smaller prediction variance in accurate prediction trials. Extensive experiments on real-world data sets demonstrate that our method can achieve a considerable improvement in overall prediction accuracy compared to existing approaches. In particular, our model improved the accuracy from 0.78 to 0.92 when 30% of the most uncertain predictions were handed over to human experts in “20NewsGroup” data. |
Tasks | Document Classification, Metric Learning, Text Classification |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07590v1 |
https://arxiv.org/pdf/1907.07590v1.pdf | |
PWC | https://paperswithcode.com/paper/mitigating-uncertainty-in-document-1 |
Repo | |
Framework | |
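One common way to turn dropout samples into an uncertainty score, and then hand the most uncertain predictions to human experts as the abstract above describes, is sketched below. It assumes the score is the entropy of the class probabilities averaged over several stochastic forward passes; whether this matches the paper's exact dropout-entropy definition is not confirmed here, and all function names are hypothetical.

```python
import math


def predictive_entropy(prob_samples):
    """Entropy of the mean class distribution over dropout passes.

    prob_samples: list of probability vectors, one per stochastic pass.
    """
    n_classes = len(prob_samples[0])
    mean = [
        sum(sample[c] for sample in prob_samples) / len(prob_samples)
        for c in range(n_classes)
    ]
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def defer_most_uncertain(scores, fraction):
    """Split example indices into (deferred_to_human, kept_by_model)."""
    n_defer = int(round(fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:n_defer]), sorted(order[n_defer:])
```

With `fraction=0.3`, this mirrors the setting quoted in the abstract, where 30% of the most uncertain predictions are passed to human experts and accuracy is measured on the combined result.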
A single-layer RNN can approximate stacked and bidirectional RNNs, and topologies in between
Title | A single-layer RNN can approximate stacked and bidirectional RNNs, and topologies in between |
Authors | Javier S. Turek, Shailee Jain, Mihai Capota, Alexander G. Huth, Theodore L. Willke |
Abstract | To enhance the expressiveness and representational capacity of recurrent neural networks (RNN), a large body of work has emerged exploring stacked architectures with additional topological modifications like shortcut connections or bidirectionality. However, choosing the best network for a particular problem requires a combinatorial search over architectures and their hyperparameters. In this work, we show that a single-layer RNN can perfectly mimic an arbitrarily deep stacked RNN under specific constraints on its weight matrix and a delay between input and output. This obviates the need to manually select hyperparameters like the number of layers. Additionally, we show that weakening weight constraints while keeping the delay gives rise to partial acausality in the single-layer RNN, much like a bidirectional network. Synthetic experiments confirm that the delayed RNN can mimic bidirectional networks in perfectly solving some acausal tasks, outperforming them in others. Finally, we show that in a challenging language processing task, the delayed RNN performs within 0.3% of the accuracy of the bidirectional network while reducing computational costs. |
Tasks | |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.00021v1 |
https://arxiv.org/pdf/1909.00021v1.pdf | |
PWC | https://paperswithcode.com/paper/a-single-layer-rnn-can-approximate-stacked |
Repo | |
Framework | |
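The delay between input and output mentioned in the abstract above can be illustrated with a toy scalar RNN: the input is padded with `delay` extra steps, and the output for position `t` is read from the hidden state at step `t + delay`, so it can depend on a few future inputs. The scalar tanh cell, the weights, and the function name are illustrative assumptions; the paper's weight-matrix constraints are not reproduced.

```python
import math


def run_delayed_rnn(inputs, delay, w_in=0.5, w_rec=0.8, pad_value=0.0):
    """Run a scalar tanh RNN and return outputs shifted by `delay` steps."""
    padded = list(inputs) + [pad_value] * delay
    h = 0.0
    states = []
    for x in padded:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    # Output for position t is the state after seeing inputs up to t + delay.
    return states[delay:]
```

With `delay=0` this is an ordinary causal RNN; with `delay=1`, changing the second input also changes the first output, which is the partial acausality the abstract compares to a bidirectional network.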
Learning the Non-linearity in Convolutional Neural Networks
Title | Learning the Non-linearity in Convolutional Neural Networks |
Authors | Gavneet Singh Chadha, Andreas Schwung |
Abstract | We propose the introduction of a nonlinear operation into the feature generation process in convolutional neural networks. This nonlinearity can be implemented in various ways. First, we discuss the use of nonlinearities in the process of data augmentation to increase the robustness of the neural network's recognition capacity. To this end, we randomly disturb the input data set by applying exponents within a certain numerical range to individual data points of the input space. Second, we propose nonlinear convolutional neural networks where we apply the exponential operation to each element of the receptive field. To this end, we define an additional weight matrix of the same dimension as the standard kernel weight matrix. The weights of this matrix then constitute the exponents of the corresponding components of the receptive field. In the basic setting, we keep the weight parameters fixed during training by defining suitable parameters. Alternatively, we make the exponential weight parameters end-to-end trainable using a suitable parameterization. The network architecture is applied to a time series analysis data set, showing a considerable increase in the classification performance compared to baseline networks. |
Tasks | Data Augmentation, Time Series, Time Series Analysis |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12337v1 |
https://arxiv.org/pdf/1905.12337v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-the-non-linearity-in-convolutional |
Repo | |
Framework | |
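A minimal one-dimensional sketch of the second idea in the abstract above: each element of the receptive field is raised to its own exponent before the usual weighted sum, with the exponents stored in a second array of the same size as the kernel. It assumes non-negative inputs (so fractional exponents stay real) and fixed exponents; the function name and the 1-D setting are illustrative choices, not the authors' implementation.

```python
def nonlinear_conv1d(signal, weights, exponents):
    """Valid 1-D convolution (correlation form) with per-tap exponents.

    Computes y[i] = sum_j weights[j] * signal[i + j] ** exponents[j].
    """
    k = len(weights)
    if len(exponents) != k:
        raise ValueError("exponents must have the same length as weights")
    out = []
    for i in range(len(signal) - k + 1):
        window = signal[i:i + k]
        out.append(sum(w * (x ** p) for w, x, p in zip(weights, window, exponents)))
    return out
```

Setting every exponent to 1 recovers the standard linear kernel, so the exponent array can be read as a strict generalization of the usual convolution under these assumptions.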
Mini-batch Metropolis-Hastings MCMC with Reversible SGLD Proposal
Title | Mini-batch Metropolis-Hastings MCMC with Reversible SGLD Proposal |
Authors | Tung-Yu Wu, Y. X. Rachel Wang, Wing H. Wong |
Abstract | Traditional MCMC algorithms are computationally intensive and do not scale well to large data. In particular, the Metropolis-Hastings (MH) algorithm requires passing over the entire dataset to evaluate the likelihood ratio in each iteration. We propose a general framework for performing MH-MCMC using mini-batches of the whole dataset and show that this gives rise to approximately a tempered stationary distribution. We prove that the algorithm preserves the modes of the original target distribution and derive an error bound on the approximation with mild assumptions on the likelihood. To further extend the utility of the algorithm to high dimensional settings, we construct a proposal with forward and reverse moves using stochastic gradient and show that the construction leads to reasonable acceptance probabilities. We demonstrate the performance of our algorithm in both low dimensional models and high dimensional neural network applications. Particularly in the latter case, compared to popular optimization methods, our method is more robust to the choice of learning rate and improves testing accuracy. |
Tasks | |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.02910v2 |
https://arxiv.org/pdf/1908.02910v2.pdf | |
PWC | https://paperswithcode.com/paper/mini-batch-metropolis-hastings-mcmc-with |
Repo | |
Framework | |
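To make the cost argument in the abstract above concrete, here is a naive mini-batch Metropolis-Hastings sketch for the mean of a unit-variance Gaussian. It is not the authors' algorithm: a symmetric random-walk proposal is swapped in for the reversible SGLD proposal, the mini-batch log-likelihood ratio is simply rescaled by `N / batch_size`, and a flat prior is assumed. All names and default values are hypothetical.

```python
import math
import random


def minibatch_mh(data, n_steps=2000, batch_size=20, step=0.2, theta0=0.0, seed=0):
    """Naive mini-batch MH for the mean of a unit-variance Gaussian."""
    rng = random.Random(seed)
    scale = len(data) / batch_size  # rescale batch sum to full-data size
    theta = theta0
    chain = []
    accepted = 0
    for _ in range(n_steps):
        proposal = theta + step * rng.gauss(0.0, 1.0)
        batch = rng.sample(data, batch_size)
        # Scaled mini-batch estimate of the log-likelihood ratio.
        log_ratio = scale * sum(
            -0.5 * (x - proposal) ** 2 + 0.5 * (x - theta) ** 2 for x in batch
        )
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta = proposal
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_steps
```

Each iteration touches only `batch_size` data points instead of the full dataset, which is the saving the abstract refers to; the price is a noisy acceptance test, which is why the paper analyzes the resulting stationary distribution rather than assuming it equals the original target.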