Paper Group ANR 287
Using LLVM-based JIT Compilation in Genetic Programming. AI2-THOR: An Interactive 3D Environment for Visual AI. Deep Learning Models of the Retinal Response to Natural Scenes. El Lenguaje Natural como Lenguaje Formal. Repairing Ontologies via Axiom Weakening. Margin Sample Mining Loss: A Deep Learning Based Method for Person Re-identification. A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics. Calibrating Noise to Variance in Adaptive Data Analysis. Tensor-based approach to accelerate deformable part models. Non-iterative Label Propagation in Optimal Leading Forest. End-to-End ASR-free Keyword Search from Speech. A Graphical Social Topology Model for Multi-Object Tracking. Automatic Spine Segmentation using Convolutional Neural Network via Redundant Generation of Class Labels for 3D Spine Modeling. One-Shot Concept Learning by Simulating Evolutionary Instinct Development. Detecting Drivable Area for Self-driving Cars: An Unsupervised Approach.
Using LLVM-based JIT Compilation in Genetic Programming
Title | Using LLVM-based JIT Compilation in Genetic Programming |
Authors | Michal Gregor, Juraj Spalek |
Abstract | The paper describes an approach to implementing genetic programming, which uses the LLVM library to just-in-time compile/interpret the evolved abstract syntax trees. The solution is described in some detail, including a parser (based on FlexC++ and BisonC++) that can construct the trees from a simple toy language with C-like syntax. The approach is compared with a previous implementation (based on direct execution of trees using polymorphic functors) in terms of execution speed. |
Tasks | |
Published | 2017-01-20 |
URL | http://arxiv.org/abs/1701.05730v1 |
http://arxiv.org/pdf/1701.05730v1.pdf | |
PWC | https://paperswithcode.com/paper/using-llvm-based-jit-compilation-in-genetic |
Repo | |
Framework | |
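The paper's implementation is in C++ on top of LLVM, FlexC++, and BisonC++; as a rough illustration of the JIT idea only, the sketch below uses the llvmlite Python bindings to build IR for one fixed toy expression (a stand-in for an evolved tree) and compile it with MCJIT. The expression, names, and setup are illustrative assumptions, not the paper's implementation.

```python
import ctypes
import llvmlite.ir as ir
import llvmlite.binding as llvm

llvm.initialize()
llvm.initialize_native_target()
llvm.initialize_native_asmprinter()

# Build IR for f(x) = x*x + 1.0, standing in for one evolved GP individual.
double = ir.DoubleType()
module = ir.Module(name="gp_individual")
fn = ir.Function(module, ir.FunctionType(double, [double]), name="individual")
builder = ir.IRBuilder(fn.append_basic_block(name="entry"))
x, = fn.args
builder.ret(builder.fadd(builder.fmul(x, x), ir.Constant(double, 1.0)))

# JIT-compile the module and call the compiled function through ctypes.
target_machine = llvm.Target.from_default_triple().create_target_machine()
engine = llvm.create_mcjit_compiler(llvm.parse_assembly(""), target_machine)
engine.add_module(llvm.parse_assembly(str(module)))
engine.finalize_object()
individual = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)(
    engine.get_function_address("individual"))
print(individual(3.0))  # 10.0 -- evaluating the compiled tree on one fitness case
```

In a GP loop, each evolved tree would be lowered to such a function once and then evaluated on all fitness cases as native code, which is where the speedup over functor-based interpretation comes from.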
AI2-THOR: An Interactive 3D Environment for Visual AI
Title | AI2-THOR: An Interactive 3D Environment for Visual AI |
Authors | Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Daniel Gordon, Yuke Zhu, Abhinav Gupta, Ali Farhadi |
Abstract | We introduce The House Of inteRactions (THOR), a framework for visual AI research, available at http://ai2thor.allenai.org. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate in the scenes and interact with objects to perform tasks. AI2-THOR enables research in many different domains including but not limited to deep reinforcement learning, imitation learning, learning by interaction, planning, visual question answering, unsupervised representation learning, object detection and segmentation, and learning models of cognition. The goal of AI2-THOR is to facilitate building visually intelligent models and push the research forward in this domain. |
Tasks | Imitation Learning, Object Detection, Question Answering, Representation Learning, Unsupervised Representation Learning, Visual Question Answering |
Published | 2017-12-14 |
URL | http://arxiv.org/abs/1712.05474v3 |
http://arxiv.org/pdf/1712.05474v3.pdf | |
PWC | https://paperswithcode.com/paper/ai2-thor-an-interactive-3d-environment-for |
Repo | |
Framework | |
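A minimal usage sketch with the ai2thor Python package follows; the scene name, constructor arguments, and metadata keys are assumptions that depend on the installed package version (see http://ai2thor.allenai.org for the authoritative API).

```python
from ai2thor.controller import Controller

# Scene name and arguments are illustrative; consult the AI2-THOR docs for your version.
controller = Controller(scene="FloorPlan1")
event = controller.step(action="MoveAhead")   # navigate one step in the scene
print(event.metadata["agent"]["position"])    # agent state, e.g. for planning or RL rewards
frame = event.frame                           # RGB observation for a visual model
controller.stop()
```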
Deep Learning Models of the Retinal Response to Natural Scenes
Title | Deep Learning Models of the Retinal Response to Natural Scenes |
Authors | Lane T. McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen A. Baccus |
Abstract | A central challenge in neuroscience is to understand neural computations and circuit mechanisms that underlie the encoding of ethologically relevant, natural stimuli. In multilayered neural circuits, nonlinear processes such as synaptic transmission and spiking dynamics present a significant obstacle to the creation of accurate computational models of responses to natural stimuli. Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes nearly to within the variability of a cell’s response, and are markedly more accurate than linear-nonlinear (LN) models and Generalized Linear Models (GLMs). Moreover, we find two additional surprising properties of CNNs: they are less susceptible to overfitting than their LN counterparts when trained on small amounts of data, and generalize better when tested on stimuli drawn from a different distribution (e.g. between natural scenes and white noise). Examination of trained CNNs reveals several properties. First, a richer set of feature maps is necessary for predicting the responses to natural scenes compared to white noise. Second, temporally precise responses to slowly varying inputs originate from feedforward inhibition, similar to known retinal mechanisms. Third, the injection of latent noise sources in intermediate layers enables our model to capture the sub-Poisson spiking variability observed in retinal ganglion cells. Fourth, augmenting our CNNs with recurrent lateral connections enables them to capture contrast adaptation as an emergent property of accurately describing retinal responses to natural scenes. These methods can be readily generalized to other sensory modalities and stimulus ensembles. Overall, this work demonstrates that CNNs not only accurately capture sensory circuit responses to natural scenes, but also yield information about the circuit’s internal structure and function. |
Tasks | |
Published | 2017-02-06 |
URL | http://arxiv.org/abs/1702.01825v1 |
http://arxiv.org/pdf/1702.01825v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-models-of-the-retinal-response |
Repo | |
Framework | |
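As a rough sketch of the modeling setup, the PyTorch snippet below maps a spatiotemporal stimulus clip to non-negative firing rates with a small CNN and a Poisson likelihood; the layer sizes, stimulus shape, and loss configuration are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RetinalCNN(nn.Module):
    """Toy CNN from stimulus history (channels = time bins) to ganglion-cell firing rates."""
    def __init__(self, n_cells=5, history=40):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(history, 8, kernel_size=15), nn.Softplus(),
            nn.Conv2d(8, 8, kernel_size=9), nn.Softplus(),
        )
        self.readout = nn.Linear(8 * 28 * 28, n_cells)  # sized for the 50x50 toy stimulus below
        self.rate = nn.Softplus()                        # keep predicted firing rates non-negative

    def forward(self, x):                                # x: (batch, history, H, W)
        return self.rate(self.readout(self.features(x).flatten(1)))

model = RetinalCNN()
stimulus = torch.randn(16, 40, 50, 50)                   # toy white-noise movie clips
rates = model(stimulus)
loss = nn.PoissonNLLLoss(log_input=False)(rates, torch.rand_like(rates))  # Poisson spiking loss
```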
El Lenguaje Natural como Lenguaje Formal
Title | El Lenguaje Natural como Lenguaje Formal |
Authors | Franco M. Luque |
Abstract | Formal language theory is useful for the study of natural language. In particular, it is of interest to study the adequacy of grammatical formalisms to express the syntactic phenomena present in natural language. First, it helps to draw hypotheses about the nature and complexity of the speaker-hearer linguistic competence, a fundamental question in linguistics and other cognitive sciences. Moreover, from an engineering point of view, it reveals the practical limitations of applications based on those formalisms. In this article I introduce the adequacy problem of grammatical formalisms for natural language, also introducing some formal language theory concepts required for this discussion. Then, I review the formalisms that have been proposed throughout history, and the arguments that have been given to support or reject their adequacy. —– La teoría de lenguajes formales es útil para el estudio de los lenguajes naturales. En particular, resulta de interés estudiar la adecuación de los formalismos gramaticales para expresar los fenómenos sintácticos presentes en el lenguaje natural. Primero, ayuda a trazar hipótesis acerca de la naturaleza y complejidad de las competencias lingüísticas de los hablantes-oyentes del lenguaje, un interrogante fundamental de la lingüística y otras ciencias cognitivas. Además, desde el punto de vista de la ingeniería, permite conocer limitaciones prácticas de las aplicaciones basadas en dichos formalismos. En este artículo hago una introducción al problema de la adecuación de los formalismos gramaticales para el lenguaje natural, introduciendo también algunos conceptos de la teoría de lenguajes formales necesarios para esta discusión. Luego, hago un repaso de los formalismos que han sido propuestos a lo largo de la historia, y de los argumentos que se han dado para sostener o refutar su adecuación. |
Tasks | |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04417v1 |
http://arxiv.org/pdf/1703.04417v1.pdf | |
PWC | https://paperswithcode.com/paper/el-lenguaje-natural-como-lenguaje-formal |
Repo | |
Framework | |
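For readers unfamiliar with the grammatical formalisms the article discusses, the snippet below builds a toy context-free grammar with NLTK and parses one English sentence; the grammar and sentence are purely illustrative and are not taken from the article.

```python
import nltk

# A toy context-free grammar: the simplest class of formalisms whose adequacy
# for natural language is debated in the article.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the' | 'a'
N -> 'linguist' | 'grammar'
V -> 'studies'
""")
parser = nltk.ChartParser(grammar)
for tree in parser.parse("the linguist studies a grammar".split()):
    print(tree)   # prints the bracketed parse tree
```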
Repairing Ontologies via Axiom Weakening
Title | Repairing Ontologies via Axiom Weakening |
Authors | Nicolas Troquard, Roberto Confalonieri, Pietro Galliani, Rafael Penaloza, Daniele Porello, Oliver Kutz |
Abstract | Ontology engineering is a hard and error-prone task, in which small changes may lead to errors, or even produce an inconsistent ontology. As ontologies grow in size, the need for automated methods for repairing inconsistencies while preserving as much of the original knowledge as possible increases. Most previous approaches to this task are based on removing a few axioms from the ontology to regain consistency. We propose a new method based on weakening these axioms to make them less restrictive, employing refinement operators. We introduce the theoretical framework for weakening DL ontologies, propose algorithms to repair ontologies based on the framework, and provide an analysis of the computational complexity. Through an empirical analysis over real-life ontologies, we show that our approach preserves significantly more of the original knowledge of the ontology than removing axioms. |
Tasks | |
Published | 2017-11-09 |
URL | http://arxiv.org/abs/1711.03430v1 |
http://arxiv.org/pdf/1711.03430v1.pdf | |
PWC | https://paperswithcode.com/paper/repairing-ontologies-via-axiom-weakening |
Repo | |
Framework | |
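The sketch below illustrates the repair-by-weakening idea on a toy subsumption hierarchy: a weakening step generalizes the right-hand side of an axiom one level up, and the repair loop prefers weakening over removal. The data structures, consistency check, and removal fallback are simplifying assumptions, not the paper's DL refinement operators.

```python
# Hypothetical toy setting: a fixed class hierarchy plus subsumption axioms (sub, sup).
# An axiom "A ⊑ B" is weakened by generalizing B to its parent class.
PARENT = {"Penguin": "Bird", "Bird": "Animal", "Animal": "Thing",
          "Plant": "Thing", "Thing": None}

def ancestors(c):
    out = []
    while PARENT.get(c):
        c = PARENT[c]
        out.append(c)
    return out

def is_consistent(axioms, disjoint):
    # Inconsistent if some name is forced under two classes declared disjoint.
    forced = {}
    for sub, sup in axioms:
        forced.setdefault(sub, set()).update([sup] + ancestors(sup))
    return not any(a in cls and b in cls for cls in forced.values() for a, b in disjoint)

def weaken(axiom):
    sub, sup = axiom
    anc = ancestors(sup)
    return (sub, anc[0]) if anc else None      # generalize the right-hand side one step

def repair(axioms, disjoint):
    axioms = list(axioms)
    while not is_consistent(axioms, disjoint):
        for i, ax in enumerate(axioms):
            weaker = weaken(ax)
            if weaker and is_consistent(axioms[:i] + [weaker] + axioms[i + 1:], disjoint):
                axioms[i] = weaker             # weakening restores consistency: keep knowledge
                break
        else:
            axioms.pop()                       # fall back to removal if no weakening helps
    return axioms

axioms = [("Tux", "Penguin"), ("Tux", "Plant")]            # jointly inconsistent
print(repair(axioms, disjoint=[("Animal", "Plant")]))      # second axiom weakened to Thing
```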
Margin Sample Mining Loss: A Deep Learning Based Method for Person Re-identification
Title | Margin Sample Mining Loss: A Deep Learning Based Method for Person Re-identification |
Authors | Qiqi Xiao, Hao Luo, Chi Zhang |
Abstract | Person re-identification (ReID) is an important task in computer vision. Recently, deep learning with a metric learning loss has become a common framework for ReID. In this paper, we propose a new metric learning loss with hard sample mining called margin sample mining loss (MSML), which can achieve better accuracy than other metric learning losses, such as the triplet loss. In experiments, our proposed method outperforms most of the state-of-the-art algorithms on Market1501, MARS, CUHK03 and CUHK-SYSU. |
Tasks | Metric Learning, Person Re-Identification |
Published | 2017-10-02 |
URL | http://arxiv.org/abs/1710.00478v3 |
http://arxiv.org/pdf/1710.00478v3.pdf | |
PWC | https://paperswithcode.com/paper/margin-sample-mining-loss-a-deep-learning |
Repo | |
Framework | |
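A sketch of the batch-level hard mining suggested by the abstract is given below in PyTorch: the loss compares the single hardest positive pair against the single hardest negative pair over the whole batch. The distance metric, margin value, and exact mining rule are assumptions based on the abstract, not the paper's definitive formulation.

```python
import torch
import torch.nn.functional as F

def msml(embeddings, labels, margin=0.5):
    """Margin-style loss mined over the whole batch: farthest same-identity pair
    vs. closest different-identity pair (a sketch of the MSML idea)."""
    dist = torch.cdist(embeddings, embeddings)                      # (N, N) pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=embeddings.device)
    hardest_pos = dist[same & ~eye].max()                           # hardest positive pair
    hardest_neg = dist[~same].min()                                 # hardest negative pair
    return F.relu(hardest_pos - hardest_neg + margin)

emb = F.normalize(torch.randn(8, 128), dim=1)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
print(msml(emb, labels))
```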
A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics
Title | A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics |
Authors | Yuchen Zhang, Percy Liang, Moses Charikar |
Abstract | We study the Stochastic Gradient Langevin Dynamics (SGLD) algorithm for non-convex optimization. The algorithm performs stochastic gradient descent, where in each step it injects appropriately scaled Gaussian noise into the update. We analyze the algorithm’s hitting time to an arbitrary subset of the parameter space. Two results follow from our general theory: First, we prove that for empirical risk minimization, if the empirical risk is point-wise close to the (smooth) population risk, then the algorithm achieves an approximate local minimum of the population risk in polynomial time, escaping suboptimal local minima that only exist in the empirical risk. Second, we show that SGLD improves on one of the best known learnability results for learning linear classifiers under the zero-one loss. |
Tasks | |
Published | 2017-02-18 |
URL | http://arxiv.org/abs/1702.05575v3 |
http://arxiv.org/pdf/1702.05575v3.pdf | |
PWC | https://paperswithcode.com/paper/a-hitting-time-analysis-of-stochastic |
Repo | |
Framework | |
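The update analyzed in the paper is plain gradient descent plus scaled Gaussian noise; a minimal sketch on a toy non-convex objective is shown below. The sqrt(2*eta/beta) noise scaling is the standard Langevin choice assumed here, and the step size and temperature are arbitrary toy values.

```python
import torch

def sgld_step(theta, grad, step_size, inverse_temp):
    """One SGLD update: a gradient step plus Gaussian noise scaled by the step size."""
    noise = torch.randn_like(theta) * (2 * step_size / inverse_temp) ** 0.5
    return theta - step_size * grad + noise

# Toy non-convex objective f(x) = (x^2 - 1)^2 with two global minima at +-1.
theta = torch.tensor([3.0], requires_grad=True)
for _ in range(2000):
    loss = (theta ** 2 - 1) ** 2
    grad, = torch.autograd.grad(loss.sum(), theta)
    theta = sgld_step(theta.detach(), grad, step_size=1e-2, inverse_temp=10.0).requires_grad_()
print(theta)   # hovers near +1 or -1: the noise lets the iterate escape the flat outer region
```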
Calibrating Noise to Variance in Adaptive Data Analysis
Title | Calibrating Noise to Variance in Adaptive Data Analysis |
Authors | Vitaly Feldman, Thomas Steinke |
Abstract | Datasets are often used multiple times and each successive analysis may depend on the outcome of previous analyses. Standard techniques for ensuring generalization and statistical validity do not account for this adaptive dependence. A recent line of work studies the challenges that arise from such adaptive data reuse by considering the problem of answering a sequence of “queries” about the data distribution where each query may depend arbitrarily on answers to previous queries. The strongest results obtained for this problem rely on differential privacy – a strong notion of algorithmic stability with the important property that it “composes” well when data is reused. However, the notion is rather strict, as it requires stability under replacement of an arbitrary data element. The simplest algorithm is to add Gaussian (or Laplace) noise to distort the empirical answers. However, analysing this technique using differential privacy yields suboptimal accuracy guarantees when the queries have low variance. Here we propose a relaxed notion of stability that also composes adaptively. We demonstrate that a simple and natural algorithm based on adding noise scaled to the standard deviation of the query provides our notion of stability. This implies an algorithm that can answer statistical queries about the dataset with substantially improved accuracy guarantees for low-variance queries. The only previous approach that provides such accuracy guarantees is based on a more involved differentially private median-of-means algorithm and its analysis exploits stronger “group” stability of the algorithm. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.07196v2 |
http://arxiv.org/pdf/1712.07196v2.pdf | |
PWC | https://paperswithcode.com/paper/calibrating-noise-to-variance-in-adaptive |
Repo | |
Framework | |
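A minimal sketch of the core mechanism, answering a statistical query with Gaussian noise scaled to the query's empirical standard deviation, is shown below; the noise multiplier and its relation to the stability budget are assumptions, not the paper's calibration.

```python
import numpy as np

def answer_query(sample, query, noise_multiplier=0.5, rng=np.random.default_rng(0)):
    """Answer a statistical query with Gaussian noise proportional to the query's
    empirical standard error (a sketch; the multiplier is an arbitrary assumption)."""
    values = query(sample)                                   # per-example query values
    noise_scale = noise_multiplier * values.std() / np.sqrt(len(values))
    return values.mean() + rng.normal(0.0, noise_scale)

data = np.random.default_rng(1).normal(size=1000)
print(answer_query(data, lambda x: (x > 0).astype(float)))   # noisy estimate of P(X > 0)
```

Because the noise shrinks with the query's variance, low-variance queries get much tighter answers than they would under worst-case Gaussian noise of fixed scale, which is the accuracy gain the abstract describes.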
Tensor-based approach to accelerate deformable part models
Title | Tensor-based approach to accelerate deformable part models |
Authors | D. V. Parkhomenko, I. L. Mazurenko |
Abstract | This article provides a next step towards solving the speed bottleneck of any system that intensively uses convolution operations (e.g., CNNs). The method described in the article is applied to the deformable part models (DPM) algorithm. It is based on multidimensional tensors and provides an efficient tradeoff between DPM performance and accuracy. Experiments on various databases, including Pascal VOC, show that the proposed method allows decreasing the number of convolutions by up to 4.5 times compared with DPM v.5, while maintaining similar accuracy. If insignificant accuracy degradation is allowable, a higher computational gain can be achieved. The method consists of a filter tensor decomposition and convolution shortening using the decomposed filter. A mathematical overview of the proposed method as well as simulation results are provided. |
Tasks | |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1707.03268v1 |
http://arxiv.org/pdf/1707.03268v1.pdf | |
PWC | https://paperswithcode.com/paper/tensor-based-approach-to-accelerate |
Repo | |
Framework | |
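To illustrate how a filter decomposition can shorten convolutions, the sketch below replaces a stack of part filters with a low-rank basis via a plain SVD: convolving with r shared basis filters and mixing the responses linearly approximates all of the original convolutions. The paper's multidimensional tensor decomposition differs; the sizes, rank, and SVD choice here are toy assumptions.

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
filters = rng.normal(size=(60, 5, 5))          # a stack of DPM-like part filters (toy sizes)
feature_map = rng.normal(size=(64, 64))

# Unfold the filter stack to (num_filters, 25) and truncate its SVD: r basis filters
# plus mixing coefficients replace 60 convolutions with r convolutions and cheap sums.
U, s, Vt = np.linalg.svd(filters.reshape(60, -1), full_matrices=False)
r = 10
basis = Vt[:r].reshape(r, 5, 5)
coeffs = U[:, :r] * s[:r]

basis_responses = np.stack([correlate2d(feature_map, b, mode="valid") for b in basis])
approx_responses = np.tensordot(coeffs, basis_responses, axes=1)        # (60, H', W')
exact_responses = np.stack([correlate2d(feature_map, f, mode="valid") for f in filters])
print(np.abs(exact_responses - approx_responses).max())  # error shrinks as r grows
```

Raising r trades computation for accuracy, mirroring the speed/accuracy tradeoff reported in the abstract.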
Non-iterative Label Propagation in Optimal Leading Forest
Title | Non-iterative Label Propagation in Optimal Leading Forest |
Authors | Ji Xu, Guoyin Wang |
Abstract | Graph-based semi-supervised learning (GSSL) has an intuitive representation and can be improved by exploiting matrix calculations. However, it has to perform iterative optimization to achieve a preset objective, which usually leads to low efficiency. Another inconvenience of GSSL is that when new data arrive, the graph construction and the optimization have to be conducted all over again. We propose a sound assumption, arguing that: the neighboring data points are not in a peer-to-peer relation, but in a partially ordered relation induced by the local density of and distance between the data; and the label of a center can be regarded as the contribution of its followers. Starting from this assumption, we develop a highly efficient non-iterative label propagation algorithm based on a novel data structure named the optimal leading forest (LaPOLeaF). The major weaknesses of traditional GSSL are addressed by this study. We further scale LaPOLeaF to accommodate big data by utilizing a block distance matrix technique, parallel computing, and Locality-Sensitive Hashing (LSH). Experiments on large datasets have shown the promising results of the proposed methods. |
Tasks | graph construction |
Published | 2017-09-25 |
URL | http://arxiv.org/abs/1709.08426v2 |
http://arxiv.org/pdf/1709.08426v2.pdf | |
PWC | https://paperswithcode.com/paper/non-iterative-label-propagation-in-optimal |
Repo | |
Framework | |
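The sketch below mimics the partial order induced by local density and distance: each point is led by its nearest denser neighbour (long edges are cut so that a forest of trees remains), known labels are pushed up to the roots, and the remaining points inherit their parent's label in a single non-iterative pass. The density estimate, edge-cut threshold, and two-pass rule are simplifying assumptions, not the exact LaPOLeaF procedure.

```python
import numpy as np

def leading_forest_propagate(X, labels, k_density=5, cut=1.0):
    """Non-iterative label propagation on a toy leading forest (labels == -1 are unknown)."""
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    density = np.exp(-np.sort(d, axis=1)[:, 1:k_density + 1] ** 2).sum(axis=1)
    parent = np.full(n, -1)
    for i in range(n):
        denser = np.where(density > density[i])[0]
        if len(denser):
            j = denser[np.argmin(d[i, denser])]   # nearest denser point leads i
            if d[i, j] < cut:                     # cut long edges: one tree per dense region
                parent[i] = j
    out = labels.copy()
    for i in np.where(labels != -1)[0]:           # pass 1: push known labels up to the roots
        j = parent[i]
        while j != -1 and out[j] == -1:
            out[j], j = out[i], parent[j]
    for i in np.argsort(-density):                # pass 2: children inherit their parent's label
        if out[i] == -1 and parent[i] != -1:
            out[i] = out[parent[i]]
    return out

X = np.concatenate([np.random.default_rng(0).normal(0, 0.3, (20, 2)),
                    np.random.default_rng(1).normal(3, 0.3, (20, 2))])
y = np.full(40, -1); y[0], y[20] = 0, 1           # one labelled point per cluster
print(leading_forest_propagate(X, y))
```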
End-to-End ASR-free Keyword Search from Speech
Title | End-to-End ASR-free Keyword Search from Speech |
Authors | Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury |
Abstract | End-to-end (E2E) systems have achieved competitive results compared to conventional hybrid hidden Markov model (HMM)-deep neural network based automatic speech recognition (ASR) systems. Such E2E systems are attractive due to the lack of dependence on alignments between input acoustic and output grapheme or HMM state sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Our E2E KWS system consists of three sub-systems. The first sub-system is a recurrent neural network (RNN)-based acoustic auto-encoder trained to reconstruct the audio through a finite-dimensional representation. The second sub-system is a character-level RNN language model using embeddings learned from a convolutional neural network. Since the acoustic and text query embeddings occupy different representation spaces, they are input to a third feed-forward neural network that predicts whether the query occurs in the acoustic utterance or not. This E2E ASR-free KWS system performs respectably despite lacking a conventional ASR system and trains much faster. |
Tasks | End-To-End Speech Recognition, Language Modelling, Speech Recognition |
Published | 2017-01-13 |
URL | http://arxiv.org/abs/1701.04313v1 |
http://arxiv.org/pdf/1701.04313v1.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-asr-free-keyword-search-from |
Repo | |
Framework | |
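A structural sketch of the three sub-systems is given below: an acoustic encoder standing in for the encoder half of the RNN auto-encoder, a character-level query encoder, and a feed-forward classifier over the concatenated embeddings. All layer types and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class KWSModel(nn.Module):
    """Sketch of an ASR-free KWS pipeline: audio embedding + query embedding -> occurs or not."""
    def __init__(self, n_mels=40, n_chars=30, dim=128):
        super().__init__()
        self.audio_enc = nn.GRU(n_mels, dim, batch_first=True)      # acoustic encoder
        self.char_emb = nn.Embedding(n_chars, 32)
        self.query_enc = nn.GRU(32, dim, batch_first=True)          # character-level query encoder
        self.classifier = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                        nn.Linear(dim, 1))          # joint feed-forward decision

    def forward(self, audio, query):
        _, h_audio = self.audio_enc(audio)                           # (1, B, dim)
        _, h_query = self.query_enc(self.char_emb(query))            # (1, B, dim)
        joint = torch.cat([h_audio[-1], h_query[-1]], dim=-1)
        return torch.sigmoid(self.classifier(joint))                 # P(query occurs in utterance)

model = KWSModel()
prob = model(torch.randn(4, 200, 40), torch.randint(0, 30, (4, 12)))
print(prob.shape)   # (4, 1)
```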
A Graphical Social Topology Model for Multi-Object Tracking
Title | A Graphical Social Topology Model for Multi-Object Tracking |
Authors | Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji |
Abstract | Tracking multiple objects is a challenging task when objects move in groups and occlude each other. Existing methods have investigated the problems of group division and group energy minimization; however, the lack of overall object-group topology modeling limits their ability to handle complex object and group dynamics. Inspired by the social affinity property of moving objects, we propose a Graphical Social Topology (GST) model, which estimates the group dynamics by jointly modeling the group structure and the states of objects using a topological representation. With such a topology representation, moving objects are not only assigned to groups, but also dynamically connected with each other, which enables in-group individuals to be correctly associated and the cohesion of each group to be precisely modeled. Using well-designed topology learning modules and topology training, we infer the birth/death and merging/splitting of dynamic groups. With the GST model, the proposed multi-object tracker naturally handles the occlusion problem by treating the occluded object and other in-group members as a whole while leveraging overall state transitions. Experiments on both RGB and RGB-D datasets confirm that the proposed multi-object tracker improves on the state of the art, especially in crowded scenes. |
Tasks | Multi-Object Tracking, Object Tracking |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04040v2 |
http://arxiv.org/pdf/1702.04040v2.pdf | |
PWC | https://paperswithcode.com/paper/a-graphical-social-topology-model-for-multi |
Repo | |
Framework | |
Automatic Spine Segmentation using Convolutional Neural Network via Redundant Generation of Class Labels for 3D Spine Modeling
Title | Automatic Spine Segmentation using Convolutional Neural Network via Redundant Generation of Class Labels for 3D Spine Modeling |
Authors | Malinda Vania, Dawit Mureja, Deukhee Lee |
Abstract | There has been a significant increase from 2010 to 2016 in the number of people suffering from spine problems. The automatic image segmentation of the spine obtained from a computed tomography (CT) image is important for diagnosing spine conditions and for performing surgery with computer-assisted surgery systems. The spine has a complex anatomy that consists of 33 vertebrae, 23 intervertebral disks, the spinal cord, and connecting ribs. As a result, the spinal surgeon is faced with the challenge of needing a robust algorithm to segment and create a model of the spine. In this study, we developed an automatic segmentation method to segment the spine, and we compared our segmentation results with reference segmentations obtained by experts. We developed a fully automatic approach for spine segmentation from CT based on a hybrid method. This method combines the convolutional neural network (CNN) and fully convolutional network (FCN), and utilizes class redundancy as a soft constraint to greatly improve the segmentation results. The proposed method was found to significantly improve both the accuracy of the segmentation results and the system processing time. Our comparison was based on 12 measurements: the Dice coefficient (94%), Jaccard index (93%), volumetric similarity (96%), sensitivity (97%), specificity (99%), precision (over-segmentation 8.3, under-segmentation 2.6), accuracy (99%), Matthews correlation coefficient (0.93), mean surface distance (0.16 mm), Hausdorff distance (7.4 mm), and global consistency error (0.02). We experimented with CT images from 32 patients, and the experimental results demonstrated the efficiency of the proposed method. |
Tasks | Computed Tomography (CT), Semantic Segmentation |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1712.01640v1 |
http://arxiv.org/pdf/1712.01640v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-spine-segmentation-using |
Repo | |
Framework | |
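Two of the overlap measures used in the evaluation, the Dice coefficient and the Jaccard index, are easy to state precisely; a small reference implementation for binary masks is sketched below (the masks are toy data, not from the study).

```python
import numpy as np

def dice_and_jaccard(pred, ref):
    """Dice coefficient and Jaccard index between two binary segmentation masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    inter = np.logical_and(pred, ref).sum()
    dice = 2 * inter / (pred.sum() + ref.sum())
    jaccard = inter / np.logical_or(pred, ref).sum()
    return dice, jaccard

pred = np.zeros((64, 64), dtype=int); pred[10:40, 10:40] = 1
ref = np.zeros((64, 64), dtype=int); ref[12:42, 12:42] = 1
print(dice_and_jaccard(pred, ref))
```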
One-Shot Concept Learning by Simulating Evolutionary Instinct Development
Title | One-Shot Concept Learning by Simulating Evolutionary Instinct Development |
Authors | Abrar Ahmed, Anish Bikmal |
Abstract | Object recognition has become a crucial part of machine learning and computer vision recently. The current approach to object recognition involves Deep Learning and uses Convolutional Neural Networks to learn the pixel patterns of the objects implicitly through backpropagation. However, CNNs require thousands of examples in order to generalize successfully and often require heavy computing resources for training. This is considered rather sluggish when compared to the human ability to generalize and learn new categories given just a single example. Additionally, CNNs make it difficult to explicitly programmatically modify or intuitively interpret their learned representations. We propose a computational model that can successfully learn an object category from as few as one example and allows its learning style to be tailored explicitly to a scenario. Our model decomposes each image into two attributes: shape and color distribution. We then use a Bayesian criterion to probabilistically determine the likelihood of each category. The model takes each factor into account based on importance and calculates the conditional probability of the object belonging to each learned category. Our model is not only applicable to visual scenarios, but can also be implemented in a broader and more practical scope of situations such as Natural Language Processing as well as other places where it is possible to retrieve and construct individual attributes. Because the only condition our model presents is the ability to retrieve and construct individual attributes such as shape and color, it can be applied to essentially any class of visual objects. |
Tasks | Object Recognition |
Published | 2017-08-27 |
URL | http://arxiv.org/abs/1708.08141v1 |
http://arxiv.org/pdf/1708.08141v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-concept-learning-by-simulating |
Repo | |
Framework | |
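The sketch below mirrors the attribute-based scoring the abstract describes: each category is represented by a single example's shape and colour descriptors, and a query is scored by importance-weighted likelihood terms. The Gaussian-style likelihoods, softmax normalization, and weight values are assumptions for illustration, not the paper's exact Bayesian criterion.

```python
import numpy as np

def category_score(attributes, category_prototypes, weights):
    """Score a query against one-example category prototypes, attribute by attribute."""
    log_post = {}
    for cat, protos in category_prototypes.items():
        log_post[cat] = sum(w * -np.sum((attributes[name] - protos[name]) ** 2)
                            for name, w in weights.items())    # weighted log-likelihood terms
    vals = np.array(list(log_post.values()))
    probs = np.exp(vals - vals.max()); probs /= probs.sum()    # softmax -> posterior-like scores
    return dict(zip(log_post, probs))

cup = {"shape": np.array([0.9, 0.1]), "color": np.array([0.2, 0.5, 0.3])}
ball = {"shape": np.array([0.1, 0.9]), "color": np.array([0.6, 0.2, 0.2])}
query = {"shape": np.array([0.85, 0.15]), "color": np.array([0.25, 0.5, 0.25])}
print(category_score(query, {"cup": cup, "ball": ball}, {"shape": 2.0, "color": 1.0}))
```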
Detecting Drivable Area for Self-driving Cars: An Unsupervised Approach
Title | Detecting Drivable Area for Self-driving Cars: An Unsupervised Approach |
Authors | Ziyi Liu, Siyu Yu, Xiao Wang, Nanning Zheng |
Abstract | It has been well recognized that detecting the drivable area is central to self-driving cars. Most existing methods attempt to locate the road surface by using lane lines, thereby restricting themselves to drivable areas that have clear lane markings. This paper proposes an unsupervised approach for detecting the drivable area utilizing both image data from a monocular camera and point cloud data from a 3D-LIDAR scanner. Our approach locates initial drivable areas based on a “direction ray map” obtained by image-LIDAR data fusion. In addition, a feature-level fusion is applied for more robust performance. Once the initial drivable areas are described by different features, the feature fusion problem is formulated as a Markov network and a belief propagation algorithm is developed to perform the model inference. Our approach is unsupervised and avoids common assumptions, yet achieves state-of-the-art results on the ROAD-KITTI benchmark. Experiments show that our unsupervised approach is efficient and robust for detecting the drivable area for self-driving cars. |
Tasks | Self-Driving Cars |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00451v1 |
http://arxiv.org/pdf/1705.00451v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-drivable-area-for-self-driving-cars |
Repo | |
Framework | |
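As a toy stand-in for the Markov-network fusion step, the sketch below fuses two noisy per-cell drivability cues along a single ray and finds the best binary labelling with min-sum dynamic programming (exact belief propagation on a chain). The cue model, unary costs, and smoothness penalty are assumptions; the paper's graph structure and features are richer.

```python
import numpy as np

def chain_map_labels(unary, pairwise_penalty=1.0):
    """Min-sum DP (exact BP on a chain) for per-cell labels given unary costs (n_cells, n_labels)."""
    n, k = unary.shape
    cost = unary.astype(float).copy()
    back = np.zeros((n, k), dtype=int)
    for i in range(1, n):
        for label in range(k):
            trans = cost[i - 1] + pairwise_penalty * (np.arange(k) != label)  # smoothness term
            back[i, label] = trans.argmin()
            cost[i, label] += trans.min()
    labels = np.zeros(n, dtype=int)
    labels[-1] = cost[-1].argmin()
    for i in range(n - 2, -1, -1):
        labels[i] = back[i + 1, labels[i + 1]]
    return labels

# Fuse two noisy drivability cues (e.g. an image cue and a LIDAR cue) into unary costs per cell.
rng = np.random.default_rng(0)
truth = np.array([1] * 12 + [0] * 8)                    # drivable near the car, blocked beyond
image_cue = truth + rng.normal(0, 0.4, 20)
lidar_cue = truth + rng.normal(0, 0.4, 20)
unary = np.stack([image_cue + lidar_cue,                # cost of labelling a cell non-drivable
                  2 - image_cue - lidar_cue], axis=1)   # cost of labelling a cell drivable
print(chain_map_labels(unary))
print(truth)
```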