Paper Group ANR 973
Topological Tracking of Connected Components in Image Sequences. Generalization of an Upper Bound on the Number of Nodes Needed to Achieve Linear Separability. Some Requests for Machine Learning Research from the East African Tech Scene. Neural Kinematic Networks for Unsupervised Motion Retargetting. Finding Better Subword Segmentation for Neural M …
Topological Tracking of Connected Components in Image Sequences
Title | Topological Tracking of Connected Components in Image Sequences |
Authors | Rocio Gonzalez-Diaz, Maria-Jose Jimenez, Belen Medrano |
Abstract | Persistent homology provides information about the lifetime of homology classes along a filtration of cell complexes. The persistence barcode is a graphical representation of this information. A filtration might be determined by time in a set of spatiotemporal data, but classical methods for computing persistent homology do not respect the fact that we cannot move backwards in time. In this paper, taking as input a time-varying sequence of two-dimensional (2D) binary digital images, we develop an algorithm for encoding, in the so-called {\it spatiotemporal barcode}, the lifetime of connected components (of either the foreground or background) that are moving in the image sequence over time (this information may not coincide with that provided by the persistence barcode). This way, given a connected component at a specific time in the sequence, we can track the component backwards in time until the moment it was born, by what we call a {\it spatiotemporal path}. The main contribution of this paper with respect to our previous works lies in a new algorithm that computes spatiotemporal paths directly; it is valid for both foreground and background and is developed in a general context, laying the groundwork for a future extension to tracking higher-dimensional topological features in $nD$ binary digital image sequences. |
Tasks | |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.00939v1 |
http://arxiv.org/pdf/1801.00939v1.pdf | |
PWC | https://paperswithcode.com/paper/topological-tracking-of-connected-components |
Repo | |
Framework | |
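As a point of reference for the persistence machinery this entry relies on, here is a minimal sketch of computing 0-dimensional persistence pairs (birth and death of connected components) with a union-find structure over a filtered graph. This is standard elder-rule bookkeeping, not the paper's spatiotemporal algorithm, and the toy input is an assumption.

```python
# Minimal 0-dimensional persistence (connected-component lifetimes) via union-find.
# Illustrative sketch only -- not the spatiotemporal algorithm from the paper.

def zero_dim_barcode(vertex_births, edges):
    """vertex_births: dict vertex -> filtration value at which it appears.
    edges: list of (u, v, filtration_value)."""
    parent = {v: v for v in vertex_births}
    birth = dict(vertex_births)          # birth time of each component's representative

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    bars = []
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                      # edge creates a cycle, no 0-dim event
        if birth[ru] > birth[rv]:         # elder rule: the younger component dies at time t
            ru, rv = rv, ru
        bars.append((birth[rv], t))       # (birth, death) of the dying component
        parent[rv] = ru
    roots = {find(v) for v in vertex_births}
    bars.extend((birth[r], float("inf")) for r in roots)   # surviving components
    return bars

if __name__ == "__main__":
    births = {"a": 0.0, "b": 0.0, "c": 1.0}
    edges = [("a", "b", 2.0), ("b", "c", 3.0)]
    print(zero_dim_barcode(births, edges))  # [(0.0, 2.0), (1.0, 3.0), (0.0, inf)]
```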
Generalization of an Upper Bound on the Number of Nodes Needed to Achieve Linear Separability
Title | Generalization of an Upper Bound on the Number of Nodes Needed to Achieve Linear Separability |
Authors | Marjolein Troost, Katja Seeliger, Marcel van Gerven |
Abstract | An important issue in neural network research is how to choose the number of nodes and layers so as to solve a classification problem. We provide new intuitions based on earlier results by An et al. (2015) by deriving an upper bound on the number of nodes in networks with two hidden layers such that linear separability can be achieved. Concretely, we show that if the data can be described in terms of N finite sets and the activation function f used is non-constant, increasing, and has a left asymptote, we can derive how many nodes are needed to linearly separate these sets. This is an upper bound that depends on the structure of the data, which can be analyzed using an algorithm. For the leaky rectified linear activation function, we prove separately that under some conditions on the slope, the same number of layers and nodes as for the aforementioned activation functions is sufficient. We empirically validate our claims. |
Tasks | |
Published | 2018-02-10 |
URL | http://arxiv.org/abs/1802.03488v1 |
http://arxiv.org/pdf/1802.03488v1.pdf | |
PWC | https://paperswithcode.com/paper/generalization-of-an-upper-bound-on-the |
Repo | |
Framework | |
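The paper's bound concerns making finite sets linearly separable. As a small companion sketch (an illustration, not the paper's construction or its node-count bound), one can check whether two finite point sets are already linearly separable by solving a feasibility linear program with SciPy:

```python
# Check linear separability of two finite point sets via an LP feasibility problem.
# Illustrative sketch only; the paper's node-count bound is not computed here.
import numpy as np
from scipy.optimize import linprog

def linearly_separable(X_pos, X_neg):
    """Return True if some hyperplane w.x + b strictly separates the two finite sets."""
    X_pos, X_neg = np.asarray(X_pos, float), np.asarray(X_neg, float)
    d = X_pos.shape[1]
    # Variables: [w_1..w_d, b]. Require w.x + b >= 1 (positives) and w.x + b <= -1 (negatives).
    A_ub = np.vstack([np.hstack([-X_pos, -np.ones((len(X_pos), 1))]),
                      np.hstack([ X_neg,  np.ones((len(X_neg), 1))])])
    b_ub = -np.ones(len(X_pos) + len(X_neg))
    res = linprog(c=np.zeros(d + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + 1), method="highs")
    return res.status == 0  # 0 = feasible solution found, 2 = infeasible

if __name__ == "__main__":
    print(linearly_separable([[0, 0], [1, 0]], [[0, 2], [1, 2]]))  # True
    print(linearly_separable([[0, 0], [1, 1]], [[0, 1], [1, 0]]))  # False (XOR)
```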
Some Requests for Machine Learning Research from the East African Tech Scene
Title | Some Requests for Machine Learning Research from the East African Tech Scene |
Authors | Milan Cvitkovic |
Abstract | Based on 46 in-depth interviews with scientists, engineers, and CEOs, this document presents a list of concrete machine learning research problems, progress on which would directly benefit tech ventures in East Africa. |
Tasks | |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.11383v2 |
http://arxiv.org/pdf/1810.11383v2.pdf | |
PWC | https://paperswithcode.com/paper/some-requests-for-machine-learning-research |
Repo | |
Framework | |
Neural Kinematic Networks for Unsupervised Motion Retargetting
Title | Neural Kinematic Networks for Unsupervised Motion Retargetting |
Authors | Ruben Villegas, Jimei Yang, Duygu Ceylan, Honglak Lee |
Abstract | We propose a recurrent neural network architecture with a Forward Kinematics layer and cycle consistency based adversarial training objective for unsupervised motion retargetting. Our network captures the high-level properties of an input motion by the forward kinematics layer, and adapts them to a target character with different skeleton bone lengths (e.g., shorter, longer arms etc.). Collecting paired motion training sequences from different characters is expensive. Instead, our network utilizes cycle consistency to learn to solve the Inverse Kinematics problem in an unsupervised manner. Our method works online, i.e., it adapts the motion sequence on-the-fly as new frames are received. In our experiments, we use the Mixamo animation data to test our method for a variety of motions and characters and achieve state-of-the-art results. We also demonstrate motion retargetting from monocular human videos to 3D characters using an off-the-shelf 3D pose estimator. |
Tasks | |
Published | 2018-04-16 |
URL | http://arxiv.org/abs/1804.05653v1 |
http://arxiv.org/pdf/1804.05653v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-kinematic-networks-for-unsupervised |
Repo | |
Framework | |
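The core building block in this entry is a forward kinematics layer. As a plain illustration of the operation such a layer performs (not the paper's differentiable implementation or its Mixamo skeleton), here is a minimal NumPy sketch that turns per-joint local rotations and bone offsets into global joint positions over a joint hierarchy:

```python
# Minimal forward kinematics: local joint rotations + bone offsets -> global joint positions.
# Illustrative sketch of what a forward kinematics layer computes; not the paper's code.
import numpy as np

def forward_kinematics(parents, offsets, local_rotations, root_position):
    """parents[i]        : index of joint i's parent (-1 for the root)
    offsets[i]           : bone offset of joint i in its parent's frame, shape (3,)
    local_rotations[i]   : 3x3 rotation of joint i relative to its parent
    Returns global positions, shape (num_joints, 3)."""
    n = len(parents)
    global_rot = [None] * n
    global_pos = np.zeros((n, 3))
    for i in range(n):                       # assumes parents appear before children
        if parents[i] == -1:
            global_rot[i] = local_rotations[i]
            global_pos[i] = root_position
        else:
            p = parents[i]
            global_rot[i] = global_rot[p] @ local_rotations[i]
            global_pos[i] = global_pos[p] + global_rot[p] @ offsets[i]
    return global_pos

if __name__ == "__main__":
    # Toy 3-joint chain: root -> elbow -> hand, with a 90-degree bend at the root.
    rot_z90 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)
    identity = np.eye(3)
    pos = forward_kinematics(parents=[-1, 0, 1],
                             offsets=np.array([[0, 0, 0], [1, 0, 0], [1, 0, 0]], float),
                             local_rotations=[rot_z90, identity, identity],
                             root_position=np.zeros(3))
    print(pos)  # root at the origin, elbow at (0, 1, 0), hand at (0, 2, 0)
```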
Finding Better Subword Segmentation for Neural Machine Translation
Title | Finding Better Subword Segmentation for Neural Machine Translation |
Authors | Yingting Wu, Hai Zhao |
Abstract | For different language pairs, word-level neural machine translation (NMT) models with a fixed-size vocabulary suffer from the same problem of representing out-of-vocabulary (OOV) words. The common practice is to replace all these rare or unknown words with a special token, which limits translation performance to some extent. Most recent work has handled this problem by splitting words into characters or other specially extracted subword units to enable open-vocabulary translation. Byte pair encoding (BPE) is one successful attempt that has proven extremely competitive by providing effective subword segmentation for NMT systems. In this paper, we extend BPE-style segmentation to a general unsupervised framework with three statistical measures: frequency (FRQ), accessor variety (AV) and description length gain (DLG). We test our approach on two translation tasks: German to English and Chinese to English. The experimental results show that the AV- and DLG-enhanced systems outperform the FRQ baseline in the frequency-weighted schemes at different significance levels. |
Tasks | Machine Translation |
Published | 2018-07-25 |
URL | http://arxiv.org/abs/1807.09639v1 |
http://arxiv.org/pdf/1807.09639v1.pdf | |
PWC | https://paperswithcode.com/paper/finding-better-subword-segmentation-for |
Repo | |
Framework | |
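For reference on the BPE-style segmentation this paper generalizes, here is a minimal sketch of frequency-based BPE merge learning, in the spirit of the FRQ baseline; the AV and DLG measures are not implemented, and the toy corpus is an assumption:

```python
# Minimal frequency-based BPE merge learning (in the spirit of the FRQ baseline).
# The paper's AV and DLG measures are not implemented here; the toy corpus is illustrative.
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """word_freqs: dict word -> count. Returns the list of learned merges."""
    vocab = {tuple(word) + ("</w>",): f for word, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)         # most frequent adjacent symbol pair
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():       # apply the merge everywhere
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1]); i += 2
                else:
                    out.append(symbols[i]); i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

if __name__ == "__main__":
    corpus = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    print(learn_bpe(corpus, 5))  # e.g. [('e', 's'), ('es', 't'), ('est', '</w>'), ('l', 'o'), ...]
```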
SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection
Title | SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection |
Authors | Xiaowei Hu, Xuemiao Xu, Yongjie Xiao, Hao Chen, Shengfeng He, Jing Qin, Pheng-Ann Heng |
Abstract | Vision-based vehicle detection approaches have achieved great success in recent years with the development of deep convolutional neural networks (CNNs). However, existing CNN-based algorithms suffer from the problem that convolutional features are scale-sensitive in the object detection task, while traffic images and videos commonly contain vehicles with a large variance of scales. In this paper, we delve into the source of scale sensitivity and reveal two key issues: 1) existing RoI pooling destroys the structure of small-scale objects, and 2) the large intra-class distance for a large variance of scales exceeds the representation capability of a single network. Based on these findings, we present a scale-insensitive convolutional neural network (SINet) for fast detection of vehicles with a large variance of scales. First, we present a context-aware RoI pooling to maintain the contextual information and original structure of small-scale objects. Second, we present a multi-branch decision network to minimize the intra-class distance of features. These lightweight techniques add no extra time complexity yet bring a prominent improvement in detection accuracy. The proposed techniques can be incorporated into any deep network architecture and trained end-to-end. Our SINet achieves state-of-the-art performance in terms of accuracy and speed (up to 37 FPS) on the KITTI benchmark and a new highway dataset, which contains a large variance of scales and extremely small objects. |
Tasks | Fast Vehicle Detection, Object Detection |
Published | 2018-04-02 |
URL | http://arxiv.org/abs/1804.00433v2 |
http://arxiv.org/pdf/1804.00433v2.pdf | |
PWC | https://paperswithcode.com/paper/sinet-a-scale-insensitive-convolutional |
Repo | |
Framework | |
Concept-Oriented Deep Learning: Generative Concept Representations
Title | Concept-Oriented Deep Learning: Generative Concept Representations |
Authors | Daniel T. Chang |
Abstract | Generative concept representations have three major advantages over discriminative ones: they can represent uncertainty, they support integration of learning and reasoning, and they are good for unsupervised and semi-supervised learning. We discuss probabilistic and generative deep learning, which generative concept representations are based on, and the use of variational autoencoders and generative adversarial networks for learning generative concept representations, particularly for concepts whose data are sequences, structured data or graphs. |
Tasks | |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06622v1 |
http://arxiv.org/pdf/1811.06622v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-oriented-deep-learning-generative |
Repo | |
Framework | |
Ontology Matching Techniques: A Gold Standard Model
Title | Ontology Matching Techniques: A Gold Standard Model |
Authors | Alok Chauhan, Vijayakumar V, Layth Sliman |
Abstract | Typically, an ontology matching technique is a combination of many different types of matchers operating at various abstraction levels, such as structure, semantics, syntax, and instances. An ontology matching technique that employs matchers at all possible abstraction levels is expected, in general, to give the best results in terms of precision, recall, and F-measure, owing to the increased matching opportunities, provided we discount efficiency issues, which may be mitigated with better computing resources such as parallel processing. A gold standard ontology matching model is derived from a model classification of ontology matching techniques. A suitable metric is also defined based on the gold standard ontology matching model. A review of various ontology matching techniques specified in recent research papers in the area was undertaken to categorize each ontology matching technique as per the newly proposed gold standard model, and a metric value for the whole group was computed. The results of the above study support the proposed gold standard ontology matching model. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10191v1 |
http://arxiv.org/pdf/1811.10191v1.pdf | |
PWC | https://paperswithcode.com/paper/ontology-matching-techniques-a-gold-standard |
Repo | |
Framework | |
An Algorithm for Learning Shape and Appearance Models without Annotations
Title | An Algorithm for Learning Shape and Appearance Models without Annotations |
Authors | John Ashburner, Mikael Brudfors, Kevin Bronik, Yael Balbastre |
Abstract | This paper presents a framework for automatically learning shape and appearance models for medical (and certain other) images. It is based on the idea that having a more accurate shape and appearance model leads to more accurate image registration, which in turn leads to a more accurate shape and appearance model. This leads naturally to an iterative scheme, which is based on a probabilistic generative model that is fit using Gauss-Newton updates within an EM-like framework. It was developed with the aim of enabling distributed privacy-preserving analysis of brain image data, such that shared information (shape and appearance basis functions) may be passed across sites, whereas latent variables that encode individual images remain secure within each site. These latent variables are proposed as features for privacy-preserving data mining applications. The approach is demonstrated qualitatively on the KDEF dataset of 2D face images, showing that it can align images that traditionally require shape and appearance models trained using manually annotated data (manually defined landmarks, etc.). It is applied to the MNIST dataset of handwritten digits to show its potential for machine learning applications, particularly when training data is limited. The model is able to handle "missing data", which allows it to be cross-validated according to how well it can predict left-out voxels. The suitability of the derived features for classifying individuals into patient groups was assessed by applying it to a dataset of over 1,900 segmented T1-weighted MR images, which included images from the COBRE and ABIDE datasets. |
Tasks | Image Registration |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10731v1 |
http://arxiv.org/pdf/1807.10731v1.pdf | |
PWC | https://paperswithcode.com/paper/an-algorithm-for-learning-shape-and |
Repo | |
Framework | |
Using Sentiment Induction to Understand Variation in Gendered Online Communities
Title | Using Sentiment Induction to Understand Variation in Gendered Online Communities |
Authors | Li Lucy, Julia Mendelsohn |
Abstract | We analyze gendered communities defined in three different ways: text, users, and sentiment. Differences across these representations reveal facets of communities’ distinctive identities, such as social group, topic, and attitudes. Two communities may have high text similarity but not user similarity or vice versa, and word usage also does not vary according to a clearcut, binary perspective of gender. Community-specific sentiment lexicons demonstrate that sentiment can be a useful indicator of words’ social meaning and community values, especially in the context of discussion content and user demographics. Our results show that social platforms such as Reddit are active settings for different constructions of gender. |
Tasks | |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.07061v1 |
http://arxiv.org/pdf/1811.07061v1.pdf | |
PWC | https://paperswithcode.com/paper/using-sentiment-induction-to-understand |
Repo | |
Framework | |
Forecasting People’s Needs in Hurricane Events from Social Network
Title | Forecasting People’s Needs in Hurricane Events from Social Network |
Authors | Long Nguyen, Zhou Yang, Jia Li, Guofeng Cao, Fang Jin |
Abstract | Social networks can serve as a valuable communication channel for calls for help, offers of assistance, and coordinating rescue activities during a disaster. Social networks such as Twitter allow users to continuously update relevant information, which is especially useful during a crisis, where rapidly changing conditions make it crucial to access accurate information promptly. Social media helps those directly affected to inform others of conditions on the ground in real time, and thus enables rescue workers to coordinate their efforts more effectively, better meeting the survivors' needs. This paper presents a new sequence-to-sequence-based framework for forecasting people's needs during disasters using social media and weather data. It consists of two Long Short-Term Memory (LSTM) models, one of which encodes input sequences of weather information while the other acts as a conditional decoder that decodes the encoded vector and forecasts the survivors' needs. Case studies utilizing data collected during Hurricane Sandy in 2012 and Hurricanes Harvey and Irma in 2017 were analyzed, and the results were compared with those obtained using a statistical n-gram language model and an LSTM generative model. Our proposed sequence-to-sequence method forecasts people's needs more successfully than either of the other models. This new approach shows great promise for enhancing disaster management activities such as evacuation planning and commodity flow management. |
Tasks | Language Modelling |
Published | 2018-11-12 |
URL | http://arxiv.org/abs/1811.04577v1 |
http://arxiv.org/pdf/1811.04577v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-peoples-needs-in-hurricane-events |
Repo | |
Framework | |
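A minimal PyTorch sketch of the kind of LSTM encoder-decoder this entry describes, with a weather sequence encoded and a sequence of need categories decoded; the layer sizes, feature dimensions, and categorical output head are assumptions rather than the authors' architecture:

```python
# Minimal LSTM encoder-decoder sketch: encode a weather sequence, decode a sequence of
# predicted need categories. Sizes and the categorical head are illustrative assumptions.
import torch
import torch.nn as nn

class NeedsForecaster(nn.Module):
    def __init__(self, weather_dim=8, hidden_dim=64, num_need_categories=20):
        super().__init__()
        self.encoder = nn.LSTM(weather_dim, hidden_dim, batch_first=True)
        self.embed = nn.Embedding(num_need_categories, hidden_dim)
        self.decoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, num_need_categories)

    def forward(self, weather_seq, need_history):
        # weather_seq: (batch, T_in, weather_dim); need_history: (batch, T_out) int ids
        _, state = self.encoder(weather_seq)          # encoder state conditions the decoder
        dec_in = self.embed(need_history)
        dec_out, _ = self.decoder(dec_in, state)
        return self.out(dec_out)                      # (batch, T_out, num_need_categories)

if __name__ == "__main__":
    model = NeedsForecaster()
    weather = torch.randn(4, 24, 8)                   # 24 time steps of 8 weather features
    prev_needs = torch.randint(0, 20, (4, 10))        # previous need-category ids (teacher forcing)
    targets = torch.randint(0, 20, (4, 10))           # next-step need categories to predict
    logits = model(weather, prev_needs)
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 20), targets.reshape(-1))
    print(logits.shape, float(loss))
```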
Benchmark Dataset for Automatic Damaged Building Detection from Post-Hurricane Remotely Sensed Imagery
Title | Benchmark Dataset for Automatic Damaged Building Detection from Post-Hurricane Remotely Sensed Imagery |
Authors | Sean Andrew Chen, Andrew Escay, Christopher Haberland, Tessa Schneider, Valentina Staneva, Youngjun Choe |
Abstract | Rapid damage assessment is of crucial importance to emergency responders during hurricane events; however, the evaluation process is often slow, labor-intensive, costly, and error-prone. New advances in computer vision and remote sensing open possibilities to observe the Earth at a different scale. However, substantial pre-processing work is still required in order to apply state-of-the-art methodology for emergency response. To enable the comparison of methods for automatic detection of damaged buildings from post-hurricane remote sensing imagery taken from both airborne and satellite sensors, this paper presents the development of benchmark datasets from publicly available data. The major contributions of this work include (1) a scalable framework for creating benchmark datasets of hurricane-damaged buildings and (2) public sharing of the resulting benchmark datasets for the Greater Houston area after Hurricane Harvey in 2017. The proposed approach can be used to build other hurricane-damaged building datasets on which researchers can train and test object detection models to automatically identify damaged buildings. |
Tasks | Damaged Building Detection, Object Detection |
Published | 2018-12-13 |
URL | http://arxiv.org/abs/1812.05581v1 |
http://arxiv.org/pdf/1812.05581v1.pdf | |
PWC | https://paperswithcode.com/paper/benchmark-dataset-for-automatic-damaged |
Repo | |
Framework | |
Distributed traffic light control at uncoupled intersections with real-world topology by deep reinforcement learning
Title | Distributed traffic light control at uncoupled intersections with real-world topology by deep reinforcement learning |
Authors | Mark Schutera, Niklas Goby, Stefan Smolarek, Markus Reischl |
Abstract | This work examines the implications of uncoupled intersections with local real-world topology and sensor setup on traffic light control approaches. Control approaches are evaluated with respect to traffic flow, fuel consumption, and noise emission at intersections. The real-world road network of Friedrichshafen is depicted and preprocessed, and the existing traffic-light-controlled intersections are modeled with respect to state space and action space. Different strategies, comprising fixed-time, gap-based, and time-based control approaches as well as our deep reinforcement learning (DRL) based control approach, are implemented and assessed. Our novel DRL approach allows for modeling the TLC action space with respect to phase selection as well as selection of transition timings. It was found that real-world topologies, and thus irregularly arranged intersections, have an influence on the performance of traffic light control approaches, even within the same intersection types (n-arm, m-phases). Moreover, we show that these influences can be dealt with efficiently by our deep reinforcement learning based control approach. |
Tasks | |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11233v1 |
http://arxiv.org/pdf/1811.11233v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-traffic-light-control-at |
Repo | |
Framework | |
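As a deliberately simplified illustration of reinforcement-learning-based phase selection (tabular Q-learning on a toy two-queue intersection, nothing like the paper's deep RL setup on the real Friedrichshafen network), consider:

```python
# Toy tabular Q-learning for selecting a traffic-light phase at a single intersection.
# A simplified illustration only; the paper uses deep RL on real-world topology and sensors.
import random

def step(queues, phase, arrival_prob=0.4, service=3):
    """queues: (north_south, east_west). The chosen phase serves one direction."""
    ns, ew = queues
    ns += sum(random.random() < arrival_prob for _ in range(2))   # random arrivals
    ew += sum(random.random() < arrival_prob for _ in range(2))
    if phase == 0:
        ns = max(0, ns - service)
    else:
        ew = max(0, ew - service)
    queues = (min(ns, 10), min(ew, 10))        # cap queues to keep the state space tiny
    reward = -(queues[0] + queues[1])          # minimise total waiting vehicles
    return queues, reward

def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1):
    q = {}                                     # state (ns, ew) -> [value of phase 0, phase 1]
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(50):
            q.setdefault(state, [0.0, 0.0])
            a = random.randrange(2) if random.random() < epsilon \
                else max((0, 1), key=lambda i: q[state][i])
            nxt, r = step(state, a)
            q.setdefault(nxt, [0.0, 0.0])
            q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q

if __name__ == "__main__":
    q = train()
    sample = q.get((6, 1), [0.0, 0.0])
    print("Q-values at queues (NS=6, EW=1):", sample)  # should favour phase 0 (serve NS)
```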
Incept-N: A Convolutional Neural Network based Classification Approach for Predicting Nationality from Facial Features
Title | Incept-N: A Convolutional Neural Network based Classification Approach for Predicting Nationality from Facial Features |
Authors | Masum Shah Junayed, Afsana Ahsan Jeny, Nafis Neehal |
Abstract | The nationality of a human being is a well-known identifying characteristic used for every major authentication purpose in every country. Despite advances in the application of Artificial Intelligence and Computer Vision in various areas, their contribution to this specific security procedure is yet to be cultivated. With the goal of successfully applying computer vision techniques to predict the nationality of a person based on their facial features, we propose this novel method and achieve an average accuracy of 93.6% with a very low misclassification rate. |
Tasks | |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07426v1 |
http://arxiv.org/pdf/1805.07426v1.pdf | |
PWC | https://paperswithcode.com/paper/incept-n-a-convolutional-neural-network-based |
Repo | |
Framework | |
Robust Fruit Counting: Combining Deep Learning, Tracking, and Structure from Motion
Title | Robust Fruit Counting: Combining Deep Learning, Tracking, and Structure from Motion |
Authors | Xu Liu, Steven W. Chen, Shreyas Aditya, Nivedha Sivakumar, Sandeep Dcunha, Chao Qu, Camillo J. Taylor, Jnaneshwar Das, Vijay Kumar |
Abstract | We present a novel fruit counting pipeline that combines deep segmentation, frame-to-frame tracking, and 3D localization to accurately count visible fruits across a sequence of images. Our pipeline works on image streams from a monocular camera, both in natural light and with controlled illumination at night. We first train a Fully Convolutional Network (FCN) to segment video frame images into fruit and non-fruit pixels. We then track fruits across frames using the Hungarian Algorithm, where the objective cost is determined from a Kalman Filter corrected Kanade-Lucas-Tomasi (KLT) Tracker. To correct the estimated count from the tracking process, we combine tracking results with a Structure from Motion (SfM) algorithm to calculate relative 3D locations and size estimates, rejecting outliers and double-counted fruit tracks. We evaluate our algorithm by comparing with ground-truth human-annotated visual counts. Our results demonstrate that our pipeline is able to accurately and reliably count fruits across image sequences, and that the correction step can significantly improve counting accuracy and robustness. Although discussed in the context of fruit counting, our work can extend to detection, tracking, and counting of a variety of other stationary features of interest such as leaf spots, wilt, and blossoms. |
Tasks | |
Published | 2018-04-01 |
URL | http://arxiv.org/abs/1804.00307v2 |
http://arxiv.org/pdf/1804.00307v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-fruit-counting-combining-deep-learning |
Repo | |
Framework | |
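A minimal sketch of the Hungarian-algorithm assignment step used for frame-to-frame tracking in this entry, with plain Euclidean distance between centroids as the cost; the paper instead derives the cost from a Kalman-filter-corrected KLT tracker, which is not reproduced here:

```python
# Frame-to-frame assignment of detections to tracks with the Hungarian algorithm.
# Cost here is plain Euclidean distance between centroids; the paper derives the cost
# from a Kalman-filter-corrected KLT tracker, which this sketch does not reproduce.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_tracks(track_centroids, detection_centroids, max_distance=30.0):
    """Returns (matches, unmatched_tracks, unmatched_detections)."""
    tracks = np.asarray(track_centroids, float)
    dets = np.asarray(detection_centroids, float)
    cost = np.linalg.norm(tracks[:, None, :] - dets[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)        # minimal total-distance assignment
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_distance]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_dets = [j for j in range(len(dets)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets

if __name__ == "__main__":
    prev = [(10, 10), (50, 50)]
    curr = [(52, 48), (12, 11), (200, 200)]         # third detection starts a new track
    print(match_tracks(prev, curr))  # ([(0, 1), (1, 0)], [], [2])
```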