Paper Group ANR 1148
Shapelets for earthquake detection
Title | Shapelets for earthquake detection |
Authors | Monica Arul, Ahsan Kareem |
Abstract | This paper introduces EQShapelets (EarthQuake Shapelets), a time-series, shape-based approach embedded in machine learning to autonomously detect earthquakes. It promises to overcome the challenges in the field of seismology related to automated detection and cataloging of earthquakes. EQShapelets are amplitude- and phase-independent, i.e., their detection sensitivity does not depend on the magnitude of the earthquake or its time of occurrence. They are also robust to noise and other spurious signals. The detection capability of EQShapelets is tested on one week of continuous seismic data provided by the Northern California Seismic Network (NCSN), obtained from a station in central California near the Calaveras Fault. EQShapelets combined with a Random Forest classifier detected all of the cataloged earthquakes and 281 uncataloged events with a lower false detection rate, thus offering better performance than the autocorrelation and FAST algorithms. The primary advantage of EQShapelets over competing methods is the interpretability and insight they offer. Shape-based approaches are intuitive and visually meaningful, and they offer immediate insight into the problem domain that goes beyond accurate detection. EQShapelets, if implemented at a large scale, can significantly reduce catalog completeness magnitudes and can serve as an effective tool for near real-time earthquake monitoring and cataloging. |
Tasks | Time Series |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.09086v1 |
https://arxiv.org/pdf/1911.09086v1.pdf | |
PWC | https://paperswithcode.com/paper/shapelets-for-earthquake-detection |
Repo | |
Framework | |
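
A rough sketch of the detection pipeline the abstract describes: each candidate window of the seismic trace is scored by its minimum sliding-window distance to every shapelet, and the resulting distance vector feeds a Random Forest. The shapelets, window length, and data below are hypothetical placeholders, not the paper's learned EQShapelets; z-normalizing each subsequence is one standard way to obtain the amplitude independence the abstract claims.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def shapelet_distance(series, shapelet):
    """Minimum z-normalized Euclidean distance between the shapelet and
    any equal-length subsequence of the series."""
    m = len(shapelet)
    s = (shapelet - shapelet.mean()) / (shapelet.std() + 1e-8)
    best = np.inf
    for i in range(len(series) - m + 1):
        w = series[i:i + m]
        w = (w - w.mean()) / (w.std() + 1e-8)
        best = min(best, float(np.linalg.norm(w - s)))
    return best

def featurize(windows, shapelets):
    # One distance feature per (candidate window, shapelet) pair.
    return np.array([[shapelet_distance(w, s) for s in shapelets]
                     for w in windows])

# Hypothetical stand-ins for seismic windows, labels, and learned shapelets.
rng = np.random.default_rng(0)
windows = [rng.standard_normal(400) for _ in range(64)]
labels = rng.integers(0, 2, size=64)          # 1 = earthquake, 0 = noise
shapelets = [rng.standard_normal(50) for _ in range(5)]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(featurize(windows, shapelets), labels)
```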
A greedy constructive algorithm for the optimization of neural network architectures
Title | A greedy constructive algorithm for the optimization of neural network architectures |
Authors | Massimiliano Lupo Pasini, Junqi Yin, Ying Wai Li, Markus Eisenbach |
Abstract | In this work we propose a new method to optimize the architecture of an artificial neural network. The proposed algorithm, called Greedy Search for Neural Network Architecture, aims to minimize the complexity of the architecture search and the complexity of the final selected model without compromising predictive performance. The reduction in computational cost makes this approach appealing for two reasons. Firstly, domain scientists need to easily interpret predictions returned by a deep learning model, and this tends to be cumbersome when neural networks have complex structures. Secondly, the use of neural networks is challenging in situations with compute/memory limitations. Promising numerical results show that our method is competitive with other hyperparameter optimization algorithms in attainable performance and computational cost. We also generalize the definition of adjusted score from linear regression models to neural networks. Numerical experiments show that the adjusted score can steer the greedy search toward smaller architectures over larger ones without compromising predictive performance. |
Tasks | Hyperparameter Optimization |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03306v1 |
https://arxiv.org/pdf/1909.03306v1.pdf | |
PWC | https://paperswithcode.com/paper/a-greedy-constructive-algorithm-for-the |
Repo | |
Framework | |
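
The abstract's adjusted score generalizes the adjusted R² of linear regression, which discounts fit by parameter count; a greedy search can then use it as a stopping rule, halting when a larger architecture no longer earns its extra capacity. The sketch below is an illustration under that classical definition, with a hypothetical `train_and_eval` callback, not the paper's exact generalization.

```python
import numpy as np

def adjusted_r2(y_true, y_pred, n_params):
    """Classical adjusted R^2: penalizes the parameter count so larger
    models must improve the raw fit to score higher."""
    n = len(y_true)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)

def greedy_width_search(train_and_eval, max_width=256):
    """Grow the hidden layer geometrically; keep growing only while the
    size-adjusted validation score improves. `train_and_eval(width)` is a
    user-supplied function returning (y_true, y_pred, n_params)."""
    best_score, best_width, width = -np.inf, None, 1
    while width <= max_width:
        score = adjusted_r2(*train_and_eval(width))
        if score <= best_score:
            break  # the adjusted score stopped improving: stop greedily
        best_score, best_width = score, width
        width *= 2
    return best_width, best_score
```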
Improving 3D Object Detection for Pedestrians with Virtual Multi-View Synthesis Orientation Estimation
Title | Improving 3D Object Detection for Pedestrians with Virtual Multi-View Synthesis Orientation Estimation |
Authors | Jason Ku, Alex D. Pon, Sean Walsh, Steven L. Waslander |
Abstract | Accurately estimating the orientation of pedestrians is an important and challenging task for autonomous driving because this information is essential for tracking and predicting pedestrian behavior. This paper presents a flexible Virtual Multi-View Synthesis module that can be adopted into 3D object detection methods to improve orientation estimation. The module uses a multi-step process to acquire the fine-grained semantic information required for accurate orientation estimation. First, the scene’s point cloud is densified using a structure preserving depth completion algorithm and each point is colorized using its corresponding RGB pixel. Next, virtual cameras are placed around each object in the densified point cloud to generate novel viewpoints, which preserve the object’s appearance. We show that this module greatly improves the orientation estimation on the challenging pedestrian class on the KITTI benchmark. When used with the open-source 3D detector AVOD-FPN, we outperform all other published methods on the pedestrian Orientation, 3D, and Bird’s Eye View benchmarks. |
Tasks | 3D Object Detection, Autonomous Driving, Depth Completion, Object Detection |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06777v1 |
https://arxiv.org/pdf/1907.06777v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-3d-object-detection-for-pedestrians |
Repo | |
Framework | |
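
The "virtual cameras placed around each object" step can be pictured with a little geometry: cameras sit on a circle around the object's centroid in the densified point cloud, each aimed at the object, and every pose yields a novel rendered view. The sketch below only generates hypothetical camera poses; rendering the colorized point cloud from each pose is omitted, and the radius/height values are illustrative, not the paper's settings.

```python
import numpy as np

def virtual_camera_poses(object_center, radius=2.0, num_views=8, height=1.6):
    """Place virtual cameras on a circle around an object and aim each one
    at the object's center. Returns (position, unit forward vector) pairs."""
    poses = []
    for k in range(num_views):
        angle = 2.0 * np.pi * k / num_views
        pos = object_center + np.array(
            [radius * np.cos(angle), radius * np.sin(angle), height])
        forward = object_center - pos
        poses.append((pos, forward / np.linalg.norm(forward)))
    return poses

# Eight novel viewpoints around a pedestrian at (10, 4, 0) in the point cloud.
views = virtual_camera_poses(np.array([10.0, 4.0, 0.0]))
```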
ORC Layout: Adaptive GUI Layout with OR-Constraints
Title | ORC Layout: Adaptive GUI Layout with OR-Constraints |
Authors | Yue Jiang, Ruofei Du, Christof Lutteroth, Wolfgang Stuerzlinger |
Abstract | We propose a novel approach for constraint-based graphical user interface (GUI) layout based on OR-constraints (ORC) in standard soft/hard linear constraint systems. ORC layout unifies grid layout and flow layout, supporting both their features as well as cases where grid and flow layouts individually fail. We describe ORC design patterns that enable designers to safely create flexible layouts that work across different screen sizes and orientations. We also present the ORC Editor, a GUI editor that enables designers to apply ORC in a safe and effective manner, mixing grid, flow and new ORC layout features as appropriate. We demonstrate that our prototype can adapt layouts to screens with different aspect ratios with only a single layout specification, easing the burden of GUI maintenance. Finally, we show that ORC specifications can be modified interactively and solved efficiently at runtime. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.07827v1 |
https://arxiv.org/pdf/1912.07827v1.pdf | |
PWC | https://paperswithcode.com/paper/orc-layout-adaptive-gui-layout-with-or |
Repo | |
Framework | |
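
To make the OR-constraint idea concrete, here is a toy encoding with the z3 solver: a hard OR says a widget either stays in the left column or wraps to the next row (the essence of a flow layout expressed in a constraint system), while a soft constraint encodes the designer's preference. The ORC Editor uses its own soft/hard linear constraint solver; z3 is only a stand-in here.

```python
from z3 import Optimize, Or, Real, sat

opt = Optimize()
x, y = Real('x'), Real('y')  # widget position

# Hard OR-constraint: the widget sits in the left column (x == 0)
# or wraps onto the next row (y == 100).
opt.add(Or(x == 0, y == 100))
opt.add(x >= 0, y >= 0)

# Soft constraint: prefer keeping the widget on the first row.
opt.add_soft(y == 0, weight=1)

if opt.check() == sat:
    print(opt.model())  # picks x = 0, y = 0: satisfies the OR cheaply
```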
Rethinking Exposure Bias In Language Modeling
Title | Rethinking Exposure Bias In Language Modeling |
Authors | Yifan Xu, Kening Zhang, Haoyu Dong, Yuezhou Sun, Wenlong Zhao, Zhuowen Tu |
Abstract | Exposure bias describes the phenomenon that a language model trained under the teacher-forcing schema may perform poorly at the inference stage, when its predictions are conditioned on its own previous predictions, unseen in the training corpus. Recently, several generative adversarial network (GAN) and reinforcement learning (RL) methods have been introduced to alleviate this problem. Nonetheless, a common issue in RL and GAN training is the sparsity of reward signals. In this paper, we adopt two simple strategies, multi-range reinforcing and multi-entropy sampling, to amplify and denoise the reward signal. Our model improves over competing models with regard to BLEU scores and road exam, a new metric we designed to measure robustness against exposure bias in language models. |
Tasks | Language Modelling |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.11235v2 |
https://arxiv.org/pdf/1910.11235v2.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-exposure-bias-in-language-modeling |
Repo | |
Framework | |
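
Exposure bias itself is easy to demonstrate with a toy autoregressive model: training conditions each step on the gold prefix (teacher forcing), while inference conditions on the model's own samples, so an early error shifts every later conditioning context. This sketch contrasts the two regimes with a hypothetical bigram model; it does not implement the paper's multi-range reinforcing or multi-entropy sampling.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 5  # toy vocabulary size
# Hypothetical learned bigram LM: row t holds P(next token | current token t).
probs = rng.dirichlet(np.ones(V), size=V)
gold = rng.integers(0, V, size=10)

# Teacher forcing (training): step t would be conditioned on gold[t].
# Free-running (inference): step t is conditioned on the model's own sample,
# so one early sampling error changes every later conditioning context.
tok, rollout = int(gold[0]), [int(gold[0])]
for _ in range(9):
    tok = int(rng.choice(V, p=probs[tok]))
    rollout.append(tok)

print("gold sequence:", gold.tolist())
print("model rollout:", rollout)  # diverges from contexts seen in training
```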
Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
Title | Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks |
Authors | Phan-Minh Nguyen |
Abstract | Can multilayer neural networks – typically constructed as highly complex structures with many nonlinearly activated neurons across layers – behave in a non-trivial way that yet simplifies away a major part of their complexities? In this work, we uncover a phenomenon in which the behavior of these complex networks – under suitable scalings and stochastic gradient descent dynamics – becomes independent of the number of neurons as this number grows sufficiently large. We develop a formalism in which this many-neurons limiting behavior is captured by a set of equations, thereby exposing a previously unknown operating regime of these networks. While the current pursuit is mathematically non-rigorous, it is complemented with several experiments that validate the existence of this behavior. |
Tasks | |
Published | 2019-02-07 |
URL | http://arxiv.org/abs/1902.02880v1 |
http://arxiv.org/pdf/1902.02880v1.pdf | |
PWC | https://paperswithcode.com/paper/mean-field-limit-of-the-learning-dynamics-of |
Repo | |
Framework | |
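
For intuition, the mean-field limit is best known in the two-layer case, which this paper extends to multiple layers: under a 1/n output scaling, the empirical distribution of the n neurons' parameters trained by SGD converges to a deterministic, n-independent flow of measures. A sketch of that standard two-layer formulation (not the paper's multilayer equations):

```latex
f(x;\boldsymbol{\theta}) = \frac{1}{n}\sum_{i=1}^{n} \sigma(x;\theta_i),
\qquad
\hat\rho_t = \frac{1}{n}\sum_{i=1}^{n} \delta_{\theta_i(t)}.
% As n \to \infty with suitably scaled SGD step sizes, \hat\rho_t
% converges to a deterministic limit \rho_t solving a continuity equation
\partial_t \rho_t = \nabla_\theta \cdot
  \bigl( \rho_t \, \nabla_\theta \Psi(\theta;\rho_t) \bigr),
% where \Psi is the expected-loss potential induced by the data distribution.
```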
DSNet: An Efficient CNN for Road Scene Segmentation
Title | DSNet: An Efficient CNN for Road Scene Segmentation |
Authors | Ping-Rong Chen, Hsueh-Ming Hang, Sheng-Wei Chan, Jing-Jhih Lin |
Abstract | Road scene understanding is a critical component of an autonomous driving system. Although deep learning-based road scene segmentation can achieve very high accuracy, its complexity is also very high for developing real-time applications. It is challenging to design a neural net with high accuracy and low computational complexity. To address this issue, we investigate the advantages and disadvantages of several popular CNN architectures in terms of speed, storage, and segmentation accuracy. We start from the Fully Convolutional Network (FCN) with VGG, and then we study ResNet and DenseNet. Through detailed experiments, we select the favorable components from the existing architectures and, in the end, construct a lightweight network architecture based on DenseNet. Our proposed network, called DSNet, demonstrates real-time inference on a popular GPU platform while maintaining accuracy comparable with most previous systems. We test our system on several datasets, including the challenging Cityscapes dataset (1024x512 resolution), where it reaches an mIoU of about 69.1% and a runtime of 0.0147 seconds per image on a single GTX 1080Ti. We also design a more accurate model at the price of slower speed, which achieves an mIoU of about 72.6% on the CamVid dataset. |
Tasks | Autonomous Driving, Scene Segmentation, Scene Understanding |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1904.05022v1 |
http://arxiv.org/pdf/1904.05022v1.pdf | |
PWC | https://paperswithcode.com/paper/dsnet-an-efficient-cnn-for-road-scene |
Repo | |
Framework | |
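
The DenseNet principle DSNet builds on is that each layer receives the concatenation of all earlier feature maps, which keeps per-layer channel counts (the growth rate) small. The block below is a minimal PyTorch illustration of that principle, not DSNet's actual configuration.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Minimal DenseNet-style block: every layer sees all earlier features,
    so each layer can be narrow (growth_rate channels) yet well-connected."""
    def __init__(self, in_channels, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, kernel_size=3, padding=1,
                          bias=False)))
            ch += growth_rate

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

out = DenseBlock(16)(torch.randn(1, 16, 64, 64))  # -> 16 + 4*12 = 64 channels
```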
Potential Field: Interpretable and Unified Representation for Trajectory Prediction
Title | Potential Field: Interpretable and Unified Representation for Trajectory Prediction |
Authors | Shan Su, Cheng Peng, Jianbo Shi, Chiho Choi |
Abstract | Predicting an agent’s future trajectory is a challenging task given the complicated stimuli (environmental/inertial/social) of motion. Prior works learn individual stimuli with separate modules and fuse the representations in an end-to-end manner, which makes it hard to understand what is actually captured and how the representations are fused. In this work, we borrow the notion of the potential field from physics as an interpretable and unified representation to model all stimuli. This allows us not only to supervise the intermediate learning process but also to fuse the information from different sources in a coherent manner. From the generated potential fields, we further estimate future motion direction and speed, which are modeled as Gaussian distributions to account for the multi-modal nature of the problem. The final predictions are generated by recurrently moving the past location based on the estimated motion direction and speed. We show state-of-the-art results on the ETH, UCY, and Stanford Drone datasets. |
Tasks | Trajectory Prediction |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07414v1 |
https://arxiv.org/pdf/1911.07414v1.pdf | |
PWC | https://paperswithcode.com/paper/potential-field-interpretable-and-unified |
Repo | |
Framework | |
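
The recurrent prediction step the abstract describes can be sketched directly: at each step, a motion direction and speed are sampled from Gaussians and the last location is advanced accordingly; repeating the rollout yields a multi-modal set of futures. The means and variances below are placeholders for the quantities the paper estimates from its learned potential fields.

```python
import numpy as np

def rollout(start, direction_mean, speed_mean, steps=12,
            dir_std=0.15, speed_std=0.1, seed=0):
    """Recurrently advance the last position using direction (an angle in
    radians) and speed sampled from Gaussians. All parameters stand in for
    outputs the paper derives from generated potential fields."""
    rng = np.random.default_rng(seed)
    pos, path = np.asarray(start, dtype=float), []
    for _ in range(steps):
        angle = rng.normal(direction_mean, dir_std)
        speed = max(0.0, rng.normal(speed_mean, speed_std))
        pos = pos + speed * np.array([np.cos(angle), np.sin(angle)])
        path.append(pos.copy())
    return np.stack(path)

# Several stochastic rollouts from one start -> a multi-modal trajectory set.
futures = [rollout([0.0, 0.0], direction_mean=0.3, speed_mean=1.0, seed=k)
           for k in range(5)]
```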
CBCL: Brain-Inspired Model for RGB-D Indoor Scene Classification
Title | CBCL: Brain-Inspired Model for RGB-D Indoor Scene Classification |
Authors | Ali Ayub, Alan Wagner |
Abstract | This paper contributes a novel method for RGB-D indoor scene classification. Recent approaches to this problem focus on developing increasingly complex pipelines that learn correlated features across the RGB and depth modalities. In contrast, this paper presents a simple method that first extracts features for the RGB and depth modalities using Places365-CNN and a Places365-CNN fine-tuned on depth data, respectively, and then clusters these features to generate a set of centroids representing each scene category from the training data. For classification, a scene image is converted to CNN features, and the distance of these features to the n closest learned centroids is used to predict the image’s category. We evaluate our method on two standard RGB-D indoor scene classification benchmarks, SUNRGB-D and NYU Depth V2, and demonstrate that our proposed classification approach achieves superior performance over the state-of-the-art methods on both datasets. |
Tasks | Scene Classification |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00155v2 |
https://arxiv.org/pdf/1911.00155v2.pdf | |
PWC | https://paperswithcode.com/paper/centroid-based-scene-classification-cbsc |
Repo | |
Framework | |
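
The classification rule in the abstract is simple enough to sketch end to end: cluster each category's training features into centroids, then score a test feature by its mean distance to the n closest centroids of each category. The Places365-CNN feature extraction is stubbed out with random vectors here, and the cluster counts are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_centroids(features_by_class, k=5):
    """Cluster each scene category's feature vectors into k centroids."""
    return {c: KMeans(n_clusters=k, n_init=10, random_state=0)
                  .fit(f).cluster_centers_
            for c, f in features_by_class.items()}

def predict(x, centroids_by_class, n_closest=3):
    """Pick the category whose n closest centroids are nearest on average."""
    scores = {c: np.sort(np.linalg.norm(cents - x, axis=1))[:n_closest].mean()
              for c, cents in centroids_by_class.items()}
    return min(scores, key=scores.get)

# Random stand-ins for CNN features of two scene categories.
rng = np.random.default_rng(0)
feats = {"kitchen": rng.normal(0.0, 1.0, (50, 128)),
         "office": rng.normal(3.0, 1.0, (50, 128))}
model = fit_centroids(feats)
print(predict(rng.normal(3.0, 1.0, 128), model))  # -> "office"
```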
Towards Egocentric Person Re-identification and Social Pattern Analysis
Title | Towards Egocentric Person Re-identification and Social Pattern Analysis |
Authors | Estefania Talavera, Alexandre Cola, Nicolai Petkov, Petia Radeva |
Abstract | Wearable cameras capture a first-person view of the daily activities of the camera wearer, offering a visual diary of the user’s behaviour. Detecting the appearance of the people the camera wearer interacts with is of high interest for the analysis of social interactions. Generally speaking, social events, lifestyle, and health are highly correlated, but there is a lack of tools to monitor and analyse them. We argue that egocentric vision provides a tool to obtain information about and understand users’ social interactions. We propose a model that enables us to evaluate and visualize social traits obtained by analysing the appearance of social interactions within egocentric photostreams. Given sets of egocentric images, we detect the faces appearing throughout the camera wearer’s days and rely on clustering algorithms to group their feature descriptors in order to re-identify persons. The recurrence of detected faces within photostreams allows us to shape an idea of the user’s social pattern of behaviour. We validated our model over several weeks recorded by different camera wearers. Our findings indicate that social profiles are potentially useful for social behaviour interpretation. |
Tasks | Person Re-Identification |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04073v1 |
https://arxiv.org/pdf/1905.04073v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-egocentric-person-re-identification |
Repo | |
Framework | |
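
The re-identification step, as described, reduces to clustering face descriptors so that a recurring person's faces fall into one group; a density-based clusterer is a natural choice because the number of identities is unknown in advance. The descriptors below are synthetic placeholders, and the DBSCAN parameters are illustrative.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
# Synthetic face descriptors: two recurring people plus a few stray faces.
person_a = rng.normal(0.0, 0.05, (20, 64))
person_b = rng.normal(1.0, 0.05, (15, 64))
strays = rng.uniform(-2.0, 3.0, (5, 64))
descriptors = np.vstack([person_a, person_b, strays])

# DBSCAN needs no preset identity count; label -1 marks unmatched faces.
labels = DBSCAN(eps=1.0, min_samples=3).fit_predict(descriptors)
# Recurrence of a cluster label across days sketches the social pattern.
print(np.unique(labels, return_counts=True))
```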
Machine Learning based detection of multiple Wi-Fi BSSs for LTE-U CSAT
Title | Machine Learning based detection of multiple Wi-Fi BSSs for LTE-U CSAT |
Authors | Vanlin Sathya, Adam Dziedzic, Monisha Ghosh, Sanjay Krishnan |
Abstract | According to the LTE-U Forum specification, an LTE-U base station (BS) reduces its duty cycle from 50% to 33% when it senses an increase in the number of co-channel Wi-Fi basic service sets (BSSs) from one to two. Detecting the number of Wi-Fi BSSs operating on the channel in real time, without decoding the Wi-Fi packets, remains a challenge. In this paper, we present a novel machine learning (ML) approach that solves the problem by using energy values observed during the LTE-U OFF duration. Observing the energy values (during the LTE-U BS OFF time) is a much simpler operation than decoding entire Wi-Fi packets. In this work, we implement and validate the proposed ML-based approach in real-time experiments and demonstrate that one and two Wi-Fi APs produce two distinct energy patterns. The approach delivers accuracy close to 100%, outperforming the auto-correlation (AC) and energy-detection (ED) approaches. |
Tasks | |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09292v1 |
https://arxiv.org/pdf/1911.09292v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-based-detection-of-multiple |
Repo | |
Framework | |
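
A sketch of the core idea: summarize the energy samples observed during the LTE-U OFF window into simple statistics and train a classifier to separate the one-AP and two-AP patterns. The synthetic energy model, features, and classifier are all illustrative assumptions; the abstract does not specify the paper's exact pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def energy_features(samples_dbm):
    """Summary statistics of energy seen during one LTE-U OFF window."""
    return [samples_dbm.mean(), samples_dbm.std(),
            np.percentile(samples_dbm, 10), np.percentile(samples_dbm, 90)]

rng = np.random.default_rng(0)

def off_window(num_aps):
    """Toy stand-in: more APs -> more airtime occupancy -> burstier energy."""
    busy = rng.random(200) < 0.3 * num_aps
    return np.where(busy, rng.normal(-62, 2, 200), rng.normal(-92, 2, 200))

X = [energy_features(off_window(n)) for n in (1, 2) for _ in range(100)]
y = [n for n in (1, 2) for _ in range(100)]
clf = RandomForestClassifier(random_state=0).fit(X, y)
```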
Improved Hard Example Mining by Discovering Attribute-based Hard Person Identity
Title | Improved Hard Example Mining by Discovering Attribute-based Hard Person Identity |
Authors | Xiao Wang, Ziliang Chen, Rui Yang, Bin Luo, Jin Tang |
Abstract | In this paper, we propose Hard Person Identity Mining (HPIM), which refines hard example mining to improve exploration efficacy in person re-identification. It is motivated by the following observation: the more attributes some people share, the more difficult it is to separate their identities. Based on this observation, we develop HPIM via a transferred attribute describer, a deep multi-attribute classifier trained on source noisy person attribute datasets. We encode each image in the target person re-ID dataset into a probabilistic attribute description. Afterwards, in the attribute code space, we treat each person as a distribution that generates view-specific attribute codes in different practical scenarios. We then estimate person-specific statistical moments from zeroth to higher order, which are further used to calculate the central moment discrepancies between persons. This discrepancy provides a basis for choosing hard identities to organize proper mini-batches, without being affected by changing person representations during metric learning. It serves as a complementary tool to hard example mining, helping to explore the global rather than the local hard example constraint in mini-batches built from randomly sampled identities. Extensive experiments on two person re-identification benchmarks validate the effectiveness of our proposed algorithm. |
Tasks | Metric Learning, Person Re-Identification |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02102v3 |
https://arxiv.org/pdf/1905.02102v3.pdf | |
PWC | https://paperswithcode.com/paper/improved-hard-example-mining-by-discovering |
Repo | |
Framework | |
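
The hardness measure hinges on central moment discrepancies between the attribute-code distributions of different persons; the central moment discrepancy (CMD) of Zellinger et al. is one concrete instantiation, comparing means and then higher-order central moments of two samples. A sketch, with random vectors standing in for attribute codes:

```python
import numpy as np

def central_moment_discrepancy(x, y, max_order=4):
    """CMD-style distance between two samples of attribute codes:
    difference of means plus differences of central moments up to max_order."""
    d = np.linalg.norm(x.mean(0) - y.mean(0))
    for k in range(2, max_order + 1):
        d += np.linalg.norm(((x - x.mean(0)) ** k).mean(0)
                            - ((y - y.mean(0)) ** k).mean(0))
    return d

rng = np.random.default_rng(0)
codes_a = rng.beta(2, 5, (30, 40))  # person A's attribute codes
codes_b = rng.beta(2, 5, (30, 40))  # person B: similar attributes
codes_c = rng.beta(5, 2, (30, 40))  # person C: different attributes
# Small discrepancy = shared attributes = hard to separate: good candidates
# for the same mini-batch.
print(central_moment_discrepancy(codes_a, codes_b),
      central_moment_discrepancy(codes_a, codes_c))
```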
Conformal calibrators
Title | Conformal calibrators |
Authors | Vladimir Vovk, Ivan Petej, Paolo Toccaceli, Alex Gammerman |
Abstract | Most existing examples of full conformal predictive systems, split-conformal predictive systems, and cross-conformal predictive systems impose severe restrictions on the adaptation of predictive distributions to the test object at hand. In this paper we develop split-conformal and cross-conformal predictive systems that are fully adaptive. Our method consists of calibrating existing predictive systems; the input predictive system is not required to satisfy any properties of validity, whereas the output predictive system is guaranteed to be calibrated in probability. Interestingly, the method may also work without the IID assumption that is standard in conformal prediction. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06579v1 |
http://arxiv.org/pdf/1902.06579v1.pdf | |
PWC | https://paperswithcode.com/paper/conformal-calibrators |
Repo | |
Framework | |
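
"Calibrated in probability" means the probability integral transform (PIT) values F(y|x) are uniform on held-out data. A split-style sketch of the calibration idea: pass each PIT value of the input predictive system through the empirical distribution of calibration-set PIT values, which pushes the output system toward uniformity no matter how miscalibrated the input was. This illustrates the general recipe, not the authors' exact construction.

```python
import numpy as np

def calibrate(pit_calibration, pit_test):
    """Map each test PIT value u = F(y|x) through the empirical CDF of the
    calibration PIT values; outputs are approximately uniform on [0, 1]."""
    cal = np.sort(pit_calibration)
    ranks = np.searchsorted(cal, pit_test, side='right')
    return (ranks + 1) / (len(cal) + 1)

rng = np.random.default_rng(0)
# An overconfident input system: its PIT values cluster near 0.5.
pit_cal = np.clip(rng.normal(0.5, 0.1, 500), 0.0, 1.0)
pit_new = np.clip(rng.normal(0.5, 0.1, 100), 0.0, 1.0)
# After calibration the histogram is roughly flat, i.e. calibrated.
print(np.histogram(calibrate(pit_cal, pit_new), bins=4, range=(0, 1))[0])
```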
A Fast and Precise Method for Large-Scale Land-Use Mapping Based on Deep Learning
Title | A Fast and Precise Method for Large-Scale Land-Use Mapping Based on Deep Learning |
Authors | Xuan Yang, Zhengchao Chen, Baipeng Li, Dailiang Peng, Pan Chen, Bing Zhang |
Abstract | Land-use maps are important data reflecting how land is used and transformed by humans, and they provide a valuable reference for land-use planning. With traditional image classification methods, producing a high-spatial-resolution (HSR) land-use map at large scale is a major undertaking that requires a great deal of human labor, time, and financial expenditure. The rise of deep learning provides a new solution to these problems. This paper proposes a fast and precise method for large-scale land-use classification based on a deep convolutional neural network (DCNN). We optimize the data tiling method and the structure of the DCNN for multi-channel data and the splicing edge effect, which are unique to remote sensing deep learning, and improve the accuracy of land-use classification. We apply our improved method to the Guangdong Province of China using GF-1 images and achieve a land-use classification accuracy of 81.52%. The work takes only 13 hours to complete, whereas it would take several months of human labor. |
Tasks | Image Classification |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03438v1 |
https://arxiv.org/pdf/1908.03438v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fast-and-precise-method-for-large-scale |
Repo | |
Framework | |
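
The splicing edge effect mentioned in the abstract is commonly suppressed by predicting overlapping tiles and stitching only each tile's center, so no output pixel comes from a tile border. The scheme below is a generic sketch of that idea, with `predict` standing in for the trained DCNN; the paper's actual tiling optimization may differ.

```python
import numpy as np

def tile_predict(image, predict, tile=512, overlap=64):
    """Segment a large raster tile-by-tile with overlap, keeping only each
    tile's center region to avoid border (splicing) artifacts."""
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.int64)
    step = tile - 2 * overlap
    for y in range(0, h, step):
        for x in range(0, w, step):
            y0, x0 = max(0, y - overlap), max(0, x - overlap)
            pred = predict(image[y0:y0 + tile, x0:x0 + tile])
            cy, cx = y - y0, x - x0           # offset of the kept center
            dst = out[y:y + step, x:x + step]
            dst[...] = pred[cy:cy + dst.shape[0], cx:cx + dst.shape[1]]
    return out

# Dummy run: a 4-band GF-1-like raster and a predictor that returns zeros.
demo = tile_predict(np.zeros((1024, 1024, 4)),
                    lambda patch: np.zeros(patch.shape[:2], np.int64))
```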
Value of Temporal Dynamics Information in Driving Scene Segmentation
Title | Value of Temporal Dynamics Information in Driving Scene Segmentation |
Authors | Li Ding, Jack Terwilliger, Rini Sherony, Bryan Reimer, Lex Fridman |
Abstract | Semantic scene segmentation has primarily been addressed by forming representations of single images, with both supervised and unsupervised methods. The problem of semantic segmentation in dynamic scenes has recently begun to receive attention with video object segmentation approaches. What is not known is how much extra information the temporal dynamics of the visual scene carry that is complementary to the information available in the individual frames of the video. There is evidence that the human visual system can effectively perceive a scene from the temporal dynamics of its changing visual characteristics alone, without relying on the visual characteristics of individual snapshots. Our work takes steps to explore whether machine perception can exhibit similar properties, by combining appearance-based representations and temporal dynamics representations in a joint-learning problem that reveals the contribution of each toward successful dynamic scene segmentation. Additionally, we provide the MIT Driving Scene Segmentation dataset, a large-scale full driving scene segmentation dataset, densely annotated for every pixel in every one of 5,000 video frames. This dataset is intended to help further the exploration of the value of temporal dynamics information for semantic segmentation in video. |
Tasks | Scene Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1904.00758v1 |
http://arxiv.org/pdf/1904.00758v1.pdf | |
PWC | https://paperswithcode.com/paper/value-of-temporal-dynamics-information-in |
Repo | |
Framework | |