January 27, 2020

3160 words 15 mins read

Paper Group ANR 1204

Stock Market Forecasting Based on Text Mining Technology: A Support Vector Machine Method. Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style. Decoding Spiking Mechanism with Dynamic Learning on Neuron Population. Semantic-aware Image Deblurring. Graph Normalizing Flows. Evolutionary Algorithm for Sinhala to …

Stock Market Forecasting Based on Text Mining Technology: A Support Vector Machine Method

Title Stock Market Forecasting Based on Text Mining Technology: A Support Vector Machine Method
Authors Yancong Xie, Hongxun Jiang
Abstract News items have a significant impact on stock markets, but the mechanisms are obscure. Many previous works have aimed at finding accurate stock-market forecasting models. In this paper, we apply text mining and sentiment analysis to Chinese online financial news to predict Chinese stock tendencies and stock prices with support vector machines (SVMs). First, we collect 2,302,692 news items dating from 1/1/2008 to 1/1/2015. Second, based on this dataset, we build a domain-specific stop-word dictionary and a precise sentiment dictionary. Third, we propose an SVM-based forecasting model; for the SVM implementation, we also propose two parameter-optimization algorithms to search for the best initial parameter setting. The results show that parameter G has the main effect, while parameter C’s effect is not obvious. Furthermore, support vector regression (SVR) models for different Chinese stocks are similar, whereas the best parameters of support vector classification (SVC) models differ considerably. A series of contrast experiments shows that: a) news has a significant influence on the stock market; b) expanding the input vector to handle days without news data improves SVR but hurts SVC; c) SVR fits stock fluctuations remarkably well, though its predictions exhibit some time lag; d) the time lag of news effects on the stock market is less than two days; e) in SVC, historical stock data is most effective at a time lag of about 10 days, whereas in SVR this effect is not obvious. In addition, exploiting the special structure of the input vector, we design a method to calculate the impact factor of each financial news source. The results suggest that both news quality and audience size significantly affect the source impact factor, and that, for Chinese investors, traditional media has more influence than digital media. (A minimal sketch of the C/G parameter search follows this entry.)
Tasks Sentiment Analysis
Published 2019-09-27
URL https://arxiv.org/abs/1909.12789v1
PDF https://arxiv.org/pdf/1909.12789v1.pdf
PWC https://paperswithcode.com/paper/stock-market-forecasting-based-on-text-mining
Repo
Framework
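
A minimal sketch of the kind of C/G (gamma) grid search the paper describes, using scikit-learn with an RBF kernel. The feature matrix and labels below are random placeholders standing in for the news-sentiment and historic-price inputs; the grid values and split strategy are illustrative assumptions.

```python
# Minimal sketch of an RBF-kernel C/gamma search for SVC (tendency) and SVR (price).
# X and y are placeholders, not the paper's actual news/price features.
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))          # e.g. sentiment scores + 10-day price lags
y_cls = rng.integers(0, 2, size=500)    # up/down tendency for SVC
y_reg = rng.normal(size=500)            # next-day return for SVR

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1, 1]}
cv = TimeSeriesSplit(n_splits=5)        # respect the temporal order of trading days

svc_search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=cv).fit(X, y_cls)
svr_search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=cv).fit(X, y_reg)
print(svc_search.best_params_, svr_search.best_params_)
```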

Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style

Title Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style
Authors Hongwei Ge, Zehang Yan, Kai Zhang, Mingde Zhao, Liang Sun
Abstract Image captioning is a research hotspot where encoder-decoder models combining convolutional neural networks (CNNs) and long short-term memory (LSTM) networks achieve promising results. Despite significant progress, these models generate sentences differently from human cognitive styles. Existing models often generate a complete sentence from the first word to the end, without considering the influence of subsequent words on the sentence as a whole. In this paper, we explore the utilization of a human-like cognitive style, i.e., building overall cognition for the image to be described and the sentence to be constructed, to enhance computer image understanding. This paper first proposes a Mutual-aid network structure with Bidirectional LSTMs (MaBi-LSTMs) for acquiring overall contextual information. In the training process, the forward and backward LSTMs encode the succeeding and preceding words into their respective hidden states by simultaneously constructing the whole sentence in a complementary manner. In the captioning process, the LSTM implicitly utilizes the subsequent semantic information contained in its hidden states. In fact, MaBi-LSTMs can generate two sentences, one in the forward and one in the backward direction. To bridge the gap between the two directions and generate a sentence of higher quality, we further develop a cross-modal attention mechanism that retouches the two sentences by fusing their salient parts as well as the salient areas of the image. Experimental results on the Microsoft COCO dataset show that the proposed model improves the performance of encoder-decoder models and achieves state-of-the-art results. (A toy sketch of the bidirectional decoding follows this entry.)
Tasks Image Captioning
Published 2019-10-15
URL https://arxiv.org/abs/1910.06475v1
PDF https://arxiv.org/pdf/1910.06475v1.pdf
PWC https://paperswithcode.com/paper/exploring-overall-contextual-information-for
Repo
Framework
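
A toy PyTorch sketch of the mutual-aid idea: two LSTMs decode the caption in opposite directions, so each hidden state also carries context from the other end of the sentence. The cross-modal attention that fuses the two sentences with image regions is omitted; all dimensions and the additive fusion are illustrative assumptions, not the paper's architecture.

```python
# Two LSTMs decode a caption in opposite directions; their hidden states could
# then be fused by a cross-modal attention step (omitted here).
import torch
import torch.nn as nn

vocab, embed, hidden = 1000, 128, 256
emb = nn.Embedding(vocab, embed)
fwd_lstm = nn.LSTM(embed, hidden, batch_first=True)   # first word -> last
bwd_lstm = nn.LSTM(embed, hidden, batch_first=True)   # last word -> first
head = nn.Linear(hidden, vocab)

tokens = torch.randint(0, vocab, (4, 12))             # a batch of caption tokens
h_fwd, _ = fwd_lstm(emb(tokens))                      # encodes preceding context
h_bwd, _ = bwd_lstm(emb(tokens.flip(1)))              # encodes succeeding context
h_bwd = h_bwd.flip(1)                                 # re-align time steps

logits_fwd, logits_bwd = head(h_fwd), head(h_bwd)     # two candidate sentences
fused = h_fwd + h_bwd                                 # stand-in for attention fusion
```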

Decoding Spiking Mechanism with Dynamic Learning on Neuron Population

Title Decoding Spiking Mechanism with Dynamic Learning on Neuron Population
Authors Zhijie Chen, Junchi Yan, Longyuan Li, Xiaokang Yang
Abstract A main concern in cognitive neuroscience is to decode overt neural spike-train observations and infer the latent representations beneath neural circuits. However, traditional methods entail strong priors on network structure and hardly meet the demands of real spike data. Here we propose a novel neural network approach, the Neuron Activation Network, that extracts neural information explicitly from single-trial neuron-population spike trains. The proposed method consists of a spatiotemporal learning procedure on the sensory environment and a message-passing mechanism on the population graph, followed by a recursive neuron activation process. The model aims to reconstruct neuron information while inferring representations of neuron spiking states. We apply the model to retinal ganglion cells, and the experimental results suggest that it generates neural spike sequences with higher fidelity than state-of-the-art methods, while being more expressive and having the potential to disclose latent spiking mechanisms. The source code will be released with the final paper. (A schematic of the message-passing step follows this entry.)
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09309v1
PDF https://arxiv.org/pdf/1911.09309v1.pdf
PWC https://paperswithcode.com/paper/decoding-spiking-mechanism-with-dynamic
Repo
Framework
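
The abstract's pipeline, message passing on a population graph followed by a recursive activation process, can be caricatured in a few lines of PyTorch. Everything here (the adjacency, the feature sizes, the GRU update) is an assumed stand-in, not the paper's actual architecture.

```python
# One message-passing step over a neuron-population graph, followed by a
# recurrent (recursive) activation update. A is an assumed connectivity matrix.
import torch
import torch.nn as nn

n_neurons, feat = 32, 16
A = (torch.rand(n_neurons, n_neurons) < 0.1).float()  # assumed sparse connectivity
msg = nn.Linear(feat, feat)
gru = nn.GRUCell(feat, feat)

state = torch.zeros(n_neurons, feat)
spikes = torch.rand(100, n_neurons, feat)             # placeholder spike features
for t in range(spikes.shape[0]):
    messages = A @ msg(state)                          # aggregate neighbor states
    state = gru(spikes[t] + messages, state)           # recursive activation update
```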

Semantic-aware Image Deblurring

Title Semantic-aware Image Deblurring
Authors Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao
Abstract Image deblurring has achieved exciting progress in recent years. However, traditional methods fail to deblur severely blurred images, where the semantic content becomes ambiguous. In this paper, we conduct image deblurring guided by semantic content inferred from image captioning. Specifically, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC). In particular, the S3-tree captures and represents the semantic content as structured spatial features in ImC, and then embeds the spatial features of the tree nodes into GAN-based ImD. S3-tree, ImC, and ImD are co-trained to optimize the overall model in a multi-task, end-to-end manner. Extensive experiments on severely blurred MSCOCO and GoPro datasets demonstrate the significant superiority of S3E-Deblur over the state of the art on both ImD and ImC tasks. (A schematic of the joint objective follows this entry.)
Tasks Deblurring, Image Captioning
Published 2019-10-09
URL https://arxiv.org/abs/1910.03853v1
PDF https://arxiv.org/pdf/1910.03853v1.pdf
PWC https://paperswithcode.com/paper/semantic-aware-image-deblurring
Repo
Framework
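
The co-training of ImD and ImC boils down to a joint loss over both tasks. A schematic PyTorch sketch with placeholder networks (the real model uses an S3-tree and a GAN-based deblurrer) and an assumed weighting factor:

```python
# Joint multi-task objective: deblurring term + captioning term, optimized
# end to end. Networks and the 0.1 weight are placeholders.
import torch
import torch.nn as nn

deblur_net = nn.Conv2d(3, 3, 3, padding=1)            # stand-in for the GAN generator
caption_head = nn.Linear(3 * 32 * 32, 1000)           # stand-in for the captioner

blurred = torch.rand(2, 3, 32, 32)
sharp = torch.rand(2, 3, 32, 32)
word_targets = torch.randint(0, 1000, (2,))

restored = deblur_net(blurred)
loss_imd = nn.functional.l1_loss(restored, sharp)     # image deblurring term
logits = caption_head(restored.flatten(1))
loss_imc = nn.functional.cross_entropy(logits, word_targets)  # captioning term
loss = loss_imd + 0.1 * loss_imc                      # joint end-to-end objective
loss.backward()
```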

Graph Normalizing Flows

Title Graph Normalizing Flows
Authors Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, Kevin Swersky
Abstract We introduce graph normalizing flows: a new, reversible graph neural network model for prediction and generation. On supervised tasks, graph normalizing flows perform similarly to message-passing neural networks but with a significantly reduced memory footprint, allowing them to scale to larger graphs. In the unsupervised case, we combine graph normalizing flows with a novel graph auto-encoder to create a generative model of graph structures. Our model is permutation-invariant, generates entire graphs in a single feed-forward pass, and achieves results competitive with state-of-the-art auto-regressive models while being better suited to parallel computing architectures. (A toy coupling layer follows this entry.)
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.13177v1
PDF https://arxiv.org/pdf/1905.13177v1.pdf
PWC https://paperswithcode.com/paper/graph-normalizing-flows
Repo
Framework
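
The reversible building block behind normalizing flows is the coupling layer. A toy sketch on node features; a graph version would compute the scale and shift with message passing over the adjacency rather than the plain MLP assumed here.

```python
# Affine coupling layer: exactly invertible, with a tractable log-determinant.
import torch
import torch.nn as nn

d = 8                                                  # node feature dimension
net = nn.Sequential(nn.Linear(d // 2, 32), nn.ReLU(), nn.Linear(32, d))

def forward(x):
    x1, x2 = x.chunk(2, dim=-1)
    s, t = net(x1).chunk(2, dim=-1)
    y2 = x2 * torch.exp(s) + t                         # invertible affine transform
    return torch.cat([x1, y2], dim=-1), s.sum(-1)      # output + log-det term

def inverse(y):
    y1, y2 = y.chunk(2, dim=-1)
    s, t = net(y1).chunk(2, dim=-1)
    return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)

x = torch.randn(5, d)                                  # features for 5 nodes
y, logdet = forward(x)
assert torch.allclose(inverse(y), x, atol=1e-5)        # exact reversibility
```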

Evolutionary Algorithm for Sinhala to English Translation

Title Evolutionary Algorithm for Sinhala to English Translation
Authors J. K. Joseph, W. M. T. Chathurika, A. Nugaliyadde, Y. Mallawarachchi
Abstract Machine Translation (MT) is an area of natural language processing that focuses on translating from one language to another. Many approaches, from statistical methods to deep learning, are used to achieve MT, but these methods require either a large amount of data or a deep understanding of the language. Sinhala has little digital text that could be used to train a deep neural network, and its grammar is complex, making it hard to craft the rules that statistical MT requires. This research focuses on Sinhala-to-English translation using an Evolutionary Algorithm (EA). The EA is used to identify the correct meaning of the Sinhala text and to translate it into English: the input is first analyzed to resolve the meaning of the sentence, the EA then carries out the translation, and the translated text is finally passed through a grammar-correction step. This approach has been shown to achieve accurate results. (A bare-bones evolutionary loop follows this entry.)
Tasks Machine Translation
Published 2019-07-06
URL https://arxiv.org/abs/1907.03202v1
PDF https://arxiv.org/pdf/1907.03202v1.pdf
PWC https://paperswithcode.com/paper/evolutionary-algorithm-for-sinhala-to-english
Repo
Framework
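
A bare-bones version of the evolutionary loop the abstract describes: candidate translations are scored by a fitness function and evolved by selection and mutation. The word table and the fitness function are toy stand-ins for the paper's meaning and grammar scoring.

```python
# Evolve candidate word-choice translations toward higher fitness.
import random

sinhala_sentence = ["w1", "w2", "w3"]                   # placeholder source tokens
candidates = {"w1": ["I", "we"], "w2": ["eat", "ate"], "w3": ["rice", "bread"]}

def random_translation():
    return [random.choice(candidates[w]) for w in sinhala_sentence]

def fitness(translation):
    # toy stand-in for the paper's meaning/grammar scoring
    return sum(len(w) for w in translation)

population = [random_translation() for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                           # keep the fittest half
    children = []
    for p in parents:
        child = list(p)
        i = random.randrange(len(child))                # point mutation
        child[i] = random.choice(candidates[sinhala_sentence[i]])
        children.append(child)
    population = parents + children
print(max(population, key=fitness))
```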

On the Impact of Object and Sub-component Level Segmentation Strategies for Supervised Anomaly Detection within X-ray Security Imagery

Title On the Impact of Object and Sub-component Level Segmentation Strategies for Supervised Anomaly Detection within X-ray Security Imagery
Authors Neelanjan Bhowmik, Yona Falinie A. Gaus, Samet Akcay, Jack W. Barker, Toby P. Breckon
Abstract X-ray security screening is in widespread use to maintain transportation security against a wide range of potential threat profiles. Of particular interest is the recent focus on automated screening approaches, including anomaly detection as a methodology for concealment detection within complex electronic items. Here we address this problem by considering varying segmentation strategies that enable both object-level and sub-component-level anomaly detection via secondary convolutional neural network (CNN) architectures. Relative performance is evaluated over an extensive dataset of exemplar cluttered X-ray imagery, with a focus on consumer electronics. We find that sub-component-level segmentation produces marginally superior performance in the secondary anomaly-detection-by-classification stage, detecting ~98% of anomalies at a ~3% false-positive rate. (A rough outline of the two-stage pipeline follows this entry.)
Tasks Anomaly Detection
Published 2019-11-19
URL https://arxiv.org/abs/1911.08216v1
PDF https://arxiv.org/pdf/1911.08216v1.pdf
PWC https://paperswithcode.com/paper/on-the-impact-of-object-and-sub-component
Repo
Framework
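
The two-stage pipeline has a simple shape: segment an item into sub-components, then run a secondary CNN on each crop. A rough torchvision sketch with untrained placeholder models; with random weights the detector will usually return no boxes, so this only illustrates the data flow, not the paper's trained networks.

```python
# Stage 1: segment sub-components; stage 2: classify each crop as anomalous or not.
import torch
import torchvision

segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(weights=None).eval()
classifier = torchvision.models.resnet18(weights=None).eval()

xray = torch.rand(3, 512, 512)                          # one X-ray image
with torch.no_grad():
    detections = segmenter([xray])[0]                   # sub-component boxes/masks
    for box in detections["boxes"]:
        x0, y0, x1, y1 = box.int().tolist()
        crop = xray[:, y0:y1, x0:x1]                    # one sub-component crop
        crop = torch.nn.functional.interpolate(crop[None], size=(224, 224))
        anomaly_logits = classifier(crop)               # secondary CNN decision
```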

Visual Semantic Information Pursuit: A Survey

Title Visual Semantic Information Pursuit: A Survey
Authors Daqi Liu, Miroslaw Bober, Josef Kittler
Abstract Visual semantic information comprises two important parts: the meaning of each visual semantic unit and the coherent visual semantic relations conveyed between these units. Essentially, the former is a visual perception task while the latter corresponds to visual context reasoning. Remarkable advances in visual perception have been achieved thanks to the success of deep learning. In contrast, visual semantic information pursuit, a visual scene semantic interpretation task combining visual perception and visual context reasoning, is still in its early stages. It is the core task of many computer vision applications, such as object detection, visual semantic segmentation, visual relationship detection and scene graph generation. Since visual context reasoning helps to enhance the accuracy and consistency of the resulting interpretation, it is often incorporated with visual perception in current deep end-to-end visual semantic information pursuit methods. However, a comprehensive review of this exciting area is still lacking. In this survey, we present a unified theoretical paradigm for these methods, followed by an overview of the major developments and future trends in each potential direction. The common benchmark datasets, evaluation metrics and comparisons of the corresponding methods are also introduced.
Tasks Graph Generation, Object Detection, Scene Graph Generation, Semantic Segmentation
Published 2019-03-13
URL http://arxiv.org/abs/1903.05434v1
PDF http://arxiv.org/pdf/1903.05434v1.pdf
PWC https://paperswithcode.com/paper/visual-semantic-information-pursuit-a-survey
Repo
Framework

High-Level Control of Drum Track Generation Using Learned Patterns of Rhythmic Interaction

Title High-Level Control of Drum Track Generation Using Learned Patterns of Rhythmic Interaction
Authors Stefan Lattner, Maarten Grachten
Abstract Spurred by the potential of deep learning, computational music generation has gained renewed academic interest. A crucial issue in music generation is that of user control, especially in scenarios where the music generation process is conditioned on existing musical material. Here we propose a model for conditional kick drum track generation that takes existing musical material as input, in addition to a low-dimensional code that encodes the desired relation between the existing material and the new material to be generated. These relational codes are learned in an unsupervised manner from a music dataset. We show that codes can be sampled to create a variety of musically plausible kick drum tracks and that the model can be used to transfer kick drum patterns from one song to another. Lastly, we demonstrate that the learned codes are largely invariant to tempo and time-shift.
Tasks Music Generation
Published 2019-08-02
URL https://arxiv.org/abs/1908.00948v1
PDF https://arxiv.org/pdf/1908.00948v1.pdf
PWC https://paperswithcode.com/paper/high-level-control-of-drum-track-generation
Repo
Framework

Progressive Unsupervised Person Re-identification by Tracklet Association with Spatio-Temporal Regularization

Title Progressive Unsupervised Person Re-identification by Tracklet Association with Spatio-Temporal Regularization
Authors Qiaokang Xie, Wengang Zhou, Guo-Jun Qi, Qi Tian, Houqiang Li
Abstract Existing methods for person re-identification (Re-ID) are mostly based on supervised learning, which requires numerous manually labeled samples across all camera views for training. Such a paradigm suffers from a scalability issue, since in real-world Re-ID applications it is difficult to exhaustively label abundant identities over multiple disjoint camera views. To this end, we propose a progressive deep learning method for unsupervised person Re-ID in the wild by Tracklet Association with Spatio-Temporal Regularization (TASTR). In our approach, we first collect tracklet data within each camera by automatic person detection and tracking. Then, an initial Re-ID model is trained on within-camera triplets for person representation learning. After that, based on visual features and spatio-temporal constraints, we associate cross-camera tracklets to generate cross-camera triplets and update the Re-ID model. Finally, the refined Re-ID model extracts better visual features, which further promote the association of cross-camera tracklets. The last two steps are iterated multiple times to progressively upgrade the Re-ID model. (The overall loop is outlined after this entry.)
Tasks Human Detection, Person Re-Identification, Representation Learning, Unsupervised Person Re-Identification
Published 2019-10-25
URL https://arxiv.org/abs/1910.11560v1
PDF https://arxiv.org/pdf/1910.11560v1.pdf
PWC https://paperswithcode.com/paper/progressive-unsupervised-person-re
Repo
Framework
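
At a high level the TASTR loop alternates between feature extraction and cross-camera association. Every helper below is hypothetical, named only to show the control flow the abstract describes, not an actual implementation.

```python
# High-level shape of the TASTR loop; all helpers are hypothetical placeholders.
def tastr(cameras, n_rounds=3):
    tracklets = [collect_tracklets(cam) for cam in cameras]   # detection + tracking
    model = train_triplet_model(within_camera_triplets(tracklets))
    for _ in range(n_rounds):
        feats = [extract_features(model, t) for t in tracklets]
        # associate tracklets across cameras under spatio-temporal constraints
        cross_triplets = associate_cross_camera(feats, tracklets)
        model = update_model(model, cross_triplets)           # refine the Re-ID model
    return model
```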

Energy Clustering for Unsupervised Person Re-identification

Title Energy Clustering for Unsupervised Person Re-identification
Authors Kaiwei Zeng
Abstract Due to the high cost of data annotation in supervised person re-identification (Re-ID), unsupervised learning is more attractive in the real world. The Bottom-up Clustering (BUC) approach, based on hierarchical clustering, is one promising unsupervised method, and a key factor in BUC is its distance measurement strategy. Ideally, the measurement should consider both the inter-cluster and intra-cluster distances of all samples. However, BUC uses the minimum distance, which considers only the nearest pair of samples between two clusters and ignores the diversity of the other samples. To solve this problem, we propose to use the energy distance to evaluate both inter-cluster and intra-cluster distance in hierarchical clustering (E-cluster), and use the sum of squares of deviations (SSD) as a regularization term to further balance the diversity and similarity of the energy-distance evaluation. We evaluate our method on large-scale Re-ID datasets, including Market-1501, DukeMTMC-reID and MARS. Extensive experiments show that our method obtains significant improvements over state-of-the-art unsupervised methods, and even beats some transfer-learning methods. (The energy distance is spelled out after this entry.)
Tasks Person Re-Identification, Transfer Learning, Unsupervised Person Re-Identification
Published 2019-08-31
URL https://arxiv.org/abs/1909.00112v2
PDF https://arxiv.org/pdf/1909.00112v2.pdf
PWC https://paperswithcode.com/paper/energy-clustering-for-unsupervised-person-re
Repo
Framework
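
The energy distance that replaces BUC's minimum pairwise distance accounts for all inter- and intra-cluster pairs: E(A, B) = 2*E||a-b|| - E||a-a'|| - E||b-b'||. A small NumPy sketch on toy features (the 128-d vectors stand in for Re-ID embeddings):

```python
# Energy distance between two clusters of feature vectors.
import numpy as np

def pairwise_mean(A, B):
    # mean Euclidean distance over all pairs (a, b)
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1).mean()

def energy_distance(A, B):
    # E(A, B) = 2*E||a-b|| - E||a-a'|| - E||b-b'||
    return 2 * pairwise_mean(A, B) - pairwise_mean(A, A) - pairwise_mean(B, B)

rng = np.random.default_rng(0)
cluster_a = rng.normal(0.0, 1.0, size=(40, 128))        # toy Re-ID features
cluster_b = rng.normal(0.5, 1.0, size=(60, 128))
print(energy_distance(cluster_a, cluster_b))
```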

How Decoding Strategies Affect the Verifiability of Generated Text

Title How Decoding Strategies Affect the Verifiability of Generated Text
Authors Luca Massarelli, Fabio Petroni, Aleksandra Piktus, Myle Ott, Tim Rocktäschel, Vassilis Plachouras, Fabrizio Silvestri, Sebastian Riedel
Abstract Language models are of considerable importance. They are used for pretraining, finetuning, and rescoring in downstream applications, and serve as a test-bed and benchmark for progress in natural language understanding. One fundamental question concerns how we should generate text from a language model. It is well known that different decoding strategies can have a dramatic impact on the quality of the generated text, and that taking the most likely sequence under the model distribution, e.g., via beam search, generally leads to degenerate and repetitive outputs. While generation strategies such as top-k and nucleus sampling lead to more natural and less repetitive generations, the true cost of avoiding the highest-scoring solution is hard to quantify. In this paper, we argue that verifiability, i.e., the consistency of the generated text with factual knowledge, is a suitable metric for measuring this cost. We use an automatic fact-checking system to compute metrics based on the number of supported claims per sentence and find that sampling-based generation strategies, such as top-k, indeed lead to less verifiable text. This finding holds across various dimensions, such as model size, training data size, and the parameters of the generation strategy. Based on this finding, we introduce a simple and effective generation strategy for producing non-repetitive text that is more verifiable than that of other methods. (The two sampling strategies are sketched after this entry.)
Tasks Language Modelling
Published 2019-11-09
URL https://arxiv.org/abs/1911.03587v1
PDF https://arxiv.org/pdf/1911.03587v1.pdf
PWC https://paperswithcode.com/paper/how-decoding-strategies-affect-the
Repo
Framework
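
The two sampling strategies under study, in minimal form: top-k truncates the next-token distribution to the k most likely tokens, while nucleus (top-p) sampling keeps the smallest set whose cumulative probability exceeds p; both renormalize and sample. A PyTorch sketch over one step's logits, with k, p and the vocabulary size as illustrative values:

```python
import torch

def top_k_sample(logits, k=10):
    top = torch.topk(logits, k)                          # keep k most likely tokens
    probs = torch.softmax(top.values, dim=-1)            # renormalize
    return top.indices[torch.multinomial(probs, 1)]

def nucleus_sample(logits, p=0.9):
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, idx = torch.sort(probs, descending=True)
    # keep tokens until cumulative probability reaches p (nucleus set)
    keep = torch.cumsum(sorted_probs, dim=-1) - sorted_probs < p
    probs = sorted_probs * keep
    return idx[torch.multinomial(probs / probs.sum(), 1)]

logits = torch.randn(50_000)                             # one decoding step of an LM
print(top_k_sample(logits), nucleus_sample(logits))
```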

Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks

Title Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks
Authors Kazuhiro Nakamura, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda
Abstract The present paper describes singing voice synthesis based on convolutional neural networks (CNNs). Singing voice synthesis systems based on deep neural networks (DNNs) are currently being proposed and are improving the naturalness of synthesized singing voices. As singing voices are a rich form of expression, a powerful technique is required to model them accurately. In the proposed technique, long-term dependencies of singing voices are modeled by CNNs: an acoustic feature sequence is generated for each segment of long-term frames, and a natural trajectory is obtained without the parameter generation algorithm. Furthermore, we propose a computational-complexity reduction technique that drives the DNNs in different time units depending on the type of musical-score feature. Experimental results show that the proposed method can synthesize natural-sounding singing voices much faster than the conventional method.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11690v1
PDF https://arxiv.org/pdf/1910.11690v1.pdf
PWC https://paperswithcode.com/paper/fast-and-high-quality-singing-voice-synthesis
Repo
Framework

Physical Cue based Depth-Sensing by Color Coding with Deaberration Network

Title Physical Cue based Depth-Sensing by Color Coding with Deaberration Network
Authors Nao Mishima, Tatsuo Kozakaya, Akihisa Moriya, Ryuzo Okada, Shinsaku Hiura
Abstract Color-coded aperture (CCA) methods can physically measure the depth of a scene from physical cues in a single-shot image of a monocular camera. However, they are vulnerable to actual lens aberrations in real scenes because, to simplify the algorithms, they assume an ideal lens. In this paper, we propose physical-cue-based deep learning for CCA photography. To address actual lens aberrations, we developed a deep deaberration network (DDN) that is additionally equipped with a self-attention mechanism over positions and color channels to efficiently learn the lens aberration. Furthermore, a new Bayes L1 loss function based on Bayesian deep learning handles the uncertainty of depth estimation more accurately. Quantitative and qualitative comparisons demonstrate that our method is superior to conventional methods, including on real outdoor scenes. Moreover, compared to a long-baseline stereo camera, the proposed method provides an error-free depth map at close range, as there is no blind spot between the left and right cameras. (One plausible form of the Bayes L1 loss is sketched after this entry.)
Tasks Depth Estimation
Published 2019-08-01
URL https://arxiv.org/abs/1908.00329v1
PDF https://arxiv.org/pdf/1908.00329v1.pdf
PWC https://paperswithcode.com/paper/physical-cue-based-depth-sensing-by-color
Repo
Framework
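
The paper does not spell out the Bayes L1 loss here, but one plausible reading is an aleatoric-uncertainty L1: the negative log-likelihood of a Laplace distribution whose scale is predicted per pixel alongside the depth. This is our interpretation of the abstract, not the paper's exact formulation.

```python
# Laplace NLL: |y - mu| / b + log b (the constant log 2 is dropped).
# The network would predict both a depth map and a per-pixel log-scale.
import torch

def bayes_l1_loss(pred_depth, pred_log_b, target_depth):
    b = torch.exp(pred_log_b)                           # predicted Laplace scale
    return (torch.abs(target_depth - pred_depth) / b + pred_log_b).mean()

depth = torch.rand(2, 1, 64, 64)                        # placeholder predictions
log_b = torch.zeros(2, 1, 64, 64, requires_grad=True)   # placeholder uncertainty
target = torch.rand(2, 1, 64, 64)
loss = bayes_l1_loss(depth, log_b, target)
loss.backward()
```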

Lecture Notes: Optimization for Machine Learning

Title Lecture Notes: Optimization for Machine Learning
Authors Elad Hazan
Abstract Lecture notes on optimization for machine learning, derived from a course at Princeton University and tutorials given at MLSS, Buenos Aires, as well as at the Simons Foundation, Berkeley.
Tasks
Published 2019-09-08
URL https://arxiv.org/abs/1909.03550v1
PDF https://arxiv.org/pdf/1909.03550v1.pdf
PWC https://paperswithcode.com/paper/lecture-notes-optimization-for-machine
Repo
Framework