January 25, 2020


Paper Group ANR 1754


UdS Submission for the WMT 19 Automatic Post-Editing Task


Title	UdS Submission for the WMT 19 Automatic Post-Editing Task
Authors	Hongfei Xu, Qiuhui Liu, Josef van Genabith
Abstract	In this paper, we describe our submission to the English-German APE shared task at WMT 2019. We utilize and adapt an NMT architecture originally developed for exploiting context information to APE, implement this in our own transformer model and explore joint training of the APE task with a de-noising encoder.
Tasks	Automatic Post-Editing
Published	2019-08-09
URL	https://arxiv.org/abs/1908.03402v1
PDF	https://arxiv.org/pdf/1908.03402v1.pdf
PWC	https://paperswithcode.com/paper/uds-submission-for-the-wmt-19-automatic-post
Repo
Framework

Diversity by Phonetics and its Application in Neural Machine Translation


Title	Diversity by Phonetics and its Application in Neural Machine Translation
Authors	Abdul Rafae Khan, Jia Xu
Abstract	We introduce a powerful approach for Neural Machine Translation (NMT), whereby, during training and testing, together with the input we provide its phonetic encoding and the variants of such an encoding. This way we obtain significant improvements of up to 4 BLEU points over the state-of-the-art large-scale system. The phonetic encoding is the first part of our contribution, with a second being a theory that aims to understand the reason for this improvement. Our hypothesis states that the phonetic encoding helps NMT because it encodes a procedure to emphasize the difference between semantically diverse sentences. We conduct an empirical geometric validation of our hypothesis, in support of which we obtain overwhelming evidence. Subsequently, as our third contribution and based on our theory, we develop artificial mechanisms that leverage the hypothesized (and verified) effect of phonetics during learning. We achieve significant and consistent improvements over all language pairs and datasets: French-English, German-English, and Chinese-English in the medium task IWSLT’17 and French-English in the large task WMT’18 Bio, with up to 4 BLEU points over the state-of-the-art. Moreover, our approaches are more robust than baselines when evaluated on unknown out-of-domain test sets, with up to a 5 BLEU point increase.
Tasks	Machine Translation
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04292v1
PDF	https://arxiv.org/pdf/1911.04292v1.pdf
PWC	https://paperswithcode.com/paper/diversity-by-phonetics-and-its-application-in
Repo
Framework

Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks


Title	Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
Authors	Yeonjong Shin
Abstract	Deep neural networks have been used in various machine learning applications and achieved tremendous empirical successes. However, training deep neural networks is a challenging task. Many alternatives have been proposed in place of end-to-end back-propagation. Layer-wise training is one of them: it trains a single layer at a time, rather than training all layers simultaneously. In this paper, we study layer-wise training using block coordinate gradient descent (BCGD) for deep linear networks. We establish a general convergence analysis of BCGD and find the optimal learning rate, which results in the fastest decrease in the loss. More importantly, the optimal learning rate can directly be applied in practice, as it does not require any prior knowledge. Thus, tuning the learning rate is not needed at all. Also, we identify the effects of depth, width, and initialization in the training process. We show that when the orthogonal-like initialization is employed, the width of intermediate layers plays no role in gradient-based training, as long as the width is greater than or equal to both the input and output dimensions. We show that under some conditions, the deeper the network is, the faster the convergence is guaranteed. This implies that in an extreme case, the global optimum is achieved after updating each weight matrix only once. In addition, we find that the use of deep networks could drastically accelerate convergence compared to that of a depth-1 network, even when the computational cost is considered. Numerical examples are provided to justify our theoretical findings and demonstrate the performance of layer-wise training by BCGD.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.05874v1
PDF	https://arxiv.org/pdf/1910.05874v1.pdf
PWC	https://paperswithcode.com/paper/effects-of-depth-width-and-initialization-a
Repo
Framework
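
As a rough illustration of the layer-wise idea in the abstract above, the following sketch runs block coordinate gradient descent on a small deep linear network with square layers. The step size for each block comes from an exact line search along that block's gradient, a standard choice for a quadratic objective; it is an assumption of this sketch and not necessarily the optimal learning rate derived in the paper. The toy data, the identity initialization, and all function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def sq_norm(a):
    return sum(v * v for row in a for v in row)

def residual(pred, target):
    return [[p - t for p, t in zip(pr, tr)] for pr, tr in zip(pred, target)]

def chain(mats, n):
    """Product mats[-1] @ ... @ mats[0]; the identity when mats is empty."""
    out = identity(n)
    for m in mats:
        out = matmul(m, out)
    return out

def loss(weights, x, y):
    return 0.5 * sq_norm(residual(matmul(chain(weights, len(x)), x), y))

def bcgd_sweep(weights, x, y):
    """Update each layer once, in order, keeping all other layers fixed."""
    n = len(x)
    for l in range(len(weights)):
        before = matmul(chain(weights[:l], n), x)   # layers below l, applied to x
        after = chain(weights[l + 1:], n)           # product of the layers above l
        pred = matmul(after, matmul(weights[l], before))
        res = residual(pred, y)
        # Gradient of 0.5 * ||after @ W_l @ before - y||^2 with respect to W_l.
        grad = matmul(matmul(transpose(after), res), transpose(before))
        curvature = sq_norm(matmul(after, matmul(grad, before)))
        if curvature == 0.0:
            continue  # zero gradient (or degenerate direction): leave this block
        step = sq_norm(grad) / curvature  # exact line search for a quadratic
        weights[l] = [[w - step * g for w, g in zip(wr, gr)]
                      for wr, gr in zip(weights[l], grad)]
    return weights

# Hypothetical toy problem: three 2x2 layers, three samples stored as columns.
x = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
y = [[2.0, 1.0, 3.0], [0.0, 3.0, 3.0]]
weights = [identity(2) for _ in range(3)]
losses = [loss(weights, x, y)]
for _ in range(20):
    bcgd_sweep(weights, x, y)
    losses.append(loss(weights, x, y))
```

Because the loss is quadratic in any single layer when the others are held fixed, the line-search step cannot increase it, so `losses` is non-increasing up to rounding error.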

Revisiting Multi-Step Nonlinearity Compensation with Machine Learning


Title	Revisiting Multi-Step Nonlinearity Compensation with Machine Learning
Authors	Christian Häger, Henry D. Pfister, Rick M. Bütler, Gabriele Liga, Alex Alvarado
Abstract	For the efficient compensation of fiber nonlinearity, one of the guiding principles appears to be: fewer steps are better and more efficient. We challenge this assumption and show that carefully designed multi-step approaches can lead to better performance-complexity trade-offs than their few-step counterparts.
Tasks
Published	2019-04-22
URL	http://arxiv.org/abs/1904.09807v1
PDF	http://arxiv.org/pdf/1904.09807v1.pdf
PWC	https://paperswithcode.com/paper/revisiting-multi-step-nonlinearity
Repo
Framework

KRNET: Image Denoising with Kernel Regulation Network


Title	KRNET: Image Denoising with Kernel Regulation Network
Authors	Peng Liu, Xiaoxiao Zhou, Junyiyang Li, El Basha Mohammad D, Ruogu Fang
Abstract	One popular strategy for image denoising is to design a generalized regularization term that is capable of exploring the implicit prior underlying data observation. Convolutional neural networks (CNN) have shown the powerful capability to learn image prior information through a stack of layers defined by a combination of kernels (filters) on the input. However, existing CNN-based methods mainly focus on synthetic gray-scale images. These methods still exhibit low performance when tackling multi-channel color image denoising. In this paper, we optimize CNN regularization capability by developing a kernel regulation module. In particular, we propose a kernel regulation network-block, referred to as KR-block, by integrating the merits of both large and small kernels, that can effectively estimate features in solving image denoising. We build a deep CNN-based denoiser, referred to as KRNET, via concatenating multiple KR-blocks. We evaluate KRNET on additive white Gaussian noise (AWGN), multi-channel (MC) noise, and realistic noise, where KRNET obtains significant performance gains over state-of-the-art methods across a wide spectrum of noise levels.
Tasks	Denoising, Image Denoising
Published	2019-10-20
URL	https://arxiv.org/abs/1910.08867v1
PDF	https://arxiv.org/pdf/1910.08867v1.pdf
PWC	https://paperswithcode.com/paper/krnet-image-denoising-with-kernel-regulation
Repo
Framework

Retro-Actions: Learning ‘Close’ by Time-Reversing ‘Open’ Videos


Title	Retro-Actions: Learning ‘Close’ by Time-Reversing ‘Open’ Videos
Authors	Will Price, Dima Damen
Abstract	We investigate video transforms that result in class-homogeneous label-transforms. These are video transforms that consistently maintain or modify the labels of all videos in each class. We propose a general approach to discover invariant classes, whose transformed examples maintain their label; pairs of equivariant classes, whose transformed examples exchange their labels; and novel-generating classes, whose transformed examples belong to a new class outside the dataset. Label transforms offer additional supervision previously unexplored in video recognition benefiting data augmentation and enabling zero-shot learning opportunities by learning a class from transformed videos of its counterpart. Amongst such video transforms, we study horizontal-flipping, time-reversal, and their composition. We highlight errors in naively using horizontal-flipping as a form of data augmentation in video. Next, we validate the realism of time-reversed videos through a human perception study where people exhibit equal preference for forward and time-reversed videos. Finally, we test our approach on two datasets, Jester and Something-Something, evaluating the three video transforms for zero-shot learning and data augmentation. Our results show that gestures such as zooming in can be learnt from zooming out in a zero-shot setting, as well as more complex actions with state transitions such as digging something out of something from burying something in something.
Tasks	Data Augmentation, Video Recognition, Zero-Shot Learning
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09422v1
PDF	https://arxiv.org/pdf/1909.09422v1.pdf
PWC	https://paperswithcode.com/paper/retro-actions-learning-close-by-time
Repo
Framework
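
The video transforms and label transforms described in the abstract above can be pictured with a small sketch. A video is assumed here to be a list of frames, each frame a list of pixel rows. The zooming and digging/burying pairs come from the abstract; the swiping pair used for horizontal flipping is a hypothetical example, and treating every unlisted class as invariant is a simplification of this sketch rather than the authors' procedure.

```python
def time_reverse(video):
    """Play the frames in the opposite order."""
    return video[::-1]

def horizontal_flip(video):
    """Mirror every row of every frame left to right."""
    return [[row[::-1] for row in frame] for frame in video]

# Equivariant class pairs under time reversal (taken from the abstract).
TIME_REVERSAL_LABELS = {
    "zooming in": "zooming out",
    "zooming out": "zooming in",
    "digging something out of something": "burying something in something",
    "burying something in something": "digging something out of something",
}

# Hypothetical equivariant pair under horizontal flipping (illustration only).
HORIZONTAL_FLIP_LABELS = {
    "swiping left": "swiping right",
    "swiping right": "swiping left",
}

def transform_label(label, label_map):
    """Return the label of the transformed video.

    Classes missing from the map are treated as invariant, which is a
    simplifying assumption of this sketch.
    """
    return label_map.get(label, label)

def zero_shot_examples(videos, labels, target_label, label_map, transform):
    """Build training examples for target_label from its counterpart class."""
    return [
        (transform(video), target_label)
        for video, label in zip(videos, labels)
        if transform_label(label, label_map) == target_label
    ]

# Two tiny 2-frame "videos" with 1x2 frames.
videos = [[[[0, 1]], [[2, 3]]], [[[4, 5]], [[6, 7]]]]
labels = ["zooming out", "swiping left"]
synthetic = zero_shot_examples(
    videos, labels, "zooming in", TIME_REVERSAL_LABELS, time_reverse
)
```

Here `synthetic` holds one time-reversed "zooming out" clip relabelled as "zooming in". The same helper also shows why naive flipping is risky as augmentation: a flipped "swiping left" clip keeps its label unless the label map is applied.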

Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules


Title	Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules
Authors	Niklas W. A. Gebauer, Michael Gastegger, Kristof T. Schütt
Abstract	Deep learning has proven to yield fast and accurate predictions of quantum-chemical properties to accelerate the discovery of novel molecules and materials. As an exhaustive exploration of the vast chemical space is still infeasible, we require generative models that guide our search towards systems with desired properties. While graph-based models have previously been proposed, they are restricted by a lack of spatial information such that they are unable to recognize spatial isomerism and non-bonded interactions. Here, we introduce a generative neural network for 3d point sets that respects the rotational invariance of the targeted structures. We apply it to the generation of molecules and demonstrate its ability to approximate the distribution of equilibrium structures using spatial metrics as well as established measures from chemoinformatics. As our model is able to capture the complex relationship between 3d geometry and electronic properties, we bias the distribution of the generator towards molecules with a small HOMO-LUMO gap - an important property for the design of organic solar cells.
Tasks
Published	2019-06-02
URL	https://arxiv.org/abs/1906.00957v3
PDF	https://arxiv.org/pdf/1906.00957v3.pdf
PWC	https://paperswithcode.com/paper/symmetry-adapted-generation-of-3d-point-sets
Repo
Framework

Image Seam-Carving by Controlling Positional Distribution of Seams


Title	Image Seam-Carving by Controlling Positional Distribution of Seams
Authors	Mahdi Ahmadi, Nader Karimi, Shadrokh Samavi
Abstract	Image retargeting is a new image processing task that changes the aspect ratio of images. One of the most famous image-retargeting algorithms is seam-carving. Although seam-carving is fast and straightforward, it usually distorts the images. In this paper, we introduce a new seam-carving algorithm that not only has the simplicity of the original seam-carving but also avoids the unwanted distortion that usually occurs in the original method. The positional distribution of seams is introduced. We show that the proposed method outperforms the original seam-carving in terms of retargeted image quality assessment and seam coagulation measures.
Tasks	Image Quality Assessment
Published	2019-12-31
URL	https://arxiv.org/abs/1912.13214v1
PDF	https://arxiv.org/pdf/1912.13214v1.pdf
PWC	https://paperswithcode.com/paper/image-seam-carving-by-controlling-positional
Repo
Framework
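
For readers unfamiliar with the baseline that the abstract above improves on, here is a minimal sketch of classical seam-carving on a grayscale image stored as a list of rows. It covers only the original dynamic-programming formulation; the positional-distribution control proposed in the paper is not described in the abstract and is not implemented here. The gradient-magnitude energy and the function names are assumptions of this sketch.

```python
def energy_map(img):
    """Sum of absolute horizontal and vertical differences at each pixel."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            dx = img[r][min(c + 1, w - 1)] - img[r][max(c - 1, 0)]
            dy = img[min(r + 1, h - 1)][c] - img[max(r - 1, 0)][c]
            energy[r][c] = abs(dx) + abs(dy)
    return energy

def find_vertical_seam(energy):
    """Return one column index per row forming a minimum-energy connected seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest way to reach each pixel from the top row.
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(c - 1, 0), min(c + 1, w - 1)
            cost[r][c] += min(cost[r - 1][lo:hi + 1])
    # Backward pass: walk up from the cheapest pixel in the bottom row.
    c = min(range(w), key=lambda j: cost[h - 1][j])
    seam = [c]
    for r in range(h - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, w - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam

def remove_seam(img, seam):
    """Delete the seam pixel from every row, shrinking the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(img, seam)]

image = [
    [1, 1, 9, 1],
    [1, 1, 9, 1],
    [1, 1, 9, 1],
]
seam = find_vertical_seam(energy_map(image))
carved = remove_seam(image, seam)
```

Repeating the three steps narrows the image one column at a time. Nothing in this baseline stops successive seams from clustering in the same low-energy region, which is the kind of behaviour a constraint on seam positions would address.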

Continuous Dice Coefficient: a Method for Evaluating Probabilistic Segmentations


Title	Continuous Dice Coefficient: a Method for Evaluating Probabilistic Segmentations
Authors	Reuben R Shamir, Yuval Duchin, Jinyoung Kim, Guillermo Sapiro, Noam Harel
Abstract	Objective: Overlapping measures are often utilized to quantify the similarity between two binary regions. However, modern segmentation algorithms output a probability or confidence map with continuous values in the zero-to-one interval. Moreover, these binary overlapping measures are biased to structure size. Addressing these challenges is the objective of this work. Methods: We extend the definition of the classical Dice coefficient (DC) overlap to facilitate the direct comparison of a ground truth binary image with a probabilistic map. We call the extended method continuous Dice coefficient (cDC) and show that 1) cDC is less than or equal to 1 and cDC = 1 if-and-only-if the structures' overlap is complete, and, 2) cDC is monotonically decreasing with the amount of overlap. We compare the classical DC and the cDC in a simulation of partial volume effects that incorporates segmentations of common targets for deep-brain stimulation. Lastly, we investigate the cDC for an automatic segmentation of the subthalamic nucleus (STN). Results: Partial volume effect simulation on the thalamus (large structure) resulted in DC and cDC averages (SD) of 0.98 (0.006) and 0.99 (0.001), respectively. For the subthalamic nucleus (small structure), DC and cDC were 0.86 (0.025) and 0.97 (0.006), respectively. The DC and cDC for automatic STN segmentation were 0.66 and 0.80, respectively. Conclusion: The cDC is well defined for probabilistic segmentation, less biased to structure size and more robust to partial volume effects in comparison to DC. Significance: The proposed method facilitates a better evaluation of segmentation algorithms. As a better measurement tool, it opens the door for the development of better segmentation methods.
Tasks
Published	2019-06-26
URL	https://arxiv.org/abs/1906.11031v1
PDF	https://arxiv.org/pdf/1906.11031v1.pdf
PWC	https://paperswithcode.com/paper/continuous-dice-coefficient-a-method-for
Repo
Framework
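
The abstract above does not spell out the cDC formula, so the sketch below rests on an assumed definition that should be checked against the paper: cDC = 2·Σ aᵢbᵢ / (c·Σ aᵢ + Σ bᵢ), with c = Σ aᵢbᵢ / Σ aᵢ·sign(bᵢ) and c = 1 when that denominator is zero, where a is the binary ground truth and b the probabilistic map. A plain "soft" Dice score is included for comparison. Both function names, and the value returned for two empty inputs, are choices made for this sketch.

```python
def soft_dice(truth, prob):
    """Classical Dice formula applied directly to continuous values."""
    overlap = sum(a * b for a, b in zip(truth, prob))
    denom = sum(truth) + sum(prob)
    return 2.0 * overlap / denom if denom > 0 else 1.0

def continuous_dice(truth, prob):
    """Continuous Dice coefficient under the definition assumed above.

    truth: flat sequence of 0/1 ground-truth labels.
    prob:  flat sequence of values in [0, 1], same length as truth.
    """
    overlap = sum(a * b for a, b in zip(truth, prob))
    # Number of ground-truth voxels where the map is non-zero.
    support = sum(a for a, b in zip(truth, prob) if b > 0)
    c = overlap / support if support > 0 else 1.0
    denom = c * sum(truth) + sum(prob)
    # Both inputs empty: return 1.0 by convention (an arbitrary choice here).
    return 2.0 * overlap / denom if denom > 0 else 1.0

truth = [1, 1, 0, 0]
confident = [1.0, 1.0, 0.0, 0.0]   # complete overlap, full confidence
hesitant = [0.5, 0.5, 0.0, 0.0]    # complete overlap, half confidence
partial = [0.5, 0.0, 0.0, 0.0]     # only one of two voxels covered

scores = {
    "soft_confident": soft_dice(truth, confident),
    "soft_hesitant": soft_dice(truth, hesitant),
    "cdc_hesitant": continuous_dice(truth, hesitant),
    "cdc_partial": continuous_dice(truth, partial),
}
```

Under this definition the hesitant map still scores 1, since it covers exactly the true region, whereas the soft Dice score penalises it for low confidence alone. The partially overlapping map scores below 1 with either measure.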

Online Convex Matrix Factorization with Representative Regions


Title	Online Convex Matrix Factorization with Representative Regions
Authors	Abhishek Agarwal, Jianhao Peng, Olgica Milenkovic
Abstract	Matrix factorization (MF) is a versatile learning method that has found wide applications in various data-driven disciplines. Still, many MF algorithms do not adequately scale with the size of available datasets and/or lack interpretability. To improve the computational efficiency of the method, an online (streaming) MF algorithm was proposed in Mairal et al. [2010]. To enable data interpretability, a constrained version of MF, termed convex MF, was introduced in Ding et al. [2010]. In the latter work, the basis vectors are required to lie in the convex hull of the data samples, thereby ensuring that every basis can be interpreted as a weighted combination of data samples. No current algorithmic solutions for online convex MF are known, as it is challenging to find adequate convex bases without having access to the complete dataset. We address both problems by proposing the first online convex MF algorithm that maintains a collection of constant-size sets of representative data samples needed for interpreting each of the bases (Ding et al. [2010]) and has the same almost sure convergence guarantees as the online learning algorithm of Mairal et al. [2010]. Our proof techniques combine random coordinate descent algorithms with specialized quasi-martingale convergence analysis. Experiments on synthetic and real-world datasets show significant computational savings of the proposed online convex MF method compared to classical convex MF. Since the proposed method maintains small representative sets of data samples needed for convex interpretations, it is related to a body of work in theoretical computer science, pertaining to generating point sets (Blum et al. [2016]), and in computer vision, pertaining to archetypal analysis (Mei et al. [2018]). Nevertheless, it differs from these lines of work both in terms of the objective and algorithmic implementations.
Tasks	Dictionary Learning, Dimensionality Reduction
Published	2019-04-04
URL	https://arxiv.org/abs/1904.02580v2
PDF	https://arxiv.org/pdf/1904.02580v2.pdf
PWC	https://paperswithcode.com/paper/online-convex-dictionary-learning
Repo
Framework

Subjective Quality Assessment of Ground-based Camera Images


Title	Subjective Quality Assessment of Ground-based Camera Images
Authors	Lucie Lévêque, Soumyabrata Dev, Murhaf Hossari, Yee Hui Lee, Stefan Winkler
Abstract	Image quality assessment is critical to control and maintain the perceived quality of visual content. Both subjective and objective evaluations can be utilised; however, subjective image quality assessment is currently considered the most reliable approach. Databases containing distorted images and mean opinion scores are needed in the field of atmospheric research with a view to improving the current state-of-the-art methodologies. In this paper, we focus on using ground-based sky camera images to understand atmospheric events. We present a new image quality assessment dataset containing original and distorted nighttime images of sky/cloud from the SWINSEG database. Subjective quality assessment was carried out in controlled conditions, as recommended by the ITU. Statistical analyses of the subjective scores showed the impact of noise type and distortion level on the perceived quality.
Tasks	Image Quality Assessment
Published	2019-12-16
URL	https://arxiv.org/abs/1912.07192v1
PDF	https://arxiv.org/pdf/1912.07192v1.pdf
PWC	https://paperswithcode.com/paper/subjective-quality-assessment-of-ground-based
Repo
Framework

Combining Spans into Entities: A Neural Two-Stage Approach for Recognizing Discontiguous Entities


Title	Combining Spans into Entities: A Neural Two-Stage Approach for Recognizing Discontiguous Entities
Authors	Bailin Wang, Wei Lu
Abstract	In medical documents, it is possible that an entity of interest not only contains a discontiguous sequence of words but also overlaps with another entity. Entities of such structures are intrinsically hard to recognize due to the large space of possible entity combinations. In this work, we propose a neural two-stage approach to recognize discontiguous and overlapping entities by decomposing this problem into two subtasks: 1) it first detects all the overlapping spans that either form entities on their own or appear as segments of discontiguous entities, based on the representation of segmental hypergraph, 2) next it learns to combine these segments into discontiguous entities with a classifier, which filters out other incorrect combinations of segments. Two neural components are designed for these subtasks respectively and they are learned jointly using a shared encoder for text. Our model achieves state-of-the-art performance on a standard dataset, even in the absence of external features that previous methods used.
Tasks
Published	2019-09-03
URL	https://arxiv.org/abs/1909.00930v1
PDF	https://arxiv.org/pdf/1909.00930v1.pdf
PWC	https://paperswithcode.com/paper/combining-spans-into-entities-a-neural-two
Repo
Framework

Automatic quality assessment for 2D fetal sonographic standard plane based on multi-task learning


Title	Automatic quality assessment for 2D fetal sonographic standard plane based on multi-task learning
Authors	Hong Luo, Han Liu, Kejun Li, Bo Zhang
Abstract	The quality control of fetal sonographic (FS) images is essential for the correct biometric measurements and fetal anomaly diagnosis. However, quality control requires professional sonographers to perform and is often labor-intensive. To solve this problem, we propose an automatic image quality assessment scheme based on multi-task learning to assist in FS image quality control. An essential criterion for FS image quality control is that all the essential anatomical structures in the section should appear full and remarkable with a clear boundary. Therefore, our scheme aims to identify those essential anatomical structures to judge whether an FS image is the standard image, which is achieved by three convolutional neural networks. The Feature Extraction Network aims to extract deep level features of FS images. Based on the extracted features, the Class Prediction Network determines whether the structure meets the standard and the Region Proposal Network identifies its position. The scheme has been applied to three types of fetal sections: the head, abdomen, and heart. The experimental results show that our method can make a quality assessment of an FS image in less than a second. Also, our method achieves competitive performance in both detection and classification compared with state-of-the-art methods.
Tasks	Image Quality Assessment, Multi-Task Learning
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05260v1
PDF	https://arxiv.org/pdf/1912.05260v1.pdf
PWC	https://paperswithcode.com/paper/automatic-quality-assessment-for-2d-fetal
Repo
Framework

NLPR@SRPOL at SemEval-2019 Task 6 and Task 5: Linguistically enhanced deep learning offensive sentence classifier


Title	NLPR@SRPOL at SemEval-2019 Task 6 and Task 5: Linguistically enhanced deep learning offensive sentence classifier
Authors	Alessandro Seganti, Helena Sobol, Iryna Orlova, Hannam Kim, Jakub Staniszewski, Tymoteusz Krumholc, Krystian Koziel
Abstract	The paper presents a system developed for the SemEval-2019 competition Task 5 hatEval (Basile et al., 2019) (team name: LU Team) and Task 6 OffensEval (Zampieri et al., 2019b) (team name: NLPR@SRPOL), where we achieved 2nd position in Subtask C. The system combines in an ensemble several models (LSTM, Transformer, OpenAI’s GPT, Random forest, SVM) with various embeddings (custom, ELMo, fastText, Universal Encoder) together with additional linguistic features (number of blacklisted words, special characters, etc.). The system works with a multi-tier blacklist and a large corpus of crawled data, annotated for general offensiveness. In the paper we present an extensive analysis of our results and show how the combination of features and embeddings affects the performance of the models.
Tasks
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05152v1
PDF	http://arxiv.org/pdf/1904.05152v1.pdf
PWC	https://paperswithcode.com/paper/nlprsrpol-at-semeval-2019-task-6-and-task-5
Repo
Framework

Active Learning for Event Detection in Support of Disaster Analysis Applications


Title	Active Learning for Event Detection in Support of Disaster Analysis Applications
Authors	Naina Said, Kashif Ahmad, Nicola Conci, Ala Al-Fuqaha
Abstract	Disaster analysis in social media content is an interesting research domain with an abundance of data. However, there is a lack of labeled data that can be used to train machine learning models for disaster analysis applications. Active learning is one possible solution to this problem. To this end, in this paper we propose and assess the efficacy of an active learning based framework for disaster analysis using images shared on social media outlets. Specifically, we analyze the performance of different active learning techniques employing several sampling and disagreement strategies. Moreover, we collect a large-scale dataset covering images from eight common types of natural disasters. The experimental results show that the use of active learning techniques for disaster analysis using images results in a performance comparable to that obtained using human annotated images, and could be used in frameworks for disaster analysis in images without the tedious job of manual annotation.
Tasks	Active Learning
Published	2019-09-27
URL	https://arxiv.org/abs/1909.12601v1
PDF	https://arxiv.org/pdf/1909.12601v1.pdf
PWC	https://paperswithcode.com/paper/active-learning-for-event-detection-in
Repo
Framework
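
The abstract above refers to "sampling and disagreement strategies" without naming them. As a generic illustration, the sketch below shows two textbook choices: entropy-based uncertainty sampling over a single model's class probabilities, and vote entropy over a committee's predicted labels. Whether the paper uses these exact strategies is not stated in the abstract, and the function names and the "fire"/"flood" class labels are hypothetical.

```python
import math
from collections import Counter

def entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(pool_probs, k):
    """Indices of the k unlabeled items whose predictions have highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]

def vote_entropy(votes):
    """Disagreement of a committee, measured as entropy of its label votes."""
    counts = Counter(votes)
    return entropy([n / len(votes) for n in counts.values()])

# Predicted class probabilities for three unlabeled images (two classes).
pool_probs = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
to_label = select_most_uncertain(pool_probs, 2)

# Labels predicted by a three-member committee for two images.
agreed = vote_entropy(["fire", "fire", "fire"])
split = vote_entropy(["fire", "fire", "flood"])
```

In an active learning loop, the selected indices would be sent for annotation, added to the training set, and the model retrained before the next round of selection.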