DeLTA 2021 Abstracts


Area 1 - Computer Vision Applications

Full Papers
Paper Nr: 8
Title:

Marine Vessel Tracking using a Monocular Camera

Authors:

Tobias Jacob, Raffaele Galliera, Muddasar Ali and Sikha Bagui

Abstract: In this paper, a new technique for camera calibration using only GPS data is presented. A new way of tracking objects that move on a plane in a video is achieved by using the location and size of the bounding box to estimate the distance, achieving an average prediction error of 5.55 m per 100 m of distance from the camera. This solution can run in real time at the edge, achieving efficient inference in a low-powered IoT environment, while also being able to track multiple different vessels.
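As background for the bounding-box-based distance estimate, the sketch below shows the classical pinhole-camera relation such methods typically build on; the focal length and vessel height are illustrative assumptions, not values or the calibration procedure from the paper:

    def estimate_distance(bbox_height_px, focal_length_px=1400.0,
                          object_height_m=10.0):
        # Pinhole model: an object of known real-world height appears smaller
        # in the image the farther it is from the camera.
        return focal_length_px * object_height_m / bbox_height_px

    print(estimate_distance(140.0))  # -> 100.0 m under the assumed parameters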

Paper Nr: 17
Title:

Unsupervised Domain Extension for Nighttime Semantic Segmentation in Urban Scenes

Authors:

Sebastian Scherer, Robin Schön, Katja Ludwig and Rainer Lienhart

Abstract: This paper deals with the problem of semantic image segmentation of street scenes at night, as recent advances in semantic image segmentation mainly relate to daytime images. We propose a method to extend the learned domain of daytime images to nighttime images based on an extended version of the CycleGAN framework and its integration into a self-supervised learning framework. The aim of the method is to reduce the cost of human annotation of night images by robustly transferring images from day to night and training the segmentation network to make consistent predictions in both domains, allowing completely unlabelled images to be used in training. Experiments show that our approach significantly improves the performance on nighttime images while keeping the performance on daytime images stable. Furthermore, our method can be applied to many other problem formulations and is not specifically designed for semantic segmentation.
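The abstract does not spell out the consistency objective; the following is a minimal sketch of one common way to enforce day/night prediction consistency, assuming a segmentation network seg_net and a day image paired with its CycleGAN-translated night version:

    import torch.nn.functional as F

    def consistency_loss(seg_net, day_img, night_img):
        # Logits for the original day image and its translated night twin,
        # each of shape (batch, classes, H, W).
        logits_day = seg_net(day_img)
        logits_night = seg_net(night_img)
        # Penalise the night prediction for diverging from the (frozen)
        # day prediction -- a standard self-supervised consistency term.
        return F.kl_div(F.log_softmax(logits_night, dim=1),
                        F.softmax(logits_day, dim=1).detach(),
                        reduction="batchmean")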

Paper Nr: 32
Title:

Synthesizing Fundus Photographies for Training Segmentation Networks

Authors:

Jannes S. Magnusson, Ahmed J. Afifi, Shengjia Zhang, Andreas Ley and Olaf Hellwich

Abstract: Automated semantic segmentation of medical imagery is a vital application of modern Deep Learning methods, as it can support clinicians in their decision-making processes. However, training these models requires a large amount of training data, which can be especially hard to obtain in the medical field due to ethical and data protection regulations. In this paper, we present a novel method to synthesize realistic retinal fundus images. The process mainly includes vessel tree generation and the synthesis of non-vascular regions (retinal background, fovea, and optic disc). We show that combining the (virtually) unlimited synthetic data with the limited real data during training boosts segmentation performance beyond what can be achieved with real data alone. We test the performance of the proposed method on the DRIVE and STARE databases. The results highlight that the proposed data augmentation technique achieves state-of-the-art performance and accuracy.

Short Papers
Paper Nr: 30
Title:

Sub-dataset Generation and Matching for Crack Detection on Brick Walls using Convolutional Neural Networks

Authors:

Mehedi H. Talukder, Shuhei Ota, Masato Takanokura and Nobuaki Ishii

Abstract: Crack detection is an issue of significant interest in ensuring the safety of buildings. Conventionally, a maintenance engineer performs crack detection manually, which is laborious and time-consuming. Therefore, a systematic crack detection method is required. Among the existing methods, convolutional neural networks (CNNs) are the most effective; however, they often fail in the case of brick walls. There are several types of bricks, and some may appear to have cracks owing to their structure. Additionally, the joints between bricks may appear as cracks. It is theorized that if sub-datasets are generated based on image attributes, and a proper sub-dataset is selected by matching the test image against the sub-datasets, then the performance of the CNN can be improved. In this study, a method consisting of sub-dataset generation and matching is proposed to improve crack detection on brick walls. CNN learning is conducted with each sub-dataset, and crack detection is performed using the appropriate learned CNN, selected by matching the test images with the images in the sub-datasets. Four performance metrics, namely precision, recall, F-measure, and accuracy, are used for performance evaluation. The numerical experiments show that the proposed method improves crack detection on brick walls.
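A minimal sketch of the matching step, assuming each sub-dataset is summarised by a mean grey-level histogram; the histogram-intersection criterion is an assumption of this sketch, as the abstract does not name the distance measure used:

    import numpy as np

    def match_subdataset(test_hist, subdataset_hists):
        # Higher histogram intersection = better match; the CNN trained on
        # the winning sub-dataset is then used for crack detection.
        scores = [np.minimum(test_hist, h).sum() for h in subdataset_hists]
        return int(np.argmax(scores))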

Paper Nr: 4
Title:

Exploring Alternatives to Softmax Function

Authors:

Kunal Banerjee, Vishak P. C., Rishi R. Gupta, Kartik Vyas, Anushree H. and Biswajit Mishra

Abstract: The softmax function is widely used in artificial neural networks for multiclass classification, multilabel classification, attention mechanisms, etc. However, its efficacy is often questioned in the literature. The log-softmax loss has been shown to belong to a more generic class of loss functions, called the spherical family, and its member log-Taylor softmax loss is arguably the best alternative in this class. In another approach, which tries to enhance the discriminative nature of the softmax function, soft-margin softmax (SM-softmax) has been proposed as the most suitable alternative. In this work, we investigate Taylor softmax, SM-softmax and our proposed SM-Taylor softmax, an amalgamation of the two earlier functions, as alternatives to the softmax function. Furthermore, we explore the effect of expanding Taylor softmax up to ten terms (the original work proposed expanding only to two terms) along with the ramifications of treating Taylor softmax as a finite or infinite series during backpropagation. Our experiments for the image classification task on different datasets reveal that there is always a configuration of the SM-Taylor softmax function that outperforms the normal softmax function and its other alternatives.
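A minimal NumPy sketch of the Taylor softmax the paper builds on: the exponential in the ordinary softmax is replaced by its order-n Taylor polynomial (n = 2 in the original formulation; an even n keeps the polynomial positive):

    import numpy as np
    from math import factorial

    def taylor_softmax(x, n=2):
        # Order-n Taylor polynomial of exp(x), normalised to sum to one.
        poly = sum(x**i / factorial(i) for i in range(n + 1))
        return poly / poly.sum(axis=-1, keepdims=True)

    logits = np.array([1.0, 2.0, 3.0])
    print(taylor_softmax(logits, n=2))  # compare: np.exp(logits) / np.exp(logits).sum()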

Paper Nr: 9
Title:

Interpretable Deep Learning for Marble Tiles Sorting

Authors:

Athanasios G. Ouzounis, George K. Sidiropoulos, George A. Papakostas, Ilias T. Sarafis, Andreas Stamkos and George Solakis

Abstract: One of the main problems in the final stage of the production line of ornamental stone tiles is the process of quality control and product classification. Successful classification of natural stone tiles based on their aesthetic value can raise profitability. Machine learning is a technology capable of fulfilling this task at higher speed than conventional methods based on human experts. This paper examines the performance of 15 convolutional neural networks in sorting dolomitic stone tiles with respect to the models’ accuracy and interpretability. For the first time, these two performance indices of deep learning models are studied at scale for the industrial application of machine-vision-based marble sorting. The experiments revealed that the examined convolutional neural networks are able to predict the quality of marble tiles in an industrial environment accurately and in an interpretable way. Furthermore, the DenseNet201 model showed the best accuracy, 83.24%, a performance supported by its consideration of the appropriate quality patterns on the marble tiles’ surface.
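For illustration, a transfer-learning setup for the best-performing backbone is sketched below; the number of quality classes and the use of ImageNet weights are assumptions of this sketch, not the paper's training recipe:

    import torch.nn as nn
    from torchvision import models

    NUM_QUALITY_CLASSES = 4  # hypothetical number of aesthetic grades
    model = models.densenet201(weights="IMAGENET1K_V1")
    # Replace the ImageNet head with a tile-quality classifier.
    model.classifier = nn.Linear(model.classifier.in_features,
                                 NUM_QUALITY_CLASSES)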

Paper Nr: 10
Title:

Applications of Learning Methods to Imaging Issues in Archaeology, Regarding Ancient Ceramic Manufacturing

Authors:

K. Dia, V. L. Coli, L. Blanc-Féraud, J. Leblond, L. Gomart and D. Binder

Abstract: Archaeological studies involve more and more numerical data analysis. In this work, we are interested in the analysis and classification of tomographic images of ceramic sherds in order to help archaeologists learn about the fabrication processes of ancient pottery. More specifically, a particular manufacturing process (spiral patchwork) has recently been discovered in early Neolithic Mediterranean sites, alongside a more traditional coiling technique. It has been shown that the ceramic pore distribution in available tomographic images of both archaeological and experimental samples can reveal which manufacturing technique was used. Indeed, with the spiral patchwork, the pores exhibit spiral-like behaviours, whereas with the traditional one they are distributed along parallel lines, especially in the experimental samples. However, in archaeological samples, these distributions are very noisy, making analysis and discrimination difficult. Here, we investigate how learning methods (Deep Learning and Support Vector Machines) can be used to address these numerically difficult problems. In particular, we study how the results depend on the input data (either raw data from the tomographic device, or data after a preliminary pore-segmentation step), and the quality of the information they could provide to archaeologists.
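A minimal sketch of the Support Vector Machine branch, assuming each sherd is described by a feature vector derived from its pore distribution (the feature layout and labels here are placeholders, not the paper's pipeline):

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # X: per-sherd pore-distribution descriptors; y: manufacturing technique
    # (0 = coiling, 1 = spiral patchwork) -- placeholder data layout.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    # clf.fit(X_train, y_train); clf.score(X_test, y_test)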

Paper Nr: 27
Title:

Automatic Segmentation of Car Parts and Generation of Large Car Texture Images

Authors:

Yan-Yu Lin, Chia-Ching Yu and Chuen-Horng Lin

Abstract: This study segments the car parts in a car model data collection and then uses the segmented parts to generate large car texture images, in order to support automatic detection and classification of future 3D car models. The segmentation of car parts proposed in this study is divided into simple and fine segmentation. Since there are few texture images of car parts, this study generates many automobile texture images from various parts. First, the parts are segmented from the texture images in an automated way; the RGB arrangement is changed, the colour is changed, and the parts are rotated in different ways. This study also made various changes to the background, and then randomly combined the various parts and backgrounds into large texture images. In the experiment, the car parts were divided into six categories: the left door, the right door, the roof, the front body, the rear body, and the wheels. The automated segmentation achieves good results on texture images for both simple and fine car-part segmentation. The segmented parts are then combined in multiple groups to automatically generate large car texture images. It is hoped that these results can be applied in practice to simulation systems.

Area 2 - Models and Algorithms

Short Papers
Paper Nr: 15
Title:

Sustainable Development Goals Monitoring and Forecasting using Time Series Analysis

Authors:

Yassir Alharbi, Daniel Arribas-Bel and Frans Coenen

Abstract: A framework for UN Sustainable Development Goal (SDG) attainment prediction is presented: the SDG Track, Trace & Forecast (SDG-TTF) framework. Unlike previous SDG attainment frameworks, SDG-TTF takes into account the potential for causal relationships between SDG indicators, both within the geographic entity under consideration (intra-entity) and with the geographic entities neighbouring the current entity (inter-entity). The challenge lies in the discovery of such causal relationships. Six alternative mechanisms are considered. The identified relationships are used to build multivariate time series prediction models, which feed into a bottom-up SDG prediction taxonomy, which in turn is used to make SDG attainment predictions. The framework is fully described and evaluated. The evaluation demonstrates that the SDG-TTF framework is able to produce better predictions than alternative models which do not take into consideration the potential for intra- and inter-entity causal relationships.
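As a minimal illustration of the multivariate time series modelling involved (not the SDG-TTF framework itself), a vector autoregression over a few SDG indicators for one geographic entity might look like this:

    import pandas as pd
    from statsmodels.tsa.api import VAR

    def forecast_indicators(df: pd.DataFrame, steps: int = 5):
        # df: one column per SDG indicator, one row per year (illustrative).
        fitted = VAR(df).fit(maxlags=2, ic="aic")
        # Forecast the next `steps` periods from the last observed lags.
        return fitted.forecast(df.values[-fitted.k_ar:], steps=steps)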

Paper Nr: 18
Title:

Deep Generative Models to Extend Active Directory Graphs with Honeypot Users

Authors:

Ondřej Lukáš and Sebastian Garcia

Abstract: Active Directory (AD) is a crucial element of large organizations, given its central role in managing access to resources. Since AD is used by all users in the organization, it is hard to detect attackers. We propose to generate and place fake users (honeyusers) in AD structures to help detect attacks. However, not just any honeyuser will attract attackers. Our method generates honeyusers with a Variational Autoencoder that enriches the AD structure with well-positioned honeyusers. It first learns the embeddings of the original nodes and edges in the AD, then uses a modified Bidirectional DAG-RNN to encode the parameters of the probability distribution of the latent space of node representations. Finally, it samples nodes from this distribution and uses an MLP to decide where the nodes are connected. The model was evaluated by the similarity of the generated AD to the original, by the positions of the new nodes, by comparison with GraphRNN and, finally, by having real intruders attack the generated AD structure to see if they select the honeyusers. Results show that our machine learning model is good enough to generate well-placed honeyusers for existing AD structures, so that intruders are lured into them.

Paper Nr: 22
Title:

A Grid-based Fuzzy C-means Clustering Algorithm with Unknown Number of Clusters

Authors:

Tao Mi, Tao Du and Shouning Qu

Abstract: The fuzzy c-means (FCM) algorithm is one of the most popular clustering methods, and various FCM-based extensions have been proposed in the literature. However, these algorithms need the number of clusters or related parameters to be specified in advance, which results in poor robustness of the clustering. To address these problems, an improved grid-based fuzzy c-means clustering algorithm with an unknown number of clusters (G-FCM) is proposed. In G-FCM, the data points are first partitioned by grids, which greatly reduces the cost of clustering massive data; then a robust learning-based FCM framework is designed to avoid the influence of the initial parameter selection, and the optimal number of clusters is determined automatically. Finally, experiments compare the performance of G-FCM with other typical algorithms, showing that G-FCM has an advantage in clustering massive data.
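For reference, one iteration of the standard FCM updates that G-FCM builds on is sketched below; the grid partitioning and the automatic selection of the cluster number are not reproduced here:

    import numpy as np

    def fcm_step(X, centers, m=2.0):
        # Distances from every point to every centre, shape (n_points, n_clusters).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)),
        # normalised over the clusters.
        u = 1.0 / d ** (2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)
        # Centre update: fuzzy-weighted mean of the points.
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        return u, centers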

Paper Nr: 12
Title:

Improving Information Privacy and Security: Strengthening Digital Literacy in Organisations

Authors:

Guy Toko and Kagisho Losaba

Abstract: In a world of instant information, information privacy and security are under constant attack. With that being the case, organisations are expected to comply with regulations for securing information and to ensure that information assets are protected. Employees are also expected to operate within the frameworks that the organisation has adopted, which raises the question of digital literacy among the workforce in order to achieve the set goals. The security of information refers to the manner in which information is stored, processed and transmitted in order to comply with the organisation’s information systems frameworks. The privacy of information can be described as the safeguarding of information related to a particular subject’s identity. In addition, information security is a significant instrument for protecting information resources and business goals, while privacy is centred on the safety of a person's rights and privileges concerning the same information.

Paper Nr: 25
Title:

TC-CNN: Trajectory Compression based on Convolutional Neural Network

Authors:

Yulong Wang, Jingwang Tang and Zhe Jia

Abstract: With the Automatic Identification System installed on more and more ships, large amounts of ship-movement data can be collected, and the relevant maritime departments and shipping companies can monitor the running status of ships in real time and schedule them at any time. However, it is challenging to compress the large amount of ship trajectory data so as to reduce redundant information and save storage space. Existing trajectory compression algorithms rely on finding proper thresholds to achieve a good compression effect, which is labor-intensive. We propose a new trajectory compression algorithm that uses a Convolutional Neural Network to classify trajectory points, obtains a compressed trajectory by removing redundant points according to the classification results, and finally reduces the compression error. Our approach does not need manually set thresholds. Experiments show that our approach outperforms conventional trajectory compression algorithms in terms of average compression error and fitting degree at the same compression rate, and has certain advantages in time efficiency.
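A hypothetical point classifier in the spirit of TC-CNN (the layer sizes and input channels are illustrative assumptions): a 1D CNN slides along the trajectory and scores each point, and points below the score threshold are removed to form the compressed trajectory:

    import torch.nn as nn

    # Input: (batch, 4, n_points) with channels such as lat, lon, speed, course.
    point_classifier = nn.Sequential(
        nn.Conv1d(in_channels=4, out_channels=16, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Conv1d(16, 1, kernel_size=5, padding=2),
        nn.Sigmoid(),  # per-point keep probability
    )
    # keep_prob = point_classifier(traj)  # -> (batch, 1, n_points)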

Area 3 - Natural Language Understanding

Full Papers
Paper Nr: 14
Title:

Predicting Headline Effectiveness in Online News Media using Transfer Learning with BERT

Authors:

Jaakko Tervonen, Tuomas Sormunen, Arttu Lämsä, Johannes Peltola, Heidi Kananen and Sari Järvinen

Abstract: The decision to read an article in online news media or social networks is often based on the headline, and thus writing effective headlines is an important but difficult task for journalists and content creators. Even defining an effective headline is a challenge, since the objective is to avoid click-bait headlines and to be sure that the article contents fulfill the expectations set by the headline. Once defined and measured, headline effectiveness can be used for content filtering or for recommending articles with effective headlines. In this paper, a metric based on received clicks and reading time is proposed to classify news media content into four classes describing headline effectiveness. A deep neural network model using Bidirectional Encoder Representations from Transformers (BERT) is employed to classify the headlines into the four classes, and its performance is compared to that of journalists. The proposed model achieves an accuracy of 59% on the four-class classification, and 72-78% on the corresponding binary classification tasks. The model outperforms the journalists, being almost twice as accurate on a random sample of headlines.
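A minimal sketch of the classification setup, assuming an off-the-shelf multilingual BERT checkpoint (the abstract does not name the exact pretrained model) and the four effectiveness classes:

    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    name = "bert-base-multilingual-cased"  # assumed checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name,
                                                               num_labels=4)
    inputs = tok("Example headline", return_tensors="pt")
    logits = model(**inputs).logits  # shape (1, 4): one score per class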

Short Papers
Paper Nr: 19
Title:

Multi-Attribute Relation Extraction (MARE): Simplifying the Application of Relation Extraction

Authors:

Lars Klöser, Philipp Kohl, Bodo Kraft and Albert Zündorf

Abstract: Relation extraction for natural language understanding makes innovative and encouraging novel business concepts possible and facilitates new digitalized decision-making processes. Current approaches allow the extraction of relations with a fixed number of entities as attributes. Extracting relations with an arbitrary number of attributes requires complex systems and costly relation-trigger annotations to assist these systems. We introduce multi-attribute relation extraction (MARE) as an assumption-less problem formulation with two approaches, facilitating an explicit mapping from business use cases to the data annotations. Avoiding elaborate annotation constraints simplifies the application of relation extraction approaches. The evaluation compares our models to current state-of-the-art event extraction and binary relation extraction methods. Our approaches show improvements over these methods on the extraction of general multi-attribute relations.

Area 4 - Machine Learning

Full Papers
Paper Nr: 20
Title:

A Comparative Analysis of Classic and Deep Learning Models for Inferring Gender and Age of Twitter Users

Authors:

Yaguang Liu, Lisa Singh and Zeina Mneimneh

Abstract: In order for social scientists to use social media as a source for understanding human behavior and public opinion, they need to understand the demographic characteristics of the population participating in the conversation. What proportion are female? What proportion are young? While previous literature has investigated this problem, this work presents a larger-scale study that investigates inference techniques for predicting age and gender using Twitter data. We consider classic text features used in previous work and introduce new ones. Then we use a range of learning approaches, from classic machine learning models to deep learning ones, to understand the role of different language representations for demographic inference. On a data set created from Wikidata, we compare the value of different feature sets with different algorithms. In general, we find that classic models using statistical features and unigrams perform well. Neural networks also perform well, particularly models using sentence embeddings, e.g. a Siamese network configuration with attention to tweets and user biographies. The differences are marginal for age, but more significant for gender. In other words, it is reasonable to use simpler, interpretable models for some demographic inference tasks (like age). However, using richer language models is important for gender, highlighting the varying role language plays in demographic inference on social media.

Paper Nr: 31
Title:

Attribute Relation Modeling for Pulmonary Nodule Malignancy Reasoning

Authors:

Stanley T. Yu and Gangming Zhao

Abstract: Predicting the malignancy of pulmonary nodules found in chest CT images has become much more accurate thanks to powerful deep convolutional neural networks. However, attributes such as lobulation, spiculation, and texture, as well as the correlations and dependencies among such attributes, have rarely been exploited in deep learning-based algorithms, even though they are frequently used by human experts during nodule assessment. In this paper, we propose a hybrid machine learning framework consisting of two relation-modeling modules, an Attribute Graph Network and a Bayesian Network, which effectively take advantage of the attributes and of the correlations and dependencies among them to improve the classification performance on pulmonary nodules. In experiments on the LIDC-IDRI benchmark dataset, our method achieves an accuracy of 93.59%, a 4.57% improvement over the 3D Dense-FPN baseline.

Short Papers
Paper Nr: 5
Title:

Tailored Military Recruitment through Machine Learning Algorithms

Authors:

Robert Bryce, Ryuichi Ueno, Christopher Mcdonald and Dragos Calitoiu

Abstract: Identifying the postal codes with the highest recruiting potential for the desired profile of a military occupation can be achieved by using the demographics of the population living in each postal code and the locations of both successful and unsuccessful applicants. Selecting the N individuals with the highest probability of enrolment from a population living in untapped postal codes can be done by ranking the postal codes using a machine learning predictive model. Three such models are presented in this paper: a logistic regression, a multi-layer perceptron and a deep neural network. The key contribution of this paper is an algorithm that combines these models, benefiting from the performance of each of them, to produce the desired selection of postal codes. This selection can be converted into N prospects living in these areas. A dataset consisting of applications to the Canadian Armed Forces (CAF) is used to illustrate the proposed methodology.

Paper Nr: 7
Title:

Using Syntactic Similarity to Shorten the Training Time of Deep Learning Models using Time Series Datasets: A Case Study

Authors:

Silvestre Malta, Pedro Pinto and Manuel F. Veiga

Abstract: The process of building and deploying Machine Learning (ML) models includes several phases, and the training phase is one of the most time-consuming. ML models with time series datasets can be used to predict users' positions, behaviours or mobility patterns, which implies paths crossing well-defined positions; in such cases, syntactic similarity can be used to reduce the models' training time. This paper uses the case study of a Mobile Network Operator (MNO) in which users' mobility is predicted through ML, and tests the use of syntactic similarity with the Word2Vec (W2V) framework in Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models. Experimental results show that using the W2V framework in these architectures reduces training time by between 22% and 43% on average. An average improvement of about 3 percentage points in the validation accuracy of mobility prediction is also obtained.
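A minimal sketch of the W2V step, treating each user path as a sentence of position tokens (the tokens and hyperparameters are illustrative); the resulting vectors would then feed the RNN/GRU/LSTM/CNN predictors:

    from gensim.models import Word2Vec

    paths = [["cellA", "cellB", "cellC"],  # illustrative user paths
             ["cellA", "cellC", "cellD"]]
    w2v = Word2Vec(sentences=paths, vector_size=32, window=2, min_count=1)
    embedding = w2v.wv["cellB"]  # 32-d vector for one position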

Paper Nr: 16
Title:

Continuous Emotions: Exploring Label Interpolation in Conditional Generative Adversarial Networks for Face Generation

Authors:

Silvan Mertes, Florian Lingenfelser, Thomas Kiderle, Michael Dietz, Lama Diab and Elisabeth André

Abstract: The ongoing rise of Generative Adversarial Networks is opening up the possibility to create highly realistic, natural-looking images in various fields of application. One particular example is the generation of emotional human face images, which can be applied to diverse use cases such as automated avatar generation. However, most conditional approaches to creating such emotional faces address categorical emotional states, making smooth transitions between emotions difficult. In this work, we explore the possibilities of label interpolation in order to enhance a network that was trained on categorical emotions with the ability to generate face images that show emotions located in a continuous valence-arousal space.
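A minimal sketch of the interpolation idea, assuming a trained conditional generator and one-hot emotion conditions (the names and dimensions are illustrative):

    import torch

    def interpolate_condition(label_a, label_b, alpha):
        # Sweep alpha from 0 to 1 to move smoothly between two emotions.
        return (1.0 - alpha) * label_a + alpha * label_b

    happy = torch.tensor([1.0, 0.0, 0.0])
    sad = torch.tensor([0.0, 1.0, 0.0])
    cond = interpolate_condition(happy, sad, alpha=0.3)
    # img = generator(torch.randn(1, latent_dim), cond.unsqueeze(0))  # hypothetical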

Paper Nr: 24
Title:

Approaches towards Resource-saving and Explainability/Transparency of Deep-learning-based Image Classification in Industrial Applications

Authors:

Constantin Rieder, Markus Germann, Samuel Mezger and Klaus P. Scherer

Abstract: In the present work, a new approach for concept-neutral access to information (in particular, of a visual kind) is developed. In contrast to language-neutral access, concept-neutral access does not require knowledge of the precise names or IDs of components. Language-neutral systems usually work with language-neutral metadata, such as IDs (unique terms) for components. With concept-neutral access, access to information is therefore significantly easier for the user, with no knowledge of such IDs required. The AI models responsible for recognition visualize their decisions transparently and evaluate the recognition with quality criteria to be developed (confidence). To the authors' knowledge, this has not yet been used in an industrial setting. The use of performant models in a mobile, low-energy environment is also novel and not yet established in industrial settings.

Paper Nr: 26
Title:

Filtered Weighted Correction Training Method for Data with Noise Label

Authors:

Yulong Wang, Xiaohui Hu and Zhe Jia

Abstract: To solve the problem of low model accuracy on noisy datasets, a filtered weighted correction training method is proposed. This method uses the idea of model fine-tuning to adjust and correct a trained deep neural network model using filtered data, which makes it highly portable. In the data filtering process, a noise-label filtering algorithm based on a random threshold in a double interval reduces the dependence on manually set parameters, increases the reliability of the random threshold, and improves the filtering accuracy and the recall rate of clean samples. In the correction process, to deal with sample imbalance, different types of samples are weighted to improve the effectiveness of the model. Experimental results show that the proposed method can improve the F1 score of the deep neural network model.
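A minimal sketch of the weighting idea in the correction step, assuming per-class loss weights chosen to counter the imbalance (the weights below are illustrative, not values from the paper):

    import torch
    import torch.nn as nn

    class_weights = torch.tensor([1.0, 3.0])  # e.g. majority vs minority class
    criterion = nn.CrossEntropyLoss(weight=class_weights)
    # loss = criterion(model(filtered_x), filtered_y)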