Abstracts Track 2025


Area 1 - Computer Vision Applications

Nr: 60
Title:

Analysis of Pedestrian Traffic on WUT Campus Using Machine Learning Methods

Authors:

Robert Olszewski, Pawel Czernic and Kamil Choromanski

Abstract: The development of smart cities relies heavily on advanced information and geo-information technologies. However, before these systems can be implemented across entire urban areas, they must first be tested in controlled environments. A good example of this is the establishment of Living Labs on the campuses of technical universities, which serve as real-world testbeds for piloting new technological solutions. Of particular importance is the role of the academic community in testing Internet of Things (IoT) sensors and data processing methods using machine learning algorithms. This is especially important for the analysis of image data collected by CCTV sensors, where privacy and data protection are critical concerns (Bokolo, 2023; Czernic, 2024). At the Warsaw University of Technology (WUT), we developed a prototype system for testing pedestrian traffic analysis within the university’s Living Lab. The system uses video surveillance cameras to collect and process image data; in the near future it will also use dedicated video sensors that are currently being installed. The dedicated CENAGIS (Gotlib et al., 2022) infrastructure, as well as proprietary geo-information solutions using machine learning methods, were used to collect and analyze this high-volume data safely and in compliance with legal requirements. A significant technological innovation developed by the WUT research team is the synchronization of pedestrian detection algorithms with spatial transformation techniques (Fig. 1). Pedestrian detection and classification are performed using machine learning methods, specifically the YOLO model (Jocher et al., 2023). The transformation of image coordinates (X, Y) into geodetic coordinates is achieved using a custom implementation of the Perspective-n-Point (PnP) method (Marchand, 2016). This transformation makes it possible to obtain spatial data with an accuracy of up to 1 meter, making the data suitable for further GIS-based analysis. 
Using the developed technology, a series of experiments were conducted on the WUT campus (which functions as the Living Lab for smart city research in Warsaw). Image data was collected at different times and on different days of the week to analyze pedestrian traffic during both working days and weekends. Preliminary tests revealed the need for improved camera calibration to enhance the accuracy of the transformation. It is also planned to implement YOLO version 11 instead of the initially used version 8 to achieve better detection accuracy. The results demonstrated that it is indeed feasible to obtain reliable spatial data using relatively simple image sensors, a well-trained neural network, and proprietary transformation algorithms. The analysis revealed not only significant spatial and temporal variations in pedestrian movement but also highlighted areas requiring urban improvements and traffic reorganization. For example, analysis of weekday image data showed that cars parked on sidewalks force pedestrians onto the road. Ongoing work includes the installation of additional cameras, optimization of data processing and transformation methods, and training of the neural network on additional examples. These improvements aim to expand the system’s capabilities to include the recognition of cyclists and vehicles, and to enable tracking of individuals across multiple camera feeds, which requires a method to uniquely re-identify the same person across all cameras.
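The image-to-geodetic transformation described above can be sketched as a back-projection of a detected pedestrian's foot point onto the ground plane, assuming the camera pose (rotation R, translation t) has already been recovered in a prior PnP calibration step. This is a minimal illustration, not the WUT system's actual implementation, and the intrinsics and pose in the example are invented values:

```python
import numpy as np

def pixel_to_ground(u, v, K, R, t, ground_z=0.0):
    """Back-project pixel (u, v) onto the ground plane Z = ground_z.

    K: 3x3 camera intrinsics.
    R, t: world-to-camera pose, assumed to come from PnP calibration.
    Returns (X, Y) coordinates of the intersection on the ground plane.
    """
    # Viewing ray for the pixel, in the camera frame.
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Rotate the ray into the world frame; camera centre C = -R^T t.
    ray_world = R.T @ ray_cam
    cam_centre = -R.T @ t
    # Intersect the ray with the plane Z = ground_z
    # (assumes the ray is not parallel to the ground).
    s = (ground_z - cam_centre[2]) / ray_world[2]
    point = cam_centre + s * ray_world
    return point[0], point[1]
```

In a real deployment the recovered planar coordinates would then be transformed into the geodetic reference frame used for the GIS analysis.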

Area 2 - Models and Algorithms

Nr: 32
Title:

Automobile Demand Forecasting with NARX (Nonlinear AutoRegressive with Exogenous Inputs) Artificial Neural Networks Model

Authors:

Mehmet Zeki Seçmen

Abstract: The rapid advancements in technology and the increasing integration of artificial intelligence into various sectors have significantly transformed predictive modelling techniques. In this context, the present study aims to forecast automobile demand in Turkey by utilizing the Non-Linear Autoregressive Network with Exogenous Inputs (NARX) Artificial Neural Network (ANN) model. This approach leverages historical sales data and key economic indicators to enhance the accuracy and reliability of demand forecasting. The study employs MATLAB (Matrix Laboratory) software to analyse the automobile sales data of six major manufacturers operating in Turkey: OYAK Renault, Tofaş, Toyota, Ford, Honda, and Hyundai. The dataset, covering the period between 2014 and 2024, is sourced from the Automotive Distributors and Mobility Association (ADMA). The NARX ANN model is implemented using monthly automobile sales data to predict future demand trends. In the model development process, several independent variables, derived from the annual activity reports published by the Ministry of Industry and Technology, are incorporated to assess their impact on automobile demand. These variables include Brent crude oil prices, the US dollar exchange rate, vehicle loan interest rates, the consumer price index (CPI), vehicle purchase levels, and automobile production quantity. The dependent variable, representing the forecasted output, is defined as the total automobile sales volume of the six selected companies. The selection of these variables is based on their potential influence on consumer purchasing behavior and market dynamics. The constructed NARX ANN model consists of six input variables, ten hidden neurons, and one output node. The model's predictive performance is evaluated using two standard error metrics: Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE). The results indicate that the proposed model achieves an MSE of 0.0654 and a MAPE of 12.23%. 
These performance metrics suggest that the NARX ANN model provides a reliable approximation of real-world automobile demand trends. The low error rates demonstrate the model's effectiveness in capturing the complex, non-linear relationships among the influencing variables. Following the model's training and testing phases, automobile demand for the twelve months of 2024 is predicted. The application of artificial neural networks in demand forecasting enhances the accuracy of predictions, facilitating better decision-making in production planning, inventory management, and marketing strategies. Accurate demand forecasts ensure that automobile manufacturers can optimize supply chain operations, reduce excess inventory costs, and improve customer satisfaction by aligning production schedules with market needs. Additionally, the integration of AI-driven forecasting models in the automotive industry can enhance reliability, competitiveness, and market responsiveness. Overall, the findings of this study underscore the potential of NARX ANN models in predicting automobile sales with a high degree of accuracy. The adoption of such advanced predictive techniques can contribute to more efficient market strategies and better adaptation to economic fluctuations. Future research could explore the incorporation of additional macroeconomic indicators and alternative machine learning methodologies to further refine demand forecasting models in the automotive sector.
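As a minimal illustration of the NARX setup described above, the sketch below builds the lagged regressor vectors (past sales plus past exogenous indicators) that feed such a network, together with the two reported error metrics. The delay order and data shapes are placeholders, not the study's actual configuration:

```python
import numpy as np

def make_narx_dataset(y, X, delay=2):
    """Build NARX regressors: each row stacks lagged targets y(t-1..t-d)
    with lagged exogenous inputs x(t-1..t-d); the label is y(t)."""
    rows, labels = [], []
    for t in range(delay, len(y)):
        lagged_y = y[t - delay:t]
        lagged_x = X[t - delay:t].ravel()
        rows.append(np.concatenate([lagged_y, lagged_x]))
        labels.append(y[t])
    return np.array(rows), np.array(labels)

def mse(actual, pred):
    """Mean Squared Error."""
    return float(np.mean((actual - pred) ** 2))

def mape(actual, pred):
    """Mean Absolute Percentage Error, in percent."""
    return float(np.mean(np.abs((actual - pred) / actual)) * 100)
```

The resulting rows would be fed to a nonlinear regressor (in the study, a neural network with ten hidden neurons trained in MATLAB), and the two metrics score its predictions against held-out monthly sales.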

Nr: 38
Title:

Studying the Effectiveness of Longer Context Windows in LLMs for Text Summarization and Question Answering Tasks

Authors:

Anum Afzal, Clemens Magg and Florian Matthes

Abstract: Previous language models (LMs) have been inadequate for long-text summarization due to their limited context windows, which restrict their ability to effectively process and comprehend extensive textual data. This limitation has made it challenging for traditional LMs to capture critical long-range dependencies and contextual nuances in longer documents. Models with significantly longer context windows have emerged through various innovative improvements to model architecture and training practices. Large language models (LLMs) are tremendously effective in processing large amounts of textual data and show outstanding results for various NLP tasks. Despite their proficiency with extended context windows, quantifying the performance of LLMs on text summarization and question-answering tasks remains largely unexplored on context window sizes of up to 128k tokens, representing a significant research gap in the field of NLP. To address this gap, our research implements new traceability metrics to evaluate how well LLMs utilize information within their extended context windows for long-text summarization and question-answering tasks. We propose an evaluation framework integrating automated traceability metrics to provide a nuanced understanding of model performance. Our approach focuses on determining which information models use and which they neglect depending on their context window size. For our evaluation, we employ three datasets, InfiniteBench, BookSum, and XSum, that exhibit characteristics essential for extended context windows, such as uniform distribution of salient information. We compare the performance across four models: LLaMA 3.1, Qwen 2.5, Phi 3, and Command R7B. Ultimately, our research reveals that the models exhibit inherent positional biases, leading to varying performance based on the location of information within their context window and the overall length of that context window. 
These findings highlight the importance of understanding how context dynamics influence the efficacy of text summarization and question-answering tasks.
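Positional bias of the kind reported above is commonly probed by placing a salient sentence at a chosen fractional depth of a synthetic long context and scoring retrieval per position. The helper below is an illustrative sketch of that probe, not the authors' actual evaluation framework:

```python
def place_at_depth(needle, filler_sentences, depth, total):
    """Build a synthetic context with one salient sentence (the needle)
    inserted at a fractional depth in [0, 1] of `total` filler sentences,
    so per-position retrieval accuracy can be measured."""
    k = int(depth * total)  # index where the needle lands
    body = filler_sentences[:k] + [needle] + filler_sentences[k:total]
    return " ".join(body)
```

Sweeping `depth` over, say, 0.0 to 1.0 in steps of 0.1 for several context lengths yields the position-by-length accuracy grid from which biases such as degraded mid-context recall become visible.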

Area 3 - Machine Learning

Nr: 59
Title:

AI Enrichment of Mars Digital Terrain Model Derived from HiRISE Data

Authors:

Robert Olszewski, Pawel Czernic and Kamil Choromanski

Abstract: Today's photogrammetric technologies enable the detailed and precise modeling of Digital Terrain Models (DTMs), using both point clouds from laser scanning and high-resolution images taken from low altitudes. While this process is standard on Earth, it becomes significantly more challenging on Mars or other celestial bodies. This difficulty arises from the limited availability of high-resolution measuring equipment and the inability to use ground control points or other reference measurements. The highest-resolution Mars DTM (approximately 4 cm/pixel) was produced using images taken by the Ingenuity UAV accompanying the Perseverance rover. However, this model covers only about 44 hectares (Zubarev et al., 2025). The primary data source for creating cartometric orthophotos and DTMs of the Red Planet remains the HiRISE (High Resolution Imaging Science Experiment) camera on the Mars Reconnaissance Orbiter (MRO). The use of stereo pairs of HiRISE images makes it possible to obtain DTM resolutions of around 0.5 meters per pixel. However, for the purposes of scientific exploration of Mars, mapping geomorphological formations, and planning in-situ missions carried out by rovers, such DTM accuracy is often insufficient. A research team from Warsaw University of Technology (WUT) is working to address this gap by developing a methodology to “enhance” existing Mars DTMs derived from HiRISE images (Choromański et al., 2022) using machine and deep learning methods. This line of research was initiated by the team of Tao et al. (2023), who introduced MADNet (a monocular height estimation neural network). The methodology uses deep neural networks trained to estimate depth maps from single images. In this approach, a HiRISE satellite image serves as the input, and the resulting depth map allows the DTM to be “enriched”. The WUT team has adopted and further advanced this concept using the Distill Any Depth neural network (He et al., 2025).
This state-of-the-art model performs monocular depth estimation (MDE), predicting depth from a single RGB image, a key capability for understanding complex 3D scenes. In this research we propose not only to use the Distill Any Depth network, with its novel distillation framework designed to leverage unlabeled images, but also to use multiple overlapping HiRISE images to “enrich” the resulting Mars DTM with AI methods. Using two depth maps obtained from the source HiRISE images forming a stereo pair, a sophisticated neural network together with proprietary algorithms for improving the DTM with depth maps makes it possible to obtain a DTM with significantly better morphometric parameters (Fig. 1). Preliminary results indicate strong potential for this approach. The ongoing research aims to develop a unified, original technology that integrates all stages of DTM creation and enhancement into a single AI-powered workflow, supporting both 2D and 3D relief modeling and visualization. As shown in Figure 1, the proposed neural network-based methodology enables the creation of more detailed terrain models by combining depth maps with existing DTMs. Hillshade visualizations reveal larger geomorphological features, such as broad ripple marks, more clearly. Currently, the WUT team is optimizing the parameters of a neural network for depth map estimation and fusing depth maps with low-resolution DTMs.
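One simple way the depth-map/DTM fusion described above might be sketched: upsample the coarse DTM and inject the high-frequency component of the monocular depth map as additional relief. This is a hypothetical simplification of the WUT workflow; the blur-based detail extraction and the gain parameter are assumptions, not the team's actual algorithm:

```python
import numpy as np

def box_blur(img, k=3):
    """Simple mean filter via shifted, edge-padded copies of the image."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fuse_dtm_with_depth(dtm_low, depth_high, scale, detail_gain=1.0):
    """Hypothetical fusion: upsample the coarse DTM by `scale`, then add
    the high-frequency residual of the monocular depth map as relief."""
    dtm_up = np.kron(dtm_low, np.ones((scale, scale)))
    detail = depth_high - box_blur(depth_high, k=2 * scale + 1)
    return dtm_up + detail_gain * detail
```

In practice the depth map would first need to be scaled and co-registered with the DTM, and the detail gain calibrated against reference terrain, before such a fusion yields morphometrically meaningful output.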