Abstracts Track 2023

Area 1 - Business Analytics

Nr: 16

Results on Cost-Sensitive Parametric Classifiers


Jorge C-Rella, Juan Manuel Vilar and Ricardo Cao

Abstract: Cost-sensitive classification addresses the problem of optimal learning when different misclassification errors imply different costs. When the problem is modeled parametrically, estimating the parameters with a loss function that incorporates these costs, instead of the likelihood used in classical models, has proven to yield more effective parameter estimates. However, state-of-the-art approaches, although extensive, tend to be heuristic, task-dependent and tested on a limited number of (typically proprietary) data sets, which makes them difficult to compare. In addition, only empirical results have been developed, and the theoretical behavior of the models has been ignored. As a result, there is little support for applying cost-sensitive models in a general setting. This work has two aims. The first is to establish the consistency and the asymptotic distribution of the estimated cost-sensitive parameters under general conditions. The second is to test cost-sensitive parameter estimation over a wide range of simulations and scenarios, confirming the theoretical results and the improvement obtained with a cost-sensitive approach compared to a cost-insensitive one.
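
The idea of replacing the likelihood with a cost-aware loss can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes a one-feature logistic model and a class-weighted log loss, which is only one common way of incorporating costs. The toy data, the costs `c_fp` and `c_fn`, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_weighted_loss(theta, xs, ys, c_fp, c_fn):
    """Mean cost-weighted log loss of a one-feature logistic model."""
    b0, b1 = theta
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        # Errors on positives are charged c_fn, errors on negatives c_fp.
        total -= c_fn * y * math.log(p) + c_fp * (1 - y) * math.log(1.0 - p)
    return total / len(xs)

def fit(xs, ys, c_fp=1.0, c_fn=1.0, lr=0.1, steps=5000):
    """Minimize the cost-weighted loss by plain gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            w = c_fn if y == 1 else c_fp          # per-observation cost
            r = w * (sigmoid(b0 + b1 * x) - y)    # weighted residual
            g0 += r
            g1 += r * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Hypothetical toy data with overlapping classes.
XS = [-2, -1, -1, 0, 0, 1, 1, 2]
YS = [0, 0, 1, 0, 1, 0, 1, 1]

plain = fit(XS, YS)                      # equal costs: ordinary likelihood fit
cost_sensitive = fit(XS, YS, c_fn=5.0)   # a missed positive costs five times more
```

With equal costs the loss reduces to the usual negative log-likelihood. Raising `c_fn` moves the fitted intercept upward, so the model predicts the expensive-to-miss class more readily.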

Area 2 - Data Science

Nr: 44

Intelligent Data Analysis for Conceiving an Advisory Growth System for Tomatoes Using Connected Fruit Dendrometers and Micro-Climate Measures in a Greenhouse


Elena Najdenovska, Fabien Dutoit, Theresa Dunkel, Robert Whittaker, Cédric Camps and Laura E. Raileanu

Abstract: Daily climate variations directly impact fruit growth and quality due to the physiological adaptation of the plants. Unpredictable climatic events affect crops in commercial greenhouses. For example, when a succession of hot periods occurs in the summer, tomatoes often split or burst during the ripening phase because the fruit is not elastic enough to absorb the physical changes caused by frequent irrigation. This results in significant yield losses. Our work aims to model fruit diameter growth in relation to climate conditions in order to predict harvest time and detect deviations from regular fruit growth that lead to physiological damage. More precisely, by applying intelligent data analysis to measurements acquired continuously with newly developed fruit dendrometers attached directly to the tomato fruit, together with data from micro-climate stations placed near the monitored plants, we intend to design an alerting system that warns growers of the risk of damage. Such a tool would improve crop quality, optimize harvest timing, and reduce water usage. Several field experiments were carried out in 2022 in soilless greenhouse tomato production using 60 fruit dendrometers and six micro-climate sensors placed in different greenhouse zones. An IoT platform was developed for real-time collection, storage, and visualization of fruit diameters and climatic measurements. Additional data will be acquired with the same set-up in the following months. The continuous diameter measurements allowed visualization of the fruit growth dynamics, namely the overall growth progress and the daily oscillations portraying the circadian contraction and expansion observed around the peak of the day. The daily fruit contraction could be seen as an indicator of the fruit's health status. A diameter-based growth model was also established to predict the final fruit diameter and harvest time.
Additionally, preliminary analyses suggest dependencies between the daily fruit growth measurements and the evolution of the climate conditions. The current developments aim to characterize growth patterns of tomato crops, related to the different climatic environments of the greenhouse, that could lead to fruit cracking. Furthermore, the intended advisory growth system will be completed by integrating the resulting growth models into the IoT platform.

Area 3 - Data Management and Quality

Nr: 111

Current Status and Prospect for Development and Dissemination of Standard Reference Data in Korea


Jinkyu Moon and Kyunshik Chae

Abstract: In accordance with the Enforcement Ordinance of the 'Framework Act on National Standards', the National Center for Standard Reference Data (NCSRD) was established on August 1, 2006 under KRISS as the central authority of the standard reference data system of Korea. As of 1 May 2023, 63 data centers (DCs) have been designated by the Korean Agency for Technology and Standards. 47 DCs are currently active, excluding 13 abolished DCs. More than 1,082 certified reference data sets covering metallic materials, physics, chemistry, bio information and public health have been developed and disseminated through various services. These results will reduce R&D costs caused by duplicated investment and improve the quality of citizens' lives. Furthermore, it is expected that the international competitiveness of national standards will be strengthened by using standard reference data in standardization. Standard Reference Data (SRD) is scientific and technical data and information whose reliability and accuracy are assessed and evaluated by scientists for use in technical problem-solving, industrial application, and research and development. The functions of NCSRD are as follows: fostering and evaluation of candidate data; development of a national strategic plan for SRD based on demand surveys; establishment of a database of experts and operation of a technical committee in each area; support of the Steering Committee; collection, assessment, and dissemination of scientific and technical data; registration and dissemination of SRD; and international cooperation and public relations. There are three future strategies for SRD. First, data will be developed to support the competitiveness of infrastructure industries such as steel, automobiles and chemicals. Second, NCSRD will lead the development of data for strategic industries. Third, DCs will produce data for improving the quality of life in Korea.