Abstract: The lack of reliable and readily accessible climate-related data has hindered financial sector stakeholders from properly assessing financial stability and managing climate-related risks. This data gap substantially raises the barrier to channeling global capital flows towards climate change mitigation and resilience, by forcing businesses to engage in costly ad-hoc ingestion and curation efforts that cannot benefit from shared data or open protocols.
At the Open Source Climate (OS-Climate) initiative, we are building an open data science platform that supports complex data ingestion, processing and quality management requirements. We build on recent advances in open source data platform tools and machine learning, and on the scenario-based predictive analytics developed by OS-Climate community members.
In this talk, we will present how we are implementing the OS-Climate data commons platform based on a data mesh architecture to make data accessible, available, discoverable, and interoperable across various development streams, while supporting strict compliance requirements around data access and regulatory disclosures. We will explain how the OS-Climate platform leverages open source tools including Kubeflow, Trino, Jupyter, Elyra pipelines and OpenShift to build maintainable and collaborative pipelines, and to federate heterogeneous data sources into a controlled, common data resource for our community of climate data scientists and financial sector stakeholders.
Bio: Erik Erlandson is a Software Engineer at Red Hat’s AI Center of Excellence, where he explores emerging technologies for Machine Learning and Data Science workloads on Kubernetes, and assists customers with migrating their Data Science workloads onto the cloud. Erik is a committer on the Apache Spark project and a contributor to the Ray distributed compute platform.