Abstract: Bloomberg runs on data. As our clients in finance develop increasingly sophisticated and data-intensive workflows, we have shifted to building systems that empower them to leverage our data programmatically. Like other systems in the data analysis space, our systems often have a query language interface and a corresponding data mesh to abstract the retrieval of data from multiple data sources and perform complex analytics. In addition to exposing direct interfaces to clients through various applications, such as Microsoft Excel add-ins and our managed Python notebooks, our query language also serves as a central data and analytics engine for various Bloomberg Terminal applications and data aggregation processes.
We aim to provide users with great flexibility in defining workflows and analyzing complex data from multiple sources. However, from the perspective of content and analytics entitlements, as well as overall system resiliency, it is imperative to formally define the limits of the user’s interaction with the system. To guarantee that each user receives the intended service level, we must ensure users are granted access to data based on a set of rules that account for the nature of their interaction with the system.
We developed a framework that enables us to express rules for context-driven access control using a Domain-Specific Language (DSL). One example of a rule that might be expressed in our DSL is: a set of users may be able to access raw data from one application, but only aggregates of the same data in another application. Even though the underlying data source is the same and is powered by the same data-providing and processing system, the same user might see very different sets of allowed behavior depending on the context of their usage. Our system allows access to a resource to be conditionally bound not only to the user’s and the resource’s identity (e.g., granting permissions on tables), but also to the interaction the user is attempting with the resource (performing further analytics on the data, joins with other data sets, quantity of data retrieved, etc.).
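To make the idea of context-driven rules concrete, here is a minimal Python sketch of the example above. It is not Bloomberg's DSL or rule engine, whose syntax and architecture are not described in this abstract. All names (`RequestContext`, `Rule`, `is_allowed`, the `analysts` group, the `prices` resource, the `notebook` and `excel` applications, and the 10,000-row limit) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical description of one attempted interaction."""
    user_group: str    # who is asking
    application: str   # where the request originates
    resource: str      # which data set is requested
    operation: str     # "raw" or "aggregate"
    row_count: int     # quantity of data requested


@dataclass(frozen=True)
class Rule:
    """A named predicate over the request context (illustrative only)."""
    name: str
    allows: Callable[[RequestContext], bool]


def is_allowed(rules: List[Rule], ctx: RequestContext) -> bool:
    # Default deny: access is granted only if at least one rule allows it.
    return any(rule.allows(ctx) for rule in rules)


# Assumed example policy: the same group sees raw data in one application
# (up to a row limit) but only aggregates of that data in another.
RULES = [
    Rule(
        "analysts-raw-in-notebook",
        lambda c: c.user_group == "analysts"
        and c.resource == "prices"
        and c.application == "notebook"
        and c.row_count <= 10_000,
    ),
    Rule(
        "analysts-aggregates-in-excel",
        lambda c: c.user_group == "analysts"
        and c.resource == "prices"
        and c.application == "excel"
        and c.operation == "aggregate",
    ),
]

if __name__ == "__main__":
    raw_in_excel = RequestContext("analysts", "excel", "prices", "raw", 500)
    raw_in_notebook = RequestContext("analysts", "notebook", "prices", "raw", 500)
    print(is_allowed(RULES, raw_in_excel))     # same user and data, different context
    print(is_allowed(RULES, raw_in_notebook))
```

In this sketch the decision depends on the whole request context rather than on a user-to-table grant alone, which is the distinction the abstract draws. A dedicated DSL would presumably express the same predicates declaratively instead of as Python lambdas.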
In this talk, we will share our rule engine’s architecture, the domain-specific language we use for creating Access Control Rules, and some use cases it has helped us solve.
We believe other providers of data and analytics may benefit from a similar governance approach, and we hope that our audience will learn how to design and apply context-driven access control to their own systems.
Bio: Daniel Scanteianu is a software engineer in Bloomberg's Data & Analytics Engineering group. He is active within the company's Java and semantic technologies communities. He is passionate about programming languages, distributed systems, and open source. Past achievements include contributing to Apache Kafka, obtaining U.S. patents, and developing software that generates jazz piano chords.