DataSources are the glue that connects your Feature Definitions to production data sources (such as streaming, databases, and CRM systems).
DataSources take care of the production concerns of handling high-volume data and are responsible for many related tasks, such as authentication, rate limiting, schema normalization, and retries.
DataSources are usually configured by DevOps and are defined as a Kubernetes resource:
- name: kind
- name: brokers
- name: topics
- name: consumer_group
- name: tls_disable
A DataSource definition is composed of the metadata (which defines its name), the
kind of the connector, and
the config for that particular kind.
For more information, see the relevant DataConnector documentation.
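As a rough illustration of the three parts named above (metadata, kind, and config), the following Python sketch mirrors a definition as a plain dictionary and checks that it looks complete. This is not the actual resource schema: the dictionary layout, the `validate_data_source` helper, and every value shown are assumptions made for illustration. Only the config key names (`kind`, `brokers`, `topics`, `consumer_group`, `tls_disable`) come from the listing above.

```python
from typing import List

# Hypothetical in-memory mirror of a DataSource definition. The real
# Kubernetes resource schema (apiVersion, nesting, value formats) is not
# shown in this section, so treat this layout as an assumption.
REQUIRED_PARTS = ("metadata", "kind", "config")


def validate_data_source(definition: dict) -> List[str]:
    """Return a list of problems; an empty list means the definition looks complete."""
    problems = [
        f"missing part: {part}" for part in REQUIRED_PARTS if part not in definition
    ]
    # The metadata is what gives the DataSource its name.
    if not definition.get("metadata", {}).get("name"):
        problems.append("metadata.name is required")
    # Config is modeled as a list of name/value entries; flag repeated names.
    names = [entry.get("name") for entry in definition.get("config", [])]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        problems.append(f"duplicate config keys: {duplicates}")
    return problems


# All values below are placeholders, not documented defaults.
example = {
    "metadata": {"name": "clicks"},
    "kind": "streaming",
    "config": [
        {"name": "kind", "value": "kafka"},
        {"name": "brokers", "value": "broker-0.example.internal:9092"},
        {"name": "topics", "value": "clicks"},
        {"name": "consumer_group", "value": "clicks-consumer"},
        {"name": "tls_disable", "value": "true"},
    ],
}

if __name__ == "__main__":
    print(validate_data_source(example))
```

A check like this could run before a definition is applied to the cluster, but the authoritative validation is whatever the DataConnector for the chosen kind enforces.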
They are then referenced in your Feature Definitions:
namespace: default #production
a8r.io/description: "Demonstration of a simple aggr function"
def handler(data, ctx) -> int:
    return 1, ctx.timestamp, ctx.keys["client_id"].split(":")
If you do not specify a
namespace for the DataSource, the Feature's namespace is used.
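The namespace fallback described above can be sketched in a few lines of Python. The function name `resolve_data_source_namespace` and its parameters are hypothetical and only illustrate the rule; they are not part of the product's API.

```python
from typing import Optional


def resolve_data_source_namespace(
    reference_namespace: Optional[str], feature_namespace: str
) -> str:
    """Pick the namespace used to look up a referenced DataSource.

    Hypothetical helper: an explicitly given namespace wins; otherwise the
    namespace of the Feature that references the DataSource is used.
    """
    # Treat both None and an empty string as "not specified".
    return reference_namespace or feature_namespace


if __name__ == "__main__":
    print(resolve_data_source_namespace(None, "default"))
    print(resolve_data_source_namespace("production", "default"))
```

Under this rule, a Feature in the `default` namespace that references a DataSource without a namespace looks it up in `default` as well.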