Learn how to build custom, programmatic ETL workflows for unstructured data using the Unstructured API and Workflow Endpoint.
Overview
Unstructured makes it easy to build scalable, programmatic ETL workflows for unstructured data. In this session, we’ll walk through how to use the Unstructured API to connect to a data source, such as Amazon S3, preprocess your documents, and pipe structured outputs into a destination, such as a vector store, a database, or a search engine.
Whether you’re building from scratch or looking to streamline an existing workflow, this webinar will show you how to automate the full ETL pipeline—from ingestion to transformation to destination—using Unstructured’s Workflow Endpoint and Python SDK.
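Below is a minimal sketch of what the first step might look like with the unstructured-client Python SDK: registering an Amazon S3 bucket as a source connector. The connector name, bucket URL, and credentials are placeholders, and exact request/model class names can differ slightly between SDK versions.

```python
# Sketch: register an Amazon S3 bucket as a source connector via the
# Unstructured Workflow Endpoint. Names, URLs, and credentials are placeholders.
import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateSourceRequest
from unstructured_client.models.shared import CreateSourceConnector, SourceConnectorType

client = UnstructuredClient(api_key_auth=os.environ["UNSTRUCTURED_API_KEY"])

source = client.sources.create_source(
    request=CreateSourceRequest(
        create_source_connector=CreateSourceConnector(
            name="webinar-s3-source",                      # hypothetical connector name
            type=SourceConnectorType.S3,
            config={
                "remote_url": "s3://my-bucket/documents/",  # placeholder bucket path
                "key": os.environ["AWS_ACCESS_KEY_ID"],
                "secret": os.environ["AWS_SECRET_ACCESS_KEY"],
                "recursive": True,
            },
        )
    )
)
source_id = source.source_connector_information.id
```

Creating a destination connector (for example, for a vector store) follows the same pattern through `client.destinations.create_destination`.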
Technical Overview
Watch this recording to learn how to build and run a complete ETL pipeline using the Unstructured API and Workflow Endpoint. In this session, we cover how to:
- Connect to S3 using source and destination connectors
- Define a custom DAG of processing steps—including partitioning, chunking, and embedding
- Preprocess documents with advanced partitioning strategies
- Output structured data directly into your preferred destination
- Track workflow runs and confirm successful completion via the Python SDK, as sketched below
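As a companion to the list above, here is a hedged sketch of the remaining steps with the same unstructured-client SDK: defining a custom DAG of partition, chunk, and embed nodes, attaching it to the source and destination connectors, running the workflow, and polling the job until it finishes. It assumes the `source_id` from the earlier sketch and a `destination_id` created the same way; node subtypes, settings, response shapes, and job status strings are illustrative and may vary by SDK version and configured providers.

```python
# Sketch: define a custom workflow DAG (partition -> chunk -> embed), run it,
# and poll the resulting job. Assumes `client`, `source_id`, and `destination_id`
# already exist. Node subtypes, settings, and status strings are assumptions.
import time

from unstructured_client.models.operations import (
    CreateWorkflowRequest,
    GetJobRequest,
    RunWorkflowRequest,
)
from unstructured_client.models.shared import CreateWorkflow, WorkflowNode, WorkflowType

workflow = client.workflows.create_workflow(
    request=CreateWorkflowRequest(
        create_workflow=CreateWorkflow(
            name="s3-etl-webinar",               # hypothetical workflow name
            source_id=source_id,
            destination_id=destination_id,
            workflow_type=WorkflowType.CUSTOM,
            workflow_nodes=[
                WorkflowNode(
                    name="Partitioner",
                    type="partition",
                    subtype="unstructured_api",  # assumed partitioner subtype
                    settings={"strategy": "hi_res"},
                ),
                WorkflowNode(
                    name="Chunker",
                    type="chunk",
                    subtype="chunk_by_title",    # assumed chunking strategy
                    settings={"max_characters": 1500},
                ),
                WorkflowNode(
                    name="Embedder",
                    type="embed",
                    subtype="azure_openai",      # assumed embedding provider
                    settings={"model_name": "text-embedding-3-small"},
                ),
            ],
        )
    )
)
workflow_id = workflow.workflow_information.id

# Kick off a run. Assumed: the run response exposes the created job; if not,
# list the jobs for this workflow to find the latest one.
run = client.workflows.run_workflow(request=RunWorkflowRequest(workflow_id=workflow_id))
job_id = run.job_information.id

# Poll the job until it reaches a terminal state (status values assumed).
while True:
    job = client.jobs.get_job(request=GetJobRequest(job_id=job_id)).job_information
    if job.status in ("COMPLETED", "FAILED"):
        print(f"Job {job_id} finished with status {job.status}")
        break
    time.sleep(10)
```

Once the job completes, the partitioned, chunked, and embedded output lands in whatever destination connector the workflow points at, so downstream systems read structured records rather than raw files.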