Build ETL Workflows with Unstructured API

Past Webinar

Apr 29, 2025

Learn how to build custom, programmatic ETL workflows for unstructured data using the Unstructured API and Workflow Endpoint.

Speakers

Maria Khalusova

Unstructured

Recorded

Tuesday, Apr 29, 2025

30 minutes on Zoom Events

Overview

Unstructured makes it easy to build scalable, programmatic ETL workflows for unstructured data. In this session, we’ll walk through how to use the Unstructured API to connect to a data source such as Amazon S3, preprocess your documents, and pipe structured outputs into a destination such as a vector store, a database, or a search engine.

Whether you’re building from scratch or looking to streamline an existing workflow, this webinar will show you how to automate the full ETL pipeline—from ingestion to transformation to destination—using Unstructured’s Workflow Endpoint and Python SDK.

Technical Overview

Watch this recording to learn how to build and run a complete ETL pipeline using the Unstructured API and Workflow Endpoint. The session shows how to:

Connect to S3 using source and destination connectors
Define a custom DAG of processing steps—including partitioning, chunking, and embedding
Preprocess documents with advanced partitioning strategies
Output structured data directly into your preferred destination
Track workflow runs and confirm successful completion via the Python SDK
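
The steps above can be pictured with a small, self-contained Python sketch. It does not call the Unstructured SDK: the Node class, the execution_order and wait_for_job helpers, the node names, the "hi_res" setting, and the status strings are all hypothetical stand-ins chosen for illustration. The sketch only shows the general idea of declaring a DAG of partition, chunk, and embed steps, and then polling a job until it reports completion.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
import time
from typing import Callable


@dataclass
class Node:
    """One processing step in a workflow (hypothetical shape, not the SDK's model)."""
    name: str
    kind: str  # e.g. "partition", "chunk", "embed"
    depends_on: list[str] = field(default_factory=list)
    settings: dict = field(default_factory=dict)


def execution_order(nodes: list[Node]) -> list[str]:
    """Return node names in an order that respects every declared dependency."""
    graph = {node.name: set(node.depends_on) for node in nodes}
    return list(TopologicalSorter(graph).static_order())


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 10,
) -> str:
    """Poll fetch_status until it returns a terminal state or polls run out.

    The status strings are assumptions; a real client may use different names.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in {"COMPLETED", "FAILED"}:
            return status
        time.sleep(poll_seconds)
    return "TIMED_OUT"


# A linear DAG: partition the documents, chunk the elements, embed the chunks.
workflow_nodes = [
    Node("partitioner", "partition", settings={"strategy": "hi_res"}),  # illustrative setting
    Node("chunker", "chunk", depends_on=["partitioner"]),
    Node("embedder", "embed", depends_on=["chunker"]),
]

order = execution_order(workflow_nodes)

# Simulated job statuses stand in for calls to a real job-status endpoint.
simulated_statuses = iter(["SCHEDULED", "IN_PROGRESS", "COMPLETED"])
final_status = wait_for_job(lambda: next(simulated_statuses), poll_seconds=0.0)

print(order)
print(final_status)
```

In a real pipeline, fetch_status would wrap whatever job-lookup call the Python SDK provides, and the node list would be submitted together with the source and destination connector identifiers rather than sorted locally.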
