Transform files in S3 to Pinecone with Unstructured Platform with no code

Unstructured

Dec 29, 2024

Authors

Nina Lopatina

Developer Relations Engineer, Unstructured


Let’s go through the 5 easy steps to move unstructured data from an S3 bucket into a Pinecone vector database using Unstructured Platform! You can try out no-code ETL with a 2-week free trial.

Here is the full documentation: https://docs.unstructured.io/platform/overview

Create a new source at https://platform.unstructured.io/connectors/editor/new/sources and fill in the fields as described in https://docs.unstructured.io/platform/sources/s3

Once the connector is saved successfully, it appears in your list of Source connectors.
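
Before pasting a bucket location into the source form, it can help to sanity-check it. The sketch below is not part of Unstructured Platform; it assumes the location is written as an `s3://bucket/prefix` URI, and the helper name `split_s3_uri` and the bucket and folder names are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/prefix URI into (bucket, prefix).

    Raises ValueError if the scheme is not s3 or the bucket name is missing.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URI, got: {uri!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket name in: {uri!r}")
    # The parsed path keeps its leading slash; strip it to get the key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket and folder names, for illustration only.
print(split_s3_uri("s3://my-bucket/my-folder/"))
```

A URI that passes this check is only well-formed; whether the connector can actually read the bucket still depends on the credentials and permissions you enter in the form.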

Create a new destination at https://platform.unstructured.io/connectors/editor/new/destinations and fill in the fields as described in https://docs.unstructured.io/platform/destinations/pinecone. Since we are using Pinecone, you can find the details you need in the Pinecone console at https://app.pinecone.io/?sessionType=login

I set up the destination with my embedding dimension in mind: OpenAI's Ada 002 (text-embedding-ada-002) produces vectors with 1536 dimensions, so the Pinecone index needs a dimension of 1536.
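
The dimension match matters because Pinecone rejects vectors whose length differs from the index dimension. Here is a minimal sketch, in case you prefer to create the index from code rather than in the Pinecone console. It assumes the official `pinecone` Python package; the cosine metric, the `aws` / `us-east-1` serverless settings, and the helper names are illustrative assumptions, not values taken from this walkthrough.

```python
# Default output sizes of some OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise ValueError if the index dimension differs from the model's output size."""
    expected = EMBEDDING_DIMENSIONS[model]
    if index_dimension != expected:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but the index expects {index_dimension}"
        )


def create_index(api_key: str, name: str, model: str) -> None:
    """Create a serverless Pinecone index sized for `model`."""
    # Imported lazily so the dimension check works without the package installed.
    from pinecone import Pinecone, ServerlessSpec

    pc = Pinecone(api_key=api_key)
    pc.create_index(
        name=name,
        dimension=EMBEDDING_DIMENSIONS[model],
        metric="cosine",  # assumed metric; choose what suits your retrieval setup
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # assumed cloud/region
    )


# Passes silently: Ada 002 and a 1536-dimension index agree.
check_index_dimension("text-embedding-ada-002", 1536)
```

If you later switch embedding models, rerun the check; an index created for 1536 dimensions cannot store 3072-dimensional vectors.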

We’re going to create an S3 destination using the same process as step 1, just selecting S3 as a destination.

Let’s set up our workflow at https://platform.unstructured.io/workflows/new

We are transforming complex PDFs that contain images, code, and formulas, so we are using the VLM strategy. Check out the partition documentation for more information on how to select a strategy.
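
The reasoning behind that choice can be written down as a rule of thumb. The sketch below is only an illustration of the decision described above, not Unstructured's own selection logic: the function name, the trait labels, and the `"auto"` fallback are assumptions, and the returned strings are labels rather than guaranteed Platform setting values.

```python
# Content traits that motivated the VLM strategy in this walkthrough.
COMPLEX_TRAITS = {"images", "code", "formulas"}


def suggest_strategy(traits: set[str]) -> str:
    """Return "vlm" when any complex trait is present, otherwise "auto"."""
    return "vlm" if traits & COMPLEX_TRAITS else "auto"


print(suggest_strategy({"images", "formulas"}))  # vlm
print(suggest_strategy({"plain_text"}))          # auto
```

For anything beyond this simple split, the partition documentation remains the place to check which strategy fits your files.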