diff --git a/SampleInvoices/Extras/Lab3 Sample Data/5555_series.pdf b/SampleInvoices/Extras/Lab3 Sample Data/5555_series.pdf new file mode 100644 index 0000000..611e152 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/5555_series.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/7454.pdf b/SampleInvoices/Extras/Lab3 Sample Data/7454.pdf new file mode 100644 index 0000000..b92eb9e Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/7454.pdf differ diff --git a/SampleInvoices/Lab3 Sample Data/NCR EX15POS.pdf b/SampleInvoices/Extras/Lab3 Sample Data/NCR EX15POS.pdf similarity index 100% rename from SampleInvoices/Lab3 Sample Data/NCR EX15POS.pdf rename to SampleInvoices/Extras/Lab3 Sample Data/NCR EX15POS.pdf diff --git a/SampleInvoices/Lab3 Sample Data/P1535-P1235 instructions.pdf b/SampleInvoices/Extras/Lab3 Sample Data/P1535-P1235 instructions.pdf similarity index 100% rename from SampleInvoices/Lab3 Sample Data/P1535-P1235 instructions.pdf rename to SampleInvoices/Extras/Lab3 Sample Data/P1535-P1235 instructions.pdf diff --git a/SampleInvoices/Extras/Lab3 Sample Data/fastlane_selfserv_7358k112.pdf b/SampleInvoices/Extras/Lab3 Sample Data/fastlane_selfserv_7358k112.pdf new file mode 100644 index 0000000..97e3c73 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/fastlane_selfserv_7358k112.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/orderman7.pdf b/SampleInvoices/Extras/Lab3 Sample Data/orderman7.pdf new file mode 100644 index 0000000..5175bc7 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/orderman7.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/personas_m_series.pdf b/SampleInvoices/Extras/Lab3 Sample Data/personas_m_series.pdf new file mode 100644 index 0000000..2c57a8f Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/personas_m_series.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/px10_pos_7746.pdf 
b/SampleInvoices/Extras/Lab3 Sample Data/px10_pos_7746.pdf new file mode 100644 index 0000000..e32a238 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/px10_pos_7746.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/realpos.pdf b/SampleInvoices/Extras/Lab3 Sample Data/realpos.pdf new file mode 100644 index 0000000..2b3c0f0 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/realpos.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/realscan_7883.pdf b/SampleInvoices/Extras/Lab3 Sample Data/realscan_7883.pdf new file mode 100644 index 0000000..709ef76 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/realscan_7883.pdf differ diff --git a/SampleInvoices/Extras/Lab3 Sample Data/selfserv_84_walkup.pdf b/SampleInvoices/Extras/Lab3 Sample Data/selfserv_84_walkup.pdf new file mode 100644 index 0000000..4ab7363 Binary files /dev/null and b/SampleInvoices/Extras/Lab3 Sample Data/selfserv_84_walkup.pdf differ diff --git a/SampleInvoices/Sample 4/80211548.jpg b/SampleInvoices/Extras/Sample 4/80211548.jpg similarity index 100% rename from SampleInvoices/Sample 4/80211548.jpg rename to SampleInvoices/Extras/Sample 4/80211548.jpg diff --git a/SampleInvoices/Sample 4/80211548.png b/SampleInvoices/Extras/Sample 4/80211548.png similarity index 100% rename from SampleInvoices/Sample 4/80211548.png rename to SampleInvoices/Extras/Sample 4/80211548.png diff --git a/SampleInvoices/Sample 4/80211567.png b/SampleInvoices/Extras/Sample 4/80211567.png similarity index 100% rename from SampleInvoices/Sample 4/80211567.png rename to SampleInvoices/Extras/Sample 4/80211567.png diff --git a/SampleInvoices/Sample 4/80233633.png b/SampleInvoices/Extras/Sample 4/80233633.png similarity index 100% rename from SampleInvoices/Sample 4/80233633.png rename to SampleInvoices/Extras/Sample 4/80233633.png diff --git a/SampleInvoices/Sample 4/80233696.png b/SampleInvoices/Extras/Sample 4/80233696.png similarity index 100% rename 
from SampleInvoices/Sample 4/80233696.png rename to SampleInvoices/Extras/Sample 4/80233696.png diff --git a/SampleInvoices/Sample 4/80233716.png b/SampleInvoices/Extras/Sample 4/80233716.png similarity index 100% rename from SampleInvoices/Sample 4/80233716.png rename to SampleInvoices/Extras/Sample 4/80233716.png diff --git a/SampleInvoices/Sample 4/80233719.png b/SampleInvoices/Extras/Sample 4/80233719.png similarity index 100% rename from SampleInvoices/Sample 4/80233719.png rename to SampleInvoices/Extras/Sample 4/80233719.png diff --git a/SampleInvoices/Sample 6/0060027362.tif b/SampleInvoices/Extras/Sample 6/0060027362.tif similarity index 100% rename from SampleInvoices/Sample 6/0060027362.tif rename to SampleInvoices/Extras/Sample 6/0060027362.tif diff --git a/SampleInvoices/Sample 6/0060027384.tif b/SampleInvoices/Extras/Sample 6/0060027384.tif similarity index 100% rename from SampleInvoices/Sample 6/0060027384.tif rename to SampleInvoices/Extras/Sample 6/0060027384.tif diff --git a/SampleInvoices/Sample 6/0060027402.tif b/SampleInvoices/Extras/Sample 6/0060027402.tif similarity index 100% rename from SampleInvoices/Sample 6/0060027402.tif rename to SampleInvoices/Extras/Sample 6/0060027402.tif diff --git a/SampleInvoices/Sample 6/0060027425.tif b/SampleInvoices/Extras/Sample 6/0060027425.tif similarity index 100% rename from SampleInvoices/Sample 6/0060027425.tif rename to SampleInvoices/Extras/Sample 6/0060027425.tif diff --git a/SampleInvoices/sample1/2029370271.jpg b/SampleInvoices/Extras/sample1/2029370271.jpg similarity index 100% rename from SampleInvoices/sample1/2029370271.jpg rename to SampleInvoices/Extras/sample1/2029370271.jpg diff --git a/SampleInvoices/sample1/2029370271.tif b/SampleInvoices/Extras/sample1/2029370271.tif similarity index 100% rename from SampleInvoices/sample1/2029370271.tif rename to SampleInvoices/Extras/sample1/2029370271.tif diff --git a/SampleInvoices/sample1/2029370308.tif 
b/SampleInvoices/Extras/sample1/2029370308.tif similarity index 100% rename from SampleInvoices/sample1/2029370308.tif rename to SampleInvoices/Extras/sample1/2029370308.tif diff --git a/SampleInvoices/sample1/2029370464.tif b/SampleInvoices/Extras/sample1/2029370464.tif similarity index 100% rename from SampleInvoices/sample1/2029370464.tif rename to SampleInvoices/Extras/sample1/2029370464.tif diff --git a/SampleInvoices/sample1/2029372111_2029372112.tif b/SampleInvoices/Extras/sample1/2029372111_2029372112.tif similarity index 100% rename from SampleInvoices/sample1/2029372111_2029372112.tif rename to SampleInvoices/Extras/sample1/2029372111_2029372112.tif diff --git a/SampleInvoices/sample1/2029377288.tif b/SampleInvoices/Extras/sample1/2029377288.tif similarity index 100% rename from SampleInvoices/sample1/2029377288.tif rename to SampleInvoices/Extras/sample1/2029377288.tif diff --git a/SampleInvoices/sample1/2029377295.tif b/SampleInvoices/Extras/sample1/2029377295.tif similarity index 100% rename from SampleInvoices/sample1/2029377295.tif rename to SampleInvoices/Extras/sample1/2029377295.tif diff --git a/SampleInvoices/sample1/2029377607.tif b/SampleInvoices/Extras/sample1/2029377607.tif similarity index 100% rename from SampleInvoices/sample1/2029377607.tif rename to SampleInvoices/Extras/sample1/2029377607.tif diff --git a/SampleInvoices/sample1/2029377724.tif b/SampleInvoices/Extras/sample1/2029377724.tif similarity index 100% rename from SampleInvoices/sample1/2029377724.tif rename to SampleInvoices/Extras/sample1/2029377724.tif diff --git a/SampleInvoices/sample1/2029378060.tif b/SampleInvoices/Extras/sample1/2029378060.tif similarity index 100% rename from SampleInvoices/sample1/2029378060.tif rename to SampleInvoices/Extras/sample1/2029378060.tif diff --git a/SampleInvoices/sample1/2029378157.tif b/SampleInvoices/Extras/sample1/2029378157.tif similarity index 100% rename from SampleInvoices/sample1/2029378157.tif rename to 
SampleInvoices/Extras/sample1/2029378157.tif diff --git a/SampleInvoices/Sample 7/invoice1.jpg b/SampleInvoices/Lab 1 Step 3.7/invoice1.jpg similarity index 100% rename from SampleInvoices/Sample 7/invoice1.jpg rename to SampleInvoices/Lab 1 Step 3.7/invoice1.jpg diff --git a/SampleInvoices/Sample 7/invoice2.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice2.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice2.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice2.jpeg diff --git a/SampleInvoices/Sample 7/invoice3.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice3.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice3.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice3.jpeg diff --git a/SampleInvoices/Sample 7/invoice4.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice4.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice4.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice4.jpeg diff --git a/SampleInvoices/Sample 7/invoice5.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice5.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice5.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice5.jpeg diff --git a/SampleInvoices/Sample 7/invoice6.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice6.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice6.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice6.jpeg diff --git a/SampleInvoices/Sample 7/invoice8.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice8.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice8.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice8.jpeg diff --git a/SampleInvoices/Sample 7/invoice9.jpeg b/SampleInvoices/Lab 1 Step 3.7/invoice9.jpeg similarity index 100% rename from SampleInvoices/Sample 7/invoice9.jpeg rename to SampleInvoices/Lab 1 Step 3.7/invoice9.jpeg diff --git a/SampleInvoices/Lab 2/Cayenne-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf b/SampleInvoices/Lab 
2/Cayenne-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf new file mode 100644 index 0000000..68fb65f Binary files /dev/null and b/SampleInvoices/Lab 2/Cayenne-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf differ diff --git a/SampleInvoices/Lab 2/Panamera-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf b/SampleInvoices/Lab 2/Panamera-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf new file mode 100644 index 0000000..f1364d5 Binary files /dev/null and b/SampleInvoices/Lab 2/Panamera-from-2021-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf differ diff --git a/SampleInvoices/Lab 2/Taycan-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf b/SampleInvoices/Lab 2/Taycan-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf new file mode 100644 index 0000000..a556e09 Binary files /dev/null and b/SampleInvoices/Lab 2/Taycan-Porsche-Connect-Good-to-know-Owner-s-Manual.pdf differ diff --git a/images/2.2.png b/images/2.2.png new file mode 100644 index 0000000..9f617fb Binary files /dev/null and b/images/2.2.png differ diff --git a/images/3.1.png b/images/3.1.png new file mode 100644 index 0000000..2f64f3e Binary files /dev/null and b/images/3.1.png differ diff --git a/images/3.2.png b/images/3.2.png new file mode 100644 index 0000000..b80db8d Binary files /dev/null and b/images/3.2.png differ diff --git a/images/BPA 2.png b/images/BPA 2.png new file mode 100644 index 0000000..a2cfc42 Binary files /dev/null and b/images/BPA 2.png differ diff --git a/images/BPA 3.png b/images/BPA 3.png new file mode 100644 index 0000000..d58a740 Binary files /dev/null and b/images/BPA 3.png differ diff --git a/images/BPA ingest documents.png b/images/BPA ingest documents.png new file mode 100644 index 0000000..4022091 Binary files /dev/null and b/images/BPA ingest documents.png differ diff --git a/images/customermodelprojectcreation.png b/images/customermodelprojectcreation.png new file mode 100644 index 0000000..adebb7b Binary files /dev/null and 
b/images/customermodelprojectcreation.png differ diff --git a/images/searchconfig.png b/images/searchconfig.png new file mode 100644 index 0000000..85affc4 Binary files /dev/null and b/images/searchconfig.png differ diff --git a/images/selectblobstorage.png b/images/selectblobstorage.png new file mode 100644 index 0000000..4bbb613 Binary files /dev/null and b/images/selectblobstorage.png differ diff --git a/images/selectblobstorage.png.png b/images/selectblobstorage.png.png new file mode 100644 index 0000000..4bbb613 Binary files /dev/null and b/images/selectblobstorage.png.png differ diff --git a/images/selectcontainer.png b/images/selectcontainer.png new file mode 100644 index 0000000..0bc0a00 Binary files /dev/null and b/images/selectcontainer.png differ diff --git a/images/selectcontainerfolder.png b/images/selectcontainerfolder.png new file mode 100644 index 0000000..0229262 Binary files /dev/null and b/images/selectcontainerfolder.png differ diff --git a/images/selectimportdata.png b/images/selectimportdata.png index e982bfb..c25ad7e 100644 Binary files a/images/selectimportdata.png and b/images/selectimportdata.png differ diff --git a/images/selectimportdata.png.png b/images/selectimportdata.png.png new file mode 100644 index 0000000..c25ad7e Binary files /dev/null and b/images/selectimportdata.png.png differ diff --git a/images/step1b playground replacement.png b/images/step1b playground replacement.png new file mode 100644 index 0000000..c7f762a Binary files /dev/null and b/images/step1b playground replacement.png differ diff --git a/lab_instructions/Extras/lab_1.md b/lab_instructions/Extras/lab_1.md new file mode 100644 index 0000000..2beb093 --- /dev/null +++ b/lab_instructions/Extras/lab_1.md @@ -0,0 +1,160 @@ +# Create and Deploy a Form Recognizer Custom Model + +### Overview +In this lab, you will create (train) an Azure Form Recognizer custom model using a sample training dataset. 
Custom models extract and analyze distinct data and use cases from forms and documents specific to your business. To create a custom model, you label a dataset of documents with the values you want extracted and train the model on the labeled dataset. You only need five examples of the same form or document type to get started. For this lab, you will use the dataset provided at [Custom Model Sample Files](/SampleInvoices/Lab3%20Sample%20Data). + + +### Goal +* Use a sample training data set to train a custom model in the Azure Form Recognizer Studio +* Label the training data documents with custom fields of interest +* Test the trained model on test data, and visualize the results and confidence scores in the Studio +* Use the custom model in the BPA pipeline from Lab 1 + + +### Pre-requisites +* The accelerator is deployed and ready in the resource group +* You have an Azure subscription and permission to create a Form Recognizer Resource +* You have access to the sample invoices folder with the invoices to upload + + +### Instructions + +#### Create a Custom Model +- [Step 1 - Create a Form Recognizer Resource](#step-1---create-a-form-recognizer-resource) +- [Step 2 - Open Form Recognizer Studio and Create a Custom Labeling Project ](#step-2---open-form-recognizer-studio-and-create-a-custom-labeling-project) +- [Step 3 - Import the Sample Data](#step-3---import-the-sample-data) +- [Step 4 - Train the model](#step-4---train-the-model) +- [Step 5 - Test the Model on Test Data](#step-5---test-the-model-on-test-data) + +#### Step 1 - Create a Form Recognizer Resource +![](images/step1a-create-form-rec-resource.png) +![](images/step1b-create-form-rec-resource.png) +![](images/step1c-create-form-rec-resource.png) + +#### Step 2 - Open Form Recognizer Studio and Create a Custom Labeling Project + +![](images/step2a-Create-custom-labeling-project.png) + +Select the **Custom Extraction Model** from the bottom of the list of options + 
+![](images/step2b-Create-custom-labeling-project.png) + +Create Custom Model Project + +![](images/step2c-Create-custom-labeling-project.png) +![](images/customermodelprojectcreation.png) + +Provide the storage account and container containing the forms data which you will like to label + +![](images/step2e-Create-custom-labeling-project.png) +![](images/step2f-Create-custom-labeling-project.png) +![](images/step2g-Create-custom-labeling-project.png) + +#### Step 3 - Import the Sample Data +Use the data folder on VM desktop and go to **Custom Model Sample Files** and pick 5 files marked as **train** +![](images/step3a-import-sample-data.png) +![](images/step3b-import-sample-data.png) + +Create a new field which you would like to label + +![](images/step3c-import-sample-data.png) +We created the label as "Organization_sample" + +![](images/step3d-import-sample-data.png) + +Apply the custom label to form fields +![](images/step3e-import-sample-data.png) +Apply the labels to all forms by repeating the process in the previous step +![](images/step3f-import-sample-data.png) +#### Step 4 - Train the model +After labeling the forms, click on "Train" and provide the below information. Please note **Neural** method will take a longer duration to train but may be necessary in case of most unstructured files. If your data is mostly structured, you can use **Tabular** to make the training faster. For this workshop, we will use Tabular method to train the model. 
+![](images/step4a-train-the-model.png) +![](images/step4b-train-the-model.png) +#### Step 5 - Test the Model on Test Data +Use the sample files marked as **test** from the same location where you picked the files for training +![](images/step5a-test-the-model.png) +![](images/step5b-test-the-model.png) +Load the test file and click "Analyze" +![](images/step5c-test-the-model.png) +The results are displayed with the confidence score +![](images/step5d-test-the-model.png) + + +#### Build new pipeline with custom model module in BPA +After you are satisfied with the custom model performance, you can retrieve the **model ID** and use it in a new BPA pipeline with the Custom Model module in the next step. + +#### Launch BPA Accelerator +1. Launch the accelerator from the resource group in the Static Web App + 1. To do this, go to portal.azure.com ([Azure Portal](https://portal.azure.com)) from a web browser and click on the resource group that is created for the purpose of this lab. + ![resourcegroup.png](/images/resourcegroup.png) + Click on the resource group that is created for this lab; you should be able to see the resources deployed as a part of the Business Process Automation accelerator deployment. + + > **Note :** The names will be different in your specific labs and will not exactly match with the names of the resources or resource group + + ![resourceswithinresourcegroup.png](/images/resourceswithinresourcegroup.png) + + 1. Look for the Static Web App under **type**. This is what we will use as a part of lab 1. Click on the Web App. + + ![staticwebappresource.png](/images/staticwebappresource.png) + + Click on the URL and this will launch the accelerator + ![swaurl.png](/images/swaurl.png) + +1. Please create the following pipeline: +![](images/step6a-deploy-custom-model.png) +![](images/step6b-deploy-custom-model.png) +![](images/step6c-deploy-custom-model.png) + +1. 
Retrieve the trained **custom model ID** from the Form Recognizer Studio and Enter it into the following window: +![](images/step6d-deploy-custom-model.png) +![](images/step6e-deploy-custom-model.png) + +1. Check the newly created pipeline use **View Pipeline** Option +![](images/step6f-deploy-custom-model.png) + +1. Ingest data for the new pipeline from BPA homepage. Please make sure you select the Pipeline first before ingesting the files. For smaller files use the **Upload A Single Document** option. Otherwise for larger files use **Split Document By Page And Process** option. +![](images/step6g-deploy-custom-model.png) + + +1. To get the **Search Service**. To view the results, go to portal.azure.com ([Azure Portal](portal.azure.com)) again in your browser and get to the resource group like we did earlier in Step 1. There, in the resource group, click on the resource that is of type **Search Service**. + + ![searchservicetype.png](/images/searchservicetype.png) + +1. Click on **Import Data** and Select **Azure Blob Storage** from the dropdown in datasource. + ![selectblobstorage.png](/images/selectblobstorage.png) +1. Provide a name for datasource; change the parsing mode to **JSON**; click on **Choose an existing connection** for **Connection String** and select the Storage account related to your project and choose container **document** + ![selectcontainer.png](/images/selectcontainer.png) + +1. Keep the default for **Managed identity Authentication**, which is **None**. + +1. On the Blob folder, provide the name of **your pipeline** + + ![selectcontainerfolder.png](/images/selectcontainerfolder.png) + +1. Click **Next: Add cognitive skills (Optional)**. This validates and creates the index schema. + +1. In the next Screen(**Add cognitive skills (Optional)**), Click **Skip to: Customize Target Index**, + ![customizetargetindex.png](/images/customizetargetindex.png) + +1. 
Make all fields **Retrievable** and **Searchable** + ![searchconfig.png](/images/searchconfig.png) + +1. Provide a name for the Index and click on **Next: Create an indexer** + ![indexname.png](/images/indexname.png) + +1. Provide a name for the indexer and click **Submit** + + ![createindexer.png](/images/createindexer.png) + +1. You will get a notification that the import is successfully configured + +1. Now, go back to the accelerator url that you retreived from Step 1 and click on **Sample Search Application**. + ![samplesearchapplication.png](/images/samplesearchapplication.png) + + This opens the same search application + ![searchlandingpage.png](/images/searchlandingpage.png) + +1. You can now filter and search on items and other fields configured. +## More Resources +Getting Started with Form Recognizer Studio - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/form-recognizer-studio-overview?view=form-recog-3.0.0 +Form Recognizer Documentation - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-invoice?view=form-recog-3.0.0 diff --git a/lab_instructions/lab_3.md b/lab_instructions/Extras/lab_2.md similarity index 61% rename from lab_instructions/lab_3.md rename to lab_instructions/Extras/lab_2.md index 1d69a93..825fd93 100644 --- a/lab_instructions/lab_3.md +++ b/lab_instructions/Extras/lab_2.md @@ -16,7 +16,7 @@ In this lab, you will use unstructured data files like contract documents, lease ### Instructions -#### **Step 1a - Create a OpenAI Generic Pipeline** +#### **Step 1 - Create A Generic Pipeline** ![](images/BPAHomepage.png) ![](images/Lab3NewPipeline.png) @@ -25,34 +25,6 @@ In this lab, you will use unstructured data files like contract documents, lease ![](images/Lab3OCR2Txt.png) -![](images/Lab3OpenAIGeneric.png) - -### **Step 1b - Get Sample Configurations from GPT-3 Playground** - -![](images/Lab3SelectOAIResource.png) - -![](images/Lab3OAIExplore.png) - 
-![](images/Lab3OAIClickGPT3.png) - -At this stage, we select the model we want to use and the feature we want to leverage. In this case we will be using the Davinci model and the Summerize feature. The playground brings in a sample on the editor. Select the content of the 'Conversation' section and replace with ${document} to ensure the dynamic content is used on runtime. -After that click on 'View Code' on top right. - -![](images/Lab3OAIPlayground.png) - -On the pop up, there will be drop down menu where by default 'Python' will be selected. Please change that to 'json' and Copy the code snippet. - -![](images/CopySamplejson.png) - -Go back to the BPA tab and replace the default text on the Generic OpenAI component opened earlier with the copied text. - -![](images/Lab3OAISampleCode.png) - -That completes the pipeline - -![](images/Lab3FinishPipeline.png) - - #### **Step 2 - Ingest Data for the pipeline** There are 2 options for ingesting the data for the pipeline: @@ -70,20 +42,21 @@ There are 2 options for ingesting the data for the pipeline: 1. Click on **Import Data**. ![selectimportdata.png](/images/selectimportdata.png) -1. Select **Azure Cosmos DB** from the dropdown in datasource. - ![selectazurecosmosdb.png](/images/selectazurecosmosdb.png) - -1. Provide a name for datasource and click on **Choose an existing connection** for **Connection String**. Here the Azure CosmosDB resource created as a part of BPA accelerator already setup will be one of the sources you can choose from. - ![selectcosmosdb.png](/images/selectcosmosdb.png) - - -1. Keep the default for **Managed identity Authentication**, which is **None**. For **Databases** and **Collection** use the dropdown to select the same name as the Cosmos DB you selected at step 15. +1. Select **Azure Blob Storage** from the dropdown in datasource. -1. Under Query, use the following Query. 
The pipeline should match the pipeline name you used in step 3 - > SELECT * FROM c WHERE c.pipeline = 'YOUR-PIPELINE-NAME' AND c._ts > @HighWaterMark - - ![](images/Lab3LoadData.png) +1. Provide a name for datasource and click on **Choose an existing connection** for **Connection String**. Here the Azure Blob Storage resource created as a part of BPA accelerator already setup will be one of the sources you can choose from. + ![lab3-import-data-1.png](images/lab3-import-data-1.png) +1. After you select your storage account, select the **results** container from the list and click **Select** button. See the screen shot below. + ![lab3-import-data-2.png](images/lab3-import-data-2.png) +1. On the Import data screen make sure you have the following: + - Your data source as **Azure Blob Storage** + - You have provided data source name. For e.g. **storagedatasource** + - You selected **JSON** as Parsing mode + - Your container name is **results** and + - Your created pipeline name is entered. For instance, if your pipeline name is **pipeline-name** then enter pipeline-name + - Keep the default for **Managed identity Authentication**, which is **None** + ![lab3-import-data-3.png](images/lab3-import-data-3.png) 1. Click **Next: Add cognitive skills (Optional)**. This validates and creates the index schema. @@ -111,16 +84,30 @@ There are 2 options for ingesting the data for the pipeline: 1. Select the Semantic Configuration and click on Create new. On the pop up do the following: - - Give a name to the Semantic Search Config + - Give a name to the Semantic Search Config. 
**For this lab, the name must be 'default'** - Select the Title field and select 'filename' - Select the 'content' field and any other relevant fields for Content Fields - Select Save -![](images/Lab3SemSearchConfig.png) +![](images/Lab3SemSearchConfig_default.png) +It is important that you name your Semantic Search Config for this lab as **default** + +![](images/lab3-semantic-config-save.png) + +Do not forget to click on **save** again on the index screen. Otherwise the Semantic Search Config will not be applied. -![](images/Lab3SemSearchConfigSave.png) +#### **Step 5 - Perform Azure OpenAI Search** +1. Now, go back to the accelerator url that you retreived from Step 1 and click on **Search Application**. + ![BPAHomepageSampleSearch](images/BPAHomePageSearchApp.png) -#### **Step 5 - Perform Semantic Search** + This opens the Azure Open AI search application + + - Select the index from top drop down. In this case the index created earlier is selected. **azureblob-index** + - Provide a search query based on your document, like: + - 'Tell me the 7454 installation instructions' + ![Lab5-openai-search](images/Lab5-openai-search.png) + +#### **Step 6 - Perform Semantic Search** 1. Now, go back to the accelerator url that you retreived from Step 1 and click on **Sample Search Application**. ![](images/BPAHomepageSSA.png) diff --git a/lab_instructions/Lab 1/Lab 1.md b/lab_instructions/Lab 1/Lab 1.md new file mode 100644 index 0000000..8801a01 --- /dev/null +++ b/lab_instructions/Lab 1/Lab 1.md @@ -0,0 +1,184 @@ +# Create and Deploy a Form Recognizer Custom Model + +### Overview +In this lab, you will create (train) an Azure Form Recognizer custom model using a sample training dataset. Custom models extract and analyze distinct data and use cases from forms and documents specific to your business. To create a custom model, you label a dataset of documents with the values you want extracted and train the model on the labeled dataset. 
You only need five examples of the same form or document type to get started. For this lab, you will use the dataset provided at [Custom Model Sample Files](/SampleInvoices/Custom%20Model%20Sample/).. + + +### Goal +* Use a sample training data set to train a custom model in the Azure Form Recognizer Studio +* Label the training data documents with custom fields of interest +* Test the trained model on test data, visualized results and confidence score in the Studio +* Use the custom model in the BPA pipeline + + +### Pre-requisites +* The accelerator is deployed and ready in the resource group +* You have an Azure subscription and permission to create a Form Recognizer Resource +* You have access to sample invoices folder with the invoices to upload + + +### Instructions + + +### Step 1 Creating a Form Recognizer Resource + +#### 1.1 Go to the Resource group and select the "Cognitive services multi-service account" resource type + +![Alt text]() + +#### 1.2 Click on Form Recognizer tab and select "Go to Studio" + +![Alt text](images/image.png) + +#### 1.3 In Form Recognizer Studio, under Custom models, choose Create new + +![Alt text](images/image-1.png) + +#### 1.4 Create a new project + +![Alt text](images/image-2.png) + +#### 1.5 Fill in project details + +![Alt text](images/image-3.png) + +#### 1.6 Fill in details for configuring service resource. For Form Recognizer or Cognitive Service Resource, choose the one available from your drop-down menu. Choose the General Available API version 2022-08-31 + +![Alt text](images/image-4.png) + +#### 1.7 Configure data source details and for storage account, choose create new storage account + +![Alt text]() + +#### 1.8 validate the information and choose create project + +![Alt text]() + + +### Step 2 Train and Label data +In this step, you will upload 5 training documents to train the model. 
+ +#### 2.1 Upload sample data +Use the data folder on VM desktop and go to [Custom Model Sample](/SampleInvoices/Custom%20Model%20Sample/) files and pick 5 files marked as train. Once uploaded, choose Run now in the pop-up window under Run Layout. + +![Alt text](images/image-5.png) + + +#### 2.2 Add a field + +![Alt text](images/image-7.png) + +#### 2.3 Label the new field added by selecting the CONTOSO LTD in the top left of each document uploaded. Do this for all the five documents. + +![Alt text]() + +#### 2.4 Once all the documents are labelled, choose Train in the top right corner + +![Alt text]() + +#### 2.5 Specify a model ID and choose Template for the Build Mode. Save this Model ID somewhere as you will be needing it in next steps. + +![Alt text]() + +#### 2.6 Go to Models. Wait till the model status shows succeeded. + +![Alt text]() + +Select the model you created and choose Test. + +![Alt text](images/image-8.png) + +#### 2.7 In the Test model window, use the sample files marked as test from the [same location](/SampleInvoices/Custom%20Model%20Sample/) where you picked the files for training. Once uploaded, choose Run all analysis. + +![Alt text](images/image-9.png) + +#### 2.8 Now you can see on the right hand side, the model was able to detect the field "Organization_sample" we created in the last step along with its confidence score + +![Alt text]() + + +### Step 3 Build new pipeline with custom model module in BPA + +After you are satisfied with the custom model performance, you can retrieve the model ID and use it in a new BPA pipeline with the Custom Model module in the next step. + +#### 3.1 Launch BPA Accelerator + +Navigate to the Resource Group and select the resource group which is already created for you. 
+ +![Alt text](images/image-10.png) + +#### 3.2 Select the static web app and click on the URL + +![Alt text]() + +![Alt text](images/image-11.png) + +#### 3.3 Choose the Create/Update/Delete Pipelines option and create a new pipeline by specifying a name + +![Alt text]() + +![Alt text]() + +#### 3.4 Select PDF Document + +![Alt text](images/image-12.png) + +#### 3.5 Select the Form Recognizer custom model (batch) option and specify the model ID you gave in Step 2. + +![Alt text](images/image-13.png) + +![Alt text](images/image-14.png) + +Click on Done + +![Alt text](images/image-15.png) + +#### 3.6 Now ingest documents by going to the Home page of BPA and choosing the Ingest Documents option. + +![Alt text](images/image-16.png) + +#### 3.7 From the Select a pipeline drop-down, select the pipeline you just created and click on upload under upload a single document + +![Alt text](images/image-17.png) + +#### 3.8 For documents, go to the [Lab 1 Step 3.7](/SampleInvoices/Lab%201%20Step%203.7/) folder. You can upload multiple invoices one by one. + + +### Step 4 Configure Azure Cognitive Search + + +#### 4.1 Go back to the resource group window and select Search service + +![Alt text](images/image-18.png) + +#### 4.2 Click on Import data and select Azure Blob Storage for the Data source option + +![Alt text](images/image-20.png) + +For connection string, choose an existing connection and select the storage account which was created for you already. Within that, select the results container. For Blob folder, specify the name of the pipeline you created in Step 3 in BPA. + +![Alt text](images/image-19.png) + +#### 4.3 Click on Add cognitive skills and skip to Customize target index. Make all fields Retrievable and Searchable. Expand the documents field and under it, expand fields to make the three fields facetable (type, valueString & content). + +![Alt text](images/facetable.png) + +#### 4.4 Provide a name for the indexer if not already given and select Submit. 
You will get a notification that the import is successfully configured + +![Alt text](images/image-22.png) + +### Step 5 Use Sample Search Application + +#### 5.1 Now go back to the BPA webpage and select Sample Search Application + +![Alt text](images/image-23.png) + +You can now filter and search on items and other fields configured. + +![Alt text]() + + + +## More Resources +Getting Started with Form Recognizer Studio - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/form-recognizer-studio-overview?view=form-recog-3.0.0 +Form Recognizer Documentation - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-invoice?view=form-recog-3.0.0 \ No newline at end of file diff --git a/lab_instructions/Lab 1/images/1.3 go to models.png b/lab_instructions/Lab 1/images/1.3 go to models.png new file mode 100644 index 0000000..0e6bc0a Binary files /dev/null and b/lab_instructions/Lab 1/images/1.3 go to models.png differ diff --git a/lab_instructions/Lab 1/images/1.3 label data new filed apply.png b/lab_instructions/Lab 1/images/1.3 label data new filed apply.png new file mode 100644 index 0000000..4353d9a Binary files /dev/null and b/lab_instructions/Lab 1/images/1.3 label data new filed apply.png differ diff --git a/lab_instructions/Lab 1/images/1.3 label data train.png b/lab_instructions/Lab 1/images/1.3 label data train.png new file mode 100644 index 0000000..aa47885 Binary files /dev/null and b/lab_instructions/Lab 1/images/1.3 label data train.png differ diff --git a/lab_instructions/Lab 1/images/1.3 train a new model .png b/lab_instructions/Lab 1/images/1.3 train a new model .png new file mode 100644 index 0000000..d8f05a9 Binary files /dev/null and b/lab_instructions/Lab 1/images/1.3 train a new model .png differ diff --git a/lab_instructions/Lab 1/images/1.5 test model last step.png b/lab_instructions/Lab 1/images/1.5 test model last step.png new file mode 100644 index 0000000..8255479 Binary files /dev/null and 
b/lab_instructions/Lab 1/images/1.5 test model last step.png differ diff --git a/lab_instructions/Lab 1/images/bpa 2.png b/lab_instructions/Lab 1/images/bpa 2.png new file mode 100644 index 0000000..2295bb0 Binary files /dev/null and b/lab_instructions/Lab 1/images/bpa 2.png differ diff --git a/lab_instructions/Lab 1/images/create custom model project.png b/lab_instructions/Lab 1/images/create custom model project.png new file mode 100644 index 0000000..7fc6bad Binary files /dev/null and b/lab_instructions/Lab 1/images/create custom model project.png differ diff --git a/lab_instructions/Lab 1/images/custom model ext configuration 2.png b/lab_instructions/Lab 1/images/custom model ext configuration 2.png new file mode 100644 index 0000000..f653206 Binary files /dev/null and b/lab_instructions/Lab 1/images/custom model ext configuration 2.png differ diff --git a/lab_instructions/Lab 1/images/facetable.png b/lab_instructions/Lab 1/images/facetable.png new file mode 100644 index 0000000..0c71a3c Binary files /dev/null and b/lab_instructions/Lab 1/images/facetable.png differ diff --git a/lab_instructions/Lab 1/images/form recognizer resource creation 1.1.png b/lab_instructions/Lab 1/images/form recognizer resource creation 1.1.png new file mode 100644 index 0000000..158d8f7 Binary files /dev/null and b/lab_instructions/Lab 1/images/form recognizer resource creation 1.1.png differ diff --git a/lab_instructions/Lab 1/images/image-1.png b/lab_instructions/Lab 1/images/image-1.png new file mode 100644 index 0000000..18e77c9 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-1.png differ diff --git a/lab_instructions/Lab 1/images/image-10.png b/lab_instructions/Lab 1/images/image-10.png new file mode 100644 index 0000000..ed5df09 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-10.png differ diff --git a/lab_instructions/Lab 1/images/image-11.png b/lab_instructions/Lab 1/images/image-11.png new file mode 100644 index 0000000..06defee Binary 
files /dev/null and b/lab_instructions/Lab 1/images/image-11.png differ diff --git a/lab_instructions/Lab 1/images/image-12.png b/lab_instructions/Lab 1/images/image-12.png new file mode 100644 index 0000000..7d6a6cf Binary files /dev/null and b/lab_instructions/Lab 1/images/image-12.png differ diff --git a/lab_instructions/Lab 1/images/image-13.png b/lab_instructions/Lab 1/images/image-13.png new file mode 100644 index 0000000..3a6ab25 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-13.png differ diff --git a/lab_instructions/Lab 1/images/image-14.png b/lab_instructions/Lab 1/images/image-14.png new file mode 100644 index 0000000..1eb3944 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-14.png differ diff --git a/lab_instructions/Lab 1/images/image-15.png b/lab_instructions/Lab 1/images/image-15.png new file mode 100644 index 0000000..86e3bca Binary files /dev/null and b/lab_instructions/Lab 1/images/image-15.png differ diff --git a/lab_instructions/Lab 1/images/image-16.png b/lab_instructions/Lab 1/images/image-16.png new file mode 100644 index 0000000..0a7a048 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-16.png differ diff --git a/lab_instructions/Lab 1/images/image-17.png b/lab_instructions/Lab 1/images/image-17.png new file mode 100644 index 0000000..ab52131 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-17.png differ diff --git a/lab_instructions/Lab 1/images/image-18.png b/lab_instructions/Lab 1/images/image-18.png new file mode 100644 index 0000000..d8151f4 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-18.png differ diff --git a/lab_instructions/Lab 1/images/image-19.png b/lab_instructions/Lab 1/images/image-19.png new file mode 100644 index 0000000..10d3b63 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-19.png differ diff --git a/lab_instructions/Lab 1/images/image-2.png b/lab_instructions/Lab 1/images/image-2.png new file mode 100644 index 
0000000..8588be7 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-2.png differ diff --git a/lab_instructions/Lab 1/images/image-20.png b/lab_instructions/Lab 1/images/image-20.png new file mode 100644 index 0000000..e7345da Binary files /dev/null and b/lab_instructions/Lab 1/images/image-20.png differ diff --git a/lab_instructions/Lab 1/images/image-21.png b/lab_instructions/Lab 1/images/image-21.png new file mode 100644 index 0000000..9cda5cd Binary files /dev/null and b/lab_instructions/Lab 1/images/image-21.png differ diff --git a/lab_instructions/Lab 1/images/image-22.png b/lab_instructions/Lab 1/images/image-22.png new file mode 100644 index 0000000..647dfb3 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-22.png differ diff --git a/lab_instructions/Lab 1/images/image-23.png b/lab_instructions/Lab 1/images/image-23.png new file mode 100644 index 0000000..ab2ea33 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-23.png differ diff --git a/lab_instructions/Lab 1/images/image-24.png b/lab_instructions/Lab 1/images/image-24.png new file mode 100644 index 0000000..971fdd6 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-24.png differ diff --git a/lab_instructions/Lab 1/images/image-3.png b/lab_instructions/Lab 1/images/image-3.png new file mode 100644 index 0000000..d988521 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-3.png differ diff --git a/lab_instructions/Lab 1/images/image-4.png b/lab_instructions/Lab 1/images/image-4.png new file mode 100644 index 0000000..3bcf60f Binary files /dev/null and b/lab_instructions/Lab 1/images/image-4.png differ diff --git a/lab_instructions/Lab 1/images/image-5.png b/lab_instructions/Lab 1/images/image-5.png new file mode 100644 index 0000000..7d94205 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-5.png differ diff --git a/lab_instructions/Lab 1/images/image-6.png b/lab_instructions/Lab 1/images/image-6.png new file mode 100644 
index 0000000..c12bb71 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-6.png differ diff --git a/lab_instructions/Lab 1/images/image-7.png b/lab_instructions/Lab 1/images/image-7.png new file mode 100644 index 0000000..c12bb71 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-7.png differ diff --git a/lab_instructions/Lab 1/images/image-8.png b/lab_instructions/Lab 1/images/image-8.png new file mode 100644 index 0000000..bb0d763 Binary files /dev/null and b/lab_instructions/Lab 1/images/image-8.png differ diff --git a/lab_instructions/Lab 1/images/image-9.png b/lab_instructions/Lab 1/images/image-9.png new file mode 100644 index 0000000..d7d902f Binary files /dev/null and b/lab_instructions/Lab 1/images/image-9.png differ diff --git a/lab_instructions/Lab 1/images/image.png b/lab_instructions/Lab 1/images/image.png new file mode 100644 index 0000000..a018bc5 Binary files /dev/null and b/lab_instructions/Lab 1/images/image.png differ diff --git a/lab_instructions/Lab 1/images/launch bpa step 5.png b/lab_instructions/Lab 1/images/launch bpa step 5.png new file mode 100644 index 0000000..1d25512 Binary files /dev/null and b/lab_instructions/Lab 1/images/launch bpa step 5.png differ diff --git a/lab_instructions/Lab 1/images/sample search app.png b/lab_instructions/Lab 1/images/sample search app.png new file mode 100644 index 0000000..766eb5e Binary files /dev/null and b/lab_instructions/Lab 1/images/sample search app.png differ diff --git a/lab_instructions/Lab 1/images/static web app.png b/lab_instructions/Lab 1/images/static web app.png new file mode 100644 index 0000000..253ded7 Binary files /dev/null and b/lab_instructions/Lab 1/images/static web app.png differ diff --git a/lab_instructions/Lab 2/Lab 2.md b/lab_instructions/Lab 2/Lab 2.md new file mode 100644 index 0000000..882c6de --- /dev/null +++ b/lab_instructions/Lab 2/Lab 2.md @@ -0,0 +1,70 @@ +# Use Azure OpenAI with your own data + +### Overview +In this lab, you will be 
using your own data with Azure OpenAI Large Language Models (LLMs), made searchable using Azure Cognitive Search. You will be using the Porsche Owner's Manual pdf provided under the [Lab 2](/SampleInvoices/Lab%202/) folder. + + +### Goal +* Learn how to use the ChatGPT LLM to extract a concise summary from your own document repository using Azure OpenAI. + +### Pre-requisites +* Access to Azure OpenAI chat playground +* Sample data to test with OpenAI + +### Instructions + +### Step 1: Navigate to Azure OpenAI Playground + +![Alt text](images/image.png) + +* 1.1 Click on **"Go to Azure OpenAI Studio"** + +![Alt text](images/image-1.png) + +* 1.2 Click on **"Bring your own data"** + +![Alt text](images/image-2.png) + +### Step 2: Upload your own data +In this step, we will be using Porsche's owner manual for the Taycan, Panamera and Cayenne models. + +* 2.1 Select the following options in the pop-up for adding the data source: + * 2.1.1 Select data source: Upload files + * 2.1.2 Select Azure Blob storage resource: Choose the already created storage account from the dropdown. If asked, enable CORS. + * 2.1.3 Select Azure Cognitive Search resource: Select the search service used in the previous lab from the dropdown. + * 2.1.4 Enter the index name: Give an index name, e.g. aoaiworkshop + * 2.1.5 Check the acknowledgement and click Next. + +![Alt text](images/image-3.png) + +* 2.2 Click on Browse for a file, select the Porsche Owner Manual pdf, then click Upload files and Next. You can select multiple files as well. 
+ +![Alt text](images/image-4.png) + +* 2.3 Click on Save and close + +![Alt text](images/image-5.png) + +### Step 3: Interact with Azure OpenAI ChatGPT LLM using your own data + +* 3.1 Under the Assistant Setup pane, wait until your data upload is finished + +![Alt text](images/image-6.png) + +* 3.2 Under the Chat Session pane, you can start testing out your prompts as shown in the figure below + +![Alt text](images/image-7.png) + +* 3.3 You can also configure the responses of your bot by selecting the system message under Assistant Setup. Here we have edited the default system message. + +![Alt text](images/image-9.png) + +* 3.4 Click on Continue + +![Alt text](images/image-10.png) + +![Alt text](images/image-11.png) + +* 3.5 In the Configuration pane, you can experiment with different parameter settings to see how they change the behavior of the model + +![Alt text](images/image-8.png) \ No newline at end of file diff --git a/lab_instructions/Lab 2/images/image-1.png b/lab_instructions/Lab 2/images/image-1.png new file mode 100644 index 0000000..bf572a0 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-1.png differ diff --git a/lab_instructions/Lab 2/images/image-10.png b/lab_instructions/Lab 2/images/image-10.png new file mode 100644 index 0000000..377751a Binary files /dev/null and b/lab_instructions/Lab 2/images/image-10.png differ diff --git a/lab_instructions/Lab 2/images/image-11.png b/lab_instructions/Lab 2/images/image-11.png new file mode 100644 index 0000000..d782213 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-11.png differ diff --git a/lab_instructions/Lab 2/images/image-2.png b/lab_instructions/Lab 2/images/image-2.png new file mode 100644 index 0000000..245fa05 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-2.png differ diff --git a/lab_instructions/Lab 2/images/image-3.png b/lab_instructions/Lab 2/images/image-3.png new file mode 100644 index 0000000..cd2893d Binary files /dev/null and 
b/lab_instructions/Lab 2/images/image-3.png differ diff --git a/lab_instructions/Lab 2/images/image-4.png b/lab_instructions/Lab 2/images/image-4.png new file mode 100644 index 0000000..a512842 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-4.png differ diff --git a/lab_instructions/Lab 2/images/image-5.png b/lab_instructions/Lab 2/images/image-5.png new file mode 100644 index 0000000..29e7500 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-5.png differ diff --git a/lab_instructions/Lab 2/images/image-6.png b/lab_instructions/Lab 2/images/image-6.png new file mode 100644 index 0000000..0753c5d Binary files /dev/null and b/lab_instructions/Lab 2/images/image-6.png differ diff --git a/lab_instructions/Lab 2/images/image-7.png b/lab_instructions/Lab 2/images/image-7.png new file mode 100644 index 0000000..35fef20 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-7.png differ diff --git a/lab_instructions/Lab 2/images/image-8.png b/lab_instructions/Lab 2/images/image-8.png new file mode 100644 index 0000000..2e6ddfa Binary files /dev/null and b/lab_instructions/Lab 2/images/image-8.png differ diff --git a/lab_instructions/Lab 2/images/image-9.png b/lab_instructions/Lab 2/images/image-9.png new file mode 100644 index 0000000..43e94f5 Binary files /dev/null and b/lab_instructions/Lab 2/images/image-9.png differ diff --git a/lab_instructions/Lab 2/images/image.png b/lab_instructions/Lab 2/images/image.png new file mode 100644 index 0000000..f296b7c Binary files /dev/null and b/lab_instructions/Lab 2/images/image.png differ diff --git a/lab_instructions/images/2.2.4 customer model project creation.png b/lab_instructions/images/2.2.4 customer model project creation.png new file mode 100644 index 0000000..adebb7b Binary files /dev/null and b/lab_instructions/images/2.2.4 customer model project creation.png differ diff --git a/lab_instructions/images/2.2.png b/lab_instructions/images/2.2.png new file mode 100644 index 
0000000..9f617fb Binary files /dev/null and b/lab_instructions/images/2.2.png differ diff --git a/lab_instructions/images/3.1.png b/lab_instructions/images/3.1.png new file mode 100644 index 0000000..2f64f3e Binary files /dev/null and b/lab_instructions/images/3.1.png differ diff --git a/lab_instructions/images/3.2.png b/lab_instructions/images/3.2.png new file mode 100644 index 0000000..b80db8d Binary files /dev/null and b/lab_instructions/images/3.2.png differ diff --git a/lab_instructions/images/BPA 2.png b/lab_instructions/images/BPA 2.png new file mode 100644 index 0000000..a2cfc42 Binary files /dev/null and b/lab_instructions/images/BPA 2.png differ diff --git a/lab_instructions/images/BPA 3.png b/lab_instructions/images/BPA 3.png new file mode 100644 index 0000000..d58a740 Binary files /dev/null and b/lab_instructions/images/BPA 3.png differ diff --git a/lab_instructions/images/BPA ingest documents.png b/lab_instructions/images/BPA ingest documents.png new file mode 100644 index 0000000..4022091 Binary files /dev/null and b/lab_instructions/images/BPA ingest documents.png differ diff --git a/lab_instructions/images/BPAHomePageSSA.png b/lab_instructions/images/BPAHomePageSSA.png new file mode 100644 index 0000000..8991ca2 Binary files /dev/null and b/lab_instructions/images/BPAHomePageSSA.png differ diff --git a/lab_instructions/images/BPAHomePageSearchApp.png b/lab_instructions/images/BPAHomePageSearchApp.png new file mode 100644 index 0000000..07a6ced Binary files /dev/null and b/lab_instructions/images/BPAHomePageSearchApp.png differ diff --git a/lab_instructions/images/BPAHomepageSSA.png b/lab_instructions/images/BPAHomepageSSA.png index 51a3eef..8991ca2 100644 Binary files a/lab_instructions/images/BPAHomepageSSA.png and b/lab_instructions/images/BPAHomepageSSA.png differ diff --git a/lab_instructions/images/Lab3SemSearchConfig.png b/lab_instructions/images/Lab3SemSearchConfig.png index e9a3dc6..f1fa52d 100644 Binary files 
a/lab_instructions/images/Lab3SemSearchConfig.png and b/lab_instructions/images/Lab3SemSearchConfig.png differ diff --git a/lab_instructions/images/Lab3SemSearchConfig_default.png b/lab_instructions/images/Lab3SemSearchConfig_default.png new file mode 100644 index 0000000..f1fa52d Binary files /dev/null and b/lab_instructions/images/Lab3SemSearchConfig_default.png differ diff --git a/lab_instructions/images/Lab5-openai-search.png b/lab_instructions/images/Lab5-openai-search.png new file mode 100644 index 0000000..b6fad7d Binary files /dev/null and b/lab_instructions/images/Lab5-openai-search.png differ diff --git a/lab_instructions/images/import data blob instead of cosmos.png b/lab_instructions/images/import data blob instead of cosmos.png new file mode 100644 index 0000000..4bbb613 Binary files /dev/null and b/lab_instructions/images/import data blob instead of cosmos.png differ diff --git a/lab_instructions/images/lab3-import-data-1.png b/lab_instructions/images/lab3-import-data-1.png new file mode 100644 index 0000000..d3e0efd Binary files /dev/null and b/lab_instructions/images/lab3-import-data-1.png differ diff --git a/lab_instructions/images/lab3-import-data-2.png b/lab_instructions/images/lab3-import-data-2.png new file mode 100644 index 0000000..c94dd32 Binary files /dev/null and b/lab_instructions/images/lab3-import-data-2.png differ diff --git a/lab_instructions/images/lab3-import-data-3.png b/lab_instructions/images/lab3-import-data-3.png new file mode 100644 index 0000000..5dd192d Binary files /dev/null and b/lab_instructions/images/lab3-import-data-3.png differ diff --git a/lab_instructions/images/lab3-semantic-config-save.png b/lab_instructions/images/lab3-semantic-config-save.png new file mode 100644 index 0000000..7f89737 Binary files /dev/null and b/lab_instructions/images/lab3-semantic-config-save.png differ diff --git a/lab_instructions/images/searchconfig.png b/lab_instructions/images/searchconfig.png new file mode 100644 index 0000000..85affc4 
Binary files /dev/null and b/lab_instructions/images/searchconfig.png differ diff --git a/lab_instructions/images/selectcontainer.png b/lab_instructions/images/selectcontainer.png new file mode 100644 index 0000000..0bc0a00 Binary files /dev/null and b/lab_instructions/images/selectcontainer.png differ diff --git a/lab_instructions/images/step1b playground replacement.png b/lab_instructions/images/step1b playground replacement.png new file mode 100644 index 0000000..c7f762a Binary files /dev/null and b/lab_instructions/images/step1b playground replacement.png differ diff --git a/lab_instructions/images/step6a-deploy-custom-model.png.png b/lab_instructions/images/step6a-deploy-custom-model.png.png new file mode 100644 index 0000000..4969039 Binary files /dev/null and b/lab_instructions/images/step6a-deploy-custom-model.png.png differ diff --git a/lab_instructions/lab_1.md b/lab_instructions/lab_1.md deleted file mode 100644 index ea0370a..0000000 --- a/lab_instructions/lab_1.md +++ /dev/null @@ -1,162 +0,0 @@ -# Deploy Pre-trained Model with Business Process Automation Accelerator (BPA) - -### Overview -In this lab, you will create a pipeline with the Business Process Automation Accelerator and utilize it to generate a JSON output in Azure Cosmos DB. We will create an indexer using search with this output and utilize the Sample Search Application within the BPA accelerator to search on specific aspects of the document. 
- - -### Goal -* Use a sample invoice document and utilize the BPA accelerator to use the Form Recognizer Pretrained Invoice Model -* Add an element that converts the invoice output to simpler format -* Run the pipeline with sample document and create a Search indexer of the simplified output -* Utilize the Sample Search Application in the Accelerator to search on specific areas of the Invoice -* Utilize a pipeline to create Table index and use that in the Sample Search Application - - -### Pre-requisites -* The accelerator is deployed and ready in the resource group -* You have access to sample invoices folder with the invoices to upload - - -### Instructions - -##### Part 1 - -1. Launch the accelerator from the resource group in the Static Web App - 1. To do this go to portal.azure.com ([Azure Portal](portal.azure.com)) from a web browser and click on resource group that is created for the purpose of this lab. - ![resourcegroup.png](/images/resourcegroup.png) - Click on the resource group that is created for this lab, you should be able to see resources deployed as a part of Business Process Automation accelerator deployment. - - > **Note :** The names will be different in your specific labs and will not exactly match with the names of the resources or resource group - - ![resourceswithinresourcegroup.png](/images/resourceswithinresourcegroup.png) - - 1. Look for the Static Web App under **type**. This is what we will use as a part of lab 1. Click on the Web App. - - ![staticwebappresource.png](/images/staticwebappresource.png) - - Click on the URL and this will launch the accelerator - ![swaurl.png](/images/swaurl.png) - -1. This is the home page of the Accelerator. Click on Configure a new pipeline - ![BPAHomepage.png](/images/BPAHomepage.png) - -1. Give a name for the pipeline and click on create - ![createnewpipeline.png](/images/createnewpipeline.png) - -1. Select **PDF Document** - ![documenttype.png](/images/documenttype.png) - -1. 
Select **Form Recognizer Prebuilt Invoice Model**. - ![selectprebuiltinvoice.png](/images/selectprebuiltinvoice.png) - -1. Select **Convert the Invoice Output to a Simpler Format** in the **Pipeline Preview** page - ![selectconverttosimplerformat](/images/selectconverttosimplerformat.png) - -1. Scroll down the page if need be and click **Done**. This step creates the pipeline - ![lab1pipelinefinalpage](/images/lab1pipelinefinalpage.png) - -1. You should be able to see the pipeline created in page that loads next - ![lab1pipelinecreated.png](/images/lab1pipelinecreated.png) - -1. The next step is to ingest documents into this pipeline. Click on **home** and select **Ingest Documents** - ![home1.png](/images/home1.png) - - ![ingestdocuments.png](/images/ingestdocuments.png) - -1. Select the pipeline you just created **first** from the dropdown and then drop documents from Sample Invoices folder. Use Sample 7 folder and drop a few documents from there. - ![selectpipeline.png](/images/selectpipeline.png) - You may see a prompt that there are some active documents being processed by the pipeline - ![activesamplesprocessing.png](/images/activesamplesprocessing.png) - -1. The results can be viewed in **Azure Cosmos DB Data Explorer**. To view the results, go to portal.azure.com ([Azure Portal](portal.azure.com)) again in your browser and get to the resource group like we did earlier in Step 1. There, in the resource group, click on the resource that is of type Azure Cosmos DB account - ![cosmosdbtype.png](/images/cosmosdbtype.png) - - Go to Data Explorer - ![cosmosdbdataexplorer.png](/images/cosmosdbdataexplorer.png) - - From there, go to items - ![cosmosdbitem.png](/images/cosmosdbitem.png) - - Click on one of the items. This represents the output from the pipeline on the documents uploaded. Since we added the item in the pipeline - **Convert the Invoice Output to a Simpler Format**, th output is simplified so we can create an indexer with **Search Service**. 
- ![oneitemjson.png](/images/oneitemjson.png) - - Scroll through the results and check the output and compare with the invoice uploaded. - -1. The get to **Search Service**. To view the results, go to portal.azure.com ([Azure Portal](portal.azure.com)) again in your browser and get to the resource group like we did earlier in Step 1. There, in the resource group, click on the resource that is of type **Search Service**. - - ![searchservicetype.png](/images/searchservicetype.png) - -1. Click on **Import Data**. - ![selectimportdata.png](/images/selectimportdata.png) - -1. Select **Azure Cosmos DB** from the dropdown in datasource. - ![selectazurecosmosdb.png](/images/selectazurecosmosdb.png) - -1. Provide a name for datasource and click on **Choose an existing connection** for **Connection String**. Here the Azure CosmosDB resource created as a part of BPA accelerator already setup will be one of the sources you can choose from. - ![selectcosmosdb.png](/images/selectcosmosdb.png) - - -1. Keep the default for **Managed identity Authentication**, which is **None**. For **Databases** and **Collection** use the dropdown to select the same name as the Cosmos DB you selected at step 15. - -1. Under Query, use the following Query. The pipeline should match the pipeline name you used in step 3 - > SELECT * from c WHERE c.id != 'pipelines' AND c.id != 'cogsearch'  AND c.pipeline = 'lab1pipeline' AND c._ts >= @HighWaterMark ORDER by c._ts - - ![importdata.png](/images/importdata.png) - -1. Click **Next: Add cognitive skills (Optional)**. This validates and creates the index schema. - -1. In the next Screen(**Add cognitive skills (Optional)**), Click **Skip to: Customize Target Index**, - ![customizetargetindex.png](/images/customizetargetindex.png) - -1. In the next screen, under **aggregated results**, click on the **...** on **invoice**, click **delete** . Similarly, you can also delete **resultindexes** - ![deleteinvoice.png](/images/deleteinvoice.png) - -1. 
Make all fields **Retrievable** and **Searchable** - ![Retrievable.png](/images/Retrievable.png) - -1. Under **aggregatedResults**-> **simplifyInvoice** Select, customerName, invoiceId, invoicedate and dueDate to be filterable and sortable - ![simplifyinvoicefiltersort.png](/images/simplifyinvoicefiltersort.png) - - -1. Similarly, under **aggregatedResults**-> **items**, select all fields to be filterable and sortable. - ![itemsfileterableandsortable.png](/images/itemsfileterableandsortable.png) - -1. Provide a name for the Index and click on **Next: Create an indexer** - ![indexname.png](/images/indexname.png) - -1. Provide a name for the indexer and click **Submit** - - ![createindexer.png](/images/createindexer.png) - -1. You will get a notification that the import is successfully configured - -1. Now, go back to the accelerator url that you retreived from Step 1 and click on **Sample Search Application**. - ![samplesearchapplication.png](/images/samplesearchapplication.png) - - This opens the same search application - ![searchlandingpage.png](/images/searchlandingpage.png) - -1. You can now filter and search on items and other fields configured. - - -#### Part 2 -We can extend this lab further by using Form Recognizer Layout Service and check how we can retrieve information in the form of tables using Azure Cognitive Search. - -1. Create a new pipeline using the layout service and extract information for table search. The steps will be similar to Steps 1-8 in **Part 1** that you just did. The pipeline page before you click **Done** at Step 7 should like like the screen shot below: - ![pipelinetablesearch.png](/images/pipelinetablesearch.png) - -1. Next step would be to ingest documents in the pipeline similar to steps 9-11 in part 1 but use the pipeline created as a part of this exercise. - -1. Now, we need to configure **Search Service** for table search. This can be configured similar to steps 12-17. 
The Query will be slightly different from what we used in step 17 and also make sure the pipeline is the name of the pipeline created for this step. Note that this query filters for **table** type - - > SELECT * from c WHERE c.id != 'pipelines' AND c.id != 'cogsearch' AND c.pipeline = 'lab1table' AND c.type = 'table' AND c._ts >= @HighWaterMark ORDER by c._ts - -1. Follow steps 18-19 as before and when you get to **Customize target index** section, give the index a name that helps identify that it is a table index and then make all fields **Searchable** and **Retrievable** and the table data and id **Filterable** and **Facetable**. - ![tableindexoptions.png](/images/tableindexoptions.png) - -1. Follow steps 24-25 and once you get a notification after clicking **Submit**, you can follow step 27 to open the **Sample Search Application** - -1. Here, select the index created as a part of this exercise and also enable **Table Search** - ![tablesearch.png](/images/tablesearch.png) - -1. Explore this UI, eg, table search configuration and filter and search on specific items to get more insights. diff --git a/lab_instructions/lab_2.md b/lab_instructions/lab_2.md deleted file mode 100644 index af09042..0000000 --- a/lab_instructions/lab_2.md +++ /dev/null @@ -1,97 +0,0 @@ -# Create and Deploy a Form Recognizer Custom Model - -### Overview -In this lab, you will create (train) an Azure Form Recognizer custom model using a sample training dataset. Custom models extract and analyze distinct data and use cases from forms and documents specific to your business. To create a custom model, you label a dataset of documents with the values you want extracted and train the model on the labeled dataset. You only need five examples of the same form or document type to get started. For this lab, you will use the dataset provided at [Custom Model Sample Files](/SampleInvoices/SampleInvoices/Custom%20Model%20Sample). 
- - -### Goal -* Use a sample training data set to train a custom model in the Azure Form Recognizer Studio -* Label the training data documents with custom fields of interest -* Test the trained model on test data, and visualize results and confidence scores in the Studio -* Use the custom model in the BPA pipeline from Lab 1 - - -### Pre-requisites -* The accelerator is deployed and ready in the resource group -* You have an Azure subscription and permission to create a Form Recognizer Resource -* You have access to the sample invoices folder with the invoices to upload - - -### Instructions - -#### Create a Custom Model -- [Step 1 - Create a Form Recognizer Resource](#step-1---create-a-form-recognizer-resource) -- [Step 2 - Open Form Recognizer Studio and Create a Custom Labeling Project ](#step-2---open-form-recognizer-studio-and-create-a-custom-labeling-project) -- [Step 3 - Import the Sample Data](#step-3---import-the-sample-data) -- [Step 4 - Train the model](#step-4---train-the-model) -- [Step 5 - Test the Model on Test Data](#step-5---test-the-model-on-test-data) - -#### Step 1 - Create a Form Recognizer Resource -![](images/step1a-create-form-rec-resource.png) -![](images/step1b-create-form-rec-resource.png) -![](images/step1c-create-form-rec-resource.png) - -#### Step 2 - Open Form Recognizer Studio and Create a Custom Labeling Project - -![](images/step2a-Create-custom-labeling-project.png) -![](images/step2b-Create-custom-labeling-project.png) - -Create Custom Model Project - -![](images/step2c-Create-custom-labeling-project.png) -![](images/step2d-Create-custom-labeling-project.png) - -Provide the storage account and container containing the forms data that you would like to label - -![](images/step2e-Create-custom-labeling-project.png) -![](images/step2f-Create-custom-labeling-project.png) -![](images/step2g-Create-custom-labeling-project.png) - -#### Step 3 - Import the Sample Data - -![](images/step3a-import-sample-data.png)
-![](images/step3b-import-sample-data.png) - -Create a new field which you would like to label - -![](images/step3c-import-sample-data.png) -We created the label as "Organziation_sample" - -![](images/step3d-import-sample-data.png) - -Apply the custom label to form fields -![](images/step3e-import-sample-data.png) -Apply the labels to all forms by repeating the process in step e -![](images/step3f-import-sample-data.png) -#### Step 4 - Train the model -After labeling the forms, click on "Train" and provide the below information -![](images/step4a-train-the-model.png) -![](images/step4b-train-the-model.png) -#### Step 5 - Test the Model on Test Data -![](images/step5a-test-the-model.png) -![](images/step5b-test-the-model.png) -Load the test file and click "Analyze" -![](images/step5c-test-the-model.png) -The results are projected with the confidence score -![](images/step5d-test-the-model.png) - - -#### Build new pipeline with custom model module in BPA -After you are satisfied with the custom model performance, you can retrieve the model ID and use it in a new BPA pipeline with the Custom Model module. - -Please repeat the steps in [Lab 1](/lab_instructions/lab_1.md) to create the following pipeline: -![](images/step6a-deploy-custom-model.png) -![](images/step6b-deploy-custom-model.png) -![](images/step6c-deploy-custom-model.png) - -Retrieve the trained custom model ID from the Form Recognizer Studio and enter it into the following window: -![](images/step6d-deploy-custom-model.png) -![](images/step6e-deploy-custom-model.png) - -Run the pipeline and visualize results in CosmosDB and search service as detailed in [Lab 1](/lab_instructions/lab_1.md).
-![](images/step6f-deploy-custom-model.png) -![](images/step6g-deploy-custom-model.png) - -## More Resources -Getting Started with Form Recognizer Studio - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/form-recognizer-studio-overview?view=form-recog-3.0.0 -Form Recognizer Documentation - https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-invoice?view=form-recog-3.0.0 diff --git a/slides/Analyzing Documents using Azure OpenAI & Form Recognizer Dec.pptx b/slides/Analyzing Documents using Azure OpenAI & Form Recognizer Dec.pptx new file mode 100644 index 0000000..79f7029 Binary files /dev/null and b/slides/Analyzing Documents using Azure OpenAI & Form Recognizer Dec.pptx differ diff --git a/slides/Analyzing Documents using Azure OpenAI & Form Recognizer.pptx b/slides/Analyzing Documents using Azure OpenAI & Form Recognizer.pptx new file mode 100644 index 0000000..10f156d Binary files /dev/null and b/slides/Analyzing Documents using Azure OpenAI & Form Recognizer.pptx differ