-
Hello,

As far as I recall, the code currently doesn't support CSV out of the box, but it does support JSON. My suggestion would be to convert the CSV file to JSON, place it in the data folder, and run the prepdocs script.

Here is an example JSON file from the code base: https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/data/Json_Examples/query.json

You can have each movie as its own JSON file or put everything in one big JSON file (as in the example above). Here is an example of the one-file-per-record form: https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/data/Json_Examples/2189.json

Your JSON file can have any keys and values, and the search should be able to handle the type of movie questions you have.

Cheers
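For anyone following the CSV-to-JSON suggestion above, here is a minimal sketch using only the Python standard library. It assumes the CSV has a header row and writes one JSON array with one object per row. The column names in the usage note are hypothetical, and whether prepdocs treats each array element as its own document is something to verify against the example files linked above.

```python
import csv
import io
import json


def csv_to_records(csv_text: str) -> list[dict]:
    """Parse CSV text into a list of dicts, one per row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # Strip whitespace and drop empty cells so the JSON stays compact.
        # All values remain strings; convert numeric columns yourself if needed.
        record = {
            key.strip(): value.strip()
            for key, value in row.items()
            if key and value and value.strip()
        }
        if record:
            records.append(record)
    return records


def convert_file(csv_path: str, json_path: str) -> int:
    """Convert a CSV file into a single JSON file; returns the record count."""
    # utf-8-sig tolerates the byte-order mark that Excel often adds to CSV exports.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        records = csv_to_records(f.read())
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
    return len(records)
```

Usage would look something like `convert_file("movies.csv", "data/movies.json")`, where both paths are placeholders for your own files.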
-
Hey mimoflynn, I'm also interested in this topic, so I did some research and hope I can help. Unfortunately, I don't see a super easy solution. First of all, the previous poster, zedhaque, is right: processing CSV files is not possible "just like that" with the GitHub example; PDF, DOCX and JSON are currently supported.
There is a new Microsoft Learn article about loading and parsing CSV files for Azure AI Search: MS Learn: Search: How to index CSV
-
@zedhaque @advanced-flow - thank you both very much for your input - really helpful! Apologies for the late reply, I've not been able to spend any time on this over the last few days. I will look into all suggestions, starting with converting my data to JSON. I'll write a Python script to do this, creating a single file rather than separate files: I have approx. 15k records, so I would have an issue with storing/uploading them all individually. I may come back to you with more questions in the next few days, if that's OK? I will also need to understand how the file can be updated when new records need to be added, but I know @pamelafox covered this in one of the sessions, so I'll watch the videos back. Thanks again
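On keeping a single JSON file current as new records arrive, one possible approach is to merge new records into the existing array by a unique key. This is only a sketch: the `id` key is hypothetical, and it covers the file itself, not how the search index picks up the change.

```python
import json
from pathlib import Path


def merge_records(existing: list[dict], new: list[dict], key: str = "id") -> list[dict]:
    """Return existing records updated with new ones; a matching key replaces the old record."""
    merged = {record[key]: record for record in existing}
    for record in new:
        # A new record with a known key overwrites it in place; unknown keys are appended.
        merged[record[key]] = record
    return list(merged.values())


def update_json_file(json_path: str, new_records: list[dict], key: str = "id") -> int:
    """Merge new_records into the JSON array stored at json_path; returns the total count."""
    path = Path(json_path)
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    merged = merge_records(existing, new_records, key)
    path.write_text(json.dumps(merged, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(merged)
```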
-
Some of my input data is in a CSV file.
Can I just drop the file into my codespaces "Data" folder like I did with PDFs, so it automatically gets indexed?
Finding the best answers.
Depending on the user's query, it may be necessary to search multiple columns to get the best answer. Let's say it was movie data e.g.
Users could ask questions that relate to one, some or all of the above fields e.g. Find movies in the "Action" category, with a family, set in Australia in the last 5 years, and list the main Actors that were in them.
It would need to search the Storyline + Keywords fields, check the year, cross-check the genre and then list the actors.
Is this done using vector + keyword search, and does all of this get looked after automatically or do I need to write some bespoke code for it? If so, is there any sample code I can base it on?
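To make the steps in the example query concrete, here is a plain-Python sketch of the logic: exact filters on genre and year, plus a text match over the Storyline and Keywords fields. It is an illustration only, not how the demo or Azure AI Search performs retrieval. The field names are hypothetical, and the simple substring match stands in for real keyword or vector search.

```python
from dataclasses import dataclass, field


@dataclass
class Movie:
    # Hypothetical schema mirroring the fields mentioned in the question.
    title: str
    genre: str
    year: int
    storyline: str
    keywords: list[str] = field(default_factory=list)
    actors: list[str] = field(default_factory=list)


def find_movies(movies, genre=None, min_year=None, terms=()):
    """Filter on structured fields first, then match terms in storyline + keywords."""
    results = []
    for movie in movies:
        if genre and movie.genre.lower() != genre.lower():
            continue
        if min_year is not None and movie.year < min_year:
            continue
        # Combine the free-text fields so a term can match either one.
        text = " ".join([movie.storyline, *movie.keywords]).lower()
        if all(term.lower() in text for term in terms):
            results.append(movie)
    return results
```

A call such as `find_movies(movies, genre="Action", min_year=2019, terms=["family", "Australia"])` returns the matching records, from which the `actors` lists can then be read off.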
Or it could be something like 'Find all Leaf Phoenix movies' and the AI would need to know we are talking about an actor and not nature, trees or mythical birds!
Many thanks
Mim