Data submission - Decarboxylative Olefination (Machine learning optimized) #237
|
@DrHermit - many thanks for this, and also for already correcting the inputs in your reactions. I'll do the formal review tomorrow and let you know if there are any changes we recommend. The pull request is looking as I expect though so we're hopefully on the home stretch now. |
|
Review reports: review_#237_1.ipynb (Transfer learning dataset) |
|
@DrHermit I've done my review and have a few 'must fix' issues, and some suggestions to consider. I'll list them here in decreasing order of importance. You may also be interested in checking the Jupyter Notebooks I've attached above - these go through my review checks and will include many of the comments below:
For both datasets:
For the Bayesian Optimisation dataset:
If you are unsure about how to implement any of these suggestions then do reach out to me. We could solve them quite quickly on a call, and I can send you some snippets of code to paste into the reaction editor for some bits. |
|
I was also going to say thanks for revealing the SMILES/name mismatch to me. I hadn't considered that possibility before, so it is very useful to see that this can happen in normal usage of the reaction editor enumerator tool. I'll have to give some thought to how we might prevent that in future, through changes to the app and/or training. |
|
I'd also like to strengthen the dataset names and descriptions to help more users find them. I can put together a suggestion for you to consider, and I can make that change at the last minute before the dataset goes into the public database. |
|
Suggested name/descriptions:
Bayesian optimization of 6 decarboxylative Knoevenagel condensation reactions
The Knoevenagel condensation between 6 pairings of aldehydes and malonic acid half-thioesters was studied in a Bayesian optimization campaign of 120 reaction datapoints. For each pairing, the catalyst, solvent, temperature and equivalents were optimized across 4 rounds of 6 experiments. Reactions performed by the Alan R. Healy group at New York University Abu Dhabi, and the pre-print publication is available on ChemRxiv at https://doi.org/10.26434/chemrxiv.15001213/v1. This dataset was used as training data for a subsequent transfer learning optimization of similar aldehydes and malonic acid derivatives (XXXX - will add dataset id here during submission processing). |
|
@bdeadman Thanks a lot for the careful review and thoughtful suggestions!!
For the workup steps, I added the following steps. Let me know if you have any suggestions for this:
And thanks for the suggestions for the name and description of the dataset! I saw your comment while I was writing this reply. I have already implemented the changes to both the BO and the transfer learning datasets. Please take a look at my revisions and let me know if there are more changes you would like to see! Thank you very much for the help!!
Bayesian optimization of 5 decarboxylative Knoevenagel condensation reactions.json |
|
@DrHermit - Thanks for the speedy corrections. I've checked it over and it is looking good now. If you don't mind making some final tweaks I would suggest the following for the workup descriptions:
Below is a zip file with a single reaction for each dataset. I've inserted a temperature change as the first workup, changed the ethyl acetate volume to 3 mL and added a few more details in the text descriptions. Note that I haven't done anything for item 3 above.
reordered_reaction_workups.zip
You should be able to upload these reactions (separately) to your ORD editor, check and make any final changes you want, then turn them into templates for enumeration. If any of this doesn't work smoothly then please come back to me rather than struggle with it. We're only making small optional improvements to the dataset now so we'll take the datasets as is if necessary.
For the next steps:
|
|
@bdeadman Hi! Sorry for the delay in making the changes. I was busy on some other projects for the past few days. Thank you very much for creating the workup procedure for me!! It was really helpful. I've updated the workup steps and uploaded the new dataset. Please check and let me know if I did everything correctly. I'm looking forward to having the dataset online! Thank you very much for all the suggestions and help!! |
|
Thanks @DrHermit - I'll merge those into the branch now. Notes:
|
This pull request contains two datasets generated from optimization of a decarboxylative olefination reaction between aldehydes and malonic acid derivatives:
Data generated in the Alan Healy lab at NYU Abu Dhabi.
Here is the link to the ChemRxiv pre-print of the paper:
https://chemrxiv.org/doi/full/10.26434/chemrxiv.15001213/v1
Thank you for taking the time to review my data. I hope I made the correct pull request this time.