Empa Air Pollution #347
base: dev
I have opened an issue to ask for help. I will be away for one week (holidays), so I will continue working on this afterwards.

One point that I find weird is that the validator does not seem to like the Accession strings.
You can find the details about how to construct the Accession IDs here:
It appears that you've put the name in the Accession, whereas we expect a number, e.g.:

Please note that we have detailed record specifications to help explain what is needed in the various record entries:
It seems from the validation output that at least one other compulsory field is missing:

The IPB Halle team are at BioHackEU25 this week, so they are a bit distracted, but will look into this once they are back.
|
@schymane Thanks for the answers, I managed to fix the format of our files. Before I add the whole library, is it possible to confirm/register our laboratory and the prefix? Do you need any additional information from our side?
|
Hi Lionel, |
Hi Rene, thanks for reaching out. We still need more time (we want to go manually through all files to do a quality check). About the table of contributors, we discussed it and suggest the following:
I had initially also changed it in the file in the PR. Should I do it this way, or do you want to update it from a separate PR? We will notify you when ready to merge ;)
| AC$CHROMATOGRAPHY: KOVATS_RTI 818 | ||
| PK$SPLASH: splash10-000t-9000000000-90ef1466a5c67cf33c97 | ||
| PK$ANNOTATION: m/z formula_count exact_mass error(ppm) tentative_formula intensity_fraction | ||
| 49.98421 1 49.99178 151.43 H3CCl+ 0.76 |
Delete or review: H3CCl+ is unlikely given the structure, and if it were present, the isotope signal is missing.
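As a side note on the error(ppm) column: the values in the rows of this record look consistent with (exact_mass - m/z) / exact_mass × 10^6. That sign convention is inferred from the numbers in the diff, not taken from the alpinac documentation, so treat the sketch below as an assumption. The function name `ppm_error` is made up for illustration.

```python
def ppm_error(measured_mz: float, exact_mass: float) -> float:
    """Relative mass error in ppm, with the sign convention inferred from the record."""
    return (exact_mass - measured_mz) / exact_mass * 1e6


# (measured m/z, exact mass, error(ppm) as listed in the record)
rows = [
    (49.98421, 49.99178, 151.43),
    (69.94142, 69.93716, -60.95),
    (83.94540, 83.93421, -133.34),
]

for measured, exact, listed in rows:
    recomputed = ppm_error(measured, exact)
    print(f"{measured:.5f}  recomputed {recomputed:8.2f}  listed {listed:8.2f}")
```

The recomputed values differ from the listed ones only in the second decimal, which is what one would expect if the listed masses were rounded to five decimals after the error was calculated.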
| 69.94142 1 69.93716 -60.95 Cl2+ 1.00 | ||
| 71.93848 1 71.93421 -59.40 Cl[37Cl]+ 0.77 | ||
| 81.94018 1 81.93716 -36.89 CCl2+ 1.00 | ||
| 83.94540 1 83.93421 -133.34 CCl[37Cl]+ 0.60 |
This is weird: 83.94540 m/z appears twice in a row with two different assignments. Also, the NIST spectrum has a strong signal at 83 m/z (HCCl2), but here it is absent?
Why is more than one formula generally assigned to a given mass? I see up to 3 formulas assigned per mass for this compound (some other compounds have up to 4 assignments). Do we want that? It seems weird to me.
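To make the review easier, the masses that carry more than one tentative formula can be listed automatically. The following is only a rough sketch: it assumes the six whitespace-separated columns named in the PK$ANNOTATION header, uses rows copied from the diff in this thread, and the helper name `formulas_by_mz` is hypothetical.

```python
from collections import defaultdict

# Columns: m/z formula_count exact_mass error(ppm) tentative_formula intensity_fraction
annotation = """\
81.94018 1 81.93716 -36.89 CCl2+ 1.00
83.94540 1 83.93421 -133.34 CCl[37Cl]+ 0.60
83.94540 2 83.95281 88.23 H2CCl2+ 0.86
85.94626 1 85.94986 41.85 H2CCl[37Cl]+ 1.00
"""


def formulas_by_mz(text: str) -> dict:
    """Group tentative formulas by their measured m/z (kept as a string)."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip anything that is not a complete annotation row
        mz, _count, _exact, _ppm, formula, _fraction = fields
        groups[mz].append(formula)
    return dict(groups)


# Keep only the masses with more than one assignment.
multi = {mz: f for mz, f in formulas_by_mz(annotation).items() if len(f) > 1}
print(multi)
```

Running this over every record would give a list of the peaks where a decision is needed on whether to keep several assignments.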
| 81.94018 1 81.93716 -36.89 CCl2+ 1.00 | ||
| 83.94540 1 83.93421 -133.34 CCl[37Cl]+ 0.60 | ||
| 83.94540 2 83.95281 88.23 H2CCl2+ 0.86 | ||
| 85.94626 1 85.94986 41.85 H2CCl[37Cl]+ 1.00 |
NIST has a strong 85 m/z signal (maybe H3CCl2), but it is absent here? --> Oh no, I see that the unassigned peaks are listed separately below. Still, it seems silly to me that two of the most abundant peaks, 83 and 85 m/z, are not assigned and not listed here.
| 93.93877 1 93.93716 -17.17 C2Cl2+ 0.57 | ||
| 94.94653 1 94.94498 -16.31 HC2Cl2+ 1.00 | ||
| 95.95457 1 95.94834 -64.96 HC[13C]Cl2+ 0.01 | ||
| 95.95457 2 95.95281 -18.37 H2C2Cl2+ 1.00 |
How are assignments number 1 and number 2 decided? It looks like intensity_fraction shows how much of the mass is assignable to the formula; wouldn't it make more sense to have the higher intensity_fraction assigned as 1?
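If we agree on that ordering, the renumbering could be done mechanically. This is a minimal sketch under the assumption that intensity_fraction really is the share of the peak attributed to each formula (that is only my reading of the column); the function name `rerank` and the tuple layout are made up for illustration.

```python
# (m/z, formula_count, tentative_formula, intensity_fraction), taken from the diff
rows = [
    ("95.95457", 1, "HC[13C]Cl2+", 0.01),
    ("95.95457", 2, "H2C2Cl2+", 1.00),
]


def rerank(rows):
    """Renumber formula_count per m/z so the highest intensity_fraction gets 1."""
    by_mz = {}
    for mz, _old_count, formula, fraction in rows:
        by_mz.setdefault(mz, []).append((formula, fraction))

    reranked = []
    for mz, candidates in by_mz.items():
        # Stable sort: candidates with equal fractions keep their original order.
        candidates.sort(key=lambda c: c[1], reverse=True)
        for new_count, (formula, fraction) in enumerate(candidates, start=1):
            reranked.append((mz, new_count, formula, fraction))
    return reranked


for row in rerank(rows):
    print(row)
```

With the two rows above, H2C2Cl2+ becomes assignment 1 and HC[13C]Cl2+ becomes assignment 2.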
| AC$CHROMATOGRAPHY: KOVATS_RTI 566 | ||
| PK$SPLASH: splash10-002o-9000000000-17c33adb4eb05f58d77f | ||
| PK$ANNOTATION: m/z formula_count exact_mass error(ppm) tentative_formula intensity_fraction | ||
| 23.98798 1 0.00000 0.00 - 0.00 |
what's happening here? something seems wrong...
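Rows like this one could be flagged before upload. The sketch below assumes that a tentative_formula of "-" or an exact_mass of zero marks a peak that was not identified; that interpretation is a guess from this single row, and `is_placeholder` is a hypothetical helper.

```python
def is_placeholder(row: str) -> bool:
    """True if an annotation row has no formula or a zero exact mass."""
    fields = row.split()
    if len(fields) != 6:
        return False  # not a complete annotation row
    _mz, _count, exact_mass, _ppm, formula, _fraction = fields
    return formula == "-" or float(exact_mass) == 0.0


rows = [
    "23.98798 1 0.00000 0.00 - 0.00",
    "42.99847 1 42.99785 -14.31 C2F+ 0.98",
]

for row in rows:
    print(is_placeholder(row), row)
```

A check like this only finds the rows; whether to drop them or move them to the unassigned peaks is still to be decided.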
| AC$CHROMATOGRAPHY: KOVATS_RTI 396 | ||
| PK$SPLASH: splash10-0udi-3900000000-5d7701f39c27b4d50277 | ||
| PK$ANNOTATION: m/z formula_count exact_mass error(ppm) tentative_formula intensity_fraction | ||
| 42.99847 1 42.99785 -14.31 C2F+ 0.98 |
so many peaks and so few assignments, what's going on here?
HCl? But this would come from a weird recombination effect in the fragmentation?
add NIST spectrum 1,2-dichloro-1,1-difluoroethane!
There are 2 NIST spectra already available (HCFC-132a and HCFC-132b).
Should we rename ours to include the a or the b?
Or extract both spectra separately?
We also have a HCFC-132c registered in the target list, and in NIST there are 6 compounds for that formula (https://webbook.nist.gov/cgi/cbook.cgi?Formula=H2C2F2Cl2&NoIon=on&Units=SI).
Do you think we should clean that up?
Empa_Air_Pollution/MSBNK-Empa_Air_Pollution-EAP__MS_MORGANA.txt
|
@Alina-beal I had a look at your comments; many of the issues are similar. It is good that you have opened these threads for each spectrum, so now we can comment for each spectrum on how we want to go forward.

Identification of fragments by alpinac
You mentioned empty lines. I guess we need to choose whether we want to include the non-identified fragments (and maybe manually increase the mass uncertainty of those fragments so that alpinac can identify them?).

Modifying spectra files
For the files where the spectrum is wrong, we will need to discuss how we coordinate the modifications of the current files.

Masses below 20 or too high
Since we have the cutoff and we document it in the metadata, I assume it is okay that we miss masses under 20 for some compounds.
|
We could proceed as follows:
|















Hello,
We are the Laboratory for Air Pollution of Empa and we would like to contribute our spectra to MassBank.
I wanted to test the format locally but ran into issues with the check software; see MassBank/MassBank-web#414 and MassBank/MassBank-web#413.
This is just a draft for now. We have hundreds of spectra to upload, but we first wanted to ask about the format and the metadata.
I created names and identifiers for our lab: EAP for Empa Air Pollution.
Happy to receive any feedback ;)