Conversation
vendor_data.py
Outdated
| VendorMail should be a string, if not a TypeError is raised. | ||
| TargLang should be a string, if not a TypeError is raised. | ||
| WordRate should be a float, if not a TypeError is raised. | ||
| WordRate should be > 0.15, if not a ValueError is raised. |
There was a problem hiding this comment.
WordRate should NOT be > 0.15, right?
vendor_data.py
Outdated
| TargLang should be a string, if not a TypeError is raised. | ||
| WordRate should be a float, if not a TypeError is raised. | ||
| WordRate should be > 0.15, if not a ValueError is raised. | ||
| CatTool should be a string, if not a TypeError is raised. |
There was a problem hiding this comment.
This one can probably have a limited range of options, right?
montesmariana
left a comment
There was a problem hiding this comment.
Possibility of using input():
done = "no"
name = ""
feeling = "super sad"
while done.lower() != "yes":
new_name = input(f"What's your name? (Previous value: {name}) ")
name = new_name if new_name else name
feeling = input(f"What are you feeling? (Previous value: {feeling}) ")
done = input("Are you done? ")
print(f"{name} says: '{feeling}'")
vendor_data.py
Outdated
| self.VendorMail = VendorMail | ||
| self.CatTool = CatTool | ||
|
|
||
| regex_mail = "^[a-z0-9]+[\._]?[a-z0-9]+[@]\w+[.]\w{2,3}$" #regex used to validate email |
There was a problem hiding this comment.
Have you tested this? It might need an r before the string to make it a raw string.
vendor_data.py
Outdated
| raise TypeError("TargLang should be a string!") | ||
| if type(WordRate) != float: | ||
| raise TypeError("WordRate should be a float!") | ||
| if WordRate: |
There was a problem hiding this comment.
if you don't have WordRate, line 64 is already going to fail. In any case, WordRate isn't optional, so there will always be WordRate...
vendor_data.py
Outdated
| raise TypeError("WordRate should be a float!") | ||
| if WordRate: | ||
| if WordRate > 0.15: | ||
| raise ValueError("This vendor is too expensive, consider another one.") |
There was a problem hiding this comment.
Should this be an error or a warning?
An error stops execution and the vendor cannot be added. A warning lets you add the vendor but just warns the user that something may be wrong. I ask this because the phrasing of the message "consider another one" is more warning-like than error-like.
| "WordRate" (float): Vendor's word rate for new words in Euro | ||
| "CatTool" (str): Vendor's preferred CAT-tool | ||
| """ | ||
| self.VendorName = VendorName |
There was a problem hiding this comment.
Assigning the arguments to the attributes should be done AFTER validation.
There was a problem hiding this comment.
(Or during, which is what you are doing at the validation.)
vendor_data.py
Outdated
| if type(VendorMail) != str: | ||
| raise TypeError ("VendorMail should be a string!") | ||
| if VendorMail: | ||
| if(re.search(regex_mail,VendorMail)): |
There was a problem hiding this comment.
(Just: if type(VendorMail) should be inside the if VendorMail part, because it fails if there is no VendorMail)
There was a problem hiding this comment.
Ok, now that I see a bit below, lines 78-81 can be replaced with self.set_mail(VendorMail), and line 45 can be deleted.
vendor_data.py
Outdated
| if CatTool in VendorData.CatTools: | ||
| self.CatTool = CatTool | ||
| else: | ||
| raise ValueError("This CAT tool is not valid. Run 'VendorData.CatTools' to check options.") |
vendor_data.py
Outdated
| df.to_excel(f"{filename}.xlsx", index=False, sheet_name="VendorData") | ||
| else: | ||
| with pd.ExcelWriter(f"{filename}.xlsx", mode="a", engine="openpyxl", if_sheet_exists="overlay") as writer: | ||
| df.to_excel(writer, sheet_name = "VendorData", startrow=writer.sheets["VendorData"].max_row, header = None, index=False) |
There was a problem hiding this comment.
Looks good!
I would split lines 109 and 110 in different lines like you did with the dictionary right above.
And I would approach this mergin in a different, safer way. (To talk in person)
| parser.add_argument("--WordRate", type=float, help="The vendor's word rate in Euro") | ||
| parser.add_argument("--VendorMail", type=str, help="The vendor's email address") | ||
| parser.add_argument("--CatTool", type=str, help="The CAT-tool the vendor wants to use") | ||
| args = parser.parse_args() |
There was a problem hiding this comment.
You can create a flag argument with the action attribute: https://docs.python.org/3/library/argparse.html#action
For example:
parser.add_argument('-a', '--add', action='store_true')Then args.add is True if the user calls -a or --add and False otherwise.
vendor_data.py
Outdated
| raise ValueError("This vendor is too expensive, pick another one.") | ||
| if WordRate == 0.00: | ||
| raise ValueError("Word rate cannot be 0.00.") | ||
| else: |
There was a problem hiding this comment.
This else is aligned with if WordRate, meaning that self.WordRate = WordRate only runs if you don't have WordRate.
You should actually remove the else and leave self.WordRate = WordRate aligned with the ifs in lines 77-81.
That way, the logic goes like so:
- if there is a value for
WordRate:- if it is not a float, throw an error
- (otherwise) if it is larger than 0.15, throw an error
- (otherwise) if it is equal to 0, throw an error
- (otherwise) assign its value to
self.WordRate
Because theifstatements throw an error, you don't needeliforelse, you can just assume that if there was no error it will move forward.
README.md
Outdated
| The repository contains: | ||
| - The `README.md` file which describes the repository and illustrates how to use the code as a module and by running the script; | ||
| - The `tutorial.md` and `tutorial.ipynb` files which show how to actually use the code and the different functionalities; | ||
| - Two `xlsx` files which were generated while creating the `tutorial.ipynb` file. Be aware that if you run the code in the tutorial notebook, some data will be duplicated in the excel if you do not change anything. To avoid this you can delete the two excel files from the directory or you can change the data in the tutorial code to fit your needs.; |
There was a problem hiding this comment.
You could also add a line of code yourself to make sure the file is deleted when it should be:
import os
os.remove('path/to/the/file.xlsx')
README.md
Outdated
|
|
||
| ```python | ||
| %run vendor_data.py <example.json> | ||
| %run vendor_data.py <"-a/--add or -m/--modify"> |
There was a problem hiding this comment.
In both cases, because you have two mutually-exclusive arguments, I would make to chunks to show. That way, a user can copy your code. So: one line showing how to call your script to add an entry, and another line showing how to call your script to modify an entry.
| In both cases `example.json` stands for the `filename` argument that the script needs. You can use [the file in this repository](example.json) or a similar file of yours. Find more information on how this script works with: | ||
| Running vendor_data.py without either the argument -a (to add a new vendor) or -m (to modify an existing vendor) will not do anything. The argument -a prompts the user to add a new vendor. The data is provided by answering some questions regarding the vendor and the project. The argument -m prompts the user to modify an existing vendor. Again the data is provided by answering some questions regarding the vendor and the project. | ||
|
|
||
| For more information you can run the command below, or check in the notebook tuturial.ipynb in this repository under "Running `vendor_data.py`". |
| 1. Easily create an excel file with the project name and the source language as filename; | ||
| 2. Write the following data to the excel file: the target language and the translator's name, e-mail, word rate, preferred CAT tool and the status of the translator; | ||
| 3. Modify the following data for the vendors already in the excel file: e-mail, word rate, preferred CAT tool and status. | ||
| The main advantage of this script is that it limits the posibilities of CAT tools and statuses and that it validates e-mail and word rate (betw. 0.01 and 0.15). This prevents the excel file from becoming messy when different people are working on the project. |
| 3. Modify the following data for the vendors already in the excel file: e-mail, word rate, preferred CAT tool and status. | ||
| The main advantage of this script is that it limits the posibilities of CAT tools and statuses and that it validates e-mail and word rate (betw. 0.01 and 0.15). This prevents the excel file from becoming messy when different people are working on the project. | ||
|
|
||
| ## What this script CANNOT do |
vendor_data.py
Outdated
| import os #os package to check if files exists | ||
| import pyinputplus as pyip #pyinputplus package to facilitate providing arguments | ||
|
|
||
| #4 methods to make validation easier |
There was a problem hiding this comment.
outside of a class: function
inside of a class: method
| raise TypeError("WordRate should be a float!") | ||
| if WordRate > 0.15: | ||
| raise ValueError("This vendor is too expensive, pick another one.") | ||
| if WordRate == 0.00: |
vendor_data.py
Outdated
| if type(CatTool) != str: | ||
| raise TypeError("CatTool should be a string!") | ||
| if not CatTool in ["XTM", "Trados Studio", "MemoQ", "Memsource"]: | ||
| raise ValueError("This CAT tool is not valid. Run 'VendorData.CatTools' to check options.") |
vendor_data.py
Outdated
| """ | ||
| if type(CatTool) != str: | ||
| raise TypeError("CatTool should be a string!") | ||
| if not CatTool in ["XTM", "Trados Studio", "MemoQ", "Memsource"]: |
There was a problem hiding this comment.
To avoid repetition (if you wanted to add/remove a CatTool, now you have to do it in two places) either provide the list of CatTools as an argument of this function or call it as VendorData.CatTools.
| self.WordRate = WordRate | ||
| self.VendorMail = VendorMail | ||
| self.CatTool = CatTool | ||
| self.Preferred = Preferred |
There was a problem hiding this comment.
The whole __init__() looks much better and cleaner now :)
vendor_data.py
Outdated
| raise ValueError("Invalid index, run 'self.ReadExcel()' to see options") | ||
| if Key == "E-mail": #validate e-mail address if a value for the key E-mail is modified | ||
| CheckVendorMail(NewValue) | ||
| if Key == "Word Rate": #validate word rate if a value for the key Word Rate is modified |
There was a problem hiding this comment.
For all the Key ==... checks, I would chain them with elif. Otherwise, it will check if Key == "Word Rate" even if it already say that it is "E-mail"
| Vndr=VendorData(ProjectName, SourceLang, TargLang, VendorName, VendorMail, WordRate, CatTool, Preferred) | ||
| Excel = pyip.inputStr("Do you want to add this vendor to the excel file? ") #user gets the chance to write data to excel | ||
| if Excel.lower() == "yes": | ||
| Vndr.ToExcel() #data written to excel |
There was a problem hiding this comment.
I would add something like:
print("Vendor was added.")
else:
print("Action cancelled.")So the user has confirmation of what happened
vendor_data.py
Outdated
| else: | ||
| raise ValueError(f"There is no file for {self.ProjectName} in {self.SourceLang}") | ||
|
|
||
| def ModExcel(self, Key, Index, NewValue): |
There was a problem hiding this comment.
Good job with this method.
BUT. This means that if a user has changed two or three values in an entry, you will open and close the Excel file two or three times, one for each value that you want to change.
If you think that users will not change more than one thing at a time, it's ok, but it might make more sense to collect all the new values and update them in one opening-closing of the file.
tutorial.md
Outdated
|
|
||
| ```python | ||
| import sys | ||
| sys.path.append ('vendor_data.py') #this is the relative path to the script. If the script is not directly next to the notebook, it could be better to copy the full path of the notebook. |
There was a problem hiding this comment.
no, you don't add the path TO the script, but to the folder where the script lives
There was a problem hiding this comment.
(It seems to work anyways, but normally you append the directory)
tutorial.md
Outdated
|
|
||
| ### Docstrings | ||
| The script is completely documented with docstrings in order to ensure that the script is correclty used. | ||
| You can read these docstrings by calling one of the following functions: |
There was a problem hiding this comment.
Do not leave this output, it is too much automatic text for a tutorial.
You can show the code without running it by adding it inside the Markdown cell, with ```python right before and ``` afterwards.
For the tutorial, think of what you would like to find as a user that doesn't know about the module and is just starting. If anything, this kind of "here you have more information" could go at the end, but starting a tutorial with this can be a bit offputting for a user.
tutorial.md
Outdated
|
|
||
|
|
||
|
|
||
| Some arguments had default arguments. When we check the e-mail for VendorNL, for which we didn't provide one, we'll see that it does not return anything: |
There was a problem hiding this comment.
Some arguments have default values.
I wouldn't say an empty string is a default value from the user's perspective. It's rather allowed to be empty.
|
|
||
|
|
||
|
|
||
| The value for the argument Preferred can be True, False or None. This has an influence of the vendor's status. |
tutorial.md
Outdated
|
|
||
|
|
||
|
|
||
| ### Functions |
There was a problem hiding this comment.
Functions inside a class are called "methods".
tutorial.md
Outdated
|
|
||
|
|
||
|
|
||
| We can now write the data for the vendors to an excel file. Since every vendor has the value for the argument `ProjectName` and `SourceLang`, they will be written to the same excel file. We should execute this function for every vendor. Be careful because if you run this function multiple times for the same vendor, it will not overwrite but append and you will have duplicates in the excel file. |
There was a problem hiding this comment.
Since every vendor has the same value...
There was a problem hiding this comment.
The fact that you can generate duplicates is a weakness in the code.
You're doing a great job, so if you feel that you cannot fix this yet (you could, by checking for the existence of an entry before adding it), at least add it to the list of things that could/should be improved in the future.
You could also make this very explicit in the markdown:
<div class="alert alert-danger">
If you run `.toExcel()` multiple times on the same vendor, it will generate multiple entries!
</div>|
|
||
|
|
||
| ```python | ||
| vd.VendorData.Keys |
|
|
||
|
|
||
| ```python | ||
| VendorES.CatTool |
There was a problem hiding this comment.
Mmm I find this counterintuitive. I would have the code update the class itself as well.

Hi Mariana,
I started on the assignment. The idea is to write a script that helps Translation Project Managers add vendors to a local database and keep a clear overview of preferred/back-up/potential classes. I have created a class and two subclasses
Best regards,
Elmo