diff --git a/src/P808Template/P808_conference.html b/src/P808Template/P808_conference.html new file mode 100644 index 0000000..b5c3aad --- /dev/null +++ b/src/P808Template/P808_conference.html @@ -0,0 +1,3531 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+
+

HIT Type: 804

New: we publish different HIT types which may look similar but have different instructions and aims. This number is an identifier for you to recognize them.

+ + Introduction + + scale + +

Welcome to this data labeling task! You are about to participate in our speech quality assessment task, in which you are responsible for providing reliable labels that pass our quality control system. This HIT has two + + two (just every now and then) sections:

+
    +
  • Qualification (just once): Check if you are eligible to perform these HITs
  • +
  • Setup (every 0 minutes): Configure your system and validate it + by answering 6 questions
  • +
  • Detailed instructions (only once): 12 audio samples and a description of the scales. It is crucial to understand what you should do.
  • +
  • Training (every 0 hour): 0 questions, same as "Ratings" but appears again after 0 hour from the last HIT you perform in this group. +
  • Rating: Listen to 0 audio files and give your opinion + about the quality of the speech you hear.
  • +
+

You must follow the rules below; otherwise your answers will be invalid.

+

Rules:

+
    +
  • You must use a headset, not the loudspeaker: otherwise your response will be rejected +
  • +
  • You must perform the task in a quiet environment
  • +
  • Do not change the volume after modifying it in the Setup section.
  • +
+ +

Payment:

+

The results of this experiment are very important for us and for other scientists working in this + area. We have methods that analyse the consistency of your answers. We will use these methods to + rank the submitted assignments according to quality.

+

For this experiment, we will pay a base reward of ${{cfg.hit_base_payment}}/HIT for every + accepted HIT. We have made available a set of 0 different HITs. You + will receive a bonus of:

+ +
    +
  • ${{cfg.quantity_bonus}}/HIT (for a total of ${{cfg.sum_quantity}}/HIT) if you submit + more than {{cfg.quantity_hits_more_than}} HITs or
  • +
  • ${{cfg.quality_bonus}}/HIT (for a total of ${{cfg.sum_quality }}/HIT) if you submit + more than {{cfg.quantity_hits_more_than}} HITs and are in the top + {{cfg.quality_top_percentage}}% quality group.
  • +
+ +

Bonuses will be assigned within 7 days.
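The payment rule above can be sketched as a small calculation. This is only an illustration: the field names mirror the template's `cfg.*` placeholders, amounts are held in cents, the example values are hypothetical, and the `stack_bonuses` flag is an assumption, because the template does not say whether the quality bonus is paid on top of the quantity bonus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaymentConfig:
    # Amounts in US cents; names mirror the template's cfg.* placeholders.
    hit_base_payment: int
    quantity_bonus: int
    quality_bonus: int
    quantity_hits_more_than: int


def reward_per_hit(cfg: PaymentConfig, accepted_hits: int,
                   in_top_quality_group: bool,
                   stack_bonuses: bool = True) -> int:
    """Return the per-HIT reward in cents for one worker."""
    reward = cfg.hit_base_payment
    # Both bonuses require strictly more HITs than the threshold.
    if accepted_hits <= cfg.quantity_hits_more_than:
        return reward
    if in_top_quality_group:
        # ASSUMPTION: the quality bonus is added on top of the quantity bonus.
        if stack_bonuses:
            reward += cfg.quantity_bonus
        reward += cfg.quality_bonus
    else:
        reward += cfg.quantity_bonus
    return reward


# Hypothetical values, for illustration only.
example_cfg = PaymentConfig(hit_base_payment=50, quantity_bonus=10,
                            quality_bonus=15, quantity_hits_more_than=30)
```

With these hypothetical numbers, a worker with 31 accepted HITs outside the top quality group earns the base reward plus the quantity bonus per HIT, while a worker with exactly 30 HITs earns the base reward only.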

+ Please perform up to 0 HITs from this group. If you do more than + that, the rest will be rejected. + +

Attention:

+

This HIT includes one or more control clips (gold clips). Control clips are clips for which we know the answer and which should be very easy to rate (they are clearly very good or very poor). They may target one or more scales. We include control clips in the HIT to ensure raters are paying attention and their environment hasn't changed. + A wrong answer to a control clip will result in rejection of the HIT.
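A minimal sketch of how such a control-clip check could work, assuming one expected answer per targeted scale. The column names follow `required_columns_gold_804` in `create_input.py`; the function name, the `ratings` layout, and the `tolerance` of one scale point are hypothetical and not taken from the toolkit's result parser.

```python
from typing import Mapping, Optional

# Scale keys derived from the gold columns required for p804 inputs
# (gold_sig_ans, gold_noise_ans, gold_ovrl_ans, ...).
GOLD_SCALES = ("sig", "noise", "ovrl", "disc", "col", "loud", "reverb")


def passes_gold_clip(ratings: Mapping[str, int],
                     gold_row: Mapping[str, Optional[int]],
                     tolerance: int = 1) -> bool:
    """Return True if every targeted scale was rated close to the known answer.

    ASSUMPTION: a scale is 'targeted' when its gold answer is not None,
    and a rating within `tolerance` points counts as correct.
    """
    for scale in GOLD_SCALES:
        expected = gold_row.get(f"gold_{scale}_ans")
        if expected is None:
            continue  # this gold clip does not target the scale
        given = ratings.get(scale)
        if given is None or abs(given - expected) > tolerance:
            return False
    return True
```

Skipping untargeted scales reflects the statement that a control clip may target only some of the scales.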

+ +

+
+
+ +
+
+ + + +
+ +
+
+ +
+
+ +
+
+ +
+
+ +
+
+ +
+
+
+
+ +
+
+ +
+
+ +
+
+ + +
+ +
+ + + + + + + + + + + + + + + + +
in-ear headphones Closed-back headphones loudspeaker built-in speakers
+
+
+
+
+
+
+
+
+
+ +
+ + + +
+
+ +
+
+ +
+
+ +
+
+ +
+
+ +
+
+ +
+
+ +
+
+ + +
attention Please wear your headphones now.
+ +
+ +

  +

+ + + +
+ +
+ + +
+
+
+   +
+ +
+ +
+
+   +
+ +
+
+
+   +
+ +
+
+
+   +
+ +
+
+
+   +
+ +
+
+ +
+ +
+ + +
+
+
+   +
+
+
+ +
+
+ +
+
+
+ +
+
+   +
+
+
+ +
+
+ +
+
+
+
+
+   +
+
+
+ +
+
+ +
+
+
+
+
+   +
+
+
+ +
+
+ +
+
+
+
+
+   +
+
+
+ +
+
+ +
+
+
+
+ +
+ +
+
+
+ +
+
+ + + +
+
+ +
+
+ +
attention Please wear your headphones now. You must wear both earbuds. +
+ +
+ +

  +

+ + + + +
+ + +
+ +
 
+ + +
Result:  
+ +
+ + + + +
+ +
+
+
 
+
+
+
+
 
+
+
+ +
+
+
+
+
+
+
+
+
+ +
+ +
+ +
+
+
 
+
+
+
+
 
+
+
+ +
+
+
+
+
+
+
+
+
+ +
+
+ +
+
+
 
+
+
+
+
 
+
+
+ +
+
+
+
+
+
+
+
+
+ +
+ +
+ + +
+
+
 
+
+
+
+
 
+
+
+ +
+
+
+
+
+
+
+
+
+ +
+ +
+
+
+ +
+ + + +
+
+ +
+
+ Introduction +

We need to check your listening device. It is an automated test that uses your web browser's functionality. + Please click on Start. + After that, you might see a popup message from your browser asking for permission to use your + microphone. + Please click on Allow (on some smartphones there will be no message). We do not process + or record any audio with your microphone. +

+ +
+
+
+
+ +
+
+ + + +
+
+ +
+
+

NOTE: These instructions are new. Please read and follow them carefully.

+ +

In this experiment, you will be rating audio quality of conversations where people in a conference room are talking among themselves and with remote attendees (see the figure below). + Overlapping speech may occur — this is expected and does not necessarily indicate poor audio quality. In some cases, certain voices may be degraded, missing, or partially lost due to issues such as poor microphone quality, low volume or reverberation, making the audio difficult to understand. + Please focus on the clarity and intelligibility of the voices from the perspective of a remote listener. Your ratings should reflect the quality of the audio signal and how well individual voices are captured. + +

+
+
+ Conference room scenario +

Figure: Conference room setup with multiple participants

+
+
+
+ +

Each trial will include 1 audio sample, which should be rated on 7 scales. First, you must listen to the clip + until the end, then start rating. In the meantime, the audio sample will be played in a loop until you finish rating all 7 scales (you can + only vote when the audio is playing). For each scale, you MUST focus ONLY on the specific aspect of the audio sample that the scale asks about. + Besides the usual scale used for Signal and Overall Quality ratings, we also use descriptive scales for different quality characteristics. Here is an example for the Noisiness characteristic:

+ + scale + + Besides the main question about the characteristic, the scale has the following features: +
    +
  • 1. Each scale has 5 levels.
  • +
  • 2. The poles of each scale are labeled with antonym attributes, here: Not noisy – Noisy. To cast your vote, select a place on the scale. The closer your selection is to a pole, the more of that attribute you recognized.
  • +
  • 3. Below, more synonyms for each pole are given, to clarify them.
  • + +
+ +

Below are some (extreme) samples, a short description for each of them, and a picture from the continuous scales:

+ +

Scales and examples:

+ +

1. Noisiness:

+

"Noisiness" refers to how noisy a speech sample sounds. Noise can be present as background noise, circuit noise, or coding noise. It may be described with terms like "humming", "hissing" or "buzzy".

+
+
+
+

  +

+

This sample is impaired with strong background noise.

+ +
+
+
+ + +

2. Loudness:

+

"Loudness" describes how close to optimal the loudness level of the speech sample is. Examples of sub-optimal loudness: the sample is too quiet or unpleasantly loud.

+
+
+
+

  + +

+

This sample is too quiet. It has sub-optimal loudness.

+ +
+
+

  +

+

This sample is too loud. It has sub-optimal loudness.

+
+
+ + +
+
+
+
+ +
+
+ + + +

3. Discontinuity:

+

A sample may sound "Choppy", "Shaky", "Ragged" or include "Pops and Clicks". These are referred to as interruptions in the speech signal, or discontinuity, in contrast to speech that is steady and smooth.

+
+
+
+

  +

+

Discontinuity is strong and very annoying.

+ +
+
+
+ + +

4. Coloration:

+ +

Coloration can be understood as any changes to the sound that lead to the speech sample sounding less "natural" or "normal". For example, the speech sample may sound "muffled", "distant", "thin" or like someone talking while plugging their nose. + As it can have many forms, below are some samples concatenated in a single clip separated by a "beep" sound. All of them represent intense coloration.

+

+
+
+
+

  +

+ + + + +

Another sample with high Coloration combined with other distortions like Noisiness.

+

  +

+
+
+
+ + +

5. Speech reverberation:

+

Sometimes speech sounds reflected, similar to someone talking inside a tunnel/cave or from another corner of a large room.

+
+
+
+

  +

+

This sample has moderate to high reverberation.

+ +
+
+
+ + +

6. Entire Signal Quality:

+

In most cases, several of the above impairments occur simultaneously. Entire signal quality refers to all impairments that affect the speech signal only and excludes background noise.

+
+
+
+

  + +

+

The speech signal is slightly distorted.

+ +
+
+

  +

+

The speech signal is very distorted. There is also some background noise.

+
+
+
+ + +

7. Overall Quality:

+

This refers to how suitable the sample you heard is for the purpose of everyday speech communication, considering all impairments.

+ + + + + +
+
+
+ +
+ + + +
+
+ +
+
+ +

The training section is identical to the Rating section; for some of the samples you might get feedback:

+ +
Please adjust the volume level of your headset to a comfortable level so that you hear the following audio sample very well. It is a very important step for this task and will directly influence your judgment. +

  +

+
+ +

In this experiment you will be rating the speech quality of sound samples involving different speech impairments and also background + noise. Each trial will include 1 audio sample, which should be rated on 7 scales. You should let the audio sample play in a loop until you have finished all ratings.

+ +

Note: The order of questions may change from one HIT to the other.

+ +

Please provide your rating for the following 0 trials. Check out the "Detailed Instructions" for a description of the rating scales. + Note that the scale will be activated once the speech sample has been played + until the end. In case you hear an interruption message, please follow the instruction given in the + message.

+ + + + +
+ + + + + + + +
+
Click Next Trial to answer all 0 trials.
+
+
+
+ +
+ + + + + + + + + + + + + + + + + +
+
+
+

Ratings

+
+
+ +

NOTE: New Instructions (April 2023). This HIT is newly designed with updated instructions. Please read and follow them carefully.

+
+ +
Please adjust the volume level of your headset to a comfortable level so that you hear the following audio sample very well. It is a very important step for this task and will directly influence your judgment. +

  +

+
+ + +

In this experiment you will be rating the speech quality of sound samples involving different speech impairments and also background + noise. Each trial will include 1 audio sample, which should be rated on 7 scales. You should let the audio sample play in a loop until you have finished all ratings.

+ +

Note: The order of questions may change from one HIT to the other.

+ +

Please provide your rating for the following 0 trials. Check out the "Detailed Instructions" for a description of the rating scales. + Note that the scale will be activated once the speech sample has been played + until the end. In case you hear an interruption message, please follow the instruction given in the + message.

+ + + + + + + + + + + + +
Click Next trial to answer all 0 trials, then submit your response.
+
+
+
+ + +
+

Thanks for your participation. Please perform more HITs from this group when they are available for you.

+

Note: The submit button works only if you answer all questions. Make sure to answer all 7 questions in each + trial. Click here to see which + questions in the "Rating" section are not answered.
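The submit rule above (every trial needs an answer on all 7 scales) can be sketched as follows. This is a Python illustration of the rule only; the template itself enforces it in the browser, and the scale names, the `responses` layout, and both function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical keys for the 7 scales described in the instructions.
SCALES = ("noisiness", "loudness", "discontinuity", "coloration",
          "reverberation", "signal", "overall")


def unanswered_questions(
    responses: Dict[Tuple[int, str], Optional[int]], n_trials: int
) -> List[Tuple[int, str]]:
    """List (trial, scale) pairs with no rating; trials are numbered from 1."""
    return [
        (trial, scale)
        for trial in range(1, n_trials + 1)
        for scale in SCALES
        if responses.get((trial, scale)) is None
    ]


def can_submit(responses: Dict[Tuple[int, str], Optional[int]],
               n_trials: int) -> bool:
    """The submit button should be enabled only when nothing is missing."""
    return not unanswered_questions(responses, n_trials)
```

Returning the list of missing pairs, rather than only a boolean, matches the "Click here to see which questions are not answered" link in the note.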

+
+ + + + + + + \ No newline at end of file diff --git a/src/create_input.py b/src/create_input.py index ac9ee88..bb2f927 100644 --- a/src/create_input.py +++ b/src/create_input.py @@ -20,6 +20,7 @@ import itertools p835_personalized = 'pp835' +p804_conference = 'p804_conference' def validate_inputs(cfg, df, method): @@ -39,7 +40,7 @@ def validate_inputs(cfg, df, method): required_columns_gold_804 = ['gold_url', 'gold_sig_ans', 'gold_noise_ans', 'gold_ovrl_ans', 'gold_disc_ans', 'gold_col_ans', 'gold_loud_ans', 'gold_reverb_ans'] - if method in ['acr', 'p835', 'echo_impairment_test', 'p804']: + if method in ['acr', 'p835', 'echo_impairment_test', 'p804', p804_conference]: req = required_columns_acr elif method in [p835_personalized]: req = required_columns_acr @@ -60,7 +61,7 @@ def validate_inputs(cfg, df, method): #assert 'gold_enrolment_clips' in columns, f"No column found with 'gold_enrolment_clips' in input file" for column in required_columns_gold_personalized: assert column in columns, f"No column found with '{column}' in input file" - if method in ['p804']: + if method in ['p804', p804_conference]: for column in required_columns_gold_804: assert column in columns, f"No column found with '{column}' in input file" @@ -377,7 +378,7 @@ def create_input_for_acr(cfg, df, output_path, method): output_df[f'gold_bak_ans{suffix}'] = full_ans_bak output_df[f'gold_ovrl_ans{suffix}'] = full_ans_ovrl - elif method == 'p804': + elif method == 'p804' or method == p804_conference: #tmp = df.sample(frac=1).reset_index(drop=True) for j in range(0, number_of_gold_clips_per_session): @@ -522,7 +523,7 @@ def create_input_for_mturk(cfg, df, method, output_path): :param df: row input, see validate_inputs for details :param output_path: path to output file """ - if method in ['acr', 'p835', 'echo_impairment_test', p835_personalized, "p804"]: + if method in ['acr', 'p835', 'echo_impairment_test', p835_personalized, "p804", p804_conference]: return create_input_for_acr(cfg, df, 
output_path, method) else: return create_input_for_dcrccr(cfg, df, output_path) @@ -530,7 +531,7 @@ def create_input_for_mturk(cfg, df, method, output_path): if __name__ == '__main__': parser = argparse.ArgumentParser( - description=f'Create input.csv for ACR, DCR, CCR, P835, {p835_personalized},p804 echo_impairment_test test. ') + description=f'Create input.csv for ACR, DCR, CCR, P835, {p835_personalized},p804, {p804_conference} echo_impairment_test test. ') # Configuration: read it from trapping clips.cfg parser.add_argument("--row_input", required=True, help="All urls depending to the test method, for ACR: 'rating_clips', 'math', 'pair_a', " @@ -540,7 +541,7 @@ def create_input_for_mturk(cfg, df, method, output_path): help="explains the test") parser.add_argument("--method", default="acr", required=True, - help=f"one of the test methods: acr, dcr, ccr, p835, {p835_personalized},p804, echo_impairment_test") + help=f"one of the test methods: acr, dcr, ccr, p835, {p835_personalized},p804, {p804_conference}, echo_impairment_test") args = parser.parse_args() #row_input = join(dirname(__file__), args.row_input) @@ -552,9 +553,9 @@ def create_input_for_mturk(cfg, df, method, output_path): assert os.path.exists(cfg_path), f"No file in {cfg_path}]" methods = ["acr", "dcr", "ccr", "p835", - p835_personalized, "echo_impairment_test", "p804"] + p835_personalized, "echo_impairment_test", "p804", p804_conference] exp_method = args.method.lower() - assert exp_method in methods, f"{exp_method} is not a supported method, select from: acr, dcr, ccr, p835, {p835_personalized}, echo_impairment_test." + assert exp_method in methods, f"{exp_method} is not a supported method, select from: acr, dcr, ccr, p835, {p835_personalized}, echo_impairment_test, p804, {p804_conference}." 
cfg = CP.ConfigParser() cfg._interpolation = CP.ExtendedInterpolation() diff --git a/src/master_script.py b/src/master_script.py index 4478aaf..3af4435 100644 --- a/src/master_script.py +++ b/src/master_script.py @@ -26,6 +26,7 @@ #p835_personalized = "p835_personalized" p835_personalized = "pp835" +p804_conference = "p804_conference" """ def create_analyzer_cfg_acr(cfg, template_path, out_path): @@ -436,7 +437,7 @@ async def create_hit_app_pp835_p804( df_train = pd.read_csv(args.training_gold_clips) gold_in_train = [] cols = ['sig_ans','bak_ans','ovrl_ans'] - if test_method == 'p804': + if test_method == 'p804' or test_method == p804_conference: cols = ['sig_ans','noise_ans','ovrl_ans', 'disc_ans', 'col_ans', 'loud_ans', 'reverb_ans' ] for _, row in df_train.iterrows(): @@ -578,13 +579,13 @@ async def prepare_csv_for_create_input(cfg, test_method, clips, gold, trapping, df_clips = pd.DataFrame({'rating_clips': rating_clips}) sec_gold_question = False - if test_method in ["acr", "p835", "echo_impairment_test", p835_personalized,'p804']: + if test_method in ["acr", "p835", "echo_impairment_test", p835_personalized,'p804', p804_conference]: # prepare the golden clips if gold and os.path.exists(gold): df_gold = pd.read_csv(gold) # TODO change it with p835_personalized - if test_method in [p835_personalized, 'p804']: - df_gold = update_gold_clips_for_p804(df_gold) if test_method == 'p804' else update_gold_clips_for_personalized(df_gold) + if test_method in [p835_personalized, 'p804', p804_conference]: + df_gold = update_gold_clips_for_p804(df_gold) if test_method in ['p804', p804_conference] else update_gold_clips_for_personalized(df_gold) #if 'gold_clips2' in args and args.gold_clips2 and os.path.exists(args.gold_clips2): # df_gold2 = pd.read_csv(args.gold_clips2) @@ -724,6 +725,14 @@ def get_path(test_method, is_p831_fest): os.path.dirname(__file__), "assets_master_script/p804_result_parser_template.cfg" ) + # for P804_conference + p804_conference_template_path = 
os.path.join( + os.path.dirname(__file__), "P808Template/P808_conference.html" + ) + p804_conference_cfg_template_path = os.path.join( + os.path.dirname(__file__), "assets_master_script/p804_result_parser_template.cfg" + ) + # for echo_impairment_test echo_impairment_test_fest_template_path = os.path.join(os.path.dirname(__file__), 'P808Template/echo_impairment_test_fest_template.html') echo_impairment_test_template_path = os.path.join(os.path.dirname(__file__), 'P808Template/echo_impairment_test_template.html') @@ -749,6 +758,7 @@ def get_path(test_method, is_p831_fest): (p835_personalized, False): (pp835_template_path, pp835_cfg_template_path), ('echo_impairment_test', False): (echo_impairment_test_template_path, acr_cfg_template_path), ("p804", False): (p804_template_path, p804_cfg_template_path), + (p804_conference, False): (p804_conference_template_path, p804_conference_cfg_template_path), } # TODO: check if it works correctly by Personalized P.835 template_path, cfg_path = method_to_template[test_method, is_p831_fest] @@ -858,7 +868,7 @@ async def main(cfg, test_method, args): elif test_method in ['p835', 'echo_impairment_test']: await create_hit_app_p835(cfg_hit_app, template_path, output_html_file, args.training_clips, args.trapping_clips, cfg['create_input'], cfg['TrappingQuestions'], general_cfg) - elif test_method in [p835_personalized, 'p804']: + elif test_method in [p835_personalized, 'p804', p804_conference]: await create_hit_app_pp835_p804( cfg_hit_app, template_path, @@ -878,7 +888,7 @@ async def main(cfg, test_method, args): output_cfg_file_name = f"{args.project}_p831_{test_method}_result_parser.cfg" if is_p831_fest else f"{args.project}_{test_method}_result_parser.cfg" output_cfg_file = os.path.join(output_dir, output_cfg_file_name) - if test_method in ['acr', 'p835', 'echo_impairment_test', p835_personalized, 'p804']: + if test_method in ['acr', 'p835', 'echo_impairment_test', p835_personalized, 'p804', p804_conference]: 
create_analyzer_cfg_general(cfg, cfg_hit_app, cfg_path, output_cfg_file, general_cfg, n_HITs) else: create_analyzer_cfg_dcr_ccr(cfg, cfg_path, output_cfg_file, general_cfg, n_HITs) @@ -890,7 +900,7 @@ async def main(cfg, test_method, args): parser.add_argument("--project", help="Name of the project", required=True) parser.add_argument("--cfg", help="Configuration file, see master.cfg", required=True) parser.add_argument("--method", required=True, - help=f"one of the test methods: 'acr', 'dcr', 'ccr', 'p835','{p835_personalized}', p804, or 'echo_impairment_test'") + help=f"one of the test methods: 'acr', 'dcr', 'ccr', 'p835','{p835_personalized}', 'p804', '{p804_conference}', or 'echo_impairment_test'") parser.add_argument("--p831_fest", action='store_true', help="Use the question set of P.831 for FEST") parser.add_argument("--clips", help="A csv containing urls of all clips to be rated in column 'rating_clips', in " "case of ccr/dcr it should also contain a column for 'references'") @@ -909,11 +919,11 @@ async def main(cfg, test_method, args): # check input arguments args = parser.parse_args() - methods = ["acr", "dcr", "ccr", "p835", "echo_impairment_test", p835_personalized, 'p804'] + methods = ["acr", "dcr", "ccr", "p835", "echo_impairment_test", p835_personalized, 'p804', p804_conference] test_method = args.method.lower() assert ( test_method in methods - ), f"No such a method supported, please select between 'acr', 'dcr', 'ccr', 'p835', '{p835_personalized}', 'echo_impairment_test', 'p804'" + ), f"No such a method supported, please select between 'acr', 'dcr', 'ccr', 'p835', '{p835_personalized}', 'echo_impairment_test', 'p804', '{p804_conference}'" p831_methods = ["acr", "dcr", "echo_impairment_test"] if args.p831_fest: @@ -926,8 +936,8 @@ async def main(cfg, test_method, args): ), f"No training clips file in {args.training_clips}" elif args.training_gold_clips: assert os.path.exists(args.training_gold_clips), f"No csv file containing training_gold_clips in 
{args.training_gold_clips}" - if test_method not in [p835_personalized,"p804"]: - raise ValueError("training_gold clips are only supported for personalized p835 and p804") + if test_method not in [p835_personalized,"p804", p804_conference]: + raise ValueError("training_gold clips are only supported for personalized p835, p804, and p804_conference") else: raise ValueError("No training or training_gold clips provided") @@ -942,7 +952,7 @@ async def main(cfg, test_method, args): else: assert True, "Neither clips file not cloud store provided for rating clips" - if test_method in ["acr", "p835", "echo_impairment_test", p835_personalized, 'p804']: + if test_method in ["acr", "p835", "echo_impairment_test", p835_personalized, 'p804', p804_conference]: if args.gold_clips: assert os.path.exists(args.gold_clips), f"No csv file containing gold clips in {args.gold_clips}" elif cfg.has_option('GoldenSample', 'Path'):