Open source URMP dataset pipeline #458

Open
copybara-service[bot] wants to merge 1 commit into main from test_460754854

Conversation

@copybara-service
Contributor

Open source URMP dataset pipeline

PiperOrigin-RevId: 460754854
@cyrusasfa

Looks good to me.

The only thing I notice is that prepare_urmp_dataset_lib.parse_example expects TFRecords. Will these TFRecords, or the code to produce them, also be made available?

    tf.io.FixedLenSequenceFeature([],
                                  dtype=tf.float32,
                                  allow_missing=True),
'sequence':

How is 'sequence' extracted from the source URMP dataset? What is it?

I figured out that you can just create a NoteSequence from URMP's notes files.
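For readers wondering what that conversion might involve, here is a minimal sketch using only the Python standard library. It assumes each row of a URMP notes file holds three whitespace-separated values: onset time in seconds, pitch frequency in Hz, and duration in seconds. That layout, and the names Note, hz_to_midi and parse_notes_text, are assumptions for illustration and are not taken from this PR. The resulting start time, end time and MIDI pitch are the fields one would then copy into a NoteSequence.

```python
import math
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical container mirroring the fields a NoteSequence note needs."""
    start_time: float  # seconds
    end_time: float    # seconds
    pitch: int         # MIDI note number


def hz_to_midi(frequency_hz):
    """Round a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(frequency_hz / 440.0)))


def parse_notes_text(text):
    """Parse rows of 'onset_sec  frequency_hz  duration_sec' (assumed layout)."""
    notes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        onset, frequency_hz, duration = (float(x) for x in fields[:3])
        notes.append(Note(onset, onset + duration, hz_to_midi(frequency_hz)))
    return notes
```

Calling parse_notes_text on the contents of one notes file returns one Note per annotated row, in file order.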


@cyrusasfa left a comment

One bug to fix.

More generally, it would help to have a script that extracts the necessary features from the raw URMP dataset, formats them correctly, and writes them to TFRecord. I've written one and can open a new PR. Alternatively, the 48 kHz TFRecord, without the metadata and DDSP features, could be uploaded somewhere.
