A question about data format #6

Open
seongminp opened this issue Sep 24, 2021 · 1 comment

Comments

@seongminp

Hello. First of all, thank you for this wonderful library.

I have a question about the transcription format.

For each .txt file in data/ami-transcripts, does each line denote a single speaker?

For example, would each line of EN2001a belong to a separate speaker, of which there are 5?

Thank you!

@saprativa

Yes, you are correct. However, this format doesn't preserve the speaker turns from the actual meetings. I also found some discrepancies between the transcript files and the speaker transcript files for ES2002c, but I need to check and verify that again.
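
For anyone reading the files programmatically, here is a minimal sketch of the one-speaker-per-line layout described in this thread. The helper names `speaker_lines` and `load_speaker_lines` are hypothetical and not part of the library, and the path in the usage note is only an assumption based on the folder and meeting ID mentioned in the question. Since turn order is not preserved, each returned string is simply everything attributed to one speaker.

```python
from pathlib import Path


def speaker_lines(text: str) -> list[str]:
    """Split transcript text into one string per speaker.

    Assumes one speaker per non-empty line, as confirmed in this thread.
    Blank lines are skipped; the order of turns within the meeting is
    not recoverable from this format.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_speaker_lines(path: str) -> list[str]:
    """Read a transcript file (hypothetical path) and split it by speaker."""
    return speaker_lines(Path(path).read_text(encoding="utf-8"))
```

For example, `load_speaker_lines("data/ami-transcripts/EN2001a.txt")` (assumed file name) should return one entry per speaker if the file follows this layout, so checking `len(...)` against the expected speaker count is a quick sanity test for the kind of discrepancy mentioned above.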
