- for i, (start, end, text) in enumerate(re.findall(r'<event [^>]*?start="([^"]+)" [^>]*?end="([^"]+)" [^>]*?text="([^"]+)"[^>]*?>', subtitles), 1):
-     start = start.replace('.', ',')
-     end = end.replace('.', ',')
-     text = clean_html(text)
-     text = text.replace('\\N', '\n')
-     if not text:
-         continue
+
+ for i, event in enumerate(sub_root.findall('./events/event'), 1):
+     start = event.attrib['start'].replace('.', ',')
+     end = event.attrib['end'].replace('.', ',')
+     text = event.attrib['text'].replace('\\N', '\n')