Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions src/BloomExe/Book/Book.cs
Original file line number Diff line number Diff line change
Expand Up @@ -1273,6 +1273,13 @@ private void FixDuplicateAudioIdsInDataDiv(HtmlDom bookDOM, HashSet<string> idSe
return; // shouldn't happen, but paranoia sometimes pays off, especially in running tests.
var nodes = dataDiv
.SafeSelectNodes("(.//div|.//span)[@id and contains(@class,'audio-sentence')]")
// We expect the customCover to have copies of other stuff in the data div.
.Where(x =>
!(
x is SafeXmlElement e
&& e.ParentWithAttributeValue("data-book", "customCover") != null
)
)
Comment on lines +1276 to +1282

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 CustomCover IDs excluded from idSet may affect downstream duplicate detection

The .Where() filter at src/BloomExe/Book/Book.cs:1277-1282 excludes customCover audio elements from being processed, meaning their IDs are never added to idSet. This idSet is subsequently used in FixDuplicateAudioIdInBookPages (line 1301) to detect duplicates on pages. If a page element has the same audio ID as a customCover data-div element but NOT the same as any non-customCover data-div element, it won't be flagged as a duplicate. This appears to be the intended behavior — the customCover is expected to have copies of IDs from elsewhere, and pages referencing those IDs should not be "fixed". However, it's worth confirming that the customCover page's audio elements are also correctly handled (they would be checked via HtmlDom.IsNodePartOfDataBookOrDataCollection at line 1312 which skips data-book elements).

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

.ToList();
var duplicateAudioIdsFixed = 0;
foreach (var audioElement in nodes)
Expand All @@ -1283,6 +1290,7 @@ private void FixDuplicateAudioIdsInDataDiv(HtmlDom bookDOM, HashSet<string> idSe
{
var newId = FixDuplicateAudioId(audioElement, id);
idSet.Add(newId);
duplicateAudioIdsFixed++;
}
}
// OK, now fix all the places any duplicates were used in the book's pages.
Expand Down
26 changes: 25 additions & 1 deletion src/BloomExe/SafeXml/SafeXmlElement.cs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
using System;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
Expand Down Expand Up @@ -201,6 +201,30 @@ public SafeXmlElement ParentOrSelfWithClass(string targetClass)
return current;
}

public SafeXmlElement ParentWithAttribute(string targetAttribute)
{
var current = ParentElement;
while (current != null && !current.HasAttribute(targetAttribute))
current = current.ParentNode as SafeXmlElement;
return current;
}

/// <summary>
/// Get a parent element if any that has the specified value for the specified attribute
/// Note: currently if attrVal is empty it will match both parents where the attribute
/// is present and empty, and parents that don't have the attribute at all. It is not
/// intended that it should be used this way.
/// </summary>
public SafeXmlElement ParentWithAttributeValue(string targetAttribute, string attrVal)
{
ArgumentException.ThrowIfNullOrEmpty(attrVal);

var current = ParentElement;
while (current != null && current.GetAttribute(targetAttribute) != attrVal)
current = current.ParentNode as SafeXmlElement;
return current;
}

public SafeXmlElement GetChildWithName(string name)
{
return ChildNodes.FirstOrDefault(n => n.Name.ToLowerInvariant() == name)
Expand Down