Ticket #213 (new defect)

Opened 2 years ago

Last modified 2 years ago

Whitespace within HTML breaks list cleanup

Reported by: weswinham Owned by:
Priority: major Milestone:
Component: editor Version: trunk
Keywords: Cc:

Description

I had a tough time tracking this one down, but the symptom is that I have several HTML documents whose list structure isn't being auto-corrected as I expected.

With invalid list html like so:

<ul><li>a</li><ul><li>a.1</li></ul><li>b<br></li></ul>

Wymeditor.XhtmlSaxListener.fixNestingBeforeOpeningBlocTag fixes this exact problem. The issue is that if you have line breaks in your HTML, like so:

<ul>
<li>a</li>
<ul><li>a.1</li></ul><li>b<br></li></ul>

the HTML isn't cleaned up as expected. This means that any document loaded with invalid HTML that contains normal indentation and line breaks is not fixed by the parser, which leads to all sorts of odd list behavior down the road while manipulating the document.
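To make the failure mode concrete, here is a minimal sketch of how a whitespace-sensitive check can miss the broken nesting. The patterns `strictNested` and `tolerantNested` are hypothetical illustrations, not WYMeditor's actual code; they only assume that the cleanup looks for a list tag opening right after a closing `</li>`.

```javascript
// Hypothetical patterns for illustration only, not WYMeditor's real implementation.
// Strict: matches only when the nested list tag directly follows </li>.
const strictNested = /<\/li><(ul|ol)>/;
// Tolerant: allows any whitespace (including \r\n) between the two tags.
const tolerantNested = /<\/li>\s*<(ul|ol)>/;

// The two inputs from this ticket: one compact, one with line breaks.
const compact = "<ul><li>a</li><ul><li>a.1</li></ul><li>b<br></li></ul>";
const withBreaks =
  "<ul>\r\n<li>a</li>\r\n<ul><li>a.1</li></ul><li>b<br></li></ul>";

console.log(strictNested.test(compact));      // true
console.log(strictNested.test(withBreaks));   // false
console.log(tolerantNested.test(withBreaks)); // true
```

The strict pattern sees the invalid nesting only in the compact markup, which matches the symptom described above.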

This is further complicated by the fact that Internet Explorer, by default, produces the above broken list structure when creating new lists and sublists. If a document is edited back and forth in different browsers, the lists end up "broken" not just in IE's messed-up structure but also in other browsers, because the parser doesn't fix the problem due to the \r\n characters that Internet Explorer embeds.

Our particular use case calls for a *lot* of list editing, and we have intermittent serious problems with lists that won't indent/dedent, won't separate, won't join back together, etc. It seems that this might be the root cause.

The solution seems to be modifying the parser to perform its cleanup duties regardless of any meaningless whitespace that might pop up between tags.
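One possible shape for such a fix, as a rough sketch rather than the actual patch: strip whitespace-only runs that sit between two tags when one of them is a list tag, before the nesting cleanup runs. The function name `stripListWhitespace` is hypothetical, and the sketch assumes the markup is available as a plain string.

```javascript
// Hypothetical helper, not part of WYMeditor's API.
// Removes whitespace-only runs between two tags when at least one of them
// is a list tag (ul, ol, li). Whitespace between other tags is left alone,
// since it can be significant for inline content.
function stripListWhitespace(html) {
  return html
    // Whitespace between any tag and a following list tag.
    .replace(/>\s+(<\/?(?:ul|ol|li)\b)/gi, ">$1")
    // Whitespace between a list tag and any following tag.
    .replace(/(<\/?(?:ul|ol|li)\b[^>]*>)\s+</gi, "$1<");
}

const broken =
  "<ul>\r\n<li>a</li>\r\n<ul><li>a.1</li></ul><li>b<br></li></ul>";
console.log(stripListWhitespace(broken));
// <ul><li>a</li><ul><li>a.1</li></ul><li>b<br></li></ul>
```

After this normalization the markup with line breaks is identical to the compact form, so a cleanup step that handles the compact form would see the same input in both cases. Limiting the stripping to list tags is a deliberate choice here, so that spaces between inline elements such as `<b>a</b> <i>b</i>` survive.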

Change History

comment:1 Changed 2 years ago by weswinham

  • Summary changed from Whitespace within HTML breaks parser cleanup to Whitespace within HTML breaks list cleanup

Looks like I may have overestimated the impact here. I know that lists are affected, so until I have a test case showing any other problems, I'm changing the name of this ticket.

comment:2 Changed 2 years ago by weswinham

I feel like there might be a better overall fix, as it seems likely that whitespace regex problems are present in other parsing areas, but I've fixed the list-specific issue in my branch: http://github.com/winhamwr/wymeditor/commit/af931113ceb98a004b323ca92dfa872ac5d856e5
