Is BeautifulSoup entity-encoding your inserted HTML? #33

@sligodave

Hi,
I could be wrong here, but just in case, I thought I'd bring this to your attention.

At the end of the parse_data method of the HTMLParser, you call "replaceWith" on the matched URL.
It appears that with the step from BeautifulSoup 3.2.0 to BeautifulSoup 3.2.1,
the inserted HTML is now being entity-encoded, so it ends up in the output as escaped text rather than markup.

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup("")
soup.insert(0, "<b>YAY</b>")
print unicode(soup)

The above under BS 3.2.0 printed:

<b>YAY</b>

Under BS 3.2.1 it prints:
&lt;b&gt;YAY&lt;/b&gt;

I haven't had time to dig very deep, but one fix might be to parse the replacement HTML into a BeautifulSoup object and pass that to replaceWith instead of the raw string.
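
A rough sketch of that idea follows. It is not tested against this project's parse_data. It assumes the `bs4` package on Python 3 (BeautifulSoup 3.x only runs on Python 2), where the method is spelled `replace_with`; under 3.2.x the call would be `replaceWith`, and the constructor takes no parser argument. The helper name `replace_with_parsed_html` and the example URL are made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: bs4 rather than BeautifulSoup 3.x


def replace_with_parsed_html(node, html):
    """Replace `node` with the parsed form of `html`, not the raw string.

    Hypothetical helper; assumes `html` has a single top-level element.
    """
    fragment = BeautifulSoup(html, "html.parser")
    # Detach the parsed element from the temporary soup before re-inserting it.
    new_tag = fragment.contents[0].extract()
    node.replace_with(new_tag)


# Inserting a plain string creates a text node, so it is escaped on output.
escaped = BeautifulSoup("", "html.parser")
escaped.insert(0, "<b>YAY</b>")
escaped_output = str(escaped)

# Parsing the replacement first keeps it as real markup.
soup = BeautifulSoup("<p>http://example.com</p>", "html.parser")
url_node = soup.p.string  # the text node holding the matched URL
replace_with_parsed_html(url_node, '<a href="http://example.com">example</a>')
fixed_output = str(soup)

print(escaped_output)
print(fixed_output)
```

The first print shows the escaped form reported above; the second shows the link inserted as an actual tag.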

Thanks,
Dave
