venv/lib/python3.7/site-packages/bs4/__pycache__/dammit.cpython-37.pyc

B
 %<25>]4y<00>@sDdZdZddlZddlmZddlZddlZddlZdZyddl	Z	dd<06>Z
WnFek
r<EFBFBD>yddlZdd<06>Z
Wnek
r<EFBFBD>dd<06>Z
YnXYnXyddl
Z
Wnek
r<EFBFBD>YnXd	Zd
Ze<10>Ze<05>e<0F>d<0B>ej<14>e<05>e<0E>d<0B>ej<14>d<0C>ee<e<05>eej<14>e<05>eej<14>d<0C>ee<Gd
d<0E>de<17>ZGdd<10>d<10>ZGdd<12>d<12>ZdS)aBBeautiful Soup bonus library: Unicode, Dammit

This library converts a bytestream to Unicode through any means
necessary. It is heavily based on code from Mark Pilgrim's Universal
Feed Parser. It works best on XML and HTML, but it does not rewrite the
XML or HTML to reflect a new encoding; that's the tree builder's job.
<EFBFBD>MIT<49>N)<01>codepoint2namecCst|t<01>rdSt<02>|<00>dS)N<>encoding)<04>
isinstance<EFBFBD>str<74>cchardet<65>detect)<01>s<>r
<00>6/tmp/pip-install-_x9nvcel/beautifulsoup4/bs4/dammit.py<70>chardet_dammits
rcCst|t<01>rdSt<02>|<00>dS)Nr)rr<00>chardetr)r	r
r
rr"s
cCsdS)Nr
)r	r
r
rr*sz$^\s*<\?.*encoding=['"](.*?)['"].*\?>z0<\s*meta[^>]+charset\s*=\s*["']?([^>]*?)[ /;'">]<5D>ascii)<02>html<6D>xmlc@s<>eZdZdZdd<03>Ze<04>\ZZZdddddd	<09>Ze	<09>
d
<EFBFBD>Ze	<09>
d<0B>Ze
dd
<0A><00>Ze
dd<0F><00>Ze
dd<11><00>Ze
ddd<14><01>Ze
ddd<16><01>Ze
dd<18><00>ZdS)<1C>EntitySubstitutionzASubstitute XML or HTML entities for the corresponding characters.cCsxi}i}g}dg}xFtt<01><02><00>|D]2\}}t|<04>}|dkrN|<02>|<06>|||<|||<q$Wdd<04>|<02>}||t<06>|<07>fS)N)<02>'<00>apos)<02>"rz[%s]<5D>)<08>listr<00>items<6D>chr<68>append<6E>join<69>re<72>compile)<08>lookupZreverse_lookupZcharacters_for_re<72>extra<72>	codepoint<6E>name<6D>	characterZ
re_definitionr
r
r<00>_populate_class_variablesEs
z,EntitySubstitution._populate_class_variablesr<00>quot<6F>amp<6D>lt<6C>gt)<05>'<27>"<22>&<26><<3C>>z&([<>]|&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;))z([<>&])cCs|j<00>|<01>d<01><01>}d|S)Nrz&%s;)<03>CHARACTER_TO_HTML_ENTITY<54>get<65>group)<03>cls<6C>matchobj<62>entityr
r
r<00>_substitute_html_entityosz*EntitySubstitution._substitute_html_entitycCs|j|<01>d<01>}d|S)zmUsed with a regular expression to substitute the
        appropriate XML entity for an XML special character.rz&%s;)<02>CHARACTER_TO_XML_ENTITYr.)r/r0r1r
r
r<00>_substitute_xml_entitytsz)EntitySubstitution._substitute_xml_entitycCs6d}d|kr*d|kr&d}|<01>d|<03>}nd}|||S)a*Make a value into a quoted XML attribute, possibly escaping it.

         Most strings will be quoted using double quotes.

          Bob's Bar -> "Bob's Bar"

         If a string contains double quotes, it will be quoted using
         single quotes.

          Welcome to "my bar" -> 'Welcome to "my bar"'

         If a string contains both single and double quotes, the
         double quotes will be escaped, and the string will be quoted
         using double quotes.

          Welcome to "Bob's Bar" -> "Welcome to &quot;Bob's Bar&quot;"
        r(r'z&quot;)<01>replace)<04>self<6C>valueZ
quote_withZreplace_withr
r
r<00>quoted_attribute_value{sz)EntitySubstitution.quoted_attribute_valueFcCs"|j<00>|j|<01>}|r|<00>|<01>}|S)aSubstitute XML entities for special XML characters.

        :param value: A string to be substituted. The less-than sign
          will become &lt;, the greater-than sign will become &gt;,
          and any ampersands will become &amp;. If you want ampersands
          that appear to be part of an entity definition to be left
          alone, use substitute_xml_containing_entities() instead.

        :param make_quoted_attribute: If True, then the string will be
         quoted, as befits an attribute value.
        )<04>AMPERSAND_OR_BRACKET<45>subr4r8)r/r7<00>make_quoted_attributer
r
r<00>substitute_xml<6D>s


z!EntitySubstitution.substitute_xmlcCs"|j<00>|j|<01>}|r|<00>|<01>}|S)a<>Substitute XML entities for special XML characters.

        :param value: A string to be substituted. The less-than sign will
          become &lt;, the greater-than sign will become &gt;, and any
          ampersands that are not part of an entity definition will
          become &amp;.

        :param make_quoted_attribute: If True, then the string will be
         quoted, as befits an attribute value.
        )<04>BARE_AMPERSAND_OR_BRACKETr:r4r8)r/r7r;r
r
r<00>"substitute_xml_containing_entities<65>s


z5EntitySubstitution.substitute_xml_containing_entitiescCs|j<00>|j|<01>S)a<>Replace certain Unicode characters with named HTML entities.

        This differs from data.encode(encoding, 'xmlcharrefreplace')
        in that the goal is to make the result more readable (to those
        with ASCII displays) rather than to recover from
        errors. There's absolutely nothing wrong with a UTF-8 string
        containing a LATIN SMALL LETTER E WITH ACUTE, but replacing that
        character with "&eacute;" will make it more readable to some
        people.
        )<03>CHARACTER_TO_HTML_ENTITY_REr:r2)r/r	r
r
r<00>substitute_html<6D>sz"EntitySubstitution.substitute_htmlN)F)F)<14>__name__<5F>
__module__<EFBFBD>__qualname__<5F>__doc__r"r,ZHTML_ENTITY_TO_CHARACTERr?r3rrr=r9<00>classmethodr2r4r8r<r>r@r
r
r
rrAs$

%rc@sHeZdZdZddd<05>Zdd<07>Zedd	<09><00>Zed
d<0B><00>Z	eddd
<0A><01>Z
dS)<10>EncodingDetectora^Suggests a number of possible encodings for a bytestring.

    Order of precedence:

    1. Encodings you specifically tell EncodingDetector to try first
    (the override_encodings argument to the constructor).

    2. An encoding declared within the bytestring itself, either in an
    XML declaration (if the bytestring is to be interpreted as an XML
    document), or in a <meta> tag (if the bytestring is to be
    interpreted as an HTML document.)

    3. An encoding detected through textual analysis by chardet,
    cchardet, or a similar external library.

    4. UTF-8.

    5. Windows-1252.
    NFcCsN|pg|_|pg}tdd<02>|D<00><01>|_d|_||_d|_|<00>|<01>\|_|_dS)NcSsg|]}|<01><00><00>qSr
)<01>lower)<02>.0<EFBFBD>xr
r
r<00>
<listcomp><3E>sz-EncodingDetector.__init__.<locals>.<listcomp>)	<09>override_encodings<67>set<65>exclude_encodings<67>chardet_encoding<6E>is_html<6D>declared_encoding<6E>strip_byte_order_mark<72>markup<75>sniffed_encoding)r6rRrKrOrMr
r
r<00>__init__<5F>s
zEncodingDetector.__init__cCs8|dk	r4|<01><00>}||jkrdS||kr4|<02>|<01>dSdS)NFT)rGrM<00>add)r6r<00>triedr
r
r<00>_usable<6C>s

zEncodingDetector._usableccs<>t<00>}x |jD]}|<00>||<01>r|VqW|<00>|j|<01>r>|jV|jdkrZ|<00>|j|j<07>|_|<00>|j|<01>rp|jV|jdkr<>t	|j<06>|_|<00>|j|<01>r<>|jVxdD]}|<00>||<01>r<>|Vq<>WdS)z<Yield a number of encodings that might work for this markup.N)zutf-8zwindows-1252)
rLrKrWrSrP<00>find_declared_encodingrRrOrNr)r6rV<00>er
r
r<00>	encodingss$


zEncodingDetector.encodingscCs<>d}t|t<01>r||fSt|<01>dkrT|dd<03>dkrT|dd<02>dkrTd}|dd<01>}n<>t|<01>dkr<>|dd<03>dkr<>|dd<02>dkr<>d}|dd<01>}nd|dd	<09>d
kr<>d}|d	d<01>}nB|dd<02>dkr<>d
}|dd<01>}n |dd<02>dkr<>d}|dd<01>}||fS)zMIf a byte-order mark is present, strip it and return the encoding it implies.N<><00>s<00><>zzutf-16bes<00><>zutf-16le<6C>szutf-8s<00><>zutf-32bes<00><>zutf-32le)rr<00>len)r/<00>datarr
r
rrQ&s*
z&EncodingDetector.strip_byte_order_markcCs<>|rt|<01>}}nd}tdtt|<01>d<00><01>}t|t<04>r@tt}ntt}|d}|d}d}	|j||d<07>}
|
s<EFBFBD>|r<>|j||d<07>}
|
dk	r<>|
<EFBFBD><08>d}	|	r<>t|	t<04>r<>|	<09>	d	d
<EFBFBD>}	|	<09>
<EFBFBD>SdS)z<>Given a document, tries to find its declared encoding.

        An XML encoding is declared at the beginning of the document.

        An HTML encoding is declared in a <meta> tag, hopefully near the
        beginning of the document.
        iig<><67><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>?rrN)<01>endposrrr5)r^<00>max<61>intr<00>bytes<65>encoding_resr<00>search<63>groups<70>decoderG)r/rRrOZsearch_entire_documentZ
xml_endposZhtml_endpos<6F>res<65>xml_reZhtml_rerPZdeclared_encoding_matchr
r
rrX@s(	


z'EncodingDetector.find_declared_encoding)NFN)FF)rArBrCrDrTrW<00>propertyrZrErQrXr
r
r
rrF<00>s

!rFc<00>@s<>eZdZdZddd<04>ZdddgZgdd	gfd
d<0B>Zdd
<0A>Zd<>dd<10>Zd<>dd<12>Z	e
dd<14><00>Zdd<16>Zdd<18>Z
dddddddd d!d"d#d$d%d&d'd&d&d(d)d*d+d,d-d.d/d0d1d2d3d&d4d5d6<64> Zd7dd8d9d:d;d<d=d>d?d@dAdBd&dCd&d&dDdDdEdEdFdGdHdIdJdKdLdMd&dNdOddPdQdRdSdTdUd@dVdWdXdYdPddZdGd[d\d]d^d_d`dadFd8dbdXdcdddedfd&dgdgdgdgdgdgdhdidjdjdjdjdkdkdkdkdldmdndndndndndFdndododododOdpdqdrdrdrdrdrdrdsdQdtdtdtdtdudududud[dvd[d[d[d[d[dwd[d`d`d`d`dxdpdxdy<64><79>Zdzd{d|d}d~dd<7F>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD>d<EFBFBD><64>zZd<>d<EFBFBD>d<EFBFBD>gZed<>d<>Zed<>d<>Ze<14>dd<>d<EFBFBD><64><01>ZdS(<00>
UnicodeDammitz<74>A class for detecting the encoding of a *ML document and
    converting it to a Unicode string. If the source encoding is
    windows-1252, can replace MS smart quotes with their HTML or XML
    equivalents.z	mac-romanz	shift-jis)<02>	macintoshzx-sjis<69>windows-1252z
iso-8859-1z
iso-8859-2NFcCs<>||_g|_d|_||_t<04>t<06>|_t||||<05>|_	t
|t<0B>sF|dkr`||_t|<01>|_
d|_dS|j	j|_d}x,|j	jD] }|j	j}|<00>|<07>}|dk	rxPqxW|s<>x@|j	jD]4}|dkr<>|<00>|d<04>}|dk	r<>|j<07>d<05>d|_Pq<>W||_
|s<>d|_dS)NFrrr5zSSome characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.T)<12>smart_quotes_to<74>tried_encodingsZcontains_replacement_charactersrO<00>logging<6E>	getLoggerrA<00>logrF<00>detectorrrrRZunicode_markup<75>original_encodingrZ<00>
_convert_from<6F>warning)r6rRrKrnrOrM<00>urr
r
rrTus>


zUnicodeDammit.__init__cCs<>|<01>d<01>}|jdkr&|j<02>|<02><01><04>}nf|j<05>|<02>}t|<03>tkr<>|jdkrfd<04><04>|d<00><04>d<05><04>}q<>d<06><04>|d<00><04>d<05><04>}n|<03><04>}|S)z[Changes a MS smart quote character to an XML or HTML
        entity, or an ASCII character.<2E>rrz&#x<>;r)r)r.rn<00>MS_CHARS_TO_ASCIIr-<00>encode<64>MS_CHARS<52>type<70>tuple)r6<00>match<63>origr:r
r
r<00>_sub_ms_char<61>s


zUnicodeDammit._sub_ms_char<61>strictc
Cs<>|<00>|<01>}|r||f|jkr dS|j<01>||f<02>|j}|jdk	rf||jkrfd}t<06>|<04>}|<05>|j	|<03>}y|<00>
|||<02>}||_||_Wn"tk
r<EFBFBD>}zdSd}~XYnX|jS)Ns([<5B>-<2D>]))
<0A>
find_codecrorrRrn<00>ENCODINGS_WITH_SMART_QUOTESrrr:r<><00>_to_unicodert<00>	Exception)r6Zproposed<65>errorsrRZsmart_quotes_reZsmart_quotes_compiledrwrYr
r
rru<00>s"


zUnicodeDammit._convert_fromcCst|||<03>S)zGiven a string and its encoding, decodes the string into Unicode.
        %encoding is a string recognized by encodings.aliases)r)r6r_rr<>r
r
rr<><00>szUnicodeDammit._to_unicodecCs|js
dS|jjS)N)rOrsrP)r6r
r
r<00>declared_html_encoding<6E>sz$UnicodeDammit.declared_html_encodingcCs`|<00>|j<01>||<01><02>pN|r*|<00>|<01>dd<02><02>pN|r@|<00>|<01>dd<03><02>pN|rL|<01><04>pN|}|r\|<02><04>SdS)N<>-r<00>_)<05>_codec<65>CHARSET_ALIASESr-r5rG)r6<00>charsetr7r
r
rr<><00>szUnicodeDammit.find_codecc	Cs<|s|Sd}yt<00>|<01>|}Wnttfk
r6YnX|S)N)<04>codecsr<00>LookupError<6F>
ValueError)r6r<><00>codecr
r
rr<><00>s
zUnicodeDammit._codec)<02>euroZ20AC<41> )<02>sbquoZ201A)<02>fnofZ192)<02>bdquoZ201E)<02>hellipZ2026)<02>daggerZ2020)<02>DaggerZ2021)<02>circZ2C6)<02>permilZ2030)<02>ScaronZ160)<02>lsaquoZ2039)<02>OEligZ152<35>?)z#x17DZ17D)<02>lsquoZ2018)<02>rsquoZ2019)<02>ldquoZ201C)<02>rdquoZ201D)<02>bullZ2022)<02>ndashZ2013)<02>mdashZ2014)<02>tildeZ2DC)<02>tradeZ2122)<02>scaronZ161)<02>rsaquoZ203A)<02>oeligZ153)z#x17EZ17E)<02>Yumlr) <20><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00>ZEUR<55>,<2C>fz,,z...<2E>+z++<2B>^<5E>%<25>Sr*ZOE<4F>Zr'r(<00>*r<>z--<2D>~z(TM)r	r+Zoe<6F>z<>Y<>!<21>cZGBP<42>$ZYEN<45>|z..rz(th)z<<z(R)<29>oz+-<2D>2<>3)r'<00>acuterw<00>P<>1z>>z1/4z1/2z3/4<>AZAE<41>C<>E<>I<>D<>N<>O<>U<>b<>B<>aZaerY<00>i<>n<>/<2F>y)<29>r<EFBFBD>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<><00><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00>s€s‚sƒs„s…s†s‡sˆs‰sŠs‹sŒsŽs‘s’s“s”s•s–s—s˜s™sšs›sœsžsŸs 
s¡s¢s£s¤s¥s¦s§s¨s©sªs«s¬ss®s¯s°s±s²s³s´sµs¶s·s¸s¹sºs»s¼s½s¾s¿sÀsÁsÂsÃsÄsÅsÆsÇsÈsÉsÊsËsÌsÍsÎsÏsÐsÑsÒsÓsÔsÕsÖs×sØsÙsÚsÛsÜsÝsÞsßsàr<C3A0>sâsãsäsåsæsçsèsésêsësìsísîsïsðsñsòsósôsõsös÷søsùsúsûsüsýsþ)z<><7A><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><><00><>)r<>r<>r\)r<>r<>r])r<>r<>r[r<00><><EFBFBD><EFBFBD><EFBFBD>rx<00>utf8cCs"|<03>dd<02><02><01>dkrtd<04><01>|<02><01>dkr0td<06><01>g}d}d}x<>|t|<01>kr<>||}t|t<05>sdt|<07>}||jkr<>||jkr<>xz|j	D]$\}}	}
||kr<>||	kr<>||
7}Pq<>Wq>|dkr<>||j
kr<>|<04>|||<06><00>|<04>|j
|<00>|d	7}|}q>|d	7}q>W|dk<02>r|S|<04>||d
<EFBFBD><00>d<0B>|<04>S)a<>Fix characters from one encoding embedded in some other encoding.

        Currently the only situation supported is Windows-1252 (or its
        subset ISO-8859-1), embedded in UTF-8.

        The input must be a bytestring. If you've already converted
        the document to Unicode, you're too late.

        The output is a bytestring in which `embedded_encoding`
        characters have been converted to their `main_encoding`
        equivalents.
        r<>r<>)zwindows-1252<35>windows_1252zPWindows-1252 and ISO-8859-1 are the only currently supported embedded encodings.)r<>zutf-8z4UTF-8 is the only currently supported main encoding.rrQrxN<>)
r5rG<00>NotImplementedErrorr^rrb<00>ord<72>FIRST_MULTIBYTE_MARKER<45>LAST_MULTIBYTE_MARKER<45>MULTIBYTE_MARKERS_AND_SIZES<45>WINDOWS_1252_TO_UTF8rr)r/Zin_bytesZ
main_encodingZembedded_encodingZbyte_chunksZchunk_start<72>pos<6F>byte<74>start<72>end<6E>sizer
r
r<00>	detwingle)s<


zUnicodeDammit.detwingle)r<>)r<>)r<>rm)rArBrCrDr<>r<>rTr<>rur<>rjr<>r<>r<>r|rzr<>r<>r<>r<>rEr<>r
r
r
rrkbs`1


	rk)rD<00>__license__r<5F><00>
html.entitiesrrrp<00>stringZchardet_typerr<00>ImportErrorr
Ziconv_codecZxml_encodingZ	html_meta<74>dictrdrr{r<>rcr<00>objectrrFrkr
r
r
r<00><module>s@