old-nlp/venv/lib/python3.7/site-packages/bs4/__pycache__/diagnose.cpython-37.pyc

74 lines
7.6 KiB

2019-10-20 13:16:49 +02:00
B
%<25>]6<00>@s dZdZddlZddlmZddlmZddlZddlmZm Z ddl
m Z ddl Z ddl Z ddlZddlZddlZddlZddlZddlZdd <09>Zd#d d <0C>ZGd d<0E>de<06>Zdd<10>ZdZdZd$dd<15>Zd%dd<18>Zd&dd<1B>Zd'dd<1E>Zd(d d!<21>Zed"k<02>reej<1F> <20><00>dS))z=Diagnostic functions, mainly for use when doing tech support.<2E>MIT<49>N)<01>StringIO)<01>
HTMLParser)<02> BeautifulSoup<75> __version__)<01>builder_registryc CsTtdt<00>tdtj<00>dddg}x>|D]6}x0tjD]}||jkr6Pq6W|<01>|<02>td|<00>q*Wd|kr<>|<01>d<07>y*dd l m
}td
d <0B> t t |j<0E><02><00>Wn*tk
r<EFBFBD>}z td <0C>Wd d }~XYnXd|k<06>rydd l}td|j<00>Wn,tk
<EFBFBD>r}z td<0F>Wd d }~XYnXt|d<10><02>r4|<00><12>}n<>|<00>d<11><01>sL|<00>d<12><01>rdtd|<00>td<14>d Sy:tj<15>|<00><01>r<>td|<00>t|<00><01>}|<07><12>}Wd QRXWntk
<EFBFBD>r<>YnXt<00>x<>|D]<5D>}td|<00>d} yt||d<18>}
d} Wn8tk
<EFBFBD>r"}ztd|<00>t<1B><1C>Wd d }~XYnX| <09>rBtd|<00>t|
<EFBFBD><1D><00>td<1C><00>q<>Wd S)z/Diagnostic suite for isolating common problems.z'Diagnostic running on Beautiful Soup %szPython version %sz html.parser<65>html5lib<69>lxmlz;I noticed that %s is not installed. Installing it may help.zlxml-xmlr)<01>etreezFound lxml version %s<>.z.lxml is not installed or couldn't be imported.NzFound html5lib version %sz2html5lib is not installed or couldn't be imported.<2E>readzhttp:zhttps:z<"%s" looks like a URL. Beautiful Soup is not an HTTP client.zpYou need to use some other library to get the document behind the URL, and feed that document to Beautiful Soup.z7"%s" looks like a filename. Reading data from the file.z#Trying to parse your markup with %sF)<01>featuresTz%s could not parse the markup.z#Here's what %s did with the markup:zP--------------------------------------------------------------------------------)<1E>printr<00>sys<79>versionrZbuildersr <00>remove<76>appendr r
<00>join<69>map<61>strZ LXML_VERSION<4F> ImportErrorr<00>hasattrr <00>
startswith<EFBFBD>os<6F>path<74>exists<74>open<65>
ValueErrorr<00> Exception<6F> traceback<63> print_excZprettify) <0B>dataZ basic_parsers<72>name<6D>builderr
<00>er<00>fp<66>parser<65>success<73>soup<75>r)<00>8/tmp/pip-install-_x9nvcel/beautifulsoup4/bs4/diagnose.py<70>diagnosesj 

 

 
 
 
  

     r+TcKsNddlm}x<|jt|<00>fd|i|<02><02>D]\}}td||j|jf<00>q(WdS)z<>Print out the lxml events that occur during parsing.
This lets you see how lxml parses a document when no Beautiful
Soup code is running.
r)r
<00>htmlz %s, %4s, %sN)r r
<00> iterparserr<00>tag<61>text)r!r,<00>kwargsr
<00>event<6E>elementr)r)r*<00>
lxml_traceYs $r3c@s`eZdZdZdd<03>Zdd<05>Zdd<07>Zdd <09>Zd
d <0B>Zd d <0A>Z dd<0F>Z
dd<11>Z dd<13>Z dd<15>Z dS)<17>AnnouncingParserz?Announces HTMLParser parse events, without doing anything else.cCs t|<01>dS)N)r)<02>self<6C>sr)r)r*<00>_pfszAnnouncingParser._pcCs|<00>d|<00>dS)Nz%s START)r7)r5r"<00>attrsr)r)r*<00>handle_starttagisz AnnouncingParser.handle_starttagcCs|<00>d|<00>dS)Nz%s END)r7)r5r"r)r)r*<00> handle_endtaglszAnnouncingParser.handle_endtagcCs|<00>d|<00>dS)Nz%s DATA)r7)r5r!r)r)r*<00> handle_dataoszAnnouncingParser.handle_datacCs|<00>d|<00>dS)Nz
%s CHARREF)r7)r5r"r)r)r*<00>handle_charrefrszAnnouncingParser.handle_charrefcCs|<00>d|<00>dS)Nz %s ENTITYREF)r7)r5r"r)r)r*<00>handle_entityrefusz!AnnouncingParser.handle_entityrefcCs|<00>d|<00>dS)Nz
%s COMMENT)r7)r5r!r)r)r*<00>handle_commentxszAnnouncingParser.handle_commentcCs|<00>d|<00>dS)Nz%s DECL)r7)r5r!r)r)r*<00> handle_decl{szAnnouncingParser.handle_declcCs|<00>d|<00>dS)Nz%s UNKNOWN-DECL)r7)r5r!r)r)r*<00> unknown_decl~szAnnouncingParser.unknown_declcCs|<00>d|<00>dS)Nz%s PI)r7)r5r!r)r)r*<00> handle_pi<70>szAnnouncingParser.handle_piN)<0E>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__doc__r7r9r:r;r<r=r>r?r@rAr)r)r)r*r4csr4cCst<00>}|<01>|<00>dS)z<>Print out the HTMLParser events that occur during parsing.
This lets you see how HTMLParser parses a document when no
Beautiful Soup code is running.
N)r4<00>feed)r!r&r)r)r*<00>htmlparser_trace<63>srGZaeiouZbcdfghjklmnpqrstvwxyz<79>cCs>d}x4t|<00>D](}|ddkr$t}nt}|t<03>|<03>7}qW|S)z#Generate a random word-like string.<2E><00>r)<05>range<67> _consonants<74>_vowels<6C>random<6F>choice)<04>lengthr6<00>i<>tr)r)r*<00>rword<72>s rS<00>cCsd<01>dd<03>t|<00>D<00><01>S)z'Generate a random sentence-like string.<2E> css|]}tt<01>dd<01><02>VqdS)rT<00> N)rSrN<00>randint)<02>.0rQr)r)r*<00> <genexpr><3E>szrsentence.<locals>.<genexpr>)rrK)rPr)r)r*<00> rsentence<63>srZ<00><>cCs<>dddddddg}g}x~t|<00>D]r}t<01>dd <09>}|dkrRt<01>|<01>}|<02>d
|<00>q |d krr|<02>tt<01>d d <0C><02><01>q |d kr t<01>|<01>}|<02>d|<00>q Wdd<10>|<02>dS)z+Randomly generate an invalid HTML document.<2E>p<>div<69>spanrQ<00>b<>script<70>tabler<00>z<%s><3E>rTrJz</%s>z<html><3E>
z</html>)rKrNrWrOrrZr)<06> num_elementsZ tag_names<65>elementsrQrOZtag_namer)r)r*<00>rdoc<6F>s 

rgc
Cs(tdt<00>t|<00>}tdt|<01><00>x<>dddgddgD]z}d}y"t<04><04>}t||<02>}t<04><04>}d}Wn6tk
r<EFBFBD>}ztd |<00>t<07><08>Wd
d
}~XYnX|r6td |||f<00>q6Wd d l m
}t<04><04>}|<08> |<01>t<04><04>}td||<00>d d
l } | <09> <0A>}t<04><04>}|<02>|<01>t<04><04>}td||<00>d
S)z.Very basic head-to-head performance benchmark.z1Comparative parser benchmark on Beautiful Soup %sz3Generated a large invalid HTML document (%d bytes).r r,rz html.parserFTz%s could not parse the markup.Nz"BS4+%s parsed the markup in %.2fs.r)r
z$Raw lxml parsed the markup in %.2fs.z(Raw html5lib parsed the markup in %.2fs.)rrrg<00>len<65>timerrrr r r
ZHTMLrr<00>parse)
rer!r&r'<00>ar(r_r$r
rr)r)r*<00>benchmark_parsers<72>s4 
  

rmr cCsXt<00><01>}|j}t|<00>}tt||d<01>}t<06>d|||<03>t<08> |<03>}|<06>
d<03>|<06> dd<05>dS)N)<03>bs4r!r&zbs4.BeautifulSoup(data, parser)Z
cumulativez _html5lib|bs4<73>2) <0C>tempfile<6C>NamedTemporaryFiler"rg<00>dictrn<00>cProfileZrunctx<74>pstatsZStatsZ
sort_statsZ print_stats)rer&Z
filehandle<EFBFBD>filenamer!<00>vars<72>statsr)r)r*<00>profile<6C>s

rx<00>__main__)T)rH)rT)r[)rh)rhr )!rE<00> __license__rs<00>ior<00> html.parserrrnrrZ bs4.builderrrrtrNrprjrrr+r3r4rGrMrLrSrZrgrmrxrB<00>stdinr r)r)r)r*<00><module>s8   C
!