old-nlp/venv/lib/python3.7/site-packages/nltk/__pycache__/data.cpython-37.pyc


2019-10-20 13:16:49 +02:00
B
D(<28>]<5D><><00>@sdZddlmZmZmZddlZddlZddlZddlZddl Z ddl
Z
ddl Z ddl Z ddl mZmZddlmZmZddlmZddlmZmZddlmZmZy ddlZWnek
r<EFBFBD>ddlZYnXyejejd d
<EFBFBD>Z Wn2e!k
<EFBFBD>rejej"d d d d <0C>Z#d d<0E>Z YnXyddl$m%Z&Wn"ek
<EFBFBD>rHddl$m'Z&YnXddl(Z(ddl)m*Z*m+Z+m,Z,gZ-ej.<2E>/de0d<13><01><02>1ej2<6A>Z3e-dd<15>e3D<00>7Z-dej.k<07>r<>ej-<2D>4d<17>dk<03>r<>e-<2D>5ej-<2D>4e0d<18><01><01>e
j6<EFBFBD>7d<19><01>rXe-ej-<2D>8e
j9e0d<1A><01>ej-<2D>8e
j9e0d<1B>e0d<1A><01>ej-<2D>8e
j9e0d<1C>e0d<1A><01>ej-<2D>8ej.<2E>/e0d<1D>e0d<1E><01>e0d<1A><01>e0d<1F>e0d <20>e0d!<21>g7Z-nbe-ej-<2D>8e
j9e0d<1A><01>ej-<2D>8e
j9e0d<1B>e0d<1A><01>ej-<2D>8e
j9e0d<1C>e0d<1A><01>e0d"<22>e0d#<23>e0d$<24>e0d%<25>g7Z-djd)d*<2A>Z:d+d,<2C>Z;d-d.<2E>Z<dkd0d1<64>Z=ee<0E>Gd2d3<64>d3e><3E><03>Z?Gd4d5<64>d5e?e<17>Z@Gd6d7<64>d7e<11>ZAGd8d9<64>d9e@<40>ZBGd:d;<3B>d;e?<3F>ZCiZDdld<d=<3D>ZEdmd>d?<3F>ZFd@dAdBdCdDdEdFdGdHdIdJdK<64> ZGdLdMdNdOdPdQdRdSdTdUdUdV<64> ZHdndXdY<64>ZIdod[d\<5C>ZJd]d^<5E>ZKd_d`<60>ZLGdadb<64>dbe><3E>ZMGdcdd<64>dde jN<6A>ZOGdedf<64>dfe><3E>ZPdgd3d5d7d9d9d=d?dhdidYd\d^dbddd9dfgZQdS)pa<70>
Functions to find and load NLTK resource files, such as corpora,
grammars, and saved processing objects. Resource files are identified
using URLs, such as ``nltk:corpora/abc/rural.txt`` or
``http://nltk.org/sample/toy.cfg``. The following URL protocols are
supported:
- ``file:path``: Specifies the file whose path is *path*.
Both relative and absolute paths may be used.
- ``http://host/path``: Specifies the file stored on the web
server *host* at path *path*.
- ``nltk:path``: Specifies the file stored in the NLTK data
package at *path*. NLTK will search for these files in the
directories specified by ``nltk.data.path``.
If no protocol is specified, then the default protocol ``nltk:`` will
be used.
This module provides two functions that can be used to access a
resource file, given its URL: ``load()`` loads a given resource, and
adds it to a resource cache; and ``retrieve()`` copies a given resource
to a local file.
<EFBFBD>)<03>print_function<6F>unicode_literals<6C>divisionN)<02>ABCMeta<74>abstractmethod)<02>GzipFile<6C>WRITE)<01> add_metaclass)<02> string_types<65> text_type)<02>urlopen<65> url2pathnamez )<01>prefixF)<03>initial_indent<6E>subsequent_indent<6E>replace_whitespacecCsd<01>dd<03>|<00><01>D<00><01>S)N<>
css|]}t|<01>VqdS)N)<01> textwrap_fill)<02>.0<EFBFBD>line<6E>r<00>+/tmp/pip-install-4m6m_5d_/nltk/nltk/data.py<70> <genexpr>Csz"textwrap_indent.<locals>.<genexpr>)<02>join<69>
splitlines)<01>textrrr<00>textwrap_indentBsr)<01> Z_SYNC_FLUSH)<01>Z_FINISH)<03>py3_data<74> add_py3_data<74>BytesIOZ NLTK_DATA<54>cCsg|] }|r|<01>qSrr)r<00>drrr<00>
<listcomp>\sr$<00>APPENGINE_RUNTIMEz~/z ~/nltk_data<74>winZ nltk_dataZshare<72>lib<69>APPDATAzC:\z C:\nltk_dataz D:\nltk_dataz E:\nltk_dataz/usr/share/nltk_dataz/usr/local/share/nltk_dataz/usr/lib/nltk_dataz/usr/local/lib/nltk_data<74>rb<72> <00>utf-8cCs&|dkrt||||<04>}t<01>||||<06>S)N)r<00>io<69> TextIOWrapper)<07>filename<6D>mode<64> compresslevel<65>encoding<6E>fileobj<62>errors<72>newlinerrr<00>gzip_open_unicode}s r5cCsR|<00>dd<02>\}}|dkrn0|dkr<|<02>d<05>rJd|<02>d<05>}nt<03>dd|<02>}||fS)a<>
Splits a resource url into "<protocol>:<path>".
>>> windows = sys.platform.startswith('win')
>>> split_resource_url('nltk:home/nltk')
('nltk', 'home/nltk')
>>> split_resource_url('nltk:/home/nltk')
('nltk', '/home/nltk')
>>> split_resource_url('file:/home/nltk')
('file', '/home/nltk')
>>> split_resource_url('file:///home/nltk')
('file', '/home/nltk')
>>> split_resource_url('file:///C:/home/nltk')
('file', '/C:/home/nltk')
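The doctests above can be reproduced with a small standalone sketch. It assumes only the behaviour the examples imply: split on the first colon, collapse leading slashes for ``file:`` URLs, and strip up to two leading slashes for any other protocol. The name ``split_resource_url_sketch`` is hypothetical, and this is an illustration rather than the NLTK implementation itself.

```python
import re


def split_resource_url_sketch(resource_url):
    """Split 'protocol:path' into (protocol, path); illustrative only."""
    # Split on the first colon only, so a Windows drive letter such as
    # 'C:' inside the path is left alone. A URL without any colon makes
    # the unpacking fail with ValueError.
    protocol, path_ = resource_url.split(":", 1)
    if protocol == "nltk":
        # nltk: paths are returned unchanged.
        pass
    elif protocol == "file":
        # Collapse any run of leading slashes into a single slash.
        if path_.startswith("/"):
            path_ = "/" + path_.lstrip("/")
    else:
        # Other protocols (for example http): drop up to two leading slashes.
        path_ = re.sub(r"^/{0,2}", "", path_)
    return protocol, path_
```

A caller that wants the default ``nltk:`` protocol for a bare name such as ``dir/file`` would need to catch the ``ValueError`` and supply the protocol itself.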
<20>:<3A><00>nltk<74>file<6C>/z^/{0,2}r")<05>split<69>
startswith<EFBFBD>lstrip<69>re<72>sub)<03> resource_url<72>protocol<6F>path_rrr<00>split_resource_url<72>s
rCcCs<>yt|<00>\}}Wntk
r,d}|}YnX|dkrTtj<03>|<02>rTd}t|dd<04>}n:|dkrnd}t|dd<04>}n |dkr<>d}t|d<07>}n|d7}d <09>||g<02>S)
a<EFBFBD>
Normalizes a resource url
>>> windows = sys.platform.startswith('win')
>>> os.path.normpath(split_resource_url(normalize_resource_url('file:grammar.fcfg'))[1]) == \
... ('\\' if windows else '') + os.path.abspath(os.path.join(os.curdir, 'grammar.fcfg'))
True
>>> not windows or normalize_resource_url('file:C:/dir/file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('file:C:\\dir\\file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('file:C:\\dir/file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('file://C:/dir/file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('file:////C:/dir/file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('nltk:C:/dir/file') == 'file:///C:/dir/file'
True
>>> not windows or normalize_resource_url('nltk:C:\\dir\\file') == 'file:///C:/dir/file'
True
>>> windows or normalize_resource_url('file:/dir/file/toy.cfg') == 'file:///dir/file/toy.cfg'
True
>>> normalize_resource_url('nltk:home/nltk')
'nltk:home/nltk'
>>> windows or normalize_resource_url('nltk:/home/nltk') == 'file:///home/nltk'
True
>>> normalize_resource_url('http://example.com/dir/file')
'http://example.com/dir/file'
>>> normalize_resource_url('dir/file')
'nltk:dir/file'
r8zfile://FNr9znltk:Tz://r")rC<00>
ValueError<EFBFBD>os<6F>path<74>isabs<62>normalize_resource_namer)r@rA<00>namerrr<00>normalize_resource_url<72>s !
 rJTcCs<>tt<01>d|<00><02>p|<00>tjj<06>}tj<08> d<02>r6|<00>
d<03>}nt<01> dd|<00>}|rVtj<05> |<00>}n$|dkrdtj }tj<05>tj<05>||<00><02>}|<00>dd<03><02>tjjd<03>}tj<08> d<02>r<>tj<05>|<00>r<>d|}|r<>|<00>d<03>s<>|d7}|S)a(
:type resource_name: str or unicode
:param resource_name: The name of the resource to search for.
Resource names are posix-style relative path names, such as
``corpora/brown``. Directory names will automatically
be converted to a platform-appropriate path separator.
Directory trailing slashes are preserved.
>>> windows = sys.platform.startswith('win')
>>> normalize_resource_name('.', True)
'./'
>>> normalize_resource_name('./', True)
'./'
>>> windows or normalize_resource_name('dir/file', False, '/') == '/dir/file'
True
>>> not windows or normalize_resource_name('C:/file', False, '/') == '/C:/file'
True
>>> windows or normalize_resource_name('/dir/file', False, '/') == '/dir/file'
True
>>> windows or normalize_resource_name('../dir/file', False, '/') == '/dir/file'
True
>>> not windows or normalize_resource_name('/dir/file', True, '/') == 'dir/file'
True
>>> windows or normalize_resource_name('/dir/file', True, '/') == '/dir/file'
True
z[\\/.]$r&r:z^/+N<>\)<12>boolr><00>search<63>endswithrErF<00>sep<65>sys<79>platformr<r=r?<00>normpath<74>curdir<69>abspathr<00>replacerG)<04> resource_nameZallow_relativeZ relative_path<74>is_dirrrrrH<00>s 
  rHc@s6eZdZdZed dd<04><01>Zedd<06><00>Zedd<08><00>ZdS)
<EFBFBD> PathPointeraq
An abstract base class for 'path pointers,' used by NLTK's data
package to identify specific paths. Two subclasses exist:
``FileSystemPathPointer`` identifies a file that can be accessed
directly via a given absolute path. ``ZipFilePathPointer``
identifies a file contained within a zipfile, that can be accessed
by reading that zipfile.
NcCsdS)z<>
Return a seekable read-only stream that can be used to read
the contents of the file identified by this path pointer.
:raise IOError: If the path specified by this pointer does
not contain a readable file.
Nr)<02>selfr1rrr<00>openszPathPointer.opencCsdS)z<>
Return the size of the file pointed to by this path pointer,
in bytes.
:raise IOError: If the path specified by this pointer does
not contain a readable file.
Nr)rYrrr<00> file_size(szPathPointer.file_sizecCsdS)aP
Return a new path pointer formed by starting at the path
identified by this pointer, and then following the relative
path given by ``fileid``. The path components of ``fileid``
should be separated by forward slashes, regardless of
the underlying file system's path separator character.
Nr)rY<00>fileidrrrr2szPathPointer.join)N)<08>__name__<5F>
__module__<EFBFBD> __qualname__<5F>__doc__rrZr[rrrrrrXs
 
rXc@sReZdZdZedd<03><00>Zedd<05><00>Zddd<08>Zd d
<EFBFBD>Z d d <0C>Z
d d<0E>Z dd<10>Z dS)<12>FileSystemPathPointerzm
A path pointer that identifies a file which can be accessed
directly via a given absolute path.
cCs.tj<01>|<01>}tj<01>|<01>s$td|<00><01>||_dS)z<>
Create a new path pointer for the given absolute path.
:raise IOError: If the given path does not exist.
zNo such file or directory: %rN)rErFrT<00>exists<74>IOError<6F>_path)rYrdrrr<00>__init__Cs   zFileSystemPathPointer.__init__cCs|jS)z2The absolute path identified by this path pointer.)rd)rYrrrrFSszFileSystemPathPointer.pathNcCs"t|jd<01>}|dk rt||<01>}|S)Nr))rZrd<00>SeekableUnicodeStreamReader)rYr1<00>streamrrrrZXs 
zFileSystemPathPointer.opencCst<00>|j<02>jS)N)rE<00>statrd<00>st_size)rYrrrr[^szFileSystemPathPointer.file_sizecCstj<01>|j|<01>}t|<02>S)N)rErFrrdra)rYr\rdrrrraszFileSystemPathPointer.joincCstd|j<00>S)NzFileSystemPathPointer(%r))<02>strrd)rYrrr<00>__repr__eszFileSystemPathPointer.__repr__cCs|jS)N)rd)rYrrr<00>__str__kszFileSystemPathPointer.__str__)N) r]r^r_r`rre<00>propertyrFrZr[rrkrlrrrrra=s  
rac@sjeZdZdZdZdeZeddd<07><01>Zdd <09>Zd
d <0B>Z d d <0A>Z
dd<0F>Z e fdd<11>Z ddd<13>Zddd<16>ZdS)<1A>BufferedGzipFilea<65>
A ``GzipFile`` subclass that buffers calls to ``read()`` and ``write()``.
This allows faster reads and writes of data to and from gzip-compressed
files at the cost of using more memory.
The default buffer size is 2MB.
``BufferedGzipFile`` is useful for loading large gzipped pickle objects
as well as writing large encoded feature files for classifier training.
i<00>Nr*cKs4t<00>|||||<04>|<05>d|j<03>|_t<05>|_d|_dS)a!
Return a buffered gzip file object.
:param filename: a filesystem path
:type filename: str
:param mode: a file mode which can be any of 'r', 'rb', 'a', 'ab',
'w', or 'wb'
:type mode: str
:param compresslevel: The compresslevel argument is an integer from 1
to 9 controlling the level of compression; 1 is fastest and
produces the least compression, and 9 is slowest and produces the
most compression. The default is 9.
:type compresslevel: int
:param fileobj: a BytesIO stream to read from instead of a file.
:type fileobj: BytesIO
:param size: number of bytes to buffer during calls to read() and write()
:type size: int
:rtype: BufferedGzipFile
<20>sizerN)rre<00>get<65>SIZE<5A>_sizer!<00> _nltk_buffer<65>_len)rYr.r/r0r2<00>kwargsrrrre~szBufferedGzipFile.__init__cCst<00>|_d|_dS)Nr)r!rtru)rYrrr<00> _reset_buffer<65>szBufferedGzipFile._reset_buffercCs*|dk r&|j<00>|<01>|jt|<01>7_dS)N)rt<00>writeru<00>len)rY<00>datarrr<00> _write_buffer<65>s zBufferedGzipFile._write_buffercCs(t<00>||j<02><03><00>|<00><04>|<00>|<01>dS)N)rrxrt<00>getvaluerwr{)rYrzrrr<00> _write_gzip<69>szBufferedGzipFile._write_gzipcCs&|jtkr|<00>d<00>|<00><03>t<04>|<00>S)N)r/<00>GZ_WRITEr}rwr<00>close)rYrrrr<00>s

zBufferedGzipFile.closecCs|j<00><01>t<02>||<01>dS)N)rt<00>flushr)rYZlib_moderrrr<><00>s
zBufferedGzipFile.flushcCsR|sB|j}t<01>}x(t<02>||<01>}|s,|<02><04>P|<02>|<03>qW|<02><06>St<02>||<01>SdS)N)rsr!r<00>readr<64>rxr|)rYrp<00>contents<74>blocksrrrr<><00>s zBufferedGzipFile.read<61><64><EFBFBD><EFBFBD><EFBFBD>cCs6|s
|j}|jt|<01>|kr(|<00>|<01>n
|<00>|<01>dS)z<>
:param data: bytes to write to file or buffer
:type data: bytes
:param size: buffer at least size bytes before writing to file
:type size: int
N)rsruryr{r})rYrzrprrrrx<00>s
 zBufferedGzipFile.write)NNr*N)N)r<>)r]r^r_r`ZMBrrrrerwr{r}r<00>FLUSHr<48>r<>rxrrrrrnos
  
rnc@seZdZdZddd<04>ZdS)<06>GzipFileSystemPathPointerz<72>
A subclass of ``FileSystemPathPointer`` that identifies a gzip-compressed
file located at a given absolute path. ``GzipFileSystemPathPointer`` is
appropriate for loading large gzip-compressed pickle objects efficiently.
NcCsDtj<01>d<01>stj<01>d<02>r&t|jd<03>}n t|jd<03>}|r@t||<01>}|S)Nz2.7z3.4r))rP<00>versionr<rnrdrrf)rYr1rgrrrrZ<00>s  
zGzipFileSystemPathPointer.open)N)r]r^r_r`rZrrrrr<><00>sr<>c@s`eZdZdZeddd<04><01>Zedd<06><00>Zedd<08><00>Zdd
d <0B>Z d d <0A>Z
dd<0F>Z dd<11>Z dd<13>Z d S)<16>ZipFilePathPointerz~
A path pointer that identifies a file contained within a zipfile,
which can be accessed by reading that zipfile.
r"cs<>t|t<01>rttj<04>|<01><01>}<01>r<>t<06>dd<02><03>d<02><01>y|<01><08><00>WnHt k
r<EFBFBD><00><00>
d<02>rp<72>fdd<04>|<01> <0B>D<00>rpnt d|j <0A>f<00><01>YnX||_<0E>|_dS)z<>
Create a new path pointer pointing at the specified entry
in the given zipfile.
:raise IOError: If the given zipfile does not exist, or if it
does not contain the specified entry.
Tr:csg|]}|<01><00><00>r|<01>qSr)r<)r<00>n)<01>entryrrr$
sz/ZipFilePathPointer.__init__.<locals>.<listcomp>zZipfile %r does not contain %rN)<10>
isinstancer
<00>OpenOnDemandZipFilerErFrTrHr=<00>getinfo<66> ExceptionrN<00>namelistrcr.<00>_zipfile<6C>_entry)rY<00>zipfiler<65>r)r<>rre<00>s

zZipFilePathPointer.__init__cCs|jS)z<>
The zipfile.ZipFile object used to access the zip file
containing the entry identified by this path pointer.
)r<>)rYrrrr<>szZipFilePathPointer.zipfilecCs|jS)z_
The name of the file within zipfile that this path
pointer points to.
)r<>)rYrrrr<>szZipFilePathPointer.entryNcCsp|j<00>|j<02>}t|<02>}|j<02>d<01>rZtj<06>d<02>s:tj<06>d<03>rJt|j|d<04>}qlt |j|d<04>}n|dk rlt
||<01>}|S)Nz.gzz2.7z3.4)r2) r<>r<>r<>r!rNrPr<>r<rnrrf)rYr1rzrgrrrrZ%s 
zZipFilePathPointer.opencCs|j<00>|j<02>jS)N)r<>r<>r<>r[)rYrrrr[4szZipFilePathPointer.file_sizecCsd|j|f}t|j|<02>S)Nz%s/%s)r<>r<>r<>)rYr\r<>rrrr7szZipFilePathPointer.joincCstd<01>|jj|jfS)NzZipFilePathPointer(%r, %r))rjr<>r.r<>)rYrrrrk;szZipFilePathPointer.__repr__cCstj<01>tj<01>|jj|j<06><02>S)N)rErFrRrr<>r.r<>)rYrrrrl>szZipFilePathPointer.__str__)r")N)r]r^r_r`rrermr<>r<>rZr[rrkrlrrrrr<><00>s $  
r<>c Cst|d<01>}|dkrt}t<02>d|<00>}|<02><04>\}}x<>|D]<5D>}|rvtj<01>|<05>rv|<05>d<04>rvy
t||<00>St k
rrw4YnXq4|r<>tj<01>
|<05>r4|dkr<>tj<01> |t |<00><01>}tj<01> |<06>r<>|<06>d<05>r<>t|<06>St|<06>Sq4tj<01> |t |<03><01>}tj<01> |<06>r4y
t||<04>St k
<EFBFBD>rw4Yq4Xq4W|dk<08>r<>|<00>d<06>}xdtt|<07><01>D]T}d<06> |d|<08>||dg||d<02><00>} y
t| |<01>Stk
<EFBFBD>r<>YnX<00>q0W|<00>d<06>d}
|
<EFBFBD>d<04><01>r<>|
<EFBFBD>d<08>d }
td
<EFBFBD>j|
d <0B>} t| <0B>} | d 7} | d j|d<0E>7} | dd<10> dd<12>|D<00><01>7} d} d| | | f} t| <0A><01>dS)a<>
Find the given resource by searching through the directories and
zip files in paths, where a None or empty string specifies an absolute path.
Returns a corresponding path name. If the given resource is not
found, raise a ``LookupError``, whose message gives a pointer to
the installation instructions for the NLTK downloader.
Zip File Handling:
- If ``resource_name`` contains a component with a ``.zip``
extension, then it is assumed to be a zipfile; and the
remaining path components are used to look inside the zipfile.
- If any element of ``nltk.data.path`` has a ``.zip`` extension,
then it is assumed to be a zipfile.
- If a given resource name that does not contain any zipfile
component is not found initially, then ``find()`` will make a
second attempt to find that resource, by replacing each
component *p* in the path with *p.zip/p*. For example, this
allows ``find()`` to map the resource name
``corpora/chat80/cities.pl`` to a zip file path pointer to
``corpora/chat80.zip/chat80/cities.pl``.
- When using ``find()`` to locate a directory contained in a
zipfile, the resource name must end with the forward slash
character. Otherwise, ``find()`` will not locate the
directory.
:type resource_name: str or unicode
:param resource_name: The name of the resource to search for.
Resource names are posix-style relative path names, such as
``corpora/brown``. Directory names will be
automatically converted to a platform-appropriate path separator.
:rtype: str
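The second-attempt rule described above (replacing a component *p* with *p.zip/p*) can be sketched as a pure string transformation. The helper name ``zip_candidates`` is hypothetical; it only generates the candidate names and does not touch the file system or ``nltk.data.path``.

```python
def zip_candidates(resource_name):
    """Return the names obtained by replacing each component p with p.zip/p."""
    pieces = resource_name.split("/")
    candidates = []
    for i in range(len(pieces)):
        # Keep everything before component i, insert '<p>.zip', then keep
        # component i and the rest so the path continues inside the zipfile.
        modified = pieces[:i] + [pieces[i] + ".zip"] + pieces[i:]
        candidates.append("/".join(modified))
    return candidates
```

For ``corpora/chat80/cities.pl`` the second candidate is ``corpora/chat80.zip/chat80/cities.pl``, which is the mapping given in the docstring.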
TNz(.*\.zip)/?(.*)$|z.zipz.gzr:r7<00>.rz<>Resource {resource} not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('{resource}')
)<01>resourcez<
For more information see: https://www.nltk.org/data.html
z.
Attempted to load {resource_name}
)rVz
Searched in:r"css|]}d|VqdS)z
- %rNr)rr#rrrr<00>szfind.<locals>.<genexpr>zF**********************************************************************z
%s
%s
%s
)rHrFr><00>match<63>groupsrE<00>isfilerNr<>rc<00>isdirrr rbr<>rar;<00>rangery<00>find<6E> LookupError<6F>
rpartitionrj<00>formatr)rV<00>paths<68>mr<6D>ZzipentryrB<00>p<>pieces<65>iZ modified_nameZresource_zipname<6D>msgrOZresource_not_foundrrrr<>Ms\%
  


 

 
 

,
  
r<>c Cs<>t|<00>}|dkr:|<00>d<02>r,tj<03>|<00>d}nt<05>dd|<00>}tj<03>|<01>r^tj<03>|<01>}t d|<00><01>|rrt
d||f<00>t |<00>}t |d<08><02>&}x|<03> d <09>}|<04>|<05>|s<>Pq<>WWdQRX|<03><0F>dS)
a<EFBFBD>
Copy the given resource to a local file. If no filename is
specified, then use the URL's filename. If there is already a
file named ``filename``, then raise a ``ValueError``.
:type resource_url: str
:param resource_url: A URL specifying where the resource should be
loaded from. The default protocol is "nltk:", which searches
for the file in the NLTK data package.
Nzfile:r<>z (^\w+:)?.*/r"zFile %r already exists!zRetrieving %r, saving to %r<>wbi)rJr<rErFr;r>r?rbrTrD<00>print<6E>_openrZr<>rxr)r@r.<00>verbose<73>infile<6C>outfile<6C>srrr<00>retrieve<76>s$ 
    

r<>z;A serialized python object, stored using the pickle module.z9A serialized python object, stored using the json module.z9A serialized python object, stored using the yaml module.zA context free grammar.zA probabilistic CFG.zA feature CFG.zZA list of first order logic expressions, parsed with nltk.sem.logic.Expression.fromstring.zA list of first order logic expressions, parsed with nltk.sem.logic.LogicParser. Requires an additional logic_parser parameterz>A semantic valuation, parsed by nltk.sem.Valuation.fromstring.z)The raw (byte string) contents of a file.z-The raw (unicode string) contents of a file. ) <0B>pickle<6C>json<6F>yaml<6D>cfg<66>pcfg<66>fcfg<66>fol<6F>logic<69>val<61>rawrr<>r<>r<>r<>r<>r<>r<>r<>r<>r) r<>r<>r<>r<>r<>r<>r<>r<>r<><00>txtr<00>autocCs<>t|<00>}t|<00>}|dkrX|<00>d<02>}|d}|dkr:|d}t<03>|<08>}|dkrXtd|<00><01>|tkrntd|f<00><01>|r<>t<07>||f<02>} | dk r<>|r<>td |f<00>| S|r<>td
|f<00>t |<00>}
|d kr<>|
<EFBFBD>
<EFBFBD>} <09>n<>|d kr<>t <0B> |
<EFBFBD>} <09>n<>|d k<02>r>ddl } ddlm} | <0B> |
<EFBFBD>} d} t| <09>dk<03>r(t| <09><12><00>} | | k<07>r<>td<11><01><01>nJ|dk<02>r^ddl}|<0E> |
<EFBFBD>} <09>n*|
<EFBFBD>
<EFBFBD>}|dk <09>r||<0F>|<06>}n0y|<0F>d<13>}Wn tk
<EFBFBD>r<>|<0F>d<14>}YnX|dk<02>r<>|} n<>|dk<02>r<>tjjj||d<17>} n<>|dk<02>r<>tjjj||d<17>} n<>|dk<02>rtjjj||||d<1A>} nn|dk<02>r@tjj|tjj<1E><1F>|d<1C>} nH|dk<02>r^tjj|||d<1C>} n*|dk<02>rztjj ||d<17>} nt!d|f<00><01>|
<EFBFBD>"<22>|<02>r<>y| t||f<Wnt#k
<EFBFBD>r<>YnX| S) a<>
Load a given resource from the NLTK data package. The following
resource formats are currently supported:
- ``pickle``
- ``json``
- ``yaml``
- ``cfg`` (context free grammars)
- ``pcfg`` (probabilistic CFGs)
- ``fcfg`` (feature-based CFGs)
- ``fol`` (formulas of First Order Logic)
- ``logic`` (Logical formulas to be parsed by the given logic_parser)
- ``val`` (valuation of First Order Logic model)
- ``text`` (the file contents as a unicode string)
- ``raw`` (the raw file contents as a byte string)
If no format is specified, ``load()`` will attempt to determine a
format based on the resource name's file extension. If that
fails, ``load()`` will raise a ``ValueError`` exception.
For all text formats (everything except ``pickle``, ``json``, ``yaml`` and ``raw``),
it tries to decode the raw contents using UTF-8, and if that doesn't
work, it tries with ISO-8859-1 (Latin-1), unless the ``encoding``
is specified.
:type resource_url: str
:param resource_url: A URL specifying where the resource should be
loaded from. The default protocol is "nltk:", which searches
for the file in the NLTK data package.
:type cache: bool
:param cache: If true, add this resource to a cache. If load()
finds a resource in its cache, then it will return it from the
cache rather than loading it.
:type verbose: bool
:param verbose: If true, print a message when loading a resource.
Messages are not displayed when a resource is retrieved from
the cache.
:type logic_parser: LogicParser
:param logic_parser: The parser that will be used to parse logical
expressions.
:type fstruct_reader: FeatStructReader
:param fstruct_reader: The parser that will be used to parse the
feature structure of an fcfg.
:type encoding: str
:param encoding: the encoding of the input; only used for text formats.
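Two rules from this docstring lend themselves to a short sketch: guessing the format from the file extension (looking past a trailing ``.gz``), and decoding text as UTF-8 with a Latin-1 fallback when no encoding is given. The names ``AUTO_FORMATS_SKETCH``, ``guess_format`` and ``decode_text`` are hypothetical, and the extension table is an assumed subset chosen for illustration, not the module's actual table.

```python
# Hypothetical subset of an extension-to-format table (an assumption).
AUTO_FORMATS_SKETCH = {
    "pickle": "pickle",
    "json": "json",
    "cfg": "cfg",
    "txt": "text",
}


def guess_format(resource_url):
    """Guess a format name from the extension, ignoring a final '.gz'."""
    parts = resource_url.split(".")
    ext = parts[-1]
    if ext == "gz" and len(parts) > 1:
        # 'model.pickle.gz' should be treated as a pickle.
        ext = parts[-2]
    fmt = AUTO_FORMATS_SKETCH.get(ext)
    if fmt is None:
        raise ValueError("Could not determine format for %s" % resource_url)
    return fmt


def decode_text(binary_data, encoding=None):
    """Decode bytes with the given encoding, else UTF-8, else Latin-1."""
    if encoding is not None:
        return binary_data.decode(encoding)
    try:
        return binary_data.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return binary_data.decode("latin-1")
```

Because the Latin-1 fallback never raises, a file in some other 8-bit encoding would decode silently into the wrong characters; passing ``encoding`` explicitly avoids that.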
r<>r<>r<><00>gz<67><7A><EFBFBD><EFBFBD><EFBFBD>NzzCould not determine format for %s based on its file
extension; use the "format" argument to specify the format explicitly.zUnknown format type: %s!z<<Using cached copy of %s>>z<<Loading %s>>r<>r<>r<>r)<01> json_tagsr7zUnknown json tag.r<>zutf-8zlatin-1rr<>)r1r<>r<>)<03> logic_parser<65>fstruct_readerr1r<>)r<>r1r<>r<>z@Internal NLTK error: Format %s isn't handled by nltk.data.load())$rJr r;<00> AUTO_FORMATSrqrD<00>FORMATS<54>_resource_cacher<65>r<>r<>r<><00>loadr<64>Z nltk.jsontagsr<73>ry<00>next<78>keysr<73><00>decode<64>UnicodeDecodeErrorr8ZgrammarZCFG<46>
fromstringZPCFGZFeatureGrammarZsemZ
read_logicr<EFBFBD>Z LogicParserZread_valuation<6F>AssertionErrorr<00> TypeError)r@r<><00>cacher<65>r<>r<>r1Zresource_url_parts<74>ext<78> resource_valZopened_resourcer<65>r<><00>tagr<67>Z binary_dataZ string_datarrrr<> s<>7

 
 
 
 

 










r<><00>##cCsRt|<00>}t|ddd<03>}|<02><02>}x.|D]&}|<04>|<01>r4q$t<04>d|<04>rBq$t|<04>q$WdS)a}
Write out a grammar file, ignoring escaped and empty lines.
:type resource_url: str
:param resource_url: A URL specifying where the resource should be
loaded from. The default protocol is "nltk:", which searches
for the file in the NLTK data package.
:type escape: str
:param escape: Prepended string that signals lines to be ignored
rF)r<>r<>z^$N)rJr<>rr<r>r<>r<>)r@<00>escaper<65><00>lines<65>lrrr<00>show_cfg<66>s 

 r<>cCs t<00><01>dS)zF
Remove all objects from the resource cache.
:see: load()
N)r<><00>clearrrrr<00> clear_cache<68>sr<>cCsdt|<00>}t|<00>\}}|dks(|<01><02>dkr<t|tdg<00><02><05>S|<01><02>dkrXt|dg<01><02><05>St|<00>SdS)ao
Helper function that returns an open file object for a resource,
given its resource URL. If the given resource URL uses the "nltk:"
protocol, or uses no protocol, then use ``nltk.data.find`` to find
its path, and open it with the given mode; if the resource URL
uses the 'file' protocol, then open the file with the given mode;
otherwise, delegate to ``urllib2.urlopen``.
:type resource_url: str
:param resource_url: A URL specifying where the resource should be
loaded from. The default protocol is "nltk:", which searches
for the file in the NLTK data package.
Nr8r"r9)rJrC<00>lowerr<72>rFrZr )r@rArBrrrr<><00>s  r<>c@s0eZdZedd<02><00>Zdd<04>Zdd<06>Zdd<08>Zd S)
<EFBFBD>
LazyLoadercCs
||_dS)N)rd)rYrdrrrre<00>szLazyLoader.__init__cCst|j<01>}|j|_|j|_dS)N)r<>rd<00>__dict__<5F> __class__)rYr<>rrrZ__load<61>s
zLazyLoader.__loadcCs|<00><00>t||<01>S)N)<02>_LazyLoader__load<61>getattr)rY<00>attrrrr<00> __getattr__<5F>szLazyLoader.__getattr__cCs|<00><00>t|<00>S)N)r<><00>repr)rYrrrrkszLazyLoader.__repr__N)r]r^r_rrer<>r<>rkrrrrr<><00>s r<>c@s<eZdZdZedd<03><00>Zdd<05>Zdd<07>Zdd <09>Zd
d <0B>Z d S) r<>a<>
A subclass of ``zipfile.ZipFile`` that closes its file pointer
whenever it is not using it; and re-opens it when it needs to read
data from the zipfile. This is useful for reducing the number of
open file handles when many zip files are being accessed at once.
``OpenOnDemandZipFile`` must be constructed from a filename, not a
file-like object (to allow re-opening). ``OpenOnDemandZipFile`` is
read-only (i.e. ``write()`` and ``writestr()`` are disabled).
cCs@t|t<01>std<01><01>tj<04>||<01>|j|ks.t<07>|<00><08>d|_ dS)Nz+ReopenableZipFile filename must be a stringr)
r<EFBFBD>r
r<>r<><00>ZipFilerer.r<>r<00> _fileRefCnt)rYr.rrrres 
zOpenOnDemandZipFile.__init__cCsD|jdkst<01>t|jd<01>|_tj<05>||<01>}|jd7_|<00><08>|S)Nr)r7) <09>fpr<70>rZr.r<>r<>r<>r<>r)rYrI<00>valuerrrr<>&s zOpenOnDemandZipFile.readcOs td<01><01>dS)z<:raise NotImplementedError: OpenOnDemandZipfile is read-onlyz OpenOnDemandZipfile is read-onlyN)<01>NotImplementedError)rY<00>argsrvrrrrx0szOpenOnDemandZipFile.writecOs td<01><01>dS)z<:raise NotImplementedError: OpenOnDemandZipfile is read-onlyz OpenOnDemandZipfile is read-onlyN)r<>)rYr<>rvrrr<00>writestr4szOpenOnDemandZipFile.writestrcCsttd<01>|j<00>S)NzOpenOnDemandZipFile(%r))r<>rjr.)rYrrrrk8szOpenOnDemandZipFile.__repr__N)
r]r^r_r`rrer<>rxr<>rkrrrrr<>s   
r<>c@s6eZdZdZdZed5dd<05><01>Zd6dd<08>Zd d
<EFBFBD>Zd7d d <0C>Z d8d d<0E>Z
dd<10>Z dd<12>Z dd<14>Z dd<16>Zdd<18>Zedd<1A><00>Zedd<1C><00>Zedd<1E><00>Zdd <20>Zd9d"d#<23>Zd$d%<25>Zd:d&d'<27>Zd(d)<29>Zd;d*d+<2B>Zd,d-<2D>Zejdfgejd.fejd/fgejdfgejdfgejd0fej d1fgejdfgej dfgd2<64>Z!d3d4<64>Z"dS)<rfa<>
A stream reader that automatically decodes the source byte stream
into unicode (like ``codecs.StreamReader``); but still supports the
``seek()`` and ``tell()`` operations correctly. This is in contrast
to ``codecs.StreamReader``, which provides *broken* ``seek()`` and
``tell()`` methods.
This class was motivated by ``StreamBackedCorpusView``, which
makes extensive use of ``seek()`` and ``tell()``, and needs to be
able to handle unicode-encoded files.
Note: this class requires stateless decoders. To my knowledge,
this shouldn't cause a problem with any of python's builtin
unicode encodings.
T<>strictcCsN|<01>d<01>||_||_||_t<04>|<02>|_d|_d|_d|_ d|_
|<00> <0B>|_ dS)Nr<00>) <0A>seekrgr1r3<00>codecs<63>
getdecoderr<EFBFBD><00>
bytebuffer<EFBFBD>
linebuffer<EFBFBD>_rewind_checkpoint<6E>_rewind_numchars<72>
_check_bom<EFBFBD>_bom)rYrgr1r3rrrreTs
  
z$SeekableUnicodeStreamReader.__init__NcCs0|<00>|<01>}|jr,d<01>|j<01>|}d|_d|_|S)a6
Read up to ``size`` bytes, decode them using this reader's
encoding, and return the resulting unicode string.
:param size: The maximum number of bytes to read. If not
specified, then read as many bytes as possible.
:type size: int
:rtype: unicode
r"N)<04>_readr<64>rr<>)rYrp<00>charsrrrr<><00>s
z SeekableUnicodeStreamReader.readcCsB|jr4t|j<00>dkr4|j<00>d<02>}|jt|<01>7_n
|j<04><05>dS)Nr7r)r<>ry<00>popr<70>rg<00>readline)rYrrrr<00> discard_line<6E>s z(SeekableUnicodeStreamReader.discard_linec
CsV|jr6t|j<00>dkr6|j<00>d<02>}|jt|<02>7_|S|p<d}d}|jr\||j<00><02>7}d|_x<>|j<04><05>t|j<06>}|<00>|<03>}|r<>|<06>d<06>r<>||<00>d<01>7}||7}|<04> d<07>}t|<07>dkr<>|d}|dd<05>|_t|<06>t|<04>t|<02>|_||_
Pn8t|<07>dk<02>r(|d}|d<00> d<08>d} || k<03>r(|}P|<06>r8|dk <09>r>|}P|d kr^|d
9}q^W|S) aj
Read a line of text, decode it using this reader's encoding,
and return the resulting unicode string.
:param size: The maximum number of bytes to read. If no
newline is encountered before ``size`` bytes have been read,
then the returned value may not be a complete line of text.
:type size: int
r7r<00>Hr"N<> TFi@ro) r<>ryr<>r<>rg<00>tellr<6C>r<>rNrr<>)
rYrpr<00>readsizer<65><00>startposZ new_charsr<73><00> line0withend<6E>line0withoutendrrrr<><00>sD  

 
 z$SeekableUnicodeStreamReader.readlinecCs|<00><00><00>|<02>S)a
Read this file's contents, decode them using this reader's
encoding, and return it as a list of unicode lines.
:rtype: list(unicode)
:param sizehint: Ignored.
:param keepends: If false, then strip newlines.
)r<>r)rY<00>sizehint<6E>keependsrrr<00> readlines<65>s z%SeekableUnicodeStreamReader.readlinescCs|<00><00>}|r|St<01>dS)z8Return the next decoded line from the underlying stream.N)r<><00> StopIteration)rYrrrrr<><00>sz SeekableUnicodeStreamReader.nextcCs|<00><00>S)N)r<>)rYrrr<00>__next__<5F>sz$SeekableUnicodeStreamReader.__next__cCs|S)z Return selfr)rYrrr<00>__iter__<5F>sz$SeekableUnicodeStreamReader.__iter__cCs|js|<00><01>dS)N)<02>closedr)rYrrr<00>__del__<5F>sz#SeekableUnicodeStreamReader.__del__cCs|S)z Return selfr)rYrrr<00>
xreadlinessz&SeekableUnicodeStreamReader.xreadlinescCs|jjS)z(True if the underlying stream is closed.)rgr)rYrrrr sz"SeekableUnicodeStreamReader.closedcCs|jjS)z"The name of the underlying stream.)rgrI)rYrrrrIsz SeekableUnicodeStreamReader.namecCs|jjS)z"The mode of the underlying stream.)rgr/)rYrrrr/sz SeekableUnicodeStreamReader.modecCs|j<00><01>dS)z.
Close the underlying stream.
N)rgr)rYrrrrsz!SeekableUnicodeStreamReader.closercCs@|dkrtd<02><01>|j<01>||<02>d|_d|_d|_|j<01><06>|_dS)a
Move the stream to a new file position. If the reader is
maintaining any buffers, then they will be cleared.
:param offset: A byte count offset.
:param whence: If 0, then the offset is from the start of the file
(offset should be positive), if 1, then the offset is from the
current position (offset may be positive or negative); and if 2,
then the offset is from the end of the file (offset should
typically be negative).
r7zmRelative seek is not supported for SeekableUnicodeStreamReader -- consider using char_seek_forward() instead.Nr<4E>)rDrgr<>r<>r<>r<>r<>r<>)rY<00>offset<65>whencerrrr<>$s z SeekableUnicodeStreamReader.seekcCs,|dkrtd<02><01>|<00>|<00><02><00>|<00>|<01>dS)zI
Move the read pointer forward by ``offset`` characters.
rz"Negative offsets are not supportedN)rDr<>r<><00>_char_seek_forward)rYrrrr<00>char_seek_forward<sz-SeekableUnicodeStreamReader.char_seek_forwardcCs<>|dkr |}d}x<>|j<00>|t|<03><00>}||7}|<00>|<03>\}}t|<05>|krd|j<00>t|<03> |d<03>dSt|<05>|kr<>x6t|<05>|kr<>||t|<05>7}|<00>|d|<02><00>\}}qrW|j<00>t|<03> |d<03>dS||t|<05>7}qWdS)a
Move the file position forward by ``offset`` characters,
ignoring all buffers.
:param est_bytes: A hint, giving an estimate of the number of
bytes that will be needed to move forward by ``offset`` chars.
Defaults to ``offset``.
Nr<4E>r7)rgr<>ry<00> _incr_decoder<65>)rYr<00> est_bytes<65>bytesZnewbytesr<73><00> bytes_decodedrrrrGs"   z.SeekableUnicodeStreamReader._char_seek_forwardcCs<>|jdkr|j<01><02>t|j<04>S|j<01><02>}|t|j<04>|j}tdd<03>|jD<00><01>}t||j|j|<00>}|j<01> |j<05>|<00>
|j|<04>|j<01><02>}|j r<>|j<01> |<05>|<00> |j<01> d<04><01>d}d<06>|j<00>}|<06>|<07>s<>|<07>|<06>s<>t<10>|j<01> |<01>|S)z<>
Return the current file position on the underlying byte
stream. If this reader is maintaining any buffers, then the
returned file position will be the position of the beginning
of those buffers.
Ncss|]}t|<01>VqdS)N)ry)rrrrrr<00>sz3SeekableUnicodeStreamReader.tell.<locals>.<genexpr><3E>2rr")r<>rgr<>ryr<>r<><00>sum<75>intr<74>r<>r<00>DEBUGrr<>rr<r<>)rYZ orig_filepos<6F>
bytes_read<EFBFBD>buf_sizer<00>fileposZcheck1Zcheck2rrrr<>os"


   z SeekableUnicodeStreamReader.tellcCs<>|dkr dS|jr.|j<01><02>dkr.|j<01>|j<00>|dkrB|j<01><03>}n |j<01>|<01>}|j|}|<00>|<03>\}}|dk r<>|s<>t|<02>dkr<>x0|s<>|j<01>d<04>}|s<>P||7}|<00>|<03>\}}q<>W||d<03>|_|S)z<>
Read up to ``size`` bytes from the underlying stream, decode
them using this reader's encoding, and return the resulting
unicode string. ``linebuffer`` is not included in the result.
rr"Nr7)r<>rgr<>r<>r<>rry)rYrpZ new_bytesr r<>r
rrrr<><00>s$  
 z!SeekableUnicodeStreamReader._readc
Cs|xvy |<00>|d<01>Stk
rr}zF|jt|<01>krF|<00>|d|j<04>|j<05>S|jdkrT<72>n|<00>||j<05>SWdd}~XYqXqWdS)a<>
Decode the given byte string into a unicode string, using this
reader's encoding. If an exception is encountered that
appears to be caused by a truncation error, then just decode
the byte string without the bytes that cause the truncation
error.
Return a tuple ``(chars, num_consumed)``, where ``chars`` is
the decoded unicode string, and ``num_consumed`` is the
number of bytes that were consumed.
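The truncation handling described above can be sketched with the standard ``codecs`` module. The function name ``incr_decode_sketch`` and its signature are hypothetical; the sketch assumes that a truncation shows up as a ``UnicodeDecodeError`` whose ``end`` attribute equals the length of the input.

```python
import codecs


def incr_decode_sketch(data, encoding="utf-8", errors="strict"):
    """Decode bytes, dropping a trailing incomplete character if present.

    Returns (chars, num_consumed), like the stateless codecs decoders.
    """
    decode = codecs.getdecoder(encoding)
    try:
        # First try a strict decode of the whole buffer.
        return decode(data, "strict")
    except UnicodeDecodeError as exc:
        if exc.end == len(data):
            # The error reaches the end of the buffer: treat it as a
            # truncated character and decode only the bytes before it.
            return decode(data[: exc.start], errors)
        if errors == "strict":
            # A genuine error in the middle of the data: re-raise.
            raise
        # Otherwise let the caller's error handler deal with it.
        return decode(data, errors)
```

The unconsumed tail (``data[num_consumed:]``) is what a reader would keep in its byte buffer and prepend to the next block read from the stream.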
r<>N)r<>r<><00>endry<00>startr3)rYr <00>excrrrr<00>s  
z(SeekableUnicodeStreamReader._incr_decodezutf16-lezutf16-bezutf32-lezutf32-be)<07>utf8<66>utf16Zutf16leZutf16be<62>utf32Zutf32leZutf32becCsnt<00>dd|j<02><03><00>}|j<04>|<01>}|rj|j<06>d<03>}|j<06>d<04>x,|D]$\}}|<03> |<04>rB|r^||_t
|<04>SqBWdS)Nz[ -]r"<00>r) r>r?r1r<><00>
_BOM_TABLErqrgr<>r<>r<ry)rY<00>encZbom_infor <00>bomZ new_encodingrrrr<><00>s   
 z&SeekableUnicodeStreamReader._check_bom)r<>)N)N)NT)r)N)N)#r]r^r_r`rrrer<>r<>r<>r<>r<>r<>r<>rrrmrrIr/rr<>rrr<>r<>rr<><00>BOM_UTF8<46> BOM_UTF16_LE<4C> BOM_UTF16_BE<42> BOM_UTF32_LE<4C> BOM_UTF32_BErr<>rrrrrfAs> 8

<
   

(.
&



rfrFr<>r<>)r)r*r+NNN)TN)N)NT)r<>TFNNN)r<>)Rr`<00>
__future__rrr<00> functools<6C>textwrapr,rEr>rPr<>r<><00>abcrr<00>gziprrr~<00>sixr r
r Zsix.moves.urllib.requestr r <00>cPickler<65><00> ImportError<6F>partial<61>indentr<00>AttributeError<6F>fillr<00>zlibrr<>rr8Z nltk.compatrr r!rF<00>environrqrjr;<00>pathsepZ_paths_from_env<6E>
expanduser<EFBFBD>appendrQr<rrr5rCrJrH<00>objectrXrarnr<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>r<>rf<00>__all__rrrr<00><module> s  " 
8
5)2g^
s
*

"1D