NLTK is a leading platform for building Python programs to work with human language data. This tutorial is about using Natural Language Processing techniques to analyze text, that is, to process human language data. You can read the NLTK 3.0 documentation here.
This post shows how to install the nltk Python module under Windows 10 and under the Fedora 26 distro.
Install under Windows 10 by using the pip command:
C:\Python27\Scripts>pip install --trusted-host pypi.python.org nltk
Collecting nltk
  Downloading nltk-3.2.2.tar.gz (1.2MB)
    100% |################################| 1.2MB 2.6MB/s
Requirement already satisfied: six in c:\python27\lib\site-packages (from nltk)
Building wheels for collected packages: nltk
...
Successfully built nltk
Installing collected packages: nltk
Successfully installed nltk-3.2.2
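After pip reports success, it can be useful to confirm that the interpreter you plan to use actually sees the module. The following is a minimal sketch, assuming Python 3 (the sessions in this post use Python 2.7); `is_installed` is a hypothetical helper name and not part of NLTK.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("nltk"):
    import nltk
    # nltk exposes its release number as nltk.__version__
    print("nltk version:", nltk.__version__)
else:
    print("nltk not found; try: pip install --user nltk")
```

If the check fails right after a successful install, pip probably belongs to a different Python installation than the interpreter running the script.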
Download all packages on Windows 10 with this Python code:
C:\Python27>python
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download()
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
True
Under Linux you can also install with the pip command. I used the Fedora 26 distro:
[root@localhost mythcat]# pip install nltk
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip install --user` instead.
Collecting nltk
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProtocolError('Connection aborted.', error(104, 'Connection reset by peer'))': /simple/nltk/
  Downloading nltk-3.2.2.tar.gz (1.2MB)
    100% |████████████████████████████████| 1.2MB 1.1MB/s
Requirement already satisfied: six in /usr/lib/python2.7/site-packages (from nltk)
Installing collected packages: nltk
  Running setup.py install for nltk ... done
Successfully installed nltk-3.2.2
Download all packages on the Fedora 26 distro with this Python code:
[mythcat@localhost ~]$ python
Python 2.7.13 (default, Feb 21 2017, 12:00:39)
[GCC 7.0.1 20170219 (Red Hat 7.0.1-0.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download()
NLTK Downloader
---------------------------------------------------------------------------
    d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
---------------------------------------------------------------------------
Downloader> d

Download which package (l=list; x=cancel)?
  Identifier> l
Packages:
  [ ] abc................. Australian Broadcasting Commission 2006
  [ ] alpino.............. Alpino Dutch Treebank
...
Collections:
  [ ] all-corpora......... All the corpora
  [ ] all................. All packages
  [ ] book................ Everything used in the NLTK Book

([*] marks installed packages)

Download which package (l=list; x=cancel)?
  Identifier> all
Downloading collection u'all'
   |
   | Downloading package abc to /home/mythcat/nltk_data...
   |   Unzipping corpora/abc.zip.
   | Downloading package alpino to /home/mythcat/nltk_data...
   |   Unzipping corpora/alpino.zip.
   | Downloading package biocreative_ppi to ...
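The interactive downloader is convenient for browsing, but `nltk.download()` also accepts a package or collection identifier, so the same step can be scripted. The sketch below assumes Python 3 and an installed nltk; `download_nltk_data` and its `download_fn` parameter are hypothetical names added for illustration (the parameter only exists so the helper can be tried without network access).

```python
def download_nltk_data(package_id, download_fn=None):
    """Fetch one NLTK package or collection without the interactive menu.

    package_id  -- an identifier from the downloader list, e.g. "book" or "all"
    download_fn -- hypothetical hook for testing; defaults to nltk.download
    """
    if download_fn is None:
        # Imported lazily so this helper can be defined even without nltk.
        import nltk
        download_fn = nltk.download
    # quiet=True suppresses the per-package progress messages.
    return download_fn(package_id, quiet=True)
```

Calling `download_nltk_data("book")` would request only the collection described in the list as "Everything used in the NLTK Book", which is much smaller than `all`.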
Let's start with a simple example that loads the sample books:
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> ...
The next example uses the books imported from the sample collection:
# count() returns how many times a word occurs in the text
>>> print text1.count("white")
191
# concordance() shows every occurrence of a given word, together with some context.
# It prints its own output and returns None, so the extra print adds the trailing None.
>>> print text3.concordance("white")
Displaying 5 of 5 matches:
potted , and every one that had some white in it , and all the brown among the
hazel and chesnut tree ; and pilled white strakes in them , and made the white
white strakes in them , and made the white appear which was in the rods . And h
y dream , and , behold , I had three white baskets on my he And in the uppermos
all be red with wine , and his teeth white with milk . Zebulun shall dwell at t
None
# similar() lists words that appear in contexts similar to those of the given word
>>> print text3.similar("white")
None
>>> print text3.similar("got")
named set arrayed bound brought see embraced kissed slew unto curse built shewed laid digged sent gave offer offered blessed
None
# common_contexts() shows the contexts shared by two or more words
>>> text3.common_contexts(["white","blue"])
(u'The following word(s) were not found:', u'white blue')
>>> text3.common_contexts(["man","men"])
old_of the_and the_said the_that the_took young_and the_s
This is all for today.
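To make the idea behind these methods more concrete, here is a rough plain-Python sketch of counting, concordance and shared contexts over a list of tokens. It is a simplified illustration written for Python 3, not NLTK's implementation: the function names mirror the methods above, but the sample sentence, the bracketed output format and the `*START*`/`*END*` markers are assumptions made for this example.

```python
def count(tokens, word):
    """Number of tokens exactly equal to word."""
    return sum(1 for tok in tokens if tok == word)


def concordance(tokens, word, width=3):
    """Each occurrence of word with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == word.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append("{} [{}] {}".format(left, tok, right).strip())
    return lines


def contexts(tokens, word):
    """Set of (previous token, next token) pairs around each occurrence."""
    found = set()
    for i, tok in enumerate(tokens):
        if tok.lower() == word.lower():
            left = tokens[i - 1].lower() if i > 0 else "*START*"
            right = tokens[i + 1].lower() if i + 1 < len(tokens) else "*END*"
            found.add((left, right))
    return found


def common_contexts(tokens, words):
    """Contexts that every word in `words` appears in."""
    sets = [contexts(tokens, w) for w in words]
    return set.intersection(*sets) if sets else set()


# Made-up sample sentence, split on whitespace.
tokens = "the old man and the young man saw the old men and the young men".split()
print(count(tokens, "man"))
print(concordance(tokens, "saw", width=2))
print(common_contexts(tokens, ["man", "men"]))
```

A shared context such as `old_of` in the NLTK output corresponds to a pair like `("old", "of")` here: both words were seen between the same two neighbours.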