To run on a fresh machine, follow these steps:

        1) Install Python.
        2) Install the external packages elixir, sql, turbogears and lucene. The modules os, sys, subprocess, smtplib, email and urllib
        are part of the Python standard library and need no separate installation.
        3) Install Eclipse Galileo.
        4) Copy the folder named 'code' onto your machine.
        5) Set PYTHONPATH in Eclipse.
        6) Start the MySQL server and open a client session with the following commands:
                sudo /path-to-mysql/mysql.server start
                mysql -u root
        7) Create a database named 'phonecrawler'.
        9) Run the script test.py using the command:
                python /path-to-test.py/test.py /path-to-test.py
        10) The crawling interval between two pages for a spider can be changed in the settings file for that spider. For example, for infibeam
        the file is "/path-to-all-the-projects/infibeamScrapy/src/demo/settings.py".
        Modify the variable "DOWNLOAD_DELAY"; its unit is seconds.
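
As a minimal illustration of step 10, the relevant line in a spider's settings.py might look like the following. This assumes the spiders are Scrapy projects, as the folder name infibeamScrapy suggests; the value 2 is only an example, not the project's actual setting.

```python
# settings.py of one spider, e.g. infibeamScrapy/src/demo/settings.py

# Delay between two consecutive page downloads, in seconds.
# The value 2 is an illustrative assumption; Scrapy also accepts
# fractional values such as 0.5.
DOWNLOAD_DELAY = 2
```

A larger value makes the crawl slower but puts less load on the crawled site.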
         
           

To take a dump of the database, the following command can be used:
   /path-to-mysqldump/mysqldump -u root phonecrawler > ~/file.sql
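
If the dump has to be taken from a Python script rather than from the shell, the same command could be wrapped roughly as follows. This is only a sketch: the function names build_dump_command and dump_database are hypothetical and not part of the project, and it assumes the mysqldump binary can be found at the given location.

```python
import subprocess


def build_dump_command(database="phonecrawler", user="root", mysqldump="mysqldump"):
    """Return the mysqldump command line as a list of arguments."""
    return [mysqldump, "-u", user, database]


def dump_database(out_path, **kwargs):
    """Run mysqldump and write the SQL it prints to out_path."""
    with open(out_path, "wb") as out_file:
        # check_call raises CalledProcessError if mysqldump exits with an error.
        subprocess.check_call(build_dump_command(**kwargs), stdout=out_file)
```

For example, dump_database("/tmp/file.sql", mysqldump="/path-to-mysqldump/mysqldump") would write the dump of phonecrawler to /tmp/file.sql. Writing stdout to a file object replaces the shell redirection ">" used in the command above.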

    


Dependencies

    All the projects and scripts need to be placed in a separate folder, and the path to that folder needs to be given as an input parameter.
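
How test.py actually reads this parameter is not shown here. A script that takes the folder as its first command-line argument could validate it along these lines; the function name projects_dir_from_args is a hypothetical helper, not something from the project.

```python
import os


def projects_dir_from_args(argv):
    """Return the projects folder given as the first command-line argument."""
    if len(argv) < 2:
        raise SystemExit("usage: python test.py /path-to-all-the-projects")
    path = os.path.abspath(argv[1])
    if not os.path.isdir(path):
        # Fail early instead of crawling with a wrong base folder.
        raise SystemExit("not a directory: %s" % path)
    return path
```

A script would typically call it as projects_dir_from_args(sys.argv) before starting any spider.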
    
    Before starting the application, i.e. running the script, a database named phonecrawler needs to be created.
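
The database can be created by hand in the mysql client. As a sketch, the same could be done from Python by calling the client with a single statement. The function names below are hypothetical, and the code assumes the mysql client is on the PATH and that root can connect without a password, as in the commands above.

```python
import subprocess


def build_create_db_command(database="phonecrawler", user="root", mysql="mysql"):
    """Return a mysql client command that creates the database if it is missing."""
    statement = "CREATE DATABASE IF NOT EXISTS %s" % database
    # -e makes the mysql client execute one statement and then exit.
    return [mysql, "-u", user, "-e", statement]


def create_database(**kwargs):
    """Create the database; raises CalledProcessError if the client fails."""
    subprocess.check_call(build_create_db_command(**kwargs))
```

Because of IF NOT EXISTS, calling create_database() repeatedly is harmless when the database is already there.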


Known issues
    If you write a separate script for another spider, as I did for infibeam (runinfibeam.py), and the spider imports any external libraries,
    then the PYTHONPATH to those libraries must be set in the script.
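
One way to do this inside such a script is to extend sys.path before the spider is imported. The sketch below is not taken from runinfibeam.py; add_library_paths and the example folder are hypothetical.

```python
import os
import sys


def add_library_paths(paths):
    """Put each library folder on sys.path so later imports can find it."""
    added = []
    for path in paths:
        path = os.path.abspath(path)
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Hypothetical example: folders holding the libraries that the spider imports.
# add_library_paths(["/path-to-external-libs"])
# Import the spider (or launch the crawl) only after the paths are set.
```

Note that sys.path only affects the current Python process. If the script starts the spider as a separate process, the folders would have to be put in the PYTHONPATH environment variable of that process instead.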

    The TurboGears logo needs to be removed from the forms; this requires modifying the template.

    Some hints about the parameters should be shown in the forms.