| # codescrape␊ |
| ␊ |
| Version 1.0␊ |
| ␊ |
| By: Nathan Adams␊ |
| ␊ |
| License: MIT␊ |
| ␊ |
| ## Description␊ |
| ␊ |
| This library archives project data from code hosting services. After the announcement that Google Code was moving to archive-only status, I wanted a library that can grab a project's source data before it is gone forever.␊ |
| ␊ |
| Use cases include archiving projects because of:␊ |
| ␊ |
| - A hosting service shutting down␊ |
| - Authorities sending a cease-and-desist to the provider or project␊ |
| - Historical, research, or educational interest␊ |
| ␊ |
| ## Usage␊ |
| ␊ |
| Currently srchub and Google Code are supported. To use srchub:␊ |
| ␊ |
|     from services.srchub import srchub␊ |
|     shub = srchub()␊ |
|     projects = shub.getProjects()␊ |
| ␊ |
| Or, for Google Code:␊ |
| ␊ |
|     from services.googlecode import googlecode␊ |
|     gcode = googlecode()␊ |
|     project = gcode.getProject("android-python27")␊ |
| ␊ |
| The Sourcehub (srchub) library pulls all public projects, since that list is easily accessed. Google Code does not have a public project list per se, and I didn't want to scrape the search results, so the Google Code library requires you to pass in the project name. If you get your hands on a list of Google Code projects, you can easily loop through them:␊ |
| ␊ |
|     from services.googlecode import googlecode␊ |
|     gcode = googlecode()␊ |
|     # someProjectList is a list of Google Code project names that you supply␊ |
|     for projectName in someProjectList:␊ |
|         project = gcode.getProject(projectName)␊ |
|         # do something with project␊ |
| ␊ |
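When looping over a long list of names, some lookups will probably fail (a deleted project, a network error). The following is a minimal sketch of collecting those failures instead of stopping at the first one. `fetch_all` and `fake_fetch` are hypothetical helpers, not part of codescrape; the fetch function is passed in as a parameter, so wiring it to `gcode.getProject` is an assumption about how you would use it.

```python
def fetch_all(names, fetch):
    """Call fetch(name) for each name and return (projects, failures).

    Hypothetical helper, not part of codescrape.
    """
    projects = {}
    failures = {}
    for name in names:
        try:
            projects[name] = fetch(name)
        except Exception as exc:  # record the error and keep going
            failures[name] = exc
    return projects, failures


def fake_fetch(name):
    """Stand-in for gcode.getProject so the sketch runs offline."""
    if name == "missing-project":
        raise LookupError("no such project: " + name)
    return {"name": name}


if __name__ == "__main__":
    projects, failures = fetch_all(
        ["android-python27", "missing-project"], fake_fetch
    )
    print(sorted(projects))
    print(sorted(failures))
```

With the real library, the call would presumably look like `fetch_all(someProjectList, gcode.getProject)`.
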
| The project data structure is as follows:␊ |
| ␊ |
| project␊ |
| ␊ |
| - getRepoURL() -> Returns the URL of the repo␊ |
| - getRepoType() -> Returns the type of repo (git, hg, or SVN)␊ |
| - getReleases() -> Returns all downloads related to the project␊ |
| - getIssues() -> Returns open issues␊ |
| - getWikis() -> Returns wikis␊ |
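
The accessors above can be combined into a short summary of what an archive run would capture. This is only a sketch: `StubProject` is a hypothetical stand-in that mirrors the five method names listed above, and the return types (strings and lists) are assumptions rather than something taken from codescrape.

```python
class StubProject(object):
    """Hypothetical stand-in exposing the five accessors listed above."""

    def getRepoURL(self):
        return "https://example.com/p/demo"

    def getRepoType(self):
        return "git"

    def getReleases(self):
        return ["demo-1.0.zip"]

    def getIssues(self):
        return []

    def getWikis(self):
        return ["Home"]


def summarize(project):
    """Gather the repo location plus counts of releases, issues, and wikis.

    Assumes the list-style accessors return something len() accepts.
    """
    return {
        "repo_url": project.getRepoURL(),
        "repo_type": project.getRepoType(),
        "releases": len(project.getReleases()),
        "open_issues": len(project.getIssues()),
        "wikis": len(project.getWikis()),
    }


if __name__ == "__main__":
    print(summarize(StubProject()))
```

With a real project, the call would be something like `summarize(gcode.getProject("android-python27"))`, provided the accessors return sized collections.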
| ␊ |