diff --git a/README.md b/README.md
new file mode 100644
index 0000000..f5bdb65
--- /dev/null
+++ b/README.md
@@ -0,0 +1,52 @@
+# codescrape
+
+Version 1.0
+
+By: Nathan Adams
+
+License: MIT
+
+## Description
+
+This library is to be used to archive project data. Since with the announcement of Google Code going to archive only - I wanted to create a library where you can grab source data before it is gone forever.
+
+Use cases include:
+
+Archive projects due to:
+
+- Hosting service shutting down
+- Authorities sending cease-and-desist against provider/project
+- Historical/research/ or educational purposes
+
+## Usage
+
+Currently srchub and google code are supported. To use:
+
+    from services.srchub import srchub
+    shub = srchub()
+    projects = shub.getProjects()
+
+or for google code
+
+    from services.googlecode import googlecode
+    gcode = googlecode()
+    project = gcode.getProject("android-python27")
+
+Sourcehub library will pull all public projects since this list is easily accessed. Google Code does not have a public list persay. And I didn't want to scrape the search results, so I developed it to require you to pass in the project name. If you were to get your hands on a list of google code projects you could easily loop through them:
+
+    from services.googlecode import googlecode
+    gcode = googlecode()
+    for project in someProjectList:
+        project = gcode.getProject(project)
+        # do something with project
+
+the project data structure is as follows:
+
+project
+
+- getRepoURL() -> Returns the URL of the repo
+- getRepoType() -> Returns the type of repo (git, hg, or SVN)
+- getReleases() -> Returns all downloads related to the project
+- getIssues() -> Returns open issues
+- getWikis() -> Returns wikis
+
diff --git a/README.txt b/README.txt
deleted file mode 100644
index f5bdb65..0000000
--- a/README.txt
+++ /dev/null
@@ -1,52 +0,0 @@
-# codescrape
-
-Version 1.0
-
-By: Nathan Adams
-
-License: MIT
-
-## Description
-
-This library is to be used to archive project data. Since with the announcement of Google Code going to archive only - I wanted to create a library where you can grab source data before it is gone forever.
-
-Use cases include:
-
-Archive projects due to:
-
-- Hosting service shutting down
-- Authorities sending cease-and-desist against provider/project
-- Historical/research/ or educational purposes
-
-## Usage
-
-Currently srchub and google code are supported. To use:
-
-    from services.srchub import srchub
-    shub = srchub()
-    projects = shub.getProjects()
-
-or for google code
-
-    from services.googlecode import googlecode
-    gcode = googlecode()
-    project = gcode.getProject("android-python27")
-
-Sourcehub library will pull all public projects since this list is easily accessed. Google Code does not have a public list persay. And I didn't want to scrape the search results, so I developed it to require you to pass in the project name. If you were to get your hands on a list of google code projects you could easily loop through them:
-
-    from services.googlecode import googlecode
-    gcode = googlecode()
-    for project in someProjectList:
-        project = gcode.getProject(project)
-        # do something with project
-
-the project data structure is as follows:
-
-project
-
-- getRepoURL() -> Returns the URL of the repo
-- getRepoType() -> Returns the type of repo (git, hg, or SVN)
-- getReleases() -> Returns all downloads related to the project
-- getIssues() -> Returns open issues
-- getWikis() -> Returns wikis
-
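
Note on the README's loop example: a rough sketch of how the five documented accessors might be combined when walking a list of project names. The diff does not include the library source, so `FakeProject` and `FakeGoogleCode` are hypothetical stand-ins that only mirror the method names listed in the README, and their return values are invented for illustration. The helper `summarize` and its dictionary keys are likewise assumptions, not part of codescrape.

```python
# Stand-ins for the real codescrape classes, which this diff does not show.
# Only the method names come from the README; the return values are made up.
class FakeProject:
    def __init__(self, name):
        self.name = name

    def getRepoURL(self):
        return "https://example.invalid/" + self.name

    def getRepoType(self):
        return "svn"

    def getReleases(self):
        return []

    def getIssues(self):
        return []

    def getWikis(self):
        return []


class FakeGoogleCode:
    def getProject(self, name):
        return FakeProject(name)


def summarize(client, names):
    """Fetch each named project and collect the documented fields."""
    summaries = []
    for name in names:
        # Use a separate variable for the fetched object so the loop
        # variable holding the project name is not overwritten.
        project = client.getProject(name)
        summaries.append({
            "name": name,
            "repo_url": project.getRepoURL(),
            "repo_type": project.getRepoType(),
            "releases": project.getReleases(),
            "issues": project.getIssues(),
            "wikis": project.getWikis(),
        })
    return summaries


if __name__ == "__main__":
    for row in summarize(FakeGoogleCode(), ["android-python27"]):
        print(row["name"], row["repo_type"], row["repo_url"])
```

If the real `googlecode` class behaves as the README describes, passing an instance of it in place of `FakeGoogleCode()` should work the same way, though that is untested here.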